Press Release

Yubi launches India’s first indigenous open source fintech language model – YubiBERT

NEWS SUMMARY
  • With YubiBERT, Yubi becomes the first fintech company globally to launch an indigenous open source language model for Indian fintechs

  • YubiBERT is a language model, trained from scratch, that understands fintech text in the Indian context

  • YubiBERT was trained on 200+ GB of public fintech data and over 1 billion sentences, making it one of the most robustly trained language models in the world

  • YubiBERT models have an average success rate of 90% across different NLP use cases

FULL STORY

Yubi, the world’s first unified credit platform for corporates and lenders, has launched YubiBERT, India’s first indigenous open source fintech language model. YubiBERT is a language model trained from scratch (similar to BERT from Google) that better understands fintech text in the Indian context. It currently supports 13 top regional languages along with English.

Although India is the world’s biggest and most innovative fintech market, Indian fintech companies are compelled to use large language models (LLMs) that are not designed for the fintech sector or the Indian context. This has resulted in multiple inefficiencies for the sector. With YubiBERT, Yubi aims to solve this problem for the entire fintech industry so that the ecosystem can grow collectively.

Commenting on the launch of this language model, Mathangi Sri, Chief Data Officer, Yubi, said, “Despite having an innovative fintech ecosystem in India, very few data science teams in Indian fintech companies attempt to train a model from scratch. As a result, most fintech companies fine-tune models provided by Google, Microsoft, and Facebook. This approach has severely hindered the growth of the sector. India, being a unique market for financial services, needed a unique language model, and with a very strong data team at Yubi, we wanted to be the pioneers in building it. We are thrilled to launch this language model as open source so that the entire Indian fintech ecosystem can thrive collectively.”

YubiBERT was trained on 200+ GB of public fintech data and over 1 billion sentences, making it one of the most robustly trained language models in the world. When fine-tuned on fintech-related Natural Language Processing (NLP) tasks, it performs better than the BERT, RoBERTa, FinBERT and DistilBERT models.

Commenting on the rationale for building YubiBERT, Swapnil Ashok Jadhav, Director Data Science, Yubi, said, “Natural language processing has been a crucial part of many tech companies and their success. However, we noticed two main pain points. Firstly, with India being a complex market with multiple languages, there was no model to analyze regional languages. Secondly, domain-specific models perform better than generalized state-of-the-art models. While there are domain-specific models for fintech, none of these models consider the vastly different context of the Indian fintech market. These two pain points motivated us to train a model from scratch, which resulted in YubiBERT. We are positive that this will have a massive impact on the fintech community, and we are excited to see how the data science community takes this language model to the next level.”

With an accuracy of over 90% across different NLP use cases in fintech, YubiBERT is more accurate than the fine-tuned state-of-the-art (SOTA) models. The language model is also faster than SOTA models because it is built on a very small architecture, and it runs on a CPU in milliseconds, making it cost-effective to deploy.

Data scientists can access the model here:

https://github.com/Yubi2Community/YubiAI/tree/master/yubiai/nlp/yubiEmbeddings

ABOUT YUBI

Established in 2020 by Founder and CEO Gaurav Kumar, Yubi is the world’s first unified credit platform powering the discovery, execution and fulfilment of credit. The platform comprises a digital debt marketplace and a sophisticated technology stack, seamlessly powering the end-to-end debt lifecycle from origination to collections. Its one-of-a-kind product suite (Yubi Loans, Yubi Co.Lend, Yubi Invest, Yubi Flow, Yubi Pools, and Yubi Build) offers loans, co-lending, corporate bond issuance, supply chain financing, asset-backed securitisation and RE & Infra financing to build a holistic digital credit ecosystem. Additionally, in 2022 Yubi acquired its collections arm, spocto, a global risk mitigation platform and pioneer in AI-enabled recovery infrastructure, as well as Corpository, a full-stack corporate credit underwriting company, to strengthen its position as the ubiquitous layer that fuels the credit infrastructure of the country. Yubi currently serves 3,000+ corporates and 750+ lenders and has facilitated debt volumes of over INR 100,000 crores. Yubi’s mission is to transform the debt markets globally, starting with India, by accelerating access to capital to further the GoI’s $5 Trillion economy goal.
