Modern Tech Infrastructure and Data Innovations in Fintech

India is going through a phase of accelerated digital adoption. Businesses across the country are adopting digital tools to optimize their operations and achieve efficient, profitable financial outcomes. Activities such as accounting, payments, lending, and insurance have become relatively easy, less time-consuming, and error-free with computing and connectivity. While Fintechs have modernized finance for businesses, there is scope to do much more and come up with Fintech First solutions. Mr. Gaurav Lahoti, VP Engineering, Khatabook, shared his insights with CXOToday on the tech foundation behind Fintech growth in the country.


  1. What are some of the developments around modern infrastructure in financial services?

Technology adoption in Fintech has been experiencing hyper-growth in India. Using services such as accounting, payments, lending, and insurance has become relatively easy, with computing and connectivity available on the go and many services exposed through APIs. With a large section of India’s population on the internet, building for scale has been a key consideration for engineering organizations. With apps becoming easier to build and iterate on, companies are launching new products and getting them adopted at a never-before-seen pace. For instance, Khatabook reached the milestone of 10M monthly active users within 15 months of its launch.

The modern infrastructure being built for these products needs not only to achieve the business goals but also to keep product quality high while shipping as fast as possible. Automated functional and security testing has become paramount for launching any product change.

Securing data against both internal misuse and outside attacks is essential. Today that security layer is built with Endpoint Detection and Response (EDR) systems, tools that highlight configuration issues on the cloud, and monitoring systems that alert on unauthorized access. Restricting internal access to Personally Identifiable Information (PII) is important for user data protection. Today, all forms of financial services encounter some fraud. Building data science capabilities to detect and respond to fraud in real time is necessary to maintain trust in the products. Services such as lending and insurance, which involve extensive risk evaluation, also require integration with third-party data sources.
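As an illustration of restricting internal access to PII, the sketch below masks sensitive fields before a record is shown to someone without clearance. The field names, the `mask_value` and `redact_record` helpers, and the sample record are all hypothetical; this is not a description of any particular company's system.

```python
# Hypothetical set of fields treated as PII; a real schema would differ.
PII_FIELDS = {"name", "phone", "id_number"}


def mask_value(value: str, visible: int = 2) -> str:
    """Keep only the last `visible` characters and star out the rest."""
    if len(value) <= visible:
        return "*" * len(value)
    return "*" * (len(value) - visible) + value[-visible:]


def redact_record(record: dict, can_view_pii: bool) -> dict:
    """Return a copy of `record`, masking PII fields unless the viewer is cleared."""
    if can_view_pii:
        return dict(record)
    return {
        key: mask_value(str(val)) if key in PII_FIELDS else val
        for key, val in record.items()
    }


# Made-up sample record for demonstration only.
customer = {"name": "Asha Rao", "phone": "9876543210", "balance": 1250}
print(redact_record(customer, can_view_pii=False))
```

In practice the `can_view_pii` flag would come from an access control system rather than being passed in by hand.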


  2. How do Fintech companies ingest, manage, and use data?

With accelerated digital adoption (more internet and app users, and more time spent online), we are generating more and more data. Modern data infrastructures on the cloud are engineered to store and process hundreds of terabytes of data. Besides the application data stored in internal databases, there is a large amount of user event data from the frontend that needs to be ingested, stored, transformed, and processed for marketing, product, customer service, on-ground operations, and engineering teams. Usually, all of this data is stored in a central analytical (OLAP) database called a data warehouse. Data warehouses may have thousands of tables, some containing billions or even trillions of rows. Making the data in a warehouse discoverable and understandable requires building and maintaining a data catalog.

Processing this data requires both batch processing and stream processing systems. In a fast-moving organization, making these systems self-serve for data specialists is a significant advantage, as they no longer depend on other teams to deploy their computations to run at scale. This reduces turnaround time significantly, leading to faster iterations and better business and product growth.

Today, Khatabook uses many free, open-source tools such as Apache Airflow, Spark, Redash, OpenMetadata, and Kubernetes, along with a few paid tools, as part of its data platform. Khatabook’s data warehouse today manages 150TB of data and processes upwards of 30K queries per day, with some critical tables exceeding 20TB in size.
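To make the distinction between batch and stream processing concrete, here is a minimal sketch in plain Python. The event tuples and function names are made up for illustration. At real scale this kind of work would run on systems such as Spark or a streaming engine, not in-memory lists.

```python
from collections import Counter

# Hypothetical frontend events as (user_id, event_name) pairs.
EVENTS = [
    ("u1", "app_open"),
    ("u2", "app_open"),
    ("u1", "payment_sent"),
    ("u1", "app_open"),
]


def batch_event_counts(events):
    """Batch style: read the whole dataset, then compute the result once."""
    return Counter(name for _, name in events)


def stream_event_counts(events):
    """Stream style: update running counts and emit a snapshot after each event."""
    running = Counter()
    for _, name in events:
        running[name] += 1
        yield dict(running)


batch = batch_event_counts(EVENTS)
snapshots = list(stream_event_counts(EVENTS))

# The last streaming snapshot matches the batch result over the same events.
print(dict(batch) == snapshots[-1])
```

The batch version gives one answer after all the data has arrived, while the streaming version has an up-to-date answer after every event, which is what real-time use cases such as fraud detection depend on.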


  3. How are Data Science and Machine Learning impacting the innovation quotient?

One impactful use of data and AI/ML models in fintech is mitigating risk and fraud. Having a strong KYC (Know Your Customer) check is a must. Validating a user relies heavily on character and image recognition models applied to their ID documents. Today, state-of-the-art ML models can determine the legitimacy of the information on an ID card (including photos) to prevent identity theft. With every transaction on the platform, there is a chance of fraud. The volume makes it impossible to check each one manually at scale. ML models help build systems that block suspicious transactions automatically or refer them for manual review if the system can’t make the decision. This helps maintain trust in the online product.
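The block-or-refer flow described above can be sketched as a simple routing function over a model's fraud score. The thresholds, the `Decision` enum, and `route_transaction` are hypothetical and chosen only for illustration; in a real system the score would come from a trained model and the cut-offs would be tuned against observed fraud.

```python
from enum import Enum


class Decision(Enum):
    APPROVE = "approve"
    MANUAL_REVIEW = "manual_review"
    BLOCK = "block"


# Hypothetical cut-offs on a fraud score between 0.0 and 1.0.
REVIEW_THRESHOLD = 0.5
BLOCK_THRESHOLD = 0.9


def route_transaction(fraud_score: float) -> Decision:
    """Block clear fraud, approve clear non-fraud, send the uncertain middle to a human."""
    if not 0.0 <= fraud_score <= 1.0:
        raise ValueError("fraud_score must be between 0.0 and 1.0")
    if fraud_score >= BLOCK_THRESHOLD:
        return Decision.BLOCK
    if fraud_score >= REVIEW_THRESHOLD:
        return Decision.MANUAL_REVIEW
    return Decision.APPROVE


for score in (0.05, 0.62, 0.97):
    print(score, route_transaction(score).value)
```

Keeping a manual review band between the two thresholds means human effort is spent only on the transactions the model is unsure about.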

In the case of lending, digital underwriting requires gathering and processing data from many different sources to determine creditworthiness. Processing such a large amount of information manually is quite hard. However, complex ML models have been effective at risk modeling with a large number of data points. Moreover, ML models can also predict which active loans are likely to become at risk by continuously analyzing user behavior and data.

In fintech, many other use cases, such as predicting user cohorts, churn, engagement, and retention, as well as lead generation, use Data Science for better results. ML will be leveraged heavily in personalizing and customizing fintech solutions in the coming years.


  4. What are the data protection and security measures associated with modern tech infra?

Given the sensitivity of data on any Fintech platform, information and data security have to be a top priority from Day 0. Several systems need to be in place to secure user, employee, and organization data. Having well-defined Role-Based Access Control (RBAC) across all domains of an organization helps protect the data while ensuring the business doesn’t face any bottlenecks.
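As a rough illustration of RBAC, the sketch below maps roles to permissions and checks a user's roles before allowing an action. The role names, permission strings, and `is_allowed` helper are hypothetical. Production systems usually rely on the access control features of their cloud provider or identity platform instead of a hand-rolled table.

```python
# Hypothetical role-to-permission mapping.
ROLE_PERMISSIONS = {
    "support_agent": {"ticket:read", "ticket:write"},
    "data_analyst": {"warehouse:read"},
    "security_admin": {"warehouse:read", "pii:read", "audit_log:read"},
}


def is_allowed(user_roles, permission: str) -> bool:
    """Allow the action if any of the user's roles grants the permission."""
    return any(
        permission in ROLE_PERMISSIONS.get(role, set())
        for role in user_roles
    )


print(is_allowed(["data_analyst"], "warehouse:read"))  # analyst can query the warehouse
print(is_allowed(["data_analyst"], "pii:read"))        # but cannot read PII
```

Because permissions attach to roles and not to individuals, granting or revoking access is a matter of changing a person's role, which helps avoid the bottlenecks mentioned above.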

Security of employee and contractor devices, office infrastructure, and servers must be ensured. Given the rise in cybercrimes, systems that automatically detect and respond to untrusted or unusual behaviors are required to maintain safety and security.

Protecting the data with Disaster Recovery (DR) solutions that back up the data and server configuration to a different infrastructure is also a must. Vulnerability and penetration testing of the apps and the server infrastructure should be carried out regularly, at least every few months.

Unintentional human errors can lead to security failures. As they say, ‘you are only as strong as your weakest link’. I firmly believe organizations should conduct employee security training regularly to build the knowledge and awareness that can prevent accidental security lapses.
