In 2012, Harvard Business Review declared: “Data Scientist: The Sexiest Job of the 21st Century.” The claim had a ring of truth: for almost a decade, organizations across industries sought to unlock the power of data in pursuit of transformative growth. Organizations of all sizes shifted their focus to data science to make data-driven decisions, evaluate market trends, minimize losses, and maximize profits.
However, another technology emerged and disrupted businesses in the same period: Deep Learning (Neural Networks), which has become synonymous with AI. Data Science is an umbrella term: machine learning is one of its sub-fields, and deep learning is a sub-field within machine learning. AI will continue to flourish, spread its influence into more territories, and remain relevant.
So, why will the demand for data scientists collapse if progress in AI continues?
To understand the demand for data scientists and related positions (research scientists and machine learning engineers), we need a solid understanding of the basics and of what is happening in the AI world.
Let’s take the example of Sentiment Classification, a machine learning (ML) task that analyses texts for polarity, from positive to negative. A decade back, when someone solved this problem with a machine learning model, they took their labeled data, trained their model on it, and voila, it delivered 70% accuracy on a validation set.
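To make this concrete, here is a toy sketch in Python of how such a purely data-driven model works: a bag-of-words sentiment classifier built only from labeled examples, with no prior knowledge at all. The training data and scoring rule are invented for illustration; a real model of that era would typically be logistic regression or Naive Bayes over bag-of-words features.

```python
from collections import Counter

def train(labeled_texts):
    """Count word frequencies per class from labeled data ('specific knowledge')."""
    counts = {"pos": Counter(), "neg": Counter()}
    for text, label in labeled_texts:
        counts[label].update(text.lower().split())
    return counts

def predict(counts, text):
    """Classify by which class's training vocabulary the text overlaps more."""
    words = text.lower().split()
    pos = sum(counts["pos"][w] for w in words)
    neg = sum(counts["neg"][w] for w in words)
    return "pos" if pos >= neg else "neg"

# Tiny invented training set -- all the model will ever know.
train_data = [
    ("great movie loved it", "pos"),
    ("wonderful acting great plot", "pos"),
    ("terrible movie hated it", "neg"),
    ("boring plot awful acting", "neg"),
]
model = train(train_data)
print(predict(model, "loved the wonderful plot"))  # → pos
```

Everything the model knows comes from those four labeled lines; a word it never saw in training contributes nothing, which is exactly the limitation prior knowledge later addressed.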
Fast forward 10 years and today’s machine learning models are different.
ML models = Prior knowledge + Specific knowledge needed to solve the problem
In the earlier days, ML models solved problems using only the available labeled data. Incorporating prior knowledge into the model was not done (and not well understood). Things started changing around 2012–2013, when we learned how to improve ML models by incorporating prior knowledge. We called it Transfer Learning, and it pushed the accuracy of Sentiment Classification further, say up to 80% (circa 2014).
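The transfer-learning split between the two kinds of knowledge can be sketched in a few lines of Python. The word vectors below are hand-made stand-ins for real pretrained embeddings (the actual numbers are invented for illustration): the embedding table (prior knowledge) stays frozen, and only a small linear head (specific knowledge) is trained on the labeled data.

```python
# Hypothetical frozen 'pretrained' 2-d word embeddings: prior knowledge.
PRETRAINED = {
    "great": (1.0, 0.8), "loved": (0.9, 0.9), "fine": (0.3, 0.1),
    "awful": (-1.0, 0.8), "hated": (-0.9, 0.9), "dull": (-0.4, 0.2),
}

def embed(text):
    """Average the frozen pretrained vectors of the known words."""
    vecs = [PRETRAINED[w] for w in text.lower().split() if w in PRETRAINED]
    if not vecs:
        return (0.0, 0.0)
    n = len(vecs)
    return (sum(v[0] for v in vecs) / n, sum(v[1] for v in vecs) / n)

def train_head(data, epochs=20, lr=0.1):
    """Perceptron-style updates on a linear head; the embeddings never change."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for text, label in data:  # label: +1 positive, -1 negative
            x = embed(text)
            pred = 1 if w[0] * x[0] + w[1] * x[1] + b > 0 else -1
            if pred != label:  # only the head (specific knowledge) is updated
                w[0] += lr * label * x[0]
                w[1] += lr * label * x[1]
                b += lr * label
    return w, b

data = [("great loved", 1), ("fine great", 1), ("awful hated", -1), ("dull awful", -1)]
w, b = train_head(data)
score = lambda t: w[0] * embed(t)[0] + w[1] * embed(t)[1] + b
print(score("loved fine") > 0)  # → True (classified positive)
```

Because the embeddings already encode which words lean positive or negative, the head converges on four tiny examples; that is the leverage prior knowledge gives.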
From 2018 onward, Transfer Learning took a new shape. We started using large Deep Neural Networks that learn from a huge corpus of input data in a self-supervised fashion. It started in the domain of text, where we called it Language Modelling. Today, you can’t think of building a superior model in any domain (language, image, audio, video) without injecting prior knowledge.
Representations learned by pre-trained models (prior knowledge) are becoming better with every passing day. The pre-trained models are growing larger and more capable of handling the complexities of the world. This development lets us solve for the specific knowledge far more easily while improving model accuracy. In some cases, the prior knowledge is so superior that it can solve certain types of problems without building any specific knowledge at all; this is called Zero-Shot Learning. Similarly, when you need only a little data to solve the problem, you are in the realm of Few-Shot Learning.
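The zero-shot idea can be illustrated with the same kind of stand-in embeddings: no task-specific training happens at all; a text is simply matched to whichever label description lies closest in the shared embedding space. The vectors, words, and label descriptions below are invented for illustration; real systems do this with embeddings from large pre-trained models.

```python
import math

# Hypothetical pretrained word embeddings: the only 'knowledge' available.
VECTORS = {
    "happy": (0.9, 0.2), "joyful": (0.95, 0.15), "delighted": (0.9, 0.25),
    "sad": (-0.9, 0.2), "gloomy": (-0.85, 0.3), "miserable": (-0.9, 0.25),
}

def embed(text):
    """Average the pretrained vectors of the known words."""
    vecs = [VECTORS[w] for w in text.lower().split() if w in VECTORS]
    n = max(len(vecs), 1)
    return (sum(v[0] for v in vecs) / n, sum(v[1] for v in vecs) / n)

def cosine(a, b):
    dot = a[0] * b[0] + a[1] * b[1]
    na, nb = math.hypot(*a), math.hypot(*b)
    return dot / (na * nb) if na and nb else 0.0

def zero_shot(text, label_descriptions):
    """Pick the label whose description is closest in embedding space.

    No labeled training examples are used -- prior knowledge does all the work.
    """
    return max(label_descriptions,
               key=lambda lab: cosine(embed(text), embed(label_descriptions[lab])))

labels = {"positive": "happy joyful", "negative": "sad gloomy"}
print(zero_shot("delighted", labels))  # → positive
```

Swap in a new set of label descriptions and the same function classifies a new task immediately; that is what “no specific knowledge” means in practice.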
Improvements in building superior representations are progressing by leaps and bounds. Here are some pointers:
- So far, models were mostly built on a single language, mainly English. However, nuggets of knowledge may exist in other languages as well. For example, a fine mango pickle recipe written in Telugu may not be available in English. So, the building of prior knowledge has started to draw information from different languages and combine it, making the models multilingual and their knowledge richer.
- Currently, prior knowledge is mostly built using text, audio, or images, one type of input at a time. The newer models are becoming Multi-modal: they combine rich information from text, audio, and video and learn deeper representations of our world.
- We are learning how to incorporate Multi-task Learning into our models: instead of a single task, the models learn from a variety of tasks.
Expressive power, richness of information, and deeper, more robust representations will continue to expand, and large models will be able to handle a wide variety of the complexities of our lives. All these developments will tremendously reduce the effort needed to build “specific knowledge.” To top it off, building new specific knowledge will take advantage of all the specific knowledge built before, which means we’ll learn to learn incrementally without starting from scratch.
So, in the future, new ML models might look like:
ML models = Prior knowledge + All specific knowledge acquired so far + Specific knowledge needed to solve a new problem
Today, prior knowledge is built by the Big Tech companies, and they have been graciously sharing their research and their work. Their work is fueling the acceleration of AI; we would not have come this far as a community without their massive contribution. Even today, data scientists solve problems to a great extent thanks to the contributions of Big Tech companies. What are the chances that one can build a better artificial intelligence model without consuming their work? Slim to impossible.
Currently, in any start-up or enterprise, data scientists mostly solve for building “specific knowledge.” How much specific knowledge will one need to build in the future? Very little to none, in most cases.
In the future, the majority of data scientists’ roles will transform into software engineering, data engineering, or data subject matter expert (SME) roles. A certain percentage will become researchers at Big Tech companies or universities, solving the unsolved, harder problems (maybe consciousness, hopefully someday), while the rest of the community consumes their work in the form of pre-trained models. This trend will let a large pool of software engineers do sophisticated AI work without much understanding of how the intelligence is built. Embedding intelligence will not require a research team in most cases. Of course, this assumption is conditioned on Big Tech companies continuing to share their research.
If the pace of improvement continues in hardware and in AI algorithms, the cost of training large models will continue to drop, and we might see this trend much earlier, maybe in the next five years. I know I’m being optimistic, but keep in mind that the field is not linear; it is exponential. We cannot project the future linearly.
Remember how programming changed from Assembly to Fortran to C to Python? We might be at the Fortran level of AI right now, and it keeps getting better. So, beware, data scientists: the ubiquitous demand across organizations may shrink to a demand for superior data scientists at a select few.
What do you think?
(Arindam Paul is the Vice President of Fidelity Investments India and the views expressed in this article are his own)