Generative AI Agents

By Dr Debashis Guha


The use of Generative Artificial Intelligence (Gen-AI) tools like ChatGPT has become widespread since their introduction last year. Indeed, they are becoming a part of our everyday workflow, just like search engines or online productivity apps. However, after the first rush of hype and excitement, many observers have started to wonder whether all this frenzy was justified, and whether Gen-AI will really change the nature of work and improve productivity in a radical way, as ChatGPT enthusiasts have promised us.

If we continue to use Gen-AI as we do now, merely to generate snippets of text through a conversational interface, using a model trained on a general-purpose corpus, then the answer to this question would have to be a resounding NO. Gen-AI tools such as ChatGPT do not have up-to-date information, they are prone to hallucinations and confabulations, and they do not have access to specialized information. Their writing style tends to be a bit wooden, and their stock of ideas somewhat hackneyed, although these last two shortcomings can be overcome to an extent through clever prompting. As a result, they do not add much value to business workflows as natural language information retrieval agents. Indeed, in their current avatar they seem to be more of a novelty item than an effective productivity enhancement tool.

Fortunately, the Gen-AI story does not end with conversational agents. The focus of LLM research in recent months has turned away from information retrieval and presentation to a completely different AI engineering field: task planning and execution. The tools that have resulted from this research are known as GPT agents or Foundation agents. A GPT agent is a structure built on top of a Gen-AI model that allows it to run iteratively in order to fulfil a goal or complete a task. What this means is that an agent is given a goal to reach or a task to carry out, and it uses a GPT-like model iteratively and autonomously to create a process workflow until the goal is reached.
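
The iterative loop described above can be illustrated with a minimal sketch. It assumes the model is any function that takes a prompt and returns text; the names run_agent and make_scripted_model, the prompt wording, and the "DONE" stop signal are all hypothetical choices made for illustration, and the scripted model is only a stand-in for a real LLM call.

```python
from typing import Callable, List


def run_agent(goal: str, model: Callable[[str], str], max_steps: int = 10) -> List[str]:
    """Call the model repeatedly until it signals that the goal is reached.

    The "DONE" stop signal and the prompt layout are assumptions of this sketch.
    """
    history: List[str] = []
    for _ in range(max_steps):
        # Each call sees the goal plus everything done so far.
        prompt = f"Goal: {goal}\nSteps so far: {history}\nNext step, or DONE:"
        reply = model(prompt).strip()
        if reply == "DONE":
            break
        history.append(reply)
    return history


def make_scripted_model(steps: List[str]) -> Callable[[str], str]:
    """Hypothetical stand-in for a real LLM: replays fixed replies in order."""
    remaining = list(steps)

    def model(prompt: str) -> str:
        return remaining.pop(0) if remaining else "DONE"

    return model


if __name__ == "__main__":
    model = make_scripted_model(["outline", "draft", "revise"])
    print(run_agent("write a report", model))
```

The max_steps cap is a design choice of the sketch: an autonomous loop needs some bound so that it stops even if the model never declares the goal reached.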

For example, a GPT agent may be assigned the task of writing long-form fiction. Current Gen-AI can write short passages and jokes, but it cannot write book-length fiction without human help. The use of agents makes this possible. Gen-AI tools for book writing start out by generating a summary of the narrative, the principal characters and the setting. This is done in several stages. Next, the Gen-AI prepares a plan for each chapter, the links among chapters, and a structure for maintaining character and situational continuity. Once all this is in place, the actual narrative is generated, chapter by chapter. All this requires breaking down the task into its constituent parts, generating plans for each part, creating tools for coordinating the overall generation, and iteratively generating all stages of the narrative.
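
The staged process described above (summary, then chapter plans, then chapter-by-chapter generation) can be sketched roughly as follows. This is a simplified illustration, not how any particular book-writing tool works: the function names write_book and echo_model and the prompt wording are hypothetical, and continuity is approximated here by passing only the previous chapter into the next prompt.

```python
from typing import Callable, Dict, List


def write_book(premise: str, n_chapters: int, model: Callable[[str], str]) -> Dict[str, object]:
    """Plan first, then generate: summary -> chapter plans -> chapters."""
    # Stage 1: a summary of the narrative, characters and setting.
    summary = model(f"Summarise a novel based on: {premise}")

    # Stage 2: one plan per chapter, each informed by the summary.
    plans: List[str] = [
        model(f"Plan chapter {i} of {n_chapters}. Summary: {summary}")
        for i in range(1, n_chapters + 1)
    ]

    # Stage 3: generate chapter by chapter, feeding the previous chapter
    # back in as a crude continuity mechanism (an assumption of this sketch).
    chapters: List[str] = []
    previous = ""
    for i, plan in enumerate(plans, start=1):
        text = model(f"Write chapter {i}. Plan: {plan}. Previous chapter: {previous}")
        chapters.append(text)
        previous = text

    return {"summary": summary, "plans": plans, "chapters": chapters}


def echo_model(prompt: str) -> str:
    """Hypothetical stand-in for a real LLM: echoes the first sentence of the prompt."""
    return "[" + prompt.split(".")[0] + "]"


if __name__ == "__main__":
    book = write_book("a lighthouse keeper", 3, echo_model)
    for chapter in book["chapters"]:
        print(chapter)
```

Replacing echo_model with a call to a real language model would leave the control flow unchanged, which is the point of the sketch: the agent structure lies in the staging and coordination, not in any single model call.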

Such a capability can also be brought to bear upon many other types of complex workflows, such as writing large software systems, planning and monitoring a project, and teaching a course. Although the tools developed so far cannot carry out very complex projects, it is not hard to see that, with the sort of accelerated progress we have seen in this field, the day is not far off when we can get Gen-AI agents to carry out such tasks and fulfil such goals.

It is not hard to see that this will be the “killer app” for Generative AI. AI agents can be used in all corners of the economy and are likely to increase productivity wherever they are used. For example:

Gen-AI can be used in product development by specifying the design goal and the target audience and asking the Gen-AI to run through all the intermediate stages such as idea generation, product design, testing, and launch. It is unlikely that the entire process is going to run autonomously, but every stage of the process and its workflow is going to be transformed and made more productive using Gen-AI.

Another major use-case will be in software development. Once the basic goal and functions of the software are defined, Gen-AI can create detailed specs, a list of required tools, a project plan, and a schedule. It can also write code, download tools from open-source repositories, and combine the components into a package.

Many other use-cases come easily to mind. Job search, once its parameters such as domain, location and compensation are defined, is one, as is employee search starting from a job description. Creating a new course or even a degree program may be possible using Gen-AI agents. Other possible uses are project planning and monitoring, investment banking projects, and the planning and running of marketing campaigns. Apart from these large-scale and enterprise-level applications, GPT agents will also find a whole host of small-scale uses, such as students using them to do their assignments, employees using them to execute their tasks, and everyone using them to help with personal and household chores.

Foundation agents are Gen-AI agents that rely on a large-scale foundation model equipped with a general-purpose agent that can break down and plan the components of any complex task. Such foundation agents, and more specialised Gen-AI agents, are going to be part of all business processes in the near future, and they are going to change the nature of work and its flow in fundamental ways.

(The author is Dr Debashis Guha, Associate Professor and Director – Master of Artificial Intelligence in Business, SP Jain School of Global Management, and the views expressed in this article are his own)
