If you are in the “AI bubble”, you have heard about “AI Agents” by now.
Inevitably, the discussion has shifted towards debating the usefulness of AI (Artificial Intelligence), AGI (Artificial General Intelligence), and the semantics of terms like AI Agents, rather than exploring the functionality and benefits of AI, and of Generative AI specifically.
IMHO, meta-discussions, and even more so definitions of terms, are helpful to clear the muddy waters, but they must not become an end in themselves. Getting buy-in on terminology helps me structure my thinking, helps you understand some basics, and helps us get on the same page.
I promise that in other articles we’ll focus again on the more practical, substantive issues, such as the development, business considerations, and real-world applications of AI agents.
But this article should help establish clarity and consensus on what the terms mean, which is crucial for effective communication and understanding, especially in a field as complex and rapidly evolving as AI.
It is always nice when a smart person who is deep in the trenches (or in the arena, as one seems to say nowadays) fluently describes what you have been thinking about for quite a while, especially when even the smartest folks on Twitter could not make sense of a term by debating it.
Dharmesh Shah’s first article on his new blog agent.ai is exactly about the differentiation of AI Agents. It makes sense to dissect the subject matter by asking “What Is Agent AI” when your entire site is called agent.ai (which, he said on Twitter, was a pretty expensive domain to buy).
While his sources are experience and Twitter, mine are Wikipedia and, less successfully, Twitter, although Twitter lit the spark in my head.
What a great opportunity to close some loops in my research, finish up my thoughts, gain confidence in them with his help, and understand and share the topic from my perspective.
What are AI Agents?
In conversations about AI, many terms like GPT, LLM, prompt engineering, RAG, and Agent get thrown around and are often used interchangeably.
In my definition, an AI Agent is like a platoon on a mission:
AI Agent: An AI with agency, i.e., a capable AI that has been given goal-orientation by any user, including non-human users, to pursue (predefined) objectives or tasks.
Dharmesh Shah calls them Agent AI and defines them as:
Agent AI: Software that uses artificial intelligence to pursue a specified goal. It accomplishes this by decomposing the goal into actionable tasks, monitoring its progress, and engaging with digital resources and other agents as necessary.
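To make that definition a bit more tangible, here is a minimal sketch of "decomposing a goal into tasks and monitoring progress". It is only my own illustration, not how agent.ai or any real agent framework works: the names `Task` and `GoalAgent`, the example goal, and the lambda actions that simulate success or failure are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Task:
    name: str
    action: Callable[[], bool]  # returns True when the task succeeded
    done: bool = False


@dataclass
class GoalAgent:
    goal: str
    tasks: List[Task] = field(default_factory=list)

    def progress(self) -> float:
        """Fraction of tasks completed so far (the 'monitoring' part)."""
        if not self.tasks:
            return 0.0
        return sum(t.done for t in self.tasks) / len(self.tasks)

    def run(self) -> List[str]:
        """Execute pending tasks in order and log progress after each one."""
        log = []
        for task in self.tasks:
            if not task.done:
                task.done = task.action()
            log.append(f"{task.name}: {self.progress():.0%}")
        return log


# Hypothetical decomposition of one goal into three tasks.
agent = GoalAgent(
    goal="publish a blog post",
    tasks=[
        Task("research", lambda: True),
        Task("draft", lambda: True),
        Task("publish", lambda: False),  # simulated failure
    ],
)
log = agent.run()
```

In a real agent the decomposition itself would come from the AI (for example an LLM proposing the task list), and a failed task would trigger a retry or a new plan. Here the task list is hard-coded to keep the idea visible.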
An LLM is a language model, which is not an agent as it has no goal, but it can be used as a component of an intelligent agent.
A Large Language Model (LLM) is an artificial neural network, loosely comparable to your brain, that is pre-trained on data. It is like you at graduation: by then, the neural connections in the different parts of your brain had built up in a certain way.
GPT (Generative Pre-trained Transformer) is one of the architectures that determine how said brain is trained and how its neural connections are made.
So the GPT models went to OpenAI’s school to get trained the GPT way.
Depending on the training program and iteration (GPT-3.5, GPT-4, etc.), the models have different capabilities, just as you would after going to an arts school versus a STEM school.
And from that training facility a neural network emerged that you can now talk to via text input, i.e., a chat user interface (UI).
One such GPT is GPT-4, which is used in ChatGPT.
With OpenAI’s announcement in early November, you are now able to “build your own GPT”, that is, build your own AI Agent, using ChatGPT in two ways: first, to build your AI Agent using nothing but a chat interface, and second, to communicate with your AI Agent using just a chat interface.
This is like the first commercial web browsers in the mid-1990s, which allowed you to browse the web with a graphical user interface (GUI) where there had been only text-based commands before.
This is huge: building your own agent on top of OpenAI’s LLM with no prior coding knowledge, and then communicating with it through a chat interface.
It goes beyond prompt engineering because you give your own GPT predefined objectives or tasks.
I think understanding the concept of "intelligent agents" is foundational to discussing how Large Language Models (LLMs) like GPT-4 can be integrated into them to provide goal-orientation and agency.
What are Intelligent Agents?
An intelligent agent is a system that perceives its environment through sensors and acts upon that environment through effectors. These agents are designed to achieve specific goals or perform specific tasks. They can range from simple, rule-based systems to advanced AI systems capable of learning and adapting over time.
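The perceive-then-act cycle described above can be sketched with a deliberately simple, rule-based example. This is a toy under my own assumptions: the `Room` environment, the `ThermostatAgent`, and the one-degree heating and cooling steps are made up for illustration and sit at the "simple, rule-based" end of the range.

```python
from typing import Dict


class Room:
    """Toy environment: a room with a temperature (assumed for illustration)."""

    def __init__(self, temperature: float) -> None:
        self.temperature = temperature

    def sense(self) -> Dict[str, float]:
        # The "sensor": what the agent is allowed to observe.
        return {"temperature": self.temperature}

    def apply(self, action: str) -> None:
        # The "effector": how an action changes the environment.
        if action == "heat":
            self.temperature += 1.0
        elif action == "cool":
            self.temperature -= 1.0


class ThermostatAgent:
    """Rule-based agent with one goal: keep the room near a target."""

    def __init__(self, target: float, tolerance: float = 0.5) -> None:
        self.target = target
        self.tolerance = tolerance

    def decide(self, percept: Dict[str, float]) -> str:
        temp = percept["temperature"]
        if temp < self.target - self.tolerance:
            return "heat"
        if temp > self.target + self.tolerance:
            return "cool"
        return "idle"


room = Room(temperature=18.0)
agent = ThermostatAgent(target=21.0)
actions = []
for _ in range(5):
    action = agent.decide(room.sense())  # perceive, then decide
    room.apply(action)                   # act on the environment
    actions.append(action)
```

The agent heats until the room reaches the target and then idles. Nothing here learns or adapts. A more advanced agent would replace the fixed rules in `decide` with something that improves over time.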
In future articles we will discuss the following:
- Key Characteristics of Intelligent Agents
- Integrating LLMs into Intelligent Agents
Conclusion Intelligent Agents
This integration represents a significant advancement in AI.
Intelligent agents equipped with LLM capabilities like those of GPT-4 are not just more efficient or faster.
They also represent a new breed of AI that is more adaptable, interactive, and capable of handling complex, multi-faceted tasks in dynamic environments.
This leap forward opens up new possibilities in various sectors, from business and healthcare to education and beyond.
Bots and Software Agents vs. Intelligent Agents
Software agents are bots.
The Definition of Intelligence
Agency is described as the capacity of an entity (be it an individual, organization, or system) to act independently and make choices. An agent is an entity that has this capacity.
To be an Intelligent Agent, an agent also has to act in an intelligent manner.
Intelligent agents in artificial intelligence are closely related to agents in economics.
Traditionally, this has meant that they are not intelligent, but rational, with rationality meaning the stochastic optimization of a utility function with all its fine-tuning.
But even economics has left its homo economicus worldview behind and is applying models that mimic reality more closely.
Let’s discuss the concept of "intelligence," particularly in the context of artificial intelligence and its relation to the idea of agents, both in AI and economics.
In general terms, intelligence refers to the ability to learn, understand, and apply knowledge to manipulate one's environment or to think abstractly as measured by objective criteria (tests, etc.). It encompasses a range of cognitive abilities, including logic, understanding, self-awareness, learning, emotional knowledge, reasoning, planning, creativity, and problem-solving.
Intelligence in Artificial Intelligence
In the realm of AI, intelligence is often defined in terms of mimicking human cognitive functions. Intelligence often relates to the ability of a system to learn from data, adapt to new situations, and make decisions based on both pre-programmed rules and learned experiences.
- Learning and Adaptation: The ability of AI to learn from data, experiences, and its environment, and adapt its responses accordingly.
- Problem Solving and Decision Making: Using logic and analysis to solve complex problems and make decisions based on data and algorithms.
- Understanding and Interaction: The capacity to understand human languages, emotions, and social cues, and interact in a meaningful way.
Intelligence vs Rationality in Agents
We must make a clear distinction between intelligence and rationality in the context of agents, whether in AI or economics.
Rational Agents: In both AI and economics, a rational agent is one that acts to achieve the best outcome or, when there is uncertainty, the best expected outcome.
Rationality in this sense is tied to the optimization of a given utility function, which can be stochastic and finely tuned based on the context and goals.
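As a small worked example of "best expected outcome", the sketch below picks the action with the highest expected utility. The action names, probabilities, and utility values are invented for illustration, and `expected_utility` and `rational_choice` are my own helper names.

```python
from typing import Dict, List, Tuple

# Each action maps to a list of (probability, utility) outcomes.
Lottery = List[Tuple[float, float]]


def expected_utility(outcomes: Lottery) -> float:
    """Probability-weighted sum of the utilities of all outcomes."""
    return sum(p * u for p, u in outcomes)


def rational_choice(actions: Dict[str, Lottery]) -> str:
    """Pick the action with the highest expected utility."""
    return max(actions, key=lambda name: expected_utility(actions[name]))


# Hypothetical decision problem with made-up numbers.
actions = {
    "safe_bet": [(1.0, 50.0)],
    "risky_bet": [(0.5, 120.0), (0.5, 0.0)],
}
best = rational_choice(actions)
```

The risky bet has an expected utility of 0.5 × 120 + 0.5 × 0 = 60, which beats the certain 50, so a purely rational agent picks it. Note that the utility function is fixed and handed to the agent from outside. The agent optimizes but does not learn.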
Intelligent Agents: While all intelligent agents can be rational, not all rational agents are necessarily intelligent. Intelligence in agents refers to a broader set of capabilities, including learning from past experiences, adapting to new situations, understanding complex scenarios, and making decisions that may not always align with a fixed utility function but consider broader contexts.
For applications in business and AI, this distinction would mean that:
- Rational agents are designed for specific tasks where the goals and parameters are clear, and the best action can be determined through algorithms and data analysis.
- Intelligent agents are systems that are more adaptable, capable of learning and handling a variety of tasks, and can work in environments with more variables and uncertainties.
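To illustrate the "adaptable, capable of learning" side of this distinction, here is a minimal sketch of an agent that estimates the value of its actions from experience and changes its behavior when the environment changes. Everything in it is an assumption for the sake of the example: the `LearningAgent` class, the two actions, the reward numbers, and the simple update rule (nudging an estimate toward each observed reward). It is one of many possible learning schemes, not a standard definition of an intelligent agent.

```python
from typing import Dict, List


class LearningAgent:
    """Keeps a running value estimate per action and exploits the best one."""

    def __init__(self, actions: List[str], step_size: float = 0.5) -> None:
        self.step_size = step_size
        self.estimates: Dict[str, float] = {a: 0.0 for a in actions}
        self.tried: Dict[str, bool] = {a: False for a in actions}

    def choose(self) -> str:
        # Try every action once, then pick the highest current estimate.
        for action, was_tried in self.tried.items():
            if not was_tried:
                return action
        return max(self.estimates, key=self.estimates.get)

    def learn(self, action: str, reward: float) -> None:
        # Move the estimate a fraction of the way toward the observed reward.
        self.tried[action] = True
        error = reward - self.estimates[action]
        self.estimates[action] += self.step_size * error


def run(agent: LearningAgent, rewards: Dict[str, float], steps: int) -> List[str]:
    """Let the agent act for a number of steps in a fixed reward setting."""
    history = []
    for _ in range(steps):
        action = agent.choose()
        agent.learn(action, rewards[action])
        history.append(action)
    return history


agent = LearningAgent(["email", "call"])
# Phase 1: "email" works well (hypothetical rewards).
before = run(agent, {"email": 1.0, "call": 0.2}, steps=6)
# Phase 2: the environment changes and "email" stops paying off.
after = run(agent, {"email": 0.0, "call": 0.2}, steps=6)
```

In the first phase the agent settles on "email". After the change it keeps trying "email" for a few steps, sees its estimate drop, and switches to "call". An agent with a hard-coded rule ("always email") would have been optimal in phase one and stuck in phase two.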
In summary, while rationality in agents (AI or economic) focuses on the optimal performance within a set framework, intelligence encompasses a broader ability to adapt, learn, and handle complexity. For businesses and AI applications, understanding this distinction is key to developing systems that are efficient and also adaptable and capable of handling complex, real-world scenarios.
The Definition of Agency
In the realm of economics, an agent is fundamentally a decision-maker within a model of an economic system. These agents are characterized by their ability to make choices, typically aiming to solve optimization or choice problems, whether well-defined or not. This economic perspective of an agent aligns intriguingly with the notion of intelligent agents in artificial intelligence.
In AI, especially, an intelligent agent refers to a system capable of autonomous actions in order to achieve specific objectives. Just as in economics, these agents are decision-makers, but their decisions are based on algorithmic processing and data analysis rather than human judgment.
Agency, in the broadest sense, refers to the capacity of an entity (be it an individual, organization, or system) to act independently and make choices. In the context of artificial intelligence and robotics, agency takes on a more specific meaning.
Typically, every agent makes decisions by solving a well- or ill-defined optimization or choice problem. This is the objective the agent is given by the principal to take agency over. It is like a platoon commander who is given an order by their general to fulfill a mission objective: the platoon then goes out into the field, takes total ownership of that mission, and starts executing towards mission completion.
An intelligent agent (IA) therefore has an "objective function" that encapsulates all the IA's goals. This means such an agent is designed to create and execute whatever plan will, upon completion, maximize the expected value of the objective function.
The agent is intelligent because it is capable of goal-directed behavior.
Without goal orientation, the GPT is just a UI to an LLM.
Or, in the case of ChatGPT, chatting via text is just a UI to OpenAI’s LLM.
This is very similar to the internet, where text was the first UI to the “information of the internet” and the browser then provided a GUI.
Key Aspects of Agency in AI
- Decision-making ability
- Goal-oriented behavior
- Adaptability and Learning
"specific goal orientation"
Defining Specific Goal Orientations: In the context of intelligent agents, "specific goal orientations" refer to the ability of the system to pursue predefined objectives or tasks. These goals can range from answering questions accurately, providing educational content, assisting in problem-solving, to more complex tasks like creative writing or programming assistance. In ChatGPT, these goal orientations are embedded within the model's training and operational parameters.
Researchers have described several methods for building agency into an LLM such as GPT-4, giving it a form of goal-orientation.