Introduction
Imagine having a personal assistant that not only understands your requests but also knows exactly how to execute them, whether it’s performing a quick calculation or fetching the latest stock market news. In this article, we delve into the world of AI agents and explore how you can build your own using the LlamaIndex framework. We’ll guide you step by step through creating these agents, highlighting the power of LLMs’ function-calling capabilities and demonstrating how agents decide which task to carry out and then execute it. Whether you’re new to AI or an experienced developer, this guide will show you how to build a working AI agent in just a few lines of code.
Learning Outcomes
- Understand the basics of AI agents and their problem-solving capabilities.
- Learn how to implement AI agents using the LlamaIndex framework.
- Explore the function-calling features in LLMs for efficient task execution.
- Discover how to integrate web search tools within your AI agents.
- Gain hands-on experience in building and customizing AI agents with Python.
This article was published as a part of the Data Science Blogathon.
What are AI Agents?
AI agents are like digital assistants on steroids. They don’t just respond to your commands—they understand, analyze, and make decisions on the best way to execute those commands. Whether it’s answering questions, performing calculations, or fetching the latest news, AI agents are designed to handle complex tasks with minimal human intervention. These agents can process natural language queries, identify the key details, and use their abilities to provide the most accurate and helpful responses.
Why Use AI Agents?
The rise of AI agents is transforming how we interact with technology. They can automate repetitive tasks, enhance decision-making, and provide personalized experiences, making them invaluable in various industries. Whether you’re in finance, healthcare, or e-commerce, AI agents can streamline operations, improve customer service, and provide deep insights by handling tasks that would otherwise require significant manual effort.
What is LlamaIndex?
LlamaIndex is a cutting-edge framework designed to simplify the process of building AI agents using Large Language Models (LLMs). It leverages the power of LLMs like OpenAI’s models, enabling developers to create intelligent agents with minimal coding. With LlamaIndex, you can plug in custom Python functions, and the framework will automatically integrate these with the LLM, allowing your AI agent to perform a wide range of tasks.
Key Features of LlamaIndex
- Function Calling: LlamaIndex allows AI agents to call specific functions based on user queries. This feature is essential for creating agents that can handle multiple tasks.
- Tool Integration: The framework supports the integration of various tools, including web search, data analysis, and more, enabling your agent to perform complex operations.
- Ease of Use: LlamaIndex is designed to be user-friendly, making it accessible to both beginners and experienced developers.
- Customizability: With support for custom functions and advanced features like pydantic models, LlamaIndex provides the flexibility needed for specialized applications.
Steps to Implement AI Agents Using LlamaIndex
Let us now look at the steps to implement AI agents using LlamaIndex.
Here we use GPT-4o from OpenAI as our LLM and Bing Search to query the web. LlamaIndex already has a Bing Search tool integration, which can be installed with this command.
!pip install llama-index-tools-bing-search
Step 1: Get the API Key
First, create a Bing Search API key, which can be obtained by creating a Bing Search resource in the Azure portal. For experimentation, Bing also provides a free tier with 3 calls per second and 1k calls per month.
Step 2: Install the Required Libraries
Install the necessary Python libraries using the following commands:
%%capture
!pip install llama_index llama-index-core llama-index-llms-openai
!pip install llama-index-tools-bing-search
Step 3: Set the Environment Variables
Next, set your API keys as environment variables so that LlamaIndex can access them during execution.
import os
os.environ["OPENAI_API_KEY"] = "sk-proj-"
os.environ['BING_API_KEY'] = ""
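Before moving on, it can help to confirm that both keys are actually set, since a blank value (like the placeholder above) would only fail later, when the API is first called. The following is a minimal sketch; the helper name `missing_keys` is hypothetical and not part of LlamaIndex.

```python
import os

# Names of the environment variables this tutorial relies on.
REQUIRED_KEYS = ("OPENAI_API_KEY", "BING_API_KEY")


def missing_keys(env=None, required=REQUIRED_KEYS):
    """Return the names of required keys that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name, "").strip()]


# Report any keys that still need a value before continuing.
absent = missing_keys()
if absent:
    print("Missing API keys:", ", ".join(absent))
```

Passing a plain dictionary as `env` makes the check easy to try without touching the real environment.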
Step 4: Initialize the LLM
Initialize the LLM (in this case, GPT-4o from OpenAI) and run a simple test to confirm it’s working.
from llama_index.llms.openai import OpenAI
llm = OpenAI(model="gpt-4o")
llm.complete("1+1=")
Step 5: Create Two Different Functions
Create two functions that your AI agent will use. The first function performs a simple addition, while the second retrieves the latest stock market news using Bing Search.
import os

from llama_index.tools.bing_search import BingSearchToolSpec

def addition_tool(a: int, b: int) -> int:
    """Returns sum of inputs"""
    return a + b

def web_search_tool(query: str) -> str:
    """A web query tool to retrieve latest stock news"""
    # The API key is read from the environment variable set earlier.
    bing_tool = BingSearchToolSpec(api_key=os.getenv('BING_API_KEY'))
    response = bing_tool.bing_news_search(query=query)
    return response
For a more robust function definition, we can also use pydantic models. For the sake of simplicity, though, here we rely on the LLM’s ability to extract arguments from the user query.
Step 6: Create FunctionTool Objects from User-defined Functions
from llama_index.core.tools import FunctionTool
add_tool = FunctionTool.from_defaults(fn=addition_tool)
search_tool = FunctionTool.from_defaults(fn=web_search_tool)
A FunctionTool lets you convert any user-defined function into a tool object.
By default, the function name becomes the tool name and the docstring is used as the description, but both can be overridden as shown below.
tool = FunctionTool.from_defaults(addition_tool, name="...", description="...")
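To see why the function name, type hints, and docstring matter, the sketch below derives a tool description from a plain function using only the standard library. It illustrates the idea and is not LlamaIndex’s actual implementation; `describe_tool` and the dictionary layout are assumptions made for this example.

```python
import inspect


def addition_tool(a: int, b: int) -> int:
    """Returns sum of inputs"""
    return a + b


def describe_tool(fn, name=None, description=None):
    """Build a simple tool description from a function's metadata."""
    signature = inspect.signature(fn)
    # Map each parameter name to the name of its annotated type.
    parameters = {
        param_name: getattr(param.annotation, "__name__", str(param.annotation))
        for param_name, param in signature.parameters.items()
    }
    return {
        # Explicit values win; otherwise fall back to the function itself.
        "name": name or fn.__name__,
        "description": description or inspect.getdoc(fn) or "",
        "parameters": parameters,
    }


print(describe_tool(addition_tool))
print(describe_tool(addition_tool, name="add", description="Add two integers"))
```

A description like this is the kind of information the model relies on when choosing a tool, which is why a vague docstring or a misleading function name can lead to the wrong tool being picked.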
Step 7: Call the predict_and_call Method with the User’s Query
query = "what is the current market price of apple"
response = llm.predict_and_call(
    tools=[add_tool, search_tool],
    user_msg=query,
    verbose=True,
)
Here we call the LLM’s predict_and_call method with the user’s query and the tools defined above. The tools argument takes a list, so you can pass more than one tool at once. The method examines the user’s query and decides which tool in the list is most suitable for the task.
Sample output
=== Calling Function ===
Calling function: web_search_tool with args: {"query": "current market price of Apple stock"}
=== Function Output ===
[['Warren Buffett Just Sold a Huge Chunk of Apple Stock. Should You Do the Same?', ..........
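The log above shows the two pieces the model produces: a function name and a JSON object of arguments. The sketch below shows how such a call could be routed to the matching Python function. It is a simplified illustration under stated assumptions, not LlamaIndex internals: `dispatch` is a hypothetical helper, the tool call is hard-coded rather than produced by an LLM, and the search tool is a stub that makes no network request.

```python
import json


def addition_tool(a: int, b: int) -> int:
    """Returns sum of inputs"""
    return a + b


def web_search_tool(query: str) -> str:
    """Stub standing in for the Bing-backed tool; performs no real search."""
    return f"(stub) results for: {query}"


# Registry mapping each tool name to its function.
TOOLS = {fn.__name__: fn for fn in (addition_tool, web_search_tool)}


def dispatch(tool_name: str, raw_args: str):
    """Look up the tool by name and call it with the decoded JSON arguments."""
    if tool_name not in TOOLS:
        raise KeyError(f"Unknown tool: {tool_name}")
    kwargs = json.loads(raw_args)
    return TOOLS[tool_name](**kwargs)


print(dispatch("addition_tool", '{"a": 2, "b": 3}'))
print(dispatch("web_search_tool", '{"query": "current market price of Apple stock"}'))
```

In the real flow, the LLM chooses the tool name and fills in the arguments; the framework then performs a lookup and call along these lines.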
Step 8: Putting It All Together
import os

from llama_index.core.tools import FunctionTool
from llama_index.llms.openai import OpenAI
from llama_index.tools.bing_search import BingSearchToolSpec

# Assumes OPENAI_API_KEY and BING_API_KEY are already set in the environment.
llm = OpenAI(model="gpt-4o")

def addition_tool(a: int, b: int) -> int:
    """Returns sum of inputs"""
    return a + b

def web_search_tool(query: str) -> str:
    """A web query tool to retrieve latest stock news"""
    bing_tool = BingSearchToolSpec(api_key=os.getenv('BING_API_KEY'))
    response = bing_tool.bing_news_search(query=query)
    return response

# Wrap the plain functions as tools the LLM can choose from.
add_tool = FunctionTool.from_defaults(fn=addition_tool)
search_tool = FunctionTool.from_defaults(fn=web_search_tool)

query = "what is the current market price of apple"
response = llm.predict_and_call(
    tools=[add_tool, search_tool],
    user_msg=query,
    verbose=True,
)
Advanced Customization
For those looking to push the boundaries of what AI agents can do, advanced customization offers the tools and techniques to refine and expand their capabilities, allowing your agent to handle more complex tasks and deliver even more precise results.
Enhancing Function Definitions
To improve how the AI agent interprets and uses functions, you can incorporate pydantic models. This adds type checking and validation, ensuring that your agent processes inputs correctly.
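The effect of validating arguments before a tool runs can be illustrated without any extra dependency. The sketch below uses a standard-library dataclass as a stand-in for a pydantic model (pydantic itself offers much richer validation and schema generation); `AdditionArgs` and `validated_addition` are hypothetical names chosen for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AdditionArgs:
    """Arguments for the addition tool; both must be integers."""

    a: int
    b: int

    def __post_init__(self):
        for field_name in ("a", "b"):
            value = getattr(self, field_name)
            # bool is a subclass of int, so reject it explicitly.
            if isinstance(value, bool) or not isinstance(value, int):
                raise TypeError(
                    f"{field_name} must be an int, got {type(value).__name__}"
                )


def validated_addition(**kwargs) -> int:
    """Validate the raw arguments, then perform the addition."""
    args = AdditionArgs(**kwargs)
    return args.a + args.b


print(validated_addition(a=2, b=3))
```

With this in place, a malformed call such as `validated_addition(a="2", b=3)` raises a `TypeError` instead of silently producing a wrong result.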
Handling Complex Queries
For more complex user queries, consider creating additional tools or refining existing ones to handle multiple tasks or more intricate requests. This might involve adding error handling, logging, or even custom logic to manage how the agent responds to different scenarios.
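One way to add error handling and logging is to wrap each tool function in a decorator before turning it into a tool. The following is a minimal sketch under stated assumptions: `safe_tool` and `division_tool` are hypothetical names for this example, and returning an error message as a string is just one possible design.

```python
import functools
import logging

logger = logging.getLogger("agent_tools")


def safe_tool(fn):
    """Wrap a tool so calls are logged and failures return a message."""

    @functools.wraps(fn)  # keeps the name and docstring used to describe the tool
    def wrapper(*args, **kwargs):
        logger.info("Calling %s args=%s kwargs=%s", fn.__name__, args, kwargs)
        try:
            return fn(*args, **kwargs)
        except Exception as exc:
            # Log the traceback, then hand back a readable message.
            logger.exception("Tool %s failed", fn.__name__)
            return f"Tool {fn.__name__} failed: {exc}"

    return wrapper


@safe_tool
def division_tool(a: float, b: float) -> float:
    """Returns a divided by b"""
    return a / b


print(division_tool(a=6, b=3))
print(division_tool(a=1, b=0))
```

Because `functools.wraps` preserves the function name and docstring, the wrapped function still carries the metadata that the tool name and description are derived from.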
Conclusion
AI agents can process user inputs, reason about the best approach, access relevant knowledge, and execute actions to provide accurate and helpful responses. They can extract the parameters specified in the user’s query and pass them to the relevant function to carry out the task. With LLM frameworks such as LlamaIndex and LangChain, you can implement agents in a few lines of code and also customize details such as function definitions using pydantic models.
Key Takeaways
- Agents can take multiple independent functions and determine which function to execute based on the user’s query.
- With function calling, the LLM decides the best function to complete the task based on each function’s name and description.
- The function name and description can be overridden by explicitly passing the name and description parameters when creating the tool object.
- LlamaIndex has built-in tools and techniques to implement AI agents in a few lines of code.
- It’s also worth noting that function-calling agents can only be implemented with LLMs that support function calling.
Frequently Asked Questions
Q1. What is an AI agent?
A. An AI agent is a digital assistant that processes user queries, determines the best approach, and executes tasks to provide accurate responses.
Q2. What is LlamaIndex?
A. LlamaIndex is a popular framework that allows easy implementation of AI agents using LLMs, like OpenAI’s models.
Q3. What does function calling do in an AI agent?
A. Function calling enables the AI agent to select the most appropriate function based on the user’s query, making the process more efficient.
Q4. How can I integrate web search into an AI agent?
A. You can integrate web search by using tools like BingSearchToolSpec, which retrieves real-time data based on queries.
Q5. Can AI agents choose between multiple functions?
A. Yes, AI agents can evaluate multiple functions and choose the best one to execute based on the user’s request.
The media shown in this article is not owned by Analytics Vidhya and is used at the Author’s discretion.