What is BabyAGI?

Authors

Ivan Belcic

Staff writer

Cole Stryker

Staff Editor, AI Models

IBM Think

BabyAGI is an autonomous agent framework designed to generate and run a sequence of tasks based on a user-provided objective. Publicly shared by Yohei Nakajima in 2023, BabyAGI orchestrates a loop of task creation, execution and prioritization by using a large language model (LLM) and a vector memory store. 

The standard implementation is a Python script that uses OpenAI’s GPT models through an API, a vector database (commonly Pinecone) for memory and the LangChain agent framework to structure AI agent roles. The vector database records task results as embeddings used for context retrieval, while the LLM powers the agent reasoning and task logic.1

As an autonomous AI agent, BabyAGI continuously iterates by using completed task outcomes to inform new tasks, reprioritize the task list and run subtasks. The process continues until the task queue is exhausted or a stop condition is reached.

The latest AI trends, brought to you by experts

Get curated insights on the most important—and intriguing—AI news. Subscribe to our weekly Think newsletter. See the IBM Privacy Statement.

Thank you! You are subscribed.

Your subscription will be delivered in English. You will find an unsubscribe link in every newsletter. You can manage your subscriptions or unsubscribe here. Refer to our IBM Privacy Statement for more information.

How does BabyAGI work?

BabyAGI uses a repeating three-stage AI workflow

  1. Task execution: The execution agent runs a task by using context from the vector database and the high-level objective as guidance.
     

  2. Task creation: Based on the outcome of the executed task, the task creation agent generates follow-up tasks that align with the original objective. 
     

  3. Task prioritization: A prioritization agent reorders all outstanding tasks, including new tasks, based on dependencies and relevance to the goal. 

The loop repeats until no tasks remain or another end condition is met.

BabyAGI’s key features

BabyAGI consists of several core architectural modules that work together to facilitate automated task generation, prioritization and execution. These components include:

  • LLM
     

  • Vector database
     

  • Task list
     

  • Task execution agent
     

  • Task creation agent
     

  • Task prioritization agent

LLM

BabyAGI’s LLM component is the central orchestrator of the agentic system. This artificial intelligence (AI) model acts as a high-level director, receiving the user’s prompt and assessing it with natural language processing (NLP) to identify the goal. It also powers the agents that create, execute and prioritize tasks.

BabyAGI typically uses OpenAI’s GPT-4. The three agents in the BabyAGI system use precise prompt engineering to guide GPT-4’s behavior in its agentic roles.

Vector database

BabyAGI’s vector database component stores the records and results of completed tasks and is the agent’s memory. BabyAGI can use the results of the first task to inform the second task and iterates on this process as it proceeds down the task list.

Vector databases store data as mathematical representations called embeddings. Data points that are closer to each other in the high-dimensional vector space are considered to be more semantically similar. BabyAGI uses semantic search to find relevant information in the database.

The canonical implementation uses Pinecone, but alternative vector stores such as Meta’s Facebook AI Similarity Search (FAISS) and Chroma are sometimes used in variants or forks. FAISS and Chroma are open source, while Pinecone, like many OpenAI products, is not.

Task list

The task list or queue is a prioritized list of subtasks stemming from the high-level goal and initial task. As the task execution agent completes tasks, those results are uploaded to the vector database. based on the outcomes of those tasks, the task list can change as priorities are adjusted and new tasks are added.

Task execution agent

The task execution agent uses the LLM and the data in the vector database to execute the tasks in the task list. Semantic search techniques are used to find relevant information in the database. After the task is complete, the system creates a new embedding and stores the record in the database.

Task creation agent

The task creation agent uses the high-level goal and the results of previous tasks to generate the next tasks in the workflow. Rather than work through a predetermined workflow, the ongoing task generation process allows the system to iterate on past results and learn dynamically.

Task prioritization agent

The task prioritization agent handles task management by regularly reordering and organizing the task list. Its job is to prioritize subtasks based on the results of previous tasks and how new tasks relate to the high-level goal. The prioritization agent also considers dependencies between tasks: if one task must be completed before another becomes possible.

AI agents

5 Types of AI Agents: Autonomous Functions & Real-World Applications

Learn how goal-driven and utility-based AI adapt to workflows and complex environments.

How to use BabyAGI

BabyAGI is a Python library and requires some Python coding knowledge to use. However, the setup process is relatively streamlined:

  1. Install Python and Git. Download the BabyAGI GitHub repository from github.com.
     

  2. Open the directory with BabyAGI and install all dependencies using the pip install command.
     

  3. Create a .env file and copy the .env example file to it.
     

  4. Add an OpenAI API key and Pinecone API key to the .env file. If needed, first create an OpenAI account and obtain an API key.
     

  5. Define the objective by changing the OBJECTIVE value. Then, provide an initial task.
     

  6. Save and close the .env file.
     

  7. Enter the command python babyagi.py to run the agent.

BabyAGI use cases

BabyAGI is more of an educational sandbox as opposed to a production-grade application ready for mainstream agentic AI use. Machine learning (ML) and agentic AI enthusiasts have used BabyAGI to explore autonomous task agents and chain-of-thought reasoning with LLMs.

BabyAGI vs AutoGPT

BabyAGI is often compared to AutoGPT, another open-source framework for autonomous agents built on LLMs. Both are AI tools designed to automate multistep objectives by combining an LLM with memory and tool use.

BabyAGI runs a compact loop that creates, executes and reprioritizes tasks with a vector database for short- and long-term memory. AutoGPT provides a feature-rich framework for goal decomposition, tool integration and external API use.

While BabyAGI is best used as a research tool and sandbox, AutoGPT can automate larger-scale tasks.

Is BabyAGI artificial general intelligence (AGI?)

Despite its name, BabyAGI is not an example of artificial general intelligence (AGI). AGI is a hypothetical AI that has human-level thought and reasoning capabilities. To date, AGI is still a theoretical concept. No AI application, including BabyAGI, has reached such a level of sophistication.

Like many other examples of generative AI applications, BabyAGI uses advanced statistical modeling to predict the most likely outcome for any given input. It does not understand, learn and think as humans do.

What is BabyAGI 2?

In 2024, Nakajima introduced BabyAGI 2, an experimental variant that uses a functionz framework that stores functions and their associated metadata in a database. The agent can load, run and update functions with metadata as it builds itself.

Abstract portrayal of AI agent, shown in isometric view, acting as bridge between two systems
Related solutions
IBM® watsonx Orchestrate™ 

Easily design scalable AI assistants and agents, automate repetitive tasks and simplify complex processes with IBM® watsonx Orchestrate™.

Explore watsonx Orchestrate
Artificial intelligence solutions

Put AI to work in your business with IBM's industry-leading AI expertise and portfolio of solutions at your side.

Explore AI solutions
AI consulting and services

Reinvent critical workflows and operations by adding AI to maximize experiences, real-time decision-making and business value.

Explore AI services
Take the next step

Whether you choose to customize pre-built apps and skills or build and deploy custom agentic services using an AI studio, the IBM watsonx platform has you covered.

Explore watsonx Orchestrate Explore watsonx.ai