Start with straightforward questions about your agent.
Who will use it? Technical users and non-technical users have different expectations, from how actively they’ll need to manage the workflow to the kind of user interface they’ll be comfortable with.
What are its responsibilities? You’ll need to decide if the AI agent is intended to tackle the entire problem or simply automate away some of the more tedious subtasks required to solve it.
What kinds of inputs will it process? An agent that will process multiple data modalities—text, audio, image, video—might require multiple models. Multimodal models exist, but sometimes a single AI model won’t provide adequate accuracy across all of your specific needs. For instance, if your agent needs to extract text from images and PDFs, an optical character recognition (OCR) model might outperform a generalist vision-language model (VLM). If those PDFs contain complex tables, charts and equations, you might need a dedicated document conversion model. If your agent will be working with very niche or domain-specific examples, its constituent models might need fine-tuning on a custom dataset.
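One way to picture this decision is a small routing function that sends each input to the kind of model best suited to it. The sketch below is illustrative only: the model-family labels, the file extensions and the `complex_layout` flag are assumptions made for this example, not names from any particular framework. It also assumes that image files are scanned documents whose text needs extracting.

```python
from pathlib import Path

# Hypothetical groupings; adjust to the file types your agent actually sees.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".tiff"}
AUDIO_SUFFIXES = {".wav", ".mp3"}
TEXT_SUFFIXES = {".txt", ".md"}


def route_input(filename: str, complex_layout: bool = False) -> str:
    """Return a label for the model family that should handle this file.

    The labels are placeholders for real model calls.
    """
    suffix = Path(filename).suffix.lower()
    if suffix == ".pdf":
        # PDFs with complex tables, charts or equations go to a dedicated converter.
        return "document_conversion" if complex_layout else "ocr"
    if suffix in IMAGE_SUFFIXES:
        return "ocr"
    if suffix in AUDIO_SUFFIXES:
        return "speech_to_text"
    if suffix in TEXT_SUFFIXES:
        return "llm"
    # Anything else (for example video) falls back to a generalist multimodal model.
    return "vlm"
```

A table like this also doubles as a checklist: every label it returns is a model you will need to select, evaluate and possibly fine-tune.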
What data sources and knowledge bases will it need access to? One of the most basic and essential agentic AI functions is retrieval-augmented generation (RAG). A customer support chatbot, for instance, might need to reference information from your company’s CRM platform or FAQs. A software engineering agent will likely need to reference your codebase. Many agents need access to real-time information through search engines or specific web services. Knowing not just what information an AI agent will need, but also where that information lives, is essential.
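To make the retrieval step concrete, here is a minimal sketch of the RAG pattern: find the most relevant passages, then place them in the prompt. It assumes an in-memory list of FAQ strings and scores relevance by simple word overlap, a deliberate stand-in for the embedding search and vector store a production system would more likely use. The function names and the prompt wording are hypothetical.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, documents: list[str], top_k: int = 2) -> list[str]:
    """Return up to top_k documents that share the most words with the query."""
    query_words = tokenize(query)
    scored = [(len(query_words & tokenize(doc)), doc) for doc in documents]
    # Drop documents with no overlap, then rank the rest by shared-word count.
    relevant = [pair for pair in scored if pair[0] > 0]
    relevant.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in relevant[:top_k]]


def build_prompt(query: str, documents: list[str]) -> str:
    """Assemble a prompt that grounds the model in the retrieved passages."""
    context = "\n".join(f"- {doc}" for doc in retrieve(query, documents))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


FAQS = [
    "To reset your password, open Settings and choose Reset Password.",
    "Refunds are processed within five business days.",
    "Support is available by chat on weekdays.",
]
```

Calling `build_prompt("How do I reset my password?", FAQS)` produces a prompt containing only the password entry, which is the string you would then send to the LLM.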
What tools will it need access to? One of the fundamental benefits of an AI agent is the ability to supplement an LLM’s capabilities through external tool use. An agent acting as an “AI assistant” should be able to write and modify calendar items. Coding agents need a sandboxed environment to execute scripts. An AI agent in a sales or marketing environment might need external apps to send emails and other communications. External calculators or computational engines like Wolfram Alpha ensure mathematical operations can be performed reliably and quickly.
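Tool use usually comes down to two pieces: a registry of functions the agent is allowed to call, and a dispatcher that executes whatever call the model requests. The sketch below assumes the model emits a request as a dictionary with `name` and `arguments` keys. That shape, the `tool` decorator and both example tools are hypothetical, and the calendar tool only returns a record rather than talking to a real calendar service.

```python
from typing import Any, Callable

# Registry mapping tool names to the Python functions that implement them.
TOOLS: dict[str, Callable[..., Any]] = {}


def tool(name: str) -> Callable[[Callable[..., Any]], Callable[..., Any]]:
    """Decorator that registers a function under the given tool name."""
    def register(fn: Callable[..., Any]) -> Callable[..., Any]:
        TOOLS[name] = fn
        return fn
    return register


@tool("add")
def add(a: float, b: float) -> float:
    """Stand-in for an external calculator."""
    return a + b


@tool("create_event")
def create_event(title: str, date: str) -> dict[str, str]:
    """Stand-in for a calendar integration; returns the event it would create."""
    return {"title": title, "date": date, "status": "created"}


def call_tool(request: dict[str, Any]) -> Any:
    """Look up the requested tool and run it with the supplied arguments."""
    name = request["name"]
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    return TOOLS[name](**request.get("arguments", {}))
```

For example, `call_tool({"name": "add", "arguments": {"a": 2, "b": 3}})` runs the calculator tool. Listing the registry contents early in planning is a quick way to see which integrations and permissions the agent will require.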
How will you measure success? AI agent development is an iterative process of testing, deploying, evaluating and optimizing. Depending on your use case, “success” might be measured by time saved, cases reviewed, completion rates, accuracy or direct user ratings.
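Whichever metrics you choose, it helps to decide up front what you will log for each agent run so the numbers can be computed consistently. The sketch below assumes a hypothetical `RunRecord` with four fields and rolls a batch of runs up into a summary. The field names, the 1-to-5 rating scale and the choice to measure accuracy only over completed runs are all assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class RunRecord:
    completed: bool       # did the agent finish the task?
    correct: bool         # was the final result judged correct?
    seconds_saved: float  # estimated time saved versus doing the task manually
    user_rating: int      # direct user rating, assumed to be on a 1-5 scale


def summarize(runs: list[RunRecord]) -> dict[str, float]:
    """Aggregate per-run records into headline success metrics."""
    if not runs:
        raise ValueError("no runs to summarize")
    completed = [run for run in runs if run.completed]
    # Accuracy is computed over completed runs only; it is 0.0 if none completed.
    accuracy = (
        sum(run.correct for run in completed) / len(completed) if completed else 0.0
    )
    return {
        "completion_rate": len(completed) / len(runs),
        "accuracy": accuracy,
        "total_seconds_saved": sum(run.seconds_saved for run in runs),
        "mean_rating": sum(run.user_rating for run in runs) / len(runs),
    }
```

Recomputing the same summary after each round of changes gives you a like-for-like view of whether the agent is actually improving.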