Six strategic shifts to build agentic AI with small, fast, open models


Authors

Laura Langendorf

Rogerio Gonçalves

AI/ML Ops Brand Content Strategist

The old rules of enterprise software development, such as six-to-twelve-month roadmaps, heavy upfront infrastructure and monolithic releases, can’t keep pace with today’s AI-powered world. It’s time to throw out the old rules, reinvent your development approach and rethink how you build applications, guided by these six mantras:

1. Build fit-for-business models that deliver faster ROI

Let’s start with the heart of any application: the model behind it. Not every AI problem needs a model with hundreds of billions of parameters. Small, domain-tuned models often match or exceed generic large models on specific tasks, delivering comparable accuracy at a fraction of the cost, with faster inference. By zeroing in on text summarization and analysis, code generation, document QA or other well-scoped problems, development teams can:

  • Lower inference costs per query, making it economically viable for large fleets of agents
  • Reduce latency to subsecond responses, critical for interactive workflows and human-in-the-loop processes
  • Deploy in hybrid or edge environments to avoid cloud egress fees while preserving data sovereignty and compliance

Selecting the right model isn’t about pursuing the highest parameter count; it’s about assessing cost per use, latency to value and fit-for-task metrics from day one.
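A minimal Python sketch of that selection logic: filter candidates on hard latency and accuracy thresholds, then rank survivors by accuracy per dollar. All model names, prices and benchmark figures here are illustrative placeholders, not real measurements.

```python
# Hypothetical model-selection sketch. Candidates carry three fit-for-task
# metrics: cost per 1K queries (USD), p95 latency (ms) and task accuracy (0-1).
CANDIDATES = [
    {"name": "small-domain-tuned", "cost": 0.40, "latency_ms": 180, "accuracy": 0.91},
    {"name": "generic-large", "cost": 6.50, "latency_ms": 1200, "accuracy": 0.93},
]

def score(model, max_latency_ms=1000, min_accuracy=0.85):
    """Apply hard fit-for-task thresholds, then rate cost-effectiveness."""
    if model["latency_ms"] > max_latency_ms or model["accuracy"] < min_accuracy:
        return None  # fails the interactive-latency or quality bar
    return model["accuracy"] / model["cost"]  # accuracy per dollar

ranked = sorted(
    (m for m in CANDIDATES if score(m) is not None),
    key=score,
    reverse=True,
)
print([m["name"] for m in ranked])
```

With these placeholder numbers, the large model is eliminated by the subsecond latency requirement before cost is even considered, which mirrors the point above: the thresholds do most of the selection work.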

2. Make platforms agile and the ecosystem open

Gen AI success depends on more than just choosing a model. It requires selecting the right model for the task and surrounding it with the tools, platforms and development practices that turn AI into real business outcomes. Developers and their managers should push for investment in:

  • Open-source AI models that avoid lock-in and foster vibrant developer communities
  • Microfactory architectures, which involve small teams that assemble and manage purpose-built model bundles, templates and best practices for each core use case
  • Modular pipelines, which chain lightweight models and business rules as microservices to enable rapid iteration, continuous integration and seamless rollbacks

This modular, open approach condenses pilot programs into weeks rather than months, enabling teams to create new agents in minutes and unleashing gen AI’s transformative potential across the enterprise.
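The modular-pipeline idea can be sketched in a few lines of Python: each stage is a small, swappable callable, whether a lightweight model or an encoded business rule. The stage functions below (a stand-in summarizer and an email-redaction rule) are hypothetical examples, not part of any specific product.

```python
import re

def redact_emails(text: str) -> str:
    # A business rule encoded as a pipeline stage.
    return re.sub(r"\S+@\S+", "[redacted]", text)

def summarize(text: str) -> str:
    # Stand-in for a call to a small summarization model.
    return text[:40]

def run_pipeline(stages, payload):
    # Stages can be reordered, replaced or rolled back independently,
    # which is what makes rapid iteration and seamless rollbacks cheap.
    for stage in stages:
        payload = stage(payload)
    return payload

result = run_pipeline([redact_emails, summarize],
                      "Contact alice@example.com for details")
print(result)
```

Because each stage has the same string-in, string-out contract, swapping a rule for a model (or vice versa) is a one-line change to the stage list.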

3. Embed responsible AI from the ground up

True responsible AI isn’t an afterthought; it’s embedded in every stage of development. Building these applications means that developers need to keep a consistent focus on:

  1. Model selection and training transparency: Choose models trained on open, auditable datasets that align with corporate values and regulatory requirements.
  2. Data engineering rigor: Track lineage, document transformations and maintain clear, explainable pipelines from raw data to model output.
  3. Human-in-the-loop governance: Embed review checkpoints and performance thresholds that trigger alerts or escalations, promoting compliance and quality.

By standardizing governance at the source, development teams can help reduce bias, protect privacy and foster trust, laying the groundwork for sustainable, scalable gen AI deployments.
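Point 3 above, human-in-the-loop governance, can be made concrete with a small routing sketch: outputs below a confidence threshold are escalated for human review instead of being auto-approved. The threshold value and record fields are assumptions for illustration.

```python
def route(prediction, confidence, threshold=0.8):
    """Checkpoint: low-confidence outputs trigger an escalation
    instead of flowing straight into the business process."""
    if confidence < threshold:
        return {"status": "escalated", "prediction": prediction}
    return {"status": "auto-approved", "prediction": prediction}

# A borderline output goes to a human reviewer; a confident one passes.
low = route("loan-approved", 0.62)
high = route("loan-approved", 0.95)
print(low["status"], high["status"])
```

In practice the escalation branch would raise an alert or open a review ticket; the key design point is that the checkpoint lives in the pipeline itself, not in an after-the-fact audit.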

4. Operationalize agentic AI through a full lifecycle

Once your foundation is in place with fast, fit-for-business models, modular pipelines and responsible development practices, the next step in the playbook is to operationalize AI through agents. This initiative is likely a topic of active discussion across your organization, and it is where theory becomes measurable impact.

An AI agent is a semiautonomous “worker” that reviews inputs, reasons about tasks and acts—often collaborating with humans and other agents. Integrating these agents into enterprise workflows drives measurable productivity gains.

These are the lifecycle phases:

  1. Design: Define clear agent roles (for example, "code extractor," "test generator," "compliance analyzer").
  2. Build: Assemble pipelines of specialized, lean models and encode business rules for each phase.
  3. Deploy: Orchestrate agents on a central platform with monitoring, traceability and service-level agreements.
  4. Operate and refine: Collect feedback loops, retrain or reconfigure agents and continuously measure return on investment (ROI).
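The four phases above can be sketched as a minimal agent class: the constructor covers design and build (a named role plus a pipeline of lean steps), invocation covers deploy, and a metrics counter provides the raw material for operate and refine. Role names, pipeline steps and metric fields are all illustrative.

```python
class Agent:
    def __init__(self, role, pipeline):
        # Design: a clear role. Build: a pipeline of specialized steps.
        self.role = role
        self.pipeline = pipeline
        self.metrics = {"runs": 0, "errors": 0}  # feeds Operate & refine

    def act(self, payload):
        # Deploy: the orchestrator calls act() and can trace every run.
        self.metrics["runs"] += 1
        try:
            for step in self.pipeline:
                payload = step(payload)
            return payload
        except Exception:
            self.metrics["errors"] += 1
            raise

# Toy "test generator" agent whose steps are plain string transforms.
test_generator = Agent("test generator", [str.strip, str.lower])
output = test_generator.act("  DEF Add(a, b): RETURN a + b  ")
print(output, test_generator.metrics)
```

A real deployment would replace the string transforms with model calls and push the metrics dict to a central monitoring platform, but the lifecycle shape is the same.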

5. Pair agents with the right models to maximize value

The key to putting agents to work at scale lies in how effectively you pair them with fast models built for business, making the agentic lifecycle not just possible, but productive. This approach requires careful design, orchestration and measurement practices that align technical components with business outcomes. Here’s how to make that integration count:

  • Choose the right model: Benchmark small models on key tasks (for example, code-to-text, SQL generation, entity extraction) and compare latency and cost per 1,000 tokens.
  • Build factories and orchestration: Group related agents into dedicated factories for core use cases (for example, banking mainframe, insurance engine).
  • Govern and monitor: Provide a central UI/API for agent access, and track performance metrics (latency, errors), versioning (agent ID) and human-in-the-loop checkpoints.
  • Measure value: Define clear KPIs—such as development hours saved, code quality improvements and cost reductions—and use a live dashboard to monitor time to modernize and tech debt pay-down.
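The first bullet, benchmarking latency and cost per 1,000 tokens, can be sketched as a tiny harness that times a model callable over sample prompts and derives cost from a per-token price. The stand-in model, the whitespace token count and the price are all placeholders; a real harness would use the provider's tokenizer and published pricing.

```python
import time

def benchmark(model_fn, prompts, price_per_token):
    """Time a model callable on sample prompts and estimate token cost."""
    latencies, tokens = [], 0
    for prompt in prompts:
        start = time.perf_counter()
        output = model_fn(prompt)
        latencies.append(time.perf_counter() - start)
        # Crude token count: whitespace split on prompt + completion.
        tokens += len(prompt.split()) + len(output.split())
    return {
        "avg_latency_s": sum(latencies) / len(latencies),
        "cost_per_1k_tokens": price_per_token * 1000,
        "total_cost": tokens * price_per_token,
    }

# Stand-in for a small SQL-generation model.
fake_small_model = lambda prompt: "SELECT * FROM users"
report = benchmark(fake_small_model, ["list all users", "show orders"], 2e-7)
print(report)
```

Running the same harness against each candidate model on the same task sample gives the side-by-side latency and cost-per-1,000-token comparison the bullet calls for.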

6. Scale fast, small models to drive enterprise impact

When a single fast, compact model costs just a small fraction of what a large language model costs, adding extra agents becomes a marginal expense. This economy of proliferation means:

  • Rapid scaling: Dozens—or even hundreds—of specialized agents can run concurrently without blowing budgets.
  • Blast‑radius control: Lightweight containers and fault-isolated deployments ensure that individual agent failures don’t cascade into system-wide outages.
  • Continuous optimization: Cost-aware monitoring dashboards track CPU/GPU usage, per-agent inference use and performance KPIs—fueling ongoing refinements.

Companies that leverage this scale advantage can move more quickly from proof of concept (PoC) to enterprise-wide production. This unlocks gen AI’s full productivity benefits, from reducing costs and accelerating time-to-market to improving quality, decision-making and team efficiency.
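The cost-aware monitoring bullet above can be illustrated with a small aggregation: roll up per-agent inference spend and flag agents that exceed a budget. The event records, agent IDs and budget figure are made up for the sketch.

```python
from collections import defaultdict

# Hypothetical inference events: (agent_id, cost of one call in USD).
EVENTS = [
    ("code-extractor", 0.002), ("code-extractor", 0.003),
    ("test-generator", 0.020), ("test-generator", 0.030),
]

def spend_by_agent(events):
    """Aggregate total inference spend per agent."""
    totals = defaultdict(float)
    for agent_id, cost in events:
        totals[agent_id] += cost
    return dict(totals)

def over_budget(totals, budget_usd=0.01):
    """Return agents whose spend exceeds the budget, for review."""
    return sorted(agent for agent, spend in totals.items() if spend > budget_usd)

totals = spend_by_agent(EVENTS)
print(over_budget(totals))
```

Feeding these totals into a dashboard alongside CPU/GPU usage and performance KPIs is what turns raw telemetry into the "continuous optimization" loop described above.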

From strategic shifts to sustainable transformation

Embracing these six mantras isn’t just about building better AI—it’s about reshaping how innovation happens and where intelligence lives inside the enterprise.

These shifts move organizations from experimentation to transformation. Development cycles collapse from months to weeks. AI agents evolve from promising prototypes into business-critical operators. And innovation becomes predictable: not a lucky outlier, but a repeatable process embedded in every product sprint and business decision.

The next frontier of enterprise AI will be shaped by:

  • Composable architectures: Agile, reusable components that let teams rapidly deploy and iterate without reinventing the wheel.
  • Cost-aware orchestration: AI systems that self-monitor efficiency and dynamically optimize for performance, budget and compliance.
  • Hybrid intelligence: Seamless collaboration between humans and AI agents—with governance, transparency and trust at the center.
  • Industry-tuned models at scale: Rather than one-size-fits-all LLMs, organizations operationalize fleets of domain-specialized models fine-tuned for their exact needs.

Enterprise AI is moving from centralized moon shots to decentralized utility—from one massive model to many right-sized ones, embedded across functions. The organizations that win don’t just build smarter models; they build smarter systems, teams and strategies.

By embracing these six strategic shifts, you’re not just modernizing your development stack—you’re future-proofing your business.
