Date: 2024-10-21

Fine-tuning means giving an AI model enough additional data to change some of its parameters. Fine-tuning permanently changes the behavior of a model, adapting it to a particular use case or context. It’s also faster and cheaper than training a brand-new model.

“If you have a neural network that has 100 different layers, training it would mean that you’re modifying all 100 layers,” explains Choie. “Fine-tuning would mean that you’re really changing the last few layers. You’re still modifying the model, but you don’t have to change it entirely because it’s already performing well.”
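Choie's description can be sketched in a few lines of Python. This is a minimal illustration, not how any particular framework implements fine-tuning: the `Layer` class, the `freeze_all_but_last` helper, and the figure of 1,000 parameters per layer are all hypothetical, chosen only to show how freezing the earlier layers shrinks the set of parameters that get updated.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Layer:
    """Hypothetical stand-in for one layer of a neural network."""
    name: str
    num_params: int
    trainable: bool = True


def freeze_all_but_last(layers: List[Layer], k: int) -> None:
    """Mark only the final k layers as trainable and freeze the rest."""
    cutoff = max(len(layers) - k, 0)
    for i, layer in enumerate(layers):
        layer.trainable = i >= cutoff


def trainable_params(layers: List[Layer]) -> int:
    """Count the parameters that an update step would be allowed to change."""
    return sum(layer.num_params for layer in layers if layer.trainable)


# Assumed example: a 100-layer network with 1,000 parameters per layer.
model = [Layer(f"layer_{i}", 1000) for i in range(100)]

# Training from scratch: every layer is modified.
full_training = trainable_params(model)

# Fine-tuning as described above: only the last few layers are modified.
freeze_all_but_last(model, k=3)
fine_tuning = trainable_params(model)

print(full_training, fine_tuning)  # prints "100000 3000"
```

In this toy setup, fine-tuning touches 3% of the parameters that full training would, which is one way to picture why it tends to be faster and cheaper. How many layers to leave unfrozen is a design choice that depends on the model and the task.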

Fine-tuning requires a little more upfront investment than prompt engineering and RAG. It is useful for turning a smaller model into an expert in a specialized domain. For example, an insurance company can fine-tune a model to master the art of processing new claims.

Varshney likens a fine-tuned model to an intensively trained new hire fresh out of school. They might not have the breadth of knowledge that a genius polymath (or big, general-purpose AI model) has, but they are much better at processing claims than the polymath would be.

“It can’t do your taxes or write a legal contract,” Varshney says. “But if I ask it to process a claim, it would know how to do it right away.”

Using proprietary data in these ways can offer a significant competitive advantage by familiarizing AI models with an enterprise’s unique processes, products, customers and other nuances.

“If you have an AI whose main users are from a particular enterprise, it is important that the AI uses data from that same enterprise,” Choie says.

When AI models have access to proprietary data, they are grounded in a specific business context, which means their outputs are also grounded in that context.

“I can take an open AI model, fine-tune it with my own proprietary data, and that copy is uniquely mine,” Varshney says. “I own the IP behind it. I run it on my own infrastructure.”

As a result, these models can produce more accurate and effective outputs than unaugmented, off-the-shelf models pulling from a general body of public data.