Choice of Generation Model
Many factors go into choosing a model that will work well for your project.
The model's license may restrict how it can be used. For example, a model's license may prevent it from being used as part of a commercial application.
The data set used to train a model has a direct impact on how well the model works for a specific application, and it significantly affects the risk that the model may generate nonsensical, offensive, or simply unwanted responses. Similarly, models trained on copyrighted or private data may open their users to legal liability. IBM provides full training data transparency and indemnification from legal claims arising from its models.
The size of the model (how many parameters it is trained with) and the size of its context window (how long a passage of text the model can accept) affect model performance, resource requirements, and throughput. While it's tempting to go with a "bigger is better" philosophy and choose a 20 billion parameter model, the resource requirements and improvement (if any) in accuracy may not justify it. Recent studies have shown that smaller models can significantly outperform larger ones for some solutions.
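To get a rough sense of how parameter count translates into resource requirements, you can estimate the memory needed just to hold a model's weights. The sketch below assumes weights stored at 16-bit precision (2 bytes per parameter); actual usage is higher once activations, the KV cache, and framework overhead are included.

```python
def weight_memory_gb(num_params: float, bytes_per_param: int = 2) -> float:
    """Approximate memory (in GiB) needed to load a model's weights.

    Assumes a uniform precision of `bytes_per_param` bytes per parameter
    (2 for 16-bit floats); quantized models use less, and runtime overhead
    adds more on top of this figure.
    """
    return num_params * bytes_per_param / 1024**3

# A 20-billion-parameter model at 16-bit precision needs roughly 37 GiB
# for the weights alone, before any inference overhead:
print(f"{weight_memory_gb(20e9):.1f} GiB")
```

Estimates like this explain why a smaller model can be attractive even at some cost in accuracy: halving the parameter count roughly halves the memory footprint, which can be the difference between fitting on a single GPU and requiring several.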
Any fine-tuning applied to a model can affect its suitability for a task. For example, IBM offers two versions of the Granite model: one tuned for general chat applications and another tuned to follow instructions.
Other considerations when choosing a model include: