Unitxt

Evaluate Granite's performance.

Unitxt is a library for textual data preparation and evaluation of generative language models. It breaks the data preparation and evaluation flows into modular components that are easy to customize and share between practitioners. Unitxt has an extensive catalog of datasets, tasks, templates, metrics, and textual operators that can be used as-is or combined to create new evaluations.

In this recipe, we evaluate Granite's performance on the OpenbookQA dataset from Hugging Face. We load the dataset, create the model client, run inference, and evaluate the results.
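The four steps can be pictured with a minimal pure-Python sketch. It deliberately does not use the Unitxt API or call Granite: every name in it (`Example`, `load_toy_dataset`, `render_prompt`, `stub_client`, `run_inference`, `accuracy`) is hypothetical, the two questions are made-up toy items rather than OpenbookQA records, and the model client is a stand-in that always answers "B". The sample code linked below shows the actual recipe.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Example:
    """One multiple-choice item (hypothetical structure, not a Unitxt type)."""
    question: str
    choices: List[str]
    answer: str  # gold label such as "A" or "B"


def load_toy_dataset() -> List[Example]:
    # Step 1: load the dataset. These are made-up toy items for illustration.
    return [
        Example(
            "Which object is attracted to a magnet?",
            ["a wooden spoon", "an iron nail", "a glass cup", "a rubber band"],
            "B",
        ),
        Example(
            "Which of these is a source of light?",
            ["the sun", "a rock", "a shadow", "a mirror"],
            "A",
        ),
    ]


def render_prompt(ex: Example) -> str:
    # A template turns a structured example into the text sent to the model.
    labels = "ABCD"
    lines = [f"Question: {ex.question}"]
    lines += [f"{labels[i]}. {choice}" for i, choice in enumerate(ex.choices)]
    lines.append("Answer:")
    return "\n".join(lines)


def stub_client(prompt: str) -> str:
    # Step 2: create the model client. This stand-in ignores the prompt and
    # always answers "B"; a real client would send the prompt to a model.
    return "B"


def run_inference(client: Callable[[str], str], dataset: List[Example]) -> List[str]:
    # Step 3: run inference by sending one rendered prompt per example.
    return [client(render_prompt(ex)).strip() for ex in dataset]


def accuracy(predictions: List[str], dataset: List[Example]) -> float:
    # Step 4: evaluate by comparing predictions with the gold labels.
    correct = sum(pred == ex.answer for pred, ex in zip(predictions, dataset))
    return correct / len(dataset)


dataset = load_toy_dataset()
predictions = run_inference(stub_client, dataset)
score = accuracy(predictions, dataset)
print(f"accuracy = {score:.2f}")
```

Keeping the template, the client, and the metric as separate functions mirrors the modular idea described above: any one of them can be swapped without touching the others.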

You will need a Hugging Face token to run this recipe in Colab. Instructions for obtaining this credential can be found here.

Get started

Explore sample code in a GitHub repo

Try it out

Execute sample code in Colab