Date: 2024-11-05

Training a generative AI model begins with a deep learning model, or foundation model, which is trained on huge volumes of data. This data is ingested, prepared, and standardised to create a neural network of parameters, calculations and data.

Traditional platforms focus mainly on structured data, whereas generative AI shifts the emphasis towards multi-modal data. The scope of Data Governance must therefore include the policies, processes, and procedures needed to fully support multi-modal data, which is biased towards unstructured content (e.g. text, images, video, audio, …). This adds new dimensions to how you undertake tasks such as data quality checking, data profiling, and tracking data history/origin.
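
As a rough illustration of what profiling and origin tracking might look like for mixed-modality assets, the sketch below builds a small governance record per asset. The `AssetRecord` class, the `profile_asset` function, the field names, and the specific checks are all hypothetical, chosen only for illustration; a real governance platform would apply much richer, modality-specific checks.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class AssetRecord:
    """Hypothetical governance record for a single data asset."""
    asset_id: str
    modality: str      # e.g. "text", "image", "video", "audio"
    origin: str        # where the asset came from (data history/origin)
    sha256: str        # content fingerprint, useful for tracing and de-duplication
    size_bytes: int
    issues: list = field(default_factory=list)


def profile_asset(asset_id: str, modality: str, origin: str, payload: bytes) -> AssetRecord:
    """Profile one asset and run a few illustrative quality checks."""
    record = AssetRecord(
        asset_id=asset_id,
        modality=modality,
        origin=origin,
        sha256=hashlib.sha256(payload).hexdigest(),
        size_bytes=len(payload),
    )
    # Checks that apply to every modality.
    if not payload:
        record.issues.append("empty payload")
    if not origin:
        record.issues.append("missing origin")
    # An example of a modality-specific check (text only, as an assumption).
    if modality == "text":
        try:
            text = payload.decode("utf-8")
        except UnicodeDecodeError:
            record.issues.append("not valid UTF-8")
        else:
            if not text.strip():
                record.issues.append("blank text")
    return record


if __name__ == "__main__":
    print(profile_asset("doc-1", "text", "crm-export", b"Quarterly summary"))
    print(profile_asset("img-1", "image", "", b"\x89PNG"))
```

The point of the design is that every asset, whatever its modality, carries the same core fields (origin, fingerprint, size), while the quality checks branch by modality.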

New fact-checking techniques may be needed; for example, multiple-source verification may be required. Furthermore, Data Governance can be used to improve the explainability of model outputs by capturing how a model was created (its data and the steps used to produce it) and the influences that shaped its output.
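
One simple way to picture multiple-source verification is a majority-agreement check: a claim counts as verified only when enough independent sources report the same value. The sketch below assumes this interpretation; the `verify_claim` function, the `min_agreeing` threshold, and the source names are hypothetical and not taken from any particular product.

```python
from collections import Counter


def verify_claim(claim: str, source_answers: dict, min_agreeing: int = 2) -> dict:
    """Check a claim against several sources.

    source_answers maps a source name to the value that source reports
    for the claim, or None if the source has no answer.
    """
    # Count how many sources report each distinct value.
    counts = Counter(v for v in source_answers.values() if v is not None)
    if not counts:
        return {"claim": claim, "verified": False, "value": None,
                "supporting_sources": []}
    value, votes = counts.most_common(1)[0]
    supporting = sorted(s for s, v in source_answers.items() if v == value)
    return {
        "claim": claim,
        "verified": votes >= min_agreeing,  # enough sources agree
        "value": value,
        "supporting_sources": supporting,   # kept to support explainability
    }


if __name__ == "__main__":
    result = verify_claim(
        "capital of France",
        {"encyclopedia": "Paris", "gazetteer": "Paris", "forum-post": "Lyon"},
    )
    print(result)
```

Recording the supporting sources alongside the verdict ties back to explainability: it shows which inputs influenced the accepted value.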

This multi-modal data may ultimately live in different contexts. It may be embedded or encoded within the model itself, but foundation models could equally use co-located or external data, generated as models, to extend the base foundation model, shaping its outputs for a particular organisation or giving it a new area of expertise.