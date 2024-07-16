Two prompts were developed (see picture above). The first performed the conversion of the statistical code to R; the LLM draws on its inbuilt knowledge to perform the translation once the prompt is correctly defined. The second prompt asked the LLM to generate a textual description of the code being converted, in the form of a virtual author writing a narrative of the R code in simple, clear English.

The LLM did not need to be trained for this task; the knowledge and understanding were already embedded in the model. This saves the development community time and effort, although it is possible to extend the LLM's capabilities.

LLMs are very powerful but are restricted in the amount of text they can process at one time. The Python code therefore split the statistical code into chunks of a fixed maximum number of lines, cutting each chunk at the nearest preceding comment or blank line. Each chunk was submitted to the LLM with the prompt, and the results were combined to create a complete output.
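The splitting step described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names, the `max_lines` default, the comment prefixes and the `call_llm` callable are all assumptions made for the example, and the real comment syntax depends on the statistical language being converted.

```python
from typing import Callable, List, Sequence

# Assumption: prefixes that mark a comment line in the source language.
COMMENT_PREFIXES = ("#", "*", "/*", "//")


def is_boundary(line: str) -> bool:
    """True for a blank line or a line that starts with a comment prefix."""
    stripped = line.strip()
    return stripped == "" or stripped.startswith(COMMENT_PREFIXES)


def split_code(lines: Sequence[str], max_lines: int = 50) -> List[List[str]]:
    """Split lines into chunks of at most max_lines lines.

    Where possible, each chunk ends just before the last comment or blank
    line inside the window, so that line opens the next chunk. If no such
    line exists, the chunk is cut at max_lines.
    """
    if max_lines < 1:
        raise ValueError("max_lines must be at least 1")
    chunks: List[List[str]] = []
    start, total = 0, len(lines)
    while start < total:
        end = min(start + max_lines, total)
        if end < total:
            # Walk back from the size limit to the nearest boundary line.
            for cut in range(end, start, -1):
                if is_boundary(lines[cut]):
                    end = cut
                    break
        chunks.append(list(lines[start:end]))
        start = end
    return chunks


def convert(source: str, prompt: str, call_llm: Callable[[str], str],
            max_lines: int = 50) -> str:
    """Send each chunk to the LLM with the prompt and join the replies.

    call_llm is a hypothetical stand-in for whatever client reaches the
    locally hosted model; it takes the full prompt text and returns a reply.
    """
    chunks = split_code(source.splitlines(), max_lines)
    replies = [call_llm(prompt + "\n\n" + "\n".join(chunk)) for chunk in chunks]
    return "\n".join(replies)
```

Cutting at a comment or blank line keeps related statements together, which gives the model a better chance of translating each chunk coherently. A statement that spans a hard cut can still be split, so the chunk size needs to be chosen with the source code in mind.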

However, there are risks: LLMs can misinterpret the prompt or the source data and generate false statements, either through insufficient training data or, at the extreme, through hallucination. Therefore, there must be thorough testing to ensure that the outputs of the original code and the converted code match.
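One way to approach this kind of check is to run both programs on the same input and compare their results cell by cell. The sketch below assumes both code sets write their results as CSV text and that small floating-point differences are acceptable; the function names and the tolerance values are illustrative, not part of the original work.

```python
import csv
import io
import math
from typing import List


def _cells_equal(a: str, b: str, rel_tol: float) -> bool:
    """Compare two cells numerically when both parse as numbers, else as text."""
    try:
        return math.isclose(float(a), float(b), rel_tol=rel_tol, abs_tol=1e-12)
    except ValueError:
        return a.strip() == b.strip()


def compare_outputs(original_csv: str, converted_csv: str,
                    rel_tol: float = 1e-9) -> List[str]:
    """Return a list of mismatch descriptions; an empty list means agreement."""
    rows_a = list(csv.reader(io.StringIO(original_csv)))
    rows_b = list(csv.reader(io.StringIO(converted_csv)))
    problems: List[str] = []
    if len(rows_a) != len(rows_b):
        problems.append(f"row count differs: {len(rows_a)} vs {len(rows_b)}")
    for i, (row_a, row_b) in enumerate(zip(rows_a, rows_b), start=1):
        if len(row_a) != len(row_b):
            problems.append(f"row {i}: column count differs")
            continue
        for j, (a, b) in enumerate(zip(row_a, row_b), start=1):
            if not _cells_equal(a, b, rel_tol):
                problems.append(f"row {i}, column {j}: {a!r} != {b!r}")
    return problems
```

A comparison like this only covers the inputs that are actually run, so it complements a review of the converted code and does not replace it.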

To help measure LLM performance automatically, there is an option to use IBM watsonx.governance, part of the IBM watsonx tooling, which is designed to monitor artificial intelligence activities across an organisation, using GenAI or machine learning models from any vendor. The tool supports evaluation and monitoring of the model for health, accuracy, drift, bias and GenAI quality. It provides governance, risk and compliance capabilities featuring workflows with approvals, customisable dashboards, risk scorecards and reports. IBM watsonx.ai uses factsheet capabilities to collect and document model metadata automatically across the AI model lifecycle. However, we did not have time to implement it in this use case.

Throughout this process, care was taken to protect the client data so that it did not move outside the environment. As stated earlier, this meant that the LLM was hosted locally on IBM Cloud and that the platform could not connect to the outside world as part of its operation. Some hosted LLMs require a connection through an internet-accessible API, which would not have been allowed.

With these risks understood and mitigated, it is easy to envisage conversion factories being set up, where an end user submits their code, asks for conversion into a specific language and receives in return the converted output and a description of the code that has been developed. A further enhancement could be for the GenAI to build replica systems in the cloud of choice, using the code and other system artefacts as a template and delivering back, for example, Python or Terraform for deployment, so that whole systems could be migrated in this way.