December 7, 2020 By Demi Ajayi 3 min read

Last time on the NLP blog series, we explored how BERT and GPT models change the game for NLP. These models have many exciting potential applications, such as natural language generation (NLG) (useful for automating communication, report writing and summarization), conversational assistants, question-answering platforms and query understanding. However, there are several key considerations to investigate before adopting a new model for your business use case.

Bias: As with all machine learning, it is important to understand any implicit bias in the training data. Applications built on massive language models, such as NLG, are particularly prone to harmful bias when they are not properly evaluated. For instance, there have been incidents of applications generating text that is offensive or that negatively stereotypes the subject of the text. Given the massive training data these models require, it’s very important to be cognizant of the potential for bias and to keep a human in the loop when refining the models to reduce it.
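One simple way to give the human in the loop something concrete to review is a counterfactual check: score the same sentence with only the subject swapped, and flag subjects whose score drifts from the rest. The following is a minimal sketch under stated assumptions, not an established bias audit. The function names, the template, the placeholder subjects and the `toy_score` stand-in are all hypothetical; in practice `score_fn` would wrap whatever model you are evaluating.

```python
from typing import Callable, Dict, List


def counterfactual_gaps(
    template: str,
    subjects: List[str],
    score_fn: Callable[[str], float],
) -> Dict[str, float]:
    """Score the same sentence for each subject; return each score minus the mean."""
    if not subjects:
        raise ValueError("subjects must not be empty")
    scores = {s: score_fn(template.format(subject=s)) for s in subjects}
    mean = sum(scores.values()) / len(scores)
    return {s: score - mean for s, score in scores.items()}


def flag_for_review(gaps: Dict[str, float], threshold: float) -> List[str]:
    """Return subjects whose score deviates from the mean by more than threshold."""
    return sorted(s for s, gap in gaps.items() if abs(gap) > threshold)


# Hypothetical stand-in for a real model: it deliberately scores one
# subject lower so that the check has something to find.
def toy_score(text: str) -> float:
    return 0.2 if "group_b" in text else 0.8


gaps = counterfactual_gaps(
    "The {subject} applicant submitted a strong report.",
    ["group_a", "group_b", "group_c"],
    toy_score,
)
print(flag_for_review(gaps, threshold=0.3))
```

A check like this only surfaces candidates for review. Deciding whether a flagged gap is actually a problem, and what threshold is appropriate, remains a human judgment.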

Explainability/Transparency: It is also important to understand the algorithmic workings of any model you use. Transparency about how results are derived, along with concrete explanations of those results, is critical to ensuring you have a model you can trust. Increasingly, AI providers such as IBM are moving toward creating standards of fairness, explainability and transparency in the models they provide.

Computational costs: As mentioned, GPT-3 has 175 billion parameters. Building applications with a model of this size is an incredibly computationally intensive task. Other massive deep learning models are less computationally intensive than GPT-3, but still often require significant computational power to provide results quickly in real-life settings. GPUs, which are significantly more expensive than conventional CPU processing, are often used to speed up computation in these applications. As businesses consider applications of massive deep language models, they will also have to weigh the cost of these models against the performance benefit.
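To get a feel for the scale involved, a back-of-the-envelope estimate of the memory needed just to hold a model's weights can be useful. The sketch below assumes 16-bit weights (2 bytes per parameter) and uses publicly reported parameter counts (about 110 million for BERT-base, 175 billion for GPT-3). It ignores activations, batching and any serving overhead, so treat the numbers as a lower bound rather than a sizing guide. The function name is illustrative.

```python
def estimate_weight_memory_gb(num_parameters: int, bytes_per_parameter: int = 2) -> float:
    """Rough memory needed to hold the weights alone, in gigabytes (10^9 bytes)."""
    return num_parameters * bytes_per_parameter / 1e9


# Publicly reported parameter counts; the default of 2 bytes per
# parameter is an assumption (16-bit weights).
models = {
    "BERT-base": 110_000_000,
    "GPT-3": 175_000_000_000,
}

for name, params in models.items():
    print(f"{name}: ~{estimate_weight_memory_gb(params):.1f} GB of weights")
```

Under these assumptions the larger model needs hundreds of gigabytes for its weights alone, which is more than a single typical GPU holds, while the smaller one fits comfortably on commodity hardware.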

Data: Another consideration is training data. Businesses have to consider how much data they have (or can invest in acquiring) to meet the demands of training these models. A model that needs less training data for a new task is often a very large underlying model (such as GPT-3), which introduces a trade-off between computational cost and data.

Accuracy & Evaluation: For applications with established evaluation metrics (such as question answering, or traditional text analytics tasks like classification and sentiment analysis), it’s important to pick the model that meets your needed level of accuracy, while considering the trade-offs in data and computational costs discussed above. For applications with less established means of evaluating accuracy (such as NLG for summarization and conversation), it’s critical to choose or develop an evaluation scheme suitable to your use case before adopting these models for business use. NLG is often evaluated by human annotators, though there has been incremental progress in developing automated evaluation tools. Here, the scalability and reliability of the evaluation tool are additional considerations. For instance, evaluating whether an AI-generated report is coherent, exhaustive and well-written enough for use in a business setting will require much deeper analysis and higher standards than evaluating a model's ability to compose Shakespearean sonnets.
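The difference between the two kinds of evaluation can be illustrated in a few lines. The sketch below computes plain accuracy for a classification-style task, and a unigram-overlap F1 between generated and reference text as a very simplified stand-in for automated NLG metrics. It is not any specific published metric, and word overlap says nothing about coherence or factual correctness, which is part of why human annotators are still used. The function names and sample data are illustrative.

```python
from collections import Counter
from typing import Sequence


def accuracy(predictions: Sequence[str], labels: Sequence[str]) -> float:
    """Fraction of predictions that exactly match the gold labels."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("labels must not be empty")
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def unigram_f1(candidate: str, reference: str) -> float:
    """F1 over lowercased, whitespace-split words shared by candidate and reference."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())  # per-word minimum of the two counts
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


# Classification-style task: an established metric applies directly.
print(accuracy(["pos", "neg", "pos", "neg"], ["pos", "neg", "neg", "neg"]))

# Generation-style task: overlap with a reference is only a rough proxy.
print(unigram_f1(
    "revenue rose in the third quarter",
    "revenue rose sharply in the third quarter",
))
```

A proxy like this scales cheaply, but its reliability should be checked against human judgments on a sample of your own data before it is used to compare models.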

When kicking off your pilot project, focus on training a model to perform the specific task you are trying to achieve, and validate it against data specific to that task.

Afterward, you can test the model to measure its baseline performance. From the considerations listed in this blog, you can assemble a checklist to determine which models will best help you launch your pilot. Each of these considerations will play a critical role in the machine learning pipeline, in your pilot and thereafter.
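One possible way to turn such a checklist into a comparison is a weighted score across the five considerations. The sketch below is only an illustration: the weights, the 1-to-5 ratings and the candidate names are entirely hypothetical and would need to come from your own testing and priorities.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Candidate:
    name: str
    ratings: Dict[str, int]  # criterion -> rating from 1 (poor) to 5 (good)


# Hypothetical weights for the considerations in this blog; they sum to 1.
WEIGHTS = {
    "bias": 0.25,
    "explainability": 0.15,
    "compute_cost": 0.20,
    "data": 0.15,
    "accuracy": 0.25,
}


def weighted_score(candidate: Candidate, weights: Dict[str, float]) -> float:
    """Sum of weight * rating over every criterion in the checklist."""
    missing = set(weights) - set(candidate.ratings)
    if missing:
        raise ValueError(f"{candidate.name} is missing ratings for {sorted(missing)}")
    return sum(weights[c] * candidate.ratings[c] for c in weights)


def rank(candidates: List[Candidate], weights: Dict[str, float]) -> List[Candidate]:
    """Candidates ordered from highest to lowest weighted score."""
    return sorted(candidates, key=lambda c: weighted_score(c, weights), reverse=True)


# Made-up ratings for two made-up candidates.
candidates = [
    Candidate("large_pretrained_model",
              {"bias": 2, "explainability": 2, "compute_cost": 1, "data": 5, "accuracy": 5}),
    Candidate("small_finetuned_model",
              {"bias": 4, "explainability": 4, "compute_cost": 5, "data": 2, "accuracy": 3}),
]

for c in rank(candidates, WEIGHTS):
    print(f"{c.name}: {weighted_score(c, WEIGHTS):.2f}")
```

The value of an exercise like this lies less in the final number than in forcing each consideration to be rated explicitly for every candidate before the pilot starts.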

Get started with IBM Watson NLP.
