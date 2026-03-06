LLMs can be very useful, but their use poses ethical and societal risks. These risks aren’t caused by poor design or developer error: they’re a fundamental consequence of both human nature and how we train LLMs.

LLMs gain their core knowledge and linguistic abilities through self-supervised pretraining on a massive quantity of unlabeled text samples. After “learning” the patterns found across the billions upon billions of sentences in its training data, an LLM can generate grammatically coherent text that follows those patterns.

But in doing so, the model might also reproduce any harmful content present in its training data. If the training data contains biases, inaccuracies, toxic content or discriminatory views, the text the LLM generates can contain them too. If training data gathered by indiscriminately scraping the internet contains private or sensitive information, the LLM might leak that information. More generally, because LLMs generate their outputs probabilistically, they can produce harmful AI hallucinations.
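To make the mechanism concrete, here is a minimal sketch of the same idea at toy scale. It is not how a real LLM works: it swaps in a simple bigram (word-pair) counter for a neural network, and the corpus, the function names `train_bigram` and `generate`, and the stand-in "sensitive" sentence are all hypothetical. It only illustrates that a model trained on raw text samples from the patterns it has seen, including any content you would rather it had not memorized.

```python
import random
from collections import Counter, defaultdict


def train_bigram(corpus):
    """Count, for each word, which words follow it in the training text."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.split()
        for current, following in zip(words, words[1:]):
            counts[current][following] += 1
    return counts


def generate(counts, start, max_words=8, seed=None):
    """Sample a continuation word by word, weighted by observed frequency."""
    rng = random.Random(seed)
    words = [start]
    for _ in range(max_words - 1):
        followers = counts.get(words[-1])
        if not followers:
            break  # the last word was never followed by anything in training
        choices, weights = zip(*followers.items())
        words.append(rng.choices(choices, weights=weights, k=1)[0])
    return " ".join(words)


# Hypothetical toy corpus; the third sentence stands in for sensitive data
# that an indiscriminate scrape might pick up.
toy_corpus = [
    "the model predicts the next word",
    "the model repeats the training data",
    "the password is hunter2",
]

model = train_bigram(toy_corpus)

# Output varies with the seed, because generation is a weighted random draw.
print(generate(model, "the", seed=0))

# A prompt that matches the sensitive sentence reproduces it verbatim.
print(generate(model, "password"))
```

Every word pair this toy model emits was present in its training text, so anything undesirable in the corpus is available for it to repeat. Real LLMs generalize far beyond word pairs, but the dependence on training data is the same in kind.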

Further risks are posed by the potential to abuse LLMs. If an LLM’s training data includes information about manufacturing weapons or dangerous chemicals, the LLM could help an individual harm others. Without guardrails, an LLM can be used to generate dangerous (but convincing) misinformation. In the most extreme hypothetical scenarios, a misaligned AI model could theoretically provoke nuclear war.

Alignment problems can arise in unexpected ways. A famous thought experiment in AI is philosopher Nick Bostrom’s “paperclip maximizer” scenario. Bostrom described an artificial superintelligence that, tasked with manufacturing paperclips, determines that the best way to achieve its goal is to start “transforming first all of earth and then increasing portions of space into paperclip manufacturing facilities.”2

LLM alignment, as a discipline, arose as an attempt to mitigate these risks enough to make LLMs practical for real-world use and safe enough for continued advancement. The more thoroughly LLMs are integrated into our daily lives, the more essential it is to understand and account for potential misalignments with human interests.