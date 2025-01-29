Large language models (LLMs) such as GPT-3, GPT-4 and PaLM are commonly referred to as "black box" models because users do not have access to their internals, such as parameters, tuning methods or decision-making processes.

Users interact with these models almost entirely through text prompts, with application programming interface (API) calls serving as the main input and output mechanism. While these models are highly capable, their ability to produce precise, task-specific outputs often depends heavily on prompt quality.2, 3

This makes prompt engineering, the design of targeted prompts that steer model behavior, highly relevant. Both manual and automated approaches to prompt engineering have yielded notable success. However, they have drawbacks, especially for tasks that call for fine-grained control or highly instance-specific output.

For example, tasks such as summarization or dialogue generation require the model to follow target behaviors systematically, such as including key details or adhering to a strict reasoning pattern or prescribed stylistic guidelines. Conventional techniques are often not enough to guarantee consistent compliance with these nuanced requirements.

Directional stimulus prompting (DSP) is designed to fill this gap. DSP uses a small auxiliary policy model to generate instance-specific directional stimulus prompts that guide the LLM toward the desired output.

The generated prompts supply context specific to each instance and coax the LLM into producing more aligned and desirable outputs. By plugging DSP into the process, users gain a powerful tool for steering the behavior of black box LLMs toward greater consistency, relevance and accuracy in work that needs precision.1
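The flow described above can be illustrated with a minimal Python sketch. It is not a reference implementation: `toy_policy_model`, `build_dsp_prompt` and `summarize_with_dsp` are hypothetical names, the keyword heuristic merely stands in for a trained policy model, the prompt wording is an assumption, and `black_box_llm` is a placeholder for whatever API call reaches the LLM.

```python
from typing import Callable, List


def toy_policy_model(article: str, max_keywords: int = 3) -> List[str]:
    """Hypothetical stand-in for the small policy model.

    It simply picks the longest distinct words as "keywords". A real
    policy model would be a trained language model, not a heuristic.
    """
    words = [w.strip(".,;:!?\"'()").lower() for w in article.split()]
    unique: List[str] = []
    for word in words:
        if word and word not in unique:
            unique.append(word)
    # sorted() is stable, so ties keep their first-seen order.
    return sorted(unique, key=len, reverse=True)[:max_keywords]


def build_dsp_prompt(article: str, keywords: List[str]) -> str:
    """Insert the instance-specific stimulus (keywords) into the prompt."""
    hint = "; ".join(keywords)
    return (
        f"Article: {article}\n"
        f"Keywords: {hint}\n"
        "Q: Write a short summary of the article in 2-3 sentences "
        "based on the keywords.\n"
        "A:"
    )


def summarize_with_dsp(article: str, black_box_llm: Callable[[str], str]) -> str:
    """Run the pipeline: policy model -> stimulus prompt -> black box LLM.

    `black_box_llm` is any function that sends a prompt string to an LLM
    and returns its text response (assumed here, not a real API).
    """
    keywords = toy_policy_model(article)
    prompt = build_dsp_prompt(article, keywords)
    return black_box_llm(prompt)
```

Passing an identity function such as `lambda prompt: prompt` as `black_box_llm` returns the assembled prompt, which makes it easy to inspect the stimulus before wiring in a real model. The point of the design is that only the small policy model changes per task, while the LLM itself is treated as fixed and is reached only through its prompt.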