- The importance of prompt quality
- What is prompt engineering?
- Criteria for inclusion on the list
- Challenges compiling the list
- Conclusion
- References
The importance of prompt quality
Throughout this book I show examples of written instructions that direct LLMs to undertake psychometric tasks, ranging from generating items to checking item assignments to scales and scoring free-text passages. See, for example, the section on zero-shot and few-shot item generation prompts.
In this section, I emphasize the need to experiment with, and carefully document, all instructions used in psychological assessment. I also give a brief overview of prompt engineering, the practice of creating high-quality instructions, with references for further reading.
What is prompt engineering?
Prompt engineering refers to ways of structuring your instructions to an LLM so that they are likely to lead to useful and accurate task performance. The internet is crowded with prompt engineering guides; the best are available from the large model providers, including OpenAI and Anthropic.
While many of the recommendations resemble good writing advice (e.g., clarity, specificity), they can have LLM-specific mechanisms. For example, in earlier models, placing instructions at the end of a prompt helped leverage later representations in the attention layers (although this matters less with newer models, where system prompts and format tags such as JSON or XML can have bigger effects).
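As an illustration, the structural levers above (instruction placement, a system prompt, and format tags) can be combined when assembling a prompt programmatically. The following is a minimal sketch: the scoring task, tag names, and message layout are illustrative assumptions, not any vendor's required format.

```python
# Sketch: assemble a structured scoring prompt. The tag names
# (<rubric>, <passage>) and the 1-5 scoring task are illustrative.

def build_prompt(passage: str, rubric: str) -> dict:
    """Return system/user messages with XML-tagged inputs and the
    instruction placed at the end of the user message."""
    system = "You are scoring free-text responses on a 1-5 scale."
    user = (
        f"<rubric>\n{rubric}\n</rubric>\n"
        f"<passage>\n{passage}\n</passage>\n"
        # Instruction last, where earlier models attended to it best.
        "Score the passage against the rubric. Reply with a single integer 1-5."
    )
    return {"system": system, "user": user}

prompt = build_prompt("I enjoy meeting new people.", "Rate extraversion.")
print(prompt["user"])
```

The XML-style tags separate the inputs unambiguously, so the model is less likely to confuse rubric text with the passage to be scored.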
Nonetheless, given that many of the tips frequently advocated reflect good writing principles, a case might be made that a prompt engineering overview is not required for this book. In practice, however, engineering high quality prompts is the primary way that psychologists are likely to interact with LLMs to improve their output on psychometric tasks. We are certainly more likely to carefully craft a prompt than we are to pre-train a model, or even to fine-tune one. With this point in mind, I have included a list of prompt engineering tips for psychometrics in this section.
Criteria for inclusion on the list
To avoid a bland list that does not stretch beyond a good writing guide, I set the following criteria for including a recommendation.
- First, the recommendation needed either to be specific to AI, beyond what we might suggest for human-to-human written communication, or to have an LLM architectural basis, such as the instruction placement within the prompt discussed earlier.
- Second, the recommendation needed to be either supported by documented evidence or drawn from a major model vendor (on the assumption that those who build the models have more insight into how they work).
Challenges compiling the list
I encountered several challenges in compiling the list. Excluding tips that amount to common sense proved difficult, because one person's common sense is another's insight. Prompts also offer no guarantee of portability: what works for one model might not work on a second, or even on the same model on a subsequent occasion. Finally, the guidance itself is sometimes contradictory.
Assigning personas, for example, is recommended in some guidance for priming domain knowledge, while other guidance warns that it consumes limited context budget. Another example is the widely known 'think step-by-step' instruction: some guides treat it as a core technique, while others argue that any instruction prompting longer output works equally well. In addition, some newer models with extended thinking modes respond to thinking keywords by allocating additional compute.
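Given these contradictions, the practical response is to test the disputed techniques empirically on your own task. A minimal sketch of enumerating prompt variants for such a comparison (the persona wording, base task, and scale names are hypothetical):

```python
from itertools import product

# Illustrative base task for an item-assignment check.
BASE = "Assign each item below to one scale: Extraversion or Neuroticism."
PERSONAS = ["", "You are an experienced psychometrician. "]
COT = ["", " Think step by step before answering."]

# Enumerate all four persona x step-by-step combinations so each
# variant can be run and scored against a small gold-standard set.
variants = [persona + BASE + cot for persona, cot in product(PERSONAS, COT)]
for i, v in enumerate(variants):
    print(i, v)
```

Running each variant over the same gold-standard items and comparing accuracy settles, for your model and task, which of the disputed tips actually helps.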
Conclusion
In short, the appealing idea of a perfect prompt is not only elusive but imaginary. This is not to undervalue the importance of prompt enhancements, because LLMs are sensitive to the way prompts are created. It is simply to say that there are many ways to write prompts that achieve high-quality psychometric outcomes. Readers should experiment with different versions of their prompts using the ideas in the table below and document them thoroughly for inclusion in technical documentation.
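One lightweight way to document prompts for a technical report is to record each version alongside the model and date, with a short fingerprint so the report can cite the exact wording used. A sketch, assuming a simple record format of my own devising (not a standard schema):

```python
from dataclasses import dataclass, asdict
import hashlib
import json

@dataclass
class PromptRecord:
    """One documented prompt version. Field names are illustrative."""
    task: str
    model: str
    prompt_text: str
    run_date: str

    def fingerprint(self) -> str:
        # Short hash so documentation can cite the exact prompt version.
        return hashlib.sha256(self.prompt_text.encode()).hexdigest()[:12]

rec = PromptRecord(
    task="item-to-scale assignment",
    model="example-model-2025",  # placeholder model name
    prompt_text="Assign each item below to one scale...",
    run_date="2025-01-01",
)
print(json.dumps(asdict(rec), indent=2))
print(rec.fingerprint())
```

Storing these records with the study materials makes the assessment auditable: anyone can verify which prompt produced which items or scores.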
References
Anthropic. (n.d.). Prompt engineering. https://docs.anthropic.com/en/docs/build-with-claude/prompt-engineering/overview
OpenAI. (n.d.). Prompt engineering. OpenAI Platform Documentation. https://platform.openai.com/docs/guides/prompt-engineering
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0).