- The importance of prompt quality
- What is prompt engineering?
- Criteria for inclusion on the list
- Challenges compiling the list
- Conclusion
- References
The importance of prompt quality
Throughout this book I show examples of written instructions in prompts that allow LLMs to undertake psychometric tasks. The tasks range from generating items, to checking the assignment of items to scales, to scoring free-text passages. See, for example, the section on zero-shot and few-shot item generation prompts.
In this section, I emphasize the need to think carefully about prompts and to experiment with, and carefully document, all instructions used in psychological assessment. I also give a brief overview of helpful practices for creating high-quality instructions, known as prompt engineering, with references for further reading.
What is prompt engineering?
Prompt engineering refers to structuring your instructions to an LLM in a way that is likely to lead to useful and accurate task performance. The internet is crowded with prompt engineering guides. Some of the best are available from the major model providers, including OpenAI, Anthropic, and Google.
While many of the recommendations resemble good writing advice (e.g., clarity, specificity), they can rest on LLM-specific mechanisms. For example, in earlier models, placing instructions at the end of the prompt helped leverage later representations in the attention layers, although this is less necessary with newer models, where system prompts and format prefixes (e.g., JSON or XML tags) can have bigger effects.
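To make this concrete, here is a minimal sketch of a prompt that puts role and task framing in a system message, restates the output requirement at the end of the user message, and uses a format prefix. It assumes the openai Python package; the model name and the conscientiousness scoring task are placeholders, not recommendations.

```python
# A minimal sketch of instruction placement and format prefixes, assuming
# the `openai` Python package. The model name and the scoring task are
# placeholders, not recommendations.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder; substitute any model you have access to
    messages=[
        # System message: role and task framing live here.
        {"role": "system",
         "content": "You score free-text responses for conscientiousness on a 1-5 scale."},
        # User message: context first, with the output requirement restated
        # at the end and a format prefix ("JSON:") to anchor the structure.
        {"role": "user",
         "content": ('Response: "I always double-check my work before submitting it."\n\n'
                     'Return only the score, e.g. JSON: {"score": 4}')},
    ],
    temperature=0,  # low temperature suits factual or scoring tasks
)
print(response.choices[0].message.content)
```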
Nonetheless, given that many of the frequently advocated tips reflect good writing principles, a case might be made that a prompt engineering overview is not required for this book. In practice, however, engineering high-quality prompts is the primary way that psychologists are likely to interact with LLMs to improve their output on psychometric tasks. We are more likely to carefully craft a prompt than to pre-train a model, or even to fine-tune one.
Criteria for inclusion on the list
To avoid a bland list that does not stretch beyond a good writing guide, I set the following criteria for including a recommendation.
- The recommendation needed to be specific to AI, going beyond what we might suggest for human-to-human written communication, or to have an LLM architectural basis, such as the instruction placement discussed earlier.
- The recommendation needed to be supported by documented evidence from a key model vendor, on the assumption that those building the models have the most insight into how they work.
Challenges compiling the list
I encountered some challenges in compiling the list. Including only recommendations that go beyond common sense was a demanding criterion, because what is common sense to one person may be a revelation to another. Next, the recommendations offer no guarantee of portability: what works for one model might not work on a second, or even for the same model on a subsequent occasion. Finally, there were occasional nuances.
Assigning personas is recommended in some guidance for priming domain knowledge, while other guidance warns that it consumes your limited context budget. Another nuance concerns the widely known 'think step-by-step' instruction. Some treat this as a core technique, while others argue that any instruction that elicits longer output works equally well. In addition, some newer models with extended thinking modes respond to thinking keywords by allocating additional compute.
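To illustrate the second nuance, here is a hedged sketch of two prompt variants for the same item-classification task; the item is illustrative, and whether the step-by-step variant actually helps is an empirical question for your particular model and task.

```python
# Two variants of the same prompt; which performs better is an empirical
# question that may differ across models. The item is illustrative only.
item = "I enjoy being the centre of attention."

plain_prompt = (
    f"Does the item '{item}' belong on an extraversion scale? "
    "Answer yes or no."
)

step_by_step_prompt = (
    f"Does the item '{item}' belong on an extraversion scale? "
    "Think step by step about the construct the item measures, "
    "then answer yes or no on the final line."
)
```

With those caveats in place, the table below summarizes the recommendations that met the criteria.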
| Tip # | Prompt engineering suggestion | Source | Description |
|---|---|---|---|
| 1 | Assign LLMs a role. | Anthropic | Role prompting assigns a specific persona (e.g., "experienced data scientist") via a system prompt for enhanced accuracy, tailored tone, and improved focus. |
| 2 | Prompting often out-performs fine-tuning. | Anthropic | Fine-tuning risks the model losing general knowledge during retraining and requires expensive GPU resources and days of time, while prompt engineering is instant, cheaper, and preserves base capabilities. |
| 3 | Avoid overspecified logic. | Anthropic | Encoding brittle if-else logic directly in prompts creates maintenance complexity and breaks easily on edge cases; it is better to provide high-level guidance with examples in system prompts. |
| 4 | Use low temperature for factual responses. | OpenAI | Factual extraction and Q&A generally call for temperature 0, while creative writing varies by need; there is no one-size-fits-all setting. |
| 5 | Too much context causes context rot. | Anthropic | The more information you give a model at once, the worse it gets at recalling details from earlier in the conversation. |
| 6 | Instruction placement at the start or end improves performance. | OpenAI/Anthropic | OpenAI recommends placing instructions at the start and encasing the context in triple quotes; Anthropic recommends placing instructions at the end of the prompt. |
| 7 | Chain-of-thought prompting improves performance. | Anthropic/OpenAI | Encouraging LLMs to think carefully before beginning and to give their reasoning can improve performance on complex tasks. |
| 8 | Use XML tagging to separate parts of prompts. | Anthropic | Anthropic recommends using XML tags to structure prompts with multiple components, which helps Claude parse prompts more accurately. |
| 9 | Few-shot prompting is important for pattern learning. | Google | Showing the model examples of what you want is more effective than just describing it, as it helps achieve better formatting, accuracy, and pattern matching. |
| 10 | Lead outputs with partial completions. | Google | You can guide output formatting by starting the response structure yourself (e.g., "Outline: I. Introduction *") and letting the model complete the pattern. |
| 11 | Use prefixes for prompt sections. | Google | Use prefixes like "Text:" for inputs and "JSON:" for outputs, and labels in examples, to signal semantically meaningful parts of your prompt and guide the model's response format. |
| 12 | Chain prompts for sequential tasks. | Google | Break complex tasks into sequential prompts where each prompt's output becomes the next prompt's input, with the final prompt producing the end result. |
| 13 | Experiment with sampling parameters. | Google | Experiment with parameters like max tokens (output length), temperature (randomness), topK/topP (token selection), and stop sequences to optimize model responses for your specific task. |
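To show how several of these suggestions combine in a single psychometric prompt, here is a hedged sketch that assigns a role via the system prompt (tip 1), sets a low temperature (tip 4), separates prompt parts with XML tags (tip 8), includes few-shot examples (tip 9), and ends with a prefix for the output (tip 11). It assumes the anthropic Python package; the model name and items are placeholders.

```python
# A hedged sketch combining tips 1, 4, 8, 9, and 11 from the table above,
# assuming the `anthropic` Python package. The model name and items are
# placeholders, not recommendations.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

prompt = """<task>
Classify each questionnaire item as measuring extraversion or neuroticism.
</task>

<examples>
Item: I make friends easily.
Label: extraversion

Item: I worry about things.
Label: neuroticism
</examples>

<item>
I feel comfortable around people.
</item>

Label:"""

message = client.messages.create(
    model="claude-sonnet-4-20250514",  # placeholder; substitute a current model
    max_tokens=10,
    temperature=0,  # tip 4: low temperature for a factual classification task
    system="You are an experienced psychometrician.",  # tip 1: role prompting
    messages=[{"role": "user", "content": prompt}],
)
print(message.content[0].text)
```

Prompt chaining (tip 12) follows the same pattern: the label printed above would simply be fed into the next prompt in the sequence, with the final prompt producing the end result.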
Conclusion
In short, the appealing idea of a perfect prompt is not only elusive but imaginary. This is not to undervalue prompt enhancements, because LLMs are sensitive to the way prompts are written. It is simply to say that there are many ways to write prompts that achieve high-quality psychometric outcomes. Readers should experiment with different versions of prompts using the ideas in the table above and document their prompts thoroughly for inclusion in technical documentation.
References
Anthropic. (n.d.). Chain of thought prompting. https://docs.claude.com/en/docs/build-with-claude/prompt-engineering/chain-of-thought
Anthropic. (n.d.). Effective context engineering for AI agents. https://www.anthropic.com/engineering/effective-context-engineering-for-ai-agents
Anthropic. (n.d.). Give Claude a role (prompt engineering). https://docs.anthropic.com/claude/docs/give-claude-a-role
Anthropic. (n.d.). Prompt engineering overview. https://docs.anthropic.com/en/docs/build-with-claude/prompt-engineering/overview
Anthropic. (n.d.). Use XML tags. https://docs.claude.com/en/docs/build-with-claude/prompt-engineering/use-xml-tags
Google. (n.d.). Prompt design strategies. Google AI for Developers. Retrieved November 10, 2025, from https://ai.google.dev/gemini-api/docs/prompting-strategies
OpenAI. (n.d.). Best practices for prompt engineering with the OpenAI API. https://help.openai.com/en/articles/6654000-best-practices-for-prompt-engineering-with-the-openai-api
OpenAI. (n.d.). Prompt engineering. OpenAI Platform Documentation. https://platform.openai.com/docs/guides/prompt-engineering
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0).