Prompt Engineering: OpenAI API Best Practices Guide

7 min read · OpenAI · Original source

Understanding Prompt Engineering for OpenAI API

Prompt engineering has emerged as a critical discipline for anyone looking to harness the full power of large language models (LLMs) via the OpenAI API. The way these advanced AI models are trained means that the clarity, structure, and content of your prompts significantly impact the quality and relevance of the generated outputs. Far beyond simply asking a question, effective prompt engineering involves a thoughtful approach to instructing the AI, leading to more useful, consistent, and predictable results.

OpenAI's models, including the latest GPT releases, are incredibly versatile, capable of everything from complex summarization and creative writing to robust code generation and factual data extraction. However, their performance hinges on how well you communicate your intentions. This guide distills the best practices recommended by OpenAI itself, offering actionable strategies to refine your prompts and elevate your AI interactions. While general guidelines exist, developers are always encouraged to experiment with different formats to discover what best suits their unique tasks and desired outcomes.

Crafting Clear and Effective Model Instructions

The foundation of superior AI interaction lies in the precision of your instructions. Generic prompts often yield generic results. To maximize utility and reduce iterative refinement, adhere to these core principles:

Prioritize Latest Models and Instruction Placement

Always opt for the latest, most capable OpenAI models available through the API. Newer models are generally more adept at understanding nuanced instructions and tend to be easier to prompt engineer, offering superior performance and efficiency. When structuring your prompt, place your primary instructions at the very beginning. This primes the model with its objective before it processes any context. Furthermore, use distinct separators like ### or triple quotes """ to clearly delineate your instructions from the contextual text input.

Less effective ❌:

Summarize the text below as a bullet point list of the most important points.

{text input here}

Better ✅:

Summarize the text below as a bullet point list of the most important points.

Text: """
{text input here}
"""

Be Specific, Descriptive, and Detailed

Vague prompts lead to vague outputs. Instead, articulate your requirements with high specificity, describing the desired context, outcome, length, format, and style in as much detail as possible.

Less effective ❌:

Write a poem about OpenAI.

Better ✅:

Write a short inspiring poem about OpenAI, focusing on the recent DALL-E product launch (DALL-E is a text to image ML model) in the style of a {famous poet}.

Articulate Desired Output Format Through Examples

For structured data extraction or specific output formats, show, don't just tell. Providing explicit examples of the desired format empowers the model to generate consistently structured responses, making programmatic parsing significantly easier and more reliable.

Less effective ❌:

Extract the entities mentioned in the text below. Extract the following 4 entity types: company names, people names, specific topics and themes.

Text: {text}

Better ✅:

Extract the important entities mentioned in the text below. First extract all company names, then extract all people names, then extract specific topics which fit the content and finally extract general overarching themes

Desired format:
Company names: <comma_separated_list_of_company_names>
People names: -||-
Specific topics: -||-
General themes: -||-

Text: {text}
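Because the "Desired format" block above yields a predictable "Label: comma-separated values" layout, the model's response can be parsed with a few lines of code. A hedged sketch (parse_entities and the sample response are hypothetical):

```python
def parse_entities(response: str) -> dict:
    """Parse 'Label: a, b, c' lines into a dict mapping label -> list of values."""
    entities = {}
    for line in response.splitlines():
        if ":" in line:
            label, values = line.split(":", 1)
            entities[label.strip()] = [v.strip() for v in values.split(",") if v.strip()]
    return entities

# A made-up model response in the requested format.
sample = (
    "Company names: OpenAI, Stripe\n"
    "People names: Ada Lovelace\n"
    "Specific topics: payment APIs\n"
    "General themes: developer tooling"
)
print(parse_entities(sample)["Company names"])  # ['OpenAI', 'Stripe']
```

This is why showing the format matters: a consistent response structure makes downstream parsing trivial.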

Reduce Imprecise Descriptions and Frame Positively

Avoid "fluffy" or imprecise language. Quantify lengths and constraints whenever possible. Instead of saying "a few sentences," specify "a 3 to 5 sentence paragraph." Crucially, when guiding the model, focus on what to do rather than solely what not to do. Positive framing provides clearer directives.

Less effective ❌:

The following is a conversation between an Agent and a Customer. DO NOT ASK USERNAME OR PASSWORD. DO NOT REPEAT.

Customer: I can’t log in to my account.
Agent:

Better ✅:

The following is a conversation between an Agent and a Customer. The agent will attempt to diagnose the problem and suggest a solution, whilst refraining from asking any questions related to PII. Instead of asking for PII, such as username or password, refer the user to the help article www.samplewebsite.com/help/faq

Customer: I can’t log in to my account.
Agent:

Advanced Prompting Techniques: Zero-Shot, Few-Shot, and Fine-Tuning

Depending on the complexity and uniqueness of your task, you can employ different levels of prompting sophistication.

Progressive Learning Approaches

  • Zero-shot: This is the simplest approach, where the model performs a task with no examples, relying entirely on its pre-trained knowledge. It's effective for common tasks but can struggle with niche or complex requirements.

    Extract keywords from the below text.
    
    Text: {text}
    
    Keywords:
    
  • Few-shot: When zero-shot falls short, few-shot learning involves providing a couple of high-quality input-output examples directly within your prompt. This demonstrates the desired format and behavior to the model, significantly improving accuracy and consistency for specific tasks.

    Extract keywords from the corresponding texts below.
    
    Text 1: Stripe provides APIs that web developers can use to integrate payment processing into their websites and mobile applications.
    Keywords 1: Stripe, payment processing, APIs, web developers, websites, mobile applications
    ##
    Text 2: OpenAI has trained cutting-edge language models that are very good at understanding and generating text. Our API provides access to these models and can be used to solve virtually any task that involves processing language.
    Keywords 2: OpenAI, language models, text processing, API.
    ##
    Text 3: {text}
    Keywords 3:
    
  • Fine-tuning: For highly specialized or performance-critical applications where few-shot prompting isn't enough, fine-tuning involves training a pre-existing model on a custom dataset. This customizes the model's weights to excel at specific tasks, generate outputs in a unique style, or handle particular data distributions with unparalleled accuracy. While more resource-intensive, fine-tuning offers the highest degree of control and performance for tailored solutions.
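The few-shot layout shown above lends itself to a small template function. A sketch, assuming the "##" separator and "Text N / Keywords N" numbering from the example (few_shot_prompt is a hypothetical helper):

```python
def few_shot_prompt(instruction: str, examples: list, query: str) -> str:
    """Assemble a few-shot prompt: instruction, numbered examples, then the query."""
    parts = [instruction, ""]
    for i, (text, keywords) in enumerate(examples, start=1):
        parts.append(f"Text {i}: {text}")
        parts.append(f"Keywords {i}: {keywords}")
        parts.append("##")
    n = len(examples) + 1
    parts.append(f"Text {n}: {query}")
    parts.append(f"Keywords {n}:")
    return "\n".join(parts)

examples = [
    ("Stripe provides APIs that web developers can use to integrate payment processing.",
     "Stripe, payment processing, APIs, web developers"),
]
prompt = few_shot_prompt(
    "Extract keywords from the corresponding texts below.",
    examples,
    "OpenAI has trained cutting-edge language models.",
)
print(prompt)
```

Ending the prompt with "Keywords 2:" invites the model to continue the established pattern, which is the core mechanism behind few-shot prompting.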

Optimizing Code Generation and Task-Specific Prompts

When generating code or tackling other highly structured tasks, specific prompt engineering tactics can dramatically improve results.

Guiding Code Generation with "Leading Words"

For code generation, providing "leading words" acts as a powerful hint to the model, guiding it towards the correct programming language or syntax. For instance, starting a Python code generation prompt with import or an SQL prompt with SELECT immediately sets the context for the model, leading to more accurate and relevant code snippets. This technique is invaluable for developers looking to harness the power of Codex and similar models for expedited development.

Less effective ❌:

# Write a simple python function that
# 1. Asks me for a number in miles
# 2. Converts miles to kilometers

Better ✅:

# Write a simple python function that
# 1. Asks me for a number in miles
# 2. Converts miles to kilometers

import
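For reference, a plausible completion of the prompt above might look like the following (this is an illustrative sketch, not an actual model output):

```python
def miles_to_km(miles: float) -> float:
    """Convert miles to kilometers (1 mile = 1.60934 km)."""
    return miles * 1.60934

def main():
    # Interactive entry point: ask for a number in miles, print kilometers.
    miles = float(input("Enter a number in miles: "))
    print(f"{miles} miles is {miles_to_km(miles):.2f} kilometers")

print(f"{miles_to_km(10):.4f} km")
```

Note how the leading `import`-style hint nudges the model to emit Python immediately rather than prose about Python.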

Leveraging "Generate Anything" for Tailored Prompts

OpenAI also offers a "Generate Anything" feature, designed to assist developers in creating effective prompts. By describing your task or desired natural language output, the feature can suggest tailored prompt structures, providing a solid starting point for complex interactions and reducing the trial-and-error often associated with prompt engineering. This tool is particularly useful for new users or when exploring novel application areas.

Controlling Model Behavior with API Parameters

Beyond the prompt itself, the OpenAI API offers several key parameters that allow developers to fine-tune the model's behavior and output characteristics. Understanding and utilizing these parameters is crucial for achieving desired results.

  • model: Specifies which AI model to use. Different models offer varying levels of capability, speed, and cost, and newer models generally provide better performance and are easier to prompt. Recommended usage: use the latest, most capable model for best results, balancing performance against cost and latency, and regularly explore the latest GPT models available.

  • temperature: Controls the randomness of the output. A higher temperature makes the output more diverse and creative, while a lower temperature makes it more deterministic and focused; it governs how often the model outputs a less likely token. Recommended usage: for factual tasks, data extraction, or consistent outputs, set temperature to 0. For creative writing, brainstorming, or varied responses, use higher values (e.g., 0.7-1.0).

  • max_completion_tokens: Sets a hard upper limit on the number of tokens the model will generate in its response. This is not a direct control over output length but a safety cutoff. Recommended usage: use it to prevent excessively long responses or to manage token consumption. The model typically stops when it thinks it's finished or hits a stop sequence before reaching this limit.

  • stop: A sequence of one or more tokens that, when generated, will cause the model to stop generating further tokens. Recommended usage: essential for controlling the end of a generated response and for ensuring the output adheres to a specific format or structure, especially in conversational or multi-turn interactions.

These parameters, particularly model and temperature, are the most commonly adjusted to alter the model's output to meet specific application needs. Thoughtful configuration of these settings, combined with well-engineered prompts, unlocks the full potential of the OpenAI API.
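As a sketch, these parameters might be combined like this for a factual extraction request. The model string, prompt, and values below are placeholders, not recommendations from the original guide:

```python
# Request parameters for a factual extraction task. The keys are the
# standard Chat Completions fields; the values here are illustrative.
request = {
    "model": "gpt-4o",  # placeholder: substitute the latest available model
    "messages": [
        {"role": "user",
         "content": 'Extract keywords from the text below.\n\nText: """example text"""'},
    ],
    "temperature": 0,              # deterministic output for factual tasks
    "max_completion_tokens": 200,  # hard safety cutoff on response length
    "stop": ["##"],                # stop generating at the example separator
}
print(sorted(request))
```

A dict like this could then be passed to the OpenAI Python SDK, e.g. `client.chat.completions.create(**request)`, keeping parameter configuration in one auditable place.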

Frequently Asked Questions

What is prompt engineering for OpenAI models?
Prompt engineering involves crafting specific and effective instructions, known as 'prompts,' to guide OpenAI's large language models (LLMs) towards generating desired outputs. Due to how these models are trained, certain prompt formats yield significantly better results. It's an iterative process of refining inputs to achieve optimal performance, whether for summarization, code generation, creative writing, or data extraction. Mastering prompt engineering is crucial for unlocking the full potential of the OpenAI API and developing robust AI-powered applications.
Why is instruction placement important in prompts?
Placing instructions at the beginning of a prompt helps the model immediately understand its primary task before processing any contextual information. This clear hierarchical structure minimizes ambiguity and improves the model's ability to follow directions accurately. Additionally, using distinct separators like ### or triple quotes between instructions and the main text input visually delineates these sections for the model, ensuring it clearly distinguishes between what it needs to do and the data it needs to process. This practice leads to more consistent and reliable outputs.
How does model specificity improve AI output quality?
Being highly specific, descriptive, and detailed about the desired context, outcome, length, format, and style significantly enhances the quality of AI outputs. Vague instructions often result in generic or unhelpful responses. By contrast, providing explicit constraints (e.g., 'Write a 3-5 sentence paragraph,' 'in the style of a specific poet,' or 'extract entities into a comma-separated list') guides the model to produce precise, tailored, and usable results that align directly with the user's requirements. This reduces the need for post-processing and iterative refinement.
Explain the zero-shot, few-shot, and fine-tuning approach.
These represent a progressive hierarchy of prompt complexity and model customization. Zero-shot learning involves providing no examples, relying solely on the model's pre-trained knowledge. Few-shot learning enhances this by including a small number of input-output examples within the prompt itself, demonstrating the desired behavior. Fine-tuning is the most advanced approach, where a pre-trained model is further trained on a specific, high-quality dataset, customizing its weights to excel at particular tasks or generate highly specialized outputs. Each method offers different levels of control and resource investment.
What role does 'temperature' play in OpenAI API prompts?
The 'temperature' parameter controls the randomness and creativity of the model's output. A higher temperature (e.g., 0.8-1.0) makes the output more varied, unpredictable, and potentially creative, as the model selects less probable tokens. Conversely, a lower temperature (e.g., 0.0-0.2) makes the output more deterministic, focused, and factual, as the model tends to choose the most probable tokens. For tasks requiring high accuracy, consistency, or factual correctness, such as data extraction or truthful Q&A, a temperature of 0 is generally recommended to minimize 'hallucinations' or creative deviations.
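The mechanism behind this is standard softmax sampling with temperature: logits are divided by the temperature before normalization, so low temperatures concentrate probability on the top token. A minimal sketch with made-up logit values:

```python
import math

def softmax_with_temperature(logits, temperature):
    """Scale logits by 1/temperature, then normalize to probabilities.
    As temperature approaches 0, probability mass concentrates on the top token."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]  # illustrative token scores
print([round(p, 3) for p in softmax_with_temperature(logits, 1.0)])  # spread out
print([round(p, 3) for p in softmax_with_temperature(logits, 0.1)])  # nearly one-hot
```

At temperature 1.0 the less likely tokens retain meaningful probability; at 0.1 the top token dominates, which is why near-zero temperatures yield deterministic, factual-style output.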
How can developers generate better code with prompts?
To generate higher-quality code, developers should use 'leading words' to gently nudge the model towards the desired programming language or structure. For instance, starting a prompt with `import` for Python or `SELECT` for SQL can signal to the model the type of code expected, helping it initiate the output correctly. Additionally, clear, step-by-step instructions outlining the function's purpose, inputs, and desired outputs, along with code comments within the prompt, significantly improve the relevance and correctness of the generated code snippets from OpenAI's models, reducing errors and saving development time.
What are the key parameters to consider when using the OpenAI API?
The most crucial parameters for controlling OpenAI API outputs are `model`, `temperature`, `max_completion_tokens`, and `stop` sequences. The `model` parameter selects the specific AI model, impacting performance, cost, and latency. `Temperature` dictates the creativity versus determinism of the output; 0 is best for factual tasks. `max_completion_tokens` sets a hard upper limit on the number of tokens generated, preventing excessively long responses. `Stop` sequences are specific characters or phrases that, when generated, instruct the model to cease generating further tokens, providing precise control over the output's termination point and format. These parameters allow developers to fine-tune model behavior for diverse application requirements.
