Prompt Engineering
Prompt engineering is the art and science of crafting inputs (prompts) to foundation models to elicit the most accurate, relevant, and useful outputs. It is a critical skill for the AIF-C01 exam.
Exam Tip: The exam will test your understanding of different prompting techniques and when to use each one. Know the differences between zero-shot, few-shot, and chain-of-thought prompting.
Zero-Shot Prompting
- What: Asking the model to perform a task without providing any examples
- How: Simply describe the task in the prompt and let the model use its pre-trained knowledge
- When to Use:
- The task is straightforward and well-understood
- You don't have examples to provide
- You're testing the model's baseline capability
- Strengths:
- Simplest approach — no data preparation needed
- Works well for common, well-defined tasks
- Limitations:
- May not produce desired format or style
- Less accurate for domain-specific or nuanced tasks
Example:
Prompt: "Classify the following text as positive, negative, or neutral: 'The product arrived on time and works great!'"
Output: "Positive"
Exam Tip: Zero-shot = no examples given. If the question asks about prompting without providing reference examples, it's zero-shot.
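The pattern above can be sketched in a few lines. This is a minimal illustration of building a zero-shot prompt string; `build_zero_shot_prompt` is a hypothetical helper, not an AWS API.

```python
# Zero-shot: describe the task directly, provide no examples, and let
# the model rely on its pre-trained knowledge.
# build_zero_shot_prompt is a hypothetical helper for illustration only.

def build_zero_shot_prompt(task: str, text: str) -> str:
    """Combine a task description and the input text into one prompt."""
    return f"{task}\n\nText: {text}"

prompt = build_zero_shot_prompt(
    "Classify the following text as positive, negative, or neutral.",
    "The product arrived on time and works great!",
)
print(prompt)
```

Note that the prompt contains only the task description and the input, with no labeled examples.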
Few-Shot Prompting (One-Shot, Single-Shot)
- What: Providing the model with one or more examples of the desired input-output behavior before asking it to perform the task
- Variants:
- One-Shot (Single-Shot): Provide exactly one example
- Few-Shot: Provide 2-6 examples (typically 3-5)
- How: Include example input-output pairs in the prompt, followed by the actual query
- When to Use:
- You need the model to follow a specific format or style
- The task is domain-specific or uncommon
- Zero-shot isn't producing acceptable results
- Strengths:
- Significantly improves accuracy for specific tasks
- Teaches the model the desired format without fine-tuning
- Quick to implement — no training required
- Limitations:
- Uses more tokens (costs more, uses context window)
- Quality depends on example selection
- May not generalize to edge cases
Example (Few-Shot):
Prompt:
"Classify the sentiment of each review:
Review: 'Great battery life!' → Positive
Review: 'Screen cracked after one day.' → Negative
Review: 'It's okay, nothing special.' → Neutral
Review: 'The camera quality is amazing!' → "
Output: "Positive"
Exam Tip: Few-shot = examples provided in the prompt. If the question mentions providing examples to guide model behavior without training, it's few-shot prompting. One-shot = 1 example, few-shot = multiple examples.
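The few-shot example above can be assembled programmatically from labeled pairs. This is a minimal sketch; the arrow separator and example set are illustrative formatting choices, not a required convention.

```python
# Few-shot: prepend labeled input-output pairs so the model infers the
# desired format and task before completing the final query.

EXAMPLES = [
    ("Great battery life!", "Positive"),
    ("Screen cracked after one day.", "Negative"),
    ("It's okay, nothing special.", "Neutral"),
]

def build_few_shot_prompt(examples, query: str) -> str:
    lines = ["Classify the sentiment of each review:"]
    for review, label in examples:
        lines.append(f"Review: '{review}' → {label}")
    lines.append(f"Review: '{query}' → ")  # model completes the label
    return "\n".join(lines)

prompt = build_few_shot_prompt(EXAMPLES, "The camera quality is amazing!")
print(prompt)
```

Passing a one-element example list to the same function would make this a one-shot prompt.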
Chain of Thought (CoT)
- What: Prompting the model to show its reasoning step-by-step before arriving at a final answer
- How: Include instructions like "Think step by step" or provide examples with explicit reasoning chains
- When to Use:
- Complex reasoning tasks (math, logic, multi-step problems)
- Tasks requiring analysis before a conclusion
- When you need to verify the model's reasoning process
- Strengths:
- Dramatically improves accuracy on complex reasoning tasks
- Makes the model's logic transparent and verifiable
- Reduces errors in multi-step problems
- Limitations:
- Uses more tokens (longer responses)
- Slower response time
- May over-explain simple tasks
Example:
Prompt: "Roger has 5 tennis balls. He buys 2 more cans of tennis balls. Each can has 3 tennis balls. How many tennis balls does he have now? Think step by step."
Output:
"Step 1: Roger starts with 5 tennis balls.
Step 2: He buys 2 cans of tennis balls.
Step 3: Each can has 3 balls, so 2 × 3 = 6 new balls.
Step 4: Total = 5 + 6 = 11 tennis balls.
Answer: 11"
Exam Tip: Chain of thought = "show your work". If the question involves complex reasoning, math, or multi-step logic, CoT prompting improves accuracy. The trigger phrase is "think step by step."
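Applying the trigger phrase can be as simple as a wrapper that appends the reasoning instruction. A minimal sketch, with one common phrasing (exact wording varies):

```python
# Chain of thought: append an explicit reasoning instruction so the
# model works through intermediate steps before answering.

def with_chain_of_thought(question: str) -> str:
    return f"{question}\nThink step by step, then state the final answer."

prompt = with_chain_of_thought(
    "Roger has 5 tennis balls. He buys 2 more cans of tennis balls. "
    "Each can has 3 tennis balls. How many tennis balls does he have now?"
)

# The arithmetic the model's steps should reproduce: 5 + 2 * 3 = 11
expected_answer = 5 + 2 * 3
```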
Tree of Thoughts
- What: An advanced prompting technique where the model explores multiple reasoning paths simultaneously, evaluates each path, and selects the best one
- How: The model generates multiple possible next steps, evaluates them, and backtracks if a path seems wrong — like a tree search
- When to Use:
- Highly complex problems with multiple possible approaches
- Strategic planning and decision-making tasks
- Tasks where the first approach may not be optimal
- Strengths:
- Handles complex problems better than linear CoT
- Can recover from wrong initial reasoning paths
- More thorough exploration of solution space
- Limitations:
- Very token-intensive (expensive)
- Slower response time
- Overkill for simple tasks
- Relationship to CoT: Tree of Thoughts extends CoT by exploring multiple reasoning branches rather than a single linear chain
Exam Tip: Tree of Thoughts = multiple reasoning paths explored and evaluated. It's an advanced version of CoT. If the question mentions complex strategic problems requiring exploration of alternatives, think Tree of Thoughts.
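The branch-evaluate-select loop can be illustrated with a toy search. Real Tree of Thoughts uses the model itself to propose and score candidate "thoughts"; here, numeric steps toward a target and a distance heuristic stand in for both, purely to show the control flow.

```python
# Toy Tree of Thoughts loop: expand several candidate next steps from
# each kept path, score every candidate, prune to the best few (the
# "beam"), and repeat. The scorer stands in for model self-evaluation.

TARGET = 10

def generate_candidates(path):
    """Propose possible next steps from the current partial solution."""
    current = path[-1]
    return [path + [current + step] for step in (1, 2, 3)]

def score(path):
    """Heuristic value of a partial path (closer to TARGET is better)."""
    return -abs(TARGET - path[-1])

def tree_of_thoughts(start, depth=4, beam_width=2):
    frontier = [[start]]
    for _ in range(depth):
        # Branch: expand every kept path, then prune to the best few.
        candidates = [c for path in frontier for c in generate_candidates(path)]
        frontier = sorted(candidates, key=score, reverse=True)[:beam_width]
    return max(frontier, key=score)

best = tree_of_thoughts(0)
print(best)
```

Unlike linear CoT, several partial paths survive each round, so a promising early branch that turns bad can be dropped in favor of an alternative.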
Retrieval-Augmented Generation (RAG)
- What: A prompting/architecture pattern that retrieves relevant information from external sources and adds it to the prompt before generation
- How:
- Convert user query to embedding
- Search vector database for relevant documents
- Add retrieved documents to the prompt as context
- Model generates response grounded in retrieved information
- When to Use:
- Model needs access to proprietary/current information
- You want to reduce hallucinations
- You need citations/sources for responses
- Knowledge changes frequently
- Strengths:
- Grounds responses in factual data
- No model re-training needed when information changes
- Can provide source citations
- Dramatically reduces hallucinations
- Limitations:
- Quality depends on retrieval accuracy
- Adds latency (retrieval step)
- Requires maintaining a knowledge base
Exam Tip: RAG = retrieve then generate. It's both a prompting pattern and an architectural pattern. In the context of prompt engineering, RAG modifies what goes INTO the prompt by adding retrieved context.
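The retrieve-then-generate flow above can be sketched end to end. Real systems use an embedding model and a vector database; here, word-overlap scoring stands in for embedding similarity, and the document set is invented, purely for illustration.

```python
# Toy RAG pipeline: "retrieve" the most relevant document, then build a
# prompt that grounds the model's answer in that retrieved context.
import re

DOCUMENTS = [
    "Our return policy allows returns within 30 days of purchase.",
    "Standard shipping takes 3-5 business days.",
    "All laptops include a one-year hardware warranty.",
]

def tokens(text: str) -> set:
    """Lowercased word set (stand-in for an embedding)."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(query: str, docs, top_k: int = 1):
    """Rank docs by word overlap (stand-in for vector similarity search)."""
    q = tokens(query)
    return sorted(docs, key=lambda d: len(q & tokens(d)), reverse=True)[:top_k]

def build_rag_prompt(query: str, docs) -> str:
    context = "\n".join(retrieve(query, docs))
    return (
        "Answer only using the context below.\n"
        f"Context: {context}\n"
        f"Question: {query}"
    )

prompt = build_rag_prompt("How long is the hardware warranty on laptops?", DOCUMENTS)
print(prompt)
```

The model then generates from this augmented prompt, which is why updating the document store, not the model, is enough when knowledge changes.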
Prompt Templates
- What: Pre-defined, reusable prompt structures with placeholder variables that get filled in at runtime
- How: Define a template with variables (e.g., {{context}}, {{question}}) and fill them programmatically
- When to Use:
- Standardize prompt format across an application
- Ensure consistency in prompt structure
- Simplify prompt management at scale
- Benefits:
- Consistency across all invocations
- Easier to version and manage
- Separation of prompt structure from content
- A/B testing different templates
Example:
Template:
"You are a helpful {{role}} assistant.
Context: {{context}}
Question: {{question}}
Answer in {{format}} format."
Filled:
"You are a helpful medical assistant.
Context: Patient reports chest pain and shortness of breath...
Question: What are the possible conditions?
Answer in bullet point format."
Exam Tip: Prompt templates = reusable prompts with variables. They're about operationalizing prompt engineering for production applications.
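Filling the template above at runtime can be sketched as follows. The `{{variable}}` syntax mirrors the example; `fill_template` is a hypothetical helper, not a specific library's API.

```python
# Prompt template: a reusable structure with {{placeholders}} that are
# replaced programmatically at runtime.

TEMPLATE = (
    "You are a helpful {{role}} assistant.\n"
    "Context: {{context}}\n"
    "Question: {{question}}\n"
    "Answer in {{format}} format."
)

def fill_template(template: str, variables: dict) -> str:
    """Substitute each {{name}} placeholder with its value."""
    for name, value in variables.items():
        template = template.replace("{{" + name + "}}", value)
    return template

prompt = fill_template(TEMPLATE, {
    "role": "medical",
    "context": "Patient reports chest pain and shortness of breath.",
    "question": "What are the possible conditions?",
    "format": "bullet point",
})
print(prompt)
```

Keeping `TEMPLATE` in one place is what makes versioning and A/B testing different templates practical.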
Best Practices
General Best Practices
- Be Specific: Clearly state what you want — vague prompts produce vague outputs
- Provide Context: Give the model relevant background information
- Specify Format: Tell the model how you want the output structured (JSON, bullets, table)
- Set Constraints: Define boundaries (word count, language, tone)
- Use System Prompts: Define the model's role and behavior ("You are a helpful assistant...")
- Iterate: Start simple, test, and refine the prompt based on outputs
Reducing Hallucinations
- Use RAG to ground responses in factual data
- Include instructions like "Only answer based on the provided context"
- Add "If you don't know, say 'I don't know'" to the prompt
- Use Amazon Bedrock Guardrails contextual grounding checks
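The prompt-level instructions above can be baked into a reusable wrapper. A minimal sketch; the exact wording of the instructions is illustrative.

```python
# Anti-hallucination wrapper: constrain the model to the provided
# context and give it an explicit "I don't know" escape hatch.

def grounded_prompt(context: str, question: str) -> str:
    return (
        "Only answer based on the provided context. "
        "If the answer is not in the context, say 'I don't know.'\n"
        f"Context: {context}\n"
        f"Question: {question}"
    )

prompt = grounded_prompt(
    "The store opens at 9 AM on weekdays.",
    "What time does the store open on Saturday?",
)
print(prompt)
```

Here the context does not cover Saturday, so a well-behaved model should answer "I don't know" rather than guess.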
Improving Output Quality
- Temperature: Lower values (0.0-0.3) for factual/deterministic tasks; higher values (0.7-1.0) for creative tasks
- Top-P: Control the diversity of token selection
- Max Tokens: Set appropriate limits for response length
- Stop Sequences: Define where the model should stop generating
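The parameters above might be set as follows. The key names follow the style of Amazon Bedrock's Converse API `inferenceConfig` (exact names vary by model provider when using InvokeModel), and the stop sequence is an illustrative example.

```python
# Inference parameters tuned for a factual/deterministic task, using
# Converse API-style field names (provider-specific APIs may differ).

factual_config = {
    "temperature": 0.2,               # low randomness for factual output
    "topP": 0.9,                      # nucleus sampling cutoff
    "maxTokens": 512,                 # cap response length
    "stopSequences": ["\n\nHuman:"],  # stop before a new turn begins
}

# Creative tasks reuse the same config with higher temperature.
creative_config = {**factual_config, "temperature": 0.9}
```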
Prompt Structure
[System/Role Definition]
[Context/Background Information]
[Task Instructions]
[Examples (if few-shot)]
[Input Data]
[Output Format Specification]
[Constraints/Rules]
Exam Tip: The exam may ask which inference parameter to adjust for a specific goal:
- More creative output → Increase temperature
- More factual/deterministic output → Decrease temperature
- Longer responses → Increase max tokens
- More diverse word choice → Increase top-p