DSP Prompt Engineering Strategy¶
Document Type: Knowledge Base - Prompt Engineering Research
Created: 2025-11-24
Last Updated: 2025-11-24
Confidence Level: High
Source: DSP/DSPy analysis, Prompt engineering best practices, Rust templating research
Purpose: Justify the need for prompt templates in AirsDSP and evaluate implementation approaches
Overview¶
In the DSP framework, the Predict operation relies heavily on carefully constructed prompts to guide language models. While simple string concatenation might work for trivial cases, production DSP pipelines require sophisticated prompt engineering. This document explores why templates are essential and how to implement them in Rust.
Part 1: Why Templates? (Use Cases)¶
Use Case 1: Dynamic Few-Shot Learning¶
The Problem: The "Demonstrate" operation in DSP requires injecting variable numbers of examples into prompts based on:
- Available context window
- Task complexity
- Example relevance scores
Without Templates (String Concatenation):
let mut prompt = String::from("You are a math solver.\n\n");
for example in examples {
    prompt.push_str(&format!("Q: {}\nA: {}\n\n", example.question, example.answer));
}
prompt.push_str(&format!("Q: {}\nA:", user_question));
Problems:
- ❌ Logic mixed with presentation
- ❌ Hard to maintain consistent formatting
- ❌ Difficult to A/B test prompt variations
- ❌ No validation of prompt structure
With Templates:
let prompt = template.render(context! {
    role: "math solver",
    examples: examples,
    question: user_question,
})?;
Benefits:
- ✅ Separation of concerns
- ✅ Easy to modify prompt without touching code
- ✅ Can swap templates for A/B testing
- ✅ Template syntax can enforce structure
Use Case 2: Structured Reasoning (ReAct, Chain-of-Thought)¶
The Problem: DSP reasoning patterns like ReAct require strict output formats:
Thought: I need to search for information about X
Action: search("X")
Observation: [search results]
Thought: Based on the observation, I should...
Action: finish("answer")
Without Templates:
let prompt = format!(
    "Follow this format:\nThought: [your reasoning]\nAction: [action]\n\n{}",
    history
);
Issues:
- ❌ Format instructions buried in code
- ❌ Hard to ensure consistency across stages
- ❌ Difficult to add examples of correct format
With Templates:
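A sketch of what such a template could look like, in Jinja-style syntax (the `history`/`step` field names are illustrative, not part of the DSP framework):

```jinja
Follow this format exactly:
Thought: [your reasoning]
Action: [one of: search(query), finish(answer)]

{% for step in history %}
Thought: {{ step.thought }}
Action: {{ step.action }}
Observation: {{ step.observation }}
{% endfor %}
Thought: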
Benefits:
- ✅ Format is self-documenting
- ✅ Easy to add/remove format constraints
- ✅ Examples naturally integrated
- ✅ Can validate output against expected structure
Use Case 3: Role Switching in Multi-Stage Pipelines¶
The Problem: DSP pipelines often need different "personas" at different stages:
Stage 1 (Analysis): "You are a critical analyst..."
Stage 2 (Search): "You are a research librarian..."
Stage 3 (Synthesis): "You are a technical writer..."
Without Templates:
let stage1_prompt = format!("You are a critical analyst. {}", task);
let stage2_prompt = format!("You are a research librarian. {}", task);
// ... repeated for each stage
Issues:
- ❌ Role definitions scattered across codebase
- ❌ Hard to maintain consistent tone
- ❌ Difficult to reuse role definitions
With Templates:
# roles.toml
[analyst]
system = "You are a critical analyst who evaluates claims rigorously."
tone = "skeptical"

[librarian]
system = "You are a research librarian who finds relevant sources."
tone = "helpful"

// Code
let prompt = template.render(context! {
    role: roles.get("analyst"),
    task: task,
})?;
Benefits:
- ✅ Centralized role management
- ✅ Easy to A/B test different personas
- ✅ Reusable across pipelines
- ✅ Non-engineers can modify roles
Use Case 4: Context Length Management¶
The Problem: LLMs have token limits. DSP pipelines must dynamically adjust prompt content:
Without Templates:
let mut prompt = base_prompt.clone();
let mut tokens = count_tokens(&prompt);
for example in examples {
    let example_tokens = count_tokens(&example);
    if tokens + example_tokens > limit {
        break;
    }
    prompt.push_str(&example);
    tokens += example_tokens;
}
Issues:
- ❌ Token counting logic mixed with prompt construction
- ❌ Hard to prioritize what to include/exclude
- ❌ Difficult to test edge cases
With Templates (+ Smart Context):
let context = ContextBuilder::new()
    .add_required("system_prompt", system)
    .add_required("question", question)
    .add_optional_list("examples", examples, /* priority */ 10)
    .add_optional("retrieved_docs", docs, /* priority */ 5)
    .build_within_limit(8000)?;
let prompt = template.render(context)?;
Benefits:
- ✅ Declarative priority system
- ✅ Automatic truncation
- ✅ Testable in isolation
- ✅ Framework handles complexity
Use Case 5: Multilingual Prompts¶
The Problem: Supporting multiple languages in DSP pipelines.
Without Templates:
let prompt = if lang == "en" {
    format!("Answer this question: {}", q)
} else if lang == "es" {
    format!("Responde esta pregunta: {}", q)
} else {
    // ...
};
With Templates:
let template = template_loader.load(&format!("{}/predict.hbs", lang))?;
let prompt = template.render(context)?;
Benefits:
- ✅ Translators can work on templates directly
- ✅ No code changes for new languages
- ✅ Easy to maintain consistency across languages
Part 2: Template Requirements for DSP¶
Based on the use cases above, AirsDSP templates must support:
Core Features¶
- Variable Substitution: {{variable}}
- Iteration: {{#each items}}...{{/each}}
- Conditionals: {{#if condition}}...{{/if}}
- Partials/Includes: Reuse common prompt fragments
- Whitespace Control: Precise control over newlines/spaces
- Escaping: Handle special characters in user input
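To make the first requirement concrete, here is a minimal sketch of {{variable}} substitution in plain Rust (the render_vars helper is illustrative only, not part of any crate evaluated below):

```rust
use std::collections::HashMap;

/// Replace every `{{name}}` placeholder in `template` with its value from `vars`.
/// Unknown placeholders are left in place so callers can detect them.
fn render_vars(template: &str, vars: &HashMap<&str, &str>) -> String {
    let mut out = template.to_string();
    for (key, value) in vars {
        // Build the literal pattern "{{key}}" and substitute its value.
        out = out.replace(&format!("{{{{{key}}}}}"), value);
    }
    out
}

fn main() {
    let mut vars = HashMap::new();
    vars.insert("role", "math solver");
    vars.insert("question", "What is 2+2?");
    println!("{}", render_vars("You are a {{role}}.\nQ: {{question}}\nA:", &vars));
}
```

A real engine adds parsing, iteration, and escaping on top of this, which is exactly why the crates below exist.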
DSP-Specific Features¶
- Example Formatting: Consistent few-shot example rendering
- Context Injection: Automatic insertion of retrieved documents
- Token Counting: Awareness of prompt length
- Validation: Ensure required fields are present
Part 3: Rust Templating Options¶
Option 1: Tera (Jinja2-like)¶
Description: A Rust template engine inspired by Jinja2, the most popular Python templating engine.
Syntax:
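A few-shot prompt in Tera's Jinja2-style syntax might look like this (illustrative):

```jinja
You are a {{ role }}.

{% for example in examples %}
Q: {{ example.question }}
A: {{ example.answer }}
{% endfor %}

Q: {{ question }}
A: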
Pros:
- ✅ Familiar to Python developers (DSPy users)
- ✅ Rich feature set (filters, macros, inheritance)
- ✅ Runtime flexibility (load templates from files)
- ✅ Good error messages
Cons:
- ❌ Runtime overhead (parsing templates at runtime)
- ❌ No compile-time safety
- ❌ Errors only caught at runtime
Rust Crate: tera = "1.19"
Option 2: Askama (Compile-Time Templates)¶
Description: Type-safe templates compiled at build time.
Syntax:
use askama::Template;

#[derive(Template)]
#[template(path = "predict.txt")]
struct PredictTemplate {
    role: String,
    examples: Vec<Example>,
    question: String,
}

// Usage
let tmpl = PredictTemplate {
    role: "math solver".to_string(),
    examples: vec![...],
    question: "What is 2+2?".to_string(),
};
let prompt = tmpl.render()?;
Pros:
- ✅ Compile-time safety: Type errors caught at build time
- ✅ Zero runtime overhead: Templates compiled to Rust code
- ✅ IDE support: Autocomplete for template variables
- ✅ Very Rust-aligned: Leverages type system
Cons:
- ❌ Less flexible (can't load templates dynamically)
- ❌ Requires recompilation to change templates
- ❌ Steeper learning curve for non-Rust users
Rust Crate: askama = "0.12"
Option 3: MiniJinja (Lightweight Jinja2)¶
Description: Minimal Jinja2 implementation, faster than Tera.
Syntax: Same as Tera (Jinja2-compatible)
Pros:
- ✅ Faster than Tera
- ✅ Smaller binary size
- ✅ Jinja2 compatible
- ✅ Good for dynamic templates
Cons:
- ❌ Fewer features than Tera
- ❌ Still runtime overhead
- ❌ No compile-time safety
Rust Crate: minijinja = "2.0"
Option 4: Custom Macro-Based System¶
Description: Use Rust macros for compile-time prompt generation.
Syntax:
prompt! {
    role: "math solver",
    examples: [
        ("What is 2+2?", "4"),
        ("What is 3+3?", "6"),
    ],
    question: user_question,
}
Pros:
- ✅ Maximum compile-time safety
- ✅ Zero overhead: Expands to pure Rust code
- ✅ Full Rust integration: Can use any Rust expression
- ✅ No external dependencies
Cons:
- ❌ High implementation cost: Building a macro system is complex
- ❌ Limited flexibility: Hard to change without recompilation
- ❌ Unfamiliar syntax: Not standard templating
- ❌ Maintenance burden: Custom code to maintain
Part 4: Recommendation for AirsDSP¶
Hybrid Approach: Askama (Primary) + Tera (Optional)¶
Rationale:
1. Askama for Core Framework:
   - Aligns with AirsDSP's philosophy of "explicit control"
   - Compile-time safety prevents prompt bugs
   - Zero runtime overhead fits performance goals
   - Type-safe templates are very "Rust-like"
2. Tera for Advanced Users (Optional Feature):
   - Enable dynamic template loading for experimentation
   - Allow non-Rust users to modify prompts
   - Useful for A/B testing without recompilation
   - Can be behind a feature flag: --features dynamic-templates
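In Cargo.toml, such a flag could be declared along these lines (a sketch using the crate versions cited in this document):

```toml
[dependencies]
askama = "0.12"
tera = { version = "1.19", optional = true }

[features]
dynamic-templates = ["dep:tera"]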
Implementation Strategy¶
Phase 1: Askama Only
- Ship with compile-time templates
- Provide default templates for common patterns (CoT, ReAct)
- Users can override by creating their own template structs
Phase 2: Add Tera Support
- Add optional tera feature
- Provide DynamicTemplate trait alongside Template
- Document trade-offs clearly
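A minimal sketch of what the DynamicTemplate trait could look like, with a toy implementor (the API shown is hypothetical, not askama's or tera's):

```rust
use std::collections::HashMap;

/// Hypothetical runtime counterpart to askama's compile-time `Template` trait:
/// implementors render a prompt from a string-keyed context.
pub trait DynamicTemplate {
    fn render(&self, context: &HashMap<String, String>) -> Result<String, String>;
}

/// Toy implementor: substitutes `{{key}}` placeholders from the context.
pub struct SimpleTemplate {
    pub source: String,
}

impl DynamicTemplate for SimpleTemplate {
    fn render(&self, context: &HashMap<String, String>) -> Result<String, String> {
        let mut out = self.source.clone();
        for (key, value) in context {
            out = out.replace(&format!("{{{{{key}}}}}"), value);
        }
        Ok(out)
    }
}

fn main() {
    let tmpl = SimpleTemplate { source: "You are a {{role}}.".to_string() };
    let mut ctx = HashMap::new();
    ctx.insert("role".to_string(), "math expert".to_string());
    println!("{}", tmpl.render(&ctx).unwrap());
}
```

A real implementation would wrap tera behind this trait; the toy substitution above only fixes the shape of the interface.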
Phase 3: Template Library
- Build a collection of battle-tested templates
- Community can contribute templates
- Version templates separately from core framework
Part 5: Example Implementation¶
Askama Example (Compile-Time)¶
// templates/chain_of_thought.txt
You are a {{ role }}.
{% for example in examples %}
Question: {{ example.question }}
Reasoning: {{ example.reasoning }}
Answer: {{ example.answer }}
{% endfor %}
Question: {{ question }}
Reasoning:
// src/templates.rs
use askama::Template;

#[derive(Template)]
#[template(path = "chain_of_thought.txt")]
pub struct ChainOfThoughtTemplate {
    pub role: String,
    pub examples: Vec<Example>,
    pub question: String,
}

// Usage in pipeline
let prompt = ChainOfThoughtTemplate {
    role: "math expert".to_string(),
    examples: load_examples(),
    question: user_input.to_string(),
}.render()?;
let prediction = lm.generate(&prompt).await?;
Tera Example (Runtime)¶
use tera::{Tera, Context};

let tera = Tera::new("templates/**/*")?;

let mut context = Context::new();
context.insert("role", "math expert");
context.insert("examples", &examples);
context.insert("question", &user_input);

let prompt = tera.render("chain_of_thought.txt", &context)?;
let prediction = lm.generate(&prompt).await?;
Part 6: Open Questions¶
- Template Discovery: How should users find and use templates?
  - Registry system?
  - Documentation with examples?
  - CLI tool to list available templates?
- Template Versioning: How to handle breaking changes in templates?
  - Semantic versioning for template library?
  - Deprecation warnings?
- Template Testing: How to test templates in isolation?
  - Snapshot testing?
  - Golden file comparisons?
  - Property-based testing?
- Template Composition: Should templates be composable?
  - Inheritance (Jinja2-style)?
  - Mixins?
  - Partials?
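As an illustration of the snapshot-testing idea, a rendered prompt can be compared against a checked-in golden string (render_prompt is a hypothetical helper standing in for a template render):

```rust
/// Hypothetical prompt renderer used to illustrate snapshot testing:
/// formats one few-shot example followed by the live question.
fn render_prompt(role: &str, example: (&str, &str), question: &str) -> String {
    format!(
        "You are a {role}.\n\nQ: {}\nA: {}\n\nQ: {question}\nA:",
        example.0, example.1
    )
}

fn main() {
    // The "golden" output lives next to the template; any template change
    // shows up as a diff against this string.
    let golden = "You are a math solver.\n\nQ: What is 2+2?\nA: 4\n\nQ: What is 3+3?\nA:";
    let rendered = render_prompt("math solver", ("What is 2+2?", "4"), "What is 3+3?");
    assert_eq!(rendered, golden);
}
```

Crates like insta automate this pattern; the sketch only shows the core comparison.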
Conclusion¶
Templates are essential for AirsDSP because:
1. They separate prompt engineering from pipeline logic
2. They enable dynamic few-shot learning
3. They enforce structured reasoning formats
4. They support role switching and multilingual prompts
5. They make prompt optimization testable and maintainable
Recommended approach: Start with Askama for compile-time safety and Rust alignment, optionally add Tera for runtime flexibility.
This aligns with AirsDSP's core philosophy: explicit control, transparency, and Rust-native performance.
References¶
- Askama Documentation: https://djc.github.io/askama/
- Tera Documentation: https://keats.github.io/tera/
- MiniJinja Documentation: https://docs.rs/minijinja/
- Prompt Engineering Guide: https://www.promptingguide.ai/
- DSPy Signatures: How DSPy handles prompt templates (for comparison)