DSP Prompt Engineering Strategy¶
Document Type: Knowledge Base - Prompt Engineering Research
Created: 2025-11-24
Last Updated: 2025-11-24
Confidence Level: High
Source: DSP/DSPy analysis, Prompt engineering best practices, Rust templating research
Purpose: Justify the need for prompt templates in AirsDSP and evaluate implementation approaches
Overview¶
In the DSP framework, the Predict operation relies heavily on carefully constructed prompts to guide language models. While simple string concatenation might work for trivial cases, production DSP pipelines require sophisticated prompt engineering. This document explores why templates are essential and how to implement them in Rust.
Part 1: Why Templates? (Use Cases)¶
Use Case 1: Dynamic Few-Shot Learning¶
The Problem: The "Demonstrate" operation in DSP requires injecting variable numbers of examples into prompts based on:
- Available context window
- Task complexity
- Example relevance scores
Without Templates (String Concatenation):
let mut prompt = String::from("You are a math solver.\n\n");
for example in examples {
    prompt.push_str(&format!("Q: {}\nA: {}\n\n", example.question, example.answer));
}
prompt.push_str(&format!("Q: {}\nA:", user_question));
Problems:
- ❌ Logic mixed with presentation
- ❌ Hard to maintain consistent formatting
- ❌ Difficult to A/B test prompt variations
- ❌ No validation of prompt structure
With Templates:
let prompt = template.render(context! {
    role: "math solver",
    examples: examples,
    question: user_question,
})?;
Benefits:
- ✅ Separation of concerns
- ✅ Easy to modify prompt without touching code
- ✅ Can swap templates for A/B testing
- ✅ Template syntax can enforce structure
Use Case 2: Structured Reasoning (ReAct, Chain-of-Thought)¶
The Problem: DSP reasoning patterns like ReAct require strict output formats:
Thought: I need to search for information about X
Action: search("X")
Observation: [search results]
Thought: Based on the observation, I should...
Action: finish("answer")
Without Templates:
let prompt = format!(
    "Follow this format:\nThought: [your reasoning]\nAction: [action]\n\n{}",
    history
);
Issues:
- ❌ Format instructions buried in code
- ❌ Hard to ensure consistency across stages
- ❌ Difficult to add examples of correct format
With Templates:
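A sketch of what such a template could look like, in Jinja-style syntax (the `history`/`step` field names are illustrative, not part of the DSP framework):

```jinja
Follow this format exactly:
Thought: [your reasoning]
Action: [one of: search(query), finish(answer)]

{% for step in history %}
Thought: {{ step.thought }}
Action: {{ step.action }}
Observation: {{ step.observation }}
{% endfor %}
Thought: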
Benefits:
- ✅ Format is self-documenting
- ✅ Easy to add/remove format constraints
- ✅ Examples naturally integrated
- ✅ Can validate output against expected structure
Use Case 3: Role Switching in Multi-Stage Pipelines¶
The Problem: DSP pipelines often need different "personas" at different stages:
Stage 1 (Analysis): "You are a critical analyst..."
Stage 2 (Search): "You are a research librarian..."
Stage 3 (Synthesis): "You are a technical writer..."
Without Templates:
let stage1_prompt = format!("You are a critical analyst. {}", task);
let stage2_prompt = format!("You are a research librarian. {}", task);
// ... repeated for each stage
Issues:
- ❌ Role definitions scattered across codebase
- ❌ Hard to maintain consistent tone
- ❌ Difficult to reuse role definitions
With Templates:
# roles.toml
[analyst]
system = "You are a critical analyst who evaluates claims rigorously."
tone = "skeptical"

[librarian]
system = "You are a research librarian who finds relevant sources."
tone = "helpful"

// Code
let prompt = template.render(context! {
    role: roles.get("analyst"),
    task: task,
})?;
Benefits:
- ✅ Centralized role management
- ✅ Easy to A/B test different personas
- ✅ Reusable across pipelines
- ✅ Non-engineers can modify roles
Use Case 4: Context Length Management¶
The Problem: LLMs have token limits. DSP pipelines must dynamically adjust prompt content:
Without Templates:
let mut prompt = base_prompt.clone();
let mut tokens = count_tokens(&prompt);
for example in examples {
    let example_tokens = count_tokens(&example);
    if tokens + example_tokens > limit {
        break;
    }
    prompt.push_str(&example);
    tokens += example_tokens;
}
Issues:
- ❌ Token counting logic mixed with prompt construction
- ❌ Hard to prioritize what to include/exclude
- ❌ Difficult to test edge cases
With Templates (+ Smart Context):
let context = ContextBuilder::new()
    .add_required("system_prompt", system)
    .add_required("question", question)
    .add_optional_list("examples", examples, /* priority */ 10)
    .add_optional("retrieved_docs", docs, /* priority */ 5)
    .build_within_limit(8000)?;
let prompt = template.render(context)?;
Benefits:
- ✅ Declarative priority system
- ✅ Automatic truncation
- ✅ Testable in isolation
- ✅ Framework handles complexity
Use Case 5: Multilingual Prompts¶
The Problem: Supporting multiple languages in DSP pipelines.
Without Templates:
let prompt = if lang == "en" {
    format!("Answer this question: {}", q)
} else if lang == "es" {
    format!("Responde esta pregunta: {}", q)
} else {
    // ...
};
With Templates:
let template = template_loader.load(&format!("{}/predict.hbs", lang))?;
let prompt = template.render(context)?;
Benefits:
- ✅ Translators can work on templates directly
- ✅ No code changes for new languages
- ✅ Easy to maintain consistency across languages
Part 2: Template Requirements for DSP¶
Based on the use cases above, AirsDSP templates must support:
Core Features¶
- Variable Substitution: {{variable}}
- Iteration: {{#each items}}...{{/each}}
- Conditionals: {{#if condition}}...{{/if}}
- Partials/Includes: Reuse common prompt fragments
- Whitespace Control: Precise control over newlines/spaces
- Escaping: Handle special characters in user input
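To make the first requirement concrete, here is a minimal sketch of {{variable}} substitution in plain Rust (the render_vars helper is illustrative only, not part of any crate evaluated below):

```rust
use std::collections::HashMap;

/// Replace every `{{name}}` placeholder in `template` with its value from `vars`.
/// Unknown placeholders are left in place so callers can detect them.
fn render_vars(template: &str, vars: &HashMap<&str, &str>) -> String {
    let mut out = template.to_string();
    for (key, value) in vars {
        // Build the literal pattern "{{key}}" and substitute its value.
        out = out.replace(&format!("{{{{{key}}}}}"), value);
    }
    out
}

fn main() {
    let mut vars = HashMap::new();
    vars.insert("role", "math solver");
    vars.insert("question", "What is 2+2?");
    println!("{}", render_vars("You are a {{role}}.\nQ: {{question}}\nA:", &vars));
}
```

A real engine adds parsing, iteration, and escaping on top of this, which is exactly why the crates below exist.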
DSP-Specific Features¶
- Example Formatting: Consistent few-shot example rendering
- Context Injection: Automatic insertion of retrieved documents
- Token Counting: Awareness of prompt length
- Validation: Ensure required fields are present
Part 3: Rust Templating Options¶
Option 1: Tera (Jinja2-like)¶
Description: A Rust template engine inspired by Jinja2, the most popular Python templating engine.
Syntax:
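A few-shot prompt in Tera's Jinja2-style syntax might look like this (illustrative):

```jinja
You are a {{ role }}.

{% for example in examples %}
Q: {{ example.question }}
A: {{ example.answer }}
{% endfor %}

Q: {{ question }}
A: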
Pros:
- ✅ Familiar to Python developers (DSPy users)
- ✅ Rich feature set (filters, macros, inheritance)
- ✅ Runtime flexibility (load templates from files)
- ✅ Good error messages
Cons:
- ❌ Runtime overhead (parsing templates at runtime)
- ❌ No compile-time safety
- ❌ Errors only caught at runtime
Rust Crate: tera = "1.19"
Option 2: Askama (Compile-Time Templates)¶
Description: Type-safe templates compiled at build time.
Syntax:
use askama::Template;

#[derive(Template)]
#[template(path = "predict.txt")]
struct PredictTemplate {
    role: String,
    examples: Vec<Example>,
    question: String,
}

// Usage
let tmpl = PredictTemplate {
    role: "math solver".to_string(),
    examples: vec![...],
    question: "What is 2+2?".to_string(),
};
let prompt = tmpl.render()?;
Pros:
- ✅ Compile-time safety: Type errors caught at build time
- ✅ Zero runtime overhead: Templates compiled to Rust code
- ✅ IDE support: Autocomplete for template variables
- ✅ Very Rust-aligned: Leverages type system
Cons:
- ❌ Less flexible (can't load templates dynamically)
- ❌ Requires recompilation to change templates
- ❌ Steeper learning curve for non-Rust users
Rust Crate: askama = "0.12"
Option 3: MiniJinja (Lightweight Jinja2)¶
Description: Minimal Jinja2 implementation, faster than Tera.
Syntax: Same as Tera (Jinja2-compatible)
Pros:
- ✅ Faster than Tera
- ✅ Smaller binary size
- ✅ Jinja2 compatible
- ✅ Good for dynamic templates
Cons:
- ❌ Fewer features than Tera
- ❌ Still runtime overhead
- ❌ No compile-time safety
Rust Crate: minijinja = "2.0"
Option 4: Custom Macro-Based System¶
Description: Use Rust macros for compile-time prompt generation.
Syntax:
prompt! {
    role: "math solver",
    examples: [
        ("What is 2+2?", "4"),
        ("What is 3+3?", "6"),
    ],
    question: user_question,
}
Pros:
- ✅ Maximum compile-time safety
- ✅ Zero overhead: Expands to pure Rust code
- ✅ Full Rust integration: Can use any Rust expression
- ✅ No external dependencies
Cons:
- ❌ High implementation cost: Building a macro system is complex
- ❌ Limited flexibility: Hard to change without recompilation
- ❌ Unfamiliar syntax: Not standard templating
- ❌ Maintenance burden: Custom code to maintain
Part 4: Recommendation for AirsDSP¶
Hybrid Approach: Askama (Primary) + Tera (Optional)¶
Rationale:
1. Askama for Core Framework:
   - Aligns with AirsDSP's philosophy of "explicit control"
   - Compile-time safety prevents prompt bugs
   - Zero runtime overhead fits performance goals
   - Type-safe templates are very "Rust-like"
2. Tera for Advanced Users (Optional Feature):
   - Enable dynamic template loading for experimentation
   - Allow non-Rust users to modify prompts
   - Useful for A/B testing without recompilation
   - Can be behind a feature flag: --features dynamic-templates
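In Cargo.toml, such a flag could be declared along these lines (a sketch using the crate versions cited in this document):

```toml
[dependencies]
askama = "0.12"
tera = { version = "1.19", optional = true }

[features]
dynamic-templates = ["dep:tera"]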
Implementation Strategy¶
Phase 1: Askama Only
- Ship with compile-time templates
- Provide default templates for common patterns (CoT, ReAct)
- Users can override by creating their own template structs
Phase 2: Add Tera Support
- Add optional tera feature
- Provide DynamicTemplate trait alongside Template
- Document trade-offs clearly
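A minimal sketch of what the DynamicTemplate trait could look like, with a toy implementor (the API shown is hypothetical, not askama's or tera's):

```rust
use std::collections::HashMap;

/// Hypothetical runtime counterpart to askama's compile-time `Template` trait:
/// implementors render a prompt from a string-keyed context.
pub trait DynamicTemplate {
    fn render(&self, context: &HashMap<String, String>) -> Result<String, String>;
}

/// Toy implementor: substitutes `{{key}}` placeholders from the context.
pub struct SimpleTemplate {
    pub source: String,
}

impl DynamicTemplate for SimpleTemplate {
    fn render(&self, context: &HashMap<String, String>) -> Result<String, String> {
        let mut out = self.source.clone();
        for (key, value) in context {
            out = out.replace(&format!("{{{{{key}}}}}"), value);
        }
        Ok(out)
    }
}

fn main() {
    let tmpl = SimpleTemplate { source: "You are a {{role}}.".to_string() };
    let mut ctx = HashMap::new();
    ctx.insert("role".to_string(), "math expert".to_string());
    println!("{}", tmpl.render(&ctx).unwrap());
}
```

A real implementation would wrap tera behind this trait; the toy substitution above only fixes the shape of the interface.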
Phase 3: Template Library
- Build a collection of battle-tested templates
- Community can contribute templates
- Version templates separately from core framework
Part 5: Example Implementation¶
Askama Example (Compile-Time)¶
// templates/chain_of_thought.txt
You are a {{ role }}.
{% for example in examples %}
Question: {{ example.question }}
Reasoning: {{ example.reasoning }}
Answer: {{ example.answer }}
{% endfor %}
Question: {{ question }}
Reasoning:
// src/templates.rs
use askama::Template;

#[derive(Template)]
#[template(path = "chain_of_thought.txt")]
pub struct ChainOfThoughtTemplate {
    pub role: String,
    pub examples: Vec<Example>,
    pub question: String,
}

// Usage in pipeline
let prompt = ChainOfThoughtTemplate {
    role: "math expert".to_string(),
    examples: load_examples(),
    question: user_input.to_string(),
}.render()?;
let prediction = lm.generate(&prompt).await?;
Tera Example (Runtime)¶
use tera::{Tera, Context};

let tera = Tera::new("templates/**/*")?;

let mut context = Context::new();
context.insert("role", "math expert");
context.insert("examples", &examples);
context.insert("question", &user_input);

let prompt = tera.render("chain_of_thought.txt", &context)?;
let prediction = lm.generate(&prompt).await?;
Part 6: Open Questions¶
- Template Discovery: How should users find and use templates?
  - Registry system?
  - Documentation with examples?
  - CLI tool to list available templates?
- Template Versioning: How to handle breaking changes in templates?
  - Semantic versioning for template library?
  - Deprecation warnings?
- Template Testing: How to test templates in isolation?
  - Snapshot testing?
  - Golden file comparisons?
  - Property-based testing?
- Template Composition: Should templates be composable?
  - Inheritance (Jinja2-style)?
  - Mixins?
  - Partials?
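As an illustration of the snapshot-testing idea, a rendered prompt can be compared against a checked-in golden string (render_prompt is a hypothetical helper standing in for a template render):

```rust
/// Hypothetical prompt renderer used to illustrate snapshot testing:
/// formats one few-shot example followed by the live question.
fn render_prompt(role: &str, example: (&str, &str), question: &str) -> String {
    format!(
        "You are a {role}.\n\nQ: {}\nA: {}\n\nQ: {question}\nA:",
        example.0, example.1
    )
}

fn main() {
    // The "golden" output lives next to the template; any template change
    // shows up as a diff against this string.
    let golden = "You are a math solver.\n\nQ: What is 2+2?\nA: 4\n\nQ: What is 3+3?\nA:";
    let rendered = render_prompt("math solver", ("What is 2+2?", "4"), "What is 3+3?");
    assert_eq!(rendered, golden);
}
```

Crates like insta automate this pattern; the sketch only shows the core comparison.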
Conclusion¶
Templates are essential for AirsDSP because:
1. They separate prompt engineering from pipeline logic
2. They enable dynamic few-shot learning
3. They enforce structured reasoning formats
4. They support role switching and multilingual prompts
5. They make prompt optimization testable and maintainable
Recommended approach: Start with Askama for compile-time safety and Rust alignment, optionally add Tera for runtime flexibility.
This aligns with AirsDSP's core philosophy: explicit control, transparency, and Rust-native performance.
References¶
- Askama Documentation: https://djc.github.io/askama/
- Tera Documentation: https://keats.github.io/tera/
- MiniJinja Documentation: https://docs.rs/minijinja/
- Prompt Engineering Guide: https://www.promptingguide.ai/
- DSPy Signatures: How DSPy handles prompt templates (for comparison)