Overview¶

What is AirsDSP?¶

AirsDSP is a Rust implementation of the Demonstrate-Search-Predict (DSP) framework, a systematic approach to building sophisticated language model and retrieval model pipelines for knowledge-intensive NLP tasks.

The DSP Framework¶

The DSP framework, introduced by Stanford's NLP research team in 2022, addresses a fundamental challenge in NLP: how to effectively combine language models and retrieval models to solve knowledge-intensive tasks without requiring fine-tuning.

Core Insight¶

Traditional approaches to knowledge-intensive NLP often fall into two categories:

Retrieve-then-read: Simple pipelines that retrieve documents and pass them to a language model
Fine-tuned models: Task-specific models trained on large datasets

DSP offers a third way: systematic decomposition and composition of language and retrieval capabilities through three fundamental operations orchestrated as pipeline stages.

Three Core Stages¶

AirsDSP implements the DSP framework through a stage-based pipeline architecture. Each stage corresponds to one of the three core DSP operations:

1. Demonstrate Stage¶

The Demonstrate stage creates pipeline-aware examples that guide language models through multi-step reasoning processes.

Key Characteristics:

Context-aware: Examples understand the full pipeline structure
Task-specific: Tailored to guide specific reasoning patterns
Reusable: Can be shared across similar pipeline configurations

Example Use Case:

// Bootstrap demonstrations for multi-hop question answering
let demonstrate_stage = YamlDemonstrateStage::new("examples.yaml")
    .with_selection_strategy(TopK::new(3));

let pipeline = Pipeline::builder()
    .add_stage(Box::new(demonstrate_stage))
    .search(/* ... */)
    .predict(/* ... */)
    .build()?;

Trait Definition:

#[async_trait]
pub trait DemonstrateStage: Stage {
    async fn demonstrate(
        &self,
        ctx: &Context,
    ) -> Result<Vec<Demonstration>>;
}
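To make the selection step concrete, here is a minimal, std-only sketch of what a top-k selection strategy could do. It is not the AirsDSP implementation: `Demo`, `select_top_k`, and `sample_pool` are hypothetical names, the code is synchronous, and word overlap stands in for whatever similarity measure a real strategy would use.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for a demonstration: a worked question/answer pair.
#[derive(Debug, Clone, PartialEq)]
pub struct Demo {
    pub question: String,
    pub answer: String,
}

/// Lowercased set of words in `text`, used as a crude similarity signal.
fn words(text: &str) -> HashSet<String> {
    text.split(|c: char| !c.is_alphanumeric())
        .filter(|w| !w.is_empty())
        .map(|w| w.to_lowercase())
        .collect()
}

/// Return the `k` demonstrations whose questions share the most words with `query`.
/// Ties keep their original order because `sort_by` is a stable sort.
pub fn select_top_k(pool: &[Demo], query: &str, k: usize) -> Vec<Demo> {
    let query_words = words(query);
    let mut scored: Vec<(usize, &Demo)> = pool
        .iter()
        .map(|d| (words(&d.question).intersection(&query_words).count(), d))
        .collect();
    scored.sort_by(|a, b| b.0.cmp(&a.0));
    scored.into_iter().take(k).map(|(_, d)| d.clone()).collect()
}

/// A tiny illustrative pool (made-up data).
pub fn sample_pool() -> Vec<Demo> {
    let demo = |q: &str, a: &str| Demo { question: q.to_string(), answer: a.to_string() };
    vec![
        demo("What is the capital of France?", "Paris"),
        demo("Who wrote Hamlet?", "Shakespeare"),
        demo("What is the largest city of Japan?", "Tokyo"),
    ]
}
```

Calling `select_top_k(&sample_pool(), "What is the capital of Germany?", 2)` keeps the two geography examples and drops the unrelated one, which is the effect a selection strategy is meant to have on the prompt.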

2. Search Stage¶

The Search stage provides sophisticated retrieval capabilities that can be strategically placed within program flow.

Key Characteristics:

Strategic placement: Retrieval happens exactly when and where needed
Multi-stage: Support for multiple retrieval steps in a single pipeline
Context-aware: Can use intermediate results to inform subsequent retrievals

Example Use Case:

// Multi-hop retrieval: first find the country, then find its capital
let pipeline = Pipeline::builder()
    .predict(EntityExtract::new(lm.clone()))  // Extract entity
    .search(VectorSearch::new(vs.clone()))    // First search
    .search(VectorSearch::new(vs.clone()))    // Second search
    .predict(AnswerGenerate::new(lm))         // Final answer
    .build()?;

Trait Definition:

#[async_trait]
pub trait SearchStage: Stage {
    async fn search(
        &self,
        ctx: &Context,
    ) -> Result<Vec<Document>>;
}
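The context-aware property is easiest to see in a two-hop lookup, where the second query is built from the first hit. The following is a rough, synchronous sketch using an in-memory corpus and keyword scoring. `Doc`, `search`, `two_hop`, and `sample_corpus` are hypothetical, and the last-word entity extraction is only a crude placeholder for what a Predict stage would do.

```rust
/// Hypothetical document type for this sketch.
#[derive(Debug, Clone, PartialEq)]
pub struct Doc {
    pub title: String,
    pub text: String,
}

/// Score = number of query terms that occur in the document text (case-insensitive).
fn score(doc: &Doc, query: &str) -> usize {
    let text = doc.text.to_lowercase();
    query
        .to_lowercase()
        .split_whitespace()
        .filter(|term| text.contains(*term))
        .count()
}

/// Return up to `top_k` documents with a non-zero score, best first.
pub fn search(corpus: &[Doc], query: &str, top_k: usize) -> Vec<Doc> {
    let mut scored: Vec<(usize, &Doc)> = corpus
        .iter()
        .map(|d| (score(d, query), d))
        .filter(|(s, _)| *s > 0)
        .collect();
    scored.sort_by(|a, b| b.0.cmp(&a.0));
    scored.into_iter().take(top_k).map(|(_, d)| d.clone()).collect()
}

/// Two-hop retrieval: the second query depends on the first result.
pub fn two_hop(corpus: &[Doc], first_query: &str) -> Option<Doc> {
    // Hop 1: find a document about the thing named in the question.
    let first = search(corpus, first_query, 1).into_iter().next()?;
    // Placeholder for a Predict stage: treat the last word of the hit as the entity.
    let entity: String = first
        .text
        .split_whitespace()
        .last()?
        .trim_matches(|c: char| !c.is_alphanumeric())
        .to_string();
    // Hop 2: use the intermediate entity to form the next query.
    let second_query = format!("capital {}", entity);
    search(corpus, &second_query, 1).into_iter().next()
}

/// Made-up corpus for illustration.
pub fn sample_corpus() -> Vec<Doc> {
    let doc = |t: &str, x: &str| Doc { title: t.to_string(), text: x.to_string() };
    vec![
        doc("Eiffel Tower", "The Eiffel Tower is a landmark located in France."),
        doc("France", "The capital of France is Paris."),
        doc("Japan", "The capital of Japan is Tokyo."),
    ]
}
```

With this corpus, `two_hop(&sample_corpus(), "eiffel tower")` first retrieves the landmark document, derives "France" from it, and then retrieves the document about France rather than the one about Japan.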

3. Predict Stage¶

The Predict stage leverages language models for grounded predictions based on demonstrations and retrieved context.

Key Characteristics:

Demonstration-guided: Uses examples to constrain generation
Context-grounded: Predictions based on retrieved information
Type-safe: Structured outputs with compile-time validation

Example Use Case:

// Generate answer grounded in demonstrations and retrieved context
let predict_stage = SimplePredict::new(language_model)
    .with_system_prompt("Answer based on provided context")
    .with_temperature(0.7);

let pipeline = Pipeline::builder()
    .demonstrate(/* ... */)
    .search(/* ... */)
    .predict(predict_stage)
    .build()?;

Trait Definition:

#[async_trait]
pub trait PredictStage: Stage {
    async fn predict(
        &self,
        ctx: &Context,
    ) -> Result<Prediction>;
}
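One way to picture how a Predict stage combines its inputs is as prompt assembly: system instruction first, then demonstrations, then retrieved passages, then the question. The sketch below assumes that ordering and a simple text layout. `build_prompt` is a hypothetical helper, not part of the documented API, and no language model is called.

```rust
/// Assemble a prompt from a system instruction, demonstrations, retrieved
/// passages, and the question to answer. Layout is an assumption of this sketch.
pub fn build_prompt(
    system: &str,
    demos: &[(&str, &str)],
    passages: &[&str],
    question: &str,
) -> String {
    let mut prompt = String::new();
    prompt.push_str(system);
    prompt.push_str("\n\n");

    // Demonstrations show the model the expected question/answer format.
    for (q, a) in demos {
        prompt.push_str(&format!("Question: {}\nAnswer: {}\n\n", q, a));
    }

    // Retrieved passages are numbered so an answer can refer back to them.
    for (i, passage) in passages.iter().enumerate() {
        prompt.push_str(&format!("[{}] {}\n", i + 1, passage));
    }

    // The open question goes last, leaving the answer for the model to complete.
    prompt.push_str(&format!("\nQuestion: {}\nAnswer:", question));
    prompt
}
```

Keeping assembly in one pure function makes the grounding easy to inspect: the exact text sent to the model can be logged or asserted on in tests before any model call is made.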

Compositional Philosophy¶

The power of DSP comes from composition, that is, combining these three stages into multi-step reasoning pipelines:

// Complex multi-hop reasoning pipeline.
// Nothing in the body is awaited, so this is a plain (non-async) function.
fn build_multi_hop_pipeline(
    lm: Arc<dyn LanguageModel>,
    vs: Arc<dyn VectorStore>,
) -> Result<Pipeline> {
    let pipeline = Pipeline::builder()
        // Stage 1: Load demonstrations
        .demonstrate(
            YamlDemonstrateStage::new("multi_hop_examples.yaml")
        )

        // Stage 2: Extract entity from question
        .predict(
            EntityExtract::new(lm.clone())
                .add_hook(Box::new(LoggingHook::new()))
        )

        // Stage 3: Search for entity information
        .search(
            VectorSearch::new(vs.clone())
                .with_top_k(5)
                .add_hook(Box::new(RerankHook::new()))
        )

        // Stage 4: Second search based on first result
        .search(
            VectorSearch::new(vs.clone())
                .with_top_k(3)
        )

        // Stage 5: Generate final answer
        .predict(
            AnswerGenerate::new(lm)
                .with_demonstrations_from_context()
                .with_documents_from_context()
                .add_hook(Box::new(ValidationHook::new()))
        )
        .build()?;

    Ok(pipeline)
}
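The builder above hides how stages hand results to one another. As a simplified mental model, stages can be seen as steps that read from and write to a shared context in order. The sketch below is synchronous and std-only, and every type in it (`Context`, `Stage`, `Pipeline`, `LoadDemos`, `KeywordSearch`, `GroundedPredict`) is a hypothetical stand-in rather than the real AirsDSP type of the same or similar name.

```rust
/// Shared state passed from stage to stage (hypothetical shape).
#[derive(Debug, Default)]
pub struct Context {
    pub question: String,
    pub demonstrations: Vec<String>,
    pub documents: Vec<String>,
    pub answer: Option<String>,
    pub trace: Vec<String>,
}

pub trait Stage {
    fn name(&self) -> &str;
    fn run(&self, ctx: &mut Context) -> Result<(), String>;
}

pub struct Pipeline {
    stages: Vec<Box<dyn Stage>>,
}

impl Pipeline {
    pub fn new() -> Self {
        Pipeline { stages: Vec::new() }
    }

    pub fn stage(mut self, stage: impl Stage + 'static) -> Self {
        self.stages.push(Box::new(stage));
        self
    }

    /// Run stages in order; stop at the first error.
    pub fn run(&self, question: &str) -> Result<Context, String> {
        let mut ctx = Context { question: question.to_string(), ..Context::default() };
        for stage in &self.stages {
            stage.run(&mut ctx)?;
            ctx.trace.push(stage.name().to_string());
        }
        Ok(ctx)
    }
}

pub struct LoadDemos(pub Vec<String>);
impl Stage for LoadDemos {
    fn name(&self) -> &str { "demonstrate" }
    fn run(&self, ctx: &mut Context) -> Result<(), String> {
        ctx.demonstrations.extend(self.0.iter().cloned());
        Ok(())
    }
}

pub struct KeywordSearch { pub corpus: Vec<String> }
impl Stage for KeywordSearch {
    fn name(&self) -> &str { "search" }
    fn run(&self, ctx: &mut Context) -> Result<(), String> {
        let question = ctx.question.to_lowercase();
        // Keep documents that contain any question word longer than three letters.
        let hits = self.corpus.iter().filter(|doc| {
            let doc = doc.to_lowercase();
            question
                .split_whitespace()
                .map(|w| w.trim_matches(|c: char| !c.is_alphanumeric()))
                .any(|w| w.len() > 3 && doc.contains(w))
        });
        ctx.documents.extend(hits.cloned());
        Ok(())
    }
}

pub struct GroundedPredict;
impl Stage for GroundedPredict {
    fn name(&self) -> &str { "predict" }
    fn run(&self, ctx: &mut Context) -> Result<(), String> {
        // Refuse to answer without evidence; a real stage would call a language model here.
        let evidence = ctx
            .documents
            .first()
            .ok_or("predict needs at least one retrieved document")?;
        ctx.answer = Some(format!("Based on: {}", evidence));
        Ok(())
    }
}

pub fn example_pipeline() -> Pipeline {
    Pipeline::new()
        .stage(LoadDemos(vec!["Q: Capital of Japan? A: Tokyo".to_string()]))
        .stage(KeywordSearch {
            corpus: vec![
                "The capital of France is Paris.".to_string(),
                "Hamlet was written by Shakespeare.".to_string(),
            ],
        })
        .stage(GroundedPredict)
}
```

Running `example_pipeline().run("What is the capital of France?")` executes the three stages in order and records each name in `ctx.trace`. A pipeline that contains only `GroundedPredict` returns an error, which illustrates why stage order matters when later stages depend on earlier results.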

Design Philosophy¶

Explicit Over Automatic¶

AirsDSP prioritizes explicit control over automated optimization. You define the pipeline structure, control flow, and reasoning steps. This provides:

Predictability: Pipeline behavior is deterministic and understandable
Debuggability: Each step can be inspected and reasoned about
Reliability: Production systems require predictable behavior

Architecture-Driven Accuracy¶

Rather than relying on automated prompt optimization, AirsDSP achieves accuracy through five architectural mechanisms:

Problem Decomposition: Break complex tasks into manageable stages
Strategic Retrieval: Place retrieval exactly where it provides value
Demonstration Guidance: Use examples to constrain model behavior
Context Grounding: Base predictions on retrieved information
Pipeline Composition: Combine stages in meaningful sequences

Stage-Based Architecture¶

Built on a flexible stage abstraction:

Base Stage Trait: All stages implement uniform interface
Specialized Traits: DemonstrateStage, SearchStage, PredictStage provide type safety
Custom Stages: Users can implement custom stages for domain-specific needs
Hook System: Cross-cutting concerns (logging, metrics, caching) without cluttering stage logic

See Architecture Documentation for detailed design.
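As an illustration of how hooks keep cross-cutting concerns out of stage logic, the sketch below wraps a stage's core function with before/after callbacks. It is an assumption-laden simplification: `Hook`, `HookedStage`, and `RecordingHook` are hypothetical names, stages are reduced to string-to-string functions, and everything is synchronous.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Callbacks invoked around a stage; both default to doing nothing.
pub trait Hook {
    fn before(&self, _stage: &str, _input: &str) {}
    fn after(&self, _stage: &str, _output: &str) {}
}

/// A stage reduced to a pure function, plus the hooks attached to it.
pub struct HookedStage {
    name: String,
    logic: Box<dyn Fn(&str) -> String>,
    hooks: Vec<Box<dyn Hook>>,
}

impl HookedStage {
    pub fn new(name: &str, logic: impl Fn(&str) -> String + 'static) -> Self {
        HookedStage { name: name.to_string(), logic: Box::new(logic), hooks: Vec::new() }
    }

    pub fn add_hook(mut self, hook: Box<dyn Hook>) -> Self {
        self.hooks.push(hook);
        self
    }

    /// Run hooks before and after the stage logic, in registration order.
    pub fn run(&self, input: &str) -> String {
        for hook in &self.hooks {
            hook.before(&self.name, input);
        }
        let output = (self.logic)(input);
        for hook in &self.hooks {
            hook.after(&self.name, &output);
        }
        output
    }
}

/// Example hook that records events into a shared log instead of printing.
pub struct RecordingHook {
    pub events: Rc<RefCell<Vec<String>>>,
}

impl Hook for RecordingHook {
    fn before(&self, stage: &str, input: &str) {
        self.events.borrow_mut().push(format!("before {}: {}", stage, input));
    }
    fn after(&self, stage: &str, output: &str) {
        self.events.borrow_mut().push(format!("after {}: {}", stage, output));
    }
}
```

The stage closure knows nothing about logging. Swapping `RecordingHook` for a metrics or caching hook would not require touching the stage logic, which is the separation the hook system is described as providing.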

Rust-Native Design¶

Built from the ground up for Rust, AirsDSP leverages:

Type Safety: Compile-time guarantees for pipeline correctness
Zero-Cost Abstractions: High-level APIs without runtime overhead
Async/Await: Non-blocking I/O for efficient pipeline execution
Memory Safety: No data races or use-after-free bugs in safe Rust
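One common Rust technique for compile-time pipeline checks is the typestate pattern, sketched below. This is an illustration of the general idea and not a description of how AirsDSP's builder is implemented. `PipelineBuilder`, `NoPredict`, and `HasPredict` are hypothetical, and stages are reduced to labels.

```rust
use std::marker::PhantomData;

/// Marker types that encode builder state in the type system.
pub struct NoPredict;
pub struct HasPredict;

pub struct PipelineBuilder<State> {
    stages: Vec<String>,
    _state: PhantomData<State>,
}

impl PipelineBuilder<NoPredict> {
    /// A new builder starts without any predict stage.
    pub fn new() -> Self {
        PipelineBuilder { stages: Vec::new(), _state: PhantomData }
    }
}

impl<State> PipelineBuilder<State> {
    /// Demonstrate and search stages do not change the builder state.
    pub fn demonstrate(mut self, source: &str) -> Self {
        self.stages.push(format!("demonstrate:{}", source));
        self
    }

    pub fn search(mut self, index: &str) -> Self {
        self.stages.push(format!("search:{}", index));
        self
    }

    /// Adding a predict stage moves the builder into the `HasPredict` state.
    pub fn predict(mut self, model: &str) -> PipelineBuilder<HasPredict> {
        self.stages.push(format!("predict:{}", model));
        PipelineBuilder { stages: self.stages, _state: PhantomData }
    }
}

impl PipelineBuilder<HasPredict> {
    /// `build` exists only once at least one predict stage has been added.
    pub fn build(self) -> Vec<String> {
        self.stages
    }
}

pub fn example_plan() -> Vec<String> {
    PipelineBuilder::new()
        .demonstrate("examples.yaml")
        .search("wiki")
        .predict("lm")
        .build()
}
```

Under this design, calling `.build()` on a builder that has no predict stage is rejected by the compiler, because `build` is not defined for `PipelineBuilder<NoPredict>`. The marker types are zero-sized, so the check adds no runtime state.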

Performance Expectations¶

The original DSP research paper reports that pipelines built with this approach achieve:

37-120% relative improvement over a vanilla language model (GPT-3.5) on knowledge-intensive tasks
8-39% relative improvement over a standard retrieve-then-read pipeline
Comparable or better accuracy than fine-tuned models without requiring training

These gains come from architectural sophistication rather than model optimization.
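Because these figures are relative, they scale with the baseline's own score rather than adding percentage points. The small helper below spells out the arithmetic. The function names are hypothetical and the baseline of 20.0 used in the note is a made-up number, not a result from the paper.

```rust
/// Relative improvement of `improved` over `baseline`, in percent.
pub fn relative_improvement_pct(baseline: f64, improved: f64) -> f64 {
    (improved - baseline) / baseline * 100.0
}

/// Score implied by applying a relative improvement (in percent) to `baseline`.
pub fn apply_relative_improvement(baseline: f64, pct: f64) -> f64 {
    baseline * (1.0 + pct / 100.0)
}
```

With an illustrative baseline score of 20.0, a 37% relative gain corresponds to a score of about 27.4 and a 120% relative gain to about 44.0.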

Use Cases¶

Multi-Hop Question Answering¶

Answer questions requiring multiple reasoning steps and information gathering:

Question: "Who wrote the book that inspired the movie that won Best Picture in 2020?"
Pipeline Stages:
  1. Demonstrate (load multi-hop examples)
  2. Predict (extract movie from 2020 Best Picture)
  3. Search (find movie details including source material)
  4. Predict (extract book title)
  5. Search (find book author)
  6. Predict (generate final answer)

Knowledge-Intensive Text Generation¶

Generate content requiring factual accuracy and external knowledge:

Task: Write a technical summary of a research area
Pipeline Stages:
  1. Search (retrieve relevant papers)
  2. Demonstrate (load summarization examples)
  3. Predict (synthesize summary from papers)

Document Analysis and Reasoning¶

Analyze large document collections with targeted retrieval:

Task: Extract insights from corporate reports
Pipeline Stages:
  1. Search (retrieve relevant sections)
  2. Predict (extract preliminary insights)
  3. Search (find supporting evidence for insights)
  4. Predict (refine and validate insights)

Custom Domain Solutions¶

Build specialized pipelines for specific domains:

Task: Medical diagnosis assistance
Pipeline Stages:
  1. Demonstrate (medical reasoning examples)
  2. Predict (extract symptoms and context)
  3. Search (retrieve relevant medical literature)
  4. Search (retrieve similar case studies)
  5. Predict (generate diagnostic suggestions with confidence)
  6. Validate (compliance and safety checks via hooks)

Comparison with Related Approaches¶

vs. Simple RAG (Retrieval-Augmented Generation)¶

Aspect	Simple RAG	AirsDSP
Retrieval	Single-stage	Multi-stage, strategic
Demonstrations	None/Manual	Pipeline-aware, systematic
Composition	Linear	Arbitrary control flow via stages
Flexibility	Limited	Highly composable
Performance	Baseline	8-39% improvement
Architecture	Fixed pattern	Custom stage combinations

vs. Fine-Tuned Models¶

Aspect	Fine-Tuned	AirsDSP
Training Required	Yes (days/weeks)	No
Data Requirements	Large datasets	Few examples
Adaptability	Task-specific	Flexible composition
Deployment	Model hosting	Frozen models + pipeline
Cost	Training + inference	Inference only
Iteration Speed	Slow (retrain)	Fast (adjust pipeline)

vs. DSPy¶

Aspect	DSPy	AirsDSP
Optimization	Automated (compilation)	Architecture-based (manual)
Control	Implicit	Explicit
Predictability	Variable	Deterministic
Language	Python	Rust
Performance	Research-focused	Production-focused
Best For	Research/prototyping	Production systems
Philosophy	Automate optimization	Enable manual optimization

Key Differentiation: AirsDSP implements the foundational DSP architecture (three-operation paradigm with explicit control), not the DSPy evolution (automated compilation and metric-driven optimization).

See Research: DSP vs DSPy Comparative Evolution for detailed comparison.

Project Goals¶

Short Term (Phase 1: Months 1-3)¶

Implement infrastructure trait abstractions (airsdsp/infra)
Implement core stage hierarchy and pipeline (airsdsp/core)
Provide clean, type-safe Rust APIs
Comprehensive documentation and examples
Testing infrastructure

Medium Term (Phase 2: Months 4-6)¶

High-level pattern library (CoT, ReAct, Multi-hop)
Multi-pipeline orchestration system
Task classification and routing
Extended examples and tutorials

Long Term (Phase 3: Months 7-9)¶

Evaluation framework (G-Eval priority)
Debugging and observability tools
Performance optimization and benchmarking
Production deployment patterns

Future Research¶

DAG-based intent decomposition (research phase)
Advanced composition patterns
Integration with AirsStack ecosystem
Community-contributed patterns and metrics

Modular Architecture¶

AirsDSP is organized as a Rust workspace with 6 modular crates:

Layer 1: Infrastructure¶

airsdsp/infra: Trait abstractions (LanguageModel, VectorStore, Cache)

Layer 2: Core + Patterns + Tooling¶

airsdsp/core: Stage hierarchy, Pipeline, Context, Hooks
airsdsp/patterns: CoT, ReAct, Multi-hop patterns
airsdsp/eval: G-Eval and evaluation framework
airsdsp/debug: Tracing and observability

Layer 3: Orchestration¶

airsdsp/orchestration: Multi-pipeline system, routing
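To give a feel for what routing between pipelines might involve, here is a minimal rule-based router: each rule pairs a predicate with a pipeline name, the first matching rule wins, and a fallback handles everything else. The orchestration crate is still planned, so this is purely a sketch. `Router`, `example_router`, and the pipeline names are hypothetical, and the keyword rules stand in for real task classification.

```rust
/// First-match-wins router from task text to a pipeline name.
pub struct Router {
    rules: Vec<(Box<dyn Fn(&str) -> bool>, String)>,
    fallback: String,
}

impl Router {
    pub fn new(fallback: &str) -> Self {
        Router { rules: Vec::new(), fallback: fallback.to_string() }
    }

    /// Register a rule; rules are checked in the order they were added.
    pub fn rule(mut self, matches: impl Fn(&str) -> bool + 'static, pipeline: &str) -> Self {
        self.rules.push((Box::new(matches), pipeline.to_string()));
        self
    }

    /// Name of the pipeline that should handle `task`.
    pub fn route(&self, task: &str) -> &str {
        self.rules
            .iter()
            .find(|(matches, _)| matches(task))
            .map(|(_, name)| name.as_str())
            .unwrap_or(self.fallback.as_str())
    }
}

/// Example configuration with crude keyword heuristics (illustrative only).
pub fn example_router() -> Router {
    Router::new("direct_qa")
        .rule(|t| t.to_lowercase().contains("summarize"), "summarization")
        // Two or more relative clauses suggest a question that needs several hops.
        .rule(|t| t.to_lowercase().matches(" that ").count() >= 2, "multi_hop_qa")
}
```

For example, `example_router().route("Who wrote the book that inspired the movie that won Best Picture in 2020?")` selects the multi-hop pipeline, while a short factual question falls through to the fallback.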

Benefits:

Clear separation of concerns
Independent crate evolution
Flexible user dependencies
Phased implementation

See Architecture Documentation for detailed structure.

Getting Started¶

Ready to build your first DSP pipeline? Check out:

Getting Started Guide - Installation and setup
Architecture Documentation - Technical deep dive
Roadmap - Development phases and timeline

Community and Support¶

Issues: GitHub Issues
Discussions: GitHub Discussions
Contributing: Contributing Guide

Next: Getting Started - Set up AirsDSP and build your first pipeline
