
Overview

What is AirsDSP?

AirsDSP is a Rust implementation of the Demonstrate-Search-Predict (DSP) framework, a systematic approach to building sophisticated language model and retrieval model pipelines for knowledge-intensive NLP tasks.

The DSP Framework

The DSP framework, introduced by Stanford's NLP research team in 2022, addresses a fundamental challenge in NLP: how to effectively combine language models and retrieval models to solve knowledge-intensive tasks without requiring fine-tuning.

Core Insight

Traditional approaches to knowledge-intensive NLP often fall into two categories:

  1. Retrieve-then-read: Simple pipelines that retrieve documents and pass them to a language model
  2. Fine-tuned models: Task-specific models trained on large datasets

DSP offers a third way: systematic decomposition and composition of language and retrieval capabilities through three fundamental operations orchestrated as pipeline stages.

Three Core Stages

AirsDSP implements the DSP framework through a stage-based pipeline architecture. Each stage corresponds to one of the three core DSP operations:

1. Demonstrate Stage

The Demonstrate stage creates pipeline-aware examples that guide language models through multi-step reasoning processes.

Key Characteristics:

  • Context-aware: Examples understand the full pipeline structure
  • Task-specific: Tailored to guide specific reasoning patterns
  • Reusable: Can be shared across similar pipeline configurations

Example Use Case:

// Bootstrap demonstrations for multi-hop question answering
let demonstrate_stage = YamlDemonstrateStage::new("examples.yaml")
    .with_selection_strategy(TopK::new(3));

let pipeline = Pipeline::builder()
    .add_stage(Box::new(demonstrate_stage))
    .search(/* ... */)
    .predict(/* ... */)
    .build()?;

Trait Definition:

#[async_trait]
pub trait DemonstrateStage: Stage {
    async fn demonstrate(
        &self,
        ctx: &Context,
    ) -> Result<Vec<Demonstration>>;
}
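As a rough illustration of what a selection strategy such as `TopK` might do, the sketch below ranks stored demonstrations by word overlap with the incoming question and keeps the best `k`. It is synchronous and uses only the standard library. The `Demonstration` fields, the `overlap` score, and `select_top_k` are assumptions made for this example, not AirsDSP's actual types or selection algorithm.

```rust
use std::collections::HashSet;

/// Hypothetical demonstration record (not the AirsDSP `Demonstration` type).
#[derive(Debug, Clone, PartialEq)]
pub struct Demonstration {
    pub question: String,
    pub answer: String,
}

/// Lowercased set of words in `text`, with surrounding punctuation stripped.
fn words(text: &str) -> HashSet<String> {
    text.split_whitespace()
        .map(|w| w.trim_matches(|c: char| !c.is_alphanumeric()).to_lowercase())
        .filter(|w| !w.is_empty())
        .collect()
}

/// Number of distinct words shared by the two strings.
fn overlap(a: &str, b: &str) -> usize {
    words(a).intersection(&words(b)).count()
}

/// Return the `k` demonstrations whose questions share the most words with `query`.
/// Ties keep their original order because `sort_by_key` is a stable sort.
pub fn select_top_k(pool: &[Demonstration], query: &str, k: usize) -> Vec<Demonstration> {
    let mut ranked: Vec<&Demonstration> = pool.iter().collect();
    ranked.sort_by_key(|d| std::cmp::Reverse(overlap(&d.question, query)));
    ranked.into_iter().take(k).cloned().collect()
}
```

A real strategy would more likely rank by embedding similarity. Word overlap is used here only to keep the example dependency-free.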

2. Search Stage

The Search stage provides sophisticated retrieval capabilities that can be strategically placed within program flow.

Key Characteristics:

  • Strategic placement: Retrieval happens exactly when and where needed
  • Multi-stage: Support for multiple retrieval steps in a single pipeline
  • Context-aware: Can use intermediate results to inform subsequent retrievals

Example Use Case:

// Multi-hop retrieval: first find the country, then find its capital
let pipeline = Pipeline::builder()
    .predict(EntityExtract::new(lm.clone()))  // Extract entity
    .search(VectorSearch::new(vs.clone()))    // First search
    .search(VectorSearch::new(vs.clone()))    // Second search
    .predict(AnswerGenerate::new(lm))         // Final answer
    .build()?;

Trait Definition:

#[async_trait]
pub trait SearchStage: Stage {
    async fn search(
        &self,
        ctx: &Context,
    ) -> Result<Vec<Document>>;
}
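To make "use intermediate results to inform subsequent retrievals" concrete, here is a minimal two-hop sketch in which the title of the first hit is folded into the second query. The in-memory `keyword_search` retriever, the `Document` fields, and `two_hop_search` are hypothetical stand-ins written for this example. They are not the AirsDSP `VectorSearch` or `Document` API.

```rust
/// Hypothetical document type for this sketch.
#[derive(Debug, Clone, PartialEq)]
pub struct Document {
    pub title: String,
    pub text: String,
}

/// Toy retriever: returns up to `top_k` documents that contain every query term
/// (case-insensitive substring match), in corpus order.
pub fn keyword_search(corpus: &[Document], query: &str, top_k: usize) -> Vec<Document> {
    let terms: Vec<String> = query.split_whitespace().map(str::to_lowercase).collect();
    corpus
        .iter()
        .filter(|doc| {
            let haystack = format!("{} {}", doc.title, doc.text).to_lowercase();
            !terms.is_empty() && terms.iter().all(|t| haystack.contains(t.as_str()))
        })
        .take(top_k)
        .cloned()
        .collect()
}

/// Two-hop retrieval: the title of the first hit becomes part of the second query.
/// Returns an empty list when the first hop finds nothing.
pub fn two_hop_search(corpus: &[Document], first_query: &str, follow_up: &str) -> Vec<Document> {
    let first = keyword_search(corpus, first_query, 1);
    match first.first() {
        Some(doc) => keyword_search(corpus, &format!("{} {}", doc.title, follow_up), 3),
        None => Vec::new(),
    }
}
```

In a full pipeline the follow-up query would typically be produced by a Predict stage rather than by string concatenation.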

3. Predict Stage

The Predict stage leverages language models for grounded predictions based on demonstrations and retrieved context.

Key Characteristics:

  • Demonstration-guided: Uses examples to constrain generation
  • Context-grounded: Predictions based on retrieved information
  • Type-safe: Structured outputs with compile-time validation

Example Use Case:

// Generate answer grounded in demonstrations and retrieved context
let predict_stage = SimplePredict::new(language_model)
    .with_system_prompt("Answer based on provided context")
    .with_temperature(0.7);

let pipeline = Pipeline::builder()
    .demonstrate(/* ... */)
    .search(/* ... */)
    .predict(predict_stage)
    .build()?;

Trait Definition:

#[async_trait]
pub trait PredictStage: Stage {
    async fn predict(
        &self,
        ctx: &Context,
    ) -> Result<Prediction>;
}
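One way to picture how a Predict stage grounds its output is to look at the prompt it might assemble before calling the model. The sketch below concatenates a system instruction, worked examples, numbered context passages, and the question. The function name `build_prompt` and the prompt layout are assumptions for illustration. The source does not specify how AirsDSP formats prompts.

```rust
/// Assemble a prompt from a system instruction, (question, answer) demonstrations,
/// numbered context passages, and the final question. Hypothetical layout.
pub fn build_prompt(
    system: &str,
    demonstrations: &[(&str, &str)],
    documents: &[&str],
    question: &str,
) -> String {
    let mut prompt = String::new();
    prompt.push_str(system);
    prompt.push_str("\n\n");

    // Worked examples come first so the model sees the expected answer format.
    for (q, a) in demonstrations {
        prompt.push_str(&format!("Q: {q}\nA: {a}\n\n"));
    }

    // Retrieved passages are numbered so an answer can refer back to them.
    for (i, doc) in documents.iter().enumerate() {
        prompt.push_str(&format!("[{}] {}\n", i + 1, doc));
    }

    // The prompt ends with an open "A:" for the model to complete.
    prompt.push_str(&format!("\nQ: {question}\nA:"));
    prompt
}
```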

Compositional Philosophy

The power of DSP comes from composition: combining these three stages into pipelines that reason over several steps.

// Complex multi-hop reasoning pipeline
async fn answer_complex_question(
    lm: Arc<dyn LanguageModel>,
    vs: Arc<dyn VectorStore>,
) -> Result<Pipeline> {
    let pipeline = Pipeline::builder()
        // Stage 1: Load demonstrations
        .demonstrate(
            YamlDemonstrateStage::new("multi_hop_examples.yaml")
        )

        // Stage 2: Extract entity from question
        .predict(
            EntityExtract::new(lm.clone())
                .add_hook(Box::new(LoggingHook::new()))
        )

        // Stage 3: Search for entity information
        .search(
            VectorSearch::new(vs.clone())
                .with_top_k(5)
                .add_hook(Box::new(RerankHook::new()))
        )

        // Stage 4: Second search based on first result
        .search(
            VectorSearch::new(vs.clone())
                .with_top_k(3)
        )

        // Stage 5: Generate final answer
        .predict(
            AnswerGenerate::new(lm)
                .with_demonstrations_from_context()
                .with_documents_from_context()
                .add_hook(Box::new(ValidationHook::new()))
        )
        .build()?;

    Ok(pipeline)
}
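The mechanics behind such a builder can be sketched in a few lines: stages share a uniform trait, run in the order they were added, and communicate through a shared context. The version below is synchronous and standard-library only. `Context`, `Stage`, `Pipeline`, `add_fn`, and the string-keyed value map are simplifications assumed for this example, not the actual AirsDSP definitions.

```rust
use std::collections::HashMap;

/// Shared state handed from stage to stage (hypothetical; string-keyed for brevity).
#[derive(Debug, Default)]
pub struct Context {
    pub values: HashMap<String, String>,
}

/// Uniform interface that every stage implements in this sketch.
pub trait Stage {
    fn name(&self) -> &str;
    fn run(&self, ctx: &mut Context) -> Result<(), String>;
}

/// Adapter that turns a closure into a stage.
struct FnStage<F> {
    label: &'static str,
    f: F,
}

impl<F> Stage for FnStage<F>
where
    F: Fn(&mut Context) -> Result<(), String>,
{
    fn name(&self) -> &str {
        self.label
    }
    fn run(&self, ctx: &mut Context) -> Result<(), String> {
        (self.f)(ctx)
    }
}

/// Runs stages strictly in insertion order and stops at the first error.
#[derive(Default)]
pub struct Pipeline {
    stages: Vec<Box<dyn Stage>>,
}

impl Pipeline {
    pub fn add_stage(mut self, stage: Box<dyn Stage>) -> Self {
        self.stages.push(stage);
        self
    }

    /// Convenience wrapper: register a closure as a named stage.
    pub fn add_fn<F>(self, label: &'static str, f: F) -> Self
    where
        F: Fn(&mut Context) -> Result<(), String> + 'static,
    {
        self.add_stage(Box::new(FnStage { label, f }))
    }

    /// Errors are prefixed with the name of the stage that produced them.
    pub fn run(&self, mut ctx: Context) -> Result<Context, String> {
        for stage in &self.stages {
            stage.run(&mut ctx).map_err(|e| format!("{}: {e}", stage.name()))?;
        }
        Ok(ctx)
    }
}
```

Because each stage only reads from and writes to the context, a later stage can consume whatever an earlier one produced, which is what makes multi-hop sequences possible.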

Design Philosophy

Explicit Over Automatic

AirsDSP prioritizes explicit control over automated optimization. You define the pipeline structure, control flow, and reasoning steps. This provides:

  • Predictability: Pipeline structure and control flow are fixed by your code, so behavior is easy to understand
  • Debuggability: Each step can be inspected and reasoned about
  • Reliability: Production systems require predictable behavior

Architecture-Driven Accuracy

Rather than relying on automated prompt optimization, AirsDSP achieves accuracy through five architectural mechanisms:

  1. Problem Decomposition: Break complex tasks into manageable stages
  2. Strategic Retrieval: Place retrieval exactly where it provides value
  3. Demonstration Guidance: Use examples to constrain model behavior
  4. Context Grounding: Base predictions on retrieved information
  5. Pipeline Composition: Combine stages in meaningful sequences

Stage-Based Architecture

Built on a flexible stage abstraction:

  • Base Stage Trait: All stages implement a uniform interface
  • Specialized Traits: DemonstrateStage, SearchStage, PredictStage provide type safety
  • Custom Stages: Users can implement custom stages for domain-specific needs
  • Hook System: Cross-cutting concerns (logging, metrics, caching) without cluttering stage logic

See Architecture Documentation for detailed design.
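The hook idea can be illustrated with a small wrapper that calls hooks before and after a stage's core logic, so the logic itself never mentions logging. This is a single-threaded sketch under stated assumptions: the `Hook` trait shape, `RecordingHook`, and `HookedStage` are hypothetical names invented here, and the real AirsDSP hook interface may differ.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Cross-cutting behaviour that runs around a stage (hypothetical trait).
/// Both methods default to doing nothing, so a hook overrides only what it needs.
pub trait Hook {
    fn before(&self, _stage: &str, _input: &str) {}
    fn after(&self, _stage: &str, _output: &str) {}
}

/// Records one line per event into a shared buffer.
pub struct RecordingHook {
    pub lines: Rc<RefCell<Vec<String>>>,
}

impl Hook for RecordingHook {
    fn before(&self, stage: &str, input: &str) {
        self.lines.borrow_mut().push(format!("{stage} <- {input}"));
    }
    fn after(&self, stage: &str, output: &str) {
        self.lines.borrow_mut().push(format!("{stage} -> {output}"));
    }
}

/// A stage whose core logic knows nothing about its hooks.
pub struct HookedStage<F: Fn(&str) -> String> {
    name: &'static str,
    logic: F,
    hooks: Vec<Box<dyn Hook>>,
}

impl<F: Fn(&str) -> String> HookedStage<F> {
    pub fn new(name: &'static str, logic: F) -> Self {
        Self { name, logic, hooks: Vec::new() }
    }

    pub fn add_hook(mut self, hook: Box<dyn Hook>) -> Self {
        self.hooks.push(hook);
        self
    }

    /// Calls every `before` hook, runs the logic, then calls every `after` hook.
    pub fn run(&self, input: &str) -> String {
        for hook in &self.hooks {
            hook.before(self.name, input);
        }
        let output = (self.logic)(input);
        for hook in &self.hooks {
            hook.after(self.name, &output);
        }
        output
    }
}
```

Metrics or caching hooks would follow the same pattern, each implementing only the callbacks it needs.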

Rust-Native Design

Built from the ground up for Rust, AirsDSP leverages:

  • Type Safety: Compile-time guarantees for pipeline correctness
  • Zero-Cost Abstractions: High-level APIs without runtime overhead
  • Async/Await: Non-blocking I/O for efficient pipeline execution
  • Memory Safety: No data races, dangling pointers, or use-after-free in safe Rust

Performance Expectations

Based on the results reported in the original DSP research paper, DSP pipelines can achieve:

  • 37-120% relative improvement over a vanilla language model (GPT-3.5) on knowledge-intensive tasks
  • 8-39% relative improvement over a standard retrieve-then-read pipeline
  • Comparable or better accuracy than fine-tuned models without requiring training

These gains come from architectural sophistication rather than model optimization.
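Note that these are relative figures, measured as a fraction of the baseline score rather than as a difference in percentage points. The helper below shows the arithmetic. The function name is an assumption for this example, and the scores in the note after it are made-up illustrations, not results from the paper.

```rust
/// Relative improvement of `new` over `baseline`, as a fraction (1.2 means 120%).
/// Returns `None` when the baseline is zero, where the ratio is undefined.
pub fn relative_improvement(baseline: f64, new: f64) -> Option<f64> {
    if baseline == 0.0 {
        None
    } else {
        Some((new - baseline) / baseline)
    }
}
```

For instance, a hypothetical score that rises from 20.0 to 44.0 gains 24 points, which is a 120% relative improvement because 24 / 20 = 1.2.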

Use Cases

Multi-Hop Question Answering

Answer questions requiring multiple reasoning steps and information gathering:

Question: "Who wrote the book that inspired the movie that won Best Picture in 2020?"
Pipeline Stages:
  1. Demonstrate (load multi-hop examples)
  2. Predict (identify the film that won Best Picture in 2020)
  3. Search (find movie details including source material)
  4. Predict (extract book title)
  5. Search (find book author)
  6. Predict (generate final answer)

Knowledge-Intensive Text Generation

Generate content requiring factual accuracy and external knowledge:

Task: Write a technical summary of a research area
Pipeline Stages:
  1. Search (retrieve relevant papers)
  2. Demonstrate (load summarization examples)
  3. Predict (synthesize summary from papers)

Document Analysis and Reasoning

Analyze large document collections with targeted retrieval:

Task: Extract insights from corporate reports
Pipeline Stages:
  1. Search (retrieve relevant sections)
  2. Predict (extract preliminary insights)
  3. Search (find supporting evidence for insights)
  4. Predict (refine and validate insights)

Custom Domain Solutions

Build specialized pipelines for specific domains:

Task: Medical diagnosis assistance
Pipeline Stages:
  1. Demonstrate (medical reasoning examples)
  2. Predict (extract symptoms and context)
  3. Search (retrieve relevant medical literature)
  4. Search (retrieve similar case studies)
  5. Predict (generate diagnostic suggestions with confidence)
  6. Validate (compliance and safety checks via hooks)

vs. Simple RAG (Retrieval-Augmented Generation)

| Aspect | Simple RAG | AirsDSP |
| --- | --- | --- |
| Retrieval | Single-stage | Multi-stage, strategic |
| Demonstrations | None/Manual | Pipeline-aware, systematic |
| Composition | Linear | Arbitrary control flow via stages |
| Flexibility | Limited | Highly composable |
| Performance | Baseline | 8-39% relative improvement (as reported in the DSP paper) |
| Architecture | Fixed pattern | Custom stage combinations |

vs. Fine-Tuned Models

| Aspect | Fine-Tuned | AirsDSP |
| --- | --- | --- |
| Training Required | Yes (days/weeks) | No |
| Data Requirements | Large datasets | Few examples |
| Adaptability | Task-specific | Flexible composition |
| Deployment | Model hosting | Frozen models + pipeline |
| Cost | Training + inference | Inference only |
| Iteration Speed | Slow (retrain) | Fast (adjust pipeline) |

vs. DSPy

| Aspect | DSPy | AirsDSP |
| --- | --- | --- |
| Optimization | Automated (compilation) | Architecture-based (manual) |
| Control | Implicit | Explicit |
| Predictability | Variable | Deterministic |
| Language | Python | Rust |
| Performance | Research-focused | Production-focused |
| Best For | Research/prototyping | Production systems |
| Philosophy | Automate optimization | Enable manual optimization |

Key Differentiation: AirsDSP implements the foundational DSP architecture (three-operation paradigm with explicit control), not the DSPy evolution (automated compilation and metric-driven optimization).

See Research: DSP vs DSPy Comparative Evolution for detailed comparison.

Project Goals

Short Term (Phase 1: Months 1-3)

  1. Implement infrastructure trait abstractions (airsdsp/infra)
  2. Implement core stage hierarchy and pipeline (airsdsp/core)
  3. Provide clean, type-safe Rust APIs
  4. Comprehensive documentation and examples
  5. Testing infrastructure

Medium Term (Phase 2: Months 4-6)

  1. High-level pattern library (CoT, ReAct, Multi-hop)
  2. Multi-pipeline orchestration system
  3. Task classification and routing
  4. Extended examples and tutorials

Long Term (Phase 3: Months 7-9)

  1. Evaluation framework (G-Eval priority)
  2. Debugging and observability tools
  3. Performance optimization and benchmarking
  4. Production deployment patterns

Future Research

  1. DAG-based intent decomposition (research phase)
  2. Advanced composition patterns
  3. Integration with AirsStack ecosystem
  4. Community-contributed patterns and metrics

Modular Architecture

AirsDSP is organized as a Rust workspace with 6 modular crates:

Layer 1: Infrastructure

  • airsdsp/infra: Trait abstractions (LanguageModel, VectorStore, Cache)

Layer 2: Core + Patterns + Tooling

  • airsdsp/core: Stage hierarchy, Pipeline, Context, Hooks
  • airsdsp/patterns: CoT, ReAct, Multi-hop patterns
  • airsdsp/eval: G-Eval and evaluation framework
  • airsdsp/debug: Tracing and observability

Layer 3: Orchestration

  • airsdsp/orchestration: Multi-pipeline system, routing

Benefits:

  • Clear separation of concerns
  • Independent crate evolution
  • Flexible user dependencies
  • Phased implementation

See Architecture Documentation for detailed structure.
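To show why putting trait abstractions in the lowest layer helps, the sketch below defines simplified, synchronous stand-ins for two of the infrastructure traits and a wrapper that depends only on those traits. The method signatures, `MemoryCache`, `CachedModel`, and `EchoModel` are assumptions made for this example. The real `airsdsp/infra` traits are not shown in this overview, and a `VectorStore` trait is omitted for brevity.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified language model trait.
pub trait LanguageModel {
    fn complete(&self, prompt: &str) -> String;
}

/// Hypothetical, simplified cache trait.
pub trait Cache {
    fn get(&self, key: &str) -> Option<String>;
    fn put(&mut self, key: &str, value: String);
}

/// In-memory cache backed by a hash map.
#[derive(Default)]
pub struct MemoryCache {
    entries: HashMap<String, String>,
}

impl Cache for MemoryCache {
    fn get(&self, key: &str) -> Option<String> {
        self.entries.get(key).cloned()
    }
    fn put(&mut self, key: &str, value: String) {
        self.entries.insert(key.to_string(), value);
    }
}

/// Wraps any model with any cache; it depends only on the two traits.
pub struct CachedModel<M: LanguageModel, C: Cache> {
    model: M,
    cache: C,
    /// Number of calls that had to reach the underlying model.
    pub misses: usize,
}

impl<M: LanguageModel, C: Cache> CachedModel<M, C> {
    pub fn new(model: M, cache: C) -> Self {
        Self { model, cache, misses: 0 }
    }

    /// Returns the cached completion if present; otherwise calls the model and stores the result.
    pub fn complete(&mut self, prompt: &str) -> String {
        if let Some(hit) = self.cache.get(prompt) {
            return hit;
        }
        self.misses += 1;
        let output = self.model.complete(prompt);
        self.cache.put(prompt, output.clone());
        output
    }
}

/// Test double that echoes the prompt instead of calling a real model.
pub struct EchoModel;

impl LanguageModel for EchoModel {
    fn complete(&self, prompt: &str) -> String {
        format!("echo: {prompt}")
    }
}
```

Higher layers written against the traits can swap `EchoModel` for a real provider, or `MemoryCache` for a persistent store, without changing their own code.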

Getting Started

Ready to build your first DSP pipeline? See Getting Started to set up AirsDSP and build your first pipeline.