Overview¶
What is AirsDSP?¶
AirsDSP is a Rust implementation of the Demonstrate-Search-Predict (DSP) framework, a systematic approach to building sophisticated language model and retrieval model pipelines for knowledge-intensive NLP tasks.
The DSP Framework¶
The DSP framework, introduced by Stanford's NLP research team in 2022, addresses a fundamental challenge in NLP: how to effectively combine language models and retrieval models to solve knowledge-intensive tasks without requiring fine-tuning.
Core Insight¶
Traditional approaches to knowledge-intensive NLP often fall into two categories:
- Retrieve-then-read: Simple pipelines that retrieve documents and pass them to a language model
- Fine-tuned models: Task-specific models trained on large datasets
DSP offers a third way: systematic decomposition and composition of language and retrieval capabilities through three fundamental operations orchestrated as pipeline stages.
Three Core Stages¶
AirsDSP implements the DSP framework through a stage-based pipeline architecture. Each stage corresponds to one of the three core DSP operations:
1. Demonstrate Stage¶
The Demonstrate stage creates pipeline-aware examples that guide language models through multi-step reasoning processes.
Key Characteristics:
- Context-aware: Examples understand the full pipeline structure
- Task-specific: Tailored to guide specific reasoning patterns
- Reusable: Can be shared across similar pipeline configurations
Example Use Case:
// Load demonstrations for multi-hop question answering, keeping the top 3
let demonstrate_stage = YamlDemonstrateStage::new("examples.yaml")
    .with_selection_strategy(TopK::new(3));
let pipeline = Pipeline::builder()
    .add_stage(Box::new(demonstrate_stage))
    .search(/* ... */)
    .predict(/* ... */)
    .build()?;
Trait Definition:
#[async_trait]
pub trait DemonstrateStage: Stage {
    async fn demonstrate(
        &self,
        ctx: &Context,
    ) -> Result<Vec<Demonstration>>;
}
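To make the selection step concrete, here is a minimal, synchronous sketch of top-k demonstration selection. It is not the AirsDSP implementation: the `Demonstration` fields, the `select_top_k` function, and the word-overlap scoring are all assumptions chosen for illustration, and a real strategy might score by embedding similarity instead.

```rust
use std::collections::HashSet;

/// A worked example shown to the language model (hypothetical shape).
#[derive(Debug, Clone, PartialEq)]
pub struct Demonstration {
    pub question: String,
    pub answer: String,
}

/// Lowercased alphanumeric words, used for a crude overlap score.
fn words(text: &str) -> HashSet<String> {
    text.split(|c: char| !c.is_alphanumeric())
        .filter(|w| !w.is_empty())
        .map(|w| w.to_lowercase())
        .collect()
}

/// Return the `k` demonstrations whose questions share the most words with
/// `query`. Ties keep their original order because the sort is stable.
pub fn select_top_k(pool: &[Demonstration], query: &str, k: usize) -> Vec<Demonstration> {
    let query_words = words(query);
    let mut scored: Vec<(usize, &Demonstration)> = pool
        .iter()
        .map(|d| (words(&d.question).intersection(&query_words).count(), d))
        .collect();
    // Highest overlap first.
    scored.sort_by(|a, b| b.0.cmp(&a.0));
    scored.into_iter().take(k).map(|(_, d)| d.clone()).collect()
}
```

With a pool of three examples and `k = 2`, a query about a country's capital would select the two capital-related demonstrations and drop the unrelated one.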
2. Search Stage¶
The Search stage provides sophisticated retrieval capabilities that can be strategically placed within program flow.
Key Characteristics:
- Strategic placement: Retrieval happens exactly when and where needed
- Multi-stage: Support for multiple retrieval steps in a single pipeline
- Context-aware: Can use intermediate results to inform subsequent retrievals
Example Use Case:
// Multi-hop retrieval: first find the country, then find its capital
let pipeline = Pipeline::builder()
    .predict(EntityExtract::new(lm.clone())) // Extract entity
    .search(VectorSearch::new(vs.clone()))   // First search
    .search(VectorSearch::new(vs.clone()))   // Second search
    .predict(AnswerGenerate::new(lm))        // Final answer
    .build()?;
Trait Definition:
#[async_trait]
pub trait SearchStage: Stage {
    async fn search(
        &self,
        ctx: &Context,
    ) -> Result<Vec<Document>>;
}
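The idea that a later retrieval can depend on an earlier result can be sketched without any vector store. The example below is a toy under stated assumptions: `Document`, `search`, and `two_hop` are hypothetical names, and substring matching stands in for real vector similarity.

```rust
/// A retrieved passage (hypothetical shape).
#[derive(Debug, Clone, PartialEq)]
pub struct Document {
    pub title: String,
    pub text: String,
}

/// Toy retriever: return up to `top_k` documents whose text contains every
/// whitespace-separated query term, ignoring case.
pub fn search(corpus: &[Document], query: &str, top_k: usize) -> Vec<Document> {
    let terms: Vec<String> = query.split_whitespace().map(|t| t.to_lowercase()).collect();
    corpus
        .iter()
        .filter(|doc| {
            let text = doc.text.to_lowercase();
            terms.iter().all(|t| text.contains(t.as_str()))
        })
        .take(top_k)
        .cloned()
        .collect()
}

/// Two-hop retrieval: the title of the first hit is folded into the second
/// query, so the second search depends on the first result.
pub fn two_hop(corpus: &[Document], first_query: &str, follow_up: &str) -> Vec<Document> {
    let first = search(corpus, first_query, 1);
    match first.first() {
        Some(hit) => search(corpus, &format!("{} {}", hit.title, follow_up), 1),
        None => Vec::new(),
    }
}
```

For instance, a first query that lands on a document titled "France" turns the follow-up "capital" into the query "France capital", which mirrors the country-then-capital flow in the pipeline above.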
3. Predict Stage¶
The Predict stage leverages language models for grounded predictions based on demonstrations and retrieved context.
Key Characteristics:
- Demonstration-guided: Uses examples to constrain generation
- Context-grounded: Predictions based on retrieved information
- Type-safe: Structured outputs with compile-time validation
Example Use Case:
// Generate answer grounded in demonstrations and retrieved context
let predict_stage = SimplePredict::new(language_model)
    .with_system_prompt("Answer based on provided context")
    .with_temperature(0.7);
let pipeline = Pipeline::builder()
    .demonstrate(/* ... */)
    .search(/* ... */)
    .predict(predict_stage)
    .build()?;
Trait Definition:
#[async_trait]
pub trait PredictStage: Stage {
    async fn predict(
        &self,
        ctx: &Context,
    ) -> Result<Prediction>;
}
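One way to picture how a predict stage grounds its output is to look at prompt assembly. The sketch below is an assumption about layout, not the format AirsDSP actually sends to a model: `build_prompt` and the `Context [n]` / `Question` / `Answer` labels are hypothetical.

```rust
/// Assemble a grounded prompt: instruction first, then retrieved passages,
/// then worked examples, then the open question for the model to complete.
pub fn build_prompt(
    system: &str,
    demonstrations: &[(&str, &str)],
    documents: &[&str],
    question: &str,
) -> String {
    let mut prompt = String::new();
    prompt.push_str(system);
    prompt.push_str("\n\n");
    // Number each passage so the answer can be traced back to its source.
    for (i, doc) in documents.iter().enumerate() {
        prompt.push_str(&format!("Context [{}]: {}\n", i + 1, doc));
    }
    // Each demonstration is a completed question/answer pair.
    for (q, a) in demonstrations {
        prompt.push_str(&format!("\nQuestion: {}\nAnswer: {}\n", q, a));
    }
    // The final question is left unanswered.
    prompt.push_str(&format!("\nQuestion: {}\nAnswer:", question));
    prompt
}
```

Keeping assembly in one pure function makes it easy to unit-test the exact text a model would receive, independently of any model call.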
Compositional Philosophy¶
The power of DSP comes from composition, that is, combining these three stages into sophisticated reasoning pipelines:
// Complex multi-hop reasoning pipeline
async fn answer_complex_question(
    lm: Arc<dyn LanguageModel>,
    vs: Arc<dyn VectorStore>,
) -> Result<Pipeline> {
    let pipeline = Pipeline::builder()
        // Stage 1: Load demonstrations
        .demonstrate(
            YamlDemonstrateStage::new("multi_hop_examples.yaml")
        )
        // Stage 2: Extract entity from question
        .predict(
            EntityExtract::new(lm.clone())
                .add_hook(Box::new(LoggingHook::new()))
        )
        // Stage 3: Search for entity information
        .search(
            VectorSearch::new(vs.clone())
                .with_top_k(5)
                .add_hook(Box::new(RerankHook::new()))
        )
        // Stage 4: Second search based on first result
        .search(
            VectorSearch::new(vs.clone())
                .with_top_k(3)
        )
        // Stage 5: Generate final answer
        .predict(
            AnswerGenerate::new(lm)
                .with_demonstrations_from_context()
                .with_documents_from_context()
                .add_hook(Box::new(ValidationHook::new()))
        )
        .build()?;
    Ok(pipeline)
}
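The mechanics behind this kind of composition can be sketched with a shared context that each stage reads and extends. The following is a simplified, synchronous stand-in, not the AirsDSP API: `Context`, `Stage`, `Pipeline`, `StubSearch`, and `StubPredict` are hypothetical, and the stubs return fixed data instead of calling a vector store or a language model.

```rust
/// Shared state passed from stage to stage (hypothetical, heavily simplified).
#[derive(Debug, Default)]
pub struct Context {
    pub question: String,
    pub documents: Vec<String>,
    pub answer: Option<String>,
}

/// Synchronous stand-in for an async stage trait.
pub trait Stage {
    fn name(&self) -> &str;
    fn run(&self, ctx: &mut Context) -> Result<(), String>;
}

/// Runs stages in insertion order and stops at the first error.
#[derive(Default)]
pub struct Pipeline {
    stages: Vec<Box<dyn Stage>>,
}

impl Pipeline {
    pub fn add_stage(mut self, stage: Box<dyn Stage>) -> Self {
        self.stages.push(stage);
        self
    }

    pub fn execute(&self, ctx: &mut Context) -> Result<(), String> {
        for stage in &self.stages {
            // Prefix errors with the stage name so failures are easy to locate.
            stage.run(ctx).map_err(|e| format!("{}: {}", stage.name(), e))?;
        }
        Ok(())
    }
}

/// Stub search stage: adds one fixed passage instead of querying a store.
pub struct StubSearch;
impl Stage for StubSearch {
    fn name(&self) -> &str { "search" }
    fn run(&self, ctx: &mut Context) -> Result<(), String> {
        ctx.documents.push("Paris is the capital of France.".to_string());
        Ok(())
    }
}

/// Stub predict stage: refuses to answer unless a passage was retrieved.
pub struct StubPredict;
impl Stage for StubPredict {
    fn name(&self) -> &str { "predict" }
    fn run(&self, ctx: &mut Context) -> Result<(), String> {
        let passage = ctx.documents.first().ok_or("no documents in context")?;
        ctx.answer = Some(format!("Based on '{}': Paris", passage));
        Ok(())
    }
}
```

Because every stage only sees the context, reordering or inserting stages changes behavior without touching the stages themselves; running `StubPredict` alone fails with an error that names the stage.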
Design Philosophy¶
Explicit Over Automatic¶
AirsDSP prioritizes explicit control over automated optimization. You define the pipeline structure, control flow, and reasoning steps. This provides:
- Predictability: The sequence of stages is fixed and understandable, even though model outputs can still vary
- Debuggability: Each step can be inspected and reasoned about
- Reliability: Production systems require predictable behavior
Architecture-Driven Accuracy¶
Rather than relying on automated prompt optimization, AirsDSP achieves accuracy through five architectural mechanisms:
- Problem Decomposition: Break complex tasks into manageable stages
- Strategic Retrieval: Place retrieval exactly where it provides value
- Demonstration Guidance: Use examples to constrain model behavior
- Context Grounding: Base predictions on retrieved information
- Pipeline Composition: Combine stages in meaningful sequences
Stage-Based Architecture¶
Built on a flexible stage abstraction:
- Base Stage Trait: All stages implement a uniform interface
- Specialized Traits: DemonstrateStage, SearchStage, PredictStage provide type safety
- Custom Stages: Users can implement custom stages for domain-specific needs
- Hook System: Cross-cutting concerns (logging, metrics, caching) without cluttering stage logic
See Architecture Documentation for detailed design.
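As a rough illustration of how hooks keep cross-cutting concerns out of stage logic, consider the sketch below. It assumes a hypothetical `Hook` trait with `before` and `after` callbacks and a `run_with_hooks` helper; the actual hook interface in AirsDSP may differ.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Callbacks invoked around a stage (hypothetical shape for illustration).
pub trait Hook {
    fn before(&self, stage: &str);
    fn after(&self, stage: &str, ok: bool);
}

/// Records events in memory; a real logging hook would write to a logger.
pub struct RecordingHook {
    pub events: Rc<RefCell<Vec<String>>>,
}

impl Hook for RecordingHook {
    fn before(&self, stage: &str) {
        self.events.borrow_mut().push(format!("start {}", stage));
    }

    fn after(&self, stage: &str, ok: bool) {
        self.events.borrow_mut().push(format!("end {} ok={}", stage, ok));
    }
}

/// Run `body` with every hook's `before` called first and `after` called last.
/// The stage body itself contains no logging or metrics code.
pub fn run_with_hooks<T>(
    stage: &str,
    hooks: &[Box<dyn Hook>],
    body: impl FnOnce() -> Result<T, String>,
) -> Result<T, String> {
    for hook in hooks {
        hook.before(stage);
    }
    let result = body();
    for hook in hooks {
        hook.after(stage, result.is_ok());
    }
    result
}
```

The same wrapper could carry metrics or caching hooks; the stage body passed as `body` stays unchanged either way.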
Rust-Native Design¶
Built from the ground up for Rust, AirsDSP leverages:
- Type Safety: Compile-time guarantees for pipeline correctness
- Zero-Cost Abstractions: High-level APIs with minimal runtime overhead
- Async/Await: Non-blocking I/O for efficient pipeline execution
- Memory Safety: No data races or use-after-free bugs in safe Rust
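One common Rust technique for compile-time pipeline checks is the typestate pattern. The sketch below is only an illustration of that pattern and is not claimed to be how AirsDSP enforces correctness: `PipelineBuilder`, `NoPredict`, and `HasPredict` are hypothetical names, and stages are reduced to strings.

```rust
use std::marker::PhantomData;

/// Marker type: no predict stage has been added yet.
pub struct NoPredict;
/// Marker type: at least one predict stage has been added.
pub struct HasPredict;

/// Builder whose type parameter tracks whether a predict stage exists.
pub struct PipelineBuilder<State> {
    stages: Vec<String>,
    _state: PhantomData<State>,
}

impl PipelineBuilder<NoPredict> {
    pub fn new() -> Self {
        PipelineBuilder { stages: Vec::new(), _state: PhantomData }
    }
}

impl<State> PipelineBuilder<State> {
    /// Adding a search stage leaves the builder state unchanged.
    pub fn search(mut self, name: &str) -> Self {
        self.stages.push(format!("search:{}", name));
        self
    }

    /// Adding a predict stage moves the builder into the `HasPredict` state.
    pub fn predict(mut self, name: &str) -> PipelineBuilder<HasPredict> {
        self.stages.push(format!("predict:{}", name));
        PipelineBuilder { stages: self.stages, _state: PhantomData }
    }
}

impl PipelineBuilder<HasPredict> {
    /// `build` exists only in the `HasPredict` state, so a pipeline without
    /// a predict stage is rejected by the compiler rather than at runtime.
    pub fn build(self) -> Vec<String> {
        self.stages
    }
}
```

With this design, `PipelineBuilder::new().search("hop1").build()` does not compile because `build` is not defined for the `NoPredict` state, while adding `.predict("answer")` before `.build()` does.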
Performance Expectations¶
Based on the original DSP research paper, pipelines built with this framework can achieve:
- 37-120% relative improvement over vanilla language models on knowledge-intensive tasks
- 8-39% relative improvement over simple retrieve-then-read baselines
- Comparable or better accuracy than fine-tuned models without requiring training
These gains come from architectural sophistication rather than model optimization.
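Note that these figures are relative gains, not percentage-point differences. A small helper makes the distinction explicit; the function name and the numbers in the usage note are illustrative only and are not taken from the paper.

```rust
/// Relative improvement of `new_score` over `baseline`, as a percentage.
/// Both scores must use the same metric, and `baseline` must be non-zero.
pub fn relative_improvement(baseline: f64, new_score: f64) -> f64 {
    (new_score - baseline) / baseline * 100.0
}
```

For example, moving a hypothetical score from 20.0 to 44.0 is a gain of 24 points but a 120% relative improvement, while moving from 50.0 to 54.0 is an 8% relative improvement.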
Use Cases¶
Multi-Hop Question Answering¶
Answer questions requiring multiple reasoning steps and information gathering:
Question: "Who wrote the book that inspired the movie that won Best Picture in 2020?"
Pipeline Stages:
1. Demonstrate (load multi-hop examples)
2. Predict (identify the movie that won Best Picture in 2020)
3. Search (find movie details including source material)
4. Predict (extract book title)
5. Search (find book author)
6. Predict (generate final answer)
Knowledge-Intensive Text Generation¶
Generate content requiring factual accuracy and external knowledge:
Task: Write a technical summary of a research area
Pipeline Stages:
1. Search (retrieve relevant papers)
2. Demonstrate (load summarization examples)
3. Predict (synthesize summary from papers)
Document Analysis and Reasoning¶
Analyze large document collections with targeted retrieval:
Task: Extract insights from corporate reports
Pipeline Stages:
1. Search (retrieve relevant sections)
2. Predict (extract preliminary insights)
3. Search (find supporting evidence for insights)
4. Predict (refine and validate insights)
Custom Domain Solutions¶
Build specialized pipelines for specific domains:
Task: Medical diagnosis assistance
Pipeline Stages:
1. Demonstrate (medical reasoning examples)
2. Predict (extract symptoms and context)
3. Search (retrieve relevant medical literature)
4. Search (retrieve similar case studies)
5. Predict (generate diagnostic suggestions with confidence)
6. Validate (compliance and safety checks via hooks)
Comparison with Related Approaches¶
vs. Simple RAG (Retrieval-Augmented Generation)¶
| Aspect | Simple RAG | AirsDSP |
|---|---|---|
| Retrieval | Single-stage | Multi-stage, strategic |
| Demonstrations | None/Manual | Pipeline-aware, systematic |
| Composition | Linear | Arbitrary control flow via stages |
| Flexibility | Limited | Highly composable |
| Performance | Baseline | 8-39% relative improvement |
| Architecture | Fixed pattern | Custom stage combinations |
vs. Fine-Tuned Models¶
| Aspect | Fine-Tuned | AirsDSP |
|---|---|---|
| Training Required | Yes (days/weeks) | No |
| Data Requirements | Large datasets | Few examples |
| Adaptability | Task-specific | Flexible composition |
| Deployment | Model hosting | Frozen models + pipeline |
| Cost | Training + inference | Inference only |
| Iteration Speed | Slow (retrain) | Fast (adjust pipeline) |
vs. DSPy¶
| Aspect | DSPy | AirsDSP |
|---|---|---|
| Optimization | Automated (compilation) | Architecture-based (manual) |
| Control | Implicit | Explicit |
| Predictability | Variable | Deterministic |
| Language | Python | Rust |
| Performance | Research-focused | Production-focused |
| Best For | Research/prototyping | Production systems |
| Philosophy | Automate optimization | Enable manual optimization |
Key Differentiation: AirsDSP implements the foundational DSP architecture (three-operation paradigm with explicit control), not the DSPy evolution (automated compilation and metric-driven optimization).
See Research: DSP vs DSPy Comparative Evolution for detailed comparison.
Project Goals¶
Short Term (Phase 1: Months 1-3)¶
- Implement infrastructure trait abstractions (airsdsp/infra)
- Implement core stage hierarchy and pipeline (airsdsp/core)
- Provide clean, type-safe Rust APIs
- Comprehensive documentation and examples
- Testing infrastructure
Medium Term (Phase 2: Months 4-6)¶
- High-level pattern library (CoT, ReAct, Multi-hop)
- Multi-pipeline orchestration system
- Task classification and routing
- Extended examples and tutorials
Long Term (Phase 3: Months 7-9)¶
- Evaluation framework (G-Eval priority)
- Debugging and observability tools
- Performance optimization and benchmarking
- Production deployment patterns
Future Research¶
- DAG-based intent decomposition (research phase)
- Advanced composition patterns
- Integration with AirsStack ecosystem
- Community-contributed patterns and metrics
Modular Architecture¶
AirsDSP is organized as a Rust workspace with 6 modular crates:
Layer 1: Infrastructure¶
airsdsp/infra: Trait abstractions (LanguageModel, VectorStore, Cache)
Layer 2: Core + Patterns + Tooling¶
airsdsp/core: Stage hierarchy, Pipeline, Context, Hooks
airsdsp/patterns: CoT, ReAct, Multi-hop patterns
airsdsp/eval: G-Eval and evaluation framework
airsdsp/debug: Tracing and observability
Layer 3: Orchestration¶
airsdsp/orchestration: Multi-pipeline system, routing
Benefits:
- Clear separation of concerns
- Independent crate evolution
- Flexible user dependencies
- Phased implementation
See Architecture Documentation for detailed structure.
Getting Started¶
Ready to build your first DSP pipeline? Check out:
- Getting Started Guide - Installation and setup
- Architecture Documentation - Technical deep dive
- Roadmap - Development phases and timeline
Community and Support¶
- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Contributing: Contributing Guide