
Getting Started

This guide will help you set up AirsDSP and build your first Demonstrate-Search-Predict pipeline.

Current Status

Project Phase: Architecture Complete, Phase 1 Starting
Implementation Status: Core crates under development
API Stability: Not yet stable (subject to change)

Note: The code examples below represent the planned API design based on our finalized architecture. Implementation is currently in progress (Phase 1).

Prerequisites

Before you begin, ensure you have:

  • Rust: Version 1.75 or later (Install Rust)
  • Cargo: Comes with Rust installation
  • Basic Rust Knowledge: Familiarity with async/await, traits, and error handling

Installation

Add AirsDSP to Your Project

Once Phase 1 is complete, you'll add AirsDSP to your Cargo.toml:

[dependencies]
# Core execution
airsdsp-infra = "0.1"
airsdsp-core = "0.1"

# Optional: High-level patterns
airsdsp-patterns = "0.1"

# Optional: Evaluation and debugging
airsdsp-eval = "0.1"
airsdsp-debug = "0.1"

# Async runtime
tokio = { version = "1", features = ["full"] }

Current Status: Not yet published to crates.io. Track Phase 1 progress for availability.


Quick Start

Your First Pipeline

Let's build a simple question-answering pipeline that demonstrates the core DSP stages.

Step 1: Create a New Project

cargo new my-dsp-project
cd my-dsp-project

Step 2: Add Dependencies

Edit your Cargo.toml:

[dependencies]
airsdsp-infra = "0.1"
airsdsp-core = "0.1"
tokio = { version = "1", features = ["full"] }
anyhow = "1"

Step 3: Implement Infrastructure Traits

Since airsdsp-infra provides only trait abstractions, you implement them for your own LLM and vector store clients:

use airsdsp_infra::{GenerationConfig, LanguageModel, ScoredDocument, VectorStore};
use async_trait::async_trait;
use anyhow::Result;

// Implement LanguageModel for your LLM client
struct MyLanguageModel {
    // Your LLM client (OpenAI, Anthropic, local, etc.)
}

#[async_trait]
impl LanguageModel for MyLanguageModel {
    async fn generate(
        &self,
        prompt: &str,
        config: &GenerationConfig,
    ) -> Result<String> {
        // Your implementation
        // Call OpenAI API, local model, etc.
        Ok("Generated response".to_string())
    }

    // Implement other required methods...
}

// Implement VectorStore for your vector DB client
struct MyVectorStore {
    // Your vector store client (Qdrant, Pinecone, etc.)
}

#[async_trait]
impl VectorStore for MyVectorStore {
    async fn search(
        &self,
        query: &str,
        top_k: usize,
    ) -> Result<Vec<ScoredDocument>> {
        // Your implementation
        // Query Qdrant, Pinecone, etc.
        Ok(vec![])
    }

    // Implement other required methods...
}

Step 4: Build Your First Pipeline

Edit src/main.rs:

use airsdsp_core::prelude::*;
use airsdsp_core::stages::{YamlDemonstrateStage, VectorSearchStage, SimplePredict};
use std::sync::Arc;
use anyhow::Result;

#[tokio::main]
async fn main() -> Result<()> {
    // Initialize your infrastructure implementations
    let lm = Arc::new(MyLanguageModel::new());
    let vs = Arc::new(MyVectorStore::new());

    // Build a DSP pipeline using stage-based composition
    let mut pipeline = Pipeline::builder()
        // Stage 1: Demonstrate - Load examples
        .demonstrate(
            YamlDemonstrateStage::new("examples.yaml")
                .with_selection_strategy(TopK::new(3))
        )

        // Stage 2: Search - Retrieve relevant context
        .search(
            VectorSearchStage::new(vs.clone())
                .with_top_k(5)
        )

        // Stage 3: Predict - Generate grounded answer
        .predict(
            SimplePredict::new(lm.clone())
                .with_temperature(0.7)
        )
        .build()?;

    // Execute the pipeline
    let question = "What is the capital of France?";
    let answer = pipeline.execute(question).await?;

    println!("Question: {}", question);
    println!("Answer: {}", answer);

    Ok(())
}

Step 5: Run Your Pipeline

cargo run

Understanding the Code

Let's break down what each part does:

Infrastructure Layer

let lm = Arc::new(MyLanguageModel::new());
let vs = Arc::new(MyVectorStore::new());

What it does:

  • Creates instances of your infrastructure implementations
  • Uses Arc for shared ownership across stages
  • You provide concrete implementations of the trait abstractions

Why it matters:

  • AirsDSP doesn't dictate which LLM or vector DB you use
  • Maximum flexibility - integrate with your existing infrastructure
  • No unnecessary dependencies
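The ownership model can be sketched in plain Rust, independent of AirsDSP (the `TextGenerator` trait and `EchoModel` type below are illustrative stand-ins, not part of the planned API):

```rust
use std::sync::Arc;

// A stand-in for the LanguageModel abstraction: stages depend on the
// trait, not on any concrete client.
trait TextGenerator: Send + Sync {
    fn generate(&self, prompt: &str) -> String;
}

struct EchoModel;

impl TextGenerator for EchoModel {
    fn generate(&self, prompt: &str) -> String {
        format!("echo: {}", prompt)
    }
}

fn main() {
    // One Arc, cloned cheaply into each stage that needs the model.
    let lm: Arc<dyn TextGenerator> = Arc::new(EchoModel);
    let for_search = Arc::clone(&lm);
    let for_predict = Arc::clone(&lm);

    // Both handles point at the same underlying instance.
    assert_eq!(for_search.generate("hi"), "echo: hi");
    assert_eq!(for_predict.generate("hi"), "echo: hi");
}
```

Because stages hold `Arc<dyn Trait>` rather than concrete types, you can swap the backing client without touching pipeline code.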

Pipeline Builder

let mut pipeline = Pipeline::builder()
    .demonstrate(/* ... */)
    .search(/* ... */)
    .predict(/* ... */)
    .build()?;

What it does:

  • Creates a pipeline with the specified stages
  • Builder pattern for a fluent API
  • Validates pipeline structure at build time

Stage order:

  • Demonstrate → Search → Predict (the classic DSP pattern)
  • Customizable: Predict-only, multiple searches, etc.
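A minimal sketch of build-time validation, assuming a simplified builder (the types and the exact rule below are illustrative, not the AirsDSP API):

```rust
// The builder records stages; `build` rejects pipelines that lack a
// Predict stage, mirroring the validation rule described above.
#[derive(Default)]
struct PipelineBuilder {
    stages: Vec<&'static str>,
}

impl PipelineBuilder {
    fn demonstrate(mut self) -> Self { self.stages.push("demonstrate"); self }
    fn search(mut self) -> Self { self.stages.push("search"); self }
    fn predict(mut self) -> Self { self.stages.push("predict"); self }

    fn build(self) -> Result<Vec<&'static str>, String> {
        if !self.stages.contains(&"predict") {
            return Err("Pipeline must contain at least one Predict stage".into());
        }
        Ok(self.stages)
    }
}

fn main() {
    // Full DSP ordering passes validation.
    let ok = PipelineBuilder::default().demonstrate().search().predict().build();
    assert_eq!(ok.unwrap(), vec!["demonstrate", "search", "predict"]);

    // A search-only pipeline is rejected at build time, not at runtime.
    assert!(PipelineBuilder::default().search().build().is_err());
}
```

Failing fast in `build()` means a malformed pipeline never reaches `execute()`.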

Demonstrate Stage

.demonstrate(
    YamlDemonstrateStage::new("examples.yaml")
        .with_selection_strategy(TopK::new(3))
)

What it does:

  • Loads demonstrations from a YAML file
  • Selects the top-3 most relevant examples for the question
  • Adds the demonstrations to the pipeline context

Traits implemented: Stage (base trait) and DemonstrateStage (specialized trait).
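A demonstrations file for this stage might look like the following. The schema shown is an assumption for illustration only; the actual format will be defined by YamlDemonstrateStage:

```yaml
# examples.yaml - hypothetical demonstration format (schema assumed)
demonstrations:
  - question: "What is the capital of Japan?"
    answer: "Tokyo"
  - question: "What is the capital of Italy?"
    answer: "Rome"
  - question: "Which river flows through Paris?"
    answer: "The Seine"
```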

Search Stage

.search(
    VectorSearchStage::new(vs.clone())
        .with_top_k(5)
)

What it does:

  • Retrieves the 5 most similar documents using vector search
  • Uses the question and demonstrations to inform retrieval
  • Adds the documents to the pipeline context

Traits implemented: Stage (base trait) and SearchStage (specialized trait).

Predict Stage

.predict(
    SimplePredict::new(lm.clone())
        .with_temperature(0.7)
)

What it does:

  • Generates an answer using the language model
  • Uses demonstrations and retrieved documents from the context
  • Returns the final prediction

Traits implemented: Stage (base trait) and PredictStage (specialized trait).

Pipeline Execution

let answer = pipeline.execute(question).await?;

What it does:

  1. Initializes the context with the question
  2. Executes each stage in sequence
  3. For each stage: runs before() hooks, executes the stage logic, then runs after() and transform() hooks
  4. Returns the final output from the context
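That execution order can be sketched as a plain Rust loop (the Context, Stage, and Echo types below are illustrative, and the hook phases are collapsed into log entries):

```rust
// Conceptual sketch of sequential stage execution with hook phases.
struct Context {
    log: Vec<String>,
}

trait Stage {
    fn name(&self) -> &'static str;
    fn run(&self, ctx: &mut Context);
}

// A trivial stage that just records that it ran.
struct Echo(&'static str);

impl Stage for Echo {
    fn name(&self) -> &'static str { self.0 }
    fn run(&self, ctx: &mut Context) {
        ctx.log.push(format!("run:{}", self.0));
    }
}

fn execute(stages: &[Box<dyn Stage>], ctx: &mut Context) {
    for stage in stages {
        ctx.log.push(format!("before:{}", stage.name())); // before() hooks
        stage.run(ctx);                                   // stage logic
        ctx.log.push(format!("after:{}", stage.name()));  // after()/transform() hooks
    }
}

fn main() {
    let stages: Vec<Box<dyn Stage>> =
        vec![Box::new(Echo("search")), Box::new(Echo("predict"))];
    let mut ctx = Context { log: vec![] };
    execute(&stages, &mut ctx);

    // Hooks bracket each stage, and stages run strictly in order.
    assert_eq!(ctx.log[0], "before:search");
    assert_eq!(ctx.log[2], "after:search");
    assert_eq!(ctx.log[3], "before:predict");
}
```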


Adding Hooks for Cross-Cutting Concerns

Hooks enable logging, metrics, caching, and validation without cluttering stage logic:

use airsdsp_core::hooks::{LoggingHook, MetricsHook, CacheHook};

let mut pipeline = Pipeline::builder()
    .demonstrate(
        YamlDemonstrateStage::new("examples.yaml")
            .add_hook(Box::new(LoggingHook::new()))
            .add_hook(Box::new(CacheHook::new(cache.clone())))
    )
    .search(
        VectorSearchStage::new(vs.clone())
            .add_hook(Box::new(LoggingHook::new()))
            .add_hook(Box::new(MetricsHook::new(metrics.clone())))
    )
    .predict(
        SimplePredict::new(lm.clone())
            .add_hook(Box::new(LoggingHook::new()))
            .add_hook(Box::new(ValidationHook::new()))
    )
    .build()?;

Hook Types:

  • before(): runs before the stage (validation, cache checks)
  • after(): runs after the stage (logging, metrics)
  • transform(): modifies the context after the stage (reranking, filtering)

Error Strategies:

  • FailStage: a hook failure fails the stage
  • ContinueWarn: log a warning and continue (default)
  • ContinueSilent: ignore hook errors
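A sketch of how these strategies could map to control flow (the enum mirrors the documented names; the dispatch function itself is illustrative, not the AirsDSP API):

```rust
// The three documented hook error strategies as an enum.
enum ErrorStrategy {
    FailStage,
    ContinueWarn,
    ContinueSilent,
}

// Decide whether a hook error propagates to the stage.
fn handle_hook_error(strategy: &ErrorStrategy, err: &str) -> Result<(), String> {
    match strategy {
        // Hook failure fails the whole stage.
        ErrorStrategy::FailStage => Err(format!("stage failed: {}", err)),
        // Log a warning, then carry on (the documented default).
        ErrorStrategy::ContinueWarn => {
            eprintln!("warning: hook error ignored: {}", err);
            Ok(())
        }
        // Swallow the error entirely.
        ErrorStrategy::ContinueSilent => Ok(()),
    }
}

fn main() {
    assert!(handle_hook_error(&ErrorStrategy::FailStage, "boom").is_err());
    assert!(handle_hook_error(&ErrorStrategy::ContinueWarn, "boom").is_ok());
    assert!(handle_hook_error(&ErrorStrategy::ContinueSilent, "boom").is_ok());
}
```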


Advanced Example: Multi-Hop Reasoning

use airsdsp_core::stages::{EntityExtract, VectorSearchStage, AnswerGenerate};

async fn multi_hop_pipeline(
    lm: Arc<dyn LanguageModel>,
    vs: Arc<dyn VectorStore>,
) -> Result<Pipeline> {
    let pipeline = Pipeline::builder()
        // Stage 1: Extract entity from question
        .predict(
            EntityExtract::new(lm.clone())
                .with_prompt("Extract the main entity from this question")
        )

        // Stage 2: First search for entity information
        .search(
            VectorSearchStage::new(vs.clone())
                .with_top_k(5)
                .with_query_from_previous_prediction()
        )

        // Stage 3: Second search based on first results
        .search(
            VectorSearchStage::new(vs.clone())
                .with_top_k(3)
                .with_query_from_documents()
        )

        // Stage 4: Generate final answer
        .predict(
            AnswerGenerate::new(lm.clone())
                .with_all_context()
        )
        .build()?;

    Ok(pipeline)
}

#[tokio::main]
async fn main() -> Result<()> {
    let lm = Arc::new(MyLanguageModel::new());
    let vs = Arc::new(MyVectorStore::new());

    let mut pipeline = multi_hop_pipeline(lm, vs).await?;

    let question = "Who wrote the book that inspired the movie Blade Runner?";
    let answer = pipeline.execute(question).await?;

    println!("Answer: {}", answer);
    // Expected: Philip K. Dick wrote "Do Androids Dream of Electric Sheep?"

    Ok(())
}

What this demonstrates:

  • Multiple retrieval stages
  • Chaining: each stage uses the previous stage's outputs
  • Complex reasoning patterns that remain explicit and understandable


Using High-Level Patterns

Once Phase 2 is complete, you'll have access to pre-built patterns:

use airsdsp_patterns::{CoTPattern, ReActPattern, MultiHopPattern};

// Chain-of-Thought pattern
let cot_pipeline = CoTPattern::new()
    .with_steps(5)
    .build_pipeline(lm.clone())?;

// ReAct pattern
let react_pipeline = ReActPattern::new()
    .with_max_iterations(10)
    .with_tools(vec![calculator, search_tool])
    .build_pipeline(lm.clone())?;

// Multi-hop pattern
let multihop_pipeline = MultiHopPattern::new()
    .with_max_hops(3)
    .build_pipeline(lm.clone(), vs.clone())?;

Benefits:

  • Pre-configured stage combinations
  • Proven reasoning patterns
  • Still customizable when needed

See Patterns Documentation for more details.


Configuration

Language Model Configuration

Example integration with OpenAI (you implement):

// Note: the client calls below are illustrative pseudocode, not the
// actual `openai_api_rust` API - adapt them to your client library.
use airsdsp_infra::{GenerationConfig, LanguageModel};
use anyhow::Result;
use async_trait::async_trait;
use openai_api_rust::*;

struct OpenAILanguageModel {
    client: OpenAI,
    model: String,
}

impl OpenAILanguageModel {
    pub fn new(api_key: &str, model: &str) -> Self {
        Self {
            client: OpenAI::new(api_key),
            model: model.to_string(),
        }
    }
}

#[async_trait]
impl LanguageModel for OpenAILanguageModel {
    async fn generate(&self, prompt: &str, config: &GenerationConfig) -> Result<String> {
        // Call OpenAI API
        let response = self.client
            .completions()
            .create(prompt, &self.model)
            .await?;
        Ok(response.text())
    }
}

Vector Store Configuration

Example integration with Qdrant (you implement):

// Note: the client calls below are illustrative pseudocode, not the
// actual `qdrant_client` API - adapt them to your client library.
use airsdsp_infra::{ScoredDocument, VectorStore};
use anyhow::Result;
use async_trait::async_trait;
use qdrant_client::prelude::*;

struct QdrantVectorStore {
    client: QdrantClient,
    collection: String,
}

impl QdrantVectorStore {
    pub fn new(url: &str, collection: &str) -> Self {
        Self {
            client: QdrantClient::new(url),
            collection: collection.to_string(),
        }
    }
}

#[async_trait]
impl VectorStore for QdrantVectorStore {
    async fn search(&self, query: &str, top_k: usize) -> Result<Vec<ScoredDocument>> {
        // Query Qdrant
        let results = self.client
            .search(&self.collection, query, top_k)
            .await?;
        Ok(convert_results(results))
    }
}

Common Patterns

Predict-Only Pipeline

Simplest pipeline - no retrieval:

let mut pipeline = Pipeline::builder()
    .predict(SimplePredict::new(lm))
    .build()?;

RAG (Retrieve-Augment-Generate) Pipeline

Classic RAG pattern:

let mut pipeline = Pipeline::builder()
    .search(VectorSearchStage::new(vs))
    .predict(SimplePredict::new(lm))
    .build()?;

Few-Shot Learning Pipeline

Demonstrate → Predict:

let mut pipeline = Pipeline::builder()
    .demonstrate(YamlDemonstrateStage::new("examples.yaml"))
    .predict(SimplePredict::new(lm))
    .build()?;

Full DSP Pipeline

All three stages:

let mut pipeline = Pipeline::builder()
    .demonstrate(YamlDemonstrateStage::new("examples.yaml"))
    .search(VectorSearchStage::new(vs))
    .predict(SimplePredict::new(lm))
    .build()?;

Development Tools

Logging (Phase 3)

Enable execution tracing:

use airsdsp_debug::ExecutionTracer;

let tracer = ExecutionTracer::new();

let mut pipeline = Pipeline::builder()
    .with_tracer(tracer.clone())
    .demonstrate(/* ... */)
    .search(/* ... */)
    .predict(/* ... */)
    .build()?;

pipeline.execute(question).await?;

// Export trace
let trace_json = tracer.export_json();
println!("{}", trace_json);

Evaluation (Phase 3)

Evaluate pipeline quality:

use airsdsp_eval::GEval;

let evaluator = GEval::new(evaluator_lm)
    .with_criteria(vec![
        Criterion::Relevance,
        Criterion::Coherence,
        Criterion::Fluency,
    ]);

let report = evaluator
    .evaluate(&mut pipeline, test_cases)
    .await?;

println!("Average score: {:.2}", report.average_score());

Next Steps

Phase 1 (Current - Months 1-3)

Learn the core concepts:

  • Architecture Documentation - understand the 3-layer model
  • Stage Trait Design - how stages work
  • Pipeline Composition - building pipelines

Phase 2 (Months 4-6)

Explore high-level patterns:

  • Chain-of-Thought (CoT) reasoning
  • ReAct (Reason-Action) loops
  • Multi-hop information gathering

Phase 3 (Months 7-9)

Master evaluation and debugging:

  • G-Eval for quality assessment
  • Execution tracing for debugging
  • Observability integrations


Common Issues

Issue: Trait Not Implemented

Error: the trait `LanguageModel` is not implemented for `MyLanguageModel`

Solution: Ensure you've implemented all required trait methods with #[async_trait]:

#[async_trait]
impl LanguageModel for MyLanguageModel {
    async fn generate(&self, prompt: &str, config: &GenerationConfig) -> Result<String> {
        // Your implementation
        todo!("call your LLM client here")
    }

    // Implement all other required methods
}

Issue: Pipeline Build Fails

Error: Pipeline must contain at least one Predict stage

Solution: Every pipeline must have at least one Predict stage:

// ❌ Invalid - no Predict stage
let pipeline = Pipeline::builder()
    .search(VectorSearchStage::new(vs))
    .build()?;

// ✅ Valid - has Predict stage
let pipeline = Pipeline::builder()
    .search(VectorSearchStage::new(vs))
    .predict(SimplePredict::new(lm))
    .build()?;

Issue: Async Runtime Error

Error: Cannot start a runtime from within a runtime

Solution: Ensure you have #[tokio::main] on your main function:

#[tokio::main]
async fn main() -> Result<()> {
    // Your code here
    Ok(())
}

Getting Help

Need help getting started?


Next: Architecture Documentation - Deep dive into AirsDSP's design

Note: This guide reflects the planned API based on our finalized architecture. As implementation progresses (Phase 1), actual APIs may evolve. Check the roadmap for implementation status.