Roadmap

AirsDSP development roadmap and milestones.

Project Status

Current Phase: Architecture Complete, Phase 1 Starting
Version: Pre-alpha (0.0.x)
Status: Early Development

Development Phases

AirsDSP follows a 3-phase implementation plan over 9 months, aligned with the modular crate architecture.


Phase 1: Foundation (Months 1-3) - STARTING

Status: 🔄 In Progress
Timeline: Months 1-3
Focus: Core execution capability

Crates in Scope

  1. airsdsp/infra - Infrastructure trait abstractions
  2. airsdsp/core - Core execution engine

Objectives

airsdsp/infra (Layer 1):

  • Define LanguageModel trait (LLM integration interface)
  • Define VectorStore trait (vector database interface)
  • Define Cache trait (caching layer interface)
  • Define observability trait abstractions
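As a rough illustration of what these Layer 1 abstractions could look like, here is a minimal synchronous sketch. Only the trait names LanguageModel, VectorStore, and Cache come from the roadmap. Every method signature, the InfraError type, and the mock types (EchoModel, MemoryVectorStore, MemoryCache) are assumptions made for this example; the real traits may well be async and shaped differently.

```rust
use std::collections::HashMap;

/// Hypothetical error type shared by the infrastructure traits.
#[derive(Debug, Clone, PartialEq)]
pub struct InfraError(pub String);

/// LLM integration interface: turn a prompt into a completion.
pub trait LanguageModel {
    fn complete(&self, prompt: &str) -> Result<String, InfraError>;
}

/// Vector database interface: ids of the `top_k` best-scoring documents.
pub trait VectorStore {
    fn search(&self, query: &[f32], top_k: usize) -> Result<Vec<String>, InfraError>;
}

/// Caching layer interface keyed by string.
pub trait Cache {
    fn get(&self, key: &str) -> Option<String>;
    fn put(&mut self, key: &str, value: String);
}

/// Mock model for tests: echoes the prompt with a fixed prefix.
pub struct EchoModel;

impl LanguageModel for EchoModel {
    fn complete(&self, prompt: &str) -> Result<String, InfraError> {
        Ok(format!("echo: {prompt}"))
    }
}

/// Mock vector store: ranks documents by dot product with the query.
pub struct MemoryVectorStore {
    pub docs: Vec<(String, Vec<f32>)>,
}

impl VectorStore for MemoryVectorStore {
    fn search(&self, query: &[f32], top_k: usize) -> Result<Vec<String>, InfraError> {
        let mut scored: Vec<(f32, &String)> = self
            .docs
            .iter()
            .map(|(id, v)| (v.iter().zip(query).map(|(a, b)| a * b).sum::<f32>(), id))
            .collect();
        // Highest score first.
        scored.sort_by(|a, b| b.0.total_cmp(&a.0));
        Ok(scored.into_iter().take(top_k).map(|(_, id)| id.clone()).collect())
    }
}

/// Mock cache backed by a HashMap.
#[derive(Default)]
pub struct MemoryCache {
    entries: HashMap<String, String>,
}

impl Cache for MemoryCache {
    fn get(&self, key: &str) -> Option<String> {
        self.entries.get(key).cloned()
    }
    fn put(&mut self, key: &str, value: String) {
        self.entries.insert(key.to_string(), value);
    }
}

/// Answer from the cache when possible, otherwise call the model and store the result.
pub fn cached_complete(
    model: &dyn LanguageModel,
    cache: &mut dyn Cache,
    prompt: &str,
) -> Result<String, InfraError> {
    if let Some(hit) = cache.get(prompt) {
        return Ok(hit);
    }
    let answer = model.complete(prompt)?;
    cache.put(prompt, answer.clone());
    Ok(answer)
}
```

Because callers depend only on the traits, a function such as cached_complete works unchanged whether it receives a mock or a real provider, which is the main reason to define this layer first.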

airsdsp/core (Layer 2A):

  • Implement base Stage trait
  • Implement specialized traits: DemonstrateStage, SearchStage, PredictStage
  • Implement Pipeline and PipelineBuilder
  • Implement Context management (data flow between stages)
  • Implement Hook system (StageHook trait with before/after/transform)
  • Error handling and recovery mechanisms
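To make the relationship between Stage, Context, Pipeline, PipelineBuilder, and StageHook concrete, the following is a small sketch under stated assumptions. The type names come from the objectives above; the method signatures, the string-map Context, StageError, and the SetValue and RecordingHook helpers are hypothetical. The specialized stage traits and the transform hook are left out to keep the example short.

```rust
use std::cell::RefCell;
use std::collections::HashMap;
use std::rc::Rc;

/// Data flowing between stages (assumed shape: a simple string map).
#[derive(Debug, Default)]
pub struct Context {
    pub values: HashMap<String, String>,
}

/// Hypothetical error type returned by a failing stage.
#[derive(Debug, PartialEq)]
pub struct StageError(pub String);

/// Base trait: a stage reads and updates the shared context.
pub trait Stage {
    fn name(&self) -> &str;
    fn run(&self, ctx: &mut Context) -> Result<(), StageError>;
}

/// Hook callbacks around each stage; both default to no-ops.
pub trait StageHook {
    fn before(&self, _stage: &str, _ctx: &Context) {}
    fn after(&self, _stage: &str, _ctx: &Context) {}
}

pub struct Pipeline {
    stages: Vec<Box<dyn Stage>>,
    hooks: Vec<Box<dyn StageHook>>,
}

impl Pipeline {
    pub fn builder() -> PipelineBuilder {
        PipelineBuilder { stages: Vec::new(), hooks: Vec::new() }
    }

    /// Runs stages in insertion order and stops at the first error.
    pub fn execute(&self, mut ctx: Context) -> Result<Context, StageError> {
        for stage in &self.stages {
            self.hooks.iter().for_each(|h| h.before(stage.name(), &ctx));
            stage.run(&mut ctx)?;
            self.hooks.iter().for_each(|h| h.after(stage.name(), &ctx));
        }
        Ok(ctx)
    }
}

pub struct PipelineBuilder {
    stages: Vec<Box<dyn Stage>>,
    hooks: Vec<Box<dyn StageHook>>,
}

impl PipelineBuilder {
    pub fn stage(mut self, stage: impl Stage + 'static) -> Self {
        self.stages.push(Box::new(stage));
        self
    }
    pub fn hook(mut self, hook: impl StageHook + 'static) -> Self {
        self.hooks.push(Box::new(hook));
        self
    }
    pub fn build(self) -> Pipeline {
        Pipeline { stages: self.stages, hooks: self.hooks }
    }
}

/// Toy stage that writes one fixed key/value pair into the context.
pub struct SetValue {
    pub label: &'static str,
    pub key: &'static str,
    pub value: &'static str,
}

impl Stage for SetValue {
    fn name(&self) -> &str { self.label }
    fn run(&self, ctx: &mut Context) -> Result<(), StageError> {
        ctx.values.insert(self.key.to_string(), self.value.to_string());
        Ok(())
    }
}

/// Hook that records the order of callbacks in a shared log.
pub struct RecordingHook {
    pub log: Rc<RefCell<Vec<String>>>,
}

impl StageHook for RecordingHook {
    fn before(&self, stage: &str, _ctx: &Context) {
        self.log.borrow_mut().push(format!("before {stage}"));
    }
    fn after(&self, stage: &str, _ctx: &Context) {
        self.log.borrow_mut().push(format!("after {stage}"));
    }
}
```

A pipeline built with `Pipeline::builder().stage(...).hook(...).build()` then runs each stage between its before and after callbacks, so cross-cutting concerns such as logging or metrics stay out of the stage code.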

Milestones

Milestone 1.1: Workspace Setup (Week 1)

  • [ ] Create Rust workspace Cargo.toml
  • [ ] Create crate directories: infra/, core/
  • [ ] Set up CI/CD pipeline
  • [ ] Configure workspace dependencies

Milestone 1.2: Infrastructure Traits (Weeks 1-2)

  • [ ] Implement LanguageModel trait
  • [ ] Implement VectorStore trait
  • [ ] Implement Cache trait
  • [ ] Write trait documentation with examples
  • [ ] Create mock implementations for testing

Milestone 1.3: Stage Trait Hierarchy (Weeks 2-4)

  • [ ] Implement base Stage trait
  • [ ] Implement DemonstrateStage trait
  • [ ] Implement SearchStage trait
  • [ ] Implement PredictStage trait
  • [ ] Implement StageHook trait
  • [ ] Write comprehensive trait documentation

Milestone 1.4: Pipeline Orchestration (Weeks 5-7)

  • [ ] Implement Pipeline struct
  • [ ] Implement PipelineBuilder pattern
  • [ ] Implement hook execution (before/after/transform)
  • [ ] Implement error handling strategies
  • [ ] Write pipeline builder documentation
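The roadmap does not list which error handling strategies Milestone 1.4 will cover. Purely as an illustration, the sketch below assumes three hypothetical strategies (fail fast, retry, skip) and shows how a pipeline could apply one to a fallible stage call. ErrorStrategy and run_with_strategy are invented names for this example.

```rust
/// Hypothetical per-stage error handling strategies.
pub enum ErrorStrategy {
    /// Propagate the first error immediately.
    FailFast,
    /// Re-run the operation up to `max_attempts` times in total.
    Retry { max_attempts: u32 },
    /// Swallow the error and continue without a value.
    Skip,
}

/// Runs `op` under the given strategy.
/// `Ok(None)` means the stage failed and was skipped.
pub fn run_with_strategy<T>(
    strategy: &ErrorStrategy,
    mut op: impl FnMut() -> Result<T, String>,
) -> Result<Option<T>, String> {
    match strategy {
        ErrorStrategy::FailFast => op().map(Some),
        ErrorStrategy::Skip => Ok(op().ok()),
        ErrorStrategy::Retry { max_attempts } => {
            let mut last = Err("no attempts made".to_string());
            for _ in 0..*max_attempts {
                last = op().map(Some);
                if last.is_ok() {
                    break;
                }
            }
            last
        }
    }
}
```

Keeping the strategy separate from the stage body means the same stage can be strict in one pipeline and tolerant in another.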

Milestone 1.5: Context Management (Weeks 8-9)

  • [ ] Implement Context struct
  • [ ] Implement data flow between stages
  • [ ] Implement context mutation rules
  • [ ] Write context management documentation
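The context mutation rules themselves are not specified in this roadmap. One possible rule, assumed here only for illustration, is that each key is write-once: a later stage may read a value but not overwrite it. The Context and ContextError shapes below are hypothetical.

```rust
use std::collections::BTreeMap;

/// Hypothetical error raised when a stage breaks the write-once rule.
#[derive(Debug, PartialEq)]
pub enum ContextError {
    AlreadySet { key: String, owner: String },
}

/// Context in which every key can be written exactly once.
#[derive(Debug, Default)]
pub struct Context {
    /// key -> (name of the stage that wrote it, value)
    entries: BTreeMap<String, (String, String)>,
}

impl Context {
    /// Stores `value` under `key` on behalf of `stage`, rejecting overwrites.
    pub fn insert(&mut self, stage: &str, key: &str, value: &str) -> Result<(), ContextError> {
        if let Some((owner, _)) = self.entries.get(key) {
            return Err(ContextError::AlreadySet {
                key: key.to_string(),
                owner: owner.clone(),
            });
        }
        self.entries
            .insert(key.to_string(), (stage.to_string(), value.to_string()));
        Ok(())
    }

    /// Reads a value regardless of which stage wrote it.
    pub fn get(&self, key: &str) -> Option<&str> {
        self.entries.get(key).map(|(_, value)| value.as_str())
    }
}
```

Recording the owning stage makes a violation easy to diagnose, since the error names both the key and the stage that first wrote it.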

Milestone 1.6: Basic Stage Implementations (Weeks 10-11)

  • [ ] Implement YamlDemonstrateStage (load from file)
  • [ ] Implement VectorSearchStage (vector similarity)
  • [ ] Implement SimplePredict (single LLM call)
  • [ ] Implement 3-4 common hooks (Logging, Metrics, Cache, Validation)

Milestone 1.7: Testing & Documentation (Week 12)

  • [ ] Unit tests for all traits (>90% coverage)
  • [ ] Integration tests for complete pipelines
  • [ ] API documentation with examples
  • [ ] Simple example applications
  • [ ] Getting started tutorial

Success Criteria

  • ✅ Can build and execute simple pipelines (Demonstrate → Search → Predict)
  • ✅ All tests passing with >90% coverage
  • ✅ Zero compiler and Clippy warnings (cargo clippy -- -D warnings)
  • ✅ Documentation complete and published
  • ✅ At least 2 example implementations per stage type

Deliverables

  • airsdsp-infra crate (v0.1.0)
  • airsdsp-core crate (v0.1.0)
  • Basic test suite
  • API documentation
  • Simple examples (examples/simple_qa/)
  • Getting started guide

Phase 2: Patterns & Orchestration (Months 4-6) - PLANNED

Status: 📋 Planned
Timeline: Months 4-6
Focus: High-level patterns and multi-pipeline support

Crates in Scope

  1. airsdsp/patterns - Pattern library
  2. airsdsp/orchestration - Multi-pipeline orchestration

Objectives

airsdsp/patterns (Layer 2B):

  • Implement Chain-of-Thought (CoT) pattern
  • Implement ReAct (Reasoning + Acting) pattern
  • Implement Multi-hop reasoning pattern
  • Document DAG execution primitives (research)

airsdsp/orchestration (Layer 3):

  • Implement multi-pipeline system
  • Implement task classification pipeline
  • Implement routing logic (confidence-based)
  • Implement context management
  • Document future work: DAG-based intent decomposition

Milestones

Milestone 2.1: CoT Pattern (Weeks 13-15)

  • [ ] Design CoT pattern API
  • [ ] Implement CoT pipeline constructor
  • [ ] Implement CoT-specific stages
  • [ ] Write CoT examples and documentation
  • [ ] Integration tests for CoT pattern

Milestone 2.2: ReAct Pattern (Weeks 16-18)

  • [ ] Design ReAct pattern API
  • [ ] Implement ReAct iterative loop
  • [ ] Implement tool integration interface
  • [ ] Write ReAct examples and documentation
  • [ ] Integration tests for ReAct pattern
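The ReAct iterative loop alternates between a reasoning step and a tool call until the model decides to finish or a step budget runs out. The sketch below shows that control flow only. Step, Tool, react_loop, word_count, and scripted_reasoner are hypothetical names, and a scripted function stands in for the LLM, so none of this reflects the eventual pattern API.

```rust
use std::collections::HashMap;

/// What the reasoning step decides to do next.
pub enum Step {
    /// Call a named tool with some input.
    Act { tool: String, input: String },
    /// Stop and return the final answer.
    Finish(String),
}

/// Hypothetical tool interface: plain function from input text to an observation.
pub type Tool = fn(&str) -> String;

/// Runs reason -> act -> observe until `Finish` or until `max_steps` is reached.
/// Returns `None` if the step budget is exhausted without a final answer.
pub fn react_loop(
    mut reason: impl FnMut(&[String]) -> Step,
    tools: &HashMap<String, Tool>,
    max_steps: usize,
) -> Option<String> {
    let mut observations: Vec<String> = Vec::new();
    for _ in 0..max_steps {
        match reason(&observations) {
            Step::Finish(answer) => return Some(answer),
            Step::Act { tool, input } => {
                let observation = match tools.get(&tool) {
                    Some(run) => run(input.as_str()),
                    None => format!("unknown tool: {tool}"),
                };
                observations.push(observation);
            }
        }
    }
    None
}

/// Example tool: counts whitespace-separated words.
pub fn word_count(text: &str) -> String {
    text.split_whitespace().count().to_string()
}

/// Stand-in for the LLM: act once, then finish using the first observation.
pub fn scripted_reasoner(observations: &[String]) -> Step {
    match observations.first() {
        None => Step::Act {
            tool: "word_count".to_string(),
            input: "retrieve then predict".to_string(),
        },
        Some(observation) => Step::Finish(format!("{observation} words")),
    }
}
```

The explicit max_steps bound matters in practice: without it, a model that never emits a final answer would loop indefinitely.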

Milestone 2.3: Multi-Hop Pattern (Weeks 19-20)

  • [ ] Design Multi-hop pattern API
  • [ ] Implement multi-hop pipeline constructor
  • [ ] Implement entity extraction stage
  • [ ] Write multi-hop examples and documentation
  • [ ] Integration tests for multi-hop pattern

Milestone 2.4: Multi-Pipeline System (Weeks 21-23)

  • [ ] Design multi-pipeline architecture
  • [ ] Implement MultiPipeline struct
  • [ ] Implement pipeline registry
  • [ ] Implement default fallback mechanism
  • [ ] Write orchestration documentation

Milestone 2.5: Task Classification (Week 24)

  • [ ] Design task classification API
  • [ ] Implement TaskClassifier using DSP pipeline
  • [ ] Define TaskType enum
  • [ ] Write classification examples
  • [ ] Integration tests for classification

Milestone 2.6: Routing Logic (Week 25)

  • [ ] Design routing API
  • [ ] Implement confidence-based routing
  • [ ] Implement Router struct
  • [ ] Write routing examples
  • [ ] Integration tests for routing
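Milestones 2.4 through 2.6 together describe a classifier that labels a task, a registry of pipelines, and a router that falls back to a default when it is unsure. A minimal sketch of that decision follows. TaskType and Router are named in the roadmap, but the variants, the Classification struct, the fields, and the threshold semantics are assumptions made for this example.

```rust
/// Hypothetical task categories produced by a classifier pipeline.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TaskType {
    QuestionAnswering,
    Summarization,
    Unknown,
}

/// Classifier output: a task label plus a confidence in [0, 1].
pub struct Classification {
    pub task: TaskType,
    pub confidence: f32,
}

/// Maps task types to registered pipeline names, with a default fallback.
pub struct Router {
    pub threshold: f32,
    pub routes: Vec<(TaskType, &'static str)>,
    pub fallback: &'static str,
}

impl Router {
    /// Returns the pipeline name for a classification.
    /// Low-confidence or unregistered tasks go to the fallback pipeline.
    pub fn route(&self, c: &Classification) -> &'static str {
        if c.confidence < self.threshold {
            return self.fallback;
        }
        self.routes
            .iter()
            .find(|(task, _)| *task == c.task)
            .map(|(_, name)| *name)
            .unwrap_or(self.fallback)
    }
}
```

Routing to pipeline names rather than pipeline objects keeps the sketch independent of the core crate; a real implementation would presumably look the name up in the pipeline registry.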

Success Criteria

  • ✅ Patterns work correctly on top of core
  • ✅ Multi-pipeline system routes intelligently
  • ✅ Integration tests passing (>90% coverage)
  • ✅ Documentation complete with examples
  • ✅ At least 1 complex example per pattern

Deliverables

  • airsdsp-patterns crate (v0.1.0)
  • airsdsp-orchestration crate (v0.1.0)
  • Pattern examples (examples/cot_example/, examples/react_example/, examples/multi_hop_example/)
  • Multi-pipeline example (examples/multi_pipeline/)
  • Pattern documentation and tutorials

Phase 3: Tooling (Months 7-9) - PLANNED

Status: 📋 Planned
Timeline: Months 7-9
Focus: Developer experience and observability

Crates in Scope

  1. airsdsp/eval - Evaluation metrics
  2. airsdsp/debug - Debugging & observability

Objectives

airsdsp/eval (Layer 2C):

  • Implement G-Eval (LLM-based evaluation) - Priority
  • Implement Metric trait for extensibility
  • Implement pipeline evaluation utilities
  • Document extension points for future metrics

airsdsp/debug (Layer 2C):

  • Implement execution tracing
  • Implement observability hooks
  • Implement stage inspection
  • Document performance profiling (future)

Milestones

Milestone 3.1: G-Eval Implementation (Weeks 26-28)

  • [ ] Design G-Eval API
  • [ ] Implement GEval struct
  • [ ] Implement evaluation criteria framework
  • [ ] Implement LLM-based scoring
  • [ ] Write G-Eval examples and documentation
  • [ ] Integration tests for G-Eval

Milestone 3.2: Metric Framework (Weeks 29-30)

  • [ ] Design Metric trait
  • [ ] Implement metric aggregation
  • [ ] Implement evaluation reports
  • [ ] Document extension points for custom metrics
  • [ ] Example: Implement BLEU metric as extension
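To show how a Metric trait and metric aggregation could fit together, here is a small sketch. Only the trait name Metric comes from the roadmap. The method signatures, the evaluate helper, and both metrics are hypothetical; ExactMatch and TokenRecall are deliberately simple stand-ins and are not implementations of G-Eval or BLEU.

```rust
/// Extension point: any type that can score a prediction against a reference.
pub trait Metric {
    fn name(&self) -> &str;
    /// Score in [0, 1], where 1 is best.
    fn score(&self, prediction: &str, reference: &str) -> f64;
}

/// 1.0 when prediction and reference match after trimming, else 0.0.
pub struct ExactMatch;

impl Metric for ExactMatch {
    fn name(&self) -> &str { "exact_match" }
    fn score(&self, prediction: &str, reference: &str) -> f64 {
        if prediction.trim() == reference.trim() { 1.0 } else { 0.0 }
    }
}

/// Fraction of reference tokens that also appear in the prediction.
pub struct TokenRecall;

impl Metric for TokenRecall {
    fn name(&self) -> &str { "token_recall" }
    fn score(&self, prediction: &str, reference: &str) -> f64 {
        let predicted: Vec<&str> = prediction.split_whitespace().collect();
        let expected: Vec<&str> = reference.split_whitespace().collect();
        if expected.is_empty() {
            return 0.0;
        }
        let found = expected.iter().filter(|t| predicted.contains(*t)).count();
        found as f64 / expected.len() as f64
    }
}

/// Aggregation: mean score per metric over (prediction, reference) pairs.
pub fn evaluate(metrics: &[&dyn Metric], examples: &[(&str, &str)]) -> Vec<(String, f64)> {
    metrics
        .iter()
        .map(|metric| {
            let total: f64 = examples.iter().map(|(p, r)| metric.score(p, r)).sum();
            let mean = if examples.is_empty() { 0.0 } else { total / examples.len() as f64 };
            (metric.name().to_string(), mean)
        })
        .collect()
}
```

Under this shape, adding a metric such as BLEU would mean implementing the trait for one more type, with no change to evaluate.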

Milestone 3.3: Execution Tracing (Weeks 31-33)

  • [ ] Design tracing API
  • [ ] Implement ExecutionTracer struct
  • [ ] Implement TraceEvent capture
  • [ ] Implement trace export (JSON, text)
  • [ ] Write tracing examples and documentation
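A possible shape for ExecutionTracer and TraceEvent is sketched below. The two type names come from the milestone; the fields, the record wrapper, and the text format are assumptions. JSON export is omitted because doing it properly needs string escaping or a serialization crate.

```rust
use std::time::{Duration, Instant};

/// One captured stage execution.
#[derive(Debug, Clone)]
pub struct TraceEvent {
    pub stage: String,
    pub elapsed: Duration,
    pub ok: bool,
}

/// Collects trace events in execution order.
#[derive(Default)]
pub struct ExecutionTracer {
    events: Vec<TraceEvent>,
}

impl ExecutionTracer {
    /// Runs `f`, records its duration and outcome, and passes the result through.
    pub fn record<T, E>(
        &mut self,
        stage: &str,
        f: impl FnOnce() -> Result<T, E>,
    ) -> Result<T, E> {
        let start = Instant::now();
        let result = f();
        self.events.push(TraceEvent {
            stage: stage.to_string(),
            elapsed: start.elapsed(),
            ok: result.is_ok(),
        });
        result
    }

    pub fn events(&self) -> &[TraceEvent] {
        &self.events
    }

    /// Text export: one line per event, e.g. "search ok 12us".
    pub fn export_text(&self) -> String {
        self.events
            .iter()
            .map(|e| {
                let status = if e.ok { "ok" } else { "error" };
                format!("{} {} {}us", e.stage, status, e.elapsed.as_micros())
            })
            .collect::<Vec<_>>()
            .join("\n")
    }
}
```

Wrapping the stage call in record keeps timing and outcome capture in one place, so failed stages appear in the trace instead of silently disappearing.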

Milestone 3.4: Observability Hooks (Week 34)

  • [ ] Design observability integration
  • [ ] Implement logging integration (tracing crate)
  • [ ] Implement metrics integration (prometheus)
  • [ ] Write observability examples
  • [ ] Integration tests for observability

Milestone 3.5: Stage Inspection (Week 35)

  • [ ] Design inspection API
  • [ ] Implement pipeline introspection
  • [ ] Implement stage state inspection
  • [ ] Write inspection examples
  • [ ] Integration tests for inspection

Milestone 3.6: Documentation & Polish (Week 36)

  • [ ] Complete API documentation for all crates
  • [ ] Write comprehensive tutorials
  • [ ] Performance benchmarks
  • [ ] Production deployment guide
  • [ ] 1.0 release preparation

Success Criteria

  • ✅ Can evaluate pipeline quality with G-Eval
  • ✅ Can trace and debug execution
  • ✅ Observability integrations work correctly
  • ✅ Performance benchmarks available (profiling documented as future work)
  • ✅ Documentation complete (>90% API coverage)

Deliverables

  • airsdsp-eval crate (v0.1.0)
  • airsdsp-debug crate (v0.1.0)
  • Evaluation examples (examples/evaluation/)
  • Debugging examples (examples/debugging/)
  • Complete workspace documentation
  • Production deployment guide
  • Target: 1.0.0 release preparation

Future Considerations (Beyond Phase 3)

Post-1.0 Features

These features are being considered for post-1.0 releases:

DAG-Based Intent Decomposition (Research)

  • User intent decomposition into isolated pipeline nodes
  • Parallel execution of independent sub-tasks
  • Context isolation between nodes
  • Built-in synthesis operation
  • Currently in research phase for future consideration

Advanced Evaluation Metrics

  • BLEU, ROUGE for text similarity
  • F1, precision, recall for classification tasks
  • Custom domain-specific metrics
  • Automated metric selection

Visual Debugging Tools

  • Pipeline visualization (DAG graph)
  • Interactive trace exploration
  • Real-time execution monitoring
  • Performance flame graphs

Performance Optimization

  • Pipeline result caching
  • Concurrent stage execution
  • Memory optimization
  • Benchmark suite and regression testing

Ecosystem Integration

  • AirsSys runtime integration
  • AirsProtocols MCP support
  • AirsStack agent framework integration
  • Provider library (OpenAI, Anthropic, etc.)

Advanced Patterns

  • Self-consistency pattern
  • Tree-of-thought pattern
  • Ensemble pattern
  • Critique-revision pattern


Timeline Visualization

Month 1-3           Month 4-6           Month 7-9           Post-9
   |                   |                   |                   |
   |<-- Phase 1 -->    |<-- Phase 2 -->    |<-- Phase 3 -->    |  Future
   |   Foundation      |  Patterns &       |  Tooling          |  Research
   |                   |  Orchestration    |                   |
   |                   |                   |                   |
infra, core       patterns,          eval, debug          1.0 Release
                  orchestration                           + DAG, etc.

Key Dates:

  • Month 3: Phase 1 complete (core execution capability)
  • Month 6: Phase 2 complete (patterns + orchestration)
  • Month 9: Phase 3 complete (tooling + 1.0 release candidate)
  • Month 10+: 1.0 stable release + future features


Contributing to Development

Want to help with development?

High Priority Areas (by Phase)

Phase 1 (Current):

  1. Infrastructure trait implementations (mock providers)
  2. Stage implementations (Demonstrate, Search, Predict)
  3. Hook implementations (common cross-cutting concerns)
  4. Testing infrastructure
  5. Documentation and examples

Phase 2 (Upcoming):

  1. Pattern implementations (CoT, ReAct, Multi-hop)
  2. Multi-pipeline system
  3. Task classification strategies
  4. Advanced examples

Phase 3 (Future):

  1. G-Eval implementation
  2. Tracing and debugging tools
  3. Observability integrations
  4. Performance benchmarking

Getting Involved

  • GitHub Issues: Check for the "good first issue" and "help wanted" labels
  • Discussions: Join design discussions for upcoming phases
  • Research: Contribute analysis and insights (DAG research, patterns)
  • Testing: Help with testing strategies and coverage
  • Documentation: Write tutorials, improve API docs, add examples

See Contributing Guide for details.


Version Strategy

Pre-1.0 Versions (0.x.x)

Current: v0.0.x (pre-alpha)

  • Breaking changes may occur in any release
  • Minor versions (0.1, 0.2, etc.) add major features or complete phases
  • Patch versions (0.1.1, 0.1.2) fix bugs and make minor improvements

Planned Releases:

  • v0.1.0: Phase 1 complete (infra + core)
  • v0.2.0: Phase 2 complete (patterns + orchestration)
  • v0.3.0: Phase 3 complete (eval + debug)
  • v1.0.0: Stable release

Post-1.0 Versions

Will follow Semantic Versioning:

  • MAJOR (1.0.0 → 2.0.0): Breaking changes to public API
  • MINOR (1.0.0 → 1.1.0): New features (backward compatible)
  • PATCH (1.0.0 → 1.0.1): Bug fixes and minor improvements
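The post-1.0 bump rules above can be written down as a small function. This is only a sketch of the Semantic Versioning arithmetic; Version, Change, and bump are names invented for the example and are not part of any AirsDSP crate.

```rust
use std::fmt;

/// A MAJOR.MINOR.PATCH version. Derived ordering compares fields in that order.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub struct Version {
    pub major: u32,
    pub minor: u32,
    pub patch: u32,
}

/// Kind of change shipped in a release.
pub enum Change {
    /// Breaking change to the public API.
    Breaking,
    /// Backward-compatible new feature.
    Feature,
    /// Bug fix or minor improvement.
    Fix,
}

impl Version {
    /// Applies the post-1.0 rules: lower-order fields reset to zero on a bump.
    pub fn bump(self, change: Change) -> Version {
        match change {
            Change::Breaking => Version { major: self.major + 1, minor: 0, patch: 0 },
            Change::Feature => Version { minor: self.minor + 1, patch: 0, ..self },
            Change::Fix => Version { patch: self.patch + 1, ..self },
        }
    }
}

impl fmt::Display for Version {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}.{}.{}", self.major, self.minor, self.patch)
    }
}
```

Note that these rules apply only from 1.0.0 onward; as stated in the pre-1.0 section, breaking changes may occur in any 0.x.x release.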

Success Metrics

Phase 1 Success Metrics

  • All planned crates compile without warnings
  • Test coverage >90% for core functionality
  • Can build and execute simple DSP pipelines
  • Documentation complete with examples
  • At least 2 reference implementations per stage type

Phase 2 Success Metrics

  • All 3 patterns (CoT, ReAct, Multi-hop) implemented and tested
  • Multi-pipeline system correctly routes tasks
  • Integration tests passing (>90% coverage)
  • Complex examples demonstrating pattern usage
  • Documentation complete with tutorials

Phase 3 Success Metrics

  • G-Eval produces reliable quality scores
  • Execution tracing captures complete pipeline flow
  • Observability integrations work with standard tools
  • Performance benchmarks establish baseline
  • Production deployment guide available
  • Ready for 1.0.0 release

Overall Success Criteria (1.0 Release)

  • Complete API stability
  • Zero known critical bugs
  • Documentation coverage >95%
  • Performance meets or exceeds DSP paper benchmarks
  • Community adoption (10+ external users)
  • Security audit complete

Risk Management

Identified Risks

Phase 1 Risks:

  • ⚠️ Trait design too complex → Mitigation: Iterative refinement based on usage
  • ⚠️ Performance overhead from trait objects → Mitigation: Benchmarking + static dispatch options

Phase 2 Risks:

  • ⚠️ Pattern API not flexible enough → Mitigation: Community feedback loop
  • ⚠️ Multi-pipeline routing insufficient → Mitigation: Simple initial impl, iterate based on needs

Phase 3 Risks:

  • ⚠️ G-Eval reliability concerns → Mitigation: Extensive testing, fallback metrics
  • ⚠️ Debugging tools insufficient → Mitigation: Priority on must-haves, defer nice-to-haves
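The Phase 1 risk about trait objects comes down to a standard Rust trade-off between dynamic and static dispatch. The sketch below contrasts the two on a toy Stage trait. The trait shape, Uppercase, Trim, and the three functions are hypothetical and far simpler than the planned API.

```rust
/// Toy stage trait used only to contrast the two dispatch styles.
pub trait Stage {
    fn run(&self, input: &str) -> String;
}

pub struct Uppercase;

impl Stage for Uppercase {
    fn run(&self, input: &str) -> String {
        input.to_uppercase()
    }
}

pub struct Trim;

impl Stage for Trim {
    fn run(&self, input: &str) -> String {
        input.trim().to_string()
    }
}

/// Dynamic dispatch: one compiled function, each call goes through a vtable.
pub fn run_dyn(stage: &dyn Stage, input: &str) -> String {
    stage.run(input)
}

/// Static dispatch: monomorphized per concrete type, so the call can be inlined.
pub fn run_static<S: Stage>(stage: &S, input: &str) -> String {
    stage.run(input)
}

/// Trait objects allow a heterogeneous list of stages chosen at runtime.
pub fn run_all(stages: &[Box<dyn Stage>], input: &str) -> String {
    stages
        .iter()
        .fold(input.to_string(), |acc, stage| stage.run(&acc))
}
```

For stages that call an LLM or a vector store, the vtable indirection is likely negligible next to the I/O cost, which is why benchmarking before optimizing is a sensible mitigation.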


Stay Updated

  • GitHub: Watch repository for updates
  • Releases: Subscribe to release notifications
  • Discussions: Follow roadmap discussions
  • Documentation: Check this page for updates


Last Updated: 2025-12-16
Next Milestone: Phase 1 Milestone 1.1 (Workspace Setup)
Target Release: v1.0.0 release candidate in Month 9 (Q3 2026), stable release in Month 10+