Roadmap
AirsDSP development roadmap and milestones.
Project Status
Current Phase: Architecture Complete, Phase 1 Starting
Version: Pre-alpha (0.0.x)
Status: Early Development
Development Phases
AirsDSP follows a 3-phase implementation plan over 9 months, aligned with the modular crate architecture.
Phase 1: Foundation (Months 1-3) - STARTING
Status: 🔄 In Progress
Timeline: Months 1-3
Focus: Core execution capability
Crates in Scope
- airsdsp/infra - Infrastructure trait abstractions
- airsdsp/core - Core execution engine
Objectives
airsdsp/infra (Layer 1):
- Define LanguageModel trait (LLM integration interface)
- Define VectorStore trait (vector database interface)
- Define Cache trait (caching layer interface)
- Define observability trait abstractions
airsdsp/core (Layer 2A):
- Implement base Stage trait
- Implement specialized traits: DemonstrateStage, SearchStage, PredictStage
- Implement Pipeline and PipelineBuilder
- Implement Context management (data flow between stages)
- Implement Hook system (StageHook trait with before/after/transform)
- Error handling and recovery mechanisms
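The core objectives above can be sketched in a few dozen lines. This is a minimal illustration, not the planned API: the roadmap only names Stage, Pipeline, PipelineBuilder, and Context, so every method signature here (run, execute, stage, build, set, get) is an assumption, as is the toy Append stage. The traits are synchronous to keep the sketch dependency-free; a real implementation would likely be async.

```rust
use std::collections::HashMap;

/// Hypothetical context: a key/value bag that carries data between stages.
#[derive(Debug, Default)]
pub struct Context {
    values: HashMap<String, String>,
}

impl Context {
    pub fn set(&mut self, key: &str, value: impl Into<String>) {
        self.values.insert(key.to_string(), value.into());
    }

    pub fn get(&self, key: &str) -> Option<&str> {
        self.values.get(key).map(String::as_str)
    }
}

/// Hypothetical base trait: a stage reads from and writes to the context.
pub trait Stage {
    fn name(&self) -> &str;
    fn run(&self, ctx: &mut Context) -> Result<(), String>;
}

/// Runs stages in insertion order and stops at the first error.
pub struct Pipeline {
    stages: Vec<Box<dyn Stage>>,
}

impl Pipeline {
    pub fn builder() -> PipelineBuilder {
        PipelineBuilder { stages: Vec::new() }
    }

    pub fn execute(&self, ctx: &mut Context) -> Result<(), String> {
        for stage in &self.stages {
            stage
                .run(ctx)
                .map_err(|e| format!("stage `{}` failed: {}", stage.name(), e))?;
        }
        Ok(())
    }
}

pub struct PipelineBuilder {
    stages: Vec<Box<dyn Stage>>,
}

impl PipelineBuilder {
    pub fn stage(mut self, stage: impl Stage + 'static) -> Self {
        self.stages.push(Box::new(stage));
        self
    }

    pub fn build(self) -> Pipeline {
        Pipeline { stages: self.stages }
    }
}

/// Toy stage standing in for Demonstrate, Search, or Predict:
/// it appends its label to a "trail" entry in the context.
pub struct Append {
    pub label: &'static str,
}

impl Stage for Append {
    fn name(&self) -> &str {
        self.label
    }

    fn run(&self, ctx: &mut Context) -> Result<(), String> {
        let trail = match ctx.get("trail") {
            Some(prev) => format!("{} -> {}", prev, self.label),
            None => self.label.to_string(),
        };
        ctx.set("trail", trail);
        Ok(())
    }
}
```

Chaining three Append stages labelled demonstrate, search, and predict through the builder and calling execute leaves the trail entry showing the stages in the order they ran, which mirrors the Demonstrate → Search → Predict flow named in the success criteria.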
Milestones
Milestone 1.1: Workspace Setup (Week 1)
- [ ] Create Rust workspace Cargo.toml
- [ ] Create crate directories: infra/, core/
- [ ] Set up CI/CD pipeline
- [ ] Configure workspace dependencies
Milestone 1.2: Infrastructure Traits (Weeks 1-2)
- [ ] Implement LanguageModel trait
- [ ] Implement VectorStore trait
- [ ] Implement Cache trait
- [ ] Write trait documentation with examples
- [ ] Create mock implementations for testing
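As a rough idea of what this milestone could produce, the sketch below pairs three infrastructure traits with two mock implementations. Only the trait names come from the roadmap; the method names (complete, search, get, put), the String error type, and the MemoryCache and EchoModel mocks are assumptions made for illustration, and the traits are synchronous to avoid external dependencies.

```rust
use std::collections::HashMap;

/// Hypothetical LLM interface: turn a prompt into a completion.
pub trait LanguageModel {
    fn complete(&self, prompt: &str) -> Result<String, String>;
}

/// Hypothetical vector database interface:
/// return the ids of the `top_k` entries closest to `query`.
pub trait VectorStore {
    fn search(&self, query: &[f32], top_k: usize) -> Vec<String>;
}

/// Hypothetical caching interface keyed by string.
pub trait Cache {
    fn get(&self, key: &str) -> Option<String>;
    fn put(&mut self, key: String, value: String);
}

/// In-memory cache mock backed by a HashMap.
#[derive(Default)]
pub struct MemoryCache {
    entries: HashMap<String, String>,
}

impl Cache for MemoryCache {
    fn get(&self, key: &str) -> Option<String> {
        self.entries.get(key).cloned()
    }

    fn put(&mut self, key: String, value: String) {
        self.entries.insert(key, value);
    }
}

/// Mock model that echoes the prompt, which keeps tests deterministic.
pub struct EchoModel;

impl LanguageModel for EchoModel {
    fn complete(&self, prompt: &str) -> Result<String, String> {
        Ok(format!("echo: {}", prompt))
    }
}
```

Mocks like these let core and pattern tests run without network access or API keys.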
Milestone 1.3: Stage Trait Hierarchy (Weeks 2-4)
- [ ] Implement base Stage trait
- [ ] Implement DemonstrateStage trait
- [ ] Implement SearchStage trait
- [ ] Implement PredictStage trait
- [ ] Implement StageHook trait
- [ ] Write comprehensive trait documentation
Milestone 1.4: Pipeline Orchestration (Weeks 5-7)
- [ ] Implement Pipeline struct
- [ ] Implement PipelineBuilder pattern
- [ ] Implement hook execution (before/after/transform)
- [ ] Implement error handling strategies
- [ ] Write pipeline builder documentation
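One way the before/after/transform hook execution might work is sketched here. The StageHook name and its three phases come from the roadmap; the method signatures, the ordering (all before calls, then the stage, then all transform calls, then all after calls), and the LoggingHook, TrimHook, and run_with_hooks helpers are assumptions. A stage is reduced to a closure over strings to keep the example small.

```rust
/// Hypothetical hook: observe a stage before and after it runs,
/// and optionally rewrite its output. All methods default to no-ops.
pub trait StageHook {
    fn before(&mut self, _stage: &str, _input: &str) {}
    fn after(&mut self, _stage: &str, _output: &str) {}
    fn transform(&mut self, output: String) -> String {
        output
    }
}

/// Records what happened; stands in for a logging hook.
#[derive(Default)]
pub struct LoggingHook {
    pub lines: Vec<String>,
}

impl StageHook for LoggingHook {
    fn before(&mut self, stage: &str, input: &str) {
        self.lines.push(format!("before {}: {}", stage, input));
    }

    fn after(&mut self, stage: &str, output: &str) {
        self.lines.push(format!("after {}: {}", stage, output));
    }
}

/// Trims whitespace from stage output; stands in for a cleanup hook.
pub struct TrimHook;

impl StageHook for TrimHook {
    fn transform(&mut self, output: String) -> String {
        output.trim().to_string()
    }
}

/// Runs one stage body with every hook applied in registration order.
pub fn run_with_hooks(
    stage: &str,
    input: &str,
    body: impl Fn(&str) -> String,
    hooks: &mut [Box<dyn StageHook>],
) -> String {
    for hook in hooks.iter_mut() {
        hook.before(stage, input);
    }
    let mut output = body(input);
    for hook in hooks.iter_mut() {
        output = hook.transform(output);
    }
    for hook in hooks.iter_mut() {
        hook.after(stage, &output);
    }
    output
}
```

Running transform before after means observers such as a logging hook see the final output rather than the raw one; the opposite ordering is equally defensible and is a design decision this milestone would need to settle.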
Milestone 1.5: Context Management (Weeks 8-9)
- [ ] Implement Context struct
- [ ] Implement data flow between stages
- [ ] Implement context mutation rules
- [ ] Write context management documentation
Milestone 1.6: Basic Stage Implementations (Weeks 10-11)
- [ ] Implement YamlDemonstrateStage (load from file)
- [ ] Implement VectorSearchStage (vector similarity)
- [ ] Implement SimplePredict (single LLM call)
- [ ] Implement 3-4 common hooks (Logging, Metrics, Cache, Validation)
Milestone 1.7: Testing & Documentation (Week 12)
- [ ] Unit tests for all traits (>90% coverage)
- [ ] Integration tests for complete pipelines
- [ ] API documentation with examples
- [ ] Simple example applications
- [ ] Getting started tutorial
Success Criteria
- ✅ Can build and execute simple pipelines (Demonstrate → Search → Predict)
- ✅ All tests passing with >90% coverage
- ✅ Zero compiler warnings (cargo clippy -- --deny warnings)
- ✅ Documentation complete and published
- ✅ At least 2 example implementations per stage type
Deliverables
- airsdsp-infra crate (v0.1.0)
- airsdsp-core crate (v0.1.0)
- Basic test suite
- API documentation
- Simple examples (examples/simple_qa/)
- Getting started guide
Phase 2: Patterns & Orchestration (Months 4-6) - PLANNED
Status: 📋 Planned
Timeline: Months 4-6
Focus: High-level patterns and multi-pipeline support
Crates in Scope
- airsdsp/patterns - Pattern library
- airsdsp/orchestration - Multi-pipeline orchestration
Objectives
airsdsp/patterns (Layer 2B):
- Implement Chain-of-Thought (CoT) pattern
- Implement ReAct (Reasoning + Acting) pattern
- Implement Multi-hop reasoning pattern
- Document DAG execution primitives (research)
airsdsp/orchestration (Layer 3):
- Implement multi-pipeline system
- Implement task classification pipeline
- Implement routing logic (confidence-based)
- Implement context management
- Document future: DAG-based intent decomposition
Milestones
Milestone 2.1: CoT Pattern (Weeks 13-15)
- [ ] Design CoT pattern API
- [ ] Implement CoT pipeline constructor
- [ ] Implement CoT-specific stages
- [ ] Write CoT examples and documentation
- [ ] Integration tests for CoT pattern
Milestone 2.2: ReAct Pattern (Weeks 16-18)
- [ ] Design ReAct pattern API
- [ ] Implement ReAct iterative loop
- [ ] Implement tool integration interface
- [ ] Write ReAct examples and documentation
- [ ] Integration tests for ReAct pattern
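The ReAct iterative loop and tool interface could take roughly the following shape. This is a sketch under stated assumptions: the Step enum, the Tool trait, the react_loop function, and the Length tool are all hypothetical names, and the policy closure stands in for an LLM call that reads the transcript so far and decides the next step.

```rust
/// What the model decides to do at each step (hypothetical type).
pub enum Step {
    /// Call a tool with the given input.
    Act { tool: String, input: String },
    /// Stop and return the final answer.
    Finish(String),
}

/// Hypothetical tool integration interface.
pub trait Tool {
    fn name(&self) -> &str;
    fn call(&self, input: &str) -> String;
}

/// Iterates reason -> act -> observe until the policy finishes
/// or `max_steps` is reached, whichever comes first.
pub fn react_loop(
    question: &str,
    policy: impl Fn(&str) -> Step,
    tools: &[Box<dyn Tool>],
    max_steps: usize,
) -> Result<String, String> {
    let mut transcript = format!("Question: {}\n", question);
    for _ in 0..max_steps {
        match policy(&transcript) {
            Step::Finish(answer) => return Ok(answer),
            Step::Act { tool, input } => {
                // Unknown tools become an observation so the policy can recover.
                let observation = tools
                    .iter()
                    .find(|t| t.name() == tool.as_str())
                    .map(|t| t.call(&input))
                    .unwrap_or_else(|| format!("unknown tool `{}`", tool));
                transcript.push_str(&format!(
                    "Action: {}({})\nObservation: {}\n",
                    tool, input, observation
                ));
            }
        }
    }
    Err(format!("no answer after {} steps", max_steps))
}

/// Toy tool that returns the byte length of its input.
pub struct Length;

impl Tool for Length {
    fn name(&self) -> &str {
        "length"
    }

    fn call(&self, input: &str) -> String {
        input.len().to_string()
    }
}
```

The step cap matters because an LLM-driven policy has no guarantee of terminating; returning an error at the limit keeps the loop bounded.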
Milestone 2.3: Multi-Hop Pattern (Weeks 19-20)
- [ ] Design Multi-hop pattern API
- [ ] Implement multi-hop pipeline constructor
- [ ] Implement entity extraction stage
- [ ] Write multi-hop examples and documentation
- [ ] Integration tests for multi-hop pattern
Milestone 2.4: Multi-Pipeline System (Weeks 21-23)
- [ ] Design multi-pipeline architecture
- [ ] Implement MultiPipeline struct
- [ ] Implement pipeline registry
- [ ] Implement default fallback mechanism
- [ ] Write orchestration documentation
Milestone 2.5: Task Classification (Week 24)
- [ ] Design task classification API
- [ ] Implement TaskClassifier using DSP pipeline
- [ ] Define TaskType enum
- [ ] Write classification examples
- [ ] Integration tests for classification
Milestone 2.6: Routing Logic (Week 25)
- [ ] Design routing API
- [ ] Implement confidence-based routing
- [ ] Implement Router struct
- [ ] Write routing examples
- [ ] Integration tests for routing
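Confidence-based routing with a default fallback might look like the sketch below. The Router and TaskType names come from the milestones above; the TaskType variants, the Classification struct, the threshold parameter, and the use of pipeline names as plain strings are assumptions made to keep the example self-contained.

```rust
use std::collections::HashMap;

/// Hypothetical task categories; the roadmap does not list the variants.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum TaskType {
    QuestionAnswering,
    Summarization,
}

/// A classifier's guess plus how sure it is (0.0 to 1.0).
pub struct Classification {
    pub task: TaskType,
    pub confidence: f32,
}

/// Maps task types to pipeline names and falls back to a default
/// when the classifier is unsure or the task has no registered pipeline.
pub struct Router {
    routes: HashMap<TaskType, String>,
    default_pipeline: String,
    threshold: f32,
}

impl Router {
    pub fn new(default_pipeline: &str, threshold: f32) -> Self {
        Router {
            routes: HashMap::new(),
            default_pipeline: default_pipeline.to_string(),
            threshold,
        }
    }

    pub fn register(&mut self, task: TaskType, pipeline: &str) {
        self.routes.insert(task, pipeline.to_string());
    }

    /// Returns the name of the pipeline that should handle this classification.
    pub fn route(&self, c: &Classification) -> &str {
        if c.confidence < self.threshold {
            return self.default_pipeline.as_str();
        }
        self.routes
            .get(&c.task)
            .map(String::as_str)
            .unwrap_or(self.default_pipeline.as_str())
    }
}
```

With a threshold of 0.7 and only QuestionAnswering registered, a 0.9-confidence QuestionAnswering result goes to its own pipeline, while a 0.4-confidence result or an unregistered Summarization task goes to the default.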
Success Criteria
- ✅ Patterns work correctly on top of core
- ✅ Multi-pipeline system routes intelligently
- ✅ Integration tests passing (>90% coverage)
- ✅ Documentation complete with examples
- ✅ At least 1 complex example per pattern
Deliverables
- airsdsp-patterns crate (v0.1.0)
- airsdsp-orchestration crate (v0.1.0)
- Pattern examples (examples/cot_example/, examples/react_example/, examples/multi_hop_example/)
- Multi-pipeline example (examples/multi_pipeline/)
- Pattern documentation and tutorials
Phase 3: Tooling (Months 7-9) - PLANNED
Status: 📋 Planned
Timeline: Months 7-9
Focus: Developer experience and observability
Crates in Scope
- airsdsp/eval - Evaluation metrics
- airsdsp/debug - Debugging & observability
Objectives
airsdsp/eval (Layer 2C):
- Implement G-Eval (LLM-based evaluation) - Priority
- Implement Metric trait for extensibility
- Implement pipeline evaluation utilities
- Document extension points for future metrics
airsdsp/debug (Layer 2C):
- Implement execution tracing
- Implement observability hooks
- Implement stage inspection
- Document performance profiling (future)
Milestones
Milestone 3.1: G-Eval Implementation (Weeks 26-28)
- [ ] Design G-Eval API
- [ ] Implement GEval struct
- [ ] Implement evaluation criteria framework
- [ ] Implement LLM-based scoring
- [ ] Write G-Eval examples and documentation
- [ ] Integration tests for G-Eval
Milestone 3.2: Metric Framework (Weeks 29-30)
- [ ] Design Metric trait
- [ ] Implement metric aggregation
- [ ] Implement evaluation reports
- [ ] Document extension points for custom metrics
- [ ] Example: Implement BLEU metric as extension
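To show how a Metric trait could serve as an extension point, the sketch below defines one and plugs in a custom metric. It deliberately uses exact match rather than BLEU to stay short; the trait methods (name, score), the ExactMatch struct, and the mean_score aggregation helper are all assumptions, since the roadmap names only the trait.

```rust
/// Hypothetical extension point: score a prediction against a reference,
/// returning a value in the range 0.0 to 1.0.
pub trait Metric {
    fn name(&self) -> &str;
    fn score(&self, prediction: &str, reference: &str) -> f64;
}

/// Simple custom metric: 1.0 when the strings match after
/// trimming and lowercasing, 0.0 otherwise.
pub struct ExactMatch;

impl Metric for ExactMatch {
    fn name(&self) -> &str {
        "exact_match"
    }

    fn score(&self, prediction: &str, reference: &str) -> f64 {
        if prediction.trim().to_lowercase() == reference.trim().to_lowercase() {
            1.0
        } else {
            0.0
        }
    }
}

/// Aggregates a metric over (prediction, reference) pairs by taking the mean.
/// Returns `None` for an empty set so "no data" is not confused with a score of 0.
pub fn mean_score(metric: &dyn Metric, pairs: &[(&str, &str)]) -> Option<f64> {
    if pairs.is_empty() {
        return None;
    }
    let total: f64 = pairs.iter().map(|(p, r)| metric.score(p, r)).sum();
    Some(total / pairs.len() as f64)
}
```

A BLEU or ROUGE implementation would slot in the same way: implement Metric for a new struct and pass it to the same aggregation code.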
Milestone 3.3: Execution Tracing (Weeks 31-33)
- [ ] Design tracing API
- [ ] Implement ExecutionTracer struct
- [ ] Implement TraceEvent capture
- [ ] Implement trace export (JSON, text)
- [ ] Write tracing examples and documentation
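A minimal version of the tracer described in this milestone might look as follows. ExecutionTracer and TraceEvent are named in the roadmap, but the fields (stage, elapsed, ok) and methods (record, events, export_text) are assumptions. Only the text export is shown; a JSON export would most likely rely on a serialization crate and is left out to keep the sketch dependency-free.

```rust
use std::time::Duration;

/// One recorded step of a pipeline run (hypothetical shape).
#[derive(Debug, Clone, PartialEq)]
pub struct TraceEvent {
    pub stage: String,
    pub elapsed: Duration,
    pub ok: bool,
}

/// Collects trace events in execution order.
#[derive(Default)]
pub struct ExecutionTracer {
    events: Vec<TraceEvent>,
}

impl ExecutionTracer {
    pub fn record(&mut self, stage: &str, elapsed: Duration, ok: bool) {
        self.events.push(TraceEvent {
            stage: stage.to_string(),
            elapsed,
            ok,
        });
    }

    pub fn events(&self) -> &[TraceEvent] {
        &self.events
    }

    /// Plain-text export: one line per event, in execution order.
    pub fn export_text(&self) -> String {
        self.events
            .iter()
            .map(|e| {
                let status = if e.ok { "ok" } else { "failed" };
                format!("{} {}ms {}", e.stage, e.elapsed.as_millis(), status)
            })
            .collect::<Vec<_>>()
            .join("\n")
    }
}
```

In practice a tracer like this would probably be fed by a hook that times each stage, so pipelines need no tracing code of their own.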
Milestone 3.4: Observability Hooks (Week 34)
- [ ] Design observability integration
- [ ] Implement logging integration (tracing crate)
- [ ] Implement metrics integration (prometheus)
- [ ] Write observability examples
- [ ] Integration tests for observability
Milestone 3.5: Stage Inspection (Week 35)
- [ ] Design inspection API
- [ ] Implement pipeline introspection
- [ ] Implement stage state inspection
- [ ] Write inspection examples
- [ ] Integration tests for inspection
Milestone 3.6: Documentation & Polish (Week 36)
- [ ] Complete API documentation for all crates
- [ ] Write comprehensive tutorials
- [ ] Performance benchmarks
- [ ] Production deployment guide
- [ ] 1.0 release preparation
Success Criteria
- ✅ Can evaluate pipeline quality with G-Eval
- ✅ Can trace and debug execution
- ✅ Observability integrations work correctly
- ✅ Performance benchmarks available (profiling itself is only documented in this phase, as future work)
- ✅ Documentation complete (>90% API coverage)
Deliverables
- airsdsp-eval crate (v0.1.0)
- airsdsp-debug crate (v0.1.0)
- Evaluation examples (examples/evaluation/)
- Debugging examples (examples/debugging/)
- Complete workspace documentation
- Production deployment guide
- Target: 1.0.0 release preparation
Future Considerations (Beyond Phase 3)
Post-1.0 Features
These features are being considered for post-1.0 releases:
DAG-Based Intent Decomposition (Research)
- User intent decomposition into isolated pipeline nodes
- Parallel execution of independent sub-tasks
- Context isolation between nodes
- Built-in synthesis operation
- Currently in research phase for future consideration
Advanced Evaluation Metrics
- BLEU, ROUGE for text similarity
- F1, precision, recall for classification tasks
- Custom domain-specific metrics
- Automated metric selection
Visual Debugging Tools
- Pipeline visualization (DAG graph)
- Interactive trace exploration
- Real-time execution monitoring
- Performance flame graphs
Performance Optimization
- Pipeline result caching
- Concurrent stage execution
- Memory optimization
- Benchmark suite and regression testing
Ecosystem Integration
- AirsSys runtime integration
- AirsProtocols MCP support
- AirsStack agent framework integration
- Provider library (OpenAI, Anthropic, etc.)
Advanced Patterns
- Self-consistency pattern
- Tree-of-thought pattern
- Ensemble pattern
- Critique-revision pattern
Timeline Visualization
Month 1-3        Month 4-6        Month 7-9        Post-9
|                |                |                |
|<-- Phase 1 --> |<-- Phase 2 --> |<-- Phase 3 --> | Future
|  Foundation    |  Patterns &    |  Tooling       | Research
|                |  Orchestration |                |
|                |                |                |
infra, core      patterns,        eval, debug      1.0 Release
                 orchestration                     + DAG, etc.
Key Dates:
- Month 3: Phase 1 complete (core execution capability)
- Month 6: Phase 2 complete (patterns + orchestration)
- Month 9: Phase 3 complete (tooling + 1.0 release candidate)
- Month 10+: 1.0 stable release + future features
Contributing to Development
Want to help with development?
High Priority Areas (by Phase)
Phase 1 (Current):
1. Infrastructure trait implementations (mock providers)
2. Stage implementations (Demonstrate, Search, Predict)
3. Hook implementations (common cross-cutting concerns)
4. Testing infrastructure
5. Documentation and examples
Phase 2 (Upcoming):
1. Pattern implementations (CoT, ReAct, Multi-hop)
2. Multi-pipeline system
3. Task classification strategies
4. Advanced examples
Phase 3 (Future):
1. G-Eval implementation
2. Tracing and debugging tools
3. Observability integrations
4. Performance benchmarking
Getting Involved
- GitHub Issues: Check for "good first issue" and "help wanted" labels
- Discussions: Join design discussions for upcoming phases
- Research: Contribute analysis and insights (DAG research, patterns)
- Testing: Help with testing strategies and coverage
- Documentation: Write tutorials, improve API docs, add examples
See Contributing Guide for details.
Version Strategy
Pre-1.0 Versions (0.x.x)
Current: v0.0.x (pre-alpha)
- Breaking changes may occur in any release
- Minor versions (0.1, 0.2, etc.) add major features or complete phases
- Patch versions (0.1.1, 0.1.2) fix bugs and make minor improvements
Planned Releases:
- v0.1.0: Phase 1 complete (infra + core)
- v0.2.0: Phase 2 complete (patterns + orchestration)
- v0.3.0: Phase 3 complete (eval + debug)
- v1.0.0: Stable release
Post-1.0 Versions
Will follow Semantic Versioning:
- MAJOR (1.0.0 → 2.0.0): Breaking changes to public API
- MINOR (1.0.0 → 1.1.0): New features (backward compatible)
- PATCH (1.0.0 → 1.0.1): Bug fixes and minor improvements
Success Metrics
Phase 1 Success Metrics
- All planned crates compile without warnings
- Test coverage >90% for core functionality
- Can build and execute simple DSP pipelines
- Documentation complete with examples
- At least 2 reference implementations per stage type
Phase 2 Success Metrics
- All 3 patterns (CoT, ReAct, Multi-hop) implemented and tested
- Multi-pipeline system correctly routes tasks
- Integration tests passing (>90% coverage)
- Complex examples demonstrating pattern usage
- Documentation complete with tutorials
Phase 3 Success Metrics
- G-Eval produces reliable quality scores
- Execution tracing captures complete pipeline flow
- Observability integrations work with standard tools
- Performance benchmarks establish baseline
- Production deployment guide available
- Ready for 1.0.0 release
Overall Success Criteria (1.0 Release)
- Complete API stability
- Zero known critical bugs
- Documentation coverage >95%
- Performance meets or exceeds DSP paper benchmarks
- Community adoption (10+ external users)
- Security audit complete
Risk Management
Identified Risks
Phase 1 Risks:
- ⚠️ Trait design too complex → Mitigation: Iterative refinement based on usage
- ⚠️ Performance overhead from trait objects → Mitigation: Benchmarking + static dispatch options
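The trait-object risk and its static dispatch mitigation can be made concrete with a small comparison. The Stage trait here is a simplified, hypothetical stand-in (a single string-to-string run method), and Upper, run_dyn, and run_static are illustrative names, not part of any planned API.

```rust
/// Simplified, hypothetical stage trait used only for this comparison.
pub trait Stage {
    fn run(&self, input: &str) -> String;
}

/// Toy stage that uppercases its input.
pub struct Upper;

impl Stage for Upper {
    fn run(&self, input: &str) -> String {
        input.to_uppercase()
    }
}

/// Dynamic dispatch: one compiled function; the concrete stage is
/// resolved at runtime through a vtable lookup on each call.
pub fn run_dyn(stage: &dyn Stage, input: &str) -> String {
    stage.run(input)
}

/// Static dispatch: the compiler generates one copy per concrete stage
/// type, which allows the call to be inlined at the cost of code size.
pub fn run_static<S: Stage>(stage: &S, input: &str) -> String {
    stage.run(input)
}
```

Both functions return the same result for the same stage; the difference is only in how the call is resolved. Since an LLM round trip typically dominates stage latency, benchmarking as the mitigation suggests is the sensible way to decide whether the vtable cost matters at all.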
Phase 2 Risks:
- ⚠️ Pattern API not flexible enough → Mitigation: Community feedback loop
- ⚠️ Multi-pipeline routing insufficient → Mitigation: Simple initial impl, iterate based on needs
Phase 3 Risks:
- ⚠️ G-Eval reliability concerns → Mitigation: Extensive testing, fallback metrics
- ⚠️ Debugging tools insufficient → Mitigation: Priority on must-haves, defer nice-to-haves
Stay Updated
- GitHub: Watch repository for updates
- Releases: Subscribe to release notifications
- Discussions: Follow roadmap discussions
- Documentation: Check this page for updates
Related Documentation
- Architecture - Detailed architecture documentation
- Contributing - How to contribute to development
- Overview - High-level framework introduction
Last Updated: 2025-12-16
Next Milestone: Phase 1 Milestone 1.1 (Workspace Setup)
Target Release: v1.0.0 release candidate at the end of Month 9, stable release in Month 10+