System Overview

A comprehensive architectural overview of the airssys-rt actor runtime system, including component relationships, data flow, and design principles.

Note: This document provides the high-level architecture. See Components for detailed subsystem documentation.

System Philosophy

Core Principles

airssys-rt is designed around the Erlang/OTP actor model with Rust-native performance and type safety:

  1. Fault Tolerance - "Let it crash" philosophy with supervision trees
  2. Concurrency - Lightweight actors with message-passing isolation
  3. Type Safety - Compile-time guarantees via generics and associated types
  4. Performance - Zero-cost abstractions, static dispatch, minimal allocations
  5. Composability - Builder patterns and trait-based design

Design Guidelines

All components follow Microsoft Rust Guidelines documented in .copilot/memory_bank/workspace/microsoft_rust_guidelines.md:

  • M-DI-HIERARCHY: Concrete types > generics > dyn traits (see the sketch after this list)
  • M-AVOID-WRAPPERS: No smart pointers in public APIs
  • M-SIMPLE-ABSTRACTIONS: Maximum 1 level of cognitive nesting
  • M-SERVICES-CLONE: Services implement cheap Clone via Arc<Inner>
  • M-ESSENTIAL-FN-INHERENT: Core functionality in inherent methods
  • M-MOCKABLE-SYSCALLS: All I/O and system calls must be mockable
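
As a quick illustration of the M-DI-HIERARCHY ordering, the sketch below uses std::io::Write rather than any airssys-rt trait, so it is purely illustrative of the preference order the guideline prescribes:

use std::io::Write;

// Illustration of M-DI-HIERARCHY only (not airssys-rt API).
struct LoggerConcrete {
    out: std::io::Stdout,            // 1. concrete type: preferred
}

struct LoggerGeneric<W: Write> {
    out: W,                          // 2. generic parameter: when callers need flexibility
}

struct LoggerDyn {
    out: Box<dyn Write>,             // 3. dyn trait: last resort (heap allocation + dynamic dispatch)
}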

High-Level Architecture

Component Layers

The runtime is organized in seven layers, each building on the previous:

┌───────────────────────────────────────────────────────────┐
│                    System Layer (Planned)                  │
│  Runtime coordination, actor registry, distributed nodes   │
└────────────────────┬──────────────────────────────────────┘
┌────────────────────▼──────────────────────────────────────┐
│                   Monitoring Layer                         │
│   Health checks, metrics, performance tracking, alerting   │
└────────────────────┬──────────────────────────────────────┘
┌────────────────────▼──────────────────────────────────────┐
│                  Supervisor Layer                          │
│  Fault tolerance, restart strategies, supervision trees    │
└────────────────────┬──────────────────────────────────────┘
┌────────────────────▼──────────────────────────────────────┐
│                   Mailbox Layer                            │
│    Message queue management, backpressure, buffering       │
└────────────────────┬──────────────────────────────────────┘
┌────────────────────▼──────────────────────────────────────┐
│                    Actor Layer                             │
│      Actor trait, context, lifecycle, error handling       │
└────────────────────┬──────────────────────────────────────┘
┌────────────────────▼──────────────────────────────────────┐
│                   Broker Layer                             │
│       Pub/sub message routing, subscriber management       │
└────────────────────┬──────────────────────────────────────┘
┌────────────────────▼──────────────────────────────────────┐
│                   Message Layer                            │
│         Message types, envelopes, identifiers              │
└───────────────────────────────────────────────────────────┘

Layer Responsibilities

Layer      | Responsibility               | Key Components
-----------|------------------------------|-----------------------------------------------
Message    | Type-safe message contracts  | Message, MessageEnvelope, MessageId
Broker     | Pub/sub routing              | MessageBroker, InMemoryMessageBroker
Actor      | Business logic execution     | Actor, ActorContext, ActorLifecycle
Mailbox    | Message buffering            | BoundedMailbox, UnboundedMailbox
Supervisor | Fault tolerance              | SupervisorNode, RestartStrategy, Child
Monitoring | Health and metrics           | HealthMonitor, ActorMetrics, SupervisorMetrics
System     | Runtime coordination         | ActorSystem (planned), registry, clustering

Core Component Diagram

Component Relationships

┌────────────────────────────────────────────────────────────────┐
│                         ActorSystem                             │
│                     (Planned - Q1 2026)                         │
└──────────────┬──────────────────────────────────┬──────────────┘
               │                                  │
               │                                  │
     ┌─────────▼──────────┐            ┌─────────▼─────────┐
     │  SupervisorNode    │            │   HealthMonitor    │
     │  - RestartStrategy │            │   - HealthConfig   │
     │  - Children        │            │   - Metrics        │
     └─────────┬──────────┘            └─────────┬─────────┘
               │                                  │
               │ supervises                       │ monitors
               │                                  │
     ┌─────────▼──────────────────────────────────▼─────────┐
     │                    Actor                              │
     │    ┌──────────────────────────────────────────┐      │
     │    │        ActorContext                      │      │
     │    │  - ActorAddress                          │      │
     │    │  - ActorLifecycle                        │      │
     │    │  - MessageBroker                         │      │
     │    │  - send() / request()                    │      │
     │    └──────────────┬───────────────────────────┘      │
     └───────────────────┼──────────────────────────────────┘
                         │ publishes/subscribes
     ┌───────────────────▼───────────────────────────────────┐
     │            InMemoryMessageBroker                       │
     │  - Subscribers: HashMap<ActorId, Sender<Envelope>>    │
     │  - publish(envelope)                                   │
     │  - subscribe(actor_id) → Receiver                      │
     └───────────────────┬───────────────────────────────────┘
                         │ routes messages
     ┌───────────────────▼───────────────────────────────────┐
     │                  Mailbox                               │
     │  - BoundedMailbox   (capacity + backpressure)         │
     │  - UnboundedMailbox (unlimited capacity)              │
     │  - Metrics tracking                                    │
     └────────────────────────────────────────────────────────┘

Message Flow Architecture

Complete Message Path

The following diagram shows the complete path of a message from sender to receiver:

  Sender Actor                                       Receiver Actor
┌──────────────┐                                    ┌──────────────┐
│ handle_msg() │                                    │ handle_msg() │
└──────┬───────┘                                    └──────▲───────┘
       │                                                   │
       │ 1. context.send(msg, recipient)                  │ 6. Process message
       ▼                                                   │
┌──────────────────────┐                                  │
│   ActorContext       │                                  │
│ - Wrap in envelope   │                                  │
│ - Add metadata       │                                  │
│ - Set timestamp      │                                  │
└──────┬───────────────┘                                  │
       │                                                   │
       │ 2. broker.publish(envelope)                      │
       ▼                                                   │
┌───────────────────────────────────────┐                 │
│      InMemoryMessageBroker            │                 │
│ ┌───────────────────────────────────┐ │                 │
│ │ Subscribers: HashMap               │ │                 │
│ │ - ActorId → mpsc::Sender<Envelope> │ │                 │
│ └───────────────────────────────────┘ │                 │
│                                       │                 │
│ - Find subscriber by ActorId          │                 │
│ - Clone envelope for each subscriber  │                 │
│ - Send to channel                     │                 │
└──────┬────────────────────────────────┘                 │
       │                                                   │
       │ 3. channel.send(envelope)                        │
       ▼                                                   │
┌───────────────────────────────────────┐                 │
│         Mailbox Queue                 │                 │
│                                       │                 │
│  BoundedMailbox:                      │                 │
│  ┌─────────────────────────────────┐ │                 │
│  │ [Env1][Env2][Env3]...│  │  │  │ │ │                 │
│  └─────────────────────────────────┘ │                 │
│  - Capacity limit                     │                 │
│  - Backpressure (Block/Drop/...)      │                 │
│  - Metrics tracking                   │                 │
│                                       │                 │
│  UnboundedMailbox:                    │                 │
│  ┌─────────────────────────────────┐ │                 │
│  │ [Env1][Env2][Env3]..............│ │                 │
│  └─────────────────────────────────┘ │                 │
│  - No capacity limit                  │                 │
│  - No backpressure                    │                 │
└──────┬────────────────────────────────┘                 │
       │                                                   │
       │ 4. receiver.recv()                               │
       ▼                                                   │
┌───────────────────────────────────────┐                 │
│      Actor Message Loop               │                 │
│                                       │                 │
│  loop {                               │                 │
│    envelope = receiver.recv().await   │                 │
│    actor.handle_message(msg, ctx)     │ ────────────────┘
│  }                                    │    5. Deliver message
└───────────────────────────────────────┘
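
The sketch below shows what step 1 looks like from inside an actor's handler, assuming the two-argument send(message, recipient) form used in the diagram above; the surrounding types (RouterActor, WorkItem, RouterError) and the error conversion are assumptions for illustration, not the crate's actual example code.

impl Actor for RouterActor {
    type Message = WorkItem;
    type Error = RouterError;

    async fn handle_message<B: MessageBroker<Self::Message>>(
        &mut self,
        message: Self::Message,
        context: &mut ActorContext<Self::Message, B>,
    ) -> Result<(), Self::Error> {
        // Step 1: the context wraps the message in a MessageEnvelope (metadata + timestamp).
        // Step 2: the broker looks up the recipient's channel and forwards the envelope;
        //         steps 3-6 (mailbox buffering, receive loop, delivery) then happen on the
        //         receiver's side.
        context
            .send(message, self.downstream.clone())
            .await
            .map_err(|_| RouterError::Forward)?;
        Ok(())
    }
}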

Latency Breakdown

Based on measurements from BENCHMARKING.md:

Step  | Operation                              | Latency | Percentage
------|----------------------------------------|---------|-----------
1     | Message wrapping (envelope creation)   | ~10 ns  | 1.4%
2     | Broker routing (mutex + channel send)  | ~180 ns | 24.4%
3     | Channel transfer (Tokio mpsc)          | ~20 ns  | 2.7%
4     | Mailbox buffering                      | ~181 ns | 24.6%
5     | Actor processing overhead              | ~31 ns  | 4.2%
6     | Business logic (varies)                | ~315 ns | 42.7%
Total | Point-to-point roundtrip               | 737 ns  | 100%

Key Insights:

  • Broker routing and mailbox operations dominate latency (~49%)
  • Actual business logic is the largest single contributor (~43%)
  • Infrastructure overhead is sub-microsecond (422 ns)

Supervision Tree Architecture

Hierarchical Fault Tolerance

Supervisors can supervise other supervisors, creating a fault-tolerant tree:

                    ┌─────────────────────┐
                    │  Root Supervisor    │
                    │  (OneForAll)        │
                    │  RestartPolicy:     │
                    │    Permanent        │
                    └──────┬──────────┬───┘
                           │          │
            ┌──────────────┘          └──────────────┐
            │                                        │
┌───────────▼──────────┐                 ┌───────────▼──────────┐
│ Worker Pool          │                 │ Cache Manager        │
│ Supervisor           │                 │ Supervisor           │
│ (OneForOne)          │                 │ (OneForAll)          │
│ RestartPolicy:       │                 │ RestartPolicy:       │
│   Permanent          │                 │   Transient          │
└───────┬──┬──┬────────┘                 └──────┬───────┬───────┘
        │  │  │                                 │       │
   ┌────┘  │  └────┐                      ┌─────┘       └─────┐
   │       │       │                      │                   │
┌──▼──┐ ┌──▼──┐ ┌──▼──┐            ┌─────▼─────┐      ┌──────▼──────┐
│ W-1 │ │ W-2 │ │ W-3 │            │   Cache   │      │ Persistence │
│Actor│ │Actor│ │Actor│            │   Actor   │      │   Actor     │
└─────┘ └─────┘ └─────┘            └───────────┘      └─────────────┘

Failure Propagation

Scenario: Worker-2 fails

With OneForOne strategy in the Worker Pool supervisor:

  1. W-2 fails → handle_message returns Err
  2. The Worker Pool Supervisor detects the failure
  3. W-2's on_error() returns ErrorAction::Restart
  4. The supervisor applies OneForOne: restart only W-2
  5. W-2 lifecycle: Running → Failed → Stopping → Starting → Running
  6. W-1 and W-3 continue processing, unaffected
  7. The Root Supervisor is unaware (non-significant child)

Scenario: Cache Actor fails

With OneForAll strategy in the Cache Manager supervisor:

  1. The Cache Actor fails → handle_message returns Err
  2. The Cache Manager Supervisor detects the failure
  3. The Cache Actor's on_error() returns ErrorAction::Restart
  4. The supervisor applies OneForAll: restart the Cache Actor AND the Persistence Actor
  5. Both actors' lifecycle: Running → Failed → Stopping → Starting → Running
  6. Consistent state is guaranteed across both actors
  7. If the Cache Manager is marked significant: true, the Root Supervisor would be notified
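
For concreteness, the first scenario might be wired up roughly as below, using the builder API described later in this document; Worker::new and the pre-built worker_spec_* values are placeholders, since ChildSpec construction is not covered here.

let worker_pool = SupervisorNode::builder()
    .with_strategy(OneForOne::new())
    .add_child(worker_spec_1, Box::new(Worker::new(1)))   // W-1
    .add_child(worker_spec_2, Box::new(Worker::new(2)))   // W-2
    .add_child(worker_spec_3, Box::new(Worker::new(3)))   // W-3
    .build()?;

// If W-2's handler returns Err and its on_error() resolves to ErrorAction::Restart,
// OneForOne stops and restarts only W-2; W-1 and W-3 keep processing.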

Performance Characteristics

Baseline Metrics

Measured on a macOS development machine (October 16, 2025) with a release build:

Actor System Performance

Metric                    | Latency         | Throughput       | Notes
--------------------------|-----------------|------------------|--------------------------
Actor spawn (single)      | 624.74 ns       | 1.6M actors/sec  | Sub-microsecond creation
Actor spawn (batch of 10) | 681.40 ns/actor | 1.47M actors/sec | Only 9% overhead
Message processing        | 31.55 ns/msg    | 31.7M msgs/sec   | Direct processing
Message via broker        | 211.88 ns/msg   | 4.7M msgs/sec    | Pub-sub overhead: 6.7x

Message Passing Performance

Metric                | Latency          | Throughput        | Notes
----------------------|------------------|-------------------|-------------------------
Point-to-point        | 737 ns roundtrip | 1.36M msgs/sec    | Sub-microsecond latency
Broadcast (10 actors) | 395 ns total     | ~40 ns/subscriber | Efficient multi-cast
Mailbox operations    | 181.60 ns/op     | 5.5M ops/sec      | Enqueue + dequeue

Supervision Performance

Metric                          | Latency   | Notes
--------------------------------|-----------|--------------------------------
Child spawn (builder API)       | 5-20 µs   | Type-safe configuration
OneForOne restart               | 10-50 µs  | Single child stop → start
OneForAll restart (3 children)  | 30-150 µs | ~3x OneForOne
RestForOne restart (2 children) | 20-100 µs | Between OneForOne and OneForAll

Scalability Characteristics

Memory per actor:

  • Actor struct: ~500 bytes - 2 KB (depends on state)
  • ActorContext: ~200 bytes
  • Mailbox (unbounded): ~100 bytes base
  • Mailbox (bounded 100): ~244 bytes
  • Total: typically ~1-3 KB per actor

Throughput scaling:

  • Message processing scales linearly with message count
  • Broadcast scales linearly with subscriber count (~40 ns/subscriber)
  • Batch actor spawn has 9% overhead vs single spawn (excellent)

Concurrency:

  • Actors are Send + Sync, enabling true parallelism across Tokio worker threads
  • The message broker uses Arc<Mutex<HashMap>>, a potential contention point with many subscribers
  • The Tokio runtime handles async/await scheduling efficiently

Type Safety Architecture

Generic Constraints

The runtime uses generics instead of trait objects for zero-cost abstraction:

// ✅ CORRECT - Generic constraints (static dispatch)
pub trait Actor: Send + Sync + 'static {
    type Message: Message;
    type Error: Error + Send + Sync + 'static;

    async fn handle_message<B: MessageBroker<Self::Message>>(
        &mut self,
        message: Self::Message,
        context: &mut ActorContext<Self::Message, B>,
    ) -> Result<(), Self::Error>;
}

// ❌ FORBIDDEN - Trait objects (dynamic dispatch, heap allocation)
async fn handle_message(
    &mut self,
    message: Box<dyn Message>,
    context: &mut ActorContext<Box<dyn Message>, Box<dyn MessageBroker>>,
) -> Result<(), Box<dyn Error>>;

Benefits:

  • Compile-time type checking
  • No runtime type dispatch overhead
  • No heap allocations for message passing
  • Better compiler optimizations

Associated Types

Associated types provide type safety without type parameter explosion:

impl Actor for CounterActor {
    type Message = CounterMessage;  // Specific message type
    type Error = CounterError;      // Specific error type

    async fn handle_message<B: MessageBroker<Self::Message>>(
        &mut self,
        message: Self::Message,  // CounterMessage, not generic
        context: &mut ActorContext<Self::Message, B>,
    ) -> Result<(), Self::Error> {  // CounterError, not generic
        // Fully type-safe, no casts needed
        self.value += message.delta;
        Ok(())
    }
}
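
The impl above presumes companion types roughly like the following. These are hypothetical definitions for illustration; an impl of the Message trait for CounterMessage would also be required, but that trait's required items are not shown in this document.

use std::{error::Error, fmt};

#[derive(Debug, Clone)]
pub struct CounterMessage {
    pub delta: i64,
}

#[derive(Debug)]
pub struct CounterError(pub String);

impl fmt::Display for CounterError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "counter error: {}", self.0)
    }
}

impl Error for CounterError {}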

Future Integrations (Planned)

airssys-wasm - WebAssembly component hosting:

  • Actors can host WASM components as children
  • Sandboxed component execution
  • Capability-based security
  • Component lifecycle management

Distributed nodes - Multi-node actor systems:

  • Network message broker
  • Transparent remote actor addressing
  • Distributed supervision
  • Cluster membership

Directory Structure

The runtime codebase follows the layered architecture:

airssys-rt/
├── src/
│   ├── lib.rs              # Public API surface
│   ├── message/            # Message Layer
│   │   ├── mod.rs          # Message trait, MessageId
│   │   └── envelope.rs     # MessageEnvelope
│   ├── broker/             # Broker Layer
│   │   ├── mod.rs          # Re-exports
│   │   ├── traits.rs       # MessageBroker trait
│   │   └── in_memory.rs    # InMemoryMessageBroker
│   ├── actor/              # Actor Layer
│   │   ├── mod.rs          # Re-exports
│   │   ├── traits.rs       # Actor trait
│   │   ├── context.rs      # ActorContext
│   │   └── lifecycle.rs    # ActorLifecycle, ActorState
│   ├── mailbox/            # Mailbox Layer
│   │   ├── mod.rs          # Re-exports
│   │   ├── bounded.rs      # BoundedMailbox
│   │   └── unbounded.rs    # UnboundedMailbox
│   ├── supervisor/         # Supervisor Layer
│   │   ├── mod.rs          # Re-exports
│   │   ├── supervisor.rs   # SupervisorNode
│   │   ├── builder.rs      # SupervisorBuilder (RT-TASK-013)
│   │   ├── strategy.rs     # OneForOne, OneForAll, RestForOne
│   │   └── child.rs        # Child trait, ChildSpec
│   ├── monitoring/         # Monitoring Layer
│   │   ├── mod.rs          # Re-exports
│   │   ├── health.rs       # HealthMonitor, ChildHealth
│   │   └── metrics.rs      # ActorMetrics, SupervisorMetrics
│   ├── system/             # System Layer (Planned)
│   │   └── mod.rs          # Future: ActorSystem, registry
│   └── util/               # Utilities
│       ├── mod.rs          # Re-exports
│       ├── address.rs      # ActorAddress, ActorId
│       └── id.rs           # ChildId, SupervisorId
├── examples/               # Working examples (15 total)
├── tests/                  # Integration tests
├── benches/                # Criterion benchmarks
└── docs/                   # mdBook documentation

Design Patterns

Builder Pattern (RT-TASK-013)

Type-safe configuration using builders:

let supervisor = SupervisorNode::builder()
    .with_strategy(OneForOne::new())
    .add_child(spec1, Box::new(worker1))
    .add_child(spec2, Box::new(worker2))
    .build()?;

Benefits:

  • Compile-time validation
  • Fluent API
  • Clear intent
  • Minimal overhead (5-20 µs)

Services Clone Pattern

Services implement cheap Clone via Arc<Inner>:

#[derive(Clone)]
pub struct InMemoryMessageBroker<M: Message> {
    // Arc makes clone cheap (just increment refcount)
    subscribers: Arc<Mutex<HashMap<ActorId, mpsc::Sender<MessageEnvelope<M>>>>>,
}

Benefits:

  • Services can be shared across actors
  • No deep copying overhead
  • Thread-safe via Arc
  • Simple ownership model
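
A usage sketch (the new() constructor name is an assumption): cloning the broker only bumps the Arc refcount, so the same subscriber map can be handed to many actors.

let broker: InMemoryMessageBroker<CounterMessage> = InMemoryMessageBroker::new();
let broker_for_worker = broker.clone();   // shares the same Arc<Mutex<HashMap>>
let broker_for_monitor = broker.clone();  // still one subscriber map underneath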

Dependency Injection

Generic constraints for testability:

async fn handle_message<B: MessageBroker<Self::Message>>(
    &mut self,
    message: Self::Message,
    context: &mut ActorContext<Self::Message, B>,  // B is injected
) -> Result<(), Self::Error>

Benefits:

  • Mock brokers in tests
  • Swap implementations (InMemory, Network, etc.)
  • No runtime coupling
  • Compile-time verification
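
A sketch of how this plays out in tests: because the only requirement on B is the trait bound, a test-only broker implementing MessageBroker can be injected wherever production code uses InMemoryMessageBroker. The helper below is hypothetical, not part of the crate.

async fn drive_once<A, B>(
    actor: &mut A,
    message: A::Message,
    context: &mut ActorContext<A::Message, B>,
) -> Result<(), A::Error>
where
    A: Actor,
    B: MessageBroker<A::Message>,
{
    // B is whatever the caller injects: InMemoryMessageBroker in production,
    // a mock broker (same trait, test-only impl) in unit tests.
    actor.handle_message(message, context).await
}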

Next Steps

For detailed subsystem architecture, see:

  • Components - Detailed subsystem documentation

For the API reference:

  • API Reference - Complete API documentation

For performance details:

  • Performance Reference - Detailed metrics and benchmarks