Chapter 3: The First Breath of Aspire
"The first time you see all your microservices breathing in harmony... it's like watching a city come alive."
The Orchestration Revelation
We had protocols. We had schemas. We had a vision. But scattered code doesn't make a system - it makes a mess.
How do you coordinate dozens of microservices, agents, databases, and message queues without losing your sanity? How do you see what's happening when things go wrong? How do you develop locally without deploying to production?
Enter .NET Aspire - our orchestration and observability savior.
What is .NET Aspire?
Think of Aspire as Kubernetes for local development - but simpler, smarter, and optimized for the inner development loop.
.NET Aspire provides:
- 🎯 Service Orchestration: Define your entire system in code
- 📊 Built-in Observability: Logs, traces, metrics out of the box
- 🔗 Service Discovery: Services find each other automatically
- 🐳 Container Management: Docker containers as first-class citizens
- 🚀 Developer Experience: One command to start everything
The Philosophy
Old Way:
# Start services manually (nightmare mode)
docker-compose up -d postgres
docker-compose up -d redis
docker-compose up -d nats
python agents/security_agent.py &
dotnet run --project services/safety &
npm start --prefix adapters/schneider &
# ... repeat for 15 more services
# ... kill them all when done (did you get all of them?)
Aspire Way:
# Start entire system with observability
cd src/CitadelMesh.AppHost
dotnet run
# → Opens dashboard at https://localhost:5000
# → All services, logs, traces, metrics in one place
# → Ctrl+C stops everything cleanly
One command. One dashboard. Complete visibility.
The Aspire Architecture
The AppHost: Mission Control
The CitadelMesh.AppHost project is the brain of our local development environment:
// src/CitadelMesh.AppHost/Program.cs
var builder = DistributedApplication.CreateBuilder(args);

// Infrastructure Services
var postgres = builder.AddPostgres("postgres")
    .WithDataVolume()
    .AddDatabase("citadel-db");

var redis = builder.AddRedis("redis")
    .WithDataVolume();

var nats = builder.AddContainer("nats", "nats")
    .WithBindMount("./config/nats", "/config")
    .WithArgs("--config", "/config/nats-server.conf")
    .WithEndpoint(port: 4222, targetPort: 4222, name: "client")
    .WithEndpoint(port: 8222, targetPort: 8222, name: "monitoring");

// OPA Policy Engine (--watch reloads policies when files in /policies change)
var opa = builder.AddContainer("opa", "openpolicyagent/opa")
    .WithBindMount("./policies", "/policies")
    .WithArgs("run", "--server", "--watch", "--addr", "0.0.0.0:8181", "/policies")
    .WithHttpEndpoint(port: 8181, targetPort: 8181, name: "api");

// SPIRE Server (Identity)
var spire = builder.AddContainer("spire-server", "ghcr.io/spiffe/spire-server")
    .WithBindMount("./config/spire", "/opt/spire/conf")
    .WithEndpoint(port: 8081, targetPort: 8081, name: "api");

// CitadelMesh Microservices
// Plain containers are referenced through a named endpoint, not the resource itself.
var safety = builder.AddProject<Projects.CitadelMesh_Safety>("safety")
    .WithReference(opa.GetEndpoint("api"))
    .WithReference(postgres);

var orchestrator = builder.AddProject<Projects.CitadelMesh_Orchestrator>("orchestrator")
    .WithReference(nats.GetEndpoint("client"))
    .WithReference(redis)
    .WithReference(safety);

// Gateway (UI Backend)
var gateway = builder.AddNpmApp("gateway", "../gateway")
    .WithReference(safety)
    .WithReference(orchestrator)
    .WithHttpEndpoint(port: 3001, env: "PORT");

// Python Agents
var securityAgent = builder.AddExecutable(
        "security-agent",
        "python",
        "../agents",
        "-m", "security.security_agent")
    .WithReference(nats.GetEndpoint("client"))
    .WithReference(safety);

builder.Build().Run();
What this gives us:
- ✅ Service Dependencies: References declare what each service needs (in Aspire 9+, WaitFor can also hold a service back until its dependency is ready)
- ✅ Environment Variables: Auto-configured connection strings
- ✅ Health Checks: Know when services are ready
- ✅ Resource Management: Containers cleaned up automatically
- ✅ Observability: Logs and traces flow to the dashboard
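For processes that are not .NET projects, such as the Python security agent, the auto-configured values arrive as plain environment variables. The sketch below assumes Aspire's `services__<name>__<endpoint>__<index>` naming for referenced endpoints; the helper name `resolve_endpoint` and the fake environment are illustrative, not part of CitadelMesh.

```python
import os
from typing import Mapping, Optional


def resolve_endpoint(
    service: str,
    endpoint: str = "http",
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Return the URL injected for a referenced service endpoint."""
    env = os.environ if env is None else env
    key = f"services__{service}__{endpoint}__0"
    try:
        return env[key]
    except KeyError:
        raise LookupError(
            f"{key} is not set; was this process started by the AppHost?"
        ) from None


# Illustrative environment; the port matches the dashboard table in this chapter.
fake_env = {"services__safety__http__0": "http://localhost:5100"}
print(resolve_endpoint("safety", env=fake_env))
```

Failing loudly when the variable is missing makes it obvious that the agent was launched outside the AppHost.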
The Dashboard Experience
The Moment of Truth
cd src/CitadelMesh.AppHost
dotnet run
Output:
Building...
info: Aspire.Hosting.DistributedApplication[0]
Aspire app host listening on: https://localhost:5000
info: Aspire.Hosting.DistributedApplication[0]
Login to the dashboard at https://localhost:5000
Open your browser to https://localhost:5000 and witness the magic:
Dashboard Features
📊 Resources Tab
See all services at a glance:
┌─────────────────────┬──────────┬─────────────────┬───────────┐
│ Resource            │ State    │ Type            │ Endpoints │
├─────────────────────┼──────────┼─────────────────┼───────────┤
│ postgres            │ Running  │ Container       │ 5432      │
│ redis               │ Running  │ Container       │ 6379      │
│ nats                │ Running  │ Container       │ 4222,8222 │
│ opa                 │ Running  │ Container       │ 8181      │
│ spire-server        │ Running  │ Container       │ 8081      │
│ safety              │ Running  │ .NET Project    │ 5100      │
│ orchestrator        │ Running  │ .NET Project    │ 5200      │
│ gateway             │ Running  │ Node.js App     │ 3001      │
│ security-agent      │ Running  │ Python Script   │ -         │
└─────────────────────┴──────────┴─────────────────┴───────────┘
📝 Logs Tab
Unified log stream from all services:
[13:45:23] [safety] Policy evaluation: citadel/security/allow_door_unlock | ALLOW
[13:45:23] [orchestrator] Event received: citadel.security.incident
[13:45:24] [security-agent] Incident analyzed: severity=MEDIUM, action=ALERT
[13:45:24] [gateway] GET /api/incidents → 200 (45ms)
Filter by service, level, or search text.
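The same filtering idea can be applied to log lines copied out of the dashboard. This is a minimal sketch that assumes the `[time] [service] message` layout shown above; `parse_line` and `filter_logs` are hypothetical helper names.

```python
import re
from typing import Iterable, List, NamedTuple, Optional

LINE = re.compile(
    r"^\[(?P<time>\d{2}:\d{2}:\d{2})\] \[(?P<service>[^\]]+)\] (?P<message>.*)$"
)


class LogLine(NamedTuple):
    time: str
    service: str
    message: str


def parse_line(line: str) -> Optional[LogLine]:
    """Parse one line, or return None if it does not match the layout."""
    match = LINE.match(line.strip())
    return LogLine(**match.groupdict()) if match else None


def filter_logs(
    lines: Iterable[str],
    service: Optional[str] = None,
    text: Optional[str] = None,
) -> List[LogLine]:
    """Keep lines from `service` whose message contains `text` (case-insensitive)."""
    result = []
    for line in lines:
        entry = parse_line(line)
        if entry is None:
            continue
        if service is not None and entry.service != service:
            continue
        if text is not None and text.lower() not in entry.message.lower():
            continue
        result.append(entry)
    return result


sample = [
    "[13:45:23] [safety] Policy evaluation: citadel/security/allow_door_unlock | ALLOW",
    "[13:45:23] [orchestrator] Event received: citadel.security.incident",
    "[13:45:24] [security-agent] Incident analyzed: severity=MEDIUM, action=ALERT",
    "[13:45:24] [gateway] GET /api/incidents → 200 (45ms)",
]
for entry in filter_logs(sample, text="incident"):
    print(entry.service, "|", entry.message)
```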
📈 Traces Tab
See request flow across services:
GET /api/policy/evaluate
├─ gateway → safety (12ms)
│ └─ safety → opa (18ms)
│ └─ OPA evaluation (15ms)
└─ Total: 45ms
Click any trace to see detailed spans, timing, and metadata.
📊 Metrics Tab
Real-time performance metrics:
- Request rate (req/s)
- Error rate (%)
- Response time (p50, p95, p99)
- Resource usage (CPU, memory)
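To make the p50/p95/p99 figures concrete, here is a minimal sketch of the nearest-rank percentile over a list of response times. The sample durations are made up for illustration, and the dashboard may use a different estimation method (histogram buckets, for example).

```python
import math
from typing import Sequence


def percentile(samples: Sequence[float], p: float) -> float:
    """Nearest-rank percentile: the smallest sample with at least p% of values at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in the range (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[rank - 1]


# Hypothetical response times in milliseconds, with one slow outlier.
durations_ms = [12, 15, 18, 45, 20, 14, 16, 13, 17, 200]
for p in (50, 95, 99):
    print(f"p{p}: {percentile(durations_ms, p)} ms")
```

The outlier barely moves p50 but dominates p95 and p99, which is why tail percentiles are worth watching alongside the median.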
🔗 Dependencies Tab
Visual graph of service dependencies:
┌─────────────┐
│   gateway   │
└──────┬──────┘
       │
┌──────┴──────┐
│   safety    │
└──────┬──────┘
       │
┌──────┴──────┐
│     opa     │
└─────────────┘
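A dependency graph like this also implies a safe startup order. The sketch below derives one with the standard library's `graphlib`, using the references declared in the AppHost earlier in this chapter. It illustrates the idea only and is not a description of how Aspire schedules resources internally.

```python
from graphlib import TopologicalSorter

# Each service maps to the set of resources it references in Program.cs.
depends_on = {
    "safety": {"opa", "postgres"},
    "orchestrator": {"nats", "redis", "safety"},
    "gateway": {"safety", "orchestrator"},
    "security-agent": {"nats", "safety"},
}

# static_order() yields every node after all of its dependencies.
start_order = list(TopologicalSorter(depends_on).static_order())
print(" -> ".join(start_order))
```

`TopologicalSorter` raises `graphlib.CycleError` if two services reference each other, which is a useful early warning when the graph grows.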
The Development Workflow
Day 1: Starting from Scratch
A new developer gets a laptop and clones the repo:
git clone https://github.com/KWIKalamazoo/CitadelMesh.git
cd CitadelMesh
Install prerequisites:
# .NET 8 SDK
dotnet --version # 8.0+
# Docker Desktop
docker --version
# Python 3.11+
python --version
# Node.js 20+
node --version
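The version checks above can be automated. This is a small sketch, assuming each tool prints its version in response to `--version`; the `MINIMUMS` table mirrors the prerequisites listed here, and the function names are hypothetical.

```python
import re
import shutil
import subprocess
from typing import Tuple

# Minimum (major, minor) versions from the prerequisites list.
MINIMUMS = {"dotnet": (8, 0), "python": (3, 11), "node": (20, 0)}


def parse_version(text: str) -> Tuple[int, int]:
    """Extract the first major.minor pair from a tool's version output."""
    match = re.search(r"(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return int(match.group(1)), int(match.group(2))


def meets_minimum(version_output: str, minimum: Tuple[int, int]) -> bool:
    return parse_version(version_output) >= minimum


def check_tool(tool: str, minimum: Tuple[int, int]) -> bool:
    """Return True if `tool` is on PATH and reports at least `minimum`."""
    if shutil.which(tool) is None:
        return False
    result = subprocess.run([tool, "--version"], capture_output=True, text=True)
    return meets_minimum(result.stdout or result.stderr, minimum)


# Usage (not run here, since it depends on the local machine):
# for tool, minimum in MINIMUMS.items():
#     print(tool, "OK" if check_tool(tool, minimum) else "missing or too old")
```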
Start the world:
cd src/CitadelMesh.AppHost
dotnet run
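Aspire does the rest:
- Pulls container images (postgres, redis, nats, opa, spire)
- Starts containers with the correct configuration
- Builds the .NET projects
- Launches the Node.js gateway and the Python agent
- Configures service discovery
- Serves the dashboard at https://localhost:5000
One caveat: AddNpmApp and AddExecutable only launch processes. Run npm install in the gateway directory and install the Python agent's dependencies once before the first dotnet run.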
Time to first working system: ~5 minutes.
Day-to-Day Development
Scenario: Adding a new OPA policy
- Edit the policy file:
  code policies/energy.rego
- OPA reloads the policy from the mounted volume (this requires running OPA with the --watch flag)
- Test in the dashboard:
  - View logs: See OPA reload message
  - Test policy: Call gateway endpoint
  - View trace: See policy evaluation span
No restart needed. Just edit and test.
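While iterating on a policy, it can be quicker to query OPA directly than to go through the gateway. The sketch below uses OPA's Data API (`POST /v1/data/<path>` with an `input` document). The port comes from the AppHost configuration, the policy path and input fields are taken from the log examples in this chapter, and the helper names are hypothetical.

```python
import json
import urllib.request
from typing import Any, Dict

OPA_URL = "http://localhost:8181"  # the "api" endpoint declared in the AppHost


def build_opa_request(
    policy_path: str, input_doc: Dict[str, Any], base_url: str = OPA_URL
) -> urllib.request.Request:
    """Build a Data API request that evaluates `policy_path` against `input_doc`."""
    body = json.dumps({"input": input_doc}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/v1/data/{policy_path}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def evaluate(policy_path: str, input_doc: Dict[str, Any], base_url: str = OPA_URL) -> Any:
    """Send the request and return OPA's `result` (None if the rule is undefined)."""
    request = build_opa_request(policy_path, input_doc, base_url)
    with urllib.request.urlopen(request, timeout=5) as response:
        return json.load(response).get("result")


# Usage, with OPA running under Aspire:
# evaluate("citadel/security/allow_door_unlock",
#          {"role": "security_officer", "time": 14, "door_zone": "lobby"})
```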
Scenario: Debugging the security agent
- Check logs in the dashboard:
  - Filter: resource:security-agent
  - Search: incident
  - See structured log output
- View traces:
  - Find incident processing trace
  - See timing for each step
  - Identify slow operations
- Attach a debugger:
  # Stop agent in Aspire, run manually with debugger
  cd src/agents
  python -m debugpy --listen 5678 -m security.security_agent
- Restart the agent in Aspire when done
The Inner Loop
Old way:
Edit code → Stop all services → Rebuild → Restart services → Test
(5-10 minutes per iteration)
Aspire way:
Edit code → Auto-reload or hot-reload → Test
(< 5 seconds per iteration)
Iterations drop from minutes to seconds, which means far more experiments per day.
Observability: The Superpower
Distributed Tracing with OpenTelemetry
Every request creates a trace that flows through multiple services:
Example: Policy Evaluation Request
Trace ID: 8f7d2a3b-1c4e-9f6a-2d8b-5e3a7c9f1b4d
Span Tree:
├─ gateway.http.request (48ms)
│ ├─ gateway.call_safety_service (35ms)
│ │ ├─ safety.evaluate_policy (30ms)
│ │ │ ├─ safety.call_opa (25ms)
│ │ │ │ └─ opa.evaluation (15ms)
│ │ │ └─ safety.audit_log (3ms)
│ │ └─ safety.response_serialization (2ms)
│ └─ gateway.response (5ms)
Click any span to see:
- ⏱️ Start time, duration
- 🏷️ Tags (service, operation, status)
- 📊 Attributes (user, resource, outcome)
- 🔗 Links to logs and metrics
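A span's duration includes the time spent in its children, so the slowest-looking span is not always the culprit. The sketch below rebuilds the span tree from the example above and computes each span's self time (its duration minus its direct children). It assumes child spans run sequentially; the `Span` class is a simplified stand-in, not an OpenTelemetry type.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class Span:
    name: str
    duration_ms: int
    children: List["Span"] = field(default_factory=list)


def self_time(span: Span) -> int:
    """Time spent in the span itself, excluding its direct children."""
    return span.duration_ms - sum(child.duration_ms for child in span.children)


def walk(span: Span) -> Iterator[Span]:
    """Yield the span and all of its descendants, depth first."""
    yield span
    for child in span.children:
        yield from walk(child)


def slowest_by_self_time(root: Span) -> Span:
    return max(walk(root), key=self_time)


root = Span("gateway.http.request", 48, [
    Span("gateway.call_safety_service", 35, [
        Span("safety.evaluate_policy", 30, [
            Span("safety.call_opa", 25, [Span("opa.evaluation", 15)]),
            Span("safety.audit_log", 3),
        ]),
        Span("safety.response_serialization", 2),
    ]),
    Span("gateway.response", 5),
])

for span in sorted(walk(root), key=self_time, reverse=True):
    print(f"{self_time(span):>3} ms  {span.name}")
```

Sorting by self time puts `opa.evaluation` and `safety.call_opa` at the top, pointing at the policy engine call rather than the gateway.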
Structured Logging
All services emit structured logs (JSON format):
{
  "timestamp": "2025-10-01T13:45:23.123Z",
  "level": "INFO",
  "service": "safety",
  "trace_id": "8f7d2a3b-1c4e-9f6a-2d8b-5e3a7c9f1b4d",
  "span_id": "5e3a7c9f1b4d",
  "message": "Policy evaluation completed",
  "policy_path": "citadel/security/allow_door_unlock",
  "decision": "ALLOW",
  "duration_ms": 15,
  "input": {
    "role": "security_officer",
    "time": 14,
    "door_zone": "lobby"
  }
}
Queryable, filterable, and correlatable with traces.
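The Python agents can emit the same shape with the standard `logging` module. This is a minimal sketch, assuming the field names from the JSON above; `JsonFormatter` is a hypothetical class, and the timestamp is written with a `+00:00` offset rather than a trailing `Z`.

```python
import json
import logging
from datetime import datetime, timezone

# Extra fields copied into the JSON entry when present on the log record.
CONTEXT_FIELDS = ("trace_id", "span_id", "policy_path", "decision", "duration_ms")


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object."""

    def __init__(self, service: str) -> None:
        super().__init__()
        self.service = service

    def format(self, record: logging.LogRecord) -> str:
        entry = {
            "timestamp": datetime.fromtimestamp(
                record.created, tz=timezone.utc
            ).isoformat(timespec="milliseconds"),
            "level": record.levelname,
            "service": self.service,
            "message": record.getMessage(),
        }
        for key in CONTEXT_FIELDS:
            if hasattr(record, key):
                entry[key] = getattr(record, key)
        return json.dumps(entry)


handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter(service="security-agent"))
logger = logging.getLogger("citadel.example")
logger.addHandler(handler)
logger.setLevel(logging.INFO)

# Context is passed through `extra`, which attaches the keys to the record.
logger.info("Policy evaluation completed", extra={"decision": "ALLOW", "duration_ms": 15})
```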
Metrics and Dashboards
Aspire collects the metrics each service exports through OpenTelemetry:
Safety Service Metrics:
- safety.policy.evaluations.total (counter)
- safety.policy.evaluation.duration (histogram)
- safety.policy.denials.total (counter)
Orchestrator Metrics:
- orchestrator.events.received.total (counter)
- orchestrator.events.processing.duration (histogram)
- orchestrator.agents.active (gauge)
View in Aspire dashboard or export to Prometheus/Grafana.
The Aspire Advantage
Versus Docker Compose
| Feature | Docker Compose | Aspire |
|---|---|---|
| Service orchestration | ✅ | ✅ |
| Container management | ✅ | ✅ |
| Observability | ❌ Manual | ✅ Built-in |
| Service dependencies | ⚠️ Basic | ✅ Rich |
| Hot reload | ❌ | ✅ |
| Traces | ❌ | ✅ |
| Metrics | ❌ | ✅ |
| .NET integration | ❌ | ✅ Excellent |
| Python/Node support | ✅ | ✅ |
| Production deployment | ✅ | ⚠️ Dev-focused |
Verdict: Use Aspire for development, Kubernetes for production.
Versus Kubernetes (for dev)
Kubernetes:
- ✅ Production-grade orchestration
- ❌ Complex setup (minikube, kind, etc.)
- ❌ Slow iteration (build → push → deploy)
- ❌ Heavy resource usage
- ❌ Difficult debugging
Aspire:
- ✅ Instant startup
- ✅ Hot reload
- ✅ Built-in debugging
- ✅ Lightweight
- ⚠️ Dev environment only
Verdict: Aspire for dev, Kubernetes for prod. Best of both worlds.
The Foundation Services
With Aspire orchestrating, we deployed the core CitadelMesh services:
🛡️ CitadelMesh.Safety
A .NET 8 microservice that wraps the OPA policy engine:
- Exposes REST API for policy evaluation
- Manages policy loading and updates
- Provides audit logging
- Handles policy bundles
Endpoints:
- POST /api/safety/evaluate - Evaluate policy decision
- GET /api/safety/policies - List available policies
- GET /api/safety/health - Health check
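The health endpoint is handy in scripts and tests that need the Safety service to be up before they continue. Below is a minimal polling sketch. It assumes the endpoint returns HTTP 200 when healthy and that the service listens on port 5100 as in the dashboard table; `wait_until_healthy` and `http_probe` are hypothetical helpers.

```python
import time
import urllib.request
from typing import Callable


def http_probe(url: str) -> Callable[[], bool]:
    """Return a probe that succeeds when `url` answers with HTTP 200."""
    def probe() -> bool:
        with urllib.request.urlopen(url, timeout=2) as response:
            return response.status == 200
    return probe


def wait_until_healthy(
    probe: Callable[[], bool],
    timeout_s: float = 30.0,
    interval_s: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll `probe` until it returns True or `timeout_s` elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            if probe():
                return True
        except OSError:
            # Connection refused, timeouts, and HTTP errors all mean "not ready yet".
            pass
        if time.monotonic() >= deadline:
            return False
        sleep(interval_s)


# Usage, with the system running under Aspire:
# ready = wait_until_healthy(http_probe("http://localhost:5100/api/safety/health"))
```

Passing the probe as a function keeps the retry loop testable without a running service.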
🎭 CitadelMesh.Orchestrator
.NET 8 service for event coordination:
- Subscribes to NATS event bus
- Routes events to appropriate agents
- Manages agent lifecycle
- Coordinates multi-agent workflows
Uses:
- Dapr for pub/sub and state management
- Orleans for actor-based agents (future)
- MassTransit for saga orchestration (future)
📊 OpenTelemetry
Observability infrastructure:
- Collects traces from all services
- Aggregates metrics
- Exports to Aspire dashboard
- Can export to Jaeger, Zipkin, Prometheus
Auto-instrumentation for:
- HTTP requests/responses
- gRPC calls
- Database queries
- Message queue operations
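Under the hood, instrumented HTTP calls carry the trace across service boundaries in a W3C Trace Context `traceparent` header. The sketch below shows the header layout (version, 32-hex trace ID, 16-hex span ID, flags) in plain Python. It illustrates the format only; in practice the OpenTelemetry SDKs generate and parse this header, and the function names here are hypothetical.

```python
import re
import secrets

TRACEPARENT = re.compile(
    r"^(?P<version>[0-9a-f]{2})-(?P<trace_id>[0-9a-f]{32})"
    r"-(?P<span_id>[0-9a-f]{16})-(?P<flags>[0-9a-f]{2})$"
)


def new_traceparent(sampled: bool = True) -> str:
    """Start a new trace: random 16-byte trace ID and 8-byte span ID."""
    flags = "01" if sampled else "00"
    return f"00-{secrets.token_hex(16)}-{secrets.token_hex(8)}-{flags}"


def child_traceparent(parent: str) -> str:
    """Keep the caller's trace ID and flags, but issue a fresh span ID."""
    match = TRACEPARENT.match(parent)
    if match is None:
        raise ValueError(f"invalid traceparent header: {parent!r}")
    return f"00-{match['trace_id']}-{secrets.token_hex(8)}-{match['flags']}"


incoming = new_traceparent()
outgoing = child_traceparent(incoming)
print("incoming:", incoming)
print("outgoing:", outgoing)
```

Because every hop reuses the trace ID, the dashboard can stitch spans from the gateway, the Safety service, and OPA into one tree.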
🗃️ Structured Logging
Serilog configured for all services:
- JSON output format
- Enriched with trace context
- Minimum level: Information (configurable)
- Sinks: Console, File, Dashboard
Milestone Achieved ✅
Aspire Orchestration Complete
- ✅ AppHost configured with all services
- ✅ Dashboard accessible at https://localhost:5000
- ✅ All services starting successfully
- ✅ Logs, traces, metrics flowing
- ✅ Service discovery working
- ✅ Hot reload enabled
- ✅ Developer workflow optimized
The System Breathes
Type dotnet run, and watch the entire CitadelMesh ecosystem come to life:
- 🐘 PostgreSQL ready for data
- 🔴 Redis caching at speed
- 📨 NATS events flowing
- 🛡️ OPA policies enforcing
- 🔐 SPIRE identities issuing
- 🎯 Microservices coordinating
- 🤖 Agents listening
Foundation complete. Time to awaken the guardian.
🏰 NEXT: Chapter 4: The Policy Guardian Awakens →
Updated: October 2025 | Status: Complete ✅