Using the Aspire Dashboard

The .NET Aspire dashboard is your command center for developing and debugging CitadelMesh. This guide covers starting the dashboard, each of its tabs, common debugging workflows, hot reload, and troubleshooting.

Starting the Dashboard

Quick Start

cd /path/to/CitadelMesh/src/CitadelMesh.AppHost
dotnet run

Expected Output:

Building...
info: Aspire.Hosting.DistributedApplication[0]
Aspire version: 8.0.0
info: Aspire.Hosting.DistributedApplication[0]
Distributed application started. Press Ctrl+C to shut down.
info: Aspire.Hosting.DistributedApplication[0]
Dashboard URL: https://localhost:5000

Navigate to: https://localhost:5000

Advanced Startup Options

Custom Port

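# Set custom port via environment variable
# (assumes this AppHost reads ASPIRE_DASHBOARD_PORT; stock Aspire takes the
# dashboard address from ASPNETCORE_URLS)
export ASPIRE_DASHBOARD_PORT=7000
dotnet run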

Or update appsettings.Development.json:

{
  "Dashboard": {
    "Port": 7000
  }
}

Selective Service Startup

Edit src/CitadelMesh.AppHost/Program.cs to comment out services:

var builder = DistributedApplication.CreateBuilder(args);

// Core infrastructure (always needed)
var redis = builder.AddRedis("redis", port: 6379);
var postgres = builder.AddPostgres("postgres", port: 5432)
    .AddDatabase("citadel-db");

// Optional: Comment out if not needed
// var jaeger = builder.AddContainer("jaeger", "jaegertracing/all-in-one");
// var prometheus = builder.AddContainer("prometheus", "prom/prometheus");

var app = builder.Build();
app.Run();

Dashboard Features

1. Resources Tab

The Resources view shows all running services with real-time status.

Service Categories

| Resource Type | Purpose | Default Port |
|---|---|---|
| redis | Caching & pub/sub | 6379 |
| postgres | Persistent storage | 5432 |
| nats | Event bus | 4222 |
| opa | Policy engine | 8181 |
| spire-server | Identity provider | 8081 |
| spire-agent | Workload attestation | - |
| jaeger | Distributed tracing | 16686 |
| agent-runtime | Python agent container | - |

Resource Actions

Each resource has quick actions:

  • View Logs: Opens console output
  • View Details: Shows configuration
  • Restart: Graceful restart
  • Stop/Start: Manual control

Example: Restarting OPA

  1. Find opa in Resources list
  2. Click ⋮ menu
  3. Select Restart
  4. Monitor logs for successful reload

2. Console Logs Tab

Real-time log streaming from all services.

Filtering Logs

By Service:

Filter: service:opa
Shows only OPA logs

By Level:

Filter: level:error
Shows only errors across all services

By Message Pattern:

Filter: message:policy
Shows logs containing "policy"

Combined Filters:

Filter: service:agent-runtime level:error
Shows errors from agent runtime
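
To make the filter semantics concrete, here is a minimal Python sketch of how such a filter string could be interpreted. It is not the dashboard's actual parser: it assumes that space-separated terms are ANDed together (as the combined example suggests), that `message:` is a case-insensitive substring match, and that other keys match exactly. The sample log records are made up for illustration.

```python
import shlex

def parse_filter(expr):
    """Split 'service:opa level:error message:"Policy violation"' into a dict."""
    criteria = {}
    for token in shlex.split(expr):  # shlex keeps quoted phrases together
        key, _, value = token.partition(":")
        criteria[key] = value
    return criteria

def matches(record, criteria):
    """A record matches only if every criterion holds (assumed AND semantics)."""
    for key, wanted in criteria.items():
        actual = str(record.get(key, ""))
        if key == "message":
            if wanted.lower() not in actual.lower():  # substring match
                return False
        elif actual.lower() != wanted.lower():  # exact match
            return False
    return True

# Hypothetical log records, shaped like the examples in this guide.
logs = [
    {"service": "opa", "level": "info", "message": "OPA bundle reloaded"},
    {"service": "agent-runtime", "level": "error",
     "message": "Policy violation: Door unlock denied"},
    {"service": "agent-runtime", "level": "debug",
     "message": "State transition: monitor -> analyze"},
]

criteria = parse_filter("service:agent-runtime level:error")
hits = [record for record in logs if matches(record, criteria)]
print(hits)
```

The same helper is handy for filtering exported logs offline with the filter strings you already use in the dashboard.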

Log Levels

  • TRACE - Verbose debug info
  • DEBUG - Development diagnostics
  • INFO - Normal operations
  • WARN - Potential issues
  • ERROR - Failures requiring attention
  • FATAL - Critical system errors

Example: Debugging Policy Denials

Filter: message:"Policy violation"

Result:
[12:34:56] ERROR [CitadelMesh.Safety] Policy violation: Door unlock denied
Policy: citadel.security.door_unlock
Input: {"action":"door_unlock","duration_seconds":600}
Reason: Exceeded maximum duration (300s)

3. Structured Logs Tab

Advanced log analysis with filtering, grouping, and export.

Query Examples

Find all policy evaluations:

{
  "policy.result": {"$exists": true}
}

Find slow operations:

{
  "duration_ms": {"$gt": 1000}
}

Find agent errors:

{
  "service": "agent-runtime",
  "level": "error"
}
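
The queries above use a small operator vocabulary: plain values for equality, `$exists` for field presence, and `$gt` for numeric comparison. The following Python sketch shows one way those three forms could be evaluated against a log record. It is an illustration under assumptions, not the dashboard's query engine: field names such as `policy.result` are treated as flat keys, and only the operators shown in this guide are supported.

```python
_MISSING = object()  # sentinel so that a stored None still counts as "present"

def matches_query(record, query):
    """Return True if the record satisfies every field condition in the query."""
    for field, condition in query.items():
        value = record.get(field, _MISSING)
        if isinstance(condition, dict):
            for op, operand in condition.items():
                if op == "$exists":
                    if (value is not _MISSING) != operand:
                        return False
                elif op == "$gt":
                    if value is _MISSING or not value > operand:
                        return False
                else:
                    raise ValueError(f"unsupported operator: {op}")
        elif value is _MISSING or value != condition:
            return False
    return True

# Hypothetical structured log records.
records = [
    {"service": "agent-runtime", "level": "error", "duration_ms": 1500},
    {"service": "opa", "level": "info", "policy.result": False, "duration_ms": 4},
    {"service": "safety-engine", "level": "info", "duration_ms": 20},
]

policy_evals = [r for r in records if matches_query(r, {"policy.result": {"$exists": True}})]
slow_ops = [r for r in records if matches_query(r, {"duration_ms": {"$gt": 1000}})]
agent_errors = [r for r in records
                if matches_query(r, {"service": "agent-runtime", "level": "error"})]
print(len(policy_evals), len(slow_ops), len(agent_errors))
```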

Grouping and Aggregation

  1. Select Group By → service
  2. Select Aggregate → count
  3. Result: Log count per service
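
The grouping steps above amount to a count per distinct field value. If you want the same summary from exported logs, a short Python sketch is enough. The records below are hypothetical, and `count_by` is an illustrative helper rather than part of any dashboard API.

```python
from collections import Counter

def count_by(records, field):
    """Count records per distinct value of the given field."""
    return Counter(record.get(field, "(missing)") for record in records)

# Hypothetical exported log records.
exported = [
    {"service": "opa", "level": "info"},
    {"service": "opa", "level": "error"},
    {"service": "agent-runtime", "level": "error"},
]

per_service = count_by(exported, "service")
per_level = count_by(exported, "level")
print(per_service.most_common())
```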

Export Logs

  1. Apply filters
  2. Click Export → JSON
  3. Use for offline analysis or bug reports

4. Traces Tab

Distributed tracing with OpenTelemetry/Jaeger integration.

Trace View

Each trace shows:

  • Trace ID: Unique identifier
  • Duration: Total time
  • Spans: Individual operations
  • Status: Success/Error/Timeout

Example: Agent Execution Trace

Trace: agent.security.process_scenario
├─ span: monitor_events (12ms)
│ └─ span: camera_action.get_incidents (8ms)
├─ span: analyze_threat (5ms)
├─ span: coordinate_response (3ms)
├─ span: execute_door_control (45ms)
│ └─ span: mcp.schneider.lock_door (40ms)
└─ span: audit_log_response (2ms)

Total: 67ms

Finding Performance Bottlenecks

  1. Click Traces tab
  2. Sort by Duration (descending)
  3. Click slow trace
  4. Examine span waterfall
  5. Identify bottleneck (longest span)
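
When a long span has children, the longest span is not always where the time is actually spent. A useful refinement is "self time": a span's duration minus the durations of its children. The Python sketch below applies this to the example trace from this section. It assumes child spans run sequentially and do not overlap; the `Span` class is an illustrative structure, not an OpenTelemetry type.

```python
class Span:
    """Minimal span model: a name, a duration, and nested child spans."""

    def __init__(self, name, duration_ms, children=None):
        self.name = name
        self.duration_ms = duration_ms
        self.children = children or []

    @property
    def self_ms(self):
        """Time spent in this span itself, excluding its children."""
        return self.duration_ms - sum(c.duration_ms for c in self.children)

def walk(span):
    """Yield a span and all of its descendants."""
    yield span
    for child in span.children:
        yield from walk(child)

# The agent.security.process_scenario trace shown earlier in this section.
top_level = [
    Span("monitor_events", 12, [Span("camera_action.get_incidents", 8)]),
    Span("analyze_threat", 5),
    Span("coordinate_response", 3),
    Span("execute_door_control", 45, [Span("mcp.schneider.lock_door", 40)]),
    Span("audit_log_response", 2),
]

total_ms = sum(s.duration_ms for s in top_level)
longest = max(top_level, key=lambda s: s.duration_ms)
hotspot = max((s for root in top_level for s in walk(root)), key=lambda s: s.self_ms)
print(total_ms, longest.name, hotspot.name)
```

Here the longest top-level span is `execute_door_control`, but almost all of its time is spent in the nested `mcp.schneider.lock_door` call, so that is the span worth investigating.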

Trace Filtering

By Service:

service.name:agent.security

By Operation:

operation.name:execute_door_control

By Status:

status.code:ERROR

By Duration:

duration:>1000ms

5. Metrics Tab

Real-time metrics and dashboards.

Available Metrics

Infrastructure:

  • Redis: Operations/sec, memory usage
  • PostgreSQL: Query count, connection pool
  • NATS: Messages/sec, queue depth

Application:

  • Agent executions
  • Policy evaluations
  • MCP tool invocations
  • Event processing latency

System:

  • CPU usage per container
  • Memory usage per container
  • Network I/O
  • Disk I/O

Creating Custom Dashboards

  1. Click + New Dashboard
  2. Add metrics:
    • citadel_agent_executions_total
    • citadel_policy_evaluations_total
    • citadel_mcp_tool_duration_seconds
  3. Choose visualization (line chart, bar chart)
  4. Set refresh interval (5s, 30s, 1m)

Alerting

Set up alerts for critical metrics:

  1. Select metric (e.g., error_rate)
  2. Click Create Alert
  3. Set threshold: > 5%
  4. Configure notification (email, webhook)
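
The threshold in step 3 is a ratio check. As a sketch of what "error_rate > 5%" means in practice, the Python below computes the rate from two counters and compares it with the threshold. The function names are hypothetical, and how the dashboard itself derives `error_rate` is an assumption here.

```python
def error_rate(errors, total):
    """Fraction of requests that failed; 0.0 when there is no traffic."""
    return errors / total if total else 0.0

def should_alert(errors, total, threshold=0.05):
    """Fire only when the rate is strictly above the threshold."""
    return error_rate(errors, total) > threshold

print(should_alert(errors=6, total=100))  # 6% is above the 5% threshold
print(should_alert(errors=5, total=100))  # exactly 5% does not fire
```

Guarding against `total == 0` matters in development, where a freshly started service has no traffic yet.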

6. Environment Variables Tab

View and edit environment variables for all services.

Viewing Variables

  1. Click Environment tab
  2. Select service (e.g., opa)
  3. See all env vars:
    OPA_LOG_LEVEL=debug
    OTEL_EXPORTER_OTLP_ENDPOINT=http://jaeger:4317

Hot-Reloading Variables

  1. Click Edit on variable
  2. Change value (e.g., OPA_LOG_LEVEL=info)
  3. Click Save
  4. Service auto-restarts with new value

Note: Only supported for containerized services with restart policies.

Debugging Workflows

Debugging OPA Policy Denials

Scenario: Agent action is denied by policy

Steps:

  1. Find the denial in logs:

    • Console Logs → Filter: message:"Policy violation"
    • Note the input and reason
  2. Check policy evaluation:

    • Structured Logs → Query: {"policy.result": false}
    • View full policy input/output
  3. Test policy locally:

    # Create test input
    cat > input.json <<EOF
    {
      "action": "door_unlock",
      "duration_seconds": 600
    }
    EOF

    # Evaluate policy
    opa eval -i input.json -d policies/security.rego \
      'data.citadel.security.allow'
  4. Fix policy or input:

    • Edit policies/security.rego
    • Policies auto-reload in OPA container
  5. Verify in dashboard:

    • Watch Console Logs for policy reload
    • Re-run agent
    • Check for successful evaluation
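
Besides `opa eval`, you can query the running OPA container through OPA's REST Data API, which accepts `POST /v1/data/<path>` with a body of the form `{"input": ...}` and answers with `{"result": ...}`. The Python sketch below builds that request for the rule used in this workflow. It assumes the container's API is reachable at `http://localhost:8181` (the default port from the Resources table) and that the rule lives at `data.citadel.security.allow`; the helper names are illustrative.

```python
import json
import urllib.request

def build_opa_request(input_doc, rule_path="citadel/security/allow",
                      base_url="http://localhost:8181"):
    """Build a Data API request that evaluates one rule against input_doc."""
    body = json.dumps({"input": input_doc}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/v1/data/{rule_path}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def evaluate(input_doc, **kwargs):
    """Send the request; returns None when the rule is undefined for this input."""
    request = build_opa_request(input_doc, **kwargs)
    with urllib.request.urlopen(request, timeout=5) as response:
        return json.load(response).get("result")

request = build_opa_request({"action": "door_unlock", "duration_seconds": 600})
print(request.get_method(), request.full_url)
# With the stack running: print(evaluate({"action": "door_unlock", "duration_seconds": 600}))
```

Querying the live container confirms that the reloaded policy, not just the file on disk, produces the decision you expect.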

Debugging Agent State Machine

Scenario: Agent gets stuck in a state

Steps:

  1. Find agent execution trace:

    • Traces → Filter: service.name:agent.security
    • Find stuck execution (long duration)
  2. Examine span timeline:

    • Click trace
    • Look for incomplete spans or timeouts
  3. Check agent logs:

    • Console Logs → Filter: service:agent-runtime level:debug
    • Look for state transitions:
      DEBUG: State transition: monitor → analyze
      DEBUG: State transition: analyze → coordinate_response
      DEBUG: State transition: coordinate_response → STUCK
  4. Inspect agent state:

    • Structured Logs → Query: {"agent.state": {"$exists": true}}
    • View state object:
      {
        "status": "active",
        "current_state": "coordinate_response",
        "context": {...},
        "error": "Timeout waiting for MCP response"
      }
  5. Fix the issue:

    • Add timeout handling in agent code
    • Or fix MCP adapter issue
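
As a sketch of the "add timeout handling" fix, the Python below wraps an awaited MCP call in `asyncio.wait_for` so the agent moves to an explicit error state instead of hanging. Everything specific here is an assumption for illustration: `call_mcp_tool` is a stand-in for the real MCP client call, and the state names and state dictionary shape are modeled on the examples in this guide, not taken from the agent code.

```python
import asyncio

async def call_mcp_tool(delay_s):
    """Hypothetical stand-in for an MCP tool call that takes delay_s seconds."""
    await asyncio.sleep(delay_s)
    return {"locked": True}

async def coordinate_response(state, delay_s, timeout_s=0.05):
    """Run the MCP call with a deadline and return the next agent state."""
    try:
        result = await asyncio.wait_for(call_mcp_tool(delay_s), timeout=timeout_s)
    except asyncio.TimeoutError:
        # Leave the stuck state explicitly instead of waiting forever.
        return {**state, "status": "failed", "current_state": "error",
                "error": "Timeout waiting for MCP response"}
    return {**state, "current_state": "execute_door_control", "result": result}

state = {"status": "active", "current_state": "coordinate_response"}
ok = asyncio.run(coordinate_response(state, delay_s=0))
timed_out = asyncio.run(coordinate_response(state, delay_s=1))
print(ok["current_state"], timed_out["current_state"])
```

With a deadline in place, the trace shows a short span ending in an error status rather than an incomplete span, which makes the failure visible in the Traces tab.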

Debugging MCP Adapter Issues

Scenario: MCP tool invocation fails

Steps:

  1. Check MCP server logs:

    • Resources → Find MCP container
    • View Logs
    • Look for errors:
      ERROR: Failed to execute tool: set_temperature
      Error: Connection refused to EcoStruxure API
  2. Verify MCP server is running:

    docker ps | grep mcp
    # Should show running container
  3. Test MCP server directly:

    # List available tools
    curl http://localhost:3001/tools

    # Invoke tool
    curl -X POST http://localhost:3001/tools/set_temperature \
      -H 'Content-Type: application/json' \
      -d '{"zone": "lobby", "temperature": 22.0}'
  4. Check OPA policy for tool:

    • MCP tools may be blocked by policy
    • Console Logs → Filter: message:set_temperature
  5. Enable mock mode for development:

    • Environment → Select MCP server
    • Edit ENABLE_MOCK_MODE=true
    • Restart service
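
For context on step 5, a mock-mode switch is typically a single environment check at the point where the adapter would call the vendor API. The Python sketch below shows that pattern. Only the variable name `ENABLE_MOCK_MODE` comes from this guide; the function names, the accepted truthy values, and the response shape are assumptions, and the real adapter may be implemented differently.

```python
import os

def mock_mode_enabled(env=os.environ):
    """Treat common truthy spellings of ENABLE_MOCK_MODE as 'on'."""
    return env.get("ENABLE_MOCK_MODE", "false").strip().lower() in {"1", "true", "yes"}

def set_temperature(zone, temperature, env=os.environ):
    """Return a canned response in mock mode; otherwise call the real backend."""
    if mock_mode_enabled(env):
        return {"zone": zone, "temperature": temperature, "mock": True}
    # Placeholder: the real EcoStruxure call is outside the scope of this sketch.
    raise ConnectionError("real backend call not implemented in this sketch")

print(set_temperature("lobby", 22.0, env={"ENABLE_MOCK_MODE": "true"}))
```

Passing the environment as a parameter keeps the switch easy to exercise in unit tests without mutating the process environment.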

Hot Reload Workflows

.NET Microservices Hot Reload

Aspire supports .NET hot reload out of the box.

Steps:

  1. Edit C# file (e.g., src/microservices/CitadelMesh.Safety/Program.cs)
  2. Save file
  3. Dashboard shows: 🔄 Reloading: safety-engine
  4. Changes applied (no restart needed)

Limitations:

  • Method signature changes require restart
  • New dependencies require dotnet restore

OPA Policy Hot Reload

Policies auto-reload when files change.

Steps:

  1. Edit policy: vim policies/security.rego
  2. Save file
  3. Dashboard Console Logs shows:
    INFO: OPA bundle reloaded
    INFO: Loaded policies: citadel.security
  4. Test immediately (no restart)

Python Agent Development

Agents don't auto-reload, but you can use watchdog:

cd src/agents

# Install watchdog (quoted so shells such as zsh don't expand the brackets)
pip install 'watchdog[watchmedo]'

# Auto-restart on file change
watchmedo auto-restart \
  --patterns="*.py" \
  --recursive \
  -- python security/security_agent.py

Advanced Features

Custom Resource Health Checks

Add health check to custom service:

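// In Program.cs (assumes Aspire 9.0+, where WithHttpHealthCheck is available)
var myService = builder.AddContainer("my-service", "my-image")
    .WithHttpEndpoint(targetPort: 8080)   // Expose the port the probe will hit
    .WithHttpHealthCheck("/health");      // Healthy when /health returns 200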

Dashboard will show health status with ✅/❌ indicator.

Resource Dependencies

Ensure services start in order:

var postgres = builder.AddPostgres("postgres");
var orchestrator = builder.AddProject<Projects.Orchestrator>("orchestrator")
    .WithReference(postgres) // Injects the postgres connection string; does not wait
    .WaitFor(postgres);      // Delays startup until postgres is ready (Aspire 9.0+)

External Services

Connect to external systems:

// External Redis: the value is read from configuration, for example
// "ConnectionStrings": { "external-redis": "prod-redis:6379" }
var externalRedis = builder.AddConnectionString("external-redis");

// Use in services
var myService = builder.AddProject<Projects.MyService>("my-service")
    .WithReference(externalRedis);

Performance Tips

1. Reduce Log Volume

For better performance, reduce log verbosity in production-like testing:

{
  "Logging": {
    "LogLevel": {
      "Default": "Warning",
      "CitadelMesh": "Information"
    }
  }
}
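
In .NET logging configuration, the most specific matching category prefix decides the minimum level, and `Default` applies when nothing else matches. The Python sketch below models that rule for the configuration above so you can predict which messages survive. It is a simplified illustration: it ignores provider-specific rules and wildcards, and the category names used in the checks are examples.

```python
LEVELS = ["Trace", "Debug", "Information", "Warning", "Error", "Critical"]

# Mirrors the Logging:LogLevel section above.
config = {"Default": "Warning", "CitadelMesh": "Information"}

def minimum_level(category, log_levels):
    """Pick the level of the longest configured prefix that matches the category."""
    best = None
    for prefix in log_levels:
        if prefix != "Default" and category.lower().startswith(prefix.lower()):
            if best is None or len(prefix) > len(best):
                best = prefix
    return log_levels[best] if best else log_levels["Default"]

def is_enabled(category, level, log_levels):
    """A message is emitted when its level is at or above the category minimum."""
    return LEVELS.index(level) >= LEVELS.index(minimum_level(category, log_levels))

print(minimum_level("CitadelMesh.Safety", config))
print(is_enabled("Microsoft.AspNetCore", "Information", config))
```

With this configuration, CitadelMesh services keep their `Information` messages while framework categories only emit `Warning` and above.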

2. Disable Unused Features

// Disable tracing if not needed
builder.Services.AddOpenTelemetry()
    .WithTracing(tracing => tracing.SetSampler(new AlwaysOffSampler()));

3. Use Persistent Volumes

Avoid recreating containers on each start:

var postgres = builder.AddPostgres("postgres")
    .WithDataVolume("citadel-postgres-data"); // Persists across restarts

Troubleshooting Dashboard Issues

Dashboard not accessible

Error: Cannot connect to https://localhost:5000

Solution:

# Check if AppHost is running
ps aux | grep CitadelMesh.AppHost

# Restart AppHost
cd src/CitadelMesh.AppHost
dotnet run

Services show as unhealthy

Error: Red ❌ indicators for all services

Solution:

# Check Docker
docker ps

# Restart Docker Desktop if needed
# Then restart Aspire
dotnet run

Logs not appearing

Issue: Console Logs tab is empty

Solution:

  1. Check log level in appsettings.Development.json
  2. Ensure "Aspire": "Debug" is set under Logging:LogLevel
  3. Restart AppHost

Traces not showing

Issue: Traces tab shows "No traces"

Solution:

# Verify Jaeger is running
curl http://localhost:16686

# Check OTLP endpoint (gRPC, so test that the port accepts connections)
nc -zv localhost 4317
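
If you prefer a check that behaves the same on every platform, a plain TCP connection test tells you whether anything is listening on the Jaeger and OTLP ports. The Python sketch below does that with the standard library. The port numbers come from this guide; `port_open` is an illustrative helper, and an open port only shows that a listener exists, not that traces are being accepted.

```python
import socket

def port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

for name, port in {"Jaeger UI": 16686, "OTLP gRPC": 4317}.items():
    status = "open" if port_open("localhost", port) else "closed"
    print(f"{name} (localhost:{port}): {status}")
```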

Next Steps

Dashboard Keyboard Shortcuts

| Shortcut | Action |
|---|---|
| Cmd/Ctrl + K | Open command palette |
| Cmd/Ctrl + F | Focus search/filter |
| Cmd/Ctrl + R | Refresh current view |
| Esc | Clear filters |
| ? | Show help overlay |

Master the dashboard and you'll be 10x more productive! Continue to Docker Compose Setup.