Technology Stack

CitadelMesh is intentionally polyglot, choosing the best language and framework for each component. This document explains our technology choices, decision rationale, and when to use each stack.

Guiding principle: Every technology decision reinforces the pillars outlined in the Architecture Overview—multi-agent autonomy, edge-first zero trust, and verifiable automation. Use this page to map components back to those differentiators.

Stack Overview

Python Stack (Agents)

When to Use Python

Best for:

Agent development (LangGraph, LlamaIndex, AutoGen)
Rapid iteration and experimentation
AI/ML integration (Transformers, PyTorch)
Protocol adapters with vendor SDKs

Not ideal for:

High-throughput stateful services
Low-latency requirements (< 5ms)
Large-scale parallel processing

Core Libraries

LangGraph

Purpose: Deterministic agent state machines

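from typing import List, TypedDict

from langgraph.graph import StateGraph, END


class AgentState(TypedDict):
    messages: List[str]
    next_action: str


def process_node(state: AgentState) -> dict:
    # Placeholder node: return only the keys that change
    return {"messages": state["messages"] + ["processed"]}


def decide_node(state: AgentState) -> dict:
    # Placeholder node: choose the next action
    return {"next_action": "noop"}


workflow = StateGraph(AgentState)
workflow.add_node("process", process_node)
workflow.add_node("decide", decide_node)
workflow.set_entry_point("process")
workflow.add_edge("process", "decide")
workflow.add_edge("decide", END)

graph = workflow.compile()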

Why LangGraph:

Deterministic execution (replayable)
Built-in checkpointing
Human-in-the-loop support
Excellent OpenTelemetry integration
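To make "replayable" and "checkpointing" concrete, here is a minimal standard-library sketch of the idea. It is not LangGraph's implementation: the `PIPELINE` list, the `run` helper, and the node bodies are assumptions made for illustration. Each node is a pure function of the state, a snapshot is stored after every step, and a run can be resumed from any snapshot.

```python
from copy import deepcopy
from typing import Callable, Dict, List, Tuple

State = Dict[str, object]
Node = Callable[[State], State]


def process_node(state: State) -> State:
    # Pure function of its input: same state in, same state out.
    return {**state, "messages": state["messages"] + ["processed"]}


def decide_node(state: State) -> State:
    # Hypothetical rule: escalate only when more than two messages exist.
    action = "escalate" if len(state["messages"]) > 2 else "log"
    return {**state, "next_action": action}


# Hypothetical fixed pipeline mirroring the "process" -> "decide" graph.
PIPELINE: List[Tuple[str, Node]] = [
    ("process", process_node),
    ("decide", decide_node),
]


def run(state: State, checkpoints: List[Tuple[str, State]], start_at: int = 0) -> State:
    """Run nodes in order, snapshotting the state after each step."""
    for name, node in PIPELINE[start_at:]:
        state = node(state)
        checkpoints.append((name, deepcopy(state)))
    return state


history: List[Tuple[str, State]] = []
final = run({"messages": ["door forced"], "next_action": ""}, history)

# Resume from the checkpoint taken after "process" and re-run only "decide".
replayed = run(deepcopy(history[0][1]), [], start_at=1)
```

Because the nodes have no hidden inputs, replaying from a checkpoint reproduces the original result. That property is what makes an agent run auditable after the fact.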

Pydantic

Purpose: Data validation and serialization

from pydantic import BaseModel, Field
from ulid import ULID  # assumes the python-ulid package is installed

class Command(BaseModel):
    id: str = Field(..., min_length=26, max_length=26)  # ULID
    target_id: str
    action: str
    params: dict[str, str] = Field(default_factory=dict)
    priority: int = Field(default=2, ge=0, le=4)

# Validation runs on construction; invalid input raises ValidationError
command = Command(id=str(ULID()), target_id="door.lobby", action="unlock")
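The length constraint on `id` only checks that the string has 26 characters. A stricter check, sketched below with the standard library, also verifies the Crockford base32 alphabet that ULIDs use. The helper name `is_valid_ulid` is an assumption for this example, not part of the CitadelMesh codebase.

```python
import re

# Crockford base32 excludes I, L, O, and U. A ULID encodes 128 bits in
# 26 characters, so the first character can be at most "7".
ULID_PATTERN = re.compile(r"^[0-7][0-9A-HJKMNP-TV-Z]{25}$")


def is_valid_ulid(value: str) -> bool:
    """Return True if value is a canonically encoded (uppercase) ULID."""
    return bool(ULID_PATTERN.fullmatch(value))
```

In a Pydantic v2 model the same expression could be supplied through `Field(pattern=...)`, which keeps the rule next to the field it constrains.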

NATS.py

Purpose: Event bus client

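import asyncio
import json

import nats


async def main():
    nc = await nats.connect("nats://nats.citadel.svc:4222")
    js = nc.jetstream()

    # Publish event (payload must be bytes; the fields here are placeholders)
    event_data = json.dumps({"entity_id": "zone.lobby", "value": 72.5}).encode()
    await js.publish("telemetry.canonical.building_a", event_data)

    # Pull-subscribe with a durable consumer, then acknowledge each message
    sub = await js.pull_subscribe("telemetry.>", durable="agent-consumer")
    msgs = await sub.fetch(batch=10)
    for msg in msgs:
        await msg.ack()

    await nc.close()


asyncio.run(main())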
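The subscription above uses the `>` wildcard. As a rough illustration of how NATS subject wildcards behave, the sketch below reimplements the matching rules in plain Python: `*` matches exactly one token and `>` matches one or more trailing tokens. The function `subject_matches` is a hypothetical helper for reasoning about subject design; the real matching happens inside the NATS server.

```python
def subject_matches(pattern: str, subject: str) -> bool:
    """Check a NATS-style subject against a subscription pattern.

    "*" matches exactly one token; ">" matches one or more trailing tokens.
    """
    p_tokens = pattern.split(".")
    s_tokens = subject.split(".")
    for i, p in enumerate(p_tokens):
        if p == ">":
            # ">" must be last and needs at least one token left to match.
            return i == len(p_tokens) - 1 and len(s_tokens) > i
        if i >= len(s_tokens):
            return False
        if p != "*" and p != s_tokens[i]:
            return False
    return len(p_tokens) == len(s_tokens)
```

With these rules, `telemetry.>` receives `telemetry.canonical.building_a` but not the bare subject `telemetry`, while `telemetry.*.building_a` selects one building across all telemetry kinds.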

Python Environment

# pyproject.toml
[project]
name = "citadel-agents"
version = "1.0.0"
requires-python = ">=3.11"

dependencies = [
    "langgraph>=0.2",
    "langchain>=0.3",
    "pydantic>=2.0",
    "nats-py>=2.7",
    "grpcio>=1.60",
    "protobuf>=5.0",
    "opentelemetry-api>=1.22",
    "opentelemetry-sdk>=1.22",
    "structlog>=24.0"
]

[tool.uv]
dev-dependencies = [
    "pytest>=8.0",
    "pytest-asyncio>=0.23",
    "ruff>=0.3"
]

.NET Stack (Services)

When to Use .NET

Best for:

High-throughput stateful services
Long-lived actors (Orleans)
Cloud-native microservices (Aspire)
Low-latency requirements

Not ideal for:

Quick prototyping
AI/ML-heavy workloads
Integrations whose vendor SDKs exist only in Python/JS

Core Frameworks

.NET Aspire

Purpose: Cloud-native service composition and orchestration

// Program.cs
var builder = DistributedApplication.CreateBuilder(args);

// Add infrastructure
var cache = builder.AddRedis("cache");
var postgres = builder.AddPostgres("postgres");
var nats = builder.AddNats("nats");

// Add services
builder.AddProject<SchedulerService>("scheduler")
    .WithReference(cache)
    .WithReference(postgres)
    .WithReference(nats);

builder.AddProject<AlarmService>("alarms")
    .WithReference(postgres)
    .WithReference(nats);

builder.Build().Run();

Why Aspire:

Built-in service discovery
Automatic health checks
Integrated telemetry dashboard
Local development experience

Orleans

Purpose: Stateful virtual actors for long-lived workflows

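public interface IIncidentActor : IGrainWithStringKey
{
    Task ReportIncident(IncidentReport report);
    Task UpdateStatus(IncidentStatus status);
    Task<IncidentState> GetState();
}

// Grain<TState> supplies State, ReadStateAsync, and WriteStateAsync, and loads
// state on activation. IRemindable is required to register reminders.
public class IncidentActor : Grain<IncidentState>, IIncidentActor, IRemindable
{
    public async Task ReportIncident(IncidentReport report)
    {
        State.Severity = report.Severity;
        State.ReportedAt = DateTime.UtcNow;

        // Persist state
        await WriteStateAsync();

        // Publish event (helper not shown)
        await PublishIncidentEvent(report);

        // Set reminder for follow-up
        await this.RegisterOrUpdateReminder(
            "followup",
            TimeSpan.FromMinutes(15),
            TimeSpan.FromMinutes(15)
        );
    }

    public async Task UpdateStatus(IncidentStatus status)
    {
        State.Status = status;  // assumes IncidentState has a Status property
        await WriteStateAsync();
    }

    public Task<IncidentState> GetState() => Task.FromResult(State);

    // Invoked by the runtime when the "followup" reminder fires
    public Task ReceiveReminder(string reminderName, TickStatus status) => Task.CompletedTask;
}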

Why Orleans:

Automatic state persistence
Location transparency
Built-in reminders/timers
Elastic scalability
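The virtual actor model behind these points can be sketched in a few lines of Python. This is a language-neutral illustration, not Orleans code: `ActorRuntime`, its in-memory `store`, and the method names are all assumptions. Callers address an actor by key only, the runtime activates it on first use, and its state outlives the in-memory instance.

```python
import asyncio
from typing import Dict


class IncidentActor:
    """One logical actor per incident key; state survives deactivation."""

    def __init__(self, key: str, store: Dict[str, dict]) -> None:
        self.key = key
        self._store = store
        # Restore persisted state on activation, or start fresh.
        self.state = dict(store.get(key, {"severity": None, "reports": 0}))

    async def report(self, severity: str) -> None:
        self.state["severity"] = severity
        self.state["reports"] += 1
        # Persist after every change (stand-in for a durable state provider).
        self._store[self.key] = dict(self.state)


class ActorRuntime:
    """Activates actors on first use; callers only ever hold a key."""

    def __init__(self) -> None:
        self.store: Dict[str, dict] = {}  # hypothetical durable storage
        self._active: Dict[str, IncidentActor] = {}

    def get(self, key: str) -> IncidentActor:
        if key not in self._active:
            self._active[key] = IncidentActor(key, self.store)
        return self._active[key]

    def deactivate(self, key: str) -> None:
        # Drop the in-memory instance; persisted state is untouched.
        self._active.pop(key, None)


async def demo() -> dict:
    runtime = ActorRuntime()
    await runtime.get("incident-42").report("high")
    runtime.deactivate("incident-42")
    # A new activation picks up where the previous one left off.
    await runtime.get("incident-42").report("critical")
    return runtime.get("incident-42").state


final_state = asyncio.run(demo())
```

In a real cluster the activation may land on a different node each time, which is why callers never hold a direct reference to the instance.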

Dapr

Purpose: Language-agnostic building blocks (pub/sub, state, bindings)

// Pub/Sub with Dapr
[Topic("pubsub", "telemetry.canonical")]
public async Task HandleTelemetry(CloudEvent<Point> cloudEvent)
{
    var point = cloudEvent.Data;

    logger.LogInformation(
        "Telemetry received: {EntityId} = {Value}",
        point.EntityId,
        point.Value
    );

    await ProcessTelemetry(point);
}

// State store
var stateStore = "statestore";
await daprClient.SaveStateAsync(stateStore, "zone-state", zoneState);
var retrieved = await daprClient.GetStateAsync<ZoneState>(stateStore, "zone-state");

Why Dapr:

Polyglot (Python, .NET, Node all use same APIs)
Portable across clouds (abstracts Kafka, Azure Service Bus, etc.)
Built-in retries, circuit breakers
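For readers unfamiliar with the resiliency patterns named in the last bullet, the sketch below shows retry with exponential backoff and a simple circuit breaker in plain Python. It only illustrates the behavior; Dapr applies such policies outside application code, and the class names, thresholds, and delays here are assumptions.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitOpenError(RuntimeError):
    """Raised when calls are rejected because the circuit is open."""


class CircuitBreaker:
    def __init__(self, failure_threshold: int = 3, reset_after: float = 30.0) -> None:
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after
        self.failures = 0
        self.opened_at: Optional[float] = None

    def call(self, fn: Callable[[], T]) -> T:
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.reset_after:
                raise CircuitOpenError("circuit open; call rejected")
            # Cool-down elapsed: allow one trial call (half-open).
            self.opened_at = None
            self.failures = self.failure_threshold - 1
        try:
            result = fn()
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = time.monotonic()
            raise
        self.failures = 0
        return result


def retry(fn: Callable[[], T], attempts: int = 3, base_delay: float = 0.1) -> T:
    """Retry fn with exponential backoff; re-raise the last error."""
    for attempt in range(attempts):
        try:
            return fn()
        except CircuitOpenError:
            raise  # Retrying an open circuit only adds load.
        except Exception:
            if attempt == attempts - 1:
                raise
            time.sleep(base_delay * 2 ** attempt)
    raise RuntimeError("unreachable")
```

Retries absorb short, transient failures; the breaker stops callers from hammering a dependency that is clearly down. The two are usually combined, with the breaker wrapped inside the retried call.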

Semantic Kernel

Purpose: .NET agent framework for AI integration

var kernel = Kernel.CreateBuilder()
    .AddOpenAIChatCompletion(modelId, apiKey)
    .Build();

// Add plugins (called "skills" in early Semantic Kernel releases)
kernel.ImportPluginFromObject(new HvacPlugin());

// Run agent
var result = await kernel.InvokePromptAsync(
    "Optimize HVAC for building A based on current weather and occupancy"
);

.NET Project Structure

// CitadelMesh.sln
solution
├── src/
│   ├── CitadelMesh.Aspire/
│   │   └── AppHost/              # Aspire composition
│   ├── CitadelMesh.Services/
│   │   ├── Scheduler/            # Scheduler service
│   │   ├── Alarms/               # Alarm service
│   │   └── Sessions/             # Session manager
│   ├── CitadelMesh.Orleans/
│   │   └── Grains/               # Orleans actors
│   └── CitadelMesh.Shared/
│       ├── Contracts/            # Protobuf generated
│       └── Common/               # Shared utilities
└── tests/
    └── CitadelMesh.Tests/

TypeScript Stack (Frontend & Tooling)

When to Use TypeScript

Best for:

Frontend web applications
Real-time dashboards
Developer tooling
MCP servers

Core Frameworks

React + Next.js

// Building dashboard component
'use client';

import { useStreamingTelemetry } from '@/hooks/useTelemetry';

export function ZoneStatusCard({ zoneId }: { zoneId: string }) {
    const telemetry = useStreamingTelemetry(zoneId);

    return (
        <div className="p-4 border rounded-lg">
            <h3 className="text-lg font-semibold">{zoneId}</h3>
            <div className="mt-2">
                <span>Temperature: {telemetry.temp}°F</span>
                <span>Setpoint: {telemetry.setpoint}°F</span>
                <span>Occupancy: {telemetry.occupied ? 'Yes' : 'No'}</span>
            </div>
        </div>
    );
}

gRPC-Web

// gRPC client for Twin Service
import { TwinServicePromiseClient } from '@/proto/twin_grpc_web_pb';
import { GetEntityRequest } from '@/proto/twin_pb';

// The promise-based client is needed for await; TwinServiceClient is callback-based
const client = new TwinServicePromiseClient('https://twin.citadel.io');

async function getZoneState(zoneId: string) {
    const request = new GetEntityRequest();
    request.setEntityId(zoneId);

    const response = await client.getEntity(request, {});
    return response.toObject();
}

MCP Server (TypeScript)

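import { Server } from '@modelcontextprotocol/sdk/server/index.js';
import { StdioServerTransport } from '@modelcontextprotocol/sdk/server/stdio.js';
import {
    CallToolRequestSchema,
    ListToolsRequestSchema
} from '@modelcontextprotocol/sdk/types.js';

// bacnetClient is a placeholder for the project's BACnet adapter
declare const bacnetClient: { readPoint(pointId: string): Promise<unknown> };

const server = new Server(
    { name: 'bacnet-mcp', version: '1.0.0' },
    { capabilities: { tools: {} } }
);

// Advertise the available tools
server.setRequestHandler(ListToolsRequestSchema, async () => ({
    tools: [
        {
            name: 'bacnet_read_point',
            description: 'Read BACnet point value',
            inputSchema: {
                type: 'object',
                properties: {
                    point_id: { type: 'string' }
                },
                required: ['point_id']
            }
        }
    ]
}));

// Dispatch tool calls by name
server.setRequestHandler(CallToolRequestSchema, async (request) => {
    if (request.params.name === 'bacnet_read_point') {
        const pointId = String(request.params.arguments?.point_id);
        const value = await bacnetClient.readPoint(pointId);
        return { content: [{ type: 'text', text: String(value) }] };
    }
    throw new Error(`Unknown tool: ${request.params.name}`);
});

await server.connect(new StdioServerTransport());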

Infrastructure Choices

Kubernetes Distribution: K3s

Why K3s:

Lightweight (< 512 MB memory)
Single binary deployment
Fully compatible with K8s APIs
Built-in components (Traefik, local storage)
Perfect for edge deployments

Alternative: Full Kubernetes (AKS, EKS, GKE) for large-scale cloud

Message Broker: NATS JetStream

Why NATS:

Low latency (microseconds)
Small footprint (< 20 MB)
JetStream persistence
Built-in clustering
MQTT compatibility

Alternative: Kafka/Redpanda for cloud high-throughput scenarios

Time-Series Database: TimescaleDB

Why TimescaleDB:

PostgreSQL-compatible (familiar SQL)
Automatic partitioning
Continuous aggregates
Native compression (around 10x space savings, depending on the data)
Mature ecosystem

Alternative: InfluxDB, QuestDB for pure time-series workloads
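Continuous aggregates are built on time bucketing: each reading is assigned to a fixed-width window and the window is summarized once. The sketch below shows that idea in plain Python, as an illustration rather than TimescaleDB's implementation. Aligning buckets to the Unix epoch and the helper names are assumptions of this example.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone
from statistics import mean
from typing import Dict, Iterable, List, Tuple


def time_bucket(ts: datetime, width: timedelta) -> datetime:
    """Floor a timezone-aware timestamp to the start of its bucket."""
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    return ts - ((ts - epoch) % width)


def bucket_averages(
    readings: Iterable[Tuple[datetime, float]], width: timedelta
) -> Dict[datetime, float]:
    """Average readings per bucket, like a GROUP BY over bucketed time."""
    buckets: Dict[datetime, List[float]] = defaultdict(list)
    for ts, value in readings:
        buckets[time_bucket(ts, width)].append(value)
    return {start: mean(values) for start, values in sorted(buckets.items())}
```

A dashboard that plots 15-minute zone temperatures would read the pre-computed buckets instead of scanning raw telemetry, which is the main reason to materialize them.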

Identity: SPIFFE/SPIRE

Why SPIFFE:

Zero-trust native
Automatic key rotation
No shared secrets
Industry standard
Multi-platform support

No viable alternative for workload identity at this scale
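A SPIFFE ID is a URI of the form `spiffe://<trust-domain>/<path>`. The sketch below parses one with the standard library and applies a simple allow rule. The trust domain `citadel.example`, the `/agent/` prefix, and both helper names are assumptions for illustration; a production service would rely on a SPIFFE library and verified SVIDs rather than string checks alone.

```python
from typing import NamedTuple
from urllib.parse import urlsplit


class SpiffeId(NamedTuple):
    trust_domain: str
    path: str


def parse_spiffe_id(uri: str) -> SpiffeId:
    """Split a SPIFFE ID into trust domain and workload path."""
    parts = urlsplit(uri)
    if parts.scheme != "spiffe":
        raise ValueError("scheme must be 'spiffe'")
    if not parts.netloc or "@" in parts.netloc or ":" in parts.netloc:
        raise ValueError("trust domain must be a bare name (no userinfo or port)")
    if parts.query or parts.fragment:
        raise ValueError("query and fragment components are not allowed")
    return SpiffeId(trust_domain=parts.netloc, path=parts.path)


def is_authorized(uri: str, trust_domain: str, allowed_prefix: str) -> bool:
    """Allow only workloads from our trust domain under a path prefix."""
    try:
        spiffe_id = parse_spiffe_id(uri)
    except ValueError:
        return False
    return (
        spiffe_id.trust_domain == trust_domain
        and spiffe_id.path.startswith(allowed_prefix)
    )
```

Authorization by identity rather than by network location is what lets the same policy hold at the edge and in the cloud.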

Cross-Language Integration

Protobuf for All

# Generate for all languages
buf generate

# Outputs:
#   src/proto_gen/python/citadel/v1/*.py
#   src/proto_gen/csharp/Citadel.V1/*.cs
#   src/proto_gen/typescript/citadel/v1/*.ts

gRPC Everywhere

# Python client calling .NET service (grpc.aio is required to await calls)
channel = grpc.aio.secure_channel("twin-service:8443", credentials)
client = TwinServiceStub(channel)
entity = await client.GetEntity(request)

// .NET calling Python agent
var channel = GrpcChannel.ForAddress("http://energy-agent:5000");
var client = new AgentService.AgentServiceClient(channel);
var result = await client.OptimizeAsync(request);

Development Tools

Build Tools

Python: uv (fast package manager)
.NET: dotnet CLI
TypeScript: pnpm (fast, efficient)
Protobuf: buf (linting, breaking change detection)

Testing

# pytest for Python
pytest tests/ -v --cov=src

# xUnit for .NET
dotnet test --logger "console;verbosity=detailed"

# Vitest for TypeScript
pnpm test

Related Documentation

Protocol Strategy - Cross-language protocols
Agent Topology - Python agent implementation
Cloud Integration - .NET services
