
Identity Foundation: SPIFFE/SPIRE

CitadelMesh implements zero-trust security using SPIFFE (Secure Production Identity Framework For Everyone) and SPIRE (SPIFFE Runtime Environment). Every workload, agent, and adapter receives a cryptographic identity that enables mutual authentication and authorization without shared secrets.

Zero-Trust Principles

Traditional building systems rely on network perimeter security: "If you're on the building network, you're trusted." This model fails in modern architectures with:

  • Cloud connectivity
  • Third-party integrations
  • Mobile access
  • Compromised credentials

Zero-trust requires:

  1. Verify explicitly: Authenticate every request based on identity
  2. Least privilege: Grant minimal access required for each workload
  3. Assume breach: Design for compromise; limit blast radius

SPIFFE Identity Model

SPIFFE ID Structure

Every workload receives a unique SPIFFE ID in URI format:

spiffe://trust-domain/path/to/workload

CitadelMesh SPIFFE IDs (all in the citadel.mesh trust domain):

spiffe://citadel.mesh/agent/security
spiffe://citadel.mesh/agent/energy
spiffe://citadel.mesh/adapter/bacnet
spiffe://citadel.mesh/adapter/security-expert
spiffe://citadel.mesh/service/twin
spiffe://citadel.mesh/service/opa-policy

Trust Domain (citadel.mesh): The security boundary; all workloads in this domain trust the same root CA

Workload Path (/agent/security): Hierarchical path identifying the specific workload
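The two parts of a SPIFFE ID can be pulled apart with the standard library. The sketch below is illustrative only; `parse_spiffe_id` is a hypothetical helper, not part of any SPIFFE library, and it checks only the basic URI shape rather than every rule in the SPIFFE ID specification.

```python
from urllib.parse import urlsplit


def parse_spiffe_id(spiffe_id):
    """Split a SPIFFE ID into (trust_domain, workload_path)."""
    parts = urlsplit(spiffe_id)
    if parts.scheme != "spiffe":
        raise ValueError(f"not a SPIFFE ID: {spiffe_id!r}")
    if not parts.netloc:
        raise ValueError("trust domain is empty")
    # SPIFFE IDs carry no userinfo, port, query, or fragment
    if "@" in parts.netloc or ":" in parts.netloc or parts.query or parts.fragment:
        raise ValueError(f"unexpected URI component in {spiffe_id!r}")
    return parts.netloc, parts.path


trust_domain, workload_path = parse_spiffe_id("spiffe://citadel.mesh/agent/security")
print(trust_domain, workload_path)
```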

SPIFFE Verifiable Identity Document (SVID)

SPIRE issues SVIDs as proof of identity:

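SVID = X.509 Certificate {
  URI SAN: spiffe://citadel.mesh/agent/security
  Issuer: SPIRE Server CA
  Valid: 1 hour from issuance (e.g. 2025-09-30 12:00 to 13:00 UTC)
  Key Usage: Digital Signature, Key Encipherment
  Extended Key Usage: Server Auth, Client Auth
}

The SPIFFE ID is carried in the certificate's URI Subject Alternative Name, not in the Subject field.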

Key Properties:

  • Short-lived: 1-hour validity; automatic rotation
  • Cryptographically signed: By SPIRE Server CA
  • Mutual TLS: Used for both client and server authentication
  • No shared secrets: Each workload gets unique keys

SPIRE Architecture

graph TB
subgraph "Edge Node"
Agent1[Security Agent]
Agent2[Energy Agent]
Adapter[BACnet Adapter]

SpireAgent[SPIRE Agent]

Agent1 -->|Request SVID| SpireAgent
Agent2 -->|Request SVID| SpireAgent
Adapter -->|Request SVID| SpireAgent
end

subgraph "SPIRE Server"
Server[SPIRE Server]
CA[Certificate Authority]
Registry[Workload Registry]

Server --> CA
Server --> Registry
end

SpireAgent -->|Attest| Server
Server -->|Issue SVID| SpireAgent

subgraph "Cloud"
CloudAgent[Cloud Services]
CloudSpire[SPIRE Agent]

CloudAgent --> CloudSpire
CloudSpire --> Server
end

SPIRE Server

Central authority for identity issuance:

  • Certificate Authority: Root CA for trust domain
  • Workload Registry: Defines which workloads can receive which identities
  • Attestation: Verifies workload authenticity before issuing SVIDs
  • Federation: Trusts other SPIRE servers for multi-site deployments

SPIRE Agent

Runs on every node (edge and cloud):

  • Workload API: Unix domain socket or named pipe for SVID requests
  • Attestation: Proves node identity to SPIRE Server
  • SVID Caching: Caches SVIDs and rotates before expiry
  • Health Monitoring: Monitors workload lifecycle

Workload Attestation

How does SPIRE know a workload is legitimate before issuing an SVID?

Node Attestation

SPIRE Agent proves the node's identity:

Edge (K3s):

# SPIRE Agent config
NodeAttestor "k8s_psat" {
    plugin_data {
        cluster = "citadel-edge-building-a"
    }
}

Cloud (Kubernetes):

NodeAttestor "k8s_sat" {
    plugin_data {
        cluster = "citadel-cloud-prod"
    }
}

Workload Attestation

The SPIRE Agent attests each workload by matching its selectors against the registration entries held by the SPIRE Server:

# SPIRE Server registration entry
Entry {
    spiffe_id = "spiffe://citadel.mesh/agent/security"
    parent_id = "spiffe://citadel.mesh/node/edge-building-a"
    selectors = [
        "k8s:ns:citadel-agents",
        "k8s:sa:security-agent",
        "k8s:pod-label:app:security-agent"
    ]
    ttl = 3600
}

Selectors match Kubernetes pod metadata:

  • Namespace: citadel-agents
  • Service Account: security-agent
  • Pod Label: app=security-agent

Only pods matching ALL selectors receive this SPIFFE ID.
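The all-selectors rule amounts to a subset check: every selector on the entry must be among the selectors attested for the pod. The sketch below illustrates that logic only and is not SPIRE's implementation; `entry_matches` and the extra `k8s:container-name:agent` selector are hypothetical.

```python
def entry_matches(entry_selectors, workload_selectors):
    """True when every selector on the entry was attested for the workload."""
    return set(entry_selectors) <= set(workload_selectors)


ENTRY_SELECTORS = [
    "k8s:ns:citadel-agents",
    "k8s:sa:security-agent",
    "k8s:pod-label:app:security-agent",
]

# Selectors attested for a running pod; extra selectors do not prevent a match
security_pod = ENTRY_SELECTORS + ["k8s:container-name:agent"]

# Same namespace and label, but a different service account
impostor_pod = [
    "k8s:ns:citadel-agents",
    "k8s:sa:energy-agent",
    "k8s:pod-label:app:security-agent",
]

print(entry_matches(ENTRY_SELECTORS, security_pod))
print(entry_matches(ENTRY_SELECTORS, impostor_pod))
```

A pod that matches only some of the selectors, such as `impostor_pod`, is not issued the identity.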

Identity-Based Authentication

mTLS Everywhere

All network communication uses mutual TLS with SPIFFE SVIDs:

from spiffe import X509Source

# Initialize SPIFFE workload API client
x509_source = X509Source()
await x509_source.start()

# Get this workload's SVID
my_svid = x509_source.get_x509_svid()
print(f"My identity: {my_svid.spiffe_id}")

# Create gRPC channel with mTLS
import grpc

# Server credentials (verify client)
server_credentials = grpc.ssl_server_credentials(
[(my_svid.private_key_bytes, my_svid.cert_chain_bytes)],
root_certificates=x509_source.get_bundle_for_trust_domain("citadel.mesh"),
require_client_auth=True
)

# Client credentials (verify server)
client_credentials = grpc.ssl_channel_credentials(
root_certificates=x509_source.get_bundle_for_trust_domain("citadel.mesh"),
private_key=my_svid.private_key_bytes,
certificate_chain=my_svid.cert_chain_bytes
)

# Create authenticated channel
channel = grpc.secure_channel(
"twin-service.citadel.svc:8443",
client_credentials
)

NATS JetStream with SPIFFE

Event bus authentication using SPIFFE:

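import ssl
import nats

# nats-py takes a standard ssl.SSLContext for TLS.
# Assumptions: ca_bundle_pem holds the citadel.mesh trust bundle as PEM text;
# svid_cert_path and svid_key_path point at PEM files written from the current
# SVID, because ssl.SSLContext.load_cert_chain() reads from files.
# Hostname checking stays on by default; SVIDs identify peers by URI SAN.
tls_context = ssl.create_default_context(cadata=ca_bundle_pem)
tls_context.load_cert_chain(certfile=svid_cert_path, keyfile=svid_key_path)

# Connect to NATS with mTLS
nc = await nats.connect(
    servers=["nats://nats.citadel.svc:4222"],
    tls=tls_context,
)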

# Publish with authenticated identity
await nc.publish(
"telemetry.canonical.building_a",
event.SerializeToString()
)

Authorization with SPIFFE

Identity alone isn't sufficient; we need authorization policies.

OPA Integration

OPA policies check SPIFFE IDs for authorization:

package citadel.authz

import rego.v1

# Default deny
default allow := false

# Energy agent can write HVAC setpoints
allow if {
input.spiffe_id == "spiffe://citadel.mesh/agent/energy"
input.action == "write_setpoint"
startswith(input.target, "hvac.")
}

# Security agent can control doors
allow if {
input.spiffe_id == "spiffe://citadel.mesh/agent/security"
input.action in ["lock_door", "unlock_door"]
startswith(input.target, "door.")
}

# Twin service can read everything
allow if {
input.spiffe_id == "spiffe://citadel.mesh/service/twin"
input.action == "read"
}

# Deny with reason
deny_reason := sprintf(
"SPIFFE ID %s not authorized for action %s on %s",
[input.spiffe_id, input.action, input.target]
) if not allow
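A caller can evaluate this policy through OPA's REST Data API by posting an `input` document to the rule's path. The sketch below only builds the request and interprets the response; the base URL (OPA's default port 8181 on localhost) and the helper names `build_authz_request` and `is_allowed` are assumptions for illustration.

```python
import json
import urllib.request

OPA_URL = "http://localhost:8181"  # assumption: OPA on its default port


def build_authz_request(spiffe_id, action, target, base_url=OPA_URL):
    """Build a POST to the citadel.authz.allow rule with the policy input."""
    body = json.dumps(
        {"input": {"spiffe_id": spiffe_id, "action": action, "target": target}}
    ).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/v1/data/citadel/authz/allow",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def is_allowed(response_body):
    """Fail closed: anything other than {"result": true} counts as a denial."""
    try:
        return json.loads(response_body).get("result") is True
    except (ValueError, AttributeError):
        return False


request = build_authz_request(
    "spiffe://citadel.mesh/agent/energy", "write_setpoint", "hvac.zone1.setpoint"
)
print(request.full_url)
```

Sending it is then a matter of `urllib.request.urlopen(request, timeout=2)` and passing the response body to `is_allowed`. Treating a missing or malformed result as a denial keeps the default-deny stance of the policy.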

JWT Claims with SPIFFE

CloudEvents carry JWT tokens with SPIFFE-based claims:

import jwt
from datetime import datetime, timedelta, timezone

def create_capability_token(svid, capabilities):
    """Create signed JWT with SPIFFE identity and capabilities."""
    now = datetime.now(timezone.utc)

    payload = {
        "sub": str(svid.spiffe_id),  # spiffe://citadel.mesh/agent/energy
        "iss": "spiffe://citadel.mesh/service/opa-policy",
        "aud": ["citadel.control"],
        "exp": now + timedelta(minutes=5),
        "iat": now,
        "capabilities": capabilities,
        "constraints": {
            "temp_min": 65,
            "temp_max": 78
        }
    }

    # Sign with OPA service's SVID. opa_svid is not defined in this snippet;
    # it is assumed to be the policy service's own SVID, so this function
    # runs inside that service. RS256 needs an RSA key; use ES256 for EC keys.
    token = jwt.encode(
        payload,
        opa_svid.private_key_bytes,
        algorithm="RS256"
    )

    return token

Command includes capability token:

command = Command(
id=ulid(),
target_id="hvac.zone1.setpoint",
action="write_setpoint",
params={"value": "72"},
safety_token=create_capability_token(
my_svid,
capabilities=["hvac:write_setpoint"]
),
issued_by=str(my_svid.spiffe_id)
)
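On the receiving side, the token's claims still have to be checked against the command itself. The sketch below assumes the JWT signature and audience have already been verified and that `claims` is the decoded payload with a numeric `exp`. The `check_command` helper and the mapping from a target such as `hvac.zone1.setpoint` to the capability `hvac:write_setpoint` are assumptions made for illustration.

```python
import time


def check_command(claims, target, action, value, now=None):
    """Return (ok, reason) for a command checked against decoded token claims."""
    now = time.time() if now is None else now
    if claims.get("exp", 0) <= now:
        return False, "token expired"

    # Assumed convention: capability = "<first segment of target>:<action>"
    required = f"{target.split('.')[0]}:{action}"
    if required not in claims.get("capabilities", []):
        return False, f"missing capability {required}"

    limits = claims.get("constraints", {})
    low, high = limits.get("temp_min"), limits.get("temp_max")
    if low is not None and value < low:
        return False, f"value {value} below temp_min {low}"
    if high is not None and value > high:
        return False, f"value {value} above temp_max {high}"
    return True, "ok"


claims = {
    "sub": "spiffe://citadel.mesh/agent/energy",
    "exp": 1300,
    "capabilities": ["hvac:write_setpoint"],
    "constraints": {"temp_min": 65, "temp_max": 78},
}
print(check_command(claims, "hvac.zone1.setpoint", "write_setpoint", 72, now=1000))
print(check_command(claims, "hvac.zone1.setpoint", "write_setpoint", 80, now=1000))
```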

SVID Rotation

SVIDs are short-lived (1 hour) and automatically rotate:

from spiffe import X509Source
import asyncio
import logging

logger = logging.getLogger(__name__)

async def maintain_identity():
    """Continuously monitor SVID rotation."""

    x509_source = X509Source()
    await x509_source.start()

    while True:
        svid = x509_source.get_x509_svid()

        # Log identity
        logger.info(f"Current SVID: {svid.spiffe_id}")
        logger.info(f"Expires: {svid.expiry}")

        # X509Source automatically rotates before expiry
        # Just sleep and let it handle rotation
        await asyncio.sleep(300)  # Check every 5 minutes

Rotation Flow:

  1. SPIRE Agent requests new SVID 30 minutes before expiry
  2. SPIRE Server validates workload still matches selectors
  3. Server issues new SVID with fresh keys
  4. Agent provides new SVID via Workload API
  5. Workload updates mTLS connections with new SVID
  6. Old SVID remains valid until expiry (grace period)
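The timing in step 1 can be worked through with plain date arithmetic. The sketch below assumes renewal is triggered once a fixed fraction of the lifetime has elapsed; a fraction of one half reproduces the 30-minute figure for a 1-hour SVID. `renewal_time` and `needs_renewal` are hypothetical helpers, and the timestamps are example values.

```python
from datetime import datetime, timezone


def renewal_time(not_before, not_after, fraction=0.5):
    """Point in the validity window at which a new SVID should be requested."""
    return not_before + (not_after - not_before) * fraction


def needs_renewal(now, not_before, not_after, fraction=0.5):
    """True once the renewal point has been reached."""
    return now >= renewal_time(not_before, not_after, fraction)


issued = datetime(2025, 9, 30, 12, 0, tzinfo=timezone.utc)
expires = datetime(2025, 9, 30, 13, 0, tzinfo=timezone.utc)

print(renewal_time(issued, expires))
print(needs_renewal(datetime(2025, 9, 30, 12, 45, tzinfo=timezone.utc), issued, expires))
```

Between the renewal point and expiry, both the old and the new SVID validate, which is the grace period described in step 6.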

Multi-Site Federation

Enterprise deployments span multiple buildings/sites:

graph TB
subgraph "Building A"
ServerA[SPIRE Server A<br/>citadel.building-a]
AgentA1[Security Agent]
AgentA2[Energy Agent]

AgentA1 --> ServerA
AgentA2 --> ServerA
end

subgraph "Building B"
ServerB[SPIRE Server B<br/>citadel.building-b]
AgentB1[Security Agent]
AgentB2[Energy Agent]

AgentB1 --> ServerB
AgentB2 --> ServerB
end

subgraph "Cloud"
ServerCloud[SPIRE Server Cloud<br/>citadel.cloud]
ServiceTwin[Twin Service]
ServiceOps[Ops Service]

ServiceTwin --> ServerCloud
ServiceOps --> ServerCloud
end

ServerA <-->|Federation Bundle| ServerB
ServerA <-->|Federation Bundle| ServerCloud
ServerB <-->|Federation Bundle| ServerCloud

Federation Configuration

# SPIRE Server A config (inside the server block)
federation {
    federates_with "citadel.building-b" {
        bundle_endpoint_url = "https://spire-server-b.citadel.io/bundle"
        # endpoint_spiffe_id belongs to the https_spiffe profile;
        # https_web authenticates the endpoint with Web PKI instead
        bundle_endpoint_profile "https_spiffe" {
            endpoint_spiffe_id = "spiffe://citadel.building-b/spire-server"
        }
    }

    federates_with "citadel.cloud" {
        bundle_endpoint_url = "https://spire-cloud.citadel.io/bundle"
        bundle_endpoint_profile "https_spiffe" {
            endpoint_spiffe_id = "spiffe://citadel.cloud/spire-server"
        }
    }
}

Cross-trust-domain authentication:

# Building A agent calls Building B service
x509_source = X509Source()

# Get my SVID (Building A)
my_svid = x509_source.get_x509_svid()
# spiffe://citadel.building-a/agent/security

# Get trust bundle for Building B
building_b_bundle = x509_source.get_bundle_for_trust_domain(
"citadel.building-b"
)

# Create mTLS channel to Building B
channel = grpc.secure_channel(
"service.building-b.citadel.io:8443",
grpc.ssl_channel_credentials(
root_certificates=building_b_bundle,
private_key=my_svid.private_key_bytes,
certificate_chain=my_svid.cert_chain_bytes
)
)
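With several federated domains, the bundle used to verify a peer has to be chosen by the peer's trust domain, and a peer from a domain that is not federated should be rejected outright. The sketch below shows that selection with placeholder bundle values; `bundle_for_peer` and the `FEDERATED_BUNDLES` mapping are hypothetical.

```python
from urllib.parse import urlsplit

# Placeholder values; real entries would be the PEM bundles from the Workload API
FEDERATED_BUNDLES = {
    "citadel.building-a": "<bundle for building A>",
    "citadel.building-b": "<bundle for building B>",
    "citadel.cloud": "<bundle for cloud>",
}


def bundle_for_peer(peer_spiffe_id, bundles=FEDERATED_BUNDLES):
    """Return the trust bundle for the peer's trust domain, or raise LookupError."""
    trust_domain = urlsplit(peer_spiffe_id).netloc
    if trust_domain not in bundles:
        raise LookupError(f"no federation with trust domain {trust_domain!r}")
    return bundles[trust_domain]


print(bundle_for_peer("spiffe://citadel.building-b/service/twin"))
```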

Observability

Identity events are fully auditable:

# Emit identity events
logger.info(
"SVID issued",
extra={
"spiffe_id": svid.spiffe_id,
"serial_number": svid.serial_number,
"expiry": svid.expiry,
"selectors": selectors
}
)

logger.info(
"mTLS connection established",
extra={
"client_spiffe_id": client_svid.spiffe_id,
"server_spiffe_id": server_svid.spiffe_id,
"peer_trust_domain": peer_trust_domain
}
)

logger.warning(
"Authorization denied",
extra={
"spiffe_id": svid.spiffe_id,
"action": action,
"target": target,
"policy": policy_name,
"reason": deny_reason
}
)
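Python's default log formatter does not print fields passed through `extra`, so the identity attributes above are lost unless a formatter emits them. The following is a minimal sketch of a JSON formatter that includes them; the `JsonFormatter` class name and the output field names `level` and `message` are assumptions.

```python
import json
import logging

# Attributes every LogRecord has; anything else arrived through `extra`
_STANDARD_ATTRS = set(logging.LogRecord("", 0, "", 0, "", (), None).__dict__)
_STANDARD_ATTRS |= {"message", "asctime"}


class JsonFormatter(logging.Formatter):
    """Render a record as one JSON object, including fields passed via extra."""

    def format(self, record):
        payload = {"level": record.levelname, "message": record.getMessage()}
        payload.update(
            {k: v for k, v in record.__dict__.items() if k not in _STANDARD_ATTRS}
        )
        # default=str covers non-JSON values such as datetimes or ID objects
        return json.dumps(payload, default=str)


handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
audit_logger = logging.getLogger("citadel.identity")
audit_logger.addHandler(handler)
audit_logger.warning(
    "Authorization denied",
    extra={"spiffe_id": "spiffe://citadel.mesh/agent/energy", "action": "unlock_door"},
)
```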

Deployment Architecture

Edge Deployment

# K3s DaemonSet for SPIRE Agent
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: spire-agent
  namespace: spire
spec:
  selector:
    matchLabels:
      app: spire-agent
  template:
    metadata:
      labels:
        app: spire-agent  # must match spec.selector
    spec:
      hostPID: true
      hostNetwork: true
      containers:
        - name: spire-agent
          image: ghcr.io/spiffe/spire-agent:1.8.0
          volumeMounts:
            - name: spire-agent-socket
              mountPath: /run/spire/sockets
            - name: spire-config
              mountPath: /etc/spire
      volumes:
        - name: spire-agent-socket
          hostPath:
            path: /run/spire/sockets
            type: DirectoryOrCreate
        - name: spire-config
          configMap:
            name: spire-agent  # assumed ConfigMap name; not given in the original

Workload Integration

# Security Agent deployment with SPIRE
apiVersion: apps/v1
kind: Deployment
metadata:
  name: security-agent
  namespace: citadel-agents
spec:
  selector:
    matchLabels:
      app: security-agent  # required for apps/v1; must match the pod labels
  template:
    metadata:
      labels:
        app: security-agent
    spec:
      serviceAccountName: security-agent
      containers:
        - name: agent
          image: citadel/security-agent:latest
          env:
            - name: SPIFFE_ENDPOINT_SOCKET
              value: unix:///run/spire/sockets/agent.sock
          volumeMounts:
            - name: spire-agent-socket
              mountPath: /run/spire/sockets
              readOnly: true
      volumes:
        - name: spire-agent-socket
          hostPath:
            path: /run/spire/sockets
Security Considerations

Threat Model

Threats Mitigated:

  • Credential theft (no long-lived secrets)
  • Man-in-the-middle (mTLS required)
  • Privilege escalation (least privilege)
  • Lateral movement (identity-based segmentation)

Residual Risks:

  • SPIRE Server compromise (rotate root CA)
  • Node compromise (limit blast radius with segmentation)
  • Supply chain attacks (signed container images, SBOMs)

Best Practices

  1. Minimize SVID TTL: 1 hour max; shorter for high-risk workloads
  2. Strict Selectors: Multiple selectors (namespace + SA + labels)
  3. Audit Everything: Log all SVID issuance and auth decisions
  4. Rotate Root CA: Annual root CA rotation with grace period
  5. Monitor Anomalies: Alert on unusual SVID requests or auth failures
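Practices 1 and 2 lend themselves to an automated check over registration entries before they are applied. The sketch below assumes entries are available as dictionaries shaped like the registration entry shown earlier; `lint_entry` and the two thresholds are illustrative choices, not part of SPIRE.

```python
MAX_TTL_SECONDS = 3600
REQUIRED_SELECTOR_PREFIXES = ("k8s:ns:", "k8s:sa:")


def lint_entry(entry):
    """Return a list of problems found in one registration entry."""
    problems = []
    if entry.get("ttl", 0) > MAX_TTL_SECONDS:
        problems.append(f"ttl {entry['ttl']}s exceeds {MAX_TTL_SECONDS}s")
    selectors = entry.get("selectors", [])
    for prefix in REQUIRED_SELECTOR_PREFIXES:
        if not any(s.startswith(prefix) for s in selectors):
            problems.append(f"no selector with prefix {prefix}")
    return problems


strict_entry = {
    "spiffe_id": "spiffe://citadel.mesh/agent/security",
    "selectors": [
        "k8s:ns:citadel-agents",
        "k8s:sa:security-agent",
        "k8s:pod-label:app:security-agent",
    ],
    "ttl": 3600,
}
loose_entry = {
    "spiffe_id": "spiffe://citadel.mesh/agent/energy",
    "selectors": ["k8s:ns:citadel-agents"],
    "ttl": 86400,
}

print(lint_entry(strict_entry))
print(lint_entry(loose_entry))
```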

See Also