Identity Foundation: SPIFFE/SPIRE
CitadelMesh implements zero-trust security using SPIFFE (Secure Production Identity Framework For Everyone) and SPIRE (SPIFFE Runtime Environment). Every workload, agent, and adapter receives a cryptographic identity that enables mutual authentication and authorization without shared secrets.
Zero-Trust Principles
Traditional building systems rely on network perimeter security: "If you're on the building network, you're trusted." This model fails in modern architectures with:
- Cloud connectivity
- Third-party integrations
- Mobile access
- Compromised credentials
Zero-trust requires:
- Verify explicitly: Authenticate every request based on identity
- Least privilege: Grant minimal access required for each workload
- Assume breach: Design for compromise; limit blast radius
SPIFFE Identity Model
SPIFFE ID Structure
Every workload receives a unique SPIFFE ID in URI format:
spiffe://trust-domain/path/to/workload
CitadelMesh SPIFFE IDs (all in the citadel.mesh trust domain):
spiffe://citadel.mesh/agent/security
spiffe://citadel.mesh/agent/energy
spiffe://citadel.mesh/adapter/bacnet
spiffe://citadel.mesh/adapter/security-expert
spiffe://citadel.mesh/service/twin
spiffe://citadel.mesh/service/opa-policy
Trust Domain (citadel.mesh): The security boundary; all workloads in this domain trust the same root CA
Workload Path (/agent/security): Hierarchical path identifying the specific workload
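The two components can be pulled apart with ordinary URI parsing. The following is a minimal sketch using only the Python standard library; the helper name `parse_spiffe_id` is hypothetical and not part of any SPIFFE library.

```python
from urllib.parse import urlsplit


def parse_spiffe_id(spiffe_id: str) -> tuple:
    """Split a SPIFFE ID into (trust_domain, workload_path)."""
    parts = urlsplit(spiffe_id)
    if parts.scheme != "spiffe":
        raise ValueError(f"not a SPIFFE ID: {spiffe_id!r}")
    if not parts.netloc:
        raise ValueError(f"missing trust domain: {spiffe_id!r}")
    # A SPIFFE ID carries no query, fragment, user info, or port.
    if parts.query or parts.fragment or "@" in parts.netloc or ":" in parts.netloc:
        raise ValueError(f"unexpected URI component: {spiffe_id!r}")
    return parts.netloc, parts.path


trust_domain, workload_path = parse_spiffe_id("spiffe://citadel.mesh/agent/security")
```

Here `trust_domain` holds the security boundary and `workload_path` the hierarchical workload name, so policy code can reason about each part separately.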
SPIFFE Verifiable Identity Document (SVID)
SPIRE issues SVIDs as proof of identity:
SVID = X.509 Certificate {
  Subject Alternative Name (URI): spiffe://citadel.mesh/agent/security
  Issuer: SPIRE Server CA
  Valid: 2025-09-30T12:00Z to 2025-09-30T13:00Z (1 hour)
  Key Usage: Digital Signature, Key Encipherment
  Extended Key Usage: Server Auth, Client Auth
}
Key Properties:
- Short-lived: 1-hour validity; automatic rotation
- Cryptographically signed: By SPIRE Server CA
- Mutual TLS: Used for both client and server authentication
- No shared secrets: Each workload gets unique keys
SPIRE Architecture
graph TB
    subgraph "Edge Node"
        Agent1[Security Agent]
        Agent2[Energy Agent]
        Adapter[BACnet Adapter]
        SpireAgent[SPIRE Agent]
        Agent1 -->|Request SVID| SpireAgent
        Agent2 -->|Request SVID| SpireAgent
        Adapter -->|Request SVID| SpireAgent
    end
    subgraph "SPIRE Server"
        Server[SPIRE Server]
        CA[Certificate Authority]
        Registry[Workload Registry]
        Server --> CA
        Server --> Registry
    end
    SpireAgent -->|Attest| Server
    Server -->|Issue SVID| SpireAgent
    subgraph "Cloud"
        CloudAgent[Cloud Services]
        CloudSpire[SPIRE Agent]
        CloudAgent --> CloudSpire
        CloudSpire --> Server
    end
SPIRE Server
Central authority for identity issuance:
- Certificate Authority: Root CA for trust domain
- Workload Registry: Defines which workloads can receive which identities
- Node Attestation: Verifies each SPIRE Agent's node before issuing SVIDs through it
- Federation: Trusts other SPIRE servers for multi-site deployments
SPIRE Agent
Runs on every node (edge and cloud):
- Workload API: Unix domain socket or named pipe for SVID requests
- Attestation: Proves node identity to SPIRE Server
- SVID Caching: Caches SVIDs and rotates before expiry
- Health Monitoring: Monitors workload lifecycle
Workload Attestation
How does SPIRE know a workload is legitimate before issuing an SVID?
Node Attestation
SPIRE Agent proves the node's identity:
Edge (K3s):
# SPIRE Agent config (fragment of the plugins block)
NodeAttestor "k8s_psat" {
    plugin_data {
        cluster = "citadel-edge-building-a"
    }
}
Cloud (Kubernetes):
NodeAttestor "k8s_sat" {
    plugin_data {
        cluster = "citadel-cloud-prod"
    }
}
Workload Attestation
The SPIRE Agent verifies workload identity by matching each workload against the selectors of registration entries held by the SPIRE Server:
# SPIRE Server registration entry
Entry {
    spiffe_id = "spiffe://citadel.mesh/agent/security"
    parent_id = "spiffe://citadel.mesh/node/edge-building-a"
    selectors = [
        "k8s:ns:citadel-agents",
        "k8s:sa:security-agent",
        "k8s:pod-label:app:security-agent"
    ]
    ttl = 3600
}
Selectors match Kubernetes pod metadata:
- Namespace: citadel-agents
- Service Account: security-agent
- Pod Label: app=security-agent
Only pods matching ALL selectors receive this SPIFFE ID.
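The all-selectors rule amounts to a subset check: an entry applies when every one of its selectors appears among the selectors discovered for the pod. The sketch below illustrates that logic only; `entry_matches` and the sample pod selector sets are hypothetical, and SPIRE's own matching lives inside the agent.

```python
def entry_matches(entry_selectors: set, workload_selectors: set) -> bool:
    """Return True when every selector on the entry is present on the workload."""
    return entry_selectors <= workload_selectors


# Selectors from the registration entry for the security agent.
ENTRY_SELECTORS = {
    "k8s:ns:citadel-agents",
    "k8s:sa:security-agent",
    "k8s:pod-label:app:security-agent",
}

# A pod may carry extra selectors; only the entry's selectors must all be present.
matching_pod = ENTRY_SELECTORS | {"k8s:pod-label:version:v2"}

# Same namespace and label, but a different service account: no match.
wrong_service_account = {
    "k8s:ns:citadel-agents",
    "k8s:sa:default",
    "k8s:pod-label:app:security-agent",
}
```

With these sets, `entry_matches(ENTRY_SELECTORS, matching_pod)` holds while `entry_matches(ENTRY_SELECTORS, wrong_service_account)` does not, which is why adding more selectors to an entry narrows who can obtain the identity.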
Identity-Based Authentication
mTLS Everywhere
All network communication uses mutual TLS with SPIFFE SVIDs:
import grpc
from spiffe import X509Source

# Initialize the SPIFFE Workload API client (run inside an async function)
x509_source = X509Source()
await x509_source.start()

# Get this workload's SVID
my_svid = x509_source.get_x509_svid()
print(f"My identity: {my_svid.spiffe_id}")

# Trust bundle used to verify peers in the citadel.mesh trust domain
trust_bundle = x509_source.get_bundle_for_trust_domain("citadel.mesh")

# Server credentials (verify client)
server_credentials = grpc.ssl_server_credentials(
    [(my_svid.private_key_bytes, my_svid.cert_chain_bytes)],
    root_certificates=trust_bundle,
    require_client_auth=True,
)

# Client credentials (verify server)
client_credentials = grpc.ssl_channel_credentials(
    root_certificates=trust_bundle,
    private_key=my_svid.private_key_bytes,
    certificate_chain=my_svid.cert_chain_bytes,
)

# Create authenticated channel
channel = grpc.secure_channel(
    "twin-service.citadel.svc:8443",
    client_credentials,
)
NATS JetStream with SPIFFE
Event bus authentication using SPIFFE:
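import ssl

import nats

# nats-py takes a standard ssl.SSLContext, which loads certificates from
# files. The PEM paths below are illustrative; they assume the SVID and trust
# bundle are written to disk (for example by a SPIFFE helper sidecar).
tls_context = ssl.create_default_context(
    ssl.Purpose.SERVER_AUTH,
    cafile="/run/spire/certs/bundle.pem",
)
tls_context.load_cert_chain(
    certfile="/run/spire/certs/svid.pem",
    keyfile="/run/spire/certs/svid_key.pem",
)

# Connect to NATS with mTLS (inside an async function). Hostname verification
# needs a matching DNS SAN on the server's SVID.
nc = await nats.connect(
    servers=["nats://nats.citadel.svc:4222"],
    tls=tls_context,
)

# Publish with authenticated identity; `event` is a protobuf message built elsewhere
await nc.publish(
    "telemetry.canonical.building_a",
    event.SerializeToString(),
)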
Authorization with SPIFFE
Identity alone isn't sufficient; we need authorization policies.
OPA Integration
OPA policies check SPIFFE IDs for authorization:
package citadel.authz

import rego.v1

# Default deny
default allow := false

# Energy agent can write HVAC setpoints
allow if {
    input.spiffe_id == "spiffe://citadel.mesh/agent/energy"
    input.action == "write_setpoint"
    startswith(input.target, "hvac.")
}

# Security agent can control doors
allow if {
    input.spiffe_id == "spiffe://citadel.mesh/agent/security"
    input.action in ["lock_door", "unlock_door"]
    startswith(input.target, "door.")
}

# Twin service can read everything
allow if {
    input.spiffe_id == "spiffe://citadel.mesh/service/twin"
    input.action == "read"
}

# Deny with reason
deny_reason := sprintf(
    "SPIFFE ID %s not authorized for action %s on %s",
    [input.spiffe_id, input.action, input.target]
) if not allow
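A caller can evaluate this policy through OPA's Data API, which accepts a POST to /v1/data/<package path>/<rule> with an `input` document. The sketch below uses only the standard library. The OPA address is an assumption (8181 is OPA's default port), and the helper names `build_authz_request` and `is_allowed` are hypothetical.

```python
import json
import urllib.request

# Assumed OPA address; adjust to wherever the policy service listens.
OPA_URL = "http://localhost:8181"


def build_authz_request(spiffe_id: str, action: str, target: str) -> urllib.request.Request:
    """Build the Data API request that evaluates citadel.authz.allow."""
    body = json.dumps(
        {"input": {"spiffe_id": spiffe_id, "action": action, "target": target}}
    ).encode("utf-8")
    return urllib.request.Request(
        f"{OPA_URL}/v1/data/citadel/authz/allow",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def is_allowed(spiffe_id: str, action: str, target: str) -> bool:
    """Ask OPA for a decision; anything other than an explicit true is a deny."""
    request = build_authz_request(spiffe_id, action, target)
    with urllib.request.urlopen(request, timeout=2) as response:
        # OPA omits "result" when the rule is undefined; treat that as deny.
        return json.load(response).get("result", False) is True
```

Under the policy above, `is_allowed("spiffe://citadel.mesh/agent/energy", "write_setpoint", "hvac.zone1.setpoint")` would be expected to return True, while the same call with the security agent's SPIFFE ID would be denied. The `spiffe_id` passed in should come from the peer's verified SVID, never from a value the caller supplies about itself.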
JWT Claims with SPIFFE
CloudEvents carry JWT tokens with SPIFFE-based claims:
import jwt
from datetime import datetime, timedelta, timezone

def create_capability_token(svid, capabilities):
    """Create signed JWT with SPIFFE identity and capabilities."""
    now = datetime.now(timezone.utc)
    payload = {
        "sub": str(svid.spiffe_id),  # spiffe://citadel.mesh/agent/energy
        "iss": "spiffe://citadel.mesh/service/opa-policy",
        "aud": ["citadel.control"],
        "exp": now + timedelta(minutes=5),
        "iat": now,
        "capabilities": capabilities,
        "constraints": {
            "temp_min": 65,
            "temp_max": 78
        }
    }
    # Sign with the OPA service's SVID key. `opa_svid` is assumed to be
    # loaded at module scope by the policy service. RS256 requires an RSA
    # key; use ES256 if the SVID key is EC P-256.
    token = jwt.encode(
        payload,
        opa_svid.private_key_bytes,
        algorithm="RS256"
    )
    return token
Command includes capability token:
command = Command(
    id=ulid(),
    target_id="hvac.zone1.setpoint",
    action="write_setpoint",
    params={"value": "72"},
    safety_token=create_capability_token(
        my_svid,
        capabilities=["hvac:write_setpoint"]
    ),
    issued_by=str(my_svid.spiffe_id)
)
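On the receiving side, an adapter would first verify the token's signature, expiry, and audience, then compare the decoded claims with the command itself. The sketch below covers only that second step and assumes the claims have already been decoded into a dict. The function name `command_permitted` is hypothetical; the claim names follow the token shown above.

```python
def command_permitted(claims: dict, issued_by: str, capability: str, value: float) -> bool:
    """Check decoded capability-token claims against the command they accompany."""
    # The token must have been minted for the workload that issued the command.
    if claims.get("sub") != issued_by:
        return False
    # The requested capability must be granted explicitly.
    if capability not in claims.get("capabilities", []):
        return False
    # Enforce the setpoint bounds carried in the token, if any.
    constraints = claims.get("constraints", {})
    low = constraints.get("temp_min", float("-inf"))
    high = constraints.get("temp_max", float("inf"))
    return low <= value <= high


example_claims = {
    "sub": "spiffe://citadel.mesh/agent/energy",
    "capabilities": ["hvac:write_setpoint"],
    "constraints": {"temp_min": 65, "temp_max": 78},
}
```

With `example_claims`, a setpoint of 72 from the energy agent passes, a setpoint of 85 fails the constraint check, and the same token presented by a different SPIFFE ID fails the subject check.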
SVID Rotation
SVIDs are short-lived (1 hour) and automatically rotate:
import asyncio
import logging

from spiffe import X509Source

logger = logging.getLogger(__name__)

async def maintain_identity():
    """Continuously monitor SVID rotation."""
    x509_source = X509Source()
    await x509_source.start()
    while True:
        svid = x509_source.get_x509_svid()
        # Log identity
        logger.info(f"Current SVID: {svid.spiffe_id}")
        logger.info(f"Expires: {svid.expiry}")
        # X509Source automatically rotates before expiry
        # Just sleep and let it handle rotation
        await asyncio.sleep(300)  # Check every 5 minutes
Rotation Flow:
- SPIRE Agent requests new SVID 30 minutes before expiry
- SPIRE Server checks that the workload's registration entry still exists
- Server signs a new SVID for a freshly generated key pair
- Agent provides new SVID via Workload API
- Workload updates mTLS connections with new SVID
- Old SVID remains valid until expiry (grace period)
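The timing in the first step can be expressed as a fraction of the SVID lifetime. The sketch below assumes renewal at a configurable fraction, where 0.5 reproduces the 30-minute mark for a 1-hour SVID. The function name `rotation_due` is hypothetical and the logic only illustrates the schedule, not SPIRE's internal implementation.

```python
from datetime import datetime, timedelta, timezone


def rotation_due(not_before: datetime, not_after: datetime, now: datetime,
                 fraction: float = 0.5) -> bool:
    """Return True once `fraction` of the SVID lifetime has elapsed."""
    lifetime = not_after - not_before
    return now >= not_before + lifetime * fraction


# A 1-hour SVID issued at 12:00 UTC.
issued = datetime(2025, 9, 30, 12, 0, tzinfo=timezone.utc)
expires = issued + timedelta(hours=1)
```

For this SVID, `rotation_due(issued, expires, issued + timedelta(minutes=29))` is False and the same call at 30 minutes is True, leaving the remaining half of the lifetime as the grace period in which the old SVID stays valid.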
Multi-Site Federation
Enterprise deployments span multiple buildings/sites:
graph TB
    subgraph "Building A"
        ServerA[SPIRE Server A<br/>citadel.building-a]
        AgentA1[Security Agent]
        AgentA2[Energy Agent]
        AgentA1 --> ServerA
        AgentA2 --> ServerA
    end
    subgraph "Building B"
        ServerB[SPIRE Server B<br/>citadel.building-b]
        AgentB1[Security Agent]
        AgentB2[Energy Agent]
        AgentB1 --> ServerB
        AgentB2 --> ServerB
    end
    subgraph "Cloud"
        ServerCloud[SPIRE Server Cloud<br/>citadel.cloud]
        ServiceTwin[Twin Service]
        ServiceOps[Ops Service]
        ServiceTwin --> ServerCloud
        ServiceOps --> ServerCloud
    end
    ServerA <-->|Federation Bundle| ServerB
    ServerA <-->|Federation Bundle| ServerCloud
    ServerB <-->|Federation Bundle| ServerCloud
Federation Configuration
# SPIRE Server A config
# endpoint_spiffe_id belongs to the https_spiffe profile; https_web relies on Web PKI instead
federates_with "citadel.building-b" {
    bundle_endpoint_url = "https://spire-server-b.citadel.io/bundle"
    bundle_endpoint_profile "https_spiffe" {
        endpoint_spiffe_id = "spiffe://citadel.building-b/spire-server"
    }
}
federates_with "citadel.cloud" {
    bundle_endpoint_url = "https://spire-cloud.citadel.io/bundle"
    bundle_endpoint_profile "https_spiffe" {
        endpoint_spiffe_id = "spiffe://citadel.cloud/spire-server"
    }
}
Cross-trust-domain authentication:
# Building A agent calls Building B service (run inside an async function)
x509_source = X509Source()
await x509_source.start()

# Get my SVID (Building A)
my_svid = x509_source.get_x509_svid()
# spiffe://citadel.building-a/agent/security

# Get trust bundle for Building B
building_b_bundle = x509_source.get_bundle_for_trust_domain(
    "citadel.building-b"
)

# Create mTLS channel to Building B
channel = grpc.secure_channel(
    "service.building-b.citadel.io:8443",
    grpc.ssl_channel_credentials(
        root_certificates=building_b_bundle,
        private_key=my_svid.private_key_bytes,
        certificate_chain=my_svid.cert_chain_bytes
    )
)
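A federated bundle only proves that the peer's certificate chains to a trusted root. The caller should still decide which identities in the foreign trust domain it accepts. The sketch below shows one way to express that as an allowlist of path prefixes per trust domain. The table contents and the name `peer_authorized` are hypothetical, and the peer's SPIFFE ID is assumed to have been extracted from its verified certificate.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist: which workload paths are accepted from each trust domain.
ALLOWED_PEERS = {
    "citadel.building-b": ("/service/",),
    "citadel.cloud": ("/service/",),
}


def peer_authorized(peer_spiffe_id: str) -> bool:
    """Accept a peer only if its trust domain and path prefix are allowlisted."""
    parts = urlsplit(peer_spiffe_id)
    if parts.scheme != "spiffe":
        return False
    prefixes = ALLOWED_PEERS.get(parts.netloc)
    if prefixes is None:
        # Unknown trust domain: reject even if the TLS handshake succeeded.
        return False
    return any(parts.path.startswith(prefix) for prefix in prefixes)
```

With this table, `spiffe://citadel.building-b/service/twin` is accepted, while an agent identity from the same building or any identity from an unlisted trust domain is rejected.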
Observability
Identity events are fully auditable:
# Emit identity events
logger.info(
    "SVID issued",
    extra={
        "spiffe_id": svid.spiffe_id,
        "serial_number": svid.serial_number,
        "expiry": svid.expiry,
        "selectors": selectors
    }
)
logger.info(
    "mTLS connection established",
    extra={
        "client_spiffe_id": client_svid.spiffe_id,
        "server_spiffe_id": server_svid.spiffe_id,
        "peer_trust_domain": peer_trust_domain
    }
)
logger.warning(
    "Authorization denied",
    extra={
        "spiffe_id": svid.spiffe_id,
        "action": action,
        "target": target,
        "policy": policy_name,
        "reason": deny_reason
    }
)
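Python's default log formatter does not print fields passed through `extra`, so the identity attributes above would be dropped unless a formatter emits them. The sketch below is one possible JSON formatter built on the standard library; the class name `JsonFormatter` and the logger name `citadel.audit` are hypothetical.

```python
import json
import logging

# Attributes present on every LogRecord; anything else was supplied via `extra`.
_STANDARD_ATTRS = set(vars(logging.LogRecord("", 0, "", 0, "", (), None)))
_STANDARD_ATTRS.update({"message", "asctime"})


class JsonFormatter(logging.Formatter):
    """Render a log record as one JSON object, including `extra` fields."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {"level": record.levelname, "message": record.getMessage()}
        for key, value in vars(record).items():
            if key not in _STANDARD_ATTRS:
                payload[key] = value
        # default=str keeps non-JSON values (datetimes, SPIFFE ID objects) printable.
        return json.dumps(payload, default=str)


handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
audit_logger = logging.getLogger("citadel.audit")
audit_logger.addHandler(handler)
audit_logger.setLevel(logging.INFO)
```

Each audit event then becomes a single JSON line carrying the SPIFFE ID, action, and target as separate keys, so a log pipeline can filter on them directly.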
Deployment Architecture
Edge Deployment
# K3s DaemonSet for SPIRE Agent
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: spire-agent
  namespace: spire
spec:
  selector:
    matchLabels:
      app: spire-agent
  template:
    metadata:
      labels:
        app: spire-agent
    spec:
      hostPID: true
      hostNetwork: true
      containers:
        - name: spire-agent
          image: ghcr.io/spiffe/spire-agent:1.8.0
          volumeMounts:
            - name: spire-agent-socket
              mountPath: /run/spire/sockets
            - name: spire-config
              mountPath: /etc/spire
      volumes:
        - name: spire-agent-socket
          hostPath:
            path: /run/spire/sockets
            type: DirectoryOrCreate
        # ConfigMap holding the agent configuration (name is an assumption)
        - name: spire-config
          configMap:
            name: spire-agent
Workload Integration
# Security Agent deployment with SPIRE
apiVersion: apps/v1
kind: Deployment
metadata:
  name: security-agent
  namespace: citadel-agents
spec:
  selector:
    matchLabels:
      app: security-agent
  template:
    metadata:
      labels:
        app: security-agent
    spec:
      serviceAccountName: security-agent
      containers:
        - name: agent
          image: citadel/security-agent:latest
          env:
            - name: SPIFFE_ENDPOINT_SOCKET
              value: unix:///run/spire/sockets/agent.sock
          volumeMounts:
            - name: spire-agent-socket
              mountPath: /run/spire/sockets
              readOnly: true
      volumes:
        - name: spire-agent-socket
          hostPath:
            path: /run/spire/sockets
Security Considerations
Threat Model
Threats Mitigated:
- Credential theft (no long-lived secrets)
- Man-in-the-middle (mTLS required)
- Privilege escalation (least privilege)
- Lateral movement (identity-based segmentation)
Residual Risks:
- SPIRE Server compromise (rotate root CA)
- Node compromise (limit blast radius with segmentation)
- Supply chain attacks (signed container images, SBOMs)
Best Practices
- Minimize SVID TTL: 1 hour max; shorter for high-risk workloads
- Strict Selectors: Multiple selectors (namespace + SA + labels)
- Audit Everything: Log all SVID issuance and auth decisions
- Rotate Root CA: Annual root CA rotation with grace period
- Monitor Anomalies: Alert on unusual SVID requests or auth failures
Related Documentation
- Protocol Strategy - SPIFFE IDs in CloudEvents
- Safety Guardrails - Identity-based authorization
- Edge Architecture - SPIRE deployment on K3s
- Observability - Identity audit logging