Digital Twin Evolution Architecture
From Digital Thread to full Digital Twin per ISO/IEC 30173:2025
Table of Contents
- Overview
- Twin Model & Twin Instance (ISO 23247 Alignment)
- Semantic Architecture Patterns
- Synchronization Architecture (L3)
- Behavioral Simulation Layer (L4)
- Simulation Orchestration
- 3D Viewer Integration
- Repository Structure
- Phased Delivery
- Technology Stack by Layer
- Related Documents
Overview
MetaForge’s Digital Twin evolves through four layers. Phase 1 (MVP) delivers the Digital Thread — the artifact graph that provides traceability across the product lifecycle. Subsequent phases add device synchronization, behavioral simulation, and fleet intelligence to achieve a full Digital Twin per ISO/IEC 30173:2025.
| Layer | Name | Phase | Description |
|---|---|---|---|
| L1 | Digital Thread | P1 (MVP) | Artifact graph, traceability, versioning, constraints |
| L2 | Operational Twin | P2 | Post-manufacturing: test telemetry, TSDB, field data ingestion |
| L3 | Live Twin | P3 | Real-time device synchronization via MQTT/OPC-UA |
| L4 | Simulation Twin | P3-P4 | Behavioral models, what-if analysis, predictive simulation |
Twin Model & Twin Instance (ISO 23247 Alignment)
Per ISO 23247, a Digital Twin system distinguishes between Twin Models (product definitions) and Twin Instances (specific deployed devices).
Twin Model = the product definition (design-time). One per product version. Twin Instance = a specific deployed device. Many per model.
TwinModel: "DroneFlightController-v2.1"
├── DesignElements (schematic, PCB, enclosure, firmware)
├── BOM (components, suppliers)
├── Constraints (power budget, thermal limits, pin assignments)
├── BehavioralModels (battery discharge, thermal propagation, motor response)
└── TestProcedures (EVT/DVT/PVT)
DeviceInstance: "FC-SN-20260301-001"
├── INSTANCE_OF → TwinModel "DroneFlightController-v2.1"
├── firmwareVersion: "v2.1.3"
├── status: "active"
├── location: "test-lab-B"
├── TelemetrySources:
│ ├── IMU (accelerometer, gyroscope) @ 100 Hz
│ ├── Power (voltage, current) @ 10 Hz
│ └── Temperature (board sensors) @ 1 Hz
└── SimulationState:
├── predictedFlightTime: 18.5 min
├── thermalMargin: 12.3 C
└── batteryHealth: 94%
Semantic Architecture Patterns
CQRS — Command Query Responsibility Segregation
The Digital Twin API separates mutation intake from query resolution. Commands (writes) go through the event-sourced pipeline with validation, constraint checking, and approval workflows. Queries (reads) hit optimized projections directly.
Commands (mutations):
Agent → Twin API → Validate → Constraint Check → Event → Kafka → Neo4j projection
Queries (reads):
Dashboard/Agent → Twin API → Neo4j (direct read) → Response
Telemetry query → Twin API → TSDB (direct read) → Response
This separation enables:
- Independent scaling — read-heavy workloads (dashboards, agents querying state) scale separately from write-heavy workloads (design iterations, telemetry ingestion)
- Optimistic UI — the dashboard can show pending mutations before they commit
- Projection diversity — the same event stream can power Neo4j, search indexes, and materialized views without coupling
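The split can be sketched in a few lines. Everything here is illustrative, not the actual Twin API: the class and method names are invented, and a single `"name" in payload` check stands in for the full validation/constraint/approval pipeline.

```python
from dataclasses import dataclass, field

@dataclass
class Event:
    entity_id: str
    payload: dict

@dataclass
class TwinStore:
    """Toy CQRS store: commands append to an event stream; queries read a projection."""
    events: list = field(default_factory=list)       # write side (append-only)
    projection: dict = field(default_factory=dict)   # read side (optimized view)

    def handle_command(self, entity_id: str, payload: dict) -> bool:
        # Stand-in for validation + constraint checking
        if "name" not in payload:
            return False                             # rejected mutation
        self.events.append(Event(entity_id, payload))  # durable event stream
        self.projection[entity_id] = payload           # project into read model
        return True

    def query(self, entity_id: str):
        # Reads never touch the event stream, only the projection
        return self.projection.get(entity_id)

store = TwinStore()
store.handle_command("mcu-1", {"name": "flight-controller-mcu"})
print(store.query("mcu-1"))
```

The same event list could feed multiple projections (graph, search index, materialized view), which is the "projection diversity" property above.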
Dual State Machine
The Digital Twin maintains two independent but linked state tracks:
| Track | What it holds | Versioned by | Update frequency |
|---|---|---|---|
| Semantic State | Entity graph — requirements, constraints, relationships, behavioral parameters | Event stream version | Every mutation |
| Artifact State | Projected files — STEP, KiCad, Gerbers, firmware binaries, BOMs | Git commit SHA | On explicit projection |
stateDiagram-v2
direction LR
state "Semantic State" as SS {
[*] --> Mutate: Agent proposes change
Mutate --> Validate: Constraint engine
Validate --> Committed: Pass
Validate --> Rejected: Fail
Committed --> Mutate: Next iteration
}
state "Artifact State" as AS {
[*] --> Project: MCP projection triggered
Project --> Generated: Tool produces files
Generated --> GitCommit: Files committed
GitCommit --> Linked: Commit linked to semantic version
}
SS --> AS: Explicit projection trigger
AS --> SS: Drift detection (file → semantic sync)
Key insight: Semantic iterations can proceed without artifact projection. An agent can refine constraints, rebalance power budgets, and iterate on component selection — all as semantic mutations — before projecting the final state into KiCad files. This enables fast design exploration without the overhead of regenerating tool-specific files at every step.
The AS → SS drift detection arrow in the diagram above represents bidirectional synchronization between the file world and the graph world. The full specification of this drift detection layer, including the change detection pipeline (watchfiles → adapter parser → ingest), the periodic reconciler, and the associated event types (drift.detected, drift.resolved), is defined in Assistant Mode.
The dual state machine links through commit-to-version mapping:
from datetime import datetime

from pydantic import BaseModel

class StateLink(BaseModel):
    semantic_version: int   # Event stream version at projection time
    git_commit: str         # SHA of the artifact commit
    projected_at: datetime  # When projection occurred
    projected_by: str       # Agent or user who triggered projection
    artifacts: list[str]    # Artifact paths included in this projection
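Resolving a semantic version back to its artifact commit might look like this. The helper `commit_for_version`, the stdlib dataclass stand-in for the Pydantic model, and the SHAs are all illustrative, not part of the source specification:

```python
from dataclasses import dataclass

@dataclass
class StateLink:  # stdlib stand-in for the Pydantic StateLink above
    semantic_version: int
    git_commit: str

def commit_for_version(links: list[StateLink], version: int):
    """Return the artifact commit of the newest projection at or before
    the given semantic version (illustrative helper)."""
    candidates = [l for l in links if l.semantic_version <= version]
    if not candidates:
        return None  # no projection exists this early in the stream
    return max(candidates, key=lambda l: l.semantic_version).git_commit

links = [StateLink(10, "a1b2c3d"), StateLink(42, "d4e5f6a")]
print(commit_for_version(links, 30))  # semantic v30 was last projected at v10
```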
Artifact Graph Layer
Every entity in the semantic graph maps to zero or more artifact files. The artifact graph tracks this mapping and the tool-specific export lineage.
| Concept | Description |
|---|---|
| Entity-to-file mapping | Which semantic entities project into which files (e.g., BOMItem:C42 → bom/bom.csv row 42, DesignElement:MCU → eda/kicad/board.kicad_sch component U1) |
| Dependency graph | Which files depend on which entities — enables targeted re-projection when a subset of the graph changes |
| Tool export lineage | Which MCP adapter produced each file, with what version, at what semantic state — enables reproducible regeneration |
from typing import Optional

from pydantic import BaseModel

class ArtifactMapping(BaseModel):
    entity_id: str                         # Neo4j node ID
    entity_type: str                       # "BOMItem", "DesignElement", etc.
    artifact_path: str                     # Relative path in project
    artifact_region: Optional[str] = None  # Sub-file locator (row, component ref, section)
    adapter: str                           # MCP adapter that produced this mapping
    adapter_version: str                   # Adapter version for reproducibility
    semantic_version: int                  # Semantic state version at projection time
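Targeted re-projection from the dependency graph can be sketched as follows. This uses a stdlib stand-in for the Pydantic model; the function name and the idea of passing a set of changed entity IDs are assumptions, though the entity/path examples mirror the table above:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ArtifactMapping:  # minimal stand-in for the Pydantic model above
    entity_id: str
    artifact_path: str

def stale_artifacts(mappings: list[ArtifactMapping], changed: set[str]) -> set[str]:
    """Files whose source entities changed since the last projection:
    the minimal set that needs re-projection."""
    return {m.artifact_path for m in mappings if m.entity_id in changed}

mappings = [
    ArtifactMapping("BOMItem:C42", "bom/bom.csv"),
    ArtifactMapping("DesignElement:MCU", "eda/kicad/board.kicad_sch"),
    ArtifactMapping("DesignElement:MCU", "bom/bom.csv"),
]
print(stale_artifacts(mappings, {"DesignElement:MCU"}))
```

Only the files touched by the changed entities are regenerated; untouched artifacts keep their existing commit lineage.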
Visualization as Projection
The 3D viewer and all dashboard panels are projections of twin state — they never compute domain logic. The UI subscribes to twin state changes and renders them. This means:
- The dashboard is a read-only view of the semantic graph and its artifact projections
- Selection, highlighting, filtering, and annotation happen client-side against cached twin state
- Live overlays (L3) subscribe to the same telemetry WebSocket that feeds the TSDB
- No engineering calculation ever runs in the browser — all computation happens in agents/skills on the backend
Synchronization Architecture (L3)
The synchronization layer connects physical devices to their digital twins through MQTT, Kafka, and a telemetry routing pipeline.
flowchart TD
DEV["Physical Device"] -->|"sensor data"| MQTT["MQTT Broker<br/>Mosquitto / EMQX"]
MQTT --> KAFKA_T["Kafka: device.telemetry"]
KAFKA_T --> ROUTER["Telemetry Router"]
ROUTER --> TSDB["TSDB Writer<br/>InfluxDB"]
ROUTER --> ANOMALY["Anomaly Detector<br/>Threshold checks"]
ROUTER --> AGG["State Aggregator<br/>Derive device state"]
AGG -->|"state changes only"| KAFKA_S["Kafka: device.state"]
KAFKA_S --> NEO4J["Neo4j<br/>DeviceInstance updated"]
NEO4J --> WS["WebSocket Broadcast<br/>Dashboard"]
WS --> VIEWER["3D Viewer<br/>Live overlays"]
ORCH["Orchestrator"] -->|"firmware update"| OTA["OTA Channel"]
OTA --> DEV
style DEV fill:#E67E22,color:#fff
style TSDB fill:#3498db,color:#fff
style NEO4J fill:#2C3E50,color:#fff
style VIEWER fill:#27ae60,color:#fff
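The router's fan-out can be sketched as a pure function. The sink names, the limits format, and the "any anomaly means fault" state rule are illustrative assumptions; the production router writes to the TSDB, anomaly detector, and state aggregator shown in the diagram:

```python
def route_sample(sample: dict, limits: dict[str, tuple[float, float]]) -> dict:
    """Fan one telemetry sample out to the three router sinks (sketch)."""
    readings = sample["readings"]
    # TSDB sink: every reading is written, unconditionally
    tsdb_writes = readings
    # Anomaly sink: threshold checks against configured limits
    anomalies = {
        k: v for k, v in readings.items()
        if k in limits and not (limits[k][0] <= v <= limits[k][1])
    }
    # Aggregator sink: derive a coarse device state (assumed rule)
    state = "fault" if anomalies else "active"
    return {"tsdb": tsdb_writes, "anomalies": anomalies, "state": state}

sample = {"device": "FC-SN-20260301-001",
          "readings": {"voltage_v": 9.8, "temp_c": 71.0}}
limits = {"voltage_v": (9.0, 12.6), "temp_c": (-10.0, 60.0)}
print(route_sample(sample, limits))
```

Only the derived `state` would flow onward to `device.state` and Neo4j; the raw readings stop at the TSDB, which is what keeps the graph write rate low.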
Sync Policy
Synchronization is configurable per device or fleet group:
from typing import Literal, Optional

from pydantic import BaseModel

class SyncPolicy(BaseModel):
    mode: Literal["event-driven", "periodic", "streaming"]
    sample_rate_hz: Optional[float] = None    # For streaming mode
    batch_interval_s: Optional[float] = None  # For periodic mode
    staleness_threshold_s: float = 60.0       # Alert if no data for this long
    retention_days: int = 90                  # TSDB retention
Sync Mode Use Cases
| Use Case | Mode | Rate | Latency |
|---|---|---|---|
| Design-time artifact changes | event-driven | N/A | seconds |
| Lab test equipment during EVT | periodic | 1 Hz | < 1s |
| Drone in-flight telemetry | streaming | 100 Hz | < 10ms |
| Supply chain risk monitoring | event-driven | N/A | minutes |
| Fleet health dashboard | periodic | 0.1 Hz | < 10s |
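Mode-specific field requirements are implied by the comments on the model above; a validation sketch (the rules and `validate_policy` helper are assumptions, using a stdlib dataclass stand-in for the Pydantic model):

```python
from dataclasses import dataclass
from typing import Literal, Optional

@dataclass
class SyncPolicy:  # stdlib stand-in for the Pydantic SyncPolicy above
    mode: Literal["event-driven", "periodic", "streaming"]
    sample_rate_hz: Optional[float] = None
    batch_interval_s: Optional[float] = None
    staleness_threshold_s: float = 60.0
    retention_days: int = 90

def validate_policy(p: SyncPolicy) -> list[str]:
    """Check that the fields each mode depends on are actually set."""
    errors = []
    if p.mode == "streaming" and p.sample_rate_hz is None:
        errors.append("streaming mode requires sample_rate_hz")
    if p.mode == "periodic" and p.batch_interval_s is None:
        errors.append("periodic mode requires batch_interval_s")
    return errors

# In-flight drone telemetry row from the table above: 100 Hz streaming
drone = SyncPolicy(mode="streaming", sample_rate_hz=100.0, staleness_threshold_s=1.0)
print(validate_policy(drone))
print(validate_policy(SyncPolicy(mode="periodic")))  # missing batch_interval_s
```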
Behavioral Simulation Layer (L4)
Behavioral models are lightweight mathematical models that predict device behavior without running full SPICE/FEA simulations. Full simulations (SPICE, CalculiX, OpenFOAM) run during design; behavioral models distill the results into fast-running equations for runtime prediction.
Behavioral Model Definition
from typing import Literal, Optional

from pydantic import BaseModel

class ModelPort(BaseModel):
    name: str
    unit: str  # SI unit: "A", "V", "degC", "Pa", "m/s"
    schema: Literal["float", "int", "bool", "array"]

class BehavioralModelDef(BaseModel):
    """Stored in Neo4j as BehavioralModel node."""
    id: str
    name: str
    model_type: Literal[
        "differential-equation",  # dV/dt = f(I, T)
        "transfer-function",      # H(s) = num/den
        "thermal-rc",             # R-C thermal network
        "lookup-table",           # Interpolated from test data
        "state-machine",          # Discrete state transitions
    ]
    domain: str                               # "electrical", "thermal", "mechanical"
    inputs: list[ModelPort]                   # Named inputs with units
    outputs: list[ModelPort]                  # Named outputs with units
    parameters: dict                          # Model constants (from calibration)
    calibration_source: Optional[str] = None  # TestExecution ID that calibrated this
    valid_range: Optional[dict] = None        # Input ranges where model is accurate
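Of the model types above, the lookup-table runner is the simplest to sketch: piecewise-linear interpolation over calibrated (input, output) pairs. The clamping behavior outside the calibrated range is an assumption, not something the document specifies:

```python
from bisect import bisect_left

def interpolate(table: list[tuple[float, float]], x: float) -> float:
    """Piecewise-linear interpolation over calibration points,
    a minimal runner for the "lookup-table" model type (sketch).
    Clamps to the endpoints outside the calibrated range."""
    xs = [p[0] for p in table]
    if x <= xs[0]:
        return table[0][1]   # clamp below range
    if x >= xs[-1]:
        return table[-1][1]  # clamp above range
    i = bisect_left(xs, x)
    x0, y0 = table[i - 1]
    x1, y1 = table[i]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)

# Hypothetical calibration: pack voltage vs. depth-of-discharge percent
table = [(0.0, 12.6), (50.0, 11.1), (100.0, 9.0)]
print(interpolate(table, 25.0))
```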
Example: Battery Discharge Model
Calibrated from EVT test data:
behavioral_model:
  name: "battery-discharge-3s-lipo"
  model_type: "differential-equation"
  domain: "electrical"
  inputs:
    - name: "load_current_a"
      unit: "A"
    - name: "ambient_temp_c"
      unit: "degC"
  outputs:
    - name: "voltage_v"
      unit: "V"
    - name: "remaining_capacity_pct"
      unit: "%"
  parameters:
    nominal_voltage: 11.1
    capacity_ah: 2.2
    internal_resistance_ohm: 0.045
    temp_coefficient: -0.003  # V/degC
  equation: |
    dV/dt = -(I / C) - R_int * dI/dt - k_temp * (T - 25)
  calibration_source: "test-exec-evt-battery-001"
  valid_range:
    load_current_a: [0, 30]
    ambient_temp_c: [-10, 60]
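Evaluating this model might look like the sketch below. The document does not specify the calibrated discretization, so this is one plausible reading: the capacity term of the equation is integrated with forward Euler under constant load (so dI/dt = 0), while the IR drop and thermal coefficient are applied as static offsets to the terminal voltage. The parameters are copied from the YAML above:

```python
def simulate_discharge(load_current_a: float, ambient_temp_c: float,
                       duration_s: float, dt_s: float = 1.0) -> tuple[float, float]:
    """Sketch: returns (terminal voltage in V, remaining capacity in %)."""
    nominal_v = 11.1            # parameters.nominal_voltage
    capacity_as = 2.2 * 3600.0  # parameters.capacity_ah in amp-seconds
    r_int = 0.045               # parameters.internal_resistance_ohm
    k_temp = -0.003             # parameters.temp_coefficient, V/degC
    v_open = nominal_v
    t = 0.0
    while t < duration_s:
        # Forward-Euler step of the capacity term: dV/dt = -(I / C)
        v_open -= (load_current_a / capacity_as) * dt_s
        t += dt_s
    # IR drop and thermal coefficient applied as offsets (assumption)
    v_terminal = v_open - load_current_a * r_int + k_temp * (ambient_temp_c - 25.0)
    drawn_pct = 100.0 * load_current_a * duration_s / capacity_as
    return v_terminal, max(0.0, 100.0 - drawn_pct)

# 10 A load at 25 degC for 396 s drains half the 2.2 Ah pack
v, pct = simulate_discharge(10.0, 25.0, duration_s=396.0)
print(v, pct)
```

At 396 s and 10 A the sketch reports roughly 10.15 V and 50% remaining; raising the ambient input lowers the predicted voltage via the negative temperature coefficient.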
Simulation Orchestration
flowchart TD
TRIGGER["Trigger<br/>manual / scheduled / anomaly"] --> CREATE["Create SimulationRun node"]
CREATE --> FETCH_STATE["Fetch DeviceInstance state<br/>from Neo4j"]
FETCH_STATE --> FETCH_TELEM["Fetch latest telemetry<br/>from TSDB"]
FETCH_TELEM --> LOAD["Load BehavioralModel(s)<br/>for this TwinModel"]
LOAD --> RUN["Run simulation<br/>in-process or containerized"]
RUN --> STORE["Store results in<br/>SimulationRun node"]
STORE --> COMPARE["Compare prediction vs actual<br/>Update accuracy metrics"]
COMPARE --> CHECK{Divergence > threshold?}
CHECK -->|Yes| ALERT["Trigger CAPA or<br/>maintenance alert"]
CHECK -->|No| DONE["Done"]
style TRIGGER fill:#E67E22,color:#fff
style ALERT fill:#e74c3c,color:#fff
style DONE fill:#27ae60,color:#fff
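The "compare prediction vs actual" and divergence-check steps at the end of the flow can be sketched as below. The metric (mean absolute percentage error) and the 5% default threshold are assumptions; the document does not pin down the divergence measure:

```python
def check_divergence(predicted: list[float], actual: list[float],
                     threshold_pct: float = 5.0) -> dict:
    """Compare a predicted series against measured telemetry (sketch).
    Exceeding the threshold would trigger the CAPA / maintenance alert."""
    errors = [abs(p - a) / abs(a) * 100.0 for p, a in zip(predicted, actual)]
    mape = sum(errors) / len(errors)  # mean absolute percentage error
    return {"mape_pct": mape, "alert": mape > threshold_pct}

# Predicted vs. measured pack voltage over three samples
result = check_divergence([11.0, 10.8, 10.5], [11.1, 10.9, 10.2])
print(result)
```

The computed error would also feed the model's accuracy metrics, so calibration quality is tracked across runs.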
What-If Analysis
The ability to run hypothetical scenarios against the digital twin:
# "What happens if ambient temperature rises to 50C?"
what_if_result = await simulation_engine.run_what_if(
    device_id="FC-SN-20260301-001",
    overrides={"ambient_temp_c": 50.0},
    duration_s=3600,
    models=["battery-discharge", "thermal-propagation"],
)
# Returns: predicted voltage curve, thermal margins, time-to-throttle
Use cases:
- “What’s the flight time at 40C ambient vs 20C?”
- “If we switch to a 3000mAh battery, how does thermal behavior change?”
- “What load current causes thermal throttling within 10 minutes?”
- “Will the new firmware’s increased sampling rate cause battery issues?”
3D Viewer Integration
ModelManifest API
Endpoint: GET /api/v1/projects/{project_id}/model-manifest
from datetime import datetime
from typing import Optional

from pydantic import BaseModel

class PartNode(BaseModel):
    id: str                          # Neo4j node ID
    name: str
    mesh_name: Optional[str] = None  # Three.js mesh name in GLB
    children: list["PartNode"]
    node_type: str                   # "assembly", "pcb", "component", "enclosure"

class ModelManifest(BaseModel):
    artifact_id: str                  # MinIO artifact ID for the GLB file
    glb_url: str                      # Presigned MinIO URL (expires in 1h)
    mesh_to_node_map: dict[str, str]  # Three.js mesh name → Neo4j node ID
    component_tree: list[PartNode]
    last_updated: datetime
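Viewer-side selection amounts to walking `component_tree` from a clicked GLB mesh back to its semantic node. A sketch using a stdlib stand-in for the Pydantic model; the `find_by_mesh` helper and the sample tree are illustrative:

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class PartNode:  # stdlib stand-in for the Pydantic PartNode above
    id: str
    name: str
    mesh_name: Optional[str] = None
    children: list["PartNode"] = field(default_factory=list)

def find_by_mesh(root: PartNode, mesh_name: str) -> Optional[PartNode]:
    """Resolve a clicked Three.js mesh to its semantic node by
    depth-first walk of the component tree (sketch)."""
    if root.mesh_name == mesh_name:
        return root
    for child in root.children:
        hit = find_by_mesh(child, mesh_name)
        if hit is not None:
            return hit
    return None

tree = PartNode("asm-1", "drone", children=[
    PartNode("pcb-1", "flight-controller", mesh_name="fc_board"),
    PartNode("enc-1", "enclosure", mesh_name="shell"),
])
print(find_by_mesh(tree, "fc_board").id)
```

In practice `mesh_to_node_map` gives the same answer in O(1); the tree walk is what the viewer needs for hierarchical selection (click a child, highlight its assembly).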
Part Detail API
Endpoint: GET /api/v1/bom/items/{node_id}
Returns real BOM data from the graph, replacing the hardcoded demo data in the current 3D viewer.
STEP Ingestion Pipeline
STEP upload → MinIO stores original → Worker converts to GLB
→ GLB stored in MinIO → Agent parses STEP assembly tree → DesignElement nodes in graph
→ Mesh-to-node mapping recorded → ModelManifest available
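The "MinIO stores original" step pairs naturally with content-addressable artifact IDs, since the technology stack below describes MinIO storage as content-addressable. A sketch assuming SHA-256 keys (the `sha256:` prefix convention is an assumption):

```python
import hashlib

def artifact_id(data: bytes) -> str:
    """Content-addressable artifact ID for an uploaded file (sketch).
    Identical STEP uploads hash to the same ID, so storage is
    deduplicated and IDs are stable across re-uploads."""
    return "sha256:" + hashlib.sha256(data).hexdigest()

step_bytes = b"ISO-10303-21;..."  # placeholder for real STEP file contents
print(artifact_id(step_bytes))
```

Each downstream stage (GLB, assembly tree, mesh-to-node mapping) can then reference the original by this immutable ID rather than by mutable path.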
Live State Overlay (L3)
When a DeviceInstance is selected in the 3D viewer, the viewer subscribes to its telemetry WebSocket and overlays live state:
// WebSocket subscription for live device state
interface LiveStateOverlay {
deviceId: string;
meshOverlays: {
meshName: string;
colorMap: "thermal" | "stress" | "vibration";
values: number[]; // Per-vertex or per-face values from telemetry/simulation
}[];
annotations: {
position: [number, number, number];
label: string;
value: string;
status: "nominal" | "warning" | "critical";
}[];
}
Repository Structure
The codebase is structured to mirror the layered architecture:
digital_twin/
├── thread/ # L1: Digital Thread (P1)
│ ├── models/
│ │ ├── artifact.py
│ │ ├── constraint.py
│ │ ├── relationship.py
│ │ └── version.py
│ ├── graph_engine.py
│ ├── versioning/
│ ├── constraint_engine/
│ ├── validation_engine/
│ └── api.py
├── models/ # Twin Model definitions (P2)
│ ├── twin_model.py
│ ├── behavioral_model.py
│ └── model_registry.py
├── knowledge/ # Knowledge Layer: AI memory (P1.5)
│ ├── store.py # pgvector CRUD + semantic search
│ ├── embedding_service.py # Local + cloud embedding abstraction
│ ├── consumer.py # Kafka event processor
│ ├── reconciler.py # Periodic reconciliation
│ ├── chunker.py # Document chunking
│ ├── templates.py # Embedding content templates
│ ├── models.py # Pydantic models
│ └── api.py # Knowledge query REST API
├── sync/ # L3: Device synchronization (P3)
│ ├── mqtt_adapter.py
│ ├── telemetry_router.py
│ ├── state_aggregator.py
│ ├── sync_policy.py
│ └── device_registry.py
├── simulation/ # L4: Behavioral simulation (P3-P4)
│ ├── engine.py
│ ├── model_runner.py
│ ├── what_if.py
│ ├── calibration.py
│ └── models/ # Built-in behavioral model implementations
│ ├── differential_eq.py
│ ├── transfer_function.py
│ ├── thermal_rc.py
│ └── lookup_table.py
└── events.py # Shared event schemas (all layers)
Phased Delivery
| Phase | Deliverable | Timeline | Key Capability |
|---|---|---|---|
| P1 (MVP) | Digital Thread (L1) | Months 1-6 | Artifact graph, traceability, constraints, 3D viewer |
| P1.5 | Knowledge Layer | Months 5-6 | pgvector setup, local embedding, retrieve_knowledge skill, decision + session indexing |
| P2 | Operational Twin (L2) | Months 7-12 | TSDB for test results, CAPA workflow, device provisioning, fleet registry |
| P2.5 | Twin Model schema | Month 10 | BehavioralModel node type, model definition format, calibration from test data |
| P3 | Live Twin (L3) | Months 13-18 | MQTT ingestion, real-time sync, live 3D overlays, anomaly detection |
| P3.5 | Simulation Twin (L4) | Months 16-20 | Behavioral model execution, what-if analysis, prediction vs actual |
| P4 | Fleet Intelligence | Months 21-24+ | Multi-device analytics, fleet-wide predictions, HiL integration |
Phase Dependencies
flowchart LR
P1["P1: Digital Thread<br/>Months 1-6"] --> P15["P1.5: Knowledge Layer<br/>Months 5-6"]
P15 --> P2["P2: Operational Twin<br/>Months 7-12"]
P2 --> P25["P2.5: Twin Model Schema<br/>Month 10"]
P25 --> P3["P3: Live Twin<br/>Months 13-18"]
P3 --> P35["P3.5: Simulation Twin<br/>Months 16-20"]
P35 --> P4["P4: Fleet Intelligence<br/>Months 21-24+"]
style P1 fill:#27ae60,color:#fff
style P15 fill:#2ecc71,color:#fff
style P2 fill:#3498db,color:#fff
style P25 fill:#3498db,color:#fff
style P3 fill:#9b59b6,color:#fff
style P35 fill:#9b59b6,color:#fff
style P4 fill:#E67E22,color:#fff
Technology Stack by Layer
| Layer | Component | Technology | Rationale |
|---|---|---|---|
| L1 | Graph database | Neo4j | Relationship-first queries, ACID |
| L1 | Object storage | MinIO | S3-compatible, content-addressable |
| L1 | Event bus | Kafka | Durable append-only streams |
| L1 | API framework | FastAPI (Python) | OpenAPI, async, Pydantic validation |
| L2 | Time-series DB | InfluxDB / TimescaleDB | Optimized for high-frequency sensor data |
| L3 | MQTT broker | Mosquitto / EMQX | Lightweight IoT messaging |
| L3 | Telemetry routing | Python (aiomqtt + kafka-python) | Mature MQTT/Kafka libraries |
| L4 | Simulation runtime | Python (NumPy/SciPy) | Scientific computing ecosystem |
| L4 | Heavy simulations | Containerized (CalculiX, OpenFOAM) | Isolated, reproducible |
| All | Vector store | pgvector (PostgreSQL extension) | Semantic search over knowledge; zero new infra (reuses existing PostgreSQL) |
| All | Embedding models | all-MiniLM-L6-v2 (local) / text-embedding-3-small (cloud) | Local-first embedding; cloud option for higher quality |
| All | Observability | OpenTelemetry SDK + Prometheus + Grafana Tempo + Grafana Loki + Grafana | Three-pillar observability (logs, metrics, traces) with SLO/SLI framework. See System Observability |
| All | Dashboard | React + TypeScript | Component-based, Three.js integration |
Related Documents
| Document | Description |
|---|---|
| Graph Schema | Node types and relationships for all layers |
| Event Sourcing | Event stream architecture and concurrency model |
| Constraint Engine | Engineering constraint evaluation |
| AI Memory & Knowledge | pgvector-backed knowledge layer, embedding pipeline, RAG integration |
| System Vision | Architectural principles and layered model overview |
| System Observability | Unified logging, metrics, and tracing — phased delivery aligned with twin evolution |
| MVP Roadmap | Implementation timeline and resource requirements |
| ADR-002: 3D Viewer & CAD Pipeline | Decision record for R3F renderer and server-side STEP conversion pipeline |
| Assistant Mode | Dual-mode operation — human edits via IDE Assistants and AI-driven autonomous workflows sharing the same Design Graph |
| Digital Twin Page | 3D viewer UI specification |
Document Version: v1.1 Last Updated: 2026-02-28 Status: Technical Architecture Document