Digital Twin Evolution Architecture

From Digital Thread to full Digital Twin per ISO/IEC 30173:2025

Table of Contents

  1. Overview
  2. Twin Model & Twin Instance (ISO 23247 Alignment)
  3. Semantic Architecture Patterns
    1. CQRS — Command Query Responsibility Segregation
    2. Dual State Machine
    3. Artifact Graph Layer
    4. Visualization as Projection
  4. Synchronization Architecture (L3)
    1. Sync Policy
    2. Sync Mode Use Cases
  5. Behavioral Simulation Layer (L4)
    1. Behavioral Model Definition
    2. Example: Battery Discharge Model
  6. Simulation Orchestration
    1. What-If Analysis
  7. 3D Viewer Integration
    1. ModelManifest API
    2. Part Detail API
    3. STEP Ingestion Pipeline
    4. Live State Overlay (L3)
  8. Repository Structure
  9. Phased Delivery
    1. Phase Dependencies
  10. Technology Stack by Layer
  11. Related Documents

Overview

MetaForge’s Digital Twin evolves through four layers. Phase 1 (MVP) delivers the Digital Thread — the artifact graph that provides traceability across the product lifecycle. Subsequent phases add device synchronization, behavioral simulation, and fleet intelligence to achieve a full Digital Twin per ISO/IEC 30173:2025.

| Layer | Name | Phase | Description |
|-------|------|-------|-------------|
| L1 | Digital Thread | P1 (MVP) | Artifact graph, traceability, versioning, constraints |
| L2 | Operational Twin | P2 | Post-manufacturing: test telemetry, TSDB, field data ingestion |
| L3 | Live Twin | P3 | Real-time device synchronization via MQTT/OPC-UA |
| L4 | Simulation Twin | P3-P4 | Behavioral models, what-if analysis, predictive simulation |

Twin Model & Twin Instance (ISO 23247 Alignment)

Per ISO 23247, a Digital Twin system distinguishes between Twin Models (product definitions) and Twin Instances (specific deployed devices).

Twin Model = the product definition (design-time). One per product version. Twin Instance = a specific deployed device. Many per model.

TwinModel: "DroneFlightController-v2.1"
├── DesignElements (schematic, PCB, enclosure, firmware)
├── BOM (components, suppliers)
├── Constraints (power budget, thermal limits, pin assignments)
├── BehavioralModels (battery discharge, thermal propagation, motor response)
└── TestProcedures (EVT/DVT/PVT)

DeviceInstance: "FC-SN-20260301-001"
├── INSTANCE_OF → TwinModel "DroneFlightController-v2.1"
├── firmwareVersion: "v2.1.3"
├── status: "active"
├── location: "test-lab-B"
├── TelemetrySources:
│   ├── IMU (accelerometer, gyroscope) @ 100 Hz
│   ├── Power (voltage, current) @ 10 Hz
│   └── Temperature (board sensors) @ 1 Hz
└── SimulationState:
    ├── predictedFlightTime: 18.5 min
    ├── thermalMargin: 12.3 C
    └── batteryHealth: 94%

Semantic Architecture Patterns

CQRS — Command Query Responsibility Segregation

The Digital Twin API separates mutation intake from query resolution. Commands (writes) go through the event-sourced pipeline with validation, constraint checking, and approval workflows. Queries (reads) hit optimized projections directly.

Commands (mutations):
  Agent → Twin API → Validate → Constraint Check → Event → Kafka → Neo4j projection

Queries (reads):
  Dashboard/Agent → Twin API → Neo4j (direct read) → Response
  Telemetry query → Twin API → TSDB (direct read) → Response

This separation enables:

  • Independent scaling — read-heavy workloads (dashboards, agents querying state) scale separately from write-heavy workloads (design iterations, telemetry ingestion)
  • Optimistic UI — the dashboard can show pending mutations before they commit
  • Projection diversity — the same event stream can power Neo4j, search indexes, and materialized views without coupling
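A minimal in-process sketch of this split, with an append-only event list standing in for Kafka and a dict projection standing in for Neo4j. The class and method names are illustrative, not MetaForge's actual API:

```python
from dataclasses import dataclass, field

@dataclass
class TwinStore:
    """CQRS sketch: an append-only event log (write side) and a
    projection dict (read side) stand in for Kafka and Neo4j."""
    events: list = field(default_factory=list)       # write model
    projection: dict = field(default_factory=dict)   # read model

    def handle_command(self, entity_id: str, change: dict) -> bool:
        # Validation / constraint checking would run here before commit.
        if not change:
            return False  # rejected mutation
        event = {"entity_id": entity_id, "change": change,
                 "version": len(self.events) + 1}
        self.events.append(event)   # event-sourced write path
        self._project(event)        # projection update (async in reality)
        return True

    def _project(self, event: dict) -> None:
        state = self.projection.setdefault(event["entity_id"], {})
        state.update(event["change"])
        state["_version"] = event["version"]

    def query(self, entity_id: str) -> dict:
        # Read side hits the projection directly; it never replays events.
        return self.projection.get(entity_id, {})
```

The same event list could feed additional projections (search index, materialized views) without touching the command path.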

Dual State Machine

The Digital Twin maintains two independent but linked state tracks:

| Track | What it holds | Versioned by | Update frequency |
|-------|---------------|--------------|------------------|
| Semantic State | Entity graph — requirements, constraints, relationships, behavioral parameters | Event stream version | Every mutation |
| Artifact State | Projected files — STEP, KiCad, Gerbers, firmware binaries, BOMs | Git commit SHA | On explicit projection |

stateDiagram-v2
    direction LR

    state "Semantic State" as SS {
        [*] --> Mutate: Agent proposes change
        Mutate --> Validate: Constraint engine
        Validate --> Committed: Pass
        Validate --> Rejected: Fail
        Committed --> Mutate: Next iteration
    }

    state "Artifact State" as AS {
        [*] --> Project: MCP projection triggered
        Project --> Generated: Tool produces files
        Generated --> GitCommit: Files committed
        GitCommit --> Linked: Commit linked to semantic version
    }

    SS --> AS: Explicit projection trigger
    AS --> SS: Drift detection (file → semantic sync)

Key insight: Semantic iterations can proceed without artifact projection. An agent can refine constraints, rebalance power budgets, and iterate on component selection — all as semantic mutations — before projecting the final state into KiCad files. This enables fast design exploration without the overhead of regenerating tool-specific files at every step.

The AS → SS: Drift detection arrow in the diagram above represents bidirectional synchronization between file-world and graph-world. The full specification of this drift detection layer — including the change detection pipeline (watchfiles → adapter parser → ingest), the periodic reconciler, and the associated event types (drift.detected, drift.resolved) — is defined in Assistant Mode.

The dual state machine links through commit-to-version mapping:

from datetime import datetime

from pydantic import BaseModel

class StateLink(BaseModel):
    semantic_version: int        # Event stream version at projection time
    git_commit: str              # SHA of the artifact commit
    projected_at: datetime       # When projection occurred
    projected_by: str            # Agent or user who triggered projection
    artifacts: list[str]         # Artifact paths included in this projection

Artifact Graph Layer

Every entity in the semantic graph maps to zero or more artifact files. The artifact graph tracks this mapping and the tool-specific export lineage.

| Concept | Description |
|---------|-------------|
| Entity-to-file mapping | Which semantic entities project into which files (e.g., BOMItem:C42 → bom/bom.csv row 42, DesignElement:MCU → eda/kicad/board.kicad_sch component U1) |
| Dependency graph | Which files depend on which entities — enables targeted re-projection when a subset of the graph changes |
| Tool export lineage | Which MCP adapter produced each file, with what version, at what semantic state — enables reproducible regeneration |

from typing import Optional

from pydantic import BaseModel

class ArtifactMapping(BaseModel):
    entity_id: str               # Neo4j node ID
    entity_type: str             # "BOMItem", "DesignElement", etc.
    artifact_path: str           # Relative path in project
    artifact_region: Optional[str] = None  # Sub-file locator (row, component ref, section)
    adapter: str                 # MCP adapter that produced this mapping
    adapter_version: str         # Adapter version for reproducibility
    semantic_version: int        # Semantic state version at projection time
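A sketch of how these mappings could drive targeted re-projection: given the entities changed since the last projection, collect only the files that need regenerating. Plain dicts stand in for ArtifactMapping, and the helper name is hypothetical — real dependency resolution would traverse the Neo4j graph:

```python
def stale_artifacts(mappings, changed_entity_ids, current_version):
    """Return artifact paths whose source entities changed after their
    last projection. Illustrative sketch over ArtifactMapping-like dicts."""
    stale = set()
    for m in mappings:
        if (m["entity_id"] in changed_entity_ids
                and m["semantic_version"] < current_version):
            stale.add(m["artifact_path"])
    return sorted(stale)
```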

Visualization as Projection

The 3D viewer and all dashboard panels are projections of twin state — they never compute domain logic. The UI subscribes to twin state changes and renders them. This means:

  • The dashboard is a read-only view of the semantic graph and its artifact projections
  • Selection, highlighting, filtering, and annotation happen client-side against cached twin state
  • Live overlays (L3) subscribe to the same telemetry WebSocket that feeds the TSDB
  • No engineering calculation ever runs in the browser — all computation happens in agents/skills on the backend

Synchronization Architecture (L3)

The synchronization layer connects physical devices to their digital twins through MQTT, Kafka, and a telemetry routing pipeline.

flowchart TD
    DEV["Physical Device"] -->|"sensor data"| MQTT["MQTT Broker<br/>Mosquitto / EMQX"]
    MQTT --> KAFKA_T["Kafka: device.telemetry"]
    KAFKA_T --> ROUTER["Telemetry Router"]

    ROUTER --> TSDB["TSDB Writer<br/>InfluxDB"]
    ROUTER --> ANOMALY["Anomaly Detector<br/>Threshold checks"]
    ROUTER --> AGG["State Aggregator<br/>Derive device state"]

    AGG -->|"state changes only"| KAFKA_S["Kafka: device.state"]
    KAFKA_S --> NEO4J["Neo4j<br/>DeviceInstance updated"]
    NEO4J --> WS["WebSocket Broadcast<br/>Dashboard"]
    WS --> VIEWER["3D Viewer<br/>Live overlays"]

    ORCH["Orchestrator"] -->|"firmware update"| OTA["OTA Channel"]
    OTA --> DEV

    style DEV fill:#E67E22,color:#fff
    style TSDB fill:#3498db,color:#fff
    style NEO4J fill:#2C3E50,color:#fff
    style VIEWER fill:#27ae60,color:#fff
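The "state changes only" edge above is the State Aggregator's job: derive a coarse device state from raw telemetry and publish to device.state only on transitions. A minimal sketch — the state names and thresholds are illustrative, not part of the spec:

```python
class StateAggregator:
    """Derives a coarse device state from telemetry samples and emits
    an event only when the state changes (deduplicating the stream)."""

    def __init__(self, temp_warn_c: float = 70.0, low_batt_v: float = 9.0):
        self.temp_warn_c = temp_warn_c
        self.low_batt_v = low_batt_v
        self._last_state: dict[str, str] = {}   # device_id -> last state

    def derive(self, sample: dict) -> str:
        # Illustrative rules; real derivation would be per-TwinModel.
        if sample["voltage_v"] < self.low_batt_v:
            return "low-battery"
        if sample["board_temp_c"] > self.temp_warn_c:
            return "thermal-warning"
        return "nominal"

    def ingest(self, device_id: str, sample: dict):
        """Return a state-change event for device.state, or None."""
        state = self.derive(sample)
        if self._last_state.get(device_id) == state:
            return None                          # suppress duplicate state
        self._last_state[device_id] = state
        return {"device_id": device_id, "state": state}
```

This keeps the Neo4j DeviceInstance update rate proportional to state transitions, not to the raw 100 Hz telemetry rate.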

Sync Policy

Synchronization is configurable per device or fleet group:

from typing import Literal, Optional

from pydantic import BaseModel

class SyncPolicy(BaseModel):
    mode: Literal["event-driven", "periodic", "streaming"]
    sample_rate_hz: Optional[float] = None    # For streaming mode
    batch_interval_s: Optional[float] = None  # For periodic mode
    staleness_threshold_s: float = 60.0       # Alert if no data for this long
    retention_days: int = 90                  # TSDB retention

Sync Mode Use Cases

| Use Case | Mode | Rate | Latency |
|----------|------|------|---------|
| Design-time artifact changes | event-driven | N/A | seconds |
| Lab test equipment during EVT | periodic | 1 Hz | < 1 s |
| Drone in-flight telemetry | streaming | 100 Hz | < 10 ms |
| Supply chain risk monitoring | event-driven | N/A | minutes |
| Fleet health dashboard | periodic | 0.1 Hz | < 10 s |
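Each mode implies different required fields on the policy. A sketch of that mode-specific check, over plain dicts (the validator function is hypothetical; in practice this would be a Pydantic model validator on SyncPolicy):

```python
def validate_sync_policy(policy: dict) -> bool:
    """Check mode-specific required fields of a SyncPolicy-like dict.
    Illustrative only — mirrors, but is not, the real validator."""
    mode = policy.get("mode")
    if mode == "streaming":
        return policy.get("sample_rate_hz", 0) > 0   # needs a sample rate
    if mode == "periodic":
        return policy.get("batch_interval_s", 0) > 0  # needs a batch interval
    return mode == "event-driven"                     # no rate fields required

# Policies matching two rows of the use-case table above
drone_policy = {"mode": "streaming", "sample_rate_hz": 100.0}
fleet_policy = {"mode": "periodic", "batch_interval_s": 10.0}
```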

Behavioral Simulation Layer (L4)

Behavioral models are lightweight mathematical approximations that predict device behavior without running full SPICE/FEA simulations. Full simulations (SPICE, CalculiX, OpenFOAM) run during design; behavioral models distill their results into fast-running equations for runtime prediction.

Behavioral Model Definition

from typing import Literal, Optional

from pydantic import BaseModel

class BehavioralModelDef(BaseModel):
    """Stored in Neo4j as BehavioralModel node."""
    id: str
    name: str
    model_type: Literal[
        "differential-equation",  # dV/dt = f(I, T)
        "transfer-function",      # H(s) = num/den
        "thermal-rc",             # R-C thermal network
        "lookup-table",           # Interpolated from test data
        "state-machine",          # Discrete state transitions
    ]
    domain: str                   # "electrical", "thermal", "mechanical"
    inputs: list["ModelPort"]     # Named inputs with units
    outputs: list["ModelPort"]    # Named outputs with units
    parameters: dict              # Model constants (from calibration)
    calibration_source: Optional[str] = None  # TestExecution ID that calibrated this
    valid_range: Optional[dict] = None  # Input ranges where model is accurate

class ModelPort(BaseModel):
    name: str
    unit: str                     # SI unit: "A", "V", "degC", "Pa", "m/s"
    dtype: Literal["float", "int", "bool", "array"]  # "schema" would shadow a BaseModel attribute

Example: Battery Discharge Model

Calibrated from EVT test data:

behavioral_model:
  name: "battery-discharge-3s-lipo"
  model_type: "differential-equation"
  domain: "electrical"
  inputs:
    - name: "load_current_a"
      unit: "A"
    - name: "ambient_temp_c"
      unit: "degC"
  outputs:
    - name: "voltage_v"
      unit: "V"
    - name: "remaining_capacity_pct"
      unit: "%"
  parameters:
    nominal_voltage: 11.1
    capacity_ah: 2.2
    internal_resistance_ohm: 0.045
    temp_coefficient: -0.003  # V/degC
  equation: |
    dV/dt = -(I / C) - R_int * dI/dt - k_temp * (T - 25)
  calibration_source: "test-exec-evt-battery-001"
  valid_range:
    load_current_a: [0, 30]
    ambient_temp_c: [-10, 60]
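The equation above can be integrated directly for runtime prediction. A forward-Euler sketch using the parameters listed, under a constant load (so the R_int · dI/dt term vanishes); the hour-based time base and step size are assumptions, and this is illustrative, not the production model runner:

```python
def simulate_discharge(load_a: float, ambient_c: float,
                       duration_h: float, dt_h: float = 0.01):
    """Forward-Euler integration of the battery-discharge model above.
    Constant load means dI/dt = 0. Time is in hours. Illustrative only."""
    capacity_ah = 2.2            # parameters from the calibrated model
    k_temp = -0.003              # V/degC
    v = 11.1                     # nominal_voltage
    used_ah = 0.0
    for _ in range(round(duration_h / dt_h)):
        # dV/dt = -(I / C) - k_temp * (T - 25)   (dI/dt term dropped)
        v += (-(load_a / capacity_ah) - k_temp * (ambient_c - 25.0)) * dt_h
        used_ah += load_a * dt_h
    remaining_pct = max(0.0, 100.0 * (1.0 - used_ah / capacity_ah))
    return v, remaining_pct
```

For example, a 10 A load at 25 °C drains half the 2.2 Ah pack in 0.11 h; inputs outside valid_range (e.g. 40 A) should be rejected before integration.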

Simulation Orchestration

flowchart TD
    TRIGGER["Trigger<br/>manual / scheduled / anomaly"] --> CREATE["Create SimulationRun node"]
    CREATE --> FETCH_STATE["Fetch DeviceInstance state<br/>from Neo4j"]
    FETCH_STATE --> FETCH_TELEM["Fetch latest telemetry<br/>from TSDB"]
    FETCH_TELEM --> LOAD["Load BehavioralModel(s)<br/>for this TwinModel"]
    LOAD --> RUN["Run simulation<br/>in-process or containerized"]
    RUN --> STORE["Store results in<br/>SimulationRun node"]
    STORE --> COMPARE["Compare prediction vs actual<br/>Update accuracy metrics"]
    COMPARE --> CHECK{Divergence > threshold?}
    CHECK -->|Yes| ALERT["Trigger CAPA or<br/>maintenance alert"]
    CHECK -->|No| DONE["Done"]

    style TRIGGER fill:#E67E22,color:#fff
    style ALERT fill:#e74c3c,color:#fff
    style DONE fill:#27ae60,color:#fff
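The "Compare prediction vs actual" step can be as simple as a mean absolute percentage error over aligned samples. A hedged sketch — the metric choice and threshold are illustrative, not the spec's divergence definition:

```python
def check_divergence(predicted, actual, threshold_pct: float = 5.0):
    """Compare a predicted series to measured telemetry: return the mean
    absolute percentage error and whether it exceeds the alert threshold."""
    errors = [abs(p - a) / abs(a) * 100.0
              for p, a in zip(predicted, actual) if a != 0]
    mape = sum(errors) / len(errors)     # mean absolute percentage error
    return mape, mape > threshold_pct    # (accuracy metric, trigger alert?)
```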

What-If Analysis

The ability to run hypothetical scenarios against the digital twin:

# "What happens if ambient temperature rises to 50C?"
what_if_result = await simulation_engine.run_what_if(
    device_id="FC-SN-20260301-001",
    overrides={"ambient_temp_c": 50.0},
    duration_s=3600,
    models=["battery-discharge", "thermal-propagation"]
)
# Returns: predicted voltage curve, thermal margins, time-to-throttle

Use cases:

  • “What’s the flight time at 40C ambient vs 20C?”
  • “If we switch to a 3000mAh battery, how does thermal behavior change?”
  • “What load current causes thermal throttling within 10 minutes?”
  • “Will the new firmware’s increased sampling rate cause battery issues?”

3D Viewer Integration

ModelManifest API

Endpoint: GET /api/v1/projects/{project_id}/model-manifest

from datetime import datetime
from typing import Optional

from pydantic import BaseModel

class ModelManifest(BaseModel):
    artifact_id: str              # MinIO artifact ID for the GLB file
    glb_url: str                  # Presigned MinIO URL (expires in 1h)
    mesh_to_node_map: dict[str, str]  # Three.js mesh name → Neo4j node ID
    component_tree: list["PartNode"]
    last_updated: datetime

class PartNode(BaseModel):
    id: str                       # Neo4j node ID
    name: str
    mesh_name: Optional[str] = None  # Three.js mesh name in GLB
    children: list["PartNode"]
    node_type: str                # "assembly", "pcb", "component", "enclosure"
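When the viewer reports a picked mesh, the manifest resolves it back to a part: mesh_to_node_map yields the node ID, then a depth-first search over component_tree finds the PartNode. A sketch over plain dicts standing in for the models above (the function name is hypothetical):

```python
def resolve_pick(mesh_name, manifest):
    """Resolve a picked Three.js mesh name to its part node, using a
    plain-dict stand-in for ModelManifest. Illustrative only."""
    node_id = manifest["mesh_to_node_map"].get(mesh_name)
    if node_id is None:
        return None  # mesh has no semantic counterpart

    def find(nodes):
        for node in nodes:                 # depth-first over component_tree
            if node["id"] == node_id:
                return node
            hit = find(node["children"])
            if hit:
                return hit
        return None

    return find(manifest["component_tree"])
```

The returned node's id then feeds the Part Detail API below.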

Part Detail API

Endpoint: GET /api/v1/bom/items/{node_id}

Returns real BOM data from the graph, replacing the hardcoded demo data in the current 3D viewer.

STEP Ingestion Pipeline

STEP upload → MinIO stores original → Worker converts to GLB
→ GLB stored in MinIO → Agent parses STEP assembly tree → DesignElement nodes in graph
→ Mesh-to-node mapping recorded → ModelManifest available
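The pipeline above can be sketched as one worker function with its stages injected, so storage, conversion, parsing, and mapping can each be swapped or tested in isolation. The stage signatures here are assumptions, not the real worker API:

```python
def ingest_step(step_bytes: bytes, store, convert, parse_tree, record_map):
    """Sketch of the STEP ingestion pipeline with injected stages.
    Each callable stands in for a real service (MinIO, converter, agent)."""
    original_id = store("step", step_bytes)   # MinIO: original STEP file
    glb_bytes = convert(step_bytes)           # worker: STEP -> GLB conversion
    glb_id = store("glb", glb_bytes)          # MinIO: viewer asset
    assembly = parse_tree(step_bytes)         # agent: STEP assembly tree
    mesh_map = record_map(assembly)           # mesh-to-node mapping
    return {"original_id": original_id, "glb_id": glb_id,
            "mesh_to_node_map": mesh_map}     # feeds the ModelManifest
```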

Live State Overlay (L3)

When a DeviceInstance is selected in the 3D viewer, the viewer subscribes to its telemetry WebSocket and overlays live state:

// WebSocket subscription for live device state
interface LiveStateOverlay {
  deviceId: string;
  meshOverlays: {
    meshName: string;
    colorMap: "thermal" | "stress" | "vibration";
    values: number[];  // Per-vertex or per-face values from telemetry/simulation
  }[];
  annotations: {
    position: [number, number, number];
    label: string;
    value: string;
    status: "nominal" | "warning" | "critical";
  }[];
}

Repository Structure

The codebase is structured to mirror the layered architecture:

digital_twin/
├── thread/                    # L1: Digital Thread (P1)
│   ├── models/
│   │   ├── artifact.py
│   │   ├── constraint.py
│   │   ├── relationship.py
│   │   └── version.py
│   ├── graph_engine.py
│   ├── versioning/
│   ├── constraint_engine/
│   ├── validation_engine/
│   └── api.py
├── models/                    # Twin Model definitions (P2)
│   ├── twin_model.py
│   ├── behavioral_model.py
│   └── model_registry.py
├── knowledge/                 # Knowledge Layer: AI memory (P1.5)
│   ├── store.py               # pgvector CRUD + semantic search
│   ├── embedding_service.py   # Local + cloud embedding abstraction
│   ├── consumer.py            # Kafka event processor
│   ├── reconciler.py          # Periodic reconciliation
│   ├── chunker.py             # Document chunking
│   ├── templates.py           # Embedding content templates
│   ├── models.py              # Pydantic models
│   └── api.py                 # Knowledge query REST API
├── sync/                      # L3: Device synchronization (P3)
│   ├── mqtt_adapter.py
│   ├── telemetry_router.py
│   ├── state_aggregator.py
│   ├── sync_policy.py
│   └── device_registry.py
├── simulation/                # L4: Behavioral simulation (P3-P4)
│   ├── engine.py
│   ├── model_runner.py
│   ├── what_if.py
│   ├── calibration.py
│   └── models/               # Built-in behavioral model implementations
│       ├── differential_eq.py
│       ├── transfer_function.py
│       ├── thermal_rc.py
│       └── lookup_table.py
└── events.py                  # Shared event schemas (all layers)

Phased Delivery

| Phase | Deliverable | Timeline | Key Capability |
|-------|-------------|----------|----------------|
| P1 (MVP) | Digital Thread (L1) | Months 1-6 | Artifact graph, traceability, constraints, 3D viewer |
| P1.5 | Knowledge Layer | Months 5-6 | pgvector setup, local embedding, retrieve_knowledge skill, decision + session indexing |
| P2 | Operational Twin (L2) | Months 7-12 | TSDB for test results, CAPA workflow, device provisioning, fleet registry |
| P2.5 | Twin Model schema | Month 10 | BehavioralModel node type, model definition format, calibration from test data |
| P3 | Live Twin (L3) | Months 13-18 | MQTT ingestion, real-time sync, live 3D overlays, anomaly detection |
| P3.5 | Simulation Twin (L4) | Months 16-20 | Behavioral model execution, what-if analysis, prediction vs actual |
| P4 | Fleet Intelligence | Months 21-24+ | Multi-device analytics, fleet-wide predictions, HiL integration |

Phase Dependencies

flowchart LR
    P1["P1: Digital Thread<br/>Months 1-6"] --> P15["P1.5: Knowledge Layer<br/>Months 5-6"]
    P15 --> P2["P2: Operational Twin<br/>Months 7-12"]
    P2 --> P25["P2.5: Twin Model Schema<br/>Month 10"]
    P25 --> P3["P3: Live Twin<br/>Months 13-18"]
    P3 --> P35["P3.5: Simulation Twin<br/>Months 16-20"]
    P35 --> P4["P4: Fleet Intelligence<br/>Months 21-24+"]

    style P1 fill:#27ae60,color:#fff
    style P15 fill:#2ecc71,color:#fff
    style P2 fill:#3498db,color:#fff
    style P25 fill:#3498db,color:#fff
    style P3 fill:#9b59b6,color:#fff
    style P35 fill:#9b59b6,color:#fff
    style P4 fill:#E67E22,color:#fff

Technology Stack by Layer

| Layer | Component | Technology | Rationale |
|-------|-----------|------------|-----------|
| L1 | Graph database | Neo4j | Relationship-first queries, ACID |
| L1 | Object storage | MinIO | S3-compatible, content-addressable |
| L1 | Event bus | Kafka | Durable append-only streams |
| L1 | API framework | FastAPI (Python) | OpenAPI, async, Pydantic validation |
| L2 | Time-series DB | InfluxDB / TimescaleDB | Optimized for high-frequency sensor data |
| L3 | MQTT broker | Mosquitto / EMQX | Lightweight IoT messaging |
| L3 | Telemetry routing | Python (aiomqtt + kafka-python) | Mature MQTT/Kafka libraries |
| L4 | Simulation runtime | Python (NumPy/SciPy) | Scientific computing ecosystem |
| L4 | Heavy simulations | Containerized (CalculiX, OpenFOAM) | Isolated, reproducible |
| All | Vector store | pgvector (PostgreSQL extension) | Semantic search over knowledge; zero new infra (reuses existing PostgreSQL) |
| All | Embedding models | all-MiniLM-L6-v2 (local) / text-embedding-3-small (cloud) | Local-first embedding; cloud option for higher quality |
| All | Observability | OpenTelemetry SDK + Prometheus + Grafana Tempo + Grafana Loki + Grafana | Three-pillar observability (logs, metrics, traces) with SLO/SLI framework; see System Observability |
| All | Dashboard | React + TypeScript | Component-based, Three.js integration |

Related Documents

| Document | Description |
|----------|-------------|
| Graph Schema | Node types and relationships for all layers |
| Event Sourcing | Event stream architecture and concurrency model |
| Constraint Engine | Engineering constraint evaluation |
| AI Memory & Knowledge | pgvector-backed knowledge layer, embedding pipeline, RAG integration |
| System Vision | Architectural principles and layered model overview |
| System Observability | Unified logging, metrics, and tracing — phased delivery aligned with twin evolution |
| MVP Roadmap | Implementation timeline and resource requirements |
| ADR-002: 3D Viewer & CAD Pipeline | Decision record for R3F renderer and server-side STEP conversion pipeline |
| Assistant Mode | Dual-mode operation — human edits via IDE Assistants and AI-driven autonomous workflows sharing the same Design Graph |
| Digital Twin Page | 3D viewer UI specification |

Document Version: v1.1 Last Updated: 2026-02-28 Status: Technical Architecture Document
