Digital Twin Evolution Architecture

From Digital Thread to full Digital Twin per ISO/IEC 30173:2025

Table of Contents

  1. Overview
  2. Twin Model & Twin Instance (ISO 23247 Alignment)
  3. Semantic Architecture Patterns
    1. CQRS — Command Query Responsibility Segregation
    2. Dual State Machine
    3. Artifact Graph Layer
    4. Visualization as Projection
  4. Synchronization Architecture (L3)
    1. Sync Policy
    2. Sync Mode Use Cases
  5. Behavioral Simulation Layer (L4)
    1. Behavioral Model Definition
    2. Example: Battery Discharge Model
  6. Simulation Orchestration
    1. What-If Analysis
  7. 3D Viewer Integration
    1. ModelManifest API
    2. Part Detail API
    3. STEP Ingestion Pipeline
    4. Live State Overlay (L3)
  8. Repository Structure
  9. Phased Delivery
    1. Phase Dependencies
  10. Technology Stack by Layer
  11. Related Documents

Overview

MetaForge’s Digital Twin evolves through four layers. Phase 1 (MVP) delivers the Digital Thread — the artifact graph that provides traceability across the product lifecycle. Subsequent phases add device synchronization, behavioral simulation, and fleet intelligence to achieve a full Digital Twin per ISO/IEC 30173:2025.

| Layer | Name | Phase | Description |
|-------|------|-------|-------------|
| L1 | Digital Thread | P1 (MVP) | Artifact graph, traceability, versioning, constraints |
| L2 | Operational Twin | P2 | Post-manufacturing: test telemetry, TSDB, field data ingestion |
| L3 | Live Twin | P3 | Real-time device synchronization via MQTT/OPC-UA |
| L4 | Simulation Twin | P3-P4 | Behavioral models, what-if analysis, predictive simulation |

Twin Model & Twin Instance (ISO 23247 Alignment)

Per ISO 23247, a Digital Twin system distinguishes between Twin Models (product definitions) and Twin Instances (specific deployed devices).

Twin Model = the product definition (design-time). One per product version. Twin Instance = a specific deployed device. Many per model.

TwinModel: "DroneFlightController-v2.1"
├── DesignElements (schematic, PCB, enclosure, firmware)
├── BOM (components, suppliers)
├── Constraints (power budget, thermal limits, pin assignments)
├── BehavioralModels (battery discharge, thermal propagation, motor response)
└── TestProcedures (EVT/DVT/PVT)

DeviceInstance: "FC-SN-20260301-001"
├── INSTANCE_OF → TwinModel "DroneFlightController-v2.1"
├── firmwareVersion: "v2.1.3"
├── status: "active"
├── location: "test-lab-B"
├── TelemetrySources:
│   ├── IMU (accelerometer, gyroscope) @ 100 Hz
│   ├── Power (voltage, current) @ 10 Hz
│   └── Temperature (board sensors) @ 1 Hz
└── SimulationState:
    ├── predictedFlightTime: 18.5 min
    ├── thermalMargin: 12.3 C
    └── batteryHealth: 94%

Semantic Architecture Patterns

CQRS — Command Query Responsibility Segregation

The Digital Twin API separates mutation intake from query resolution. Commands (writes) go through the event-sourced pipeline with validation, constraint checking, and approval workflows. Queries (reads) hit optimized projections directly.

Commands (mutations):
  Agent → Twin API → Validate → Constraint Check → Event → Kafka → Neo4j projection

Queries (reads):
  Dashboard/Agent → Twin API → Neo4j (direct read) → Response
  Telemetry query → Twin API → TSDB (direct read) → Response

This separation enables:

  • Independent scaling — read-heavy workloads (dashboards, agents querying state) scale separately from write-heavy workloads (design iterations, telemetry ingestion)
  • Optimistic UI — the dashboard can show pending mutations before they commit
  • Projection diversity — the same event stream can power Neo4j, search indexes, and materialized views without coupling
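A minimal in-process sketch of this split, with an append-only event list standing in for Kafka and a dict projection standing in for Neo4j. The class and method names are illustrative, not MetaForge's actual API:

```python
from dataclasses import dataclass, field

@dataclass
class TwinStore:
    """CQRS sketch: an append-only event log (write side) and a
    projection dict (read side) stand in for Kafka and Neo4j."""
    events: list = field(default_factory=list)       # write model
    projection: dict = field(default_factory=dict)   # read model

    def handle_command(self, entity_id: str, change: dict) -> bool:
        # Validation / constraint checking would run here before commit.
        if not change:
            return False  # rejected mutation
        event = {"entity_id": entity_id, "change": change,
                 "version": len(self.events) + 1}
        self.events.append(event)   # event-sourced write path
        self._project(event)        # projection update (async in reality)
        return True

    def _project(self, event: dict) -> None:
        state = self.projection.setdefault(event["entity_id"], {})
        state.update(event["change"])
        state["_version"] = event["version"]

    def query(self, entity_id: str) -> dict:
        # Read side hits the projection directly; it never replays events.
        return self.projection.get(entity_id, {})
```

The same event list could feed additional projections (search index, materialized views) without touching the command path.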

Dual State Machine

The Digital Twin maintains two independent but linked state tracks:

| Track | What it holds | Versioned by | Update frequency |
|-------|---------------|--------------|------------------|
| Semantic State | Entity graph — requirements, constraints, relationships, behavioral parameters | Event stream version | Every mutation |
| Artifact State | Projected files — STEP, KiCad, Gerbers, firmware binaries, BOMs | Git commit SHA | On explicit projection |

stateDiagram-v2
    direction LR

    state "Semantic State" as SS {
        [*] --> Mutate: Agent proposes change
        Mutate --> Validate: Constraint engine
        Validate --> Committed: Pass
        Validate --> Rejected: Fail
        Committed --> Mutate: Next iteration
    }

    state "Artifact State" as AS {
        [*] --> Project: MCP projection triggered
        Project --> Generated: Tool produces files
        Generated --> GitCommit: Files committed
        GitCommit --> Linked: Commit linked to semantic version
    }

    SS --> AS: Explicit projection trigger
    AS --> SS: Drift detection (file → semantic sync)

Key insight: Semantic iterations can proceed without artifact projection. An agent can refine constraints, rebalance power budgets, and iterate on component selection — all as semantic mutations — before projecting the final state into KiCad files. This enables fast design exploration without the overhead of regenerating tool-specific files at every step.

The AS → SS: Drift detection arrow in the diagram above represents bidirectional synchronization between file-world and graph-world. The full specification of this drift detection layer — including the change detection pipeline (watchfiles → adapter parser → ingest), the periodic reconciler, and the associated event types (drift.detected, drift.resolved) — is defined in Assistant Mode.

The dual state machine links through commit-to-version mapping:

from datetime import datetime

from pydantic import BaseModel

class StateLink(BaseModel):
    semantic_version: int        # Event stream version at projection time
    git_commit: str              # SHA of the artifact commit
    projected_at: datetime       # When projection occurred
    projected_by: str            # Agent or user who triggered projection
    artifacts: list[str]         # Artifact paths included in this projection

Artifact Graph Layer

Every entity in the semantic graph maps to zero or more artifact files. The artifact graph tracks this mapping and the tool-specific export lineage.

| Concept | Description |
|---------|-------------|
| Entity-to-file mapping | Which semantic entities project into which files (e.g., BOMItem:C42 → bom/bom.csv row 42, DesignElement:MCU → eda/kicad/board.kicad_sch component U1) |
| Dependency graph | Which files depend on which entities — enables targeted re-projection when a subset of the graph changes |
| Tool export lineage | Which MCP adapter produced each file, with what version, at what semantic state — enables reproducible regeneration |

from typing import Optional

from pydantic import BaseModel

class ArtifactMapping(BaseModel):
    entity_id: str               # Neo4j node ID
    entity_type: str             # "BOMItem", "DesignElement", etc.
    artifact_path: str           # Relative path in project
    artifact_region: Optional[str] = None  # Sub-file locator (row, component ref, section)
    adapter: str                 # MCP adapter that produced this mapping
    adapter_version: str         # Adapter version for reproducibility
    semantic_version: int        # Semantic state version at projection time
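A sketch of how these mappings could drive targeted re-projection: given the entities changed since the last projection, collect only the files that need regenerating. Plain dicts stand in for ArtifactMapping, and the helper name is hypothetical — real dependency resolution would traverse the Neo4j graph:

```python
def stale_artifacts(mappings, changed_entity_ids, current_version):
    """Return artifact paths whose source entities changed after their
    last projection. Illustrative sketch over ArtifactMapping-like dicts."""
    stale = set()
    for m in mappings:
        if (m["entity_id"] in changed_entity_ids
                and m["semantic_version"] < current_version):
            stale.add(m["artifact_path"])
    return sorted(stale)
```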

Visualization as Projection

The 3D viewer and all dashboard panels are projections of twin state — they never compute domain logic. The UI subscribes to twin state changes and renders them. This means:

  • The dashboard is a read-only view of the semantic graph and its artifact projections
  • Selection, highlighting, filtering, and annotation happen client-side against cached twin state
  • Live overlays (L3) subscribe to the same telemetry WebSocket that feeds the TSDB
  • No engineering calculation ever runs in the browser — all computation happens in agents/skills on the backend

Synchronization Architecture (L3)

The synchronization layer connects physical devices to their digital twins through MQTT, Kafka, and a telemetry routing pipeline.

flowchart TD
    DEV["Physical Device"] -->|"sensor data"| MQTT["MQTT Broker<br/>Mosquitto / EMQX"]
    MQTT --> KAFKA_T["Kafka: device.telemetry"]
    KAFKA_T --> ROUTER["Telemetry Router"]

    ROUTER --> TSDB["TSDB Writer<br/>InfluxDB"]
    ROUTER --> ANOMALY["Anomaly Detector<br/>Threshold checks"]
    ROUTER --> AGG["State Aggregator<br/>Derive device state"]

    AGG -->|"state changes only"| KAFKA_S["Kafka: device.state"]
    KAFKA_S --> NEO4J["Neo4j<br/>DeviceInstance updated"]
    NEO4J --> WS["WebSocket Broadcast<br/>Dashboard"]
    WS --> VIEWER["3D Viewer<br/>Live overlays"]

    ORCH["Orchestrator"] -->|"firmware update"| OTA["OTA Channel"]
    OTA --> DEV

    style DEV fill:#E67E22,color:#fff
    style TSDB fill:#3498db,color:#fff
    style NEO4J fill:#2C3E50,color:#fff
    style VIEWER fill:#27ae60,color:#fff
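The "state changes only" edge above is the State Aggregator's job: derive a coarse device state from raw telemetry and publish to device.state only on transitions. A minimal sketch — the state names and thresholds are illustrative, not part of the spec:

```python
class StateAggregator:
    """Derives a coarse device state from telemetry samples and emits
    an event only when the state changes (deduplicating the stream)."""

    def __init__(self, temp_warn_c: float = 70.0, low_batt_v: float = 9.0):
        self.temp_warn_c = temp_warn_c
        self.low_batt_v = low_batt_v
        self._last_state: dict[str, str] = {}   # device_id -> last state

    def derive(self, sample: dict) -> str:
        # Illustrative rules; real derivation would be per-TwinModel.
        if sample["voltage_v"] < self.low_batt_v:
            return "low-battery"
        if sample["board_temp_c"] > self.temp_warn_c:
            return "thermal-warning"
        return "nominal"

    def ingest(self, device_id: str, sample: dict):
        """Return a state-change event for device.state, or None."""
        state = self.derive(sample)
        if self._last_state.get(device_id) == state:
            return None                          # suppress duplicate state
        self._last_state[device_id] = state
        return {"device_id": device_id, "state": state}
```

This keeps the Neo4j DeviceInstance update rate proportional to state transitions, not to the raw 100 Hz telemetry rate.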

Sync Policy

Synchronization is configurable per device or fleet group:

from typing import Literal, Optional

from pydantic import BaseModel

class SyncPolicy(BaseModel):
    mode: Literal["event-driven", "periodic", "streaming"]
    sample_rate_hz: Optional[float] = None    # For streaming mode
    batch_interval_s: Optional[float] = None  # For periodic mode
    staleness_threshold_s: float = 60.0       # Alert if no data for this long
    retention_days: int = 90                  # TSDB retention

Sync Mode Use Cases

| Use Case | Mode | Rate | Latency |
|----------|------|------|---------|
| Design-time artifact changes | event-driven | N/A | seconds |
| Lab test equipment during EVT | periodic | 1 Hz | < 1 s |
| Drone in-flight telemetry | streaming | 100 Hz | < 10 ms |
| Supply chain risk monitoring | event-driven | N/A | minutes |
| Fleet health dashboard | periodic | 0.1 Hz | < 10 s |
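Each mode implies different required fields on the policy. A sketch of that mode-specific check, over plain dicts (the validator function is hypothetical; in practice this would be a Pydantic model validator on SyncPolicy):

```python
def validate_sync_policy(policy: dict) -> bool:
    """Check mode-specific required fields of a SyncPolicy-like dict.
    Illustrative only — mirrors, but is not, the real validator."""
    mode = policy.get("mode")
    if mode == "streaming":
        return policy.get("sample_rate_hz", 0) > 0   # needs a sample rate
    if mode == "periodic":
        return policy.get("batch_interval_s", 0) > 0  # needs a batch interval
    return mode == "event-driven"                     # no rate fields required

# Policies matching two rows of the use-case table above
drone_policy = {"mode": "streaming", "sample_rate_hz": 100.0}
fleet_policy = {"mode": "periodic", "batch_interval_s": 10.0}
```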

Behavioral Simulation Layer (L4)

Behavioral models are lightweight mathematical approximations that predict device behavior without running full SPICE/FEA simulations. Full simulations (SPICE, CalculiX, OpenFOAM) run during design; behavioral models distill their results into fast-running equations for runtime prediction.

Behavioral Model Definition

from typing import Literal, Optional

from pydantic import BaseModel

class BehavioralModelDef(BaseModel):
    """Stored in Neo4j as BehavioralModel node."""
    id: str
    name: str
    model_type: Literal[
        "differential-equation",  # dV/dt = f(I, T)
        "transfer-function",      # H(s) = num/den
        "thermal-rc",             # R-C thermal network
        "lookup-table",           # Interpolated from test data
        "state-machine",          # Discrete state transitions
    ]
    domain: str                   # "electrical", "thermal", "mechanical"
    inputs: list["ModelPort"]     # Named inputs with units
    outputs: list["ModelPort"]    # Named outputs with units
    parameters: dict              # Model constants (from calibration)
    calibration_source: Optional[str] = None  # TestExecution ID that calibrated this
    valid_range: Optional[dict] = None  # Input ranges where model is accurate

class ModelPort(BaseModel):
    name: str
    unit: str                     # SI unit: "A", "V", "degC", "Pa", "m/s"
    dtype: Literal["float", "int", "bool", "array"]  # "schema" would shadow a BaseModel attribute

Example: Battery Discharge Model

Calibrated from EVT test data:

behavioral_model:
  name: "battery-discharge-3s-lipo"
  model_type: "differential-equation"
  domain: "electrical"
  inputs:
    - name: "load_current_a"
      unit: "A"
    - name: "ambient_temp_c"
      unit: "degC"
  outputs:
    - name: "voltage_v"
      unit: "V"
    - name: "remaining_capacity_pct"
      unit: "%"
  parameters:
    nominal_voltage: 11.1
    capacity_ah: 2.2
    internal_resistance_ohm: 0.045
    temp_coefficient: -0.003  # V/degC
  equation: |
    dV/dt = -(I / C) - R_int * dI/dt - k_temp * (T - 25)
  calibration_source: "test-exec-evt-battery-001"
  valid_range:
    load_current_a: [0, 30]
    ambient_temp_c: [-10, 60]
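The equation above can be integrated directly for runtime prediction. A forward-Euler sketch using the parameters listed, under a constant load (so the R_int · dI/dt term vanishes); the hour-based time base and step size are assumptions, and this is illustrative, not the production model runner:

```python
def simulate_discharge(load_a: float, ambient_c: float,
                       duration_h: float, dt_h: float = 0.01):
    """Forward-Euler integration of the battery-discharge model above.
    Constant load means dI/dt = 0. Time is in hours. Illustrative only."""
    capacity_ah = 2.2            # parameters from the calibrated model
    k_temp = -0.003              # V/degC
    v = 11.1                     # nominal_voltage
    used_ah = 0.0
    for _ in range(round(duration_h / dt_h)):
        # dV/dt = -(I / C) - k_temp * (T - 25)   (dI/dt term dropped)
        v += (-(load_a / capacity_ah) - k_temp * (ambient_c - 25.0)) * dt_h
        used_ah += load_a * dt_h
    remaining_pct = max(0.0, 100.0 * (1.0 - used_ah / capacity_ah))
    return v, remaining_pct
```

For example, a 10 A load at 25 °C drains half the 2.2 Ah pack in 0.11 h; inputs outside valid_range (e.g. 40 A) should be rejected before integration.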

Simulation Orchestration

flowchart TD
    TRIGGER["Trigger<br/>manual / scheduled / anomaly"] --> CREATE["Create SimulationRun node"]
    CREATE --> FETCH_STATE["Fetch DeviceInstance state<br/>from Neo4j"]
    FETCH_STATE --> FETCH_TELEM["Fetch latest telemetry<br/>from TSDB"]
    FETCH_TELEM --> LOAD["Load BehavioralModel(s)<br/>for this TwinModel"]
    LOAD --> RUN["Run simulation<br/>in-process or containerized"]
    RUN --> STORE["Store results in<br/>SimulationRun node"]
    STORE --> COMPARE["Compare prediction vs actual<br/>Update accuracy metrics"]
    COMPARE --> CHECK{Divergence > threshold?}
    CHECK -->|Yes| ALERT["Trigger CAPA or<br/>maintenance alert"]
    CHECK -->|No| DONE["Done"]

    style TRIGGER fill:#E67E22,color:#fff
    style ALERT fill:#e74c3c,color:#fff
    style DONE fill:#27ae60,color:#fff
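The "Compare prediction vs actual" step can be as simple as a mean absolute percentage error over aligned samples. A hedged sketch — the metric choice and threshold are illustrative, not the spec's divergence definition:

```python
def check_divergence(predicted, actual, threshold_pct: float = 5.0):
    """Compare a predicted series to measured telemetry: return the mean
    absolute percentage error and whether it exceeds the alert threshold."""
    errors = [abs(p - a) / abs(a) * 100.0
              for p, a in zip(predicted, actual) if a != 0]
    mape = sum(errors) / len(errors)     # mean absolute percentage error
    return mape, mape > threshold_pct    # (accuracy metric, trigger alert?)
```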

What-If Analysis

The ability to run hypothetical scenarios against the digital twin:

# "What happens if ambient temperature rises to 50C?"
what_if_result = await simulation_engine.run_what_if(
    device_id="FC-SN-20260301-001",
    overrides={"ambient_temp_c": 50.0},
    duration_s=3600,
    models=["battery-discharge", "thermal-propagation"]
)
# Returns: predicted voltage curve, thermal margins, time-to-throttle

Use cases:

  • “What’s the flight time at 40C ambient vs 20C?”
  • “If we switch to a 3000mAh battery, how does thermal behavior change?”
  • “What load current causes thermal throttling within 10 minutes?”
  • “Will the new firmware’s increased sampling rate cause battery issues?”

3D Viewer Integration

ModelManifest API

Endpoint: GET /api/v1/projects/{project_id}/model-manifest

from datetime import datetime
from typing import Optional

from pydantic import BaseModel

class ModelManifest(BaseModel):
    artifact_id: str              # MinIO artifact ID for the GLB file
    glb_url: str                  # Presigned MinIO URL (expires in 1h)
    mesh_to_node_map: dict[str, str]  # Three.js mesh name → Neo4j node ID
    component_tree: list["PartNode"]
    last_updated: datetime

class PartNode(BaseModel):
    id: str                       # Neo4j node ID
    name: str
    mesh_name: Optional[str] = None  # Three.js mesh name in GLB
    children: list["PartNode"]
    node_type: str                # "assembly", "pcb", "component", "enclosure"
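When the viewer reports a picked mesh, the manifest resolves it back to a part: mesh_to_node_map yields the node ID, then a depth-first search over component_tree finds the PartNode. A sketch over plain dicts standing in for the models above (the function name is hypothetical):

```python
def resolve_pick(mesh_name, manifest):
    """Resolve a picked Three.js mesh name to its part node, using a
    plain-dict stand-in for ModelManifest. Illustrative only."""
    node_id = manifest["mesh_to_node_map"].get(mesh_name)
    if node_id is None:
        return None  # mesh has no semantic counterpart

    def find(nodes):
        for node in nodes:                 # depth-first over component_tree
            if node["id"] == node_id:
                return node
            hit = find(node["children"])
            if hit:
                return hit
        return None

    return find(manifest["component_tree"])
```

The returned node's id then feeds the Part Detail API below.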

Part Detail API

Endpoint: GET /api/v1/bom/items/{node_id}

Returns real BOM data from the graph, replacing the hardcoded demo data in the current 3D viewer.

STEP Ingestion Pipeline

STEP upload → MinIO stores original → Worker converts to GLB
→ GLB stored in MinIO → Agent parses STEP assembly tree → DesignElement nodes in graph
→ Mesh-to-node mapping recorded → ModelManifest available
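The pipeline above can be sketched as one worker function with its stages injected, so storage, conversion, parsing, and mapping can each be swapped or tested in isolation. The stage signatures here are assumptions, not the real worker API:

```python
def ingest_step(step_bytes: bytes, store, convert, parse_tree, record_map):
    """Sketch of the STEP ingestion pipeline with injected stages.
    Each callable stands in for a real service (MinIO, converter, agent)."""
    original_id = store("step", step_bytes)   # MinIO: original STEP file
    glb_bytes = convert(step_bytes)           # worker: STEP -> GLB conversion
    glb_id = store("glb", glb_bytes)          # MinIO: viewer asset
    assembly = parse_tree(step_bytes)         # agent: STEP assembly tree
    mesh_map = record_map(assembly)           # mesh-to-node mapping
    return {"original_id": original_id, "glb_id": glb_id,
            "mesh_to_node_map": mesh_map}     # feeds the ModelManifest
```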

Live State Overlay (L3)

When a DeviceInstance is selected in the 3D viewer, the viewer subscribes to its telemetry WebSocket and overlays live state:

// WebSocket subscription for live device state
interface LiveStateOverlay {
  deviceId: string;
  meshOverlays: {
    meshName: string;
    colorMap: "thermal" | "stress" | "vibration";
    values: number[];  // Per-vertex or per-face values from telemetry/simulation
  }[];
  annotations: {
    position: [number, number, number];
    label: string;
    value: string;
    status: "nominal" | "warning" | "critical";
  }[];
}

Repository Structure

The codebase is structured to mirror the layered architecture:

digital_twin/
├── thread/                    # L1: Digital Thread (P1)
│   ├── models/
│   │   ├── artifact.py
│   │   ├── constraint.py
│   │   ├── relationship.py
│   │   └── version.py
│   ├── graph_engine.py
│   ├── versioning/
│   ├── constraint_engine/
│   ├── validation_engine/
│   └── api.py
├── models/                    # Twin Model definitions (P2)
│   ├── twin_model.py
│   ├── behavioral_model.py
│   └── model_registry.py
├── knowledge/                 # Knowledge Layer: AI memory (P1.5)
│   ├── store.py               # pgvector CRUD + semantic search
│   ├── embedding_service.py   # Local + cloud embedding abstraction
│   ├── consumer.py            # Kafka event processor
│   ├── reconciler.py          # Periodic reconciliation
│   ├── chunker.py             # Document chunking
│   ├── templates.py           # Embedding content templates
│   ├── models.py              # Pydantic models
│   └── api.py                 # Knowledge query REST API
├── sync/                      # L3: Device synchronization (P3)
│   ├── mqtt_adapter.py
│   ├── telemetry_router.py
│   ├── state_aggregator.py
│   ├── sync_policy.py
│   └── device_registry.py
├── simulation/                # L4: Behavioral simulation (P3-P4)
│   ├── engine.py
│   ├── model_runner.py
│   ├── what_if.py
│   ├── calibration.py
│   └── models/               # Built-in behavioral model implementations
│       ├── differential_eq.py
│       ├── transfer_function.py
│       ├── thermal_rc.py
│       └── lookup_table.py
└── events.py                  # Shared event schemas (all layers)

Phased Delivery

| Phase | Deliverable | Timeline | Key Capability |
|-------|-------------|----------|----------------|
| P1 (MVP) | Digital Thread (L1) | Months 1-6 | Artifact graph, traceability, constraints, 3D viewer |
| P1.5 | Knowledge Layer | Months 5-6 | pgvector setup, local embedding, retrieve_knowledge skill, decision + session indexing |
| P2 | Operational Twin (L2) | Months 7-12 | TSDB for test results, CAPA workflow, device provisioning, fleet registry |
| P2.5 | Twin Model schema | Month 10 | BehavioralModel node type, model definition format, calibration from test data |
| P3 | Live Twin (L3) | Months 13-18 | MQTT ingestion, real-time sync, live 3D overlays, anomaly detection |
| P3.5 | Simulation Twin (L4) | Months 16-20 | Behavioral model execution, what-if analysis, prediction vs actual |
| P4 | Fleet Intelligence | Months 21-24+ | Multi-device analytics, fleet-wide predictions, HiL integration |

Phase Dependencies

flowchart LR
    P1["P1: Digital Thread<br/>Months 1-6"] --> P15["P1.5: Knowledge Layer<br/>Months 5-6"]
    P15 --> P2["P2: Operational Twin<br/>Months 7-12"]
    P2 --> P25["P2.5: Twin Model Schema<br/>Month 10"]
    P25 --> P3["P3: Live Twin<br/>Months 13-18"]
    P3 --> P35["P3.5: Simulation Twin<br/>Months 16-20"]
    P35 --> P4["P4: Fleet Intelligence<br/>Months 21-24+"]

    style P1 fill:#27ae60,color:#fff
    style P15 fill:#2ecc71,color:#fff
    style P2 fill:#3498db,color:#fff
    style P25 fill:#3498db,color:#fff
    style P3 fill:#9b59b6,color:#fff
    style P35 fill:#9b59b6,color:#fff
    style P4 fill:#E67E22,color:#fff

Technology Stack by Layer

| Layer | Component | Technology | Rationale |
|-------|-----------|------------|-----------|
| L1 | Graph database | Neo4j | Relationship-first queries, ACID |
| L1 | Object storage | MinIO | S3-compatible, content-addressable |
| L1 | Event bus | Kafka | Durable append-only streams |
| L1 | API framework | FastAPI (Python) | OpenAPI, async, Pydantic validation |
| L2 | Time-series DB | InfluxDB / TimescaleDB | Optimized for high-frequency sensor data |
| L3 | MQTT broker | Mosquitto / EMQX | Lightweight IoT messaging |
| L3 | Telemetry routing | Python (aiomqtt + kafka-python) | Mature MQTT/Kafka libraries |
| L4 | Simulation runtime | Python (NumPy/SciPy) | Scientific computing ecosystem |
| L4 | Heavy simulations | Containerized (CalculiX, OpenFOAM) | Isolated, reproducible |
| All | Vector store | pgvector (PostgreSQL extension) | Semantic search over knowledge; zero new infra (reuses existing PostgreSQL) |
| All | Embedding models | all-MiniLM-L6-v2 (local) / text-embedding-3-small (cloud) | Local-first embedding; cloud option for higher quality |
| All | Observability | OpenTelemetry SDK + Prometheus + Grafana Tempo + Grafana Loki + Grafana | Three-pillar observability (logs, metrics, traces) with SLO/SLI framework; see System Observability |
| All | Dashboard | React + TypeScript | Component-based, Three.js integration |

Related Documents

| Document | Description |
|----------|-------------|
| Graph Schema | Node types and relationships for all layers |
| Event Sourcing | Event stream architecture and concurrency model |
| Constraint Engine | Engineering constraint evaluation |
| AI Memory & Knowledge | pgvector-backed knowledge layer, embedding pipeline, RAG integration |
| System Vision | Architectural principles and layered model overview |
| System Observability | Unified logging, metrics, and tracing — phased delivery aligned with twin evolution |
| MVP Roadmap | Implementation timeline and resource requirements |
| ADR-002: 3D Viewer & CAD Pipeline | Decision record for R3F renderer and server-side STEP conversion pipeline |
| Assistant Mode | Dual-mode operation — human edits via IDE Assistants and AI-driven autonomous workflows sharing the same Design Graph |
| Digital Twin Page | 3D viewer UI specification |

Document Version: v1.1 Last Updated: 2026-02-28 Status: Technical Architecture Document
