UTE Roadmap: Toward “Scientific-Theory AGI”

This roadmap describes how we plan to build a Universal Theory Engine (UTE): a platform where theories are executable, checkable, revisable, and comparable. The goal is not “general AGI” in the nebulous sense but a pragmatic kind of AGI for science and engineering: agents that can represent, review, argue about, prove, and revise scientific theories, with evidence and auditability.

Our core bet is that UTE requires a hybrid stack: symbolic semantics for correctness and explicit traces, plus HDC/VSA (hyperdimensional computing / vector symbolic architecture) strategies for fast retrieval and for capacity experiments as theory sets grow (elastic, dynamically sized representations).

[Figure: roadmap stages]
Today: Executable DSL; Query/Prove + traces; Multi-strategy HDC; Evals + AutoDiscovery
Micro-Theories: Reusable theory fragments; Evidence conventions; Community eval suites
UTE Layers: Provenance + revision; Causal/mechanistic models; Uncertainty + numeric; Experiment planning
Scientific-Theory AGI: Domain scientist agents; Review / argue / prove; Evidence-aware workflows
General AGI: nebulous future

See Universal Theory Engine (UTE) and the research specs DS32–DS38 in the Specs matrix.

Where we are now

AGISystem2 is already a coherent research platform: you can express micro-theories in DSL, run symbolic reasoning with proofs/traces, and run capacity and retrieval experiments via multiple HDC strategies (including elastic and lossless variants).

Implemented foundation

DSL parsing and execution, Session runtime, query/prove, traces/proofs, eval harness.

See Specs matrix for implemented vs research items.

HDC strategy substrate

Dense / Sparse / Metric / EMA (elastic) / EXACT (lossless) strategies behind a single swappable strategy contract, so that controlled experiments can change the representation without changing the rest of the system.

See HDC strategy directions and dynamic size representations.
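To make the idea of a swappable strategy contract concrete, here is a minimal sketch in Python. It is not the project's API: the `HDCStrategy` protocol, its method names, the `DenseBipolar` class, and the `recall_filler` helper are all hypothetical, and the dense bipolar encoding is only one textbook choice among the strategies listed above.

```python
import random
from typing import Dict, List, Protocol, Sequence

Vector = List[int]


class HDCStrategy(Protocol):
    """Hypothetical strategy contract; the method names are illustrative."""

    def random_vector(self) -> Vector: ...
    def bind(self, a: Vector, b: Vector) -> Vector: ...
    def bundle(self, vectors: Sequence[Vector]) -> Vector: ...
    def similarity(self, a: Vector, b: Vector) -> float: ...


class DenseBipolar:
    """Dense strategy over {-1, +1} vectors (one possible implementation)."""

    def __init__(self, dim: int = 2048, seed: int = 0) -> None:
        self.dim = dim
        self._rng = random.Random(seed)

    def random_vector(self) -> Vector:
        return [self._rng.choice((-1, 1)) for _ in range(self.dim)]

    def bind(self, a: Vector, b: Vector) -> Vector:
        # Elementwise product; self-inverse, so bind(bind(a, b), a) == b.
        return [x * y for x, y in zip(a, b)]

    def bundle(self, vectors: Sequence[Vector]) -> Vector:
        # Elementwise majority vote; ties (even counts only) resolve to +1.
        return [1 if sum(col) >= 0 else -1 for col in zip(*vectors)]

    def similarity(self, a: Vector, b: Vector) -> float:
        # Normalized dot product in [-1, 1].
        return sum(x * y for x, y in zip(a, b)) / self.dim


def recall_filler(s: HDCStrategy, record: Vector, role: Vector,
                  fillers: Dict[str, Vector]) -> str:
    """Unbind `role` from `record` and return the best-matching filler name."""
    probe = s.bind(record, role)
    return max(fillers, key=lambda name: s.similarity(probe, fillers[name]))


# Encode one role/filler record and query it back through the contract.
strategy = DenseBipolar(dim=2048, seed=1)
roles = {r: strategy.random_vector() for r in ("subject", "relation", "object")}
fillers = {f: strategy.random_vector() for f in ("socrates", "is_a", "mortal")}
record = strategy.bundle([
    strategy.bind(roles["subject"], fillers["socrates"]),
    strategy.bind(roles["relation"], fillers["is_a"]),
    strategy.bind(roles["object"], fillers["mortal"]),
])
print(recall_filler(strategy, record, roles["object"], fillers))
```

Because `recall_filler` depends only on the contract, a sparse or exact strategy could be substituted without touching the calling code. With three bundled pairs at this dimension, the correct filler should score near 0.5 while unrelated fillers score near 0, which is the kind of margin a capacity experiment would track as more pairs are added.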

Agentic hardening loop

AutoDiscovery-style workflows treat eval data as a judge: regressions become artifacts (tests, minimal repros, documented fixes).

See agentic code generation and DS20.

Research Directions

NL→DSL Translation (Grammar-Based)

Status: Active

Goal: Expand deterministic coverage of “structured reasoning language” patterns (high precision, reproducible behavior).

Impact: Reduces dependency on external services and enables offline operation with deterministic parsing.

NL→DSL Translation (LLM-Assisted)

Status: Active

Goal: Use LLMs as proposers and disambiguators while keeping the system’s semantics checkable by DSL + evaluation.

Impact: Handles edge cases that the grammar alone cannot, and enables more natural user interaction.

Proof→NL Generation

Status: Prototype

Goal: Make formal reasoning legible to humans: fluent explanations derived from proofs/traces (not post-hoc narratives).

Impact: Makes formal reasoning accessible to non-technical users, enables audit documentation.

Essay & Document Generation

Status: Planned

Goal: Generate structured, evidence-linked reports from query/proof results (research notes, audits, review memos).

Impact: Transforms knowledge bases into publishable content, enables automated reporting.

Learning from Textbook-Style Content

Status: Planned

Goal: Assist theory authoring: propose candidate theory fragments from structured text, validate via eval suites and proofs.

Impact: Bootstraps knowledge bases from existing content, accelerates domain modeling.

HDC/VSA/HRR Strategy Development

Status: Research

Goal: Advance the theoretical and practical foundations of hyperdimensional computing for reasoning.

Impact: A controlled substrate for retrieval and capacity research; informs UTE-scale theory manipulation.

Deep Holographic Reasoning

Status: Research

Goal: Understand when holographic steps help and when symbolic validation is required; build safe hybrid workflows.

Impact: Fast retrieval and candidate generation for large theory sets, with auditable fallbacks.
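One way to picture a hybrid workflow with an auditable fallback is a "propose, then validate" loop: vector similarity ranks candidate theories, and a symbolic check has the final say. The sketch below is an illustration under stated assumptions, not the project's implementation: `propose_then_validate`, `Result`, the toy `index`, and the `validate` callback (standing in for a real query/prove call) are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Sequence, Tuple


@dataclass
class Result:
    answer: Optional[str]
    path: str  # which route produced the answer
    trace: List[Tuple[str, bool]] = field(default_factory=list)


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def propose_then_validate(query: Sequence[float],
                          index: Dict[str, Sequence[float]],
                          validate: Callable[[str], bool],
                          top_k: int = 2) -> Result:
    """Rank candidates by similarity, then accept only a validated one.

    The first `top_k` candidates count as fast holographic proposals; anything
    accepted after that is labeled as a symbolic fallback. Every attempt is
    recorded so the decision can be audited.
    """
    ranked = sorted(index, key=lambda name: dot(query, index[name]), reverse=True)
    trace: List[Tuple[str, bool]] = []
    for position, name in enumerate(ranked):
        ok = validate(name)  # hypothetical stand-in for a symbolic proof check
        trace.append((name, ok))
        if ok:
            path = "hdc-candidate" if position < top_k else "symbolic-fallback"
            return Result(name, path, trace)
    return Result(None, "no-valid-candidate", trace)


# Toy index: names and vectors are made up for illustration.
index = {
    "t1": [1.0, 0.0, 0.0],
    "t2": [0.9, 0.1, 0.0],
    "t3": [0.0, 1.0, 0.0],
    "t4": [0.0, 0.0, 1.0],
}
result = propose_then_validate([1.0, 0.0, 0.0], index, lambda name: name == "t2")
print(result.answer, result.path, result.trace)
```

The design choice worth noting is that similarity never decides correctness on its own: it only orders the work, and the recorded trace shows which candidates were rejected before an answer was accepted.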

Domain Semantic Libraries

Status: Research

Goal: Build an ecosystem of domain libraries (theory fragments + eval suites) that make UTE progress cumulative.

Impact: Domain specialization is the path to scientific-theory assistants; shared libraries also support a community contribution model.

UTE milestones (qualitative)

Near-term

Improve the “Linux layer”: stable semantics, clearer docs/spec statuses, richer eval coverage, and better developer ergonomics.

Anchor: implemented DS + evaluation-driven development.

Mid-term

Prototype UTE-layer primitives where they can be validated: evidence/provenance objects, contradiction reports, and revision workflows.

Anchor: DS34 (provenance/revision) + DS19 direction.
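As a rough illustration of what evidence/provenance objects and a contradiction report could look like, consider the sketch below. It does not reflect DS34 or any existing data model: the `Evidence` and `Claim` classes, their fields, and `contradiction_report` are hypothetical, and contradiction is reduced to the simplest case of the same statement being both asserted and denied.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Evidence:
    source: str  # where the claim comes from (hypothetical field)
    note: str    # free-text justification (hypothetical field)


@dataclass(frozen=True)
class Claim:
    statement: str         # e.g. "conducts(copper, electricity)"
    negated: bool = False  # True when the claim denies the statement
    evidence: Tuple[Evidence, ...] = ()


def contradiction_report(claims: List[Claim]) -> List[Dict[str, object]]:
    """List statements that are both asserted and denied, with their sources."""
    by_statement: Dict[str, List[Claim]] = {}
    for claim in claims:
        by_statement.setdefault(claim.statement, []).append(claim)

    report = []
    for statement, group in by_statement.items():
        asserted = [c for c in group if not c.negated]
        denied = [c for c in group if c.negated]
        if asserted and denied:
            report.append({
                "statement": statement,
                "asserted_by": [e.source for c in asserted for e in c.evidence],
                "denied_by": [e.source for c in denied for e in c.evidence],
            })
    return report


# Made-up claims for illustration only.
claims = [
    Claim("conducts(copper, electricity)",
          evidence=(Evidence("lab-notebook-A", "measured"),)),
    Claim("conducts(copper, electricity)", negated=True,
          evidence=(Evidence("draft-theory-B", "assumed"),)),
    Claim("conducts(glass, electricity)", negated=True,
          evidence=(Evidence("lab-notebook-A", "measured"),)),
]
for entry in contradiction_report(claims):
    print(entry)
```

A revision workflow could then take each report entry as its input, since the entry already names the sources that need to be reconciled.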

Long-horizon

Integrate causal/mechanistic, uncertainty, numeric modeling, and experiment planning into closed-loop “theory improvement” workflows.

Anchor: DS35–DS38 (research) as the roadmap.

Honesty note: Roadmaps are hypotheses. In this project we keep the boundary explicit: what is implemented is tracked in the Specs matrix, and research specs are marked clearly (Research/Exploratory/Proposed).