AGISystem2 – Research Roadmap

UTE Roadmap: Toward “Scientific-Theory AGI”

This roadmap describes how we plan to build a Universal Theory Engine (UTE): a platform where theories are executable, checkable, revisable, and comparable. The goal is not “general AGI” in the nebulous sense but a pragmatic kind of AGI for science and engineering: agents that can represent, review, argue about, prove, and revise scientific theories with evidence and auditability.

Our core bet is that UTE requires a hybrid stack: symbolic semantics for correctness and explicit traces, plus HDC/VSA (hyperdimensional computing / vector symbolic architecture) strategies for fast retrieval and for capacity experiments as the knowledge base grows (elastic/dynamic-size representations).

See Universal Theory Engine (UTE) and the research specs DS32–DS38 in the Specs matrix.

Where we are now

AGISystem2 is already a coherent research platform: you can express micro-theories in the DSL, run symbolic reasoning with proofs/traces, and run capacity and retrieval experiments via multiple HDC strategies (including elastic and lossless variants).

Implemented foundation

DSL parsing and execution, Session runtime, query/prove, traces/proofs, eval harness.

See Specs matrix for implemented vs research items.

HDC strategy substrate

Dense / Sparse / Metric / EMA (elastic) / EXACT (lossless) strategies sit behind a single swappable strategy contract, so experiments can change the strategy while holding everything else fixed.

See HDC strategy directions and dynamic size representations.
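
A swappable strategy contract of this kind can be sketched as follows. This is a minimal illustration and not AGISystem2's actual interface: the names HDCStrategy, DenseBipolar, and retrieval_accuracy, the method set, and the bipolar (+1/-1) encoding are all assumptions made for the example.

```python
import random
from typing import List, Protocol

Vector = List[int]


class HDCStrategy(Protocol):
    """Hypothetical contract (an assumption, not the project's real API)."""

    def random_vector(self) -> Vector: ...
    def bind(self, a: Vector, b: Vector) -> Vector: ...
    def bundle(self, vectors: List[Vector]) -> Vector: ...
    def similarity(self, a: Vector, b: Vector) -> float: ...


class DenseBipolar:
    """Dense strategy sketch: +1/-1 vectors with elementwise-product binding."""

    def __init__(self, dim: int = 2048, seed: int = 0) -> None:
        self.dim = dim
        self.rng = random.Random(seed)

    def random_vector(self) -> Vector:
        return [self.rng.choice((-1, 1)) for _ in range(self.dim)]

    def bind(self, a: Vector, b: Vector) -> Vector:
        return [x * y for x, y in zip(a, b)]

    def bundle(self, vectors: List[Vector]) -> Vector:
        # Majority vote per dimension; ties resolve to +1.
        return [1 if sum(column) >= 0 else -1 for column in zip(*vectors)]

    def similarity(self, a: Vector, b: Vector) -> float:
        return sum(x * y for x, y in zip(a, b)) / len(a)


def retrieval_accuracy(strategy: HDCStrategy, n_pairs: int) -> float:
    """Store n_pairs key/value bindings in one bundle, then try to recover each value."""
    keys = [strategy.random_vector() for _ in range(n_pairs)]
    values = [strategy.random_vector() for _ in range(n_pairs)]
    memory = strategy.bundle([strategy.bind(k, v) for k, v in zip(keys, values)])
    hits = 0
    for i, key in enumerate(keys):
        # For bipolar vectors, binding is its own inverse, so this unbinds the key.
        noisy_value = strategy.bind(memory, key)
        best = max(range(n_pairs), key=lambda j: strategy.similarity(noisy_value, values[j]))
        hits += int(best == i)
    return hits / n_pairs
```

Because the experiment function depends only on the contract, a capacity run would swap the strategy object and leave the harness unchanged, for example by calling retrieval_accuracy(DenseBipolar(dim=2048), n_pairs) for increasing n_pairs.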

Agentic hardening loop

AutoDiscovery-style workflows treat eval data as a judge: regressions become artifacts (tests, minimal repros, documented fixes).

See agentic code generation and DS20.
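
The idea that regressions become artifacts can be illustrated with a small comparison between two eval runs. This is a sketch under stated assumptions: the RegressionArtifact record, the find_regressions helper, and the pass/fail dictionary format are hypothetical and are not taken from DS20 or the project's eval harness.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class RegressionArtifact:
    """Hypothetical record for one eval case that used to pass and now fails."""

    case_id: str
    repro_hint: str


def find_regressions(
    baseline: Dict[str, bool], current: Dict[str, bool]
) -> List[RegressionArtifact]:
    """Return one artifact per case that passed in baseline but not in current.

    Assumption: a case missing from the current run counts as a failure.
    """
    artifacts = []
    for case_id in sorted(baseline):
        if baseline[case_id] and not current.get(case_id, False):
            hint = f"re-run eval case {case_id!r} in isolation"
            artifacts.append(RegressionArtifact(case_id, hint))
    return artifacts
```

Each returned record is a starting point for the artifacts named above: a test, a minimal repro, and eventually a documented fix.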

Research Directions

NL→DSL Translation (Grammar-Based) [Status: Active]

Goal: Expand deterministic coverage of “structured reasoning language” patterns (high precision, reproducible behavior).

Complex quantifiers (all, some, most, few)
Nested conditionals and exceptions
Temporal expressions (before, after, during, while)
Comparative structures (more than, as...as, the most)

Impact: Reduces dependence on external services, enables offline operation, and makes parsing deterministic.
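
As a rough illustration of deterministic, grammar-based coverage, the sketch below handles one quantifier pattern with a regular expression. The output tuple is a stand-in and not AGISystem2's DSL syntax, and the names PATTERN and parse_quantified are assumptions for the example.

```python
import re
from typing import Optional, Tuple

# One fixed pattern: "<quantifier> <subject> are|is <predicate>".
PATTERN = re.compile(
    r"^(?P<q>all|some|most|few)\s+(?P<subject>\w+)\s+(?:are|is)\s+(?P<predicate>\w+)\.?$",
    re.IGNORECASE,
)


def parse_quantified(sentence: str) -> Optional[Tuple[str, str, str]]:
    """Return (quantifier, subject, predicate), or None when out of coverage."""
    match = PATTERN.match(sentence.strip())
    if match is None:
        # Outside the grammar: report no parse rather than guess.
        return None
    return (
        match.group("q").lower(),
        match.group("subject").lower(),
        match.group("predicate").lower(),
    )
```

The same input always yields the same result, and sentences outside the pattern return None instead of a best guess, which is the property that makes this path reproducible and usable offline.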

NL→DSL Translation (LLM-Assisted) [Status: Active]

Goal: Use LLMs as proposers and disambiguators while keeping the system’s semantics checkable by DSL + evaluation.

Disambiguation of pronouns and references
Domain-specific terminology expansion
Handling of idioms and non-literal expressions
Multi-sentence context understanding

Impact: Handles edge cases that grammar alone cannot and enables natural user interaction.

Proof→NL Generation [Status: Prototype]

Goal: Make formal reasoning legible to humans: fluent explanations derived from proofs/traces (not post-hoc narratives).

Template-based generation for common patterns
LLM-enhanced fluency post-processing
Adjustable verbosity levels (summary vs detailed)
Multi-language support (RO, EN, FR, DE)

Impact: Makes formal reasoning accessible to non-technical users and enables audit documentation.
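
Template-based generation with adjustable verbosity can be sketched as below. The proof-step shape (rule name, premises, conclusion), the rule names, and the templates are hypothetical and chosen only for illustration. The point is that every sentence is derived from a step in the trace.

```python
from typing import List, Tuple

# Hypothetical trace shape: (rule_name, premises, conclusion).
Step = Tuple[str, List[str], str]

TEMPLATES = {
    "fact": "It is given that {conclusion}.",
    "modus_ponens": "Because {premises}, it follows that {conclusion}.",
}
FALLBACK = "By {rule}, {conclusion}."


def explain(steps: List[Step], verbosity: str = "detailed") -> str:
    """Render proof steps as text; 'summary' keeps only the final step."""
    if not steps:
        return ""
    if verbosity == "summary":
        steps = steps[-1:]
    sentences = []
    for rule, premises, conclusion in steps:
        template = TEMPLATES.get(rule, FALLBACK)
        sentences.append(
            template.format(
                rule=rule,
                premises=" and ".join(premises),
                conclusion=conclusion,
            )
        )
    return " ".join(sentences)
```

An LLM fluency pass, if used, would take this output as input, so the explanation's content stays anchored to the trace.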

Essay & Document Generation [Status: Planned]

Goal: Generate structured, evidence-linked reports from query/proof results (research notes, audits, review memos).

Outline generation from concept hierarchies
Paragraph synthesis from related facts
Citation and source tracking
Style adaptation (academic, professional, casual)

Impact: Transforms knowledge bases into publishable content and enables automated reporting.

Learning from Textbook-Style Content [Status: Planned]

Goal: Assist theory authoring by proposing candidate theory fragments from structured text and validating them via eval suites and proofs.

Definition extraction ("X is defined as...")
Relationship mining (taxonomies, properties, rules)
Example-based rule induction
Concept graph construction

Impact: Bootstraps knowledge bases from existing content and accelerates domain modeling.
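
Definition extraction for the "X is defined as..." pattern can be sketched with a single regular expression. This is a minimal example under assumptions: the sentence splitter is naive, the names DEFINITION and extract_definitions are hypothetical, and the results are only candidates that would still need validation by eval suites and proofs.

```python
import re
from typing import List, Tuple

# Optional leading article, then "<term> is defined as <definition>".
DEFINITION = re.compile(
    r"^(?:(?:a|an|the)\s+)?(?P<term>.+?)\s+is defined as\s+(?P<definition>.+?)\.?$",
    re.IGNORECASE,
)


def extract_definitions(text: str) -> List[Tuple[str, str]]:
    """Return (term, definition) candidates found in the text."""
    candidates = []
    # Naive sentence split on whitespace after ., ! or ?
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        match = DEFINITION.match(sentence)
        if match:
            term = match.group("term").strip().lower()
            candidates.append((term, match.group("definition").strip()))
    return candidates
```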

HDC/VSA/HRR Strategy Development [Status: Research]

Goal: Advance the theoretical and practical foundations of hyperdimensional computing for reasoning.

Hybrid strategies combining Dense + Sparse + Metric approaches
Dynamic/elastic size representations under KB growth
Novel similarity measures for semantic matching
Scalability to million-fact knowledge bases

Impact: A controlled substrate for retrieval and capacity research; informs UTE-scale theory manipulation.

Deep Holographic Reasoning [Status: Research]

Goal: Understand when holographic steps help and when symbolic validation is required; build safe hybrid workflows.

HDC Master Equation for direct query answering
Analogical reasoning via bind/unbind algebra
Similarity-based rule application
Probabilistic confidence from vector distances

Impact: Fast retrieval and candidate generation for large theory sets, with auditable fallbacks.
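
Analogical reasoning via bind/unbind algebra, together with a distance-based confidence check, can be illustrated with a well-known style of VSA example ("what plays the role in Mexico that the dollar plays in the USA?"). This is a sketch with assumed names, an assumed dimension, and an arbitrary acceptance threshold. It is not the project's HDC Master Equation, and the similarity score is only a confidence proxy, not a calibrated probability.

```python
import random
from typing import List, Tuple

DIM = 4096
rng = random.Random(42)

Vector = List[int]


def rand_vec() -> Vector:
    return [rng.choice((-1, 1)) for _ in range(DIM)]


def bind(a: Vector, b: Vector) -> Vector:
    # Elementwise product; for +1/-1 vectors, bind is its own inverse (unbind).
    return [x * y for x, y in zip(a, b)]


def bundle(vectors: List[Vector]) -> Vector:
    # Majority vote per dimension; ties resolve to +1.
    return [1 if sum(column) >= 0 else -1 for column in zip(*vectors)]


def cosine(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b)) / DIM


NAMES = ("country", "capital", "currency", "usa", "washington",
         "dollar", "mexico", "mexico_city", "peso")
symbols = {name: rand_vec() for name in NAMES}


def record(country: str, capital: str, currency: str) -> Vector:
    """Encode one record as a bundle of role-filler bindings."""
    return bundle([
        bind(symbols["country"], symbols[country]),
        bind(symbols["capital"], symbols[capital]),
        bind(symbols["currency"], symbols[currency]),
    ])


usa = record("usa", "washington", "dollar")
mexico = record("mexico", "mexico_city", "peso")


def analog(query: str, source: Vector, target: Vector,
           threshold: float = 0.15) -> Tuple[str, float, bool]:
    """Map a filler from source to target; flag whether the score is trusted."""
    probe = bind(symbols[query], bind(source, target))
    scores = {name: cosine(probe, vec) for name, vec in symbols.items()}
    best = max(scores, key=scores.get)
    # Below the threshold, the candidate should go to symbolic validation.
    return best, scores[best], scores[best] >= threshold
```

Calling analog("dollar", usa, mexico) is expected to return peso as the best candidate with the flag set, while a query with no analog in the records, such as analog("country", usa, mexico), should fall below the threshold. The flag is where a hybrid workflow would hand the candidate to symbolic validation instead of answering directly.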

Domain Semantic Libraries [Status: Research]

Goal: Build an ecosystem of domain libraries (theory fragments + eval suites) that make UTE progress cumulative.

Medicine: Symptoms, diagnoses, treatments, drug interactions
Law: Legal concepts, rights, obligations, precedents
Physics: Physical quantities, laws, relationships
Biology: Organisms, pathways, genetics, ecology

Impact: Domain specialization is the path to scientific-theory assistants, and the libraries are intended to grow through a community contribution model.

UTE milestones (qualitative)

Near-term

Improve the “Linux layer”: stable semantics, clearer docs/spec statuses, richer eval coverage, and better developer ergonomics.

Anchor: implemented DS specs + evaluation-driven development.

Mid-term

Prototype UTE-layer primitives where they can be validated: evidence/provenance objects, contradiction reports, and revision workflows.

Anchor: DS34 (provenance/revision) + DS19 direction.
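
To make the mid-term primitives more concrete, the sketch below models evidence-carrying claims and a simple contradiction report. It is a hypothetical shape for illustration only: the Evidence and Claim records and the contradiction_report function are assumptions and do not reflect the contents of DS34 or DS19.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Evidence:
    source: str  # where the support comes from (hypothetical field)
    note: str


@dataclass(frozen=True)
class Claim:
    statement: str
    holds: bool  # True asserts the statement, False denies it
    evidence: Tuple[Evidence, ...] = ()


def contradiction_report(claims: List[Claim]) -> List[Tuple[Claim, Claim]]:
    """Pair each claim with an earlier claim that takes the opposite stance."""
    first_seen: Dict[Tuple[str, bool], Claim] = {}
    report = []
    for claim in claims:
        opposite = first_seen.get((claim.statement, not claim.holds))
        if opposite is not None:
            report.append((opposite, claim))
        first_seen.setdefault((claim.statement, claim.holds), claim)
    return report
```

Because each claim keeps its evidence, a reported pair carries the provenance needed for a revision workflow to decide which side to retract or refine.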

Long-horizon

Integrate causal/mechanistic, uncertainty, numeric modeling, and experiment planning into closed-loop “theory improvement” workflows.

Anchor: DS35–DS38 (research) as the roadmap.

Honesty note: Roadmaps are hypotheses. In this project we keep the boundary explicit: what is implemented is tracked in the Specs matrix, and research specs are marked clearly (Research/Exploratory/Proposed).