Explainability in AI means more than generating plausible-sounding text. AGISystem2 provides actual proof traces—the real computational steps that led to a conclusion, not a post-hoc confabulation.

1. What Makes HDC Explainable?

Whereas a neural network's decisions emerge from billions of opaque weights, AGISystem2's reasoning is inherently transparent:

Explicit Rules

Every inference corresponds to a named rule in the knowledge base. No implicit patterns learned from data.

Traceable Steps

Each reasoning step is an explicit HDC operation (BIND, UNBIND, SIMILARITY). The proof trace IS the computation (see the sketch after this list).

Deterministic

Same input always produces same output. Explanations are reproducible and debuggable.

Human-Readable KB

Knowledge is stored in DSL format readable by humans, not opaque weight matrices.
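
These properties can be made concrete with a minimal sketch of the three primitives, assuming bipolar (+1/-1) hypervectors and a fixed random seed; the dimension and function names are illustrative, not AGISystem2's actual API.

import numpy as np

DIM = 10_000
rng = np.random.default_rng(seed=42)      # fixed seed: same input, same output

def hv():
    """A random bipolar hypervector."""
    return rng.choice([-1, 1], size=DIM)

def bind(a, b):
    """BIND: element-wise multiplication (self-inverse for bipolar vectors)."""
    return a * b

def unbind(pair, key):
    """UNBIND: binding with the key again recovers the other operand."""
    return pair * key

def similarity(a, b):
    """SIMILARITY: normalized dot product (cosine for bipolar vectors)."""
    return float(np.dot(a, b)) / DIM

dog, spot = hv(), hv()
fact = bind(dog, spot)                    # encode the pairing behind Dog(Spot)
recovered = unbind(fact, dog)             # traceable step: what is bound to Dog?
print(similarity(recovered, spot))        # 1.0: the step can be checked exactly

Because every operation is an ordinary vector computation with no learned weights, each step can be logged, replayed, and inspected.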

2. Anatomy of a Proof Trace

When AGISystem2 proves a goal, it generates a complete trace of reasoning steps:

GOAL: prove(Animal(Spot))

STEP 1: Query KB for direct fact Animal(Spot)
        Result: NOT FOUND (similarity 0.48)

STEP 2: Search for applicable rules
        Found: rule: if Dog(?x) then Animal(?x)
               (Dog IS_A Animal - transitive inheritance)

STEP 3: Unify rule with goal
        ?x = Spot
        New subgoal: Dog(Spot)

STEP 4: Query KB for Dog(Spot)
        Result: FOUND (similarity 0.97)
        Source: explicit fact in knowledge base

STEP 5: Apply rule [Dog IS_A Animal]
        Dog(Spot) ⟹ Animal(Spot)

CONCLUSION: Animal(Spot) = TRUE
        Confidence: 0.97
        Proof depth: 2
        Rules applied: [Dog IS_A Animal]

Every element of this trace is verifiable: the cited facts and rules can be looked up directly in the knowledge base, the similarity scores can be recomputed from the stored hypervectors, and the unification bindings can be checked against the rule definitions.
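
One way to make such a trace machine-checkable is to store each step as a structured record. The following is a sketch; the field names are assumptions, not AGISystem2's actual schema.

from dataclasses import dataclass, field
from typing import Optional

@dataclass
class ProofStep:
    action: str                              # "query_kb", "unify", "apply_rule"
    goal: str                                # e.g. "Animal(Spot)"
    rule: Optional[str] = None               # named rule, e.g. "Dog IS_A Animal"
    bindings: dict = field(default_factory=dict)   # e.g. {"?x": "Spot"}
    similarity: Optional[float] = None       # HDC match score, if any

trace = [
    ProofStep("query_kb", "Animal(Spot)", similarity=0.48),        # STEP 1
    ProofStep("apply_rule", "Animal(Spot)", rule="Dog IS_A Animal",
              bindings={"?x": "Spot"}),                            # STEPS 2-3
    ProofStep("query_kb", "Dog(Spot)", similarity=0.97),           # STEP 4
]
# Each step names the exact rule or fact it used, so it can be checked
# against the knowledge base directly.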

3. Multi-Level Explanations

Different stakeholders need different levels of detail (a sketch rendering one trace for each audience follows these examples):

Developer Level

Audience: System developers, debuggers

Content: Full proof trace with vector operations, similarity scores, binding/unbinding steps

Example:

prove(Animal(Spot)):
  match(Dog(Spot), KB) → sim=0.97
  apply_rule(Dog_IS_A_Animal, {?x=Spot})
  → Animal(Spot) [confidence=0.97, depth=2]

Domain Expert Level

Audience: Domain specialists, knowledge engineers

Content: Key reasoning steps, rules applied, without HDC internals

Example: "Concluded Spot is an animal by applying the inheritance rule 'all dogs are animals' to the known fact that Spot is a dog."

End User Level

Audience: Non-technical users

Content: Natural language summary of the conclusion

Example: "Spot is an animal because Spot is a dog, and all dogs are animals."

Auditor Level

Audience: Compliance officers, legal reviewers

Content: Decision justification with rule citations

Example: "Decision: APPROVED. Basis: Rule KB.Ontology.001 (taxonomic inheritance) applied to established fact F.Dog.Spot."

4. Contrastive Explanations

Often the most useful explanation answers "Why X instead of Y?"

QUESTION: Why is Spot classified as Animal, not Plant?

ANALYSIS:

For Animal(Spot):
  - Dog(Spot) is TRUE              [explicit fact]
  - Dog IS_A Animal                [rule exists]
  ⟹ Animal(Spot) PROVABLE

For Plant(Spot):
  - No direct fact Plant(Spot)
  - No rule chain leading to Plant(Spot)
  - Dog IS_NOT_A Plant             [explicit constraint]
  ⟹ Plant(Spot) NOT PROVABLE

CONTRASTIVE EXPLANATION:
  "Spot is Animal (not Plant) because Spot is a Dog, dogs inherit from
   Animal (not Plant), and Dog and Plant are disjoint categories."
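
A toy sketch of contrastive explanation over a two-entry fact/rule base follows; prove_with_trace and the KB encoding are assumptions for illustration, and the explicit disjointness check is omitted for brevity.

FACTS = {"Dog(Spot)"}
RULES = {"Animal(?x)": "Dog(?x)"}          # "Dog IS_A Animal"

def prove_with_trace(goal):
    """Return (provable, list of steps) for a goal about Spot."""
    if goal in FACTS:
        return True, [f"fact {goal}"]
    premise = RULES.get(goal.replace("Spot", "?x"))
    if premise:
        ok, sub = prove_with_trace(premise.replace("?x", "Spot"))
        if ok:
            return True, sub + [f"rule {premise} => {goal}"]
    return False, [f"no fact or rule chain for {goal}"]

def contrast(goal, alternative):
    """Show why one goal succeeds and the other fails."""
    ok_g, trace_g = prove_with_trace(goal)
    ok_a, trace_a = prove_with_trace(alternative)
    line_g = f"{goal}: " + ("PROVABLE via " + "; ".join(trace_g) if ok_g else "NOT PROVABLE")
    line_a = f"{alternative}: " + ("PROVABLE" if ok_a else "NOT PROVABLE: " + "; ".join(trace_a))
    return line_g + "\n" + line_a

print(contrast("Animal(Spot)", "Plant(Spot)"))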

5. Counterfactual Explanations

Understanding what would change an outcome:

QUESTION: What would make Animal(Spot) false?

COUNTERFACTUAL ANALYSIS:

Current proof depends on:
  1. Fact: Dog(Spot)
  2. Rule: Dog IS_A Animal

To invalidate the conclusion:
  Option A: Remove fact Dog(Spot)
  Option B: Remove rule Dog IS_A Animal
  Option C: Add exception: NOT Animal(Spot)

MINIMAL CHANGE:
  "If Spot were not a dog, we could not conclude Spot is an animal
   (without additional information)."
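
Counterfactual analysis can be approximated by re-running the proof with single facts or rules deleted, as in this toy sketch (all names are illustrative, not AGISystem2's actual API).

FACTS = {"Dog(Spot)"}
RULES = {"Animal(?x)": "Dog(?x)"}

def provable(goal, facts, rules):
    """Toy prover over facts and single-premise rules about Spot."""
    if goal in facts:
        return True
    premise = rules.get(goal.replace("Spot", "?x"))
    return bool(premise) and provable(premise.replace("?x", "Spot"), facts, rules)

def counterfactuals(goal):
    """Single deletions of a fact or rule that flip the conclusion."""
    changes = []
    for f in FACTS:
        if not provable(goal, FACTS - {f}, RULES):
            changes.append(f"remove fact {f}")
    for head in RULES:
        pruned = {h: b for h, b in RULES.items() if h != head}
        if not provable(goal, FACTS, pruned):
            changes.append(f"remove rule {head} <- {RULES[head]}")
    return changes

print(counterfactuals("Animal(Spot)"))
# ['remove fact Dog(Spot)', 'remove rule Animal(?x) <- Dog(?x)']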

6. Explanation Quality Metrics

AGISystem2 explanations can be evaluated on:

Metric         Description                            How HDC Achieves It
Fidelity       Explanation matches actual reasoning   Proof trace IS the computation
Completeness   All relevant steps included            Full trace available, summarizable
Consistency    Same query → same explanation          Deterministic operations
Verifiability  Claims can be checked                  Rules and facts inspectable
Minimality     No irrelevant details                  Proof traces are minimal paths
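
Consistency, for instance, can be checked mechanically by repeating a query and comparing conclusions and traces; the prove signature below is an assumption used only for illustration.

def check_consistency(prove, goal, runs=5):
    """Same query must yield the same conclusion and the same trace."""
    results = [prove(goal) for _ in range(runs)]
    return all(r == results[0] for r in results)

# Toy deterministic prover standing in for the real engine.
def prove(goal):
    return (True, ("fact Dog(Spot)", "rule Dog IS_A Animal"))

print(check_consistency(prove, "Animal(Spot)"))   # True for a deterministic engine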

7. Explainability and Bias Detection

Explainability enables systematic bias analysis:

Key Insight: Because we can see exactly which rules led to which conclusions, we can analyze whether certain rules disproportionately affect certain groups.
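
A minimal sketch of such an analysis: count how often each rule appears in decisions affecting each group and flag skewed rules. The decision records and field names are illustrative assumptions.

from collections import Counter, defaultdict

decisions = [
    {"group": "A", "outcome": "APPROVED", "rules": ["KB.Ontology.001"]},
    {"group": "B", "outcome": "DENIED",   "rules": ["KB.Risk.007"]},
    {"group": "B", "outcome": "DENIED",   "rules": ["KB.Risk.007"]},
]

rule_use = defaultdict(Counter)
for d in decisions:
    for rule in d["rules"]:
        rule_use[rule][d["group"]] += 1

for rule, by_group in rule_use.items():
    print(rule, dict(by_group))   # flags rules that fire mostly for one group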

See Bias Study for detailed methodology.

8. Implementation Architecture

AGISystem2's explainability system:


+--------------------------------------------------+
|              Explanation Generator               |
|    Transforms proof traces to target audience    |
+--------------------------------------------------+
|                Proof Trace Store                 |
|    Complete record of all reasoning steps        |
+--------------------------------------------------+
|               Reasoning Engine                   |
|    prove() / query() with trace logging          |
+--------------------------------------------------+
|                HDC Foundation                    |
|    Deterministic, traceable operations           |
+--------------------------------------------------+
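
The layering can be sketched as three thin components over the HDC foundation; class and method names here are illustrative assumptions, not AGISystem2's actual interfaces.

class ProofTraceStore:
    """Complete record of all reasoning steps, keyed by goal."""
    def __init__(self):
        self._traces = {}
    def save(self, goal, steps):
        self._traces[goal] = list(steps)
    def load(self, goal):
        return self._traces[goal]

class ReasoningEngine:
    """prove()/query() over the HDC foundation, logging every step."""
    def __init__(self, trace_store):
        self.trace_store = trace_store
    def prove(self, goal):
        steps = [f"query_kb({goal})", "apply_rule(...)"]   # placeholder steps
        self.trace_store.save(goal, steps)                 # trace is stored, never discarded
        return True

class ExplanationGenerator:
    """Transforms stored traces into audience-specific explanations."""
    def __init__(self, trace_store):
        self.trace_store = trace_store
    def explain(self, goal):
        return " -> ".join(self.trace_store.load(goal))

store = ProofTraceStore()
engine = ReasoningEngine(store)
engine.prove("Animal(Spot)")
print(ExplanationGenerator(store).explain("Animal(Spot)"))

The design choice worth noting is that the explanation layer only reads from the trace store; it never reconstructs or paraphrases reasoning the engine did not actually perform.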
  

9. Research Directions

Open Questions:

Related Documentation