Goal: Leverage LLMs (Claude, GPT, etc.) to handle ambiguous, idiomatic, or context-dependent natural language before formal grammar parsing.
Principle: LLMs handle language understanding; AGISystem2 handles formal reasoning. Each does what it's best at.
User Input (ambiguous)
│
▼
┌───────────────────┐
│ LLM Preprocessor │ ← Resolve ambiguity, expand context
└───────────────────┘
│
▼
┌───────────────────┐
│ Grammar Parser │ ← Parse normalized text
└───────────────────┘
│
▼
┌───────────────────┐
│ AGISystem2 │ ← Formal reasoning
└───────────────────┘
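The preprocess-then-parse flow above can be sketched as a three-stage function. This is only an illustration: `preprocessThenParse`, `normalizeWithLLM`, `parseGrammar`, and `reason` are hypothetical names injected by the caller, not confirmed AGISystem2 APIs, and the stubs stand in for a real LLM client and parser.

```javascript
// Sketch of the preprocess-then-parse flow. All three stage functions are
// hypothetical and injected by the caller; none is a real AGISystem2 API.
async function preprocessThenParse(userInput, stages) {
  const { normalizeWithLLM, parseGrammar, reason } = stages;
  // Stage 1: the LLM rewrites ambiguous text into explicit sentences.
  const normalized = await normalizeWithLLM(userInput);
  // Stage 2: the deterministic grammar parser handles the normalized text.
  const statements = parseGrammar(normalized);
  // Stage 3: the reasoner only ever receives parser output.
  return reason(statements);
}

// Stub stages so the flow can be exercised without an LLM or a reasoner.
const stubStages = {
  normalizeWithLLM: async (text) => text.replace(/\bHe\b/g, 'John'),
  parseGrammar: (text) => text.split('.').map((s) => s.trim()).filter(Boolean),
  reason: (statements) => ({ accepted: statements }),
};

preprocessThenParse('John saw Mary. He waved.', stubStages)
  .then((result) => console.log(result.accepted));
```

Injecting the stages keeps the LLM call replaceable, so the deterministic stages can be tested without network access.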
User Input
│
▼
┌───────────────────┐
│ Grammar Parser │ ← Try grammar first
└───────────────────┘
│
┌───┴───┐
│Failed │
└───┬───┘
▼
┌───────────────────┐
│ LLM Translation │ ← LLM generates DSL directly
└───────────────────┘
│
▼
┌───────────────────┐
│ DSL Validator │ ← Validate LLM output
└───────────────────┘
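The grammar-first flow with LLM fallback might look like the sketch below. The function name `translateWithFallback`, the injected dependencies, and the result shape (`ok`, `source`, `dsl`, `reason`) are assumptions made for illustration, not part of the documented system.

```javascript
// Sketch of the grammar-first fallback flow. The dependency names and the
// result shape are hypothetical, chosen only for this illustration.
async function translateWithFallback(userInput, deps) {
  const { parseGrammar, translateWithLLM, validate } = deps;

  // Try the deterministic grammar first: no API call if it succeeds.
  const parsed = parseGrammar(userInput);
  if (parsed.ok) {
    return { ok: true, source: 'grammar', dsl: parsed.dsl };
  }

  // Grammar failed: ask the LLM to produce DSL directly.
  const candidate = await translateWithLLM(userInput);

  // Never trust LLM output without validation.
  const verdict = await validate(candidate, userInput);
  if (!verdict.valid) {
    return { ok: false, source: 'llm', reason: verdict.reason };
  }
  return { ok: true, source: 'llm', dsl: candidate };
}
```

Returning a tagged result instead of throwing lets the caller report a validation failure to the user without treating it as an exceptional condition.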
| Capability | Use Case | Example |
|---|---|---|
| Coreference Resolution | Resolve pronouns and references | "John saw Mary. He waved." → "John saw Mary. John waved." |
| Idiom Expansion | Convert idioms to literal meaning | "It's raining cats and dogs" → "It's raining heavily" |
| Domain Terminology | Expand jargon and abbreviations | "The patient has HTN" → "The patient has hypertension" |
| Implicit Relations | Make implicit knowledge explicit | "Paris is a capital" → "Paris is the capital of France" |
| Sentence Simplification | Break complex sentences | "The tall man who wore a hat left" → "A man wore a hat. The man was tall. The man left." |
You are a natural language normalizer for a formal reasoning system.
Given this input text:
"${userInput}"
Transform it following these rules:
1. Resolve all pronouns to their referents
2. Expand abbreviations and acronyms
3. Convert idioms to literal meaning
4. Make implicit relations explicit
5. Split complex sentences into simple ones
6. Preserve all semantic content
Output the normalized text only, no explanations.
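One way to fill the normalization template is sketched below. `buildNormalizationPrompt` is a hypothetical helper. Quoting the input with `JSON.stringify` is an assumption added here so that quotes or newlines in the user's text cannot break out of the quoted slot; the template itself shows plain `"${userInput}"` interpolation.

```javascript
// Hypothetical helper: fills the normalization template with the user's text.
function buildNormalizationPrompt(userInput) {
  const rules = [
    'Resolve all pronouns to their referents',
    'Expand abbreviations and acronyms',
    'Convert idioms to literal meaning',
    'Make implicit relations explicit',
    'Split complex sentences into simple ones',
    'Preserve all semantic content',
  ];
  const numbered = rules.map((rule, i) => `${i + 1}. ${rule}`).join('\n');

  return [
    'You are a natural language normalizer for a formal reasoning system.',
    'Given this input text:',
    // Assumption: JSON.stringify adds the surrounding quotes and escapes
    // embedded quotes and newlines.
    JSON.stringify(userInput),
    'Transform it following these rules:',
    numbered,
    'Output the normalized text only, no explanations.',
  ].join('\n');
}

console.log(buildNormalizationPrompt('John saw Mary. He waved.'));
```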
Translate to AGISystem2 DSL:
DSL Syntax:
- isA Subject Type (taxonomy)
- has Subject Property (attributes)
- Implies $antecedent $consequent (rules)
- And $a $b, Or $a $b, Not (x) (logic)
- Variables: ?x, ?y (in rules)
Input: "${userInput}"
Output valid DSL only, one statement per line.
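Even when asked for "DSL only, one statement per line", a model may wrap its reply in a markdown code fence or add blank lines. A minimal cleanup step, assuming the reply arrives as a single string, could look like this; `splitStatements` is a hypothetical helper and not part of the documented pipeline.

```javascript
// Built with repeat() so this example contains no literal fence marker.
const FENCE = '`'.repeat(3);

// Hypothetical post-processing for the raw text of an LLM reply:
// one trimmed statement per array entry, fence lines and blanks dropped.
function splitStatements(rawReply) {
  return rawReply
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line.length > 0 && !line.startsWith(FENCE));
}

const reply = `${FENCE}dsl\nisA Dog Animal\n\n  has Dog Tail  \n${FENCE}`;
console.log(splitStatements(reply));
```

The resulting statements still need to go through validation; this step only normalizes the layout.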
// Validation pipeline
async function validateLLMOutput(dsl, originalInput) {
  // 1. Syntax check
  const parseResult = parseDSL(dsl);
  if (parseResult.errors.length > 0) {
    return { valid: false, reason: 'syntax_error' };
  }

  // 2. Extract entities from input
  const inputEntities = extractEntities(originalInput);

  // 3. Check all entities present in DSL
  const dslEntities = extractDSLEntities(dsl);
  const missing = inputEntities.filter(e => !dslEntities.includes(e));
  if (missing.length > 0) {
    return { valid: false, reason: 'missing_entities', missing };
  }

  // 4. Check for unknown operators
  const unknownOps = findUnknownOperators(dsl);
  if (unknownOps.length > 0) {
    return { valid: false, reason: 'unknown_operators', unknownOps };
  }

  return { valid: true };
}
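The validation pipeline calls `findUnknownOperators` without defining it. One possible implementation is sketched below. It assumes the operator set is exactly the one listed in the translation prompt (`isA`, `has`, `Implies`, `And`, `Or`, `Not`) and that every statement sits on one line with its operator as the first token; the real DSL may differ on both points.

```javascript
// Assumption: the operator set is the one listed in the translation prompt.
const KNOWN_OPERATORS = new Set(['isA', 'has', 'Implies', 'And', 'Or', 'Not']);

// Assumption: each statement sits on one line and starts with its operator.
// Returns each unknown operator once, in order of first appearance.
function findUnknownOperators(dsl) {
  const unknown = new Set();
  for (const line of dsl.split('\n')) {
    const operator = line.trim().split(/\s+/)[0];
    // Blank lines yield an empty first token and are skipped.
    if (operator && !KNOWN_OPERATORS.has(operator)) {
      unknown.add(operator);
    }
  }
  return [...unknown];
}

console.log(findUnknownOperators('isA Dog Animal\nlikes John Mary'));
```

An allow-list is the safer default here: an operator the model invents is rejected rather than silently passed to the reasoner.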
The recommended approach combines grammar-based and LLM-assisted translation, choosing the method by scenario:
| Scenario | Method | Rationale |
|---|---|---|
| Simple, structured input | Grammar only | Fast, deterministic, no API cost |
| Contains pronouns/references | LLM preprocessing → Grammar | Resolve references, then parse |
| Contains idioms/jargon | LLM preprocessing → Grammar | Normalize language, then parse |
| Grammar parse fails | LLM direct translation | Fallback for unsupported patterns |
| LLM translation fails validation | Return error to user | Never pass unvalidated output to the reasoner |
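The first three rows of the table amount to a routing decision made before any parsing. The sketch below shows one cheap way to make it, using a pronoun regex and caller-supplied idiom and abbreviation lists. The heuristics, the function name `chooseStrategy`, and the strategy labels are all assumptions for illustration; a real system might route differently, for example by trying the grammar first.

```javascript
// Assumption: a small fixed pronoun list is enough to flag coreference.
const PRONOUNS = /\b(he|she|it|they|him|her|them|his|hers|its|their)\b/i;

// Hypothetical router: returns 'llm-preprocess' or 'grammar-only'.
function chooseStrategy(text, options = {}) {
  const { knownIdioms = [], knownAbbreviations = [] } = options;

  // Pronouns or references: resolve with the LLM, then parse.
  if (PRONOUNS.test(text)) return 'llm-preprocess';

  // Idioms: case-insensitive substring match against a caller-supplied list.
  const lower = text.toLowerCase();
  if (knownIdioms.some((idiom) => lower.includes(idiom.toLowerCase()))) {
    return 'llm-preprocess';
  }

  // Abbreviations: exact, case-sensitive token match (e.g. "HTN").
  const tokens = new Set(text.split(/\W+/));
  if (knownAbbreviations.some((abbr) => tokens.has(abbr))) {
    return 'llm-preprocess';
  }

  // Simple, structured input: fast, deterministic, no API cost.
  return 'grammar-only';
}

console.log(chooseStrategy('The patient has HTN', { knownAbbreviations: ['HTN'] }));
```

The last two rows (parse failure and validation failure) are not covered by this router, since they depend on what happens after a strategy has been tried.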