Language Ambiguity and AI Constraints
Natural language is inherently ambiguous. Large Language Models parse such input effectively, but mapping it onto deterministic execution environments or formal logic remains prone to hallucination.
Constrained Natural Language (CNL)
CNLs are subsets of natural languages with a restricted grammar and a controlled vocabulary. They retain the readability of ordinary English while exhibiting the properties of formal languages. A notable example is Attempto Controlled English (ACE); the Grammatical Framework (GF) is a closely related formalism for defining and processing such controlled grammars.
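As an illustration (the sentence and formula below are an assumed example, not drawn from the text above), an ACE-style sentence admits exactly one logical reading:

```latex
% Illustrative ACE-style sentence and its first-order reading (assumed example):
% "Every customer owns a card."
\forall x \, \bigl( \mathrm{customer}(x) \rightarrow \exists y \, ( \mathrm{card}(y) \land \mathrm{owns}(x, y) ) \bigr)
```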
Historical & Related Initiatives
- Basic English: A controlled language from 1930 with a reduced vocabulary of 850 words, an early precursor to formal CNLs aimed at international communication.
- RuleML / SWRL: Standards for representing and exchanging rules on the Semantic Web, sometimes paired with human-readable, "logical English"-style surface forms.
- PENG (Processable English): A CNL designed specifically for mapping to First-Order Logic, prioritizing machine-executability.
- CLCE (Common Logic Controlled English): A formal subset of English that maps directly to the ISO Common Logic standard.
Executable Semantics
Mapping CNL directly to formal semantics (for example, translating sentences into SPARQL queries or logic programs) provides several technical benefits, illustrated in the sketch after this list:
- Deterministic Verification: Inputs that fail the CNL grammar are rejected, so hallucinated logic never reaches the execution layer.
- Auditability: Reasoning steps are represented in human-readable yet formally grounded text.
- Bi-directional Transformation: Enables the conversion of formal data into verifiable human-readable claims.
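A minimal sketch of the first two properties, assuming a toy two-pattern grammar (the patterns and the Prolog-style clause syntax are illustrative, not drawn from ACE, PENG, or any real CNL): sentences that match the grammar are translated deterministically into clauses; anything else is rejected rather than guessed at.

```python
import re

# Toy CNL grammar (illustrative, not ACE or PENG): exactly two sentence patterns are accepted.
UNIVERSAL = re.compile(r"^Every ([a-z]+) is an? ([a-z]+)\.$")
FACT = re.compile(r"^([A-Z][a-z]+) is an? ([a-z]+)\.$")

def cnl_to_clause(sentence: str) -> str:
    """Translate one CNL sentence into a Prolog-style clause, or reject it."""
    if m := UNIVERSAL.match(sentence):
        sub, sup = m.groups()
        return f"{sup}(X) :- {sub}(X)."      # "Every dog is an animal." -> animal(X) :- dog(X).
    if m := FACT.match(sentence):
        name, pred = m.groups()
        return f"{pred}({name.lower()})."    # "Rex is a dog." -> dog(rex).
    # Deterministic verification: anything outside the grammar is rejected, never guessed at.
    raise ValueError(f"not in the CNL grammar: {sentence!r}")

if __name__ == "__main__":
    for s in ("Every dog is an animal.", "Rex is a dog.", "Dogs usually like parks."):
        try:
            print(cnl_to_clause(s))
        except ValueError as err:
            print("rejected:", err)
```

Because every accepted sentence has exactly one clause form, the output is also auditable: each clause can be traced back to the human-readable sentence that produced it.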
Objective
The research focus is the development of systems in which an LLM translates natural language into a CNL, which is then processed by a deterministic symbolic engine. This hybrid approach aims to combine linguistic fluency with formal rigor.
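A minimal sketch of that division of labour, with both components passed in as plain callables; the names `llm_translate` and `cnl_compile` are hypothetical, and the `cnl_to_clause` function from the earlier sketch could serve as the compiler:

```python
from typing import Callable

def hybrid_pipeline(
    utterance: str,
    llm_translate: Callable[[str], list[str]],  # LLM side: free-form text -> candidate CNL sentences
    cnl_compile: Callable[[str], str],          # symbolic side: CNL sentence -> formal clause (raises on reject)
) -> list[str]:
    """Let the LLM handle fluency, then let the deterministic grammar accept or reject each sentence."""
    clauses = []
    for sentence in llm_translate(utterance):
        # Any off-grammar sentence the LLM produces fails here and never reaches the engine.
        clauses.append(cnl_compile(sentence))
    return clauses
```

Everything downstream of `cnl_compile` is deterministic, so the LLM's fluency never bypasses the formal check.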