Abstract Wiki Architect

Abstract Wiki Architect is a multilingual natural-language generation (NLG) toolkit for structured knowledge—designed with Abstract Wikipedia and Wikifunctions in mind, but usable independently.

Instead of building “one renderer per language,” it treats language generation as reusable infrastructure:

a small set of shared language-family engines (e.g., Romance, Slavic, Semitic),
per-language configuration cards (data, not hard-coded rules),
reusable sentence templates (“constructions”),
a lexicon that provides the features real languages need (gender, case, noun class, etc.),
a compact set of semantic frames (language-agnostic meaning structures),
and QA/test suites so multilingual output stays correct as the system evolves.

The goal: professional-grade, testable, rule-based NLG at scale that stays structured, auditable, and maintainable.

Why this exists

Projects like Abstract Wikipedia and Wikifunctions aim to represent knowledge in a language-independent form, then render it into many languages. That only works at scale if the system is:

shared where languages share structure (family-level logic),
data-driven where languages differ (language cards / morphology settings),
and testable (regression suites that catch breakage as soon as a change is made).

Abstract Wiki Architect is a concrete architecture for doing exactly that.

Note: This is not an official Wikimedia component. It’s a compatible, experimental architecture designed to learn from that ecosystem.

What it enables

Multilingual summaries from structured data (people, orgs, events, roles, decisions).
Consistent phrasing across languages (same meaning, equivalent sentence patterns).
Incremental language onboarding (add a language card + lexicon entries + tests).
High-trust outputs (rule-based, explainable, and regression-tested).

How it works (high-level)

Think of it as a pipeline:

Meaning (frames) → Sentence plan (templates) → Language rules (family engine) → Surface text

1) Semantic frames (meaning)

Inputs are normalized into small, typed “frames” that represent meaning in a language-neutral way (e.g., a biography frame, a membership/role frame, an event frame).
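
As a rough illustration of what "small, typed frames" could look like, here is a minimal sketch. The `BioFrame` fields and the `normalize_bio` helper are assumptions made for this example and are not taken from the repository.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BioFrame:
    """Hypothetical language-neutral biography frame."""
    name: str
    profession: str      # concept key, e.g. "writer"
    nationality: str     # concept key, e.g. "british"
    deceased: bool = False


def normalize_bio(raw: dict) -> BioFrame:
    """Turn a loosely structured record into a typed frame.

    Names are trimmed and concept keys lowercased so that later
    lexicon lookups see one canonical spelling.
    """
    return BioFrame(
        name=raw["name"].strip(),
        profession=raw["profession"].strip().lower(),
        nationality=raw["nationality"].strip().lower(),
        deceased=bool(raw.get("deceased", False)),
    )


frame = normalize_bio(
    {"name": " Douglas Adams ", "profession": "Writer",
     "nationality": "British", "deceased": True}
)
print(frame.profession)  # writer
```

Keeping the frame free of any language-specific strings is what lets the same object feed every language downstream.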

2) Sentence templates (“constructions”)

Reusable sentence patterns decide what to say (which roles to express, such as subject, predicate, or location) and call the language engine to realize those roles correctly in each language.
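
The split between "what to say" and "how to say it" might be sketched as follows. The construction picks roles from a frame and hands them to a per-language realizer. All names here (`role_in_place`, `realize_en`, `realize_fr`, the `LABELS` table) are hypothetical, and the two realizers are toy stand-ins for a real engine.

```python
from typing import Callable, Dict

Roles = Dict[str, str]
Realizer = Callable[[Roles], str]

# Toy per-language labels for concept keys (illustrative only).
LABELS = {"en": {"judge": "judge"}, "fr": {"judge": "juge"}}


def realize_en(roles: Roles) -> str:
    label = LABELS["en"][roles["predicate"]]
    return f"{roles['subject']} is a {label} in {roles['location']}."


def realize_fr(roles: Roles) -> str:
    # French normally omits the article before a profession after "est".
    label = LABELS["fr"][roles["predicate"]]
    return f"{roles['subject']} est {label} à {roles['location']}."


def role_in_place(frame: dict, realize: Realizer) -> str:
    """Construction: choose the roles to express, then delegate realization."""
    roles = {
        "subject": frame["person"],
        "predicate": frame["role"],
        "location": frame["place"],
    }
    return realize(roles)


frame = {"person": "Ana", "role": "judge", "place": "Lima"}
print(role_in_place(frame, realize_en))  # Ana is a judge in Lima.
print(role_in_place(frame, realize_fr))  # Ana est juge à Lima.
```

The construction itself contains no language-specific text, so adding a language means adding a realizer rather than a new template.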

3) Language-family engines + language cards

Family engines handle shared grammar logic; language cards specify language-specific details (articles, agreement options, morphology parameters).
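
A minimal sketch of the engine/card split, assuming a card is plain data and the family engine is one shared function. The `LanguageCard` fields and the `definite_np` function are invented for this example, and the elision rule is deliberately simplified (it ignores mute "h" and other exceptions).

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class LanguageCard:
    """Hypothetical per-language settings consumed by a family engine."""
    code: str
    definite_article: Dict[str, str]  # gender -> article
    elide_before_vowel: bool


SPANISH = LanguageCard("es", {"m": "el", "f": "la"}, elide_before_vowel=False)
FRENCH = LanguageCard("fr", {"m": "le", "f": "la"}, elide_before_vowel=True)

VOWELS = "aeiouàâéèêîôû"


def definite_np(card: LanguageCard, noun: str, gender: str) -> str:
    """Shared Romance-style logic: article choice driven by card data."""
    if card.elide_before_vowel and noun[0].lower() in VOWELS:
        return f"l'{noun}"
    return f"{card.definite_article[gender]} {noun}"


print(definite_np(SPANISH, "escuela", "f"))  # la escuela
print(definite_np(FRENCH, "école", "f"))     # l'école
print(definite_np(FRENCH, "livre", "m"))     # le livre
```

The function body never mentions a specific language; the difference between Spanish and French output comes entirely from the card.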

4) Lexicon

Lexicon entries supply the linguistic features required by morphology (gender, animacy, noun class, irregular forms, etc.), optionally linked to Wikidata / lexeme IDs.
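
One way to picture a lexicon entry is as a small record of features plus a table of irregular forms. The `LexEntry` shape, the `LEXICON` keys, and the `plural` helper below are assumptions for illustration; the optional identifier field is left empty rather than filled with a guessed ID.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass(frozen=True)
class LexEntry:
    """Hypothetical lexicon entry carrying the features morphology needs."""
    lemma: str
    gender: Optional[str] = None
    animate: bool = False
    irregular: Dict[str, str] = field(default_factory=dict)
    wikidata_id: Optional[str] = None  # optional link to an external ID


LEXICON = {
    ("en", "writer"): LexEntry("writer", animate=True),
    ("en", "child"): LexEntry("child", animate=True,
                              irregular={"plural": "children"}),
}


def plural(lang: str, concept: str) -> str:
    """Prefer a stored irregular form; otherwise apply a naive default rule."""
    entry = LEXICON[(lang, concept)]
    return entry.irregular.get("plural", entry.lemma + "s")


print(plural("en", "writer"))  # writers
print(plural("en", "child"))   # children
```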

5) QA / regression suites

Test cases that native speakers can edit (CSV-style), combined with automated test runners, make quality measurable and prevent regressions.
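
A CSV-driven regression check could look roughly like this. The column names, the `render` stand-in, and `run_suite` are hypothetical; a real runner would call the actual generator instead of the toy renderer used here.

```python
import csv
import io

# Rows a native speaker could edit in a spreadsheet (illustrative columns).
SUITE = """\
lang,name,profession,expected
en,Ada Lovelace,mathematician,Ada Lovelace was a mathematician.
en,Alan Turing,engineer,Alan Turing was an engineer.
"""


def render(lang: str, name: str, profession: str) -> str:
    """Toy English renderer with a simplified a/an rule."""
    article = "an" if profession[0] in "aeiou" else "a"
    return f"{name} was {article} {profession}."


def broken_render(lang: str, name: str, profession: str) -> str:
    """Simulated regression: the a/an rule has been lost."""
    return f"{name} was a {profession}."


def run_suite(csv_text: str, renderer) -> list:
    """Return (name, expected, actual) for every row that does not match."""
    failures = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        actual = renderer(row["lang"], row["name"], row["profession"])
        if actual != row["expected"]:
            failures.append((row["name"], row["expected"], actual))
    return failures


print(run_suite(SUITE, render))              # []
print(len(run_suite(SUITE, broken_render)))  # 1
```

Because expectations live in data rather than in test code, reviewers can add or correct cases without touching the runner.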

Relationship to Konnaxion

Konnaxion focuses on governance, roles, processes, and coordination. Abstract Wiki Architect fits as a language layer:

rendering short descriptions of roles, mandates, decisions, and processes into multiple languages,
keeping meaning structured and auditable while producing human-readable text,
reusing stable patterns (e.g., “X is a mediator for Y”, “X coordinates process Z”) across contexts.

In short: Konnaxion governs and coordinates; Abstract Wiki Architect renders structured knowledge into multilingual narrative.
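
To make the "structured and auditable" point concrete, the sketch below renders one coordination frame into three languages and returns the frame alongside the text. The `CoordinationFrame` type, the pattern table, and the names "Ana" and "Agora" are invented for this example. A real construction would go through the family engines rather than plain string patterns.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class CoordinationFrame:
    """Hypothetical frame for 'X coordinates process Z'."""
    agent: str
    process: str


# Toy per-language patterns standing in for real constructions.
PATTERNS = {
    "en": "{agent} coordinates {process}.",
    "fr": "{agent} coordonne {process}.",
    "es": "{agent} coordina {process}.",
}


def render(frame: CoordinationFrame, lang: str) -> dict:
    """Return the text together with the frame it was generated from."""
    data = asdict(frame)
    return {"lang": lang, "frame": data, "text": PATTERNS[lang].format(**data)}


frame = CoordinationFrame(agent="Ana", process="Agora")
for lang in ("en", "fr", "es"):
    print(render(frame, lang)["text"])
# Ana coordinates Agora.
# Ana coordonne Agora.
# Ana coordina Agora.
```

Returning the source frame with each sentence keeps a traceable link between the human-readable text and the structured record behind it.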

Practical entry points

Repository: https://github.com/Rejean-McCormick/abstract-wiki-architect
Meta-Wiki tools page: https://meta.wikimedia.org/wiki/Abstract_Wikipedia/Tools/abstract-wiki-architect

Technical appendix (for builders)

Core building blocks

Engines: shared grammar logic per language family.
Language profiles/cards: per-language settings and defaults.
Constructions: reusable sentence templates.
Semantics (frames): typed meaning structures.
Lexicon: linguistic features and forms.
Discourse (optional): basic reference choices (name vs pronoun) and ordering.
QA: test suites + regression runners.
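
The optional discourse layer can be pictured with a very small referring-expression rule: use the name on first mention and a pronoun afterwards. This is a sketch under that single assumption; the `refer` helper and the entity dictionary layout are hypothetical.

```python
def refer(entity: dict, mentioned: set) -> str:
    """Return the name on first mention, the pronoun on later mentions."""
    if entity["id"] in mentioned:
        return entity["pronoun"]
    mentioned.add(entity["id"])
    return entity["name"]


entity = {"id": "adams", "name": "Douglas Adams", "pronoun": "he"}
facts = [
    "was a British writer",
    "wrote The Hitchhiker's Guide to the Galaxy",
]

mentioned = set()
sentences = []
for fact in facts:
    subject = refer(entity, mentioned)
    # Capitalize the first letter because the subject opens the sentence.
    sentences.append(f"{subject[0].upper()}{subject[1:]} {fact}.")

text = " ".join(sentences)
print(text)
# Douglas Adams was a British writer. He wrote The Hitchhiker's Guide to the Galaxy.
```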

Directory map (typical)

engines/ — family engines
morphology/ — inflection and agreement modules
constructions/ — sentence templates
semantics/ — frame types + normalization/bridges
lexicon/ — lexeme types, loaders, indices
discourse/ — topic/salience/referring expressions
qa/, qa_tools/ — test suites and runners
router.py — internal routing / entry points
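
How routing from a language code to a family engine might work is sketched below. The mapping, the stub engines, and the `route` function are assumptions for illustration and do not describe the actual contents of router.py.

```python
# Hypothetical mapping from language code to language family.
FAMILY_OF = {
    "fr": "romance", "es": "romance",
    "ru": "slavic", "pl": "slavic",
    "ar": "semitic", "he": "semitic",
}


def make_stub_engine(family: str):
    """Build a placeholder engine that only tags its input."""
    def engine(lang: str, frame: dict) -> str:
        return f"[{family}:{lang}] {frame['name']}"
    return engine


ENGINES = {family: make_stub_engine(family)
           for family in set(FAMILY_OF.values())}


def route(lang: str, frame: dict) -> str:
    """Dispatch a frame to the engine of the language's family."""
    try:
        family = FAMILY_OF[lang]
    except KeyError:
        raise ValueError(f"no language card registered for {lang!r}") from None
    return ENGINES[family](lang, frame)


print(route("es", {"name": "Douglas Adams"}))  # [romance:es] Douglas Adams
print(route("pl", {"name": "Douglas Adams"}))  # [slavic:pl] Douglas Adams
```

Failing loudly on an unknown language code keeps missing language cards from silently falling back to the wrong grammar.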

Minimal conceptual example (frame → text)

# Pseudocode-level illustration (API details may vary by repo version)
bio = BioFrame(name="Douglas Adams", profession="writer", nationality="british")
result = generate(lang="en", frame=bio)
# "Douglas Adams was a British writer."