
Why your EA repository quality determines your AI output quality

Every team evaluating AI for architecture work eventually asks the same question: how good will the answers be? It is the wrong place to start. The quality of an AI answer about your enterprise architecture is set long before any model runs: it is set by how cleanly your repository is structured. Point a capable assistant at a messy EA repository and you get answers that are confident, plausible, and occasionally wrong. Point the same assistant at a disciplined one and you get answers people act on.

The AI is rarely the variable that decides whether the answer is trustworthy. The repository is.

The MDG is the instruction manual

When you ask an AI system to answer something like “what data sources feed our customer service system?”, it does not understand “customer service system” the way you do. It reads the raw Sparx EA database: tables, columns, relationships, constraints. The phrase you typed lives somewhere in that data, most likely in an element name, and probably spelled or abbreviated a little differently in three different places.

What turns that raw schema into something the AI can reason about is the MDG Technology — Sparx EA's Model Driven Generation mechanism, which carries the metamodel for your repository. It defines what kinds of things can exist, what attributes they carry, how they relate, and what each of them means in architectural terms.

Read it as a manual the AI consults: a System is a kind of Application Component; Application Components have attributes like Technology Stack, Owner, and Criticality; an Application Component can Connect To another; a Connection represents a logical data or control flow. Without that manual, the assistant sees database columns and has to guess what they represent. With it, the assistant can translate your business question into queries against the schema, then translate the results back into architecture answers.
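To make the "manual" idea concrete, here is a minimal sketch of the statements above held as data. It is not how an MDG Technology is actually stored or how any tool reads it; the class names, dictionaries, and helper functions are hypothetical and exist only to show what an assistant gains when types, attributes, and relationships are written down explicitly.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class ElementType:
    name: str
    parent: Optional[str] = None          # e.g. a System is a kind of Application Component
    attributes: Tuple[str, ...] = ()


@dataclass(frozen=True)
class RelationshipType:
    name: str
    source: str
    target: str
    meaning: str


# Hypothetical metamodel fragment mirroring the sentences in the text.
ELEMENT_TYPES: Dict[str, ElementType] = {
    "Application Component": ElementType(
        "Application Component",
        attributes=("Technology Stack", "Owner", "Criticality"),
    ),
    "System": ElementType("System", parent="Application Component"),
}

RELATIONSHIP_TYPES: Dict[str, RelationshipType] = {
    "Connect To": RelationshipType(
        "Connect To",
        source="Application Component",
        target="Application Component",
        meaning="logical data or control flow",
    ),
}


def is_a(type_name: str, ancestor: str) -> bool:
    """Walk up the parent chain to see whether type_name is a kind of ancestor."""
    current = ELEMENT_TYPES.get(type_name)
    while current is not None:
        if current.name == ancestor:
            return True
        current = ELEMENT_TYPES.get(current.parent) if current.parent else None
    return False


def attributes_of(type_name: str) -> List[str]:
    """Attributes of a type, with inherited attributes listed first."""
    collected: List[str] = []
    current = ELEMENT_TYPES.get(type_name)
    while current is not None:
        collected = list(current.attributes) + collected
        current = ELEMENT_TYPES.get(current.parent) if current.parent else None
    return collected


def can_relate(rel_name: str, source_type: str, target_type: str) -> bool:
    """Check whether the metamodel allows this relationship between two types."""
    rel = RELATIONSHIP_TYPES[rel_name]
    return is_a(source_type, rel.source) and is_a(target_type, rel.target)


print(attributes_of("System"))
print(can_relate("Connect To", "System", "System"))
```

With the definitions explicit, "which attributes does a System carry?" becomes a lookup rather than a guess. An element type the metamodel does not define simply fails the check.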

That is why metamodel quality is the single biggest variable in whether your AI output is useful.

Implicit metamodel versus clean metamodel

Picture two repositories that look equally impressive from the outside.

The implicit one

Your architects have modeled the estate extensively — hundreds of application components, dozens of technology platforms, elaborate connection diagrams. But ask where the metamodel actually defines what a “Technology Platform” is, which attributes it must carry, and what may connect to it, and the answer goes fuzzy. The definition lives partly in the MDG, partly in conventions only the senior architects remember, and partly in the data itself.

Some platforms map cleanly to a Business Capability; some do not. Some declare a Technology Type; others bury it in a description field. Some connections are formal relationships; others are sentences in a notes field. The pattern is inconsistent because nothing enforces it. So when the AI is asked “what platforms support our e-commerce capability?”, it navigates contradictory signals and infers what the data probably means. The answers come back sometimes right, sometimes wrong, in ways nobody can predict — and stakeholders quietly learn not to trust them.

The clean one

Here the metamodel defines Technology Platform explicitly, with mandatory attributes: Name, Technology Category, Vendor, Life Cycle Status, and Business Capability at a defined cardinality. Every Technology Platform in the repository has them populated. The allowed values for the controlled fields are enumerated. Dependencies are always modeled with the formal Supports relationship, never narrated in notes.

Ask the same question, and the AI traverses the explicit Supports relationship from the capability to each platform that supports it. The answer is reliable because the data underneath it is structured and consistent. The difference is not subtle: one repository produces answers that demand human verification, the other produces answers you can act on directly.
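The mandatory attributes and enumerated values described for the clean metamodel can be checked mechanically. The sketch below assumes each Technology Platform has been exported as a plain dictionary. The attribute names come from the text, but the allowed values and the minimum of one Business Capability are illustrative assumptions, not values from any real MDG.

```python
from typing import Dict, List

# Mandatory attributes named in the text for a Technology Platform.
REQUIRED = (
    "Name",
    "Technology Category",
    "Vendor",
    "Life Cycle Status",
    "Business Capability",
)

# Hypothetical enumerations: a real metamodel would define its own values.
ALLOWED_VALUES: Dict[str, set] = {
    "Technology Category": {"Database", "Middleware", "Runtime"},
    "Life Cycle Status": {"Planned", "Active", "Retiring", "Retired"},
}


def is_blank(value) -> bool:
    """True for None, empty or whitespace-only strings, and empty lists."""
    if value is None:
        return True
    if isinstance(value, str):
        return value.strip() == ""
    if isinstance(value, (list, tuple, set)):
        return len(value) == 0
    return False


def validate_platform(platform: dict) -> List[str]:
    """Return a list of problems; an empty list means the element conforms."""
    problems = []
    for attr in REQUIRED:
        if is_blank(platform.get(attr)):
            problems.append(f"missing {attr}")
    for attr, allowed in ALLOWED_VALUES.items():
        value = platform.get(attr)
        if not is_blank(value) and value not in allowed:
            problems.append(f"{attr} has non-enumerated value {value!r}")
    return problems


clean = {
    "Name": "Order Database",
    "Technology Category": "Database",
    "Vendor": "ExampleVendor",
    "Life Cycle Status": "Active",
    "Business Capability": ["E-commerce"],   # assumed cardinality: at least one
}
messy = {
    "Name": "Oracle Legacy",
    "Vendor": "",
    "Life Cycle Status": "sunset-ish",
    "Business Capability": [],
}

print(validate_platform(clean))
print(validate_platform(messy))
```

Running a check like this across every platform gives a quick read on whether the repository actually follows the metamodel it declares.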

What this means for Kernaro and EA GraphLink

Sparx EA core does not ship an AI assistant or a built-in MCP server. The capability arrives through a separate layer of products that landed across the first half of 2026. Kernaro AI Hub includes EA GraphLink — a read-only MCP server deployed for enterprise-wide access — while AI Power Tools for EA runs a local MCP server with full read/write and diagram validation through the EA interface. These are early tools; the teams using them are weeks into the work, not years.

What every one of them has in common is this: they expose your repository to AI by translating its physical schema into a semantic layer the model can reason over — and that translation depends entirely on a well-defined MDG. GraphLink, for instance, needs an MDG Technology that maps the Sparx schema into the GraphQL schema it serves over MCP. If your metamodel is explicit and your data follows it, the semantic layer is accurate and the answers hold up. If the metamodel is implicit, the layer can only approximate what your data means, and the AI's confidence outruns its accuracy.

That is the line between teams that get real value from these tools and teams that get answers that sound right and aren't. The first group did the metamodel work before plugging anything in. They did not model the whole landscape perfectly — that is an infinite task — but they made the metamodel explicit, applied it consistently to the elements that matter most, and kept it current. The second group modeled extensively but inconsistently, and left the AI to reverse-engineer the architecture from contradictory data.

A concrete example

Suppose you want to know: which systems with a criticality of High or above depend on the legacy database platform?

With a clean metamodel, the path is unambiguous — traverse from a Database Platform (an explicitly defined element type, filterable by Technology Type) to every System holding a Depends On relationship, then filter where System.Criticality falls in {High, Critical}. The result is dependable.
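As a rough illustration of that unambiguous path, the sketch below runs the question against a tiny in-memory SQLite database. The three tables, their columns, and the sample rows are hypothetical stand-ins invented for this example. They are not the real Sparx EA schema and not what GraphLink exposes. Identifying the legacy platform by its name is also an assumption.

```python
import sqlite3

# Hypothetical, simplified tables for illustration only.
SCHEMA_AND_DATA = """
CREATE TABLE element (id INTEGER PRIMARY KEY, name TEXT, type TEXT);
CREATE TABLE attribute (element_id INTEGER, name TEXT, value TEXT);
CREATE TABLE relationship (source_id INTEGER, target_id INTEGER, type TEXT);

INSERT INTO element VALUES
  (1, 'Legacy DB', 'Database Platform'),
  (2, 'Billing', 'System'),
  (3, 'CRM', 'System'),
  (4, 'Intranet Wiki', 'System'),
  (5, 'Reporting', 'System');

INSERT INTO attribute VALUES
  (2, 'Criticality', 'Critical'),
  (3, 'Criticality', 'High'),
  (4, 'Criticality', 'Low'),
  (5, 'Criticality', 'High');

-- Reporting (5) is High but has no dependency on the legacy platform.
INSERT INTO relationship VALUES
  (2, 1, 'Depends On'),
  (3, 1, 'Depends On'),
  (4, 1, 'Depends On');
"""

# Start at the platform, follow formal Depends On relationships back to
# Systems, then keep only those whose Criticality is High or Critical.
QUERY = """
SELECT s.name, c.value
FROM element AS p
JOIN relationship AS r ON r.target_id = p.id AND r.type = 'Depends On'
JOIN element AS s ON s.id = r.source_id AND s.type = 'System'
JOIN attribute AS c ON c.element_id = s.id AND c.name = 'Criticality'
WHERE p.type = 'Database Platform'
  AND p.name = ?
  AND c.value IN ('High', 'Critical')
ORDER BY s.name
"""


def critical_dependents(conn: sqlite3.Connection, platform_name: str):
    """Systems rated High or Critical that depend on the named platform."""
    return conn.execute(QUERY, (platform_name,)).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA_AND_DATA)
result = critical_dependents(conn, "Legacy DB")
print(result)  # [('Billing', 'Critical'), ('CRM', 'High')]
```

Every join condition in the query corresponds to something the metamodel fixes: one element type, one relationship type, one attribute name, one set of allowed values. Take any of those away and the query has to be replaced by inference.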

With an implicit metamodel, the same question turns into a stack of guesses:

  • Is “Database Platform” a Technology Platform or a Technology Component? The metamodel doesn't distinguish.
  • Is the legacy database called “Legacy DB”, “Oracle Legacy”, or something else again? Naming isn't enforced.
  • Are dependencies formal relationships, entries on a Connection diagram, or lines in a description field?
  • Is criticality an attribute on System — and is it named Criticality, Business Impact, or something else?

The AI can make educated guesses, and sometimes it lands them. But stack four guesses together and a plausible-sounding wrong answer becomes likely, because all four have to be right at once. What separates the two outcomes is rarely the model. It is the data discipline behind the metamodel.
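A quick back-of-the-envelope calculation shows why four stacked guesses matter. The accuracy figures below are purely illustrative, not measurements of any model or tool, and the calculation assumes the guesses are independent, which real guesses may not be.

```python
def chance_all_correct(per_guess_accuracy: float, guesses: int) -> float:
    """Probability that every guess is right, assuming independent guesses."""
    return per_guess_accuracy ** guesses


# Illustrative accuracies only: even good individual guesses compound badly.
for accuracy in (0.9, 0.8, 0.7):
    combined = chance_all_correct(accuracy, 4)
    print(f"{accuracy:.0%} per guess -> {combined:.1%} for all four")
```

Under these assumptions, four guesses that are each right 80% of the time are all right only about 41% of the time, so the wrong answer is the more likely outcome.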

Audit your repository before you connect anything

Before wiring GraphLink or AI Power Tools to your repository, run your metamodel past four questions.

If you can't point to where a concept is defined, neither can the AI.

Is your metamodel explicit? Can you point to the configuration that defines which element types exist, their attributes, the possible relationships, and the cardinalities — or does that definition live half in the MDG and half in your architects' heads?

Is your data consistent? Do all System elements actually carry a Technology Stack attribute? Do all dependencies use the same relationship type? Or do you keep finding one-offs and workarounds where the metamodel didn't quite fit?

Are the concepts that matter modeled as first-class things? If business capability drives how you reason about the estate, is it a real element type with explicit relationships to systems and technology — or is it scattered across descriptions and diagrams?

How much meaning is implicit? How much of the architecture lives in diagrams, notes, and architect memory rather than in structured data an AI can actually traverse?
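Some of these questions can be given a first, rough answer with a few lines of code. The sketch below assumes elements and relationships have already been exported to plain Python dictionaries. The export format, the field names, and the sample data are all hypothetical. It measures how often an attribute is actually populated and how many different relationship types are in use.

```python
from collections import Counter
from typing import Dict, List


def attribute_coverage(elements: List[dict], element_type: str, attribute: str) -> float:
    """Share of elements of the given type with a non-empty value for the attribute."""
    of_type = [e for e in elements if e["type"] == element_type]
    if not of_type:
        return 0.0
    filled = sum(
        1 for e in of_type
        if str(e.get("attributes", {}).get(attribute, "")).strip()
    )
    return filled / len(of_type)


def relationship_type_usage(relationships: List[dict]) -> Dict[str, int]:
    """Count how often each relationship type is used."""
    return dict(Counter(r["type"] for r in relationships))


# Hypothetical export of a small repository.
elements = [
    {"name": "Billing", "type": "System", "attributes": {"Technology Stack": "Java"}},
    {"name": "CRM", "type": "System", "attributes": {"Technology Stack": ".NET"}},
    {"name": "Wiki", "type": "System", "attributes": {"Technology Stack": "  "}},
    {"name": "Reporting", "type": "System", "attributes": {}},
    {"name": "Legacy DB", "type": "Technology Platform", "attributes": {}},
]
relationships = [
    {"type": "Depends On"},
    {"type": "Depends On"},
    {"type": "Uses"},
    {"type": "Association"},
]

coverage = attribute_coverage(elements, "System", "Technology Stack")
usage = relationship_type_usage(relationships)
print(f"Technology Stack coverage on System: {coverage:.0%}")
print(usage)
```

Coverage well below 100%, or several relationship types doing the job of one, are the one-offs and workarounds the second question asks about. Meaning that lives only in notes and diagrams will not show up in a count like this at all, which is itself part of the answer to the fourth question.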

If your honest answer is “our metamodel is pretty clean,” connected AI tools will reward you with reliable answers. If it is “implicit and inconsistent,” you face a choice: invest in the metamodel first, or accept that every answer needs human verification. Most teams choose the investment, because the payoff reaches well past AI — a clean metamodel improves everything that touches the repository.

The right sequence

This is why the AI conversation should follow the metamodel conversation, not lead it.

“Our repository is mature but our metamodel is implicit” is a solvable problem, and you do not solve it by remodeling everything. You focus on the elements and relationships people actually ask about: the ones that surface in governance decisions and drive technical choices. You make those explicit, make the data follow, and put a discipline in place to keep it that way. Then you connect GraphLink or AI Power Tools. The metamodel work pays off immediately in AI output quality, and it keeps paying in the efficiency of every architect who uses the repository and the confidence of every stakeholder who reads the models.

The AI isn't what makes your architecture accessible. Clean, consistent, explicitly modeled data is. The AI is just the interface that makes the accessibility effortless.

So if you are weighing up connecting AI to your repository, the first conversation isn't with a vendor. It is with your architects, about whether the metamodel is ready. If you want a structured way to find out, Paralysis to a Plan scores repository and metamodel readiness before any tooling decision — and AI Augmented Architecture shows what the practice looks like once the foundation is sound. Start with the data. The AI will be ready when you are.

Is your repository ready for AI — or not yet?

Talk to a practitioner about scoring your MDG and repository quality before you connect anything to your Sparx EA data.
