How to Connect Sparx EA to Microsoft Fabric: Step-by-Step Integration Guide

By Sparx Services · February 21, 2026

The short version: connecting Sparx EA to Microsoft Fabric is a seven-step build that runs on EA GraphLink. Once the prerequisites are confirmed, you open EA GraphLink's read-only MCP server to the Fabric agent for live questions, stand up a Fabric Lakehouse as the data layer, register the repository as an agent source, prove the connection with sample queries, pipe EA data in through GraphQL, enrich it against other enterprise data, and surface the result in Power BI. The reason to add Fabric on top of plain Power BI is enrichment and orchestration: Fabric joins your architecture data to HR, finance, and project sources that standalone dashboards cannot reach. One constraint runs through all of it: Fabric's intelligence is only ever as good as the repository it reads, which is why MDG Technology quality is a prerequisite, not an afterthought.

Before you build, it helps to know where this sits. EA GraphLink is part of Kernaro AI Hub, released in January 2026. It is a server-deployed, read-only MCP server that exposes your Sparx EA repository for enterprise-wide access, and it relies on an MDG Technology that maps the physical Sparx schema into the GraphQL schema it serves. There is no MCP server hidden inside Sparx EA core; EA GraphLink is the product that provides one. With that clear, the full stack you are assembling is straightforward: Sparx EA → EA GraphLink → Fabric → Power BI and Copilot.

Before you start: prerequisites

Four things need to be true before the first pipeline runs. Skipping any of them tends to surface later as a connection that works in a demo but not in production.

EA GraphLink is deployed and operational

The GraphQL endpoint (Interface A) is active and queryable.
The MCP server endpoint (Interface B) is configured and reachable.
Pro Cloud Server is running and connected to the Sparx EA repository behind it.
Authentication — API key or OAuth 2.0 — is set up for both interfaces.

A Fabric workspace is provisioned

A dedicated workspace exists for EA analytics, kept separate from unrelated Fabric work.
Admin permissions are assigned for whoever runs the integration setup.
Capacity is allocated. F2 is the entry SKU and is fine for proving the integration; production workloads with daily pipelines and Copilot querying typically warrant a larger SKU, sized to your repository during scoping.

Microsoft 365 Copilot is licensed

Copilot licenses cover the people who will configure and query the integration.
Copilot is enabled in the Microsoft 365 admin center.

MDG governance has been assessed

This is the one teams underweight. Before connecting Fabric — or any intelligence tool — to the repository, confirm the model is governed: are Application Component elements consistently stereotyped, and are lifecycle status and business owner tagged values populated across most elements? Gaps here do not break the connection; they quietly degrade every answer that flows through it. The assessment runs during the engagement, and where it finds significant gaps, the right move is to fix MDG governance first.
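One way to make this check concrete is to measure how many elements carry the tagged values the integration depends on. The following is a minimal sketch, assuming elements have already been exported as dictionaries. The field name `tagged_values`, the tag names `LifecycleStatus` and `BusinessOwner`, and the sample data are all hypothetical and are not taken from EA GraphLink's actual schema.

```python
from typing import Iterable

# Hypothetical tag names; substitute the ones your MDG Technology defines.
REQUIRED_TAGS = ("LifecycleStatus", "BusinessOwner")


def tag_coverage(elements: Iterable[dict], required=REQUIRED_TAGS) -> dict:
    """Return, per tag, the fraction of elements with a non-empty value."""
    elements = list(elements)
    if not elements:
        return {tag: 0.0 for tag in required}
    coverage = {}
    for tag in required:
        populated = sum(
            1
            for element in elements
            if str(element.get("tagged_values", {}).get(tag) or "").strip()
        )
        coverage[tag] = populated / len(elements)
    return coverage


# Illustrative sample only, not real repository data.
sample = [
    {"name": "CRM", "tagged_values": {"LifecycleStatus": "Active", "BusinessOwner": "Sales"}},
    {"name": "Billing", "tagged_values": {"LifecycleStatus": "End-of-life", "BusinessOwner": ""}},
    {"name": "Legacy HR", "tagged_values": {}},
    {"name": "Portal", "tagged_values": {"LifecycleStatus": "Active"}},
]

report = tag_coverage(sample)
```

A low fraction for either tag suggests the governance work should come before the Fabric connection. What counts as "most elements" is a judgment for the assessment, not something this sketch decides.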

The seven steps

The core build is a sequence. Steps 1 through 4 establish and prove the live AI connection; steps 5 through 7 build the persistent analytics layer behind it.

Step 1: Open EA GraphLink's MCP server to the Fabric agent

Fabric's agent connects to external context through MCP. In EA GraphLink's Interface B configuration, confirm the MCP server endpoint URL (https://[your-pcs-host]/eagraphlink/mcp), issue a dedicated API key for the Fabric integration so the access is separately auditable, and check that the capability declaration covers the element types, tagged-value dimensions, and query patterns your use cases need. Probe the endpoint with Postman or curl: a capability-discovery request should return the expected schema. If the MCP server lives inside your network, give Fabric a path to it — a secure gateway or Fabric private link.
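The probe described above can also be scripted. MCP messages are JSON-RPC 2.0, so a capability check typically starts with an `initialize` request followed by a listing call such as `tools/list`. The sketch below only builds the requests and does not send them. The host name `pcs.example.com`, the bearer-token header, the client name, and the protocol version string are assumptions for illustration; check how your EA GraphLink deployment expects the API key to be presented.

```python
import json
import urllib.request

# Hypothetical host; the path follows the endpoint pattern quoted in the text.
MCP_URL = "https://pcs.example.com/eagraphlink/mcp"


def build_mcp_request(url, api_key, method, params=None, request_id=1):
    """Build (but do not send) a JSON-RPC 2.0 POST request for an MCP server."""
    payload = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        payload["params"] = params
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json, text/event-stream",
            # Assumption: the API key is presented as a bearer token.
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


initialize = build_mcp_request(
    MCP_URL,
    "YOUR_API_KEY",
    "initialize",
    params={
        "protocolVersion": "2025-06-18",  # assumed; servers negotiate this
        "capabilities": {},
        "clientInfo": {"name": "fabric-probe", "version": "0.1"},
    },
)
list_tools = build_mcp_request(MCP_URL, "YOUR_API_KEY", "tools/list", request_id=2)

# To actually send a request:
#   with urllib.request.urlopen(initialize, timeout=10) as response:
#       print(response.read().decode("utf-8"))
```

A full MCP session usually also sends a `notifications/initialized` message after `initialize`, and some servers return a session identifier that later requests must echo. Those details are omitted here to keep the probe short.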

Step 2: Stand up a Fabric Lakehouse as the data layer

The Lakehouse stores EA data persistently in Delta Lake format, which unlocks enrichment and heavier analytics than live GraphQL alone. Create a Lakehouse in the workspace — name it “Enterprise Architecture” — with a table per major element type plus relationship and tagged-value tables. Why a Lakehouse rather than a direct Power BI connection: it lets you join EA data to HR headcount, project financials, or ITSM incident data; it supports full SQL for complex analysis; its PySpark and SQL notebooks go well beyond Power Query; and it acts as a scheduled, persistent layer instead of querying live on every report load.

Step 3: Register the EA repository as a Fabric agent source

In Workspace Settings → AI Integration → External Data Sources, add a new MCP data source named “Sparx EA Repository,” point it at the EA GraphLink Interface B endpoint, and authenticate with the key from step 1. Define the agent topic so the model knows its remit: enterprise architecture data including applications, capabilities, technology components, and architecture decisions from the Sparx EA repository. Save, then validate in Fabric Copilot chat with something like “What business capabilities does our CRM system support?” A correct setup returns an answer grounded in the model's Realisation relationships, not generic text.

Step 4: Prove it with representative queries

Before any production build, validate against real questions: capability coverage (“list business capabilities with no application coverage”), portfolio status (“how many application components are end-of-life?”), ownership gaps (“which finance-domain applications have no business owner?”), dependency analysis (“which applications realize Customer Onboarding?”), and decision retrieval (“what accepted decisions relate to cloud infrastructure?”). Check that answers are specific to your repository, that names and counts match what you see in Sparx EA, and that nothing expected is missing. Missing data points to MDG gaps; empty results point to the connection. Write the findings down.
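The rule of thumb in this step (missing items suggest MDG gaps, empty results suggest a connection problem) can be written down as a small triage helper so findings are recorded consistently. This is a sketch under assumptions: the expected names would come from what you see in Sparx EA, the returned names from the agent's answer, and all sample values here are invented.

```python
def triage(expected_names, returned_names):
    """Classify one validation query using the rule of thumb from the text."""
    expected, returned = set(expected_names), set(returned_names)
    if expected and not returned:
        # Nothing came back at all: suspect the connection first.
        return {"verdict": "check connection", "missing": sorted(expected)}
    missing = sorted(expected - returned)
    if missing:
        # Partial answer: suspect stereotypes or tagged values in the model.
        return {"verdict": "check MDG governance", "missing": missing}
    return {"verdict": "ok", "missing": []}


# Illustrative checks: (names seen in Sparx EA, names the agent returned).
checks = {
    "end-of-life applications": (["Billing", "Legacy HR"], ["Billing"]),
    "finance applications with no owner": (["Ledger"], []),
    "applications realizing Customer Onboarding": (["CRM", "Portal"], ["CRM", "Portal"]),
}

findings = {question: triage(exp, ret) for question, (exp, ret) in checks.items()}
```

The `findings` dictionary can be saved alongside the date of the test run, which gives you the written record this step asks for.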

Step 5: Pipe EA data into the Lakehouse via GraphQL

Build a Fabric Data Pipeline with a Copy Data activity: source it from EA GraphLink's GraphQL endpoint (Interface A) using its API key, write a query for the first data set, map the response fields to Lakehouse columns, and target a table such as ApplicationComponents. Schedule it — daily suits most teams, more often if currency demands it. For large repositories, configure incremental runs that pull only elements changed since the last load; EA GraphLink supports filtering by modification date. A standard configuration builds a handful of tables, summarized below.
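To illustrate what the pipeline's query and field mapping might look like, here is a minimal sketch of an incremental GraphQL request body and the mapping from response fields to Lakehouse columns. The query shape is hypothetical: the field `applicationComponents`, the argument `modifiedSince`, and the field names inside it are assumptions, since the real names depend on the GraphQL schema your MDG Technology produces. Only the column names come from the table layout described in this guide.

```python
# Hypothetical query; field and argument names depend on your GraphQL schema.
QUERY = """
query ChangedApplications($since: String!) {
  applicationComponents(modifiedSince: $since) {
    name
    stereotype
    lifecycleStatus
    businessOwner
  }
}
"""

# GraphQL response field -> Lakehouse column in ApplicationComponents.
COLUMN_MAP = {
    "name": "Name",
    "stereotype": "Stereotype",
    "lifecycleStatus": "LifecycleStatus",
    "businessOwner": "BusinessOwner",
}


def build_payload(since_iso: str) -> dict:
    """Request body for a standard GraphQL-over-HTTP POST."""
    return {"query": QUERY, "variables": {"since": since_iso}}


def to_rows(response: dict) -> list:
    """Flatten a GraphQL response into rows keyed by Lakehouse column name."""
    if response.get("errors"):
        raise ValueError(f"GraphQL errors: {response['errors']}")
    items = response["data"]["applicationComponents"]
    return [
        {column: item.get(field) for field, column in COLUMN_MAP.items()}
        for item in items
    ]


# Invented sample response for illustration.
sample_response = {
    "data": {
        "applicationComponents": [
            {"name": "Billing", "stereotype": "ApplicationComponent",
             "lifecycleStatus": "End-of-life", "businessOwner": None},
        ]
    }
}

payload = build_payload("2026-02-20T00:00:00Z")
rows = to_rows(sample_response)
```

In a Copy Data activity the same two pieces appear as configuration: the payload is the request body, and `COLUMN_MAP` corresponds to the column mapping. The timestamp passed as `since` would be the time of the last successful load.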

Step 6: Enrich EA data against other sources

This is the step that justifies Fabric. Once EA tables sit alongside HR or project data in the Lakehouse, a single SQL join answers questions plain dashboards cannot — for example, listing end-of-life applications next to each owner and that owner's manager, which is exactly the stakeholder list a decommissioning conversation needs. Joining application lifecycle to active replacement projects shows which legacy systems have funded successors and which do not. Neither view is buildable in Power BI alone without the data pre-joined.
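The first join described above can be sketched as follows. To keep the example runnable anywhere, it uses Python's built-in `sqlite3` as a stand-in for the Lakehouse SQL endpoint. The `Employees` table, its columns, and every row of data are hypothetical, and the `ApplicationComponents` table is reduced to three columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the Lakehouse
conn.executescript("""
CREATE TABLE ApplicationComponents (Name TEXT, LifecycleStatus TEXT, BusinessOwner TEXT);
CREATE TABLE Employees (FullName TEXT, Manager TEXT);  -- hypothetical HR table

INSERT INTO ApplicationComponents VALUES
  ('Billing',   'End-of-life', 'A. Rivera'),
  ('CRM',       'Active',      'J. Chen'),
  ('Legacy HR', 'End-of-life', 'Unassigned');

INSERT INTO Employees VALUES
  ('A. Rivera', 'M. Okafor'),
  ('J. Chen',   'M. Okafor');
""")

# End-of-life applications with each owner and that owner's manager.
# LEFT JOIN keeps applications whose owner has no match in the HR data.
rows = conn.execute("""
SELECT a.Name, a.BusinessOwner, e.Manager
FROM ApplicationComponents AS a
LEFT JOIN Employees AS e ON e.FullName = a.BusinessOwner
WHERE a.LifecycleStatus = 'End-of-life'
ORDER BY a.Name
""").fetchall()

for name, owner, manager in rows:
    print(name, owner, manager)
```

The left join is a deliberate choice: an end-of-life application whose owner cannot be matched to an HR record is exactly the kind of gap a decommissioning conversation needs to see, so it should not silently drop out of the result.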

Step 7: Surface it in Power BI

Point Power BI at the Lakehouse (Get data → Microsoft Fabric → Lakehouse), select the tables, and model the relationships (ApplicationComponents to AppToCapability to Capabilities). Build the pages that earn their keep — a portfolio heat map by lifecycle and domain, a capability coverage matrix, a technology end-of-life timeline, and a decision register. Publish to the same workspace and set refresh schedules: the Lakehouse refreshes on the pipeline cadence, the semantic model on its own. Optionally enable Copilot on the published model so stakeholders can ask the dashboard questions in plain language.
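The capability coverage matrix rests on the relationship chain from ApplicationComponents through AppToCapability to Capabilities. As a rough illustration of the logic behind that page, the sketch below derives coverage from the bridge table. It assumes `SourceName` holds the application and `TargetName` the capability, and the sample rows are invented.

```python
from collections import defaultdict


def capability_coverage(capabilities, app_to_capability):
    """Map each capability name to the sorted list of applications realizing it."""
    covering = defaultdict(set)
    for link in app_to_capability:
        # Assumption: SourceName is the application, TargetName the capability.
        covering[link["TargetName"]].add(link["SourceName"])
    return {cap: sorted(covering.get(cap, set())) for cap in capabilities}


# Illustrative rows shaped like the Capabilities and AppToCapability tables.
capabilities = ["Customer Onboarding", "Invoicing", "Payroll"]
links = [
    {"SourceName": "CRM", "TargetName": "Customer Onboarding", "RelationshipType": "Realisation"},
    {"SourceName": "Portal", "TargetName": "Customer Onboarding", "RelationshipType": "Realisation"},
    {"SourceName": "Billing", "TargetName": "Invoicing", "RelationshipType": "Realisation"},
]

matrix = capability_coverage(capabilities, links)
uncovered = [cap for cap, apps in matrix.items() if not apps]
```

In Power BI the same result comes from the model relationships and a count measure. The Python version is only meant to show what the matrix is computing.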

The Lakehouse tables to build

For a standard EA analytics configuration, these tables give Power BI and the enrichment queries a clean foundation to work from.

Table	GraphQL query scope	Key columns
`ApplicationComponents`	All Application Component elements	Name, Stereotype, Domain, LifecycleStatus, BusinessOwner, TechnicalOwner, BusinessCriticality
`Capabilities`	All Capability elements	Name, Domain, MaturityScore, StrategicImportance, ApplicationCoverage, BusinessOwner
`AppToCapability`	Realisation relationships, application to capability	SourceName, TargetName, RelationshipType
`TechnologyNodes`	Technology Node elements	Name, Type, Vendor, LifecycleStatus, EOLDate
`ArchitectureDecisions`	Architecture Decision elements	ADRID, Status, Domain, DecisionDate, DecisionAuthority
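As a starting point, the table layout above can be turned into Delta table definitions from a notebook. The sketch below generates Spark SQL `CREATE TABLE` statements from the column lists. The column types are assumptions (everything is `STRING` except the two date columns), so adjust them to match what your GraphQL queries actually return.

```python
# Column lists copied from the table above.
TABLES = {
    "ApplicationComponents": ["Name", "Stereotype", "Domain", "LifecycleStatus",
                              "BusinessOwner", "TechnicalOwner", "BusinessCriticality"],
    "Capabilities": ["Name", "Domain", "MaturityScore", "StrategicImportance",
                     "ApplicationCoverage", "BusinessOwner"],
    "AppToCapability": ["SourceName", "TargetName", "RelationshipType"],
    "TechnologyNodes": ["Name", "Type", "Vendor", "LifecycleStatus", "EOLDate"],
    "ArchitectureDecisions": ["ADRID", "Status", "Domain", "DecisionDate",
                              "DecisionAuthority"],
}

# Assumed types; any column not listed here defaults to STRING.
TYPE_OVERRIDES = {"EOLDate": "DATE", "DecisionDate": "DATE"}


def create_table_sql(table: str, columns: list) -> str:
    """Build a Spark SQL statement that creates a Delta table if it is missing."""
    column_defs = ",\n  ".join(
        f"`{column}` {TYPE_OVERRIDES.get(column, 'STRING')}" for column in columns
    )
    return f"CREATE TABLE IF NOT EXISTS {table} (\n  {column_defs}\n) USING DELTA"


statements = {table: create_table_sql(table, cols) for table, cols in TABLES.items()}

# In a Fabric notebook, each statement could be run with spark.sql(statement).
```

Column names are wrapped in backticks so that names such as `Type` or `Status` cannot collide with SQL keywords.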

Why add Fabric at all?

It is a fair question, because Power BI can already read EA GraphLink directly. The honest answer is that Fabric earns its place only when you need to combine architecture data with the rest of the enterprise. If all you want is EA dashboards, connect Power BI to the GraphQL endpoint and stop there — it is simpler and costs less. Fabric becomes worth the added surface area when the questions change shape: which end-of-life systems sit under which budget owner, which capabilities are funded by active projects, where ownership and lifecycle disagree. Those answers live in the joins, and the joins live in the Lakehouse.

Fabric's intelligence is bounded by the repository it reads. A clean, well-governed model produces precise answers; a patchy one produces confident noise. The MDG quality gate is the whole game.

That last point is worth sitting with, because it is the difference between this integration delivering and disappointing. We treat repository quality as the first deliverable, not a hope — it is the heart of how AI Augmented Architecture works in practice, and it is what Paralysis to a Plan measures before any tool is connected. For teams already committed to the Microsoft stack, this Fabric path is one configuration of the broader integration layer we stand up in Configure the Solution.

Frequently asked questions

What Fabric capacity do we need for an EA integration?

Fabric runs on F-SKU capacities, and the right size depends on repository volume, pipeline frequency, and how many people query the data. F2 is the entry point and is fine for proving the integration; production analytics with daily pipelines and Copilot querying usually justify a larger SKU. We scope capacity to your actual workload during the engagement rather than guessing up front.

Can we do this with Power BI alone, without Fabric?

Yes. Power BI can connect straight to EA GraphLink's GraphQL endpoint through its GraphQL connector, which is simpler and cheaper. You give up the cross-source enrichment and agentic workflows that Fabric adds. If you only need EA dashboards, the direct route is enough; choose Fabric when you want to join EA data with HR, finance, or project data.

Does the Fabric agent read the MCP server live, or the Lakehouse?

Both, for different jobs. The agent answers natural-language questions by querying EA GraphLink's read-only MCP server in real time, so the answer reflects the current repository. The Lakehouse holds scheduled copies of EA data for structured dashboards and cross-source SQL. Real-time AI querying and structured analytics run side by side.

How often should EA data refresh in the Lakehouse?

Daily suits most EA analytics, because architecture data changes over days and weeks, not minutes. An overnight pipeline run keeps executive dashboards current enough. If your team makes heavy intraday edits and stakeholders need same-day numbers, run the pipeline every four to six hours. Live AI querying through MCP is always current regardless of the Lakehouse schedule.

What are the security implications of EA data in a Fabric Lakehouse?

Lakehouse data inherits Fabric's security model: workspace role-based access, Microsoft Entra ID, and Microsoft Purview governance. Lakehouse access is managed separately from Sparx EA, so modelers do not automatically get it. For sensitive packages — security architecture, systems holding sensitive data — define a subset schema that excludes them from the Fabric integration. We address this in engagement scoping.
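The exclusion rule for sensitive packages can be expressed as a simple path filter. The sketch below is illustrative only: the `package` field and the package paths are hypothetical, and excluding sensitive packages at the source, in the subset schema, remains the primary control. A filter like this is at most a secondary check before data lands in the Lakehouse.

```python
# Hypothetical package paths; use the ones agreed during scoping.
EXCLUDED_PACKAGE_PREFIXES = ("Security Architecture/", "Restricted/")


def is_excluded(package_path: str, prefixes=EXCLUDED_PACKAGE_PREFIXES) -> bool:
    """True if the path is an excluded package or sits anywhere beneath one."""
    return any(
        package_path == prefix.rstrip("/") or package_path.startswith(prefix)
        for prefix in prefixes
    )


def subset(elements: list) -> list:
    """Keep only elements that live outside the excluded packages."""
    return [e for e in elements if not is_excluded(e.get("package", ""))]


# Invented sample elements for illustration.
elements = [
    {"name": "CRM", "package": "Applications/Sales"},
    {"name": "Key Vault", "package": "Security Architecture/Secrets"},
    {"name": "Audit Store", "package": "Restricted"},
]

allowed = subset(elements)
```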

Can Fabric's machine-learning features run on EA data?

Yes, once EA data sits in the Lakehouse. Fabric's data-science notebooks support uses like predicting application obsolescence risk from age, vendor end-of-life patterns, and technical health; ranking capability investment priority; or clustering applications by similarity for rationalization. These go beyond standard dashboards and suit mature programs with enough history in the Lakehouse to learn from.

Want this stack built on your repository?

We deliver EA GraphLink deployment, Fabric Lakehouse configuration, pipelines, and your first Power BI dashboards — with the MDG quality gate handled first.
