EA Repository Disaster Recovery: Backup, Restore and Business Continuity Planning for Sparx EA
The short version: a mature EA repository holds years of governance decisions, approved designs, traceability artifacts, and compliance evidence, so losing it is a business problem, not an IT one. Protecting it takes the same discipline as any critical system: verified backups suited to each database platform, restores you have actually tested, defined RTO and RPO targets, and a continuity plan that ranks failure modes by impact rather than treating them as equal.
Most enterprise systems have a mature backup routine. The Sparx EA repository is the one that quietly gets left out — and the gap rarely shows until the day someone needs a restore that was never tested.
Why EA repository backup is different
The reasons it gets skipped are organizational more than technical. The repository is often seen as "the architects' tool" rather than a business system. It starts small, it is not customer-facing, and it does not look like an obvious compliance or regulatory concern — so it slips past enterprise backup policy during the initial deployment and stays out of scope long after it has stopped being small.
Why that view is wrong becomes clear once you account for what actually accumulates in a mature repository:
Architecture decisions. Why was application A chosen over application B? Which constraints govern the program, and which trade-offs were accepted in the technology strategy? Those decisions live in the repository as tagged values, element notes, decision records, and relationship structures. Lose them and you lose the institutional memory of why the current architecture exists at all.
Program traceability. Artifacts produced for program gates — approved designs, compliance mappings, standards assessments — are frequently referenced in documents that live outside the repository. If the repository goes, those references become unresolvable.
Compliance evidence. In regulated industries, the repository may hold evidence that specific designs were reviewed, approved, and compliant at specific points in time. That cannot be reconstructed after the fact.
Years of invested effort. A mature practice represents thousands of hours of modeling, governance, review, and curation. None of it comes back without the data.
Backup options by database platform
The right backup mechanism depends on where your repository database lives. Sparx EA repositories most commonly run on SQL Server, PostgreSQL, or MySQL — each with its own native tooling and managed-cloud equivalent.
SQL Server
SQL Server Agent schedules backup jobs. A daily full-database backup produces a file that restores the complete repository. With the database in the full recovery model, transaction log backups add point-in-time recovery to any moment covered by an unbroken chain of log backups, not just to the time of the last full backup.
Azure Backup handles SQL Server on an Azure VM with policy-driven, configurable retention. Azure SQL Database goes further, with automated backups built in: full backups weekly, differentials every 12 or 24 hours, and transaction log backups roughly every 10 minutes.
Recommended: daily full backups via SQL Server Agent or an Azure Backup policy, plus hourly transaction log backups for active repositories. Keep daily backups for at least 30 days and weekly backups for 12 weeks.
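As a rough illustration of what the scheduled jobs above would execute, the sketch below builds the two T-SQL statements: one full backup and one transaction log backup. The database name `EA_Repository`, the folder `D:\Backups`, and the file-naming scheme are assumptions for the example, not values from any particular installation. The names are interpolated directly into the statement, so they should come from trusted configuration only.

```python
from datetime import datetime


def full_backup_sql(database: str, backup_dir: str, now: datetime) -> str:
    """Build a T-SQL full-backup statement that writes a timestamped .bak file."""
    path = f"{backup_dir}\\{database}_full_{now:%Y%m%d_%H%M%S}.bak"
    # CHECKSUM validates page checksums during the backup;
    # INIT overwrites any existing backup sets in the target file.
    return f"BACKUP DATABASE [{database}] TO DISK = N'{path}' WITH CHECKSUM, INIT;"


def log_backup_sql(database: str, backup_dir: str, now: datetime) -> str:
    """Build a T-SQL log-backup statement (requires the full recovery model)."""
    path = f"{backup_dir}\\{database}_log_{now:%Y%m%d_%H%M%S}.trn"
    return f"BACKUP LOG [{database}] TO DISK = N'{path}' WITH CHECKSUM, INIT;"


if __name__ == "__main__":
    # Hypothetical names; replace with your own database and backup folder.
    run_time = datetime(2024, 1, 15, 2, 0, 0)
    print(full_backup_sql("EA_Repository", r"D:\Backups", run_time))
    print(log_backup_sql("EA_Repository", r"D:\Backups", run_time))
```

In practice each statement would be the body of a SQL Server Agent job step: the full backup on a daily schedule, the log backup on an hourly one.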
PostgreSQL
pg_dump creates a logical backup as a SQL script or archive file. Run it on a schedule (a cron job) and you get a portable backup that restores to any compatible PostgreSQL instance — the standard approach for self-hosted setups.
pg_basebackup takes a physical base backup of the whole database cluster while the server is running. Paired with WAL (write-ahead log) archiving, it enables point-in-time recovery to any moment covered by the archived WAL.
Managed cloud PostgreSQL — Amazon RDS, Azure Database for PostgreSQL, Google Cloud SQL — manages backups for you. RDS provides automated daily backups with point-in-time recovery up to the configured retention period, which can be set as high as 35 days.
Recommended: daily pg_dump for self-hosted instances, with WAL archiving for point-in-time recovery. On managed cloud instances, set backup retention to at least 14 days.
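For the self-hosted case, a daily pg_dump job could be assembled along the following lines. This is a sketch under assumptions: the host, user, database name, and backup directory are hypothetical, and the custom archive format is chosen here because pg_restore can read it, not because the source prescribes it.

```python
from datetime import date
from pathlib import PurePosixPath


def pg_dump_command(host: str, user: str, dbname: str,
                    backup_dir: str, day: date) -> list[str]:
    """Build a pg_dump argument list that writes a dated custom-format archive."""
    target = PurePosixPath(backup_dir) / f"{dbname}_{day:%Y%m%d}.dump"
    return [
        "pg_dump",
        f"--host={host}",
        f"--username={user}",
        "--format=custom",   # compressed archive, restorable with pg_restore
        f"--file={target}",
        dbname,
    ]


if __name__ == "__main__":
    # Hypothetical connection details for illustration only.
    cmd = pg_dump_command("db.example.internal", "ea_backup",
                          "ea_repository", "/var/backups/ea", date(2024, 1, 15))
    print(" ".join(cmd))
```

A cron-driven script would pass the list to `subprocess.run(cmd, check=True)`, with the password supplied through a `.pgpass` file or the `PGPASSWORD` environment variable rather than on the command line.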
MySQL
mysqldump produces a logical backup as a SQL script. Like pg_dump it schedules cleanly and yields a portable file; for small-to-medium repositories on MySQL, a daily mysqldump is adequate.
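Binary logging enables point-in-time recovery from a mysqldump base. The binary log records every data change, so you can restore the dump and then replay the log to any point between the base backup and now, provided the binary logs written since that dump have been retained.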
Managed cloud MySQL — Amazon RDS and Azure Database for MySQL — offers automated backup with configurable retention, mirroring their PostgreSQL equivalents.
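For a self-hosted MySQL repository, the daily dump described above might be built as in this sketch. It assumes the repository tables use InnoDB, which is what makes `--single-transaction` give a consistent dump without blocking writers. The host, user, and paths are hypothetical. The optional `--source-data=2` flag, which records the binary log position as a comment in the dump for later point-in-time recovery, exists in MySQL 8.0.26 and later; older versions call it `--master-data=2`.

```python
from datetime import date
from pathlib import PurePosixPath


def mysqldump_command(host: str, user: str, database: str, backup_dir: str,
                      day: date, record_binlog_position: bool = False) -> list[str]:
    """Build a mysqldump argument list for a dated logical backup."""
    target = PurePosixPath(backup_dir) / f"{database}_{day:%Y%m%d}.sql"
    cmd = [
        "mysqldump",
        f"--host={host}",
        f"--user={user}",
        "--single-transaction",  # consistent snapshot for InnoDB tables
        "--routines",            # include stored procedures and functions
        f"--result-file={target}",
    ]
    if record_binlog_position:
        # Writes the binary log coordinates into the dump as a comment.
        cmd.append("--source-data=2")
    cmd.append(database)
    return cmd


if __name__ == "__main__":
    # Hypothetical connection details for illustration only.
    print(" ".join(mysqldump_command("db.example.internal", "ea_backup",
                                     "ea_repository", "/var/backups/ea",
                                     date(2024, 1, 15), True)))
```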
EAP file backup (small or solo repositories)
For repositories still on the legacy EAP file format, backup is simply copying the file to a safe location: at minimum a network share, ideally cloud storage, always off the machine it lives on. For an EAP file that changes daily, a daily copy is appropriate, ideally taken while no one has the file open so the copy is consistent. Most shared practices have moved to a server-hosted database for exactly the resilience reasons covered here.
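A daily file copy can be as small as the sketch below. The paths are hypothetical, and the size comparison is only a basic sanity check, not proof that the copy opens cleanly in Sparx EA.

```python
import shutil
from datetime import datetime
from pathlib import Path


def backup_eap_file(source: Path, backup_dir: Path, now: datetime) -> Path:
    """Copy the repository file to backup_dir under a timestamped name."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    target = backup_dir / f"{source.stem}_{now:%Y%m%d_%H%M%S}{source.suffix}"
    shutil.copy2(source, target)  # copy2 also preserves file timestamps
    # Basic sanity check: the copy should be the same size as the original.
    if target.stat().st_size != source.stat().st_size:
        raise OSError(f"Backup size mismatch for {target}")
    return target
```

Pointing `backup_dir` at a network share or a synced cloud folder keeps the copy off the machine that holds the original.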
Backup frequency: what "active" means
Frequency should track how fast the repository changes — specifically, how much data loss you can tolerate if a failure forces a restore from the most recent backup.
Lightly used repositories (one or two architects, occasional additions): daily backup is enough. An RPO of up to 24 hours is acceptable where daily progress is modest.
Actively used repositories (five or more architects, daily modeling): daily backup as a floor, with transaction log or WAL-based point-in-time recovery to pull RPO down to hours.
Program-critical repositories (driving delivery artifacts in a critical phase): continuous or near-continuous backup with point-in-time recovery. During a gate review, losing even four hours of work can be unacceptable, so frequent transaction log backups or continuous WAL archiving are warranted.
Recovery testing: the step that is always skipped
A backup you have never restored is not a backup you can rely on. This is not theoretical — backup files can be corrupt, incomplete, or written to storage that turns out to be unreachable on the day you need it.
What a restore test means: take a recent backup file, restore it to a staging or dedicated recovery environment, connect Sparx EA to the restored database, and confirm the content is complete, accessible, and functional.
How often: for active repositories, a monthly restore test is good practice. Where backups are automated in the cloud, a quarterly test confirms the automation is producing genuinely recoverable output.
What to check: element count and package structure match production, diagrams render, MDG Technology profiles are intact, users can connect through Pro Cloud Server (PCS), and reports generate. A test that only confirms the database came up is not thorough enough.
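The count comparison in that checklist can be partly automated. The sketch below compares row counts for a few core repository tables between two database connections. It assumes the standard Sparx EA table names `t_object`, `t_package`, `t_diagram`, and `t_connector`, and it uses in-memory SQLite databases as stand-ins so the example runs anywhere; in a real test the two connections would come from your SQL Server, PostgreSQL, or MySQL driver.

```python
import sqlite3

# Core repository tables used as a spot check (assumed table names).
CORE_TABLES = ("t_object", "t_package", "t_diagram", "t_connector")


def count_rows(conn, table: str) -> int:
    """Return the number of rows in one table via a DB-API connection."""
    cur = conn.cursor()
    cur.execute(f"SELECT COUNT(*) FROM {table}")
    return cur.fetchone()[0]


def compare_row_counts(production, restored, tables=CORE_TABLES) -> dict:
    """Return {table: (production_count, restored_count)} where counts differ."""
    mismatches = {}
    for table in tables:
        expected = count_rows(production, table)
        actual = count_rows(restored, table)
        if expected != actual:
            mismatches[table] = (expected, actual)
    return mismatches


def demo_db(object_rows: int):
    """Build a stand-in database with empty core tables plus some elements."""
    conn = sqlite3.connect(":memory:")
    for table in CORE_TABLES:
        conn.execute(f"CREATE TABLE {table} (id INTEGER PRIMARY KEY)")
    conn.executemany("INSERT INTO t_object (id) VALUES (?)",
                     [(i,) for i in range(object_rows)])
    return conn


if __name__ == "__main__":
    print(compare_row_counts(demo_db(5), demo_db(3)))  # {'t_object': (5, 3)}
```

Because production keeps changing after the backup is taken, small differences are expected; comparing against counts recorded at backup time gives a cleaner signal. Matching counts still do not replace opening diagrams and running reports by hand.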
RTO and RPO targets for the EA repository
RTO (recovery time objective) is how long the repository can be offline before the impact is unacceptable. RPO (recovery point objective) is how much data loss is tolerable, measured as time since the last recoverable state. Classify your repository, then let the targets drive backup frequency, infrastructure, and monitoring.
| Repository classification | Suggested RTO | Suggested RPO |
|---|---|---|
| Low-use repository (few architects, non-critical period) | 24–48 hours | 24 hours |
| Active practice repository | 4–8 hours | 4–8 hours |
| Program-critical repository (active delivery period) | 2–4 hours | 1–2 hours |
| High-stakes compliance / regulated environment | 1–2 hours | < 1 hour |
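One way to put the RPO column to work is to check whether the interval between recoverable points fits the target for a given classification, since worst-case data loss equals that interval. The sketch below encodes the upper bound of each suggested RPO range from the table; the short classification keys are labels invented for this example.

```python
from datetime import timedelta

# Upper bound of each suggested RPO; the flag marks whether the bound is inclusive.
RPO_LIMITS = {
    "low-use": (timedelta(hours=24), True),
    "active": (timedelta(hours=8), True),
    "program-critical": (timedelta(hours=2), True),
    "regulated": (timedelta(hours=1), False),  # the table says "< 1 hour"
}


def meets_rpo(classification: str, recovery_point_interval: timedelta) -> bool:
    """True if the gap between recoverable points fits the RPO target."""
    limit, inclusive = RPO_LIMITS[classification]
    if inclusive:
        return recovery_point_interval <= limit
    return recovery_point_interval < limit


if __name__ == "__main__":
    # A daily full backup alone does not satisfy an active practice repository.
    print(meets_rpo("active", timedelta(hours=24)))  # False
    # Hourly log backups do.
    print(meets_rpo("active", timedelta(hours=1)))   # True
```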
Business continuity: tiered impact scenarios
Different failures in the Sparx EA stack produce very different impact profiles. Treating them as equivalent is the planning mistake that wastes continuity budget — the point is to rank them.
Pro Cloud Server failure
When the Pro Cloud Server instance goes down, Sparx EA clients cannot reach the shared repository and all multi-user access is offline. Architects may hold locally cached versions of recent diagrams, but the shared repository is not available for read or write.
Read-only fallback. If the database itself is still reachable, some organizations configure a direct database read connection as an emergency access path. It is not a supported configuration for ongoing use — it bypasses PCS security controls — but it can retrieve critical content during a short PCS outage.
Recovery. A PCS service restart is usually a matter of minutes. If the server itself has failed, recovery means standing up a replacement PCS instance and reconnecting it to the database; a pre-configured standby instance shortens that considerably.
Database server failure
If the database becomes unavailable, the entire repository is inaccessible — PCS cannot broker connections to a database that will not respond. This is the more serious scenario. Recovery from backup is the primary path: RTO depends on restore time (a function of backup size and infrastructure), and RPO depends on backup frequency. On managed cloud databases such as Azure SQL or RDS, failover to a standby replica can cut RTO sharply.
AI and analytics layer failure
If an external AI or business-intelligence integration that sits on top of the repository goes down, AI-assisted querying and BI dashboard refresh are offline — but core EA access is not. Architects keep working in Sparx EA exactly as before; only the stakeholders relying on those downstream tools are affected. A service restart resolves most availability issues; configuration problems such as an expired API key or a changed network path take longer to diagnose.
An outage in an AI or BI layer that reads the repository is not an EA repository failure.
This distinction is the most important one for your continuity plan. These layers are additive — they sit on top of the repository and read from it; their failure does not touch the architects working directly in the EA client. Communicate that clearly to stakeholders so a dashboard outage never triggers a full repository incident response, and so continuity investment lands where the real risk is: the database and the PCS layer beneath it.
Frequently asked questions
How much storage do EA repository backups require?
Storage depends on repository size and retention. A typical active EA repository database runs from a few hundred megabytes to a few gigabytes. With daily backups kept for 30 days and weekly backups kept for 12 weeks, that is at most 42 retained full backups, so total storage is modest: roughly 50 to 200 GB for a repository of a few gigabytes, less for smaller ones, at a low monthly cloud-storage cost.
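The arithmetic behind that estimate is simple enough to sketch. The function below assumes every retained backup is an uncompressed full backup of the same size and ignores transaction log or WAL files, so it is a rough upper bound rather than a sizing tool.

```python
def retained_backup_storage_gb(repo_size_gb: float,
                               daily_kept: int = 30,
                               weekly_kept: int = 12) -> float:
    """Upper-bound storage for the retained full backups, in GB."""
    return repo_size_gb * (daily_kept + weekly_kept)


if __name__ == "__main__":
    # A 2 GB repository with 30 daily and 12 weekly backups retained.
    print(retained_backup_storage_gb(2.0))  # 84.0
```

Compression and deduplicating backup storage typically bring the real figure down.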
Can Sparx EA be restored to a different database server than the original?
Yes. The repository database can be restored to a different server instance of the same database type — SQL Server to SQL Server, PostgreSQL to PostgreSQL. After the restore, Pro Cloud Server must be reconfigured to point at the new database address. For disaster recovery where the original server is gone, this is the standard recovery path.
What is the impact on users if we take the repository offline for a backup?
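For database-level backups (SQL Server Agent, pg_dump, mysqldump) you usually do not take the repository offline at all, because these tools can back up the database while it is in use. With mysqldump, that holds for InnoDB tables when the dump runs with --single-transaction; without it, table locks can block writes while the dump runs. The other exception is a very large repository where backup I/O slows performance; there, scheduling backups overnight keeps the impact on architects minimal.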
Should the EA repository backup be included in our enterprise backup policy?
Yes, unambiguously. Treat the EA repository as a business-critical system in your enterprise backup policy, on par with other systems that hold institutional knowledge, program records, and compliance evidence. If it is out of scope today, work with your IT infrastructure team to bring it in.
What happens if we lose the PCS configuration during a server failure?
Pro Cloud Server configuration — ODBC connections, authentication, repository definitions, port and security settings — lives in files on the PCS server and should be backed up separately from the database. Losing it without a backup means rebuilding every setting by hand, which is slow and error-prone. Keep the PCS configuration files in your server backup policy.
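Where the server backup policy does not already cover those files, a scheduled archive is one option. The sketch below zips a configuration folder into a dated archive. The folder location is passed in as a parameter because it varies by installation, and the archive name is an assumption for the example. The backup folder should sit outside the configuration folder so the archive does not try to include itself.

```python
import shutil
from datetime import date
from pathlib import Path


def archive_pcs_config(config_dir: Path, backup_dir: Path, day: date) -> Path:
    """Zip the contents of config_dir into a dated archive in backup_dir."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    base_name = backup_dir / f"pcs_config_{day:%Y%m%d}"
    # make_archive appends the ".zip" extension and returns the archive path.
    archive = shutil.make_archive(str(base_name), "zip", root_dir=str(config_dir))
    return Path(archive)
```

Storing the archive alongside the database backups means a replacement PCS instance can be rebuilt from one location.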
Protect the repository like the value it holds
Disaster recovery for a Sparx EA repository is not exotic work. It is the ordinary discipline of verified backups, tested restores, and clear RTO and RPO targets, applied to a system that too often misses out on all three. Get the database and Pro Cloud Server layers right, rank your failure modes honestly, and the repository your practice depends on is protected with the same rigor as any other business-critical system.