SQL Server – Massive peer-to-peer replication topology

replication, sql-server, sql-server-2008-r2

We have a client that's geographically distributed in remote communities, with somewhat unreliable network/internet connectivity between each of the physical locations. There is a single central location, and 52 satellite locations.

Normally, our application is deployed as a centralized solution with a single database that all locations connect to. The data itself is partitioned by a location column in most of the tables (some data is centralized), and the application operates such that touching data at one physical location does not touch data at another physical location, including the centralized data (at least, we're pretty sure it doesn't; if it does, it's likely a bug). Our client would like centralized reporting (which we normally support) and synchronization of several centralized configuration tables across the enterprise (likewise normally trivial, because there's only one copy of the tables).

Every location has its own data center, and is licensed for SQL Server 2008 R2 Enterprise.

The goal is to deploy our application such that broken network/internet connectivity must not prevent local read/write operations, as our application is mission-critical.

My first thought was to use a giant peer-to-peer topology that replicates every object in the database. I haven't been able to find any documentation on, or references to, any installation that has scaled out this far using this method, so I have no idea whether it's technically feasible, and if so, what hardware would be required. Would this work with a centralized distribution database (for ease of management)? Is this even a good idea at all? How are database schema changes to be deployed? Does index maintenance generate a huge amount of network overhead?
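For concreteness, here is a hedged sketch of what enabling peer-to-peer publication on one node might look like (database, publication, and table names are placeholders, not from the question; peer-to-peer requires Enterprise Edition, which each site has):

```sql
-- Hypothetical sketch only: configure one peer of a P2P topology.
-- Every peer must publish an identical set of articles.
USE AppDB;
EXEC sp_replicationdboption
    @dbname = N'AppDB', @optname = N'publish', @value = N'true';

EXEC sp_addpublication
    @publication = N'P2P_AppDB',
    @enabled_for_p2p = N'true',               -- marks the publication as peer-to-peer
    @allow_initialize_from_backup = N'true',  -- initialize new peers from backup, not snapshot
    @repl_freq = N'continuous',
    @status = N'active';

-- One sp_addarticle call per replicated table (placeholder table shown).
EXEC sp_addarticle
    @publication = N'P2P_AppDB',
    @article = N'Orders',
    @source_object = N'Orders';
```

The same script, plus the per-peer subscription setup, would have to be repeated at all 53 nodes, which hints at the management burden discussed below.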

There's also the option of rolling our own solution, which I figure would involve log shipping separate copies of the database (each non-central location would contain only its partitioned portion of the data) to the central location, and then using a merging tool (which we already have) to build a reporting database. Configuration table changes would be pushed out by some other mechanism, possibly merge replication, possibly a homebrew solution.
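One way the per-site copies could be kept readable for the merge tool between restores is a standby restore at the center. A hedged sketch, with all paths and names as placeholder assumptions:

```sql
-- Hypothetical sketch only: apply a shipped log backup for one satellite's
-- copy at the central site, leaving the database readable (STANDBY mode)
-- so the merging tool can query it between restore cycles.
RESTORE LOG AppDB_Site17
FROM DISK = N'\\central\logship\Site17\AppDB_tlog.trn'
WITH STANDBY = N'D:\standby\AppDB_Site17_undo.dat';
```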

I considered and tested merge replication on the entire database, but it would require quite a lot of database and application changes/redesign/etc. to work correctly, so I've already ruled it out due to time constraints. (Thoughts on this?)

What I'm looking for here is some guidance from people with more experience. Obviously, planning for this is critical. Am I on the right path? Are there other options? (URLT?)

Best Answer

For a mere 52 client sites, replication will do just fine. Treat the local copy as the 'master' at each site and have the application work only against the local copy, always. Use replication to aggregate the data to the central repository for aggregated reporting. Index maintenance operations normally do not generate replication traffic. Manage schema changes through application deployment migrations, i.e. have upgrade scripts for each released version.
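A versioned upgrade script might look like the following hedged sketch (the `SchemaVersion` tracking table, version string, and column change are all illustrative assumptions):

```sql
-- Hypothetical sketch only: a guarded migration shipped with a release,
-- so schema changes travel with the application deployment rather than
-- through replication. Names are placeholders.
IF NOT EXISTS (SELECT 1 FROM dbo.SchemaVersion WHERE Version = '2.4.0')
BEGIN
    ALTER TABLE dbo.Orders ADD Notes NVARCHAR(500) NULL;

    INSERT INTO dbo.SchemaVersion (Version, AppliedAt)
    VALUES ('2.4.0', SYSDATETIME());
END
```

The guard makes the script idempotent, which matters when 52 sites apply releases on their own schedules.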

As the number of sites increases, managing replication becomes harder and harder. Licensing would also push toward deploying Express on the periphery. And a SQL Server upgrade is an absolute pain when replication is involved, as the order of upgrading the components (publisher, distributor, subscriber) is critical but difficult to coordinate across many sites. Deployments that I know of with 1,500+ sites use customized data movement based on Service Broker; see Using Service Broker instead of Replication for an example. I've even seen designs that use Service Broker to push new bits (application changes) out to the periphery, which in turn, when activated, deploy migrations and schema changes, making the entire deployment operated from the center.
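As a rough illustration of the Service Broker approach, the plumbing at a satellite site might look like this sketch (all service, contract, and message-type names are placeholder assumptions; a real design would add activation procedures, retry logic, and security configuration):

```sql
-- Hypothetical sketch only: Service Broker objects for custom data movement.
CREATE MESSAGE TYPE [//App/DataSync/RowBatch] VALIDATION = WELL_FORMED_XML;
CREATE CONTRACT [//App/DataSync/Contract]
    ([//App/DataSync/RowBatch] SENT BY INITIATOR);
CREATE QUEUE dbo.DataSyncSendQueue;
CREATE SERVICE [//App/Site17/DataSyncService]
    ON QUEUE dbo.DataSyncSendQueue ([//App/DataSync/Contract]);

-- A change-capture job would open a conversation to the central service
-- and send batches; queued, transactional delivery means a network outage
-- delays messages rather than losing them.
DECLARE @h UNIQUEIDENTIFIER;
BEGIN DIALOG CONVERSATION @h
    FROM SERVICE [//App/Site17/DataSyncService]
    TO SERVICE N'//App/Central/DataSyncService'
    ON CONTRACT [//App/DataSync/Contract];
SEND ON CONVERSATION @h
    MESSAGE TYPE [//App/DataSync/RowBatch] (N'<batch/>');
```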