Sql-server – Optimising a Database Shrink (in terms of speed and reducing fragmentation)


We have a large database that has had a lot of older data stripped out, leaving primary datafile (all objects are currently in the primary filegroup) of about 80Gb containing a little less than 20Gb of data. We could do to release a large portion of that unused space back to the underlying filesystem.

Shuffling up to 20Gb of data around is going to be slow as we are using spinning-disk-and-moving-heads type drives, and I'm told that the shrink process can lead to significant structural fragmentation as it puts blocks where they'll fit with little logic regarding putting related blocks near each other.

Am I right in thinking that the following will be both faster and result in a less fragmented result:

  1. Create a new filegroup on a separate array (we currently have three arrays, one containing the system drive and tempdb, one containing data files and one containing transaction logs, I propose the new filegroup goes on the system array)
  2. Migrate all our tables and indexes over to the new filegroup by moving the clustered index over (or creating one on the new filegroup then dropping it, for tables with no clustered index currently) then dropping and recreating other indexes. This is where I expect to see a speed benefit, as the data will be moving between spindles rather than between places on the same disks to there will be less "head thrashing", and if I understand the process it should reduce exiting internal fragmentation too.
  3. Shrink the primary data file
  4. Move the file containing the new filegroup back to the array intended for data files. This would be done either by detaching the database, moving the files, them reattaching it, or by creating another new group and migrating all the data from the current new one over to it (leaving the data in this new filegroup would make this process easier in future – we can't just migrate to a new filegroup in step 2 this way as I'm told you can't do that with the primary group)

There are windows when I can perform the process in a time when it will not be an issue to take the database offline if needed, so a bit of downtime will not be issue.

Am I over complicating this? Is there a better way to shrink the primary filegroup while not introducing the chance of significant extra internal fragmentation? With plenty of RAM in the machine (16Gb currently, with the total data in this and other databases being about 45Gb including archive data that is rarely touched), is the fragmentation likely to matter much anyway? Should I just run a normal shrink operation and twiddle my thumbs while it does its business?!

Best Answer

A note first:

When you rebuild indexes, you need roughly free space of roughly 1.2 times the largest table. So a 15 GB table in that 20 GB used requires 18 GB free space. This sets a minimum database file size.

Personally, for OLTP databases, anything over 50%-60% used is a bit of red flag. So anything less then 35-40GB for for 20GB data would be worrying.

To decide whether you need downtime and filegroups and all the hassle, have you:

One thing to look at would be NTFS fragmentation too after all this. YOu can stop SQL Server and defrag the volumes to get the MDFs and LDFs contiguous again. Of course, when they grow again you'll get fragmentation: which merits having generous file sizes to start with (50-60% used maximum).

Otherwise, your idea is sound to use different spindles to manage IO. Note the file size and NTFS stuff I mentioned above too.