I have an enormous production SQL Server database (total size = 1.5 TB; a full online backup takes 4 hours) that needs to be slimmed down because it's causing a variety of database-maintenance issues:

- capacity planning & expansion (cost of SAN storage; we hit the 2 TB LUN limit, so tables must be partitioned)
- backup duration & backup size (cost of storage)
- archive duration (long-running SQL Agent jobs that delete rows in batches)
The [PlayerWeapon] table is a candidate for weight reduction:
```sql
CREATE TABLE [dbo].[PlayerWeapon] (
    [GUID]       uniqueidentifier NOT NULL,   -- PK
    [PlayerGUID] uniqueidentifier NOT NULL,
    [WeaponGUID] uniqueidentifier NOT NULL,
    [BeltSlot]   tinyint          NULL
);

ALTER TABLE [dbo].[PlayerWeapon]
    ADD CONSTRAINT [PlayerWeapon_PK] PRIMARY KEY CLUSTERED ( [GUID] ASC )
    WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF,
          IGNORE_DUP_KEY = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON,
          ALLOW_PAGE_LOCKS = ON)
    ON [PRIMARY];

ALTER TABLE [dbo].[PlayerWeapon]
    ADD CONSTRAINT [PlayerWeapon_PlayerGUID_WeaponGUID_UK]
    UNIQUE NONCLUSTERED ( [PlayerGUID] ASC, [WeaponGUID] ASC )
    WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF,
          IGNORE_DUP_KEY = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON,
          ALLOW_PAGE_LOCKS = ON)
    ON [PRIMARY];
```

- Row count = 500 M
- Total size = 125 GB
- Data size = 65 GB
- Index size = 60 GB
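For context, here is a back-of-the-envelope check of the per-row cost. The row-overhead figure is an assumption (SQL Server's internal overhead for an all-fixed-width row is roughly 9 bytes: header, column count, null bitmap, slot-array entry); the observed averages come from the sizes quoted above.

```python
GIB = 2**30
ROWS = 500_000_000

# Fixed-width payload per row: three uniqueidentifier columns (16 bytes each)
# plus one tinyint (1 byte).
payload = 3 * 16 + 1            # 49 bytes

# Approximate per-row overhead for an all-fixed-width row (assumption).
overhead = 9
min_row = payload + overhead    # ~58 bytes minimum

# Observed averages derived from the quoted sizes.
avg_data_bytes_per_row  = 65 * GIB / ROWS    # ~140 bytes
avg_index_bytes_per_row = 60 * GIB / ROWS    # ~129 bytes
```

The gap between the ~58-byte minimum and the ~140-byte observed average may suggest substantial free space in the pages (e.g. from page splits on a random GUID clustered key), which is itself relevant to the excessive-size question.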
Because there is a fixed set of 16 default weapons and each player has a maximum of 16 belt slots (and neither count is anticipated to ever change), a possible "optimization" exists:
```sql
CREATE TABLE [dbo].[Weapon] (
    [GUID]         uniqueidentifier NOT NULL,   -- PK
    [DefaultIndex] tinyint          NULL
);

CREATE TABLE [dbo].[Player] (
    [GUID]            uniqueidentifier NOT NULL,
    [DefaultWeapons]  binary(32)       NOT NULL  -- 16×16 bits = 256 bits = 32 bytes
);
```
[DefaultWeapons] is a packed representation of a 16×16 bit matrix (256 bits = 32 bytes) mapping belt slot index to default weapon index.
If a belt slot's row in the matrix is all zeros (i.e. no default weapon is assigned to that slot), then
[PlayerWeapon].[BeltSlot] is consulted for the (non-default) weapon in that belt slot.
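To make the packing concrete, here is a sketch of one possible bit layout (the function names and the choice of bit ordering are my own, not part of the schema): bit `slot * 16 + weapon` is set when that belt slot holds that default weapon, and an all-zero 16-bit row means "fall back to [PlayerWeapon]". The real unpacking would live in T-SQL or C#, as noted below; Python is used here only for illustration.

```python
NUM_SLOTS = 16
NUM_DEFAULT_WEAPONS = 16

def pack_default_weapons(slot_to_weapon):
    """Pack a {belt_slot: default_weapon_index} mapping into 32 bytes."""
    bits = 0
    for slot, weapon in slot_to_weapon.items():
        assert 0 <= slot < NUM_SLOTS and 0 <= weapon < NUM_DEFAULT_WEAPONS
        bits |= 1 << (slot * NUM_DEFAULT_WEAPONS + weapon)
    return bits.to_bytes(32, "little")

def unpack_default_weapon(packed, slot):
    """Return the default weapon index for a belt slot, or None if the
    slot's 16-bit row is all zeros (meaning: consult [PlayerWeapon])."""
    bits = int.from_bytes(packed, "little")
    row = (bits >> (slot * NUM_DEFAULT_WEAPONS)) & 0xFFFF
    if row == 0:
        return None                 # non-default weapon for this slot
    return row.bit_length() - 1     # index of the (single) set bit
```

For example, `unpack_default_weapon(pack_default_weapons({0: 3, 5: 15}), 5)` returns `15`, while slot 1 returns `None`.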
A big con with this approach: if a later refactoring is needed, a complex data transformation will be required on the production dataset. Also, the unpacking (in either T-SQL or C#) may not perform well.
What are the pros & cons of packing arrays into numeric fields?
And what other approaches might address my problem of excessive table size?
NOTE: table PKs are GUIDs (rather than INT or BIGINT) due to the design of an ORM layer built on top; I have no ability to change this design.