I have an application which creates millions of tables in a SQL Server 2008 database (non clustered). I am looking to upgrade to SQL Server 2014 (clustered), but am hitting an error message when under load:
“There is already an object named ‘PK__tablenameprefix__179E2ED8F259C33B’ in the database”
This is a system generated constraint name. It looks like a randomly generated 64-bit number. Is it possible that I am seeing collisions due to the large number of tables? Assuming I have 100 million tables, I calculate less than a 1-in-1-trillion chance of a collision when adding the next table, but that assumes a uniform distribution. Is it possible that SQL Server changed its name generation algorithm between version 2008 and 2014 to increase the odds of collision?
The other significant difference is that my 2014 instance is a clustered pair, but I am struggling to form a hypothesis for why that would generate the above error.
P.S. Yes, I know creating millions of tables is insane. This is black box 3rd party code over which I have no control. Despite the insanity, it worked in version 2008 and now doesn’t in version 2014.
Edit: on closer inspection, the generated suffix always seems to start with 179E2ED8 – meaning the random part is actually only a 32-bit number and the odds of collisions are a mere 1-in-50 every time a new table is added, which is a much closer match to the error rate I’m seeing!
Best Answer
This depends on the type of constraint and version of SQL Server.
Example Results 2008
Example Results 2017
For default constraints, check constraints and foreign key constraints, the last 4 bytes of the auto-generated name are a hexadecimal version of the objectid of the constraint. As objectid values are guaranteed unique, the name must also be unique. Sybase uses the same scheme for these: tabname_colname_objectid
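To make the relationship above concrete, here is a minimal Python sketch of how the trailing 8 hex characters of such a name map to an integer that could then be compared with the constraint's object_id (for example, as listed in sys.objects). The function names and the sample constraint name are hypothetical and exist only for illustration; they are not taken from the original answer.

```python
def suffix_to_int(constraint_name: str) -> int:
    """Interpret the last 8 characters of a name as an unsigned 32-bit hex value."""
    suffix = constraint_name[-8:]
    if len(suffix) != 8:
        raise ValueError("name is shorter than 8 characters")
    # int() raises ValueError if the suffix is not valid hexadecimal.
    return int(suffix, 16)


def int_to_suffix(value: int) -> str:
    """Format an integer back into the 8-character upper-case hex form."""
    return format(value & 0xFFFFFFFF, "08X")


# Hypothetical default-constraint name, used only as an example input.
name = "DF__mytable__mycol__179E2ED8"
decoded = suffix_to_int(name)
print(decoded, int_to_suffix(decoded))
```

If the decoded value equals the constraint's object_id, the name follows the scheme described for default, check and foreign key constraints.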
For unique constraints and primary key constraints Sybase uses
This too would guarantee uniqueness.
SQL Server doesn't use this scheme.
In both SQL Server 2008 and 2017 it uses an 8-byte string at the end of the system-generated name; however, the algorithm used to generate the last 4 bytes of that string has changed.
In 2008 the last 4 bytes represent a signed integer counter that is offset from the object_id by -16000057, with any negative value wrapping around to max signed int. (The significance of 16000057 is that this is the increment applied between successively created object_id values.) This still guarantees uniqueness.

On 2012 upwards I don't see any pattern at all between the object_id of the constraint and the integer obtained by treating the last 8 characters of the name as the hexadecimal representation of a signed int.
The function names in the call stack in 2017 show that it now creates a GUID as part of the name generation process (on 2008 I see no mention of MDConstraintNameGenerator). I guess this is to provide some source of randomness. Clearly, however, it isn't using the whole 16 bytes from the GUID in the 4 bytes that change between constraints.

I presume the new algorithm was introduced for some efficiency reason, at the expense of an increased possibility of collisions in extreme cases such as yours.
This is quite a pathological case, as it requires the table name prefix and the column name of the PK (insofar as this affects the 8 characters preceding the final 8) to be identical for tens of thousands of tables before a collision becomes probable, but it can be reproduced quite easily with the below.
An example run on SQL Server 2017 against a newly created database failed in just over a minute (after 50,931 tables had been created).
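The "tens of thousands of tables" figure is consistent with a birthday-style estimate. The Python sketch below assumes the varying 4 bytes behave like independent, uniformly distributed 32-bit values, which is an assumption on my part and not something confirmed about SQL Server's generator. The function names are hypothetical.

```python
import math


def any_collision_probability(num_names: int, suffix_bits: int = 32) -> float:
    """Approximate chance that at least two of num_names random suffixes match.

    Uses the standard birthday approximation 1 - exp(-n(n-1) / (2N)),
    assuming uniform, independent suffixes (an assumption, not a known fact
    about the real name generator).
    """
    space = 2 ** suffix_bits
    return -math.expm1(-num_names * (num_names - 1) / (2 * space))


def next_name_collision_probability(existing: int, suffix_bits: int = 32) -> float:
    """Chance that one new random suffix hits any of `existing` distinct suffixes."""
    return min(1.0, existing / 2 ** suffix_bits)


# Chance of at least one clash by the time 50,931 names share a prefix.
print(any_collision_probability(50_931))
# Per-table chance once 100 million such names already exist.
print(next_name_collision_probability(100_000_000))
```

Under these assumptions, the chance of at least one clash within about 50,000 identically prefixed names is on the order of one in four, and with 100 million existing names each new table has a chance of a few percent of colliding, which is in line with the error rate described in the question.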