Currently we have an old (rather crude) system that has user-defined fields, which are mapped against rows in arbitrary tables. This was an after-the-fact modification based on a customer request, and it wasn't really designed to scale well. Our system has around 60 different types of entities, which makes things even more complicated. Essentially the implementation looks like this:
UDF_ID int ENTITY_TYPE int UDF_NAME nvarchar(64) UDF_DATA_TYPE int UDF_FORMAT nvarchar(16) UDF_MASK nvarchar(24) UDF_FLAGS int
UDF_ID int ENTITY_ID int VALUE int MODIFIED datetime
UDF_ID int ENTITY_ID int VALUE nvarchar(255) MODIFIED datetime
This gets nice and fun when we generate our own ways to index compound primary keys, but that's another DailyWTF-worthy story.
Obviously this is pretty hideous, and leads to some spectacularly horrific queries being generated, but it's worked alright for now because we limit each entity to a maximum of 5 user-defined fields. As a quick disclaimer, I wasn't with the company when this design decision was made! 😉
Anyway, we're about to start a shiny new project and will inevitably need a better way of doing this, with no restrictions on the number of UDFs we can apply to entities, increased performance, and less horror in the generated query department. If we could run a NoSQL solution like Mongo or Redis I'd be happy and wouldn't need to bother you all, but sadly that's not an option. Instead, I need a way to do this from within SQL Server 2008 R2.
So far, I've come up with the following options:
- Individual UDF table for each entity type, with identical structures.
- Benefits: Queries are faster and simpler, solves the compound index problem, can search by UDF.
- Downsides: Table versioning is more cumbersome, lots of extra tables, difficult to index across similar entity types (e.g. purchase orders and sales orders)
- Binary blob UDF column in every entity's table.
- Benefits: Really fast: no fetch overhead, no extra JOINs.
- Downsides: Can't search by UDF, extra proprietary code required.
- XML column in every entity's table.
- Benefits: Fast like binary blobs, seems to be indexable.
- Downsides: No idea how they work – looks complicated!
So, do any of these ideas have merit? Is there a better alternative?