Sql-server – Increase a counter for each changed row

sql serversql-server-2008

I'm using SQL Server 2008 Standard, which doesn't have a SEQUENCE feature.

An external system reads data from several dedicated tables of the main database.
External system keeps a copy of data and periodically checks for changes in the data and refreshes its copy.

To make the sync efficient I want to transfer only rows that were updated or inserted since the previous sync. (The rows are never deleted).
To know which rows were updated or inserted since the last sync there is a bigint column RowUpdateCounter in each table.

The idea is that whenever a row is inserted or updated, the number in its RowUpdateCounter column would change.
The values that go into the RowUpdateCounter column should be taken from an ever increasing sequence of numbers.
Values in the RowUpdateCounter column should be unique and each new value stored in a table should be greater than any previous value.

Please see the scripts that show the desired behaviour.

Schema

CREATE TABLE [dbo].[Test](
[ID] [int] NOT NULL,
[Value] [varchar](50) NOT NULL,
[RowUpdateCounter] [bigint] NOT NULL,
CONSTRAINT [PK_Test] PRIMARY KEY CLUSTERED
(
[ID] ASC
))
GO

CREATE UNIQUE NONCLUSTERED INDEX [IX_RowUpdateCounter] ON [dbo].[Test]
(
[RowUpdateCounter] ASC
)
GO


INSERT some rows

INSERT INTO [dbo].[Test]
([ID]
,[Value]
,[RowUpdateCounter])
VALUES
(1, 'A', ???),
(2, 'B', ???),
(3, 'C', ???),
(4, 'D', ???);


Expected result

+----+-------+------------------+
| ID | Value | RowUpdateCounter |
+----+-------+------------------+
|  1 | A     |                1 |
|  2 | B     |                2 |
|  3 | C     |                3 |
|  4 | D     |                4 |
+----+-------+------------------+


The generated values in RowUpdateCounter can be different, say, 5, 3, 7, 9. They should be unique and they should be greater than 0, since we started from empty table.

INSERT and UPDATE some rows

DECLARE @NewValues TABLE (ID int NOT NULL, Value varchar(50));
INSERT INTO @NewValues (ID, Value) VALUES
(3, 'E'),
(4, 'F'),
(5, 'G'),
(6, 'H');

MERGE INTO dbo.Test WITH (HOLDLOCK) AS Dst
USING
(
SELECT ID, Value
FROM @NewValues
)
AS Src ON Dst.ID = Src.ID
WHEN MATCHED THEN
UPDATE SET
Dst.Value            = Src.Value
,Dst.RowUpdateCounter = ???
WHEN NOT MATCHED BY TARGET THEN
INSERT
(ID
,Value
,RowUpdateCounter)
VALUES
(Src.ID
,Src.Value
,???)
;


Expected result

+----+-------+------------------+
| ID | Value | RowUpdateCounter |
+----+-------+------------------+
|  1 | A     |                1 |
|  2 | B     |                2 |
|  3 | E     |                5 |
|  4 | F     |                6 |
|  5 | G     |                7 |
|  6 | H     |                8 |
+----+-------+------------------+

• RowUpdateCounter for rows with ID 1,2 should remain as is, because these rows were not changed.
• RowUpdateCounter for rows with ID 3,4 should change, because they were updated.
• RowUpdateCounter for rows with ID 5,6 should change, because they were inserted.
• RowUpdateCounter for all changed rows should be greater than 4 (the last RowUpdateCounter from the sequence).

The order in which new values (5,6,7,8) are assigned to changed rows doesn't really matter.
The new values can have gaps, e.g. 15,26,47,58, but they should never decrease.

There are several tables with such counters in the database.
It doesn't matter if all of them use the single global sequence for their numbers, or each table has its own individual sequence.

I don't want to use a column with a datetime stamp instead of an integer counter, because:

• The clock on the server can jump both forward and backward. Especially when it is on a virtual machine.

• The values returned by system functions like SYSDATETIME are the same for all affected rows.
The sync process should be able to read changes in batches.
For example, if batch size is 3 rows, then after the MERGE step above the sync process would read only rows E,F,G.
When the sync process is run next time it would continue from row H.

The way I'm doing it now is rather ugly.

Since there is no SEQUENCE in SQL Server 2008, I emulate the SEQUENCE by a dedicated table with IDENTITY as shown in this answer. This in itself is pretty ugly and exacerbated by the fact that I need to generate not a single, but a batch of numbers at once.

Then, I have an INSTEAD OF UPDATE, INSERT trigger on each table with the RowUpdateCounter and generate required sets of numbers there.

In the INSERT, UPDATE and MERGE queries I set RowUpdateCounter to 0, which is replaced by the correct values in the trigger. The ??? in the queries
above are 0.

It works, but is there an easier solution?

You can use a ROWVERSION column for this.

The documentation states that

Each database has a counter that is incremented for each insert or update operation that is performed on a table that contains a rowversion column within the database.

The values are BINARY(8) and you should consider them as BINARY rather than BIGINT as after 0x7FFFFFFFFFFFFFFF it goes on to 0x80... and starts working up from -9223372036854775808if treated as a signed bigint.

A full worked example is below. Maintaining the index on the ROWVERSION column will be expensive if you have lots of updates so you might want to test your workload both with and without to see if it is worth the cost.

CREATE TABLE [dbo].[Test]
(
[ID]               [INT] NOT NULL CONSTRAINT [PK_Test] PRIMARY KEY,
[Value]            [VARCHAR](50) NOT NULL,
[RowUpdateCounter] [ROWVERSION] NOT NULL UNIQUE NONCLUSTERED
)

INSERT INTO [dbo].[Test]
([ID],
[Value])
VALUES     (1,'Foo'),
(2,'Bar'),
(3,'Baz');

DECLARE @RowVersion_LastSynch ROWVERSION = MIN_ACTIVE_ROWVERSION();

UPDATE [dbo].[Test]
SET    [Value] = 'X'
WHERE  [ID] = 2;

DECLARE @RowVersion_ThisSynch ROWVERSION = MIN_ACTIVE_ROWVERSION();

SELECT *
FROM   [dbo].[Test]
WHERE  [RowUpdateCounter] >= @RowVersion_LastSynch
AND RowUpdateCounter < @RowVersion_ThisSynch;

/*TODO: Store @RowVersion_ThisSynch somewhere*/

DROP TABLE [dbo].[Test]