Do you find yourself facing performance problems and long lock chains caused by very frequent INSERT, UPDATE, or DELETE statements being executed on a table? Check out this neat trick that could help you out and make all the difference in the world.
Okay, I admit that title ended up being quite long. But I wanted something that could be easily found in search engines by people facing similar problems.
So anyways, consider the following scenario:
Problem
You have a “hot” table that finds itself at the center of attention. Let’s create a simulation for one such table: we’ll call our hypothetical table “MyHotTable” and store some data in it:
CREATE TABLE MyHotTable
(
SomeIdentifier int NOT NULL,
SomeOtherIdentifier int NOT NULL,
SomeText nvarchar(100) NULL,
SomeCounter int NOT NULL,
CONSTRAINT PK_MyHotTable PRIMARY KEY CLUSTERED (SomeIdentifier, SomeOtherIdentifier)
);
CREATE NONCLUSTERED INDEX IX_MyHotTable_SomeOtherIdentifier ON MyHotTable (SomeOtherIdentifier) INCLUDE (SomeCounter, SomeText);
GO
WITH Level1
AS
(
SELECT TOP (100) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS n
FROM sys.all_columns a CROSS JOIN sys.all_columns b
),
Level2
AS
(
SELECT TOP (10000) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS n
FROM sys.all_columns a CROSS JOIN sys.all_columns b
)
INSERT INTO MyHotTable (SomeIdentifier, SomeOtherIdentifier, SomeText, SomeCounter)
SELECT Level1.n, Level2.n, NEWID(), ABS(CHECKSUM(NEWID())) % 10000
FROM Level1 CROSS JOIN Level2
Let’s say we have UPSERT logic implemented by a stored procedure that looks like this:
CREATE PROCEDURE UpsertHotTable
@SomeIdentifier int,
@SomeOtherIdentifier int
AS
SET NOCOUNT ON;
UPDATE MyHotTable SET SomeCounter = SomeCounter + 1
WHERE SomeIdentifier = @SomeIdentifier AND SomeOtherIdentifier = @SomeOtherIdentifier
IF @@ROWCOUNT = 0
BEGIN
INSERT INTO MyHotTable (SomeIdentifier, SomeOtherIdentifier, SomeCounter, SomeText)
VALUES (@SomeIdentifier, @SomeOtherIdentifier, 1, NEWID())
END
Now, let’s say that this stored procedure is being executed hundreds or even thousands of times per minute, and there are a few “hot spots” in the data (for example, “new” data at the end of the table).
Let’s set up SQLQueryStress to simulate this by running the script below a bunch of times in parallel:
DECLARE
@SomeIdentifier int = ABS(CHECKSUM(NEWID())) % 2 + 100
, @SomeOtherIdentifier int = ABS(CHECKSUM(NEWID())) % 5 + 10000
EXEC UpsertHotTable @SomeIdentifier, @SomeOtherIdentifier
Now, let’s throw in one more SQLQueryStress thread that would simulate something else that’s trying to retrieve lots of data from the hot table (I added a transaction and a table hint to simulate shared locks on the table):
IF OBJECT_ID('tempdb..#AggTemp') IS NULL
CREATE TABLE #AggTemp
(
SomeIdentifier int,
TotalCounter int
);
BEGIN TRAN
INSERT INTO #AggTemp
SELECT SomeIdentifier, SUM(SomeCounter)
FROM MyHotTable WITH(REPEATABLEREAD)
GROUP BY SomeIdentifier
WAITFOR DELAY '00:00:03'
DELETE FROM #AggTemp
COMMIT
The point of that query would be to simulate all kinds of “spikes” in workload that end up locking records in the hot table, thereby impacting the UPSERT processes running in parallel.
A scenario like this would have three main potential problems:
- Due to the high concurrency, it’s possible to get Primary Key Violation errors, even with the check for existing records.
- Lock chains become very common and very long. This leads to several problems, chief among them (obviously) bad performance. But it can get even worse by depleting all of your server’s worker threads, which in turn could lead to serious availability problems (you can watch these blocking chains pile up using the diagnostic query shown right after this list).
- If you have lots of data in this table, with multiple different non-clustered indexes, and different processes accessing this table differently (for example, reporting queries in addition to the updates and inserts), then you’ll have a high risk of getting deadlocks here as well.
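For example, while the stress scripts above are running, a quick way to watch those blocking chains build up is to query the DMVs. This is just a generic diagnostic sketch, not part of the original demo:
SELECT
	r.session_id,
	r.blocking_session_id,
	r.wait_type,
	r.wait_time,
	r.command,
	t.text AS query_text
FROM sys.dm_exec_requests AS r
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) AS t
WHERE r.blocking_session_id <> 0;   -- only requests that are currently blocked
Every returned row is a blocked request; long chains show up as many sessions pointing, directly or indirectly, at the same blocking_session_id.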
Solution
The solution in this case would be to essentially turn our “UPSERT” logic into a much simpler “INSERT-only” logic.
Once we have only INSERTs going into a table that is never updated or queried, throughput would skyrocket. But how do we do that without making huge modifications to the logic of our system?
We implement what I call an “Asynchronous Ledger” logic.
The idea is basically this:
Step 1: Create a “Buffer Ledger” Table
We create an entirely new table, identical in structure to our original table, except that it would have no indexes at all (i.e. it would be a Heap), no constraints or triggers or anything. This table, and only this table, will be used for the high-throughput INSERT commands, and INSERT commands only. This table would be our “Buffer Ledger”. We’ll call this table MyHotTable_Buffer.
CREATE TABLE MyHotTable_Buffer (
SomeIdentifier int NOT NULL,
SomeOtherIdentifier int NOT NULL,
SomeText nvarchar(100) NULL,
SomeCounter int NOT NULL
);
The purpose of this table would be to receive high-throughput INSERTs. The exact implementation of these INSERTs is less important within the context of this blog post, but you should probably be aware of the performance differences between TVP inserts and SqlBulkCopy.
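Just to make the idea concrete, the simplest possible version of that INSERT-only path, replacing the UpsertHotTable call from the demo above with an append-only write, could look something like this (the procedure name is made up for the example):
CREATE PROCEDURE InsertHotTableBuffer
	@SomeIdentifier int,
	@SomeOtherIdentifier int
AS
SET NOCOUNT ON;

/* No UPDATE and no existence check: every event is simply appended to the buffer heap,
   and the counters get aggregated later by the synchronization job */
INSERT INTO dbo.MyHotTable_Buffer (SomeIdentifier, SomeOtherIdentifier, SomeText, SomeCounter)
VALUES (@SomeIdentifier, @SomeOtherIdentifier, NEWID(), 1);
In a real system you would more likely batch these inserts (TVPs or SqlBulkCopy, as mentioned above) rather than call a procedure per row, but the key point is that this code path never reads or updates the hot table.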
Step 2: Create a “Staging” Table
We create one more new table, identical in structure to the “Buffer Ledger” table. This one would be our “Staging” table. We’ll call this table MyHotTable_Staging.
CREATE TABLE MyHotTable_Staging (
SomeIdentifier int NOT NULL,
SomeOtherIdentifier int NOT NULL,
SomeText nvarchar(100) NULL,
SomeCounter int NOT NULL
);
Step 3: Create a “Buffer Synchronization” Procedure
We create a stored procedure which will do the following:
- Run an ALTER TABLE SWITCH TO command to move the data from the Buffer table into the Staging table. Thanks to the way the SWITCH TO command works, this is a metadata-only operation and is nearly instantaneous.
- Run your UPSERT logic to copy the data from the Staging table into the actual production table.
- Truncate the Staging table and then repeat.
It would look something like this:
WHILE 1 = 1
BEGIN
	/* Metadata-only operation: instantly moves all accumulated rows
	   from the buffer into the (empty) staging table */
	ALTER TABLE dbo.MyHotTable_Buffer SWITCH TO dbo.MyHotTable_Staging;

	/* Upsert the staged data into the production table */
	WITH Trgt
	AS
	(
		SELECT *
		FROM dbo.MyHotTable AS prod WITH (UPDLOCK, HOLDLOCK) /* table hints to reduce deadlocks */
		WHERE EXISTS
		(
			SELECT 1
			FROM dbo.MyHotTable_Staging AS stg
			WHERE stg.SomeIdentifier = prod.SomeIdentifier
			AND stg.SomeOtherIdentifier = prod.SomeOtherIdentifier
		)
	)
	MERGE INTO Trgt
	USING
	(
		SELECT SomeIdentifier, SomeOtherIdentifier, SUM(SomeCounter) AS SomeCounter
		FROM dbo.MyHotTable_Staging
		GROUP BY SomeIdentifier, SomeOtherIdentifier
	) AS Src
	ON Src.SomeIdentifier = Trgt.SomeIdentifier
	AND Src.SomeOtherIdentifier = Trgt.SomeOtherIdentifier
	WHEN MATCHED THEN
		UPDATE SET SomeCounter = Trgt.SomeCounter + Src.SomeCounter
	WHEN NOT MATCHED BY TARGET THEN
		INSERT (SomeIdentifier, SomeOtherIdentifier, SomeText, SomeCounter)
		VALUES (Src.SomeIdentifier, Src.SomeOtherIdentifier, NEWID(), Src.SomeCounter)
	;

	/* Empty the staging table so the next SWITCH TO has an empty target */
	TRUNCATE TABLE dbo.MyHotTable_Staging;
END
This is, of course, a simplified version. But the core concept is the same.
Wrap this logic in a stored procedure and create a scheduled SQL Agent job to run it every few seconds or so (depending on how up-to-date the production table needs to be); if you keep the WHILE loop shown above, the job only needs to start the procedure and keep it running.
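For reference, here is a rough sketch of how such a job could be set up with the msdb job procedures. The job, database, and procedure names (SyncHotTableBuffer) are made up for the example, and it assumes the WHILE loop was removed so each run performs a single synchronization pass:
USE msdb;
GO
EXEC dbo.sp_add_job @job_name = N'Sync MyHotTable Buffer';
EXEC dbo.sp_add_jobstep
	@job_name = N'Sync MyHotTable Buffer',
	@step_name = N'Run buffer synchronization',
	@subsystem = N'TSQL',
	@database_name = N'MyDatabase',               -- hypothetical database name
	@command = N'EXEC dbo.SyncHotTableBuffer;';   -- hypothetical procedure wrapping the logic above
EXEC dbo.sp_add_jobschedule
	@job_name = N'Sync MyHotTable Buffer',
	@name = N'Every 10 seconds',
	@freq_type = 4,                -- daily
	@freq_interval = 1,
	@freq_subday_type = 2,         -- repeat in units of seconds
	@freq_subday_interval = 10;    -- every 10 seconds
EXEC dbo.sp_add_jobserver @job_name = N'Sync MyHotTable Buffer', @server_name = N'(local)';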
Step 4: Profit!
From this point on, this is what happens:
- The application layer inserts a bunch of records into the “Buffer Ledger” table. This can work with very high throughputs.
- The records “pile up” in the “Buffer Ledger” table until the next execution of the “Buffer Synchronization” SQL Agent job.
- The “Buffer Synchronization” job initiates an ALTER TABLE SWITCH TO operation which instantly “empties” the “Buffer Ledger” table, transporting all the accumulated data into the “Staging” table.
- A potentially more complex MERGE operation is initiated between the “Staging” table and the final “Production” table. This operation may include various types of changes (inserts / updates / deletes) and even aggregations. While all this is happening, it has absolutely zero effect on the “Buffer Ledger” table which can freely continue receiving new data.
- Rinse and repeat.
Switching to this methodology will significantly improve the performance and throughput of both the ledger-like INSERTs and the synchronization of the data into the actual “Production” table, while essentially eliminating lock chains on the hot table.
Remember the actual real-world scenario for which I provided the screenshot previously?
Well, after implementing the asynchronous-ledger trick, this is what their query durations looked like:
The Downside
The downside to this “Asynchronous Ledger” method should be quite obvious: There’s a certain delay between the initial insertion of data and when it becomes visible in the final production table.
If, for whatever reason, your application’s logic requires that new data be available in the final production table immediately, then the “Asynchronous Ledger” method may not be a suitable solution, no matter how frequently the synchronization job runs.
In most cases, though, that requirement doesn’t really exist. And at worst, it can be handled with a relatively simple change in the app logic, such as an app-level buffer cache.
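If a database-side workaround is preferred instead, another option (similar to what one of the readers describes in the comments below) is to expose a view that unions the production table with whatever is still sitting in the buffer. This is only a rough sketch with a made-up view name, and keep in mind that reading the buffer this way does put shared locks back on it:
CREATE VIEW dbo.MyHotTable_NearRealTime
AS
SELECT SomeIdentifier, SomeOtherIdentifier, SUM(SomeCounter) AS SomeCounter
FROM
(
	SELECT SomeIdentifier, SomeOtherIdentifier, SomeCounter
	FROM dbo.MyHotTable
	UNION ALL
	SELECT SomeIdentifier, SomeOtherIdentifier, SomeCounter
	FROM dbo.MyHotTable_Buffer   -- rows not yet synchronized into production
) AS u
GROUP BY SomeIdentifier, SomeOtherIdentifier;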
Conclusion
Resolving “hot spots” in your data by utilizing more “asynchronous” work is a great way to improve database performance.
I was personally able to dramatically improve the performance of such bottlenecks for multiple clients. Hopefully, you can benefit from this as well.
You can use the demo script in our GitHub for a more detailed sample and a good starting point for your own use cases:
This is very interesting. I’m going to test this immediately.
Thanks, Eitan!
Very nice post and a very practical approach.
This reminds me of a somewhat similar implementation I did when SQL Server 2016 was released, in which I used a Stage table that resides on the same partition scheme as the Prod table.
A job that runs every 5 minutes would merge data into the Stage table as the target, and then once a day the daily partition would be switched into the Prod table.
The original idea was an hourly partition, but since the destination Prod table is a columnstore, a large number of rows (~1M) is essential in order to prevent a state where there are tons of row groups that hurt performance.
For the sake of data retrieval, I used a UNION between the Prod and Stage tables so the data policy of being close to real-time was met.
This is a great technique I have also used, though I have stuck with separate update/insert over merge in the buffer sync.