Traditional database integrity checks in SQL Server can be time-consuming and resource-intensive, especially for large databases, even when using super cool tools like Ola Hallengren’s maintenance solution.
To address this challenge, I developed a T-SQL script for performing incremental integrity checks, which significantly shortens each maintenance run and reduces its impact on the server.
Large databases often face issues with prolonged integrity check times, leading to potential downtime and performance degradation. Full integrity checks, while thorough, are not always feasible on a regular basis due to their impact on system resources.
Before delving into the solution that I developed, let’s first take a look at…
Some (Sub-Par) Alternatives
Ola Hallengren’s DatabaseIntegrityCheck
So, the most popular maintenance solution for SQL Server is the one developed by Ola Hallengren, sporting the famous “DatabaseIntegrityCheck” stored procedure. We over here at Madeira Data Solutions love it and recommend that all our customers use it.
In terms of reducing its performance impact on the server, Ola’s stored procedure has these parameters (a combined example call follows the list):
- PhysicalOnly – This parameter is usually a no-brainer when you’re looking to reduce the performance impact of your integrity checks. It basically cuts down the duration of the operation to 40% or so, by skipping “logical” integrity checks, instead focusing on physical (i.e. storage-level) checks only. However, even 40% may not always be good enough for those really big databases.
- TimeLimit – This parameter can be used for limiting the overall duration of the maintenance operation. But keep in mind that this self-stopping mechanism only works in-between operations. So, it wouldn’t really help if an individual operation that takes reeeeeeeally long starts just before the time limit is up.
- CheckCommands – This parameter can be used to specify the type of DBCC CHECK[…] command(s) that we want it to execute. Its default is CHECKDB, but you can also tell it to use CHECKTABLE, CHECKFILEGROUP, CHECKALLOC, and CHECKCATALOG. By using “smaller” commands such as CHECKTABLE and CHECKFILEGROUP, you could potentially reduce the impact of each individual maintenance operation. However, you still need a way to do it for the entirety of the database eventually… Somehow.
- FileGroups and Objects – These two parameters can be used for running the integrity checks for specific filegroups or objects, or excluding specific filegroups or objects.
- DatabaseOrder – If you’re using the CHECKDB command type and the TimeLimit parameter, then it’s strongly recommended to make use of the “DatabaseOrder” parameter and set it to “DATABASE_LAST_GOOD_CHECK_ASC”. This makes sure that the databases that haven’t been checked for the longest time are first in line to be checked. This way, even if your maintenance operation stopped itself due to the “TimeLimit” parameter before all databases were checked, you can be sure that the remaining databases will be checked in its next execution.
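To put these together, here’s what a typical call combining these parameters might look like (the values here are illustrative, not a recommendation):

```sql
-- Illustrative values only - adjust to your own maintenance window.
EXECUTE dbo.DatabaseIntegrityCheck
    @Databases     = 'USER_DATABASES',
    @CheckCommands = 'CHECKDB',
    @PhysicalOnly  = 'Y',
    @TimeLimit     = 3600,  -- stop starting new checks after one hour
    @DatabaseOrder = 'DATABASE_LAST_GOOD_CHECK_ASC',
    @LogToTable    = 'Y';
```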
These are all great. But… What if a single database is so large and its integrity check takes soooo long, that the above parameters don’t help anymore?
We’d need a way to break it down even further.
Tiger Toolbox’s Maintenance Solution
Another interesting alternative is the usp_CheckIntegrity stored procedure developed by Microsoft’s very own “Tiger” support team.
This special stored procedure was developed with VLDBs (Very Large Databases) in mind. The way it works is that it splits the database tables into 7 equal-sized “buckets”, and DBCC CHECKTABLE is run on each bucket’s objects on a given weekday.
So, you would have bucket 1 checked on Sunday, bucket 2 checked on Monday, and so on until bucket 7 on Saturday. In addition, on Sunday you would also have CHECKALLOC and CHECKCATALOG.
Similar logic could be applied using 7 “buckets” of filegroups rather than tables. But that’s for super-duper-large databases.
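Just to illustrate the bucketing concept (this is not the Tiger toolbox’s actual code – merely a sketch of the idea), assigning a database’s tables to 7 roughly size-balanced buckets could look something like this:

```sql
-- Sketch only: round-robin assignment by descending size keeps the
-- 7 buckets roughly balanced in total size.
SELECT s.[name] AS SchemaName,
       t.[name] AS TableName,
       (ROW_NUMBER() OVER (ORDER BY SUM(ps.used_page_count) DESC) - 1) % 7 + 1 AS Bucket
FROM sys.tables AS t
INNER JOIN sys.schemas AS s ON s.[schema_id] = t.[schema_id]
INNER JOIN sys.dm_db_partition_stats AS ps ON ps.[object_id] = t.[object_id]
GROUP BY s.[name], t.[name];
```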
Regardless, while the Tiger team’s stored procedure is pretty cool, it’s not nearly as robust as Ola Hallengren’s solution. For example, it doesn’t have an equivalent of the “TimeLimit” parameter. Therefore, each “bucket” has to be completed in full in order for the maintenance to have proper coverage.
But… What if your database is still soooo large that even 1 seventh of it would take too long to check?
Introducing: Incremental Object-Level Integrity Checks
What we’re looking for is very similar to combining the TimeLimit and DatabaseOrder parameters in Ola Hallengren’s procedure – but instead of doing the checks at the database level, we want to do the same at the object level.
The problem is that, unlike at the database level, SQL Server writes no metadata at the object level when using DBCC CHECKTABLE. In other words, there’s nothing built into SQL Server that says when a table last had CHECKTABLE run on it.
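For comparison, the database-level metadata does exist on recent versions (SQL Server 2016 SP2 and later) – ‘MyDatabase’ below is just a placeholder:

```sql
-- At the database level, SQL Server does track the last successful CHECKDB:
SELECT DATABASEPROPERTYEX(N'MyDatabase', 'LastGoodCheckDbTime');
-- But there is no equivalent per-table property for DBCC CHECKTABLE.
```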
While there’s nothing like that built into SQL Server natively, once we make use of Ola Hallengren’s maintenance solution, we technically have something very similar – namely, the “CommandLog” table that’s part of the solution.
Specifically, when performing CHECKTABLE on a table using the “DatabaseIntegrityCheck” procedure, a record is saved in the “CommandLog” table with the table name and the time when it was checked.
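For example, a query along these lines (assuming the ‘DBCC_CHECKTABLE’ CommandType value that the solution logs) retrieves the last logged CHECKTABLE time per table:

```sql
-- Last logged CHECKTABLE time per table, from Ola's CommandLog table.
SELECT DatabaseName, SchemaName, ObjectName,
       MAX(StartTime) AS LastCheckTableTime
FROM dbo.CommandLog
WHERE CommandType = 'DBCC_CHECKTABLE'
GROUP BY DatabaseName, SchemaName, ObjectName
ORDER BY LastCheckTableTime ASC;  -- longest-unchecked tables first
```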
We can use that to create a sort of “plug-in” to Hallengren’s procedure, implementing “round-robin” logic where, within a given time limit, it performs DBCC CHECKTABLE on the tables that were either never checked or have gone the longest since their last check.
This is the essence of the script that I created.
Key Features of the Script
- Efficiency: By performing checks at the table level and being able to stop based on a specified time limit, the script minimizes the load on the server.
- Coverage: By always checking the tables that haven’t been checked yet or have gone the longest since their last check, the script provides full eventual coverage of the database. It may not complete within one execution, or 3, or even 10, but eventually it will reach the point where all objects have been checked.
- Scalability: Suitable for databases of any size, the script scales with your data growth.
How it works
- Get the list of databases relevant for the check.
- For each database, get the list of tables within it, calculate the size of each table, and get the last CHECKTABLE time logged for it in the CommandLog table. Save this list of tables in a single temporary table.
- For each table in the temporary table, sorted by last check time ascending and size descending, run the DatabaseIntegrityCheck procedure using the CHECKTABLE command type and filter for that specific table using the Objects parameter.
- Limit the execution duration of the above loop using the predetermined time limit.
Here’s a simplified sketch of what such a T-SQL script could look like (the hard-coded values are illustrative, and it assumes it runs in the database hosting Ola’s objects):
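```sql
-- A simplified sketch, not the full production script.
-- Assumes it runs in the database hosting Ola Hallengren's objects
-- (dbo.DatabaseIntegrityCheck and dbo.CommandLog).

DECLARE @TimeLimitSeconds int = 3600;  -- overall time budget for this run
DECLARE @StartTime datetime2 = SYSDATETIME();

-- Steps 1+2: build a work queue of tables across the relevant databases,
-- with each table's size and last logged CHECKTABLE time.
CREATE TABLE #Tables
(
    DatabaseName sysname   NOT NULL,
    SchemaName   sysname   NOT NULL,
    TableName    sysname   NOT NULL,
    UsedPages    bigint    NOT NULL,
    LastCheck    datetime2 NULL
);

DECLARE @DatabaseName sysname, @sql nvarchar(max);

DECLARE DatabasesCursor CURSOR LOCAL FAST_FORWARD FOR
    SELECT [name]
    FROM sys.databases
    WHERE database_id > 4            -- user databases only
      AND state_desc = 'ONLINE';

OPEN DatabasesCursor;
FETCH NEXT FROM DatabasesCursor INTO @DatabaseName;

WHILE @@FETCH_STATUS = 0
BEGIN
    SET @sql = N'
    SELECT @DatabaseName, s.[name], t.[name],
           SUM(ps.used_page_count),
           cl.LastCheck
    FROM ' + QUOTENAME(@DatabaseName) + N'.sys.tables AS t
    INNER JOIN ' + QUOTENAME(@DatabaseName) + N'.sys.schemas AS s
        ON s.[schema_id] = t.[schema_id]
    INNER JOIN ' + QUOTENAME(@DatabaseName) + N'.sys.dm_db_partition_stats AS ps
        ON ps.[object_id] = t.[object_id]
    OUTER APPLY
    (
        SELECT MAX(StartTime) AS LastCheck
        FROM dbo.CommandLog
        WHERE CommandType = ''DBCC_CHECKTABLE''
          AND DatabaseName = @DatabaseName
          AND SchemaName = s.[name]
          AND ObjectName = t.[name]
    ) AS cl
    GROUP BY s.[name], t.[name], cl.LastCheck;';

    INSERT INTO #Tables (DatabaseName, SchemaName, TableName, UsedPages, LastCheck)
    EXEC sys.sp_executesql @sql, N'@DatabaseName sysname', @DatabaseName;

    FETCH NEXT FROM DatabasesCursor INTO @DatabaseName;
END

CLOSE DatabasesCursor;
DEALLOCATE DatabasesCursor;

-- Steps 3+4: check tables in round-robin order (never-checked first, then
-- longest since last check; bigger tables first on ties), stopping once the
-- time budget runs out. Like Ola's TimeLimit, the check happens in-between
-- tables, so an individual CHECKTABLE may still overrun the budget.
DECLARE @SchemaName sysname, @TableName sysname, @Objects nvarchar(max);

DECLARE TablesCursor CURSOR LOCAL FAST_FORWARD FOR
    SELECT DatabaseName, SchemaName, TableName
    FROM #Tables
    ORDER BY LastCheck ASC, UsedPages DESC;  -- NULL (never checked) sorts first

OPEN TablesCursor;
FETCH NEXT FROM TablesCursor INTO @DatabaseName, @SchemaName, @TableName;

WHILE @@FETCH_STATUS = 0
  AND DATEDIFF(second, @StartTime, SYSDATETIME()) < @TimeLimitSeconds
BEGIN
    -- Objects are specified as database.schema.object per Ola's documentation.
    SET @Objects = @DatabaseName + N'.' + @SchemaName + N'.' + @TableName;

    EXEC dbo.DatabaseIntegrityCheck
        @Databases     = @DatabaseName,
        @CheckCommands = 'CHECKTABLE',
        @Objects       = @Objects,
        @LogToTable    = 'Y';  -- required: the next run reads CommandLog

    FETCH NEXT FROM TablesCursor INTO @DatabaseName, @SchemaName, @TableName;
END

CLOSE TablesCursor;
DEALLOCATE TablesCursor;
DROP TABLE #Tables;
```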
Obviously, this requires that you have Ola Hallengren’s maintenance solution already installed.
Don’t Forget!
I would only add that the above script is responsible solely for the CHECKTABLE operations.
But in order to provide full coverage for your databases, you also need the CHECKALLOC and CHECKCATALOG operations done at the database level.
But that’s okay, because all you’d need to do is create another maintenance job that performs these checks using a script like the one below (a minimal example using Ola’s documented parameters):
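```sql
-- Database-level allocation and catalog checks to complement the
-- incremental CHECKTABLE job above.
EXECUTE dbo.DatabaseIntegrityCheck
    @Databases     = 'USER_DATABASES',
    @CheckCommands = 'CHECKALLOC,CHECKCATALOG',
    @LogToTable    = 'Y';
```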
And that’s it!
Implementing an incremental integrity check for large databases in SQL Server can both maintain data reliability and reduce the impact on system performance. Hopefully, the T-SQL scripts I provided above offer a practical solution to the challenges posed by traditional integrity checks.
Give it a try and experience the benefits for yourself!