Eitan Blumin's blog

Incremental Integrity Check for Large Databases

Traditional database integrity checks in SQL Server can be time-consuming and resource-intensive, especially for large databases, even when using super cool tools like Ola Hallengren’s maintenance solution.

To address this challenge, I developed a T-SQL script for performing incremental integrity checks, which significantly optimizes the process and reduces its impact on the server.

Large databases often face issues with prolonged integrity check times, leading to potential downtime and performance degradation. Full integrity checks, while thorough, are not always feasible on a regular basis due to their impact on system resources.

Before diving into the solution that I developed, let’s first review…

Some (Sub-Par) Alternatives

Ola Hallengren’s DatabaseIntegrityCheck

So, the most popular maintenance solution for SQL Server is the one developed by Ola Hallengren, sporting the famous “DatabaseIntegrityCheck” stored procedure. We here at Madeira Data Solutions love it and recommend that all our customers use it.

In terms of reducing its performance impact on the server, Ola’s stored procedure offers parameters such as:

  * @TimeLimit – stop starting new checks once a given number of seconds has elapsed
  * @PhysicalOnly – limit the check to the physical structures only
  * @MaxDOP – cap the degree of parallelism used by the check
  * @LockTimeout – how long to wait on blocked objects before giving up
  * @NoIndex – skip checking nonclustered indexes
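For instance, a throttled full check could look like this (the parameter values below are purely illustrative, not recommendations):

```sql
-- Illustrative invocation of Ola Hallengren's DatabaseIntegrityCheck,
-- assuming it is installed in [master]; values are examples only.
EXECUTE master.dbo.DatabaseIntegrityCheck
    @Databases     = 'USER_DATABASES',
    @CheckCommands = 'CHECKDB',
    @PhysicalOnly  = 'Y',    -- skip the deeper logical checks to reduce duration
    @MaxDOP        = 2,      -- cap parallelism
    @TimeLimit     = 3600,   -- stop starting new checks after one hour
    @LockTimeout   = 300,    -- give up on blocked objects after 5 minutes
    @LogToTable    = 'Y';
```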

These are all great. But… What if a single database is so large and its integrity check takes soooo long, that the above parameters don’t help anymore?

We’d need a way to break it down even further.

Tiger Toolbox’s Maintenance Solution

Another interesting alternative is the usp_CheckIntegrity stored procedure developed by Microsoft’s very own “Tiger” support team.

This special stored procedure was developed with VLDBs (Very Large Databases) in mind. It works by splitting the database tables into 7 equal-sized “buckets”, and running DBCC CHECKTABLE on each “bucket’s” objects on a given weekday.

So, you would have bucket 1 checked on Sunday, bucket 2 checked on Monday, and so on till bucket 7 on Saturday. And in addition, on Sunday you would also have CHECKALLOC and CHECKCATALOG.
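Conceptually, and only as a rough approximation (the actual usp_CheckIntegrity balances the buckets by size, whereas NTILE only balances the *count* of tables per bucket), the bucketing could be sketched like this:

```sql
-- Conceptual sketch only: assign each user table to one of 7 weekday "buckets".
SELECT
    QUOTENAME(s.name) + N'.' + QUOTENAME(t.name) AS TableName,
    NTILE(7) OVER (ORDER BY SUM(p.rows) DESC) AS Bucket  -- 1 = Sunday ... 7 = Saturday
FROM sys.tables AS t
INNER JOIN sys.schemas AS s
    ON t.schema_id = s.schema_id
INNER JOIN sys.partitions AS p
    ON t.object_id = p.object_id
   AND p.index_id IN (0, 1)  -- heap or clustered index only (avoid double-counting rows)
GROUP BY s.name, t.name;
```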

Similar logic could be implemented using 7 “buckets” of filegroups rather than tables. But that’s for super-duper-large databases.

Regardless, while the Tiger team’s stored procedure is pretty cool, it’s not nearly as robust as Ola Hallengren’s variant. For example, it doesn’t have an equivalent of the “TimeLimit” parameter. Therefore, each “bucket” must be checked in full for the maintenance to provide proper coverage.

But… What if your database is still soooo large that even 1 seventh of it would take too long to check?

Introducing: Incremental Object-Level Integrity Checks

The sort of thing that we’re looking for is very similar to combining the TimeLimit and DatabaseOrder parameters in Ola Hallengren’s procedure – but instead of performing the checks at the database level, we want to do the same at the object level.

The problem is that, unlike at the database level – where DBCC CHECKDB records a “last known good” timestamp in the database’s boot page – there is no metadata written in SQL Server at the object level when using DBCC CHECKTABLE. In other words, there’s nothing built into SQL Server that records when a table last had CHECKTABLE run on it.

While nothing like that is built into SQL Server natively, if we use Ola Hallengren’s maintenance solution we technically have something very similar – namely, the “CommandLog” table that’s part of the solution.

Specifically, when performing CHECKTABLE on a table using the “DatabaseIntegrityCheck” procedure, a record would be saved in the “CommandLog” table with the table name and the time when it was checked.
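For example, a query along these lines retrieves the last logged check time per table. It assumes Ola’s default CommandLog schema in [master], and that CHECKTABLE runs are logged with the CommandType value 'DBCC_CHECKTABLE':

```sql
-- Last time each table had a CHECKTABLE logged by DatabaseIntegrityCheck.
-- Tables never checked simply won't appear in the result.
SELECT
    DatabaseName,
    SchemaName,
    ObjectName,
    MAX(StartTime) AS LastCheckTableTime
FROM master.dbo.CommandLog
WHERE CommandType = 'DBCC_CHECKTABLE'
GROUP BY DatabaseName, SchemaName, ObjectName;
```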

We can use that to create a sort of “plug-in” for Hallengren’s procedure, implementing a “round-robin” logic where, within a given time limit, it performs DBCC CHECKTABLE on tables that were either never checked or that have gone the longest since their last check.

This is the essence of the script that I created.

Key Features of the Script

How it works

  1. Get the list of databases relevant for the check.
  2. For each database, get the list of tables within it, calculate the size of each table, and get the last CHECKTABLE time logged for it in the CommandLog table. Save this list of tables in a single temporary table.
  3. For each table in the temporary table, sorted by last check time ascending and size descending, run the DatabaseIntegrityCheck procedure using the CHECKTABLE command type and filter for that specific table using the Objects parameter.
  4. Limit the execution duration of the above loop using the predetermined time limit.

Here’s an example of the T-SQL script:
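What follows is a simplified sketch of that round-robin loop rather than the full published script. It assumes Ola’s solution (DatabaseIntegrityCheck plus CommandLog) is installed in [master], and it omits the per-database population of the work list (step 2 above) for brevity:

```sql
-- Simplified sketch only; the full script in the toolbox repo also handles
-- size calculation, database filtering, and error handling.
DECLARE @TimeLimitSeconds int = 3600;            -- overall time budget for this run
DECLARE @StartTime datetime2 = SYSDATETIME();
DECLARE @DatabaseName sysname,
        @ObjectName nvarchar(776),               -- Schema.Table
        @ObjectFullName nvarchar(1000);          -- Database.Schema.Table

CREATE TABLE #TablesToCheck
(
    DatabaseName sysname NOT NULL,
    ObjectName   nvarchar(776) NOT NULL,
    LastCheck    datetime NULL,                  -- from CommandLog; NULL = never checked
    TotalPages   bigint NOT NULL
);

/* Step 2 of the outline goes here: for each relevant database, insert its
   tables into #TablesToCheck, joining sys.tables / sys.partitions to
   master.dbo.CommandLog for the last CHECKTABLE time. Omitted for brevity. */

DECLARE TableList CURSOR LOCAL FAST_FORWARD FOR
    SELECT DatabaseName, ObjectName
    FROM #TablesToCheck
    ORDER BY ISNULL(LastCheck, '19000101') ASC,  -- never/oldest-checked first
             TotalPages DESC;                    -- then biggest tables first

OPEN TableList;
FETCH NEXT FROM TableList INTO @DatabaseName, @ObjectName;

WHILE @@FETCH_STATUS = 0
  AND DATEDIFF(second, @StartTime, SYSDATETIME()) < @TimeLimitSeconds
BEGIN
    SET @ObjectFullName = @DatabaseName + N'.' + @ObjectName;

    EXECUTE master.dbo.DatabaseIntegrityCheck
        @Databases     = @DatabaseName,
        @CheckCommands = 'CHECKTABLE',
        @Objects       = @ObjectFullName,        -- filter to this one table
        @LogToTable    = 'Y';                    -- record it in CommandLog for next run

    FETCH NEXT FROM TableList INTO @DatabaseName, @ObjectName;
END;

CLOSE TableList;
DEALLOCATE TableList;
DROP TABLE #TablesToCheck;
```

Note that the time limit is checked between tables, so a run may overshoot the budget by the duration of the last table’s check – the same behavior as Ola’s own TimeLimit parameter.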

Also available at our toolbox repository here

Obviously, this requires that you already have Ola Hallengren’s maintenance solution installed.

Don’t Forget!

I would only add that the above script is responsible solely for the CHECKTABLE operations.

But in order to provide full coverage for your databases, you also need the CHECKALLOC and CHECKCATALOG operations done at the database level.

But that’s okay, because all you’d need to do is create another maintenance job that runs these checks using a script like the one below:
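A minimal version of such a job could be as simple as the following, again assuming Ola’s solution is installed in [master] (the database selection value is illustrative):

```sql
-- Database-level checks that complement the incremental CHECKTABLE script:
-- CHECKALLOC validates allocation structures, CHECKCATALOG validates catalog consistency.
EXECUTE master.dbo.DatabaseIntegrityCheck
    @Databases     = 'USER_DATABASES',
    @CheckCommands = 'CHECKALLOC,CHECKCATALOG',
    @LogToTable    = 'Y';
```

These database-level checks are far cheaper than full CHECKDB runs, so running them on a regular schedule is usually feasible even for very large databases.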

Also available at our toolbox repository here

And that’s it!

Implementing an incremental integrity check for large databases in SQL Server can both maintain data reliability and reduce the impact on system performance. Hopefully, the T-SQL scripts I provided above offer a practical solution to the challenges posed by traditional integrity checks.

Give it a try and experience the benefits for yourself!
