Eitan Blumin's blog

How to Un-Heap Your Heaps?

Today’s script has something very interesting to do with Heap tables. Specifically, how to turn them into something that’s not a heap. Continue reading for more info…

What is a Heap?

In case you didn’t know, “Heap” tables in SQL Server are tables that don’t have a clustered index on them.
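To see which tables in the current database are heaps, a query along these lines should do. It is only a sketch and not part of the script discussed in this post. It relies on the fact that a heap shows up in `sys.indexes` with `type = 0`, and it takes the row counts from `sys.partitions`.

```sql
-- List all user heap tables in the current database, largest first.
SELECT
    SCHEMA_NAME(t.schema_id) AS schema_name,
    t.name                   AS table_name,
    SUM(p.rows)              AS row_count
FROM sys.tables AS t
INNER JOIN sys.indexes AS i
    ON i.object_id = t.object_id
INNER JOIN sys.partitions AS p
    ON p.object_id = i.object_id
   AND p.index_id  = i.index_id
WHERE i.type = 0              -- 0 = HEAP (no clustered index)
  AND t.is_ms_shipped = 0     -- skip system-shipped tables
GROUP BY t.schema_id, t.name
ORDER BY row_count DESC;
```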

There’s plenty of information already available on the internet about these tables. Here is what Microsoft Docs has to say about it:


If a table is a heap and does not have any nonclustered indexes, then the entire table must be examined (a table scan) to find any row. This can be acceptable when the table is tiny, such as a list of the 12 regional offices of a company.

When a table is stored as a heap, individual rows are identified by reference to a row identifier (RID) consisting of the file number, data page number, and slot on the page. The row id is a small and efficient structure. Sometimes data architects use heaps when data is always accessed through nonclustered indexes and the RID is smaller than a clustered index key.

Heap tables can have detrimental implications on performance in the following scenarios:


So yeah, Heap tables are pretty bad, especially when they contain a lot of data. Many talented bloggers have already written about this issue, explaining heap tables, comparing them to clustered index tables, and telling you how important it is to design your database tables properly.

Which column(s) should be clustered?

But I’m not here to re-hash repeated and tired mantras. What I’m here to do today is share with you a trick that I’ve got up my sleeve to Quickly Generate Clustered Index Recommendations for Heap Tables! This would be especially useful to you if you have A LOT of databases in your SQL Server, and many of them contain A LOT of heap tables. Going through each and every one could be very tiresome.

So what I did was write a “guesstimation” script which tries to make use of whatever metadata and statistics SQL Server has, and which may give a hint as to what would be the most probable clustered index to create. Its algorithm goes something like this:

  1. Look in index usage stats for the most “popular” non-clustered index that would be a good candidate for a clustered index. If none is found, then:
  2. If there’s any non-clustered index at all, get the first one created with the highest number of INCLUDE columns, giving priority to UNIQUE indexes. If none is found, then:
  3. Look in missing index stats for the most impactful index that has the highest number of INCLUDE columns. If none is found, then:
  4. Use the IDENTITY column in the table. If none is found, then:
  5. Check for any column statistics in the table and look for the column which is the most selective (most unique values). If none is found, then:
  6. Use the first date/time column in the table, giving priority to columns with a default constraint. If none is found, then:
  7. Use the first int/bigint/smallint/tinyint column in the table, giving priority to columns without a default constraint. If none is found, then:
  8. Use the first non-nullable column in the table, giving priority to columns without a default constraint. If none is found, then:
  9. Bummer. I’m out of ideas. No automated recommendations are possible.
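
As a rough illustration of the first step in the list above, a query like the following ranks the non-clustered indexes on heap tables by how often they were read. This is a sketch and not the code from the actual script: scoring “popularity” as seeks + scans + lookups is an assumption made for this example, and the real script may weigh things differently.

```sql
-- Sketch: rank non-clustered indexes on heap tables by read activity.
-- The "reads" formula below is an assumption for illustration only.
SELECT
    SCHEMA_NAME(t.schema_id) AS schema_name,
    t.name                   AS table_name,
    ix.name                  AS index_name,
    ix.is_unique,
    us.user_seeks + us.user_scans + us.user_lookups AS reads
FROM sys.tables AS t
INNER JOIN sys.indexes AS hp
    ON hp.object_id = t.object_id
   AND hp.type = 0                    -- the table itself is a heap
INNER JOIN sys.indexes AS ix
    ON ix.object_id = t.object_id
   AND ix.type = 2                    -- non-clustered rowstore index
   AND ix.is_disabled = 0
   AND ix.is_hypothetical = 0
INNER JOIN sys.dm_db_index_usage_stats AS us
    ON us.database_id = DB_ID()       -- usage stats are instance-wide
   AND us.object_id   = ix.object_id
   AND us.index_id    = ix.index_id
WHERE t.is_ms_shipped = 0
ORDER BY reads DESC;
```

Keep in mind that `sys.dm_db_index_usage_stats` is cleared when the SQL Server service restarts, so an index that has not been touched since the last restart will not appear here at all. That is one reason why a chain of fallbacks like the one above is useful.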

Note: the above may not be up to date when you read this post. I occasionally update the script with improvements and things could change such as new recommendations added, or their order of consideration changed. The most up-to-date algorithm would be documented within the script itself.

You can find the script in my GitHub Gists here:

I’ve had the chance to use this script in several environments, and it works very well for most common scenarios.

WARNING! CONSIDER THE FOLLOWING BEFORE MAKING ANY ACTUAL CHANGES:

The script also generates the actual CREATE script for these clustered indexes, but as I said, be careful not to run those scripts blindly without thinking first. So, while the script will save you significant time and give you helpful ideas, you should still evaluate each recommendation and make sure it’s fine.
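
For reference, a generated recommendation might look something like the statement below. The table `dbo.OrderEvents`, its IDENTITY column `Id`, and the index name are all hypothetical and made up for this example; the exact output format of the script may differ.

```sql
-- Hypothetical example: dbo.OrderEvents is a heap with an IDENTITY column named Id.
-- Review before running: building a clustered index on a heap rewrites the whole
-- table and also rebuilds every non-clustered index on it, because their row
-- locators change from RIDs to the clustering key.
CREATE UNIQUE CLUSTERED INDEX CIX_OrderEvents_Id
    ON dbo.OrderEvents (Id)
    WITH (SORT_IN_TEMPDB = ON);   -- add ONLINE = ON where your edition supports it
```

On a large table this can take a while and hold locks, so it is best tested on a copy first and scheduled for a quiet period.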

Got any comments? Ideas? Let me know below.

Cover Image by PublicDomainPictures from Pixabay