How to Un-Heap Your Heaps?

Today’s script has something very interesting to do with Heap tables. Specifically, how to turn them into something that’s not a heap. Continue reading for more info…

In case you didn’t know, “Heap” tables in SQL Server are tables that don’t have a clustered index on them.

There’s plenty of information already available on the internet about these tables. Here is what Microsoft Docs has to say about it:


If a table is a heap and does not have any nonclustered indexes, then the entire table must be examined (a table scan) to find any row. This can be acceptable when the table is tiny, such as a list of the 12 regional offices of a company.

When a table is stored as a heap, individual rows are identified by reference to a row identifier (RID) consisting of the file number, data page number, and slot on the page. The row id is a small and efficient structure. Sometimes data architects use heaps when data is always accessed through nonclustered indexes and the RID is smaller than a clustered index key.

Heap tables can have detrimental implications on performance in the following scenarios:

  • When the data is frequently returned in a sorted order. A clustered index on the sorting column could avoid the sorting operation.
  • When the data is frequently grouped together. Data must be sorted before it is grouped, and a clustered index on the sorting column could avoid the sorting operation.
  • When ranges of data are frequently queried from the table. A clustered index on the range column will avoid sorting the entire heap.
  • When there are no nonclustered indexes and the table is large. In a heap, all rows of the heap must be read to find any row.

So yeah, Heap tables are pretty bad, especially when they contain a lot of data. Many talented bloggers have already wrote about this issue, explaining heap tables, comparing them to clustered index tables, and telling you how important it is to design your database tables properly.

But I’m not here to re-hash repeated and tired mantras. What I’m here to do today, is to share with you a trick that I’ve got up my sleeve to Quickly Generate Cluster Index Recommendations for Heap Tables! This would be especially useful to you if you have A LOT of databases in your SQL Server, and many of them containing A LOT of heap tables. Going through each and every one could be very tiresome.

So what I did, is basically write a “guestimation” script which tries to make use of whatever metadata and statistics SQL Server has, which may give a hint as to what would be the most probable clustered index to create. Its algorithm goes something like this:

  1. Look in index usage stats for the most “popular” non-clustered indexes which would be a good candidate as clustered index. If no such was found, then:
  2. If there’s any non-clustered index at all, get the first one created with the highest number of INCLUDE columns, give priority to UNIQUE indexes. If no such was found, then:
  3. Look in missing index stats for the most impactful index that has the highest number of INCLUDE columns. If no such was found, then:
  4. Use the IDENTITY column in the table. If no such was found, then:
  5. Check for any column statistics in the table and look for the column which is the most selective (most unique values). If no such was found, then:
  6. Use the first date/time column in the table, give priority to columns with a default constraint. If no such was found, then:
  7. Use the first int/bigint/smallint/tinyint column in the table, give priority to columns without a default constraint. If no such was found, then:
  8. Use the first non-nullable column in the table, give priority to columns without a default constraint. If no such was found, then:
  9. Bummer. I’m out of ideas. No automated recommendations are possible.

Note: the above may not be up to date when you read this post. I occasionally update the script with improvements and things could change such as new recommendations added, or their order of consideration changed. The most up-to-date algorithm would be documented within the script itself.

You can find the script in my GitHub Gists here:

—————————————————–
—- Generate Clustered Index Recommendations ——-
—————————————————–
Author: Eitan Blumin | https://www.eitanblumin.com
More info: https://eitanblumin.com/2019/12/30/resolving-tables-without-clustered-indexes-heaps/
—————————————————–
Description:
————
This script finds all heap tables, and "guestimates" a clustered index recommendation for each.
The script implements the following algorithm:
1. Look in index usage stats for the most "popular" non-clustered indexes which would be a good candidate as clustered index, give priority to UNIQUE indexes. If no such was found, then:
2. If there's any non-clustered index at all, get the first one created with the highest number of INCLUDE columns, give priority to UNIQUE indexes. If no such was found, then:
3. Look in missing index stats for the most impactful index that has the highest number of INCLUDE columns. If no such was found, then:
4. Use the IDENTITY column in the table. If no such was found, then:
5. Check for any column statistics in the table and look for the column which is the most selective (most unique values). If no such was found, then:
6. Use the first date/time column in the table, give priority to columns with a default constraint. If no such was found, then:
7. Use the first int/bigint/smallint/tinyint column in the table, give priority to columns without a default constraint. If no such was found, then:
8. Use the first non-nullable column in the table, give priority to columns without a default constraint. If no such was found, then:
9. Bummer. I'm out of ideas. No recommendations are possible.
—————————————————–
Change log:
————
2020-11-25 Various improvements:
– Changed recommendations prioritization – gave higher priority to most SELECTIVE column
– Added parameter @RetainHighestCompression to retain DATA_COMPRESSION settings in scripts
– Ignore special index types (columnstore, XML, spatial, …), and hypothetical indexes
– Give priority to UNIQUE indexes when prioritizing existing indexes based on usage stats
– Replaced usage of sp_MSforeachDb with a cursor, to support longer command text
2020-11-18 Added Rollback_Script column in output
2020-11-03 Added new step to find first integer column, and a new step to find first non-nullable column
2020-09-30 Added optional parameters @OnlineRebuild, @SortInTempDB, @MaxDOP
2020-09-21 Added columns list in initial recommendations retrieval, removed newlines from remediation scripts
2020-07-14 Added proper support for replacing unique indexes
2020-07-14 Added generated script for replacing existing nc index with a clustered index
2020-02-19 Added support for Azure SQL DB, and added version-dependent check to ignore memory optimized tables
2020-02-12 Changed prioritization a bit for the recommendations, added automatic generation of basic CREATE script
2020-01-07 Added check of database Updateability, and moved around a few columns
2019-12-29 Added checks for IDENTITY columns, and first DATE/TIME columns
2019-12-23 First version
—————————————————–
Parameters:
————
DECLARE
@MinimumRowsInTable INT = 200000 Minimum number of rows in a table in order to check it
Parameters controlling the structure of output scripts:
,@OnlineRebuild BIT = 1 If 1, will generate CREATE INDEX commands with the ONLINE option turned on.
,@SortInTempDB BIT = 1 If 1, will generate CREATE INDEX commands with the SORT_IN_TEMPDB option turned on.
,@MaxDOP INT = NULL If not NULL, will generate CREATE INDEX commands with the MAXDOP option. Set to 1 to prevent parallelism and reduce workload.
,@RetainHighestCompression BIT = 1 If 1, will retain the highest data compression setting when replacing existing indexes
—————————————————–
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
SET NOCOUNT, ARITHABORT, XACT_ABORT ON;
IF OBJECT_ID(N'tempdb..#temp_heap') IS NOT NULL DROP TABLE #temp_heap;
DECLARE @CMD NVARCHAR(MAX), @CurrDB SYSNAME, @CurrObjId INT, @CurrTable NVARCHAR(1000);
DECLARE @RebuildOptions NVARCHAR(MAX);
Init local variables and defaults
SET @RebuildOptions = N''
IF @OnlineRebuild = 1 SET @RebuildOptions = @RebuildOptions + N', ONLINE = ON'
IF @SortInTempDB = 1 SET @RebuildOptions = @RebuildOptions + N', SORT_IN_TEMPDB = ON'
IF @MaxDOP IS NOT NULL SET @RebuildOptions = @RebuildOptions + N', MAXDOP = ' + CONVERT(nvarchar(4000), @MaxDOP)
IF @RetainHighestCompression = 1 SET @RebuildOptions = @RebuildOptions + N', DATA_COMPRESSION = {COMPRESSION}'
IF @RebuildOptions LIKE N',%' SET @RebuildOptions = N' WITH (' + STUFF(@RebuildOptions, 1, 2, N'') + N')';
IF @OnlineRebuild = 1 AND ISNULL(CONVERT(int, SERVERPROPERTY('EngineEdition')),0) NOT IN (3,5,8)
BEGIN
RAISERROR(N'— WARNING: @OnlineRebuild is set to 1, but current SQL edition does not support ONLINE rebuilds.', 0, 1);
END
CREATE TABLE #temp_heap
(
[database_name] NVARCHAR(50),
table_name NVARCHAR(MAX),
full_table_name NVARCHAR(MAX),
num_of_rows INT NULL,
[object_id] INT,
candidate_index SYSNAME NULL,
candidate_columns_from_existing_index NVARCHAR(MAX) NULL,
include_columns_from_existing_index NVARCHAR(MAX) NULL,
candidate_columns_from_missing_index NVARCHAR(MAX) NULL,
identity_column SYSNAME NULL,
most_selective_column_from_stats SYSNAME NULL,
first_date_column SYSNAME NULL,
first_integer_column SYSNAME NULL,
first_integer_column_type SYSNAME NULL,
first_non_nullable_column SYSNAME NULL,
is_unique BIT NULL,
data_compression_type TINYINT NULL,
data_compression_type_desc AS (CASE data_compression_type WHEN 2 THEN 'PAGE' WHEN 1 THEN 'ROW' ELSE 'NONE' END)
);
SET @CMD = N'
SELECT DB_NAME() as DatabaseName, t.object_id, OBJECT_NAME(t.object_id) AS table_name, QUOTENAME(OBJECT_SCHEMA_NAME(t.object_id)) + ''.'' + QUOTENAME(OBJECT_NAME(t.object_id)) AS FullTableName
, SUM(p.rows)
, QUOTENAME(ix.name) AS CandidateIndexName
, ix_columns
, inc_columns
, ix.is_unique
, data_compression_type = MAX(p.data_compression)
FROM sys.tables t
INNER JOIN sys.partitions p
ON t.object_id = p.OBJECT_ID
OUTER APPLY
(
SELECT TOP 1 us.index_id, ix.[name], ix.is_unique
FROM sys.dm_db_index_usage_stats AS us
INNER JOIN sys.indexes AS ix
ON us.index_id = ix.index_id AND us.object_id = ix.object_id
WHERE us.database_id = DB_ID()
AND us.object_id = t.object_id
AND ix.index_id > 1
AND ix.is_hypothetical = 0
AND ix.type <= 2
ORDER BY CONVERT(tinyint, ix.is_unique) DESC, us.user_updates DESC, us.user_scans DESC, us.user_seeks DESC
) AS ix
OUTER APPLY
(SELECT ix_columns = STUFF((
SELECT '', '' + QUOTENAME(c.name) + CASE ic.is_descending_key WHEN 1 THEN '' DESC'' ELSE '' ASC'' END
FROM sys.index_columns AS ic
INNER JOIN sys.columns AS c ON ic.object_id = c.object_id AND ic.column_id = c.column_id
WHERE ic.object_id = t.object_id AND ic.index_id = ix.index_id AND ic.is_included_column = 0
FOR XML PATH('''')
), 1, 2, '''')
, inc_columns = STUFF((
SELECT '', '' + QUOTENAME(c.name)
FROM sys.index_columns AS ic
INNER JOIN sys.columns AS c ON ic.object_id = c.object_id AND ic.column_id = c.column_id
WHERE ic.object_id = t.object_id AND ic.index_id = ix.index_id AND ic.is_included_column = 1
FOR XML PATH('''')
), 1, 2, '''')
) AS ixcolumns
WHERE p.index_id = 0
AND t.is_ms_shipped = 0
AND t.OBJECT_ID > 255'
Ignore memory-optimized tables in SQL Server versions 2014 and newer
+ CASE WHEN CONVERT(int, (@@microsoftversion / 0x1000000) & 0xff) >= 12 THEN N'
AND t.is_memory_optimized = 0'
ELSE N'' END + N'
GROUP BY t.object_id, ix.name, ix.is_unique, ix_columns, inc_columns
' + ISNULL(N'HAVING SUM(p.rows) >= ' + CONVERT(nvarchar,@MinimumRowsInTable), N'')
IF CONVERT(varchar(300),SERVERPROPERTY('Edition')) = 'SQL Azure'
BEGIN
INSERT INTO #temp_heap([database_name], [object_id], table_name, full_table_name, num_of_rows, candidate_index, candidate_columns_from_existing_index, include_columns_from_existing_index, is_unique, data_compression_type)
exec (@CMD)
END
ELSE
BEGIN
DECLARE @Executor NVARCHAR(1000)
DECLARE DBs CURSOR
LOCAL FAST_FORWARD
FOR
SELECT [name]
FROM sys.databases
WHERE database_id > 4
AND state_desc = 'ONLINE'
AND DATABASEPROPERTYEX(name, 'Updateability') = 'READ_WRITE'
OPEN DBs
FETCH NEXT FROM DBs INTO @CurrDB
WHILE @@FETCH_STATUS = 0
BEGIN
SET @Executor = QUOTENAME(@CurrDB) + N'..sp_executesql'
INSERT INTO #temp_heap([database_name], [object_id], table_name, full_table_name, num_of_rows, candidate_index, candidate_columns_from_existing_index, include_columns_from_existing_index, is_unique, data_compression_type)
EXEC @Executor @CMD
FETCH NEXT FROM DBs INTO @CurrDB
END
CLOSE DBs
DEALLOCATE DBs
END
Add recommendations based on missing index stats
UPDATE t
SET candidate_columns_from_missing_index = mi.indexColumns
FROM #temp_heap AS t
CROSS APPLY
(
SELECT TOP 1 ISNULL(mid.equality_columns, mid.inequality_columns) AS indexColumns
FROM sys.dm_db_missing_index_group_stats AS migs
INNER JOIN sys.dm_db_missing_index_groups AS mig
ON (migs.group_handle = mig.index_group_handle)
INNER JOIN sys.dm_db_missing_index_details AS mid
ON (mig.index_handle = mid.index_handle)
WHERE mid.object_id = OBJECT_ID(QUOTENAME(t.[database_name]) + N'.' + t.full_table_name) AND mid.database_id = DB_ID(t.database_name)
GROUP BY ISNULL(mid.equality_columns, mid.inequality_columns)
ORDER BY MAX(LEN(mid.included_columns) LEN(REPLACE(mid.included_columns, ', [', ''))) DESC
, SUM(migs.avg_user_impact * migs.avg_total_user_cost) DESC
, SUM(migs.user_scans) DESC, SUM(migs.user_seeks) DESC
) AS mi
WHERE t.candidate_index IS NULL — filters for only those without existing recommendation
DECLARE Tabs CURSOR
FAST_FORWARD READ_ONLY
FOR
SELECT database_name, object_id, full_table_name
FROM #temp_heap AS t
WHERE t.candidate_columns_from_missing_index IS NULL AND t.candidate_index IS NULL — filters for only those without existing recommendation
OPEN Tabs
FETCH NEXT FROM Tabs INTO @CurrDB, @CurrObjId, @CurrTable
WHILE @@FETCH_STATUS = 0
BEGIN
Get additional metadata for current table
DECLARE @FirstIndex SYSNAME, @IsUnique BIT, @FirstIndexColumns NVARCHAR(MAX), @FirstIndexIncludeColumns NVARCHAR(MAX), @IdentityColumn SYSNAME
, @FirstDateColumn SYSNAME, @FirstIntColumn SYSNAME, @FirstIntColumnType SYSNAME, @FirstNonNullableColumn SYSNAME;
SET @CMD = N'SELECT TOP 1
@FirstIndex = name,
@IsUnique = ix.is_unique,
@FirstIndexColumns =
STUFF((
SELECT '', '' + QUOTENAME(c.name) + CASE ic.is_descending_key WHEN 1 THEN '' DESC'' ELSE '' ASC'' END
FROM ' + QUOTENAME(@CurrDB) + N'.sys.index_columns AS ic
INNER JOIN ' + QUOTENAME(@CurrDB) + N'.sys.columns AS c ON ic.object_id = c.object_id AND ic.column_id = c.column_id
WHERE ic.object_id = ix.object_id AND ic.index_id = ix.index_id AND ic.is_included_column = 0
FOR XML PATH('''')
), 1, 2, ''''),
@FirstIndexIncludeColumns =
STUFF((
SELECT '', '' + QUOTENAME(c.name)
FROM ' + QUOTENAME(@CurrDB) + N'.sys.index_columns AS ic
INNER JOIN ' + QUOTENAME(@CurrDB) + N'.sys.columns AS c ON ic.object_id = c.object_id AND ic.column_id = c.column_id
WHERE ic.object_id = ix.object_id AND ic.index_id = ix.index_id AND ic.is_included_column = 1
FOR XML PATH('''')
), 1, 2, '''')
FROM ' + QUOTENAME(@CurrDB) + N'.sys.indexes AS ix
OUTER APPLY (SELECT SUM(CASE WHEN is_included_column = 1 THEN 1 ELSE 0 END) AS included_columns, COUNT(*) AS indexed_columns
FROM ' + QUOTENAME(@CurrDB) + N'.sys.index_columns AS ic WHERE ic.object_id = ix.object_id AND ic.index_id = ix.index_id) AS st
WHERE object_id = @ObjId AND index_id > 0
AND is_hypothetical = 0
AND type <= 2 — ignore special index types
ORDER BY ix.is_unique DESC, included_columns DESC, indexed_columns ASC, index_id ASC;
SELECT @IdentityColumn = [name]
FROM ' + QUOTENAME(@CurrDB) + N'.sys.identity_columns
WHERE object_id = @ObjId;
SELECT TOP 1 @FirstDateColumn = c.[name]
FROM ' + QUOTENAME(@CurrDB) + N'.sys.columns AS c
LEFT JOIN ' + QUOTENAME(@CurrDB) + N'.sys.default_constraints AS dc
ON c.default_object_id = dc.object_id
AND c.object_id = dc.parent_object_id
WHERE c.object_id = @ObjId
AND c.system_type_id IN
(SELECT system_type_id FROM ' + QUOTENAME(@CurrDB) + N'.sys.types WHERE precision > 0 AND (name LIKE ''%date%'' OR name LIKE ''%time%''))
ORDER BY
CASE WHEN dc.[definition] IS NOT NULL THEN 0 ELSE 1 END ASC,
CONVERT(smallint, c.is_nullable) ASC,
c.column_id ASC;
SELECT TOP 1 @FirstIntColumn = c.[name], @FirstIntColumnType = t.[name]
FROM ' + QUOTENAME(@CurrDB) + N'.sys.columns AS c
LEFT JOIN ' + QUOTENAME(@CurrDB) + N'.sys.default_constraints AS dc
ON c.default_object_id = dc.object_id
AND c.object_id = dc.parent_object_id
LEFT JOIN ' + QUOTENAME(@CurrDB) + N'.sys.types AS t ON c.system_type_id = t.system_type_id
WHERE c.object_id = @ObjId
AND t.[name] IN (''bigint'', ''int'', ''smallint'', ''tinyint'')
AND c.is_nullable = 0
ORDER BY
CASE WHEN dc.[definition] IS NOT NULL THEN 1 ELSE 0 END ASC,
CASE t.[name] WHEN ''int'' THEN 1 WHEN ''bigint'' THEN 2 WHEN ''smallint'' THEN 3 ELSE 4 END ASC,
c.column_id ASC;
SELECT TOP 1 @FirstNonNullableColumn = c.[name]
FROM ' + QUOTENAME(@CurrDB) + N'.sys.columns AS c
LEFT JOIN ' + QUOTENAME(@CurrDB) + N'.sys.default_constraints AS dc
ON c.default_object_id = dc.object_id
AND c.object_id = dc.parent_object_id
WHERE c.object_id = @ObjId
AND c.is_nullable = 0
ORDER BY
CASE WHEN dc.[definition] IS NOT NULL THEN 1 ELSE 0 END ASC,
c.column_id ASC;'
PRINT @CMD;
SET @FirstIndex = NULL;
SET @IdentityColumn = NULL;
SET @FirstDateColumn = NULL;
SET @FirstIntColumn = NULL;
SET @FirstIntColumnType = NULL;
SET @FirstNonNullableColumn = NULL;
EXEC sp_executesql @CMD
, N'@ObjId INT, @FirstIndex SYSNAME OUTPUT, @IsUnique BIT OUTPUT, @FirstIndexColumns NVARCHAR(MAX) OUTPUT, @FirstIndexIncludeColumns NVARCHAR(MAX) OUTPUT, @IdentityColumn SYSNAME OUTPUT, @FirstDateColumn SYSNAME OUTPUT, @FirstIntColumn SYSNAME OUTPUT, @FirstIntColumnType SYSNAME OUTPUT, @FirstNonNullableColumn SYSNAME OUTPUT'
, @CurrObjId, @FirstIndex OUTPUT, @IsUnique OUTPUT, @FirstIndexColumns OUTPUT, @FirstIndexIncludeColumns OUTPUT, @IdentityColumn OUTPUT, @FirstDateColumn OUTPUT, @FirstIntColumn OUTPUT, @FirstIntColumnType OUTPUT, @FirstNonNullableColumn OUTPUT
IF @FirstIndex IS NOT NULL
BEGIN
——————-
Add recommendations based on existing non-clustered indexes (even if no existing usage stats or missing index stats found)
——————-
UPDATE #temp_heap SET candidate_index = QUOTENAME(@FirstIndex) + N' (no usage)'
, candidate_columns_from_existing_index = @FirstIndexColumns
, include_columns_from_existing_index = @FirstIndexIncludeColumns
, is_unique = @IsUnique
WHERE database_name = @CurrDB AND object_id = @CurrObjId AND candidate_index IS NULL
END
IF @IdentityColumn IS NOT NULL
BEGIN
——————-
Add recommendations based on identity column
——————-
UPDATE #temp_heap SET identity_column = QUOTENAME(@IdentityColumn)
, is_unique = ISNULL(is_unique, 1)
WHERE database_name = @CurrDB AND object_id = @CurrObjId;
END
IF @FirstDateColumn IS NOT NULL
BEGIN
Add recommendation based on the first date/time column
UPDATE #temp_heap SET first_date_column = QUOTENAME(@FirstDateColumn)
, is_unique = ISNULL(is_unique, 0)
WHERE database_name = @CurrDB AND object_id = @CurrObjId;
END
IF @FirstIntColumn IS NOT NULL
BEGIN
Add recommendation based on the first date/time column
UPDATE #temp_heap SET first_integer_column = QUOTENAME(@FirstIntColumn)
, first_integer_column_type = @FirstIntColumnType
, is_unique = ISNULL(is_unique, 0)
WHERE database_name = @CurrDB AND object_id = @CurrObjId;
END
IF @FirstNonNullableColumn IS NOT NULL
BEGIN
Add recommendation based on the first date/time column
UPDATE #temp_heap SET first_non_nullable_column = QUOTENAME(@FirstNonNullableColumn)
, is_unique = ISNULL(is_unique, 0)
WHERE database_name = @CurrDB AND object_id = @CurrObjId;
END
IF @FirstIndex IS NULL — Performs check only if no previous recommendations found
BEGIN
——————-
Get recommendations based on most selective column based on statistics
——————-
Get list of table columns
DECLARE @Columns AS TABLE (colName SYSNAME);
SET @CMD = N'SELECT name FROM ' + QUOTENAME(@CurrDB) + N'.sys.columns WHERE object_id = @ObjId AND is_computed = 0'
INSERT INTO @Columns
EXEC sp_executesql @CMD, N'@ObjId INT', @CurrObjId
Generate and run SHOW_STATISTICS command
SET @CMD = N'USE ' + QUOTENAME(@CurrDB) + N';
SET NOCOUNT ON;'
SELECT @CMD = @CMD + N'
BEGIN TRY
DBCC SHOW_STATISTICS(' + QUOTENAME(@CurrTable, '"') COLLATE database_default + N', ' + QUOTENAME(colName) COLLATE database_default + N') WITH DENSITY_VECTOR, NO_INFOMSGS;
END TRY
BEGIN CATCH
PRINT ERROR_MESSAGE()
END CATCH'
FROM @Columns
DECLARE @DensityStats AS TABLE (AllDensity FLOAT, AvgLength FLOAT, Cols NVARCHAR(MAX));
INSERT INTO @DensityStats
EXEC(@CMD);
IF @@ROWCOUNT > 0
BEGIN
Set most selective column
UPDATE #temp_heap
SET most_selective_column_from_stats =
(
SELECT TOP 1 QUOTENAME(Cols)
FROM @DensityStats
ORDER BY AllDensity ASC, AvgLength ASC
)
, is_unique = ISNULL(is_unique, 0)
WHERE
database_name = @CurrDB
AND object_id = @CurrObjId;
END
END
Re-init for next iteration
DELETE @Columns;
DELETE @DensityStats;
FETCH NEXT FROM Tabs INTO @CurrDB, @CurrObjId, @CurrTable
END
CLOSE Tabs
DEALLOCATE Tabs
Output results
SELECT
Details = 'Database:' + QUOTENAME([database_name]) + ', Heap Table:' + full_table_name
+ COALESCE(
N', candidate INDEX: ' + t.candidate_index + ISNULL(N' (' + t.candidate_columns_from_existing_index + N')', N'')
, N', candidate column(s) from MISSING INDEX stats: ' + t.candidate_columns_from_missing_index
, N', IDENTITY column: ' + t.identity_column
, N', most SELECTIVE column: ' + t.most_selective_column_from_stats
, N', first DATE/TIME column: ' + t.first_date_column
, N', first ' + ISNULL(UPPER(t.first_integer_column_type), 'INTEGER') + ' column: ' + t.first_integer_column
, N', first non-nullable column: ' + t.first_non_nullable_column
, N', NO RECOMMENDATION POSSIBLE')
, Script = N'USE ' + QUOTENAME(t.database_name)
+
CASE
WHEN t.candidate_index IS NOT NULL AND t.candidate_columns_from_existing_index IS NULL THEN N'; — Recreate as clustered index: ' + t.candidate_index
WHEN t.candidate_index IS NOT NULL AND t.candidate_columns_from_existing_index IS NOT NULL THEN N'; DROP INDEX ' + t.candidate_index + ' ON ' + t.full_table_name
+ N'; CREATE ' + CASE WHEN t.is_unique = 1 THEN 'UNIQUE ' ELSE N'' END + N'CLUSTERED INDEX ' + t.candidate_index + ' ON ' + t.full_table_name
+ N' (' + t.candidate_columns_from_existing_index + N')' + REPLACE(@RebuildOptions, N'{COMPRESSION}', t.data_compression_type_desc)
ELSE
N'; CREATE ' + CASE WHEN t.is_unique = 1 THEN 'UNIQUE ' ELSE N'' END + N'CLUSTERED INDEX IX_clust ON ' + t.full_table_name
+ N' ('
+ COALESCE(
t.candidate_columns_from_missing_index,
t.identity_column,
t.most_selective_column_from_stats,
t.first_date_column,
t.first_integer_column,
t.first_non_nullable_column
)
+ N')' + REPLACE(@RebuildOptions, N'{COMPRESSION}', t.data_compression_type_desc)
END
, Rollback_Script = N'USE ' + QUOTENAME(t.database_name)
+
CASE
WHEN t.candidate_index IS NOT NULL AND t.candidate_columns_from_existing_index IS NULL THEN N'; — Recreate as nonclustered index: ' + t.candidate_index
WHEN t.candidate_index IS NOT NULL AND t.candidate_columns_from_existing_index IS NOT NULL THEN N'; DROP INDEX ' + t.candidate_index + ' ON ' + t.full_table_name
+ N'; CREATE ' + CASE WHEN t.is_unique = 1 THEN 'UNIQUE ' ELSE N'' END + N'NONCLUSTERED INDEX ' + t.candidate_index + ' ON ' + t.full_table_name
+ N' (' + t.candidate_columns_from_existing_index + N')' + ISNULL(N' INCLUDE (' + t.include_columns_from_existing_index + N')', N'')
+ REPLACE(@RebuildOptions, N'{COMPRESSION}', t.data_compression_type_desc)
ELSE
N'; DROP INDEX IX_clust ON ' + t.full_table_name
END
, *
FROM #temp_heap AS t
DROP TABLE #temp_heap
I’ve had the chance to use this script in several environments, and it works very well for most common scenarios.

WARNING! CONSIDER THE FOLLOWING BEFORE MAKING ANY ACTUAL CHANGES:

  • DO NOT APPLY the recommendations from the above script blindly!
    The script only generates estimated recommendations based on the little information it can gather from system tables.
    You would have to use precaution and careful discretion to consider what would be the right clustered index per each table.
  • You would need to consult with the developers / product managers to understand the implications of making index changes in the database schema and whether they agree to it (for example, if the same changes would also need to be made in some kind of a database source control, or when the database belongs to a vendor and making changes to it would void the warranty).
  • Be extremely careful when dealing with very large tables, especially when the SQL Server edition is not Enterprise and therefore doesn’t support ONLINE index operations. In which case: This probably means DOWNTIME. Prepare accordingly.
  • Non default file groups are NOT taken into consideration by this script and would not be retained by the output scripts. This will be fixed in a future version.

The script also generates the actual CREATE script for these clustered indexes, but as I said, be careful not to run those scripts blindly without thinking first. So, while the script will save you significant time and give you helpful ideas, you should still evaluate each recommendation and make sure it’s fine.

Got any comments? Ideas? Let me know below.

Additional Resources:

Cover Image by PublicDomainPictures from Pixabay

3 comments

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.