When administrating a SQL Server instance with multiple CPU cores and heavy workload, it’s common to see SQL Server creating and using execution plans with parallelism. The instance configuration “cost threshold for parallelism” is what determines for SQL Server the minimum sub-tree cost before it starts considering to create a parallelism plan. The default “out-of-the-box” value of this configuration is 5.
However, in some cases, we would want to increase the default configuration of “cost threshold for parallelism” to something higher than the Microsoft default of 5, thus decreasing the frequency in which SQL Server creates parallelism plans.
This is something we would usually want to do when we see too much CPU workload as a result of parallelism plans, and we would want to reduce it.
But how can we know which new value we should use? Do we just throw a randomly high number such as 50 and see what happens (as is often recommended in most blogs)?
Well, no, we don’t actually have to do that.
We can do it like smart people.
Being smart is a good thing.
What can we do?
Luckily, we can use a DMV such as sys.dm_exec_cached_plans to help us.
This view can give us the information we need, which is:
- Which execution plans currently use parallelism.
- What are their sub-tree costs.
- How often these plans are used (i.e. “usecount”).
And most importantly, using that information, we could figure out what would happen if the plan was not allowed to use parallelism.
“What would happen if” is a very smart question to start with. It’s like a hypothesis. It’s science!
The “Laboratory”
Anyways, for this purpose, I’ve prepared a special SQL script.
See below:
There are 2 ‘parameters’ for that script:
- @MinUseCount – Determines the minimum usecount of execution plans. Use this parameter to filter out “noise” from rarely-used, or one-time plans.
- @MaxSubTreeCost – Serves as maximum value for sub-tree costs. We would want to use this parameter to filter out actually-heavy execution plans that we really want to be using parallelism. You should put here the new cost threshold that you consider on using (or something a little higher).
The data returned by the script would be a list of execution plans, their respective SQL statements, the Sub-Tree cost of the statements, and their usecounts.
Using this script, you will be able to identify execution plans that use parallelism, which may stop using parallelism if you change “cost threshold for parallelism” to a value higher than the sub-tree cost of their non-parallel counterpart (i.e. when you specify OPTION(MAXDOP 1)).
Now comes the difficult part
You’ll need to investigate each of these plans, and think about what would happen if they were not allowed to be run with parallelism.
For example, if the plan performs a table scan, then the negative impact of it no longer using parallelism may be catastrophic!
In most cases, you will need to embark on a magnificent adventure of performance tuning for these execution plans, to make sure that nothing too bad would happen if they were no longer able to run with parallelism.
- Maybe that horrible table scan can be avoided altogether if you just add the right covering index?
- Or write the query a bit differently?
- Or maybe there’s no way to improve the query so you’re just better off setting the CTFP configuration to a value lower than the sub-tree cost of that plan?
- Or maybe it’ll be just fine and dandy without parallelism after all??
To test all of those questions, you can take the original text of those queries, and run them using the query hint OPTION(MAXDOP 1) to force them to run without parallelism.
Observe the results, and decide whether the impact of the query is acceptable without parallelism.
Hypothesize… Collect data… Experiment… Observe… Draw conclusions! That’s science!
Good luck and enjoy your adventure, you beautiful scientist, you!
This article was originally published by Eitan Blumin on February, 2018 in www.madeiradata.com
Cross-post on LinkedIn on August, 2018
Cover image by luvqs from Pixabay