Assume you have some process outside of SQL Server that is supposed to record database activity. There are many ways to do it, most of which require a constant and/or stable connection to the database. But what if that’s not good enough? Let’s consider our options.
First of all, I must give some credit to Erik (the) Darling, who implemented a brilliant solution which uses a similar idea to what I had in mind: sp_HumanEvents.
Erik’s basic idea is to replace the likes of sp_who, sp_who2, or sp_WhoIsActive by utilizing extended events, and provide a more “complete” picture of what’s currently happening in the database… Or at least… during the few seconds while you’re running the script.
But that could also still be a problem, if what you need is a truly “complete” view of what is happening in the database…
More specifically, the likes of sp_who[2], sp_WhoIsActive, and sp_HumanEvents, all require a stable connection to the database, and they only tell you what’s happening now – while you’re running the script.
But what if you need more than that? What if you need to know not just what’s happening while you’re running a script to sample the database, but also – everything that has happened since the last time you ran the script?
You could do something like set up a job that runs these scripts periodically and saves the results to tables. It’s even one of the built-in functionalities in sp_HumanEvents, for example. But there are still three problems here:
- The data you collect is limited to what is captured while the monitoring job is executing. But in between the scheduled runs – you’re still blind.
- You must save it to local tables on the target server, incurring additional impact on your target.
- These jobs and tables are “footprints” that you leave on your target server, and it may not always be possible or feasible to create them.
Whatever we do, the bulk of our impact should be outside of our monitored target, and yet… We must somehow maintain continuity in our collected data.
What about Profiler?
What if we had a way to monitor what’s happening on a SQL server without actually being constantly connected to it, while still incurring minimal impact on the target server itself? What if we had something that “samples” a SQL server target for monitoring data from some kind of “buffer”, which we could then clear out and let refill with new monitoring data until the next time we sample the server?
Something like the client-side trace (the sp_trace_getdata stored procedure, which the Profiler client also uses to stream data) comes to mind as a possible solution. SentryOne SQL Sentry, for example, uses that method to implement monitoring of T-SQL commands and other events. Theoretically, it could work very well as a sort of “buffer” that holds collected events until the next time our process connects to the database. However, its retention depth is not configurable, and it’s rather outdated technology.
Extended Events as the answer
Extended Events provides a solution similar to the client-side trace. It can capture basically all the events that a trace can capture (and more), and it also supports a wider choice of target types. And that is where its true power lies.
It just so happens that Extended Events has a target type called “ring buffer”, and it gives us exactly what we need.
First of all, we can configure the ring buffer with a maximum size and a maximum number of events to collect. This guarantees that we have full control over our impact on the target server.
For example, let’s say that we’d like to take up no more than 16 MB of RAM with our monitoring session. That could be good enough, or not. It depends on the workload stress on your target, and how much RAM you can spare. Regardless, you should filter for only the events that you really need, in order to minimize your observer overhead.
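For illustration, here is a minimal sketch of such a session (the session name, event, and thresholds are just examples, not one of the actual scripts provided later). It caps the ring buffer at 16 MB and 10,000 events, and filters for batches running 5 seconds or longer. On Azure SQL Database, you would use ON DATABASE instead of ON SERVER:

```sql
-- Illustrative session: cap memory and event count, filter by duration
CREATE EVENT SESSION [LongCommandsDemo] ON SERVER
ADD EVENT sqlserver.sql_batch_completed
    (WHERE duration >= 5000000)        -- duration is in microseconds, i.e. 5 seconds
ADD TARGET package0.ring_buffer
    (SET max_memory = 16384,           -- in KB: the 16 MB cap mentioned above
         max_events_limit = 10000)     -- hard cap on the number of buffered events
WITH (MAX_DISPATCH_LATENCY = 5 SECONDS, STARTUP_STATE = ON);

ALTER EVENT SESSION [LongCommandsDemo] ON SERVER STATE = START;
```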
Secondly, this ring buffer is also completely and immediately deleted once we drop the extended events session. This is unlike the alternative file target, which does not delete its files when you drop the session. They just stay there… Taking up valuable disk space. (The same holds true for the server-side trace, but not for the client-side trace.)
It appears that, in the extended events ring buffer target, we have the perfect candidate to implement what we need.
And now, for a true story
The idea for this mechanism was born thanks to one client of mine with multiple SQL servers hosted in Azure. Those were, in fact, Azure SQL Databases. But there was no option of using third-party monitoring tools, nor was the Azure SQL Analytics solution good enough for what they needed. They needed an alternative, and this is where I came in.
The implementation was as simple as possible (a T-SQL sketch follows the list):
- Check if our extended events session exists. If it does, return the data it collected so far, and then… Drop the XE session.
- Recreate the extended events session.
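Here is a minimal sketch of that sample-and-reset logic, reusing the illustrative server-scoped session from before. On Azure SQL Database, the equivalents are sys.database_event_sessions, sys.dm_xe_database_sessions, sys.dm_xe_database_session_targets, and ON DATABASE:

```sql
-- Step 1: if the session exists, return what it collected, then drop it.
IF EXISTS (SELECT * FROM sys.server_event_sessions WHERE name = N'LongCommandsDemo')
BEGIN
    -- Shred the ring buffer XML into one row per captured event:
    SELECT
        ev.value('@timestamp', 'datetime2')                               AS event_time,
        ev.value('(data[@name="duration"]/value)[1]', 'bigint')           AS duration_us,
        ev.value('(data[@name="batch_text"]/value)[1]', 'nvarchar(max)')  AS batch_text
    FROM sys.dm_xe_sessions AS s
    INNER JOIN sys.dm_xe_session_targets AS t
        ON t.event_session_address = s.address
    CROSS APPLY (SELECT CAST(t.target_data AS xml) AS x) AS rb
    CROSS APPLY rb.x.nodes('/RingBufferTarget/event') AS q(ev)
    WHERE s.name = N'LongCommandsDemo'
      AND t.target_name = N'ring_buffer';

    -- Dropping the session releases the ring buffer memory immediately.
    DROP EVENT SESSION [LongCommandsDemo] ON SERVER;
END

-- Step 2: recreate (and restart) the session, so it keeps collecting
-- until the next time we sample it. Same CREATE EVENT SESSION as above.
```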
And that’s about it. At least, as far as it concerns the monitored SQL server target. As for the monitoring data that we’ve collected, we would of course need to save it somewhere. This particular client wanted it to be saved in Azure Data Explorer (aka Kusto), but it doesn’t really matter. You can save it wherever you want.
Azure Data Factory provided us with the “pure cloud” solution that my client was looking for. It has the capability to run the relevant T-SQL command (although it had to be a stored procedure), save the data in Kusto, and automatically run the process every few minutes based on a schedule.
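Since Azure Data Factory had to call a stored procedure, the whole sample-and-reset logic can be wrapped in one. Here is a hypothetical sketch (the procedure and session names are mine, purely illustrative), using the database-scoped flavor that applies to Azure SQL Database:

```sql
CREATE OR ALTER PROCEDURE dbo.SampleAndResetXESession
AS
BEGIN
    SET NOCOUNT ON;

    -- If the session exists: return its collected events and drop it.
    IF EXISTS (SELECT * FROM sys.database_event_sessions WHERE name = N'LongCommandsDemo')
    BEGIN
        SELECT CAST(t.target_data AS xml) AS collected_events  -- shred as needed
        FROM sys.dm_xe_database_sessions AS s
        INNER JOIN sys.dm_xe_database_session_targets AS t
            ON t.event_session_address = s.address
        WHERE s.name = N'LongCommandsDemo'
          AND t.target_name = N'ring_buffer';

        EXEC sys.sp_executesql N'DROP EVENT SESSION [LongCommandsDemo] ON DATABASE;';
    END

    -- Recreate and restart the session for the next collection interval.
    EXEC sys.sp_executesql N'
        CREATE EVENT SESSION [LongCommandsDemo] ON DATABASE
        ADD EVENT sqlserver.sql_batch_completed (WHERE duration >= 5000000)
        ADD TARGET package0.ring_buffer (SET max_memory = 16384)
        WITH (MAX_DISPATCH_LATENCY = 5 SECONDS);

        ALTER EVENT SESSION [LongCommandsDemo] ON DATABASE STATE = START;';
END
```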
You, however, could use whatever method you’d like… An SSIS package… Or a scheduled job on another server using a linked server connection… Or a PowerShell script run by a scheduled task… Or some kind of application service… It really doesn’t matter.
Give me the scripts already!
I prepared two example scripts that utilize this methodology:
- One example creates an extended events session that monitors for long-running SQL commands.
- The other example collects SQL error events.
Both scripts have various configurations that you can control using several variables at the top, such as minimum duration, minimum severity, what to ignore, how big the ring buffer should be, and more.
Once you run the script, your configurations are plugged in and a suitable script is generated for you. In some SQL Server versions you won’t have a choice but to copy the output script and use it in your actual monitoring solution. Other versions will let you run it immediately using the dynamically generated commands.
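The configure-then-generate pattern looks roughly like this (a sketch with made-up variable and session names, not the actual scripts):

```sql
-- Configuration variables at the top, plugged into a generated script below
DECLARE @MinDurationSec int = 5,          -- minimum duration to capture
        @MaxBufferMemKB int = 16384;      -- ring buffer cap, in KB

DECLARE @cmd nvarchar(max) = N'
CREATE EVENT SESSION [LongCommandsDemo] ON SERVER
ADD EVENT sqlserver.sql_batch_completed
    (WHERE duration >= ' + CONVERT(nvarchar(20), @MinDurationSec * 1000000) + N')
ADD TARGET package0.ring_buffer
    (SET max_memory = ' + CONVERT(nvarchar(20), @MaxBufferMemKB) + N');';

PRINT @cmd;                      -- copy the generated script...
-- EXEC sys.sp_executesql @cmd;  -- ...or run it immediately, where possible
```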
Either way, you’ll have what you need.
The scripts are provided as-is without guarantee. You can use them as a starting point to implement your own flavor of monitoring, or you can use them as they are if it suits you well enough.
Enjoy and good luck!
Cover image by geraldoswald62 from Pixabay.