Site icon Eitan Blumin's blog

Finding the Details Missing from the SQL Server Failed Logins Audit

We all know the error log used by SQL Server to audit failed logins by default. But is it good enough? For many scenarios, it isn’t. It’s actually missing a lot of important information. But, worry not, for there are better solutions!

Problem

There are many different reasons why a login would fail. Every time this happens, it’s logged as an error event in the SQL Server instance. In most cases, this would also be logged in the SQL Server error log. But, the details are limited to an IP address and, in the best-case scenario, a username.

Typical logon failure in the SQL Server Error Log

In a real customer use case, we noticed many repeating login failures in the SQL error log, but the information provided wasn’t enough to troubleshoot the issue. For example, it had crucial information missing – such as client hostname, program name, and for some failure types – even the login name.

First Solution: Default Server Trace

Did you know that the default server trace also records information about failed login events in your SQL Server instance? It includes very useful information such as the client hostname, program name, login name, target database, and even the host process ID.

To get the failed login events from the default server trace, all you gotta do is query all the events with EventClass=20. This identifies all the “Login Failed” event types. You can use the query below:

SELECT  trc.*
FROM fn_trace_getinfo(default) AS inf
CROSS APPLY fn_trace_gettable (convert(nvarchar(255), inf.value),default ) AS trc
WHERE inf.property = 2 AND inf.value IS NOT NULL
AND trc.EventClass= 20 
ORDER BY trc.StartTime DESC

There are a few issues with it, though:

Still, the default server trace should be more than good enough for most use cases. But, even for those use cases when it’s not enough, we still have another possible solution.

Second Solution: Extended Events

Luckily, since these failed logins are also user error events, this means we can set up an extended event session to capture all the details about it. We would also have much more control over it, since we’d be the ones setting it up (i.e. file roll-over and retention settings, specific event types, etc.).

The one major downside to this is that, since you’re the one who has to set it up, it will only start helping you from that moment on. Obviously, it won’t help you if you’re investigating a server that doesn’t have it already running.

In any case, as part of my research before setting up this extended event session, I did the following:

The final result is the script that you can see below. It contains two scripts actually: one that creates the extended event session and activates it, and one that queries from it to display the collected events and their details.

Scripts available in my GitHub Gists

At the time of this writing, the additional details (actions) that I found to be supported with these events are:

I wish it would’ve been possible to also log the client process ID as in the default server trace, but unfortunately it doesn’t seem to be supported (that would’ve helped pinpoint the exact process causing the issue). Still, it’s a whole lot more than what’s offered in the regular error log.

The events included at the time of this writing include the following errors:

I think I got all of them based on what I found in sys.messages, but if I missed something important, please let me know in the comments!

Take note of the event session settings for file roll-over, retention, memory usage, etc. You can fine-tune these settings to whatever you feel best fits your specific use-cases.

Conclusion

The default server trace and the extended events session turned out to be invaluable in the customer’s production environment. With the help of the extended events session, we even started detecting logon errors that weren’t even visible at all otherwise (due to some exotic application-level issues). In fact, this was so useful, that we decided to make this XE session a must-have standard on all production servers.

I hope you’ll find a good use for these solutions, and make them an inseparable part of your production environments and a valuable troubleshooting tool in your DBA tool belt.

If you have any thoughts on this, ideas, objections, questions, please comment below.

Exit mobile version