Login Failed for xxx – Who’s Keeps Trying to Connect to my Server?

I recently ran into an issue in which a machine was connecting to SQL and continuously generating a login failure every second. The account name was NT AUTHORITY\ANONYMOUS LOGON which typically means a Kerberos related double hop connection with some misconfiguration. We had the host name from the error logs (below), so we knew it was a web server that was generating the error.

Date        2/13/2017 10:18:50 PM
Log        SQL Server (Current - 2/11/2017 2:19:00 PM)
Source        Logon
Message
Login failed for user 'NT AUTHORITY\ANONYMOUS LOGON'. Reason: Could not find a login matching the name provided. [CLIENT: xxx.xxx.xxx.xxx]

Date        2/13/2017 10:18:50 PM
Log        SQL Server (Current - 2/11/2017 2:19:00 PM)
Source        Logon
Message
Error: 18456, Severity: 14, State: 5.

I use the sys.dm_exec_sessions DMV's host_process_id column to identify offending client processes when troubleshooting things like blocking. In this case, there was no active session, so we had no way to tell the what IIS App pool was generating the error. The application area did not know which process or IIS App pool was generating the logon event and to further complicate matters, they were experiencing no issues. Since these events were filling up our SQL error Logs, we were motivated to identify the culprit and shut it down.

We started down the path of using Extended Events, specifically the login event, but discovered it only fired when a successful login occurred as seen below.

 select [name], description, object_type from sys.dm_xe_objects where [name] = 'login'

Occurs when a successful connection is made to the Server. This event is fired for new connection or when connections are reused from a connection pool

So, that was out of the question. Since we knew the event was being written to the SQL Server error logs, we decided to try the errorlog_written event to see if it would in fact capture the client_pid field.

 CREATE EVENT SESSION [LoginFailureTrace] ON SERVER 
ADD EVENT sqlserver.errorlog_written( ACTION(sqlserver.client_app_name,sqlserver.client_hostname,sqlserver.client_pid)) 
ADD TARGET package0.event_file(SET filename=N'LoginFailureTrace',max_file_size=(128)) 
WITH (MAX_MEMORY=4096 KB,EVENT_RETENTION_MODE=ALLOW_MULTIPLE_EVENT_LOSS,MAX_DISPATCH_LATENCY=30 SECONDS,MAX_EVENT_SIZE=0 KB,MEMORY_PARTITION_MODE=NONE,TRACK_CAUSALITY=OFF,STARTUP_STATE=OFF) 
GO

We were in luck, and the client_pid did show up in the trace.

In this example, I was using SSMS on my workstation with Windows Process ID 3500.

We took the output of this event trace and handed it over to the application area. They identified the offending IIS App pool and addressed the issue.

Comments

  • Anonymous
    July 19, 2017
    Interesting solution. I've faced this myself where SQL receives thousands of failing logins, but the origin server and owners report 'no problems here'. Despite that there clearly IS a problem. So they were able to identify the issue via the PID - did they let you know what exactly was generating the failing logins?
    • Anonymous
      August 23, 2017
      The comment has been removed