Why do I get the infrastructure error for login failures?

In the past few weeks, I saw this error come across quite a bit and thought I will provide an explanation for the reasons why we generate this error. It is very possible that you came across either one of the flavors of the two error messages shown below:

2016-07-08 23:53:59.63 Logon Error: 18456, Severity: 14, State: 11.
2016-07-08 23:53:59.63 Logon Login failed for user 'IAAS6620137\MyInfraWindowsLogin'. Reason: Token-based server access validation failed with an infrastructure error. Check for previous errors. [CLIENT: 10.107.10.43]
2016-07-08 23:51:33.05 Logon Error: 18456, Severity: 14, State: 12.
2016-07-08 23:51:33.05 Logon Login failed for user 'MyInfraSQLLogin'. Reason: Login-based server access validation failed with an infrastructure error. Check for previous errors. [CLIENT: 10.107.10.43]

As you can see there are two flavors here:

- The token-based message is generated for the Windows authentication logins

- The login-based message is generated for the SQL authentication logins

The important piece of information from that message is the part about server access validation failed. So let’s examine that carefully. During the login process, the database engine has to perform several checks regarding the login and its various attributes before letting the application connect to the SQL Server instance. Out of that big list, related to this error condition, there are two checks to find out if this login is authorized to access this server instance. There are two permissions that come into play while performing these authorization checks:

- Does the login have the SERVER class permission named CONNECT SQL for this server instance?

- Does the login have the ENDPOINT class permission named CONNECT for the specific endpoint on which the client application established the connection?

Here is a sample query that lists both the permission classes we mentioned above and which logins or roles have it granted.

image

I highlighted the two logins I created for this experiment [MyInfraSQLLogin and MyInfraWindowsLogin]. You will notice that both of them are explicitly granted the CONNECT SQL permission for the SERVER. When you create the login for the first time you will notice that it automatically gets the CONNECT SQL permission for the SERVER. The next set of rows shows  CONNECT permission for the ENDPOINT. If we look at the [TSQL Default TCP] endpoint, which is where the application will connect when you come through TCP, only public server role is granted the permission. We all know that every login in SQL Server belongs to the public server role. So essentially the two logins I have created have the two required permissions to be authorized to access the SQL Server instance. So the application will go through fine. Now we will take a look at some of the common scenarios where these permissions are not setup properly and hence the error message is generated.

Scenario #1

In my example above, If I DENY or REVOKE these two referenced permissions for the two highlighted logins, then I will encounter the login failure errors shown in the beginning of the post.

Scenario #2

In some organizations, administrators will REVOKE the CONNECT permission for ENDPOINT from the public role. In this case each individual login, group or role needs to be GRANTED the CONNECT permission to the relevant ENDPOINT on which the application will connect. Otherwise you will encounter the errors referenced in this post.

Scenario #3

You create additional TSQL TCP endpoints. When you create new TSQL TCP endpoints, you actually get the following warnings:

Creation of a TSQL endpoint will result in the revocation of any 'Public' connect permissions on the 'TSQL Default TCP' endpoint. If 'Public' access is desired on this endpoint, reapply this permission using 'GRANT CONNECT ON ENDPOINT::[TSQL Default TCP] to [public]'.

Scenario #4

If the CONNECT SQL permission for SERVER and CONNECT permission for ENDPOINT is granted at the Windows Group level, then you need to evaluate the group memberships of the specific Windows Login and figure out if there is some issue there.

Scenario #5

You will notice a lot of references to this message “server access validation failed with an infrastructure error” and Windows UAC [User Account Control]. The reason comes down to the following:

1) Mostly likely the CONNECT SQL permission for SERVER or CONNECT for ENDPOINT is granted to a Windows Group. It could be one of the built in administrator groups. The individual Windows user account do not have logins created in the SQL Server instance.

2) You launch the application as a standard login [did not use run as administrator UAC option].

3) During authorization phase, SQL Server will check the membership tokens associated with the current login of the Windows account.

4) Since the administrator group membership tokens are not associated with the current login, you will get a login failure with the message “server access validation failed with an infrastructure error

Here is a side by side comparison of the tokens associated with the Windows Account or login in question. Left side shows the tokens associated with the login when the application is launched with administrator privileges Vs right side shows the tokens associated with the login when run as standard user in UAC.

image

Under such circumstances, assume your SQL Server instance permissions is setup as follows:

image

Then the application will trigger a login failed message “server access validation failed with an infrastructure error” if it is launched as standard user and not with administrator privileges. Does this help explain why everyone points to UAC whenever we see the infrastructure error?

There can be several other scenarios that lead to the same situation. But when you have one of this situations, evaluate and find out if the login used by the application has the CONNECT SQL permission for SERVER and CONNECT permission for the ENDPOINT. If you have seen other interesting scenarios, please post it to the comments and we can get them  consolidated here.

In SQL Server versions until 2014, you are get the error as shown in the beginning of this post. Basically you have to dig out and find the details regarding the reason for the failure. In SQL Server 2016 the situation is better. We made enhancements to this error message to make it very clear why we are generating this error from the security subsystem of the Database Engine. So you will now see the following flavors:

2016-07-08 23:02:07.42 Logon Error: 18456, Severity: 14, State: 147.
2016-07-08 23:02:07.42 Logon Login failed for user 'MyInfraSQLLogin'. Reason: Login-based server access validation failed with an infrastructure error. Login lacks Connect SQL permission. [CLIENT: 10.107.10.43]
2016-07-08 23:05:22.00 Logon Error: 18456, Severity: 14, State: 149.
2016-07-08 23:05:22.00 Logon Login failed for user 'MyInfraSQLLogin'. Reason: Login-based server access validation failed with an infrastructure error. Login lacks connect endpoint permission. [CLIENT: 10.107.10.43]

2016-07-08 23:25:38.43 Logon Error: 18456, Severity: 14, State: 146.
2016-07-08 23:25:38.43 Logon Login failed for user 'IAAS6620137\MyInfraWindowsLogin'. Reason: Token-based server access validation failed with an infrastructure error. Login lacks Connect SQL permission. [CLIENT: 10.107.10.43]
2016-07-08 23:26:21.73 Logon Error: 18456, Severity: 14, State: 148.
2016-07-08 23:26:21.73 Logon Login failed for user 'IAAS6620137\MyInfraWindowsLogin'. Reason: Token-based server access validation failed with an infrastructure error. Login lacks connect endpoint permission. [CLIENT: 10.107.10.43]

So now you can easily go fix up the permission it is complaining about and then everyone will be happy.

One more reason to move to SQL Server 2016 Smile

Thanks

Suresh Kandoth – Escalation Engineer – SQL Server Escalation Services

Comments

  • Anonymous
    July 18, 2016
    The comment has been removed
  • Anonymous
    October 26, 2016
    With a Windows 2012-R2 clustering / SQL-2014-64x(sp2-Cu1) 3-node AOAG, this error 18456,14,11 error abruptly became our problem. Windows security only / domain service accts, and domain user accts for Windows 2012 web side as well as SQL servers. All had been going well for many months, until "1 day", web client sessions ( web-service to sql ) started failing with 18456-14-11 while on the primary replica node. Both service acct assigned in app_pool, as well as Windows logon of developer, would no longer gain access. After manual f/o (via AG) to secondary, becoming primary, the "all-well" flag started waving again. If I f/o to 3rd node, happiness returns - no problems. I have done everything to figure out problem with "culprit node" expect for taking it to confession ... Tried new domain service accts for web-service in latest round of testing. Same barking: error, on AOAG fail over to the haunted node { Hn-2 }. Seems, only I am gaining tokens,,, redeemable only in heaven after I die... Any ideas on what I am missing ? ps. no errors generated at WSFC level
  • Anonymous
    January 04, 2017
    I have another question about the same issue.Scenario:1 server with 5 instances and different service accounts for each instance.Lets say instances are named A to E and service accounts a e respectively.Every day about 02:00 the following accourOn the first instance "A", service accounts "b" to "e" will try to logon thus getting the error"Error: 18456, Severity: 14, State: 11."Login failed for user 'b'. Reason: Token-based server access validation failed with an infrastructure error. Check for previous errors. [CLIENT: ]Login failed for user 'c'. Reason: Token-based server access validation failed with an infrastructure error. Check for previous errors. [CLIENT: ]Login failed for user 'd'. Reason: Token-based server access validation failed with an infrastructure error. Check for previous errors. [CLIENT: ]Login failed for user 'e'. Reason: Token-based server access validation failed with an infrastructure error. Check for previous errors. [CLIENT: ]For the next instance "B" the same happens but from "c" to "e". Not "a".Login failed for user 'c'. Reason: Token-based server access validation failed with an infrastructure error. Check for previous errors. [CLIENT: ]Login failed for user 'd'. Reason: Token-based server access validation failed with an infrastructure error. Check for previous errors. [CLIENT: ]Login failed for user 'e'. Reason: Token-based server access validation failed with an infrastructure error. Check for previous errors. [CLIENT: ]and so on untill the last instance "E" which has no errors.Anyone has an answer to this or a clue to how to solve it?
  • Anonymous
    January 04, 2017
    Forgot to say service accounts ar domain\a and so on....
  • Anonymous
    March 08, 2017
    Since the scripts are in a screen cap (image) I couldn't copy and paste.Here is that first (and last) script:SELECT SPM.class_desc, SPM.permission_name, SPM.state_desc,SPR.name, SPR.type_desc, SPR.is_disabled,EP.name, EP.protocol_desc, EP.state_desc, EP.type_descFROM sys.server_permissions AS SPMJOIN sys.server_principals AS SPR ON SPM.grantee_principal_id = SPR.principal_idLEFT OUTER JOIN sys.endpoints AS EP ON SPM.major_id = EP.endpoint_idWHERE (SPM.permission_name = 'CONNECT SQL' AND SPM.class_desc = 'SERVER') OR (SPM.permission_name = 'CONNECT' AND SPM.class_desc = 'ENDPOINT')
    • Anonymous
      March 10, 2017
      I actually found this script more useful as it shows all ENDPOINTs and SERVER connections, even those that are not GRANTed:SELECT SPM.class_desc, SPM.permission_name, SPM.state_desc,SPR.name, SPR.type_desc, SPR.is_disabled,EP.name, EP.protocol_desc, EP.state_desc, EP.type_descFROM sys.server_permissions AS SPMJOIN sys.server_principals AS SPR ON SPM.grantee_principal_id = SPR.principal_idLEFT OUTER JOIN sys.endpoints AS EP ON SPM.major_id = EP.endpoint_idWHERE SPM.class_desc = 'SERVER' OR SPM.class_desc = 'ENDPOINT'ORDER BY SPM.class_desc, SPM.permission_name, SPR.name
  • Anonymous
    August 30, 2018
    The comment has been removed