SQL Management Pack for SCOM: Low Privilege configuration in Clusters

 

You know it’s going to be a bad day when you take a support case and the person on the other end of the line has not only worked for Microsoft, but done your job (and done it well by all accounts) at Microsoft longer than you have.  That was my joy last week when I took a case about the SQL Management Pack low privilege setup not working properly.   I knew I was going to be in for a long day, and the case did not disappoint.

The case in question was due to issues with SCOM being unable to discover SQL 2012 databases using the SQL low privilege setup on clustered servers.  SQL discovery was working just fine on non-clustered instances, but on clustered servers we were only seeing the database engine (and it was showing as not monitored) and not the databases, SQL jobs, etc.

Just a bit of context on low privilege environments.  SCOM typically discovers SQL via one of two methods.  First is that the default action account happens to have sys admin rights to the SQL Server.  The second is that the Local System Account has these same rights.  The problem with this is security.  In the case of the Local System Account, a compromise of the server can lead to a very quick compromise of the SQL installation housed on said server.  While the Default Action Account is, in that respect a bit safer, this account usually has quite a bit of rights in the environment as a whole, making it an account that an intruder would be interested in compromising, which grants the intruder rights to every single SQL installation and administrative rights and most every server.  Given the average cost of a cyber event (now north of $12 million) and the fact that the SQL databases are what usually house the data that intruders are most interested in, it makes sense to reduce the attack surface of the SQL environment in any way possible.  Enter SQL low privilege.  Instructions for this setup are included in the SQL Management Pack documentation, and it includes creating three domain user accounts and giving them the basic rights that they need to monitor the SQL environment without granting SCOM administrative rights to the SQL servers themselves.  Given the cost of intrusions, it’s a simple thing that should be done in any environment that houses SQL data that would be considered confidential.

Now back to the problem.  In our environment, we could see the instances for the clustered systems, and were discovering databases for non-clustered systems housed on the same Operating System.  Also, if we made these accounts local admins, everything would discovery perfectly.  Just to make sure no mistakes were made, we walked through the SQL documentation again.  That all checked out clean.  The management pack documentation does note that for clustered environments, some of the accounts need Local Launch and Local Activation permissions (set in dcomconfig.exe), but that wasn’t the problem here.  Once we dove into the SCOM logs, we saw this:

image

I didn’t include a snapshot of the error in there because it had client data in it. (note this is a 2012 discovery error, more on that in a bit, but we were essentially seeing similar issues in 2008).  A look at the SQL logs showed no errors.  The NTFS security log on the system was also not showing any sort of permission denied events.  We checked antivirus, and for a while we thought it was that, but after disabling AV in our test environment, still no discovery data. 

We eventually had to resort to ProcMon to determine the breakdown.  Note that this screenshot doesn’t list every key, there are more.  I’ll list them below, but this is what we saw:

image

This screenshot unfortunately doesn’t seem to render well here, but a closer look at those script.exe events showed that they were the processes identified in the above shots and were being executed by the accounts specified in the SQL low privilege setup.  Apparently for clusters, these accounts need a bit more rights.  I ran this by a few people that I know who have had to do this and it appears that often times, these accounts are simply granted local admin rights, which starts to defeat the purpose of doing a low privilege setup.  At the end of the day, these accounts were trying to write to the following registry keys and failing:

 

HKLM\Software\Microsoft\Microsoft Operations Manager\3.0\Modules\{GUID} (note the GUID is the management group ID)

HKLM\Software\Microsoft\WEBEM\CMOM

HKLM\System\CurrentControlSet\Services\WinSock2\Parameters

The first key was preventing discovery of the Databases.  The second two keys were preventing discovery of SQL Agent jobs on a SQL 2008 installation.  I don’t have a clustered lab environment to test this on, so I’m somewhat curious if others have run into this before.  I’ve submitted a request to get the SQL MP documentation updated with this data, but until that happens, I hope this post will be useful.

Technorati Tags: GetSQL2012DBEngineWinServState SCOM Low Privilege SQL Agent Jobs discovery

Comments

  • Anonymous
    January 01, 2003
    Thank you Rich.
  • Anonymous
    May 28, 2015
    This is cool stuff man, genius!!!
  • Anonymous
    August 21, 2015
    Great stuff!! Thx for the help finally my low priv works!
  • Anonymous
    September 30, 2015
    I implemented the low-privilege setup per the SQL management pack. On standalone SQL nodes, it's working great. I tried implementing it on the two node SQL cluster hosting the OpsMgr databases. One instance is active on each SQL node. On one server, SCOM says it doesn't have permission to model and on the other SCOM reports insufficient permission to the ReportServerTempDB. I've added the additional cluster requirements (DCOM permissions, Cluster permissions and Windows Remote Management firewall exceptions). It's still not working. I failed over the cluster instances and rebooted both nodes and got another event. If I look at the SQL 2012 Summary Dashboards, I'm not getting complete info for the DBs on these two instances.

    I also noticed that the node hosting the instance is trying to log into the each database with NT AUTHORITYSYSTEM and not the accounta I setup in the RunAS profile. What's up with that?

    I haven't tried adding permission to the registry keys you suggested. I'm curious if anyone else has run into this with NT AUTHORITYSYSTEM
  • Anonymous
    December 16, 2015
    Hi Michael,

    Sorry for the delay in responding. I didn't notice this comment. SCOM is going to default to the local system account if it does not have the runas profiles setup for SQL low privilege. That tells me that it isn't using low priv at all and then trying the Local system. There could be a few issues at play. The easiest would be that the run as account or profile isn't correct, but I'm guessing it may be a bit more difficult than that. I would take a close look at your SQL logs. If SCOM isn't getting in there, you should see some sort of login failure in SQL.

    In this particular issue, it was causing problems discovering databases in a SQL low priv environment. If you see that symptom as well as the error described, I'd definitely permission those keys.