High Availability and Patching Runbook Servers

I was recently asked about patching Orchestrator itself. After I demoed how to automate certain patching processes, someone asked, "Well, what if I wanted to automate patching my Orchestrator servers?", which of course got the wheels spinning. I did some searching and testing, and ended up using the information below to complete the task.

Orchestrator can be made highly available by putting multiple Orchestrator web services behind a load balancer, deploying multiple runbook servers, and using a clustered SQL Server for the Orchestrator database. This post, though, focuses on a suggested way to patch Orchestrator runbook servers with the least interruption to the automation capabilities Orchestrator provides. The steps are outlined below:

 

1. Ensure your Orchestrator environment has at least 2 runbook servers connected. This ensures Orchestrator will be fault tolerant in the event of server maintenance or failure. If you only have one runbook server, use Deployment Manager on your Orchestrator Management Server to deploy another runbook server.

2. Patch each runbook server, one at a time, by doing the following:

a. Invoke a “Wait Runbook” on the current runbook server to patch. This runbook should do nothing, but run forever. For example, it might infinitely loop a PowerShell sleep command.

b. Set the maximum jobs to 1 on this runbook server by invoking "C:\Program Files (x86)\Microsoft System Center 2012\Orchestrator\Management Server\aspt.exe" from the command line on the Orchestrator Management Server. Your path to aspt.exe may be different if you chose a non-default installation location for the Management Server. The syntax for this command is: aspt.exe RunbookServerName MaxNumberofJobs

c. Wait X seconds for all non-monitor running jobs on this runbook server to complete, where X is some amount of time that makes sense for the type of runbooks your organization typically runs. X should be large enough that a normal runbook should finish in that amount of time.

d. Move any jobs that are still running other than the “Wait Runbook” (probably monitors) to another runbook server by manually stopping them and then starting them again. These jobs will restart on a different runbook server than the one you are trying to patch since the max jobs for this runbook server is set to 1 and the “Wait Runbook” is holding that slot.

e. Confirm all jobs except the “Wait Runbook” have successfully switched runbook servers.

f. Do whatever work is required to fully patch this runbook server, including machine restarts.

g. Set the maximum jobs back to 50 on this runbook server by invoking aspt.exe again, as in step 2.b.

h. Stop the “Wait Runbook” on the runbook server you just patched.
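
As a rough illustration, the per-server sequence above can be laid out as a dry-run plan. This is only a sketch, not the runbook export discussed in this post: the helper names (aspt_command, patch_plan), the server names RBS01 and RBS02, and the 300-second value for X are all made up for the example. It only builds and prints the steps; nothing is executed against Orchestrator.

```python
import subprocess
from typing import List, Tuple

# Default path quoted in step 2.b; yours may differ for a non-default install.
ASPT_PATH = (
    r"C:\Program Files (x86)\Microsoft System Center 2012"
    r"\Orchestrator\Management Server\aspt.exe"
)


def aspt_command(server: str, max_jobs: int, aspt_path: str = ASPT_PATH) -> List[str]:
    """Build the argument list for: aspt.exe RunbookServerName MaxNumberofJobs."""
    return [aspt_path, server, str(max_jobs)]


def patch_plan(
    servers: List[str], drain_seconds: int, normal_max_jobs: int = 50
) -> List[Tuple[str, str]]:
    """Return (server, step) pairs covering steps 2.a to 2.h, one server at a time."""
    if len(servers) < 2:
        # Step 1: a single runbook server leaves nowhere for jobs to move.
        raise ValueError("need at least 2 runbook servers before patching")
    plan: List[Tuple[str, str]] = []
    for server in servers:
        # list2cmdline quotes the path because it contains spaces.
        throttle = subprocess.list2cmdline(aspt_command(server, 1))
        restore = subprocess.list2cmdline(aspt_command(server, normal_max_jobs))
        plan.extend([
            (server, "2.a start the Wait Runbook on this server"),
            (server, "2.b run on the Management Server: " + throttle),
            (server, f"2.c wait {drain_seconds} seconds for non-monitor jobs to finish"),
            (server, "2.d stop and restart any jobs still running (probably monitors)"),
            (server, "2.e confirm every job except the Wait Runbook has moved"),
            (server, "2.f patch the server, including restarts"),
            (server, "2.g run on the Management Server: " + restore),
            (server, "2.h stop the Wait Runbook"),
        ])
    return plan


# Hypothetical server names and drain time, purely for illustration.
for server, step in patch_plan(["RBS01", "RBS02"], drain_seconds=300):
    print(f"{server}: {step}")
```

Printing the plan first makes it easy to review the order before anything is run for real. The aspt_command list could later be handed to subprocess.run on the Management Server once you are happy with it.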

 

Great! Now we have a guide to put a runbook server into a “pseudo-maintenance mode” to keep automation up and running and not interrupt running jobs. But wait a sec… Orchestrator is so good at automating the patching of other services, wouldn’t it be great if Orchestrator could automate the patching of itself? It turns out this kind of patching “inception” is actually possible via a set of Orchestrator runbooks. An IT Pro can even automate the repetitive, time-consuming, and error-prone task of patching the service that lets them automate the patching of other services!

The runbook export for automating the patching of Orchestrator in a highly-available manner is available here. Remember, this is just an example and does not contain all of the error logic that you would want to apply in production usage. Some notes about using this set of runbooks:

  • Prereqs:
  • Before importing:
    • Download, register, and deploy the Orchestrator Web Service Integration Pack
    • Create a connection configuration for the Orchestrator Web Service Integration Pack you just installed. Name it “SCO”
  • After importing:
    • Make sure that the “Runbook Identifier” parameter in the "Get Wait job details" activity in the "3: PatchServer" runbook matches the correct path to the "4: WaitRunbook" runbook (ex: '\PatchRunbook\4: WaitRunbook').
    • In “3: PatchServer” change the “Wait 10 seconds for jobs to complete” activity to wait for X seconds, where X is the same X as described in the manual steps above. This activity gives running jobs on the runbook server being patched time to complete.
    • Replace the contents of “7: Run Patch” with your patching logic (integration with Configuration Manager, etc).

 

That's IT!