Monday, February 28, 2011

Scripting On-Demand Network Changes with Solarwinds Orion NCM

Getting called at 2am is never fun, even if you are the Network On-Call person.  Any chance I can  prevent a call like that, I'll take it! In this case, there's a "failover pair" of servers, one in each data center (DC). Each server has a locally unique admin/replication IP addresses on one interface that is always active and a second interface that shares the same IP address as the server in the other DC. Whichever server is active enables the  highly-available (HA) interface while the other server's HA interface is disabled. We can then make network changes to routers and switches to "switch" the server from one DC to the other. And instead of my having to manually make those changes at 2am, we can script the changes with a configuration management tool. Our tool of choice is Solarwinds Orion Network Configuration Manager (NCM).

In this particular use of NCM, there are 5 individual NCM jobs, one for each device that must be touched. The changes include enabling/disabling switch ports and adding/removing route advertisements in EIGRP and BGP.  Assume the names of the 5 jobs are AutoJob1a, AutoJob2a, ..., AutoJob5a. In addition, there are 5 jobs for the reverse direction named AutoJob1b, AutoJob2b, ..., AutoJob5b.  Each of these jobs has an NCM Job ID associated with it seen under the "Job ID" column when viewing Scheduled Jobs from the NCM GUI.

At this point, we've saved ourselves from having to individually login to each of the devices to make the required changes. But we can take it a step further by combining all the jobs and launching them from a Windows Batch (.bat) file.  On the NCM server we created the file d:\RemoteJobs\AutoJob-A.bat which contains these 5 lines, one per NCM job:

"D:\Program Files\SolarWinds\Configuration Management\configmgmtjob.exe" "D:\Program Files\SolarWinds\Configuration Management\Jobs\Job-318696.ConfigMgmtJob"
"D:\Program Files\SolarWinds\Configuration Management\configmgmtjob.exe" "D:\Program Files\SolarWinds\Configuration Management\Jobs\Job-631858.ConfigMgmtJob"
"D:\Program Files\SolarWinds\Configuration Management\configmgmtjob.exe" "D:\Program Files\SolarWinds\Configuration Management\Jobs\Job-713828.ConfigMgmtJob"
"D:\Program Files\SolarWinds\Configuration Management\configmgmtjob.exe" "D:\Program Files\SolarWinds\Configuration Management\Jobs\Job-272305.ConfigMgmtJob"
"D:\Program Files\SolarWinds\Configuration Management\configmgmtjob.exe" "D:\Program Files\SolarWinds\Configuration Management\Jobs\Job-777458.ConfigMgmtJob"


Note that the Job ID for each job shows up in the name of the .ConfigMgmtJob file that is called in each line of the .bat file.

At this point, any monkey with a login to the NCM server could just double-click on the .bat file to kick off those five NCM jobs.  But there's a better way, at least in our environment: Tidal Scheduler.  With a Tidal agent on the NCM server, Tidal can be configured to launch d:\RemoteJobs\AutoJob-A.bat or the reverse d:\RemoteJobs\AutoJob-B.bat on-demand by the Operator-On-Duty.  This allows the event to be properly audited and standardizes the action required by the Operator.  

In addition, we can configure each NCM job such that it generates an e-mail notification when it completes, so when all 5 have completed we get 5 e-mails that show exactly what commands were entered and the corresponding output from the router/switch that was modified. The e-mail can be sent to the Network team as well as the Operations team so they have a better understanding of success than simply a "completed job" message from Tidal.

In the end, instead of getting a wake-up call at 2am, the server admin team can now simply call the Operator-On-Duty and ask them to run NCM Job "AutoJob-A" or "AutoJob-B". They then use a simple traceroute to determine if the network "thinks" the server is in DC A or DC B.

Ahh, now I can go back to sleep. 

Zzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzz.

No comments:

Post a Comment