
Using a recovery in OpsMgr – Basic

This is a simple overview of using a recovery for a custom monitor in OpsMgr.

Let's say we create a simple service monitor in OpsMgr… for this example – I will use the Print Spooler service:

Create a new unit monitor, and choose Windows Services – Basic Service Monitor:

image

Choose an appropriate management pack to save it to… such as a Base OS custom rule MP you create.

Give it a name – such as “Check Windows Spooler Service” and choose a valid target, such as “Windows Server”

image

Browse the service name – and pick the Print Spooler (Spooler):

image

Accept defaults for health, and let it create an alert, or not – depending on your requirements.

Once the monitor is created…. open it up in the Authoring tab of the Ops console.  Choose the “Diagnostic and Recovery” tab.

Under “Configure Recovery Tasks” add a recovery for Critical Health State.  Choose “Run Command” and click Next.

Give the recovery a name…. such as “Restart service” and click Next.

For the command line settings… we need to provide a path to the file we want to run.  For a simple service restart – we can use the “NET” command, as in “NET START (servicename)”.  For the path – just specify the executable itself, without any command line switches…. such as:  “%windir%\system32\net.exe”

Under “Parameters” – this is where we will add the command line switches…. such as “start spooler” in this case:

image
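To illustrate why the path and the parameters are entered separately, here is a minimal Python sketch of how the two fields could combine into a single command. It is not how OpsMgr itself builds the command line; `build_command` and the `env` dictionary are hypothetical names used only for illustration.

```python
import re


def build_command(path, parameters, env):
    """Expand %NAME% variables in `path`, then append the split parameters."""
    # Windows environment variable names are case-insensitive.
    lookup = {name.lower(): value for name, value in env.items()}

    def expand(match):
        # Leave unknown variables untouched rather than guessing a value.
        return lookup.get(match.group(1).lower(), match.group(0))

    executable = re.sub(r"%([^%]+)%", expand, path)
    # The executable stays a single item; each switch becomes its own argument.
    return [executable] + parameters.split()


command = build_command(
    r"%windir%\system32\net.exe",  # the "path" field from the wizard
    "start spooler",               # the "Parameters" field from the wizard
    {"WINDIR": r"C:\Windows"},     # assumed value, for illustration only
)
print(command)
```

Keeping the executable as one item and the switches as separate arguments mirrors the wizard's two fields: the path identifies what to run, and the parameters say what it should do.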

Click “Create”, then click OK.

Now – pick a managed agent – and stop the Spooler service.  This will create a state change for the monitor.  If you told the monitor to alert – it will also create an alert at this time.  As soon as the state change occurs, our recovery will run…. which should restart the service.

Check the system event log to view the activity.  I got the following two events:

Event Type:    Information
Event Source:    Service Control Manager
Event Category:    None
Event ID:    7036
Date:        3/26/2008
Time:        1:24:44 AM
User:        N/A
Computer:    OMTERM
Description:
The Print Spooler service entered the stopped state.

Event Type:    Information
Event Source:    Service Control Manager
Event Category:    None
Event ID:    7036
Date:        3/26/2008
Time:        1:25:04 AM
User:        N/A
Computer:    OMTERM
Description:
The Print Spooler service entered the running state.

So the service was down for about 20 seconds…. the time it took for the monitor to detect the unhealthy state, and then to run the recovery to restart the service.
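The 20 second figure comes straight from the two event timestamps. As a small sketch, the same arithmetic in Python, using the date and time values from the events above (the format string is an assumption about how those fields are laid out):

```python
from datetime import datetime

# Month/day/year with a 12-hour clock, matching the event log fields above.
FORMAT = "%m/%d/%Y %I:%M:%S %p"

stopped = datetime.strptime("3/26/2008 1:24:44 AM", FORMAT)
running = datetime.strptime("3/26/2008 1:25:04 AM", FORMAT)

# Difference between the "stopped" and "running" events, in seconds.
downtime = (running - stopped).total_seconds()
print(downtime)  # 20.0
```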

Open Health Explorer for the computer object for the test machine, and find the monitor we created (“Check Windows Spooler Service” in this example).  It should show up as healthy… if the recovery worked.  Select this monitor, and then click the “State Change Events” tab.  The last logged state change should show that the service is currently running.  Find the “Service is Not running” state change just below the current one…. and in the details pane – we should be able to see the recovery output where the recovery task ran automatically, and logged the output:

image

So what if we want a more advanced recovery?  Perhaps we have a service that just doesn’t always start reliably on the first try.  Perhaps we want to try to start the service three times over a 3 minute period, and THEN create the alert?   This can be done…. but it has to be done with a custom script that provides this logic, and then either creates the alert itself, or writes an event that a separate rule will alert on.
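The retry logic such a script needs can be sketched as follows. This is only an illustration in Python, not an OpsMgr script: `start_with_retries`, `net_start` and `service_running` are hypothetical helper names, the 60 second pause is an assumption, and the logic would need to be ported to whatever script type your recovery actually runs. The start and check steps are passed in as functions so the loop can be tried out without touching a real service.

```python
import subprocess
import time


def start_with_retries(start, is_running, attempts=3, wait_seconds=60, sleep=time.sleep):
    """Call start() up to `attempts` times; return True once is_running() is True."""
    for attempt in range(1, attempts + 1):
        start()
        if is_running():
            return True
        if attempt < attempts:
            # Pause before the next try; no need to wait after the final one.
            sleep(wait_seconds)
    return False


def net_start(service):
    """Hypothetical helper: run 'net start <service>' and ignore its exit code."""
    subprocess.run(["net", "start", service], check=False)


def service_running(service):
    """Hypothetical helper: look for RUNNING in the output of 'sc query <service>'."""
    result = subprocess.run(
        ["sc", "query", service], capture_output=True, text=True, check=False
    )
    return "RUNNING" in result.stdout
```

On a Windows machine, calling `start_with_retries(lambda: net_start("Spooler"), lambda: service_running("Spooler"))` would make up to three attempts. A `False` result is the point at which the script would raise the alert or write the event for a rule to pick up.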

3 Comments

  1. Gurunath Reddy

    Hi Kevin,
    Need your help here-
    I want to run a recovery task that automatically logs in through RDP with the service account to an application server when it has been logged out or restarted for any reason. I have a script which executes remotely and logs in to that server successfully. But when I add it as a recovery task, it shows as succeeded even though the login did not actually happen. I suspect that this is because the recovery task runs on the application server itself, since the monitor is targeted at it. Is there a way to execute the recovery task on a server other than the monitoring target?

    Thank you
    guru
