
Creating a SCOM Service Monitor that allows overrides for Interval Frequency and Samples



 

The “built in” service monitor in SCOM is hard-coded for how often it checks the service state, and how many service checks have to return “not running” before it alarms.  This is a bit unfortunate, as customers would often want to customize this.  This article will explain how.

 

All of the built-in service monitoring uses Monitors that reference the Microsoft.Windows.CheckNTServiceStateMonitorType MonitorType, which is in the Microsoft.Windows.Library MP.

This MonitorType has a hard coded definition with <Frequency>30</Frequency> and <MatchCount>2</MatchCount>.  This means by default, monitors that use this will inspect the service state every 30 seconds, and alarm when it is not running after two consecutive checks.  However – the challenge is – Microsoft did not expose these values as override-able parameters.

What if you want to check the service every 60 seconds, and alarm only after it has been consistently down for 15 samples (15 consecutive minutes)?  We can do that.  We have the tools.

 

Basically, we need to create our own MonitorType, which will expose these values as overrideable parameters.  Here is an example:

<UnitMonitorType ID="Contoso.Demo.Service.MonitorType" Accessibility="Internal">
  <MonitorTypeStates>
    <MonitorTypeState ID="Running" NoDetection="false" />
    <MonitorTypeState ID="NotRunning" NoDetection="false" />
  </MonitorTypeStates>
  <Configuration>
    <xsd:element name="ComputerName" type="xsd:string" xmlns:xsd="http://www.w3.org/2001/XMLSchema" />
    <xsd:element name="ServiceName" type="xsd:string" xmlns:xsd="http://www.w3.org/2001/XMLSchema" />
    <xsd:element name="IntervalSeconds" type="xsd:integer" xmlns:xsd="http://www.w3.org/2001/XMLSchema" />
    <xsd:element name="CheckStartupType" minOccurs="0" maxOccurs="1" type="xsd:string" xmlns:xsd="http://www.w3.org/2001/XMLSchema" />
    <xsd:element name="Samples" type="xsd:integer" xmlns:xsd="http://www.w3.org/2001/XMLSchema" />
  </Configuration>
  <OverrideableParameters>
    <OverrideableParameter ID="IntervalSeconds" Selector="$Config/IntervalSeconds$" ParameterType="int" />
    <OverrideableParameter ID="CheckStartupType" Selector="$Config/CheckStartupType$" ParameterType="string" />
    <OverrideableParameter ID="Samples" Selector="$Config/Samples$" ParameterType="int" />
  </OverrideableParameters>
  <MonitorImplementation>
    <MemberModules>
      <DataSource ID="DS" TypeID="Windows!Microsoft.Windows.Win32ServiceInformationProvider">
        <ComputerName>$Config/ComputerName$</ComputerName>
        <ServiceName>$Config/ServiceName$</ServiceName>
        <Frequency>$Config/IntervalSeconds$</Frequency>
        <DisableCaching>true</DisableCaching>
        <CheckStartupType>$Config/CheckStartupType$</CheckStartupType>
      </DataSource>
      <ProbeAction ID="Probe" TypeID="Windows!Microsoft.Windows.Win32ServiceInformationProbe">
        <ComputerName>$Config/ComputerName$</ComputerName>
        <ServiceName>$Config/ServiceName$</ServiceName>
      </ProbeAction>
      <ConditionDetection ID="ServiceRunning" TypeID="System!System.ExpressionFilter">
        <Expression>
          <Or>
            <Expression>
              <And>
                <Expression>
                  <SimpleExpression>
                    <ValueExpression>
                      <Value Type="String">$Config/CheckStartupType$</Value>
                    </ValueExpression>
                    <Operator>NotEqual</Operator>
                    <ValueExpression>
                      <Value Type="String">false</Value>
                    </ValueExpression>
                  </SimpleExpression>
                </Expression>
                <Expression>
                  <SimpleExpression>
                    <ValueExpression>
                      <XPathQuery Type="Integer">Property[@Name='StartMode']</XPathQuery>
                    </ValueExpression>
                    <Operator>NotEqual</Operator>
                    <ValueExpression>
                      <Value Type="Integer">2</Value>
                      <!-- 0=BootStart 1=SystemStart 2=Automatic 3=Manual 4=Disabled -->
                    </ValueExpression>
                  </SimpleExpression>
                </Expression>
              </And>
            </Expression>
            <Expression>
              <SimpleExpression>
                <ValueExpression>
                  <XPathQuery Type="Integer">Property[@Name='State']</XPathQuery>
                </ValueExpression>
                <Operator>Equal</Operator>
                <ValueExpression>
                  <Value Type="Integer">4</Value>
                  <!-- 0=Unknown 1=Stopped 2=StartPending 3=StopPending 4=Running 5=ContinuePending 6=PausePending 7=Paused 8=ServiceNotFound 9=ServerNotFound -->
                </ValueExpression>
              </SimpleExpression>
            </Expression>
          </Or>
        </Expression>
      </ConditionDetection>
      <ConditionDetection ID="ServiceNotRunning" TypeID="System!System.ExpressionFilter">
        <Expression>
          <And>
            <Expression>
              <Or>
                <Expression>
                  <SimpleExpression>
                    <ValueExpression>
                      <XPathQuery Type="Integer">Property[@Name='StartMode']</XPathQuery>
                    </ValueExpression>
                    <Operator>Equal</Operator>
                    <ValueExpression>
                      <Value Type="Integer">2</Value>
                      <!-- 0=BootStart 1=SystemStart 2=Automatic 3=Manual 4=Disabled -->
                    </ValueExpression>
                  </SimpleExpression>
                </Expression>
                <Expression>
                  <And>
                    <Expression>
                      <SimpleExpression>
                        <ValueExpression>
                          <Value Type="String">$Config/CheckStartupType$</Value>
                        </ValueExpression>
                        <Operator>Equal</Operator>
                        <ValueExpression>
                          <Value Type="String">false</Value>
                        </ValueExpression>
                      </SimpleExpression>
                    </Expression>
                    <Expression>
                      <SimpleExpression>
                        <ValueExpression>
                          <XPathQuery Type="Integer">Property[@Name='StartMode']</XPathQuery>
                        </ValueExpression>
                        <Operator>NotEqual</Operator>
                        <ValueExpression>
                          <Value Type="Integer">2</Value>
                          <!-- 0=BootStart 1=SystemStart 2=Automatic 3=Manual 4=Disabled -->
                        </ValueExpression>
                      </SimpleExpression>
                    </Expression>
                  </And>
                </Expression>
              </Or>
            </Expression>
            <Expression>
              <SimpleExpression>
                <ValueExpression>
                  <XPathQuery Type="Integer">Property[@Name='State']</XPathQuery>
                </ValueExpression>
                <Operator>NotEqual</Operator>
                <ValueExpression>
                  <Value Type="Integer">4</Value>
                  <!-- 0=Unknown 1=Stopped 2=StartPending 3=StopPending 4=Running 5=ContinuePending 6=PausePending 7=Paused 8=ServiceNotFound 9=ServerNotFound -->
                </ValueExpression>
              </SimpleExpression>
            </Expression>
          </And>
        </Expression>
        <SuppressionSettings>
          <MatchCount>$Config/Samples$</MatchCount>
        </SuppressionSettings>
      </ConditionDetection>
    </MemberModules>
    <RegularDetections>
      <RegularDetection MonitorTypeStateID="Running">
        <Node ID="ServiceRunning">
          <Node ID="DS" />
        </Node>
      </RegularDetection>
      <RegularDetection MonitorTypeStateID="NotRunning">
        <Node ID="ServiceNotRunning">
          <Node ID="DS" />
        </Node>
      </RegularDetection>
    </RegularDetections>
    <OnDemandDetections>
      <OnDemandDetection MonitorTypeStateID="Running">
        <Node ID="ServiceRunning">
          <Node ID="Probe" />
        </Node>
      </OnDemandDetection>
      <OnDemandDetection MonitorTypeStateID="NotRunning">
        <Node ID="ServiceNotRunning">
          <Node ID="Probe" />
        </Node>
      </OnDemandDetection>
    </OnDemandDetections>
  </MonitorImplementation>
</UnitMonitorType>

 

Essentially, we have taken the hard-coded values and replaced them with $Config/Value$ parameters.  This allows the Monitor to PASS these values to the MonitorType, where they are used in the DataSource and the ConditionDetection.  Even if you don’t fully understand that, it’s OK, because I will be wrapping all this up in a consumable VSAE Fragment that is easy to implement.
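If you paste the MonitorType into a hand-built MP rather than using the fragment, the Windows! and System! aliases it uses (and the Health! alias used by the Monitor) need matching entries in the Manifest. Here is a minimal sketch of what that References block might look like. The version numbers are assumptions for illustration; use the versions of these libraries that are imported in your own management group.

```xml
<References>
  <!-- Alias names must match the prefixes used in the XML (System!, Windows!, Health!) -->
  <Reference Alias="System">
    <ID>System.Library</ID>
    <Version>7.5.8501.0</Version> <!-- assumed version, adjust to your environment -->
    <PublicKeyToken>31bf3856ad364e35</PublicKeyToken>
  </Reference>
  <Reference Alias="Windows">
    <ID>Microsoft.Windows.Library</ID>
    <Version>7.5.8501.0</Version> <!-- assumed version, adjust to your environment -->
    <PublicKeyToken>31bf3856ad364e35</PublicKeyToken>
  </Reference>
  <Reference Alias="Health">
    <ID>System.Health.Library</ID>
    <Version>7.0.8433.0</Version> <!-- assumed version, adjust to your environment -->
    <PublicKeyToken>31bf3856ad364e35</PublicKeyToken>
  </Reference>
</References>
```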

The changes made to allow data to be passed in were:

          <Frequency>$Config/IntervalSeconds$</Frequency>
          <MatchCount>$Config/Samples$</MatchCount>

In the <Configuration> section we added:

          <xsd:element name="IntervalSeconds" type="xsd:integer" xmlns:xsd="http://www.w3.org/2001/XMLSchema" />
          <xsd:element name="Samples" type="xsd:integer" xmlns:xsd="http://www.w3.org/2001/XMLSchema" />

In the <OverrideableParameters> section – we added:

          <OverrideableParameter ID="IntervalSeconds" Selector="$Config/IntervalSeconds$" ParameterType="int" />
          <OverrideableParameter ID="Samples" Selector="$Config/Samples$" ParameterType="int" />

In the DataSource, one more setting should be added when using Microsoft.Windows.Win32ServiceInformationProvider together with a sample count greater than one:

           <DisableCaching>true</DisableCaching>

This is very important, as this will cause the datasource to output data every time, even if nothing has changed.  We need this for the number of samples (MatchCount) to work as desired.

Now that we have this new MonitorType – we can reference it in our own Monitors.  Here is an example of a Monitor using this:

<UnitMonitor ID="Contoso.Demo.Spooler.Service.Monitor" Accessibility="Public" Enabled="true" Target="Windows!Microsoft.Windows.Server.OperatingSystem" ParentMonitorID="Health!System.Health.AvailabilityState" Remotable="true" Priority="Normal" TypeID="Contoso.Demo.Service.MonitorType" ConfirmDelivery="false">
  <Category>AvailabilityHealth</Category>
  <AlertSettings AlertMessage="Contoso.Demo.Spooler.Service.Monitor.Alert.Message">
    <AlertOnState>Error</AlertOnState>
    <AutoResolve>true</AutoResolve>
    <AlertPriority>Normal</AlertPriority>
    <AlertSeverity>Error</AlertSeverity>
    <AlertParameters>
      <AlertParameter1>$Data/Context/Property[@Name='Name']$</AlertParameter1>
      <AlertParameter2>$Target/Host/Property[Type="Windows!Microsoft.Windows.Computer"]/PrincipalName$</AlertParameter2>
    </AlertParameters>
  </AlertSettings>
  <OperationalStates>
    <OperationalState ID="Running" MonitorTypeStateID="Running" HealthState="Success" />
    <OperationalState ID="NotRunning" MonitorTypeStateID="NotRunning" HealthState="Error" />
  </OperationalStates>
  <Configuration>
    <ComputerName />
    <ServiceName>spooler</ServiceName>
    <IntervalSeconds>30</IntervalSeconds>
    <CheckStartupType>true</CheckStartupType>
    <Samples>2</Samples>
  </Configuration>
</UnitMonitor>
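The Monitor above references an AlertMessage ID, so the MP also needs a matching StringResource and display strings. Here is a minimal sketch of those pieces. The display names are placeholders I made up for illustration; the {0} and {1} tokens map to AlertParameter1 (the service name) and AlertParameter2 (the computer name) from the Monitor.

```xml
<Presentation>
  <StringResources>
    <!-- Must match the AlertMessage attribute on the UnitMonitor -->
    <StringResource ID="Contoso.Demo.Spooler.Service.Monitor.Alert.Message" />
  </StringResources>
</Presentation>
<LanguagePacks>
  <LanguagePack ID="ENU" IsDefault="true">
    <DisplayStrings>
      <!-- Names below are illustrative placeholders -->
      <DisplayString ElementID="Contoso.Demo.Spooler.Service.Monitor">
        <Name>Contoso Demo Spooler Service Monitor</Name>
      </DisplayString>
      <DisplayString ElementID="Contoso.Demo.Spooler.Service.Monitor" SubElementID="Running">
        <Name>Running</Name>
      </DisplayString>
      <DisplayString ElementID="Contoso.Demo.Spooler.Service.Monitor" SubElementID="NotRunning">
        <Name>Not Running</Name>
      </DisplayString>
      <DisplayString ElementID="Contoso.Demo.Spooler.Service.Monitor.Alert.Message">
        <Name>Contoso Demo Spooler service is not running</Name>
        <Description>Service {0} is not running on {1}</Description>
      </DisplayString>
    </DisplayStrings>
  </LanguagePack>
</LanguagePacks>
```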

 

Once you implement this Monitor, you will see the new options (IntervalSeconds, CheckStartupType, and Samples) exposed as overrideable parameters in the monitor's overrides.
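You would normally set these overrides in the console, but for reference, here is a rough sketch of what overrides for the 60 second / 15 sample scenario from earlier might look like in XML. The override IDs are hypothetical, and this assumes the overrides live in the same MP as the Monitor; from a separate override MP, the Monitor attribute would need your reference alias as a prefix.

```xml
<Overrides>
  <!-- Check the service every 60 seconds instead of every 30 -->
  <MonitorConfigurationOverride ID="Contoso.Demo.Spooler.Service.Monitor.IntervalSeconds.Override" Context="Windows!Microsoft.Windows.Server.OperatingSystem" Enforced="false" Monitor="Contoso.Demo.Spooler.Service.Monitor" Parameter="IntervalSeconds">
    <Value>60</Value>
  </MonitorConfigurationOverride>
  <!-- Require 15 consecutive "not running" samples before changing state -->
  <MonitorConfigurationOverride ID="Contoso.Demo.Spooler.Service.Monitor.Samples.Override" Context="Windows!Microsoft.Windows.Server.OperatingSystem" Enforced="false" Monitor="Contoso.Demo.Spooler.Service.Monitor" Parameter="Samples">
    <Value>15</Value>
  </MonitorConfigurationOverride>
</Overrides>
```

The Parameter attribute has to match the OverrideableParameter ID from the MonitorType, which is why exposing IntervalSeconds and Samples there is what makes these overrides possible.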

 

 

So the key takeaways are:

  • The built in service monitoring does not allow for configurable Interval and Sample count.
  • We can customize this using a custom MonitorType that allows for these variables to be passed in.
  • Using the Microsoft.Windows.Win32ServiceInformationProvider we MUST set <DisableCaching>true</DisableCaching>

 

 

This example has been added to my Fragment Library for you to download at:

https://gallery.technet.microsoft.com/SCOM-Management-Pack-VSAE-2c506737

(see:  Monitor.Service.WithAlert.FreqAndSamples.mpx)

 

To learn more about using MP Fragments, and how EASY they are to use with Visual Studio:

https://kevinholman.com/2016/06/04/authoring-management-packs-the-fast-and-easy-way-using-visual-studio/

https://www.youtube.com/watch?v=9CpUrT983Gc

 

To make using fragments REALLY EASY, using Silect MP Author Pro, watch the video:

https://kevinholman.com/2017/03/22/management-pack-authoring-the-really-fast-and-easy-way-using-silect-mp-author-and-fragments/

https://www.youtube.com/watch?v=E5nnuvPikFw

 

 


7 Comments

  1. Niksa

    Hi Kevin,

    I am curious whether this can be implemented with the “Application Pool availability” monitor from the IIS management pack. We often restart Application Pools, and that results in alerts that close in the next interval. It would be good for us if we could create an “Application Pool availability” monitor that alerts only if the Application Pool is still down after, say, 5 checks/intervals.

    Thanks,
    Niksa

    • Kevin Holman

      That will take a little work – but yes it can be made MUCH better. You could add a consecutive samples filter, or use $Config/Samples$ just like I did above for the service monitor – you will need to re-create the Monitortype like I did above. Silect has a super handy tool that will grab this workflow, and pull all these parts into a new MP for you to customize, if you have that. Otherwise you can forklift it into XML.

    • Fred Reisser

      I am testing just such an Application Pool Availability monitor using the MatchCount expression filter. I added the Samples as a configuration element and an overrideable parameter to the data source module and to the unit monitor type, and to the member module that references the data source module and the MatchCount setting in the unit monitor type. Finally, add the Samples as a configuration item in the unit monitor. The result is an application pool up/down monitor with frequency and samples as overrideable parameters.

      Since the MatchCount filter is part of the System.ExpressionFilter in the System Library, it can be used in any unit monitor where you don’t want to change state until a condition has occurred x times in a row.

      • Kevin Holman

        That’s awesome Fred. I have done the exact same monitor for several customers, for things like app pools. It would be nice if we would go back and make MatchCount a standard and mandatory, ANYWHERE the System.ExpressionFilter is used…. and make it override-able.

  2. Luis Serrano

    Hi Kevin, I need your help.

    Do I need to create the Classes first, then the type of monitor and then the service monitor? In the fragments folder I do not see the type of monitor.

    I created the monitor class and the service monitor for my application, and now I need to create the type of monitor?

    This is my code for “Monitor service”: but I do not know how to insert the for the monitor to work…

    AvailabilityHealth

    Error
    true
    Normal
    Error

    $Data/Context/Property[@Name=’Name’]$
    $Target/Host/Property[Type=”Windows!Microsoft.Windows.Computer”]/PrincipalName$

    MSSQLSERVER

    BCBA SQLApp MSSQLSERVER Service Monitor

    Running

    Not Running

    BCBA SQLApp MSSQLSERVER service is not running
    Service {0} is not running on {1}

    ty for your help.

  3. Graham Parker

    Hi Kevin

    I have created a Custom App MP, based on some of your fragments and few of my own, and is working fine.

    I have created the MP monitors to initially generate an alert when HealthStateChanges to Warning, Priority=High, Severity=MatchMonitorHealth – produces a Yellow Health state and Yellow alert. The App Manager can now create overrides if they want to have the Alert Red by changing the Severity from MatchMonitorHealth to Error/Critical.

    The issue is, the Health against the Service still shows Yellow, while the Alert shows red.

    If the App Manager creates an Override with enabling the Override parameter “Alert on State” and changes this from “The monitor is in a Warning State” to “The monitor is in a Critical State”, no alert is received. Makes sense because the Monitor is not in Health=Critical State, it is in the initial default Health=Warning State.

    Can an overrideable parameter be created to change the Health State of the Service monitor from Warning to Critical, so that when the Priority and Severity are overridden to create Yellow or Red Alert, the Healthy State will be Warning/Yellow or Critical/Red to match the colour of the Alert?

    Hope this makes sense.

    Thanks for any insight.
