Monitoring Microsoft Configuration Manager

Quick Download:  https://github.com/thekevinholman/MCM

The old System Center Configuration Manager 2012 (SCCM) Management Pack is no longer available.  Customers often used this to monitor current branch, and it still works well for the most part.

Now that it has been pulled, there is a gap for customers who don’t have that old MP, or who want something a little more relevant to monitoring Microsoft Configuration Manager deployments.

I have taken the old SCCM 2012 MPs and combined them into a single MP.  It largely works the same way, but I made some enhancements based on things I saw in the field.

The biggest changes are around controlling noise from transient errors and moving the alert source to the unit monitors.  Alerting from rollup monitors was a terrible design as a method to limit noise, because you also limit quality detailed alerts.  I also fixed some workflows that flat out didn’t work, and aligned the MP closer to best practices.

  • MCM Client Class discovers sitecode and version properties
  • Fixed errors on MCM client discovery that logged events
  • Added MatchCount property to most monitors to support multiple consecutive samples to reduce noise
  • Removed overrides that were muting alerts from unit monitors
  • Added overrides to disable alerts from the aggregate rollup monitors and dependency rollup monitors
  • Added ScaleBy to Process based CPU Rules and Monitors so they actually work.
  • Changed Max sample separation on Perf Collection rules from 10 to 4 so one data point will be collected every hour on any perf collection rules that are enabled.
  • Deleted all Manual Reset Monitors and replaced with Rules.
  • Deleted all Classes, Rules, and Monitors for Deprecated and Unsupported roles
  • Renamed all Element IDs to include the MP ID, and follow structured rules:
    • Datasource ends with .DS and with Displaystring Datasource
    • ProbeAction ends with .PA and with Displaystring Probe Action
    • WriteAction ends with .WA and with Displaystring Write Action
    • MonitorType ends with .MT and with Displaystring MonitorType
    • Discovery ends with .Discovery and with Displaystring Discovery
    • Rule ends with .Rule and with Displaystring Rule
    • Monitor ends with .Monitor and with Displaystring Monitor
    • Aggregate Monitor ends with .AggregateRollup.Monitor
    • Dependency Monitor ends with .DependencyRollup.Monitor
    • Groups end with .Group and with Displaystring Group
    • Task ends with .Task
  • Removed all Override Displaystrings which were pointless
  • Reduced frequency of MCM Client Discovery from 60 seconds to once a day.
  • Created an override to disable the SMSExec service monitor for members of the Site Database Computers Group.  Database servers typically do not host the site server role, so the SMSExec service is not present on them and the monitor was causing false alarms.
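
The MatchCount change in the list above is easiest to picture with a small sketch.  The Python below is not the MP's implementation and does not call any SCOM API; it only illustrates the general idea of requiring several consecutive bad samples before changing state.  The class name `ConsecutiveSampleFilter`, the helper `states`, and the parameter `match_count` are made up for illustration.

```python
class ConsecutiveSampleFilter:
    """Report unhealthy only after `match_count` consecutive bad samples.

    Hypothetical helper for illustration; not part of the MP or of SCOM.
    """

    def __init__(self, match_count: int) -> None:
        if match_count < 1:
            raise ValueError("match_count must be at least 1")
        self.match_count = match_count
        self.bad_streak = 0

    def observe(self, sample_is_bad: bool) -> bool:
        """Feed one sample; return True when the state should be unhealthy."""
        # A single good sample resets the streak, so isolated transient
        # errors never accumulate toward the threshold.
        self.bad_streak = self.bad_streak + 1 if sample_is_bad else 0
        return self.bad_streak >= self.match_count


def states(samples, match_count):
    """Return the unhealthy (True) / healthy (False) decision after each sample."""
    sample_filter = ConsecutiveSampleFilter(match_count)
    return [sample_filter.observe(sample) for sample in samples]
```

With a match count of 3, one failed sample followed by a good one never changes state; only three failures in a row do.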

You can download this management pack here:

https://github.com/thekevinholman/MCM

 

If you have any feedback or monitoring requests, I will be happy to consider them for future updates.

 

Known issues:

1.  This MP does not support some of the high availability features new to ConfigMgr, so you might see alerts on passive nodes.

2.  This MP primarily looks in the registry on ConfigMgr servers for issues reported by MCM.  These might be out of date, and sometimes require a manual reset on the ConfigMgr server registry to clear the issue in SCOM.  These keys are located at HKLM\SOFTWARE\Microsoft\SMS\Operations Management\Components.  The “Severity” value determines the health state.  0 or 1 is Healthy, while 2 or 3 is unhealthy.  You can review the “Time of last report” and if this is very old, you can consider resetting the Severity to “0” in the registry to set the monitor back to healthy, if you are sure the component is working in MCM.  Sometimes these status message driven registry entries do not get updated back to healthy.
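
The Severity rule in item 2 can be sketched as follows.  This Python is only an illustration of the logic described above, not code from the MP.  It assumes the Severity value and the time of last report have already been read from the registry; the function names and the 7-day threshold are assumptions, since the only guidance here is "very old".

```python
from datetime import datetime, timedelta

HEALTHY_SEVERITIES = {0, 1}
UNHEALTHY_SEVERITIES = {2, 3}


def health_from_severity(severity: int) -> str:
    """Map the registry Severity value to a health state."""
    if severity in HEALTHY_SEVERITIES:
        return "Healthy"
    if severity in UNHEALTHY_SEVERITIES:
        return "Unhealthy"
    # Values outside 0-3 are not described in this post.
    return "Unknown"


def is_reset_candidate(
    severity: int,
    last_report: datetime,
    now: datetime,
    max_age: timedelta = timedelta(days=7),  # arbitrary illustrative threshold
) -> bool:
    """True when the entry is unhealthy and its last report is older than max_age.

    A True result only flags the entry for a person to check in MCM;
    it does not prove that the registry entry is out of date.
    """
    is_unhealthy = health_from_severity(severity) == "Unhealthy"
    return is_unhealthy and (now - last_report) > max_age
```

An entry flagged this way is still only a candidate: confirm the component is working in MCM before resetting Severity to 0.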

 

Troubleshooting:

1.  Discovery

The first discovery that runs is “Site System Discovery”. 

This targets the Microsoft Windows Server Operating System class, so make sure you have a Windows Server OS instance FIRST.  This means you MUST have the Windows Server OS MPs imported for every version of Windows Server that hosts an MCM server role.

Next, make sure you have enabled agent proxy as a default setting:  HERE

This discovery creates the “ConfigMgr Site System” instances.  It checks all Windows Server OS instances for the registry value HKLM\SOFTWARE\Microsoft\SMS\Operations Management\Version and discovers an instance if the value matches the wildcard “5.*”.  Use Discovered Inventory to confirm that a ConfigMgr Site System instance has been discovered for each server.

The next discoveries that run are “MCM Site Server Discovery” and “MCM Remote Server Discovery”.  These discover instances of the “ConfigMgr Site Server” and “ConfigMgr Site System Server” classes respectively.  The MCM Site Server Discovery inspects the registry on all ConfigMgr Site System instances for the key SOFTWARE\Microsoft\SMS\Operations Management\SMS Server Role\SMS Site Server.  If this registry key exists, it will discover a ConfigMgr Site Server.  Check Discovered Inventory for these.

Next, the MCM Hierarchy discovery runs against all instances of the ConfigMgr Site Server class.  The Central Site, Primary Site, Secondary Site, Site Services, and Standalone Primary Site discoveries also target the ConfigMgr Site Server class.  These are registry-based or script-based (.js) discoveries.

Look in the Operations Manager event log on the MCM servers for clues if a discovery is not working.
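
The discovery order above can be summarized in a short sketch.  This Python is only a model of the troubleshooting sequence, under the assumption that you have already counted the discovered instances of each class in Discovered Inventory.  The function names and the dictionary input are hypothetical, nothing here queries SCOM, and Python's `fnmatch` wildcard rules are used as a stand-in for whatever matching the discovery itself performs.

```python
from fnmatch import fnmatchcase
from typing import Dict, Optional

# Classes in the order their discoveries depend on each other, per the steps above.
DISCOVERY_CHAIN = [
    "Windows Server Operating System",
    "ConfigMgr Site System",
    "ConfigMgr Site Server",
]


def version_matches(version: str, pattern: str = "5.*") -> bool:
    """Approximate the wildcard test applied to the Version registry value."""
    return fnmatchcase(version, pattern)


def first_missing_class(instance_counts: Dict[str, int]) -> Optional[str]:
    """Return the first class in the chain with no discovered instances.

    `instance_counts` maps a class display name to the number of instances
    seen in Discovered Inventory. None means every link has instances.
    """
    for class_name in DISCOVERY_CHAIN:
        if instance_counts.get(class_name, 0) == 0:
            return class_name
    return None
```

Whichever class comes back first is the discovery to investigate, since everything later in the chain targets its instances.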

57 Comments

    • Kevin Holman

      This is a new MP with new IDs, so it will not upgrade the old one. The SCCM 2012 MPs should be removed, but you can run them side by side for a short time if you need to. It will be confusing because some objects in the MP will have duplicate names. Obviously, customers will have a lot of overrides referencing the old MPs, so removing them can be a pain. I’d still recommend it, because the old overrides likely won’t apply directly, so I’d use it as a chance to start fresh.

  1. Ollie Woodall

    Great news! Do we understand why MS ceased development on SCCM MPs?

    Would there be any problem having this new MP imported side by side with the SCCM 2012 MP while we work through the overrides required in the new MP (we have a LOT of overrides for the original 2012 MP)? Or should you remove the SCCM 2012 MP first?

    • Kevin Holman

      You can run the new one side by side. However, I’d keep it short, because some object names like groups will be duplicated.

        • Kevin Holman

          Meaning I’d make it a project to get the old one removed as soon as you can. I would not run both of them for months and months, because it could cause noise and confusion.

  2. Jay

    Hi Kevin,

    I just installed this MP and one of the groups we support that have no SCCM related objects in their server group/email subscription received email alerts from multiple servers that aren’t theirs for the critical alert MECM Windows Deployment Service Not Running. Is this a known issue?

    • Kevin Holman

      No. It isn’t. At least, not yet.

      MECM WDS Availability Monitor (MECM.PXEServicePoint.WDS.Service.Monitor) targets the class: ConfigMgr PXE service point (MECM.PXEServicePoint).

      If you look at discovered inventory for this class, do you have discovered objects that you do not expect in there?
      If you look at the group definition for the group scoping the people receiving alerts – how is it defined?

      • Jay

        When I target ConfigMgr PXE service point, I see objects that I expect to see in there. They are all objects from MECM servers.

        The subscription I created, targeting the server group of the team that received the email alert for the monitor, does not contain objects from the ConfigMgr PXE service point class within the server group. I have the subscription set to only receive new critical alerts targeting their server group which only contains the Windows Computer/Health Service watcher objects of the computers they support. Those computers are all SQL and DHCP servers.

        This issue occurred immediately after installing the management pack. The MECM WDS Availability monitor alerted right away on darn near all our MECM servers and that’s when the other non-MECM team received the email alerts. It may have just been a one-off type of deal, but I would have to wait for the alert to trigger again to see if it happens again. No other team in our environment received the email alert so that is a good thing.

  3. David Zemdegs

    Thanks heaps for that. I uninstalled the old one and installed this new one. I get a monitor alert about the notification port not being open. According to the doco I needed to open TCP port 10123, which I did, but the monitor is still actively alerting. What I couldn’t find was how to check which port the alert monitor is looking for.

    • Kevin Holman

      The Monitor for client notification port (like many monitors in this MP) gets this data directly from MECM. MECM publishes component state and severity in the registry, the monitors read this registry value straight from what MECM thinks about the health of itself:

      MECM.BGBServerFirewallBlock.StatusMessage.Monitor

      SOFTWARE\Microsoft\SMS\Operations Management\Components\SMS_Notification_Server\D07ACE61-FB84-4461-9F52-ABBA07C2EE3A\State

      SOFTWARE\Microsoft\SMS\Operations Management\Components\SMS_Notification_Server\D07ACE61-FB84-4461-9F52-ABBA07C2EE3A\Severity

      If Severity = 2 or 3 the monitor is marked unhealthy. If Severity = 0 or 1 the monitor is healthy. So if your monitor is unhealthy – then MECM thinks there is a problem with this component. You should have status messages in MECM reflecting this?

      • David Zemdegs

        the bgbserver.log has a few errors around specific clients timing out but nothing that suggests an entire port is not open. Looks like an erroneous alert. Override time.

        • Kevin Holman

          I would reset the registry first. If the registry is reflecting an old status message, it might not get updated sometimes. So I’d reset the registry severity back to 0 or 1 and see if the issue recurs.

  4. Ali

    I added this management pack to SCOM.
    However, when I check the Microsoft Endpoint Configuration Manager module in SCOM, it’s completely empty.
    It looks like it doesn’t want to discover my SCCM environment.

      • Ali

        the only thing i can see in the event viewer is the following event:

        A scheduled discovery task was not started because the previous task for that discovery was still executing.

        Discovery name: MECM.StandaloneSite.Discovery
        Instance name: xxxx
        Management group name: xxxxx

        And i see this event multiple times in the event log.

  5. Scott

    This is great Kevin, thanks. Is there support added for active/passive site servers? The biggest issue we’re having with the current MP is due to a bug (sort of) in MCM that turns all the registry flags MCM uses to determine site health to an error state on the passive site server. We’ve written some PowerShell to reset the values to “green” on the passive site server after a failover to prevent the false positives, but ideally the MP would be active/passive site server aware someday.

    • Kevin Holman

      I need to know more about this scenario. I’ll talk with my MECM guys and see what they say. If you want to give me more details – feel free to shoot me a comment via my blog with your email address, and we can start an email conversation

      • Stephen

        I have the same issue with the active passive and awhile back had a case open. This was the end result.

        HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\SMS\Identification\Site Servers

        This will contain a key for both(all) nodes and have a value of 1 or 0 meaning the below:
        1 = Active
        0 = Passive

  6. Michael Forde

    Thanks so much for this Kevin – Absolutely required… still confuses me that it’s had to come through you … and not the “official” channels. Please never stop doing what you do 🙂

  7. Kamil

    Absolutely awesome MP which should be provided by official Microsoft channels.
    Could you share on github xml file with mp? It will be nice looking inside before import on TEST or DEV environments.

  8. Jazeel Ahmed Siddiqui

    HI Kevin,

    I have imported the MP into SCOM 2022 UR1, but the servers are not being discovered by the management pack. It discovers only 3 servers, and those have been showing as Not Monitored for days now. All SCCM servers have proxy settings enabled. How can I troubleshoot why they are not discovered by the MP? Thanks

      • NILESH TAPASWI

        Hi Kevin,

        After installing the management pack, All the Sites are in Not Monitored state and None of the Management Points are discovered, only the Servers and Site System Roles are discovered and monitored. Please help on this. Thanks

  9. chandra

    I have downloaded and imported the MECM MP into SCOM, but even after two days nothing is showing in the site systems, servers, or DPs. Even the hierarchy shows as not monitored. Could you please guide me on what mistake I am making?

    I installed the agents on CAS, Primary, and secondary sites and all shows as healthy.

  10. Filip M

    Seems like the discovery needs to be updated for this MP.
    Or make it so we can override the registry path for the discovery.

    The discovery currently checks for SOFTWARE\Microsoft\SMS\Operations Management\

    But I think Microsoft changed this to:
    SOFTWARE\Microsoft\SMS\Operations Management\SMS Server Role\SMS Distribution Point etc…

    That could be the root of the problem for not finding any SCCM servers in SCOM.

    Don’t know if Kevin has time to fix it.

  11. chandra

    Thank you Filip for the findings. I am not sure from our side is it possible to edit the MP or not as it is locked.

    @Chris — i tried to visit the protect Org site and updated all my details, but there is no link to download the MP. Is there any other location we can download the MP.

  12. Dmitriy

    Hi, Kevin. Thanks for your work. I use “Enable PXE responder without Windows Deployment Service” and see “MECM Windows Deployment Service Not Running”. Is this a normal behavior?

    • Kevin Holman

      Yes – I just had that reported and this fix will be in the next update. You should disable the WDSSERVER service monitor, and I will include a new service monitor for the native PXE service.

  13. Ganesh

    Hi Kevin,
    I just noticed a small typo ! “Secondary” is spelled incorrectly in the below alert name.

    MECM Primary site to Secnodary Site Global Data Receiving Not Working
    MECM Primary site to Secnodary Site Global Data Sending Not Working

    I just noticed these two alert name as of now, can you please correct if possible. Thanks in advance!

  14. Kiwifulla

    Now since upgrading to 5.0.2303.0, all of our PXE Service Points that DO use WDS have now alerted on the Responder Service (SccmPxe) monitor. We have relied on the WDS monitor alerting for years, but I guess we now need to re-enable the WDS monitor and override the Responder Service monitor to be off.

    • Kevin Holman

      That is correct. I documented this behavior in the knowledge of the monitor. Disable the default monitor (SccmPxe) and enable the legacy monitor.

      • Kiwifulla

        Ah so it is, thanks Kevin – great job on this rewrite by the way, finally we have the unit monitors doing the alerting instead of the rollup monitors :). Works way better than the original SCCM one, thanks.

  15. Fabian

    Hi Kevin,

    Thanks for the effort in this package. Unfortunately, we are facing issues with detections on secondary servers. After going through the troubleshooting section here, we can see a warning in the event log:

    HierarchyDiscovery.js(947, 9) Microsoft JScript runtime error: ‘siteMap[…].ParentSiteCode’ is null or not an object

    How can we support in resolving the issue as a community approach?

    We are using: MCM 5.0.2303.0

    • Kevin Holman

      Is this on secondary site servers only that will not discover?
      Can you include the full event, and is that event on the SCOM server or the secondary site server?

      • Fabian

        Hi Kevin,

        Primary is being detected properly so far. This is on a secondary only.

        The process started at 4:02:35 PM failed to create System.Discovery.Data. Errors found in output:

        C:\Program Files\Microsoft Monitoring Agent\Agent\Health Service State\Monitoring Host Temporary Files 15\5968\HierarchyDiscovery.js(947, 9) Microsoft JScript runtime error: ‘siteMap[…].ParentSiteCode’ is null or not an object

        Command executed: “C:\Windows\system32\cscript.exe” /nologo “HierarchyDiscovery.js” {} {} 2
        Working Directory: C:\Program Files\Microsoft Monitoring Agent\Agent\Health Service State\Monitoring Host Temporary Files 15\5968\

        One or more workflows were affected by this.

        Workflow name: MECM.Hierarchy.Discovery
        Instance name:
        Instance ID: {}
        Management group:

        We can also get in touch via e-mail, if further exchange and information is required.

  16. Kiwifulla

    For failover site servers, we have a process to change the “now standby site server” name manually in a dynamic group that has all these components and the relevant monitors overridden against it to FALSE:

    ( ( Object is ConfigMgr Server Component AND ( ( Component Name Equals SMS_CLOUD_USERSYNC ) OR ( Component Name Equals SMS_COMPONENT_STATUS_SUMMARIZER ) OR ( Component Name Equals SMS_COMPONENT_MONITOR ) OR ( Component Name Equals SMS_DMP_DOWNLOADER ) OR ( Component Name Equals SMS_INBOX_MONITOR ) OR ( Component Name Equals SMS_OFFER_STATUS_SUMMARIZER ) OR ( Component Name Equals SMS_OUTBOX_MONITOR ) OR ( Component Name Equals SMS_SITE_SYSTEM_STATUS_SUMMARIZER ) OR ( Component Name Equals SMS_WAKEONLAN_MANAGER ) OR ( Component Name Equals SMS_WSUS_CONFIGURATION_MANAGER ) OR ( Component Name Equals SMS_WSUS_SYNC_MANAGER ) ) AND ( ConfigMgr Component server.Windows Computer.NetBIOS Computer Name Matches wildcard *CFGMGR505 ) ) OR ( Object is ConfigMgr Site Server Role AND ( Windows Computer.NetBIOS Computer Name Matches wildcard *CFGMGR505 ) AND True ) )

    Used it successfully for 2.5 years doing 2 failovers a year – even in pre-prod testing too (hence the * in the server name), no issues.

    Obviously if the MP could detect this it would be ideal…but just an FYI in case it helps someone in the short term.

  17. Chris S

    Thank you for putting together Kevin! Any chance you’d also be open to maintaining this thing and maybe giving the community the ability to vote on desired features?
    If so, I’d like to drop my name in the hat for getting the ability to handle for the HA/Failover bits of [CURRENT PRODUCT NAME HERE].

  18. Mitch

    Hi Kevin,
    Version 5.0.2303.0 of this MP is not discovering our ConfigMgr Clients that are 1606 (5.00.8412.1000) and older because they do not have the ‘HKLM\SOFTWARE\Microsoft\SMS\Mobile Client\SmsClientVersion’ registry value. The SmsClientVersion registry value appears to have started somewhere between version 5.00.8968.1042 and 5.00.8412.1000. Would you consider filtering the discovery based on the ‘SMS\Mobile Client\ProductVersion’ registry value for the ConfigMgr Client discoveries (Server OS + Client OS)? If the SmsClientVersion value is important then you could add another property to the MECM.Client class and capture both registry values.
    Thanks!

    • Kevin Holman

      Hi Mitch – help me understand something – those versions you are talking about date back to 2016, over 7 years ago. They are not supported. Is there any reason you have clients that you are not updating – when the top capability of MCM is updating/patching/software deployment? Is there an intentional reason you do not update agents for 7 years?

      • Mitch

        Hi Kevin,
        You raise a good point. I can ask our ConfigMgr admins why we are running such old ConfigMgr agents, if it will help you decide if you want to make my suggested MP changes. However, I was able to work around this by disabling your ConfigMgr Client discovery and creating my own discoveries that look at ‘SMS\Mobile Client\ProductVersion’ instead of ‘SMS\Mobile Client\SmsClientVersion’.
        Thanks!

  19. Kiwifulla

    I’ve been trying to trip the CMG related monitors which I assumed equate to the 14-day outbound data transfer and storage thresholds that do successfully trigger in ConfigMgr. They are not enabled by default therefore I have overridden all 4 monitors to Enabled.

    These ones work:
    MCM Windows Azure Storage Monitor
    MCM Windows Azure Storage Warning Monitor

    These ones don’t:
    MCM Windows Azure Transfer Monitor
    MCM Windows Azure Transfer Warning Monitor

    Is this a known issue – perhaps they didn’t work in the old MP either (I had that MP but only recently built a CMG)?

    Thanks

  20. Charlez

    Hi,

    MCM Site Server Connectivity To SQL Database Server Availability Monitor is in warning on our passive site server.
    While sql is for sure reachable from this server and permissions are OK.

    br
    Charlez

    • Kevin Holman

      Yes – this MP does not understand the high availability options that were added to MCM. This is a common request, I will work on that.

  21. Vance Morrison

    Hi Kevin, thank you for the new MP.

    I have a question regarding the “MCM Distribution failed to access network” rule. We are running into an issue where one or more DP’s are offline and the rule starts to generate a lot of noise which then causes SCOM to suspend alerting for one of the primary CM servers.

    Do you happen to have any advice for this? The concern is that something happens to the primary but it doesn’t send out the alerts due to the suspension.

  22. Aidan Greenwood

    Hi Kevin,
    Tried to install MCM latest on my SCOM 2022 UR2 test server, and it fails with:
    Database error. MPInfra_p_ManagementPackInstall failed with exception:
    [MP ID: 519d8e1c-c720-b125-2d57-559e4186b931][MP Version: 5.0.2303.2][MP PKT: 3aea5d3804b7337d] Database error. MPInfra_p_ManagementPackInstall failed with exception:
    Incorrect syntax near ‘)’.

    • Aidan Greenwood

      I got it installed directly on the UR2 server, but what puzzles me now is the console on UR2 server is 10.22.10118, but my UR1 system has 10.22.10337?

  23. Charlez

    Hi Kevin,

    We changed our SCCM SQL reporting services to only accept https requests. From that moment on the monitor “MCM Reporting Service Point Availability Monitor” went to critical alert state.
    I guess the MP is not designed to check https only traffic, since the reporting services are working fine.

    Best regards
