Recently, Microsoft posted a security vulnerability around SCOM 2019 agents:
https://msrc.microsoft.com/update-guide/en-US/vulnerability/CVE-2021-1728
Subsequently, we released a KB article about it: KB4601269
Let me give a little background on it.
The vulnerability is that on SCOM 2019 agents, the Network Service account by default has a high level of privilege to the Operations Manager event log. This access to the event log was new in SCOM 2019 and did not exist in previous versions of SCOM. The update/patch simply removes this access, as it is not necessary.
We released the update as a post-SCOM 2019 UR2 hotfix. We state that this hotfix requires UR2 as a prerequisite. That is not completely accurate: it will apply to SCOM 2019 RTM, UR1, or UR2. The only reason we stated that UR2 was required is that UR2 included a fix for issues with putting agents into Pending Management in the console after applying an update on the server. That’s it.
Honestly, I cannot imagine the work involved to get this patch distributed to ALL your agents in really large environments. Many customers are often way behind on agent Update Rollups as it is, because pushing these updates can be very time consuming if you don’t use Windows Update/SCCM for software deployments.
To make this a little easier I added a monitor and recovery to my SCOM Management Pack.
This monitor will inspect the Operations Manager event log security, and if Network Service has a high level of access, it will turn the SCOM management classes unhealthy (warning). The monitor also includes a recovery, disabled by default, which will fix the issue. You can enable this recovery if you want it to run automatically any time the monitor detects a SCOM 2019 agent with the issue.
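To illustrate the kind of check involved, here is a minimal Python sketch. It is not the monitor's actual script, which ships in the management pack. It assumes the event log's security descriptor is available as an SDDL string (as in the CustomSD registry value) and looks for access control entries whose trustee is a given alias. The sample string, the rights values in it, and the function names are made up for illustration; the NU alias is used only because that is the part readers report seeing in CustomSD.

```python
import re
from typing import List

# One ACE looks like (type;flags;rights;object_guid;inherit_guid;trustee).
ACE_PATTERN = re.compile(r"\(([^()]*)\)")


def aces_for_trustee(sddl: str, trustee: str) -> List[str]:
    """Return every ACE in the SDDL string whose trustee field equals `trustee`."""
    found = []
    for body in ACE_PATTERN.findall(sddl):
        fields = body.split(";")
        # A well-formed ACE has at least six fields; the sixth is the trustee.
        if len(fields) >= 6 and fields[5] == trustee:
            found.append("(" + body + ")")
    return found


def remove_trustee(sddl: str, trustee: str) -> str:
    """Return the SDDL string with every ACE for `trustee` removed."""
    for ace in aces_for_trustee(sddl, trustee):
        sddl = sddl.replace(ace, "")
    return sddl


# Hypothetical descriptor, NOT the real value found on an agent.
SAMPLE_SDDL = "O:BAG:SYD:(A;;0xf0007;;;SY)(A;;0x7;;;BA)(A;;0x3;;;NU)"

if __name__ == "__main__":
    print(aces_for_trustee(SAMPLE_SDDL, "NU"))
    print(remove_trustee(SAMPLE_SDDL, "NU"))
```

The detection half corresponds to what the monitor looks for, and the removal half to what the recovery does, but only in spirit: on a real agent the descriptor lives in the registry and should be changed with the hotfix, UR3, or the recovery itself.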
This update is included in SCOM 2019 UR3.
This issue is not present on SCOM 2012R2 and SCOM 2016 agents.
Hi Kevin,
After applying the hotfix, in Administration – Operations Manager Products – Agents I see “Update Rollup 2 Hotfix KB4601269 10.19.10173.0”
In SCOM Management – SCOM HealthService I see “System Center Operations Manager Update Rollup 2 Hotfix 10.19.10173.0”
But in SCOM Management – SCOM Agents I still see “2019 UR2 10.19.10153.0”
Is this normal?
Thanks !
Yes – it is. Unfortunately – the product group chose to increment the version number. In my opinion, we should not have done that. The main reason this is wrong is because they allowed the hotfix to apply to a SCOM 2019 RTM agent, UR1 agent, or UR2 level agent. This means you could have a SCOM 2019 RTM agent, apply the post-UR2 hotfix, and this would update your version number to 10.19.10173.0. This is VERY misleading, because the hotfix is not cumulative, so you’d be missing all the fixes in UR1 and UR2, and not have a good way to know it.
For this reason I chose not to update the SCOM Management > SCOM Agents information, because the new version number does not tell you which update rollup the agent is actually on, and instead took the monitor approach. If they had blocked installation on RTM and UR1 agents and ONLY allowed the hotfix on UR2, then we could have taken the other approach.
Hey, Seems like update (KB4601269) is not visible in WSUS/SUP, only downloadable directly from Microsoft, so most likely needs to be imported to WSUS manually…
What is your classification for the site in SCCM? TBH I haven’t checked how this update is classified either but if you don’t select the right classification in SCCM it won’t download it.
Everything except Upgrades. Of course OpsMgr 2019 is selected in the Product view. Also, check Microsoft Update Catalog website – there is no result for KB4601269.
When I try to activate the recovery I get the following errors. Any ideas what is going wrong?
Note: The following information was gathered when the operation was attempted. The information may appear cryptic but provides context for the error. The application will continue to run.
: Verification failed with 1 errors:
——————————————————-
Error 1:
Found error in 1|SCOM.Management|1.0.0.0|SCOM.Management.EventLogSecurity.Monitor.Recovery/PSWA|| with message:
The configuration specified for Module PSWA is not valid.
: Schema validation failed.
The element ‘Configuration’ has invalid child element ‘Arguments’. List of possible elements expected: ‘ScriptBody’.
——————————————————-
: The configuration specified for Module PSWA is not valid.
: Schema validation failed.
The element ‘Configuration’ has invalid child element ‘Arguments’. List of possible elements expected: ‘ScriptBody’.
System.Xml.Schema.XmlSchemaValidationException: The element ‘Configuration’ has invalid child element ‘Arguments’. List of possible elements expected: ‘ScriptBody’.
How are you “activating the recovery” ?
You can activate this by setting Enabled=true in the XML for the recovery, or you can edit the monitor and then edit the recovery: simply check the box for Run Automatically and provide a display name for the recovery, and the UI will set this correctly as well. You must be editing something that is not allowed, as the UI does not understand PowerShell script based recoveries by default.
Hi Kevin,
I’m experiencing the same problem when enabling the recovery action from the SCOM console using the “normal” wizard. Editing the XML and importing the new version works like a charm.
Best regards,
Bert
I tried to enable it from the monitor settings: I enabled the recovery and gave it a display name.
When I apply the settings, I get the error shown in my original reply.
I have tried to repro this issue in SCOM 2012, 2016, and 2019, and I cannot. My recommendation is to simply modify the XML, and on the line for the recovery, set Enabled=true. It sounds as if the UI is trying to change the recovery into a VBScript based recovery since that is all it knows, but I am not sure why it does that, or why I cannot repro this behavior. I could move the write action out to its own dedicated write action module and just reference it, but I am not sure that would change anything. Editing the XML is simple. Alternatively, instead of enabling the recovery directly, simply use an override to enable the recovery.
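For anyone who would rather script the XML change than make it by hand, here is a minimal Python sketch. The XML fragment is a cut-down, hypothetical stand-in for the management pack (the real recovery element carries more attributes and child elements), and enable_recovery is just an illustrative helper name. The recovery ID is the one shown in the error message above.

```python
import xml.etree.ElementTree as ET

RECOVERY_ID = "SCOM.Management.EventLogSecurity.Monitor.Recovery"

# Hypothetical, heavily trimmed management pack fragment for illustration only.
SAMPLE_MP = """<ManagementPack>
  <Monitoring>
    <Recoveries>
      <Recovery ID="SCOM.Management.EventLogSecurity.Monitor.Recovery" Enabled="false" ResetMonitor="false" />
    </Recoveries>
  </Monitoring>
</ManagementPack>"""


def enable_recovery(xml_text: str, recovery_id: str) -> str:
    """Return the XML with Enabled set to "true" on the matching Recovery element."""
    root = ET.fromstring(xml_text)
    for recovery in root.iter("Recovery"):
        if recovery.get("ID") == recovery_id:
            recovery.set("Enabled", "true")
            return ET.tostring(root, encoding="unicode")
    raise ValueError("No Recovery element with ID " + recovery_id)


if __name__ == "__main__":
    print(enable_recovery(SAMPLE_MP, RECOVERY_ID))
```

Note that this only flips the Enabled attribute and leaves ResetMonitor alone. ElementTree drops XML comments when it parses, so for the real file a plain text editor is just as good; either way the edited management pack still has to be imported again.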
editing the XML directly solved the issue
strange, we never had this issue before
I got the exact same error. I will go for the XML solution then.
@Kevin,
Is it still needed to install KB4601269 on Management Server(s) if we enable recovery from your MP?
No, I don’t see the benefit of that, since they do the same thing. UR3 will be out soon and will include/supersede this as well.
Any idea when UR3 will be released?
It’s available now.
Hi Kevin,
Thanks firstly for drawing our attention to this and for writing this monitor and the recovery task for it. I’ve done some testing on this as I’m required to cover this security issue.
I tried editing the XML (I set the recovery to Enabled=true and also ResetMonitor=true). However, the recovery doesn’t seem to be running. I checked on a few agents and the registry value CustomSD still contains the (A…NU) part. I’ve also checked the Operations Manager event log and event 3800 is not logged, which leads me to believe the script didn’t run.
Running the script manually on each agent works; however, that doesn’t reset/recalculate the monitor automatically.
Resetting the health state of the monitor sets it to healthy, but if I reset the health on an agent that I haven’t run the recovery task against, the health stays green even though the registry value still contains the NU part (I checked this and confirmed the behaviour). Does the monitor keep checking for this registry value, and if so, how often?
Any other guidance is appreciated.
Thanks
G
1. A recovery will only ever run at the moment of the monitor state change. If the monitor is already unhealthy when you enable the recovery – it will never run. I recommend putting all these instances of “SCOM Management” class into maintenance mode for 5 minutes, then let them come out of MM and they should redetect the issue and run the recovery.
2. I never recommend ResetMonitor=true. This has caused many issues in my experience.
3. The monitor does not reset immediately, as it will only run once every 24 hours. Just wait a day, or override the interval.
4. If you reset the health state, it WILL re-detect the issue and run the recovery if so enabled. However, again – by default detection is once every 24 hours (86400 seconds). You may set this to something much shorter. I set it to once a day because some people get all kinds of upset when I run stuff too frequently. 🙂
I’m facing a weird situation… I’m getting this warning in our SCOM 2016 UR10 environment.
Could it be because I have deployed your latest SCOM.Management MP in our environment?
What version are your agents?
Our environment is a hybrid scenario between Azure Log Analytics and SCOM…
Following Microsoft recommendations, we are using the latest OMS agents version on our servers, 10.20.18053.0.
Figured. Yes – the latest MMA from Azure has this same issue. Until they address it, you should remediate those agents using my recovery.
Is there any solution to automate this task? We have 200 servers in a warning state…
Hi Kevin,
I have enabled the task and have also manually triggered the recovery task (it ran successfully), but the state is still a warning, even after recalculating health. Should we reset the monitor manually, or how long should we wait for the monitor to become healthy?
Recalculate doesn’t work. Recalculate doesn’t work on 90% of the monitors out there. It requires the monitor to have a special on-demand calculation configured inside the monitor XML.
You simply need to wait 24 hours, or change the frequency this monitor runs (86400 seconds by default).
We’ve got Service Manager in our environment and it deploys an old version of the MMA on SCSM servers in a way that precludes patching. I can’t find anything about remediating this vulnerability on Service Manager servers. Will the recovery task take care of this?
Thanks
I do not believe this is an issue on Service Manager.
Kevin: I have some servers with the SCOM agent version 10.20.18053.0 that still have this high level of privilege issue. I tried to install KB4601269 but it won’t let me. What can I do here?
Use my recovery?