Here is a simplified view of how alerts are groomed.
Grooming of the ops DB runs once per day at 12:00 AM, triggered by the rule “Partitioning and Grooming”. You can search for this rule in the Authoring space of the console, under Rules. It is targeted to the “Root Management Server” and is part of the System Center Internal Library.
The rule calls the “p_PartitioningAndGrooming” stored procedure, which calls p_Grooming, which calls p_GroomNonPartitionedObjects (alerts are not partitioned). That procedure inspects the PartitionAndGroomingSettings table and executes each stored procedure listed there. The stored procedure listed for alerts is p_AlertGrooming, which contains the following SQL statement:
SELECT AlertId
INTO #AlertsToGroom
FROM dbo.Alert
WHERE TimeResolved IS NOT NULL
  AND TimeResolved < @GroomingThresholdUTC
  AND ResolutionState = 255
So the criteria for what gets groomed are pretty simple: the alert is in a resolution state of “Closed” (255) and is older than the 7-day default setting (or your custom setting in the table referenced above).
We will ONLY groom out alerts that are CLOSED, and where the TimeResolved timestamp (the column the statement above tests) is older than the set threshold in days.
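To make the selection rule concrete, here is a minimal Python sketch that mirrors the WHERE clause of p_AlertGrooming shown above. It is only a model for reasoning about which alerts qualify, not how SCOM implements grooming. The function name, the dictionary layout, and the sample alerts are hypothetical; the column names and the value 255 come from the SQL statement.

```python
from datetime import datetime, timedelta, timezone

CLOSED = 255  # resolution state "Closed", as tested in the SQL statement


def alerts_to_groom(alerts, now_utc, days_to_keep=7):
    """Return the AlertIds that the grooming predicate would select.

    Hypothetical helper: it only models the three WHERE conditions.
    """
    threshold = now_utc - timedelta(days=days_to_keep)
    return [
        a["AlertId"]
        for a in alerts
        if a["TimeResolved"] is not None        # TimeResolved IS NOT NULL
        and a["TimeResolved"] < threshold       # TimeResolved < @GroomingThresholdUTC
        and a["ResolutionState"] == CLOSED      # ResolutionState = 255
    ]


# Made-up sample data, evaluated as of a fixed point in time.
now = datetime(2019, 2, 11, 0, 0, tzinfo=timezone.utc)
sample_alerts = [
    {"AlertId": "closed-old", "TimeResolved": now - timedelta(days=10), "ResolutionState": 255},
    {"AlertId": "closed-recent", "TimeResolved": now - timedelta(days=2), "ResolutionState": 255},
    {"AlertId": "open-old", "TimeResolved": None, "ResolutionState": 0},
]

print(alerts_to_groom(sample_alerts, now))  # ['closed-old']
```

Only the alert that is closed and was resolved more than seven days ago is selected. The open alert is never groomed, no matter how old it is.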
Ok – that covers grooming.
However – I can see that brings up the question – how does auto-resolution work?
The first equation in the UI states “Resolve all active alerts in the new resolution state”. That wording is incorrect and is a bug in the UI. I’ll explain in more detail.
There is a rule in SCOM named “Alert Auto Resolve Execute All”, which runs p_AlertAutoResolveExecuteAll once per day at 4:00 AM. This calls p_AlertAutoResolve twice: once with @AutoResolveType set to 0 and once with it set to 1.
Here is the SQL statement:
IF (@AutoResolveType = 0)
BEGIN
    SELECT @AlertResolvePeriodInDays = [SettingValue]
    FROM dbo.[GlobalSettings]
    WHERE [ManagedTypePropertyId] = dbo.fn_ManagedTypePropertyId_MicrosoftSystemCenterManagementGroup_HealthyAlertAutoResolvePeriod()

    SET @AutoResolveThreshold = DATEADD(dd, -@AlertResolvePeriodInDays, getutcdate())
    SET @RootMonitorId = dbo.fn_ManagedTypeId_SystemHealthEntityState()

    -- We will resolve all alerts that have green state and are un-resolved
    -- and haven't been modified for N number of days.
    INSERT INTO @AlertsToBeResolved
    SELECT A.[AlertId]
    FROM dbo.[Alert] A
    JOIN dbo.[State] S
        ON A.[BaseManagedEntityId] = S.[BaseManagedEntityId]
        AND S.[MonitorId] = @RootMonitorId
    WHERE A.[LastModified] < @AutoResolveThreshold
      AND A.[ResolutionState] <> 255
      AND S.[HealthState] = 1
<snip>
ELSE IF (@AutoResolveType = 1)
BEGIN
    SELECT @AlertResolvePeriodInDays = [SettingValue]
    FROM dbo.[GlobalSettings]
    WHERE [ManagedTypePropertyId] = dbo.fn_ManagedTypePropertyId_MicrosoftSystemCenterManagementGroup_AlertAutoResolvePeriod()

    SET @AutoResolveThreshold = DATEADD(dd, -@AlertResolvePeriodInDays, getutcdate())

    -- We will resolve all alerts that are un-resolved
    -- and haven't been modified for N number of days.
    INSERT INTO @AlertsToBeResolved
    SELECT A.[AlertId]
    FROM dbo.[Alert] A
    WHERE A.[LastModified] < @AutoResolveThreshold
      AND ResolutionState <> 255
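So we are basically checking that the resolution state DOES NOT EQUAL 255, not specifically “New” (0) as the wording in the UI would lead you to believe. To be clear, the first equation will resolve (close) ALL alerts that are not already closed and have not been modified within the configured number of days, whatever resolution state they are in.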
The second equation states: “Resolve all active alerts when the alert source is healthy after….” Since this deals with health state, you might assume it only resolves monitor-based alerts. However, that is NOT the case. This half of the equation simply looks at ALL open alerts, regardless of whether they were generated by a monitor or a rule. If the targeted alert source is Healthy in SCOM, and the LastModified timestamp is older than the set number of days, we will auto-resolve the alert. Period.
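The two branches of p_AlertAutoResolve can be modeled in a few lines of Python to check which alerts each one picks up. This is a rough sketch under stated assumptions, not the stored procedure itself: the function name, the dictionary layout, and the "SourceHealthState" key (standing in for the joined State row of the alert source) are hypothetical, and the sample alerts are made up. The tested values 255 and 1 come from the SQL above.

```python
from datetime import datetime, timedelta, timezone

CLOSED = 255   # ResolutionState <> 255 in both branches
HEALTHY = 1    # S.[HealthState] = 1 in the type 0 branch


def alerts_to_auto_resolve(alerts, auto_resolve_type, period_days, now_utc):
    """Model the selection made by one call for a given @AutoResolveType.

    Type 0: source is healthy and alert is unmodified for period_days.
    Type 1: alert is unmodified for period_days, health is not checked.
    """
    threshold = now_utc - timedelta(days=period_days)
    selected = []
    for a in alerts:
        # Shared conditions: LastModified < threshold AND not already closed.
        if a["LastModified"] >= threshold or a["ResolutionState"] == CLOSED:
            continue
        # Only the type 0 branch looks at the health state of the source.
        if auto_resolve_type == 0 and a["SourceHealthState"] != HEALTHY:
            continue
        selected.append(a["AlertId"])
    return selected


now = datetime(2019, 2, 11, 4, 0, tzinfo=timezone.utc)
sample_alerts = [
    # Stale alert in state New; source is not healthy (3 is a sample value).
    {"AlertId": "stale-unhealthy", "LastModified": now - timedelta(days=40),
     "ResolutionState": 0, "SourceHealthState": 3},
    # Alert in a custom resolution state (85 is a sample value); source is healthy.
    {"AlertId": "custom-state-healthy", "LastModified": now - timedelta(days=10),
     "ResolutionState": 85, "SourceHealthState": 1},
    {"AlertId": "recent-unhealthy", "LastModified": now - timedelta(days=10),
     "ResolutionState": 0, "SourceHealthState": 3},
    {"AlertId": "already-closed", "LastModified": now - timedelta(days=40),
     "ResolutionState": 255, "SourceHealthState": 1},
]

print(alerts_to_auto_resolve(sample_alerts, 1, 30, now))  # ['stale-unhealthy']
print(alerts_to_auto_resolve(sample_alerts, 0, 7, now))   # ['custom-state-healthy']
```

Note that the alert in the custom resolution state is selected by the type 0 branch even though it is not in state New, which matches the point about the UI wording: the only resolution state excluded is 255.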
In summary, there are two types of built-in alert auto-resolution, with the following default settings:
- Resolve ALL open alerts, no matter what the source (rule or monitor), where the alert has not been modified within the last 30 days (30 days is the default value).
- Resolve ALL open alerts, no matter what the source (rule or monitor), where the targeted object has a healthy state and the alert has not been modified within the last 7 days (7 days is the default value).
–Edited 2/11/2019 to correct previous errors