How Logical Disk free space monitoring works in SCOM

There have been many posts about this over the years, and this is nothing new or groundbreaking. However, I still run into customers who don’t fully understand how this works or how to configure it, and some of the good resources that documented it in the past have disappeared. So, this article is dedicated to the topic of free space monitoring for disks in SCOM.

I’ll be using the Windows Server 2016 (and later) MP, version 10.1.0.0, for the examples.

The monitors:

image

The Blue arrow points to an old, deprecated monitor that is no longer used and is disabled by default. Ignore it, and do NOT use it.

Next – the Green arrows. These were created years ago and, honestly, they probably caused more harm than good. They were meant to meet a customer request for “simpler” disk monitoring, making “Percent only” or “Megabytes free only” monitoring more user friendly. In my opinion, they complicated disk monitoring and should never have been put into the MP. They are disabled out of the box and I do not recommend them.

Lastly – the Red arrow points to the current and correct monitor for Windows Server 2016 (and later) disks: “Logical Disk Free Space Monitor”. This will be the focus of our attention moving forward.

 

First – understand when alerts are generated.  By default, the monitor will only generate alerts when the monitor is in a Critical health state.  Warnings are for showing state/health only.

image

Second – understand that this monitor treats SYSTEM disks (typically the C: drive) differently than other (non-system) disks. The default thresholds for System Disks are different than the default thresholds for all other disks. This is because System Disks (C:) are typically smaller than other disks, and have less free space under normal conditions.

Third – understand the unhealthy calculation of the monitor. This monitor is unusual in that TWO thresholds MUST be breached before it detects an unhealthy condition. BOTH the Free Space in MB threshold AND the Percentage free disk space threshold must be breached for the monitor to change state. This is important, and very much by design. Instead of monitoring for one or the other, using BOTH allows you to monitor large disks and small disks effectively: a percentage alone can fire too early on large disks, and a fixed MB value alone can fire too easily on small ones. Above you can see the default thresholds.

image

 

Let’s recap:

  1. Alerts are only generated when the monitor is critical.
  2. System Disks (C:) have different thresholds than Non-System disks.
  3. BOTH a free space (MB) threshold, and a free space (%) threshold must be met for the monitor to change state.
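
The three points above can be sketched in a few lines of Python. This is only an illustration of the logic as described in this article, not the MP's actual implementation: the `Thresholds` type, the function names, and the strict "less than" comparison are assumptions, and the numbers are placeholders (the 5% / 300 MB critical pair for system disks is the one discussed in the comments below; the warning pair is made up for the example).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    """Hypothetical container for one disk type's four threshold values."""
    warning_pct: float
    warning_mb: float
    critical_pct: float
    critical_mb: float


# Illustrative values only; check the real defaults in your own MP version.
SYSTEM_DISK = Thresholds(warning_pct=10, warning_mb=500,
                         critical_pct=5, critical_mb=300)


def disk_state(free_mb: float, size_mb: float, t: Thresholds) -> str:
    """Return the health state; BOTH the MB and % thresholds must be breached."""
    pct_free = 100.0 * free_mb / size_mb
    if free_mb < t.critical_mb and pct_free < t.critical_pct:
        return "Critical"
    if free_mb < t.warning_mb and pct_free < t.warning_pct:
        return "Warning"
    return "Healthy"


def raises_alert(state: str) -> bool:
    """By default, only the Critical state generates an alert."""
    return state == "Critical"


if __name__ == "__main__":
    size_mb = 100 * 1024  # a 100 GB system disk
    for free_mb in (2000, 400, 250):
        state = disk_state(free_mb, size_mb, SYSTEM_DISK)
        print(free_mb, state, raises_alert(state))
```

Note how the 100 GB disk with 2000 MB free is below 5% free but still healthy in this sketch, because the MB threshold has not been breached.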

The alerts are good, because they contain both the Percent Free Space and MB Free Space values in the Alert Description:

image

 

Customizing the monitor thresholds:

Customers differ widely, and everyone seems to have a different opinion on how they wish to monitor free space. Some customers like this default model. However, one of the most common requests I get is “I want Percent ONLY monitoring of disks”, as a simpler approach.

You can configure “Percent ONLY” monitoring easily with the built-in disk space monitor, simply by overriding the “MB free” threshold to a VERY high number. This puts that half of the equation permanently into a breached state, so only the Percent Free side needs to trigger. For this, I like to use “seven nines”, or “9999999”, which equates to requiring roughly 10 terabytes of free space. Since almost NO disks have that much free space, that part of the threshold will always evaluate as “breached”.

image

The above is just an example. I am not saying I recommend “Percent ONLY” monitoring of disk space, just showing what is easily possible.
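
To see why the “seven nines” override leaves only the percentage in play, here is a small Python sketch. It assumes the monitor compares with a strict "less than" and that both sides must be breached, as described above; the `is_critical` function and the 5% / 2000 MB pair are illustrative, not taken from the MP itself.

```python
PERCENT_ONLY_MB = 9_999_999  # the "seven nines" override value from the article


def is_critical(free_mb: float, size_mb: float,
                critical_pct: float, critical_mb: float) -> bool:
    """Both conditions must be breached (assumed strict '<' comparison)."""
    pct_free = 100.0 * free_mb / size_mb
    return free_mb < critical_mb and pct_free < critical_pct


# 9,999,999 MB is roughly 10 TB in decimal units (about 9.54 TiB in binary units).
override_in_tib = PERCENT_ONLY_MB / (1024 * 1024)

# A 2 TiB data disk with 80,000 MB free is about 3.8% free.
size_mb, free_mb = 2 * 1024 * 1024, 80_000

# With an illustrative 5% / 2000 MB pair, the MB side is not breached.
default_result = is_critical(free_mb, size_mb, 5, 2000)

# With the override, the MB side is always breached, so 5% alone decides.
percent_only_result = is_critical(free_mb, size_mb, 5, PERCENT_ONLY_MB)

print(round(override_in_tib, 2), default_result, percent_only_result)
```

The same disk that stays healthy under the two-threshold pair goes critical once the MB side is overridden, because only the percentage is left to decide.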

***Note:  If you are interested in some advanced and customized disk space monitoring, check out Tim’s replacement of the built-in SCOM OS Disk Space monitoring.  His solution could be customized to show a LOT more advanced data in a customized alert description.  You can also add auto-cleanup recoveries, either into the datasource script itself, or in a recovery action assigned to the monitor.  https://www.scom2k7.com/sick-of-explaining-to-end-users-why-they-didnt-get-a-disk-monitor-alert/

Windows Server 2012 Logical Disk monitors work exactly the same way; however, the Display Names are not consistent (by design – because the 2016 and later MP will become OS version agnostic) – so be aware.  The disabled one with the blue arrow is deprecated, and the disabled ones with the Green arrows are the ones I don’t recommend using.  The “good” one is labeled “Windows 2012 Logical Disk Free Space Monitor”.

image

 

Summary:

The “out of the box” monitoring for disk free space in SCOM is actually quite good in my opinion.  The only things most people need to understand are the three major points in the recap above; then review whether the default thresholds make sense for your organization.

12 Comments

  1. Florian F.

    Hi Kevin,
    We in our company often face issues when it comes to disks >500GB. Percentage is quite good but sometimes alerts a little too early. Absolute values of 1-10 GB sometimes alert too late. Is there a scaling disk space monitor? It would be nice to have default monitors which cover disks from 10 GB to 2-3 TB.
    Regards
    Florian

    • Kevin Holman

      That is the power of Overrides, and Groups. I agree, when you get to a 500GB disk, alerting when only 2GB is free is likely too late, while a pure 5% might be a bit too early. In those cases, we make simple groups.

      Since the SIZE of the disk is a property, you can easily create a “Large Disks Group” where they are over 500GB. Instead of 5% (25GB) and 2000 MB (2GB), you could set the critical threshold Free MB to 10GB (for the Large Disks Group)…. which would then generate a critical alert at that new threshold. You can create a “Huge Disks Group” for disks over 2TB and do something similar… What I often find, is on the huge disks, we might need different thresholds based on the application owner for those disks, and how the application uses disk space. SCOM is very flexible, allowing overrides at the class level, then group level, then instance level. The most specific override wins.
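
As a rough illustration of the group approach described in this reply, the Python sketch below resolves which critical “Free MB” threshold applies to a disk. It is a simplified model, not SCOM's override engine: the function name, the server and drive names, and the “Huge Disks Group” and instance values are hypothetical, and checking the larger size group first is a simplifying assumption.

```python
# Class-level value: the 2000 MB figure used in the reply above.
CLASS_CRITICAL_MB = 2000

# (group name, disks larger than this size in MB, critical Free MB override)
# Listed largest first so a 3 TB disk matches "Huge" before "Large".
GROUP_OVERRIDES = [
    ("Huge Disks Group", 2 * 1024 * 1024, 51_200),   # hypothetical value
    ("Large Disks Group", 500 * 1024, 10_240),       # 10 GB, as in the reply
]

# Hypothetical per-instance override for one application owner's disk.
INSTANCE_OVERRIDES = {("sql01", "E:"): 20_480}


def effective_critical_mb(server: str, drive: str, size_mb: int) -> int:
    """Most specific override wins: instance, then group, then class."""
    instance_value = INSTANCE_OVERRIDES.get((server, drive))
    if instance_value is not None:
        return instance_value
    for _group_name, min_size_mb, group_value in GROUP_OVERRIDES:
        if size_mb > min_size_mb:
            return group_value
    return CLASS_CRITICAL_MB


if __name__ == "__main__":
    print(effective_critical_mb("web01", "D:", 200 * 1024))       # class level
    print(effective_critical_mb("web01", "E:", 800 * 1024))       # large group
    print(effective_critical_mb("fs01", "F:", 3 * 1024 * 1024))   # huge group
    print(effective_critical_mb("sql01", "E:", 800 * 1024))       # instance
```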

  2. Rick Bywalski

    So you say alerts only happen when the monitor is critical by default. I would like to get alerts for warnings as well. How do I make that happen? Our idea is to address the alerts when they are in a warning state, giving us plenty of time to get change control in place and expand the disk.

    • Kevin Holman

      You can do that with a simple override. You must take care, however: if you are using a connector or ticketing system, you might miss when an ignored Warning alert goes critical. This is because when a monitor changes from warning to critical, but alerts in both states, it simply UPDATES the existing alert from Warning to Critical. If your ticketing system uses custom resolution states, it might cause you to miss the critical alert, as the alert was already examined when it was warning. You would also need to test whether notification emails re-fire during that change, as this behavior has changed across SCOM versions and UR levels over the years based on customer feedback and noise control.

  3. Erik

    We took the % out of the equation.
    With Windows updates being so big nowadays, we wanted a critical error at <10GB on C:
    We so often got failed Windows updates just because C: ran out of space, and the default value was <300MB.

    • Kevin Holman

      One of my customers did this. They mandated 5GB of space minimum on C: for Windows Updates. Their smallest C: drives were 30GB. A warning threshold was set to 20% and 5000MB. This ensured that the disk would go Yellow at those thresholds, and we use a dashboard for all unhealthy disks for remediation. We also attached a Recovery action, to clean up the C: drive space automatically, based on their typical manual actions, but only enabled this recovery for C: drives by using a group and overrides.

  4. Atif Aman

    Hi Kevin, You suggested to override the free disk MB to 7x (9999999), if customer wants to monitor by %age.
    Don’t you think it will never breach the threshold for smaller disks because, BOTH a free space (MB) threshold, and a free space (%) threshold must be met for the monitor to change state.

    • Kevin Holman

      When the amount of free space is less than 9,999,999 MB (roughly 10 terabytes), this half of the equation is met: the threshold is breached, because NONE of your disks have 10TB of free space, especially disks smaller than 10TB. Now only the percentage threshold must be met to change state.

  5. boukazoula el mahdi

    hello kevin;
    i need documents how to create alerts for disks and memory and processor, i am a debutant and i have installed scom 2016. currently i need to configure wholes. i need your help.
    thank you in advance

  6. ROMAIN

    Hello Kevin,

    You write “BOTH a free space (MB) threshold, and a free space (%) threshold must be met for the monitor to change state”. Does that mean that both conditions have to be met?
    If yes, those default settings are useless on Windows 2016 Servers.

    With 5% and 300MB thresholds, a critical alert will only occur at 300MB free space. Whatever the size of the system disk is, we must wait for the 300MB threshold.
    Who uses virtual Windows Servers with 6GB System Drives? 5% (threshold) of 6GB = 300MB (threshold)

    Why not create the monitor with an OR condition, so that the first threshold reached generates an alert?

    Thanks

    • Kevin Holman

      I think you are missing the point of the design. The design is not so that BOTH thresholds will be met at exactly the same time/value; it is simply that both thresholds should be met, to account for really big and really small disks. You are correct: in this case, the 300MB must be met on a system drive, no matter the disk size (since nobody has a 6GB system disk). This is by design, to limit noise. You cannot believe the number of customers I work with that have less than 500MB free on their C drives, and that is “normal” to them. Now, it is a terrible practice to run this way, simply because of patching and how much temporary space that requires. But the idea is that these are DEFAULT thresholds to control noise, and that you as the monitoring owner should CHANGE THEM to whatever your standards are. There is no default value we could ever set that will work for everyone. So set them to what works for you. One time, as soon as you implement. MPs cannot be set to a “best practice” out of the box, as we have learned, because MOST customers don’t follow best practices – then complain SCOM is too noisy to use.
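
The arithmetic in this exchange can be checked with a short Python sketch. It assumes only what the article states, that both thresholds must be breached. So the monitor effectively changes state at whichever threshold is the smaller amount of free space for a given disk. Both function names are made up for the illustration.

```python
def crossover_size_mb(threshold_pct: float, threshold_mb: float) -> float:
    """Disk size at which the % threshold and the MB threshold coincide."""
    return threshold_mb * 100.0 / threshold_pct


def effective_trigger_mb(size_mb: float, threshold_pct: float,
                         threshold_mb: float) -> float:
    """Free space (MB) below which BOTH thresholds are breached."""
    return min(threshold_mb, size_mb * threshold_pct / 100.0)


if __name__ == "__main__":
    # 5% and 300 MB coincide on a disk of about 6 GB, as noted in the question.
    print(crossover_size_mb(5, 300))

    # On a 30 GB system disk the 300 MB side is the one that decides.
    print(effective_trigger_mb(30 * 1024, 5, 300))

    # The 20% / 5000 MB warning pair from an earlier reply, on a 30 GB disk.
    print(effective_trigger_mb(30 * 1024, 20, 5000))
```

For any disk larger than the crossover size, the MB value is the binding threshold; for smaller disks, the percentage is.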

  7. ROMAIN

    Hi Kevin,
    Thank you for your reply and a very big thank you for all your articles, which are a very valuable help for me.

    I have to juggle with the requirements of users who think that monitoring can warn them before a problem happens 🙂
    Playing with the different thresholds, I think I’ve found now a suitable configuration for both small and big discs.

    Regards
