Logical Disk Availability is critical – what does this mean?
You might have noticed a Logical Disk Availability monitor showing red on some of your systems.
In more recent MPs, this monitor was renamed to “File System error or corruption”.
One of the challenges with this monitor is that, while it has really good product knowledge, the state change context doesn’t give us much to go on.
“BAD” is really not enough information to go do something to a production server.
If you look at the product knowledge, which is pretty good, it mentions these causes:
- Related physical disk has been removed
- Physical disk has become corrupt (for example, bad sectors) or inoperable
- Problem with physical disk driver
As for resolutions:
- Open the Disk Management snap-in.
- Rescan the disks and then reactivate any disks with errors.
- Resynchronize or regenerate the volume as necessary if the disk was a member of a mirrored or RAID-5 volume.
- Run chkdsk on any reactivated volumes.
But what if you didn’t see any problems? Then what? It’s not like we are going to run off and run a chkdsk on a production server if we don’t see anything wrong or know about any previous disk issues.
At that point, it is good to know what this monitor is actually doing. If you look at the MP XML, or follow the Monitor > MonitorType > DataSource in the Authoring Console, you will see this monitor runs a script every 5 minutes (Microsoft.Windows.Server.LogicalDiskHealthCheck.vbs).
While the script does MANY checks, the primary driver of the “BAD” state is a single item: a WMI query (against the Win32_Volume class, or Win32_LogicalDisk on older operating systems) to see if the volume is marked as dirty.
You can check this yourself:
- Open WBEMTEST
- Connect to root\CIMV2
- Select Query, and paste in: “select * from Win32_Volume” (no quotes)
- (On older operating systems prior to Server 2003, you would need to run “select * from Win32_LogicalDisk where (DriveType=3 or DriveType=6) and FileSystem != null” instead, again with no quotes.)
- Double-click each line that was output by the query.
- On the right side – click SHOW MOF
- Scroll all the way down in the list to “DirtyBitSet” (or use “VolumeDirty” if you ran the second query for old OS versions)
If DirtyBitSet or VolumeDirty = True, then this monitor will be “Bad”.
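If you would rather not click through WBEMTEST on every server, the same check can be scripted. Here is a minimal sketch in Python, assuming a Windows machine with PowerShell and the Get-CimInstance cmdlet available. It is not what the MP script does internally; it just runs the same Win32_Volume query from the steps above and lists the volumes where DirtyBitSet is true. The function names (parse_dirty_volumes, query_dirty_volumes) are made up for this example.

```python
import json
import subprocess

# Same WQL as the WBEMTEST steps, limited to the two properties we care about.
# Assumption: PowerShell with Get-CimInstance is available on the server.
PS_COMMAND = (
    "Get-CimInstance -Query 'SELECT Name, DirtyBitSet FROM Win32_Volume' "
    "| Select-Object Name, DirtyBitSet | ConvertTo-Json"
)


def parse_dirty_volumes(json_text):
    """Return the names of volumes whose DirtyBitSet is true."""
    if not json_text.strip():
        return []  # the query returned no volumes
    data = json.loads(json_text)
    if isinstance(data, dict):
        # ConvertTo-Json emits a bare object (not an array) for a single result
        data = [data]
    # DirtyBitSet can be null for some volumes, so compare against True explicitly
    return [vol["Name"] for vol in data if vol.get("DirtyBitSet") is True]


def query_dirty_volumes():
    """Run the WMI query locally through PowerShell (Windows only)."""
    result = subprocess.run(
        ["powershell", "-NoProfile", "-Command", PS_COMMAND],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_dirty_volumes(result.stdout)


if __name__ == "__main__":
    for name in query_dirty_volumes():
        print(f"{name}: dirty bit is set")
```

Any volume this prints is one that the monitor would flag as “Bad”. On the older operating systems mentioned above, you would have to swap in the Win32_LogicalDisk query and read VolumeDirty instead.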
What this means is that at some point this volume got an NTFS error, or was removed from the OS in a critical manner. It *requires* a chkdsk /f to be run against this volume to reset DirtyBitSet or VolumeDirty to FALSE.
So if you see this monitor go red, you can double-check it by running the simple WMI query, and then just schedule a chkdsk on the volume during the next available maintenance window.