Time sync is critical in today’s networks. Time drift across devices can break authentication, skew reporting, and wreak havoc on interconnected systems. This article presents a demo management pack that monitors time sync across your Windows devices.
The basic idea is to monitor all systems and compare their local time against a target reference time server, using W32Time. Here is the command, run from PowerShell:
$cmd = w32tm /stripchart /computer:$RefServer /dataonly /samples:$Samples
The script will take two parameters, the reference server and the threshold for how much time drift is allowed.
Here is the PowerShell script:
#=================================================================================
# Time Skew Monitoring Script
# Kevin Holman
# Version 1.0
#=================================================================================
param([string]$RefServer,[int]$Threshold)
#=================================================================================
# Constants section - modify stuff here:
# Assign script name variable for use in event logging
$ScriptName = "Demo.TimeDrift.PA.ps1"
# Set samples to the number of w32time samples you wish to include
[int]$Samples = '1'
# For testing - assign values instead of parameters to the script
#[string]$RefServer = 'dc1.opsmgr.net'
#[int]$Threshold = '10'
#=================================================================================
# Gather script start time
$StartTime = Get-Date
# Gather who the script is running as
$WhoAmI = whoami
# Load MomScript API and PropertyBag function
$momapi = new-object -comObject 'MOM.ScriptAPI'
$bag = $momapi.CreatePropertyBag()
#Log script event that we are starting task
$momapi.LogScriptEvent($ScriptName,9250,0, "Starting script")
#Start MAIN body of script:
#Getting the required data
$cmd = w32tm /stripchart /computer:$RefServer /dataonly /samples:$Samples
IF ($cmd -match 'error')
{
    #Log error and quit
    $momapi.LogScriptEvent($ScriptName,9250,2, "Getting TimeDrift from Reference Server returned an error . Reference server is ($RefServer). Output of command is ($cmd)")
    exit
}
ELSE
{
    #Assume we got good results from cmd
    $Skew = $cmd[-1..($Samples * -1)] | ConvertFrom-Csv -Header "Time","Skew" | Select -ExpandProperty Skew
    $Result = $Skew | % { $_ -replace "s","" } | Measure-Object -Average | select -ExpandProperty Average
}
#The problem is that you can have time skew in two directions: positive or negative. You can do two
#things: create an IF statement that does check both or just create a positive number.
IF ($Result -lt 0) { $Result = $Result * -1 }
$TimeDriftSeconds = [math]::Round($Result,2)
#Determine if the average time skew is higher than your threshold and report this back to SCOM.
IF ($TimeDriftSeconds -gt $Threshold)
{
    $bag.AddValue("TimeSkew","True")
    $momapi.LogScriptEvent($ScriptName,9250,2, "Time Drift was detected. Reference server is ($RefServer). Threshold is ($Threshold) seconds. Value is ($TimeDriftSeconds) seconds")
}
ELSE
{
    $bag.AddValue("TimeSkew","False")
    #Log good event for testing
    #$momapi.LogScriptEvent($ScriptName,9250,0, "Time Drift was OK. Reference server is ($RefServer). Threshold is ($Threshold) seconds. Value is ($TimeDriftSeconds) seconds")
}
#Add stuff into the propertybag
$bag.AddValue("RefServer",$RefServer)
$bag.AddValue("Threshold",$Threshold)
$bag.AddValue("TimeDriftSeconds",$TimeDriftSeconds)
#Log an event for script ending and total execution time.
$EndTime = Get-Date
$ScriptTime = ($EndTime - $StartTime).TotalSeconds
$ScriptTime = [math]::Round($ScriptTime,2)
$momapi.LogScriptEvent($ScriptName,9250,0,"`n Script has completed. `n Reference server is ($RefServer). `n Threshold is ($Threshold) seconds. `n Value is ($TimeDriftSeconds) seconds. `n Runtime was ($ScriptTime) seconds.")
#Output the propertybag
$bag
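The parsing step in the script can be tried offline, without calling w32tm. The following is a minimal sketch, not the MP's own code: the function names are hypothetical, and the sample lines are hand-written to resemble `/stripchart /dataonly` output (the error row matches the form quoted in the comments on this post; the offset format is an assumption). It skips error rows instead of exiting, averages the remaining offsets, and takes the absolute value.

```powershell
# Hypothetical helper: pull numeric offsets (seconds) out of stripchart text lines.
function ConvertFrom-StripchartOutput {
    param([string[]]$Lines)
    foreach ($line in $Lines) {
        # Data rows are assumed to look like "07:02:38, +00.0012345s".
        # Header rows and "error: 0x..." rows do not match and are skipped.
        if ($line -match '^\d{2}:\d{2}:\d{2},\s*([+-]\d+\.\d+)s\s*$') {
            [double]::Parse($Matches[1], [cultureinfo]::InvariantCulture)
        }
    }
}

# Hypothetical helper: average the offsets, then report a positive, rounded value.
function Get-AbsoluteAverageSkew {
    param([double[]]$Offsets)
    if (-not $Offsets -or $Offsets.Count -eq 0) { return $null }
    $avg = ($Offsets | Measure-Object -Average).Average
    [math]::Round([math]::Abs($avg), 2)
}

# Hand-written sample input (illustrative values only).
$sample = @(
    'Tracking dc1.opsmgr.net [10.0.0.1:123].',
    'Collecting 3 samples.',
    'The current time is 1/21/2023 7:02:38 AM.',
    '07:02:38, +00.5000000s',
    '07:02:40, -02.0000000s',
    '07:02:42, error: 0x800705B4'
)

$offsets = @(ConvertFrom-StripchartOutput -Lines $sample)
$drift   = Get-AbsoluteAverageSkew -Offsets $offsets
```

Matching each row with a regular expression avoids relying on the position of the last lines in the output, at the cost of silently ignoring rows that fail to parse.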
Next, we will put the script into a Probe action, which will be called by a Datasource with a scheduler. We break this out because we want to “share” this datasource between a monitor and a rule. The monitor watches for time skew, while the rule collects the skew as a perf counter, so we can watch for trends in the environment.
So the key components of the MP are the DS, the PA (containing the script), the MonitorType and the Monitor, the Perf collection rule, and some views to show this off:
When a threshold is breached, the monitor raises an alert:
The performance view will show you the trending across your systems:
On the monitor (and rule) you can modify the reference server:
One VERY IMPORTANT concept: if you change anything, you must make identical overrides on BOTH the monitor and the rule. Otherwise you will break cookdown, and the script will run twice for each interval. So be sure to set the IntervalSeconds, RefServer, and Threshold the same on both the monitor and the rule. If you want the monitor to run much more frequently than the default of once an hour, that’s fine, but you might not want the perf data collected more than once per hour. In that case the differing intervals do break cookdown, but the extra script run only happens once per hour, which is probably less of an impact than overcollecting performance data.
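As a rough illustration of the "identical overrides" rule, the sketch below compares two hand-typed sets of override values and lists the parameters that differ. It does not query SCOM. Compare-OverrideSet and the sample values are hypothetical; only the three parameter names come from the text above.

```powershell
# Hypothetical helper: return the names of parameters whose values differ.
function Compare-OverrideSet {
    param(
        [hashtable]$Monitor,
        [hashtable]$Rule,
        [string[]]$Names = @('IntervalSeconds', 'RefServer', 'Threshold')
    )
    foreach ($name in $Names) {
        # Compare as strings so 3600 and '3600' count as equal.
        if ("$($Monitor[$name])" -ne "$($Rule[$name])") { $name }
    }
}

# Illustrative values: the monitor was changed to 10 minutes, the rule was not.
$monitorOverrides = @{ IntervalSeconds = 600;  RefServer = 'dc1.opsmgr.net'; Threshold = 10 }
$ruleOverrides    = @{ IntervalSeconds = 3600; RefServer = 'dc1.opsmgr.net'; Threshold = 10 }

$mismatches = @(Compare-OverrideSet -Monitor $monitorOverrides -Rule $ruleOverrides)
```

Any name returned here is a parameter on which the monitor and rule would no longer share a single script execution.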
From here, you could add in a recovery to force a resync of w32time if you wanted, or add in additional alert rules for w32time events.
The example MP is available here:
How do I add multiple reference servers?
That kind of defeats the design of time sync. You use an authoritative source and you sync with it. The MP allows this to be overridden for agents who need a different authoritative source.
Hey Kevin,
What if I want to make sure that the PDC itself is not out of sync with the external NTP source? Can we achieve that by comparing the time of the DC with an external time source?
Sure, just change the name in the override, only for the PDC.
OK, how do I create the override only for the PDC?
Hi Kevin, very helpful and I just implemented it. How can I set the threshold to 50ms? Also, how can I run the monitor every 10 minutes without impacting the performance counters?
Thanks
Jay
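On the 50 ms question above: the script declares $Threshold as [int] and rounds the measured drift to two decimals, so a sub-second threshold cannot be expressed as written. The following is a minimal sketch, not part of the published MP, of how the comparison could look with fractional seconds. Test-TimeDrift is a hypothetical helper name, and the sketch assumes the Threshold parameter in the MP's XML would also be changed to accept decimals, which the article does not confirm.

```powershell
# An [int] cast rounds fractional input, so a 0.05 second threshold becomes 0.
$asInt    = [int]'0.05'
$asDouble = [double]'0.05'

# Hypothetical helper: compare absolute drift to a fractional threshold (seconds).
function Test-TimeDrift {
    param([double]$DriftSeconds, [double]$ThresholdSeconds)
    # Keep millisecond precision instead of rounding to two decimals.
    $drift = [math]::Round([math]::Abs($DriftSeconds), 3)
    $drift -gt $ThresholdSeconds
}

# Illustrative values: 62 ms behind is over a 50 ms limit, 12 ms ahead is not.
$overLimit  = Test-TimeDrift -DriftSeconds (-0.062) -ThresholdSeconds 0.05
$underLimit = Test-TimeDrift -DriftSeconds 0.012 -ThresholdSeconds 0.05
```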
The override works only for the monitor, not for the rule.
Hi Kevin, I would like to download the Time Drift Management Pack from TechNet, but TechNet has been closed. Could you let me know where I can download the pack?
This is on GitHub: https://github.com/thekevinholman/MonitorTimeDrift
Hi Kevin, thanks for your great support.
Thank you Kevin for all the information that you share about SCOM. I was working with SCOM 2007 R2, but now I have installed SCOM 2016, and the knowledge that you share with us helps me clear up all my doubts and concerns.
Updated the script to use the current time provider configured on the server, and to retry if a sample gets an error:
#=================================================================================
# Time Skew Monitoring Script
# Kevin Holman
# Version 1.0
#=================================================================================
param([string]$RefServer,[int]$Threshold)
#=================================================================================
# Constants section - modify stuff here:
# Assign script name variable for use in event logging
$ScriptName = "Windows.TimeDrift.PA.ps1"
# Set samples to the number of w32time samples you wish to include
#[int]$Samples = '1'
# For testing - assign values instead of parameters to the script
#[string]$RefServer = 'dc1.opsmgr.net'
#[int]$Threshold = '10'
#=================================================================================
[int]$Samples = '3'
#Get current NTP Source of the server
$RefServer = w32tm /query /source
$RefServer = $RefServer.ToString().Replace(" ","")
if($RefServer -like "time.wind*")
{
    $RefServer= $RefServer.Split(",")[0]
}
#If the server doesn't have a source, force it to your main ntp server
if($RefServer -like "*Free-runningSystemClock*")
{
    $RefServer = "YOURNTPSERVERHERE"
    $Threshold = 300
}
#if your ntp server is not resolving try time sync with time.windows.com
if($RefServer -like "*YOURNTPSERVERHERE*")
{
    Try{
        $query = Resolve-DnsName $RefServer -QuickTimeout A -ErrorAction Stop
    }
    catch{
        $RefServer = "time.windows.com"
        $Threshold = 300
    }
}
# Gather script start time
$StartTime = Get-Date
# Gather who the script is running as
$WhoAmI = whoami
# Load MomScript API and PropertyBag function
$momapi = new-object -comObject 'MOM.ScriptAPI'
$bag = $momapi.CreatePropertyBag()
#Log script event that we are starting task
$momapi.LogScriptEvent($ScriptName,9250,0, "Starting script")
#Start MAIN body of script:
#Getting the required data
$cmd = w32tm /stripchart /computer:$RefServer /dataonly /samples:$Samples
#If you get an error on the first try, try it again
IF ($cmd -match '0x800705B4')
{
    Sleep 2
    $cmd = w32tm /stripchart /computer:$RefServer /dataonly /samples:$Samples
}
IF ($cmd -match 'error')
{
    #Log error and quit
    $momapi.LogScriptEvent($ScriptName,9250,2, "Getting TimeDrift from Reference Server returned an error . Reference server is ($RefServer). Output of command is ($cmd)")
    exit
}
ELSE
{
    #Assume we got good results from cmd
    $Skew = $cmd[-1..($Samples * -1)] | ConvertFrom-Csv -Header "Time","Skew" | Select -ExpandProperty Skew
    $Result = $Skew | % { $_ -replace "s","" } | Measure-Object -Average | select -ExpandProperty Average
}
#The problem is that you can have time skew in two directions: positive or negative. You can do two
#things: create an IF statement that does check both or just create a positive number.
IF ($Result -lt 0) { $Result = $Result * -1 }
$TimeDriftSeconds = [math]::Round($Result,2)
#Determine if the average time skew is higher than your threshold and report this back to SCOM.
IF ($TimeDriftSeconds -gt $Threshold)
{
    $bag.AddValue("TimeSkew","True")
    $momapi.LogScriptEvent($ScriptName,9250,2, "Time Drift was detected. Reference server is ($RefServer). Threshold is ($Threshold) seconds. Value is ($TimeDriftSeconds) seconds")
}
ELSE
{
    $bag.AddValue("TimeSkew","False")
    #Log good event for testing
    #$momapi.LogScriptEvent($ScriptName,9250,0, "Time Drift was OK. Reference server is ($RefServer). Threshold is ($Threshold) seconds. Value is ($TimeDriftSeconds) seconds")
}
#Add stuff into the propertybag
$bag.AddValue("RefServer",$RefServer)
$bag.AddValue("Threshold",$Threshold)
$bag.AddValue("TimeDriftSeconds",$TimeDriftSeconds)
#Log an event for script ending and total execution time.
$EndTime = Get-Date
$ScriptTime = ($EndTime - $StartTime).TotalSeconds
$ScriptTime = [math]::Round($ScriptTime,2)
$momapi.LogScriptEvent($ScriptName,9250,0,"`n Script has completed. `n Reference server is ($RefServer). `n Threshold is ($Threshold) seconds. `n Value is ($TimeDriftSeconds) seconds. `n Runtime was ($ScriptTime) seconds.")
#Output the propertybag
$bag
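The source-detection step in the updated script can be isolated into a small function that takes the text of `w32tm /query /source` as input, which makes it testable without a Windows time service. This is a sketch of a variant, not the commenter's exact logic: Resolve-TimeSource and the fallback name ntp.example.com are hypothetical, and the sample source strings are assumptions about typical output. Unlike the script above, it strips the ",0x..." flags from any source, not only from names starting with time.wind.

```powershell
# Hypothetical helper: turn "w32tm /query /source" text into a usable server name.
function Resolve-TimeSource {
    param(
        [string]$QueryOutput,
        [string]$FallbackServer   # assumption: your own NTP server goes here
    )
    $source = $QueryOutput.Trim()
    # Local clock sources mean the machine is not syncing from a peer.
    if ($source -match 'Free-running System Clock|Local CMOS Clock') {
        return $FallbackServer
    }
    # Manual peer lists can hold several entries with flags, for example
    # "time.windows.com,0x9 ntp2.example.com,0x1". Keep the first name only.
    $first = ($source -split '\s+')[0]
    ($first -split ',')[0]
}

$fromPeer  = Resolve-TimeSource -QueryOutput 'time.windows.com,0x9' -FallbackServer 'ntp.example.com'
$fromLocal = Resolve-TimeSource -QueryOutput 'Free-running System Clock' -FallbackServer 'ntp.example.com'
$fromDc    = Resolve-TimeSource -QueryOutput 'dc1.opsmgr.net ' -FallbackServer 'ntp.example.com'
```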
Is NTP time drift monitoring possible for Linux machines?
Can multiple instances of this per counter be set up simultaneously? EG I have a scenario where I have an authoritative time server that two machines sync to, but one of those machines is also a time server, and I want to see the delta between the two on the third machine, while also confirming the drift on the third machine against the original time server
You’d have to copy/paste the rule, or you’d have to rewrite the script to accept multiple time sources and multiple outputs.
I’ve found an elegant solution (in my view) for a contingent reference to the time server:
I replaced "dc1.opsmgr.net" with "$Target/Host/Property[Type="Windows!Microsoft.Windows.Computer"]/DomainDnsName$"
This resolves to the domain catch-all address which also calls the NTP server, in our case.
so:
3600
$Target/Host/Property[Type="Windows!Microsoft.Windows.Computer"]/DomainDnsName$
200
It doesn’t work when an agent is in a WORKGROUP, however.
Hi Kevin.
What do I need to change in the script and in the XML when there are servers from different domains with different DCs? Should I create a separate Windows.TimeDrift MP for each domain? How do I do that? Please help!
Create groups – then just use overrides.
I’m creating two rule overrides, one for each type of OS (is that right?), and selecting a group that I created earlier. Now a question: in the “Override properties” window, the “Select destination MP” area gives no option to select another MP. Only MP=”Windows TimeDrift” is selected, and it is greyed out. Is this normal? Thank you for your help, your site, and your time.
Oops. After clicking “Apply” in the Override properties window I get this error (maybe there is a solution?):
OpsMgr SDK Service error 23319 “an exception was thrown while processing TryUpdateManagementPackWithResources for session ID ….. Exception message: database error. MPInfra_p_ManagementPackInstall failed with exception:
Database error. MPInfra_p_ManagementPackInstall failed with exception:
Failed to validate item:
Alias…..OverrideForRuleWindowsTimeDriftPerfCollectionRuleForContextUINameSpace……Group
I’m seeing this same error and I’m not seeing all my groups when trying to override by groups. Is there a fix or update for this?
I’ve added the code below to the PowerShell script to find its root PDC (which is by default also the forest’s time source in an NT5DS hierarchy):
[..]
# Check if $RefServer is default dummy
IF ($RefServer -eq "dc1.opsmgr.net")
{
    # Get ForestRootPDC and assign it as $RefServer instead
    $DomainFQDN = (Get-WmiObject -Namespace root\cimv2 -Class Win32_ComputerSystem | Select Domain).Domain
    $context = new-object System.DirectoryServices.ActiveDirectory.DirectoryContext("Domain",$DomainFQDN)
    $ForestRootDomainFQDN = (([System.DirectoryServices.ActiveDirectory.Domain]::GetDomain($context)).Forest).Name
    $context = new-object System.DirectoryServices.ActiveDirectory.DirectoryContext("Forest",$ForestRootDomainFQDN)
    $ForestRootPDC = ([System.DirectoryServices.ActiveDirectory.Forest]::GetForest($context)).RootDomain | %{$_.pdcRoleOwner.Name}
    $RefServer = $ForestRootPDC
}
[..]
I’m receiving the below alerts, how can I correct this issue:
Alert description: A script error occurred when measuring for Time Drift: Event Description: Windows.TimeDrift.PA.ps1 : Getting TimeDrift from Reference Server returned an error . Reference server is (0.pool.ntp.org). Output of command is (Tracking 0.pool.ntp.org [216.229.4.66:123]. Collecting 1 samples. The current time is 1/21/2023 7:02:38 AM. 07:02:38, error: 0x800705B4)
I’m receiving the below error alert (did anyone else experience this):
Alert description: A script error occurred when measuring for Time Drift: Event Description: Windows.TimeDrift.PA.ps1 : Getting TimeDrift from Reference Server returned an error . Reference server is (0.pool.ntp.org). Output of command is (Tracking 0.pool.ntp.org [216.229.4.66:123]. Collecting 1 samples. The current time is 1/21/2023 7:02:38 AM. 07:02:38, error: 0x800705B4)
Under Authoring > Management Pack Objects > Rules (‘Windows Server Operating System’ management pack scope), right-click the ‘Time Drift Monitoring Script had an error’ rule, then select Properties. On the Configuration tab, click the Edit button for Data sources. On the Expression tab, click the Insert button.
-Parameter Name = EventDescription
-Operator = Does not contain
-Value = 0X800705B4
Save your edits.
I too am getting a 0X800705B4 error for just about every server we have. I did exclude those events, like you suggested, but what does that error message mean? Is the management pack/monitor still working if that error keeps happening?
Full error message for the above post:
Time Drift Script Error Rule, Description: A script error occurred when measuring for Time Drift: Event Description: Windows.TimeDrift.PA.ps1 : Getting TimeDrift from Reference Server returned an error . Reference server is (servername.company.com). Output of command is (Tracking servername2.company.com [10.84.170.42:123]. Collecting 1 samples. The current time is 5/13/2024 1:00:37 PM. 13:00:37, error: 0x800705B4)
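On what the error code means: 0x800705B4 is an HRESULT that wraps a Win32 error. Splitting it into its fields gives facility 7 (Win32) and code 1460, which is ERROR_TIMEOUT, so the w32tm sample did not get an answer from the reference server in time. The sketch below shows the decoding arithmetic; ConvertFrom-HResult is a hypothetical helper name, not part of the MP.

```powershell
# Hypothetical helper: split an HRESULT hex string into its standard fields.
function ConvertFrom-HResult {
    param([string]$Hex)
    # Parse as a 64-bit value so the high (failure) bit does not make it negative.
    $value = [Convert]::ToInt64(($Hex -replace '^0x', ''), 16)
    [pscustomobject]@{
        IsFailure = (($value -shr 31) -band 1) -eq 1   # top bit set means failure
        Facility  = ($value -shr 16) -band 0x1FFF      # 7 is the Win32 facility
        Code      = $value -band 0xFFFF                # low 16 bits: the Win32 error
    }
}

$decoded = ConvertFrom-HResult '0x800705B4'
# $decoded.Facility is 7 and $decoded.Code is 1460 (ERROR_TIMEOUT).
```

A timeout on the query says nothing about the actual drift, so for that interval the script logs the error and exits without returning a measurement.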
Hello everyone,
Does anyone know how to reset the health state of a monitor, or monitors in general, when dealing with a time jump?
I have a red event in the future, and every time I reset the health, I create a green reset event in the Health Explorer from today, but the red event remains at the top…
I even disabled the monitor via an override. I see the info ‘the monitor has been disabled or removed’ in the Health Explorer, but the red event remains at the top, and the health is still bad.