SCOM Audit Collection (ACS) Internals

I am going to use this post to ramble about a couple of the internals of ACS.

Database:

The ACS DB is primarily made up of daily partition tables. We create a new one every day during the nightly maintenance, which defaults to 1 AM. We create a new partition, then close the previous one. Then we kick off a reindex of the previous day’s table for reporting performance.

To view all this, first let's have a look at the dtconfig table:

Select * from dtconfig  


The only thing we would change in this table is the “number of partitions”. This value is essentially the number of partitions to keep, or the number of days' worth of data we will retain in the ACS DB. The default is 14, and you need to adjust this based on your retention requirements and DB sizing capability.

Next, let's check out the dtpartition table:
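As a rough sketch of how that change might be made, the statements below first list the configuration rows and then show an update that is deliberately left commented out. The `Id = 6` filter and the `Id` and `Value` column names are assumptions that are not stated in this post, so confirm them against the output of the first query before running anything.

```sql
USE OperationsManagerAC;

-- Look at every configuration row first and confirm which one
-- holds the "number of partitions" setting in your environment.
SELECT * FROM dtConfig;

-- ASSUMPTION: the row with Id = 6 is "number of partitions" and the
-- columns are named Id and Value. Verify against the SELECT above.
-- The value 30 is only an example retention setting.
-- UPDATE dtConfig SET Value = 30 WHERE Id = 6;
```

Leaving the UPDATE commented out is intentional: it forces a look at the current rows before the retention value is changed.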

select * from dtpartition order by partitionstarttime

Essentially, these are your daily partitions. At any given time you should have one partition with a status of “0”, and the rest should have a status of “2”. “2” means they are ready to be groomed. Pre-SP1, if the online indexing of a closed partition failed during the nightly maintenance, the partition was left in a status of “1”. This is bad, because such partitions never groom, and they will fill the database if this keeps up. If you have any that are left in status = 1, then just run this in SQL:

Use OperationsManagerAC

UPDATE dtPartition

SET Status = 2

WHERE Status = 1

This will fix the grooming issue, and the tables will groom at the next maintenance interval. This was a very common issue prior to SP1.
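To see whether you are affected before changing anything, a quick check along these lines may help. It is only a sketch that uses the table and column names already shown in this post (dtPartition, Status, PartitionStartTime).

```sql
USE OperationsManagerAC;

-- Count partitions per status. A healthy database shows exactly one
-- partition at Status = 0 (the active one) and the rest at Status = 2.
SELECT Status, COUNT(*) AS PartitionCount
FROM dtPartition
GROUP BY Status
ORDER BY Status;

-- List any partitions stuck at Status = 1, oldest first.
SELECT *
FROM dtPartition
WHERE Status = 1
ORDER BY PartitionStartTime;
```

Running the first query again after the next nightly maintenance shows whether the stuck partitions have been groomed.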

UPDATE 6/3/08

Even in SP1 – occasionally people have reported issues where some partitions are left in a status of “1” and these partitions never groom.  If left unchecked, this can eventually fill the database/database volume.  Microsoft has released an internal hotfix that you can request from PSS if you feel you are impacted by this.  I have seen this happen in many large environments.  Request hotfix/KB 949969

Ok – enough on grooming.  On to bigger and better things.

Audit Collector

The audit collector really does all the work in ACS.  It keeps track of the forwarders (agents), maintains the queue, filters the data, and then writes to the ACS DB.

 

Let's first talk about the basics. The audit collector runs as a service, “Adtserver”, which runs Adtserver.exe from the %systemroot%\system32\security\Adtserver directory.

 

Speaking of that directory, there is a lot of cool stuff in there! Also present are the .SQL files that are called during maintenance. The primary ones to look at are DbCreatepartition.sql, DbClosepartition.sql, and DbDeletePartition.sql. They are pretty self-explanatory: they create new partition tables, close and reindex the previous day’s table, and delete old tables that are ready to be groomed out. The audit collector runs these against the ACS database, and they should not be run manually.

 

Also present in this directory is a little gem of a file by the name of AcsConfig.XML. This file lists ALL the audit forwarders ever known to the collector, along with each one's last contact time and the sequence number of the last event it sent to the collector. You can copy this out, open it with Excel, and see all the data in a very readable format. The collector keeps this data in memory and writes it to the file every 5 minutes.

 

Probably the biggest problem I see in an ACS environment is simply a lack of proper sizing. The Perf and Scale guide has really good guidance here for ACS, and should be followed:  http://download.microsoft.com/download/d/3/6/d3633fa3-ce15-4071-be51-5e036a36f965/OM2007_PerfScal.doc

 

One of the best things you can do is to apply a proper filter on the collector. By default, ACS will collect and store every single event in the security event logs from forwarders. This is good and bad. Good, because you are getting everything. Bad, because “everything” doesn’t help you. A large number of the events logged in the security logs are not very useful, depending on how draconian your audit policy is. You really want to collect just the security events that are needed to meet your audit and security compliance requirements. A couple of good resources:

http://www.microsoft.com/technet/security/guidance/auditingandmonitoring/securitymonitoring/default.mspx

http://www.securevantage.com/Products/2007%20Solutions/Docs/ACS%20Guides/Secure%20Vantage%20ACS%20Noise%20Filter%20Guide.pdf
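Before settling on a filter, it can help to see which event IDs actually dominate your own database. The query below is a sketch only: the view name AdtServer.dvHeader and its EventId column are assumptions that are not taken from this post, so adjust them to whatever header view your ACS database exposes.

```sql
USE OperationsManagerAC;

-- ASSUMPTION: AdtServer.dvHeader is the event header view and it has
-- an EventId column. Adjust the names if your database differs.
-- Shows the 20 most frequent event IDs currently stored.
SELECT TOP 20 EventId, COUNT(*) AS EventCount
FROM AdtServer.dvHeader
GROUP BY EventId
ORDER BY EventCount DESC;
```

Event IDs that sit at the top of this list but are not needed for compliance are the natural candidates for a collector filter.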

 

Here is a good, basic filter, to remove a lot of what most consider “not good info”:

 

SELECT * FROM AdtsEvent WHERE NOT (((EventId=528 AND String01='5') OR (EventId=576 AND (String01='SeChangeNotifyPrivilege' OR HeaderDomain='NT Authority')) OR (EventId=538 OR EventId=566 OR EventId=672 OR EventId=680)))
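For readability, here is the same filter spread over several lines with a note on what each clause drops. This is an annotated reading copy only, and the event descriptions are the standard Windows Server 2003 era security event meanings. Whether adtadmin tolerates comments or line breaks is not something this post covers, so assume it does not and use the one-line form when applying the filter.

```sql
-- Annotated copy for reading only; use the single-line version with adtadmin.
SELECT * FROM AdtsEvent
WHERE NOT (
      -- 528 = successful logon. Assuming String01 holds the logon type,
      -- as the filter implies, '5' limits this to service logons.
      (EventId = 528 AND String01 = '5')
      -- 576 = special privileges assigned to a new logon.
   OR (EventId = 576 AND (String01 = 'SeChangeNotifyPrivilege'
                          OR HeaderDomain = 'NT Authority'))
      -- 538 = user logoff, 566 = directory object operation,
      -- 672 = Kerberos authentication ticket request,
      -- 680 = account logon attempt (NTLM credential validation).
   OR (EventId = 538 OR EventId = 566 OR EventId = 672 OR EventId = 680)
)
```

Everything matched inside the NOT (...) is discarded at the collector and never reaches the database.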

 

How do you apply a filter? Well, I am glad you asked! We will run adtadmin. Here is a link to all the parameters:

http://technet.microsoft.com/en-us/library/bb309436.aspx

 

To examine the current filter, open a command prompt on the collector and run this command from the %systemroot%\system32\security\Adtserver directory:    adtadmin -getquery

 

That will show you what you are currently filtering. The default is “select * from AdtsEvent”, which is no filtering. To use the filter posted above, run the following:

 

adtadmin /setquery /collector:"collectorname" /query:"SELECT * FROM AdtsEvent WHERE NOT (((EventId=528 AND String01='5') OR (EventId=576 AND (String01='SeChangeNotifyPrivilege' OR HeaderDomain='NT Authority')) OR (EventId=538 OR EventId=566 OR EventId=672 OR EventId=680)))"

8 Comments

  1. Ravishankar k

    Hi Kevin,

    Is this post/query applicable to an ACS DB installed on SQL Server 2019 with SCOM 2019 (management servers installed on Windows Server 2019)?
    We don't have any service packs for SQL Server 2019.

    Regards,
    Ravi shankar

  2. Ravi shankar

    Hi Kevin,

    The alert we are receiving in SCOM 2019: Microsoft Audit Collection Services Collector Database Operation Failure.

    We ran the following query to see the partition details and status.

    select * from dtpartition order by partitionstarttime
    But we see all the partitions with a status of “2” and one with a value of “0”.

    So what can we check further to solve this issue?

    Regards,
    Ravi shankar

  3. Thy Fere

    We have a few collectors where the ADTSERVER service stops randomly for 20 to 60 minutes. The timing varies, but it is usually between 1 AM and 5 AM. What could be the reason?

  4. Blake Drumm

    For anyone wondering what the status codes for the dtPartition table are, here you go:
    -- 0: active, set by collector
    -- 1: inactive, mark for closing during collector startup & indexing,
    -- set manually
    -- 2: archived, ready for deletion, set from outside the collector
    -- 100 - 108: closed, indexing in progress
    -- 109: indexing complete
