Menu Close

Deleting and Purging data from the SCOM Database


image

 

First, I have to begin with a disclaimer – Microsoft does not support customers manually editing the database, unless under the guidance of a Microsoft support engineer with a support case.

However, I have seen several blogs now “leaking” this guidance I send out or use from time to time…. so it is probably time to publish it as an example only.  I have always been hesitant to blog about any manual edits to the SCOM database, because all too many times, customers will jump to a conclusion, and start hacking away at their DB because of something they read on some website…. only to find they have corrupted their database, or made their original problem worse.  You should NEVER perform any activity like this unless you are experienced and aware of the possible consequences.  You should also ensure you take a full backup of the DB’s before ever attempting an edit like this.

Ok, warnings aside…..

There are times in a SCOM Management Groups life cycle, where objects can get orphaned.  Either from bugs in SCOM, or bad practices taken by discoveries, or SDK commands, or Product Connector solutions.

For instance, there is a bug in SCOM, when you delete a Management Server (rare, but happens if you are retiring an old OS and replacing with a new one).  When you delete the MS, more often than not – we will orphan that management server’s Health Service Watcher:

What can happen, is that when you view the OperationsManager > Management Server > Management Servers State view, you might see old Management Server Health Service Watcher objects here:

image

This doesn’t really cause a problem, but it is an annoyance.  It is just an example where a manual delete is required to clean it up.

 

Another common scenario, is when customers implement a management pack, that inserts data via the SDK.  If objects are inserted via the SDK, they must be removed via the SDK.  Sometimes those MP’s get decommissioned, and customers are left with these objects, wondering how to get them removed.

 

Probably the most common scenario – is Windows Computers showing up in the console, that don’t have an agent.  People tend to assume these are “orphans” but often times they are just discovered by some Topology discovery in a management pack, like AD, or SCCM.  AEM will also leave you with a lot of ghosts.  Sometimes these are caused by running a discovery to “extend” the Windows Computer class, but the discovery written has no logic to “undiscover” the computer when the agent is deleted.  It will still be associated to that discovery, even though the agent is long gone.

 

To inspect the database relationships to see if there is a still a discovery associated with a computer – you can run this query against the OperationsManager database.  Just change my computername on the first line to one you are inspecting:

DECLARE @name varchar(255) = '%ws2016.opsmgr.net%' SELECT BME.FullName, BME.DisplayName, BME.Path, dv.DisplayName AS 'DiscoveryDisplayName', dv.Name AS 'DiscoveryName' FROM typedmanagedentity TME JOIN BaseManagedEntity BME ON TME.BaseManagedEntityId = BME.BaseManagedEntityId JOIN DiscoverySourceToTypedManagedEntity DSTME ON TME.TypedManagedEntityID = DSTME.TypedManagedEntityID JOIN DiscoverySource DS ON DS.DiscoverySourceID = DSTME.DiscoverySourceID JOIN DiscoveryView dv ON DS.DiscoveryRuleID=dv.Id WHERE BME.Fullname like @name ORDER BY BME.FullName, DiscoveryDisplayName

 

If you have fully inspected this and determined the object is not related to any current discoveries – then you can attempt a manual deletion.

 

You will find lots of blogs talking about setting “IsDeleted = 1” in the BaseManagedEntity table, or even other tables.  This is WRONG.  You should never do this!!!

The better way to delete managed entities, is the below set of queries.

 

--Query 1 --First get the Base Managed Entity ID of the object you think is orphaned/bad/needstogo: -- DECLARE @name varchar(255) = '%computername%' -- SELECT BaseManagedEntityId, FullName, DisplayName, IsDeleted, Path, Name FROM BaseManagedEntity WHERE FullName like @name OR DisplayName like @name ORDER BY FullName

 

Query 1, will help you find the BaseManagedEntity ID of the object you are wanting to delete.  Take GREAT care here that you fully understand the object is correct that you want to forcibly remove, or you risk corrupting the SCOM object relationships.

This will give you the BaseManagedEntityId GUID – which we need for the next step:

image

 

In Query 2, we paste in that GUID for the orphaned object we are trying to delete.

--Query 2 --Next input that BaseManagedEntityID into the delete statement --This will delete specific typedmanagedentities more gracefully than setting IsDeleted=1 --change "00000000-0000-0000-0000-000000000000" to the ID of the invalid entity -- DECLARE @EntityId uniqueidentifier = '00000000-0000-0000-0000-000000000000' -- DECLARE @TimeGenerated datetime; SET @TimeGenerated = getutcdate(); BEGIN TRANSACTION EXEC dbo.p_TypedManagedEntityDelete @EntityId, @TimeGenerated; COMMIT TRANSACTION

 

When you run this – it will delete the managed entity in a “graceful” method, that will delete any related objects in ManagedType tables – so you might see a lot more objects deleted in the SQL response than you expected.

 

In Query 3 – this is just a repeat of Query 1, but we run this to inspect the objects and see if their IsDeleted flag now = 1.

--Query 3 --Check to make sure the IsDeleted field of your BME now = 1 -- DECLARE @name varchar(255) = '%computername%' -- SELECT BaseManagedEntityId, FullName, DisplayName, IsDeleted, Path, Name FROM BaseManagedEntity WHERE FullName like @name OR DisplayName like @name ORDER BY FullName

 

In Query 4 – we can get an idea of how many objects currently are in a “Deleted” status, and ready to be purged from the database:

--Query 4 --Get an idea of how many BMEs are in scope to purge SELECT count(*) FROM BaseManagedEntity WHERE IsDeleted = 1

 

Lastly – in Query 5 – we are purging.  Purging is a normal cycle and part of SCOM grooming, where we will purge data from the SCOM database after 2-3 days of being marked deleted.  We do not delete these immediately, because that is an expensive operation and best reserved for when grooming runs at night.  Also, when you are troubleshooting an agent, you might not wish to delete all the data when you delete an agent from the management group.  This lets the discovered data remain, if you bring the agent back in within a day or two.

Purging on demand is only useful if you wish to remove the deleted data from SCOM immediately.  A common scenario where I use this – is when I need to take a server and install it as a Management Server or a Gateway server, but it was previously managed as an agent.  You don’t need to do all the manual SQL deletes in that scenario, but you do need to purge the DB so that when you install it as a Management Server or a GW, there are no issues during the installation.

--Query 5 --This query statement for SCOM 2012 will purge all IsDeleted=1 objects immediately --Normally this is a 2-3day wait before this would happen naturally --This only purges 10000 records.  If you have more it will require multiple runs --Purge IsDeleted=1 data from the SCOM 2012 DB: DECLARE @TimeGenerated DATETIME, @BatchSize INT, @RowCount INT SET @TimeGenerated = GETUTCDATE() SET @BatchSize = 10000 EXEC p_DiscoveryDataPurgingByRelationship @TimeGenerated, @BatchSize, @RowCount EXEC p_DiscoveryDataPurgingByTypedManagedEntity @TimeGenerated, @BatchSize, @RowCount EXEC p_DiscoveryDataPurgingByBaseManagedEntity @TimeGenerated, @BatchSize, @RowCount

 

 

Review:

Again – doing these DB edits are not supported by Microsoft.  I only publish this to keep people from using these queries (which have been published on other blogs now) without understanding the issues, the risks, or the proper way to leverage them.   And to get people to STOP forcing the IsDeleted = 1 property using a SQL update statement.

Here are all the Queries together for a quick copy/paste:

(These support SCOM 2012, 2016, and the semi-annual channel builds of SCOM)

--Query 1 --First get the Base Managed Entity ID of the object you think is orphaned/bad/needstogo: -- DECLARE @name varchar(255) = '%computername%' SELECT BaseManagedEntityId, FullName, DisplayName, IsDeleted, Path, Name FROM BaseManagedEntity WHERE FullName like @name OR DisplayName like @name ORDER BY FullName --Query 2 --Next input that BaseManagedEntityID into the delete statement --This will delete specific typedmanagedentities more gracefully than setting IsDeleted=1 --change "00000000-0000-0000-0000-000000000000" to the ID of the invalid entity -- DECLARE @EntityId uniqueidentifier = '00000000-0000-0000-0000-000000000000' DECLARE @TimeGenerated datetime; SET @TimeGenerated = getutcdate(); BEGIN TRANSACTION EXEC dbo.p_TypedManagedEntityDelete @EntityId, @TimeGenerated; COMMIT TRANSACTION --Query 3 --Check to make sure the IsDeleted field of your BME now = 1 -- DECLARE @name varchar(255) = '%computername%' SELECT BaseManagedEntityId, FullName, DisplayName, IsDeleted, Path, Name FROM BaseManagedEntity WHERE FullName like @name OR DisplayName like @name ORDER BY FullName --Query 4 --Get an idea of how many BMEs are in scope to purge -- SELECT count(*) FROM BaseManagedEntity WHERE IsDeleted = 1 --Query 5 --This query statement for SCOM 2012 will purge all IsDeleted=1 objects immediately --Normally this is a 2-3day wait before this would happen naturally --This only purges 10000 records. If you have more it will require multiple runs --Purge IsDeleted=1 data from the SCOM 2012 DB: -- DECLARE @TimeGenerated DATETIME, @BatchSize INT, @RowCount INT SET @TimeGenerated = GETUTCDATE() SET @BatchSize = 10000 EXEC p_DiscoveryDataPurgingByRelationship @TimeGenerated, @BatchSize, @RowCount EXEC p_DiscoveryDataPurgingByTypedManagedEntity @TimeGenerated, @BatchSize, @RowCount EXEC p_DiscoveryDataPurgingByBaseManagedEntity @TimeGenerated, @BatchSize, @RowCount --End of queries

 


18 Comments

  1. Matt B.

    Hello Kevin!
    First, I would like to thank you for the awesome content you’re posting!
    I’ve got a question regarding removal of an Agent from cluster nodes. SCOM GUI prevents me from deleting the agent throwing this kind of error:
    “Agent is managing other devices and cannot be uninstalled. Please resolve this issue via Agentless managed view in Administration prior to attempting uninstall again.”
    I assume that a good approach would be to disable the proxy option on the cluster nodes, wait for the discovery to run and for the agentless objects to disappear, then check again if it’s possible to delete the Agent in the usual way.
    And here’s come the issue; the cluster nodes are already decommissioned (the VM’s no longer exists), so disabling the proxy won’t make any difference, so I’m left with two ‘gray’ nodes and an agentless cluster object which apparently prevents me from getting rid of it. Removing the nodes via powershell has left 3 orphaned objects per cluster node (IsDeleted=’0′):
    Microsoft.SystemCenter.HealthServiceWatcher:Microsoft.SystemCenter.AgentWatchersGroup
    Microsoft.Windows.Computer:thatmadclusternode1.local
    Microsoft.Windows.OperatingSystem:thatmadclusternode1.local

    I went through your scripts and I had to manually set the IsDeleted rows to ‘1’ for the orphaned objects of each node using your second SQL query and it worked.

    I’m wondering if there’s any official way to remove cluster nodes when they no longer exists?
    [SCOM2012R2]

    Thanks in advance,
    Matt B.

    • Kevin Holman

      1. NEVER set IsDeleted = 1. I have never recommended that. That is a good way to end up with a corrupt database. Unless you mean using the SQL statements I published that use the typemanagedentitydelete function, which will result in setting the objects to deleted (along with any hosted objects)

      2. We haven’t had a problem deleting cluster nodes from Agent Managed in many years. What version of SCOM are you using, and what UR level? Is the console you are using patched to the latest UR level as well?

  2. Retep

    Hi Kevin
    Thanks for a great article – deleting the Windows computer object deleted a lot of rows – that´s great – however some classes remain (e.g. Healthservicewatcher, UserActionManager, CitrixMPAgentMachine.ControllerConnection) for the server when running query 1 again – should they also be manuel removed with this procedure again ?

    • Kevin Holman

      It “should” be…. but I am curious – what caused this to happen in the first place? What is unique about this orphaned agent, and what caused you to investigate deleting it manually in the first place?

  3. Dennis

    Hello Kevin,

    thank you for the great article! The timing is very good, because of the EOL from W2K8 we are planning to migrate our old GW’s and MS’s to W2K12 (running SCOM 2012 R2).
    But we think there are a lot of unobserved pitfalls. But maybe you can give us a good advise and some best practice approaches.
    Due to our implemented firewall policies, we plan to reuse the IP and hostname of the existing systems. Todo that we would like to do the following steps:
    1. Install a new server with W2K12 R2 with an other temporary hostname and ip
    2. We switch all agents/gw’s to an existing failover system
    3. (We remove the first MS/GW from SCOM and) shutdown the existing MS/GW
    4. We switch the ip and hostname from the old server tot he new W2K12 R2 and add the MS/GW to the management group again

    Regarding to your article above and our migration plan the following questions came up:
    – Do we really need to delete the existing objects (from a MS/GW) from the OpsDB with these SQL-Statements. Because we are going to reuse the hostname and ip I think the only difference should be the underlying OS?
    – If we have to delete the existing MS/GW objects, i think we got some problems with existing overrides set for the MS/GW’s objects. Is there a smooth solution to get all existing overrides regarding to all objects from a MS/GW (like system drives, computer, or other object instances)? (maybe through a sql-statement?)
    – If we don’t want to use the SQL-Statements above, is there another solution to ageing the deleted MS-Objects out from the database? (After a certain period of time?)
    – Do you know any other pitfalls if we want to reuse the IP and hostname?

    I think these questions are interesting for many customers who are planning a similar migration. So I hope you can give us a good advice! 🙂

    Thank you in advance!

    • Kevin Holman

      Hi Dennis

      1. You should NOT be using these SQL statements unless you are working on an open case with Microsoft support.

      2. These SQL statements to delete data in the SCOM DB should *never* be used in any way for a planned upgrade/management group transition or maintenance. These SQL statements are “last resort” deletes to clean up a mistake or bug, and should not be part of any “plan”.

      3. There is no requirement to reuse a servername and IP. I’d strongly advise against this. I’d absolutely install new servers with new names and IP’s. Then – migrate the agent assignments over – then retire the old management servers. If you delete the MS and GW – with agents assigned – you will orphan those agents, and have a mess on your hands. If there is a requirement to reuse the IP – that’s fine, as you can swap those at any time. But I do not recommend reusing the same names. Everything in SCOM is dependent on FQDN as the unique ID, and reinstalling a MS with the same name can lead to challenges. I’d advise against that.

      • Dennis

        Hello Kevin,

        I’m sure this isn’t the best solution, but I think there’s no way around it for us. The hostname has so many dependencies in our company that are not in our hands.
        That’s why I think we’ll have a smaller impact on the reused hostname than resolving the other big dependencies in our infrastructure.
        We also don’t want to execute any SQL statements. If there is a solution other than the SQL commands to resume the server with a new OS, this would definitely be the preferred way for us.

        • Kevin Holman

          You are saying that “due to our company internal requirements, we have no choice but to use ill advised and unsupported solutions to problems we created” ? That sounds like a recipe for disaster. 🙂

          • Dennis

            Hello, Kevin,

            thanks again for your help and tips. Together with the Microsoft support there was fortunately no disaster during the migration. 😉

            And with that said – we successfully moved 2 management servers and 14 gateway servers to W2k12 R2.

            Next Step -> SCOM 2016

            Regards,

            Dennis

  4. Kiran Kumar

    Hi Kevin,

    We are observing 29120 & 29112 event id’s in our SCOM 1801 environment and Management servers are unable to get the latest configuration file if we reset the Health state folder on SCOM Management servers. Could you please advise fix for the same.

    Regards,
    Kiran Kumar

  5. Zach

    Hi Kevin,
    I’ve been reading trying to diagnose an issue and will not change anything to “IsDeleted=1” 🙂

    I am running a report on my DataWarehouse view “vHealthServiceOutage” and there are a few servers listed there that do not exist in BaseManagedEntity. The servers used to exist, some were decommissioned over 4 years ago. They no longer exist anywhere else that I can find in the Operations Manager Database but do exist in places in the datawarehouse.

    They are creating erroneous data that I have to cull manually after report runs and it bugs me that the old data is there in the first place.

    I’d like to diagnose how the data still exists for these servers when other decommissioned servers do not exist anymore in the datawarehouse… there was obviously something done incorrectly years ago. Also, I’d like to get rid of that data but do it in such a way that my database doesn’t get corrupted.

    Any assistance you can give would be appreciated, thanks!

  6. Simon

    Hi Kevin,
    Love your article…

    Any ideas how you would do “query 2” against a list of BaseManaged Entity ID’s instead of it being done singularly.

    Thanks,

    • Kevin Holman

      Honestly, it could be done… but I’d be very concerned about this….. as we really should not have to be doing this very often or in great bulk. If we have to do this AT ALL then something went very, very wrong.

      • Simon

        Hi Kevin,
        Thanks for getting back to me. Yes I totally agree with that statement. For information in our example we had a situation from the Veeam management pack where we had moved the majority of the VMs from one Vcenter to another, but unfortunately it never cleaned out the old discovered entity’s so we had to manually clear these down.

  7. Jason

    Hi Kevin,

    We have requirment for making our MS into GW due to latency issue . I am under the impression your query should help here after we uninstall the MS role & run the Cleanup and then install the GW Role. our MS Server does have HP DMC so lot of HP DMC compoment are discovered & are linked to this MS Server. After installing the GW Role the HP DMC Object should be auto-Discovered is my understanding . DO you see any problem or would recommeded any additional precaution to be taken to avoid conflict between MS & GW role Switch which we are trying to peform.

    Regards,
    Jason Aranha

    • Kevin Holman

      Ugh. In general, I never recommend changing a MS to a GW. A MS has all kinds of special relationships. I’d assume once you remove a management server, and then purge the DB, everything “should” be gone so that a computer with the same name could be a simple agent managed role or Gateway role, but I have not tested this.

Leave a Reply

Your email address will not be published. Required fields are marked *