
Upgrade from SCOM 2016 to SCOM 2019 Checklist

This is a planning checklist that will help you determine if an in-place upgrade is possible, and how to prepare the environment in advance for it.  It is similar to my previous post on Upgrading SCOM 2012R2 to SCOM 2016.

 

1. Verify we are moving from a supported version of SCOM to SCOM 2019.

2. Verify the SQL server versions and service pack levels are supported for both SCOM 2016/1801/1807 and SCOM 2019

3. Verify all OS versions for SCOM server roles will be supported for both SCOM 2016/1801/1807 and SCOM 2019

4. Verify all SERVER ROLES meet minimum hardware sizing for SCOM 2019

5. Verify all AGENT managed Operating Systems are supported for SCOM 2019.

6. Verify all MANAGEMENT PACKS in use are supported for SCOM 2019.

  • Check with 3rd party MP vendors and ensure their MP does not have any known support issues with SCOM 2019. Update these MP’s in advance if required.

7. SCOM Database: Verify the OperationsManager database has more than 50 percent free space

8. Optimize Registry settings for management servers

9. Export and review the SCOM management server event logs on all management server roles

  • Look for critical and warning events that indicate major issues that should be resolved before upgrading.
  • Save these for comparison after the upgrade to verify any new issues are actually new

10. Verify SCOM is healthy

  • Review the “Operations Manager > Management Group Health” dashboard in addition to the event logs and ensure SCOM is healthy

11. T-SQL: Clean up the database ETL table in the OperationsManager database

12. SCOM Console: Remove agents from Pending Management

13. Backup unsealed management packs

  • Get a fresh backup of all your unsealed MP’s which contain all your customizations, for disaster recovery
  • Example:    Get-SCOMManagementPack | Where-Object { $_.Sealed -eq $false } | Export-SCOMManagementPack -Path C:\mpbackup

14. REMOVE the OMS/Advisor management packs

15. SCOM Console: Disable Notification subscriptions

16. Disable product connectors or any external connections to the SDK.

17. Optional but recommended:  Restart the SQL server service on the OpsDB server and DW server

  • This will kill any stuck or old blocking processes, and free up any buffer cache
  • Wait at least 5 minutes after restarting to ensure the DB’s are online and functioning.
  • Ensure there is no active blocking in the OpsDB before continuing.
  • Consider a reboot of the entire server.

18. Optional but recommended:  Uninstall the SCOM Console and Web Console on the FIRST management server you plan to upgrade and REBOOT.

  • Removing these roles reduces the risk of an upgrade failure.
  • These roles are easy to reinstall once the management group upgrade is completed.

19. Stop the Operations Manager services on Management servers

  • Stop the following services on all management servers in the management group, to ensure NO changes are being made to SQL during the backup, so we can get a good backup right before the upgrade:
  • Microsoft Monitoring Agent
  • System Center Data Access
  • System Center Configuration

20. Backup the SCOM databases

21. Backup the Management Servers

  • Take a VM snapshot or a full bare-metal backup that is restorable, with the SCOM services stopped, so there should be no transient data. This will be for use in the case of disaster recovery only.

22. Install SCOM 2019 prerequisites on management servers with consoles

23. Ensure .NET 3.5 and .NET 4 (or 4.5) are both installed on ALL management servers

24. Remove any old SDK reference software from the management server

  • Some programs install DLL’s that might block the upgrade. Consider removing them if they are installed on your management servers:
  • SCOM 2007 R2 Authoring Console
  • Silect MP Author/MP Studio

25. Optional but recommended:  REBOOT ALL Management servers.

  • Rebooting these servers ensures that any OS related issues are observed or cleared before attempting an upgrade.
  • Rebooting these servers helps remove any question that something was wrong with them prior to the upgrade.
  • If a Management server cannot successfully reboot and start up without errors before an upgrade, it certainly cannot after an upgrade.

26. Upgrade the first management server

27. Upgrade additional management servers

  • It is CRITICAL not to upgrade multiple management servers at the same time. You should wait for one to complete FULLY and inspect the logs to ensure it is working, before continuing with the next.

28. Upgrade ACS (if applicable)

29. Upgrade all gateways (if applicable)

30. Upgrade Stand Alone Web Console servers (if applicable)

31. Upgrade Reporting Server

32. Upgrade Stand-Alone Consoles

33. Post Upgrade tasks

34. Reject Pending Management updates for any agents

  • We will update agents later, after applying the latest Update Rollup for SCOM 2019

35. Verify your SCOM license is reporting correctly as licensed

36. Apply the latest Cumulative Update Rollup for SCOM 2019

  • You should generally wait a few hours after an upgrade to SCOM 2019, before applying the latest SCOM 2019 update rollup. There are warehouse scripts as part of the upgrade that can take several hours to complete, and it is a best practice to not interrupt these.

37. Upgrade Agents

  • Using whatever method you choose, consider upgrading your agents to SCOM 2019 with the latest UR at this point.

 

What to do when things go wrong?

When SCOM upgrades fail, there will be a log telling us why.  Oftentimes you will get an “Error 1603”, which is simply a generic error and does not tell you anything.  The log files are typically located in the user profile directory of the account attempting the installation:  C:\Users\<username>\AppData\Local\SCOM\LOGS.  Review ALL the logs, and if needed provide all of them to a Microsoft engineer when opening a support case.  Log files are not always easy to interpret – but the root cause is always in them.
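
Reviewing every log by hand is slow, so a first pass can be scripted. The following is a minimal Python sketch, not an official tool: it scans a log folder and lists the lines that contain a failure marker. The marker strings are assumptions to adjust for your environment (“Return value 3” is the usual Windows Installer marker for a failed action, and “1603” is the generic error mentioned above), and the function names are hypothetical.

```python
from pathlib import Path

# Assumed starting points, not an exhaustive list of failure signatures.
FAILURE_MARKERS = ("Return value 3", "1603")


def read_log(path):
    """Decode a log file, handling the UTF-16 encoding that MSI logs often use."""
    raw = path.read_bytes()
    if raw.startswith((b"\xff\xfe", b"\xfe\xff")):
        return raw.decode("utf-16", errors="replace")
    return raw.decode("utf-8", errors="replace")


def find_failures(log_dir, markers=FAILURE_MARKERS):
    """Return (file name, line number, text) for every line containing a marker."""
    hits = []
    for path in sorted(Path(log_dir).glob("*.log")):
        for number, line in enumerate(read_log(path).splitlines(), start=1):
            # Case-insensitive substring match against each marker.
            if any(marker.lower() in line.lower() for marker in markers):
                hits.append((path.name, number, line.strip()))
    return hits
```

Calling find_failures on the LOGS folder above gives a short list of places to start reading. It is only a pointer into the logs; the surrounding lines usually hold the actual cause.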

Common issues causing failures:

  • Lack of permissions for the user account performing the upgrade (requires Local admin, SCOM admin, and SQL SysAdmin)
  • TLS 1.2 enforced on management servers or SQL but missing prerequisites
  • A SCOM Agent is installed on a SCOM Management server
  • SQL Database is experiencing blocking from another process.
  • SQL Database does not have enough free space or transaction log space.
  • Advisor MP caused the upgrade to fail because it was not deleted before attempting upgrade

 

Resources:

SCOM 2019 is HERE!

Security changes in SCOM 2019 – Log on as a Service

SCOM 2019 Log On As A Service Management Pack Helper

SCOM 2019 Security Accounts Matrix

SCOM 2019 QuickStart Deployment Guide

41 Comments

  1. Michiel Aubertijn

    Kevin,
    In our case we came across a strange issue. The first three Management Servers went fine. The fourth failed. We reverted to the snapshot, but this time the SCOM services were running. We started the upgrade again and everything went fine.
    Best Regards,
    Michiel

    • Kevin Holman

      If you actually follow my steps – the last thing you do before attempting an upgrade is to reboot all management servers. This would leave all services running on everything before attempting an upgrade.

  2. Andrew T

    My SQL server is running 2016 (13.0.5820.21) but I’m getting an error on SQL version validation when starting the install/upgrade. I’m not quite sure what it’s complaining about. Is there a log anywhere I can check to see what it’s specifically failing on?

    • Kevin Holman

      Yes – the logs should be available in your user directory – C:\Users\<username>\AppData\Local\SCOM\LOGS

      Feel free to shoot them to me via email if needed.

  3. ANDRII VERESHCHAKA

    Kevin, I DO APPRECIATE, again and again, your really priceless work!
    Thanks a lot. Using this checklist saved us enough time for two coffee and tea breaks during our SCOM upgrade procedure 8) It was easy because we followed your list step by step.

  4. James Farthing

    Hi Kevin,
    Great blog as always. We’ve been planning an update to our environment from 1807 to 2019 and this highlights a few potential issues that we’ve not considered.
    I’ve got a question relating to step 27 – upgrading the additional management servers. Normally we would stagger application upgrades of multiple machines over subsequent days where possible. Would I be correct in thinking that this wouldn’t be recommended here? From my understanding, after upgrading the first (of 4) Management Server to 2019, we wouldn’t have a functional Management Pool due to being below the quorum threshold. Would you recommend immediately updating a second (or more) Management Server to 2019, or perhaps consider removing the other Management Servers from the pool until they have been upgraded? Alternatively, would the remaining servers continue to work within the Management Pool whilst still running under 1807?
    Thanks in advance.
    James

    • Kevin Holman

      I believe they will continue to work. There is no schema change in these versions… however I would recommend upgrading all components as quickly as possible. Certainly all management servers. If a customer needed to wait a while on a Gateway, or especially agents, that’s fine. But I’d always advise customers to upgrade all their management servers sequentially, but in the same planned outage for the upgrade. I have even worked with customers with 20 management servers, and we would apply the upgrade to all of them in the same evening, starting one as soon as the previous one completed.

      • James Farthing

        Hi Kevin,
        Many thanks for the quick reply, I’ll discuss this with my colleagues.
        It seems like upgrading all the machines in one go is the way forward then, especially as I don’t think we really gain any reduction in risk by spreading out the upgrade over multiple days. The rollback would be the same steps either way around.
        Kind Regards,
        James

  5. Senko S

    Hi Kevin and everyone in the community!

    I would like to ask you or anyone out there about a specific upgrade scenario. I wonder if anyone has tried to do an in-place upgrade from 1807 to 2019 where the SCOM environment basically isn’t allowed to have a long downtime?

    Would it work “in practice” to clone the existing SCOM VMs in the hypervisor?

    1. Start the cloned VMs with disabled NICs
    2. Start upgrading the management servers in parallel with the old environment
    3. Shut down the original SCOM Management Servers one by one, enabling the NICs on the cloned VMs with the same IP addresses as the previous ones?

    • Kevin Holman

      That would not be supportable, since the upgrade modifies both the SQL databases and the management servers at the same time.

      Doing an in place upgrade should be very limited downtime. What is allowed for your planned maintenance?

    • Kevin Holman

      Sure. Just use a SCOM 2016 agent on them. They still work fine as long as they have powershell 2.0 installed. Just not officially supported, but then neither is WS2008.

  6. Abdi

    I have a really annoying issue installing the SCOM 2019 agent on WS2019 domain controllers. On all other non-DC servers (2016 & 2019) I can install the agent successfully.
    It was an in-place upgrade from 2016 to 2019, so the DCs that already had the 2016 agent have moved over to 2019 successfully. Any new DCs I try to add (I have only tried to add 2019 DCs) have failed with access denied.
    Does SCOM 2019 need any additional permissions above the domain admin level?

  7. Pingback:Community Round-Up: January 2021 | SquaredUp

  8. Delpol

    We tried upgrading our SCOM instance from 2016 to 2019 but it failed on the first management server. The rollback removed SCOM completely from the management server but left the OpsDB upgraded. We tried installing 2016 back on it but that didn’t work. So we rolled the OpsDB and DW DB back to the working state from backup, then tried recovering the management server. It recovered, but we are still seeing Event ID 29120 (Microsoft.EnterpriseManagement.ManagementConfiguration.Interop.HealthServicePublicKeyNotRegisteredException: Padding is invalid and cannot be removed.)
    Both the SDK and Config services are running fine, but we are unable to open the console on the same server after recovery. When opening the Ops Mgr Shell, it throws the error: The User (xxxx) does not have sufficient permission to perform the operation.
    Both the Config Account and Action Account have the sysadmin role on the DB and local admin on the management and DB servers. Both accounts are set to log on as a service as well. Any ideas what we could be missing @Kevin ?

    • kevinholman

      How did you recover the management server? Command line? Or just reinstall using the UI? If you restore the DB – it would have been best to restore a snapshot of the management server as well.

      Your DAS account and Management Server action account should NOT have sysadmin to the database. That is an overextension of privilege. It won’t hurt anything, but it is not required and not a best practice.

      The most likely issue is that when you run a recovery, you need to use the command line with the /recover switch. If you didn’t do that, then review the RunAs accounts – you might have something messed up there in the accounts and in the profiles – and reset the passwords.

  9. Brian Hansen

    I need to upgrade from 1801 to 2019. I understand the need to have the actual SCOM servers on Server 2016. But according to the documentation the OpsDB and DWDB SQL servers need to be Server 2016 as well. Is that true? Or will the SQL servers be OK on Server 2012 as long as SQL is 2016? (yes, we will need to upgrade the OS on the SQL servers anyway, but for reasons I won’t go into I need to do the SCOM upgrade first)

    • Kevin Holman

      In order to perform a supported in-place upgrade of SCOM, both the previous version and the new version of SCOM have to be in a supported configuration at all times.

      SCOM 1801 supports server roles (Gateway, Web Console, Management Server, and SQL Servers) on WS2012R2 and WS2016.
      SCOM 2019 supports server roles (Gateway, Web Console, Management Server, and SQL Servers) on WS2016 and WS2019.

      Therefore the only common OS between these two is Windows Server 2016. All server roles MUST be on Windows Server 2016 in order to perform a supported in-place upgrade.

      SCOM 1801 supports SQL server roles (OpsDB, DataWarehouse DB, SCOM Reporting) on SQL 2016.
      SCOM 2019 supports SQL server roles (OpsDB, DataWarehouse DB, SCOM Reporting) on SQL 2016 and SQL 2017 and SQL 2019.

      Therefore the only common SQL version between these two is SQL 2016. All SQL server roles MUST be on SQL Server 2016 in order to perform a supported in-place upgrade.

      I understand this is highly restrictive, and often is the reason customers choose not to adopt current SCOM versions due to the impact here. I have given this feedback to the product teams many times, and we need more customers to provide this feedback to the support and product teams.
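
The same reasoning can be sketched as a set intersection. This minimal Python example uses only the support lists quoted above; the dictionary layout and the function name are hypothetical, and the lists should be checked against current Microsoft documentation before relying on them.

```python
# Support lists as quoted in the reply above; extend them for other releases.
SUPPORTED_OS = {
    "SCOM 1801": {"WS2012R2", "WS2016"},
    "SCOM 2019": {"WS2016", "WS2019"},
}
SUPPORTED_SQL = {
    "SCOM 1801": {"SQL 2016"},
    "SCOM 2019": {"SQL 2016", "SQL 2017", "SQL 2019"},
}


def common_versions(matrix, source, target):
    """Versions supported by both the source and the target SCOM release."""
    return sorted(matrix[source] & matrix[target])


print(common_versions(SUPPORTED_OS, "SCOM 1801", "SCOM 2019"))   # ['WS2016']
print(common_versions(SUPPORTED_SQL, "SCOM 1801", "SCOM 2019"))  # ['SQL 2016']
```

An empty result for either lookup would mean there is no supported in-place upgrade path without first changing the OS or SQL version.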

    • Kevin Holman

      I haven’t ever done it, so I am not sure – but I would think it would work – the new UI is just far superior but at the end of the day it just makes XML.

  10. Brian Hansen

    Doing an upgrade from 1801 to 2019. My 2 Gateway servers are saying they can’t continue setup because they have other SCOM roles installed (MS, Console, Web, agent, etc). But the only SCOM item installed on these servers is the GWS itself. I have looked through the MomGateway setup log but can’t find what it thinks is there.

    Any suggestions on how to get the upgrade to complete?

  11. Brian Hansen

    Nevermind, I found the culprit:

    From setup log:

    PROPERTY CHANGE: Adding CORECOMPONENTPRESENT_AGENT property. Its value is '{EE0183F4-3BF8-4EC8-8F7C-44D3BBE6FDF0}'.
    FindRelatedProducts. Return value 1.

    From Registry: (I deleted this key)

    [HKEY_LOCAL_MACHINE\SOFTWARE\Classes\Installer\Products\4F3810EE8FB38CE4F8C7443DBB6EDF0F]
    "Clients"=hex(7):3a,00,00,00,00,00
    "ProductName"="Microsoft Monitoring Agent"
    "PackageCode"="0DFCAAEAC7718004BADFDCCEDA1286B7"
    "Language"=dword:00000000
    "Version"=dword:080032fd
    "Assignment"=dword:00000001
    "AdvertiseFlags"=dword:00000180
    "ProductIcon"="C:\\WINDOWS\\Installer\\{EE0183F4-3BF8-4EC8-8F7C-44D3BBE6FDF0}\\agentgateway.ico"
    "InstanceType"=dword:00000000
    "AuthorizedLUAApp"=dword:00000000
    "DeploymentFlags"=dword:00000001
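
For anyone hunting a similar leftover entry: the key name under Installer\Products is the product code from the setup log written in Windows Installer's packed form, and the Version value packs the major, minor, and build numbers into one DWORD. The following minimal Python sketch of both conversions (the function names are hypothetical) reproduces the values shown above.

```python
def pack_product_code(guid):
    """Convert '{XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX}' to its packed registry form."""
    parts = guid.strip("{}").split("-")
    # The first three groups are reversed character by character.
    head = "".join(part[::-1] for part in parts[:3])
    # In the last two groups, the two characters of each byte are swapped.
    tail = parts[3] + parts[4]
    swapped = "".join(tail[i + 1] + tail[i] for i in range(0, len(tail), 2))
    return (head + swapped).upper()


def decode_msi_version(dword):
    """Split a packed version DWORD into 'major.minor.build'."""
    major = dword >> 24            # high 8 bits
    minor = (dword >> 16) & 0xFF   # next 8 bits
    build = dword & 0xFFFF         # low 16 bits
    return f"{major}.{minor}.{build}"


print(pack_product_code("{EE0183F4-3BF8-4EC8-8F7C-44D3BBE6FDF0}"))
# 4F3810EE8FB38CE4F8C7443DBB6EDF0F
print(decode_msi_version(0x080032FD))
# 8.0.13053
```

This makes it possible to go from a GUID in a setup log straight to the matching registry key, instead of searching the Products hive by product name.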

  12. Simon

    I have 4 Management Servers plus 2 gateway servers (one in an untrusted domain) all currently 2016 but I plan to upgrade 3 of the management (and maybe the g/w in the trusted domain) next weekend to 2019 and leave the remaining ones at 2016 due to the versions of the servers being monitored are less than 201x required for 2019. I guess I can leave those 2 un-upgraded for a while since 2016/2019 co-existence is a thing. How long does the in-place upgrade take roughly? I would like to do those 3 servers THEN do the UR3 install – usually the UR installs take approx. 3+ hours including the agent upgrades so would all this be done likely in an 8am-4pm outage window ?
    Thanks for all your posts btw

    • Kevin Holman

      I’m not sure I understand your plan. What is your concern about down-level agents?

      Down-level agents have nothing to do with choices you would make in an upgrade – so I am afraid some wires might be crossed here.

      Can you explain it in more detail – with SCOM server versions and OS versions?

      If you have 4 management servers – you should upgrade all four of them in sequence – period. Then – Gateways can be upgraded as you have time.

      • Simon

        Sorry for the confusion:
        We are currently monitoring 2008 servers within our environment, so I will need to keep one SCOM management server to keep monitoring these servers, as I believe 2019 is unable to monitor operating systems lower than 2012. All our SCOM Management servers are running Windows Server 2016 with SCOM 2016 UR10, including the Gateway servers.
        The gateway server in our untrusted domain also monitors 2008 servers, so I believe I need to keep this at SCOM 2016 in order to keep monitoring them.

        My end goal is:
        3xManagement Servers & 1 Gateway running SCOM2019 which will be monitoring 2012/2016/2019 Servers
        1xManagement Server running SCOM2016 to keep monitoring 2008 Servers
        1xGateway Server (in untrusted domain) running SCOM2016 to keep monitoring 2008 Servers

        • Kevin Holman

          Your plan is completely unsupportable. We do not support leaving different versions of SCOM in a management group. So I’m afraid that plan is not possible.

          That said – there is no issue monitoring WS2008 OS from SCOM 2019. It is not “unable to” – it works just fine. Almost all my customers monitor WS2008 with SCOM 2019. The boundary is not that it doesn’t work – it is simply that it isn’t supported. That’s because the OS isn’t supported anymore.

          My advice – is to simply upgrade to SCOM 2019 *the right way*. All the way. Then continue monitoring WS2008 just like you always have. We do not support that – because we do not support monitoring it with ANY version of SCOM anymore. 🙂

          • Simon

            Oh that’s great then! Thanks so much for clarifying and clearing up my huge misunderstanding! Would that also apply to moving to SCOM 2022 then? Same principle: it CAN support monitoring of 2008 (which we are in the process of trying to decomm/upgrade) but just “not supported” ?
            This is why I come to your site: so many amazing pieces of SCOM-related info. You’re the best! Thank you again !!!

          • Kevin Holman

            That’s right. You can monitor WS2008 systems using SCOM 2019 or SCOM 2022. Even deploy the SCOM 2019 or SCOM 2022 agent. The only minimum requirement is PowerShell 2.0 or later to be installed.

            SCOM 2022 adds PS version 3.0 requirement, so you might see SOME scripts failing on the WS2008 server that leverage a PS 3.0 function or command, but the original WS2008 OS management pack does not require this to work. SCOM 2022 also adds Microsoft .NET Framework (both the 3.5 and 4.7.2 or higher versions of Microsoft .NET are required.) I’d generally recommend staying with SCOM 2016 or SCOM 2019 agent on WS2008.

          • Simon

            Hi again. I did the SCOM 2016-2019 upgrade this weekend but am encountering strange issues post-upgrade. My 4 Management Servers are all reporting 2019 (with UR3) but are all greyed out in the Management Servers area, and whenever I try to close some alerts it says “the monitor is unhealthy”. Have you ever encountered that before ?

  13. Steve O.

    Step 6 is verify all MPs in use are supported for SCOM 2019.
    Question: Are all older windows non-3rd party MPs supported for SCOM2019?

    I have way too many references and it’s taking me too much time cleaning up overrides. I am getting heat from management to upgrade and need to see if all I have to do is get rid of 3rd party MPs and upgrade.

    Some old MPs I have are AD 2000, 2003, 2008, SQL 2000, Exchange 2003, 2007… etc.

    • Kevin Holman

      No – those old MP’s like you mention are not supported. Not only are they not supported for SCOM 2019 – they aren’t supported *at all* by Microsoft. Once a product falls out of support – Microsoft no longer supports the management pack. These old MP’s should work in SCOM 2019 and likely will not block an upgrade, however, we do not test unsupported MP’s in any scenario.

      I’d recommend investing in getting fast at cleaning up old references and getting rid of old junk – or start completely over with a clean install of SCOM and clean MP’s from scratch. The latter is very hard to do, because there is often so much customization after years of using poor MP practices in typical customer environments.

      I can usually clean up a customer environment and get rid of old unused MP’s in a day or two of work. It comes down to how many custom MP’s they have created, and how many MP’s reference these old unsupported and unused MP’s.

      But you can always attempt the upgrade without doing this work. You are just bringing bad stuff along for the ride and eventually you will not get the value that you should from the monitoring platform.

      • Steve O.

        So having the outdated MPs will not cause any performance issues or DB problems if I keep them?

        You don’t by chance have a blog or steps you do to clean up? 2 days is pretty good.. I have taken a month to clean up just one custom MP.

        • Kevin Holman

          They can cause performance issues, but mostly as we no longer have any instances, the overhead is minimal. The biggest issue I find is the old MOM 2005 Backwards compat MP – that should have been eliminated before 2012.

          If I am going to keep these around – I will disable the root/seed discovery(s) and then run Remove-SCOMDisabledClassInstance. This at least keeps old junk discoveries from running.

          I don’t have a blog post on MP cleanup – it would read terribly because it is a LOT of XML and requires understanding references, aliases, workflows, and how to correct these. Usually, these skills are developed over supporting SCOM for many years. If you have a CSAM, you might investigate if you have the opportunity for the hours to do an advisory call with Microsoft to work on cleaning these up.

  14. Raja

    I need your suggestion on whether a Windows operating system in-place upgrade from 2016 to 2019 is recommended or not. After the OS upgrade, we are planning to upgrade SCOM 2019 to 2022.
    Does the OS in-place upgrade have any impact on the performance of SCOM?

    • Kevin Holman

      We do not make blanket “recommendations” such as this. However, I can tell you that your choices are to either upgrade the OS in place, or to add new management servers with Windows Server 2019 OS, retire the old management servers, then perform the SCOM upgrade. Both scenarios are supported.

  15. Pingback:Upgrade from SCOM 2019 to SCOM 2022 Checklist – Kevin Holman's Blog
