
How to multihome a large number of agents in SCOM

Quick download:  https://github.com/thekevinholman/MultiHomeSCOMAgents

 

I have written solutions that include tasks to add and remove management group assignments to SCOM agents before:

https://kevinholman.com/2017/05/09/scom-management-mp-making-a-scom-admins-life-a-little-easier/

 

But, what if you are doing a side by side SCOM migration to a new management group, and you have thousands of agents to move?  There are a lot of challenges with that:

 

1.  Moving them manually with a task would be very time consuming.

2.  Agents that are down or in maintenance mode are not available to multi-home.

3.  If you move all the agents at once, you might overwhelm the destination management group, or your local virtualization hosts with the simultaneous activity.

 

I have written a Management Pack called “SCOM.MultiHome” that will manage these issues more gracefully.

 

It contains one rule (disabled by default), which will multihome your agents to your intended ManagementGroup and ManagementServer.  These values are overridable, so you can specify different management servers initially if you wish.

 

This rule is special – in how it runs.  It is configured to check once per day (86400 seconds) to see if it needs to multi-home the agent.  If it is already multi-homed, it will do nothing.  If it is not multi-homed to the desired management group, it will add the new management group and management server.

But what is most special, is the timing.  Once enabled, it has a special scheduler datasource parameter using SpreadInitializationOverInterval.  This is very powerful:

<DataSource ID="Scheduler" TypeID="System!System.Scheduler">
  <Scheduler>
    <SimpleReccuringSchedule>
      <Interval Unit="Seconds">86400</Interval>
      <SpreadInitializationOverInterval Unit="Seconds">86400</SpreadInitializationOverInterval>
    </SimpleReccuringSchedule>
    <ExcludeDates />
  </Scheduler>
</DataSource>

 

What this will do, is run once per day, but the workflow will not initialize immediately.  It will initialize randomly within the time window provided.  In the example above – this is 86400 seconds, or 24 hours.  This means if I enable the above rule for all agents, they will not run it immediately, but randomly pick a time between NOW and 24 hours from now to run the multi-home script.  This keeps us from overwhelming the new environment with hundreds or thousands of agents all at once.  You can even make this window bigger or smaller if you desire by editing the XML here.
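
To get a feel for how much this spreading flattens the load, here is a small Python sketch. It is not part of the MP and not how SCOM implements the scheduler. It assumes each agent picks a uniformly random start offset inside the window, which is my own modelling assumption (the text above only says the time is picked randomly), and the function name is hypothetical.

```python
import random
from collections import Counter


def simulate_spread(agent_count, window_seconds=86400, seed=None):
    """Return a list of agents-per-hour counts across the window.

    Assumption (not from the MP): each agent's first run lands at a
    uniformly random offset within window_seconds.
    """
    rng = random.Random(seed)
    # One random start offset (in seconds) per agent.
    offsets = [rng.randrange(window_seconds) for _ in range(agent_count)]
    # Bucket the offsets by hour.
    per_hour = Counter(offset // 3600 for offset in offsets)
    hours = (window_seconds + 3599) // 3600  # number of hour buckets
    return [per_hour.get(h, 0) for h in range(hours)]


if __name__ == "__main__":
    counts = simulate_spread(3000, seed=1)
    print("busiest hour:", max(counts), "agents")
    print("average per hour:", sum(counts) / len(counts))
```

Under that assumption, 3000 agents over a 24 hour window average 125 agents per hour instead of 3000 in the first minute. Shrinking window_seconds shows how much a shorter window raises that rate.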

It is literally THAT simple.

 

Now – some caveats:

This solution works really well for UP TO 3000 agents.  We only support 3000 agents per management server, and this simple solution will send all agents to the SAME management server, so you might want to change this for larger environments.  You can override and enable this solution for specific custom groups, or you can use my more advanced MP, which has some customized groups in it.  More on that below:

 

 

In the additional management pack (SCOM.Advanced.MultiHome) – there are custom groups included.

This MP contains 8 Groups.

 


Let’s say you have a management group with 4000 agents.  If you multi-homed all of these to a new management group at once, it would overwhelm the new management group and take a very long time to catch up.  You will see terrible SQL blocking on your OpsMgr database and 2115 events about binding on discovery data while this happens.

The idea is to break up your agents into smaller groups, then override the multi-home rule using these groups in a phased approach.  You can start with 500 agents over a 4 hour period, and see how that works and how long it takes to catch up.  Then add more and more groups until all agents are multi-homed.

These groups will self-populate, dividing your agents among them.  Each group discovery queries the SCOM database and selects its members by an integer range (StartNumber through EndNumber in the discovery XML).  By default each group contains 500 agents, but you will need to adjust this for your total agent count.


  <DataSource ID="DS" TypeID="SCOM.MultiHome.SQLBased.Group.Discovery.DataSource">
    <IntervalSeconds>86400</IntervalSeconds>
    <SyncTime>20:00</SyncTime>
    <GroupID>Group1</GroupID>
    <StartNumber>1</StartNumber>
    <EndNumber>500</EndNumber>
    <TimeoutSeconds>300</TimeoutSeconds>
  </DataSource>
</Discovery>
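
When adjusting the groups for your own agent count, it can help to work out the StartNumber and EndNumber values up front. The Python sketch below is only a planning aid, not something shipped with the MP. It assumes the groups cover contiguous, inclusive ranges (Group1 is 1 to 500, so Group2 would start at 501), which is my reading of the sample above and not something the XML excerpt confirms. The function name is hypothetical.

```python
import math


def group_ranges(total_agents, group_count=8):
    """Split agents numbered 1..total_agents into contiguous ranges.

    Returns (group_id, start_number, end_number) tuples. Assumes inclusive,
    non-overlapping ranges, one per group.
    """
    size = math.ceil(total_agents / group_count)  # agents per group, rounded up
    ranges = []
    for i in range(group_count):
        start = i * size + 1
        if start > total_agents:
            break  # fewer agents than groups: leave remaining groups empty
        end = min((i + 1) * size, total_agents)
        ranges.append((f"Group{i + 1}", start, end))
    return ranges


if __name__ == "__main__":
    for group_id, start, end in group_ranges(4000):
        print(group_id, start, end)
```

For 4000 agents and 8 groups this yields 500 agents per group, matching the default EndNumber in the sample.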

Also note there is a sync time set on each group, about 5 minutes apart.  This keeps all the groups from populating at once.  You will need to set this to your desired time, or wait until 10pm local time for them to start populating.
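
If you change the sync times, you will probably want to keep them staggered. The Python sketch below just generates SyncTime values a fixed number of minutes apart. The starting time of 20:00 is taken from the sample XML and the 5 minute gap from the description above. The exact values in the shipped MP may differ, and the function name is hypothetical.

```python
from datetime import datetime, timedelta


def staggered_sync_times(first="20:00", group_count=8, gap_minutes=5):
    """Return one HH:MM sync time per group, spaced gap_minutes apart."""
    start = datetime.strptime(first, "%H:%M")
    return [
        (start + timedelta(minutes=gap_minutes * i)).strftime("%H:%M")
        for i in range(group_count)
    ]


if __name__ == "__main__":
    for number, sync_time in enumerate(staggered_sync_times(), start=1):
        print(f"Group{number}: <SyncTime>{sync_time}</SyncTime>")
```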

 

Wrap up:

Using this MP, we resolve the biggest issues with side by side migrations:

 

1.  No manual multi-homing is required.

2.  Agents that are down or in maintenance mode will gracefully multi-home when they come back up.

3.  Using the groups, you can control the load placed on the new management group and test the migration in phases.

4.  Using the groups, you can load balance the destination management group across different management servers easily.

53 Comments

  1. Raleine-Ann Asis

    Can I apply this to only select group of Agents? Where will this MP be installed – in the management group where agents are currently running or the Management group where the agents will be migrated over?
    Will there be an impact on the monitoring behavior if the agent version of my new management group is different from the current management group that is live?

    • Kevin Holman

      Yes, via a group. Install this MP in the “old” management group. As far as different agent versions, you should review the supported coexistence statements in the product documentation.

      • Eugene

        Hello Kevin, sorry for guffy question but how I can do it via group? should I change the xml before import? Thank you.

  2. Vipin Prasad K

    Hi Kevin,

    I have installed the MP on SCOM 2016, though I don't see the agents getting populated in the group automatically. Do we have to enable any other settings apart from enabling the rule? I have changed the agent count to 100, as for now we are monitoring 600 servers overall.

    Regards,
    Vipin

    • Kevin Holman

      Those groups only populate once a day based on a sync time. Wait long enough, and look on your management servers for the events they log to see what's going on.

  3. Greg Smith

    Hi Kevin, We just started using the MP on a SCOM 2019 install. I believe that I found an issue with the way that the MStoASSIGN$ parameter gets passed to the SCOM.MultiHome.AddMG.Rule.WA.ps1 PowerShell script. For some reason, a space is getting added to the end of the string which caused an exception when the AddManagementGroup method was called? I updated the PowerShell code and added $MStoASSIGN = $MStoASSIGN.Trim() and the issue seems to be fixed.

    Exception calling “AddManagementGroup” with “3” argument(s): “The parameter is incorrect. (Exception from HRESULT: 0x80070057 (E_INVALIDARG))”

  4. Gordon

    Kevin,

    I’ve said it before and I’ll say it again. You are absolutely amazing. Thank you very much for all your fantastic contributions to SCOM community. I know I’ve appreciated a lot.

    Cheers
    G

  5. Richard

    In our current environment we set global management server settings to – “review new manual agent installations in pending management”. Will this management pack just add the SCOM agents to the new management group without being approved because they already were previously approved in the old management group?

      • Moise

        Hey Kevin,
        What is supposed to be the target of the override? Is it a group or the agent class? I used the agent class in dev and it worked, but when I tried the group in prod it did not work.

        • Kevin Holman

          For small environments – just override the rule for the Agent class. This will start multihoming ALL agents.
          For big environments – this is where the override should be scoped to a Group. Check the group memberships to ensure they first have populated with agents.

  6. martijn

    I can see how this will make any managed agent multihome. So how is the cleanup handled to make the switch in a side by side like you suggest?

  7. Jasper

    Kevin,

    What is your experience with multi-homing Linux agents? I am looking to multi-home our old 1807 setup to a 2019 setup but I’m a little bit afraid this may cause issues as the agent for SCOM 2019 is newer.
    Are these compatible with each other?

  8. Greg

    Hi Kevin,

    Do I need to upgrade my 2012r2 agents from v7 to v10 prior to dual homing?
    Will the 2012 environment still communicate with v10 agent?
    Will the v7 Agent communicate with SCOM 2019 environment?

    Thanks

  9. Georg

    Hi Kevin,

    I downloaded and installed your MultiHome MP into our SCOM 2012 R2. When I scope the Management Pack Objects, I only find the targets for the “SCOM MultiHome GroupX SQLQueryBased” from the Management Pack SCOM Multihome. For some reason, the mentioned Rule is not present.

    regards,
    Georg

  10. Britt Johnson

    Thank you so much for the knowledge and skills you share. They’re invaluable. We plan to use this tool to migrate our 8,000 systems from SCOM 2012 to SCOM 2019. I can use the SpreadInitializationOverInterval and EndNumber fields to increase the number of machines per group to 1,000 and stretch the run time to 8 hours. But is there any way to control the machines processed in a more granular way? About 200 of our systems are Windows 2008 and won’t be migrated. Another significant bunch are mission critical and must be moved carefully/manually. Finally, Management would like to see a successful pilot using well less than 1,000 servers!! Any ideas on how to select specific machines from these 8 groups? Thanks!

    • Kevin Holman

      There are a few things you can do.

      You can put in the multi-homing script, to check the OS version, and do nothing if it is 2008 OS.

      You can also create your own custom groups…. you don’t need to use mine…. or use group exclusions to exclude specific members from the groups. This way you can verify your groups are good to go, before taking action. You can edit the group membership count to any number you want, for a pilot, or just create a custom group. The groups are only used for simple overrides.

  11. Sarav

    Hi Kevin, we have SCOM 2019 agents on our machines and we would like to multihome the 2019 agent back to a SCOM 2012 R2 management server. Is it compatible?
    If not, can the SCOM 2019 agent at least report to SCOM version 1801?

  12. Praym

    Hi Kevin,
    After we multihome, what is the best way to upgrade the scom agent on clients from 2012 to 2019? Is there any script that we can upgrade remotely in bulk instead of manually installing 2019 msi?

  13. Sanja

    Hi Kevin,

    We use multihomed agents for the agent servers, but is there a way a gateway servers to be multihomed? If yes can you please share how?

  14. Leo Landicho

    Hi Kevin. I am a little bit confused. There are currently 3 MPs present on your link and all are SCOM.Multihome when imported in SCOM. Am I correct to assume that I will only use one of these MPs? Let say if I want to use the groupings in case I have large number of agents, I will install the version 1.0.0.2 this MP basically has the groupings. If I only have a small environment, I can use either the version 1.0.0.7 or the other one. I have used the version 1.0.0.7 but I don’t see any agents transferring to the new MG. I override SCOM MultiHome Agent to Additional Management Group Rule parameters: Enabled, MGtoAdd and MGtoAssign. This is targeting Class Agent. Please let me know if I am getting this correct. Thanks.

    • Kevin Holman

      SCOM.MultiHome.xml – basic simple MP with a rule to multihome agents.

      SCOM.Advanced.MultiHome.xml – advanced MP with multiple groups to control multi-home activities for huge environments.

      SCOM.MultiHome.GW.xml – advanced MP which requires XML editing – to customize for a customer with up to 5 gateways. This requires advanced authoring experience if you have more than 5 gateways.

      • Leo Landicho

        Thanks Kevin. I figured it out. I forgot to change the security settings to enable review new manual agent installation in pending management. My bad. Currently setting up my test environment of SCOM 2019. Thanks for your help.

  15. Pietro C

    Kevin, I have a prod 2016 SCOM env and created a side-by-side 2019.
    2016 has 4 mgmt servers (let’s call them 2016-01, 02, 03 and 04) for a total of 4500 VMs. 2019 will only have 2 mgmt servers (let’s call it 2019-01 and 02).
    2016-01 has 2500 VMs and can go to 2019-01
    2016-02 has 600 – goes to 2019-02
    2016-03 has 900 – goes to 2019-02
    2016-04 has 500 – goes to 2019-02
    If I understand your Multihome MP correctly, I would install the MP to 2016-01, set the destination mgmt server to 2019-01 and run the advanced group and that should only multihome the VMs in 2016-01 to 2019-01?
    Then I would repeat the MP on 2016-02,03 and 04 and change the mgmt server destination to 2019-02?
    I just want to make sure that when I perform the MP on 2016-01, it does ONLY the VMs residing within that mgmt server and not all 4.

  16. Leo Landicho

    Hi Kevin, I would like to know how about Audit Collection after a successful multihoming? Since we are doing side-by-side migration, I will be setting up again ACS in the new server. Will the domain controllers be forwarding on both collectors, in our case old and new SCOM management servers? I haven’t done anything yet as I don’t know the effect yet. Hope to hear your thoughts as to how to move old forwarding to the new server and what is the best way to do it. Thanks.

  17. Brian Engel

    This MP has already saved me a ton of time. However, my environment has 3 domains and it of course only worked on the primary domain that my SCOM servers sit in. The other two domains point to GW servers for their respective domains. I have groups created in SCOM 2012 R2 for the respective domains; however, when I go to assign the MP to a group for override, those groups don’t show up in the list.
