On 9/17/21 5:55 PM, Huang, Ying wrote:
>> @@ -3147,6 +3177,16 @@ static void __set_migration_target_nodes
>>  	int node;
>>
>>  	/*
>> +	 * The "migration path" array is heavily optimized
>> +	 * for reads.  This is the write side which incurs a
>> +	 * very heavy synchronize_rcu().  Avoid this overhead
>> +	 * when nothing of consequence has changed since the
>> +	 * last write.
>> +	 */
>> +	if (!node_demotion_topo_changed())
>> +		return;
>> +
>> +	/*
>>  	 * Avoid any oddities like cycles that could occur
>>  	 * from changes in the topology.  This will leave
>>  	 * a momentary gap when migration is disabled.
> Now synchronize_rcu() is called in disable_all_migrate_targets(), which
> is called for MEM_GOING_OFFLINE.  Can we remove the synchronize_rcu()
> from disable_all_migrate_targets() and call it in
> __set_migration_target_nodes() before we update the node_demotion[]?

I see what you are saying.  This patch only targeted
__set_migration_target_nodes(), which is called for
MEM_ONLINE/OFFLINE.  But it missed MEM_GOING_OFFLINE's call to
disable_all_migrate_targets().

I think I found something better than what I had in this patch, or the
tweak you suggested: the 'memory_notify->status_change_nid' field is
passed to all memory hotplug notifiers and tells us whether a node is
going online/offline.  Instead of trying to track the changes ourselves,
I think we can simply rely on that field.

This removes the need for the demotion code to track *any* state.  I've
attached a totally untested patch to do this.
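
For illustration only, here is a rough userspace sketch of the idea.  It
is not the attached patch.  Assumptions: 'struct memory_notify' is mocked
with just the one field, the MEM_* values are placeholders rather than
the kernel's, status_change_nid is taken to be negative when no node
changes state, and migrate_on_reclaim_sketch() plus the two counters are
hypothetical names that stand in for the real (expensive) update paths.

```c
#include <assert.h>

/* Mock action codes; the values are placeholders, not the kernel's. */
enum mem_action { MEM_ONLINE, MEM_GOING_OFFLINE, MEM_OFFLINE };

/* Mock of the one field used here; the kernel struct has more members. */
struct memory_notify {
	int status_change_nid;	/* node changing state, negative if none */
};

/* Hypothetical counters standing in for the expensive update paths. */
static int nr_disable_calls;
static int nr_rebuild_calls;

static void disable_all_migrate_targets(void)
{
	nr_disable_calls++;	/* the real function would wait for RCU readers */
}

static void set_migration_target_nodes(void)
{
	nr_rebuild_calls++;	/* the real function would rebuild node_demotion[] */
}

/* Returns 1 if the demotion order was touched, 0 if the event was skipped. */
static int migrate_on_reclaim_sketch(enum mem_action action,
				     const struct memory_notify *arg)
{
	assert(arg);

	/* No node is changing state, so the demotion order cannot change. */
	if (arg->status_change_nid < 0)
		return 0;

	switch (action) {
	case MEM_GOING_OFFLINE:
		disable_all_migrate_targets();
		break;
	case MEM_ONLINE:
	case MEM_OFFLINE:
		set_migration_target_nodes();
		break;
	}
	return 1;
}
```

The early return is the whole point: hotplug events that do not change
any node's state never reach the update paths, so no extra "did the
topology change" bookkeeping is needed in the demotion code.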