On Mon, 2019-09-09 at 11:33 +0200, Juergen Gross wrote:
> Today a cpu which is removed from the system is taken directly from
> Pool0 to the offline state. This will conflict with the new idle
> scheduler, so remove it from Pool0 first. Additionally accept removing
> a free cpu instead of requiring it to be in Pool0.
>
> For the resume failed case we need to call the scheduler code for that
> situation after the cpupool handling, so move the scheduler code into
> a function and call it from cpupool_cpu_remove_forced() and remove the
> CPU_RESUME_FAILED case from cpu_schedule_callback().
>
> Note that we are calling now schedule_cpu_switch() in stop_machine
> context so we need to switch from spinlock_irq to spinlock_irqsave.
>
So, I was looking at this patch, and while doing that, also trying it
out.

I've done the following:

# echo 0 > /sys/devices/system/xen_cpu/xen_cpu7/online

And CPU 7 went offline, and was listed among the free CPUs:

(XEN) Online Cpus: 0-6
(XEN) Free Cpus: 7
(XEN) Cpupool 0:
(XEN) Cpus: 0-6
(XEN) Scheduler: SMP Credit Scheduler rev2 (credit2)
(XEN) Active queues: 1
(XEN) default-weight = 256
(XEN) Runqueue 0:
(XEN) ncpus = 7
(XEN) cpus = 0-6
(XEN) max_weight = 256
(XEN) pick_bias = 1
(XEN) instload = 1
(XEN) aveload = 3992 (~1%)
(XEN) idlers: 0000006f
(XEN) tickled: 00000000
(XEN) fully idle cores: 0000004f

Then, I did:

# echo 1 > /sys/devices/system/xen_cpu/xen_cpu7/online

And again it appears to have worked, i.e., the CPU is back online and
in Pool-0:

(XEN) Online Cpus: 0-7
(XEN) Cpupool 0:
(XEN) Cpus: 0-7
(XEN) Scheduler: SMP Credit Scheduler rev2 (credit2)
(XEN) Active queues: 1
(XEN) default-weight = 256
(XEN) Runqueue 0:
(XEN) ncpus = 8
(XEN) cpus = 0-7
(XEN) max_weight = 256
(XEN) pick_bias = 1
(XEN) instload = 2
(XEN) aveload = 271474 (~103%)
(XEN) idlers: 000000af
(XEN) tickled: 00000000
(XEN) fully idle cores: 0000008f

Then I did:

# echo 0 > /sys/devices/system/xen_cpu/xen_cpu7/online

And, after that:

# xl cpupool-cpu-remove Pool-0 7

And the system hung.

I don't have a working serial console on that testbox, unfortunately,
so I can't poke at debug keys, etc.

Is this anything that you've seen or that you can reproduce?

Regards
-- 
Dario Faggioli, Ph.D
http://about.me/dario.faggioli
Virtualization Software Engineer
SUSE Labs, SUSE https://www.suse.com/