On 1/27/2021 5:21 PM, Hillf Danton wrote: > On Wed, 27 Jan 2021 16:04:25 +0800 Xing Zhengjun wrote: >> On 1/26/2021 3:39 PM, Hillf Danton wrote: >>> On 26 Jan 2021 10:45:21 +0800 Xing Zhengjun wrote: >>>> On 1/25/2021 5:29 PM, Hillf Danton wrote: >>>>> On 25 Jan 2021 16:31:32 +0800 Xing Zhengjun wrote: >>>>>> On 1/22/2021 3:59 PM, Hillf Danton wrote: >>>>>>> On Fri, 22 Jan 2021 09:48:32 +0800 Xing Zhengjun wrote: >>>>>>>> On 1/21/2021 12:00 PM, Hillf Danton wrote: >>>>>>>>> On Wed, 20 Jan 2021 21:46:33 +0800 Oliver Sang wrote: >>>>>>>>>> On Fri, Jan 15, 2021 at 03:24:32PM +0800, Hillf Danton wrote: >>>>>>>>>>> Thu, 14 Jan 2021 15:45:11 +0800 >>>>>>>>>>>> >>>>>>>>>>>> FYI, we noticed the following commit (built with gcc-9): >>>>>>>>>>>> >>>>>>>>>>>> commit: d5bff968ea9cc005e632d9369c26cbd8148c93d5 ("workqueue: break affinity initiatively") >>>>>>>>>>>> https://git.kernel.org/cgit/linux/kernel/git/paulmck/linux-rcu.git dev.2021.01.11b >>>>>>>>>>>> >>>>>>>>>>> [...] >>>>>>>>>>>> >>>>>>>>>>>> [ 73.794288] WARNING: CPU: 0 PID: 22 at kernel/workqueue.c:2192 process_one_work >>>>>>>>>>> >>>>>>>>>>> Thanks for your report. >>>>>>>>>>> >>>>>>>>>>> We can also break CPU affinity by checking POOL_DISASSOCIATED at attach >>>>>>>>>>> time without extra cost paid; that way we have the same behavior as at >>>>>>>>>>> the unbind time. >>>>>>>>>>> >>>>>>>>>>> What is more the change that makes kworker pcpu is cut because they are >>>>>>>>>>> going to not help either hotplug or the mechanism of stop machine. >>>>>>>>>> >>>>>>>>>> hi, by applying below patch, the issue still happened. >>>>>>>>> >>>>>>>>> Thanks for your report. >>>>>>>>>> >>>>>>>>>> [ 4.574467] pci 0000:00:00.0: Limiting direct PCI/PCI transfers >>>>>>>>>> [ 4.575651] pci 0000:00:01.0: Activating ISA DMA hang workarounds >>>>>>>>>> [ 4.576900] pci 0000:00:02.0: Video device with shadowed ROM at [mem 0x000c0000-0x000dffff] >>>>>>>>>> [ 4.578648] PCI: CLS 0 bytes, default 64 >>>>>>>>>> [ 4.579685] Unpacking initramfs... >>>>>>>>>> [ 8.878031] -----------[ cut here ]----------- >>>>>>>>>> [ 8.879083] WARNING: CPU: 0 PID: 22 at kernel/workqueue.c:2187 process_one_work+0x92/0x9e0 >>>>>>>>>> [ 8.880688] Modules linked in: >>>>>>>>>> [ 8.881274] CPU: 0 PID: 22 Comm: kworker/1:0 Not tainted 5.11.0-rc3-gc213503139bb #2 >>>>>>>>> >>>>>>>>> The kworker bond to CPU1 runs on CPU0 and triggers the warning, which >>>>>>>>> shows that scheduler breaks CPU affinity, after 06249738a41a >>>>>>>>> ("workqueue: Manually break affinity on hotplug"), though quite likely >>>>>>>>> by kworker/1:0 for the initial workers. >>>>>>>>> >>>>>>>>>> [ 8.882518] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014 >>>>>>>>>> [ 8.887539] Workqueue: 0x0 (events) >>>>>>>>>> [ 8.887838] EIP: process_one_work+0x92/0x9e0 >>>>>>>>>> [ 8.887838] Code: 37 64 a1 58 54 4c 43 39 45 24 74 2c 31 c9 ba 01 00 00 00 c7 04 24 01 00 00 00 b8 08 1d f5 42 e8 74 85 13 00 ff 05 b8 30 04 43 <0f> 0b ba 01 00 00 00 eb 22 8d 74 26 00 90 c7 04 24 01 00 00 00 31 >>>>>>>>>> [ 8.887838] EAX: 42f51d08 EBX: 00000000 ECX: 00000000 EDX: 00000001 >>>>>>>>>> [ 8.887838] ESI: 43c04720 EDI: 42e45620 EBP: de7f23c0 ESP: 43d7bf08 >>>>>>>>>> [ 8.887838] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 EFLAGS: 00010002 >>>>>>>>>> [ 8.887838] CR0: 80050033 CR2: 00000000 CR3: 034e3000 CR4: 000406d0 >>>>>>>>>> [ 8.887838] Call Trace: >>>>>>>>>> [ 8.887838] ? worker_thread+0x98/0x6a0 >>>>>>>>>> [ 8.887838] ? worker_thread+0x2dd/0x6a0 >>>>>>>>>> [ 8.887838] ? kthread+0x1ba/0x1e0 >>>>>>>>>> [ 8.887838] ? create_worker+0x1e0/0x1e0 >>>>>>>>>> [ 8.887838] ? kzalloc+0x20/0x20 >>>>>>>>>> [ 8.887838] ? ret_from_fork+0x1c/0x28 >>>>>>>>>> [ 8.887838] _warn_unseeded_randomness: 63 callbacks suppressed >>>>>>>>>> [ 8.887838] random: get_random_bytes called from init_oops_id+0x2b/0x60 with crng_init=0 >>>>>>>>>> [ 8.887838] --[ end trace ac461b4d54c37cfa ]-- >>>>>>>>> >>>>>>>>> >>>>>>>>> Instead of creating the initial workers only on the active CPUS, rebind >>>>>>>>> them (labeled pcpu) and jump to the right CPU at bootup time. >>>>>>>>> >>>>>>>>> --- a/kernel/workqueue.c >>>>>>>>> +++ b/kernel/workqueue.c >>>>>>>>> @@ -2385,6 +2385,16 @@ woke_up: >>>>>>>>> return 0; >>>>>>>>> } >>>>>>>>> >>>>>>>>> + if (!(pool->flags & POOL_DISASSOCIATED) && smp_processor_id() != >>>>>>>>> + pool->cpu) { >>>>>>>>> + /* scheduler breaks CPU affinity for us, rebind it */ >>>>>>>>> + raw_spin_unlock_irq(&pool->lock); >>>>>>>>> + set_cpus_allowed_ptr(worker->task, pool->attrs->cpumask); >>>>>>>>> + /* and jump to the right seat */ >>>>>>>>> + schedule_timeout_interruptible(1); >>>>>>>>> + goto woke_up; >>>>>>>>> + } >>>>>>>>> + >>>>>>>>> worker_leave_idle(worker); >>>>>>>>> recheck: >>>>>>>>> /* no more worker necessary? */ >>>>>>>>> -- >>>>>>>>> >>>>>>>> I test the patch, the warning still appears in the kernel log. >>>>>>> >>>>>>> Thanks for your report. >>>>>>>> >>>>>>>> [ 230.356503] smpboot: CPU 1 is now offline >>>>>>>> [ 230.544652] x86: Booting SMP configuration: >>>>>>>> [ 230.545077] smpboot: Booting Node 0 Processor 1 APIC 0x1 >>>>>>>> [ 230.545640] kvm-clock: cpu 1, msr 34f6021, secondary cpu clock >>>>>>>> [ 230.545675] masked ExtINT on CPU#1 >>>>>>>> [ 230.593829] ------------[ cut here ]------------ >>>>>>>> [ 230.594257] WARNING: CPU: 0 PID: 257 at kernel/workqueue.c:2192 process_one_work+0x92/0x9e0 >>>>>>>> [ 230.594990] Modules linked in: rcutorture torture mousedev input_leds >>>>>>>> led_class pcspkr psmouse evbug tiny_power_button button >>>>>>>> [ 230.595961] CPU: 0 PID: 257 Comm: kworker/1:3 Not tainted 5.11.0-rc3-gdcba55d9080f #2 >>>>>>> >>>>>>> Like what was reported, kworker bond to CPU1 runs on CPU0 and triggers >>>>>>> warning, due to scheduler breaking CPU affinity for us. What is new, the >>>>>>> affinity was broken at offline time instead of bootup. >>>>>>> >>>>>>>> [ 230.596621] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014 >>>>>>>> [ 230.597322] Workqueue: 0x0 (rcu_gp) >>>>>>>> [ 230.597636] EIP: process_one_work+0x92/0x9e0 >>>>>>>> [ 230.598005] Code: 37 64 a1 58 54 4c 43 39 45 24 74 2c 31 c9 ba 01 00 >>>>>>>> 00 00 c7 04 24 01 00 00 00 b8 08 1d f5 42 e8 f4 85 13 00 ff 05 cc 30 04 >>>>>>>> 43 <0f> 0b ba 01 00 00 00 eb 22 8d 74 26 00 90 c7 04 24 01 00 00 00 31 >>>>>>>> [ 230.599569] EAX: 42f51d08 EBX: 00000000 ECX: 00000000 EDX: 00000001 >>>>>>>> [ 230.600100] ESI: 43d94240 EDI: df4040f4 EBP: de7f23c0 ESP: bf5f1f08 >>>>>>>> [ 230.600629] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 EFLAGS: 00010002 >>>>>>>> [ 230.601203] CR0: 80050033 CR2: 01bdecbc CR3: 04e2c000 CR4: 000406d0 >>>>>>>> [ 230.601735] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000 >>>>>>>> [ 230.602265] DR6: fffe0ff0 DR7: 00000400 >>>>>>>> [ 230.602594] Call Trace: >>>>>>>> [ 230.602813] ? process_one_work+0x20e/0x9e0 >>>>>>>> [ 230.603181] ? worker_thread+0x32d/0x700 >>>>>>>> [ 230.603522] ? kthread+0x1ba/0x1e0 >>>>>>>> [ 230.603818] ? create_worker+0x1e0/0x1e0 >>>>>>>> [ 230.604157] ? kzalloc+0x20/0x20 >>>>>>>> [ 230.604524] ? ret_from_fork+0x1c/0x28 >>>>>>>> [ 230.604850] ---[ end trace 06b1e66b5e17fa85 ]--- >>>>>>>> [ 230.605504] kvm-guest: stealtime: cpu 1, msr 9e7e6ec0 >>>>>>>> [ 230.766960] smpboot: CPU 1 is now offline >>>>>>>> [ 230.814803] x86: Booting SMP configuration: >>>>>>>> [ 230.815306] smpboot: Booting Node 0 Processor 1 APIC 0x1 >>>>>>>> [ 230.815964] kvm-clock: cpu 1, msr 34f6021, secondary cpu clock >>>>>>> >>>>>>> >>>>>>> Unlike the above diff that is at most papering over the problem >>>>>>> sitting somewhere in the scheduler, add change to creating worker >>>>>>> by skipping set_cpus_allowed_ptr() because we will wake it up after >>>>>>> attaching it to worker pool. >>>>>>> >>>>>>> If we can ignore rescuer for now, then the allowed ptr is only >>>>>>> updated at on/offline time; lets see the difference at boot time. >>>>>>> >>>>>>> >>>>>>> --- a/kernel/workqueue.c >>>>>>> +++ b/kernel/workqueue.c >>>>>>> @@ -1844,16 +1844,10 @@ static struct worker *alloc_worker(int n >>>>>>> * cpu-[un]hotplugs. >>>>>>> */ >>>>>>> static void worker_attach_to_pool(struct worker *worker, >>>>>>> - struct worker_pool *pool) >>>>>>> + struct worker_pool *pool, >>>>>>> + int update_cpus_allowed) >>>>>>> { >>>>>>> mutex_lock(&wq_pool_attach_mutex); >>>>>>> - >>>>>>> - /* >>>>>>> - * set_cpus_allowed_ptr() will fail if the cpumask doesn't have any >>>>>>> - * online CPUs. It'll be re-applied when any of the CPUs come up. >>>>>>> - */ >>>>>>> - set_cpus_allowed_ptr(worker->task, pool->attrs->cpumask); >>>>>>> - >>>>>>> /* >>>>>>> * The wq_pool_attach_mutex ensures %POOL_DISASSOCIATED remains >>>>>>> * stable across this function. See the comments above the flag >>>>>>> @@ -1867,6 +1861,9 @@ static void worker_attach_to_pool(struct >>>>>>> list_add_tail(&worker->node, &pool->workers); >>>>>>> worker->pool = pool; >>>>>>> >>>>>>> + if (update_cpus_allowed) >>>>>>> + set_cpus_allowed_ptr(worker->task, pool->attrs->cpumask); >>>>>>> + >>>>>>> mutex_unlock(&wq_pool_attach_mutex); >>>>>>> } >>>>>>> >>>>>>> @@ -1942,8 +1939,11 @@ static struct worker *create_worker(stru >>>>>>> set_user_nice(worker->task, pool->attrs->nice); >>>>>>> kthread_bind_mask(worker->task, pool->attrs->cpumask); >>>>>>> >>>>>>> - /* successful, attach the worker to the pool */ >>>>>>> - worker_attach_to_pool(worker, pool); >>>>>>> + /* >>>>>>> + * attach the worker to the pool without asking scheduler to >>>>>>> + * update CPUs allowed >>>>>>> + */ >>>>>>> + worker_attach_to_pool(worker, pool, 0); >>>>>>> >>>>>>> /* start the newly created worker */ >>>>>>> raw_spin_lock_irq(&pool->lock); >>>>>>> @@ -2508,7 +2508,7 @@ repeat: >>>>>>> >>>>>>> raw_spin_unlock_irq(&wq_mayday_lock); >>>>>>> >>>>>>> - worker_attach_to_pool(rescuer, pool); >>>>>>> + worker_attach_to_pool(rescuer, pool, 1); >>>>>>> >>>>>>> raw_spin_lock_irq(&pool->lock); >>>>>>> >>>>>>> -- >>>>>>> >>>>>> I test the patch, the warning still appears in the kernel log. >>>>> >>>>> Thanks. >>>>>> >>>>>> [ 55.754187] smpboot: Booting Node 0 Processor 1 APIC 0x1 >>>>>> [ 55.785594] kvm-clock: cpu 1, msr 34f6021, secondary cpu clock >>>>>> [ 55.785646] masked ExtINT on CPU#1 >>>>>> [ 55.920602] ------------[ cut here ]------------ >>>>>> [ 55.921355] WARNING: CPU: 0 PID: 160 at kernel/workqueue.c:2192 process_one_work+0x92/0x9e0 >>>>>> [ 55.922583] Modules linked in: rcutorture torture mousedev evbug >>>>>> input_leds led_class tiny_power_button psmouse pcspkr button >>>>>> [ 55.924294] CPU: 0 PID: 160 Comm: kworker/1:2 Not tainted 5.11.0-rc3-00186-g77bf4e461cfa #2 >>>>> >>>>> Same issue as before. >>>>> >>>>>> [ 55.925552] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014 >>>>>> [ 55.926763] Workqueue: 0x0 (rcu_gp) >>>>>> [ 55.927298] EIP: process_one_work+0x92/0x9e0 >>>>>> [ 55.927950] Code: 37 64 a1 58 54 4c 43 39 45 24 74 2c 31 c9 ba 01 00 >>>>>> 00 00 c7 04 24 01 00 00 00 b8 08 1d f5 42 e8 94 85 13 00 ff 05 b8 30 04 >>>>>> 43 <0f> 0b ba 01 00 00 00 eb 22 8d 74 26 00 90 c7 04 24 01 00 00 00 31 >>>>>> [ 55.930726] EAX: 42f51d08 EBX: 00000000 ECX: 00000000 EDX: 00000001 >>>>>> [ 55.931642] ESI: 43d90540 EDI: df48c0f4 EBP: de7f23c0 ESP: bfb47f08 >>>>>> [ 55.932590] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 EFLAGS: 00010002 >>>>>> [ 55.933609] CR0: 80050033 CR2: 024e994c CR3: 7fd80000 CR4: 000406d0 >>>>>> [ 55.934555] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000 >>>>>> [ 55.935457] DR6: fffe0ff0 DR7: 00000400 >>>>>> [ 55.936041] Call Trace: >>>>>> [ 55.936534] ? process_one_work+0x20e/0x9e0 >>>>>> [ 55.937305] ? worker_thread+0x2dd/0x6a0 >>>>>> [ 55.938018] ? kthread+0x1ba/0x1e0 >>>>>> [ 55.938598] ? create_worker+0x1e0/0x1e0 >>>>>> [ 55.939315] ? kzalloc+0x20/0x20 >>>>>> [ 55.940000] ? ret_from_fork+0x1c/0x28 >>>>>> [ 55.940627] ---[ end trace d155e9e6402de179 ]--- >>>>>> [ 55.941641] kvm-guest: stealtime: cpu 1, msr 9e7e6ec0 >>>>>> [ 56.155271] smpboot: CPU 1 is now offline >>>>>> [ 56.193613] x86: Booting SMP configuration: >>>>>> [ 56.194400] smpboot: Booting Node 0 Processor 1 APIC 0x1 >>>>> >>>>> The changes in the diff below are >>>>> >>>>> 1/ at rescue time, change CPU affinity only if POOL_DISASSOCIATED >>>>> is not set, and print warning the same way as offline time. >>>>> >>>>> 2/ at offine time, dont update allowed CPUs after setting >>>>> POOL_DISASSOCIATED because we no longer have interest in affinity. >>>>> >>>>> 3/ at online time, mark pcpu before binding affinity. >>>>> >>>>> Though one change a diff is appreciated, by the WARNs, we can tell >>>>> which is what if any warning goes into dmesg. >>>>> >>>>> --- a/kernel/workqueue.c >>>>> +++ b/kernel/workqueue.c >>>>> @@ -1844,25 +1844,23 @@ static struct worker *alloc_worker(int n >>>>> * cpu-[un]hotplugs. >>>>> */ >>>>> static void worker_attach_to_pool(struct worker *worker, >>>>> - struct worker_pool *pool) >>>>> + struct worker_pool *pool, int set) >>>>> { >>>>> mutex_lock(&wq_pool_attach_mutex); >>>>> >>>>> /* >>>>> - * set_cpus_allowed_ptr() will fail if the cpumask doesn't have any >>>>> - * online CPUs. It'll be re-applied when any of the CPUs come up. >>>>> - */ >>>>> - set_cpus_allowed_ptr(worker->task, pool->attrs->cpumask); >>>>> - >>>>> - /* >>>>> * The wq_pool_attach_mutex ensures %POOL_DISASSOCIATED remains >>>>> * stable across this function. See the comments above the flag >>>>> * definition for details. >>>>> */ >>>>> if (pool->flags & POOL_DISASSOCIATED) >>>>> worker->flags |= WORKER_UNBOUND; >>>>> - else >>>>> + else { >>>>> kthread_set_per_cpu(worker->task, true); >>>>> + if (set) >>>>> + WARN_ON_ONCE(set_cpus_allowed_ptr(worker->task, >>>>> + pool->attrs->cpumask) < 0); >>>>> + } >>>>> >>>>> list_add_tail(&worker->node, &pool->workers); >>>>> worker->pool = pool; >>>>> @@ -1943,7 +1941,7 @@ static struct worker *create_worker(stru >>>>> kthread_bind_mask(worker->task, pool->attrs->cpumask); >>>>> >>>>> /* successful, attach the worker to the pool */ >>>>> - worker_attach_to_pool(worker, pool); >>>>> + worker_attach_to_pool(worker, pool, 0); >>>>> >>>>> /* start the newly created worker */ >>>>> raw_spin_lock_irq(&pool->lock); >>>>> @@ -2508,7 +2506,7 @@ repeat: >>>>> >>>>> raw_spin_unlock_irq(&wq_mayday_lock); >>>>> >>>>> - worker_attach_to_pool(rescuer, pool); >>>>> + worker_attach_to_pool(rescuer, pool, 1); >>>>> >>>>> raw_spin_lock_irq(&pool->lock); >>>>> >>>>> @@ -4923,7 +4921,6 @@ static void unbind_workers(int cpu) >>>>> >>>>> for_each_pool_worker(worker, pool) { >>>>> kthread_set_per_cpu(worker->task, false); >>>>> - WARN_ON_ONCE(set_cpus_allowed_ptr(worker->task, cpu_possible_mask) < 0); >>>>> } >>>>> >>>>> mutex_unlock(&wq_pool_attach_mutex); >>>>> @@ -4977,9 +4974,9 @@ static void rebind_workers(struct worker >>>>> * from CPU_ONLINE, the following shouldn't fail. >>>>> */ >>>>> for_each_pool_worker(worker, pool) { >>>>> + kthread_set_per_cpu(worker->task, true); >>>>> WARN_ON_ONCE(set_cpus_allowed_ptr(worker->task, >>>>> pool->attrs->cpumask) < 0); >>>>> - kthread_set_per_cpu(worker->task, true); >>>>> } >>>>> >>>>> raw_spin_lock_irq(&pool->lock); >>>>> -- >>>>> >>>> I test the patch, the warning still appears in the kernel log, but the >>> >>> Thanks. >>> >>>> warning is different from before. >>>> >>>> [ 0.054803] kvm-clock: cpu 1, msr 34f6021, secondary cpu clock >>>> [ 0.054803] masked ExtINT on CPU#1 >>>> [ 0.054803] smpboot: CPU 1 Converting physical 0 to logical die 1 >>>> [ 1.890338] ------------[ cut here ]------------ >>>> [ 1.890338] WARNING: CPU: 1 PID: 18 at kernel/kthread.c:508 kthread_set_per_cpu+0x156/0x180 >>>> [ 1.890338] Modules linked in: >>>> [ 1.890338] CPU: 1 PID: 18 Comm: cpuhp/1 Not tainted 5.11.0-rc3-00186-ged03082352b2 #2 >>> >>> This warning is better than the one in process_one_work() which is too >>> late, though, it has something to say. >>> >>>> [ 1.890338] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014 >>>> [ 1.890338] EIP: kthread_set_per_cpu+0x156/0x180 >>>> [ 1.890338] Code: 00 00 00 00 ff 05 68 4e 04 43 83 c4 08 5b 5e 5f c3 >>>> 8d 76 00 ff 05 34 50 04 43 0f 0b e9 f9 fe ff ff 8d 76 00 ff 05 2c 4e 04 >>>> 43 <0f> 0b eb 9d 8d b6 00 00 00 00 ff 05 40 4e 04 43 0f 0b e9 45 ff ff >>>> [ 1.890338] EAX: 42f52ce0 EBX: 00000001 ECX: 00000000 EDX: 00000001 >>>> [ 1.890338] ESI: 43d76300 EDI: 43c0de00 EBP: de7f2564 ESP: 43d6beb8 >>>> [ 1.900350] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 EFLAGS: 00010202 >>>> [ 1.901303] CR0: 80050033 CR2: 00000000 CR3: 034e3000 CR4: 000406d0 >>>> [ 1.902280] Call Trace: >>>> [ 1.902682] ? workqueue_online_cpu+0x12b/0x640 >>>> [ 1.903415] ? workqueue_prepare_cpu+0xa0/0xa0 >>> >>> By the prepare function we know it is a warning at boot time. >>> >>>> [ 1.904155] ? cpuhp_invoke_callback+0x1ed/0x1340 >>>> [ 1.904941] ? cpuhp_thread_fun+0x28f/0x460 >>>> [ 1.905630] ? cpuhp_thread_fun+0x49/0x460 >>>> [ 1.906298] ? smpboot_thread_fn+0x446/0x620 >>>> [ 1.910275] ? kthread+0x1ba/0x1e0 >>>> [ 1.910857] ? __smpboot_create_thread+0x260/0x260 >>>> [ 1.911659] ? kzalloc+0x20/0x20 >>>> [ 1.912368] ? ret_from_fork+0x1c/0x28 >>>> [ 1.913016] ---[ end trace 6f6c005278241eba ]--- >>>> [ 1.913971] kvm-guest: stealtime: cpu 1, msr 9e7e6ec0 >>>> [ 1.920012] smp: Brought up 1 node, 2 CPUs >>>> [ 1.920299] smpboot: Max logical packages: 2 >>>> [ 1.921019] smpboot: Total of 2 processors activated (10774.03 BogoMIPS) >>> >>> After staring at it again, the change added in rebind_workers() is >>> incorrect for a CPU getting back online after bootup, because the >>> CPU affinity has been broken at offline time. >>> >>> But the warning in the latest report occurs at boot time, after going >>> through steps like, >>> >>> kthread_create_on_node(); >>> >>> kthread_bind_mask(worker->task, pool->attrs->cpumask); >>> >>> /* successful, attach the worker to the pool */ >>> worker_attach_to_pool(worker, pool, 0); >>> >>> /* start the newly created worker */ >>> wake_up_process(worker->task); >>> >>> >>> without triggering the warning at kernel/kthread.c:507. >>> >>> Before finding an answer (wake_up_new_task()) to the question, >>> who changed the allowed ptr? revert the change at online time, >>> and as a compensation, clear POOL_DISASSOCIATED as early as we can. >>> >>> >>> --- a/kernel/workqueue.c >>> +++ b/kernel/workqueue.c >>> @@ -1844,25 +1844,23 @@ static struct worker *alloc_worker(int n >>> * cpu-[un]hotplugs. >>> */ >>> static void worker_attach_to_pool(struct worker *worker, >>> - struct worker_pool *pool) >>> + struct worker_pool *pool, int set) >>> { >>> mutex_lock(&wq_pool_attach_mutex); >>> >>> /* >>> - * set_cpus_allowed_ptr() will fail if the cpumask doesn't have any >>> - * online CPUs. It'll be re-applied when any of the CPUs come up. >>> - */ >>> - set_cpus_allowed_ptr(worker->task, pool->attrs->cpumask); >>> - >>> - /* >>> * The wq_pool_attach_mutex ensures %POOL_DISASSOCIATED remains >>> * stable across this function. See the comments above the flag >>> * definition for details. >>> */ >>> if (pool->flags & POOL_DISASSOCIATED) >>> worker->flags |= WORKER_UNBOUND; >>> - else >>> + else { >>> kthread_set_per_cpu(worker->task, true); >>> + if (set) >>> + WARN_ON_ONCE(set_cpus_allowed_ptr(worker->task, >>> + pool->attrs->cpumask) < 0); >>> + } >>> >>> list_add_tail(&worker->node, &pool->workers); >>> worker->pool = pool; >>> @@ -1943,7 +1941,7 @@ static struct worker *create_worker(stru >>> kthread_bind_mask(worker->task, pool->attrs->cpumask); >>> >>> /* successful, attach the worker to the pool */ >>> - worker_attach_to_pool(worker, pool); >>> + worker_attach_to_pool(worker, pool, 0); >>> >>> /* start the newly created worker */ >>> raw_spin_lock_irq(&pool->lock); >>> @@ -2508,7 +2506,7 @@ repeat: >>> >>> raw_spin_unlock_irq(&wq_mayday_lock); >>> >>> - worker_attach_to_pool(rescuer, pool); >>> + worker_attach_to_pool(rescuer, pool, 1); >>> >>> raw_spin_lock_irq(&pool->lock); >>> >>> @@ -4923,7 +4921,6 @@ static void unbind_workers(int cpu) >>> >>> for_each_pool_worker(worker, pool) { >>> kthread_set_per_cpu(worker->task, false); >>> - WARN_ON_ONCE(set_cpus_allowed_ptr(worker->task, cpu_possible_mask) < 0); >>> } >>> >>> mutex_unlock(&wq_pool_attach_mutex); >>> @@ -5954,6 +5951,7 @@ void __init workqueue_init_early(void) >>> i = 0; >>> for_each_cpu_worker_pool(pool, cpu) { >>> BUG_ON(init_worker_pool(pool)); >>> + pool->flags &= ~POOL_DISASSOCIATED; >>> pool->cpu = cpu; >>> cpumask_copy(pool->attrs->cpumask, cpumask_of(cpu)); >>> pool->attrs->nice = std_nice[i++]; >>> -- >>> >> I test the patch, the warning still appears in the kernel log. > > Thanks. >> >> [ 0.043584] masked ExtINT on CPU#1 >> [ 0.043584] smpboot: CPU 1 Converting physical 0 to logical die 1 >> [ 1.681867] ------------[ cut here ]------------ >> [ 1.681867] WARNING: CPU: 1 PID: 18 at kernel/workqueue.c:5015 workqueue_online_cpu+0x336/0x640 > > This is a new one at boot time. > >> [ 1.681867] Modules linked in: >> [ 1.681867] CPU: 1 PID: 18 Comm: cpuhp/1 Not tainted 5.11.0-rc3-00186-g178826956ca7 #2 >> [ 1.681867] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014 >> [ 1.681867] EIP: workqueue_online_cpu+0x336/0x640 >> [ 1.681867] Code: 01 00 00 00 eb 89 8d b6 00 00 00 00 8b 5c 24 0c 89 >> d8 e8 3d 39 19 01 e9 1c fd ff ff 8d b4 26 00 00 00 00 90 ff 05 dc 20 04 >> 43 <0f> 0b e9 07 ff ff ff 8d 76 00 ff 05 2c 21 04 43 0f 0b e9 12 fe ff >> [ 1.681867] EAX: 42f51168 EBX: 0000000c ECX: 00000000 EDX: 00000001 >> [ 1.681867] ESI: 00000000 EDI: 43c04720 EBP: de7f2564 ESP: 43d6bed0 >> [ 1.681867] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 EFLAGS: 00010002 >> [ 1.681867] CR0: 80050033 CR2: 00000000 CR3: 034e3000 CR4: 000406d0 >> [ 1.681867] Call Trace: >> [ 1.681867] ? workqueue_prepare_cpu+0xa0/0xa0 >> [ 1.681867] ? cpuhp_invoke_callback+0x1ed/0x1340 >> [ 1.681867] ? cpuhp_thread_fun+0x28f/0x460 >> [ 1.681867] ? cpuhp_thread_fun+0x49/0x460 >> [ 1.681867] ? smpboot_thread_fn+0x446/0x620 >> [ 1.681867] ? kthread+0x1ba/0x1e0 >> [ 1.681867] ? __smpboot_create_thread+0x260/0x260 >> [ 1.681867] ? kzalloc+0x20/0x20 >> [ 1.681867] ? ret_from_fork+0x1c/0x28 >> [ 1.681867] ---[ end trace 3e853d8ad82e44ac ]--- >> [ 1.691939] kvm-guest: stealtime: cpu 1, msr 9e7e6ec0 >> [ 1.705909] smp: Brought up 1 node, 2 CPUs >> [ 1.706605] smpboot: Max logical packages: 2 >> >> [ 85.402803] smpboot: Booting Node 0 Processor 1 APIC 0x1 >> [ 85.442465] kvm-clock: cpu 1, msr 34f6021, secondary cpu clock >> [ 85.442602] masked ExtINT on CPU#1 >> [ 85.541391] ------------[ cut here ]------------ >> [ 85.542250] WARNING: CPU: 1 PID: 18 at kernel/kthread.c:508 kthread_set_per_cpu+0x156/0x180 > > This is a known one. > >> [ 85.543664] Modules linked in: rcutorture torture mousedev evbug >> input_leds led_class psmouse pcspkr tiny_power_button button >> [ 85.545530] CPU: 1 PID: 18 Comm: cpuhp/1 Tainted: G W >> 5.11.0-rc3-00186-g178826956ca7 #2 >> [ 85.547006] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), >> BIOS 1.12.0-1 04/01/2014 >> [ 85.548454] EIP: kthread_set_per_cpu+0x156/0x180 >> [ 85.549231] Code: 00 00 00 00 ff 05 68 4e 04 43 83 c4 08 5b 5e 5f c3 >> 8d 76 00 ff 05 34 50 04 43 0f 0b e9 f9 fe ff ff 8d 76 00 ff 05 2c 4e 04 >> 43 <0f> 0b eb 9d 8d b6 00 00 00 00 ff 05 40 4e 04 43 0f 0b e9 45 ff ff >> [ 85.552236] EAX: 42f52ce0 EBX: 00000001 ECX: 00000000 EDX: 00000001 >> [ 85.556663] ESI: bfb70000 EDI: 45e61000 EBP: de7f2564 ESP: 43d6beb8 >> [ 85.559168] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 EFLAGS: 00010202 >> [ 85.561991] CR0: 80050033 CR2: 00000000 CR3: 034e3000 CR4: 000406d0 >> [ 85.563040] Call Trace: >> [ 85.563463] ? workqueue_online_cpu+0x17f/0x640 >> [ 85.564222] ? rapl_cpu_online+0xb7/0x1a0 >> [ 85.564904] ? workqueue_prepare_cpu+0xa0/0xa0 >> [ 85.565643] ? cpuhp_invoke_callback+0x1ed/0x1340 >> [ 85.566424] ? cpuhp_thread_fun+0x28f/0x460 >> [ 85.567140] ? cpuhp_thread_fun+0x49/0x460 >> [ 85.567844] ? smpboot_thread_fn+0x446/0x620 >> [ 85.568569] ? kthread+0x1ba/0x1e0 >> [ 85.569162] ? __smpboot_create_thread+0x260/0x260 >> [ 85.569961] ? kzalloc+0x20/0x20 >> [ 85.570918] ? ret_from_fork+0x1c/0x28 >> [ 85.575418] ---[ end trace 3e853d8ad82e44ae ]--- >> [ 85.577919] kvm-guest: stealtime: cpu 1, msr 9e7e6ec0 >> [ 85.580973] ------------[ cut here ]------------ >> [ 85.581747] WARNING: CPU: 0 PID: 87 at kernel/workqueue.c:2190 process_one_work+0x92/0x9e0 >> [ 85.583058] Modules linked in: rcutorture torture mousedev evbug >> input_leds led_class psmouse pcspkr tiny_power_button button >> [ 85.584850] CPU: 0 PID: 87 Comm: kworker/1:2 Tainted: G W 5.11.0-rc3-00186-g178826956ca7 #2 > > This is a big regression. > >> [ 85.586456] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), >> BIOS 1.12.0-1 04/01/2014 >> [ 85.587753] Workqueue: 0x0 (rcu_gp) >> [ 85.588347] EIP: process_one_work+0x92/0x9e0 >> [ 85.589018] Code: 37 64 a1 58 54 4c 43 39 45 24 74 2c 31 c9 ba 01 00 >> 00 00 c7 04 24 01 00 00 00 b8 d8 1c f5 42 e8 14 85 13 00 ff 05 a4 30 04 >> 43 <0f> 0b ba 01 00 00 00 eb 22 8d 74 26 00 90 c7 04 24 01 00 00 00 31 >> [ 85.595358] EAX: 42f51cd8 EBX: df4080f8 ECX: 00000000 EDX: 00000001 >> [ 85.596319] ESI: 43d94840 EDI: df4080f4 EBP: de7f23c0 ESP: 45505f08 >> [ 85.597291] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 EFLAGS: 00010002 >> [ 85.598410] CR0: 80050033 CR2: 372c2004 CR3: 7fa9f000 CR4: 000406d0 >> [ 85.599401] Call Trace: >> [ 85.599804] ? process_one_work+0x20e/0x9e0 >> [ 85.600474] ? worker_thread+0x40d/0x6a0 >> [ 85.601096] ? kthread+0x1ba/0x1e0 >> [ 85.601635] ? create_worker+0x1e0/0x1e0 >> [ 85.602246] ? kzalloc+0x20/0x20 >> [ 85.602907] ? ret_from_fork+0x1c/0x28 >> [ 85.603512] ---[ end trace 3e853d8ad82e44af ]--- >> [ 85.699976] smpboot: CPU 1 is now offline >> [ 85.771891] x86: Booting SMP configuration: >> [ 85.772532] smpboot: Booting Node 0 Processor 1 APIC 0x1 > > 1/ to fix the regression, roll back to the version without warning in > process_one_work(). > > 2/ to see if that version is utterly anything of a blind cat heading > on a blind rat, quiesce the 508 warning for kworkers now. > > --- a/kernel/workqueue.c > +++ b/kernel/workqueue.c > @@ -1844,25 +1844,23 @@ static struct worker *alloc_worker(int n > * cpu-[un]hotplugs. > */ > static void worker_attach_to_pool(struct worker *worker, > - struct worker_pool *pool) > + struct worker_pool *pool, int set) > { > mutex_lock(&wq_pool_attach_mutex); > > /* > - * set_cpus_allowed_ptr() will fail if the cpumask doesn't have any > - * online CPUs. It'll be re-applied when any of the CPUs come up. > - */ > - set_cpus_allowed_ptr(worker->task, pool->attrs->cpumask); > - > - /* > * The wq_pool_attach_mutex ensures %POOL_DISASSOCIATED remains > * stable across this function. See the comments above the flag > * definition for details. > */ > if (pool->flags & POOL_DISASSOCIATED) > worker->flags |= WORKER_UNBOUND; > - else > + else { > kthread_set_per_cpu(worker->task, true); > + if (set) > + WARN_ON_ONCE(set_cpus_allowed_ptr(worker->task, > + pool->attrs->cpumask) < 0); > + } > > list_add_tail(&worker->node, &pool->workers); > worker->pool = pool; > @@ -1943,7 +1941,7 @@ static struct worker *create_worker(stru > kthread_bind_mask(worker->task, pool->attrs->cpumask); > > /* successful, attach the worker to the pool */ > - worker_attach_to_pool(worker, pool); > + worker_attach_to_pool(worker, pool, 0); > > /* start the newly created worker */ > raw_spin_lock_irq(&pool->lock); > @@ -2508,7 +2506,7 @@ repeat: > > raw_spin_unlock_irq(&wq_mayday_lock); > > - worker_attach_to_pool(rescuer, pool); > + worker_attach_to_pool(rescuer, pool, 1); > > raw_spin_lock_irq(&pool->lock); > > @@ -4923,7 +4921,6 @@ static void unbind_workers(int cpu) > > for_each_pool_worker(worker, pool) { > kthread_set_per_cpu(worker->task, false); > - WARN_ON_ONCE(set_cpus_allowed_ptr(worker->task, cpu_possible_mask) < 0); > } > > mutex_unlock(&wq_pool_attach_mutex); > @@ -4977,9 +4974,9 @@ static void rebind_workers(struct worker > * from CPU_ONLINE, the following shouldn't fail. > */ > for_each_pool_worker(worker, pool) { > + kthread_set_per_cpu(worker->task, true); > WARN_ON_ONCE(set_cpus_allowed_ptr(worker->task, > pool->attrs->cpumask) < 0); > - kthread_set_per_cpu(worker->task, true); > } > > raw_spin_lock_irq(&pool->lock); > --- a/kernel/kthread.c > +++ b/kernel/kthread.c > @@ -505,7 +505,7 @@ void kthread_set_per_cpu(struct task_str > > if (set) { > WARN_ON_ONCE(!(k->flags & PF_NO_SETAFFINITY)); > - WARN_ON_ONCE(k->nr_cpus_allowed != 1); > + WARN_ON_ONCE(k->nr_cpus_allowed != 1 && !(k->flags & PF_WQ_WORKER)); > set_bit(KTHREAD_IS_PER_CPU, &kthread->flags); > } else { > clear_bit(KTHREAD_IS_PER_CPU, &kthread->flags); > -- > I test the patch 4 times, no warning appears in the kernel log. -- Zhengjun Xing