linux-kernel.vger.kernel.org archive mirror
From: Oliver Sang <oliver.sang@intel.com>
To: Hillf Danton <hdanton@sina.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
	"Paul E. McKenney" <paulmck@kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	lkp@intel.com, lkp@lists.01.org, zhengjun.xing@linux.intel.com
Subject: Re: [workqueue]  d5bff968ea: WARNING:at_kernel/workqueue.c:#process_one_work
Date: Wed, 20 Jan 2021 21:41:44 +0800
Message-ID: <20210120134144.GA11090@xsang-OptiPlex-9020>
In-Reply-To: <20210114084248.1819-1-hdanton@sina.com>

[-- Attachment #1: Type: text/plain, Size: 5146 bytes --]

On Thu, Jan 14, 2021 at 04:42:48PM +0800, Hillf Danton wrote:
> Thu, 14 Jan 2021 15:45:11 +0800
> > 
> > FYI, we noticed the following commit (built with gcc-9):
> > 
> > commit: d5bff968ea9cc005e632d9369c26cbd8148c93d5 ("workqueue: break affinity initiatively")
> > https://git.kernel.org/cgit/linux/kernel/git/paulmck/linux-rcu.git dev.2021.01.11b
> > 
> > 
> > in testcase: rcutorture
> > version: 
> > with following parameters:
> > 
> > 	runtime: 300s
> > 	test: cpuhotplug
> > 	torture_type: srcud
> > 
> > test-description: rcutorture is rcutorture kernel module load/unload test.
> > test-url: https://www.kernel.org/doc/Documentation/RCU/torture.txt
> > 
> > 
> > on test machine: qemu-system-i386 -enable-kvm -cpu SandyBridge -smp 2 -m 8G
> > 
> > caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):
> > 
> > 
> > +--------------------------------------------------+------------+------------+
> > |                                                  | 6211b34f6e | d5bff968ea |
> > +--------------------------------------------------+------------+------------+
> > | boot_successes                                   | 4          | 0          |
> > | boot_failures                                    | 0          | 12         |
> > | WARNING:at_kernel/workqueue.c:#process_one_work  | 0          | 12         |
> > | EIP:process_one_work                             | 0          | 12         |
> > | WARNING:at_kernel/kthread.c:#kthread_set_per_cpu | 0          | 4          |
> > | EIP:kthread_set_per_cpu                          | 0          | 4          |
> > +--------------------------------------------------+------------+------------+
> > 
> > 
> > If you fix the issue, kindly add following tag
> > Reported-by: kernel test robot <oliver.sang@intel.com>
> > 
> > 
> > [   73.794288] WARNING: CPU: 0 PID: 22 at kernel/workqueue.c:2192 process_one_work (kbuild/src/consumer/kernel/workqueue.c:2192) 
> > [   73.795012] Modules linked in: rcutorture torture mousedev evbug input_leds led_class psmouse pcspkr tiny_power_button button
> > [   73.795949] CPU: 0 PID: 22 Comm: kworker/1:0 Not tainted 5.11.0-rc3-gd5bff968ea9c #2
> > [   73.796592] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
> > [   73.797280] Workqueue:  0x0 (rcu_gp)
> > [   73.797592] EIP: process_one_work (kbuild/src/consumer/kernel/workqueue.c:2192) 
> 
> 
> Can you run the reproducer with the changes to WQ cut?

Hi, with the patch below applied, the issue still happens. The detailed dmesg is attached.

[ 2.505530] TCP: Hash tables configured (established 32768 bind 32768)
[ 2.506668] -----------[ cut here ]-----------
[ 2.508080] WARNING: CPU: 0 PID: 23 at kernel/workqueue.c:2190 process_one_work+0x92/0x9e0
[ 2.509963] Modules linked in:
[ 2.510692] CPU: 0 PID: 23 Comm: kworker/1:0H Not tainted 5.11.0-rc3-00186-ge7792535d216 #2
[ 2.512608] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
[ 2.514499] EIP: process_one_work+0x92/0x9e0
[ 2.515468] Code: 37 64 a1 58 54 4c 43 39 45 24 74 2c 31 c9 ba 01 00 00 00 c7 04 24 01 00 00 00 b8 08 1d f5 42 e8 74 85 13 00 ff 05 b8 30 04 43 <0f> 0b ba 01 00 00 00 eb 22 8d 74 26 00 90 c7 04 24 01 00 00 00 31
[ 2.516539] EAX: 42f51d08 EBX: 00000000 ECX: 00000000 EDX: 00000001
[ 2.516539] ESI: 43c04780 EDI: de7eb3ec EBP: de7f25e0 ESP: 43d83f08
[ 2.516539] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 EFLAGS: 00010002
[ 2.516539] CR0: 80050033 CR2: 00000000 CR3: 034e3000 CR4: 000406d0
[ 2.516539] Call Trace:
[ 2.516539] ? rcuwait_wake_up+0x53/0x80
[ 2.516539] ? rcuwait_wake_up+0x5/0x80
[ 2.516539] ? worker_thread+0x2dd/0x6a0
[ 2.516539] ? kthread+0x1ba/0x1e0
[ 2.516539] ? create_worker+0x1e0/0x1e0
[ 2.516539] ? kzalloc+0x20/0x20
[ 2.516539] ? ret_from_fork+0x1c/0x28
[ 2.516539] --[ end trace 71c162214dd31179 ]--
[ 2.534063] UDP hash table entries: 2048 (order: 5, 196608 bytes, linear)
[ 2.535774] UDP-Lite hash table entries: 2048 (order: 5, 196608 bytes, linear)
[ 2.537661] NET: Registered protocol family 1
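
For triage: the header line at [ 2.510692] reports "CPU: 0" while the task
name is kworker/1:0H, i.e. a worker named for CPU 1's pool is executing on
CPU 0. That would be consistent with a CPU-affinity check firing in
process_one_work, but that reading is an assumption; the log itself does not
say which condition triggered. Below is a minimal sketch for flagging such
lines when scanning a dmesg. It is not part of the LKP tooling;
kworker_cpu_mismatch is a hypothetical helper name, and it relies on the
"kworker/<cpu>:<id>" naming of per-CPU workers (unbound workers are named
"kworker/u<pool>:<id>" and are reported as unparseable here).

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: inspect one oops header line of the form
 *   "... CPU: 0 PID: 23 Comm: kworker/1:0H ..."
 * Returns 1 if the CPU the task ran on differs from the CPU encoded in
 * the kworker name, 0 if they match, and -1 if the line has no such
 * fields or names an unbound worker (kworker/uN:M).
 */
int kworker_cpu_mismatch(const char *line)
{
	const char *cpu_field = strstr(line, "CPU: ");
	const char *comm_field = strstr(line, "Comm: kworker/");
	int running_cpu, pool_cpu;

	if (!cpu_field || !comm_field)
		return -1;
	if (sscanf(cpu_field, "CPU: %d", &running_cpu) != 1)
		return -1;
	/* "kworker/u8:0" fails to convert here because 'u' is not a digit. */
	if (sscanf(comm_field, "Comm: kworker/%d:", &pool_cpu) != 1)
		return -1;
	return running_cpu != pool_cpu;
}
```

Called on the Comm line at [ 2.510692] above, the helper returns 1; lines
without a Comm field, such as "Modules linked in:", return -1.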

> 
> It seems special to make kworker pcpu because they are not going to
> help either hotplug or stop. If it quiesces the warning then we have
> a fresh start for breaking CPU affinity.
> 
> --- a/kernel/workqueue.c
> +++ b/kernel/workqueue.c
> @@ -1861,8 +1861,6 @@ static void worker_attach_to_pool(struct
>  	 */
>  	if (pool->flags & POOL_DISASSOCIATED)
>  		worker->flags |= WORKER_UNBOUND;
> -	else
> -		kthread_set_per_cpu(worker->task, true);
>  
>  	list_add_tail(&worker->node, &pool->workers);
>  	worker->pool = pool;
> @@ -4922,7 +4920,6 @@ static void unbind_workers(int cpu)
>  		raw_spin_unlock_irq(&pool->lock);
>  
>  		for_each_pool_worker(worker, pool) {
> -			kthread_set_per_cpu(worker->task, false);
>  			WARN_ON_ONCE(set_cpus_allowed_ptr(worker->task, cpu_possible_mask) < 0);
>  		}
>  
> @@ -4979,7 +4976,6 @@ static void rebind_workers(struct worker
>  	for_each_pool_worker(worker, pool) {
>  		WARN_ON_ONCE(set_cpus_allowed_ptr(worker->task,
>  						  pool->attrs->cpumask) < 0);
> -		kthread_set_per_cpu(worker->task, true);
>  	}
>  
>  	raw_spin_lock_irq(&pool->lock);

[-- Attachment #2: dmesg-1.xz --]
[-- Type: application/x-xz, Size: 38252 bytes --]
