linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* BUG: using __this_cpu_read() in preemptible [00000000] code: mm_percpu_wq/7
@ 2017-06-07  9:05 Andre Wild
  2017-07-12 15:44 ` Christopher Lameter
  0 siblings, 1 reply; 6+ messages in thread
From: Andre Wild @ 2017-06-07  9:05 UTC (permalink / raw)
  To: cl; +Cc: linux-mm, linux-kernel, heiko.carstens

Hi Christoph,

I'm currently seeing the following message running kernel version 4.11.0.
It looks like it was introduced with the patch 4037d452202e34214e8a939fa5621b2b3bbb45b7.
Can you please take a look at this problem?

[Tue Jun  6 15:27:03 2017] BUG: using __this_cpu_read() in preemptible [00000000] code: mm_percpu_wq/7
[Tue Jun  6 15:27:03 2017] caller is refresh_cpu_vm_stats+0x198/0x3d8
[Tue Jun  6 15:27:03 2017] CPU: 0 PID: 7 Comm: mm_percpu_wq Tainted: G        W       4.11.0-20170529.0.ae409ab.224a322.fc25.s390xdefault #1
[Tue Jun  6 15:27:03 2017] Workqueue: mm_percpu_wq vmstat_update
[Tue Jun  6 15:27:03 2017] Call Trace:
[Tue Jun  6 15:27:03 2017] ([<00000000001138ea>] show_trace+0x8a/0xe0)
[Tue Jun  6 15:27:03 2017]  [<00000000001139c0>] show_stack+0x80/0xd8
[Tue Jun  6 15:27:03 2017]  [<000000000074488e>] dump_stack+0x96/0xd8
[Tue Jun  6 15:27:03 2017]  [<000000000077afaa>] check_preemption_disabled+0xea/0x108
[Tue Jun  6 15:27:03 2017]  [<00000000002ec198>] refresh_cpu_vm_stats+0x198/0x3d8
[Tue Jun  6 15:27:03 2017]  [<00000000002ed306>] vmstat_update+0x2e/0x98
[Tue Jun  6 15:27:03 2017]  [<0000000000167450>] process_one_work+0x3d8/0x780
[Tue Jun  6 15:27:03 2017]  [<00000000001688dc>] rescuer_thread+0x224/0x3d0
[Tue Jun  6 15:27:03 2017]  [<0000000000170096>] kthread+0x166/0x178
[Tue Jun  6 15:27:03 2017]  [<0000000000a4d69a>] kernel_thread_starter+0x6/0xc
[Tue Jun  6 15:27:03 2017]  [<0000000000a4d694>] kernel_thread_starter+0x0/0xc
[Tue Jun  6 15:27:03 2017] INFO: lockdep is turned off.


Kind regards,

AndrA(C)

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: BUG: using __this_cpu_read() in preemptible [00000000] code: mm_percpu_wq/7
  2017-06-07  9:05 BUG: using __this_cpu_read() in preemptible [00000000] code: mm_percpu_wq/7 Andre Wild
@ 2017-07-12 15:44 ` Christopher Lameter
  2017-07-12 16:13   ` Heiko Carstens
  2017-08-16  9:13   ` Heiko Carstens
  0 siblings, 2 replies; 6+ messages in thread
From: Christopher Lameter @ 2017-07-12 15:44 UTC (permalink / raw)
  To: Andre Wild; +Cc: linux-mm, linux-kernel, heiko.carstens

On Wed, 7 Jun 2017, Andre Wild wrote:

> I'm currently seeing the following message running kernel version 4.11.0.
> It looks like it was introduced with the patch
> 4037d452202e34214e8a939fa5621b2b3bbb45b7.

A 2007 patch? At that point we did not have __this_cpu_read() nor
refresh_cpu_vmstats.... Is this on s390 or some such architecture?


> Can you please take a look at this problem?

Could you give me a bit more context?


> [Tue Jun  6 15:27:03 2017] BUG: using __this_cpu_read() in preemptible
> [00000000] code: mm_percpu_wq/7
> [Tue Jun  6 15:27:03 2017] caller is refresh_cpu_vm_stats+0x198/0x3d8
> [Tue Jun  6 15:27:03 2017] CPU: 0 PID: 7 Comm: mm_percpu_wq Tainted: G
> W       4.11.0-20170529.0.ae409ab.224a322.fc25.s390xdefault #1
> [Tue Jun  6 15:27:03 2017] Workqueue: mm_percpu_wq vmstat_update

It is run in preemptible mode but this from a kworker
context so the processor cannot change (see vmstat_refresh()).

Even on s390 or so this should be fine.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: BUG: using __this_cpu_read() in preemptible [00000000] code: mm_percpu_wq/7
  2017-07-12 15:44 ` Christopher Lameter
@ 2017-07-12 16:13   ` Heiko Carstens
  2017-08-16  9:13   ` Heiko Carstens
  1 sibling, 0 replies; 6+ messages in thread
From: Heiko Carstens @ 2017-07-12 16:13 UTC (permalink / raw)
  To: Christopher Lameter; +Cc: Andre Wild, linux-mm, linux-kernel

On Wed, Jul 12, 2017 at 10:44:54AM -0500, Christopher Lameter wrote:
> On Wed, 7 Jun 2017, Andre Wild wrote:
> 
> > I'm currently seeing the following message running kernel version 4.11.0.
> > It looks like it was introduced with the patch
> > 4037d452202e34214e8a939fa5621b2b3bbb45b7.
> 
> A 2007 patch? At that point we did not have __this_cpu_read() nor
> refresh_cpu_vmstats.... Is this on s390 or some such architecture?

It is on s390, but after I looked into the code a bit deeper the mentioned
patch doesn't seem to be the problem.

My initial thought was a missing preempt_disable() / preempt_enable() pair,
but that can't be the problem, since the code is executed on a per-cpu
workqueue.

> > Can you please take a look at this problem?
> 
> Could you give me a bit more context?
> 
> 
> > [Tue Jun  6 15:27:03 2017] BUG: using __this_cpu_read() in preemptible
> > [00000000] code: mm_percpu_wq/7
> > [Tue Jun  6 15:27:03 2017] caller is refresh_cpu_vm_stats+0x198/0x3d8
> > [Tue Jun  6 15:27:03 2017] CPU: 0 PID: 7 Comm: mm_percpu_wq Tainted: G
> > W       4.11.0-20170529.0.ae409ab.224a322.fc25.s390xdefault #1
> > [Tue Jun  6 15:27:03 2017] Workqueue: mm_percpu_wq vmstat_update
> 
> It is run in preemptible mode but this from a kworker
> context so the processor cannot change (see vmstat_refresh()).
> 
> Even on s390 or so this should be fine.

Yes, it *should* be fine. The only unusual thing here is that this happens
during quite a lot of cpu hotplug operations. So even though the workqueue
code should be able to handle cpu hotplug correctly, my best guess is that
current->cpus_allowed is not cpumask_of(this_cpu) for some reason.

That would be this check within lib/smp_processor_id.c:check_preemption_disabled()

	if (cpumask_equal(&current->cpus_allowed, cpumask_of(this_cpu)))
		goto out;

I changed the code to simply panic, so I can look into a dump to figure out
what actually does cause the warning. As soon as Andre finds some time to
reproduce this we will come back to you.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: BUG: using __this_cpu_read() in preemptible [00000000] code: mm_percpu_wq/7
  2017-07-12 15:44 ` Christopher Lameter
  2017-07-12 16:13   ` Heiko Carstens
@ 2017-08-16  9:13   ` Heiko Carstens
  2017-08-16 14:20     ` Tejun Heo
  1 sibling, 1 reply; 6+ messages in thread
From: Heiko Carstens @ 2017-08-16  9:13 UTC (permalink / raw)
  To: Christopher Lameter, Tejun Heo; +Cc: Andre Wild, linux-mm, linux-kernel

Tejun,

can you make any sense of the below? It _looks_ like a bug in the workqueue
code. Andre's testcase is also doing cpu hotplug until we finally see a
workqueue.c warning.

On Wed, Jul 12, 2017 at 10:44:54AM -0500, Christopher Lameter wrote:
> On Wed, 7 Jun 2017, Andre Wild wrote:
> 
> > I'm currently seeing the following message running kernel version 4.11.0.
> > It looks like it was introduced with the patch
> > 4037d452202e34214e8a939fa5621b2b3bbb45b7.
> 
> A 2007 patch? At that point we did not have __this_cpu_read() nor
> refresh_cpu_vmstats.... Is this on s390 or some such architecture?
> > Can you please take a look at this problem?
> 
> Could you give me a bit more context?

Cc'ing Tejun, since this might be workqueue related.

> > [Tue Jun  6 15:27:03 2017] BUG: using __this_cpu_read() in preemptible [00000000] code: mm_percpu_wq/7
> > [Tue Jun  6 15:27:03 2017] caller is refresh_cpu_vm_stats+0x198/0x3d8
> > [Tue Jun  6 15:27:03 2017] CPU: 0 PID: 7 Comm: mm_percpu_wq Tainted: GW       4.11.0-20170529.0.ae409ab.224a322.fc25.s390xdefault #1
> > [Tue Jun  6 15:27:03 2017] Workqueue: mm_percpu_wq vmstat_update
> 
> It is run in preemptible mode but this from a kworker
> context so the processor cannot change (see vmstat_refresh()).
> 
> Even on s390 or so this should be fine.

Sorry for this late answer. Andre reproduced the problem on vanilla
4.13.0-rc4 with this small patch applied (whitespace damaged due to
copy-paste), so that we have a dump to look at:

diff --git a/lib/smp_processor_id.c b/lib/smp_processor_id.c
index 2fb007be0212..dc18575044c3 100644
--- a/lib/smp_processor_id.c
+++ b/lib/smp_processor_id.c
@@ -44,7 +44,7 @@ notrace static unsigned int check_preemption_disabled(const char *what1,

        print_symbol("caller is %s\n", (long)__builtin_return_address(0));
        dump_stack();
-
+       panic("preempt check\n");
 out_enable:
        preempt_enable_no_resched_notrace();
 out:

With that applied we see:

[ 5968.010352] WARNING: CPU: 54 PID: 7 at kernel/workqueue.c:2041 process_one_work+0x6d4/0x718

(I don't remember we have seen the warning above in the first report) and then

[ 5968.010913] Kernel panic - not syncing: preempt check
[ 5968.010919] CPU: 54 PID: 7 Comm: mm_percpu_wq Tainted: G        W       4.13.0-rc4-dirty #3
[ 5968.010923] Hardware name: IBM 3906 M03 703 (z/VM 6.4.0)
[ 5968.010927] Workqueue: mm_percpu_wq vmstat_update
[ 5968.010933] Call Trace:
[ 5968.010937] ([<0000000000113fbe>] show_stack+0x8e/0xe0)
[ 5968.010942]  [<0000000000a514be>] dump_stack+0x96/0xd8
[ 5968.010947]  [<000000000014302a>] panic+0x102/0x248
[ 5968.010952]  [<00000000007836d8>] check_preemption_disabled+0xf8/0x110
[ 5968.010956]  [<00000000002ee8e2>] refresh_cpu_vm_stats+0x1b2/0x400
[ 5968.010961]  [<00000000002ef8be>] vmstat_update+0x2e/0x98
[ 5968.010965]  [<0000000000166374>] process_one_work+0x3d4/0x718
[ 5968.010970]  [<000000000016708c>] rescuer_thread+0x214/0x390
[ 5968.010974]  [<000000000016edbc>] kthread+0x16c/0x180
[ 5968.010978]  [<0000000000a7273a>] kernel_thread_starter+0x6/0xc
[ 5968.010983]  [<0000000000a72734>] kernel_thread_starter+0x0/0xc

On cpu 54 we have mm_percpu_wq with:

    nr_cpus_allowed = 0x1,
    cpus_allowed = {
	      bits = {0x4, 0x0, 0x0, 0x0}
    },

We also have CONFIG_NR_CPUS=256, so the above translates to cpu 3, which
obviously is not cpu 54 and explains the preempt check warning.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: BUG: using __this_cpu_read() in preemptible [00000000] code: mm_percpu_wq/7
  2017-08-16  9:13   ` Heiko Carstens
@ 2017-08-16 14:20     ` Tejun Heo
  2017-09-01  9:36       ` Andre Wild
  0 siblings, 1 reply; 6+ messages in thread
From: Tejun Heo @ 2017-08-16 14:20 UTC (permalink / raw)
  To: Heiko Carstens; +Cc: Christopher Lameter, Andre Wild, linux-mm, linux-kernel

Hello,

On Wed, Aug 16, 2017 at 11:13:07AM +0200, Heiko Carstens wrote:
> [ 5968.010352] WARNING: CPU: 54 PID: 7 at kernel/workqueue.c:2041 process_one_work+0x6d4/0x718
> 
> (I don't remember we have seen the warning above in the first report) and then
> 
> [ 5968.010913] Kernel panic - not syncing: preempt check
> [ 5968.010919] CPU: 54 PID: 7 Comm: mm_percpu_wq Tainted: G        W       4.13.0-rc4-dirty #3
> [ 5968.010923] Hardware name: IBM 3906 M03 703 (z/VM 6.4.0)
> [ 5968.010927] Workqueue: mm_percpu_wq vmstat_update
> [ 5968.010933] Call Trace:
> [ 5968.010937] ([<0000000000113fbe>] show_stack+0x8e/0xe0)
> [ 5968.010942]  [<0000000000a514be>] dump_stack+0x96/0xd8
> [ 5968.010947]  [<000000000014302a>] panic+0x102/0x248
> [ 5968.010952]  [<00000000007836d8>] check_preemption_disabled+0xf8/0x110
> [ 5968.010956]  [<00000000002ee8e2>] refresh_cpu_vm_stats+0x1b2/0x400
> [ 5968.010961]  [<00000000002ef8be>] vmstat_update+0x2e/0x98
> [ 5968.010965]  [<0000000000166374>] process_one_work+0x3d4/0x718
> [ 5968.010970]  [<000000000016708c>] rescuer_thread+0x214/0x390
> [ 5968.010974]  [<000000000016edbc>] kthread+0x16c/0x180
> [ 5968.010978]  [<0000000000a7273a>] kernel_thread_starter+0x6/0xc
> [ 5968.010983]  [<0000000000a72734>] kernel_thread_starter+0x0/0xc
> 
> On cpu 54 we have mm_percpu_wq with:
> 
>     nr_cpus_allowed = 0x1,
>     cpus_allowed = {
> 	      bits = {0x4, 0x0, 0x0, 0x0}
>     },
> 
> We also have CONFIG_NR_CPUS=256, so the above translates to cpu 3, which
> obviously is not cpu 54 and explains the preempt check warning.

Looks like the same issue Paul was hitting.

 http://lkml.kernel.org/r/1501541603-4456-3-git-send-email-paulmck@linux.vnet.ibm.com

Can you see whether the above patch helps?

Thank.s

-- 
tejun

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: BUG: using __this_cpu_read() in preemptible [00000000] code: mm_percpu_wq/7
  2017-08-16 14:20     ` Tejun Heo
@ 2017-09-01  9:36       ` Andre Wild
  0 siblings, 0 replies; 6+ messages in thread
From: Andre Wild @ 2017-09-01  9:36 UTC (permalink / raw)
  To: Tejun Heo, Heiko Carstens; +Cc: Christopher Lameter, linux-mm, linux-kernel

On 08/16/2017 04:20 PM, Tejun Heo wrote:
> Hello,
> 
> On Wed, Aug 16, 2017 at 11:13:07AM +0200, Heiko Carstens wrote:
>> [ 5968.010352] WARNING: CPU: 54 PID: 7 at kernel/workqueue.c:2041 process_one_work+0x6d4/0x718
>>
>> (I don't remember we have seen the warning above in the first report) and then
>>
>> [ 5968.010913] Kernel panic - not syncing: preempt check
>> [ 5968.010919] CPU: 54 PID: 7 Comm: mm_percpu_wq Tainted: G        W       4.13.0-rc4-dirty #3
>> [ 5968.010923] Hardware name: IBM 3906 M03 703 (z/VM 6.4.0)
>> [ 5968.010927] Workqueue: mm_percpu_wq vmstat_update
>> [ 5968.010933] Call Trace:
>> [ 5968.010937] ([<0000000000113fbe>] show_stack+0x8e/0xe0)
>> [ 5968.010942]  [<0000000000a514be>] dump_stack+0x96/0xd8
>> [ 5968.010947]  [<000000000014302a>] panic+0x102/0x248
>> [ 5968.010952]  [<00000000007836d8>] check_preemption_disabled+0xf8/0x110
>> [ 5968.010956]  [<00000000002ee8e2>] refresh_cpu_vm_stats+0x1b2/0x400
>> [ 5968.010961]  [<00000000002ef8be>] vmstat_update+0x2e/0x98
>> [ 5968.010965]  [<0000000000166374>] process_one_work+0x3d4/0x718
>> [ 5968.010970]  [<000000000016708c>] rescuer_thread+0x214/0x390
>> [ 5968.010974]  [<000000000016edbc>] kthread+0x16c/0x180
>> [ 5968.010978]  [<0000000000a7273a>] kernel_thread_starter+0x6/0xc
>> [ 5968.010983]  [<0000000000a72734>] kernel_thread_starter+0x0/0xc
>>
>> On cpu 54 we have mm_percpu_wq with:
>>
>>      nr_cpus_allowed = 0x1,
>>      cpus_allowed = {
>> 	      bits = {0x4, 0x0, 0x0, 0x0}
>>      },
>>
>> We also have CONFIG_NR_CPUS=256, so the above translates to cpu 3, which
>> obviously is not cpu 54 and explains the preempt check warning.
> 
> Looks like the same issue Paul was hitting.
> 
>   http://lkml.kernel.org/r/1501541603-4456-3-git-send-email-paulmck@linux.vnet.ibm.com
> 
> Can you see whether the above patch helps?
> 
> Thank.s
> 

Hello,

please excuse my late response. But I've found another kernel panic 
which stopped my test case execution each time. Now I've managed to work 
around the problem and it looks pretty good. With the patch I was not 
able to reproduce the problem within 24 hours runtime. Previously I 
could trigger it within 2-4 hours runtime.

Kind regards

AndrA(C)

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2017-09-01  9:36 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-06-07  9:05 BUG: using __this_cpu_read() in preemptible [00000000] code: mm_percpu_wq/7 Andre Wild
2017-07-12 15:44 ` Christopher Lameter
2017-07-12 16:13   ` Heiko Carstens
2017-08-16  9:13   ` Heiko Carstens
2017-08-16 14:20     ` Tejun Heo
2017-09-01  9:36       ` Andre Wild

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).