linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] Revert mm/vmstat.c: fix vmstat_update() preemption BUG
@ 2018-04-11  9:57 Sebastian Andrzej Siewior
  2018-04-11 13:56 ` Vlastimil Babka
  0 siblings, 1 reply; 9+ messages in thread
From: Sebastian Andrzej Siewior @ 2018-04-11  9:57 UTC (permalink / raw)
  To: linux-mm
  Cc: linux-kernel, tglx, Sebastian Andrzej Siewior, Steven J . Hill,
	Tejun Heo, Andrew Morton

This patch reverts commit c7f26ccfb2c3 ("mm/vmstat.c: fix
vmstat_update() preemption BUG").
Steven saw a "using smp_processor_id() in preemptible" message and
added a preempt_disable() section around it to keep it quiet. This is
not the right thing to do it does not fix the real problem.

vmstat_update() is invoked by a kworker on a specific CPU. This worker
it bound to this CPU. The name of the worker was "kworker/1:1" so it
should have been a worker which was bound to CPU1. A worker which can
run on any CPU would have a `u' before the first digit.

smp_processor_id() can be used in a preempt-enabled region as long as
the task is bound to a single CPU which is the case here. If it could
run on an arbitrary CPU then this is the problem we have an should seek
to resolve.
Not only this smp_processor_id() must not be migrated to another CPU but
also refresh_cpu_vm_stats() which might access wrong per-CPU variables.
Not to mention that other code relies on the fact that such a worker
runs on one specific CPU only.

Therefore I revert that commit and we should look instead what broke the
affinity mask of the kworker.

Cc: Steven J. Hill <steven.hill@cavium.com>
Cc: Tejun Heo <htejun@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
 mm/vmstat.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/mm/vmstat.c b/mm/vmstat.c
index 33581be705f0..40b2db6db6b1 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -1839,11 +1839,9 @@ static void vmstat_update(struct work_struct *w)
 		 * to occur in the future. Keep on running the
 		 * update worker thread.
 		 */
-		preempt_disable();
 		queue_delayed_work_on(smp_processor_id(), mm_percpu_wq,
 				this_cpu_ptr(&vmstat_work),
 				round_jiffies_relative(sysctl_stat_interval));
-		preempt_enable();
 	}
 }
 
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH] Revert mm/vmstat.c: fix vmstat_update() preemption BUG
  2018-04-11  9:57 [PATCH] Revert mm/vmstat.c: fix vmstat_update() preemption BUG Sebastian Andrzej Siewior
@ 2018-04-11 13:56 ` Vlastimil Babka
  2018-04-11 14:09   ` Tejun Heo
  0 siblings, 1 reply; 9+ messages in thread
From: Vlastimil Babka @ 2018-04-11 13:56 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior, linux-mm
  Cc: linux-kernel, tglx, Steven J . Hill, Tejun Heo, Andrew Morton,
	Christoph Lameter

On 04/11/2018 11:57 AM, Sebastian Andrzej Siewior wrote:
> This patch reverts commit c7f26ccfb2c3 ("mm/vmstat.c: fix
> vmstat_update() preemption BUG").
> Steven saw a "using smp_processor_id() in preemptible" message and
> added a preempt_disable() section around it to keep it quiet. This is
> not the right thing to do it does not fix the real problem.
> 
> vmstat_update() is invoked by a kworker on a specific CPU. This worker
> it bound to this CPU. The name of the worker was "kworker/1:1" so it
> should have been a worker which was bound to CPU1. A worker which can
> run on any CPU would have a `u' before the first digit.

Oh my, and I have just been assured by Tejun that his cannot happen :)
And yet, in the original report [1] I see:

CPU: 0 PID: 269 Comm: kworker/1:1 Not tainted

So is this perhaps related to the cpu hotplug that [1] mentions? e.g. is
the cpu being hotplugged cpu 1, the worker started too early before
stuff can be scheduled on the CPU, so it has to run on different than
designated CPU?

[1] https://marc.info/?l=linux-mm&m=152088260625433&w=2

> smp_processor_id() can be used in a preempt-enabled region as long as
> the task is bound to a single CPU which is the case here. If it could
> run on an arbitrary CPU then this is the problem we have an should seek
> to resolve.
> Not only this smp_processor_id() must not be migrated to another CPU but
> also refresh_cpu_vm_stats() which might access wrong per-CPU variables.
> Not to mention that other code relies on the fact that such a worker
> runs on one specific CPU only.
> 
> Therefore I revert that commit and we should look instead what broke the
> affinity mask of the kworker.
> 
> Cc: Steven J. Hill <steven.hill@cavium.com>
> Cc: Tejun Heo <htejun@gmail.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> ---
>  mm/vmstat.c | 2 --
>  1 file changed, 2 deletions(-)
> 
> diff --git a/mm/vmstat.c b/mm/vmstat.c
> index 33581be705f0..40b2db6db6b1 100644
> --- a/mm/vmstat.c
> +++ b/mm/vmstat.c
> @@ -1839,11 +1839,9 @@ static void vmstat_update(struct work_struct *w)
>  		 * to occur in the future. Keep on running the
>  		 * update worker thread.
>  		 */
> -		preempt_disable();
>  		queue_delayed_work_on(smp_processor_id(), mm_percpu_wq,
>  				this_cpu_ptr(&vmstat_work),
>  				round_jiffies_relative(sysctl_stat_interval));
> -		preempt_enable();
>  	}
>  }
>  
> 

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] Revert mm/vmstat.c: fix vmstat_update() preemption BUG
  2018-04-11 13:56 ` Vlastimil Babka
@ 2018-04-11 14:09   ` Tejun Heo
  2018-04-11 14:42     ` Sebastian Andrzej Siewior
  0 siblings, 1 reply; 9+ messages in thread
From: Tejun Heo @ 2018-04-11 14:09 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Sebastian Andrzej Siewior, linux-mm, linux-kernel, tglx,
	Steven J . Hill, Andrew Morton, Christoph Lameter

Hello,

On Wed, Apr 11, 2018 at 03:56:43PM +0200, Vlastimil Babka wrote:
> > vmstat_update() is invoked by a kworker on a specific CPU. This worker
> > it bound to this CPU. The name of the worker was "kworker/1:1" so it
> > should have been a worker which was bound to CPU1. A worker which can
> > run on any CPU would have a `u' before the first digit.
> 
> Oh my, and I have just been assured by Tejun that his cannot happen :)
> And yet, in the original report [1] I see:
> 
> CPU: 0 PID: 269 Comm: kworker/1:1 Not tainted
> 
> So is this perhaps related to the cpu hotplug that [1] mentions? e.g. is
> the cpu being hotplugged cpu 1, the worker started too early before
> stuff can be scheduled on the CPU, so it has to run on different than
> designated CPU?
> 
> [1] https://marc.info/?l=linux-mm&m=152088260625433&w=2

The report says that it happens when hotplug is attempted.  Per-cpu
doesn't pin the cpu alive, so if the cpu goes down while a work item
is in flight or a work item is queued while a cpu is offline it'll end
up executing on some other cpu.  So, if a piece of code doesn't want
that happening, it gotta interlock itself - ie. start queueing when
the cpu comes online and flush and prevent further queueing when its
cpu goes down.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] Revert mm/vmstat.c: fix vmstat_update() preemption BUG
  2018-04-11 14:09   ` Tejun Heo
@ 2018-04-11 14:42     ` Sebastian Andrzej Siewior
  2018-04-11 19:07       ` Sebastian Andrzej Siewior
  0 siblings, 1 reply; 9+ messages in thread
From: Sebastian Andrzej Siewior @ 2018-04-11 14:42 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Vlastimil Babka, linux-mm, linux-kernel, tglx, Steven J . Hill,
	Andrew Morton, Christoph Lameter

On 2018-04-11 07:09:13 [-0700], Tejun Heo wrote:
> Hello,
> 
> On Wed, Apr 11, 2018 at 03:56:43PM +0200, Vlastimil Babka wrote:
> > > vmstat_update() is invoked by a kworker on a specific CPU. This worker
> > > it bound to this CPU. The name of the worker was "kworker/1:1" so it
> > > should have been a worker which was bound to CPU1. A worker which can
> > > run on any CPU would have a `u' before the first digit.
> > 
> > Oh my, and I have just been assured by Tejun that his cannot happen :)
> > And yet, in the original report [1] I see:
> > 
> > CPU: 0 PID: 269 Comm: kworker/1:1 Not tainted
> > 
> > So is this perhaps related to the cpu hotplug that [1] mentions? e.g. is
> > the cpu being hotplugged cpu 1, the worker started too early before
> > stuff can be scheduled on the CPU, so it has to run on different than
> > designated CPU?
> > 
> > [1] https://marc.info/?l=linux-mm&m=152088260625433&w=2
> 
> The report says that it happens when hotplug is attempted.  Per-cpu
> doesn't pin the cpu alive, so if the cpu goes down while a work item
> is in flight or a work item is queued while a cpu is offline it'll end
> up executing on some other cpu.  So, if a piece of code doesn't want
> that happening, it gotta interlock itself - ie. start queueing when
> the cpu comes online and flush and prevent further queueing when its
> cpu goes down.

I missed that cpuhotplug part while reading it. So in that case, let me
add a CPU-hotplug notifier which cancels that work. After all it is not
need once the CPU is gone.

> Thanks.
> 

Sebastian

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] Revert mm/vmstat.c: fix vmstat_update() preemption BUG
  2018-04-11 14:42     ` Sebastian Andrzej Siewior
@ 2018-04-11 19:07       ` Sebastian Andrzej Siewior
  2018-04-18 15:44         ` Sebastian Andrzej Siewior
  2018-06-27 12:36         ` Steven Rostedt
  0 siblings, 2 replies; 9+ messages in thread
From: Sebastian Andrzej Siewior @ 2018-04-11 19:07 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Vlastimil Babka, linux-mm, linux-kernel, tglx, Steven J . Hill,
	Andrew Morton, Christoph Lameter

On 2018-04-11 16:42:21 [+0200], To Tejun Heo wrote:
> > > So is this perhaps related to the cpu hotplug that [1] mentions? e.g. is
> > > the cpu being hotplugged cpu 1, the worker started too early before
> > > stuff can be scheduled on the CPU, so it has to run on different than
> > > designated CPU?
> > > 
> > > [1] https://marc.info/?l=linux-mm&m=152088260625433&w=2
> > 
> > The report says that it happens when hotplug is attempted.  Per-cpu
> > doesn't pin the cpu alive, so if the cpu goes down while a work item
> > is in flight or a work item is queued while a cpu is offline it'll end
> > up executing on some other cpu.  So, if a piece of code doesn't want
> > that happening, it gotta interlock itself - ie. start queueing when
> > the cpu comes online and flush and prevent further queueing when its
> > cpu goes down.
> 
> I missed that cpuhotplug part while reading it. So in that case, let me
> add a CPU-hotplug notifier which cancels that work. After all it is not
> need once the CPU is gone.

This already happens:
- vmstat_shepherd() does get_online_cpus() and within this block it does
  queue_delayed_work_on(). So this has to wait until cpuhotplug
  completed before it can schedule something and then it won't schedule
  anything on the "off" CPU.

- The work item itself (vmstat_update()) schedules itself
  (conditionally) again.

- vmstat_cpu_down_prep() is the down event and does
  cancel_delayed_work_sync(). So it waits for the work-item to complete
  and cancels it.

This looks all good to me.

> > Thanks.

Sebastian

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] Revert mm/vmstat.c: fix vmstat_update() preemption BUG
  2018-04-11 19:07       ` Sebastian Andrzej Siewior
@ 2018-04-18 15:44         ` Sebastian Andrzej Siewior
  2018-04-18 19:54           ` Andrew Morton
  2018-06-27 12:36         ` Steven Rostedt
  1 sibling, 1 reply; 9+ messages in thread
From: Sebastian Andrzej Siewior @ 2018-04-18 15:44 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Vlastimil Babka, linux-mm, linux-kernel, tglx, Steven J . Hill,
	Tejun Heo, Christoph Lameter

ping.
any reason not to accept the revert?

On 2018-04-11 21:07:29 [+0200], To Tejun Heo wrote:
> On 2018-04-11 16:42:21 [+0200], To Tejun Heo wrote:
> > > > So is this perhaps related to the cpu hotplug that [1] mentions? e.g. is
> > > > the cpu being hotplugged cpu 1, the worker started too early before
> > > > stuff can be scheduled on the CPU, so it has to run on different than
> > > > designated CPU?
> > > > 
> > > > [1] https://marc.info/?l=linux-mm&m=152088260625433&w=2
> > > 
> > > The report says that it happens when hotplug is attempted.  Per-cpu
> > > doesn't pin the cpu alive, so if the cpu goes down while a work item
> > > is in flight or a work item is queued while a cpu is offline it'll end
> > > up executing on some other cpu.  So, if a piece of code doesn't want
> > > that happening, it gotta interlock itself - ie. start queueing when
> > > the cpu comes online and flush and prevent further queueing when its
> > > cpu goes down.
> > 
> > I missed that cpuhotplug part while reading it. So in that case, let me
> > add a CPU-hotplug notifier which cancels that work. After all it is not
> > need once the CPU is gone.
> 
> This already happens:
> - vmstat_shepherd() does get_online_cpus() and within this block it does
>   queue_delayed_work_on(). So this has to wait until cpuhotplug
>   completed before it can schedule something and then it won't schedule
>   anything on the "off" CPU.
> 
> - The work item itself (vmstat_update()) schedules itself
>   (conditionally) again.
> 
> - vmstat_cpu_down_prep() is the down event and does
>   cancel_delayed_work_sync(). So it waits for the work-item to complete
>   and cancels it.
> 
> This looks all good to me.
> 
> > > Thanks.

Sebastian

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] Revert mm/vmstat.c: fix vmstat_update() preemption BUG
  2018-04-18 15:44         ` Sebastian Andrzej Siewior
@ 2018-04-18 19:54           ` Andrew Morton
  0 siblings, 0 replies; 9+ messages in thread
From: Andrew Morton @ 2018-04-18 19:54 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: Vlastimil Babka, linux-mm, linux-kernel, tglx, Steven J . Hill,
	Tejun Heo, Christoph Lameter

On Wed, 18 Apr 2018 17:44:36 +0200 Sebastian Andrzej Siewior <bigeasy@linutronix.de> wrote:

> On 2018-04-11 21:07:29 [+0200], To Tejun Heo wrote:
> > On 2018-04-11 16:42:21 [+0200], To Tejun Heo wrote:
> > > > > So is this perhaps related to the cpu hotplug that [1] mentions? e.g. is
> > > > > the cpu being hotplugged cpu 1, the worker started too early before
> > > > > stuff can be scheduled on the CPU, so it has to run on different than
> > > > > designated CPU?
> > > > > 
> > > > > [1] https://marc.info/?l=linux-mm&m=152088260625433&w=2
> > > > 
> > > > The report says that it happens when hotplug is attempted.  Per-cpu
> > > > doesn't pin the cpu alive, so if the cpu goes down while a work item
> > > > is in flight or a work item is queued while a cpu is offline it'll end
> > > > up executing on some other cpu.  So, if a piece of code doesn't want
> > > > that happening, it gotta interlock itself - ie. start queueing when
> > > > the cpu comes online and flush and prevent further queueing when its
> > > > cpu goes down.
> > > 
> > > I missed that cpuhotplug part while reading it. So in that case, let me
> > > add a CPU-hotplug notifier which cancels that work. After all it is not
> > > need once the CPU is gone.
> > 
> > This already happens:
> > - vmstat_shepherd() does get_online_cpus() and within this block it does
> >   queue_delayed_work_on(). So this has to wait until cpuhotplug
> >   completed before it can schedule something and then it won't schedule
> >   anything on the "off" CPU.
> > 
> > - The work item itself (vmstat_update()) schedules itself
> >   (conditionally) again.
> > 
> > - vmstat_cpu_down_prep() is the down event and does
> >   cancel_delayed_work_sync(). So it waits for the work-item to complete
> >   and cancels it.
> > 
> > This looks all good to me.
> > 
> > > > Thanks.

(top-posting repaired, Please don't do that - how am I supposed to
reply to you while maintaining appropriate context?)

> ping.
> any reason not to accept the revert?
> 

That will make the warnings come back.  Or was the hotplug issue
addressed by other means?  If so, that fix should be referred to in
the changelog.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] Revert mm/vmstat.c: fix vmstat_update() preemption BUG
  2018-04-11 19:07       ` Sebastian Andrzej Siewior
  2018-04-18 15:44         ` Sebastian Andrzej Siewior
@ 2018-06-27 12:36         ` Steven Rostedt
  2018-06-27 12:47           ` Sebastian Andrzej Siewior
  1 sibling, 1 reply; 9+ messages in thread
From: Steven Rostedt @ 2018-06-27 12:36 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: Tejun Heo, Vlastimil Babka, linux-mm, linux-kernel, tglx,
	Steven J . Hill, Andrew Morton, Christoph Lameter

On Wed, Apr 11, 2018 at 09:07:30PM +0200, Sebastian Andrzej Siewior wrote:
> 
> This already happens:
> - vmstat_shepherd() does get_online_cpus() and within this block it does
>   queue_delayed_work_on(). So this has to wait until cpuhotplug
>   completed before it can schedule something and then it won't schedule
>   anything on the "off" CPU.

But can't we have something like this happen: ?

	CPU0			CPU1			CPU2
	----			----			----
 get_online_cpus()
 queue_work(vmstat_update, cpu1)
    wakeup(kworker/1)
			     High prio task running
 put_online_cpus()
 						     Shutdown CPU 1
			     migrate kworker/1
 schedule kworker/1
 (smp_processor_id() != 1)

-- Steve


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] Revert mm/vmstat.c: fix vmstat_update() preemption BUG
  2018-06-27 12:36         ` Steven Rostedt
@ 2018-06-27 12:47           ` Sebastian Andrzej Siewior
  0 siblings, 0 replies; 9+ messages in thread
From: Sebastian Andrzej Siewior @ 2018-06-27 12:47 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Tejun Heo, Vlastimil Babka, linux-mm, linux-kernel, tglx,
	Steven J . Hill, Andrew Morton, Christoph Lameter

On 2018-06-27 08:36:57 [-0400], Steven Rostedt wrote:
> On Wed, Apr 11, 2018 at 09:07:30PM +0200, Sebastian Andrzej Siewior wrote:
> > 
> > This already happens:
> > - vmstat_shepherd() does get_online_cpus() and within this block it does
> >   queue_delayed_work_on(). So this has to wait until cpuhotplug
> >   completed before it can schedule something and then it won't schedule
> >   anything on the "off" CPU.
> 
> But can't we have something like this happen: ?
> 
> 	CPU0			CPU1			CPU2
> 	----			----			----
>  get_online_cpus()
>  queue_work(vmstat_update, cpu1)
>     wakeup(kworker/1)
> 			     High prio task running
>  put_online_cpus()
>  						     Shutdown CPU 1
> 			     migrate kworker/1
>  schedule kworker/1
>  (smp_processor_id() != 1)

I don't get CPU1 doing in here. kworker/1 should not be migrated to
begin with. The work item should be done before CPU2 shutdowns CPU1 (or
0) because vmstat_cpu_down_prep() cancels the work while the CPUs goes
down so it should not be migrated. Also, the work is enqueued while
holding the CPU HP lock.

Sebastian

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2018-06-27 12:47 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-04-11  9:57 [PATCH] Revert mm/vmstat.c: fix vmstat_update() preemption BUG Sebastian Andrzej Siewior
2018-04-11 13:56 ` Vlastimil Babka
2018-04-11 14:09   ` Tejun Heo
2018-04-11 14:42     ` Sebastian Andrzej Siewior
2018-04-11 19:07       ` Sebastian Andrzej Siewior
2018-04-18 15:44         ` Sebastian Andrzej Siewior
2018-04-18 19:54           ` Andrew Morton
2018-06-27 12:36         ` Steven Rostedt
2018-06-27 12:47           ` Sebastian Andrzej Siewior

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).