All of lore.kernel.org
 help / color / mirror / Atom feed
From: Tejun Heo <tj@kernel.org>
To: Michal Hocko <mhocko@kernel.org>
Cc: Mel Gorman <mgorman@techsingularity.net>,
	Vlastimil Babka <vbabka@suse.cz>,
	Dmitry Vyukov <dvyukov@google.com>,
	Christoph Lameter <cl@linux.com>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	syzkaller <syzkaller@googlegroups.com>,
	Andrew Morton <akpm@linux-foundation.org>
Subject: Re: mm: deadlock between get_online_cpus/pcpu_alloc
Date: Tue, 7 Feb 2017 12:03:19 -0500	[thread overview]
Message-ID: <20170207170319.GA6164@htj.duckdns.org> (raw)
In-Reply-To: <20170207153459.GV5065@dhcp22.suse.cz>

Hello,

Sorry about the delay.

On Tue, Feb 07, 2017 at 04:34:59PM +0100, Michal Hocko wrote:
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index c3358d4f7932..b6411816787a 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -2343,7 +2343,16 @@ void drain_local_pages(struct zone *zone)
>  
>  static void drain_local_pages_wq(struct work_struct *work)
>  {
> +	/*
> +	 * drain_all_pages doesn't use proper cpu hotplug protection so
> +	 * we can race with cpu offline when the WQ can move this from
> +	 * a cpu pinned worker to an unbound one. We can operate on a different
> +	 * cpu which is allright but we also have to make sure to not move to
> +	 * a different one.
> +	 */
> +	preempt_disable();
>  	drain_local_pages(NULL);
> +	preempt_enable();
>  }
>  
>  /*
> @@ -2379,12 +2388,6 @@ void drain_all_pages(struct zone *zone)
>  	}
>  
>  	/*
> -	 * As this can be called from reclaim context, do not reenter reclaim.
> -	 * An allocation failure can be handled, it's simply slower
> -	 */
> -	get_online_cpus();
> -
> -	/*
>  	 * We don't care about racing with CPU hotplug event
>  	 * as offline notification will cause the notified
>  	 * cpu to drain that CPU pcps and on_each_cpu_mask
> @@ -2423,7 +2426,6 @@ void drain_all_pages(struct zone *zone)
>  	for_each_cpu(cpu, &cpus_with_pcps)
>  		flush_work(per_cpu_ptr(&pcpu_drain, cpu));
>  
> -	put_online_cpus();
>  	mutex_unlock(&pcpu_drain_mutex);

I think this would work; however, a more canonical way would be
something along the line of...

  drain_all_pages()
  {
	  ...
	  spin_lock();
	  for_each_possible_cpu() {
		  if (this cpu should get drained) {
			  queue_work_on(this cpu's work);
		  }
	  }
	  spin_unlock();
	  ...
  }

  offline_hook()
  {
	  spin_lock();
	  this cpu should get drained = false;
	  spin_unlock();
	  queue_work_on(this cpu's work);
	  flush_work(this cpu's work);
  }

I think what workqueue should do is automatically flush in-flight CPU
work items on CPU offline and erroring out on queue_work_on() on
offline CPUs.  And we now actually can do that because we have lifted
the guarantee that queue_work() is local CPU affine some releases ago.
I'll look into it soonish.

For the time being, either approach should be fine.  The more
canonical one might be a bit less surprising but the
preempt_disable/enable() change is short and sweet and completely fine
for the case at hand.

Please feel free to add

Acked-by: Tejun Heo <tj@kernel.org>

Thanks.

-- 
tejun

WARNING: multiple messages have this Message-ID (diff)
From: Tejun Heo <tj@kernel.org>
To: Michal Hocko <mhocko@kernel.org>
Cc: Mel Gorman <mgorman@techsingularity.net>,
	Vlastimil Babka <vbabka@suse.cz>,
	Dmitry Vyukov <dvyukov@google.com>,
	Christoph Lameter <cl@linux.com>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	syzkaller <syzkaller@googlegroups.com>,
	Andrew Morton <akpm@linux-foundation.org>
Subject: Re: mm: deadlock between get_online_cpus/pcpu_alloc
Date: Tue, 7 Feb 2017 12:03:19 -0500	[thread overview]
Message-ID: <20170207170319.GA6164@htj.duckdns.org> (raw)
In-Reply-To: <20170207153459.GV5065@dhcp22.suse.cz>

Hello,

Sorry about the delay.

On Tue, Feb 07, 2017 at 04:34:59PM +0100, Michal Hocko wrote:
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index c3358d4f7932..b6411816787a 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -2343,7 +2343,16 @@ void drain_local_pages(struct zone *zone)
>  
>  static void drain_local_pages_wq(struct work_struct *work)
>  {
> +	/*
> +	 * drain_all_pages doesn't use proper cpu hotplug protection so
> +	 * we can race with cpu offline when the WQ can move this from
> +	 * a cpu pinned worker to an unbound one. We can operate on a different
> +	 * cpu which is allright but we also have to make sure to not move to
> +	 * a different one.
> +	 */
> +	preempt_disable();
>  	drain_local_pages(NULL);
> +	preempt_enable();
>  }
>  
>  /*
> @@ -2379,12 +2388,6 @@ void drain_all_pages(struct zone *zone)
>  	}
>  
>  	/*
> -	 * As this can be called from reclaim context, do not reenter reclaim.
> -	 * An allocation failure can be handled, it's simply slower
> -	 */
> -	get_online_cpus();
> -
> -	/*
>  	 * We don't care about racing with CPU hotplug event
>  	 * as offline notification will cause the notified
>  	 * cpu to drain that CPU pcps and on_each_cpu_mask
> @@ -2423,7 +2426,6 @@ void drain_all_pages(struct zone *zone)
>  	for_each_cpu(cpu, &cpus_with_pcps)
>  		flush_work(per_cpu_ptr(&pcpu_drain, cpu));
>  
> -	put_online_cpus();
>  	mutex_unlock(&pcpu_drain_mutex);

I think this would work; however, a more canonical way would be
something along the line of...

  drain_all_pages()
  {
	  ...
	  spin_lock();
	  for_each_possible_cpu() {
		  if (this cpu should get drained) {
			  queue_work_on(this cpu's work);
		  }
	  }
	  spin_unlock();
	  ...
  }

  offline_hook()
  {
	  spin_lock();
	  this cpu should get drained = false;
	  spin_unlock();
	  queue_work_on(this cpu's work);
	  flush_work(this cpu's work);
  }

I think what workqueue should do is automatically flush in-flight CPU
work items on CPU offline and erroring out on queue_work_on() on
offline CPUs.  And we now actually can do that because we have lifted
the guarantee that queue_work() is local CPU affine some releases ago.
I'll look into it soonish.

For the time being, either approach should be fine.  The more
canonical one might be a bit less surprising but the
preempt_disable/enable() change is short and sweet and completely fine
for the case at hand.

Please feel free to add

Acked-by: Tejun Heo <tj@kernel.org>

Thanks.

-- 
tejun

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  parent reply	other threads:[~2017-02-07 17:03 UTC|newest]

Thread overview: 120+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-01-29 12:44 mm: deadlock between get_online_cpus/pcpu_alloc Dmitry Vyukov
2017-01-29 12:44 ` Dmitry Vyukov
2017-01-29 17:22 ` Vlastimil Babka
2017-01-29 17:22   ` Vlastimil Babka
2017-01-30 15:48   ` Dmitry Vyukov
2017-01-30 15:48     ` Dmitry Vyukov
2017-02-06 19:13     ` Dmitry Vyukov
2017-02-06 19:13       ` Dmitry Vyukov
2017-02-06 22:05       ` Mel Gorman
2017-02-06 22:05         ` Mel Gorman
2017-02-07  8:48         ` Michal Hocko
2017-02-07  8:48           ` Michal Hocko
2017-02-07  9:23           ` Vlastimil Babka
2017-02-07  9:23             ` Vlastimil Babka
2017-02-07  9:46             ` Mel Gorman
2017-02-07  9:46               ` Mel Gorman
2017-02-07  9:53             ` Michal Hocko
2017-02-07  9:53               ` Michal Hocko
2017-02-07 10:42             ` Mel Gorman
2017-02-07 10:42               ` Mel Gorman
2017-02-07 11:13               ` Mel Gorman
2017-02-07 11:13                 ` Mel Gorman
2017-02-07  9:43           ` Mel Gorman
2017-02-07  9:43             ` Mel Gorman
2017-02-07  9:49             ` Vlastimil Babka
2017-02-07  9:49               ` Vlastimil Babka
2017-02-07 10:05               ` Michal Hocko
2017-02-07 10:05                 ` Michal Hocko
2017-02-07 10:28               ` Mel Gorman
2017-02-07 10:28                 ` Mel Gorman
2017-02-07 10:35                 ` Michal Hocko
2017-02-07 10:35                   ` Michal Hocko
2017-02-07 11:34                   ` Mel Gorman
2017-02-07 11:34                     ` Mel Gorman
2017-02-07 11:43                     ` Michal Hocko
2017-02-07 11:43                       ` Michal Hocko
2017-02-07 11:54                       ` Vlastimil Babka
2017-02-07 11:54                         ` Vlastimil Babka
2017-02-07 12:08                         ` Michal Hocko
2017-02-07 12:08                           ` Michal Hocko
2017-02-07 12:37                       ` Michal Hocko
2017-02-07 12:37                         ` Michal Hocko
2017-02-07 12:43                         ` Vlastimil Babka
2017-02-07 12:43                           ` Vlastimil Babka
2017-02-07 12:48                           ` Michal Hocko
2017-02-07 12:48                             ` Michal Hocko
2017-02-07 13:57                             ` Vlastimil Babka
2017-02-07 13:57                               ` Vlastimil Babka
2017-02-07 13:58                         ` Mel Gorman
2017-02-07 13:58                           ` Mel Gorman
2017-02-07 14:19                           ` Michal Hocko
2017-02-07 14:19                             ` Michal Hocko
2017-02-07 15:34                             ` Michal Hocko
2017-02-07 15:34                               ` Michal Hocko
2017-02-07 16:22                               ` Mel Gorman
2017-02-07 16:22                                 ` Mel Gorman
2017-02-07 16:41                                 ` Michal Hocko
2017-02-07 16:41                                   ` Michal Hocko
2017-02-07 16:55                                   ` Christoph Lameter
2017-02-07 16:55                                     ` Christoph Lameter
2017-02-07 22:25                                     ` Thomas Gleixner
2017-02-07 22:25                                       ` Thomas Gleixner
2017-02-08  7:35                                       ` Michal Hocko
2017-02-08  7:35                                         ` Michal Hocko
2017-02-08 12:02                                         ` Thomas Gleixner
2017-02-08 12:02                                           ` Thomas Gleixner
2017-02-08 12:21                                           ` Michal Hocko
2017-02-08 12:21                                             ` Michal Hocko
2017-02-08 12:26                                           ` Mel Gorman
2017-02-08 12:26                                             ` Mel Gorman
2017-02-08 13:23                                             ` Thomas Gleixner
2017-02-08 13:23                                               ` Thomas Gleixner
2017-02-08 14:03                                               ` Mel Gorman
2017-02-08 14:03                                                 ` Mel Gorman
2017-02-08 14:11                                                 ` Peter Zijlstra
2017-02-08 14:11                                                   ` Peter Zijlstra
2017-02-08 15:11                                         ` Christoph Lameter
2017-02-08 15:11                                           ` Christoph Lameter
2017-02-08 15:21                                           ` Michal Hocko
2017-02-08 15:21                                             ` Michal Hocko
2017-02-08 16:17                                             ` Christoph Lameter
2017-02-08 16:17                                               ` Christoph Lameter
2017-02-08 17:46                                               ` Thomas Gleixner
2017-02-08 17:46                                                 ` Thomas Gleixner
2017-02-09  3:15                                                 ` Christoph Lameter
2017-02-09  3:15                                                   ` Christoph Lameter
2017-02-09 11:42                                                   ` Thomas Gleixner
2017-02-09 11:42                                                     ` Thomas Gleixner
2017-02-09 14:00                                                     ` Christoph Lameter
2017-02-09 14:00                                                       ` Christoph Lameter
2017-02-09 14:53                                                       ` Thomas Gleixner
2017-02-09 14:53                                                         ` Thomas Gleixner
2017-02-09 15:42                                                         ` Christoph Lameter
2017-02-09 15:42                                                           ` Christoph Lameter
2017-02-09 16:12                                                           ` Thomas Gleixner
2017-02-09 16:12                                                             ` Thomas Gleixner
2017-02-09 17:22                                                             ` Christoph Lameter
2017-02-09 17:22                                                               ` Christoph Lameter
2017-02-09 17:40                                                               ` Thomas Gleixner
2017-02-09 17:40                                                                 ` Thomas Gleixner
2017-02-09 19:15                                                               ` Michal Hocko
2017-02-09 19:15                                                                 ` Michal Hocko
2017-02-10 17:58                                                                 ` Christoph Lameter
2017-02-10 17:58                                                                   ` Christoph Lameter
2017-02-08 15:06                                       ` Christoph Lameter
2017-02-08 15:06                                         ` Christoph Lameter
2017-02-07 17:03                               ` Tejun Heo [this message]
2017-02-07 17:03                                 ` Tejun Heo
2017-02-07 20:16                                 ` Michal Hocko
2017-02-07 20:16                                   ` Michal Hocko
2017-02-07 13:03                       ` Mel Gorman
2017-02-07 13:03                         ` Mel Gorman
2017-02-07 13:48                         ` Michal Hocko
2017-02-07 13:48                           ` Michal Hocko
2017-02-07 11:24         ` Tetsuo Handa
2017-02-07 11:24           ` Tetsuo Handa
2017-02-07  8:43       ` Michal Hocko
2017-02-07  8:43         ` Michal Hocko
2017-02-07 21:53       ` Thomas Gleixner
2017-02-07 21:53         ` Thomas Gleixner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170207170319.GA6164@htj.duckdns.org \
    --to=tj@kernel.org \
    --cc=akpm@linux-foundation.org \
    --cc=cl@linux.com \
    --cc=dvyukov@google.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mgorman@techsingularity.net \
    --cc=mhocko@kernel.org \
    --cc=mingo@kernel.org \
    --cc=peterz@infradead.org \
    --cc=syzkaller@googlegroups.com \
    --cc=tglx@linutronix.de \
    --cc=vbabka@suse.cz \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.