From: Michal Hocko <mhocko@kernel.org>
To: Mel Gorman <mgorman@techsingularity.net>
Cc: Vlastimil Babka <vbabka@suse.cz>,
	Dmitry Vyukov <dvyukov@google.com>, Tejun Heo <tj@kernel.org>,
	Christoph Lameter <cl@linux.com>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	syzkaller <syzkaller@googlegroups.com>,
	Andrew Morton <akpm@linux-foundation.org>
Subject: Re: mm: deadlock between get_online_cpus/pcpu_alloc
Date: Tue, 7 Feb 2017 14:48:18 +0100
Message-ID: <20170207134818.GQ5065@dhcp22.suse.cz>
In-Reply-To: <20170207130350.njwuiq3uh6vhj5t2@techsingularity.net>

On Tue 07-02-17 13:03:50, Mel Gorman wrote:
> On Tue, Feb 07, 2017 at 12:43:27PM +0100, Michal Hocko wrote:
> > > Right. The unbind operation can set a mask that is any allowable CPU and
> > > the final process_work is not done in a context that prevents
> > > preemption.
> > > 
> > > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > > index 3b93879990fd..7af165d308c4 100644
> > > --- a/mm/page_alloc.c
> > > +++ b/mm/page_alloc.c
> > > @@ -2342,7 +2342,14 @@ void drain_local_pages(struct zone *zone)
> > >  
> > >  static void drain_local_pages_wq(struct work_struct *work)
> > >  {
> > > +	/*
> > > +	 * Ordinarily a drain operation is bound to a CPU but may be unbound
> > > +	 * after a CPU hotplug operation so it's necessary to disable
> > > +	 * preemption for the drain to stabilise the CPU ID.
> > > +	 */
> > > +	preempt_disable();
> > >  	drain_local_pages(NULL);
> > > +	preempt_enable_no_resched();
> > >  }
> > >  
> > >  /*
> > [...]
> > > @@ -6711,7 +6714,16 @@ static int page_alloc_cpu_dead(unsigned int cpu)
> > >  {
> > >  
> > >  	lru_add_drain_cpu(cpu);
> > > +
> > > +	/*
> > > +	 * A per-cpu drain via a workqueue from drain_all_pages can be
> > > +	 * rescheduled onto an unrelated CPU. That allows the hotplug
> > > +	 * operation and the drain to potentially race on the same
> > > +	 * CPU. Serialise hotplug versus drain using pcpu_drain_mutex
> > > +	 */
> > > +	mutex_lock(&pcpu_drain_mutex);
> > >  	drain_pages(cpu);
> > > +	mutex_unlock(&pcpu_drain_mutex);
> > 
> > You cannot put a sleepable lock inside the preempt disabled section...
> > We can make it a spinlock, right?
> > 
> 
> The CPU down callback can hold a mutex and at least the SLUB callback
> already does so. That gives
> 
> page_alloc_cpu_dead
>   mutex_lock
>     drain_pages
>   mutex_unlock
> 
> drain_all_pages
>   mutex_lock
>     queue workqueue
>       drain_local_pages_wq
>         preempt_disable
>         drain_local_pages
>         drain_pages
>         preempt_enable
>     flush queues
>   mutex_unlock
> 
> I must be blind or maybe it's rushing between multiple concerns but which
> sleepable lock is of concern?

I thought the CPU hotplug callback was non-preemptible. That is not the
case, as mentioned in the other reply, so the pcpu_drain_mutex in the
hotplug callback is alright. Sorry about the confusion! I am still
wondering whether the lock is really needed. See the other reply.

-- 
Michal Hocko
SUSE Labs
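
The lock nesting in the two call chains quoted above can be sketched in
userspace. This is only an analogy under loud assumptions: pthreads stand
in for workqueue items, a plain counter array stands in for the per-cpu
page lists, and zone_lock, pcp_count, freed_total and NR_CPUS are
hypothetical names invented for the sketch. Only pcpu_drain_mutex and the
function names come from the quoted patch, and the bodies here are not the
kernel implementations.

```c
/* Userspace analogy only; none of this is kernel code. */
#include <assert.h>
#include <pthread.h>

#define NR_CPUS 4			/* hypothetical CPU count for the sketch */

int pcp_count[NR_CPUS];			/* stand-in for per-cpu page lists */
int freed_total;			/* pages handed back to the "zone" */

/* Analogue of pcpu_drain_mutex: serialises hotplug against drain_all_pages. */
static pthread_mutex_t pcpu_drain_mutex = PTHREAD_MUTEX_INITIALIZER;
/* Hypothetical inner lock protecting the counters (a spinlock in-kernel). */
static pthread_mutex_t zone_lock = PTHREAD_MUTEX_INITIALIZER;

/* Analogue of drain_pages(cpu): return that CPU's cached pages. */
void drain_pages(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	pthread_mutex_lock(&zone_lock);
	freed_total += pcp_count[cpu];
	pcp_count[cpu] = 0;
	pthread_mutex_unlock(&zone_lock);
}

/*
 * Analogue of drain_local_pages_wq(). In the kernel this body runs between
 * preempt_disable() and preempt_enable(), so it must not take a sleeping
 * lock; note that pcpu_drain_mutex is never taken here.
 */
static void *drain_local_pages_wq(void *arg)
{
	drain_pages(*(int *)arg);
	return NULL;
}

/* Analogue of drain_all_pages(): the mutex is held by the caller only. */
void drain_all_pages(void)
{
	pthread_t workers[NR_CPUS];
	int ids[NR_CPUS];
	int started = 0;

	pthread_mutex_lock(&pcpu_drain_mutex);
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		ids[cpu] = cpu;
		if (pthread_create(&workers[started], NULL,
				   drain_local_pages_wq, &ids[cpu]) == 0)
			started++;
		else
			drain_pages(cpu);	/* fall back to draining inline */
	}
	for (int i = 0; i < started; i++)	/* "flush queues" */
		pthread_join(workers[i], NULL);
	pthread_mutex_unlock(&pcpu_drain_mutex);
}

/* Analogue of page_alloc_cpu_dead(): preemptible, so a mutex is fine. */
void page_alloc_cpu_dead(int cpu)
{
	pthread_mutex_lock(&pcpu_drain_mutex);
	drain_pages(cpu);
	pthread_mutex_unlock(&pcpu_drain_mutex);
}
```

The point of the sketch is only the nesting: the sleeping lock is taken by
the two preemptible callers, while the worker that would run with
preemption disabled takes nothing that can sleep.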
