From: Valentin Schneider <>
To: Will Deacon <>
	Peter Zijlstra <>,
	Ingo Molnar <>,
	Morten Rasmussen <>,
	Qais Yousef <>,
	Dietmar Eggemann <>,
	Quentin Perret <>,
	Juri Lelli <>,
	Vincent Guittot <>,
Subject: Re: [PATCH 2/2] sched: Plug race between SCA, hotplug and migration_cpu_stop()
Date: Thu, 24 Jun 2021 10:33:07 +0100	[thread overview]
Message-ID: <> (raw)
In-Reply-To: <20210622135140.GA31071@willie-the-truck>


On 22/06/21 14:51, Will Deacon wrote:
> Hi Valentin,
> I've been looking at this on and off and I'm afraid I'm still not convinced,
> almost certainly due to my own ignorance, but hey :)

Nah, this thing is nasty, I still haven't regrown the hair I lost from last
time I stared at it.

> On Wed, Jun 02, 2021 at 08:43:31PM +0100, Valentin Schneider wrote:
>> Harumph...
>> So something like all CPUs but one are running their take_cpu_down()
>> callback because one is still running migration_cpu_stop(), i.e.:
>>   CPU0                   CPU1                ...             CPUN
>>   <stopper>              <stopper>                           <stopper>
>>     migration_cpu_stop()   take_cpu_down()@MULTI_STOP_PREPARE    take_cpu_down()@MULTI_STOP_PREPARE
>> If CPU0 hits that else if (pending) condition, it'll queue a
>> migration_cpu_stop() elsewhere (say CPU1), then run the take_cpu_down()
>> callback which follows in its work->list.
>> If the CPU being brought down is anything else than CPU1, it shouldn't
>> really matter. If it *is* CPU1, then I think we've got some guarantees.
>> Namely, there's no (read: there shouldn't be any) way for a task to
>> still be on CPU1 at this point; per sched_cpu_wait_empty(),
>> migration-disabled tasks and pcpu kthreads must vacate the CPU before it
>> then (migrate_disable regions must be bounded, and pcpu kthreads are
>> expected to be moved away by their respective owners).
> I agree with that, but the stopper is still running on CPU1 and so
> migration_cpu_stop() could still queue work there after sched_cpu_wait_empty()
> has returned but before stop_machine_park(), afaict.

Right, the stopper should flush its work before parking itself - in the
above scenario, migration_cpu_stop() is already on CPU1's work list when it
finishes running take_cpu_down(), so it should run before any parking.

> Actually, it looks like migration_cpu_stop() ignores the return value of
> stop_one_cpu_nowait(), so if the stopper thread has been parked I think
> we'll quietly do nothing there as well.

There's a handful of loosely related cogs (including those added by the
patch) that *I think* give us sufficient guarantees this can't happen. Let
me try to structure this somehow, and please point out any inane bit.

1) Per the synchronize_rcu() in CPUHP_AP_ACTIVE.teardown(), the return
   value of is_cpu_allowed(p, cpu) is stable within a migration_cpu_stop()
   invocation for any p allowed on *active* CPUs.
2) Per the serialization of stopper callbacks, the return value of
   is_cpu_allowed(p, cpu) is stable within a migration_cpu_stop()
   invocation for any p allowed on *online* CPUs: migration_cpu_stop()
   has to complete before any CPU can do an online -> !online transition.

  (I would also add here that I don't expect any task to be migrated to an
  !active && online dying CPU via migration_cpu_stop(): the only tasks
  allowed on such CPUs are pcpu kthreads, which will be parked rather than
  migrated, and migrate_disabled() tasks which, unsurprisingly, aren't
  migrated - they just get to *stay* on their CPU longer than regular tasks)

3) Per 1) + 2), the move_queued_task() call in migration_cpu_stop() cannot
   silently fail.

4) No task other than a hotplug thread or a stopper thread may run on a
   CPU beyond CPUHP_AP_SCHED_WAIT_EMPTY.teardown()
5) Per 4), no (queued) task p passed to migration_cpu_stop() can have
   task_cpu(p) be a dying CPU with hotplug state below
   CPUHP_AP_SCHED_WAIT_EMPTY.
6) Per 5), migration_cpu_stop() cannot invoke stop_one_cpu_nowait() on a
   CPU with a parked stopper.
7) Per 6), migration_cpu_stop() self-queueing cannot be silently discarded
   and will always end up being executed.

Per 3) + 7), we're good?


Thread overview: 10+ messages
2021-05-26 20:57 [PATCH 0/2] sched: SCA vs hotplug vs stopper races fixes Valentin Schneider
2021-05-26 20:57 ` [PATCH 1/2] sched: Don't defer CPU pick to migration_cpu_stop() Valentin Schneider
2021-06-01 14:04   ` [tip: sched/core] " tip-bot2 for Valentin Schneider
2021-05-26 20:57 ` [PATCH 2/2] sched: Plug race between SCA, hotplug and migration_cpu_stop() Valentin Schneider
2021-05-27  9:12   ` Peter Zijlstra
2021-06-01 16:59   ` Valentin Schneider
2021-06-02 15:26     ` Will Deacon
2021-06-02 19:43       ` Valentin Schneider
2021-06-22 13:51         ` Will Deacon
2021-06-24  9:33           ` Valentin Schneider [this message]
