From: Daniel Bristot de Oliveira <bristot@redhat.com>
To: Thomas Gleixner <tglx@linutronix.de>,
	Peter Zijlstra <peterz@infradead.org>,
	mingo@kernel.org
Cc: linux-kernel@vger.kernel.org, bigeasy@linutronix.de,
	qais.yousef@arm.com, swood@redhat.com,
	valentin.schneider@arm.com, juri.lelli@redhat.com,
	vincent.guittot@linaro.org, dietmar.eggemann@arm.com,
	rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de,
	vincent.donnefort@arm.com
Subject: Re: [PATCH 7/9] sched: Add migrate_disable()
Date: Mon, 21 Sep 2020 22:42:42 +0200
Message-ID: <86929eee-36da-93a5-5280-00e6df1ef496@redhat.com>
In-Reply-To: <87v9g7aqjd.fsf@nanos.tec.linutronix.de>

On 9/21/20 9:16 PM, Thomas Gleixner wrote:
> On Mon, Sep 21 2020 at 18:36, Peter Zijlstra wrote:
>> Add the base migrate_disable() support (under protest).
> 
> :)
> 
>> +/*
>> + * Migrate-Disable and why it is (strongly) undesired.
>> + *
>> + * The premise of the Real-Time schedulers we have on Linux
>> + * (SCHED_FIFO/SCHED_DEADLINE) is that M CPUs can/will run M tasks
>> + * concurrently, provided there are sufficient runnable tasks, also known as
>> + * work-conserving. For instance SCHED_DEADLINE tries to schedule the M
>> + * earliest deadline threads, and SCHED_FIFO the M highest priority threads.
>> + *
>> + * The correctness of various scheduling models depends on this, but it is
>> + * broken by migrate_disable() that doesn't imply preempt_disable(). Where
>> + * preempt_disable() implies an immediate priority ceiling, preemptible
>> + * migrate_disable() allows nesting.
>> + *
>> + * The worst case is that all tasks preempt one another in a migrate_disable()
>> + * region and stack on a single CPU. This then reduces the available bandwidth
>> + * to a single CPU. And since Real-Time schedulability theory considers the
>> + * Worst-Case only, all Real-Time analysis shall revert to single-CPU
>> + * (instantly solving the SMP analysis problem).
> 
> I'm telling you for years that SMP is the source of all evils and
> NR_CPUS=0 is the ultimate solution of all problems. Paul surely
> disagrees as he thinks that NR_CPUS<0 is the right thing to do.

And I would not need to extend the model!

> But seriously, I completely understand your concern vs. schedulability
> theories, but those theories can neither deal well with preemption
> disable simply because you can create other trainwrecks when enough low
> priority tasks run long enough in preempt disabled regions in
> parallel. The scheduler simply does not know ahead how long these
> sections will take and how many of them will run in parallel.
> 
> The theories make some assumptions about preempt disable and consider it
> as temporary priority ceiling, but that's all assumptions as the bounds
> of these operations are simply unknown.

Limited preemption is something that is more explored/well known than
limited/arbitrary affinity - I even know a dude that convinced academics about
the effects/properties of preempt disable on the PREEMPT_RT!

But I think the message here is: OK, migrate disable is better than preempt
disable for "scheduling latency" (the PREEMPT_RT goal). But indiscriminate use
of migrate disable has undesired effects on the "response time" of real-time
threads (the scheduler goal), so we should use it with caution, as much as we
do with preempt disable. In the end, both are critical for real-time
workloads, and we need more work and analysis on both.
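
To make the worst case from the quoted comment concrete, here is a minimal
userspace sketch. It is a toy model and not kernel code: max_running(),
pinned_cpu[] and the 64-CPU limit are names and assumptions made up for the
illustration. It counts how many runnable tasks can execute at once when some
of them are stuck on one CPU inside a migrate-disabled region.

```c
#include <stdio.h>

#define TOY_MAX_CPUS 64	/* assumption of this sketch, not a kernel limit */

/*
 * Toy model: how many of 'nr_tasks' runnable tasks can run concurrently
 * on 'nr_cpus' CPUs?
 *
 * pinned_cpu[i] is the CPU that task i is stuck on (it was preempted inside
 * a migrate-disabled region), or -1 if task i may run on any CPU.
 * Returns -1 if nr_cpus is outside what this sketch supports.
 */
static int max_running(int nr_cpus, const int *pinned_cpu, int nr_tasks)
{
	int busy[TOY_MAX_CPUS] = { 0 };
	int running = 0;
	int free_tasks = 0;

	if (nr_cpus < 1 || nr_cpus > TOY_MAX_CPUS)
		return -1;

	for (int i = 0; i < nr_tasks; i++) {
		int cpu = pinned_cpu[i];

		if (cpu < 0 || cpu >= nr_cpus) {
			/* Not pinned: can be placed on any idle CPU later. */
			free_tasks++;
		} else if (!busy[cpu]) {
			/* First pinned task on this CPU gets to run. */
			busy[cpu] = 1;
			running++;
		}
		/* Further tasks pinned to a busy CPU just wait there. */
	}

	/* Unpinned tasks fill whatever CPUs are still idle. */
	int idle = nr_cpus - running;

	running += (free_tasks < idle) ? free_tasks : idle;
	return running;
}
```

With 4 CPUs and 4 tasks that are all pinned to CPU 0, this returns 1, which is
the "revert to single-CPU" case. With the same 4 tasks unpinned it returns 4,
which is the work-conserving behaviour the analysis normally assumes.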

>> + * The reason we have it anyway.
>> + *
>> + * PREEMPT_RT breaks a number of assumptions traditionally held. By forcing a
>> + * number of primitives into becoming preemptible, they would also allow
>> + * migration. This turns out to break a bunch of per-cpu usage. To this end,
>> + * all these primitives employ migrate_disable() to restore this implicit
>> + * assumption.
>> + *
>> + * This is a 'temporary' work-around at best. The correct solution is getting
>> + * rid of the above assumptions and reworking the code to employ explicit
>> + * per-cpu locking or short preempt-disable regions.
> 
> What timeframe are you envisioning for 'temporary'? I assume something
> which is closer to your retirement than to mine :)

/me counts how many years he still needs to wait for the retirement.
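
For readers following along, the per-cpu breakage mentioned in the quoted
comment can be sketched with a toy userspace model. Nothing below is the
kernel API: counter[], current_cpu, migrate_disabled, try_migrate() and
inc_this_cpu() are hypothetical stand-ins that only mimic the shape of the
problem, namely a task taking a pointer to "this CPU's" data and then being
moved while it still uses that pointer.

```c
#include <stdio.h>

#define NR_TOY_CPUS 2

static int counter[NR_TOY_CPUS];	/* stand-in for a per-cpu variable */
static int current_cpu;			/* CPU the toy task runs on */
static int migrate_disabled;		/* toy analogue of a disable depth */

/* Toy scheduler hook: move the task unless migration is disabled. */
static void try_migrate(int cpu)
{
	if (!migrate_disabled)
		current_cpu = cpu;
}

/*
 * Increment "this CPU's" counter, with a preemption point between taking
 * the pointer and using it. If 'protect' is nonzero, the section is wrapped
 * in the toy migrate-disable counter.
 *
 * Returns 1 if the pointer still referred to the local CPU's slot when it
 * was used, 0 if the task had been migrated in between.
 */
static int inc_this_cpu(int protect)
{
	if (protect)
		migrate_disabled++;

	int *p = &counter[current_cpu];

	/* Preempted here; the scheduler would like to move us elsewhere. */
	try_migrate((current_cpu + 1) % NR_TOY_CPUS);

	(*p)++;
	int still_local = (p == &counter[current_cpu]);

	if (protect)
		migrate_disabled--;

	return still_local;
}
```

Starting on CPU 0, inc_this_cpu(0) ends up running on CPU 1 while updating
CPU 0's slot, so it returns 0; inc_this_cpu(1) stays on its CPU and returns 1.
The task remains preemptible in both cases, which is the property PREEMPT_RT
wants to keep.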

> 
>> + * The end goal must be to get rid of migrate_disable(), alternatively we need
>> + * a schedulability theory that does not depend on arbitrary migration.
> 
> Finally something new the academics can twist their brain around :)

As if there were not enough already :-)

> But as the kmap discussion has shown, the current situation of enforcing
> preempt disable even on a !RT kernel is not pretty either. I looked at
> quite some of the kmap_atomic() usage sites and the resulting
> workarounds for non-preemptability are pretty horrible especially if
> they do copy_from/to_user() or such in those regions. There is tons of
> other code which really only requires migrate disable
(Not having an explicit declaration of the reason for disabling preemption makes
these all hard to rework... and we will have the same problem with migrate
disable. Anyway, I agree that disabling only migration helps -rt now [and I like
that]... but I also fear/care for scheduler metrics in the long term... well,
there is still a long way until retirement.)

Thanks!
-- Daniel

> Thanks,
> 
>         tglx
> 


Thread overview: 45+ messages
2020-09-21 16:35 [PATCH 0/9] sched: Migrate disable support Peter Zijlstra
2020-09-21 16:35 ` [PATCH 1/9] stop_machine: Add function and caller debug info Peter Zijlstra
2020-09-21 16:35 ` [PATCH 2/9] sched: Fix balance_callback() Peter Zijlstra
2020-09-23 14:08   ` Thomas Gleixner
2020-09-21 16:36 ` [PATCH 3/9] sched/hotplug: Ensure only per-cpu kthreads run during hotplug Peter Zijlstra
2020-09-25 16:38   ` Dietmar Eggemann
2020-10-02 14:20     ` Peter Zijlstra
2020-09-21 16:36 ` [PATCH 4/9] sched/core: Wait for tasks being pushed away on hotplug Peter Zijlstra
2020-09-21 16:36 ` [PATCH 5/9] sched/hotplug: Consolidate task migration on CPU unplug Peter Zijlstra
2020-10-01 17:12   ` Vincent Donnefort
2020-10-02 14:17     ` Peter Zijlstra
2020-09-21 16:36 ` [PATCH 6/9] sched: Massage set_cpus_allowed Peter Zijlstra
2020-09-23 14:07   ` Thomas Gleixner
2020-09-21 16:36 ` [PATCH 7/9] sched: Add migrate_disable() Peter Zijlstra
2020-09-21 19:16   ` Thomas Gleixner
2020-09-21 20:42     ` Daniel Bristot de Oliveira [this message]
2020-09-23  8:31       ` Thomas Gleixner
2020-09-23 10:51         ` Daniel Bristot de Oliveira
2020-09-23 17:08         ` peterz
2020-09-23 17:54           ` Daniel Bristot de Oliveira
2020-09-23  7:48     ` peterz
2020-09-24 11:53   ` Valentin Schneider
2020-09-24 12:29     ` Peter Zijlstra
2020-09-24 12:33       ` Valentin Schneider
2020-09-24 12:35     ` Peter Zijlstra
2020-09-25 16:50   ` Sebastian Andrzej Siewior
2020-10-02 14:21     ` Peter Zijlstra
2020-10-02 14:36       ` Sebastian Andrzej Siewior
2020-09-21 16:36 ` [PATCH 8/9] sched: Fix migrate_disable() vs set_cpus_allowed_ptr() Peter Zijlstra
2020-09-24 19:59   ` Valentin Schneider
2020-09-25  8:43     ` Peter Zijlstra
2020-09-25 10:07       ` Valentin Schneider
2020-09-25  9:05     ` Peter Zijlstra
2020-09-25  9:56       ` Peter Zijlstra
2020-09-25 10:09         ` Valentin Schneider
2020-09-21 16:36 ` [PATCH 9/9] sched/core: Make migrate disable and CPU hotplug cooperative Peter Zijlstra
2020-09-25  9:12 ` [PATCH 0/9] sched: Migrate disable support Dietmar Eggemann
2020-09-25 10:10   ` Peter Zijlstra
2020-09-25 11:58     ` Dietmar Eggemann
2020-09-25 12:19       ` Valentin Schneider
2020-09-25 17:49         ` Valentin Schneider
2020-09-29  9:15           ` Dietmar Eggemann
2020-09-25 18:17 ` Sebastian Andrzej Siewior
2020-09-25 19:32   ` Valentin Schneider
2020-10-02 14:30     ` Peter Zijlstra
