From: "Paul E. McKenney" <paulmck@kernel.org>
To: Peter Zijlstra <peterz@infradead.org>
Cc: mingo@kernel.org, linux-kernel@vger.kernel.org,
tglx@linutronix.de, rostedt@goodmis.org, qais.yousef@arm.com,
juri.lelli@redhat.com, vincent.guittot@linaro.org,
dietmar.eggemann@arm.com, bsegall@google.com, mgorman@suse.de,
airlied@redhat.com, alexander.deucher@amd.com,
awalls@md.metrocast.net, axboe@kernel.dk, broonie@kernel.org,
daniel.lezcano@linaro.org, gregkh@linuxfoundation.org,
hannes@cmpxchg.org, herbert@gondor.apana.org.au,
hverkuil@xs4all.nl, john.stultz@linaro.org, nico@fluxnic.net,
rafael.j.wysocki@intel.com, rmk+kernel@arm.linux.org.uk,
sudeep.holla@arm.com, ulf.hansson@linaro.org,
wim@linux-watchdog.org
Subject: Re: [PATCH 01/23] sched: Provide sched_set_fifo()
Date: Wed, 22 Apr 2020 06:11:38 -0700
Message-ID: <20200422131138.GL17661@paulmck-ThinkPad-P72>
In-Reply-To: <20200422112831.266499893@infradead.org>

On Wed, Apr 22, 2020 at 01:27:20PM +0200, Peter Zijlstra wrote:
> SCHED_FIFO (or any static priority scheduler) is a broken scheduler
> model; it is fundamentally incapable of resource management, the one
> thing an OS is actually supposed to do.
>
> It is impossible to compose static priority workloads. One cannot take
> two well designed and functional static priority workloads and mash
> them together and still expect them to work.
>
> Therefore it doesn't make sense to expose the priority field; the
> kernel is fundamentally incapable of setting a sensible value, it
> needs systems knowledge that it doesn't have.
>
> Take away sched_setscheduler() / sched_setattr() from modules and
> replace them with:
>
> - sched_set_fifo(p); create a FIFO task (at prio 50)
> - sched_set_fifo_low(p); create a task higher than NORMAL,
> which ends up being a FIFO task at prio 1.
> - sched_set_normal(p, nice); (re)set the task to normal
>
> This stops the proliferation of randomly chosen, and irrelevant, FIFO
> priorities that don't really mean anything anyway.
>
> The system administrator/integrator, whoever has insight into the
> actual system design and requirements (userspace), can set up
> appropriate priorities if and when needed.

The sched_setscheduler_nocheck() calls in rcu_spawn_gp_kthread(),
rcu_cpu_kthread_setup(), and rcu_spawn_one_boost_kthread() all stay as
is because they all use the rcutree.kthread_prio boot parameter, which is
set at boot time by the system administrator (or {who,what}ever), correct?
Or did my email reader eat a patch or two?

Thanx, Paul

> Cc: airlied@redhat.com
> Cc: alexander.deucher@amd.com
> Cc: awalls@md.metrocast.net
> Cc: axboe@kernel.dk
> Cc: broonie@kernel.org
> Cc: daniel.lezcano@linaro.org
> Cc: gregkh@linuxfoundation.org
> Cc: hannes@cmpxchg.org
> Cc: herbert@gondor.apana.org.au
> Cc: hverkuil@xs4all.nl
> Cc: john.stultz@linaro.org
> Cc: nico@fluxnic.net
> Cc: paulmck@kernel.org
> Cc: rafael.j.wysocki@intel.com
> Cc: rmk+kernel@arm.linux.org.uk
> Cc: sudeep.holla@arm.com
> Cc: tglx@linutronix.de
> Cc: ulf.hansson@linaro.org
> Cc: wim@linux-watchdog.org
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> Reviewed-by: Ingo Molnar <mingo@kernel.org>
> ---
> include/linux/sched.h | 3 +++
> kernel/sched/core.c | 47 +++++++++++++++++++++++++++++++++++++++++++++++
> 2 files changed, 50 insertions(+)
>
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -1631,6 +1631,9 @@ extern int idle_cpu(int cpu);
> extern int available_idle_cpu(int cpu);
> extern int sched_setscheduler(struct task_struct *, int, const struct sched_param *);
> extern int sched_setscheduler_nocheck(struct task_struct *, int, const struct sched_param *);
> +extern int sched_set_fifo(struct task_struct *p);
> +extern int sched_set_fifo_low(struct task_struct *p);
> +extern int sched_set_normal(struct task_struct *p, int nice);
> extern int sched_setattr(struct task_struct *, const struct sched_attr *);
> extern int sched_setattr_nocheck(struct task_struct *, const struct sched_attr *);
> extern struct task_struct *idle_task(int cpu);
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -5055,6 +5055,8 @@ static int _sched_setscheduler(struct ta
> * @policy: new policy.
> * @param: structure containing the new RT priority.
> *
> + * Use sched_set_fifo(), read its comment.
> + *
> * Return: 0 on success. An error code otherwise.
> *
> * NOTE that the task may be already dead.
> @@ -5097,6 +5099,51 @@ int sched_setscheduler_nocheck(struct ta
> }
> EXPORT_SYMBOL_GPL(sched_setscheduler_nocheck);
>
> +/*
> + * SCHED_FIFO is a broken scheduler model; that is, it is fundamentally
> + * incapable of resource management, which is the one thing an OS really should
> + * be doing.
> + *
> + * This is of course the reason it is limited to privileged users only.
> + *
> + * Worse still; it is fundamentally impossible to compose static priority
> + * workloads. You cannot take two correctly working static prio workloads
> + * and smash them together and still expect them to work.
> + *
> + * For this reason 'all' FIFO tasks the kernel creates are basically at:
> + *
> + * MAX_RT_PRIO / 2
> + *
> + * The administrator _MUST_ configure the system, the kernel simply doesn't
> + * know enough information to make a sensible choice.
> + */
> +int sched_set_fifo(struct task_struct *p)
> +{
> +	struct sched_param sp = { .sched_priority = MAX_RT_PRIO / 2 };
> +	return sched_setscheduler_nocheck(p, SCHED_FIFO, &sp);
> +}
> +EXPORT_SYMBOL_GPL(sched_set_fifo);
> +
> +/*
> + * For when you don't much care about FIFO, but want to be above SCHED_NORMAL.
> + */
> +int sched_set_fifo_low(struct task_struct *p)
> +{
> +	struct sched_param sp = { .sched_priority = 1 };
> +	return sched_setscheduler_nocheck(p, SCHED_FIFO, &sp);
> +}
> +EXPORT_SYMBOL_GPL(sched_set_fifo_low);
> +
> +int sched_set_normal(struct task_struct *p, int nice)
> +{
> +	struct sched_attr attr = {
> +		.sched_policy = SCHED_NORMAL,
> +		.sched_nice = nice,
> +	};
> +	return sched_setattr_nocheck(p, &attr);
> +}
> +EXPORT_SYMBOL_GPL(sched_set_normal);
> +
> static int
> do_sched_setscheduler(pid_t pid, int policy, struct sched_param __user *param)
> {
>
>
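
For anyone wanting to poke at the numbers from userspace, here is a minimal
sketch of the two FIFO priority choices the helpers above encode. It assumes
MAX_RT_PRIO is 100, which is not exported to userspace, so the constant is
restated locally. All sketch_* names are hypothetical and exist only for
illustration; the kernel helpers themselves cannot be called from userspace.
The check uses only sched_get_priority_min() and sched_get_priority_max(),
so it needs no privileges.

```c
#include <sched.h>

/* Assumption: mirrors the kernel's MAX_RT_PRIO, restated here by hand. */
#define SKETCH_MAX_RT_PRIO 100

/* Priority sched_set_fifo() picks in the patch: MAX_RT_PRIO / 2. */
static int sketch_fifo_prio(void)
{
	return SKETCH_MAX_RT_PRIO / 2;
}

/* Priority sched_set_fifo_low() picks in the patch: just above SCHED_NORMAL. */
static int sketch_fifo_low_prio(void)
{
	return 1;
}

/*
 * Hypothetical helper: report whether @prio lies in the range the running
 * system accepts for SCHED_FIFO.  Returns 1 if valid, 0 otherwise.
 */
static int sketch_prio_is_valid(int prio)
{
	int lo = sched_get_priority_min(SCHED_FIFO);
	int hi = sched_get_priority_max(SCHED_FIFO);

	if (lo < 0 || hi < 0)
		return 0;	/* policy not supported here */
	return prio >= lo && prio <= hi;
}
```

Both values should pass sketch_prio_is_valid() on a stock Linux system.
Actually switching a thread to SCHED_FIFO with sched_setscheduler(2) is
deliberately left out, since that needs CAP_SYS_NICE or a suitable
RLIMIT_RTPRIO.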