bpf.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Roman Gushchin <guro@fb.com>
To: Peter Zijlstra <peterz@infradead.org>, Ingo Molnar <mingo@redhat.com>
Cc: Mel Gorman <mgorman@techsingularity.net>, <bpf@vger.kernel.org>,
	<linux-kernel@vger.kernel.org>, Roman Gushchin <guro@fb.com>
Subject: [PATCH rfc 4/6] sched: cfs: add bpf hooks to control wakeup and tick preemption
Date: Thu, 16 Sep 2021 09:24:49 -0700	[thread overview]
Message-ID: <20210916162451.709260-5-guro@fb.com> (raw)
In-Reply-To: <20210916162451.709260-1-guro@fb.com>

This patch adds 3 hooks to control wakeup and tick preemption:
  cfs_check_preempt_tick
  cfs_check_preempt_wakeup
  cfs_wakeup_preempt_entity

The first one allows to force or suppress a preemption from a tick
context. An obvious usage example is to minimize the number of
non-voluntary context switches and decrease an associated latency
penalty by (conditionally) providing tasks or task groups an extended
execution slice. It can be used instead of tweaking
sysctl_sched_min_granularity.

The second one is called from the wakeup preemption code and allows
to redefine whether a newly woken task should preempt the execution
of the current task. This is useful to minimize a number of
preemptions of latency sensitive tasks. To some extent it's a more
flexible analog of a sysctl_sched_wakeup_granularity.

The third one is similar, but it tweaks the wakeup_preempt_entity()
function, which is called not only from a wakeup context, but also
from pick_next_task(), which allows to influence the decision on which
task will be running next.

It's a place for a discussion whether we need both these hooks or only
one of them: the second is more powerful, but depends more on the
current implementation. In any case, bpf hooks are not an ABI, so it's
not a deal breaker.

The idea of the wakeup_preempt_entity hook belongs to Rik van Riel. He
also contributed a lot to the whole patchset by proving his ideas,
recommendations and a feedback for earlier (non-public) versions.

Signed-off-by: Roman Gushchin <guro@fb.com>
---
 include/linux/bpf_sched.h       |  1 +
 include/linux/sched_hook_defs.h |  4 +++-
 kernel/sched/fair.c             | 27 +++++++++++++++++++++++++++
 3 files changed, 31 insertions(+), 1 deletion(-)

diff --git a/include/linux/bpf_sched.h b/include/linux/bpf_sched.h
index 6e773aecdff7..5c238aeb853c 100644
--- a/include/linux/bpf_sched.h
+++ b/include/linux/bpf_sched.h
@@ -40,6 +40,7 @@ static inline RET bpf_sched_##NAME(__VA_ARGS__)	\
 {						\
 	return DEFAULT;				\
 }
+#include <linux/sched_hook_defs.h>
 #undef BPF_SCHED_HOOK
 
 static inline bool bpf_sched_enabled(void)
diff --git a/include/linux/sched_hook_defs.h b/include/linux/sched_hook_defs.h
index 14344004e335..f075b32698cd 100644
--- a/include/linux/sched_hook_defs.h
+++ b/include/linux/sched_hook_defs.h
@@ -1,2 +1,4 @@
 /* SPDX-License-Identifier: GPL-2.0 */
-BPF_SCHED_HOOK(int, 0, dummy, void)
+BPF_SCHED_HOOK(int, 0, cfs_check_preempt_tick, struct sched_entity *curr, unsigned long delta_exec)
+BPF_SCHED_HOOK(int, 0, cfs_check_preempt_wakeup, struct task_struct *curr, struct task_struct *p)
+BPF_SCHED_HOOK(int, 0, cfs_wakeup_preempt_entity, struct sched_entity *curr, struct sched_entity *se)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index ff69f245b939..35ea8911b25c 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -21,6 +21,7 @@
  *  Copyright (C) 2007 Red Hat, Inc., Peter Zijlstra
  */
 #include "sched.h"
+#include <linux/bpf_sched.h>
 
 /*
  * Targeted preemption latency for CPU-bound tasks:
@@ -4447,6 +4448,16 @@ check_preempt_tick(struct cfs_rq *cfs_rq, struct sched_entity *curr)
 
 	ideal_runtime = sched_slice(cfs_rq, curr);
 	delta_exec = curr->sum_exec_runtime - curr->prev_sum_exec_runtime;
+
+	if (bpf_sched_enabled()) {
+		int ret = bpf_sched_cfs_check_preempt_tick(curr, delta_exec);
+
+		if (ret < 0)
+			return;
+		else if (ret > 0)
+			resched_curr(rq_of(cfs_rq));
+	}
+
 	if (delta_exec > ideal_runtime) {
 		resched_curr(rq_of(cfs_rq));
 		/*
@@ -7083,6 +7094,13 @@ wakeup_preempt_entity(struct sched_entity *curr, struct sched_entity *se)
 {
 	s64 gran, vdiff = curr->vruntime - se->vruntime;
 
+	if (bpf_sched_enabled()) {
+		int ret = bpf_sched_cfs_wakeup_preempt_entity(curr, se);
+
+		if (ret)
+			return ret;
+	}
+
 	if (vdiff <= 0)
 		return -1;
 
@@ -7168,6 +7186,15 @@ static void check_preempt_wakeup(struct rq *rq, struct task_struct *p, int wake_
 	    likely(!task_has_idle_policy(p)))
 		goto preempt;
 
+	if (bpf_sched_enabled()) {
+		int ret = bpf_sched_cfs_check_preempt_wakeup(current, p);
+
+		if (ret < 0)
+			return;
+		else if (ret > 0)
+			goto preempt;
+	}
+
 	/*
 	 * Batch and idle tasks do not preempt non-idle tasks (their preemption
 	 * is driven by the tick):
-- 
2.31.1


  parent reply	other threads:[~2021-09-16 16:27 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20210915213550.3696532-1-guro@fb.com>
2021-09-16  0:19 ` [PATCH rfc 0/6] Scheduler BPF Hao Luo
2021-09-16  1:42   ` Roman Gushchin
2021-09-16 16:24 ` Roman Gushchin
2021-09-16 16:24   ` [PATCH rfc 1/6] bpf: sched: basic infrastructure for scheduler bpf Roman Gushchin
2021-09-16 16:24   ` [PATCH rfc 2/6] bpf: sched: add convenient helpers to identify sched entities Roman Gushchin
2021-11-25  6:09     ` Yafang Shao
2021-11-26 19:50       ` Roman Gushchin
2021-09-16 16:24   ` [PATCH rfc 3/6] bpf: sched: introduce bpf_sched_enable() Roman Gushchin
2021-09-16 16:24   ` Roman Gushchin [this message]
2021-10-01  3:35     ` [PATCH rfc 4/6] sched: cfs: add bpf hooks to control wakeup and tick preemption Barry Song
2021-10-02  0:13       ` Roman Gushchin
2021-09-16 16:24   ` [PATCH rfc 5/6] libbpf: add support for scheduler bpf programs Roman Gushchin
2021-09-16 16:24   ` [PATCH rfc 6/6] bpftool: recognize scheduler programs Roman Gushchin
2021-09-16 16:36   ` [PATCH rfc 0/6] Scheduler BPF Roman Gushchin
2021-10-06 16:39   ` Qais Yousef
2021-10-06 18:50     ` Roman Gushchin
2021-10-11 16:38       ` Qais Yousef
2021-10-11 18:09         ` Roman Gushchin
2021-10-12 10:16           ` Qais Yousef
     [not found]   ` <52EC1E80-4C89-43AD-8A59-8ACA184EAE53@gmail.com>
2021-11-25  6:00     ` Yafang Shao
2021-11-26 19:46       ` Roman Gushchin
2022-01-15  8:29   ` Huichun Feng
2022-01-18 22:54     ` Roman Gushchin
2022-07-19 13:05   ` Ren Zhijie
2022-07-19 13:17   ` Ren Zhijie
2022-07-19 23:21     ` Roman Gushchin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210916162451.709260-5-guro@fb.com \
    --to=guro@fb.com \
    --cc=bpf@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mgorman@techsingularity.net \
    --cc=mingo@redhat.com \
    --cc=peterz@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).