* [PATCH v2 0/2] sched: Optionally skip uclamp logic in fast path
@ 2020-06-19 17:20 Qais Yousef
  2020-06-19 17:20 ` [PATCH v2 1/2] sched/uclamp: Fix initialization of struct uclamp_rq Qais Yousef
                   ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Qais Yousef @ 2020-06-19 17:20 UTC (permalink / raw)
  To: Ingo Molnar, Peter Zijlstra
  Cc: Qais Yousef, Juri Lelli, Vincent Guittot, Dietmar Eggemann,
	Steven Rostedt, Ben Segall, Mel Gorman, Patrick Bellasi,
	Chris Redpath, Lukasz Luba, linux-kernel

This series attempts to address the report that the uclamp logic can be
expensive at times and shows a regression in netperf UDP_STREAM under
certain conditions.

The first patch is a fix for how struct uclamp_rq is initialized; it is
required by the 2nd patch, which contains the real 'fix'.

Worth noting that the root cause of the overhead is believed to be system
specific or related to certain potential code/data layout issues, leading to
worse I/D $ performance.

Different systems exhibited different behaviors and the regression did
disappear in certain kernel versions while attempting to reproduce it.

More info can be found here:

https://lore.kernel.org/lkml/20200616110824.dgkkbyapn3io6wik@e107158-lin/

Having the static key seemed the best thing to do to ensure the effect of
uclamp is minimized for kernels that compile it in but don't have a userspace
that uses it. This will allow distros to distribute uclamp-capable kernels by
default without having to compromise on performance for some systems that
could be affected.
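
A minimal sketch of the mechanism (the real implementation is in patch 2):
a default-true 'unused' key keeps the fast path free of uclamp work until
the first opt-in:

	static DEFINE_STATIC_KEY_TRUE(sched_uclamp_unused);

	static inline void uclamp_rq_inc(struct rq *rq, struct task_struct *p)
	{
		/* Patched to a NOP while userspace hasn't opted in */
		if (static_branch_likely(&sched_uclamp_unused))
			return;

		/* ... uclamp bucket accounting ... */
	}

	/* At each userspace entry point (sched_setattr(), sysctl, cgroup): */
	if (static_branch_unlikely(&sched_uclamp_unused))
		static_branch_disable(&sched_uclamp_unused);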

Changes in v2:
	* Add more info in the commit message about the result of perf diff to
	  demonstrate that the activate/deactivate_task pressure is reduced in
	  the fast path.

	* Fix sparse warning reported by the test robot.

	* Add an extra commit about using static_branch_likely() instead of
	  static_branch_unlikely().

Thanks

--
Qais Yousef

Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Ben Segall <bsegall@google.com>
Cc: Mel Gorman <mgorman@suse.de>
CC: Patrick Bellasi <patrick.bellasi@matbug.net>
Cc: Chris Redpath <chris.redpath@arm.com>
Cc: Lukasz Luba <lukasz.luba@arm.com>
Cc: linux-kernel@vger.kernel.org

Qais Yousef (2):
  sched/uclamp: Fix initialization of struct uclamp_rq
  sched/uclamp: Protect uclamp fast path code with static key

 kernel/sched/core.c | 81 ++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 72 insertions(+), 9 deletions(-)

-- 
2.17.1



* [PATCH v2 1/2] sched/uclamp: Fix initialization of struct uclamp_rq
  2020-06-19 17:20 [PATCH v2 0/2] sched: Optionally skip uclamp logic in fast path Qais Yousef
@ 2020-06-19 17:20 ` Qais Yousef
  2020-06-24  7:26   ` Patrick Bellasi
  2020-06-19 17:20 ` [PATCH v2 2/2] sched/uclamp: Protect uclamp fast path code with static key Qais Yousef
  2020-06-23 17:39 ` [PATCH v2 0/2] sched: Optionally skip uclamp logic in fast path Dietmar Eggemann
  2 siblings, 1 reply; 11+ messages in thread
From: Qais Yousef @ 2020-06-19 17:20 UTC (permalink / raw)
  To: Ingo Molnar, Peter Zijlstra
  Cc: Qais Yousef, Juri Lelli, Vincent Guittot, Dietmar Eggemann,
	Steven Rostedt, Ben Segall, Mel Gorman, Patrick Bellasi,
	Chris Redpath, Lukasz Luba, linux-kernel

struct uclamp_rq was zeroed out entirely on the assumption that the first
call to uclamp_rq_inc() would initialize it correctly in accordance with
the default settings.

But when the next patch introduces a static key to skip
uclamp_rq_{inc,dec}() until userspace opts in to use uclamp, schedutil
will fail to perform any frequency changes because
rq->uclamp[UCLAMP_MAX].value is zeroed at init and stays as such, which
means all rqs are capped to 0 by default.
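
For reference, schedutil ends up applying the rq clamps roughly as follows
(simplified from uclamp_rq_util_with(); task-specific handling elided):

	unsigned int min_util = READ_ONCE(rq->uclamp[UCLAMP_MIN].value);
	unsigned int max_util = READ_ONCE(rq->uclamp[UCLAMP_MAX].value);

	return clamp(util, min_util, max_util);

so with rq->uclamp[UCLAMP_MAX].value stuck at 0, every util value, and
hence every frequency request, is clamped to 0.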

Fix it by making sure we do proper initialization at init without
relying on uclamp_rq_inc() doing it later.

Fixes: 69842cba9ace ("sched/uclamp: Add CPU's clamp buckets refcounting")
Signed-off-by: Qais Yousef <qais.yousef@arm.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Ben Segall <bsegall@google.com>
Cc: Mel Gorman <mgorman@suse.de>
CC: Patrick Bellasi <patrick.bellasi@matbug.net>
Cc: Chris Redpath <chris.redpath@arm.com>
Cc: Lukasz Luba <lukasz.luba@arm.com>
Cc: linux-kernel@vger.kernel.org
---
 kernel/sched/core.c | 23 ++++++++++++++++++-----
 1 file changed, 18 insertions(+), 5 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index a43c84c27c6f..4265861e13e9 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1248,6 +1248,22 @@ static void uclamp_fork(struct task_struct *p)
 	}
 }
 
+static void __init init_uclamp_rq(struct rq *rq)
+{
+	enum uclamp_id clamp_id;
+	struct uclamp_rq *uc_rq = rq->uclamp;
+
+	for_each_clamp_id(clamp_id) {
+		memset(uc_rq[clamp_id].bucket,
+		       0,
+		       sizeof(struct uclamp_bucket)*UCLAMP_BUCKETS);
+
+		uc_rq[clamp_id].value = uclamp_none(clamp_id);
+	}
+
+	rq->uclamp_flags = 0;
+}
+
 static void __init init_uclamp(void)
 {
 	struct uclamp_se uc_max = {};
@@ -1256,11 +1272,8 @@ static void __init init_uclamp(void)
 
 	mutex_init(&uclamp_mutex);
 
-	for_each_possible_cpu(cpu) {
-		memset(&cpu_rq(cpu)->uclamp, 0,
-				sizeof(struct uclamp_rq)*UCLAMP_CNT);
-		cpu_rq(cpu)->uclamp_flags = 0;
-	}
+	for_each_possible_cpu(cpu)
+		init_uclamp_rq(cpu_rq(cpu));
 
 	for_each_clamp_id(clamp_id) {
 		uclamp_se_set(&init_task.uclamp_req[clamp_id],
-- 
2.17.1



* [PATCH v2 2/2] sched/uclamp: Protect uclamp fast path code with static key
  2020-06-19 17:20 [PATCH v2 0/2] sched: Optionally skip uclamp logic in fast path Qais Yousef
  2020-06-19 17:20 ` [PATCH v2 1/2] sched/uclamp: Fix initialization of struct uclamp_rq Qais Yousef
@ 2020-06-19 17:20 ` Qais Yousef
  2020-06-24  7:34   ` Patrick Bellasi
  2020-06-23 17:39 ` [PATCH v2 0/2] sched: Optionally skip uclamp logic in fast path Dietmar Eggemann
  2 siblings, 1 reply; 11+ messages in thread
From: Qais Yousef @ 2020-06-19 17:20 UTC (permalink / raw)
  To: Ingo Molnar, Peter Zijlstra
  Cc: Qais Yousef, Juri Lelli, Vincent Guittot, Dietmar Eggemann,
	Steven Rostedt, Ben Segall, Mel Gorman, Patrick Bellasi,
	Chris Redpath, Lukasz Luba, linux-kernel

There is a report that when uclamp is enabled, a netperf UDP test
regresses compared to a kernel compiled without uclamp.

https://lore.kernel.org/lkml/20200529100806.GA3070@suse.de/

While investigating the root cause, there was no sign that the uclamp
code is doing anything particularly expensive, but it could suffer from
bad cache behavior under certain circumstances that are yet to be
understood.

https://lore.kernel.org/lkml/20200616110824.dgkkbyapn3io6wik@e107158-lin/

To reduce the pressure on the fast path anyway, add a static key that
will, by default, skip executing the uclamp logic in the
enqueue/dequeue_task() fast path until it's needed.

As soon as the user starts using util clamp by:

	1. Changing uclamp value of a task with sched_setattr()
	2. Modifying the default sysctl_sched_util_clamp_{min, max}
	3. Modifying the default cpu.uclamp.{min, max} value in cgroup

we flip the static key now that the user has opted in to use util clamp,
effectively re-introducing the uclamp logic in the enqueue/dequeue_task()
fast path. It stays on from that point forward until the next reboot.

This should help minimize the effect of util clamp on workloads that
don't need it but still allow distros to ship their kernels with uclamp
compiled in by default.

The SCHED_WARN_ON() in uclamp_rq_dec_id() was removed since we can now
end up with an unbalanced call to uclamp_rq_dec_id() if we flip the key
while a task is running in the rq. Since we know it is harmless, we just
quietly return if we attempt a uclamp_rq_dec_id() when
rq->uclamp[].bucket[].tasks is 0.

The following results demonstrate how this helps on a 2-socket Xeon E5
2x10-core system.

                                   nouclamp                 uclamp      uclamp-static-key
Hmean     send-64         162.43 (   0.00%)      157.84 *  -2.82%*      163.39 *   0.59%*
Hmean     send-128        324.71 (   0.00%)      314.78 *  -3.06%*      326.18 *   0.45%*
Hmean     send-256        641.55 (   0.00%)      628.67 *  -2.01%*      648.12 *   1.02%*
Hmean     send-1024      2525.28 (   0.00%)     2448.26 *  -3.05%*     2543.73 *   0.73%*
Hmean     send-2048      4836.14 (   0.00%)     4712.08 *  -2.57%*     4867.69 *   0.65%*
Hmean     send-3312      7540.83 (   0.00%)     7425.45 *  -1.53%*     7621.06 *   1.06%*
Hmean     send-4096      9124.53 (   0.00%)     8948.82 *  -1.93%*     9276.25 *   1.66%*
Hmean     send-8192     15589.67 (   0.00%)    15486.35 *  -0.66%*    15819.98 *   1.48%*
Hmean     send-16384    26386.47 (   0.00%)    25752.25 *  -2.40%*    26773.74 *   1.47%*

The perf diff between nouclamp and uclamp-static-key when uclamp is
disabled in the fast path:

     8.73%     -1.55%  [kernel.kallsyms]        [k] try_to_wake_up
     0.07%     +0.04%  [kernel.kallsyms]        [k] deactivate_task
     0.13%     -0.02%  [kernel.kallsyms]        [k] activate_task

The diff between nouclamp and uclamp-static-key when uclamp is enabled
in the fast path:

     8.73%     -0.72%  [kernel.kallsyms]        [k] try_to_wake_up
     0.13%     +0.39%  [kernel.kallsyms]        [k] activate_task
     0.07%     +0.38%  [kernel.kallsyms]        [k] deactivate_task

Reported-by: Mel Gorman <mgorman@suse.de>
Fixes: 69842cba9ace ("sched/uclamp: Add CPU's clamp buckets refcounting")
Signed-off-by: Qais Yousef <qais.yousef@arm.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Ben Segall <bsegall@google.com>
Cc: Mel Gorman <mgorman@suse.de>
CC: Patrick Bellasi <patrick.bellasi@matbug.net>
Cc: Chris Redpath <chris.redpath@arm.com>
Cc: Lukasz Luba <lukasz.luba@arm.com>
Cc: linux-kernel@vger.kernel.org
---

This takes a different approach from PSI, which introduced a config option:

```
      CONFIG_PSI_DEFAULT_DISABLED

        Require boot parameter to enable pressure stall information
        tracking (NEW)

      boot param psi
```

via commit e0c274472d5d ("psi: make disabling/enabling easier for vendor kernels").

uclamp has clearer points of entry when userspace would like to use it, so
we can automatically flip the switch if the kernel is running with a
userspace that wants to use util clamp, without any extra userspace-visible
switches.

I wanted to make this dependent on schedutil being the governor too, but
besides the complexity, uclamp is also used for capacity awareness. We could
certainly construct a more complex condition, but I'm not sure it's worth
it. Open to hearing more opinions and points of view on this :)

 kernel/sched/core.c | 58 +++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 54 insertions(+), 4 deletions(-)


diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 4265861e13e9..9ab22f699613 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -793,6 +793,25 @@ unsigned int sysctl_sched_uclamp_util_max = SCHED_CAPACITY_SCALE;
 /* All clamps are required to be less or equal than these values */
 static struct uclamp_se uclamp_default[UCLAMP_CNT];
 
+/*
+ * This static key is used to reduce the uclamp overhead in the fast path. It
+ * only disables the call to uclamp_rq_{inc, dec}() in enqueue/dequeue_task().
+ *
+ * This allows users to continue to enable uclamp in their kernel config with
+ * minimum uclamp overhead in the fast path.
+ *
+ * As soon as userspace modifies any of the uclamp knobs, the static key is
+ * disabled, since we have actual users that make use of the uclamp
+ * functionality.
+ *
+ * The knobs that would disable this static key are:
+ *
+ *   * A task modifying its uclamp value with sched_setattr().
+ *   * An admin modifying the sysctl_sched_uclamp_{min, max} via procfs.
+ *   * An admin modifying the cgroup cpu.uclamp.{min, max}
+ */
+static DEFINE_STATIC_KEY_TRUE(sched_uclamp_unused);
+
 /* Integer rounded range for each bucket */
 #define UCLAMP_BUCKET_DELTA DIV_ROUND_CLOSEST(SCHED_CAPACITY_SCALE, UCLAMP_BUCKETS)
 
@@ -993,9 +1012,16 @@ static inline void uclamp_rq_dec_id(struct rq *rq, struct task_struct *p,
 	lockdep_assert_held(&rq->lock);
 
 	bucket = &uc_rq->bucket[uc_se->bucket_id];
-	SCHED_WARN_ON(!bucket->tasks);
-	if (likely(bucket->tasks))
-		bucket->tasks--;
+
+	/*
+	 * This could happen if sched_uclamp_unused was disabled while the
+	 * current task was running, hence we could end up with unbalanced call
+	 * to uclamp_rq_dec_id().
+	 */
+	if (unlikely(!bucket->tasks))
+		return;
+
+	bucket->tasks--;
 	uc_se->active = false;
 
 	/*
@@ -1031,6 +1057,13 @@ static inline void uclamp_rq_inc(struct rq *rq, struct task_struct *p)
 {
 	enum uclamp_id clamp_id;
 
+	/*
+	 * Avoid any overhead until uclamp is actually used by the userspace.
+	 * Including the potential JMP if we use static_branch_unlikely()
+	 */
+	if (static_branch_likely(&sched_uclamp_unused))
+		return;
+
 	if (unlikely(!p->sched_class->uclamp_enabled))
 		return;
 
@@ -1046,6 +1079,13 @@ static inline void uclamp_rq_dec(struct rq *rq, struct task_struct *p)
 {
 	enum uclamp_id clamp_id;
 
+	/*
+	 * Avoid any overhead until uclamp is actually used by the userspace.
+	 * Including the potential JMP if we use static_branch_unlikely()
+	 */
+	if (static_branch_likely(&sched_uclamp_unused))
+		return;
+
 	if (unlikely(!p->sched_class->uclamp_enabled))
 		return;
 
@@ -1155,9 +1195,13 @@ int sysctl_sched_uclamp_handler(struct ctl_table *table, int write,
 		update_root_tg = true;
 	}
 
-	if (update_root_tg)
+	if (update_root_tg) {
 		uclamp_update_root_tg();
 
+		if (static_branch_unlikely(&sched_uclamp_unused))
+			static_branch_disable(&sched_uclamp_unused);
+	}
+
 	/*
 	 * We update all RUNNABLE tasks only when task groups are in use.
 	 * Otherwise, keep it simple and do just a lazy update at each next
@@ -1221,6 +1265,9 @@ static void __setscheduler_uclamp(struct task_struct *p,
 	if (likely(!(attr->sched_flags & SCHED_FLAG_UTIL_CLAMP)))
 		return;
 
+	if (static_branch_unlikely(&sched_uclamp_unused))
+		static_branch_disable(&sched_uclamp_unused);
+
 	if (attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_MIN) {
 		uclamp_se_set(&p->uclamp_req[UCLAMP_MIN],
 			      attr->sched_util_min, true);
@@ -7315,6 +7362,9 @@ static ssize_t cpu_uclamp_write(struct kernfs_open_file *of, char *buf,
 	if (req.ret)
 		return req.ret;
 
+	if (static_branch_unlikely(&sched_uclamp_unused))
+		static_branch_disable(&sched_uclamp_unused);
+
 	mutex_lock(&uclamp_mutex);
 	rcu_read_lock();
 
-- 
2.17.1



* Re: [PATCH v2 0/2] sched: Optionally skip uclamp logic in fast path
  2020-06-19 17:20 [PATCH v2 0/2] sched: Optionally skip uclamp logic in fast path Qais Yousef
  2020-06-19 17:20 ` [PATCH v2 1/2] sched/uclamp: Fix initialization of struct uclamp_rq Qais Yousef
  2020-06-19 17:20 ` [PATCH v2 2/2] sched/uclamp: Protect uclamp fast path code with static key Qais Yousef
@ 2020-06-23 17:39 ` Dietmar Eggemann
  2020-06-24  9:00   ` Vincent Guittot
  2 siblings, 1 reply; 11+ messages in thread
From: Dietmar Eggemann @ 2020-06-23 17:39 UTC (permalink / raw)
  To: Qais Yousef, Ingo Molnar, Peter Zijlstra
  Cc: Juri Lelli, Vincent Guittot, Steven Rostedt, Ben Segall,
	Mel Gorman, Patrick Bellasi, Chris Redpath, Lukasz Luba,
	linux-kernel

On 19/06/2020 19:20, Qais Yousef wrote:
> This series attempts to address the report that the uclamp logic can be
> expensive at times and shows a regression in netperf UDP_STREAM under
> certain conditions.
> 
> The first patch is a fix for how struct uclamp_rq is initialized; it is
> required by the 2nd patch, which contains the real 'fix'.
> 
> Worth noting that the root cause of the overhead is believed to be system
> specific or related to certain potential code/data layout issues, leading to
> worse I/D $ performance.
> 
> Different systems exhibited different behaviors and the regression did
> disappear in certain kernel versions while attempting to reproduce it.
> 
> More info can be found here:
> 
> https://lore.kernel.org/lkml/20200616110824.dgkkbyapn3io6wik@e107158-lin/
> 
> Having the static key seemed the best thing to do to ensure the effect of
> uclamp is minimized for kernels that compile it in but don't have a userspace
> that uses it. This will allow distros to distribute uclamp-capable kernels by
> default without having to compromise on performance for some systems that
> could be affected.

My test data indicates that the static key w/o any uclamp users (3)
brings the performance numbers for the 'perf bench sched pipe'
workload back to the !CONFIG_UCLAMP_TASK level (1).

platform:

    Arm64 Hikey960 (only little CPUs [0-3]), no CPUidle,
    performance CPUfreq governor

workload:

    perf stat -n -r 20 -- perf bench sched pipe -T -l 100000


(A) *** Performance results ***

(1) tip/sched/core
    # CONFIG_UCLAMP_TASK is not set

    *1.39285* +- 0.00191 seconds time elapsed  ( +-  0.14% )

(2) tip/sched/core
    CONFIG_UCLAMP_TASK=y

    *1.42877* +- 0.00181 seconds time elapsed  ( +-  0.13% )

(3) tip/sched/core + opt_skip_uclamp_v2
    CONFIG_UCLAMP_TASK=y

    *1.38833* +- 0.00291 seconds time elapsed  ( +-  0.21% )

(4) tip/sched/core + opt_skip_uclamp_v2
    CONFIG_UCLAMP_TASK=y
    echo 512 > /proc/sys/kernel/sched_util_clamp_min (enable uclamp)

    *1.42062* +- 0.00238 seconds time elapsed  ( +-  0.17% )


(B) *** Profiling on CPU0 and CPU1  ***

    (further hotplugging out CPU2 and CPU3 to get consistent hit numbers)
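
    (Numbers from the ftrace function profiler, i.e. roughly, under
     /sys/kernel/debug/tracing/:

         echo activate_task deactivate_task > set_ftrace_filter
         echo 1 > function_profile_enabled
         cat trace_stat/function0	# per-CPU stats, here CPU0
    )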

(1)

CPU0:  Function             Hit    Time            Avg             s^2
       --------             ---    ----            ---             ---
       deactivate_task  1997346    2207642 us      *1.105* us      0.033 us
       activate_task    1997391    1840057 us      *0.921* us      0.054 us

CPU1:  Function             Hit    Time            Avg             s^2
       --------             ---    ----            ---             ---
       deactivate_task  1997455    2225960 us      1.114 us        0.034 us
       activate_task    1997410    1842603 us      0.922 us        0.052 us

(2)

CPU0:  Function             Hit    Time            Avg             s^2
       --------             ---    ----            ---             ---
       deactivate_task  1998538    2419719 us      *1.210* us      0.061 us
       activate_task    1997119    1960401 us      *0.981* us      0.034 us

CPU1:  Function             Hit    Time            Avg             s^2
       --------             ---    ----            ---             ---
       deactivate_task  1996597    2400760 us      1.202 us        0.059 us
       activate_task    1998016    1985013 us      0.993 us        0.028 us

(3)

CPU0:  Function             Hit    Time            Avg             s^2
       --------             ---    ----            ---             ---
       deactivate_task  1997525    2155416 us      *1.079* us      0.020 us
       activate_task    1997874    1899002 us      *0.950* us      0.044 us

CPU1:  Function             Hit    Time            Avg             s^2
       --------             ---    ----            ---             ---
       deactivate_task  1997935    2118648 us      1.060 us        0.017 us
       activate_task    1997586    1895162 us      0.948 us        0.044 us

(4)

CPU0:  Function             Hit    Time            Avg             s^2
       --------             ---    ----            ---             ---
       deactivate_task  1998246    2428121 us      *1.215* us      0.062 us
       activate_task    1998252    2132141 us      *1.067* us      0.020 us

CPU1:  Function             Hit    Time            Avg             s^2
       --------             ---    ----            ---             ---
       deactivate_task  1996154    2414194 us      1.209 us        0.060 us
       activate_task    1996148    2140667 us      1.072 us        0.021 us


* Re: [PATCH v2 1/2] sched/uclamp: Fix initialization of struct uclamp_rq
  2020-06-19 17:20 ` [PATCH v2 1/2] sched/uclamp: Fix initialization of struct uclamp_rq Qais Yousef
@ 2020-06-24  7:26   ` Patrick Bellasi
  2020-06-24 10:36     ` Qais Yousef
  0 siblings, 1 reply; 11+ messages in thread
From: Patrick Bellasi @ 2020-06-24  7:26 UTC (permalink / raw)
  To: Qais Yousef
  Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Chris Redpath, Lukasz Luba, linux-kernel


Hi Qais,

On Fri, Jun 19, 2020 at 19:20:10 +0200, Qais Yousef <qais.yousef@arm.com> wrote...

> struct uclamp_rq was zeroed out entirely on the assumption that the first
> call to uclamp_rq_inc() would initialize it correctly in accordance with
> the default settings.
>
> But when the next patch introduces a static key to skip
> uclamp_rq_{inc,dec}() until userspace opts in to use uclamp, schedutil
> will fail to perform any frequency changes because
> rq->uclamp[UCLAMP_MAX].value is zeroed at init and stays as such, which
> means all rqs are capped to 0 by default.

Doesn't this mean the problem is more likely with uclamp_rq_util_with(),
which should be guarded?

Otherwise, we will also keep doing useless min/max aggregations each
time schedutil calls that function, thus not completely removing
uclamp overheads while user-space has not opted in.

What about dropping this and adding the guard in the following patch, along
with the others?
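
A hypothetical sketch of such a guard, reusing the static key from patch 2
(not part of the posted series):

	/* at the top of uclamp_rq_util_with(): */
	if (static_branch_likely(&sched_uclamp_unused))
		return util;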

>
> Fix it by making sure we do proper initialization at init without
> relying on uclamp_rq_inc() doing it later.
>
> Fixes: 69842cba9ace ("sched/uclamp: Add CPU's clamp buckets refcounting")
> Signed-off-by: Qais Yousef <qais.yousef@arm.com>
> Cc: Juri Lelli <juri.lelli@redhat.com>
> Cc: Vincent Guittot <vincent.guittot@linaro.org>
> Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
> Cc: Steven Rostedt <rostedt@goodmis.org>
> Cc: Ben Segall <bsegall@google.com>
> Cc: Mel Gorman <mgorman@suse.de>
> CC: Patrick Bellasi <patrick.bellasi@matbug.net>
> Cc: Chris Redpath <chris.redpath@arm.com>
> Cc: Lukasz Luba <lukasz.luba@arm.com>
> Cc: linux-kernel@vger.kernel.org
> ---
>  kernel/sched/core.c | 23 ++++++++++++++++++-----
>  1 file changed, 18 insertions(+), 5 deletions(-)
>
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index a43c84c27c6f..4265861e13e9 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -1248,6 +1248,22 @@ static void uclamp_fork(struct task_struct *p)
>  	}
>  }
>  
> +static void __init init_uclamp_rq(struct rq *rq)
> +{
> +	enum uclamp_id clamp_id;
> +	struct uclamp_rq *uc_rq = rq->uclamp;
> +
> +	for_each_clamp_id(clamp_id) {
> +		memset(uc_rq[clamp_id].bucket,
> +		       0,
> +		       sizeof(struct uclamp_bucket)*UCLAMP_BUCKETS);
> +
> +		uc_rq[clamp_id].value = uclamp_none(clamp_id);
> +	}
> +
> +	rq->uclamp_flags = 0;
> +}
> +
>  static void __init init_uclamp(void)
>  {
>  	struct uclamp_se uc_max = {};
> @@ -1256,11 +1272,8 @@ static void __init init_uclamp(void)
>  
>  	mutex_init(&uclamp_mutex);
>  
> -	for_each_possible_cpu(cpu) {
> -		memset(&cpu_rq(cpu)->uclamp, 0,
> -				sizeof(struct uclamp_rq)*UCLAMP_CNT);
> -		cpu_rq(cpu)->uclamp_flags = 0;
> -	}
> +	for_each_possible_cpu(cpu)
> +		init_uclamp_rq(cpu_rq(cpu));
>  
>  	for_each_clamp_id(clamp_id) {
>  		uclamp_se_set(&init_task.uclamp_req[clamp_id],



* Re: [PATCH v2 2/2] sched/uclamp: Protect uclamp fast path code with static key
  2020-06-19 17:20 ` [PATCH v2 2/2] sched/uclamp: Protect uclamp fast path code with static key Qais Yousef
@ 2020-06-24  7:34   ` Patrick Bellasi
  2020-06-24 11:07     ` Qais Yousef
  2020-06-24 12:28     ` Peter Zijlstra
  0 siblings, 2 replies; 11+ messages in thread
From: Patrick Bellasi @ 2020-06-24  7:34 UTC (permalink / raw)
  To: Qais Yousef
  Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Chris Redpath, Lukasz Luba, linux-kernel


On Fri, Jun 19, 2020 at 19:20:11 +0200, Qais Yousef <qais.yousef@arm.com> wrote...

[...]

> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 4265861e13e9..9ab22f699613 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -793,6 +793,25 @@ unsigned int sysctl_sched_uclamp_util_max = SCHED_CAPACITY_SCALE;
>  /* All clamps are required to be less or equal than these values */
>  static struct uclamp_se uclamp_default[UCLAMP_CNT];
>  
> +/*
> + * This static key is used to reduce the uclamp overhead in the fast path. It
> + * only disables the call to uclamp_rq_{inc, dec}() in enqueue/dequeue_task().
> + *
> + * This allows users to continue to enable uclamp in their kernel config with
> + * minimum uclamp overhead in the fast path.
> + *
> + * As soon as userspace modifies any of the uclamp knobs, the static key is
> > + * disabled, since we have actual users that make use of the uclamp
> + * functionality.
> + *
> + * The knobs that would disable this static key are:
> + *
> + *   * A task modifying its uclamp value with sched_setattr().
> + *   * An admin modifying the sysctl_sched_uclamp_{min, max} via procfs.
> + *   * An admin modifying the cgroup cpu.uclamp.{min, max}
> + */
> +static DEFINE_STATIC_KEY_TRUE(sched_uclamp_unused);

I would personally prefer a non-negated semantic.

Why not use 'sched_uclamp_enabled'?

> +
>  /* Integer rounded range for each bucket */
>  #define UCLAMP_BUCKET_DELTA DIV_ROUND_CLOSEST(SCHED_CAPACITY_SCALE, UCLAMP_BUCKETS)
>  
> @@ -993,9 +1012,16 @@ static inline void uclamp_rq_dec_id(struct rq *rq, struct task_struct *p,
>  	lockdep_assert_held(&rq->lock);
>  
>  	bucket = &uc_rq->bucket[uc_se->bucket_id];
> -	SCHED_WARN_ON(!bucket->tasks);
> -	if (likely(bucket->tasks))
> -		bucket->tasks--;
> +
> +	/*
> +	 * This could happen if sched_uclamp_unused was disabled while the
> +	 * current task was running, hence we could end up with unbalanced call
> +	 * to uclamp_rq_dec_id().
> +	 */
> +	if (unlikely(!bucket->tasks))
> +		return;
> +
> +	bucket->tasks--;

Since you are not really changing the logic above, why change the
code?

The SCHED_WARN_ON/if(likely) is a defensive programming thing.
I understand that SCHED_WARN_ON() can now be misleading because of the
unbalanced calls but... why not just remove it?

Maybe also add that in the comment, but I don't see valid reasons to
change the code if the functionality is not changing.


>  	uc_se->active = false;
>  
>  	/*
> @@ -1031,6 +1057,13 @@ static inline void uclamp_rq_inc(struct rq *rq, struct task_struct *p)
>  {
>  	enum uclamp_id clamp_id;
>  
> +	/*
> +	 * Avoid any overhead until uclamp is actually used by the userspace.
> +	 * Including the potential JMP if we use static_branch_unlikely()

The comment above (about unlikely) seems not to match the code?

> +	 */
> +	if (static_branch_likely(&sched_uclamp_unused))
> +		return;

Moreover, something like:

       if (static_key_false(&sched_uclamp_enabled))
                return;

is not just good enough?

> +
>  	if (unlikely(!p->sched_class->uclamp_enabled))
>  		return;

Since we already have these per sched_class gates, I'm wondering if it
could make sense to just re-purpose them.

Problem with the static key is that if just one RT task opts in, CFS
will still pay the overheads, and vice versa too.

So, an alternative approach could be to opt in sched classes on-demand.

The above if(unlikely) is not exactly the same as a static key being true,
but I assume we agree the overheads we are tackling are nothing compared
to that check, aren't they?


> @@ -1046,6 +1079,13 @@ static inline void uclamp_rq_dec(struct rq *rq, struct task_struct *p)
>  {
>  	enum uclamp_id clamp_id;
>  
> +	/*
> +	 * Avoid any overhead until uclamp is actually used by the userspace.
> +	 * Including the potential JMP if we use static_branch_unlikely()
> +	 */
> +	if (static_branch_likely(&sched_uclamp_unused))
> +		return;
> +
>  	if (unlikely(!p->sched_class->uclamp_enabled))
>  		return;
>  
> @@ -1155,9 +1195,13 @@ int sysctl_sched_uclamp_handler(struct ctl_table *table, int write,
>  		update_root_tg = true;
>  	}
>  
> -	if (update_root_tg)
> +	if (update_root_tg) {
>  		uclamp_update_root_tg();
>  
> +		if (static_branch_unlikely(&sched_uclamp_unused))
> +			static_branch_disable(&sched_uclamp_unused);
> +	}
> +

Can we move the above into a function?

Something similar to set_schedstats(bool), what about uclamp_enable(bool)?
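
e.g. something like this sketch (one-way here, since the series keeps the
key flipped until the next reboot):

	static void uclamp_enable(void)
	{
		if (static_branch_unlikely(&sched_uclamp_unused))
			static_branch_disable(&sched_uclamp_unused);
	}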

>  	/*
>  	 * We update all RUNNABLE tasks only when task groups are in use.
>  	 * Otherwise, keep it simple and do just a lazy update at each next
> @@ -1221,6 +1265,9 @@ static void __setscheduler_uclamp(struct task_struct *p,
>  	if (likely(!(attr->sched_flags & SCHED_FLAG_UTIL_CLAMP)))
>  		return;
>  
> +	if (static_branch_unlikely(&sched_uclamp_unused))
> +		static_branch_disable(&sched_uclamp_unused);
> +
>  	if (attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_MIN) {
>  		uclamp_se_set(&p->uclamp_req[UCLAMP_MIN],
>  			      attr->sched_util_min, true);
> @@ -7315,6 +7362,9 @@ static ssize_t cpu_uclamp_write(struct kernfs_open_file *of, char *buf,
>  	if (req.ret)
>  		return req.ret;
>  
> +	if (static_branch_unlikely(&sched_uclamp_unused))
> +		static_branch_disable(&sched_uclamp_unused);
> +
>  	mutex_lock(&uclamp_mutex);
>  	rcu_read_lock();

Best,
Patrick


* Re: [PATCH v2 0/2] sched: Optionally skip uclamp logic in fast path
  2020-06-23 17:39 ` [PATCH v2 0/2] sched: Optionally skip uclamp logic in fast path Dietmar Eggemann
@ 2020-06-24  9:00   ` Vincent Guittot
  2020-06-24 12:26     ` Qais Yousef
  0 siblings, 1 reply; 11+ messages in thread
From: Vincent Guittot @ 2020-06-24  9:00 UTC (permalink / raw)
  To: Dietmar Eggemann, Qais Yousef
  Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Steven Rostedt,
	Ben Segall, Mel Gorman, Patrick Bellasi, Chris Redpath,
	Lukasz Luba, linux-kernel

On Tue, 23 Jun 2020 at 19:40, Dietmar Eggemann <dietmar.eggemann@arm.com> wrote:
>
> On 19/06/2020 19:20, Qais Yousef wrote:
> > This series attempts to address the report that the uclamp logic can be
> > expensive at times and shows a regression in netperf UDP_STREAM under
> > certain conditions.
> >
> > The first patch is a fix for how struct uclamp_rq is initialized; it is
> > required by the 2nd patch, which contains the real 'fix'.
> >
> > Worth noting that the root cause of the overhead is believed to be system
> > specific or related to certain potential code/data layout issues, leading to
> > worse I/D $ performance.
> >
> > Different systems exhibited different behaviors and the regression did
> > disappear in certain kernel versions while attempting to reproduce it.
> >
> > More info can be found here:
> >
> > https://lore.kernel.org/lkml/20200616110824.dgkkbyapn3io6wik@e107158-lin/
> >
> > Having the static key seemed the best thing to do to ensure the effect of
> > uclamp is minimized for kernels that compile it in but don't have a userspace
> > that uses it. This will allow distros to distribute uclamp-capable kernels by
> > default without having to compromise on performance for some systems that
> > could be affected.
>
> My test data indicates that the static key w/o any uclamp users (3)
> brings the performance numbers for the 'perf bench sched pipe'
> workload back to the !CONFIG_UCLAMP_TASK level (1).
>
> platform:
>
>     Arm64 Hikey960 (only little CPUs [0-3]), no CPUidle,
>     performance CPUfreq governor
>
> workload:
>
>     perf stat -n -r 20 -- perf bench sched pipe -T -l 100000
>
>
> (A) *** Performance results ***
>
> (1) tip/sched/core
>     # CONFIG_UCLAMP_TASK is not set
>
>     *1.39285* +- 0.00191 seconds time elapsed  ( +-  0.14% )
>
> (2) tip/sched/core
>     CONFIG_UCLAMP_TASK=y
>
>     *1.42877* +- 0.00181 seconds time elapsed  ( +-  0.13% )
>
> (3) tip/sched/core + opt_skip_uclamp_v2
>     CONFIG_UCLAMP_TASK=y
>
>     *1.38833* +- 0.00291 seconds time elapsed  ( +-  0.21% )
>
> (4) tip/sched/core + opt_skip_uclamp_v2
>     CONFIG_UCLAMP_TASK=y
>     echo 512 > /proc/sys/kernel/sched_util_clamp_min (enable uclamp)
>
>     *1.42062* +- 0.00238 seconds time elapsed  ( +-  0.17% )
>
>
> (B) *** Profiling on CPU0 and CPU1  ***
>
>     (further hotplugging out CPU2 and CPU3 to get consistent hit numbers)
>
> (1)
>
> CPU0:  Function             Hit    Time            Avg             s^2
>        --------             ---    ----            ---             ---
>        deactivate_task  1997346    2207642 us      *1.105* us      0.033 us
>        activate_task    1997391    1840057 us      *0.921* us      0.054 us
>
> CPU1:  Function             Hit    Time            Avg             s^2
>        --------             ---    ----            ---             ---
>        deactivate_task  1997455    2225960 us      1.114 us        0.034 us
>        activate_task    1997410    1842603 us      0.922 us        0.052 us
>
> (2)
>
> CPU0:  Function             Hit    Time            Avg             s^2
>        --------             ---    ----            ---             ---
>        deactivate_task  1998538    2419719 us      *1.210* us      0.061 us
>        activate_task    1997119    1960401 us      *0.981* us      0.034 us
>
> CPU1:  Function             Hit    Time            Avg             s^2
>        --------             ---    ----            ---             ---
>        deactivate_task  1996597    2400760 us      1.202 us        0.059 us
>        activate_task    1998016    1985013 us      0.993 us        0.028 us
>
> (3)
>
> CPU0:  Function             Hit    Time            Avg             s^2
>        --------             ---    ----            ---             ---
>        deactivate_task  1997525    2155416 us      *1.079* us      0.020 us
>        activate_task    1997874    1899002 us      *0.950* us      0.044 us
>
> CPU1:  Function             Hit    Time            Avg             s^2
>        --------             ---    ----            ---             ---
>        deactivate_task  1997935    2118648 us      1.060 us        0.017 us
>        activate_task    1997586    1895162 us      0.948 us        0.044 us
>
> (4)
>
> CPU0:  Function             Hit    Time            Avg             s^2
>        --------             ---    ----            ---             ---
>        deactivate_task  1998246    2428121 us      *1.215* us      0.062 us
>        activate_task    1998252    2132141 us      *1.067* us      0.020 us
>
> CPU1:  Function             Hit    Time            Avg             s^2
>        --------             ---    ----            ---             ---
>        deactivate_task  1996154    2414194 us      1.209 us        0.060 us
>        activate_task    1996148    2140667 us      1.072 us        0.021 us

I have rerun the tests that I ran previously on my octo core arm64 (hikey):
20 iterations of perf bench sched pipe -T -l 50000
tip stands for tip/sched/core
uclamp enabled means both uclamp task and uclamp cgroup
the stdev is around 0.25% for all tests

                           root           level 1       level 2       level 3

tip uclamp disabled        50653          47188         44568         41925
tip uclamp enabled         48800(-3.66%)  45600(-3.37%) 42822(-3.92%) 40257(-3.98%)
/w patch uclamp disabled   50615(-0.08%)  47198(+0.02%) 44609(+0.09%) 41735(-0.45%)
/w patch uclamp enabled    49661(-1.96%)  46611(-1.22%) 43803(-1.72%) 41243(-1.63%)

Results are better with your patch


* Re: [PATCH v2 1/2] sched/uclamp: Fix initialization of struct uclamp_rq
  2020-06-24  7:26   ` Patrick Bellasi
@ 2020-06-24 10:36     ` Qais Yousef
  0 siblings, 0 replies; 11+ messages in thread
From: Qais Yousef @ 2020-06-24 10:36 UTC (permalink / raw)
  To: Patrick Bellasi
  Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Chris Redpath, Lukasz Luba, linux-kernel

On 06/24/20 09:26, Patrick Bellasi wrote:
> 
> Hi Qais,
> 
> On Fri, Jun 19, 2020 at 19:20:10 +0200, Qais Yousef <qais.yousef@arm.com> wrote...
> 
> > struct uclamp_rq was zeroed out entirely on the assumption that the first
> > call to uclamp_rq_inc() would initialize it correctly in accordance with
> > the default settings.
> >
> > But when the next patch introduces a static key to skip
> > uclamp_rq_{inc,dec}() until userspace opts in to use uclamp, schedutil
> > will fail to perform any frequency changes because
> > rq->uclamp[UCLAMP_MAX].value is zeroed at init and stays as such, which
> > means all rqs are capped to 0 by default.
> 
> Doesn't this mean the problem is more likely with uclamp_rq_util_with(),
> which should be guarded?

The initialization is wrong and needs to be fixed, no? So I won't say
uclamp_rq_util_with() has any problem.

For RT boosting to work as-is, uclamp_rq_util_with() needs to stay the same;
otherwise we'd need to add extra logic to deal with that, which I don't think
is worth it or necessary. The function is called from sugov and
find_energy_efficient_cpu(), and making this call unconditionally in either
of them isn't a worry IMO.

Thanks

--
Qais Yousef

> 
> Otherwise, we will also keep doing useless min/max aggregations each
> time schedutil calls that function, thus not completely removing
> uclamp overheads while user-space has not opted in.
> 
> What about dropping this and adding the guard in the following patch, along
> with the others?

> 
> >
> > Fix it by making sure we do proper initialization at init without
> > relying on uclamp_rq_inc() doing it later.
> >
> > Fixes: 69842cba9ace ("sched/uclamp: Add CPU's clamp buckets refcounting")
> > Signed-off-by: Qais Yousef <qais.yousef@arm.com>
> > Cc: Juri Lelli <juri.lelli@redhat.com>
> > Cc: Vincent Guittot <vincent.guittot@linaro.org>
> > Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
> > Cc: Steven Rostedt <rostedt@goodmis.org>
> > Cc: Ben Segall <bsegall@google.com>
> > Cc: Mel Gorman <mgorman@suse.de>
> > CC: Patrick Bellasi <patrick.bellasi@matbug.net>
> > Cc: Chris Redpath <chris.redpath@arm.com>
> > Cc: Lukasz Luba <lukasz.luba@arm.com>
> > Cc: linux-kernel@vger.kernel.org
> > ---
> >  kernel/sched/core.c | 23 ++++++++++++++++++-----
> >  1 file changed, 18 insertions(+), 5 deletions(-)
> >
> > diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> > index a43c84c27c6f..4265861e13e9 100644
> > --- a/kernel/sched/core.c
> > +++ b/kernel/sched/core.c
> > @@ -1248,6 +1248,22 @@ static void uclamp_fork(struct task_struct *p)
> >  	}
> >  }
> >  
> > +static void __init init_uclamp_rq(struct rq *rq)
> > +{
> > +	enum uclamp_id clamp_id;
> > +	struct uclamp_rq *uc_rq = rq->uclamp;
> > +
> > +	for_each_clamp_id(clamp_id) {
> > +		memset(uc_rq[clamp_id].bucket,
> > +		       0,
> > +		       sizeof(struct uclamp_bucket)*UCLAMP_BUCKETS);
> > +
> > +		uc_rq[clamp_id].value = uclamp_none(clamp_id);
> > +	}
> > +
> > +	rq->uclamp_flags = 0;
> > +}
> > +
> >  static void __init init_uclamp(void)
> >  {
> >  	struct uclamp_se uc_max = {};
> > @@ -1256,11 +1272,8 @@ static void __init init_uclamp(void)
> >  
> >  	mutex_init(&uclamp_mutex);
> >  
> > -	for_each_possible_cpu(cpu) {
> > -		memset(&cpu_rq(cpu)->uclamp, 0,
> > -				sizeof(struct uclamp_rq)*UCLAMP_CNT);
> > -		cpu_rq(cpu)->uclamp_flags = 0;
> > -	}
> > +	for_each_possible_cpu(cpu)
> > +		init_uclamp_rq(cpu_rq(cpu));
> >  
> >  	for_each_clamp_id(clamp_id) {
> >  		uclamp_se_set(&init_task.uclamp_req[clamp_id],
> 


* Re: [PATCH v2 2/2] sched/uclamp: Protect uclamp fast path code with static key
  2020-06-24  7:34   ` Patrick Bellasi
@ 2020-06-24 11:07     ` Qais Yousef
  2020-06-24 12:28     ` Peter Zijlstra
  1 sibling, 0 replies; 11+ messages in thread
From: Qais Yousef @ 2020-06-24 11:07 UTC (permalink / raw)
  To: Patrick Bellasi
  Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Chris Redpath, Lukasz Luba, linux-kernel

On 06/24/20 09:34, Patrick Bellasi wrote:
> 
> On Fri, Jun 19, 2020 at 19:20:11 +0200, Qais Yousef <qais.yousef@arm.com> wrote...
> 
> [...]
> 
> > diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> > index 4265861e13e9..9ab22f699613 100644
> > --- a/kernel/sched/core.c
> > +++ b/kernel/sched/core.c
> > @@ -793,6 +793,25 @@ unsigned int sysctl_sched_uclamp_util_max = SCHED_CAPACITY_SCALE;
> >  /* All clamps are required to be less or equal than these values */
> >  static struct uclamp_se uclamp_default[UCLAMP_CNT];
> >  
> > +/*
> > + * This static key is used to reduce the uclamp overhead in the fast path. It
> > + * only disables the call to uclamp_rq_{inc, dec}() in enqueue/dequeue_task().
> > + *
> > + * This allows users to continue to enable uclamp in their kernel config with
> > + * minimum uclamp overhead in the fast path.
> > + *
> > + * As soon as userspace modifies any of the uclamp knobs, the static key is
> > > + * disabled, since we have actual users that make use of the uclamp
> > + * functionality.
> > + *
> > + * The knobs that would disable this static key are:
> > + *
> > + *   * A task modifying its uclamp value with sched_setattr().
> > + *   * An admin modifying the sysctl_sched_uclamp_{min, max} via procfs.
> > + *   * An admin modifying the cgroup cpu.uclamp.{min, max}
> > + */
> > +static DEFINE_STATIC_KEY_TRUE(sched_uclamp_unused);
> 
> I would personally prefer a non-negated semantic.
> 
> Why not use 'sched_uclamp_enabled'?

This was already discussed and the key was already renamed to
sched_uclamp_used; I will post that in v3, which is already ready.

> 
> > +
> >  /* Integer rounded range for each bucket */
> >  #define UCLAMP_BUCKET_DELTA DIV_ROUND_CLOSEST(SCHED_CAPACITY_SCALE, UCLAMP_BUCKETS)
> >  
> > @@ -993,9 +1012,16 @@ static inline void uclamp_rq_dec_id(struct rq *rq, struct task_struct *p,
> >  	lockdep_assert_held(&rq->lock);
> >  
> >  	bucket = &uc_rq->bucket[uc_se->bucket_id];
> > -	SCHED_WARN_ON(!bucket->tasks);
> > -	if (likely(bucket->tasks))
> > -		bucket->tasks--;
> > +
> > +	/*
> > +	 * This could happen if sched_uclamp_unused was disabled while the
> > +	 * current task was running, hence we could end up with unbalanced call
> > +	 * to uclamp_rq_dec_id().
> > +	 */
> > +	if (unlikely(!bucket->tasks))
> > +		return;
> > +
> > +	bucket->tasks--;
> 
> Since you are not really changing the logic above, why change the
> code?

The logic has changed. We return immediately now.

> 
> The SCHED_WARN_ON/if(likely) is a defensive programming thing.
> I understand that SCHED_WARN_ON() can now be misleading because of the
> unbalanced calls but... why not just remove it?

Do you think we need to execute the rest of the code if bucket->tasks is 0? It
would be good to know if there's a corner case you're worried about. AFAIU if
it is 0, this means there's nothing to do.

> 
> Maybe also add that in the comment, but I don't see valid reasons to
> change the code if the functionality is not changing.

If it means a lot to you, I could put it back as it was. I just don't see why
we can't return immediately instead.

> 
> 
> >  	uc_se->active = false;
> >  
> >  	/*
> > @@ -1031,6 +1057,13 @@ static inline void uclamp_rq_inc(struct rq *rq, struct task_struct *p)
> >  {
> >  	enum uclamp_id clamp_id;
> >  
> > +	/*
> > +	 * Avoid any overhead until uclamp is actually used by the userspace.
> > +	 * Including the potential JMP if we use static_branch_unlikely()
> 
> The comment above (about unlikely) seems not to match the code?

It does. The comment says that if we use unlikely() while uclamp is not in
use, we'll incur an extra jump/branch; hence we use likely() to avoid this
potential overhead. With the key true by default, static_branch_likely()
keeps the early return on the straight-line path (the branch site is patched
to a NOP), whereas static_branch_unlikely() would reach it via a JMP.

> 
> > +	 */
> > +	if (static_branch_likely(&sched_uclamp_unused))
> > +		return;
> 
> Moreover, something like:
> 
>        if (static_key_false(&sched_uclamp_enabled))
>                 return;
> 
> is not just good enough?

Aren't they both good enough?

Use of static_key_false() is deprecated AFAICS.

> 
> > +
> >  	if (unlikely(!p->sched_class->uclamp_enabled))
> >  		return;
> 
> Since we already have these per sched_class gates, I'm wondering if it
> could make sense to just re-purpose them.
> 
> Problem with the static key is that if just one RT task opts in, CFS
> will still pay the overheads, and vice versa too.
> 
> So, an alternative approach could be to opt in sched classes on-demand.

I'll defer to Peter here.

Re-purposing sched_class->uclamp_enabled means we can't use a static key,
and we'll potentially still incur a cache penalty there even if uclamp is
not used. Based on the RT sysctl discussion, this won't be okay if we can
do better.

> 
> The above if(unlikely) is not exactly the same as a static key being true,
> but I assume we agree the overheads we are tackling are nothing compared
> to that check, aren't they?

I'm not sure we can agree on that. The overhead seems to be system specific.
On Juno-r2 it seemed to be the I$, but on 2-socket systems it seems to be
the D$. So I can't say for sure that no one will care. The current approach
gives stronger guarantees.

Thanks

--
Qais Yousef


* Re: [PATCH v2 0/2] sched: Optionally skip uclamp logic in fast path
  2020-06-24  9:00   ` Vincent Guittot
@ 2020-06-24 12:26     ` Qais Yousef
  0 siblings, 0 replies; 11+ messages in thread
From: Qais Yousef @ 2020-06-24 12:26 UTC (permalink / raw)
  To: Vincent Guittot
  Cc: Dietmar Eggemann, Ingo Molnar, Peter Zijlstra, Juri Lelli,
	Steven Rostedt, Ben Segall, Mel Gorman, Patrick Bellasi,
	Chris Redpath, Lukasz Luba, linux-kernel

On 06/24/20 11:00, Vincent Guittot wrote:
> On Tue, 23 Jun 2020 at 19:40, Dietmar Eggemann <dietmar.eggemann@arm.com> wrote:

[...]

> > (4)
> >
> > CPU0:  Function             Hit    Time            Avg             s^2
> >        --------             ---    ----            ---             ---
> >        deactivate_task  1998246    2428121 us      *1.215* us      0.062 us
> >        activate_task    1998252    2132141 us      *1.067* us      0.020 us
> >
> > CPU1:  Function             Hit    Time            Avg             s^2
> >        --------             ---    ----            ---             ---
> >        deactivate_task  1996154    2414194 us      1.209 us        0.060 us
> >        activate_task    1996148    2140667 us      1.072 us        0.021 us
> 
> I have rerun the tests that I ran previously on my octo core arm64 (hikey):
> 20 iterations of perf bench sched pipe -T -l 50000
> tip stands for tip/sched/core
> uclamp enabled means both uclamp task and uclamp cgroup
> the stdev is around 0.25% for all tests
> 
>                            root           level 1       level 2       level 3
> 
> tip uclamp disabled        50653          47188         44568         41925
> tip uclamp enabled         48800(-3.66%)  45600(-3.37%) 42822(-3.92%) 40257(-3.98%)
> /w patch uclamp disabled   50615(-0.08%)  47198(+0.02%) 44609(+0.09%) 41735(-0.45%)
> /w patch uclamp enabled    49661(-1.96%)  46611(-1.22%) 43803(-1.72%) 41243(-1.63%)
> 
> Results are better with your patch

Thanks both for testing!

Cheers

--
Qais Yousef


* Re: [PATCH v2 2/2] sched/uclamp: Protect uclamp fast path code with static key
  2020-06-24  7:34   ` Patrick Bellasi
  2020-06-24 11:07     ` Qais Yousef
@ 2020-06-24 12:28     ` Peter Zijlstra
  1 sibling, 0 replies; 11+ messages in thread
From: Peter Zijlstra @ 2020-06-24 12:28 UTC (permalink / raw)
  To: Patrick Bellasi
  Cc: Qais Yousef, Ingo Molnar, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Chris Redpath, Lukasz Luba, linux-kernel

On Wed, Jun 24, 2020 at 09:34:02AM +0200, Patrick Bellasi wrote:
> > +static DEFINE_STATIC_KEY_TRUE(sched_uclamp_unused);
> 
> I would personally prefer a non-negated semantic.

That's what I said earlier as well.


> > +	 */
> > +	if (static_branch_likely(&sched_uclamp_unused))
> > +		return;
> 
> Moreover, something like:
> 
>        if (static_key_false(&sched_uclamp_enabled))
>                 return;
> 
> is not just good enough?

static_key_true/false() are the deprecated API.

