All of lore.kernel.org
* [PATCH] sched/fair: Fix shift-out-of-bounds in load_balance()
@ 2021-02-25 17:56 Valentin Schneider
  2021-02-25 22:21 ` Peter Zijlstra
  0 siblings, 1 reply; 2+ messages in thread
From: Valentin Schneider @ 2021-02-25 17:56 UTC (permalink / raw)
  To: linux-kernel
  Cc: syzbot+d7581744d5fd27c9fbe1, Andrew Morton, Borislav Petkov,
	H. Peter Anvin, luto, Ingo Molnar, Peter Zijlstra,
	Vincent Guittot, syzkaller-bugs, Thomas Gleixner, x86

Syzbot reported a handful of occurrences where an sd->nr_balance_failed can
grow to much higher values than one would expect.

A successful load_balance() resets it to 0; a failed one increments
it. Once it gets to sd->cache_nice_tries + 3, this *should* trigger an
active balance, which will either set it to sd->cache_nice_tries+1 or reset
it to 0. However, in case the to-be-active-balanced task is not allowed to
run on env->dst_cpu, then the increment is done without any further
modification.

This could then be repeated ad nauseam, and would explain the absurdly high
values reported by syzbot (86, 149). Vincent Guittot noted there is value in
letting sd->nr_balance_failed grow, so the shift itself should be
fixed. That means preventing:

  """
  If the value of the right operand is negative or is greater than or equal
  to the width of the promoted left operand, the behavior is undefined.
  """

Thus we need to cap the shift exponent to
  BITS_PER_TYPE(typeof(lefthand)) - 1.

I had a look around for other similar cases via coccinelle:

  @expr@
  position pos;
  expression E1;
  expression E2;
  @@
  (
  E1 >> E2@pos
  |
  E1 << E2@pos
  )

  @cst depends on expr@
  position pos;
  expression expr.E1;
  constant cst;
  @@
  (
  E1 >> cst@pos
  |
  E1 << cst@pos
  )

  @script:python depends on !cst@
  pos << expr.pos;
  exp << expr.E2;
  @@
  # Dirty hack: treat all-caps identifiers as constant expressions and skip them
  if exp.upper() != exp:
     coccilib.report.print_report(pos[0], "Possible UB shift here")

The only other match in kernel/sched is rq_clock_thermal() which employs
sched_thermal_decay_shift, and that exponent is already capped to 10, so
that one is fine.
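
For anyone wanting to reproduce the sweep, a semantic patch like the one above
is driven with spatch; ub_shift.cocci is a made-up file name for the rules:

```shell
# Hypothetical invocation: save the rules above as ub_shift.cocci, then
# sweep the scheduler sources. Guarded so it degrades gracefully when
# coccinelle is not installed or the file is absent.
if command -v spatch >/dev/null 2>&1 && [ -f ub_shift.cocci ]; then
	spatch --sp-file ub_shift.cocci --dir kernel/sched
else
	echo "spatch (coccinelle) or ub_shift.cocci not available; skipping"
fi
```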

Link: http://lore.kernel.org/r/000000000000ffac1205b9a2112f@google.com
Fixes: 5a7f55590467 ("sched/fair: Relax constraint on task's load during load balance")
Reported-by: syzbot+d7581744d5fd27c9fbe1@syzkaller.appspotmail.com
Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
---
 kernel/sched/fair.c  | 3 +--
 kernel/sched/sched.h | 7 +++++++
 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 8a8bd7b13634..7ae7c2f13a8b 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -7708,8 +7708,7 @@ static int detach_tasks(struct lb_env *env)
 			 * scheduler fails to find a good waiting task to
 			 * migrate.
 			 */
-
-			if ((load >> env->sd->nr_balance_failed) > env->imbalance)
+			if (shr_bound(load, env->sd->nr_balance_failed) > env->imbalance)
 				goto next;
 
 			env->imbalance -= load;
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 10a1522b1e30..e4e4f47cee6a 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -204,6 +204,13 @@ static inline void update_avg(u64 *avg, u64 sample)
 	*avg += diff / 8;
 }
 
+/*
+ * Shifting a value by an exponent greater *or equal* to the size of said value
+ * is UB; cap at size-1.
+ */
+#define shr_bound(val, shift)							\
+	(val >> min_t(typeof(shift), shift, BITS_PER_TYPE(typeof(val)) - 1))
+
 /*
  * !! For sched_setattr_nocheck() (kernel) only !!
  *
-- 
2.27.0



* Re: [PATCH] sched/fair: Fix shift-out-of-bounds in load_balance()
  2021-02-25 17:56 [PATCH] sched/fair: Fix shift-out-of-bounds in load_balance() Valentin Schneider
@ 2021-02-25 22:21 ` Peter Zijlstra
  0 siblings, 0 replies; 2+ messages in thread
From: Peter Zijlstra @ 2021-02-25 22:21 UTC (permalink / raw)
  To: Valentin Schneider
  Cc: linux-kernel, syzbot+d7581744d5fd27c9fbe1, Andrew Morton,
	Borislav Petkov, H. Peter Anvin, luto, Ingo Molnar,
	Vincent Guittot, syzkaller-bugs, Thomas Gleixner, x86

On Thu, Feb 25, 2021 at 05:56:56PM +0000, Valentin Schneider wrote:
> Syzbot reported a handful of occurrences where an sd->nr_balance_failed can
> grow to much higher values than one would expect.
> 
> A successful load_balance() resets it to 0; a failed one increments
> it. Once it gets to sd->cache_nice_tries + 3, this *should* trigger an
> active balance, which will either set it to sd->cache_nice_tries+1 or reset
> it to 0. However, in case the to-be-active-balanced task is not allowed to
> run on env->dst_cpu, then the increment is done without any further
> modification.
> 
> This could then be repeated ad nauseam, and would explain the absurdly high
> values reported by syzbot (86, 149). Vincent Guittot noted there is value in
> letting sd->nr_balance_failed grow, so the shift itself should be
> fixed. That means preventing:
> 
>   """
>   If the value of the right operand is negative or is greater than or equal
>   to the width of the promoted left operand, the behavior is undefined.
>   """
> 
> Thus we need to cap the shift exponent to
>   BITS_PER_TYPE(typeof(lefthand)) - 1.
> 

Thanks!

