linux-kernel.vger.kernel.org archive mirror
* [PATCH 0/2] Fix race in CFS bandwidth
@ 2020-04-10 22:52 Josh Don
  2020-04-10 22:52 ` [PATCH 1/2] sched: eliminate bandwidth race between throttling and distribution Josh Don
                   ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: Josh Don @ 2020-04-10 22:52 UTC (permalink / raw)
  To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot
  Cc: Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Paul Turner, Huaixin Chang, Phil Auld, linux-kernel, Josh Don

This series contains a fix for CFS bandwidth and a related cleanup
patch.

Josh Don (2):
  sched: eliminate bandwidth race between throttling and distribution
  sched: remove distribute_running from CFS bandwidth

 kernel/sched/fair.c  | 91 +++++++++++++++++++++++---------------------
 kernel/sched/sched.h |  1 -
 2 files changed, 47 insertions(+), 45 deletions(-)

-- 
2.26.0.110.g2183baf09c-goog



* [PATCH 1/2] sched: eliminate bandwidth race between throttling and distribution
  2020-04-10 22:52 [PATCH 0/2] Fix race in CFS bandwidth Josh Don
@ 2020-04-10 22:52 ` Josh Don
       [not found]   ` <CABk29NtxG8t6wHM6MDVFun7jxqRpupWr0d5dK57N2ecFbdSumw@mail.gmail.com>
                     ` (2 more replies)
  2020-04-10 22:52 ` [PATCH 2/2] sched: remove distribute_running from CFS bandwidth Josh Don
  2020-04-12  2:03 ` [PATCH 0/2] Fix race in " Josh Don
  2 siblings, 3 replies; 14+ messages in thread
From: Josh Don @ 2020-04-10 22:52 UTC (permalink / raw)
  To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot
  Cc: Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Paul Turner, Huaixin Chang, Phil Auld, linux-kernel, Josh Don

From: Paul Turner <pjt@google.com>

There is a race window in which an entity begins throttling before quota
is added to the pool, but does not finish throttling until after we have
finished with distribute_cfs_runtime(). This entity is not observed by
distribute_cfs_runtime() because it was not on the throttled list at the
time that distribution was running. This race manifests as rare
period-length stalls for such entities.

Rather than heavy-weight the synchronization with the progress of
distribution, we can fix this by aborting throttling if bandwidth has
become available. Otherwise, we immediately add the entity to the
throttled list so that it can be observed by a subsequent distribution.

Additionally, we can remove the case of adding the throttled entity to
the head of the throttled list, and simply always add to the tail.
Thanks to 26a8b12747c97, distribute_cfs_runtime() no longer holds onto
its own pool of runtime. This means that if we do hit the !assign and
distribute_running case, we know that distribution is about to end.
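
To make the new ordering concrete, below is a minimal user-space model of the
reworked throttle path. This is an illustrative sketch only, not kernel code:
the structures, the pthread mutex standing in for cfs_b->lock, the integer
"runtime" units and the helper names are all simplified stand-ins for their
kernel counterparts.

/*
 * Illustrative user-space model of the reworked throttle path (not kernel
 * code; see the caveats above).
 */
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

struct cfs_b_model {
	pthread_mutex_t lock;      /* stand-in for cfs_b->lock */
	long runtime;              /* global quota pool, arbitrary units */
	int nr_throttled;          /* stand-in for the throttled_cfs_rq list */
};

struct cfs_rq_model {
	long runtime_remaining;    /* <= 0 when out of quota */
	bool throttled;
};

/* Mirrors __assign_cfs_rq_runtime(); the caller must hold cfs_b->lock. */
static bool assign_runtime(struct cfs_b_model *cfs_b,
			   struct cfs_rq_model *cfs_rq, long target)
{
	long amount = 0, min_amount = target - cfs_rq->runtime_remaining;

	if (cfs_b->runtime > 0) {
		amount = min_amount < cfs_b->runtime ? min_amount : cfs_b->runtime;
		cfs_b->runtime -= amount;
	}
	cfs_rq->runtime_remaining += amount;

	return cfs_rq->runtime_remaining > 0;
}

/* Mirrors the reworked throttle_cfs_rq(): decide and publish under the lock. */
static bool throttle(struct cfs_b_model *cfs_b, struct cfs_rq_model *cfs_rq)
{
	bool dequeue;

	pthread_mutex_lock(&cfs_b->lock);
	if (assign_runtime(cfs_b, cfs_rq, 1)) {
		/* Bandwidth raced in while we were deciding: abort throttling. */
		dequeue = false;
	} else {
		/*
		 * Publish ourselves on the throttled "list" before doing the
		 * dequeue work, so any later distribution pass is guaranteed
		 * to see us and we cannot stall for a full period.
		 */
		cfs_b->nr_throttled++;
		dequeue = true;
	}
	pthread_mutex_unlock(&cfs_b->lock);

	if (!dequeue)
		return false;

	cfs_rq->throttled = true;  /* the actual dequeue work is elided */
	return true;
}

int main(void)
{
	struct cfs_b_model cfs_b = { PTHREAD_MUTEX_INITIALIZER, 5, 0 };
	struct cfs_rq_model cfs_rq = { .runtime_remaining = -3 };

	/* Pool has quota: throttle() aborts and reports false (prints 0). */
	printf("throttled: %d\n", throttle(&cfs_b, &cfs_rq));

	/* Pool exhausted: the cfs_rq throttles and is queued for distribution. */
	cfs_b.runtime = 0;
	cfs_rq.runtime_remaining = -3;
	printf("throttled: %d\n", throttle(&cfs_b, &cfs_rq));

	return 0;
}

The property being modeled is that checking for newly available bandwidth and
publishing the entity on the throttled list happen in a single cfs_b->lock
critical section, so a refill and distribution can no longer slip in between
the two steps.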

Signed-off-by: Paul Turner <pjt@google.com>
Co-developed-by: Ben Segall <bsegall@google.com>
Signed-off-by: Ben Segall <bsegall@google.com>
Signed-off-by: Josh Don <joshdon@google.com>
---
 kernel/sched/fair.c | 78 ++++++++++++++++++++++++++-------------------
 1 file changed, 46 insertions(+), 32 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 02f323b85b6d..3fb986c52825 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -4587,17 +4587,15 @@ static inline struct cfs_bandwidth *tg_cfs_bandwidth(struct task_group *tg)
 	return &tg->cfs_bandwidth;
 }
 
-/* returns 0 on failure to allocate runtime */
-static int assign_cfs_rq_runtime(struct cfs_rq *cfs_rq)
+/* returns 0 on failure to allocate runtime, called with cfs_b->lock held */
+static int __assign_cfs_rq_runtime(struct cfs_bandwidth *cfs_b,
+				   struct cfs_rq *cfs_rq, u64 target_runtime)
 {
-	struct task_group *tg = cfs_rq->tg;
-	struct cfs_bandwidth *cfs_b = tg_cfs_bandwidth(tg);
 	u64 amount = 0, min_amount;
 
 	/* note: this is a positive sum as runtime_remaining <= 0 */
-	min_amount = sched_cfs_bandwidth_slice() - cfs_rq->runtime_remaining;
+	min_amount = target_runtime - cfs_rq->runtime_remaining;
 
-	raw_spin_lock(&cfs_b->lock);
 	if (cfs_b->quota == RUNTIME_INF)
 		amount = min_amount;
 	else {
@@ -4609,13 +4607,26 @@ static int assign_cfs_rq_runtime(struct cfs_rq *cfs_rq)
 			cfs_b->idle = 0;
 		}
 	}
-	raw_spin_unlock(&cfs_b->lock);
 
 	cfs_rq->runtime_remaining += amount;
 
 	return cfs_rq->runtime_remaining > 0;
 }
 
+/* returns 0 on failure to allocate runtime */
+static int assign_cfs_rq_runtime(struct cfs_rq *cfs_rq)
+{
+	int ret;
+	struct cfs_bandwidth *cfs_b = tg_cfs_bandwidth(cfs_rq->tg);
+
+	raw_spin_lock(&cfs_b->lock);
+	ret = __assign_cfs_rq_runtime(cfs_b, cfs_rq,
+				      sched_cfs_bandwidth_slice());
+	raw_spin_unlock(&cfs_b->lock);
+
+	return ret;
+}
+
 static void __account_cfs_rq_runtime(struct cfs_rq *cfs_rq, u64 delta_exec)
 {
 	/* dock delta_exec before expiring quota (as it could span periods) */
@@ -4704,13 +4715,33 @@ static int tg_throttle_down(struct task_group *tg, void *data)
 	return 0;
 }
 
-static void throttle_cfs_rq(struct cfs_rq *cfs_rq)
+static bool throttle_cfs_rq(struct cfs_rq *cfs_rq)
 {
 	struct rq *rq = rq_of(cfs_rq);
 	struct cfs_bandwidth *cfs_b = tg_cfs_bandwidth(cfs_rq->tg);
 	struct sched_entity *se;
 	long task_delta, idle_task_delta, dequeue = 1;
-	bool empty;
+
+	raw_spin_lock(&cfs_b->lock);
+	/* This will start the period timer if necessary */
+	if (__assign_cfs_rq_runtime(cfs_b, cfs_rq, 1)) {
+		/*
+		 * We have raced with bandwidth becoming available, and if we
+		 * actually throttled the timer might not unthrottle us for an
+		 * entire period. We additionally needed to make sure that any
+		 * subsequent check_cfs_rq_runtime calls agree not to throttle
+		 * us, as we may commit to do cfs put_prev+pick_next, so we ask
+		 * for 1ns of runtime rather than just check cfs_b.
+		 */
+		dequeue = 0;
+	} else {
+		list_add_tail_rcu(&cfs_rq->throttled_list,
+				  &cfs_b->throttled_cfs_rq);
+	}
+	raw_spin_unlock(&cfs_b->lock);
+
+	if (!dequeue)
+		return false;  /* Throttle no longer required. */
 
 	se = cfs_rq->tg->se[cpu_of(rq_of(cfs_rq))];
 
@@ -4744,29 +4775,13 @@ static void throttle_cfs_rq(struct cfs_rq *cfs_rq)
 	if (!se)
 		sub_nr_running(rq, task_delta);
 
-	cfs_rq->throttled = 1;
-	cfs_rq->throttled_clock = rq_clock(rq);
-	raw_spin_lock(&cfs_b->lock);
-	empty = list_empty(&cfs_b->throttled_cfs_rq);
-
 	/*
-	 * Add to the _head_ of the list, so that an already-started
-	 * distribute_cfs_runtime will not see us. If disribute_cfs_runtime is
-	 * not running add to the tail so that later runqueues don't get starved.
+	 * Note: distribution will already see us throttled via the
+	 * throttled-list.  rq->lock protects completion.
 	 */
-	if (cfs_b->distribute_running)
-		list_add_rcu(&cfs_rq->throttled_list, &cfs_b->throttled_cfs_rq);
-	else
-		list_add_tail_rcu(&cfs_rq->throttled_list, &cfs_b->throttled_cfs_rq);
-
-	/*
-	 * If we're the first throttled task, make sure the bandwidth
-	 * timer is running.
-	 */
-	if (empty)
-		start_cfs_bandwidth(cfs_b);
-
-	raw_spin_unlock(&cfs_b->lock);
+	cfs_rq->throttled = 1;
+	cfs_rq->throttled_clock = rq_clock(rq);
+	return true;
 }
 
 void unthrottle_cfs_rq(struct cfs_rq *cfs_rq)
@@ -5121,8 +5136,7 @@ static bool check_cfs_rq_runtime(struct cfs_rq *cfs_rq)
 	if (cfs_rq_throttled(cfs_rq))
 		return true;
 
-	throttle_cfs_rq(cfs_rq);
-	return true;
+	return throttle_cfs_rq(cfs_rq);
 }
 
 static enum hrtimer_restart sched_cfs_slack_timer(struct hrtimer *timer)
-- 
2.26.0.110.g2183baf09c-goog



* [PATCH 2/2] sched: remove distribute_running from CFS bandwidth
  2020-04-10 22:52 [PATCH 0/2] Fix race in CFS bandwidth Josh Don
  2020-04-10 22:52 ` [PATCH 1/2] sched: eliminate bandwidth race between throttling and distribution Josh Don
@ 2020-04-10 22:52 ` Josh Don
  2020-04-12  2:01   ` Josh Don
                     ` (2 more replies)
  2020-04-12  2:03 ` [PATCH 0/2] Fix race in " Josh Don
  2 siblings, 3 replies; 14+ messages in thread
From: Josh Don @ 2020-04-10 22:52 UTC (permalink / raw)
  To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot
  Cc: Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Paul Turner, Huaixin Chang, Phil Auld, linux-kernel, Josh Don

This is mostly a revert of baa9be4ff.

The primary use of distribute_running was to determine whether to add
throttled entities to the head or the tail of the throttled list. Now
that we always add to the tail, we can remove this field.

The other use of distribute_running is in the slack_timer, so that we
don't start a distribution while one is already running. However, even
in the event that this race occurs, it is fine to have two distributions
running (especially now that distribute grabs the cfs_b->lock to
determine remaining quota before assigning).
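
As a rough illustration of the last point, the sketch below shows why two
concurrent distribution passes cannot over-commit quota: each hand-out is
clamped to cfs_b->runtime while cfs_b->lock is held. This is a simplified
user-space model, not the kernel's distribute_cfs_runtime(); the per-rq
locking, the RCU walk of the throttled list and the actual unthrottle are
all elided.

#include <pthread.h>

struct cfs_b_model {
	pthread_mutex_t lock;      /* stand-in for cfs_b->lock */
	long runtime;              /* remaining quota in the global pool */
};

struct cfs_rq_model {
	long runtime_remaining;    /* <= 0 while throttled */
};

static void distribute_one(struct cfs_b_model *cfs_b, struct cfs_rq_model *cfs_rq)
{
	/* Ask for just enough to bring the cfs_rq back above zero. */
	long runtime = -cfs_rq->runtime_remaining + 1;

	pthread_mutex_lock(&cfs_b->lock);
	if (runtime > cfs_b->runtime)
		runtime = cfs_b->runtime;  /* never hand out more than is left */
	cfs_b->runtime -= runtime;
	pthread_mutex_unlock(&cfs_b->lock);

	cfs_rq->runtime_remaining += runtime;
	/* unthrottling when runtime_remaining > 0 is elided */
}

int main(void)
{
	struct cfs_b_model cfs_b = { PTHREAD_MUTEX_INITIALIZER, 3 };
	struct cfs_rq_model a = { .runtime_remaining = -5 };
	struct cfs_rq_model b = { .runtime_remaining = -5 };

	distribute_one(&cfs_b, &a);  /* receives the 3 units that are left */
	distribute_one(&cfs_b, &b);  /* pool is empty: receives nothing */

	return 0;
}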

Signed-off-by: Josh Don <joshdon@google.com>
---
 kernel/sched/fair.c  | 13 +------------
 kernel/sched/sched.h |  1 -
 2 files changed, 1 insertion(+), 13 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 3fb986c52825..24b5e12efb40 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -4930,14 +4930,12 @@ static int do_sched_cfs_period_timer(struct cfs_bandwidth *cfs_b, int overrun, u
 	/*
 	 * This check is repeated as we release cfs_b->lock while we unthrottle.
 	 */
-	while (throttled && cfs_b->runtime > 0 && !cfs_b->distribute_running) {
-		cfs_b->distribute_running = 1;
+	while (throttled && cfs_b->runtime > 0) {
 		raw_spin_unlock_irqrestore(&cfs_b->lock, flags);
 		/* we can't nest cfs_b->lock while distributing bandwidth */
 		distribute_cfs_runtime(cfs_b);
 		raw_spin_lock_irqsave(&cfs_b->lock, flags);
 
-		cfs_b->distribute_running = 0;
 		throttled = !list_empty(&cfs_b->throttled_cfs_rq);
 	}
 
@@ -5051,10 +5049,6 @@ static void do_sched_cfs_slack_timer(struct cfs_bandwidth *cfs_b)
 	/* confirm we're still not at a refresh boundary */
 	raw_spin_lock_irqsave(&cfs_b->lock, flags);
 	cfs_b->slack_started = false;
-	if (cfs_b->distribute_running) {
-		raw_spin_unlock_irqrestore(&cfs_b->lock, flags);
-		return;
-	}
 
 	if (runtime_refresh_within(cfs_b, min_bandwidth_expiration)) {
 		raw_spin_unlock_irqrestore(&cfs_b->lock, flags);
@@ -5064,9 +5058,6 @@ static void do_sched_cfs_slack_timer(struct cfs_bandwidth *cfs_b)
 	if (cfs_b->quota != RUNTIME_INF && cfs_b->runtime > slice)
 		runtime = cfs_b->runtime;
 
-	if (runtime)
-		cfs_b->distribute_running = 1;
-
 	raw_spin_unlock_irqrestore(&cfs_b->lock, flags);
 
 	if (!runtime)
@@ -5075,7 +5066,6 @@ static void do_sched_cfs_slack_timer(struct cfs_bandwidth *cfs_b)
 	distribute_cfs_runtime(cfs_b);
 
 	raw_spin_lock_irqsave(&cfs_b->lock, flags);
-	cfs_b->distribute_running = 0;
 	raw_spin_unlock_irqrestore(&cfs_b->lock, flags);
 }
 
@@ -5217,7 +5207,6 @@ void init_cfs_bandwidth(struct cfs_bandwidth *cfs_b)
 	cfs_b->period_timer.function = sched_cfs_period_timer;
 	hrtimer_init(&cfs_b->slack_timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
 	cfs_b->slack_timer.function = sched_cfs_slack_timer;
-	cfs_b->distribute_running = 0;
 	cfs_b->slack_started = false;
 }
 
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index db3a57675ccf..7198683b2869 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -349,7 +349,6 @@ struct cfs_bandwidth {
 
 	u8			idle;
 	u8			period_active;
-	u8			distribute_running;
 	u8			slack_started;
 	struct hrtimer		period_timer;
 	struct hrtimer		slack_timer;
-- 
2.26.0.110.g2183baf09c-goog



* Re: [PATCH 2/2] sched: remove distribute_running from CFS bandwidth
  2020-04-10 22:52 ` [PATCH 2/2] sched: remove distribute_running from CFS bandwidth Josh Don
@ 2020-04-12  2:01   ` Josh Don
  2020-04-13 14:49     ` Phil Auld
  2020-04-14 10:54   ` Peter Zijlstra
  2020-05-01 18:22   ` [tip: sched/core] sched/fair: Remove " tip-bot2 for Josh Don
  2 siblings, 1 reply; 14+ messages in thread
From: Josh Don @ 2020-04-12  2:01 UTC (permalink / raw)
  To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot
  Cc: Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Paul Turner, Huaixin Chang, linux-kernel, pauld

On Fri, Apr 10, 2020 at 3:52 PM Josh Don <joshdon@google.com> wrote:
>
> This is mostly a revert of baa9be4ff.
>
> The primary use of distribute_running was to determine whether to add
> throttled entities to the head or the tail of the throttled list. Now
> that we always add to the tail, we can remove this field.
>
> The other use of distribute_running is in the slack_timer, so that we
> don't start a distribution while one is already running. However, even
> in the event that this race occurs, it is fine to have two distributions
> running (especially now that distribute grabs the cfs_b->lock to
> determine remaining quota before assigning).
>
> Signed-off-by: Josh Don <joshdon@google.com>
> ---
>  kernel/sched/fair.c  | 13 +------------
>  kernel/sched/sched.h |  1 -
>  2 files changed, 1 insertion(+), 13 deletions(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 3fb986c52825..24b5e12efb40 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -4930,14 +4930,12 @@ static int do_sched_cfs_period_timer(struct cfs_bandwidth *cfs_b, int overrun, u
>         /*
>          * This check is repeated as we release cfs_b->lock while we unthrottle.
>          */
> -       while (throttled && cfs_b->runtime > 0 && !cfs_b->distribute_running) {
> -               cfs_b->distribute_running = 1;
> +       while (throttled && cfs_b->runtime > 0) {
>                 raw_spin_unlock_irqrestore(&cfs_b->lock, flags);
>                 /* we can't nest cfs_b->lock while distributing bandwidth */
>                 distribute_cfs_runtime(cfs_b);
>                 raw_spin_lock_irqsave(&cfs_b->lock, flags);
>
> -               cfs_b->distribute_running = 0;
>                 throttled = !list_empty(&cfs_b->throttled_cfs_rq);
>         }
>
> @@ -5051,10 +5049,6 @@ static void do_sched_cfs_slack_timer(struct cfs_bandwidth *cfs_b)
>         /* confirm we're still not at a refresh boundary */
>         raw_spin_lock_irqsave(&cfs_b->lock, flags);
>         cfs_b->slack_started = false;
> -       if (cfs_b->distribute_running) {
> -               raw_spin_unlock_irqrestore(&cfs_b->lock, flags);
> -               return;
> -       }
>
>         if (runtime_refresh_within(cfs_b, min_bandwidth_expiration)) {
>                 raw_spin_unlock_irqrestore(&cfs_b->lock, flags);
> @@ -5064,9 +5058,6 @@ static void do_sched_cfs_slack_timer(struct cfs_bandwidth *cfs_b)
>         if (cfs_b->quota != RUNTIME_INF && cfs_b->runtime > slice)
>                 runtime = cfs_b->runtime;
>
> -       if (runtime)
> -               cfs_b->distribute_running = 1;
> -
>         raw_spin_unlock_irqrestore(&cfs_b->lock, flags);
>
>         if (!runtime)
> @@ -5075,7 +5066,6 @@ static void do_sched_cfs_slack_timer(struct cfs_bandwidth *cfs_b)
>         distribute_cfs_runtime(cfs_b);
>
>         raw_spin_lock_irqsave(&cfs_b->lock, flags);
> -       cfs_b->distribute_running = 0;
>         raw_spin_unlock_irqrestore(&cfs_b->lock, flags);
>  }
>
> @@ -5217,7 +5207,6 @@ void init_cfs_bandwidth(struct cfs_bandwidth *cfs_b)
>         cfs_b->period_timer.function = sched_cfs_period_timer;
>         hrtimer_init(&cfs_b->slack_timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
>         cfs_b->slack_timer.function = sched_cfs_slack_timer;
> -       cfs_b->distribute_running = 0;
>         cfs_b->slack_started = false;
>  }
>
> diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
> index db3a57675ccf..7198683b2869 100644
> --- a/kernel/sched/sched.h
> +++ b/kernel/sched/sched.h
> @@ -349,7 +349,6 @@ struct cfs_bandwidth {
>
>         u8                      idle;
>         u8                      period_active;
> -       u8                      distribute_running;
>         u8                      slack_started;
>         struct hrtimer          period_timer;
>         struct hrtimer          slack_timer;
> --
> 2.26.0.110.g2183baf09c-goog
>

+pauld@redhat.com


* Re: [PATCH 0/2] Fix race in CFS bandwidth
  2020-04-10 22:52 [PATCH 0/2] Fix race in CFS bandwidth Josh Don
  2020-04-10 22:52 ` [PATCH 1/2] sched: eliminate bandwidth race between throttling and distribution Josh Don
  2020-04-10 22:52 ` [PATCH 2/2] sched: remove distribute_running from CFS bandwidth Josh Don
@ 2020-04-12  2:03 ` Josh Don
  2 siblings, 0 replies; 14+ messages in thread
From: Josh Don @ 2020-04-12  2:03 UTC (permalink / raw)
  To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot
  Cc: Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Paul Turner, Huaixin Chang, linux-kernel, pauld

On Fri, Apr 10, 2020 at 3:52 PM Josh Don <joshdon@google.com> wrote:
>
> This series contains a fix for CFS bandwidth and a related cleanup
> patch.
>
> Josh Don (2):
>   sched: eliminate bandwidth race between throttling and distribution
>   sched: remove distribute_running from CFS bandwidth
>
>  kernel/sched/fair.c  | 91 +++++++++++++++++++++++---------------------
>  kernel/sched/sched.h |  1 -
>  2 files changed, 47 insertions(+), 45 deletions(-)
>
> --
> 2.26.0.110.g2183baf09c-goog
>

+pauld@redhat.com


* Re: [PATCH 1/2] sched: eliminate bandwidth race between throttling and distribution
       [not found]   ` <CABk29NtxG8t6wHM6MDVFun7jxqRpupWr0d5dK57N2ecFbdSumw@mail.gmail.com>
@ 2020-04-13 14:44     ` Phil Auld
  0 siblings, 0 replies; 14+ messages in thread
From: Phil Auld @ 2020-04-13 14:44 UTC (permalink / raw)
  To: Josh Don
  Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Paul Turner, Huaixin Chang, linux-kernel

Hi Josh,

On Sat, Apr 11, 2020 at 07:00:17PM -0700 Josh Don wrote:
> +pauld@redhat.com
> 
> On Fri, Apr 10, 2020 at 3:52 PM Josh Don <joshdon@google.com> wrote:
> >
> > From: Paul Turner <pjt@google.com>
> >
> > There is a race window in which an entity begins throttling before quota
> > is added to the pool, but does not finish throttling until after we have
> > finished with distribute_cfs_runtime(). This entity is not observed by
> > distribute_cfs_runtime() because it was not on the throttled list at the
> > time that distribution was running. This race manifests as rare
> > period-length stalls for such entities.
> >
> > Rather than heavy-weight the synchronization with the progress of
> > distribution, we can fix this by aborting throttling if bandwidth has
> > become available. Otherwise, we immediately add the entity to the
> > throttled list so that it can be observed by a subsequent distribution.
> >
> > Additionally, we can remove the case of adding the throttled entity to
> > the head of the throttled list, and simply always add to the tail.
> > Thanks to 26a8b12747c97, distribute_cfs_runtime() no longer holds onto
> > its own pool of runtime. This means that if we do hit the !assign and
> > distribute_running case, we know that distribution is about to end.
> >
> > Signed-off-by: Paul Turner <pjt@google.com>
> > Co-developed-by: Ben Segall <bsegall@google.com>
> > Signed-off-by: Ben Segall <bsegall@google.com>
> > Signed-off-by: Josh Don <joshdon@google.com>

This looks good to me, thanks for the cc.

Reviewed-by: Phil Auld <pauld@redhat.com>

> > ---
> >  kernel/sched/fair.c | 78 ++++++++++++++++++++++++++-------------------
> >  1 file changed, 46 insertions(+), 32 deletions(-)
> >
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index 02f323b85b6d..3fb986c52825 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -4587,17 +4587,15 @@ static inline struct cfs_bandwidth *tg_cfs_bandwidth(struct task_group *tg)
> >  	return &tg->cfs_bandwidth;
> >  }
> >
> > -/* returns 0 on failure to allocate runtime */
> > -static int assign_cfs_rq_runtime(struct cfs_rq *cfs_rq)
> > +/* returns 0 on failure to allocate runtime, called with cfs_b->lock held */
> > +static int __assign_cfs_rq_runtime(struct cfs_bandwidth *cfs_b,
> > +				   struct cfs_rq *cfs_rq, u64 target_runtime)
> >  {
> > -	struct task_group *tg = cfs_rq->tg;
> > -	struct cfs_bandwidth *cfs_b = tg_cfs_bandwidth(tg);
> >  	u64 amount = 0, min_amount;
> >
> >  	/* note: this is a positive sum as runtime_remaining <= 0 */
> > -	min_amount = sched_cfs_bandwidth_slice() - cfs_rq->runtime_remaining;
> > +	min_amount = target_runtime - cfs_rq->runtime_remaining;
> >
> > -	raw_spin_lock(&cfs_b->lock);
> >  	if (cfs_b->quota == RUNTIME_INF)
> >  		amount = min_amount;
> >  	else {
> > @@ -4609,13 +4607,26 @@ static int assign_cfs_rq_runtime(struct cfs_rq *cfs_rq)
> >  			cfs_b->idle = 0;
> >  		}
> >  	}
> > -	raw_spin_unlock(&cfs_b->lock);
> >
> >  	cfs_rq->runtime_remaining += amount;
> >
> >  	return cfs_rq->runtime_remaining > 0;
> >  }
> >
> > +/* returns 0 on failure to allocate runtime */
> > +static int assign_cfs_rq_runtime(struct cfs_rq *cfs_rq)
> > +{
> > +	int ret;
> > +	struct cfs_bandwidth *cfs_b = tg_cfs_bandwidth(cfs_rq->tg);
> > +
> > +	raw_spin_lock(&cfs_b->lock);
> > +	ret = __assign_cfs_rq_runtime(cfs_b, cfs_rq,
> > +				      sched_cfs_bandwidth_slice());
> > +	raw_spin_unlock(&cfs_b->lock);
> > +
> > +	return ret;
> > +}
> > +
> >  static void __account_cfs_rq_runtime(struct cfs_rq *cfs_rq, u64 delta_exec)
> >  {
> >  	/* dock delta_exec before expiring quota (as it could span periods) */
> > @@ -4704,13 +4715,33 @@ static int tg_throttle_down(struct task_group *tg, void *data)
> >  	return 0;
> >  }
> >
> > -static void throttle_cfs_rq(struct cfs_rq *cfs_rq)
> > +static bool throttle_cfs_rq(struct cfs_rq *cfs_rq)
> >  {
> >  	struct rq *rq = rq_of(cfs_rq);
> >  	struct cfs_bandwidth *cfs_b = tg_cfs_bandwidth(cfs_rq->tg);
> >  	struct sched_entity *se;
> >  	long task_delta, idle_task_delta, dequeue = 1;
> > -	bool empty;
> > +
> > +	raw_spin_lock(&cfs_b->lock);
> > +	/* This will start the period timer if necessary */
> > +	if (__assign_cfs_rq_runtime(cfs_b, cfs_rq, 1)) {
> > +		/*
> > +		 * We have raced with bandwidth becoming available, and if we
> > +		 * actually throttled the timer might not unthrottle us for an
> > +		 * entire period. We additionally needed to make sure that any
> > +		 * subsequent check_cfs_rq_runtime calls agree not to throttle
> > +		 * us, as we may commit to do cfs put_prev+pick_next, so we ask
> > +		 * for 1ns of runtime rather than just check cfs_b.
> > +		 */
> > +		dequeue = 0;
> > +	} else {
> > +		list_add_tail_rcu(&cfs_rq->throttled_list,
> > +				  &cfs_b->throttled_cfs_rq);
> > +	}
> > +	raw_spin_unlock(&cfs_b->lock);
> > +
> > +	if (!dequeue)
> > +		return false;  /* Throttle no longer required. */
> >
> >  	se = cfs_rq->tg->se[cpu_of(rq_of(cfs_rq))];
> >
> > @@ -4744,29 +4775,13 @@ static void throttle_cfs_rq(struct cfs_rq *cfs_rq)
> >  	if (!se)
> >  		sub_nr_running(rq, task_delta);
> >
> > -	cfs_rq->throttled = 1;
> > -	cfs_rq->throttled_clock = rq_clock(rq);
> > -	raw_spin_lock(&cfs_b->lock);
> > -	empty = list_empty(&cfs_b->throttled_cfs_rq);
> > -
> >  	/*
> > -	 * Add to the _head_ of the list, so that an already-started
> > -	 * distribute_cfs_runtime will not see us. If disribute_cfs_runtime is
> > -	 * not running add to the tail so that later runqueues don't get starved.
> > +	 * Note: distribution will already see us throttled via the
> > +	 * throttled-list.  rq->lock protects completion.
> >  	 */
> > -	if (cfs_b->distribute_running)
> > -		list_add_rcu(&cfs_rq->throttled_list, &cfs_b->throttled_cfs_rq);
> > -	else
> > -		list_add_tail_rcu(&cfs_rq->throttled_list, &cfs_b->throttled_cfs_rq);
> > -
> > -	/*
> > -	 * If we're the first throttled task, make sure the bandwidth
> > -	 * timer is running.
> > -	 */
> > -	if (empty)
> > -		start_cfs_bandwidth(cfs_b);
> > -
> > -	raw_spin_unlock(&cfs_b->lock);
> > +	cfs_rq->throttled = 1;
> > +	cfs_rq->throttled_clock = rq_clock(rq);
> > +	return true;
> >  }
> >
> >  void unthrottle_cfs_rq(struct cfs_rq *cfs_rq)
> > @@ -5121,8 +5136,7 @@ static bool check_cfs_rq_runtime(struct cfs_rq *cfs_rq)
> >  	if (cfs_rq_throttled(cfs_rq))
> >  		return true;
> >
> > -	throttle_cfs_rq(cfs_rq);
> > -	return true;
> > +	return throttle_cfs_rq(cfs_rq);
> >  }
> >
> >  static enum hrtimer_restart sched_cfs_slack_timer(struct hrtimer *timer)
> > --
> > 2.26.0.110.g2183baf09c-goog

-- 



* Re: [PATCH 2/2] sched: remove distribute_running from CFS bandwidth
  2020-04-12  2:01   ` Josh Don
@ 2020-04-13 14:49     ` Phil Auld
  0 siblings, 0 replies; 14+ messages in thread
From: Phil Auld @ 2020-04-13 14:49 UTC (permalink / raw)
  To: Josh Don
  Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Paul Turner, Huaixin Chang, linux-kernel

Hi Josh,

On Sat, Apr 11, 2020 at 07:01:45PM -0700 Josh Don wrote:
> On Fri, Apr 10, 2020 at 3:52 PM Josh Don <joshdon@google.com> wrote:
> >
> > This is mostly a revert of baa9be4ff.
> >
> > The primary use of distribute_running was to determine whether to add
> > throttled entities to the head or the tail of the throttled list. Now
> > that we always add to the tail, we can remove this field.
> >
> > The other use of distribute_running is in the slack_timer, so that we
> > don't start a distribution while one is already running. However, even
> > in the event that this race occurs, it is fine to have two distributions
> > running (especially now that distribute grabs the cfs_b->lock to
> > determine remaining quota before assigning).
> >
> > Signed-off-by: Josh Don <joshdon@google.com>

Nice. This was more or less a hack. It's good to be able to remove it.
I re-ran my test cases that originally caused the need for the
distribute_running.  As expected, since there is no adding to the head
of the queue, the behavior is the same as before this series.

Reviewed-by: Phil Auld <pauld@redhat.com>
Tested-by: Phil Auld <pauld@redhat.com>


Cheers,
Phil

> > ---
> >  kernel/sched/fair.c  | 13 +------------
> >  kernel/sched/sched.h |  1 -
> >  2 files changed, 1 insertion(+), 13 deletions(-)
> >
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index 3fb986c52825..24b5e12efb40 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -4930,14 +4930,12 @@ static int do_sched_cfs_period_timer(struct cfs_bandwidth *cfs_b, int overrun, u
> >         /*
> >          * This check is repeated as we release cfs_b->lock while we unthrottle.
> >          */
> > -       while (throttled && cfs_b->runtime > 0 && !cfs_b->distribute_running) {
> > -               cfs_b->distribute_running = 1;
> > +       while (throttled && cfs_b->runtime > 0) {
> >                 raw_spin_unlock_irqrestore(&cfs_b->lock, flags);
> >                 /* we can't nest cfs_b->lock while distributing bandwidth */
> >                 distribute_cfs_runtime(cfs_b);
> >                 raw_spin_lock_irqsave(&cfs_b->lock, flags);
> >
> > -               cfs_b->distribute_running = 0;
> >                 throttled = !list_empty(&cfs_b->throttled_cfs_rq);
> >         }
> >
> > @@ -5051,10 +5049,6 @@ static void do_sched_cfs_slack_timer(struct cfs_bandwidth *cfs_b)
> >         /* confirm we're still not at a refresh boundary */
> >         raw_spin_lock_irqsave(&cfs_b->lock, flags);
> >         cfs_b->slack_started = false;
> > -       if (cfs_b->distribute_running) {
> > -               raw_spin_unlock_irqrestore(&cfs_b->lock, flags);
> > -               return;
> > -       }
> >
> >         if (runtime_refresh_within(cfs_b, min_bandwidth_expiration)) {
> >                 raw_spin_unlock_irqrestore(&cfs_b->lock, flags);
> > @@ -5064,9 +5058,6 @@ static void do_sched_cfs_slack_timer(struct cfs_bandwidth *cfs_b)
> >         if (cfs_b->quota != RUNTIME_INF && cfs_b->runtime > slice)
> >                 runtime = cfs_b->runtime;
> >
> > -       if (runtime)
> > -               cfs_b->distribute_running = 1;
> > -
> >         raw_spin_unlock_irqrestore(&cfs_b->lock, flags);
> >
> >         if (!runtime)
> > @@ -5075,7 +5066,6 @@ static void do_sched_cfs_slack_timer(struct cfs_bandwidth *cfs_b)
> >         distribute_cfs_runtime(cfs_b);
> >
> >         raw_spin_lock_irqsave(&cfs_b->lock, flags);
> > -       cfs_b->distribute_running = 0;
> >         raw_spin_unlock_irqrestore(&cfs_b->lock, flags);
> >  }
> >
> > @@ -5217,7 +5207,6 @@ void init_cfs_bandwidth(struct cfs_bandwidth *cfs_b)
> >         cfs_b->period_timer.function = sched_cfs_period_timer;
> >         hrtimer_init(&cfs_b->slack_timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
> >         cfs_b->slack_timer.function = sched_cfs_slack_timer;
> > -       cfs_b->distribute_running = 0;
> >         cfs_b->slack_started = false;
> >  }
> >
> > diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
> > index db3a57675ccf..7198683b2869 100644
> > --- a/kernel/sched/sched.h
> > +++ b/kernel/sched/sched.h
> > @@ -349,7 +349,6 @@ struct cfs_bandwidth {
> >
> >         u8                      idle;
> >         u8                      period_active;
> > -       u8                      distribute_running;
> >         u8                      slack_started;
> >         struct hrtimer          period_timer;
> >         struct hrtimer          slack_timer;
> > --
> > 2.26.0.110.g2183baf09c-goog
> >
> 
> +pauld@redhat.com
> 

-- 



* Re: [PATCH 1/2] sched: eliminate bandwidth race between throttling and distribution
  2020-04-10 22:52 ` [PATCH 1/2] sched: eliminate bandwidth race between throttling and distribution Josh Don
       [not found]   ` <CABk29NtxG8t6wHM6MDVFun7jxqRpupWr0d5dK57N2ecFbdSumw@mail.gmail.com>
@ 2020-04-14 10:52   ` Peter Zijlstra
  2020-05-01 18:22   ` [tip: sched/core] sched/fair: Eliminate " tip-bot2 for Paul Turner
  2 siblings, 0 replies; 14+ messages in thread
From: Peter Zijlstra @ 2020-04-14 10:52 UTC (permalink / raw)
  To: Josh Don
  Cc: Ingo Molnar, Juri Lelli, Vincent Guittot, Dietmar Eggemann,
	Steven Rostedt, Ben Segall, Mel Gorman, Paul Turner,
	Huaixin Chang, Phil Auld, linux-kernel

On Fri, Apr 10, 2020 at 03:52:07PM -0700, Josh Don wrote:

> -/* returns 0 on failure to allocate runtime */
> +/* returns 0 on failure to allocate runtime, called with cfs_b->lock held */

That's a gross mis-spelling of lockdep_assert_held(); and since I was
editing things anyway it now looks like so:

--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -4587,11 +4587,13 @@ static inline struct cfs_bandwidth *tg_c
 	return &tg->cfs_bandwidth;
 }
 
-/* returns 0 on failure to allocate runtime, called with cfs_b->lock held */
+/* returns 0 on failure to allocate runtime */
 static int __assign_cfs_rq_runtime(struct cfs_bandwidth *cfs_b,
 				   struct cfs_rq *cfs_rq, u64 target_runtime)
 {
-	u64 amount = 0, min_amount;
+	u64 min_amount, amount = 0;
+
+	lockdep_assert_held(cfs_rq->lock);
 
 	/* note: this is a positive sum as runtime_remaining <= 0 */
 	min_amount = target_runtime - cfs_rq->runtime_remaining;
@@ -4616,12 +4618,11 @@ static int __assign_cfs_rq_runtime(struc
 /* returns 0 on failure to allocate runtime */
 static int assign_cfs_rq_runtime(struct cfs_rq *cfs_rq)
 {
-	int ret;
 	struct cfs_bandwidth *cfs_b = tg_cfs_bandwidth(cfs_rq->tg);
+	int ret;
 
 	raw_spin_lock(&cfs_b->lock);
-	ret = __assign_cfs_rq_runtime(cfs_b, cfs_rq,
-				      sched_cfs_bandwidth_slice());
+	ret = __assign_cfs_rq_runtime(cfs_b, cfs_rq, sched_cfs_bandwidth_slice());
 	raw_spin_unlock(&cfs_b->lock);
 
 	return ret;
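
For reference, the general idiom here is to turn a "caller must hold this
lock" comment into an assertion that lockdep can check at runtime. A generic
sketch of the pattern follows; the 'struct foo' and its helpers are purely
hypothetical, not code from fair.c:

#include <linux/spinlock.h>
#include <linux/lockdep.h>
#include <linux/types.h>

struct foo {
	spinlock_t lock;
	u64 total;
};

/* Caller must hold f->lock; lockdep warns if it does not (when enabled). */
static int __foo_update(struct foo *f, u64 delta)
{
	lockdep_assert_held(&f->lock);

	f->total += delta;
	return f->total > 0;
}

static int foo_update(struct foo *f, u64 delta)
{
	int ret;

	spin_lock(&f->lock);
	ret = __foo_update(f, delta);
	spin_unlock(&f->lock);

	return ret;
}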



* Re: [PATCH 2/2] sched: remove distribute_running from CFS bandwidth
  2020-04-10 22:52 ` [PATCH 2/2] sched: remove distribute_running from CFS bandwidth Josh Don
  2020-04-12  2:01   ` Josh Don
@ 2020-04-14 10:54   ` Peter Zijlstra
  2020-05-01 18:22   ` [tip: sched/core] sched/fair: Remove " tip-bot2 for Josh Don
  2 siblings, 0 replies; 14+ messages in thread
From: Peter Zijlstra @ 2020-04-14 10:54 UTC (permalink / raw)
  To: Josh Don
  Cc: Ingo Molnar, Juri Lelli, Vincent Guittot, Dietmar Eggemann,
	Steven Rostedt, Ben Segall, Mel Gorman, Paul Turner,
	Huaixin Chang, Phil Auld, linux-kernel

On Fri, Apr 10, 2020 at 03:52:08PM -0700, Josh Don wrote:
> This is mostly a revert of baa9be4ff.

That's written like:

"This is mostly a revert of commit:

  baa9be4ffb55 ("sched/fair: Fix throttle_list starvation with low CFS quota")"

Fixed that for you.


* [tip: sched/core] sched/fair: Remove distribute_running from CFS bandwidth
  2020-04-10 22:52 ` [PATCH 2/2] sched: remove distribute_running from CFS bandwidth Josh Don
  2020-04-12  2:01   ` Josh Don
  2020-04-14 10:54   ` Peter Zijlstra
@ 2020-05-01 18:22   ` tip-bot2 for Josh Don
       [not found]     ` <BL0PR14MB3779C02BB87DC4426C4761639A840@BL0PR14MB3779.namprd14.prod.outlook.com>
  2 siblings, 1 reply; 14+ messages in thread
From: tip-bot2 for Josh Don @ 2020-05-01 18:22 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Josh Don, Peter Zijlstra (Intel), Phil Auld, x86, LKML

The following commit has been merged into the sched/core branch of tip:

Commit-ID:     ab93a4bc955b3980c699430bc0b633f0d8b607be
Gitweb:        https://git.kernel.org/tip/ab93a4bc955b3980c699430bc0b633f0d8b607be
Author:        Josh Don <joshdon@google.com>
AuthorDate:    Fri, 10 Apr 2020 15:52:08 -07:00
Committer:     Peter Zijlstra <peterz@infradead.org>
CommitterDate: Thu, 30 Apr 2020 20:14:38 +02:00

sched/fair: Remove distribute_running from CFS bandwidth

This is mostly a revert of commit:

  baa9be4ffb55 ("sched/fair: Fix throttle_list starvation with low CFS quota")

The primary use of distribute_running was to determine whether to add
throttled entities to the head or the tail of the throttled list. Now
that we always add to the tail, we can remove this field.

The other use of distribute_running is in the slack_timer, so that we
don't start a distribution while one is already running. However, even
in the event that this race occurs, it is fine to have two distributions
running (especially now that distribute grabs the cfs_b->lock to
determine remaining quota before assigning).

Signed-off-by: Josh Don <joshdon@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Phil Auld <pauld@redhat.com>
Tested-by: Phil Auld <pauld@redhat.com>
Link: https://lkml.kernel.org/r/20200410225208.109717-3-joshdon@google.com
---
 kernel/sched/fair.c  | 13 +------------
 kernel/sched/sched.h |  1 -
 2 files changed, 1 insertion(+), 13 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 0c13a41..3d6ce75 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -4931,14 +4931,12 @@ static int do_sched_cfs_period_timer(struct cfs_bandwidth *cfs_b, int overrun, u
 	/*
 	 * This check is repeated as we release cfs_b->lock while we unthrottle.
 	 */
-	while (throttled && cfs_b->runtime > 0 && !cfs_b->distribute_running) {
-		cfs_b->distribute_running = 1;
+	while (throttled && cfs_b->runtime > 0) {
 		raw_spin_unlock_irqrestore(&cfs_b->lock, flags);
 		/* we can't nest cfs_b->lock while distributing bandwidth */
 		distribute_cfs_runtime(cfs_b);
 		raw_spin_lock_irqsave(&cfs_b->lock, flags);
 
-		cfs_b->distribute_running = 0;
 		throttled = !list_empty(&cfs_b->throttled_cfs_rq);
 	}
 
@@ -5052,10 +5050,6 @@ static void do_sched_cfs_slack_timer(struct cfs_bandwidth *cfs_b)
 	/* confirm we're still not at a refresh boundary */
 	raw_spin_lock_irqsave(&cfs_b->lock, flags);
 	cfs_b->slack_started = false;
-	if (cfs_b->distribute_running) {
-		raw_spin_unlock_irqrestore(&cfs_b->lock, flags);
-		return;
-	}
 
 	if (runtime_refresh_within(cfs_b, min_bandwidth_expiration)) {
 		raw_spin_unlock_irqrestore(&cfs_b->lock, flags);
@@ -5065,9 +5059,6 @@ static void do_sched_cfs_slack_timer(struct cfs_bandwidth *cfs_b)
 	if (cfs_b->quota != RUNTIME_INF && cfs_b->runtime > slice)
 		runtime = cfs_b->runtime;
 
-	if (runtime)
-		cfs_b->distribute_running = 1;
-
 	raw_spin_unlock_irqrestore(&cfs_b->lock, flags);
 
 	if (!runtime)
@@ -5076,7 +5067,6 @@ static void do_sched_cfs_slack_timer(struct cfs_bandwidth *cfs_b)
 	distribute_cfs_runtime(cfs_b);
 
 	raw_spin_lock_irqsave(&cfs_b->lock, flags);
-	cfs_b->distribute_running = 0;
 	raw_spin_unlock_irqrestore(&cfs_b->lock, flags);
 }
 
@@ -5218,7 +5208,6 @@ void init_cfs_bandwidth(struct cfs_bandwidth *cfs_b)
 	cfs_b->period_timer.function = sched_cfs_period_timer;
 	hrtimer_init(&cfs_b->slack_timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
 	cfs_b->slack_timer.function = sched_cfs_slack_timer;
-	cfs_b->distribute_running = 0;
 	cfs_b->slack_started = false;
 }
 
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index db3a576..7198683 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -349,7 +349,6 @@ struct cfs_bandwidth {
 
 	u8			idle;
 	u8			period_active;
-	u8			distribute_running;
 	u8			slack_started;
 	struct hrtimer		period_timer;
 	struct hrtimer		slack_timer;


* [tip: sched/core] sched/fair: Eliminate bandwidth race between throttling and distribution
  2020-04-10 22:52 ` [PATCH 1/2] sched: eliminate bandwidth race between throttling and distribution Josh Don
       [not found]   ` <CABk29NtxG8t6wHM6MDVFun7jxqRpupWr0d5dK57N2ecFbdSumw@mail.gmail.com>
  2020-04-14 10:52   ` Peter Zijlstra
@ 2020-05-01 18:22   ` tip-bot2 for Paul Turner
  2 siblings, 0 replies; 14+ messages in thread
From: tip-bot2 for Paul Turner @ 2020-05-01 18:22 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Paul Turner, Ben Segall, Josh Don, Peter Zijlstra (Intel),
	Phil Auld, x86, LKML

The following commit has been merged into the sched/core branch of tip:

Commit-ID:     e98fa02c4f2ea4991dae422ac7e34d102d2f0599
Gitweb:        https://git.kernel.org/tip/e98fa02c4f2ea4991dae422ac7e34d102d2f0599
Author:        Paul Turner <pjt@google.com>
AuthorDate:    Fri, 10 Apr 2020 15:52:07 -07:00
Committer:     Peter Zijlstra <peterz@infradead.org>
CommitterDate: Thu, 30 Apr 2020 20:14:38 +02:00

sched/fair: Eliminate bandwidth race between throttling and distribution

There is a race window in which an entity begins throttling before quota
is added to the pool, but does not finish throttling until after we have
finished with distribute_cfs_runtime(). This entity is not observed by
distribute_cfs_runtime() because it was not on the throttled list at the
time that distribution was running. This race manifests as rare
period-length stalls for such entities.

Rather than heavy-weight the synchronization with the progress of
distribution, we can fix this by aborting throttling if bandwidth has
become available. Otherwise, we immediately add the entity to the
throttled list so that it can be observed by a subsequent distribution.

Additionally, we can remove the case of adding the throttled entity to
the head of the throttled list, and simply always add to the tail.
Thanks to 26a8b12747c97, distribute_cfs_runtime() no longer holds onto
its own pool of runtime. This means that if we do hit the !assign and
distribute_running case, we know that distribution is about to end.

Signed-off-by: Paul Turner <pjt@google.com>
Signed-off-by: Ben Segall <bsegall@google.com>
Signed-off-by: Josh Don <joshdon@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Phil Auld <pauld@redhat.com>
Link: https://lkml.kernel.org/r/20200410225208.109717-2-joshdon@google.com
---
 kernel/sched/fair.c | 79 ++++++++++++++++++++++++++------------------
 1 file changed, 47 insertions(+), 32 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 02f323b..0c13a41 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -4588,16 +4588,16 @@ static inline struct cfs_bandwidth *tg_cfs_bandwidth(struct task_group *tg)
 }
 
 /* returns 0 on failure to allocate runtime */
-static int assign_cfs_rq_runtime(struct cfs_rq *cfs_rq)
+static int __assign_cfs_rq_runtime(struct cfs_bandwidth *cfs_b,
+				   struct cfs_rq *cfs_rq, u64 target_runtime)
 {
-	struct task_group *tg = cfs_rq->tg;
-	struct cfs_bandwidth *cfs_b = tg_cfs_bandwidth(tg);
-	u64 amount = 0, min_amount;
+	u64 min_amount, amount = 0;
+
+	lockdep_assert_held(&cfs_b->lock);
 
 	/* note: this is a positive sum as runtime_remaining <= 0 */
-	min_amount = sched_cfs_bandwidth_slice() - cfs_rq->runtime_remaining;
+	min_amount = target_runtime - cfs_rq->runtime_remaining;
 
-	raw_spin_lock(&cfs_b->lock);
 	if (cfs_b->quota == RUNTIME_INF)
 		amount = min_amount;
 	else {
@@ -4609,13 +4609,25 @@ static int assign_cfs_rq_runtime(struct cfs_rq *cfs_rq)
 			cfs_b->idle = 0;
 		}
 	}
-	raw_spin_unlock(&cfs_b->lock);
 
 	cfs_rq->runtime_remaining += amount;
 
 	return cfs_rq->runtime_remaining > 0;
 }
 
+/* returns 0 on failure to allocate runtime */
+static int assign_cfs_rq_runtime(struct cfs_rq *cfs_rq)
+{
+	struct cfs_bandwidth *cfs_b = tg_cfs_bandwidth(cfs_rq->tg);
+	int ret;
+
+	raw_spin_lock(&cfs_b->lock);
+	ret = __assign_cfs_rq_runtime(cfs_b, cfs_rq, sched_cfs_bandwidth_slice());
+	raw_spin_unlock(&cfs_b->lock);
+
+	return ret;
+}
+
 static void __account_cfs_rq_runtime(struct cfs_rq *cfs_rq, u64 delta_exec)
 {
 	/* dock delta_exec before expiring quota (as it could span periods) */
@@ -4704,13 +4716,33 @@ static int tg_throttle_down(struct task_group *tg, void *data)
 	return 0;
 }
 
-static void throttle_cfs_rq(struct cfs_rq *cfs_rq)
+static bool throttle_cfs_rq(struct cfs_rq *cfs_rq)
 {
 	struct rq *rq = rq_of(cfs_rq);
 	struct cfs_bandwidth *cfs_b = tg_cfs_bandwidth(cfs_rq->tg);
 	struct sched_entity *se;
 	long task_delta, idle_task_delta, dequeue = 1;
-	bool empty;
+
+	raw_spin_lock(&cfs_b->lock);
+	/* This will start the period timer if necessary */
+	if (__assign_cfs_rq_runtime(cfs_b, cfs_rq, 1)) {
+		/*
+		 * We have raced with bandwidth becoming available, and if we
+		 * actually throttled the timer might not unthrottle us for an
+		 * entire period. We additionally needed to make sure that any
+		 * subsequent check_cfs_rq_runtime calls agree not to throttle
+		 * us, as we may commit to do cfs put_prev+pick_next, so we ask
+		 * for 1ns of runtime rather than just check cfs_b.
+		 */
+		dequeue = 0;
+	} else {
+		list_add_tail_rcu(&cfs_rq->throttled_list,
+				  &cfs_b->throttled_cfs_rq);
+	}
+	raw_spin_unlock(&cfs_b->lock);
+
+	if (!dequeue)
+		return false;  /* Throttle no longer required. */
 
 	se = cfs_rq->tg->se[cpu_of(rq_of(cfs_rq))];
 
@@ -4744,29 +4776,13 @@ static void throttle_cfs_rq(struct cfs_rq *cfs_rq)
 	if (!se)
 		sub_nr_running(rq, task_delta);
 
-	cfs_rq->throttled = 1;
-	cfs_rq->throttled_clock = rq_clock(rq);
-	raw_spin_lock(&cfs_b->lock);
-	empty = list_empty(&cfs_b->throttled_cfs_rq);
-
-	/*
-	 * Add to the _head_ of the list, so that an already-started
-	 * distribute_cfs_runtime will not see us. If disribute_cfs_runtime is
-	 * not running add to the tail so that later runqueues don't get starved.
-	 */
-	if (cfs_b->distribute_running)
-		list_add_rcu(&cfs_rq->throttled_list, &cfs_b->throttled_cfs_rq);
-	else
-		list_add_tail_rcu(&cfs_rq->throttled_list, &cfs_b->throttled_cfs_rq);
-
 	/*
-	 * If we're the first throttled task, make sure the bandwidth
-	 * timer is running.
+	 * Note: distribution will already see us throttled via the
+	 * throttled-list.  rq->lock protects completion.
 	 */
-	if (empty)
-		start_cfs_bandwidth(cfs_b);
-
-	raw_spin_unlock(&cfs_b->lock);
+	cfs_rq->throttled = 1;
+	cfs_rq->throttled_clock = rq_clock(rq);
+	return true;
 }
 
 void unthrottle_cfs_rq(struct cfs_rq *cfs_rq)
@@ -5121,8 +5137,7 @@ static bool check_cfs_rq_runtime(struct cfs_rq *cfs_rq)
 	if (cfs_rq_throttled(cfs_rq))
 		return true;
 
-	throttle_cfs_rq(cfs_rq);
-	return true;
+	return throttle_cfs_rq(cfs_rq);
 }
 
 static enum hrtimer_restart sched_cfs_slack_timer(struct hrtimer *timer)


* Re: [tip: sched/core] sched/fair: Remove distribute_running from CFS bandwidth
       [not found]     ` <BL0PR14MB3779C02BB87DC4426C4761639A840@BL0PR14MB3779.namprd14.prod.outlook.com>
@ 2020-06-08 14:53       ` Phil Auld
       [not found]         ` <BL0PR14MB3779AD967619031948957C549A850@BL0PR14MB3779.namprd14.prod.outlook.com>
  0 siblings, 1 reply; 14+ messages in thread
From: Phil Auld @ 2020-06-08 14:53 UTC (permalink / raw)
  To: Tao Zhou; +Cc: linux-kernel, Josh Don, Peter Zijlstra (Intel), x86

On Sun, Jun 07, 2020 at 09:25:58AM +0800 Tao Zhou wrote:
> Hi,
> 
> On Fri, May 01, 2020 at 06:22:12PM -0000, tip-bot2 for Josh Don wrote:
> > The following commit has been merged into the sched/core branch of tip:
> > 
> > Commit-ID:     ab93a4bc955b3980c699430bc0b633f0d8b607be
> > Gitweb:        https://git.kernel.org/tip/ab93a4bc955b3980c699430bc0b633f0d8b607be
> > Author:        Josh Don <joshdon@google.com>
> > AuthorDate:    Fri, 10 Apr 2020 15:52:08 -07:00
> > Committer:     Peter Zijlstra <peterz@infradead.org>
> > CommitterDate: Thu, 30 Apr 2020 20:14:38 +02:00
> > 
> > sched/fair: Remove distribute_running from CFS bandwidth
> > 
> > This is mostly a revert of commit:
> > 
> >   baa9be4ffb55 ("sched/fair: Fix throttle_list starvation with low CFS quota")
> > 
> > The primary use of distribute_running was to determine whether to add
> > throttled entities to the head or the tail of the throttled list. Now
> > that we always add to the tail, we can remove this field.
> > 
> > The other use of distribute_running is in the slack_timer, so that we
> > don't start a distribution while one is already running. However, even
> > in the event that this race occurs, it is fine to have two distributions
> > running (especially now that distribute grabs the cfs_b->lock to
> > determine remaining quota before assigning).
> > 
> > Signed-off-by: Josh Don <joshdon@google.com>
> > Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> > Reviewed-by: Phil Auld <pauld@redhat.com>
> > Tested-by: Phil Auld <pauld@redhat.com>
> > Link: https://lkml.kernel.org/r/20200410225208.109717-3-joshdon@google.com
> > ---
> >  kernel/sched/fair.c  | 13 +------------
> >  kernel/sched/sched.h |  1 -
> >  2 files changed, 1 insertion(+), 13 deletions(-)
> > 
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index 0c13a41..3d6ce75 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -4931,14 +4931,12 @@ static int do_sched_cfs_period_timer(struct cfs_bandwidth *cfs_b, int overrun, u
> >  	/*
> >  	 * This check is repeated as we release cfs_b->lock while we unthrottle.
> >  	 */
> > -	while (throttled && cfs_b->runtime > 0 && !cfs_b->distribute_running) {
> > -		cfs_b->distribute_running = 1;
> > +	while (throttled && cfs_b->runtime > 0) {
> >  		raw_spin_unlock_irqrestore(&cfs_b->lock, flags);
> >  		/* we can't nest cfs_b->lock while distributing bandwidth */
> >  		distribute_cfs_runtime(cfs_b);
> >  		raw_spin_lock_irqsave(&cfs_b->lock, flags);
> >  
> > -		cfs_b->distribute_running = 0;
> >  		throttled = !list_empty(&cfs_b->throttled_cfs_rq);
> >  	}
> >  
> > @@ -5052,10 +5050,6 @@ static void do_sched_cfs_slack_timer(struct cfs_bandwidth *cfs_b)
> >  	/* confirm we're still not at a refresh boundary */
> >  	raw_spin_lock_irqsave(&cfs_b->lock, flags);
> >  	cfs_b->slack_started = false;
> > -	if (cfs_b->distribute_running) {
> > -		raw_spin_unlock_irqrestore(&cfs_b->lock, flags);
> > -		return;
> > -	}
> >  
> >  	if (runtime_refresh_within(cfs_b, min_bandwidth_expiration)) {
> >  		raw_spin_unlock_irqrestore(&cfs_b->lock, flags);
> > @@ -5065,9 +5059,6 @@ static void do_sched_cfs_slack_timer(struct cfs_bandwidth *cfs_b)
> >  	if (cfs_b->quota != RUNTIME_INF && cfs_b->runtime > slice)
> >  		runtime = cfs_b->runtime;
> >  
> > -	if (runtime)
> > -		cfs_b->distribute_running = 1;
> > -
> >  	raw_spin_unlock_irqrestore(&cfs_b->lock, flags);
> >  
> >  	if (!runtime)
> > @@ -5076,7 +5067,6 @@ static void do_sched_cfs_slack_timer(struct cfs_bandwidth *cfs_b)
> >  	distribute_cfs_runtime(cfs_b);
> >  
> >  	raw_spin_lock_irqsave(&cfs_b->lock, flags);
> > -	cfs_b->distribute_running = 0;
> >  	raw_spin_unlock_irqrestore(&cfs_b->lock, flags);
> >  }
> 
> When I read the tip code, I found nothing between the lock/unlock above.
> This commit removed distribute_running. Is there any reason to keep
> that lock/unlock there? It does not seem necessary now, no?
>

Yeah, that looks pretty useless :)

Do you want to spin up a patch?


Cheers,
Phil


> Thanks
> 
> > @@ -5218,7 +5208,6 @@ void init_cfs_bandwidth(struct cfs_bandwidth *cfs_b)
> >  	cfs_b->period_timer.function = sched_cfs_period_timer;
> >  	hrtimer_init(&cfs_b->slack_timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
> >  	cfs_b->slack_timer.function = sched_cfs_slack_timer;
> > -	cfs_b->distribute_running = 0;
> >  	cfs_b->slack_started = false;
> >  }
> >  
> > diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
> > index db3a576..7198683 100644
> > --- a/kernel/sched/sched.h
> > +++ b/kernel/sched/sched.h
> > @@ -349,7 +349,6 @@ struct cfs_bandwidth {
> >  
> >  	u8			idle;
> >  	u8			period_active;
> > -	u8			distribute_running;
> >  	u8			slack_started;
> >  	struct hrtimer		period_timer;
> >  	struct hrtimer		slack_timer;
> 

-- 



* Re: [tip: sched/core] sched/fair: Remove distribute_running from CFS bandwidth
       [not found]         ` <BL0PR14MB3779AD967619031948957C549A850@BL0PR14MB3779.namprd14.prod.outlook.com>
@ 2020-06-08 23:44           ` Josh Don
  2020-06-09  0:28           ` Phil Auld
  1 sibling, 0 replies; 14+ messages in thread
From: Josh Don @ 2020-06-08 23:44 UTC (permalink / raw)
  To: Tao Zhou; +Cc: Phil Auld, linux-kernel, Peter Zijlstra (Intel), x86

Hi Tao,

On Mon, Jun 8, 2020 at 4:01 PM Tao Zhou <ouwen210@hotmail.com> wrote:
> After commit ab93a4bc955b, cfs_b->distribute_running is not used and
> removed. The lock/unlock protecting it are not removed and remain in
> the code. One benefit of removing them is that it can elimite the code
> size a little.
>
> Fixes: ab93a4bc955b ("sched/fair: Remove distribute_running from CFS bandwidth")
> ---
>  kernel/sched/fair.c | 3 ---
>  1 file changed, 3 deletions(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 35f4cc024dcf..cc2e1e839e03 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -5089,9 +5089,6 @@ static void do_sched_cfs_slack_timer(struct cfs_bandwidth *cfs_b)
>                 return;
>
>         distribute_cfs_runtime(cfs_b);
> -
> -       raw_spin_lock_irqsave(&cfs_b->lock, flags);
> -       raw_spin_unlock_irqrestore(&cfs_b->lock, flags);
>  }

Thanks, I missed the now-useless lock acquire in my revert.

s/elimite/eliminate

Reviewed-by: Josh Don <joshdon@google.com>

Best,
Josh


* Re: [tip: sched/core] sched/fair: Remove distribute_running from CFS bandwidth
       [not found]         ` <BL0PR14MB3779AD967619031948957C549A850@BL0PR14MB3779.namprd14.prod.outlook.com>
  2020-06-08 23:44           ` Josh Don
@ 2020-06-09  0:28           ` Phil Auld
  1 sibling, 0 replies; 14+ messages in thread
From: Phil Auld @ 2020-06-09  0:28 UTC (permalink / raw)
  To: Tao Zhou; +Cc: linux-kernel, Josh Don, Peter Zijlstra (Intel), x86


On Tue, Jun 09, 2020 at 07:05:38AM +0800 Tao Zhou wrote:
> Hi Phil,
> 
> On Mon, Jun 08, 2020 at 10:53:04AM -0400, Phil Auld wrote:
> > On Sun, Jun 07, 2020 at 09:25:58AM +0800 Tao Zhou wrote:
> > > Hi,
> > > 
> > > On Fri, May 01, 2020 at 06:22:12PM -0000, tip-bot2 for Josh Don wrote:
> > > > The following commit has been merged into the sched/core branch of tip:
> > > > 
> > > > Commit-ID:     ab93a4bc955b3980c699430bc0b633f0d8b607be
> > > > Gitweb:        https://git.kernel.org/tip/ab93a4bc955b3980c699430bc0b633f0d8b607be
> > > > Author:        Josh Don <joshdon@google.com>
> > > > AuthorDate:    Fri, 10 Apr 2020 15:52:08 -07:00
> > > > Committer:     Peter Zijlstra <peterz@infradead.org>
> > > > CommitterDate: Thu, 30 Apr 2020 20:14:38 +02:00
> > > > 
> > > > sched/fair: Remove distribute_running from CFS bandwidth
> > > > 
> > > > This is mostly a revert of commit:
> > > > 
> > > >   baa9be4ffb55 ("sched/fair: Fix throttle_list starvation with low CFS quota")
> > > > 
> > > > The primary use of distribute_running was to determine whether to add
> > > > throttled entities to the head or the tail of the throttled list. Now
> > > > that we always add to the tail, we can remove this field.
> > > > 
> > > > The other use of distribute_running is in the slack_timer, so that we
> > > > don't start a distribution while one is already running. However, even
> > > > in the event that this race occurs, it is fine to have two distributions
> > > > running (especially now that distribute grabs the cfs_b->lock to
> > > > determine remaining quota before assigning).
> > > > 
> > > > Signed-off-by: Josh Don <joshdon@google.com>
> > > > Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> > > > Reviewed-by: Phil Auld <pauld@redhat.com>
> > > > Tested-by: Phil Auld <pauld@redhat.com>
> > > > Link: https://lkml.kernel.org/r/20200410225208.109717-3-joshdon@google.com
> > > > ---
> > > >  kernel/sched/fair.c  | 13 +------------
> > > >  kernel/sched/sched.h |  1 -
> > > >  2 files changed, 1 insertion(+), 13 deletions(-)
> > > > 
> > > > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > > > index 0c13a41..3d6ce75 100644
> > > > --- a/kernel/sched/fair.c
> > > > +++ b/kernel/sched/fair.c
> > > > @@ -4931,14 +4931,12 @@ static int do_sched_cfs_period_timer(struct cfs_bandwidth *cfs_b, int overrun, u
> > > >  	/*
> > > >  	 * This check is repeated as we release cfs_b->lock while we unthrottle.
> > > >  	 */
> > > > -	while (throttled && cfs_b->runtime > 0 && !cfs_b->distribute_running) {
> > > > -		cfs_b->distribute_running = 1;
> > > > +	while (throttled && cfs_b->runtime > 0) {
> > > >  		raw_spin_unlock_irqrestore(&cfs_b->lock, flags);
> > > >  		/* we can't nest cfs_b->lock while distributing bandwidth */
> > > >  		distribute_cfs_runtime(cfs_b);
> > > >  		raw_spin_lock_irqsave(&cfs_b->lock, flags);
> > > >  
> > > > -		cfs_b->distribute_running = 0;
> > > >  		throttled = !list_empty(&cfs_b->throttled_cfs_rq);
> > > >  	}
> > > >  
> > > > @@ -5052,10 +5050,6 @@ static void do_sched_cfs_slack_timer(struct cfs_bandwidth *cfs_b)
> > > >  	/* confirm we're still not at a refresh boundary */
> > > >  	raw_spin_lock_irqsave(&cfs_b->lock, flags);
> > > >  	cfs_b->slack_started = false;
> > > > -	if (cfs_b->distribute_running) {
> > > > -		raw_spin_unlock_irqrestore(&cfs_b->lock, flags);
> > > > -		return;
> > > > -	}
> > > >  
> > > >  	if (runtime_refresh_within(cfs_b, min_bandwidth_expiration)) {
> > > >  		raw_spin_unlock_irqrestore(&cfs_b->lock, flags);
> > > > @@ -5065,9 +5059,6 @@ static void do_sched_cfs_slack_timer(struct cfs_bandwidth *cfs_b)
> > > >  	if (cfs_b->quota != RUNTIME_INF && cfs_b->runtime > slice)
> > > >  		runtime = cfs_b->runtime;
> > > >  
> > > > -	if (runtime)
> > > > -		cfs_b->distribute_running = 1;
> > > > -
> > > >  	raw_spin_unlock_irqrestore(&cfs_b->lock, flags);
> > > >  
> > > >  	if (!runtime)
> > > > @@ -5076,7 +5067,6 @@ static void do_sched_cfs_slack_timer(struct cfs_bandwidth *cfs_b)
> > > >  	distribute_cfs_runtime(cfs_b);
> > > >  
> > > >  	raw_spin_lock_irqsave(&cfs_b->lock, flags);
> > > > -	cfs_b->distribute_running = 0;
> > > >  	raw_spin_unlock_irqrestore(&cfs_b->lock, flags);
> > > >  }
> > > 
> > > When I read the tip code, I found nothing between the lock/unlock above.
> > > This commit removed distribute_running. Is there any reason to keep
> > > that lock/unlock there? It does not seem necessary now, no?
> > >
> > 
> > Yeah, that looks pretty useless :)
> > 
> > Do you want to spin up a patch?
> 
> Thanks for your reply, I prepared a patch for that indeed.
> 
> Attached below:
> 
> From 9ce12d6ab5542041e5adab7b200874c789cfd6e6 Mon Sep 17 00:00:00 2001
> From: Tao Zhou <ouwen210@hotmail.com>
> Date: Sat, 6 Jun 2020 23:08:58 +0800
> Subject: [PATCH] sched/fair: remove the residual cfs_b lock/unlock
> 
> After commit ab93a4bc955b, cfs_b->distribute_running is not used and
> removed. The lock/unlock protecting it are not removed and remain in
> the code. One benefit of removing them is that it can elimite the code
> size a little.
> 
> Fixes: ab93a4bc955b ("sched/fair: Remove distribute_running from CFS bandwidth")
> ---
>  kernel/sched/fair.c | 3 ---
>  1 file changed, 3 deletions(-)
> 
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 35f4cc024dcf..cc2e1e839e03 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -5089,9 +5089,6 @@ static void do_sched_cfs_slack_timer(struct cfs_bandwidth *cfs_b)
>  		return;
>  
>  	distribute_cfs_runtime(cfs_b);
> -
> -	raw_spin_lock_irqsave(&cfs_b->lock, flags);
> -	raw_spin_unlock_irqrestore(&cfs_b->lock, flags);
>  }
>  
>  /*
> --


Thanks Tao.

Reviewed-by: Phil Auld <pauld@redhat.com>

> 
> Thanks,
> Tao
> 
> > Cheers,
> > Phil
> > 
> > 
> > > Thanks
> > > 
> > > > @@ -5218,7 +5208,6 @@ void init_cfs_bandwidth(struct cfs_bandwidth *cfs_b)
> > > >  	cfs_b->period_timer.function = sched_cfs_period_timer;
> > > >  	hrtimer_init(&cfs_b->slack_timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
> > > >  	cfs_b->slack_timer.function = sched_cfs_slack_timer;
> > > > -	cfs_b->distribute_running = 0;
> > > >  	cfs_b->slack_started = false;
> > > >  }
> > > >  
> > > > diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
> > > > index db3a576..7198683 100644
> > > > --- a/kernel/sched/sched.h
> > > > +++ b/kernel/sched/sched.h
> > > > @@ -349,7 +349,6 @@ struct cfs_bandwidth {
> > > >  
> > > >  	u8			idle;
> > > >  	u8			period_active;
> > > > -	u8			distribute_running;
> > > >  	u8			slack_started;
> > > >  	struct hrtimer		period_timer;
> > > >  	struct hrtimer		slack_timer;
> > > 
> > 
> > -- 
> 

-- 



Thread overview: 14+ messages
2020-04-10 22:52 [PATCH 0/2] Fix race in CFS bandwidth Josh Don
2020-04-10 22:52 ` [PATCH 1/2] sched: eliminate bandwidth race between throttling and distribution Josh Don
     [not found]   ` <CABk29NtxG8t6wHM6MDVFun7jxqRpupWr0d5dK57N2ecFbdSumw@mail.gmail.com>
2020-04-13 14:44     ` Phil Auld
2020-04-14 10:52   ` Peter Zijlstra
2020-05-01 18:22   ` [tip: sched/core] sched/fair: Eliminate " tip-bot2 for Paul Turner
2020-04-10 22:52 ` [PATCH 2/2] sched: remove distribute_running from CFS bandwidth Josh Don
2020-04-12  2:01   ` Josh Don
2020-04-13 14:49     ` Phil Auld
2020-04-14 10:54   ` Peter Zijlstra
2020-05-01 18:22   ` [tip: sched/core] sched/fair: Remove " tip-bot2 for Josh Don
     [not found]     ` <BL0PR14MB3779C02BB87DC4426C4761639A840@BL0PR14MB3779.namprd14.prod.outlook.com>
2020-06-08 14:53       ` Phil Auld
     [not found]         ` <BL0PR14MB3779AD967619031948957C549A850@BL0PR14MB3779.namprd14.prod.outlook.com>
2020-06-08 23:44           ` Josh Don
2020-06-09  0:28           ` Phil Auld
2020-04-12  2:03 ` [PATCH 0/2] Fix race in " Josh Don
