linux-kernel.vger.kernel.org archive mirror
From: "Connor O'Brien" <connoro@google.com>
To: Valentin Schneider <vschneid@redhat.com>
Cc: linux-kernel@vger.kernel.org, kernel-team@android.com,
	John Stultz <jstultz@google.com>,
	Joel Fernandes <joelaf@google.com>,
	Qais Yousef <qais.yousef@arm.com>, Ingo Molnar <mingo@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Juri Lelli <juri.lelli@redhat.com>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	Dietmar Eggemann <dietmar.eggemann@arm.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Ben Segall <bsegall@google.com>, Mel Gorman <mgorman@suse.de>,
	Daniel Bristot de Oliveira <bristot@redhat.com>,
	Will Deacon <will@kernel.org>, Waiman Long <longman@redhat.com>,
	Boqun Feng <boqun.feng@gmail.com>,
	"Paul E . McKenney" <paulmck@kernel.org>
Subject: Re: [RFC PATCH 09/11] sched/rt: Fix proxy/current (push,pull)ability
Date: Fri, 14 Oct 2022 15:32:56 -0700	[thread overview]
Message-ID: <CALE1s+ODz2FUJoSHcORa25kckk81qSHuZ6RSE6-k=s2gzQ+eOQ@mail.gmail.com> (raw)
In-Reply-To: <xhsmhv8orgb59.mognet@vschneid.remote.csb>

On Mon, Oct 10, 2022 at 4:40 AM Valentin Schneider <vschneid@redhat.com> wrote:
>
> On 03/10/22 21:44, Connor O'Brien wrote:
> > From: Valentin Schneider <valentin.schneider@arm.com>
>
> This was one of my attempts at fixing RT load balancing (the BUG_ON in
> pick_next_pushable_task() was quite easy to trigger), but I ended up
> convincing myself this was insufficient - this only "tags" the donor and
> the proxy, the entire blocked chain needs tagging. Hopefully not all of
> what I'm about to write is nonsense, some of the neurons I need for this
> haven't been used in a while - to be taken with a grain of salt.
Made sense to me! Thanks, this was really helpful for understanding
the interactions between proxy execution & load balancing.
>
> Consider pick_highest_pushable_task() - we don't want any task in a blocked
> chain to be pickable. There's no point in migrating it, we'll just hit
> schedule()->proxy(), follow p->blocked_on and most likely move it back to
> where the rest of the chain is. This applies to any sort of balancing (CFS,
> RT, DL).
>
> ATM I think PE breaks the "run the N highest priority tasks on our N CPUs"
> policy. Consider:
>
>    p0 (FIFO42)
>     |
>     | blocked_on
>     v
>    p1 (FIFO41)
>     |
>     | blocked_on
>     v
>    p2 (FIFO40)
>
>   Add on top p3 an unrelated FIFO1 task, and p4 an unrelated CFS task.
>
>   CPU0
>   current:  p0
>   proxy:    p2
>   enqueued: p0, p1, p2, p3
>
>   CPU1
>   current:  p4
>   proxy:    p4
>   enqueued: p4
>
>
> pick_next_pushable_task() on CPU0 would pick p1 as the next highest
> priority task to push away to e.g. CPU1, but that would be undone as soon
> as proxy() happens on CPU1: we'd notice the CPU boundary and punt it back
> to CPU0. What we would want here is to pick p3 instead to have it run on
> CPU1.

Given this point, is there any reason that blocked tasks should ever
be pushable, even if they are not part of the blocked chain for the
currently running task? If we could just check task_is_blocked()
rather than needing to know whether the task is in the middle of the
"active" chain, that would seem to simplify things greatly. I think
that approach might also require another dequeue/enqueue, in
ttwu_runnable(), to catch non-proxy blocked tasks becoming unblocked
(and therefore pushable), but that *seems* OK...though I could
certainly be missing something.

A related load balancing correctness question that caught my eye while
taking another look at this code: when we have rq->curr != rq->proxy
and then rq->curr is preempted and switches out, IIUC rq->curr should
become pushable immediately - but this does not seem to be the case
even with this patch. Is there a path that handles this case that I'm
just missing, or a reason that no special handling is needed?
Otherwise I wonder if __schedule() might need a dequeue/enqueue for
the prev task as well in this case.
>
> I *think* we want only the proxy of an entire blocked-chain to be visible
> to load-balance, unfortunately PE gathers the blocked-chain onto the
> donor's CPU which kinda undoes that.
>
> Having the blocked tasks remain in the rq is very handy as it directly
> gives us the scheduling context and we can unwind the blocked chain for the
> execution context, but it does wreak havoc in load-balancing :/
>

