linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Vincent Guittot <vincent.guittot@linaro.org>
To: Valentin Schneider <Valentin.Schneider@arm.com>
Cc: Vincent Donnefort <Vincent.Donnefort@arm.com>,
	peterz@infradead.org, mingo@redhat.com,
	linux-kernel@vger.kernel.org, mgorman@techsingularity.net,
	dietmar.eggemann@arm.com
Subject: Re: [PATCH] sched/fair: Fix detection of per-CPU kthreads waking a task
Date: Thu, 25 Nov 2021 14:23:10 +0100	[thread overview]
Message-ID: <CAKfTPtDPskVdEd-KQ_cwe-R_zVFPQOgdbk9x+3eD12pKs8fGFw@mail.gmail.com> (raw)
In-Reply-To: <8735nkcwov.mognet@arm.com>

On Thu, 25 Nov 2021 at 12:16, Valentin Schneider
<Valentin.Schneider@arm.com> wrote:
>
> On 25/11/21 10:05, Vincent Guittot wrote:
> > On Wed, 24 Nov 2021 at 16:42, Vincent Donnefort
> > <vincent.donnefort@arm.com> wrote:
> >>
> >> select_idle_sibling() will return prev_cpu for the case where the task is
> >> woken up by a per-CPU kthread. However, the idle task has been recently
> >> modified and is now identified by is_per_cpu_kthread(), breaking the
> >> behaviour described above. Using !is_idle_task() ensures we do not
> >> spuriously trigger that select_idle_sibling() exit path.
> >>
> >> Fixes: 00b89fe0197f ("sched: Make the idle task quack like a per-CPU kthread")
> >> Signed-off-by: Vincent Donnefort <vincent.donnefort@arm.com>
> >>
> >> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> >> index 945d987246c5..8bf95b0e368d 100644
> >> --- a/kernel/sched/fair.c
> >> +++ b/kernel/sched/fair.c
> >> @@ -6399,6 +6399,7 @@ static int select_idle_sibling(struct task_struct *p, int prev, int target)
> >>          * pattern is IO completions.
> >>          */
> >>         if (is_per_cpu_kthread(current) &&
> >> +           !is_idle_task(current) &&
> >>             prev == smp_processor_id() &&
> >>             this_rq()->nr_running <= 1) {
> >>                 return prev;
> >
> > AFAICT, this can't be possible for a symmetric system because it would
> > have been already returned by other conditions.
> > Only an asymmetric system can face such a situation if the task
> > doesn't fit which is the subject of your other patch.
> > so this patch seems irrelevant outside the other one
> >
>
> I think you can still hit this on a symmetric system; let me try to
> reformulate my other email.
>
> If this (non-patched) condition evaluates to true, it means the previous
> condition
>
>   (available_idle_cpu(target) || sched_idle_cpu(target)) &&
>    asym_fits_capacity(task_util, target)
>
> evaluated to false, so for a symmetric system target sure isn't idle.
>
> prev == smp_processor_id() implies prev == target, IOW prev isn't
> idle. Now, consider:
>
>   p0.prev = CPU1
>   p1.prev = CPU1
>
>   CPU0                     CPU1
>   current = don't care     current = swapper/1
>
>   ttwu(p1)
>     ttwu_queue(p1, CPU1)
>     // or
>     ttwu_queue_wakelist(p1, CPU1)
>
>                           hrtimer_wakeup()
>                             wake_up_process()
>                               ttwu()
>                                 idle_cpu(CPU1)? no
>
>                                 is_per_cpu_kthread(current)? yes
>                                 prev == smp_processor_id()? yes
>                                 this_rq()->nr_running <= 1? yes
>                                 => self enqueue
>
>                           ...
>                           schedule_idle()
>
> This works if CPU0 does either a full enqueue (rq->nr_running == 1) or just
> a wakelist enqueue (rq->ttwu_pending > 0). If there was an idle CPU3
> around, we'd still be stacking p0 and p1 onto CPU1.
>
> IOW this opens a window between a remote ttwu() and the idle task invoking
> schedule_idle() where the idle task can stack more tasks onto its CPU.

Your use case above is out of the scope of this patch and has always
been there, even for other per cpu kthreads. In such case, the wake up
is not triggered by current (idle or another per cpu kthread) but by
an interrupt (hrtimer in your case). If we want to filter wakeup
generated by interrupt context while a per cpu kthread is running, it
would be better to fix all cases and test the running context like
this

--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6397,7 +6397,8 @@ static int select_idle_sibling(struct
task_struct *p, int prev, int target)
         * essentially a sync wakeup. An obvious example of this
         * pattern is IO completions.
         */
-       if (is_per_cpu_kthread(current) &&
+       if (!in_interrupt() &&
+           is_per_cpu_kthread(current) &&
            prev == smp_processor_id() &&
            this_rq()->nr_running <= 1) {
                return prev;

>
> >
> >> --
> >> 2.25.1
> >>

  parent reply	other threads:[~2021-11-25 13:35 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-11-24 15:42 [PATCH] sched/fair: Fix detection of per-CPU kthreads waking a task Vincent Donnefort
2021-11-24 16:28 ` Valentin Schneider
2021-11-25  9:05 ` Vincent Guittot
2021-11-25 11:16   ` Valentin Schneider
2021-11-25 13:17     ` Dietmar Eggemann
2021-11-25 13:23     ` Vincent Guittot [this message]
2021-11-25 15:30       ` Valentin Schneider
2021-11-26  8:23         ` Vincent Guittot
2021-11-26 13:32           ` Valentin Schneider
2021-11-26 14:40             ` Vincent Guittot
2021-11-26 16:49               ` Valentin Schneider
2021-11-26 17:18                 ` Vincent Donnefort
2021-11-29 15:49                   ` Vincent Guittot
2021-11-29 16:54                     ` Vincent Donnefort
2021-11-30 13:35                       ` Dietmar Eggemann
2021-11-30 15:42                       ` Vincent Guittot
2021-12-01 14:40                         ` Vincent Donnefort
2021-12-01 16:19                           ` Vincent Guittot
2021-11-29  8:36 ` [sched/fair] 8d0920b981: stress-ng.sem.ops_per_sec 11.9% improvement kernel test robot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAKfTPtDPskVdEd-KQ_cwe-R_zVFPQOgdbk9x+3eD12pKs8fGFw@mail.gmail.com \
    --to=vincent.guittot@linaro.org \
    --cc=Valentin.Schneider@arm.com \
    --cc=Vincent.Donnefort@arm.com \
    --cc=dietmar.eggemann@arm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mgorman@techsingularity.net \
    --cc=mingo@redhat.com \
    --cc=peterz@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).