All of lore.kernel.org
 help / color / mirror / Atom feed
From: Vincent Guittot <vincent.guittot@linaro.org>
To: Phil Auld <pauld@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	Ingo Molnar <mingo@kernel.org>,
	Juri Lelli <juri.lelli@redhat.com>
Subject: Re: [PATCH] sched/fair: Fix enqueue_task_fair warning some more
Date: Thu, 7 May 2020 17:06:29 +0200	[thread overview]
Message-ID: <CAKfTPtC6jEJ00YG6DyKOwxxpzuUEBE48ivoi3njKLCEB88U8LQ@mail.gmail.com> (raw)
In-Reply-To: <20200506180521.GC9773@lorien.usersys.redhat.com>

Hi Phil,

On Wed, 6 May 2020 at 20:05, Phil Auld <pauld@redhat.com> wrote:
>
> Hi Vincent,
>
> Thanks for taking a look. More below...
>
> On Wed, May 06, 2020 at 06:36:45PM +0200 Vincent Guittot wrote:
> > Hi Phil,
> >
> > - reply to all this time
> >
> > On Wed, 6 May 2020 at 16:18, Phil Auld <pauld@redhat.com> wrote:
> > >
> > > sched/fair: Fix enqueue_task_fair warning some more
> > >
> > > The recent patch, fe61468b2cb (sched/fair: Fix enqueue_task_fair warning)
> > > did not fully resolve the issues with the (rq->tmp_alone_branch !=
> > > &rq->leaf_cfs_rq_list) warning in enqueue_task_fair. There is a case where
> > > the first for_each_sched_entity loop exits due to on_rq, having incompletely
> > > updated the list.  In this case the second for_each_sched_entity loop can
> > > further modify se. The later code to fix up the list management fails to do
> >
> > But for the 2nd  for_each_sched_entity, the cfs_rq should already be
> > in the list, isn't it ?
>
> No. In this case we hit the parent not on list case in list_add_leaf_cfs_rq
> which sets rq-tmp_alone_branch to cfs_rq->leaf_cfs_rq_list which is not
> the same. It returns false expecting the parent to be added later.
>
> But then the parent doens't get there because it's on_rq.
>
> >
> > The third for_each_entity loop is there for the throttled case but is
> > useless for other case
> >
>
> There actually is a throttling involved usually. The second loop breaks out
> early because one of the parents is throttled. But not before it advances
> se at least once.

Ok, that's even because of the throttling that the problem occurs

>
> Then the 3rd loop doesn't fix the tmp_alone_branch because it doesn't start
> with the right se.
>
> > Could you provide us some details about the use case that creates such
> > a situation ?
> >
>
> I admit I had to add trace_printks to get here. Here's what it showed (sorry
> for the long lines...)
>
> 1)  sh-6271  [044]  1271.322317: bprint: enqueue_task_fair: se 0xffffa085e7e30080 on_rq 0 cfs_rq = 0xffffa085e93da200
> 2)  sh-6271  [044]  1271.322320: bprint: enqueue_entity: Add_leaf_rq: cpu 17: nr_r 2; cfs 0xffffa085e93da200 onlist 0 tmp_a_b = 0xffffa085ef92c868 &rq->l_c_r_l = 0xffffa085ef92c868
> 3)  sh-6271  [044]  1271.322322: bprint: enqueue_entity: Add_leaf_rq: cpu 17: nr_r 2: parent not onlist  Set t_a_branch to 0xffffa085e93da340 rq->l_c_r_l = 0xffffa085ef92c868
> 4)  sh-6271  [044]  1271.322323: bprint: enqueue_task_fair: se 0xffffa085e93d8800 on_rq 1 cfs_rq = 0xffffa085dbfaea00
> 5)  sh-6271  [044]  1271.322324: bprint: enqueue_task_fair: Done enqueues, se=0xffffa085e93d8800, pid=3642
> 6)  sh-6271  [044]  1271.322326: bprint: enqueue_task_fair: update: cfs 0xffffa085e48ce000 throttled, se = 0xffffa085dbfafc00
> 7)  sh-6271  [044]  1271.322326: bprint: enqueue_task_fair: current se = 0xffffa085dbfafc00, orig_se = 0xffffa085e7e30080
> 8)  sh-6271  [044]  1271.322327: bprint: enqueue_task_fair: Add_leaf_rq: cpu 17: nr_r 2; cfs 0xffffa085e48ce000 onlist 1 tmp_a_b = 0xffffa085e93da340 &rq->l_c_r_l = 0xffffa085ef92c868
> 9)  sh-6271  [044]  1271.322328: bprint: enqueue_task_fair: Add_leaf_rq: cpu 17: nr_r 0; cfs 0xffffa085ef92bf80 onlist 1 tmp_a_b = 0xffffa085e93da340 &rq->l_c_r_l = 0xffffa085ef92c868
> 10) sh-6271  [044]  1271.672599: bprint: enqueue_task_fair: cpu 17: rq->tmp_alone_branch = 0xffffa085e93da340 != &rq->leaf_cfs_rq_list = 0xffffa085ef92c868
>
>
> lines 1 and 4 are from the first loop in enqueue_task_fair. Line 2 and 3 are from the
> first call to list_add_leaf_rq with line 2 being at the start and line 3 showing which
> of the 3 cases we hit.
>
> Line 5 is right after the first loop.
>
> Line 6 is the second trip through the 2nd loop and is in the if(throttled) condition.
> Line 7 is right below the enqueue_throttle label.
>
> Lines 8 and 9 are from the fixup loop and since onlist is set for both of these it doesn't
> do anything. But we've left rq->tmp_alone_branch pointing to the cfs_rq->leaf_cfs_rq_list
> from the one call to list_add_leaf_rq that did something and so the cleanup doesn't work.
>
> Based on the comment at the clean up, it looked like it expected the se to be what it was
> when the first loop broke not whatever it was left at after the second loop.  Could have
> been NULL there too I guess but I didn't hit that case.
>
> This is 100% reproducible. And completely gone with the fix. I have a trace showing that.
>
> Does that make more sense?

Yes, Good catch
And thanks for the detailed explanation.

>
>
>
> Cheers,
> Phil
>
>
> > > what is needed because se does not point to the sched_entity which broke out
> > > of the first loop.
> > >
> > > Address this issue by saving the se pointer when the first loop exits and
> > > resetting it before doing the fix up, if needed.
> > >
> > > Signed-off-by: Phil Auld <pauld@redhat.com>
> > > Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
> > > Cc: Vincent Guittot <vincent.guittot@linaro.org>
> > > Cc: Ingo Molnar <mingo@kernel.org>
> > > Cc: Juri Lelli <juri.lelli@redhat.com>
> > > ---
> > >  kernel/sched/fair.c | 4 ++++
> > >  1 file changed, 4 insertions(+)
> > >
> > > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > > index 02f323b85b6d..719c996317e3 100644
> > > --- a/kernel/sched/fair.c
> > > +++ b/kernel/sched/fair.c
> > > @@ -5432,6 +5432,7 @@ enqueue_task_fair(struct rq *rq, struct task_struct *p, int flags)
> > >  {
> > >         struct cfs_rq *cfs_rq;
> > >         struct sched_entity *se = &p->se;
> > > +       struct sched_entity *saved_se = NULL;
> > >         int idle_h_nr_running = task_has_idle_policy(p);
> > >
> > >         /*
> > > @@ -5466,6 +5467,7 @@ enqueue_task_fair(struct rq *rq, struct task_struct *p, int flags)
> > >                 flags = ENQUEUE_WAKEUP;
> > >         }
> > >
> > > +       saved_se = se;

TBH, I don't like saving and going back to the saved se and loop one
more time on them

> > >         for_each_sched_entity(se) {
> > >                 cfs_rq = cfs_rq_of(se);

Could you add something like below in the 2nd loop instead ?

--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -5486,6 +5486,13 @@ enqueue_task_fair(struct rq *rq, struct
task_struct *p, int flags)
                /* end evaluation on encountering a throttled cfs_rq */
                if (cfs_rq_throttled(cfs_rq))
                        goto enqueue_throttle;
+
+               /*
+                * One parent has been throttled and cfs_rq removed from the
+                * list. Add it back to not break the leaf list.
+                */
+               if (throttled_hierarchy(cfs_rq))
+                       list_add_leaf_cfs_rq(cfs_rq);
        }

 enqueue_throttle:

> > >
> > > @@ -5510,6 +5512,8 @@ enqueue_task_fair(struct rq *rq, struct task_struct *p, int flags)
> > >                  * leaf list maintenance, resulting in triggering the assertion
> > >                  * below.
> > >                  */
> > > +               if (saved_se)
> > > +                       se = saved_se;
> > >                 for_each_sched_entity(se) {
> > >                         cfs_rq = cfs_rq_of(se);
> > >
> > > --
> > > 2.18.0
> > >
> > >
> > > Cheers,
> > > Phil
> > >
> >
>
> --
>

  reply	other threads:[~2020-05-07 15:06 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-05-06 14:18 [PATCH] sched/fair: Fix enqueue_task_fair warning some more Phil Auld
2020-05-06 16:36 ` Vincent Guittot
2020-05-06 18:05   ` Phil Auld
2020-05-07 15:06     ` Vincent Guittot [this message]
2020-05-07 15:17       ` Phil Auld
2020-05-07 18:04       ` Phil Auld
2020-05-07 18:21         ` Vincent Guittot
2020-05-07 20:36 ` [PATCH v2] " Phil Auld
2020-05-08 15:15   ` Tao Zhou
2020-05-08 15:27     ` Vincent Guittot
2020-05-08 17:02       ` Tao Zhou
2020-05-11  8:36         ` Vincent Guittot
     [not found]           ` <BL0PR14MB37792D0FD629FFF1C9FEDE369AA10@BL0PR14MB3779.namprd14.prod.outlook.com>
2020-05-11 19:22             ` Vincent Guittot
2020-05-11  8:40         ` Dietmar Eggemann
2020-05-11  9:36           ` Vincent Guittot
2020-05-11 10:39             ` Dietmar Eggemann
2020-05-11 12:12               ` Vincent Guittot
2020-05-11 17:02                 ` Dietmar Eggemann
2020-05-11 17:14                   ` Vincent Guittot
     [not found]               ` <BL0PR14MB3779ED5E2E5AD157B58D002C9AA10@BL0PR14MB3779.namprd14.prod.outlook.com>
2020-05-11 17:03                 ` Dietmar Eggemann
2020-05-11 19:25   ` Vincent Guittot
2020-05-11 20:44     ` Phil Auld
2020-05-12  9:00       ` Dietmar Eggemann
2020-05-12 13:37         ` Phil Auld
2020-05-12 14:06       ` Peter Zijlstra
2020-05-12 13:52 ` [PATCH v3] " Phil Auld
2020-05-12 14:10   ` Peter Zijlstra
2020-05-12 14:24     ` Phil Auld
2020-05-19 18:44   ` [tip: sched/urgent] sched/fair: Fix enqueue_task_fair() " tip-bot2 for Phil Auld

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAKfTPtC6jEJ00YG6DyKOwxxpzuUEBE48ivoi3njKLCEB88U8LQ@mail.gmail.com \
    --to=vincent.guittot@linaro.org \
    --cc=juri.lelli@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@kernel.org \
    --cc=pauld@redhat.com \
    --cc=peterz@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.