All of lore.kernel.org
 help / color / mirror / Atom feed
From: Steven Rostedt <rostedt@goodmis.org>
To: Qais Yousef <qais.yousef@arm.com>
Cc: Ingo Molnar <mingo@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Dietmar Eggemann <dietmar.eggemann@arm.com>,
	Pavan Kondeti <pkondeti@codeaurora.org>,
	Juri Lelli <juri.lelli@redhat.com>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	Ben Segall <bsegall@google.com>, Mel Gorman <mgorman@suse.de>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v3 1/6] sched/rt: cpupri_find: Implement fallback mechanism for !fit case
Date: Wed, 4 Mar 2020 13:54:04 -0500	[thread overview]
Message-ID: <20200304135404.146c56eb@gandalf.local.home> (raw)
In-Reply-To: <20200304173925.43xq4wztummxgs3x@e107158-lin.cambridge.arm.com>

On Wed, 4 Mar 2020 17:39:26 +0000
Qais Yousef <qais.yousef@arm.com> wrote:

> > > +	/*
> > > +	 * If we failed to find a fitting lowest_mask, make sure we fall back
> > > +	 * to the last known unfitting lowest_mask.
> > > +	 *
> > > +	 * Note that the map of the recorded idx might have changed since then,
> > > +	 * so we must ensure to do the full dance to make sure that level still
> > > +	 * holds a valid lowest_mask.
> > > +	 *
> > > +	 * As per above, the map could have been concurrently emptied while we
> > > +	 * were busy searching for a fitting lowest_mask at the other priority
> > > +	 * levels.
> > > +	 *
> > > +	 * This rule favours honouring priority over fitting the task in the
> > > +	 * correct CPU (Capacity Awareness being the only user now).
> > > +	 * The idea is that if a higher priority task can run, then it should
> > > +	 * run even if this ends up being on unfitting CPU.
> > > +	 *
> > > +	 * The cost of this trade-off is not entirely clear and will probably
> > > +	 * be good for some workloads and bad for others.
> > > +	 *
> > > +	 * The main idea here is that if some CPUs were overcommitted, we try
> > > +	 * to spread which is what the scheduler traditionally did. Sys admins
> > > +	 * must do proper RT planning to avoid overloading the system if they
> > > +	 * really care.
> > > +	 */
> > > +	if (best_unfit_idx != -1)
> > > +		return __cpupri_find(cp, p, lowest_mask, best_unfit_idx);  
> > 
> > Hmm, this only checks the one index, which can change and then we miss
> > everything. I think we can do better. What about this:  
> 
> Hmm. I see 2 issues with this approach:
> 
> > 
> > 
> >         for (idx = 0; idx < task_pri; idx++) {
> > 		int found = -1;
> > 
> >                 if (!__cpupri_find(cp, p, lowest_mask, idx))
> >                         continue;  
> 
> 1.
> 
> __cpupri_find() could update the lowest_mask at the next iteration, so the
> fallback wouldn't be the lowest level, right?

Hmm, you may be right.

> 
> > 
> >                 if (!lowest_mask || !fitness_fn)
> >                         return 1;
> > 
> > 		/* Make sure we have one fit CPU before clearing */
> > 		for_each_cpu(cpu, lowest_mask) {
> > 			if (fitness_fn(p, cpu)) {
> > 				found = cpu;
> > 				break;
> > 			}
> > 		}
> > 
> > 		if (found == -1)
> > 			continue;  
> 
> 2.
> 
> If we fix 1, then assuming found == -1 for all level, we'll still have the
> problem that the mask is stale.
> 
> We can do a full scan again as Tao was suggestion, the 2nd one without any
> fitness check that is. But isn't this expensive?

I was hoping to try to avoid that, but it's not that expensive and will
probably seldom happen. Perhaps we should run some test cases and trace the
results to see how often that can happen.

> 
> We risk the mask being stale anyway directly after selecting it. Or a priority
> level might become the lowest level just after we dismissed it.

Sure, but that's still a better effort.

> 
> So our best effort could end up lying even if we do the right thing (TM).
> 
> > 
> >                 /* Ensure the capacity of the CPUs fit the task */
> >                 for_each_cpu(cpu, lowest_mask) {
> >                         if (cpu < found || !fitness_fn(p, cpu))
> >                                 cpumask_clear_cpu(cpu, lowest_mask);
> >                 }
> > 
> >                 return 1;
> >         }
> > 
> > This way, if nothing fits we return the untouched lowest_mask, and only
> > clear the lowest_mask bits if we found a fitness cpu.  
> 
> Because of 1, I think the lowest mask will not be the true lowest mask, no?
> Please correct me if I missed something.

No, I think you are correct.

> 
> There's another 'major' problem that I need to bring your attention to,
> find_lowest_rq() always returns the first CPU in the mask.
> 
> See discussion below for more details
> 
> 	https://lore.kernel.org/lkml/20200219140243.wfljmupcrwm2jelo@e107158-lin/
> 
> In my test because multiple tasks wakeup together they all end up going to CPU1
> (the first fitting CPU in the mask), then just to be pushed back again. Not
> necessarily to where they were running before.
> 
> Not sure if there are other situations where this could happen.
> 
> If we fix this problem then we can help reduce the effect of this raciness in
> find_lowest_rq(), and end up with less ping-ponging if tasks happen to
> wakeup/sleep at the wrong time during the scan.

Hmm, I wonder if there's a fast way of getting the next CPU from the
current cpu the task is on. Perhaps that will help in the ping ponging.

-- Steve

> 
> Or maybe there is a way to eliminate all these races with the current design?
> 
> Thanks
> 
> --
> Qais Yousef


  reply	other threads:[~2020-03-04 18:54 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-03-02 13:27 [PATCH v3 0/6] RT Capacity Awareness Fixes & Improvements Qais Yousef
2020-03-02 13:27 ` [PATCH v3 1/6] sched/rt: cpupri_find: Implement fallback mechanism for !fit case Qais Yousef
     [not found]   ` <20200304143200.GA13200@geo.homenetwork>
2020-03-04 15:18     ` Qais Yousef
2020-03-04 16:27   ` Steven Rostedt
2020-03-04 17:39     ` Qais Yousef
2020-03-04 18:54       ` Steven Rostedt [this message]
2020-03-04 20:01         ` Qais Yousef
2020-03-05 12:43           ` Qais Yousef
2020-03-10 14:22             ` Qais Yousef
2020-03-11 13:51               ` Steven Rostedt
2020-03-20 12:58               ` [tip: sched/core] sched/rt: cpupri_find: Trigger a full search as fallback tip-bot2 for Qais Yousef
2020-03-06 14:42   ` [tip: sched/core] sched/rt: cpupri_find: Implement fallback mechanism for !fit case tip-bot2 for Qais Yousef
2020-03-02 13:27 ` [PATCH v3 2/6] sched/rt: Re-instate old behavior in select_task_rq_rt Qais Yousef
2020-03-06 14:42   ` [tip: sched/core] sched/rt: Re-instate old behavior in select_task_rq_rt() tip-bot2 for Qais Yousef
2020-03-02 13:27 ` [PATCH v3 3/6] sched/rt: Optimize cpupri_find on non-heterogenous systems Qais Yousef
2020-03-06 14:42   ` [tip: sched/core] sched/rt: Optimize cpupri_find() " tip-bot2 for Qais Yousef
2020-03-02 13:27 ` [PATCH v3 4/6] sched/rt: Allow pulling unfitting task Qais Yousef
     [not found]   ` <20200304145219.GA14173@geo.homenetwork>
2020-03-04 15:28     ` Qais Yousef
2020-03-06 14:42   ` [tip: sched/core] " tip-bot2 for Qais Yousef
2020-03-02 13:27 ` [PATCH v3 5/6] sched/rt: Remove unnecessary push for unfit tasks Qais Yousef
2020-03-06 14:42   ` [tip: sched/core] " tip-bot2 for Qais Yousef
2020-03-02 13:27 ` [PATCH v3 6/6] sched/rt: Fix pushing unfit tasks to a better CPU Qais Yousef
2020-03-06 17:51   ` Qais Yousef
2020-03-11 10:53     ` Pavan Kondeti
2020-03-11 14:11       ` Qais Yousef
2020-03-11 14:00   ` Steven Rostedt
2020-03-11 14:23     ` Qais Yousef

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200304135404.146c56eb@gandalf.local.home \
    --to=rostedt@goodmis.org \
    --cc=bsegall@google.com \
    --cc=dietmar.eggemann@arm.com \
    --cc=juri.lelli@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mgorman@suse.de \
    --cc=mingo@kernel.org \
    --cc=peterz@infradead.org \
    --cc=pkondeti@codeaurora.org \
    --cc=qais.yousef@arm.com \
    --cc=vincent.guittot@linaro.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.