linux-kernel.vger.kernel.org archive mirror
From: "benbjiang(蒋彪)" <benbjiang@tencent.com>
To: "Li, Aubrey" <aubrey.li@linux.intel.com>
Cc: Vineeth Remanan Pillai <vpillai@digitalocean.com>,
	Nishanth Aravamudan <naravamudan@digitalocean.com>,
	Julien Desfossez <jdesfossez@digitalocean.com>,
	Peter Zijlstra <peterz@infradead.org>,
	"Tim Chen" <tim.c.chen@linux.intel.com>,
	"mingo@kernel.org" <mingo@kernel.org>,
	"tglx@linutronix.de" <tglx@linutronix.de>,
	"pjt@google.com" <pjt@google.com>,
	"torvalds@linux-foundation.org" <torvalds@linux-foundation.org>,
	Aubrey Li <aubrey.li@intel.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"subhra.mazumdar@oracle.com" <subhra.mazumdar@oracle.com>,
	"fweisbec@gmail.com" <fweisbec@gmail.com>,
	"keescook@chromium.org" <keescook@chromium.org>,
	"kerrnel@google.com" <kerrnel@google.com>,
	Phil Auld <pauld@redhat.com>, Aaron Lu <aaron.lwe@gmail.com>,
	Aubrey Li <aubrey.intel@gmail.com>,
	Valentin Schneider <valentin.schneider@arm.com>,
	Mel Gorman <mgorman@techsingularity.net>,
	Pawan Gupta <pawan.kumar.gupta@linux.intel.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Joel Fernandes <joelaf@google.com>,
	"joel@joelfernandes.org" <joel@joelfernandes.org>,
	"vineethrp@gmail.com" <vineethrp@gmail.com>,
	Chen Yu <yu.c.chen@intel.com>,
	Christian Brauner <christian.brauner@ubuntu.com>
Subject: Re: [RFC PATCH 11/16] sched: migration changes for core scheduling(Internet mail)
Date: Thu, 23 Jul 2020 04:23:52 +0000
Message-ID: <96A765D7-7FD3-40EB-873B-0F9365569490@tencent.com>
In-Reply-To: <6ab8a001-ae5e-e484-c571-90d6931004e7@linux.intel.com>

Hi,
> On Jul 23, 2020, at 11:35 AM, Li, Aubrey <aubrey.li@linux.intel.com> wrote:
> 
> On 2020/7/23 10:42, benbjiang(蒋彪) wrote:
>> Hi,
>> 
>>> On Jul 23, 2020, at 9:57 AM, Li, Aubrey <aubrey.li@linux.intel.com> wrote:
>>> 
>>> On 2020/7/22 22:32, benbjiang(蒋彪) wrote:
>>>> Hi,
>>>> 
>>>>> On Jul 22, 2020, at 8:13 PM, Li, Aubrey <aubrey.li@linux.intel.com> wrote:
>>>>> 
>>>>> On 2020/7/22 16:54, benbjiang(蒋彪) wrote:
>>>>>> Hi, Aubrey,
>>>>>> 
>>>>>>> On Jul 1, 2020, at 5:32 AM, Vineeth Remanan Pillai <vpillai@digitalocean.com> wrote:
>>>>>>> 
>>>>>>> From: Aubrey Li <aubrey.li@intel.com>
>>>>>>> 
>>>>>>> - Don't migrate if there is a cookie mismatch
>>>>>>>  Load balance tries to move task from busiest CPU to the
>>>>>>>  destination CPU. When core scheduling is enabled, if the
>>>>>>>  task's cookie does not match with the destination CPU's
>>>>>>>  core cookie, this task will be skipped by this CPU. This
>>>>>>>  mitigates the forced idle time on the destination CPU.
>>>>>>> 
>>>>>>> - Select cookie matched idle CPU
>>>>>>>  In the fast path of task wakeup, select the first cookie matched
>>>>>>>  idle CPU instead of the first idle CPU.
>>>>>>> 
>>>>>>> - Find cookie matched idlest CPU
>>>>>>>  In the slow path of task wakeup, find the idlest CPU whose core
>>>>>>>  cookie matches with task's cookie
>>>>>>> 
>>>>>>> - Don't migrate task if cookie not match
>>>>>>>  For the NUMA load balance, don't migrate task to the CPU whose
>>>>>>>  core cookie does not match with task's cookie
>>>>>>> 
>>>>>>> Signed-off-by: Aubrey Li <aubrey.li@linux.intel.com>
>>>>>>> Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
>>>>>>> Signed-off-by: Vineeth Remanan Pillai <vpillai@digitalocean.com>
>>>>>>> ---
>>>>>>> kernel/sched/fair.c  | 64 ++++++++++++++++++++++++++++++++++++++++----
>>>>>>> kernel/sched/sched.h | 29 ++++++++++++++++++++
>>>>>>> 2 files changed, 88 insertions(+), 5 deletions(-)
>>>>>>> 
>>>>>>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>>>>>>> index d16939766361..33dc4bf01817 100644
>>>>>>> --- a/kernel/sched/fair.c
>>>>>>> +++ b/kernel/sched/fair.c
>>>>>>> @@ -2051,6 +2051,15 @@ static void task_numa_find_cpu(struct task_numa_env *env,
>>>>>>> 		if (!cpumask_test_cpu(cpu, env->p->cpus_ptr))
>>>>>>> 			continue;
>>>>>>> 
>>>>>>> +#ifdef CONFIG_SCHED_CORE
>>>>>>> +		/*
>>>>>>> +		 * Skip this cpu if source task's cookie does not match
>>>>>>> +		 * with CPU's core cookie.
>>>>>>> +		 */
>>>>>>> +		if (!sched_core_cookie_match(cpu_rq(cpu), env->p))
>>>>>>> +			continue;
>>>>>>> +#endif
>>>>>>> +
>>>>>>> 		env->dst_cpu = cpu;
>>>>>>> 		if (task_numa_compare(env, taskimp, groupimp, maymove))
>>>>>>> 			break;
>>>>>>> @@ -5963,11 +5972,17 @@ find_idlest_group_cpu(struct sched_group *group, struct task_struct *p, int this
>>>>>>> 
>>>>>>> 	/* Traverse only the allowed CPUs */
>>>>>>> 	for_each_cpu_and(i, sched_group_span(group), p->cpus_ptr) {
>>>>>>> +		struct rq *rq = cpu_rq(i);
>>>>>>> +
>>>>>>> +#ifdef CONFIG_SCHED_CORE
>>>>>>> +		if (!sched_core_cookie_match(rq, p))
>>>>>>> +			continue;
>>>>>>> +#endif
>>>>>>> +
>>>>>>> 		if (sched_idle_cpu(i))
>>>>>>> 			return i;
>>>>>>> 
>>>>>>> 		if (available_idle_cpu(i)) {
>>>>>>> -			struct rq *rq = cpu_rq(i);
>>>>>>> 			struct cpuidle_state *idle = idle_get_state(rq);
>>>>>>> 			if (idle && idle->exit_latency < min_exit_latency) {
>>>>>>> 				/*
>>>>>>> @@ -6224,8 +6239,18 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int t
>>>>>>> 	for_each_cpu_wrap(cpu, cpus, target) {
>>>>>>> 		if (!--nr)
>>>>>>> 			return -1;
>>>>>>> -		if (available_idle_cpu(cpu) || sched_idle_cpu(cpu))
>>>>>>> -			break;
>>>>>>> +
>>>>>>> +		if (available_idle_cpu(cpu) || sched_idle_cpu(cpu)) {
>>>>>>> +#ifdef CONFIG_SCHED_CORE
>>>>>>> +			/*
>>>>>>> +			 * If Core Scheduling is enabled, select this cpu
>>>>>>> +			 * only if the process cookie matches core cookie.
>>>>>>> +			 */
>>>>>>> +			if (sched_core_enabled(cpu_rq(cpu)) &&
>>>>>>> +			    p->core_cookie == cpu_rq(cpu)->core->core_cookie)
>>>>>> Why not also add similar logic in select_idle_smt to reduce forced-idle? :)
>>>>> We hit select_idle_smt after we scanned the entire LLC domain for idle cores
>>>>> and idle cpus and failed, so IMHO, an idle smt is probably a good choice under
>>>>> this scenario.
>>>> 
>>>> AFAIC, selecting an idle sibling with an unmatched cookie will cause unnecessary forced-idle, unfairness and latency, compared to choosing the *target* cpu.
>>> Choosing the target cpu could increase the number of runnable tasks on the target runqueue. This
>>> could trigger the busiest->nr_running > 1 logic and make the idle sibling try to pull but
>>> fail (due to the cookie mismatch). Putting the task on the idle sibling is relatively stable IMHO.
>> 
>> I’m afraid that *unsuccessful* pulls between SMT siblings would not result in instability, because
>> load balancing always runs periodically, and an unsuccessful pull means nothing happens.
> An unsuccessful pull means more unnecessary overhead in load balance.
> 
>> On the contrary, unmatched sibling tasks running concurrently could force each other idle repeatedly,
>> which is more unstable, and more costly in pick_next_task for all siblings.
> Not worse than two tasks ping-pong on the same target run queue I guess, and better if
> - task1(cookie A) is running on the target, and task2(cookie B) in the runqueue,
> - task3(cookie B) coming
> 
> If task3 chooses target's sibling, it could have a chance to run concurrently with task2.
> But if task3 chooses target, it will wait for next pulling luck of load balancer
That’s more interesting. :)
As I understand it, spreading tasks with different cookies onto different CPUs (or cpusets) is the *ideal stable state* we want.
Tasks with different cookies running on sibling SMT threads can hurt performance, and that should be avoided on a best-effort basis.
In the case above, selecting the idle sibling CPU does indeed improve concurrency, but it also reduces the imbalance seen by the load balancer.
The load balancer then does not notice any imbalance and does nothing to improve the unmatched situation.
By contrast, choosing the *target* CPU increases the imbalance, so the load balancer can try to pull the unmatched task away,
which could improve the unmatched situation and help reach the *ideal stable state*. Maybe that’s what we expect. :)
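
To make the trade-off concrete, here is a toy model of the scenario above in C.
It is only a sketch under loud assumptions: every name in it (toy_rq, toy_core,
toy_imbalance(), toy_can_pair(), toy_scenario()) is made up for illustration and
none of it is kernel code. Queue length stands in for whatever load the real
balancer compares, and "can pair" only checks that a cookie appears on both
sibling runqueues, not what pick_next_task would actually select.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model, NOT kernel code: one SMT core with two sibling runqueues. */
enum { TARGET = 0, SIBLING = 1, NR_SIBLINGS = 2, MAX_TASKS = 4 };
enum { COOKIE_A = 1, COOKIE_B = 2 };

struct toy_rq {
	unsigned long cookie[MAX_TASKS];	/* cookies of the queued tasks */
	int nr_running;
};

struct toy_core {
	struct toy_rq rq[NR_SIBLINGS];
};

static void toy_enqueue(struct toy_core *core, int cpu, unsigned long cookie)
{
	struct toy_rq *rq = &core->rq[cpu];

	assert(cpu >= 0 && cpu < NR_SIBLINGS);
	assert(rq->nr_running < MAX_TASKS);
	rq->cookie[rq->nr_running++] = cookie;
}

/* Queue-length difference: a stand-in for the imbalance a balancer reacts to. */
static int toy_imbalance(const struct toy_core *core)
{
	int d = core->rq[TARGET].nr_running - core->rq[SIBLING].nr_running;

	return d < 0 ? -d : d;
}

/* True if some cookie is present on both siblings, so two tasks could pair. */
static bool toy_can_pair(const struct toy_core *core)
{
	const struct toy_rq *t = &core->rq[TARGET];
	const struct toy_rq *s = &core->rq[SIBLING];

	for (int i = 0; i < t->nr_running; i++)
		for (int j = 0; j < s->nr_running; j++)
			if (t->cookie[i] == s->cookie[j])
				return true;
	return false;
}

/*
 * task1 (cookie A) and task2 (cookie B) are already on the target;
 * task3 (cookie B) wakes up and is placed on @place_on.
 */
static struct toy_core toy_scenario(int place_on)
{
	struct toy_core core = {0};

	toy_enqueue(&core, TARGET, COOKIE_A);	/* task1 */
	toy_enqueue(&core, TARGET, COOKIE_B);	/* task2 */
	toy_enqueue(&core, place_on, COOKIE_B);	/* task3 */
	return core;
}
```

In this model, toy_scenario(SIBLING) leaves a queue-length difference of 1 with
a possible pairing of task2 and task3, while toy_scenario(TARGET) leaves a
difference of 3 with nothing to pair on the sibling. That is the tension being
argued here: the sibling placement helps concurrency now, the target placement
gives the balancer something to act on.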

Thx.
Regards,
Jiang

> 
> Thanks,
> -Aubrey
> 
>> Given that load balancing is currently not fully aware of core scheduling and cannot improve
>> the *unmatched sibling* case, the *find_idlest_** entry should try its best to avoid that case, IMHO.
> 
>> Also, just a suggestion and an option. :)
>> 
>> Thx.
>> Regards,
>> Jiang  
>> 
>>> 
>>>> Besides, choosing *target* cpu may be more cache friendly. So IMHO, *target* cpu may be a better choice if cookie not match, instead of idle sibling.
>>> I'm not sure if it's more cache friendly as the target is busy, and the coming task
>>> is a cookie unmatched task.


