linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Dietmar Eggemann <dietmar.eggemann@arm.com>
To: Peter Zijlstra <peterz@infradead.org>, Bruno Wolff III <bruno@wolff.to>
Cc: Josh Boyer <jwboyer@redhat.com>,
	"mingo@redhat.com" <mingo@redhat.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"H. Peter Anvin" <hpa@zytor.com>,
	Thomas Gleixner <tglx@linutronix.de>
Subject: Re: Scheduler regression from caffcdd8d27ba78730d5540396ce72ad022aff2c
Date: Tue, 22 Jul 2014 13:12:38 +0100	[thread overview]
Message-ID: <53CE5536.4000005@arm.com> (raw)
In-Reply-To: <20140722094740.GJ12054@laptop.lan>

On 22/07/14 10:47, Peter Zijlstra wrote:
> On Mon, Jul 21, 2014 at 06:52:12PM +0200, Peter Zijlstra wrote:
>> On Mon, Jul 21, 2014 at 11:35:28AM -0500, Bruno Wolff III wrote:
>>> Is there more I can do to help with this now? Or should I just wait for
>>> patches to test?
>>
>> Yeah, sorry, was wiped out today. I'll go stare harder at the P4
>> topology setup code tomorrow. Something fishy there.
> 
> Does this make your machine boot again (while giving an error)?
> 
> It tries to robustify the topology setup a bit, crashing on crap input
> should be avoided if possible of course.
> 
> I'll go stare at the x86/P4 topology code like promised.
> 
> ---
> Subject: sched: Robustify topology setup
> From: Peter Zijlstra <peterz@infradead.org>
> Date: Mon Jul 21 23:07:06 CEST 2014
> 
> We hard assume that higher topology levels are strict supersets of
> lower levels.

IMHO, we require only that higher topology levels are supersets of lower
levels, not strict (proper) supersets.

AFAICS, the patch itself requires only supersets, i.e. on ARM TC2 with
the following change in cpu_corepower_mask:

 const struct cpumask *cpu_corepower_mask(int cpu)
 {
-       return &cpu_topology[cpu].thread_sibling;
+       return &cpu_topology[cpu].core_sibling;
 }

I get:

...
build_sched_domain: cpu: 0 level: GMC cpu_map: 0-4 tl->mask: 0-1
build_sched_domain: cpu: 0 level: MC cpu_map: 0-4 tl->mask: 0-1
build_sched_domain: cpu: 0 level: DIE cpu_map: 0-4 tl->mask: 0-4
...

without hitting the newly introduced pr_err's.

> 
> Detect, warn and try to fixup when we encounter this violated.
> 
> Signed-off-by: Peter Zijlstra <peterz@infradead.org>
> Link: http://lkml.kernel.org/n/tip-cgp9j2tk0qnunhtpps3udsom@git.kernel.org
> ---
>  kernel/sched/core.c |   14 ++++++++++++++
>  1 file changed, 14 insertions(+)
> 
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -6480,6 +6480,20 @@ struct sched_domain *build_sched_domain(
>  		sched_domain_level_max = max(sched_domain_level_max, sd->level);
>  		child->parent = sd;
>  		sd->child = child;
> +
> +		if (!cpumask_subset(sched_domain_span(child),
> +				    sched_domain_span(sd))) {
> +			pr_err("BUG: arch topology borken\n");
> +#ifdef CONFIG_SCHED_DEBUG
> +			pr_err("     the %s domain not a subset of the %s domain\n",
> +					child->name, sd->name);
> +#endif
> +			/* Fixup, ensure @sd has at least @child cpus. */
> +			cpumask_or(sched_domain_span(sd),
> +				   sched_domain_span(sd),
> +				   sched_domain_span(child));

This fixup will (probably) heal the Bruno's issue with it's wrong
cpu_coregroup_mask() function.

If I exchange cpu_corepower_mask with cpu_coregroup_mask in arm_topology[]

 static struct sched_domain_topology_level arm_topology[] = {
 #ifdef CONFIG_SCHED_MC
-       { cpu_corepower_mask, cpu_corepower_flags, SD_INIT_NAME(GMC) },
-       { cpu_coregroup_mask, cpu_core_flags, SD_INIT_NAME(MC) },
+       { cpu_coregroup_mask, cpu_corepower_flags, SD_INIT_NAME(GMC) },
+       { cpu_corepower_mask, cpu_core_flags, SD_INIT_NAME(MC) },
 #endif

I get:

...
build_sched_domain: cpu: 0 level: GMC cpu_map: 0-4 tl->mask: 0-1
build_sched_domain: cpu: 0 level: MC cpu_map: 0-4 tl->mask: 0
BUG: arch topology borken
     the GMC domain not a subset of the MC domain
build_sched_domain: cpu: 0 level: DIE cpu_map: 0-4 tl->mask: 0-4
...

cat /proc/schedstat
...
cpu0 0 0 16719 6169 10937 6392 5348510220 2935348625 10448
domain0 03 19190 19168 9 10265 13 0 0 19168 16 16 0 0 0 0 0 16 1196 1080
46 43570 75 0 0 1080 0 0 0 0 0 0 0 0 0 2947 280 0
domain1 1f 18768 18763 3 3006 2 0 9 18055 6 6 0 0 0 0 0 1 1125 996 94
81038 43 0 18 978 0 0 0 0 0 0 0 0 0 1582 172 0

# cat /proc/sys/kernel/sched_domain/cpu0/domain*/name
GMC
DIE

so MC level gets changed to mask 0-1.

> +		}
> +
>  	}
>  	set_domain_attribute(sd, attr);
>  
> 

Tested-by: Dietmar Eggemann <dietmar.eggemann@arm.com>




















  parent reply	other threads:[~2014-07-22 12:12 UTC|newest]

Thread overview: 44+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-07-16 14:55 Scheduler regression from caffcdd8d27ba78730d5540396ce72ad022aff2c Bruno Wolff III
2014-07-16 15:17 ` Josh Boyer
2014-07-16 19:17   ` Dietmar Eggemann
2014-07-16 19:54     ` Bruno Wolff III
2014-07-16 23:18       ` Dietmar Eggemann
2014-07-17  3:09         ` Bruno Wolff III
2014-07-17  8:57           ` Dietmar Eggemann
2014-07-17  9:04             ` Peter Zijlstra
2014-07-17 11:23               ` Dietmar Eggemann
2014-07-17 12:35                 ` Peter Zijlstra
2014-07-18  5:34                   ` Bruno Wolff III
2014-07-18  9:28                     ` Dietmar Eggemann
2014-07-18 12:09                       ` Bruno Wolff III
2014-07-18 10:16                     ` Peter Zijlstra
2014-07-18 13:01                       ` Bruno Wolff III
2014-07-18 14:16                         ` Dietmar Eggemann
2014-07-18 14:16                         ` Peter Zijlstra
2014-07-18 14:50                           ` Peter Zijlstra
2014-07-18 16:16                             ` Peter Zijlstra
2014-07-21 16:35                               ` Bruno Wolff III
2014-07-21 16:52                                 ` Peter Zijlstra
2014-07-22  9:47                                   ` Peter Zijlstra
2014-07-22 10:38                                     ` Peter Zijlstra
2014-07-22 12:10                                       ` Bruno Wolff III
2014-07-22 13:03                                         ` Peter Zijlstra
2014-07-22 13:26                                           ` Peter Zijlstra
2014-07-22 13:35                                             ` Peter Zijlstra
2014-07-22 14:09                                               ` Bruno Wolff III
2014-07-22 14:18                                                 ` Peter Zijlstra
2014-07-23  1:37                                                   ` Bruno Wolff III
2014-07-23  6:51                                                     ` Peter Zijlstra
2014-07-22 17:05                                               ` H. Peter Anvin
2014-07-23 15:11                                               ` Peter Zijlstra
2014-07-23 15:12                                                 ` H. Peter Anvin
2014-07-24  1:45                                                 ` Bruno Wolff III
2014-07-23 15:39                                               ` [tip:x86/urgent] x86, cpu: Fix cache topology for early P4-SMT tip-bot for Peter Zijlstra
2014-07-22 12:12                                     ` Dietmar Eggemann [this message]
2014-07-22 12:57                                     ` Scheduler regression from caffcdd8d27ba78730d5540396ce72ad022aff2c Bruno Wolff III
2014-07-28  8:28                                     ` [tip:sched/core] sched: Robustify topology setup tip-bot for Peter Zijlstra
2014-07-17 16:36             ` Scheduler regression from caffcdd8d27ba78730d5540396ce72ad022aff2c Bruno Wolff III
2014-07-17 18:43               ` Dietmar Eggemann
2014-07-17 18:54                 ` Bruno Wolff III
2014-07-17  4:21         ` Bruno Wolff III
2014-07-17  4:28     ` Bruno Wolff III

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=53CE5536.4000005@arm.com \
    --to=dietmar.eggemann@arm.com \
    --cc=bruno@wolff.to \
    --cc=hpa@zytor.com \
    --cc=jwboyer@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=peterz@infradead.org \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).