linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* AMD Bulldozer topology regression since 4.6
@ 2016-11-29  7:03 Brice Goglin
  2016-11-29 19:39 ` Borislav Petkov
  0 siblings, 1 reply; 6+ messages in thread
From: Brice Goglin @ 2016-11-29  7:03 UTC (permalink / raw)
  To: Borislav Petkov, LKML

Hello

Since Linux 4.6 (and still in 4.9-rc5 at least), both AMD Bulldozer
cores of a single dual-core compute unit report the same core_id:

$ cat /sys/devices/system/cpu/cpu{?,??}/topology/core_id
0
0
1
1
2
2
3
0
3
[...]


Before 4.5 (and for a very long time), the kernel reported different
core_ids:

0
1
2
3
4
5
6
7
0
[...]


This causes user-space tools (at least all that rely on hwloc for
topology discovery) to think the processor has dual-threaded cores
instead of dual-core compute-unit modules.

The cause is likely this patch, which seems to assume both cores of a
same compute-unit have the same ID, which is wrong?

commit 8196dab4fc159943df6baaac04973bb1accb7100
Author: Borislav Petkov <bp@suse.de>
Date:   Fri Mar 25 15:52:36 2016 +0100

    x86/cpu: Get rid of compute_unit_id
    
    It is cpu_core_id anyway.
    
    Signed-off-by: Borislav Petkov <bp@suse.de>
    Link: http://lkml.kernel.org/r/1458917557-8757-3-git-send-email-bp@alien8.de
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>



Thanks
Brice
 

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: AMD Bulldozer topology regression since 4.6
  2016-11-29  7:03 AMD Bulldozer topology regression since 4.6 Brice Goglin
@ 2016-11-29 19:39 ` Borislav Petkov
  2016-11-29 21:02   ` Brice Goglin
  0 siblings, 1 reply; 6+ messages in thread
From: Borislav Petkov @ 2016-11-29 19:39 UTC (permalink / raw)
  To: Brice Goglin; +Cc: LKML, Thomas Gleixner

On Tue, Nov 29, 2016 at 08:03:15AM +0100, Brice Goglin wrote:
> This causes user-space tools (at least all that rely on hwloc for
> topology discovery) to think the processor has dual-threaded cores
> instead of dual-core compute-unit modules.
> 
> The cause is likely this patch, which seems to assume both cores of a
> same compute-unit have the same ID, which is wrong?

Does that fix it?

Patch is against latest tip/master because we have some more changes in
that area.

---
diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index 4daad1e39352..a070d7b0a133 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -305,15 +305,7 @@ static void amd_get_topology(struct cpuinfo_x86 *c)
 
 	/* get information required for multi-node processors */
 	if (boot_cpu_has(X86_FEATURE_TOPOEXT)) {
-		u32 eax, ebx, ecx, edx;
-
-		cpuid(0x8000001e, &eax, &ebx, &ecx, &edx);
-		node_id = ecx & 7;
-
-		/* get compute unit information */
-		smp_num_siblings = ((ebx >> 8) & 3) + 1;
-		c->x86_max_cores /= smp_num_siblings;
-		c->cpu_core_id = ebx & 0xff;
+		node_id = cpuid_ecx(0x8000001e) & 7;
 
 		/*
 		 * We may have multiple LLCs if L3 caches exist, so check if we
---

Just in case, I have one against Linus' master too:

---
diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index 1e81a37c034e..19f50a9d6b42 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -305,15 +305,7 @@ static void amd_get_topology(struct cpuinfo_x86 *c)
 
 	/* get information required for multi-node processors */
 	if (boot_cpu_has(X86_FEATURE_TOPOEXT)) {
-		u32 eax, ebx, ecx, edx;
-
-		cpuid(0x8000001e, &eax, &ebx, &ecx, &edx);
-		node_id = ecx & 7;
-
-		/* get compute unit information */
-		smp_num_siblings = ((ebx >> 8) & 3) + 1;
-		c->x86_max_cores /= smp_num_siblings;
-		c->cpu_core_id = ebx & 0xff;
+		node_id = cpuid_ecx(0x8000001e) & 7;
 	} else if (cpu_has(c, X86_FEATURE_NODEID_MSR)) {
 		u64 value;
---

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
-- 

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: AMD Bulldozer topology regression since 4.6
  2016-11-29 19:39 ` Borislav Petkov
@ 2016-11-29 21:02   ` Brice Goglin
  2016-11-29 21:15     ` Borislav Petkov
  2017-01-03 10:49     ` Brice Goglin
  0 siblings, 2 replies; 6+ messages in thread
From: Brice Goglin @ 2016-11-29 21:02 UTC (permalink / raw)
  To: Borislav Petkov; +Cc: LKML, Thomas Gleixner

Le 29/11/2016 20:39, Borislav Petkov a écrit :
> Does that fix it?
>
> Patch is against latest tip/master because we have some more changes in
> that area.

I tested the second patch on top of 4.8.11, it brings core_id back to
where it was before 4.6, thanks.

Reported-and-tested-by: Brice Goglin <Brice.Goglin@inria.fr>

However thread_siblings isn't back to where it was in 4.5. Now we have a
single bit in each thread_siblings mask. That's correct with respect to
the sysfs topology documentation. In 4.5, there were two bits (one for
each core of the compute unit), which was wrong (cores with different
core_ids shouldn't appear in each other thread_siblings). I assumed that
these processors had to break the sysfs topology documentation to expose
the concept of "dual-core compute-unit" which somehow sits between
hyperthreading and dual-core.

I personally do not care much about this regression, not sure about
other user-space tools?

Another minor related change: /proc/cpuinfo shows "cpu cores : 16"
instead of "8".

Brice


>
> ---
> diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
> index 4daad1e39352..a070d7b0a133 100644
> --- a/arch/x86/kernel/cpu/amd.c
> +++ b/arch/x86/kernel/cpu/amd.c
> @@ -305,15 +305,7 @@ static void amd_get_topology(struct cpuinfo_x86 *c)
>  
>  	/* get information required for multi-node processors */
>  	if (boot_cpu_has(X86_FEATURE_TOPOEXT)) {
> -		u32 eax, ebx, ecx, edx;
> -
> -		cpuid(0x8000001e, &eax, &ebx, &ecx, &edx);
> -		node_id = ecx & 7;
> -
> -		/* get compute unit information */
> -		smp_num_siblings = ((ebx >> 8) & 3) + 1;
> -		c->x86_max_cores /= smp_num_siblings;
> -		c->cpu_core_id = ebx & 0xff;
> +		node_id = cpuid_ecx(0x8000001e) & 7;
>  
>  		/*
>  		 * We may have multiple LLCs if L3 caches exist, so check if we
> ---
>
> Just in case, I have one against Linus' master too:
>
> ---
> diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
> index 1e81a37c034e..19f50a9d6b42 100644
> --- a/arch/x86/kernel/cpu/amd.c
> +++ b/arch/x86/kernel/cpu/amd.c
> @@ -305,15 +305,7 @@ static void amd_get_topology(struct cpuinfo_x86 *c)
>  
>  	/* get information required for multi-node processors */
>  	if (boot_cpu_has(X86_FEATURE_TOPOEXT)) {
> -		u32 eax, ebx, ecx, edx;
> -
> -		cpuid(0x8000001e, &eax, &ebx, &ecx, &edx);
> -		node_id = ecx & 7;
> -
> -		/* get compute unit information */
> -		smp_num_siblings = ((ebx >> 8) & 3) + 1;
> -		c->x86_max_cores /= smp_num_siblings;
> -		c->cpu_core_id = ebx & 0xff;
> +		node_id = cpuid_ecx(0x8000001e) & 7;
>  	} else if (cpu_has(c, X86_FEATURE_NODEID_MSR)) {
>  		u64 value;
> ---
>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: AMD Bulldozer topology regression since 4.6
  2016-11-29 21:02   ` Brice Goglin
@ 2016-11-29 21:15     ` Borislav Petkov
  2017-01-03 10:49     ` Brice Goglin
  1 sibling, 0 replies; 6+ messages in thread
From: Borislav Petkov @ 2016-11-29 21:15 UTC (permalink / raw)
  To: Brice Goglin; +Cc: LKML, Thomas Gleixner

On Tue, Nov 29, 2016 at 10:02:00PM +0100, Brice Goglin wrote:
> However thread_siblings isn't back to where it was in 4.5. Now we have a
> single bit in each thread_siblings mask. That's correct with respect to
> the sysfs topology documentation. In 4.5, there were two bits (one for
> each core of the compute unit), which was wrong (cores with different
> core_ids shouldn't appear in each other thread_siblings). I assumed that
> these processors had to break the sysfs topology documentation to expose
> the concept of "dual-core compute-unit" which somehow sits between
> hyperthreading and dual-core.

This is exactly the problem - there's never proper fitting of the
compute unit "ideology" in the whole topology view. So what we're doing
now is revert to the old strategy of keeping Bulldozer have core_id ==
thread_id. Basically, all CU threads are cores, as they're more powerful
than, say, SMT threads, but they still share an FPU.

So it is a different kind of sharing and it's kinda, well, different.
There's no other way to put it.

> I personally do not care much about this regression, not sure about
> other user-space tools?
> 
> Another minor related change: /proc/cpuinfo shows "cpu cores : 16"
> instead of "8".

Can you send me the whole thing? Offlist is fine too.

Thanks.

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
-- 

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: AMD Bulldozer topology regression since 4.6
  2016-11-29 21:02   ` Brice Goglin
  2016-11-29 21:15     ` Borislav Petkov
@ 2017-01-03 10:49     ` Brice Goglin
  2017-01-03 13:22       ` Borislav Petkov
  1 sibling, 1 reply; 6+ messages in thread
From: Brice Goglin @ 2017-01-03 10:49 UTC (permalink / raw)
  To: Borislav Petkov; +Cc: LKML, Thomas Gleixner



Le 29/11/2016 22:02, Brice Goglin a écrit :
> Le 29/11/2016 20:39, Borislav Petkov a écrit :
>> Does that fix it?
>>
>> Patch is against latest tip/master because we have some more changes in
>> that area.
> I tested the second patch on top of 4.8.11, it brings core_id back to
> where it was before 4.6, thanks.
>
> Reported-and-tested-by: Brice Goglin <Brice.Goglin@inria.fr>

Hello Borislav
Are you pushing this fix to Linus soon?
We received several bug reports on our side about dual-core compute
units being exposed as a dual-threaded single-cores in sysfs.
Thanks
Brice




> However thread_siblings isn't back to where it was in 4.5. Now we have a
> single bit in each thread_siblings mask. That's correct with respect to
> the sysfs topology documentation. In 4.5, there were two bits (one for
> each core of the compute unit), which was wrong (cores with different
> core_ids shouldn't appear in each other thread_siblings). I assumed that
> these processors had to break the sysfs topology documentation to expose
> the concept of "dual-core compute-unit" which somehow sits between
> hyperthreading and dual-core.
>
> I personally do not care much about this regression, not sure about
> other user-space tools?
>
> Another minor related change: /proc/cpuinfo shows "cpu cores : 16"
> instead of "8".
>
> Brice
>
>
>> ---
>> diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
>> index 4daad1e39352..a070d7b0a133 100644
>> --- a/arch/x86/kernel/cpu/amd.c
>> +++ b/arch/x86/kernel/cpu/amd.c
>> @@ -305,15 +305,7 @@ static void amd_get_topology(struct cpuinfo_x86 *c)
>>  
>>  	/* get information required for multi-node processors */
>>  	if (boot_cpu_has(X86_FEATURE_TOPOEXT)) {
>> -		u32 eax, ebx, ecx, edx;
>> -
>> -		cpuid(0x8000001e, &eax, &ebx, &ecx, &edx);
>> -		node_id = ecx & 7;
>> -
>> -		/* get compute unit information */
>> -		smp_num_siblings = ((ebx >> 8) & 3) + 1;
>> -		c->x86_max_cores /= smp_num_siblings;
>> -		c->cpu_core_id = ebx & 0xff;
>> +		node_id = cpuid_ecx(0x8000001e) & 7;
>>  
>>  		/*
>>  		 * We may have multiple LLCs if L3 caches exist, so check if we
>> ---
>>
>> Just in case, I have one against Linus' master too:
>>
>> ---
>> diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
>> index 1e81a37c034e..19f50a9d6b42 100644
>> --- a/arch/x86/kernel/cpu/amd.c
>> +++ b/arch/x86/kernel/cpu/amd.c
>> @@ -305,15 +305,7 @@ static void amd_get_topology(struct cpuinfo_x86 *c)
>>  
>>  	/* get information required for multi-node processors */
>>  	if (boot_cpu_has(X86_FEATURE_TOPOEXT)) {
>> -		u32 eax, ebx, ecx, edx;
>> -
>> -		cpuid(0x8000001e, &eax, &ebx, &ecx, &edx);
>> -		node_id = ecx & 7;
>> -
>> -		/* get compute unit information */
>> -		smp_num_siblings = ((ebx >> 8) & 3) + 1;
>> -		c->x86_max_cores /= smp_num_siblings;
>> -		c->cpu_core_id = ebx & 0xff;
>> +		node_id = cpuid_ecx(0x8000001e) & 7;
>>  	} else if (cpu_has(c, X86_FEATURE_NODEID_MSR)) {
>>  		u64 value;
>> ---
>>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: AMD Bulldozer topology regression since 4.6
  2017-01-03 10:49     ` Brice Goglin
@ 2017-01-03 13:22       ` Borislav Petkov
  0 siblings, 0 replies; 6+ messages in thread
From: Borislav Petkov @ 2017-01-03 13:22 UTC (permalink / raw)
  To: Brice Goglin; +Cc: LKML, Thomas Gleixner

On Tue, Jan 03, 2017 at 11:49:38AM +0100, Brice Goglin wrote:
> Are you pushing this fix to Linus soon?
> We received several bug reports on our side about dual-core compute
> units being exposed as a dual-threaded single-cores in sysfs.

Yes, it is the first thing on my TODO list for when I get back tomorrow.

Thanks.

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
-- 

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2017-01-03 13:22 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-11-29  7:03 AMD Bulldozer topology regression since 4.6 Brice Goglin
2016-11-29 19:39 ` Borislav Petkov
2016-11-29 21:02   ` Brice Goglin
2016-11-29 21:15     ` Borislav Petkov
2017-01-03 10:49     ` Brice Goglin
2017-01-03 13:22       ` Borislav Petkov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).