linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 1/2] x86/AMD: Fix cpu_llc_id for AMD Fam17h systems
@ 2016-10-28 15:51 Yazen Ghannam
  2016-10-28 15:51 ` [PATCH v2 2/2] x86/AMD: Group cpu_llc_id assignment by topology feature and family Yazen Ghannam
  2016-10-28 16:57 ` [PATCH v2 1/2] x86/AMD: Fix cpu_llc_id for AMD Fam17h systems Borislav Petkov
  0 siblings, 2 replies; 7+ messages in thread
From: Yazen Ghannam @ 2016-10-28 15:51 UTC (permalink / raw)
  To: linux-kernel; +Cc: x86, bp, Yazen Ghannam, stable

The current Fam17h cpu_llc_id derivation has an underflow bug when
extracting the socket_id value. The socket_id value starts from 0, so
subtracting 1 will result in an underflow. This breaks scheduling topology
later on since the cpu_llc_id will be incorrect.

The apicid decoding is fixed for bits 3 and above, which give the core
complex, node and socket IDs. The LLC is at the core complex level so we
can find a unique cpu_llc_id by right shifting the apicid by 3.

We can fix the underflow bug and simplify the code by replacing the
current cpu_llc_id derivation with a right shift.

Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com>
Cc: <stable@vger.kernel.org> # v4.4..
Fixes: 3849e91f571d ("x86/AMD: Fix last level cache topology for AMD Fam17h systems")
---
 arch/x86/kernel/cpu/amd.c | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index 7b76eb6..c3fc337 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -347,7 +347,6 @@ static void amd_detect_cmp(struct cpuinfo_x86 *c)
 #ifdef CONFIG_SMP
 	unsigned bits;
 	int cpu = smp_processor_id();
-	unsigned int socket_id, core_complex_id;
 
 	bits = c->x86_coreid_bits;
 	/* Low order bits define the core id (index of core in socket) */
@@ -365,10 +364,7 @@ static void amd_detect_cmp(struct cpuinfo_x86 *c)
 	 if (c->x86 != 0x17 || !cpuid_edx(0x80000006))
 		return;
 
-	socket_id	= (c->apicid >> bits) - 1;
-	core_complex_id	= (c->apicid & ((1 << bits) - 1)) >> 3;
-
-	per_cpu(cpu_llc_id, cpu) = (socket_id << 3) | core_complex_id;
+	per_cpu(cpu_llc_id, cpu) = c->apicid >> 3;
 #endif
 }
 
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH v2 2/2] x86/AMD: Group cpu_llc_id assignment by topology feature and family
  2016-10-28 15:51 [PATCH v2 1/2] x86/AMD: Fix cpu_llc_id for AMD Fam17h systems Yazen Ghannam
@ 2016-10-28 15:51 ` Yazen Ghannam
  2016-10-28 19:56   ` Borislav Petkov
  2016-10-28 16:57 ` [PATCH v2 1/2] x86/AMD: Fix cpu_llc_id for AMD Fam17h systems Borislav Petkov
  1 sibling, 1 reply; 7+ messages in thread
From: Yazen Ghannam @ 2016-10-28 15:51 UTC (permalink / raw)
  To: linux-kernel; +Cc: x86, bp, Yazen Ghannam

Currently, we assume that a system has multiple last level caches only if
there are multiple nodes, and that the cpu_llc_id is equal to the node_id.
This no longer applies since Fam17h can have multiple last level caches
within a node.

So group the cpu_llc_id assignment by topology feature and family.

The NODEID_MSR feature only applies to Fam10h in which case the llc is at
the node level.

The TOPOEXT feature is used on families 15h, 16h and 17h. So far we only
see multiple last level caches if L3 caches are available. Otherwise, the
cpu_llc_id will default to be the phys_proc_id. We have L3 caches only on
families 15h and 17h. On Fam15h, the llc is at the node level. On Fam17h,
the llc is at the core complex level and can be found by right shifting the
apicid. Also, keep the family checks explicit so that new families will
fall back to the default.

Single node systems in families 10h and 15h will have a Node ID of 0 which
will be the same as the phys_proc_id, so we don't need to check for
multiple nodes before using the node_id.

Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com>
---
 arch/x86/kernel/cpu/amd.c | 32 ++++++++++++++++++++------------
 1 file changed, 20 insertions(+), 12 deletions(-)

diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index c3fc337..be70345 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -314,11 +314,31 @@ static void amd_get_topology(struct cpuinfo_x86 *c)
 		smp_num_siblings = ((ebx >> 8) & 3) + 1;
 		c->x86_max_cores /= smp_num_siblings;
 		c->cpu_core_id = ebx & 0xff;
+
+		/*
+		 * We may have multiple LLCs if L3 caches exist, so check if we
+		 * have an L3 cache by looking at the L3 cache cpuid leaf.
+		 */
+		if (cpuid_edx(0x80000006)) {
+			if (c->x86 == 0x15) {
+				/* LLC is at the node level. */
+				per_cpu(cpu_llc_id, cpu) = node_id;
+
+			} else if (c->x86 == 0x17) {
+				/*
+				 * LLC is at the core complex level.
+				 * Core complex id is ApicId[3].
+				 */
+				per_cpu(cpu_llc_id, cpu) = c->apicid >> 3;
+			}
+		}
 	} else if (cpu_has(c, X86_FEATURE_NODEID_MSR)) {
 		u64 value;
 
 		rdmsrl(MSR_FAM10H_NODE_ID, value);
 		node_id = value & 7;
+
+		per_cpu(cpu_llc_id, cpu) = node_id;
 	} else
 		return;
 
@@ -329,9 +349,6 @@ static void amd_get_topology(struct cpuinfo_x86 *c)
 		set_cpu_cap(c, X86_FEATURE_AMD_DCM);
 		cus_per_node = c->x86_max_cores / nodes_per_socket;
 
-		/* store NodeID, use llc_shared_map to store sibling info */
-		per_cpu(cpu_llc_id, cpu) = node_id;
-
 		/* core id has to be in the [0 .. cores_per_node - 1] range */
 		c->cpu_core_id %= cus_per_node;
 	}
@@ -356,15 +373,6 @@ static void amd_detect_cmp(struct cpuinfo_x86 *c)
 	/* use socket ID also for last level cache */
 	per_cpu(cpu_llc_id, cpu) = c->phys_proc_id;
 	amd_get_topology(c);
-
-	/*
-	 * Fix percpu cpu_llc_id here as LLC topology is different
-	 * for Fam17h systems.
-	 */
-	 if (c->x86 != 0x17 || !cpuid_edx(0x80000006))
-		return;
-
-	per_cpu(cpu_llc_id, cpu) = c->apicid >> 3;
 #endif
 }
 
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH v2 1/2] x86/AMD: Fix cpu_llc_id for AMD Fam17h systems
  2016-10-28 15:51 [PATCH v2 1/2] x86/AMD: Fix cpu_llc_id for AMD Fam17h systems Yazen Ghannam
  2016-10-28 15:51 ` [PATCH v2 2/2] x86/AMD: Group cpu_llc_id assignment by topology feature and family Yazen Ghannam
@ 2016-10-28 16:57 ` Borislav Petkov
  2016-10-31 19:15   ` Yazen Ghannam
  1 sibling, 1 reply; 7+ messages in thread
From: Borislav Petkov @ 2016-10-28 16:57 UTC (permalink / raw)
  To: Yazen Ghannam; +Cc: linux-kernel, x86, stable

On Fri, Oct 28, 2016 at 10:51:57AM -0500, Yazen Ghannam wrote:
> The current Fam17h cpu_llc_id  derivation has an underflow bug when
				^
			(Last Level Cache ID)

Let's write it out the first time.

> extracting the socket_id value. The socket_id value starts from 0, so
> subtracting 1 will result in an underflow. This breaks scheduling topology
> later on since the cpu_llc_id will be incorrect.
> 
> The apicid decoding is fixed for bits 3 and above,
						    ^
"is fixed ... in register... "

> which give the core
> complex, node and socket IDs. The LLC is at the core complex level so we
> can find a unique cpu_llc_id by right shifting the apicid by 3.

"... because then the LSBit will be the Core Complex ID."

> We can fix the underflow bug and simplify the code by replacing the
> current cpu_llc_id derivation with a right shift.
> 
> Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com>
> Cc: <stable@vger.kernel.org> # v4.4..
> Fixes: 3849e91f571d ("x86/AMD: Fix last level cache topology for AMD Fam17h systems")
> ---

I'll run it soon.

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
-- 

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2 2/2] x86/AMD: Group cpu_llc_id assignment by topology feature and family
  2016-10-28 15:51 ` [PATCH v2 2/2] x86/AMD: Group cpu_llc_id assignment by topology feature and family Yazen Ghannam
@ 2016-10-28 19:56   ` Borislav Petkov
  2016-10-31 19:22     ` Yazen Ghannam
  0 siblings, 1 reply; 7+ messages in thread
From: Borislav Petkov @ 2016-10-28 19:56 UTC (permalink / raw)
  To: Yazen Ghannam; +Cc: linux-kernel, x86

On Fri, Oct 28, 2016 at 10:51:58AM -0500, Yazen Ghannam wrote:
> Currently, we assume that a system has multiple last level caches only if
> there are multiple nodes, and that the cpu_llc_id is equal to the node_id.
> This no longer applies since Fam17h can have multiple last level caches
> within a node.
> 
> So group the cpu_llc_id assignment by topology feature and family.

That's a very nice intro, good.

> The NODEID_MSR feature only applies to Fam10h in which case the llc is at

s/llc/LLC (Last Level Cache/

Let's try to have abbreviations written out in their first mention in the text.

> the node level.
> 
> The TOPOEXT feature is used on families 15h, 16h and 17h. So far we only
> see multiple last level caches if L3 caches are available. Otherwise, the
> cpu_llc_id will default to be the phys_proc_id. We have L3 caches only on
> families 15h and 17h. On Fam15h, the llc is at the node level. On Fam17h,

s/llc/LLC/g

> the llc is at the core complex level and can be found by right shifting the
      ^^^

LLC

> apicid. Also, keep the family checks explicit so that new families will
> fall back to the default.
> 
> Single node systems in families 10h and 15h will have a Node ID of 0 which
> will be the same as the phys_proc_id, so we don't need to check for
> multiple nodes before using the node_id.
> 
> Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com>
> ---
>  arch/x86/kernel/cpu/amd.c | 32 ++++++++++++++++++++------------
>  1 file changed, 20 insertions(+), 12 deletions(-)
> 
> diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
> index c3fc337..be70345 100644
> --- a/arch/x86/kernel/cpu/amd.c
> +++ b/arch/x86/kernel/cpu/amd.c
> @@ -314,11 +314,31 @@ static void amd_get_topology(struct cpuinfo_x86 *c)
>  		smp_num_siblings = ((ebx >> 8) & 3) + 1;
>  		c->x86_max_cores /= smp_num_siblings;
>  		c->cpu_core_id = ebx & 0xff;
> +
> +		/*
> +		 * We may have multiple LLCs if L3 caches exist, so check if we
> +		 * have an L3 cache by looking at the L3 cache cpuid leaf.

x86 instructions in caps please: CPUID

> +		 */
> +		if (cpuid_edx(0x80000006)) {
> +			if (c->x86 == 0x15) {
> +				/* LLC is at the node level. */
> +				per_cpu(cpu_llc_id, cpu) = node_id;
> +
> +			} else if (c->x86 == 0x17) {

				How about >= ?

> +				/*
> +				 * LLC is at the core complex level.
> +				 * Core complex id is ApicId[3].
> +				 */
> +				per_cpu(cpu_llc_id, cpu) = c->apicid >> 3;
> +			}
> +		}
>  	} else if (cpu_has(c, X86_FEATURE_NODEID_MSR)) {
>  		u64 value;
>  
>  		rdmsrl(MSR_FAM10H_NODE_ID, value);
>  		node_id = value & 7;
> +
> +		per_cpu(cpu_llc_id, cpu) = node_id;
>  	} else
>  		return;

Btw, please add for your next submission:

Tested-by: Borislav Petkov <bp@suse.de>

Thanks.

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
-- 

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2 1/2] x86/AMD: Fix cpu_llc_id for AMD Fam17h systems
  2016-10-28 16:57 ` [PATCH v2 1/2] x86/AMD: Fix cpu_llc_id for AMD Fam17h systems Borislav Petkov
@ 2016-10-31 19:15   ` Yazen Ghannam
  0 siblings, 0 replies; 7+ messages in thread
From: Yazen Ghannam @ 2016-10-31 19:15 UTC (permalink / raw)
  To: Borislav Petkov; +Cc: linux-kernel, x86, stable

> > The current Fam17h cpu_llc_id  derivation has an underflow bug when
> 				^
> 			(Last Level Cache ID)
> 
> Let's write it out the first time.
> 

Ack.

> > 
> > The apicid decoding is fixed for bits 3 and above,
> 						    ^
> "is fixed ... in register... "
> 

Ack.

> > which give the core
> > complex, node and socket IDs. The LLC is at the core complex level so we
> > can find a unique cpu_llc_id by right shifting the apicid by 3.
> 
> "... because then the LSBit will be the Core Complex ID."
> 

Ack.

> > We can fix the underflow bug and simplify the code by replacing the
> > current cpu_llc_id derivation with a right shift.
> > 
> > Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com>
> > Cc: <stable@vger.kernel.org> # v4.4..
> > Fixes: 3849e91f571d ("x86/AMD: Fix last level cache topology for AMD Fam17h systems")
> > ---
> 
> I'll run it soon.
>

Thanks,
Yazen 

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2 2/2] x86/AMD: Group cpu_llc_id assignment by topology feature and family
  2016-10-28 19:56   ` Borislav Petkov
@ 2016-10-31 19:22     ` Yazen Ghannam
  2016-10-31 21:55       ` Borislav Petkov
  0 siblings, 1 reply; 7+ messages in thread
From: Yazen Ghannam @ 2016-10-31 19:22 UTC (permalink / raw)
  To: Borislav Petkov; +Cc: linux-kernel, x86

> 
> > The NODEID_MSR feature only applies to Fam10h in which case the llc is at
> 
> s/llc/LLC (Last Level Cache/
> 
> Let's try to have abbreviations written out in their first mention in the text.
> 

Okay.

> > the node level.
> > 
> > The TOPOEXT feature is used on families 15h, 16h and 17h. So far we only
> > see multiple last level caches if L3 caches are available. Otherwise, the
> > cpu_llc_id will default to be the phys_proc_id. We have L3 caches only on
> > families 15h and 17h. On Fam15h, the llc is at the node level. On Fam17h,
> 
> s/llc/LLC/g
> 

Ack.

> > the llc is at the core complex level and can be found by right shifting the
>       ^^^
> 
> LLC
> 

Ack.

> > +
> > +		/*
> > +		 * We may have multiple LLCs if L3 caches exist, so check if we
> > +		 * have an L3 cache by looking at the L3 cache cpuid leaf.
> 
> x86 instructions in caps please: CPUID
>

Ack.
 
> > +		 */
> > +		if (cpuid_edx(0x80000006)) {
> > +			if (c->x86 == 0x15) {
> > +				/* LLC is at the node level. */
> > +				per_cpu(cpu_llc_id, cpu) = node_id;
> > +
> > +			} else if (c->x86 == 0x17) {
> 
> 				How about >= ?
>

This APICID format is only valid for Fam17h. What I'm going for is that
we fall back to a sensible default if we don't have a better assignment
for a new family. At first I thought that phys_proc_id would be good but
now I think node_id is better as a sensible default. I'll make this change
in the V3 set.

> 
> Btw, please add for your next submission:
> 
> Tested-by: Borislav Petkov <bp@suse.de>
> 

For both patches?

Thanks,
Yazen

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2 2/2] x86/AMD: Group cpu_llc_id assignment by topology feature and family
  2016-10-31 19:22     ` Yazen Ghannam
@ 2016-10-31 21:55       ` Borislav Petkov
  0 siblings, 0 replies; 7+ messages in thread
From: Borislav Petkov @ 2016-10-31 21:55 UTC (permalink / raw)
  To: Yazen Ghannam; +Cc: linux-kernel, x86

On Mon, Oct 31, 2016 at 03:22:36PM -0400, Yazen Ghannam wrote:
> For both patches?

Yap, I ran them both. But since you're going to change stuff again, I'll
run v3 too so don't add the tag yet.

Thanks.

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
-- 

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2016-10-31 21:55 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-10-28 15:51 [PATCH v2 1/2] x86/AMD: Fix cpu_llc_id for AMD Fam17h systems Yazen Ghannam
2016-10-28 15:51 ` [PATCH v2 2/2] x86/AMD: Group cpu_llc_id assignment by topology feature and family Yazen Ghannam
2016-10-28 19:56   ` Borislav Petkov
2016-10-31 19:22     ` Yazen Ghannam
2016-10-31 21:55       ` Borislav Petkov
2016-10-28 16:57 ` [PATCH v2 1/2] x86/AMD: Fix cpu_llc_id for AMD Fam17h systems Borislav Petkov
2016-10-31 19:15   ` Yazen Ghannam

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).