* [PATCH v3 1/2] x86/AMD: Fix cpu_llc_id for AMD Fam17h systems @ 2016-11-01 16:51 Yazen Ghannam 2016-11-01 16:51 ` [PATCH v3 2/2] x86/AMD: Group cpu_llc_id assignment by topology feature and family Yazen Ghannam 2016-11-02 20:13 ` [PATCH v3 1/2] x86/AMD: Fix cpu_llc_id for AMD Fam17h systems Borislav Petkov 0 siblings, 2 replies; 18+ messages in thread From: Yazen Ghannam @ 2016-11-01 16:51 UTC (permalink / raw) To: linux-kernel; +Cc: x86, bp, Yazen Ghannam, stable The current Fam17h cpu_llc_id (Last Level Cache ID) derivation has an underflow bug when extracting the socket_id value. The socket_id value starts from 0, so subtracting 1 will result in an underflow. This breaks scheduling topology later on since the cpu_llc_id will be incorrect. The APICID decoding is fixed, in register, for bits 3 and above, which give the core complex, node and socket IDs. The LLC is at the core complex level so we can find a unique cpu_llc_id by right shifting the APICID by 3 because then the least significant bit will be the Core Complex ID. We can fix the underflow bug and simplify the code by replacing the current cpu_llc_id derivation with a right shift. Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com> Cc: <stable@vger.kernel.org> # v4.4.. Fixes: 3849e91f571d ("x86/AMD: Fix last level cache topology for AMD Fam17h systems") --- Link: http://lkml.kernel.org/r/1477669918-56261-1-git-send-email-Yazen.Ghannam@amd.com v2->v3: * Fixup commit message based on comments. arch/x86/kernel/cpu/amd.c | 6 +----- 1 file changed, 1 insertion(+), 5 deletions(-) diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c index b81fe2d..1e81a37 100644 --- a/arch/x86/kernel/cpu/amd.c +++ b/arch/x86/kernel/cpu/amd.c @@ -347,7 +347,6 @@ static void amd_detect_cmp(struct cpuinfo_x86 *c) #ifdef CONFIG_SMP unsigned bits; int cpu = smp_processor_id(); - unsigned int socket_id, core_complex_id; bits = c->x86_coreid_bits; /* Low order bits define the core id (index of core in socket) */ @@ -365,10 +364,7 @@ static void amd_detect_cmp(struct cpuinfo_x86 *c) if (c->x86 != 0x17 || !cpuid_edx(0x80000006)) return; - socket_id = (c->apicid >> bits) - 1; - core_complex_id = (c->apicid & ((1 << bits) - 1)) >> 3; - - per_cpu(cpu_llc_id, cpu) = (socket_id << 3) | core_complex_id; + per_cpu(cpu_llc_id, cpu) = c->apicid >> 3; #endif } -- 1.9.1 ^ permalink raw reply related [flat|nested] 18+ messages in thread
* [PATCH v3 2/2] x86/AMD: Group cpu_llc_id assignment by topology feature and family 2016-11-01 16:51 [PATCH v3 1/2] x86/AMD: Fix cpu_llc_id for AMD Fam17h systems Yazen Ghannam @ 2016-11-01 16:51 ` Yazen Ghannam 2016-11-02 20:15 ` Borislav Petkov 2016-11-02 20:13 ` [PATCH v3 1/2] x86/AMD: Fix cpu_llc_id for AMD Fam17h systems Borislav Petkov 1 sibling, 1 reply; 18+ messages in thread From: Yazen Ghannam @ 2016-11-01 16:51 UTC (permalink / raw) To: linux-kernel; +Cc: x86, bp, Yazen Ghannam Currently, we assume that a system has multiple last level caches only if there are multiple nodes, and that the cpu_llc_id is equal to the node_id. This no longer applies since Fam17h can have multiple last level caches within a node. So group the cpu_llc_id assignment by topology feature and family. The NODEID_MSR feature only applies to Fam10h in which case the LLC (Last Level Cache) is at the node level. The TOPOEXT feature is used on families 15h, 16h and 17h. So far we only see multiple last level caches if L3 caches are available. Otherwise, the cpu_llc_id will default to be the phys_proc_id. We have L3 caches only on families 15h and 17h. On Fam15h, the LLC is at the node level. On Fam17h, the LLC is at the core complex level and can be found by right shifting the apicid. Also, keep the family checks explicit so that new families will fall back to the default, which will be node_id for TOPOEXT systems. Single node systems in families 10h and 15h will have a Node ID of 0 which will be the same as the phys_proc_id, so we don't need to check for multiple nodes before using the node_id. Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com> --- Link: http://lkml.kernel.org/r/1477669918-56261-2-git-send-email-Yazen.Ghannam@amd.com v2->v3: * Fixup commit message based on comments. * Make node_id the default cpu_llc_id for TOPOEXT systems. arch/x86/kernel/cpu/amd.c | 31 +++++++++++++++++++------------ 1 file changed, 19 insertions(+), 12 deletions(-) diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c index 1e81a37..4daad1e 100644 --- a/arch/x86/kernel/cpu/amd.c +++ b/arch/x86/kernel/cpu/amd.c @@ -314,11 +314,30 @@ static void amd_get_topology(struct cpuinfo_x86 *c) smp_num_siblings = ((ebx >> 8) & 3) + 1; c->x86_max_cores /= smp_num_siblings; c->cpu_core_id = ebx & 0xff; + + /* + * We may have multiple LLCs if L3 caches exist, so check if we + * have an L3 cache by looking at the L3 cache CPUID leaf. + */ + if (cpuid_edx(0x80000006)) { + if (c->x86 == 0x17) { + /* + * LLC is at the core complex level. + * Core complex id is ApicId[3]. + */ + per_cpu(cpu_llc_id, cpu) = c->apicid >> 3; + } else { + /* LLC is at the node level. */ + per_cpu(cpu_llc_id, cpu) = node_id; + } + } } else if (cpu_has(c, X86_FEATURE_NODEID_MSR)) { u64 value; rdmsrl(MSR_FAM10H_NODE_ID, value); node_id = value & 7; + + per_cpu(cpu_llc_id, cpu) = node_id; } else return; @@ -329,9 +348,6 @@ static void amd_get_topology(struct cpuinfo_x86 *c) set_cpu_cap(c, X86_FEATURE_AMD_DCM); cus_per_node = c->x86_max_cores / nodes_per_socket; - /* store NodeID, use llc_shared_map to store sibling info */ - per_cpu(cpu_llc_id, cpu) = node_id; - /* core id has to be in the [0 .. cores_per_node - 1] range */ c->cpu_core_id %= cus_per_node; } @@ -356,15 +372,6 @@ static void amd_detect_cmp(struct cpuinfo_x86 *c) /* use socket ID also for last level cache */ per_cpu(cpu_llc_id, cpu) = c->phys_proc_id; amd_get_topology(c); - - /* - * Fix percpu cpu_llc_id here as LLC topology is different - * for Fam17h systems. - */ - if (c->x86 != 0x17 || !cpuid_edx(0x80000006)) - return; - - per_cpu(cpu_llc_id, cpu) = c->apicid >> 3; #endif } -- 1.9.1 ^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: [PATCH v3 2/2] x86/AMD: Group cpu_llc_id assignment by topology feature and family 2016-11-01 16:51 ` [PATCH v3 2/2] x86/AMD: Group cpu_llc_id assignment by topology feature and family Yazen Ghannam @ 2016-11-02 20:15 ` Borislav Petkov 2016-11-07 7:29 ` Ingo Molnar 0 siblings, 1 reply; 18+ messages in thread From: Borislav Petkov @ 2016-11-02 20:15 UTC (permalink / raw) To: x86; +Cc: Yazen Ghannam, linux-kernel Ditto, clean it up and add tags: --- From: Yazen Ghannam <Yazen.Ghannam@amd.com> Date: Tue, 1 Nov 2016 11:51:03 -0500 Subject: [PATCH] x86/AMD: Group cpu_llc_id assignment Currently, we assume that a system has multiple last level caches only if there are multiple nodes, and that the cpu_llc_id is equal to the node_id. This no longer applies since Fam17h can have multiple last level caches within a node. So group the cpu_llc_id assignment by topology feature and family. The NODEID_MSR feature only applies to Fam10h in which case the LLC (Last Level Cache) is at the node level. The TOPOEXT feature is used on families 15h, 16h and 17h. So far we only see multiple last level caches if L3 caches are available. Otherwise, the cpu_llc_id will default to be the phys_proc_id. We have L3 caches only on families 15h and 17h: - on Fam15h, the LLC is at the node level. - on Fam17h, the LLC is at the core complex level and can be found by right shifting the APIC ID. Also, keep the family checks explicit so that new families will fall back to the default, which will be node_id for TOPOEXT systems. Single node systems in families 10h and 15h will have a Node ID of 0 which will be the same as the phys_proc_id, so we don't need to check for multiple nodes before using the node_id. Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com> Tested-by: Borislav Petkov <bp@suse.de> Cc: Aravind Gopalakrishnan <aravindksg.lkml@gmail.com> Cc: x86-ml <x86@kernel.org> Link: http://lkml.kernel.org/r/1478019063-2632-2-git-send-email-Yazen.Ghannam@amd.com [ Boris: cleanup commit message. ] Signed-off-by: Borislav Petkov <bp@suse.de> --- arch/x86/kernel/cpu/amd.c | 31 +++++++++++++++++++------------ 1 file changed, 19 insertions(+), 12 deletions(-) diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c index 1e81a37c034e..4daad1e39352 100644 --- a/arch/x86/kernel/cpu/amd.c +++ b/arch/x86/kernel/cpu/amd.c @@ -314,11 +314,30 @@ static void amd_get_topology(struct cpuinfo_x86 *c) smp_num_siblings = ((ebx >> 8) & 3) + 1; c->x86_max_cores /= smp_num_siblings; c->cpu_core_id = ebx & 0xff; + + /* + * We may have multiple LLCs if L3 caches exist, so check if we + * have an L3 cache by looking at the L3 cache CPUID leaf. + */ + if (cpuid_edx(0x80000006)) { + if (c->x86 == 0x17) { + /* + * LLC is at the core complex level. + * Core complex id is ApicId[3]. + */ + per_cpu(cpu_llc_id, cpu) = c->apicid >> 3; + } else { + /* LLC is at the node level. */ + per_cpu(cpu_llc_id, cpu) = node_id; + } + } } else if (cpu_has(c, X86_FEATURE_NODEID_MSR)) { u64 value; rdmsrl(MSR_FAM10H_NODE_ID, value); node_id = value & 7; + + per_cpu(cpu_llc_id, cpu) = node_id; } else return; @@ -329,9 +348,6 @@ static void amd_get_topology(struct cpuinfo_x86 *c) set_cpu_cap(c, X86_FEATURE_AMD_DCM); cus_per_node = c->x86_max_cores / nodes_per_socket; - /* store NodeID, use llc_shared_map to store sibling info */ - per_cpu(cpu_llc_id, cpu) = node_id; - /* core id has to be in the [0 .. cores_per_node - 1] range */ c->cpu_core_id %= cus_per_node; } @@ -356,15 +372,6 @@ static void amd_detect_cmp(struct cpuinfo_x86 *c) /* use socket ID also for last level cache */ per_cpu(cpu_llc_id, cpu) = c->phys_proc_id; amd_get_topology(c); - - /* - * Fix percpu cpu_llc_id here as LLC topology is different - * for Fam17h systems. - */ - if (c->x86 != 0x17 || !cpuid_edx(0x80000006)) - return; - - per_cpu(cpu_llc_id, cpu) = c->apicid >> 3; #endif } -- 2.10.0 SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg) -- ^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: [PATCH v3 2/2] x86/AMD: Group cpu_llc_id assignment by topology feature and family 2016-11-02 20:15 ` Borislav Petkov @ 2016-11-07 7:29 ` Ingo Molnar 2016-11-07 9:22 ` Borislav Petkov 0 siblings, 1 reply; 18+ messages in thread From: Ingo Molnar @ 2016-11-07 7:29 UTC (permalink / raw) To: Borislav Petkov; +Cc: x86, Yazen Ghannam, linux-kernel * Borislav Petkov <bp@suse.de> wrote: > Ditto, clean it up and add tags: > > --- > From: Yazen Ghannam <Yazen.Ghannam@amd.com> > Date: Tue, 1 Nov 2016 11:51:03 -0500 > Subject: [PATCH] x86/AMD: Group cpu_llc_id assignment > > Currently, we assume that a system has multiple last level caches only > if there are multiple nodes, and that the cpu_llc_id is equal to the > node_id. This no longer applies since Fam17h can have multiple last > level caches within a node. Wondering what the practical effect of this is - is any current hardware affected, and if yes, what are the effects of this patch? Thanks, Ingo ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v3 2/2] x86/AMD: Group cpu_llc_id assignment by topology feature and family 2016-11-07 7:29 ` Ingo Molnar @ 2016-11-07 9:22 ` Borislav Petkov 0 siblings, 0 replies; 18+ messages in thread From: Borislav Petkov @ 2016-11-07 9:22 UTC (permalink / raw) To: Ingo Molnar; +Cc: x86, Yazen Ghannam, linux-kernel On Mon, Nov 07, 2016 at 08:29:06AM +0100, Ingo Molnar wrote: > Wondering what the practical effect of this is - is any current > hardware affected, and if yes, what are the effects of this patch? No affect on current hw - just a cleanup. It can go in 4.10. Only the previous one is tagged for stable. Thanks. -- Regards/Gruss, Boris. SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg) -- ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v3 1/2] x86/AMD: Fix cpu_llc_id for AMD Fam17h systems 2016-11-01 16:51 [PATCH v3 1/2] x86/AMD: Fix cpu_llc_id for AMD Fam17h systems Yazen Ghannam 2016-11-01 16:51 ` [PATCH v3 2/2] x86/AMD: Group cpu_llc_id assignment by topology feature and family Yazen Ghannam @ 2016-11-02 20:13 ` Borislav Petkov 2016-11-07 7:31 ` Ingo Molnar 1 sibling, 1 reply; 18+ messages in thread From: Borislav Petkov @ 2016-11-02 20:13 UTC (permalink / raw) To: x86; +Cc: Yazen Ghannam, linux-kernel Lemme clean up the commit message a bit more and add tags: --- From: Yazen Ghannam <Yazen.Ghannam@amd.com> Date: Tue, 1 Nov 2016 11:51:02 -0500 Subject: [PATCH] x86/AMD: Fix cpu_llc_id for AMD Fam17h systems cpu_llc_id (Last Level Cache ID) derivation on AMD Fam17h has an underflow bug when extracting the socket_id value. It starts from 0 so subtracting 1 from it will result in an invalid value. This breaks scheduling topology later on since the cpu_llc_id will be incorrect. The APIC ID is preset in APICx020 for bits 3 and above: they contain the core complex, node and socket IDs. The LLC is at the core complex level so we can find a unique cpu_llc_id by right shifting the APICID by 3 because then the least significant bit will be the Core Complex ID. Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com> Tested-by: Borislav Petkov <bp@suse.de> Cc: Aravind Gopalakrishnan <aravindksg.lkml@gmail.com> Cc: <stable@vger.kernel.org> # v4.4.. Cc: x86-ml <x86@kernel.org> Link: http://lkml.kernel.org/r/1478019063-2632-1-git-send-email-Yazen.Ghannam@amd.com Fixes: 3849e91f571d ("x86/AMD: Fix last level cache topology for AMD Fam17h systems") [ Boris: cleanup commit message. ] Signed-off-by: Borislav Petkov <bp@suse.de> --- arch/x86/kernel/cpu/amd.c | 6 +----- 1 file changed, 1 insertion(+), 5 deletions(-) diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c index b81fe2d63e15..1e81a37c034e 100644 --- a/arch/x86/kernel/cpu/amd.c +++ b/arch/x86/kernel/cpu/amd.c @@ -347,7 +347,6 @@ static void amd_detect_cmp(struct cpuinfo_x86 *c) #ifdef CONFIG_SMP unsigned bits; int cpu = smp_processor_id(); - unsigned int socket_id, core_complex_id; bits = c->x86_coreid_bits; /* Low order bits define the core id (index of core in socket) */ @@ -365,10 +364,7 @@ static void amd_detect_cmp(struct cpuinfo_x86 *c) if (c->x86 != 0x17 || !cpuid_edx(0x80000006)) return; - socket_id = (c->apicid >> bits) - 1; - core_complex_id = (c->apicid & ((1 << bits) - 1)) >> 3; - - per_cpu(cpu_llc_id, cpu) = (socket_id << 3) | core_complex_id; + per_cpu(cpu_llc_id, cpu) = c->apicid >> 3; #endif } -- 2.10.0 SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg) -- ^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: [PATCH v3 1/2] x86/AMD: Fix cpu_llc_id for AMD Fam17h systems 2016-11-02 20:13 ` [PATCH v3 1/2] x86/AMD: Fix cpu_llc_id for AMD Fam17h systems Borislav Petkov @ 2016-11-07 7:31 ` Ingo Molnar 2016-11-07 9:20 ` Borislav Petkov 0 siblings, 1 reply; 18+ messages in thread From: Ingo Molnar @ 2016-11-07 7:31 UTC (permalink / raw) To: Borislav Petkov; +Cc: x86, Yazen Ghannam, linux-kernel, Peter Zijlstra * Borislav Petkov <bp@suse.de> wrote: > Lemme clean up the commit message a bit more and add tags: > > --- > From: Yazen Ghannam <Yazen.Ghannam@amd.com> > Date: Tue, 1 Nov 2016 11:51:02 -0500 > Subject: [PATCH] x86/AMD: Fix cpu_llc_id for AMD Fam17h systems > > cpu_llc_id (Last Level Cache ID) derivation on AMD Fam17h has an > underflow bug when extracting the socket_id value. It starts from 0 > so subtracting 1 from it will result in an invalid value. This breaks > scheduling topology later on since the cpu_llc_id will be incorrect. > > The APIC ID is preset in APICx020 for bits 3 and above: they contain the > core complex, node and socket IDs. > > The LLC is at the core complex level so we can find a unique cpu_llc_id > by right shifting the APICID by 3 because then the least significant bit > will be the Core Complex ID. Same question as for the previous patch: what are the effects of the bug: - Outright bad scheduling? - Suboptimal scheduling? - Crash? - Something else? Such information needs to be in commit messages, _especially_ when a Cc: stable tag is added as well. Thanks, Ingo ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v3 1/2] x86/AMD: Fix cpu_llc_id for AMD Fam17h systems 2016-11-07 7:31 ` Ingo Molnar @ 2016-11-07 9:20 ` Borislav Petkov 2016-11-07 14:07 ` Ingo Molnar 0 siblings, 1 reply; 18+ messages in thread From: Borislav Petkov @ 2016-11-07 9:20 UTC (permalink / raw) To: Ingo Molnar; +Cc: x86, Yazen Ghannam, linux-kernel, Peter Zijlstra On Mon, Nov 07, 2016 at 08:31:21AM +0100, Ingo Molnar wrote: > > cpu_llc_id (Last Level Cache ID) derivation on AMD Fam17h has an > > underflow bug when extracting the socket_id value. It starts from 0 How's this... > > so subtracting 1 from it will result in an invalid value. This breaks > > scheduling topology later on since the cpu_llc_id will be incorrect. ^^^^^^^^^^^^^^^^^^^ ... here? > Same question as for the previous patch: what are the effects of the bug: See above. -- Regards/Gruss, Boris. SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg) -- ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v3 1/2] x86/AMD: Fix cpu_llc_id for AMD Fam17h systems 2016-11-07 9:20 ` Borislav Petkov @ 2016-11-07 14:07 ` Ingo Molnar 2016-11-07 15:56 ` Borislav Petkov 0 siblings, 1 reply; 18+ messages in thread From: Ingo Molnar @ 2016-11-07 14:07 UTC (permalink / raw) To: Borislav Petkov; +Cc: x86, Yazen Ghannam, linux-kernel, Peter Zijlstra * Borislav Petkov <bp@suse.de> wrote: > On Mon, Nov 07, 2016 at 08:31:21AM +0100, Ingo Molnar wrote: > > > cpu_llc_id (Last Level Cache ID) derivation on AMD Fam17h has an > > > underflow bug when extracting the socket_id value. It starts from 0 > > How's this... > > > > so subtracting 1 from it will result in an invalid value. This breaks > > > scheduling topology later on since the cpu_llc_id will be incorrect. > ^^^^^^^^^^^^^^^^^^^ > ... here? > > > Same question as for the previous patch: what are the effects of the bug: > > See above. There's many ways the scheduling topology can 'break', resulting in different effects: - scheduling domains might be mixed up to the extent of crashing the bootup - some cores might be missing altogether, reducing available CPUs in essence - cache domains might be seriously mixed up, resulting in serious drop in performance. - or domains might be partitioned 'wrong' but not catastrophically wrong, resulting in a minor performance drop (if at all) ... do we know which of these occurs in this situation? Thanks, Ingo ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v3 1/2] x86/AMD: Fix cpu_llc_id for AMD Fam17h systems 2016-11-07 14:07 ` Ingo Molnar @ 2016-11-07 15:56 ` Borislav Petkov 2016-11-08 6:31 ` Ingo Molnar 0 siblings, 1 reply; 18+ messages in thread From: Borislav Petkov @ 2016-11-07 15:56 UTC (permalink / raw) To: Ingo Molnar; +Cc: x86, Yazen Ghannam, linux-kernel, Peter Zijlstra [-- Attachment #1: Type: text/plain, Size: 1660 bytes --] On Mon, Nov 07, 2016 at 03:07:46PM +0100, Ingo Molnar wrote: > - cache domains might be seriously mixed up, resulting in serious drop in > performance. > > - or domains might be partitioned 'wrong' but not catastrophically > wrong, resulting in a minor performance drop (if at all) Something between the two. Here's some debugging output from set_cpu_sibling_map(): [ 0.202033] smpboot: set_cpu_sibling_map: cpu: 0, has_smt: 0, has_mp: 1 [ 0.202043] smpboot: set_cpu_sibling_map: first loop, llc(this): 65528, o: 0, llc(o): 65528 [ 0.202058] smpboot: set_cpu_sibling_map: first loop, link mask smt so we link it into the SMT mask even if has_smt is off. [ 0.202067] smpboot: set_cpu_sibling_map: first loop, link mask llc [ 0.202077] smpboot: set_cpu_sibling_map: second loop, llc(this): 65528, o: 0, llc(o): 65528 [ 0.202091] smpboot: set_cpu_sibling_map: second loop, link mask die I've attached the debug diff. And since those llc(o), i.e. the cpu_llc_id of the *other* CPU in the loops in set_cpu_sibling_map() underflows, we're generating the funniest thread_siblings masks and then when I run 8 threads of nbench, they get spread around the LLC domains in a very strange pattern which doesn't give you the normal scheduling spread one would expect for performance. And this is just one workload - I can't imagine what else might be influenced by this funkiness. Oh and other things like EDAC use cpu_llc_id so they will be b0rked too. So we absolutely need to fix that cpu_llc_id thing. -- Regards/Gruss, Boris. SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg) -- [-- Attachment #2: set_cpu_sibling_map_debug.diff --] [-- Type: text/x-diff, Size: 1596 bytes --] diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c index 601d2b331350..5974098d8266 100644 --- a/arch/x86/kernel/smpboot.c +++ b/arch/x86/kernel/smpboot.c @@ -506,6 +506,9 @@ void set_cpu_sibling_map(int cpu) struct cpuinfo_x86 *o; int i, threads; + pr_info("%s: cpu: %d, has_smt: %d, has_mp: %d\n", + __func__, cpu, has_smt, has_mp); + cpumask_set_cpu(cpu, cpu_sibling_setup_mask); if (!has_mp) { @@ -519,11 +522,19 @@ void set_cpu_sibling_map(int cpu) for_each_cpu(i, cpu_sibling_setup_mask) { o = &cpu_data(i); - if ((i == cpu) || (has_smt && match_smt(c, o))) + pr_info("%s: first loop, llc(this): %d, o: %d, llc(o): %d\n", + __func__, per_cpu(cpu_llc_id, cpu), + o->cpu_index, per_cpu(cpu_llc_id, o->cpu_index)); + + if ((i == cpu) || (has_smt && match_smt(c, o))) { + pr_info("%s: first loop, link mask smt\n", __func__); link_mask(topology_sibling_cpumask, cpu, i); + } - if ((i == cpu) || (has_mp && match_llc(c, o))) + if ((i == cpu) || (has_mp && match_llc(c, o))) { + pr_info("%s: first loop, link mask llc\n", __func__); link_mask(cpu_llc_shared_mask, cpu, i); + } } @@ -534,7 +545,12 @@ void set_cpu_sibling_map(int cpu) for_each_cpu(i, cpu_sibling_setup_mask) { o = &cpu_data(i); + pr_info("%s: second loop, llc(this): %d, o: %d, llc(o): %d\n", + __func__, per_cpu(cpu_llc_id, cpu), + o->cpu_index, per_cpu(cpu_llc_id, o->cpu_index)); + if ((i == cpu) || (has_mp && match_die(c, o))) { + pr_info("%s: second loop, link mask die\n", __func__); link_mask(topology_core_cpumask, cpu, i); /* ^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: [PATCH v3 1/2] x86/AMD: Fix cpu_llc_id for AMD Fam17h systems 2016-11-07 15:56 ` Borislav Petkov @ 2016-11-08 6:31 ` Ingo Molnar 2016-11-08 8:35 ` Borislav Petkov 0 siblings, 1 reply; 18+ messages in thread From: Ingo Molnar @ 2016-11-08 6:31 UTC (permalink / raw) To: Borislav Petkov; +Cc: x86, Yazen Ghannam, linux-kernel, Peter Zijlstra * Borislav Petkov <bp@suse.de> wrote: > On Mon, Nov 07, 2016 at 03:07:46PM +0100, Ingo Molnar wrote: > > - cache domains might be seriously mixed up, resulting in serious drop in > > performance. > > > > - or domains might be partitioned 'wrong' but not catastrophically > > wrong, resulting in a minor performance drop (if at all) > > Something between the two. > > Here's some debugging output from set_cpu_sibling_map(): > > [ 0.202033] smpboot: set_cpu_sibling_map: cpu: 0, has_smt: 0, has_mp: 1 > [ 0.202043] smpboot: set_cpu_sibling_map: first loop, llc(this): 65528, o: 0, llc(o): 65528 > [ 0.202058] smpboot: set_cpu_sibling_map: first loop, link mask smt > > so we link it into the SMT mask even if has_smt is off. > > [ 0.202067] smpboot: set_cpu_sibling_map: first loop, link mask llc > [ 0.202077] smpboot: set_cpu_sibling_map: second loop, llc(this): 65528, o: 0, llc(o): 65528 > [ 0.202091] smpboot: set_cpu_sibling_map: second loop, link mask die > > I've attached the debug diff. > > And since those llc(o), i.e. the cpu_llc_id of the *other* CPU in the > loops in set_cpu_sibling_map() underflows, we're generating the funniest > thread_siblings masks and then when I run 8 threads of nbench, they get > spread around the LLC domains in a very strange pattern which doesn't > give you the normal scheduling spread one would expect for performance. > > And this is just one workload - I can't imagine what else might be > influenced by this funkiness. > > Oh and other things like EDAC use cpu_llc_id so they will be b0rked too. So the point I tried to make is that to people doing -stable backporting decisions this description you just gave is much more valuable than the previous changelog. > So we absolutely need to fix that cpu_llc_id thing. Absolutely! Thanks, Ingo ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v3 1/2] x86/AMD: Fix cpu_llc_id for AMD Fam17h systems 2016-11-08 6:31 ` Ingo Molnar @ 2016-11-08 8:35 ` Borislav Petkov 2016-11-08 10:29 ` Ingo Molnar 2016-11-09 16:13 ` [tip:x86/urgent] x86/cpu/AMD: Fix cpu_llc_id for AMD Fam17h systems tip-bot for Yazen Ghannam 0 siblings, 2 replies; 18+ messages in thread From: Borislav Petkov @ 2016-11-08 8:35 UTC (permalink / raw) To: Ingo Molnar; +Cc: x86, Yazen Ghannam, linux-kernel, Peter Zijlstra On Tue, Nov 08, 2016 at 07:31:45AM +0100, Ingo Molnar wrote: > So the point I tried to make is that to people doing -stable > backporting decisions this description you just gave is much more > valuable than the previous changelog. Ok, how's that below? I've integrated the gist of it in the commit message: --- From: Yazen Ghannam <Yazen.Ghannam@amd.com> Date: Tue, 1 Nov 2016 11:51:02 -0500 Subject: [PATCH] x86/AMD: Fix cpu_llc_id for AMD Fam17h systems cpu_llc_id (Last Level Cache ID) derivation on AMD Fam17h has an underflow bug when extracting the socket_id value. It starts from 0 so subtracting 1 from it will result in an invalid value. This breaks scheduling topology later on since the cpu_llc_id will be incorrect. For example, the the cpu_llc_id of the *other* CPU in the loops in set_cpu_sibling_map() underflows and we're generating the funniest thread_siblings masks and then when I run 8 threads of nbench, they get spread around the LLC domains in a very strange pattern which doesn't give you the normal scheduling spread one would expect for performance. Other things like EDAC use cpu_llc_id so they will be b0rked too. So, the APIC ID is preset in APICx020 for bits 3 and above: they contain the core complex, node and socket IDs. The LLC is at the core complex level so we can find a unique cpu_llc_id by right shifting the APICID by 3 because then the least significant bit will be the Core Complex ID. Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com> Tested-by: Borislav Petkov <bp@suse.de> Cc: Aravind Gopalakrishnan <aravindksg.lkml@gmail.com> Cc: <stable@vger.kernel.org> # v4.4.. Cc: x86-ml <x86@kernel.org> Link: http://lkml.kernel.org/r/1478019063-2632-1-git-send-email-Yazen.Ghannam@amd.com Fixes: 3849e91f571d ("x86/AMD: Fix last level cache topology for AMD Fam17h systems") [ Boris: cleanup and extend commit message. ] Signed-off-by: Borislav Petkov <bp@suse.de> --- arch/x86/kernel/cpu/amd.c | 6 +----- 1 file changed, 1 insertion(+), 5 deletions(-) diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c index b81fe2d63e15..1e81a37c034e 100644 --- a/arch/x86/kernel/cpu/amd.c +++ b/arch/x86/kernel/cpu/amd.c @@ -347,7 +347,6 @@ static void amd_detect_cmp(struct cpuinfo_x86 *c) #ifdef CONFIG_SMP unsigned bits; int cpu = smp_processor_id(); - unsigned int socket_id, core_complex_id; bits = c->x86_coreid_bits; /* Low order bits define the core id (index of core in socket) */ @@ -365,10 +364,7 @@ static void amd_detect_cmp(struct cpuinfo_x86 *c) if (c->x86 != 0x17 || !cpuid_edx(0x80000006)) return; - socket_id = (c->apicid >> bits) - 1; - core_complex_id = (c->apicid & ((1 << bits) - 1)) >> 3; - - per_cpu(cpu_llc_id, cpu) = (socket_id << 3) | core_complex_id; + per_cpu(cpu_llc_id, cpu) = c->apicid >> 3; #endif } -- 2.10.0 SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg) -- ^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: [PATCH v3 1/2] x86/AMD: Fix cpu_llc_id for AMD Fam17h systems 2016-11-08 8:35 ` Borislav Petkov @ 2016-11-08 10:29 ` Ingo Molnar 2016-11-08 11:01 ` Borislav Petkov 2016-11-09 16:13 ` [tip:x86/urgent] x86/cpu/AMD: Fix cpu_llc_id for AMD Fam17h systems tip-bot for Yazen Ghannam 1 sibling, 1 reply; 18+ messages in thread From: Ingo Molnar @ 2016-11-08 10:29 UTC (permalink / raw) To: Borislav Petkov; +Cc: x86, Yazen Ghannam, linux-kernel, Peter Zijlstra * Borislav Petkov <bp@suse.de> wrote: > On Tue, Nov 08, 2016 at 07:31:45AM +0100, Ingo Molnar wrote: > > So the point I tried to make is that to people doing -stable > > backporting decisions this description you just gave is much more > > valuable than the previous changelog. > > Ok, how's that below? I've integrated the gist of it in the commit message: Looks good to me! Please also update the second patch with meta information and I can apply them. Thanks, Ingo ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v3 1/2] x86/AMD: Fix cpu_llc_id for AMD Fam17h systems 2016-11-08 10:29 ` Ingo Molnar @ 2016-11-08 11:01 ` Borislav Petkov 2016-11-08 14:54 ` Ingo Molnar 0 siblings, 1 reply; 18+ messages in thread From: Borislav Petkov @ 2016-11-08 11:01 UTC (permalink / raw) To: Ingo Molnar; +Cc: x86, Yazen Ghannam, linux-kernel, Peter Zijlstra On Tue, Nov 08, 2016 at 11:29:09AM +0100, Ingo Molnar wrote: > Looks good to me! Please also update the second patch with meta > information and I can apply them. What exactly do you want to have there? I think the commit message is pretty explanatory. -- Regards/Gruss, Boris. SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg) -- ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v3 1/2] x86/AMD: Fix cpu_llc_id for AMD Fam17h systems 2016-11-08 11:01 ` Borislav Petkov @ 2016-11-08 14:54 ` Ingo Molnar 2016-11-08 15:30 ` Borislav Petkov 0 siblings, 1 reply; 18+ messages in thread From: Ingo Molnar @ 2016-11-08 14:54 UTC (permalink / raw) To: Borislav Petkov; +Cc: x86, Yazen Ghannam, linux-kernel, Peter Zijlstra * Borislav Petkov <bp@suse.de> wrote: > On Tue, Nov 08, 2016 at 11:29:09AM +0100, Ingo Molnar wrote: > > Looks good to me! Please also update the second patch with meta > > information and I can apply them. > > What exactly do you want to have there? I think the commit message is > pretty explanatory. This one you gave: > No affect on current hw - just a cleanup. Nothing in the existing changelog (including the title) explained that detail. Thanks, Ingo ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v3 1/2] x86/AMD: Fix cpu_llc_id for AMD Fam17h systems 2016-11-08 14:54 ` Ingo Molnar @ 2016-11-08 15:30 ` Borislav Petkov 2016-11-10 8:00 ` [tip:x86/cpu] x86/cpu/AMD: Clean up cpu_llc_id assignment per topology feature tip-bot for Yazen Ghannam 0 siblings, 1 reply; 18+ messages in thread From: Borislav Petkov @ 2016-11-08 15:30 UTC (permalink / raw) To: Ingo Molnar; +Cc: x86, Yazen Ghannam, linux-kernel, Peter Zijlstra On Tue, Nov 08, 2016 at 03:54:55PM +0100, Ingo Molnar wrote: > This one you gave: > > > No affect on current hw - just a cleanup. > > Nothing in the existing changelog (including the title) explained that detail. Sounds to me you want to reintroduce the Impact: tagging. :-))) /me ducks Ok, ok, lemme try again. How's this? --- From: Yazen Ghannam <Yazen.Ghannam@amd.com> Date: Tue, 1 Nov 2016 11:51:03 -0500 Subject: [PATCH] x86/AMD: Group cpu_llc_id assignment per topology feature Currently, we assume that a system has a single Last Level Cache (LLC) per node, and that the cpu_llc_id is thus equal to the node_id. This no longer applies since Fam17h can have multiple last level caches within a node. So group the cpu_llc_id assignment by topology feature and family in order to make the computation of cpu_llc_id on the different families more clear. Here is how the LLC ID is being computed on the different families: The NODEID_MSR feature only applies to Fam10h in which case the LLC is at the node level. The TOPOEXT feature is used on families 15h, 16h and 17h. So far we only see multiple last level caches if L3 caches are available. Otherwise, the cpu_llc_id will default to be the phys_proc_id. We have L3 caches only on families 15h and 17h: - on Fam15h, the LLC is at the node level. - on Fam17h, the LLC is at the core complex level and can be found by right shifting the APIC ID. Also, keep the family checks explicit so that new families will fall back to the default, which will be node_id for TOPOEXT systems. Single node systems in families 10h and 15h will have a Node ID of 0 which will be the same as the phys_proc_id, so we don't need to check for multiple nodes before using the node_id. Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com> Tested-by: Borislav Petkov <bp@suse.de> Cc: Aravind Gopalakrishnan <aravindksg.lkml@gmail.com> Cc: x86-ml <x86@kernel.org> Link: http://lkml.kernel.org/r/1478019063-2632-2-git-send-email-Yazen.Ghannam@amd.com [ Boris: rewrite commit message. ] Signed-off-by: Borislav Petkov <bp@suse.de> --- arch/x86/kernel/cpu/amd.c | 31 +++++++++++++++++++------------ 1 file changed, 19 insertions(+), 12 deletions(-) diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c index 1e81a37c034e..4daad1e39352 100644 --- a/arch/x86/kernel/cpu/amd.c +++ b/arch/x86/kernel/cpu/amd.c @@ -314,11 +314,30 @@ static void amd_get_topology(struct cpuinfo_x86 *c) smp_num_siblings = ((ebx >> 8) & 3) + 1; c->x86_max_cores /= smp_num_siblings; c->cpu_core_id = ebx & 0xff; + + /* + * We may have multiple LLCs if L3 caches exist, so check if we + * have an L3 cache by looking at the L3 cache CPUID leaf. + */ + if (cpuid_edx(0x80000006)) { + if (c->x86 == 0x17) { + /* + * LLC is at the core complex level. + * Core complex id is ApicId[3]. + */ + per_cpu(cpu_llc_id, cpu) = c->apicid >> 3; + } else { + /* LLC is at the node level. */ + per_cpu(cpu_llc_id, cpu) = node_id; + } + } } else if (cpu_has(c, X86_FEATURE_NODEID_MSR)) { u64 value; rdmsrl(MSR_FAM10H_NODE_ID, value); node_id = value & 7; + + per_cpu(cpu_llc_id, cpu) = node_id; } else return; @@ -329,9 +348,6 @@ static void amd_get_topology(struct cpuinfo_x86 *c) set_cpu_cap(c, X86_FEATURE_AMD_DCM); cus_per_node = c->x86_max_cores / nodes_per_socket; - /* store NodeID, use llc_shared_map to store sibling info */ - per_cpu(cpu_llc_id, cpu) = node_id; - /* core id has to be in the [0 .. cores_per_node - 1] range */ c->cpu_core_id %= cus_per_node; } @@ -356,15 +372,6 @@ static void amd_detect_cmp(struct cpuinfo_x86 *c) /* use socket ID also for last level cache */ per_cpu(cpu_llc_id, cpu) = c->phys_proc_id; amd_get_topology(c); - - /* - * Fix percpu cpu_llc_id here as LLC topology is different - * for Fam17h systems. - */ - if (c->x86 != 0x17 || !cpuid_edx(0x80000006)) - return; - - per_cpu(cpu_llc_id, cpu) = c->apicid >> 3; #endif } -- 2.10.0 SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg) -- ^ permalink raw reply related [flat|nested] 18+ messages in thread
* [tip:x86/cpu] x86/cpu/AMD: Clean up cpu_llc_id assignment per topology feature 2016-11-08 15:30 ` Borislav Petkov @ 2016-11-10 8:00 ` tip-bot for Yazen Ghannam 0 siblings, 0 replies; 18+ messages in thread From: tip-bot for Yazen Ghannam @ 2016-11-10 8:00 UTC (permalink / raw) To: linux-tip-commits Cc: peterz, Yazen.Ghannam, linux-kernel, tglx, torvalds, mingo, hpa, bp, aravindksg.lkml Commit-ID: b6a50cddbcbda7105355898ead18f1a647c22520 Gitweb: http://git.kernel.org/tip/b6a50cddbcbda7105355898ead18f1a647c22520 Author: Yazen Ghannam <Yazen.Ghannam@amd.com> AuthorDate: Tue, 8 Nov 2016 16:30:54 +0100 Committer: Ingo Molnar <mingo@kernel.org> CommitDate: Wed, 9 Nov 2016 17:07:43 +0100 x86/cpu/AMD: Clean up cpu_llc_id assignment per topology feature These changes do not affect current hw - just a cleanup: Currently, we assume that a system has a single Last Level Cache (LLC) per node, and that the cpu_llc_id is thus equal to the node_id. This no longer applies since Fam17h can have multiple last level caches within a node. So group the cpu_llc_id assignment by topology feature and family in order to make the computation of cpu_llc_id on the different families more clear. Here is how the LLC ID is being computed on the different families: The NODEID_MSR feature only applies to Fam10h in which case the LLC is at the node level. The TOPOEXT feature is used on families 15h, 16h and 17h. So far we only see multiple last level caches if L3 caches are available. Otherwise, the cpu_llc_id will default to be the phys_proc_id. We have L3 caches only on families 15h and 17h: - on Fam15h, the LLC is at the node level. - on Fam17h, the LLC is at the core complex level and can be found by right shifting the APIC ID. Also, keep the family checks explicit so that new families will fall back to the default, which will be node_id for TOPOEXT systems. Single node systems in families 10h and 15h will have a Node ID of 0 which will be the same as the phys_proc_id, so we don't need to check for multiple nodes before using the node_id. Tested-by: Borislav Petkov <bp@suse.de> Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com> [ Rewrote the commit message. ] Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Aravind Gopalakrishnan <aravindksg.lkml@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20161108153054.bs3sajbyevq6a6uu@pd.tnic Signed-off-by: Ingo Molnar <mingo@kernel.org> --- arch/x86/kernel/cpu/amd.c | 31 +++++++++++++++++++------------ 1 file changed, 19 insertions(+), 12 deletions(-) diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c index 1e81a37..4daad1e 100644 --- a/arch/x86/kernel/cpu/amd.c +++ b/arch/x86/kernel/cpu/amd.c @@ -314,11 +314,30 @@ static void amd_get_topology(struct cpuinfo_x86 *c) smp_num_siblings = ((ebx >> 8) & 3) + 1; c->x86_max_cores /= smp_num_siblings; c->cpu_core_id = ebx & 0xff; + + /* + * We may have multiple LLCs if L3 caches exist, so check if we + * have an L3 cache by looking at the L3 cache CPUID leaf. + */ + if (cpuid_edx(0x80000006)) { + if (c->x86 == 0x17) { + /* + * LLC is at the core complex level. + * Core complex id is ApicId[3]. + */ + per_cpu(cpu_llc_id, cpu) = c->apicid >> 3; + } else { + /* LLC is at the node level. */ + per_cpu(cpu_llc_id, cpu) = node_id; + } + } } else if (cpu_has(c, X86_FEATURE_NODEID_MSR)) { u64 value; rdmsrl(MSR_FAM10H_NODE_ID, value); node_id = value & 7; + + per_cpu(cpu_llc_id, cpu) = node_id; } else return; @@ -329,9 +348,6 @@ static void amd_get_topology(struct cpuinfo_x86 *c) set_cpu_cap(c, X86_FEATURE_AMD_DCM); cus_per_node = c->x86_max_cores / nodes_per_socket; - /* store NodeID, use llc_shared_map to store sibling info */ - per_cpu(cpu_llc_id, cpu) = node_id; - /* core id has to be in the [0 .. cores_per_node - 1] range */ c->cpu_core_id %= cus_per_node; } @@ -356,15 +372,6 @@ static void amd_detect_cmp(struct cpuinfo_x86 *c) /* use socket ID also for last level cache */ per_cpu(cpu_llc_id, cpu) = c->phys_proc_id; amd_get_topology(c); - - /* - * Fix percpu cpu_llc_id here as LLC topology is different - * for Fam17h systems. - */ - if (c->x86 != 0x17 || !cpuid_edx(0x80000006)) - return; - - per_cpu(cpu_llc_id, cpu) = c->apicid >> 3; #endif } ^ permalink raw reply related [flat|nested] 18+ messages in thread
* [tip:x86/urgent] x86/cpu/AMD: Fix cpu_llc_id for AMD Fam17h systems 2016-11-08 8:35 ` Borislav Petkov 2016-11-08 10:29 ` Ingo Molnar @ 2016-11-09 16:13 ` tip-bot for Yazen Ghannam 1 sibling, 0 replies; 18+ messages in thread From: tip-bot for Yazen Ghannam @ 2016-11-09 16:13 UTC (permalink / raw) To: linux-tip-commits Cc: hpa, torvalds, bp, tglx, peterz, linux-kernel, aravindksg.lkml, Yazen.Ghannam, mingo Commit-ID: b0b6e86846093c5f8820386bc01515f857dd8faa Gitweb: http://git.kernel.org/tip/b0b6e86846093c5f8820386bc01515f857dd8faa Author: Yazen Ghannam <Yazen.Ghannam@amd.com> AuthorDate: Tue, 8 Nov 2016 09:35:06 +0100 Committer: Ingo Molnar <mingo@kernel.org> CommitDate: Wed, 9 Nov 2016 17:06:08 +0100 x86/cpu/AMD: Fix cpu_llc_id for AMD Fam17h systems cpu_llc_id (Last Level Cache ID) derivation on AMD Fam17h has an underflow bug when extracting the socket_id value. It starts from 0 so subtracting 1 from it will result in an invalid value. This breaks scheduling topology later on since the cpu_llc_id will be incorrect. For example, the the cpu_llc_id of the *other* CPU in the loops in set_cpu_sibling_map() underflows and we're generating the funniest thread_siblings masks and then when I run 8 threads of nbench, they get spread around the LLC domains in a very strange pattern which doesn't give you the normal scheduling spread one would expect for performance. Other things like EDAC use cpu_llc_id so they will be b0rked too. So, the APIC ID is preset in APICx020 for bits 3 and above: they contain the core complex, node and socket IDs. The LLC is at the core complex level so we can find a unique cpu_llc_id by right shifting the APICID by 3 because then the least significant bit will be the Core Complex ID. Tested-by: Borislav Petkov <bp@suse.de> Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com> [ Cleaned up and extended the commit message. ] Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: <stable@vger.kernel.org> # v4.4.. Cc: Aravind Gopalakrishnan <aravindksg.lkml@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Fixes: 3849e91f571d ("x86/AMD: Fix last level cache topology for AMD Fam17h systems") Link: http://lkml.kernel.org/r/20161108083506.rvqb5h4chrcptj7d@pd.tnic Signed-off-by: Ingo Molnar <mingo@kernel.org> --- arch/x86/kernel/cpu/amd.c | 6 +----- 1 file changed, 1 insertion(+), 5 deletions(-) diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c index b81fe2d..1e81a37 100644 --- a/arch/x86/kernel/cpu/amd.c +++ b/arch/x86/kernel/cpu/amd.c @@ -347,7 +347,6 @@ static void amd_detect_cmp(struct cpuinfo_x86 *c) #ifdef CONFIG_SMP unsigned bits; int cpu = smp_processor_id(); - unsigned int socket_id, core_complex_id; bits = c->x86_coreid_bits; /* Low order bits define the core id (index of core in socket) */ @@ -365,10 +364,7 @@ static void amd_detect_cmp(struct cpuinfo_x86 *c) if (c->x86 != 0x17 || !cpuid_edx(0x80000006)) return; - socket_id = (c->apicid >> bits) - 1; - core_complex_id = (c->apicid & ((1 << bits) - 1)) >> 3; - - per_cpu(cpu_llc_id, cpu) = (socket_id << 3) | core_complex_id; + per_cpu(cpu_llc_id, cpu) = c->apicid >> 3; #endif } ^ permalink raw reply related [flat|nested] 18+ messages in thread
end of thread, other threads:[~2016-11-12 21:18 UTC | newest] Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2016-11-01 16:51 [PATCH v3 1/2] x86/AMD: Fix cpu_llc_id for AMD Fam17h systems Yazen Ghannam 2016-11-01 16:51 ` [PATCH v3 2/2] x86/AMD: Group cpu_llc_id assignment by topology feature and family Yazen Ghannam 2016-11-02 20:15 ` Borislav Petkov 2016-11-07 7:29 ` Ingo Molnar 2016-11-07 9:22 ` Borislav Petkov 2016-11-02 20:13 ` [PATCH v3 1/2] x86/AMD: Fix cpu_llc_id for AMD Fam17h systems Borislav Petkov 2016-11-07 7:31 ` Ingo Molnar 2016-11-07 9:20 ` Borislav Petkov 2016-11-07 14:07 ` Ingo Molnar 2016-11-07 15:56 ` Borislav Petkov 2016-11-08 6:31 ` Ingo Molnar 2016-11-08 8:35 ` Borislav Petkov 2016-11-08 10:29 ` Ingo Molnar 2016-11-08 11:01 ` Borislav Petkov 2016-11-08 14:54 ` Ingo Molnar 2016-11-08 15:30 ` Borislav Petkov 2016-11-10 8:00 ` [tip:x86/cpu] x86/cpu/AMD: Clean up cpu_llc_id assignment per topology feature tip-bot for Yazen Ghannam 2016-11-09 16:13 ` [tip:x86/urgent] x86/cpu/AMD: Fix cpu_llc_id for AMD Fam17h systems tip-bot for Yazen Ghannam
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).