* [linux41] Kernel panic at i686
From: Philip Müller @ 2015-07-23 22:23 UTC
To: linux-kernel

Hi all,

I started to test linux 4.1 series with rc6. However, I was never able
to boot that kernel in i686 architecture. Trying it again with
VirtualBox gave me more conclusions. Using one core it simply boots up.
Using more than one CPU core it crashes with:

Failed to access perfctr msr (MSR c0010007 is 0)

task: f58e0000 ti: f58e8000 task.ti: f58e800
EIP: 0060:[<c135a903>] EFLAGS: 00010206 CPU: 0
EIP is at free_cache_attributes+0x83/0xd0
EAX: 00000001 EBX: f589d46c ECX: 00000090 EDX: 360c2000
ESI: 00000000 EDI: c1724a80 EBP: f58e9ec0 ESP: f58e9ea0
DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
CR0: 8005003b CR2: 000000ac CR3: 01731000 CR4: 000006d0

In more rich detail you can find that problem on my bug-tracker for
Manjaro Linux:

https://github.com/manjaro/packages-core/issues/14

I just want to know if you are aware of it. With current 4.1.3 release I
still face that issue ...

kind regards
Philip Müller
--------------------------
Manjaro Project-Lead

* [linux41] regression with 'x86/cacheinfo: Move cacheinfo sysfs code to generic infrastructure' on AMD i686
From: Philip Müller @ 2015-07-26 6:18 UTC
To: linux-kernel, Sudeep Holla, Guenter Roeck, manjaro-dev
Cc: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, Andre Przywara,
    Borislav Petkov

Hi Guenter, Sudeep,

It now came down to 'x86/cacheinfo: Move cacheinfo sysfs code to generic
infrastructure'[1] as you can see here[2].

The facts are:

- On i686 only, you can't boot with more than one CPU core on AMD
  hardware (x86_64 however works)
- Using Ubuntu config[3] it boots but creates a kernel >= 1 GB in size
  on Manjaro
- By reverting 0d55ba4[4] the kernel boots.

So we have to find out what is causing this issue[5].

kind regards
Philip Müller
--------------------------
Manjaro Project-Lead

[1] https://github.com/torvalds/linux/commit/0d55ba46bfbee64fd2b492b87bfe2ec172e7b056
[2] https://raw.githubusercontent.com/philmmanjaro/linux41/master/git-bisect.txt
[3] https://raw.githubusercontent.com/philmmanjaro/linux41/master/config.4.1.3-040103-generic
[4] https://github.com/manjaro/packages-core/commit/f7b77f3e84295a6313a9181d520fb48e60453b64
[5] https://github.com/manjaro/packages-core/issues/14

On 24.07.2015 00:23, Philip Müller wrote:
> Hi all,
>
> I started to test linux 4.1 series with rc6. However, I was never able
> to boot that kernel in i686 architecture. Trying it again with
> VirtualBox gave me more conclusions. Using one core it simply boots up.
> Using more than one CPU core it crashes with:
>
> Failed to access perfctr msr (MSR c0010007 is 0)
>
> task: f58e0000 ti: f58e8000 task.ti: f58e800
> EIP: 0060:[<c135a903>] EFLAGS: 00010206 CPU: 0
> EIP is at free_cache_attributes+0x83/0xd0
> EAX: 00000001 EBX: f589d46c ECX: 00000090 EDX: 360c2000
> ESI: 00000000 EDI: c1724a80 EBP: f58e9ec0 ESP: f58e9ea0
> DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
> CR0: 8005003b CR2: 000000ac CR3: 01731000 CR4: 000006d0
>
> In more rich detail you can find that problem on my bug-tracker for
> Manjaro Linux:
>
> https://github.com/manjaro/packages-core/issues/14
>
> I just want to know if you are aware of it. With current 4.1.3 release I
> still face that issue ...
>
> kind regards
> Philip Müller
> --------------------------
> Manjaro Project-Lead

* Re: [linux41] regression with 'x86/cacheinfo: Move cacheinfo sysfs code to generic infrastructure' on AMD i686
From: Thomas Gleixner @ 2015-07-26 8:13 UTC
To: Philip Müller
Cc: linux-kernel, Sudeep Holla, Guenter Roeck, manjaro-dev, Ingo Molnar,
    H. Peter Anvin, Andre Przywara, Borislav Petkov

On Sun, 26 Jul 2015, Philip Müller wrote:
> > task: f58e0000 ti: f58e8000 task.ti: f58e800
> > EIP: 0060:[<c135a903>] EFLAGS: 00010206 CPU: 0
> > EIP is at free_cache_attributes+0x83/0xd0
> > EAX: 00000001 EBX: f589d46c ECX: 00000090 EDX: 360c2000
> > ESI: 00000000 EDI: c1724a80 EBP: f58e9ec0 ESP: f58e9ea0
> > DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
> > CR0: 8005003b CR2: 000000ac CR3: 01731000 CR4: 000006d0

That's a trivial NULL pointer dereference in the error/cleanup
path. Patch below should fix it.

Thanks,

	tglx
---
diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
index 764280a91776..f09b106d8b81 100644
--- a/drivers/base/cacheinfo.c
+++ b/drivers/base/cacheinfo.c
@@ -159,6 +159,9 @@ static void cache_shared_cpu_map_remove(unsigned int cpu)
 
 static void free_cache_attributes(unsigned int cpu)
 {
+	if (!per_cpu_cacheinfo(cpu))
+		return;
+
 	cache_shared_cpu_map_remove(cpu);
 
 	kfree(per_cpu_cacheinfo(cpu));
@@ -514,8 +517,7 @@ static int cacheinfo_cpu_callback(struct notifier_block *nfb,
 		break;
 	case CPU_DEAD:
 		cache_remove_dev(cpu);
-		if (per_cpu_cacheinfo(cpu))
-			free_cache_attributes(cpu);
+		free_cache_attributes(cpu);
 		break;
 	}
 	return notifier_from_errno(rc);
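
Editor's note: the idea in the patch above (let the free routine check for "nothing allocated" itself, so callers such as the CPU_DEAD case can call it unconditionally) can be tried outside the kernel. The following is only a minimal userspace sketch under stated assumptions: `ci_model`, `alloc_cache_attributes_model()` and `free_cache_attributes_model()` are hypothetical stand-ins invented for illustration, not the kernel's per-CPU data or functions.

```c
#include <assert.h>
#include <stdlib.h>

#define NR_CPUS_MODEL 4

/* Hypothetical stand-in for a per-CPU cacheinfo descriptor. */
struct cpu_cacheinfo_model {
	int *info_list;          /* NULL until allocated */
	unsigned int num_leaves;
};

static struct cpu_cacheinfo_model ci_model[NR_CPUS_MODEL];

/* Allocate the leaf array for one CPU; returns 0 on success, -1 on failure. */
static int alloc_cache_attributes_model(unsigned int cpu, unsigned int leaves)
{
	ci_model[cpu].info_list = calloc(leaves, sizeof(int));
	if (!ci_model[cpu].info_list)
		return -1;
	ci_model[cpu].num_leaves = leaves;
	return 0;
}

/*
 * Free routine with the guard inside it, so callers need no check.
 * Returns 1 if something was freed, 0 if there was nothing to free.
 */
static int free_cache_attributes_model(unsigned int cpu)
{
	if (!ci_model[cpu].info_list)
		return 0;

	free(ci_model[cpu].info_list);
	ci_model[cpu].info_list = NULL;
	ci_model[cpu].num_leaves = 0;
	return 1;
}
```

With the guard in the callee, calling the free routine on a CPU that was never set up, or calling it twice, is a harmless no-op in this model.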

* Re: [linux41] regression with 'x86/cacheinfo: Move cacheinfo sysfs code to generic infrastructure' on AMD i686
From: Borislav Petkov @ 2015-07-26 8:41 UTC
To: Thomas Gleixner
Cc: Philip Müller, linux-kernel, Sudeep Holla, Guenter Roeck,
    manjaro-dev, Ingo Molnar, H. Peter Anvin, Andre Przywara,
    Borislav Petkov

On Sun, Jul 26, 2015 at 10:13:45AM +0200, Thomas Gleixner wrote:
> On Sun, 26 Jul 2015, Philip Müller wrote:
> > > task: f58e0000 ti: f58e8000 task.ti: f58e800
> > > EIP: 0060:[<c135a903>] EFLAGS: 00010206 CPU: 0
> > > EIP is at free_cache_attributes+0x83/0xd0
> > > EAX: 00000001 EBX: f589d46c ECX: 00000090 EDX: 360c2000
> > > ESI: 00000000 EDI: c1724a80 EBP: f58e9ec0 ESP: f58e9ea0
> > > DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
> > > CR0: 8005003b CR2: 000000ac CR3: 01731000 CR4: 000006d0
>
> That's a trivial NULL pointer dereference in the error/cleanup
> path. Patch below should fix it.

Well, I got a bit different, and of course totally untested possible
solution:

cache_shared_cpu_map_setup() does check sib_cpu_ci->info_list before
setting cpumask bits while cache_shared_cpu_map_remove() doesn't. Balancing
this out would mean:

---
diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
index 764280a91776..8a4546dc25e3 100644
--- a/drivers/base/cacheinfo.c
+++ b/drivers/base/cacheinfo.c
@@ -148,7 +148,11 @@ static void cache_shared_cpu_map_remove(unsigned int cpu)
 
 			if (sibling == cpu) /* skip itself */
 				continue;
+
 			sib_cpu_ci = get_cpu_cacheinfo(sibling);
+			if (!sib_cpu_ci->info_list)
+				continue;
+
 			sib_leaf = sib_cpu_ci->info_list + index;
 			cpumask_clear_cpu(cpu, &sib_leaf->shared_cpu_map);
 			cpumask_clear_cpu(sibling, &this_leaf->shared_cpu_map);
---

Now Philip can have some more fun testing :-)

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--
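
Editor's note: the asymmetry described above (setup skips siblings whose `info_list` is not allocated, teardown did not) can be modelled in userspace. This is a rough sketch under stated assumptions, not the kernel code: `struct leaf_model`, `struct cpu_ci_model`, `model_setup()` and `model_remove()` are hypothetical names, a plain bitmask stands in for the cpumask, every CPU is assumed to share every leaf, and the teardown walks all CPUs rather than only the bits set in the shared map.

```c
#include <assert.h>
#include <stdlib.h>

#define MODEL_NR_CPUS 4
#define MODEL_LEAVES  2

struct leaf_model {
	unsigned int shared_cpu_map;   /* bit n set: CPU n shares this leaf */
};

struct cpu_ci_model {
	struct leaf_model *info_list;  /* NULL if this CPU was never set up */
};

static struct cpu_ci_model cpus[MODEL_NR_CPUS];

/* Setup path: only touch siblings whose leaf array already exists. */
static int model_setup(unsigned int cpu)
{
	unsigned int index, sibling;

	cpus[cpu].info_list = calloc(MODEL_LEAVES, sizeof(struct leaf_model));
	if (!cpus[cpu].info_list)
		return -1;

	for (index = 0; index < MODEL_LEAVES; index++) {
		struct leaf_model *this_leaf = cpus[cpu].info_list + index;

		this_leaf->shared_cpu_map |= 1u << cpu;
		for (sibling = 0; sibling < MODEL_NR_CPUS; sibling++) {
			struct leaf_model *sib_leaf;

			if (sibling == cpu)              /* skip itself */
				continue;
			if (!cpus[sibling].info_list)    /* sibling not set up */
				continue;

			sib_leaf = cpus[sibling].info_list + index;
			sib_leaf->shared_cpu_map |= 1u << cpu;
			this_leaf->shared_cpu_map |= 1u << sibling;
		}
	}
	return 0;
}

/* Teardown path, mirroring setup: the same sibling check is applied. */
static void model_remove(unsigned int cpu)
{
	unsigned int index, sibling;

	if (!cpus[cpu].info_list)
		return;

	for (index = 0; index < MODEL_LEAVES; index++) {
		struct leaf_model *this_leaf = cpus[cpu].info_list + index;

		for (sibling = 0; sibling < MODEL_NR_CPUS; sibling++) {
			struct leaf_model *sib_leaf;

			if (sibling == cpu)              /* skip itself */
				continue;
			if (!cpus[sibling].info_list)    /* would be a NULL deref */
				continue;

			sib_leaf = cpus[sibling].info_list + index;
			sib_leaf->shared_cpu_map &= ~(1u << cpu);
			this_leaf->shared_cpu_map &= ~(1u << sibling);
		}
	}

	free(cpus[cpu].info_list);
	cpus[cpu].info_list = NULL;
}
```

In this model, dropping the `info_list` check from `model_remove()` would make the teardown compute a leaf pointer from a NULL base for any sibling that was never set up, which is the kind of access the check in the patch is meant to avoid.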

* Re: [linux41] regression with 'x86/cacheinfo: Move cacheinfo sysfs code to generic infrastructure' on AMD i686
From: Philip Müller @ 2015-07-26 10:54 UTC
To: Borislav Petkov, Thomas Gleixner
Cc: linux-kernel, Sudeep Holla, Guenter Roeck, manjaro-dev, Ingo Molnar,
    H. Peter Anvin, Andre Przywara, Borislav Petkov

Hi Borislav,

I can confirm your patch working. However, it might be good to use yours
and Thomas' in combination to solve this properly.

kind regards
Philip

On 26.07.2015 10:41, Borislav Petkov wrote:
> On Sun, Jul 26, 2015 at 10:13:45AM +0200, Thomas Gleixner wrote:
>> On Sun, 26 Jul 2015, Philip Müller wrote:
>>>> task: f58e0000 ti: f58e8000 task.ti: f58e800
>>>> EIP: 0060:[<c135a903>] EFLAGS: 00010206 CPU: 0
>>>> EIP is at free_cache_attributes+0x83/0xd0
>>>> EAX: 00000001 EBX: f589d46c ECX: 00000090 EDX: 360c2000
>>>> ESI: 00000000 EDI: c1724a80 EBP: f58e9ec0 ESP: f58e9ea0
>>>> DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
>>>> CR0: 8005003b CR2: 000000ac CR3: 01731000 CR4: 000006d0
>>
>> That's a trivial NULL pointer dereference in the error/cleanup
>> path. Patch below should fix it.
>
> Well, I got a bit different, and of course totally untested possible
> solution:
>
> cache_shared_cpu_map_setup() does check sib_cpu_ci->info_list before
> setting cpumask bits while cache_shared_cpu_map_remove() doesn't. Balancing
> this out would mean:
>
> ---
> diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
> index 764280a91776..8a4546dc25e3 100644
> --- a/drivers/base/cacheinfo.c
> +++ b/drivers/base/cacheinfo.c
> @@ -148,7 +148,11 @@ static void cache_shared_cpu_map_remove(unsigned int cpu)
>
> 			if (sibling == cpu) /* skip itself */
> 				continue;
> +
> 			sib_cpu_ci = get_cpu_cacheinfo(sibling);
> +			if (!sib_cpu_ci->info_list)
> +				continue;
> +
> 			sib_leaf = sib_cpu_ci->info_list + index;
> 			cpumask_clear_cpu(cpu, &sib_leaf->shared_cpu_map);
> 			cpumask_clear_cpu(sibling, &this_leaf->shared_cpu_map);
> ---
>
> Now Philip can have some more fun testing :-)
>

* Re: [linux41] regression with 'x86/cacheinfo: Move cacheinfo sysfs code to generic infrastructure' on AMD i686
From: Borislav Petkov @ 2015-07-26 14:42 UTC
To: Philip Müller
Cc: Thomas Gleixner, linux-kernel, Sudeep Holla, Guenter Roeck,
    manjaro-dev, Ingo Molnar, H. Peter Anvin, Andre Przywara,
    Borislav Petkov

On Sun, Jul 26, 2015 at 12:54:55PM +0200, Philip Müller wrote:
> I can confirm your patch working. However, it might be good to use yours
> and Thomas' in combination to solve this properly.

Please do not top-post.

We could use Thomas' too although from looking at it,
detect_cache_attributes() allocates a per-CPU per_cpu_cacheinfo thing
for each CPU. By the time we hit cache_shared_cpu_map_remove() in
free_cache_attributes(), those per_cpu_cacheinfo(cpu) things are still
allocated. We kfree them in the next step only.

But I like the moving of the check from the CPU hotplug callback to
free_cache_attributes().

So I'll merge the two patches and write up a proper commit message,
unless someone objects.

I'll add your Tested-by too.

Thanks.

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--

* Re: [linux41] regression with 'x86/cacheinfo: Move cacheinfo sysfs code to generic infrastructure' on AMD i686
From: Philip Müller @ 2015-07-26 15:59 UTC
To: Borislav Petkov
Cc: Thomas Gleixner, linux-kernel, Sudeep Holla, Guenter Roeck,
    manjaro-dev, Ingo Molnar, H. Peter Anvin, Andre Przywara,
    Borislav Petkov

Hi Borislav,

I'm fine with that decision. I tested your patch alone and the
combination with Thomas' changes. Both work to solve this problem.

Do whatever suits best for this matter. Thx to you too for providing
solutions so fast.

kind regards
Philip

p.s. what do you mean by top-post?

On 26.07.2015 at 16:42, Borislav Petkov wrote:
> On Sun, Jul 26, 2015 at 12:54:55PM +0200, Philip Müller wrote:
>> I can confirm your patch working. However, it might be good to use yours
>> and Thomas' in combination to solve this properly.
>
> Please do not top-post.
>
> We could use Thomas' too although from looking at it,
> detect_cache_attributes() allocates a per-CPU per_cpu_cacheinfo thing
> for each CPU. By the time we hit cache_shared_cpu_map_remove() in
> free_cache_attributes(), those per_cpu_cacheinfo(cpu) things are still
> allocated. We kfree them in the next step only.
>
> But I like the moving of the check from the CPU hotplug callback to
> free_cache_attributes().
>
> So I'll merge the two patches and write up a proper commit message,
> unless someone objects.
>
> I'll add your Tested-by too.
>
> Thanks.
>

* Re: [linux41] regression with 'x86/cacheinfo: Move cacheinfo sysfs code to generic infrastructure' on AMD i686
From: Guenter Roeck @ 2015-07-26 16:11 UTC
To: Philip Müller, Borislav Petkov
Cc: Thomas Gleixner, linux-kernel, Sudeep Holla, manjaro-dev,
    Ingo Molnar, H. Peter Anvin, Andre Przywara, Borislav Petkov

On 07/26/2015 08:59 AM, Philip Müller wrote:
> Hi Borislav,
>
> I'm fine with that decision. I tested your patch alone and the
> combination with Thomas' changes. Both work to solve this problem.
>
> Do whatever suits best for this matter. Thx to you too for providing
> solutions so fast.
>
> kind regards
> Philip
>
> p.s. what do you mean by top-post?
>

What you just did ;-).

http://ck.wikia.com/wiki/TopPosting

Guenter

> On 26.07.2015 at 16:42, Borislav Petkov wrote:
>> On Sun, Jul 26, 2015 at 12:54:55PM +0200, Philip Müller wrote:
>>> I can confirm your patch working. However, it might be good to use yours
>>> and Thomas' in combination to solve this properly.
>>
>> Please do not top-post.
>>
>> We could use Thomas' too although from looking at it,
>> detect_cache_attributes() allocates a per-CPU per_cpu_cacheinfo thing
>> for each CPU. By the time we hit cache_shared_cpu_map_remove() in
>> free_cache_attributes(), those per_cpu_cacheinfo(cpu) things are still
>> allocated. We kfree them in the next step only.
>>
>> But I like the moving of the check from the CPU hotplug callback to
>> free_cache_attributes().
>>
>> So I'll merge the two patches and write up a proper commit message,
>> unless someone objects.
>>
>> I'll add your Tested-by too.
>>
>> Thanks.
>>
>
>

* Re: [linux41] regression with 'x86/cacheinfo: Move cacheinfo sysfs code to generic infrastructure' on AMD i686
From: Josh Boyer @ 2015-09-16 23:52 UTC
To: Borislav Petkov
Cc: Philip Müller, Thomas Gleixner, Linux-Kernel@Vger. Kernel. Org,
    Sudeep Holla, Guenter Roeck, manjaro-dev, Ingo Molnar,
    H. Peter Anvin, Andre Przywara, Borislav Petkov

On Sun, Jul 26, 2015 at 10:42 AM, Borislav Petkov <bp@alien8.de> wrote:
> On Sun, Jul 26, 2015 at 12:54:55PM +0200, Philip Müller wrote:
>> I can confirm your patch working. However, it might be good to use yours
>> and Thomas' in combination to solve this properly.
>
> Please do not top-post.
>
> We could use Thomas' too although from looking at it,
> detect_cache_attributes() allocates a per-CPU per_cpu_cacheinfo thing
> for each CPU. By the time we hit cache_shared_cpu_map_remove() in
> free_cache_attributes(), those per_cpu_cacheinfo(cpu) things are still
> allocated. We kfree them in the next step only.
>
> But I like the moving of the check from the CPU hotplug callback to
> free_cache_attributes().
>
> So I'll merge the two patches and write up a proper commit message,
> unless someone objects.
>
> I'll add your Tested-by too.

Did this actually happen? I don't see either fix in Linus' tree yet,
the merge window is closed, and the bug happens on 4.1 and 4.2 stable
kernels..

josh

* Re: [linux41] regression with 'x86/cacheinfo: Move cacheinfo sysfs code to generic infrastructure' on AMD i686
From: Philip Müller @ 2015-09-17 5:36 UTC
To: Josh Boyer, Borislav Petkov
Cc: Thomas Gleixner, Linux-Kernel@Vger. Kernel. Org, Sudeep Holla,
    Guenter Roeck, manjaro-dev, Ingo Molnar, H. Peter Anvin,
    Andre Przywara, Borislav Petkov

On 17.09.2015 at 01:52, Josh Boyer wrote:
>
> Did this actually happen? I don't see either fix in Linus' tree yet,
> the merge window is closed, and the bug happens on 4.1 and 4.2 stable
> kernels..
>
> josh
>

Seems not yet. I don't see it in 4.3-rc1 either. Seems 4.3 will have the
same issues then ...

* Re: [linux41] regression with 'x86/cacheinfo: Move cacheinfo sysfs code to generic infrastructure' on AMD i686
From: Borislav Petkov @ 2015-09-17 7:15 UTC
To: Josh Boyer, Greg KH
Cc: Philip Müller, Thomas Gleixner, Linux-Kernel@Vger. Kernel. Org,
    Sudeep Holla, Guenter Roeck, manjaro-dev, Ingo Molnar,
    H. Peter Anvin, Andre Przywara, Borislav Petkov

On Wed, Sep 16, 2015 at 07:52:47PM -0400, Josh Boyer wrote:
> On Sun, Jul 26, 2015 at 10:42 AM, Borislav Petkov <bp@alien8.de> wrote:
> > On Sun, Jul 26, 2015 at 12:54:55PM +0200, Philip Müller wrote:
> >> I can confirm your patch working. However, it might be good to use yours
> >> and Thomas' in combination to solve this properly.
> >
> > Please do not top-post.
> >
> > We could use Thomas' too although from looking at it,
> > detect_cache_attributes() allocates a per-CPU per_cpu_cacheinfo thing
> > for each CPU. By the time we hit cache_shared_cpu_map_remove() in
> > free_cache_attributes(), those per_cpu_cacheinfo(cpu) things are still
> > allocated. We kfree them in the next step only.
> >
> > But I like the moving of the check from the CPU hotplug callback to
> > free_cache_attributes().
> >
> > So I'll merge the two patches and write up a proper commit message,
> > unless someone objects.
> >
> > I'll add your Tested-by too.
>
> Did this actually happen? I don't see either fix in Linus' tree yet,
> the merge window is closed, and the bug happens on 4.1 and 4.2 stable
> kernels..

Greg wanted to pick it up...

Greg, what's up?

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--

* Re: [linux41] regression with 'x86/cacheinfo: Move cacheinfo sysfs code to generic infrastructure' on AMD i686
From: Greg KH @ 2015-09-17 12:54 UTC
To: Borislav Petkov
Cc: Josh Boyer, Philip Müller, Thomas Gleixner,
    Linux-Kernel@Vger. Kernel. Org, Sudeep Holla, Guenter Roeck,
    manjaro-dev, Ingo Molnar, H. Peter Anvin, Andre Przywara,
    Borislav Petkov

On Thu, Sep 17, 2015 at 09:15:04AM +0200, Borislav Petkov wrote:
> On Wed, Sep 16, 2015 at 07:52:47PM -0400, Josh Boyer wrote:
> > On Sun, Jul 26, 2015 at 10:42 AM, Borislav Petkov <bp@alien8.de> wrote:
> > > On Sun, Jul 26, 2015 at 12:54:55PM +0200, Philip Müller wrote:
> > >> I can confirm your patch working. However, it might be good to use yours
> > >> and Thomas' in combination to solve this properly.
> > >
> > > Please do not top-post.
> > >
> > > We could use Thomas' too although from looking at it,
> > > detect_cache_attributes() allocates a per-CPU per_cpu_cacheinfo thing
> > > for each CPU. By the time we hit cache_shared_cpu_map_remove() in
> > > free_cache_attributes(), those per_cpu_cacheinfo(cpu) things are still
> > > allocated. We kfree them in the next step only.
> > >
> > > But I like the moving of the check from the CPU hotplug callback to
> > > free_cache_attributes().
> > >
> > > So I'll merge the two patches and write up a proper commit message,
> > > unless someone objects.
> > >
> > > I'll add your Tested-by too.
> >
> > Did this actually happen? I don't see either fix in Linus' tree yet,
> > the merge window is closed, and the bug happens on 4.1 and 4.2 stable
> > kernels..
>
> Greg wanted to pick it up...
>
> Greg, what's up?

It's in my "to-apply" queue, let me go dig it up now...

thanks for the reminder.

greg k-h

* [PATCH] cpu/cacheinfo: Fix teardown path
From: Borislav Petkov @ 2015-07-27 7:58 UTC
To: Thomas Gleixner
Cc: Philip Müller, linux-kernel, Sudeep Holla, Guenter Roeck,
    manjaro-dev, Ingo Molnar, H. Peter Anvin, Andre Przywara

From: Borislav Petkov <bp@suse.de>
Date: Mon, 27 Jul 2015 08:36:27 +0200
Subject: [PATCH] cpu/cacheinfo: Fix teardown path
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Philip Müller reported a hang when booting 32-bit 4.1 kernel on an AMD
box. A fragment of the splat was enough to pinpoint the issue:

  task: f58e0000 ti: f58e8000 task.ti: f58e800
  EIP: 0060:[<c135a903>] EFLAGS: 00010206 CPU: 0
  EIP is at free_cache_attributes+0x83/0xd0
  EAX: 00000001 EBX: f589d46c ECX: 00000090 EDX: 360c2000
  ESI: 00000000 EDI: c1724a80 EBP: f58e9ec0 ESP: f58e9ea0
   DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
  CR0: 8005003b CR2: 000000ac CR3: 01731000 CR4: 000006d0

cache_shared_cpu_map_setup() did check sibling CPUs cacheinfo descriptor
while the respective teardown path cache_shared_cpu_map_remove() didn't.
Fix that.

From tglx's version: to be on the safe side, move the cacheinfo
descriptor check to free_cache_attributes(), thus cleaning up the
hotplug path a little and making this even more robust.

Reported-by: Philip Müller <philm@manjaro.org>
Cc: <stable@vger.kernel.org> # 4.1
Cc: Andre Przywara <andre.przywara@arm.com>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: linux-kernel@vger.kernel.org
Cc: manjaro-dev@manjaro.org
Cc: Philip Müller <philm@manjaro.org>
Cc: Sudeep Holla <sudeep.holla@arm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/55B47BB8.6080202@manjaro.org
Signed-off-by: Borislav Petkov <bp@suse.de>
---

Moin Thomas,

I've merged both patches and tagged it for stable. Which means,
tip-urgent.

Thanks.

 drivers/base/cacheinfo.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
index 764280a91776..e9fd32e91668 100644
--- a/drivers/base/cacheinfo.c
+++ b/drivers/base/cacheinfo.c
@@ -148,7 +148,11 @@ static void cache_shared_cpu_map_remove(unsigned int cpu)
 
 			if (sibling == cpu) /* skip itself */
 				continue;
+
 			sib_cpu_ci = get_cpu_cacheinfo(sibling);
+			if (!sib_cpu_ci->info_list)
+				continue;
+
 			sib_leaf = sib_cpu_ci->info_list + index;
 			cpumask_clear_cpu(cpu, &sib_leaf->shared_cpu_map);
 			cpumask_clear_cpu(sibling, &this_leaf->shared_cpu_map);
@@ -159,6 +163,9 @@ static void cache_shared_cpu_map_remove(unsigned int cpu)
 
 static void free_cache_attributes(unsigned int cpu)
 {
+	if (!per_cpu_cacheinfo(cpu))
+		return;
+
 	cache_shared_cpu_map_remove(cpu);
 
 	kfree(per_cpu_cacheinfo(cpu));
@@ -514,8 +521,7 @@ static int cacheinfo_cpu_callback(struct notifier_block *nfb,
 		break;
 	case CPU_DEAD:
 		cache_remove_dev(cpu);
-		if (per_cpu_cacheinfo(cpu))
-			free_cache_attributes(cpu);
+		free_cache_attributes(cpu);
 		break;
 	}
 	return notifier_from_errno(rc);
-- 
2.5.0.rc2.28.g6003e7f

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--

* Re: [PATCH] cpu/cacheinfo: Fix teardown path
From: Sudeep Holla @ 2015-07-27 8:56 UTC
To: Borislav Petkov, Thomas Gleixner
Cc: Sudeep Holla, Philip Müller, linux-kernel, Guenter Roeck,
    manjaro-dev, Ingo Molnar, H. Peter Anvin, Andre Przywara

On 27/07/15 08:58, Borislav Petkov wrote:
> From: Borislav Petkov <bp@suse.de>
> Date: Mon, 27 Jul 2015 08:36:27 +0200
> Subject: [PATCH] cpu/cacheinfo: Fix teardown path
> MIME-Version: 1.0
> Content-Type: text/plain; charset=UTF-8
> Content-Transfer-Encoding: 8bit
>
> Philip Müller reported a hang when booting 32-bit 4.1 kernel on an AMD
> box. A fragment of the splat was enough to pinpoint the issue:
>
> task: f58e0000 ti: f58e8000 task.ti: f58e800
> EIP: 0060:[<c135a903>] EFLAGS: 00010206 CPU: 0
> EIP is at free_cache_attributes+0x83/0xd0
> EAX: 00000001 EBX: f589d46c ECX: 00000090 EDX: 360c2000
> ESI: 00000000 EDI: c1724a80 EBP: f58e9ec0 ESP: f58e9ea0
> DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
> CR0: 8005003b CR2: 000000ac CR3: 01731000 CR4: 000006d0
>
> cache_shared_cpu_map_setup() did check sibling CPUs cacheinfo descriptor
> while the respective teardown path cache_shared_cpu_map_remove() didn't.
> Fix that.
>
> From tglx's version: to be on the safe side, move the cacheinfo
> descriptor check to free_cache_attributes(), thus cleaning up the
> hotplug path a little and making this even more robust.
>
> Reported-by: Philip Müller <philm@manjaro.org>
> Cc: <stable@vger.kernel.org> # 4.1
> Cc: Andre Przywara <andre.przywara@arm.com>
> Cc: Guenter Roeck <linux@roeck-us.net>
> Cc: "H. Peter Anvin" <hpa@zytor.com>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: linux-kernel@vger.kernel.org
> Cc: manjaro-dev@manjaro.org
> Cc: Philip Müller <philm@manjaro.org>
> Cc: Sudeep Holla <sudeep.holla@arm.com>

Looks good to me. If not too late

Acked-by: Sudeep Holla <sudeep.holla@arm.com>

Regards,
Sudeep

* Re: [PATCH] cpu/cacheinfo: Fix teardown path
From: Thomas Gleixner @ 2015-07-27 11:10 UTC
To: Borislav Petkov
Cc: Philip Müller, linux-kernel, Sudeep Holla, Guenter Roeck,
    manjaro-dev, Ingo Molnar, H. Peter Anvin, Andre Przywara

On Mon, 27 Jul 2015, Borislav Petkov wrote:
> From: Borislav Petkov <bp@suse.de>
> Date: Mon, 27 Jul 2015 08:36:27 +0200
> Subject: [PATCH] cpu/cacheinfo: Fix teardown path
> MIME-Version: 1.0
> Content-Type: text/plain; charset=UTF-8
> Content-Transfer-Encoding: 8bit
>
> Philip Müller reported a hang when booting 32-bit 4.1 kernel on an AMD
> box. A fragment of the splat was enough to pinpoint the issue:
>
> task: f58e0000 ti: f58e8000 task.ti: f58e800
> EIP: 0060:[<c135a903>] EFLAGS: 00010206 CPU: 0
> EIP is at free_cache_attributes+0x83/0xd0
> EAX: 00000001 EBX: f589d46c ECX: 00000090 EDX: 360c2000
> ESI: 00000000 EDI: c1724a80 EBP: f58e9ec0 ESP: f58e9ea0
> DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
> CR0: 8005003b CR2: 000000ac CR3: 01731000 CR4: 000006d0
>
> cache_shared_cpu_map_setup() did check sibling CPUs cacheinfo descriptor
> while the respective teardown path cache_shared_cpu_map_remove() didn't.
> Fix that.
>
> From tglx's version: to be on the safe side, move the cacheinfo
> descriptor check to free_cache_attributes(), thus cleaning up the
> hotplug path a little and making this even more robust.
>
> Reported-by: Philip Müller <philm@manjaro.org>
> Cc: <stable@vger.kernel.org> # 4.1
> Cc: Andre Przywara <andre.przywara@arm.com>
> Cc: Guenter Roeck <linux@roeck-us.net>
> Cc: "H. Peter Anvin" <hpa@zytor.com>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: linux-kernel@vger.kernel.org
> Cc: manjaro-dev@manjaro.org
> Cc: Philip Müller <philm@manjaro.org>
> Cc: Sudeep Holla <sudeep.holla@arm.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Link: https://lkml.kernel.org/r/55B47BB8.6080202@manjaro.org
> Signed-off-by: Borislav Petkov <bp@suse.de>
> ---
>
> Moin Thomas,
>
> I've merged both patches and tagged it for stable. Which means,
> tip-urgent.

Reviewed-by: Thomas Gleixner <tglx@linutronix.de>

* Re: [PATCH] cpu/cacheinfo: Fix teardown path
From: Philip Müller @ 2015-07-27 18:49 UTC
To: Borislav Petkov
Cc: Thomas Gleixner, linux-kernel, Sudeep Holla, Guenter Roeck,
    manjaro-dev, Ingo Molnar, H. Peter Anvin, Andre Przywara

On 27.07.2015 at 13:10, Thomas Gleixner wrote:
>> ---
>>
>> Moin Thomas,
>>
>> I've merged both patches and tagged it for stable. Which means,
>> tip-urgent.
>
> Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
>

Hi Borislav,

I also reviewed your new code and also tested it.

Acked-by: Philip Müller <philm@manjaro.org>
* [tip:x86/urgent] x86/cpu/cacheinfo: Fix teardown path
  2015-07-27 7:58 ` [PATCH] cpu/cacheinfo: Fix teardown path Borislav Petkov
  2015-07-27 8:56 ` Sudeep Holla
  2015-07-27 11:10 ` Thomas Gleixner
@ 2015-08-05 20:14 ` tip-bot for Borislav Petkov
  2015-08-08 8:46 ` [PATCH] cpu/cacheinfo: " Borislav Petkov
  3 siblings, 0 replies; 23+ messages in thread
From: tip-bot for Borislav Petkov @ 2015-08-05 20:14 UTC
  To: linux-tip-commits
  Cc: peterz, sudeep.holla, bp, andre.przywara, mingo, philm,
	linux-kernel, tglx, torvalds, linux, hpa

Commit-ID:  680ac028240f8747f31c03986fbcf18b2b521e93
Gitweb:     http://git.kernel.org/tip/680ac028240f8747f31c03986fbcf18b2b521e93
Author:     Borislav Petkov <bp@suse.de>
AuthorDate: Mon, 27 Jul 2015 09:58:05 +0200
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Wed, 5 Aug 2015 10:08:17 +0200

x86/cpu/cacheinfo: Fix teardown path

Philip Müller reported a hang when booting 32-bit 4.1 kernel on an AMD
box. A fragment of the splat was enough to pinpoint the issue:

  task: f58e0000 ti: f58e8000 task.ti: f58e800
  EIP: 0060:[<c135a903>] EFLAGS: 00010206 CPU: 0
  EIP is at free_cache_attributes+0x83/0xd0
  EAX: 00000001 EBX: f589d46c ECX: 00000090 EDX: 360c2000
  ESI: 00000000 EDI: c1724a80 EBP: f58e9ec0 ESP: f58e9ea0
   DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
  CR0: 8005003b CR2: 000000ac CR3: 01731000 CR4: 000006d0

cache_shared_cpu_map_setup() did check sibling CPUs cacheinfo descriptor
while the respective teardown path cache_shared_cpu_map_remove() didn't.
Fix that.

From tglx's version: to be on the safe side, move the cacheinfo
descriptor check to free_cache_attributes(), thus cleaning up the
hotplug path a little and making this even more robust.

Reported-by: Philip Müller <philm@manjaro.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: <stable@vger.kernel.org> # v4.1+
Cc: Andre Przywara <andre.przywara@arm.com>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sudeep Holla <sudeep.holla@arm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Cc: manjaro-dev@manjaro.org
Link: http://lkml.kernel.org/r/20150727075805.GA20416@nazgul.tnic
Link: https://lkml.kernel.org/r/55B47BB8.6080202@manjaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 drivers/base/cacheinfo.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
index 764280a..e9fd32e 100644
--- a/drivers/base/cacheinfo.c
+++ b/drivers/base/cacheinfo.c
@@ -148,7 +148,11 @@ static void cache_shared_cpu_map_remove(unsigned int cpu)
 
 			if (sibling == cpu) /* skip itself */
 				continue;
+
 			sib_cpu_ci = get_cpu_cacheinfo(sibling);
+			if (!sib_cpu_ci->info_list)
+				continue;
+
 			sib_leaf = sib_cpu_ci->info_list + index;
 			cpumask_clear_cpu(cpu, &sib_leaf->shared_cpu_map);
 			cpumask_clear_cpu(sibling, &this_leaf->shared_cpu_map);
@@ -159,6 +163,9 @@ static void cache_shared_cpu_map_remove(unsigned int cpu)
 
 static void free_cache_attributes(unsigned int cpu)
 {
+	if (!per_cpu_cacheinfo(cpu))
+		return;
+
 	cache_shared_cpu_map_remove(cpu);
 
 	kfree(per_cpu_cacheinfo(cpu));
@@ -514,8 +521,7 @@ static int cacheinfo_cpu_callback(struct notifier_block *nfb,
 		break;
 	case CPU_DEAD:
 		cache_remove_dev(cpu);
-		if (per_cpu_cacheinfo(cpu))
-			free_cache_attributes(cpu);
+		free_cache_attributes(cpu);
 		break;
 	}
 	return notifier_from_errno(rc);
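[Editorial sketch, not part of the thread.] The commit message above
describes two guards: the teardown loop skips siblings that have no
cacheinfo descriptor, and the free path returns early when the CPU itself
has none. The user-space model below is only an illustration of that
pattern under stated assumptions. The names struct leaf, struct cpu_ci,
model_setup(), model_mark_shared() and model_free() are hypothetical
stand-ins, the CPU mask is reduced to a plain bitmask, and
model_mark_shared() simply assumes that a shared map can name a sibling
whose descriptor was never allocated; the thread does not spell out how
that state arises.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define NR_CPUS   4	/* hypothetical CPU count for this model */
#define NR_LEAVES 2	/* hypothetical number of cache leaves per CPU */

/* Simplified stand-in for the kernel's per-leaf cacheinfo. */
struct leaf {
	unsigned int shared_cpu_map;	/* bit n set: CPU n shares this leaf */
};

/* Simplified stand-in for the per-CPU cacheinfo descriptor. */
struct cpu_ci {
	struct leaf *info_list;		/* NULL until the CPU has been set up */
};

static struct cpu_ci ci[NR_CPUS];

/* Allocate leaves for @cpu and link it with siblings that are set up. */
static int model_setup(unsigned int cpu)
{
	assert(cpu < NR_CPUS);

	ci[cpu].info_list = calloc(NR_LEAVES, sizeof(struct leaf));
	if (!ci[cpu].info_list)
		return -1;

	for (unsigned int index = 0; index < NR_LEAVES; index++) {
		struct leaf *this_leaf = ci[cpu].info_list + index;

		this_leaf->shared_cpu_map |= 1u << cpu;
		for (unsigned int sib = 0; sib < NR_CPUS; sib++) {
			/* The setup side checks the sibling's descriptor. */
			if (sib == cpu || !ci[sib].info_list)
				continue;
			ci[sib].info_list[index].shared_cpu_map |= 1u << cpu;
			this_leaf->shared_cpu_map |= 1u << sib;
		}
	}
	return 0;
}

/* Assumption: something marks @sibling as sharing before it is set up. */
static void model_mark_shared(unsigned int cpu, unsigned int sibling)
{
	assert(cpu < NR_CPUS && sibling < NR_CPUS && ci[cpu].info_list);

	for (unsigned int index = 0; index < NR_LEAVES; index++)
		ci[cpu].info_list[index].shared_cpu_map |= 1u << sibling;
}

/* Teardown with both guards; returns false if nothing was allocated. */
static bool model_free(unsigned int cpu)
{
	assert(cpu < NR_CPUS);

	if (!ci[cpu].info_list)		/* guard inside the free path */
		return false;

	for (unsigned int index = 0; index < NR_LEAVES; index++) {
		struct leaf *this_leaf = ci[cpu].info_list + index;

		for (unsigned int sib = 0; sib < NR_CPUS; sib++) {
			if (!(this_leaf->shared_cpu_map & (1u << sib)))
				continue;
			if (sib == cpu)		/* skip itself */
				continue;
			if (!ci[sib].info_list)	/* no descriptor: skip */
				continue;
			ci[sib].info_list[index].shared_cpu_map &= ~(1u << cpu);
			this_leaf->shared_cpu_map &= ~(1u << sib);
		}
	}

	free(ci[cpu].info_list);
	ci[cpu].info_list = NULL;
	return true;
}
```

In this model, dropping the `!ci[sib].info_list` check in model_free()
would dereference a NULL info_list as soon as a shared map names a CPU
that was never set up, which is the kind of access the splat in
free_cache_attributes() points at. Keeping the early return inside
model_free() also means callers need no check of their own, mirroring the
simplification of the CPU_DEAD case in the patch.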
* Re: [PATCH] cpu/cacheinfo: Fix teardown path
  2015-07-27 7:58 ` [PATCH] cpu/cacheinfo: Fix teardown path Borislav Petkov
  ` (2 preceding siblings ...)
  2015-08-05 20:14 ` [tip:x86/urgent] x86/cpu/cacheinfo: " tip-bot for Borislav Petkov
@ 2015-08-08 8:46 ` Borislav Petkov
  2015-08-08 15:41 ` Greg KH
  3 siblings, 1 reply; 23+ messages in thread
From: Borislav Petkov @ 2015-08-08 8:46 UTC
  To: Greg KH
  Cc: Thomas Gleixner, Philip Müller, linux-kernel, Sudeep Holla,
	Guenter Roeck, manjaro-dev, Ingo Molnar, H. Peter Anvin,
	Andre Przywara

On Mon, Jul 27, 2015 at 09:58:05AM +0200, Borislav Petkov wrote:
> Philip Müller reported a hang when booting 32-bit 4.1 kernel on an AMD
> box. A fragment of the splat was enough to pinpoint the issue:

Bah, this goes to Greg and not to tip. Anyway, here's a version with
updated tags.

Greg, please queue for 4.2 as it fixes a hang.

Thanks.

---
From: Borislav Petkov <bp@suse.de>
Date: Mon, 27 Jul 2015 08:36:27 +0200
Subject: [PATCH] cpu/cacheinfo: Fix teardown path
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Philip Müller reported a hang when booting 32-bit 4.1 kernel on an AMD
box. A fragment of the splat was enough to pinpoint the issue:

  task: f58e0000 ti: f58e8000 task.ti: f58e800
  EIP: 0060:[<c135a903>] EFLAGS: 00010206 CPU: 0
  EIP is at free_cache_attributes+0x83/0xd0
  EAX: 00000001 EBX: f589d46c ECX: 00000090 EDX: 360c2000
  ESI: 00000000 EDI: c1724a80 EBP: f58e9ec0 ESP: f58e9ea0
   DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
  CR0: 8005003b CR2: 000000ac CR3: 01731000 CR4: 000006d0

cache_shared_cpu_map_setup() did check sibling CPUs cacheinfo descriptor
while the respective teardown path cache_shared_cpu_map_remove() didn't.
Fix that.

From tglx's version: to be on the safe side, move the cacheinfo
descriptor check to free_cache_attributes(), thus cleaning up the
hotplug path a little and making this even more robust.

Reported-and-tested-by: Philip Müller <philm@manjaro.org>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Sudeep Holla <sudeep.holla@arm.com>
Cc: <stable@vger.kernel.org> # 4.1
Cc: Andre Przywara <andre.przywara@arm.com>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: linux-kernel@vger.kernel.org
Cc: manjaro-dev@manjaro.org
Cc: Philip Müller <philm@manjaro.org>
Link: https://lkml.kernel.org/r/55B47BB8.6080202@manjaro.org
Signed-off-by: Borislav Petkov <bp@suse.de>
---
 drivers/base/cacheinfo.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
index 764280a91776..e9fd32e91668 100644
--- a/drivers/base/cacheinfo.c
+++ b/drivers/base/cacheinfo.c
@@ -148,7 +148,11 @@ static void cache_shared_cpu_map_remove(unsigned int cpu)
 
 			if (sibling == cpu) /* skip itself */
 				continue;
+
 			sib_cpu_ci = get_cpu_cacheinfo(sibling);
+			if (!sib_cpu_ci->info_list)
+				continue;
+
 			sib_leaf = sib_cpu_ci->info_list + index;
 			cpumask_clear_cpu(cpu, &sib_leaf->shared_cpu_map);
 			cpumask_clear_cpu(sibling, &this_leaf->shared_cpu_map);
@@ -159,6 +163,9 @@ static void cache_shared_cpu_map_remove(unsigned int cpu)
 
 static void free_cache_attributes(unsigned int cpu)
 {
+	if (!per_cpu_cacheinfo(cpu))
+		return;
+
 	cache_shared_cpu_map_remove(cpu);
 
 	kfree(per_cpu_cacheinfo(cpu));
@@ -514,8 +521,7 @@ static int cacheinfo_cpu_callback(struct notifier_block *nfb,
 		break;
 	case CPU_DEAD:
 		cache_remove_dev(cpu);
-		if (per_cpu_cacheinfo(cpu))
-			free_cache_attributes(cpu);
+		free_cache_attributes(cpu);
 		break;
 	}
 	return notifier_from_errno(rc);
-- 
2.5.0.rc2.28.g6003e7f

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--
* Re: [PATCH] cpu/cacheinfo: Fix teardown path
  2015-08-08 8:46 ` [PATCH] cpu/cacheinfo: " Borislav Petkov
@ 2015-08-08 15:41 ` Greg KH
  2015-08-08 18:23 ` Philip Müller
  2015-08-08 19:47 ` Borislav Petkov
  0 siblings, 2 replies; 23+ messages in thread
From: Greg KH @ 2015-08-08 15:41 UTC
  To: Borislav Petkov
  Cc: Thomas Gleixner, Philip Müller, linux-kernel, Sudeep Holla,
	Guenter Roeck, manjaro-dev, Ingo Molnar, H. Peter Anvin,
	Andre Przywara

On Sat, Aug 08, 2015 at 10:46:02AM +0200, Borislav Petkov wrote:
> On Mon, Jul 27, 2015 at 09:58:05AM +0200, Borislav Petkov wrote:
> > Philip Müller reported a hang when booting 32-bit 4.1 kernel on an AMD
> > box. A fragment of the splat was enough to pinpoint the issue:
>
> Bah, this goes to Greg and not to tip. Anyway, here's a version with
> updated tags.
>
> Greg, please queue for 4.2 as it fixes a hang.

What commit caused this issue?

And it's a bit late for 4.2, as you say 4.1 is also affected, I'll wait
for 4.3-rc1 to give this a chance to get some testing.

thanks,

greg k-h
* Re: [PATCH] cpu/cacheinfo: Fix teardown path
  2015-08-08 15:41 ` Greg KH
@ 2015-08-08 18:23 ` Philip Müller
  2015-08-08 19:42 ` Borislav Petkov
  2015-08-08 19:47 ` Borislav Petkov
  1 sibling, 1 reply; 23+ messages in thread
From: Philip Müller @ 2015-08-08 18:23 UTC
  To: Greg KH, Borislav Petkov
  Cc: Thomas Gleixner, linux-kernel, Sudeep Holla, Guenter Roeck,
	manjaro-dev, Ingo Molnar, H. Peter Anvin, Andre Przywara,
	manjaro-dev

Hi Greg,

I bisected it to the following commit:

0d55ba46bfbee64fd2b492b87bfe2ec172e7b056 is the first bad commit
commit 0d55ba46bfbee64fd2b492b87bfe2ec172e7b056
Author: Sudeep Holla <sudeep.holla@arm.com>
Date:   Wed Mar 4 12:00:16 2015 +0000

    x86/cacheinfo: Move cacheinfo sysfs code to generic infrastructure

You can follow the bisection in detail in my GitHub repo:

https://github.com/philmmanjaro/linux41/blob/master/git-bisect.txt

kind regards
Philip Müller

On 08.08.2015 at 17:41, Greg KH wrote:
> On Sat, Aug 08, 2015 at 10:46:02AM +0200, Borislav Petkov wrote:
>> On Mon, Jul 27, 2015 at 09:58:05AM +0200, Borislav Petkov wrote:
>>> Philip Müller reported a hang when booting 32-bit 4.1 kernel on an AMD
>>> box. A fragment of the splat was enough to pinpoint the issue:
>>
>> Bah, this goes to Greg and not to tip. Anyway, here's a version with
>> updated tags.
>>
>> Greg, please queue for 4.2 as it fixes a hang.
>
> What commit caused this issue?
>
> And it's a bit late for 4.2, as you say 4.1 is also affected, I'll wait
> for 4.3-rc1 to give this a chance to get some testing.
>
> thanks,
>
> greg k-h
>
* Re: [PATCH] cpu/cacheinfo: Fix teardown path
  2015-08-08 18:23 ` Philip Müller
@ 2015-08-08 19:42 ` Borislav Petkov
  0 siblings, 0 replies; 23+ messages in thread
From: Borislav Petkov @ 2015-08-08 19:42 UTC
  To: Philip Müller
  Cc: Greg KH, Thomas Gleixner, linux-kernel, Sudeep Holla,
	Guenter Roeck, manjaro-dev, Ingo Molnar, H. Peter Anvin,
	Andre Przywara

On Sat, Aug 08, 2015 at 08:23:49PM +0200, Philip Müller wrote:
> Hi Greg,
>
> I bi-sected it to following commit:
>
> 0d55ba46bfbee64fd2b492b87bfe2ec172e7b056 is the first bad commit
> commit 0d55ba46bfbee64fd2b492b87bfe2ec172e7b056
> Author: Sudeep Holla <sudeep.holla@arm.com>
> Date:   Wed Mar 4 12:00:16 2015 +0000
>
>     x86/cacheinfo: Move cacheinfo sysfs code to generic infrastructure
>
> You can follow it on my github repo in rich detail:
>
> https://github.com/philmmanjaro/linux41/blob/master/git-bisect.txt

Philip, what is with you and top-posting? How hard is it not to do it?!

Please stop with the top-posting already.

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--
* Re: [PATCH] cpu/cacheinfo: Fix teardown path
  2015-08-08 15:41 ` Greg KH
  2015-08-08 18:23 ` Philip Müller
@ 2015-08-08 19:47 ` Borislav Petkov
  2015-09-13 7:03 ` Philip Müller
  1 sibling, 1 reply; 23+ messages in thread
From: Borislav Petkov @ 2015-08-08 19:47 UTC
  To: Greg KH
  Cc: Thomas Gleixner, Philip Müller, linux-kernel, Sudeep Holla,
	Guenter Roeck, manjaro-dev, Ingo Molnar, H. Peter Anvin,
	Andre Przywara

On Sat, Aug 08, 2015 at 08:41:56AM -0700, Greg KH wrote:
> What commit caused this issue?

Apparently

  0d55ba46bfbe ("x86/cacheinfo: Move cacheinfo sysfs code to generic infrastructure")

Looks like moving x86 to the generic cacheinfo stuff uncovered this
shortcoming there...

> And it's a bit late for 4.2, as you say 4.1 is also affected, I'll wait
> for 4.3-rc1 to give this a chance to get some testing.

Right, I guess that's fine too as it'll trickle to stable eventually...

Thanks.

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--
* Re: [PATCH] cpu/cacheinfo: Fix teardown path
  2015-08-08 19:47 ` Borislav Petkov
@ 2015-09-13 7:03 ` Philip Müller
  0 siblings, 0 replies; 23+ messages in thread
From: Philip Müller @ 2015-09-13 7:03 UTC
  To: Borislav Petkov, Greg KH
  Cc: Thomas Gleixner, linux-kernel, Sudeep Holla, Guenter Roeck,
	manjaro-dev, Ingo Molnar, H. Peter Anvin, Andre Przywara,
	manjaro-dev

[-- Attachment #1: Type: text/plain, Size: 656 bytes --]

On 08.08.2015 21:47, Borislav Petkov wrote:
> On Sat, Aug 08, 2015 at 08:41:56AM -0700, Greg KH wrote:
>> What commit caused this issue?
>
> Apparently
>
>   0d55ba46bfbe ("x86/cacheinfo: Move cacheinfo sysfs code to generic infrastructure")
>
> Looks like moving x86 to the generic cacheinfo stuff uncovered this
> shortcoming there...
>
>> And it's a bit late for 4.2, as you say 4.1 is also affected, I'll wait
>> for 4.3-rc1 to give this a chance to get some testing.
>
> Right, I guess that's fine too as it'll trickle to stable eventually...
>
> Thanks.
>

Just a note from my end: it seems this patch didn't make it into
4.3-rc1. Any reason why?

[-- Attachment #2: 0001-cpu-cacheinfo-fix-teardown-path.patch --]
[-- Type: text/x-patch, Size: 1090 bytes --]

diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
index 764280a91776..e9fd32e91668 100644
--- a/drivers/base/cacheinfo.c
+++ b/drivers/base/cacheinfo.c
@@ -148,7 +148,11 @@ static void cache_shared_cpu_map_remove(unsigned int cpu)
 
 			if (sibling == cpu) /* skip itself */
 				continue;
+
 			sib_cpu_ci = get_cpu_cacheinfo(sibling);
+			if (!sib_cpu_ci->info_list)
+				continue;
+
 			sib_leaf = sib_cpu_ci->info_list + index;
 			cpumask_clear_cpu(cpu, &sib_leaf->shared_cpu_map);
 			cpumask_clear_cpu(sibling, &this_leaf->shared_cpu_map);
@@ -159,6 +163,9 @@ static void cache_shared_cpu_map_remove(unsigned int cpu)
 
 static void free_cache_attributes(unsigned int cpu)
 {
+	if (!per_cpu_cacheinfo(cpu))
+		return;
+
 	cache_shared_cpu_map_remove(cpu);
 
 	kfree(per_cpu_cacheinfo(cpu));
@@ -514,8 +521,7 @@ static int cacheinfo_cpu_callback(struct notifier_block *nfb,
 		break;
 	case CPU_DEAD:
 		cache_remove_dev(cpu);
-		if (per_cpu_cacheinfo(cpu))
-			free_cache_attributes(cpu);
+		free_cache_attributes(cpu);
 		break;
 	}
 	return notifier_from_errno(rc);
end of thread, other threads:[~2015-09-17 12:54 UTC | newest]

Thread overview: 23+ messages
2015-07-23 22:23 [linux41] Kernel panic at i686 Philip Müller
2015-07-26 6:18 ` [linux41] regression with 'x86/cacheinfo: Move cacheinfo sysfs code to generic infrastructure' on AMD i686 Philip Müller
2015-07-26 8:13 ` Thomas Gleixner
2015-07-26 8:41 ` Borislav Petkov
2015-07-26 10:54 ` Philip Müller
2015-07-26 14:42 ` Borislav Petkov
2015-07-26 15:59 ` Philip Müller
2015-07-26 16:11 ` Guenter Roeck
2015-09-16 23:52 ` Josh Boyer
2015-09-17 5:36 ` Philip Müller
2015-09-17 7:15 ` Borislav Petkov
2015-09-17 12:54 ` Greg KH
2015-07-27 7:58 ` [PATCH] cpu/cacheinfo: Fix teardown path Borislav Petkov
2015-07-27 8:56 ` Sudeep Holla
2015-07-27 11:10 ` Thomas Gleixner
2015-07-27 18:49 ` Philip Müller
2015-08-05 20:14 ` [tip:x86/urgent] x86/cpu/cacheinfo: " tip-bot for Borislav Petkov
2015-08-08 8:46 ` [PATCH] cpu/cacheinfo: " Borislav Petkov
2015-08-08 15:41 ` Greg KH
2015-08-08 18:23 ` Philip Müller
2015-08-08 19:42 ` Borislav Petkov
2015-08-08 19:47 ` Borislav Petkov
2015-09-13 7:03 ` Philip Müller