linux-kernel.vger.kernel.org archive mirror
* [linux41] Kernel panic at i686
@ 2015-07-23 22:23 Philip Müller
  2015-07-26  6:18 ` [linux41] regression with 'x86/cacheinfo: Move cacheinfo sysfs code to generic infrastructure' on AMD i686 Philip Müller
  0 siblings, 1 reply; 23+ messages in thread
From: Philip Müller @ 2015-07-23 22:23 UTC (permalink / raw)
  To: linux-kernel

Hi all,

I started testing the Linux 4.1 series with rc6. However, I was never able
to boot that kernel on the i686 architecture. Trying again in VirtualBox
narrowed it down: with one CPU core it simply boots up, but with more than
one CPU core it crashes with:

Failed to access perfctr msr (MSR c0010007 is 0)

task: f58e0000 ti: f58e8000 task.ti: f58e800
EIP: 0060:[<c135a903>] EFLAGS: 00010206 CPU: 0
EIP is at free_cache_attributes+0x83/0xd0
EAX: 00000001 EBX: f589d46c ECX: 00000090 EDX: 360c2000
ESI: 00000000 EDI: c1724a80 EBP: f58e9ec0 ESP: f58e9ea0
 DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
CR0: 8005003b CR2: 000000ac CR3: 01731000 CR4: 000006d0

You can find more details about this problem on my bug tracker for
Manjaro Linux:

https://github.com/manjaro/packages-core/issues/14

I just want to know whether you are aware of it. With the current 4.1.3
release I still face that issue ...

kind regards
Philip Müller
--------------------------
Manjaro Project-Lead


* [linux41] regression with 'x86/cacheinfo: Move cacheinfo sysfs code to generic infrastructure' on AMD i686
  2015-07-23 22:23 [linux41] Kernel panic at i686 Philip Müller
@ 2015-07-26  6:18 ` Philip Müller
  2015-07-26  8:13   ` Thomas Gleixner
  2015-07-27  7:58   ` [PATCH] cpu/cacheinfo: Fix teardown path Borislav Petkov
  0 siblings, 2 replies; 23+ messages in thread
From: Philip Müller @ 2015-07-26  6:18 UTC (permalink / raw)
  To: linux-kernel, Sudeep Holla, Guenter Roeck, manjaro-dev
  Cc: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, Andre Przywara,
	Borislav Petkov

Hi Guenter, Sudeep,

I have bisected it down to 'x86/cacheinfo: Move cacheinfo sysfs code to generic
infrastructure'[1], as you can see here[2].

The facts are:

- On i686 the kernel fails to boot only when more than one CPU core is used
on AMD hardware (x86_64 works, however)
- Using the Ubuntu config[3] it boots, but produces a kernel >= 1 GB in size
on Manjaro
- With 0d55ba4 reverted[4], the kernel boots.

So we have to find out what is causing this issue[5].

kind regards
Philip Müller
--------------------------
Manjaro Project-Lead

[1]
https://github.com/torvalds/linux/commit/0d55ba46bfbee64fd2b492b87bfe2ec172e7b056
[2]
https://raw.githubusercontent.com/philmmanjaro/linux41/master/git-bisect.txt
[3]
https://raw.githubusercontent.com/philmmanjaro/linux41/master/config.4.1.3-040103-generic
[4]
https://github.com/manjaro/packages-core/commit/f7b77f3e84295a6313a9181d520fb48e60453b64
[5] https://github.com/manjaro/packages-core/issues/14


On 24.07.2015 00:23, Philip Müller wrote:
> Hi all,
> 
> I started testing the Linux 4.1 series with rc6. However, I was never able
> to boot that kernel on the i686 architecture. Trying again in VirtualBox
> narrowed it down: with one CPU core it simply boots up, but with more than
> one CPU core it crashes with:
> 
> Failed to access perfctr msr (MSR c0010007 is 0)
> 
> task: f58e0000 ti: f58e8000 task.ti: f58e800
> EIP: 0060:[<c135a903>] EFLAGS: 00010206 CPU: 0
> EIP is at free_cache_attributes+0x83/0xd0
> EAX: 00000001 EBX: f589d46c ECX: 00000090 EDX: 360c2000
> ESI: 00000000 EDI: c1724a80 EBP: f58e9ec0 ESP: f58e9ea0
>  DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
> CR0: 8005003b CR2: 000000ac CR3: 01731000 CR4: 000006d0
> 
> You can find more details about this problem on my bug tracker for
> Manjaro Linux:
> 
> https://github.com/manjaro/packages-core/issues/14
> 
> I just want to know whether you are aware of it. With the current 4.1.3
> release I still face that issue ...
> 
> kind regards
> Philip Müller
> --------------------------
> Manjaro Project-Lead
> 



* Re: [linux41] regression with 'x86/cacheinfo: Move cacheinfo sysfs code to generic infrastructure' on AMD i686
  2015-07-26  6:18 ` [linux41] regression with 'x86/cacheinfo: Move cacheinfo sysfs code to generic infrastructure' on AMD i686 Philip Müller
@ 2015-07-26  8:13   ` Thomas Gleixner
  2015-07-26  8:41     ` Borislav Petkov
  2015-07-27  7:58   ` [PATCH] cpu/cacheinfo: Fix teardown path Borislav Petkov
  1 sibling, 1 reply; 23+ messages in thread
From: Thomas Gleixner @ 2015-07-26  8:13 UTC (permalink / raw)
  To: Philip Müller
  Cc: linux-kernel, Sudeep Holla, Guenter Roeck, manjaro-dev,
	Ingo Molnar, H. Peter Anvin, Andre Przywara, Borislav Petkov

On Sun, 26 Jul 2015, Philip Müller wrote:
> > task: f58e0000 ti: f58e8000 task.ti: f58e800
> > EIP: 0060:[<c135a903>] EFLAGS: 00010206 CPU: 0
> > EIP is at free_cache_attributes+0x83/0xd0
> > EAX: 00000001 EBX: f589d46c ECX: 00000090 EDX: 360c2000
> > ESI: 00000000 EDI: c1724a80 EBP: f58e9ec0 ESP: f58e9ea0
> >  DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
> > CR0: 8005003b CR2: 000000ac CR3: 01731000 CR4: 000006d0

That's a trivial NULL pointer dereference in the error/cleanup
path. Patch below should fix it.

Thanks,

	tglx
---
diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
index 764280a91776..f09b106d8b81 100644
--- a/drivers/base/cacheinfo.c
+++ b/drivers/base/cacheinfo.c
@@ -159,6 +159,9 @@ static void cache_shared_cpu_map_remove(unsigned int cpu)
 
 static void free_cache_attributes(unsigned int cpu)
 {
+	if (!per_cpu_cacheinfo(cpu))
+		return;
+
 	cache_shared_cpu_map_remove(cpu);
 
 	kfree(per_cpu_cacheinfo(cpu));
@@ -514,8 +517,7 @@ static int cacheinfo_cpu_callback(struct notifier_block *nfb,
 		break;
 	case CPU_DEAD:
 		cache_remove_dev(cpu);
-		if (per_cpu_cacheinfo(cpu))
-			free_cache_attributes(cpu);
+		free_cache_attributes(cpu);
 		break;
 	}
 	return notifier_from_errno(rc);
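
A note on why the register dump alone is enough to call this a NULL
pointer dereference: on x86, CR2 holds the linear address that triggered
the page fault, and here it is 000000ac, which lies inside the first page.
An address like that is what you typically get when a structure member is
accessed through a NULL base pointer. The following is only a userspace
sketch of that rule of thumb; the helper names and the 4 KiB page size are
assumptions made for illustration and are not kernel code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed page size: 4 KiB, as on 32-bit x86 without large pages. */
#define SKETCH_PAGE_SIZE 4096u

/*
 * Hypothetical helper: true when the faulting address lies in the first
 * page, which is what a NULL base pointer plus a small member offset
 * produces.
 */
static bool looks_like_null_deref(uint32_t fault_addr)
{
	return fault_addr < SKETCH_PAGE_SIZE;
}

/*
 * Hypothetical helper: with a NULL (zero) base, the faulting address is
 * itself the offset that was touched.
 */
static uint32_t null_deref_offset(uint32_t fault_addr)
{
	assert(looks_like_null_deref(fault_addr));
	return fault_addr;
}
```

With the CR2 value from the dump, looks_like_null_deref(0x000000ac) holds,
whereas a value in the kernel address range such as the EBX content
f589d46c does not match.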


* Re: [linux41] regression with 'x86/cacheinfo: Move cacheinfo sysfs code to generic infrastructure' on AMD i686
  2015-07-26  8:13   ` Thomas Gleixner
@ 2015-07-26  8:41     ` Borislav Petkov
  2015-07-26 10:54       ` Philip Müller
  0 siblings, 1 reply; 23+ messages in thread
From: Borislav Petkov @ 2015-07-26  8:41 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Philip Müller, linux-kernel, Sudeep Holla, Guenter Roeck,
	manjaro-dev, Ingo Molnar, H. Peter Anvin, Andre Przywara,
	Borislav Petkov

On Sun, Jul 26, 2015 at 10:13:45AM +0200, Thomas Gleixner wrote:
> On Sun, 26 Jul 2015, Philip Müller wrote:
> > > task: f58e0000 ti: f58e8000 task.ti: f58e800
> > > EIP: 0060:[<c135a903>] EFLAGS: 00010206 CPU: 0
> > > EIP is at free_cache_attributes+0x83/0xd0
> > > EAX: 00000001 EBX: f589d46c ECX: 00000090 EDX: 360c2000
> > > ESI: 00000000 EDI: c1724a80 EBP: f58e9ec0 ESP: f58e9ea0
> > >  DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
> > > CR0: 8005003b CR2: 000000ac CR3: 01731000 CR4: 000006d0
> 
> That's a trivial NULL pointer dereference in the error/cleanup
> path. Patch below should fix it.

Well, I have a slightly different, and of course totally untested, possible
solution:

cache_shared_cpu_map_setup() does check sib_cpu_ci->info_list before
setting cpumask bits while cache_shared_cpu_map_remove() doesn't. Balancing
this out would mean:

---
diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
index 764280a91776..8a4546dc25e3 100644
--- a/drivers/base/cacheinfo.c
+++ b/drivers/base/cacheinfo.c
@@ -148,7 +148,11 @@ static void cache_shared_cpu_map_remove(unsigned int cpu)
 
 			if (sibling == cpu) /* skip itself */
 				continue;
+
 			sib_cpu_ci = get_cpu_cacheinfo(sibling);
+			if (!sib_cpu_ci->info_list)
+				continue;
+
 			sib_leaf = sib_cpu_ci->info_list + index;
 			cpumask_clear_cpu(cpu, &sib_leaf->shared_cpu_map);
 			cpumask_clear_cpu(sibling, &this_leaf->shared_cpu_map);
---
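
To illustrate the asymmetry outside the kernel, here is a simplified
userspace model. It is a sketch under assumptions: the structures are
reduced stand-ins, the cpumask is a plain bitmask, the loop walks all CPUs
instead of the shared map, and the names shared_map_setup() and
shared_map_remove() are made up for this example. Only the info_list check
mirrors the diff above:

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS 2

/* Reduced stand-ins for the kernel structures (illustrative only). */
struct cacheinfo {
	unsigned long shared_cpu_map;	/* one bit per CPU */
};

struct cpu_cacheinfo {
	struct cacheinfo *info_list;	/* NULL until the CPU is populated */
};

static struct cpu_cacheinfo ci[NR_CPUS];

/* Setup side: link leaf 'index' of 'cpu' with every populated sibling. */
static void shared_map_setup(unsigned int cpu, unsigned int index)
{
	struct cacheinfo *this_leaf;

	assert(cpu < NR_CPUS && ci[cpu].info_list);
	this_leaf = ci[cpu].info_list + index;

	this_leaf->shared_cpu_map |= 1UL << cpu;
	for (unsigned int sibling = 0; sibling < NR_CPUS; sibling++) {
		if (sibling == cpu || !ci[sibling].info_list)
			continue;	/* skip itself and unpopulated siblings */
		ci[sibling].info_list[index].shared_cpu_map |= 1UL << cpu;
		this_leaf->shared_cpu_map |= 1UL << sibling;
	}
}

/* Teardown side, now carrying the same guard as the setup side. */
static void shared_map_remove(unsigned int cpu, unsigned int index)
{
	struct cacheinfo *this_leaf;

	assert(cpu < NR_CPUS && ci[cpu].info_list);
	this_leaf = ci[cpu].info_list + index;

	for (unsigned int sibling = 0; sibling < NR_CPUS; sibling++) {
		if (sibling == cpu)	/* skip itself */
			continue;
		if (!ci[sibling].info_list)
			continue;	/* without this: NULL dereference */
		ci[sibling].info_list[index].shared_cpu_map &= ~(1UL << cpu);
		this_leaf->shared_cpu_map &= ~(1UL << sibling);
	}
}
```

In this model, tearing down CPU 0 while CPU 1 has no info_list yet simply
skips CPU 1, which is the situation the added check is meant to cover.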

Now Philip can have some more fun testing :-)

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--


* Re: [linux41] regression with 'x86/cacheinfo: Move cacheinfo sysfs code to generic infrastructure' on AMD i686
  2015-07-26  8:41     ` Borislav Petkov
@ 2015-07-26 10:54       ` Philip Müller
  2015-07-26 14:42         ` Borislav Petkov
  0 siblings, 1 reply; 23+ messages in thread
From: Philip Müller @ 2015-07-26 10:54 UTC (permalink / raw)
  To: Borislav Petkov, Thomas Gleixner
  Cc: linux-kernel, Sudeep Holla, Guenter Roeck, manjaro-dev,
	Ingo Molnar, H. Peter Anvin, Andre Przywara, Borislav Petkov

Hi Borislav,

I can confirm that your patch works. However, it might be good to use yours
and Thomas' in combination to solve this properly.

kind regards
Philip

On 26.07.2015 10:41, Borislav Petkov wrote:
> On Sun, Jul 26, 2015 at 10:13:45AM +0200, Thomas Gleixner wrote:
>> On Sun, 26 Jul 2015, Philip Müller wrote:
>>>> task: f58e0000 ti: f58e8000 task.ti: f58e800
>>>> EIP: 0060:[<c135a903>] EFLAGS: 00010206 CPU: 0
>>>> EIP is at free_cache_attributes+0x83/0xd0
>>>> EAX: 00000001 EBX: f589d46c ECX: 00000090 EDX: 360c2000
>>>> ESI: 00000000 EDI: c1724a80 EBP: f58e9ec0 ESP: f58e9ea0
>>>>  DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
>>>> CR0: 8005003b CR2: 000000ac CR3: 01731000 CR4: 000006d0
>>
>> That's a trivial NULL pointer dereference in the error/cleanup
>> path. Patch below should fix it.
> 
> Well, I have a slightly different, and of course totally untested, possible
> solution:
> 
> cache_shared_cpu_map_setup() does check sib_cpu_ci->info_list before
> setting cpumask bits while cache_shared_cpu_map_remove() doesn't. Balancing
> this out would mean:
> 
> ---
> diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
> index 764280a91776..8a4546dc25e3 100644
> --- a/drivers/base/cacheinfo.c
> +++ b/drivers/base/cacheinfo.c
> @@ -148,7 +148,11 @@ static void cache_shared_cpu_map_remove(unsigned int cpu)
>  
>  			if (sibling == cpu) /* skip itself */
>  				continue;
> +
>  			sib_cpu_ci = get_cpu_cacheinfo(sibling);
> +			if (!sib_cpu_ci->info_list)
> +				continue;
> +
>  			sib_leaf = sib_cpu_ci->info_list + index;
>  			cpumask_clear_cpu(cpu, &sib_leaf->shared_cpu_map);
>  			cpumask_clear_cpu(sibling, &this_leaf->shared_cpu_map);
> ---
> 
> Now Philip can have some more fun testing :-)
> 



* Re: [linux41] regression with 'x86/cacheinfo: Move cacheinfo sysfs code to generic infrastructure' on AMD i686
  2015-07-26 10:54       ` Philip Müller
@ 2015-07-26 14:42         ` Borislav Petkov
  2015-07-26 15:59           ` Philip Müller
  2015-09-16 23:52           ` Josh Boyer
  0 siblings, 2 replies; 23+ messages in thread
From: Borislav Petkov @ 2015-07-26 14:42 UTC (permalink / raw)
  To: Philip Müller
  Cc: Thomas Gleixner, linux-kernel, Sudeep Holla, Guenter Roeck,
	manjaro-dev, Ingo Molnar, H. Peter Anvin, Andre Przywara,
	Borislav Petkov

On Sun, Jul 26, 2015 at 12:54:55PM +0200, Philip Müller wrote:
> I can confirm that your patch works. However, it might be good to use yours
> and Thomas' in combination to solve this properly.

Please do not top-post.

We could use Thomas' too although from looking at it,
detect_cache_attributes() allocates a per-CPU per_cpu_cacheinfo thing
for each CPU. By the time we hit cache_shared_cpu_map_remove() in
free_cache_attributes(), those per_cpu_cacheinfo(cpu) things are still
allocated. We kfree them in the next step only.

But I like the moving of the check from the CPU hotplug callback to
free_cache_attributes().
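
The ordering described above, plus the check moved into the callee, can be
sketched in userspace roughly as follows. This is an illustration under
assumptions, not the kernel code: info_list[] stands in for
per_cpu_cacheinfo(cpu), calloc()/free() replace the kernel allocator, and
detect_attributes(), free_attributes() and the remove_calls counter are
names invented for the example:

```c
#include <assert.h>
#include <stdlib.h>

#define NR_CPUS 2

struct cacheinfo {
	unsigned long shared_cpu_map;
};

/* Stand-in for per_cpu_cacheinfo(cpu); NULL means "not allocated". */
static struct cacheinfo *info_list[NR_CPUS];

/* Counts how often the teardown helper actually ran (illustration only). */
static unsigned int remove_calls;

/* Allocation step, loosely modelled on detect_cache_attributes(). */
static int detect_attributes(unsigned int cpu, size_t leaves)
{
	assert(cpu < NR_CPUS);
	info_list[cpu] = calloc(leaves, sizeof(struct cacheinfo));
	return info_list[cpu] ? 0 : -1;
}

/* Placeholder for clearing sibling bits; info_list[cpu] is still valid. */
static void shared_map_remove(unsigned int cpu)
{
	assert(info_list[cpu] != NULL);
	remove_calls++;
}

/* Teardown: the guard lives here, so callers need no check of their own. */
static void free_attributes(unsigned int cpu)
{
	assert(cpu < NR_CPUS);
	if (!info_list[cpu])
		return;			/* never set up, or already freed */

	shared_map_remove(cpu);		/* runs while the allocation exists */
	free(info_list[cpu]);		/* released only afterwards */
	info_list[cpu] = NULL;
}
```

Calling free_attributes() for a CPU that was never set up, or calling it
twice, is then a no-op, which is what makes the hotplug callback simpler.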

So I'll merge the two patches and write up a proper commit message,
unless someone objects.

I'll add your Tested-by too.

Thanks.

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--


* Re: [linux41] regression with 'x86/cacheinfo: Move cacheinfo sysfs code to generic infrastructure' on AMD i686
  2015-07-26 14:42         ` Borislav Petkov
@ 2015-07-26 15:59           ` Philip Müller
  2015-07-26 16:11             ` Guenter Roeck
  2015-09-16 23:52           ` Josh Boyer
  1 sibling, 1 reply; 23+ messages in thread
From: Philip Müller @ 2015-07-26 15:59 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Thomas Gleixner, linux-kernel, Sudeep Holla, Guenter Roeck,
	manjaro-dev, Ingo Molnar, H. Peter Anvin, Andre Przywara,
	Borislav Petkov

Hi Borislav,

I'm fine with that decision. I tested your patch alone and in
combination with Thomas' changes. Both solve this problem.

Do whatever suits this matter best. Thanks to you too for providing
solutions so fast.

kind regards
Philip

p.s. what do you mean by top-post?

On 26.07.2015 at 16:42, Borislav Petkov wrote:
> On Sun, Jul 26, 2015 at 12:54:55PM +0200, Philip Müller wrote:
>> I can confirm that your patch works. However, it might be good to use yours
>> and Thomas' in combination to solve this properly.
> 
> Please do not top-post.
> 
> We could use Thomas' too although from looking at it,
> detect_cache_attributes() allocates a per-CPU per_cpu_cacheinfo thing
> for each CPU. By the time we hit cache_shared_cpu_map_remove() in
> free_cache_attributes(), those per_cpu_cacheinfo(cpu) things are still
> allocated. We kfree them in the next step only.
> 
> But I like the moving of the check from the CPU hotplug callback to
> free_cache_attributes().
> 
> So I'll merge the two patches and write up a proper commit message,
> unless someone objects.
> 
> I'll add your Tested-by too.
> 
> Thanks.
> 



* Re: [linux41] regression with 'x86/cacheinfo: Move cacheinfo sysfs code to generic infrastructure' on AMD i686
  2015-07-26 15:59           ` Philip Müller
@ 2015-07-26 16:11             ` Guenter Roeck
  0 siblings, 0 replies; 23+ messages in thread
From: Guenter Roeck @ 2015-07-26 16:11 UTC (permalink / raw)
  To: Philip Müller, Borislav Petkov
  Cc: Thomas Gleixner, linux-kernel, Sudeep Holla, manjaro-dev,
	Ingo Molnar, H. Peter Anvin, Andre Przywara, Borislav Petkov

On 07/26/2015 08:59 AM, Philip Müller wrote:
> Hi Borislav,
>
> I'm fine with that decision. I tested your patch alone and in
> combination with Thomas' changes. Both solve this problem.
>
> Do whatever suits this matter best. Thanks to you too for providing
> solutions so fast.
>
> kind regards
> Philip
>
> p.s. what do you mean by top-post?
>

What you just did ;-).

http://ck.wikia.com/wiki/TopPosting

Guenter

> On 26.07.2015 at 16:42, Borislav Petkov wrote:
>> On Sun, Jul 26, 2015 at 12:54:55PM +0200, Philip Müller wrote:
>>> I can confirm that your patch works. However, it might be good to use yours
>>> and Thomas' in combination to solve this properly.
>>
>> Please do not top-post.
>>
>> We could use Thomas' too although from looking at it,
>> detect_cache_attributes() allocates a per-CPU per_cpu_cacheinfo thing
>> for each CPU. By the time we hit cache_shared_cpu_map_remove() in
>> free_cache_attributes(), those per_cpu_cacheinfo(cpu) things are still
>> allocated. We kfree them in the next step only.
>>
>> But I like the moving of the check from the CPU hotplug callback to
>> free_cache_attributes().
>>
>> So I'll merge the two patches and write up a proper commit message,
>> unless someone objects.
>>
>> I'll add your Tested-by too.
>>
>> Thanks.
>>
>
>



* [PATCH] cpu/cacheinfo: Fix teardown path
  2015-07-26  6:18 ` [linux41] regression with 'x86/cacheinfo: Move cacheinfo sysfs code to generic infrastructure' on AMD i686 Philip Müller
  2015-07-26  8:13   ` Thomas Gleixner
@ 2015-07-27  7:58   ` Borislav Petkov
  2015-07-27  8:56     ` Sudeep Holla
                       ` (3 more replies)
  1 sibling, 4 replies; 23+ messages in thread
From: Borislav Petkov @ 2015-07-27  7:58 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Philip Müller, linux-kernel, Sudeep Holla, Guenter Roeck,
	manjaro-dev, Ingo Molnar, H. Peter Anvin, Andre Przywara

From: Borislav Petkov <bp@suse.de>
Date: Mon, 27 Jul 2015 08:36:27 +0200
Subject: [PATCH] cpu/cacheinfo: Fix teardown path
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Philip Müller reported a hang when booting 32-bit 4.1 kernel on an AMD
box. A fragment of the splat was enough to pinpoint the issue:

  task: f58e0000 ti: f58e8000 task.ti: f58e800
  EIP: 0060:[<c135a903>] EFLAGS: 00010206 CPU: 0
  EIP is at free_cache_attributes+0x83/0xd0
  EAX: 00000001 EBX: f589d46c ECX: 00000090 EDX: 360c2000
  ESI: 00000000 EDI: c1724a80 EBP: f58e9ec0 ESP: f58e9ea0
   DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
  CR0: 8005003b CR2: 000000ac CR3: 01731000 CR4: 000006d0

cache_shared_cpu_map_setup() did check sibling CPUs cacheinfo descriptor
while the respective teardown path cache_shared_cpu_map_remove() didn't.
Fix that.

From tglx's version: to be on the safe side, move the cacheinfo
descriptor check to free_cache_attributes(), thus cleaning up the
hotplug path a little and making this even more robust.

Reported-by: Philip Müller <philm@manjaro.org>
Cc: <stable@vger.kernel.org> # 4.1
Cc: Andre Przywara <andre.przywara@arm.com>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: linux-kernel@vger.kernel.org
Cc: manjaro-dev@manjaro.org
Cc: Philip Müller <philm@manjaro.org>
Cc: Sudeep Holla <sudeep.holla@arm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/55B47BB8.6080202@manjaro.org
Signed-off-by: Borislav Petkov <bp@suse.de>
---

Moin Thomas,

I've merged both patches and tagged it for stable. Which means,
tip-urgent.

Thanks.

 drivers/base/cacheinfo.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
index 764280a91776..e9fd32e91668 100644
--- a/drivers/base/cacheinfo.c
+++ b/drivers/base/cacheinfo.c
@@ -148,7 +148,11 @@ static void cache_shared_cpu_map_remove(unsigned int cpu)
 
 			if (sibling == cpu) /* skip itself */
 				continue;
+
 			sib_cpu_ci = get_cpu_cacheinfo(sibling);
+			if (!sib_cpu_ci->info_list)
+				continue;
+
 			sib_leaf = sib_cpu_ci->info_list + index;
 			cpumask_clear_cpu(cpu, &sib_leaf->shared_cpu_map);
 			cpumask_clear_cpu(sibling, &this_leaf->shared_cpu_map);
@@ -159,6 +163,9 @@ static void cache_shared_cpu_map_remove(unsigned int cpu)
 
 static void free_cache_attributes(unsigned int cpu)
 {
+	if (!per_cpu_cacheinfo(cpu))
+		return;
+
 	cache_shared_cpu_map_remove(cpu);
 
 	kfree(per_cpu_cacheinfo(cpu));
@@ -514,8 +521,7 @@ static int cacheinfo_cpu_callback(struct notifier_block *nfb,
 		break;
 	case CPU_DEAD:
 		cache_remove_dev(cpu);
-		if (per_cpu_cacheinfo(cpu))
-			free_cache_attributes(cpu);
+		free_cache_attributes(cpu);
 		break;
 	}
 	return notifier_from_errno(rc);
-- 
2.5.0.rc2.28.g6003e7f

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--


* Re: [PATCH] cpu/cacheinfo: Fix teardown path
  2015-07-27  7:58   ` [PATCH] cpu/cacheinfo: Fix teardown path Borislav Petkov
@ 2015-07-27  8:56     ` Sudeep Holla
  2015-07-27 11:10     ` Thomas Gleixner
                       ` (2 subsequent siblings)
  3 siblings, 0 replies; 23+ messages in thread
From: Sudeep Holla @ 2015-07-27  8:56 UTC (permalink / raw)
  To: Borislav Petkov, Thomas Gleixner
  Cc: Sudeep Holla, Philip Müller, linux-kernel, Guenter Roeck,
	manjaro-dev, Ingo Molnar, H. Peter Anvin, Andre Przywara



On 27/07/15 08:58, Borislav Petkov wrote:
> From: Borislav Petkov <bp@suse.de>
> Date: Mon, 27 Jul 2015 08:36:27 +0200
> Subject: [PATCH] cpu/cacheinfo: Fix teardown path
> MIME-Version: 1.0
> Content-Type: text/plain; charset=UTF-8
> Content-Transfer-Encoding: 8bit
>
> Philip Müller reported a hang when booting 32-bit 4.1 kernel on an AMD
> box. A fragment of the splat was enough to pinpoint the issue:
>
>    task: f58e0000 ti: f58e8000 task.ti: f58e800
>    EIP: 0060:[<c135a903>] EFLAGS: 00010206 CPU: 0
>    EIP is at free_cache_attributes+0x83/0xd0
>    EAX: 00000001 EBX: f589d46c ECX: 00000090 EDX: 360c2000
>    ESI: 00000000 EDI: c1724a80 EBP: f58e9ec0 ESP: f58e9ea0
>     DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
>    CR0: 8005003b CR2: 000000ac CR3: 01731000 CR4: 000006d0
>
> cache_shared_cpu_map_setup() did check sibling CPUs cacheinfo descriptor
> while the respective teardown path cache_shared_cpu_map_remove() didn't.
> Fix that.
>
> From tglx's version: to be on the safe side, move the cacheinfo
> descriptor check to free_cache_attributes(), thus cleaning up the
> hotplug path a little and making this even more robust.
>
> Reported-by: Philip Müller <philm@manjaro.org>
> Cc: <stable@vger.kernel.org> # 4.1
> Cc: Andre Przywara <andre.przywara@arm.com>
> Cc: Guenter Roeck <linux@roeck-us.net>
> Cc: "H. Peter Anvin" <hpa@zytor.com>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: linux-kernel@vger.kernel.org
> Cc: manjaro-dev@manjaro.org
> Cc: Philip Müller <philm@manjaro.org>
> Cc: Sudeep Holla <sudeep.holla@arm.com>

Looks good to me. If not too late
Acked-by: Sudeep Holla <sudeep.holla@arm.com>

Regards,
Sudeep


* Re: [PATCH] cpu/cacheinfo: Fix teardown path
  2015-07-27  7:58   ` [PATCH] cpu/cacheinfo: Fix teardown path Borislav Petkov
  2015-07-27  8:56     ` Sudeep Holla
@ 2015-07-27 11:10     ` Thomas Gleixner
  2015-07-27 18:49       ` Philip Müller
  2015-08-05 20:14     ` [tip:x86/urgent] x86/cpu/cacheinfo: " tip-bot for Borislav Petkov
  2015-08-08  8:46     ` [PATCH] cpu/cacheinfo: " Borislav Petkov
  3 siblings, 1 reply; 23+ messages in thread
From: Thomas Gleixner @ 2015-07-27 11:10 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Philip Müller, linux-kernel, Sudeep Holla, Guenter Roeck,
	manjaro-dev, Ingo Molnar, H. Peter Anvin, Andre Przywara

On Mon, 27 Jul 2015, Borislav Petkov wrote:

> From: Borislav Petkov <bp@suse.de>
> Date: Mon, 27 Jul 2015 08:36:27 +0200
> Subject: [PATCH] cpu/cacheinfo: Fix teardown path
> MIME-Version: 1.0
> Content-Type: text/plain; charset=UTF-8
> Content-Transfer-Encoding: 8bit
> 
> Philip Müller reported a hang when booting 32-bit 4.1 kernel on an AMD
> box. A fragment of the splat was enough to pinpoint the issue:
> 
>   task: f58e0000 ti: f58e8000 task.ti: f58e800
>   EIP: 0060:[<c135a903>] EFLAGS: 00010206 CPU: 0
>   EIP is at free_cache_attributes+0x83/0xd0
>   EAX: 00000001 EBX: f589d46c ECX: 00000090 EDX: 360c2000
>   ESI: 00000000 EDI: c1724a80 EBP: f58e9ec0 ESP: f58e9ea0
>    DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
>   CR0: 8005003b CR2: 000000ac CR3: 01731000 CR4: 000006d0
> 
> cache_shared_cpu_map_setup() did check sibling CPUs cacheinfo descriptor
> while the respective teardown path cache_shared_cpu_map_remove() didn't.
> Fix that.
> 
> From tglx's version: to be on the safe side, move the cacheinfo
> descriptor check to free_cache_attributes(), thus cleaning up the
> hotplug path a little and making this even more robust.
> 
> Reported-by: Philip Müller <philm@manjaro.org>
> Cc: <stable@vger.kernel.org> # 4.1
> Cc: Andre Przywara <andre.przywara@arm.com>
> Cc: Guenter Roeck <linux@roeck-us.net>
> Cc: "H. Peter Anvin" <hpa@zytor.com>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: linux-kernel@vger.kernel.org
> Cc: manjaro-dev@manjaro.org
> Cc: Philip Müller <philm@manjaro.org>
> Cc: Sudeep Holla <sudeep.holla@arm.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Link: https://lkml.kernel.org/r/55B47BB8.6080202@manjaro.org
> Signed-off-by: Borislav Petkov <bp@suse.de>
> ---
> 
> Moin Thomas,
> 
> I've merged both patches and tagged it for stable. Which means,
> tip-urgent.

Reviewed-by: Thomas Gleixner <tglx@linutronix.de>


* Re: [PATCH] cpu/cacheinfo: Fix teardown path
  2015-07-27 11:10     ` Thomas Gleixner
@ 2015-07-27 18:49       ` Philip Müller
  0 siblings, 0 replies; 23+ messages in thread
From: Philip Müller @ 2015-07-27 18:49 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Thomas Gleixner, linux-kernel, Sudeep Holla, Guenter Roeck,
	manjaro-dev, Ingo Molnar, H. Peter Anvin, Andre Przywara

On 27.07.2015 at 13:10, Thomas Gleixner wrote:
>> ---
>>
>> Moin Thomas,
>>
>> I've merged both patches and tagged it for stable. Which means,
>> tip-urgent.
> 
> Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
> 

Hi Borislav,

I reviewed your new code and tested it as well.

Acked-by: Philip Müller <philm@manjaro.org>


* [tip:x86/urgent] x86/cpu/cacheinfo: Fix teardown path
  2015-07-27  7:58   ` [PATCH] cpu/cacheinfo: Fix teardown path Borislav Petkov
  2015-07-27  8:56     ` Sudeep Holla
  2015-07-27 11:10     ` Thomas Gleixner
@ 2015-08-05 20:14     ` tip-bot for Borislav Petkov
  2015-08-08  8:46     ` [PATCH] cpu/cacheinfo: " Borislav Petkov
  3 siblings, 0 replies; 23+ messages in thread
From: tip-bot for Borislav Petkov @ 2015-08-05 20:14 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: peterz, sudeep.holla, bp, andre.przywara, mingo, philm,
	linux-kernel, tglx, torvalds, linux, hpa

Commit-ID:  680ac028240f8747f31c03986fbcf18b2b521e93
Gitweb:     http://git.kernel.org/tip/680ac028240f8747f31c03986fbcf18b2b521e93
Author:     Borislav Petkov <bp@suse.de>
AuthorDate: Mon, 27 Jul 2015 09:58:05 +0200
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Wed, 5 Aug 2015 10:08:17 +0200

x86/cpu/cacheinfo: Fix teardown path

Philip Müller reported a hang when booting 32-bit 4.1 kernel on
an AMD box. A fragment of the splat was enough to pinpoint the
issue:

  task: f58e0000 ti: f58e8000 task.ti: f58e800
  EIP: 0060:[<c135a903>] EFLAGS: 00010206 CPU: 0
  EIP is at free_cache_attributes+0x83/0xd0
  EAX: 00000001 EBX: f589d46c ECX: 00000090 EDX: 360c2000
  ESI: 00000000 EDI: c1724a80 EBP: f58e9ec0 ESP: f58e9ea0
   DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
  CR0: 8005003b CR2: 000000ac CR3: 01731000 CR4: 000006d0

cache_shared_cpu_map_setup() did check sibling CPUs cacheinfo
descriptor while the respective teardown path
cache_shared_cpu_map_remove() didn't. Fix that.

From tglx's version: to be on the safe side, move the cacheinfo
descriptor check to free_cache_attributes(), thus cleaning up
the hotplug path a little and making this even more robust.

Reported-by: Philip Müller <philm@manjaro.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: <stable@vger.kernel.org> # v4.1+
Cc: Andre Przywara <andre.przywara@arm.com>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sudeep Holla <sudeep.holla@arm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Cc: manjaro-dev@manjaro.org
Link: http://lkml.kernel.org/r/20150727075805.GA20416@nazgul.tnic
Link: https://lkml.kernel.org/r/55B47BB8.6080202@manjaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 drivers/base/cacheinfo.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
index 764280a..e9fd32e 100644
--- a/drivers/base/cacheinfo.c
+++ b/drivers/base/cacheinfo.c
@@ -148,7 +148,11 @@ static void cache_shared_cpu_map_remove(unsigned int cpu)
 
 			if (sibling == cpu) /* skip itself */
 				continue;
+
 			sib_cpu_ci = get_cpu_cacheinfo(sibling);
+			if (!sib_cpu_ci->info_list)
+				continue;
+
 			sib_leaf = sib_cpu_ci->info_list + index;
 			cpumask_clear_cpu(cpu, &sib_leaf->shared_cpu_map);
 			cpumask_clear_cpu(sibling, &this_leaf->shared_cpu_map);
@@ -159,6 +163,9 @@ static void cache_shared_cpu_map_remove(unsigned int cpu)
 
 static void free_cache_attributes(unsigned int cpu)
 {
+	if (!per_cpu_cacheinfo(cpu))
+		return;
+
 	cache_shared_cpu_map_remove(cpu);
 
 	kfree(per_cpu_cacheinfo(cpu));
@@ -514,8 +521,7 @@ static int cacheinfo_cpu_callback(struct notifier_block *nfb,
 		break;
 	case CPU_DEAD:
 		cache_remove_dev(cpu);
-		if (per_cpu_cacheinfo(cpu))
-			free_cache_attributes(cpu);
+		free_cache_attributes(cpu);
 		break;
 	}
 	return notifier_from_errno(rc);


* Re: [PATCH] cpu/cacheinfo: Fix teardown path
  2015-07-27  7:58   ` [PATCH] cpu/cacheinfo: Fix teardown path Borislav Petkov
                       ` (2 preceding siblings ...)
  2015-08-05 20:14     ` [tip:x86/urgent] x86/cpu/cacheinfo: " tip-bot for Borislav Petkov
@ 2015-08-08  8:46     ` Borislav Petkov
  2015-08-08 15:41       ` Greg KH
  3 siblings, 1 reply; 23+ messages in thread
From: Borislav Petkov @ 2015-08-08  8:46 UTC (permalink / raw)
  To: Greg KH
  Cc: Thomas Gleixner, Philip Müller, linux-kernel, Sudeep Holla,
	Guenter Roeck, manjaro-dev, Ingo Molnar, H. Peter Anvin,
	Andre Przywara

On Mon, Jul 27, 2015 at 09:58:05AM +0200, Borislav Petkov wrote:
> Philip Müller reported a hang when booting 32-bit 4.1 kernel on an AMD
> box. A fragment of the splat was enough to pinpoint the issue:

Bah, this goes to Greg and not to tip. Anyway, here's a version with
updated tags.

Greg, please queue for 4.2 as it fixes a hang.

Thanks.

---
From: Borislav Petkov <bp@suse.de>
Date: Mon, 27 Jul 2015 08:36:27 +0200
Subject: [PATCH] cpu/cacheinfo: Fix teardown path
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Philip Müller reported a hang when booting 32-bit 4.1 kernel on an AMD
box. A fragment of the splat was enough to pinpoint the issue:

  task: f58e0000 ti: f58e8000 task.ti: f58e800
  EIP: 0060:[<c135a903>] EFLAGS: 00010206 CPU: 0
  EIP is at free_cache_attributes+0x83/0xd0
  EAX: 00000001 EBX: f589d46c ECX: 00000090 EDX: 360c2000
  ESI: 00000000 EDI: c1724a80 EBP: f58e9ec0 ESP: f58e9ea0
   DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
  CR0: 8005003b CR2: 000000ac CR3: 01731000 CR4: 000006d0

cache_shared_cpu_map_setup() did check sibling CPUs cacheinfo descriptor
while the respective teardown path cache_shared_cpu_map_remove() didn't.
Fix that.

From tglx's version: to be on the safe side, move the cacheinfo
descriptor check to free_cache_attributes(), thus cleaning up the
hotplug path a little and making this even more robust.

Reported-and-tested-by: Philip Müller <philm@manjaro.org>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Sudeep Holla <sudeep.holla@arm.com>
Cc: <stable@vger.kernel.org> # 4.1
Cc: Andre Przywara <andre.przywara@arm.com>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: linux-kernel@vger.kernel.org
Cc: manjaro-dev@manjaro.org
Cc: Philip Müller <philm@manjaro.org>
Link: https://lkml.kernel.org/r/55B47BB8.6080202@manjaro.org
Signed-off-by: Borislav Petkov <bp@suse.de>
---
 drivers/base/cacheinfo.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
index 764280a91776..e9fd32e91668 100644
--- a/drivers/base/cacheinfo.c
+++ b/drivers/base/cacheinfo.c
@@ -148,7 +148,11 @@ static void cache_shared_cpu_map_remove(unsigned int cpu)
 
 			if (sibling == cpu) /* skip itself */
 				continue;
+
 			sib_cpu_ci = get_cpu_cacheinfo(sibling);
+			if (!sib_cpu_ci->info_list)
+				continue;
+
 			sib_leaf = sib_cpu_ci->info_list + index;
 			cpumask_clear_cpu(cpu, &sib_leaf->shared_cpu_map);
 			cpumask_clear_cpu(sibling, &this_leaf->shared_cpu_map);
@@ -159,6 +163,9 @@ static void cache_shared_cpu_map_remove(unsigned int cpu)
 
 static void free_cache_attributes(unsigned int cpu)
 {
+	if (!per_cpu_cacheinfo(cpu))
+		return;
+
 	cache_shared_cpu_map_remove(cpu);
 
 	kfree(per_cpu_cacheinfo(cpu));
@@ -514,8 +521,7 @@ static int cacheinfo_cpu_callback(struct notifier_block *nfb,
 		break;
 	case CPU_DEAD:
 		cache_remove_dev(cpu);
-		if (per_cpu_cacheinfo(cpu))
-			free_cache_attributes(cpu);
+		free_cache_attributes(cpu);
 		break;
 	}
 	return notifier_from_errno(rc);
-- 
2.5.0.rc2.28.g6003e7f

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--
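
The guard pattern from the patch above can be sketched in user space. This is only a simplified model, not the kernel code: `NR_CPUS`, `NR_LEAVES`, `struct cpu_ci`, `detect_attrs()`, `free_attrs()` and `run_demo()` are hypothetical names, plain bitmasks stand in for cpumasks, and the assumption that every CPU appears in every leaf's shared map is made up for illustration. The two checks mirror the ones the patch adds: skip siblings that have no descriptor list, and return early when the CPU itself has nothing to free.

```c
#include <stdlib.h>

#define NR_CPUS   4	/* hypothetical CPU count */
#define NR_LEAVES 2	/* hypothetical number of cache leaves per CPU */

struct leaf { unsigned long shared_cpu_map; };	/* bitmask stands in for a cpumask */
struct cpu_ci { struct leaf *info_list; };	/* NULL until detect_attrs() has run */

static struct cpu_ci ci[NR_CPUS];

/* Setup: allocate the descriptors and link up with populated siblings only. */
static int detect_attrs(unsigned int cpu)
{
	unsigned int index, sibling;

	ci[cpu].info_list = calloc(NR_LEAVES, sizeof(*ci[cpu].info_list));
	if (!ci[cpu].info_list)
		return -1;

	for (index = 0; index < NR_LEAVES; index++) {
		struct leaf *this_leaf = &ci[cpu].info_list[index];

		for (sibling = 0; sibling < NR_CPUS; sibling++) {
			/* Assumption: the topology says all CPUs share each leaf. */
			this_leaf->shared_cpu_map |= 1UL << sibling;
			if (sibling == cpu || !ci[sibling].info_list)
				continue;
			ci[sibling].info_list[index].shared_cpu_map |= 1UL << cpu;
		}
	}
	return 0;
}

/* Teardown counterpart: must repeat the sibling check done during setup. */
static void shared_map_remove(unsigned int cpu)
{
	unsigned int index, sibling;

	for (index = 0; index < NR_LEAVES; index++) {
		struct leaf *this_leaf = &ci[cpu].info_list[index];

		for (sibling = 0; sibling < NR_CPUS; sibling++) {
			if (!(this_leaf->shared_cpu_map & (1UL << sibling)))
				continue;
			if (sibling == cpu)		/* skip itself */
				continue;
			if (!ci[sibling].info_list)	/* sibling never populated */
				continue;
			ci[sibling].info_list[index].shared_cpu_map &= ~(1UL << cpu);
			this_leaf->shared_cpu_map &= ~(1UL << sibling);
		}
	}
}

/* Safe to call for a CPU that was never populated or was already freed. */
static void free_attrs(unsigned int cpu)
{
	if (!ci[cpu].info_list)
		return;

	shared_map_remove(cpu);
	free(ci[cpu].info_list);
	ci[cpu].info_list = NULL;
}

/* Populate CPUs 0 and 1 only, tear CPU 0 down, return CPU 1's leaf-0 map. */
unsigned long run_demo(void)
{
	unsigned long mask;

	if (detect_attrs(0) || detect_attrs(1)) {
		free_attrs(0);
		free_attrs(1);
		return 0;
	}

	free_attrs(0);	/* CPUs 2 and 3 have no descriptors: skipped, not dereferenced */
	free_attrs(0);	/* second call is a no-op */
	free_attrs(3);	/* never populated: early return */

	mask = ci[1].info_list[0].shared_cpu_map;
	free_attrs(1);
	return mask;
}
```

In this model, dropping the `!ci[sibling].info_list` check in `shared_map_remove()` would make the teardown of CPU 0 compute a leaf pointer from a NULL list for CPUs 2 and 3, which is the kind of access the splat points at. With both checks in place, `run_demo()` returns CPU 1's leaf-0 map with only bit 0 cleared.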


* Re: [PATCH] cpu/cacheinfo: Fix teardown path
  2015-08-08  8:46     ` [PATCH] cpu/cacheinfo: " Borislav Petkov
@ 2015-08-08 15:41       ` Greg KH
  2015-08-08 18:23         ` Philip Müller
  2015-08-08 19:47         ` Borislav Petkov
  0 siblings, 2 replies; 23+ messages in thread
From: Greg KH @ 2015-08-08 15:41 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Thomas Gleixner, Philip Müller, linux-kernel, Sudeep Holla,
	Guenter Roeck, manjaro-dev, Ingo Molnar, H. Peter Anvin,
	Andre Przywara

On Sat, Aug 08, 2015 at 10:46:02AM +0200, Borislav Petkov wrote:
> On Mon, Jul 27, 2015 at 09:58:05AM +0200, Borislav Petkov wrote:
> > Philip Müller reported a hang when booting 32-bit 4.1 kernel on an AMD
> > box. A fragment of the splat was enough to pinpoint the issue:
> 
> Bah, this goes to Greg and not to tip. Anyway, here's a version with
> updated tags.
> 
> Greg, please queue for 4.2 as it fixes a hang.

What commit caused this issue?

And it's a bit late for 4.2, as you say 4.1 is also affected, I'll wait
for 4.3-rc1 to give this a chance to get some testing.

thanks,

greg k-h


* Re: [PATCH] cpu/cacheinfo: Fix teardown path
  2015-08-08 15:41       ` Greg KH
@ 2015-08-08 18:23         ` Philip Müller
  2015-08-08 19:42           ` Borislav Petkov
  2015-08-08 19:47         ` Borislav Petkov
  1 sibling, 1 reply; 23+ messages in thread
From: Philip Müller @ 2015-08-08 18:23 UTC (permalink / raw)
  To: Greg KH, Borislav Petkov
  Cc: Thomas Gleixner, linux-kernel, Sudeep Holla, Guenter Roeck,
	manjaro-dev, Ingo Molnar, H. Peter Anvin, Andre Przywara,
	manjaro-dev

Hi Greg,

I bisected it to the following commit:

0d55ba46bfbee64fd2b492b87bfe2ec172e7b056 is the first bad commit
commit 0d55ba46bfbee64fd2b492b87bfe2ec172e7b056
Author: Sudeep Holla <sudeep.holla@arm.com>
Date:   Wed Mar 4 12:00:16 2015 +0000

    x86/cacheinfo: Move cacheinfo sysfs code to generic infrastructure

The full bisect log is available in my GitHub repo:

https://github.com/philmmanjaro/linux41/blob/master/git-bisect.txt

kind regards
Philip Müller

On 08.08.2015 at 17:41, Greg KH wrote:
> On Sat, Aug 08, 2015 at 10:46:02AM +0200, Borislav Petkov wrote:
>> On Mon, Jul 27, 2015 at 09:58:05AM +0200, Borislav Petkov wrote:
>>> Philip Müller reported a hang when booting 32-bit 4.1 kernel on an AMD
>>> box. A fragment of the splat was enough to pinpoint the issue:
>>
>> Bah, this goes to Greg and not to tip. Anyway, here's a version with
>> updated tags.
>>
>> Greg, please queue for 4.2 as it fixes a hang.
> 
> What commit caused this issue?
> 
> And it's a bit late for 4.2, as you say 4.1 is also affected, I'll wait
> for 4.3-rc1 to give this a chance to get some testing.
> 
> thanks,
> 
> greg k-h
> 



* Re: [PATCH] cpu/cacheinfo: Fix teardown path
  2015-08-08 18:23         ` Philip Müller
@ 2015-08-08 19:42           ` Borislav Petkov
  0 siblings, 0 replies; 23+ messages in thread
From: Borislav Petkov @ 2015-08-08 19:42 UTC (permalink / raw)
  To: Philip Müller
  Cc: Greg KH, Thomas Gleixner, linux-kernel, Sudeep Holla,
	Guenter Roeck, manjaro-dev, Ingo Molnar, H. Peter Anvin,
	Andre Przywara

On Sat, Aug 08, 2015 at 08:23:49PM +0200, Philip Müller wrote:
> Hi Greg,
> 
> I bi-sected it to following commit:
> 
> 0d55ba46bfbee64fd2b492b87bfe2ec172e7b056 is the first bad commit
> commit 0d55ba46bfbee64fd2b492b87bfe2ec172e7b056
> Author: Sudeep Holla <sudeep.holla@arm.com>
> Date:   Wed Mar 4 12:00:16 2015 +0000
> 
>     x86/cacheinfo: Move cacheinfo sysfs code to generic infrastructure
> 
> You can follow it on my github repo in rich detail:
> 
> https://github.com/philmmanjaro/linux41/blob/master/git-bisect.txt

Philip,

what is with you and top-posting? How hard is it not to do it?!

Please stop with the top-posting already.

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--


* Re: [PATCH] cpu/cacheinfo: Fix teardown path
  2015-08-08 15:41       ` Greg KH
  2015-08-08 18:23         ` Philip Müller
@ 2015-08-08 19:47         ` Borislav Petkov
  2015-09-13  7:03           ` Philip Müller
  1 sibling, 1 reply; 23+ messages in thread
From: Borislav Petkov @ 2015-08-08 19:47 UTC (permalink / raw)
  To: Greg KH
  Cc: Thomas Gleixner, Philip Müller, linux-kernel, Sudeep Holla,
	Guenter Roeck, manjaro-dev, Ingo Molnar, H. Peter Anvin,
	Andre Przywara

On Sat, Aug 08, 2015 at 08:41:56AM -0700, Greg KH wrote:
> What commit caused this issue?

Apparently

 0d55ba46bfbe ("x86/cacheinfo: Move cacheinfo sysfs code to generic infrastructure")

Looks like moving x86 to the generic cacheinfo stuff uncovered this
shortcoming there...

> And it's a bit late for 4.2, as you say 4.1 is also affected, I'll wait
> for 4.3-rc1 to give this a chance to get some testing.

Right, I guess that's fine too as it'll trickle to stable eventually...

Thanks.

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--


* Re: [PATCH] cpu/cacheinfo: Fix teardown path
  2015-08-08 19:47         ` Borislav Petkov
@ 2015-09-13  7:03           ` Philip Müller
  0 siblings, 0 replies; 23+ messages in thread
From: Philip Müller @ 2015-09-13  7:03 UTC (permalink / raw)
  To: Borislav Petkov, Greg KH
  Cc: Thomas Gleixner, linux-kernel, Sudeep Holla, Guenter Roeck,
	manjaro-dev, Ingo Molnar, H. Peter Anvin, Andre Przywara,
	manjaro-dev

[-- Attachment #1: Type: text/plain, Size: 656 bytes --]

On 08.08.2015 21:47, Borislav Petkov wrote:
> On Sat, Aug 08, 2015 at 08:41:56AM -0700, Greg KH wrote:
>> What commit caused this issue?
> 
> Apparently
> 
>  0d55ba46bfbe ("x86/cacheinfo: Move cacheinfo sysfs code to generic infrastructure")
> 
> Looks like moving x86 to the generic cacheinfo stuff uncovered this
> shortcoming there...
> 
>> And it's a bit late for 4.2, as you say 4.1 is also affected, I'll wait
>> for 4.3-rc1 to give this a chance to get some testing.
> 
> Right, I guess that's fine too as it'll trickle to stable eventually...
> 
> Thanks.
> 

Just a note from my end: it seems this patch didn't make it into 4.3-rc1.
Any reason why?

[-- Attachment #2: 0001-cpu-cacheinfo-fix-teardown-path.patch --]
[-- Type: text/x-patch, Size: 1090 bytes --]

diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
index 764280a91776..e9fd32e91668 100644
--- a/drivers/base/cacheinfo.c
+++ b/drivers/base/cacheinfo.c
@@ -148,7 +148,11 @@ static void cache_shared_cpu_map_remove(unsigned int cpu)
 
 			if (sibling == cpu) /* skip itself */
 				continue;
+
 			sib_cpu_ci = get_cpu_cacheinfo(sibling);
+			if (!sib_cpu_ci->info_list)
+				continue;
+
 			sib_leaf = sib_cpu_ci->info_list + index;
 			cpumask_clear_cpu(cpu, &sib_leaf->shared_cpu_map);
 			cpumask_clear_cpu(sibling, &this_leaf->shared_cpu_map);
@@ -159,6 +163,9 @@ static void cache_shared_cpu_map_remove(unsigned int cpu)
 
 static void free_cache_attributes(unsigned int cpu)
 {
+	if (!per_cpu_cacheinfo(cpu))
+		return;
+
 	cache_shared_cpu_map_remove(cpu);
 
 	kfree(per_cpu_cacheinfo(cpu));
@@ -514,8 +521,7 @@ static int cacheinfo_cpu_callback(struct notifier_block *nfb,
 		break;
 	case CPU_DEAD:
 		cache_remove_dev(cpu);
-		if (per_cpu_cacheinfo(cpu))
-			free_cache_attributes(cpu);
+		free_cache_attributes(cpu);
 		break;
 	}
 	return notifier_from_errno(rc);


* Re: [linux41] regression with 'x86/cacheinfo: Move cacheinfo sysfs code to generic infrastructure' on AMD i686
  2015-07-26 14:42         ` Borislav Petkov
  2015-07-26 15:59           ` Philip Müller
@ 2015-09-16 23:52           ` Josh Boyer
  2015-09-17  5:36             ` Philip Müller
  2015-09-17  7:15             ` Borislav Petkov
  1 sibling, 2 replies; 23+ messages in thread
From: Josh Boyer @ 2015-09-16 23:52 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Philip Müller, Thomas Gleixner,
	Linux-Kernel@Vger. Kernel. Org, Sudeep Holla, Guenter Roeck,
	manjaro-dev, Ingo Molnar, H. Peter Anvin, Andre Przywara,
	Borislav Petkov

On Sun, Jul 26, 2015 at 10:42 AM, Borislav Petkov <bp@alien8.de> wrote:
> On Sun, Jul 26, 2015 at 12:54:55PM +0200, Philip Müller wrote:
>> I can confirm your patch working. However, it might be good to use yours
>> and Thomas' in combination to solve this properly.
>
> Please do not top-post.
>
> We could use Thomas' too although from looking at it,
> detect_cache_attributes() allocates a per-CPU per_cpu_cacheinfo thing
> for each CPU. By the time we hit cache_shared_cpu_map_remove() in
> free_cache_attributes(), those per_cpu_cacheinfo(cpu) things are still
> allocated. We kfree them in the next step only.
>
> But I like the moving of the check from the CPU hotplug callback to
> free_cache_attributes().
>
> So I'll merge the two patches and write up a proper commit message,
> unless someone objects.
>
> I'll add your Tested-by too.

Did this actually happen?  I don't see either fix in Linus' tree yet,
the merge window is closed, and the bug happens on 4.1 and 4.2 stable
kernels..

josh


* Re: [linux41] regression with 'x86/cacheinfo: Move cacheinfo sysfs code to generic infrastructure' on AMD i686
  2015-09-16 23:52           ` Josh Boyer
@ 2015-09-17  5:36             ` Philip Müller
  2015-09-17  7:15             ` Borislav Petkov
  1 sibling, 0 replies; 23+ messages in thread
From: Philip Müller @ 2015-09-17  5:36 UTC (permalink / raw)
  To: Josh Boyer, Borislav Petkov
  Cc: Thomas Gleixner, Linux-Kernel@Vger. Kernel. Org, Sudeep Holla,
	Guenter Roeck, manjaro-dev, Ingo Molnar, H. Peter Anvin,
	Andre Przywara, Borislav Petkov

On 17.09.2015 at 01:52, Josh Boyer wrote:
> 
> Did this actually happen?  I don't see either fix in Linus' tree yet,
> the merge window is closed, and the bug happens on 4.1 and 4.2 stable
> kernels..
> 
> josh
> 

It seems not yet. I don't see it in 4.3-rc1 either, so 4.3 will likely
have the same issue ...



* Re: [linux41] regression with 'x86/cacheinfo: Move cacheinfo sysfs code to generic infrastructure' on AMD i686
  2015-09-16 23:52           ` Josh Boyer
  2015-09-17  5:36             ` Philip Müller
@ 2015-09-17  7:15             ` Borislav Petkov
  2015-09-17 12:54               ` Greg KH
  1 sibling, 1 reply; 23+ messages in thread
From: Borislav Petkov @ 2015-09-17  7:15 UTC (permalink / raw)
  To: Josh Boyer, Greg KH
  Cc: Philip Müller, Thomas Gleixner,
	Linux-Kernel@Vger. Kernel. Org, Sudeep Holla, Guenter Roeck,
	manjaro-dev, Ingo Molnar, H. Peter Anvin, Andre Przywara,
	Borislav Petkov

On Wed, Sep 16, 2015 at 07:52:47PM -0400, Josh Boyer wrote:
> On Sun, Jul 26, 2015 at 10:42 AM, Borislav Petkov <bp@alien8.de> wrote:
> > On Sun, Jul 26, 2015 at 12:54:55PM +0200, Philip Müller wrote:
> >> I can confirm your patch working. However, it might be good to use yours
> >> and Thomas' in combination to solve this properly.
> >
> > Please do not top-post.
> >
> > We could use Thomas' too although from looking at it,
> > detect_cache_attributes() allocates a per-CPU per_cpu_cacheinfo thing
> > for each CPU. By the time we hit cache_shared_cpu_map_remove() in
> > free_cache_attributes(), those per_cpu_cacheinfo(cpu) things are still
> > allocated. We kfree them in the next step only.
> >
> > But I like the moving of the check from the CPU hotplug callback to
> > free_cache_attributes().
> >
> > So I'll merge the two patches and write up a proper commit message,
> > unless someone objects.
> >
> > I'll add your Tested-by too.
> 
> Did this actually happen?  I don't see either fix in Linus' tree yet,
> the merge window is closed, and the bug happens on 4.1 and 4.2 stable
> kernels..

Greg wanted to pick it up...

Greg, what's up?

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--


* Re: [linux41] regression with 'x86/cacheinfo: Move cacheinfo sysfs code to generic infrastructure' on AMD i686
  2015-09-17  7:15             ` Borislav Petkov
@ 2015-09-17 12:54               ` Greg KH
  0 siblings, 0 replies; 23+ messages in thread
From: Greg KH @ 2015-09-17 12:54 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Josh Boyer, Philip Müller, Thomas Gleixner,
	Linux-Kernel@Vger. Kernel. Org, Sudeep Holla, Guenter Roeck,
	manjaro-dev, Ingo Molnar, H. Peter Anvin, Andre Przywara,
	Borislav Petkov

On Thu, Sep 17, 2015 at 09:15:04AM +0200, Borislav Petkov wrote:
> On Wed, Sep 16, 2015 at 07:52:47PM -0400, Josh Boyer wrote:
> > On Sun, Jul 26, 2015 at 10:42 AM, Borislav Petkov <bp@alien8.de> wrote:
> > > On Sun, Jul 26, 2015 at 12:54:55PM +0200, Philip Müller wrote:
> > >> I can confirm your patch working. However, it might be good to use yours
> > >> and Thomas' in combination to solve this properly.
> > >
> > > Please do not top-post.
> > >
> > > We could use Thomas' too although from looking at it,
> > > detect_cache_attributes() allocates a per-CPU per_cpu_cacheinfo thing
> > > for each CPU. By the time we hit cache_shared_cpu_map_remove() in
> > > free_cache_attributes(), those per_cpu_cacheinfo(cpu) things are still
> > > allocated. We kfree them in the next step only.
> > >
> > > But I like the moving of the check from the CPU hotplug callback to
> > > free_cache_attributes().
> > >
> > > So I'll merge the two patches and write up a proper commit message,
> > > unless someone objects.
> > >
> > > I'll add your Tested-by too.
> > 
> > Did this actually happen?  I don't see either fix in Linus' tree yet,
> > the merge window is closed, and the bug happens on 4.1 and 4.2 stable
> > kernels..
> 
> Greg wanted to pick it up...
> 
> Greg, what's up?

It's in my "to-apply" queue, let me go dig it up now...

thanks for the reminder.

greg k-h


end of thread, other threads:[~2015-09-17 12:54 UTC | newest]

Thread overview: 23+ messages
2015-07-23 22:23 [linux41] Kernel panic at i686 Philip Müller
2015-07-26  6:18 ` [linux41] regression with 'x86/cacheinfo: Move cacheinfo sysfs code to generic infrastructure' on AMD i686 Philip Müller
2015-07-26  8:13   ` Thomas Gleixner
2015-07-26  8:41     ` Borislav Petkov
2015-07-26 10:54       ` Philip Müller
2015-07-26 14:42         ` Borislav Petkov
2015-07-26 15:59           ` Philip Müller
2015-07-26 16:11             ` Guenter Roeck
2015-09-16 23:52           ` Josh Boyer
2015-09-17  5:36             ` Philip Müller
2015-09-17  7:15             ` Borislav Petkov
2015-09-17 12:54               ` Greg KH
2015-07-27  7:58   ` [PATCH] cpu/cacheinfo: Fix teardown path Borislav Petkov
2015-07-27  8:56     ` Sudeep Holla
2015-07-27 11:10     ` Thomas Gleixner
2015-07-27 18:49       ` Philip Müller
2015-08-05 20:14     ` [tip:x86/urgent] x86/cpu/cacheinfo: " tip-bot for Borislav Petkov
2015-08-08  8:46     ` [PATCH] cpu/cacheinfo: " Borislav Petkov
2015-08-08 15:41       ` Greg KH
2015-08-08 18:23         ` Philip Müller
2015-08-08 19:42           ` Borislav Petkov
2015-08-08 19:47         ` Borislav Petkov
2015-09-13  7:03           ` Philip Müller
