linux-kernel.vger.kernel.org archive mirror
* [PATCH v4] x86, hotplug: fix llc shared map unreleased during cpu hotplug
@ 2014-07-29  9:24 Wanpeng Li
  2014-07-29 17:18 ` Toshi Kani
                   ` (3 more replies)
  0 siblings, 4 replies; 16+ messages in thread
From: Wanpeng Li @ 2014-07-29  9:24 UTC (permalink / raw)
  To: hpa
  Cc: Ingo Molnar, Peter Zijlstra, x86, Borislav Petkov,
	Yasuaki Ishimatsu, David Rientjes, Prarit Bhargava,
	Steven Rostedt, Jan Kiszka, Toshi Kani, linux-kernel, Zhang Yang,
	Konrad Rzeszutek Wilk, Wanpeng Li

BUG: unable to handle kernel NULL pointer dereference at 0000000000000004
IP: [..] find_busiest_group
PGD 5a9d5067 PUD 13067 PMD 0
Oops: 0000 [#3] SMP
[...]
Call Trace:
load_balance
? _raw_spin_unlock_irqrestore
idle_balance
__schedule
schedule
schedule_timeout
? lock_timer_base
schedule_timeout_uninterruptible
msleep
lock_device_hotplug_sysfs
online_store
dev_attr_store
sysfs_write_file
vfs_write
SyS_write
system_call_fastpath

The last level cache (LLC) shared map is built during CPU up, and the
sched domain build routine uses it to set up the sched domain CPU
topology. However, the LLC shared map is not released during CPU
disable, which leads to an invalid sched domain CPU topology. This
patch fixes it by releasing the LLC shared map correctly during CPU
disable.

Reviewed-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Signed-off-by: Wanpeng Li <wanpeng.li@linux.intel.com>
---
v1 -> v2:
 * fix subject line
v2 -> v3:
 * simplify backtrace 

 arch/x86/kernel/smpboot.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 5492798..0134ec7 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -1292,6 +1292,9 @@ static void remove_siblinginfo(int cpu)
 
 	for_each_cpu(sibling, cpu_sibling_mask(cpu))
 		cpumask_clear_cpu(cpu, cpu_sibling_mask(sibling));
+	for_each_cpu(sibling, cpu_llc_shared_mask(cpu))
+		cpumask_clear_cpu(cpu, cpu_llc_shared_mask(sibling));
+	cpumask_clear(cpu_llc_shared_mask(cpu));
 	cpumask_clear(cpu_sibling_mask(cpu));
 	cpumask_clear(cpu_core_mask(cpu));
 	c->phys_proc_id = 0;
-- 
1.9.1
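
For readers following the hunk above, the symmetric clearing can be
modelled in plain userspace C. This is only a sketch under stated
assumptions, not kernel code: NR_CPUS_MODEL, llc_shared[], online_mask,
model_cpu_up() and model_cpu_down() are hypothetical names, and 64-bit
integers stand in for struct cpumask.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS_MODEL 8	/* hypothetical small system, not the kernel's NR_CPUS */

/* llc_shared[i] models cpu_llc_shared_mask(i): bit j set means CPU j shares the LLC with CPU i. */
static uint64_t llc_shared[NR_CPUS_MODEL];
/* One bit per CPU that is currently online in the model. */
static uint64_t online_mask;

/* Model of CPU up: link the new CPU with every online CPU in its LLC group. */
static void model_cpu_up(int cpu, uint64_t llc_group)
{
	assert(cpu >= 0 && cpu < NR_CPUS_MODEL);
	assert(llc_group & (1ULL << cpu));	/* a CPU belongs to its own group */

	online_mask |= 1ULL << cpu;
	for (int i = 0; i < NR_CPUS_MODEL; i++) {
		if (!(online_mask & llc_group & (1ULL << i)))
			continue;
		llc_shared[cpu] |= 1ULL << i;
		llc_shared[i] |= 1ULL << cpu;
	}
}

/* Model of CPU down, mirroring the three lines the patch adds. */
static void model_cpu_down(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS_MODEL);

	/* Remove this CPU from the mask of every CPU it shared the LLC with. */
	for (int sibling = 0; sibling < NR_CPUS_MODEL; sibling++)
		if (llc_shared[cpu] & (1ULL << sibling))
			llc_shared[sibling] &= ~(1ULL << cpu);
	/* Then drop the CPU's own mask entirely. */
	llc_shared[cpu] = 0;
	online_mask &= ~(1ULL << cpu);
}
```

In this model, bringing up CPUs 0 and 1 in the same group and then
taking CPU 1 down leaves llc_shared[0] with only bit 0 set. Without the
loop in model_cpu_down(), bit 1 would stay behind in llc_shared[0],
which is the stale state the changelog describes.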



* Re: [PATCH v4] x86, hotplug: fix llc shared map unreleased during cpu hotplug
  2014-07-29  9:24 [PATCH v4] x86, hotplug: fix llc shared map unreleased during cpu hotplug Wanpeng Li
@ 2014-07-29 17:18 ` Toshi Kani
  2014-08-07  6:33 ` Wanpeng Li
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 16+ messages in thread
From: Toshi Kani @ 2014-07-29 17:18 UTC (permalink / raw)
  To: Wanpeng Li
  Cc: hpa, Ingo Molnar, Peter Zijlstra, x86, Borislav Petkov,
	Yasuaki Ishimatsu, David Rientjes, Prarit Bhargava,
	Steven Rostedt, Jan Kiszka, linux-kernel, Zhang Yang,
	Konrad Rzeszutek Wilk

On Tue, 2014-07-29 at 17:24 +0800, Wanpeng Li wrote:
> BUG: unable to handle kernel NULL pointer dereference at 0000000000000004
> IP: [..] find_busiest_group
> PGD 5a9d5067 PUD 13067 PMD 0
> Oops: 0000 [#3] SMP
> [...]
> Call Trace:
> load_balance
> ? _raw_spin_unlock_irqrestore
> idle_balance
> __schedule
> schedule
> schedule_timeout
> ? lock_timer_base
> schedule_timeout_uninterruptible
> msleep
> lock_device_hotplug_sysfs
> online_store
> dev_attr_store
> sysfs_write_file
> vfs_write
> SyS_write
> system_call_fastpath
> 
> The last level cache (LLC) shared map is built during CPU up, and the
> sched domain build routine uses it to set up the sched domain CPU
> topology. However, the LLC shared map is not released during CPU
> disable, which leads to an invalid sched domain CPU topology. This
> patch fixes it by releasing the LLC shared map correctly during CPU
> disable.
> 
> Reviewed-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
> Signed-off-by: Wanpeng Li <wanpeng.li@linux.intel.com>

The change looks good to me.

Reviewed-by: Toshi Kani <toshi.kani@hp.com>

Thanks,
-Toshi



* Re: [PATCH v4] x86, hotplug: fix llc shared map unreleased during cpu hotplug
  2014-07-29  9:24 [PATCH v4] x86, hotplug: fix llc shared map unreleased during cpu hotplug Wanpeng Li
  2014-07-29 17:18 ` Toshi Kani
@ 2014-08-07  6:33 ` Wanpeng Li
       [not found] ` <20140808224057.GA9288@oranje.fc.hp.com>
  2014-09-04  5:20 ` Ingo Molnar
  3 siblings, 0 replies; 16+ messages in thread
From: Wanpeng Li @ 2014-08-07  6:33 UTC (permalink / raw)
  To: Wanpeng Li, hpa
  Cc: Ingo Molnar, Peter Zijlstra, x86, Borislav Petkov,
	Yasuaki Ishimatsu, David Rientjes, Prarit Bhargava,
	Steven Rostedt, Jan Kiszka, Toshi Kani, linux-kernel, Zhang Yang,
	Konrad Rzeszutek Wilk

Ping

On 14-7-29 at 5:24 PM, Wanpeng Li wrote:
> BUG: unable to handle kernel NULL pointer dereference at 0000000000000004
> IP: [..] find_busiest_group
> PGD 5a9d5067 PUD 13067 PMD 0
> Oops: 0000 [#3] SMP
> [...]
> Call Trace:
> load_balance
> ? _raw_spin_unlock_irqrestore
> idle_balance
> __schedule
> schedule
> schedule_timeout
> ? lock_timer_base
> schedule_timeout_uninterruptible
> msleep
> lock_device_hotplug_sysfs
> online_store
> dev_attr_store
> sysfs_write_file
> vfs_write
> SyS_write
> system_call_fastpath
>
> The last level cache (LLC) shared map is built during CPU up, and the
> sched domain build routine uses it to set up the sched domain CPU
> topology. However, the LLC shared map is not released during CPU
> disable, which leads to an invalid sched domain CPU topology. This
> patch fixes it by releasing the LLC shared map correctly during CPU
> disable.
>
> Reviewed-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
> Signed-off-by: Wanpeng Li <wanpeng.li@linux.intel.com>
> ---
> v1 -> v2:
>  * fix subject line
> v2 -> v3:
>  * simplify backtrace 
>
>  arch/x86/kernel/smpboot.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
> index 5492798..0134ec7 100644
> --- a/arch/x86/kernel/smpboot.c
> +++ b/arch/x86/kernel/smpboot.c
> @@ -1292,6 +1292,9 @@ static void remove_siblinginfo(int cpu)
>  
>  	for_each_cpu(sibling, cpu_sibling_mask(cpu))
>  		cpumask_clear_cpu(cpu, cpu_sibling_mask(sibling));
> +	for_each_cpu(sibling, cpu_llc_shared_mask(cpu))
> +		cpumask_clear_cpu(cpu, cpu_llc_shared_mask(sibling));
> +	cpumask_clear(cpu_llc_shared_mask(cpu));
>  	cpumask_clear(cpu_sibling_mask(cpu));
>  	cpumask_clear(cpu_core_mask(cpu));
>  	c->phys_proc_id = 0;



* Re: [PATCH v4] x86, hotplug: fix llc shared map unreleased during cpu hotplug
       [not found] ` <20140808224057.GA9288@oranje.fc.hp.com>
@ 2014-08-15  3:00   ` Wanpeng Li
  2014-08-15  6:07     ` Borislav Petkov
  0 siblings, 1 reply; 16+ messages in thread
From: Wanpeng Li @ 2014-08-15  3:00 UTC (permalink / raw)
  To: hpa
  Cc: Ingo Molnar, Peter Zijlstra, x86, Borislav Petkov,
	Yasuaki Ishimatsu, David Rientjes, Prarit Bhargava,
	Steven Rostedt, Jan Kiszka, Toshi Kani, linux-kernel, Zhang Yang,
	Konrad Rzeszutek Wilk, Linn Crosetto

Hi Peter,
On Fri, Aug 08, 2014 at 04:40:57PM -0600, Linn Crosetto wrote:
[...]
>
>Tested with a CPU hotplug stress test, run on a large system with 240 CPUs.
>Thanks.
>
>Tested-by: Linn Crosetto <linn@hp.com>

Is it OK for you to apply this patch, or does it still need an update?

Regards,
Wanpeng Li 


* Re: [PATCH v4] x86, hotplug: fix llc shared map unreleased during cpu hotplug
  2014-08-15  3:00   ` Wanpeng Li
@ 2014-08-15  6:07     ` Borislav Petkov
  2014-08-25  5:32       ` Wanpeng Li
  0 siblings, 1 reply; 16+ messages in thread
From: Borislav Petkov @ 2014-08-15  6:07 UTC (permalink / raw)
  To: Wanpeng Li
  Cc: hpa, Ingo Molnar, Peter Zijlstra, x86, Yasuaki Ishimatsu,
	David Rientjes, Prarit Bhargava, Steven Rostedt, Jan Kiszka,
	Toshi Kani, linux-kernel, Zhang Yang, Konrad Rzeszutek Wilk,
	Linn Crosetto

On Fri, Aug 15, 2014 at 11:00:42AM +0800, Wanpeng Li wrote:
> Is it OK for you to apply this patch, or does it still need an update?

Just be patient: the merge window is still open, and the kernel summit
is coming up right after that.

-- 
Regards/Gruss,
    Boris.

Sent from a fat crate under my desk. Formatting is fine.
--


* Re: [PATCH v4] x86, hotplug: fix llc shared map unreleased during cpu hotplug
  2014-08-15  6:07     ` Borislav Petkov
@ 2014-08-25  5:32       ` Wanpeng Li
  2014-09-04  1:46         ` Wanpeng Li
  0 siblings, 1 reply; 16+ messages in thread
From: Wanpeng Li @ 2014-08-25  5:32 UTC (permalink / raw)
  To: Borislav Petkov, Wanpeng Li
  Cc: hpa, Ingo Molnar, Peter Zijlstra, x86, Yasuaki Ishimatsu,
	David Rientjes, Prarit Bhargava, Steven Rostedt, Jan Kiszka,
	Toshi Kani, linux-kernel, Zhang Yang, Konrad Rzeszutek Wilk,
	Linn Crosetto


On 14-8-15 at 2:07 PM, Borislav Petkov wrote:
> On Fri, Aug 15, 2014 at 11:00:42AM +0800, Wanpeng Li wrote:
>> Is it OK for you to apply this patch, or does it still need an update?
> Just be patient: the merge window is still open, and the kernel summit
> is coming up right after that.

Thanks for pointing that out.

Regards,
Wanpeng Li

>



* Re: [PATCH v4] x86, hotplug: fix llc shared map unreleased during cpu hotplug
  2014-08-25  5:32       ` Wanpeng Li
@ 2014-09-04  1:46         ` Wanpeng Li
  0 siblings, 0 replies; 16+ messages in thread
From: Wanpeng Li @ 2014-09-04  1:46 UTC (permalink / raw)
  To: Ingo Molnar, hpa
  Cc: Borislav Petkov, Peter Zijlstra, x86, Yasuaki Ishimatsu,
	David Rientjes, Prarit Bhargava, Steven Rostedt, Jan Kiszka,
	Toshi Kani, linux-kernel, Zhang Yang, Konrad Rzeszutek Wilk,
	Linn Crosetto, Wanpeng Li

Ping Ingo, Peter,
On Mon, Aug 25, 2014 at 01:32:47PM +0800, Wanpeng Li wrote:
>
>On 14-8-15 at 2:07 PM, Borislav Petkov wrote:
>>On Fri, Aug 15, 2014 at 11:00:42AM +0800, Wanpeng Li wrote:
>>>Is it OK for you to apply this patch, or does it still need an update?
>>Just be patient: the merge window is still open, and the kernel summit
>>is coming up right after that.
>
>Thanks for pointing that out.
>
>Regards,
>Wanpeng Li
>
>>


* Re: [PATCH v4] x86, hotplug: fix llc shared map unreleased during cpu hotplug
  2014-07-29  9:24 [PATCH v4] x86, hotplug: fix llc shared map unreleased during cpu hotplug Wanpeng Li
                   ` (2 preceding siblings ...)
       [not found] ` <20140808224057.GA9288@oranje.fc.hp.com>
@ 2014-09-04  5:20 ` Ingo Molnar
  2014-09-04  5:40   ` Yasuaki Ishimatsu
  2014-09-04  8:56   ` Wanpeng Li
  3 siblings, 2 replies; 16+ messages in thread
From: Ingo Molnar @ 2014-09-04  5:20 UTC (permalink / raw)
  To: Wanpeng Li
  Cc: hpa, Ingo Molnar, Peter Zijlstra, x86, Borislav Petkov,
	Yasuaki Ishimatsu, David Rientjes, Prarit Bhargava,
	Steven Rostedt, Jan Kiszka, Toshi Kani, linux-kernel, Zhang Yang,
	Konrad Rzeszutek Wilk


* Wanpeng Li <wanpeng.li@linux.intel.com> wrote:

> BUG: unable to handle kernel NULL pointer dereference at 0000000000000004
> IP: [..] find_busiest_group
> PGD 5a9d5067 PUD 13067 PMD 0
> Oops: 0000 [#3] SMP
> [...]
> Call Trace:
> load_balance
> ? _raw_spin_unlock_irqrestore
> idle_balance
> __schedule
> schedule
> schedule_timeout
> ? lock_timer_base
> schedule_timeout_uninterruptible
> msleep
> lock_device_hotplug_sysfs
> online_store
> dev_attr_store
> sysfs_write_file
> vfs_write
> SyS_write
> system_call_fastpath
> 
> The last level cache (LLC) shared map is built during CPU up, and the
> sched domain build routine uses it to set up the sched domain CPU
> topology. However, the LLC shared map is not released during CPU
> disable, which leads to an invalid sched domain CPU topology. This
> patch fixes it by releasing the LLC shared map correctly during CPU
> disable.

Very little is said in this changelog about how the bug was 
found, how likely it is to occur for others, what systems are 
affected, etc.

Thanks,

	Ingo


* Re: [PATCH v4] x86, hotplug: fix llc shared map unreleased during cpu hotplug
  2014-09-04  5:20 ` Ingo Molnar
@ 2014-09-04  5:40   ` Yasuaki Ishimatsu
  2014-09-04  6:24     ` Peter Zijlstra
  2014-09-04  9:02     ` Wanpeng Li
  2014-09-04  8:56   ` Wanpeng Li
  1 sibling, 2 replies; 16+ messages in thread
From: Yasuaki Ishimatsu @ 2014-09-04  5:40 UTC (permalink / raw)
  To: Ingo Molnar, Wanpeng Li
  Cc: hpa, Ingo Molnar, Peter Zijlstra, x86, Borislav Petkov,
	David Rientjes, Prarit Bhargava, Steven Rostedt, Jan Kiszka,
	Toshi Kani, linux-kernel, Zhang Yang, Konrad Rzeszutek Wilk

(2014/09/04 14:20), Ingo Molnar wrote:
>
> * Wanpeng Li <wanpeng.li@linux.intel.com> wrote:
>
>> BUG: unable to handle kernel NULL pointer dereference at 0000000000000004
>> IP: [..] find_busiest_group
>> PGD 5a9d5067 PUD 13067 PMD 0
>> Oops: 0000 [#3] SMP
>> [...]
>> Call Trace:
>> load_balance
>> ? _raw_spin_unlock_irqrestore
>> idle_balance
>> __schedule
>> schedule
>> schedule_timeout
>> ? lock_timer_base
>> schedule_timeout_uninterruptible
>> msleep
>> lock_device_hotplug_sysfs
>> online_store
>> dev_attr_store
>> sysfs_write_file
>> vfs_write
>> SyS_write
>> system_call_fastpath
>>
>> The last level cache (LLC) shared map is built during CPU up, and the
>> sched domain build routine uses it to set up the sched domain CPU
>> topology. However, the LLC shared map is not released during CPU
>> disable, which leads to an invalid sched domain CPU topology. This
>> patch fixes it by releasing the LLC shared map correctly during CPU
>> disable.
>
> Very little is said in this changelog about how the bug was
> found, how likely it is to occur for others, what systems are
> affected, etc.

Hi Wanpeng,

As I understand it, the panic occurs just by onlining a CPU as follows:
echo 1 > /sys/devices/system/cpu/cpuX/online

So, how about adding this information?

Thanks,
Yasuaki Ishimatsu


>
> Thanks,
>
> 	Ingo
>




* Re: [PATCH v4] x86, hotplug: fix llc shared map unreleased during cpu hotplug
  2014-09-04  5:40   ` Yasuaki Ishimatsu
@ 2014-09-04  6:24     ` Peter Zijlstra
  2014-09-04  9:02     ` Wanpeng Li
  1 sibling, 0 replies; 16+ messages in thread
From: Peter Zijlstra @ 2014-09-04  6:24 UTC (permalink / raw)
  To: Yasuaki Ishimatsu
  Cc: Ingo Molnar, Wanpeng Li, hpa, Ingo Molnar, x86, Borislav Petkov,
	David Rientjes, Prarit Bhargava, Steven Rostedt, Jan Kiszka,
	Toshi Kani, linux-kernel, Zhang Yang, Konrad Rzeszutek Wilk

On Thu, Sep 04, 2014 at 02:40:07PM +0900, Yasuaki Ishimatsu wrote:
> (2014/09/04 14:20), Ingo Molnar wrote:
> >
> >* Wanpeng Li <wanpeng.li@linux.intel.com> wrote:
> >
> >>BUG: unable to handle kernel NULL pointer dereference at 0000000000000004
> >>IP: [..] find_busiest_group
> >>PGD 5a9d5067 PUD 13067 PMD 0
> >>Oops: 0000 [#3] SMP
> >>[...]
> >>Call Trace:
> >>load_balance
> >>? _raw_spin_unlock_irqrestore
> >>idle_balance
> >>__schedule
> >>schedule
> >>schedule_timeout
> >>? lock_timer_base
> >>schedule_timeout_uninterruptible
> >>msleep
> >>lock_device_hotplug_sysfs
> >>online_store
> >>dev_attr_store
> >>sysfs_write_file
> >>vfs_write
> >>SyS_write
> >>system_call_fastpath
> >>
> >>The last level cache (LLC) shared map is built during CPU up, and the
> >>sched domain build routine uses it to set up the sched domain CPU
> >>topology. However, the LLC shared map is not released during CPU
> >>disable, which leads to an invalid sched domain CPU topology. This
> >>patch fixes it by releasing the LLC shared map correctly during CPU
> >>disable.
> >
> >Very little is said in this changelog about how the bug was
> >found, how likely it is to occur for others, what systems are
> >affected, etc.
> 
> Hi Wanpeng,
> 
> As I understand it, the panic occurs just by onlining a CPU as follows:
> echo 1 > /sys/devices/system/cpu/cpuX/online
>
> So, how about adding this information?

From what I remember, you need a special kind of hardware too, one that
doesn't preserve CPU numbers across hotplug. Most systems do; just not
this magic special one.

We want to fix that, but the only reason for this patch is consistency
with the rest of the code: we do indeed clear and set these bits in all
other masks, but not in this one.

But yes, the Changelog needs help.
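
The inconsistency described here, a bit left set in one mask after the
CPU it names has gone away, can be written down as an invariant. What
follows is a hedged userspace sketch, not kernel code:
llc_masks_consistent() and its parameters are hypothetical, and each
uint64_t merely models one cpu_llc_shared_mask() with one bit per CPU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical checker. masks[i] models cpu_llc_shared_mask(i) and
 * online models the set of online CPUs. Returns true when no mask
 * names an offline CPU and the sharing relation is symmetric.
 */
static bool llc_masks_consistent(const uint64_t *masks, int nr_cpus,
				 uint64_t online)
{
	assert(masks && nr_cpus > 0 && nr_cpus <= 64);

	for (int cpu = 0; cpu < nr_cpus; cpu++) {
		/* A stale bit: this mask still names a CPU that is offline. */
		if (masks[cpu] & ~online)
			return false;

		/* If cpu lists other, then other must list cpu as well. */
		for (int other = 0; other < nr_cpus; other++) {
			bool a = masks[cpu] & (1ULL << other);
			bool b = masks[other] & (1ULL << cpu);

			if (a != b)
				return false;
		}
	}
	return true;
}
```

With CPU 1 offline, a mask set of {0x3, 0x0} fails this check because
CPU 0 still lists CPU 1, while {0x1, 0x0} passes. That is the state the
patch restores on CPU removal.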


* Re: [PATCH v4] x86, hotplug: fix llc shared map unreleased during cpu hotplug
  2014-09-04  5:20 ` Ingo Molnar
  2014-09-04  5:40   ` Yasuaki Ishimatsu
@ 2014-09-04  8:56   ` Wanpeng Li
  2014-09-04  9:34     ` Wanpeng Li
  1 sibling, 1 reply; 16+ messages in thread
From: Wanpeng Li @ 2014-09-04  8:56 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: hpa, Ingo Molnar, Peter Zijlstra, x86, Borislav Petkov,
	Yasuaki Ishimatsu, David Rientjes, Prarit Bhargava,
	Steven Rostedt, Jan Kiszka, Toshi Kani, linux-kernel, Zhang Yang,
	Wanpeng Li

On Thu, Sep 04, 2014 at 07:20:34AM +0200, Ingo Molnar wrote:
>
>* Wanpeng Li <wanpeng.li@linux.intel.com> wrote:
>
>> BUG: unable to handle kernel NULL pointer dereference at 0000000000000004
>> IP: [..] find_busiest_group
>> PGD 5a9d5067 PUD 13067 PMD 0
>> Oops: 0000 [#3] SMP
>> [...]
>> Call Trace:
>> load_balance
>> ? _raw_spin_unlock_irqrestore
>> idle_balance
>> __schedule
>> schedule
>> schedule_timeout
>> ? lock_timer_base
>> schedule_timeout_uninterruptible
>> msleep
>> lock_device_hotplug_sysfs
>> online_store
>> dev_attr_store
>> sysfs_write_file
>> vfs_write
>> SyS_write
>> system_call_fastpath
>> 
>> The last level cache (LLC) shared map is built during CPU up, and the
>> sched domain build routine uses it to set up the sched domain CPU
>> topology. However, the LLC shared map is not released during CPU
>> disable, which leads to an invalid sched domain CPU topology. This
>> patch fixes it by releasing the LLC shared map correctly during CPU
>> disable.
>
>Very little is said in this changelog about how the bug was 
>found, how likely it is to occur for others, what systems are 
>affected, etc.

This bug can be triggered by repeatedly hot-adding and removing a large
number of Xen domain0 vCPUs.

Regards,
Wanpeng Li 

>
>Thanks,
>
>	Ingo


* Re: [PATCH v4] x86, hotplug: fix llc shared map unreleased during cpu hotplug
  2014-09-04  5:40   ` Yasuaki Ishimatsu
  2014-09-04  6:24     ` Peter Zijlstra
@ 2014-09-04  9:02     ` Wanpeng Li
  1 sibling, 0 replies; 16+ messages in thread
From: Wanpeng Li @ 2014-09-04  9:02 UTC (permalink / raw)
  To: Yasuaki Ishimatsu
  Cc: Ingo Molnar, Wanpeng Li, hpa, Ingo Molnar, Peter Zijlstra, x86,
	Borislav Petkov, David Rientjes, Prarit Bhargava, Steven Rostedt,
	Jan Kiszka, Toshi Kani, linux-kernel, Zhang Yang

On Thu, Sep 04, 2014 at 02:40:07PM +0900, Yasuaki Ishimatsu wrote:
>(2014/09/04 14:20), Ingo Molnar wrote:
>>
>>* Wanpeng Li <wanpeng.li@linux.intel.com> wrote:
>>
>>>BUG: unable to handle kernel NULL pointer dereference at 0000000000000004
>>>IP: [..] find_busiest_group
>>>PGD 5a9d5067 PUD 13067 PMD 0
>>>Oops: 0000 [#3] SMP
>>>[...]
>>>Call Trace:
>>>load_balance
>>>? _raw_spin_unlock_irqrestore
>>>idle_balance
>>>__schedule
>>>schedule
>>>schedule_timeout
>>>? lock_timer_base
>>>schedule_timeout_uninterruptible
>>>msleep
>>>lock_device_hotplug_sysfs
>>>online_store
>>>dev_attr_store
>>>sysfs_write_file
>>>vfs_write
>>>SyS_write
>>>system_call_fastpath
>>>
>>>The last level cache (LLC) shared map is built during CPU up, and the
>>>sched domain build routine uses it to set up the sched domain CPU
>>>topology. However, the LLC shared map is not released during CPU
>>>disable, which leads to an invalid sched domain CPU topology. This
>>>patch fixes it by releasing the LLC shared map correctly during CPU
>>>disable.
>>
>>Very little is said in this changelog about how the bug was
>>found, how likely it is to occur for others, what systems are
>>affected, etc.
>
>Hi Wanpeng,

Hi Yasuaki,

>
>As I understand it, the panic occurs just by onlining a CPU as follows:
>echo 1 > /sys/devices/system/cpu/cpuX/online
>

See my reply to Ingo.

Regards,
Wanpeng Li 

>So, how about adding this information?
>
>Thanks,
>Yasuaki Ishimatsu
>
>
>>
>>Thanks,
>>
>>	Ingo
>>
>


* Re: [PATCH v4] x86, hotplug: fix llc shared map unreleased during cpu hotplug
  2014-09-04  8:56   ` Wanpeng Li
@ 2014-09-04  9:34     ` Wanpeng Li
  2014-09-15  1:32       ` Wanpeng Li
  2014-09-16  9:01       ` Ingo Molnar
  0 siblings, 2 replies; 16+ messages in thread
From: Wanpeng Li @ 2014-09-04  9:34 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: hpa, Ingo Molnar, Peter Zijlstra, x86, Borislav Petkov,
	Yasuaki Ishimatsu, David Rientjes, Prarit Bhargava,
	Steven Rostedt, Jan Kiszka, Toshi Kani, linux-kernel, Zhang Yang,
	Wanpeng Li

Hi Ingo,
On Thu, Sep 04, 2014 at 04:56:41PM +0800, Wanpeng Li wrote:
>On Thu, Sep 04, 2014 at 07:20:34AM +0200, Ingo Molnar wrote:
>>
>>* Wanpeng Li <wanpeng.li@linux.intel.com> wrote:
>>
>>> BUG: unable to handle kernel NULL pointer dereference at 0000000000000004
>>> IP: [..] find_busiest_group
>>> PGD 5a9d5067 PUD 13067 PMD 0
>>> Oops: 0000 [#3] SMP
>>> [...]
>>> Call Trace:
>>> load_balance
>>> ? _raw_spin_unlock_irqrestore
>>> idle_balance
>>> __schedule
>>> schedule
>>> schedule_timeout
>>> ? lock_timer_base
>>> schedule_timeout_uninterruptible
>>> msleep
>>> lock_device_hotplug_sysfs
>>> online_store
>>> dev_attr_store
>>> sysfs_write_file
>>> vfs_write
>>> SyS_write
>>> system_call_fastpath
>>> 
>>> The last level cache (LLC) shared map is built during CPU up, and the
>>> sched domain build routine uses it to set up the sched domain CPU
>>> topology. However, the LLC shared map is not released during CPU
>>> disable, which leads to an invalid sched domain CPU topology. This
>>> patch fixes it by releasing the LLC shared map correctly during CPU
>>> disable.
>>
>>Very little is said in this changelog about how the bug was 
>>found, how likely it is to occur for others, what systems are 
>>affected, etc.
>
>This bug can be triggered by repeatedly hot-adding and removing a large
>number of Xen domain0 vCPUs.
>

Do I need to send a new version of the patch, or can you pick up the
patch with the updated changelog for me? ;-)

Regards,
Wanpeng Li 

>Regards,
>Wanpeng Li 
>
>>
>>Thanks,
>>
>>	Ingo


* Re: [PATCH v4] x86, hotplug: fix llc shared map unreleased during cpu hotplug
  2014-09-04  9:34     ` Wanpeng Li
@ 2014-09-15  1:32       ` Wanpeng Li
  2014-09-16  9:01       ` Ingo Molnar
  1 sibling, 0 replies; 16+ messages in thread
From: Wanpeng Li @ 2014-09-15  1:32 UTC (permalink / raw)
  To: Wanpeng Li
  Cc: Ingo Molnar, hpa, Peter Zijlstra, x86, Borislav Petkov,
	Yasuaki Ishimatsu, David Rientjes, Prarit Bhargava,
	Steven Rostedt, Jan Kiszka, Toshi Kani, linux-kernel, Zhang Yang

Ping Ingo, HPA, PeterZ
On 14-9-4 at 5:34 PM, Wanpeng Li wrote:
> Hi Ingo,
> On Thu, Sep 04, 2014 at 04:56:41PM +0800, Wanpeng Li wrote:
>> On Thu, Sep 04, 2014 at 07:20:34AM +0200, Ingo Molnar wrote:
>>> * Wanpeng Li <wanpeng.li@linux.intel.com> wrote:
>>>
>>>> BUG: unable to handle kernel NULL pointer dereference at 0000000000000004
>>>> IP: [..] find_busiest_group
>>>> PGD 5a9d5067 PUD 13067 PMD 0
>>>> Oops: 0000 [#3] SMP
>>>> [...]
>>>> Call Trace:
>>>> load_balance
>>>> ? _raw_spin_unlock_irqrestore
>>>> idle_balance
>>>> __schedule
>>>> schedule
>>>> schedule_timeout
>>>> ? lock_timer_base
>>>> schedule_timeout_uninterruptible
>>>> msleep
>>>> lock_device_hotplug_sysfs
>>>> online_store
>>>> dev_attr_store
>>>> sysfs_write_file
>>>> vfs_write
>>>> SyS_write
>>>> system_call_fastpath
>>>>
>>>> The last level cache (LLC) shared map is built during CPU up, and the
>>>> sched domain build routine uses it to set up the sched domain CPU
>>>> topology. However, the LLC shared map is not released during CPU
>>>> disable, which leads to an invalid sched domain CPU topology. This
>>>> patch fixes it by releasing the LLC shared map correctly during CPU
>>>> disable.
>>> Very little is said in this changelog about how the bug was
>>> found, how likely it is to occur for others, what systems are
>>> affected, etc.
>> This bug can be triggered by repeatedly hot-adding and removing a large
>> number of Xen domain0 vCPUs.
>>
> Do I need to send a new version of the patch, or can you pick up the
> patch with the updated changelog for me? ;-)
>
> Regards,
> Wanpeng Li
>
>> Regards,
>> Wanpeng Li
>>
>>> Thanks,
>>>
>>> 	Ingo
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/



* Re: [PATCH v4] x86, hotplug: fix llc shared map unreleased during cpu hotplug
  2014-09-04  9:34     ` Wanpeng Li
  2014-09-15  1:32       ` Wanpeng Li
@ 2014-09-16  9:01       ` Ingo Molnar
  2014-09-16 10:00         ` Wanpeng Li
  1 sibling, 1 reply; 16+ messages in thread
From: Ingo Molnar @ 2014-09-16  9:01 UTC (permalink / raw)
  To: Wanpeng Li
  Cc: Ingo Molnar, hpa, Peter Zijlstra, x86, Borislav Petkov,
	Yasuaki Ishimatsu, David Rientjes, Prarit Bhargava,
	Steven Rostedt, Jan Kiszka, Toshi Kani, linux-kernel, Zhang Yang


* Wanpeng Li <wanpeng.li@linux.intel.com> wrote:

> Hi Ingo,
> On Thu, Sep 04, 2014 at 04:56:41PM +0800, Wanpeng Li wrote:
> >On Thu, Sep 04, 2014 at 07:20:34AM +0200, Ingo Molnar wrote:
> >>
> >>* Wanpeng Li <wanpeng.li@linux.intel.com> wrote:
> >>
> >>> BUG: unable to handle kernel NULL pointer dereference at 0000000000000004
> >>> IP: [..] find_busiest_group
> >>> PGD 5a9d5067 PUD 13067 PMD 0
> >>> Oops: 0000 [#3] SMP
> >>> [...]
> >>> Call Trace:
> >>> load_balance
> >>> ? _raw_spin_unlock_irqrestore
> >>> idle_balance
> >>> __schedule
> >>> schedule
> >>> schedule_timeout
> >>> ? lock_timer_base
> >>> schedule_timeout_uninterruptible
> >>> msleep
> >>> lock_device_hotplug_sysfs
> >>> online_store
> >>> dev_attr_store
> >>> sysfs_write_file
> >>> vfs_write
> >>> SyS_write
> >>> system_call_fastpath
> >>> 
> >>> The last level cache (LLC) shared map is built during CPU up, and the
> >>> sched domain build routine uses it to set up the sched domain CPU
> >>> topology. However, the LLC shared map is not released during CPU
> >>> disable, which leads to an invalid sched domain CPU topology. This
> >>> patch fixes it by releasing the LLC shared map correctly during CPU
> >>> disable.
> >>
> >>Very little is said in this changelog about how the bug was 
> >>found, how likely it is to occur for others, what systems are 
> >>affected, etc.
> >
> >This bug can be triggered by repeatedly hot-adding and removing a large
> >number of Xen domain0 vCPUs.
> >
>
> Do I need to send a new version of the patch, or can you pick up the
> patch with the updated changelog for me? ;-)

Please send a fresh new version: having maintainers splice & dice
patches and changelogs is a fragile approach that is prone to
mistakes.

Thanks,

	Ingo


* Re: [PATCH v4] x86, hotplug: fix llc shared map unreleased during cpu hotplug
  2014-09-16  9:01       ` Ingo Molnar
@ 2014-09-16 10:00         ` Wanpeng Li
  0 siblings, 0 replies; 16+ messages in thread
From: Wanpeng Li @ 2014-09-16 10:00 UTC (permalink / raw)
  To: Ingo Molnar, Wanpeng Li
  Cc: Ingo Molnar, hpa, Peter Zijlstra, x86, Borislav Petkov,
	Yasuaki Ishimatsu, David Rientjes, Prarit Bhargava,
	Steven Rostedt, Jan Kiszka, Toshi Kani, linux-kernel, Zhang Yang

On 14-9-16 at 5:01 PM, Ingo Molnar wrote:
> * Wanpeng Li <wanpeng.li@linux.intel.com> wrote:
>
>> Hi Ingo,
>> On Thu, Sep 04, 2014 at 04:56:41PM +0800, Wanpeng Li wrote:
>>> On Thu, Sep 04, 2014 at 07:20:34AM +0200, Ingo Molnar wrote:
>>>> * Wanpeng Li <wanpeng.li@linux.intel.com> wrote:
>>>>
>>>>> BUG: unable to handle kernel NULL pointer dereference at 0000000000000004
>>>>> IP: [..] find_busiest_group
>>>>> PGD 5a9d5067 PUD 13067 PMD 0
>>>>> Oops: 0000 [#3] SMP
>>>>> [...]
>>>>> Call Trace:
>>>>> load_balance
>>>>> ? _raw_spin_unlock_irqrestore
>>>>> idle_balance
>>>>> __schedule
>>>>> schedule
>>>>> schedule_timeout
>>>>> ? lock_timer_base
>>>>> schedule_timeout_uninterruptible
>>>>> msleep
>>>>> lock_device_hotplug_sysfs
>>>>> online_store
>>>>> dev_attr_store
>>>>> sysfs_write_file
>>>>> vfs_write
>>>>> SyS_write
>>>>> system_call_fastpath
>>>>>
>>>>> The last level cache (LLC) shared map is built during CPU up, and the
>>>>> sched domain build routine uses it to set up the sched domain CPU
>>>>> topology. However, the LLC shared map is not released during CPU
>>>>> disable, which leads to an invalid sched domain CPU topology. This
>>>>> patch fixes it by releasing the LLC shared map correctly during CPU
>>>>> disable.
>>>> Very little is said in this changelog about how the bug was
>>>> found, how likely it is to occur for others, what systems are
>>>> affected, etc.
>>> This bug can be triggered by repeatedly hot-adding and removing a large
>>> number of Xen domain0 vCPUs.
>>>
>> Do I need to send a new version of the patch, or can you pick up the
>> patch with the updated changelog for me? ;-)
> Please send a fresh new version: having maintainers splice & dice
> patches and changelogs is a fragile approach that is prone to
> mistakes.

Ok, I will send out a new version tomorrow. ;-)

Regards,
Wanpeng Li

>
> Thanks,
>
> 	Ingo



end of thread, other threads:[~2014-09-16 10:00 UTC | newest]

Thread overview: 16+ messages
2014-07-29  9:24 [PATCH v4] x86, hotplug: fix llc shared map unreleased during cpu hotplug Wanpeng Li
2014-07-29 17:18 ` Toshi Kani
2014-08-07  6:33 ` Wanpeng Li
     [not found] ` <20140808224057.GA9288@oranje.fc.hp.com>
2014-08-15  3:00   ` Wanpeng Li
2014-08-15  6:07     ` Borislav Petkov
2014-08-25  5:32       ` Wanpeng Li
2014-09-04  1:46         ` Wanpeng Li
2014-09-04  5:20 ` Ingo Molnar
2014-09-04  5:40   ` Yasuaki Ishimatsu
2014-09-04  6:24     ` Peter Zijlstra
2014-09-04  9:02     ` Wanpeng Li
2014-09-04  8:56   ` Wanpeng Li
2014-09-04  9:34     ` Wanpeng Li
2014-09-15  1:32       ` Wanpeng Li
2014-09-16  9:01       ` Ingo Molnar
2014-09-16 10:00         ` Wanpeng Li
