* [PATCH] Fix NULL pointer for Xen guests
@ 2010-04-27 15:24 Prarit Bhargava
  2010-04-27 16:58 ` [LKML] " Konrad Rzeszutek Wilk
                   ` (3 more replies)
  0 siblings, 4 replies; 20+ messages in thread
From: Prarit Bhargava @ 2010-04-27 15:24 UTC (permalink / raw)
  To: linux-kernel, suresh.b.siddha, x86, clalance, drjones; +Cc: Prarit Bhargava

Upstream PV guests fail to boot because of a NULL pointer.  It is possible that
xen guests have irq_desc->chip_data = NULL.

Test for NULL chip_data pointer before attempting to complete an irq move.

Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>

diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
index 127b871..eb2789c 100644
--- a/arch/x86/kernel/apic/io_apic.c
+++ b/arch/x86/kernel/apic/io_apic.c
@@ -2545,6 +2545,9 @@ void irq_force_complete_move(int irq)
 	struct irq_desc *desc = irq_to_desc(irq);
 	struct irq_cfg *cfg = desc->chip_data;
 
+	if (!cfg)
+		return;
+
 	__irq_complete_move(&desc, cfg->vector);
 }
 #else

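For readers following along without a kernel tree, the guard added by the
patch can be sketched in userspace C. This is a minimal illustration under
stated assumptions, not the kernel code: the two structs are cut-down
stand-ins for the real struct irq_desc and struct irq_cfg, and
complete_move_stub(), force_complete_move() and completed_moves are
hypothetical names standing in for __irq_complete_move() and
irq_force_complete_move().

```c
#include <assert.h>
#include <stddef.h>

/* Cut-down stand-ins for the kernel types (assumption: the real
 * structures carry many more fields than shown here). */
struct irq_cfg {
	unsigned char vector;
};

struct irq_desc {
	void *chip_data;	/* may be NULL, as reported for Xen guests */
};

/* Hypothetical counter: how many times the stub below has run. */
static int completed_moves;

/* Hypothetical stand-in for __irq_complete_move(). */
static void complete_move_stub(struct irq_desc *desc, unsigned char vector)
{
	(void)desc;
	(void)vector;
	completed_moves++;
}

/* Returns 1 if a move was completed, 0 if the descriptor was skipped. */
static int force_complete_move(struct irq_desc *desc)
{
	struct irq_cfg *cfg;

	assert(desc != NULL);
	cfg = desc->chip_data;

	/* The guard from the patch: without it, cfg->vector below
	 * dereferences a NULL pointer. */
	if (!cfg)
		return 0;

	complete_move_stub(desc, cfg->vector);
	return 1;
}
```

Calling force_complete_move() on a descriptor whose chip_data is NULL
returns 0 without touching cfg->vector; a descriptor with a valid cfg
reaches the stub as before.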
* Re: [LKML] [PATCH] Fix NULL pointer for Xen guests
  2010-04-27 15:24 [PATCH] Fix NULL pointer for Xen guests Prarit Bhargava
@ 2010-04-27 16:58 ` Konrad Rzeszutek Wilk
  2010-04-27 17:09   ` Prarit Bhargava
  2010-04-28 18:26 ` Andrew Morton
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 20+ messages in thread
From: Konrad Rzeszutek Wilk @ 2010-04-27 16:58 UTC (permalink / raw)
  To: Prarit Bhargava; +Cc: linux-kernel, suresh.b.siddha, x86, clalance, drjones

On Tue, Apr 27, 2010 at 11:24:42AM -0400, Prarit Bhargava wrote:
> Upstream PV guests fail to boot because of a NULL pointer.  It is possible that
> xen guests have irq_desc->chip_data = NULL.

Can you provide a short example of test scenario? As in what I should do
to reproduce this problem?
> 
> Test for NULL chip_data pointer before attempting to complete an irq move.
> 
> Signed-off-by: Prarit Bhargava <prarit@redhat.com>
> Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
> 
> diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
> index 127b871..eb2789c 100644
> --- a/arch/x86/kernel/apic/io_apic.c
> +++ b/arch/x86/kernel/apic/io_apic.c
> @@ -2545,6 +2545,9 @@ void irq_force_complete_move(int irq)
>  	struct irq_desc *desc = irq_to_desc(irq);
>  	struct irq_cfg *cfg = desc->chip_data;
>  
> +	if (!cfg)
> +		return;
> +
>  	__irq_complete_move(&desc, cfg->vector);
>  }
>  #else
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/

* Re: [LKML] [PATCH] Fix NULL pointer for Xen guests
  2010-04-27 16:58 ` [LKML] " Konrad Rzeszutek Wilk
@ 2010-04-27 17:09   ` Prarit Bhargava
  2010-04-27 17:59     ` Andrew Jones
  2010-04-27 18:34     ` Konrad Rzeszutek Wilk
  0 siblings, 2 replies; 20+ messages in thread
From: Prarit Bhargava @ 2010-04-27 17:09 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: linux-kernel, suresh.b.siddha, x86, clalance, drjones



On 04/27/2010 12:58 PM, Konrad Rzeszutek Wilk wrote:
> On Tue, Apr 27, 2010 at 11:24:42AM -0400, Prarit Bhargava wrote:
>    
>> Upstream PV guests fail to boot because of a NULL pointer.  It is possible that
>> xen guests have irq_desc->chip_data = NULL.
>>      
> Can you provide a short example of test scenario? As in what I should do
> to reproduce this problem?
>    

Take the latest upstream (well ... to be honest, a bit older than that 
because of some other bugs) -- take 2.6.33 and try to boot it as a PV 
guest.  I'm using a RHEL5 Xen HV fwiw ...

P.

>> Test for NULL chip_data pointer before attempting to complete an irq move.
>>
>> Signed-off-by: Prarit Bhargava<prarit@redhat.com>
>> Acked-by: Suresh Siddha<suresh.b.siddha@intel.com>
>>
>> diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
>> index 127b871..eb2789c 100644
>> --- a/arch/x86/kernel/apic/io_apic.c
>> +++ b/arch/x86/kernel/apic/io_apic.c
>> @@ -2545,6 +2545,9 @@ void irq_force_complete_move(int irq)
>>   	struct irq_desc *desc = irq_to_desc(irq);
>>   	struct irq_cfg *cfg = desc->chip_data;
>>
>> +	if (!cfg)
>> +		return;
>> +
>>   	__irq_complete_move(&desc, cfg->vector);
>>   }
>>   #else

* Re: [LKML] [PATCH] Fix NULL pointer for Xen guests
  2010-04-27 17:09   ` Prarit Bhargava
@ 2010-04-27 17:59     ` Andrew Jones
  2010-04-27 18:34     ` Konrad Rzeszutek Wilk
  1 sibling, 0 replies; 20+ messages in thread
From: Andrew Jones @ 2010-04-27 17:59 UTC (permalink / raw)
  To: Prarit Bhargava
  Cc: Konrad Rzeszutek Wilk, linux-kernel, suresh.b.siddha, x86, clalance

On 04/27/2010 07:09 PM, Prarit Bhargava wrote:
> 
> 
> On 04/27/2010 12:58 PM, Konrad Rzeszutek Wilk wrote:
>> On Tue, Apr 27, 2010 at 11:24:42AM -0400, Prarit Bhargava wrote:
>>   
>>> Upstream PV guests fail to boot because of a NULL pointer.  It is
>>> possible that
>>> xen guests have irq_desc->chip_data = NULL.
>>>      
>> Can you provide a short example of test scenario? As in what I should do
>> to reproduce this problem?
>>    
> 
> Take the latest upstream (well ... to be honest, a bit older than that
> because of some other bugs) -- take 2.6.33 and try to boot it as a PV
> guest.  I'm using a RHEL5 Xen HV fwiw ...
> 
> P.

Another ingredient is to boot the guest with a configuration where its
maxvcpus is greater than its vcpus. If you have RHEL 5.5 userspace then
you can create a config with lines like this

maxvcpus = 4
vcpus = 2

with that you'll crash on boot. Then you can check that
irq_force_complete_move is on the stack if you have "preserve" for
on_crash and use xenctx to look at the state of the vcpus.

If the Xen you're using doesn't support the maxvcpus var, then I believe
you can apply the same principle in a different way, using the
vcpus_avail var. Or, you can boot with > 1 vcpus and then attempt to
remove one with 'xm vcpu-set'.

Andrew

> 
>>> Test for NULL chip_data pointer before attempting to complete an irq
>>> move.
>>>
>>> Signed-off-by: Prarit Bhargava<prarit@redhat.com>
>>> Acked-by: Suresh Siddha<suresh.b.siddha@intel.com>
>>>
>>> diff --git a/arch/x86/kernel/apic/io_apic.c
>>> b/arch/x86/kernel/apic/io_apic.c
>>> index 127b871..eb2789c 100644
>>> --- a/arch/x86/kernel/apic/io_apic.c
>>> +++ b/arch/x86/kernel/apic/io_apic.c
>>> @@ -2545,6 +2545,9 @@ void irq_force_complete_move(int irq)
>>>       struct irq_desc *desc = irq_to_desc(irq);
>>>       struct irq_cfg *cfg = desc->chip_data;
>>>
>>> +    if (!cfg)
>>> +        return;
>>> +
>>>       __irq_complete_move(&desc, cfg->vector);
>>>   }
>>>   #else

* Re: [LKML] [PATCH] Fix NULL pointer for Xen guests
  2010-04-27 17:09   ` Prarit Bhargava
  2010-04-27 17:59     ` Andrew Jones
@ 2010-04-27 18:34     ` Konrad Rzeszutek Wilk
  2010-04-27 18:47       ` Prarit Bhargava
  1 sibling, 1 reply; 20+ messages in thread
From: Konrad Rzeszutek Wilk @ 2010-04-27 18:34 UTC (permalink / raw)
  To: Prarit Bhargava; +Cc: linux-kernel, suresh.b.siddha, x86, clalance, drjones

>> Can you provide a short example of test scenario? As in what I should do
>> to reproduce this problem?
>>    
>
> Take the latest upstream (well ... to be honest, a bit older than that  
> because of some other bugs) -- take 2.6.33 and try to boot it as a PV  

2.6.34-rc5 PV boots under Xen for me (and pretty much since 2.6.33 +
Suresh fix for the CONFIG_RODATA_MARK).

Perhaps I am missing some of the .config options you have set that make it not work?

The irqbalance daemon looks to be running - but I think you are hitting
this during bootup?  How long do you have to wait for this to trigger?

How many CPUs did you assign to your guest?

What are the "other bugs" you speak of?

> guest.  I'm using a RHEL5 Xen HV fwiw ...

OK, so your control domain is RHEL5. Mine is the Jeremy's xen/next one
(2.6.32). Let me try to compile RHEL5 under FC11 - any tricks necessary
to do that?

* Re: [LKML] [PATCH] Fix NULL pointer for Xen guests
  2010-04-27 18:34     ` Konrad Rzeszutek Wilk
@ 2010-04-27 18:47       ` Prarit Bhargava
  2010-05-03 19:16         ` Konrad Rzeszutek Wilk
  0 siblings, 1 reply; 20+ messages in thread
From: Prarit Bhargava @ 2010-04-27 18:47 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: linux-kernel, suresh.b.siddha, x86, clalance, drjones



On 04/27/2010 02:34 PM, Konrad Rzeszutek Wilk wrote:
>>> Can you provide a short example of test scenario? As in what I should do
>>> to reproduce this problem?
>>>
>>>        
>> Take the latest upstream (well ... to be honest, a bit older than that
>> because of some other bugs) -- take 2.6.33 and try to boot it as a PV
>>      
> 2.6.34-rc5 PV boots under Xen for me (and pretty much since 2.6.33 +
> Suresh fix for the CONFIG_RODATA_MARK).
>
> Perhaps I am missing some of the .config options you have set that make it not work?
>
> The irqbalance daemon looks to be running - but I think you are hitting
> this during bootup?  How long do you have to wait for this to trigger?
>
>    

It happens during bootup.   I don't have a 2.6.33 vanilla panic handy 
but I do have one from an earlier 2.6.32...

rip: ffffffff81256f45 delay_tsc+0x45
rsp: ffff8800fac95a98
rax: fffffffff6ef46d0   rbx: 00000002   rcx: f6ef46d0   rdx: 0010850c
rsi: 002b3bb6   rdi: 002b3bcc   rbp: ffff8800fac95ab8
  r8: ffffffff    r9: 00000002   r10: 00000002   r11: 00000000
r12: fffffffff6dec1c4   r13: 00000002   r14: 002b3bcc   r15: 00000001
  cs: 0000e033    ds: 00000000    fs: 00000000    gs: 00000000

Stack:
  000000000002ef45 ffff8800fac95c88 0000000000000009 ffff8800fac93540
  ffff8800fac95ac8 ffffffff81256ef6 ffff8800fac95b48 ffffffff814c6341
  0000000000000010 ffff8800fac95b38 ffff880000000008 ffff8800fac95b58
  ffff8800fac95b08 a22d306b065d4a66 0000000000000000 0000000000000000

Code:
f3 90 65 8b 1c 25 d8 e3 00 00 44 39 eb 75 23 66 66 90 0f ae e8 <e8> 46 3d dc ff
66 90 48 98 48 89

Call Trace:
   [<ffffffff81256f45>] delay_tsc+0x45  <--
   [<ffffffff81256ef6>] __const_udelay+0x46
   [<ffffffff814c6341>] panic+0x135
   [<ffffffff814ca23c>] oops_end+0xdc
   [<ffffffff81042272>] no_context+0xf2
   [<ffffffff8125946c>] __bitmap_weight+0x8c
   [<ffffffff81042505>] __bad_area_nosemaphore+0x125
   [<ffffffff8105fad4>] find_busiest_group+0x254
   [<ffffffff810425d3>] bad_area_nosemaphore+0x13
   [<ffffffff814cbccf>] do_page_fault+0x2ef
   [<ffffffff814c9595>] page_fault+0x25
   [<ffffffff810302f2>] irq_force_complete_move+0x12
   [<ffffffff81015214>] fixup_irqs+0xa4
   [<ffffffff8102ce59>] cpu_disable_common+0x1a9
   [<ffffffff8100f9c2>] check_events+0x12
   [<ffffffff810c2550>] __stop_machine+0x120
   [<ffffffff8100ff75>] xen_cpu_disable+0x25
   [<ffffffff814b0427>] take_cpu_down+0x17
   [<ffffffff810c25f9>] stop_cpu+0xa9
   [<ffffffff8108869d>] worker_thread+0x16d
   [<ffffffff8100f19d>] xen_force_evtchn_callback+0xd
   [<ffffffff8108dd00>] wake_up_bit+0x40
   [<ffffffff814c90f6>] _spin_unlock_irqrestore+0x16
   [<ffffffff81088530>] create_workqueue_thread+0xd0
   [<ffffffff8108d9a6>] kthread+0x96
   [<ffffffff8101418a>] child_rip+0xa
   [<ffffffff81013351>] int_ret_from_sys_call+0x7
   [<ffffffff81013add>] retint_restore_args+0x5
   [<ffffffff81014180>] kernel_thread+0xe0

> How many CPUs did you assign to your guest?
>
>    

It didn't matter as long as vcpus > 1 and maxvcpus > vcpus.

> What are the "other bugs" you speak of?
>    

I got a different panic (which I've yet to resolve).

>    
>> guest.  I'm using a RHEL5 Xen HV fwiw ...
>>      
> OK, so your control domain is RHEL5. Mine is the Jeremy's xen/next one
> (2.6.32). Let me try to compile RHEL5 under FC11 - any tricks necessary
> to do that?
>    

I haven't tried it -- it might work :)

Also, did you try booting with maxvcpus > vcpus as drjones suggested ?

P.

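The call trace in this message shows the fault being reached through
xen_cpu_disable -> cpu_disable_common -> fixup_irqs ->
irq_force_complete_move while a CPU is taken down. A rough userspace sketch
of that shape follows, assuming a flat table of descriptors in which some
entries never had chip_data set. All names here (demo_cfg, demo_desc,
demo_fixup_irqs) are hypothetical, and the loop is a simplification for
illustration, not the real fixup_irqs().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cut-down types; the kernel structures are much larger. */
struct demo_cfg {
	unsigned char vector;
};

struct demo_desc {
	struct demo_cfg *chip_data;	/* NULL when never initialised */
};

/*
 * Walk every descriptor the way a CPU-offline path would, and count
 * how many descriptors had a usable cfg. Entries with NULL chip_data
 * are skipped instead of being dereferenced.
 */
static size_t demo_fixup_irqs(const struct demo_desc *table, size_t nr_irqs)
{
	size_t handled = 0;

	assert(table != NULL);

	for (size_t irq = 0; irq < nr_irqs; irq++) {
		const struct demo_cfg *cfg = table[irq].chip_data;

		if (!cfg)
			continue;	/* same test as the patch's guard */

		(void)cfg->vector;	/* safe: cfg is known to be non-NULL */
		handled++;
	}

	return handled;
}
```

With a table in which only some slots carry a cfg, the walk completes and
reports just those slots; without the NULL test, the first empty slot would
fault, which matches the page_fault entry directly above
irq_force_complete_move in the trace.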
* Re: [PATCH] Fix NULL pointer for Xen guests
  2010-04-27 15:24 [PATCH] Fix NULL pointer for Xen guests Prarit Bhargava
  2010-04-27 16:58 ` [LKML] " Konrad Rzeszutek Wilk
@ 2010-04-28 18:26 ` Andrew Morton
  2010-04-28 18:29   ` Prarit Bhargava
  2010-04-30 20:55 ` H. Peter Anvin
  2010-04-30 21:36 ` [tip:x86/urgent] x86: Fix NULL pointer access in irq_force_complete_move() " tip-bot for Prarit Bhargava
  3 siblings, 1 reply; 20+ messages in thread
From: Andrew Morton @ 2010-04-28 18:26 UTC (permalink / raw)
  To: Prarit Bhargava; +Cc: linux-kernel, suresh.b.siddha, x86, clalance, drjones

On Tue, 27 Apr 2010 11:24:42 -0400
Prarit Bhargava <prarit@redhat.com> wrote:

> Upstream PV guests fail to boot because of a NULL pointer.  It is possible that
> xen guests have irq_desc->chip_data = NULL.
> 
> Test for NULL chip_data pointer before attempting to complete an irq move.
> 
> Signed-off-by: Prarit Bhargava <prarit@redhat.com>
> Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
> 
> diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
> index 127b871..eb2789c 100644
> --- a/arch/x86/kernel/apic/io_apic.c
> +++ b/arch/x86/kernel/apic/io_apic.c
> @@ -2545,6 +2545,9 @@ void irq_force_complete_move(int irq)
>  	struct irq_desc *desc = irq_to_desc(irq);
>  	struct irq_cfg *cfg = desc->chip_data;
>  
> +	if (!cfg)
> +		return;
> +
>  	__irq_complete_move(&desc, cfg->vector);
>  }
>  #else

I assume this is needed for 2.6.34?

What about 2.6.33.x and earlier?

* Re: [PATCH] Fix NULL pointer for Xen guests
  2010-04-28 18:26 ` Andrew Morton
@ 2010-04-28 18:29   ` Prarit Bhargava
  2010-04-28 18:42     ` Suresh Siddha
  2010-04-28 18:50     ` Andrew Morton
  0 siblings, 2 replies; 20+ messages in thread
From: Prarit Bhargava @ 2010-04-28 18:29 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel, suresh.b.siddha, x86, clalance, drjones



On 04/28/2010 02:26 PM, Andrew Morton wrote:
> On Tue, 27 Apr 2010 11:24:42 -0400
> Prarit Bhargava<prarit@redhat.com>  wrote:
>
>    
>> Upstream PV guests fail to boot because of a NULL pointer.  It is possible that
>> xen guests have irq_desc->chip_data = NULL.
>>
>> Test for NULL chip_data pointer before attempting to complete an irq move.
>>
>> Signed-off-by: Prarit Bhargava<prarit@redhat.com>
>> Acked-by: Suresh Siddha<suresh.b.siddha@intel.com>
>>
>> diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
>> index 127b871..eb2789c 100644
>> --- a/arch/x86/kernel/apic/io_apic.c
>> +++ b/arch/x86/kernel/apic/io_apic.c
>> @@ -2545,6 +2545,9 @@ void irq_force_complete_move(int irq)
>>   	struct irq_desc *desc = irq_to_desc(irq);
>>   	struct irq_cfg *cfg = desc->chip_data;
>>
>> +	if (!cfg)
>> +		return;
>> +
>>   	__irq_complete_move(&desc, cfg->vector);
>>   }
>>   #else
>>      
> I assume this is needed for 2.6.34?
>
> What about 2.6.33.x and earlier?
>    

Hey Andrew,

I actually pinged Chris Wright to see about including this in the 
-stable branches.  I haven't heard anything back so I'll reping him.

P.

* Re: [PATCH] Fix NULL pointer for Xen guests
  2010-04-28 18:29   ` Prarit Bhargava
@ 2010-04-28 18:42     ` Suresh Siddha
  2010-04-28 18:50     ` Andrew Morton
  1 sibling, 0 replies; 20+ messages in thread
From: Suresh Siddha @ 2010-04-28 18:42 UTC (permalink / raw)
  To: Prarit Bhargava; +Cc: Andrew Morton, linux-kernel, x86, clalance, drjones

On Wed, 2010-04-28 at 11:29 -0700, Prarit Bhargava wrote:
> 
> On 04/28/2010 02:26 PM, Andrew Morton wrote:
> > On Tue, 27 Apr 2010 11:24:42 -0400
> > Prarit Bhargava<prarit@redhat.com>  wrote:
> >
> >    
> >> Upstream PV guests fail to boot because of a NULL pointer.  It is possible that
> >> xen guests have irq_desc->chip_data = NULL.
> >>
> >> Test for NULL chip_data pointer before attempting to complete an irq move.
> >>
> >> Signed-off-by: Prarit Bhargava<prarit@redhat.com>
> >> Acked-by: Suresh Siddha<suresh.b.siddha@intel.com>
> >>
> >> diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
> >> index 127b871..eb2789c 100644
> >> --- a/arch/x86/kernel/apic/io_apic.c
> >> +++ b/arch/x86/kernel/apic/io_apic.c
> >> @@ -2545,6 +2545,9 @@ void irq_force_complete_move(int irq)
> >>   	struct irq_desc *desc = irq_to_desc(irq);
> >>   	struct irq_cfg *cfg = desc->chip_data;
> >>
> >> +	if (!cfg)
> >> +		return;
> >> +
> >>   	__irq_complete_move(&desc, cfg->vector);
> >>   }
> >>   #else
> >>      
> > I assume this is needed for 2.6.34?
> >
> > What about 2.6.33.x and earlier?
> >    
> 
> Hey Andrew,
> 
> I actually pinged Chris Wright to see about including this in the 
> -stable branches.  I haven't heard anything back so I'll reping him.

It will be applicable for 2.6.33 and beyond.

thanks,
suresh


* Re: [PATCH] Fix NULL pointer for Xen guests
  2010-04-28 18:29   ` Prarit Bhargava
  2010-04-28 18:42     ` Suresh Siddha
@ 2010-04-28 18:50     ` Andrew Morton
  2010-04-28 19:15       ` [stable] " Greg KH
  1 sibling, 1 reply; 20+ messages in thread
From: Andrew Morton @ 2010-04-28 18:50 UTC (permalink / raw)
  To: Prarit Bhargava
  Cc: linux-kernel, suresh.b.siddha, x86, clalance, drjones, stable

On Wed, 28 Apr 2010 14:29:06 -0400
Prarit Bhargava <prarit@redhat.com> wrote:

> 
> 
> On 04/28/2010 02:26 PM, Andrew Morton wrote:
> > On Tue, 27 Apr 2010 11:24:42 -0400
> > Prarit Bhargava<prarit@redhat.com>  wrote:
> >
> >    
> >> Upstream PV guests fail to boot because of a NULL pointer.  It is possible that
> >> xen guests have irq_desc->chip_data = NULL.
> >>
> >> Test for NULL chip_data pointer before attempting to complete an irq move.
> >>
> >> Signed-off-by: Prarit Bhargava<prarit@redhat.com>
> >> Acked-by: Suresh Siddha<suresh.b.siddha@intel.com>
> >>
> >> diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
> >> index 127b871..eb2789c 100644
> >> --- a/arch/x86/kernel/apic/io_apic.c
> >> +++ b/arch/x86/kernel/apic/io_apic.c
> >> @@ -2545,6 +2545,9 @@ void irq_force_complete_move(int irq)
> >>   	struct irq_desc *desc = irq_to_desc(irq);
> >>   	struct irq_cfg *cfg = desc->chip_data;
> >>
> >> +	if (!cfg)
> >> +		return;
> >> +
> >>   	__irq_complete_move(&desc, cfg->vector);
> >>   }
> >>   #else
> >>      
> > I assume this is needed for 2.6.34?
> >
> > What about 2.6.33.x and earlier?
> >    
> 
> Hey Andrew,
> 
> I actually pinged Chris Wright to see about including this in the 
> -stable branches.  I haven't heard anything back so I'll reping him.
> 

Well.  Pinging people offlist isn't very reliable.  Put

Cc: <stable@kernel.org>

at the end of the changelog and cc stable@kernel.org on the original
patch and then the patch will reliably receive consideration for
backporting.

I have added Cc:<stable@kernel.org> to my copy of the patch, so the
-stable guys will at least see it when I drop it after it is merged. 
But if the x86 maintainers were to merge your patch as you sent it, it
would have no Cc: <stable@kernel.org> when it goes into Linus's tree.

I worry that if the -stable maintainers see me drop a patch, but the
patch in Linus's tree doesn't have the stable tag, they might not merge
the fix into -stable.  I bugged them about this scenario recently and
the reply was a bit waffly ;)

By far the safest thing to do is to include the stable tag in your
changelog right at the outset.

* Re: [stable] [PATCH] Fix NULL pointer for Xen guests
  2010-04-28 18:50     ` Andrew Morton
@ 2010-04-28 19:15       ` Greg KH
  0 siblings, 0 replies; 20+ messages in thread
From: Greg KH @ 2010-04-28 19:15 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Prarit Bhargava, drjones, suresh.b.siddha, x86, linux-kernel,
	clalance, stable

On Wed, Apr 28, 2010 at 11:50:39AM -0700, Andrew Morton wrote:
> I worry that if the -stable maintainers see me drop a patch, but the
> patch in Linus's tree doesn't have the stable tag, they might not merge
> the fix into -stable.  I bugged them about this scenario recently and
> the reply was a bit waffly ;)

It was?

If I see you drop a patch, I try my best to go dig through Linus's
tree to find out whether it landed there.  If not, I leave it in my
queue, and do that for a few releases.  After a long time (like 6 months)
I either ping someone or just drop it from my queue, on the guess that
someone dropped it for some reason.

If I miss one of these, please let me know.

> By far the safest thing to do is to include the stable tag in your
> changelog right at the outset.

Yes, that's the _easiest_ way, and the patch will not get lost.

thanks,

greg k-h

* Re: [PATCH] Fix NULL pointer for Xen guests
  2010-04-27 15:24 [PATCH] Fix NULL pointer for Xen guests Prarit Bhargava
  2010-04-27 16:58 ` [LKML] " Konrad Rzeszutek Wilk
  2010-04-28 18:26 ` Andrew Morton
@ 2010-04-30 20:55 ` H. Peter Anvin
  2010-04-30 21:33   ` H. Peter Anvin
  2010-04-30 22:01   ` Prarit Bhargava
  2010-04-30 21:36 ` [tip:x86/urgent] x86: Fix NULL pointer access in irq_force_complete_move() " tip-bot for Prarit Bhargava
  3 siblings, 2 replies; 20+ messages in thread
From: H. Peter Anvin @ 2010-04-30 20:55 UTC (permalink / raw)
  To: Prarit Bhargava; +Cc: linux-kernel, suresh.b.siddha, x86, clalance, drjones

This looks like it should be tagged stable for 2.6.33.  Is that correct?

	-hpa

On 04/27/2010 08:24 AM, Prarit Bhargava wrote:
> Upstream PV guests fail to boot because of a NULL pointer.  It is possible that
> xen guests have irq_desc->chip_data = NULL.
> 
> Test for NULL chip_data pointer before attempting to complete an irq move.
> 
> Signed-off-by: Prarit Bhargava <prarit@redhat.com>
> Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
> 
> diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
> index 127b871..eb2789c 100644
> --- a/arch/x86/kernel/apic/io_apic.c
> +++ b/arch/x86/kernel/apic/io_apic.c
> @@ -2545,6 +2545,9 @@ void irq_force_complete_move(int irq)
>  	struct irq_desc *desc = irq_to_desc(irq);
>  	struct irq_cfg *cfg = desc->chip_data;
>  
> +	if (!cfg)
> +		return;
> +
>  	__irq_complete_move(&desc, cfg->vector);
>  }
>  #else

* Re: [PATCH] Fix NULL pointer for Xen guests
  2010-04-30 20:55 ` H. Peter Anvin
@ 2010-04-30 21:33   ` H. Peter Anvin
  2010-04-30 22:01   ` Prarit Bhargava
  1 sibling, 0 replies; 20+ messages in thread
From: H. Peter Anvin @ 2010-04-30 21:33 UTC (permalink / raw)
  To: Prarit Bhargava; +Cc: linux-kernel, suresh.b.siddha, x86, clalance, drjones

On 04/30/2010 01:55 PM, H. Peter Anvin wrote:
> This looks like it should be tagged stable for 2.6.33.  Is that correct?
> 
> 	-hpa

Never mind... I see it has already been discussed.

	-hpa

* [tip:x86/urgent] x86: Fix NULL pointer access in irq_force_complete_move() for Xen guests
  2010-04-27 15:24 [PATCH] Fix NULL pointer for Xen guests Prarit Bhargava
                   ` (2 preceding siblings ...)
  2010-04-30 20:55 ` H. Peter Anvin
@ 2010-04-30 21:36 ` tip-bot for Prarit Bhargava
  2010-05-04 15:02   ` [LKML] " Konrad Rzeszutek Wilk
  3 siblings, 1 reply; 20+ messages in thread
From: tip-bot for Prarit Bhargava @ 2010-04-30 21:36 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: linux-kernel, hpa, mingo, suresh.b.siddha, tglx, prarit

Commit-ID:  bbd391a15d82e14efe9d69ba64cadb855b061dba
Gitweb:     http://git.kernel.org/tip/bbd391a15d82e14efe9d69ba64cadb855b061dba
Author:     Prarit Bhargava <prarit@redhat.com>
AuthorDate: Tue, 27 Apr 2010 11:24:42 -0400
Committer:  H. Peter Anvin <hpa@zytor.com>
CommitDate: Fri, 30 Apr 2010 14:31:38 -0700

x86: Fix NULL pointer access in irq_force_complete_move() for Xen guests

Upstream PV guests fail to boot because of a NULL pointer in
irq_force_complete_move().  It is possible that xen guests have
irq_desc->chip_data = NULL.

Test for NULL chip_data pointer before attempting to complete an irq move.

Signed-off-by: Prarit Bhargava <prarit@redhat.com>
LKML-Reference: <20100427152434.16193.49104.sendpatchset@prarit.bos.redhat.com>
Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Cc: <stable@kernel.org> [2.6.33]
---
 arch/x86/kernel/apic/io_apic.c |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
index 127b871..eb2789c 100644
--- a/arch/x86/kernel/apic/io_apic.c
+++ b/arch/x86/kernel/apic/io_apic.c
@@ -2545,6 +2545,9 @@ void irq_force_complete_move(int irq)
 	struct irq_desc *desc = irq_to_desc(irq);
 	struct irq_cfg *cfg = desc->chip_data;
 
+	if (!cfg)
+		return;
+
 	__irq_complete_move(&desc, cfg->vector);
 }
 #else

* Re: [PATCH] Fix NULL pointer for Xen guests
  2010-04-30 20:55 ` H. Peter Anvin
  2010-04-30 21:33   ` H. Peter Anvin
@ 2010-04-30 22:01   ` Prarit Bhargava
  1 sibling, 0 replies; 20+ messages in thread
From: Prarit Bhargava @ 2010-04-30 22:01 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: linux-kernel, suresh.b.siddha, x86, clalance, drjones



On 04/30/2010 04:55 PM, H. Peter Anvin wrote:
> This looks like it should be tagged stable for 2.6.33.  Is that correct?
>
>    

Yes.

P.

> 	-hpa
>
> On 04/27/2010 08:24 AM, Prarit Bhargava wrote:
>    
>> Upstream PV guests fail to boot because of a NULL pointer.  It is possible that
>> xen guests have irq_desc->chip_data = NULL.
>>
>> Test for NULL chip_data pointer before attempting to complete an irq move.
>>
>> Signed-off-by: Prarit Bhargava<prarit@redhat.com>
>> Acked-by: Suresh Siddha<suresh.b.siddha@intel.com>
>>
>> diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
>> index 127b871..eb2789c 100644
>> --- a/arch/x86/kernel/apic/io_apic.c
>> +++ b/arch/x86/kernel/apic/io_apic.c
>> @@ -2545,6 +2545,9 @@ void irq_force_complete_move(int irq)
>>   	struct irq_desc *desc = irq_to_desc(irq);
>>   	struct irq_cfg *cfg = desc->chip_data;
>>
>> +	if (!cfg)
>> +		return;
>> +
>>   	__irq_complete_move(&desc, cfg->vector);
>>   }
>>   #else
>    

* Re: [LKML] [PATCH] Fix NULL pointer for Xen guests
  2010-04-27 18:47       ` Prarit Bhargava
@ 2010-05-03 19:16         ` Konrad Rzeszutek Wilk
  2010-05-03 19:56           ` Prarit Bhargava
  2010-05-04 15:02           ` [LKML] " Konrad Rzeszutek Wilk
  0 siblings, 2 replies; 20+ messages in thread
From: Konrad Rzeszutek Wilk @ 2010-05-03 19:16 UTC (permalink / raw)
  To: Prarit Bhargava; +Cc: linux-kernel, suresh.b.siddha, x86, clalance, drjones

>> OK, so your control domain is RHEL5. Mine is the Jeremy's xen/next one
>> (2.6.32). Let me try to compile RHEL5 under FC11 - any tricks necessary
>> to do that?
>>    
>
> I haven't tried it -- it might work :)
>
> Also, did you try booting with maxvcpus > vcpus as drjones suggested ?

Yes. No luck reproducing the crash/panic. I am just not seeing the failure you
guys are seeing.

Let me build once more (2.6.33 vanilla + CONFIG_DEBUG_MARK_RODATA=n) and check
this. And also install a vanilla RHEL5 dom0 as it looks impossible to
compile a 2.6.18-era kernel under FC11.

The Xen I am using is xen-unstable - so 4.0.1. I know that the IRQ balance
code in the Xen hypervisor was fixed in 4.0 (it used to run out of
context - now it runs in the IRQ context). Maybe this bug you are seeing
(and have the fix for) is just a red herring?

* Re: [LKML] [PATCH] Fix NULL pointer for Xen guests
  2010-05-03 19:16         ` Konrad Rzeszutek Wilk
@ 2010-05-03 19:56           ` Prarit Bhargava
  2010-05-04 15:02           ` [LKML] " Konrad Rzeszutek Wilk
  1 sibling, 0 replies; 20+ messages in thread
From: Prarit Bhargava @ 2010-05-03 19:56 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: linux-kernel, suresh.b.siddha, x86, clalance, drjones



On 05/03/2010 03:16 PM, Konrad Rzeszutek Wilk wrote:
>>> OK, so your control domain is RHEL5. Mine is the Jeremy's xen/next one
>>> (2.6.32). Let me try to compile RHEL5 under FC11 - any tricks necessary
>>> to do that?
>>>
>>>        
>> I haven't tried it -- it might work :)
>>
>> Also, did you try booting with maxvcpus>  vcpus as drjones suggested ?
>>      
> Yes. No luck reproducing the crash/panic. I am just not seeing the failure you
> guys are seeing.
>
> Let me build once more (2.6.33 vanilla + CONFIG_DEBUG_MARK_RODATA=n) and check
> this. And also install a vanilla RHEL5 dom0 as it looks impossible to
> compile a 2.6.18-era kernel under FC11.
>    

Let me try reproducing this on FC11 + 2.6.33.

P.


* Re: [LKML] Re: [LKML] [PATCH] Fix NULL pointer for Xen guests
  2010-05-03 19:16         ` Konrad Rzeszutek Wilk
  2010-05-03 19:56           ` Prarit Bhargava
@ 2010-05-04 15:02           ` Konrad Rzeszutek Wilk
  2010-05-04 15:21             ` Prarit Bhargava
  1 sibling, 1 reply; 20+ messages in thread
From: Konrad Rzeszutek Wilk @ 2010-05-04 15:02 UTC (permalink / raw)
  To: Prarit Bhargava; +Cc: linux-kernel, suresh.b.siddha, x86, clalance, drjones

On Mon, May 03, 2010 at 03:16:34PM -0400, Konrad Rzeszutek Wilk wrote:
> >> OK, so your control domain is RHEL5. Mine is the Jeremy's xen/next one
> >> (2.6.32). Let me try to compile RHEL5 under FC11 - any tricks necessary
> >> to do that?
> >>    
> >
> > I haven't tried it -- it might work :)
> >
> > Also, did you try booting with maxvcpus > vcpus as drjones suggested ?
> 
> Yes. No luck reproducing the crash/panic. I am just not seeing the failure you
> guys are seeing.
> 
> Let me build once more (2.6.33 vanilla + CONFIG_DEBUG_MARK_RODATA=n) and check
> this. And also install a vanilla RHEL5 dom0 as it looks impossible to
> compile a 2.6.18-era kernel under FC11.

Rebuilding everything from scratch did it. I am seeing a similar
failure where xenctx reports:

Call Trace:
  [<ffffffff8107f780>] stop_cpu+0xc6  <--
  [<ffffffff8105520e>] worker_thread+0x15d 
  [<ffffffff8107f6ba>] __stop_machine+0x106 
  [<ffffffff81058afb>] wake_up_bit+0x25 
  [<ffffffff81038720>] spin_unlock_irqrestore+0x9 
  [<ffffffff810550b1>] spin_lock_irq+0xb 
  [<ffffffff810586cb>] kthread+0x7a 
  [<ffffffff8100a964>] kernel_thread_helper+0x4 
  [<ffffffff81009d61>] int_ret_from_sys_call+0x7 
  [<ffffffff814033dd>] retint_restore_args+0x5 
  [<ffffffff8100a960>] gs_change+0x13 

With this guest file:

kernel = "/mnt/lab/vs11/vmlinuz"
ramdisk = "/mnt/lab/vs11/initramfs.cpio.gz"
memory = 2048
maxvcpus = 4
vcpus = 2
vif = [ 'mac=00:0F:4B:00:00:71, bridge=switch' ]
vfb = [ 'vnc=1, vnclisten=0.0.0.0,vncunused=1']
root = "debug loglevel=10 plymouth:splash=solar plymouth:debug norm console=hvc0 initcall_debug"

This is with the latest linux kernel:
d93ac51c7a129db7a1431d859a3ef45a0b1f3fc5 (Merge branch 'for-linus' of
git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client)

With your patch the PV guest keeps on going.

So:

Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> 
> The Xen I am using is xen-unstable - so 4.0.1. I know that the IRQ balance
> code in the Xen hypervisor was fixed in 4.0 (it used to run out of
> context - now it runs in the IRQ context). Maybe this bug you are seeing
> (and have the fix for) is just a red herring?

Interestingly enough, I couldn't reproduce this on my Intel box, but on
an AMD box with a very whacked TSC (cpu MHz         : 2795681.405) I can
reproduce this.



* Re: [LKML] [tip:x86/urgent] x86: Fix NULL pointer access in irq_force_complete_move() for Xen guests
  2010-04-30 21:36 ` [tip:x86/urgent] x86: Fix NULL pointer access in irq_force_complete_move() " tip-bot for Prarit Bhargava
@ 2010-05-04 15:02   ` Konrad Rzeszutek Wilk
  0 siblings, 0 replies; 20+ messages in thread
From: Konrad Rzeszutek Wilk @ 2010-05-04 15:02 UTC (permalink / raw)
  To: mingo, hpa, linux-kernel, suresh.b.siddha, tglx, prarit; +Cc: linux-tip-commits

On Fri, Apr 30, 2010 at 09:36:42PM +0000, tip-bot for Prarit Bhargava wrote:
> Commit-ID:  bbd391a15d82e14efe9d69ba64cadb855b061dba
> Gitweb:     http://git.kernel.org/tip/bbd391a15d82e14efe9d69ba64cadb855b061dba
> Author:     Prarit Bhargava <prarit@redhat.com>
> AuthorDate: Tue, 27 Apr 2010 11:24:42 -0400
> Committer:  H. Peter Anvin <hpa@zytor.com>
> CommitDate: Fri, 30 Apr 2010 14:31:38 -0700
> 
> x86: Fix NULL pointer access in irq_force_complete_move() for Xen guests
> 
> Upstream PV guests fail to boot because of a NULL pointer in
> irq_force_complete_move().  It is possible that xen guests have
> irq_desc->chip_data = NULL.
> 
> Test for NULL chip_data pointer before attempting to complete an irq move.
> 
> Signed-off-by: Prarit Bhargava <prarit@redhat.com>
> LKML-Reference: <20100427152434.16193.49104.sendpatchset@prarit.bos.redhat.com>
> Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
> Signed-off-by: H. Peter Anvin <hpa@zytor.com>

Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

> Cc: <stable@kernel.org> [2.6.33]
> ---
>  arch/x86/kernel/apic/io_apic.c |    3 +++
>  1 files changed, 3 insertions(+), 0 deletions(-)
> 
> diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
> index 127b871..eb2789c 100644
> --- a/arch/x86/kernel/apic/io_apic.c
> +++ b/arch/x86/kernel/apic/io_apic.c
> @@ -2545,6 +2545,9 @@ void irq_force_complete_move(int irq)
>  	struct irq_desc *desc = irq_to_desc(irq);
>  	struct irq_cfg *cfg = desc->chip_data;
>  
> +	if (!cfg)
> +		return;
> +
>  	__irq_complete_move(&desc, cfg->vector);
>  }
>  #else
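
For anyone following the patch outside a kernel tree, the guard can be
illustrated with a minimal userspace sketch. Everything here is an
assumption made for illustration: the two structs carry only the fields
the guard touches, and mock_complete_move() and force_complete_move()
are hypothetical stand-ins for __irq_complete_move() and
irq_force_complete_move(), not kernel API.

```c
#include <stddef.h>

/* Hypothetical stand-in: only the field read by the guarded call. */
struct irq_cfg {
	unsigned int vector;
};

/* Hypothetical stand-in: chip_data may be NULL, as reported for Xen PV guests. */
struct irq_desc {
	void *chip_data;
};

/* Counts how many times the mocked move-completion path was reached. */
static int moves_completed;

/* Stand-in for __irq_complete_move(); it only records that it was called. */
static void mock_complete_move(struct irq_desc *desc, unsigned int vector)
{
	(void)desc;
	(void)vector;
	moves_completed++;
}

/*
 * Mirrors the shape of the patched function: bail out before
 * dereferencing cfg when chip_data was never set.
 * Returns 1 if the move path ran, 0 if it was skipped.
 */
static int force_complete_move(struct irq_desc *desc)
{
	struct irq_cfg *cfg = desc->chip_data;

	if (!cfg)
		return 0;

	mock_complete_move(desc, cfg->vector);
	return 1;
}
```

Calling force_complete_move() on a descriptor whose chip_data is NULL
returns 0 without touching cfg->vector, which is the dereference the
patch avoids. The return value exists only to make the sketch
observable; the kernel function is void.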


* Re: [LKML] Re: [LKML] [PATCH] Fix NULL pointer for Xen guests
  2010-05-04 15:02           ` [LKML] " Konrad Rzeszutek Wilk
@ 2010-05-04 15:21             ` Prarit Bhargava
  0 siblings, 0 replies; 20+ messages in thread
From: Prarit Bhargava @ 2010-05-04 15:21 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: linux-kernel, suresh.b.siddha, x86, clalance, drjones


> Interestingly enough, I couldn't reproduce this on my Intel box, but on
> an AMD box with a very whacked TSC (cpu MHz         : 2795681.405) I can
> reproduce this.
>
>    

Huh ... that's odd.  I'll grab a Dinar-based system and see if I can
reproduce it there.  It would be interesting to know what the 
differences are.

P.


Thread overview: 20+ messages
2010-04-27 15:24 [PATCH] Fix NULL pointer for Xen guests Prarit Bhargava
2010-04-27 16:58 ` [LKML] " Konrad Rzeszutek Wilk
2010-04-27 17:09   ` Prarit Bhargava
2010-04-27 17:59     ` Andrew Jones
2010-04-27 18:34     ` Konrad Rzeszutek Wilk
2010-04-27 18:47       ` Prarit Bhargava
2010-05-03 19:16         ` Konrad Rzeszutek Wilk
2010-05-03 19:56           ` Prarit Bhargava
2010-05-04 15:02           ` [LKML] " Konrad Rzeszutek Wilk
2010-05-04 15:21             ` Prarit Bhargava
2010-04-28 18:26 ` Andrew Morton
2010-04-28 18:29   ` Prarit Bhargava
2010-04-28 18:42     ` Suresh Siddha
2010-04-28 18:50     ` Andrew Morton
2010-04-28 19:15       ` [stable] " Greg KH
2010-04-30 20:55 ` H. Peter Anvin
2010-04-30 21:33   ` H. Peter Anvin
2010-04-30 22:01   ` Prarit Bhargava
2010-04-30 21:36 ` [tip:x86/urgent] x86: Fix NULL pointer access in irq_force_complete_move() " tip-bot for Prarit Bhargava
2010-05-04 15:02   ` [LKML] " Konrad Rzeszutek Wilk
