* Re: [Fastboot] [PATCH] kexec: Avoid migration of already disabled
@ 2007-01-31  2:58 Jay Lan
  2007-01-31  3:49 ` [Fastboot] [PATCH] kexec: Avoid migration of already disabled irqs (ia64) Magnus Damm
                   ` (5 more replies)
  0 siblings, 6 replies; 7+ messages in thread
From: Jay Lan @ 2007-01-31  2:58 UTC (permalink / raw)
  To: linux-ia64

Magnus Damm wrote:
> kexec: Avoid migration of already disabled irqs (ia64)
> 
> This patch fixes up ia64 kexec support for HP rx2620 hardware. It does this 
> by skipping migration of already disabled irqs. This is most likely a problem
> on other ia64 platforms as well, but I've only tested this on one machine
> so far.

I have not seen this problem on SN systems.

Cheers,
 - jay

> 
> The full story is that handle_bad_irq() gets invoked before starting the new 
> kernel without this patch. This seems to happen when fixup_irqs() calls 
> generic_handle_irq() on already migrated (and disabled) irqs. So by avoiding
> migration of disabled irqs we stay away from handle_bad_irq().
> 
> Signed-off-by: Magnus Damm <magnus@valinux.co.jp>
> ---
> 
>  Tested using kexec-tools-testing 7792798a79b78a5d566f70c9f00237d050b01350
>  on HP rx2620 hardware.
> 
>  Applies on top of 2.6.20-rc6.
> 
>  arch/ia64/kernel/irq.c |    3 +++
>  1 file changed, 3 insertions(+)
> 
> --- 0001/arch/ia64/kernel/irq.c
> +++ 0004/arch/ia64/kernel/irq.c	2007-01-30 12:35:10.000000000 +0900
> @@ -122,6 +122,9 @@ static void migrate_irqs(void)
>  	for (irq=0; irq < NR_IRQS; irq++) {
>  		desc = irq_desc + irq;
>  
> +		if (desc->status == IRQ_DISABLED)
> +			continue;
> +
>  		/*
>  		 * No handling for now.
>  		 * TBD: Implement a disable function so we can now
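
Editor's sketch, not part of the posted patch: the skip that the hunk above adds to the migrate_irqs() loop can be modeled in userspace roughly as follows. The struct, the MODEL_* flag values, and model_migrate_irqs() are hypothetical stand-ins, not the kernel's definitions. One deliberate difference: the patch compares the whole status word with ==, while this sketch tests the bit with &, so it also skips descriptors that carry other status bits next to the disabled one.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's flags; values are illustrative. */
#define MODEL_IRQ_DISABLED 0x2u
#define MODEL_IRQ_PENDING  0x4u

struct model_irq_desc {
    unsigned int status;   /* bitmask of MODEL_IRQ_* flags */
    int cpu;               /* CPU this irq currently targets */
};

/*
 * Retarget every enabled irq that points at dying_cpu to new_cpu.
 * Descriptors with the disabled bit set are left untouched.
 * Returns the number of descriptors that were retargeted.
 */
size_t model_migrate_irqs(struct model_irq_desc *descs, size_t n,
                          int dying_cpu, int new_cpu)
{
    size_t moved = 0;

    assert(descs != NULL || n == 0);

    for (size_t irq = 0; irq < n; irq++) {
        struct model_irq_desc *desc = &descs[irq];

        /* Bit test: still true when other status bits are set too. */
        if (desc->status & MODEL_IRQ_DISABLED)
            continue;

        if (desc->cpu == dying_cpu) {
            desc->cpu = new_cpu;
            moved++;
        }
    }
    return moved;
}
```

Calling model_migrate_irqs() on a table where only one of three descriptors is enabled retargets just that one and leaves the disabled ones on their original CPU.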



* Re: [Fastboot] [PATCH] kexec: Avoid migration of already disabled irqs (ia64)
  2007-01-31  2:58 [Fastboot] [PATCH] kexec: Avoid migration of already disabled Jay Lan
@ 2007-01-31  3:49 ` Magnus Damm
  2007-01-31  4:54 ` Zou, Nanhai
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Magnus Damm @ 2007-01-31  3:49 UTC (permalink / raw)
  To: linux-ia64

On 1/31/07, Jay Lan <jlan@sgi.com> wrote:
> Magnus Damm wrote:
> > kexec: Avoid migration of already disabled irqs (ia64)
> >
> > This patch fixes up ia64 kexec support for HP rx2620 hardware. It does this
> > by skipping migration of already disabled irqs. This is most likely a problem
> > on other ia64 platforms as well, but I've only tested this on one machine
> > so far.
>
> I have not seen this problem on SN systems.

Ok, thanks. Let me give you more details.

When I perform "kexec -e" the following output appears on my serial
console (with my patch applied).

ACPI: PCI interrupt for device 0000:20:02.1 disabled
GSI 30 (level, low) -> CPU 1 (0x0200) vector 53 unregistered
ACPI: PCI interrupt for device 0000:20:02.0 disabled
GSI 29 (level, low) -> CPU 0 (0x0000) vector 52 unregistered
Starting new kernel
CPU 1 is now offline
CPU 2 is now offline
CPU 3 is now offline
Linux version 2.6.20-rc6 (damm@localhost) (gcc version 3.4.5) #1 SMP
Tue Jan 30 16:59:54 JST 2007

Without the patch the kernel tries to migrate already disabled
interrupts which results in this:

ACPI: PCI interrupt for device 0000:20:02.1 disabled
GSI 30 (level, low) -> CPU 1 (0x0200) vector 53 unregistered
ACPI: PCI interrupt for device 0000:20:02.0 disabled
GSI 29 (level, low) -> CPU 0 (0x0000) vector 52 unregistered
Starting new kernel
BUG: at arch/ia64/kernel/irq.c:155 migrate_irqs()

Call Trace:
 [<a000000100012e10>] show_stack+0x50/0xa0
                                sp=e0000040fb7f7b20 bsp=e0000040fb7f0d60
 [<a000000100012e90>] dump_stack+0x30/0x60
                                sp=e0000040fb7f7cf0 bsp=e0000040fb7f0d48
 [<a000000100011610>] fixup_irqs+0x490/0x680
                                sp=e0000040fb7f7cf0 bsp=e0000040fb7f0d08
 [<a000000100055300>] __cpu_disable+0x5c0/0x660
                                sp=e0000040fb7f7d80 bsp=e0000040fb7f0cb8
 [<a0000001000d21c0>] take_cpu_down+0x20/0x80
                                sp=e0000040fb7f7dc0 bsp=e0000040fb7f0ca0
 [<a0000001000dcd90>] do_stop+0x250/0x360
                                sp=e0000040fb7f7dc0 bsp=e0000040fb7f0c60
 [<a0000001000bf9b0>] kthread+0x230/0x2a0
                                sp=e0000040fb7f7dd0 bsp=e0000040fb7f0c20
 [<a000000100015290>] kernel_thread_helper+0xd0/0x100
                                sp=e0000040fb7f7e30 bsp=e0000040fb7f0bf0
 [<a0000001000094c0>] start_kernel_thread+0x20/0x40
                                sp=e0000040fb7f7e30 bsp=e0000040fb7f0bf0
irq 53, desc: a0000001007d1080, depth: 0, count: 3, unhandled: 0
->handle_irq():  0000000000000000, 0x0
->chip(): a000000100831fa8, no_irq_chip+0x0/0x80
->action(): 0000000000000000
  IRQ_DISABLED set
Unexpected irq vector 0x35 on CPU 1!
CPU 1 is now offline
...
(more or less same thing on CPU2 and CPU3 as well)

This is how my /proc/interrupts looks:

/ # cat /proc/interrupts
           CPU0       CPU1       CPU2       CPU3
 28:          1          1          1          1          LSAPIC  cpe_poll
 29:          0          0          0          0          LSAPIC  cmc_poll
 31:          0          0          0          0          LSAPIC  cmc_hndlr
 48:          0          0          0          0  IO-SAPIC-level  acpi
 49:          0         52          0          0  IO-SAPIC-level  serial
 52:        104          0          0          0  IO-SAPIC-level  eth0
232:          0          0          0          0          LSAPIC  mca_rdzv
238:          0          0          0          0          LSAPIC  perfmon
239:       7027       6979       7055       6916          LSAPIC  timer
240:          0          0          0          0          LSAPIC  mca_wkup
253:        145        106        766        689          LSAPIC  resched
254:         21         50         51         34          LSAPIC  IPI
ERR:          0

Are IOSAPIC interrupts routed to multiple CPUs in your case?
Do you get any "ACPI: PCI interrupt for device nnn disabled" messages?

Thanks!

/ magnus
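
Editor's sketch: one way to answer the routing question from a /proc/interrupts dump is to count, per row, how many CPU columns are non-zero. The helper name cpus_with_hits() is hypothetical, and the parsing assumes the row layout shown above (a label ending in a colon, then one decimal counter per CPU). A non-zero count only shows where an interrupt has been delivered so far, not how it is currently routed.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

/*
 * Count how many of the first ncpus per-CPU counters on one
 * /proc/interrupts row are non-zero.
 * Returns -1 if the row has no "<label>:" prefix (e.g. the CPU header).
 */
int cpus_with_hits(const char *row, int ncpus)
{
    const char *p = row;
    int hits = 0;

    assert(row != NULL);

    /* Skip the "NN:" or "ERR:" label. */
    while (*p != '\0' && *p != ':')
        p++;
    if (*p != ':')
        return -1;
    p++;

    for (int cpu = 0; cpu < ncpus; cpu++) {
        char *end;
        unsigned long count;

        while (isspace((unsigned char)*p))
            p++;
        if (!isdigit((unsigned char)*p))
            break;          /* fewer counters than CPUs on this row */
        count = strtoul(p, &end, 10);
        if (count != 0)
            hits++;
        p = end;
    }
    return hits;
}
```

Applied to the rows above with ncpus set to 4, the IO-SAPIC-level rows for serial (49) and eth0 (52) each report a single CPU with hits, while the LSAPIC timer row reports all four.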


* RE: [Fastboot] [PATCH] kexec: Avoid migration of already disabled irqs (ia64)
  2007-01-31  2:58 [Fastboot] [PATCH] kexec: Avoid migration of already disabled Jay Lan
  2007-01-31  3:49 ` [Fastboot] [PATCH] kexec: Avoid migration of already disabled irqs (ia64) Magnus Damm
@ 2007-01-31  4:54 ` Zou, Nanhai
  2007-01-31  5:57 ` Magnus Damm
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Zou, Nanhai @ 2007-01-31  4:54 UTC (permalink / raw)
  To: linux-ia64

> -----Original Message-----
> From: Magnus Damm [mailto:magnus.damm@gmail.com]
> Sent: January 31, 2007 11:50
> To: Jay Lan
> Cc: Magnus Damm; linux-ia64@vger.kernel.org; Luck, Tony; Zou, Nanhai;
> fastboot@lists.osdl.org
> Subject: Re: [Fastboot] [PATCH] kexec: Avoid migration of already disabled irqs
> (ia64)
> 
> On 1/31/07, Jay Lan <jlan@sgi.com> wrote:
> > Magnus Damm wrote:
> > > kexec: Avoid migration of already disabled irqs (ia64)
> > >
> > > This patch fixes up ia64 kexec support for HP rx2620 hardware. It does this
> > > by skipping migration of already disabled irqs. This is most likely a problem
> > > on other ia64 platforms as well, but I've only tested this on one machine
> > > so far.
> >
> > I have not seen this problem on SN systems.
> 
> Ok, thanks. Let me give you more details.
> 
> When I perform "kexec -e" the following output appears on my serial
> console (with my patch applied).

 This patch is correct, I think. I assume you will also see this bug when
trying to offline a CPU with
echo 0 > /sys/devices/system/cpu/cpuX/online

However, it will not be triggered in the crash dump case, which has been
tested heavily on different kinds of platforms. We do not migrate IRQs at
the time of a crash. I guess that is why Jay has never seen it on SN
platforms.
 
Thanks
Zou Nan hai
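
Editor's sketch: the echo command above can also be issued from a small C helper, which is handy when scripting repeated offline/online cycles to reproduce the bug. The function name set_cpu_online() is hypothetical; the caller supplies the full path to the CPU's sysfs online file, and writing it needs root on a real system.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Write "0" (offline) or "1" (online) to a CPU's sysfs online file.
 * Returns 0 on success, -1 on failure.
 */
int set_cpu_online(const char *online_path, int online)
{
    FILE *f;
    int ok;

    assert(online_path != NULL);

    f = fopen(online_path, "w");
    if (f == NULL)
        return -1;

    ok = fputs(online ? "1\n" : "0\n", f) >= 0;

    /* fclose() flushes the buffer; a rejected write can surface here. */
    if (fclose(f) != 0)
        ok = 0;

    return ok ? 0 : -1;
}
```

Passing the path is a deliberate choice: it keeps the CPU number out of the helper and lets the function be exercised against an ordinary file.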


* Re: [Fastboot] [PATCH] kexec: Avoid migration of already disabled irqs (ia64)
  2007-01-31  2:58 [Fastboot] [PATCH] kexec: Avoid migration of already disabled Jay Lan
  2007-01-31  3:49 ` [Fastboot] [PATCH] kexec: Avoid migration of already disabled irqs (ia64) Magnus Damm
  2007-01-31  4:54 ` Zou, Nanhai
@ 2007-01-31  5:57 ` Magnus Damm
  2007-01-31  6:07 ` Zou, Nanhai
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Magnus Damm @ 2007-01-31  5:57 UTC (permalink / raw)
  To: linux-ia64

Hi Nan hai,

On 1/31/07, Zou, Nanhai <nanhai.zou@intel.com> wrote:
> > On 1/31/07, Jay Lan <jlan@sgi.com> wrote:
> > > Magnus Damm wrote:
> > > > kexec: Avoid migration of already disabled irqs (ia64)
> > > >
> > > > This patch fixes up ia64 kexec support for HP rx2620 hardware. It does this
> > > > by skipping migration of already disabled irqs. This is most likely a problem
> > > > on other ia64 platforms as well, but I've only tested this on one machine
> > > > so far.
> > >
> > > I have not seen this problem on SN systems.
> >
> > Ok, thanks. Let me give you more details.
> >
> > When I perform "kexec -e" the following output appears on my serial
> > console (with my patch applied).
>
>  This patch is correct I think. I assume you will also see this bug when trying to offline a CPU by echo 0 > /sys/devices/system/cpu/cpuX/online

You are right, I can trigger the bug that way too.
And the bug goes away with the patch. Excellent!

> However it will not be triggered in crash dump case which has been tested heavily on different kind of platforms. We do not migrate IRQ at the time of crash. I guess that is why Jay has never seen it on SN platforms.

Booting into a crash kernel does not seem to work on this ia64 box
unfortunately. I'm about to investigate why, but I thought trying and
fixing kexec was a good first step.

Thanks for the help!

/ magnus


* RE: [Fastboot] [PATCH] kexec: Avoid migration of already disabled irqs (ia64)
  2007-01-31  2:58 [Fastboot] [PATCH] kexec: Avoid migration of already disabled Jay Lan
                   ` (2 preceding siblings ...)
  2007-01-31  5:57 ` Magnus Damm
@ 2007-01-31  6:07 ` Zou, Nanhai
  2007-01-31 18:09 ` [Fastboot] [PATCH] kexec: Avoid migration of already disabled Jay Lan
  2007-02-01  6:02 ` [Fastboot] [PATCH] kexec: Avoid migration of already disabled irqs (ia64) Magnus Damm
  5 siblings, 0 replies; 7+ messages in thread
From: Zou, Nanhai @ 2007-01-31  6:07 UTC (permalink / raw)
  To: linux-ia64


> -----Original Message-----
> From: Magnus Damm [mailto:magnus.damm@gmail.com]
> Sent: January 31, 2007 13:58
> To: Zou, Nanhai
> Cc: Jay Lan; Magnus Damm; linux-ia64@vger.kernel.org; Luck, Tony;
> fastboot@lists.osdl.org
> Subject: Re: [Fastboot] [PATCH] kexec: Avoid migration of already disabled irqs
> (ia64)
> 
> Hi Nan hai,
> 
> On 1/31/07, Zou, Nanhai <nanhai.zou@intel.com> wrote:
> > > On 1/31/07, Jay Lan <jlan@sgi.com> wrote:
> > > > Magnus Damm wrote:
> > > > > kexec: Avoid migration of already disabled irqs (ia64)
> > > > >
> > > > > This patch fixes up ia64 kexec support for HP rx2620 hardware. It does
> this
> > > > > by skipping migration of already disabled irqs. This is most likely
> a problem
> > > > > on other ia64 platforms as well, but I've only tested this on one machine
> > > > > so far.
> > > >
> > > > I have not seen this problem on SN systems.
> > >
> > > Ok, thanks. Let me give you more details.
> > >
> > > When I perform "kexec -e" the following output appears on my serial
> > > console (with my patch applied).
> >
> >  This patch is correct I think. I assume you will also see this bug when trying
> to offline a CPU by echo 0 > /sys/devices/system/cpu/cpuX/online
> 
> You are right, I can trigger the bug that way too.
> And the bug goes away with the patch. Excellent!
> 
> > However it will not be triggered in crash dump case which has been tested
> heavily on different kind of platforms. We do not migrate IRQ at the time of
> crash. I guess that is why Jay has never seen it on SN platforms.
> 
> Booting into a crash kernel does not seem to work on this ia64 box
> unfortunately. I'm about to investigate why, but I thought trying and
> fixing kexec was a good first step.
> 
  That is probably caused by HP's IOMMU engine not shutting down at crash time.
Please try adding the machvec=dig option for the crash dump kernel.
  However, we still need to find a lightweight way to correctly shut down the IOMMU engine on those HP platforms.

Thanks
Zou Nan hai
> Thanks for the help!

> 
> / magnus


* Re: [Fastboot] [PATCH] kexec: Avoid migration of already disabled
  2007-01-31  2:58 [Fastboot] [PATCH] kexec: Avoid migration of already disabled Jay Lan
                   ` (3 preceding siblings ...)
  2007-01-31  6:07 ` Zou, Nanhai
@ 2007-01-31 18:09 ` Jay Lan
  2007-02-01  6:02 ` [Fastboot] [PATCH] kexec: Avoid migration of already disabled irqs (ia64) Magnus Damm
  5 siblings, 0 replies; 7+ messages in thread
From: Jay Lan @ 2007-01-31 18:09 UTC (permalink / raw)
  To: linux-ia64

Magnus Damm wrote:
> Hi Nan hai,
> 
> On 1/31/07, Zou, Nanhai <nanhai.zou@intel.com> wrote:
>> > On 1/31/07, Jay Lan <jlan@sgi.com> wrote:
>> > > Magnus Damm wrote:
>> > > > kexec: Avoid migration of already disabled irqs (ia64)
>> > > >
>> > > > This patch fixes up ia64 kexec support for HP rx2620 hardware.
>> It does this
>> > > > by skipping migration of already disabled irqs. This is most
>> likely a problem
>> > > > on other ia64 platforms as well, but I've only tested this on
>> one machine
>> > > > so far.
>> > >
>> > > I have not seen this problem on SN systems.
>> >
>> > Ok, thanks. Let me give you more details.
>> >
>> > When I perform "kexec -e" the following output appears on my serial
>> > console (with my patch applied).
>>
>>  This patch is correct I think. I assume you will also see this bug
>> when trying to offline a CPU by echo 0 >
>> /sys/devices/system/cpu/cpuX/online
> 
> You are right, I can trigger the bug that way too.
> And the bug goes away with the patch. Excellent!

I tried on a 2p SN system with both '-l' followed by '-e' and also
Nan-hai's trigger command. Neither triggered the problem. But, on
the other hand, the system worked fine with your patch also. :)

Thanks,
 - jay

> 
>> However it will not be triggered in crash dump case which has been
>> tested heavily on different kind of platforms. We do not migrate IRQ
>> at the time of crash. I guess that is why Jay has never seen it on SN
>> platforms.
> 
> Booting into a crash kernel does not seem to work on this ia64 box
> unfortunately. I'm about to investigate why, but I thought trying and
> fixing kexec was a good first step.
> 
> Thanks for the help!
> 
> / magnus



* Re: [Fastboot] [PATCH] kexec: Avoid migration of already disabled irqs (ia64)
  2007-01-31  2:58 [Fastboot] [PATCH] kexec: Avoid migration of already disabled Jay Lan
                   ` (4 preceding siblings ...)
  2007-01-31 18:09 ` [Fastboot] [PATCH] kexec: Avoid migration of already disabled Jay Lan
@ 2007-02-01  6:02 ` Magnus Damm
  5 siblings, 0 replies; 7+ messages in thread
From: Magnus Damm @ 2007-02-01  6:02 UTC (permalink / raw)
  To: linux-ia64

On 1/31/07, Zou, Nanhai <nanhai.zou@intel.com> wrote:
> > From: Magnus Damm [mailto:magnus.damm@gmail.com]
> > Booting into a crash kernel does not seem to work on this ia64 box
> > unfortunately. I'm about to investigate why, but I thought trying and
> > fixing kexec was a good first step.
> >
>   That is probably caused by HP's IOMMU engine not shutting down at crash time....,
> Please try add machvec=dig option for the crash dump kernel.
>   However we still need to find a light way to correctly shutdown IOMMU engine on those HP platforms.

Unfortunately, it does not seem to help, regardless of whether I use an SMP or UP kernel.

I'll try to track down where things lock up. Thanks.

/ magnus
