All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 0/2] x86/apic/kexec: Legacy irq setting fixs in kdump kernel
@ 2018-01-25 14:11 Baoquan He
  2018-01-25 14:11 ` [PATCH v2 1/2] x86/apic/kexec: Enable legacy irq mode before jump to kexec/kdump kernel Baoquan He
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Baoquan He @ 2018-01-25 14:11 UTC (permalink / raw)
  To: linux-kernel
  Cc: ebiederm, tglx, mingo, hpa, x86, douly.fnst, joro, uobergfe,
	prarit, Baoquan He

This is v2 post.

In v2, no code change, just improve change log of patch 1 and 2.

And drop the old patch 3 in v1, a clean up patch. The current
x86_io_apic_ops.disable() hook is still needed by irq remapping.

Baoquan He (2):
  x86/apic/kexec: Enable legacy irq mode before jump to kexec/kdump
    kernel
  x86/apic: Set up through-local-APIC on boot CPU's LINT0 if 'noapic'
    specified

 arch/x86/include/asm/io_apic.h     |  3 ++-
 arch/x86/kernel/apic/apic.c        |  2 +-
 arch/x86/kernel/apic/io_apic.c     | 12 ++++--------
 arch/x86/kernel/crash.c            |  2 +-
 arch/x86/kernel/machine_kexec_32.c | 15 +++++----------
 arch/x86/kernel/machine_kexec_64.c | 15 +++++----------
 arch/x86/kernel/reboot.c           |  2 +-
 7 files changed, 19 insertions(+), 32 deletions(-)

-- 
2.13.6

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH v2 1/2] x86/apic/kexec: Enable legacy irq mode before jump to kexec/kdump kernel
  2018-01-25 14:11 [PATCH v2 0/2] x86/apic/kexec: Legacy irq setting fixs in kdump kernel Baoquan He
@ 2018-01-25 14:11 ` Baoquan He
  2018-02-07 17:11   ` Eric W. Biederman
  2018-01-25 14:11 ` [PATCH v2 2/2] x86/apic: Set up through-local-APIC on boot CPU's LINT0 if 'noapic' specified Baoquan He
  2018-02-02  4:04 ` [PATCH v2 0/2] x86/apic/kexec: Legacy irq setting fixs in kdump kernel Dou Liyang
  2 siblings, 1 reply; 8+ messages in thread
From: Baoquan He @ 2018-01-25 14:11 UTC (permalink / raw)
  To: linux-kernel
  Cc: ebiederm, tglx, mingo, hpa, x86, douly.fnst, joro, uobergfe,
	prarit, Baoquan He

On kvm guest, kernel always prints warning during kdump kernel boots as
below.

[    0.001000] WARNING: CPU: 0 PID: 0 at arch/x86/kernel/apic/apic.c:1467 setup_local_APIC+0x228/0x330
[    0.001000] Modules linked in:
[    0.001000] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.15.0-rc5+ #3
[    0.001000] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1.fc26 04/01/2014
[    0.001000] RIP: 0010:setup_local_APIC+0x228/0x330
[    0.001000] RSP: 0000:ffffffffb6e03eb8 EFLAGS: 00010286
[    0.001000] RAX: 0000009edb4c4d84 RBX: 0000000000000000 RCX: 00000000b099d800
[    0.001000] RDX: 0000009e00000000 RSI: 0000000000000000 RDI: 0000000000000810
[    0.001000] RBP: 0000000000000000 R08: ffffffffffffffff R09: 0000000000000001
[    0.001000] R10: ffff98ce6a801c00 R11: 0761076d072f0776 R12: 0000000000000001
[    0.001000] R13: 00000000000000f0 R14: 0000000000004000 R15: ffffffffffffc6ff
[    0.001000] FS:  0000000000000000(0000) GS:ffff98ce6bc00000(0000) knlGS:0000000000000000
[    0.001000] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    0.001000] CR2: 00000000ffffffff CR3: 0000000022209000 CR4: 00000000000406b0
[    0.001000] Call Trace:
[    0.001000]  apic_bsp_setup+0x56/0x74
[    0.001000]  x86_late_time_init+0x11/0x16
[    0.001000]  start_kernel+0x3c9/0x486
[    0.001000]  secondary_startup_64+0xa5/0xb0
[    0.001000] Code: 00 85 c9 74 2d 0f 31 c1 e1 0a 48 c1 e2 20 41 89 cf 4c 03 7c 24 08 48 09 d0 49 29 c7 4c 89 3c 24 48 83 3c 24 00 0f 8f 8f fe ff
ff <0f> ff e9 10 ff ff ff 48 83 2c 24 01 eb e7 48 83 c4 18 5b 5d 41
[    0.001000] ---[ end trace b88e71b9a6ebebdd ]---
[    0.001000] masked ExtINT on CPU#0

The root cause is the legacy irq mode is disabled before jump to kexec/kdump
kernel since commit 522e66464467 ("x86/apic: Disable I/O APIC before shutdown
of the local APIC"). In that commit, lapic_shutdown() calling was moved after
disable_IO_APIC(). In fact in disable_IO_APIC(), it not only calls
clear_IO_APIC() to disable IO-APIC, and also sets LAPIC and IO-APIC to make
system be PIC or Virtual wire mode. Hence local APIC is disabled completely
by the calling of lapic_shutdown().

Later in kdump kernel, when calling setup_local_APIC(), the
'do { xxx } while (queued && max_loops > 0)' loop does not function well any
more if pending irq exists in APIC IRR since LAPIC is disabled. The do while
loop will terminate finally when max_loops overflows by subtraction. Then,
next WARN_ON(max_loops <= 0) is triggered.

In normal kernel it defaults to be PIC mode or Virtual Wire mode which is
done by BIOS. But kexec/kdump kernel won't go through BIOS, we need set
system as PIC or Virtual Wire mode before jump to kdump kernel code directly.
With this the pending irq can be handled correclty before APIC mode enabling.

So take clear_IO_APIC out of disable_IO_APIC, and rename disable_IO_APIC
as switch_to_legacy_irq_mode. Then only call clear_IO_APIC when IO-APIC
need be disabled. And call switch_to_legacy_irq_mode before kexec/kdump
jumping.

Signed-off-by: Baoquan He <bhe@redhat.com>
---
 arch/x86/include/asm/io_apic.h     |  3 ++-
 arch/x86/kernel/apic/io_apic.c     | 12 ++++--------
 arch/x86/kernel/crash.c            |  2 +-
 arch/x86/kernel/machine_kexec_32.c | 15 +++++----------
 arch/x86/kernel/machine_kexec_64.c | 15 +++++----------
 arch/x86/kernel/reboot.c           |  2 +-
 6 files changed, 18 insertions(+), 31 deletions(-)

diff --git a/arch/x86/include/asm/io_apic.h b/arch/x86/include/asm/io_apic.h
index a8834dd546cd..e38ad3863a2c 100644
--- a/arch/x86/include/asm/io_apic.h
+++ b/arch/x86/include/asm/io_apic.h
@@ -192,7 +192,8 @@ static inline unsigned int io_apic_read(unsigned int apic, unsigned int reg)
 
 extern void setup_IO_APIC(void);
 extern void enable_IO_APIC(void);
-extern void disable_IO_APIC(void);
+extern void clear_IO_APIC (void);
+extern void switch_to_legacy_irq_mode(void);
 extern int IO_APIC_get_PCI_irq_vector(int bus, int devfn, int pin);
 extern void print_IO_APICs(void);
 #else  /* !CONFIG_X86_IO_APIC */
diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
index 8a7963421460..a47aa915d18c 100644
--- a/arch/x86/kernel/apic/io_apic.c
+++ b/arch/x86/kernel/apic/io_apic.c
@@ -587,7 +587,7 @@ static void clear_IO_APIC_pin(unsigned int apic, unsigned int pin)
 		       mpc_ioapic_id(apic), pin);
 }
 
-static void clear_IO_APIC (void)
+void clear_IO_APIC (void)
 {
 	int apic, pin;
 
@@ -1439,15 +1439,11 @@ void native_disable_io_apic(void)
 }
 
 /*
- * Not an __init, needed by the reboot code
+ * Not an __init, needed by kexec/kdump code.
+ * For safety IO-APIC and Local APIC need be cleared before this.
  */
-void disable_IO_APIC(void)
+void switch_to_legacy_irq_mode(void)
 {
-	/*
-	 * Clear the IO-APIC before rebooting:
-	 */
-	clear_IO_APIC();
-
 	if (!nr_legacy_irqs())
 		return;
 
diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c
index 10e74d4778a1..318ffeaaf55a 100644
--- a/arch/x86/kernel/crash.c
+++ b/arch/x86/kernel/crash.c
@@ -199,7 +199,7 @@ void native_machine_crash_shutdown(struct pt_regs *regs)
 #ifdef CONFIG_X86_IO_APIC
 	/* Prevent crash_kexec() from deadlocking on ioapic_lock. */
 	ioapic_zap_locks();
-	disable_IO_APIC();
+	clear_IO_APIC();
 #endif
 	lapic_shutdown();
 #ifdef CONFIG_HPET_TIMER
diff --git a/arch/x86/kernel/machine_kexec_32.c b/arch/x86/kernel/machine_kexec_32.c
index edfede768688..7ab10d930cc6 100644
--- a/arch/x86/kernel/machine_kexec_32.c
+++ b/arch/x86/kernel/machine_kexec_32.c
@@ -190,18 +190,13 @@ void machine_kexec(struct kimage *image)
 	local_irq_disable();
 	hw_breakpoint_disable();
 
-	if (image->preserve_context) {
 #ifdef CONFIG_X86_IO_APIC
-		/*
-		 * We need to put APICs in legacy mode so that we can
-		 * get timer interrupts in second kernel. kexec/kdump
-		 * paths already have calls to disable_IO_APIC() in
-		 * one form or other. kexec jump path also need
-		 * one.
-		 */
-		disable_IO_APIC();
+	/*
+	 * We need to put APICs in legacy mode so that we can
+	 * get timer interrupts in second kernel.
+	 */
+	switch_to_legacy_irq_mode();
 #endif
-	}
 
 	control_page = page_address(image->control_code_page);
 	memcpy(control_page, relocate_kernel, KEXEC_CONTROL_CODE_MAX_SIZE);
diff --git a/arch/x86/kernel/machine_kexec_64.c b/arch/x86/kernel/machine_kexec_64.c
index 1f790cf9d38f..b5c0cbed6791 100644
--- a/arch/x86/kernel/machine_kexec_64.c
+++ b/arch/x86/kernel/machine_kexec_64.c
@@ -288,18 +288,13 @@ void machine_kexec(struct kimage *image)
 	local_irq_disable();
 	hw_breakpoint_disable();
 
-	if (image->preserve_context) {
 #ifdef CONFIG_X86_IO_APIC
-		/*
-		 * We need to put APICs in legacy mode so that we can
-		 * get timer interrupts in second kernel. kexec/kdump
-		 * paths already have calls to disable_IO_APIC() in
-		 * one form or other. kexec jump path also need
-		 * one.
-		 */
-		disable_IO_APIC();
+	/*
+	 * We need to put APICs in legacy mode so that we can
+	 * get timer interrupts in second kernel.
+	 */
+	switch_to_legacy_irq_mode();
 #endif
-	}
 
 	control_page = page_address(image->control_code_page) + PAGE_SIZE;
 	memcpy(control_page, relocate_kernel, KEXEC_CONTROL_CODE_MAX_SIZE);
diff --git a/arch/x86/kernel/reboot.c b/arch/x86/kernel/reboot.c
index 2126b9d27c34..b70cc0f38a29 100644
--- a/arch/x86/kernel/reboot.c
+++ b/arch/x86/kernel/reboot.c
@@ -666,7 +666,7 @@ void native_machine_shutdown(void)
 	 * Even without the erratum, it still makes sense to quiet IO APIC
 	 * before disabling Local APIC.
 	 */
-	disable_IO_APIC();
+	clear_IO_APIC();
 #endif
 
 #ifdef CONFIG_SMP
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v2 2/2] x86/apic: Set up through-local-APIC on boot CPU's LINT0 if 'noapic' specified
  2018-01-25 14:11 [PATCH v2 0/2] x86/apic/kexec: Legacy irq setting fixs in kdump kernel Baoquan He
  2018-01-25 14:11 ` [PATCH v2 1/2] x86/apic/kexec: Enable legacy irq mode before jump to kexec/kdump kernel Baoquan He
@ 2018-01-25 14:11 ` Baoquan He
  2018-02-02  4:04 ` [PATCH v2 0/2] x86/apic/kexec: Legacy irq setting fixs in kdump kernel Dou Liyang
  2 siblings, 0 replies; 8+ messages in thread
From: Baoquan He @ 2018-01-25 14:11 UTC (permalink / raw)
  To: linux-kernel
  Cc: ebiederm, tglx, mingo, hpa, x86, douly.fnst, joro, uobergfe,
	prarit, Baoquan He

Kdump kernel will become very slow if 'noapic' is specified. Normal kernel
won't.

In normal kernel the legacy irq mode is enabled in BIOS. If it is virtual
virtual wire mode, the local-APIC has been enabled and set as
through-local-APIC. Though specifying 'noapic' in normal kernel it still
have the virtual wire mode working. In kdump kernel, local-APIC is disabled
in crashed kernel before jump to kdump kernel since commit 522e664644
("x86/apic: Disable I/O APIC before shutdown of the local APIC"). So we
need set up through-local-APIC on boot CPU explicitly in setup_local_APIC()
if 'noapic' specified.

Do it now.

Signed-off-by: Baoquan He <bhe@redhat.com>
---
 arch/x86/kernel/apic/apic.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
index 25ddf02598d2..3fc259b4dd2d 100644
--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -1570,7 +1570,7 @@ void setup_local_APIC(void)
 	 * TODO: set up through-local-APIC from through-I/O-APIC? --macro
 	 */
 	value = apic_read(APIC_LVT0) & APIC_LVT_MASKED;
-	if (!cpu && (pic_mode || !value)) {
+	if (!cpu && (pic_mode || !value || skip_ioapic_setup)) {
 		value = APIC_DM_EXTINT;
 		apic_printk(APIC_VERBOSE, "enabled ExtINT on CPU#%d\n", cpu);
 	} else {
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH v2 0/2] x86/apic/kexec: Legacy irq setting fixs in kdump kernel
  2018-01-25 14:11 [PATCH v2 0/2] x86/apic/kexec: Legacy irq setting fixs in kdump kernel Baoquan He
  2018-01-25 14:11 ` [PATCH v2 1/2] x86/apic/kexec: Enable legacy irq mode before jump to kexec/kdump kernel Baoquan He
  2018-01-25 14:11 ` [PATCH v2 2/2] x86/apic: Set up through-local-APIC on boot CPU's LINT0 if 'noapic' specified Baoquan He
@ 2018-02-02  4:04 ` Dou Liyang
  2 siblings, 0 replies; 8+ messages in thread
From: Dou Liyang @ 2018-02-02  4:04 UTC (permalink / raw)
  To: Baoquan He, linux-kernel
  Cc: ebiederm, tglx, mingo, hpa, x86, joro, uobergfe, prarit

Hi Baoquan,

At 01/25/2018 10:11 PM, Baoquan He wrote:
> This is v2 post.
> 
> In v2, no code change, just improve change log of patch 1 and 2.
> 
> And drop the old patch 3 in v1, a clean up patch. The current
> x86_io_apic_ops.disable() hook is still needed by irq remapping.
> 
> Baoquan He (2):

Test these patches in both 32bit and 64bit of x86 arches. feel free to
add my Tested-by.

Tested-by: Dou Liyang <douly.fnst@cn.fujitsu.com>

Thanks,
	dou.

>    x86/apic/kexec: Enable legacy irq mode before jump to kexec/kdump
>      kernel
>    x86/apic: Set up through-local-APIC on boot CPU's LINT0 if 'noapic'
>      specified
> 
>   arch/x86/include/asm/io_apic.h     |  3 ++-
>   arch/x86/kernel/apic/apic.c        |  2 +-
>   arch/x86/kernel/apic/io_apic.c     | 12 ++++--------
>   arch/x86/kernel/crash.c            |  2 +-
>   arch/x86/kernel/machine_kexec_32.c | 15 +++++----------
>   arch/x86/kernel/machine_kexec_64.c | 15 +++++----------
>   arch/x86/kernel/reboot.c           |  2 +-
>   7 files changed, 19 insertions(+), 32 deletions(-)
> 

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2 1/2] x86/apic/kexec: Enable legacy irq mode before jump to kexec/kdump kernel
  2018-01-25 14:11 ` [PATCH v2 1/2] x86/apic/kexec: Enable legacy irq mode before jump to kexec/kdump kernel Baoquan He
@ 2018-02-07 17:11   ` Eric W. Biederman
  2018-02-07 17:35     ` Eric W. Biederman
  0 siblings, 1 reply; 8+ messages in thread
From: Eric W. Biederman @ 2018-02-07 17:11 UTC (permalink / raw)
  To: Baoquan He
  Cc: linux-kernel, tglx, mingo, hpa, x86, douly.fnst, joro, uobergfe, prarit

Baoquan He <bhe@redhat.com> writes:

> On kvm guest, kernel always prints warning during kdump kernel boots as
> below.
>
> [    0.001000] WARNING: CPU: 0 PID: 0 at arch/x86/kernel/apic/apic.c:1467 setup_local_APIC+0x228/0x330
> [    0.001000] Modules linked in:
> [    0.001000] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.15.0-rc5+ #3
> [    0.001000] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1.fc26 04/01/2014
> [    0.001000] RIP: 0010:setup_local_APIC+0x228/0x330
> [    0.001000] RSP: 0000:ffffffffb6e03eb8 EFLAGS: 00010286
> [    0.001000] RAX: 0000009edb4c4d84 RBX: 0000000000000000 RCX: 00000000b099d800
> [    0.001000] RDX: 0000009e00000000 RSI: 0000000000000000 RDI: 0000000000000810
> [    0.001000] RBP: 0000000000000000 R08: ffffffffffffffff R09: 0000000000000001
> [    0.001000] R10: ffff98ce6a801c00 R11: 0761076d072f0776 R12: 0000000000000001
> [    0.001000] R13: 00000000000000f0 R14: 0000000000004000 R15: ffffffffffffc6ff
> [    0.001000] FS:  0000000000000000(0000) GS:ffff98ce6bc00000(0000) knlGS:0000000000000000
> [    0.001000] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [    0.001000] CR2: 00000000ffffffff CR3: 0000000022209000 CR4: 00000000000406b0
> [    0.001000] Call Trace:
> [    0.001000]  apic_bsp_setup+0x56/0x74
> [    0.001000]  x86_late_time_init+0x11/0x16
> [    0.001000]  start_kernel+0x3c9/0x486
> [    0.001000]  secondary_startup_64+0xa5/0xb0
> [    0.001000] Code: 00 85 c9 74 2d 0f 31 c1 e1 0a 48 c1 e2 20 41 89 cf 4c 03 7c 24 08 48 09 d0 49 29 c7 4c 89 3c 24 48 83 3c 24 00 0f 8f 8f fe ff
> ff <0f> ff e9 10 ff ff ff 48 83 2c 24 01 eb e7 48 83 c4 18 5b 5d 41
> [    0.001000] ---[ end trace b88e71b9a6ebebdd ]---
> [    0.001000] masked ExtINT on CPU#0
>
> The root cause is the legacy irq mode is disabled before jump to kexec/kdump
> kernel since commit 522e66464467 ("x86/apic: Disable I/O APIC before shutdown
> of the local APIC"). In that commit, lapic_shutdown() calling was moved after
> disable_IO_APIC(). In fact in disable_IO_APIC(), it not only calls
> clear_IO_APIC() to disable IO-APIC, and also sets LAPIC and IO-APIC to make
> system be PIC or Virtual wire mode. Hence local APIC is disabled completely
> by the calling of lapic_shutdown().

The actions of lapic_shutdown do not depend on the actions of
disable_IO_APIC so this description and justificaiton are nonsense.

Further we don't hardware disable the local APIC except when we hardware
enable it.  And only on 32bit at that.

I keep wondering if the above oops is due to an emulation bug in kvm.
If that is the case it might be better to fix kvm.

> Later in kdump kernel, when calling setup_local_APIC(), the
> 'do { xxx } while (queued && max_loops > 0)' loop does not function well any
> more if pending irq exists in APIC IRR since LAPIC is disabled. The do while
> loop will terminate finally when max_loops overflows by subtraction. Then,
> next WARN_ON(max_loops <= 0) is triggered.

That sounds like what needs to be fixed.  That loop in the kernel
startup.  Ideally the linux kernel will be as robust as possible during
startup especially for the kdump case.

> In normal kernel it defaults to be PIC mode or Virtual Wire mode which is
> done by BIOS. But kexec/kdump kernel won't go through BIOS, we need set
> system as PIC or Virtual Wire mode before jump to kdump kernel code directly.
> With this the pending irq can be handled correclty before APIC mode
> enabling.

Nope that is not the fix.

> So take clear_IO_APIC out of disable_IO_APIC, and rename disable_IO_APIC
> as switch_to_legacy_irq_mode. Then only call clear_IO_APIC when IO-APIC
> need be disabled. And call switch_to_legacy_irq_mode before kexec/kdump
> jumping.

All I can see your code change accomplishing is moving the entry into
legacy irq mode later and thus somehow avoiding the generating of the
interrupts that are problematic at startup.


As for the change itself it is incredibly ugly.  It extends a hack
for the KEXEC_JUMP path into the default case.  When we should probably
be removing the KEXEC_JUMP code entirely.

KEXEC_JUMP was a very nice idea but I don't believe anyone except for
the developer actually used KEXEC_JUMP.  Certainly KEXEC_JUMP needs some
aditional love if it is going to be used long term.  If we were to port
suspend to ram to KEXEC_JUMP it would be worth preserving.

I do think the ordering in native_machine_shutdown between
disable_IO_APIC and local_irq_disable is a bit fishy.  So you might be
able to play with that and get some improvement.

For the kdump we definitely need to revisit:
3d1675b41b02 ("[PATCH] i386 kexec-on-panic: Don't shutdown the apics.")
And just not touch the apics at all.

If the kernel boot path is ready that would be an ideal way to solve
things.

Additionaly for any BIOS that does not trigger a hardware reset this
change which removes the return to legacy mode on the normal reboot path
is likely to break the reboot path.

Eric


> Signed-off-by: Baoquan He <bhe@redhat.com>
> ---
>  arch/x86/include/asm/io_apic.h     |  3 ++-
>  arch/x86/kernel/apic/io_apic.c     | 12 ++++--------
>  arch/x86/kernel/crash.c            |  2 +-
>  arch/x86/kernel/machine_kexec_32.c | 15 +++++----------
>  arch/x86/kernel/machine_kexec_64.c | 15 +++++----------
>  arch/x86/kernel/reboot.c           |  2 +-
>  6 files changed, 18 insertions(+), 31 deletions(-)
>
> diff --git a/arch/x86/include/asm/io_apic.h b/arch/x86/include/asm/io_apic.h
> index a8834dd546cd..e38ad3863a2c 100644
> --- a/arch/x86/include/asm/io_apic.h
> +++ b/arch/x86/include/asm/io_apic.h
> @@ -192,7 +192,8 @@ static inline unsigned int io_apic_read(unsigned int apic, unsigned int reg)
>  
>  extern void setup_IO_APIC(void);
>  extern void enable_IO_APIC(void);
> -extern void disable_IO_APIC(void);
> +extern void clear_IO_APIC (void);
> +extern void switch_to_legacy_irq_mode(void);
>  extern int IO_APIC_get_PCI_irq_vector(int bus, int devfn, int pin);
>  extern void print_IO_APICs(void);
>  #else  /* !CONFIG_X86_IO_APIC */
> diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
> index 8a7963421460..a47aa915d18c 100644
> --- a/arch/x86/kernel/apic/io_apic.c
> +++ b/arch/x86/kernel/apic/io_apic.c
> @@ -587,7 +587,7 @@ static void clear_IO_APIC_pin(unsigned int apic, unsigned int pin)
>  		       mpc_ioapic_id(apic), pin);
>  }
>  
> -static void clear_IO_APIC (void)
> +void clear_IO_APIC (void)
>  {
>  	int apic, pin;
>  
> @@ -1439,15 +1439,11 @@ void native_disable_io_apic(void)
>  }
>  
>  /*
> - * Not an __init, needed by the reboot code
> + * Not an __init, needed by kexec/kdump code.
> + * For safety IO-APIC and Local APIC need be cleared before this.
>   */
> -void disable_IO_APIC(void)
> +void switch_to_legacy_irq_mode(void)
>  {
> -	/*
> -	 * Clear the IO-APIC before rebooting:
> -	 */
> -	clear_IO_APIC();
> -
>  	if (!nr_legacy_irqs())
>  		return;
>  
> diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c
> index 10e74d4778a1..318ffeaaf55a 100644
> --- a/arch/x86/kernel/crash.c
> +++ b/arch/x86/kernel/crash.c
> @@ -199,7 +199,7 @@ void native_machine_crash_shutdown(struct pt_regs *regs)
>  #ifdef CONFIG_X86_IO_APIC
>  	/* Prevent crash_kexec() from deadlocking on ioapic_lock. */
>  	ioapic_zap_locks();
> -	disable_IO_APIC();
> +	clear_IO_APIC();
>  #endif
>  	lapic_shutdown();
>  #ifdef CONFIG_HPET_TIMER
> diff --git a/arch/x86/kernel/machine_kexec_32.c b/arch/x86/kernel/machine_kexec_32.c
> index edfede768688..7ab10d930cc6 100644
> --- a/arch/x86/kernel/machine_kexec_32.c
> +++ b/arch/x86/kernel/machine_kexec_32.c
> @@ -190,18 +190,13 @@ void machine_kexec(struct kimage *image)
>  	local_irq_disable();
>  	hw_breakpoint_disable();
>  
> -	if (image->preserve_context) {
>  #ifdef CONFIG_X86_IO_APIC
> -		/*
> -		 * We need to put APICs in legacy mode so that we can
> -		 * get timer interrupts in second kernel. kexec/kdump
> -		 * paths already have calls to disable_IO_APIC() in
> -		 * one form or other. kexec jump path also need
> -		 * one.
> -		 */
> -		disable_IO_APIC();
> +	/*
> +	 * We need to put APICs in legacy mode so that we can
> +	 * get timer interrupts in second kernel.
> +	 */
> +	switch_to_legacy_irq_mode();
>  #endif
> -	}
>  
>  	control_page = page_address(image->control_code_page);
>  	memcpy(control_page, relocate_kernel, KEXEC_CONTROL_CODE_MAX_SIZE);
> diff --git a/arch/x86/kernel/machine_kexec_64.c b/arch/x86/kernel/machine_kexec_64.c
> index 1f790cf9d38f..b5c0cbed6791 100644
> --- a/arch/x86/kernel/machine_kexec_64.c
> +++ b/arch/x86/kernel/machine_kexec_64.c
> @@ -288,18 +288,13 @@ void machine_kexec(struct kimage *image)
>  	local_irq_disable();
>  	hw_breakpoint_disable();
>  
> -	if (image->preserve_context) {
>  #ifdef CONFIG_X86_IO_APIC
> -		/*
> -		 * We need to put APICs in legacy mode so that we can
> -		 * get timer interrupts in second kernel. kexec/kdump
> -		 * paths already have calls to disable_IO_APIC() in
> -		 * one form or other. kexec jump path also need
> -		 * one.
> -		 */
> -		disable_IO_APIC();
> +	/*
> +	 * We need to put APICs in legacy mode so that we can
> +	 * get timer interrupts in second kernel.
> +	 */
> +	switch_to_legacy_irq_mode();
>  #endif
> -	}
>  
>  	control_page = page_address(image->control_code_page) + PAGE_SIZE;
>  	memcpy(control_page, relocate_kernel, KEXEC_CONTROL_CODE_MAX_SIZE);
> diff --git a/arch/x86/kernel/reboot.c b/arch/x86/kernel/reboot.c
> index 2126b9d27c34..b70cc0f38a29 100644
> --- a/arch/x86/kernel/reboot.c
> +++ b/arch/x86/kernel/reboot.c
> @@ -666,7 +666,7 @@ void native_machine_shutdown(void)
>  	 * Even without the erratum, it still makes sense to quiet IO APIC
>  	 * before disabling Local APIC.
>  	 */
> -	disable_IO_APIC();
> +	clear_IO_APIC();
>  #endif
>  
>  #ifdef CONFIG_SMP

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2 1/2] x86/apic/kexec: Enable legacy irq mode before jump to kexec/kdump kernel
  2018-02-07 17:11   ` Eric W. Biederman
@ 2018-02-07 17:35     ` Eric W. Biederman
  2018-02-07 19:48       ` Eric W. Biederman
  0 siblings, 1 reply; 8+ messages in thread
From: Eric W. Biederman @ 2018-02-07 17:35 UTC (permalink / raw)
  To: Baoquan He
  Cc: linux-kernel, tglx, mingo, hpa, x86, douly.fnst, joro, uobergfe, prarit

ebiederm@xmission.com (Eric W. Biederman) writes:

> Baoquan He <bhe@redhat.com> writes:
>
>> On kvm guest, kernel always prints warning during kdump kernel boots as
>> below.
>>
>> [    0.001000] WARNING: CPU: 0 PID: 0 at arch/x86/kernel/apic/apic.c:1467 setup_local_APIC+0x228/0x330
>> [    0.001000] Modules linked in:
>> [    0.001000] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.15.0-rc5+ #3
>> [    0.001000] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1.fc26 04/01/2014
>> [    0.001000] RIP: 0010:setup_local_APIC+0x228/0x330
>> [    0.001000] RSP: 0000:ffffffffb6e03eb8 EFLAGS: 00010286
>> [    0.001000] RAX: 0000009edb4c4d84 RBX: 0000000000000000 RCX: 00000000b099d800
>> [    0.001000] RDX: 0000009e00000000 RSI: 0000000000000000 RDI: 0000000000000810
>> [    0.001000] RBP: 0000000000000000 R08: ffffffffffffffff R09: 0000000000000001
>> [    0.001000] R10: ffff98ce6a801c00 R11: 0761076d072f0776 R12: 0000000000000001
>> [    0.001000] R13: 00000000000000f0 R14: 0000000000004000 R15: ffffffffffffc6ff
>> [    0.001000] FS:  0000000000000000(0000) GS:ffff98ce6bc00000(0000) knlGS:0000000000000000
>> [    0.001000] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [    0.001000] CR2: 00000000ffffffff CR3: 0000000022209000 CR4: 00000000000406b0
>> [    0.001000] Call Trace:
>> [    0.001000]  apic_bsp_setup+0x56/0x74
>> [    0.001000]  x86_late_time_init+0x11/0x16
>> [    0.001000]  start_kernel+0x3c9/0x486
>> [    0.001000]  secondary_startup_64+0xa5/0xb0
>> [    0.001000] Code: 00 85 c9 74 2d 0f 31 c1 e1 0a 48 c1 e2 20 41 89 cf 4c 03 7c 24 08 48 09 d0 49 29 c7 4c 89 3c 24 48 83 3c 24 00 0f 8f 8f fe ff
>> ff <0f> ff e9 10 ff ff ff 48 83 2c 24 01 eb e7 48 83 c4 18 5b 5d 41
>> [    0.001000] ---[ end trace b88e71b9a6ebebdd ]---
>> [    0.001000] masked ExtINT on CPU#0
>>
>> The root cause is the legacy irq mode is disabled before jump to kexec/kdump
>> kernel since commit 522e66464467 ("x86/apic: Disable I/O APIC before shutdown
>> of the local APIC"). In that commit, lapic_shutdown() calling was moved after
>> disable_IO_APIC(). In fact in disable_IO_APIC(), it not only calls
>> clear_IO_APIC() to disable IO-APIC, and also sets LAPIC and IO-APIC to make
>> system be PIC or Virtual wire mode. Hence local APIC is disabled completely
>> by the calling of lapic_shutdown().
>
> The actions of lapic_shutdown do not depend on the actions of
> disable_IO_APIC so this description and justificaiton are nonsense.
>
> Further we don't hardware disable the local APIC except when we hardware
> enable it.  And only on 32bit at that.
>
> I keep wondering if the above oops is due to an emulation bug in kvm.
> If that is the case it might be better to fix kvm.

Sigh.  Reading a little deeper I see where the local apic is affected.
It is the work of disconnect_bsp_APIC called from disable_IO_APIC.

Calling lapic_shutdown (which clears the local apic) after the local
apic has been placed into virtual wire mode would indeed be a problem.

Now that I see that I agree in essence with this patch series.
I don't agree with the implemenation details.

Can you please split disable_IO_APIC and switch_to_legacy_irq_mode
in a single patch.

In a second patch just perform the code motion and place
switch_to_legacy_irq_mode after lapic_shutdown() where disable_IO_APIC
used to be.

If you do just that the code will make much more sense and will be a
candidate for backporting to stable.  As it is a fix for an old
regression.

With a patch title that restores the ordering and achieves this affect
something like: "Fix enabling legacy irq mode in reboot and kexec/kdump"

You subject line above makes it sounds like enabling legacy irq mode is
something new when in fact it is what the code has been trying to do all
along.  All that happened is that a bug slipped in, and you are fixing
it.

Eric

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2 1/2] x86/apic/kexec: Enable legacy irq mode before jump to kexec/kdump kernel
  2018-02-07 17:35     ` Eric W. Biederman
@ 2018-02-07 19:48       ` Eric W. Biederman
  2018-02-08 12:34         ` Baoquan He
  0 siblings, 1 reply; 8+ messages in thread
From: Eric W. Biederman @ 2018-02-07 19:48 UTC (permalink / raw)
  To: Baoquan He
  Cc: linux-kernel, tglx, mingo, hpa, x86, douly.fnst, joro, uobergfe, prarit

ebiederm@xmission.com (Eric W. Biederman) writes:

> ebiederm@xmission.com (Eric W. Biederman) writes:
>
>> Baoquan He <bhe@redhat.com> writes:
>>
>>> On kvm guest, kernel always prints warning during kdump kernel boots as
>>> below.
>>>
>>> [    0.001000] WARNING: CPU: 0 PID: 0 at arch/x86/kernel/apic/apic.c:1467 setup_local_APIC+0x228/0x330
>>> [    0.001000] Modules linked in:
>>> [    0.001000] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.15.0-rc5+ #3
>>> [    0.001000] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1.fc26 04/01/2014
>>> [    0.001000] RIP: 0010:setup_local_APIC+0x228/0x330
>>> [    0.001000] RSP: 0000:ffffffffb6e03eb8 EFLAGS: 00010286
>>> [    0.001000] RAX: 0000009edb4c4d84 RBX: 0000000000000000 RCX: 00000000b099d800
>>> [    0.001000] RDX: 0000009e00000000 RSI: 0000000000000000 RDI: 0000000000000810
>>> [    0.001000] RBP: 0000000000000000 R08: ffffffffffffffff R09: 0000000000000001
>>> [    0.001000] R10: ffff98ce6a801c00 R11: 0761076d072f0776 R12: 0000000000000001
>>> [    0.001000] R13: 00000000000000f0 R14: 0000000000004000 R15: ffffffffffffc6ff
>>> [    0.001000] FS:  0000000000000000(0000) GS:ffff98ce6bc00000(0000) knlGS:0000000000000000
>>> [    0.001000] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>> [    0.001000] CR2: 00000000ffffffff CR3: 0000000022209000 CR4: 00000000000406b0
>>> [    0.001000] Call Trace:
>>> [    0.001000]  apic_bsp_setup+0x56/0x74
>>> [    0.001000]  x86_late_time_init+0x11/0x16
>>> [    0.001000]  start_kernel+0x3c9/0x486
>>> [    0.001000]  secondary_startup_64+0xa5/0xb0
>>> [    0.001000] Code: 00 85 c9 74 2d 0f 31 c1 e1 0a 48 c1 e2 20 41 89 cf 4c 03 7c 24 08 48 09 d0 49 29 c7 4c 89 3c 24 48 83 3c 24 00 0f 8f 8f fe ff
>>> ff <0f> ff e9 10 ff ff ff 48 83 2c 24 01 eb e7 48 83 c4 18 5b 5d 41
>>> [    0.001000] ---[ end trace b88e71b9a6ebebdd ]---
>>> [    0.001000] masked ExtINT on CPU#0
>>>
>>> The root cause is the legacy irq mode is disabled before jump to kexec/kdump
>>> kernel since commit 522e66464467 ("x86/apic: Disable I/O APIC before shutdown
>>> of the local APIC"). In that commit, lapic_shutdown() calling was moved after
>>> disable_IO_APIC(). In fact in disable_IO_APIC(), it not only calls
>>> clear_IO_APIC() to disable IO-APIC, and also sets LAPIC and IO-APIC to make
>>> system be PIC or Virtual wire mode. Hence local APIC is disabled completely
>>> by the calling of lapic_shutdown().
>>
>> The actions of lapic_shutdown do not depend on the actions of
>> disable_IO_APIC so this description and justificaiton are nonsense.
>>
>> Further we don't hardware disable the local APIC except when we hardware
>> enable it.  And only on 32bit at that.
>>
>> I keep wondering if the above oops is due to an emulation bug in kvm.
>> If that is the case it might be better to fix kvm.
>
> Sigh.  Reading a little deeper I see where the local apic is affected.
> It is the work of disconnect_bsp_APIC called from disable_IO_APIC.
>
> Calling lapic_shutdown (which clears the local apic) after the local
> apic has been placed into virtual wire mode would indeed be a problem.
>
> Now that I see that I agree in essence with this patch series.
> I don't agree with the implemenation details.
>
> Can you please split disable_IO_APIC and switch_to_legacy_irq_mode
> in a single patch.

Now that I think about it can you call the function not
switch_to_legacy_irq_mode but restore_boot_irq_mode.  And the
corresponding x86_io_apic_ops not .disable but .restore.

That should make the purpose of the code clearer, and help avoid
mistakes like the one that led to this regression.

Eric

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2 1/2] x86/apic/kexec: Enable legacy irq mode before jump to kexec/kdump kernel
  2018-02-07 19:48       ` Eric W. Biederman
@ 2018-02-08 12:34         ` Baoquan He
  0 siblings, 0 replies; 8+ messages in thread
From: Baoquan He @ 2018-02-08 12:34 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: linux-kernel, tglx, mingo, hpa, x86, douly.fnst, joro, uobergfe, prarit

On 02/07/18 at 01:48pm, Eric W. Biederman wrote:
> ebiederm@xmission.com (Eric W. Biederman) writes:

> > Now that I see that I agree in essence with this patch series.
> > I don't agree with the implemenation details.
> >
> > Can you please split disable_IO_APIC and switch_to_legacy_irq_mode
> > in a single patch.
> 
> Now that I think about it can you call the function not
> switch_to_legacy_irq_mode but restore_boot_irq_mode.  And the
> corresponding x86_io_apic_ops not .disable but .restore.
> 
> That should make the purpose of the code clearer, and help avoid
> mistakes like the one that led to this regression.

Thanks for your comments and great suggestion. Have made change as you
suggested, will post after test passed.

Thanks
Baoquan

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2018-02-08 12:34 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-01-25 14:11 [PATCH v2 0/2] x86/apic/kexec: Legacy irq setting fixs in kdump kernel Baoquan He
2018-01-25 14:11 ` [PATCH v2 1/2] x86/apic/kexec: Enable legacy irq mode before jump to kexec/kdump kernel Baoquan He
2018-02-07 17:11   ` Eric W. Biederman
2018-02-07 17:35     ` Eric W. Biederman
2018-02-07 19:48       ` Eric W. Biederman
2018-02-08 12:34         ` Baoquan He
2018-01-25 14:11 ` [PATCH v2 2/2] x86/apic: Set up through-local-APIC on boot CPU's LINT0 if 'noapic' specified Baoquan He
2018-02-02  4:04 ` [PATCH v2 0/2] x86/apic/kexec: Legacy irq setting fixs in kdump kernel Dou Liyang

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.