* [BUG 4.2-rc8] Interrupt occurs while apply_alternatives() is patching the handler
@ 2015-08-31 2:37 Chuck Ebbert
2015-09-01 6:20 ` Richard W.M. Jones
0 siblings, 1 reply; 15+ messages in thread
From: Chuck Ebbert @ 2015-08-31 2:37 UTC (permalink / raw)
To: linux-kernel; +Cc: x86, Thomas Gleixner, Ingo Molnar, H. Peter Anvin
This is from https://bugzilla.redhat.com/show_bug.cgi?id=1258223
[ 0.036000] BUG: unable to handle kernel paging request at 55501e06
[ 0.036000] IP: [<c0aae48b>] common_interrupt+0xb/0x38
[ 0.036000] *pde = 00000000
[ 0.036000] Oops: 0000 [#1] SMP
[ 0.036000] Modules linked in:
[ 0.036000] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.2.0-0.rc8.git3.1.fc24.i686 #1
[ 0.036000] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.8.2-20150714_191134- 04/01/2014
[ 0.036000] task: c0d49ac0 ti: c0d42000 task.ti: c0d42000
[ 0.036000] EIP: 0060:[<c0aae48b>] EFLAGS: 00200046 CPU: 0
[ 0.036000] EIP is at common_interrupt+0xb/0x38
[ 0.036000] EAX: c0aae480 EBX: 0000008d ECX: c0ab1c83 EDX: e4af6810
[ 0.036000] ESI: 029a7802 EDI: 00000003 EBP: c0d43e68 ESP: c0d43e44
[ 0.036000] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
[ 0.036000] CR0: 8005003b CR2: 55501e06 CR3: 00ebd000 CR4: 00000690
[ 0.036000] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[ 0.036000] DR6: 00000000 DR7: 00000000
[ 0.036000] Stack:
[ 0.036000] 0000004f c0409c80 00000060 00200202 00200046 c0d43e60 c0ea150c 029a7802
[ 0.036000] 00000000 c0d43fb8 c040a054 c07f1cf0 6c0a1000 ffff0006 00200046 00000043
[ 0.036000] c0ed0bc0 00000000 c0d43e98 c071a6fc c0d43ea8 c0d43ec4 c0ea4c73 c0ea4c7f
[ 0.036000] Call Trace:
[ 0.036000] [<c0409c80>] ? add_nops+0x90/0xa0
[ 0.036000] [<c040a054>] apply_alternatives+0x274/0x630
[ 0.036000] [<c07f1cf0>] ? wait_for_xmitr+0xa0/0xa0
[ 0.036000] [<c071a6fc>] ? sprintf+0x1c/0x20
[ 0.036000] [<c0aae480>] ? irq_entries_start+0x698/0x698
[ 0.036000] [<c071be4b>] ? memcpy+0xb/0x30
[ 0.036000] [<c07f3950>] ? serial8250_set_termios+0x20/0x20
[ 0.036000] [<c0aad4e3>] ? _raw_write_unlock_irqrestore+0x13/0x20
[ 0.036000] [<c0aad4e3>] ? _raw_write_unlock_irqrestore+0x13/0x20
[ 0.036000] [<c0aad4fd>] ? _raw_spin_unlock_irqrestore+0xd/0x10
[ 0.036000] [<c04b17b9>] ? console_unlock+0x2e9/0x610
[ 0.036000] [<c04b03cd>] ? log_store+0x1cd/0x210
[ 0.036000] [<c04b1d7e>] ? vprintk_emit+0x29e/0x570
[ 0.036000] [<c04b21e1>] ? vprintk_default+0x41/0x60
[ 0.036000] [<c0aa7725>] ? printk+0x17/0x19
[ 0.036000] [<c0dfdd48>] ? identify_boot_cpu+0x7b/0x80
[ 0.036000] [<c0dfca47>] alternative_instructions+0x17/0xc1
[ 0.036000] [<c0dfdda9>] check_bugs+0x32/0x39
[ 0.036000] [<c0df6b57>] start_kernel+0x3ca/0x40a
[ 0.036000] [<c0df62e3>] i386_start_kernel+0x91/0x95
[ 0.036000] Code: 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 8d 90 90 83 04 24 80 fc 0f a8 0f <a0> 06 1e 50 55 57 56 52 51 53 ba 7b 00 00 00 8e da 8e c2 ba d8
0: 8d 90 90 83 04 24 lea 0x24048390(%eax),%edx
6: 80 fc 0f cmp $0xf,%ah
9: a8 0f test $0xf,%al
>> b: a0 06 1e 50 55 mov 0x55501e06,%al
10: 57 push %edi
11: 56 push %esi
Interrupt 0x30 occurred while the alternatives code was replacing the
initial 0x90,0x90,0x90 NOPs (from the ASM_CLAC macro) with the optimized
version, 0x8d,0x76,0x00. Only the first byte has been replaced so far,
and it makes a mess out of the insn decoding.
I have no clue how to fix this.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [BUG 4.2-rc8] Interrupt occurs while apply_alternatives() is patching the handler 2015-08-31 2:37 [BUG 4.2-rc8] Interrupt occurs while apply_alternatives() is patching the handler Chuck Ebbert @ 2015-09-01 6:20 ` Richard W.M. Jones 2015-09-02 9:11 ` Thomas Gleixner 0 siblings, 1 reply; 15+ messages in thread From: Richard W.M. Jones @ 2015-09-01 6:20 UTC (permalink / raw) To: Chuck Ebbert Cc: linux-kernel, x86, Thomas Gleixner, Ingo Molnar, H. Peter Anvin On Sun, Aug 30, 2015 at 10:37:57PM -0400, Chuck Ebbert wrote: > This is from https://bugzilla.redhat.com/show_bug.cgi?id=1258223 > > [ 0.036000] BUG: unable to handle kernel paging request at 55501e06 [...] > [ 0.036000] [<c0409c80>] ? add_nops+0x90/0xa0 > [ 0.036000] [<c040a054>] apply_alternatives+0x274/0x630 > [ 0.036000] [<c07f1cf0>] ? wait_for_xmitr+0xa0/0xa0 > [ 0.036000] [<c071a6fc>] ? sprintf+0x1c/0x20 > [ 0.036000] [<c0aae480>] ? irq_entries_start+0x698/0x698 > [ 0.036000] [<c071be4b>] ? memcpy+0xb/0x30 > [ 0.036000] [<c07f3950>] ? serial8250_set_termios+0x20/0x20 [...] > Interrupt 0x30 occurred while the alternatives code was replacing the > initial 0x90,0x90,0x90 NOPs (from the ASM_CLAC macro) with the optimized > version, 0x8d,0x76,0x00. Only the first byte has been replaced so far, > and it makes a mess out of the insn decoding. Chuck, thanks for reporting this. I have only been able to reproduce this so far using qemu and TCG (not KVM) which of course raises a range of questions: could it be a qemu bug or a TCG bug? Could it be that an atomic op is not correctly implemented by qemu? I will keep trying on KVM. Because I don't have a convenient server with 32 bit kernel and a serial port that I can reboot thousands of times, I have not tried to reproduce on baremetal yet. Here's how to reproduce it. (The host can be x86-64) (1) Grab the 32 bit Fedora kernel we are using from https://kojipkgs.fedoraproject.org//packages/kernel/4.2.0/1.fc24/i686/kernel-core-4.2.0-1.fc24.i686.rpm (from http://koji.fedoraproject.org/koji/buildinfo?buildID=681723) (2) Unpack it to extract vmlinuz: cd /tmp rpm2cpio /mnt/scratch/kernel-core-4.2.0-1.fc24.i686.rpm | cpio -id cp ./lib/modules/4.2.0-1.fc24.i686/vmlinuz . (3) Boot the kernel under qemu/KVM. The following single line command repeatedly boots the kernel until the bug is hit: while qemu-system-x86_64 -nographic -no-reboot -M accel=kvm:tcg -kernel vmlinuz -append 'console=ttyS0 panic=1' -serial stdio -monitor none >& log; ! grep add_nops log; do echo -n .; done It takes many iterations (100s with TCG) to hit the bug. Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com virt-p2v converts physical machines to virtual machines. Boot with a live CD or over the network (PXE) and turn machines into KVM guests. http://libguestfs.org/virt-v2v ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [BUG 4.2-rc8] Interrupt occurs while apply_alternatives() is patching the handler 2015-09-01 6:20 ` Richard W.M. Jones @ 2015-09-02 9:11 ` Thomas Gleixner 2015-09-02 19:05 ` Richard W.M. Jones 2015-09-03 8:50 ` Borislav Petkov 0 siblings, 2 replies; 15+ messages in thread From: Thomas Gleixner @ 2015-09-02 9:11 UTC (permalink / raw) To: Richard W.M. Jones Cc: Chuck Ebbert, linux-kernel, x86, Ingo Molnar, H. Peter Anvin On Tue, 1 Sep 2015, Richard W.M. Jones wrote: > On Sun, Aug 30, 2015 at 10:37:57PM -0400, Chuck Ebbert wrote: > > This is from https://bugzilla.redhat.com/show_bug.cgi?id=1258223 > > > > [ 0.036000] BUG: unable to handle kernel paging request at 55501e06 > [...] > > [ 0.036000] [<c0409c80>] ? add_nops+0x90/0xa0 > > [ 0.036000] [<c040a054>] apply_alternatives+0x274/0x630 > > [ 0.036000] [<c07f1cf0>] ? wait_for_xmitr+0xa0/0xa0 > > [ 0.036000] [<c071a6fc>] ? sprintf+0x1c/0x20 > > [ 0.036000] [<c0aae480>] ? irq_entries_start+0x698/0x698 > > [ 0.036000] [<c071be4b>] ? memcpy+0xb/0x30 > > [ 0.036000] [<c07f3950>] ? serial8250_set_termios+0x20/0x20 > [...] > > Interrupt 0x30 occurred while the alternatives code was replacing the > > initial 0x90,0x90,0x90 NOPs (from the ASM_CLAC macro) with the optimized > > version, 0x8d,0x76,0x00. Only the first byte has been replaced so far, > > and it makes a mess out of the insn decoding. apply_alternatives() has two ways to modify the code: 1) text_poke_early() 2) optimize_nops() The former disables interrupts, the latter not. The patch below should fix the issue. Thanks, tglx diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c index c42827eb86cf..6a2f93e029f4 100644 --- a/arch/x86/kernel/alternative.c +++ b/arch/x86/kernel/alternative.c @@ -257,6 +257,9 @@ void __init arch_init_ideal_nops(void) /* Use this to add nops to a buffer, then text_poke the whole buffer. */ static void __init_or_module add_nops(void *insns, unsigned int len) { + unsigned long flags; + + local_irq_save(flags); while (len > 0) { unsigned int noplen = len; if (noplen > ASM_NOP_MAX) @@ -265,6 +268,7 @@ static void __init_or_module add_nops(void *insns, unsigned int len) insns += noplen; len -= noplen; } + local_irq_restore(flags); } extern struct alt_instr __alt_instructions[], __alt_instructions_end[]; ^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [BUG 4.2-rc8] Interrupt occurs while apply_alternatives() is patching the handler 2015-09-02 9:11 ` Thomas Gleixner @ 2015-09-02 19:05 ` Richard W.M. Jones 2015-09-03 7:53 ` Richard W.M. Jones 2015-09-03 8:50 ` Borislav Petkov 1 sibling, 1 reply; 15+ messages in thread From: Richard W.M. Jones @ 2015-09-02 19:05 UTC (permalink / raw) To: Thomas Gleixner Cc: Chuck Ebbert, linux-kernel, x86, Ingo Molnar, H. Peter Anvin On Wed, Sep 02, 2015 at 11:11:55AM +0200, Thomas Gleixner wrote: > On Tue, 1 Sep 2015, Richard W.M. Jones wrote: > > On Sun, Aug 30, 2015 at 10:37:57PM -0400, Chuck Ebbert wrote: > > > This is from https://bugzilla.redhat.com/show_bug.cgi?id=1258223 > > > > > > [ 0.036000] BUG: unable to handle kernel paging request at 55501e06 > > [...] > > > [ 0.036000] [<c0409c80>] ? add_nops+0x90/0xa0 > > > [ 0.036000] [<c040a054>] apply_alternatives+0x274/0x630 > > > [ 0.036000] [<c07f1cf0>] ? wait_for_xmitr+0xa0/0xa0 > > > [ 0.036000] [<c071a6fc>] ? sprintf+0x1c/0x20 > > > [ 0.036000] [<c0aae480>] ? irq_entries_start+0x698/0x698 > > > [ 0.036000] [<c071be4b>] ? memcpy+0xb/0x30 > > > [ 0.036000] [<c07f3950>] ? serial8250_set_termios+0x20/0x20 > > [...] > > > Interrupt 0x30 occurred while the alternatives code was replacing the > > > initial 0x90,0x90,0x90 NOPs (from the ASM_CLAC macro) with the optimized > > > version, 0x8d,0x76,0x00. Only the first byte has been replaced so far, > > > and it makes a mess out of the insn decoding. > > apply_alternatives() has two ways to modify the code: > > 1) text_poke_early() > > 2) optimize_nops() > > The former disables interrupts, the latter not. The patch below should > fix the issue. It has gone through about 1100 iterations so far without hitting the bug. I'll leave it running overnight. Rich. > Thanks, > > tglx > > diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c > index c42827eb86cf..6a2f93e029f4 100644 > --- a/arch/x86/kernel/alternative.c > +++ b/arch/x86/kernel/alternative.c > @@ -257,6 +257,9 @@ void __init arch_init_ideal_nops(void) > /* Use this to add nops to a buffer, then text_poke the whole buffer. */ > static void __init_or_module add_nops(void *insns, unsigned int len) > { > + unsigned long flags; > + > + local_irq_save(flags); > while (len > 0) { > unsigned int noplen = len; > if (noplen > ASM_NOP_MAX) > @@ -265,6 +268,7 @@ static void __init_or_module add_nops(void *insns, unsigned int len) > insns += noplen; > len -= noplen; > } > + local_irq_restore(flags); > } > > extern struct alt_instr __alt_instructions[], __alt_instructions_end[]; > > > -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com virt-builder quickly builds VMs from scratch http://libguestfs.org/virt-builder.1.html ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [BUG 4.2-rc8] Interrupt occurs while apply_alternatives() is patching the handler 2015-09-02 19:05 ` Richard W.M. Jones @ 2015-09-03 7:53 ` Richard W.M. Jones 0 siblings, 0 replies; 15+ messages in thread From: Richard W.M. Jones @ 2015-09-03 7:53 UTC (permalink / raw) To: Thomas Gleixner Cc: Chuck Ebbert, linux-kernel, x86, Ingo Molnar, H. Peter Anvin On Wed, Sep 02, 2015 at 08:05:12PM +0100, Richard W.M. Jones wrote: > On Wed, Sep 02, 2015 at 11:11:55AM +0200, Thomas Gleixner wrote: > > On Tue, 1 Sep 2015, Richard W.M. Jones wrote: > > > On Sun, Aug 30, 2015 at 10:37:57PM -0400, Chuck Ebbert wrote: > > > > This is from https://bugzilla.redhat.com/show_bug.cgi?id=1258223 > > > > > > > > [ 0.036000] BUG: unable to handle kernel paging request at 55501e06 > > > [...] > > > > [ 0.036000] [<c0409c80>] ? add_nops+0x90/0xa0 > > > > [ 0.036000] [<c040a054>] apply_alternatives+0x274/0x630 > > > > [ 0.036000] [<c07f1cf0>] ? wait_for_xmitr+0xa0/0xa0 > > > > [ 0.036000] [<c071a6fc>] ? sprintf+0x1c/0x20 > > > > [ 0.036000] [<c0aae480>] ? irq_entries_start+0x698/0x698 > > > > [ 0.036000] [<c071be4b>] ? memcpy+0xb/0x30 > > > > [ 0.036000] [<c07f3950>] ? serial8250_set_termios+0x20/0x20 > > > [...] > > > > Interrupt 0x30 occurred while the alternatives code was replacing the > > > > initial 0x90,0x90,0x90 NOPs (from the ASM_CLAC macro) with the optimized > > > > version, 0x8d,0x76,0x00. Only the first byte has been replaced so far, > > > > and it makes a mess out of the insn decoding. > > > > apply_alternatives() has two ways to modify the code: > > > > 1) text_poke_early() > > > > 2) optimize_nops() > > > > The former disables interrupts, the latter not. The patch below should > > fix the issue. > > It has gone through about 1100 iterations so far without hitting the > bug. I'll leave it running overnight. That ran ~ 4000 iterations overnight, so it seems to work. You can add: Tested-by: Richard W.M. Jones <rjones@redhat.com> Thanks, Rich. > Rich. > > > Thanks, > > > > tglx > > > > diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c > > index c42827eb86cf..6a2f93e029f4 100644 > > --- a/arch/x86/kernel/alternative.c > > +++ b/arch/x86/kernel/alternative.c > > @@ -257,6 +257,9 @@ void __init arch_init_ideal_nops(void) > > /* Use this to add nops to a buffer, then text_poke the whole buffer. */ > > static void __init_or_module add_nops(void *insns, unsigned int len) > > { > > + unsigned long flags; > > + > > + local_irq_save(flags); > > while (len > 0) { > > unsigned int noplen = len; > > if (noplen > ASM_NOP_MAX) > > @@ -265,6 +268,7 @@ static void __init_or_module add_nops(void *insns, unsigned int len) > > insns += noplen; > > len -= noplen; > > } > > + local_irq_restore(flags); > > } > > > > extern struct alt_instr __alt_instructions[], __alt_instructions_end[]; > > > > > > > > -- > Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones > Read my programming and virtualization blog: http://rwmj.wordpress.com > virt-builder quickly builds VMs from scratch > http://libguestfs.org/virt-builder.1.html -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com virt-df lists disk usage of guests without needing to install any software inside the virtual machine. Supports Linux and Windows. http://people.redhat.com/~rjones/virt-df/ ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [BUG 4.2-rc8] Interrupt occurs while apply_alternatives() is patching the handler 2015-09-02 9:11 ` Thomas Gleixner 2015-09-02 19:05 ` Richard W.M. Jones @ 2015-09-03 8:50 ` Borislav Petkov 2015-09-03 10:41 ` Thomas Gleixner 1 sibling, 1 reply; 15+ messages in thread From: Borislav Petkov @ 2015-09-03 8:50 UTC (permalink / raw) To: Thomas Gleixner Cc: Richard W.M. Jones, Chuck Ebbert, linux-kernel, x86, Ingo Molnar, H. Peter Anvin On Wed, Sep 02, 2015 at 11:11:55AM +0200, Thomas Gleixner wrote: > On Tue, 1 Sep 2015, Richard W.M. Jones wrote: > > On Sun, Aug 30, 2015 at 10:37:57PM -0400, Chuck Ebbert wrote: > > > This is from https://bugzilla.redhat.com/show_bug.cgi?id=1258223 > > > > > > [ 0.036000] BUG: unable to handle kernel paging request at 55501e06 > > [...] > > > [ 0.036000] [<c0409c80>] ? add_nops+0x90/0xa0 > > > [ 0.036000] [<c040a054>] apply_alternatives+0x274/0x630 > > > [ 0.036000] [<c07f1cf0>] ? wait_for_xmitr+0xa0/0xa0 > > > [ 0.036000] [<c071a6fc>] ? sprintf+0x1c/0x20 > > > [ 0.036000] [<c0aae480>] ? irq_entries_start+0x698/0x698 > > > [ 0.036000] [<c071be4b>] ? memcpy+0xb/0x30 > > > [ 0.036000] [<c07f3950>] ? serial8250_set_termios+0x20/0x20 > > [...] > > > Interrupt 0x30 occurred while the alternatives code was replacing the > > > initial 0x90,0x90,0x90 NOPs (from the ASM_CLAC macro) with the optimized > > > version, 0x8d,0x76,0x00. Only the first byte has been replaced so far, > > > and it makes a mess out of the insn decoding. > > apply_alternatives() has two ways to modify the code: > > 1) text_poke_early() > > 2) optimize_nops() > > The former disables interrupts, the latter not. The patch below should > fix the issue. > > Thanks, > > tglx > > diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c > index c42827eb86cf..6a2f93e029f4 100644 > --- a/arch/x86/kernel/alternative.c > +++ b/arch/x86/kernel/alternative.c > @@ -257,6 +257,9 @@ void __init arch_init_ideal_nops(void) > /* Use this to add nops to a buffer, then text_poke the whole buffer. */ > static void __init_or_module add_nops(void *insns, unsigned int len) > { > + unsigned long flags; > + > + local_irq_save(flags); > while (len > 0) { I guess you want to optimize the len==0 case to not disable interrupts needlessly: if (!len) return; local_irq_save(flags); while (len > 0) ... Other than that, good catch! Thanks. -- Regards/Gruss, Boris. ECO tip #101: Trim your mails when you reply. -- ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [BUG 4.2-rc8] Interrupt occurs while apply_alternatives() is patching the handler 2015-09-03 8:50 ` Borislav Petkov @ 2015-09-03 10:41 ` Thomas Gleixner 2015-09-03 12:43 ` Josh Boyer ` (4 more replies) 0 siblings, 5 replies; 15+ messages in thread From: Thomas Gleixner @ 2015-09-03 10:41 UTC (permalink / raw) To: Borislav Petkov Cc: Richard W.M. Jones, Chuck Ebbert, linux-kernel, x86, Ingo Molnar, H. Peter Anvin On Thu, 3 Sep 2015, Borislav Petkov wrote: > On Wed, Sep 02, 2015 at 11:11:55AM +0200, Thomas Gleixner wrote: > > static void __init_or_module add_nops(void *insns, unsigned int len) > > { > > + unsigned long flags; > > + > > + local_irq_save(flags); > > while (len > 0) { > > I guess you want to optimize the len==0 case to not disable interrupts > needlessly: > > if (!len) > return; > > local_irq_save(flags); > while (len > 0) > ... Nah. I rather put the local_irq_save into optimize_nops(). All other callers of add_nops() are operating on a buffer and use text_poke after that. Aside of that optimize_nops() is missing a sync_core(). Updated patch below. Thanks, tglx -----------------> Subject: x86/alternatives: Make optimize_nops() interrupt safe and synced From: Thomas Gleixner <tglx@linutronix.de> Date: Thu, 03 Sep 2015 12:34:55 +0200 optimize_nops() is buggy in two aspects: - It's not disabling interrupts across the modification - It's lacking a sync_core() call Fixes: 4fd4b6e5537c 'x86/alternatives: Use optimized NOPs for padding' Reported-by: "Richard W.M. Jones" <rjones@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> --- arch/x86/kernel/alternative.c | 5 +++++ 1 file changed, 5 insertions(+) Index: tip/arch/x86/kernel/alternative.c =================================================================== --- tip.orig/arch/x86/kernel/alternative.c +++ tip/arch/x86/kernel/alternative.c @@ -338,10 +338,15 @@ done: static void __init_or_module optimize_nops(struct alt_instr *a, u8 *instr) { + unsigned long flags; + if (instr[0] != 0x90) return; + local_irq_save(flags); add_nops(instr + (a->instrlen - a->padlen), a->padlen); + sync_core(); + local_irq_restore(flags); DUMP_BYTES(instr, a->instrlen, "%p: [%d:%d) optimized NOPs: ", instr, a->instrlen - a->padlen, a->padlen); ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [BUG 4.2-rc8] Interrupt occurs while apply_alternatives() is patching the handler 2015-09-03 10:41 ` Thomas Gleixner @ 2015-09-03 12:43 ` Josh Boyer 2015-09-03 13:01 ` Thomas Gleixner 2015-09-03 15:48 ` Richard W.M. Jones ` (3 subsequent siblings) 4 siblings, 1 reply; 15+ messages in thread From: Josh Boyer @ 2015-09-03 12:43 UTC (permalink / raw) To: Thomas Gleixner Cc: Borislav Petkov, Richard W.M. Jones, Chuck Ebbert, Linux-Kernel@Vger. Kernel. Org, x86, Ingo Molnar, H. Peter Anvin On Thu, Sep 3, 2015 at 6:41 AM, Thomas Gleixner <tglx@linutronix.de> wrote: > On Thu, 3 Sep 2015, Borislav Petkov wrote: >> On Wed, Sep 02, 2015 at 11:11:55AM +0200, Thomas Gleixner wrote: >> > static void __init_or_module add_nops(void *insns, unsigned int len) >> > { >> > + unsigned long flags; >> > + >> > + local_irq_save(flags); >> > while (len > 0) { >> >> I guess you want to optimize the len==0 case to not disable interrupts >> needlessly: >> >> if (!len) >> return; >> >> local_irq_save(flags); >> while (len > 0) >> ... > > Nah. I rather put the local_irq_save into optimize_nops(). All other > callers of add_nops() are operating on a buffer and use text_poke > after that. Aside of that optimize_nops() is missing a sync_core(). > > Updated patch below. > > Thanks, > > tglx > > -----------------> > Subject: x86/alternatives: Make optimize_nops() interrupt safe and synced > From: Thomas Gleixner <tglx@linutronix.de> > Date: Thu, 03 Sep 2015 12:34:55 +0200 > > optimize_nops() is buggy in two aspects: > > - It's not disabling interrupts across the modification > - It's lacking a sync_core() call > > Fixes: 4fd4b6e5537c 'x86/alternatives: Use optimized NOPs for padding' Possibly CC: stable since it fixes a commit in 4.1 and 4.2? josh > Reported-by: "Richard W.M. Jones" <rjones@redhat.com> > Signed-off-by: Thomas Gleixner <tglx@linutronix.de> > --- > arch/x86/kernel/alternative.c | 5 +++++ > 1 file changed, 5 insertions(+) > > Index: tip/arch/x86/kernel/alternative.c > =================================================================== > --- tip.orig/arch/x86/kernel/alternative.c > +++ tip/arch/x86/kernel/alternative.c > @@ -338,10 +338,15 @@ done: > > static void __init_or_module optimize_nops(struct alt_instr *a, u8 *instr) > { > + unsigned long flags; > + > if (instr[0] != 0x90) > return; > > + local_irq_save(flags); > add_nops(instr + (a->instrlen - a->padlen), a->padlen); > + sync_core(); > + local_irq_restore(flags); > > DUMP_BYTES(instr, a->instrlen, "%p: [%d:%d) optimized NOPs: ", > instr, a->instrlen - a->padlen, a->padlen); > > > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [BUG 4.2-rc8] Interrupt occurs while apply_alternatives() is patching the handler 2015-09-03 12:43 ` Josh Boyer @ 2015-09-03 13:01 ` Thomas Gleixner 0 siblings, 0 replies; 15+ messages in thread From: Thomas Gleixner @ 2015-09-03 13:01 UTC (permalink / raw) To: Josh Boyer Cc: Borislav Petkov, Richard W.M. Jones, Chuck Ebbert, Linux-Kernel@Vger. Kernel. Org, x86, Ingo Molnar, H. Peter Anvin On Thu, 3 Sep 2015, Josh Boyer wrote: > > Fixes: 4fd4b6e5537c 'x86/alternatives: Use optimized NOPs for padding' > > Possibly CC: stable since it fixes a commit in 4.1 and 4.2? Indeed. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [BUG 4.2-rc8] Interrupt occurs while apply_alternatives() is patching the handler 2015-09-03 10:41 ` Thomas Gleixner 2015-09-03 12:43 ` Josh Boyer @ 2015-09-03 15:48 ` Richard W.M. Jones 2015-09-03 19:30 ` [tip:x86/urgent] x86/alternatives: Make optimize_nops() interrupt safe and synced tip-bot for Thomas Gleixner ` (2 subsequent siblings) 4 siblings, 0 replies; 15+ messages in thread From: Richard W.M. Jones @ 2015-09-03 15:48 UTC (permalink / raw) To: Thomas Gleixner Cc: Borislav Petkov, Chuck Ebbert, linux-kernel, x86, Ingo Molnar, H. Peter Anvin On Thu, Sep 03, 2015 at 12:41:47PM +0200, Thomas Gleixner wrote: > On Thu, 3 Sep 2015, Borislav Petkov wrote: > > On Wed, Sep 02, 2015 at 11:11:55AM +0200, Thomas Gleixner wrote: > > > static void __init_or_module add_nops(void *insns, unsigned int len) > > > { > > > + unsigned long flags; > > > + > > > + local_irq_save(flags); > > > while (len > 0) { > > > > I guess you want to optimize the len==0 case to not disable interrupts > > needlessly: > > > > if (!len) > > return; > > > > local_irq_save(flags); > > while (len > 0) > > ... > > Nah. I rather put the local_irq_save into optimize_nops(). All other > callers of add_nops() are operating on a buffer and use text_poke > after that. Aside of that optimize_nops() is missing a sync_core(). > > Updated patch below. The V2 patch has got to 900 iterations without hitting the problem. As that is a lot more than without the patch, you can add: Tested-by: Richard W.M. Jones <rjones@redhat.com> I will leave it going overnight just in case. Thanks, Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com virt-top is 'top' for virtual machines. Tiny program with many powerful monitoring features, net stats, disk stats, logging, etc. http://people.redhat.com/~rjones/virt-top ^ permalink raw reply [flat|nested] 15+ messages in thread
* [tip:x86/urgent] x86/alternatives: Make optimize_nops() interrupt safe and synced 2015-09-03 10:41 ` Thomas Gleixner 2015-09-03 12:43 ` Josh Boyer 2015-09-03 15:48 ` Richard W.M. Jones @ 2015-09-03 19:30 ` tip-bot for Thomas Gleixner 2015-09-04 7:40 ` [BUG 4.2-rc8] Interrupt occurs while apply_alternatives() is patching the handler Richard W.M. Jones 2015-09-04 12:02 ` Borislav Petkov 4 siblings, 0 replies; 15+ messages in thread From: tip-bot for Thomas Gleixner @ 2015-09-03 19:30 UTC (permalink / raw) To: linux-tip-commits Cc: bp, mingo, rjones, hpa, tglx, linux-kernel, cebbert.lkml Commit-ID: 66c117d7fa2ae429911e60d84bf31a90b2b96189 Gitweb: http://git.kernel.org/tip/66c117d7fa2ae429911e60d84bf31a90b2b96189 Author: Thomas Gleixner <tglx@linutronix.de> AuthorDate: Thu, 3 Sep 2015 12:34:55 +0200 Committer: Thomas Gleixner <tglx@linutronix.de> CommitDate: Thu, 3 Sep 2015 21:27:47 +0200 x86/alternatives: Make optimize_nops() interrupt safe and synced Richard reported the following crash: [ 0.036000] BUG: unable to handle kernel paging request at 55501e06 [ 0.036000] IP: [<c0aae48b>] common_interrupt+0xb/0x38 [ 0.036000] Call Trace: [ 0.036000] [<c0409c80>] ? add_nops+0x90/0xa0 [ 0.036000] [<c040a054>] apply_alternatives+0x274/0x630 Chuck decoded: " 0: 8d 90 90 83 04 24 lea 0x24048390(%eax),%edx 6: 80 fc 0f cmp $0xf,%ah 9: a8 0f test $0xf,%al >> b: a0 06 1e 50 55 mov 0x55501e06,%al 10: 57 push %edi 11: 56 push %esi Interrupt 0x30 occurred while the alternatives code was replacing the initial 0x90,0x90,0x90 NOPs (from the ASM_CLAC macro) with the optimized version, 0x8d,0x76,0x00. Only the first byte has been replaced so far, and it makes a mess out of the insn decoding." optimize_nops() is buggy in two aspects: - It's not disabling interrupts across the modification - It's lacking a sync_core() call Add both. Fixes: 4fd4b6e5537c 'x86/alternatives: Use optimized NOPs for padding' Reported-and-tested-by: "Richard W.M. Jones" <rjones@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Richard W.M. Jones <rjones@redhat.com> Cc: Chuck Ebbert <cebbert.lkml@gmail.com> Cc: Borislav Petkov <bp@alien8.de> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1509031232340.15006@nanos Signed-off-by: Thomas Gleixner <tglx@linutronix.de> --- arch/x86/kernel/alternative.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c index c42827e..25f9093 100644 --- a/arch/x86/kernel/alternative.c +++ b/arch/x86/kernel/alternative.c @@ -338,10 +338,15 @@ done: static void __init_or_module optimize_nops(struct alt_instr *a, u8 *instr) { + unsigned long flags; + if (instr[0] != 0x90) return; + local_irq_save(flags); add_nops(instr + (a->instrlen - a->padlen), a->padlen); + sync_core(); + local_irq_restore(flags); DUMP_BYTES(instr, a->instrlen, "%p: [%d:%d) optimized NOPs: ", instr, a->instrlen - a->padlen, a->padlen); ^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [BUG 4.2-rc8] Interrupt occurs while apply_alternatives() is patching the handler 2015-09-03 10:41 ` Thomas Gleixner ` (2 preceding siblings ...) 2015-09-03 19:30 ` [tip:x86/urgent] x86/alternatives: Make optimize_nops() interrupt safe and synced tip-bot for Thomas Gleixner @ 2015-09-04 7:40 ` Richard W.M. Jones 2015-09-04 12:02 ` Borislav Petkov 4 siblings, 0 replies; 15+ messages in thread From: Richard W.M. Jones @ 2015-09-04 7:40 UTC (permalink / raw) To: Thomas Gleixner Cc: Borislav Petkov, Chuck Ebbert, linux-kernel, x86, Ingo Molnar, H. Peter Anvin On Thu, Sep 03, 2015 at 12:41:47PM +0200, Thomas Gleixner wrote: > On Thu, 3 Sep 2015, Borislav Petkov wrote: > > On Wed, Sep 02, 2015 at 11:11:55AM +0200, Thomas Gleixner wrote: > > > static void __init_or_module add_nops(void *insns, unsigned int len) > > > { > > > + unsigned long flags; > > > + > > > + local_irq_save(flags); > > > while (len > 0) { > > > > I guess you want to optimize the len==0 case to not disable interrupts > > needlessly: > > > > if (!len) > > return; > > > > local_irq_save(flags); > > while (len > 0) > > ... > > Nah. I rather put the local_irq_save into optimize_nops(). All other > callers of add_nops() are operating on a buffer and use text_poke > after that. Aside of that optimize_nops() is missing a sync_core(). > > Updated patch below. V2 of the patch managed ~ 5000 iterations overnight without hitting the problem, so looks like it's a good fix. Thanks, Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com virt-top is 'top' for virtual machines. Tiny program with many powerful monitoring features, net stats, disk stats, logging, etc. http://people.redhat.com/~rjones/virt-top ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [BUG 4.2-rc8] Interrupt occurs while apply_alternatives() is patching the handler 2015-09-03 10:41 ` Thomas Gleixner ` (3 preceding siblings ...) 2015-09-04 7:40 ` [BUG 4.2-rc8] Interrupt occurs while apply_alternatives() is patching the handler Richard W.M. Jones @ 2015-09-04 12:02 ` Borislav Petkov 2015-09-04 13:37 ` Thomas Gleixner 4 siblings, 1 reply; 15+ messages in thread From: Borislav Petkov @ 2015-09-04 12:02 UTC (permalink / raw) To: Thomas Gleixner Cc: Richard W.M. Jones, Chuck Ebbert, linux-kernel, x86, Ingo Molnar, H. Peter Anvin On Thu, Sep 03, 2015 at 12:41:47PM +0200, Thomas Gleixner wrote: > Nah. I rather put the local_irq_save into optimize_nops(). All other > callers of add_nops() are operating on a buffer and use text_poke > after that. Aside of that optimize_nops() is missing a sync_core(). Whoops. > Updated patch below. Looks good, thanks. Hrrm, maybe optimize_nops() should work on a buffer and do text_poke_early() too at the end, so that it doesn't differ from all the other paths changing kernel text. So as to stuff like that doesn't get missed next time I'm touching it. But with your way we don't need the additional buffer. I guess your solution is better resource-wise. Thanks. -- Regards/Gruss, Boris. ECO tip #101: Trim your mails when you reply. -- ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [BUG 4.2-rc8] Interrupt occurs while apply_alternatives() is patching the handler 2015-09-04 12:02 ` Borislav Petkov @ 2015-09-04 13:37 ` Thomas Gleixner 2015-09-05 15:25 ` Borislav Petkov 0 siblings, 1 reply; 15+ messages in thread From: Thomas Gleixner @ 2015-09-04 13:37 UTC (permalink / raw) To: Borislav Petkov Cc: Richard W.M. Jones, Chuck Ebbert, linux-kernel, x86, Ingo Molnar, H. Peter Anvin On Fri, 4 Sep 2015, Borislav Petkov wrote: > On Thu, Sep 03, 2015 at 12:41:47PM +0200, Thomas Gleixner wrote: > > Nah. I rather put the local_irq_save into optimize_nops(). All other > > callers of add_nops() are operating on a buffer and use text_poke > > after that. Aside of that optimize_nops() is missing a sync_core(). > > Whoops. > > > Updated patch below. > > Looks good, thanks. > > Hrrm, maybe optimize_nops() should work on a buffer and do > text_poke_early() too at the end, so that it doesn't differ from all the > other paths changing kernel text. So as to stuff like that doesn't get > missed next time I'm touching it. > > But with your way we don't need the additional buffer. I guess your > solution is better resource-wise. I pondered the buffer/text_poke variant, but that's too much of a hassle for a fix supposed to go to stable. Resourcewise you could use the buffer which is available in apply_alternatives anyway. Thanks, tglx ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [BUG 4.2-rc8] Interrupt occurs while apply_alternatives() is patching the handler 2015-09-04 13:37 ` Thomas Gleixner @ 2015-09-05 15:25 ` Borislav Petkov 0 siblings, 0 replies; 15+ messages in thread From: Borislav Petkov @ 2015-09-05 15:25 UTC (permalink / raw) To: Thomas Gleixner Cc: Richard W.M. Jones, Chuck Ebbert, linux-kernel, x86, Ingo Molnar, H. Peter Anvin On Fri, Sep 04, 2015 at 03:37:24PM +0200, Thomas Gleixner wrote: > I pondered the buffer/text_poke variant, but that's too much of a > hassle for a fix supposed to go to stable. True. > Resourcewise you could use the buffer which is available in > apply_alternatives anyway. I'll poke at it when I get back - who knows, it might get not too ugly to keep it alive... Thanks. -- Regards/Gruss, Boris. ECO tip #101: Trim your mails when you reply. -- ^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2015-09-05 15:25 UTC | newest] Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2015-08-31 2:37 [BUG 4.2-rc8] Interrupt occurs while apply_alternatives() is patching the handler Chuck Ebbert 2015-09-01 6:20 ` Richard W.M. Jones 2015-09-02 9:11 ` Thomas Gleixner 2015-09-02 19:05 ` Richard W.M. Jones 2015-09-03 7:53 ` Richard W.M. Jones 2015-09-03 8:50 ` Borislav Petkov 2015-09-03 10:41 ` Thomas Gleixner 2015-09-03 12:43 ` Josh Boyer 2015-09-03 13:01 ` Thomas Gleixner 2015-09-03 15:48 ` Richard W.M. Jones 2015-09-03 19:30 ` [tip:x86/urgent] x86/alternatives: Make optimize_nops() interrupt safe and synced tip-bot for Thomas Gleixner 2015-09-04 7:40 ` [BUG 4.2-rc8] Interrupt occurs while apply_alternatives() is patching the handler Richard W.M. Jones 2015-09-04 12:02 ` Borislav Petkov 2015-09-04 13:37 ` Thomas Gleixner 2015-09-05 15:25 ` Borislav Petkov
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).