LKML Archive on lore.kernel.org
 help / Atom feed
* [BUG 4.2-rc8] Interrupt occurs while apply_alternatives() is patching the handler
@ 2015-08-31  2:37 Chuck Ebbert
  2015-09-01  6:20 ` Richard W.M. Jones
  0 siblings, 1 reply; 15+ messages in thread
From: Chuck Ebbert @ 2015-08-31  2:37 UTC (permalink / raw)
  To: linux-kernel; +Cc: x86, Thomas Gleixner, Ingo Molnar, H. Peter Anvin

This is from https://bugzilla.redhat.com/show_bug.cgi?id=1258223

[    0.036000] BUG: unable to handle kernel paging request at 55501e06
[    0.036000] IP: [<c0aae48b>] common_interrupt+0xb/0x38
[    0.036000] *pde = 00000000 
[    0.036000] Oops: 0000 [#1] SMP 
[    0.036000] Modules linked in:
[    0.036000] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.2.0-0.rc8.git3.1.fc24.i686 #1
[    0.036000] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.8.2-20150714_191134- 04/01/2014
[    0.036000] task: c0d49ac0 ti: c0d42000 task.ti: c0d42000
[    0.036000] EIP: 0060:[<c0aae48b>] EFLAGS: 00200046 CPU: 0
[    0.036000] EIP is at common_interrupt+0xb/0x38
[    0.036000] EAX: c0aae480 EBX: 0000008d ECX: c0ab1c83 EDX: e4af6810
[    0.036000] ESI: 029a7802 EDI: 00000003 EBP: c0d43e68 ESP: c0d43e44
[    0.036000]  DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
[    0.036000] CR0: 8005003b CR2: 55501e06 CR3: 00ebd000 CR4: 00000690
[    0.036000] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[    0.036000] DR6: 00000000 DR7: 00000000
[    0.036000] Stack:
[    0.036000]  0000004f c0409c80 00000060 00200202 00200046 c0d43e60 c0ea150c 029a7802
[    0.036000]  00000000 c0d43fb8 c040a054 c07f1cf0 6c0a1000 ffff0006 00200046 00000043
[    0.036000]  c0ed0bc0 00000000 c0d43e98 c071a6fc c0d43ea8 c0d43ec4 c0ea4c73 c0ea4c7f
[    0.036000] Call Trace:
[    0.036000]  [<c0409c80>] ? add_nops+0x90/0xa0
[    0.036000]  [<c040a054>] apply_alternatives+0x274/0x630
[    0.036000]  [<c07f1cf0>] ? wait_for_xmitr+0xa0/0xa0
[    0.036000]  [<c071a6fc>] ? sprintf+0x1c/0x20
[    0.036000]  [<c0aae480>] ? irq_entries_start+0x698/0x698
[    0.036000]  [<c071be4b>] ? memcpy+0xb/0x30
[    0.036000]  [<c07f3950>] ? serial8250_set_termios+0x20/0x20
[    0.036000]  [<c0aad4e3>] ? _raw_write_unlock_irqrestore+0x13/0x20
[    0.036000]  [<c0aad4e3>] ? _raw_write_unlock_irqrestore+0x13/0x20
[    0.036000]  [<c0aad4fd>] ? _raw_spin_unlock_irqrestore+0xd/0x10
[    0.036000]  [<c04b17b9>] ? console_unlock+0x2e9/0x610
[    0.036000]  [<c04b03cd>] ? log_store+0x1cd/0x210
[    0.036000]  [<c04b1d7e>] ? vprintk_emit+0x29e/0x570
[    0.036000]  [<c04b21e1>] ? vprintk_default+0x41/0x60
[    0.036000]  [<c0aa7725>] ? printk+0x17/0x19
[    0.036000]  [<c0dfdd48>] ? identify_boot_cpu+0x7b/0x80
[    0.036000]  [<c0dfca47>] alternative_instructions+0x17/0xc1
[    0.036000]  [<c0dfdda9>] check_bugs+0x32/0x39
[    0.036000]  [<c0df6b57>] start_kernel+0x3ca/0x40a
[    0.036000]  [<c0df62e3>] i386_start_kernel+0x91/0x95
[    0.036000] Code: 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 8d 90 90 83 04 24 80 fc 0f a8 0f <a0> 06 1e 50 55 57 56 52 51 53 ba 7b 00 00 00 8e da 8e c2 ba d8

   0:	8d 90 90 83 04 24    	lea    0x24048390(%eax),%edx
   6:	80 fc 0f             	cmp    $0xf,%ah
   9:	a8 0f                	test   $0xf,%al
>> b:	a0 06 1e 50 55       	mov    0x55501e06,%al
  10:	57                   	push   %edi
  11:	56                   	push   %esi

Interrupt 0x30 occurred while the alternatives code was replacing the
initial 0x90,0x90,0x90 NOPs (from the ASM_CLAC macro) with the optimized
version, 0x8d,0x76,0x00. Only the first byte has been replaced so far,
and it makes a mess out of the insn decoding.

I have no clue how to fix this.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [BUG 4.2-rc8] Interrupt occurs while apply_alternatives() is patching the handler
  2015-08-31  2:37 [BUG 4.2-rc8] Interrupt occurs while apply_alternatives() is patching the handler Chuck Ebbert
@ 2015-09-01  6:20 ` Richard W.M. Jones
  2015-09-02  9:11   ` Thomas Gleixner
  0 siblings, 1 reply; 15+ messages in thread
From: Richard W.M. Jones @ 2015-09-01  6:20 UTC (permalink / raw)
  To: Chuck Ebbert
  Cc: linux-kernel, x86, Thomas Gleixner, Ingo Molnar, H. Peter Anvin

On Sun, Aug 30, 2015 at 10:37:57PM -0400, Chuck Ebbert wrote:
> This is from https://bugzilla.redhat.com/show_bug.cgi?id=1258223
> 
> [    0.036000] BUG: unable to handle kernel paging request at 55501e06
[...]
> [    0.036000]  [<c0409c80>] ? add_nops+0x90/0xa0
> [    0.036000]  [<c040a054>] apply_alternatives+0x274/0x630
> [    0.036000]  [<c07f1cf0>] ? wait_for_xmitr+0xa0/0xa0
> [    0.036000]  [<c071a6fc>] ? sprintf+0x1c/0x20
> [    0.036000]  [<c0aae480>] ? irq_entries_start+0x698/0x698
> [    0.036000]  [<c071be4b>] ? memcpy+0xb/0x30
> [    0.036000]  [<c07f3950>] ? serial8250_set_termios+0x20/0x20
[...]
> Interrupt 0x30 occurred while the alternatives code was replacing the
> initial 0x90,0x90,0x90 NOPs (from the ASM_CLAC macro) with the optimized
> version, 0x8d,0x76,0x00. Only the first byte has been replaced so far,
> and it makes a mess out of the insn decoding.

Chuck, thanks for reporting this.

I have only been able to reproduce this so far using qemu and TCG (not
KVM) which of course raises a range of questions: could it be a qemu
bug or a TCG bug?  Could it be that an atomic op is not correctly
implemented by qemu?  I will keep trying on KVM.  Because I don't have
a convenient server with 32 bit kernel and a serial port that I can
reboot thousands of times, I have not tried to reproduce on baremetal yet.

Here's how to reproduce it.  (The host can be x86-64)

(1) Grab the 32 bit Fedora kernel we are using from
https://kojipkgs.fedoraproject.org//packages/kernel/4.2.0/1.fc24/i686/kernel-core-4.2.0-1.fc24.i686.rpm
(from http://koji.fedoraproject.org/koji/buildinfo?buildID=681723)

(2) Unpack it to extract vmlinuz:

cd /tmp
rpm2cpio /mnt/scratch/kernel-core-4.2.0-1.fc24.i686.rpm | cpio -id
cp ./lib/modules/4.2.0-1.fc24.i686/vmlinuz .

(3) Boot the kernel under qemu/KVM.  The following single line command
repeatedly boots the kernel until the bug is hit:

while qemu-system-x86_64 -nographic -no-reboot -M accel=kvm:tcg -kernel vmlinuz -append 'console=ttyS0 panic=1' -serial stdio -monitor none >& log; ! grep add_nops log; do echo -n .; done

It takes many iterations (100s with TCG) to hit the bug.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-p2v converts physical machines to virtual machines.  Boot with a
live CD or over the network (PXE) and turn machines into KVM guests.
http://libguestfs.org/virt-v2v

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [BUG 4.2-rc8] Interrupt occurs while apply_alternatives() is patching the handler
  2015-09-01  6:20 ` Richard W.M. Jones
@ 2015-09-02  9:11   ` Thomas Gleixner
  2015-09-02 19:05     ` Richard W.M. Jones
  2015-09-03  8:50     ` Borislav Petkov
  0 siblings, 2 replies; 15+ messages in thread
From: Thomas Gleixner @ 2015-09-02  9:11 UTC (permalink / raw)
  To: Richard W.M. Jones
  Cc: Chuck Ebbert, linux-kernel, x86, Ingo Molnar, H. Peter Anvin

On Tue, 1 Sep 2015, Richard W.M. Jones wrote:
> On Sun, Aug 30, 2015 at 10:37:57PM -0400, Chuck Ebbert wrote:
> > This is from https://bugzilla.redhat.com/show_bug.cgi?id=1258223
> > 
> > [    0.036000] BUG: unable to handle kernel paging request at 55501e06
> [...]
> > [    0.036000]  [<c0409c80>] ? add_nops+0x90/0xa0
> > [    0.036000]  [<c040a054>] apply_alternatives+0x274/0x630
> > [    0.036000]  [<c07f1cf0>] ? wait_for_xmitr+0xa0/0xa0
> > [    0.036000]  [<c071a6fc>] ? sprintf+0x1c/0x20
> > [    0.036000]  [<c0aae480>] ? irq_entries_start+0x698/0x698
> > [    0.036000]  [<c071be4b>] ? memcpy+0xb/0x30
> > [    0.036000]  [<c07f3950>] ? serial8250_set_termios+0x20/0x20
> [...]
> > Interrupt 0x30 occurred while the alternatives code was replacing the
> > initial 0x90,0x90,0x90 NOPs (from the ASM_CLAC macro) with the optimized
> > version, 0x8d,0x76,0x00. Only the first byte has been replaced so far,
> > and it makes a mess out of the insn decoding.

apply_alternatives() has two ways to modify the code:

1) text_poke_early()

2) optimize_nops()

The former disables interrupts, the latter not. The patch below should
fix the issue.

Thanks,

	tglx

diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index c42827eb86cf..6a2f93e029f4 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -257,6 +257,9 @@ void __init arch_init_ideal_nops(void)
 /* Use this to add nops to a buffer, then text_poke the whole buffer. */
 static void __init_or_module add_nops(void *insns, unsigned int len)
 {
+	unsigned long flags;
+
+	local_irq_save(flags);
 	while (len > 0) {
 		unsigned int noplen = len;
 		if (noplen > ASM_NOP_MAX)
@@ -265,6 +268,7 @@ static void __init_or_module add_nops(void *insns, unsigned int len)
 		insns += noplen;
 		len -= noplen;
 	}
+	local_irq_restore(flags);
 }
 
 extern struct alt_instr __alt_instructions[], __alt_instructions_end[];





^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [BUG 4.2-rc8] Interrupt occurs while apply_alternatives() is patching the handler
  2015-09-02  9:11   ` Thomas Gleixner
@ 2015-09-02 19:05     ` Richard W.M. Jones
  2015-09-03  7:53       ` Richard W.M. Jones
  2015-09-03  8:50     ` Borislav Petkov
  1 sibling, 1 reply; 15+ messages in thread
From: Richard W.M. Jones @ 2015-09-02 19:05 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Chuck Ebbert, linux-kernel, x86, Ingo Molnar, H. Peter Anvin

On Wed, Sep 02, 2015 at 11:11:55AM +0200, Thomas Gleixner wrote:
> On Tue, 1 Sep 2015, Richard W.M. Jones wrote:
> > On Sun, Aug 30, 2015 at 10:37:57PM -0400, Chuck Ebbert wrote:
> > > This is from https://bugzilla.redhat.com/show_bug.cgi?id=1258223
> > > 
> > > [    0.036000] BUG: unable to handle kernel paging request at 55501e06
> > [...]
> > > [    0.036000]  [<c0409c80>] ? add_nops+0x90/0xa0
> > > [    0.036000]  [<c040a054>] apply_alternatives+0x274/0x630
> > > [    0.036000]  [<c07f1cf0>] ? wait_for_xmitr+0xa0/0xa0
> > > [    0.036000]  [<c071a6fc>] ? sprintf+0x1c/0x20
> > > [    0.036000]  [<c0aae480>] ? irq_entries_start+0x698/0x698
> > > [    0.036000]  [<c071be4b>] ? memcpy+0xb/0x30
> > > [    0.036000]  [<c07f3950>] ? serial8250_set_termios+0x20/0x20
> > [...]
> > > Interrupt 0x30 occurred while the alternatives code was replacing the
> > > initial 0x90,0x90,0x90 NOPs (from the ASM_CLAC macro) with the optimized
> > > version, 0x8d,0x76,0x00. Only the first byte has been replaced so far,
> > > and it makes a mess out of the insn decoding.
> 
> apply_alternatives() has two ways to modify the code:
> 
> 1) text_poke_early()
> 
> 2) optimize_nops()
> 
> The former disables interrupts, the latter not. The patch below should
> fix the issue.

It has gone through about 1100 iterations so far without hitting the
bug.  I'll leave it running overnight.

Rich.

> Thanks,
> 
> 	tglx
> 
> diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
> index c42827eb86cf..6a2f93e029f4 100644
> --- a/arch/x86/kernel/alternative.c
> +++ b/arch/x86/kernel/alternative.c
> @@ -257,6 +257,9 @@ void __init arch_init_ideal_nops(void)
>  /* Use this to add nops to a buffer, then text_poke the whole buffer. */
>  static void __init_or_module add_nops(void *insns, unsigned int len)
>  {
> +	unsigned long flags;
> +
> +	local_irq_save(flags);
>  	while (len > 0) {
>  		unsigned int noplen = len;
>  		if (noplen > ASM_NOP_MAX)
> @@ -265,6 +268,7 @@ static void __init_or_module add_nops(void *insns, unsigned int len)
>  		insns += noplen;
>  		len -= noplen;
>  	}
> +	local_irq_restore(flags);
>  }
>  
>  extern struct alt_instr __alt_instructions[], __alt_instructions_end[];
> 
> 
> 

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-builder quickly builds VMs from scratch
http://libguestfs.org/virt-builder.1.html

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [BUG 4.2-rc8] Interrupt occurs while apply_alternatives() is patching the handler
  2015-09-02 19:05     ` Richard W.M. Jones
@ 2015-09-03  7:53       ` Richard W.M. Jones
  0 siblings, 0 replies; 15+ messages in thread
From: Richard W.M. Jones @ 2015-09-03  7:53 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Chuck Ebbert, linux-kernel, x86, Ingo Molnar, H. Peter Anvin

On Wed, Sep 02, 2015 at 08:05:12PM +0100, Richard W.M. Jones wrote:
> On Wed, Sep 02, 2015 at 11:11:55AM +0200, Thomas Gleixner wrote:
> > On Tue, 1 Sep 2015, Richard W.M. Jones wrote:
> > > On Sun, Aug 30, 2015 at 10:37:57PM -0400, Chuck Ebbert wrote:
> > > > This is from https://bugzilla.redhat.com/show_bug.cgi?id=1258223
> > > > 
> > > > [    0.036000] BUG: unable to handle kernel paging request at 55501e06
> > > [...]
> > > > [    0.036000]  [<c0409c80>] ? add_nops+0x90/0xa0
> > > > [    0.036000]  [<c040a054>] apply_alternatives+0x274/0x630
> > > > [    0.036000]  [<c07f1cf0>] ? wait_for_xmitr+0xa0/0xa0
> > > > [    0.036000]  [<c071a6fc>] ? sprintf+0x1c/0x20
> > > > [    0.036000]  [<c0aae480>] ? irq_entries_start+0x698/0x698
> > > > [    0.036000]  [<c071be4b>] ? memcpy+0xb/0x30
> > > > [    0.036000]  [<c07f3950>] ? serial8250_set_termios+0x20/0x20
> > > [...]
> > > > Interrupt 0x30 occurred while the alternatives code was replacing the
> > > > initial 0x90,0x90,0x90 NOPs (from the ASM_CLAC macro) with the optimized
> > > > version, 0x8d,0x76,0x00. Only the first byte has been replaced so far,
> > > > and it makes a mess out of the insn decoding.
> > 
> > apply_alternatives() has two ways to modify the code:
> > 
> > 1) text_poke_early()
> > 
> > 2) optimize_nops()
> > 
> > The former disables interrupts, the latter not. The patch below should
> > fix the issue.
> 
> It has gone through about 1100 iterations so far without hitting the
> bug.  I'll leave it running overnight.

That ran ~ 4000 iterations overnight, so it seems to work.  You can
add:

  Tested-by: Richard W.M. Jones <rjones@redhat.com>

Thanks,

Rich.

> Rich.
> 
> > Thanks,
> > 
> > 	tglx
> > 
> > diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
> > index c42827eb86cf..6a2f93e029f4 100644
> > --- a/arch/x86/kernel/alternative.c
> > +++ b/arch/x86/kernel/alternative.c
> > @@ -257,6 +257,9 @@ void __init arch_init_ideal_nops(void)
> >  /* Use this to add nops to a buffer, then text_poke the whole buffer. */
> >  static void __init_or_module add_nops(void *insns, unsigned int len)
> >  {
> > +	unsigned long flags;
> > +
> > +	local_irq_save(flags);
> >  	while (len > 0) {
> >  		unsigned int noplen = len;
> >  		if (noplen > ASM_NOP_MAX)
> > @@ -265,6 +268,7 @@ static void __init_or_module add_nops(void *insns, unsigned int len)
> >  		insns += noplen;
> >  		len -= noplen;
> >  	}
> > +	local_irq_restore(flags);
> >  }
> >  
> >  extern struct alt_instr __alt_instructions[], __alt_instructions_end[];
> > 
> > 
> > 
> 
> -- 
> Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
> Read my programming and virtualization blog: http://rwmj.wordpress.com
> virt-builder quickly builds VMs from scratch
> http://libguestfs.org/virt-builder.1.html

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-df lists disk usage of guests without needing to install any
software inside the virtual machine.  Supports Linux and Windows.
http://people.redhat.com/~rjones/virt-df/

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [BUG 4.2-rc8] Interrupt occurs while apply_alternatives() is patching the handler
  2015-09-02  9:11   ` Thomas Gleixner
  2015-09-02 19:05     ` Richard W.M. Jones
@ 2015-09-03  8:50     ` Borislav Petkov
  2015-09-03 10:41       ` Thomas Gleixner
  1 sibling, 1 reply; 15+ messages in thread
From: Borislav Petkov @ 2015-09-03  8:50 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Richard W.M. Jones, Chuck Ebbert, linux-kernel, x86, Ingo Molnar,
	H. Peter Anvin

On Wed, Sep 02, 2015 at 11:11:55AM +0200, Thomas Gleixner wrote:
> On Tue, 1 Sep 2015, Richard W.M. Jones wrote:
> > On Sun, Aug 30, 2015 at 10:37:57PM -0400, Chuck Ebbert wrote:
> > > This is from https://bugzilla.redhat.com/show_bug.cgi?id=1258223
> > > 
> > > [    0.036000] BUG: unable to handle kernel paging request at 55501e06
> > [...]
> > > [    0.036000]  [<c0409c80>] ? add_nops+0x90/0xa0
> > > [    0.036000]  [<c040a054>] apply_alternatives+0x274/0x630
> > > [    0.036000]  [<c07f1cf0>] ? wait_for_xmitr+0xa0/0xa0
> > > [    0.036000]  [<c071a6fc>] ? sprintf+0x1c/0x20
> > > [    0.036000]  [<c0aae480>] ? irq_entries_start+0x698/0x698
> > > [    0.036000]  [<c071be4b>] ? memcpy+0xb/0x30
> > > [    0.036000]  [<c07f3950>] ? serial8250_set_termios+0x20/0x20
> > [...]
> > > Interrupt 0x30 occurred while the alternatives code was replacing the
> > > initial 0x90,0x90,0x90 NOPs (from the ASM_CLAC macro) with the optimized
> > > version, 0x8d,0x76,0x00. Only the first byte has been replaced so far,
> > > and it makes a mess out of the insn decoding.
> 
> apply_alternatives() has two ways to modify the code:
> 
> 1) text_poke_early()
> 
> 2) optimize_nops()
> 
> The former disables interrupts, the latter not. The patch below should
> fix the issue.
> 
> Thanks,
> 
> 	tglx
> 
> diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
> index c42827eb86cf..6a2f93e029f4 100644
> --- a/arch/x86/kernel/alternative.c
> +++ b/arch/x86/kernel/alternative.c
> @@ -257,6 +257,9 @@ void __init arch_init_ideal_nops(void)
>  /* Use this to add nops to a buffer, then text_poke the whole buffer. */
>  static void __init_or_module add_nops(void *insns, unsigned int len)
>  {
> +	unsigned long flags;
> +
> +	local_irq_save(flags);
>  	while (len > 0) {

I guess you want to optimize the len==0 case to not disable interrupts
needlessly:

	if (!len)
		return;

	local_irq_save(flags);
	while (len > 0)
		...

Other than that, good catch!

Thanks.

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [BUG 4.2-rc8] Interrupt occurs while apply_alternatives() is patching the handler
  2015-09-03  8:50     ` Borislav Petkov
@ 2015-09-03 10:41       ` Thomas Gleixner
  2015-09-03 12:43         ` Josh Boyer
                           ` (4 more replies)
  0 siblings, 5 replies; 15+ messages in thread
From: Thomas Gleixner @ 2015-09-03 10:41 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Richard W.M. Jones, Chuck Ebbert, linux-kernel, x86, Ingo Molnar,
	H. Peter Anvin

On Thu, 3 Sep 2015, Borislav Petkov wrote:
> On Wed, Sep 02, 2015 at 11:11:55AM +0200, Thomas Gleixner wrote:
> >  static void __init_or_module add_nops(void *insns, unsigned int len)
> >  {
> > +	unsigned long flags;
> > +
> > +	local_irq_save(flags);
> >  	while (len > 0) {
> 
> I guess you want to optimize the len==0 case to not disable interrupts
> needlessly:
> 
> 	if (!len)
> 		return;
> 
> 	local_irq_save(flags);
> 	while (len > 0)
> 		...

Nah. I rather put the local_irq_save into optimize_nops(). All other
callers of add_nops() are operating on a buffer and use text_poke
after that. Aside of that optimize_nops() is missing a sync_core().

Updated patch below.

Thanks,

	tglx

----------------->
Subject: x86/alternatives: Make optimize_nops() interrupt safe and synced
From: Thomas Gleixner <tglx@linutronix.de>
Date: Thu, 03 Sep 2015 12:34:55 +0200

optimize_nops() is buggy in two aspects:

- It's not disabling interrupts across the modification
- It's lacking a sync_core() call

Fixes: 4fd4b6e5537c 'x86/alternatives: Use optimized NOPs for padding'
Reported-by: "Richard W.M. Jones" <rjones@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/alternative.c |    5 +++++
 1 file changed, 5 insertions(+)

Index: tip/arch/x86/kernel/alternative.c
===================================================================
--- tip.orig/arch/x86/kernel/alternative.c
+++ tip/arch/x86/kernel/alternative.c
@@ -338,10 +338,15 @@ done:
 
 static void __init_or_module optimize_nops(struct alt_instr *a, u8 *instr)
 {
+	unsigned long flags;
+
 	if (instr[0] != 0x90)
 		return;
 
+	local_irq_save(flags);
 	add_nops(instr + (a->instrlen - a->padlen), a->padlen);
+	sync_core();
+	local_irq_restore(flags);
 
 	DUMP_BYTES(instr, a->instrlen, "%p: [%d:%d) optimized NOPs: ",
 		   instr, a->instrlen - a->padlen, a->padlen);



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [BUG 4.2-rc8] Interrupt occurs while apply_alternatives() is patching the handler
  2015-09-03 10:41       ` Thomas Gleixner
@ 2015-09-03 12:43         ` Josh Boyer
  2015-09-03 13:01           ` Thomas Gleixner
  2015-09-03 15:48         ` Richard W.M. Jones
                           ` (3 subsequent siblings)
  4 siblings, 1 reply; 15+ messages in thread
From: Josh Boyer @ 2015-09-03 12:43 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Borislav Petkov, Richard W.M. Jones, Chuck Ebbert,
	Linux-Kernel@Vger. Kernel. Org, x86, Ingo Molnar, H. Peter Anvin

On Thu, Sep 3, 2015 at 6:41 AM, Thomas Gleixner <tglx@linutronix.de> wrote:
> On Thu, 3 Sep 2015, Borislav Petkov wrote:
>> On Wed, Sep 02, 2015 at 11:11:55AM +0200, Thomas Gleixner wrote:
>> >  static void __init_or_module add_nops(void *insns, unsigned int len)
>> >  {
>> > +   unsigned long flags;
>> > +
>> > +   local_irq_save(flags);
>> >     while (len > 0) {
>>
>> I guess you want to optimize the len==0 case to not disable interrupts
>> needlessly:
>>
>>       if (!len)
>>               return;
>>
>>       local_irq_save(flags);
>>       while (len > 0)
>>               ...
>
> Nah. I rather put the local_irq_save into optimize_nops(). All other
> callers of add_nops() are operating on a buffer and use text_poke
> after that. Aside of that optimize_nops() is missing a sync_core().
>
> Updated patch below.
>
> Thanks,
>
>         tglx
>
> ----------------->
> Subject: x86/alternatives: Make optimize_nops() interrupt safe and synced
> From: Thomas Gleixner <tglx@linutronix.de>
> Date: Thu, 03 Sep 2015 12:34:55 +0200
>
> optimize_nops() is buggy in two aspects:
>
> - It's not disabling interrupts across the modification
> - It's lacking a sync_core() call
>
> Fixes: 4fd4b6e5537c 'x86/alternatives: Use optimized NOPs for padding'

Possibly CC: stable since it fixes a commit in 4.1 and 4.2?

josh

> Reported-by: "Richard W.M. Jones" <rjones@redhat.com>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---
>  arch/x86/kernel/alternative.c |    5 +++++
>  1 file changed, 5 insertions(+)
>
> Index: tip/arch/x86/kernel/alternative.c
> ===================================================================
> --- tip.orig/arch/x86/kernel/alternative.c
> +++ tip/arch/x86/kernel/alternative.c
> @@ -338,10 +338,15 @@ done:
>
>  static void __init_or_module optimize_nops(struct alt_instr *a, u8 *instr)
>  {
> +       unsigned long flags;
> +
>         if (instr[0] != 0x90)
>                 return;
>
> +       local_irq_save(flags);
>         add_nops(instr + (a->instrlen - a->padlen), a->padlen);
> +       sync_core();
> +       local_irq_restore(flags);
>
>         DUMP_BYTES(instr, a->instrlen, "%p: [%d:%d) optimized NOPs: ",
>                    instr, a->instrlen - a->padlen, a->padlen);
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [BUG 4.2-rc8] Interrupt occurs while apply_alternatives() is patching the handler
  2015-09-03 12:43         ` Josh Boyer
@ 2015-09-03 13:01           ` Thomas Gleixner
  0 siblings, 0 replies; 15+ messages in thread
From: Thomas Gleixner @ 2015-09-03 13:01 UTC (permalink / raw)
  To: Josh Boyer
  Cc: Borislav Petkov, Richard W.M. Jones, Chuck Ebbert,
	Linux-Kernel@Vger. Kernel. Org, x86, Ingo Molnar, H. Peter Anvin

On Thu, 3 Sep 2015, Josh Boyer wrote:
> > Fixes: 4fd4b6e5537c 'x86/alternatives: Use optimized NOPs for padding'
> 
> Possibly CC: stable since it fixes a commit in 4.1 and 4.2?

Indeed.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [BUG 4.2-rc8] Interrupt occurs while apply_alternatives() is patching the handler
  2015-09-03 10:41       ` Thomas Gleixner
  2015-09-03 12:43         ` Josh Boyer
@ 2015-09-03 15:48         ` Richard W.M. Jones
  2015-09-03 19:30         ` [tip:x86/urgent] x86/alternatives: Make optimize_nops() interrupt safe and synced tip-bot for Thomas Gleixner
                           ` (2 subsequent siblings)
  4 siblings, 0 replies; 15+ messages in thread
From: Richard W.M. Jones @ 2015-09-03 15:48 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Borislav Petkov, Chuck Ebbert, linux-kernel, x86, Ingo Molnar,
	H. Peter Anvin

On Thu, Sep 03, 2015 at 12:41:47PM +0200, Thomas Gleixner wrote:
> On Thu, 3 Sep 2015, Borislav Petkov wrote:
> > On Wed, Sep 02, 2015 at 11:11:55AM +0200, Thomas Gleixner wrote:
> > >  static void __init_or_module add_nops(void *insns, unsigned int len)
> > >  {
> > > +	unsigned long flags;
> > > +
> > > +	local_irq_save(flags);
> > >  	while (len > 0) {
> > 
> > I guess you want to optimize the len==0 case to not disable interrupts
> > needlessly:
> > 
> > 	if (!len)
> > 		return;
> > 
> > 	local_irq_save(flags);
> > 	while (len > 0)
> > 		...
> 
> Nah. I rather put the local_irq_save into optimize_nops(). All other
> callers of add_nops() are operating on a buffer and use text_poke
> after that. Aside of that optimize_nops() is missing a sync_core().
> 
> Updated patch below.

The V2 patch has got to 900 iterations without hitting the problem.
As that is a lot more than without the patch, you can add:

  Tested-by: Richard W.M. Jones <rjones@redhat.com>

I will leave it going overnight just in case.

Thanks,

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-top is 'top' for virtual machines.  Tiny program with many
powerful monitoring features, net stats, disk stats, logging, etc.
http://people.redhat.com/~rjones/virt-top

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [tip:x86/urgent] x86/alternatives: Make optimize_nops() interrupt safe and synced
  2015-09-03 10:41       ` Thomas Gleixner
  2015-09-03 12:43         ` Josh Boyer
  2015-09-03 15:48         ` Richard W.M. Jones
@ 2015-09-03 19:30         ` tip-bot for Thomas Gleixner
  2015-09-04  7:40         ` [BUG 4.2-rc8] Interrupt occurs while apply_alternatives() is patching the handler Richard W.M. Jones
  2015-09-04 12:02         ` Borislav Petkov
  4 siblings, 0 replies; 15+ messages in thread
From: tip-bot for Thomas Gleixner @ 2015-09-03 19:30 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: bp, mingo, rjones, hpa, tglx, linux-kernel, cebbert.lkml

Commit-ID:  66c117d7fa2ae429911e60d84bf31a90b2b96189
Gitweb:     http://git.kernel.org/tip/66c117d7fa2ae429911e60d84bf31a90b2b96189
Author:     Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Thu, 3 Sep 2015 12:34:55 +0200
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Thu, 3 Sep 2015 21:27:47 +0200

x86/alternatives: Make optimize_nops() interrupt safe and synced

Richard reported the following crash:

[    0.036000] BUG: unable to handle kernel paging request at 55501e06
[    0.036000] IP: [<c0aae48b>] common_interrupt+0xb/0x38
[    0.036000] Call Trace:
[    0.036000]  [<c0409c80>] ? add_nops+0x90/0xa0
[    0.036000]  [<c040a054>] apply_alternatives+0x274/0x630

Chuck decoded:

 "  0:   8d 90 90 83 04 24       lea    0x24048390(%eax),%edx
    6:   80 fc 0f                cmp    $0xf,%ah
    9:   a8 0f                   test   $0xf,%al
 >> b:   a0 06 1e 50 55          mov    0x55501e06,%al
   10:   57                      push   %edi
   11:   56                      push   %esi

 Interrupt 0x30 occurred while the alternatives code was replacing the
 initial 0x90,0x90,0x90 NOPs (from the ASM_CLAC macro) with the
 optimized version, 0x8d,0x76,0x00. Only the first byte has been
 replaced so far, and it makes a mess out of the insn decoding."

optimize_nops() is buggy in two aspects:

- It's not disabling interrupts across the modification
- It's lacking a sync_core() call

Add both.

Fixes: 4fd4b6e5537c 'x86/alternatives: Use optimized NOPs for padding'
Reported-and-tested-by: "Richard W.M. Jones" <rjones@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Richard W.M. Jones <rjones@redhat.com>
Cc: Chuck Ebbert <cebbert.lkml@gmail.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1509031232340.15006@nanos
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/alternative.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index c42827e..25f9093 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -338,10 +338,15 @@ done:
 
 static void __init_or_module optimize_nops(struct alt_instr *a, u8 *instr)
 {
+	unsigned long flags;
+
 	if (instr[0] != 0x90)
 		return;
 
+	local_irq_save(flags);
 	add_nops(instr + (a->instrlen - a->padlen), a->padlen);
+	sync_core();
+	local_irq_restore(flags);
 
 	DUMP_BYTES(instr, a->instrlen, "%p: [%d:%d) optimized NOPs: ",
 		   instr, a->instrlen - a->padlen, a->padlen);

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [BUG 4.2-rc8] Interrupt occurs while apply_alternatives() is patching the handler
  2015-09-03 10:41       ` Thomas Gleixner
                           ` (2 preceding siblings ...)
  2015-09-03 19:30         ` [tip:x86/urgent] x86/alternatives: Make optimize_nops() interrupt safe and synced tip-bot for Thomas Gleixner
@ 2015-09-04  7:40         ` Richard W.M. Jones
  2015-09-04 12:02         ` Borislav Petkov
  4 siblings, 0 replies; 15+ messages in thread
From: Richard W.M. Jones @ 2015-09-04  7:40 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Borislav Petkov, Chuck Ebbert, linux-kernel, x86, Ingo Molnar,
	H. Peter Anvin

On Thu, Sep 03, 2015 at 12:41:47PM +0200, Thomas Gleixner wrote:
> On Thu, 3 Sep 2015, Borislav Petkov wrote:
> > On Wed, Sep 02, 2015 at 11:11:55AM +0200, Thomas Gleixner wrote:
> > >  static void __init_or_module add_nops(void *insns, unsigned int len)
> > >  {
> > > +	unsigned long flags;
> > > +
> > > +	local_irq_save(flags);
> > >  	while (len > 0) {
> > 
> > I guess you want to optimize the len==0 case to not disable interrupts
> > needlessly:
> > 
> > 	if (!len)
> > 		return;
> > 
> > 	local_irq_save(flags);
> > 	while (len > 0)
> > 		...
> 
> Nah. I rather put the local_irq_save into optimize_nops(). All other
> callers of add_nops() are operating on a buffer and use text_poke
> after that. Aside of that optimize_nops() is missing a sync_core().
> 
> Updated patch below.

V2 of the patch managed ~ 5000 iterations overnight without hitting
the problem, so looks like it's a good fix.

Thanks, Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-top is 'top' for virtual machines.  Tiny program with many
powerful monitoring features, net stats, disk stats, logging, etc.
http://people.redhat.com/~rjones/virt-top

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [BUG 4.2-rc8] Interrupt occurs while apply_alternatives() is patching the handler
  2015-09-03 10:41       ` Thomas Gleixner
                           ` (3 preceding siblings ...)
  2015-09-04  7:40         ` [BUG 4.2-rc8] Interrupt occurs while apply_alternatives() is patching the handler Richard W.M. Jones
@ 2015-09-04 12:02         ` Borislav Petkov
  2015-09-04 13:37           ` Thomas Gleixner
  4 siblings, 1 reply; 15+ messages in thread
From: Borislav Petkov @ 2015-09-04 12:02 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Richard W.M. Jones, Chuck Ebbert, linux-kernel, x86, Ingo Molnar,
	H. Peter Anvin

On Thu, Sep 03, 2015 at 12:41:47PM +0200, Thomas Gleixner wrote:
> Nah. I rather put the local_irq_save into optimize_nops(). All other
> callers of add_nops() are operating on a buffer and use text_poke
> after that. Aside of that optimize_nops() is missing a sync_core().

Whoops.

> Updated patch below.

Looks good, thanks.

Hrrm, maybe optimize_nops() should work on a buffer and do
text_poke_early() too at the end, so that it doesn't differ from all the
other paths changing kernel text. So as to stuff like that doesn't get
missed next time I'm touching it.

But with your way we don't need the additional buffer. I guess your
solution is better resource-wise.

Thanks.

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [BUG 4.2-rc8] Interrupt occurs while apply_alternatives() is patching the handler
  2015-09-04 12:02         ` Borislav Petkov
@ 2015-09-04 13:37           ` Thomas Gleixner
  2015-09-05 15:25             ` Borislav Petkov
  0 siblings, 1 reply; 15+ messages in thread
From: Thomas Gleixner @ 2015-09-04 13:37 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Richard W.M. Jones, Chuck Ebbert, linux-kernel, x86, Ingo Molnar,
	H. Peter Anvin

On Fri, 4 Sep 2015, Borislav Petkov wrote:
> On Thu, Sep 03, 2015 at 12:41:47PM +0200, Thomas Gleixner wrote:
> > Nah. I rather put the local_irq_save into optimize_nops(). All other
> > callers of add_nops() are operating on a buffer and use text_poke
> > after that. Aside of that optimize_nops() is missing a sync_core().
> 
> Whoops.
> 
> > Updated patch below.
> 
> Looks good, thanks.
> 
> Hrrm, maybe optimize_nops() should work on a buffer and do
> text_poke_early() too at the end, so that it doesn't differ from all the
> other paths changing kernel text. So as to stuff like that doesn't get
> missed next time I'm touching it.
> 
> But with your way we don't need the additional buffer. I guess your
> solution is better resource-wise.

I pondered the buffer/text_poke variant, but that's too much of a
hassle for a fix supposed to go to stable.

Resourcewise you could use the buffer which is available in
apply_alternatives anyway.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [BUG 4.2-rc8] Interrupt occurs while apply_alternatives() is patching the handler
  2015-09-04 13:37           ` Thomas Gleixner
@ 2015-09-05 15:25             ` Borislav Petkov
  0 siblings, 0 replies; 15+ messages in thread
From: Borislav Petkov @ 2015-09-05 15:25 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Richard W.M. Jones, Chuck Ebbert, linux-kernel, x86, Ingo Molnar,
	H. Peter Anvin

On Fri, Sep 04, 2015 at 03:37:24PM +0200, Thomas Gleixner wrote:
> I pondered the buffer/text_poke variant, but that's too much of a
> hassle for a fix supposed to go to stable.

True.

> Resourcewise you could use the buffer which is available in
> apply_alternatives anyway.

I'll poke at it when I get back - who knows, it might get not too ugly
to keep it alive...

Thanks.

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, back to index

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-08-31  2:37 [BUG 4.2-rc8] Interrupt occurs while apply_alternatives() is patching the handler Chuck Ebbert
2015-09-01  6:20 ` Richard W.M. Jones
2015-09-02  9:11   ` Thomas Gleixner
2015-09-02 19:05     ` Richard W.M. Jones
2015-09-03  7:53       ` Richard W.M. Jones
2015-09-03  8:50     ` Borislav Petkov
2015-09-03 10:41       ` Thomas Gleixner
2015-09-03 12:43         ` Josh Boyer
2015-09-03 13:01           ` Thomas Gleixner
2015-09-03 15:48         ` Richard W.M. Jones
2015-09-03 19:30         ` [tip:x86/urgent] x86/alternatives: Make optimize_nops() interrupt safe and synced tip-bot for Thomas Gleixner
2015-09-04  7:40         ` [BUG 4.2-rc8] Interrupt occurs while apply_alternatives() is patching the handler Richard W.M. Jones
2015-09-04 12:02         ` Borislav Petkov
2015-09-04 13:37           ` Thomas Gleixner
2015-09-05 15:25             ` Borislav Petkov

LKML Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/lkml/0 lkml/git/0.git
	git clone --mirror https://lore.kernel.org/lkml/1 lkml/git/1.git
	git clone --mirror https://lore.kernel.org/lkml/2 lkml/git/2.git
	git clone --mirror https://lore.kernel.org/lkml/3 lkml/git/3.git
	git clone --mirror https://lore.kernel.org/lkml/4 lkml/git/4.git
	git clone --mirror https://lore.kernel.org/lkml/5 lkml/git/5.git
	git clone --mirror https://lore.kernel.org/lkml/6 lkml/git/6.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 lkml lkml/ https://lore.kernel.org/lkml \
		linux-kernel@vger.kernel.org linux-kernel@archiver.kernel.org
	public-inbox-index lkml


Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-kernel


AGPL code for this site: git clone https://public-inbox.org/ public-inbox