All of lore.kernel.org
 help / color / mirror / Atom feed
* 32 bit qemu regression from v6.5 tip pull [6c480f222128 x86/alternative: Rewrite optimize_nops() some]
@ 2023-10-29 18:41 Paul Gortmaker
  2023-10-30  8:26 ` Peter Zijlstra
  0 siblings, 1 reply; 21+ messages in thread
From: Paul Gortmaker @ 2023-10-29 18:41 UTC (permalink / raw)
  To: Peter Zijlstra, Borislav Petkov
  Cc: Richard Purdie, Thomas Gleixner, x86, linux-kernel

The TL;DR is that the Yocto folks encountered a regression in their
automated QA tests (after a move from v6.4 --> v6.5) where non-KVM
enabled boot tests on 32 bit x86 would (with ~2% frequency) splat with:

[0.326235] x86/fpu: Supporting XSAVE feature 0x004: 'AVX registers'
[0.326556] x86/fpu: xstate_offset[2]:  576, xstate_sizes[2]:  256
[0.326965] x86/fpu: Enabled xstate features 0x7, context size is 832 bytes, using 'standard' format.
[0.331789] __common_interrupt: 0.167 No irq handler for vector
[0.331789] __common_interrupt: 0.112 No irq handler for vector
[0.331789] iret exception: 0000 [#1] PREEMPT SMP
[0.331789] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.5.7-yocto-standard #1
[0.331789] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.2-0-gea1b7a073390-prebuilt.qemu.org 04/01/2014
[0.331789] EIP: 0x60
[0.331789] Code: Unable to access opcode bytes at 0x36.

..or similar - common theme being FPU init and __common_interrupt.

The 2% reproducibility was a problem, so the Yocto folks asked me to
take a look, and keeping with the TL;DR I managed to bisect it to the
tip merge of alternates, and then in turn to the commit within:

6c480f222128 x86/alternative: Rewrite optimize_nops() some

That failed six times in 381 qemu boots.  I've run the commit below it,
14e4ec9c3e91 close to 1500 times (still going) without a fail - since as
we all know at 2%, that bad is bad but good is only statistically proven.

I'm not quite sure where to go next.  Has been nearly 20 years since
I've had to juggle NOP counts for some IMHO broken MIPS pipeline.  So I
figured I best report it at this point.

I've put a bunch of details in the bugzilla of the Yocto folks here:

https://bugzilla.yoctoproject.org/show_bug.cgi?id=15230

Skip ahead to comment 11 and you'll avoid me chasing FPU changes like
tglx's FPU init relocation commits, only to go nowhere.

I've kept kernel build dirs, boot logs, etc for all the commits I've
touched down into for testing, so I can revisit and re-test easily.

Paul.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: 32 bit qemu regression from v6.5 tip pull [6c480f222128 x86/alternative: Rewrite optimize_nops() some]
  2023-10-29 18:41 32 bit qemu regression from v6.5 tip pull [6c480f222128 x86/alternative: Rewrite optimize_nops() some] Paul Gortmaker
@ 2023-10-30  8:26 ` Peter Zijlstra
  2023-10-30 10:55   ` Richard Purdie
  0 siblings, 1 reply; 21+ messages in thread
From: Peter Zijlstra @ 2023-10-30  8:26 UTC (permalink / raw)
  To: Paul Gortmaker
  Cc: Borislav Petkov, Richard Purdie, Thomas Gleixner, x86, linux-kernel

On Sun, Oct 29, 2023 at 02:41:46PM -0400, Paul Gortmaker wrote:
> The TL;DR is that the Yocto folks encountered a regression in their
> automated QA tests (after a move from v6.4 --> v6.5) where non-KVM
> enabled boot tests on 32 bit x86 would (with ~2% frequency) splat with:

You're sure you're not running into this here:

  https://lkml.kernel.org/r/20230706170537.95959-1-richard.henderson@linaro.org

?

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: 32 bit qemu regression from v6.5 tip pull [6c480f222128 x86/alternative: Rewrite optimize_nops() some]
  2023-10-30  8:26 ` Peter Zijlstra
@ 2023-10-30 10:55   ` Richard Purdie
  2023-10-30 11:44     ` Peter Zijlstra
  0 siblings, 1 reply; 21+ messages in thread
From: Richard Purdie @ 2023-10-30 10:55 UTC (permalink / raw)
  To: Peter Zijlstra, Paul Gortmaker
  Cc: Borislav Petkov, Thomas Gleixner, x86, linux-kernel

On Mon, 2023-10-30 at 09:26 +0100, Peter Zijlstra wrote:
> On Sun, Oct 29, 2023 at 02:41:46PM -0400, Paul Gortmaker wrote:
> > The TL;DR is that the Yocto folks encountered a regression in their
> > automated QA tests (after a move from v6.4 --> v6.5) where non-KVM
> > enabled boot tests on 32 bit x86 would (with ~2% frequency) splat with:
> 
> You're sure you're not running into this here:
> 
>   https://lkml.kernel.org/r/20230706170537.95959-1-richard.henderson@linaro.org
> 
> ?

We're using qemu 8.1.0. Whilst I will get us updated to 8.1.2 and see
if that helps, I think those commits are in 8.1.0:

$ git show cb62bd15e14e304617d250158b77d0deb032f03
commit cb62bd15e14e304617d250158b77d0deb032f032
Author: Richard Henderson <richard.henderson@linaro.org>
Date:   Thu Jul 6 08:45:13 2023 +0100

    accel/tcg: Split out cpu_exec_longjmp_cleanup
[...]
$ git tag --contains cb62bd15e14e304617d250158b77d0deb032f03
v8.1.0
v8.1.0-rc0
v8.1.0-rc1
v8.1.0-rc2
v8.1.0-rc3
v8.1.0-rc4
v8.1.1
v8.1.2

Similarly for:

commit deba78709ae8ce103e2248413857747f804cd1ef
Author: Richard Henderson <richard.henderson@linaro.org>
Date:   Thu Jul 6 17:55:48 2023 +0100

    accel/tcg: Always lock pages before translation

and

commit ad17868eb162a5466d8ad43e5ccb428776403308
Author: Richard Henderson <richard.henderson@linaro.org>
Date:   Wed Jul 26 12:58:08 2023 -0700

    accel/tcg: Clear tcg_ctx->gen_tb on buffer overflow
[...]
    Fixes: deba78709ae8 ("accel/tcg: Always lock pages before translation")

Both of which are also in 8.1.0.

Is there any other patch related to those we might be missing?

Cheers,

Richard

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: 32 bit qemu regression from v6.5 tip pull [6c480f222128 x86/alternative: Rewrite optimize_nops() some]
  2023-10-30 10:55   ` Richard Purdie
@ 2023-10-30 11:44     ` Peter Zijlstra
  2023-10-30 15:28       ` Paul Gortmaker
  0 siblings, 1 reply; 21+ messages in thread
From: Peter Zijlstra @ 2023-10-30 11:44 UTC (permalink / raw)
  To: Richard Purdie
  Cc: Paul Gortmaker, Borislav Petkov, Thomas Gleixner, x86, linux-kernel

On Mon, Oct 30, 2023 at 10:55:26AM +0000, Richard Purdie wrote:
> On Mon, 2023-10-30 at 09:26 +0100, Peter Zijlstra wrote:
> > On Sun, Oct 29, 2023 at 02:41:46PM -0400, Paul Gortmaker wrote:
> > > The TL;DR is that the Yocto folks encountered a regression in their
> > > automated QA tests (after a move from v6.4 --> v6.5) where non-KVM
> > > enabled boot tests on 32 bit x86 would (with ~2% frequency) splat with:
> > 
> > You're sure you're not running into this here:
> > 
> >   https://lkml.kernel.org/r/20230706170537.95959-1-richard.henderson@linaro.org
> > 
> > ?
> 
> We're using qemu 8.1.0. Whilst I will get us updated to 8.1.2 and see
> if that helps, I think those commits are in 8.1.0:
> 
> $ git show cb62bd15e14e304617d250158b77d0deb032f03
> commit cb62bd15e14e304617d250158b77d0deb032f032
> Author: Richard Henderson <richard.henderson@linaro.org>
> Date:   Thu Jul 6 08:45:13 2023 +0100
> 
>     accel/tcg: Split out cpu_exec_longjmp_cleanup
> [...]
> $ git tag --contains cb62bd15e14e304617d250158b77d0deb032f03
> v8.1.0
> v8.1.0-rc0
> v8.1.0-rc1
> v8.1.0-rc2
> v8.1.0-rc3
> v8.1.0-rc4
> v8.1.1
> v8.1.2
> 
> Similarly for:
> 
> commit deba78709ae8ce103e2248413857747f804cd1ef
> Author: Richard Henderson <richard.henderson@linaro.org>
> Date:   Thu Jul 6 17:55:48 2023 +0100
> 
>     accel/tcg: Always lock pages before translation
> 
> and
> 
> commit ad17868eb162a5466d8ad43e5ccb428776403308
> Author: Richard Henderson <richard.henderson@linaro.org>
> Date:   Wed Jul 26 12:58:08 2023 -0700
> 
>     accel/tcg: Clear tcg_ctx->gen_tb on buffer overflow
> [...]
>     Fixes: deba78709ae8 ("accel/tcg: Always lock pages before translation")
> 
> Both of which are also in 8.1.0.
> 
> Is there any other patch related to those we might be missing?

Not sure -- afaik that was it.

Thomas was looking at this and wondered if something like the below
would help?

---
 arch/x86/kernel/alternative.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index 73be3931e4f0..fd44739828f7 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -1685,8 +1685,8 @@ void __init_or_module text_poke_early(void *addr, const void *opcode,
 	} else {
 		local_irq_save(flags);
 		memcpy(addr, opcode, len);
-		local_irq_restore(flags);
 		sync_core();
+		local_irq_restore(flags);
 
 		/*
 		 * Could also do a CLFLUSH here to speed up CPU recovery; but

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: 32 bit qemu regression from v6.5 tip pull [6c480f222128 x86/alternative: Rewrite optimize_nops() some]
  2023-10-30 11:44     ` Peter Zijlstra
@ 2023-10-30 15:28       ` Paul Gortmaker
  2023-10-30 18:24         ` Thomas Gleixner
  0 siblings, 1 reply; 21+ messages in thread
From: Paul Gortmaker @ 2023-10-30 15:28 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Richard Purdie, Borislav Petkov, Thomas Gleixner, x86, linux-kernel

[Re: 32 bit qemu regression from v6.5 tip pull [6c480f222128 x86/alternative: Rewrite optimize_nops() some]] On 30/10/2023 (Mon 12:44) Peter Zijlstra wrote:

> Thomas was looking at this and wondered if something like the below
> would help?

I tested this on a vanilla v6.5.7 baseline, for lack of a better choice
and got six failures in 136 boots - everything else unchanged - even the
shell instance that builds the kernel.

Paul.
--

> 
> ---
>  arch/x86/kernel/alternative.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
> index 73be3931e4f0..fd44739828f7 100644
> --- a/arch/x86/kernel/alternative.c
> +++ b/arch/x86/kernel/alternative.c
> @@ -1685,8 +1685,8 @@ void __init_or_module text_poke_early(void *addr, const void *opcode,
>  	} else {
>  		local_irq_save(flags);
>  		memcpy(addr, opcode, len);
> -		local_irq_restore(flags);
>  		sync_core();
> +		local_irq_restore(flags);
>  
>  		/*
>  		 * Could also do a CLFLUSH here to speed up CPU recovery; but

rp-qemu32-v6.5.7-tglx$while [ 1 ] ; do date ; ls -l1 qemurunner_log* | wc -l ; grep  -l   _common qemu_boot_log.2023* ;  sleep 5m ; done
Mon 30 Oct 2023 10:29:57 AM EDT
22
Mon 30 Oct 2023 10:34:57 AM EDT
32
qemu_boot_log.20231030103001
Mon 30 Oct 2023 10:39:57 AM EDT
46
qemu_boot_log.20231030103001
Mon 30 Oct 2023 10:44:57 AM EDT
60
qemu_boot_log.20231030103001
Mon 30 Oct 2023 10:49:57 AM EDT
70
qemu_boot_log.20231030103001
qemu_boot_log.20231030104722
Mon 30 Oct 2023 10:54:57 AM EDT
80
qemu_boot_log.20231030103001
qemu_boot_log.20231030104722
qemu_boot_log.20231030105036
Mon 30 Oct 2023 10:59:57 AM EDT
89
qemu_boot_log.20231030103001
qemu_boot_log.20231030104722
qemu_boot_log.20231030105036
qemu_boot_log.20231030105809
Mon 30 Oct 2023 11:04:57 AM EDT
99
qemu_boot_log.20231030103001
qemu_boot_log.20231030104722
qemu_boot_log.20231030105036
qemu_boot_log.20231030105809
qemu_boot_log.20231030110019
Mon 30 Oct 2023 11:09:57 AM EDT
109
qemu_boot_log.20231030103001
qemu_boot_log.20231030104722
qemu_boot_log.20231030105036
qemu_boot_log.20231030105809
qemu_boot_log.20231030110019
qemu_boot_log.20231030110615
Mon 30 Oct 2023 11:14:57 AM EDT
123
qemu_boot_log.20231030103001
qemu_boot_log.20231030104722
qemu_boot_log.20231030105036
qemu_boot_log.20231030105809
qemu_boot_log.20231030110019
qemu_boot_log.20231030110615
Mon 30 Oct 2023 11:19:57 AM EDT
136
qemu_boot_log.20231030103001
qemu_boot_log.20231030104722
qemu_boot_log.20231030105036
qemu_boot_log.20231030105809
qemu_boot_log.20231030110019
qemu_boot_log.20231030110615
^C

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: 32 bit qemu regression from v6.5 tip pull [6c480f222128 x86/alternative: Rewrite optimize_nops() some]
  2023-10-30 15:28       ` Paul Gortmaker
@ 2023-10-30 18:24         ` Thomas Gleixner
  2023-10-30 19:30           ` Thomas Gleixner
  0 siblings, 1 reply; 21+ messages in thread
From: Thomas Gleixner @ 2023-10-30 18:24 UTC (permalink / raw)
  To: Paul Gortmaker, Peter Zijlstra
  Cc: Richard Purdie, Borislav Petkov, x86, linux-kernel

On Mon, Oct 30 2023 at 11:28, Paul Gortmaker wrote:
> [Re: 32 bit qemu regression from v6.5 tip pull [6c480f222128 x86/alternative: Rewrite optimize_nops() some]] On 30/10/2023 (Mon 12:44) Peter Zijlstra wrote:
>
>> Thomas was looking at this and wondered if something like the below
>> would help?
>
> I tested this on a vanilla v6.5.7 baseline, for lack of a better choice
> and got six failures in 136 boots - everything else unchanged - even the
> shell instance that builds the kernel.

While the sync_core() invocation is definitely at the wrong place, I did
not really expect that this cures it.

Can you add "debug-alternative" to the kernel command line and log both
a working and the non-working kernel output. It's noisy :)

Also do you have a .config and the qemu command line handy?

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: 32 bit qemu regression from v6.5 tip pull [6c480f222128 x86/alternative: Rewrite optimize_nops() some]
  2023-10-30 18:24         ` Thomas Gleixner
@ 2023-10-30 19:30           ` Thomas Gleixner
  2023-10-31 15:40             ` Paul Gortmaker
  0 siblings, 1 reply; 21+ messages in thread
From: Thomas Gleixner @ 2023-10-30 19:30 UTC (permalink / raw)
  To: Paul Gortmaker, Peter Zijlstra
  Cc: Richard Purdie, Borislav Petkov, x86, linux-kernel

On Mon, Oct 30 2023 at 19:24, Thomas Gleixner wrote:
> On Mon, Oct 30 2023 at 11:28, Paul Gortmaker wrote:
>> [Re: 32 bit qemu regression from v6.5 tip pull [6c480f222128 x86/alternative: Rewrite optimize_nops() some]] On 30/10/2023 (Mon 12:44) Peter Zijlstra wrote:
>>
>>> Thomas was looking at this and wondered if something like the below
>>> would help?
>>
>> I tested this on a vanilla v6.5.7 baseline, for lack of a better choice
>> and got six failures in 136 boots - everything else unchanged - even the
>> shell instance that builds the kernel.
>
> While the sync_core() invocation is definitely at the wrong place, I did
> not really expect that this cures it.
>
> Can you add "debug-alternative" to the kernel command line and log both
> a working and the non-working kernel output. It's noisy :)
>
> Also do you have a .config and the qemu command line handy?

Forgot to ask: Does the probkem persist with 6.6 ?

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: 32 bit qemu regression from v6.5 tip pull [6c480f222128 x86/alternative: Rewrite optimize_nops() some]
  2023-10-30 19:30           ` Thomas Gleixner
@ 2023-10-31 15:40             ` Paul Gortmaker
  2023-11-11 11:51               ` Linux regression tracking (Thorsten Leemhuis)
  0 siblings, 1 reply; 21+ messages in thread
From: Paul Gortmaker @ 2023-10-31 15:40 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Peter Zijlstra, Richard Purdie, Borislav Petkov, x86, linux-kernel

[Re: 32 bit qemu regression from v6.5 tip pull [6c480f222128 x86/alternative: Rewrite optimize_nops() some]] On 30/10/2023 (Mon 20:30) Thomas Gleixner wrote:

> On Mon, Oct 30 2023 at 19:24, Thomas Gleixner wrote:

> > Can you add "debug-alternative" to the kernel command line and log both
> > a working and the non-working kernel output. It's noisy :)
> >
> > Also do you have a .config and the qemu command line handy?
> 
> Forgot to ask: Does the probkem persist with 6.6 ?

My bad, should have mentioned v6.6 and linux-next is impacted. Always a
conflict between burying the lead and boring people with details.

The "debug-alternative" is definitely "chatty" -- the "works" case is
easy to capture -- working on getting a snapshot of the "fails" case.

I will attach a .config to the yocto bugzilla case, for lack of a better
place to put it.

The giant sh*tshow of qemu args used by QA are as follows:

Running
paul/poky/build-qemu-x86_32/tmp/work/x86_64-linux/qemu-helper-native/1.0/recipe-sysroot-native/usr/bin/qemu-system-i386 
-device virtio-net-pci,netdev=net0,mac=52:54:00:12:34:02 -netdev
tap,id=net0,ifname=tap0,script=no,downscript=no -object
rng-random,filename=/dev/urandom,id=r
ng0 -device virtio-rng-pci,rng=rng0 -drive
file=./core-image-minimal-qemux86.rootfs.ext4.2545136,if=virtio,format=raw
-usb -device usb-tablet -usb -device usb
-kbd   -cpu IvyBridge -machine q35,i8042=off -smp 4 -m 256 -serial
tcp:127.0.0.1:45019 -serial tcp:127.0.0.1:49567  -pidfile
/folk/pgortmak/tglx/pidfile_25451
32  -S -qmp unix:./.ut8nuyx1,server,wait -qmp
unix:./.7mu5pxk1,server,nowait -nographic  -kernel
/yow-lpggp32/paul/poky/build-qemu-x86_32/tmp/deploy/images/qemux86/bzImage -append 'root=/dev/vda rw
ip=192.168.7.2::192.168.7.1:255.255.255.0::eth0:off:8.8.8.8
net.ifnames=0 console=ttyS0 console=ttyS1 oprofile.timer=
1 tsc=reliable no_timer_check rcupdate.rcu_expedited=1 swiotlb=0
debug-alternative printk.time=1'

It pains me that it fills a screen.  But there it is.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: 32 bit qemu regression from v6.5 tip pull [6c480f222128 x86/alternative: Rewrite optimize_nops() some]
  2023-10-31 15:40             ` Paul Gortmaker
@ 2023-11-11 11:51               ` Linux regression tracking (Thorsten Leemhuis)
  2023-11-22 14:11                 ` Richard Purdie
  2023-11-29  8:57                 ` Thomas Gleixner
  0 siblings, 2 replies; 21+ messages in thread
From: Linux regression tracking (Thorsten Leemhuis) @ 2023-11-11 11:51 UTC (permalink / raw)
  To: paul.gortmaker
  Cc: bp, linux-kernel, peterz, richard.purdie, tglx, x86,
	Linux kernel regressions list

[CCing the regression list, as it should be in the loop for regressions:
https://docs.kernel.org/admin-guide/reporting-regressions.html]

> [Re: 32 bit qemu regression from v6.5 tip pull [6c480f222128 x86/alternative: Rewrite optimize_nops() some]] On 30/10/2023 (Mon 20:30) Thomas Gleixner wrote:
> 
>> On Mon, Oct 30 2023 at 19:24, Thomas Gleixner wrote:
> 
>> > Can you add "debug-alternative" to the kernel command line and log both
>> > a working and the non-working kernel output. It's noisy :)
>> >
>> > Also do you have a .config and the qemu command line handy?
>> 
>> Forgot to ask: Does the probkem persist with 6.6 ?
> 
> My bad, should have mentioned v6.6 and linux-next is impacted. Always a
> conflict between burying the lead and boring people with details.

Paul, was this regression ever solved? I wonder as I could not find
anything with a "Fixes: 6c480f222128 [...]" tag on lore and this thread
looks stalled.

Ciao, Thorsten

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: 32 bit qemu regression from v6.5 tip pull [6c480f222128 x86/alternative: Rewrite optimize_nops() some]
  2023-11-11 11:51               ` Linux regression tracking (Thorsten Leemhuis)
@ 2023-11-22 14:11                 ` Richard Purdie
  2023-11-29  8:57                 ` Thomas Gleixner
  1 sibling, 0 replies; 21+ messages in thread
From: Richard Purdie @ 2023-11-22 14:11 UTC (permalink / raw)
  To: Linux regressions mailing list, paul.gortmaker
  Cc: bp, linux-kernel, peterz, tglx, x86

On Sat, 2023-11-11 at 12:51 +0100, Linux regression tracking (Thorsten
Leemhuis) wrote:
> [CCing the regression list, as it should be in the loop for regressions:
> https://docs.kernel.org/admin-guide/reporting-regressions.html]
> 
> > [Re: 32 bit qemu regression from v6.5 tip pull [6c480f222128 x86/alternative: Rewrite optimize_nops() some]] On 30/10/2023 (Mon 20:30) Thomas Gleixner wrote:
> > 
> > > On Mon, Oct 30 2023 at 19:24, Thomas Gleixner wrote:
> > 
> > > > Can you add "debug-alternative" to the kernel command line and log both
> > > > a working and the non-working kernel output. It's noisy :)
> > > > 
> > > > Also do you have a .config and the qemu command line handy?
> > > 
> > > Forgot to ask: Does the probkem persist with 6.6 ?
> > 
> > My bad, should have mentioned v6.6 and linux-next is impacted. Always a
> > conflict between burying the lead and boring people with details.
> 
> Paul, was this regression ever solved? I wonder as I could not find
> anything with a "Fixes: 6c480f222128 [...]" tag on lore and this thread
> looks stalled.

I didn't see a reply to this. I can confirm the issue is still present
and we're not aware of any fix yet.

Cheers,

Richard

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: 32 bit qemu regression from v6.5 tip pull [6c480f222128 x86/alternative: Rewrite optimize_nops() some]
  2023-11-11 11:51               ` Linux regression tracking (Thorsten Leemhuis)
  2023-11-22 14:11                 ` Richard Purdie
@ 2023-11-29  8:57                 ` Thomas Gleixner
  2023-12-06 15:46                   ` Paul Gortmaker
  2023-12-07 19:49                   ` [patch 0/2] x86/alternatives: Prevent crash in NOP optimizer Thomas Gleixner
  1 sibling, 2 replies; 21+ messages in thread
From: Thomas Gleixner @ 2023-11-29  8:57 UTC (permalink / raw)
  To: Linux regression tracking (Thorsten Leemhuis), paul.gortmaker
  Cc: bp, linux-kernel, peterz, richard.purdie, x86,
	Linux kernel regressions list

On Sat, Nov 11 2023 at 12:51, Linux regression tracking wrote:
> [CCing the regression list, as it should be in the loop for regressions:
> https://docs.kernel.org/admin-guide/reporting-regressions.html]
>
>> [Re: 32 bit qemu regression from v6.5 tip pull [6c480f222128 x86/alternative: Rewrite optimize_nops() some]] On 30/10/2023 (Mon 20:30) Thomas Gleixner wrote:
>> 
>>> On Mon, Oct 30 2023 at 19:24, Thomas Gleixner wrote:
>> 
>>> > Can you add "debug-alternative" to the kernel command line and log both
>>> > a working and the non-working kernel output. It's noisy :)
>>> >
>>> > Also do you have a .config and the qemu command line handy?
>>> 
>>> Forgot to ask: Does the probkem persist with 6.6 ?
>> 
>> My bad, should have mentioned v6.6 and linux-next is impacted. Always a
>> conflict between burying the lead and boring people with details.
>
> Paul, was this regression ever solved? I wonder as I could not find
> anything with a "Fixes: 6c480f222128 [...]" tag on lore and this thread
> looks stalled.

Hmm. I was waiting for the .config file to materialize and for an
eventual analysis of the alternatives debug output....

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: 32 bit qemu regression from v6.5 tip pull [6c480f222128 x86/alternative: Rewrite optimize_nops() some]
  2023-11-29  8:57                 ` Thomas Gleixner
@ 2023-12-06 15:46                   ` Paul Gortmaker
  2023-12-07 16:34                     ` Thomas Gleixner
  2023-12-07 19:49                   ` [patch 0/2] x86/alternatives: Prevent crash in NOP optimizer Thomas Gleixner
  1 sibling, 1 reply; 21+ messages in thread
From: Paul Gortmaker @ 2023-12-06 15:46 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Linux regression tracking (Thorsten Leemhuis),
	bp, linux-kernel, peterz, richard.purdie, x86,
	Linux kernel regressions list

[Re: 32 bit qemu regression from v6.5 tip pull [6c480f222128 x86/alternative: Rewrite optimize_nops() some]] On 29/11/2023 (Wed 09:57) Thomas Gleixner wrote:

[...]

> > Paul, was this regression ever solved? I wonder as I could not find
> > anything with a "Fixes: 6c480f222128 [...]" tag on lore and this thread
> > looks stalled.
> 
> Hmm. I was waiting for the .config file to materialize and for an
> eventual analysis of the alternatives debug output....

Sorry - was off for a week vacation, and was playing catch-up on my
return and 32 bit non-KVM qemu wasn't top on my priority list.

Then I did a batch of runs to get a fail, but without also booting with
"ignore-loglevel" - since the lines are debug and don't show on the
crashed console by default - start over!  :(

The config file and both "good" and "bad" files are there now:

https://bugzilla.yoctoproject.org/show_bug.cgi?id=15230

There are a lot more debug lines in the good/working output!

$ grep alternatives: dmesg-debug-alt-good.txt | wc -l
4868
$ grep alternatives: dmesg-debug-alt.txt | wc -l
19

Paul
--

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: 32 bit qemu regression from v6.5 tip pull [6c480f222128 x86/alternative: Rewrite optimize_nops() some]
  2023-12-06 15:46                   ` Paul Gortmaker
@ 2023-12-07 16:34                     ` Thomas Gleixner
  2023-12-07 16:52                       ` Paul Gortmaker
  0 siblings, 1 reply; 21+ messages in thread
From: Thomas Gleixner @ 2023-12-07 16:34 UTC (permalink / raw)
  To: Paul Gortmaker
  Cc: Linux regression tracking (Thorsten Leemhuis),
	bp, linux-kernel, peterz, richard.purdie, x86,
	Linux kernel regressions list

On Wed, Dec 06 2023 at 10:46, Paul Gortmaker wrote:
> The config file and both "good" and "bad" files are there now:
>
> https://bugzilla.yoctoproject.org/show_bug.cgi?id=15230
>
> There are a lot more debug lines in the good/working output!
>
> $ grep alternatives: dmesg-debug-alt-good.txt | wc -l
> 4868
> $ grep alternatives: dmesg-debug-alt.txt | wc -l
> 19

Unsurprisingly so. The failing one dies right at the start of
alternative patching....

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: 32 bit qemu regression from v6.5 tip pull [6c480f222128 x86/alternative: Rewrite optimize_nops() some]
  2023-12-07 16:34                     ` Thomas Gleixner
@ 2023-12-07 16:52                       ` Paul Gortmaker
  0 siblings, 0 replies; 21+ messages in thread
From: Paul Gortmaker @ 2023-12-07 16:52 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Linux regression tracking (Thorsten Leemhuis),
	bp, linux-kernel, peterz, richard.purdie, x86,
	Linux kernel regressions list

[Re: 32 bit qemu regression from v6.5 tip pull [6c480f222128 x86/alternative: Rewrite optimize_nops() some]] On 07/12/2023 (Thu 17:34) Thomas Gleixner wrote:

> On Wed, Dec 06 2023 at 10:46, Paul Gortmaker wrote:
> > The config file and both "good" and "bad" files are there now:
> >
> > https://bugzilla.yoctoproject.org/show_bug.cgi?id=15230
> >
> > There are a lot more debug lines in the good/working output!
> >
> > $ grep alternatives: dmesg-debug-alt-good.txt | wc -l
> > 4868
> > $ grep alternatives: dmesg-debug-alt.txt | wc -l
> > 19
> 
> Unsurprisingly so. The failing one dies right at the start of
> alternative patching....

I'm still set up to where I can pick a version of your choice, add a
debug patch or whatever, and let it run the 200 odd boots it takes
before it fails once.  And I'll try and be more prompt this time!

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [patch 0/2] x86/alternatives: Prevent crash in NOP optimizer
  2023-11-29  8:57                 ` Thomas Gleixner
  2023-12-06 15:46                   ` Paul Gortmaker
@ 2023-12-07 19:49                   ` Thomas Gleixner
  2023-12-07 19:49                     ` [patch 1/2] x86/alternatives: Sync core before enabling interrupts Thomas Gleixner
                                       ` (3 more replies)
  1 sibling, 4 replies; 21+ messages in thread
From: Thomas Gleixner @ 2023-12-07 19:49 UTC (permalink / raw)
  To: LKML; +Cc: paul.gortmaker, x86, regressions, richard.purdie, regressions

The following series addresses the regression report from Paul on behalf of
the yocto project. It turns out that the recent changes to alternatives
opened a race window where interrupts are enabled and NOPs are optimized in
place. An interrupt hitting into the modification will observe inconsistent
text and crash and burn.

A 32bit QEMU crashes w/o these fixes reliably within about 50 boot
attempts. With the fix applied it survived close to 600 attempts by
now.

Thanks to Paul for providing all the information!

Thanks,

	tglx



^ permalink raw reply	[flat|nested] 21+ messages in thread

* [patch 1/2] x86/alternatives: Sync core before enabling interrupts
  2023-12-07 19:49                   ` [patch 0/2] x86/alternatives: Prevent crash in NOP optimizer Thomas Gleixner
@ 2023-12-07 19:49                     ` Thomas Gleixner
  2023-12-07 19:49                     ` [patch 2/2] x86/alternatives: Disable interrupts and sync when optimizing NOPs in place Thomas Gleixner
                                       ` (2 subsequent siblings)
  3 siblings, 0 replies; 21+ messages in thread
From: Thomas Gleixner @ 2023-12-07 19:49 UTC (permalink / raw)
  To: LKML; +Cc: paul.gortmaker, x86, regressions, richard.purdie, regressions

text_poke_early() does:

   local_irq_save(flags);
   memcpy(addr, opcode, len);
   local_irq_restore(flags);
   sync_core();

That's not really correct because the synchronization should happen before
interrupts are reenabled to ensure that a pending interrupt observes the
complete update of the opcodes.

It's not entirely clear whether the interrupt entry provides enough
serialization already, but moving the sync_core() invocation into interrupt
disabled region does no harm and is obviously correct.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/alternative.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -1685,8 +1685,8 @@ void __init_or_module text_poke_early(vo
 	} else {
 		local_irq_save(flags);
 		memcpy(addr, opcode, len);
-		local_irq_restore(flags);
 		sync_core();
+		local_irq_restore(flags);
 
 		/*
 		 * Could also do a CLFLUSH here to speed up CPU recovery; but


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [patch 2/2] x86/alternatives: Disable interrupts and sync when optimizing NOPs in place
  2023-12-07 19:49                   ` [patch 0/2] x86/alternatives: Prevent crash in NOP optimizer Thomas Gleixner
  2023-12-07 19:49                     ` [patch 1/2] x86/alternatives: Sync core before enabling interrupts Thomas Gleixner
@ 2023-12-07 19:49                     ` Thomas Gleixner
  2023-12-08 13:22                       ` Borislav Petkov
  2023-12-08  8:35                     ` [patch 0/2] x86/alternatives: Prevent crash in NOP optimizer Paul Gortmaker
  2023-12-15  9:10                     ` Peter Zijlstra
  3 siblings, 1 reply; 21+ messages in thread
From: Thomas Gleixner @ 2023-12-07 19:49 UTC (permalink / raw)
  To: LKML; +Cc: paul.gortmaker, x86, regressions, richard.purdie, regressions

apply_alternatives() treats alternatives with the ALT_FLAG_NOT flag set
special as it optimizes the existing NOPs in place.

Unfortunately this happens with interrupts enabled and does not provide any
form of core synchronization.

So an interrupt hitting in the middle of the update and using the affected
code path will observe a half updated NOP and crash and burn. The following
3 NOP sequence was observed to expose this crash halfways reliably under
QEMU 32bit:

   0x90 0x90 0x90

which is replaced by the optimized 3 byte NOP:

   0x8d 0x76 0x00

So an interrupt can observe:

   1) 0x90 0x90 0x90		nop nop nop
   2) 0x8d 0x90 0x90		undefined
   3) 0x8d 0x76 0x90		lea    -0x70(%esi),%esi
   4) 0x8d 0x76 0x00		lea     0x0(%esi),%esi

Where only #1 and #4 are true NOPs. The same problem exists for 64bit obviously.

Disable interrupts around this NOP optimization and invoke sync_core()
before reenabling them.

Fixes: 270a69c4485d ("x86/alternative: Support relocations in alternatives")
Reported-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
---
 arch/x86/kernel/alternative.c |   12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -255,6 +255,16 @@ static void __init_or_module noinline op
 	}
 }
 
+static void __init_or_module noinline optimize_nops_inplace(u8 *instr, size_t len)
+{
+	unsigned long flags;
+
+	local_irq_save(flags);
+	optimize_nops(instr, len);
+	sync_core();
+	local_irq_restore(flags);
+}
+
 /*
  * In this context, "source" is where the instructions are placed in the
  * section .altinstr_replacement, for example during kernel build by the
@@ -438,7 +448,7 @@ void __init_or_module noinline apply_alt
 		 *   patch if feature is *NOT* present.
 		 */
 		if (!boot_cpu_has(a->cpuid) == !(a->flags & ALT_FLAG_NOT)) {
-			optimize_nops(instr, a->instrlen);
+			optimize_nops_inplace(instr, a->instrlen);
 			continue;
 		}
 


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [patch 0/2] x86/alternatives: Prevent crash in NOP optimizer
  2023-12-07 19:49                   ` [patch 0/2] x86/alternatives: Prevent crash in NOP optimizer Thomas Gleixner
  2023-12-07 19:49                     ` [patch 1/2] x86/alternatives: Sync core before enabling interrupts Thomas Gleixner
  2023-12-07 19:49                     ` [patch 2/2] x86/alternatives: Disable interrupts and sync when optimizing NOPs in place Thomas Gleixner
@ 2023-12-08  8:35                     ` Paul Gortmaker
  2023-12-15  9:10                     ` Peter Zijlstra
  3 siblings, 0 replies; 21+ messages in thread
From: Paul Gortmaker @ 2023-12-08  8:35 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: LKML, x86, regressions, richard.purdie, regressions

[[patch 0/2] x86/alternatives: Prevent crash in NOP optimizer] On 07/12/2023 (Thu 20:49) Thomas Gleixner wrote:

> The following series addresses the regression report from Paul on behalf of
> the yocto project. It turns out that the recent changes to alternatives
> opened a race window where interrupts are enabled and NOPs are optimized in
> place. An interrupt hitting into the modification will observe inconsistent
> text and crash and burn.
> 
> A 32bit QEMU crashes w/o these fixes reliably within about 50 boot
> attempts. With the fix applied it survived close to 600 attempts by
> now.

I can confirm that - since I was already set up for it, I let it run
overnight for about 1250 attempts.  With an average of one in 200 that I
was seeing before, I should have had around six fails.  Didn't see one.

> Thanks to Paul for providing all the information!

On behalf of Richard and myself, thanks for the complex fix.

Paul.
--

> 
> Thanks,
> 
> 	tglx
> 
> 

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [patch 2/2] x86/alternatives: Disable interrupts and sync when optimizing NOPs in place
  2023-12-07 19:49                     ` [patch 2/2] x86/alternatives: Disable interrupts and sync when optimizing NOPs in place Thomas Gleixner
@ 2023-12-08 13:22                       ` Borislav Petkov
  2023-12-08 13:37                         ` Thomas Gleixner
  0 siblings, 1 reply; 21+ messages in thread
From: Borislav Petkov @ 2023-12-08 13:22 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, paul.gortmaker, x86, regressions, richard.purdie, regressions

On Thu, Dec 07, 2023 at 08:49:26PM +0100, Thomas Gleixner wrote:
> +static void __init_or_module noinline optimize_nops_inplace(u8 *instr, size_t len)
> +{
> +	unsigned long flags;
> +
> +	local_irq_save(flags);
> +	optimize_nops(instr, len);
> +	sync_core();
> +	local_irq_restore(flags);
> +}
> +
>  /*
>   * In this context, "source" is where the instructions are placed in the
>   * section .altinstr_replacement, for example during kernel build by the
> @@ -438,7 +448,7 @@ void __init_or_module noinline apply_alt
>  		 *   patch if feature is *NOT* present.
>  		 */
>  		if (!boot_cpu_has(a->cpuid) == !(a->flags & ALT_FLAG_NOT)) {
> -			optimize_nops(instr, a->instrlen);
> +			optimize_nops_inplace(instr, a->instrlen);
>  			continue;
>  		}

Arguably, the proper thing to do here would be to convert the NOP
optimizing to the same 2-stage process as normal patching: write insns
into a buffer and text_poke* it.

VS what we currently do: operating straight on kernel memory.

Lemme put it on the TODO and see how ugly it becomes.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [patch 2/2] x86/alternatives: Disable interrupts and sync when optimizing NOPs in place
  2023-12-08 13:22                       ` Borislav Petkov
@ 2023-12-08 13:37                         ` Thomas Gleixner
  0 siblings, 0 replies; 21+ messages in thread
From: Thomas Gleixner @ 2023-12-08 13:37 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: LKML, paul.gortmaker, x86, regressions, richard.purdie, regressions

On Fri, Dec 08 2023 at 14:22, Borislav Petkov wrote:
> On Thu, Dec 07, 2023 at 08:49:26PM +0100, Thomas Gleixner wrote:
>> +static void __init_or_module noinline optimize_nops_inplace(u8 *instr, size_t len)
>> +{
>> +	unsigned long flags;
>> +
>> +	local_irq_save(flags);
>> +	optimize_nops(instr, len);
>> +	sync_core();
>> +	local_irq_restore(flags);
>> +}
>> +
>>  /*
>>   * In this context, "source" is where the instructions are placed in the
>>   * section .altinstr_replacement, for example during kernel build by the
>> @@ -438,7 +448,7 @@ void __init_or_module noinline apply_alt
>>  		 *   patch if feature is *NOT* present.
>>  		 */
>>  		if (!boot_cpu_has(a->cpuid) == !(a->flags & ALT_FLAG_NOT)) {
>> -			optimize_nops(instr, a->instrlen);
>> +			optimize_nops_inplace(instr, a->instrlen);
>>  			continue;
>>  		}
>
> Arguably, the proper thing to do here would be to convert the NOP
> optimizing to the same 2-stage process as normal patching: write insns
> into a buffer and text_poke* it.
>
> VS what we currently do: operating straight on kernel memory.

Well, apply_alternatives() results in text_poke_early() which is nothing
else than a memcpy() with interrupts disabled :)

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [patch 0/2] x86/alternatives: Prevent crash in NOP optimizer
  2023-12-07 19:49                   ` [patch 0/2] x86/alternatives: Prevent crash in NOP optimizer Thomas Gleixner
                                       ` (2 preceding siblings ...)
  2023-12-08  8:35                     ` [patch 0/2] x86/alternatives: Prevent crash in NOP optimizer Paul Gortmaker
@ 2023-12-15  9:10                     ` Peter Zijlstra
  3 siblings, 0 replies; 21+ messages in thread
From: Peter Zijlstra @ 2023-12-15  9:10 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, paul.gortmaker, x86, regressions, richard.purdie, regressions

On Thu, Dec 07, 2023 at 08:49:22PM +0100, Thomas Gleixner wrote:
> The following series addresses the regression report from Paul on behalf of
> the yocto project. It turns out that the recent changes to alternatives
> opened a race window where interrupts are enabled and NOPs are optimized in
> place. An interrupt hitting into the modification will observe inconsistent
> text and crash and burn.
> 
> A 32bit QEMU crashes w/o these fixes reliably within about 50 boot
> attempts. With the fix applied it survived close to 600 attempts by
> now.
> 
> Thanks to Paul for providing all the information!

Urgh and D'0h.

Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>

^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2023-12-15  9:11 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-10-29 18:41 32 bit qemu regression from v6.5 tip pull [6c480f222128 x86/alternative: Rewrite optimize_nops() some] Paul Gortmaker
2023-10-30  8:26 ` Peter Zijlstra
2023-10-30 10:55   ` Richard Purdie
2023-10-30 11:44     ` Peter Zijlstra
2023-10-30 15:28       ` Paul Gortmaker
2023-10-30 18:24         ` Thomas Gleixner
2023-10-30 19:30           ` Thomas Gleixner
2023-10-31 15:40             ` Paul Gortmaker
2023-11-11 11:51               ` Linux regression tracking (Thorsten Leemhuis)
2023-11-22 14:11                 ` Richard Purdie
2023-11-29  8:57                 ` Thomas Gleixner
2023-12-06 15:46                   ` Paul Gortmaker
2023-12-07 16:34                     ` Thomas Gleixner
2023-12-07 16:52                       ` Paul Gortmaker
2023-12-07 19:49                   ` [patch 0/2] x86/alternatives: Prevent crash in NOP optimizer Thomas Gleixner
2023-12-07 19:49                     ` [patch 1/2] x86/alternatives: Sync core before enabling interrupts Thomas Gleixner
2023-12-07 19:49                     ` [patch 2/2] x86/alternatives: Disable interrupts and sync when optimizing NOPs in place Thomas Gleixner
2023-12-08 13:22                       ` Borislav Petkov
2023-12-08 13:37                         ` Thomas Gleixner
2023-12-08  8:35                     ` [patch 0/2] x86/alternatives: Prevent crash in NOP optimizer Paul Gortmaker
2023-12-15  9:10                     ` Peter Zijlstra

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.