* Perf: WARNING: arch/x86/entry/common.c:624 idtentry_exit_cond_rcu+0x92/0xc0
@ 2020-06-11 18:56 Naresh Kamboju
2020-06-11 19:10 ` Andy Lutomirski
2020-06-12 19:50 ` [tip: x86/entry] x86/entry: Make NMI use IDTENTRY_RAW tip-bot2 for Thomas Gleixner
0 siblings, 2 replies; 13+ messages in thread
From: Naresh Kamboju @ 2020-06-11 18:56 UTC (permalink / raw)
To: open list, x86
Cc: cj.chengjian, Arnaldo Carvalho de Melo, Peter Zijlstra,
Ingo Molnar, H. Peter Anvin, Borislav Petkov, Thomas Gleixner,
Andy Lutomirski, Minchan Kim, Andrew Morton, Michel Lespinasse,
lkft-triage, Dave Hansen
While running perf test and selftest x86 single_step_syscall_32 on
i386 kernel linux
next 5.7.0-next-20200610 kernel warning noticed.
steps to reproduce:
--------------------------
perf test
and
cd /opt/kselftests/default-in-kernel/x86
./single_step_syscall_32
perf warning log:
----------------------
[ 57.260865] ------------[ cut here ]------------
[ 57.266576] IRQs not disabled as expected
[ 57.270583] WARNING: CPU: 1 PID: 500 at
/usr/src/kernel/arch/x86/entry/common.c:624
idtentry_exit_cond_rcu+0x92/0xc0
[ 57.281092] Modules linked in: x86_pkg_temp_thermal fuse
[ 57.286406] CPU: 1 PID: 500 Comm: perf Not tainted 5.7.0-next-20200610 #1
[ 57.293190] Hardware name: Supermicro SYS-5019S-ML/X11SSH-F, BIOS
2.2 05/23/2018
[ 57.300577] EIP: idtentry_exit_cond_rcu+0x92/0xc0
[ 57.305280] Code: 8b 89 d8 05 00 00 85 c9 74 ae 80 3d b1 64 2c d4
00 75 a5 68 94 2d fb d3 89 55 f8 89 45 fc c6 05 b1 64 2c d4 01 e8 8e
f5 2b ff <0f> 0b 58 8b 55 f8 8b 45 fc eb 83 8d 76 00 e8 5b fd ff ff c9
c3 89
[ 57.324017] EAX: 0000001d EBX: 0d00022a ECX: 00000027 EDX: f5b9e14c
[ 57.330274] ESI: f2a2ffb4 EDI: ffff0ff4 EBP: f2a2ff8c ESP: f2a2ff80
[ 57.336531] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00010096
[ 57.343345] CR0: 80050033 CR2: 08700a58 CR3: 14ad7000 CR4: 003406d0
[ 57.349608] DR0: 080dfb80 DR1: 080dfc00 DR2: 08700a58 DR3: 00000000
[ 57.355866] DR6: fffe0ff0 DR7: 0d00062a
[ 57.359697] Call Trace:
[ 57.362143] exc_debug+0x84/0x1b0
[ 57.365487] ? exc_int3+0x1d0/0x1d0
[ 57.368981] handle_exception+0x145/0x145
[ 57.372991] EIP: 0x80dfbcd
[ 57.375694] Code: Bad RIP value.
[ 57.378918] EAX: 00000000 EBX: 00000005 ECX: 00002400 EDX: 00000000
[ 57.385175] ESI: 00000003 EDI: 00000004 EBP: bfd59798 ESP: bfd59770
[ 57.391431] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b EFLAGS: 00000246
[ 57.398215] irq event stamp: 1896
[ 57.401556] hardirqs last enabled at (1895): [<d3c316b9>]
idtentry_exit_user+0x39/0x60
[ 57.409556] hardirqs last disabled at (1896): [<d2f6e742>]
vprintk_emit+0x42/0x290
[ 57.417120] softirqs last enabled at (0): [<d2eee73a>]
copy_process+0x3ea/0x17d0
[ 57.424590] softirqs last disabled at (0): [<00000000>] 0x0
[ 57.430155] ---[ end trace 421998f6ba46136b ]---
10.2: PMU event [ 57.434872] ------------[ cut here ]------------
[ 57.440858] IRQs not disabled as expected
[ 57.444870] WARNING: CPU: 1 PID: 500 at
/usr/src/kernel/arch/x86/entry/common.c:240
__prepare_exit_to_usermode+0x164/0x1b0
[ 57.455900] Modules linked in: x86_pkg_temp_thermal fuse
[ 57.461207] CPU: 1 PID: 500 Comm: perf Tainted: G W
5.7.0-next-20200610 #1
[ 57.469379] Hardware name: Supermicro SYS-5019S-ML/X11SSH-F, BIOS
2.2 05/23/2018
[ 57.476770] EIP: __prepare_exit_to_usermode+0x164/0x1b0
[ 57.481988] Code: d8 05 00 00 85 db 0f 84 08 ff ff ff 80 3d b2 64
2c d4 00 0f 85 fb fe ff ff 68 94 2d fb d3 c6 05 b2 64 2c d4 01 e8 8c
ec 0e 00 <0f> 0b 59 e9 e2 fe ff ff 8d 74 26 00 68 3c 2d fb d3 e8 76 ec
0e 00
[ 57.500732] EAX: 0000001d EBX: 00000001 ECX: 00000027 EDX: f5b9e14c
[ 57.506991] ESI: f1e9a400 EDI: f2a2ffb4 EBP: f2a2ff74 ESP: f2a2ff64
[ 57.513275] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00010086
[ 57.520060] CR0: 80050033 CR2: 08700a58 CR3: 14ad7000 CR4: 003406d0
[ 57.526346] DR0: 080dfb80 DR1: 080dfc00 DR2: 08700a58 DR3: 00000000
[ 57.532608] DR6: fffe0ff0 DR7: 0d00062a
[ 57.536439] Call Trace:
[ 57.538885] prepare_exit_to_usermode+0x8/0x30
[ 57.543356] idtentry_exit_cond_rcu+0xa5/0xc0
[ 57.547714] ? idtentry_exit_user+0x39/0x60
[ 57.551892] exc_debug+0x84/0x1b0
[ 57.555212] ? exc_int3+0x1d0/0x1d0
[ 57.558704] handle_exception+0x145/0x145
[ 57.562715] EIP: 0x80dfc00
[ 57.565417] Code: Bad RIP value.
[ 57.568644] EAX: 0000001d EBX: 00000005 ECX: bfd5920c EDX: bfd5918c
[ 57.574899] ESI: 00000003 EDI: 00000004 EBP: bfd59798 ESP: bfd5917c
[ 57.581158] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b EFLAGS: 00010246
[ 57.587948] irq event stamp: 1942
[ 57.591340] hardirqs last enabled at (1941): [<d3c316b9>]
idtentry_exit_user+0x39/0x60
[ 57.599345] hardirqs last disabled at (1942): [<d2f6e742>]
vprintk_emit+0x42/0x290
[ 57.606906] softirqs last enabled at (1934): [<d2eb1b68>]
fpu__clear+0x78/0xd0
[ 57.614210] softirqs last disabled at (1932): [<d2eb1b11>]
fpu__clear+0x21/0xd0
[ 57.621509] ---[ end trace 421998f6ba46136c ]---
map aliases :
Full test log link,
----------------------
perf test full output log:
https://qa-reports.linaro.org/lkft/linux-next-oe/build/next-20200610/testrun/2808447/suite/linux-log-parser/test/check-kernel-warning-1486409/log
kselftest full output log,
https://qa-reports.linaro.org/lkft/linux-next-oe/build/next-20200610/testrun/2808443/suite/linux-log-parser/test/check-kernel-exception-1486406/log
metadata:
--------------
git branch: master
git repo: https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
git commit: 976c320052ff22a7f9b8be0d56aefacb3a40f466
git describe: next-20200610
kernel-config:
http://snapshots.linaro.org/openembedded/lkft/lkft/sumo/intel-core2-32/lkft/linux-next/786/config
--
Linaro LKFT
https://lkft.linaro.org
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Perf: WARNING: arch/x86/entry/common.c:624 idtentry_exit_cond_rcu+0x92/0xc0
2020-06-11 18:56 Perf: WARNING: arch/x86/entry/common.c:624 idtentry_exit_cond_rcu+0x92/0xc0 Naresh Kamboju
@ 2020-06-11 19:10 ` Andy Lutomirski
2020-06-11 19:25 ` Peter Zijlstra
2020-06-12 19:50 ` [tip: x86/entry] x86/entry: Make NMI use IDTENTRY_RAW tip-bot2 for Thomas Gleixner
1 sibling, 1 reply; 13+ messages in thread
From: Andy Lutomirski @ 2020-06-11 19:10 UTC (permalink / raw)
To: Naresh Kamboju
Cc: open list, X86 ML, cj.chengjian, Arnaldo Carvalho de Melo,
Peter Zijlstra, Ingo Molnar, H. Peter Anvin, Borislav Petkov,
Thomas Gleixner, Andy Lutomirski, Minchan Kim, Andrew Morton,
Michel Lespinasse, lkft-triage, Dave Hansen
On Thu, Jun 11, 2020 at 11:56 AM Naresh Kamboju
<naresh.kamboju@linaro.org> wrote:
>
> While running perf test and selftest x86 single_step_syscall_32 on
> i386 kernel linux
> next 5.7.0-next-20200610 kernel warning noticed.
>
> steps to reproduce:
> --------------------------
> perf test
> and
> cd /opt/kselftests/default-in-kernel/x86
> ./single_step_syscall_32
>
> perf warning log:
> ----------------------
> [ 57.260865] ------------[ cut here ]------------
> [ 57.266576] IRQs not disabled as expected
> [ 57.270583] WARNING: CPU: 1 PID: 500 at
> /usr/src/kernel/arch/x86/entry/common.c:624
> idtentry_exit_cond_rcu+0x92/0xc0
> [ 57.281092] Modules linked in: x86_pkg_temp_thermal fuse
> [ 57.286406] CPU: 1 PID: 500 Comm: perf Not tainted 5.7.0-next-20200610 #1
> [ 57.293190] Hardware name: Supermicro SYS-5019S-ML/X11SSH-F, BIOS
> 2.2 05/23/2018
> [ 57.300577] EIP: idtentry_exit_cond_rcu+0x92/0xc0
> [ 57.305280] Code: 8b 89 d8 05 00 00 85 c9 74 ae 80 3d b1 64 2c d4
> 00 75 a5 68 94 2d fb d3 89 55 f8 89 45 fc c6 05 b1 64 2c d4 01 e8 8e
> f5 2b ff <0f> 0b 58 8b 55 f8 8b 45 fc eb 83 8d 76 00 e8 5b fd ff ff c9
> c3 89
> [ 57.324017] EAX: 0000001d EBX: 0d00022a ECX: 00000027 EDX: f5b9e14c
> [ 57.330274] ESI: f2a2ffb4 EDI: ffff0ff4 EBP: f2a2ff8c ESP: f2a2ff80
> [ 57.336531] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00010096
> [ 57.343345] CR0: 80050033 CR2: 08700a58 CR3: 14ad7000 CR4: 003406d0
> [ 57.349608] DR0: 080dfb80 DR1: 080dfc00 DR2: 08700a58 DR3: 00000000
> [ 57.355866] DR6: fffe0ff0 DR7: 0d00062a
> [ 57.359697] Call Trace:
> [ 57.362143] exc_debug+0x84/0x1b0
> [ 57.365487] ? exc_int3+0x1d0/0x1d0
> [ 57.368981] handle_exception+0x145/0x145
> [ 57.372991] EIP: 0x80dfbcd
> [ 57.375694] Code: Bad RIP value.
> [ 57.378918] EAX: 00000000 EBX: 00000005 ECX: 00002400 EDX: 00000000
> [ 57.385175] ESI: 00000003 EDI: 00000004 EBP: bfd59798 ESP: bfd59770
> [ 57.391431] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b EFLAGS: 00000246
> [ 57.398215] irq event stamp: 1896
A regrettable property of the current entry code structure is that we
lose any real indication of the vector. Presumably this is #DB, hence
exc_debug. I don't know what perf has to do with it.
I'll bang on this after lunch if no one beats me to it.
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Perf: WARNING: arch/x86/entry/common.c:624 idtentry_exit_cond_rcu+0x92/0xc0
2020-06-11 19:10 ` Andy Lutomirski
@ 2020-06-11 19:25 ` Peter Zijlstra
2020-06-11 23:22 ` Andy Lutomirski
0 siblings, 1 reply; 13+ messages in thread
From: Peter Zijlstra @ 2020-06-11 19:25 UTC (permalink / raw)
To: Andy Lutomirski
Cc: Naresh Kamboju, open list, X86 ML, cj.chengjian,
Arnaldo Carvalho de Melo, Ingo Molnar, H. Peter Anvin,
Borislav Petkov, Thomas Gleixner, Minchan Kim, Andrew Morton,
Michel Lespinasse, lkft-triage, Dave Hansen
On Thu, Jun 11, 2020 at 12:10:50PM -0700, Andy Lutomirski wrote:
> On Thu, Jun 11, 2020 at 11:56 AM Naresh Kamboju
> <naresh.kamboju@linaro.org> wrote:
> >
> > While running perf test and selftest x86 single_step_syscall_32 on
> > i386 kernel linux
> > next 5.7.0-next-20200610 kernel warning noticed.
> >
> > steps to reproduce:
> > --------------------------
> > perf test
> > and
> > cd /opt/kselftests/default-in-kernel/x86
> > ./single_step_syscall_32
> >
> > perf warning log:
> > ----------------------
> > [ 57.260865] ------------[ cut here ]------------
> > [ 57.266576] IRQs not disabled as expected
> > [ 57.270583] WARNING: CPU: 1 PID: 500 at
> > /usr/src/kernel/arch/x86/entry/common.c:624
> > idtentry_exit_cond_rcu+0x92/0xc0
> > [ 57.281092] Modules linked in: x86_pkg_temp_thermal fuse
> > [ 57.286406] CPU: 1 PID: 500 Comm: perf Not tainted 5.7.0-next-20200610 #1
> > [ 57.293190] Hardware name: Supermicro SYS-5019S-ML/X11SSH-F, BIOS
> > 2.2 05/23/2018
> > [ 57.300577] EIP: idtentry_exit_cond_rcu+0x92/0xc0
> > [ 57.305280] Code: 8b 89 d8 05 00 00 85 c9 74 ae 80 3d b1 64 2c d4
> > 00 75 a5 68 94 2d fb d3 89 55 f8 89 45 fc c6 05 b1 64 2c d4 01 e8 8e
> > f5 2b ff <0f> 0b 58 8b 55 f8 8b 45 fc eb 83 8d 76 00 e8 5b fd ff ff c9
> > c3 89
> > [ 57.324017] EAX: 0000001d EBX: 0d00022a ECX: 00000027 EDX: f5b9e14c
> > [ 57.330274] ESI: f2a2ffb4 EDI: ffff0ff4 EBP: f2a2ff8c ESP: f2a2ff80
> > [ 57.336531] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00010096
> > [ 57.343345] CR0: 80050033 CR2: 08700a58 CR3: 14ad7000 CR4: 003406d0
> > [ 57.349608] DR0: 080dfb80 DR1: 080dfc00 DR2: 08700a58 DR3: 00000000
> > [ 57.355866] DR6: fffe0ff0 DR7: 0d00062a
> > [ 57.359697] Call Trace:
> > [ 57.362143] exc_debug+0x84/0x1b0
> > [ 57.365487] ? exc_int3+0x1d0/0x1d0
> > [ 57.368981] handle_exception+0x145/0x145
> > [ 57.372991] EIP: 0x80dfbcd
> > [ 57.375694] Code: Bad RIP value.
> > [ 57.378918] EAX: 00000000 EBX: 00000005 ECX: 00002400 EDX: 00000000
> > [ 57.385175] ESI: 00000003 EDI: 00000004 EBP: bfd59798 ESP: bfd59770
> > [ 57.391431] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b EFLAGS: 00000246
> > [ 57.398215] irq event stamp: 1896
>
> A regrettable property of the current entry code structure is that we
> lose any real indication of the vector. Presumably this is #DB, hence
> exc_debug. I don't know what perf has to do with it.
>
> I'll bang on this after lunch if no one beats me to it.
Puzzling, CR3 seems to suggest this is !user_mode(), but either way #DB
has either idtentry_enter_user() or nmi_enter().
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Perf: WARNING: arch/x86/entry/common.c:624 idtentry_exit_cond_rcu+0x92/0xc0
2020-06-11 19:25 ` Peter Zijlstra
@ 2020-06-11 23:22 ` Andy Lutomirski
2020-06-12 3:30 ` Andy Lutomirski
0 siblings, 1 reply; 13+ messages in thread
From: Andy Lutomirski @ 2020-06-11 23:22 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Andy Lutomirski, Naresh Kamboju, open list, X86 ML, cj.chengjian,
Arnaldo Carvalho de Melo, Ingo Molnar, H. Peter Anvin,
Borislav Petkov, Thomas Gleixner, Minchan Kim, Andrew Morton,
Michel Lespinasse, lkft-triage, Dave Hansen
On Thu, Jun 11, 2020 at 12:25 PM Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Thu, Jun 11, 2020 at 12:10:50PM -0700, Andy Lutomirski wrote:
> > On Thu, Jun 11, 2020 at 11:56 AM Naresh Kamboju
> > <naresh.kamboju@linaro.org> wrote:
> > >
> > > While running perf test and selftest x86 single_step_syscall_32 on
> > > i386 kernel linux
> > > next 5.7.0-next-20200610 kernel warning noticed.
> > >
> > > steps to reproduce:
> > > --------------------------
> > > perf test
> > > and
> > > cd /opt/kselftests/default-in-kernel/x86
> > > ./single_step_syscall_32
> > >
> > > perf warning log:
> > > ----------------------
> > > [ 57.260865] ------------[ cut here ]------------
> > > [ 57.266576] IRQs not disabled as expected
> > > [ 57.270583] WARNING: CPU: 1 PID: 500 at
> > > /usr/src/kernel/arch/x86/entry/common.c:624
> > > idtentry_exit_cond_rcu+0x92/0xc0
> > > [ 57.281092] Modules linked in: x86_pkg_temp_thermal fuse
> > > [ 57.286406] CPU: 1 PID: 500 Comm: perf Not tainted 5.7.0-next-20200610 #1
> > > [ 57.293190] Hardware name: Supermicro SYS-5019S-ML/X11SSH-F, BIOS
> > > 2.2 05/23/2018
> > > [ 57.300577] EIP: idtentry_exit_cond_rcu+0x92/0xc0
> > > [ 57.305280] Code: 8b 89 d8 05 00 00 85 c9 74 ae 80 3d b1 64 2c d4
> > > 00 75 a5 68 94 2d fb d3 89 55 f8 89 45 fc c6 05 b1 64 2c d4 01 e8 8e
> > > f5 2b ff <0f> 0b 58 8b 55 f8 8b 45 fc eb 83 8d 76 00 e8 5b fd ff ff c9
> > > c3 89
> > > [ 57.324017] EAX: 0000001d EBX: 0d00022a ECX: 00000027 EDX: f5b9e14c
> > > [ 57.330274] ESI: f2a2ffb4 EDI: ffff0ff4 EBP: f2a2ff8c ESP: f2a2ff80
> > > [ 57.336531] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00010096
> > > [ 57.343345] CR0: 80050033 CR2: 08700a58 CR3: 14ad7000 CR4: 003406d0
> > > [ 57.349608] DR0: 080dfb80 DR1: 080dfc00 DR2: 08700a58 DR3: 00000000
> > > [ 57.355866] DR6: fffe0ff0 DR7: 0d00062a
> > > [ 57.359697] Call Trace:
> > > [ 57.362143] exc_debug+0x84/0x1b0
> > > [ 57.365487] ? exc_int3+0x1d0/0x1d0
> > > [ 57.368981] handle_exception+0x145/0x145
> > > [ 57.372991] EIP: 0x80dfbcd
> > > [ 57.375694] Code: Bad RIP value.
> > > [ 57.378918] EAX: 00000000 EBX: 00000005 ECX: 00002400 EDX: 00000000
> > > [ 57.385175] ESI: 00000003 EDI: 00000004 EBP: bfd59798 ESP: bfd59770
> > > [ 57.391431] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b EFLAGS: 00000246
> > > [ 57.398215] irq event stamp: 1896
> >
> > A regrettable property of the current entry code structure is that we
> > lose any real indication of the vector. Presumably this is #DB, hence
> > exc_debug. I don't know what perf has to do with it.
> >
> > I'll bang on this after lunch if no one beats me to it.
>
> Puzzling, CR3 seems to suggest this is !user_mode(), but either way #DB
> has either idtentry_enter_user() or nmi_enter().
>
I just got this splat on 32-bit while running my perf abuser here:
https://git.kernel.org/pub/scm/linux/kernel/git/luto/misc-tests.git/tree/perf_lots_of_nmi.c
[ 21.874114] traps: PANIC: double fault, error_code: 0x0
[ 21.874127] double fault: 0000 [#1] SMP PTI
[ 21.874127] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.7.0+ #117
[ 21.874128] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
BIOS ?-20190727_073836-buildvm-ppc64le-16.ppc.fedoraproject.org-3.fc31
04/01/2014
[ 21.874128] EIP: asm_exc_invalid_op+0x6/0x10
[ 21.874129] Code: a1 07 00 00 0f 01 ca fc 6a 00 68 f0 4c d7 cf e9
91 07 00 00 0f 01 ca fc 6a 00 68 90 4f d7 cf e9 81 07 00 00 0f 01 ca
fc 6a 00 <68> 30 4d d7 cf e9 71 07 00 00 0f 01 ca fc 6a 00 68 60 55 d7
cf e9
[ 21.874129] EAX: 00000000 EBX: cff9902c ECX: 000000e0 EDX: 00000000
[ 21.874129] ESI: cff9be9c EDI: cfd74d30 EBP: cff99014 ESP: cff99000
[ 21.874130] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00210046
[ 21.874130] CR0: 80050033 CR2: cff98ffc CR3: 10108000 CR4: 00340ef0
[ 21.874130] Call Trace:
[ 21.874131] Modules linked in:
[ 21.889116] ---[ end trace aa191c9e5c534faf ]---
[ 21.889116] EIP: asm_exc_invalid_op+0x6/0x10
[ 21.889117] Code: a1 07 00 00 0f 01 ca fc 6a 00 68 f0 4c d7 cf e9
91 07 00 00 0f 01 ca fc 6a 00 68 90 4f d7 cf e9 81 07 00 00 0f 01 ca
fc 6a 00 <68> 30 4d d7 cf e9 71 07 00 00 0f 01 ca fc 6a 00 68 60 55 d7
cf e9
[ 21.889117] EAX: 00000000 EBX: cff9902c ECX: 000000e0 EDX: 00000000
[ 21.889117] ESI: cff9be9c EDI: cfd74d30 EBP: cff99014 ESP: cff99000
[ 21.889127] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00210046
[ 21.889128] CR0: 80050033 CR2: cff98ffc CR3: 10108000 CR4: 00340ef0
[ 21.889128] Kernel panic - not syncing: Fatal exception in interrupt
[ 21.889338] Kernel Offset: 0xe600000 from 0xc1000000 (relocation
range: 0xc0000000-0xf7dfdfff)
All code
========
0: a1 07 00 00 0f mov 0xf000007,%eax
5: 01 ca add %ecx,%edx
7: fc cld
8: 6a 00 push $0x0
a: 68 f0 4c d7 cf push $0xcfd74cf0
f: e9 91 07 00 00 jmp 0x7a5
14: 0f 01 ca clac
17: fc cld
18: 6a 00 push $0x0
1a: 68 90 4f d7 cf push $0xcfd74f90
1f: e9 81 07 00 00 jmp 0x7a5
24: 0f 01 ca clac
27: fc cld
28: 6a 00 push $0x0
2a:* 68 30 4d d7 cf push $0xcfd74d30 <--
trapping instruction
2f: e9 71 07 00 00 jmp 0x7a5
34: 0f 01 ca clac
37: fc cld
38: 6a 00 push $0x0
3a: 68 60 55 d7 cf push $0xcfd75560
This is on my private branch, which has some extra changes, but I
don't think they're relevant. I think we broke NMI :( I did a nopti
run and got a different explosion:
[ 14.618581] BUG: unable to handle page fault for address: 0021004a
[ 14.620149] #PF: supervisor write access in kernel mode
[ 14.621134] #PF: error_code(0x0002) - not-present page
[ 14.622028] *pdpt = 0000000035b2f001 *pde = 0000000000000000
[ 14.623047] Oops: 0002 [#1] SMP NOPTI
[ 14.623684] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.7.0+ #117
[ 14.624733] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
BIOS ?-20190727_073836-buildvm-ppc64le-16.ppc.fedoraproject.org-3.fc31
04/01/2014
[ 14.627013] EIP: run_timer_softirq+0x14a/0x470
[ 14.627793] Code: 3b 63 00 8b 45 b8 8b 00 89 45 bc 8b 45 bc 85 c0
74 5c 8b 45 b8 8b 18 89 5f 04 3e 8d 74 26 00 8b 03 8b 53 04 89 02 85
c0 74 03 <89> 50 04 c7 43 04 00 00 00 00 8b 53 0c c7 03 22 01 00 00 f6
43 12
[ 14.630874] EAX: 00210046 EBX: f60f3ee4 ECX: fffba48e EDX: f602bf88
[ 14.631913] ESI: f602bf88 EDI: f65c8080 EBP: f602bfbc ESP: f602bf68
[ 14.632970] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00210002
[ 14.634248] CR0: 80050033 CR2: 0021004a CR3: 3583b120 CR4: 00340ef0
[ 14.635300] Call Trace:
[ 14.635710] <SOFTIRQ>
[ 14.636232] __do_softirq+0xac/0x25b
[ 14.636885] ? __entry_text_end+0x8/0x8
[ 14.637786] call_on_stack+0x40/0x50
[ 14.638397] </SOFTIRQ>
[ 14.638813] ? irq_exit_rcu+0x95/0xa0
[ 14.639513] ? sysvec_apic_timer_interrupt+0x27/0x40
[ 14.640424] ? handle_exception+0x15d/0x15d
[ 14.641272] ? __cpuidle_text_start+0x8/0x8
[ 14.642167] ? default_idle+0x23/0x160
I have to run, so I'll keep digging later.
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Perf: WARNING: arch/x86/entry/common.c:624 idtentry_exit_cond_rcu+0x92/0xc0
2020-06-11 23:22 ` Andy Lutomirski
@ 2020-06-12 3:30 ` Andy Lutomirski
2020-06-12 9:01 ` Thomas Gleixner
0 siblings, 1 reply; 13+ messages in thread
From: Andy Lutomirski @ 2020-06-12 3:30 UTC (permalink / raw)
To: Andy Lutomirski
Cc: Peter Zijlstra, Naresh Kamboju, open list, X86 ML, cj.chengjian,
Arnaldo Carvalho de Melo, Ingo Molnar, H. Peter Anvin,
Borislav Petkov, Thomas Gleixner, Minchan Kim, Andrew Morton,
Michel Lespinasse, lkft-triage, Dave Hansen
On Thu, Jun 11, 2020 at 4:22 PM Andy Lutomirski <luto@kernel.org> wrote:
>
> On Thu, Jun 11, 2020 at 12:25 PM Peter Zijlstra <peterz@infradead.org> wrote:
> >
> > On Thu, Jun 11, 2020 at 12:10:50PM -0700, Andy Lutomirski wrote:
> > > On Thu, Jun 11, 2020 at 11:56 AM Naresh Kamboju
> > > <naresh.kamboju@linaro.org> wrote:
> > > >
> > > > While running perf test and selftest x86 single_step_syscall_32 on
> > > > i386 kernel linux
> > > > next 5.7.0-next-20200610 kernel warning noticed.
> > > >
> > > > steps to reproduce:
> > > > --------------------------
> > > > perf test
> > > > and
> > > > cd /opt/kselftests/default-in-kernel/x86
> > > > ./single_step_syscall_32
> > > >
> > > > perf warning log:
> > > > ----------------------
> > > > [ 57.260865] ------------[ cut here ]------------
> > > > [ 57.266576] IRQs not disabled as expected
> > > > [ 57.270583] WARNING: CPU: 1 PID: 500 at
> > > > /usr/src/kernel/arch/x86/entry/common.c:624
> > > > idtentry_exit_cond_rcu+0x92/0xc0
> > > > [ 57.281092] Modules linked in: x86_pkg_temp_thermal fuse
> > > > [ 57.286406] CPU: 1 PID: 500 Comm: perf Not tainted 5.7.0-next-20200610 #1
> > > > [ 57.293190] Hardware name: Supermicro SYS-5019S-ML/X11SSH-F, BIOS
> > > > 2.2 05/23/2018
> > > > [ 57.300577] EIP: idtentry_exit_cond_rcu+0x92/0xc0
> > > > [ 57.305280] Code: 8b 89 d8 05 00 00 85 c9 74 ae 80 3d b1 64 2c d4
> > > > 00 75 a5 68 94 2d fb d3 89 55 f8 89 45 fc c6 05 b1 64 2c d4 01 e8 8e
> > > > f5 2b ff <0f> 0b 58 8b 55 f8 8b 45 fc eb 83 8d 76 00 e8 5b fd ff ff c9
> > > > c3 89
> > > > [ 57.324017] EAX: 0000001d EBX: 0d00022a ECX: 00000027 EDX: f5b9e14c
> > > > [ 57.330274] ESI: f2a2ffb4 EDI: ffff0ff4 EBP: f2a2ff8c ESP: f2a2ff80
> > > > [ 57.336531] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00010096
> > > > [ 57.343345] CR0: 80050033 CR2: 08700a58 CR3: 14ad7000 CR4: 003406d0
> > > > [ 57.349608] DR0: 080dfb80 DR1: 080dfc00 DR2: 08700a58 DR3: 00000000
> > > > [ 57.355866] DR6: fffe0ff0 DR7: 0d00062a
> > > > [ 57.359697] Call Trace:
> > > > [ 57.362143] exc_debug+0x84/0x1b0
> > > > [ 57.365487] ? exc_int3+0x1d0/0x1d0
> > > > [ 57.368981] handle_exception+0x145/0x145
> > > > [ 57.372991] EIP: 0x80dfbcd
> > > > [ 57.375694] Code: Bad RIP value.
> > > > [ 57.378918] EAX: 00000000 EBX: 00000005 ECX: 00002400 EDX: 00000000
> > > > [ 57.385175] ESI: 00000003 EDI: 00000004 EBP: bfd59798 ESP: bfd59770
> > > > [ 57.391431] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b EFLAGS: 00000246
> > > > [ 57.398215] irq event stamp: 1896
> > >
> > > A regrettable property of the current entry code structure is that we
> > > lose any real indication of the vector. Presumably this is #DB, hence
> > > exc_debug. I don't know what perf has to do with it.
> > >
> > > I'll bang on this after lunch if no one beats me to it.
> >
> > Puzzling, CR3 seems to suggest this is !user_mode(), but either way #DB
> > has either idtentry_enter_user() or nmi_enter().
> >
>
> I just got this splat on 32-bit while running my perf abuser here:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/luto/misc-tests.git/tree/perf_lots_of_nmi.c
>
> [ 21.874114] traps: PANIC: double fault, error_code: 0x0
Two bugs here.
1. We had an issue with WARN. Patch sent.
2. idtentry.h has, for x86_32:
# define DEFINE_IDTENTRY_IST DEFINE_IDTENTRY
This is nonsense. It's getting late over here and I'd rather focus on
the more interesting RCU issue, so that's all from me today.
--Andy
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Perf: WARNING: arch/x86/entry/common.c:624 idtentry_exit_cond_rcu+0x92/0xc0
2020-06-12 3:30 ` Andy Lutomirski
@ 2020-06-12 9:01 ` Thomas Gleixner
2020-06-12 9:18 ` Andy Lutomirski
0 siblings, 1 reply; 13+ messages in thread
From: Thomas Gleixner @ 2020-06-12 9:01 UTC (permalink / raw)
To: Andy Lutomirski, Andy Lutomirski
Cc: Peter Zijlstra, Naresh Kamboju, open list, X86 ML, cj.chengjian,
Arnaldo Carvalho de Melo, Ingo Molnar, H. Peter Anvin,
Borislav Petkov, Minchan Kim, Andrew Morton, Michel Lespinasse,
lkft-triage, Dave Hansen
Andy Lutomirski <luto@kernel.org> writes:
> On Thu, Jun 11, 2020 at 4:22 PM Andy Lutomirski <luto@kernel.org> wrote:
>
> Two bugs here.
>
> 1. We had an issue with WARN. Patch sent.
Grabbed it
> 2. idtentry.h has, for x86_32:
>
> # define DEFINE_IDTENTRY_IST DEFINE_IDTENTRY
>
> This is nonsense. It's getting late over here and I'd rather focus on
> the more interesting RCU issue, so that's all from me today.
Well, this might be nonsense, but it's exactly matching the current code
in mainline which, e.g. for #DB does:
SYM_CODE_START(debug)
/*
* Entry from sysenter is now handled in common_exception
*/
ASM_CLAC
pushl $0
pushl $do_debug
jmp common_exception
SYM_CODE_END(debug)
There is no IST on 32bit, never was. We do software stack switching for
device interrupts, but that's a different story.
Thanks,
tglx
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Perf: WARNING: arch/x86/entry/common.c:624 idtentry_exit_cond_rcu+0x92/0xc0
2020-06-12 9:01 ` Thomas Gleixner
@ 2020-06-12 9:18 ` Andy Lutomirski
2020-06-12 10:34 ` Thomas Gleixner
0 siblings, 1 reply; 13+ messages in thread
From: Andy Lutomirski @ 2020-06-12 9:18 UTC (permalink / raw)
To: Thomas Gleixner
Cc: Andy Lutomirski, Peter Zijlstra, Naresh Kamboju, open list,
X86 ML, cj.chengjian, Arnaldo Carvalho de Melo, Ingo Molnar,
H. Peter Anvin, Borislav Petkov, Minchan Kim, Andrew Morton,
Michel Lespinasse, lkft-triage, Dave Hansen
> On Jun 12, 2020, at 2:01 AM, Thomas Gleixner <tglx@linutronix.de> wrote:
>
> Andy Lutomirski <luto@kernel.org> writes:
>> On Thu, Jun 11, 2020 at 4:22 PM Andy Lutomirski <luto@kernel.org> wrote:
>>
>> Two bugs here.
>>
>> 1. We had an issue with WARN. Patch sent.
>
> Grabbed it
>
>> 2. idtentry.h has, for x86_32:
>>
>> # define DEFINE_IDTENTRY_IST DEFINE_IDTENTRY
>>
>> This is nonsense. It's getting late over here and I'd rather focus on
>> the more interesting RCU issue, so that's all from me today.
>
> Well, this might be nonsense, but it's exactly matching the current code
> in mainline which, e.g. for #DB does:
>
> SYM_CODE_START(debug)
> /*
> * Entry from sysenter is now handled in common_exception
> */
> ASM_CLAC
> pushl $0
> pushl $do_debug
> jmp common_exception
> SYM_CODE_END(debug)
>
> There is no IST on 32bit, never was. We do software stack switching for
> device interrupts, but that's a different story.
>
DEFINE_IDTENTRY does the idtentry_enter_cond_rcu() dance, which isn’t intended to be safe from NMI context. It should probably map to DEFINE_IDTENTRY_RAW() instead. The specific issue is that NMI ends up there, and at least DEFINE_IDTENTRY_NMI should be raw.
I haven’t tried this at all, nor have I dug through all the users of these macros to check what they expect. Perhaps we should not have the _IST one defined at all on 32 bit and rename it to DEFINE_IDTENTRY_IST_RAW on 64 bit to make it more clear what’s going on when reading the C code.
Or maybe I’m too sleepy and I’m nuts. But I don’t think I am.
> Thanks,
>
> tglx
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Perf: WARNING: arch/x86/entry/common.c:624 idtentry_exit_cond_rcu+0x92/0xc0
2020-06-12 9:18 ` Andy Lutomirski
@ 2020-06-12 10:34 ` Thomas Gleixner
2020-07-02 14:27 ` Naresh Kamboju
0 siblings, 1 reply; 13+ messages in thread
From: Thomas Gleixner @ 2020-06-12 10:34 UTC (permalink / raw)
To: Andy Lutomirski
Cc: Andy Lutomirski, Peter Zijlstra, Naresh Kamboju, open list,
X86 ML, cj.chengjian, Arnaldo Carvalho de Melo, Ingo Molnar,
H. Peter Anvin, Borislav Petkov, Minchan Kim, Andrew Morton,
Michel Lespinasse, lkft-triage, Dave Hansen
Andy Lutomirski <luto@amacapital.net> writes:
>> On Jun 12, 2020, at 2:01 AM, Thomas Gleixner <tglx@linutronix.de> wrote:
>> There is no IST on 32bit, never was. We do software stack switching for
>> device interrupts, but that's a different story.
>>
>
> DEFINE_IDTENTRY does the idtentry_enter_cond_rcu() dance, which isn’t
> intended to be safe from NMI context. It should probably map to
> DEFINE_IDTENTRY_RAW() instead. The specific issue is that NMI ends up
> there, and at least DEFINE_IDTENTRY_NMI should be raw.
Yes, you are right. That's clearly broken.
> I haven’t tried this at all, nor have I dug through all the users of
> these macros to check what they expect. Perhaps we should not have
> the _IST one defined at all on 32 bit and rename it to
> DEFINE_IDTENTRY_IST_RAW on 64 bit to make it more clear what’s going
> on when reading the C code.
Let me go over it with a fine comb.
> Or maybe I’m too sleepy and I’m nuts. But I don’t think I am.
/me politely refrains from commenting
^ permalink raw reply [flat|nested] 13+ messages in thread
* [tip: x86/entry] x86/entry: Make NMI use IDTENTRY_RAW
2020-06-11 18:56 Perf: WARNING: arch/x86/entry/common.c:624 idtentry_exit_cond_rcu+0x92/0xc0 Naresh Kamboju
2020-06-11 19:10 ` Andy Lutomirski
@ 2020-06-12 19:50 ` tip-bot2 for Thomas Gleixner
1 sibling, 0 replies; 13+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2020-06-12 19:50 UTC (permalink / raw)
To: linux-tip-commits; +Cc: Naresh Kamboju, Thomas Gleixner, x86, LKML
The following commit has been merged into the x86/entry branch of tip:
Commit-ID: 71ed49d8fb33023f242419a77ecb1141c029cac4
Gitweb: https://git.kernel.org/tip/71ed49d8fb33023f242419a77ecb1141c029cac4
Author: Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Fri, 12 Jun 2020 14:02:27 +02:00
Committer: Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Fri, 12 Jun 2020 14:15:48 +02:00
x86/entry: Make NMI use IDTENTRY_RAW
For no reason other than beginning brainmelt, IDTENTRY_NMI was mapped to
IDTENTRY_IST.
This is not a problem on 64bit because the IST default entry point maps to
IDTENTRY_RAW which does not any entry handling. The surplus function
declaration for the noist C entry point is unused and as there is no ASM
code emitted for NMI this went unnoticed.
On 32bit IDTENTRY_IST maps to a regular IDTENTRY which does the normal
entry handling. That is clearly the wrong thing to do for NMI.
Map it to IDTENTRY_RAW to unbreak it. The IDTENTRY_NMI mapping needs to
stay to avoid emitting ASM code.
Fixes: 6271fef00b34 ("x86/entry: Convert NMI to IDTENTRY_NMI")
Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Debugged-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/CA+G9fYvF3cyrY+-iw_SZtpN-i2qA2BruHg4M=QYECU2-dNdsMw@mail.gmail.com
---
arch/x86/include/asm/idtentry.h | 4 ++--
arch/x86/kernel/nmi.c | 2 +-
2 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/arch/x86/include/asm/idtentry.h b/arch/x86/include/asm/idtentry.h
index 2fc6b0c..cf51c50 100644
--- a/arch/x86/include/asm/idtentry.h
+++ b/arch/x86/include/asm/idtentry.h
@@ -391,8 +391,8 @@ __visible noinstr void func(struct pt_regs *regs, \
#define DEFINE_IDTENTRY_MCE DEFINE_IDTENTRY_IST
#define DEFINE_IDTENTRY_MCE_USER DEFINE_IDTENTRY_NOIST
-#define DECLARE_IDTENTRY_NMI DECLARE_IDTENTRY_IST
-#define DEFINE_IDTENTRY_NMI DEFINE_IDTENTRY_IST
+#define DECLARE_IDTENTRY_NMI DECLARE_IDTENTRY_RAW
+#define DEFINE_IDTENTRY_NMI DEFINE_IDTENTRY_RAW
#define DECLARE_IDTENTRY_DEBUG DECLARE_IDTENTRY_IST
#define DEFINE_IDTENTRY_DEBUG DEFINE_IDTENTRY_IST
diff --git a/arch/x86/kernel/nmi.c b/arch/x86/kernel/nmi.c
index 3a98ff3..2de365f 100644
--- a/arch/x86/kernel/nmi.c
+++ b/arch/x86/kernel/nmi.c
@@ -476,7 +476,7 @@ static DEFINE_PER_CPU(enum nmi_states, nmi_state);
static DEFINE_PER_CPU(unsigned long, nmi_cr2);
static DEFINE_PER_CPU(unsigned long, nmi_dr7);
-DEFINE_IDTENTRY_NMI(exc_nmi)
+DEFINE_IDTENTRY_RAW(exc_nmi)
{
if (IS_ENABLED(CONFIG_SMP) && cpu_is_offline(smp_processor_id()))
return;
^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: Perf: WARNING: arch/x86/entry/common.c:624 idtentry_exit_cond_rcu+0x92/0xc0
2020-06-12 10:34 ` Thomas Gleixner
@ 2020-07-02 14:27 ` Naresh Kamboju
2020-07-02 15:02 ` Peter Zijlstra
0 siblings, 1 reply; 13+ messages in thread
From: Naresh Kamboju @ 2020-07-02 14:27 UTC (permalink / raw)
To: Andy Lutomirski, Thomas Gleixner
Cc: Andy Lutomirski, Peter Zijlstra, open list, X86 ML, cj.chengjian,
Arnaldo Carvalho de Melo, Ingo Molnar, H. Peter Anvin,
Borislav Petkov, Minchan Kim, Andrew Morton, Michel Lespinasse,
lkft-triage, Dave Hansen
I have reported this warning on linux-next and now it is happening on
linux mainline tree.
May I know , are we missing a fix patch on linus 's tree ?
- Naresh
---
While running selftest x86 single_step_syscall_32 on
i386 kernel linux 5.8.0-rc3 kernel warning noticed.
steps to reproduce:
--------------------------
perf test
and
cd /opt/kselftests/default-in-kernel/x86
./single_step_syscall_32
crash dump,
[ 1324.774385] kselftest: Running tests in x86
[ 1324.830187] ------------[ cut here ]------------
[ 1324.834820] IRQs not disabled as expected
[ 1324.838838] WARNING: CPU: 2 PID: 5365 at
/usr/src/kernel/arch/x86/entry/common.c:645
idtentry_exit_cond_rcu+0x92/0xc0
[ 1324.849448] Modules linked in: test_sysctl nf_tables act_mirred
cls_u32 sch_netem sch_ingress mpls_iptunnel mpls_router sch_etf sch_fq
sit x86_pkg_temp_thermal fuse [last unloaded: test_user_copy]
[ 1324.866909] CPU: 2 PID: 5365 Comm: single_step_sys Not tainted 5.8.0-rc3 #1
[ 1324.873866] Hardware name: Supermicro SYS-5019S-ML/X11SSH-F, BIOS
2.0b 07/27/2017
[ 1324.881340] EIP: idtentry_exit_cond_rcu+0x92/0xc0
[ 1324.886045] Code: 8b 89 d8 05 00 00 85 c9 74 ae 80 3d 3f 1d f8 de
00 75 a5 68 3c 6f c5 de 89 55 f8 89 45 fc c6 05 3f 1d f8 de 01 e8 7e
17 23 ff <0f> 0b 58 8b 55 f8 8b 45 fc eb 83 8d 76 00 e8 5b fd ff ff c9
c3 89
[ 1324.904790] EAX: 0000001d EBX: 00000000 ECX: 00000027 EDX: f594c14c
[ 1324.911055] ESI: f1da3fb4 EDI: ffff4ff0 EBP: f1da3f8c ESP: f1da3f80
[ 1324.917312] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00010096
[ 1324.924090] CR0: 80050033 CR2: 0883a16c CR3: 26c06000 CR4: 003406d0
[ 1324.930347] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[ 1324.936622] DR6: fffe0ff0 DR7: 00000400
[ 1324.940460] Call Trace:
[ 1324.942906] exc_debug+0x84/0x1b0
[ 1324.946224] ? exc_int3+0x1d0/0x1d0
[ 1324.949716] handle_exception+0x145/0x145
[ 1324.953720] EIP: 0x8048900
[ 1324.956423] Code: Bad RIP value.
[ 1324.959646] EAX: 00000005 EBX: 00000000 ECX: bfe19f0c EDX: bfe19e8c
[ 1324.965903] ESI: b7e3a000 EDI: 00000000 EBP: bfe1a488 ESP: bfe19e7c
[ 1324.972161] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b EFLAGS: 00000282
[ 1324.978942] irq event stamp: 3694
[ 1324.982257] hardirqs last enabled at (3693): [<de8bf6b9>]
idtentry_exit_user+0x39/0x60
[ 1324.990251] hardirqs last disabled at (3694): [<ddb6ea62>]
vprintk_emit+0x42/0x290
[ 1324.997814] softirqs last enabled at (3688): [<ddab1d98>]
fpu__clear+0x78/0xd0
[ 1325.005110] softirqs last disabled at (3686): [<ddab1d41>]
fpu__clear+0x21/0xd0
[ 1325.012406] ---[ end trace d68e89e2115323e2 ]---
[ 1325.017075] ------------[ cut here ]------------
[ 1325.021688] IRQs not disabled as expected
[ 1325.025695] WARNING: CPU: 2 PID: 5365 at
/usr/src/kernel/arch/x86/entry/common.c:240
__prepare_exit_to_usermode+0x164/0x1b0
[ 1325.036811] Modules linked in: test_sysctl nf_tables act_mirred
cls_u32 sch_netem sch_ingress mpls_iptunnel mpls_router sch_etf sch_fq
sit x86_pkg_temp_thermal fuse [last unloaded: test_user_copy]
[ 1325.054258] CPU: 2 PID: 5365 Comm: single_step_sys Tainted: G
W 5.8.0-rc3 #1
[ 1325.062624] Hardware name: Supermicro SYS-5019S-ML/X11SSH-F, BIOS
2.0b 07/27/2017
[ 1325.070099] EIP: __prepare_exit_to_usermode+0x164/0x1b0
[ 1325.075315] Code: d8 05 00 00 85 db 0f 84 08 ff ff ff 80 3d 40 1d
f8 de 00 0f 85 fb fe ff ff 68 3c 6f c5 de c6 05 40 1d f8 de 01 e8 7c
ee 0e 00 <0f> 0b 59 e9 e2 fe ff ff 8d 74 26 00 68 e4 6e c5 de e8 66 ee
0e 00
[ 1325.094053] EAX: 0000001d EBX: 00000001 ECX: 00000027 EDX: f594c14c
[ 1325.100309] ESI: f5766c00 EDI: f1da3fb4 EBP: f1da3f74 ESP: f1da3f64
[ 1325.106583] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00010086
[ 1325.113378] CR0: 80050033 CR2: 0804d084 CR3: 26c06000 CR4: 003406d0
[ 1325.119637] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[ 1325.125902] DR6: fffe0ff0 DR7: 00000400
[ 1325.129731] Call Trace:
[ 1325.132179] prepare_exit_to_usermode+0x8/0x30
[ 1325.136624] idtentry_exit_cond_rcu+0xa5/0xc0
[ 1325.140983] ? idtentry_exit_user+0x39/0x60
[ 1325.145168] exc_debug+0x84/0x1b0
[ 1325.148480] ? exc_int3+0x1d0/0x1d0
[ 1325.151970] handle_exception+0x145/0x145
[ 1325.155973] EIP: 0x8048900
[ 1325.158677] Code: Bad RIP value.
[ 1325.161901] EAX: 00000005 EBX: 00000000 ECX: bfe19f0c EDX: bfe19e8c
[ 1325.168160] ESI: b7e3a000 EDI: 00000000 EBP: bfe1a488 ESP: bfe19e7c
[ 1325.174423] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b EFLAGS: 00000282
[ 1325.181205] irq event stamp: 3808
[ 1325.184520] hardirqs last enabled at (3807): [<de8bf6b9>]
idtentry_exit_user+0x39/0x60
[ 1325.192512] hardirqs last disabled at (3808): [<ddb6ea62>]
vprintk_emit+0x42/0x290
[ 1325.200079] softirqs last enabled at (3802): [<ddab1d98>]
fpu__clear+0x78/0xd0
[ 1325.207384] softirqs last disabled at (3800): [<ddab1d41>]
fpu__clear+0x21/0xd0
[ 1325.214688] ---[ end trace d68e89e2115323e3 ]---
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Perf: WARNING: arch/x86/entry/common.c:624 idtentry_exit_cond_rcu+0x92/0xc0
2020-07-02 14:27 ` Naresh Kamboju
@ 2020-07-02 15:02 ` Peter Zijlstra
2020-07-02 17:15 ` Naresh Kamboju
2020-07-02 18:21 ` Andy Lutomirski
0 siblings, 2 replies; 13+ messages in thread
From: Peter Zijlstra @ 2020-07-02 15:02 UTC (permalink / raw)
To: Naresh Kamboju
Cc: Andy Lutomirski, Thomas Gleixner, Andy Lutomirski, open list,
X86 ML, cj.chengjian, Arnaldo Carvalho de Melo, Ingo Molnar,
H. Peter Anvin, Borislav Petkov, Minchan Kim, Andrew Morton,
Michel Lespinasse, lkft-triage, Dave Hansen
On Thu, Jul 02, 2020 at 07:57:54PM +0530, Naresh Kamboju wrote:
> I have reported this warning on linux-next and now it is happening on
> linux mainline tree.
> May I know , are we missing a fix patch on linus 's tree ?
>
> - Naresh
> ---
> While running selftest x86 single_step_syscall_32 on
> i386 kernel linux 5.8.0-rc3 kernel warning noticed.
>
> steps to reproduce:
> --------------------------
> perf test
> and
> cd /opt/kselftests/default-in-kernel/x86
> ./single_step_syscall_32
>
> crash dump,
> [ 1324.774385] kselftest: Running tests in x86
> [ 1324.830187] ------------[ cut here ]------------
> [ 1324.834820] IRQs not disabled as expected
> [ 1324.838838] WARNING: CPU: 2 PID: 5365 at
> /usr/src/kernel/arch/x86/entry/common.c:645
> idtentry_exit_cond_rcu+0x92/0xc0
How's this?
DEFINE_IDTENTRY_DEBUG() is DEFINE_IDTENTRY_RAW on x86_64
DEFINE_IDTENTRY on i386
calling exc_debug_*() from DEFINE_IDTENTRY() does a double layer of
idtentry_{enter,exit}*() functions.
Also, handle_debug() is still a trainwreck, we should never get to
cond_local_irq_enable() when !user.
Completely untested...
---
arch/x86/kernel/traps.c | 49 +++++++++++++++++++++++--------------------------
1 file changed, 23 insertions(+), 26 deletions(-)
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index b0195771c932..47d6b46e1564 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -806,8 +806,17 @@ static void handle_debug(struct pt_regs *regs, unsigned long dr6, bool user)
* If DR6 is zero, no point in trying to handle it. The kernel is
* not using INT1.
*/
- if (!user && !dr6)
- return;
+ if (!user) {
+ /*
+ * Catch SYSENTER with TF set and clear DR_STEP. If this hit a
+ * watchpoint at the same time then that will still be handled.
+ */
+ if ((dr6 & DR_STEP) && is_sysenter_singlestep(regs))
+ dr6 &= ~DR_STEP;
+
+ if (!dr6)
+ return;
+ }
/*
* If dr6 has no reason to give us about the origin of this trap,
@@ -859,25 +868,29 @@ static void handle_debug(struct pt_regs *regs, unsigned long dr6, bool user)
cond_local_irq_disable(regs);
}
+#ifdef CONFIG_X86_64
static __always_inline void exc_debug_kernel(struct pt_regs *regs,
unsigned long dr6)
{
bool irq_state = idtentry_enter_nmi(regs);
instrumentation_begin();
- /*
- * Catch SYSENTER with TF set and clear DR_STEP. If this hit a
- * watchpoint at the same time then that will still be handled.
- */
- if ((dr6 & DR_STEP) && is_sysenter_singlestep(regs))
- dr6 &= ~DR_STEP;
-
handle_debug(regs, dr6, false);
instrumentation_end();
idtentry_exit_nmi(regs, irq_state);
}
+/* IST stack entry */
+DEFINE_IDTENTRY_DEBUG(exc_debug)
+{
+ unsigned long dr6, dr7;
+
+ debug_enter(&dr6, &dr7);
+ exc_debug_kernel(regs, dr6);
+ debug_exit(dr7);
+}
+
static __always_inline void exc_debug_user(struct pt_regs *regs,
unsigned long dr6)
{
@@ -890,17 +903,6 @@ static __always_inline void exc_debug_user(struct pt_regs *regs,
idtentry_exit_user(regs);
}
-#ifdef CONFIG_X86_64
-/* IST stack entry */
-DEFINE_IDTENTRY_DEBUG(exc_debug)
-{
- unsigned long dr6, dr7;
-
- debug_enter(&dr6, &dr7);
- exc_debug_kernel(regs, dr6);
- debug_exit(dr7);
-}
-
/* User entry, runs on regular task stack */
DEFINE_IDTENTRY_DEBUG_USER(exc_debug)
{
@@ -917,12 +919,7 @@ DEFINE_IDTENTRY_DEBUG(exc_debug)
unsigned long dr6, dr7;
debug_enter(&dr6, &dr7);
-
- if (user_mode(regs))
- exc_debug_user(regs, dr6);
- else
- exc_debug_kernel(regs, dr6);
-
+ handle_debug(regs, dr6, user_mode(regs));
debug_exit(dr7);
}
#endif
^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: Perf: WARNING: arch/x86/entry/common.c:624 idtentry_exit_cond_rcu+0x92/0xc0
2020-07-02 15:02 ` Peter Zijlstra
@ 2020-07-02 17:15 ` Naresh Kamboju
2020-07-02 18:21 ` Andy Lutomirski
1 sibling, 0 replies; 13+ messages in thread
From: Naresh Kamboju @ 2020-07-02 17:15 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Andy Lutomirski, Thomas Gleixner, Andy Lutomirski, open list,
X86 ML, cj.chengjian, Arnaldo Carvalho de Melo, Ingo Molnar,
H. Peter Anvin, Borislav Petkov, Minchan Kim, Andrew Morton,
Michel Lespinasse, lkft-triage, Dave Hansen
On Thu, 2 Jul 2020 at 20:33, Peter Zijlstra <peterz@infradead.org> wrote:
>
> How's this?
>
> DEFINE_IDTENTRY_DEBUG() is DEFINE_IDTENTRY_RAW on x86_64
> DEFINE_IDTENTRY on i386
>
> calling exc_debug_*() from DEFINE_IDTENTRY() does a double layer of
> idtentry_{enter,exit}*() functions.
>
> Also, handle_debug() is still a trainwreck, we should never get to
> cond_local_irq_enable() when !user.
>
> Completely untested...
Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Tested-by: Naresh Kamboju <naresh.kamboju@linaro.org>
The patch did not apply smooth.
I have manually edited the patch [1] and applied it on top of linus
master branch
and tested on x86_64 and i386 and test pass and the reported WARNING gone.
ref:
patch link
[1] https://pastebin.com/mBHkP1A6
Test jobs links,
https://lkft.validation.linaro.org/scheduler/job/1538367#L1218
https://lkft.validation.linaro.org/scheduler/job/1538368#L1271
- Naresh
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Perf: WARNING: arch/x86/entry/common.c:624 idtentry_exit_cond_rcu+0x92/0xc0
2020-07-02 15:02 ` Peter Zijlstra
2020-07-02 17:15 ` Naresh Kamboju
@ 2020-07-02 18:21 ` Andy Lutomirski
1 sibling, 0 replies; 13+ messages in thread
From: Andy Lutomirski @ 2020-07-02 18:21 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Naresh Kamboju, Thomas Gleixner, Andy Lutomirski, open list,
X86 ML, cj.chengjian, Arnaldo Carvalho de Melo, Ingo Molnar,
H. Peter Anvin, Borislav Petkov, Minchan Kim, Andrew Morton,
Michel Lespinasse, lkft-triage, Dave Hansen
On Thu, Jul 2, 2020 at 8:03 AM Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Thu, Jul 02, 2020 at 07:57:54PM +0530, Naresh Kamboju wrote:
> > I have reported this warning on linux-next and now it is happening on
> > linux mainline tree.
> > May I know , are we missing a fix patch on linus 's tree ?
> >
> > - Naresh
> > ---
> > While running selftest x86 single_step_syscall_32 on
> > i386 kernel linux 5.8.0-rc3 kernel warning noticed.
> >
> > steps to reproduce:
> > --------------------------
> > perf test
> > and
> > cd /opt/kselftests/default-in-kernel/x86
> > ./single_step_syscall_32
> >
> > crash dump,
> > [ 1324.774385] kselftest: Running tests in x86
> > [ 1324.830187] ------------[ cut here ]------------
> > [ 1324.834820] IRQs not disabled as expected
> > [ 1324.838838] WARNING: CPU: 2 PID: 5365 at
> > /usr/src/kernel/arch/x86/entry/common.c:645
> > idtentry_exit_cond_rcu+0x92/0xc0
>
> How's this?
>
> DEFINE_IDTENTRY_DEBUG() is DEFINE_IDTENTRY_RAW on x86_64
> DEFINE_IDTENTRY on i386
>
> calling exc_debug_*() from DEFINE_IDTENTRY() does a double layer of
> idtentry_{enter,exit}*() functions.
>
> Also, handle_debug() is still a trainwreck, we should never get to
> cond_local_irq_enable() when !user.
>
> Completely untested...
>
> ---
> arch/x86/kernel/traps.c | 49 +++++++++++++++++++++++--------------------------
> 1 file changed, 23 insertions(+), 26 deletions(-)
>
> diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
> index b0195771c932..47d6b46e1564 100644
> --- a/arch/x86/kernel/traps.c
> +++ b/arch/x86/kernel/traps.c
> @@ -806,8 +806,17 @@ static void handle_debug(struct pt_regs *regs, unsigned long dr6, bool user)
> * If DR6 is zero, no point in trying to handle it. The kernel is
> * not using INT1.
> */
> - if (!user && !dr6)
> - return;
> + if (!user) {
> + /*
> + * Catch SYSENTER with TF set and clear DR_STEP. If this hit a
> + * watchpoint at the same time then that will still be handled.
> + */
> + if ((dr6 & DR_STEP) && is_sysenter_singlestep(regs))
> + dr6 &= ~DR_STEP;
> +
> + if (!dr6)
> + return;
> + }
>
> /*
> * If dr6 has no reason to give us about the origin of this trap,
> @@ -859,25 +868,29 @@ static void handle_debug(struct pt_regs *regs, unsigned long dr6, bool user)
> cond_local_irq_disable(regs);
> }
>
> +#ifdef CONFIG_X86_64
> static __always_inline void exc_debug_kernel(struct pt_regs *regs,
> unsigned long dr6)
> {
> bool irq_state = idtentry_enter_nmi(regs);
> instrumentation_begin();
>
> - /*
> - * Catch SYSENTER with TF set and clear DR_STEP. If this hit a
> - * watchpoint at the same time then that will still be handled.
> - */
> - if ((dr6 & DR_STEP) && is_sysenter_singlestep(regs))
> - dr6 &= ~DR_STEP;
> -
> handle_debug(regs, dr6, false);
>
> instrumentation_end();
> idtentry_exit_nmi(regs, irq_state);
> }
>
> +/* IST stack entry */
> +DEFINE_IDTENTRY_DEBUG(exc_debug)
> +{
> + unsigned long dr6, dr7;
> +
> + debug_enter(&dr6, &dr7);
> + exc_debug_kernel(regs, dr6);
> + debug_exit(dr7);
> +}
> +
> static __always_inline void exc_debug_user(struct pt_regs *regs,
> unsigned long dr6)
> {
> @@ -890,17 +903,6 @@ static __always_inline void exc_debug_user(struct pt_regs *regs,
> idtentry_exit_user(regs);
> }
>
> -#ifdef CONFIG_X86_64
> -/* IST stack entry */
> -DEFINE_IDTENTRY_DEBUG(exc_debug)
> -{
> - unsigned long dr6, dr7;
> -
> - debug_enter(&dr6, &dr7);
> - exc_debug_kernel(regs, dr6);
> - debug_exit(dr7);
> -}
> -
> /* User entry, runs on regular task stack */
> DEFINE_IDTENTRY_DEBUG_USER(exc_debug)
> {
> @@ -917,12 +919,7 @@ DEFINE_IDTENTRY_DEBUG(exc_debug)
> unsigned long dr6, dr7;
>
> debug_enter(&dr6, &dr7);
> -
> - if (user_mode(regs))
> - exc_debug_user(regs, dr6);
> - else
> - exc_debug_kernel(regs, dr6);
> -
> + handle_debug(regs, dr6, user_mode(regs));
> debug_exit(dr7);
> }
> #endif
I'll fold something like this into my pile of pending fixes.
^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2020-07-02 18:21 UTC | newest]
Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-06-11 18:56 Perf: WARNING: arch/x86/entry/common.c:624 idtentry_exit_cond_rcu+0x92/0xc0 Naresh Kamboju
2020-06-11 19:10 ` Andy Lutomirski
2020-06-11 19:25 ` Peter Zijlstra
2020-06-11 23:22 ` Andy Lutomirski
2020-06-12 3:30 ` Andy Lutomirski
2020-06-12 9:01 ` Thomas Gleixner
2020-06-12 9:18 ` Andy Lutomirski
2020-06-12 10:34 ` Thomas Gleixner
2020-07-02 14:27 ` Naresh Kamboju
2020-07-02 15:02 ` Peter Zijlstra
2020-07-02 17:15 ` Naresh Kamboju
2020-07-02 18:21 ` Andy Lutomirski
2020-06-12 19:50 ` [tip: x86/entry] x86/entry: Make NMI use IDTENTRY_RAW tip-bot2 for Thomas Gleixner
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).