* BUG: KCSAN: data-race in do_page_fault / spectre_v4_enable_task_mitigation
@ 2022-12-15  7:32 Naresh Kamboju
  2022-12-15  8:32 ` Marco Elver
  0 siblings, 1 reply; 6+ messages in thread
From: Naresh Kamboju @ 2022-12-15  7:32 UTC (permalink / raw)
  To: open list, rcu, kunit-dev, lkft-triage, kasan-dev
  Cc: Paul E. McKenney, Dominique Martinet, Netdev, Marco Elver, Anders Roxell

[Please ignore this if it has already been reported; I am not a KCSAN expert.]

On the Linux next-20221214 tag, the arm64 allmodconfig boot failed due to the
following data race reported by KCSAN.

Reported-by: Linux Kernel Functional Testing <lkft@linaro.org>

[    0.000000][    T0] Booting Linux on physical CPU 0x0000000000 [0x410fd034]
[    0.000000][    T0] Linux version 6.1.0-next-20221214 (tuxmake@tuxmake) (aarch64-linux-gnu-gcc (Debian 12.2.0-9) 12.2.0, GNU ld (GNU Binutils for Debian) 2.39) #2 SMP PREEMPT_DYNAMIC @1671022464
[    0.000000][    T0] random: crng init done
[    0.000000][    T0] Machine model: linux,dummy-virt
...
[ 1067.461794][  T132] BUG: KCSAN: data-race in do_page_fault / spectre_v4_enable_task_mitigation
[ 1067.467529][  T132]
[ 1067.469146][  T132] write to 0xffff80000f00bfb8 of 8 bytes by task 93 on cpu 0:
[ 1067.473790][  T132]  spectre_v4_enable_task_mitigation+0x2f8/0x340
[ 1067.477964][  T132]  __switch_to+0xc4/0x200
[ 1067.480877][  T132]  __schedule+0x5ec/0x6c0
[ 1067.483764][  T132]  schedule+0x6c/0x100
[ 1067.486526][  T132]  worker_thread+0x7d8/0x8c0
[ 1067.489581][  T132]  kthread+0x1b8/0x200
[ 1067.492483][  T132]  ret_from_fork+0x10/0x20
[ 1067.495450][  T132]
[ 1067.497034][  T132] read to 0xffff80000f00bfb8 of 8 bytes by task 132 on cpu 0:
[ 1067.501684][  T132]  do_page_fault+0x568/0xa40
[ 1067.504938][  T132]  do_mem_abort+0x7c/0x180
[ 1067.508051][  T132]  el0_da+0x64/0x100
[ 1067.510712][  T132]  el0t_64_sync_handler+0x90/0x180
[ 1067.514191][  T132]  el0t_64_sync+0x1a4/0x1a8
[ 1067.517200][  T132]
[ 1067.518758][  T132] 1 lock held by (udevadm)/132:
[ 1067.521883][  T132]  #0: ffff00000b802c28 (&mm->mmap_lock){++++}-{3:3}, at: do_page_fault+0x480/0xa40
[ 1067.528399][  T132] irq event stamp: 1461
[ 1067.531041][  T132] hardirqs last  enabled at (1460): [<ffff80000af83e40>] preempt_schedule_irq+0x40/0x100
[ 1067.537176][  T132] hardirqs last disabled at (1461): [<ffff80000af82c84>] __schedule+0x84/0x6c0
[ 1067.542788][  T132] softirqs last  enabled at (1423): [<ffff800008020688>] fpsimd_restore_current_state+0x148/0x1c0
[ 1067.549480][  T132] softirqs last disabled at (1421): [<ffff8000080205fc>] fpsimd_restore_current_state+0xbc/0x1c0
[ 1067.556127][  T132]
[ 1067.557687][  T132] value changed: 0x0000000060000000 -> 0x0000000060001000
[ 1067.562039][  T132]
[ 1067.563631][  T132] Reported by Kernel Concurrency Sanitizer on:
[ 1067.567480][  T132] CPU: 0 PID: 132 Comm: (udevadm) Tainted: G           T  6.1.0-next-20221214 #2 4185b46758ba972fed408118afddb8c426bff43a
[ 1067.575669][  T132] Hardware name: linux,dummy-virt (DT)


metadata:
  repo: https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/?h=next-20221214
  config: allmodconfig
  arch: arm64
  Build details:
https://qa-reports.linaro.org/lkft/linux-next-master/build/next-20221214/

--
Linaro LKFT
https://lkft.linaro.org


* Re: BUG: KCSAN: data-race in do_page_fault / spectre_v4_enable_task_mitigation
  2022-12-15  7:32 BUG: KCSAN: data-race in do_page_fault / spectre_v4_enable_task_mitigation Naresh Kamboju
@ 2022-12-15  8:32 ` Marco Elver
  0 siblings, 0 replies; 6+ messages in thread
From: Marco Elver @ 2022-12-15  8:32 UTC (permalink / raw)
  To: Naresh Kamboju
  Cc: open list, rcu, kunit-dev, lkft-triage, kasan-dev,
	Paul E. McKenney, Dominique Martinet, Netdev, Anders Roxell

On Thu, 15 Dec 2022 at 08:32, Naresh Kamboju <naresh.kamboju@linaro.org> wrote:
>
> [Please ignore this if it has already been reported; I am not a KCSAN expert.]
>
> On the Linux next-20221214 tag, the arm64 allmodconfig boot failed due to the
> following data race reported by KCSAN.
>
> Reported-by: Linux Kernel Functional Testing <lkft@linaro.org>
>
> [    0.000000][    T0] Booting Linux on physical CPU 0x0000000000 [0x410fd034]
> [    0.000000][    T0] Linux version 6.1.0-next-20221214
> (tuxmake@tuxmake) (aarch64-linux-gnu-gcc (Debian 12.2.0-9) 12.2.0, GNU
> ld (GNU Binutils for Debian) 2.39) #2 SMP PREEMPT_DYNAMIC @1671022464
> [    0.000000][    T0] random: crng init done
> [    0.000000][    T0] Machine model: linux,dummy-virt
> ...
> [ 1067.461794][  T132] BUG: KCSAN: data-race in do_page_fault /
> spectre_v4_enable_task_mitigation
> [ 1067.467529][  T132]
> [ 1067.469146][  T132] write to 0xffff80000f00bfb8 of 8 bytes by task
> 93 on cpu 0:
> [ 1067.473790][  T132]  spectre_v4_enable_task_mitigation+0x2f8/0x340
> [ 1067.477964][  T132]  __switch_to+0xc4/0x200

Please provide line numbers with all reports - you can use the script
scripts/decode_stacktrace.sh (requires the vmlinux you found this
with) to do so.

It would be good to do this immediately, because having anyone else do
so is nearly impossible - and without line numbers this report will
very likely be ignored.

Thanks,
-- Marco
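
To illustrate why the raw report is hard to act on: each undecoded frame
carries only a symbol name, a byte offset into that function, and the
function's size. Mapping the offset to a source file and line requires the
debug information in the matching vmlinux, which is what
scripts/decode_stacktrace.sh consults. The following is a minimal sketch in
Python, not part of the kernel tree; the names FRAME_RE, Frame, and
parse_frame are hypothetical and exist only for this illustration.

```python
import re
from typing import NamedTuple, Optional

# Matches "symbol+0xOFFSET/0xSIZE" as printed in undecoded kernel stack traces.
# Symbol names may contain dots (for example "slab_alloc_node.isra.0").
FRAME_RE = re.compile(r"([A-Za-z_][\w.]*)\+0x([0-9a-fA-F]+)/0x([0-9a-fA-F]+)")


class Frame(NamedTuple):
    symbol: str  # function name
    offset: int  # byte offset of the return address within the function
    size: int    # total size of the function in bytes


def parse_frame(line: str) -> Optional[Frame]:
    """Return the frame found on a log line, or None if the line has none."""
    match = FRAME_RE.search(line)
    if match is None:
        return None
    return Frame(match.group(1), int(match.group(2), 16), int(match.group(3), 16))


if __name__ == "__main__":
    raw_lines = [
        "[ 1067.473790][  T132]  spectre_v4_enable_task_mitigation+0x2f8/0x340",
        "[ 1067.477964][  T132]  __switch_to+0xc4/0x200",
        "[ 1067.495450][  T132]",
    ]
    for line in raw_lines:
        frame = parse_frame(line)
        if frame is not None:
            print(f"{frame.symbol}: offset {frame.offset:#x} of {frame.size:#x} bytes")
```

Nothing in the parsed tuple identifies a source line, so a reader without
the exact build cannot tell which statement inside the function performed
the access.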

> [ 1067.480877][  T132]  __schedule+0x5ec/0x6c0
> [ 1067.483764][  T132]  schedule+0x6c/0x100
> [ 1067.486526][  T132]  worker_thread+0x7d8/0x8c0
> [ 1067.489581][  T132]  kthread+0x1b8/0x200
> [ 1067.492483][  T132]  ret_from_fork+0x10/0x20
> [ 1067.495450][  T132]
> [ 1067.497034][  T132] read to 0xffff80000f00bfb8 of 8 bytes by task
> 132 on cpu 0:
> [ 1067.501684][  T132]  do_page_fault+0x568/0xa40
> [ 1067.504938][  T132]  do_mem_abort+0x7c/0x180
> [ 1067.508051][  T132]  el0_da+0x64/0x100
> [ 1067.510712][  T132]  el0t_64_sync_handler+0x90/0x180
> [ 1067.514191][  T132]  el0t_64_sync+0x1a4/0x1a8
> [ 1067.517200][  T132]
> [ 1067.518758][  T132] 1 lock held by (udevadm)/132:
> [ 1067.521883][  T132]  #0: ffff00000b802c28
> (&mm->mmap_lock){++++}-{3:3}, at: do_page_fault+0x480/0xa40
> [ 1067.528399][  T132] irq event stamp: 1461
> [ 1067.531041][  T132] hardirqs last  enabled at (1460):
> [<ffff80000af83e40>] preempt_schedule_irq+0x40/0x100
> [ 1067.537176][  T132] hardirqs last disabled at (1461):
> [<ffff80000af82c84>] __schedule+0x84/0x6c0
> [ 1067.542788][  T132] softirqs last  enabled at (1423):
> [<ffff800008020688>] fpsimd_restore_current_state+0x148/0x1c0
> [ 1067.549480][  T132] softirqs last disabled at (1421):
> [<ffff8000080205fc>] fpsimd_restore_current_state+0xbc/0x1c0
> [ 1067.556127][  T132]
> [ 1067.557687][  T132] value changed: 0x0000000060000000 -> 0x0000000060001000
> [ 1067.562039][  T132]
> [ 1067.563631][  T132] Reported by Kernel Concurrency Sanitizer on:
> [ 1067.567480][  T132] CPU: 0 PID: 132 Comm: (udevadm) Tainted: G
>           T  6.1.0-next-20221214 #2
> 4185b46758ba972fed408118afddb8c426bff43a
> [ 1067.575669][  T132] Hardware name: linux,dummy-virt (DT)
>
>
> metadata:
>   repo: https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/?h=next-20221214
>   config: allmodconfig
>   arch: arm64
>   Build details:
> https://qa-reports.linaro.org/lkft/linux-next-master/build/next-20221214/
>
> --
> Linaro LKFT
> https://lkft.linaro.org


* Re: BUG: KCSAN: data-race in do_page_fault / spectre_v4_enable_task_mitigation
  2022-12-21 14:54 ` Anders Roxell
@ 2023-01-06 17:38   ` Will Deacon
  -1 siblings, 0 replies; 6+ messages in thread
From: Will Deacon @ 2023-01-06 17:38 UTC (permalink / raw)
  To: Anders Roxell
  Cc: catalin.marinas, mark.rutland, james.morse, ebiederm, elver,
	linux-arm-kernel, linux-kernel, arnd

On Wed, Dec 21, 2022 at 03:54:36PM +0100, Anders Roxell wrote:
> Hey,
> 
> I'm building an allmodconfig kernel on yesterday's linux-next (tag:
> next-20221220) and I see a
> "BUG: KCSAN: data-race in do_page_fault / spectre_v4_enable_task_mitigation"
> when I boot up in QEMU. I ran the output through
> scripts/decode_stacktrace.sh and this is what I see:
> 
> 
> [ 2105.261121][  T154] ==================================================================
> [ 2105.266067][  T154] BUG: KCSAN: data-race in do_page_fault / spectre_v4_enable_task_mitigation
> [ 2105.271577][  T154] 
> [ 2105.273121][  T154] write to 0xffff8000210b3fb8 of 8 bytes by task 136 on cpu 0:
> [ 2105.277743][ T154] spectre_v4_enable_task_mitigation (/home/anders/src/kernel/next/arch/arm64/kernel/proton-pack.c:651 /home/anders/src/kernel/next/arch/arm64/kernel/proton-pack.c:664) 
> [ 2105.281802][ T154] __switch_to (/home/anders/src/kernel/next/arch/arm64/kernel/process.c:459 /home/anders/src/kernel/next/arch/arm64/kernel/process.c:532) 
> [ 2105.284670][ T154] __schedule (/home/anders/src/kernel/next/kernel/sched/core.c:5247 /home/anders/src/kernel/next/kernel/sched/core.c:6555) 
> [ 2105.287555][ T154] preempt_schedule_irq (/home/anders/src/kernel/next/arch/arm64/include/asm/irqflags.h:70 /home/anders/src/kernel/next/arch/arm64/include/asm/irqflags.h:98 /home/anders/src/kernel/next/kernel/sched/core.c:6868) 
> [ 2105.290857][ T154] arm64_preempt_schedule_irq (/home/anders/src/kernel/next/arch/arm64/kernel/entry-common.c:237) 
> [ 2105.294433][ T154] el1_interrupt (/home/anders/src/kernel/next/arch/arm64/kernel/entry-common.c:476 /home/anders/src/kernel/next/arch/arm64/kernel/entry-common.c:486) 
> [ 2105.297433][ T154] el1h_64_irq_handler (/home/anders/src/kernel/next/arch/arm64/kernel/entry-common.c:492) 
> [ 2105.300718][ T154] el1h_64_irq (/home/anders/src/kernel/next/arch/arm64/kernel/entry.S:580) 
> [ 2105.303497][ T154] arch_local_irq_restore (/home/anders/src/kernel/next/arch/arm64/include/asm/jump_label.h:21 /home/anders/src/kernel/next/arch/arm64/include/asm/irqflags.h:130) 
> [ 2105.306750][ T154] fs_reclaim_acquire (/home/anders/src/kernel/next/mm/page_alloc.c:4691) 
> [ 2105.310118][ T154] slab_pre_alloc_hook.constprop.0 (/home/anders/src/kernel/next/include/linux/sched/mm.h:272 /home/anders/src/kernel/next/mm/slab.h:720) 
> [ 2105.313966][ T154] slab_alloc_node.isra.0 (/home/anders/src/kernel/next/mm/slub.c:3434) 
> [ 2105.317343][ T154] __kmem_cache_alloc_lru (/home/anders/src/kernel/next/mm/slub.c:3469) 
> [ 2105.320659][ T154] kmem_cache_alloc (/home/anders/src/kernel/next/mm/slub.c:3477) 
> [ 2105.323673][ T154] getname_flags (/home/anders/src/kernel/next/fs/namei.c:139) 
> [ 2105.326703][ T154] getname (/home/anders/src/kernel/next/fs/namei.c:218) 
> [ 2105.329377][ T154] do_sys_openat2 (/home/anders/src/kernel/next/fs/open.c:1304) 
> [ 2105.332352][ T154] __arm64_sys_openat (/home/anders/src/kernel/next/fs/open.c:1326) 
> [ 2105.335573][ T154] el0_svc_common.constprop.0 (/home/anders/src/kernel/next/arch/arm64/kernel/syscall.c:38 /home/anders/src/kernel/next/arch/arm64/kernel/syscall.c:52 /home/anders/src/kernel/next/arch/arm64/kernel/syscall.c:142) 
> [ 2105.339272][ T154] do_el0_svc (/home/anders/src/kernel/next/arch/arm64/kernel/syscall.c:197) 
> [ 2105.342025][ T154] el0_svc (/home/anders/src/kernel/next/arch/arm64/kernel/entry-common.c:133 /home/anders/src/kernel/next/arch/arm64/kernel/entry-common.c:142 /home/anders/src/kernel/next/arch/arm64/kernel/entry-common.c:638) 
> [ 2105.344687][ T154] el0t_64_sync_handler (/home/anders/src/kernel/next/arch/arm64/kernel/entry-common.c:656) 
> [ 2105.348089][ T154] el0t_64_sync (/home/anders/src/kernel/next/arch/arm64/kernel/entry.S:584) 
> [ 2105.350998][  T154] 
> [ 2105.352567][  T154] read to 0xffff8000210b3fb8 of 8 bytes by task 154 on cpu 0:
> [ 2105.357117][ T154] do_page_fault (/home/anders/src/kernel/next/arch/arm64/mm/fault.c:517 /home/anders/src/kernel/next/arch/arm64/mm/fault.c:558) 
> [ 2105.360110][ T154] do_translation_fault (/home/anders/src/kernel/next/arch/arm64/mm/fault.c:695) 
> [ 2105.363400][ T154] do_mem_abort (/home/anders/src/kernel/next/arch/arm64/mm/fault.c:831) 
> [ 2105.366400][ T154] el0_ia (/home/anders/src/kernel/next/arch/arm64/kernel/entry-common.c:133 /home/anders/src/kernel/next/arch/arm64/kernel/entry-common.c:142 /home/anders/src/kernel/next/arch/arm64/kernel/entry-common.c:534) 
> [ 2105.369059][ T154] el0t_64_sync_handler (/home/anders/src/kernel/next/arch/arm64/kernel/entry-common.c:662) 
> [ 2105.372445][ T154] el0t_64_sync (/home/anders/src/kernel/next/arch/arm64/kernel/entry.S:584) 
> [ 2105.375404][  T154] 
> [ 2105.376935][  T154] no locks held by systemd/154.
> [ 2105.379935][  T154] irq event stamp: 385
> [ 2105.382448][ T154] hardirqs last enabled at (385): local_daif_restore (/home/anders/src/kernel/next/arch/arm64/include/asm/daifflags.h:71 (discriminator 1)) 
> [ 2105.388413][ T154] hardirqs last disabled at (384): el0t_64_sync_handler (/home/anders/src/kernel/next/arch/arm64/kernel/entry-common.c:662) 
> [ 2105.394436][ T154] softirqs last enabled at (352): fpsimd_restore_current_state (/home/anders/src/kernel/next/arch/arm64/kernel/fpsimd.c:264 /home/anders/src/kernel/next/arch/arm64/kernel/fpsimd.c:1780) 
> [ 2105.400932][ T154] softirqs last disabled at (350): fpsimd_restore_current_state (/home/anders/src/kernel/next/include/linux/bottom_half.h:20 /home/anders/src/kernel/next/arch/arm64/kernel/fpsimd.c:242 /home/anders/src/kernel/next/arch/arm64/kernel/fpsimd.c:1773) 
> [ 2105.407394][  T154] 
> [ 2105.408909][  T154] value changed: 0x0000000060000000 -> 0x0000000060001000
> [ 2105.413225][  T154] 
> [ 2105.414746][  T154] Reported by Kernel Concurrency Sanitizer on:
> [ 2105.426169][  T154] Hardware name: linux,dummy-virt (DT)
> [ 2105.429528][  T154] ==================================================================
> 
> 
> The prctl case only gets called on 'current'. However, I assume the
> kernel could be preempted while accessing current->pt_regs->pstate,
> so that the access then races against the task switch.
> 
> Any idea what is happening and how to fix it?

I can't quite decipher this against mainline, as the line numbers above in
do_page_fault() seem to correlate with reads of the esr, which is either
a register or a stack read. I also think everything in the stacktraces
is running in the context of 'current', so I can't see how we could really
race here.

:/

Will

* BUG: KCSAN: data-race in do_page_fault / spectre_v4_enable_task_mitigation
@ 2022-12-21 14:54 ` Anders Roxell
  0 siblings, 0 replies; 6+ messages in thread
From: Anders Roxell @ 2022-12-21 14:54 UTC (permalink / raw)
  To: will, catalin.marinas, mark.rutland, james.morse, ebiederm, elver
  Cc: linux-arm-kernel, linux-kernel, arnd

Hey,

I'm building an allmodconfig kernel on yesterday's linux-next (tag:
next-20221220) and I see a
"BUG: KCSAN: data-race in do_page_fault / spectre_v4_enable_task_mitigation"
when I boot up in QEMU. I ran the output through
scripts/decode_stacktrace.sh and this is what I see:


[ 2105.261121][  T154] ==================================================================
[ 2105.266067][  T154] BUG: KCSAN: data-race in do_page_fault / spectre_v4_enable_task_mitigation
[ 2105.271577][  T154] 
[ 2105.273121][  T154] write to 0xffff8000210b3fb8 of 8 bytes by task 136 on cpu 0:
[ 2105.277743][ T154] spectre_v4_enable_task_mitigation (/home/anders/src/kernel/next/arch/arm64/kernel/proton-pack.c:651 /home/anders/src/kernel/next/arch/arm64/kernel/proton-pack.c:664) 
[ 2105.281802][ T154] __switch_to (/home/anders/src/kernel/next/arch/arm64/kernel/process.c:459 /home/anders/src/kernel/next/arch/arm64/kernel/process.c:532) 
[ 2105.284670][ T154] __schedule (/home/anders/src/kernel/next/kernel/sched/core.c:5247 /home/anders/src/kernel/next/kernel/sched/core.c:6555) 
[ 2105.287555][ T154] preempt_schedule_irq (/home/anders/src/kernel/next/arch/arm64/include/asm/irqflags.h:70 /home/anders/src/kernel/next/arch/arm64/include/asm/irqflags.h:98 /home/anders/src/kernel/next/kernel/sched/core.c:6868) 
[ 2105.290857][ T154] arm64_preempt_schedule_irq (/home/anders/src/kernel/next/arch/arm64/kernel/entry-common.c:237) 
[ 2105.294433][ T154] el1_interrupt (/home/anders/src/kernel/next/arch/arm64/kernel/entry-common.c:476 /home/anders/src/kernel/next/arch/arm64/kernel/entry-common.c:486) 
[ 2105.297433][ T154] el1h_64_irq_handler (/home/anders/src/kernel/next/arch/arm64/kernel/entry-common.c:492) 
[ 2105.300718][ T154] el1h_64_irq (/home/anders/src/kernel/next/arch/arm64/kernel/entry.S:580) 
[ 2105.303497][ T154] arch_local_irq_restore (/home/anders/src/kernel/next/arch/arm64/include/asm/jump_label.h:21 /home/anders/src/kernel/next/arch/arm64/include/asm/irqflags.h:130) 
[ 2105.306750][ T154] fs_reclaim_acquire (/home/anders/src/kernel/next/mm/page_alloc.c:4691) 
[ 2105.310118][ T154] slab_pre_alloc_hook.constprop.0 (/home/anders/src/kernel/next/include/linux/sched/mm.h:272 /home/anders/src/kernel/next/mm/slab.h:720) 
[ 2105.313966][ T154] slab_alloc_node.isra.0 (/home/anders/src/kernel/next/mm/slub.c:3434) 
[ 2105.317343][ T154] __kmem_cache_alloc_lru (/home/anders/src/kernel/next/mm/slub.c:3469) 
[ 2105.320659][ T154] kmem_cache_alloc (/home/anders/src/kernel/next/mm/slub.c:3477) 
[ 2105.323673][ T154] getname_flags (/home/anders/src/kernel/next/fs/namei.c:139) 
[ 2105.326703][ T154] getname (/home/anders/src/kernel/next/fs/namei.c:218) 
[ 2105.329377][ T154] do_sys_openat2 (/home/anders/src/kernel/next/fs/open.c:1304) 
[ 2105.332352][ T154] __arm64_sys_openat (/home/anders/src/kernel/next/fs/open.c:1326) 
[ 2105.335573][ T154] el0_svc_common.constprop.0 (/home/anders/src/kernel/next/arch/arm64/kernel/syscall.c:38 /home/anders/src/kernel/next/arch/arm64/kernel/syscall.c:52 /home/anders/src/kernel/next/arch/arm64/kernel/syscall.c:142) 
[ 2105.339272][ T154] do_el0_svc (/home/anders/src/kernel/next/arch/arm64/kernel/syscall.c:197) 
[ 2105.342025][ T154] el0_svc (/home/anders/src/kernel/next/arch/arm64/kernel/entry-common.c:133 /home/anders/src/kernel/next/arch/arm64/kernel/entry-common.c:142 /home/anders/src/kernel/next/arch/arm64/kernel/entry-common.c:638) 
[ 2105.344687][ T154] el0t_64_sync_handler (/home/anders/src/kernel/next/arch/arm64/kernel/entry-common.c:656) 
[ 2105.348089][ T154] el0t_64_sync (/home/anders/src/kernel/next/arch/arm64/kernel/entry.S:584) 
[ 2105.350998][  T154] 
[ 2105.352567][  T154] read to 0xffff8000210b3fb8 of 8 bytes by task 154 on cpu 0:
[ 2105.357117][ T154] do_page_fault (/home/anders/src/kernel/next/arch/arm64/mm/fault.c:517 /home/anders/src/kernel/next/arch/arm64/mm/fault.c:558) 
[ 2105.360110][ T154] do_translation_fault (/home/anders/src/kernel/next/arch/arm64/mm/fault.c:695) 
[ 2105.363400][ T154] do_mem_abort (/home/anders/src/kernel/next/arch/arm64/mm/fault.c:831) 
[ 2105.366400][ T154] el0_ia (/home/anders/src/kernel/next/arch/arm64/kernel/entry-common.c:133 /home/anders/src/kernel/next/arch/arm64/kernel/entry-common.c:142 /home/anders/src/kernel/next/arch/arm64/kernel/entry-common.c:534) 
[ 2105.369059][ T154] el0t_64_sync_handler (/home/anders/src/kernel/next/arch/arm64/kernel/entry-common.c:662) 
[ 2105.372445][ T154] el0t_64_sync (/home/anders/src/kernel/next/arch/arm64/kernel/entry.S:584) 
[ 2105.375404][  T154] 
[ 2105.376935][  T154] no locks held by systemd/154.
[ 2105.379935][  T154] irq event stamp: 385
[ 2105.382448][ T154] hardirqs last enabled at (385): local_daif_restore (/home/anders/src/kernel/next/arch/arm64/include/asm/daifflags.h:71 (discriminator 1)) 
[ 2105.388413][ T154] hardirqs last disabled at (384): el0t_64_sync_handler (/home/anders/src/kernel/next/arch/arm64/kernel/entry-common.c:662) 
[ 2105.394436][ T154] softirqs last enabled at (352): fpsimd_restore_current_state (/home/anders/src/kernel/next/arch/arm64/kernel/fpsimd.c:264 /home/anders/src/kernel/next/arch/arm64/kernel/fpsimd.c:1780) 
[ 2105.400932][ T154] softirqs last disabled at (350): fpsimd_restore_current_state (/home/anders/src/kernel/next/include/linux/bottom_half.h:20 /home/anders/src/kernel/next/arch/arm64/kernel/fpsimd.c:242 /home/anders/src/kernel/next/arch/arm64/kernel/fpsimd.c:1773) 
[ 2105.407394][  T154] 
[ 2105.408909][  T154] value changed: 0x0000000060000000 -> 0x0000000060001000
[ 2105.413225][  T154] 
[ 2105.414746][  T154] Reported by Kernel Concurrency Sanitizer on:
[ 2105.426169][  T154] Hardware name: linux,dummy-virt (DT)
[ 2105.429528][  T154] ==================================================================


The prctl case only gets called on 'current'. However, I assume the
kernel could be preempted while accessing current->pt_regs->pstate,
so that the access then races against the task switch.

Any idea what is happening and how to fix it?


Cheers,
Anders
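
The "value changed" line in the report can be decoded by hand, which helps
when judging which field is involved. Below is a minimal sketch in Python
that XORs the two values and names the bits using the AArch64 PSR constants
from arch/arm64/include/uapi/asm/ptrace.h. It assumes the 8-byte word is a
saved pstate value, as the message above suggests; the report itself does
not say so. The helper names set_bits and changed_bits are hypothetical.

```python
# AArch64 PSR bit values as defined in arch/arm64/include/uapi/asm/ptrace.h.
PSR_MODE_MASK = 0x0000000F
PSR_BITS = {
    "N": 0x80000000,     # negative flag
    "Z": 0x40000000,     # zero flag
    "C": 0x20000000,     # carry flag
    "V": 0x10000000,     # overflow flag
    "SSBS": 0x00001000,  # Speculative Store Bypass Safe
}


def set_bits(value: int) -> list:
    """Names of the known PSR bits that are set in value."""
    return [name for name, bit in PSR_BITS.items() if value & bit]


def changed_bits(old: int, new: int) -> list:
    """Names of the known PSR bits that differ between old and new."""
    diff = old ^ new
    return [name for name, bit in PSR_BITS.items() if diff & bit]


if __name__ == "__main__":
    # Values copied from the "value changed" line of the KCSAN report.
    old, new = 0x0000000060000000, 0x0000000060001000
    print("xor:    ", hex(old ^ new))
    print("changed:", changed_bits(old, new))
    print("set now:", set_bits(new))
    print("mode:   ", new & PSR_MODE_MASK)  # 0 is PSR_MODE_EL0t
```

Under that assumption, the only bit that differs is bit 12, the SSBS bit,
which is consistent with the write side being in proton-pack.c. The decode
alone does not show whether the reported race is real or a false positive.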

