* [next] arm: boot failed - PC is at cpu_ca15_set_pte_ext
@ 2022-04-19 10:58 ` Naresh Kamboju
  0 siblings, 0 replies; 18+ messages in thread
From: Naresh Kamboju @ 2022-04-19 10:58 UTC (permalink / raw)
  To: Linux ARM, open list, Linux-Next Mailing List, lkft-triage
  Cc: Stephen Rothwell, Russell King - ARM Linux, Arnd Bergmann,
	Ard Biesheuvel, Andrew Morton, max.krummenacher, Shawn Guo,
	Stefano Stabellini, Christoph Hellwig, Konrad Rzeszutek Wilk,
	Eric W. Biederman

Linux next-20220419 failed to boot on the arm architecture, on both qemu_arm
and the BeagleBoard x15 device.

kernel crash log from x15:
-----------------
[    6.866516] 8<--- cut here ---
[    6.869598] Unable to handle kernel paging request at virtual address f000e62c
[    6.876861] [f000e62c] *pgd=82935811, *pte=00000000, *ppte=00000000
[    6.883209] Internal error: Oops: 807 [#3] SMP ARM
[    6.888000] Modules linked in:
[    6.891082] CPU: 1 PID: 1 Comm: swapper/0 Tainted: G      D W  5.18.0-rc3-next-20220419 #1
[    6.899993] Hardware name: Generic DRA74X (Flattened Device Tree)
[    6.906127] PC is at cpu_ca15_set_pte_ext+0x4c/0x58
[    6.911041] LR is at handle_mm_fault+0x60c/0xed0
[    6.915679] pc : [<c031f26c>]    lr : [<c04cfeb8>]    psr: 40000013
[    6.921966] sp : f000dde8  ip : f000de44  fp : a0000013
[    6.927215] r10: 00000000  r9 : 00000000  r8 : c1e95194
[    6.932464] r7 : c3c95000  r6 : befffff1  r5 : 00000081  r4 : c29d8000
[    6.939025] r3 : 00000000  r2 : 00000000  r1 : 00000040  r0 : f000de2c
[    6.945587] Flags: nZcv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
[    6.952758] Control: 10c5387d  Table: 8020406a  DAC: 00000051
[    6.958526] Register r0 information: 2-page vmalloc region starting at 0xf000c000 allocated at kernel_clone+0x94/0x3b0
[    6.969299] Register r1 information: non-paged memory
[    6.974365] Register r2 information: NULL pointer
[    6.979095] Register r3 information: NULL pointer
[    6.983825] Register r4 information: slab task_struct start c29d8000 pointer offset 0
[    6.991729] Register r5 information: non-paged memory
[    6.996795] Register r6 information: non-paged memory
[    7.001861] Register r7 information: slab vm_area_struct start c3c95000 pointer offset 0
[    7.010009] Register r8 information: non-slab/vmalloc memory
[    7.015716] Register r9 information: NULL pointer
[    7.020446] Register r10 information: NULL pointer
[    7.025238] Register r11 information: non-paged memory
[    7.030426] Register r12 information: 2-page vmalloc region starting at 0xf000c000 allocated at kernel_clone+0x94/0x3b0
[    7.041259] Process swapper/0 (pid: 1, stack limit = 0xfaff0077)
[    7.047302] Stack: (0xf000dde8 to 0xf000e000)
[    7.051696] dde0:                   c29d8000 00000cc0 c20a1108 c2065fa0 c1e09f50 b6db6db7
[    7.059906] de00: c195bf0c 17c0f572 c29d8000 c3c95000 00000cc0 000befff befff000 befffff1
[    7.068115] de20: 00000081 c3c3afb8 c3c3afb8 00000000 00000000 00000000 00000000 00000000
[    7.076324] de40: 00000000 17c0f572 befff000 c3c95000 00002017 befffff1 00002017 00002fb8
[    7.084564] de60: c2d04000 00000081 c29d8000 c04c6790 c20d01d4 00000000 00000001 c20ce440
[    7.092773] de80: c1e10bcc fffff000 00000000 c2a45680 eeb33cc0 c29d8000 00000000 c2d04000
[    7.100982] dea0: befffff1 f000df18 00000000 00002017 c20661a0 c04c77e8 f000df18 00000000
[    7.109222] dec0: 00000000 c1d95c40 00000002 c20661e0 00000000 00000001 00000000 c04c7ad0
[    7.117431] dee0: 00000011 c2d02a00 00000001 befffff1 c29d8000 00000000 00000011 c2a30010
[    7.125640] df00: c29d8000 c0524c24 f000df18 00000000 00000000 2cd9e000 c1d95c40 17c0f572
[    7.133850] df20: 00000000 c2d02a00 0000000b 00000ffc 00000000 befffff1 00000000 c0524f74
[    7.142089] df40: c1e0e394 c2d02a00 c209a71c 38e38e39 c29d8000 bee00008 c2d02a00 c2a30000
[    7.150299] df60: c1e0e394 c1e0e420 00000000 00000000 00000000 c05266bc c209a000 c1944c60
[    7.158508] df80: 00000000 00000000 00000000 c129d2b4 c209a000 c1e0e394 00000000 c12b5600
[    7.166748] dfa0: 00000000 c12b5518 00000000 c0300168 00000000 00000000 00000000 00000000
[    7.174957] dfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[    7.183166] dfe0: 00000000 00000000 00000000 00000000 00000013 00000000 00000000 00000000
[    7.191406] Code: 13110001 12211b02 13110b02 03a03000 (e5a03800)
[    7.197570] ---[ end trace 0000000000000000 ]---
[    7.202209] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b

Reported-by: Linux Kernel Functional Testing <lkft@linaro.org>

metadata:
  git_ref: master
  git_repo: https://gitlab.com/Linaro/lkft/mirrors/next/linux-next
  git_sha: 634de1db0e9bbeb90d7b01020e59ec3dab4d38a1
  git_describe: next-20220419
  kernel-config: https://builds.tuxbuild.com/280TXP6P7tIBfnowvFY4wobXp3R/config
  System.map:  https://builds.tuxbuild.com/280TXP6P7tIBfnowvFY4wobXp3R/System.map
  vmlinux.xz: https://builds.tuxbuild.com/280TXP6P7tIBfnowvFY4wobXp3R/vmlinux.xz
  build-url: https://gitlab.com/Linaro/lkft/mirrors/next/linux-next/-/pipelines/519362851
  build: https://builds.tuxbuild.com/280TXP6P7tIBfnowvFY4wobXp3R
  toolchain: gcc-10

--
Linaro LKFT
https://lkft.linaro.org

[1] https://lkft.validation.linaro.org/scheduler/job/4921995#L2616
[2] https://lkft.validation.linaro.org/scheduler/job/4922061#L552


* Re: [next] arm: boot failed - PC is at cpu_ca15_set_pte_ext
  2022-04-19 10:58 ` Naresh Kamboju
@ 2022-04-19 18:57   ` Russell King (Oracle)
  -1 siblings, 0 replies; 18+ messages in thread
From: Russell King (Oracle) @ 2022-04-19 18:57 UTC (permalink / raw)
  To: Naresh Kamboju
  Cc: Linux ARM, open list, Linux-Next Mailing List, lkft-triage,
	Stephen Rothwell, Arnd Bergmann, Ard Biesheuvel, Andrew Morton,
	max.krummenacher, Shawn Guo, Stefano Stabellini,
	Christoph Hellwig, Konrad Rzeszutek Wilk, Eric W. Biederman

On Tue, Apr 19, 2022 at 04:28:52PM +0530, Naresh Kamboju wrote:
> Linux next 20220419 boot failed on arm architecture qemu_arm and BeagleBoard
> x15 device.

Was the immediately previous linux-next behaving correctly?

If so, nothing has changed in the ARM32 kernel tree, so this must be
someone else's issue - code that someone else has pushed into
linux-next.

It looks to me like someone is walking the page tables incorrectly,
somewhere buried in handle_mm_fault(), because the PTE pointer is in
the upper-2k of a 4k page, which is most definitely illegal on arm32.
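
The "upper-2k" point can be checked directly against the register dump.
What follows is only a minimal sketch, assuming the classic arm32 2-level
layout in which the Linux PTE tables occupy the lower 2 KiB of a page and
the hardware tables sit 2048 bytes above them, so that a valid PTE pointer
always has bit 11 clear. The constant and function names are illustrative;
the r0 value is taken from the oops.

```python
PAGE_SIZE = 0x1000        # 4 KiB page
HW_TABLE_OFFSET = 0x800   # 2 KiB: assumed offset of the hardware PTE table


def in_upper_half(ptr: int) -> bool:
    """Return True if ptr lies in the upper 2 KiB of its 4 KiB page."""
    return (ptr & (PAGE_SIZE - 1)) >= HW_TABLE_OFFSET


r0 = 0xF000DE2C           # PTE pointer handed to cpu_ca15_set_pte_ext (from the oops)
offset_in_page = r0 & (PAGE_SIZE - 1)
print(hex(offset_in_page), in_upper_half(r0))
```

An in-page offset of 0x800 or more means the pointer cannot be a Linux PTE
pointer under that layout, which is the condition described above.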

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!


* Re: [next] arm: boot failed - PC is at cpu_ca15_set_pte_ext
  2022-04-19 10:58 ` Naresh Kamboju
@ 2022-04-20  7:31   ` Ard Biesheuvel
  -1 siblings, 0 replies; 18+ messages in thread
From: Ard Biesheuvel @ 2022-04-20  7:31 UTC (permalink / raw)
  To: Naresh Kamboju
  Cc: Linux ARM, open list, Linux-Next Mailing List, lkft-triage,
	Stephen Rothwell, Russell King - ARM Linux, Arnd Bergmann,
	Andrew Morton, max.krummenacher, Shawn Guo, Stefano Stabellini,
	Christoph Hellwig, Konrad Rzeszutek Wilk, Eric W. Biederman

On Tue, 19 Apr 2022 at 12:59, Naresh Kamboju <naresh.kamboju@linaro.org> wrote:
>
> Linux next 20220419 boot failed on arm architecture qemu_arm and BeagleBoard
> x15 device.
>
> kernel crash log from x15:
> -----------------
> [    6.866516] 8<--- cut here ---
> [    6.869598] Unable to handle kernel paging request at virtual
> address f000e62c
> [    6.876861] [f000e62c] *pgd=82935811, *pte=00000000, *ppte=00000000
> [    6.883209] Internal error: Oops: 807 [#3] SMP ARM
> [    6.888000] Modules linked in:
> [    6.891082] CPU: 1 PID: 1 Comm: swapper/0 Tainted: G      D W
>   5.18.0-rc3-next-20220419 #1
> [    6.899993] Hardware name: Generic DRA74X (Flattened Device Tree)
> [    6.906127] PC is at cpu_ca15_set_pte_ext+0x4c/0x58
> [    6.911041] LR is at handle_mm_fault+0x60c/0xed0
> [    6.915679] pc : [<c031f26c>]    lr : [<c04cfeb8>]    psr: 40000013
> [    6.921966] sp : f000dde8  ip : f000de44  fp : a0000013
> [    6.927215] r10: 00000000  r9 : 00000000  r8 : c1e95194
> [    6.932464] r7 : c3c95000  r6 : befffff1  r5 : 00000081  r4 : c29d8000
> [    6.939025] r3 : 00000000  r2 : 00000000  r1 : 00000040  r0 : f000de2c
> [    6.945587] Flags: nZcv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
> [    6.952758] Control: 10c5387d  Table: 8020406a  DAC: 00000051
> [    6.958526] Register r0 information: 2-page vmalloc region starting
> at 0xf000c000 allocated at kernel_clone+0x94/0x3b0
> [    6.969299] Register r1 information: non-paged memory
> [    6.974365] Register r2 information: NULL pointer
> [    6.979095] Register r3 information: NULL pointer
> [    6.983825] Register r4 information: slab task_struct start
> c29d8000 pointer offset 0
> [    6.991729] Register r5 information: non-paged memory
> [    6.996795] Register r6 information: non-paged memory
> [    7.001861] Register r7 information: slab vm_area_struct start
> c3c95000 pointer offset 0
> [    7.010009] Register r8 information: non-slab/vmalloc memory
> [    7.015716] Register r9 information: NULL pointer
> [    7.020446] Register r10 information: NULL pointer
> [    7.025238] Register r11 information: non-paged memory
> [    7.030426] Register r12 information: 2-page vmalloc region
> starting at 0xf000c000 allocated at kernel_clone+0x94/0x3b0
> [    7.041259] Process swapper/0 (pid: 1, stack limit = 0xfaff0077)
> [    7.047302] Stack: (0xf000dde8 to 0xf000e000)
> [    7.051696] dde0:                   c29d8000 00000cc0 c20a1108
> c2065fa0 c1e09f50 b6db6db7
> [    7.059906] de00: c195bf0c 17c0f572 c29d8000 c3c95000 00000cc0
> 000befff befff000 befffff1
> [    7.068115] de20: 00000081 c3c3afb8 c3c3afb8 00000000 00000000
> 00000000 00000000 00000000
> [    7.076324] de40: 00000000 17c0f572 befff000 c3c95000 00002017
> befffff1 00002017 00002fb8
> [    7.084564] de60: c2d04000 00000081 c29d8000 c04c6790 c20d01d4
> 00000000 00000001 c20ce440
> [    7.092773] de80: c1e10bcc fffff000 00000000 c2a45680 eeb33cc0
> c29d8000 00000000 c2d04000
> [    7.100982] dea0: befffff1 f000df18 00000000 00002017 c20661a0
> c04c77e8 f000df18 00000000
> [    7.109222] dec0: 00000000 c1d95c40 00000002 c20661e0 00000000
> 00000001 00000000 c04c7ad0
> [    7.117431] dee0: 00000011 c2d02a00 00000001 befffff1 c29d8000
> 00000000 00000011 c2a30010
> [    7.125640] df00: c29d8000 c0524c24 f000df18 00000000 00000000
> 2cd9e000 c1d95c40 17c0f572
> [    7.133850] df20: 00000000 c2d02a00 0000000b 00000ffc 00000000
> befffff1 00000000 c0524f74
> [    7.142089] df40: c1e0e394 c2d02a00 c209a71c 38e38e39 c29d8000
> bee00008 c2d02a00 c2a30000
> [    7.150299] df60: c1e0e394 c1e0e420 00000000 00000000 00000000
> c05266bc c209a000 c1944c60
> [    7.158508] df80: 00000000 00000000 00000000 c129d2b4 c209a000
> c1e0e394 00000000 c12b5600
> [    7.166748] dfa0: 00000000 c12b5518 00000000 c0300168 00000000
> 00000000 00000000 00000000
> [    7.174957] dfc0: 00000000 00000000 00000000 00000000 00000000
> 00000000 00000000 00000000
> [    7.183166] dfe0: 00000000 00000000 00000000 00000000 00000013
> 00000000 00000000 00000000
> [    7.191406] Code: 13110001 12211b02 13110b02 03a03000 (e5a03800)

This decodes to

   0: 13110001 tstne r1, #1
   4: 12211b02 eorne r1, r1, #2048 ; 0x800
   8: 13110b02 tstne r1, #2048 ; 0x800
   c: 03a03000 moveq r3, #0
  10:* e5a03800 str r3, [r0, #2048]! ; 0x800 <-- trapping instruction

and R0 points into the stack. So we are updating a PTE that is located
on the stack rather than in a page table somewhere, which seems very
odd. However, this could be a latent bug that got uncovered by the
VMAP stacks changes.
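
The decode and the faulting address can be cross-checked with a few lines
of Python. This is a hedged sketch rather than a full disassembler: it only
handles the A32 "single data transfer, immediate offset" encoding, and the
function name and field names are made up for illustration. The opcode and
register values come from the oops above.

```python
def decode_str_imm(word: int) -> dict:
    """Split an A32 single-data-transfer (immediate offset) word into fields."""
    assert (word >> 26) & 0b11 == 0b01, "not a single data transfer"
    assert (word >> 25) & 1 == 0, "register-offset form is not handled"
    return {
        "cond": (word >> 28) & 0xF,              # 0xE = always
        "pre_indexed": bool((word >> 24) & 1),   # P bit
        "add": bool((word >> 23) & 1),           # U bit
        "writeback": bool((word >> 21) & 1),     # W bit, the '!' suffix
        "load": bool((word >> 20) & 1),          # L bit, 0 = store
        "rn": (word >> 16) & 0xF,                # base register
        "rd": (word >> 12) & 0xF,                # source register
        "imm12": word & 0xFFF,
    }


fields = decode_str_imm(0xE5A03800)              # trapping instruction
r0 = 0xF000DE2C                                  # base register value from the oops
target = r0 + fields["imm12"] if fields["add"] else r0 - fields["imm12"]

stack_low, stack_high = 0xF000DDE8, 0xF000E000   # "Stack:" range from the oops
print(fields)
print(hex(target), stack_low <= r0 < stack_high, target >= stack_high)
```

The store targets r0 + 0x800, which is the reported fault address
f000e62c: r0 itself is inside the dumped stack range, while the target
lies past 0xf000e000, the end of the 2-page vmalloc region.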

Unfortunately, the vmlinux.xz file I downloaded from the link below
seems to be different from the one that produced the crash, given that
the LR address of c04cfeb8 does not seem to correspond with
handle_mm_fault+0x60c/0xed0.
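
One way to make this kind of mismatch visible is to resolve the oops
addresses against a System.map and compare the result with the
symbol+offset printed by the kernel. The sketch below assumes the usual
"address type name" line format. The two sample lines are NOT copied from
the real System.map: they are the symbol bases implied by the oops itself
(pc - 0x4c and lr - 0x60c), so a System.map from the matching build should
agree with them.

```python
import bisect


def parse_system_map(text):
    """Parse 'address type name' lines into a sorted list of (address, name)."""
    syms = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 3:
            syms.append((int(parts[0], 16), parts[2]))
    return sorted(syms)


def resolve(syms, addr):
    """Return (name, offset) of the closest symbol at or below addr, or None."""
    addrs = [a for a, _ in syms]
    i = bisect.bisect_right(addrs, addr) - 1
    if i < 0:
        return None
    base, name = syms[i]
    return name, addr - base


# Hypothetical excerpt: bases derived from the oops, not from the build artifacts.
sample_map = """\
c031f220 T cpu_ca15_set_pte_ext
c04cf8ac T handle_mm_fault
"""

syms = parse_system_map(sample_map)
for label, addr in (("pc", 0xC031F26C), ("lr", 0xC04CFEB8)):
    name, off = resolve(syms, addr)
    print(f"{label}: {name}+{off:#x}")
```

If the downloaded System.map resolves c04cfeb8 to a different symbol or
offset than handle_mm_fault+0x60c, the artifacts do not belong to the
kernel that crashed.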

Can you please double check the artifacts?



> metadata:
>   git_ref: master
>   git_repo: https://gitlab.com/Linaro/lkft/mirrors/next/linux-next
>   git_sha: 634de1db0e9bbeb90d7b01020e59ec3dab4d38a1
>   git_describe: next-20220419
>   kernel-config: https://builds.tuxbuild.com/280TXP6P7tIBfnowvFY4wobXp3R/config
>   System.map:  https://builds.tuxbuild.com/280TXP6P7tIBfnowvFY4wobXp3R/System.map
>   vmlinux.xz: https://builds.tuxbuild.com/280TXP6P7tIBfnowvFY4wobXp3R/vmlinux.xz
>   build-url: https://gitlab.com/Linaro/lkft/mirrors/next/linux-next/-/pipelines/519362851
>   build: https://builds.tuxbuild.com/280TXP6P7tIBfnowvFY4wobXp3R
>   toolchain: gcc-10
>
> --
> Linaro LKFT
> https://lkft.linaro.org
>
> [1] https://lkft.validation.linaro.org/scheduler/job/4921995#L2616
> [2] https://lkft.validation.linaro.org/scheduler/job/4922061#L552


* Re: [next] arm: boot failed - PC is at cpu_ca15_set_pte_ext
  2022-04-20  7:31   ` Ard Biesheuvel
@ 2022-04-20  7:50     ` Max Krummenacher
  -1 siblings, 0 replies; 18+ messages in thread
From: Max Krummenacher @ 2022-04-20  7:50 UTC (permalink / raw)
  To: Ard Biesheuvel, Naresh Kamboju
  Cc: Linux ARM, open list, Linux-Next Mailing List, lkft-triage,
	Stephen Rothwell, Russell King - ARM Linux, Arnd Bergmann,
	Andrew Morton, max.krummenacher, Shawn Guo, Stefano Stabellini,
	Christoph Hellwig, Konrad Rzeszutek Wilk, Eric W. Biederman,
	Russell King (Oracle)

On Wednesday, 20.04.2022 at 09:31 +0200, Ard Biesheuvel wrote:
> On Tue, 19 Apr 2022 at 12:59, Naresh Kamboju <naresh.kamboju@linaro.org> wrote:
> > Linux next 20220419 boot failed on arm architecture qemu_arm and BeagleBoard
> > x15 device.
> > 
> > kernel crash log from x15:
> > -----------------
> > [    6.866516] 8<--- cut here ---
> > [    6.869598] Unable to handle kernel paging request at virtual
> > address f000e62c
> > [    6.876861] [f000e62c] *pgd=82935811, *pte=00000000, *ppte=00000000
> > [    6.883209] Internal error: Oops: 807 [#3] SMP ARM
> > [    6.888000] Modules linked in:
> > [    6.891082] CPU: 1 PID: 1 Comm: swapper/0 Tainted: G      D W
> >   5.18.0-rc3-next-20220419 #1
> > [    6.899993] Hardware name: Generic DRA74X (Flattened Device Tree)
> > [    6.906127] PC is at cpu_ca15_set_pte_ext+0x4c/0x58
> > [    6.911041] LR is at handle_mm_fault+0x60c/0xed0
> > [    6.915679] pc : [<c031f26c>]    lr : [<c04cfeb8>]    psr: 40000013
> > [    6.921966] sp : f000dde8  ip : f000de44  fp : a0000013
> > [    6.927215] r10: 00000000  r9 : 00000000  r8 : c1e95194
> > [    6.932464] r7 : c3c95000  r6 : befffff1  r5 : 00000081  r4 : c29d8000
> > [    6.939025] r3 : 00000000  r2 : 00000000  r1 : 00000040  r0 : f000de2c
> > [    6.945587] Flags: nZcv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
> > [    6.952758] Control: 10c5387d  Table: 8020406a  DAC: 00000051
> > [    6.958526] Register r0 information: 2-page vmalloc region starting
> > at 0xf000c000 allocated at kernel_clone+0x94/0x3b0
> > [    6.969299] Register r1 information: non-paged memory
> > [    6.974365] Register r2 information: NULL pointer
> > [    6.979095] Register r3 information: NULL pointer
> > [    6.983825] Register r4 information: slab task_struct start
> > c29d8000 pointer offset 0
> > [    6.991729] Register r5 information: non-paged memory
> > [    6.996795] Register r6 information: non-paged memory
> > [    7.001861] Register r7 information: slab vm_area_struct start
> > c3c95000 pointer offset 0
> > [    7.010009] Register r8 information: non-slab/vmalloc memory
> > [    7.015716] Register r9 information: NULL pointer
> > [    7.020446] Register r10 information: NULL pointer
> > [    7.025238] Register r11 information: non-paged memory
> > [    7.030426] Register r12 information: 2-page vmalloc region
> > starting at 0xf000c000 allocated at kernel_clone+0x94/0x3b0
> > [    7.041259] Process swapper/0 (pid: 1, stack limit = 0xfaff0077)
> > [    7.047302] Stack: (0xf000dde8 to 0xf000e000)
> > [    7.051696] dde0:                   c29d8000 00000cc0 c20a1108
> > c2065fa0 c1e09f50 b6db6db7
> > [    7.059906] de00: c195bf0c 17c0f572 c29d8000 c3c95000 00000cc0
> > 000befff befff000 befffff1
> > [    7.068115] de20: 00000081 c3c3afb8 c3c3afb8 00000000 00000000
> > 00000000 00000000 00000000
> > [    7.076324] de40: 00000000 17c0f572 befff000 c3c95000 00002017
> > befffff1 00002017 00002fb8
> > [    7.084564] de60: c2d04000 00000081 c29d8000 c04c6790 c20d01d4
> > 00000000 00000001 c20ce440
> > [    7.092773] de80: c1e10bcc fffff000 00000000 c2a45680 eeb33cc0
> > c29d8000 00000000 c2d04000
> > [    7.100982] dea0: befffff1 f000df18 00000000 00002017 c20661a0
> > c04c77e8 f000df18 00000000
> > [    7.109222] dec0: 00000000 c1d95c40 00000002 c20661e0 00000000
> > 00000001 00000000 c04c7ad0
> > [    7.117431] dee0: 00000011 c2d02a00 00000001 befffff1 c29d8000
> > 00000000 00000011 c2a30010
> > [    7.125640] df00: c29d8000 c0524c24 f000df18 00000000 00000000
> > 2cd9e000 c1d95c40 17c0f572
> > [    7.133850] df20: 00000000 c2d02a00 0000000b 00000ffc 00000000
> > befffff1 00000000 c0524f74
> > [    7.142089] df40: c1e0e394 c2d02a00 c209a71c 38e38e39 c29d8000
> > bee00008 c2d02a00 c2a30000
> > [    7.150299] df60: c1e0e394 c1e0e420 00000000 00000000 00000000
> > c05266bc c209a000 c1944c60
> > [    7.158508] df80: 00000000 00000000 00000000 c129d2b4 c209a000
> > c1e0e394 00000000 c12b5600
> > [    7.166748] dfa0: 00000000 c12b5518 00000000 c0300168 00000000
> > 00000000 00000000 00000000
> > [    7.174957] dfc0: 00000000 00000000 00000000 00000000 00000000
> > 00000000 00000000 00000000
> > [    7.183166] dfe0: 00000000 00000000 00000000 00000000 00000013
> > 00000000 00000000 00000000
> > [    7.191406] Code: 13110001 12211b02 13110b02 03a03000 (e5a03800)
> 
> This decodes to
> 
>    0: 13110001 tstne r1, #1
>    4: 12211b02 eorne r1, r1, #2048 ; 0x800
>    8: 13110b02 tstne r1, #2048 ; 0x800
>    c: 03a03000 moveq r3, #0
>   10:* e5a03800 str r3, [r0, #2048]! ; 0x800 <-- trapping instruction
> 
> and R0 points into the stack. So we are updating a PTE that is located
> on the stack rather than in a page table somewhere, which seems very
> odd. However, this could be a latent bug that got uncovered by the
> VMAP stacks changes.
> 
> Unfortunately, the vmlinux.xz file I downloaded from the link below
> seems to be different from the one that produced the crash, given that
> the LR address of c04cfeb8 does not seem to correspond with
> handle_mm_fault+0x60c/0xed0.
> 
> Can you please double check the artifacts?

Commit "mm: check against orig_pte for finish_fault()" introduced this;
on yesterday's next, reverting a066bab3c0eb made an i.MX6 boot again.
A fix is being discussed here:

https://lore.kernel.org/all/YliNP7ADcdc4Puvs@xz-m1.local/

Max

> 
> 
> 
> > metadata:
> >   git_ref: master
> >   git_repo: https://gitlab.com/Linaro/lkft/mirrors/next/linux-next
> >   git_sha: 634de1db0e9bbeb90d7b01020e59ec3dab4d38a1
> >   git_describe: next-20220419
> >   kernel-config: https://builds.tuxbuild.com/280TXP6P7tIBfnowvFY4wobXp3R/config
> >   System.map:  https://builds.tuxbuild.com/280TXP6P7tIBfnowvFY4wobXp3R/System.map
> >   vmlinux.xz: https://builds.tuxbuild.com/280TXP6P7tIBfnowvFY4wobXp3R/vmlinux.xz
> >   build-url: https://gitlab.com/Linaro/lkft/mirrors/next/linux-next/-/pipelines/519362851
> >   build: https://builds.tuxbuild.com/280TXP6P7tIBfnowvFY4wobXp3R
> >   toolchain: gcc-10
> > 
> > --
> > Linaro LKFT
> > https://lkft.linaro.org
> > 
> > [1] https://lkft.validation.linaro.org/scheduler/job/4921995#L2616
> > [2] https://lkft.validation.linaro.org/scheduler/job/4922061#L552
> 
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [next] arm: boot failed - PC is at cpu_ca15_set_pte_ext
  2022-04-20  7:31   ` Ard Biesheuvel
@ 2022-04-20  8:54     ` Naresh Kamboju
  -1 siblings, 0 replies; 18+ messages in thread
From: Naresh Kamboju @ 2022-04-20  8:54 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Linux ARM, open list, Linux-Next Mailing List, lkft-triage,
	Stephen Rothwell, Russell King - ARM Linux, Arnd Bergmann,
	Andrew Morton, max.krummenacher, Shawn Guo, Stefano Stabellini,
	Christoph Hellwig, Konrad Rzeszutek Wilk, Eric W. Biederman

On Wed, 20 Apr 2022 at 13:01, Ard Biesheuvel <ardb@kernel.org> wrote:
>
> On Tue, 19 Apr 2022 at 12:59, Naresh Kamboju <naresh.kamboju@linaro.org> wrote:
> >
> > Linux next 20220419 boot failed on arm architecture qemu_arm and BeagleBoard
> > x15 device.
> >
> > kernel crash log from x15:
> > -----------------
> > [    6.866516] 8<--- cut here ---
> > [    6.869598] Unable to handle kernel paging request at virtual
> > address f000e62c
> > [    6.876861] [f000e62c] *pgd=82935811, *pte=00000000, *ppte=00000000
> > [    6.883209] Internal error: Oops: 807 [#3] SMP ARM
> > [    6.888000] Modules linked in:
> > [    6.891082] CPU: 1 PID: 1 Comm: swapper/0 Tainted: G      D W
> >   5.18.0-rc3-next-20220419 #1
> > [    6.899993] Hardware name: Generic DRA74X (Flattened Device Tree)
> > [    6.906127] PC is at cpu_ca15_set_pte_ext+0x4c/0x58
> > [    6.911041] LR is at handle_mm_fault+0x60c/0xed0
> > [    6.915679] pc : [<c031f26c>]    lr : [<c04cfeb8>]    psr: 40000013
> > [    6.921966] sp : f000dde8  ip : f000de44  fp : a0000013
> > [    6.927215] r10: 00000000  r9 : 00000000  r8 : c1e95194
> > [    6.932464] r7 : c3c95000  r6 : befffff1  r5 : 00000081  r4 : c29d8000
> > [    6.939025] r3 : 00000000  r2 : 00000000  r1 : 00000040  r0 : f000de2c
> > [    6.945587] Flags: nZcv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
> > [    6.952758] Control: 10c5387d  Table: 8020406a  DAC: 00000051
> > [    6.958526] Register r0 information: 2-page vmalloc region starting
> > at 0xf000c000 allocated at kernel_clone+0x94/0x3b0
> > [    6.969299] Register r1 information: non-paged memory
> > [    6.974365] Register r2 information: NULL pointer
> > [    6.979095] Register r3 information: NULL pointer
> > [    6.983825] Register r4 information: slab task_struct start
> > c29d8000 pointer offset 0
> > [    6.991729] Register r5 information: non-paged memory
> > [    6.996795] Register r6 information: non-paged memory
> > [    7.001861] Register r7 information: slab vm_area_struct start
> > c3c95000 pointer offset 0
> > [    7.010009] Register r8 information: non-slab/vmalloc memory
> > [    7.015716] Register r9 information: NULL pointer
> > [    7.020446] Register r10 information: NULL pointer
> > [    7.025238] Register r11 information: non-paged memory
> > [    7.030426] Register r12 information: 2-page vmalloc region
> > starting at 0xf000c000 allocated at kernel_clone+0x94/0x3b0
> > [    7.041259] Process swapper/0 (pid: 1, stack limit = 0xfaff0077)
> > [    7.047302] Stack: (0xf000dde8 to 0xf000e000)
> > [    7.051696] dde0:                   c29d8000 00000cc0 c20a1108
> > c2065fa0 c1e09f50 b6db6db7
> > [    7.059906] de00: c195bf0c 17c0f572 c29d8000 c3c95000 00000cc0
> > 000befff befff000 befffff1
> > [    7.068115] de20: 00000081 c3c3afb8 c3c3afb8 00000000 00000000
> > 00000000 00000000 00000000
> > [    7.076324] de40: 00000000 17c0f572 befff000 c3c95000 00002017
> > befffff1 00002017 00002fb8
> > [    7.084564] de60: c2d04000 00000081 c29d8000 c04c6790 c20d01d4
> > 00000000 00000001 c20ce440
> > [    7.092773] de80: c1e10bcc fffff000 00000000 c2a45680 eeb33cc0
> > c29d8000 00000000 c2d04000
> > [    7.100982] dea0: befffff1 f000df18 00000000 00002017 c20661a0
> > c04c77e8 f000df18 00000000
> > [    7.109222] dec0: 00000000 c1d95c40 00000002 c20661e0 00000000
> > 00000001 00000000 c04c7ad0
> > [    7.117431] dee0: 00000011 c2d02a00 00000001 befffff1 c29d8000
> > 00000000 00000011 c2a30010
> > [    7.125640] df00: c29d8000 c0524c24 f000df18 00000000 00000000
> > 2cd9e000 c1d95c40 17c0f572
> > [    7.133850] df20: 00000000 c2d02a00 0000000b 00000ffc 00000000
> > befffff1 00000000 c0524f74
> > [    7.142089] df40: c1e0e394 c2d02a00 c209a71c 38e38e39 c29d8000
> > bee00008 c2d02a00 c2a30000
> > [    7.150299] df60: c1e0e394 c1e0e420 00000000 00000000 00000000
> > c05266bc c209a000 c1944c60
> > [    7.158508] df80: 00000000 00000000 00000000 c129d2b4 c209a000
> > c1e0e394 00000000 c12b5600
> > [    7.166748] dfa0: 00000000 c12b5518 00000000 c0300168 00000000
> > 00000000 00000000 00000000
> > [    7.174957] dfc0: 00000000 00000000 00000000 00000000 00000000
> > 00000000 00000000 00000000
> > [    7.183166] dfe0: 00000000 00000000 00000000 00000000 00000013
> > 00000000 00000000 00000000
> > [    7.191406] Code: 13110001 12211b02 13110b02 03a03000 (e5a03800)
>
> This decodes to
>
>    0: 13110001 tstne r1, #1
>    4: 12211b02 eorne r1, r1, #2048 ; 0x800
>    8: 13110b02 tstne r1, #2048 ; 0x800
>    c: 03a03000 moveq r3, #0
>   10:* e5a03800 str r3, [r0, #2048]! ; 0x800 <-- trapping instruction
>
> and R0 points into the stack. So we are updating a PTE that is located
> on the stack rather than in a page table somewhere, which seems very
> odd. However, this could be a latent bug that got uncovered by the
> VMAP stacks changes.
>
> Unfortunately, the vmlinux.xz file I downloaded from the link below
> seems to be different from the one that produced the crash, given that
> the LR address of c04cfeb8 does not seem to correspond with
> handle_mm_fault+0x60c/0xed0.
> Can you please double check the artifacts?

The vmlinux.xz that matches the trace log I pasted is available here:

vmlinux.xz : https://builds.tuxbuild.com/280TS8MuM6sYWk5aUtrvWIw0RQ7/vmlinux.xz
artifact-location: https://builds.tuxbuild.com/280TS8MuM6sYWk5aUtrvWIw0RQ7

- Naresh

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [next] arm: boot failed - PC is at cpu_ca15_set_pte_ext
  2022-04-19 18:57   ` Russell King (Oracle)
@ 2022-04-20  8:55     ` Naresh Kamboju
  -1 siblings, 0 replies; 18+ messages in thread
From: Naresh Kamboju @ 2022-04-20  8:55 UTC (permalink / raw)
  To: Russell King (Oracle)
  Cc: Linux ARM, open list, Linux-Next Mailing List, lkft-triage,
	Stephen Rothwell, Arnd Bergmann, Ard Biesheuvel, Andrew Morton,
	max.krummenacher, Shawn Guo, Stefano Stabellini,
	Christoph Hellwig, Konrad Rzeszutek Wilk, Eric W. Biederman

On Wed, 20 Apr 2022 at 00:28, Russell King (Oracle)
<linux@armlinux.org.uk> wrote:
>
> On Tue, Apr 19, 2022 at 04:28:52PM +0530, Naresh Kamboju wrote:
> > Linux next 20220419 boot failed on arm architecture qemu_arm and BeagleBoard
> > x15 device.
>
> Was the immediately previous linux-next behaving correctly?

This crash started happening from the next-20220413 tag.

- Naresh

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [next] arm: boot failed - PC is at cpu_ca15_set_pte_ext
  2022-04-20  8:55     ` Naresh Kamboju
@ 2022-04-20  9:44       ` Russell King (Oracle)
  -1 siblings, 0 replies; 18+ messages in thread
From: Russell King (Oracle) @ 2022-04-20  9:44 UTC (permalink / raw)
  To: Naresh Kamboju
  Cc: Linux ARM, open list, Linux-Next Mailing List, lkft-triage,
	Stephen Rothwell, Arnd Bergmann, Ard Biesheuvel, Andrew Morton,
	max.krummenacher, Shawn Guo, Stefano Stabellini,
	Christoph Hellwig, Konrad Rzeszutek Wilk, Eric W. Biederman

On Wed, Apr 20, 2022 at 02:25:32PM +0530, Naresh Kamboju wrote:
> On Wed, 20 Apr 2022 at 00:28, Russell King (Oracle)
> <linux@armlinux.org.uk> wrote:
> >
> > On Tue, Apr 19, 2022 at 04:28:52PM +0530, Naresh Kamboju wrote:
> > > Linux next 20220419 boot failed on arm architecture qemu_arm and BeagleBoard
> > > x15 device.
> >
> > Was the immediately previous linux-next behaving correctly?
> 
> This crash started happening from the next-20220413 tag.

That rules out any arm32-specific changes - the last time my tree
changed in for-next was 1st April.

Ard points out that the pte table is on the stack, which it really
should not be. I'm guessing there's some inappropriate generic
kernel change that has broken arm32. A pte table should never ever
appear on a kernel stack.
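
The register dump quoted earlier in the thread can be cross-checked
against this. The sketch below is only an illustration in user-space C,
not kernel code: the constants are copied from the oops, while the
helper names store_target() and in_range() are made up for this
example. It shows that r0 plus the 0x800 offset of the trapping store
equals the faulting address, and that this address lies just past the
end of the two-page vmalloc'ed stack that r0 itself points into.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values copied from the oops quoted earlier in the thread. */
#define OOPS_R0          0xf000de2cu /* r0 at the trapping instruction */
#define OOPS_FAULT_ADDR  0xf000e62cu /* address of the failed paging request */
#define OOPS_STACK_START 0xf000c000u /* start of the 2-page vmalloc region */
#define OOPS_STACK_END   0xf000e000u /* upper bound from "Stack: (... to 0xf000e000)" */
#define STORE_OFFSET     0x800u      /* immediate in "str r3, [r0, #2048]!" */

/* Hypothetical helper: address written by a store at base + offset. */
static inline uint32_t store_target(uint32_t base, uint32_t offset)
{
	return base + offset;
}

/* Hypothetical helper: true if addr lies in the half-open range [start, end). */
static inline bool in_range(uint32_t addr, uint32_t start, uint32_t end)
{
	return addr >= start && addr < end;
}
```

With these values, store_target(OOPS_R0, STORE_OFFSET) evaluates to
0xf000e62c, which is the reported fault address. in_range() holds for
r0 but not for the store target, since the stack region ends at
0xf000e000.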

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [next] arm: boot failed - PC is at cpu_ca15_set_pte_ext
  2022-04-20  7:50     ` Max Krummenacher
@ 2022-04-20 13:04       ` Naresh Kamboju
  -1 siblings, 0 replies; 18+ messages in thread
From: Naresh Kamboju @ 2022-04-20 13:04 UTC (permalink / raw)
  To: Max Krummenacher
  Cc: Ard Biesheuvel, Russell King - ARM Linux, Linux ARM, open list,
	Linux-Next Mailing List, lkft-triage, Stephen Rothwell,
	Arnd Bergmann, Andrew Morton, max.krummenacher, Shawn Guo,
	Stefano Stabellini, Christoph Hellwig, Konrad Rzeszutek Wilk,
	Eric W. Biederman

Hi Max,

On Wed, 20 Apr 2022 at 13:20, Max Krummenacher <max.oss.09@gmail.com> wrote:
>
> On Wednesday, 20.04.2022, at 09:31 +0200, Ard Biesheuvel wrote:
> > On Tue, 19 Apr 2022 at 12:59, Naresh Kamboju <naresh.kamboju@linaro.org> wrote:
> > > Linux next 20220419 boot failed on arm architecture qemu_arm and BeagleBoard
> > > x15 device.
> > >
> > > kernel crash log from x15:
> > > -----------------
> > > [    6.866516] 8<--- cut here ---
> > > [    6.869598] Unable to handle kernel paging request at virtual
> > > address f000e62c

<trim>

> > Unfortunately, the vmlinux.xz file I downloaded from the link below
> > seems to be different from the one that produced the crash, given that
> > the LR address of c04cfeb8 does not seem to correspond with
> > handle_mm_fault+0x60c/0xed0.
> >
> > Can you please double check the artifacts?
>
> Commit "mm: check against orig_pte for finish_fault()" introduced this,
> i.e. on yesterday's next, reverting a066bab3c0eb made an i.MX6 boot again.

Thanks for the pointers,
I have reverted the suggested commit and the boot passes now.

Revert "mm: check against orig_pte for finish_fault()"
       This reverts commit a066bab3c0eb8f6155257f1345f07d1f6550bc4a.

> A fix is discussed here:
> https://lore.kernel.org/all/YliNP7ADcdc4Puvs@xz-m1.local/
>
> Max

- Naresh

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [next] arm: boot failed - PC is at cpu_ca15_set_pte_ext
  2022-04-20  7:50     ` Max Krummenacher
@ 2022-04-20 21:53       ` Andrew Morton
  -1 siblings, 0 replies; 18+ messages in thread
From: Andrew Morton @ 2022-04-20 21:53 UTC (permalink / raw)
  To: Max Krummenacher
  Cc: Ard Biesheuvel, Naresh Kamboju, Linux ARM, open list,
	Linux-Next Mailing List, lkft-triage, Stephen Rothwell,
	Russell King - ARM Linux, Arnd Bergmann, max.krummenacher,
	Shawn Guo, Stefano Stabellini, Christoph Hellwig,
	Konrad Rzeszutek Wilk, Eric W. Biederman

On Wed, 20 Apr 2022 09:50:52 +0200 Max Krummenacher <max.oss.09@gmail.com> wrote:

> > 
> > Unfortunately, the vmlinux.xz file I downloaded from the link below
> > seems to be different from the one that produced the crash, given that
> > the LR address of c04cfeb8 does not seem to correspond with
> > handle_mm_fault+0x60c/0xed0.
> > 
> > Can you please double check the artifacts?
> 
> Commit "mm: check against orig_pte for finish_fault()" introduced this,
> i.e. on yesterday's next, reverting a066bab3c0eb made an i.MX6 boot again.
> A fix is discussed here:
> 
> https://lore.kernel.org/all/YliNP7ADcdc4Puvs@xz-m1.local/
> 

Thanks for finding that.  I have Peter's fix queued and shall push out
a snapshot later today, for integration into linux-next.
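
For readers who want the idea behind the fix without the surrounding mm
code, the check introduced by the patch below can be modelled in
user-space C. This is a sketch under simplifying assumptions: pte_t is
reduced to a plain integer, struct vm_fault_model and the *_model
helpers are hypothetical stand-ins, and only the flag value (1 << 11)
and the branch structure are taken from the patch. The point is that
orig_pte is consulted only when the flag says it holds a real snapshot;
otherwise the live entry is simply tested for being empty, so nothing
has to write through &vmf->orig_pte. On 32-bit ARM with two-level page
tables, set_pte_ext also stores a hardware entry 2048 bytes past the
given pointer, which appears to be why the old pte_clear() on a
stack-resident orig_pte faulted.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in, NOT the kernel's pte_t. */
typedef uint32_t pte_t;

/* Bit value taken from the patch. */
#define FAULT_FLAG_ORIG_PTE_VALID (1u << 11)

/* Hypothetical, cut-down model of the fields of struct vm_fault used here. */
struct vm_fault_model {
	unsigned int flags;
	pte_t *pte;     /* points at the live page-table entry */
	pte_t orig_pte; /* snapshot; only meaningful when the flag is set */
};

/* Stand-ins for pte_none() and pte_same() on the simplified pte_t. */
static bool pte_none_model(pte_t pte)
{
	return pte == 0;
}

static bool pte_same_model(pte_t a, pte_t b)
{
	return a == b;
}

/* Mirrors the branch structure of vmf_pte_changed() in the patch. */
static bool vmf_pte_changed_model(const struct vm_fault_model *vmf)
{
	/* A snapshot exists: the entry changed if it differs from it. */
	if (vmf->flags & FAULT_FLAG_ORIG_PTE_VALID)
		return !pte_same_model(*vmf->pte, vmf->orig_pte);

	/* No snapshot: the entry changed if it is no longer empty. */
	return !pte_none_model(*vmf->pte);
}
```

In this model a caller that never took a snapshot leaves orig_pte
untouched and just clears the flag, which corresponds to the
"vmf->flags &= ~FAULT_FLAG_ORIG_PTE_VALID" line in the patch.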


From: Peter Xu <peterx@redhat.com>
Subject: mm-check-against-orig_pte-for-finish_fault-fix

fix crash reported by Marek

Link: https://lkml.kernel.org/r/Ylb9rXJyPm8/ao8f@xz-m1.local
Signed-off-by: Peter Xu <peterx@redhat.com>
Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: "Kirill A . Shutemov" <kirill@shutemov.name>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Nadav Amit <nadav.amit@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---


--- a/include/linux/mm_types.h~mm-check-against-orig_pte-for-finish_fault-fix
+++ a/include/linux/mm_types.h
@@ -814,6 +814,8 @@ typedef struct {
  * @FAULT_FLAG_UNSHARE: The fault is an unsharing request to unshare (and mark
  *                      exclusive) a possibly shared anonymous page that is
  *                      mapped R/O.
+ * @FAULT_FLAG_ORIG_PTE_VALID: whether the fault has vmf->orig_pte cached.
+ *                        We should only access orig_pte if this flag is set.
  *
  * About @FAULT_FLAG_ALLOW_RETRY and @FAULT_FLAG_TRIED: we can specify
  * whether we would allow page faults to retry by specifying these two
@@ -850,6 +852,7 @@ enum fault_flag {
 	FAULT_FLAG_INSTRUCTION =	1 << 8,
 	FAULT_FLAG_INTERRUPTIBLE =	1 << 9,
 	FAULT_FLAG_UNSHARE =		1 << 10,
+	FAULT_FLAG_ORIG_PTE_VALID =	1 << 11,
 };
 
 #endif /* _LINUX_MM_TYPES_H */
--- a/mm/memory.c~mm-check-against-orig_pte-for-finish_fault-fix
+++ a/mm/memory.c
@@ -4194,6 +4194,15 @@ void do_set_pte(struct vm_fault *vmf, st
 	set_pte_at(vma->vm_mm, addr, vmf->pte, entry);
 }
 
+static bool vmf_pte_changed(struct vm_fault *vmf)
+{
+	if (vmf->flags & FAULT_FLAG_ORIG_PTE_VALID) {
+		return !pte_same(*vmf->pte, vmf->orig_pte);
+	}
+
+	return !pte_none(*vmf->pte);
+}
+
 /**
  * finish_fault - finish page fault once we have prepared the page to fault
  *
@@ -4252,7 +4261,7 @@ vm_fault_t finish_fault(struct vm_fault
 				      vmf->address, &vmf->ptl);
 	ret = 0;
 	/* Re-check under ptl */
-	if (likely(pte_same(*vmf->pte, vmf->orig_pte)))
+	if (likely(!vmf_pte_changed(vmf)))
 		do_set_pte(vmf, page, vmf->address);
 	else
 		ret = VM_FAULT_NOPAGE;
@@ -4720,13 +4729,7 @@ static vm_fault_t handle_pte_fault(struc
 		 * concurrent faults and from rmap lookups.
 		 */
 		vmf->pte = NULL;
-		/*
-		 * Always initialize orig_pte.  This matches with below
-		 * code to have orig_pte to be the none pte if pte==NULL.
-		 * This makes the rest code to be always safe to reference
-		 * it, e.g. in finish_fault() we'll detect pte changes.
-		 */
-		pte_clear(vmf->vma->vm_mm, vmf->address, &vmf->orig_pte);
+		vmf->flags &= ~FAULT_FLAG_ORIG_PTE_VALID;
 	} else {
 		/*
 		 * If a huge pmd materialized under us just retry later.  Use
@@ -4750,6 +4753,7 @@ static vm_fault_t handle_pte_fault(struc
 		 */
 		vmf->pte = pte_offset_map(vmf->pmd, vmf->address);
 		vmf->orig_pte = *vmf->pte;
+		vmf->flags |= FAULT_FLAG_ORIG_PTE_VALID;
 
 		/*
 		 * some architectures can have larger ptes than wordsize,
_
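
For readers following the patch above, here is a minimal userspace sketch
of the check it introduces. It is not kernel code: pte_t, struct vm_fault,
pte_none() and pte_same() below are simplified, hypothetical stand-ins
that only model the logic, and the flag value is copied from the hunk in
mm_types.h. The idea, as far as the diff shows, is that orig_pte is only
compared when handle_pte_fault() actually cached it. Otherwise the live
entry is simply expected to still be empty, so nothing needs to call
pte_clear() on the orig_pte copy held inside struct vm_fault.

```c
#include <stdbool.h>

/* Simplified stand-in for the kernel's pte_t (assumption, not the real type). */
typedef struct { unsigned long val; } pte_t;

/* Same bit as in the patch's enum fault_flag hunk. */
#define FAULT_FLAG_ORIG_PTE_VALID (1u << 11)

/* Reduced model of struct vm_fault: only the fields the check uses. */
struct vm_fault {
	unsigned int flags;
	pte_t *pte;      /* points at the live page-table entry */
	pte_t orig_pte;  /* snapshot; meaningful only if the flag is set */
};

/* Toy helpers: an all-zero entry counts as "none" in this model. */
static inline bool pte_none(pte_t pte)
{
	return pte.val == 0;
}

static inline bool pte_same(pte_t a, pte_t b)
{
	return a.val == b.val;
}

/*
 * Mirrors the logic of vmf_pte_changed() from the patch:
 *  - snapshot valid: changed if the live entry differs from the snapshot
 *  - no snapshot:    changed if the live entry is no longer empty
 */
static inline bool vmf_pte_changed(const struct vm_fault *vmf)
{
	if (vmf->flags & FAULT_FLAG_ORIG_PTE_VALID)
		return !pte_same(*vmf->pte, vmf->orig_pte);

	return !pte_none(*vmf->pte);
}
```

In this model, a fault that starts with no page table leaves the flag
clear, and finish_fault() would install the page only if the entry is
still empty once the lock is taken. A fault that starts with a mapped
entry sets the flag and compares against the snapshot, as before.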

