* rcu_sched self-detected stall on CPU
@ 2022-04-05 21:41 Miguel Ojeda
  2022-04-06  9:31   ` Zhouyi Zhou
From: Miguel Ojeda @ 2022-04-05 21:41 UTC
  To: linuxppc-dev, rcu

[-- Attachment #1: Type: text/plain, Size: 400 bytes --]

Hi PPC/RCU,

While merging v5.18-rc1 changes I noticed that our CI PPC runs broke. I
reproduced the problem in v5.18-rc1 as well as in next-20220405, under
both QEMU 4.2.1 and 6.1.0, with `-smp 2`; I could not reproduce it in
v5.17 after a few tries.

Sadly, the problem is not deterministic, although it is not too hard to
reproduce (roughly 1 run out of 5?). Please see the attached config and
QEMU output.
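
Since the failure is intermittent, a scripted loop may help anyone trying
to reproduce or bisect it. The following is only a minimal sketch, not the
CI script actually used: the QEMU command line is the one from the attached
output, while the helper names (`has_stall`, `boot_once`, `count_stalls`)
and the 60-second wall-clock timeout are assumptions made for illustration.

```python
import subprocess

# Command line copied from the attached QEMU output.
QEMU_CMD = ["qemu-system-ppc64", "-kernel", "vmlinux", "-nographic",
            "-vga", "none", "-no-reboot", "-smp", "2"]

# Text of the stall report as it appears in the attached console log.
STALL_MARKER = "rcu_sched self-detected stall on CPU"


def has_stall(console_text):
    """Return True if the console output contains the RCU stall report."""
    return STALL_MARKER in console_text


def boot_once(timeout_s=60):
    """Boot once and return the console output captured so far.

    timeout_s is an assumed wall-clock limit; a stalled guest may never
    exit on its own, so QEMU is killed when the limit expires.
    """
    try:
        done = subprocess.run(QEMU_CMD, stdin=subprocess.DEVNULL,
                              stdout=subprocess.PIPE,
                              stderr=subprocess.STDOUT, timeout=timeout_s)
        out = done.stdout
    except subprocess.TimeoutExpired as exc:
        # Output captured before the timeout; may be None if nothing was read.
        out = exc.stdout or b""
    return out.decode(errors="replace")


def count_stalls(boot, attempts=5):
    """Call boot() `attempts` times and count the runs that reported a stall."""
    return sum(1 for _ in range(attempts) if has_stall(boot()))
```

Calling `count_stalls(boot_once, attempts=20)` would give a rough failure
rate; passing the boot function in as a parameter keeps the counting logic
testable without QEMU installed.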

Cheers,
Miguel

[-- Attachment #2: qemu --]
[-- Type: application/octet-stream, Size: 20395 bytes --]

# qemu-system-ppc64 -kernel vmlinux -nographic -vga none -no-reboot -smp 2
qemu-system-ppc64: warning: TCG doesn't support requested feature, cap-cfpc=workaround
qemu-system-ppc64: warning: TCG doesn't support requested feature, cap-sbbc=workaround
qemu-system-ppc64: warning: TCG doesn't support requested feature, cap-ibs=workaround


SLOF **********************************************************************
QEMU Starting
 Build Date = Jan 31 2020 20:27:09
 FW Version = buildd@ release 20191209
 Press "s" to enter Open Firmware.

Populating /vdevice methods
Populating /vdevice/vty@71000000
Populating /vdevice/nvram@71000001
Populating /vdevice/l-lan@71000002
Populating /vdevice/v-scsi@71000003
       SCSI: Looking for devices
          8200000000000000 CD-ROM   : "QEMU     QEMU CD-ROM      2.5+"
Populating /pci@800000020000000
No NVRAM common partition, re-initializing...
Scanning USB 
Using default console: /vdevice/vty@71000000
Detected RAM kernel at 400000 (121f8f0 bytes) 
     
  Welcome to Open Firmware

  Copyright (c) 2004, 2017 IBM Corporation All rights reserved.
  This program and the accompanying materials are made available
  under the terms of the BSD License available at
  http://www.opensource.org/licenses/bsd-license.php

Booting from memory...
OF stdout device is: /vdevice/vty@71000000
Preparing to boot Linux version 5.18.0-rc1 (root@test) (powerpc64le-linux-gnu-gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0, GNU ld (GNU Binutils for Ubuntu) 2.34) #2 SMP Tue Apr 5 21:09:40 UTC 2022
Detected machine type: 0000000000000101
command line:  
Max number of cores passed to firmware: 2 (NR_CPUS = 2)
Calling ibm,client-architecture-support...qemu-system-ppc64: warning: TCG doesn't support requested feature, cap-cfpc=workaround
qemu-system-ppc64: warning: TCG doesn't support requested feature, cap-sbbc=workaround
qemu-system-ppc64: warning: TCG doesn't support requested feature, cap-ibs=workaround


SLOF **********************************************************************
QEMU Starting
 Build Date = Jan 31 2020 20:27:09
 FW Version = buildd@ release 20191209
 Press "s" to enter Open Firmware.

Populating /vdevice methods
Populating /vdevice/vty@71000000
Populating /vdevice/nvram@71000001
Populating /vdevice/l-lan@71000002
Populating /vdevice/v-scsi@71000003
       SCSI: Looking for devices
          8200000000000000 CD-ROM   : "QEMU     QEMU CD-ROM      2.5+"
Populating /pci@800000020000000
Scanning USB 
Using default console: /vdevice/vty@71000000
Detected RAM kernel at 400000 (121f8f0 bytes) 
     
  Welcome to Open Firmware

  Copyright (c) 2004, 2017 IBM Corporation All rights reserved.
  This program and the accompanying materials are made available
  under the terms of the BSD License available at
  http://www.opensource.org/licenses/bsd-license.php

Booting from memory...
OF stdout device is: /vdevice/vty@71000000
Preparing to boot Linux version 5.18.0-rc1 (root@test) (powerpc64le-linux-gnu-gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0, GNU ld (GNU Binutils for Ubuntu) 2.34) #2 SMP Tue Apr 5 21:09:40 UTC 2022
Detected machine type: 0000000000000101
command line:  
Max number of cores passed to firmware: 2 (NR_CPUS = 2)
Calling ibm,client-architecture-support... done
memory layout at init:
  memory_limit : 0000000000000000 (16 MB aligned)
  alloc_bottom : 0000000001630000
  alloc_top    : 0000000020000000
  alloc_top_hi : 0000000020000000
  rmo_top      : 0000000020000000
  ram_top      : 0000000020000000
instantiating rtas at 0x000000001fff0000... done
prom_hold_cpus: skipped
copying OF device tree...
Building dt strings...
Building dt structure...
Device tree strings 0x0000000001640000 -> 0x0000000001640a77
Device tree struct  0x0000000001650000 -> 0x0000000001660000
Quiescing Open Firmware ...
Booting Linux via __start() @ 0x0000000000400000 ...
[    0.000000] radix-mmu: Page sizes from device-tree:
[    0.000000] radix-mmu: Page size shift = 12 AP=0x0
[    0.000000] radix-mmu: Page size shift = 16 AP=0x5
[    0.000000] radix-mmu: Page size shift = 21 AP=0x1
[    0.000000] radix-mmu: Page size shift = 30 AP=0x2
[    0.000000] Activating Kernel Userspace Access Prevention
[    0.000000] Activating Kernel Userspace Execution Prevention
[    0.000000] radix-mmu: Mapped 0x0000000000000000-0x0000000001200000 with 2.00 MiB pages (exec)
[    0.000000] radix-mmu: Mapped 0x0000000001200000-0x0000000020000000 with 2.00 MiB pages
[    0.000000] lpar: Using radix MMU under hypervisor
[    0.000000] Linux version 5.18.0-rc1 (root@test) (powerpc64le-linux-gnu-gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0, GNU ld (GNU Binutils for Ubuntu) 2.34) #2 SMP Tue Apr 5 21:09:40 UTC 2022
[    0.000000] Using pSeries machine description
[    0.000000] printk: bootconsole [udbg0] enabled
[    0.000000] Partition configured for 2 cpus.
[    0.000000] CPU maps initialized for 1 thread per core
[    0.000000] -----------------------------------------------------
[    0.000000] phys_mem_size     = 0x20000000
[    0.000000] dcache_bsize      = 0x80
[    0.000000] icache_bsize      = 0x80
[    0.000000] cpu_features      = 0x0001c06b8f4f9187
[    0.000000]   possible        = 0x000ffbebcf5fb187
[    0.000000]   always          = 0x0000006b8b5c9181
[    0.000000] cpu_user_features = 0xdc0065c2 0xaef00000
[    0.000000] mmu_features      = 0x3c007641
[    0.000000] firmware_features = 0x00000085455a445f
[    0.000000] vmalloc start     = 0xc008000000000000
[    0.000000] IO start          = 0xc00a000000000000
[    0.000000] vmemmap start     = 0xc00c000000000000
[    0.000000] -----------------------------------------------------
[    0.000000] rfi-flush: fallback displacement flush available
[    0.000000] rfi-flush: ori type flush available
[    0.000000] rfi-flush: mttrig type flush available
[    0.000000] count-cache-flush: software flush enabled.
[    0.000000] link-stack-flush: software flush enabled.
[    0.000000] stf-barrier: eieio barrier available
[    0.000000] PPC64 nvram contains 65536 bytes
[    0.000000] PV qspinlock hash table entries: 4096 (order: 0, 65536 bytes, linear)
[    0.000000] barrier-nospec: using ORI speculation barrier
[    0.000000] Zone ranges:
[    0.000000]   Normal   [mem 0x0000000000000000-0x000000001fffffff]
[    0.000000] Movable zone start for each node
[    0.000000] Early memory node ranges
[    0.000000]   node   0: [mem 0x0000000000000000-0x000000001fffffff]
[    0.000000] Initmem setup node 0 [mem 0x0000000000000000-0x000000001fffffff]
[    0.000000] percpu: Embedded 2 pages/cpu s32544 r0 d98528 u131072
[    0.000000] Built 1 zonelists, mobility grouping on.  Total pages: 8185
[    0.000000] Kernel command line: 
[    0.000000] Dentry cache hash table entries: 65536 (order: 3, 524288 bytes, linear)
[    0.000000] Inode-cache hash table entries: 32768 (order: 2, 262144 bytes, linear)
[    0.000000] mem auto-init: stack:off, heap alloc:on, heap free:on
[    0.000000] mem auto-init: clearing system memory may take some time...
[    0.000000] Memory: 418560K/524288K available (4096K kernel code, 704K rwdata, 768K rodata, 1024K init, 446K bss, 105728K reserved, 0K cma-reserved)
[    0.000000] random: get_random_u64 called from __kmem_cache_create+0x34/0x520 with crng_init=0
[    0.000000] SLUB: HWalign=128, Order=0-3, MinObjects=0, CPUs=2, Nodes=1
[    0.000000] rcu: Hierarchical RCU implementation.
[    0.000000] rcu: RCU calculated value of scheduler-enlistment delay is 100 jiffies.
[    0.000000] NR_IRQS: 512, nr_irqs: 512, preallocated irqs: 16
[    0.000000] xive: Using IRQ range [0-1]
[    0.000000] xive: Interrupt handling initialized with spapr backend
[    0.000000] xive: Using priority 6 for all interrupts
[    0.000000] xive: Using 64kB queues
[    0.000061] time_init: 56 bit decrementer (max: 7fffffffffffff)
[    0.000505] clocksource: timebase: mask: 0xffffffffffffffff max_cycles: 0x761537d007, max_idle_ns: 440795202126 ns
[    0.000999] clocksource: timebase mult[1f40000] shift[24] registered
[    0.007895] Console: colour dummy device 80x25
[    0.008493] printk: console [hvc0] enabled
[    0.008493] printk: console [hvc0] enabled
[    0.010438] printk: bootconsole [udbg0] disabled
[    0.010438] printk: bootconsole [udbg0] disabled
[    0.011696] pid_max: default: 32768 minimum: 301
[    0.012503] Mount-cache hash table entries: 8192 (order: 0, 65536 bytes, linear)
[    0.012575] Mountpoint-cache hash table entries: 8192 (order: 0, 65536 bytes, linear)
[    0.041799] POWER9 performance monitor hardware support registered
[    0.042735] rcu: Hierarchical SRCU implementation.
[    0.045220] smp: Bringing up secondary CPUs ...
[    0.066059] smp: Brought up 1 node, 2 CPUs
[    0.093530] devtmpfs: initialized
[    0.100032] PCI host bridge /pci@800000020000000  ranges:
[    0.100607]   IO 0x0000200000000000..0x000020000000ffff -> 0x0000000000000000
[    0.100734]  MEM 0x0000200080000000..0x00002000ffffffff -> 0x0000000080000000 
[    0.100805]  MEM 0x0000210000000000..0x000021ffffffffff -> 0x0000210000000000 
[    0.101744] PCI: OF: PROBE_ONLY disabled
[    0.102198] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 1911260446275000 ns
[    0.102371] futex hash table entries: 512 (order: 0, 65536 bytes, linear)
Linux ppc64le
#2 SMP Tue Apr 5[    0.110196] EEH: pSeries platform initialized
[    0.115987] software IO TLB: tearing down default memory pool
[    0.134293] PCI: Probing PCI hardware
[    0.136684] PCI host bridge to bus 0000:00
[    0.136999] pci_bus 0000:00: root bus resource [io  0x10000-0x1ffff] (bus address [0x0000-0xffff])
[    0.137449] pci_bus 0000:00: root bus resource [mem 0x200080000000-0x2000ffffffff] (bus address [0x80000000-0xffffffff])
[    0.137535] pci_bus 0000:00: root bus resource [mem 0x210000000000-0x21ffffffffff 64bit]
[    0.137720] pci_bus 0000:00: root bus resource [bus 00-ff]
[    0.142416] IOMMU table initialized, virtual merging enabled
[    0.143475] pci_bus 0000:00: resource 4 [io  0x10000-0x1ffff]
[    0.143541] pci_bus 0000:00: resource 5 [mem 0x200080000000-0x2000ffffffff]
[    0.143575] pci_bus 0000:00: resource 6 [mem 0x210000000000-0x21ffffffffff 64bit]
[    0.143754] EEH: No capable adapters found: recovery disabled.
[    0.166254] vgaarb: loaded
[    0.168227] clocksource: Switched to clocksource timebase
[    0.182817] PCI: CLS 0 bytes, default 128
[    1.790625] workingset: timestamp_bits=62 max_order=13 bucket_order=0
[   21.186912] rcu: INFO: rcu_sched self-detected stall on CPU
[   21.187331] rcu: 	1-...!: (4712629 ticks this GP) idle=2c1/0/0x3 softirq=8/8 fqs=0 
[   21.187529] 	(t=21000 jiffies g=-1183 q=3)
[   21.187681] rcu: rcu_sched kthread timer wakeup didn't happen for 20997 jiffies! g-1183 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402
[   21.187770] rcu: 	Possible timer handling issue on cpu=1 timer-softirq=1
[   21.187927] rcu: rcu_sched kthread starved for 21001 jiffies! g-1183 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 ->cpu=1
[   21.188019] rcu: 	Unless rcu_sched kthread gets sufficient CPU time, OOM is now expected behavior.
[   21.188087] rcu: RCU grace-period kthread stack dump:
[   21.188196] task:rcu_sched       state:I stack:    0 pid:   10 ppid:     2 flags:0x00000800
[   21.188453] Call Trace:
[   21.188525] [c0000000061e78a0] [c0000000061e78e0] 0xc0000000061e78e0 (unreliable)
[   21.188900] [c0000000061e7a90] [c000000000017210] __switch_to+0x250/0x310
[   21.189210] [c0000000061e7b00] [c0000000003ed660] __schedule+0x210/0x660
[   21.189315] [c0000000061e7b80] [c0000000003edb14] schedule+0x64/0x110
[   21.189387] [c0000000061e7bb0] [c0000000003f6648] schedule_timeout+0x1d8/0x390
[   21.189473] [c0000000061e7c80] [c00000000011111c] rcu_gp_fqs_loop+0x2dc/0x3d0
[   21.189555] [c0000000061e7d30] [c0000000001144ec] rcu_gp_kthread+0x13c/0x160
[   21.189633] [c0000000061e7dc0] [c0000000000c1770] kthread+0x110/0x120
[   21.189714] [c0000000061e7e10] [c00000000000c9e4] ret_from_kernel_thread+0x5c/0x64
[   21.189938] rcu: Stack dump where RCU GP kthread last ran:
[   21.189992] Task dump for CPU 1:
[   21.190059] task:swapper/1       state:R  running task     stack:    0 pid:    0 ppid:     1 flags:0x00000804
[   21.190169] Call Trace:
[   21.190194] [c0000000061ef2d0] [c0000000000c9a40] sched_show_task+0x180/0x1c0 (unreliable)
[   21.190278] [c0000000061ef340] [c000000000116ca0] rcu_check_gp_kthread_starvation+0x16c/0x19c
[   21.190370] [c0000000061ef3c0] [c000000000114f7c] rcu_sched_clock_irq+0x7ec/0xaf0
[   21.190448] [c0000000061ef4b0] [c000000000120fdc] update_process_times+0xbc/0x140
[   21.190524] [c0000000061ef4f0] [c000000000136a24] tick_nohz_handler+0xf4/0x1b0
[   21.190608] [c0000000061ef540] [c00000000001c828] timer_interrupt+0x148/0x2d0
[   21.190699] [c0000000061ef590] [c0000000000098e8] decrementer_common_virt+0x208/0x210
[   21.190837] --- interrupt: 900 at arch_local_irq_restore+0x168/0x170
[   21.190941] NIP:  c000000000013608 LR: c0000000003f8114 CTR: c0000000000dc630
[   21.191031] REGS: c0000000061ef600 TRAP: 0900   Not tainted  (5.18.0-rc1)
[   21.191109] MSR:  8000000000009033 <SF,EE,ME,IR,DR,RI,LE>  CR: 22000202  XER: 00000000
[   21.191274] CFAR: 0000000000000000 IRQMASK: 0 
[   21.191274] GPR00: c00000000009c368 c0000000061ef8a0 c00000000116a700 0000000000000000 
[   21.191274] GPR04: 0000000000000000 0000000000000000 000000001ee30000 ffffffffffffffff 
[   21.191274] GPR08: 000000001ee30000 0000000000000000 0000000000008002 7265677368657265 
[   21.191274] GPR12: c0000000000dc630 c00000001ffe5800 0000000000000000 0000000000000000 
[   21.191274] GPR16: 0000000000000282 0000000000000000 0000000000000000 c0000000061eff00 
[   21.191274] GPR20: 0000000000000000 0000000000000001 c0000000061b9f80 c000000001195a10 
[   21.191274] GPR24: c000000001193a00 00000000fffb6cc4 000000000000000a c0000000010721e8 
[   21.191274] GPR28: c000000001076800 c000000001070380 c0000000010716d8 c0000000061b9f80 
[   21.191932] NIP [c000000000013608] arch_local_irq_restore+0x168/0x170
[   21.192024] LR [c0000000003f8114] __do_softirq+0xd4/0x2ec
[   21.192118] --- interrupt: 900
[   21.192158] [c0000000061ef8a0] [c0000000061b9f80] 0xc0000000061b9f80 (unreliable)
[   21.192227] [c0000000061ef9b0] [c00000000009c368] irq_exit+0xc8/0x110
[   21.192307] [c0000000061ef9d0] [c00000000001c858] timer_interrupt+0x178/0x2d0
[   21.192397] [c0000000061efa20] [c0000000000098e8] decrementer_common_virt+0x208/0x210
[   21.192495] --- interrupt: 900 at plpar_hcall_norets_notrace+0x18/0x2c
[   21.192566] NIP:  c000000000072988 LR: c000000000074fa8 CTR: c000000000074f10
[   21.192615] REGS: c0000000061efa90 TRAP: 0900   Not tainted  (5.18.0-rc1)
[   21.192659] MSR:  8000000000009033 <SF,EE,ME,IR,DR,RI,LE>  CR: 28000202  XER: 00000000
[   21.192755] CFAR: 0000000000000000 IRQMASK: 0 
[   21.192755] GPR00: 0000000028000202 c0000000061efd30 c00000000116a700 0000000000000000 
[   21.192755] GPR04: c00000001fea0280 ffffffffffffffff 0000000001f40000 000000019d088fcf 
[   21.192755] GPR08: 000000001ee30000 c00000001ffe5400 0000000000000001 0000000100000000 
[   21.192755] GPR12: c000000000074f10 c00000001ffe5800 0000000000000000 0000000000000000 
[   21.192755] GPR16: 0000000000000000 0000000000000000 0000000000000000 c0000000061eff00 
[   21.192755] GPR20: c00000000003d440 0000000000000001 c000000001195b30 c000000001195a10 
[   21.192755] GPR24: 0000000000080000 c0000000061ba000 c000000001195a98 0000000000000001 
[   21.192755] GPR28: 0000000000000001 c0000000010716d0 c0000000010716d8 c0000000010716d0 
[   21.193290] NIP [c000000000072988] plpar_hcall_norets_notrace+0x18/0x2c
[   21.193363] LR [c000000000074fa8] pseries_lpar_idle+0x98/0x1b0
[   21.193428] --- interrupt: 900
[   21.193457] [c0000000061efd30] [0000000000000001] 0x1 (unreliable)
[   21.193512] [c0000000061efdb0] [c000000000018b54] arch_cpu_idle+0x44/0x180
[   21.193590] [c0000000061efde0] [c0000000003f75bc] default_idle_call+0x4c/0x7c
[   21.193679] [c0000000061efe00] [c0000000000e1384] do_idle+0x114/0x1e0
[   21.193747] [c0000000061efe60] [c0000000000e1664] cpu_startup_entry+0x34/0x40
[   21.193901] [c0000000061efe90] [c00000000003f044] start_secondary+0x624/0xa00
[   21.194002] [c0000000061eff90] [c00000000000cd54] start_secondary_prolog+0x10/0x14
[   21.194245] Task dump for CPU 1:
[   21.194284] task:swapper/1       state:R  running task     stack:    0 pid:    0 ppid:     1 flags:0x00000804
[   21.194374] Call Trace:
[   21.194400] [c0000000061ef2b0] [c0000000000c9a40] sched_show_task+0x180/0x1c0 (unreliable)
[   21.194479] [c0000000061ef320] [c000000000116df8] rcu_dump_cpu_stacks+0x128/0x188
[   21.194567] [c0000000061ef3c0] [c000000000114f9c] rcu_sched_clock_irq+0x80c/0xaf0
[   21.194642] [c0000000061ef4b0] [c000000000120fdc] update_process_times+0xbc/0x140
[   21.194712] [c0000000061ef4f0] [c000000000136a24] tick_nohz_handler+0xf4/0x1b0
[   21.194828] [c0000000061ef540] [c00000000001c828] timer_interrupt+0x148/0x2d0
[   21.194942] [c0000000061ef590] [c0000000000098e8] decrementer_common_virt+0x208/0x210
[   21.195035] --- interrupt: 900 at arch_local_irq_restore+0x168/0x170
[   21.195104] NIP:  c000000000013608 LR: c0000000003f8114 CTR: c0000000000dc630
[   21.195152] REGS: c0000000061ef600 TRAP: 0900   Not tainted  (5.18.0-rc1)
[   21.195199] MSR:  8000000000009033 <SF,EE,ME,IR,DR,RI,LE>  CR: 22000202  XER: 00000000
[   21.195296] CFAR: 0000000000000000 IRQMASK: 0 
[   21.195296] GPR00: c00000000009c368 c0000000061ef8a0 c00000000116a700 0000000000000000 
[   21.195296] GPR04: 0000000000000000 0000000000000000 000000001ee30000 ffffffffffffffff 
[   21.195296] GPR08: 000000001ee30000 0000000000000000 0000000000008002 7265677368657265 
[   21.195296] GPR12: c0000000000dc630 c00000001ffe5800 0000000000000000 0000000000000000 
[   21.195296] GPR16: 0000000000000282 0000000000000000 0000000000000000 c0000000061eff00 
[   21.195296] GPR20: 0000000000000000 0000000000000001 c0000000061b9f80 c000000001195a10 
[   21.195296] GPR24: c000000001193a00 00000000fffb6cc4 000000000000000a c0000000010721e8 
[   21.195296] GPR28: c000000001076800 c000000001070380 c0000000010716d8 c0000000061b9f80 
[   21.195850] NIP [c000000000013608] arch_local_irq_restore+0x168/0x170
[   21.195944] LR [c0000000003f8114] __do_softirq+0xd4/0x2ec
[   21.196027] --- interrupt: 900
[   21.196056] [c0000000061ef8a0] [c0000000061b9f80] 0xc0000000061b9f80 (unreliable)
[   21.196119] [c0000000061ef9b0] [c00000000009c368] irq_exit+0xc8/0x110
[   21.196192] [c0000000061ef9d0] [c00000000001c858] timer_interrupt+0x178/0x2d0
[   21.196282] [c0000000061efa20] [c0000000000098e8] decrementer_common_virt+0x208/0x210
[   21.196373] --- interrupt: 900 at plpar_hcall_norets_notrace+0x18/0x2c
[   21.196439] NIP:  c000000000072988 LR: c000000000074fa8 CTR: c000000000074f10
[   21.196489] REGS: c0000000061efa90 TRAP: 0900   Not tainted  (5.18.0-rc1)
[   21.196534] MSR:  8000000000009033 <SF,EE,ME,IR,DR,RI,LE>  CR: 28000202  XER: 00000000
[   21.196627] CFAR: 0000000000000000 IRQMASK: 0 
[   21.196627] GPR00: 0000000028000202 c0000000061efd30 c00000000116a700 0000000000000000 
[   21.196627] GPR04: c00000001fea0280 ffffffffffffffff 0000000001f40000 000000019d088fcf 
[   21.196627] GPR08: 000000001ee30000 c00000001ffe5400 0000000000000001 0000000100000000 
[   21.196627] GPR12: c000000000074f10 c00000001ffe5800 0000000000000000 0000000000000000 
[   21.196627] GPR16: 0000000000000000 0000000000000000 0000000000000000 c0000000061eff00 
[   21.196627] GPR20: c00000000003d440 0000000000000001 c000000001195b30 c000000001195a10 
[   21.196627] GPR24: 0000000000080000 c0000000061ba000 c000000001195a98 0000000000000001 
[   21.196627] GPR28: 0000000000000001 c0000000010716d0 c0000000010716d8 c0000000010716d0 
[   21.197168] NIP [c000000000072988] plpar_hcall_norets_notrace+0x18/0x2c
[   21.197239] LR [c000000000074fa8] pseries_lpar_idle+0x98/0x1b0
[   21.197305] --- interrupt: 900
[   21.197333] [c0000000061efd30] [0000000000000001] 0x1 (unreliable)
[   21.197390] [c0000000061efdb0] [c000000000018b54] arch_cpu_idle+0x44/0x180
[   21.197470] [c0000000061efde0] [c0000000003f75bc] default_idle_call+0x4c/0x7c
[   21.197556] [c0000000061efe00] [c0000000000e1384] do_idle+0x114/0x1e0
[   21.197620] [c0000000061efe60] [c0000000000e1664] cpu_startup_entry+0x34/0x40
[   21.197696] [c0000000061efe90] [c00000000003f044] start_secondary+0x624/0xa00
[   21.197820] [c0000000061eff90] [c00000000000cd54] start_secondary_prolog+0x10/0x14
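
As a cross-check on the stall report above: the attached config sets
CONFIG_HZ=1000, so the reported `t=21000 jiffies` corresponds to about 21
seconds, which agrees with the ~21.19 s timestamps on the stall messages.
The sketch below shows that arithmetic; the function name `stall_seconds`
and the regular expression are illustrative assumptions, not part of any
kernel tooling.

```python
import re

# Matches the "(t=... jiffies g=... q=...)" summary line of the stall report.
STALL_SUMMARY_RE = re.compile(r"\(t=(\d+) jiffies g=(-?\d+) q=(\d+)\)")


def stall_seconds(line, hz=1000):
    """Convert the t=<jiffies> field of a stall summary line to seconds.

    hz defaults to 1000 to match CONFIG_HZ in the attached config.
    Returns None if the line does not contain a stall summary.
    """
    match = STALL_SUMMARY_RE.search(line)
    if match is None:
        return None
    return int(match.group(1)) / hz
```

Applied to the line `(t=21000 jiffies g=-1183 q=3)` from the log, this
returns 21.0.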

[-- Attachment #3: config --]
[-- Type: application/octet-stream, Size: 35266 bytes --]

#
# Automatically generated file; DO NOT EDIT.
# Linux/powerpc 5.18.0-rc1 Kernel Configuration
#
CONFIG_CC_VERSION_TEXT="powerpc64le-linux-gnu-gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0"
CONFIG_CC_IS_GCC=y
CONFIG_GCC_VERSION=90400
CONFIG_CLANG_VERSION=0
CONFIG_AS_IS_GNU=y
CONFIG_AS_VERSION=23400
CONFIG_LD_IS_BFD=y
CONFIG_LD_VERSION=23400
CONFIG_LLD_VERSION=0
CONFIG_CC_CAN_LINK=y
CONFIG_CC_CAN_LINK_STATIC=y
CONFIG_CC_HAS_ASM_GOTO=y
CONFIG_CC_HAS_ASM_INLINE=y
CONFIG_CC_HAS_NO_PROFILE_FN_ATTR=y
CONFIG_PAHOLE_VERSION=0
CONFIG_IRQ_WORK=y
CONFIG_BUILDTIME_TABLE_SORT=y
CONFIG_THREAD_INFO_IN_TASK=y

#
# General setup
#
CONFIG_INIT_ENV_ARG_LIMIT=32
# CONFIG_COMPILE_TEST is not set
CONFIG_WERROR=y
CONFIG_LOCALVERSION=""
# CONFIG_LOCALVERSION_AUTO is not set
CONFIG_BUILD_SALT=""
CONFIG_HAVE_KERNEL_GZIP=y
CONFIG_HAVE_KERNEL_XZ=y
CONFIG_KERNEL_GZIP=y
# CONFIG_KERNEL_XZ is not set
CONFIG_DEFAULT_INIT=""
CONFIG_DEFAULT_HOSTNAME="(none)"
CONFIG_SWAP=y
# CONFIG_SYSVIPC is not set
# CONFIG_WATCH_QUEUE is not set
# CONFIG_CROSS_MEMORY_ATTACH is not set
# CONFIG_USELIB is not set
CONFIG_HAVE_ARCH_AUDITSYSCALL=y

#
# IRQ subsystem
#
CONFIG_GENERIC_IRQ_SHOW=y
CONFIG_GENERIC_IRQ_SHOW_LEVEL=y
CONFIG_GENERIC_IRQ_MIGRATION=y
CONFIG_HARDIRQS_SW_RESEND=y
CONFIG_IRQ_DOMAIN=y
CONFIG_IRQ_DOMAIN_HIERARCHY=y
CONFIG_GENERIC_MSI_IRQ=y
CONFIG_GENERIC_MSI_IRQ_DOMAIN=y
CONFIG_IRQ_FORCED_THREADING=y
CONFIG_SPARSE_IRQ=y
# end of IRQ subsystem

CONFIG_GENERIC_TIME_VSYSCALL=y
CONFIG_GENERIC_CLOCKEVENTS=y
CONFIG_ARCH_HAS_TICK_BROADCAST=y
CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y
CONFIG_GENERIC_CMOS_UPDATE=y

#
# Timers subsystem
#
CONFIG_TICK_ONESHOT=y
CONFIG_NO_HZ_COMMON=y
# CONFIG_HZ_PERIODIC is not set
CONFIG_NO_HZ_IDLE=y
# CONFIG_NO_HZ_FULL is not set
# CONFIG_NO_HZ is not set
# CONFIG_HIGH_RES_TIMERS is not set
# end of Timers subsystem

CONFIG_HAVE_EBPF_JIT=y

#
# BPF subsystem
#
# CONFIG_BPF_SYSCALL is not set
# end of BPF subsystem

CONFIG_PREEMPT_VOLUNTARY_BUILD=y
# CONFIG_PREEMPT_NONE is not set
CONFIG_PREEMPT_VOLUNTARY=y
# CONFIG_PREEMPT is not set
# CONFIG_SCHED_CORE is not set

#
# CPU/Task time and stats accounting
#
CONFIG_TICK_CPU_ACCOUNTING=y
# CONFIG_VIRT_CPU_ACCOUNTING_NATIVE is not set
# CONFIG_VIRT_CPU_ACCOUNTING_GEN is not set
# CONFIG_IRQ_TIME_ACCOUNTING is not set
# CONFIG_BSD_PROCESS_ACCT is not set
# CONFIG_PSI is not set
# end of CPU/Task time and stats accounting

CONFIG_CPU_ISOLATION=y

#
# RCU Subsystem
#
CONFIG_TREE_RCU=y
# CONFIG_RCU_EXPERT is not set
CONFIG_SRCU=y
CONFIG_TREE_SRCU=y
CONFIG_RCU_STALL_COMMON=y
CONFIG_RCU_NEED_SEGCBLIST=y
# end of RCU Subsystem

# CONFIG_IKCONFIG is not set
# CONFIG_IKHEADERS is not set
CONFIG_LOG_BUF_SHIFT=16
CONFIG_LOG_CPU_MAX_BUF_SHIFT=12
CONFIG_PRINTK_SAFE_LOG_BUF_SHIFT=13

#
# Scheduler features
#
# end of Scheduler features

CONFIG_ARCH_SUPPORTS_NUMA_BALANCING=y
CONFIG_CC_HAS_INT128=y
CONFIG_CC_IMPLICIT_FALLTHROUGH="-Wimplicit-fallthrough=5"
# CONFIG_CGROUPS is not set
CONFIG_NAMESPACES=y
# CONFIG_UTS_NS is not set
CONFIG_TIME_NS=y
# CONFIG_USER_NS is not set
# CONFIG_PID_NS is not set
# CONFIG_CHECKPOINT_RESTORE is not set
# CONFIG_SCHED_AUTOGROUP is not set
# CONFIG_SYSFS_DEPRECATED is not set
# CONFIG_RELAY is not set
CONFIG_BLK_DEV_INITRD=y
CONFIG_INITRAMFS_SOURCE=""
# CONFIG_RD_GZIP is not set
# CONFIG_RD_BZIP2 is not set
# CONFIG_RD_LZMA is not set
# CONFIG_RD_XZ is not set
# CONFIG_RD_LZO is not set
# CONFIG_RD_LZ4 is not set
# CONFIG_RD_ZSTD is not set
# CONFIG_BOOT_CONFIG is not set
CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE=y
# CONFIG_CC_OPTIMIZE_FOR_SIZE is not set
CONFIG_HAVE_LD_DEAD_CODE_DATA_ELIMINATION=y
CONFIG_LD_ORPHAN_WARN=y
CONFIG_SYSCTL=y
CONFIG_SYSCTL_EXCEPTION_TRACE=y
CONFIG_HAVE_PCSPKR_PLATFORM=y
# CONFIG_EXPERT is not set
CONFIG_MULTIUSER=y
CONFIG_SGETMASK_SYSCALL=y
CONFIG_SYSFS_SYSCALL=y
CONFIG_FHANDLE=y
CONFIG_POSIX_TIMERS=y
CONFIG_PRINTK=y
CONFIG_BUG=y
CONFIG_ELF_CORE=y
CONFIG_PCSPKR_PLATFORM=y
CONFIG_BASE_FULL=y
CONFIG_FUTEX=y
CONFIG_FUTEX_PI=y
CONFIG_EPOLL=y
CONFIG_SIGNALFD=y
CONFIG_TIMERFD=y
CONFIG_EVENTFD=y
CONFIG_SHMEM=y
CONFIG_AIO=y
CONFIG_IO_URING=y
CONFIG_ADVISE_SYSCALLS=y
CONFIG_MEMBARRIER=y
CONFIG_KALLSYMS=y
CONFIG_KALLSYMS_BASE_RELATIVE=y
# CONFIG_USERFAULTFD is not set
CONFIG_ARCH_HAS_MEMBARRIER_CALLBACKS=y
CONFIG_ARCH_HAS_MEMBARRIER_SYNC_CORE=y
CONFIG_RSEQ=y
# CONFIG_EMBEDDED is not set
CONFIG_HAVE_PERF_EVENTS=y

#
# Kernel Performance Events And Counters
#
CONFIG_PERF_EVENTS=y
# end of Kernel Performance Events And Counters

CONFIG_VM_EVENT_COUNTERS=y
CONFIG_SLUB_DEBUG=y
# CONFIG_COMPAT_BRK is not set
# CONFIG_SLAB is not set
CONFIG_SLUB=y
# CONFIG_SLAB_MERGE_DEFAULT is not set
# CONFIG_SLAB_FREELIST_RANDOM is not set
CONFIG_SLAB_FREELIST_HARDENED=y
# CONFIG_SHUFFLE_PAGE_ALLOCATOR is not set
CONFIG_SLUB_CPU_PARTIAL=y
# CONFIG_PROFILING is not set
# end of General setup

CONFIG_PPC64=y

#
# Processor support
#
CONFIG_PPC_BOOK3S_64=y
# CONFIG_PPC_BOOK3E_64 is not set
CONFIG_GENERIC_CPU=y
# CONFIG_POWER7_CPU is not set
# CONFIG_POWER8_CPU is not set
# CONFIG_POWER9_CPU is not set
CONFIG_PPC_BOOK3S=y
CONFIG_PPC_FPU_REGS=y
CONFIG_PPC_FPU=y
CONFIG_ALTIVEC=y
CONFIG_VSX=y
CONFIG_PPC_64S_HASH_MMU=y
CONFIG_PPC_RADIX_MMU=y
CONFIG_PPC_RADIX_MMU_DEFAULT=y
CONFIG_PPC_KUEP=y
CONFIG_PPC_KUAP=y
# CONFIG_PPC_KUAP_DEBUG is not set
CONFIG_PPC_PKEY=y
CONFIG_PPC_MM_SLICES=y
CONFIG_PPC_HAVE_PMU_SUPPORT=y
# CONFIG_PMU_SYSFS is not set
CONFIG_PPC_PERF_CTRS=y
CONFIG_FORCE_SMP=y
CONFIG_SMP=y
CONFIG_NR_CPUS=2
CONFIG_PPC_DOORBELL=y
# end of Processor support

# CONFIG_CPU_BIG_ENDIAN is not set
CONFIG_CPU_LITTLE_ENDIAN=y
CONFIG_PPC64_BOOT_WRAPPER=y
CONFIG_64BIT=y
CONFIG_MMU=y
CONFIG_ARCH_MMAP_RND_BITS_MAX=29
CONFIG_ARCH_MMAP_RND_BITS_MIN=14
CONFIG_ARCH_MMAP_RND_COMPAT_BITS_MAX=13
CONFIG_ARCH_MMAP_RND_COMPAT_BITS_MIN=7
CONFIG_NR_IRQS=512
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_GENERIC_HWEIGHT=y
CONFIG_PPC=y
CONFIG_PPC_BARRIER_NOSPEC=y
CONFIG_EARLY_PRINTK=y
CONFIG_PANIC_TIMEOUT=-1
# CONFIG_COMPAT is not set
CONFIG_SCHED_OMIT_FRAME_POINTER=y
CONFIG_ARCH_MAY_HAVE_PC_FDC=y
CONFIG_PPC_UDBG_16550=y
CONFIG_AUDIT_ARCH=y
CONFIG_GENERIC_BUG=y
CONFIG_GENERIC_BUG_RELATIVE_POINTERS=y
CONFIG_ARCH_HIBERNATION_POSSIBLE=y
CONFIG_ARCH_SUSPEND_POSSIBLE=y
CONFIG_ARCH_SUSPEND_NONZERO_CPU=y
CONFIG_ARCH_SUPPORTS_UPROBES=y
CONFIG_PPC_DAWR=y
CONFIG_PGTABLE_LEVELS=4
CONFIG_PPC_MSI_BITMAP=y
CONFIG_PPC_XICS=y
CONFIG_PPC_ICP_NATIVE=y
CONFIG_PPC_ICP_HV=y
CONFIG_PPC_ICS_RTAS=y
CONFIG_PPC_XIVE=y
CONFIG_PPC_XIVE_SPAPR=y

#
# Platform support
#
# CONFIG_PPC_POWERNV is not set
CONFIG_PPC_PSERIES=y
CONFIG_PARAVIRT_SPINLOCKS=y
CONFIG_PPC_SPLPAR=y
# CONFIG_PSERIES_ENERGY is not set
CONFIG_IO_EVENT_IRQ=y
# CONFIG_LPARCFG is not set
# CONFIG_PPC_SMLPAR is not set
# CONFIG_HV_PERF_CTRS is not set
CONFIG_IBMVIO=y
# CONFIG_PPC_SVM is not set
CONFIG_PPC_VAS=y
# CONFIG_KVM_GUEST is not set
# CONFIG_EPAPR_PARAVIRT is not set
CONFIG_PPC_OF_BOOT_TRAMPOLINE=y
# CONFIG_PPC_DT_CPU_FTRS is not set
# CONFIG_UDBG_RTAS_CONSOLE is not set
CONFIG_PPC_SMP_MUXED_IPI=y
CONFIG_MPIC=y
# CONFIG_MPIC_MSGR is not set
CONFIG_PPC_I8259=y
CONFIG_PPC_RTAS=y
CONFIG_RTAS_ERROR_LOGGING=y
CONFIG_PPC_RTAS_DAEMON=y
# CONFIG_RTAS_PROC is not set
CONFIG_EEH=y

#
# CPU Frequency scaling
#
# CONFIG_CPU_FREQ is not set
# end of CPU Frequency scaling

#
# CPUIdle driver
#

#
# CPU Idle
#
# CONFIG_CPU_IDLE is not set
# end of CPU Idle
# end of CPUIdle driver

# CONFIG_GEN_RTC is not set
# end of Platform support

#
# Kernel options
#
# CONFIG_HZ_100 is not set
# CONFIG_HZ_250 is not set
# CONFIG_HZ_300 is not set
CONFIG_HZ_1000=y
CONFIG_HZ=1000
# CONFIG_PPC_TRANSACTIONAL_MEM is not set
CONFIG_HOTPLUG_CPU=y
CONFIG_PPC_QUEUED_SPINLOCKS=y
CONFIG_ARCH_CPU_PROBE_RELEASE=y
# CONFIG_PPC64_SUPPORTS_MEMORY_FAILURE is not set
# CONFIG_KEXEC is not set
CONFIG_RELOCATABLE=y
# CONFIG_RELOCATABLE_TEST is not set
# CONFIG_CRASH_DUMP is not set
# CONFIG_FA_DUMP is not set
# CONFIG_IRQ_ALL_CPUS is not set
# CONFIG_NUMA is not set
CONFIG_ARCH_SELECT_MEMORY_MODEL=y
CONFIG_ARCH_FLATMEM_ENABLE=y
CONFIG_ARCH_SPARSEMEM_ENABLE=y
CONFIG_ARCH_SPARSEMEM_DEFAULT=y
CONFIG_ILLEGAL_POINTER_VALUE=0x5deadbeef0000000
# CONFIG_PPC_4K_PAGES is not set
CONFIG_PPC_64K_PAGES=y
CONFIG_PPC_PAGE_SHIFT=16
CONFIG_THREAD_SHIFT=14
CONFIG_DATA_SHIFT=24
CONFIG_FORCE_MAX_ZONEORDER=9
# CONFIG_PPC_SUBPAGE_PROT is not set
# CONFIG_PPC_PROT_SAO_LPAR is not set
CONFIG_SCHED_SMT=y
CONFIG_PPC_DENORMALISATION=y
CONFIG_CMDLINE=""
CONFIG_EXTRA_TARGETS=""
# CONFIG_SUSPEND is not set
# CONFIG_HIBERNATION is not set
# CONFIG_PM is not set
# CONFIG_PPC_MEM_KEYS is not set
CONFIG_PPC_RTAS_FILTER=y
# end of Kernel options

CONFIG_ISA_DMA_API=y

#
# Bus options
#
CONFIG_GENERIC_ISA_DMA=y
# CONFIG_FSL_LBC is not set
# end of Bus options

CONFIG_NONSTATIC_KERNEL=y
CONFIG_PAGE_OFFSET=0xc000000000000000
CONFIG_KERNEL_START=0xc000000000000000
CONFIG_PHYSICAL_START=0x00000000
CONFIG_ARCH_RANDOM=y
# CONFIG_VIRTUALIZATION is not set

#
# General architecture-dependent options
#
# CONFIG_KPROBES is not set
CONFIG_JUMP_LABEL=y
# CONFIG_STATIC_KEYS_SELFTEST is not set
CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS=y
CONFIG_ARCH_USE_BUILTIN_BSWAP=y
CONFIG_HAVE_IOREMAP_PROT=y
CONFIG_HAVE_KPROBES=y
CONFIG_HAVE_KRETPROBES=y
CONFIG_HAVE_OPTPROBES=y
CONFIG_HAVE_KPROBES_ON_FTRACE=y
CONFIG_HAVE_FUNCTION_ERROR_INJECTION=y
CONFIG_HAVE_NMI=y
CONFIG_TRACE_IRQFLAGS_SUPPORT=y
CONFIG_HAVE_ARCH_TRACEHOOK=y
CONFIG_GENERIC_SMP_IDLE_THREAD=y
CONFIG_ARCH_HAS_FORTIFY_SOURCE=y
CONFIG_ARCH_HAS_SET_MEMORY=y
CONFIG_HAVE_ASM_MODVERSIONS=y
CONFIG_HAVE_REGS_AND_STACK_ACCESS_API=y
CONFIG_HAVE_RSEQ=y
CONFIG_HAVE_HW_BREAKPOINT=y
CONFIG_HAVE_PERF_EVENTS_NMI=y
CONFIG_HAVE_NMI_WATCHDOG=y
CONFIG_HAVE_HARDLOCKUP_DETECTOR_ARCH=y
CONFIG_HAVE_PERF_REGS=y
CONFIG_HAVE_PERF_USER_STACK_DUMP=y
CONFIG_HAVE_ARCH_JUMP_LABEL=y
CONFIG_HAVE_ARCH_JUMP_LABEL_RELATIVE=y
CONFIG_MMU_GATHER_TABLE_FREE=y
CONFIG_MMU_GATHER_RCU_TABLE_FREE=y
CONFIG_MMU_GATHER_PAGE_SIZE=y
CONFIG_ARCH_WANT_IRQS_OFF_ACTIVATE_MM=y
CONFIG_ARCH_HAVE_NMI_SAFE_CMPXCHG=y
CONFIG_ARCH_WEAK_RELEASE_ACQUIRE=y
CONFIG_ARCH_WANT_IPC_PARSE_VERSION=y
CONFIG_HAVE_ARCH_SECCOMP=y
CONFIG_HAVE_ARCH_SECCOMP_FILTER=y
# CONFIG_SECCOMP is not set
CONFIG_HAVE_STACKPROTECTOR=y
CONFIG_STACKPROTECTOR=y
CONFIG_STACKPROTECTOR_STRONG=y
CONFIG_LTO_NONE=y
CONFIG_HAVE_CONTEXT_TRACKING=y
CONFIG_HAVE_VIRT_CPU_ACCOUNTING=y
CONFIG_HAVE_VIRT_CPU_ACCOUNTING_GEN=y
CONFIG_HAVE_IRQ_TIME_ACCOUNTING=y
CONFIG_HAVE_MOVE_PUD=y
CONFIG_HAVE_MOVE_PMD=y
CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE=y
CONFIG_HAVE_ARCH_HUGE_VMAP=y
CONFIG_HAVE_ARCH_HUGE_VMALLOC=y
CONFIG_HAVE_ARCH_SOFT_DIRTY=y
CONFIG_HAVE_MOD_ARCH_SPECIFIC=y
CONFIG_MODULES_USE_ELF_RELA=y
CONFIG_HAVE_IRQ_EXIT_ON_IRQ_STACK=y
CONFIG_HAVE_SOFTIRQ_ON_OWN_STACK=y
CONFIG_ARCH_HAS_ELF_RANDOMIZE=y
CONFIG_HAVE_ARCH_MMAP_RND_BITS=y
CONFIG_ARCH_MMAP_RND_BITS=14
CONFIG_PAGE_SIZE_LESS_THAN_256KB=y
CONFIG_HAVE_RELIABLE_STACKTRACE=y
CONFIG_HAVE_ARCH_NVRAM_OPS=y
CONFIG_CLONE_BACKWARDS=y
CONFIG_OLD_SIGSUSPEND=y
# CONFIG_COMPAT_32BIT_TIME is not set
CONFIG_ARCH_OPTIONAL_KERNEL_RWX=y
CONFIG_ARCH_OPTIONAL_KERNEL_RWX_DEFAULT=y
CONFIG_ARCH_HAS_STRICT_KERNEL_RWX=y
CONFIG_STRICT_KERNEL_RWX=y
CONFIG_ARCH_HAS_STRICT_MODULE_RWX=y
CONFIG_STRICT_MODULE_RWX=y
CONFIG_ARCH_HAS_PHYS_TO_DMA=y
CONFIG_ARCH_WANT_LD_ORPHAN_WARN=y
CONFIG_ARCH_SUPPORTS_DEBUG_PAGEALLOC=y

#
# GCOV-based kernel profiling
#
CONFIG_ARCH_HAS_GCOV_PROFILE_ALL=y
# end of GCOV-based kernel profiling

CONFIG_HAVE_GCC_PLUGINS=y
# end of General architecture-dependent options

CONFIG_RT_MUTEXES=y
CONFIG_BASE_SMALL=0
CONFIG_MODULES=y
# CONFIG_MODULE_FORCE_LOAD is not set
CONFIG_MODULE_UNLOAD=y
# CONFIG_MODULE_FORCE_UNLOAD is not set
# CONFIG_MODVERSIONS is not set
# CONFIG_MODULE_SRCVERSION_ALL is not set
# CONFIG_MODULE_SIG is not set
CONFIG_MODULE_COMPRESS_NONE=y
# CONFIG_MODULE_COMPRESS_GZIP is not set
# CONFIG_MODULE_COMPRESS_XZ is not set
# CONFIG_MODULE_COMPRESS_ZSTD is not set
# CONFIG_MODULE_ALLOW_MISSING_NAMESPACE_IMPORTS is not set
CONFIG_MODPROBE_PATH="/sbin/modprobe"
CONFIG_MODULES_TREE_LOOKUP=y
CONFIG_BLOCK=y
CONFIG_BLOCK_LEGACY_AUTOLOAD=y
# CONFIG_BLK_DEV_BSGLIB is not set
# CONFIG_BLK_DEV_INTEGRITY is not set
# CONFIG_BLK_DEV_ZONED is not set
# CONFIG_BLK_WBT is not set
# CONFIG_BLK_SED_OPAL is not set
# CONFIG_BLK_INLINE_ENCRYPTION is not set

#
# Partition Types
#
CONFIG_PARTITION_ADVANCED=y
# CONFIG_ACORN_PARTITION is not set
# CONFIG_AIX_PARTITION is not set
# CONFIG_OSF_PARTITION is not set
# CONFIG_AMIGA_PARTITION is not set
# CONFIG_ATARI_PARTITION is not set
# CONFIG_MAC_PARTITION is not set
# CONFIG_MSDOS_PARTITION is not set
# CONFIG_LDM_PARTITION is not set
# CONFIG_SGI_PARTITION is not set
# CONFIG_ULTRIX_PARTITION is not set
# CONFIG_SUN_PARTITION is not set
# CONFIG_KARMA_PARTITION is not set
# CONFIG_EFI_PARTITION is not set
# CONFIG_SYSV68_PARTITION is not set
# CONFIG_CMDLINE_PARTITION is not set
# end of Partition Types

CONFIG_BLK_MQ_PCI=y

#
# IO Schedulers
#
# CONFIG_MQ_IOSCHED_DEADLINE is not set
# CONFIG_MQ_IOSCHED_KYBER is not set
# CONFIG_IOSCHED_BFQ is not set
# end of IO Schedulers

CONFIG_INLINE_SPIN_UNLOCK_IRQ=y
CONFIG_INLINE_READ_UNLOCK=y
CONFIG_INLINE_READ_UNLOCK_IRQ=y
CONFIG_INLINE_WRITE_UNLOCK=y
CONFIG_INLINE_WRITE_UNLOCK_IRQ=y
CONFIG_ARCH_SUPPORTS_ATOMIC_RMW=y
CONFIG_MUTEX_SPIN_ON_OWNER=y
CONFIG_RWSEM_SPIN_ON_OWNER=y
CONFIG_LOCK_SPIN_ON_OWNER=y
CONFIG_ARCH_USE_QUEUED_SPINLOCKS=y
CONFIG_QUEUED_SPINLOCKS=y
CONFIG_ARCH_USE_QUEUED_RWLOCKS=y
CONFIG_QUEUED_RWLOCKS=y
CONFIG_ARCH_HAS_MMIOWB=y
CONFIG_MMIOWB=y
CONFIG_ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE=y

#
# Executable file formats
#
CONFIG_BINFMT_ELF=y
CONFIG_ELFCORE=y
CONFIG_CORE_DUMP_DEFAULT_ELF_HEADERS=y
CONFIG_BINFMT_SCRIPT=y
# CONFIG_BINFMT_MISC is not set
CONFIG_COREDUMP=y
# end of Executable file formats

#
# Memory Management options
#
CONFIG_SELECT_MEMORY_MODEL=y
# CONFIG_FLATMEM_MANUAL is not set
CONFIG_SPARSEMEM_MANUAL=y
CONFIG_SPARSEMEM=y
CONFIG_SPARSEMEM_EXTREME=y
CONFIG_SPARSEMEM_VMEMMAP_ENABLE=y
CONFIG_SPARSEMEM_VMEMMAP=y
CONFIG_HAVE_FAST_GUP=y
CONFIG_ARCH_KEEP_MEMBLOCK=y
CONFIG_EXCLUSIVE_SYSTEM_RAM=y
CONFIG_ARCH_ENABLE_MEMORY_HOTPLUG=y
# CONFIG_MEMORY_HOTPLUG is not set
CONFIG_ARCH_ENABLE_MEMORY_HOTREMOVE=y
CONFIG_SPLIT_PTLOCK_CPUS=4
CONFIG_ARCH_ENABLE_SPLIT_PMD_PTLOCK=y
CONFIG_COMPACTION=y
# CONFIG_PAGE_REPORTING is not set
CONFIG_MIGRATION=y
CONFIG_ARCH_ENABLE_THP_MIGRATION=y
CONFIG_PHYS_ADDR_T_64BIT=y
# CONFIG_KSM is not set
CONFIG_DEFAULT_MMAP_MIN_ADDR=4096
CONFIG_TRANSPARENT_HUGEPAGE=y
# CONFIG_TRANSPARENT_HUGEPAGE_ALWAYS is not set
CONFIG_TRANSPARENT_HUGEPAGE_MADVISE=y
CONFIG_NEED_PER_CPU_EMBED_FIRST_CHUNK=y
CONFIG_NEED_PER_CPU_PAGE_FIRST_CHUNK=y
CONFIG_HAVE_SETUP_PER_CPU_AREA=y
# CONFIG_CMA is not set
# CONFIG_ZPOOL is not set
# CONFIG_ZSMALLOC is not set
CONFIG_GENERIC_EARLY_IOREMAP=y
# CONFIG_DEFERRED_STRUCT_PAGE_INIT is not set
# CONFIG_IDLE_PAGE_TRACKING is not set
CONFIG_ARCH_HAS_CURRENT_STACK_POINTER=y
CONFIG_ARCH_HAS_PTE_DEVMAP=y
# CONFIG_PERCPU_STATS is not set

#
# GUP_TEST needs to have DEBUG_FS enabled
#
# CONFIG_READ_ONLY_THP_FOR_FS is not set
CONFIG_ARCH_HAS_PTE_SPECIAL=y
# CONFIG_ANON_VMA_NAME is not set

#
# Data Access Monitoring
#
# CONFIG_DAMON is not set
# end of Data Access Monitoring
# end of Memory Management options

# CONFIG_NET is not set

#
# Device Drivers
#
CONFIG_HAVE_PCI=y
CONFIG_FORCE_PCI=y
CONFIG_PCI=y
CONFIG_PCI_DOMAINS=y
CONFIG_PCI_SYSCALL=y
# CONFIG_PCIEPORTBUS is not set
CONFIG_PCIEASPM=y
CONFIG_PCIEASPM_DEFAULT=y
# CONFIG_PCIEASPM_POWERSAVE is not set
# CONFIG_PCIEASPM_POWER_SUPERSAVE is not set
# CONFIG_PCIEASPM_PERFORMANCE is not set
# CONFIG_PCIE_PTM is not set
CONFIG_PCI_MSI=y
CONFIG_PCI_MSI_IRQ_DOMAIN=y
CONFIG_PCI_MSI_ARCH_FALLBACKS=y
CONFIG_PCI_QUIRKS=y
# CONFIG_PCI_STUB is not set
# CONFIG_PCI_IOV is not set
# CONFIG_PCI_PRI is not set
# CONFIG_PCI_PASID is not set
CONFIG_VGA_ARB=y
CONFIG_VGA_ARB_MAX_GPUS=16
# CONFIG_HOTPLUG_PCI is not set

#
# PCI controller drivers
#
# CONFIG_PCI_FTPCI100 is not set
# CONFIG_PCI_HOST_GENERIC is not set
# CONFIG_PCIE_XILINX is not set
# CONFIG_PCIE_MICROCHIP_HOST is not set

#
# DesignWare PCI Core Support
#
# CONFIG_PCIE_DW_PLAT_HOST is not set
# CONFIG_PCI_MESON is not set
# end of DesignWare PCI Core Support

#
# Mobiveil PCIe Core Support
#
# end of Mobiveil PCIe Core Support

#
# Cadence PCIe controllers support
#
# CONFIG_PCIE_CADENCE_PLAT_HOST is not set
# CONFIG_PCI_J721E_HOST is not set
# end of Cadence PCIe controllers support
# end of PCI controller drivers

#
# PCI Endpoint
#
# CONFIG_PCI_ENDPOINT is not set
# end of PCI Endpoint

#
# PCI switch controller drivers
#
# CONFIG_PCI_SW_SWITCHTEC is not set
# end of PCI switch controller drivers

# CONFIG_CXL_BUS is not set
# CONFIG_PCCARD is not set
# CONFIG_RAPIDIO is not set

#
# Generic Driver Options
#
# CONFIG_UEVENT_HELPER is not set
CONFIG_DEVTMPFS=y
CONFIG_DEVTMPFS_MOUNT=y
# CONFIG_DEVTMPFS_SAFE is not set
CONFIG_STANDALONE=y
CONFIG_PREVENT_FIRMWARE_BUILD=y

#
# Firmware loader
#
CONFIG_FW_LOADER=y
CONFIG_EXTRA_FIRMWARE=""
# CONFIG_FW_LOADER_USER_HELPER is not set
# CONFIG_FW_LOADER_COMPRESS is not set
# end of Firmware loader

CONFIG_ALLOW_DEV_COREDUMP=y
# CONFIG_TEST_ASYNC_DRIVER_PROBE is not set
CONFIG_GENERIC_CPU_AUTOPROBE=y
CONFIG_GENERIC_CPU_VULNERABILITIES=y
# end of Generic Driver Options

#
# Bus devices
#
# CONFIG_MHI_BUS is not set
# end of Bus devices

#
# Firmware Drivers
#

#
# ARM System Control and Management Interface Protocol
#
# end of ARM System Control and Management Interface Protocol

# CONFIG_GOOGLE_FIRMWARE is not set

#
# Tegra firmware driver
#
# end of Tegra firmware driver
# end of Firmware Drivers

# CONFIG_GNSS is not set
# CONFIG_MTD is not set
CONFIG_DTC=y
CONFIG_OF=y
# CONFIG_OF_UNITTEST is not set
CONFIG_OF_FLATTREE=y
CONFIG_OF_EARLY_FLATTREE=y
CONFIG_OF_KOBJ=y
CONFIG_OF_DYNAMIC=y
CONFIG_OF_ADDRESS=y
CONFIG_OF_IRQ=y
CONFIG_OF_RESERVED_MEM=y
# CONFIG_OF_OVERLAY is not set
CONFIG_OF_DMA_DEFAULT_COHERENT=y
CONFIG_ARCH_MIGHT_HAVE_PC_PARPORT=y
# CONFIG_PARPORT is not set
# CONFIG_BLK_DEV is not set

#
# NVME Support
#
# CONFIG_BLK_DEV_NVME is not set
# CONFIG_NVME_FC is not set
# end of NVME Support

#
# Misc devices
#
# CONFIG_DUMMY_IRQ is not set
# CONFIG_IBMVMC is not set
# CONFIG_PHANTOM is not set
# CONFIG_TIFM_CORE is not set
# CONFIG_ENCLOSURE_SERVICES is not set
# CONFIG_HP_ILO is not set
# CONFIG_SRAM is not set
# CONFIG_DW_XDATA_PCIE is not set
# CONFIG_PCI_ENDPOINT_TEST is not set
# CONFIG_XILINX_SDFEC is not set
# CONFIG_OPEN_DICE is not set
# CONFIG_C2PORT is not set

#
# EEPROM support
#
# CONFIG_EEPROM_93CX6 is not set
# end of EEPROM support

# CONFIG_CB710_CORE is not set

#
# Texas Instruments shared transport line discipline
#
# end of Texas Instruments shared transport line discipline

#
# Altera FPGA firmware download module (requires I2C)
#
# CONFIG_GENWQE is not set
# CONFIG_ECHO is not set
# CONFIG_BCM_VK is not set
# CONFIG_MISC_ALCOR_PCI is not set
# CONFIG_MISC_RTSX_PCI is not set
# CONFIG_HABANA_AI is not set
# CONFIG_PVPANIC is not set
# end of Misc devices

#
# SCSI device support
#
CONFIG_SCSI_MOD=y
# CONFIG_RAID_ATTRS is not set
# CONFIG_SCSI is not set
# end of SCSI device support

# CONFIG_ATA is not set
# CONFIG_MD is not set
# CONFIG_TARGET_CORE is not set
# CONFIG_FUSION is not set

#
# IEEE 1394 (FireWire) support
#
# CONFIG_FIREWIRE is not set
# CONFIG_FIREWIRE_NOSY is not set
# end of IEEE 1394 (FireWire) support

# CONFIG_MACINTOSH_DRIVERS is not set

#
# Input device support
#
CONFIG_INPUT=y
# CONFIG_INPUT_FF_MEMLESS is not set
# CONFIG_INPUT_SPARSEKMAP is not set
# CONFIG_INPUT_MATRIXKMAP is not set

#
# Userland interfaces
#
# CONFIG_INPUT_MOUSEDEV is not set
# CONFIG_INPUT_JOYDEV is not set
# CONFIG_INPUT_EVDEV is not set
# CONFIG_INPUT_EVBUG is not set

#
# Input Device Drivers
#
# CONFIG_INPUT_KEYBOARD is not set
# CONFIG_INPUT_MOUSE is not set
# CONFIG_INPUT_JOYSTICK is not set
# CONFIG_INPUT_TABLET is not set
# CONFIG_INPUT_TOUCHSCREEN is not set
# CONFIG_INPUT_MISC is not set
# CONFIG_RMI4_CORE is not set

#
# Hardware I/O ports
#
# CONFIG_SERIO is not set
CONFIG_ARCH_MIGHT_HAVE_PC_SERIO=y
# CONFIG_GAMEPORT is not set
# end of Hardware I/O ports
# end of Input device support

#
# Character devices
#
CONFIG_TTY=y
CONFIG_VT=y
CONFIG_CONSOLE_TRANSLATIONS=y
CONFIG_VT_CONSOLE=y
CONFIG_HW_CONSOLE=y
# CONFIG_VT_HW_CONSOLE_BINDING is not set
CONFIG_UNIX98_PTYS=y
# CONFIG_LEGACY_PTYS is not set
# CONFIG_LDISC_AUTOLOAD is not set

#
# Serial drivers
#
CONFIG_SERIAL_EARLYCON=y
CONFIG_SERIAL_8250=y
# CONFIG_SERIAL_8250_DEPRECATED_OPTIONS is not set
# CONFIG_SERIAL_8250_16550A_VARIANTS is not set
# CONFIG_SERIAL_8250_FINTEK is not set
CONFIG_SERIAL_8250_CONSOLE=y
# CONFIG_SERIAL_8250_PCI is not set
CONFIG_SERIAL_8250_NR_UARTS=1
CONFIG_SERIAL_8250_RUNTIME_UARTS=1
# CONFIG_SERIAL_8250_EXTENDED is not set
CONFIG_SERIAL_8250_FSL=y
# CONFIG_SERIAL_8250_DW is not set
# CONFIG_SERIAL_8250_RT288X is not set
CONFIG_SERIAL_8250_PERICOM=y
# CONFIG_SERIAL_OF_PLATFORM is not set

#
# Non-8250 serial port support
#
# CONFIG_SERIAL_UARTLITE is not set
CONFIG_SERIAL_CORE=y
CONFIG_SERIAL_CORE_CONSOLE=y
# CONFIG_SERIAL_ICOM is not set
# CONFIG_SERIAL_JSM is not set
# CONFIG_SERIAL_SIFIVE is not set
# CONFIG_SERIAL_SCCNXP is not set
# CONFIG_SERIAL_ALTERA_JTAGUART is not set
# CONFIG_SERIAL_ALTERA_UART is not set
# CONFIG_SERIAL_XILINX_PS_UART is not set
# CONFIG_SERIAL_ARC is not set
# CONFIG_SERIAL_RP2 is not set
# CONFIG_SERIAL_FSL_LPUART is not set
# CONFIG_SERIAL_FSL_LINFLEXUART is not set
# CONFIG_SERIAL_CONEXANT_DIGICOLOR is not set
# end of Serial drivers

# CONFIG_SERIAL_NONSTANDARD is not set
# CONFIG_PPC_EPAPR_HV_BYTECHAN is not set
# CONFIG_NOZOMI is not set
# CONFIG_NULL_TTY is not set
CONFIG_HVC_DRIVER=y
CONFIG_HVC_IRQ=y
CONFIG_HVC_CONSOLE=y
# CONFIG_HVC_OLD_HVSI is not set
# CONFIG_HVC_RTAS is not set
# CONFIG_HVC_UDBG is not set
# CONFIG_HVCS is not set
# CONFIG_SERIAL_DEV_BUS is not set
# CONFIG_VIRTIO_CONSOLE is not set
# CONFIG_IBM_BSR is not set
# CONFIG_IPMI_HANDLER is not set
# CONFIG_HW_RANDOM is not set
# CONFIG_APPLICOM is not set
# CONFIG_DEVMEM is not set
# CONFIG_NVRAM is not set
CONFIG_DEVPORT=y
# CONFIG_HANGCHECK_TIMER is not set
# CONFIG_TCG_TPM is not set
# CONFIG_XILLYBUS is not set
# CONFIG_RANDOM_TRUST_CPU is not set
# CONFIG_RANDOM_TRUST_BOOTLOADER is not set
# end of Character devices

#
# I2C support
#
# CONFIG_I2C is not set
# end of I2C support

# CONFIG_I3C is not set
# CONFIG_SPI is not set
# CONFIG_SPMI is not set
# CONFIG_HSI is not set
# CONFIG_PPS is not set

#
# PTP clock support
#
CONFIG_PTP_1588_CLOCK_OPTIONAL=y

#
# Enable PHYLIB and NETWORK_PHY_TIMESTAMPING to see the additional clocks.
#
# end of PTP clock support

# CONFIG_PINCTRL is not set
# CONFIG_GPIOLIB is not set
# CONFIG_W1 is not set
# CONFIG_POWER_RESET is not set
# CONFIG_POWER_SUPPLY is not set
# CONFIG_HWMON is not set
# CONFIG_THERMAL is not set
# CONFIG_WATCHDOG is not set
CONFIG_SSB_POSSIBLE=y
# CONFIG_SSB is not set
CONFIG_BCMA_POSSIBLE=y
# CONFIG_BCMA is not set

#
# Multifunction device drivers
#
# CONFIG_MFD_ATMEL_FLEXCOM is not set
# CONFIG_MFD_ATMEL_HLCDC is not set
# CONFIG_MFD_MADERA is not set
# CONFIG_MFD_HI6421_PMIC is not set
# CONFIG_HTC_PASIC3 is not set
# CONFIG_LPC_ICH is not set
# CONFIG_LPC_SCH is not set
# CONFIG_MFD_JANZ_CMODIO is not set
# CONFIG_MFD_KEMPLD is not set
# CONFIG_MFD_MT6397 is not set
# CONFIG_MFD_RDC321X is not set
# CONFIG_MFD_SM501 is not set
# CONFIG_MFD_SYSCON is not set
# CONFIG_MFD_TI_AM335X_TSCADC is not set
# CONFIG_MFD_TQMX86 is not set
# CONFIG_MFD_VX855 is not set
# end of Multifunction device drivers

# CONFIG_REGULATOR is not set
# CONFIG_RC_CORE is not set

#
# CEC support
#
# CONFIG_MEDIA_CEC_SUPPORT is not set
# end of CEC support

# CONFIG_MEDIA_SUPPORT is not set

#
# Graphics support
#
# CONFIG_AGP is not set
# CONFIG_DRM is not set

#
# ARM devices
#
# end of ARM devices

#
# Frame buffer Devices
#
# CONFIG_FB is not set
# end of Frame buffer Devices

#
# Backlight & LCD device support
#
# CONFIG_LCD_CLASS_DEVICE is not set
# CONFIG_BACKLIGHT_CLASS_DEVICE is not set
# end of Backlight & LCD device support

#
# Console display driver support
#
# CONFIG_VGA_CONSOLE is not set
CONFIG_DUMMY_CONSOLE=y
CONFIG_DUMMY_CONSOLE_COLUMNS=80
CONFIG_DUMMY_CONSOLE_ROWS=25
# end of Console display driver support
# end of Graphics support

# CONFIG_SOUND is not set

#
# HID support
#
# CONFIG_HID is not set
# end of HID support

CONFIG_USB_OHCI_LITTLE_ENDIAN=y
# CONFIG_USB_SUPPORT is not set
# CONFIG_MMC is not set
# CONFIG_MEMSTICK is not set
# CONFIG_NEW_LEDS is not set
# CONFIG_ACCESSIBILITY is not set
CONFIG_EDAC_ATOMIC_SCRUB=y
CONFIG_EDAC_SUPPORT=y
CONFIG_RTC_LIB=y
# CONFIG_RTC_CLASS is not set
# CONFIG_DMADEVICES is not set

#
# DMABUF options
#
# CONFIG_SYNC_FILE is not set
# CONFIG_DMABUF_HEAPS is not set
# end of DMABUF options

# CONFIG_AUXDISPLAY is not set
# CONFIG_UIO is not set
# CONFIG_VFIO is not set
# CONFIG_VIRT_DRIVERS is not set
# CONFIG_VIRTIO_MENU is not set
# CONFIG_VHOST_MENU is not set

#
# Microsoft Hyper-V guest support
#
# end of Microsoft Hyper-V guest support

# CONFIG_GREYBUS is not set
# CONFIG_COMEDI is not set
# CONFIG_STAGING is not set
# CONFIG_GOLDFISH is not set
# CONFIG_COMMON_CLK is not set
# CONFIG_HWSPINLOCK is not set

#
# Clock Source drivers
#
CONFIG_I8253_LOCK=y
CONFIG_CLKBLD_I8253=y
# CONFIG_MICROCHIP_PIT64B is not set
# end of Clock Source drivers

# CONFIG_MAILBOX is not set
# CONFIG_IOMMU_SUPPORT is not set

#
# Remoteproc drivers
#
# CONFIG_REMOTEPROC is not set
# end of Remoteproc drivers

#
# Rpmsg drivers
#
# CONFIG_RPMSG_VIRTIO is not set
# end of Rpmsg drivers

# CONFIG_SOUNDWIRE is not set

#
# SOC (System On Chip) specific Drivers
#

#
# Amlogic SoC drivers
#
# end of Amlogic SoC drivers

#
# Broadcom SoC drivers
#
# end of Broadcom SoC drivers

#
# NXP/Freescale QorIQ SoC drivers
#
# CONFIG_QUICC_ENGINE is not set
# end of NXP/Freescale QorIQ SoC drivers

#
# i.MX SoC drivers
#
# end of i.MX SoC drivers

#
# Enable LiteX SoC Builder specific drivers
#
# CONFIG_LITEX_SOC_CONTROLLER is not set
# end of Enable LiteX SoC Builder specific drivers

#
# Qualcomm SoC drivers
#
# end of Qualcomm SoC drivers

# CONFIG_SOC_TI is not set

#
# Xilinx SoC drivers
#
# end of Xilinx SoC drivers
# end of SOC (System On Chip) specific Drivers

# CONFIG_PM_DEVFREQ is not set
# CONFIG_EXTCON is not set
# CONFIG_MEMORY is not set
# CONFIG_IIO is not set
# CONFIG_NTB is not set
# CONFIG_VME_BUS is not set
# CONFIG_PWM is not set

#
# IRQ chip support
#
CONFIG_IRQCHIP=y
# CONFIG_AL_FIC is not set
# end of IRQ chip support

# CONFIG_IPACK_BUS is not set
# CONFIG_RESET_CONTROLLER is not set

#
# PHY Subsystem
#
# CONFIG_GENERIC_PHY is not set
# CONFIG_PHY_CAN_TRANSCEIVER is not set

#
# PHY drivers for Broadcom platforms
#
# CONFIG_BCM_KONA_USB2_PHY is not set
# end of PHY drivers for Broadcom platforms

# CONFIG_PHY_CADENCE_DPHY is not set
# CONFIG_PHY_CADENCE_DPHY_RX is not set
# CONFIG_PHY_CADENCE_SALVO is not set
# CONFIG_PHY_PXA_28NM_HSIC is not set
# CONFIG_PHY_PXA_28NM_USB2 is not set
# end of PHY Subsystem

# CONFIG_POWERCAP is not set
# CONFIG_MCB is not set

#
# Performance monitor support
#
# end of Performance monitor support

# CONFIG_RAS is not set
# CONFIG_USB4 is not set

#
# Android
#
CONFIG_ANDROID=y
# CONFIG_ANDROID_BINDER_IPC is not set
# end of Android

# CONFIG_DAX is not set
# CONFIG_NVMEM is not set

#
# HW tracing support
#
# CONFIG_STM is not set
# CONFIG_INTEL_TH is not set
# end of HW tracing support

# CONFIG_FPGA is not set
# CONFIG_FSI is not set
# CONFIG_SIOX is not set
# CONFIG_SLIMBUS is not set
# CONFIG_INTERCONNECT is not set
# CONFIG_COUNTER is not set
# CONFIG_PECI is not set
# end of Device Drivers

#
# File systems
#
CONFIG_DCACHE_WORD_ACCESS=y
# CONFIG_VALIDATE_FS_PARSER is not set
# CONFIG_EXT2_FS is not set
# CONFIG_EXT3_FS is not set
# CONFIG_EXT4_FS is not set
# CONFIG_REISERFS_FS is not set
# CONFIG_JFS_FS is not set
# CONFIG_XFS_FS is not set
# CONFIG_GFS2_FS is not set
# CONFIG_BTRFS_FS is not set
# CONFIG_NILFS2_FS is not set
# CONFIG_F2FS_FS is not set
CONFIG_EXPORTFS=y
# CONFIG_EXPORTFS_BLOCK_OPS is not set
CONFIG_FILE_LOCKING=y
# CONFIG_FS_ENCRYPTION is not set
# CONFIG_FS_VERITY is not set
# CONFIG_DNOTIFY is not set
# CONFIG_INOTIFY_USER is not set
# CONFIG_FANOTIFY is not set
# CONFIG_QUOTA is not set
# CONFIG_AUTOFS4_FS is not set
# CONFIG_AUTOFS_FS is not set
# CONFIG_FUSE_FS is not set
# CONFIG_OVERLAY_FS is not set

#
# Caches
#
# CONFIG_FSCACHE is not set
# end of Caches

#
# CD-ROM/DVD Filesystems
#
# CONFIG_ISO9660_FS is not set
# CONFIG_UDF_FS is not set
# end of CD-ROM/DVD Filesystems

#
# DOS/FAT/EXFAT/NT Filesystems
#
# CONFIG_MSDOS_FS is not set
# CONFIG_VFAT_FS is not set
# CONFIG_EXFAT_FS is not set
# CONFIG_NTFS_FS is not set
# CONFIG_NTFS3_FS is not set
# end of DOS/FAT/EXFAT/NT Filesystems

#
# Pseudo filesystems
#
CONFIG_PROC_FS=y
# CONFIG_PROC_KCORE is not set
CONFIG_PROC_SYSCTL=y
CONFIG_PROC_PAGE_MONITOR=y
# CONFIG_PROC_CHILDREN is not set
CONFIG_KERNFS=y
CONFIG_SYSFS=y
# CONFIG_TMPFS is not set
CONFIG_ARCH_SUPPORTS_HUGETLBFS=y
# CONFIG_HUGETLBFS is not set
CONFIG_ARCH_HAS_GIGANTIC_PAGE=y
# CONFIG_CONFIGFS_FS is not set
# end of Pseudo filesystems

# CONFIG_MISC_FILESYSTEMS is not set
# CONFIG_NLS is not set
# CONFIG_UNICODE is not set
CONFIG_IO_WQ=y
# end of File systems

#
# Security options
#
# CONFIG_KEYS is not set
# CONFIG_SECURITY_DMESG_RESTRICT is not set
# CONFIG_SECURITY is not set
# CONFIG_SECURITYFS is not set
CONFIG_HAVE_HARDENED_USERCOPY_ALLOCATOR=y
CONFIG_HARDENED_USERCOPY=y
CONFIG_FORTIFY_SOURCE=y
# CONFIG_STATIC_USERMODEHELPER is not set
CONFIG_DEFAULT_SECURITY_DAC=y
CONFIG_LSM="lockdown,yama,loadpin,safesetid,integrity,selinux,smack,tomoyo,apparmor,bpf"

#
# Kernel hardening options
#

#
# Memory initialization
#
CONFIG_INIT_STACK_NONE=y
CONFIG_INIT_ON_ALLOC_DEFAULT_ON=y
CONFIG_INIT_ON_FREE_DEFAULT_ON=y
# end of Memory initialization
# end of Kernel hardening options
# end of Security options

# CONFIG_CRYPTO is not set

#
# Library routines
#
# CONFIG_PACKING is not set
CONFIG_BITREVERSE=y
CONFIG_GENERIC_STRNCPY_FROM_USER=y
CONFIG_GENERIC_STRNLEN_USER=y
# CONFIG_CORDIC is not set
# CONFIG_PRIME_NUMBERS is not set
CONFIG_GENERIC_PCI_IOMAP=y
CONFIG_ARCH_USE_CMPXCHG_LOCKREF=y
CONFIG_ARCH_HAS_FAST_MULTIPLIER=y

#
# Crypto library routines
#
CONFIG_CRYPTO_LIB_BLAKE2S_GENERIC=y
# CONFIG_CRYPTO_LIB_CURVE25519 is not set
CONFIG_CRYPTO_LIB_POLY1305_RSIZE=1
# CONFIG_CRYPTO_LIB_POLY1305 is not set
# end of Crypto library routines

# CONFIG_CRC_CCITT is not set
# CONFIG_CRC16 is not set
# CONFIG_CRC_T10DIF is not set
# CONFIG_CRC64_ROCKSOFT is not set
# CONFIG_CRC_ITU_T is not set
CONFIG_CRC32=y
# CONFIG_CRC32_SELFTEST is not set
CONFIG_CRC32_SLICEBY8=y
# CONFIG_CRC32_SLICEBY4 is not set
# CONFIG_CRC32_SARWATE is not set
# CONFIG_CRC32_BIT is not set
# CONFIG_CRC64 is not set
# CONFIG_CRC4 is not set
# CONFIG_CRC7 is not set
# CONFIG_LIBCRC32C is not set
# CONFIG_CRC8 is not set
# CONFIG_RANDOM32_SELFTEST is not set
CONFIG_ZLIB_DEFLATE=y
# CONFIG_XZ_DEC is not set
CONFIG_XARRAY_MULTI=y
CONFIG_HAS_IOMEM=y
CONFIG_HAS_IOPORT_MAP=y
CONFIG_HAS_DMA=y
CONFIG_DMA_OPS=y
CONFIG_DMA_OPS_BYPASS=y
CONFIG_ARCH_HAS_DMA_MAP_DIRECT=y
CONFIG_NEED_SG_DMA_LENGTH=y
CONFIG_NEED_DMA_MAP_STATE=y
CONFIG_ARCH_DMA_ADDR_T_64BIT=y
CONFIG_DMA_DECLARE_COHERENT=y
CONFIG_SWIOTLB=y
# CONFIG_DMA_RESTRICTED_POOL is not set
# CONFIG_DMA_API_DEBUG is not set
CONFIG_IOMMU_HELPER=y
# CONFIG_IRQ_POLL is not set
CONFIG_LIBFDT=y
CONFIG_HAVE_GENERIC_VDSO=y
CONFIG_GENERIC_GETTIMEOFDAY=y
CONFIG_GENERIC_VDSO_TIME_NS=y
CONFIG_ARCH_HAS_PMEM_API=y
CONFIG_ARCH_HAS_MEMREMAP_COMPAT_ALIGN=y
CONFIG_ARCH_HAS_UACCESS_FLUSHCACHE=y
CONFIG_ARCH_HAS_COPY_MC=y
CONFIG_ARCH_STACKWALK=y
CONFIG_SBITMAP=y
# end of Library routines

#
# Kernel hacking
#

#
# printk and dmesg options
#
CONFIG_PRINTK_TIME=y
# CONFIG_PRINTK_CALLER is not set
# CONFIG_STACKTRACE_BUILD_ID is not set
CONFIG_CONSOLE_LOGLEVEL_DEFAULT=7
CONFIG_CONSOLE_LOGLEVEL_QUIET=4
CONFIG_MESSAGE_LOGLEVEL_DEFAULT=4
# CONFIG_DYNAMIC_DEBUG is not set
# CONFIG_DYNAMIC_DEBUG_CORE is not set
CONFIG_SYMBOLIC_ERRNAME=y
CONFIG_DEBUG_BUGVERBOSE=y
# end of printk and dmesg options

# CONFIG_DEBUG_KERNEL is not set

#
# Compile-time checks and compiler options
#
CONFIG_FRAME_WARN=2048
# CONFIG_STRIP_ASM_SYMS is not set
# CONFIG_HEADERS_INSTALL is not set
# CONFIG_DEBUG_SECTION_MISMATCH is not set
# CONFIG_SECTION_MISMATCH_WARN_ONLY is not set
# end of Compile-time checks and compiler options

#
# Generic Kernel Debugging Instruments
#
# CONFIG_MAGIC_SYSRQ is not set
# CONFIG_DEBUG_FS is not set
CONFIG_HAVE_ARCH_KGDB=y
CONFIG_ARCH_HAS_UBSAN_SANITIZE_ALL=y
# CONFIG_UBSAN is not set
# end of Generic Kernel Debugging Instruments

#
# Networking Debugging
#
# end of Networking Debugging

#
# Memory Debugging
#
# CONFIG_PAGE_EXTENSION is not set
# CONFIG_PAGE_POISONING is not set
# CONFIG_DEBUG_RODATA_TEST is not set
CONFIG_ARCH_HAS_DEBUG_WX=y
# CONFIG_DEBUG_WX is not set
CONFIG_GENERIC_PTDUMP=y
# CONFIG_SLUB_DEBUG_ON is not set
# CONFIG_SLUB_STATS is not set
CONFIG_HAVE_DEBUG_KMEMLEAK=y
CONFIG_ARCH_HAS_DEBUG_VM_PGTABLE=y
# CONFIG_DEBUG_VM_PGTABLE is not set
CONFIG_ARCH_HAS_DEBUG_VIRTUAL=y
CONFIG_DEBUG_MEMORY_INIT=y
CONFIG_HAVE_DEBUG_STACKOVERFLOW=y
CONFIG_CC_HAS_KASAN_GENERIC=y
CONFIG_CC_HAS_WORKING_NOSANITIZE_ADDRESS=y
# end of Memory Debugging

#
# Debug Oops, Lockups and Hangs
#
CONFIG_PANIC_ON_OOPS=y
CONFIG_PANIC_ON_OOPS_VALUE=1
# CONFIG_TEST_LOCKUP is not set
# end of Debug Oops, Lockups and Hangs

#
# Scheduler Debugging
#
# end of Scheduler Debugging

# CONFIG_DEBUG_TIMEKEEPING is not set

#
# Lock Debugging (spinlocks, mutexes, etc...)
#
CONFIG_LOCK_DEBUGGING_SUPPORT=y
# CONFIG_WW_MUTEX_SELFTEST is not set
# end of Lock Debugging (spinlocks, mutexes, etc...)

# CONFIG_DEBUG_IRQFLAGS is not set
# CONFIG_STACKTRACE is not set
# CONFIG_WARN_ALL_UNSEEDED_RANDOM is not set

#
# Debug kernel data structures
#
# CONFIG_BUG_ON_DATA_CORRUPTION is not set
# end of Debug kernel data structures

#
# RCU Debugging
#
CONFIG_RCU_CPU_STALL_TIMEOUT=21
# end of RCU Debugging

CONFIG_HAVE_FUNCTION_TRACER=y
CONFIG_HAVE_FUNCTION_GRAPH_TRACER=y
CONFIG_HAVE_DYNAMIC_FTRACE=y
CONFIG_HAVE_FTRACE_MCOUNT_RECORD=y
CONFIG_HAVE_SYSCALL_TRACEPOINTS=y
CONFIG_HAVE_C_RECORDMCOUNT=y
CONFIG_TRACING_SUPPORT=y
# CONFIG_FTRACE is not set
CONFIG_SAMPLES=y
# CONFIG_SAMPLE_AUXDISPLAY is not set
# CONFIG_SAMPLE_KOBJECT is not set
# CONFIG_SAMPLE_HW_BREAKPOINT is not set
# CONFIG_SAMPLE_KFIFO is not set
# CONFIG_SAMPLE_WATCHDOG is not set
CONFIG_ARCH_HAS_DEVMEM_IS_ALLOWED=y
# CONFIG_STRICT_DEVMEM is not set

#
# powerpc Debugging
#
# CONFIG_PPC_DISABLE_WERROR is not set
CONFIG_PPC_WERROR=y
CONFIG_PRINT_STACK_DEPTH=64
# CONFIG_JUMP_LABEL_FEATURE_CHECKS is not set
# CONFIG_PPC_IRQ_SOFT_MASK_DEBUG is not set
# CONFIG_PPC_RFI_SRR_DEBUG is not set
# CONFIG_BOOTX_TEXT is not set
# CONFIG_PPC_EARLY_DEBUG is not set
# end of powerpc Debugging

#
# Kernel Testing and Coverage
#
# CONFIG_KUNIT is not set
CONFIG_ARCH_HAS_KCOV=y
CONFIG_CC_HAS_SANCOV_TRACE_PC=y
# CONFIG_KCOV is not set
# CONFIG_RUNTIME_TESTING_MENU is not set
CONFIG_ARCH_USE_MEMTEST=y
# CONFIG_MEMTEST is not set
# end of Kernel Testing and Coverage
# end of Kernel hacking

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: rcu_sched self-detected stall on CPU
  2022-04-05 21:41 rcu_sched self-detected stall on CPU Miguel Ojeda
@ 2022-04-06  9:31   ` Zhouyi Zhou
  0 siblings, 0 replies; 56+ messages in thread
From: Zhouyi Zhou @ 2022-04-06  9:31 UTC (permalink / raw)
  To: Miguel Ojeda; +Cc: rcu, linuxppc-dev

Hi

I can reproduce it in a ppc virtual cloud server provided by Oregon
State University.  Here is what I did:
1) curl -l https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/snapshot/linux-5.18-rc1.tar.gz -o linux-5.18-rc1.tar.gz
2) tar zxf linux-5.18-rc1.tar.gz
3) cp config linux-5.18-rc1/.config
4) cd linux-5.18-rc1
5) make vmlinux -j 8
6) qemu-system-ppc64 -kernel vmlinux -nographic -vga none -no-reboot -smp 2 (QEMU 4.2.1)
7) After 12 rounds, the bug reproduced:
(http://154.223.142.244/logs/20220406/qemu.log.txt)
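
Since the stall only shows up in some boots, the "rounds" in step 7 can be
automated. The following is a minimal sketch, not the script actually used
here: it reuses the QEMU command line from step 6, while the 120-second
timeout, the function names, and the idea of matching on the stall banner
are assumptions made for illustration.

```python
import subprocess
from typing import Callable, Optional

# Banner printed by the kernel when the stall is detected (taken from the log).
STALL_MARKER = "rcu_sched self-detected stall on CPU"

# Same command line as step 6; assumes vmlinux is in the current directory.
QEMU_CMD = [
    "qemu-system-ppc64", "-kernel", "vmlinux",
    "-nographic", "-vga", "none", "-no-reboot", "-smp", "2",
]


def boot_once(timeout_s: float = 120.0) -> str:
    """Boot the kernel once and return the console output.

    The timeout is an assumed value; a boot that never exits is killed
    and whatever was printed so far is returned.
    """
    try:
        done = subprocess.run(
            QEMU_CMD,
            stdin=subprocess.DEVNULL,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            timeout=timeout_s,
        )
        raw = done.stdout
    except subprocess.TimeoutExpired as exc:
        raw = exc.stdout or b""
    return raw.decode("utf-8", errors="replace")


def rounds_until_stall(run_once: Callable[[], str],
                       max_rounds: int) -> Optional[int]:
    """Return the 1-based round in which the stall banner appeared, or None."""
    for round_no in range(1, max_rounds + 1):
        if STALL_MARKER in run_once():
            return round_no
    return None
```

Calling rounds_until_stall(boot_once, 20) would then report how many boots
it took, or None if the stall never appeared within 20 rounds.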

Cheers ;-)
Zhouyi

On Wed, Apr 6, 2022 at 3:47 PM Miguel Ojeda
<miguel.ojeda.sandonis@gmail.com> wrote:
>
> Hi PPC/RCU,
>
> While merging v5.18-rc1 changes I noticed our CI PPC runs broke. I
> reproduced the problem in v5.18-rc1 as well as next-20220405, under
> both QEMU 4.2.1 and 6.1.0, with `-smp 2`; but I cannot reproduce it in
> v5.17 from a few tries.
>
> Sadly, the problem is not deterministic although it is not too hard to
> reproduce (1 out of 5?). Please see attached config and QEMU output.
>
> Cheers,
> Miguel

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: rcu_sched self-detected stall on CPU
  2022-04-06  9:31   ` Zhouyi Zhou
@ 2022-04-06 17:00     ` Paul E. McKenney
  -1 siblings, 0 replies; 56+ messages in thread
From: Paul E. McKenney @ 2022-04-06 17:00 UTC (permalink / raw)
  To: Zhouyi Zhou; +Cc: rcu, Miguel Ojeda, linuxppc-dev

On Wed, Apr 06, 2022 at 05:31:10PM +0800, Zhouyi Zhou wrote:
> Hi
> 
> I can reproduce it in a ppc virtual cloud server provided by Oregon
> State University.  Following is what I do:
> 1) curl -l https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/snapshot/linux-5.18-rc1.tar.gz
> -o linux-5.18-rc1.tar.gz
> 2) tar zxf linux-5.18-rc1.tar.gz
> 3) cp config linux-5.18-rc1/.config
> 4) cd linux-5.18-rc1
> 5) make vmlinux -j 8
> 6) qemu-system-ppc64 -kernel vmlinux -nographic -vga none -no-reboot
> -smp 2 (QEMU 4.2.1)
> 7) after 12 rounds, the bug got reproduced:
> (http://154.223.142.244/logs/20220406/qemu.log.txt)

Just to make sure, are you both seeing the same thing?  Last I knew,
Zhouyi was chasing an RCU-tasks issue that appears only in kernels
built with CONFIG_PROVE_RCU=y, which Miguel does not have set.  Or did
I miss something?
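
One quick way to check which options a given .config actually sets is to
parse it. The sketch below is only an illustration (the function names are
made up here); it treats an option that is absent from the file the same as
one marked "is not set", which is how CONFIG_PROVE_RCU would appear when its
parent menu is disabled.

```python
import re
from typing import Dict, Optional

_SET = re.compile(r"^(CONFIG_[A-Za-z0-9_]+)=(.*)$")
_UNSET = re.compile(r"^# (CONFIG_[A-Za-z0-9_]+) is not set$")


def parse_kconfig(text: str) -> Dict[str, Optional[str]]:
    """Map each option to its value, or to None when explicitly not set."""
    options: Dict[str, Optional[str]] = {}
    for line in text.splitlines():
        line = line.strip()
        match = _SET.match(line)
        if match:
            options[match.group(1)] = match.group(2)
            continue
        match = _UNSET.match(line)
        if match:
            options[match.group(1)] = None
    return options


def is_enabled(options: Dict[str, Optional[str]], name: str) -> bool:
    """True only for =y or =m; absent and 'is not set' both count as off."""
    return options.get(name) in ("y", "m")
```

For example, is_enabled(parse_kconfig(open(".config").read()),
"CONFIG_PROVE_RCU") answers the question above for a given build tree.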

Miguel is instead seeing an RCU CPU stall warning where RCU's grace-period
kthread slept for three milliseconds, but did not wake up for more than
20 seconds.  This kthread would normally have awakened on CPU 1, but
CPU 1 looks to me to be very unhealthy, as can be seen in your console
output below (but maybe my idea of what is healthy for powerpc systems
is outdated).  Please see also the inline annotations.

Thoughts from the PPC guys?

							Thanx, Paul

------------------------------------------------------------------------

[   21.186912] rcu: INFO: rcu_sched self-detected stall on CPU
[   21.187331] rcu: 	1-...!: (4712629 ticks this GP) idle=2c1/0/0x3 softirq=8/8 fqs=0 
[   21.187529] 	(t=21000 jiffies g=-1183 q=3)
[   21.187681] rcu: rcu_sched kthread timer wakeup didn't happen for 20997 jiffies! g-1183 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402

	The grace-period kthread is still asleep (->state=0x402).
	This indicates that the three-jiffy timer has somehow been
	prevented from expiring for almost a full 21 seconds.  Of course,
	if timers don't work, RCU cannot work.
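
For readers who want to decode that line themselves, here is a small
sketch. It makes two assumptions that are not stated in the log: HZ=1000
(inferred from t=21000 jiffies against the 21-second stall timeout), and
the task-state bit values as defined in include/linux/sched.h around v5.18,
of which only a subset is listed. The function names are hypothetical.

```python
import re
from typing import Dict, List, Union

ASSUMED_HZ = 1000  # assumption: inferred from the log, not read from .config

# Subset of task-state bits (assumed to match v5.18 include/linux/sched.h).
TASK_STATE_BITS = {
    0x0001: "TASK_INTERRUPTIBLE",
    0x0002: "TASK_UNINTERRUPTIBLE",
    0x0100: "TASK_WAKEKILL",
    0x0200: "TASK_WAKING",
    0x0400: "TASK_NOLOAD",
    0x0800: "TASK_NEW",
}

_GP_LINE = re.compile(
    r"for (\d+) jiffies! g(-?\d+) f(0x[0-9a-f]+) (\w+)\((\d+)\) "
    r"->state=(0x[0-9a-f]+)"
)


def parse_gp_kthread_line(line: str) -> Dict[str, Union[int, str]]:
    """Pull the numeric fields out of the 'kthread ... jiffies!' line."""
    match = _GP_LINE.search(line)
    if match is None:
        raise ValueError("not a grace-period kthread stall line")
    return {
        "jiffies": int(match.group(1)),
        "gp_seq": int(match.group(2)),
        "gp_flags": int(match.group(3), 16),
        "gp_state": match.group(4),
        "task_state": int(match.group(6), 16),
    }


def decode_task_state(state: int) -> List[str]:
    """Name the bits set in ->state; unknown bits are reported in hex."""
    names = [name for bit, name in sorted(TASK_STATE_BITS.items()) if state & bit]
    leftover = state & ~sum(TASK_STATE_BITS)
    if leftover:
        names.append(hex(leftover))
    return names


def jiffies_to_seconds(jiffies: int, hz: int = ASSUMED_HZ) -> float:
    return jiffies / hz
```

Under those assumptions, 0x402 decodes to TASK_UNINTERRUPTIBLE plus
TASK_NOLOAD, which is the idle-sleep combination, and 20997 jiffies comes
out to roughly 21 seconds.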

[   21.187770] rcu: 	Possible timer handling issue on cpu=1 timer-softirq=1
[   21.187927] rcu: rcu_sched kthread starved for 21001 jiffies! g-1183 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 ->cpu=1
[   21.188019] rcu: 	Unless rcu_sched kthread gets sufficient CPU time, OOM is now expected behavior.
[   21.188087] rcu: RCU grace-period kthread stack dump:
[   21.188196] task:rcu_sched       state:I stack:    0 pid:   10 ppid:     2 flags:0x00000800
[   21.188453] Call Trace:
[   21.188525] [c0000000061e78a0] [c0000000061e78e0] 0xc0000000061e78e0 (unreliable)
[   21.188900] [c0000000061e7a90] [c000000000017210] __switch_to+0x250/0x310
[   21.189210] [c0000000061e7b00] [c0000000003ed660] __schedule+0x210/0x660
[   21.189315] [c0000000061e7b80] [c0000000003edb14] schedule+0x64/0x110
[   21.189387] [c0000000061e7bb0] [c0000000003f6648] schedule_timeout+0x1d8/0x390
[   21.189473] [c0000000061e7c80] [c00000000011111c] rcu_gp_fqs_loop+0x2dc/0x3d0
[   21.189555] [c0000000061e7d30] [c0000000001144ec] rcu_gp_kthread+0x13c/0x160
[   21.189633] [c0000000061e7dc0] [c0000000000c1770] kthread+0x110/0x120
[   21.189714] [c0000000061e7e10] [c00000000000c9e4] ret_from_kernel_thread+0x5c/0x64

	The above stack trace is expected behavior when the RCU
	grace-period kthread is waiting to do its next FQS scan.

[   21.189938] rcu: Stack dump where RCU GP kthread last ran:

	And here is the stalled CPU, which also happens to be the CPU
	that RCU last ran on:

[   21.189992] Task dump for CPU 1:
[   21.190059] task:swapper/1       state:R  running task     stack:    0 pid:    0 ppid:     1 flags:0x00000804
[   21.190169] Call Trace:
[   21.190194] [c0000000061ef2d0] [c0000000000c9a40] sched_show_task+0x180/0x1c0 (unreliable)
[   21.190278] [c0000000061ef340] [c000000000116ca0] rcu_check_gp_kthread_starvation+0x16c/0x19c
[   21.190370] [c0000000061ef3c0] [c000000000114f7c] rcu_sched_clock_irq+0x7ec/0xaf0
[   21.190448] [c0000000061ef4b0] [c000000000120fdc] update_process_times+0xbc/0x140
[   21.190524] [c0000000061ef4f0] [c000000000136a24] tick_nohz_handler+0xf4/0x1b0
[   21.190608] [c0000000061ef540] [c00000000001c828] timer_interrupt+0x148/0x2d0
[   21.190699] [c0000000061ef590] [c0000000000098e8] decrementer_common_virt+0x208/0x210
[   21.190837] --- interrupt: 900 at arch_local_irq_restore+0x168/0x170

	Up through this point is just the stack trace of the
	code doing the stack dump that the RCU CPU stall warning code
	asked for.

[   21.190941] NIP:  c000000000013608 LR: c0000000003f8114 CTR: c0000000000dc630

	This NIP does not look at all good to me.  But I freely confess
	that I am out of date on what Power machines do.

[   21.191031] REGS: c0000000061ef600 TRAP: 0900   Not tainted  (5.18.0-rc1)
[   21.191109] MSR:  8000000000009033 <SF,EE,ME,IR,DR,RI,LE>  CR: 22000202  XER: 00000000
[   21.191274] CFAR: 0000000000000000 IRQMASK: 0 
[   21.191274] GPR00: c00000000009c368 c0000000061ef8a0 c00000000116a700 0000000000000000 
[   21.191274] GPR04: 0000000000000000 0000000000000000 000000001ee30000 ffffffffffffffff 
[   21.191274] GPR08: 000000001ee30000 0000000000000000 0000000000008002 7265677368657265 
[   21.191274] GPR12: c0000000000dc630 c00000001ffe5800 0000000000000000 0000000000000000 
[   21.191274] GPR16: 0000000000000282 0000000000000000 0000000000000000 c0000000061eff00 
[   21.191274] GPR20: 0000000000000000 0000000000000001 c0000000061b9f80 c000000001195a10 
[   21.191274] GPR24: c000000001193a00 00000000fffb6cc4 000000000000000a c0000000010721e8 
[   21.191274] GPR28: c000000001076800 c000000001070380 c0000000010716d8 c0000000061b9f80 
[   21.191932] NIP [c000000000013608] arch_local_irq_restore+0x168/0x170
[   21.192024] LR [c0000000003f8114] __do_softirq+0xd4/0x2ec
[   21.192118] --- interrupt: 900
[   21.192158] [c0000000061ef8a0] [c0000000061b9f80] 0xc0000000061b9f80 (unreliable)
[   21.192227] [c0000000061ef9b0] [c00000000009c368] irq_exit+0xc8/0x110
[   21.192307] [c0000000061ef9d0] [c00000000001c858] timer_interrupt+0x178/0x2d0
[   21.192397] [c0000000061efa20] [c0000000000098e8] decrementer_common_virt+0x208/0x210
[   21.192495] --- interrupt: 900 at plpar_hcall_norets_notrace+0x18/0x2c
[   21.192566] NIP:  c000000000072988 LR: c000000000074fa8 CTR: c000000000074f10
[   21.192615] REGS: c0000000061efa90 TRAP: 0900   Not tainted  (5.18.0-rc1)
[   21.192659] MSR:  8000000000009033 <SF,EE,ME,IR,DR,RI,LE>  CR: 28000202  XER: 00000000
[   21.192755] CFAR: 0000000000000000 IRQMASK: 0 
[   21.192755] GPR00: 0000000028000202 c0000000061efd30 c00000000116a700 0000000000000000 
[   21.192755] GPR04: c00000001fea0280 ffffffffffffffff 0000000001f40000 000000019d088fcf 
[   21.192755] GPR08: 000000001ee30000 c00000001ffe5400 0000000000000001 0000000100000000 
[   21.192755] GPR12: c000000000074f10 c00000001ffe5800 0000000000000000 0000000000000000 
[   21.192755] GPR16: 0000000000000000 0000000000000000 0000000000000000 c0000000061eff00 
[   21.192755] GPR20: c00000000003d440 0000000000000001 c000000001195b30 c000000001195a10 
[   21.192755] GPR24: 0000000000080000 c0000000061ba000 c000000001195a98 0000000000000001 
[   21.192755] GPR28: 0000000000000001 c0000000010716d0 c0000000010716d8 c0000000010716d0 
[   21.193290] NIP [c000000000072988] plpar_hcall_norets_notrace+0x18/0x2c
[   21.193363] LR [c000000000074fa8] pseries_lpar_idle+0x98/0x1b0
[   21.193428] --- interrupt: 900
[   21.193457] [c0000000061efd30] [0000000000000001] 0x1 (unreliable)
[   21.193512] [c0000000061efdb0] [c000000000018b54] arch_cpu_idle+0x44/0x180
[   21.193590] [c0000000061efde0] [c0000000003f75bc] default_idle_call+0x4c/0x7c
[   21.193679] [c0000000061efe00] [c0000000000e1384] do_idle+0x114/0x1e0
[   21.193747] [c0000000061efe60] [c0000000000e1664] cpu_startup_entry+0x34/0x40
[   21.193901] [c0000000061efe90] [c00000000003f044] start_secondary+0x624/0xa00
[   21.194002] [c0000000061eff90] [c00000000000cd54] start_secondary_prolog+0x10/0x14
[   21.194245] Task dump for CPU 1:
[   21.194284] task:swapper/1       state:R  running task     stack:    0 pid:    0 ppid:     1 flags:0x00000804
[   21.194374] Call Trace:
[   21.194400] [c0000000061ef2b0] [c0000000000c9a40] sched_show_task+0x180/0x1c0 (unreliable)
[   21.194479] [c0000000061ef320] [c000000000116df8] rcu_dump_cpu_stacks+0x128/0x188
[   21.194567] [c0000000061ef3c0] [c000000000114f9c] rcu_sched_clock_irq+0x80c/0xaf0
[   21.194642] [c0000000061ef4b0] [c000000000120fdc] update_process_times+0xbc/0x140
[   21.194712] [c0000000061ef4f0] [c000000000136a24] tick_nohz_handler+0xf4/0x1b0
[   21.194828] [c0000000061ef540] [c00000000001c828] timer_interrupt+0x148/0x2d0
[   21.194942] [c0000000061ef590] [c0000000000098e8] decrementer_common_virt+0x208/0x210
[   21.195035] --- interrupt: 900 at arch_local_irq_restore+0x168/0x170
[   21.195104] NIP:  c000000000013608 LR: c0000000003f8114 CTR: c0000000000dc630
[   21.195152] REGS: c0000000061ef600 TRAP: 0900   Not tainted  (5.18.0-rc1)
[   21.195199] MSR:  8000000000009033 <SF,EE,ME,IR,DR,RI,LE>  CR: 22000202  XER: 00000000
[   21.195296] CFAR: 0000000000000000 IRQMASK: 0 
[   21.195296] GPR00: c00000000009c368 c0000000061ef8a0 c00000000116a700 0000000000000000 
[   21.195296] GPR04: 0000000000000000 0000000000000000 000000001ee30000 ffffffffffffffff 
[   21.195296] GPR08: 000000001ee30000 0000000000000000 0000000000008002 7265677368657265 
[   21.195296] GPR12: c0000000000dc630 c00000001ffe5800 0000000000000000 0000000000000000 
[   21.195296] GPR16: 0000000000000282 0000000000000000 0000000000000000 c0000000061eff00 
[   21.195296] GPR20: 0000000000000000 0000000000000001 c0000000061b9f80 c000000001195a10 
[   21.195296] GPR24: c000000001193a00 00000000fffb6cc4 000000000000000a c0000000010721e8 
[   21.195296] GPR28: c000000001076800 c000000001070380 c0000000010716d8 c0000000061b9f80 
[   21.195850] NIP [c000000000013608] arch_local_irq_restore+0x168/0x170
[   21.195944] LR [c0000000003f8114] __do_softirq+0xd4/0x2ec
[   21.196027] --- interrupt: 900
[   21.196056] [c0000000061ef8a0] [c0000000061b9f80] 0xc0000000061b9f80 (unreliable)
[   21.196119] [c0000000061ef9b0] [c00000000009c368] irq_exit+0xc8/0x110
[   21.196192] [c0000000061ef9d0] [c00000000001c858] timer_interrupt+0x178/0x2d0
[   21.196282] [c0000000061efa20] [c0000000000098e8] decrementer_common_virt+0x208/0x210
[   21.196373] --- interrupt: 900 at plpar_hcall_norets_notrace+0x18/0x2c
[   21.196439] NIP:  c000000000072988 LR: c000000000074fa8 CTR: c000000000074f10
[   21.196489] REGS: c0000000061efa90 TRAP: 0900   Not tainted  (5.18.0-rc1)
[   21.196534] MSR:  8000000000009033 <SF,EE,ME,IR,DR,RI,LE>  CR: 28000202  XER: 00000000
[   21.196627] CFAR: 0000000000000000 IRQMASK: 0 
[   21.196627] GPR00: 0000000028000202 c0000000061efd30 c00000000116a700 0000000000000000 
[   21.196627] GPR04: c00000001fea0280 ffffffffffffffff 0000000001f40000 000000019d088fcf 
[   21.196627] GPR08: 000000001ee30000 c00000001ffe5400 0000000000000001 0000000100000000 
[   21.196627] GPR12: c000000000074f10 c00000001ffe5800 0000000000000000 0000000000000000 
[   21.196627] GPR16: 0000000000000000 0000000000000000 0000000000000000 c0000000061eff00 
[   21.196627] GPR20: c00000000003d440 0000000000000001 c000000001195b30 c000000001195a10 
[   21.196627] GPR24: 0000000000080000 c0000000061ba000 c000000001195a98 0000000000000001 
[   21.196627] GPR28: 0000000000000001 c0000000010716d0 c0000000010716d8 c0000000010716d0 
[   21.197168] NIP [c000000000072988] plpar_hcall_norets_notrace+0x18/0x2c
[   21.197239] LR [c000000000074fa8] pseries_lpar_idle+0x98/0x1b0
[   21.197305] --- interrupt: 900
[   21.197333] [c0000000061efd30] [0000000000000001] 0x1 (unreliable)
[   21.197390] [c0000000061efdb0] [c000000000018b54] arch_cpu_idle+0x44/0x180
[   21.197470] [c0000000061efde0] [c0000000003f75bc] default_idle_call+0x4c/0x7c
[   21.197556] [c0000000061efe00] [c0000000000e1384] do_idle+0x114/0x1e0
[   21.197620] [c0000000061efe60] [c0000000000e1664] cpu_startup_entry+0x34/0x40
[   21.197696] [c0000000061efe90] [c00000000003f044] start_secondary+0x624/0xa00
[   21.197820] [c0000000061eff90] [c00000000000cd54] start_secondary_prolog+0x10/0x14
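
For anyone reading the register dumps above without a powerpc reference
at hand, the following is a rough sketch of how the hex MSR value maps
to the flag names printed next to it.  The bit positions are a subset
recalled from arch/powerpc/include/asm/reg.h for 64-bit Book3S and are
an assumption to be checked against the tree; decode_msr() is a
hypothetical helper, not kernel code.

```python
# (bit position, name) pairs, highest bit first; assumed subset of the MSR.
MSR_BITS = [
    (63, "SF"),  # 64-bit mode
    (60, "HV"),  # hypervisor state
    (15, "EE"),  # external interrupts enabled
    (14, "PR"),  # problem (user) state
    (13, "FP"),  # floating point available
    (12, "ME"),  # machine check enabled
    (5, "IR"),   # instruction relocation
    (4, "DR"),   # data relocation
    (1, "RI"),   # recoverable interrupt
    (0, "LE"),   # little-endian
]


def decode_msr(msr):
    """Return the names of the known MSR bits that are set, high bit first."""
    return [name for pos, name in MSR_BITS if msr & (1 << pos)]


flags = decode_msr(0x8000000000009033)
print(",".join(flags))  # SF,EE,ME,IR,DR,RI,LE
```

Both interrupted frames in the dump carry the same MSR.  With EE set
and PR clear, each was running in kernel mode with external interrupts
enabled in hardware at the time the decrementer interrupt arrived.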

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: rcu_sched self-detected stall on CPU
  2022-04-06 17:00     ` Paul E. McKenney
@ 2022-04-06 18:25       ` Zhouyi Zhou
  -1 siblings, 0 replies; 56+ messages in thread
From: Zhouyi Zhou @ 2022-04-06 18:25 UTC (permalink / raw)
  To: Paul E. McKenney, Zhouyi Zhou; +Cc: rcu, Miguel Ojeda, linuxppc-dev

Hi Paul

On Thu, Apr 7, 2022 at 1:00 AM Paul E. McKenney <paulmck@kernel.org> wrote:
>
> On Wed, Apr 06, 2022 at 05:31:10PM +0800, Zhouyi Zhou wrote:
> > Hi
> >
> > I can reproduce it in a ppc virtual cloud server provided by Oregon
> > State University.  Following is what I do:
> > 1) curl -l https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/snapshot/linux-5.18-rc1.tar.gz
> > -o linux-5.18-rc1.tar.gz
> > 2) tar zxf linux-5.18-rc1.tar.gz
> > 3) cp config linux-5.18-rc1/.config
> > 4) cd linux-5.18-rc1
> > 5) make vmlinux -j 8
> > 6) qemu-system-ppc64 -kernel vmlinux -nographic -vga none -no-reboot
> > -smp 2 (QEMU 4.2.1)
> > 7) after 12 rounds, the bug got reproduced:
> > (http://154.223.142.244/logs/20220406/qemu.log.txt)
>
> Just to make sure, are you both seeing the same thing?  Last I knew,
> Zhouyi was chasing an RCU-tasks issue that appears only in kernels
> built with CONFIG_PROVE_RCU=y, which Miguel does not have set.  Or did
> I miss something?
Yes, we are both seeing the same thing; I am working on two issues in
parallel.
1) I am chasing the RCU-tasks issue and will report my findings to you
later.
2) I am reproducing the RCU CPU stall issue that Miguel reported
yesterday. Luckily, I can reproduce it, thanks to Oregon State
University for providing the environment! I am also very interested in
helping to track down the cause. Fortunately, the issue can be
reproduced in a QEMU environment without hardware acceleration, so I
can lend a hand.
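
Since the stall only shows up in some boots, the repeated runs in the
quoted steps could be automated along the following lines.  This is
only a sketch: the QEMU command line and the stall message are taken
from this thread, while the function names, the 120-second timeout,
and the 20-round limit are assumptions chosen for illustration.

```python
import subprocess

# Command line as given in the report; it expects vmlinux in the current directory.
QEMU_CMD = ["qemu-system-ppc64", "-kernel", "vmlinux", "-nographic",
            "-vga", "none", "-no-reboot", "-smp", "2"]
STALL_MARKER = "rcu_sched self-detected stall on CPU"


def boot_once(timeout_s=120):
    """Boot the guest once and return the console output that was captured.

    timeout_s is an assumed bound; the guest is killed when it expires.
    """
    try:
        done = subprocess.run(QEMU_CMD, stdin=subprocess.DEVNULL,
                              stdout=subprocess.PIPE, stderr=subprocess.STDOUT,
                              timeout=timeout_s)
        raw = done.stdout
    except subprocess.TimeoutExpired as exc:
        # Keep whatever was printed before the timeout fired.
        raw = exc.stdout or b""
    return raw.decode("utf-8", errors="replace")


def rounds_until_stall(boot=boot_once, max_rounds=20):
    """Return the 1-based round in which the stall appeared, or None."""
    for round_no in range(1, max_rounds + 1):
        if STALL_MARKER in boot():
            return round_no
    return None
```

Calling rounds_until_stall() with the defaults boots the guest up to 20
times and stops at the first console log containing the stall message;
passing a different boot callable makes it possible to try another QEMU
version or kernel image without changing the loop.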

Thanks
Zhouyi
>
> Miguel is instead seeing an RCU CPU stall warning where RCU's grace-period
> kthread slept for three milliseconds, but did not wake up for more than
> 20 seconds.  This kthread would normally have awakened on CPU 1, but
> CPU 1 looks to me to be very unhealthy, as can be seen in your console
> output below (but maybe my idea of what is healthy for powerpc systems
> is outdated).  Please see also the inline annotations.
>
> Thoughts from the PPC guys?
>
>                                                         Thanx, Paul
>
> ------------------------------------------------------------------------
>
> [   21.186912] rcu: INFO: rcu_sched self-detected stall on CPU
> [   21.187331] rcu:     1-...!: (4712629 ticks this GP) idle=2c1/0/0x3 softirq=8/8 fqs=0
> [   21.187529]  (t=21000 jiffies g=-1183 q=3)
> [   21.187681] rcu: rcu_sched kthread timer wakeup didn't happen for 20997 jiffies! g-1183 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402
>
>         The grace-period kthread is still asleep (->state=0x402).
>         This indicates that the three-jiffy timer has somehow been
>         prevented from expiring for almost a full 21 seconds.  Of course,
>         if timers don't work, RCU cannot work.
>
> [   21.187770] rcu:     Possible timer handling issue on cpu=1 timer-softirq=1
> [   21.187927] rcu: rcu_sched kthread starved for 21001 jiffies! g-1183 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 ->cpu=1
> [   21.188019] rcu:     Unless rcu_sched kthread gets sufficient CPU time, OOM is now expected behavior.
> [   21.188087] rcu: RCU grace-period kthread stack dump:
> [   21.188196] task:rcu_sched       state:I stack:    0 pid:   10 ppid:     2 flags:0x00000800
> [   21.188453] Call Trace:
> [   21.188525] [c0000000061e78a0] [c0000000061e78e0] 0xc0000000061e78e0 (unreliable)
> [   21.188900] [c0000000061e7a90] [c000000000017210] __switch_to+0x250/0x310
> [   21.189210] [c0000000061e7b00] [c0000000003ed660] __schedule+0x210/0x660
> [   21.189315] [c0000000061e7b80] [c0000000003edb14] schedule+0x64/0x110
> [   21.189387] [c0000000061e7bb0] [c0000000003f6648] schedule_timeout+0x1d8/0x390
> [   21.189473] [c0000000061e7c80] [c00000000011111c] rcu_gp_fqs_loop+0x2dc/0x3d0
> [   21.189555] [c0000000061e7d30] [c0000000001144ec] rcu_gp_kthread+0x13c/0x160
> [   21.189633] [c0000000061e7dc0] [c0000000000c1770] kthread+0x110/0x120
> [   21.189714] [c0000000061e7e10] [c00000000000c9e4] ret_from_kernel_thread+0x5c/0x64
>
>         The above stack trace is expected behavior when the RCU
>         grace-period kthread is waiting to do its next FQS scan.
>
> [   21.189938] rcu: Stack dump where RCU GP kthread last ran:
>
>         And here is the stalled CPU, which also happens to be the CPU
>         that RCU last ran on:
>
> [   21.189992] Task dump for CPU 1:
> [   21.190059] task:swapper/1       state:R  running task     stack:    0 pid:    0 ppid:     1 flags:0x00000804
> [   21.190169] Call Trace:
> [   21.190194] [c0000000061ef2d0] [c0000000000c9a40] sched_show_task+0x180/0x1c0 (unreliable)
> [   21.190278] [c0000000061ef340] [c000000000116ca0] rcu_check_gp_kthread_starvation+0x16c/0x19c
> [   21.190370] [c0000000061ef3c0] [c000000000114f7c] rcu_sched_clock_irq+0x7ec/0xaf0
> [   21.190448] [c0000000061ef4b0] [c000000000120fdc] update_process_times+0xbc/0x140
> [   21.190524] [c0000000061ef4f0] [c000000000136a24] tick_nohz_handler+0xf4/0x1b0
> [   21.190608] [c0000000061ef540] [c00000000001c828] timer_interrupt+0x148/0x2d0
> [   21.190699] [c0000000061ef590] [c0000000000098e8] decrementer_common_virt+0x208/0x210
> [   21.190837] --- interrupt: 900 at arch_local_irq_restore+0x168/0x170
>
>         Up through this point is just the stack trace of the
>         code doing the stack dump that the RCU CPU stall warning code
>         asked for.
>
> [   21.190941] NIP:  c000000000013608 LR: c0000000003f8114 CTR: c0000000000dc630
>
>         This NIP does not look at all good to me.  But I freely confess
>         that I am out of date on what Power machines do.
>
> [   21.191031] REGS: c0000000061ef600 TRAP: 0900   Not tainted  (5.18.0-rc1)
> [   21.191109] MSR:  8000000000009033 <SF,EE,ME,IR,DR,RI,LE>  CR: 22000202  XER: 00000000
> [   21.191274] CFAR: 0000000000000000 IRQMASK: 0
> [   21.191274] GPR00: c00000000009c368 c0000000061ef8a0 c00000000116a700 0000000000000000
> [   21.191274] GPR04: 0000000000000000 0000000000000000 000000001ee30000 ffffffffffffffff
> [   21.191274] GPR08: 000000001ee30000 0000000000000000 0000000000008002 7265677368657265
> [   21.191274] GPR12: c0000000000dc630 c00000001ffe5800 0000000000000000 0000000000000000
> [   21.191274] GPR16: 0000000000000282 0000000000000000 0000000000000000 c0000000061eff00
> [   21.191274] GPR20: 0000000000000000 0000000000000001 c0000000061b9f80 c000000001195a10
> [   21.191274] GPR24: c000000001193a00 00000000fffb6cc4 000000000000000a c0000000010721e8
> [   21.191274] GPR28: c000000001076800 c000000001070380 c0000000010716d8 c0000000061b9f80
> [   21.191932] NIP [c000000000013608] arch_local_irq_restore+0x168/0x170
> [   21.192024] LR [c0000000003f8114] __do_softirq+0xd4/0x2ec
> [   21.192118] --- interrupt: 900
> [   21.192158] [c0000000061ef8a0] [c0000000061b9f80] 0xc0000000061b9f80 (unreliable)
> [   21.192227] [c0000000061ef9b0] [c00000000009c368] irq_exit+0xc8/0x110
> [   21.192307] [c0000000061ef9d0] [c00000000001c858] timer_interrupt+0x178/0x2d0
> [   21.192397] [c0000000061efa20] [c0000000000098e8] decrementer_common_virt+0x208/0x210
> [   21.192495] --- interrupt: 900 at plpar_hcall_norets_notrace+0x18/0x2c
> [   21.192566] NIP:  c000000000072988 LR: c000000000074fa8 CTR: c000000000074f10
> [   21.192615] REGS: c0000000061efa90 TRAP: 0900   Not tainted  (5.18.0-rc1)
> [   21.192659] MSR:  8000000000009033 <SF,EE,ME,IR,DR,RI,LE>  CR: 28000202  XER: 00000000
> [   21.192755] CFAR: 0000000000000000 IRQMASK: 0
> [   21.192755] GPR00: 0000000028000202 c0000000061efd30 c00000000116a700 0000000000000000
> [   21.192755] GPR04: c00000001fea0280 ffffffffffffffff 0000000001f40000 000000019d088fcf
> [   21.192755] GPR08: 000000001ee30000 c00000001ffe5400 0000000000000001 0000000100000000
> [   21.192755] GPR12: c000000000074f10 c00000001ffe5800 0000000000000000 0000000000000000
> [   21.192755] GPR16: 0000000000000000 0000000000000000 0000000000000000 c0000000061eff00
> [   21.192755] GPR20: c00000000003d440 0000000000000001 c000000001195b30 c000000001195a10
> [   21.192755] GPR24: 0000000000080000 c0000000061ba000 c000000001195a98 0000000000000001
> [   21.192755] GPR28: 0000000000000001 c0000000010716d0 c0000000010716d8 c0000000010716d0
> [   21.193290] NIP [c000000000072988] plpar_hcall_norets_notrace+0x18/0x2c
> [   21.193363] LR [c000000000074fa8] pseries_lpar_idle+0x98/0x1b0
> [   21.193428] --- interrupt: 900
> [   21.193457] [c0000000061efd30] [0000000000000001] 0x1 (unreliable)
> [   21.193512] [c0000000061efdb0] [c000000000018b54] arch_cpu_idle+0x44/0x180
> [   21.193590] [c0000000061efde0] [c0000000003f75bc] default_idle_call+0x4c/0x7c
> [   21.193679] [c0000000061efe00] [c0000000000e1384] do_idle+0x114/0x1e0
> [   21.193747] [c0000000061efe60] [c0000000000e1664] cpu_startup_entry+0x34/0x40
> [   21.193901] [c0000000061efe90] [c00000000003f044] start_secondary+0x624/0xa00
> [   21.194002] [c0000000061eff90] [c00000000000cd54] start_secondary_prolog+0x10/0x14
> [   21.194245] Task dump for CPU 1:
> [   21.194284] task:swapper/1       state:R  running task     stack:    0 pid:    0 ppid:     1 flags:0x00000804
> [   21.194374] Call Trace:
> [   21.194400] [c0000000061ef2b0] [c0000000000c9a40] sched_show_task+0x180/0x1c0 (unreliable)
> [   21.194479] [c0000000061ef320] [c000000000116df8] rcu_dump_cpu_stacks+0x128/0x188
> [   21.194567] [c0000000061ef3c0] [c000000000114f9c] rcu_sched_clock_irq+0x80c/0xaf0
> [   21.194642] [c0000000061ef4b0] [c000000000120fdc] update_process_times+0xbc/0x140
> [   21.194712] [c0000000061ef4f0] [c000000000136a24] tick_nohz_handler+0xf4/0x1b0
> [   21.194828] [c0000000061ef540] [c00000000001c828] timer_interrupt+0x148/0x2d0
> [   21.194942] [c0000000061ef590] [c0000000000098e8] decrementer_common_virt+0x208/0x210
> [   21.195035] --- interrupt: 900 at arch_local_irq_restore+0x168/0x170
> [   21.195104] NIP:  c000000000013608 LR: c0000000003f8114 CTR: c0000000000dc630
> [   21.195152] REGS: c0000000061ef600 TRAP: 0900   Not tainted  (5.18.0-rc1)
> [   21.195199] MSR:  8000000000009033 <SF,EE,ME,IR,DR,RI,LE>  CR: 22000202  XER: 00000000
> [   21.195296] CFAR: 0000000000000000 IRQMASK: 0
> [   21.195296] GPR00: c00000000009c368 c0000000061ef8a0 c00000000116a700 0000000000000000
> [   21.195296] GPR04: 0000000000000000 0000000000000000 000000001ee30000 ffffffffffffffff
> [   21.195296] GPR08: 000000001ee30000 0000000000000000 0000000000008002 7265677368657265
> [   21.195296] GPR12: c0000000000dc630 c00000001ffe5800 0000000000000000 0000000000000000
> [   21.195296] GPR16: 0000000000000282 0000000000000000 0000000000000000 c0000000061eff00
> [   21.195296] GPR20: 0000000000000000 0000000000000001 c0000000061b9f80 c000000001195a10
> [   21.195296] GPR24: c000000001193a00 00000000fffb6cc4 000000000000000a c0000000010721e8
> [   21.195296] GPR28: c000000001076800 c000000001070380 c0000000010716d8 c0000000061b9f80
> [   21.195850] NIP [c000000000013608] arch_local_irq_restore+0x168/0x170
> [   21.195944] LR [c0000000003f8114] __do_softirq+0xd4/0x2ec
> [   21.196027] --- interrupt: 900
> [   21.196056] [c0000000061ef8a0] [c0000000061b9f80] 0xc0000000061b9f80 (unreliable)
> [   21.196119] [c0000000061ef9b0] [c00000000009c368] irq_exit+0xc8/0x110
> [   21.196192] [c0000000061ef9d0] [c00000000001c858] timer_interrupt+0x178/0x2d0
> [   21.196282] [c0000000061efa20] [c0000000000098e8] decrementer_common_virt+0x208/0x210
> [   21.196373] --- interrupt: 900 at plpar_hcall_norets_notrace+0x18/0x2c
> [   21.196439] NIP:  c000000000072988 LR: c000000000074fa8 CTR: c000000000074f10
> [   21.196489] REGS: c0000000061efa90 TRAP: 0900   Not tainted  (5.18.0-rc1)
> [   21.196534] MSR:  8000000000009033 <SF,EE,ME,IR,DR,RI,LE>  CR: 28000202  XER: 00000000
> [   21.196627] CFAR: 0000000000000000 IRQMASK: 0
> [   21.196627] GPR00: 0000000028000202 c0000000061efd30 c00000000116a700 0000000000000000
> [   21.196627] GPR04: c00000001fea0280 ffffffffffffffff 0000000001f40000 000000019d088fcf
> [   21.196627] GPR08: 000000001ee30000 c00000001ffe5400 0000000000000001 0000000100000000
> [   21.196627] GPR12: c000000000074f10 c00000001ffe5800 0000000000000000 0000000000000000
> [   21.196627] GPR16: 0000000000000000 0000000000000000 0000000000000000 c0000000061eff00
> [   21.196627] GPR20: c00000000003d440 0000000000000001 c000000001195b30 c000000001195a10
> [   21.196627] GPR24: 0000000000080000 c0000000061ba000 c000000001195a98 0000000000000001
> [   21.196627] GPR28: 0000000000000001 c0000000010716d0 c0000000010716d8 c0000000010716d0
> [   21.197168] NIP [c000000000072988] plpar_hcall_norets_notrace+0x18/0x2c
> [   21.197239] LR [c000000000074fa8] pseries_lpar_idle+0x98/0x1b0
> [   21.197305] --- interrupt: 900
> [   21.197333] [c0000000061efd30] [0000000000000001] 0x1 (unreliable)
> [   21.197390] [c0000000061efdb0] [c000000000018b54] arch_cpu_idle+0x44/0x180
> [   21.197470] [c0000000061efde0] [c0000000003f75bc] default_idle_call+0x4c/0x7c
> [   21.197556] [c0000000061efe00] [c0000000000e1384] do_idle+0x114/0x1e0
> [   21.197620] [c0000000061efe60] [c0000000000e1664] cpu_startup_entry+0x34/0x40
> [   21.197696] [c0000000061efe90] [c00000000003f044] start_secondary+0x624/0xa00
> [   21.197820] [c0000000061eff90] [c00000000000cd54] start_secondary_prolog+0x10/0x14

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: rcu_sched self-detected stall on CPU
@ 2022-04-06 18:25       ` Zhouyi Zhou
  0 siblings, 0 replies; 56+ messages in thread
From: Zhouyi Zhou @ 2022-04-06 18:25 UTC (permalink / raw)
  To: Paul E. McKenney, Zhouyi Zhou; +Cc: Miguel Ojeda, linuxppc-dev, rcu

Hi Paul

On Thu, Apr 7, 2022 at 1:00 AM Paul E. McKenney <paulmck@kernel.org> wrote:
>
> On Wed, Apr 06, 2022 at 05:31:10PM +0800, Zhouyi Zhou wrote:
> > Hi
> >
> > I can reproduce it in a ppc virtual cloud server provided by Oregon
> > State University.  Following is what I do:
> > 1) curl -l https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/snapshot/linux-5.18-rc1.tar.gz
> > -o linux-5.18-rc1.tar.gz
> > 2) tar zxf linux-5.18-rc1.tar.gz
> > 3) cp config linux-5.18-rc1/.config
> > 4) cd linux-5.18-rc1
> > 5) make vmlinux -j 8
> > 6) qemu-system-ppc64 -kernel vmlinux -nographic -vga none -no-reboot
> > -smp 2 (QEMU 4.2.1)
> > 7) after 12 rounds, the bug got reproduced:
> > (http://154.223.142.244/logs/20220406/qemu.log.txt)
>
> Just to make sure, are you both seeing the same thing?  Last I knew,
> Zhouyi was chasing an RCU-tasks issue that appears only in kernels
> built with CONFIG_PROVE_RCU=y, which Miguel does not have set.  Or did
> I miss something?
We are both seeing the same thing; I am working on the two issues in
parallel.
1) I am chasing the RCU-tasks issue, and I will report my findings
to you later.
2) I am reproducing the RCU CPU stall issue reported by Miguel
yesterday. Luckily, I can reproduce it, thanks to Oregon State
University, which provides me with the environment! I am also very
interested in helping track down the cause of the issue. Fortunately,
the issue can be reproduced in a non-hardware-accelerated QEMU
environment, so I can lend a hand.

Thanks
Zhouyi
>
> Miguel is instead seeing an RCU CPU stall warning where RCU's grace-period
> kthread slept for three milliseconds, but did not wake up for more than
> 20 seconds.  This kthread would normally have awakened on CPU 1, but
> CPU 1 looks to me to be very unhealthy, as can be seen in your console
> output below (but maybe my idea of what is healthy for powerpc systems
> is outdated).  Please see also the inline annotations.
>
> Thoughts from the PPC guys?
>
>                                                         Thanx, Paul
>
> ------------------------------------------------------------------------
>
> [   21.186912] rcu: INFO: rcu_sched self-detected stall on CPU
> [   21.187331] rcu:     1-...!: (4712629 ticks this GP) idle=2c1/0/0x3 softirq=8/8 fqs=0
> [   21.187529]  (t=21000 jiffies g=-1183 q=3)
> [   21.187681] rcu: rcu_sched kthread timer wakeup didn't happen for 20997 jiffies! g-1183 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402
>
>         The grace-period kthread is still asleep (->state=0x402).
>         This indicates that the three-jiffy timer has somehow been
>         prevented from expiring for almost a full 21 seconds.  Of course,
>         if timers don't work, RCU cannot work.
>
> [   21.187770] rcu:     Possible timer handling issue on cpu=1 timer-softirq=1
> [   21.187927] rcu: rcu_sched kthread starved for 21001 jiffies! g-1183 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 ->cpu=1
> [   21.188019] rcu:     Unless rcu_sched kthread gets sufficient CPU time, OOM is now expected behavior.
> [   21.188087] rcu: RCU grace-period kthread stack dump:
> [   21.188196] task:rcu_sched       state:I stack:    0 pid:   10 ppid:     2 flags:0x00000800
> [   21.188453] Call Trace:
> [   21.188525] [c0000000061e78a0] [c0000000061e78e0] 0xc0000000061e78e0 (unreliable)
> [   21.188900] [c0000000061e7a90] [c000000000017210] __switch_to+0x250/0x310
> [   21.189210] [c0000000061e7b00] [c0000000003ed660] __schedule+0x210/0x660
> [   21.189315] [c0000000061e7b80] [c0000000003edb14] schedule+0x64/0x110
> [   21.189387] [c0000000061e7bb0] [c0000000003f6648] schedule_timeout+0x1d8/0x390
> [   21.189473] [c0000000061e7c80] [c00000000011111c] rcu_gp_fqs_loop+0x2dc/0x3d0
> [   21.189555] [c0000000061e7d30] [c0000000001144ec] rcu_gp_kthread+0x13c/0x160
> [   21.189633] [c0000000061e7dc0] [c0000000000c1770] kthread+0x110/0x120
> [   21.189714] [c0000000061e7e10] [c00000000000c9e4] ret_from_kernel_thread+0x5c/0x64
>
>         The above stack trace is expected behavior when the RCU
>         grace-period kthread is waiting to do its next FQS scan.
>
> [   21.189938] rcu: Stack dump where RCU GP kthread last ran:
>
>         And here is the stalled CPU, which also happens to be the CPU
>         that RCU last ran on:
>
> [   21.189992] Task dump for CPU 1:
> [   21.190059] task:swapper/1       state:R  running task     stack:    0 pid:    0 ppid:     1 flags:0x00000804
> [   21.190169] Call Trace:
> [   21.190194] [c0000000061ef2d0] [c0000000000c9a40] sched_show_task+0x180/0x1c0 (unreliable)
> [   21.190278] [c0000000061ef340] [c000000000116ca0] rcu_check_gp_kthread_starvation+0x16c/0x19c
> [   21.190370] [c0000000061ef3c0] [c000000000114f7c] rcu_sched_clock_irq+0x7ec/0xaf0
> [   21.190448] [c0000000061ef4b0] [c000000000120fdc] update_process_times+0xbc/0x140
> [   21.190524] [c0000000061ef4f0] [c000000000136a24] tick_nohz_handler+0xf4/0x1b0
> [   21.190608] [c0000000061ef540] [c00000000001c828] timer_interrupt+0x148/0x2d0
> [   21.190699] [c0000000061ef590] [c0000000000098e8] decrementer_common_virt+0x208/0x210
> [   21.190837] --- interrupt: 900 at arch_local_irq_restore+0x168/0x170
>
>         Up through this point is just the stack trace of the
>         code doing the stack dump that the RCU CPU stall warning code
>         asked for.
>
> [   21.190941] NIP:  c000000000013608 LR: c0000000003f8114 CTR: c0000000000dc630
>
>         This NIP does not look at all good to me.  But I freely confess
>         that I am out of date on what Power machines do.
>
> [   21.191031] REGS: c0000000061ef600 TRAP: 0900   Not tainted  (5.18.0-rc1)
> [   21.191109] MSR:  8000000000009033 <SF,EE,ME,IR,DR,RI,LE>  CR: 22000202  XER: 00000000
> [   21.191274] CFAR: 0000000000000000 IRQMASK: 0
> [   21.191274] GPR00: c00000000009c368 c0000000061ef8a0 c00000000116a700 0000000000000000
> [   21.191274] GPR04: 0000000000000000 0000000000000000 000000001ee30000 ffffffffffffffff
> [   21.191274] GPR08: 000000001ee30000 0000000000000000 0000000000008002 7265677368657265
> [   21.191274] GPR12: c0000000000dc630 c00000001ffe5800 0000000000000000 0000000000000000
> [   21.191274] GPR16: 0000000000000282 0000000000000000 0000000000000000 c0000000061eff00
> [   21.191274] GPR20: 0000000000000000 0000000000000001 c0000000061b9f80 c000000001195a10
> [   21.191274] GPR24: c000000001193a00 00000000fffb6cc4 000000000000000a c0000000010721e8
> [   21.191274] GPR28: c000000001076800 c000000001070380 c0000000010716d8 c0000000061b9f80
> [   21.191932] NIP [c000000000013608] arch_local_irq_restore+0x168/0x170
> [   21.192024] LR [c0000000003f8114] __do_softirq+0xd4/0x2ec
> [   21.192118] --- interrupt: 900
> [   21.192158] [c0000000061ef8a0] [c0000000061b9f80] 0xc0000000061b9f80 (unreliable)
> [   21.192227] [c0000000061ef9b0] [c00000000009c368] irq_exit+0xc8/0x110
> [   21.192307] [c0000000061ef9d0] [c00000000001c858] timer_interrupt+0x178/0x2d0
> [   21.192397] [c0000000061efa20] [c0000000000098e8] decrementer_common_virt+0x208/0x210
> [   21.192495] --- interrupt: 900 at plpar_hcall_norets_notrace+0x18/0x2c
> [   21.192566] NIP:  c000000000072988 LR: c000000000074fa8 CTR: c000000000074f10
> [   21.192615] REGS: c0000000061efa90 TRAP: 0900   Not tainted  (5.18.0-rc1)
> [   21.192659] MSR:  8000000000009033 <SF,EE,ME,IR,DR,RI,LE>  CR: 28000202  XER: 00000000
> [   21.192755] CFAR: 0000000000000000 IRQMASK: 0
> [   21.192755] GPR00: 0000000028000202 c0000000061efd30 c00000000116a700 0000000000000000
> [   21.192755] GPR04: c00000001fea0280 ffffffffffffffff 0000000001f40000 000000019d088fcf
> [   21.192755] GPR08: 000000001ee30000 c00000001ffe5400 0000000000000001 0000000100000000
> [   21.192755] GPR12: c000000000074f10 c00000001ffe5800 0000000000000000 0000000000000000
> [   21.192755] GPR16: 0000000000000000 0000000000000000 0000000000000000 c0000000061eff00
> [   21.192755] GPR20: c00000000003d440 0000000000000001 c000000001195b30 c000000001195a10
> [   21.192755] GPR24: 0000000000080000 c0000000061ba000 c000000001195a98 0000000000000001
> [   21.192755] GPR28: 0000000000000001 c0000000010716d0 c0000000010716d8 c0000000010716d0
> [   21.193290] NIP [c000000000072988] plpar_hcall_norets_notrace+0x18/0x2c
> [   21.193363] LR [c000000000074fa8] pseries_lpar_idle+0x98/0x1b0
> [   21.193428] --- interrupt: 900
> [   21.193457] [c0000000061efd30] [0000000000000001] 0x1 (unreliable)
> [   21.193512] [c0000000061efdb0] [c000000000018b54] arch_cpu_idle+0x44/0x180
> [   21.193590] [c0000000061efde0] [c0000000003f75bc] default_idle_call+0x4c/0x7c
> [   21.193679] [c0000000061efe00] [c0000000000e1384] do_idle+0x114/0x1e0
> [   21.193747] [c0000000061efe60] [c0000000000e1664] cpu_startup_entry+0x34/0x40
> [   21.193901] [c0000000061efe90] [c00000000003f044] start_secondary+0x624/0xa00
> [   21.194002] [c0000000061eff90] [c00000000000cd54] start_secondary_prolog+0x10/0x14
> [   21.194245] Task dump for CPU 1:
> [   21.194284] task:swapper/1       state:R  running task     stack:    0 pid:    0 ppid:     1 flags:0x00000804
> [   21.194374] Call Trace:
> [   21.194400] [c0000000061ef2b0] [c0000000000c9a40] sched_show_task+0x180/0x1c0 (unreliable)
> [   21.194479] [c0000000061ef320] [c000000000116df8] rcu_dump_cpu_stacks+0x128/0x188
> [   21.194567] [c0000000061ef3c0] [c000000000114f9c] rcu_sched_clock_irq+0x80c/0xaf0
> [   21.194642] [c0000000061ef4b0] [c000000000120fdc] update_process_times+0xbc/0x140
> [   21.194712] [c0000000061ef4f0] [c000000000136a24] tick_nohz_handler+0xf4/0x1b0
> [   21.194828] [c0000000061ef540] [c00000000001c828] timer_interrupt+0x148/0x2d0
> [   21.194942] [c0000000061ef590] [c0000000000098e8] decrementer_common_virt+0x208/0x210
> [   21.195035] --- interrupt: 900 at arch_local_irq_restore+0x168/0x170
> [   21.195104] NIP:  c000000000013608 LR: c0000000003f8114 CTR: c0000000000dc630
> [   21.195152] REGS: c0000000061ef600 TRAP: 0900   Not tainted  (5.18.0-rc1)
> [   21.195199] MSR:  8000000000009033 <SF,EE,ME,IR,DR,RI,LE>  CR: 22000202  XER: 00000000
> [   21.195296] CFAR: 0000000000000000 IRQMASK: 0
> [   21.195296] GPR00: c00000000009c368 c0000000061ef8a0 c00000000116a700 0000000000000000
> [   21.195296] GPR04: 0000000000000000 0000000000000000 000000001ee30000 ffffffffffffffff
> [   21.195296] GPR08: 000000001ee30000 0000000000000000 0000000000008002 7265677368657265
> [   21.195296] GPR12: c0000000000dc630 c00000001ffe5800 0000000000000000 0000000000000000
> [   21.195296] GPR16: 0000000000000282 0000000000000000 0000000000000000 c0000000061eff00
> [   21.195296] GPR20: 0000000000000000 0000000000000001 c0000000061b9f80 c000000001195a10
> [   21.195296] GPR24: c000000001193a00 00000000fffb6cc4 000000000000000a c0000000010721e8
> [   21.195296] GPR28: c000000001076800 c000000001070380 c0000000010716d8 c0000000061b9f80
> [   21.195850] NIP [c000000000013608] arch_local_irq_restore+0x168/0x170
> [   21.195944] LR [c0000000003f8114] __do_softirq+0xd4/0x2ec
> [   21.196027] --- interrupt: 900
> [   21.196056] [c0000000061ef8a0] [c0000000061b9f80] 0xc0000000061b9f80 (unreliable)
> [   21.196119] [c0000000061ef9b0] [c00000000009c368] irq_exit+0xc8/0x110
> [   21.196192] [c0000000061ef9d0] [c00000000001c858] timer_interrupt+0x178/0x2d0
> [   21.196282] [c0000000061efa20] [c0000000000098e8] decrementer_common_virt+0x208/0x210
> [   21.196373] --- interrupt: 900 at plpar_hcall_norets_notrace+0x18/0x2c
> [   21.196439] NIP:  c000000000072988 LR: c000000000074fa8 CTR: c000000000074f10
> [   21.196489] REGS: c0000000061efa90 TRAP: 0900   Not tainted  (5.18.0-rc1)
> [   21.196534] MSR:  8000000000009033 <SF,EE,ME,IR,DR,RI,LE>  CR: 28000202  XER: 00000000
> [   21.196627] CFAR: 0000000000000000 IRQMASK: 0
> [   21.196627] GPR00: 0000000028000202 c0000000061efd30 c00000000116a700 0000000000000000
> [   21.196627] GPR04: c00000001fea0280 ffffffffffffffff 0000000001f40000 000000019d088fcf
> [   21.196627] GPR08: 000000001ee30000 c00000001ffe5400 0000000000000001 0000000100000000
> [   21.196627] GPR12: c000000000074f10 c00000001ffe5800 0000000000000000 0000000000000000
> [   21.196627] GPR16: 0000000000000000 0000000000000000 0000000000000000 c0000000061eff00
> [   21.196627] GPR20: c00000000003d440 0000000000000001 c000000001195b30 c000000001195a10
> [   21.196627] GPR24: 0000000000080000 c0000000061ba000 c000000001195a98 0000000000000001
> [   21.196627] GPR28: 0000000000000001 c0000000010716d0 c0000000010716d8 c0000000010716d0
> [   21.197168] NIP [c000000000072988] plpar_hcall_norets_notrace+0x18/0x2c
> [   21.197239] LR [c000000000074fa8] pseries_lpar_idle+0x98/0x1b0
> [   21.197305] --- interrupt: 900
> [   21.197333] [c0000000061efd30] [0000000000000001] 0x1 (unreliable)
> [   21.197390] [c0000000061efdb0] [c000000000018b54] arch_cpu_idle+0x44/0x180
> [   21.197470] [c0000000061efde0] [c0000000003f75bc] default_idle_call+0x4c/0x7c
> [   21.197556] [c0000000061efe00] [c0000000000e1384] do_idle+0x114/0x1e0
> [   21.197620] [c0000000061efe60] [c0000000000e1664] cpu_startup_entry+0x34/0x40
> [   21.197696] [c0000000061efe90] [c00000000003f044] start_secondary+0x624/0xa00
> [   21.197820] [c0000000061eff90] [c00000000000cd54] start_secondary_prolog+0x10/0x14

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: rcu_sched self-detected stall on CPU
  2022-04-06 18:25       ` Zhouyi Zhou
@ 2022-04-06 19:50         ` Paul E. McKenney
  -1 siblings, 0 replies; 56+ messages in thread
From: Paul E. McKenney @ 2022-04-06 19:50 UTC (permalink / raw)
  To: Zhouyi Zhou; +Cc: rcu, Miguel Ojeda, linuxppc-dev

On Thu, Apr 07, 2022 at 02:25:59AM +0800, Zhouyi Zhou wrote:
> Hi Paul
> 
> On Thu, Apr 7, 2022 at 1:00 AM Paul E. McKenney <paulmck@kernel.org> wrote:
> >
> > On Wed, Apr 06, 2022 at 05:31:10PM +0800, Zhouyi Zhou wrote:
> > > Hi
> > >
> > > I can reproduce it in a ppc virtual cloud server provided by Oregon
> > > State University.  Following is what I do:
> > > 1) curl -l https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/snapshot/linux-5.18-rc1.tar.gz
> > > -o linux-5.18-rc1.tar.gz
> > > 2) tar zxf linux-5.18-rc1.tar.gz
> > > 3) cp config linux-5.18-rc1/.config
> > > 4) cd linux-5.18-rc1
> > > 5) make vmlinux -j 8
> > > 6) qemu-system-ppc64 -kernel vmlinux -nographic -vga none -no-reboot
> > > -smp 2 (QEMU 4.2.1)
> > > 7) after 12 rounds, the bug got reproduced:
> > > (http://154.223.142.244/logs/20220406/qemu.log.txt)
> >
> > Just to make sure, are you both seeing the same thing?  Last I knew,
> > Zhouyi was chasing an RCU-tasks issue that appears only in kernels
> > built with CONFIG_PROVE_RCU=y, which Miguel does not have set.  Or did
> > I miss something?
> We are both seeing the same thing; I am working on the two issues in
> parallel.
> 1) I am chasing the RCU-tasks issue, and I will report my findings
> to you later.
> 2) I am reproducing the RCU CPU stall issue reported by Miguel
> yesterday. Luckily, I can reproduce it, thanks to Oregon State
> University, which provides me with the environment! I am also very
> interested in helping track down the cause of the issue. Fortunately,
> the issue can be reproduced in a non-hardware-accelerated QEMU
> environment, so I can lend a hand.

How quickly does this happen?  The console log that Miguel sent showed
it within 30 seconds of boot.  If it always happens this quickly, it
should be possible to do a bisection, especially when running qemu.
The trick would be to boot a given commit either until you see it fail
or until it boots successfully 70 times.  In the latter case, report
success to "git bisect"; in the former case, report failure.
If the one-out-of-5 failure rate is accurate, you will have a 99.997%
chance of reporting the correct failure state on each step, resulting
in better than a 99.9% chance of converging on the correct commit.
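
The arithmetic behind these figures can be sketched as follows. This is
a minimal illustration, assuming each boot of a bad commit stalls
independently with probability 1/5. The function names and the 14-step
bisection length are hypothetical and do not come from this thread.

```python
import math


def miss_probability(p_fail: float, boots: int) -> float:
    """Chance that a bad commit survives `boots` boots in a row without stalling."""
    return (1.0 - p_fail) ** boots


def boots_needed(p_fail: float, target_miss: float) -> int:
    """Smallest boot count whose miss probability is at most `target_miss`."""
    return math.ceil(math.log(target_miss) / math.log(1.0 - p_fail))


def bisect_confidence(per_step_correct: float, steps: int) -> float:
    """Chance that every bisection step is classified correctly."""
    return per_step_correct ** steps


# Assumed inputs: 1-in-5 failure rate and 70 boots (both from the text above),
# plus a hypothetical 14 bisection steps.
per_step = 1.0 - miss_probability(0.2, 70)
print(f"per-step correctness with 70 boots: {per_step:.8f}")
print(f"confidence over 14 steps at 99.997% per step: {bisect_confidence(0.99997, 14):.5f}")
```

Under this model, 70 clean boots of a bad commit have probability
0.8**70, which is well under the 0.003% that a 99.997% figure allows,
so the quoted numbers look conservative. If the real failure rate is
lower than one in five, the required boot count grows accordingly.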

Of course, you would hit the preceding commit hard to double-check.
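
One way to automate the boot-until-fail loop is sketched below. It is
only a sketch under several assumptions: the QEMU command line is the
one quoted earlier in this thread, the stall is recognized by the
"rcu_sched self-detected stall on CPU" line from the console log, and
the 60-second timeout, the function names, and feeding QEMU an empty
stdin are hypothetical choices rather than anything tested here.

```python
import subprocess

# Marker text taken from the console log quoted in this thread.
STALL_MARKER = "rcu_sched self-detected stall on CPU"

# Command line as quoted in this thread; vmlinux must already be built.
QEMU_CMD = ["qemu-system-ppc64", "-kernel", "vmlinux", "-nographic",
            "-vga", "none", "-no-reboot", "-smp", "2"]


def boot_once(timeout_s: float = 60.0) -> str:
    """Boot the kernel once and return whatever console output was captured."""
    try:
        done = subprocess.run(QEMU_CMD, stdin=subprocess.DEVNULL,
                              stdout=subprocess.PIPE, stderr=subprocess.STDOUT,
                              timeout=timeout_s)
        raw = done.stdout
    except subprocess.TimeoutExpired as exc:
        raw = exc.stdout  # partial output before the timeout; may be None
    return (raw or b"").decode("utf-8", errors="replace")


def commit_is_bad(boot=boot_once, max_boots: int = 70) -> bool:
    """Return True on the first boot that shows the stall, False after max_boots clean boots."""
    for _ in range(max_boots):
        if STALL_MARKER in boot():
            return True
    return False


def main() -> int:
    """Exit status for "git bisect run": 0 marks the commit good, 1 marks it bad."""
    return 1 if commit_is_bad() else 0
```

A wrapper script would build vmlinux first and then end with
"raise SystemExit(main())". "git bisect run" treats exit status 0 as
good, 1 through 124 as bad, and 125 as "skip this commit", so a failed
build would be a natural case for returning 125.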

Does this seem reasonable?  Or am I being overly optimistic on the
failure times?

							Thanx, Paul

> Thanks
> Zhouyi
> >
> > Miguel is instead seeing an RCU CPU stall warning where RCU's grace-period
> > kthread slept for three milliseconds, but did not wake up for more than
> > 20 seconds.  This kthread would normally have awakened on CPU 1, but
> > CPU 1 looks to me to be very unhealthy, as can be seen in your console
> > output below (but maybe my idea of what is healthy for powerpc systems
> > is outdated).  Please see also the inline annotations.
> >
> > Thoughts from the PPC guys?
> >
> >                                                         Thanx, Paul
> >
> > ------------------------------------------------------------------------
> >
> > [   21.186912] rcu: INFO: rcu_sched self-detected stall on CPU
> > [   21.187331] rcu:     1-...!: (4712629 ticks this GP) idle=2c1/0/0x3 softirq=8/8 fqs=0
> > [   21.187529]  (t=21000 jiffies g=-1183 q=3)
> > [   21.187681] rcu: rcu_sched kthread timer wakeup didn't happen for 20997 jiffies! g-1183 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402
> >
> >         The grace-period kthread is still asleep (->state=0x402).
> >         This indicates that the three-jiffy timer has somehow been
> >         prevented from expiring for almost a full 21 seconds.  Of course,
> >         if timers don't work, RCU cannot work.
> >
> > [   21.187770] rcu:     Possible timer handling issue on cpu=1 timer-softirq=1
> > [   21.187927] rcu: rcu_sched kthread starved for 21001 jiffies! g-1183 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 ->cpu=1
> > [   21.188019] rcu:     Unless rcu_sched kthread gets sufficient CPU time, OOM is now expected behavior.
> > [   21.188087] rcu: RCU grace-period kthread stack dump:
> > [   21.188196] task:rcu_sched       state:I stack:    0 pid:   10 ppid:     2 flags:0x00000800
> > [   21.188453] Call Trace:
> > [   21.188525] [c0000000061e78a0] [c0000000061e78e0] 0xc0000000061e78e0 (unreliable)
> > [   21.188900] [c0000000061e7a90] [c000000000017210] __switch_to+0x250/0x310
> > [   21.189210] [c0000000061e7b00] [c0000000003ed660] __schedule+0x210/0x660
> > [   21.189315] [c0000000061e7b80] [c0000000003edb14] schedule+0x64/0x110
> > [   21.189387] [c0000000061e7bb0] [c0000000003f6648] schedule_timeout+0x1d8/0x390
> > [   21.189473] [c0000000061e7c80] [c00000000011111c] rcu_gp_fqs_loop+0x2dc/0x3d0
> > [   21.189555] [c0000000061e7d30] [c0000000001144ec] rcu_gp_kthread+0x13c/0x160
> > [   21.189633] [c0000000061e7dc0] [c0000000000c1770] kthread+0x110/0x120
> > [   21.189714] [c0000000061e7e10] [c00000000000c9e4] ret_from_kernel_thread+0x5c/0x64
> >
> >         The above stack trace is expected behavior when the RCU
> >         grace-period kthread is waiting to do its next FQS scan.
> >
> > [   21.189938] rcu: Stack dump where RCU GP kthread last ran:
> >
> >         And here is the stalled CPU, which also happens to be the CPU
> >         that RCU last ran on:
> >
> > [   21.189992] Task dump for CPU 1:
> > [   21.190059] task:swapper/1       state:R  running task     stack:    0 pid:    0 ppid:     1 flags:0x00000804
> > [   21.190169] Call Trace:
> > [   21.190194] [c0000000061ef2d0] [c0000000000c9a40] sched_show_task+0x180/0x1c0 (unreliable)
> > [   21.190278] [c0000000061ef340] [c000000000116ca0] rcu_check_gp_kthread_starvation+0x16c/0x19c
> > [   21.190370] [c0000000061ef3c0] [c000000000114f7c] rcu_sched_clock_irq+0x7ec/0xaf0
> > [   21.190448] [c0000000061ef4b0] [c000000000120fdc] update_process_times+0xbc/0x140
> > [   21.190524] [c0000000061ef4f0] [c000000000136a24] tick_nohz_handler+0xf4/0x1b0
> > [   21.190608] [c0000000061ef540] [c00000000001c828] timer_interrupt+0x148/0x2d0
> > [   21.190699] [c0000000061ef590] [c0000000000098e8] decrementer_common_virt+0x208/0x210
> > [   21.190837] --- interrupt: 900 at arch_local_irq_restore+0x168/0x170
> >
> >         Up through this point is just the stack trace of the
> >         code doing the stack dump that the RCU CPU stall warning code
> >         asked for.
> >
> > [   21.190941] NIP:  c000000000013608 LR: c0000000003f8114 CTR: c0000000000dc630
> >
> >         This NIP does not look at all good to me.  But I freely confess
> >         that I am out of date on what Power machines do.
> >
> > [   21.191031] REGS: c0000000061ef600 TRAP: 0900   Not tainted  (5.18.0-rc1)
> > [   21.191109] MSR:  8000000000009033 <SF,EE,ME,IR,DR,RI,LE>  CR: 22000202  XER: 00000000
> > [   21.191274] CFAR: 0000000000000000 IRQMASK: 0
> > [   21.191274] GPR00: c00000000009c368 c0000000061ef8a0 c00000000116a700 0000000000000000
> > [   21.191274] GPR04: 0000000000000000 0000000000000000 000000001ee30000 ffffffffffffffff
> > [   21.191274] GPR08: 000000001ee30000 0000000000000000 0000000000008002 7265677368657265
> > [   21.191274] GPR12: c0000000000dc630 c00000001ffe5800 0000000000000000 0000000000000000
> > [   21.191274] GPR16: 0000000000000282 0000000000000000 0000000000000000 c0000000061eff00
> > [   21.191274] GPR20: 0000000000000000 0000000000000001 c0000000061b9f80 c000000001195a10
> > [   21.191274] GPR24: c000000001193a00 00000000fffb6cc4 000000000000000a c0000000010721e8
> > [   21.191274] GPR28: c000000001076800 c000000001070380 c0000000010716d8 c0000000061b9f80
> > [   21.191932] NIP [c000000000013608] arch_local_irq_restore+0x168/0x170
> > [   21.192024] LR [c0000000003f8114] __do_softirq+0xd4/0x2ec
> > [   21.192118] --- interrupt: 900
> > [   21.192158] [c0000000061ef8a0] [c0000000061b9f80] 0xc0000000061b9f80 (unreliable)
> > [   21.192227] [c0000000061ef9b0] [c00000000009c368] irq_exit+0xc8/0x110
> > [   21.192307] [c0000000061ef9d0] [c00000000001c858] timer_interrupt+0x178/0x2d0
> > [   21.192397] [c0000000061efa20] [c0000000000098e8] decrementer_common_virt+0x208/0x210
> > [   21.192495] --- interrupt: 900 at plpar_hcall_norets_notrace+0x18/0x2c
> > [   21.192566] NIP:  c000000000072988 LR: c000000000074fa8 CTR: c000000000074f10
> > [   21.192615] REGS: c0000000061efa90 TRAP: 0900   Not tainted  (5.18.0-rc1)
> > [   21.192659] MSR:  8000000000009033 <SF,EE,ME,IR,DR,RI,LE>  CR: 28000202  XER: 00000000
> > [   21.192755] CFAR: 0000000000000000 IRQMASK: 0
> > [   21.192755] GPR00: 0000000028000202 c0000000061efd30 c00000000116a700 0000000000000000
> > [   21.192755] GPR04: c00000001fea0280 ffffffffffffffff 0000000001f40000 000000019d088fcf
> > [   21.192755] GPR08: 000000001ee30000 c00000001ffe5400 0000000000000001 0000000100000000
> > [   21.192755] GPR12: c000000000074f10 c00000001ffe5800 0000000000000000 0000000000000000
> > [   21.192755] GPR16: 0000000000000000 0000000000000000 0000000000000000 c0000000061eff00
> > [   21.192755] GPR20: c00000000003d440 0000000000000001 c000000001195b30 c000000001195a10
> > [   21.192755] GPR24: 0000000000080000 c0000000061ba000 c000000001195a98 0000000000000001
> > [   21.192755] GPR28: 0000000000000001 c0000000010716d0 c0000000010716d8 c0000000010716d0
> > [   21.193290] NIP [c000000000072988] plpar_hcall_norets_notrace+0x18/0x2c
> > [   21.193363] LR [c000000000074fa8] pseries_lpar_idle+0x98/0x1b0
> > [   21.193428] --- interrupt: 900
> > [   21.193457] [c0000000061efd30] [0000000000000001] 0x1 (unreliable)
> > [   21.193512] [c0000000061efdb0] [c000000000018b54] arch_cpu_idle+0x44/0x180
> > [   21.193590] [c0000000061efde0] [c0000000003f75bc] default_idle_call+0x4c/0x7c
> > [   21.193679] [c0000000061efe00] [c0000000000e1384] do_idle+0x114/0x1e0
> > [   21.193747] [c0000000061efe60] [c0000000000e1664] cpu_startup_entry+0x34/0x40
> > [   21.193901] [c0000000061efe90] [c00000000003f044] start_secondary+0x624/0xa00
> > [   21.194002] [c0000000061eff90] [c00000000000cd54] start_secondary_prolog+0x10/0x14
> > [   21.194245] Task dump for CPU 1:
> > [   21.194284] task:swapper/1       state:R  running task     stack:    0 pid:    0 ppid:     1 flags:0x00000804
> > [   21.194374] Call Trace:
> > [   21.194400] [c0000000061ef2b0] [c0000000000c9a40] sched_show_task+0x180/0x1c0 (unreliable)
> > [   21.194479] [c0000000061ef320] [c000000000116df8] rcu_dump_cpu_stacks+0x128/0x188
> > [   21.194567] [c0000000061ef3c0] [c000000000114f9c] rcu_sched_clock_irq+0x80c/0xaf0
> > [   21.194642] [c0000000061ef4b0] [c000000000120fdc] update_process_times+0xbc/0x140
> > [   21.194712] [c0000000061ef4f0] [c000000000136a24] tick_nohz_handler+0xf4/0x1b0
> > [   21.194828] [c0000000061ef540] [c00000000001c828] timer_interrupt+0x148/0x2d0
> > [   21.194942] [c0000000061ef590] [c0000000000098e8] decrementer_common_virt+0x208/0x210
> > [   21.195035] --- interrupt: 900 at arch_local_irq_restore+0x168/0x170
> > [   21.195104] NIP:  c000000000013608 LR: c0000000003f8114 CTR: c0000000000dc630
> > [   21.195152] REGS: c0000000061ef600 TRAP: 0900   Not tainted  (5.18.0-rc1)
> > [   21.195199] MSR:  8000000000009033 <SF,EE,ME,IR,DR,RI,LE>  CR: 22000202  XER: 00000000
> > [   21.195296] CFAR: 0000000000000000 IRQMASK: 0
> > [   21.195296] GPR00: c00000000009c368 c0000000061ef8a0 c00000000116a700 0000000000000000
> > [   21.195296] GPR04: 0000000000000000 0000000000000000 000000001ee30000 ffffffffffffffff
> > [   21.195296] GPR08: 000000001ee30000 0000000000000000 0000000000008002 7265677368657265
> > [   21.195296] GPR12: c0000000000dc630 c00000001ffe5800 0000000000000000 0000000000000000
> > [   21.195296] GPR16: 0000000000000282 0000000000000000 0000000000000000 c0000000061eff00
> > [   21.195296] GPR20: 0000000000000000 0000000000000001 c0000000061b9f80 c000000001195a10
> > [   21.195296] GPR24: c000000001193a00 00000000fffb6cc4 000000000000000a c0000000010721e8
> > [   21.195296] GPR28: c000000001076800 c000000001070380 c0000000010716d8 c0000000061b9f80
> > [   21.195850] NIP [c000000000013608] arch_local_irq_restore+0x168/0x170
> > [   21.195944] LR [c0000000003f8114] __do_softirq+0xd4/0x2ec
> > [   21.196027] --- interrupt: 900
> > [   21.196056] [c0000000061ef8a0] [c0000000061b9f80] 0xc0000000061b9f80 (unreliable)
> > [   21.196119] [c0000000061ef9b0] [c00000000009c368] irq_exit+0xc8/0x110
> > [   21.196192] [c0000000061ef9d0] [c00000000001c858] timer_interrupt+0x178/0x2d0
> > [   21.196282] [c0000000061efa20] [c0000000000098e8] decrementer_common_virt+0x208/0x210
> > [   21.196373] --- interrupt: 900 at plpar_hcall_norets_notrace+0x18/0x2c
> > [   21.196439] NIP:  c000000000072988 LR: c000000000074fa8 CTR: c000000000074f10
> > [   21.196489] REGS: c0000000061efa90 TRAP: 0900   Not tainted  (5.18.0-rc1)
> > [   21.196534] MSR:  8000000000009033 <SF,EE,ME,IR,DR,RI,LE>  CR: 28000202  XER: 00000000
> > [   21.196627] CFAR: 0000000000000000 IRQMASK: 0
> > [   21.196627] GPR00: 0000000028000202 c0000000061efd30 c00000000116a700 0000000000000000
> > [   21.196627] GPR04: c00000001fea0280 ffffffffffffffff 0000000001f40000 000000019d088fcf
> > [   21.196627] GPR08: 000000001ee30000 c00000001ffe5400 0000000000000001 0000000100000000
> > [   21.196627] GPR12: c000000000074f10 c00000001ffe5800 0000000000000000 0000000000000000
> > [   21.196627] GPR16: 0000000000000000 0000000000000000 0000000000000000 c0000000061eff00
> > [   21.196627] GPR20: c00000000003d440 0000000000000001 c000000001195b30 c000000001195a10
> > [   21.196627] GPR24: 0000000000080000 c0000000061ba000 c000000001195a98 0000000000000001
> > [   21.196627] GPR28: 0000000000000001 c0000000010716d0 c0000000010716d8 c0000000010716d0
> > [   21.197168] NIP [c000000000072988] plpar_hcall_norets_notrace+0x18/0x2c
> > [   21.197239] LR [c000000000074fa8] pseries_lpar_idle+0x98/0x1b0
> > [   21.197305] --- interrupt: 900
> > [   21.197333] [c0000000061efd30] [0000000000000001] 0x1 (unreliable)
> > [   21.197390] [c0000000061efdb0] [c000000000018b54] arch_cpu_idle+0x44/0x180
> > [   21.197470] [c0000000061efde0] [c0000000003f75bc] default_idle_call+0x4c/0x7c
> > [   21.197556] [c0000000061efe00] [c0000000000e1384] do_idle+0x114/0x1e0
> > [   21.197620] [c0000000061efe60] [c0000000000e1664] cpu_startup_entry+0x34/0x40
> > [   21.197696] [c0000000061efe90] [c00000000003f044] start_secondary+0x624/0xa00
> > [   21.197820] [c0000000061eff90] [c00000000000cd54] start_secondary_prolog+0x10/0x14

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: rcu_sched self-detected stall on CPU
@ 2022-04-06 19:50         ` Paul E. McKenney
  0 siblings, 0 replies; 56+ messages in thread
From: Paul E. McKenney @ 2022-04-06 19:50 UTC (permalink / raw)
  To: Zhouyi Zhou; +Cc: Miguel Ojeda, linuxppc-dev, rcu

On Thu, Apr 07, 2022 at 02:25:59AM +0800, Zhouyi Zhou wrote:
> Hi Paul
> 
> On Thu, Apr 7, 2022 at 1:00 AM Paul E. McKenney <paulmck@kernel.org> wrote:
> >
> > On Wed, Apr 06, 2022 at 05:31:10PM +0800, Zhouyi Zhou wrote:
> > > Hi
> > >
> > > I can reproduce it in a ppc virtual cloud server provided by Oregon
> > > State University.  Following is what I do:
> > > 1) curl -l https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/snapshot/linux-5.18-rc1.tar.gz
> > > -o linux-5.18-rc1.tar.gz
> > > 2) tar zxf linux-5.18-rc1.tar.gz
> > > 3) cp config linux-5.18-rc1/.config
> > > 4) cd linux-5.18-rc1
> > > 5) make vmlinux -j 8
> > > 6) qemu-system-ppc64 -kernel vmlinux -nographic -vga none -no-reboot
> > > -smp 2 (QEMU 4.2.1)
> > > 7) after 12 rounds, the bug got reproduced:
> > > (http://154.223.142.244/logs/20220406/qemu.log.txt)
> >
> > Just to make sure, are you both seeing the same thing?  Last I knew,
> > Zhouyi was chasing an RCU-tasks issue that appears only in kernels
> > built with CONFIG_PROVE_RCU=y, which Miguel does not have set.  Or did
> > I miss something?
> We are both seeing the same thing; I am working on the two issues in
> parallel.
> 1) I am chasing the RCU-tasks issue, and I will report my findings
> to you later.
> 2) I am reproducing the RCU CPU stall issue reported by Miguel
> yesterday. Luckily, I can reproduce it, thanks to Oregon State
> University, which provides me with the environment! I am also very
> interested in helping track down the cause of the issue. Fortunately,
> the issue can be reproduced in a non-hardware-accelerated QEMU
> environment, so I can lend a hand.

How quickly does this happen?  The console log that Miguel sent showed
it within 30 seconds of boot.  If it always happens this quickly, it
should be possible to do a bisection, especially when running qemu.
The trick would be to boot a given commit either until you see it fail
or until it boots successfully 70 times.  In the latter case, report
success to "git bisect"; in the former case, report failure.
If the one-out-of-5 failure rate is accurate, you will have a 99.997%
chance of reporting the correct failure state on each step, resulting
in better than a 99.9% chance of converging on the correct commit.
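
As a rough check on those numbers, and assuming each boot fails
independently with the same probability p, the chance that a bad commit
survives N clean boots in a row is (1 - p)^N.  A minimal sketch using awk
(the function name miss_probability is only illustrative):

```shell
#!/bin/sh
# miss_probability FAIL_RATE BOOTS
# Prints the probability that a commit with per-boot failure rate
# FAIL_RATE boots cleanly BOOTS times in a row, i.e. (1 - p)^n.
# Assumes boots are independent, which may not hold in practice.
miss_probability() {
    awk -v p="$1" -v n="$2" 'BEGIN { printf "%.3e\n", (1 - p) ^ n }'
}
```

Calling `miss_probability 0.2 70` gives a value of roughly 1.6e-7, which
sits comfortably inside the 99.997% per-step figure.  If the real failure
rate is lower than one in five, rerunning with a smaller FAIL_RATE shows
how many boots would be needed for similar confidence.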

Of course, you would hit the preceding commit hard to double-check.

Does this seem reasonable?  Or am I being overly optimistic on the
failure times?

							Thanx, Paul

> Thanks
> Zhouyi
> >
> > Miguel is instead seeing an RCU CPU stall warning where RCU's grace-period
> > kthread slept for three milliseconds, but did not wake up for more than
> > 20 seconds.  This kthread would normally have awakened on CPU 1, but
> > CPU 1 looks to me to be very unhealthy, as can be seen in your console
> > output below (but maybe my idea of what is healthy for powerpc systems
> > is outdated).  Please see also the inline annotations.
> >
> > Thoughts from the PPC guys?
> >
> >                                                         Thanx, Paul
> >
> > ------------------------------------------------------------------------
> >
> > [   21.186912] rcu: INFO: rcu_sched self-detected stall on CPU
> > [   21.187331] rcu:     1-...!: (4712629 ticks this GP) idle=2c1/0/0x3 softirq=8/8 fqs=0
> > [   21.187529]  (t=21000 jiffies g=-1183 q=3)
> > [   21.187681] rcu: rcu_sched kthread timer wakeup didn't happen for 20997 jiffies! g-1183 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402
> >
> >         The grace-period kthread is still asleep (->state=0x402).
> >         This indicates that the three-jiffy timer has somehow been
> >         prevented from expiring for almost a full 21 seconds.  Of course,
> >         if timers don't work, RCU cannot work.
> >
> > [   21.187770] rcu:     Possible timer handling issue on cpu=1 timer-softirq=1
> > [   21.187927] rcu: rcu_sched kthread starved for 21001 jiffies! g-1183 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 ->cpu=1
> > [   21.188019] rcu:     Unless rcu_sched kthread gets sufficient CPU time, OOM is now expected behavior.
> > [   21.188087] rcu: RCU grace-period kthread stack dump:
> > [   21.188196] task:rcu_sched       state:I stack:    0 pid:   10 ppid:     2 flags:0x00000800
> > [   21.188453] Call Trace:
> > [   21.188525] [c0000000061e78a0] [c0000000061e78e0] 0xc0000000061e78e0 (unreliable)
> > [   21.188900] [c0000000061e7a90] [c000000000017210] __switch_to+0x250/0x310
> > [   21.189210] [c0000000061e7b00] [c0000000003ed660] __schedule+0x210/0x660
> > [   21.189315] [c0000000061e7b80] [c0000000003edb14] schedule+0x64/0x110
> > [   21.189387] [c0000000061e7bb0] [c0000000003f6648] schedule_timeout+0x1d8/0x390
> > [   21.189473] [c0000000061e7c80] [c00000000011111c] rcu_gp_fqs_loop+0x2dc/0x3d0
> > [   21.189555] [c0000000061e7d30] [c0000000001144ec] rcu_gp_kthread+0x13c/0x160
> > [   21.189633] [c0000000061e7dc0] [c0000000000c1770] kthread+0x110/0x120
> > [   21.189714] [c0000000061e7e10] [c00000000000c9e4] ret_from_kernel_thread+0x5c/0x64
> >
> >         The above stack trace is expected behavior when the RCU
> >         grace-period kthread is waiting to do its next FQS scan.
> >
> > [   21.189938] rcu: Stack dump where RCU GP kthread last ran:
> >
> >         And here is the stalled CPU, which also happens to be the CPU
> >         that RCU last ran on:
> >
> > [   21.189992] Task dump for CPU 1:
> > [   21.190059] task:swapper/1       state:R  running task     stack:    0 pid:    0 ppid:     1 flags:0x00000804
> > [   21.190169] Call Trace:
> > [   21.190194] [c0000000061ef2d0] [c0000000000c9a40] sched_show_task+0x180/0x1c0 (unreliable)
> > [   21.190278] [c0000000061ef340] [c000000000116ca0] rcu_check_gp_kthread_starvation+0x16c/0x19c
> > [   21.190370] [c0000000061ef3c0] [c000000000114f7c] rcu_sched_clock_irq+0x7ec/0xaf0
> > [   21.190448] [c0000000061ef4b0] [c000000000120fdc] update_process_times+0xbc/0x140
> > [   21.190524] [c0000000061ef4f0] [c000000000136a24] tick_nohz_handler+0xf4/0x1b0
> > [   21.190608] [c0000000061ef540] [c00000000001c828] timer_interrupt+0x148/0x2d0
> > [   21.190699] [c0000000061ef590] [c0000000000098e8] decrementer_common_virt+0x208/0x210
> > [   21.190837] --- interrupt: 900 at arch_local_irq_restore+0x168/0x170
> >
> >         Up through this point is just the stack trace of the
> >         code doing the stack dump that the RCU CPU stall warning code
> >         asked for.
> >
> > [   21.190941] NIP:  c000000000013608 LR: c0000000003f8114 CTR: c0000000000dc630
> >
> >         This NIP does not look at all good to me.  But I freely confess
> >         that I am out of date on what Power machines do.
> >
> > [   21.191031] REGS: c0000000061ef600 TRAP: 0900   Not tainted  (5.18.0-rc1)
> > [   21.191109] MSR:  8000000000009033 <SF,EE,ME,IR,DR,RI,LE>  CR: 22000202  XER: 00000000
> > [   21.191274] CFAR: 0000000000000000 IRQMASK: 0
> > [   21.191274] GPR00: c00000000009c368 c0000000061ef8a0 c00000000116a700 0000000000000000
> > [   21.191274] GPR04: 0000000000000000 0000000000000000 000000001ee30000 ffffffffffffffff
> > [   21.191274] GPR08: 000000001ee30000 0000000000000000 0000000000008002 7265677368657265
> > [   21.191274] GPR12: c0000000000dc630 c00000001ffe5800 0000000000000000 0000000000000000
> > [   21.191274] GPR16: 0000000000000282 0000000000000000 0000000000000000 c0000000061eff00
> > [   21.191274] GPR20: 0000000000000000 0000000000000001 c0000000061b9f80 c000000001195a10
> > [   21.191274] GPR24: c000000001193a00 00000000fffb6cc4 000000000000000a c0000000010721e8
> > [   21.191274] GPR28: c000000001076800 c000000001070380 c0000000010716d8 c0000000061b9f80
> > [   21.191932] NIP [c000000000013608] arch_local_irq_restore+0x168/0x170
> > [   21.192024] LR [c0000000003f8114] __do_softirq+0xd4/0x2ec
> > [   21.192118] --- interrupt: 900
> > [   21.192158] [c0000000061ef8a0] [c0000000061b9f80] 0xc0000000061b9f80 (unreliable)
> > [   21.192227] [c0000000061ef9b0] [c00000000009c368] irq_exit+0xc8/0x110
> > [   21.192307] [c0000000061ef9d0] [c00000000001c858] timer_interrupt+0x178/0x2d0
> > [   21.192397] [c0000000061efa20] [c0000000000098e8] decrementer_common_virt+0x208/0x210
> > [   21.192495] --- interrupt: 900 at plpar_hcall_norets_notrace+0x18/0x2c
> > [   21.192566] NIP:  c000000000072988 LR: c000000000074fa8 CTR: c000000000074f10
> > [   21.192615] REGS: c0000000061efa90 TRAP: 0900   Not tainted  (5.18.0-rc1)
> > [   21.192659] MSR:  8000000000009033 <SF,EE,ME,IR,DR,RI,LE>  CR: 28000202  XER: 00000000
> > [   21.192755] CFAR: 0000000000000000 IRQMASK: 0
> > [   21.192755] GPR00: 0000000028000202 c0000000061efd30 c00000000116a700 0000000000000000
> > [   21.192755] GPR04: c00000001fea0280 ffffffffffffffff 0000000001f40000 000000019d088fcf
> > [   21.192755] GPR08: 000000001ee30000 c00000001ffe5400 0000000000000001 0000000100000000
> > [   21.192755] GPR12: c000000000074f10 c00000001ffe5800 0000000000000000 0000000000000000
> > [   21.192755] GPR16: 0000000000000000 0000000000000000 0000000000000000 c0000000061eff00
> > [   21.192755] GPR20: c00000000003d440 0000000000000001 c000000001195b30 c000000001195a10
> > [   21.192755] GPR24: 0000000000080000 c0000000061ba000 c000000001195a98 0000000000000001
> > [   21.192755] GPR28: 0000000000000001 c0000000010716d0 c0000000010716d8 c0000000010716d0
> > [   21.193290] NIP [c000000000072988] plpar_hcall_norets_notrace+0x18/0x2c
> > [   21.193363] LR [c000000000074fa8] pseries_lpar_idle+0x98/0x1b0
> > [   21.193428] --- interrupt: 900
> > [   21.193457] [c0000000061efd30] [0000000000000001] 0x1 (unreliable)
> > [   21.193512] [c0000000061efdb0] [c000000000018b54] arch_cpu_idle+0x44/0x180
> > [   21.193590] [c0000000061efde0] [c0000000003f75bc] default_idle_call+0x4c/0x7c
> > [   21.193679] [c0000000061efe00] [c0000000000e1384] do_idle+0x114/0x1e0
> > [   21.193747] [c0000000061efe60] [c0000000000e1664] cpu_startup_entry+0x34/0x40
> > [   21.193901] [c0000000061efe90] [c00000000003f044] start_secondary+0x624/0xa00
> > [   21.194002] [c0000000061eff90] [c00000000000cd54] start_secondary_prolog+0x10/0x14
> > [   21.194245] Task dump for CPU 1:
> > [   21.194284] task:swapper/1       state:R  running task     stack:    0 pid:    0 ppid:     1 flags:0x00000804
> > [   21.194374] Call Trace:
> > [   21.194400] [c0000000061ef2b0] [c0000000000c9a40] sched_show_task+0x180/0x1c0 (unreliable)
> > [   21.194479] [c0000000061ef320] [c000000000116df8] rcu_dump_cpu_stacks+0x128/0x188
> > [   21.194567] [c0000000061ef3c0] [c000000000114f9c] rcu_sched_clock_irq+0x80c/0xaf0
> > [   21.194642] [c0000000061ef4b0] [c000000000120fdc] update_process_times+0xbc/0x140
> > [   21.194712] [c0000000061ef4f0] [c000000000136a24] tick_nohz_handler+0xf4/0x1b0
> > [   21.194828] [c0000000061ef540] [c00000000001c828] timer_interrupt+0x148/0x2d0
> > [   21.194942] [c0000000061ef590] [c0000000000098e8] decrementer_common_virt+0x208/0x210
> > [   21.195035] --- interrupt: 900 at arch_local_irq_restore+0x168/0x170
> > [   21.195104] NIP:  c000000000013608 LR: c0000000003f8114 CTR: c0000000000dc630
> > [   21.195152] REGS: c0000000061ef600 TRAP: 0900   Not tainted  (5.18.0-rc1)
> > [   21.195199] MSR:  8000000000009033 <SF,EE,ME,IR,DR,RI,LE>  CR: 22000202  XER: 00000000
> > [   21.195296] CFAR: 0000000000000000 IRQMASK: 0
> > [   21.195296] GPR00: c00000000009c368 c0000000061ef8a0 c00000000116a700 0000000000000000
> > [   21.195296] GPR04: 0000000000000000 0000000000000000 000000001ee30000 ffffffffffffffff
> > [   21.195296] GPR08: 000000001ee30000 0000000000000000 0000000000008002 7265677368657265
> > [   21.195296] GPR12: c0000000000dc630 c00000001ffe5800 0000000000000000 0000000000000000
> > [   21.195296] GPR16: 0000000000000282 0000000000000000 0000000000000000 c0000000061eff00
> > [   21.195296] GPR20: 0000000000000000 0000000000000001 c0000000061b9f80 c000000001195a10
> > [   21.195296] GPR24: c000000001193a00 00000000fffb6cc4 000000000000000a c0000000010721e8
> > [   21.195296] GPR28: c000000001076800 c000000001070380 c0000000010716d8 c0000000061b9f80
> > [   21.195850] NIP [c000000000013608] arch_local_irq_restore+0x168/0x170
> > [   21.195944] LR [c0000000003f8114] __do_softirq+0xd4/0x2ec
> > [   21.196027] --- interrupt: 900
> > [   21.196056] [c0000000061ef8a0] [c0000000061b9f80] 0xc0000000061b9f80 (unreliable)
> > [   21.196119] [c0000000061ef9b0] [c00000000009c368] irq_exit+0xc8/0x110
> > [   21.196192] [c0000000061ef9d0] [c00000000001c858] timer_interrupt+0x178/0x2d0
> > [   21.196282] [c0000000061efa20] [c0000000000098e8] decrementer_common_virt+0x208/0x210
> > [   21.196373] --- interrupt: 900 at plpar_hcall_norets_notrace+0x18/0x2c
> > [   21.196439] NIP:  c000000000072988 LR: c000000000074fa8 CTR: c000000000074f10
> > [   21.196489] REGS: c0000000061efa90 TRAP: 0900   Not tainted  (5.18.0-rc1)
> > [   21.196534] MSR:  8000000000009033 <SF,EE,ME,IR,DR,RI,LE>  CR: 28000202  XER: 00000000
> > [   21.196627] CFAR: 0000000000000000 IRQMASK: 0
> > [   21.196627] GPR00: 0000000028000202 c0000000061efd30 c00000000116a700 0000000000000000
> > [   21.196627] GPR04: c00000001fea0280 ffffffffffffffff 0000000001f40000 000000019d088fcf
> > [   21.196627] GPR08: 000000001ee30000 c00000001ffe5400 0000000000000001 0000000100000000
> > [   21.196627] GPR12: c000000000074f10 c00000001ffe5800 0000000000000000 0000000000000000
> > [   21.196627] GPR16: 0000000000000000 0000000000000000 0000000000000000 c0000000061eff00
> > [   21.196627] GPR20: c00000000003d440 0000000000000001 c000000001195b30 c000000001195a10
> > [   21.196627] GPR24: 0000000000080000 c0000000061ba000 c000000001195a98 0000000000000001
> > [   21.196627] GPR28: 0000000000000001 c0000000010716d0 c0000000010716d8 c0000000010716d0
> > [   21.197168] NIP [c000000000072988] plpar_hcall_norets_notrace+0x18/0x2c
> > [   21.197239] LR [c000000000074fa8] pseries_lpar_idle+0x98/0x1b0
> > [   21.197305] --- interrupt: 900
> > [   21.197333] [c0000000061efd30] [0000000000000001] 0x1 (unreliable)
> > [   21.197390] [c0000000061efdb0] [c000000000018b54] arch_cpu_idle+0x44/0x180
> > [   21.197470] [c0000000061efde0] [c0000000003f75bc] default_idle_call+0x4c/0x7c
> > [   21.197556] [c0000000061efe00] [c0000000000e1384] do_idle+0x114/0x1e0
> > [   21.197620] [c0000000061efe60] [c0000000000e1664] cpu_startup_entry+0x34/0x40
> > [   21.197696] [c0000000061efe90] [c00000000003f044] start_secondary+0x624/0xa00
> > [   21.197820] [c0000000061eff90] [c00000000000cd54] start_secondary_prolog+0x10/0x14

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: rcu_sched self-detected stall on CPU
  2022-04-06 19:50         ` Paul E. McKenney
@ 2022-04-07  2:26           ` Zhouyi Zhou
  -1 siblings, 0 replies; 56+ messages in thread
From: Zhouyi Zhou @ 2022-04-07  2:26 UTC (permalink / raw)
  To: Paul E. McKenney; +Cc: Miguel Ojeda, linuxppc-dev, rcu, Zhouyi Zhou

Hi Paul

On Thu, Apr 7, 2022 at 3:50 AM Paul E. McKenney <paulmck@kernel.org> wrote:
>
> On Thu, Apr 07, 2022 at 02:25:59AM +0800, Zhouyi Zhou wrote:
> > Hi Paul
> >
> > On Thu, Apr 7, 2022 at 1:00 AM Paul E. McKenney <paulmck@kernel.org> wrote:
> > >
> > > On Wed, Apr 06, 2022 at 05:31:10PM +0800, Zhouyi Zhou wrote:
> > > > Hi
> > > >
> > > > I can reproduce it in a ppc virtual cloud server provided by Oregon
> > > > State University.  Following is what I do:
> > > > 1) curl -l https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/snapshot/linux-5.18-rc1.tar.gz
> > > > -o linux-5.18-rc1.tar.gz
> > > > 2) tar zxf linux-5.18-rc1.tar.gz
> > > > 3) cp config linux-5.18-rc1/.config
> > > > 4) cd linux-5.18-rc1
> > > > 5) make vmlinux -j 8
> > > > 6) qemu-system-ppc64 -kernel vmlinux -nographic -vga none -no-reboot
> > > > -smp 2 (QEMU 4.2.1)
> > > > 7) after 12 rounds, the bug got reproduced:
> > > > (http://154.223.142.244/logs/20220406/qemu.log.txt)
> > >
> > > Just to make sure, are you both seeing the same thing?  Last I knew,
> > > Zhouyi was chasing an RCU-tasks issue that appears only in kernels
> > > built with CONFIG_PROVE_RCU=y, which Miguel does not have set.  Or did
> > > I miss something?
> > We are both seeing the same thing; I am working on two issues in parallel.
> > 1) I am chasing the RCU-tasks issue, and I will report my findings
> > to you later.
> > 2) I am reproducing the RCU CPU stall issue reported by Miguel
> > yesterday. Luckily, I can reproduce it, thanks to Oregon State
> > University, which provides me with the environment! I am also very
> > interested in helping to chase down the cause of the issue. Luckily,
> > the issue can be reproduced in a non-hardware-accelerated qemu
> > environment, so I can lend a hand.
>
> How quickly does this happen?  The console log that Miguel sent had it
> happening within 30 seconds of boot.  If it always happens this quickly, it
Yes, this happens within 30 seconds after kernel boot.  If we take
everything into account (QEMU startup, kernel loading), we can do one
test within 54 seconds.
> should be possible to do a bisection, especially when running qemu.
Thank you for your guidance! I will do it.
> The trick would be to boot a given commit until you see it fail on the
> one hand or until it boots successfully 70 times.  In the latter case,
> report success to "git bisect", in the former case report failure.
Yes, I will do it.
> If the one-out-of-5 failure rate is accurate, you will have a 99.997%
> chance of reporting the correct failure state on each step, resulting
> in better than a 99.9% chance of converging on the correct commit.
Agreed, I learned that probability calculation from your book.
>
> Of course, you would hit the preceding commit hard to double-check.
Agreed.
>
> Does this seem reasonable?  Or am I being overly optimistic on the
> failure times?
This is very reasonable.  I have written a test script (based on the
script I used to test the RCU-tasks issue) and will perform the bisection
in the coming days.

#!/bin/sh
# Boot the given kernel repeatedly under QEMU until an RCU stall shows up.
if [ "$#" -ne 1 ]; then
    echo "Usage: test.sh kernel" >&2
    exit 2
fi
KERNEL=$1
COUNTER=0
while [ $COUNTER -lt 1000 ] ; do
    # Keep the previous log, if there is one.
    [ -f /tmp/console.log ] && mv /tmp/console.log /tmp/console.log.orig
    echo $COUNTER > /tmp/console.log
    date
    qemu-system-ppc64 -kernel "$KERNEL" -nographic -vga none -no-reboot \
        -smp 2 -serial file:/tmp/console.log -m 2048 -append "console=ttyS0" &
    qemu_pid=$!
    echo "Start round $COUNTER"
    while true ; do
        if grep -q "rcu_sched self-detected stall" /tmp/console.log;
        then
            echo "find rcu_sched detected stalls"
            break
        fi
        if grep -q "Unable to mount root fs" /tmp/console.log;
        then
            echo "kernel test round $COUNTER finish"
            break
        fi
        sleep 1
    done
    kill $qemu_pid
    if grep -q "rcu_sched self-detected stall" /tmp/console.log;
    then
        # Print the round in which the stall appeared; nonzero status
        # signals that a stall was seen.
        echo $COUNTER
        exit 1
    fi
    COUNTER=$(($COUNTER+1))
done
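
For the bisection itself, the same stall check could be wrapped so that
"git bisect run" can consume the result directly.  The following is only a
sketch under assumptions: run_rounds and the boot command passed to it are
hypothetical names, and the boot command is expected to boot the kernel
once and leave the console output in $LOG.  The exit statuses follow the
"git bisect run" convention (0 = good, 1 = bad, 125 = skip).

```shell
#!/bin/sh
# Sketch of a helper for "git bisect run".
STALL_PATTERN="rcu_sched self-detected stall"
LOG=${LOG:-/tmp/console.log}

# run_rounds ROUNDS BOOT_CMD [ARGS...]
# BOOT_CMD is a caller-supplied (hypothetical) command that boots the
# kernel once and writes the console output to $LOG.
run_rounds() {
    rounds=$1
    shift
    i=0
    while [ "$i" -lt "$rounds" ]; do
        # A boot command that fails outright says nothing about the bug,
        # so ask bisect to skip this commit.
        "$@" || return 125
        if grep -q "$STALL_PATTERN" "$LOG"; then
            return 1    # stall seen: this commit is bad
        fi
        i=$((i + 1))
    done
    return 0            # no stall in ROUNDS boots: treat as good
}
```

A wrapper script (placeholder name wrapper.sh) would build the kernel, call
something like `run_rounds 70 ./boot-once.sh vmlinux`, and exit with its
status.  The bisection could then be driven with
`git bisect start v5.18-rc1 v5.17` followed by `git bisect run ./wrapper.sh`,
taking v5.17 as good on the strength of Miguel's report that it did not
reproduce there in a few tries.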

Thanks
Zhouyi
>
>                                                         Thanx, Paul
>
> > Thanks
> > Zhouyi
> > >
> > > Miguel is instead seeing an RCU CPU stall warning where RCU's grace-period
> > > kthread slept for three milliseconds, but did not wake up for more than
> > > 20 seconds.  This kthread would normally have awakened on CPU 1, but
> > > CPU 1 looks to me to be very unhealthy, as can be seen in your console
> > > output below (but maybe my idea of what is healthy for powerpc systems
> > > is outdated).  Please see also the inline annotations.
> > >
> > > Thoughts from the PPC guys?
> > >
> > >                                                         Thanx, Paul
> > >
> > > ------------------------------------------------------------------------
> > >
> > > [   21.186912] rcu: INFO: rcu_sched self-detected stall on CPU
> > > [   21.187331] rcu:     1-...!: (4712629 ticks this GP) idle=2c1/0/0x3 softirq=8/8 fqs=0
> > > [   21.187529]  (t=21000 jiffies g=-1183 q=3)
> > > [   21.187681] rcu: rcu_sched kthread timer wakeup didn't happen for 20997 jiffies! g-1183 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402
> > >
> > >         The grace-period kthread is still asleep (->state=0x402).
> > >         This indicates that the three-jiffy timer has somehow been
> > >         prevented from expiring for almost a full 21 seconds.  Of course,
> > >         if timers don't work, RCU cannot work.
> > >
> > > [   21.187770] rcu:     Possible timer handling issue on cpu=1 timer-softirq=1
> > > [   21.187927] rcu: rcu_sched kthread starved for 21001 jiffies! g-1183 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 ->cpu=1
> > > [   21.188019] rcu:     Unless rcu_sched kthread gets sufficient CPU time, OOM is now expected behavior.
> > > [   21.188087] rcu: RCU grace-period kthread stack dump:
> > > [   21.188196] task:rcu_sched       state:I stack:    0 pid:   10 ppid:     2 flags:0x00000800
> > > [   21.188453] Call Trace:
> > > [   21.188525] [c0000000061e78a0] [c0000000061e78e0] 0xc0000000061e78e0 (unreliable)
> > > [   21.188900] [c0000000061e7a90] [c000000000017210] __switch_to+0x250/0x310
> > > [   21.189210] [c0000000061e7b00] [c0000000003ed660] __schedule+0x210/0x660
> > > [   21.189315] [c0000000061e7b80] [c0000000003edb14] schedule+0x64/0x110
> > > [   21.189387] [c0000000061e7bb0] [c0000000003f6648] schedule_timeout+0x1d8/0x390
> > > [   21.189473] [c0000000061e7c80] [c00000000011111c] rcu_gp_fqs_loop+0x2dc/0x3d0
> > > [   21.189555] [c0000000061e7d30] [c0000000001144ec] rcu_gp_kthread+0x13c/0x160
> > > [   21.189633] [c0000000061e7dc0] [c0000000000c1770] kthread+0x110/0x120
> > > [   21.189714] [c0000000061e7e10] [c00000000000c9e4] ret_from_kernel_thread+0x5c/0x64
> > >
> > >         The above stack trace is expected behavior when the RCU
> > >         grace-period kthread is waiting to do its next FQS scan.
> > >
> > > [   21.189938] rcu: Stack dump where RCU GP kthread last ran:
> > >
> > >         And here is the stalled CPU, which also happens to be the CPU
> > >         that RCU last ran on:
> > >
> > > [   21.189992] Task dump for CPU 1:
> > > [   21.190059] task:swapper/1       state:R  running task     stack:    0 pid:    0 ppid:     1 flags:0x00000804
> > > [   21.190169] Call Trace:
> > > [   21.190194] [c0000000061ef2d0] [c0000000000c9a40] sched_show_task+0x180/0x1c0 (unreliable)
> > > [   21.190278] [c0000000061ef340] [c000000000116ca0] rcu_check_gp_kthread_starvation+0x16c/0x19c
> > > [   21.190370] [c0000000061ef3c0] [c000000000114f7c] rcu_sched_clock_irq+0x7ec/0xaf0
> > > [   21.190448] [c0000000061ef4b0] [c000000000120fdc] update_process_times+0xbc/0x140
> > > [   21.190524] [c0000000061ef4f0] [c000000000136a24] tick_nohz_handler+0xf4/0x1b0
> > > [   21.190608] [c0000000061ef540] [c00000000001c828] timer_interrupt+0x148/0x2d0
> > > [   21.190699] [c0000000061ef590] [c0000000000098e8] decrementer_common_virt+0x208/0x210
> > > [   21.190837] --- interrupt: 900 at arch_local_irq_restore+0x168/0x170
> > >
> > >         Up through this point is just the stack trace of the
> > >         code doing the stack dump that the RCU CPU stall warning code
> > >         asked for.
> > >
> > > [   21.190941] NIP:  c000000000013608 LR: c0000000003f8114 CTR: c0000000000dc630
> > >
> > >         This NIP does not look at all good to me.  But I freely confess
> > >         that I am out of date on what Power machines do.
> > >
> > > [   21.191031] REGS: c0000000061ef600 TRAP: 0900   Not tainted  (5.18.0-rc1)
> > > [   21.191109] MSR:  8000000000009033 <SF,EE,ME,IR,DR,RI,LE>  CR: 22000202  XER: 00000000
> > > [   21.191274] CFAR: 0000000000000000 IRQMASK: 0
> > > [   21.191274] GPR00: c00000000009c368 c0000000061ef8a0 c00000000116a700 0000000000000000
> > > [   21.191274] GPR04: 0000000000000000 0000000000000000 000000001ee30000 ffffffffffffffff
> > > [   21.191274] GPR08: 000000001ee30000 0000000000000000 0000000000008002 7265677368657265
> > > [   21.191274] GPR12: c0000000000dc630 c00000001ffe5800 0000000000000000 0000000000000000
> > > [   21.191274] GPR16: 0000000000000282 0000000000000000 0000000000000000 c0000000061eff00
> > > [   21.191274] GPR20: 0000000000000000 0000000000000001 c0000000061b9f80 c000000001195a10
> > > [   21.191274] GPR24: c000000001193a00 00000000fffb6cc4 000000000000000a c0000000010721e8
> > > [   21.191274] GPR28: c000000001076800 c000000001070380 c0000000010716d8 c0000000061b9f80
> > > [   21.191932] NIP [c000000000013608] arch_local_irq_restore+0x168/0x170
> > > [   21.192024] LR [c0000000003f8114] __do_softirq+0xd4/0x2ec
> > > [   21.192118] --- interrupt: 900
> > > [   21.192158] [c0000000061ef8a0] [c0000000061b9f80] 0xc0000000061b9f80 (unreliable)
> > > [   21.192227] [c0000000061ef9b0] [c00000000009c368] irq_exit+0xc8/0x110
> > > [   21.192307] [c0000000061ef9d0] [c00000000001c858] timer_interrupt+0x178/0x2d0
> > > [   21.192397] [c0000000061efa20] [c0000000000098e8] decrementer_common_virt+0x208/0x210
> > > [   21.192495] --- interrupt: 900 at plpar_hcall_norets_notrace+0x18/0x2c
> > > [   21.192566] NIP:  c000000000072988 LR: c000000000074fa8 CTR: c000000000074f10
> > > [   21.192615] REGS: c0000000061efa90 TRAP: 0900   Not tainted  (5.18.0-rc1)
> > > [   21.192659] MSR:  8000000000009033 <SF,EE,ME,IR,DR,RI,LE>  CR: 28000202  XER: 00000000
> > > [   21.192755] CFAR: 0000000000000000 IRQMASK: 0
> > > [   21.192755] GPR00: 0000000028000202 c0000000061efd30 c00000000116a700 0000000000000000
> > > [   21.192755] GPR04: c00000001fea0280 ffffffffffffffff 0000000001f40000 000000019d088fcf
> > > [   21.192755] GPR08: 000000001ee30000 c00000001ffe5400 0000000000000001 0000000100000000
> > > [   21.192755] GPR12: c000000000074f10 c00000001ffe5800 0000000000000000 0000000000000000
> > > [   21.192755] GPR16: 0000000000000000 0000000000000000 0000000000000000 c0000000061eff00
> > > [   21.192755] GPR20: c00000000003d440 0000000000000001 c000000001195b30 c000000001195a10
> > > [   21.192755] GPR24: 0000000000080000 c0000000061ba000 c000000001195a98 0000000000000001
> > > [   21.192755] GPR28: 0000000000000001 c0000000010716d0 c0000000010716d8 c0000000010716d0
> > > [   21.193290] NIP [c000000000072988] plpar_hcall_norets_notrace+0x18/0x2c
> > > [   21.193363] LR [c000000000074fa8] pseries_lpar_idle+0x98/0x1b0
> > > [   21.193428] --- interrupt: 900
> > > [   21.193457] [c0000000061efd30] [0000000000000001] 0x1 (unreliable)
> > > [   21.193512] [c0000000061efdb0] [c000000000018b54] arch_cpu_idle+0x44/0x180
> > > [   21.193590] [c0000000061efde0] [c0000000003f75bc] default_idle_call+0x4c/0x7c
> > > [   21.193679] [c0000000061efe00] [c0000000000e1384] do_idle+0x114/0x1e0
> > > [   21.193747] [c0000000061efe60] [c0000000000e1664] cpu_startup_entry+0x34/0x40
> > > [   21.193901] [c0000000061efe90] [c00000000003f044] start_secondary+0x624/0xa00
> > > [   21.194002] [c0000000061eff90] [c00000000000cd54] start_secondary_prolog+0x10/0x14
> > > [   21.194245] Task dump for CPU 1:
> > > [   21.194284] task:swapper/1       state:R  running task     stack:    0 pid:    0 ppid:     1 flags:0x00000804
> > > [   21.194374] Call Trace:
> > > [   21.194400] [c0000000061ef2b0] [c0000000000c9a40] sched_show_task+0x180/0x1c0 (unreliable)
> > > [   21.194479] [c0000000061ef320] [c000000000116df8] rcu_dump_cpu_stacks+0x128/0x188
> > > [   21.194567] [c0000000061ef3c0] [c000000000114f9c] rcu_sched_clock_irq+0x80c/0xaf0
> > > [   21.194642] [c0000000061ef4b0] [c000000000120fdc] update_process_times+0xbc/0x140
> > > [   21.194712] [c0000000061ef4f0] [c000000000136a24] tick_nohz_handler+0xf4/0x1b0
> > > [   21.194828] [c0000000061ef540] [c00000000001c828] timer_interrupt+0x148/0x2d0
> > > [   21.194942] [c0000000061ef590] [c0000000000098e8] decrementer_common_virt+0x208/0x210
> > > [   21.195035] --- interrupt: 900 at arch_local_irq_restore+0x168/0x170
> > > [   21.195104] NIP:  c000000000013608 LR: c0000000003f8114 CTR: c0000000000dc630
> > > [   21.195152] REGS: c0000000061ef600 TRAP: 0900   Not tainted  (5.18.0-rc1)
> > > [   21.195199] MSR:  8000000000009033 <SF,EE,ME,IR,DR,RI,LE>  CR: 22000202  XER: 00000000
> > > [   21.195296] CFAR: 0000000000000000 IRQMASK: 0
> > > [   21.195296] GPR00: c00000000009c368 c0000000061ef8a0 c00000000116a700 0000000000000000
> > > [   21.195296] GPR04: 0000000000000000 0000000000000000 000000001ee30000 ffffffffffffffff
> > > [   21.195296] GPR08: 000000001ee30000 0000000000000000 0000000000008002 7265677368657265
> > > [   21.195296] GPR12: c0000000000dc630 c00000001ffe5800 0000000000000000 0000000000000000
> > > [   21.195296] GPR16: 0000000000000282 0000000000000000 0000000000000000 c0000000061eff00
> > > [   21.195296] GPR20: 0000000000000000 0000000000000001 c0000000061b9f80 c000000001195a10
> > > [   21.195296] GPR24: c000000001193a00 00000000fffb6cc4 000000000000000a c0000000010721e8
> > > [   21.195296] GPR28: c000000001076800 c000000001070380 c0000000010716d8 c0000000061b9f80
> > > [   21.195850] NIP [c000000000013608] arch_local_irq_restore+0x168/0x170
> > > [   21.195944] LR [c0000000003f8114] __do_softirq+0xd4/0x2ec
> > > [   21.196027] --- interrupt: 900
> > > [   21.196056] [c0000000061ef8a0] [c0000000061b9f80] 0xc0000000061b9f80 (unreliable)
> > > [   21.196119] [c0000000061ef9b0] [c00000000009c368] irq_exit+0xc8/0x110
> > > [   21.196192] [c0000000061ef9d0] [c00000000001c858] timer_interrupt+0x178/0x2d0
> > > [   21.196282] [c0000000061efa20] [c0000000000098e8] decrementer_common_virt+0x208/0x210
> > > [   21.196373] --- interrupt: 900 at plpar_hcall_norets_notrace+0x18/0x2c
> > > [   21.196439] NIP:  c000000000072988 LR: c000000000074fa8 CTR: c000000000074f10
> > > [   21.196489] REGS: c0000000061efa90 TRAP: 0900   Not tainted  (5.18.0-rc1)
> > > [   21.196534] MSR:  8000000000009033 <SF,EE,ME,IR,DR,RI,LE>  CR: 28000202  XER: 00000000
> > > [   21.196627] CFAR: 0000000000000000 IRQMASK: 0
> > > [   21.196627] GPR00: 0000000028000202 c0000000061efd30 c00000000116a700 0000000000000000
> > > [   21.196627] GPR04: c00000001fea0280 ffffffffffffffff 0000000001f40000 000000019d088fcf
> > > [   21.196627] GPR08: 000000001ee30000 c00000001ffe5400 0000000000000001 0000000100000000
> > > [   21.196627] GPR12: c000000000074f10 c00000001ffe5800 0000000000000000 0000000000000000
> > > [   21.196627] GPR16: 0000000000000000 0000000000000000 0000000000000000 c0000000061eff00
> > > [   21.196627] GPR20: c00000000003d440 0000000000000001 c000000001195b30 c000000001195a10
> > > [   21.196627] GPR24: 0000000000080000 c0000000061ba000 c000000001195a98 0000000000000001
> > > [   21.196627] GPR28: 0000000000000001 c0000000010716d0 c0000000010716d8 c0000000010716d0
> > > [   21.197168] NIP [c000000000072988] plpar_hcall_norets_notrace+0x18/0x2c
> > > [   21.197239] LR [c000000000074fa8] pseries_lpar_idle+0x98/0x1b0
> > > [   21.197305] --- interrupt: 900
> > > [   21.197333] [c0000000061efd30] [0000000000000001] 0x1 (unreliable)
> > > [   21.197390] [c0000000061efdb0] [c000000000018b54] arch_cpu_idle+0x44/0x180
> > > [   21.197470] [c0000000061efde0] [c0000000003f75bc] default_idle_call+0x4c/0x7c
> > > [   21.197556] [c0000000061efe00] [c0000000000e1384] do_idle+0x114/0x1e0
> > > [   21.197620] [c0000000061efe60] [c0000000000e1664] cpu_startup_entry+0x34/0x40
> > > [   21.197696] [c0000000061efe90] [c00000000003f044] start_secondary+0x624/0xa00
> > > [   21.197820] [c0000000061eff90] [c00000000000cd54] start_secondary_prolog+0x10/0x14

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: rcu_sched self-detected stall on CPU
@ 2022-04-07  2:26           ` Zhouyi Zhou
  0 siblings, 0 replies; 56+ messages in thread
From: Zhouyi Zhou @ 2022-04-07  2:26 UTC (permalink / raw)
  To: Paul E. McKenney; +Cc: rcu, Miguel Ojeda, linuxppc-dev, Zhouyi Zhou

Hi Paul

On Thu, Apr 7, 2022 at 3:50 AM Paul E. McKenney <paulmck@kernel.org> wrote:
>
> On Thu, Apr 07, 2022 at 02:25:59AM +0800, Zhouyi Zhou wrote:
> > Hi Paul
> >
> > On Thu, Apr 7, 2022 at 1:00 AM Paul E. McKenney <paulmck@kernel.org> wrote:
> > >
> > > On Wed, Apr 06, 2022 at 05:31:10PM +0800, Zhouyi Zhou wrote:
> > > > Hi
> > > >
> > > > I can reproduce it in a ppc virtual cloud server provided by Oregon
> > > > State University.  Following is what I do:
> > > > 1) curl -l https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/snapshot/linux-5.18-rc1.tar.gz
> > > > -o linux-5.18-rc1.tar.gz
> > > > 2) tar zxf linux-5.18-rc1.tar.gz
> > > > 3) cp config linux-5.18-rc1/.config
> > > > 4) cd linux-5.18-rc1
> > > > 5) make vmlinux -j 8
> > > > 6) qemu-system-ppc64 -kernel vmlinux -nographic -vga none -no-reboot
> > > > -smp 2 (QEMU 4.2.1)
> > > > 7) after 12 rounds, the bug got reproduced:
> > > > (http://154.223.142.244/logs/20220406/qemu.log.txt)
> > >
> > > Just to make sure, are you both seeing the same thing?  Last I knew,
> > > Zhouyi was chasing an RCU-tasks issue that appears only in kernels
> > > built with CONFIG_PROVE_RCU=y, which Miguel does not have set.  Or did
> > > I miss something?
> > We are both seeing the same thing; I am working on two issues in parallel.
> > 1) I am chasing the RCU-tasks issue, and I will report my findings
> > to you later.
> > 2) I am reproducing the RCU CPU stall issue reported by Miguel
> > yesterday. Luckily, I can reproduce it, thanks to Oregon State
> > University, which provides me with the environment! I am also very
> > interested in helping to chase down the cause of the issue. Luckily,
> > the issue can be reproduced in a non-hardware-accelerated qemu
> > environment, so I can lend a hand.
>
> How quickly does this happen?  The console log that Miguel sent had
> within 30 seconds of boot.  If it always happens this quickly, it
Yes, this happens within 30 seconds after kernel boot.  If we take all
into account (qemu preparing, kernel loading), we can do one test
within 54 seconds.
> should be possible to do a bisection, especially when running qemu.
Thank you for your guidance! I will do it.
> The trick would be to boot a given commit until you see it fail on the
> one hand or until it boots successfully 70 times.  In the latter case,
> report success to "git bisect", in the former case report failure.
Yes, I will do it.
> If the one-out-of-5 failure rate is accurate, you will have a 99.997%
> chance of reporting the correct failure state on each step, resulting
> in better than a 99.9% chance of converging on the correct commit.
Agree, I have learned that probability from your book.
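
For reference, the arithmetic behind those figures can be sketched as
follows. This is only an illustration under idealized assumptions that
are not from the thread: each boot of a bad commit fails independently
with probability 1/5, and the bisection takes about 15 steps (the helper
names are made up). Under these assumptions the per-step result comes out
at least as favorable as the figure quoted above.

```shell
#!/bin/sh
# Chance that a bad commit survives every boot and is wrongly called good.
# $1 = per-boot failure probability (assumed 0.2), $2 = boots per commit.
wrong_verdict_chance() {
    awk -v fail="$1" -v boots="$2" 'BEGIN { printf "%.2e\n", (1 - fail) ^ boots }'
}

# Chance that every step of the bisection reaches the right verdict.
# $3 = number of bisection steps (assumed, roughly log2 of the commit range).
converge_chance() {
    awk -v fail="$1" -v boots="$2" -v steps="$3" 'BEGIN {
        miss = (1 - fail) ^ boots
        printf "%.6f\n", (1 - miss) ^ steps
    }'
}

wrong_verdict_chance 0.2 70
converge_chance 0.2 70 15
```

If the real failure rate is lower than one in five, the same helpers
show how quickly the confidence drops, for example with
"wrong_verdict_chance 0.05 70".
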
>
> Of course, you would hit the preceding commit hard to double-check.
Agree
>
> Does this seem reasonable?  Or am I being overly optimistic on the
> failure times?
This is very reasonable, I have written a test script (based on the
script I used to test RCU-tasks issue), and will perform the bisection
in the coming days.

#!/bin/sh
# Boot the given kernel under QEMU in a loop until an RCU stall is seen.
if [ "$#" -ne 1 ]; then
    echo "Usage: test.sh kernel"
    exit 1
fi
KERNEL=$1
COUNTER=0
while [ $COUNTER -lt 1000 ] ; do
    # Keep the previous round's log (if any) and start a fresh one.
    [ -f /tmp/console.log ] && mv /tmp/console.log /tmp/console.log.orig
    echo $COUNTER > /tmp/console.log
    date
    qemu-system-ppc64 -kernel "$KERNEL" -nographic -vga none -no-reboot \
        -smp 2 -serial file:/tmp/console.log -m 2048 -append "console=ttyS0" &
    qemu_pid=$!
    echo "Start round $COUNTER"
    # Poll the console log until the stall or the end of boot shows up.
    while true ; do
        if grep -q "rcu_sched self-detected stall" /tmp/console.log;
        then
            echo "found rcu_sched self-detected stall"
            break
        fi
        # With no root fs, this message marks the end of a clean boot.
        if grep -q "Unable to mount root fs" /tmp/console.log;
        then
            echo "kernel test round $COUNTER finish"
            break
        fi
        sleep 1
    done
    kill $qemu_pid
    # On a stall, print the round number and stop.
    if grep -q "rcu_sched self-detected stall" /tmp/console.log;
    then
        echo $COUNTER
        exit
    fi
    COUNTER=$(($COUNTER+1))
done
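
A test like the one above could be wired into "git bisect run" roughly as
sketched below. This is a hedged illustration, not the procedure actually
used in this thread: classify_commit and boot-once.sh are hypothetical
names, and boot-once.sh is assumed to boot the kernel once and exit
non-zero if it saw the stall. The exit codes follow git-bisect's
convention (0 = good, 1 = bad, 125 = skip this commit).

```shell
#!/bin/sh
# Hypothetical helper for "git bisect run"; all names are illustrative.
# $1 = number of boots to try; the remaining arguments are a command that
# boots the kernel once and exits non-zero if the stall was seen.
classify_commit() {
    rounds=$1
    shift
    i=0
    while [ "$i" -lt "$rounds" ]; do
        "$@" || return 1    # stall seen: report "bad" to git bisect
        i=$((i + 1))
    done
    return 0                # no stall in any boot: report "good"
}
```

A driver script (say, bisect-step.sh) could then run
"make -j 8 vmlinux || exit 125" followed by
"classify_commit 70 ./boot-once.sh vmlinux", and the bisection would be
started with "git bisect start v5.18-rc1 v5.17" and
"git bisect run ./bisect-step.sh".
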

Thanks
Zhouyi
>
>                                                         Thanx, Paul
>
> > Thanks
> > Zhouyi
> > >
> > > Miguel is instead seeing an RCU CPU stall warning where RCU's grace-period
> > > kthread slept for three milliseconds, but did not wake up for more than
> > > 20 seconds.  This kthread would normally have awakened on CPU 1, but
> > > CPU 1 looks to me to be very unhealthy, as can be seen in your console
> > > output below (but maybe my idea of what is healthy for powerpc systems
> > > is outdated).  Please see also the inline annotations.
> > >
> > > Thoughts from the PPC guys?
> > >
> > >                                                         Thanx, Paul
> > >
> > > ------------------------------------------------------------------------
> > >
> > > [   21.186912] rcu: INFO: rcu_sched self-detected stall on CPU
> > > [   21.187331] rcu:     1-...!: (4712629 ticks this GP) idle=2c1/0/0x3 softirq=8/8 fqs=0
> > > [   21.187529]  (t=21000 jiffies g=-1183 q=3)
> > > [   21.187681] rcu: rcu_sched kthread timer wakeup didn't happen for 20997 jiffies! g-1183 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402
> > >
> > >         The grace-period kthread is still asleep (->state=0x402).
> > >         This indicates that the three-jiffy timer has somehow been
> > >         prevented from expiring for almost a full 21 seconds.  Of course,
> > >         if timers don't work, RCU cannot work.
> > >
> > > [   21.187770] rcu:     Possible timer handling issue on cpu=1 timer-softirq=1
> > > [   21.187927] rcu: rcu_sched kthread starved for 21001 jiffies! g-1183 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 ->cpu=1
> > > [   21.188019] rcu:     Unless rcu_sched kthread gets sufficient CPU time, OOM is now expected behavior.
> > > [   21.188087] rcu: RCU grace-period kthread stack dump:
> > > [   21.188196] task:rcu_sched       state:I stack:    0 pid:   10 ppid:     2 flags:0x00000800
> > > [   21.188453] Call Trace:
> > > [   21.188525] [c0000000061e78a0] [c0000000061e78e0] 0xc0000000061e78e0 (unreliable)
> > > [   21.188900] [c0000000061e7a90] [c000000000017210] __switch_to+0x250/0x310
> > > [   21.189210] [c0000000061e7b00] [c0000000003ed660] __schedule+0x210/0x660
> > > [   21.189315] [c0000000061e7b80] [c0000000003edb14] schedule+0x64/0x110
> > > [   21.189387] [c0000000061e7bb0] [c0000000003f6648] schedule_timeout+0x1d8/0x390
> > > [   21.189473] [c0000000061e7c80] [c00000000011111c] rcu_gp_fqs_loop+0x2dc/0x3d0
> > > [   21.189555] [c0000000061e7d30] [c0000000001144ec] rcu_gp_kthread+0x13c/0x160
> > > [   21.189633] [c0000000061e7dc0] [c0000000000c1770] kthread+0x110/0x120
> > > [   21.189714] [c0000000061e7e10] [c00000000000c9e4] ret_from_kernel_thread+0x5c/0x64
> > >
> > >         The above stack trace is expected behavior when the RCU
> > >         grace-period kthread is waiting to do its next FQS scan.
> > >
> > > [   21.189938] rcu: Stack dump where RCU GP kthread last ran:
> > >
> > >         And here is the stalled CPU, which also happens to be the CPU
> > >         that RCU last ran on:
> > >
> > > [   21.189992] Task dump for CPU 1:
> > > [   21.190059] task:swapper/1       state:R  running task     stack:    0 pid:    0 ppid:     1 flags:0x00000804
> > > [   21.190169] Call Trace:
> > > [   21.190194] [c0000000061ef2d0] [c0000000000c9a40] sched_show_task+0x180/0x1c0 (unreliable)
> > > [   21.190278] [c0000000061ef340] [c000000000116ca0] rcu_check_gp_kthread_starvation+0x16c/0x19c
> > > [   21.190370] [c0000000061ef3c0] [c000000000114f7c] rcu_sched_clock_irq+0x7ec/0xaf0
> > > [   21.190448] [c0000000061ef4b0] [c000000000120fdc] update_process_times+0xbc/0x140
> > > [   21.190524] [c0000000061ef4f0] [c000000000136a24] tick_nohz_handler+0xf4/0x1b0
> > > [   21.190608] [c0000000061ef540] [c00000000001c828] timer_interrupt+0x148/0x2d0
> > > [   21.190699] [c0000000061ef590] [c0000000000098e8] decrementer_common_virt+0x208/0x210
> > > [   21.190837] --- interrupt: 900 at arch_local_irq_restore+0x168/0x170
> > >
> > >         Up through this point is just the stack trace of the
> > >         code doing the stack dump that the RCU CPU stall warning code
> > >         asked for.
> > >
> > > [   21.190941] NIP:  c000000000013608 LR: c0000000003f8114 CTR: c0000000000dc630
> > >
> > >         This NIP does not look at all good to me.  But I freely confess
> > >         that I am out of date on what Power machines do.
> > >
> > > [   21.191031] REGS: c0000000061ef600 TRAP: 0900   Not tainted  (5.18.0-rc1)
> > > [   21.191109] MSR:  8000000000009033 <SF,EE,ME,IR,DR,RI,LE>  CR: 22000202  XER: 00000000
> > > [   21.191274] CFAR: 0000000000000000 IRQMASK: 0
> > > [   21.191274] GPR00: c00000000009c368 c0000000061ef8a0 c00000000116a700 0000000000000000
> > > [   21.191274] GPR04: 0000000000000000 0000000000000000 000000001ee30000 ffffffffffffffff
> > > [   21.191274] GPR08: 000000001ee30000 0000000000000000 0000000000008002 7265677368657265
> > > [   21.191274] GPR12: c0000000000dc630 c00000001ffe5800 0000000000000000 0000000000000000
> > > [   21.191274] GPR16: 0000000000000282 0000000000000000 0000000000000000 c0000000061eff00
> > > [   21.191274] GPR20: 0000000000000000 0000000000000001 c0000000061b9f80 c000000001195a10
> > > [   21.191274] GPR24: c000000001193a00 00000000fffb6cc4 000000000000000a c0000000010721e8
> > > [   21.191274] GPR28: c000000001076800 c000000001070380 c0000000010716d8 c0000000061b9f80
> > > [   21.191932] NIP [c000000000013608] arch_local_irq_restore+0x168/0x170
> > > [   21.192024] LR [c0000000003f8114] __do_softirq+0xd4/0x2ec
> > > [   21.192118] --- interrupt: 900
> > > [   21.192158] [c0000000061ef8a0] [c0000000061b9f80] 0xc0000000061b9f80 (unreliable)
> > > [   21.192227] [c0000000061ef9b0] [c00000000009c368] irq_exit+0xc8/0x110
> > > [   21.192307] [c0000000061ef9d0] [c00000000001c858] timer_interrupt+0x178/0x2d0
> > > [   21.192397] [c0000000061efa20] [c0000000000098e8] decrementer_common_virt+0x208/0x210
> > > [   21.192495] --- interrupt: 900 at plpar_hcall_norets_notrace+0x18/0x2c
> > > [   21.192566] NIP:  c000000000072988 LR: c000000000074fa8 CTR: c000000000074f10
> > > [   21.192615] REGS: c0000000061efa90 TRAP: 0900   Not tainted  (5.18.0-rc1)
> > > [   21.192659] MSR:  8000000000009033 <SF,EE,ME,IR,DR,RI,LE>  CR: 28000202  XER: 00000000
> > > [   21.192755] CFAR: 0000000000000000 IRQMASK: 0
> > > [   21.192755] GPR00: 0000000028000202 c0000000061efd30 c00000000116a700 0000000000000000
> > > [   21.192755] GPR04: c00000001fea0280 ffffffffffffffff 0000000001f40000 000000019d088fcf
> > > [   21.192755] GPR08: 000000001ee30000 c00000001ffe5400 0000000000000001 0000000100000000
> > > [   21.192755] GPR12: c000000000074f10 c00000001ffe5800 0000000000000000 0000000000000000
> > > [   21.192755] GPR16: 0000000000000000 0000000000000000 0000000000000000 c0000000061eff00
> > > [   21.192755] GPR20: c00000000003d440 0000000000000001 c000000001195b30 c000000001195a10
> > > [   21.192755] GPR24: 0000000000080000 c0000000061ba000 c000000001195a98 0000000000000001
> > > [   21.192755] GPR28: 0000000000000001 c0000000010716d0 c0000000010716d8 c0000000010716d0
> > > [   21.193290] NIP [c000000000072988] plpar_hcall_norets_notrace+0x18/0x2c
> > > [   21.193363] LR [c000000000074fa8] pseries_lpar_idle+0x98/0x1b0
> > > [   21.193428] --- interrupt: 900
> > > [   21.193457] [c0000000061efd30] [0000000000000001] 0x1 (unreliable)
> > > [   21.193512] [c0000000061efdb0] [c000000000018b54] arch_cpu_idle+0x44/0x180
> > > [   21.193590] [c0000000061efde0] [c0000000003f75bc] default_idle_call+0x4c/0x7c
> > > [   21.193679] [c0000000061efe00] [c0000000000e1384] do_idle+0x114/0x1e0
> > > [   21.193747] [c0000000061efe60] [c0000000000e1664] cpu_startup_entry+0x34/0x40
> > > [   21.193901] [c0000000061efe90] [c00000000003f044] start_secondary+0x624/0xa00
> > > [   21.194002] [c0000000061eff90] [c00000000000cd54] start_secondary_prolog+0x10/0x14
> > > [   21.194245] Task dump for CPU 1:
> > > [   21.194284] task:swapper/1       state:R  running task     stack:    0 pid:    0 ppid:     1 flags:0x00000804
> > > [   21.194374] Call Trace:
> > > [   21.194400] [c0000000061ef2b0] [c0000000000c9a40] sched_show_task+0x180/0x1c0 (unreliable)
> > > [   21.194479] [c0000000061ef320] [c000000000116df8] rcu_dump_cpu_stacks+0x128/0x188
> > > [   21.194567] [c0000000061ef3c0] [c000000000114f9c] rcu_sched_clock_irq+0x80c/0xaf0
> > > [   21.194642] [c0000000061ef4b0] [c000000000120fdc] update_process_times+0xbc/0x140
> > > [   21.194712] [c0000000061ef4f0] [c000000000136a24] tick_nohz_handler+0xf4/0x1b0
> > > [   21.194828] [c0000000061ef540] [c00000000001c828] timer_interrupt+0x148/0x2d0
> > > [   21.194942] [c0000000061ef590] [c0000000000098e8] decrementer_common_virt+0x208/0x210
> > > [   21.195035] --- interrupt: 900 at arch_local_irq_restore+0x168/0x170
> > > [   21.195104] NIP:  c000000000013608 LR: c0000000003f8114 CTR: c0000000000dc630
> > > [   21.195152] REGS: c0000000061ef600 TRAP: 0900   Not tainted  (5.18.0-rc1)
> > > [   21.195199] MSR:  8000000000009033 <SF,EE,ME,IR,DR,RI,LE>  CR: 22000202  XER: 00000000
> > > [   21.195296] CFAR: 0000000000000000 IRQMASK: 0
> > > [   21.195296] GPR00: c00000000009c368 c0000000061ef8a0 c00000000116a700 0000000000000000
> > > [   21.195296] GPR04: 0000000000000000 0000000000000000 000000001ee30000 ffffffffffffffff
> > > [   21.195296] GPR08: 000000001ee30000 0000000000000000 0000000000008002 7265677368657265
> > > [   21.195296] GPR12: c0000000000dc630 c00000001ffe5800 0000000000000000 0000000000000000
> > > [   21.195296] GPR16: 0000000000000282 0000000000000000 0000000000000000 c0000000061eff00
> > > [   21.195296] GPR20: 0000000000000000 0000000000000001 c0000000061b9f80 c000000001195a10
> > > [   21.195296] GPR24: c000000001193a00 00000000fffb6cc4 000000000000000a c0000000010721e8
> > > [   21.195296] GPR28: c000000001076800 c000000001070380 c0000000010716d8 c0000000061b9f80
> > > [   21.195850] NIP [c000000000013608] arch_local_irq_restore+0x168/0x170
> > > [   21.195944] LR [c0000000003f8114] __do_softirq+0xd4/0x2ec
> > > [   21.196027] --- interrupt: 900
> > > [   21.196056] [c0000000061ef8a0] [c0000000061b9f80] 0xc0000000061b9f80 (unreliable)
> > > [   21.196119] [c0000000061ef9b0] [c00000000009c368] irq_exit+0xc8/0x110
> > > [   21.196192] [c0000000061ef9d0] [c00000000001c858] timer_interrupt+0x178/0x2d0
> > > [   21.196282] [c0000000061efa20] [c0000000000098e8] decrementer_common_virt+0x208/0x210
> > > [   21.196373] --- interrupt: 900 at plpar_hcall_norets_notrace+0x18/0x2c
> > > [   21.196439] NIP:  c000000000072988 LR: c000000000074fa8 CTR: c000000000074f10
> > > [   21.196489] REGS: c0000000061efa90 TRAP: 0900   Not tainted  (5.18.0-rc1)
> > > [   21.196534] MSR:  8000000000009033 <SF,EE,ME,IR,DR,RI,LE>  CR: 28000202  XER: 00000000
> > > [   21.196627] CFAR: 0000000000000000 IRQMASK: 0
> > > [   21.196627] GPR00: 0000000028000202 c0000000061efd30 c00000000116a700 0000000000000000
> > > [   21.196627] GPR04: c00000001fea0280 ffffffffffffffff 0000000001f40000 000000019d088fcf
> > > [   21.196627] GPR08: 000000001ee30000 c00000001ffe5400 0000000000000001 0000000100000000
> > > [   21.196627] GPR12: c000000000074f10 c00000001ffe5800 0000000000000000 0000000000000000
> > > [   21.196627] GPR16: 0000000000000000 0000000000000000 0000000000000000 c0000000061eff00
> > > [   21.196627] GPR20: c00000000003d440 0000000000000001 c000000001195b30 c000000001195a10
> > > [   21.196627] GPR24: 0000000000080000 c0000000061ba000 c000000001195a98 0000000000000001
> > > [   21.196627] GPR28: 0000000000000001 c0000000010716d0 c0000000010716d8 c0000000010716d0
> > > [   21.197168] NIP [c000000000072988] plpar_hcall_norets_notrace+0x18/0x2c
> > > [   21.197239] LR [c000000000074fa8] pseries_lpar_idle+0x98/0x1b0
> > > [   21.197305] --- interrupt: 900
> > > [   21.197333] [c0000000061efd30] [0000000000000001] 0x1 (unreliable)
> > > [   21.197390] [c0000000061efdb0] [c000000000018b54] arch_cpu_idle+0x44/0x180
> > > [   21.197470] [c0000000061efde0] [c0000000003f75bc] default_idle_call+0x4c/0x7c
> > > [   21.197556] [c0000000061efe00] [c0000000000e1384] do_idle+0x114/0x1e0
> > > [   21.197620] [c0000000061efe60] [c0000000000e1664] cpu_startup_entry+0x34/0x40
> > > [   21.197696] [c0000000061efe90] [c00000000003f044] start_secondary+0x624/0xa00
> > > [   21.197820] [c0000000061eff90] [c00000000000cd54] start_secondary_prolog+0x10/0x14

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: rcu_sched self-detected stall on CPU
  2022-04-07  2:26           ` Zhouyi Zhou
@ 2022-04-07 10:07             ` Miguel Ojeda
  -1 siblings, 0 replies; 56+ messages in thread
From: Miguel Ojeda @ 2022-04-07 10:07 UTC (permalink / raw)
  To: Zhouyi Zhou; +Cc: Paul E. McKenney, linuxppc-dev, rcu

On Thu, Apr 7, 2022 at 4:27 AM Zhouyi Zhou <zhouzhouyi@gmail.com> wrote:
>
> Yes, this happens within 30 seconds after kernel boot.  If we take all
> into account (qemu preparing, kernel loading), we can do one test
> within 54 seconds.

When it does not trigger, the run should be 20 seconds quicker than
that (e.g. 10 seconds), since we don't wait for the stall timeout. I
guess the timeout could also be reduced a fair bit to make failures
quicker, but they do not contribute as much as the successes anyway.
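
As a sketch of how the timeout might be reduced: the stall timeout can be
set in seconds with the rcupdate.rcu_cpu_stall_timeout= boot parameter.
The helper name and the 5-second value below are only illustrative
assumptions, and whether the stall is still reported reliably with a
shorter timeout would need checking.

```shell
#!/bin/sh
# Hypothetical helper: build the kernel command line for one boot.
# $1 = RCU CPU stall timeout in seconds (illustrative value, untested here).
stall_append() {
    echo "console=ttyS0 rcupdate.rcu_cpu_stall_timeout=$1"
}

# Example use with the QEMU invocation from the test script:
# qemu-system-ppc64 -kernel vmlinux -nographic -vga none -no-reboot \
#     -smp 2 -serial file:/tmp/console.log -m 2048 -append "$(stall_append 5)"
```
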

Thanks a lot for running the bisect on that server, Zhouyi!

Cheers,
Miguel

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: rcu_sched self-detected stall on CPU
  2022-04-07 10:07             ` Miguel Ojeda
@ 2022-04-07 15:15               ` Paul E. McKenney
  -1 siblings, 0 replies; 56+ messages in thread
From: Paul E. McKenney @ 2022-04-07 15:15 UTC (permalink / raw)
  To: Miguel Ojeda; +Cc: Zhouyi Zhou, linuxppc-dev, rcu

On Thu, Apr 07, 2022 at 12:07:34PM +0200, Miguel Ojeda wrote:
> On Thu, Apr 7, 2022 at 4:27 AM Zhouyi Zhou <zhouzhouyi@gmail.com> wrote:
> >
> > Yes, this happens within 30 seconds after kernel boot.  If we take all
> > into account (qemu preparing, kernel loading), we can do one test
> > within 54 seconds.
> 
> When it does not trigger, the run should be 20 seconds quicker than
> that (e.g. 10 seconds), since we don't wait for the stall timeout. I
> guess the timeout could also be reduced a fair bit to make failures
> quicker, but they do not contribute as much as the successes anyway.

Ah.  So you would instead look for boot to have completed within 10
seconds?  Either way, reliable automation might well be more important than
reduction in time.

> Thanks a lot for running the bisect on that server, Zhouyi!

What Miguel said!

							Thanx, Paul

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: rcu_sched self-detected stall on CPU
  2022-04-07 15:15               ` Paul E. McKenney
@ 2022-04-07 17:05                 ` Miguel Ojeda
  -1 siblings, 0 replies; 56+ messages in thread
From: Miguel Ojeda @ 2022-04-07 17:05 UTC (permalink / raw)
  To: Paul E. McKenney; +Cc: Zhouyi Zhou, linuxppc-dev, rcu

On Thu, Apr 7, 2022 at 5:15 PM Paul E. McKenney <paulmck@kernel.org> wrote:
>
> Ah.  So you would instead look for boot to have completed within 10
> seconds?  Either way, reliable automation might well be more important than
> reduction in time.

No (although I guess that could be an option), I was only pointing out
that when no stall is produced, the run should be much quicker than 30
seconds (at least it was in my setup), which would be the majority of the runs.

Cheers,
Miguel

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: rcu_sched self-detected stall on CPU
  2022-04-07 17:05                 ` Miguel Ojeda
@ 2022-04-07 17:55                   ` Paul E. McKenney
  -1 siblings, 0 replies; 56+ messages in thread
From: Paul E. McKenney @ 2022-04-07 17:55 UTC (permalink / raw)
  To: Miguel Ojeda; +Cc: Zhouyi Zhou, linuxppc-dev, rcu

On Thu, Apr 07, 2022 at 07:05:58PM +0200, Miguel Ojeda wrote:
> On Thu, Apr 7, 2022 at 5:15 PM Paul E. McKenney <paulmck@kernel.org> wrote:
> >
> > Ah.  So you would instead look for boot to have completed within 10
> > seconds?  Either way, reliable automation might well be more important than
> > reduction in time.
> 
> No (although I guess that could be an option), I was only pointing out
> that when no stall is produced, the run should be much quicker than 30
> seconds (at least it was in my setup), which would be the majority of the runs.

Ah, thank you for the clarification!

							Thanx, Paul

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: rcu_sched self-detected stall on CPU
  2022-04-07 17:55                   ` Paul E. McKenney
@ 2022-04-07 23:14                     ` Zhouyi Zhou
  -1 siblings, 0 replies; 56+ messages in thread
From: Zhouyi Zhou @ 2022-04-07 23:14 UTC (permalink / raw)
  To: Paul E. McKenney; +Cc: Miguel Ojeda, linuxppc-dev, rcu, Zhouyi Zhou

Dear Paul and Miguel

On Fri, Apr 8, 2022 at 1:55 AM Paul E. McKenney <paulmck@kernel.org> wrote:
>
> On Thu, Apr 07, 2022 at 07:05:58PM +0200, Miguel Ojeda wrote:
> > On Thu, Apr 7, 2022 at 5:15 PM Paul E. McKenney <paulmck@kernel.org> wrote:
> > >
> > > Ah.  So you would instead look for boot to have completed within 10
> > > > seconds?  Either way, reliable automation might well be more important than
> > > reduction in time.
> >
> > No (although I guess that could be an option), I was only pointing out
> > that when no stall is produced, the run should be much quicker than 30
> > seconds (at least it was in my setup), which would be the majority of the runs.
>
> Ah, thank you for the clarification!
Thank you both for the information. In my setup (PPC cloud VM), the
majority of the runs take at least 50 seconds to complete. From last
evening to this morning (Beijing Time), the following experiments have
been done:
1) torture mainline: the test finished quickly by hitting "rcu_sched
self-detected stall" after 12 runs
2) torture v5.17: the test lasted 10 hours and 14 minutes; 702 runs
were done without triggering the bug

Conclusion:
There must be a commit that causes the bug, as Paul has pointed out.
I am going to do the bisect, and expect to locate the bug within a
week (at most).
This is a good learning experience, thanks for the guidance ;-)

Kind Regards
Zhouyi
>
>                                                         Thanx, Paul

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: rcu_sched self-detected stall on CPU
  2022-04-07 23:14                     ` Zhouyi Zhou
@ 2022-04-08  1:43                       ` Paul E. McKenney
  -1 siblings, 0 replies; 56+ messages in thread
From: Paul E. McKenney @ 2022-04-08  1:43 UTC (permalink / raw)
  To: Zhouyi Zhou; +Cc: Miguel Ojeda, linuxppc-dev, rcu

On Fri, Apr 08, 2022 at 07:14:20AM +0800, Zhouyi Zhou wrote:
> Dear Paul and Miguel
> 
> On Fri, Apr 8, 2022 at 1:55 AM Paul E. McKenney <paulmck@kernel.org> wrote:
> >
> > On Thu, Apr 07, 2022 at 07:05:58PM +0200, Miguel Ojeda wrote:
> > > On Thu, Apr 7, 2022 at 5:15 PM Paul E. McKenney <paulmck@kernel.org> wrote:
> > > >
> > > > Ah.  So you would instead look for boot to have completed within 10
> > > > > seconds?  Either way, reliable automation might well be more important than
> > > > reduction in time.
> > >
> > > No (although I guess that could be an option), I was only pointing out
> > > that when no stall is produced, the run should be much quicker than 30
> > > seconds (at least it was in my setup), which would be the majority of the runs.
> >
> > Ah, thank you for the clarification!
> Thank you both for the information. In my setup (PPC cloud VM), the
> majority of the runs take at least 50 seconds to complete. From last
> evening to this morning (Beijing Time), the following experiments have
> been done:
> 1) torture mainline: the test finished quickly by hitting "rcu_sched
> self-detected stall" after 12 runs
> 2) torture v5.17: the test lasted 10 hours and 14 minutes; 702 runs
> were done without triggering the bug
> 
> Conclusion:
> There must be a commit that causes the bug, as Paul has pointed out.
> I am going to do the bisect, and expect to locate the bug within a
> week (at most).
> This is a good learning experience, thanks for the guidance ;-)

Very good, and looking forward to seeing what you find.

							Thanx, Paul

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: rcu_sched self-detected stall on CPU
  2022-04-06 17:00     ` Paul E. McKenney
  (?)
  (?)
@ 2022-04-08  7:23     ` Michael Ellerman
  2022-04-08 10:02         ` Zhouyi Zhou
                         ` (3 more replies)
  -1 siblings, 4 replies; 56+ messages in thread
From: Michael Ellerman @ 2022-04-08  7:23 UTC (permalink / raw)
  To: paulmck, Zhouyi Zhou; +Cc: rcu, Miguel Ojeda, linuxppc-dev, Nicholas Piggin

"Paul E. McKenney" <paulmck@kernel.org> writes:
> On Wed, Apr 06, 2022 at 05:31:10PM +0800, Zhouyi Zhou wrote:
>> Hi
>> 
>> I can reproduce it in a ppc virtual cloud server provided by Oregon
>> State University.  Following is what I do:
>> 1) curl -l https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/snapshot/linux-5.18-rc1.tar.gz
>> -o linux-5.18-rc1.tar.gz
>> 2) tar zxf linux-5.18-rc1.tar.gz
>> 3) cp config linux-5.18-rc1/.config
>> 4) cd linux-5.18-rc1
>> 5) make vmlinux -j 8
>> 6) qemu-system-ppc64 -kernel vmlinux -nographic -vga none -no-reboot
>> -smp 2 (QEMU 4.2.1)
>> 7) after 12 rounds, the bug got reproduced:
>> (http://154.223.142.244/logs/20220406/qemu.log.txt)
>
> Just to make sure, are you both seeing the same thing?  Last I knew,
> Zhouyi was chasing an RCU-tasks issue that appears only in kernels
> built with CONFIG_PROVE_RCU=y, which Miguel does not have set.  Or did
> I miss something?
>
> Miguel is instead seeing an RCU CPU stall warning where RCU's grace-period
> kthread slept for three milliseconds, but did not wake up for more than
> 20 seconds.  This kthread would normally have awakened on CPU 1, but
> CPU 1 looks to me to be very unhealthy, as can be seen in your console
> output below (but maybe my idea of what is healthy for powerpc systems
> is outdated).  Please see also the inline annotations.
>
> Thoughts from the PPC guys?

I haven't seen it in my testing. But using Miguel's config I can
reproduce it seemingly on every boot.

For me it bisects to:

  35de589cb879 ("powerpc/time: improve decrementer clockevent processing")

Which seems plausible.

Reverting that on mainline makes the bug go away.

I don't see an obvious bug in the diff, but I could be wrong, or the old
code was papering over an existing bug?

I'll try and work out what it is about Miguel's config that exposes
this vs our defconfig, that might give us a clue.

cheers

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: rcu_sched self-detected stall on CPU
  2022-04-08  7:23     ` Michael Ellerman
@ 2022-04-08 10:02         ` Zhouyi Zhou
  2022-04-08 13:52         ` Miguel Ojeda
                           ` (2 subsequent siblings)
  3 siblings, 0 replies; 56+ messages in thread
From: Zhouyi Zhou @ 2022-04-08 10:02 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: Paul E. McKenney, rcu, Miguel Ojeda, linuxppc-dev, Nicholas Piggin

On Fri, Apr 8, 2022 at 3:23 PM Michael Ellerman <mpe@ellerman.id.au> wrote:
>
> "Paul E. McKenney" <paulmck@kernel.org> writes:
> > On Wed, Apr 06, 2022 at 05:31:10PM +0800, Zhouyi Zhou wrote:
> >> Hi
> >>
> >> I can reproduce it in a ppc virtual cloud server provided by Oregon
> >> State University.  Following is what I do:
> >> 1) curl -l https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/snapshot/linux-5.18-rc1.tar.gz
> >> -o linux-5.18-rc1.tar.gz
> >> 2) tar zxf linux-5.18-rc1.tar.gz
> >> 3) cp config linux-5.18-rc1/.config
> >> 4) cd linux-5.18-rc1
> >> 5) make vmlinux -j 8
> >> 6) qemu-system-ppc64 -kernel vmlinux -nographic -vga none -no-reboot
> >> -smp 2 (QEMU 4.2.1)
> >> 7) after 12 rounds, the bug got reproduced:
> >> (http://154.223.142.244/logs/20220406/qemu.log.txt)
> >
> > Just to make sure, are you both seeing the same thing?  Last I knew,
> > Zhouyi was chasing an RCU-tasks issue that appears only in kernels
> > built with CONFIG_PROVE_RCU=y, which Miguel does not have set.  Or did
> > I miss something?
> >
> > Miguel is instead seeing an RCU CPU stall warning where RCU's grace-period
> > kthread slept for three milliseconds, but did not wake up for more than
> > 20 seconds.  This kthread would normally have awakened on CPU 1, but
> > CPU 1 looks to me to be very unhealthy, as can be seen in your console
> > output below (but maybe my idea of what is healthy for powerpc systems
> > is outdated).  Please see also the inline annotations.
> >
> > Thoughts from the PPC guys?
>
> I haven't seen it in my testing. But using Miguel's config I can
> reproduce it seemingly on every boot.
>
> For me it bisects to:
>
>   35de589cb879 ("powerpc/time: improve decrementer clockevent processing")
>
> Which seems plausible.
I also bisected it to 35de589cb879 ("powerpc/time: improve decrementer
clockevent processing")
>
> Reverting that on mainline makes the bug go away.
I also reverted that on mainline, and am currently running a stress
test (repeatedly invoking QEMU and checking console.log) on the PPC
VM at Oregon State University.
>
> I don't see an obvious bug in the diff, but I could be wrong, or the old
> code was papering over an existing bug?
>
> I'll try and work out what it is about Miguel's config that exposes
> this vs our defconfig, that might give us a clue.
Great job!
>
> cheers
Thanks
Zhouyi

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: rcu_sched self-detected stall on CPU
  2022-04-08  7:23     ` Michael Ellerman
@ 2022-04-08 13:52         ` Miguel Ojeda
  2022-04-08 13:52         ` Miguel Ojeda
                           ` (2 subsequent siblings)
  3 siblings, 0 replies; 56+ messages in thread
From: Miguel Ojeda @ 2022-04-08 13:52 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: Paul E. McKenney, Zhouyi Zhou, rcu, linuxppc-dev, Nicholas Piggin

On Fri, Apr 8, 2022 at 9:23 AM Michael Ellerman <mpe@ellerman.id.au> wrote:
>
> I haven't seen it in my testing. But using Miguel's config I can
> reproduce it seemingly on every boot.

Hmm... I noticed this for some kernel builds: in some builds/commits,
it triggered the very first time, while in others I had to re-try
quite a few times. It could be a "fluke", but since it happened to you
too (and Zhouyi seemed to need 12 tries), it may be that particular
kernel builds make the bug much more likely.
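
To put a rough number on "quite a few times": assuming each boot hits
the stall independently with a fixed probability p (an assumption; the
rate clearly varies between builds), the number of clean boots needed
before calling a build good with a given confidence is the smallest n
with (1 - p)^n <= 1 - confidence. A small sketch, with a hypothetical
function name:

```shell
#!/bin/sh
# Hypothetical helper: smallest number of boots n such that a bug that
# hits each boot independently with probability p ($1) is seen at least
# once with confidence conf ($2), i.e. smallest n with (1-p)^n <= 1-conf.
boots_needed() {
    awk -v p="$1" -v conf="$2" 'BEGIN {
        x = log(1 - conf) / log(1 - p)   # real-valued solution
        n = int(x)
        if (n < x) n++                   # round up to a whole boot
        print n
    }'
}
```

With the rough 1-in-5 rate from the original report,
`boots_needed 0.2 0.99` prints 21, so "a few tries" on a good-looking
commit does not say much on its own.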

> For me it bisects to:
>
>   35de589cb879 ("powerpc/time: improve decrementer clockevent processing")
>
> Which seems plausible.
>
> Reverting that on mainline makes the bug go away.

That is great, thanks for that -- I can revert that one in our CI meanwhile.

> I'll try and work out what it is about Miguel's config that exposes
> this vs our defconfig, that might give us a clue.

Yeah, it is based on the "debug" one you sent for Rust PPC.
Assuming you based that one on the others we had for other archs, then
I guess we are bound to hit things like this from time to time, as with
randconfig, since I made them fairly minimal and "custom"... :)

Cheers,
Miguel

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: rcu_sched self-detected stall on CPU
  2022-04-08  7:23     ` Michael Ellerman
@ 2022-04-08 14:06         ` Paul E. McKenney
  2022-04-08 13:52         ` Miguel Ojeda
                           ` (2 subsequent siblings)
  3 siblings, 0 replies; 56+ messages in thread
From: Paul E. McKenney @ 2022-04-08 14:06 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: Zhouyi Zhou, rcu, Miguel Ojeda, linuxppc-dev, Nicholas Piggin

On Fri, Apr 08, 2022 at 05:23:32PM +1000, Michael Ellerman wrote:
> "Paul E. McKenney" <paulmck@kernel.org> writes:
> > On Wed, Apr 06, 2022 at 05:31:10PM +0800, Zhouyi Zhou wrote:
> >> Hi
> >> 
> >> I can reproduce it in a ppc virtual cloud server provided by Oregon
> >> State University.  Following is what I do:
> >> 1) curl -l https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/snapshot/linux-5.18-rc1.tar.gz
> >> -o linux-5.18-rc1.tar.gz
> >> 2) tar zxf linux-5.18-rc1.tar.gz
> >> 3) cp config linux-5.18-rc1/.config
> >> 4) cd linux-5.18-rc1
> >> 5) make vmlinux -j 8
> >> 6) qemu-system-ppc64 -kernel vmlinux -nographic -vga none -no-reboot
> >> -smp 2 (QEMU 4.2.1)
> >> 7) after 12 rounds, the bug got reproduced:
> >> (http://154.223.142.244/logs/20220406/qemu.log.txt)
> >
> > Just to make sure, are you both seeing the same thing?  Last I knew,
> > Zhouyi was chasing an RCU-tasks issue that appears only in kernels
> > built with CONFIG_PROVE_RCU=y, which Miguel does not have set.  Or did
> > I miss something?
> >
> > Miguel is instead seeing an RCU CPU stall warning where RCU's grace-period
> > kthread slept for three milliseconds, but did not wake up for more than
> > 20 seconds.  This kthread would normally have awakened on CPU 1, but
> > CPU 1 looks to me to be very unhealthy, as can be seen in your console
> > output below (but maybe my idea of what is healthy for powerpc systems
> > is outdated).  Please see also the inline annotations.
> >
> > Thoughts from the PPC guys?
> 
> I haven't seen it in my testing. But using Miguel's config I can
> reproduce it seemingly on every boot.
> 
> For me it bisects to:
> 
>   35de589cb879 ("powerpc/time: improve decrementer clockevent processing")
> 
> Which seems plausible.
> 
> Reverting that on mainline makes the bug go away.

Thank you for looking into this!

> I don't see an obvious bug in the diff, but I could be wrong, or the old
> code was papering over an existing bug?
> 
> I'll try and work out what it is about Miguel's config that exposes
> this vs our defconfig, that might give us a clue.

I have recently had some RCU bugs that were due to Kconfig failing to
rule out broken .config files.  Maybe this is something similar?

							Thanx, Paul

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: rcu_sched self-detected stall on CPU
  2022-04-08 10:02         ` Zhouyi Zhou
@ 2022-04-08 14:07           ` Paul E. McKenney
  -1 siblings, 0 replies; 56+ messages in thread
From: Paul E. McKenney @ 2022-04-08 14:07 UTC (permalink / raw)
  To: Zhouyi Zhou
  Cc: Michael Ellerman, rcu, Miguel Ojeda, linuxppc-dev, Nicholas Piggin

On Fri, Apr 08, 2022 at 06:02:19PM +0800, Zhouyi Zhou wrote:
> On Fri, Apr 8, 2022 at 3:23 PM Michael Ellerman <mpe@ellerman.id.au> wrote:
> >
> > "Paul E. McKenney" <paulmck@kernel.org> writes:
> > > On Wed, Apr 06, 2022 at 05:31:10PM +0800, Zhouyi Zhou wrote:
> > >> Hi
> > >>
> > >> I can reproduce it in a ppc virtual cloud server provided by Oregon
> > >> State University.  Following is what I do:
> > >> 1) curl -l https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/snapshot/linux-5.18-rc1.tar.gz
> > >> -o linux-5.18-rc1.tar.gz
> > >> 2) tar zxf linux-5.18-rc1.tar.gz
> > >> 3) cp config linux-5.18-rc1/.config
> > >> 4) cd linux-5.18-rc1
> > >> 5) make vmlinux -j 8
> > >> 6) qemu-system-ppc64 -kernel vmlinux -nographic -vga none -no-reboot
> > >> -smp 2 (QEMU 4.2.1)
> > >> 7) after 12 rounds, the bug got reproduced:
> > >> (http://154.223.142.244/logs/20220406/qemu.log.txt)
> > >
> > > Just to make sure, are you both seeing the same thing?  Last I knew,
> > > Zhouyi was chasing an RCU-tasks issue that appears only in kernels
> > > built with CONFIG_PROVE_RCU=y, which Miguel does not have set.  Or did
> > > I miss something?
> > >
> > > Miguel is instead seeing an RCU CPU stall warning where RCU's grace-period
> > > kthread slept for three milliseconds, but did not wake up for more than
> > > 20 seconds.  This kthread would normally have awakened on CPU 1, but
> > > CPU 1 looks to me to be very unhealthy, as can be seen in your console
> > > output below (but maybe my idea of what is healthy for powerpc systems
> > > is outdated).  Please see also the inline annotations.
> > >
> > > Thoughts from the PPC guys?
> >
> > I haven't seen it in my testing. But using Miguel's config I can
> > reproduce it seemingly on every boot.
> >
> > For me it bisects to:
> >
> >   35de589cb879 ("powerpc/time: improve decrementer clockevent processing")
> >
> > Which seems plausible.
> I also bisect to 35de589cb879 ("powerpc/time: improve decrementer
> clockevent processing")

Very good!  Thank you all!!!

							Thanx, Paul

> > Reverting that on mainline makes the bug go away.
> I also revert that on the mainline, and am currently doing a pressure
> test (by repeatedly invoking qemu and checking the console.log) on PPC
> VM in Oregon State University.
> >
> > I don't see an obvious bug in the diff, but I could be wrong, or the old
> > code was papering over an existing bug?
> >
> > I'll try and work out what it is about Miguel's config that exposes
> > this vs our defconfig, that might give us a clue.
> Great job!
> >
> > cheers
> Thanks
> Zhouyi

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: rcu_sched self-detected stall on CPU
  2022-04-08 14:07           ` Paul E. McKenney
@ 2022-04-08 14:25             ` Zhouyi Zhou
  -1 siblings, 0 replies; 56+ messages in thread
From: Zhouyi Zhou @ 2022-04-08 14:25 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Michael Ellerman, rcu, Miguel Ojeda, linuxppc-dev, Nicholas Piggin

On Fri, Apr 8, 2022 at 10:07 PM Paul E. McKenney <paulmck@kernel.org> wrote:
>
> On Fri, Apr 08, 2022 at 06:02:19PM +0800, Zhouyi Zhou wrote:
> > On Fri, Apr 8, 2022 at 3:23 PM Michael Ellerman <mpe@ellerman.id.au> wrote:
> > >
> > > "Paul E. McKenney" <paulmck@kernel.org> writes:
> > > > On Wed, Apr 06, 2022 at 05:31:10PM +0800, Zhouyi Zhou wrote:
> > > >> Hi
> > > >>
> > > >> I can reproduce it in a ppc virtual cloud server provided by Oregon
> > > >> State University.  Following is what I do:
> > > >> 1) curl -l https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/snapshot/linux-5.18-rc1.tar.gz
> > > >> -o linux-5.18-rc1.tar.gz
> > > >> 2) tar zxf linux-5.18-rc1.tar.gz
> > > >> 3) cp config linux-5.18-rc1/.config
> > > >> 4) cd linux-5.18-rc1
> > > >> 5) make vmlinux -j 8
> > > >> 6) qemu-system-ppc64 -kernel vmlinux -nographic -vga none -no-reboot
> > > >> -smp 2 (QEMU 4.2.1)
> > > >> 7) after 12 rounds, the bug got reproduced:
> > > >> (http://154.223.142.244/logs/20220406/qemu.log.txt)
> > > >
> > > > Just to make sure, are you both seeing the same thing?  Last I knew,
> > > > Zhouyi was chasing an RCU-tasks issue that appears only in kernels
> > > > built with CONFIG_PROVE_RCU=y, which Miguel does not have set.  Or did
> > > > I miss something?
> > > >
> > > > Miguel is instead seeing an RCU CPU stall warning where RCU's grace-period
> > > > kthread slept for three milliseconds, but did not wake up for more than
> > > > 20 seconds.  This kthread would normally have awakened on CPU 1, but
> > > > CPU 1 looks to me to be very unhealthy, as can be seen in your console
> > > > output below (but maybe my idea of what is healthy for powerpc systems
> > > > is outdated).  Please see also the inline annotations.
> > > >
> > > > Thoughts from the PPC guys?
> > >
> > > I haven't seen it in my testing. But using Miguel's config I can
> > > reproduce it seemingly on every boot.
> > >
> > > For me it bisects to:
> > >
> > >   35de589cb879 ("powerpc/time: improve decrementer clockevent processing")
> > >
> > > Which seems plausible.
> > I also bisect to 35de589cb879 ("powerpc/time: improve decrementer
> > clockevent processing")
>
> Very good!  Thank you all!!!
You are very welcome ;-)  and thank you all!!!!
>
>                                                         Thanx, Paul
>
> > > Reverting that on mainline makes the bug go away.
> > I also revert that on the mainline, and am currently doing a pressure
> > test (by repeatedly invoking qemu and checking the console.log) on PPC
> > VM in Oregon State University.
After 306 rounds of stress testing on mainline with the revert applied
(4 hours and 27 minutes in total) without triggering the bug, I think
the bug is indeed caused by 35de589cb879 ("powerpc/time: improve
decrementer clockevent processing"), so I have stopped the test for now.
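
A stress loop of this kind (repeatedly booting and checking the console
log) could look roughly like the sketch below. This is not the script
that was actually run; the function name, its arguments and the log
file layout are assumptions made for illustration, and the matched
string is taken from the subject of this thread.

```shell
#!/bin/sh
# Hypothetical stress loop; names and arguments are illustrative.
# Runs the given boot command up to $1 times, saving each run's output
# to $2/round-N.log, and stops at the first log containing the stall
# warning. Prints the failing round number, or 0 if all rounds are clean.
stress_rounds() {
    sr_max=$1; sr_dir=$2; shift 2
    sr_i=1
    while [ "$sr_i" -le "$sr_max" ]; do
        "$@" > "$sr_dir/round-$sr_i.log" 2>&1
        if grep -q 'self-detected stall on CPU' "$sr_dir/round-$sr_i.log"; then
            echo "$sr_i"
            return 0
        fi
        sr_i=$((sr_i + 1))
    done
    echo 0
}
```

The boot command would be something like
`timeout 120 qemu-system-ppc64 -kernel vmlinux -nographic -vga none
-no-reboot -smp 2`, where the 120-second limit is a guess and would
need to be long enough for a stall warning to be printed.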

Thanks ;-)
Zhouyi
> > >
> > > I don't see an obvious bug in the diff, but I could be wrong, or the old
> > > code was papering over an existing bug?
> > >
> > > I'll try and work out what it is about Miguel's config that exposes
> > > this vs our defconfig, that might give us a clue.
> > Great job!
> > >
> > > cheers
> > Thanks
> > Zhouyi

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: rcu_sched self-detected stall on CPU
  2022-04-08  7:23     ` Michael Ellerman
                         ` (2 preceding siblings ...)
  2022-04-08 14:06         ` Paul E. McKenney
@ 2022-04-08 14:42       ` Michael Ellerman
  2022-04-08 15:52           ` Paul E. McKenney
                           ` (2 more replies)
  3 siblings, 3 replies; 56+ messages in thread
From: Michael Ellerman @ 2022-04-08 14:42 UTC (permalink / raw)
  To: paulmck, Zhouyi Zhou; +Cc: rcu, Miguel Ojeda, linuxppc-dev, Nicholas Piggin

Michael Ellerman <mpe@ellerman.id.au> writes:
> "Paul E. McKenney" <paulmck@kernel.org> writes:
>> On Wed, Apr 06, 2022 at 05:31:10PM +0800, Zhouyi Zhou wrote:
>>> Hi
>>> 
>>> I can reproduce it in a ppc virtual cloud server provided by Oregon
>>> State University.  Following is what I do:
>>> 1) curl -l https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/snapshot/linux-5.18-rc1.tar.gz
>>> -o linux-5.18-rc1.tar.gz
>>> 2) tar zxf linux-5.18-rc1.tar.gz
>>> 3) cp config linux-5.18-rc1/.config
>>> 4) cd linux-5.18-rc1
>>> 5) make vmlinux -j 8
>>> 6) qemu-system-ppc64 -kernel vmlinux -nographic -vga none -no-reboot
>>> -smp 2 (QEMU 4.2.1)
>>> 7) after 12 rounds, the bug got reproduced:
>>> (http://154.223.142.244/logs/20220406/qemu.log.txt)
>>
>> Just to make sure, are you both seeing the same thing?  Last I knew,
>> Zhouyi was chasing an RCU-tasks issue that appears only in kernels
>> built with CONFIG_PROVE_RCU=y, which Miguel does not have set.  Or did
>> I miss something?
>>
>> Miguel is instead seeing an RCU CPU stall warning where RCU's grace-period
>> kthread slept for three milliseconds, but did not wake up for more than
>> 20 seconds.  This kthread would normally have awakened on CPU 1, but
>> CPU 1 looks to me to be very unhealthy, as can be seen in your console
>> output below (but maybe my idea of what is healthy for powerpc systems
>> is outdated).  Please see also the inline annotations.
>>
>> Thoughts from the PPC guys?
>
> I haven't seen it in my testing. But using Miguel's config I can
> reproduce it seemingly on every boot.
>
> For me it bisects to:
>
>   35de589cb879 ("powerpc/time: improve decrementer clockevent processing")
>
> Which seems plausible.
>
> Reverting that on mainline makes the bug go away.
>
> I don't see an obvious bug in the diff, but I could be wrong, or the old
> code was papering over an existing bug?
>
> I'll try and work out what it is about Miguel's config that exposes
> this vs our defconfig, that might give us a clue.

It's CONFIG_HIGH_RES_TIMERS=n which triggers the stall.

I can reproduce just with:

  $ make ppc64le_guest_defconfig
  $ ./scripts/config -d HIGH_RES_TIMERS

We have no defconfigs that disable HIGH_RES_TIMERS, I didn't even
realise you could disable it TBH :)
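
A quick way to tell whether a given .config falls into this
combination is to look for the symbol in the file, where a disabled
bool shows up as a "# CONFIG_... is not set" comment. A minimal sketch
follows; the function name is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: report how a Kconfig symbol ($1, without the
# CONFIG_ prefix) is set in a .config file ($2).
# Prints "y", "m", "n" (explicit "is not set" line) or "absent".
config_state() {
    cs_sym=$1; cs_file=$2
    if grep -q "^CONFIG_${cs_sym}=y\$" "$cs_file"; then
        echo y
    elif grep -q "^CONFIG_${cs_sym}=m\$" "$cs_file"; then
        echo m
    elif grep -q "^# CONFIG_${cs_sym} is not set\$" "$cs_file"; then
        echo n
    else
        echo absent
    fi
}
```

`config_state HIGH_RES_TIMERS .config` printing "n" would indicate the
configuration that triggers the stall here.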

The Rust CI has it disabled because I copied that from the x86 defconfig
they were using back when I added the Rust support. I think that was
meant to be a stripped down fast config for CI, but the result is it's
just using a badly tested combination which is not helpful.

So I'll send a patch to turn HIGH_RES_TIMERS on for the Rust CI, and we
can debug this further without blocking them.

cheers

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: rcu_sched self-detected stall on CPU
  2022-04-08 14:42       ` Michael Ellerman
@ 2022-04-08 15:52           ` Paul E. McKenney
  2022-04-08 17:02           ` Miguel Ojeda
  2022-04-13  5:11           ` Nicholas Piggin
  2 siblings, 0 replies; 56+ messages in thread
From: Paul E. McKenney @ 2022-04-08 15:52 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: Zhouyi Zhou, rcu, Miguel Ojeda, linuxppc-dev, Nicholas Piggin

On Sat, Apr 09, 2022 at 12:42:39AM +1000, Michael Ellerman wrote:
> Michael Ellerman <mpe@ellerman.id.au> writes:
> > "Paul E. McKenney" <paulmck@kernel.org> writes:
> >> On Wed, Apr 06, 2022 at 05:31:10PM +0800, Zhouyi Zhou wrote:
> >>> Hi
> >>> 
> >>> I can reproduce it in a ppc virtual cloud server provided by Oregon
> >>> State University.  Following is what I do:
> >>> 1) curl -l https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/snapshot/linux-5.18-rc1.tar.gz
> >>> -o linux-5.18-rc1.tar.gz
> >>> 2) tar zxf linux-5.18-rc1.tar.gz
> >>> 3) cp config linux-5.18-rc1/.config
> >>> 4) cd linux-5.18-rc1
> >>> 5) make vmlinux -j 8
> >>> 6) qemu-system-ppc64 -kernel vmlinux -nographic -vga none -no-reboot
> >>> -smp 2 (QEMU 4.2.1)
> >>> 7) after 12 rounds, the bug got reproduced:
> >>> (http://154.223.142.244/logs/20220406/qemu.log.txt)
> >>
> >> Just to make sure, are you both seeing the same thing?  Last I knew,
> >> Zhouyi was chasing an RCU-tasks issue that appears only in kernels
> >> built with CONFIG_PROVE_RCU=y, which Miguel does not have set.  Or did
> >> I miss something?
> >>
> >> Miguel is instead seeing an RCU CPU stall warning where RCU's grace-period
> >> kthread slept for three milliseconds, but did not wake up for more than
> >> 20 seconds.  This kthread would normally have awakened on CPU 1, but
> >> CPU 1 looks to me to be very unhealthy, as can be seen in your console
> >> output below (but maybe my idea of what is healthy for powerpc systems
> >> is outdated).  Please see also the inline annotations.
> >>
> >> Thoughts from the PPC guys?
> >
> > I haven't seen it in my testing. But using Miguel's config I can
> > reproduce it seemingly on every boot.
> >
> > For me it bisects to:
> >
> >   35de589cb879 ("powerpc/time: improve decrementer clockevent processing")
> >
> > Which seems plausible.
> >
> > Reverting that on mainline makes the bug go away.
> >
> > I don't see an obvious bug in the diff, but I could be wrong, or the old
> > code was papering over an existing bug?
> >
> > I'll try and work out what it is about Miguel's config that exposes
> > this vs our defconfig, that might give us a clue.
> 
> It's CONFIG_HIGH_RES_TIMERS=n which triggers the stall.
> 
> I can reproduce just with:
> 
>   $ make ppc64le_guest_defconfig
>   $ ./scripts/config -d HIGH_RES_TIMERS
> 
> We have no defconfigs that disable HIGH_RES_TIMERS, I didn't even
> realise you could disable it TBH :)
> 
> The Rust CI has it disabled because I copied that from the x86 defconfig
> they were using back when I added the Rust support. I think that was
> meant to be a stripped down fast config for CI, but the result is it's
> just using a badly tested combination which is not helpful.
> 
> So I'll send a patch to turn HIGH_RES_TIMERS on for the Rust CI, and we
> can debug this further without blocking them.

Would it make sense to select HIGH_RES_TIMERS from one of the Kconfig*
files in arch/powerpc?  Asking for a friend.  ;-)

							Thanx, Paul

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: rcu_sched self-detected stall on CPU
  2022-04-08 14:42       ` Michael Ellerman
@ 2022-04-08 17:02           ` Miguel Ojeda
  2022-04-08 17:02           ` Miguel Ojeda
  2022-04-13  5:11           ` Nicholas Piggin
  2 siblings, 0 replies; 56+ messages in thread
From: Miguel Ojeda @ 2022-04-08 17:02 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: Paul E. McKenney, Zhouyi Zhou, rcu, linuxppc-dev, Nicholas Piggin

On Fri, Apr 8, 2022 at 4:42 PM Michael Ellerman <mpe@ellerman.id.au> wrote:
>
> The Rust CI has it disabled because I copied that from the x86 defconfig
> they were using back when I added the Rust support. I think that was
> meant to be a stripped down fast config for CI, but the result is it's

Indeed, that was my intention when I created the original config.

> So I'll send a patch to turn HIGH_RES_TIMERS on for the Rust CI, and we
> can debug this further without blocking them.

Thanks! I can also do it on your behalf, if you prefer, when I sync with -rc1.

Cheers,
Miguel

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: rcu_sched self-detected stall on CPU
  2022-04-08 14:25             ` Zhouyi Zhou
  (?)
@ 2022-04-10 11:33             ` Michael Ellerman
  2022-04-11  3:05                 ` Paul E. McKenney
  -1 siblings, 1 reply; 56+ messages in thread
From: Michael Ellerman @ 2022-04-10 11:33 UTC (permalink / raw)
  To: Zhouyi Zhou, Paul E. McKenney
  Cc: rcu, Miguel Ojeda, linuxppc-dev, Nicholas Piggin

Zhouyi Zhou <zhouzhouyi@gmail.com> writes:
> On Fri, Apr 8, 2022 at 10:07 PM Paul E. McKenney <paulmck@kernel.org> wrote:
>> On Fri, Apr 08, 2022 at 06:02:19PM +0800, Zhouyi Zhou wrote:
>> > On Fri, Apr 8, 2022 at 3:23 PM Michael Ellerman <mpe@ellerman.id.au> wrote:
...
>> > > I haven't seen it in my testing. But using Miguel's config I can
>> > > reproduce it seemingly on every boot.
>> > >
>> > > For me it bisects to:
>> > >
>> > >   35de589cb879 ("powerpc/time: improve decrementer clockevent processing")
>> > >
>> > > Which seems plausible.
>> > I also bisect to 35de589cb879 ("powerpc/time: improve decrementer
>> > clockevent processing")
...
>>
>> > > Reverting that on mainline makes the bug go away.

>> > I also revert that on the mainline, and am currently doing a pressure
>> > test (by repeatedly invoking qemu and checking the console.log) on PPC
>> > VM in Oregon State University.

> After 306 rounds of stress test on mainline without triggering the bug
> (last for 4 hours and 27 minutes), I think the bug is indeed caused by
> 35de589cb879 ("powerpc/time: improve decrementer clockevent
> processing") and stop the test for now.

Thanks for testing, that's pretty conclusive.

I'm not inclined to actually revert it yet.

We need to understand if there's actually a bug in the patch, or if it's
just exposing some existing bug/bad behavior we have. The fact that it
only appears with CONFIG_HIGH_RES_TIMERS=n is suspicious.

Do we have some code that inadvertently relies on something enabled by
HIGH_RES_TIMERS=y, or do we have a bug that is hidden by HIGH_RES_TIMERS=y ?

cheers

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: rcu_sched self-detected stall on CPU
  2022-04-10 11:33             ` Michael Ellerman
@ 2022-04-11  3:05                 ` Paul E. McKenney
  0 siblings, 0 replies; 56+ messages in thread
From: Paul E. McKenney @ 2022-04-11  3:05 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: Zhouyi Zhou, rcu, Miguel Ojeda, linuxppc-dev, Nicholas Piggin

On Sun, Apr 10, 2022 at 09:33:43PM +1000, Michael Ellerman wrote:
> Zhouyi Zhou <zhouzhouyi@gmail.com> writes:
> > On Fri, Apr 8, 2022 at 10:07 PM Paul E. McKenney <paulmck@kernel.org> wrote:
> >> On Fri, Apr 08, 2022 at 06:02:19PM +0800, Zhouyi Zhou wrote:
> >> > On Fri, Apr 8, 2022 at 3:23 PM Michael Ellerman <mpe@ellerman.id.au> wrote:
> ...
> >> > > I haven't seen it in my testing. But using Miguel's config I can
> >> > > reproduce it seemingly on every boot.
> >> > >
> >> > > For me it bisects to:
> >> > >
> >> > >   35de589cb879 ("powerpc/time: improve decrementer clockevent processing")
> >> > >
> >> > > Which seems plausible.
> >> > I also bisect to 35de589cb879 ("powerpc/time: improve decrementer
> >> > clockevent processing")
> ...
> >>
> >> > > Reverting that on mainline makes the bug go away.
> 
> >> > I also revert that on the mainline, and am currently doing a pressure
> >> > test (by repeatedly invoking qemu and checking the console.log) on PPC
> >> > VM in Oregon State University.
> 
> > After 306 rounds of stress test on mainline without triggering the bug
> > (last for 4 hours and 27 minutes), I think the bug is indeed caused by
> > 35de589cb879 ("powerpc/time: improve decrementer clockevent
> > processing") and stop the test for now.
> 
> Thanks for testing, that's pretty conclusive.
> 
> I'm not inclined to actually revert it yet.
> 
> We need to understand if there's actually a bug in the patch, or if it's
> just exposing some existing bug/bad behavior we have. The fact that it
> only appears with CONFIG_HIGH_RES_TIMERS=n is suspicious.
> 
> Do we have some code that inadvertently relies on something enabled by
> HIGH_RES_TIMERS=y, or do we have a bug that is hidden by HIGH_RES_TIMERS=y ?

For whatever it is worth, moderate rcutorture runs to completion without
errors with CONFIG_HIGH_RES_TIMERS=n on 64-bit x86.

Also for whatever it is worth, I don't know of anything other than
microcontrollers or the larger IoT devices that would want their kernels
built with CONFIG_HIGH_RES_TIMERS=n.  Which might be a failure of
imagination on my part, but so it goes.

							Thanx, Paul

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: rcu_sched self-detected stall on CPU
  2022-04-11  3:05                 ` Paul E. McKenney
@ 2022-04-12  6:53                   ` Michael Ellerman
  -1 siblings, 0 replies; 56+ messages in thread
From: Michael Ellerman @ 2022-04-12  6:53 UTC (permalink / raw)
  To: paulmck; +Cc: rcu, Zhouyi Zhou, linuxppc-dev, Nicholas Piggin, Miguel Ojeda

"Paul E. McKenney" <paulmck@kernel.org> writes:
> On Sun, Apr 10, 2022 at 09:33:43PM +1000, Michael Ellerman wrote:
>> Zhouyi Zhou <zhouzhouyi@gmail.com> writes:
>> > On Fri, Apr 8, 2022 at 10:07 PM Paul E. McKenney <paulmck@kernel.org> wrote:
>> >> On Fri, Apr 08, 2022 at 06:02:19PM +0800, Zhouyi Zhou wrote:
>> >> > On Fri, Apr 8, 2022 at 3:23 PM Michael Ellerman <mpe@ellerman.id.au> wrote:
>> ...
>> >> > > I haven't seen it in my testing. But using Miguel's config I can
>> >> > > reproduce it seemingly on every boot.
>> >> > >
>> >> > > For me it bisects to:
>> >> > >
>> >> > >   35de589cb879 ("powerpc/time: improve decrementer clockevent processing")
>> >> > >
>> >> > > Which seems plausible.
>> >> > I also bisect to 35de589cb879 ("powerpc/time: improve decrementer
>> >> > clockevent processing")
>> ...
>> >>
>> >> > > Reverting that on mainline makes the bug go away.
>> 
>> >> > I also revert that on the mainline, and am currently doing a pressure
>> >> > test (by repeatedly invoking qemu and checking the console.log) on PPC
>> >> > VM in Oregon State University.
>> 
>> > After 306 rounds of stress test on mainline without triggering the bug
>> > (last for 4 hours and 27 minutes), I think the bug is indeed caused by
>> > 35de589cb879 ("powerpc/time: improve decrementer clockevent
>> > processing") and stop the test for now.
>> 
>> Thanks for testing, that's pretty conclusive.
>> 
>> I'm not inclined to actually revert it yet.
>> 
>> We need to understand if there's actually a bug in the patch, or if it's
>> just exposing some existing bug/bad behavior we have. The fact that it
>> only appears with CONFIG_HIGH_RES_TIMERS=n is suspicious.
>> 
>> Do we have some code that inadvertently relies on something enabled by
>> HIGH_RES_TIMERS=y, or do we have a bug that is hidden by HIGH_RES_TIMERS=y ?
>
> For whatever it is worth, moderate rcutorture runs to completion without
> errors with CONFIG_HIGH_RES_TIMERS=n on 64-bit x86.

Thanks for testing that, I don't have any big x86 machines to test on :)

> Also for whatever it is worth, I don't know of anything other than
> microcontrollers or the larger IoT devices that would want their kernels
> built with CONFIG_HIGH_RES_TIMERS=n.  Which might be a failure of
> imagination on my part, but so it goes.

Yeah I agree, like I said before I wasn't even aware you could turn it
off. So I think we'll definitely add a select HIGH_RES_TIMERS in future,
but first I need to work out why we are seeing stalls with it disabled.

cheers

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: rcu_sched self-detected stall on CPU
  2022-04-12  6:53                   ` Michael Ellerman
@ 2022-04-12 13:36                     ` Paul E. McKenney
  -1 siblings, 0 replies; 56+ messages in thread
From: Paul E. McKenney @ 2022-04-12 13:36 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: Zhouyi Zhou, rcu, Miguel Ojeda, linuxppc-dev, Nicholas Piggin

On Tue, Apr 12, 2022 at 04:53:06PM +1000, Michael Ellerman wrote:
> "Paul E. McKenney" <paulmck@kernel.org> writes:
> > On Sun, Apr 10, 2022 at 09:33:43PM +1000, Michael Ellerman wrote:
> >> Zhouyi Zhou <zhouzhouyi@gmail.com> writes:
> >> > On Fri, Apr 8, 2022 at 10:07 PM Paul E. McKenney <paulmck@kernel.org> wrote:
> >> >> On Fri, Apr 08, 2022 at 06:02:19PM +0800, Zhouyi Zhou wrote:
> >> >> > On Fri, Apr 8, 2022 at 3:23 PM Michael Ellerman <mpe@ellerman.id.au> wrote:
> >> ...
> >> >> > > I haven't seen it in my testing. But using Miguel's config I can
> >> >> > > reproduce it seemingly on every boot.
> >> >> > >
> >> >> > > For me it bisects to:
> >> >> > >
> >> >> > >   35de589cb879 ("powerpc/time: improve decrementer clockevent processing")
> >> >> > >
> >> >> > > Which seems plausible.
> >> >> > I also bisect to 35de589cb879 ("powerpc/time: improve decrementer
> >> >> > clockevent processing")
> >> ...
> >> >>
> >> >> > > Reverting that on mainline makes the bug go away.
> >> 
> >> >> > I also revert that on the mainline, and am currently doing a pressure
> >> >> > test (by repeatedly invoking qemu and checking the console.log) on PPC
> >> >> > VM in Oregon State University.
> >> 
> >> > After 306 rounds of stress test on mainline without triggering the bug
> >> > (last for 4 hours and 27 minutes), I think the bug is indeed caused by
> >> > 35de589cb879 ("powerpc/time: improve decrementer clockevent
> >> > processing") and stop the test for now.
> >> 
> >> Thanks for testing, that's pretty conclusive.
> >> 
> >> I'm not inclined to actually revert it yet.
> >> 
> >> We need to understand if there's actually a bug in the patch, or if it's
> >> just exposing some existing bug/bad behavior we have. The fact that it
> >> only appears with CONFIG_HIGH_RES_TIMERS=n is suspicious.
> >> 
> >> Do we have some code that inadvertently relies on something enabled by
> >> HIGH_RES_TIMERS=y, or do we have a bug that is hidden by HIGH_RES_TIMERS=y ?
> >
> > For whatever it is worth, moderate rcutorture runs to completion without
> > errors with CONFIG_HIGH_RES_TIMERS=n on 64-bit x86.
> 
> Thanks for testing that, I don't have any big x86 machines to test on :)
> 
> > Also for whatever it is worth, I don't know of anything other than
> > microcontrollers or the larger IoT devices that would want their kernels
> > built with CONFIG_HIGH_RES_TIMERS=n.  Which might be a failure of
> > imagination on my part, but so it goes.
> 
> Yeah I agree, like I said before I wasn't even aware you could turn it
> off. So I think we'll definitely add a select HIGH_RES_TIMERS in future,
> but first I need to work out why we are seeing stalls with it disabled.

Good point, and fair enough!

							Thanx, Paul

^ permalink raw reply	[flat|nested] 56+ messages in thread

* (no subject)
  2022-04-08 14:42       ` Michael Ellerman
@ 2022-04-13  5:11           ` Nicholas Piggin
  2022-04-08 17:02           ` Miguel Ojeda
  2022-04-13  5:11           ` Nicholas Piggin
  2 siblings, 0 replies; 56+ messages in thread
From: Nicholas Piggin @ 2022-04-13  5:11 UTC (permalink / raw)
  To: Michael Ellerman, paulmck, Zhouyi Zhou
  Cc: linuxppc-dev, Miguel Ojeda, rcu, Daniel Lezcano, Thomas Gleixner,
	linux-kernel, Viresh Kumar

+Daniel, Thomas, Viresh

Subject: Re: rcu_sched self-detected stall on CPU

Excerpts from Michael Ellerman's message of April 9, 2022 12:42 am:
> Michael Ellerman <mpe@ellerman.id.au> writes:
>> "Paul E. McKenney" <paulmck@kernel.org> writes:
>>> On Wed, Apr 06, 2022 at 05:31:10PM +0800, Zhouyi Zhou wrote:
>>>> Hi
>>>> 
>>>> I can reproduce it in a ppc virtual cloud server provided by Oregon
>>>> State University.  Following is what I do:
>>>> 1) curl -l https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/snapshot/linux-5.18-rc1.tar.gz
>>>> -o linux-5.18-rc1.tar.gz
>>>> 2) tar zxf linux-5.18-rc1.tar.gz
>>>> 3) cp config linux-5.18-rc1/.config
>>>> 4) cd linux-5.18-rc1
>>>> 5) make vmlinux -j 8
>>>> 6) qemu-system-ppc64 -kernel vmlinux -nographic -vga none -no-reboot
>>>> -smp 2 (QEMU 4.2.1)
>>>> 7) after 12 rounds, the bug got reproduced:
>>>> (http://154.223.142.244/logs/20220406/qemu.log.txt)
>>>
>>> Just to make sure, are you both seeing the same thing?  Last I knew,
>>> Zhouyi was chasing an RCU-tasks issue that appears only in kernels
>>> built with CONFIG_PROVE_RCU=y, which Miguel does not have set.  Or did
>>> I miss something?
>>>
>>> Miguel is instead seeing an RCU CPU stall warning where RCU's grace-period
>>> kthread slept for three milliseconds, but did not wake up for more than
>>> 20 seconds.  This kthread would normally have awakened on CPU 1, but
>>> CPU 1 looks to me to be very unhealthy, as can be seen in your console
>>> output below (but maybe my idea of what is healthy for powerpc systems
>>> is outdated).  Please see also the inline annotations.
>>>
>>> Thoughts from the PPC guys?
>>
>> I haven't seen it in my testing. But using Miguel's config I can
>> reproduce it seemingly on every boot.
>>
>> For me it bisects to:
>>
>>   35de589cb879 ("powerpc/time: improve decrementer clockevent processing")
>>
>> Which seems plausible.
>>
>> Reverting that on mainline makes the bug go away.
>>
>> I don't see an obvious bug in the diff, but I could be wrong, or the old
>> code was papering over an existing bug?
>>
>> I'll try and work out what it is about Miguel's config that exposes
>> this vs our defconfig, that might give us a clue.
> 
> It's CONFIG_HIGH_RES_TIMERS=n which triggers the stall.
> 
> I can reproduce just with:
> 
>   $ make ppc64le_guest_defconfig
>   $ ./scripts/config -d HIGH_RES_TIMERS
> 
> We have no defconfigs that disable HIGH_RES_TIMERS, I didn't even
> realise you could disable it TBH :)
> 
> The Rust CI has it disabled because I copied that from the x86 defconfig
> they were using back when I added the Rust support. I think that was
> meant to be a stripped down fast config for CI, but the result is it's
> just using a badly tested combination which is not helpful.
> 
> So I'll send a patch to turn HIGH_RES_TIMERS on for the Rust CI, and we
> can debug this further without blocking them.

So we traced the problem down to what is possibly a misunderstanding
between the decrementer clock event device and the core code.

The decrementer is only oneshot*ish*. It actually needs to be either
reprogrammed or shut down, otherwise it just continues to cause
interrupts.

Before commit 35de589cb879, it was sort of two-shot. The initial
interrupt at the programmed time would set its internal next_tb variable
to ~0 and call the ->event_handler(). If that did not set_next_event or
stop the timer, the interrupt would fire again immediately, notice
next_tb is ~0, and only then stop the decrementer interrupt.

So that was already kind of ugly, this patch just turned it into a hang.

The problem happens when the tick is stopped with an event still 
pending, then tick_nohz_handler() is called, but it bails out because 
tick_stopped == 1 so the device never gets programmed again, and so it 
keeps firing.

How to fix it? Before commit a7cba02deced, powerpc's decrementer was 
really oneshot, but we would like to avoid doing that because it requires 
additional programming of the hardware on each timer interrupt. We have 
the ONESHOT_STOPPED state which seems to be just about what we want.

Did the ONESHOT_STOPPED patch just miss this case, or is there a reason 
we don't stop it here? This patch seems to fix the hang (not heavily
tested though).
 
Thanks,
Nick

---
diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index 2d76c91b85de..7e13a55b6b71 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -1364,9 +1364,11 @@ static void tick_nohz_handler(struct clock_event_device *dev)
 	tick_sched_do_timer(ts, now);
 	tick_sched_handle(ts, regs);
 
-	/* No need to reprogram if we are running tickless  */
-	if (unlikely(ts->tick_stopped))
+	if (unlikely(ts->tick_stopped)) {
+		/* If we are tickless, change the clock event to stopped */
+		tick_program_event(KTIME_MAX, 1);
 		return;
+	}
 
 	hrtimer_forward(&ts->sched_timer, now, TICK_NSEC);
 	tick_program_event(hrtimer_get_expires(&ts->sched_timer), 1);

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* Low-res tick handler device not going to ONESHOT_STOPPED when tick is stopped (was: rcu_sched self-detected stall on CPU)
  2022-04-13  5:11           ` Nicholas Piggin
@ 2022-04-13  6:10             ` Nicholas Piggin
  -1 siblings, 0 replies; 56+ messages in thread
From: Nicholas Piggin @ 2022-04-13  6:10 UTC (permalink / raw)
  To: Michael Ellerman, paulmck, Zhouyi Zhou
  Cc: Viresh Kumar, Daniel Lezcano, linux-kernel, rcu, Miguel Ojeda,
	Thomas Gleixner, linuxppc-dev

Oops, fixed subject...

Excerpts from Nicholas Piggin's message of April 13, 2022 3:11 pm:
> +Daniel, Thomas, Viresh
> 
> Subject: Re: rcu_sched self-detected stall on CPU
> 
> Excerpts from Michael Ellerman's message of April 9, 2022 12:42 am:
>> Michael Ellerman <mpe@ellerman.id.au> writes:
>>> "Paul E. McKenney" <paulmck@kernel.org> writes:
>>>> On Wed, Apr 06, 2022 at 05:31:10PM +0800, Zhouyi Zhou wrote:
>>>>> Hi
>>>>> 
>>>>> I can reproduce it in a ppc virtual cloud server provided by Oregon
>>>>> State University.  Following is what I do:
>>>>> 1) curl -l https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/snapshot/linux-5.18-rc1.tar.gz
>>>>> -o linux-5.18-rc1.tar.gz
>>>>> 2) tar zxf linux-5.18-rc1.tar.gz
>>>>> 3) cp config linux-5.18-rc1/.config
>>>>> 4) cd linux-5.18-rc1
>>>>> 5) make vmlinux -j 8
>>>>> 6) qemu-system-ppc64 -kernel vmlinux -nographic -vga none -no-reboot
>>>>> -smp 2 (QEMU 4.2.1)
>>>>> 7) after 12 rounds, the bug got reproduced:
>>>>> (http://154.223.142.244/logs/20220406/qemu.log.txt)
>>>>
>>>> Just to make sure, are you both seeing the same thing?  Last I knew,
>>>> Zhouyi was chasing an RCU-tasks issue that appears only in kernels
>>>> built with CONFIG_PROVE_RCU=y, which Miguel does not have set.  Or did
>>>> I miss something?
>>>>
>>>> Miguel is instead seeing an RCU CPU stall warning where RCU's grace-period
>>>> kthread slept for three milliseconds, but did not wake up for more than
>>>> 20 seconds.  This kthread would normally have awakened on CPU 1, but
>>>> CPU 1 looks to me to be very unhealthy, as can be seen in your console
>>>> output below (but maybe my idea of what is healthy for powerpc systems
>>>> is outdated).  Please see also the inline annotations.
>>>>
>>>> Thoughts from the PPC guys?
>>>
>>> I haven't seen it in my testing. But using Miguel's config I can
>>> reproduce it seemingly on every boot.
>>>
>>> For me it bisects to:
>>>
>>>   35de589cb879 ("powerpc/time: improve decrementer clockevent processing")
>>>
>>> Which seems plausible.
>>>
>>> Reverting that on mainline makes the bug go away.
>>>
>>> I don't see an obvious bug in the diff, but I could be wrong, or the old
>>> code was papering over an existing bug?
>>>
>>> I'll try and work out what it is about Miguel's config that exposes
>>> this vs our defconfig, that might give us a clue.
>> 
>> It's CONFIG_HIGH_RES_TIMERS=n which triggers the stall.
>> 
>> I can reproduce just with:
>> 
>>   $ make ppc64le_guest_defconfig
>>   $ ./scripts/config -d HIGH_RES_TIMERS
>> 
>> We have no defconfigs that disable HIGH_RES_TIMERS, I didn't even
>> realise you could disable it TBH :)
>> 
>> The Rust CI has it disabled because I copied that from the x86 defconfig
>> they were using back when I added the Rust support. I think that was
>> meant to be a stripped down fast config for CI, but the result is it's
>> just using a badly tested combination which is not helpful.
>> 
>> So I'll send a patch to turn HIGH_RES_TIMERS on for the Rust CI, and we
>> can debug this further without blocking them.
> 
> So we traced the problem down to possibly a misunderstanding between 
> decrementer clock event device and core code.
> 
> The decrementer is only oneshot*ish*. It actually needs to either be 
> reprogrammed or shut down otherwise it just continues to cause 
> interrupts.
> 
> Before commit 35de589cb879, it was sort of two-shot. The initial 
> interrupt at the programmed time would set its internal next_tb variable 
> to ~0 and call the ->event_handler(). If that did not set_next_event or 
> stop the timer, the interrupt will fire again immediately, notice 
> next_tb is ~0, and only then stop the decrementer interrupt.
> 
> So that was already kind of ugly, this patch just turned it into a hang.
> 
> The problem happens when the tick is stopped with an event still 
> pending, then tick_nohz_handler() is called, but it bails out because 
> tick_stopped == 1 so the device never gets programmed again, and so it 
> keeps firing.
> 
> How to fix it? Before commit a7cba02deced, powerpc's decrementer was 
> really oneshot, but we would like to avoid doing that because it requires 
> additional programming of the hardware on each timer interrupt. We have 
> the ONESHOT_STOPPED state which seems to be just about what we want.
> 
> Did the ONESHOT_STOPPED patch just miss this case, or is there a reason 
> we don't stop it here? This patch seems to fix the hang (not heavily
> tested though).
>  
> Thanks,
> Nick
> 
> ---
> diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
> index 2d76c91b85de..7e13a55b6b71 100644
> --- a/kernel/time/tick-sched.c
> +++ b/kernel/time/tick-sched.c
> @@ -1364,9 +1364,11 @@ static void tick_nohz_handler(struct clock_event_device *dev)
>  	tick_sched_do_timer(ts, now);
>  	tick_sched_handle(ts, regs);
>  
> -	/* No need to reprogram if we are running tickless  */
> -	if (unlikely(ts->tick_stopped))
> +	if (unlikely(ts->tick_stopped)) {
> +		/* If we are tickless, change the clock event to stopped */
> +		tick_program_event(KTIME_MAX, 1);
>  		return;
> +	}
>  
>  	hrtimer_forward(&ts->sched_timer, now, TICK_NSEC);
>  	tick_program_event(hrtimer_get_expires(&ts->sched_timer), 1);
> 

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: Low-res tick handler device not going to ONESHOT_STOPPED when tick is stopped (was: rcu_sched self-detected stall on CPU)
  2022-04-13  6:10             ` Nicholas Piggin
@ 2022-04-14 17:15               ` Paul E. McKenney
  -1 siblings, 0 replies; 56+ messages in thread
From: Paul E. McKenney @ 2022-04-14 17:15 UTC (permalink / raw)
  To: Nicholas Piggin
  Cc: frederic, Daniel Lezcano, linux-kernel, rcu, Miguel Ojeda,
	Viresh Kumar, Zhouyi Zhou, Thomas Gleixner, linuxppc-dev

On Wed, Apr 13, 2022 at 04:10:02PM +1000, Nicholas Piggin wrote:
> Oops, fixed subject...
> 
> Excerpts from Nicholas Piggin's message of April 13, 2022 3:11 pm:
> > +Daniel, Thomas, Viresh
> > 
> > Subject: Re: rcu_sched self-detected stall on CPU
> > 
> > Excerpts from Michael Ellerman's message of April 9, 2022 12:42 am:
> >> Michael Ellerman <mpe@ellerman.id.au> writes:
> >>> "Paul E. McKenney" <paulmck@kernel.org> writes:
> >>>> On Wed, Apr 06, 2022 at 05:31:10PM +0800, Zhouyi Zhou wrote:
> >>>>> Hi
> >>>>> 
> >>>>> I can reproduce it in a ppc virtual cloud server provided by Oregon
> >>>>> State University.  Following is what I do:
> >>>>> 1) curl -l https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/snapshot/linux-5.18-rc1.tar.gz
> >>>>> -o linux-5.18-rc1.tar.gz
> >>>>> 2) tar zxf linux-5.18-rc1.tar.gz
> >>>>> 3) cp config linux-5.18-rc1/.config
> >>>>> 4) cd linux-5.18-rc1
> >>>>> 5) make vmlinux -j 8
> >>>>> 6) qemu-system-ppc64 -kernel vmlinux -nographic -vga none -no-reboot
> >>>>> -smp 2 (QEMU 4.2.1)
> >>>>> 7) after 12 rounds, the bug got reproduced:
> >>>>> (http://154.223.142.244/logs/20220406/qemu.log.txt)
> >>>>
> >>>> Just to make sure, are you both seeing the same thing?  Last I knew,
> >>>> Zhouyi was chasing an RCU-tasks issue that appears only in kernels
> >>>> built with CONFIG_PROVE_RCU=y, which Miguel does not have set.  Or did
> >>>> I miss something?
> >>>>
> >>>> Miguel is instead seeing an RCU CPU stall warning where RCU's grace-period
> >>>> kthread slept for three milliseconds, but did not wake up for more than
> >>>> 20 seconds.  This kthread would normally have awakened on CPU 1, but
> >>>> CPU 1 looks to me to be very unhealthy, as can be seen in your console
> >>>> output below (but maybe my idea of what is healthy for powerpc systems
> >>>> is outdated).  Please see also the inline annotations.
> >>>>
> >>>> Thoughts from the PPC guys?
> >>>
> >>> I haven't seen it in my testing. But using Miguel's config I can
> >>> reproduce it seemingly on every boot.
> >>>
> >>> For me it bisects to:
> >>>
> >>>   35de589cb879 ("powerpc/time: improve decrementer clockevent processing")
> >>>
> >>> Which seems plausible.
> >>>
> >>> Reverting that on mainline makes the bug go away.
> >>>
> >>> I don't see an obvious bug in the diff, but I could be wrong, or the old
> >>> code was papering over an existing bug?
> >>>
> >>> I'll try and work out what it is about Miguel's config that exposes
> >>> this vs our defconfig, that might give us a clue.
> >> 
> >> It's CONFIG_HIGH_RES_TIMERS=n which triggers the stall.
> >> 
> >> I can reproduce just with:
> >> 
> >>   $ make ppc64le_guest_defconfig
> >>   $ ./scripts/config -d HIGH_RES_TIMERS
> >> 
> >> We have no defconfigs that disable HIGH_RES_TIMERS, I didn't even
> >> realise you could disable it TBH :)
> >> 
> >> The Rust CI has it disabled because I copied that from the x86 defconfig
> >> they were using back when I added the Rust support. I think that was
> >> meant to be a stripped down fast config for CI, but the result is it's
> >> just using a badly tested combination which is not helpful.
> >> 
> >> So I'll send a patch to turn HIGH_RES_TIMERS on for the Rust CI, and we
> >> can debug this further without blocking them.
> > 
> > So we traced the problem down to possibly a misunderstanding between 
> > decrementer clock event device and core code.
> > 
> > The decrementer is only oneshot*ish*. It actually needs to either be 
> > reprogrammed or shut down otherwise it just continues to cause 
> > interrupts.
> > 
> > Before commit 35de589cb879, it was sort of two-shot. The initial 
> > interrupt at the programmed time would set its internal next_tb variable 
> > to ~0 and call the ->event_handler(). If that did not set_next_event or 
> > stop the timer, the interrupt will fire again immediately, notice 
> > next_tb is ~0, and only then stop the decrementer interrupt.
> > 
> > So that was already kind of ugly, this patch just turned it into a hang.
> > 
> > The problem happens when the tick is stopped with an event still 
> > pending, then tick_nohz_handler() is called, but it bails out because 
> > tick_stopped == 1 so the device never gets programmed again, and so it 
> > keeps firing.
> > 
> > How to fix it? Before commit a7cba02deced, powerpc's decrementer was 
> > really oneshot, but we would like to avoid doing that because it requires 
> > additional programming of the hardware on each timer interrupt. We have 
> > the ONESHOT_STOPPED state which seems to be just about what we want.
> > 
> > Did the ONESHOT_STOPPED patch just miss this case, or is there a reason 
> > we don't stop it here? This patch seems to fix the hang (not heavily
> > tested though).

This looks plausible to me based on my interactions with ticks, but it
would be good to have someone who understands that code better than I
do to look it over.

							Thanx, Paul

> > Thanks,
> > Nick
> > 
> > ---
> > diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
> > index 2d76c91b85de..7e13a55b6b71 100644
> > --- a/kernel/time/tick-sched.c
> > +++ b/kernel/time/tick-sched.c
> > @@ -1364,9 +1364,11 @@ static void tick_nohz_handler(struct clock_event_device *dev)
> >  	tick_sched_do_timer(ts, now);
> >  	tick_sched_handle(ts, regs);
> >  
> > -	/* No need to reprogram if we are running tickless  */
> > -	if (unlikely(ts->tick_stopped))
> > +	if (unlikely(ts->tick_stopped)) {
> > +		/* If we are tickless, change the clock event to stopped */
> > +		tick_program_event(KTIME_MAX, 1);
> >  		return;
> > +	}
> >  
> >  	hrtimer_forward(&ts->sched_timer, now, TICK_NSEC);
> >  	tick_program_event(hrtimer_get_expires(&ts->sched_timer), 1);
> > 

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re:
  2022-04-13  5:11           ` Nicholas Piggin
@ 2022-04-22 15:53             ` Thomas Gleixner
  -1 siblings, 0 replies; 56+ messages in thread
From: Thomas Gleixner @ 2022-04-22 15:53 UTC (permalink / raw)
  To: Nicholas Piggin, Michael Ellerman, paulmck, Zhouyi Zhou
  Cc: linuxppc-dev, Miguel Ojeda, rcu, Daniel Lezcano, linux-kernel,
	Viresh Kumar

On Wed, Apr 13 2022 at 15:11, Nicholas Piggin wrote:
> So we traced the problem down to possibly a misunderstanding between 
> decrementer clock event device and core code.
>
> The decrementer is only oneshot*ish*. It actually needs to either be 
> reprogrammed or shut down otherwise it just continues to cause 
> interrupts.

I always thought that PPC had sane timers. That's really disillusioning.

> Before commit 35de589cb879, it was sort of two-shot. The initial 
> interrupt at the programmed time would set its internal next_tb variable 
> to ~0 and call the ->event_handler(). If that did not set_next_event or 
> stop the timer, the interrupt will fire again immediately, notice 
> next_tb is ~0, and only then stop the decrementer interrupt.
>
> So that was already kind of ugly, this patch just turned it into a hang.
>
> The problem happens when the tick is stopped with an event still 
> pending, then tick_nohz_handler() is called, but it bails out because 
> tick_stopped == 1 so the device never gets programmed again, and so it 
> keeps firing.
>
> How to fix it? Before commit a7cba02deced, powerpc's decrementer was 
> really oneshot, but we would like to avoid doing that because it requires 
> additional programming of the hardware on each timer interrupt. We have 
> the ONESHOT_STOPPED state which seems to be just about what we want.
>
> Did the ONESHOT_STOPPED patch just miss this case, or is there a reason 
> we don't stop it here? This patch seems to fix the hang (not heavily
> tested though).

This was definitely overlooked, but it's arguable that it is not required
for real oneshot clockevent devices. This should only handle the case
where the interrupt was already pending.

The ONESHOT_STOPPED state was introduced to handle the case where the
last timer gets canceled, so the already programmed event does not fire.

It was not necessarily meant to "fix" clockevent devices which are
pretending to be ONESHOT, but keep firing over and over.

That said, I'm fine with the change along with a big fat comment why
this is required.
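
As a rough illustration of the ONESHOT_STOPPED handling discussed above,
here is a small user-space model. It is loosely modelled on what
tick_program_event() appears to do when handed KTIME_MAX; the toy_*
names, the struct layout and the hw_writes counter are assumptions made
for this sketch and do not exist in the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_KTIME_MAX INT64_MAX	/* stand-in for the kernel's KTIME_MAX */

enum toy_state { TOY_ONESHOT, TOY_ONESHOT_STOPPED };

struct toy_clockevent {
	enum toy_state state;
	int64_t next_event;
	int hw_writes;		/* how often the hardware would be armed */
};

/*
 * Programming TOY_KTIME_MAX means "no event wanted": the device is moved
 * to the stopped state and the hardware is not armed. Any real expiry
 * moves a stopped device back to oneshot before arming it.
 */
static int toy_tick_program_event(struct toy_clockevent *dev, int64_t expires)
{
	assert(dev != NULL);

	if (expires == TOY_KTIME_MAX) {
		dev->state = TOY_ONESHOT_STOPPED;
		dev->next_event = TOY_KTIME_MAX;
		return 0;
	}

	if (dev->state == TOY_ONESHOT_STOPPED)
		dev->state = TOY_ONESHOT;

	dev->next_event = expires;
	dev->hw_writes++;
	return 0;
}
```

The point of the sketch is that stopping costs no hardware programming
in the model, and the device transparently returns to oneshot operation
the next time a real expiry is requested.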

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re:
  2022-04-22 15:53             ` Re: Thomas Gleixner
@ 2022-04-23  2:29               ` Nicholas Piggin
  -1 siblings, 0 replies; 56+ messages in thread
From: Nicholas Piggin @ 2022-04-23  2:29 UTC (permalink / raw)
  To: Michael Ellerman, paulmck, Thomas Gleixner, Zhouyi Zhou
  Cc: Daniel Lezcano, linux-kernel, linuxppc-dev, Miguel Ojeda, rcu,
	Viresh Kumar

Excerpts from Thomas Gleixner's message of April 23, 2022 1:53 am:
> On Wed, Apr 13 2022 at 15:11, Nicholas Piggin wrote:
>> So we traced the problem down to possibly a misunderstanding between 
>> decrementer clock event device and core code.
>>
>> The decrementer is only oneshot*ish*. It actually needs to either be 
>> reprogrammed or shut down otherwise it just continues to cause 
>> interrupts.
> 
> I always thought that PPC had sane timers. That's really disillusioning.

My comment was probably a somewhat misleading explanation of the whole
situation. This weirdness is actually in software, in the powerpc
clock event driver, due to a recent change I made that assumed the
clock event device goes to oneshot-stopped.

The hardware is relatively sane, I think: a global, synchronized,
constant-rate, high-frequency clock distributed to the CPUs so reads
don't go off-core, and a per-CPU "decrementer" event interrupt at the
same frequency as the clock -- program it to a positive value and it
decrements until zero, then raises what is basically a level-triggered
interrupt.
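
As a rough illustration of that behaviour, here is a toy userspace
model of the decrementer. It is only a sketch: the toy_dec_* names are
made up, and treating "expired" as "value has reached zero" is a
simplification of what the hardware really does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model of a per-CPU decrementer; not real powerpc code. */
struct toy_dec {
	int64_t value;  /* remaining clock ticks until the event */
};

/* Program the decrementer to expire after `ticks` clock ticks. */
static void toy_dec_program(struct toy_dec *dec, int64_t ticks)
{
	assert(ticks > 0);
	dec->value = ticks;
}

/* Advance the clock by `ticks`; the counter stops at zero in this model. */
static void toy_dec_advance(struct toy_dec *dec, int64_t ticks)
{
	dec->value = (ticks >= dec->value) ? 0 : dec->value - ticks;
}

/*
 * Level-triggered: the interrupt stays asserted for as long as the
 * decrementer is expired, until software programs a new positive value.
 */
static bool toy_dec_irq_asserted(const struct toy_dec *dec)
{
	return dec->value == 0;
}
```

The point of the model is that nothing de-asserts the interrupt except
reprogramming: once expired, toy_dec_irq_asserted() keeps returning
true until toy_dec_program() is called again.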

Before my change, the decrementer interrupt handler would always clear
the interrupt at entry. The event_handler usually programs another
timer, so I tried to avoid that first clear, counting on the
oneshot_stopped callback to clear the interrupt if there was no
other timer.
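
A toy userspace model of the two interrupt-entry policies may help to
show where this goes wrong. It is a sketch under stated assumptions,
not the driver code: every toy_* name is hypothetical, tick_stopped
stands in for the tick_stopped == 1 bail-out in the tick handler, and
stop_on_bail models a fix that switches to oneshot-stopped on that path.

```c
#include <stdbool.h>

enum toy_policy {
	TOY_CLEAR_AT_ENTRY,     /* older behaviour: always clear at entry */
	TOY_RELY_ON_CALLBACKS,  /* changed behaviour: callbacks must clear */
};

struct toy_cpu {
	bool irq_asserted;  /* level-triggered decrementer interrupt */
	bool tick_stopped;  /* the event handler bails out when set */
	bool stop_on_bail;  /* hypothetical fix: stop the device on bail-out */
};

/* Both callbacks reprogram or shut down the device, clearing the interrupt. */
static void toy_set_next_event(struct toy_cpu *cpu) { cpu->irq_asserted = false; }
static void toy_oneshot_stopped(struct toy_cpu *cpu) { cpu->irq_asserted = false; }

static void toy_event_handler(struct toy_cpu *cpu)
{
	if (cpu->tick_stopped) {
		if (cpu->stop_on_bail)
			toy_oneshot_stopped(cpu);
		return;  /* bail out without programming a new event */
	}
	toy_set_next_event(cpu);
}

static void toy_timer_interrupt(struct toy_cpu *cpu, enum toy_policy policy)
{
	if (policy == TOY_CLEAR_AT_ENTRY)
		cpu->irq_asserted = false;
	toy_event_handler(cpu);
}

/* Let one programmed event expire; count interrupts taken, capped at limit. */
static int toy_count_interrupts(struct toy_cpu *cpu, enum toy_policy policy,
				int limit)
{
	int taken = 0;

	cpu->irq_asserted = true;
	while (cpu->irq_asserted && taken < limit) {
		toy_timer_interrupt(cpu, policy);
		taken++;
	}
	return taken;
}
```

With tick_stopped set and no stop on the bail-out path, the
TOY_RELY_ON_CALLBACKS policy keeps taking interrupts until the cap is
hit, which is the hang in miniature. Either clearing at entry or
stopping the device on bail-out brings it back to a single interrupt.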

>> Before commit 35de589cb879, it was sort of two-shot. The initial 
>> interrupt at the programmed time would set its internal next_tb variable 
>> to ~0 and call the ->event_handler(). If that did not set_next_event or 
>> stop the timer, the interrupt will fire again immediately, notice 
>> next_tb is ~0, and only then stop the decrementer interrupt.
>>
>> So that was already kind of ugly, this patch just turned it into a hang.
>>
>> The problem happens when the tick is stopped with an event still 
>> pending, then tick_nohz_handler() is called, but it bails out because 
>> tick_stopped == 1 so the device never gets programmed again, and so it 
>> keeps firing.
>>
>> How to fix it? Before commit a7cba02deced, powerpc's decrementer was 
>> really oneshot, but we would like to avoid doing that because it requires 
>> additional programming of the hardware on each timer interrupt. We have 
>> the ONESHOT_STOPPED state which seems to be just about what we want.
>>
>> Did the ONESHOT_STOPPED patch just miss this case, or is there a reason 
>> we don't stop it here? This patch seems to fix the hang (not heavily
>> tested though).
> 
> This was definitely overlooked, but it's arguable that it is not required
> for real oneshot clockevent devices. This should only handle the case
> where the interrupt was already pending.
> 
> The ONESHOT_STOPPED state was introduced to handle the case where the
> last timer gets canceled, so the already programmed event does not fire.
> 
> It was not necessarily meant to "fix" clockevent devices which are
> pretending to be ONESHOT, but keep firing over and over.
> 
> That said, I'm fine with the change along with a big fat comment
> explaining why this is required.

Thanks for taking a look and confirming. I just sent a patch with a
comment and what looks like another missed case. Hopefully it's okay.

Thanks,
Nick


end of thread, other threads:[~2022-04-23  2:30 UTC | newest]

Thread overview: 56+ messages
2022-04-05 21:41 rcu_sched self-detected stall on CPU Miguel Ojeda
2022-04-06  9:31 ` Zhouyi Zhou
2022-04-06  9:31   ` Zhouyi Zhou
2022-04-06 17:00   ` Paul E. McKenney
2022-04-06 17:00     ` Paul E. McKenney
2022-04-06 18:25     ` Zhouyi Zhou
2022-04-06 18:25       ` Zhouyi Zhou
2022-04-06 19:50       ` Paul E. McKenney
2022-04-06 19:50         ` Paul E. McKenney
2022-04-07  2:26         ` Zhouyi Zhou
2022-04-07  2:26           ` Zhouyi Zhou
2022-04-07 10:07           ` Miguel Ojeda
2022-04-07 10:07             ` Miguel Ojeda
2022-04-07 15:15             ` Paul E. McKenney
2022-04-07 15:15               ` Paul E. McKenney
2022-04-07 17:05               ` Miguel Ojeda
2022-04-07 17:05                 ` Miguel Ojeda
2022-04-07 17:55                 ` Paul E. McKenney
2022-04-07 17:55                   ` Paul E. McKenney
2022-04-07 23:14                   ` Zhouyi Zhou
2022-04-07 23:14                     ` Zhouyi Zhou
2022-04-08  1:43                     ` Paul E. McKenney
2022-04-08  1:43                       ` Paul E. McKenney
2022-04-08  7:23     ` Michael Ellerman
2022-04-08 10:02       ` Zhouyi Zhou
2022-04-08 10:02         ` Zhouyi Zhou
2022-04-08 14:07         ` Paul E. McKenney
2022-04-08 14:07           ` Paul E. McKenney
2022-04-08 14:25           ` Zhouyi Zhou
2022-04-08 14:25             ` Zhouyi Zhou
2022-04-10 11:33             ` Michael Ellerman
2022-04-11  3:05               ` Paul E. McKenney
2022-04-11  3:05                 ` Paul E. McKenney
2022-04-12  6:53                 ` Michael Ellerman
2022-04-12  6:53                   ` Michael Ellerman
2022-04-12 13:36                   ` Paul E. McKenney
2022-04-12 13:36                     ` Paul E. McKenney
2022-04-08 13:52       ` Miguel Ojeda
2022-04-08 13:52         ` Miguel Ojeda
2022-04-08 14:06       ` Paul E. McKenney
2022-04-08 14:06         ` Paul E. McKenney
2022-04-08 14:42       ` Michael Ellerman
2022-04-08 15:52         ` Paul E. McKenney
2022-04-08 15:52           ` Paul E. McKenney
2022-04-08 17:02         ` Miguel Ojeda
2022-04-08 17:02           ` Miguel Ojeda
2022-04-13  5:11         ` Nicholas Piggin
2022-04-13  5:11           ` Nicholas Piggin
2022-04-13  6:10           ` Low-res tick handler device not going to ONESHOT_STOPPED when tick is stopped (was: rcu_sched self-detected stall on CPU) Nicholas Piggin
2022-04-13  6:10             ` Nicholas Piggin
2022-04-14 17:15             ` Paul E. McKenney
2022-04-14 17:15               ` Paul E. McKenney
2022-04-22 15:53           ` Thomas Gleixner
2022-04-22 15:53             ` Re: Thomas Gleixner
2022-04-23  2:29             ` Re: Nicholas Piggin
2022-04-23  2:29               ` Re: Nicholas Piggin
