* WARNING: CPU: 0 PID: 12 at kernel/sched/fair.c:3306 update_blocked_averages+0x941/0x9a0
@ 2021-07-30 15:21 Ammar Faizi
From: Ammar Faizi @ 2021-07-30 15:21 UTC
  To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Daniel Bristot de Oliveira
  Cc: linux-kernel, Ammar Faizi

[-- Attachment #1: Type: text/plain, Size: 4872 bytes --]

Hi everyone,

I compiled Linux 5.13.0 and am running it on Ubuntu. I got a kernel warning
at kernel/sched/fair.c:3306.

Below is the system information:
Kernel: 5.13.0-icetea001-12377-gf55966571d5e
OS: Ubuntu 21.04
CPU: 4 cores
Hardware name: Acer Aspire ES1-421/OLVIA_BE, BIOS V1.05 07/02/2015

Reproduction steps:
1. Connect to a wireless network (with internet access).
2. After some time (how long it takes to reproduce is random), the internet
connection suddenly hangs for a few seconds. After that the network is down,
but the interface still shows as connected.

The only way to get the network back is to reconnect the wireless:
  # OK, after the hang, the internet doesn't work.

  # See, it's still connected.
  nmcli c

  # Disconnect
  nmcli c down qwerty;

  # Connect again
  nmcli c up qwerty;

  # Internet works again after reconnecting.

3. Check `dmesg -Sr`.


Here is the warning (I have also attached a fuller log and the kernel config):

[    C0] ------------[ cut here ]------------
[    C0] cfs_rq->avg.load_avg || cfs_rq->avg.util_avg || cfs_rq->avg.runnable_avg
[    C0] WARNING: CPU: 0 PID: 12 at kernel/sched/fair.c:3306 update_blocked_averages+0x941/0x9a0
[    C0] Modules linked in: rfcomm xt_CHECKSUM xt_MASQUERADE
xt_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp nft_compat
nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4
nft_counter nf_tables nfnetlink bridge stp llc bfq cmac algif_hash
algif_skcipher af_alg bnep dm_multipath scsi_dh_rdac scsi_dh_emc
scsi_dh_alua snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio
btusb snd_hda_codec_hdmi btrtl snd_hda_intel uvcvideo btbcm btintel
snd_intel_dspcfg snd_intel_sdw_acpi videobuf2_vmalloc bluetooth
snd_hda_codec videobuf2_memops snd_hda_core videobuf2_v4l2 snd_hwdep
videobuf2_common snd_pcm edac_mce_amd videodev snd_seq_midi
snd_seq_midi_event snd_rawmidi kvm_amd ecdh_generic mc ecc kvm snd_seq
wl(OE) acer_wmi snd_seq_device sparse_keymap snd_timer cfg80211
input_leds snd soundcore wmi_bmof serio_raw ccp k10temp mac_hid
fam15h_power sch_fq_codel msr ip_tables x_tables autofs4 btrfs
blake2b_generic raid10 raid456 async_raid6_recov async_memcpy async_pq
[    C0]  async_xor async_tx xor raid6_pq libcrc32c raid1 raid0
multipath linear amdgpu iommu_v2 gpu_sched radeon i2c_algo_bit
drm_ttm_helper ttm drm_kms_helper hid_generic syscopyarea rtsx_pci_sdmmc
sysfillrect sysimgblt crct10dif_pclmul fb_sys_fops cec crc32_pclmul
rc_core ghash_clmulni_intel usbhid aesni_intel sdhci_pci crypto_simd
cqhci xhci_pci r8169 psmouse drm xhci_pci_renesas ahci cryptd realtek
sdhci rtsx_pci libahci hid i2c_piix4 wmi video
[    C0] CPU: 0 PID: 12 Comm: ksoftirqd/0 Tainted: G           OE     5.13.0-icetea001-12377-gf55966571d5e #3
[    C0] Hardware name: Acer Aspire ES1-421/OLVIA_BE, BIOS V1.05 07/02/2015
[    C0] RIP: 0010:update_blocked_averages+0x941/0x9a0
[    C0] Code: 00 e9 a7 fe ff ff e8 9e 22 c2 00 e9 4b f9 ff ff 0f 0b e9 da fe ff ff 48 c7 c7 88 c1 5c 82 c6 05 07 f6 ae 01 01 e8 d3 50 bc 00 <0f> 0b 41 8b 84 24 78 01 00 00 e9 f8 fa ff ff 48 c7 c7 88 bb 5c 82
[    C0] RSP: 0018:ffffc900001a7de0 EFLAGS: 00010082
[    C0] RAX: 0000000000000000 RBX: ffff888104ec6980 RCX: 0000000000000027
[    C0] RDX: ffff888313c18e28 RSI: 0000000000000001 RDI: ffff888313c18e20
[    C0] RBP: ffffc900001a7e58 R08: ffffffff82962048 R09: 00000000ffffdfff
[    C0] R10: ffffffff82882060 R11: ffffffff82882060 R12: ffff888104ec6800
[    C0] R13: 0000000000000000 R14: 0000735d623b8c53 R15: ffff888103830200
[    C0] FS:  0000000000000000(0000) GS:ffff888313c00000(0000) knlGS:0000000000000000
[    C0] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    C0] CR2: 00007ff1f8c91000 CR3: 000000017c42a000 CR4: 00000000000406f0
[    C0] Call Trace:
[    C0]  run_rebalance_domains+0x53/0x80
[    C0]  __do_softirq+0xd2/0x472
[    C0]  run_ksoftirqd+0x3f/0x60
[    C0]  smpboot_thread_fn+0xc2/0x170
[    C0]  ? smpboot_register_percpu_thread+0xe0/0xe0
[    C0]  kthread+0x138/0x160
[    C0]  ? set_kthread_struct+0x50/0x50
[    C0]  ret_from_fork+0x1f/0x30
[    C0] irq event stamp: 43203642
[    C0] hardirqs last  enabled at (43203641): [<ffffffff810a6004>] run_ksoftirqd+0x44/0x60
[    C0] hardirqs last disabled at (43203642): [<ffffffff81d1e8af>] __schedule+0xfcf/0x17d0
[    C0] softirqs last  enabled at (43203640): [<ffffffff810a5fff>] run_ksoftirqd+0x3f/0x60
[    C0] softirqs last disabled at (43203625): [<ffffffff810a5fff>] run_ksoftirqd+0x3f/0x60
[    C0] ---[ end trace 74d3894cf8cf6ef8 ]---

Attachments:
1) config.gz (kernel config used for the build)
2) dmesg.txt (more of the kernel log)
3) proc_cpuinfo.gz (from cat /proc/cpuinfo)
4) proc_modules.gz (from cat /proc/modules)

If you need more information or want me to do something, please let
me know. I will be happy to help.

Regards,
  Ammar


[-- Attachment #2: config.gz --]
[-- Type: application/gzip, Size: 62358 bytes --]

[-- Attachment #3: dmesg.txt --]
[-- Type: text/plain, Size: 16716 bytes --]

<6>[38709.965880][T38129] Enabling non-boot CPUs ...
<6>[38709.966448][T38129] x86: Booting SMP configuration:
<6>[38709.966452][T38129] smpboot: Booting Node 0 Processor 1 APIC 0x1
<6>[38709.966899][    T0] microcode: CPU1: patch_level=0x07030105
<6>[38709.969480][   T17] ACPI: \_PR_.C001: Found 2 idle states
<6>[38709.970394][T38129] CPU1 is up
<6>[38709.971153][T38129] smpboot: Booting Node 0 Processor 2 APIC 0x2
<6>[38709.971630][    T0] microcode: CPU2: patch_level=0x07030105
<6>[38709.974192][   T23] ACPI: \_PR_.C002: Found 2 idle states
<6>[38709.975275][T38129] CPU2 is up
<6>[38709.975865][T38129] smpboot: Booting Node 0 Processor 3 APIC 0x3
<6>[38709.976338][    T0] microcode: CPU3: patch_level=0x07030105
<6>[38709.978912][   T29] ACPI: \_PR_.C003: Found 2 idle states
<6>[38709.980162][T38129] CPU3 is up
<6>[38709.980968][T38129] ACPI: PM: Waking up from system sleep state S3
<6>[38709.982840][T38129] ACPI: EC: interrupt unblocked
<6>[38710.015701][T38129] ACPI: EC: event unblocked
<5>[38710.028398][T38259] sd 0:0:0:0: [sda] Starting disk
<6>[38710.031490][T38235] [drm] PCIE GART of 2048M enabled (table at 0x000000000030E000).
<6>[38710.031661][T38235] radeon 0000:00:01.0: WB enabled
<6>[38710.031682][T38235] radeon 0000:00:01.0: fence driver on ring 0 use gpu addr 0x0000000040000c00
<6>[38710.031687][T38235] radeon 0000:00:01.0: fence driver on ring 1 use gpu addr 0x0000000040000c04
<6>[38710.031691][T38235] radeon 0000:00:01.0: fence driver on ring 2 use gpu addr 0x0000000040000c08
<6>[38710.031695][T38235] radeon 0000:00:01.0: fence driver on ring 3 use gpu addr 0x0000000040000c0c
<6>[38710.031698][T38235] radeon 0000:00:01.0: fence driver on ring 4 use gpu addr 0x0000000040000c10
<6>[38710.032155][T38235] radeon 0000:00:01.0: fence driver on ring 5 use gpu addr 0x0000000000078d30
<6>[38710.032350][T38235] radeon 0000:00:01.0: fence driver on ring 6 use gpu addr 0x0000000040000c18
<6>[38710.032354][T38235] radeon 0000:00:01.0: fence driver on ring 7 use gpu addr 0x0000000040000c1c
<3>[38710.032779][T38235] debugfs: File 'radeon_ring_gfx' in directory '0' already present!
<3>[38710.032788][T38235] debugfs: File 'radeon_ring_cp1' in directory '0' already present!
<3>[38710.032792][T38235] debugfs: File 'radeon_ring_cp2' in directory '0' already present!
<3>[38710.032796][T38235] debugfs: File 'radeon_ring_dma1' in directory '0' already present!
<3>[38710.032800][T38235] debugfs: File 'radeon_ring_dma2' in directory '0' already present!
<6>[38710.034617][T38235] [drm] ring test on 0 succeeded in 2 usecs
<6>[38710.034686][T38235] [drm] ring test on 1 succeeded in 2 usecs
<6>[38710.034698][T38235] [drm] ring test on 2 succeeded in 2 usecs
<6>[38710.034892][T38235] [drm] ring test on 3 succeeded in 3 usecs
<6>[38710.034900][T38235] [drm] ring test on 4 succeeded in 3 usecs
<3>[38710.034904][T38235] debugfs: File 'radeon_ring_uvd' in directory '0' already present!
<6>[38710.060957][T38235] [drm] ring test on 5 succeeded in 1 usecs
<6>[38710.060976][T38235] [drm] UVD initialized successfully.
<3>[38710.060980][T38235] debugfs: File 'radeon_ring_vce1' in directory '0' already present!
<3>[38710.060985][T38235] debugfs: File 'radeon_ring_vce2' in directory '0' already present!
<6>[38710.171028][T38235] [drm] ring test on 6 succeeded in 7 usecs
<6>[38710.171037][T38235] [drm] ring test on 7 succeeded in 2 usecs
<6>[38710.171039][T38235] [drm] VCE initialized successfully.
<6>[38710.171200][T38235] [drm] ib test on ring 0 succeeded in 0 usecs
<6>[38710.171345][T38235] [drm] ib test on ring 1 succeeded in 0 usecs
<6>[38710.171487][T38235] [drm] ib test on ring 2 succeeded in 0 usecs
<6>[38710.171631][T38235] [drm] ib test on ring 3 succeeded in 0 usecs
<6>[38710.171772][T38235] [drm] ib test on ring 4 succeeded in 0 usecs
<6>[38710.207631][T38243] r8169 0000:01:00.1 enp1s0f1: Link is Down
<6>[38710.311030][T37238] usb 1-1.3: reset full-speed USB device number 4 using ehci-pci
<6>[38710.500934][  T226] ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
<6>[38710.503179][  T226] ata2.00: configured for UDMA/133
<6>[38710.517575][T38224] usb 1-1.4: reset high-speed USB device number 5 using ehci-pci
<6>[38710.674395][T38235] [drm] ib test on ring 5 succeeded
<6>[38710.675071][T38235] [drm] ib test on ring 6 succeeded
<6>[38710.675537][T38235] [drm] ib test on ring 7 succeeded
<6>[38712.075685][T38129] OOM killer enabled.
<6>[38712.076615][T38129] Restarting tasks ... done.
<6>[38712.247515][T38129] PM: suspend exit
<6>[38712.314152][  T221] ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
<6>[38712.351731][  T221] ata1.00: configured for UDMA/133
<6>[38712.356592][ T6855] Bluetooth: hci0: BCM: chip id 70
<6>[38712.357597][ T6855] Bluetooth: hci0: BCM: features 0x06
<6>[38712.373616][ T6855] Bluetooth: hci0: BCM43142A
<6>[38712.373632][ T6855] Bluetooth: hci0: BCM43142A0 (001.001.011) build 0000
<3>[38712.374683][ T6855] Bluetooth: hci0: BCM: firmware Patch file not found, tried:
<3>[38712.374692][ T6855] Bluetooth: hci0: BCM: 'brcm/BCM43142A0-04ca-2012.hcd'
<3>[38712.374696][ T6855] Bluetooth: hci0: BCM: 'brcm/BCM-04ca-2012.hcd'
<3>[38714.540944][T38244] Bluetooth: hci0: command 0x1003 tx timeout
<3>[38714.542266][  T646] Bluetooth: hci0: unexpected event for opcode 0x1003
<6>[48927.382553][  T892] IPv6: ADDRCONF(NETDEV_CHANGE): wlp2s0: link becomes ready
<6>[84886.024002][T109369] PM: suspend entry (deep)
<6>[84894.415626][T109369] Filesystems sync: 8.391 seconds
<6>[84894.444695][T109369] Freezing user space processes ... (elapsed 0.004 seconds) done.
<6>[84894.450594][T109369] OOM killer disabled.
<6>[84894.451544][T109369] Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
<6>[84894.453939][T109369] printk: Suspending console(s) (use no_console_suspend to debug)
<5>[84894.486958][T109203] sd 0:0:0:0: [sda] Synchronizing SCSI cache
<5>[84894.777291][T109203] sd 0:0:0:0: [sda] Stopping disk
<6>[84895.345583][T109369] ACPI: EC: interrupt blocked
<6>[84895.377121][T109369] ACPI: PM: Preparing to enter system sleep state S3
<6>[84895.378448][T109369] ACPI: EC: event blocked
<6>[84895.378450][T109369] ACPI: EC: EC stopped
<6>[84895.378452][T109369] ACPI: PM: Saving platform NVS memory
<6>[84895.378460][T109369] Disabling non-boot CPUs ...
<4>[84895.379920][   T19] IRQ 42: no longer affine to CPU1
<6>[84895.380991][T109369] smpboot: CPU 1 is now offline
<4>[84895.385290][   T25] IRQ 39: no longer affine to CPU2
<6>[84895.386348][T109369] smpboot: CPU 2 is now offline
<6>[84895.390032][T109369] smpboot: CPU 3 is now offline
<6>[84895.392453][T109369] ACPI: PM: Low-level resume complete
<6>[84895.392512][T109369] ACPI: EC: EC started
<6>[84895.392513][T109369] ACPI: PM: Restoring platform NVS memory
<6>[84895.392537][T109369] LVT offset 0 assigned for vector 0x400
<6>[84895.392981][T109369] Enabling non-boot CPUs ...
<6>[84895.393548][T109369] x86: Booting SMP configuration:
<6>[84895.393552][T109369] smpboot: Booting Node 0 Processor 1 APIC 0x1
<6>[84895.393981][    T0] microcode: CPU1: patch_level=0x07030105
<6>[84895.396708][   T17] ACPI: \_PR_.C001: Found 2 idle states
<6>[84895.397576][T109369] CPU1 is up
<6>[84895.398114][T109369] smpboot: Booting Node 0 Processor 2 APIC 0x2
<6>[84895.398607][    T0] microcode: CPU2: patch_level=0x07030105
<6>[84895.401167][   T23] ACPI: \_PR_.C002: Found 2 idle states
<6>[84895.402215][T109369] CPU2 is up
<6>[84895.402796][T109369] smpboot: Booting Node 0 Processor 3 APIC 0x3
<6>[84895.403280][    T0] microcode: CPU3: patch_level=0x07030105
<6>[84895.405852][   T29] ACPI: \_PR_.C003: Found 2 idle states
<6>[84895.407150][T109369] CPU3 is up
<6>[84895.407941][T109369] ACPI: PM: Waking up from system sleep state S3
<6>[84895.409821][T109369] ACPI: EC: interrupt unblocked
<6>[84895.442475][T109369] ACPI: EC: event unblocked
<5>[84895.449605][T109486] sd 0:0:0:0: [sda] Starting disk
<6>[84895.457940][T109459] [drm] PCIE GART of 2048M enabled (table at 0x000000000030E000).
<6>[84895.458120][T109459] radeon 0000:00:01.0: WB enabled
<6>[84895.458140][T109459] radeon 0000:00:01.0: fence driver on ring 0 use gpu addr 0x0000000040000c00
<6>[84895.458145][T109459] radeon 0000:00:01.0: fence driver on ring 1 use gpu addr 0x0000000040000c04
<6>[84895.458149][T109459] radeon 0000:00:01.0: fence driver on ring 2 use gpu addr 0x0000000040000c08
<6>[84895.458153][T109459] radeon 0000:00:01.0: fence driver on ring 3 use gpu addr 0x0000000040000c0c
<6>[84895.458156][T109459] radeon 0000:00:01.0: fence driver on ring 4 use gpu addr 0x0000000040000c10
<6>[84895.458612][T109459] radeon 0000:00:01.0: fence driver on ring 5 use gpu addr 0x0000000000078d30
<6>[84895.458803][T109459] radeon 0000:00:01.0: fence driver on ring 6 use gpu addr 0x0000000040000c18
<6>[84895.458807][T109459] radeon 0000:00:01.0: fence driver on ring 7 use gpu addr 0x0000000040000c1c
<3>[84895.459278][T109459] debugfs: File 'radeon_ring_gfx' in directory '0' already present!
<3>[84895.459286][T109459] debugfs: File 'radeon_ring_cp1' in directory '0' already present!
<3>[84895.459290][T109459] debugfs: File 'radeon_ring_cp2' in directory '0' already present!
<3>[84895.459294][T109459] debugfs: File 'radeon_ring_dma1' in directory '0' already present!
<3>[84895.459299][T109459] debugfs: File 'radeon_ring_dma2' in directory '0' already present!
<6>[84895.461171][T109459] [drm] ring test on 0 succeeded in 2 usecs
<6>[84895.461240][T109459] [drm] ring test on 1 succeeded in 2 usecs
<6>[84895.461252][T109459] [drm] ring test on 2 succeeded in 2 usecs
<6>[84895.461478][T109459] [drm] ring test on 3 succeeded in 3 usecs
<6>[84895.461487][T109459] [drm] ring test on 4 succeeded in 3 usecs
<3>[84895.461491][T109459] debugfs: File 'radeon_ring_uvd' in directory '0' already present!
<6>[84895.487680][T109459] [drm] ring test on 5 succeeded in 1 usecs
<6>[84895.487700][T109459] [drm] UVD initialized successfully.
<3>[84895.487704][T109459] debugfs: File 'radeon_ring_vce1' in directory '0' already present!
<3>[84895.487710][T109459] debugfs: File 'radeon_ring_vce2' in directory '0' already present!
<6>[84895.597746][T109459] [drm] ring test on 6 succeeded in 7 usecs
<6>[84895.597756][T109459] [drm] ring test on 7 succeeded in 2 usecs
<6>[84895.597758][T109459] [drm] VCE initialized successfully.
<6>[84895.597913][T109459] [drm] ib test on ring 0 succeeded in 0 usecs
<6>[84895.598059][T109459] [drm] ib test on ring 1 succeeded in 0 usecs
<6>[84895.598202][T109459] [drm] ib test on ring 2 succeeded in 0 usecs
<6>[84895.598345][T109459] [drm] ib test on ring 3 succeeded in 0 usecs
<6>[84895.598487][T109459] [drm] ib test on ring 4 succeeded in 0 usecs
<6>[84895.680986][T109476] r8169 0000:01:00.1 enp1s0f1: Link is Down
<6>[84895.731092][T109451] usb 1-1.3: reset full-speed USB device number 4 using ehci-pci
<6>[84895.921040][  T226] ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
<6>[84895.923268][  T226] ata2.00: configured for UDMA/133
<6>[84895.924258][T109474] usb 1-1.4: reset high-speed USB device number 5 using ehci-pci
<6>[84896.101150][T109459] [drm] ib test on ring 5 succeeded
<6>[84896.101739][T109459] [drm] ib test on ring 6 succeeded
<6>[84896.102209][T109459] [drm] ib test on ring 7 succeeded
<6>[84897.502201][T109369] OOM killer enabled.
<6>[84897.503125][T109369] Restarting tasks ... done.
<6>[84897.614246][T109369] PM: suspend exit
<6>[84897.726601][  T646] Bluetooth: hci0: BCM: chip id 70
<6>[84897.727597][  T646] Bluetooth: hci0: BCM: features 0x06
<6>[84897.743644][  T646] Bluetooth: hci0: BCM43142A
<6>[84897.743662][  T646] Bluetooth: hci0: BCM43142A0 (001.001.011) build 0000
<3>[84897.744709][  T646] Bluetooth: hci0: BCM: firmware Patch file not found, tried:
<3>[84897.744717][  T646] Bluetooth: hci0: BCM: 'brcm/BCM43142A0-04ca-2012.hcd'
<3>[84897.744721][  T646] Bluetooth: hci0: BCM: 'brcm/BCM-04ca-2012.hcd'
<6>[84897.854266][  T221] ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
<6>[84897.855637][  T221] ata1.00: configured for UDMA/133
<3>[84899.914224][T109468] Bluetooth: hci0: command 0x1003 tx timeout
<3>[84899.915320][ T6855] Bluetooth: hci0: unexpected event for opcode 0x1003
<6>[89859.506002][    C1] perf: interrupt took too long (3157 > 3130), lowering kernel.perf_event_max_sample_rate to 63300
<6>[90756.311979][  T892] IPv6: ADDRCONF(NETDEV_CHANGE): wlp2s0: link becomes ready
<6>[93514.074186][T11423] traps: Chrome_IOThread[11423] trap invalid opcode ip:55f50f74f914 sp:7f79aeaadbd0 error:0 in Discord[55f50d2b6000+5cbf000]
<4>[128750.304266][    C0] ------------[ cut here ]------------
<4>[128750.304283][    C0] cfs_rq->avg.load_avg || cfs_rq->avg.util_avg || cfs_rq->avg.runnable_avg
<4>[128750.304290][    C0] WARNING: CPU: 0 PID: 12 at kernel/sched/fair.c:3306 update_blocked_averages+0x941/0x9a0
<4>[128750.304304][    C0] Modules linked in: rfcomm xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp nft_compat nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nft_counter nf_tables nfnetlink bridge stp llc bfq cmac algif_hash algif_skcipher af_alg bnep dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio btusb snd_hda_codec_hdmi btrtl snd_hda_intel uvcvideo btbcm btintel snd_intel_dspcfg snd_intel_sdw_acpi videobuf2_vmalloc bluetooth snd_hda_codec videobuf2_memops snd_hda_core videobuf2_v4l2 snd_hwdep videobuf2_common snd_pcm edac_mce_amd videodev snd_seq_midi snd_seq_midi_event snd_rawmidi kvm_amd ecdh_generic mc ecc kvm snd_seq wl(OE) acer_wmi snd_seq_device sparse_keymap snd_timer cfg80211 input_leds snd soundcore wmi_bmof serio_raw ccp k10temp mac_hid fam15h_power sch_fq_codel msr ip_tables x_tables autofs4 btrfs blake2b_generic raid10 raid456 async_raid6_recov async_memcpy async_pq
<4>[128750.304427][    C0]  async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear amdgpu iommu_v2 gpu_sched radeon i2c_algo_bit drm_ttm_helper ttm drm_kms_helper hid_generic syscopyarea rtsx_pci_sdmmc sysfillrect sysimgblt crct10dif_pclmul fb_sys_fops cec crc32_pclmul rc_core ghash_clmulni_intel usbhid aesni_intel sdhci_pci crypto_simd cqhci xhci_pci r8169 psmouse drm xhci_pci_renesas ahci cryptd realtek sdhci rtsx_pci libahci hid i2c_piix4 wmi video
<4>[128750.304498][    C0] CPU: 0 PID: 12 Comm: ksoftirqd/0 Tainted: G           OE     5.13.0-icetea001-12377-gf55966571d5e #3
<4>[128750.304505][    C0] Hardware name: Acer Aspire ES1-421/OLVIA_BE, BIOS V1.05 07/02/2015
<4>[128750.304509][    C0] RIP: 0010:update_blocked_averages+0x941/0x9a0
<4>[128750.304517][    C0] Code: 00 e9 a7 fe ff ff e8 9e 22 c2 00 e9 4b f9 ff ff 0f 0b e9 da fe ff ff 48 c7 c7 88 c1 5c 82 c6 05 07 f6 ae 01 01 e8 d3 50 bc 00 <0f> 0b 41 8b 84 24 78 01 00 00 e9 f8 fa ff ff 48 c7 c7 88 bb 5c 82
<4>[128750.304522][    C0] RSP: 0018:ffffc900001a7de0 EFLAGS: 00010082
<4>[128750.304527][    C0] RAX: 0000000000000000 RBX: ffff888104ec6980 RCX: 0000000000000027
<4>[128750.304531][    C0] RDX: ffff888313c18e28 RSI: 0000000000000001 RDI: ffff888313c18e20
<4>[128750.304535][    C0] RBP: ffffc900001a7e58 R08: ffffffff82962048 R09: 00000000ffffdfff
<4>[128750.304538][    C0] R10: ffffffff82882060 R11: ffffffff82882060 R12: ffff888104ec6800
<4>[128750.304542][    C0] R13: 0000000000000000 R14: 0000735d623b8c53 R15: ffff888103830200
<4>[128750.304546][    C0] FS:  0000000000000000(0000) GS:ffff888313c00000(0000) knlGS:0000000000000000
<4>[128750.304550][    C0] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4>[128750.304554][    C0] CR2: 00007ff1f8c91000 CR3: 000000017c42a000 CR4: 00000000000406f0
<4>[128750.304558][    C0] Call Trace:
<4>[128750.304570][    C0]  run_rebalance_domains+0x53/0x80
<4>[128750.304577][    C0]  __do_softirq+0xd2/0x472
<4>[128750.304586][    C0]  run_ksoftirqd+0x3f/0x60
<4>[128750.304593][    C0]  smpboot_thread_fn+0xc2/0x170
<4>[128750.304598][    C0]  ? smpboot_register_percpu_thread+0xe0/0xe0
<4>[128750.304604][    C0]  kthread+0x138/0x160
<4>[128750.304611][    C0]  ? set_kthread_struct+0x50/0x50
<4>[128750.304617][    C0]  ret_from_fork+0x1f/0x30
<4>[128750.304628][    C0] irq event stamp: 43203642
<4>[128750.304631][    C0] hardirqs last  enabled at (43203641): [<ffffffff810a6004>] run_ksoftirqd+0x44/0x60
<4>[128750.304638][    C0] hardirqs last disabled at (43203642): [<ffffffff81d1e8af>] __schedule+0xfcf/0x17d0
<4>[128750.304644][    C0] softirqs last  enabled at (43203640): [<ffffffff810a5fff>] run_ksoftirqd+0x3f/0x60
<4>[128750.304650][    C0] softirqs last disabled at (43203625): [<ffffffff810a5fff>] run_ksoftirqd+0x3f/0x60
<4>[128750.304655][    C0] ---[ end trace 74d3894cf8cf6ef8 ]---
<6>[128761.670654][  T892] IPv6: ADDRCONF(NETDEV_CHANGE): wlp2s0: link becomes ready

[-- Attachment #4: proc_cpuinfo.gz --]
[-- Type: application/gzip, Size: 845 bytes --]

[-- Attachment #5: proc_modules.gz --]
[-- Type: application/gzip, Size: 1649 bytes --]

* Re: WARNING: CPU: 0 PID: 12 at kernel/sched/fair.c:3306 update_blocked_averages+0x941/0x9a0
@ 2021-08-02  8:42 ` Dietmar Eggemann
From: Dietmar Eggemann @ 2021-08-02  8:42 UTC
  To: Ammar Faizi, Ingo Molnar, Peter Zijlstra, Juri Lelli,
	Vincent Guittot, Steven Rostedt, Ben Segall, Mel Gorman,
	Daniel Bristot de Oliveira
  Cc: linux-kernel, Ammar Faizi

Hi Ammar,

On 30/07/2021 17:21, Ammar Faizi wrote:
> Hi everyone,
> 
> I compiled Linux 5.13.0 and am running it on Ubuntu. I got a kernel warning
> at kernel/sched/fair.c:3306.
> 
> Below is the system information
> Kernel: 5.13.0-icetea001-12377-gf55966571d5e

So you're running with:

9e077b52d86a - sched/pelt: Check that *_avg are null when *_sum are
(2021-06-17 Vincent Guittot)

but not with:

ceb6ba45dc80 - sched/fair: Sync load_sum with load_avg after dequeue
(2021-07-02 Vincent Guittot)

The SCHED_WARN_ON you're hitting is harmless and just tells you that the
PELT load_avg and load_sum of one of your cfs_rq's are not in sync.
It has to be load (and not util or runnable) since load is the only one
still not fixed in f55966571d5e.

This should go away once you have applied ceb6ba45dc80.

-- Dietmar
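
For readers unfamiliar with the check, the condition behind the warning can be modelled in userspace roughly as follows. This is only a sketch under stated assumptions: `struct pelt_model` and both helper names are made up for illustration, and the idea that the check applies only once all three `*_sum` fields are zero is inferred from the title of 9e077b52d86a and from the expression printed in the warning, not copied from kernel/sched/fair.c.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical userspace stand-in for the PELT fields named in the
 * warning text. This is NOT the kernel's struct sched_avg.
 */
struct pelt_model {
	uint64_t load_sum, runnable_sum, util_sum;
	unsigned long load_avg, runnable_avg, util_avg;
};

/* True when every *_sum in the model has decayed to zero. */
static inline bool sums_are_zero(const struct pelt_model *a)
{
	return !a->load_sum && !a->runnable_sum && !a->util_sum;
}

/*
 * True when the sums are all zero but at least one *_avg is not,
 * i.e. the expression printed in the warning evaluates to true.
 * Assumption: the real check is only reached once the sums are zero.
 */
static inline bool avg_sum_mismatch(const struct pelt_model *a)
{
	return sums_are_zero(a) &&
	       (a->load_avg || a->util_avg || a->runnable_avg);
}
```

In this model, a structure with `load_sum == 0` and `load_avg == 3` makes `avg_sum_mismatch()` return true, which corresponds to the out-of-sync load case described above. A structure whose sums are still nonzero never reports a mismatch.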


* Re: WARNING: CPU: 0 PID: 12 at kernel/sched/fair.c:3306 update_blocked_averages+0x941/0x9a0
@ 2021-08-02 12:52   ` Ammar Faizi
From: Ammar Faizi @ 2021-08-02 12:52 UTC
  To: Dietmar Eggemann, Ingo Molnar, Peter Zijlstra, Juri Lelli,
	Vincent Guittot, Steven Rostedt, Ben Segall, Mel Gorman,
	Daniel Bristot de Oliveira
  Cc: linux-kernel, Ammar Faizi

On 8/2/21 3:42 PM, Dietmar Eggemann wrote:
> So you're running with:
>
> 9e077b52d86a - sched/pelt: Check that *_avg are null when *_sum are
> (2021-06-17 Vincent Guittot)
>
> but not with:
>
> ceb6ba45dc80 - sched/fair: Sync load_sum with load_avg after dequeue
> (2021-07-02 Vincent Guittot)
>
> The SCHED_WARN_ON you're hitting is harmless and just tells you that the
> PELT load_avg and load_sum of one of your cfs_rq's are not in sync.
> It has to be load (and not util or runnable) since load is the only one
> still not fixed in f55966571d5e.
>
> This should go away once you have applied ceb6ba45dc80.

Alright, I have just moved to 5.14-rc4 and it doesn't seem to have this
issue anymore.

Thanks for the response, Dietmar.

  Ammar

