> On 24-Jun-2021, at 4:48 PM, Odin Ugedal wrote: > > The _sum and _avg values are in general sync together with the PELT > divider. They are however not always completely in perfect sync, > resulting in situations where _sum gets to zero while _avg stays > positive. Such situations are undesirable. > > This comes from the fact that PELT will increase period_contrib, also > increasing the PELT divider, without updating _sum and _avg values to > stay in perfect sync where (_sum == _avg * divider). However, such PELT > change will never lower _sum, making it impossible to end up in a > situation where _sum is zero and _avg is not. > > Therefore, we need to ensure that when subtracting load outside PELT, > that when _sum is zero, _avg is also set to zero. This occurs when > (_sum < _avg * divider), and the subtracted (_avg * divider) is bigger > or equal to the current _sum, while the subtracted _avg is smaller than > the current _avg. > > Reported-by: Sachin Sant > Reported-by: Naresh Kamboju > Signed-off-by: Odin Ugedal Hello Odin, Vincent, The issue of kernel warning(during boot) seen on Power sever, reported few days back is not completely fixed. I am able to recreate this issue with latest 5.13 kernel(which has this fix) # git log --oneline kernel/sched/fair.c a6eaf3850cb1 Merge tag 'sched-urgent-2021-06-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 54a728dc5e4f Merge tag 'sched-core-2021-06-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 1c35b07e6d39 sched/fair: Ensure _sum and _avg values stay consistent <<== [ 65.407631] ------------[ cut here ]------------ [ 65.407656] cfs_rq->avg.load_avg || cfs_rq->avg.util_avg || cfs_rq->avg.runnable_avg [ 65.407666] WARNING: CPU: 18 PID: 6642 at kernel/sched/fair.c:3308 update_blocked_averages+0x748/0x7a0 [ 65.407693] Modules linked in: dm_mod bonding nft_ct nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set rfkill nf_tables nfnetlink sunrpc xfs libcrc32c pseries_rng xts uio_pdrv_genirq vmx_crypto uio sch_fq_codel ip_tables ext4 mbcache jbd2 sd_mod t10_pi sg ibmvscsi ibmveth scsi_transport_srp fuse [ 65.407776] CPU: 18 PID: 6642 Comm: gcc Not tainted 5.13.0-05357-gdbe69e433722 #1 [ 65.407789] NIP: c0000000001b2e48 LR: c0000000001b2e44 CTR: c00000000072cac0 [ 65.407798] REGS: c00000003f2cf660 TRAP: 0700 Not tainted (5.13.0-05357-gdbe69e433722) [ 65.407809] MSR: 8000000000029033 CR: 48088224 XER: 00000005 [ 65.407837] CFAR: c00000000014d520 IRQMASK: 1 GPR00: c0000000001b2e44 c00000003f2cf900 c0000000029bc700 0000000000000048 GPR04: 00000000ffff7fff c00000003f2cf5c0 0000000000000027 c0000008baa07e18 GPR08: 0000000000000023 0000000000000001 0000000000000027 c000000002876888 GPR12: 0000000000008000 c00000001ec35a80 c0000008baa92580 0000000f3a993c3c GPR16: 0000000000000000 c0000008baa92580 c00000001a53b400 0000000000000001 GPR20: 0000000000000000 c0000000029fdfe0 0000000000000000 0000000000000012 GPR24: 0000000000000000 c0000008baa92f90 0000000000000001 c0000008baa92600 GPR28: 0000000000000012 0000000f3a993c3c c00000001a53b5c0 0000000000000001 [ 65.407984] NIP [c0000000001b2e48] update_blocked_averages+0x748/0x7a0 [ 65.407995] LR [c0000000001b2e44] update_blocked_averages+0x744/0x7a0 [ 65.408005] Call Trace: [ 65.408010] [c00000003f2cf900] [c0000000001b2e44] update_blocked_averages+0x744/0x7a0 (unreliable) [ 65.408025] [c00000003f2cfa20] [c0000000001bdd78] newidle_balance+0x258/0x5c0 [ 65.408038] [c00000003f2cfab0] [c0000000001be1bc] pick_next_task_fair+0x7c/0x4c0 [ 65.408051] [c00000003f2cfb10] [c000000000cf223c] __schedule+0x15c/0x17b0 [ 65.408063] [c00000003f2cfc50] [c0000000001a6044] do_task_dead+0x64/0x70 [ 65.408076] [c00000003f2cfc80] [c0000000001567d8] do_exit+0x868/0xce0 [ 65.408089] [c00000003f2cfd50] [c000000000156d24] do_group_exit+0x64/0xe0 [ 65.408101] [c00000003f2cfd90] [c000000000156dc4] sys_exit_group+0x24/0x30 [ 65.408115] [c00000003f2cfdb0] [c0000000000307b0] system_call_exception+0x150/0x2d0 [ 65.408128] [c00000003f2cfe10] [c00000000000cc5c] system_call_common+0xec/0x278 [ 65.408141] --- interrupt: c00 at 0x7fffa163b80c [ 65.408150] NIP: 00007fffa163b80c LR: 00007fffa1597004 CTR: 0000000000000000 [ 65.408159] REGS: c00000003f2cfe80 TRAP: 0c00 Not tainted (5.13.0-05357-gdbe69e433722) [ 65.408168] MSR: 800000000280f033 CR: 28000222 XER: 00000000 [ 65.408199] IRQMASK: 0 GPR00: 00000000000000ea 00007fffdf5cd220 00007fffa1757300 0000000000000000 GPR04: 0000000000000000 00007fffdf5cd078 0000000000000000 0000000000000000 GPR08: 0000000000000001 0000000000000000 0000000000000000 0000000000000000 GPR12: 0000000000000000 00007fffa192a930 0000000000000000 0000000000000000 GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR20: 0000000000000000 00007fffa1754ee0 0000000000000004 0000000000000001 GPR24: 00007fffa17508a0 0000000000000000 0000000000000000 0000000000000001 GPR28: 0000000000000000 0000000000000000 00007fffa1923950 0000000000000000 [ 65.408308] NIP [00007fffa163b80c] 0x7fffa163b80c [ 65.408317] LR [00007fffa1597004] 0x7fffa1597004 [ 65.408324] --- interrupt: c00 [ 65.408331] Instruction dump: [ 65.408339] 2f890000 409efa3c 4bffa119 4bfffa34 60000000 60000000 e9210070 e8610088 [ 65.408361] 39400001 99490003 4bf9a679 60000000 <0fe00000> e95201ba 2faa0000 4bfffc50 [ 65.408383] ---[ end trace 14708e73bf91cf1c ]— Let me know if any additional information is required. I have attached the dmesg log captured after the failure. Thanks -Sachin > --- > > Reports and discussion can be found here: > > https://lore.kernel.org/lkml/2ED1BDF5-BC0C-47CD-8F33-9A46C738F8CF@linux.vnet.ibm.com/ > https://lore.kernel.org/lkml/CA+G9fYsMXELmjGUzw4SY1bghTYz_PeR2diM6dRp2J37bBZzMSA@mail.gmail.com/ > > kernel/sched/fair.c | 6 +++--- > 1 file changed, 3 insertions(+), 3 deletions(-) > > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c > index bfaa6e1f6067..def48bc2e90b 100644 > --- a/kernel/sched/fair.c > +++ b/kernel/sched/fair.c > @@ -3688,15 +3688,15 @@ update_cfs_rq_load_avg(u64 now, struct cfs_rq *cfs_rq) > > r = removed_load; > sub_positive(&sa->load_avg, r); > - sub_positive(&sa->load_sum, r * divider); > + sa->load_sum = sa->load_avg * divider; > > r = removed_util; > sub_positive(&sa->util_avg, r); > - sub_positive(&sa->util_sum, r * divider); > + sa->util_sum = sa->util_avg * divider; > > r = removed_runnable; > sub_positive(&sa->runnable_avg, r); > - sub_positive(&sa->runnable_sum, r * divider); > + sa->runnable_sum = sa->runnable_avg * divider; > > /* > * removed_runnable is the unweighted version of removed_load so we > -- > 2.32.0 >