linux-kernel.vger.kernel.org archive mirror
From: Sachin Sant <sachinp@linux.vnet.ibm.com>
To: Odin Ugedal <odin@uged.al>
Cc: Vincent Guittot <vincent.guittot@linaro.org>,
	open list <linux-kernel@vger.kernel.org>,
	linuxppc-dev@lists.ozlabs.org,
	Peter Zijlstra <peterz@infradead.org>
Subject: Re: [powerpc][5.13.0-rc7] Kernel warning (kernel/sched/fair.c:401) while running LTP tests
Date: Mon, 21 Jun 2021 16:27:36 +0530
Message-ID: <6D1F875D-58E9-4A55-B0C3-21D5F31EDB76@linux.vnet.ibm.com>
In-Reply-To: <CAFpoUr2o2PVPOx+AvatjjUvqPTyNKE3C6oXejyU3HVMmtCnzvQ@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 1475 bytes --]



> On 21-Jun-2021, at 3:24 PM, Odin Ugedal <odin@uged.al> wrote:
> 
> On Mon, 21 Jun 2021 at 11:50, Vincent Guittot <vincent.guittot@linaro.org> wrote:
>> This means that a child's load was not null and it was inserted
>> whereas parent's load was null. This should not happen unless the
>> propagation failed somewhere
> 
> My initial thought is that the patch below will fix it, if that is the
> issue (that a leaf is inserted, but the propagation is not "completed"
> in unthrottle). Might that be the case? Still working on reproducing
> the issue tho.
> 

Unfortunately this does not help. I can still recreate the failure.

I have attached the output from the test run.

Thanks
-Sachin
> 
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index bfaa6e1f6067..015c5a5c1a4d 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -4930,12 +4930,7 @@ void unthrottle_cfs_rq(struct cfs_rq *cfs_rq)
>                if (cfs_rq_throttled(cfs_rq))
>                        goto unthrottle_throttle;
> 
> -               /*
> -                * One parent has been throttled and cfs_rq removed from the
> -                * list. Add it back to not break the leaf list.
> -                */
> -               if (throttled_hierarchy(cfs_rq))
> -                       list_add_leaf_cfs_rq(cfs_rq);
> +               list_add_leaf_cfs_rq(cfs_rq);
>        }
> 
>        /* At this point se is NULL and we are at root level*/
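
For readers following the discussion, here is a minimal toy model of the invariant behind the warning. It is only a sketch under stated assumptions: the names toy_rq, toy_cfs_rq, toy_list_add_leaf and toy_leaf_list_consistent are hypothetical, the linked list is reduced to a single flag, and it reflects how list_add_leaf_cfs_rq() appears to behave in this kernel version rather than the actual kernel code. The flag stands in for the condition rq->tmp_alone_branch != &rq->leaf_cfs_rq_list, which stays true while a child has been added but its branch has not yet been connected to an ancestor already on the list.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct cfs_rq: only what the model needs. */
struct toy_cfs_rq {
	struct toy_cfs_rq *parent;	/* NULL for the root cfs_rq */
	bool on_list;			/* already on the leaf list? */
};

/* Hypothetical stand-in for struct rq. */
struct toy_rq {
	/* Models rq->tmp_alone_branch != &rq->leaf_cfs_rq_list. */
	bool branch_open;
};

/*
 * Add cfs_rq to the (modelled) leaf list.
 * Returns true when the branch is connected to the list, false when
 * ancestors still have to be added to close the branch.
 */
static bool toy_list_add_leaf(struct toy_rq *rq, struct toy_cfs_rq *cfs_rq)
{
	if (cfs_rq->on_list)
		return !rq->branch_open;

	cfs_rq->on_list = true;

	/* Root, or parent already listed: the branch is now connected. */
	if (cfs_rq->parent == NULL || cfs_rq->parent->on_list) {
		rq->branch_open = false;
		return true;
	}

	/* Parent not listed yet: the branch stays open until it is. */
	rq->branch_open = true;
	return false;
}

/* The condition the warning checks: no branch may be left open. */
static bool toy_leaf_list_consistent(const struct toy_rq *rq)
{
	return !rq->branch_open;
}
```

In this model, adding a child whose parent is not on the list leaves the branch open, and the check only passes again once every ancestor up to a listed one (or the root) has been added. That matches the scenario described above, where a leaf is inserted but the walk up the hierarchy does not complete.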

[-- Attachment #2: run.txt --]
[-- Type: text/plain, Size: 5966 bytes --]

# cd /opt/ltp/testcases/bin
# ./cfs_bandwidth01 -i 5
tst_test.c:1313: TINFO: Timeout per run is 0h 05m 00s
tst_buffers.c:55: TINFO: Test is using guarded buffers
cfs_bandwidth01.c:49: TINFO: Set 'worker1/cpu.max' = '3000 10000'
cfs_bandwidth01.c:49: TINFO: Set 'worker2/cpu.max' = '2000 10000'
cfs_bandwidth01.c:49: TINFO: Set 'worker3/cpu.max' = '3000 10000'
cfs_bandwidth01.c:113: TPASS: Scheduled bandwidth constrained workers
cfs_bandwidth01.c:49: TINFO: Set 'level2/cpu.max' = '5000 10000'
cfs_bandwidth01.c:125: TPASS: Workers exited
cfs_bandwidth01.c:113: TPASS: Scheduled bandwidth constrained workers
[   48.343143] ------------[ cut here ]------------
[   48.343164] rq->tmp_alone_branch != &rq->leaf_cfs_rq_list
[   48.343172] WARNING: CPU: 24 PID: 4405 at kernel/sched/fair.c:401 unthrottle_cfs_rq+0x49c/0x560
[   48.343196] Modules linked in: nf_tables nfnetlink tun bridge stp llc rfkill sunrpc pseries_rng xts vmx_crypto uio_pdrv_genirq uio sch_fq_codel ip_tables xfs libcrc32c sr_mod sd_mod cdrom t10_pi sg ibmvscsi ibmveth scsi_transport_srp dm_mirror dm_region_hash dm_log dm_mod fuse
[   48.343251] CPU: 24 PID: 4405 Comm: cfs_bandwidth01 Not tainted 5.13.0-rc7-dirty #4
[   48.343261] NIP:  c0000000001b88fc LR: c0000000001b88f8 CTR: c000000000723d10
[   48.343269] REGS: c00000000fb13780 TRAP: 0700   Not tainted  (5.13.0-rc7-dirty)
[   48.343278] MSR:  8000000000029033 <SF,EE,ME,IR,DR,RI,LE>  CR: 48044224  XER: 00000005
[   48.343295] CFAR: c00000000014d8a0 IRQMASK: 1 
[   48.343295] GPR00: c0000000001b88f8 c00000000fb13a20 c0000000029ab400 000000000000002d 
[   48.343295] GPR04: 00000000fffeffff c00000000fb136e0 0000000000000027 c00000154f817e08 
[   48.343295] GPR08: 0000000000000023 0000000000000001 0000000000000027 c00000167f1d7fe8 
[   48.343295] GPR12: 0000000000004000 c00000154ffdc680 0000000000000000 0000000000000000 
[   48.343295] GPR16: c000000000fa6660 0000000000000001 0000000000000000 c0000000024e1cd8 
[   48.343295] GPR20: 0000000000000000 c00000000290a69a 0000000000000000 c0000000024e1cc0 
[   48.343295] GPR24: 0000000000000000 c0000000029f2140 c00000154f762380 0000000000000001 
[   48.343295] GPR28: 0000000000000001 0000000000000000 c00000154f762400 0000000000000000 
[   48.343388] NIP [c0000000001b88fc] unthrottle_cfs_rq+0x49c/0x560
[   48.343397] LR [c0000000001b88f8] unthrottle_cfs_rq+0x498/0x560
[   48.343406] Call Trace:
[   48.343410] [c00000000fb13a20] [c0000000001b88f8] unthrottle_cfs_rq+0x498/0x560 (unreliable)
[   48.343422] [c00000000fb13ac0] [c00000000019edb8] tg_set_cfs_bandwidth+0x2c8/0x470
[   48.343433] [c00000000fb13bc0] [c000000000263874] cgroup_file_write+0x164/0x210
[   48.343444] [c00000000fb13c20] [c00000000058cfac] kernfs_fop_write_iter+0x1cc/0x280
[   48.343455] [c00000000fb13c70] [c00000000047024c] new_sync_write+0x14c/0x1d0
[   48.343467] [c00000000fb13d10] [c000000000473844] vfs_write+0x224/0x330
[   48.343476] [c00000000fb13d60] [c000000000473b2c] ksys_write+0x7c/0x140
[   48.343485] [c00000000fb13db0] [c000000000030fb0] system_call_exception+0x150/0x2d0
[   48.343495] [c00000000fb13e10] [c00000000000d45c] system_call_common+0xec/0x278
[   48.343504] --- interrupt: c00 at 0x7fffaa67bd74
[   48.343511] NIP:  00007fffaa67bd74 LR: 00007fffaa5f34c4 CTR: 0000000000000000
[   48.343519] REGS: c00000000fb13e80 TRAP: 0c00   Not tainted  (5.13.0-rc7-dirty)
[   48.343527] MSR:  800000000280f033 <SF,VEC,VSX,EE,PR,FP,ME,IR,DR,RI,LE>  CR: 28002282  XER: 00000000
[   48.343548] IRQMASK: 0 
[   48.343548] GPR00: 0000000000000004 00007fffcb534d60 00007fffaa777100 0000000000000010 
[   48.343548] GPR04: 00000000415623d0 0000000000000005 0000000000000010 00007fffcb534df8 
[   48.343548] GPR08: 0000000010028618 0000000000000000 0000000000000000 0000000000000000 
[   48.343548] GPR12: 0000000000000000 00007fffaa81a310 0000000000000000 0000000000000000 
[   48.343548] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 
[   48.343548] GPR20: 0000000000000000 0000000000000000 0000000000000002 0000000000000000 
[   48.343548] GPR24: 0000000000000000 000000000000002b 0000000000000005 00000000415623d0 
[   48.343548] GPR28: 0000000000000005 00007fffcb534eb0 00000000415623d0 0000000000000005 
[   48.343634] NIP [00007fffaa67bd74] 0x7fffaa67bd74
[   48.343640] LR [00007fffaa5f34c4] 0x7fffaa5f34c4
[   48.343646] --- interrupt: c00
[   48.343651] Instruction dump:
[   48.343656] 4bfffc74 3d22fff6 8929f2a9 2f890000 409efed4 39200001 3d42fff6 3c62fe60 
[   48.343672] 3863be08 992af2a9 4bf94f45 60000000 <0fe00000> 4bfffeb0 7f6407b4 7f43d378 
[   48.343687] ---[ end trace 61db91af8340603f ]---
cfs_bandwidth01.c:49: TINFO: Set 'level2/cpu.max' = '5000 10000'
cfs_bandwidth01.c:125: TPASS: Workers exited
cfs_bandwidth01.c:113: TPASS: Scheduled bandwidth constrained workers
cfs_bandwidth01.c:49: TINFO: Set 'level2/cpu.max' = '5000 10000'
cfs_bandwidth01.c:125: TPASS: Workers exited
cfs_bandwidth01.c:113: TPASS: Scheduled bandwidth constrained workers
cfs_bandwidth01.c:49: TINFO: Set 'level2/cpu.max' = '5000 10000'
cfs_bandwidth01.c:125: TPASS: Workers exited
cfs_bandwidth01.c:113: TPASS: Scheduled bandwidth constrained workers
cfs_bandwidth01.c:49: TINFO: Set 'level2/cpu.max' = '5000 10000'
cfs_bandwidth01.c:125: TPASS: Workers exited
tst_test.c:1349: TFAIL: Kernel is now tainted.

HINT: You _MAY_ be missing kernel fixes, see:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=39f23ce07b93
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=b34cb07dde7c
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=fe61468b2cbc
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=5ab297bab984
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=6d4d22468dae

Summary:
passed   10
failed   1
broken   0
skipped  0
warnings 0
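
As a reading aid for the 'cpu.max' settings in the log above: a cgroup v2 cpu.max value has the form "QUOTA PERIOD" in microseconds, or "max PERIOD" for no limit. The sketch below parses such a string and computes the share of one CPU it allows. The names cpu_max, parse_cpu_max and cpu_max_percent are hypothetical helpers for illustration and are not part of LTP or the kernel.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical holder for one parsed cpu.max value. */
struct cpu_max {
	long quota_us;	/* run time allowed per period; unused if unlimited */
	long period_us;	/* length of the enforcement period */
	bool unlimited;	/* true when the quota field was "max" */
};

/* Parse "QUOTA PERIOD" or "max PERIOD". Returns 0 on success, -1 on error. */
static int parse_cpu_max(const char *s, struct cpu_max *out)
{
	long quota, period;

	if (sscanf(s, "max %ld", &period) == 1 && period > 0) {
		out->quota_us = 0;
		out->period_us = period;
		out->unlimited = true;
		return 0;
	}
	if (sscanf(s, "%ld %ld", &quota, &period) == 2 &&
	    quota > 0 && period > 0) {
		out->quota_us = quota;
		out->period_us = period;
		out->unlimited = false;
		return 0;
	}
	return -1;
}

/* Share of one CPU in whole percent, or -1 when there is no limit. */
static long cpu_max_percent(const struct cpu_max *m)
{
	if (m->unlimited)
		return -1;
	return m->quota_us * 100 / m->period_us;
}
```

With these helpers, '3000 10000' works out to 30% of one CPU, '2000 10000' to 20%, and '5000 10000' to 50%.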
