* WARNING: CPU: 112 PID: 2041 at kernel/sched/sched.h:1453
From: Bruno Goncalves @ 2021-07-28 13:11 UTC
To: CKI Project, linux-kernel; +Cc: nathan, Memory Management

Hello,

Since this commit (Commit: 45312bd762d3 - Merge tag 'zonefs-5.14-rc2')
we started to see the following call trace; it seems to be
reproducible only on aarch64.

[ 384.485614] ------------[ cut here ]------------
[ 384.490227] rq->clock_update_flags < RQCF_ACT_SKIP
[ 384.490232] WARNING: CPU: 112 PID: 2041 at kernel/sched/sched.h:1453 sub_running_bw.isra.0+0x190/0x1a0
[ 384.504312] Modules linked in: mlx5_ib ib_uverbs ib_core rfkill sunrpc acpi_ipmi ipmi_ssif mlx5_core mlxfw psample ipmi_devintf arm_cmn ipmi_msghandler arm_dsu_pmu cppc_cpufreq acpi_tad vfat fat fuse zram ip_tables x_tables xfs crct10dif_ce ghash_ce ast i2c_algo_bit drm_vram_helper sbsa_gwdt drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops cec drm_ttm_helper ttm nvme nvme_core drm xgene_hwmon aes_neon_bs
[ 384.541165] CPU: 112 PID: 2041 Comm: sugov:112 Tainted: G W 5.14.0-rc1 #1
[ 384.549244] Hardware name: WIWYNN Mt.Jade Server System B81.030Z1.0007/Mt.Jade Motherboard, BIOS 1.6.20210526 (SCP: 1.06.20210526) 2021/05/26
[ 384.561922] pstate: 404000c9 (nZcv daIF +PAN -UAO -TCO BTYPE=--)
[ 384.567918] pc : sub_running_bw.isra.0+0x190/0x1a0
[ 384.572698] lr : sub_running_bw.isra.0+0x190/0x1a0
[ 384.577477] sp : ffff800024c4bb20
[ 384.580779] x29: ffff800024c4bb20 x28: 0000000000000000 x27: ffffb9a9bbe1d200
[ 384.587904] x26: 0000000000000074 x25: 0000000000000011 x24: ffffb9a9bdff9000
[ 384.595029] x23: ffff07ffb36fcaa0 x22: ffff401ee09b65c0 x21: ffffb9a9bbe1de00
[ 384.602153] x20: ffff401ee09a3360 x19: ffff401ee09b6f58 x18: 0000000000000000
[ 384.609277] x17: ffff867522f0c000 x16: ffff800010384000 x15: 0000000000000030
[ 384.616401] x14: 0000000000000000 x13: 50494b535f544341 x12: 5f46435152203c20
[ 384.623526] x11: ffff401ee04b0ea8 x10: ffff401ee021e068 x9 : ffffb9a9bbe4214c
[ 384.630650] x8 : 0000000000010ea8 x7 : ffff401ee01e0000 x6 : 0000000000017ffd
[ 384.637774] x5 : ffff401ee09a3490 x4 : 0000000000000001 x3 : ffff867522f0c000
[ 384.644898] x2 : ffff401ee09a3498 x1 : ffff07ffb53cc000 x0 : 0000000000000026
[ 384.652022] Call trace:
[ 384.654457] sub_running_bw.isra.0+0x190/0x1a0
[ 384.658890] migrate_task_rq_dl+0xf8/0x1e0
[ 384.662975] set_task_cpu+0xa8/0x1f0
[ 384.666540] try_to_wake_up+0x150/0x3d4
[ 384.670365] wake_up_q+0x64/0xc0
[ 384.673582] __up_write+0xd0/0x1c0
[ 384.676974] up_write+0x4c/0x2b0
[ 384.680191] cppc_set_perf+0x120/0x2d0
[ 384.683931] cppc_cpufreq_set_target+0xe0/0x1a4 [cppc_cpufreq]
[ 384.689756] __cpufreq_driver_target+0x74/0x140
[ 384.694277] sugov_work+0x64/0x80
[ 384.697580] kthread_worker_fn+0xe0/0x230
[ 384.701580] kthread+0x138/0x140
[ 384.704797] ret_from_fork+0x10/0x18

More logs can be found in the dmesg logs at:
https://arr-cki-prod-datawarehouse-public.s3.amazonaws.com/index.html?prefix=datawarehouse-public/2021/07/16/338525814/build_aarch64_redhat%3A1431434591/tests/storage_software_RAID_testing/
https://arr-cki-prod-datawarehouse-public.s3.amazonaws.com/index.html?prefix=datawarehouse-public/2021/07/16/338525814/build_aarch64_redhat%3A1431434591/tests/xfstests_btrfs/
https://arr-cki-prod-datawarehouse-public.s3.amazonaws.com/index.html?prefix=datawarehouse-public/2021/07/16/338525814/build_aarch64_redhat%3A1431434591/tests/xfstests_ext4/
https://arr-cki-prod-datawarehouse-public.s3.amazonaws.com/index.html?prefix=datawarehouse-public/2021/07/16/338525814/build_aarch64_redhat%3A1431434591/tests/xfstests_xfs/

Thank you,
Bruno Goncalves
* Re: WARNING: CPU: 112 PID: 2041 at kernel/sched/sched.h:1453
From: Dietmar Eggemann @ 2021-07-28 15:55 UTC
To: Bruno Goncalves, CKI Project, linux-kernel; +Cc: nathan, Memory Management

On 28/07/2021 15:11, Bruno Goncalves wrote:
> Hello,
>
> Since this commit (Commit: 45312bd762d3 - Merge tag 'zonefs-5.14-rc2')
> we started to see the following call trace, it seems to be
> reproducible only on aarch64.

It should happen on platforms using a slow-switching cpufreq driver.

Only in this case do you have n (depending on the number of frequency
domains) special-purpose DL threads when using the schedutil CPUFreq
governor:

root@juno: ps -eTo comm,pid,pri,class | grep sugov

sugov:0 132 140 DLN
sugov:1 134 140 DLN

> [ 384.485614] ------------[ cut here ]------------
> [ 384.490227] rq->clock_update_flags < RQCF_ACT_SKIP
> [ 384.490232] WARNING: CPU: 112 PID: 2041 at
> kernel/sched/sched.h:1453 sub_running_bw.isra.0+0x190/0x1a0
> [...]
> [ 384.541165] CPU: 112 PID: 2041 Comm: sugov:112 Tainted: G W
> 5.14.0-rc1 #1
> [...]
> [ 384.652022] Call trace:
> [ 384.654457] sub_running_bw.isra.0+0x190/0x1a0
> [ 384.658890] migrate_task_rq_dl+0xf8/0x1e0
> [ 384.662975] set_task_cpu+0xa8/0x1f0
> [ 384.666540] try_to_wake_up+0x150/0x3d4
> [ 384.670365] wake_up_q+0x64/0xc0
> [ 384.673582] __up_write+0xd0/0x1c0
> [ 384.676974] up_write+0x4c/0x2b0
> [ 384.680191] cppc_set_perf+0x120/0x2d0
> [ 384.683931] cppc_cpufreq_set_target+0xe0/0x1a4 [cppc_cpufreq]
> [ 384.689756] __cpufreq_driver_target+0x74/0x140
> [ 384.694277] sugov_work+0x64/0x80
> [ 384.697580] kthread_worker_fn+0xe0/0x230
> [ 384.701580] kthread+0x138/0x140
> [ 384.704797] ret_from_fork+0x10/0x18

Don't quite get this.
`sugov:112` should be a special DL entity (dl_se->flags &
SCHED_FLAG_SUGOV), so sub_running_bw() should not call __sub_running_bw()
and hence there won't be a call to cpufreq_update_util(), which calls
rq_clock(rq) -> assert_clock_updated()?

Can't reproduce it on my Juno (arm64) (slow-switching (scpi-cpufreq
driver)).
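The special-entity check described in this reply can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the kernel source: the `toy_*` types and functions are invented for illustration, the `SCHED_FLAG_SUGOV` value is assumed, and the counter stands in for the call to cpufreq_update_util().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed flag value; check the kernel headers before relying on it. */
#define SCHED_FLAG_SUGOV 0x10000000ULL

/* Hypothetical stand-ins for the kernel structures (illustration only). */
struct toy_dl_entity {
	uint64_t flags;  /* scheduling flags of the DL entity */
	uint64_t dl_bw;  /* bandwidth accounted for this entity */
};

struct toy_dl_rq {
	uint64_t running_bw;  /* sum of active bandwidth on this runqueue */
	int freq_updates;     /* counts simulated cpufreq_update_util() calls */
};

/* A sugov kthread is "special": it is excluded from bandwidth accounting. */
static bool toy_dl_entity_is_special(const struct toy_dl_entity *se)
{
	return (se->flags & SCHED_FLAG_SUGOV) != 0;
}

/* Model of sub_running_bw(): special entities return before any accounting. */
static void toy_sub_running_bw(const struct toy_dl_entity *se,
			       struct toy_dl_rq *rq)
{
	if (toy_dl_entity_is_special(se))
		return;

	assert(rq->running_bw >= se->dl_bw);  /* no underflow in this toy model */
	rq->running_bw -= se->dl_bw;
	rq->freq_updates++;  /* this is where the clock would be read */
}
```

In this model, an entity with `SCHED_FLAG_SUGOV` set leaves `running_bw` and `freq_updates` untouched, while any other DL entity reaches the accounting path. That matches the reasoning above: a warning on this path implies that the migrating task is not one of the `sugov:N` threads.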
* Re: WARNING: CPU: 112 PID: 2041 at kernel/sched/sched.h:1453
From: Bruno Goncalves @ 2021-07-29 12:36 UTC
To: Dietmar Eggemann
Cc: CKI Project, linux-kernel, nathan, Memory Management, linux-arm-kernel

On Wed, Jul 28, 2021 at 5:55 PM Dietmar Eggemann
<dietmar.eggemann@arm.com> wrote:
>
> On 28/07/2021 15:11, Bruno Goncalves wrote:

[...]

> Can't reproduce it on my Juno (arm64) (slow-switching (scpi-cpufreq
> driver)).

We seem to be able to reproduce this only on Ampere Altra machines,
specifically on mtjade and mtsnow cpus.

# cpupower frequency-info
analyzing CPU 0:
  driver: cppc_cpufreq
  CPUs which run at the same hardware frequency: 0
  CPUs which need to have their frequency coordinated by software: 0
  maximum transition latency: Cannot determine or is not supported.
  hardware limits: 1000 MHz - 2.80 GHz
  available cpufreq governors: conservative ondemand userspace powersave performance schedutil
  current policy: frequency should be within 2.00 GHz and 2.80 GHz.
                  The governor "schedutil" may decide which speed to use
                  within this range.
  current CPU frequency: 1.55 GHz (asserted by call to hardware)

# ps -eTo comm,pid,pri,class | grep sugov
sugov:0 1082 140 DLN
sugov:1 1085 140 DLN
...
sugov:78 1319 140 DLN
sugov:79 1320 140 DLN

Bruno
* Re: WARNING: CPU: 112 PID: 2041 at kernel/sched/sched.h:1453
From: Dietmar Eggemann @ 2021-07-29 14:38 UTC
To: Bruno Goncalves
Cc: CKI Project, linux-kernel, nathan, Memory Management, linux-arm-kernel

On 29/07/2021 14:36, Bruno Goncalves wrote:
> On Wed, Jul 28, 2021 at 5:55 PM Dietmar Eggemann
> <dietmar.eggemann@arm.com> wrote:
>>
>> On 28/07/2021 15:11, Bruno Goncalves wrote:

[...]

>> Can't reproduce it on my Juno (arm64) (slow-switching (scpi-cpufreq
>> driver)).
>
> We seem to be able to reproduce this only on Ampere Altra machines,
> specifically on mtjade and mtsnow cpus.
>
> # cpupower frequency-info
> analyzing CPU 0:
> driver: cppc_cpufreq
> CPUs which run at the same hardware frequency: 0
> CPUs which need to have their frequency coordinated by software: 0
> maximum transition latency: Cannot determine or is not supported.
> hardware limits: 1000 MHz - 2.80 GHz
> available cpufreq governors: conservative ondemand userspace
> powersave performance schedutil
> current policy: frequency should be within 2.00 GHz and 2.80 GHz.
> The governor "schedutil" may decide which speed to use
> within this range.
> current CPU frequency: 1.55 GHz (asserted by call to hardware)
>
> # ps -eTo comm,pid,pri,class | grep sugov
> sugov:0 1082 140 DLN
> sugov:1 1085 140 DLN
> ...
> sugov:78 1319 140 DLN
> sugov:79 1320 140 DLN

Thanks! In the meantime I got access to an Ampere Altra so I can try
5.14.0-rc1 later today.
* Re: WARNING: CPU: 112 PID: 2041 at kernel/sched/sched.h:1453
From: Dietmar Eggemann @ 2021-07-30 12:22 UTC
To: Bruno Goncalves
Cc: CKI Project, linux-kernel, nathan, Memory Management, linux-arm-kernel

On 29/07/2021 16:38, Dietmar Eggemann wrote:
> On 29/07/2021 14:36, Bruno Goncalves wrote:
>> On Wed, Jul 28, 2021 at 5:55 PM Dietmar Eggemann
>> <dietmar.eggemann@arm.com> wrote:
>>>
>>> On 28/07/2021 15:11, Bruno Goncalves wrote:
>
> [...]
>
>>> Can't reproduce it on my Juno (arm64) (slow-switching (scpi-cpufreq
>>> driver)).
>>
>> We seem to be able to reproduce this only on Ampere Altra machines,
>> specifically on mtjade and mtsnow cpus.
>
> [...]
>
> Thanks! In the meantime I got access to an Ampere Altra so I can try
> 5.14.0-rc1 later today.

The task causing this seems to be the new `cppc_fie` DL task introduced
by commit 1eb5dde674f5 "cpufreq: CPPC: Add support for frequency
invariance" in v5.14-rc1.

With `CONFIG_ACPI_CPPC_CPUFREQ_FIE=y` and the schedutil cpufreq governor
on a slow-switching system:

DL task curr=`sugov:X` makes p=`cppc_fie` migrate, and since p is in the
`non_contending` state, migrate_task_rq_dl() calls

  sub_running_bw()->__sub_running_bw()->cpufreq_update_util()->
  rq_clock()->assert_clock_updated()

on p.

Can you try this snippet? It should fix it.

--8<--

From: Dietmar Eggemann <dietmar.eggemann@arm.com>
Date: Fri, 30 Jul 2021 14:03:40 +0200
Subject: [PATCH] sched/deadline: Fix missing clock update in
 migrate_task_rq_dl()

Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
---
 kernel/sched/deadline.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index aaacd6cfd42f..4920f498492f 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -1733,6 +1733,7 @@ static void migrate_task_rq_dl(struct task_struct *p, int new_cpu __maybe_unused
 	 */
 	raw_spin_rq_lock(rq);
 	if (p->dl.dl_non_contending) {
+		update_rq_clock(rq);
 		sub_running_bw(&p->dl, &rq->dl);
 		p->dl.dl_non_contending = 0;
 		/*
-- 
2.25.1
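Why a single update_rq_clock() call silences the warning can be seen in a simplified user-space model of the runqueue clock flags. This is a sketch under assumptions, not kernel code: the `toy_*` names are invented, the flag values mirror the RQCF_* constants named in the warning text, and the real update_rq_clock() does more work (for example, it honours a pending skip request), which is left out here.

```c
#include <assert.h>
#include <stdio.h>

/* Flag values assumed to mirror the kernel's RQCF_* constants. */
#define RQCF_REQ_SKIP 0x01
#define RQCF_ACT_SKIP 0x02
#define RQCF_UPDATED  0x04

/* Hypothetical, stripped-down runqueue (illustration only). */
struct toy_rq {
	unsigned int clock_update_flags;
	unsigned long long clock;
	int warnings;  /* counts simulated WARN splats */
};

/* Pinning the lock forgets any earlier clock update. */
static void toy_rq_pin_lock(struct toy_rq *rq)
{
	rq->clock_update_flags &= (RQCF_REQ_SKIP | RQCF_ACT_SKIP);
}

/* Simplified update: advance the clock and record that it is fresh. */
static void toy_update_rq_clock(struct toy_rq *rq)
{
	rq->clock++;
	rq->clock_update_flags |= RQCF_UPDATED;
	/* Post-condition: a later toy_rq_clock() will not warn. */
	assert(rq->clock_update_flags >= RQCF_ACT_SKIP);
}

/* Reading the clock warns if it was neither updated nor being skipped. */
static unsigned long long toy_rq_clock(struct toy_rq *rq)
{
	if (rq->clock_update_flags < RQCF_ACT_SKIP) {
		rq->warnings++;
		fprintf(stderr, "rq->clock_update_flags < RQCF_ACT_SKIP\n");
	}
	return rq->clock;
}

/* Model of the non_contending migration path; returns the warning count. */
static int toy_migrate_non_contending(int with_fix)
{
	struct toy_rq rq = { 0, 0, 0 };

	toy_rq_pin_lock(&rq);              /* stale state: no update recorded */
	if (with_fix)
		toy_update_rq_clock(&rq);  /* the one-line fix from the patch */
	(void)toy_rq_clock(&rq);           /* reached via cpufreq_update_util() */
	return rq.warnings;
}
```

Calling `toy_migrate_non_contending(0)` models the reported path and counts one warning; `toy_migrate_non_contending(1)` models the patched path and counts none.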
* Re: WARNING: CPU: 112 PID: 2041 at kernel/sched/sched.h:1453
From: Bruno Goncalves @ 2021-07-30 15:23 UTC
To: Dietmar Eggemann
Cc: CKI Project, linux-kernel, nathan, Memory Management, linux-arm-kernel

On Fri, Jul 30, 2021 at 2:22 PM Dietmar Eggemann
<dietmar.eggemann@arm.com> wrote:

[...]

> The task causing this seem to be the new `cppc_fie` DL task introduced
> by commit 1eb5dde674f5 "cpufreq: CPPC: Add support for frequency
> invariance" in v5.14-rc1.
>
> With `CONFIG_ACPI_CPPC_CPUFREQ_FIE=y` and schedutil cpufreq governor on
> slow-switching system:
>
> DL task curr=`sugov:X` makes p=`cppc_fie` migrate and since it is in
> `non_contending` state, migrate_task_rq_dl() calls
>
> sub_running_bw()->__sub_running_bw()->cpufreq_update_util()->
> rq_clock()->assert_clock_updated()
>
> on p.
>
> Can you try this snippet? It should fix it.

Thank you, I've tried the patch and it fixes the issue.

Bruno

[...]
* Re: WARNING: CPU: 112 PID: 2041 at kernel/sched/sched.h:1453
From: Dietmar Eggemann @ 2021-08-02 8:43 UTC
To: Bruno Goncalves
Cc: CKI Project, linux-kernel, nathan, Memory Management, linux-arm-kernel

On 30/07/2021 17:23, Bruno Goncalves wrote:
> On Fri, Jul 30, 2021 at 2:22 PM Dietmar Eggemann
> <dietmar.eggemann@arm.com> wrote:
>>
>> On 29/07/2021 16:38, Dietmar Eggemann wrote:
>>> On 29/07/2021 14:36, Bruno Goncalves wrote:
>>>> On Wed, Jul 28, 2021 at 5:55 PM Dietmar Eggemann
>>>> <dietmar.eggemann@arm.com> wrote:
>>>>>
>>>>> On 28/07/2021 15:11, Bruno Goncalves wrote:

[...]

>> The task causing this seem to be the new `cppc_fie` DL task introduced
>> by commit 1eb5dde674f5 "cpufreq: CPPC: Add support for frequency
>> invariance" in v5.14-rc1.
>>
>> With `CONFIG_ACPI_CPPC_CPUFREQ_FIE=y` and schedutil cpufreq governor on
>> slow-switching system:
>>
>> DL task curr=`sugov:X` makes p=`cppc_fie` migrate and since it is in
>> `non_contending` state, migrate_task_rq_dl() calls
>>
>> sub_running_bw()->__sub_running_bw()->cpufreq_update_util()->
>> rq_clock()->assert_clock_updated()
>>
>> on p.
>>
>> Can you try this snippet? It should fix it.
>
> Thank you, I've tried the patch and it fixes the issue.

Thanks for testing! Let me send out a proper patch then.