* [PATCH] sched/deadline: Update rq_clock of later_rq when pushing a task
@ 2018-07-13 14:09 Daniel Bristot de Oliveira
2018-07-14 13:37 ` [PATCH V2] " Daniel Bristot de Oliveira
` (2 more replies)
0 siblings, 3 replies; 6+ messages in thread
From: Daniel Bristot de Oliveira @ 2018-07-13 14:09 UTC (permalink / raw)
To: linux-kernel
Cc: Juri Lelli, Luca Abeni, Tommaso Cucinotta, Steven Rostedt,
Peter Zijlstra, Ingo Molnar, stable
Daniel Casini got this warn while running a DL task here at RetisLab:
[ 461.137582] ------------[ cut here ]------------
[ 461.137583] rq->clock_update_flags < RQCF_ACT_SKIP
[ 461.137599] WARNING: CPU: 4 PID: 2354 at kernel/sched/sched.h:967 assert_clock_updated.isra.32.part.33+0x17/0x20
[a ton of modules]
[ 461.137646] CPU: 4 PID: 2354 Comm: label_image Not tainted 4.18.0-rc4+ #3
[ 461.137647] Hardware name: ASUS All Series/Z87-K, BIOS 0801 09/02/2013
[ 461.137649] RIP: 0010:assert_clock_updated.isra.32.part.33+0x17/0x20
[ 461.137649] Code: ff 48 89 83 08 09 00 00 eb c6 66 0f 1f 84 00 00 00 00 00 55 48 c7 c7 98 7a 6c a5 c6 05 bc 0d 54 01 01 48 89 e5 e8 a9 84 fb ff <0f> 0b 5d c3 0f 1f 44 00 00 0f 1f 44 00 00 83 7e 60 01 74 0a 48 3b
[ 461.137673] RSP: 0018:ffffa77e08cafc68 EFLAGS: 00010082
[ 461.137674] RAX: 0000000000000000 RBX: ffff8b3fc1702d80 RCX: 0000000000000006
[ 461.137674] RDX: 0000000000000007 RSI: 0000000000000096 RDI: ffff8b3fded164b0
[ 461.137675] RBP: ffffa77e08cafc68 R08: 0000000000000026 R09: 0000000000000339
[ 461.137676] R10: ffff8b3fd060d410 R11: 0000000000000026 R12: ffffffffa4e14e20
[ 461.137677] R13: ffff8b3fdec22940 R14: ffff8b3fc1702da0 R15: ffff8b3fdec22940
[ 461.137678] FS: 00007efe43ee5700(0000) GS:ffff8b3fded00000(0000) knlGS:0000000000000000
[ 461.137679] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 461.137680] CR2: 00007efe30000010 CR3: 0000000301744003 CR4: 00000000001606e0
[ 461.137680] Call Trace:
[ 461.137684] push_dl_task.part.46+0x3bc/0x460
[ 461.137686] task_woken_dl+0x60/0x80
[ 461.137689] ttwu_do_wakeup+0x4f/0x150
[ 461.137690] ttwu_do_activate+0x77/0x80
[ 461.137692] try_to_wake_up+0x1d6/0x4c0
[ 461.137693] wake_up_q+0x32/0x70
[ 461.137696] do_futex+0x7e7/0xb50
[ 461.137698] __x64_sys_futex+0x8b/0x180
[ 461.137701] do_syscall_64+0x5a/0x110
[ 461.137703] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 461.137705] RIP: 0033:0x7efe4918ca26
[ 461.137705] Code: 00 00 00 74 17 49 8b 48 20 44 8b 59 10 41 83 e3 30 41 83 fb 20 74 1e be 85 00 00 00 41 ba 01 00 00 00 41 b9 01 00 00 04 0f 05 <48> 3d 01 f0 ff ff 73 1f 31 c0 c3 be 8c 00 00 00 49 89 c8 4d 31 d2
[ 461.137738] RSP: 002b:00007efe43ee4928 EFLAGS: 00000283 ORIG_RAX: 00000000000000ca
[ 461.137739] RAX: ffffffffffffffda RBX: 0000000005094df0 RCX: 00007efe4918ca26
[ 461.137740] RDX: 0000000000000001 RSI: 0000000000000085 RDI: 0000000005094e24
[ 461.137741] RBP: 00007efe43ee49c0 R08: 0000000005094e20 R09: 0000000004000001
[ 461.137741] R10: 0000000000000001 R11: 0000000000000283 R12: 0000000000000000
[ 461.137742] R13: 0000000005094df8 R14: 0000000000000001 R15: 0000000000448a10
[ 461.137743] ---[ end trace 187df4cad2bf7649 ]---
This warning was caused in the push_dl_task(), because the rq_clock of
the later_rq was used by cpufreq_update_util() inside __add_running_bw(),
before its update at activate_task(). The fix then is to update the
clock before calling add_running_bw.
To avoid double rq_clock_update(), we set ENQUEUE_NOCLOCK flag to
activate_task().
Reported-by: Daniel Casini <daniel.casini@santannapisa.it>
Signed-off-by: Daniel Bristot de Oliveira <bristot@redhat.com>
Cc: Juri Lelli <juri.lelli@arm.com>
Cc: Luca Abeni <luca.abeni@santannapisa.it>
Cc: Tommaso Cucinotta <tommaso.cucinotta@sssup.it>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: linux-kernel@vger.kernel.org
Cc: <stable@vger.kernel.org> # 4.16+
Fixes: e0367b12674b sched/deadline: Move CPU frequency selection triggering points
---
kernel/sched/deadline.c | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)
diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index fbfc3f1d368a..e733c15b7695 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -2090,8 +2090,16 @@ static int push_dl_task(struct rq *rq)
sub_rq_bw(&next_task->dl, &rq->dl);
set_task_cpu(next_task, later_rq->cpu);
add_rq_bw(&next_task->dl, &later_rq->dl);
+
+ /*
+ * Update the later_rq clock here, because the clock is used
+ * by the cpufreq_update_util() inside __add_running_bw().
+ * Then, set ENQUEUE_NOCLOCK flag to avoid updating the rq_clock
+ * again in the activate_task()->enqueue_task().
+ */
+ update_rq_clock(later_rq);
add_running_bw(&next_task->dl, &later_rq->dl);
- activate_task(later_rq, next_task, 0);
+ activate_task(later_rq, next_task, ENQUEUE_NOCLOCK);
ret = 1;
resched_curr(later_rq);
--
2.17.1
^ permalink raw reply related [flat|nested] 6+ messages in thread
* [PATCH V2] sched/deadline: Update rq_clock of later_rq when pushing a task
2018-07-13 14:09 [PATCH] sched/deadline: Update rq_clock of later_rq when pushing a task Daniel Bristot de Oliveira
@ 2018-07-14 13:37 ` Daniel Bristot de Oliveira
2018-07-16 9:37 ` Juri Lelli
2018-07-16 10:45 ` Kirill Tkhai
2 siblings, 0 replies; 6+ messages in thread
From: Daniel Bristot de Oliveira @ 2018-07-14 13:37 UTC (permalink / raw)
To: linux-kernel
Cc: Juri Lelli, Luca Abeni, Tommaso Cucinotta, Steven Rostedt,
Peter Zijlstra, Ingo Molnar, stable
Daniel Casini got this warn while running a DL task here at RetisLab:
[ 461.137582] ------------[ cut here ]------------
[ 461.137583] rq->clock_update_flags < RQCF_ACT_SKIP
[ 461.137599] WARNING: CPU: 4 PID: 2354 at kernel/sched/sched.h:967 assert_clock_updated.isra.32.part.33+0x17/0x20
[a ton of modules]
[ 461.137646] CPU: 4 PID: 2354 Comm: label_image Not tainted 4.18.0-rc4+ #3
[ 461.137647] Hardware name: ASUS All Series/Z87-K, BIOS 0801 09/02/2013
[ 461.137649] RIP: 0010:assert_clock_updated.isra.32.part.33+0x17/0x20
[ 461.137649] Code: ff 48 89 83 08 09 00 00 eb c6 66 0f 1f 84 00 00 00 00 00 55 48 c7 c7 98 7a 6c a5 c6 05 bc 0d 54 01 01 48 89 e5 e8 a9 84 fb ff <0f> 0b 5d c3 0f 1f 44 00 00 0f 1f 44 00 00 83 7e 60 01 74 0a 48 3b
[ 461.137673] RSP: 0018:ffffa77e08cafc68 EFLAGS: 00010082
[ 461.137674] RAX: 0000000000000000 RBX: ffff8b3fc1702d80 RCX: 0000000000000006
[ 461.137674] RDX: 0000000000000007 RSI: 0000000000000096 RDI: ffff8b3fded164b0
[ 461.137675] RBP: ffffa77e08cafc68 R08: 0000000000000026 R09: 0000000000000339
[ 461.137676] R10: ffff8b3fd060d410 R11: 0000000000000026 R12: ffffffffa4e14e20
[ 461.137677] R13: ffff8b3fdec22940 R14: ffff8b3fc1702da0 R15: ffff8b3fdec22940
[ 461.137678] FS: 00007efe43ee5700(0000) GS:ffff8b3fded00000(0000) knlGS:0000000000000000
[ 461.137679] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 461.137680] CR2: 00007efe30000010 CR3: 0000000301744003 CR4: 00000000001606e0
[ 461.137680] Call Trace:
[ 461.137684] push_dl_task.part.46+0x3bc/0x460
[ 461.137686] task_woken_dl+0x60/0x80
[ 461.137689] ttwu_do_wakeup+0x4f/0x150
[ 461.137690] ttwu_do_activate+0x77/0x80
[ 461.137692] try_to_wake_up+0x1d6/0x4c0
[ 461.137693] wake_up_q+0x32/0x70
[ 461.137696] do_futex+0x7e7/0xb50
[ 461.137698] __x64_sys_futex+0x8b/0x180
[ 461.137701] do_syscall_64+0x5a/0x110
[ 461.137703] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 461.137705] RIP: 0033:0x7efe4918ca26
[ 461.137705] Code: 00 00 00 74 17 49 8b 48 20 44 8b 59 10 41 83 e3 30 41 83 fb 20 74 1e be 85 00 00 00 41 ba 01 00 00 00 41 b9 01 00 00 04 0f 05 <48> 3d 01 f0 ff ff 73 1f 31 c0 c3 be 8c 00 00 00 49 89 c8 4d 31 d2
[ 461.137738] RSP: 002b:00007efe43ee4928 EFLAGS: 00000283 ORIG_RAX: 00000000000000ca
[ 461.137739] RAX: ffffffffffffffda RBX: 0000000005094df0 RCX: 00007efe4918ca26
[ 461.137740] RDX: 0000000000000001 RSI: 0000000000000085 RDI: 0000000005094e24
[ 461.137741] RBP: 00007efe43ee49c0 R08: 0000000005094e20 R09: 0000000004000001
[ 461.137741] R10: 0000000000000001 R11: 0000000000000283 R12: 0000000000000000
[ 461.137742] R13: 0000000005094df8 R14: 0000000000000001 R15: 0000000000448a10
[ 461.137743] ---[ end trace 187df4cad2bf7649 ]---
This warning happened in the push_dl_task(), because
__add_running_bw()->cpufreq_update_util() is getting the rq_clock of
the later_rq before its update, which takes place at activate_task().
The fix then is to update the rq_clock before calling add_running_bw().
To avoid double rq_clock_update() call, we set ENQUEUE_NOCLOCK flag to
activate_task().
Changes from v1:
Cosmetic changes in the log, and correct Juri's email (Daniel).
Reported-by: Daniel Casini <daniel.casini@santannapisa.it>
Signed-off-by: Daniel Bristot de Oliveira <bristot@redhat.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Luca Abeni <luca.abeni@santannapisa.it>
Cc: Tommaso Cucinotta <tommaso.cucinotta@santannapisa.it>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: linux-kernel@vger.kernel.org
Cc: <stable@vger.kernel.org> # 4.16+
Fixes: e0367b12674b sched/deadline: Move CPU frequency selection triggering points
---
kernel/sched/deadline.c | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)
diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index fbfc3f1d368a..e733c15b7695 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -2090,8 +2090,16 @@ static int push_dl_task(struct rq *rq)
sub_rq_bw(&next_task->dl, &rq->dl);
set_task_cpu(next_task, later_rq->cpu);
add_rq_bw(&next_task->dl, &later_rq->dl);
+
+ /*
+ * Update the later_rq clock here, because the clock is used
+ * by the cpufreq_update_util() inside __add_running_bw().
+ * Then, set ENQUEUE_NOCLOCK flag to avoid updating the rq_clock
+ * again in the activate_task()->enqueue_task().
+ */
+ update_rq_clock(later_rq);
add_running_bw(&next_task->dl, &later_rq->dl);
- activate_task(later_rq, next_task, 0);
+ activate_task(later_rq, next_task, ENQUEUE_NOCLOCK);
ret = 1;
resched_curr(later_rq);
--
2.17.1
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH V2] sched/deadline: Update rq_clock of later_rq when pushing a task
2018-07-13 14:09 [PATCH] sched/deadline: Update rq_clock of later_rq when pushing a task Daniel Bristot de Oliveira
2018-07-14 13:37 ` [PATCH V2] " Daniel Bristot de Oliveira
@ 2018-07-16 9:37 ` Juri Lelli
2018-07-16 12:10 ` Daniel Bristot de Oliveira
2018-07-16 10:45 ` Kirill Tkhai
2 siblings, 1 reply; 6+ messages in thread
From: Juri Lelli @ 2018-07-16 9:37 UTC (permalink / raw)
To: Daniel Bristot de Oliveira
Cc: linux-kernel, Luca Abeni, Tommaso Cucinotta, Steven Rostedt,
Peter Zijlstra, Ingo Molnar, stable
Hi,
On 14/07/18 15:37, Daniel Bristot de Oliveira wrote:
> Daniel Casini got this warn while running a DL task here at RetisLab:
>
> [ 461.137582] ------------[ cut here ]------------
> [ 461.137583] rq->clock_update_flags < RQCF_ACT_SKIP
> [ 461.137599] WARNING: CPU: 4 PID: 2354 at kernel/sched/sched.h:967 assert_clock_updated.isra.32.part.33+0x17/0x20
> [a ton of modules]
> [ 461.137646] CPU: 4 PID: 2354 Comm: label_image Not tainted 4.18.0-rc4+ #3
> [ 461.137647] Hardware name: ASUS All Series/Z87-K, BIOS 0801 09/02/2013
> [ 461.137649] RIP: 0010:assert_clock_updated.isra.32.part.33+0x17/0x20
> [ 461.137649] Code: ff 48 89 83 08 09 00 00 eb c6 66 0f 1f 84 00 00 00 00 00 55 48 c7 c7 98 7a 6c a5 c6 05 bc 0d 54 01 01 48 89 e5 e8 a9 84 fb ff <0f> 0b 5d c3 0f 1f 44 00 00 0f 1f 44 00 00 83 7e 60 01 74 0a 48 3b
> [ 461.137673] RSP: 0018:ffffa77e08cafc68 EFLAGS: 00010082
> [ 461.137674] RAX: 0000000000000000 RBX: ffff8b3fc1702d80 RCX: 0000000000000006
> [ 461.137674] RDX: 0000000000000007 RSI: 0000000000000096 RDI: ffff8b3fded164b0
> [ 461.137675] RBP: ffffa77e08cafc68 R08: 0000000000000026 R09: 0000000000000339
> [ 461.137676] R10: ffff8b3fd060d410 R11: 0000000000000026 R12: ffffffffa4e14e20
> [ 461.137677] R13: ffff8b3fdec22940 R14: ffff8b3fc1702da0 R15: ffff8b3fdec22940
> [ 461.137678] FS: 00007efe43ee5700(0000) GS:ffff8b3fded00000(0000) knlGS:0000000000000000
> [ 461.137679] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 461.137680] CR2: 00007efe30000010 CR3: 0000000301744003 CR4: 00000000001606e0
> [ 461.137680] Call Trace:
> [ 461.137684] push_dl_task.part.46+0x3bc/0x460
> [ 461.137686] task_woken_dl+0x60/0x80
> [ 461.137689] ttwu_do_wakeup+0x4f/0x150
> [ 461.137690] ttwu_do_activate+0x77/0x80
> [ 461.137692] try_to_wake_up+0x1d6/0x4c0
> [ 461.137693] wake_up_q+0x32/0x70
> [ 461.137696] do_futex+0x7e7/0xb50
> [ 461.137698] __x64_sys_futex+0x8b/0x180
> [ 461.137701] do_syscall_64+0x5a/0x110
> [ 461.137703] entry_SYSCALL_64_after_hwframe+0x44/0xa9
> [ 461.137705] RIP: 0033:0x7efe4918ca26
> [ 461.137705] Code: 00 00 00 74 17 49 8b 48 20 44 8b 59 10 41 83 e3 30 41 83 fb 20 74 1e be 85 00 00 00 41 ba 01 00 00 00 41 b9 01 00 00 04 0f 05 <48> 3d 01 f0 ff ff 73 1f 31 c0 c3 be 8c 00 00 00 49 89 c8 4d 31 d2
> [ 461.137738] RSP: 002b:00007efe43ee4928 EFLAGS: 00000283 ORIG_RAX: 00000000000000ca
> [ 461.137739] RAX: ffffffffffffffda RBX: 0000000005094df0 RCX: 00007efe4918ca26
> [ 461.137740] RDX: 0000000000000001 RSI: 0000000000000085 RDI: 0000000005094e24
> [ 461.137741] RBP: 00007efe43ee49c0 R08: 0000000005094e20 R09: 0000000004000001
> [ 461.137741] R10: 0000000000000001 R11: 0000000000000283 R12: 0000000000000000
> [ 461.137742] R13: 0000000005094df8 R14: 0000000000000001 R15: 0000000000448a10
> [ 461.137743] ---[ end trace 187df4cad2bf7649 ]---
>
> This warning happened in the push_dl_task(), because
> __add_running_bw()->cpufreq_update_util() is getting the rq_clock of
> the later_rq before its update, which takes place at activate_task().
> The fix then is to update the rq_clock before calling add_running_bw().
I believe Steve hit the same problem last week:
https://lore.kernel.org/lkml/20180703120301.3667f418@gandalf.local.home/
> To avoid double rq_clock_update() call, we set ENQUEUE_NOCLOCK flag to
> activate_task().
I suggested almost the same, but missed the ENQUEUE_NOCLOCK bit (which I
think it's required).
> Changes from v1:
> Cosmetic changes in the log, and correct Juri's email (Daniel).
>
> Reported-by: Daniel Casini <daniel.casini@santannapisa.it>
> Signed-off-by: Daniel Bristot de Oliveira <bristot@redhat.com>
> Cc: Juri Lelli <juri.lelli@redhat.com>
> Cc: Luca Abeni <luca.abeni@santannapisa.it>
> Cc: Tommaso Cucinotta <tommaso.cucinotta@santannapisa.it>
> Cc: Steven Rostedt <rostedt@goodmis.org>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Ingo Molnar <mingo@kernel.org>
> Cc: linux-kernel@vger.kernel.org
> Cc: <stable@vger.kernel.org> # 4.16+
> Fixes: e0367b12674b sched/deadline: Move CPU frequency selection triggering points
> ---
> kernel/sched/deadline.c | 10 +++++++++-
> 1 file changed, 9 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
> index fbfc3f1d368a..e733c15b7695 100644
> --- a/kernel/sched/deadline.c
> +++ b/kernel/sched/deadline.c
> @@ -2090,8 +2090,16 @@ static int push_dl_task(struct rq *rq)
> sub_rq_bw(&next_task->dl, &rq->dl);
> set_task_cpu(next_task, later_rq->cpu);
> add_rq_bw(&next_task->dl, &later_rq->dl);
> +
> + /*
> + * Update the later_rq clock here, because the clock is used
> + * by the cpufreq_update_util() inside __add_running_bw().
> + * Then, set ENQUEUE_NOCLOCK flag to avoid updating the rq_clock
> + * again in the activate_task()->enqueue_task().
> + */
> + update_rq_clock(later_rq);
> add_running_bw(&next_task->dl, &later_rq->dl);
> - activate_task(later_rq, next_task, 0);
> + activate_task(later_rq, next_task, ENQUEUE_NOCLOCK);
> ret = 1;
Acked-by: Juri Lelli <juri.lelli@redhat.com>
Thanks!
- Juri
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH V2] sched/deadline: Update rq_clock of later_rq when pushing a task
2018-07-13 14:09 [PATCH] sched/deadline: Update rq_clock of later_rq when pushing a task Daniel Bristot de Oliveira
2018-07-14 13:37 ` [PATCH V2] " Daniel Bristot de Oliveira
2018-07-16 9:37 ` Juri Lelli
@ 2018-07-16 10:45 ` Kirill Tkhai
2018-08-02 2:45 ` Steven Rostedt
2 siblings, 1 reply; 6+ messages in thread
From: Kirill Tkhai @ 2018-07-16 10:45 UTC (permalink / raw)
To: Daniel Bristot de Oliveira, linux-kernel
Cc: Juri Lelli, Luca Abeni, Tommaso Cucinotta, Steven Rostedt,
Peter Zijlstra, Ingo Molnar, stable
On 14.07.2018 16:37, Daniel Bristot de Oliveira wrote:
> Daniel Casini got this warn while running a DL task here at RetisLab:
>
> [ 461.137582] ------------[ cut here ]------------
> [ 461.137583] rq->clock_update_flags < RQCF_ACT_SKIP
> [ 461.137599] WARNING: CPU: 4 PID: 2354 at kernel/sched/sched.h:967 assert_clock_updated.isra.32.part.33+0x17/0x20
> [a ton of modules]
> [ 461.137646] CPU: 4 PID: 2354 Comm: label_image Not tainted 4.18.0-rc4+ #3
> [ 461.137647] Hardware name: ASUS All Series/Z87-K, BIOS 0801 09/02/2013
> [ 461.137649] RIP: 0010:assert_clock_updated.isra.32.part.33+0x17/0x20
> [ 461.137649] Code: ff 48 89 83 08 09 00 00 eb c6 66 0f 1f 84 00 00 00 00 00 55 48 c7 c7 98 7a 6c a5 c6 05 bc 0d 54 01 01 48 89 e5 e8 a9 84 fb ff <0f> 0b 5d c3 0f 1f 44 00 00 0f 1f 44 00 00 83 7e 60 01 74 0a 48 3b
> [ 461.137673] RSP: 0018:ffffa77e08cafc68 EFLAGS: 00010082
> [ 461.137674] RAX: 0000000000000000 RBX: ffff8b3fc1702d80 RCX: 0000000000000006
> [ 461.137674] RDX: 0000000000000007 RSI: 0000000000000096 RDI: ffff8b3fded164b0
> [ 461.137675] RBP: ffffa77e08cafc68 R08: 0000000000000026 R09: 0000000000000339
> [ 461.137676] R10: ffff8b3fd060d410 R11: 0000000000000026 R12: ffffffffa4e14e20
> [ 461.137677] R13: ffff8b3fdec22940 R14: ffff8b3fc1702da0 R15: ffff8b3fdec22940
> [ 461.137678] FS: 00007efe43ee5700(0000) GS:ffff8b3fded00000(0000) knlGS:0000000000000000
> [ 461.137679] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 461.137680] CR2: 00007efe30000010 CR3: 0000000301744003 CR4: 00000000001606e0
> [ 461.137680] Call Trace:
> [ 461.137684] push_dl_task.part.46+0x3bc/0x460
> [ 461.137686] task_woken_dl+0x60/0x80
> [ 461.137689] ttwu_do_wakeup+0x4f/0x150
> [ 461.137690] ttwu_do_activate+0x77/0x80
> [ 461.137692] try_to_wake_up+0x1d6/0x4c0
> [ 461.137693] wake_up_q+0x32/0x70
> [ 461.137696] do_futex+0x7e7/0xb50
> [ 461.137698] __x64_sys_futex+0x8b/0x180
> [ 461.137701] do_syscall_64+0x5a/0x110
> [ 461.137703] entry_SYSCALL_64_after_hwframe+0x44/0xa9
> [ 461.137705] RIP: 0033:0x7efe4918ca26
> [ 461.137705] Code: 00 00 00 74 17 49 8b 48 20 44 8b 59 10 41 83 e3 30 41 83 fb 20 74 1e be 85 00 00 00 41 ba 01 00 00 00 41 b9 01 00 00 04 0f 05 <48> 3d 01 f0 ff ff 73 1f 31 c0 c3 be 8c 00 00 00 49 89 c8 4d 31 d2
> [ 461.137738] RSP: 002b:00007efe43ee4928 EFLAGS: 00000283 ORIG_RAX: 00000000000000ca
> [ 461.137739] RAX: ffffffffffffffda RBX: 0000000005094df0 RCX: 00007efe4918ca26
> [ 461.137740] RDX: 0000000000000001 RSI: 0000000000000085 RDI: 0000000005094e24
> [ 461.137741] RBP: 00007efe43ee49c0 R08: 0000000005094e20 R09: 0000000004000001
> [ 461.137741] R10: 0000000000000001 R11: 0000000000000283 R12: 0000000000000000
> [ 461.137742] R13: 0000000005094df8 R14: 0000000000000001 R15: 0000000000448a10
> [ 461.137743] ---[ end trace 187df4cad2bf7649 ]---
>
> This warning happened in the push_dl_task(), because
> __add_running_bw()->cpufreq_update_util() is getting the rq_clock of
> the later_rq before its update, which takes place at activate_task().
> The fix then is to update the rq_clock before calling add_running_bw().
>
> To avoid double rq_clock_update() call, we set ENQUEUE_NOCLOCK flag to
> activate_task().
>
> Changes from v1:
> Cosmetic changes in the log, and correct Juri's email (Daniel).
>
> Reported-by: Daniel Casini <daniel.casini@santannapisa.it>
> Signed-off-by: Daniel Bristot de Oliveira <bristot@redhat.com>
> Cc: Juri Lelli <juri.lelli@redhat.com>
> Cc: Luca Abeni <luca.abeni@santannapisa.it>
> Cc: Tommaso Cucinotta <tommaso.cucinotta@santannapisa.it>
> Cc: Steven Rostedt <rostedt@goodmis.org>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Ingo Molnar <mingo@kernel.org>
> Cc: linux-kernel@vger.kernel.org
> Cc: <stable@vger.kernel.org> # 4.16+
> Fixes: e0367b12674b sched/deadline: Move CPU frequency selection triggering points
> ---
> kernel/sched/deadline.c | 10 +++++++++-
> 1 file changed, 9 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
> index fbfc3f1d368a..e733c15b7695 100644
> --- a/kernel/sched/deadline.c
> +++ b/kernel/sched/deadline.c
> @@ -2090,8 +2090,16 @@ static int push_dl_task(struct rq *rq)
> sub_rq_bw(&next_task->dl, &rq->dl);
> set_task_cpu(next_task, later_rq->cpu);
> add_rq_bw(&next_task->dl, &later_rq->dl);
> +
> + /*
> + * Update the later_rq clock here, because the clock is used
> + * by the cpufreq_update_util() inside __add_running_bw().
> + * Then, set ENQUEUE_NOCLOCK flag to avoid updating the rq_clock
> + * again in the activate_task()->enqueue_task().
Is the second sentence really needed? It seems everybody knows, what NOCLOCK
flag does, and we does not have to paraphrase this in every place it's used :)
> + */
> + update_rq_clock(later_rq);
> add_running_bw(&next_task->dl, &later_rq->dl);
> - activate_task(later_rq, next_task, 0);
> + activate_task(later_rq, next_task, ENQUEUE_NOCLOCK);
> ret = 1;
>
> resched_curr(later_rq);
>
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH V2] sched/deadline: Update rq_clock of later_rq when pushing a task
2018-07-16 9:37 ` Juri Lelli
@ 2018-07-16 12:10 ` Daniel Bristot de Oliveira
0 siblings, 0 replies; 6+ messages in thread
From: Daniel Bristot de Oliveira @ 2018-07-16 12:10 UTC (permalink / raw)
To: Juri Lelli
Cc: linux-kernel, Luca Abeni, Tommaso Cucinotta, Steven Rostedt,
Peter Zijlstra, Ingo Molnar, stable, Clark Williams
On 07/16/2018 11:37 AM, Juri Lelli wrote:
> Hi,
>
> On 14/07/18 15:37, Daniel Bristot de Oliveira wrote:
>> Daniel Casini got this warn while running a DL task here at RetisLab:
>>
>> [ 461.137582] ------------[ cut here ]------------
>> [ 461.137583] rq->clock_update_flags < RQCF_ACT_SKIP
>> [ 461.137599] WARNING: CPU: 4 PID: 2354 at kernel/sched/sched.h:967 assert_clock_updated.isra.32.part.33+0x17/0x20
>> [a ton of modules]
>> [ 461.137646] CPU: 4 PID: 2354 Comm: label_image Not tainted 4.18.0-rc4+ #3
>> [ 461.137647] Hardware name: ASUS All Series/Z87-K, BIOS 0801 09/02/2013
>> [ 461.137649] RIP: 0010:assert_clock_updated.isra.32.part.33+0x17/0x20
>> [ 461.137649] Code: ff 48 89 83 08 09 00 00 eb c6 66 0f 1f 84 00 00 00 00 00 55 48 c7 c7 98 7a 6c a5 c6 05 bc 0d 54 01 01 48 89 e5 e8 a9 84 fb ff <0f> 0b 5d c3 0f 1f 44 00 00 0f 1f 44 00 00 83 7e 60 01 74 0a 48 3b
>> [ 461.137673] RSP: 0018:ffffa77e08cafc68 EFLAGS: 00010082
>> [ 461.137674] RAX: 0000000000000000 RBX: ffff8b3fc1702d80 RCX: 0000000000000006
>> [ 461.137674] RDX: 0000000000000007 RSI: 0000000000000096 RDI: ffff8b3fded164b0
>> [ 461.137675] RBP: ffffa77e08cafc68 R08: 0000000000000026 R09: 0000000000000339
>> [ 461.137676] R10: ffff8b3fd060d410 R11: 0000000000000026 R12: ffffffffa4e14e20
>> [ 461.137677] R13: ffff8b3fdec22940 R14: ffff8b3fc1702da0 R15: ffff8b3fdec22940
>> [ 461.137678] FS: 00007efe43ee5700(0000) GS:ffff8b3fded00000(0000) knlGS:0000000000000000
>> [ 461.137679] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [ 461.137680] CR2: 00007efe30000010 CR3: 0000000301744003 CR4: 00000000001606e0
>> [ 461.137680] Call Trace:
>> [ 461.137684] push_dl_task.part.46+0x3bc/0x460
>> [ 461.137686] task_woken_dl+0x60/0x80
>> [ 461.137689] ttwu_do_wakeup+0x4f/0x150
>> [ 461.137690] ttwu_do_activate+0x77/0x80
>> [ 461.137692] try_to_wake_up+0x1d6/0x4c0
>> [ 461.137693] wake_up_q+0x32/0x70
>> [ 461.137696] do_futex+0x7e7/0xb50
>> [ 461.137698] __x64_sys_futex+0x8b/0x180
>> [ 461.137701] do_syscall_64+0x5a/0x110
>> [ 461.137703] entry_SYSCALL_64_after_hwframe+0x44/0xa9
>> [ 461.137705] RIP: 0033:0x7efe4918ca26
>> [ 461.137705] Code: 00 00 00 74 17 49 8b 48 20 44 8b 59 10 41 83 e3 30 41 83 fb 20 74 1e be 85 00 00 00 41 ba 01 00 00 00 41 b9 01 00 00 04 0f 05 <48> 3d 01 f0 ff ff 73 1f 31 c0 c3 be 8c 00 00 00 49 89 c8 4d 31 d2
>> [ 461.137738] RSP: 002b:00007efe43ee4928 EFLAGS: 00000283 ORIG_RAX: 00000000000000ca
>> [ 461.137739] RAX: ffffffffffffffda RBX: 0000000005094df0 RCX: 00007efe4918ca26
>> [ 461.137740] RDX: 0000000000000001 RSI: 0000000000000085 RDI: 0000000005094e24
>> [ 461.137741] RBP: 00007efe43ee49c0 R08: 0000000005094e20 R09: 0000000004000001
>> [ 461.137741] R10: 0000000000000001 R11: 0000000000000283 R12: 0000000000000000
>> [ 461.137742] R13: 0000000005094df8 R14: 0000000000000001 R15: 0000000000448a10
>> [ 461.137743] ---[ end trace 187df4cad2bf7649 ]---
>>
>> This warning happened in the push_dl_task(), because
>> __add_running_bw()->cpufreq_update_util() is getting the rq_clock of
>> the later_rq before its update, which takes place at activate_task().
>> The fix then is to update the rq_clock before calling add_running_bw().
>
> I believe Steve hit the same problem last week:
>
> https://lore.kernel.org/lkml/20180703120301.3667f418@gandalf.local.home/
Yeap, this is the same problem!
>> To avoid double rq_clock_update() call, we set ENQUEUE_NOCLOCK flag to
>> activate_task().
>
> I suggested almost the same, but missed the ENQUEUE_NOCLOCK bit (which I
> think it's required).
>
>> Changes from v1:
>> Cosmetic changes in the log, and correct Juri's email (Daniel).
>>
>> Reported-by: Daniel Casini <daniel.casini@santannapisa.it>
>> Signed-off-by: Daniel Bristot de Oliveira <bristot@redhat.com>
>> Cc: Juri Lelli <juri.lelli@redhat.com>
>> Cc: Luca Abeni <luca.abeni@santannapisa.it>
>> Cc: Tommaso Cucinotta <tommaso.cucinotta@santannapisa.it>
>> Cc: Steven Rostedt <rostedt@goodmis.org>
>> Cc: Peter Zijlstra <peterz@infradead.org>
>> Cc: Ingo Molnar <mingo@kernel.org>
>> Cc: linux-kernel@vger.kernel.org
>> Cc: <stable@vger.kernel.org> # 4.16+
>> Fixes: e0367b12674b sched/deadline: Move CPU frequency selection triggering points
>> ---
>> kernel/sched/deadline.c | 10 +++++++++-
>> 1 file changed, 9 insertions(+), 1 deletion(-)
>>
>> diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
>> index fbfc3f1d368a..e733c15b7695 100644
>> --- a/kernel/sched/deadline.c
>> +++ b/kernel/sched/deadline.c
>> @@ -2090,8 +2090,16 @@ static int push_dl_task(struct rq *rq)
>> sub_rq_bw(&next_task->dl, &rq->dl);
>> set_task_cpu(next_task, later_rq->cpu);
>> add_rq_bw(&next_task->dl, &later_rq->dl);
>> +
>> + /*
>> + * Update the later_rq clock here, because the clock is used
>> + * by the cpufreq_update_util() inside __add_running_bw().
>> + * Then, set ENQUEUE_NOCLOCK flag to avoid updating the rq_clock
>> + * again in the activate_task()->enqueue_task().
>> + */
>> + update_rq_clock(later_rq);
>> add_running_bw(&next_task->dl, &later_rq->dl);
>> - activate_task(later_rq, next_task, 0);
>> + activate_task(later_rq, next_task, ENQUEUE_NOCLOCK);
>> ret = 1;
>
> Acked-by: Juri Lelli <juri.lelli@redhat.com>
Thanks!
-- Daniel
> Thanks!
>
> - Juri
>
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH V2] sched/deadline: Update rq_clock of later_rq when pushing a task
2018-07-16 10:45 ` Kirill Tkhai
@ 2018-08-02 2:45 ` Steven Rostedt
0 siblings, 0 replies; 6+ messages in thread
From: Steven Rostedt @ 2018-08-02 2:45 UTC (permalink / raw)
To: Kirill Tkhai
Cc: Daniel Bristot de Oliveira, linux-kernel, Juri Lelli, Luca Abeni,
Tommaso Cucinotta, Peter Zijlstra, Ingo Molnar, stable
On Mon, 16 Jul 2018 13:45:21 +0300
Kirill Tkhai <ktkhai@virtuozzo.com> wrote:
> > --- a/kernel/sched/deadline.c
> > +++ b/kernel/sched/deadline.c
> > @@ -2090,8 +2090,16 @@ static int push_dl_task(struct rq *rq)
> > sub_rq_bw(&next_task->dl, &rq->dl);
> > set_task_cpu(next_task, later_rq->cpu);
> > add_rq_bw(&next_task->dl, &later_rq->dl);
> > +
> > + /*
> > + * Update the later_rq clock here, because the clock is used
> > + * by the cpufreq_update_util() inside __add_running_bw().
> > + * Then, set ENQUEUE_NOCLOCK flag to avoid updating the rq_clock
> > + * again in the activate_task()->enqueue_task().
>
> Is the second sentence really needed? It seems everybody knows, what NOCLOCK
> flag does, and we does not have to paraphrase this in every place it's used :)
I would keep the mention, but change the comment:
/*
* Update the later_rq clock before calling add_running_bw()
* because the clock is used by cpufreq_update_util() that is
* inside __add_running_bw(). As the later_rq clock is already
* updated, we need to set ENQUEUE_NOCLOCK to prevent
* activate_task() from updating it again.
*/
Other than that...
Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
-- Steve
>
> > + */
> > + update_rq_clock(later_rq);
> > add_running_bw(&next_task->dl, &later_rq->dl);
> > - activate_task(later_rq, next_task, 0);
> > + activate_task(later_rq, next_task, ENQUEUE_NOCLOCK);
> > ret = 1;
> >
> > resched_curr(later_rq);
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2018-08-02 2:46 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-07-13 14:09 [PATCH] sched/deadline: Update rq_clock of later_rq when pushing a task Daniel Bristot de Oliveira
2018-07-14 13:37 ` [PATCH V2] " Daniel Bristot de Oliveira
2018-07-16 9:37 ` Juri Lelli
2018-07-16 12:10 ` Daniel Bristot de Oliveira
2018-07-16 10:45 ` Kirill Tkhai
2018-08-02 2:45 ` Steven Rostedt
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).