linux-kernel.vger.kernel.org archive mirror
* [PATCH v2 0/2] Avoid obvious double update_rq_clock warning
@ 2022-04-22  9:09 Hao Jia
  2022-04-22  9:09 ` [PATCH v2 1/2] sched/core: " Hao Jia
  2022-04-22  9:09 ` [PATCH v2 2/2] sched/dl: Remove some comments and adjust code in push_dl_task Hao Jia
  0 siblings, 2 replies; 7+ messages in thread
From: Hao Jia @ 2022-04-22  9:09 UTC (permalink / raw)
  To: mingo, peterz, juri.lelli, vincent.guittot, dietmar.eggemann,
	rostedt, bsegall, mgorman, bristot
  Cc: linux-kernel, Hao Jia

These two patches are about the kernel scheduler:

patch 1: fixes an issue where the kernel may trigger the WARN_DOUBLE_CLOCK
warning.
patch 2: removes some no-longer-needed comments in the deadline scheduler
and cleans up the code.

v2:
   - Added a double_rq_clock_clear_update() inline helper to clear
     RQCF_UPDATED of rq->clock_update_flags.
   - Split into two separate patches.

[1] https://lore.kernel.org/lkml/20220418090929.54005-1-jiahao.os@bytedance.com/

Hao Jia (2):
  sched/core: Avoid obvious double update_rq_clock warning
  sched/dl: Remove some comments and adjust code in push_dl_task

 kernel/sched/core.c     |  6 +++---
 kernel/sched/deadline.c |  8 +-------
 kernel/sched/rt.c       |  5 +++--
 kernel/sched/sched.h    | 31 +++++++++++++++++++++++++++----
 4 files changed, 34 insertions(+), 16 deletions(-)

-- 
2.32.0


^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH v2 1/2] sched/core: Avoid obvious double update_rq_clock warning
  2022-04-22  9:09 [PATCH v2 0/2] Avoid obvious double update_rq_clock warning Hao Jia
@ 2022-04-22  9:09 ` Hao Jia
  2022-04-25 15:37   ` Dietmar Eggemann
  2022-04-22  9:09 ` [PATCH v2 2/2] sched/dl: Remove some comments and adjust code in push_dl_task Hao Jia
  1 sibling, 1 reply; 7+ messages in thread
From: Hao Jia @ 2022-04-22  9:09 UTC (permalink / raw)
  To: mingo, peterz, juri.lelli, vincent.guittot, dietmar.eggemann,
	rostedt, bsegall, mgorman, bristot
  Cc: linux-kernel, Hao Jia

When we use raw_spin_rq_lock() to acquire the rq lock and have to
update the rq clock while holding the lock, the kernel may issue
a WARN_DOUBLE_CLOCK warning.

Since we directly use raw_spin_rq_lock() to acquire the rq lock instead
of rq_lock(), there is no corresponding change to rq->clock_update_flags.
In particular, we may have obtained the lock of another core's rq, whose
rq->clock_update_flags may already contain RQCF_UPDATED; calling
update_rq_clock() on it then triggers the WARN_DOUBLE_CLOCK warning.

So we need to clear RQCF_UPDATED of rq->clock_update_flags synchronously
to avoid the WARN_DOUBLE_CLOCK warning.

Some call trace reports:
Call Trace 1:
 <IRQ>
 sched_rt_period_timer+0x10f/0x3a0
 ? enqueue_top_rt_rq+0x110/0x110
 __hrtimer_run_queues+0x1a9/0x490
 hrtimer_interrupt+0x10b/0x240
 __sysvec_apic_timer_interrupt+0x8a/0x250
 sysvec_apic_timer_interrupt+0x9a/0xd0
 </IRQ>
 <TASK>
 asm_sysvec_apic_timer_interrupt+0x12/0x20

Call Trace 2:
 <TASK>
 activate_task+0x8b/0x110
 push_rt_task.part.108+0x241/0x2c0
 push_rt_tasks+0x15/0x30
 finish_task_switch+0xaa/0x2e0
 ? __switch_to+0x134/0x420
 __schedule+0x343/0x8e0
 ? hrtimer_start_range_ns+0x101/0x340
 schedule+0x4e/0xb0
 do_nanosleep+0x8e/0x160
 hrtimer_nanosleep+0x89/0x120
 ? hrtimer_init_sleeper+0x90/0x90
 __x64_sys_nanosleep+0x96/0xd0
 do_syscall_64+0x34/0x90
 entry_SYSCALL_64_after_hwframe+0x44/0xae

Call Trace 3:
 <TASK>
 deactivate_task+0x93/0xe0
 pull_rt_task+0x33e/0x400
 balance_rt+0x7e/0x90
 __schedule+0x62f/0x8e0
 do_task_dead+0x3f/0x50
 do_exit+0x7b8/0xbb0
 do_group_exit+0x2d/0x90
 get_signal+0x9df/0x9e0
 ? preempt_count_add+0x56/0xa0
 ? __remove_hrtimer+0x35/0x70
 arch_do_signal_or_restart+0x36/0x720
 ? nanosleep_copyout+0x39/0x50
 ? do_nanosleep+0x131/0x160
 ? audit_filter_inodes+0xf5/0x120
 exit_to_user_mode_prepare+0x10f/0x1e0
 syscall_exit_to_user_mode+0x17/0x30
 do_syscall_64+0x40/0x90
 entry_SYSCALL_64_after_hwframe+0x44/0xae

Steps to reproduce:
1. Enable CONFIG_SCHED_DEBUG when compiling the kernel
2. echo 1 > /sys/kernel/debug/clear_warn_once
   echo "WARN_DOUBLE_CLOCK" > /sys/kernel/debug/sched_features
   echo "NO_RT_PUSH_IPI" > /sys/kernel/debug/sched_features
3. Run some rt tasks that periodically change the priority and sleep

Signed-off-by: Hao Jia <jiahao.os@bytedance.com>
Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
---
 kernel/sched/core.c  |  6 +++---
 kernel/sched/rt.c    |  5 +++--
 kernel/sched/sched.h | 31 +++++++++++++++++++++++++++----
 3 files changed, 33 insertions(+), 9 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 51efaabac3e4..84538271b4eb 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -610,10 +610,10 @@ void double_rq_lock(struct rq *rq1, struct rq *rq2)
 		swap(rq1, rq2);
 
 	raw_spin_rq_lock(rq1);
-	if (__rq_lockp(rq1) == __rq_lockp(rq2))
-		return;
+	if (__rq_lockp(rq1) != __rq_lockp(rq2))
+		raw_spin_rq_lock_nested(rq2, SINGLE_DEPTH_NESTING);
 
-	raw_spin_rq_lock_nested(rq2, SINGLE_DEPTH_NESTING);
+	double_rq_clock_clear_update(rq1, rq2);
 }
 #endif
 
diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index a32c46889af8..7891c0f0e1ff 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -871,6 +871,7 @@ static int do_sched_rt_period_timer(struct rt_bandwidth *rt_b, int overrun)
 		int enqueue = 0;
 		struct rt_rq *rt_rq = sched_rt_period_rt_rq(rt_b, i);
 		struct rq *rq = rq_of_rt_rq(rt_rq);
+		struct rq_flags rf;
 		int skip;
 
 		/*
@@ -885,7 +886,7 @@ static int do_sched_rt_period_timer(struct rt_bandwidth *rt_b, int overrun)
 		if (skip)
 			continue;
 
-		raw_spin_rq_lock(rq);
+		rq_lock(rq, &rf);
 		update_rq_clock(rq);
 
 		if (rt_rq->rt_time) {
@@ -923,7 +924,7 @@ static int do_sched_rt_period_timer(struct rt_bandwidth *rt_b, int overrun)
 
 		if (enqueue)
 			sched_rt_rq_enqueue(rt_rq);
-		raw_spin_rq_unlock(rq);
+		rq_unlock(rq, &rf);
 	}
 
 	if (!throttled && (!rt_bandwidth_enabled() || rt_b->rt_runtime == RUNTIME_INF))
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 8dccb34eb190..7f1a18b2aff2 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -2478,6 +2478,27 @@ unsigned long arch_scale_freq_capacity(int cpu)
 }
 #endif
 
+#ifdef CONFIG_SCHED_DEBUG
+/*
+ * In double_lock_balance()/double_rq_lock(), we use raw_spin_rq_lock() to
+ * acquire the rq lock instead of rq_lock(). So at the end of these two
+ * functions we need to call double_rq_clock_clear_update() synchronously
+ * to clear RQCF_UPDATED of rq->clock_update_flags and avoid the warning.
+ */
+static inline void double_rq_clock_clear_update(struct rq *rq1, struct rq *rq2)
+{
+	rq1->clock_update_flags &= (RQCF_REQ_SKIP|RQCF_ACT_SKIP);
+	/*
+	 * If CONFIG_SMP is not defined, rq1 and rq2 are the same,
+	 * so we just need to clear RQCF_UPDATED of one of them.
+	 */
+#ifdef CONFIG_SMP
+	rq2->clock_update_flags &= (RQCF_REQ_SKIP|RQCF_ACT_SKIP);
+#endif
+}
+#else
+static inline void double_rq_clock_clear_update(struct rq *rq1, struct rq *rq2) {}
+#endif
 
 #ifdef CONFIG_SMP
 
@@ -2543,14 +2564,15 @@ static inline int _double_lock_balance(struct rq *this_rq, struct rq *busiest)
 	__acquires(busiest->lock)
 	__acquires(this_rq->lock)
 {
-	if (__rq_lockp(this_rq) == __rq_lockp(busiest))
-		return 0;
-
-	if (likely(raw_spin_rq_trylock(busiest)))
+	if (__rq_lockp(this_rq) == __rq_lockp(busiest) ||
+		likely(raw_spin_rq_trylock(busiest))) {
+		double_rq_clock_clear_update(this_rq, busiest);
 		return 0;
+	}
 
 	if (rq_order_less(this_rq, busiest)) {
 		raw_spin_rq_lock_nested(busiest, SINGLE_DEPTH_NESTING);
+		double_rq_clock_clear_update(this_rq, busiest);
 		return 0;
 	}
 
@@ -2644,6 +2666,7 @@ static inline void double_rq_lock(struct rq *rq1, struct rq *rq2)
 	BUG_ON(rq1 != rq2);
 	raw_spin_rq_lock(rq1);
 	__acquire(rq2->lock);	/* Fake it out ;) */
+	double_rq_clock_clear_update(rq1, rq2);
 }
 
 /*
-- 
2.32.0
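
As an aside, the pattern used by double_rq_lock() above (pick a stable
order via rq_order_less(), take the first lock, take the second one
nested, and take it only once when both rqs share a lock) is the classic
way to avoid ABBA deadlock. A minimal user-space sketch with POSIX
mutexes — the toy_* names are hypothetical, not the kernel API:

```c
#include <assert.h>
#include <pthread.h>

/* Toy runqueue with its own lock; stands in for struct rq. */
struct toy_rq {
	pthread_mutex_t lock;
	unsigned long long clock;
};

/*
 * Lock two runqueues in a stable (address) order so that two threads
 * locking the same pair concurrently can never deadlock. When both
 * arguments are the same rq, take the lock only once.
 */
static void toy_double_rq_lock(struct toy_rq *rq1, struct toy_rq *rq2)
{
	if (rq2 < rq1) {	/* order by address, like rq_order_less() */
		struct toy_rq *tmp = rq1;
		rq1 = rq2;
		rq2 = tmp;
	}
	pthread_mutex_lock(&rq1->lock);
	if (rq1 != rq2)
		pthread_mutex_lock(&rq2->lock);
}

static void toy_double_rq_unlock(struct toy_rq *rq1, struct toy_rq *rq2)
{
	if (rq1 != rq2)
		pthread_mutex_unlock(&rq2->lock);
	pthread_mutex_unlock(&rq1->lock);
}
```

Because the arguments are swapped into address order internally, both
toy_double_rq_lock(&a, &b) and toy_double_rq_lock(&b, &a) acquire the
locks in the same order.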



* [PATCH v2 2/2] sched/dl: Remove some comments and adjust code in push_dl_task
  2022-04-22  9:09 [PATCH v2 0/2] Avoid obvious double update_rq_clock warning Hao Jia
  2022-04-22  9:09 ` [PATCH v2 1/2] sched/core: " Hao Jia
@ 2022-04-22  9:09 ` Hao Jia
  2022-04-25 16:21   ` Dietmar Eggemann
  1 sibling, 1 reply; 7+ messages in thread
From: Hao Jia @ 2022-04-22  9:09 UTC (permalink / raw)
  To: mingo, peterz, juri.lelli, vincent.guittot, dietmar.eggemann,
	rostedt, bsegall, mgorman, bristot
  Cc: linux-kernel, Hao Jia

The call to update_rq_clock() before activate_task(), introduced by
commit 840d719604b0 ("sched/deadline: Update rq_clock of later_rq
when pushing a task"), is no longer needed since commit f4904815f97a
("sched/deadline: Fix double accounting of rq/running bw in push & pull")
removed the add_running_bw() call before activate_task().

So remove the comments that are no longer needed and let activate_task()
update the rq clock itself.
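
What makes dropping ENQUEUE_NOCLOCK sufficient is that activate_task()
updates the rq clock itself unless the caller passes that flag. A toy
model of this behavior (the toy_* names are made up; the flag value
follows kernel/sched/sched.h):

```c
#include <assert.h>

/* ENQUEUE_NOCLOCK as in kernel/sched/sched.h (0x08 at the time of writing). */
#define TOY_ENQUEUE_NOCLOCK 0x08

struct toy_dl_rq { int clock_updates; };

/*
 * Toy model of activate_task(): the rq clock is updated on enqueue
 * unless the caller passes ENQUEUE_NOCLOCK because it has already
 * updated the clock itself.
 */
static void toy_activate_task(struct toy_dl_rq *rq, int flags)
{
	if (!(flags & TOY_ENQUEUE_NOCLOCK))
		rq->clock_updates++;
}
```

So calling activate_task(later_rq, next_task, 0) still yields exactly one
clock update, just inside activate_task() rather than at the call site.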

Signed-off-by: Hao Jia <jiahao.os@bytedance.com>
---
 kernel/sched/deadline.c | 8 +-------
 1 file changed, 1 insertion(+), 7 deletions(-)

diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index fb4255ae0b2c..8eb694ed7ac1 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -2319,13 +2319,7 @@ static int push_dl_task(struct rq *rq)
 
 	deactivate_task(rq, next_task, 0);
 	set_task_cpu(next_task, later_rq->cpu);
-
-	/*
-	 * Update the later_rq clock here, because the clock is used
-	 * by the cpufreq_update_util() inside __add_running_bw().
-	 */
-	update_rq_clock(later_rq);
-	activate_task(later_rq, next_task, ENQUEUE_NOCLOCK);
+	activate_task(later_rq, next_task, 0);
 	ret = 1;
 
 	resched_curr(later_rq);
-- 
2.32.0



* Re: [PATCH v2 1/2] sched/core: Avoid obvious double update_rq_clock warning
  2022-04-22  9:09 ` [PATCH v2 1/2] sched/core: " Hao Jia
@ 2022-04-25 15:37   ` Dietmar Eggemann
  2022-04-26  7:57     ` [External] " Hao Jia
  0 siblings, 1 reply; 7+ messages in thread
From: Dietmar Eggemann @ 2022-04-25 15:37 UTC (permalink / raw)
  To: Hao Jia, mingo, peterz, juri.lelli, vincent.guittot, rostedt,
	bsegall, mgorman, bristot
  Cc: linux-kernel

On 22/04/2022 11:09, Hao Jia wrote:
> When we use raw_spin_rq_lock to acquire the rq lock and have to
> update the rq clock while holding the lock, the kernel may issue
> a WARN_DOUBLE_CLOCK warning.
> 
> Since we directly use raw_spin_rq_lock to acquire rq lock instead of
> rq_lock, there is no corresponding change to rq->clock_update_flags.
> In particular, we have obtained the rq lock of other cores,
> the core rq->clock_update_flags may be RQCF_UPDATED at this time, and
> then calling update_rq_clock will trigger the WARN_DOUBLE_CLOCK warning.
> 
> So we need to clear RQCF_UPDATED of rq->clock_update_flags synchronously
> to avoid the WARN_DOUBLE_CLOCK warning.
> 
> Some call trace reports:
> Call Trace 1:
>  <IRQ>
>  sched_rt_period_timer+0x10f/0x3a0
>  ? enqueue_top_rt_rq+0x110/0x110
>  __hrtimer_run_queues+0x1a9/0x490
>  hrtimer_interrupt+0x10b/0x240

[...]

For the sched_rt_period_timer() case you simply replace
raw_spin_rq_lock()/raw_spin_rq_unlock() with rq_lock()/rq_unlock().
I.e. you're not using the new double_rq_clock_clear_update() which
is depicted in the text above.

> Call Trace 2:
>  <TASK>
>  activate_task+0x8b/0x110
>  push_rt_task.part.108+0x241/0x2c0
>  push_rt_tasks+0x15/0x30
>  finish_task_switch+0xaa/0x2e0
>  ? __switch_to+0x134/0x420
>  __schedule+0x343/0x8e0

[...]

> Call Trace 3:
>  <TASK>
>  deactivate_task+0x93/0xe0
>  pull_rt_task+0x33e/0x400
>  balance_rt+0x7e/0x90
>  __schedule+0x62f/0x8e0
>  do_task_dead+0x3f/0x50

[...]

> Steps to reproduce:
> 1. Enable CONFIG_SCHED_DEBUG when compiling the kernel
> 2. echo 1 > /sys/kernel/debug/clear_warn_once
>    echo "WARN_DOUBLE_CLOCK" > /sys/kernel/debug/sched_features

s/sched_features/sched\/features/

>    echo "NO_RT_PUSH_IPI" > /sys/kernel/debug/sched_features

s/sched_features/sched\/features/

> 3. Run some rt tasks that periodically change the priority and sleep

Not sure about the `change the priority` part? I'm using rt-app with 2*n
rt or dl (90% running) tasks (on a system with n CPUs) and can trigger
all of these cases.
 
> Signed-off-by: Hao Jia <jiahao.os@bytedance.com>
> Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>

You can remove this Signed-off-by. I can provide a Reviewed-By for the
next version.

[...]

When running PREEMPT_RT kernel there is another similar case in DL.

[  215.158100] ------------[ cut here ]------------
[  215.158105] rq->clock_update_flags & RQCF_UPDATED
[  215.158113] WARNING: CPU: 2 PID: 1942 at kernel/sched/core.c:690 update_rq_clock+0x128/0x1a0
...
[  215.158245]  update_rq_clock+0x128/0x1a0
[  215.158253]  migrate_task_rq_dl+0xec/0x310    <--- !!!
[  215.158259]  set_task_cpu+0x84/0x1e4
[  215.158264]  try_to_wake_up+0x1d8/0x5c0
[  215.158268]  wake_up_process+0x1c/0x30
[  215.158272]  hrtimer_wakeup+0x24/0x3c
[  215.158279]  __hrtimer_run_queues+0x114/0x270
[  215.158285]  hrtimer_interrupt+0xe8/0x244
[  215.158291]  arch_timer_handler_phys+0x30/0x50
[  215.158301]  handle_percpu_devid_irq+0x88/0x140
[  215.158306]  generic_handle_domain_irq+0x40/0x60
[  215.158313]  gic_handle_irq+0x48/0xe0
[  215.158320]  call_on_irq_stack+0x2c/0x60
[  215.158327]  do_interrupt_handler+0x80/0x84

For a non_contending task, this is the same issue as in sched_rt_period_timer().

-->8--

diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index cfe7b40bc4ff..668a9910cd6d 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -1804,6 +1804,7 @@ select_task_rq_dl(struct task_struct *p, int cpu, int flags)
 
 static void migrate_task_rq_dl(struct task_struct *p, int new_cpu __maybe_unused)
 {
+	struct rq_flags rf;
 	struct rq *rq;
 
 	if (READ_ONCE(p->__state) != TASK_WAKING)
@@ -1815,7 +1816,7 @@ static void migrate_task_rq_dl(struct task_struct *p, int new_cpu __maybe_unused
 	 * from try_to_wake_up(). Hence, p->pi_lock is locked, but
 	 * rq->lock is not... So, lock it
 	 */
-	raw_spin_rq_lock(rq);
+	rq_lock(rq, &rf);
 	if (p->dl.dl_non_contending) {
 		update_rq_clock(rq);
 		sub_running_bw(&p->dl, &rq->dl);
@@ -1831,7 +1832,7 @@ static void migrate_task_rq_dl(struct task_struct *p, int new_cpu __maybe_unused
 			put_task_struct(p);
 	}
 	sub_rq_bw(&p->dl, &rq->dl);
-	raw_spin_rq_unlock(rq);
+	rq_unlock(rq, &rf);
 }
 
 static void check_preempt_equal_dl(struct rq *rq, struct task_struct *p)
-- 
2.25.1


* Re: [PATCH v2 2/2] sched/dl: Remove some comments and adjust code in push_dl_task
  2022-04-22  9:09 ` [PATCH v2 2/2] sched/dl: Remove some comments and adjust code in push_dl_task Hao Jia
@ 2022-04-25 16:21   ` Dietmar Eggemann
  2022-04-26  8:01     ` [External] " Hao Jia
  0 siblings, 1 reply; 7+ messages in thread
From: Dietmar Eggemann @ 2022-04-25 16:21 UTC (permalink / raw)
  To: Hao Jia, mingo, peterz, juri.lelli, vincent.guittot, rostedt,
	bsegall, mgorman, bristot
  Cc: linux-kernel

On 22/04/2022 11:09, Hao Jia wrote:

Nitpick: I would change the message slightly into something like:

sched/deadline: Remove superfluous rq clock update in push_dl_task()

> The change to call update_rq_clock() before activate_task()
> commit 840d719604b0 ("sched/deadline: Update rq_clock of later_rq
> when pushing a task") is no longer needed since commit f4904815f97a
> ("sched/deadline: Fix double accounting of rq/running bw in push & pull")
> removed the add_running_bw() before the activate_task().
> 
> So we remove some comments that are no longer needed and update
> rq clock in activate_task().
> 
> Signed-off-by: Hao Jia <jiahao.os@bytedance.com>
> ---
>  kernel/sched/deadline.c | 8 +-------
>  1 file changed, 1 insertion(+), 7 deletions(-)
> 
> diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
> index fb4255ae0b2c..8eb694ed7ac1 100644
> --- a/kernel/sched/deadline.c
> +++ b/kernel/sched/deadline.c
> @@ -2319,13 +2319,7 @@ static int push_dl_task(struct rq *rq)
>  
>  	deactivate_task(rq, next_task, 0);
>  	set_task_cpu(next_task, later_rq->cpu);
> -
> -	/*
> -	 * Update the later_rq clock here, because the clock is used
> -	 * by the cpufreq_update_util() inside __add_running_bw().
> -	 */
> -	update_rq_clock(later_rq);
> -	activate_task(later_rq, next_task, ENQUEUE_NOCLOCK);
> +	activate_task(later_rq, next_task, 0);
>  	ret = 1;
>  
>  	resched_curr(later_rq);

Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>


* Re: [External] Re: [PATCH v2 1/2] sched/core: Avoid obvious double update_rq_clock warning
  2022-04-25 15:37   ` Dietmar Eggemann
@ 2022-04-26  7:57     ` Hao Jia
  0 siblings, 0 replies; 7+ messages in thread
From: Hao Jia @ 2022-04-26  7:57 UTC (permalink / raw)
  To: Dietmar Eggemann, mingo, peterz, juri.lelli, vincent.guittot,
	rostedt, bsegall, mgorman, bristot
  Cc: linux-kernel



On 2022/4/25 Dietmar Eggemann wrote:
> On 22/04/2022 11:09, Hao Jia wrote:
>> When we use raw_spin_rq_lock to acquire the rq lock and have to
>> update the rq clock while holding the lock, the kernel may issue
>> a WARN_DOUBLE_CLOCK warning.
>>
>> Since we directly use raw_spin_rq_lock to acquire rq lock instead of
>> rq_lock, there is no corresponding change to rq->clock_update_flags.
>> In particular, we have obtained the rq lock of other cores,
>> the core rq->clock_update_flags may be RQCF_UPDATED at this time, and
>> then calling update_rq_clock will trigger the WARN_DOUBLE_CLOCK warning.
>>
>> So we need to clear RQCF_UPDATED of rq->clock_update_flags synchronously
>> to avoid the WARN_DOUBLE_CLOCK warning.
>>
>> Some call trace reports:
>> Call Trace 1:
>>   <IRQ>
>>   sched_rt_period_timer+0x10f/0x3a0
>>   ? enqueue_top_rt_rq+0x110/0x110
>>   __hrtimer_run_queues+0x1a9/0x490
>>   hrtimer_interrupt+0x10b/0x240
> 
> [...]
> 
> For the  sched_rt_period_timer() case you simply replace
> raw_spin_rq_lock()/raw_spin_rq_unlock() with rq_lock()/rq_unlock().
> I.e. you're not using the new double_rq_clock_clear_update() which
> is depicted in the text above.
> 

Thanks for your review comments.
Yes, adding this description would make it clearer.
I will do it in patch v3.
Thanks.


>> Call Trace 2:
>>   <TASK>
>>   activate_task+0x8b/0x110
>>   push_rt_task.part.108+0x241/0x2c0
>>   push_rt_tasks+0x15/0x30
>>   finish_task_switch+0xaa/0x2e0
>>   ? __switch_to+0x134/0x420
>>   __schedule+0x343/0x8e0
> 
> [...]
> 
>> Call Trace 3:
>>   <TASK>
>>   deactivate_task+0x93/0xe0
>>   pull_rt_task+0x33e/0x400
>>   balance_rt+0x7e/0x90
>>   __schedule+0x62f/0x8e0
>>   do_task_dead+0x3f/0x50
> 
> [...]
> 
>> Steps to reproduce:
>> 1. Enable CONFIG_SCHED_DEBUG when compiling the kernel
>> 2. echo 1 > /sys/kernel/debug/clear_warn_once
>>     echo "WARN_DOUBLE_CLOCK" > /sys/kernel/debug/sched_features
> 
> s/sched_features/sched\/features/

I will fix it in patch v3.
Thanks.

> 
>>     echo "NO_RT_PUSH_IPI" > /sys/kernel/debug/sched_features
> 
> s/sched_features/sched\/features/

I will fix it in patch v3.
Thanks.

> 
>> 3. Run some rt tasks that periodically change the priority and sleep
> 
> Not sure about the `change the priority` part? I'm using rt-app with 2*n
> rt or dl (90% running) tasks (on a system with n CPUs) and can trigger
> all of these cases.

Yes, this problem is relatively easy to reproduce. I wrote a test case
for the rt scheduler that triggers all of these cases: it creates 150 rt
threads (my test environment has 96 CPUs) which periodically change their
priority and sleep. The priorities change constantly in order to trigger
push_rt_task()/pull_rt_task().


#include <math.h>
#include <pthread.h>
#include <sched.h>
#include <unistd.h>

/* MAGIC_NUM is defined elsewhere in the test; sqrt() just burns some CPU. */
void *ThreadFun(void *arg)
{
	int cnt = *(int*)arg;
	struct sched_param param;

	while (1) {
		sqrt(MAGIC_NUM);
		cnt = cnt % 10 + 1;
		param.sched_priority = cnt;
		pthread_setschedparam(pthread_self(), SCHED_RR, &param);
		sqrt(MAGIC_NUM);
		sqrt(MAGIC_NUM);
		sleep(cnt);
	}
	return NULL;
}


I will try to reproduce again using rt-app, and will change the 
description of the reproduction steps in patch v3.
Thanks.


>   
>> Signed-off-by: Hao Jia <jiahao.os@bytedance.com>
>> Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
> 
> You can remove this Signed-off-by. I can provide a Reviewed-By for the
> next version.


OK, thanks again for your review and suggestions, they really helped me.

> 
> [...]
> 
> When running PREEMPT_RT kernel there is another similar case in DL.
> 
> [  215.158100] ------------[ cut here ]------------
> [  215.158105] rq->clock_update_flags & RQCF_UPDATED
> [  215.158113] WARNING: CPU: 2 PID: 1942 at kernel/sched/core.c:690 update_rq_clock+0x128/0x1a0
> ...
> [  215.158245]  update_rq_clock+0x128/0x1a0
> [  215.158253]  migrate_task_rq_dl+0xec/0x310    <--- !!!
> [  215.158259]  set_task_cpu+0x84/0x1e4
> [  215.158264]  try_to_wake_up+0x1d8/0x5c0
> [  215.158268]  wake_up_process+0x1c/0x30
> [  215.158272]  hrtimer_wakeup+0x24/0x3c
> [  215.158279]  __hrtimer_run_queues+0x114/0x270
> [  215.158285]  hrtimer_interrupt+0xe8/0x244
> [  215.158291]  arch_timer_handler_phys+0x30/0x50
> [  215.158301]  handle_percpu_devid_irq+0x88/0x140
> [  215.158306]  generic_handle_domain_irq+0x40/0x60
> [  215.158313]  gic_handle_irq+0x48/0xe0
> [  215.158320]  call_on_irq_stack+0x2c/0x60
> [  215.158327]  do_interrupt_handler+0x80/0x84
> 
> For a non_contending task, this is the same issue as in sched_rt_period_timer().
> 
> -->8--
> 
> diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
> index cfe7b40bc4ff..668a9910cd6d 100644
> --- a/kernel/sched/deadline.c
> +++ b/kernel/sched/deadline.c
> @@ -1804,6 +1804,7 @@ select_task_rq_dl(struct task_struct *p, int cpu, int flags)
>   
>   static void migrate_task_rq_dl(struct task_struct *p, int new_cpu __maybe_unused)
>   {
> +	struct rq_flags rf;
>   	struct rq *rq;
>   
>   	if (READ_ONCE(p->__state) != TASK_WAKING)
> @@ -1815,7 +1816,7 @@ static void migrate_task_rq_dl(struct task_struct *p, int new_cpu __maybe_unused
>   	 * from try_to_wake_up(). Hence, p->pi_lock is locked, but
>   	 * rq->lock is not... So, lock it
>   	 */
> -	raw_spin_rq_lock(rq);
> +	rq_lock(rq, &rf);
>   	if (p->dl.dl_non_contending) {
>   		update_rq_clock(rq);
>   		sub_running_bw(&p->dl, &rq->dl);
> @@ -1831,7 +1832,7 @@ static void migrate_task_rq_dl(struct task_struct *p, int new_cpu __maybe_unused
>   			put_task_struct(p);
>   	}
>   	sub_rq_bw(&p->dl, &rq->dl);
> -	raw_spin_rq_unlock(rq);
> +	rq_unlock(rq, &rf);
>   }
>   

I will fix it in patch v3.
Thanks.


* Re: [External] Re: [PATCH v2 2/2] sched/dl: Remove some comments and adjust code in push_dl_task
  2022-04-25 16:21   ` Dietmar Eggemann
@ 2022-04-26  8:01     ` Hao Jia
  0 siblings, 0 replies; 7+ messages in thread
From: Hao Jia @ 2022-04-26  8:01 UTC (permalink / raw)
  To: Dietmar Eggemann, mingo, peterz, juri.lelli, vincent.guittot,
	rostedt, bsegall, mgorman, bristot
  Cc: linux-kernel



On 2022/4/26 Dietmar Eggemann wrote:
> On 22/04/2022 11:09, Hao Jia wrote:
> 
> Nitpick: I would change the message slightly into something like:
> 
> sched/deadline: Remove superfluous rq clock update in push_dl_task()
> 

I will do it in patch v3.
Thanks.


>> The change to call update_rq_clock() before activate_task()
>> commit 840d719604b0 ("sched/deadline: Update rq_clock of later_rq
>> when pushing a task") is no longer needed since commit f4904815f97a
>> ("sched/deadline: Fix double accounting of rq/running bw in push & pull")
>> removed the add_running_bw() before the activate_task().
>>
>> So we remove some comments that are no longer needed and update
>> rq clock in activate_task().
>>
>> Signed-off-by: Hao Jia <jiahao.os@bytedance.com>
>> ---
>>   kernel/sched/deadline.c | 8 +-------
>>   1 file changed, 1 insertion(+), 7 deletions(-)
>>
>> diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
>> index fb4255ae0b2c..8eb694ed7ac1 100644
>> --- a/kernel/sched/deadline.c
>> +++ b/kernel/sched/deadline.c
>> @@ -2319,13 +2319,7 @@ static int push_dl_task(struct rq *rq)
>>   
>>   	deactivate_task(rq, next_task, 0);
>>   	set_task_cpu(next_task, later_rq->cpu);
>> -
>> -	/*
>> -	 * Update the later_rq clock here, because the clock is used
>> -	 * by the cpufreq_update_util() inside __add_running_bw().
>> -	 */
>> -	update_rq_clock(later_rq);
>> -	activate_task(later_rq, next_task, ENQUEUE_NOCLOCK);
>> +	activate_task(later_rq, next_task, 0);
>>   	ret = 1;
>>   
>>   	resched_curr(later_rq);
> 
> Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>

