* [Resend PATCH] psi : calc cfs task memstall time more precisely
@ 2021-10-15  6:16 Huangzhaoyang
  2021-11-02 19:47 ` Johannes Weiner
  0 siblings, 1 reply; 25+ messages in thread
From: Huangzhaoyang @ 2021-10-15  6:16 UTC (permalink / raw)
  To: Andrew Morton, Johannes Weiner, Michal Hocko, Vladimir Davydov,
	Zhaoyang Huang, linux-mm, linux-kernel

From: Zhaoyang Huang <zhaoyang.huang@unisoc.com>

In an EAS-enabled system, two scenarios conflict with the current design:

1. The workload tends to be distributed very unevenly across cores because of
the scheduler policy; RT tasks usually preempt CFS tasks on the little cores.
2. A CFS task's memstall time is currently counted simply as exit - entry,
which ignores the time for which the task was preempted by RT, DL and IRQs.

With these two constraints, the per-cpu non-idle time is mostly consumed by
non-CFS tasks and cannot be meaningfully averaged. Eliminate that share by
scaling the time growth by the proportion of the cfs_rq's utilization within
the whole rq.

For example, here is the scenario this commit wants to fix: RT tasks and IRQs
consume part of the whole rq's utilization. This is typical on a core that is
assigned to handle all IRQs; furthermore, under EAS the RT tasks tend to run
on the little cores.

Binder:305_3-314    [002] d..1   257.880195: psi_memtime_fixup: original:30616,adjusted:25951,se:89,cfs:353,rt:139,dl:0,irq:18
droid.phone-1525    [001] d..1   265.145492: psi_memtime_fixup: original:61616,adjusted:53492,se:55,cfs:225,rt:121,dl:0,irq:15
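
Assuming the rt/dl/irq fields in the trace above are the rq util_avg values
consumed by psi_memtime_fixup() below, the adjusted figures can be reproduced
by scaling the original delta by the remaining (CFS) share of the rq, using
integer division:

	adjusted = original * (1024 - rt - dl - irq + 1) / 1024
	30616 * (1024 - 139 - 0 - 18 + 1) / 1024 = 30616 * 868 / 1024 = 25951
	61616 * (1024 - 121 - 0 - 15 + 1) / 1024 = 61616 * 889 / 1024 = 53492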

Signed-off-by: Zhaoyang Huang <zhaoyang.huang@unisoc.com>
---
 kernel/sched/psi.c | 20 +++++++++++++++++++-
 1 file changed, 19 insertions(+), 1 deletion(-)

diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c
index cc25a3c..754a836 100644
--- a/kernel/sched/psi.c
+++ b/kernel/sched/psi.c
@@ -182,6 +182,8 @@ struct psi_group psi_system = {
 
 static void psi_avgs_work(struct work_struct *work);
 
+static unsigned long psi_memtime_fixup(u32 growth);
+
 static void group_init(struct psi_group *group)
 {
 	int cpu;
@@ -492,6 +494,21 @@ static u64 window_update(struct psi_window *win, u64 now, u64 value)
 	return growth;
 }
 
+static unsigned long psi_memtime_fixup(u32 growth)
+{
+	struct rq *rq = task_rq(current);
+	unsigned long growth_fixed = (unsigned long)growth;
+
+	if (!(current->policy == SCHED_NORMAL || current->policy == SCHED_BATCH))
+		return growth_fixed;
+
+	if (current->in_memstall)
+		growth_fixed = div64_ul((1024 - rq->avg_rt.util_avg - rq->avg_dl.util_avg
+					- rq->avg_irq.util_avg + 1) * growth, 1024);
+
+	return growth_fixed;
+}
+
 static void init_triggers(struct psi_group *group, u64 now)
 {
 	struct psi_trigger *t;
@@ -658,6 +675,7 @@ static void record_times(struct psi_group_cpu *groupc, u64 now)
 	}
 
 	if (groupc->state_mask & (1 << PSI_MEM_SOME)) {
+		delta = psi_memtime_fixup(delta);
 		groupc->times[PSI_MEM_SOME] += delta;
 		if (groupc->state_mask & (1 << PSI_MEM_FULL))
 			groupc->times[PSI_MEM_FULL] += delta;
@@ -928,8 +946,8 @@ void psi_memstall_leave(unsigned long *flags)
 	 */
 	rq = this_rq_lock_irq(&rf);
 
-	current->in_memstall = 0;
 	psi_task_change(current, TSK_MEMSTALL, 0);
+	current->in_memstall = 0;
 
 	rq_unlock_irq(rq, &rf);
 }
-- 
1.9.1


* [Resend PATCH] psi : calc cfs task memstall time more precisely
@ 2021-09-26  3:27 Huangzhaoyang
  0 siblings, 0 replies; 25+ messages in thread
From: Huangzhaoyang @ 2021-09-26  3:27 UTC (permalink / raw)
  To: Johannes Weiner, Zhaoyang Huang, linux-mm, linux-kernel,
	xuewen.yan, ke.wang

From: Zhaoyang Huang <zhaoyang.huang@unisoc.com>

In an EAS-enabled system, two scenarios conflict with the current design:

1. The workload tends to be distributed very unevenly across cores because of
the scheduler policy; RT tasks usually preempt CFS tasks on the little cores.
2. A CFS task's memstall time is currently counted simply as exit - entry,
which ignores the time for which the task was preempted by RT, DL and IRQs.

With these two constraints, the per-cpu non-idle time is mostly consumed by
non-CFS tasks and cannot be meaningfully averaged. Eliminate that share by
scaling the time growth by the proportion of the cfs_rq's utilization within
the whole rq.

For example, here is the scenario this commit wants to fix: RT tasks and IRQs
consume part of the whole rq's utilization. This is typical on a core that is
assigned to handle all IRQs; furthermore, under EAS the RT tasks tend to run
on the little cores.

Binder:305_3-314    [002] d..1   257.880195: psi_memtime_fixup: original:30616,adjusted:25951,se:89,cfs:353,rt:139,dl:0,irq:18
droid.phone-1525    [001] d..1   265.145492: psi_memtime_fixup: original:61616,adjusted:53492,se:55,cfs:225,rt:121,dl:0,irq:15
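
Assuming the rt/dl/irq fields in the trace above are the rq util_avg values
consumed by psi_memtime_fixup() below, the adjusted figures can be reproduced
by scaling the original delta by the remaining (CFS) share of the rq, using
integer division:

	adjusted = original * (1024 - rt - dl - irq + 1) / 1024
	30616 * (1024 - 139 - 0 - 18 + 1) / 1024 = 30616 * 868 / 1024 = 25951
	61616 * (1024 - 121 - 0 - 15 + 1) / 1024 = 61616 * 889 / 1024 = 53492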

Signed-off-by: Zhaoyang Huang <zhaoyang.huang@unisoc.com>
---
 kernel/sched/psi.c | 20 +++++++++++++++++++-
 1 file changed, 19 insertions(+), 1 deletion(-)

diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c
index cc25a3c..754a836 100644
--- a/kernel/sched/psi.c
+++ b/kernel/sched/psi.c
@@ -182,6 +182,8 @@ struct psi_group psi_system = {
 
 static void psi_avgs_work(struct work_struct *work);
 
+static unsigned long psi_memtime_fixup(u32 growth);
+
 static void group_init(struct psi_group *group)
 {
 	int cpu;
@@ -492,6 +494,21 @@ static u64 window_update(struct psi_window *win, u64 now, u64 value)
 	return growth;
 }
 
+static unsigned long psi_memtime_fixup(u32 growth)
+{
+	struct rq *rq = task_rq(current);
+	unsigned long growth_fixed = (unsigned long)growth;
+
+	if (!(current->policy == SCHED_NORMAL || current->policy == SCHED_BATCH))
+		return growth_fixed;
+
+	if (current->in_memstall)
+		growth_fixed = div64_ul((1024 - rq->avg_rt.util_avg - rq->avg_dl.util_avg
+					- rq->avg_irq.util_avg + 1) * growth, 1024);
+
+	return growth_fixed;
+}
+
 static void init_triggers(struct psi_group *group, u64 now)
 {
 	struct psi_trigger *t;
@@ -658,6 +675,7 @@ static void record_times(struct psi_group_cpu *groupc, u64 now)
 	}
 
 	if (groupc->state_mask & (1 << PSI_MEM_SOME)) {
+		delta = psi_memtime_fixup(delta);
 		groupc->times[PSI_MEM_SOME] += delta;
 		if (groupc->state_mask & (1 << PSI_MEM_FULL))
 			groupc->times[PSI_MEM_FULL] += delta;
@@ -928,8 +946,8 @@ void psi_memstall_leave(unsigned long *flags)
 	 */
 	rq = this_rq_lock_irq(&rf);
 
-	current->in_memstall = 0;
 	psi_task_change(current, TSK_MEMSTALL, 0);
+	current->in_memstall = 0;
 
 	rq_unlock_irq(rq, &rf);
 }
-- 
1.9.1


* [Resend PATCH] psi : calc cfs task  memstall time more precisely
@ 2021-09-18  5:25 Huangzhaoyang
  0 siblings, 0 replies; 25+ messages in thread
From: Huangzhaoyang @ 2021-09-18  5:25 UTC (permalink / raw)
  To: Johannes Weiner, Zhaoyang Huang, linux-mm, linux-kernel,
	xuewen.yan, ke.wang

From: Zhaoyang Huang <zhaoyang.huang@unisoc.com>

A CFS task's memstall time is currently counted simply as exit - entry, which
ignores the time for which the task was preempted by RT, DL and IRQs. Eliminate
that share by scaling the time growth by the proportion of the cfs_rq's
utilization within the whole rq.

For example, here is the scenario this commit wants to fix: RT tasks and IRQs
consume part of the whole rq's utilization. This is typical on a core that is
assigned to handle all IRQs; furthermore, under EAS the RT tasks tend to run
on the little cores.

Binder:305_3-314    [002] d..1   257.880195: psi_memtime_fixup: original:30616,adjusted:25951,se:89,cfs:353,rt:139,dl:0,irq:18
droid.phone-1525    [001] d..1   265.145492: psi_memtime_fixup: original:61616,adjusted:53492,se:55,cfs:225,rt:121,dl:0,irq:15
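
Assuming the rt/dl/irq fields in the trace above are the rq util_avg values
consumed by psi_memtime_fixup() below, the adjusted figures can be reproduced
by scaling the original delta by the remaining (CFS) share of the rq, using
integer division:

	adjusted = original * (1024 - rt - dl - irq + 1) / 1024
	30616 * (1024 - 139 - 0 - 18 + 1) / 1024 = 30616 * 868 / 1024 = 25951
	61616 * (1024 - 121 - 0 - 15 + 1) / 1024 = 61616 * 889 / 1024 = 53492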

Signed-off-by: Zhaoyang Huang <zhaoyang.huang@unisoc.com>
---
 kernel/sched/psi.c | 20 +++++++++++++++++++-
 1 file changed, 19 insertions(+), 1 deletion(-)

diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c
index cc25a3c..754a836 100644
--- a/kernel/sched/psi.c
+++ b/kernel/sched/psi.c
@@ -182,6 +182,8 @@ struct psi_group psi_system = {
 
 static void psi_avgs_work(struct work_struct *work);
 
+static unsigned long psi_memtime_fixup(u32 growth);
+
 static void group_init(struct psi_group *group)
 {
 	int cpu;
@@ -492,6 +494,21 @@ static u64 window_update(struct psi_window *win, u64 now, u64 value)
 	return growth;
 }
 
+static unsigned long psi_memtime_fixup(u32 growth)
+{
+	struct rq *rq = task_rq(current);
+	unsigned long growth_fixed = (unsigned long)growth;
+
+	if (!(current->policy == SCHED_NORMAL || current->policy == SCHED_BATCH))
+		return growth_fixed;
+
+	if (current->in_memstall)
+		growth_fixed = div64_ul((1024 - rq->avg_rt.util_avg - rq->avg_dl.util_avg
+					- rq->avg_irq.util_avg + 1) * growth, 1024);
+
+	return growth_fixed;
+}
+
 static void init_triggers(struct psi_group *group, u64 now)
 {
 	struct psi_trigger *t;
@@ -658,6 +675,7 @@ static void record_times(struct psi_group_cpu *groupc, u64 now)
 	}
 
 	if (groupc->state_mask & (1 << PSI_MEM_SOME)) {
+		delta = psi_memtime_fixup(delta);
 		groupc->times[PSI_MEM_SOME] += delta;
 		if (groupc->state_mask & (1 << PSI_MEM_FULL))
 			groupc->times[PSI_MEM_FULL] += delta;
@@ -928,8 +946,8 @@ void psi_memstall_leave(unsigned long *flags)
 	 */
 	rq = this_rq_lock_irq(&rf);
 
-	current->in_memstall = 0;
 	psi_task_change(current, TSK_MEMSTALL, 0);
+	current->in_memstall = 0;
 
 	rq_unlock_irq(rq, &rf);
 }
-- 
1.9.1



Thread overview: 25+ messages
2021-10-15  6:16 [Resend PATCH] psi : calc cfs task memstall time more precisely Huangzhaoyang
2021-11-02 19:47 ` Johannes Weiner
2021-11-03  7:07   ` Zhaoyang Huang
2021-11-03  7:08     ` Zhaoyang Huang
2021-11-04  8:58       ` Dietmar Eggemann
2021-11-05  5:58         ` Zhaoyang Huang
2021-11-05 16:42           ` Dietmar Eggemann
2021-11-08  8:49             ` Xuewen Yan
2021-11-08  9:20               ` Zhaoyang Huang
2021-11-09 12:29                 ` Dietmar Eggemann
2021-11-10  5:38                   ` Xuewen Yan
2021-11-09  9:43               ` Dietmar Eggemann
2021-11-10  5:36                 ` Xuewen Yan
2021-11-12 14:16                   ` Dietmar Eggemann
2021-11-09 14:56   ` Peter Zijlstra
2021-11-10  1:37     ` Zhaoyang Huang
2021-11-10  8:36       ` Peter Zijlstra
2021-11-10  8:47         ` Zhaoyang Huang
2021-11-10  8:49     ` Vincent Guittot
2021-11-10  9:04       ` Zhaoyang Huang
2021-11-12 16:36     ` Johannes Weiner
2021-11-12 19:23       ` Peter Zijlstra
2021-11-15  2:24       ` Zhaoyang Huang
  -- strict thread matches above, loose matches on Subject: below --
2021-09-26  3:27 Huangzhaoyang
2021-09-18  5:25 Huangzhaoyang
