linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] psi: fix randomized calculation in record_times()
@ 2020-04-03  8:55 Yafang Shao
  2020-04-03 15:20 ` Johannes Weiner
  0 siblings, 1 reply; 3+ messages in thread
From: Yafang Shao @ 2020-04-03  8:55 UTC (permalink / raw)
  To: hannes, peterz, akpm
  Cc: linux-mm, linux-kernel, Yafang Shao, Daniel Drake, Suren Baghdasaryan

In record_times() we use 'now' and groupc->state_start to calculate the
delta as bellow,
	delta = now - groupc->state_start;
But note that groupc->state_start may be not initialized yet, IOW, the
state_start may be 0 currently. If state_start is 0, this calculation is
same with assigning the lower 32-bit of 'now' to delta, that is a random
value. To fix this value, we should initialize groupc->state_start before.

After we calculate the delta, we will assign 'now' to
groupc->state_start then,
	groupc->state_start = now;
This will cause the same issue if groupc->state_start will not be used in a
long period. Let's take an example. We create a cgroup foo and run tasks
in it. Some of these tasks enter into memstall and state_start is set.
Then we move all of these tasks out of cgroup foo for more than (1 << 32)
nsecs, and then move them in. That will cause the same issue as above.

The root cause of these issues is that we don't initialize the state_start
properly. To fix it, we should record how many tasks in this per cpu
psi_group. If there's no task in it, we just set state_start and don't
calculate the delta, that means it is the begin of the pressure. To
avoid redundant calculating the total number of tasks in this per cpu
psi_group, a new member 'total_tasks' is introduced in struct
psi_group_cpu, which is the sum of array members in tasks[].

Fixes: eb414681d5a0 ("psi: pressure stall information for CPU, memory, and IO")
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Daniel Drake <drake@endlessm.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
---
 include/linux/psi_types.h |  2 ++
 kernel/sched/psi.c        | 13 ++++++++++---
 2 files changed, 12 insertions(+), 3 deletions(-)

diff --git a/include/linux/psi_types.h b/include/linux/psi_types.h
index 4b7258495a04..b42cbfdb15e9 100644
--- a/include/linux/psi_types.h
+++ b/include/linux/psi_types.h
@@ -69,6 +69,8 @@ struct psi_group_cpu {
 
 	/* States of the tasks belonging to this group */
 	unsigned int tasks[NR_PSI_TASK_COUNTS];
+	/* Sum of above array members */
+	unsigned int total_tasks;
 
 	/* Aggregate pressure state derived from the tasks */
 	u32 state_mask;
diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c
index 8f45cdb6463b..7061529dc406 100644
--- a/kernel/sched/psi.c
+++ b/kernel/sched/psi.c
@@ -690,7 +690,10 @@ static void psi_group_change(struct psi_group *group, int cpu,
 	 */
 	write_seqcount_begin(&groupc->seq);
 
-	record_times(groupc, cpu, false);
+	if (groupc->total_tasks)
+		record_times(groupc, cpu, false);
+	else
+		groupc->state_start = cpu_clock(cpu);
 
 	for (t = 0, m = clear; m; m &= ~(1 << t), t++) {
 		if (!(m & (1 << t)))
@@ -703,11 +706,15 @@ static void psi_group_change(struct psi_group *group, int cpu,
 			psi_bug = 1;
 		}
 		groupc->tasks[t]--;
+		groupc->total_tasks--;
 	}
 
-	for (t = 0; set; set &= ~(1 << t), t++)
-		if (set & (1 << t))
+	for (t = 0; set; set &= ~(1 << t), t++) {
+		if (set & (1 << t)) {
 			groupc->tasks[t]++;
+			groupc->total_tasks++;
+		}
+	}
 
 	/* Calculate state mask representing active states */
 	for (s = 0; s < NR_PSI_STATES; s++) {
-- 
2.18.2


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH] psi: fix randomized calculation in record_times()
  2020-04-03  8:55 [PATCH] psi: fix randomized calculation in record_times() Yafang Shao
@ 2020-04-03 15:20 ` Johannes Weiner
  2020-04-04  2:31   ` Yafang Shao
  0 siblings, 1 reply; 3+ messages in thread
From: Johannes Weiner @ 2020-04-03 15:20 UTC (permalink / raw)
  To: Yafang Shao
  Cc: peterz, akpm, linux-mm, linux-kernel, Daniel Drake, Suren Baghdasaryan

On Fri, Apr 03, 2020 at 04:55:25AM -0400, Yafang Shao wrote:
> @@ -690,7 +690,10 @@ static void psi_group_change(struct psi_group *group, int cpu,
>  	 */
>  	write_seqcount_begin(&groupc->seq);
>  
> -	record_times(groupc, cpu, false);
> +	if (groupc->total_tasks)
> +		record_times(groupc, cpu, false);
> +	else
> +		groupc->state_start = cpu_clock(cpu);

This change appears is a no-op. If there are no tasks,
groupc->state_mask is 0, and the only thing record_times() does is

	groupc->state_start = cpu_clock(cpu);

Did you encounter actual problems, or are you just reading the code?

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH] psi: fix randomized calculation in record_times()
  2020-04-03 15:20 ` Johannes Weiner
@ 2020-04-04  2:31   ` Yafang Shao
  0 siblings, 0 replies; 3+ messages in thread
From: Yafang Shao @ 2020-04-04  2:31 UTC (permalink / raw)
  To: Johannes Weiner
  Cc: Peter Zijlstra, Andrew Morton, Linux MM, LKML, Daniel Drake,
	Suren Baghdasaryan

On Fri, Apr 3, 2020 at 11:20 PM Johannes Weiner <hannes@cmpxchg.org> wrote:
>
> On Fri, Apr 03, 2020 at 04:55:25AM -0400, Yafang Shao wrote:
> > @@ -690,7 +690,10 @@ static void psi_group_change(struct psi_group *group, int cpu,
> >        */
> >       write_seqcount_begin(&groupc->seq);
> >
> > -     record_times(groupc, cpu, false);
> > +     if (groupc->total_tasks)
> > +             record_times(groupc, cpu, false);
> > +     else
> > +             groupc->state_start = cpu_clock(cpu);
>
> This change appears is a no-op. If there are no tasks,
> groupc->state_mask is 0, and the only thing record_times() does is
>
>         groupc->state_start = cpu_clock(cpu);
>
> Did you encounter actual problems, or are you just reading the code?

No real issues. Just reading the code and found it looks like a little weird.
But as you explained that state_mask can also guarantee it then we
don't need to make this change.

Thanks for the explanation.

Thanks
Yafang

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2020-04-04  2:31 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-04-03  8:55 [PATCH] psi: fix randomized calculation in record_times() Yafang Shao
2020-04-03 15:20 ` Johannes Weiner
2020-04-04  2:31   ` Yafang Shao

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).