* [PATCH] sched/psi: Optimise psi_group_change a bit
@ 2024-03-29 16:06 Tvrtko Ursulin
  2024-03-29 18:51 ` Johannes Weiner
  2024-04-01  6:46 ` Chengming Zhou
  0 siblings, 2 replies; 7+ messages in thread
From: Tvrtko Ursulin @ 2024-03-29 16:06 UTC (permalink / raw)
  To: linux-kernel
  Cc: Tvrtko Ursulin, Johannes Weiner, Suren Baghdasaryan,
	Peter Zijlstra, kernel-dev

From: Tvrtko Ursulin <tursulin@ursulin.net>

The current code loops over the psi_states only to call a helper which
then resolves back to the action needed for each state using a switch
statement. That is effectively creating a double indirection of a kind
which, given how all the states need to be explicitly listed and handled
anyway, we can simply remove. Both the for loop and the switch statement
that is.

The benefit is both in the code size and CPU time spent in this function.
YMMV but on my Steam Deck, while in a game, the patch makes the CPU usage
go from ~2.4% down to ~1.2%. Text size at the same time went from 0x323 to
0x2c1.
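
Since the transformation is purely mechanical it can be sanity checked
outside the kernel. The throwaway userspace sketch below is not part of
the patch: it re-declares the task counters, psi_states, PSI_ONCPU,
BIT() and unlikely() locally (values only assumed to mirror
include/linux/psi_types.h and kernel/sched/psi.c, so the snippet builds
with plain gcc) and brute-forces small task counts to confirm the old
per-state loop and the new test_states() always produce the same state
mask:

  /* Hypothetical harness, not kernel code: the local stand-ins below
   * are assumed to match the kernel definitions. */
  #include <assert.h>
  #include <stdbool.h>
  #include <stdint.h>
  #include <stdio.h>

  #define BIT(n)      (1U << (n))
  #define unlikely(x) (x)

  enum { NR_IOWAIT, NR_MEMSTALL, NR_RUNNING, NR_MEMSTALL_RUNNING,
         NR_PSI_TASK_COUNTS };
  enum psi_states { PSI_IO_SOME, PSI_IO_FULL, PSI_MEM_SOME, PSI_MEM_FULL,
                    PSI_CPU_SOME, PSI_CPU_FULL, PSI_NONIDLE, NR_PSI_STATES };
  #define PSI_ONCPU   BIT(NR_PSI_STATES)

  /* Old helper, copied from the pre-patch code. */
  static bool test_state(unsigned int *tasks, enum psi_states state, bool oncpu)
  {
          switch (state) {
          case PSI_IO_SOME:  return unlikely(tasks[NR_IOWAIT]);
          case PSI_IO_FULL:  return unlikely(tasks[NR_IOWAIT] && !tasks[NR_RUNNING]);
          case PSI_MEM_SOME: return unlikely(tasks[NR_MEMSTALL]);
          case PSI_MEM_FULL: return unlikely(tasks[NR_MEMSTALL] &&
                                  tasks[NR_RUNNING] == tasks[NR_MEMSTALL_RUNNING]);
          case PSI_CPU_SOME: return unlikely(tasks[NR_RUNNING] > oncpu);
          case PSI_CPU_FULL: return unlikely(tasks[NR_RUNNING] && !oncpu);
          case PSI_NONIDLE:  return tasks[NR_IOWAIT] || tasks[NR_MEMSTALL] ||
                                  tasks[NR_RUNNING];
          default:           return false;
          }
  }

  /* New helper, copied from this patch. */
  static uint32_t test_states(unsigned int *tasks, uint32_t state_mask)
  {
          const bool oncpu = state_mask & PSI_ONCPU;

          if (tasks[NR_IOWAIT]) {
                  state_mask |= BIT(PSI_IO_SOME);
                  if (!tasks[NR_RUNNING])
                          state_mask |= BIT(PSI_IO_FULL);
          }
          if (tasks[NR_MEMSTALL]) {
                  state_mask |= BIT(PSI_MEM_SOME);
                  if (tasks[NR_RUNNING] == tasks[NR_MEMSTALL_RUNNING])
                          state_mask |= BIT(PSI_MEM_FULL);
          }
          if (tasks[NR_RUNNING] > oncpu)
                  state_mask |= BIT(PSI_CPU_SOME);
          if (tasks[NR_RUNNING] && !oncpu)
                  state_mask |= BIT(PSI_CPU_FULL);
          if (tasks[NR_IOWAIT] || tasks[NR_MEMSTALL] || tasks[NR_RUNNING])
                  state_mask |= BIT(PSI_NONIDLE);
          return state_mask;
  }

  int main(void)
  {
          unsigned int t[NR_PSI_TASK_COUNTS];

          /* Exhaustively compare both helpers over small task counts. */
          for (t[NR_IOWAIT] = 0; t[NR_IOWAIT] < 3; t[NR_IOWAIT]++)
          for (t[NR_MEMSTALL] = 0; t[NR_MEMSTALL] < 3; t[NR_MEMSTALL]++)
          for (t[NR_RUNNING] = 0; t[NR_RUNNING] < 3; t[NR_RUNNING]++)
          for (t[NR_MEMSTALL_RUNNING] = 0; t[NR_MEMSTALL_RUNNING] < 3; t[NR_MEMSTALL_RUNNING]++)
          for (int oncpu = 0; oncpu < 2; oncpu++) {
                  uint32_t seed = oncpu ? PSI_ONCPU : 0;
                  uint32_t m_old = seed;

                  for (enum psi_states s = 0; s < NR_PSI_STATES; s++)
                          if (test_state(t, s, m_old & PSI_ONCPU))
                                  m_old |= BIT(s);

                  assert(m_old == test_states(t, seed));
          }
          puts("old and new state masks match");
          return 0;
  }

Running it should just print the confirmation line; any divergence
between the two versions would trip the assert.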

Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-kernel@vger.kernel.org
Cc: kernel-dev@igalia.com
---
 kernel/sched/psi.c | 54 +++++++++++++++++++++++-----------------------
 1 file changed, 27 insertions(+), 27 deletions(-)

diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c
index 7b4aa5809c0f..55720ecf420e 100644
--- a/kernel/sched/psi.c
+++ b/kernel/sched/psi.c
@@ -218,28 +218,32 @@ void __init psi_init(void)
 	group_init(&psi_system);
 }
 
-static bool test_state(unsigned int *tasks, enum psi_states state, bool oncpu)
+static u32 test_states(unsigned int *tasks, u32 state_mask)
 {
-	switch (state) {
-	case PSI_IO_SOME:
-		return unlikely(tasks[NR_IOWAIT]);
-	case PSI_IO_FULL:
-		return unlikely(tasks[NR_IOWAIT] && !tasks[NR_RUNNING]);
-	case PSI_MEM_SOME:
-		return unlikely(tasks[NR_MEMSTALL]);
-	case PSI_MEM_FULL:
-		return unlikely(tasks[NR_MEMSTALL] &&
-			tasks[NR_RUNNING] == tasks[NR_MEMSTALL_RUNNING]);
-	case PSI_CPU_SOME:
-		return unlikely(tasks[NR_RUNNING] > oncpu);
-	case PSI_CPU_FULL:
-		return unlikely(tasks[NR_RUNNING] && !oncpu);
-	case PSI_NONIDLE:
-		return tasks[NR_IOWAIT] || tasks[NR_MEMSTALL] ||
-			tasks[NR_RUNNING];
-	default:
-		return false;
+	const bool oncpu = state_mask & PSI_ONCPU;
+
+	if (tasks[NR_IOWAIT]) {
+		state_mask |= BIT(PSI_IO_SOME);
+		if (!tasks[NR_RUNNING])
+			state_mask |= BIT(PSI_IO_FULL);
 	}
+
+	if (tasks[NR_MEMSTALL]) {
+		state_mask |= BIT(PSI_MEM_SOME);
+		if (tasks[NR_RUNNING] == tasks[NR_MEMSTALL_RUNNING])
+			state_mask |= BIT(PSI_MEM_FULL);
+	}
+
+	if (tasks[NR_RUNNING] > oncpu)
+		state_mask |= BIT(PSI_CPU_SOME);
+
+	if (tasks[NR_RUNNING] && !oncpu)
+		state_mask |= BIT(PSI_CPU_FULL);
+
+	if (tasks[NR_IOWAIT] || tasks[NR_MEMSTALL] || tasks[NR_RUNNING])
+		state_mask |= BIT(PSI_NONIDLE);
+
+	return state_mask;
 }
 
 static void get_recent_times(struct psi_group *group, int cpu,
@@ -770,7 +774,6 @@ static void psi_group_change(struct psi_group *group, int cpu,
 {
 	struct psi_group_cpu *groupc;
 	unsigned int t, m;
-	enum psi_states s;
 	u32 state_mask;
 
 	groupc = per_cpu_ptr(group->pcpu, cpu);
@@ -841,10 +844,7 @@ static void psi_group_change(struct psi_group *group, int cpu,
 		return;
 	}
 
-	for (s = 0; s < NR_PSI_STATES; s++) {
-		if (test_state(groupc->tasks, s, state_mask & PSI_ONCPU))
-			state_mask |= (1 << s);
-	}
+	state_mask = test_states(groupc->tasks, state_mask);
 
 	/*
 	 * Since we care about lost potential, a memstall is FULL
@@ -1194,7 +1194,7 @@ void psi_cgroup_restart(struct psi_group *group)
 	/*
 	 * After we disable psi_group->enabled, we don't actually
 	 * stop percpu tasks accounting in each psi_group_cpu,
-	 * instead only stop test_state() loop, record_times()
+	 * instead only stop test_states() loop, record_times()
 	 * and averaging worker, see psi_group_change() for details.
 	 *
 	 * When disable cgroup PSI, this function has nothing to sync
@@ -1202,7 +1202,7 @@ void psi_cgroup_restart(struct psi_group *group)
 	 * would see !psi_group->enabled and only do task accounting.
 	 *
 	 * When re-enable cgroup PSI, this function use psi_group_change()
-	 * to get correct state mask from test_state() loop on tasks[],
+	 * to get correct state mask from test_states() loop on tasks[],
 	 * and restart groupc->state_start from now, use .clear = .set = 0
 	 * here since no task status really changed.
 	 */
-- 
2.44.0



* Re: [PATCH] sched/psi: Optimise psi_group_change a bit
  2024-03-29 16:06 [PATCH] sched/psi: Optimise psi_group_change a bit Tvrtko Ursulin
@ 2024-03-29 18:51 ` Johannes Weiner
  2024-04-09 22:38   ` Johannes Weiner
  2024-04-01  6:46 ` Chengming Zhou
  1 sibling, 1 reply; 7+ messages in thread
From: Johannes Weiner @ 2024-03-29 18:51 UTC (permalink / raw)
  To: Tvrtko Ursulin
  Cc: linux-kernel, Tvrtko Ursulin, Suren Baghdasaryan, Peter Zijlstra,
	Ingo Molnar, kernel-dev

On Fri, Mar 29, 2024 at 04:06:48PM +0000, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin <tursulin@ursulin.net>
> 
> The current code loops over the psi_states only to call a helper which
> then resolves back to the action needed for each state using a switch
> statement. That is effectively creating a double indirection of a kind
> which, given how all the states need to be explicitly listed and handled
> anyway, we can simply remove. Both the for loop and the switch statement
> that is.
> 
> The benefit is both in the code size and CPU time spent in this function.
> YMMV but on my Steam Deck, while in a game, the patch makes the CPU usage
> go from ~2.4% down to ~1.2%. Text size at the same time went from 0x323 to
> 0x2c1.
> 
> Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Suren Baghdasaryan <surenb@google.com>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: linux-kernel@vger.kernel.org
> Cc: kernel-dev@igalia.com

This is great.

Acked-by: Johannes Weiner <hannes@cmpxchg.org>

Ingo, would you mind please taking this through the scheduler tree? I
think Peter is still out.

Remaining quote below.

Thanks

> ---
>  kernel/sched/psi.c | 54 +++++++++++++++++++++++-----------------------
>  1 file changed, 27 insertions(+), 27 deletions(-)
> 
> diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c
> index 7b4aa5809c0f..55720ecf420e 100644
> --- a/kernel/sched/psi.c
> +++ b/kernel/sched/psi.c
> @@ -218,28 +218,32 @@ void __init psi_init(void)
>  	group_init(&psi_system);
>  }
>  
> -static bool test_state(unsigned int *tasks, enum psi_states state, bool oncpu)
> +static u32 test_states(unsigned int *tasks, u32 state_mask)
>  {
> -	switch (state) {
> -	case PSI_IO_SOME:
> -		return unlikely(tasks[NR_IOWAIT]);
> -	case PSI_IO_FULL:
> -		return unlikely(tasks[NR_IOWAIT] && !tasks[NR_RUNNING]);
> -	case PSI_MEM_SOME:
> -		return unlikely(tasks[NR_MEMSTALL]);
> -	case PSI_MEM_FULL:
> -		return unlikely(tasks[NR_MEMSTALL] &&
> -			tasks[NR_RUNNING] == tasks[NR_MEMSTALL_RUNNING]);
> -	case PSI_CPU_SOME:
> -		return unlikely(tasks[NR_RUNNING] > oncpu);
> -	case PSI_CPU_FULL:
> -		return unlikely(tasks[NR_RUNNING] && !oncpu);
> -	case PSI_NONIDLE:
> -		return tasks[NR_IOWAIT] || tasks[NR_MEMSTALL] ||
> -			tasks[NR_RUNNING];
> -	default:
> -		return false;
> +	const bool oncpu = state_mask & PSI_ONCPU;
> +
> +	if (tasks[NR_IOWAIT]) {
> +		state_mask |= BIT(PSI_IO_SOME);
> +		if (!tasks[NR_RUNNING])
> +			state_mask |= BIT(PSI_IO_FULL);
>  	}
> +
> +	if (tasks[NR_MEMSTALL]) {
> +		state_mask |= BIT(PSI_MEM_SOME);
> +		if (tasks[NR_RUNNING] == tasks[NR_MEMSTALL_RUNNING])
> +			state_mask |= BIT(PSI_MEM_FULL);
> +	}
> +
> +	if (tasks[NR_RUNNING] > oncpu)
> +		state_mask |= BIT(PSI_CPU_SOME);
> +
> +	if (tasks[NR_RUNNING] && !oncpu)
> +		state_mask |= BIT(PSI_CPU_FULL);
> +
> +	if (tasks[NR_IOWAIT] || tasks[NR_MEMSTALL] || tasks[NR_RUNNING])
> +		state_mask |= BIT(PSI_NONIDLE);
> +
> +	return state_mask;
>  }
>  
>  static void get_recent_times(struct psi_group *group, int cpu,
> @@ -770,7 +774,6 @@ static void psi_group_change(struct psi_group *group, int cpu,
>  {
>  	struct psi_group_cpu *groupc;
>  	unsigned int t, m;
> -	enum psi_states s;
>  	u32 state_mask;
>  
>  	groupc = per_cpu_ptr(group->pcpu, cpu);
> @@ -841,10 +844,7 @@ static void psi_group_change(struct psi_group *group, int cpu,
>  		return;
>  	}
>  
> -	for (s = 0; s < NR_PSI_STATES; s++) {
> -		if (test_state(groupc->tasks, s, state_mask & PSI_ONCPU))
> -			state_mask |= (1 << s);
> -	}
> +	state_mask = test_states(groupc->tasks, state_mask);
>  
>  	/*
>  	 * Since we care about lost potential, a memstall is FULL
> @@ -1194,7 +1194,7 @@ void psi_cgroup_restart(struct psi_group *group)
>  	/*
>  	 * After we disable psi_group->enabled, we don't actually
>  	 * stop percpu tasks accounting in each psi_group_cpu,
> -	 * instead only stop test_state() loop, record_times()
> +	 * instead only stop test_states() loop, record_times()
>  	 * and averaging worker, see psi_group_change() for details.
>  	 *
>  	 * When disable cgroup PSI, this function has nothing to sync
> @@ -1202,7 +1202,7 @@ void psi_cgroup_restart(struct psi_group *group)
>  	 * would see !psi_group->enabled and only do task accounting.
>  	 *
>  	 * When re-enable cgroup PSI, this function use psi_group_change()
> -	 * to get correct state mask from test_state() loop on tasks[],
> +	 * to get correct state mask from test_states() loop on tasks[],
>  	 * and restart groupc->state_start from now, use .clear = .set = 0
>  	 * here since no task status really changed.
>  	 */
> -- 
> 2.44.0
> 


* Re: [PATCH] sched/psi: Optimise psi_group_change a bit
  2024-03-29 16:06 [PATCH] sched/psi: Optimise psi_group_change a bit Tvrtko Ursulin
  2024-03-29 18:51 ` Johannes Weiner
@ 2024-04-01  6:46 ` Chengming Zhou
  1 sibling, 0 replies; 7+ messages in thread
From: Chengming Zhou @ 2024-04-01  6:46 UTC (permalink / raw)
  To: Tvrtko Ursulin, linux-kernel
  Cc: Tvrtko Ursulin, Johannes Weiner, Suren Baghdasaryan,
	Peter Zijlstra, kernel-dev

On 2024/3/30 00:06, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin <tursulin@ursulin.net>
> 
> The current code loops over the psi_states only to call a helper which
> then resolves back to the action needed for each state using a switch
> statement. That is effectively creating a double indirection of a kind
> which, given how all the states need to be explicitly listed and handled
> anyway, we can simply remove. Both the for loop and the switch statement
> that is.
> 
> The benefit is both in the code size and CPU time spent in this function.
> YMMV but on my Steam Deck, while in a game, the patch makes the CPU usage
> go from ~2.4% down to ~1.2%. Text size at the same time went from 0x323 to
> 0x2c1.
> 
> Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Suren Baghdasaryan <surenb@google.com>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: linux-kernel@vger.kernel.org
> Cc: kernel-dev@igalia.com

Nice!

Reviewed-by: Chengming Zhou <chengming.zhou@linux.dev>

Thanks.

> ---
>  kernel/sched/psi.c | 54 +++++++++++++++++++++++-----------------------
>  1 file changed, 27 insertions(+), 27 deletions(-)
> 
> diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c
> index 7b4aa5809c0f..55720ecf420e 100644
> --- a/kernel/sched/psi.c
> +++ b/kernel/sched/psi.c
> @@ -218,28 +218,32 @@ void __init psi_init(void)
>  	group_init(&psi_system);
>  }
>  
> -static bool test_state(unsigned int *tasks, enum psi_states state, bool oncpu)
> +static u32 test_states(unsigned int *tasks, u32 state_mask)
>  {
> -	switch (state) {
> -	case PSI_IO_SOME:
> -		return unlikely(tasks[NR_IOWAIT]);
> -	case PSI_IO_FULL:
> -		return unlikely(tasks[NR_IOWAIT] && !tasks[NR_RUNNING]);
> -	case PSI_MEM_SOME:
> -		return unlikely(tasks[NR_MEMSTALL]);
> -	case PSI_MEM_FULL:
> -		return unlikely(tasks[NR_MEMSTALL] &&
> -			tasks[NR_RUNNING] == tasks[NR_MEMSTALL_RUNNING]);
> -	case PSI_CPU_SOME:
> -		return unlikely(tasks[NR_RUNNING] > oncpu);
> -	case PSI_CPU_FULL:
> -		return unlikely(tasks[NR_RUNNING] && !oncpu);
> -	case PSI_NONIDLE:
> -		return tasks[NR_IOWAIT] || tasks[NR_MEMSTALL] ||
> -			tasks[NR_RUNNING];
> -	default:
> -		return false;
> +	const bool oncpu = state_mask & PSI_ONCPU;
> +
> +	if (tasks[NR_IOWAIT]) {
> +		state_mask |= BIT(PSI_IO_SOME);
> +		if (!tasks[NR_RUNNING])
> +			state_mask |= BIT(PSI_IO_FULL);
>  	}
> +
> +	if (tasks[NR_MEMSTALL]) {
> +		state_mask |= BIT(PSI_MEM_SOME);
> +		if (tasks[NR_RUNNING] == tasks[NR_MEMSTALL_RUNNING])
> +			state_mask |= BIT(PSI_MEM_FULL);
> +	}
> +
> +	if (tasks[NR_RUNNING] > oncpu)
> +		state_mask |= BIT(PSI_CPU_SOME);
> +
> +	if (tasks[NR_RUNNING] && !oncpu)
> +		state_mask |= BIT(PSI_CPU_FULL);
> +
> +	if (tasks[NR_IOWAIT] || tasks[NR_MEMSTALL] || tasks[NR_RUNNING])
> +		state_mask |= BIT(PSI_NONIDLE);
> +
> +	return state_mask;
>  }
>  
>  static void get_recent_times(struct psi_group *group, int cpu,
> @@ -770,7 +774,6 @@ static void psi_group_change(struct psi_group *group, int cpu,
>  {
>  	struct psi_group_cpu *groupc;
>  	unsigned int t, m;
> -	enum psi_states s;
>  	u32 state_mask;
>  
>  	groupc = per_cpu_ptr(group->pcpu, cpu);
> @@ -841,10 +844,7 @@ static void psi_group_change(struct psi_group *group, int cpu,
>  		return;
>  	}
>  
> -	for (s = 0; s < NR_PSI_STATES; s++) {
> -		if (test_state(groupc->tasks, s, state_mask & PSI_ONCPU))
> -			state_mask |= (1 << s);
> -	}
> +	state_mask = test_states(groupc->tasks, state_mask);
>  
>  	/*
>  	 * Since we care about lost potential, a memstall is FULL
> @@ -1194,7 +1194,7 @@ void psi_cgroup_restart(struct psi_group *group)
>  	/*
>  	 * After we disable psi_group->enabled, we don't actually
>  	 * stop percpu tasks accounting in each psi_group_cpu,
> -	 * instead only stop test_state() loop, record_times()
> +	 * instead only stop test_states() loop, record_times()
>  	 * and averaging worker, see psi_group_change() for details.
>  	 *
>  	 * When disable cgroup PSI, this function has nothing to sync
> @@ -1202,7 +1202,7 @@ void psi_cgroup_restart(struct psi_group *group)
>  	 * would see !psi_group->enabled and only do task accounting.
>  	 *
>  	 * When re-enable cgroup PSI, this function use psi_group_change()
> -	 * to get correct state mask from test_state() loop on tasks[],
> +	 * to get correct state mask from test_states() loop on tasks[],
>  	 * and restart groupc->state_start from now, use .clear = .set = 0
>  	 * here since no task status really changed.
>  	 */


* Re: [PATCH] sched/psi: Optimise psi_group_change a bit
  2024-03-29 18:51 ` Johannes Weiner
@ 2024-04-09 22:38   ` Johannes Weiner
  2024-04-18  8:54     ` Tvrtko Ursulin
  0 siblings, 1 reply; 7+ messages in thread
From: Johannes Weiner @ 2024-04-09 22:38 UTC (permalink / raw)
  To: Tvrtko Ursulin
  Cc: linux-kernel, Tvrtko Ursulin, Suren Baghdasaryan, Peter Zijlstra,
	Ingo Molnar, kernel-dev

[ Oops, I still had an old mutt alias for Ingo's address. ]

Ingo, would you mind taking this through the scheduler tree?

On Fri, Mar 29, 2024 at 02:51:53PM -0400, Johannes Weiner wrote:
> On Fri, Mar 29, 2024 at 04:06:48PM +0000, Tvrtko Ursulin wrote:
> > From: Tvrtko Ursulin <tursulin@ursulin.net>
> > 
> > The current code loops over the psi_states only to call a helper which
> > then resolves back to the action needed for each state using a switch
> > statement. That is effectively creating a double indirection of a kind
> > which, given how all the states need to be explicitly listed and handled
> > anyway, we can simply remove. Both the for loop and the switch statement
> > that is.
> > 
> > The benefit is both in the code size and CPU time spent in this function.
> > YMMV but on my Steam Deck, while in a game, the patch makes the CPU usage
> > go from ~2.4% down to ~1.2%. Text size at the same time went from 0x323 to
> > 0x2c1.
> > 
> > Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net>
> > Cc: Johannes Weiner <hannes@cmpxchg.org>
> > Cc: Suren Baghdasaryan <surenb@google.com>
> > Cc: Peter Zijlstra <peterz@infradead.org>
> > Cc: linux-kernel@vger.kernel.org
> > Cc: kernel-dev@igalia.com
> 
> This is great.
> 
> Acked-by: Johannes Weiner <hannes@cmpxchg.org>
> 
> Ingo, would you mind please taking this through the scheduler tree? I
> think Peter is still out.
> 
> Remaining quote below.
> 
> Thanks
> 
> > ---
> >  kernel/sched/psi.c | 54 +++++++++++++++++++++++-----------------------
> >  1 file changed, 27 insertions(+), 27 deletions(-)
> > 
> > diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c
> > index 7b4aa5809c0f..55720ecf420e 100644
> > --- a/kernel/sched/psi.c
> > +++ b/kernel/sched/psi.c
> > @@ -218,28 +218,32 @@ void __init psi_init(void)
> >  	group_init(&psi_system);
> >  }
> >  
> > -static bool test_state(unsigned int *tasks, enum psi_states state, bool oncpu)
> > +static u32 test_states(unsigned int *tasks, u32 state_mask)
> >  {
> > -	switch (state) {
> > -	case PSI_IO_SOME:
> > -		return unlikely(tasks[NR_IOWAIT]);
> > -	case PSI_IO_FULL:
> > -		return unlikely(tasks[NR_IOWAIT] && !tasks[NR_RUNNING]);
> > -	case PSI_MEM_SOME:
> > -		return unlikely(tasks[NR_MEMSTALL]);
> > -	case PSI_MEM_FULL:
> > -		return unlikely(tasks[NR_MEMSTALL] &&
> > -			tasks[NR_RUNNING] == tasks[NR_MEMSTALL_RUNNING]);
> > -	case PSI_CPU_SOME:
> > -		return unlikely(tasks[NR_RUNNING] > oncpu);
> > -	case PSI_CPU_FULL:
> > -		return unlikely(tasks[NR_RUNNING] && !oncpu);
> > -	case PSI_NONIDLE:
> > -		return tasks[NR_IOWAIT] || tasks[NR_MEMSTALL] ||
> > -			tasks[NR_RUNNING];
> > -	default:
> > -		return false;
> > +	const bool oncpu = state_mask & PSI_ONCPU;
> > +
> > +	if (tasks[NR_IOWAIT]) {
> > +		state_mask |= BIT(PSI_IO_SOME);
> > +		if (!tasks[NR_RUNNING])
> > +			state_mask |= BIT(PSI_IO_FULL);
> >  	}
> > +
> > +	if (tasks[NR_MEMSTALL]) {
> > +		state_mask |= BIT(PSI_MEM_SOME);
> > +		if (tasks[NR_RUNNING] == tasks[NR_MEMSTALL_RUNNING])
> > +			state_mask |= BIT(PSI_MEM_FULL);
> > +	}
> > +
> > +	if (tasks[NR_RUNNING] > oncpu)
> > +		state_mask |= BIT(PSI_CPU_SOME);
> > +
> > +	if (tasks[NR_RUNNING] && !oncpu)
> > +		state_mask |= BIT(PSI_CPU_FULL);
> > +
> > +	if (tasks[NR_IOWAIT] || tasks[NR_MEMSTALL] || tasks[NR_RUNNING])
> > +		state_mask |= BIT(PSI_NONIDLE);
> > +
> > +	return state_mask;
> >  }
> >  
> >  static void get_recent_times(struct psi_group *group, int cpu,
> > @@ -770,7 +774,6 @@ static void psi_group_change(struct psi_group *group, int cpu,
> >  {
> >  	struct psi_group_cpu *groupc;
> >  	unsigned int t, m;
> > -	enum psi_states s;
> >  	u32 state_mask;
> >  
> >  	groupc = per_cpu_ptr(group->pcpu, cpu);
> > @@ -841,10 +844,7 @@ static void psi_group_change(struct psi_group *group, int cpu,
> >  		return;
> >  	}
> >  
> > -	for (s = 0; s < NR_PSI_STATES; s++) {
> > -		if (test_state(groupc->tasks, s, state_mask & PSI_ONCPU))
> > -			state_mask |= (1 << s);
> > -	}
> > +	state_mask = test_states(groupc->tasks, state_mask);
> >  
> >  	/*
> >  	 * Since we care about lost potential, a memstall is FULL
> > @@ -1194,7 +1194,7 @@ void psi_cgroup_restart(struct psi_group *group)
> >  	/*
> >  	 * After we disable psi_group->enabled, we don't actually
> >  	 * stop percpu tasks accounting in each psi_group_cpu,
> > -	 * instead only stop test_state() loop, record_times()
> > +	 * instead only stop test_states() loop, record_times()
> >  	 * and averaging worker, see psi_group_change() for details.
> >  	 *
> >  	 * When disable cgroup PSI, this function has nothing to sync
> > @@ -1202,7 +1202,7 @@ void psi_cgroup_restart(struct psi_group *group)
> >  	 * would see !psi_group->enabled and only do task accounting.
> >  	 *
> >  	 * When re-enable cgroup PSI, this function use psi_group_change()
> > -	 * to get correct state mask from test_state() loop on tasks[],
> > +	 * to get correct state mask from test_states() loop on tasks[],
> >  	 * and restart groupc->state_start from now, use .clear = .set = 0
> >  	 * here since no task status really changed.
> >  	 */
> > -- 
> > 2.44.0
> > 


* Re: [PATCH] sched/psi: Optimise psi_group_change a bit
  2024-04-09 22:38   ` Johannes Weiner
@ 2024-04-18  8:54     ` Tvrtko Ursulin
  2024-05-01 15:09       ` Tvrtko Ursulin
  0 siblings, 1 reply; 7+ messages in thread
From: Tvrtko Ursulin @ 2024-04-18  8:54 UTC (permalink / raw)
  To: Tvrtko Ursulin, Ingo Molnar
  Cc: linux-kernel, Tvrtko Ursulin, Suren Baghdasaryan, Peter Zijlstra,
	kernel-dev, Johannes Weiner


Hi Ingo,

On 09/04/2024 23:38, Johannes Weiner wrote:
> [ Oops, I still had an old mutt alias for Ingo's address. ]
> 
> Ingo, would you mind taking this through the scheduler tree?

Gentle reminder so this one does not fall through the cracks.

Regards,

Tvrtko

> On Fri, Mar 29, 2024 at 02:51:53PM -0400, Johannes Weiner wrote:
>> On Fri, Mar 29, 2024 at 04:06:48PM +0000, Tvrtko Ursulin wrote:
>>> From: Tvrtko Ursulin <tursulin@ursulin.net>
>>>
>>> The current code loops over the psi_states only to call a helper which
>>> then resolves back to the action needed for each state using a switch
>>> statement. That is effectively creating a double indirection of a kind
>>> which, given how all the states need to be explicitly listed and handled
>>> anyway, we can simply remove. Both the for loop and the switch statement
>>> that is.
>>>
>>> The benefit is both in the code size and CPU time spent in this function.
>>> YMMV but on my Steam Deck, while in a game, the patch makes the CPU usage
>>> go from ~2.4% down to ~1.2%. Text size at the same time went from 0x323 to
>>> 0x2c1.
>>>
>>> Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net>
>>> Cc: Johannes Weiner <hannes@cmpxchg.org>
>>> Cc: Suren Baghdasaryan <surenb@google.com>
>>> Cc: Peter Zijlstra <peterz@infradead.org>
>>> Cc: linux-kernel@vger.kernel.org
>>> Cc: kernel-dev@igalia.com
>>
>> This is great.
>>
>> Acked-by: Johannes Weiner <hannes@cmpxchg.org>
>>
>> Ingo, would you mind please taking this through the scheduler tree? I
>> think Peter is still out.
>>
>> Remaining quote below.
>>
>> Thanks
>>
>>> ---
>>>   kernel/sched/psi.c | 54 +++++++++++++++++++++++-----------------------
>>>   1 file changed, 27 insertions(+), 27 deletions(-)
>>>
>>> diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c
>>> index 7b4aa5809c0f..55720ecf420e 100644
>>> --- a/kernel/sched/psi.c
>>> +++ b/kernel/sched/psi.c
>>> @@ -218,28 +218,32 @@ void __init psi_init(void)
>>>   	group_init(&psi_system);
>>>   }
>>>   
>>> -static bool test_state(unsigned int *tasks, enum psi_states state, bool oncpu)
>>> +static u32 test_states(unsigned int *tasks, u32 state_mask)
>>>   {
>>> -	switch (state) {
>>> -	case PSI_IO_SOME:
>>> -		return unlikely(tasks[NR_IOWAIT]);
>>> -	case PSI_IO_FULL:
>>> -		return unlikely(tasks[NR_IOWAIT] && !tasks[NR_RUNNING]);
>>> -	case PSI_MEM_SOME:
>>> -		return unlikely(tasks[NR_MEMSTALL]);
>>> -	case PSI_MEM_FULL:
>>> -		return unlikely(tasks[NR_MEMSTALL] &&
>>> -			tasks[NR_RUNNING] == tasks[NR_MEMSTALL_RUNNING]);
>>> -	case PSI_CPU_SOME:
>>> -		return unlikely(tasks[NR_RUNNING] > oncpu);
>>> -	case PSI_CPU_FULL:
>>> -		return unlikely(tasks[NR_RUNNING] && !oncpu);
>>> -	case PSI_NONIDLE:
>>> -		return tasks[NR_IOWAIT] || tasks[NR_MEMSTALL] ||
>>> -			tasks[NR_RUNNING];
>>> -	default:
>>> -		return false;
>>> +	const bool oncpu = state_mask & PSI_ONCPU;
>>> +
>>> +	if (tasks[NR_IOWAIT]) {
>>> +		state_mask |= BIT(PSI_IO_SOME);
>>> +		if (!tasks[NR_RUNNING])
>>> +			state_mask |= BIT(PSI_IO_FULL);
>>>   	}
>>> +
>>> +	if (tasks[NR_MEMSTALL]) {
>>> +		state_mask |= BIT(PSI_MEM_SOME);
>>> +		if (tasks[NR_RUNNING] == tasks[NR_MEMSTALL_RUNNING])
>>> +			state_mask |= BIT(PSI_MEM_FULL);
>>> +	}
>>> +
>>> +	if (tasks[NR_RUNNING] > oncpu)
>>> +		state_mask |= BIT(PSI_CPU_SOME);
>>> +
>>> +	if (tasks[NR_RUNNING] && !oncpu)
>>> +		state_mask |= BIT(PSI_CPU_FULL);
>>> +
>>> +	if (tasks[NR_IOWAIT] || tasks[NR_MEMSTALL] || tasks[NR_RUNNING])
>>> +		state_mask |= BIT(PSI_NONIDLE);
>>> +
>>> +	return state_mask;
>>>   }
>>>   
>>>   static void get_recent_times(struct psi_group *group, int cpu,
>>> @@ -770,7 +774,6 @@ static void psi_group_change(struct psi_group *group, int cpu,
>>>   {
>>>   	struct psi_group_cpu *groupc;
>>>   	unsigned int t, m;
>>> -	enum psi_states s;
>>>   	u32 state_mask;
>>>   
>>>   	groupc = per_cpu_ptr(group->pcpu, cpu);
>>> @@ -841,10 +844,7 @@ static void psi_group_change(struct psi_group *group, int cpu,
>>>   		return;
>>>   	}
>>>   
>>> -	for (s = 0; s < NR_PSI_STATES; s++) {
>>> -		if (test_state(groupc->tasks, s, state_mask & PSI_ONCPU))
>>> -			state_mask |= (1 << s);
>>> -	}
>>> +	state_mask = test_states(groupc->tasks, state_mask);
>>>   
>>>   	/*
>>>   	 * Since we care about lost potential, a memstall is FULL
>>> @@ -1194,7 +1194,7 @@ void psi_cgroup_restart(struct psi_group *group)
>>>   	/*
>>>   	 * After we disable psi_group->enabled, we don't actually
>>>   	 * stop percpu tasks accounting in each psi_group_cpu,
>>> -	 * instead only stop test_state() loop, record_times()
>>> +	 * instead only stop test_states() loop, record_times()
>>>   	 * and averaging worker, see psi_group_change() for details.
>>>   	 *
>>>   	 * When disable cgroup PSI, this function has nothing to sync
>>> @@ -1202,7 +1202,7 @@ void psi_cgroup_restart(struct psi_group *group)
>>>   	 * would see !psi_group->enabled and only do task accounting.
>>>   	 *
>>>   	 * When re-enable cgroup PSI, this function use psi_group_change()
>>> -	 * to get correct state mask from test_state() loop on tasks[],
>>> +	 * to get correct state mask from test_states() loop on tasks[],
>>>   	 * and restart groupc->state_start from now, use .clear = .set = 0
>>>   	 * here since no task status really changed.
>>>   	 */
>>> -- 
>>> 2.44.0
>>>


* Re: [PATCH] sched/psi: Optimise psi_group_change a bit
  2024-04-18  8:54     ` Tvrtko Ursulin
@ 2024-05-01 15:09       ` Tvrtko Ursulin
  2024-05-22  8:42         ` Tvrtko Ursulin
  0 siblings, 1 reply; 7+ messages in thread
From: Tvrtko Ursulin @ 2024-05-01 15:09 UTC (permalink / raw)
  To: Tvrtko Ursulin
  Cc: linux-kernel, Tvrtko Ursulin, Suren Baghdasaryan, Peter Zijlstra,
	kernel-dev, Johannes Weiner, Juri Lelli, Vincent Guittot,
	Ingo Molnar


Hi,

Adding more scheduler maintainers to Cc - hopefully someone can merge to 
the relevant tree?

Thanks,

Tvrtko

On 18/04/2024 09:54, Tvrtko Ursulin wrote:
> 
> Hi Ingo,
> 
> On 09/04/2024 23:38, Johannes Weiner wrote:
>> [ Oops, I still had an old mutt alias for Ingo's address. ]
>>
>> Ingo, would you mind taking this through the scheduler tree?
> 
> Gentle reminder so this one does not fall through the cracks.
> 
> Regards,
> 
> Tvrtko
> 
>> On Fri, Mar 29, 2024 at 02:51:53PM -0400, Johannes Weiner wrote:
>>> On Fri, Mar 29, 2024 at 04:06:48PM +0000, Tvrtko Ursulin wrote:
>>>> From: Tvrtko Ursulin <tursulin@ursulin.net>
>>>>
>>>> The current code loops over the psi_states only to call a helper which
>>>> then resolves back to the action needed for each state using a switch
>>>> statement. That is effectively creating a double indirection of a kind
>>>> which, given how all the states need to be explicitly listed and 
>>>> handled
>>>> anyway, we can simply remove. Both the for loop and the switch 
>>>> statement
>>>> that is.
>>>>
>>>> The benefit is both in the code size and CPU time spent in this 
>>>> function.
>>>> YMMV but on my Steam Deck, while in a game, the patch makes the CPU 
>>>> usage
>>>> go from ~2.4% down to ~1.2%. Text size at the same time went from 
>>>> 0x323 to
>>>> 0x2c1.
>>>>
>>>> Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net>
>>>> Cc: Johannes Weiner <hannes@cmpxchg.org>
>>>> Cc: Suren Baghdasaryan <surenb@google.com>
>>>> Cc: Peter Zijlstra <peterz@infradead.org>
>>>> Cc: linux-kernel@vger.kernel.org
>>>> Cc: kernel-dev@igalia.com
>>>
>>> This is great.
>>>
>>> Acked-by: Johannes Weiner <hannes@cmpxchg.org>
>>>
>>> Ingo, would you mind please taking this through the scheduler tree? I
>>> think Peter is still out.
>>>
>>> Remaining quote below.
>>>
>>> Thanks
>>>
>>>> ---
>>>>   kernel/sched/psi.c | 54 
>>>> +++++++++++++++++++++++-----------------------
>>>>   1 file changed, 27 insertions(+), 27 deletions(-)
>>>>
>>>> diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c
>>>> index 7b4aa5809c0f..55720ecf420e 100644
>>>> --- a/kernel/sched/psi.c
>>>> +++ b/kernel/sched/psi.c
>>>> @@ -218,28 +218,32 @@ void __init psi_init(void)
>>>>       group_init(&psi_system);
>>>>   }
>>>> -static bool test_state(unsigned int *tasks, enum psi_states state, 
>>>> bool oncpu)
>>>> +static u32 test_states(unsigned int *tasks, u32 state_mask)
>>>>   {
>>>> -    switch (state) {
>>>> -    case PSI_IO_SOME:
>>>> -        return unlikely(tasks[NR_IOWAIT]);
>>>> -    case PSI_IO_FULL:
>>>> -        return unlikely(tasks[NR_IOWAIT] && !tasks[NR_RUNNING]);
>>>> -    case PSI_MEM_SOME:
>>>> -        return unlikely(tasks[NR_MEMSTALL]);
>>>> -    case PSI_MEM_FULL:
>>>> -        return unlikely(tasks[NR_MEMSTALL] &&
>>>> -            tasks[NR_RUNNING] == tasks[NR_MEMSTALL_RUNNING]);
>>>> -    case PSI_CPU_SOME:
>>>> -        return unlikely(tasks[NR_RUNNING] > oncpu);
>>>> -    case PSI_CPU_FULL:
>>>> -        return unlikely(tasks[NR_RUNNING] && !oncpu);
>>>> -    case PSI_NONIDLE:
>>>> -        return tasks[NR_IOWAIT] || tasks[NR_MEMSTALL] ||
>>>> -            tasks[NR_RUNNING];
>>>> -    default:
>>>> -        return false;
>>>> +    const bool oncpu = state_mask & PSI_ONCPU;
>>>> +
>>>> +    if (tasks[NR_IOWAIT]) {
>>>> +        state_mask |= BIT(PSI_IO_SOME);
>>>> +        if (!tasks[NR_RUNNING])
>>>> +            state_mask |= BIT(PSI_IO_FULL);
>>>>       }
>>>> +
>>>> +    if (tasks[NR_MEMSTALL]) {
>>>> +        state_mask |= BIT(PSI_MEM_SOME);
>>>> +        if (tasks[NR_RUNNING] == tasks[NR_MEMSTALL_RUNNING])
>>>> +            state_mask |= BIT(PSI_MEM_FULL);
>>>> +    }
>>>> +
>>>> +    if (tasks[NR_RUNNING] > oncpu)
>>>> +        state_mask |= BIT(PSI_CPU_SOME);
>>>> +
>>>> +    if (tasks[NR_RUNNING] && !oncpu)
>>>> +        state_mask |= BIT(PSI_CPU_FULL);
>>>> +
>>>> +    if (tasks[NR_IOWAIT] || tasks[NR_MEMSTALL] || tasks[NR_RUNNING])
>>>> +        state_mask |= BIT(PSI_NONIDLE);
>>>> +
>>>> +    return state_mask;
>>>>   }
>>>>   static void get_recent_times(struct psi_group *group, int cpu,
>>>> @@ -770,7 +774,6 @@ static void psi_group_change(struct psi_group 
>>>> *group, int cpu,
>>>>   {
>>>>       struct psi_group_cpu *groupc;
>>>>       unsigned int t, m;
>>>> -    enum psi_states s;
>>>>       u32 state_mask;
>>>>       groupc = per_cpu_ptr(group->pcpu, cpu);
>>>> @@ -841,10 +844,7 @@ static void psi_group_change(struct psi_group 
>>>> *group, int cpu,
>>>>           return;
>>>>       }
>>>> -    for (s = 0; s < NR_PSI_STATES; s++) {
>>>> -        if (test_state(groupc->tasks, s, state_mask & PSI_ONCPU))
>>>> -            state_mask |= (1 << s);
>>>> -    }
>>>> +    state_mask = test_states(groupc->tasks, state_mask);
>>>>       /*
>>>>        * Since we care about lost potential, a memstall is FULL
>>>> @@ -1194,7 +1194,7 @@ void psi_cgroup_restart(struct psi_group *group)
>>>>       /*
>>>>        * After we disable psi_group->enabled, we don't actually
>>>>        * stop percpu tasks accounting in each psi_group_cpu,
>>>> -     * instead only stop test_state() loop, record_times()
>>>> +     * instead only stop test_states() loop, record_times()
>>>>        * and averaging worker, see psi_group_change() for details.
>>>>        *
>>>>        * When disable cgroup PSI, this function has nothing to sync
>>>> @@ -1202,7 +1202,7 @@ void psi_cgroup_restart(struct psi_group *group)
>>>>        * would see !psi_group->enabled and only do task accounting.
>>>>        *
>>>>        * When re-enable cgroup PSI, this function use 
>>>> psi_group_change()
>>>> -     * to get correct state mask from test_state() loop on tasks[],
>>>> +     * to get correct state mask from test_states() loop on tasks[],
>>>>        * and restart groupc->state_start from now, use .clear = .set 
>>>> = 0
>>>>        * here since no task status really changed.
>>>>        */
>>>> -- 
>>>> 2.44.0
>>>>


* Re: [PATCH] sched/psi: Optimise psi_group_change a bit
  2024-05-01 15:09       ` Tvrtko Ursulin
@ 2024-05-22  8:42         ` Tvrtko Ursulin
  0 siblings, 0 replies; 7+ messages in thread
From: Tvrtko Ursulin @ 2024-05-22  8:42 UTC (permalink / raw)
  To: Tvrtko Ursulin
  Cc: linux-kernel, Tvrtko Ursulin, Suren Baghdasaryan, Peter Zijlstra,
	kernel-dev, Johannes Weiner, Juri Lelli, Vincent Guittot,
	Ingo Molnar


Hi,

Any chance of getting this in somewhere? I am gradually implementing an 
exponential back-off in terms of pinging this thread but it is a simple 
and clear improvement so I don't know why it is stuck really.

Kind regards,

Tvrtko

On 01/05/2024 16:09, Tvrtko Ursulin wrote:
> 
> Hi,
> 
> Adding more scheduler maintainers to Cc - hopefully someone can merge to 
> the relevant tree?
> 
> Thanks,
> 
> Tvrtko
> 
> On 18/04/2024 09:54, Tvrtko Ursulin wrote:
>>
>> Hi Ingo,
>>
>> On 09/04/2024 23:38, Johannes Weiner wrote:
>>> [ Oops, I still had an old mutt alias for Ingo's address. ]
>>>
>>> Ingo, would you mind taking this through the scheduler tree?
>>
>> Gentle reminder so this one does not fall through the cracks.
>>
>> Regards,
>>
>> Tvrtko
>>
>>> On Fri, Mar 29, 2024 at 02:51:53PM -0400, Johannes Weiner wrote:
>>>> On Fri, Mar 29, 2024 at 04:06:48PM +0000, Tvrtko Ursulin wrote:
>>>>> From: Tvrtko Ursulin <tursulin@ursulin.net>
>>>>>
>>>>> The current code loops over the psi_states only to call a helper which
>>>>> then resolves back to the action needed for each state using a switch
>>>>> statement. That is effectively creating a double indirection of a kind
>>>>> which, given how all the states need to be explicitly listed and 
>>>>> handled
>>>>> anyway, we can simply remove. Both the for loop and the switch 
>>>>> statement
>>>>> that is.
>>>>>
>>>>> The benefit is both in the code size and CPU time spent in this 
>>>>> function.
>>>>> YMMV but on my Steam Deck, while in a game, the patch makes the CPU 
>>>>> usage
>>>>> go from ~2.4% down to ~1.2%. Text size at the same time went from 
>>>>> 0x323 to
>>>>> 0x2c1.
>>>>>
>>>>> Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net>
>>>>> Cc: Johannes Weiner <hannes@cmpxchg.org>
>>>>> Cc: Suren Baghdasaryan <surenb@google.com>
>>>>> Cc: Peter Zijlstra <peterz@infradead.org>
>>>>> Cc: linux-kernel@vger.kernel.org
>>>>> Cc: kernel-dev@igalia.com
>>>>
>>>> This is great.
>>>>
>>>> Acked-by: Johannes Weiner <hannes@cmpxchg.org>
>>>>
>>>> Ingo, would you mind please taking this through the scheduler tree? I
>>>> think Peter is still out.
>>>>
>>>> Remaining quote below.
>>>>
>>>> Thanks
>>>>
>>>>> ---
>>>>>   kernel/sched/psi.c | 54 
>>>>> +++++++++++++++++++++++-----------------------
>>>>>   1 file changed, 27 insertions(+), 27 deletions(-)
>>>>>
>>>>> diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c
>>>>> index 7b4aa5809c0f..55720ecf420e 100644
>>>>> --- a/kernel/sched/psi.c
>>>>> +++ b/kernel/sched/psi.c
>>>>> @@ -218,28 +218,32 @@ void __init psi_init(void)
>>>>>       group_init(&psi_system);
>>>>>   }
>>>>> -static bool test_state(unsigned int *tasks, enum psi_states state, 
>>>>> bool oncpu)
>>>>> +static u32 test_states(unsigned int *tasks, u32 state_mask)
>>>>>   {
>>>>> -    switch (state) {
>>>>> -    case PSI_IO_SOME:
>>>>> -        return unlikely(tasks[NR_IOWAIT]);
>>>>> -    case PSI_IO_FULL:
>>>>> -        return unlikely(tasks[NR_IOWAIT] && !tasks[NR_RUNNING]);
>>>>> -    case PSI_MEM_SOME:
>>>>> -        return unlikely(tasks[NR_MEMSTALL]);
>>>>> -    case PSI_MEM_FULL:
>>>>> -        return unlikely(tasks[NR_MEMSTALL] &&
>>>>> -            tasks[NR_RUNNING] == tasks[NR_MEMSTALL_RUNNING]);
>>>>> -    case PSI_CPU_SOME:
>>>>> -        return unlikely(tasks[NR_RUNNING] > oncpu);
>>>>> -    case PSI_CPU_FULL:
>>>>> -        return unlikely(tasks[NR_RUNNING] && !oncpu);
>>>>> -    case PSI_NONIDLE:
>>>>> -        return tasks[NR_IOWAIT] || tasks[NR_MEMSTALL] ||
>>>>> -            tasks[NR_RUNNING];
>>>>> -    default:
>>>>> -        return false;
>>>>> +    const bool oncpu = state_mask & PSI_ONCPU;
>>>>> +
>>>>> +    if (tasks[NR_IOWAIT]) {
>>>>> +        state_mask |= BIT(PSI_IO_SOME);
>>>>> +        if (!tasks[NR_RUNNING])
>>>>> +            state_mask |= BIT(PSI_IO_FULL);
>>>>>       }
>>>>> +
>>>>> +    if (tasks[NR_MEMSTALL]) {
>>>>> +        state_mask |= BIT(PSI_MEM_SOME);
>>>>> +        if (tasks[NR_RUNNING] == tasks[NR_MEMSTALL_RUNNING])
>>>>> +            state_mask |= BIT(PSI_MEM_FULL);
>>>>> +    }
>>>>> +
>>>>> +    if (tasks[NR_RUNNING] > oncpu)
>>>>> +        state_mask |= BIT(PSI_CPU_SOME);
>>>>> +
>>>>> +    if (tasks[NR_RUNNING] && !oncpu)
>>>>> +        state_mask |= BIT(PSI_CPU_FULL);
>>>>> +
>>>>> +    if (tasks[NR_IOWAIT] || tasks[NR_MEMSTALL] || tasks[NR_RUNNING])
>>>>> +        state_mask |= BIT(PSI_NONIDLE);
>>>>> +
>>>>> +    return state_mask;
>>>>>   }
>>>>>   static void get_recent_times(struct psi_group *group, int cpu,
>>>>> @@ -770,7 +774,6 @@ static void psi_group_change(struct psi_group 
>>>>> *group, int cpu,
>>>>>   {
>>>>>       struct psi_group_cpu *groupc;
>>>>>       unsigned int t, m;
>>>>> -    enum psi_states s;
>>>>>       u32 state_mask;
>>>>>       groupc = per_cpu_ptr(group->pcpu, cpu);
>>>>> @@ -841,10 +844,7 @@ static void psi_group_change(struct psi_group 
>>>>> *group, int cpu,
>>>>>           return;
>>>>>       }
>>>>> -    for (s = 0; s < NR_PSI_STATES; s++) {
>>>>> -        if (test_state(groupc->tasks, s, state_mask & PSI_ONCPU))
>>>>> -            state_mask |= (1 << s);
>>>>> -    }
>>>>> +    state_mask = test_states(groupc->tasks, state_mask);
>>>>>       /*
>>>>>        * Since we care about lost potential, a memstall is FULL
>>>>> @@ -1194,7 +1194,7 @@ void psi_cgroup_restart(struct psi_group *group)
>>>>>       /*
>>>>>        * After we disable psi_group->enabled, we don't actually
>>>>>        * stop percpu tasks accounting in each psi_group_cpu,
>>>>> -     * instead only stop test_state() loop, record_times()
>>>>> +     * instead only stop test_states() loop, record_times()
>>>>>        * and averaging worker, see psi_group_change() for details.
>>>>>        *
>>>>>        * When disable cgroup PSI, this function has nothing to sync
>>>>> @@ -1202,7 +1202,7 @@ void psi_cgroup_restart(struct psi_group *group)
>>>>>        * would see !psi_group->enabled and only do task accounting.
>>>>>        *
>>>>>        * When re-enable cgroup PSI, this function use 
>>>>> psi_group_change()
>>>>> -     * to get correct state mask from test_state() loop on tasks[],
>>>>> +     * to get correct state mask from test_states() loop on tasks[],
>>>>>        * and restart groupc->state_start from now, use .clear = 
>>>>> .set = 0
>>>>>        * here since no task status really changed.
>>>>>        */
>>>>> -- 
>>>>> 2.44.0
>>>>>


Thread overview: 7+ messages
2024-03-29 16:06 [PATCH] sched/psi: Optimise psi_group_change a bit Tvrtko Ursulin
2024-03-29 18:51 ` Johannes Weiner
2024-04-09 22:38   ` Johannes Weiner
2024-04-18  8:54     ` Tvrtko Ursulin
2024-05-01 15:09       ` Tvrtko Ursulin
2024-05-22  8:42         ` Tvrtko Ursulin
2024-04-01  6:46 ` Chengming Zhou
