All of lore.kernel.org
 help / color / mirror / Atom feed
From: Chengming Zhou <zhouchengming@bytedance.com>
To: hannes@cmpxchg.org, surenb@google.com, quic_pkondeti@quicinc.com
Cc: peterz@infradead.org, quic_charante@quicinc.com,
	linux-kernel@vger.kernel.org,
	Chengming Zhou <zhouchengming@bytedance.com>
Subject: [PATCH] sched/psi: Fix avgs_work re-arm in psi_avgs_work()
Date: Mon, 10 Oct 2022 18:42:06 +0800	[thread overview]
Message-ID: <20221010104206.12184-1-zhouchengming@bytedance.com> (raw)
In-Reply-To: <20220913140817.GA9091@hu-pkondeti-hyd.qualcomm.com>

Pavan reported a problem that PSI avgs_work idle shutoff is not
working at all. Because PSI_NONIDLE condition would be observed in
psi_avgs_work()->collect_percpu_times()->get_recent_times() even if
only the kworker running avgs_work on the CPU.

Although commit 1b69ac6b40eb ("psi: fix aggregation idle shut-off")
avoided the ping-pong wake problem when the worker sleep, psi_avgs_work()
still will always re-arm the avgs_work, so shutoff is not working.

This patch changes to consider current CPU groupc as IDLE if the
kworker running avgs_work is the only task running and no IOWAIT
or MEMSTALL sleep tasks, in which case we will shut off the avgs_work
if other CPUs' groupc are also IDLE.

One potential problem is that the brief period of non-idle time
incurred between the aggregation run and the kworker's dequeue will
be stranded in the per-cpu buckets until avgs_work run next time.
The buckets can hold 4s worth of time, and future activity will wake
the avgs_work with a 2s delay, giving us 2s worth of data we can leave
behind when shut off the avgs_work. If the kworker run other works after
avgs_work shut off and doesn't have any scheduler activities for 2s,
this maybe a problem.

Reported-by: Pavan Kondeti <quic_pkondeti@quicinc.com>
Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>
---
 kernel/sched/psi.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c
index ee2ecc081422..f4cdf6f184ba 100644
--- a/kernel/sched/psi.c
+++ b/kernel/sched/psi.c
@@ -242,6 +242,8 @@ static void get_recent_times(struct psi_group *group, int cpu,
 			     u32 *pchanged_states)
 {
 	struct psi_group_cpu *groupc = per_cpu_ptr(group->pcpu, cpu);
+	int current_cpu = raw_smp_processor_id();
+	bool only_avgs_work = false;
 	u64 now, state_start;
 	enum psi_states s;
 	unsigned int seq;
@@ -256,6 +258,15 @@ static void get_recent_times(struct psi_group *group, int cpu,
 		memcpy(times, groupc->times, sizeof(groupc->times));
 		state_mask = groupc->state_mask;
 		state_start = groupc->state_start;
+		/*
+		 * This CPU has only avgs_work kworker running, snapshot the
+		 * newest times then don't need to re-arm for this groupc.
+		 * Normally this kworker will sleep soon and won't wake
+		 * avgs_work back up in psi_group_change().
+		 */
+		if (current_cpu == cpu && groupc->tasks[NR_RUNNING] == 1 &&
+		    !groupc->tasks[NR_IOWAIT] && !groupc->tasks[NR_MEMSTALL])
+			only_avgs_work = true;
 	} while (read_seqcount_retry(&groupc->seq, seq));
 
 	/* Calculate state time deltas against the previous snapshot */
@@ -280,6 +291,10 @@ static void get_recent_times(struct psi_group *group, int cpu,
 		if (delta)
 			*pchanged_states |= (1 << s);
 	}
+
+	/* Clear PSI_NONIDLE so avgs_work won't be re-armed for this groupc */
+	if (only_avgs_work)
+		*pchanged_states &= ~(1 << PSI_NONIDLE);
 }
 
 static void calc_avgs(unsigned long avg[3], int missed_periods,
-- 
2.37.2


  parent reply	other threads:[~2022-10-10 10:42 UTC|newest]

Thread overview: 44+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-09-13 14:08 PSI idle-shutoff Pavan Kondeti
2022-09-15  6:20 ` Pavan Kondeti
2022-09-17  5:45   ` Suren Baghdasaryan
2022-10-03  6:11     ` Suren Baghdasaryan
2022-10-05 16:32       ` Suren Baghdasaryan
2022-10-09 12:41         ` Chengming Zhou
2022-10-09 13:17           ` Chengming Zhou
2022-10-10  6:18             ` Pavan Kondeti
2022-10-10  6:43               ` Pavan Kondeti
2022-10-10  6:57                 ` [External] " Chengming Zhou
2022-10-10  8:30                   ` Chengming Zhou
2022-10-10  9:09                     ` Pavan Kondeti
2022-10-10  9:22                       ` Chengming Zhou
2022-10-10 20:59             ` Suren Baghdasaryan
2022-10-10 20:33           ` Suren Baghdasaryan
2022-10-10  5:57         ` Pavan Kondeti
2022-10-10  9:01           ` Pavan Kondeti
2022-10-10  6:25         ` Pavan Kondeti
2022-10-10 10:42 ` Chengming Zhou [this message]
2022-10-10 21:21   ` [PATCH] sched/psi: Fix avgs_work re-arm in psi_avgs_work() Suren Baghdasaryan
2022-10-11  0:07     ` Chengming Zhou
2022-10-11 17:00       ` Suren Baghdasaryan
2022-10-12  2:10         ` Chengming Zhou
2022-10-12 18:24           ` Suren Baghdasaryan
2022-10-13  2:23             ` Chengming Zhou
2022-10-13 11:06             ` Chengming Zhou
2022-10-13 15:52               ` Johannes Weiner
2022-10-13 16:10                 ` Suren Baghdasaryan
2022-10-14  2:03                   ` Chengming Zhou
2022-10-14  2:02                 ` Chengming Zhou
2022-10-28  6:42   ` [tip: sched/core] " tip-bot2 for Chengming Zhou
2022-10-28  6:50     ` [External] " Chengming Zhou
2022-10-28 15:58       ` Suren Baghdasaryan
2022-10-28 16:05         ` Chengming Zhou
2022-10-28 19:53         ` [External] " Peter Zijlstra
2022-10-29 11:55           ` Peter Zijlstra
2022-10-29 12:40             ` Chengming Zhou
2022-10-29 18:46               ` Suren Baghdasaryan
2022-10-10 10:57 ` PSI idle-shutoff Hillf Danton
2022-10-10 21:16   ` Suren Baghdasaryan
2022-10-11 11:38     ` Hillf Danton
2022-10-11 17:11       ` Suren Baghdasaryan
2022-10-12  6:20         ` Hillf Danton
2022-10-12 15:40           ` Suren Baghdasaryan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20221010104206.12184-1-zhouchengming@bytedance.com \
    --to=zhouchengming@bytedance.com \
    --cc=hannes@cmpxchg.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=peterz@infradead.org \
    --cc=quic_charante@quicinc.com \
    --cc=quic_pkondeti@quicinc.com \
    --cc=surenb@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.