From: Pavan Kondeti <quic_pkondeti@quicinc.com>
To: Pavan Kondeti <quic_pkondeti@quicinc.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Suren Baghdasaryan <surenb@google.com>
Cc: <linux-kernel@vger.kernel.org>, <quic_charante@quicinc.com>
Subject: Re: PSI idle-shutoff
Date: Thu, 15 Sep 2022 11:50:27 +0530
Message-ID: <20220915062027.GA14713@hu-pkondeti-hyd.qualcomm.com>
In-Reply-To: <20220913140817.GA9091@hu-pkondeti-hyd.qualcomm.com>

On Tue, Sep 13, 2022 at 07:38:17PM +0530, Pavan Kondeti wrote:
> Hi
> 
> Since psi_avgs_work()->collect_percpu_times()->get_recent_times()
> runs from a kworker thread, the PSI_NONIDLE condition would be observed, as
> there is a RUNNING task. So we would always end up re-arming the work.
> 
> If the work is re-armed from psi_avgs_work() itself, the backing-off
> logic in psi_task_change() (will be moved to psi_task_switch soon) can't
> help. The work is already scheduled, so we don't do anything there.
> 
> Probably I am missing something here. Can you please clarify how we
> shut off re-arming the psi avg work?
> 

I have collected traces on an idle system (running android12-5.10 with minimal
user space). This is an older kernel; however, the issue remains on the latest
kernel as per code inspection.

I have eliminated the noise created by other work items, for example
vmstat_work. This is a deferrable work, but it gets executed because it is
queued on the same CPU on which the PSI work timer is queued. So I have
increased sysctl_stat_interval to 60 * HZ to suppress this work.

As we can see from the traces, CPU#7 comes out of idle every 2 seconds only to
execute the PSI work. The work is always re-armed from psi_avgs_work()
because it finds the PSI_NONIDLE condition. The non-idle time is essentially

non_idle_time = (work_start_now - wakeup_now) + (sleep_prev - work_end_prev)

The first term accounts for the non-idle time from when the task is woken up
(queued) to the execution of the work item. It is ~4 usec
(54.119420 - 54.119416).

The second term accounts for the tail of the previous update, from the end of
the work item until the kworker is switched out: ~2 usec
(52.135424 - 52.135422).

The PSI work needs to run when there has been some activity since the last
update, i.e. since the last time the work ran. Since we use a non-deferrable
timer, the other deferrable timers get serviced when it wakes the CPU, and
they might queue work or wake up other threads. That creates activity, which
in turn causes the PSI work to be scheduled again.

The PSI work can't simply be made a deferrable work, because it is a
system-level work: if the CPU on which it is queued is idle for a long
duration while the other CPUs are active, we miss PSI updates. What we
probably need is a global deferrable timer [1], i.e. a timer that is not bound
to any CPU but runs when any CPU comes out of idle. As long as one CPU is
busy, we keep running the PSI work, but if the whole system is idle, we never
wake up.

          <idle>-0     [007]    52.135402: cpu_idle:             state=4294967295 cpu_id=7
          <idle>-0     [007]    52.135415: workqueue_activate_work: work struct 0xffffffc011bd5010
          <idle>-0     [007]    52.135417: sched_wakeup:         comm=kworker/7:3 pid=196 prio=120 target_cpu=007
          <idle>-0     [007]    52.135421: sched_switch:         prev_comm=swapper/7 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=kworker/7:3 next_pid=196 next_prio=120
     kworker/7:3-196   [007]    52.135421: workqueue_execute_start: work struct 0xffffffc011bd5010: function psi_avgs_work
     kworker/7:3-196   [007]    52.135422: timer_start:          timer=0xffffffc011bd5040 function=delayed_work_timer_fn expires=4294905814 [timeout=494] cpu=7 idx=123 flags=D|P|I
     kworker/7:3-196   [007]    52.135422: workqueue_execute_end: work struct 0xffffffc011bd5010: function psi_avgs_work
     kworker/7:3-196   [007]    52.135424: sched_switch:         prev_comm=kworker/7:3 prev_pid=196 prev_prio=120 prev_state=I ==> next_comm=swapper/7 next_pid=0 next_prio=120
          <idle>-0     [007]    52.135428: cpu_idle:             state=0 cpu_id=7

	  <system is idle and gets woken up after 2 seconds due to PSI work>

          <idle>-0     [007]    54.119402: cpu_idle:             state=4294967295 cpu_id=7
          <idle>-0     [007]    54.119414: workqueue_activate_work: work struct 0xffffffc011bd5010
          <idle>-0     [007]    54.119416: sched_wakeup:         comm=kworker/7:3 pid=196 prio=120 target_cpu=007
          <idle>-0     [007]    54.119420: sched_switch:         prev_comm=swapper/7 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=kworker/7:3 next_pid=196 next_prio=120
     kworker/7:3-196   [007]    54.119420: workqueue_execute_start: work struct 0xffffffc011bd5010: function psi_avgs_work
     kworker/7:3-196   [007]    54.119421: timer_start:          timer=0xffffffc011bd5040 function=delayed_work_timer_fn expires=4294906315 [timeout=499] cpu=7 idx=122 flags=D|P|I
     kworker/7:3-196   [007]    54.119422: workqueue_execute_end: work struct 0xffffffc011bd5010: function psi_avgs_work

[1]
https://lore.kernel.org/lkml/1430188744-24737-1-git-send-email-joonwoop@codeaurora.org/

Thanks,
Pavan
