From: Suren Baghdasaryan <surenb@google.com>
To: Johannes Weiner <hannes@cmpxchg.org>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	linux-block@vger.kernel.org, cgroups@vger.kernel.org,
	Ingo Molnar <mingo@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Andrew Morton <akpm@linuxfoundation.org>,
	Tejun Heo <tj@kernel.org>, Balbir Singh <bsingharora@gmail.com>,
	Mike Galbraith <efault@gmx.de>, Oliver Yang <yangoliver@me.com>,
	Shakeel Butt <shakeelb@google.com>, xxx xxx <x.qendo@gmail.com>,
	Taras Kondratiuk <takondra@cisco.com>,
	Daniel Walker <danielwa@cisco.com>,
	Vinayak Menon <vinmenon@codeaurora.org>,
	Ruslan Ruslichenko <rruslich@cisco.com>,
	kernel-team@fb.com
Subject: Re: [PATCH 0/7] psi: pressure stall information for CPU, memory, and IO
Date: Wed, 30 May 2018 16:32:52 -0700
Message-ID: <CAJuCfpGXSyu3SOky6jMhKjix=bbaPccg05VcepbvuJiv+bQgzw@mail.gmail.com>
In-Reply-To: <20180529181616.GB28689@cmpxchg.org>

On Tue, May 29, 2018 at 11:16 AM, Johannes Weiner <hannes@cmpxchg.org> wrote:
> Hi Suren,
>
> On Fri, May 25, 2018 at 05:29:30PM -0700, Suren Baghdasaryan wrote:
>> Hi Johannes,
>> I tried your previous memdelay patches before this new set was posted
>> and results were promising for predicting when Android system is close
>> to OOM. I'm definitely going to try this one after I backport it to
>> 4.9.
>
> I'm happy to hear that!
>
>> Would it make sense to split CONFIG_PSI into CONFIG_PSI_CPU,
>> CONFIG_PSI_MEM and CONFIG_PSI_IO since one might need only specific
>> subset of this feature?
>
> Yes, that should be doable. I'll split them out in the next version.
>
>> > The total= value gives the absolute stall time in microseconds. This
>> > allows detecting latency spikes that might be too short to sway the
>> > running averages. It also allows custom time averaging in case the
>> > 10s/1m/5m windows aren't adequate for the usecase (or are too coarse
>> > with future hardware).
>>
>> Any reasons these specific windows were chosen (empirical
>> data/historical reasons)? I'm worried that with the smallest window
>> being 10s the signal might be too inert to detect fast memory pressure
>> buildup before OOM kill happens. I'll have to experiment with that
>> first, however if you have some insights into this already please
>> share them.
>
> They were chosen empirically. We started out with the loadavg window
> sizes, but had to reduce them for exactly the reason you mention -
> they're way too coarse to detect acute pressure buildup.
>
> 10s has been working well for us. We could make it smaller, but there
> is some worry that we don't have enough samples then and the average
> becomes too erratic - whereas monitoring total= directly would allow
> you to detect acute spikes and handle this erraticness explicitly.

Unfortunately, the total= field is now updated only at 2sec intervals,
which might be too late to react to mounting memory pressure. With the
previous memdelay patchset, md->aggregate, which is reported as "total",
was calculated directly inside memdelay_task_change, so it was always
up to date. Now group->some and group->full are updated from inside
psi_clock with up to a 2sec delay, which prevents us from detecting
these acute pressure spikes immediately. I understand why you moved
these calculations out of the hot path, but maybe we could keep updating
"total" inside psi_group_update? This would allow for custom averaging
and eliminate the delay in detecting spikes in the pressure signal.
More conceptually, I would love to have a way to monitor the averages
at a slow rate and, when they rise and cross some threshold, increase
the monitoring rate so we can react quickly in case they shoot up. The
current 2sec delay poses a problem for doing that.
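
A minimal userspace sketch of the adaptive monitoring described above,
in Python. It is an illustration under stated assumptions, not code from
the patchset: the "key=value" line layout, the helper names, and the
threshold and interval values are all hypothetical. It derives a custom
pressure percentage from two total= samples and picks the next polling
interval from it. With total= refreshed only every 2sec, the fast
interval below would mostly re-read stale values.

```python
def parse_total(line):
    """Return the total= stall time in microseconds from one pressure line.

    Assumes whitespace-separated key=value fields (hypothetical layout).
    """
    for field in line.split():
        if field.startswith("total="):
            return int(field[len("total="):])
    raise ValueError("no total= field in %r" % line)


def stall_percent(prev_total_us, cur_total_us, elapsed_us):
    """Share of the elapsed wall time spent stalled, in percent."""
    if elapsed_us <= 0:
        raise ValueError("elapsed_us must be positive")
    return 100.0 * (cur_total_us - prev_total_us) / elapsed_us


def next_interval(pressure_pct, threshold_pct=5.0, slow_s=10.0, fast_s=0.1):
    """Poll slowly while pressure is low, quickly once it crosses the threshold.

    The default threshold and intervals are placeholders, not tuned values.
    """
    return fast_s if pressure_pct >= threshold_pct else slow_s
```

A monitor would call parse_total() on each read, feed consecutive
samples to stall_percent(), and sleep for next_interval() seconds
before reading again.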

>
> Let me know how it works out in your tests.

I've finished the backport to 4.9 and am running the tests, but the 2sec
delay is problematic for getting a detailed look at the signal and its
usefulness. I'm thinking about workarounds, if only for data collection,
but don't want to deviate too much from your baseline. I would love to
hear from you if a good compromise can be reached here.

>
> Thanks for your feedback.
