From: Johannes Weiner <hannes@cmpxchg.org>
To: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@redhat.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Rik van Riel <riel@redhat.com>, Mel Gorman <mgorman@suse.de>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	kernel-team@fb.com
Subject: Re: [PATCH 3/3] mm/sched: memdelay: memory health interface for systems and workloads
Date: Mon, 31 Jul 2017 16:38:40 -0400
Message-ID: <20170731203839.GA5162@cmpxchg.org>
In-Reply-To: <1501530579.9118.43.camel@gmx.de>

On Mon, Jul 31, 2017 at 09:49:39PM +0200, Mike Galbraith wrote:
> On Mon, 2017-07-31 at 14:41 -0400, Johannes Weiner wrote:
> > 
> > Adding an rq counter for tasks inside memdelay sections should be
> > straight-forward as well (except for maybe the migration cost of that
> > state between CPUs in ttwu that Mike pointed out).
> 
> What I pointed out should be easily eliminated (zero use case).

How so?

> > That leaves the question of how to track these numbers per cgroup at
> > an acceptable cost. The idea for a tree of cgroups is that walltime
> > impact of delays at each level is reported for all tasks at or below
> > that level. E.g. a leaf group aggregates the state of its own tasks,
> > the root/system aggregates the state of all tasks in the system; hence
> > the propagation of the task state counters up the hierarchy.
> 
> The crux of the biscuit is where exactly the investment return lies.
>  Gathering of these numbers ain't gonna be free, no matter how hard you
> try, and you're plugging into paths where every cycle added is made of
> userspace hide.

Right. But how to implement it sanely and optimize for cycles, and
whether we want to default-enable this interface are two separate
conversations.

It makes sense to me to first make the implementation as lightweight
on cycles and maintainability as possible, and then worry about the
cost / benefit defaults of the shipped Linux kernel afterwards.

That goes for the purely informative userspace interface, anyway. The
easily-provoked thrashing livelock I have described in the email to
Andrew is a different matter. If the OOM killer requires hooking up to
this metric to fix it, it won't be optional. But the OOM code isn't
part of this series yet, so again a conversation best had later, IMO.

PS: I'm stealing the "made of userspace hide" thing.
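
For illustration, the hierarchical propagation described in the quoted
text above (a leaf group counts its own tasks, the root counts every
task in the system) could be sketched roughly as below. This is not
code from the patch series: `struct group`, `nr_delayed` and
`group_delay_change()` are hypothetical names, and the sketch leaves out
the locking, per-CPU state and time accounting a real implementation
would need.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a cgroup; not a kernel structure. */
struct group {
	struct group *parent;	/* NULL for the root group */
	int nr_delayed;		/* delayed tasks at or below this level */
};

/*
 * Apply a change in the number of delayed tasks to the group the task
 * belongs to and to every ancestor, so that each level reflects all
 * tasks at or below it. The walk costs one step per hierarchy level,
 * which is the per-cgroup overhead under discussion.
 */
static void group_delay_change(struct group *g, int delta)
{
	for (; g; g = g->parent) {
		/* A counter must never drop below zero. */
		assert(g->nr_delayed + delta >= 0);
		g->nr_delayed += delta;
	}
}
```

With a root, one intermediate group and a leaf under it, calling
group_delay_change(&leaf, 1) raises the counter on all three levels,
while a task entering a delay section in a sibling of the intermediate
group only touches that sibling and the root.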
