From: Johannes Weiner <hannes@cmpxchg.org>
To: Minchan Kim <minchan@kernel.org>
Cc: Tim Murray <timmurray@google.com>,
	Michal Hocko <mhocko@kernel.org>,
	Vladimir Davydov <vdavydov.dev@gmail.com>,
	LKML <linux-kernel@vger.kernel.org>,
	cgroups@vger.kernel.org, Linux-MM <linux-mm@kvack.org>,
	Suren Baghdasaryan <surenb@google.com>,
	Patrik Torstensson <totte@google.com>,
	Android Kernel Team <kernel-team@android.com>
Subject: Re: [RFC 0/1] add support for reclaiming priorities per mem cgroup
Date: Thu, 13 Apr 2017 12:01:47 -0400
Message-ID: <20170413160147.GB29727@cmpxchg.org>
In-Reply-To: <20170413043047.GA16783@bbox>

On Thu, Apr 13, 2017 at 01:30:47PM +0900, Minchan Kim wrote:
> On Thu, Mar 30, 2017 at 12:40:32PM -0700, Tim Murray wrote:
> > As a result, I think there's still a need for relative priority
> > between mem cgroups, not just an absolute limit.
> > 
> > Does that make sense?
> 
> I agree with it.
> 
> Recently, workloads on embedded platforms for smart devices have become
> much more diverse (from games to alarms), so it's hard to set absolute
> limits proactively. Userspace has more hints about which workloads are
> more important (i.e., greedy) than others, even though favoring them may
> harm something else (e.g., something with no user-visible effect).
> 
> From that point of view, I support this idea as a basic approach.
> And with the thrashing detector from Johannes, we can better fine-tune
> LRU balancing and the timing of vmpressure notifications.
> 
> Johannes,
> 
> Do you have any concerns about this memcg priority idea?

While I fully agree that relative priority levels would be easier to
configure, this patch doesn't really do that. It allows you to set a
scan window divider to a fixed amount and, as I already pointed out,
the scan window is no longer representative of memory pressure.

[ Really, sc->priority should probably just be called LRU lookahead
  factor or something, there is not much about it being representative
  of any kind of urgency anymore. ]

With this patch, if you configure the priorities of two 8G groups to 0
and 4, reclaim will treat them exactly the same*. If you configure the
priorities of two 100G groups to 0 and 7, reclaim will treat them
exactly the same. The bigger the group, the more of the lower end of
the priority range becomes meaningless, because once the divider
produces outcomes bigger than SWAP_CLUSTER_MAX (32), it doesn't
actually bias reclaim anymore.

So that's not a portable relative scale of pressure discrimination.
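
The arithmetic behind those figures can be sketched as follows. This
assumes 4 KiB pages and that the per-group priority is simply added to
DEF_PRIORITY (12) as a right shift on the group size. That mapping is an
assumption chosen because it reproduces the numbers above, not something
taken from the patch, and the helper names are made up for illustration.
Like the figures above, it ignores the split of the LRUs.

```python
PAGE_SIZE = 4096        # assumption: 4 KiB pages
DEF_PRIORITY = 12       # the kernel's default reclaim priority
SWAP_CLUSTER_MAX = 32   # reclaim batch size, in pages


def scan_window(group_bytes, group_prio):
    """Pages in the scan window, under the assumed shift mapping."""
    pages = group_bytes // PAGE_SIZE
    return pages >> (DEF_PRIORITY + group_prio)


def highest_unbiased_prio(group_bytes):
    """Largest group priority whose window still covers a full batch.

    Returns 0 when even priority 0 falls below SWAP_CLUSTER_MAX.
    """
    prio = 0
    while scan_window(group_bytes, prio + 1) >= SWAP_CLUSTER_MAX:
        prio += 1
    return prio


GIB = 1 << 30
for size_gib in (8, 100):
    print(size_gib, "GiB:", highest_unbiased_prio(size_gib * GIB))
```

Under these assumptions an 8G group stays at or above the batch size up
to priority 4 and a 100G group up to priority 7, so the usable part of
the scale shrinks as the group grows.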

But the bigger problem with this is that, as sc->priority doesn't
represent memory pressure anymore, it is merely a cut-off for which
groups to scan and which groups not to scan *based on their size*.

That is the same as setting memory.low!

* For simplicity, I'm glossing over the fact here that LRUs are split
  by type and into inactive/active, so in reality the numbers are a
  little different, but you get the point.

> Or do you think the patchset you are preparing solves this situation?

It's certainly a requirement. In order to implement a relative scale
of memory pressure discrimination, we first need to be able to really
quantify memory pressure.

Then we can either allow setting absolute latency/slowdown minimums
for each group, with reclaim skipping groups above those thresholds,
or we can map a relative priority scale against the total slowdown due
to lack of memory in the system, and each group gets a relative share
based on its priority compared to other groups.
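
The second option could look roughly like the sketch below. It assumes
that a system-wide slowdown figure already exists, and that each group's
priority has been turned into a weight where a larger weight means the
group absorbs more of the slowdown. The function name, the weights, and
the millisecond unit are all hypothetical.

```python
def slowdown_shares(total_slowdown_ms, weights):
    """Split a system-wide slowdown among groups in proportion to weight."""
    total_weight = sum(weights.values())
    return {name: total_slowdown_ms * weight / total_weight
            for name, weight in weights.items()}


# Hypothetical groups: the background group takes three times the share.
shares = slowdown_shares(100, {"foreground": 1, "background": 3})
print(shares)
```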

But there is no way around first having a working measure of memory
pressure before we can meaningfully distribute it among the groups.

Thanks




Thread overview:
2017-03-17 23:16 [RFC 0/1] add support for reclaiming priorities per mem cgroup Tim Murray
2017-03-17 23:16 ` [RFC 1/1] mm, memcg: add prioritized reclaim Tim Murray
2017-03-20 14:41   ` vinayak menon
2017-03-20  5:59 ` [RFC 0/1] add support for reclaiming priorities per mem cgroup Minchan Kim
2017-03-20 13:58   ` Vinayak Menon
2017-03-20 15:23     ` Johannes Weiner
2017-03-22 12:13       ` Vinayak Menon
2017-03-21 17:18   ` Tim Murray
2017-03-22  4:41     ` Minchan Kim
2017-03-22  5:20       ` Minchan Kim
2017-03-20  6:56 ` peter enderborg
2017-03-20  8:18 ` Kyungmin Park
2017-03-30  5:59 ` Minchan Kim
2017-03-30  7:10   ` Tim Murray
2017-03-30 15:51 ` Johannes Weiner
2017-03-30 16:48   ` Shakeel Butt
2017-04-13 16:03     ` Johannes Weiner
2017-03-30 19:40   ` Tim Murray
2017-03-30 21:54     ` Tim Murray
2017-04-13  4:30     ` Minchan Kim
2017-04-13 16:01       ` Johannes Weiner [this message]
2017-04-17  4:26         ` Minchan Kim
