linux-kernel.vger.kernel.org archive mirror
* [PATCH] mm: vmpressure: fix scan window after SWAP_CLUSTER_MAX increase
@ 2015-10-19 18:13 Johannes Weiner
  2015-10-20  7:47 ` Mel Gorman
  2015-10-21 19:38 ` Johannes Weiner
  0 siblings, 2 replies; 8+ messages in thread
From: Johannes Weiner @ 2015-10-19 18:13 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Vladimir Davydov, linux-mm, cgroups, linux-kernel

mm-increase-swap_cluster_max-to-batch-tlb-flushes.patch changed
SWAP_CLUSTER_MAX from 32 pages to 256 pages, inadvertently switching
the scan window for vmpressure detection from 2MB to 16MB. Revert.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
---
 mm/vmpressure.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/vmpressure.c b/mm/vmpressure.c
index c5afd57..74f206b 100644
--- a/mm/vmpressure.c
+++ b/mm/vmpressure.c
@@ -38,7 +38,7 @@
  * TODO: Make the window size depend on machine size, as we do for vmstat
  * thresholds. Currently we set it to 512 pages (2MB for 4KB pages).
  */
-static const unsigned long vmpressure_win = SWAP_CLUSTER_MAX * 16;
+static const unsigned long vmpressure_win = SWAP_CLUSTER_MAX;
 
 /*
  * These thresholds are used when we account memory pressure through
-- 
2.6.1


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH] mm: vmpressure: fix scan window after SWAP_CLUSTER_MAX increase
  2015-10-19 18:13 [PATCH] mm: vmpressure: fix scan window after SWAP_CLUSTER_MAX increase Johannes Weiner
@ 2015-10-20  7:47 ` Mel Gorman
  2015-10-20 13:50   ` Johannes Weiner
  2015-10-21 19:38 ` Johannes Weiner
  1 sibling, 1 reply; 8+ messages in thread
From: Mel Gorman @ 2015-10-20  7:47 UTC (permalink / raw)
  To: Johannes Weiner
  Cc: Andrew Morton, Vladimir Davydov, linux-mm, cgroups, linux-kernel

On Mon, Oct 19, 2015 at 02:13:01PM -0400, Johannes Weiner wrote:
> mm-increase-swap_cluster_max-to-batch-tlb-flushes.patch changed
> SWAP_CLUSTER_MAX from 32 pages to 256 pages, inadvertently switching
> the scan window for vmpressure detection from 2MB to 16MB. Revert.
> 
> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>

This was known at the time but it was not clear what the measurable
impact would be. VM Pressure is odd in that it gives strange results at
times anyway, particularly on NUMA machines. To be honest, it still isn't
clear to me what the impact of the patch is. With different base page sizes
(e.g. on ppc64 with some configs), the window is still large. At the time,
it was left as-is as I could not decide one way or the other but I'm ok
with restoring the behaviour so either way;

Acked-by: Mel Gorman <mgorman@techsingularity.net>

Out of curiosity though, what *is* the user-visible impact of the
patch? It's different but I'm having trouble deciding if it's better
or worse. I'm curious as to whether the patch is based on a bug report
or intuition.

Thanks.

-- 
Mel Gorman
SUSE Labs


* Re: [PATCH] mm: vmpressure: fix scan window after SWAP_CLUSTER_MAX increase
  2015-10-20  7:47 ` Mel Gorman
@ 2015-10-20 13:50   ` Johannes Weiner
  2015-10-21  9:17     ` Mel Gorman
  0 siblings, 1 reply; 8+ messages in thread
From: Johannes Weiner @ 2015-10-20 13:50 UTC (permalink / raw)
  To: Mel Gorman
  Cc: Andrew Morton, Vladimir Davydov, linux-mm, cgroups, linux-kernel

On Tue, Oct 20, 2015 at 08:47:00AM +0100, Mel Gorman wrote:
> On Mon, Oct 19, 2015 at 02:13:01PM -0400, Johannes Weiner wrote:
> > mm-increase-swap_cluster_max-to-batch-tlb-flushes.patch changed
> > SWAP_CLUSTER_MAX from 32 pages to 256 pages, inadvertently switching
> > the scan window for vmpressure detection from 2MB to 16MB. Revert.
> > 
> > Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
> 
> This was known at the time but it was not clear what the measurable
> impact would be. VM Pressure is odd in that it gives strange results at
> times anyway, particularly on NUMA machines.

Interesting. Strange how? In terms of reporting random flukes?

I'm interested in vmpressure because I think reclaim efficiency would
be a useful metric to export to other parts of the kernel. We have
slab shrinkers that export LRU scan rate for ageable caches--the
keyword being "ageable" here--that are currently used for everything
that wants to know about memory pressure. But unless you actually age
and rotate your objects, it does not make sense to shrink your objects
based on LRU scan rate. There is no reason to drop your costly objects
if all the VM does is pick up streaming cache pages. In those cases,
it would make more sense to hook yourself up to be notified when
reclaim efficiency drops below a certain threshold.

> To be honest, it still isn't clear to me what the impact of the
> patch is. With different base page sizes (e.g. on ppc64 with some
> configs), the window is still large. At the time, it was left as-is
> as I could not decide one way or the other but I'm ok with restoring
> the behaviour so either way;
> 
> Acked-by: Mel Gorman <mgorman@techsingularity.net>

Thanks!

> Out of curiosity though, what *is* the user-visible impact of the
> patch? It's different but I'm having trouble deciding if it's better
> or worse. I'm curious as to whether the patch is based on a bug report
> or intuition.

No, I didn't observe a bug, it just struck me during code review.

My sole line of reasoning was: the pressure scan window was chosen
back then to be at a certain point between reliable sample size and
on-time reporting of detected pressure. Increasing the LRU scan window
for the purpose of improved TLB batching seems sufficiently unrelated
that we wouldn't want to change the vmpressure window as a side effect.

Arguably it should not even reference SWAP_CLUSTER_MAX, but on the
other hand that value has been pretty static in the past, and it looks
better than '256' :-)


* Re: [PATCH] mm: vmpressure: fix scan window after SWAP_CLUSTER_MAX increase
  2015-10-20 13:50   ` Johannes Weiner
@ 2015-10-21  9:17     ` Mel Gorman
  0 siblings, 0 replies; 8+ messages in thread
From: Mel Gorman @ 2015-10-21  9:17 UTC (permalink / raw)
  To: Johannes Weiner
  Cc: Andrew Morton, Vladimir Davydov, linux-mm, cgroups, linux-kernel

On Tue, Oct 20, 2015 at 09:50:56AM -0400, Johannes Weiner wrote:
> On Tue, Oct 20, 2015 at 08:47:00AM +0100, Mel Gorman wrote:
> > On Mon, Oct 19, 2015 at 02:13:01PM -0400, Johannes Weiner wrote:
> > > mm-increase-swap_cluster_max-to-batch-tlb-flushes.patch changed
> > > SWAP_CLUSTER_MAX from 32 pages to 256 pages, inadvertently switching
> > > the scan window for vmpressure detection from 2MB to 16MB. Revert.
> > > 
> > > Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
> > 
> > This was known at the time but it was not clear what the measurable
> > impact would be. VM Pressure is odd in that it gives strange results at
> > times anyway, particularly on NUMA machines.
> 
> Interesting. Strange how? In terms of reporting random flukes?
> 

I haven't investigated closely myself. What I'm told is that on NUMA
machines there is a "sawtooth-like" pattern of notifications. At
first, they received low pressure notifications and their application
released some memory. Later they expected more "low" events to be caught
but most times they saw "critical" or "medium" events immediately followed
by multiple critical events, sometimes several in the same second.
I currently suspect that they are receiving events for one node but
freeing on another, but I never proved it.

> I'm interested in vmpressure because I think reclaim efficiency would
> be a useful metric to export to other parts of the kernel. We have
> slab shrinkers that export LRU scan rate for ageable caches--the
> keyword being "ageable" here--that are currently used for everything
> that wants to know about memory pressure. But unless you actually age
> and rotate your objects, it does not make sense to shrink your objects
> based on LRU scan rate. There is no reason to drop your costly objects
> if all the VM does is pick up streaming cache pages. In those cases,
> it would make more sense to hook yourself up to be notified when
> reclaim efficiency drops below a certain threshold.
> 

That makes sense but my expectation is that it needs NUMA smarts. It's
nowhere on my radar at the moment. The only thing I had on my plate that
was reclaim-related was to revisit the one-LRU-per-node series if the
atomic reserves and watermark series makes it through.

> > To be honest, it still isn't clear to me what the impact of the
> > patch is. With different base page sizes (e.g. on ppc64 with some
> > configs), the window is still large. At the time, it was left as-is
> > as I could not decide one way or the other but I'm ok with restoring
> > the behaviour so either way;
> > 
> > Acked-by: Mel Gorman <mgorman@techsingularity.net>
> 
> Thanks!
> 
> > Out of curiosity though, what *is* the user-visible impact of the
> > patch? It's different but I'm having trouble deciding if it's better
> > or worse. I'm curious as to whether the patch is based on a bug report
> > or intuition.
> 
> No, I didn't observe a bug, it just struck me during code review.
> 
> My sole line of reasoning was: the pressure scan window was chosen
> back then to be at a certain point between reliable sample size and
> on-time reporting of detected pressure. Increasing the LRU scan window
> for the purpose of improved TLB batching seems sufficiently unrelated
> that we wouldn't want to change the vmpressure window as a side effect.
> 

That's reasonable and roughly what I expected. The vmpressure stuff was
originally designed with mobile devices in mind. AFAIK, they're reasonably
happy with it, so the decisions it embodies made sense to some interested user.

> Arguably it should not even reference SWAP_CLUSTER_MAX, but on the
> other hand that value has been pretty static in the past, and it looks
> better than '256' :-)

True. If it was changed, it would be in the context of NUMA smarts with
scaling based on reclaim activity or watermarks. Certainly, it's outside
the scope of what your patch is doing.

-- 
Mel Gorman
SUSE Labs


* Re: [PATCH] mm: vmpressure: fix scan window after SWAP_CLUSTER_MAX increase
  2015-10-19 18:13 [PATCH] mm: vmpressure: fix scan window after SWAP_CLUSTER_MAX increase Johannes Weiner
  2015-10-20  7:47 ` Mel Gorman
@ 2015-10-21 19:38 ` Johannes Weiner
  2015-10-21 20:05   ` Hugh Dickins
  1 sibling, 1 reply; 8+ messages in thread
From: Johannes Weiner @ 2015-10-21 19:38 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Vladimir Davydov, linux-mm, cgroups, linux-kernel

On Mon, Oct 19, 2015 at 02:13:01PM -0400, Johannes Weiner wrote:
> mm-increase-swap_cluster_max-to-batch-tlb-flushes.patch changed
> SWAP_CLUSTER_MAX from 32 pages to 256 pages, inadvertently switching
> the scan window for vmpressure detection from 2MB to 16MB. Revert.
> 
> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
> ---
>  mm/vmpressure.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/mm/vmpressure.c b/mm/vmpressure.c
> index c5afd57..74f206b 100644
> --- a/mm/vmpressure.c
> +++ b/mm/vmpressure.c
> @@ -38,7 +38,7 @@
>   * TODO: Make the window size depend on machine size, as we do for vmstat
>   * thresholds. Currently we set it to 512 pages (2MB for 4KB pages).
>   */
> -static const unsigned long vmpressure_win = SWAP_CLUSTER_MAX * 16;
> +static const unsigned long vmpressure_win = SWAP_CLUSTER_MAX;

Argh, Mel's patch sets SWAP_CLUSTER_MAX to 256, so this should be
SWAP_CLUSTER_MAX * 2 to retain the 512 pages scan window.

Andrew could you please update this fix in-place? Otherwise I'll
resend a corrected version.

Thanks, and sorry about that.


* Re: [PATCH] mm: vmpressure: fix scan window after SWAP_CLUSTER_MAX increase
  2015-10-21 19:38 ` Johannes Weiner
@ 2015-10-21 20:05   ` Hugh Dickins
  2015-11-16 20:22     ` Johannes Weiner
  0 siblings, 1 reply; 8+ messages in thread
From: Hugh Dickins @ 2015-10-21 20:05 UTC (permalink / raw)
  To: Johannes Weiner
  Cc: Andrew Morton, Vladimir Davydov, linux-mm, cgroups, linux-kernel

On Wed, 21 Oct 2015, Johannes Weiner wrote:
> On Mon, Oct 19, 2015 at 02:13:01PM -0400, Johannes Weiner wrote:
> > mm-increase-swap_cluster_max-to-batch-tlb-flushes.patch changed
> > SWAP_CLUSTER_MAX from 32 pages to 256 pages, inadvertently switching
> > the scan window for vmpressure detection from 2MB to 16MB. Revert.
> > 
> > Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
> > ---
> >  mm/vmpressure.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/mm/vmpressure.c b/mm/vmpressure.c
> > index c5afd57..74f206b 100644
> > --- a/mm/vmpressure.c
> > +++ b/mm/vmpressure.c
> > @@ -38,7 +38,7 @@
> >   * TODO: Make the window size depend on machine size, as we do for vmstat
> >   * thresholds. Currently we set it to 512 pages (2MB for 4KB pages).
> >   */
> > -static const unsigned long vmpressure_win = SWAP_CLUSTER_MAX * 16;
> > +static const unsigned long vmpressure_win = SWAP_CLUSTER_MAX;
> 
> Argh, Mel's patch sets SWAP_CLUSTER_MAX to 256, so this should be
> SWAP_CLUSTER_MAX * 2 to retain the 512 pages scan window.
> 
> Andrew could you please update this fix in-place? Otherwise I'll
> resend a corrected version.
> 
> Thanks, and sorry about that.

I don't understand why "SWAP_CLUSTER_MAX * 2" is thought better than "512".
Retaining a level of obscurity, that just bit us twice, is a good thing?

Hugh


* Re: [PATCH] mm: vmpressure: fix scan window after SWAP_CLUSTER_MAX increase
  2015-10-21 20:05   ` Hugh Dickins
@ 2015-11-16 20:22     ` Johannes Weiner
  2015-12-02 10:11       ` Hugh Dickins
  0 siblings, 1 reply; 8+ messages in thread
From: Johannes Weiner @ 2015-11-16 20:22 UTC (permalink / raw)
  To: Hugh Dickins
  Cc: Andrew Morton, Vladimir Davydov, linux-mm, cgroups, linux-kernel

Dear Hugh,

[ sorry, I just noticed this email now ]

On Wed, Oct 21, 2015 at 01:05:53PM -0700, Hugh Dickins wrote:
> On Wed, 21 Oct 2015, Johannes Weiner wrote:
> > On Mon, Oct 19, 2015 at 02:13:01PM -0400, Johannes Weiner wrote:
> > > mm-increase-swap_cluster_max-to-batch-tlb-flushes.patch changed
> > > SWAP_CLUSTER_MAX from 32 pages to 256 pages, inadvertently switching
> > > the scan window for vmpressure detection from 2MB to 16MB. Revert.
> > > 
> > > Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
> > > ---
> > >  mm/vmpressure.c | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > 
> > > diff --git a/mm/vmpressure.c b/mm/vmpressure.c
> > > index c5afd57..74f206b 100644
> > > --- a/mm/vmpressure.c
> > > +++ b/mm/vmpressure.c
> > > @@ -38,7 +38,7 @@
> > >   * TODO: Make the window size depend on machine size, as we do for vmstat
> > >   * thresholds. Currently we set it to 512 pages (2MB for 4KB pages).
> > >   */
> > > -static const unsigned long vmpressure_win = SWAP_CLUSTER_MAX * 16;
> > > +static const unsigned long vmpressure_win = SWAP_CLUSTER_MAX;
> > 
> > Argh, Mel's patch sets SWAP_CLUSTER_MAX to 256, so this should be
> > SWAP_CLUSTER_MAX * 2 to retain the 512 pages scan window.
> > 
> > Andrew could you please update this fix in-place? Otherwise I'll
> > resend a corrected version.
> > 
> > Thanks, and sorry about that.
> 
> I don't understand why "SWAP_CLUSTER_MAX * 2" is thought better than "512".
> Retaining a level of obscurity, that just bit us twice, is a good thing?

I'm not sure it is. But it doesn't seem entirely wrong to link it to
the reclaim scan window, either--at least have it be a multiple of it
so that the vmpressure reporting happens cleanly at the end of a scan
cycle?

I don't mind changing it to 512, but it doesn't feel like an obvious
improvement, either.


* Re: [PATCH] mm: vmpressure: fix scan window after SWAP_CLUSTER_MAX increase
  2015-11-16 20:22     ` Johannes Weiner
@ 2015-12-02 10:11       ` Hugh Dickins
  0 siblings, 0 replies; 8+ messages in thread
From: Hugh Dickins @ 2015-12-02 10:11 UTC (permalink / raw)
  To: Johannes Weiner
  Cc: Hugh Dickins, Andrew Morton, Vladimir Davydov, linux-mm, cgroups,
	linux-kernel

On Mon, 16 Nov 2015, Johannes Weiner wrote:

> Dear Hugh,
> 
> [ sorry, I just noticed this email now ]

No problem: as you can see, I'm very far from keeping up myself.

> 
> On Wed, Oct 21, 2015 at 01:05:53PM -0700, Hugh Dickins wrote:
> > On Wed, 21 Oct 2015, Johannes Weiner wrote:
> > > On Mon, Oct 19, 2015 at 02:13:01PM -0400, Johannes Weiner wrote:
> > > > mm-increase-swap_cluster_max-to-batch-tlb-flushes.patch changed
> > > > SWAP_CLUSTER_MAX from 32 pages to 256 pages, inadvertently switching
> > > > the scan window for vmpressure detection from 2MB to 16MB. Revert.
> > > > 
> > > > Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
> > > > ---
> > > >  mm/vmpressure.c | 2 +-
> > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > > 
> > > > diff --git a/mm/vmpressure.c b/mm/vmpressure.c
> > > > index c5afd57..74f206b 100644
> > > > --- a/mm/vmpressure.c
> > > > +++ b/mm/vmpressure.c
> > > > @@ -38,7 +38,7 @@
> > > >   * TODO: Make the window size depend on machine size, as we do for vmstat
> > > >   * thresholds. Currently we set it to 512 pages (2MB for 4KB pages).
> > > >   */
> > > > -static const unsigned long vmpressure_win = SWAP_CLUSTER_MAX * 16;
> > > > +static const unsigned long vmpressure_win = SWAP_CLUSTER_MAX;
> > > 
> > > Argh, Mel's patch sets SWAP_CLUSTER_MAX to 256, so this should be
> > > SWAP_CLUSTER_MAX * 2 to retain the 512 pages scan window.
> > > 
> > > Andrew could you please update this fix in-place? Otherwise I'll
> > > resend a corrected version.
> > > 
> > > Thanks, and sorry about that.
> > 
> > I don't understand why "SWAP_CLUSTER_MAX * 2" is thought better than "512".
> > Retaining a level of obscurity, that just bit us twice, is a good thing?
> 
> I'm not sure it is. But it doesn't seem entirely wrong to link it to
> the reclaim scan window, either--at least be a multiple of it so that
> the vmpressure reporting happens cleanly at the end of a scan cycle?
> 
> I don't mind changing it to 512, but it doesn't feel like an obvious
> improvement, either.

Never mind: I thought it worth calling out at the time, but now that
your change is in, I've no reason to make a fuss over it - if it ever
bites us again, we can rework it then.

Besides, when I wrote, I hadn't noticed the comment there about
SWAP_CLUSTER_MAX, just out of sight before the context of your fix:
that would have to be reworded, and I don't know what it should say.

Oh, it's all still mmotm stuff, not in 4.4-rc.  Well, even so,
forget my interjection: we've all got bigger fish to fry.

Hugh


