From: Mel Gorman <mgorman@techsingularity.net>
To: Mikulas Patocka <mpatocka@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Helge Deller <deller@gmx.de>,
	"James E.J. Bottomley" <James.Bottomley@hansenpartnership.com>,
	John David Anglin <dave.anglin@bell.net>,
	linux-parisc@vger.kernel.org, linux-mm@kvack.org,
	Vlastimil Babka <vbabka@suse.cz>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Zi Yan <zi.yan@cs.rutgers.edu>
Subject: Re: Memory management broken by "mm: reclaim small amounts of memory when an external fragmentation event occurs"
Date: Mon, 8 Apr 2019 13:54:48 +0100
Message-ID: <20190408125448.GB18914@techsingularity.net>
In-Reply-To: <alpine.LRH.2.02.1904080639570.4674@file01.intranet.prod.int.rdu2.redhat.com>

On Mon, Apr 08, 2019 at 07:10:11AM -0400, Mikulas Patocka wrote:
> > First, if pa-risc is !NUMA then why are separate local ranges
> > represented as separate nodes? Is it because of DISCONTIGMEM or something
> > else? DISCONTIGMEM is before my time so I'm not familiar with it and
> 
> I'm not an expert in this area, I don't know.
> 

Ok.

> > I consider it "essentially dead" but the arch init code seems to setup
> > pgdats for each physical contiguous range so it's a possibility. The most
> > likely explanation is pa-risc does not have hardware with addressing
> > limitations smaller than the CPUs physical address limits and it's
> > possible to have more ranges than available zones but clarification would
> > be nice.  By rights, SPARSEMEM would be supported on pa-risc but that
> > would be a time-consuming and somewhat futile exercise.  Regardless of the
> > explanation, as pa-risc does not appear to support transparent hugepages,
> > an option is to special case watermark_boost_factor to be 0 on DISCONTIGMEM
> > as that commit was primarily about THP with secondary concerns around
> > SLUB. This is probably the most straight-forward solution but it'd need
> > a comment obviously. I do not know what the distro configurations for
> > pa-risc set as I'm not a user of gentoo or debian.
> 
> I use Debian Sid, but I compile my own kernel. I uploaded the kernel 
> .config here: 
> http://people.redhat.com/~mpatocka/testcases/parisc-config.txt
> 

DISCONTIGMEM is set, so based on the arch init code and a glance at the
history, it seems my assumption was accurate. DISCONTIGMEM used NUMA
structures for non-NUMA machines to allow code to be reused and simplify
matters.

I'll put together a patch that disables this feature on DISCONTIGMEM as
the behaviour is surprising in the DISCONTIGMEM case.

> > Second, if you set the sysctl vm.watermark_boost_factor=0, does the
> > problem go away? If so, an option would be to set this sysctl to 0 by
> > default on distros that support pa-risc. Would that be suitable?
> 
> I have tried it and the problem almost goes away. With 
> vm.watermark_boost_factor=0, if I read 2GiB data from the disk, the buffer 
> cache will contain about 1.8GiB. So, there's still some superfluous page 
> reclaim, but it is smaller.
> 

Ok, for NUMA, I would generally expect some small amounts of reclaim on
a per-node basis from kswapd waking up as the node fills. I know in your
case there is no NUMA but from a memory consumption/reclaim point of
view, it doesn't matter. There are multiple active node structures so
the machine is treated as if it were NUMA.

In the short-term, I suggest you update /etc/sysctl.conf to work around
the issue.
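
A minimal sketch of what that could look like, assuming your system
applies /etc/sysctl.conf at boot (the comment wording is only
illustrative, the sysctl name is the one discussed above):

    # Work around excessive reclaim on parisc with DISCONTIGMEM
    vm.watermark_boost_factor = 0

Running "sysctl -p" as root should apply it without a reboot.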

> BTW. I'm interested - on real NUMA machines - is reclaiming the file cache 
> really a better option than allocating the file cache from non-local node?
> 

The patch is not related to file cache concerns; it's for the long-term
viability of high-order allocations, particularly THP but also SLUB,
which uses high-order allocations by default.

> 
> > Finally, I'm sure this has been asked before but why is pa-risc alive?
> > It appears a new CPU has not been manufactured since 2005. Even Alpha
> > I can understand being semi-alive since it's an interesting case for
> > weakly-ordered memory models. pa-risc appears to be supported and active
> > for debian at least so someone cares. It's not the only feature like this
> > that is bizarrely alive but it is curious -- 32 bit NUMA support on x86,
> > I'm looking at you, your machines are all dead since the early 2000's
> > AFAIK and anyone else using NUMA on 32-bit x86 needs their head examined.
> 
> I use it to test programs for portability to risc.
> 
> If one could choose between buying an expensive power system or a cheap 
> pa-risc system, pa-risc may be a better choice. The last pa-risc model has 
> four cores at 1.1GHz, so it is not completely unusable.

Well, if it was me and I was checking portability to RISC, I'd probably
get hold of a Raspberry Pi, but we all have different ways of looking at
things.

-- 
Mel Gorman
SUSE Labs
