linux-kernel.vger.kernel.org archive mirror
From: Johannes Weiner <hannes@cmpxchg.org>
To: Rik van Riel <riel@redhat.com>
Cc: "Pierre-Loup A. Griffais" <pgriffais@valvesoftware.com>,
	linux-kernel@vger.kernel.org, torvalds@linux-foundation.org,
	sonnyrao@chromium.org, kamezawa.hiroyu@jp.fujitsu.com,
	akpm@linux-foundation.org
Subject: Re: IO regression after ab8fabd46f on x86 kernels with high memory
Date: Fri, 26 Apr 2013 22:42:48 -0400
Message-ID: <20130427024248.GA1229@cmpxchg.org>
In-Reply-To: <517B2FB4.30605@redhat.com>

On Fri, Apr 26, 2013 at 09:53:56PM -0400, Rik van Riel wrote:
> On 04/26/2013 07:44 PM, Pierre-Loup A. Griffais wrote:
> >I initially observed this between kernels 3.2 and 3.5: on 3.2, copying a
> >180M shared object on the same ext4 filesystem takes 0.6s. On 3.5, it
> >takes between two and three minutes. It looks like a similar throughput
> >regression happens on any machine running an i386 PAE kernel with high
> >amounts of memory; the threshold seems to be 16G; passing mem=15G to the
> >kernel commandline fixes it.
> 
> If you have that much memory in the system, you will
> want to run a 64 bit kernel to avoid all kinds of
> memory management corner cases.

Agreed.  You can even keep your 32 bit userland, just swap the
kernel...

> >I bisected it to the following change:
> >
> >commit ab8fabd46f811d5153d8a0cd2fac9a0d41fb593d
> >Author: Johannes Weiner <jweiner@redhat.com>
> >Date:   Tue Jan 10 15:07:42 2012 -0800
> >
> >     mm: exclude reserved pages from dirtyable memory
> >
> >I realize running x86 kernels against high amounts of memory is not
> >advised for various reasons, but I would assume that such a big
> >regression in basic functionality to not be part of them. Is that
> >accurate, or are these configurations expected to become unusable from
> >3.3 onwards?
> 
> Reverting that patch would probably break i686 PAE systems with
> lots of memory at a different threshold.

It would also re-introduce the reclaim stalls when zones with very
little page cache due to lowmem reserves end up with a large
percentage of their LRU dirty.  And that affects modern machines too,
because of the lowmem reserves in DMA32 due to relatively bigger
Normal zones.

On such large highmem machines, however, the imbalance between highmem
and lowmem is so enormous that the lowmem reserves basically exclude
all of lowmem from page cache usage.

But because dirty highmem creates lowmem pressure, and the amount of
sanely allowable dirty memory is actually a function of lowmem, not
highmem, highmem is not included in the amount of dirtyable memory.

So because your lowmem is not available for page cache and highmem is
not considered dirtyable out of the box, the amount of dirtyable
memory on your machine is 0.  You can work around this by setting
vm.highmem_is_dirtyable=1.
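
To make the arithmetic above concrete, here is a rough sketch of how the
dirtyable total could collapse to zero.  It is a simplified model and not
the kernel's actual code: the Zone class, the dirtyable_pages() helper and
all page counts are hypothetical, chosen only to mirror a machine where
the reserves cover all of lowmem and highmem is skipped by default.

```python
from dataclasses import dataclass


@dataclass
class Zone:
    name: str
    reclaimable_pages: int  # free + page cache pages (made-up numbers)
    reserve_pages: int      # pages held back from page cache use
    is_highmem: bool = False


def dirtyable_pages(zones, highmem_is_dirtyable=False):
    """Sum the pages that may hold dirty page cache (simplified model)."""
    total = 0
    for zone in zones:
        # Highmem only counts when the sysctl-style switch is enabled.
        if zone.is_highmem and not highmem_is_dirtyable:
            continue
        # A zone whose reserve covers everything contributes nothing.
        total += max(0, zone.reclaimable_pages - zone.reserve_pages)
    return total


# Hypothetical large-highmem layout: reserves swallow all of lowmem.
zones = [
    Zone("DMA", 4_000, 4_000),
    Zone("Normal", 200_000, 200_000),
    Zone("HighMem", 3_900_000, 0, is_highmem=True),
]

default_limit = dirtyable_pages(zones)
workaround_limit = dirtyable_pages(zones, highmem_is_dirtyable=True)
print(default_limit, workaround_limit)
```

In this model the default total is 0 and only becomes non-zero once
highmem is counted, which is the effect of the workaround.  On a kernel
built with CONFIG_HIGHMEM the switch is exposed as
/proc/sys/vm/highmem_is_dirtyable.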


Thread overview: 16+ messages
2013-04-26 23:44 IO regression after ab8fabd46f on x86 kernels with high memory Pierre-Loup A. Griffais
2013-04-27  1:53 ` Rik van Riel
2013-04-27  2:42   ` Johannes Weiner [this message]
2013-04-29 21:53     ` Pierre-Loup A. Griffais
2013-04-29 22:03       ` Linus Torvalds
2013-04-29 22:08         ` Pierre-Loup A. Griffais
2013-05-02  4:37           ` Sonny Rao
2013-04-30  0:48         ` Rik van Riel
2013-04-30  1:06           ` Pierre-Loup A. Griffais
2013-05-02  1:34           ` Steven Rostedt
2013-05-02  2:46             ` [PATCH] mm,x86: limit 32 bit kernel to 12GB memory Rik van Riel
2013-05-02  7:37               ` Pierre-Loup A. Griffais
2013-05-02 20:03               ` Linus Torvalds
2013-05-11  9:16                 ` Yuhong Bao
2013-05-08 19:10         ` IO regression after ab8fabd46f on x86 kernels with high memory H. Peter Anvin
2013-06-03  1:17           ` Yuhong Bao
