LKML Archive on lore.kernel.org
 help / color / Atom feed
From: David Rientjes <rientjes@google.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: ying.huang@intel.com, Andrea Arcangeli <aarcange@redhat.com>,
	Michal Hocko <mhocko@suse.com>,
	s.priebe@profihost.ag, mgorman@techsingularity.net,
	Linux List Kernel Mailing <linux-kernel@vger.kernel.org>,
	alex.williamson@redhat.com, lkp@01.org, kirill@shutemov.name,
	Andrew Morton <akpm@linux-foundation.org>,
	zi.yan@cs.rutgers.edu, Vlastimil Babka <vbabka@suse.cz>
Subject: Re: [LKP] [mm] ac5b2c1891: vm-scalability.throughput -61.3% regression
Date: Wed, 28 Nov 2018 15:10:04 -0800 (PST)
Message-ID: <alpine.DEB.2.21.1811281504030.231719@chino.kir.corp.google.com> (raw)
In-Reply-To: <CAHk-=wjgRO-=NPaU9EmrdC3it3o7kvf4u7sewv3crtNLkE13Hg@mail.gmail.com>

On Wed, 28 Nov 2018, Linus Torvalds wrote:

> On Tue, Nov 27, 2018 at 7:20 PM Huang, Ying <ying.huang@intel.com> wrote:
> >
> > From the above data, for the parent commit 3 processes exited within
> > 14s, another 3 exited within 100s.  For this commit, the first process
> > exited at 203s.  That is, this commit makes memory allocation more fair
> > among processes, so that processes proceeded at more similar speed.  But
> > this raises system memory footprint too, so triggered much more swap,
> > thus lower benchmark score.
> >
> > In general, memory allocation fairness among processes should be a good
> > thing.  So I think the report should have been a "performance
> > improvement" instead of "performance regression".
> 
> Hey, when you put it that way...
> 
> Let's ignore this issue for now, and see if it shows up in some real
> workload and people complain.
> 

Well, I originally complained[*] when the change was first proposed and 
when the stable backports were proposed[**].  On a fragmented host, the 
change itself showed a 13.9% access latency regression on Haswell and up 
to 40% allocation latency regression.  This is more substantial on Naples 
and Rome.  I also measured similar numbers to this for Haswell.

We are particularly hit hard by this because we have libraries that remap 
the text segment of binaries to hugepages; hugetlbfs is not widely used so 
this normally falls back to transparent hugepages.  We mmap(), 
madvise(MADV_HUGEPAGE), memcpy(), mremap().  We fully accept the latency 
to do this when the binary starts because the access latency at runtime is 
so much better.

With this change, however, we have no userspace workaround other than 
mbind() to prefer the local node.  On all of our platforms, native sized 
pages are always a win over remote hugepages and it leaves open the 
opportunity that we collapse memory into hugepages later by khugepaged if 
fragmentation is the issue.  mbind() is not viable if the local node is 
saturated, we are ok with falling back to remote pages of the native page 
size when the local node is oom; this would result in an oom kill if we 
used it to retain the old behavior.

Given this severe access and allocation latency regression, we must revert 
this patch in our own kernel, there is simply no path forward without 
doing so.

[*] https://marc.info/?l=linux-kernel&m=153868420126775
[**] https://marc.info/?l=linux-kernel&m=154269994800842

  parent reply index

Thread overview: 77+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-11-27  6:25 kernel test robot
2018-11-27 17:08 ` Linus Torvalds
2018-11-27 18:17   ` Michal Hocko
2018-11-27 18:21     ` Michal Hocko
2018-11-27 19:05   ` Vlastimil Babka
2018-11-27 19:16     ` Vlastimil Babka
2018-11-27 20:57   ` Andrea Arcangeli
2018-11-27 22:50     ` Linus Torvalds
2018-11-28  6:30       ` Michal Hocko
2018-11-28  3:20     ` Huang\, Ying
2018-11-28 16:48       ` Linus Torvalds
2018-11-28 18:39         ` Andrea Arcangeli
2018-11-28 23:10         ` David Rientjes [this message]
2018-12-03 18:01         ` Linus Torvalds
2018-12-03 18:14           ` Michal Hocko
2018-12-03 18:19             ` Linus Torvalds
2018-12-03 18:30               ` Michal Hocko
2018-12-03 18:45                 ` Linus Torvalds
2018-12-03 18:59                   ` Michal Hocko
2018-12-03 19:23                     ` Andrea Arcangeli
2018-12-03 20:26                       ` David Rientjes
2018-12-03 19:28                     ` Linus Torvalds
2018-12-03 20:12                       ` Andrea Arcangeli
2018-12-03 20:36                         ` David Rientjes
2018-12-03 22:04                         ` Linus Torvalds
2018-12-03 22:27                           ` Linus Torvalds
2018-12-03 22:57                             ` David Rientjes
2018-12-04  9:22                             ` Vlastimil Babka
2018-12-04 10:45                               ` Mel Gorman
2018-12-05  0:47                                 ` David Rientjes
2018-12-05  9:08                                   ` Michal Hocko
2018-12-05 10:43                                     ` Mel Gorman
2018-12-05 11:43                                       ` Michal Hocko
2018-12-05 10:06                                 ` Mel Gorman
2018-12-05 20:40                                 ` Andrea Arcangeli
2018-12-05 21:59                                   ` David Rientjes
2018-12-06  0:00                                     ` Andrea Arcangeli
2018-12-05 22:03                                   ` Linus Torvalds
2018-12-05 22:12                                     ` David Rientjes
2018-12-05 23:36                                     ` Andrea Arcangeli
2018-12-05 23:51                                       ` Linus Torvalds
2018-12-06  0:58                                         ` Linus Torvalds
2018-12-06  9:14                                           ` MADV_HUGEPAGE vs. NUMA semantic (was: Re: [LKP] [mm] ac5b2c1891: vm-scalability.throughput -61.3% regression) Michal Hocko
2018-12-06 23:49                                             ` David Rientjes
2018-12-07  7:34                                               ` Michal Hocko
2018-12-07  4:31                                             ` Linus Torvalds
2018-12-07  7:49                                               ` Michal Hocko
2018-12-07  9:06                                                 ` Vlastimil Babka
2018-12-07 23:15                                                   ` David Rientjes
2018-12-06 23:43                                           ` [LKP] [mm] ac5b2c1891: vm-scalability.throughput -61.3% regression David Rientjes
2018-12-07  4:01                                             ` Linus Torvalds
2018-12-10  0:29                                               ` David Rientjes
2018-12-10  4:49                                                 ` Andrea Arcangeli
2018-12-12  0:37                                                   ` David Rientjes
2018-12-12  9:50                                                     ` Michal Hocko
2018-12-12 17:00                                                       ` Andrea Arcangeli
2018-12-14 11:32                                                         ` Michal Hocko
2018-12-12 10:14                                                     ` Vlastimil Babka
2018-12-14 21:04                                                       ` David Rientjes
2018-12-14 21:33                                                         ` Vlastimil Babka
2018-12-21 22:18                                                           ` David Rientjes
2018-12-22 12:08                                                             ` Mel Gorman
2018-12-14 23:11                                                         ` Mel Gorman
2018-12-21 22:15                                                           ` David Rientjes
2018-12-12 10:44                                                   ` Andrea Arcangeli
2019-04-15 11:48                                             ` Michal Hocko
2018-12-06  0:18                                       ` David Rientjes
2018-12-06  0:54                                         ` Andrea Arcangeli
2018-12-06  9:23                                           ` Vlastimil Babka
2018-12-03 20:39                     ` David Rientjes
2018-12-03 21:25                       ` Michal Hocko
2018-12-03 21:53                         ` David Rientjes
2018-12-04  8:48                           ` Michal Hocko
2018-12-05  0:07                             ` David Rientjes
2018-12-05 10:18                               ` Michal Hocko
2018-12-05 19:16                                 ` David Rientjes
2018-11-27  7:23 kernel test robot

Reply instructions:

You may reply publically to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=alpine.DEB.2.21.1811281504030.231719@chino.kir.corp.google.com \
    --to=rientjes@google.com \
    --cc=aarcange@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=alex.williamson@redhat.com \
    --cc=kirill@shutemov.name \
    --cc=linux-kernel@vger.kernel.org \
    --cc=lkp@01.org \
    --cc=mgorman@techsingularity.net \
    --cc=mhocko@suse.com \
    --cc=s.priebe@profihost.ag \
    --cc=torvalds@linux-foundation.org \
    --cc=vbabka@suse.cz \
    --cc=ying.huang@intel.com \
    --cc=zi.yan@cs.rutgers.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

LKML Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/lkml/0 lkml/git/0.git
	git clone --mirror https://lore.kernel.org/lkml/1 lkml/git/1.git
	git clone --mirror https://lore.kernel.org/lkml/2 lkml/git/2.git
	git clone --mirror https://lore.kernel.org/lkml/3 lkml/git/3.git
	git clone --mirror https://lore.kernel.org/lkml/4 lkml/git/4.git
	git clone --mirror https://lore.kernel.org/lkml/5 lkml/git/5.git
	git clone --mirror https://lore.kernel.org/lkml/6 lkml/git/6.git
	git clone --mirror https://lore.kernel.org/lkml/7 lkml/git/7.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 lkml lkml/ https://lore.kernel.org/lkml \
		linux-kernel@vger.kernel.org linux-kernel@archiver.kernel.org
	public-inbox-index lkml


Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-kernel


AGPL code for this site: git clone https://public-inbox.org/ public-inbox