linux-mm.kvack.org archive mirror
From: Andrea Arcangeli <aarcange@redhat.com>
To: David Rientjes <rientjes@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>,
	"Kirill A. Shutemov" <kirill@shutemov.name>,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-mm@kvack.org, Alex Williamson <alex.williamson@redhat.com>
Subject: Re: [PATCH 0/2] fix for "pathological THP behavior"
Date: Tue, 21 Aug 2018 18:18:43 -0400
Message-ID: <20180821221843.GI13047@redhat.com>
In-Reply-To: <alpine.DEB.2.21.1808211021110.258924@chino.kir.corp.google.com>

On Tue, Aug 21, 2018 at 10:26:54AM -0700, David Rientjes wrote:
> MADV_HUGEPAGE (or defrag == "always") would now become a combination of
> "try to compact locally" and "allocate remotely if necessary" without the
> ability to avoid the latter absent a mempolicy that affects all memory

I don't follow why compaction should run only on the local node in
such a case (i.e. __GFP_THISNODE removed when __GFP_DIRECT_RECLAIM is
set).

The zonelist will simply span all nodes, so compaction & reclaim should
both run on all nodes for MADV_HUGEPAGE with option 2).

The only mess in the allocator right now is that compaction runs per
zone and reclaim runs per node, but that's another issue and won't
hurt in this case.
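
As a rough illustration of the option 2) flag handling described above,
here is a minimal user-space sketch. The SK_* flag values and the sk_*
function names are hypothetical stand-ins invented for this example;
they are not the kernel's definitions and this is not the patch itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel gfp flags (values are made up). */
#define SK_GFP_THISNODE        (1u << 0)
#define SK_GFP_DIRECT_RECLAIM  (1u << 1)

/*
 * Option 2) as discussed: when the allocation is allowed to enter
 * direct reclaim/compaction, drop the local-node restriction so the
 * zonelist spans all nodes. Otherwise leave the mask untouched.
 */
unsigned int sk_relax_thisnode(unsigned int gfp)
{
	if (gfp & SK_GFP_DIRECT_RECLAIM)
		gfp &= ~SK_GFP_THISNODE;
	return gfp;
}

/* True if the allocation is still restricted to the local node. */
bool sk_local_only(unsigned int gfp)
{
	return (gfp & SK_GFP_THISNODE) != 0;
}

/* Usage: the two cases from the discussion. */
void sk_option2_example(void)
{
	/* MADV_HUGEPAGE-like case: direct reclaim set, so all nodes are eligible. */
	assert(!sk_local_only(sk_relax_thisnode(SK_GFP_THISNODE |
						SK_GFP_DIRECT_RECLAIM)));
	/* No direct reclaim: the allocation stays local-only. */
	assert(sk_local_only(sk_relax_thisnode(SK_GFP_THISNODE)));
}
```

With the restriction dropped, nothing in the mask confines compaction
or reclaim to the local node, which is the point made above.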

> allocations.  I think the complete solution would be a MPOL_F_HUGEPAGE
> flag that defines mempolicies for hugepage allocations.  In my experience
> thp falling back to remote nodes for intrasocket latency is a win but
> intersocket or two-hop intersocket latency is a no go.

Yes, that's my expectation too.

So what you suggest is to add a new hard binding that allows altering
the default behavior for THP. That sure sounds fine.

We still have to pick the actual default and decide whether a single
default is ok, whether it should be tunable, or even whether the
default should change depending on the NUMA topology.

I suspect it's a bit overkill to have different defaults depending on
NUMA topology. There have been defaults for obscure things like
numa_zonelist_order that changed behavior depending on the number of
nodes, and they happened to hurt on some systems. I ended up tuning
them to the current default (until the runtime tuning was removed).

It's a bit hard to pick the best default based only on arbitrary
things like the number of NUMA nodes or distance, especially when what
is better also depends on the actual application.

I think the options are sane behaviors with some pros and cons: option
2) is simpler and will likely perform better on smaller systems, while
option 1) is less risky on larger systems.

In any case, the watermark optimization (set __GFP_THISNODE only if
there's plenty of PAGE_SIZEd memory in the local node) remains a valid
optimization for later, for the default "defrag" value (i.e. no
MADV_HUGEPAGE) that doesn't set __GFP_DIRECT_RECLAIM. If there's no RAM
free in the local node we can totally try to pick the THP from the
other nodes, and not doing so only has the benefit of saving the
watermark check itself.
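
A minimal user-space sketch of that watermark idea follows, under
stated assumptions: the SK_* flag value, the function names and the
"free pages above a threshold" test are all hypothetical placeholders.
Which watermark the real check would compare against is not specified
here, so the threshold is simply a parameter.

```c
#include <assert.h>

/* Hypothetical stand-in for the kernel gfp flag (value is made up). */
#define SK_GFP_THISNODE  (1u << 0)

/*
 * Build the mask for a THP allocation that does not set direct reclaim
 * (the default "defrag" case, no MADV_HUGEPAGE).
 *
 * local_free_pages: free PAGE_SIZEd pages on the local node.
 * threshold_pages:  assumed "plenty of memory" threshold.
 *
 * Plenty of local base pages: keep the allocation on the local node.
 * Local node short on memory: allow the THP to come from other nodes.
 */
unsigned int sk_thp_gfp_no_reclaim(unsigned long local_free_pages,
				   unsigned long threshold_pages)
{
	unsigned int gfp = 0;

	if (local_free_pages > threshold_pages)
		gfp |= SK_GFP_THISNODE;
	return gfp;
}

/* Usage: one node with plenty of free memory, one without. */
void sk_watermark_example(void)
{
	assert(sk_thp_gfp_no_reclaim(1000, 100) & SK_GFP_THISNODE);
	assert(!(sk_thp_gfp_no_reclaim(50, 100) & SK_GFP_THISNODE));
}
```

Skipping the check and always restricting to the local node would save
only the comparison itself, as noted above.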

Thanks,
Andrea

