linux-mm.kvack.org archive mirror
From: "Zi Yan" <zi.yan@cs.rutgers.edu>
To: Michal Hocko <mhocko@suse.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-mm@kvack.org, Alex Williamson <alex.williamson@redhat.com>,
	David Rientjes <rientjes@google.com>,
	Vlastimil Babka <vbabka@suse.cz>,
	Stefan Priebe - Profihost AG <s.priebe@profihost.ag>
Subject: Re: [PATCH] mm, thp: relax __GFP_THISNODE for MADV_HUGEPAGE mappings
Date: Thu, 30 Aug 2018 09:22:21 -0400
Message-ID: <4AFDF557-46E3-4C62-8A43-C28E8F2A54CF@cs.rutgers.edu>
In-Reply-To: <20180830070021.GB2656@dhcp22.suse.cz>


On 30 Aug 2018, at 3:00, Michal Hocko wrote:

> On Wed 29-08-18 18:54:23, Zi Yan wrote:
> [...]
>> I tested it against Linus’s tree with “memhog -r3 130g” in a two-socket machine with 128GB memory on
>> each node and got the results below. I expect this test should fill one node, then fall back to the other.
>>
>> 1. madvise(MADV_HUGEPAGE) + defrag = {always, madvise, defer+madvise}:
>> no swap, THPs are allocated in the fallback node.
>> 2. madvise(MADV_HUGEPAGE) + defrag = defer: pages got swapped to the
>> disk instead of being allocated in the fallback node.
>> 3. no madvise, THP is on by default + defrag = {always, defer,
>> defer+madvise}: pages got swapped to the disk instead of being
>> allocated in the fallback node.
>> 4. no madvise, THP is on by default + defrag = madvise: no swap, base
>> pages are allocated in the fallback node.
>>
>> Results 2 and 3 seem unexpected, since pages should be allocated in the fallback node.
>>
>> The reason, as Andrea mentioned in his email, is the combination
>> of __GFP_THISNODE and __GFP_DIRECT_RECLAIM (plus __GFP_KSWAPD_RECLAIM,
>> from this experiment).
>
> But we do not set __GFP_THISNODE along with __GFP_DIRECT_RECLAIM AFAICS.
> We do for __GFP_KSWAPD_RECLAIM though and I guess that it is expected to
> see kswapd do the reclaim to balance the node. If the node is full of
> anonymous pages then there is no other way than swap out.

GFP_TRANSHUGE implies __GFP_DIRECT_RECLAIM. When no madvise is given and THP is on
with defrag=always, gfp_mask has both __GFP_THISNODE and __GFP_DIRECT_RECLAIM set,
so swapping can be triggered.

The key issue here is that “memhog -r3 130g” uses the default memory policy (MPOL_DEFAULT),
which should allow page allocation to fall back to other nodes. But as shown in
result 3, swapping is triggered instead of falling back.
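
To make the interaction concrete, here is a toy model of the decision, not
kernel code. The TOY_* names and bit values are made up for illustration (the
real flags are defined in include/linux/gfp.h), and the real allocator also
considers watermarks, compaction and more. It only captures the point above:
once the zonelist has no fallback, a reclaim flag turns "local node is full"
into reclaim on the local node.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the real GFP bits; values are illustrative. */
enum toy_gfp {
    TOY_GFP_THISNODE       = 1u << 0, /* models __GFP_THISNODE */
    TOY_GFP_DIRECT_RECLAIM = 1u << 1, /* models __GFP_DIRECT_RECLAIM */
    TOY_GFP_KSWAPD_RECLAIM = 1u << 2, /* models __GFP_KSWAPD_RECLAIM */
};

enum toy_outcome {
    TOY_LOCAL,         /* allocated on the local node */
    TOY_REMOTE,        /* fell back to another node */
    TOY_RECLAIM_LOCAL, /* reclaim (possibly swap) on the local node */
    TOY_FAIL,          /* allocation fails; caller uses base pages */
};

/* Simplified decision: local node first, then fallback, then reclaim. */
enum toy_outcome toy_alloc(unsigned gfp, bool local_has_free,
                           bool remote_has_free)
{
    if (local_has_free)
        return TOY_LOCAL;
    /* The THISNODE bit models ZONELIST_NOFALLBACK: no remote nodes. */
    if (!(gfp & TOY_GFP_THISNODE) && remote_has_free)
        return TOY_REMOTE;
    /* No node left to try, so any reclaim flag hits the local node. */
    if (gfp & (TOY_GFP_DIRECT_RECLAIM | TOY_GFP_KSWAPD_RECLAIM))
        return TOY_RECLAIM_LOCAL;
    return TOY_FAIL;
}
```

With a full local node and a free remote node, toy_alloc() returns
TOY_RECLAIM_LOCAL for THISNODE plus DIRECT_RECLAIM (the behavior in result 3),
and TOY_REMOTE once the THISNODE bit is dropped.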

>
>> __GFP_THISNODE uses ZONELIST_NOFALLBACK, which
>> removes the fallback possibility, and __GFP_*_RECLAIM then triggers page
>> reclaim in the first page allocation node because the fallback nodes are
>> removed by ZONELIST_NOFALLBACK.
>
> Yes but the point is that the allocations which use __GFP_THISNODE are
> optimistic so they shouldn't fallback to remote NUMA nodes.

This can be achieved with the MPOL_BIND memory policy, which restricts the
nodemask in struct alloc_context for user space memory allocations.
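
For reference, a minimal user space sketch of setting MPOL_BIND for the calling
task. It uses the raw set_mempolicy(2) syscall so it does not depend on
libnuma. The helper name bind_task_to_node() is made up for this example, and
the single-word mask only covers node IDs that fit in one unsigned long.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <errno.h>
#include <linux/mempolicy.h> /* MPOL_BIND */
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical helper: restrict the calling task's future allocations to
 * one NUMA node. Returns 0 on success, -1 with errno set on failure.
 */
int bind_task_to_node(unsigned node)
{
    unsigned long mask;

    if (node >= 8 * sizeof(mask)) {
        errno = EINVAL; /* node does not fit in the one-word mask */
        return -1;
    }
    mask = 1UL << node;

    /*
     * As far as I can tell the kernel decrements maxnode before reading
     * the mask, so pass one more than the number of bits provided.
     */
    return (int)syscall(SYS_set_mempolicy, MPOL_BIND, &mask,
                        8 * sizeof(mask) + 1);
}
```

A caller that wants strict locality would call bind_task_to_node(0) before
touching memory. On failure, errno is typically EINVAL for a node that is not
available or EPERM in a restricted container.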

>
>> IMHO, __GFP_THISNODE should not be used for user memory allocation at
>> all, since it fights against most memory policies.  But kernel
>> memory allocation would need it as a kernel MPOL_BIND memory policy.
>
> __GFP_THISNODE is indeed an ugliness. I would really love to get rid of
> it here. But the problem is that optimistic THP allocations should
> prefer a local node because a remote node might easily offset the
> advantage of the THP. I do not have a great idea how to achieve that
> without __GFP_THISNODE though.

The MPOL_PREFERRED memory policy can be used to achieve this optimistic THP allocation
for user space. Even with the default memory policy, the local memory node will be used
first until it is full. It seems to me that __GFP_THISNODE is not necessary
if a proper memory policy is used.

Let me know if I missed anything. Thanks.
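
As a rough sketch of what that looks like from user space: map an anonymous
region, mark it MPOL_PREFERRED for one node with the raw mbind(2) syscall, and
request THP with madvise(MADV_HUGEPAGE). The function name
map_thp_preferring_node() is made up for this example, and both hints are
treated as best effort, which is an assumption of the sketch rather than a
requirement.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <errno.h>
#include <linux/mempolicy.h> /* MPOL_PREFERRED */
#include <stddef.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical helper: anonymous mapping that prefers `node` but may fall
 * back to other nodes, with THP requested. Returns NULL on failure.
 */
void *map_thp_preferring_node(size_t len, unsigned node)
{
    unsigned long mask;
    void *p;

    if (node >= 8 * sizeof(mask)) {
        errno = EINVAL; /* node does not fit in the one-word mask */
        return NULL;
    }
    mask = 1UL << node;

    p = mmap(NULL, len, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;

    /* Both calls are hints here; the mapping stays usable if they fail. */
    (void)syscall(SYS_mbind, p, len, MPOL_PREFERRED, &mask,
                  8 * sizeof(mask) + 1, 0UL);
    (void)madvise(p, len, MADV_HUGEPAGE);
    return p;
}
```

Unlike MPOL_BIND, MPOL_PREFERRED keeps the other nodes available as a
fallback, so a full preferred node should lead to remote allocation rather
than an allocation restricted to that node.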


—
Best Regards,
Yan Zi


