From: Michal Hocko <mhocko@kernel.org>
To: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Mel Gorman <mgorman@suse.de>, Vlastimil Babka <vbabka@suse.cz>,
David Rientjes <rientjes@google.com>,
Andrea Arcangeli <andrea@kernel.org>,
Zi Yan <zi.yan@cs.rutgers.edu>,
Stefan Priebe - Profihost AG <s.priebe@profihost.ag>,
linux-mm@kvack.org, LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 2/2] mm, thp: consolidate THP gfp handling into alloc_hugepage_direct_gfpmask
Date: Wed, 26 Sep 2018 16:17:08 +0200
Message-ID: <20180926141708.GX6278@dhcp22.suse.cz>
In-Reply-To: <20180926133039.y7o5x4nafovxzh2s@kshutemo-mobl1>
On Wed 26-09-18 16:30:39, Kirill A. Shutemov wrote:
> On Tue, Sep 25, 2018 at 02:03:26PM +0200, Michal Hocko wrote:
> > diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> > index c3bc7e9c9a2a..c0bcede31930 100644
> > --- a/mm/huge_memory.c
> > +++ b/mm/huge_memory.c
> > @@ -629,21 +629,40 @@ static vm_fault_t __do_huge_pmd_anonymous_page(struct vm_fault *vmf,
> > * available
> > * never: never stall for any thp allocation
> > */
> > -static inline gfp_t alloc_hugepage_direct_gfpmask(struct vm_area_struct *vma)
> > +static inline gfp_t alloc_hugepage_direct_gfpmask(struct vm_area_struct *vma, unsigned long addr)
> > {
> > const bool vma_madvised = !!(vma->vm_flags & VM_HUGEPAGE);
> > + gfp_t this_node = 0;
> > +
> > +#ifdef CONFIG_NUMA
> > + struct mempolicy *pol;
> > + /*
> > + * __GFP_THISNODE is used only when __GFP_DIRECT_RECLAIM is not
> > + * specified, to express a general desire to stay on the current
> > + * node for optimistic allocation attempts. If the defrag mode
> > + * and/or madvise hint requires the direct reclaim then we prefer
> > + * to fallback to other node rather than node reclaim because that
> > + * can lead to excessive reclaim even though there is free memory
> > + * on other nodes. We expect that NUMA preferences are specified
> > + * by memory policies.
> > + */
> > + pol = get_vma_policy(vma, addr);
> > + if (pol->mode != MPOL_BIND)
> > + this_node = __GFP_THISNODE;
> > + mpol_cond_put(pol);
> > +#endif
>
> I'm not very good with NUMA policies. Could you explain in more details how
> the code above is equivalent to the code below?
MPOL_PREFERRED is handled by policy_node() before we call
__alloc_pages_nodemask().

__GFP_THISNODE is applied only when we are not using
__GFP_DIRECT_RECLAIM, and that is now handled in
alloc_hugepage_direct_gfpmask().

Lastly, MPOL_BIND wasn't handled explicitly, but the removed late check
would have ended up dropping __GFP_THISNODE for it as well. So in the
end we are doing the same thing, unless I am missing something.
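
The rule described above can be modelled in a few lines of userspace C.
This is only a rough sketch, not the kernel code: the SK_* flag values,
the sk_mode enum and the sk_thp_gfp() helper are hypothetical names made
up for illustration, and the way the direct-reclaim case is folded into
one boolean is an assumption, since the rest of
alloc_hugepage_direct_gfpmask() is not shown in the quoted hunk.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's gfp bits; values are arbitrary. */
#define SK_GFP_DIRECT_RECLAIM 0x1u
#define SK_GFP_THISNODE       0x2u

/* Hypothetical stand-in for the mempolicy modes named in the thread. */
enum sk_mode {
	SK_MPOL_DEFAULT,
	SK_MPOL_PREFERRED,
	SK_MPOL_BIND,
	SK_MPOL_INTERLEAVE
};

/*
 * Model of the rule in the quoted comment: the "stay on this node" hint is
 * meant for optimistic attempts only, so it is never combined with direct
 * reclaim, and the quoted hunk does not set it for MPOL_BIND.
 */
static unsigned int sk_thp_gfp(enum sk_mode mode, bool direct_reclaim)
{
	unsigned int this_node = 0;

	if (mode != SK_MPOL_BIND)
		this_node = SK_GFP_THISNODE;

	/* Assumption: with direct reclaim, prefer falling back to other nodes. */
	if (direct_reclaim)
		return SK_GFP_DIRECT_RECLAIM;

	/* Optimistic attempt: keep the node-local hint if the policy allows it. */
	return this_node;
}

/* Usage: the outcomes this sketch is meant to illustrate. */
static void sk_check(void)
{
	assert(sk_thp_gfp(SK_MPOL_DEFAULT, false) == SK_GFP_THISNODE);
	assert(sk_thp_gfp(SK_MPOL_DEFAULT, true) == SK_GFP_DIRECT_RECLAIM);
	assert(sk_thp_gfp(SK_MPOL_BIND, false) == 0u);
}
```

The model deliberately leaves the preferred node and the nodemask out;
in the patch those are left to policy_node() and policy_nodemask() at
allocation time.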
> > @@ -2026,60 +2025,6 @@ alloc_pages_vma(gfp_t gfp, int order, struct vm_area_struct *vma,
> > goto out;
> > }
> >
> > - if (unlikely(IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE) && hugepage)) {
> > - int hpage_node = node;
> > -
> > - /*
> > - * For hugepage allocation and non-interleave policy which
> > - * allows the current node (or other explicitly preferred
> > - * node) we only try to allocate from the current/preferred
> > - * node and don't fall back to other nodes, as the cost of
> > - * remote accesses would likely offset THP benefits.
> > - *
> > - * If the policy is interleave, or does not allow the current
> > - * node in its nodemask, we allocate the standard way.
> > - */
> > - if (pol->mode == MPOL_PREFERRED &&
> > - !(pol->flags & MPOL_F_LOCAL))
> > - hpage_node = pol->v.preferred_node;
> > -
> > - nmask = policy_nodemask(gfp, pol);
> > - if (!nmask || node_isset(hpage_node, *nmask)) {
> > - mpol_cond_put(pol);
> > - /*
> > - * We cannot invoke reclaim if __GFP_THISNODE
> > - * is set. Invoking reclaim with
> > - * __GFP_THISNODE set, would cause THP
> > - * allocations to trigger heavy swapping
> > - * despite there may be tons of free memory
> > - * (including potentially plenty of THP
> > - * already available in the buddy) on all the
> > - * other NUMA nodes.
> > - *
> > - * At most we could invoke compaction when
> > - * __GFP_THISNODE is set (but we would need to
> > - * refrain from invoking reclaim even if
> > - * compaction returned COMPACT_SKIPPED because
> > - * there wasn't not enough memory to succeed
> > - * compaction). For now just avoid
> > - * __GFP_THISNODE instead of limiting the
> > - * allocation path to a strict and single
> > - * compaction invocation.
> > - *
> > - * Supposedly if direct reclaim was enabled by
> > - * the caller, the app prefers THP regardless
> > - * of the node it comes from so this would be
> > - * more desiderable behavior than only
> > - * providing THP originated from the local
> > - * node in such case.
> > - */
> > - if (!(gfp & __GFP_DIRECT_RECLAIM))
> > - gfp |= __GFP_THISNODE;
> > - page = __alloc_pages_node(hpage_node, gfp, order);
> > - goto out;
> > - }
> > - }
> > -
> > nmask = policy_nodemask(gfp, pol);
> > preferred_nid = policy_node(gfp, pol, node);
> > page = __alloc_pages_nodemask(gfp, order, preferred_nid, nmask);
>
> --
> Kirill A. Shutemov
--
Michal Hocko
SUSE Labs