From: Michal Hocko <mhocko@suse.com>
To: Zach O'Keefe <zokeefe@google.com>
Cc: Yang Shi <shy828301@gmail.com>,
	akpm@linux-foundation.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org,
	Andrew Davidoff <davidoff@qedmf.net>, Bob Liu <lliubbo@gmail.com>
Subject: Re: [PATCH] mm: don't warn if the node is offlined
Date: Thu, 3 Nov 2022 08:51:03 +0100
Message-ID: <Y2Ny54MOyr3LveY3@dhcp22.suse.cz>
In-Reply-To: <CAAa6QmQ+4XndbtE_=mcaC5OaeK4g42dKYfY5FmYoRDTKGO-3nA@mail.gmail.com>

On Wed 02-11-22 11:58:26, Zach O'Keefe wrote:
> On Wed, Nov 2, 2022 at 11:18 AM Yang Shi <shy828301@gmail.com> wrote:
> >
> > On Wed, Nov 2, 2022 at 10:47 AM Michal Hocko <mhocko@suse.com> wrote:
> > >
> > > On Wed 02-11-22 10:36:07, Yang Shi wrote:
> > > > On Wed, Nov 2, 2022 at 9:15 AM Michal Hocko <mhocko@suse.com> wrote:
> > > > >
> > > > > On Wed 02-11-22 09:03:57, Yang Shi wrote:
> > > > > > On Wed, Nov 2, 2022 at 12:39 AM Michal Hocko <mhocko@suse.com> wrote:
> > > > > > >
> > > > > > > On Tue 01-11-22 12:13:35, Zach O'Keefe wrote:
> > > > > > > [...]
> > > > > > > > This is slightly tangential - but I don't want to send a new mail
> > > > > > > > about it -- but I wonder if we should be doing __GFP_THISNODE +
> > > > > > > > explicit node vs having hpage_collapse_find_target_node() set a
> > > > > > > > nodemask. We could then provide fallback nodes for ties, or if some
> > > > > > > > node contained > some threshold number of pages.
> > > > > > >
> > > > > > > I would simply go with something like this (not even compile tested):
> > > > > >
> > > > > > Thanks, Michal. It is definitely an option. As I talked with Zach, I'm
> > > > > > not sure whether it is worth making the code more complicated for such
> > > > > > micro optimization or not. Removing __GFP_THISNODE or even removing
> > > > > > the node balance code should be fine too IMHO. TBH I doubt there would
> > > > > > be any noticeable difference.
> > > > >
> > > > > I do agree that an explicit nodes (quasi)round robin sounds over
> > > > > engineered. It makes some sense to try to target the prevalent node
> > > > > though because this code can be executed from khugepaged and therefore
> > > > > allocating with a completely different affinity than the original fault.
> > > >
> > > > Yeah, the corner case comes from the node balance code, it just tries
> > > > to balance between multiple prevalent nodes, so you agree to remove it
> > > > IIRC?
> > >
> > > Yeah, let's just collect all good nodes into a nodemask and keep
> > > __GFP_THISNODE in place. You can consider having the nodemask per collapse_control
> > > so that you allocate it only once in the struct lifetime.
> >
> > Actually my intention is more aggressive, just remove that node balance code.
> >
> 
> The balancing code dates back to 2013 commit 9f1b868a13ac ("mm: thp:
> khugepaged: add policy for finding target node") where it was made to
> satisfy "numactl --interleave=all". I don't know why any real
> workloads would want this -- but there very well could be a valid use
> case. If not, I think it could be removed independent of what we do
> with __GFP_THISNODE and nodemask.
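
For illustration only, the "collect all good nodes into a nodemask" idea
quoted above could look roughly like the userspace sketch below. It is not
the khugepaged code: the function name collect_prevalent_nodes(), the plain
uint64_t bitmask standing in for a kernel nodemask_t, and the 64-node limit
are all assumptions made for this example.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical sketch: return a bitmask with one bit set for every node
 * whose page count ties for the maximum. Returns 0 if no node holds any
 * pages. A uint64_t stands in for the kernel's nodemask_t, so at most
 * 64 nodes are supported here.
 */
static uint64_t collect_prevalent_nodes(const int node_load[], int nr_nodes)
{
	int max_value = 0;
	uint64_t mask = 0;

	assert(nr_nodes >= 0 && nr_nodes <= 64);

	for (int nid = 0; nid < nr_nodes; nid++) {
		if (node_load[nid] > max_value) {
			/* New maximum: drop the candidates collected so far. */
			max_value = node_load[nid];
			mask = 0;
		}
		if (max_value > 0 && node_load[nid] == max_value)
			mask |= UINT64_C(1) << nid;
	}
	return mask;
}
```

With per-node counts of {100, 206, 0, 206}, nodes 1 and 3 tie for the
maximum, so the returned mask has bits 1 and 3 set. An allocator could then
be given the whole mask instead of a single node picked by a balancing
heuristic.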

Thanks for the reference. That patch is really dubious. If the primary
use case is a memory policy, then a memory policy should be used; we have
the vma handy. Sure, a per-task policy would be a bigger problem, but
interleaving is a mere hint rather than a hard requirement.

> Balancing aside -- I haven't fully thought through what an ideal (and
> further overengineered) solution would be for numa, but one (perceived
> - not measured) issue that khugepaged might have (MADV_COLLAPSE
> doesn't have the choice) is on systems with many, many nodes with
> source pages sprinkled across all of them. Should we collapse these
> pages into a single THP from the node with the most (but could still
> be a small %) pages? Probably there are better candidates. So, maybe a
> khugepaged-only check for max_value > (HPAGE_PMD_NR >> 1) or something
> makes sense.
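
As a rough illustration of the threshold check suggested in the quoted
paragraph above, a userspace sketch might look like this. The names
SKETCH_HPAGE_PMD_NR and has_dominant_node() are hypothetical, and the value
512 assumes 4 KiB base pages with 2 MiB PMD-sized huge pages (as on x86-64);
other configurations differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumption: 2 MiB huge page / 4 KiB base page = 512 base pages. */
#define SKETCH_HPAGE_PMD_NR 512

/*
 * Hypothetical sketch: report whether a single node holds more than half
 * of the source pages, i.e. max_value > (HPAGE_PMD_NR >> 1).
 */
static bool has_dominant_node(const int node_load[], int nr_nodes)
{
	int max_value = 0;

	assert(node_load != NULL);

	for (int nid = 0; nid < nr_nodes; nid++) {
		if (node_load[nid] > max_value)
			max_value = node_load[nid];
	}
	return max_value > (SKETCH_HPAGE_PMD_NR >> 1);
}
```

Under these assumptions, counts of {257, 255} pass the check, while an even
{256, 256} split or pages spread thinly over many nodes do not.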

Honestly, I do not see any problem to be solved here.

-- 
Michal Hocko
SUSE Labs