From: Yang Shi <shy828301@gmail.com>
To: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Cc: Mel Gorman <mgorman@suse.de>,
	"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
	Zi Yan <ziy@nvidia.com>, Michal Hocko <mhocko@suse.com>,
	Huang Ying <ying.huang@intel.com>,
	Hugh Dickins <hughd@google.com>,
	hca@linux.ibm.com, gor@linux.ibm.com, borntraeger@de.ibm.com,
	Andrew Morton <akpm@linux-foundation.org>,
	Linux MM <linux-mm@kvack.org>,
	linux-s390@vger.kernel.org,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Alexander Gordeev <agordeev@linux.ibm.com>
Subject: Re: [RFC PATCH 0/6] mm: thp: use generic THP migration for NUMA hinting fault
Date: Tue, 30 Mar 2021 09:51:46 -0700
Message-ID: <CAHbLzkrYd+5L8Ep+b83PkkFL_QGQe_vSAk=erQ+fvC6dEOsGsw@mail.gmail.com>
In-Reply-To: <20210330164200.01a4b78f@thinkpad>

On Tue, Mar 30, 2021 at 7:42 AM Gerald Schaefer
<gerald.schaefer@linux.ibm.com> wrote:
>
> On Mon, 29 Mar 2021 11:33:06 -0700
> Yang Shi <shy828301@gmail.com> wrote:
>
> >
> > When the THP NUMA fault support was added THP migration was not supported yet.
> > So the ad hoc THP migration was implemented in NUMA fault handling.  Since v4.14
> > THP migration has been supported so it doesn't make too much sense to still keep
> > another THP migration implementation rather than using the generic migration
> > code.  It is definitely a maintenance burden to keep two THP migration
> > implementations for different code paths and it is more error prone.  Using the
> > generic THP migration implementation allows us to remove the duplicate code and
> > some hacks needed by the old ad hoc implementation.
> >
> > A quick grep shows x86_64, PowerPC (book3s), ARM64 and S390 support both THP
> > and NUMA balancing.  Most of them support THP migration; S390 is the exception.
> > Zi Yan tried to add THP migration support for S390 before but it was not
> > accepted due to the design of S390 PMD.  For the discussion, please see:
> > https://lkml.org/lkml/2018/4/27/953.
> >
> > I'm not an expert on S390 so not sure if it is feasible to support THP migration
> > for S390 or not.  If it is not feasible then the patchset may make THP NUMA
> > balancing non-functional on S390.  Not sure if this is a show stopper although
> > the patchset does simplify the code a lot.  Anyway it seems worth posting the
> > series to the mailing list to get some feedback.
>
> The reason why THP migration cannot work on s390 is because the migration
> code will establish swap ptes in a pmd. The pmd layout is very different from
> the pte layout on s390, so you cannot simply write a swap pte into a pmd.
> There are no separate swp primitives for swap/migration pmds, IIRC. And even
> if there were, we'd still need to find some space for a present bit in the
> s390 pmd, and/or possibly move around some other bits.
>
> A lot of things can go wrong here, even if it could be possible in theory,
> by introducing separate swp primitives in common code for pmd entries, along
> with separate offset, type, shift, etc. I don't see that happening in the
> near future.

Thanks a lot for the elaboration. IIUC, implementing a migration PMD
entry is *not* prevented by the hardware; it is just very tricky to
implement, right?

>
> Not sure if this is a show stopper, but I am not familiar enough with
> NUMA and migration code to judge. E.g., I do not see any swp entry action
> in your patches, but I assume this is implicitly triggered by the switch
> to generic THP migration code.

Yes, exactly. migrate_pages(), called by migrate_misplaced_page(),
takes care of everything.

>
> Could there be a work-around by splitting THP pages instead of marking them
> as migrate pmds (via pte swap entries), at least when THP migration is not
> supported? I guess it could also be acceptable if THP pages were simply not
> migrated for NUMA balancing on s390, but then we might need some extra config
> option to make that behavior explicit.

Yes, it could be. The old behavior of migration was to return -ENOMEM
if THP migration was not supported, and then split the THP. That
behavior was not very friendly to some use cases, for example, memory
policy and the upcoming migration in lieu of reclaim. But I don't mean
we should restore the old behavior. We could split the THP if migration
returns -ENOSYS and the page is a THP.
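
To make the -ENOSYS fallback a bit more concrete, here is a rough
userspace model of the control flow. It is only a sketch, not kernel
code: struct model_page, model_migrate(), model_split() and
model_migrate_misplaced() are all hypothetical names made up for
illustration, and the global flag merely stands in for whatever the
architecture reports about THP migration support.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct model_page {
	bool is_thp;	/* page is still a transparent huge page */
	bool split;	/* set once the huge page has been split */
	int node;	/* NUMA node the page currently lives on */
};

/* Stand-in for the architecture's THP migration capability (assumption). */
static bool arch_thp_migration_supported;

/* Models the migration core: a THP is refused with -ENOSYS if unsupported. */
static int model_migrate(struct model_page *page, int target_node)
{
	if (page->is_thp && !arch_thp_migration_supported)
		return -ENOSYS;
	page->node = target_node;
	return 0;
}

/* Models splitting the THP into base pages. */
static void model_split(struct model_page *page)
{
	page->is_thp = false;
	page->split = true;
}

/*
 * Try to migrate the page as-is. Only when the attempt fails with
 * -ENOSYS and the page is a THP, split it and retry as base pages.
 */
static int model_migrate_misplaced(struct model_page *page, int target_node)
{
	int ret;

	assert(page != NULL);

	ret = model_migrate(page, target_node);
	if (ret == -ENOSYS && page->is_thp) {
		model_split(page);
		ret = model_migrate(page, target_node);
	}
	return ret;
}
```

With the flag cleared (the S390 case in this model), a THP ends up
split and then moved; with the flag set, it is moved intact and never
split. Other error codes are passed through untouched, so this does not
bring back the old -ENOMEM behavior.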

>
> See also my comment on patch #5 of this series.
>
> Regards,
> Gerald
