From: Yang Shi <shy828301@gmail.com>
To: Zi Yan <ziy@nvidia.com>
Cc: Michal Hocko <mhocko@suse.com>, Song Liu <songliubraving@fb.com>,
Mel Gorman <mgorman@suse.de>, Jan Kara <jack@suse.cz>,
Andrew Morton <akpm@linux-foundation.org>,
Linux MM <linux-mm@kvack.org>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 5/5] mm: migrate: return -ENOSYS if THP migration is unsupported
Date: Fri, 6 Nov 2020 15:15:36 -0800 [thread overview]
Message-ID: <CAHbLzkq6PRPXkf6qYr4=OK2_RAyKUp-Qc1q5isfuog+RD8W5cA@mail.gmail.com> (raw)
In-Reply-To: <CAHbLzkrXuvT5pMrJQDPwJtXWKK5LC53UoXRkx2-xkWfwX488wg@mail.gmail.com>
On Fri, Nov 6, 2020 at 2:02 PM Yang Shi <shy828301@gmail.com> wrote:
>
> On Fri, Nov 6, 2020 at 12:17 PM Zi Yan <ziy@nvidia.com> wrote:
> >
> > On 3 Nov 2020, at 8:03, Yang Shi wrote:
> >
> > > In the current implementation unmap_and_move() would return -ENOMEM if
> > > THP migration is unsupported, then the THP will be split. If split is
> > > failed just exit without trying to migrate other pages. It doesn't make
> > > too much sense since there may be enough free memory to migrate other
> > > pages and there may be a lot base pages on the list.
> > >
> > > Return -ENOSYS to make consistent with hugetlb. And if THP split is
> > > failed just skip and try other pages on the list.
> > >
> > > Just skip the whole list and exit when free memory is really low.
> > >
> > > Signed-off-by: Yang Shi <shy828301@gmail.com>
> > > ---
> > > mm/migrate.c | 62 ++++++++++++++++++++++++++++++++++++++--------------
> > > 1 file changed, 46 insertions(+), 16 deletions(-)
> > >
> > > diff --git a/mm/migrate.c b/mm/migrate.c
> > > index 8f6a61c9274b..b3466d8c7f03 100644
> > > --- a/mm/migrate.c
> > > +++ b/mm/migrate.c
> > > @@ -1172,7 +1172,7 @@ static int unmap_and_move(new_page_t get_new_page,
> > > struct page *newpage = NULL;
> > >
> > > if (!thp_migration_supported() && PageTransHuge(page))
> > > - return -ENOMEM;
> > > + return -ENOSYS;
> > >
> > > if (page_count(page) == 1) {
> > > /* page was freed from under us. So we are done. */
> > > @@ -1376,6 +1376,20 @@ static int unmap_and_move_huge_page(new_page_t get_new_page,
> > > return rc;
> > > }
> > >
> > > +static inline int try_split_thp(struct page *page, struct page *page2,
> > > + struct list_head *from)
> > > +{
> > > + int rc = 0;
> > > +
> > > + lock_page(page);
> > > + rc = split_huge_page_to_list(page, from);
> > > + unlock_page(page);
> > > + if (!rc)
> > > + list_safe_reset_next(page, page2, lru);
> >
> > This does not work as expected, right? After macro expansion, we have
> > page2 = list_next_entry(page, lru). Since page2 is passed as a pointer, the change
> > does not return back the caller. You need to use the pointer to page2 here.
> >
> > > +
> > > + return rc;
> > > +}
> > > +
> > > /*
> > > * migrate_pages - migrate the pages specified in a list, to the free pages
> > > * supplied as the target for the page migration
> > > @@ -1445,24 +1459,40 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
> > > reason, &ret_pages);
> > >
> > > switch(rc) {
> > > + /*
> > > + * THP migration might be unsupported or the
> > > + * allocation could've failed so we should
> > > + * retry on the same page with the THP split
> > > + * to base pages.
> > > + *
> > > + * Head page is retried immediately and tail
> > > + * pages are added to the tail of the list so
> > > + * we encounter them after the rest of the list
> > > + * is processed.
> > > + */
> > > + case -ENOSYS:
> > > + /* THP migration is unsupported */
> > > + if (is_thp) {
> > > + if (!try_split_thp(page, page2, from)) {
> > > + nr_thp_split++;
> > > + goto retry;
> > > + }
> > > +
> > > + nr_thp_failed++;
> > > + nr_failed += nr_subpages;
> > > + break;
> > > + }
> > > +
> > > + /* Hugetlb migration is unsupported */
> > > + nr_failed++;
> > > + break;
> > > case -ENOMEM:
> > > /*
> > > - * THP migration might be unsupported or the
> > > - * allocation could've failed so we should
> > > - * retry on the same page with the THP split
> > > - * to base pages.
> > > - *
> > > - * Head page is retried immediately and tail
> > > - * pages are added to the tail of the list so
> > > - * we encounter them after the rest of the list
> > > - * is processed.
> > > + * When memory is low, don't bother to try to migrate
> > > + * other pages, just exit.
> >
> > The comment does not match the code below. For THPs, the code tries to split the THP
> > and migrate the base pages if the split is successful.
>
> BTW, actually I don't think -ENOSYS is a proper return value for this
> case since it typically means "the syscall doesn't exist". IMHO, it
> should return -EINVAL. And actually the return value doesn't matter
> since all callers of migrate_pages() just check if the return value !=
> 0. And, neither -ENOSYS nor -EINVAL won't be visible by userspace
> since migrate_pages() just returns the number of failed pages for this
> case.
>
> Anyway the return value is a little bit messy, it may return -ENOMEM,
> 0 or nr_failed. But it looks the callers just care about if ret != 0,
> so it may be better to let it return nr_failed for -ENOMEM case too.
By relooking all the callsites, it seems I overlooked two cases:
* compaction: it checks if the return value is -ENOMEM if so it
may return COMPACT_CONTENDED
* migrate_pages syscall: it may return -ENOMEM to the userspace,
but the man page doesn't document this case, not sure if it is
legitimate
>
> >
> > > */
> > > if (is_thp) {
> > > - lock_page(page);
> > > - rc = split_huge_page_to_list(page, from);
> > > - unlock_page(page);
> > > - if (!rc) {
> > > - list_safe_reset_next(page, page2, lru);
> > > + if (!try_split_thp(page, page2, from)) {
> > > nr_thp_split++;
> > > goto retry;
> > > }
> > > @@ -1490,7 +1520,7 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
> > > break;
> > > default:
> > > /*
> > > - * Permanent failure (-EBUSY, -ENOSYS, etc.):
> > > + * Permanent failure (-EBUSY, etc.):
> > > * unlike -EAGAIN case, the failed page is
> > > * removed from migration page list and not
> > > * retried in the next outer loop.
> > > --
> > > 2.26.2
> >
> >
> > —
> > Best Regards,
> > Yan Zi
prev parent reply other threads:[~2020-11-06 23:15 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-11-03 13:03 [PATCH 0/5] mm: misc migrate cleanup and improvement Yang Shi
2020-11-03 13:03 ` [PATCH 1/5] mm: truncate_complete_page is not existed anymore Yang Shi
2020-11-06 9:16 ` Jan Kara
2020-11-03 13:03 ` [PATCH 2/5] mm: migrate: simplify the logic for handling permanent failure Yang Shi
2020-11-06 20:03 ` Zi Yan
2020-11-06 21:34 ` Yang Shi
2020-11-03 13:03 ` [PATCH 3/5] mm: migrate: skip shared exec THP for NUMA balancing Yang Shi
2020-11-03 13:03 ` [PATCH 4/5] mm: migrate: clean up migrate_prep{_local} Yang Shi
2020-11-06 20:05 ` Zi Yan
2020-11-03 13:03 ` [PATCH 5/5] mm: migrate: return -ENOSYS if THP migration is unsupported Yang Shi
2020-11-06 20:17 ` Zi Yan
2020-11-06 21:53 ` Yang Shi
2020-11-06 22:02 ` Yang Shi
2020-11-06 23:15 ` Yang Shi [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to='CAHbLzkq6PRPXkf6qYr4=OK2_RAyKUp-Qc1q5isfuog+RD8W5cA@mail.gmail.com' \
--to=shy828301@gmail.com \
--cc=akpm@linux-foundation.org \
--cc=jack@suse.cz \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mgorman@suse.de \
--cc=mhocko@suse.com \
--cc=songliubraving@fb.com \
--cc=ziy@nvidia.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).