All of lore.kernel.org
 help / color / mirror / Atom feed
From: Pavel Tatashin <pasha.tatashin@soleen.com>
To: Jason Gunthorpe <jgg@ziepe.ca>
Cc: LKML <linux-kernel@vger.kernel.org>,
	linux-mm <linux-mm@kvack.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Vlastimil Babka <vbabka@suse.cz>, Michal Hocko <mhocko@suse.com>,
	David Hildenbrand <david@redhat.com>,
	Oscar Salvador <osalvador@suse.de>,
	Dan Williams <dan.j.williams@intel.com>,
	Sasha Levin <sashal@kernel.org>,
	Tyler Hicks <tyhicks@linux.microsoft.com>,
	Joonsoo Kim <iamjoonsoo.kim@lge.com>,
	mike.kravetz@oracle.com, Steven Rostedt <rostedt@goodmis.org>,
	Ingo Molnar <mingo@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Mel Gorman <mgorman@suse.de>,
	Matthew Wilcox <willy@infradead.org>,
	David Rientjes <rientjes@google.com>,
	John Hubbard <jhubbard@nvidia.com>
Subject: Re: [PATCH 6/6] mm/gup: migrate pinned pages out of movable zone
Date: Wed, 2 Dec 2020 20:34:32 -0500	[thread overview]
Message-ID: <CA+CK2bBRgcCc5Nm0RcsEgVFpGBFC-_icA6UDRiqQxeRJE5U-Aw@mail.gmail.com> (raw)
In-Reply-To: <20201203010809.GQ5487@ziepe.ca>

On Wed, Dec 2, 2020 at 8:08 PM Jason Gunthorpe <jgg@ziepe.ca> wrote:
>
> On Wed, Dec 02, 2020 at 07:19:45PM -0500, Pavel Tatashin wrote:
> > > It is a good moment to say, I really dislike how this was implemented
> > > in the first place.
> > >
> > > Scanning the output of gup just to do the is_migrate_movable() test is
> > > kind of nonsense and slow. It would be better/faster to handle this
> > > directly while gup is scanning the page tables and adding pages to the
> > > list.
> >
> > Hi Jason,
> >
> > I assume you mean to migrate pages as soon as they are followed and
> > skip those that are faulted, as we already know that faulted pages are
> > allocated from nomovable zone.
> >
> > The place would be:
> >
> > __get_user_pages()
> >       while(more pages)
> >           get_gate_page()
> >           follow_hugetlb_page()
> >           follow_page_mask()
> >
> >           if (!page)
> >                faultin_page()
> >
> >           if (page && !faulted && (gup_flags & FOLL_LONGTERM) )
> >                 check_and_migrate this page
>
> Either here or perhaps even lower down the call chain when the page is
> captured, similar to how GUP fast would detect it. (how is that done
> anyhow?)

Ah, thank you for pointing this out. I think I need to address it here:

https://soleen.com/source/xref/linux/mm/gup.c?r=96e1fac1#94

static __maybe_unused struct page *try_grab_compound_head()
              if (unlikely(flags & FOLL_LONGTERM) &&  is_migrate_cma_page(page))
                   return NULL;

I need to change is_migrate_cma_page() to all migratable pages. Will
study, and send an update with this fix.

>
> > I looked at that function, and I do not think the code will be cleaner
> > there, as that function already has a complicated loop.
>
> That function is complicated for its own reasons.. But complicated is
> not the point here..
>
> > The only drawback with the current approach that I see is that
> > check_and_migrate_movable_pages() has to check once the faulted
> > pages.
>
> Yes
>
> > This is while not optimal is not horrible.
>
> It is.
>
> > The FOLL_LONGTERM should not happen too frequently, so having one
> > extra nr_pages loop should not hurt the performance.
>
> FOLL_LONGTERM is typically used with very large regions, for instance
> we are benchmarking around the 300G level. It takes 10s of seconds for
> get_user_pages to operate. There are many inefficiencies in this
> path. This extra work of re-scanning the list is part of the cost.

OK, I did not realize that pinning was for such large regions, the
path must be optimized.

>
> Further, having these special wrappers just for FOLL_LONGTERM has a
> spill over complexity on the entire rest of the callchain up to here,
> we now have endless wrappers and varieties of function calls that
> generally are happening because the longterm path needs to end up in a
> different place than other paths.
>
> IMHO this is due to the lack of integration with the primary loop
> above
>
> > Also, I checked and most of the users of FOLL_LONGTERM pin only one
> > page at a time. Which means the extra loop is only to check a single
> > page.
>
> Er, I don't know what you checked but those are not the cases I
> see. Two big users are vfio and rdma. Both are pinning huge ranges of
> memory in very typical use cases.

What I meant is the users of the interface do it incrementally not in
large chunks. For example:

vfio_pin_pages_remote
   vaddr_get_pfn
        ret = pin_user_pages_remote(mm, vaddr, 1, flags |
FOLL_LONGTERM, page, NULL, NULL);
1 -> pin only one pages at a time

RDMA indeed can do it in one chunk though. Regardless, the VFIO should
probably be optimized to do it in a larger chunk, and the code path
should be optimized for the reasons you gave above.

>
> > However, those changes can come after this series. The current series
> > fixes a bug where hot-remove is not working with making minimal amount
> > of changes, so it is easy to backport it to stable kernels.
>
> This is a good point, good enough that you should probably continue as
> is

I will continue looking into this code, and see if I can address your
concerns in a follow-up fixes.


Thanks,
Pasha

>
> Jason

  reply	other threads:[~2020-12-03  1:35 UTC|newest]

Thread overview: 92+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-12-02  5:23 [PATCH 0/6] prohibit pinning pages in ZONE_MOVABLE Pavel Tatashin
2020-12-02  5:23 ` [PATCH 1/6] mm/gup: perform check_dax_vmas only when FS_DAX is enabled Pavel Tatashin
2020-12-02 16:22   ` Ira Weiny
2020-12-02 18:15     ` Pavel Tatashin
2020-12-02 18:15       ` Pavel Tatashin
2020-12-02 16:29   ` Jason Gunthorpe
2020-12-02 18:16     ` Pavel Tatashin
2020-12-02 18:16       ` Pavel Tatashin
2020-12-03  7:59   ` John Hubbard
2020-12-03 14:52     ` Pavel Tatashin
2020-12-03 14:52       ` Pavel Tatashin
2020-12-02  5:23 ` [PATCH 2/6] mm/gup: don't pin migrated cma pages in movable zone Pavel Tatashin
2020-12-02 16:31   ` David Hildenbrand
2020-12-02 18:17     ` Pavel Tatashin
2020-12-02 18:17       ` Pavel Tatashin
2020-12-03  8:01   ` John Hubbard
2020-12-03  8:46   ` Michal Hocko
2020-12-03 14:58     ` Pavel Tatashin
2020-12-03 14:58       ` Pavel Tatashin
2020-12-02  5:23 ` [PATCH 3/6] mm/gup: make __gup_longterm_locked common Pavel Tatashin
2020-12-02 16:31   ` Ira Weiny
2020-12-02 16:33     ` Ira Weiny
2020-12-02 18:19       ` Pavel Tatashin
2020-12-02 18:19         ` Pavel Tatashin
2020-12-03  0:03         ` Pavel Tatashin
2020-12-03  0:03           ` Pavel Tatashin
2020-12-03  8:03   ` John Hubbard
2020-12-03 15:02     ` Pavel Tatashin
2020-12-03 15:02       ` Pavel Tatashin
2020-12-02  5:23 ` [PATCH 4/6] mm cma: rename PF_MEMALLOC_NOCMA to PF_MEMALLOC_NOMOVABLE Pavel Tatashin
2020-12-03  8:04   ` John Hubbard
2020-12-03 15:02     ` Pavel Tatashin
2020-12-03 15:02       ` Pavel Tatashin
2020-12-03  8:57   ` Michal Hocko
2020-12-03 15:02     ` Pavel Tatashin
2020-12-03 15:02       ` Pavel Tatashin
2020-12-02  5:23 ` [PATCH 5/6] mm: honor PF_MEMALLOC_NOMOVABLE for all allocations Pavel Tatashin
2020-12-03  8:17   ` John Hubbard
2020-12-03 15:06     ` Pavel Tatashin
2020-12-03 15:06       ` Pavel Tatashin
2020-12-03 16:51       ` John Hubbard
2020-12-03  9:17   ` Michal Hocko
2020-12-03 15:15     ` Pavel Tatashin
2020-12-03 15:15       ` Pavel Tatashin
2020-12-04  8:43       ` Michal Hocko
2020-12-04  8:54         ` Michal Hocko
2020-12-04 16:07           ` Pavel Tatashin
2020-12-04 16:07             ` Pavel Tatashin
2020-12-02  5:23 ` [PATCH 6/6] mm/gup: migrate pinned pages out of movable zone Pavel Tatashin
2020-12-02 16:35   ` Jason Gunthorpe
2020-12-03  0:19     ` Pavel Tatashin
2020-12-03  0:19       ` Pavel Tatashin
2020-12-03  1:08       ` Jason Gunthorpe
2020-12-03  1:34         ` Pavel Tatashin [this message]
2020-12-03  1:34           ` Pavel Tatashin
2020-12-03 14:17           ` Jason Gunthorpe
2020-12-03 16:40             ` Pavel Tatashin
2020-12-03 16:40               ` Pavel Tatashin
2020-12-03 16:59               ` Jason Gunthorpe
2020-12-03 17:14                 ` Pavel Tatashin
2020-12-03 17:14                   ` Pavel Tatashin
2020-12-03 19:15                   ` Pavel Tatashin
2020-12-03 19:15                     ` Pavel Tatashin
2020-12-03 19:36                     ` Jason Gunthorpe
2020-12-04 16:24                       ` Pavel Tatashin
2020-12-04 16:24                         ` Pavel Tatashin
2020-12-04 17:06                         ` Jason Gunthorpe
2020-12-04 20:05             ` Daniel Jordan
2020-12-04 20:16               ` Pavel Tatashin
2020-12-04 20:16                 ` Pavel Tatashin
2020-12-08  2:27                 ` Daniel Jordan
2020-12-04 20:52               ` Jason Gunthorpe
2020-12-08  2:48                 ` Daniel Jordan
2020-12-08 13:24                   ` Jason Gunthorpe
2020-12-03  8:22   ` John Hubbard
2020-12-03 15:55     ` Pavel Tatashin
2020-12-03 15:55       ` Pavel Tatashin
2020-12-04  4:13   ` Joonsoo Kim
2020-12-04 17:43     ` Pavel Tatashin
2020-12-04 17:43       ` Pavel Tatashin
2020-12-07  7:13       ` Joonsoo Kim
2020-12-04  4:02 ` [PATCH 0/6] prohibit pinning pages in ZONE_MOVABLE Joonsoo Kim
2020-12-04 15:55   ` Pavel Tatashin
2020-12-04 15:55     ` Pavel Tatashin
2020-12-04 16:10     ` Jason Gunthorpe
2020-12-04 17:50       ` Pavel Tatashin
2020-12-04 17:50         ` Pavel Tatashin
2020-12-04 18:01         ` David Hildenbrand
2020-12-04 18:10           ` Pavel Tatashin
2020-12-04 18:10             ` Pavel Tatashin
2020-12-07  7:12         ` Joonsoo Kim
2020-12-07 12:13           ` Michal Hocko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CA+CK2bBRgcCc5Nm0RcsEgVFpGBFC-_icA6UDRiqQxeRJE5U-Aw@mail.gmail.com \
    --to=pasha.tatashin@soleen.com \
    --cc=akpm@linux-foundation.org \
    --cc=dan.j.williams@intel.com \
    --cc=david@redhat.com \
    --cc=iamjoonsoo.kim@lge.com \
    --cc=jgg@ziepe.ca \
    --cc=jhubbard@nvidia.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mgorman@suse.de \
    --cc=mhocko@suse.com \
    --cc=mike.kravetz@oracle.com \
    --cc=mingo@redhat.com \
    --cc=osalvador@suse.de \
    --cc=peterz@infradead.org \
    --cc=rientjes@google.com \
    --cc=rostedt@goodmis.org \
    --cc=sashal@kernel.org \
    --cc=tyhicks@linux.microsoft.com \
    --cc=vbabka@suse.cz \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.