linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jason Gunthorpe <jgg@ziepe.ca>
To: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: LKML <linux-kernel@vger.kernel.org>,
	linux-mm <linux-mm@kvack.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Vlastimil Babka <vbabka@suse.cz>, Michal Hocko <mhocko@suse.com>,
	David Hildenbrand <david@redhat.com>,
	Oscar Salvador <osalvador@suse.de>,
	Dan Williams <dan.j.williams@intel.com>,
	Sasha Levin <sashal@kernel.org>,
	Tyler Hicks <tyhicks@linux.microsoft.com>,
	Joonsoo Kim <iamjoonsoo.kim@lge.com>,
	mike.kravetz@oracle.com, Steven Rostedt <rostedt@goodmis.org>,
	Ingo Molnar <mingo@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Mel Gorman <mgorman@suse.de>,
	Matthew Wilcox <willy@infradead.org>,
	David Rientjes <rientjes@google.com>,
	John Hubbard <jhubbard@nvidia.com>
Subject: Re: [PATCH 6/6] mm/gup: migrate pinned pages out of movable zone
Date: Wed, 2 Dec 2020 21:08:09 -0400	[thread overview]
Message-ID: <20201203010809.GQ5487@ziepe.ca> (raw)
In-Reply-To: <CA+CK2bBT=U+xhbzXTDFwsL5wTvPHgNJ0DRpaeseiUq=w4EOe9w@mail.gmail.com>

On Wed, Dec 02, 2020 at 07:19:45PM -0500, Pavel Tatashin wrote:
> > It is a good moment to say, I really dislike how this was implemented
> > in the first place.
> >
> > Scanning the output of gup just to do the is_migrate_movable() test is
> > kind of nonsense and slow. It would be better/faster to handle this
> > directly while gup is scanning the page tables and adding pages to the
> > list.
> 
> Hi Jason,
> 
> I assume you mean to migrate pages as soon as they are followed and
> skip those that are faulted, as we already know that faulted pages are
> allocated from nomovable zone.
> 
> The place would be:
> 
> __get_user_pages()
>       while(more pages)
>           get_gate_page()
>           follow_hugetlb_page()
>           follow_page_mask()
> 
>           if (!page)
>                faultin_page()
> 
>           if (page && !faulted && (gup_flags & FOLL_LONGTERM) )
>                 check_and_migrate this page

Either here or perhaps even lower down the call chain when the page is
captured, similar to how GUP fast would detect it. (how is that done
anyhow?)

> I looked at that function, and I do not think the code will be cleaner
> there, as that function already has a complicated loop.  

That function is complicated for its own reasons.. But complicated is
not the point here..

> The only drawback with the current approach that I see is that
> check_and_migrate_movable_pages() has to check once the faulted
> pages.

Yes

> This is while not optimal is not horrible. 

It is.

> The FOLL_LONGTERM should not happen too frequently, so having one
> extra nr_pages loop should not hurt the performance. 

FOLL_LONGTERM is typically used with very large regions, for instance
we are benchmarking around the 300G level. It takes 10s of seconds for
get_user_pages to operate. There are many inefficiencies in this
path. This extra work of re-scanning the list is part of the cost.

Further, having these special wrappers just for FOLL_LONGTERM has a
spill over complexity on the entire rest of the callchain up to here,
we now have endless wrappers and varieties of function calls that
generally are happening because the longterm path needs to end up in a
different place than other paths.

IMHO this is due to the lack of integration with the primary loop
above

> Also, I checked and most of the users of FOLL_LONGTERM pin only one
> page at a time. Which means the extra loop is only to check a single
> page.

Er, I don't know what you checked but those are not the cases I
see. Two big users are vfio and rdma. Both are pinning huge ranges of
memory in very typical use cases.

> However, those changes can come after this series. The current series
> fixes a bug where hot-remove is not working with making minimal amount
> of changes, so it is easy to backport it to stable kernels.

This is a good point, good enough that you should probably continue as
is

Jason

  reply	other threads:[~2020-12-03  1:09 UTC|newest]

Thread overview: 67+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-12-02  5:23 [PATCH 0/6] prohibit pinning pages in ZONE_MOVABLE Pavel Tatashin
2020-12-02  5:23 ` [PATCH 1/6] mm/gup: perform check_dax_vmas only when FS_DAX is enabled Pavel Tatashin
2020-12-02 16:22   ` Ira Weiny
2020-12-02 18:15     ` Pavel Tatashin
2020-12-02 16:29   ` Jason Gunthorpe
2020-12-02 18:16     ` Pavel Tatashin
2020-12-03  7:59   ` John Hubbard
2020-12-03 14:52     ` Pavel Tatashin
2020-12-02  5:23 ` [PATCH 2/6] mm/gup: don't pin migrated cma pages in movable zone Pavel Tatashin
2020-12-02 16:31   ` David Hildenbrand
2020-12-02 18:17     ` Pavel Tatashin
2020-12-03  8:01   ` John Hubbard
2020-12-03  8:46   ` Michal Hocko
2020-12-03 14:58     ` Pavel Tatashin
2020-12-02  5:23 ` [PATCH 3/6] mm/gup: make __gup_longterm_locked common Pavel Tatashin
2020-12-02 16:31   ` Ira Weiny
2020-12-02 16:33     ` Ira Weiny
2020-12-02 18:19       ` Pavel Tatashin
2020-12-03  0:03         ` Pavel Tatashin
2020-12-03  8:03   ` John Hubbard
2020-12-03 15:02     ` Pavel Tatashin
2020-12-02  5:23 ` [PATCH 4/6] mm cma: rename PF_MEMALLOC_NOCMA to PF_MEMALLOC_NOMOVABLE Pavel Tatashin
2020-12-03  8:04   ` John Hubbard
2020-12-03 15:02     ` Pavel Tatashin
2020-12-03  8:57   ` Michal Hocko
2020-12-03 15:02     ` Pavel Tatashin
2020-12-02  5:23 ` [PATCH 5/6] mm: honor PF_MEMALLOC_NOMOVABLE for all allocations Pavel Tatashin
2020-12-03  8:17   ` John Hubbard
2020-12-03 15:06     ` Pavel Tatashin
2020-12-03 16:51       ` John Hubbard
2020-12-03  9:17   ` Michal Hocko
2020-12-03 15:15     ` Pavel Tatashin
2020-12-04  8:43       ` Michal Hocko
2020-12-04  8:54         ` Michal Hocko
2020-12-04 16:07           ` Pavel Tatashin
2020-12-02  5:23 ` [PATCH 6/6] mm/gup: migrate pinned pages out of movable zone Pavel Tatashin
2020-12-02 16:35   ` Jason Gunthorpe
2020-12-03  0:19     ` Pavel Tatashin
2020-12-03  1:08       ` Jason Gunthorpe [this message]
2020-12-03  1:34         ` Pavel Tatashin
2020-12-03 14:17           ` Jason Gunthorpe
2020-12-03 16:40             ` Pavel Tatashin
2020-12-03 16:59               ` Jason Gunthorpe
2020-12-03 17:14                 ` Pavel Tatashin
2020-12-03 19:15                   ` Pavel Tatashin
2020-12-03 19:36                     ` Jason Gunthorpe
2020-12-04 16:24                       ` Pavel Tatashin
2020-12-04 17:06                         ` Jason Gunthorpe
2020-12-04 20:05             ` Daniel Jordan
2020-12-04 20:16               ` Pavel Tatashin
2020-12-08  2:27                 ` Daniel Jordan
2020-12-04 20:52               ` Jason Gunthorpe
2020-12-08  2:48                 ` Daniel Jordan
2020-12-08 13:24                   ` Jason Gunthorpe
2020-12-03  8:22   ` John Hubbard
2020-12-03 15:55     ` Pavel Tatashin
2020-12-04  4:13   ` Joonsoo Kim
2020-12-04 17:43     ` Pavel Tatashin
2020-12-07  7:13       ` Joonsoo Kim
2020-12-04  4:02 ` [PATCH 0/6] prohibit pinning pages in ZONE_MOVABLE Joonsoo Kim
2020-12-04 15:55   ` Pavel Tatashin
2020-12-04 16:10     ` Jason Gunthorpe
2020-12-04 17:50       ` Pavel Tatashin
2020-12-04 18:01         ` David Hildenbrand
2020-12-04 18:10           ` Pavel Tatashin
2020-12-07  7:12         ` Joonsoo Kim
2020-12-07 12:13           ` Michal Hocko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20201203010809.GQ5487@ziepe.ca \
    --to=jgg@ziepe.ca \
    --cc=akpm@linux-foundation.org \
    --cc=dan.j.williams@intel.com \
    --cc=david@redhat.com \
    --cc=iamjoonsoo.kim@lge.com \
    --cc=jhubbard@nvidia.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mgorman@suse.de \
    --cc=mhocko@suse.com \
    --cc=mike.kravetz@oracle.com \
    --cc=mingo@redhat.com \
    --cc=osalvador@suse.de \
    --cc=pasha.tatashin@soleen.com \
    --cc=peterz@infradead.org \
    --cc=rientjes@google.com \
    --cc=rostedt@goodmis.org \
    --cc=sashal@kernel.org \
    --cc=tyhicks@linux.microsoft.com \
    --cc=vbabka@suse.cz \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).