From: Ira Weiny <ira.weiny@intel.com>
To: john.hubbard@gmail.com
Cc: "Andrew Morton" <akpm@linux-foundation.org>,
"Christoph Hellwig" <hch@infradead.org>,
"Dan Williams" <dan.j.williams@intel.com>,
"Dave Chinner" <david@fromorbit.com>, "Jan Kara" <jack@suse.cz>,
"Jason Gunthorpe" <jgg@ziepe.ca>,
"Jérôme Glisse" <jglisse@redhat.com>,
LKML <linux-kernel@vger.kernel.org>,
linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
linux-rdma@vger.kernel.org, "John Hubbard" <jhubbard@nvidia.com>
Subject: Re: [RFC PATCH 2/2] mm/gup: introduce vaddr_pin_pages_remote()
Date: Mon, 12 Aug 2019 15:03:41 -0700 [thread overview]
Message-ID: <20190812220340.GA26305@iweiny-DESK2.sc.intel.com> (raw)
In-Reply-To: <20190812015044.26176-3-jhubbard@nvidia.com>
On Sun, Aug 11, 2019 at 06:50:44PM -0700, john.hubbard@gmail.com wrote:
> From: John Hubbard <jhubbard@nvidia.com>
>
> This is the "vaddr_pin_pages" corresponding variant to
> get_user_pages_remote(), but with FOLL_PIN semantics: the implementation
> sets FOLL_PIN. That, in turn, means that the pages must ultimately be
> released by put_user_page*()--typically, via vaddr_unpin_pages*().
>
> Note that the put_user_page*() requirement won't be truly
> required until all of the call sites have been converted, and
> the tracking of pages is actually activated.
>
> Also introduce vaddr_unpin_pages(), in order to have a simpler
> call for the error handling cases.
>
> Use both of these new calls in the Infiniband drive, replacing
> get_user_pages_remote() and put_user_pages().
>
> Signed-off-by: John Hubbard <jhubbard@nvidia.com>
> ---
> drivers/infiniband/core/umem_odp.c | 15 +++++----
> include/linux/mm.h | 7 +++++
> mm/gup.c | 50 ++++++++++++++++++++++++++++++
> 3 files changed, 66 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/infiniband/core/umem_odp.c b/drivers/infiniband/core/umem_odp.c
> index 53085896d718..fdff034a8a30 100644
> --- a/drivers/infiniband/core/umem_odp.c
> +++ b/drivers/infiniband/core/umem_odp.c
> @@ -534,7 +534,7 @@ static int ib_umem_odp_map_dma_single_page(
> }
>
> out:
> - put_user_page(page);
> + vaddr_unpin_pages(&page, 1, &umem_odp->umem.vaddr_pin);
>
> if (remove_existing_mapping) {
> ib_umem_notifier_start_account(umem_odp);
> @@ -635,9 +635,10 @@ int ib_umem_odp_map_dma_pages(struct ib_umem_odp *umem_odp, u64 user_virt,
> * complex (and doesn't gain us much performance in most use
> * cases).
> */
> - npages = get_user_pages_remote(owning_process, owning_mm,
> + npages = vaddr_pin_pages_remote(owning_process, owning_mm,
> user_virt, gup_num_pages,
> - flags, local_page_list, NULL, NULL);
> + flags, local_page_list, NULL, NULL,
> + &umem_odp->umem.vaddr_pin);
> up_read(&owning_mm->mmap_sem);
>
> if (npages < 0) {
> @@ -657,7 +658,8 @@ int ib_umem_odp_map_dma_pages(struct ib_umem_odp *umem_odp, u64 user_virt,
> ret = -EFAULT;
> break;
> }
> - put_user_page(local_page_list[j]);
> + vaddr_unpin_pages(&local_page_list[j], 1,
> + &umem_odp->umem.vaddr_pin);
> continue;
> }
>
> @@ -684,8 +686,9 @@ int ib_umem_odp_map_dma_pages(struct ib_umem_odp *umem_odp, u64 user_virt,
> * ib_umem_odp_map_dma_single_page().
> */
> if (npages - (j + 1) > 0)
> - put_user_pages(&local_page_list[j+1],
> - npages - (j + 1));
> + vaddr_unpin_pages(&local_page_list[j+1],
> + npages - (j + 1),
> + &umem_odp->umem.vaddr_pin);
> break;
> }
> }
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 61b616cd9243..2bd76ad8787e 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -1606,6 +1606,13 @@ int __account_locked_vm(struct mm_struct *mm, unsigned long pages, bool inc,
> long vaddr_pin_pages(unsigned long addr, unsigned long nr_pages,
> unsigned int gup_flags, struct page **pages,
> struct vaddr_pin *vaddr_pin);
> +long vaddr_pin_pages_remote(struct task_struct *tsk, struct mm_struct *mm,
> + unsigned long start, unsigned long nr_pages,
> + unsigned int gup_flags, struct page **pages,
> + struct vm_area_struct **vmas, int *locked,
> + struct vaddr_pin *vaddr_pin);
> +void vaddr_unpin_pages(struct page **pages, unsigned long nr_pages,
> + struct vaddr_pin *vaddr_pin);
> void vaddr_unpin_pages_dirty_lock(struct page **pages, unsigned long nr_pages,
> struct vaddr_pin *vaddr_pin, bool make_dirty);
> bool mapping_inode_has_layout(struct vaddr_pin *vaddr_pin, struct page *page);
> diff --git a/mm/gup.c b/mm/gup.c
> index 85f09958fbdc..bb95adfaf9b6 100644
> --- a/mm/gup.c
> +++ b/mm/gup.c
> @@ -2518,6 +2518,38 @@ long vaddr_pin_pages(unsigned long addr, unsigned long nr_pages,
> }
> EXPORT_SYMBOL(vaddr_pin_pages);
>
> +/**
> + * vaddr_pin_pages pin pages by virtual address and return the pages to the
vaddr_pin_pages_remote
Fixed in my tree.
> + * user.
> + *
> + * @tsk: the task_struct to use for page fault accounting, or
> + * NULL if faults are not to be recorded.
> + * @mm: mm_struct of target mm
> + * @addr: start address
> + * @nr_pages: number of pages to pin
> + * @gup_flags: flags to use for the pin
> + * @pages: array of pages returned
> + * @vaddr_pin: initialized meta information this pin is to be associated
> + * with.
> + *
> + * This is the "vaddr_pin_pages" corresponding variant to
> + * get_user_pages_remote(), but with FOLL_PIN semantics: the implementation sets
> + * FOLL_PIN. That, in turn, means that the pages must ultimately be released
> + * by put_user_page().
> + */
> +long vaddr_pin_pages_remote(struct task_struct *tsk, struct mm_struct *mm,
> + unsigned long start, unsigned long nr_pages,
> + unsigned int gup_flags, struct page **pages,
> + struct vm_area_struct **vmas, int *locked,
> + struct vaddr_pin *vaddr_pin)
> +{
> + gup_flags |= FOLL_TOUCH | FOLL_REMOTE | FOLL_PIN;
> +
> + return __get_user_pages_locked(tsk, mm, start, nr_pages, pages, vmas,
> + locked, gup_flags, vaddr_pin);
> +}
> +EXPORT_SYMBOL(vaddr_pin_pages_remote);
> +
> /**
> * vaddr_unpin_pages_dirty_lock - counterpart to vaddr_pin_pages
> *
> @@ -2536,3 +2568,21 @@ void vaddr_unpin_pages_dirty_lock(struct page **pages, unsigned long nr_pages,
> __put_user_pages_dirty_lock(vaddr_pin, pages, nr_pages, make_dirty);
> }
> EXPORT_SYMBOL(vaddr_unpin_pages_dirty_lock);
> +
> +/**
> + * vaddr_unpin_pages - simple, non-dirtying counterpart to vaddr_pin_pages
> + *
> + * @pages: array of pages returned
> + * @nr_pages: number of pages in pages
> + * @vaddr_pin: same information passed to vaddr_pin_pages
> + *
> + * Like vaddr_unpin_pages_dirty_lock, but for non-dirty pages. Useful in putting
> + * back pages in an error case: they were never made dirty.
> + */
> +void vaddr_unpin_pages(struct page **pages, unsigned long nr_pages,
> + struct vaddr_pin *vaddr_pin)
> +{
> + __put_user_pages_dirty_lock(vaddr_pin, pages, nr_pages, false);
> +}
> +EXPORT_SYMBOL(vaddr_unpin_pages);
Rather than have another wrapping call why don't we just do this? Would it be
so bad to just have to specify false for make_dirty?
diff --git a/mm/gup.c b/mm/gup.c
index e77b250c1307..ca660a5e8206 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -2540,7 +2540,7 @@ long vaddr_pin_pages_remote(struct task_struct *tsk, struct mm_struct *mm,
EXPORT_SYMBOL(vaddr_pin_pages_remote);
/**
- * vaddr_unpin_pages_dirty_lock - counterpart to vaddr_pin_pages
+ * vaddr_unpin_pages - counterpart to vaddr_pin_pages
*
* @pages: array of pages returned
* @nr_pages: number of pages in pages
@@ -2551,26 +2551,9 @@ EXPORT_SYMBOL(vaddr_pin_pages_remote);
* in vaddr_pin_pages should be passed back into this call for proper
* tracking.
*/
-void vaddr_unpin_pages_dirty_lock(struct page **pages, unsigned long nr_pages,
- struct vaddr_pin *vaddr_pin, bool make_dirty)
+void vaddr_unpin_pages(struct page **pages, unsigned long nr_pages,
+ struct vaddr_pin *vaddr_pin, bool make_dirty)
{
__put_user_pages_dirty_lock(vaddr_pin, pages, nr_pages, make_dirty);
}
EXPORT_SYMBOL(vaddr_unpin_pages_dirty_lock);
-
-/**
- * vaddr_unpin_pages - simple, non-dirtying counterpart to vaddr_pin_pages
- *
- * @pages: array of pages returned
- * @nr_pages: number of pages in pages
- * @vaddr_pin: same information passed to vaddr_pin_pages
- *
- * Like vaddr_unpin_pages_dirty_lock, but for non-dirty pages. Useful in putting
- * back pages in an error case: they were never made dirty.
- */
-void vaddr_unpin_pages(struct page **pages, unsigned long nr_pages,
- struct vaddr_pin *vaddr_pin)
-{
- __put_user_pages_dirty_lock(vaddr_pin, pages, nr_pages, false);
-}
-EXPORT_SYMBOL(vaddr_unpin_pages);
next prev parent reply other threads:[~2019-08-12 22:03 UTC|newest]
Thread overview: 31+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-08-12 1:50 [RFC PATCH 0/2] mm/gup: introduce vaddr_pin_pages_remote(), FOLL_PIN john.hubbard
2019-08-12 1:50 ` [RFC PATCH 1/2] mm/gup: introduce FOLL_PIN flag for get_user_pages() john.hubbard
2019-08-12 1:50 ` [RFC PATCH 2/2] mm/gup: introduce vaddr_pin_pages_remote() john.hubbard
2019-08-12 22:03 ` Ira Weiny [this message]
2019-08-12 22:21 ` John Hubbard
2019-08-12 23:49 ` Ira Weiny
2019-08-13 0:07 ` John Hubbard
2019-08-13 21:08 ` Ira Weiny
2019-08-14 0:51 ` John Hubbard
2019-08-14 0:56 ` John Hubbard
2019-08-14 23:50 ` Ira Weiny
2019-08-15 0:02 ` John Hubbard
2019-08-15 3:01 ` John Hubbard
2019-08-15 13:26 ` Jan Kara
2019-08-15 13:35 ` Jan Kara
2019-08-15 14:51 ` Jason Gunthorpe
2019-08-15 17:32 ` Ira Weiny
2019-08-15 17:41 ` John Hubbard
2019-08-16 2:14 ` John Hubbard
2019-08-16 15:41 ` Jan Kara
2019-08-16 18:33 ` Ira Weiny
2019-08-16 18:50 ` John Hubbard
2019-08-16 21:59 ` Ira Weiny
2019-08-16 22:36 ` John Hubbard
2019-08-16 8:47 ` Vlastimil Babka
2019-08-16 15:44 ` Jan Kara
2019-08-16 15:52 ` Jerome Glisse
2019-08-16 16:13 ` Jan Kara
2019-08-16 16:31 ` Jason Gunthorpe
2019-08-16 16:54 ` Jerome Glisse
2019-08-16 17:04 ` Jason Gunthorpe
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20190812220340.GA26305@iweiny-DESK2.sc.intel.com \
--to=ira.weiny@intel.com \
--cc=akpm@linux-foundation.org \
--cc=dan.j.williams@intel.com \
--cc=david@fromorbit.com \
--cc=hch@infradead.org \
--cc=jack@suse.cz \
--cc=jgg@ziepe.ca \
--cc=jglisse@redhat.com \
--cc=jhubbard@nvidia.com \
--cc=john.hubbard@gmail.com \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=linux-rdma@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).