linux-rdma.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/2] mm/gup + IB: allow FOLL_FORCE for gup_fast and use in IB
@ 2019-11-25  0:37 John Hubbard
  2019-11-25  0:37 ` [PATCH 1/2] mm/gup: allow FOLL_FORCE for get_user_pages_fast() John Hubbard
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: John Hubbard @ 2019-11-25  0:37 UTC (permalink / raw)
  To: Andrew Morton, Jason Gunthorpe, Leon Romanovsky, Christoph Hellwig
  Cc: Ira Weiny, linux-rdma, linux-mm, LKML, John Hubbard

Hi Leon, Jason, Christoph,

Maybe I'm overlooking something, but as I wrote in patch 1, it looks
like we can simply allow FOLL_FORCE to be passed to gup_fast().

This should fix Leon's reported RDMA failure [1]  when using patch 2 by
itself. (I've compile- and boot-tested these, and also did short LTP
and fio with direct IO tests, but I don't have an Infiniband runtime
setup that exercises the umem.c code.)

[1] https://lore.kernel.org/r/20191124100724.GH136476@unreal

John Hubbard (2):
  mm/gup: allow FOLL_FORCE for get_user_pages_fast()
  IB/umem: use get_user_pages_fast() to pin DMA pages

 drivers/infiniband/core/umem.c | 17 ++++++-----------
 mm/gup.c                       |  3 ++-
 2 files changed, 8 insertions(+), 12 deletions(-)

-- 
2.24.0


^ permalink raw reply	[flat|nested] 4+ messages in thread

* [PATCH 1/2] mm/gup: allow FOLL_FORCE for get_user_pages_fast()
  2019-11-25  0:37 [PATCH 0/2] mm/gup + IB: allow FOLL_FORCE for gup_fast and use in IB John Hubbard
@ 2019-11-25  0:37 ` John Hubbard
  2019-11-25  0:37 ` [PATCH 2/2] IB/umem: use get_user_pages_fast() to pin DMA pages John Hubbard
  2019-11-25  2:07 ` [PATCH 0/2] mm/gup + IB: allow FOLL_FORCE for gup_fast and use in IB John Hubbard
  2 siblings, 0 replies; 4+ messages in thread
From: John Hubbard @ 2019-11-25  0:37 UTC (permalink / raw)
  To: Andrew Morton, Jason Gunthorpe, Leon Romanovsky, Christoph Hellwig
  Cc: Ira Weiny, linux-rdma, linux-mm, LKML, John Hubbard, Christoph Hellwig

Commit 817be129e6f2 ("mm: validate get_user_pages_fast flags") allowed
only FOLL_WRITE and FOLL_LONGTERM to be passed to get_user_pages_fast().
This, combined with the fact that get_user_pages_fast() falls back to
"slow gup", which *does* accept FOLL_FORCE, leads to an odd situation:
if you need FOLL_FORCE, you cannot call get_user_pages_fast().

There does not appear to be any reason for filtering out FOLL_FORCE.
There is nothing in the _fast() implementation that requires that we
avoid writing to the pages. So it appears to have been an oversight.

Fix by allowing FOLL_FORCE to be set for get_user_pages_fast().

Fixes: 817be129e6f2 ("mm: validate get_user_pages_fast flags")
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 mm/gup.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/mm/gup.c b/mm/gup.c
index 8f236a335ae9..745b4036cdfd 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -2401,7 +2401,8 @@ int get_user_pages_fast(unsigned long start, int nr_pages,
 	unsigned long addr, len, end;
 	int nr = 0, ret = 0;
 
-	if (WARN_ON_ONCE(gup_flags & ~(FOLL_WRITE | FOLL_LONGTERM)))
+	if (WARN_ON_ONCE(gup_flags & ~(FOLL_WRITE | FOLL_LONGTERM |
+				       FOLL_FORCE)))
 		return -EINVAL;
 
 	start = untagged_addr(start) & PAGE_MASK;
-- 
2.24.0


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* [PATCH 2/2] IB/umem: use get_user_pages_fast() to pin DMA pages
  2019-11-25  0:37 [PATCH 0/2] mm/gup + IB: allow FOLL_FORCE for gup_fast and use in IB John Hubbard
  2019-11-25  0:37 ` [PATCH 1/2] mm/gup: allow FOLL_FORCE for get_user_pages_fast() John Hubbard
@ 2019-11-25  0:37 ` John Hubbard
  2019-11-25  2:07 ` [PATCH 0/2] mm/gup + IB: allow FOLL_FORCE for gup_fast and use in IB John Hubbard
  2 siblings, 0 replies; 4+ messages in thread
From: John Hubbard @ 2019-11-25  0:37 UTC (permalink / raw)
  To: Andrew Morton, Jason Gunthorpe, Leon Romanovsky, Christoph Hellwig
  Cc: Ira Weiny, linux-rdma, linux-mm, LKML, John Hubbard,
	Christoph Hellwig, Jan Kara, Jason Gunthorpe

And get rid of the mmap_sem calls, as part of that. Note
that get_user_pages_fast() will, if necessary, fall back to
__gup_longterm_unlocked(), which takes the mmap_sem as needed.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/infiniband/core/umem.c | 17 ++++++-----------
 1 file changed, 6 insertions(+), 11 deletions(-)

diff --git a/drivers/infiniband/core/umem.c b/drivers/infiniband/core/umem.c
index 24244a2f68cc..3d664a2539eb 100644
--- a/drivers/infiniband/core/umem.c
+++ b/drivers/infiniband/core/umem.c
@@ -271,16 +271,13 @@ struct ib_umem *ib_umem_get(struct ib_udata *udata, unsigned long addr,
 	sg = umem->sg_head.sgl;
 
 	while (npages) {
-		down_read(&mm->mmap_sem);
-		ret = get_user_pages(cur_base,
-				     min_t(unsigned long, npages,
-					   PAGE_SIZE / sizeof (struct page *)),
-				     gup_flags | FOLL_LONGTERM,
-				     page_list, NULL);
-		if (ret < 0) {
-			up_read(&mm->mmap_sem);
+		ret = get_user_pages_fast(cur_base,
+					  min_t(unsigned long, npages,
+						PAGE_SIZE /
+						sizeof(struct page *)),
+					  gup_flags | FOLL_LONGTERM, page_list);
+		if (ret < 0)
 			goto umem_release;
-		}
 
 		cur_base += ret * PAGE_SIZE;
 		npages   -= ret;
@@ -288,8 +285,6 @@ struct ib_umem *ib_umem_get(struct ib_udata *udata, unsigned long addr,
 		sg = ib_umem_add_sg_table(sg, page_list, ret,
 			dma_get_max_seg_size(context->device->dma_device),
 			&umem->sg_nents);
-
-		up_read(&mm->mmap_sem);
 	}
 
 	sg_mark_end(sg);
-- 
2.24.0


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH 0/2] mm/gup + IB: allow FOLL_FORCE for gup_fast and use in IB
  2019-11-25  0:37 [PATCH 0/2] mm/gup + IB: allow FOLL_FORCE for gup_fast and use in IB John Hubbard
  2019-11-25  0:37 ` [PATCH 1/2] mm/gup: allow FOLL_FORCE for get_user_pages_fast() John Hubbard
  2019-11-25  0:37 ` [PATCH 2/2] IB/umem: use get_user_pages_fast() to pin DMA pages John Hubbard
@ 2019-11-25  2:07 ` John Hubbard
  2 siblings, 0 replies; 4+ messages in thread
From: John Hubbard @ 2019-11-25  2:07 UTC (permalink / raw)
  To: Andrew Morton, Jason Gunthorpe, Leon Romanovsky, Christoph Hellwig
  Cc: Ira Weiny, linux-rdma, linux-mm, LKML

On 11/24/19 4:37 PM, John Hubbard wrote:
> Hi Leon, Jason, Christoph,
> 
> Maybe I'm overlooking something, but as I wrote in patch 1, it looks
> like we can simply allow FOLL_FORCE to be passed to gup_fast().
> 
> This should fix Leon's reported RDMA failure [1]  when using patch 2 by
> itself. (I've compile- and boot-tested these, and also did short LTP
> and fio with direct IO tests, but I don't have an Infiniband runtime
> setup that exercises the umem.c code.)
> 
> [1] https://lore.kernel.org/r/20191124100724.GH136476@unreal
> 
> John Hubbard (2):
>   mm/gup: allow FOLL_FORCE for get_user_pages_fast()
>   IB/umem: use get_user_pages_fast() to pin DMA pages
> 
>  drivers/infiniband/core/umem.c | 17 ++++++-----------
>  mm/gup.c                       |  3 ++-
>  2 files changed, 8 insertions(+), 12 deletions(-)
> 

OK, based on Jason's response [1] that it's too late to put this into
5.5, let's withdraw this, and I'll resend when it's time to send out
patches for 5.6.

[1] https://lore.kernel.org/r/20191125005339.GC5634@ziepe.ca

thanks,
-- 
John Hubbard
NVIDIA

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2019-11-25  2:07 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-11-25  0:37 [PATCH 0/2] mm/gup + IB: allow FOLL_FORCE for gup_fast and use in IB John Hubbard
2019-11-25  0:37 ` [PATCH 1/2] mm/gup: allow FOLL_FORCE for get_user_pages_fast() John Hubbard
2019-11-25  0:37 ` [PATCH 2/2] IB/umem: use get_user_pages_fast() to pin DMA pages John Hubbard
2019-11-25  2:07 ` [PATCH 0/2] mm/gup + IB: allow FOLL_FORCE for gup_fast and use in IB John Hubbard

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).