* [PATCH 1/2] mm/gup: allow FOLL_FORCE for get_user_pages_fast()
2019-11-25 0:37 [PATCH 0/2] mm/gup + IB: allow FOLL_FORCE for gup_fast and use in IB John Hubbard
@ 2019-11-25 0:37 ` John Hubbard
2019-11-25 0:37 ` [PATCH 2/2] IB/umem: use get_user_pages_fast() to pin DMA pages John Hubbard
2019-11-25 2:07 ` [PATCH 0/2] mm/gup + IB: allow FOLL_FORCE for gup_fast and use in IB John Hubbard
2 siblings, 0 replies; 4+ messages in thread
From: John Hubbard @ 2019-11-25 0:37 UTC (permalink / raw)
To: Andrew Morton, Jason Gunthorpe, Leon Romanovsky, Christoph Hellwig
Cc: Ira Weiny, linux-rdma, linux-mm, LKML, John Hubbard, Christoph Hellwig
Commit 817be129e6f2 ("mm: validate get_user_pages_fast flags") allowed
only FOLL_WRITE and FOLL_LONGTERM to be passed to get_user_pages_fast().
This, combined with the fact that get_user_pages_fast() falls back to
"slow gup", which *does* accept FOLL_FORCE, leads to an odd situation:
if you need FOLL_FORCE, you cannot call get_user_pages_fast().
There does not appear to be any reason for filtering out FOLL_FORCE.
There is nothing in the _fast() implementation that requires that we
avoid writing to the pages. So it appears to have been an oversight.
Fix by allowing FOLL_FORCE to be set for get_user_pages_fast().
Fixes: 817be129e6f2 ("mm: validate get_user_pages_fast flags")
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
mm/gup.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/mm/gup.c b/mm/gup.c
index 8f236a335ae9..745b4036cdfd 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -2401,7 +2401,8 @@ int get_user_pages_fast(unsigned long start, int nr_pages,
unsigned long addr, len, end;
int nr = 0, ret = 0;
- if (WARN_ON_ONCE(gup_flags & ~(FOLL_WRITE | FOLL_LONGTERM)))
+ if (WARN_ON_ONCE(gup_flags & ~(FOLL_WRITE | FOLL_LONGTERM |
+ FOLL_FORCE)))
return -EINVAL;
start = untagged_addr(start) & PAGE_MASK;
--
2.24.0
^ permalink raw reply related [flat|nested] 4+ messages in thread
* [PATCH 2/2] IB/umem: use get_user_pages_fast() to pin DMA pages
2019-11-25 0:37 [PATCH 0/2] mm/gup + IB: allow FOLL_FORCE for gup_fast and use in IB John Hubbard
2019-11-25 0:37 ` [PATCH 1/2] mm/gup: allow FOLL_FORCE for get_user_pages_fast() John Hubbard
@ 2019-11-25 0:37 ` John Hubbard
2019-11-25 2:07 ` [PATCH 0/2] mm/gup + IB: allow FOLL_FORCE for gup_fast and use in IB John Hubbard
2 siblings, 0 replies; 4+ messages in thread
From: John Hubbard @ 2019-11-25 0:37 UTC (permalink / raw)
To: Andrew Morton, Jason Gunthorpe, Leon Romanovsky, Christoph Hellwig
Cc: Ira Weiny, linux-rdma, linux-mm, LKML, John Hubbard,
Christoph Hellwig, Jan Kara, Jason Gunthorpe
And get rid of the mmap_sem calls, as part of that. Note
that get_user_pages_fast() will, if necessary, fall back to
__gup_longterm_unlocked(), which takes the mmap_sem as needed.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
drivers/infiniband/core/umem.c | 17 ++++++-----------
1 file changed, 6 insertions(+), 11 deletions(-)
diff --git a/drivers/infiniband/core/umem.c b/drivers/infiniband/core/umem.c
index 24244a2f68cc..3d664a2539eb 100644
--- a/drivers/infiniband/core/umem.c
+++ b/drivers/infiniband/core/umem.c
@@ -271,16 +271,13 @@ struct ib_umem *ib_umem_get(struct ib_udata *udata, unsigned long addr,
sg = umem->sg_head.sgl;
while (npages) {
- down_read(&mm->mmap_sem);
- ret = get_user_pages(cur_base,
- min_t(unsigned long, npages,
- PAGE_SIZE / sizeof (struct page *)),
- gup_flags | FOLL_LONGTERM,
- page_list, NULL);
- if (ret < 0) {
- up_read(&mm->mmap_sem);
+ ret = get_user_pages_fast(cur_base,
+ min_t(unsigned long, npages,
+ PAGE_SIZE /
+ sizeof(struct page *)),
+ gup_flags | FOLL_LONGTERM, page_list);
+ if (ret < 0)
goto umem_release;
- }
cur_base += ret * PAGE_SIZE;
npages -= ret;
@@ -288,8 +285,6 @@ struct ib_umem *ib_umem_get(struct ib_udata *udata, unsigned long addr,
sg = ib_umem_add_sg_table(sg, page_list, ret,
dma_get_max_seg_size(context->device->dma_device),
&umem->sg_nents);
-
- up_read(&mm->mmap_sem);
}
sg_mark_end(sg);
--
2.24.0
^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: [PATCH 0/2] mm/gup + IB: allow FOLL_FORCE for gup_fast and use in IB
2019-11-25 0:37 [PATCH 0/2] mm/gup + IB: allow FOLL_FORCE for gup_fast and use in IB John Hubbard
2019-11-25 0:37 ` [PATCH 1/2] mm/gup: allow FOLL_FORCE for get_user_pages_fast() John Hubbard
2019-11-25 0:37 ` [PATCH 2/2] IB/umem: use get_user_pages_fast() to pin DMA pages John Hubbard
@ 2019-11-25 2:07 ` John Hubbard
2 siblings, 0 replies; 4+ messages in thread
From: John Hubbard @ 2019-11-25 2:07 UTC (permalink / raw)
To: Andrew Morton, Jason Gunthorpe, Leon Romanovsky, Christoph Hellwig
Cc: Ira Weiny, linux-rdma, linux-mm, LKML
On 11/24/19 4:37 PM, John Hubbard wrote:
> Hi Leon, Jason, Christoph,
>
> Maybe I'm overlooking something, but as I wrote in patch 1, it looks
> like we can simply allow FOLL_FORCE to be passed to gup_fast().
>
> This should fix Leon's reported RDMA failure [1] when using patch 2 by
> itself. (I've compile- and boot-tested these, and also did short LTP
> and fio with direct IO tests, but I don't have an Infiniband runtime
> setup that exercises the umem.c code.)
>
> [1] https://lore.kernel.org/r/20191124100724.GH136476@unreal
>
> John Hubbard (2):
> mm/gup: allow FOLL_FORCE for get_user_pages_fast()
> IB/umem: use get_user_pages_fast() to pin DMA pages
>
> drivers/infiniband/core/umem.c | 17 ++++++-----------
> mm/gup.c | 3 ++-
> 2 files changed, 8 insertions(+), 12 deletions(-)
>
OK, based on Jason's response [1] that it's too late to put this into
5.5, let's withdraw this, and I'll resend when it's time to send out
patches for 5.6.
[1] https://lore.kernel.org/r/20191125005339.GC5634@ziepe.ca
thanks,
--
John Hubbard
NVIDIA
^ permalink raw reply [flat|nested] 4+ messages in thread