* [PATCH v2 0/3] get_user_pages*() and RDMA: first steps
@ 2018-10-05  4:02 john.hubbard
  2018-10-05  4:02 ` [PATCH v2 1/3] mm: get_user_pages: consolidate error handling john.hubbard
                   ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: john.hubbard @ 2018-10-05  4:02 UTC (permalink / raw)
  To: Matthew Wilcox, Michal Hocko, Christopher Lameter,
	Jason Gunthorpe, Dan Williams, Jan Kara
  Cc: linux-mm, LKML, linux-rdma, linux-fsdevel, John Hubbard, Al Viro,
	Jerome Glisse, Christoph Hellwig

From: John Hubbard <jhubbard@nvidia.com>

Changes since v1:

-- Renamed release_user_pages*() to put_user_pages*(), based on Jan's feedback.

-- Removed the goldfish.c changes, and instead, only included a single
   user (infiniband) of the new functions. That is because goldfish.c no
   longer has a name collision (its existing release_user_pages() routine
   no longer clashes with the renamed functions), and also because
   infiniband exercises both the put_user_page() and put_user_pages*()
   paths.

-- Updated links to discussions and plans, so as to be sure to include
   bounce buffers, thanks to Jerome's feedback.

Also:

-- Dennis, thanks for your earlier review. I have not yet added your
   Reviewed-by tag, because this revision changes the code that you
   previously reviewed, so it may need another look.

This short series prepares for eventually fixing the problem described
in [1], and is following a plan listed in [2], [3], [4].

I'd like to get the first two patches into the -mm tree.

Patch 1, although not technically critical to do now, is still nice to
have: it has already been reviewed by Jan, and it is one more item on the
long TODO list here that is ready to be checked off.

Patch 2 is required in order to allow me (and others, if I'm lucky) to
start submitting changes that convert all of the callsites of
get_user_pages*() and put_page().  I think this will work a lot better
than trying to maintain a massive patchset and submitting it all at once.

Patch 3 converts the infiniband drivers from put_page() to put_user_page(),
and also exercises put_user_pages_dirty_lock().

Once these are all in, then the floodgates can open up to convert the large
number of get_user_pages*() callsites.
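
For reference, the per-callsite conversion is small. The common pattern,
which is essentially what patch 3 does in the hfi1 and qib drivers, looks
roughly like this:

Before:

	for (i = 0; i < npages; i++) {
		if (dirty)
			set_page_dirty_lock(pages[i]);
		put_page(pages[i]);
	}

After:

	if (dirty)
		put_user_pages_dirty_lock(pages, npages);
	else
		put_user_pages(pages, npages);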

[1] https://lwn.net/Articles/753027/ : "The Trouble with get_user_pages()"

[2] https://lkml.kernel.org/r/20180709080554.21931-1-jhubbard@nvidia.com
    Proposed steps for fixing get_user_pages() + DMA problems.

[3] https://lkml.kernel.org/r/20180710082100.mkdwngdv5kkrcz6n@quack2.suse.cz
    Bounce buffers (otherwise [2] is not really viable).

[4] https://lkml.kernel.org/r/20181003162115.GG24030@quack2.suse.cz
    Follow-up discussions.

CC: Matthew Wilcox <willy@infradead.org>
CC: Michal Hocko <mhocko@kernel.org>
CC: Christopher Lameter <cl@linux.com>
CC: Jason Gunthorpe <jgg@ziepe.ca>
CC: Dan Williams <dan.j.williams@intel.com>
CC: Jan Kara <jack@suse.cz>
CC: Al Viro <viro@zeniv.linux.org.uk>
CC: Jerome Glisse <jglisse@redhat.com>
CC: Christoph Hellwig <hch@infradead.org>

John Hubbard (3):
  mm: get_user_pages: consolidate error handling
  mm: introduce put_user_page[s](), placeholder versions
  infiniband/mm: convert to the new put_user_page[s]() calls

 drivers/infiniband/core/umem.c              |  2 +-
 drivers/infiniband/core/umem_odp.c          |  2 +-
 drivers/infiniband/hw/hfi1/user_pages.c     | 11 ++----
 drivers/infiniband/hw/mthca/mthca_memfree.c |  6 +--
 drivers/infiniband/hw/qib/qib_user_pages.c  | 11 ++----
 drivers/infiniband/hw/qib/qib_user_sdma.c   |  8 ++--
 drivers/infiniband/hw/usnic/usnic_uiom.c    |  2 +-
 include/linux/mm.h                          | 42 ++++++++++++++++++++-
 mm/gup.c                                    | 37 ++++++++++--------
 9 files changed, 80 insertions(+), 41 deletions(-)

-- 
2.19.0

* [PATCH v2 1/3] mm: get_user_pages: consolidate error handling
  2018-10-05  4:02 [PATCH v2 0/3] get_user_pages*() and RDMA: first steps john.hubbard
@ 2018-10-05  4:02 ` john.hubbard
  2018-10-05  4:02 ` [PATCH v2 2/3] mm: introduce put_user_page[s](), placeholder versions john.hubbard
  2018-10-05  4:02 ` [PATCH v2 3/3] infiniband/mm: convert to the new put_user_page[s]() calls john.hubbard
  2 siblings, 0 replies; 11+ messages in thread
From: john.hubbard @ 2018-10-05  4:02 UTC (permalink / raw)
  To: Matthew Wilcox, Michal Hocko, Christopher Lameter,
	Jason Gunthorpe, Dan Williams, Jan Kara
  Cc: linux-mm, LKML, linux-rdma, linux-fsdevel, John Hubbard

From: John Hubbard <jhubbard@nvidia.com>

An upcoming patch requires a way to operate on each page that
any of the get_user_pages_*() variants returns.

In preparation for that, consolidate the error handling for
__get_user_pages(). This provides a single location (the "out:" label)
for operating on the collected set of pages that are about to be returned.

As long as every use of the "ret" variable is being edited, rename
"ret" --> "err", so that its name matches its true role.
This also gets rid of two shadowed variable declarations, as a tiny
beneficial side effect.

Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 mm/gup.c | 37 ++++++++++++++++++++++---------------
 1 file changed, 22 insertions(+), 15 deletions(-)

diff --git a/mm/gup.c b/mm/gup.c
index 1abc8b4afff6..05ee7c18e59a 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -660,6 +660,7 @@ static long __get_user_pages(struct task_struct *tsk, struct mm_struct *mm,
 		struct vm_area_struct **vmas, int *nonblocking)
 {
 	long i = 0;
+	int err = 0;
 	unsigned int page_mask;
 	struct vm_area_struct *vma = NULL;
 
@@ -685,18 +686,19 @@ static long __get_user_pages(struct task_struct *tsk, struct mm_struct *mm,
 		if (!vma || start >= vma->vm_end) {
 			vma = find_extend_vma(mm, start);
 			if (!vma && in_gate_area(mm, start)) {
-				int ret;
-				ret = get_gate_page(mm, start & PAGE_MASK,
+				err = get_gate_page(mm, start & PAGE_MASK,
 						gup_flags, &vma,
 						pages ? &pages[i] : NULL);
-				if (ret)
-					return i ? : ret;
+				if (err)
+					goto out;
 				page_mask = 0;
 				goto next_page;
 			}
 
-			if (!vma || check_vma_flags(vma, gup_flags))
-				return i ? : -EFAULT;
+			if (!vma || check_vma_flags(vma, gup_flags)) {
+				err = -EFAULT;
+				goto out;
+			}
 			if (is_vm_hugetlb_page(vma)) {
 				i = follow_hugetlb_page(mm, vma, pages, vmas,
 						&start, &nr_pages, i,
@@ -709,23 +711,25 @@ static long __get_user_pages(struct task_struct *tsk, struct mm_struct *mm,
 		 * If we have a pending SIGKILL, don't keep faulting pages and
 		 * potentially allocating memory.
 		 */
-		if (unlikely(fatal_signal_pending(current)))
-			return i ? i : -ERESTARTSYS;
+		if (unlikely(fatal_signal_pending(current))) {
+			err = -ERESTARTSYS;
+			goto out;
+		}
 		cond_resched();
 		page = follow_page_mask(vma, start, foll_flags, &page_mask);
 		if (!page) {
-			int ret;
-			ret = faultin_page(tsk, vma, start, &foll_flags,
+			err = faultin_page(tsk, vma, start, &foll_flags,
 					nonblocking);
-			switch (ret) {
+			switch (err) {
 			case 0:
 				goto retry;
 			case -EFAULT:
 			case -ENOMEM:
 			case -EHWPOISON:
-				return i ? i : ret;
+				goto out;
 			case -EBUSY:
-				return i;
+				err = 0;
+				goto out;
 			case -ENOENT:
 				goto next_page;
 			}
@@ -737,7 +741,8 @@ static long __get_user_pages(struct task_struct *tsk, struct mm_struct *mm,
 			 */
 			goto next_page;
 		} else if (IS_ERR(page)) {
-			return i ? i : PTR_ERR(page);
+			err = PTR_ERR(page);
+			goto out;
 		}
 		if (pages) {
 			pages[i] = page;
@@ -757,7 +762,9 @@ static long __get_user_pages(struct task_struct *tsk, struct mm_struct *mm,
 		start += page_increm * PAGE_SIZE;
 		nr_pages -= page_increm;
 	} while (nr_pages);
-	return i;
+
+out:
+	return i ? i : err;
 }
 
 static bool vma_permits_fault(struct vm_area_struct *vma,
-- 
2.19.0

* [PATCH v2 2/3] mm: introduce put_user_page[s](), placeholder versions
  2018-10-05  4:02 [PATCH v2 0/3] get_user_pages*() and RDMA: first steps john.hubbard
  2018-10-05  4:02 ` [PATCH v2 1/3] mm: get_user_pages: consolidate error handling john.hubbard
@ 2018-10-05  4:02 ` john.hubbard
  2018-10-05 15:17   ` Jason Gunthorpe
  2018-10-05  4:02 ` [PATCH v2 3/3] infiniband/mm: convert to the new put_user_page[s]() calls john.hubbard
  2 siblings, 1 reply; 11+ messages in thread
From: john.hubbard @ 2018-10-05  4:02 UTC (permalink / raw)
  To: Matthew Wilcox, Michal Hocko, Christopher Lameter,
	Jason Gunthorpe, Dan Williams, Jan Kara
  Cc: linux-mm, LKML, linux-rdma, linux-fsdevel, John Hubbard, Al Viro,
	Jerome Glisse, Christoph Hellwig

From: John Hubbard <jhubbard@nvidia.com>

Introduces put_user_page(), which simply calls put_page().
This provides a way to update all get_user_pages*() callers,
so that they call put_user_page(), instead of put_page().

Also introduces put_user_pages(), and a few dirty/locked variations,
as a replacement for release_pages(), for the same reasons.
These may be used for subsequent performance improvements,
via batching of pages to be released.

This prepares for eventually fixing the problem described
in [1], and is following a plan listed in [2], [3], [4].

[1] https://lwn.net/Articles/753027/ : "The Trouble with get_user_pages()"

[2] https://lkml.kernel.org/r/20180709080554.21931-1-jhubbard@nvidia.com
    Proposed steps for fixing get_user_pages() + DMA problems.

[3] https://lkml.kernel.org/r/20180710082100.mkdwngdv5kkrcz6n@quack2.suse.cz
    Bounce buffers (otherwise [2] is not really viable).

[4] https://lkml.kernel.org/r/20181003162115.GG24030@quack2.suse.cz
    Follow-up discussions.

CC: Matthew Wilcox <willy@infradead.org>
CC: Michal Hocko <mhocko@kernel.org>
CC: Christopher Lameter <cl@linux.com>
CC: Jason Gunthorpe <jgg@ziepe.ca>
CC: Dan Williams <dan.j.williams@intel.com>
CC: Jan Kara <jack@suse.cz>
CC: Al Viro <viro@zeniv.linux.org.uk>
CC: Jerome Glisse <jglisse@redhat.com>
CC: Christoph Hellwig <hch@infradead.org>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 include/linux/mm.h | 42 ++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 40 insertions(+), 2 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index a61ebe8ad4ca..1a9aae7c659f 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -137,6 +137,8 @@ extern int overcommit_ratio_handler(struct ctl_table *, int, void __user *,
 				    size_t *, loff_t *);
 extern int overcommit_kbytes_handler(struct ctl_table *, int, void __user *,
 				    size_t *, loff_t *);
+int set_page_dirty(struct page *page);
+int set_page_dirty_lock(struct page *page);
 
 #define nth_page(page,n) pfn_to_page(page_to_pfn((page)) + (n))
 
@@ -943,6 +945,44 @@ static inline void put_page(struct page *page)
 		__put_page(page);
 }
 
+/* Placeholder version, until all get_user_pages*() callers are updated. */
+static inline void put_user_page(struct page *page)
+{
+	put_page(page);
+}
+
+/* For get_user_pages*()-pinned pages, use these variants instead of
+ * release_pages():
+ */
+static inline void put_user_pages_dirty(struct page **pages,
+					unsigned long npages)
+{
+	while (npages) {
+		set_page_dirty(pages[npages]);
+		put_user_page(pages[npages]);
+		--npages;
+	}
+}
+
+static inline void put_user_pages_dirty_lock(struct page **pages,
+					     unsigned long npages)
+{
+	while (npages) {
+		set_page_dirty_lock(pages[npages]);
+		put_user_page(pages[npages]);
+		--npages;
+	}
+}
+
+static inline void put_user_pages(struct page **pages,
+				  unsigned long npages)
+{
+	while (npages) {
+		put_user_page(pages[npages]);
+		--npages;
+	}
+}
+
 #if defined(CONFIG_SPARSEMEM) && !defined(CONFIG_SPARSEMEM_VMEMMAP)
 #define SECTION_IN_PAGE_FLAGS
 #endif
@@ -1534,8 +1574,6 @@ int redirty_page_for_writepage(struct writeback_control *wbc,
 void account_page_dirtied(struct page *page, struct address_space *mapping);
 void account_page_cleaned(struct page *page, struct address_space *mapping,
 			  struct bdi_writeback *wb);
-int set_page_dirty(struct page *page);
-int set_page_dirty_lock(struct page *page);
 void __cancel_dirty_page(struct page *page);
 static inline void cancel_dirty_page(struct page *page)
 {
-- 
2.19.0

* [PATCH v2 3/3] infiniband/mm: convert to the new put_user_page[s]() calls
  2018-10-05  4:02 [PATCH v2 0/3] get_user_pages*() and RDMA: first steps john.hubbard
  2018-10-05  4:02 ` [PATCH v2 1/3] mm: get_user_pages: consolidate error handling john.hubbard
  2018-10-05  4:02 ` [PATCH v2 2/3] mm: introduce put_user_page[s](), placeholder versions john.hubbard
@ 2018-10-05  4:02 ` john.hubbard
  2018-10-05 15:20   ` Jason Gunthorpe
  2 siblings, 1 reply; 11+ messages in thread
From: john.hubbard @ 2018-10-05  4:02 UTC (permalink / raw)
  To: Matthew Wilcox, Michal Hocko, Christopher Lameter,
	Jason Gunthorpe, Dan Williams, Jan Kara
  Cc: linux-mm, LKML, linux-rdma, linux-fsdevel, John Hubbard,
	Doug Ledford, Mike Marciniszyn, Dennis Dalessandro,
	Christian Benvenuti

From: John Hubbard <jhubbard@nvidia.com>

For code that retains pages via get_user_pages*(),
release those pages via the new put_user_page(),
instead of put_page().

This prepares for eventually fixing the problem described
in [1], and is following a plan listed in [2], [3], [4].

[1] https://lwn.net/Articles/753027/ : "The Trouble with get_user_pages()"

[2] https://lkml.kernel.org/r/20180709080554.21931-1-jhubbard@nvidia.com
    Proposed steps for fixing get_user_pages() + DMA problems.

[3] https://lkml.kernel.org/r/20180710082100.mkdwngdv5kkrcz6n@quack2.suse.cz
    Bounce buffers (otherwise [2] is not really viable).

[4] https://lkml.kernel.org/r/20181003162115.GG24030@quack2.suse.cz
    Follow-up discussions.

CC: Doug Ledford <dledford@redhat.com>
CC: Jason Gunthorpe <jgg@ziepe.ca>
CC: Mike Marciniszyn <mike.marciniszyn@intel.com>
CC: Dennis Dalessandro <dennis.dalessandro@intel.com>
CC: Christian Benvenuti <benve@cisco.com>

CC: linux-rdma@vger.kernel.org
CC: linux-kernel@vger.kernel.org
CC: linux-mm@kvack.org
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/infiniband/core/umem.c              |  2 +-
 drivers/infiniband/core/umem_odp.c          |  2 +-
 drivers/infiniband/hw/hfi1/user_pages.c     | 11 ++++-------
 drivers/infiniband/hw/mthca/mthca_memfree.c |  6 +++---
 drivers/infiniband/hw/qib/qib_user_pages.c  | 11 ++++-------
 drivers/infiniband/hw/qib/qib_user_sdma.c   |  8 ++++----
 drivers/infiniband/hw/usnic/usnic_uiom.c    |  2 +-
 7 files changed, 18 insertions(+), 24 deletions(-)

diff --git a/drivers/infiniband/core/umem.c b/drivers/infiniband/core/umem.c
index a41792dbae1f..9430d697cb9f 100644
--- a/drivers/infiniband/core/umem.c
+++ b/drivers/infiniband/core/umem.c
@@ -60,7 +60,7 @@ static void __ib_umem_release(struct ib_device *dev, struct ib_umem *umem, int d
 		page = sg_page(sg);
 		if (!PageDirty(page) && umem->writable && dirty)
 			set_page_dirty_lock(page);
-		put_page(page);
+		put_user_page(page);
 	}
 
 	sg_free_table(&umem->sg_head);
diff --git a/drivers/infiniband/core/umem_odp.c b/drivers/infiniband/core/umem_odp.c
index 6ec748eccff7..6227b89cf05c 100644
--- a/drivers/infiniband/core/umem_odp.c
+++ b/drivers/infiniband/core/umem_odp.c
@@ -717,7 +717,7 @@ int ib_umem_odp_map_dma_pages(struct ib_umem *umem, u64 user_virt, u64 bcnt,
 					ret = -EFAULT;
 					break;
 				}
-				put_page(local_page_list[j]);
+				put_user_page(local_page_list[j]);
 				continue;
 			}
 
diff --git a/drivers/infiniband/hw/hfi1/user_pages.c b/drivers/infiniband/hw/hfi1/user_pages.c
index e341e6dcc388..99ccc0483711 100644
--- a/drivers/infiniband/hw/hfi1/user_pages.c
+++ b/drivers/infiniband/hw/hfi1/user_pages.c
@@ -121,13 +121,10 @@ int hfi1_acquire_user_pages(struct mm_struct *mm, unsigned long vaddr, size_t np
 void hfi1_release_user_pages(struct mm_struct *mm, struct page **p,
 			     size_t npages, bool dirty)
 {
-	size_t i;
-
-	for (i = 0; i < npages; i++) {
-		if (dirty)
-			set_page_dirty_lock(p[i]);
-		put_page(p[i]);
-	}
+	if (dirty)
+		put_user_pages_dirty_lock(p, npages);
+	else
+		put_user_pages(p, npages);
 
 	if (mm) { /* during close after signal, mm can be NULL */
 		down_write(&mm->mmap_sem);
diff --git a/drivers/infiniband/hw/mthca/mthca_memfree.c b/drivers/infiniband/hw/mthca/mthca_memfree.c
index cc9c0c8ccba3..b8b12effd009 100644
--- a/drivers/infiniband/hw/mthca/mthca_memfree.c
+++ b/drivers/infiniband/hw/mthca/mthca_memfree.c
@@ -481,7 +481,7 @@ int mthca_map_user_db(struct mthca_dev *dev, struct mthca_uar *uar,
 
 	ret = pci_map_sg(dev->pdev, &db_tab->page[i].mem, 1, PCI_DMA_TODEVICE);
 	if (ret < 0) {
-		put_page(pages[0]);
+		put_user_page(pages[0]);
 		goto out;
 	}
 
@@ -489,7 +489,7 @@ int mthca_map_user_db(struct mthca_dev *dev, struct mthca_uar *uar,
 				 mthca_uarc_virt(dev, uar, i));
 	if (ret) {
 		pci_unmap_sg(dev->pdev, &db_tab->page[i].mem, 1, PCI_DMA_TODEVICE);
-		put_page(sg_page(&db_tab->page[i].mem));
+		put_user_page(sg_page(&db_tab->page[i].mem));
 		goto out;
 	}
 
@@ -555,7 +555,7 @@ void mthca_cleanup_user_db_tab(struct mthca_dev *dev, struct mthca_uar *uar,
 		if (db_tab->page[i].uvirt) {
 			mthca_UNMAP_ICM(dev, mthca_uarc_virt(dev, uar, i), 1);
 			pci_unmap_sg(dev->pdev, &db_tab->page[i].mem, 1, PCI_DMA_TODEVICE);
-			put_page(sg_page(&db_tab->page[i].mem));
+			put_user_page(sg_page(&db_tab->page[i].mem));
 		}
 	}
 
diff --git a/drivers/infiniband/hw/qib/qib_user_pages.c b/drivers/infiniband/hw/qib/qib_user_pages.c
index 16543d5e80c3..1a5c64c8695f 100644
--- a/drivers/infiniband/hw/qib/qib_user_pages.c
+++ b/drivers/infiniband/hw/qib/qib_user_pages.c
@@ -40,13 +40,10 @@
 static void __qib_release_user_pages(struct page **p, size_t num_pages,
 				     int dirty)
 {
-	size_t i;
-
-	for (i = 0; i < num_pages; i++) {
-		if (dirty)
-			set_page_dirty_lock(p[i]);
-		put_page(p[i]);
-	}
+	if (dirty)
+		put_user_pages_dirty_lock(p, num_pages);
+	else
+		put_user_pages(p, num_pages);
 }
 
 /*
diff --git a/drivers/infiniband/hw/qib/qib_user_sdma.c b/drivers/infiniband/hw/qib/qib_user_sdma.c
index 926f3c8eba69..14f94d823907 100644
--- a/drivers/infiniband/hw/qib/qib_user_sdma.c
+++ b/drivers/infiniband/hw/qib/qib_user_sdma.c
@@ -266,7 +266,7 @@ static void qib_user_sdma_init_frag(struct qib_user_sdma_pkt *pkt,
 	pkt->addr[i].length = len;
 	pkt->addr[i].first_desc = first_desc;
 	pkt->addr[i].last_desc = last_desc;
-	pkt->addr[i].put_page = put_page;
+	pkt->addr[i].put_page = put_user_page;
 	pkt->addr[i].dma_mapped = dma_mapped;
 	pkt->addr[i].page = page;
 	pkt->addr[i].kvaddr = kvaddr;
@@ -321,7 +321,7 @@ static int qib_user_sdma_page_to_frags(const struct qib_devdata *dd,
 		 * the caller can ignore this page.
 		 */
 		if (put) {
-			put_page(page);
+			put_user_page(page);
 		} else {
 			/* coalesce case */
 			kunmap(page);
@@ -635,7 +635,7 @@ static void qib_user_sdma_free_pkt_frag(struct device *dev,
 			kunmap(pkt->addr[i].page);
 
 		if (pkt->addr[i].put_page)
-			put_page(pkt->addr[i].page);
+			put_user_page(pkt->addr[i].page);
 		else
 			__free_page(pkt->addr[i].page);
 	} else if (pkt->addr[i].kvaddr) {
@@ -710,7 +710,7 @@ static int qib_user_sdma_pin_pages(const struct qib_devdata *dd,
 	/* if error, return all pages not managed by pkt */
 free_pages:
 	while (i < j)
-		put_page(pages[i++]);
+		put_user_page(pages[i++]);
 
 done:
 	return ret;
diff --git a/drivers/infiniband/hw/usnic/usnic_uiom.c b/drivers/infiniband/hw/usnic/usnic_uiom.c
index 9dd39daa602b..0f607f31c262 100644
--- a/drivers/infiniband/hw/usnic/usnic_uiom.c
+++ b/drivers/infiniband/hw/usnic/usnic_uiom.c
@@ -91,7 +91,7 @@ static void usnic_uiom_put_pages(struct list_head *chunk_list, int dirty)
 			pa = sg_phys(sg);
 			if (!PageDirty(page) && dirty)
 				set_page_dirty_lock(page);
-			put_page(page);
+			put_user_page(page);
 			usnic_dbg("pa: %pa\n", &pa);
 		}
 		kfree(chunk);
-- 
2.19.0

* Re: [PATCH v2 2/3] mm: introduce put_user_page[s](), placeholder versions
  2018-10-05  4:02 ` [PATCH v2 2/3] mm: introduce put_user_page[s](), placeholder versions john.hubbard
@ 2018-10-05 15:17   ` Jason Gunthorpe
  2018-10-05 19:49     ` John Hubbard
  0 siblings, 1 reply; 11+ messages in thread
From: Jason Gunthorpe @ 2018-10-05 15:17 UTC (permalink / raw)
  To: john.hubbard
  Cc: Matthew Wilcox, Michal Hocko, Christopher Lameter, Dan Williams,
	Jan Kara, linux-mm, LKML, linux-rdma, linux-fsdevel,
	John Hubbard, Al Viro, Jerome Glisse, Christoph Hellwig

On Thu, Oct 04, 2018 at 09:02:24PM -0700, john.hubbard@gmail.com wrote:
> From: John Hubbard <jhubbard@nvidia.com>
> 
> Introduces put_user_page(), which simply calls put_page().
> This provides a way to update all get_user_pages*() callers,
> so that they call put_user_page(), instead of put_page().
> 
> Also introduces put_user_pages(), and a few dirty/locked variations,
> as a replacement for release_pages(), for the same reasons.
> These may be used for subsequent performance improvements,
> via batching of pages to be released.
> 
> This prepares for eventually fixing the problem described
> in [1], and is following a plan listed in [2], [3], [4].
> 
> [1] https://lwn.net/Articles/753027/ : "The Trouble with get_user_pages()"
> 
> [2] https://lkml.kernel.org/r/20180709080554.21931-1-jhubbard@nvidia.com
>     Proposed steps for fixing get_user_pages() + DMA problems.
> 
> [3] https://lkml.kernel.org/r/20180710082100.mkdwngdv5kkrcz6n@quack2.suse.cz
>     Bounce buffers (otherwise [2] is not really viable).
> 
> [4] https://lkml.kernel.org/r/20181003162115.GG24030@quack2.suse.cz
>     Follow-up discussions.
> 
> CC: Matthew Wilcox <willy@infradead.org>
> CC: Michal Hocko <mhocko@kernel.org>
> CC: Christopher Lameter <cl@linux.com>
> CC: Jason Gunthorpe <jgg@ziepe.ca>
> CC: Dan Williams <dan.j.williams@intel.com>
> CC: Jan Kara <jack@suse.cz>
> CC: Al Viro <viro@zeniv.linux.org.uk>
> CC: Jerome Glisse <jglisse@redhat.com>
> CC: Christoph Hellwig <hch@infradead.org>
> Signed-off-by: John Hubbard <jhubbard@nvidia.com>
>  include/linux/mm.h | 42 ++++++++++++++++++++++++++++++++++++++++--
>  1 file changed, 40 insertions(+), 2 deletions(-)
> 
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index a61ebe8ad4ca..1a9aae7c659f 100644
> +++ b/include/linux/mm.h
> @@ -137,6 +137,8 @@ extern int overcommit_ratio_handler(struct ctl_table *, int, void __user *,
>  				    size_t *, loff_t *);
>  extern int overcommit_kbytes_handler(struct ctl_table *, int, void __user *,
>  				    size_t *, loff_t *);
> +int set_page_dirty(struct page *page);
> +int set_page_dirty_lock(struct page *page);
>  
>  #define nth_page(page,n) pfn_to_page(page_to_pfn((page)) + (n))
>  
> @@ -943,6 +945,44 @@ static inline void put_page(struct page *page)
>  		__put_page(page);
>  }
>  
> +/* Placeholder version, until all get_user_pages*() callers are updated. */
> +static inline void put_user_page(struct page *page)
> +{
> +	put_page(page);
> +}
> +
> +/* For get_user_pages*()-pinned pages, use these variants instead of
> + * release_pages():
> + */
> +static inline void put_user_pages_dirty(struct page **pages,
> +					unsigned long npages)
> +{
> +	while (npages) {
> +		set_page_dirty(pages[npages]);
> +		put_user_page(pages[npages]);
> +		--npages;
> +	}
> +}

Shouldn't these do the !PageDirty(page) thing?

Jason

* Re: [PATCH v2 3/3] infiniband/mm: convert to the new put_user_page[s]() calls
  2018-10-05  4:02 ` [PATCH v2 3/3] infiniband/mm: convert to the new put_user_page[s]() calls john.hubbard
@ 2018-10-05 15:20   ` Jason Gunthorpe
  2018-10-05 20:48     ` John Hubbard
  0 siblings, 1 reply; 11+ messages in thread
From: Jason Gunthorpe @ 2018-10-05 15:20 UTC (permalink / raw)
  To: john.hubbard
  Cc: Matthew Wilcox, Michal Hocko, Christopher Lameter, Dan Williams,
	Jan Kara, linux-mm, LKML, linux-rdma, linux-fsdevel,
	John Hubbard, Doug Ledford, Mike Marciniszyn, Dennis Dalessandro,
	Christian Benvenuti

On Thu, Oct 04, 2018 at 09:02:25PM -0700, john.hubbard@gmail.com wrote:
> From: John Hubbard <jhubbard@nvidia.com>
> 
> For code that retains pages via get_user_pages*(),
> release those pages via the new put_user_page(),
> instead of put_page().
> 
> This prepares for eventually fixing the problem described
> in [1], and is following a plan listed in [2], [3], [4].
> 
> [1] https://lwn.net/Articles/753027/ : "The Trouble with get_user_pages()"
> 
> [2] https://lkml.kernel.org/r/20180709080554.21931-1-jhubbard@nvidia.com
>     Proposed steps for fixing get_user_pages() + DMA problems.
> 
> [3] https://lkml.kernel.org/r/20180710082100.mkdwngdv5kkrcz6n@quack2.suse.cz
>     Bounce buffers (otherwise [2] is not really viable).
> 
> [4] https://lkml.kernel.org/r/20181003162115.GG24030@quack2.suse.cz
>     Follow-up discussions.
> 
> CC: Doug Ledford <dledford@redhat.com>
> CC: Jason Gunthorpe <jgg@ziepe.ca>
> CC: Mike Marciniszyn <mike.marciniszyn@intel.com>
> CC: Dennis Dalessandro <dennis.dalessandro@intel.com>
> CC: Christian Benvenuti <benve@cisco.com>
> 
> CC: linux-rdma@vger.kernel.org
> CC: linux-kernel@vger.kernel.org
> CC: linux-mm@kvack.org
> Signed-off-by: John Hubbard <jhubbard@nvidia.com>
>  drivers/infiniband/core/umem.c              |  2 +-
>  drivers/infiniband/core/umem_odp.c          |  2 +-
>  drivers/infiniband/hw/hfi1/user_pages.c     | 11 ++++-------
>  drivers/infiniband/hw/mthca/mthca_memfree.c |  6 +++---
>  drivers/infiniband/hw/qib/qib_user_pages.c  | 11 ++++-------
>  drivers/infiniband/hw/qib/qib_user_sdma.c   |  8 ++++----
>  drivers/infiniband/hw/usnic/usnic_uiom.c    |  2 +-
>  7 files changed, 18 insertions(+), 24 deletions(-)
> 
> diff --git a/drivers/infiniband/core/umem.c b/drivers/infiniband/core/umem.c
> index a41792dbae1f..9430d697cb9f 100644
> +++ b/drivers/infiniband/core/umem.c
> @@ -60,7 +60,7 @@ static void __ib_umem_release(struct ib_device *dev, struct ib_umem *umem, int d
>  		page = sg_page(sg);
>  		if (!PageDirty(page) && umem->writable && dirty)
>  			set_page_dirty_lock(page);
> -		put_page(page);
> +		put_user_page(page);
>  	}

How about ?

if (umem->writable && dirty)
     put_user_pages_dirty_lock(&page, 1);
else
     put_user_page(page);

?

> diff --git a/drivers/infiniband/hw/hfi1/user_pages.c b/drivers/infiniband/hw/hfi1/user_pages.c
> index e341e6dcc388..99ccc0483711 100644
> +++ b/drivers/infiniband/hw/hfi1/user_pages.c
> @@ -121,13 +121,10 @@ int hfi1_acquire_user_pages(struct mm_struct *mm, unsigned long vaddr, size_t np
>  void hfi1_release_user_pages(struct mm_struct *mm, struct page **p,
>  			     size_t npages, bool dirty)
>  {
> -	size_t i;
> -
> -	for (i = 0; i < npages; i++) {
> -		if (dirty)
> -			set_page_dirty_lock(p[i]);
> -		put_page(p[i]);
> -	}
> +	if (dirty)
> +		put_user_pages_dirty_lock(p, npages);
> +	else
> +		put_user_pages(p, npages);

And I know Jan gave the feedback to remove the bool argument, but just
pointing out that quite possibly every caller will wrap it in an if
like this...

Jason

* Re: [PATCH v2 2/3] mm: introduce put_user_page[s](), placeholder versions
  2018-10-05 15:17   ` Jason Gunthorpe
@ 2018-10-05 19:49     ` John Hubbard
  2018-10-05 20:51       ` John Hubbard
  2018-10-05 21:48       ` Jason Gunthorpe
  0 siblings, 2 replies; 11+ messages in thread
From: John Hubbard @ 2018-10-05 19:49 UTC (permalink / raw)
  To: Jason Gunthorpe, john.hubbard
  Cc: Matthew Wilcox, Michal Hocko, Christopher Lameter, Dan Williams,
	Jan Kara, linux-mm, LKML, linux-rdma, linux-fsdevel, Al Viro,
	Jerome Glisse, Christoph Hellwig

On 10/5/18 8:17 AM, Jason Gunthorpe wrote:
> On Thu, Oct 04, 2018 at 09:02:24PM -0700, john.hubbard@gmail.com wrote:
>> From: John Hubbard <jhubbard@nvidia.com>
>>
>> Introduces put_user_page(), which simply calls put_page().
>> This provides a way to update all get_user_pages*() callers,
>> so that they call put_user_page(), instead of put_page().
>>
>> Also introduces put_user_pages(), and a few dirty/locked variations,
>> as a replacement for release_pages(), for the same reasons.
>> These may be used for subsequent performance improvements,
>> via batching of pages to be released.
>>
>> This prepares for eventually fixing the problem described
>> in [1], and is following a plan listed in [2], [3], [4].
>>
>> [1] https://lwn.net/Articles/753027/ : "The Trouble with get_user_pages()"
>>
>> [2] https://lkml.kernel.org/r/20180709080554.21931-1-jhubbard@nvidia.com
>>     Proposed steps for fixing get_user_pages() + DMA problems.
>>
>> [3] https://lkml.kernel.org/r/20180710082100.mkdwngdv5kkrcz6n@quack2.suse.cz
>>     Bounce buffers (otherwise [2] is not really viable).
>>
>> [4] https://lkml.kernel.org/r/20181003162115.GG24030@quack2.suse.cz
>>     Follow-up discussions.
>>
[...]
>>  
>> +/* Placeholder version, until all get_user_pages*() callers are updated. */
>> +static inline void put_user_page(struct page *page)
>> +{
>> +	put_page(page);
>> +}
>> +
>> +/* For get_user_pages*()-pinned pages, use these variants instead of
>> + * release_pages():
>> + */
>> +static inline void put_user_pages_dirty(struct page **pages,
>> +					unsigned long npages)
>> +{
>> +	while (npages) {
>> +		set_page_dirty(pages[npages]);
>> +		put_user_page(pages[npages]);
>> +		--npages;
>> +	}
>> +}
> 
> Shouldn't these do the !PageDirty(page) thing?
> 

Well, not yet. This is the "placeholder" patch, in which I planned to keep
the behavior the same, while I go to all the get_user_pages call sites and change 
put_page() and release_pages() over to use these new routines.

After the call sites are changed, then these routines will be updated to do more.
[2], above has slightly more detail about that.
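
Just to illustrate where this is heading (a rough sketch, not something
this patch does yet), I'd expect the eventual non-placeholder version to
pick up the same !PageDirty() guard that __ib_umem_release() uses today,
along these lines:

	static inline void put_user_pages_dirty_lock(struct page **pages,
						     unsigned long npages)
	{
		unsigned long i;

		for (i = 0; i < npages; i++) {
			struct page *page = pages[i];

			/* Skip the expensive call if the page is already dirty. */
			if (!PageDirty(page))
				set_page_dirty_lock(page);
			put_user_page(page);
		}
	}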


thanks,
-- 
John Hubbard
NVIDIA

* Re: [PATCH v2 3/3] infiniband/mm: convert to the new put_user_page[s]() calls
  2018-10-05 15:20   ` Jason Gunthorpe
@ 2018-10-05 20:48     ` John Hubbard
  0 siblings, 0 replies; 11+ messages in thread
From: John Hubbard @ 2018-10-05 20:48 UTC (permalink / raw)
  To: Jason Gunthorpe, john.hubbard
  Cc: Matthew Wilcox, Michal Hocko, Christopher Lameter, Dan Williams,
	Jan Kara, linux-mm, LKML, linux-rdma, linux-fsdevel,
	Doug Ledford, Mike Marciniszyn, Dennis Dalessandro,
	Christian Benvenuti

On 10/5/18 8:20 AM, Jason Gunthorpe wrote:
> On Thu, Oct 04, 2018 at 09:02:25PM -0700, john.hubbard@gmail.com wrote:
>> From: John Hubbard <jhubbard@nvidia.com>
>>
>> For code that retains pages via get_user_pages*(),
>> release those pages via the new put_user_page(),
>> instead of put_page().
>>
>> This prepares for eventually fixing the problem described
>> in [1], and is following a plan listed in [2], [3], [4].
>>
>> [1] https://lwn.net/Articles/753027/ : "The Trouble with get_user_pages()"
>>
>> [2] https://lkml.kernel.org/r/20180709080554.21931-1-jhubbard@nvidia.com
>>     Proposed steps for fixing get_user_pages() + DMA problems.
>>
>> [3] https://lkml.kernel.org/r/20180710082100.mkdwngdv5kkrcz6n@quack2.suse.cz
>>     Bounce buffers (otherwise [2] is not really viable).
>>
>> [4] https://lkml.kernel.org/r/20181003162115.GG24030@quack2.suse.cz
>>     Follow-up discussions.
>>
>> CC: Doug Ledford <dledford@redhat.com>
>> CC: Jason Gunthorpe <jgg@ziepe.ca>
>> CC: Mike Marciniszyn <mike.marciniszyn@intel.com>
>> CC: Dennis Dalessandro <dennis.dalessandro@intel.com>
>> CC: Christian Benvenuti <benve@cisco.com>
>>
>> CC: linux-rdma@vger.kernel.org
>> CC: linux-kernel@vger.kernel.org
>> CC: linux-mm@kvack.org
>> Signed-off-by: John Hubbard <jhubbard@nvidia.com>
>>  drivers/infiniband/core/umem.c              |  2 +-
>>  drivers/infiniband/core/umem_odp.c          |  2 +-
>>  drivers/infiniband/hw/hfi1/user_pages.c     | 11 ++++-------
>>  drivers/infiniband/hw/mthca/mthca_memfree.c |  6 +++---
>>  drivers/infiniband/hw/qib/qib_user_pages.c  | 11 ++++-------
>>  drivers/infiniband/hw/qib/qib_user_sdma.c   |  8 ++++----
>>  drivers/infiniband/hw/usnic/usnic_uiom.c    |  2 +-
>>  7 files changed, 18 insertions(+), 24 deletions(-)
>>
>> diff --git a/drivers/infiniband/core/umem.c b/drivers/infiniband/core/umem.c
>> index a41792dbae1f..9430d697cb9f 100644
>> +++ b/drivers/infiniband/core/umem.c
>> @@ -60,7 +60,7 @@ static void __ib_umem_release(struct ib_device *dev, struct ib_umem *umem, int d
>>  		page = sg_page(sg);
>>  		if (!PageDirty(page) && umem->writable && dirty)
>>  			set_page_dirty_lock(page);
>> -		put_page(page);
>> +		put_user_page(page);
>>  	}
> 
> How about ?
> 
> if (umem->writable && dirty)
>      put_user_pages_dirty_lock(&page, 1);
> else
>      put_user_page(page);
> 
> ?

OK, I'll make that change.

> 
>> diff --git a/drivers/infiniband/hw/hfi1/user_pages.c b/drivers/infiniband/hw/hfi1/user_pages.c
>> index e341e6dcc388..99ccc0483711 100644
>> +++ b/drivers/infiniband/hw/hfi1/user_pages.c
>> @@ -121,13 +121,10 @@ int hfi1_acquire_user_pages(struct mm_struct *mm, unsigned long vaddr, size_t np
>>  void hfi1_release_user_pages(struct mm_struct *mm, struct page **p,
>>  			     size_t npages, bool dirty)
>>  {
>> -	size_t i;
>> -
>> -	for (i = 0; i < npages; i++) {
>> -		if (dirty)
>> -			set_page_dirty_lock(p[i]);
>> -		put_page(p[i]);
>> -	}
>> +	if (dirty)
>> +		put_user_pages_dirty_lock(p, npages);
>> +	else
>> +		put_user_pages(p, npages);
> 
> And I know Jan gave the feedback to remove the bool argument, but just
> pointing out that quite possibly every caller will wrap it in an if
> like this...
> 

Yes, that attracted me, too. It's nice to write the "if" code once, instead of 
many times. But doing it efficiently requires using a bool argument (otherwise,
you end up with another "if" branch, to convert from bool to an enum or flag arg),
and that's generally avoided because no one wants to see code of the form:

   do_this(0, 1, 0, 1);
   do_this(1, 0, 0, 1);

, which, although hilarious, is still evil. haha. Anyway, maybe I'll leave it as-is
for now, to inject some hysteresis into this aspect of the review?
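
(For the record, the bool-taking form I have in mind is roughly the shape
below. The name is made up, purely for illustration -- the point is that
the "if" gets written exactly once:)

	static inline void put_user_pages_maybe_dirty_lock(struct page **pages,
							   unsigned long npages,
							   bool make_dirty)
	{
		/* One "if", in one place, instead of at every call site. */
		if (make_dirty)
			put_user_pages_dirty_lock(pages, npages);
		else
			put_user_pages(pages, npages);
	}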


thanks,
-- 
John Hubbard
NVIDIA

* Re: [PATCH v2 2/3] mm: introduce put_user_page[s](), placeholder versions
  2018-10-05 19:49     ` John Hubbard
@ 2018-10-05 20:51       ` John Hubbard
  2018-10-05 21:48       ` Jason Gunthorpe
  1 sibling, 0 replies; 11+ messages in thread
From: John Hubbard @ 2018-10-05 20:51 UTC (permalink / raw)
  To: Jason Gunthorpe, john.hubbard
  Cc: Matthew Wilcox, Michal Hocko, Christopher Lameter, Dan Williams,
	Jan Kara, linux-mm, LKML, linux-rdma, linux-fsdevel, Al Viro,
	Jerome Glisse, Christoph Hellwig

On 10/5/18 12:49 PM, John Hubbard wrote:
> On 10/5/18 8:17 AM, Jason Gunthorpe wrote:
>> On Thu, Oct 04, 2018 at 09:02:24PM -0700, john.hubbard@gmail.com wrote:
>>> From: John Hubbard <jhubbard@nvidia.com>
>>>
>>> Introduces put_user_page(), which simply calls put_page().
>>> This provides a way to update all get_user_pages*() callers,
>>> so that they call put_user_page(), instead of put_page().
>>>
>>> Also introduces put_user_pages(), and a few dirty/locked variations,
>>> as a replacement for release_pages(), for the same reasons.
>>> These may be used for subsequent performance improvements,
>>> via batching of pages to be released.
>>>
>>> This prepares for eventually fixing the problem described
>>> in [1], and is following a plan listed in [2], [3], [4].
>>>
>>> [1] https://lwn.net/Articles/753027/ : "The Trouble with get_user_pages()"
>>>
>>> [2] https://lkml.kernel.org/r/20180709080554.21931-1-jhubbard@nvidia.com
>>>     Proposed steps for fixing get_user_pages() + DMA problems.
>>>
>>> [3] https://lkml.kernel.org/r/20180710082100.mkdwngdv5kkrcz6n@quack2.suse.cz
>>>     Bounce buffers (otherwise [2] is not really viable).
>>>
>>> [4] https://lkml.kernel.org/r/20181003162115.GG24030@quack2.suse.cz
>>>     Follow-up discussions.
>>>
> [...]
>>>  
>>> +/* Placeholder version, until all get_user_pages*() callers are updated. */
>>> +static inline void put_user_page(struct page *page)
>>> +{
>>> +	put_page(page);
>>> +}
>>> +
>>> +/* For get_user_pages*()-pinned pages, use these variants instead of
>>> + * release_pages():
>>> + */
>>> +static inline void put_user_pages_dirty(struct page **pages,
>>> +					unsigned long npages)
>>> +{
>>> +	while (npages) {
>>> +		set_page_dirty(pages[npages]);
>>> +		put_user_page(pages[npages]);
>>> +		--npages;
>>> +	}
>>> +}
>>
>> Shouldn't these do the !PageDirty(page) thing?
>>
> 
> Well, not yet. This is the "placeholder" patch, in which I planned to keep
> the behavior the same, while I go to all the get_user_pages call sites and change 
> put_page() and release_pages() over to use these new routines.
> 
> After the call sites are changed, then these routines will be updated to do more.
> [2], above has slightly more detail about that.
> 
> 

Also, I plan to respin again pretty soon, because someone politely pointed out offline
that even in this small patchset, I've botched the handling of the --npages loop, sigh. 
(Thanks, Ralph!)

The original form:

    while(--npages)

was correct, but now it's not so much.
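
The fix in v3 will presumably end up along these lines (sketch only, shown
here for put_user_pages(); the dirty variants get the same treatment):

	static inline void put_user_pages(struct page **pages,
					  unsigned long npages)
	{
		unsigned long i;

		/* Walk 0..npages-1, so that pages[0] is released and
		 * pages[npages] is never touched.
		 */
		for (i = 0; i < npages; i++)
			put_user_page(pages[i]);
	}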

thanks,
-- 
John Hubbard
NVIDIA

* Re: [PATCH v2 2/3] mm: introduce put_user_page[s](), placeholder versions
  2018-10-05 19:49     ` John Hubbard
  2018-10-05 20:51       ` John Hubbard
@ 2018-10-05 21:48       ` Jason Gunthorpe
  2018-10-06  0:03         ` John Hubbard
  1 sibling, 1 reply; 11+ messages in thread
From: Jason Gunthorpe @ 2018-10-05 21:48 UTC (permalink / raw)
  To: John Hubbard
  Cc: john.hubbard, Matthew Wilcox, Michal Hocko, Christopher Lameter,
	Dan Williams, Jan Kara, linux-mm, LKML, linux-rdma,
	linux-fsdevel, Al Viro, Jerome Glisse, Christoph Hellwig

On Fri, Oct 05, 2018 at 12:49:06PM -0700, John Hubbard wrote:
> On 10/5/18 8:17 AM, Jason Gunthorpe wrote:
> > On Thu, Oct 04, 2018 at 09:02:24PM -0700, john.hubbard@gmail.com wrote:
> >> From: John Hubbard <jhubbard@nvidia.com>
> >>
> >> Introduces put_user_page(), which simply calls put_page().
> >> This provides a way to update all get_user_pages*() callers,
> >> so that they call put_user_page(), instead of put_page().
> >>
> >> Also introduces put_user_pages(), and a few dirty/locked variations,
> >> as a replacement for release_pages(), for the same reasons.
> >> These may be used for subsequent performance improvements,
> >> via batching of pages to be released.
> >>
> >> This prepares for eventually fixing the problem described
> >> in [1], and is following a plan listed in [2], [3], [4].
> >>
> >> [1] https://lwn.net/Articles/753027/ : "The Trouble with get_user_pages()"
> >>
> >> [2] https://lkml.kernel.org/r/20180709080554.21931-1-jhubbard@nvidia.com
> >>     Proposed steps for fixing get_user_pages() + DMA problems.
> >>
> >> [3] https://lkml.kernel.org/r/20180710082100.mkdwngdv5kkrcz6n@quack2.suse.cz
> >>     Bounce buffers (otherwise [2] is not really viable).
> >>
> >> [4] https://lkml.kernel.org/r/20181003162115.GG24030@quack2.suse.cz
> >>     Follow-up discussions.
> >>
> [...]
> >>  
> >> +/* Placeholder version, until all get_user_pages*() callers are updated. */
> >> +static inline void put_user_page(struct page *page)
> >> +{
> >> +	put_page(page);
> >> +}
> >> +
> >> +/* For get_user_pages*()-pinned pages, use these variants instead of
> >> + * release_pages():
> >> + */
> >> +static inline void put_user_pages_dirty(struct page **pages,
> >> +					unsigned long npages)
> >> +{
> >> +	while (npages) {
> >> +		set_page_dirty(pages[npages]);
> >> +		put_user_page(pages[npages]);
> >> +		--npages;
> >> +	}
> >> +}
> > 
> > Shouldn't these do the !PageDirty(page) thing?
> > 
> 
> Well, not yet. This is the "placeholder" patch, in which I planned to keep
> the behavior the same, while I go to all the get_user_pages call sites and change 
> put_page() and release_pages() over to use these new routines.

Hmm.. Well, if it is the right thing to do here, why not include it and
take it out of callers when doing the conversion?

If it is the wrong thing, then let us still take it out of callers
when doing the conversion :)

Just seems like things will be in a better place to make future
changes if all the call sites are de-duplicated and correct.

Jason

* Re: [PATCH v2 2/3] mm: introduce put_user_page[s](), placeholder versions
  2018-10-05 21:48       ` Jason Gunthorpe
@ 2018-10-06  0:03         ` John Hubbard
  0 siblings, 0 replies; 11+ messages in thread
From: John Hubbard @ 2018-10-06  0:03 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: john.hubbard, Matthew Wilcox, Michal Hocko, Christopher Lameter,
	Dan Williams, Jan Kara, linux-mm, LKML, linux-rdma,
	linux-fsdevel, Al Viro, Jerome Glisse, Christoph Hellwig

On 10/5/18 2:48 PM, Jason Gunthorpe wrote:
> On Fri, Oct 05, 2018 at 12:49:06PM -0700, John Hubbard wrote:
>> On 10/5/18 8:17 AM, Jason Gunthorpe wrote:
>>> On Thu, Oct 04, 2018 at 09:02:24PM -0700, john.hubbard@gmail.com wrote:
>>>> From: John Hubbard <jhubbard@nvidia.com>
>>>>
>>>> Introduces put_user_page(), which simply calls put_page().
>>>> This provides a way to update all get_user_pages*() callers,
>>>> so that they call put_user_page(), instead of put_page().
>>>>
>>>> Also introduces put_user_pages(), and a few dirty/locked variations,
>>>> as a replacement for release_pages(), for the same reasons.
>>>> These may be used for subsequent performance improvements,
>>>> via batching of pages to be released.
>>>>
>>>> This prepares for eventually fixing the problem described
>>>> in [1], and is following a plan listed in [2], [3], [4].
>>>>
>>>> [1] https://lwn.net/Articles/753027/ : "The Trouble with get_user_pages()"
>>>>
>>>> [2] https://lkml.kernel.org/r/20180709080554.21931-1-jhubbard@nvidia.com
>>>>     Proposed steps for fixing get_user_pages() + DMA problems.
>>>>
>>>> [3] https://lkml.kernel.org/r/20180710082100.mkdwngdv5kkrcz6n@quack2.suse.cz
>>>>     Bounce buffers (otherwise [2] is not really viable).
>>>>
>>>> [4] https://lkml.kernel.org/r/20181003162115.GG24030@quack2.suse.cz
>>>>     Follow-up discussions.
>>>>
>> [...]
>>>>  
>>>> +/* Placeholder version, until all get_user_pages*() callers are updated. */
>>>> +static inline void put_user_page(struct page *page)
>>>> +{
>>>> +	put_page(page);
>>>> +}
>>>> +
>>>> +/* For get_user_pages*()-pinned pages, use these variants instead of
>>>> + * release_pages():
>>>> + */
>>>> +static inline void put_user_pages_dirty(struct page **pages,
>>>> +					unsigned long npages)
>>>> +{
>>>> +	while (npages) {
>>>> +		set_page_dirty(pages[npages]);
>>>> +		put_user_page(pages[npages]);
>>>> +		--npages;
>>>> +	}
>>>> +}
>>>
>>> Shouldn't these do the !PageDirty(page) thing?
>>>
>>
>> Well, not yet. This is the "placeholder" patch, in which I planned to keep
>> the behavior the same, while I go to all the get_user_pages call sites and change 
>> put_page() and release_pages() over to use these new routines.
> 
> Hmm.. Well, if it is the right thing to do here, why not include it and
> take it out of callers when doing the conversion?
> 
> If it is the wrong thing, then let us still take it out of callers
> when doing the conversion :)
> 
> Just seems like things will be in a better place to make future
> changes if all the call sites are de-duplicated and correct.
> 

OK, yes. Let me send out a v3 with that included, then.

thanks,
-- 
John Hubbard
NVIDIA
