amd-gfx.lists.freedesktop.org archive mirror
* [PATCH v2 hmm 0/9] Small hmm_range_fault() cleanups
@ 2020-03-24  1:14 Jason Gunthorpe
  2020-03-24  1:14 ` [PATCH v2 hmm 1/9] mm/hmm: remove pgmap checking for devmap pages Jason Gunthorpe
                   ` (9 more replies)
  0 siblings, 10 replies; 22+ messages in thread
From: Jason Gunthorpe @ 2020-03-24  1:14 UTC (permalink / raw)
  To: Jerome Glisse, Ralph Campbell, Felix.Kuehling
  Cc: Philip Yang, John Hubbard, amd-gfx, linux-mm, Jason Gunthorpe,
	dri-devel, Christoph Hellwig

From: Jason Gunthorpe <jgg@mellanox.com>

This is v2 of the first simple series, with a few additional patches that make
small adjustments.

This needs an additional patch to the hmm tester:

diff --git a/tools/testing/selftests/vm/hmm-tests.c b/tools/testing/selftests/vm/hmm-tests.c
index 033a12c7ab5b6d..da15471a2bbf9a 100644
--- a/tools/testing/selftests/vm/hmm-tests.c
+++ b/tools/testing/selftests/vm/hmm-tests.c
@@ -1274,7 +1274,7 @@ TEST_F(hmm2, snapshot)
 	/* Check what the device saw. */
 	m = buffer->mirror;
 	ASSERT_EQ(m[0], HMM_DMIRROR_PROT_ERROR);
-	ASSERT_EQ(m[1], HMM_DMIRROR_PROT_NONE);
+	ASSERT_EQ(m[1], HMM_DMIRROR_PROT_ERROR);
 	ASSERT_EQ(m[2], HMM_DMIRROR_PROT_ZERO | HMM_DMIRROR_PROT_READ);
 	ASSERT_EQ(m[3], HMM_DMIRROR_PROT_READ);
 	ASSERT_EQ(m[4], HMM_DMIRROR_PROT_WRITE);

v2 changes:
 - Simplify and rename the flags, rework hmm_vma_walk_test in patch 2 (CH)
 - Adjust more comments in patch 3 (CH, Ralph)
 - Put the ugly boolean logic into a function in patch 3 (CH)
 - Update commit message of patch 4 (CH)
 - Adjust formatting in patch 5 (CH)
 - Patches 6, 7, 8 are new

v1: https://lore.kernel.org/r/20200320164905.21722-1-jgg@ziepe.ca

Jason Gunthorpe (9):
  mm/hmm: remove pgmap checking for devmap pages
  mm/hmm: return the fault type from hmm_pte_need_fault()
  mm/hmm: remove unused code and tidy comments
  mm/hmm: remove HMM_FAULT_SNAPSHOT
  mm/hmm: remove the CONFIG_TRANSPARENT_HUGEPAGE #ifdef
  mm/hmm: use device_private_entry_to_pfn()
  mm/hmm: do not unconditionally set pfns when returning EBUSY
  mm/hmm: do not set pfns when returning an error code
  mm/hmm: return error for non-vma snapshots

 Documentation/vm/hmm.rst                |  12 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c |   2 +-
 drivers/gpu/drm/nouveau/nouveau_svm.c   |   2 +-
 include/linux/hmm.h                     | 109 +--------
 mm/hmm.c                                | 312 ++++++++++--------------
 5 files changed, 133 insertions(+), 304 deletions(-)

-- 
2.25.2

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


* [PATCH v2 hmm 1/9] mm/hmm: remove pgmap checking for devmap pages
  2020-03-24  1:14 [PATCH v2 hmm 0/9] Small hmm_range_fault() cleanups Jason Gunthorpe
@ 2020-03-24  1:14 ` Jason Gunthorpe
  2020-03-24  1:14 ` [PATCH v2 hmm 2/9] mm/hmm: return the fault type from hmm_pte_need_fault() Jason Gunthorpe
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 22+ messages in thread
From: Jason Gunthorpe @ 2020-03-24  1:14 UTC (permalink / raw)
  To: Jerome Glisse, Ralph Campbell, Felix.Kuehling
  Cc: Philip Yang, John Hubbard, amd-gfx, linux-mm, Jason Gunthorpe,
	dri-devel, Christoph Hellwig

From: Jason Gunthorpe <jgg@mellanox.com>

The checking boils down to a racy check of whether the pagemap is still
available. Instead of checking this, rely entirely on the notifiers: if a
pagemap is destroyed then all pages that belong to it must be removed from
the tables and the notifiers triggered.

Reviewed-by: Ralph Campbell <rcampbell@nvidia.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
---
 mm/hmm.c | 50 ++------------------------------------------------
 1 file changed, 2 insertions(+), 48 deletions(-)

diff --git a/mm/hmm.c b/mm/hmm.c
index a491d9aaafe45d..3a2610e0713329 100644
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -28,7 +28,6 @@
 
 struct hmm_vma_walk {
 	struct hmm_range	*range;
-	struct dev_pagemap	*pgmap;
 	unsigned long		last;
 	unsigned int		flags;
 };
@@ -196,19 +195,8 @@ static int hmm_vma_handle_pmd(struct mm_walk *walk, unsigned long addr,
 		return hmm_vma_fault(addr, end, fault, write_fault, walk);
 
 	pfn = pmd_pfn(pmd) + ((addr & ~PMD_MASK) >> PAGE_SHIFT);
-	for (i = 0; addr < end; addr += PAGE_SIZE, i++, pfn++) {
-		if (pmd_devmap(pmd)) {
-			hmm_vma_walk->pgmap = get_dev_pagemap(pfn,
-					      hmm_vma_walk->pgmap);
-			if (unlikely(!hmm_vma_walk->pgmap))
-				return -EBUSY;
-		}
+	for (i = 0; addr < end; addr += PAGE_SIZE, i++, pfn++)
 		pfns[i] = hmm_device_entry_from_pfn(range, pfn) | cpu_flags;
-	}
-	if (hmm_vma_walk->pgmap) {
-		put_dev_pagemap(hmm_vma_walk->pgmap);
-		hmm_vma_walk->pgmap = NULL;
-	}
 	hmm_vma_walk->last = end;
 	return 0;
 }
@@ -300,15 +288,6 @@ static int hmm_vma_handle_pte(struct mm_walk *walk, unsigned long addr,
 	if (fault || write_fault)
 		goto fault;
 
-	if (pte_devmap(pte)) {
-		hmm_vma_walk->pgmap = get_dev_pagemap(pte_pfn(pte),
-					      hmm_vma_walk->pgmap);
-		if (unlikely(!hmm_vma_walk->pgmap)) {
-			pte_unmap(ptep);
-			return -EBUSY;
-		}
-	}
-
 	/*
 	 * Since each architecture defines a struct page for the zero page, just
 	 * fall through and treat it like a normal page.
@@ -328,10 +307,6 @@ static int hmm_vma_handle_pte(struct mm_walk *walk, unsigned long addr,
 	return 0;
 
 fault:
-	if (hmm_vma_walk->pgmap) {
-		put_dev_pagemap(hmm_vma_walk->pgmap);
-		hmm_vma_walk->pgmap = NULL;
-	}
 	pte_unmap(ptep);
 	/* Fault any virtual address we were asked to fault */
 	return hmm_vma_fault(addr, end, fault, write_fault, walk);
@@ -418,16 +393,6 @@ static int hmm_vma_walk_pmd(pmd_t *pmdp,
 			return r;
 		}
 	}
-	if (hmm_vma_walk->pgmap) {
-		/*
-		 * We do put_dev_pagemap() here and not in hmm_vma_handle_pte()
-		 * so that we can leverage get_dev_pagemap() optimization which
-		 * will not re-take a reference on a pgmap if we already have
-		 * one.
-		 */
-		put_dev_pagemap(hmm_vma_walk->pgmap);
-		hmm_vma_walk->pgmap = NULL;
-	}
 	pte_unmap(ptep - 1);
 
 	hmm_vma_walk->last = addr;
@@ -491,20 +456,9 @@ static int hmm_vma_walk_pud(pud_t *pudp, unsigned long start, unsigned long end,
 		}
 
 		pfn = pud_pfn(pud) + ((addr & ~PUD_MASK) >> PAGE_SHIFT);
-		for (i = 0; i < npages; ++i, ++pfn) {
-			hmm_vma_walk->pgmap = get_dev_pagemap(pfn,
-					      hmm_vma_walk->pgmap);
-			if (unlikely(!hmm_vma_walk->pgmap)) {
-				ret = -EBUSY;
-				goto out_unlock;
-			}
+		for (i = 0; i < npages; ++i, ++pfn)
 			pfns[i] = hmm_device_entry_from_pfn(range, pfn) |
 				  cpu_flags;
-		}
-		if (hmm_vma_walk->pgmap) {
-			put_dev_pagemap(hmm_vma_walk->pgmap);
-			hmm_vma_walk->pgmap = NULL;
-		}
 		hmm_vma_walk->last = end;
 		goto out_unlock;
 	}
-- 
2.25.2


* [PATCH v2 hmm 2/9] mm/hmm: return the fault type from hmm_pte_need_fault()
  2020-03-24  1:14 [PATCH v2 hmm 0/9] Small hmm_range_fault() cleanups Jason Gunthorpe
  2020-03-24  1:14 ` [PATCH v2 hmm 1/9] mm/hmm: remove pgmap checking for devmap pages Jason Gunthorpe
@ 2020-03-24  1:14 ` Jason Gunthorpe
  2020-03-24  7:27   ` Christoph Hellwig
  2020-03-24  1:14 ` [PATCH v2 hmm 3/9] mm/hmm: remove unused code and tidy comments Jason Gunthorpe
                   ` (7 subsequent siblings)
  9 siblings, 1 reply; 22+ messages in thread
From: Jason Gunthorpe @ 2020-03-24  1:14 UTC (permalink / raw)
  To: Jerome Glisse, Ralph Campbell, Felix.Kuehling
  Cc: Philip Yang, John Hubbard, amd-gfx, linux-mm, Jason Gunthorpe,
	dri-devel, Christoph Hellwig

From: Jason Gunthorpe <jgg@mellanox.com>

Reporting the result through two bool out-parameters instead of a flags
return value is not necessary and leads to bugs. Returning a value is easier
for the compiler to check and easier to pass around the code flow.

Convert the two bools into flags and push the change to all callers.
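
For readers following along, the flag encoding can be tried out in a small
userspace model. This is only a sketch, not the kernel code: the enum values
are copied from this patch, but the DEMO_* bits and demo_* functions are
hypothetical stand-ins for range->flags[] and the hmm_*_need_fault() helpers,
and the default_flags/pfn_flags_mask step is left out.

```c
#include <assert.h>
#include <stdint.h>

/* Flag values copied from the patch. */
enum {
	HMM_NEED_FAULT = 1 << 0,
	HMM_NEED_WRITE_FAULT = HMM_NEED_FAULT | (1 << 1),
	HMM_NEED_ALL_BITS = HMM_NEED_FAULT | HMM_NEED_WRITE_FAULT,
};

/* Hypothetical stand-ins for range->flags[HMM_PFN_VALID/HMM_PFN_WRITE]. */
#define DEMO_PFN_VALID (1ull << 0)
#define DEMO_PFN_WRITE (1ull << 1)

/* Simplified model of hmm_pte_need_fault(): request bits vs. CPU state. */
static unsigned int demo_pte_need_fault(uint64_t req, uint64_t cpu_flags)
{
	/* Nothing was requested for this page. */
	if (!(req & DEMO_PFN_VALID))
		return 0;
	/* Write access requested but the CPU entry is not writable. */
	if ((req & DEMO_PFN_WRITE) && !(cpu_flags & DEMO_PFN_WRITE))
		return HMM_NEED_WRITE_FAULT;
	/* The CPU entry is not valid at all. */
	if (!(cpu_flags & DEMO_PFN_VALID))
		return HMM_NEED_FAULT;
	return 0;
}

/* Simplified model of hmm_range_need_fault(): OR the per-page results. */
static unsigned int demo_range_need_fault(const uint64_t *reqs,
					  unsigned long npages,
					  uint64_t cpu_flags)
{
	unsigned int required_fault = 0;
	unsigned long i;

	assert(reqs || npages == 0);
	for (i = 0; i < npages; ++i) {
		required_fault |= demo_pte_need_fault(reqs[i], cpu_flags);
		/* Every bit is already set, later pages cannot add more. */
		if (required_fault == HMM_NEED_ALL_BITS)
			break;
	}
	return required_fault;
}

/* A write fault implies a fault, so compare against the whole mask. */
static int demo_is_write_fault(unsigned int required_fault)
{
	return (required_fault & HMM_NEED_WRITE_FAULT) == HMM_NEED_WRITE_FAULT;
}
```

Because HMM_NEED_WRITE_FAULT includes the HMM_NEED_FAULT bit, a plain
"if (required_fault)" covers both cases, and the write case has to be tested
against the whole mask rather than with a simple "&".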

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
---
 mm/hmm.c | 183 ++++++++++++++++++++++++-------------------------------
 1 file changed, 81 insertions(+), 102 deletions(-)

diff --git a/mm/hmm.c b/mm/hmm.c
index 3a2610e0713329..2a0eda1534bcda 100644
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -32,6 +32,12 @@ struct hmm_vma_walk {
 	unsigned int		flags;
 };
 
+enum {
+	HMM_NEED_FAULT = 1 << 0,
+	HMM_NEED_WRITE_FAULT = HMM_NEED_FAULT | (1 << 1),
+	HMM_NEED_ALL_BITS = HMM_NEED_FAULT | HMM_NEED_WRITE_FAULT,
+};
+
 static int hmm_pfns_fill(unsigned long addr, unsigned long end,
 		struct hmm_range *range, enum hmm_pfn_value_e value)
 {
@@ -49,8 +55,7 @@ static int hmm_pfns_fill(unsigned long addr, unsigned long end,
  * hmm_vma_fault() - fault in a range lacking valid pmd or pte(s)
  * @addr: range virtual start address (inclusive)
  * @end: range virtual end address (exclusive)
- * @fault: should we fault or not ?
- * @write_fault: write fault ?
+ * @required_fault: HMM_NEED_* flags
  * @walk: mm_walk structure
  * Return: -EBUSY after page fault, or page fault error
  *
@@ -58,8 +63,7 @@ static int hmm_pfns_fill(unsigned long addr, unsigned long end,
  * or whenever there is no page directory covering the virtual address range.
  */
 static int hmm_vma_fault(unsigned long addr, unsigned long end,
-			      bool fault, bool write_fault,
-			      struct mm_walk *walk)
+			 unsigned int required_fault, struct mm_walk *walk)
 {
 	struct hmm_vma_walk *hmm_vma_walk = walk->private;
 	struct hmm_range *range = hmm_vma_walk->range;
@@ -68,13 +72,13 @@ static int hmm_vma_fault(unsigned long addr, unsigned long end,
 	unsigned long i = (addr - range->start) >> PAGE_SHIFT;
 	unsigned int fault_flags = FAULT_FLAG_REMOTE;
 
-	WARN_ON_ONCE(!fault && !write_fault);
+	WARN_ON_ONCE(!required_fault);
 	hmm_vma_walk->last = addr;
 
 	if (!vma)
 		goto out_error;
 
-	if (write_fault) {
+	if ((required_fault & HMM_NEED_WRITE_FAULT) == HMM_NEED_WRITE_FAULT) {
 		if (!(vma->vm_flags & VM_WRITE))
 			return -EPERM;
 		fault_flags |= FAULT_FLAG_WRITE;
@@ -91,14 +95,13 @@ static int hmm_vma_fault(unsigned long addr, unsigned long end,
 	return -EFAULT;
 }
 
-static inline void hmm_pte_need_fault(const struct hmm_vma_walk *hmm_vma_walk,
-				      uint64_t pfns, uint64_t cpu_flags,
-				      bool *fault, bool *write_fault)
+static unsigned int hmm_pte_need_fault(const struct hmm_vma_walk *hmm_vma_walk,
+				       uint64_t pfns, uint64_t cpu_flags)
 {
 	struct hmm_range *range = hmm_vma_walk->range;
 
 	if (hmm_vma_walk->flags & HMM_FAULT_SNAPSHOT)
-		return;
+		return 0;
 
 	/*
 	 * So we not only consider the individual per page request we also
@@ -114,37 +117,37 @@ static inline void hmm_pte_need_fault(const struct hmm_vma_walk *hmm_vma_walk,
 
 	/* We aren't ask to do anything ... */
 	if (!(pfns & range->flags[HMM_PFN_VALID]))
-		return;
+		return 0;
 
-	/* If CPU page table is not valid then we need to fault */
-	*fault = !(cpu_flags & range->flags[HMM_PFN_VALID]);
 	/* Need to write fault ? */
 	if ((pfns & range->flags[HMM_PFN_WRITE]) &&
-	    !(cpu_flags & range->flags[HMM_PFN_WRITE])) {
-		*write_fault = true;
-		*fault = true;
-	}
+	    !(cpu_flags & range->flags[HMM_PFN_WRITE]))
+		return HMM_NEED_WRITE_FAULT;
+
+	/* If CPU page table is not valid then we need to fault */
+	if (!(cpu_flags & range->flags[HMM_PFN_VALID]))
+		return HMM_NEED_FAULT;
+	return 0;
 }
 
-static void hmm_range_need_fault(const struct hmm_vma_walk *hmm_vma_walk,
-				 const uint64_t *pfns, unsigned long npages,
-				 uint64_t cpu_flags, bool *fault,
-				 bool *write_fault)
+static unsigned int
+hmm_range_need_fault(const struct hmm_vma_walk *hmm_vma_walk,
+		     const uint64_t *pfns, unsigned long npages,
+		     uint64_t cpu_flags)
 {
+	unsigned int required_fault = 0;
 	unsigned long i;
 
-	if (hmm_vma_walk->flags & HMM_FAULT_SNAPSHOT) {
-		*fault = *write_fault = false;
-		return;
-	}
+	if (hmm_vma_walk->flags & HMM_FAULT_SNAPSHOT)
+		return 0;
 
-	*fault = *write_fault = false;
 	for (i = 0; i < npages; ++i) {
-		hmm_pte_need_fault(hmm_vma_walk, pfns[i], cpu_flags,
-				   fault, write_fault);
-		if ((*write_fault))
-			return;
+		required_fault |=
+			hmm_pte_need_fault(hmm_vma_walk, pfns[i], cpu_flags);
+		if (required_fault == HMM_NEED_ALL_BITS)
+			return required_fault;
 	}
+	return required_fault;
 }
 
 static int hmm_vma_walk_hole(unsigned long addr, unsigned long end,
@@ -152,17 +155,16 @@ static int hmm_vma_walk_hole(unsigned long addr, unsigned long end,
 {
 	struct hmm_vma_walk *hmm_vma_walk = walk->private;
 	struct hmm_range *range = hmm_vma_walk->range;
-	bool fault, write_fault;
+	unsigned int required_fault;
 	unsigned long i, npages;
 	uint64_t *pfns;
 
 	i = (addr - range->start) >> PAGE_SHIFT;
 	npages = (end - addr) >> PAGE_SHIFT;
 	pfns = &range->pfns[i];
-	hmm_range_need_fault(hmm_vma_walk, pfns, npages,
-			     0, &fault, &write_fault);
-	if (fault || write_fault)
-		return hmm_vma_fault(addr, end, fault, write_fault, walk);
+	required_fault = hmm_range_need_fault(hmm_vma_walk, pfns, npages, 0);
+	if (required_fault)
+		return hmm_vma_fault(addr, end, required_fault, walk);
 	hmm_vma_walk->last = addr;
 	return hmm_pfns_fill(addr, end, range, HMM_PFN_NONE);
 }
@@ -183,16 +185,15 @@ static int hmm_vma_handle_pmd(struct mm_walk *walk, unsigned long addr,
 	struct hmm_vma_walk *hmm_vma_walk = walk->private;
 	struct hmm_range *range = hmm_vma_walk->range;
 	unsigned long pfn, npages, i;
-	bool fault, write_fault;
+	unsigned int required_fault;
 	uint64_t cpu_flags;
 
 	npages = (end - addr) >> PAGE_SHIFT;
 	cpu_flags = pmd_to_hmm_pfn_flags(range, pmd);
-	hmm_range_need_fault(hmm_vma_walk, pfns, npages, cpu_flags,
-			     &fault, &write_fault);
-
-	if (fault || write_fault)
-		return hmm_vma_fault(addr, end, fault, write_fault, walk);
+	required_fault =
+		hmm_range_need_fault(hmm_vma_walk, pfns, npages, cpu_flags);
+	if (required_fault)
+		return hmm_vma_fault(addr, end, required_fault, walk);
 
 	pfn = pmd_pfn(pmd) + ((addr & ~PMD_MASK) >> PAGE_SHIFT);
 	for (i = 0; addr < end; addr += PAGE_SIZE, i++, pfn++)
@@ -229,18 +230,15 @@ static int hmm_vma_handle_pte(struct mm_walk *walk, unsigned long addr,
 {
 	struct hmm_vma_walk *hmm_vma_walk = walk->private;
 	struct hmm_range *range = hmm_vma_walk->range;
-	bool fault, write_fault;
+	unsigned int required_fault;
 	uint64_t cpu_flags;
 	pte_t pte = *ptep;
 	uint64_t orig_pfn = *pfn;
 
 	*pfn = range->values[HMM_PFN_NONE];
-	fault = write_fault = false;
-
 	if (pte_none(pte)) {
-		hmm_pte_need_fault(hmm_vma_walk, orig_pfn, 0,
-				   &fault, &write_fault);
-		if (fault || write_fault)
+		required_fault = hmm_pte_need_fault(hmm_vma_walk, orig_pfn, 0);
+		if (required_fault)
 			goto fault;
 		return 0;
 	}
@@ -261,9 +259,8 @@ static int hmm_vma_handle_pte(struct mm_walk *walk, unsigned long addr,
 			return 0;
 		}
 
-		hmm_pte_need_fault(hmm_vma_walk, orig_pfn, 0, &fault,
-				   &write_fault);
-		if (!fault && !write_fault)
+		required_fault = hmm_pte_need_fault(hmm_vma_walk, orig_pfn, 0);
+		if (!required_fault)
 			return 0;
 
 		if (!non_swap_entry(entry))
@@ -283,9 +280,8 @@ static int hmm_vma_handle_pte(struct mm_walk *walk, unsigned long addr,
 	}
 
 	cpu_flags = pte_to_hmm_pfn_flags(range, pte);
-	hmm_pte_need_fault(hmm_vma_walk, orig_pfn, cpu_flags, &fault,
-			   &write_fault);
-	if (fault || write_fault)
+	required_fault = hmm_pte_need_fault(hmm_vma_walk, orig_pfn, cpu_flags);
+	if (required_fault)
 		goto fault;
 
 	/*
@@ -293,9 +289,7 @@ static int hmm_vma_handle_pte(struct mm_walk *walk, unsigned long addr,
 	 * fall through and treat it like a normal page.
 	 */
 	if (pte_special(pte) && !is_zero_pfn(pte_pfn(pte))) {
-		hmm_pte_need_fault(hmm_vma_walk, orig_pfn, 0, &fault,
-				   &write_fault);
-		if (fault || write_fault) {
+		if (hmm_pte_need_fault(hmm_vma_walk, orig_pfn, 0)) {
 			pte_unmap(ptep);
 			return -EFAULT;
 		}
@@ -309,7 +303,7 @@ static int hmm_vma_handle_pte(struct mm_walk *walk, unsigned long addr,
 fault:
 	pte_unmap(ptep);
 	/* Fault any virtual address we were asked to fault */
-	return hmm_vma_fault(addr, end, fault, write_fault, walk);
+	return hmm_vma_fault(addr, end, required_fault, walk);
 }
 
 static int hmm_vma_walk_pmd(pmd_t *pmdp,
@@ -322,7 +316,6 @@ static int hmm_vma_walk_pmd(pmd_t *pmdp,
 	uint64_t *pfns = &range->pfns[(start - range->start) >> PAGE_SHIFT];
 	unsigned long npages = (end - start) >> PAGE_SHIFT;
 	unsigned long addr = start;
-	bool fault, write_fault;
 	pte_t *ptep;
 	pmd_t pmd;
 
@@ -332,9 +325,7 @@ static int hmm_vma_walk_pmd(pmd_t *pmdp,
 		return hmm_vma_walk_hole(start, end, -1, walk);
 
 	if (thp_migration_supported() && is_pmd_migration_entry(pmd)) {
-		hmm_range_need_fault(hmm_vma_walk, pfns, npages,
-				     0, &fault, &write_fault);
-		if (fault || write_fault) {
+		if (hmm_range_need_fault(hmm_vma_walk, pfns, npages, 0)) {
 			hmm_vma_walk->last = addr;
 			pmd_migration_entry_wait(walk->mm, pmdp);
 			return -EBUSY;
@@ -343,9 +334,7 @@ static int hmm_vma_walk_pmd(pmd_t *pmdp,
 	}
 
 	if (!pmd_present(pmd)) {
-		hmm_range_need_fault(hmm_vma_walk, pfns, npages, 0, &fault,
-				     &write_fault);
-		if (fault || write_fault)
+		if (hmm_range_need_fault(hmm_vma_walk, pfns, npages, 0))
 			return -EFAULT;
 		return hmm_pfns_fill(start, end, range, HMM_PFN_ERROR);
 	}
@@ -375,9 +364,7 @@ static int hmm_vma_walk_pmd(pmd_t *pmdp,
 	 * recover.
 	 */
 	if (pmd_bad(pmd)) {
-		hmm_range_need_fault(hmm_vma_walk, pfns, npages, 0, &fault,
-				     &write_fault);
-		if (fault || write_fault)
+		if (hmm_range_need_fault(hmm_vma_walk, pfns, npages, 0))
 			return -EFAULT;
 		return hmm_pfns_fill(start, end, range, HMM_PFN_ERROR);
 	}
@@ -434,8 +421,8 @@ static int hmm_vma_walk_pud(pud_t *pudp, unsigned long start, unsigned long end,
 
 	if (pud_huge(pud) && pud_devmap(pud)) {
 		unsigned long i, npages, pfn;
+		unsigned int required_fault;
 		uint64_t *pfns, cpu_flags;
-		bool fault, write_fault;
 
 		if (!pud_present(pud)) {
 			spin_unlock(ptl);
@@ -447,12 +434,11 @@ static int hmm_vma_walk_pud(pud_t *pudp, unsigned long start, unsigned long end,
 		pfns = &range->pfns[i];
 
 		cpu_flags = pud_to_hmm_pfn_flags(range, pud);
-		hmm_range_need_fault(hmm_vma_walk, pfns, npages,
-				     cpu_flags, &fault, &write_fault);
-		if (fault || write_fault) {
+		required_fault = hmm_range_need_fault(hmm_vma_walk, pfns,
+						      npages, cpu_flags);
+		if (required_fault) {
 			spin_unlock(ptl);
-			return hmm_vma_fault(addr, end, fault, write_fault,
-						  walk);
+			return hmm_vma_fault(addr, end, required_fault, walk);
 		}
 
 		pfn = pud_pfn(pud) + ((addr & ~PUD_MASK) >> PAGE_SHIFT);
@@ -484,7 +470,7 @@ static int hmm_vma_walk_hugetlb_entry(pte_t *pte, unsigned long hmask,
 	struct hmm_range *range = hmm_vma_walk->range;
 	struct vm_area_struct *vma = walk->vma;
 	uint64_t orig_pfn, cpu_flags;
-	bool fault, write_fault;
+	unsigned int required_fault;
 	spinlock_t *ptl;
 	pte_t entry;
 
@@ -495,12 +481,10 @@ static int hmm_vma_walk_hugetlb_entry(pte_t *pte, unsigned long hmask,
 	orig_pfn = range->pfns[i];
 	range->pfns[i] = range->values[HMM_PFN_NONE];
 	cpu_flags = pte_to_hmm_pfn_flags(range, entry);
-	fault = write_fault = false;
-	hmm_pte_need_fault(hmm_vma_walk, orig_pfn, cpu_flags,
-			   &fault, &write_fault);
-	if (fault || write_fault) {
+	required_fault = hmm_pte_need_fault(hmm_vma_walk, orig_pfn, cpu_flags);
+	if (required_fault) {
 		spin_unlock(ptl);
-		return hmm_vma_fault(addr, end, fault, write_fault, walk);
+		return hmm_vma_fault(addr, end, required_fault, walk);
 	}
 
 	pfn = pte_pfn(entry) + ((start & ~hmask) >> PAGE_SHIFT);
@@ -522,37 +506,32 @@ static int hmm_vma_walk_test(unsigned long start, unsigned long end,
 	struct hmm_range *range = hmm_vma_walk->range;
 	struct vm_area_struct *vma = walk->vma;
 
+	if (!(vma->vm_flags & (VM_IO | VM_PFNMAP | VM_MIXEDMAP)) &&
+	    vma->vm_flags & VM_READ)
+		return 0;
+
 	/*
-	 * Skip vma ranges that don't have struct page backing them or map I/O
-	 * devices directly.
+	 * vma ranges that don't have struct page backing them or map I/O
+	 * devices directly cannot be handled by hmm_range_fault().
 	 *
 	 * If the vma does not allow read access, then assume that it does not
 	 * allow write access either. HMM does not support architectures that
 	 * allow write without read.
+	 *
+	 * If a fault is requested for an unsupported range then it is a hard
+	 * failure.
 	 */
-	if ((vma->vm_flags & (VM_IO | VM_PFNMAP | VM_MIXEDMAP)) ||
-	    !(vma->vm_flags & VM_READ)) {
-		bool fault, write_fault;
-
-		/*
-		 * Check to see if a fault is requested for any page in the
-		 * range.
-		 */
-		hmm_range_need_fault(hmm_vma_walk, range->pfns +
-					((start - range->start) >> PAGE_SHIFT),
-					(end - start) >> PAGE_SHIFT,
-					0, &fault, &write_fault);
-		if (fault || write_fault)
-			return -EFAULT;
-
-		hmm_pfns_fill(start, end, range, HMM_PFN_ERROR);
-		hmm_vma_walk->last = end;
+	if (hmm_range_need_fault(hmm_vma_walk,
+				 range->pfns +
+					 ((start - range->start) >> PAGE_SHIFT),
+				 (end - start) >> PAGE_SHIFT, 0))
+		return -EFAULT;
 
-		/* Skip this vma and continue processing the next vma. */
-		return 1;
-	}
+	hmm_pfns_fill(start, end, range, HMM_PFN_ERROR);
+	hmm_vma_walk->last = end;
 
-	return 0;
+	/* Skip this vma and continue processing the next vma. */
+	return 1;
 }
 
 static const struct mm_walk_ops hmm_walk_ops = {
-- 
2.25.2


* [PATCH v2 hmm 3/9] mm/hmm: remove unused code and tidy comments
  2020-03-24  1:14 [PATCH v2 hmm 0/9] Small hmm_range_fault() cleanups Jason Gunthorpe
  2020-03-24  1:14 ` [PATCH v2 hmm 1/9] mm/hmm: remove pgmap checking for devmap pages Jason Gunthorpe
  2020-03-24  1:14 ` [PATCH v2 hmm 2/9] mm/hmm: return the fault type from hmm_pte_need_fault() Jason Gunthorpe
@ 2020-03-24  1:14 ` Jason Gunthorpe
  2020-03-24  7:27   ` Christoph Hellwig
  2020-03-24  1:14 ` [PATCH v2 hmm 4/9] mm/hmm: remove HMM_FAULT_SNAPSHOT Jason Gunthorpe
                   ` (6 subsequent siblings)
  9 siblings, 1 reply; 22+ messages in thread
From: Jason Gunthorpe @ 2020-03-24  1:14 UTC (permalink / raw)
  To: Jerome Glisse, Ralph Campbell, Felix.Kuehling
  Cc: Philip Yang, John Hubbard, amd-gfx, linux-mm, Jason Gunthorpe,
	dri-devel, Christoph Hellwig

From: Jason Gunthorpe <jgg@mellanox.com>

Delete several functions that are never called, fix some desync between
comments and structure content, toss the now out-of-date top-of-file
header, and move one function only used by hmm.c into hmm.c.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
---
 include/linux/hmm.h | 104 +-------------------------------------------
 mm/hmm.c            |  24 +++++++---
 2 files changed, 19 insertions(+), 109 deletions(-)

diff --git a/include/linux/hmm.h b/include/linux/hmm.h
index bb6be4428633a8..daee6508a3f609 100644
--- a/include/linux/hmm.h
+++ b/include/linux/hmm.h
@@ -3,58 +3,8 @@
  * Copyright 2013 Red Hat Inc.
  *
  * Authors: Jérôme Glisse <jglisse@redhat.com>
- */
-/*
- * Heterogeneous Memory Management (HMM)
- *
- * See Documentation/vm/hmm.rst for reasons and overview of what HMM is and it
- * is for. Here we focus on the HMM API description, with some explanation of
- * the underlying implementation.
- *
- * Short description: HMM provides a set of helpers to share a virtual address
- * space between CPU and a device, so that the device can access any valid
- * address of the process (while still obeying memory protection). HMM also
- * provides helpers to migrate process memory to device memory, and back. Each
- * set of functionality (address space mirroring, and migration to and from
- * device memory) can be used independently of the other.
- *
- *
- * HMM address space mirroring API:
- *
- * Use HMM address space mirroring if you want to mirror a range of the CPU
- * page tables of a process into a device page table. Here, "mirror" means "keep
- * synchronized". Prerequisites: the device must provide the ability to write-
- * protect its page tables (at PAGE_SIZE granularity), and must be able to
- * recover from the resulting potential page faults.
  *
- * HMM guarantees that at any point in time, a given virtual address points to
- * either the same memory in both CPU and device page tables (that is: CPU and
- * device page tables each point to the same pages), or that one page table (CPU
- * or device) points to no entry, while the other still points to the old page
- * for the address. The latter case happens when the CPU page table update
- * happens first, and then the update is mirrored over to the device page table.
- * This does not cause any issue, because the CPU page table cannot start
- * pointing to a new page until the device page table is invalidated.
- *
- * HMM uses mmu_notifiers to monitor the CPU page tables, and forwards any
- * updates to each device driver that has registered a mirror. It also provides
- * some API calls to help with taking a snapshot of the CPU page table, and to
- * synchronize with any updates that might happen concurrently.
- *
- *
- * HMM migration to and from device memory:
- *
- * HMM provides a set of helpers to hotplug device memory as ZONE_DEVICE, with
- * a new MEMORY_DEVICE_PRIVATE type. This provides a struct page for each page
- * of the device memory, and allows the device driver to manage its memory
- * using those struct pages. Having struct pages for device memory makes
- * migration easier. Because that memory is not addressable by the CPU it must
- * never be pinned to the device; in other words, any CPU page fault can always
- * cause the device memory to be migrated (copied/moved) back to regular memory.
- *
- * A new migrate helper (migrate_vma()) has been added (see mm/migrate.c) that
- * allows use of a device DMA engine to perform the copy operation between
- * regular system memory and device memory.
+ * See Documentation/vm/hmm.rst for reasons and overview of what HMM is.
  */
 #ifndef LINUX_HMM_H
 #define LINUX_HMM_H
@@ -120,9 +70,6 @@ enum hmm_pfn_value_e {
  *
  * @notifier: a mmu_interval_notifier that includes the start/end
  * @notifier_seq: result of mmu_interval_read_begin()
- * @hmm: the core HMM structure this range is active against
- * @vma: the vm area struct for the range
- * @list: all range lock are on a list
  * @start: range virtual start address (inclusive)
  * @end: range virtual end address (exclusive)
  * @pfns: array of pfns (big enough for the range)
@@ -130,8 +77,7 @@ enum hmm_pfn_value_e {
  * @values: pfn value for some special case (none, special, error, ...)
  * @default_flags: default flags for the range (write, read, ... see hmm doc)
  * @pfn_flags_mask: allows to mask pfn flags so that only default_flags matter
- * @pfn_shifts: pfn shift value (should be <= PAGE_SHIFT)
- * @valid: pfns array did not change since it has been fill by an HMM function
+ * @pfn_shift: pfn shift value (should be <= PAGE_SHIFT)
  * @dev_private_owner: owner of device private pages
  */
 struct hmm_range {
@@ -171,52 +117,6 @@ static inline struct page *hmm_device_entry_to_page(const struct hmm_range *rang
 	return pfn_to_page(entry >> range->pfn_shift);
 }
 
-/*
- * hmm_device_entry_to_pfn() - return pfn value store in a device entry
- * @range: range use to decode device entry value
- * @entry: device entry to extract pfn from
- * Return: pfn value if device entry is valid, -1UL otherwise
- */
-static inline unsigned long
-hmm_device_entry_to_pfn(const struct hmm_range *range, uint64_t pfn)
-{
-	if (pfn == range->values[HMM_PFN_NONE])
-		return -1UL;
-	if (pfn == range->values[HMM_PFN_ERROR])
-		return -1UL;
-	if (pfn == range->values[HMM_PFN_SPECIAL])
-		return -1UL;
-	if (!(pfn & range->flags[HMM_PFN_VALID]))
-		return -1UL;
-	return (pfn >> range->pfn_shift);
-}
-
-/*
- * hmm_device_entry_from_page() - create a valid device entry for a page
- * @range: range use to encode HMM pfn value
- * @page: page for which to create the device entry
- * Return: valid device entry for the page
- */
-static inline uint64_t hmm_device_entry_from_page(const struct hmm_range *range,
-						  struct page *page)
-{
-	return (page_to_pfn(page) << range->pfn_shift) |
-		range->flags[HMM_PFN_VALID];
-}
-
-/*
- * hmm_device_entry_from_pfn() - create a valid device entry value from pfn
- * @range: range use to encode HMM pfn value
- * @pfn: pfn value for which to create the device entry
- * Return: valid device entry for the pfn
- */
-static inline uint64_t hmm_device_entry_from_pfn(const struct hmm_range *range,
-						 unsigned long pfn)
-{
-	return (pfn << range->pfn_shift) |
-		range->flags[HMM_PFN_VALID];
-}
-
 /* Don't fault in missing PTEs, just snapshot the current state. */
 #define HMM_FAULT_SNAPSHOT		(1 << 1)
 
diff --git a/mm/hmm.c b/mm/hmm.c
index 2a0eda1534bcda..c298c936469bbb 100644
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -38,6 +38,18 @@ enum {
 	HMM_NEED_ALL_BITS = HMM_NEED_FAULT | HMM_NEED_WRITE_FAULT,
 };
 
+/*
+ * hmm_device_entry_from_pfn() - create a valid device entry value from pfn
+ * @range: range use to encode HMM pfn value
+ * @pfn: pfn value for which to create the device entry
+ * Return: valid device entry for the pfn
+ */
+static uint64_t hmm_device_entry_from_pfn(const struct hmm_range *range,
+					  unsigned long pfn)
+{
+	return (pfn << range->pfn_shift) | range->flags[HMM_PFN_VALID];
+}
+
 static int hmm_pfns_fill(unsigned long addr, unsigned long end,
 		struct hmm_range *range, enum hmm_pfn_value_e value)
 {
@@ -544,7 +556,7 @@ static const struct mm_walk_ops hmm_walk_ops = {
 
 /**
  * hmm_range_fault - try to fault some address in a virtual address range
- * @range:	range being faulted
+ * @range:	argument structure
  * @flags:	HMM_FAULT_* flags
  *
  * Return: the number of valid pages in range->pfns[] (from range start
@@ -558,13 +570,11 @@ static const struct mm_walk_ops hmm_walk_ops = {
  *		only).
  * -EBUSY:	The range has been invalidated and the caller needs to wait for
  *		the invalidation to finish.
- * -EFAULT:	Invalid (i.e., either no valid vma or it is illegal to access
- *		that range) number of valid pages in range->pfns[] (from
- *              range start address).
+ * -EFAULT:     A page was requested to be valid and could not be made valid
+ *              ie it has no backing VMA or it is illegal to access
  *
- * This is similar to a regular CPU page fault except that it will not trigger
- * any memory migration if the memory being faulted is not accessible by CPUs
- * and caller does not ask for migration.
+ * This is similar to get_user_pages(), except that it can read the page tables
+ * without mutating them (ie causing faults).
  *
  * On error, for one virtual address in the range, the function will mark the
  * corresponding HMM pfn entry with an error flag.
-- 
2.25.2


* [PATCH v2 hmm 4/9] mm/hmm: remove HMM_FAULT_SNAPSHOT
  2020-03-24  1:14 [PATCH v2 hmm 0/9] Small hmm_range_fault() cleanups Jason Gunthorpe
                   ` (2 preceding siblings ...)
  2020-03-24  1:14 ` [PATCH v2 hmm 3/9] mm/hmm: remove unused code and tidy comments Jason Gunthorpe
@ 2020-03-24  1:14 ` Jason Gunthorpe
  2020-03-24  7:33   ` Christoph Hellwig
  2020-03-24  1:14 ` [PATCH v2 hmm 5/9] mm/hmm: remove the CONFIG_TRANSPARENT_HUGEPAGE #ifdef Jason Gunthorpe
                   ` (5 subsequent siblings)
  9 siblings, 1 reply; 22+ messages in thread
From: Jason Gunthorpe @ 2020-03-24  1:14 UTC (permalink / raw)
  To: Jerome Glisse, Ralph Campbell, Felix.Kuehling
  Cc: Philip Yang, John Hubbard, amd-gfx, linux-mm, Jason Gunthorpe,
	dri-devel, Christoph Hellwig

From: Jason Gunthorpe <jgg@mellanox.com>

Now that flags are handled on a fine-grained per-page basis, this global
flag is redundant and has a confusing overlap with the pfn_flags_mask and
default_flags.

Normalize the HMM_FAULT_SNAPSHOT behavior into one place. Callers needing
the SNAPSHOT behavior should set a pfn_flags_mask and default_flags that
always result in a cleared HMM_PFN_VALID. Then no pages will be faulted,
and HMM_FAULT_SNAPSHOT is not a special flow that overrides the masking
mechanism.
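
As a rough illustration of the paragraph above, the following userspace sketch
models how a caller could select snapshot or fault-everything behavior purely
through default_flags and pfn_flags_mask. The struct is a cut-down stand-in
for struct hmm_range, and demo_flags and the demo_setup_*() helpers are
hypothetical names, not part of the kernel API; only the hmm_can_fault()
expression is copied from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel definitions; not the real headers. */
enum hmm_pfn_flag_e { HMM_PFN_VALID = 0, HMM_PFN_WRITE, HMM_PFN_FLAG_MAX };

struct hmm_range {
	const uint64_t *flags;		/* driver encoding of each HMM_PFN_* flag */
	uint64_t default_flags;		/* requested for every page in the range */
	uint64_t pfn_flags_mask;	/* which per-page request bits are honoured */
};

/* Hypothetical driver encoding of the flag bits. */
static const uint64_t demo_flags[HMM_PFN_FLAG_MAX] = {
	[HMM_PFN_VALID] = 1ull << 0,
	[HMM_PFN_WRITE] = 1ull << 1,
};

/* Same expression as hmm_can_fault() in the patch. */
static bool hmm_can_fault(const struct hmm_range *range)
{
	return ((range->flags[HMM_PFN_VALID] & range->pfn_flags_mask) |
		range->default_flags) &
	       range->flags[HMM_PFN_VALID];
}

/* Snapshot: nothing requested by default, per-page requests masked off. */
static void demo_setup_snapshot(struct hmm_range *range)
{
	range->flags = demo_flags;
	range->default_flags = 0;
	range->pfn_flags_mask = 0;
}

/* Fault everything: every page in the range must come back valid. */
static void demo_setup_fault_all(struct hmm_range *range)
{
	range->flags = demo_flags;
	range->default_flags = demo_flags[HMM_PFN_VALID];
	range->pfn_flags_mask = 0;
}
```

With the snapshot setup hmm_can_fault() evaluates to false, so no page in the
range can trigger a fault; with the fault-all setup it evaluates to true.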

As this is the last flag, also remove the flags argument. If future flags
are needed they can be part of the struct hmm_range function arguments.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
---
 Documentation/vm/hmm.rst                | 12 +++++-------
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c |  2 +-
 drivers/gpu/drm/nouveau/nouveau_svm.c   |  2 +-
 include/linux/hmm.h                     |  5 +----
 mm/hmm.c                                | 22 ++++++++++++++--------
 5 files changed, 22 insertions(+), 21 deletions(-)

diff --git a/Documentation/vm/hmm.rst b/Documentation/vm/hmm.rst
index 95fec596836262..4e3e9362afeb10 100644
--- a/Documentation/vm/hmm.rst
+++ b/Documentation/vm/hmm.rst
@@ -161,13 +161,11 @@ device must complete the update before the driver callback returns.
 When the device driver wants to populate a range of virtual addresses, it can
 use::
 
-  long hmm_range_fault(struct hmm_range *range, unsigned int flags);
+  long hmm_range_fault(struct hmm_range *range);
 
-With the HMM_RANGE_SNAPSHOT flag, it will only fetch present CPU page table
-entries and will not trigger a page fault on missing or non-present entries.
-Without that flag, it does trigger a page fault on missing or read-only entries
-if write access is requested (see below). Page faults use the generic mm page
-fault code path just like a CPU page fault.
+It will trigger a page fault on missing or read-only entries if write access is
+requested (see below). Page faults use the generic mm page fault code path just
+like a CPU page fault.
 
 Both functions copy CPU page table entries into their pfns array argument. Each
 entry in that array corresponds to an address in the virtual range. HMM
@@ -197,7 +195,7 @@ The usage pattern is::
  again:
       range.notifier_seq = mmu_interval_read_begin(&interval_sub);
       down_read(&mm->mmap_sem);
-      ret = hmm_range_fault(&range, HMM_RANGE_SNAPSHOT);
+      ret = hmm_range_fault(&range);
       if (ret) {
           up_read(&mm->mmap_sem);
           if (ret == -EBUSY)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index 90821ce5e6cad0..c520290709371b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -856,7 +856,7 @@ int amdgpu_ttm_tt_get_user_pages(struct amdgpu_bo *bo, struct page **pages)
 	range->notifier_seq = mmu_interval_read_begin(&bo->notifier);
 
 	down_read(&mm->mmap_sem);
-	r = hmm_range_fault(range, 0);
+	r = hmm_range_fault(range);
 	up_read(&mm->mmap_sem);
 	if (unlikely(r <= 0)) {
 		/*
diff --git a/drivers/gpu/drm/nouveau/nouveau_svm.c b/drivers/gpu/drm/nouveau/nouveau_svm.c
index 39c731a99937c6..e3797b2d4d1759 100644
--- a/drivers/gpu/drm/nouveau/nouveau_svm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_svm.c
@@ -540,7 +540,7 @@ static int nouveau_range_fault(struct nouveau_svmm *svmm,
 		range.default_flags = 0;
 		range.pfn_flags_mask = -1UL;
 		down_read(&mm->mmap_sem);
-		ret = hmm_range_fault(&range, 0);
+		ret = hmm_range_fault(&range);
 		up_read(&mm->mmap_sem);
 		if (ret <= 0) {
 			if (ret == 0 || ret == -EBUSY)
diff --git a/include/linux/hmm.h b/include/linux/hmm.h
index daee6508a3f609..7475051100c782 100644
--- a/include/linux/hmm.h
+++ b/include/linux/hmm.h
@@ -117,13 +117,10 @@ static inline struct page *hmm_device_entry_to_page(const struct hmm_range *rang
 	return pfn_to_page(entry >> range->pfn_shift);
 }
 
-/* Don't fault in missing PTEs, just snapshot the current state. */
-#define HMM_FAULT_SNAPSHOT		(1 << 1)
-
 /*
  * Please see Documentation/vm/hmm.rst for how to use the range API.
  */
-long hmm_range_fault(struct hmm_range *range, unsigned int flags);
+long hmm_range_fault(struct hmm_range *range);
 
 /*
  * HMM_RANGE_DEFAULT_TIMEOUT - default timeout (ms) when waiting for a range
diff --git a/mm/hmm.c b/mm/hmm.c
index c298c936469bbb..43d107a4d9dec6 100644
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -29,7 +29,6 @@
 struct hmm_vma_walk {
 	struct hmm_range	*range;
 	unsigned long		last;
-	unsigned int		flags;
 };
 
 enum {
@@ -112,9 +111,6 @@ static unsigned int hmm_pte_need_fault(const struct hmm_vma_walk *hmm_vma_walk,
 {
 	struct hmm_range *range = hmm_vma_walk->range;
 
-	if (hmm_vma_walk->flags & HMM_FAULT_SNAPSHOT)
-		return 0;
-
 	/*
 	 * So we not only consider the individual per page request we also
 	 * consider the default flags requested for the range. The API can
@@ -142,15 +138,27 @@ static unsigned int hmm_pte_need_fault(const struct hmm_vma_walk *hmm_vma_walk,
 	return 0;
 }
 
+/*
+ * If the valid flag is masked off, and default_flags doesn't set valid, then
+ * hmm_pte_need_fault() always returns 0.
+ */
+static bool hmm_can_fault(struct hmm_range *range)
+{
+	return ((range->flags[HMM_PFN_VALID] & range->pfn_flags_mask) |
+		range->default_flags) &
+	       range->flags[HMM_PFN_VALID];
+}
+
 static unsigned int
 hmm_range_need_fault(const struct hmm_vma_walk *hmm_vma_walk,
 		     const uint64_t *pfns, unsigned long npages,
 		     uint64_t cpu_flags)
 {
+	struct hmm_range *range = hmm_vma_walk->range;
 	unsigned int required_fault = 0;
 	unsigned long i;
 
-	if (hmm_vma_walk->flags & HMM_FAULT_SNAPSHOT)
+	if (!hmm_can_fault(range))
 		return 0;
 
 	for (i = 0; i < npages; ++i) {
@@ -557,7 +565,6 @@ static const struct mm_walk_ops hmm_walk_ops = {
 /**
  * hmm_range_fault - try to fault some address in a virtual address range
  * @range:	argument structure
- * @flags:	HMM_FAULT_* flags
  *
  * Return: the number of valid pages in range->pfns[] (from range start
  * address), which may be zero.  On error one of the following status codes
@@ -579,12 +586,11 @@ static const struct mm_walk_ops hmm_walk_ops = {
  * On error, for one virtual address in the range, the function will mark the
  * corresponding HMM pfn entry with an error flag.
  */
-long hmm_range_fault(struct hmm_range *range, unsigned int flags)
+long hmm_range_fault(struct hmm_range *range)
 {
 	struct hmm_vma_walk hmm_vma_walk = {
 		.range = range,
 		.last = range->start,
-		.flags = flags,
 	};
 	struct mm_struct *mm = range->notifier->mm;
 	int ret;
-- 
2.25.2


* [PATCH v2 hmm 5/9] mm/hmm: remove the CONFIG_TRANSPARENT_HUGEPAGE #ifdef
  2020-03-24  1:14 [PATCH v2 hmm 0/9] Small hmm_range_fault() cleanups Jason Gunthorpe
                   ` (3 preceding siblings ...)
  2020-03-24  1:14 ` [PATCH v2 hmm 4/9] mm/hmm: remove HMM_FAULT_SNAPSHOT Jason Gunthorpe
@ 2020-03-24  1:14 ` Jason Gunthorpe
  2020-03-24  7:33   ` Christoph Hellwig
  2020-03-24  1:14 ` [PATCH v2 hmm 6/9] mm/hmm: use device_private_entry_to_pfn() Jason Gunthorpe
                   ` (4 subsequent siblings)
  9 siblings, 1 reply; 22+ messages in thread
From: Jason Gunthorpe @ 2020-03-24  1:14 UTC (permalink / raw)
  To: Jerome Glisse, Ralph Campbell, Felix.Kuehling
  Cc: Philip Yang, John Hubbard, amd-gfx, linux-mm, Jason Gunthorpe,
	dri-devel, Christoph Hellwig

From: Jason Gunthorpe <jgg@mellanox.com>

This code can be compiled when CONFIG_TRANSPARENT_HUGEPAGE is off, so
remove the ifdef.

The function is only ever called under

   if (pmd_devmap(pmd) || pmd_trans_huge(pmd))

Which is statically false if !CONFIG_TRANSPARENT_HUGEPAGE, so the compiler
reliably eliminates all of this code.
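
The dead-code argument can be sketched in userspace as follows. This is only a
toy model under stated assumptions: DEMO_THP_ENABLED stands in for
CONFIG_TRANSPARENT_HUGEPAGE, and all demo_* names and the bit layout are
hypothetical. It shows a handler that is defined unconditionally but guarded
by a predicate that is a compile-time constant 0 when the feature is off.

```c
/* Toy model: set to 1 to mimic CONFIG_TRANSPARENT_HUGEPAGE=y. */
#define DEMO_THP_ENABLED 0

typedef struct { unsigned long val; } demo_pmd_t;

#if DEMO_THP_ENABLED
static inline int demo_pmd_trans_huge(demo_pmd_t pmd)
{
	return (pmd.val & 1ul) != 0;	/* hypothetical "huge" bit */
}
#else
/* With the feature off the predicate is a compile-time constant 0. */
static inline int demo_pmd_trans_huge(demo_pmd_t pmd)
{
	(void)pmd;
	return 0;
}
#endif

/* Defined unconditionally, as hmm_vma_handle_pmd() is after this patch. */
static int demo_handle_huge_pmd(demo_pmd_t pmd)
{
	return (int)(pmd.val >> 1);
}

static int demo_walk_pmd(demo_pmd_t pmd)
{
	/* Constant-false here, so the optimizer can drop the call below. */
	if (demo_pmd_trans_huge(pmd))
		return demo_handle_huge_pmd(pmd);
	return -1;	/* stand-in for continuing with the PTE-level walk */
}
```

Because the guard folds to 0, the handler still has to compile but no call to
it survives optimization, which is why the stub declaration is unnecessary.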

Reviewed-by: Ralph Campbell <rcampbell@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
---
 mm/hmm.c | 6 ------
 1 file changed, 6 deletions(-)

diff --git a/mm/hmm.c b/mm/hmm.c
index 43d107a4d9dec6..f59e59fb303e95 100644
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -198,7 +198,6 @@ static inline uint64_t pmd_to_hmm_pfn_flags(struct hmm_range *range, pmd_t pmd)
 				range->flags[HMM_PFN_VALID];
 }
 
-#ifdef CONFIG_TRANSPARENT_HUGEPAGE
 static int hmm_vma_handle_pmd(struct mm_walk *walk, unsigned long addr,
 		unsigned long end, uint64_t *pfns, pmd_t pmd)
 {
@@ -221,11 +220,6 @@ static int hmm_vma_handle_pmd(struct mm_walk *walk, unsigned long addr,
 	hmm_vma_walk->last = end;
 	return 0;
 }
-#else /* CONFIG_TRANSPARENT_HUGEPAGE */
-/* stub to allow the code below to compile */
-int hmm_vma_handle_pmd(struct mm_walk *walk, unsigned long addr,
-		unsigned long end, uint64_t *pfns, pmd_t pmd);
-#endif /* CONFIG_TRANSPARENT_HUGEPAGE */
 
 static inline bool hmm_is_device_private_entry(struct hmm_range *range,
 		swp_entry_t entry)
-- 
2.25.2


* [PATCH v2 hmm 6/9] mm/hmm: use device_private_entry_to_pfn()
  2020-03-24  1:14 [PATCH v2 hmm 0/9] Small hmm_range_fault() cleanups Jason Gunthorpe
                   ` (4 preceding siblings ...)
  2020-03-24  1:14 ` [PATCH v2 hmm 5/9] mm/hmm: remove the CONFIG_TRANSPARENT_HUGEPAGE #ifdef Jason Gunthorpe
@ 2020-03-24  1:14 ` Jason Gunthorpe
  2020-03-24  7:34   ` Christoph Hellwig
  2020-03-24  1:14 ` [PATCH v2 hmm 7/9] mm/hmm: do not unconditionally set pfns when returning EBUSY Jason Gunthorpe
                   ` (3 subsequent siblings)
  9 siblings, 1 reply; 22+ messages in thread
From: Jason Gunthorpe @ 2020-03-24  1:14 UTC (permalink / raw)
  To: Jerome Glisse, Ralph Campbell, Felix.Kuehling
  Cc: Philip Yang, John Hubbard, amd-gfx, linux-mm, Jason Gunthorpe,
	dri-devel, Christoph Hellwig

From: Jason Gunthorpe <jgg@mellanox.com>

swp_offset() should not be called directly; the wrappers are supposed to
abstract away the encoding of the device_private specific information in
the swap entry.
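
A minimal sketch of why the wrapper matters, using a made-up swap entry
encoding. The demo_* names, the 58-bit offset and the type value are
assumptions chosen for illustration and do not reflect the kernel's real
layout. Callers that go through the wrapper keep working even if the way a
device-private pfn is packed into the entry changes.

```c
#include <stdint.h>

typedef struct { uint64_t val; } demo_swp_entry_t;

/* Hypothetical layout: type in the high bits, offset in the low 58 bits. */
#define DEMO_TYPE_SHIFT			58
#define DEMO_OFFSET_MASK		((UINT64_C(1) << DEMO_TYPE_SHIFT) - 1)
#define DEMO_TYPE_DEVICE_PRIVATE	30u	/* made-up type number */

static demo_swp_entry_t demo_make_device_private_entry(uint64_t pfn)
{
	demo_swp_entry_t e = {
		.val = ((uint64_t)DEMO_TYPE_DEVICE_PRIVATE << DEMO_TYPE_SHIFT) |
		       (pfn & DEMO_OFFSET_MASK),
	};
	return e;
}

static unsigned int demo_swp_type(demo_swp_entry_t e)
{
	return (unsigned int)(e.val >> DEMO_TYPE_SHIFT);
}

/* Low-level accessor: knows the raw encoding. */
static uint64_t demo_swp_offset(demo_swp_entry_t e)
{
	return e.val & DEMO_OFFSET_MASK;
}

/* Wrapper: the only place that knows the offset happens to be the pfn. */
static uint64_t demo_device_private_entry_to_pfn(demo_swp_entry_t e)
{
	return demo_swp_offset(e);
}
```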

Reviewed-by: Ralph Campbell <rcampbell@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
---
 mm/hmm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/hmm.c b/mm/hmm.c
index f59e59fb303e95..e114110ad498a2 100644
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -266,7 +266,7 @@ static int hmm_vma_handle_pte(struct mm_walk *walk, unsigned long addr,
 		 */
 		if (hmm_is_device_private_entry(range, entry)) {
 			*pfn = hmm_device_entry_from_pfn(range,
-					    swp_offset(entry));
+				device_private_entry_to_pfn(entry));
 			*pfn |= range->flags[HMM_PFN_VALID];
 			if (is_write_device_private_entry(entry))
 				*pfn |= range->flags[HMM_PFN_WRITE];
-- 
2.25.2


* [PATCH v2 hmm 7/9] mm/hmm: do not unconditionally set pfns when returning EBUSY
  2020-03-24  1:14 [PATCH v2 hmm 0/9] Small hmm_range_fault() cleanups Jason Gunthorpe
                   ` (5 preceding siblings ...)
  2020-03-24  1:14 ` [PATCH v2 hmm 6/9] mm/hmm: use device_private_entry_to_pfn() Jason Gunthorpe
@ 2020-03-24  1:14 ` Jason Gunthorpe
  2020-03-24  7:37   ` Christoph Hellwig
  2020-03-24  1:14 ` [PATCH v2 hmm 8/9] mm/hmm: do not set pfns when returning an error code Jason Gunthorpe
                   ` (2 subsequent siblings)
  9 siblings, 1 reply; 22+ messages in thread
From: Jason Gunthorpe @ 2020-03-24  1:14 UTC (permalink / raw)
  To: Jerome Glisse, Ralph Campbell, Felix.Kuehling
  Cc: Philip Yang, John Hubbard, amd-gfx, linux-mm, Jason Gunthorpe,
	dri-devel, Christoph Hellwig

From: Jason Gunthorpe <jgg@mellanox.com>

In hmm_vma_handle_pte() and hmm_vma_walk_hugetlb_entry(), if a fault happens
then -EBUSY will be returned and the pfns input flags will have been
destroyed.

For hmm_vma_handle_pte() set HMM_PFN_NONE only on the success returns that
don't otherwise store to pfns.

For hmm_vma_walk_hugetlb_entry() all exit paths already set pfns, so
remove the redundant store.
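
The ordering issue can be modelled in a few lines of userspace C. This is a
simplified sketch, not the kernel routine: demo_handle_entry() and the
DEMO_PFN_* values are hypothetical, and the fault itself is reduced to
returning -EBUSY. The point is that the output slot doubles as the input
request, so it may only be overwritten on paths that return success.

```c
#include <errno.h>
#include <stdint.h>

#define DEMO_PFN_NONE		UINT64_C(0)
#define DEMO_PFN_REQ_VALID	(UINT64_C(1) << 0)	/* caller wants a valid page */

/*
 * Handle one entry. 'present' stands in for the PTE being populated and
 * 'cpu_pfn' for the value that would be derived from it.
 */
static int demo_handle_entry(uint64_t *pfn, int present, uint64_t cpu_pfn)
{
	uint64_t orig_pfn = *pfn;

	if (!present) {
		if (orig_pfn & DEMO_PFN_REQ_VALID)
			return -EBUSY;	/* fault path: *pfn is left untouched */
		*pfn = DEMO_PFN_NONE;	/* success path stores the result */
		return 0;
	}

	*pfn = cpu_pfn;
	return 0;
}
```

After an -EBUSY return the caller's request bits are still in place, so the
retry sees the same input it originally supplied.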

Fixes: 2aee09d8c116 ("mm/hmm: change hmm_vma_fault() to allow write fault on page basis")
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
---
 mm/hmm.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/mm/hmm.c b/mm/hmm.c
index e114110ad498a2..bf77b852f12d3a 100644
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -249,11 +249,11 @@ static int hmm_vma_handle_pte(struct mm_walk *walk, unsigned long addr,
 	pte_t pte = *ptep;
 	uint64_t orig_pfn = *pfn;
 
-	*pfn = range->values[HMM_PFN_NONE];
 	if (pte_none(pte)) {
 		required_fault = hmm_pte_need_fault(hmm_vma_walk, orig_pfn, 0);
 		if (required_fault)
 			goto fault;
+		*pfn = range->values[HMM_PFN_NONE];
 		return 0;
 	}
 
@@ -274,8 +274,10 @@ static int hmm_vma_handle_pte(struct mm_walk *walk, unsigned long addr,
 		}
 
 		required_fault = hmm_pte_need_fault(hmm_vma_walk, orig_pfn, 0);
-		if (!required_fault)
+		if (!required_fault) {
+			*pfn = range->values[HMM_PFN_NONE];
 			return 0;
+		}
 
 		if (!non_swap_entry(entry))
 			goto fault;
@@ -493,7 +495,6 @@ static int hmm_vma_walk_hugetlb_entry(pte_t *pte, unsigned long hmask,
 
 	i = (start - range->start) >> PAGE_SHIFT;
 	orig_pfn = range->pfns[i];
-	range->pfns[i] = range->values[HMM_PFN_NONE];
 	cpu_flags = pte_to_hmm_pfn_flags(range, entry);
 	required_fault = hmm_pte_need_fault(hmm_vma_walk, orig_pfn, cpu_flags);
 	if (required_fault) {
-- 
2.25.2


* [PATCH v2 hmm 8/9] mm/hmm: do not set pfns when returning an error code
  2020-03-24  1:14 [PATCH v2 hmm 0/9] Small hmm_range_fault() cleanups Jason Gunthorpe
                   ` (6 preceding siblings ...)
  2020-03-24  1:14 ` [PATCH v2 hmm 7/9] mm/hmm: do not unconditionally set pfns when returning EBUSY Jason Gunthorpe
@ 2020-03-24  1:14 ` Jason Gunthorpe
  2020-03-24  7:38   ` Christoph Hellwig
  2020-03-24  1:14 ` [PATCH v2 hmm 9/9] mm/hmm: return error for non-vma snapshots Jason Gunthorpe
  2020-03-26 21:21 ` [PATCH v2 hmm 0/9] Small hmm_range_fault() cleanups Ralph Campbell
  9 siblings, 1 reply; 22+ messages in thread
From: Jason Gunthorpe @ 2020-03-24  1:14 UTC (permalink / raw)
  To: Jerome Glisse, Ralph Campbell, Felix.Kuehling
  Cc: Philip Yang, John Hubbard, amd-gfx, linux-mm, Jason Gunthorpe,
	dri-devel, Christoph Hellwig

From: Jason Gunthorpe <jgg@mellanox.com>

Most places that return an error code, like -EFAULT, do not set
HMM_PFN_ERROR; only two places do this.

Resolve this inconsistency by never setting the pfns on an error
exit. This doesn't seem like a worthwhile thing to do anyhow.

If for some reason it becomes important, it makes more sense to directly
return the address of the failing page rather than have the caller scan
for the HMM_PFN_ERROR.

No caller inspects the pfns output array if hmm_range_fault() fails.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
---
 mm/hmm.c | 18 +++---------------
 1 file changed, 3 insertions(+), 15 deletions(-)

diff --git a/mm/hmm.c b/mm/hmm.c
index bf77b852f12d3a..14c33e1225866c 100644
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -77,17 +77,14 @@ static int hmm_vma_fault(unsigned long addr, unsigned long end,
 			 unsigned int required_fault, struct mm_walk *walk)
 {
 	struct hmm_vma_walk *hmm_vma_walk = walk->private;
-	struct hmm_range *range = hmm_vma_walk->range;
 	struct vm_area_struct *vma = walk->vma;
-	uint64_t *pfns = range->pfns;
-	unsigned long i = (addr - range->start) >> PAGE_SHIFT;
 	unsigned int fault_flags = FAULT_FLAG_REMOTE;
 
 	WARN_ON_ONCE(!required_fault);
 	hmm_vma_walk->last = addr;
 
 	if (!vma)
-		goto out_error;
+		return -EFAULT;
 
 	if ((required_fault & HMM_NEED_WRITE_FAULT) == HMM_NEED_WRITE_FAULT) {
 		if (!(vma->vm_flags & VM_WRITE))
@@ -95,15 +92,10 @@ static int hmm_vma_fault(unsigned long addr, unsigned long end,
 		fault_flags |= FAULT_FLAG_WRITE;
 	}
 
-	for (; addr < end; addr += PAGE_SIZE, i++)
+	for (; addr < end; addr += PAGE_SIZE)
 		if (handle_mm_fault(vma, addr, fault_flags) & VM_FAULT_ERROR)
-			goto out_error;
-
+			return -EFAULT;
 	return -EBUSY;
-
-out_error:
-	pfns[i] = range->values[HMM_PFN_ERROR];
-	return -EFAULT;
 }
 
 static unsigned int hmm_pte_need_fault(const struct hmm_vma_walk *hmm_vma_walk,
@@ -291,7 +283,6 @@ static int hmm_vma_handle_pte(struct mm_walk *walk, unsigned long addr,
 
 		/* Report error for everything else */
 		pte_unmap(ptep);
-		*pfn = range->values[HMM_PFN_ERROR];
 		return -EFAULT;
 	}
 
@@ -577,9 +568,6 @@ static const struct mm_walk_ops hmm_walk_ops = {
  *
  * This is similar to get_user_pages(), except that it can read the page tables
  * without mutating them (ie causing faults).
- *
- * On error, for one virtual address in the range, the function will mark the
- * corresponding HMM pfn entry with an error flag.
  */
 long hmm_range_fault(struct hmm_range *range)
 {
-- 
2.25.2


* [PATCH v2 hmm 9/9] mm/hmm: return error for non-vma snapshots
  2020-03-24  1:14 [PATCH v2 hmm 0/9] Small hmm_range_fault() cleanups Jason Gunthorpe
                   ` (7 preceding siblings ...)
  2020-03-24  1:14 ` [PATCH v2 hmm 8/9] mm/hmm: do not set pfns when returning an error code Jason Gunthorpe
@ 2020-03-24  1:14 ` Jason Gunthorpe
  2020-03-24  7:45   ` Christoph Hellwig
  2020-03-26 21:21 ` [PATCH v2 hmm 0/9] Small hmm_range_fault() cleanups Ralph Campbell
  9 siblings, 1 reply; 22+ messages in thread
From: Jason Gunthorpe @ 2020-03-24  1:14 UTC (permalink / raw)
  To: Jerome Glisse, Ralph Campbell, Felix.Kuehling
  Cc: Philip Yang, John Hubbard, amd-gfx, linux-mm, Jason Gunthorpe,
	dri-devel, Christoph Hellwig

From: Jason Gunthorpe <jgg@mellanox.com>

The pagewalker does not call most ops with a NULL vma; those are all routed
to pte_hole instead.

Thus hmm_vma_fault() is only called with a NULL vma from
hmm_vma_walk_hole(), so hoist the check to there.

Now it is clear that snapshotting with no vma is a HMM_PFN_ERROR as
without a vma we have no path to call hmm_vma_fault().
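
The resulting decision table for a hole can be sketched in userspace like
this. All demo_* names are hypothetical, the fault is reduced to returning
-EBUSY as hmm_vma_fault() does on success, and filling with a "none" value in
the final case is an assumption about the code that follows the hunk.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_PFN_NONE	UINT64_C(0)
#define DEMO_PFN_ERROR	(UINT64_C(1) << 63)

struct demo_vma { int unused; };

static int demo_pfns_fill(uint64_t *pfns, size_t npages, uint64_t value)
{
	for (size_t i = 0; i < npages; i++)
		pfns[i] = value;
	return 0;
}

static int demo_walk_hole(const struct demo_vma *vma, uint64_t *pfns,
			  size_t npages, unsigned int required_fault)
{
	if (!vma) {
		/* No vma: faulting is impossible, so a required fault fails. */
		if (required_fault)
			return -EFAULT;
		/* A pure snapshot of unmapped address space reports an error. */
		return demo_pfns_fill(pfns, npages, DEMO_PFN_ERROR);
	}
	if (required_fault)
		return -EBUSY;	/* stand-in for calling hmm_vma_fault() */
	/* Assumed: an unpopulated hole inside a vma is reported as "none". */
	return demo_pfns_fill(pfns, npages, DEMO_PFN_NONE);
}
```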

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
---
 mm/hmm.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/mm/hmm.c b/mm/hmm.c
index 14c33e1225866c..df0574061b37d3 100644
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -83,9 +83,6 @@ static int hmm_vma_fault(unsigned long addr, unsigned long end,
 	WARN_ON_ONCE(!required_fault);
 	hmm_vma_walk->last = addr;
 
-	if (!vma)
-		return -EFAULT;
-
 	if ((required_fault & HMM_NEED_WRITE_FAULT) == HMM_NEED_WRITE_FAULT) {
 		if (!(vma->vm_flags & VM_WRITE))
 			return -EPERM;
@@ -175,6 +172,11 @@ static int hmm_vma_walk_hole(unsigned long addr, unsigned long end,
 	npages = (end - addr) >> PAGE_SHIFT;
 	pfns = &range->pfns[i];
 	required_fault = hmm_range_need_fault(hmm_vma_walk, pfns, npages, 0);
+	if (!walk->vma) {
+		if (required_fault)
+			return -EFAULT;
+		return hmm_pfns_fill(addr, end, range, HMM_PFN_ERROR);
+	}
 	if (required_fault)
 		return hmm_vma_fault(addr, end, required_fault, walk);
 	hmm_vma_walk->last = addr;
-- 
2.25.2


* Re: [PATCH v2 hmm 2/9] mm/hmm: return the fault type from hmm_pte_need_fault()
  2020-03-24  1:14 ` [PATCH v2 hmm 2/9] mm/hmm: return the fault type from hmm_pte_need_fault() Jason Gunthorpe
@ 2020-03-24  7:27   ` Christoph Hellwig
  2020-03-24 18:58     ` Jason Gunthorpe
  0 siblings, 1 reply; 22+ messages in thread
From: Christoph Hellwig @ 2020-03-24  7:27 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Philip Yang, Ralph Campbell, John Hubbard, Felix.Kuehling,
	dri-devel, linux-mm, Jerome Glisse, Jason Gunthorpe, amd-gfx,
	Christoph Hellwig

On Mon, Mar 23, 2020 at 10:14:50PM -0300, Jason Gunthorpe wrote:
> +enum {
> +	HMM_NEED_FAULT = 1 << 0,
> +	HMM_NEED_WRITE_FAULT = HMM_NEED_FAULT | (1 << 1),
> +	HMM_NEED_ALL_BITS = HMM_NEED_FAULT | HMM_NEED_WRITE_FAULT,

I have to say I find the compound version of HMM_NEED_WRITE_FAULT
way harder to understand than the logic in the previous version,
and would prefer keeping separate bits here.

Mostly because of statements like this:

> +	if ((required_fault & HMM_NEED_WRITE_FAULT) == HMM_NEED_WRITE_FAULT) {

which seems rather weird.

* Re: [PATCH v2 hmm 3/9] mm/hmm: remove unused code and tidy comments
  2020-03-24  1:14 ` [PATCH v2 hmm 3/9] mm/hmm: remove unused code and tidy comments Jason Gunthorpe
@ 2020-03-24  7:27   ` Christoph Hellwig
  0 siblings, 0 replies; 22+ messages in thread
From: Christoph Hellwig @ 2020-03-24  7:27 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Philip Yang, Ralph Campbell, John Hubbard, Felix.Kuehling,
	dri-devel, linux-mm, Jerome Glisse, Jason Gunthorpe, amd-gfx,
	Christoph Hellwig

On Mon, Mar 23, 2020 at 10:14:51PM -0300, Jason Gunthorpe wrote:
> From: Jason Gunthorpe <jgg@mellanox.com>
> 
> Delete several functions that are never called, fix some desync between
> comments and structure content, toss the now out of date top of file
> header, and move one function only used by hmm.c into hmm.c
> 
> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

* Re: [PATCH v2 hmm 4/9] mm/hmm: remove HMM_FAULT_SNAPSHOT
  2020-03-24  1:14 ` [PATCH v2 hmm 4/9] mm/hmm: remove HMM_FAULT_SNAPSHOT Jason Gunthorpe
@ 2020-03-24  7:33   ` Christoph Hellwig
  2020-03-24 19:31     ` Jason Gunthorpe
  0 siblings, 1 reply; 22+ messages in thread
From: Christoph Hellwig @ 2020-03-24  7:33 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Philip Yang, Ralph Campbell, John Hubbard, Felix.Kuehling,
	dri-devel, linux-mm, Jerome Glisse, Jason Gunthorpe, amd-gfx,
	Christoph Hellwig

>  
> +/*
> + * If the valid flag is masked off, and default_flags doesn't set valid, then
> + * hmm_pte_need_fault() always returns 0.
> + */
> +static bool hmm_can_fault(struct hmm_range *range)
> +{
> +	return ((range->flags[HMM_PFN_VALID] & range->pfn_flags_mask) |
> +		range->default_flags) &
> +	       range->flags[HMM_PFN_VALID];
> +}

So my idea behind the helper was to turn this into something readable :)

E.g.

/*
 * We only need to fault if either the default flags require all pages to be
 * faulted, or at least the mask allows for individual pages to be faulted.
 */
static bool hmm_can_fault(struct hmm_range *range)
{
	return ((range->default_flags | range->pfn_flags_mask) &
		range->flags[HMM_PFN_VALID]);
}

In fact, now that I managed to distill it down to this, I'm not even
sure we really need the helper, although the comment really helps.
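
The two forms of the check compute the same result, since
((V & mask) | def) & V expands to (V & mask) | (V & def), which is
(mask | def) & V. A small userspace sketch can confirm this by brute force;
the demo_* function names are hypothetical, and only 4-bit values are
enumerated because the operations act on each bit independently.

```c
#include <stdbool.h>
#include <stdint.h>

/* Form used in the patch. */
static bool demo_can_fault_patch(uint64_t valid, uint64_t mask, uint64_t def)
{
	return (((valid & mask) | def) & valid) != 0;
}

/* Simplified form suggested in the review. */
static bool demo_can_fault_simple(uint64_t valid, uint64_t mask, uint64_t def)
{
	return ((def | mask) & valid) != 0;
}

/* Compare both forms for every 4-bit input; returns the mismatch count. */
static unsigned int demo_count_mismatches(void)
{
	unsigned int mismatches = 0;

	for (uint64_t valid = 0; valid < 16; valid++)
		for (uint64_t mask = 0; mask < 16; mask++)
			for (uint64_t def = 0; def < 16; def++)
				if (demo_can_fault_patch(valid, mask, def) !=
				    demo_can_fault_simple(valid, mask, def))
					mismatches++;
	return mismatches;
}
```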

* Re: [PATCH v2 hmm 5/9] mm/hmm: remove the CONFIG_TRANSPARENT_HUGEPAGE #ifdef
  2020-03-24  1:14 ` [PATCH v2 hmm 5/9] mm/hmm: remove the CONFIG_TRANSPARENT_HUGEPAGE #ifdef Jason Gunthorpe
@ 2020-03-24  7:33   ` Christoph Hellwig
  0 siblings, 0 replies; 22+ messages in thread
From: Christoph Hellwig @ 2020-03-24  7:33 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Philip Yang, Ralph Campbell, John Hubbard, Felix.Kuehling,
	dri-devel, linux-mm, Jerome Glisse, Jason Gunthorpe, amd-gfx,
	Christoph Hellwig

On Mon, Mar 23, 2020 at 10:14:53PM -0300, Jason Gunthorpe wrote:
> From: Jason Gunthorpe <jgg@mellanox.com>
> 
> This code can be compiled when CONFIG_TRANSPARENT_HUGEPAGE is off, so
> remove the ifdef.
> 
> The function is only ever called under
> 
>    if (pmd_devmap(pmd) || pmd_trans_huge(pmd))
> 
> Which is statically false if !CONFIG_TRANSPARENT_HUGEPAGE, so the compiler
> reliably eliminates all of this code.
> 
> Reviewed-by: Ralph Campbell <rcampbell@nvidia.com>
> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

* Re: [PATCH v2 hmm 6/9] mm/hmm: use device_private_entry_to_pfn()
  2020-03-24  1:14 ` [PATCH v2 hmm 6/9] mm/hmm: use device_private_entry_to_pfn() Jason Gunthorpe
@ 2020-03-24  7:34   ` Christoph Hellwig
  0 siblings, 0 replies; 22+ messages in thread
From: Christoph Hellwig @ 2020-03-24  7:34 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Philip Yang, Ralph Campbell, John Hubbard, Felix.Kuehling,
	dri-devel, linux-mm, Jerome Glisse, Jason Gunthorpe, amd-gfx,
	Christoph Hellwig

On Mon, Mar 23, 2020 at 10:14:54PM -0300, Jason Gunthorpe wrote:
> From: Jason Gunthorpe <jgg@mellanox.com>
> 
> swp_offset() should not be called directly; the wrappers are supposed to
> abstract away the encoding of the device_private specific information in
> the swap entry.
> 
> Reviewed-by: Ralph Campbell <rcampbell@nvidia.com>
> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

* Re: [PATCH v2 hmm 7/9] mm/hmm: do not unconditionally set pfns when returning EBUSY
  2020-03-24  1:14 ` [PATCH v2 hmm 7/9] mm/hmm: do not unconditionally set pfns when returning EBUSY Jason Gunthorpe
@ 2020-03-24  7:37   ` Christoph Hellwig
  2020-03-24 15:47     ` Jason Gunthorpe
  0 siblings, 1 reply; 22+ messages in thread
From: Christoph Hellwig @ 2020-03-24  7:37 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Philip Yang, Ralph Campbell, John Hubbard, Felix.Kuehling,
	dri-devel, linux-mm, Jerome Glisse, Jason Gunthorpe, amd-gfx,
	Christoph Hellwig

On Mon, Mar 23, 2020 at 10:14:55PM -0300, Jason Gunthorpe wrote:
>  	if (pte_none(pte)) {
>  		required_fault = hmm_pte_need_fault(hmm_vma_walk, orig_pfn, 0);
>  		if (required_fault)
>  			goto fault;
> +		*pfn = range->values[HMM_PFN_NONE];
>  		return 0;
>  	}
>  
> @@ -274,8 +274,10 @@ static int hmm_vma_handle_pte(struct mm_walk *walk, unsigned long addr,
>  		}
>  
>  		required_fault = hmm_pte_need_fault(hmm_vma_walk, orig_pfn, 0);
> -		if (!required_fault)
> +		if (!required_fault) {
> +			*pfn = range->values[HMM_PFN_NONE];
>  			return 0;
> +		}

Maybe throw in a goto hole to consolidate the set PFN and return
0 cases?

Otherwise looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>

* Re: [PATCH v2 hmm 8/9] mm/hmm: do not set pfns when returning an error code
  2020-03-24  1:14 ` [PATCH v2 hmm 8/9] mm/hmm: do not set pfns when returning an error code Jason Gunthorpe
@ 2020-03-24  7:38   ` Christoph Hellwig
  0 siblings, 0 replies; 22+ messages in thread
From: Christoph Hellwig @ 2020-03-24  7:38 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Philip Yang, Ralph Campbell, John Hubbard, Felix.Kuehling,
	dri-devel, linux-mm, Jerome Glisse, Jason Gunthorpe, amd-gfx,
	Christoph Hellwig

On Mon, Mar 23, 2020 at 10:14:56PM -0300, Jason Gunthorpe wrote:
> From: Jason Gunthorpe <jgg@mellanox.com>
> 
> Most places that return an error code, like -EFAULT, do not set
> HMM_PFN_ERROR; only two places do this.
> 
> Resolve this inconsistency by never setting the pfns on an error
> exit. This doesn't seem like a worthwhile thing to do anyhow.
> 
> If for some reason it becomes important, it makes more sense to directly
> return the address of the failing page rather than have the caller scan
> for the HMM_PFN_ERROR.
> 
> No caller inspects the pfns output array if hmm_range_fault() fails.
> 
> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

* Re: [PATCH v2 hmm 9/9] mm/hmm: return error for non-vma snapshots
  2020-03-24  1:14 ` [PATCH v2 hmm 9/9] mm/hmm: return error for non-vma snapshots Jason Gunthorpe
@ 2020-03-24  7:45   ` Christoph Hellwig
  0 siblings, 0 replies; 22+ messages in thread
From: Christoph Hellwig @ 2020-03-24  7:45 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Philip Yang, Ralph Campbell, John Hubbard, Felix.Kuehling,
	dri-devel, linux-mm, Jerome Glisse, Jason Gunthorpe, amd-gfx,
	Christoph Hellwig

On Mon, Mar 23, 2020 at 10:14:57PM -0300, Jason Gunthorpe wrote:
> From: Jason Gunthorpe <jgg@mellanox.com>
> 
> The pagewalker does not call most ops with a NULL vma; those are all routed
> to pte_hole instead.

Does ->pte_hole 

> 
> Thus hmm_vma_fault() is only called with a NULL vma from
> hmm_vma_walk_hole(), so hoist the check to there.
> 
> Now it is clear that snapshotting with no vma is a HMM_PFN_ERROR as
> without a vma we have no path to call hmm_vma_fault().
> 
> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

* Re: [PATCH v2 hmm 7/9] mm/hmm: do not unconditionally set pfns when returning EBUSY
  2020-03-24  7:37   ` Christoph Hellwig
@ 2020-03-24 15:47     ` Jason Gunthorpe
  0 siblings, 0 replies; 22+ messages in thread
From: Jason Gunthorpe @ 2020-03-24 15:47 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Philip Yang, Ralph Campbell, John Hubbard, Felix.Kuehling,
	dri-devel, linux-mm, Jerome Glisse, amd-gfx

On Tue, Mar 24, 2020 at 08:37:46AM +0100, Christoph Hellwig wrote:
> On Mon, Mar 23, 2020 at 10:14:55PM -0300, Jason Gunthorpe wrote:
> >  	if (pte_none(pte)) {
> >  		required_fault = hmm_pte_need_fault(hmm_vma_walk, orig_pfn, 0);
> >  		if (required_fault)
> >  			goto fault;
> > +		*pfn = range->values[HMM_PFN_NONE];
> >  		return 0;
> >  	}
> >  
> > @@ -274,8 +274,10 @@ static int hmm_vma_handle_pte(struct mm_walk *walk, unsigned long addr,
> >  		}
> >  
> >  		required_fault = hmm_pte_need_fault(hmm_vma_walk, orig_pfn, 0);
> > -		if (!required_fault)
> > +		if (!required_fault) {
> > +			*pfn = range->values[HMM_PFN_NONE];
> >  			return 0;
> > +		}
> 
> Maybe throw in a goto hole to consolidate the set PFN and return
> 0 cases?

Then we have goto fault and goto none both ending in returns. I
generally prefer the goto labels to have a single return.

The pte_unmap() before faulting makes this routine twisty and I
haven't thought of a good way to untwist it.
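
To make the trade-off discussed above concrete, here is a minimal
sketch of the two shapes. It is a toy model, not the kernel routine:
the names toy_handle_pte(), toy_handle_pte_goto_none(), TOY_PFN_NONE,
TOY_PFN_VALID and TOY_FAULT are hypothetical stand-ins, and the fault
path is reduced to returning a placeholder code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; not the kernel's definitions. */
enum { TOY_PFN_NONE = 0, TOY_PFN_VALID = 1 };
enum { TOY_FAULT = -1 };

/*
 * Shape kept in the patch: each non-fault exit sets *pfn and returns
 * inline, so "fault" is the only label and owns the only labelled return.
 */
static int toy_handle_pte(int pte_present, int need_fault, uint64_t *pfn)
{
	if (!pte_present) {
		if (need_fault)
			goto fault;
		*pfn = TOY_PFN_NONE;
		return 0;
	}
	*pfn = TOY_PFN_VALID;
	return 0;

fault:
	/* The real routine unmaps the PTE here before faulting. */
	return TOY_FAULT;
}

/*
 * Suggested alternative: a second label consolidates the
 * "set NONE and return 0" exits, at the cost of two labels that
 * each end in their own return.
 */
static int toy_handle_pte_goto_none(int pte_present, int need_fault,
				    uint64_t *pfn)
{
	if (!pte_present) {
		if (need_fault)
			goto fault;
		goto none;
	}
	*pfn = TOY_PFN_VALID;
	return 0;

none:
	*pfn = TOY_PFN_NONE;
	return 0;

fault:
	return TOY_FAULT;
}
```

Both variants behave identically for every input; the difference is
only in how many labels carry a return. In both, *pfn is left
untouched on the fault path.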

Thanks,
Jason

* Re: [PATCH v2 hmm 2/9] mm/hmm: return the fault type from hmm_pte_need_fault()
  2020-03-24  7:27   ` Christoph Hellwig
@ 2020-03-24 18:58     ` Jason Gunthorpe
  0 siblings, 0 replies; 22+ messages in thread
From: Jason Gunthorpe @ 2020-03-24 18:58 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Philip Yang, Ralph Campbell, John Hubbard, Felix.Kuehling,
	dri-devel, linux-mm, Jerome Glisse, amd-gfx

On Tue, Mar 24, 2020 at 08:27:12AM +0100, Christoph Hellwig wrote:
> On Mon, Mar 23, 2020 at 10:14:50PM -0300, Jason Gunthorpe wrote:
> > +enum {
> > +	HMM_NEED_FAULT = 1 << 0,
> > +	HMM_NEED_WRITE_FAULT = HMM_NEED_FAULT | (1 << 1),
> > +	HMM_NEED_ALL_BITS = HMM_NEED_FAULT | HMM_NEED_WRITE_FAULT,
> 
> I have to say I find the compound version of HMM_NEED_WRITE_FAULT
> way harder to understand than the logic in the previous version,
> and would prefer keeping separate bits here.
> 
> Mostly because of statements like this:
> 
> > +	if ((required_fault & HMM_NEED_WRITE_FAULT) == HMM_NEED_WRITE_FAULT) {
> 
> which seems rather weird.

Okay, I checked it over, and there is one weird statement above but
only one place that |'s them together, so it is overall simpler to
split the enum.

I'll keep the HMM_NEED_ALL_BITS, I think that purpose is clear enough.
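
For readers following along, a small sketch of why the compound
encoding forces the "== HMM_NEED_WRITE_FAULT" comparison and what a
split encoding could look like. The COMPOUND_ and SPLIT_ prefixes and
the helper names are hypothetical, and the split values are only an
assumed shape, not necessarily what v3 will contain.

```c
#include <assert.h>
#include <stdbool.h>

/* Compound encoding as quoted above: WRITE_FAULT includes the FAULT bit. */
enum {
	COMPOUND_NEED_FAULT = 1 << 0,
	COMPOUND_NEED_WRITE_FAULT = COMPOUND_NEED_FAULT | (1 << 1),
};

/* Assumed split encoding: one bit per meaning. */
enum {
	SPLIT_NEED_FAULT = 1 << 0,
	SPLIT_NEED_WRITE_FAULT = 1 << 1,
	SPLIT_NEED_ALL_BITS = SPLIT_NEED_FAULT | SPLIT_NEED_WRITE_FAULT,
};

/*
 * With the compound encoding a plain "&" test is not enough: a read-only
 * fault shares bit 0 with COMPOUND_NEED_WRITE_FAULT, so the full mask
 * has to be compared.
 */
static bool compound_wants_write(unsigned int required_fault)
{
	return (required_fault & COMPOUND_NEED_WRITE_FAULT) ==
	       COMPOUND_NEED_WRITE_FAULT;
}

/* With separate bits the usual single-bit test works. */
static bool split_wants_write(unsigned int required_fault)
{
	return (required_fault & SPLIT_NEED_WRITE_FAULT) != 0;
}
```

With the split form, the one caller that needs both meanings ORs the
two bits together explicitly.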

Thanks,
Jason

* Re: [PATCH v2 hmm 4/9] mm/hmm: remove HMM_FAULT_SNAPSHOT
  2020-03-24  7:33   ` Christoph Hellwig
@ 2020-03-24 19:31     ` Jason Gunthorpe
  0 siblings, 0 replies; 22+ messages in thread
From: Jason Gunthorpe @ 2020-03-24 19:31 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Philip Yang, Ralph Campbell, John Hubbard, Felix.Kuehling,
	dri-devel, linux-mm, Jerome Glisse, amd-gfx

On Tue, Mar 24, 2020 at 08:33:39AM +0100, Christoph Hellwig wrote:
> >  
> > +/*
> > + * If the valid flag is masked off, and default_flags doesn't set valid, then
> > + * hmm_pte_need_fault() always returns 0.
> > + */
> > +static bool hmm_can_fault(struct hmm_range *range)
> > +{
> > +	return ((range->flags[HMM_PFN_VALID] & range->pfn_flags_mask) |
> > +		range->default_flags) &
> > +	       range->flags[HMM_PFN_VALID];
> > +}
> 
> So my idea behind the helper was to turn this into something readable :)

Well, it does help to give the expression a name :)

> E.g.
> 
> /*
>  * We only need to fault if either the default mask requires to fault all
>  * pages, or at least the mask allows for individual pages to be faulted.
>  */
> static bool hmm_can_fault(struct hmm_range *range)
> {
> 	return ((range->default_flags | range->pfn_flags_mask) &
> 		range->flags[HMM_PFN_VALID]);
> }

Okay, I find this just as understandable and it is less cluttered. I think
the comment is good enough now.

Can we concur on this then:

 static unsigned int
 hmm_range_need_fault(const struct hmm_vma_walk *hmm_vma_walk,
 		     const uint64_t *pfns, unsigned long npages,
 		     uint64_t cpu_flags)
 {
+	struct hmm_range *range = hmm_vma_walk->range;
 	unsigned int required_fault = 0;
 	unsigned long i;
 
-	if (hmm_vma_walk->flags & HMM_FAULT_SNAPSHOT)
+	/*
+	 * If the default flags do not request to fault pages, and the mask does
+	 * not allow for individual pages to be faulted, then
+	 * hmm_pte_need_fault() will always return 0.
+	 */
+	if (!((range->default_flags | range->pfn_flags_mask) &
+	      range->flags[HMM_PFN_VALID]))
 		return 0;

I think everything else is sorted now, so if yes I'll send this as v3.
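
As a sanity check on the simplification above: the two forms of the
test agree because AND distributes over OR, so
((V & M) | D) & V == (V & M) | (D & V) == (D | M) & V. The sketch
below compares them exhaustively over 4-bit values, which is enough
because the identity holds bit by bit. The function names are
hypothetical; only the expressions come from the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* First form from the thread: ((valid & mask) | default_flags) & valid */
static bool can_fault_v2(uint64_t valid, uint64_t mask, uint64_t default_flags)
{
	return (((valid & mask) | default_flags) & valid) != 0;
}

/* Simplified form: (default_flags | mask) & valid */
static bool can_fault_simplified(uint64_t valid, uint64_t mask,
				 uint64_t default_flags)
{
	return ((default_flags | mask) & valid) != 0;
}

/* Compare both forms over all 4-bit inputs; returns the mismatch count. */
static unsigned int count_mismatches(void)
{
	unsigned int mismatches = 0;

	for (uint64_t v = 0; v < 16; v++)
		for (uint64_t m = 0; m < 16; m++)
			for (uint64_t d = 0; d < 16; d++)
				if (can_fault_v2(v, m, d) !=
				    can_fault_simplified(v, m, d))
					mismatches++;
	return mismatches;
}
```

When neither the mask nor the default flags cover the valid bit, both
forms evaluate to false, which is the case where the walk can skip
the per-page fault checks.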

Thanks,
Jason

* Re: [PATCH v2 hmm 0/9] Small hmm_range_fault() cleanups
  2020-03-24  1:14 [PATCH v2 hmm 0/9] Small hmm_range_fault() cleanups Jason Gunthorpe
                   ` (8 preceding siblings ...)
  2020-03-24  1:14 ` [PATCH v2 hmm 9/9] mm/hmm: return error for non-vma snapshots Jason Gunthorpe
@ 2020-03-26 21:21 ` Ralph Campbell
  9 siblings, 0 replies; 22+ messages in thread
From: Ralph Campbell @ 2020-03-26 21:21 UTC (permalink / raw)
  To: Jason Gunthorpe, Jerome Glisse, Felix.Kuehling
  Cc: Philip Yang, John Hubbard, amd-gfx, linux-mm, Jason Gunthorpe,
	dri-devel, Christoph Hellwig


On 3/23/20 6:14 PM, Jason Gunthorpe wrote:
> From: Jason Gunthorpe <jgg@mellanox.com>
> 
> This is v2 of the first simple series with a few additional patches of little
> adjustments.
> 
> This needs an additional patch to the hmm tester:
> 
> diff --git a/tools/testing/selftests/vm/hmm-tests.c b/tools/testing/selftests/vm/hmm-tests.c
> index 033a12c7ab5b6d..da15471a2bbf9a 100644
> --- a/tools/testing/selftests/vm/hmm-tests.c
> +++ b/tools/testing/selftests/vm/hmm-tests.c
> @@ -1274,7 +1274,7 @@ TEST_F(hmm2, snapshot)
>   	/* Check what the device saw. */
>   	m = buffer->mirror;
>   	ASSERT_EQ(m[0], HMM_DMIRROR_PROT_ERROR);
> -	ASSERT_EQ(m[1], HMM_DMIRROR_PROT_NONE);
> +	ASSERT_EQ(m[1], HMM_DMIRROR_PROT_ERROR);
>   	ASSERT_EQ(m[2], HMM_DMIRROR_PROT_ZERO | HMM_DMIRROR_PROT_READ);
>   	ASSERT_EQ(m[3], HMM_DMIRROR_PROT_READ);
>   	ASSERT_EQ(m[4], HMM_DMIRROR_PROT_WRITE);
> 
> v2 changes:
>   - Simplify and rename the flags, rework hmm_vma_walk_test in patch 2 (CH)
>   - Adjust more comments in patch 3 (CH, Ralph)
>   - Put the ugly boolean logic into a function in patch 3 (CH)
>   - Update commit message of patch 4 (CH)
>   - Adjust formatting in patch 5 (CH)
>   Patches 6, 7, 8 are new
> 
> v1: https://lore.kernel.org/r/20200320164905.21722-1-jgg@ziepe.ca
> 
> Jason Gunthorpe (9):
>    mm/hmm: remove pgmap checking for devmap pages
>    mm/hmm: return the fault type from hmm_pte_need_fault()
>    mm/hmm: remove unused code and tidy comments
>    mm/hmm: remove HMM_FAULT_SNAPSHOT
>    mm/hmm: remove the CONFIG_TRANSPARENT_HUGEPAGE #ifdef
>    mm/hmm: use device_private_entry_to_pfn()
>    mm/hmm: do not unconditionally set pfns when returning EBUSY
>    mm/hmm: do not set pfns when returning an error code
>    mm/hmm: return error for non-vma snapshots
> 
>   Documentation/vm/hmm.rst                |  12 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c |   2 +-
>   drivers/gpu/drm/nouveau/nouveau_svm.c   |   2 +-
>   include/linux/hmm.h                     | 109 +--------
>   mm/hmm.c                                | 312 ++++++++++--------------
>   5 files changed, 133 insertions(+), 304 deletions(-)
> 

I was able to recompile Karol Herbst's mesa tree and Jerome's SVM tests to
test this with nouveau, so for the series you can add:
Tested-by: Ralph Campbell <rcampbell@nvidia.com>

Thread overview: 22+ messages
2020-03-24  1:14 [PATCH v2 hmm 0/9] Small hmm_range_fault() cleanups Jason Gunthorpe
2020-03-24  1:14 ` [PATCH v2 hmm 1/9] mm/hmm: remove pgmap checking for devmap pages Jason Gunthorpe
2020-03-24  1:14 ` [PATCH v2 hmm 2/9] mm/hmm: return the fault type from hmm_pte_need_fault() Jason Gunthorpe
2020-03-24  7:27   ` Christoph Hellwig
2020-03-24 18:58     ` Jason Gunthorpe
2020-03-24  1:14 ` [PATCH v2 hmm 3/9] mm/hmm: remove unused code and tidy comments Jason Gunthorpe
2020-03-24  7:27   ` Christoph Hellwig
2020-03-24  1:14 ` [PATCH v2 hmm 4/9] mm/hmm: remove HMM_FAULT_SNAPSHOT Jason Gunthorpe
2020-03-24  7:33   ` Christoph Hellwig
2020-03-24 19:31     ` Jason Gunthorpe
2020-03-24  1:14 ` [PATCH v2 hmm 5/9] mm/hmm: remove the CONFIG_TRANSPARENT_HUGEPAGE #ifdef Jason Gunthorpe
2020-03-24  7:33   ` Christoph Hellwig
2020-03-24  1:14 ` [PATCH v2 hmm 6/9] mm/hmm: use device_private_entry_to_pfn() Jason Gunthorpe
2020-03-24  7:34   ` Christoph Hellwig
2020-03-24  1:14 ` [PATCH v2 hmm 7/9] mm/hmm: do not unconditionally set pfns when returning EBUSY Jason Gunthorpe
2020-03-24  7:37   ` Christoph Hellwig
2020-03-24 15:47     ` Jason Gunthorpe
2020-03-24  1:14 ` [PATCH v2 hmm 8/9] mm/hmm: do not set pfns when returning an error code Jason Gunthorpe
2020-03-24  7:38   ` Christoph Hellwig
2020-03-24  1:14 ` [PATCH v2 hmm 9/9] mm/hmm: return error for non-vma snapshots Jason Gunthorpe
2020-03-24  7:45   ` Christoph Hellwig
2020-03-26 21:21 ` [PATCH v2 hmm 0/9] Small hmm_range_fault() cleanups Ralph Campbell
