linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* turn the hmm migrate_vma upside down
@ 2019-07-29 14:28 Christoph Hellwig
  2019-07-29 14:28 ` [PATCH 1/9] mm: turn " Christoph Hellwig
                   ` (9 more replies)
  0 siblings, 10 replies; 30+ messages in thread
From: Christoph Hellwig @ 2019-07-29 14:28 UTC (permalink / raw)
  To: Jérôme Glisse, Jason Gunthorpe, Ben Skeggs
  Cc: Ralph Campbell, Bharata B Rao, Andrew Morton, linux-mm, nouveau,
	dri-devel, linux-kernel

Hi Jérôme, Ben and Jason,

below is a series against the hmm tree which starts revamping the
migrate_vma functionality.  The prime idea is to export three slightly
lower level functions and thus avoid the need for migrate_vma_ops
callbacks.

Diffstat:

    4 files changed, 285 insertions(+), 602 deletions(-)

A git tree is also available at:

    git://git.infradead.org/users/hch/misc.git migrate_vma-cleanup

Gitweb:

    http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/migrate_vma-cleanup

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH 1/9] mm: turn migrate_vma upside down
  2019-07-29 14:28 turn the hmm migrate_vma upside down Christoph Hellwig
@ 2019-07-29 14:28 ` Christoph Hellwig
  2019-07-29 23:12   ` Ralph Campbell
                     ` (2 more replies)
  2019-07-29 14:28 ` [PATCH 2/9] nouveau: reset dma_nr in nouveau_dmem_migrate_alloc_and_copy Christoph Hellwig
                   ` (8 subsequent siblings)
  9 siblings, 3 replies; 30+ messages in thread
From: Christoph Hellwig @ 2019-07-29 14:28 UTC (permalink / raw)
  To: Jérôme Glisse, Jason Gunthorpe, Ben Skeggs
  Cc: Ralph Campbell, Bharata B Rao, Andrew Morton, linux-mm, nouveau,
	dri-devel, linux-kernel

There isn't any good reason to pass callbacks to migrate_vma.  Instead
we can just export the three steps done by this function to drivers and
let them sequence the operation without callbacks.  This removes a lot
of boilerplate code as-is, and will allow the drivers to drastically
improve code flow and error handling further on.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 Documentation/vm/hmm.rst               |  55 +-----
 drivers/gpu/drm/nouveau/nouveau_dmem.c | 122 +++++++------
 include/linux/migrate.h                | 118 ++----------
 mm/migrate.c                           | 242 +++++++++++--------------
 4 files changed, 193 insertions(+), 344 deletions(-)

diff --git a/Documentation/vm/hmm.rst b/Documentation/vm/hmm.rst
index ddcb5ca8b296..ad880e3996b1 100644
--- a/Documentation/vm/hmm.rst
+++ b/Documentation/vm/hmm.rst
@@ -339,58 +339,9 @@ Migration to and from device memory
 ===================================
 
 Because the CPU cannot access device memory, migration must use the device DMA
-engine to perform copy from and to device memory. For this we need a new
-migration helper::
-
- int migrate_vma(const struct migrate_vma_ops *ops,
-                 struct vm_area_struct *vma,
-                 unsigned long mentries,
-                 unsigned long start,
-                 unsigned long end,
-                 unsigned long *src,
-                 unsigned long *dst,
-                 void *private);
-
-Unlike other migration functions it works on a range of virtual address, there
-are two reasons for that. First, device DMA copy has a high setup overhead cost
-and thus batching multiple pages is needed as otherwise the migration overhead
-makes the whole exercise pointless. The second reason is because the
-migration might be for a range of addresses the device is actively accessing.
-
-The migrate_vma_ops struct defines two callbacks. First one (alloc_and_copy())
-controls destination memory allocation and copy operation. Second one is there
-to allow the device driver to perform cleanup operations after migration::
-
- struct migrate_vma_ops {
-     void (*alloc_and_copy)(struct vm_area_struct *vma,
-                            const unsigned long *src,
-                            unsigned long *dst,
-                            unsigned long start,
-                            unsigned long end,
-                            void *private);
-     void (*finalize_and_map)(struct vm_area_struct *vma,
-                              const unsigned long *src,
-                              const unsigned long *dst,
-                              unsigned long start,
-                              unsigned long end,
-                              void *private);
- };
-
-It is important to stress that these migration helpers allow for holes in the
-virtual address range. Some pages in the range might not be migrated for all
-the usual reasons (page is pinned, page is locked, ...). This helper does not
-fail but just skips over those pages.
-
-The alloc_and_copy() might decide to not migrate all pages in the
-range (for reasons under the callback control). For those, the callback just
-has to leave the corresponding dst entry empty.
-
-Finally, the migration of the struct page might fail (for file backed page) for
-various reasons (failure to freeze reference, or update page cache, ...). If
-that happens, then the finalize_and_map() can catch any pages that were not
-migrated. Note those pages were still copied to a new page and thus we wasted
-bandwidth but this is considered as a rare event and a price that we are
-willing to pay to keep all the code simpler.
+engine to perform copy from and to device memory. For this we need a new to
+use migrate_vma_setup(), migrate_vma_pages(), and migrate_vma_finalize()
+helpers. 
 
 
 Memory cgroup (memcg) and rss accounting
diff --git a/drivers/gpu/drm/nouveau/nouveau_dmem.c b/drivers/gpu/drm/nouveau/nouveau_dmem.c
index 345c63cb752a..38416798abd4 100644
--- a/drivers/gpu/drm/nouveau/nouveau_dmem.c
+++ b/drivers/gpu/drm/nouveau/nouveau_dmem.c
@@ -131,9 +131,8 @@ nouveau_dmem_fault_alloc_and_copy(struct vm_area_struct *vma,
 				  unsigned long *dst_pfns,
 				  unsigned long start,
 				  unsigned long end,
-				  void *private)
+				  struct nouveau_dmem_fault *fault)
 {
-	struct nouveau_dmem_fault *fault = private;
 	struct nouveau_drm *drm = fault->drm;
 	struct device *dev = drm->dev->dev;
 	unsigned long addr, i, npages = 0;
@@ -230,14 +229,9 @@ nouveau_dmem_fault_alloc_and_copy(struct vm_area_struct *vma,
 	}
 }
 
-void nouveau_dmem_fault_finalize_and_map(struct vm_area_struct *vma,
-					 const unsigned long *src_pfns,
-					 const unsigned long *dst_pfns,
-					 unsigned long start,
-					 unsigned long end,
-					 void *private)
+static void
+nouveau_dmem_fault_finalize_and_map(struct nouveau_dmem_fault *fault)
 {
-	struct nouveau_dmem_fault *fault = private;
 	struct nouveau_drm *drm = fault->drm;
 
 	if (fault->fence) {
@@ -257,29 +251,35 @@ void nouveau_dmem_fault_finalize_and_map(struct vm_area_struct *vma,
 	kfree(fault->dma);
 }
 
-static const struct migrate_vma_ops nouveau_dmem_fault_migrate_ops = {
-	.alloc_and_copy		= nouveau_dmem_fault_alloc_and_copy,
-	.finalize_and_map	= nouveau_dmem_fault_finalize_and_map,
-};
-
 static vm_fault_t nouveau_dmem_migrate_to_ram(struct vm_fault *vmf)
 {
 	struct nouveau_dmem *dmem = page_to_dmem(vmf->page);
 	unsigned long src[1] = {0}, dst[1] = {0};
+	struct migrate_vma args = {
+		.vma		= vmf->vma,
+		.start		= vmf->address,
+		.end		= vmf->address + PAGE_SIZE,
+		.src		= src,
+		.dst		= dst,
+	};
 	struct nouveau_dmem_fault fault = { .drm = dmem->drm };
-	int ret;
 
 	/*
 	 * FIXME what we really want is to find some heuristic to migrate more
 	 * than just one page on CPU fault. When such fault happens it is very
 	 * likely that more surrounding page will CPU fault too.
 	 */
-	ret = migrate_vma(&nouveau_dmem_fault_migrate_ops, vmf->vma,
-			vmf->address, vmf->address + PAGE_SIZE,
-			src, dst, &fault);
-	if (ret)
+	if (migrate_vma_setup(&args) < 0)
 		return VM_FAULT_SIGBUS;
+	if (!args.cpages)
+		return 0;
+
+	nouveau_dmem_fault_alloc_and_copy(args.vma, src, dst, args.start,
+			args.end, &fault);
+	migrate_vma_pages(&args);
+	nouveau_dmem_fault_finalize_and_map(&fault);
 
+	migrate_vma_finalize(&args);
 	if (dst[0] == MIGRATE_PFN_ERROR)
 		return VM_FAULT_SIGBUS;
 
@@ -648,9 +648,8 @@ nouveau_dmem_migrate_alloc_and_copy(struct vm_area_struct *vma,
 				    unsigned long *dst_pfns,
 				    unsigned long start,
 				    unsigned long end,
-				    void *private)
+				    struct nouveau_migrate *migrate)
 {
-	struct nouveau_migrate *migrate = private;
 	struct nouveau_drm *drm = migrate->drm;
 	struct device *dev = drm->dev->dev;
 	unsigned long addr, i, npages = 0;
@@ -747,14 +746,9 @@ nouveau_dmem_migrate_alloc_and_copy(struct vm_area_struct *vma,
 	}
 }
 
-void nouveau_dmem_migrate_finalize_and_map(struct vm_area_struct *vma,
-					   const unsigned long *src_pfns,
-					   const unsigned long *dst_pfns,
-					   unsigned long start,
-					   unsigned long end,
-					   void *private)
+static void
+nouveau_dmem_migrate_finalize_and_map(struct nouveau_migrate *migrate)
 {
-	struct nouveau_migrate *migrate = private;
 	struct nouveau_drm *drm = migrate->drm;
 
 	if (migrate->fence) {
@@ -779,10 +773,15 @@ void nouveau_dmem_migrate_finalize_and_map(struct vm_area_struct *vma,
 	 */
 }
 
-static const struct migrate_vma_ops nouveau_dmem_migrate_ops = {
-	.alloc_and_copy		= nouveau_dmem_migrate_alloc_and_copy,
-	.finalize_and_map	= nouveau_dmem_migrate_finalize_and_map,
-};
+static void nouveau_dmem_migrate_chunk(struct migrate_vma *args,
+		struct nouveau_migrate *migrate)
+{
+	nouveau_dmem_migrate_alloc_and_copy(args->vma, args->src, args->dst,
+			args->start, args->end, migrate);
+	migrate_vma_pages(args);
+	nouveau_dmem_migrate_finalize_and_map(migrate);
+	migrate_vma_finalize(args);
+}
 
 int
 nouveau_dmem_migrate_vma(struct nouveau_drm *drm,
@@ -790,40 +789,45 @@ nouveau_dmem_migrate_vma(struct nouveau_drm *drm,
 			 unsigned long start,
 			 unsigned long end)
 {
-	unsigned long *src_pfns, *dst_pfns, npages;
-	struct nouveau_migrate migrate = {0};
-	unsigned long i, c, max;
-	int ret = 0;
-
-	npages = (end - start) >> PAGE_SHIFT;
-	max = min(SG_MAX_SINGLE_ALLOC, npages);
-	src_pfns = kzalloc(sizeof(long) * max, GFP_KERNEL);
-	if (src_pfns == NULL)
-		return -ENOMEM;
-	dst_pfns = kzalloc(sizeof(long) * max, GFP_KERNEL);
-	if (dst_pfns == NULL) {
-		kfree(src_pfns);
-		return -ENOMEM;
-	}
+	unsigned long npages = (end - start) >> PAGE_SHIFT;
+	unsigned long max = min(SG_MAX_SINGLE_ALLOC, npages);
+	struct migrate_vma args = {
+		.vma		= vma,
+		.start		= start,
+	};
+	struct nouveau_migrate migrate = {
+		.drm		= drm,
+		.vma		= vma,
+		.npages		= npages,
+	};
+	unsigned long c, i;
+	int ret = -ENOMEM;
+
+	args.src = kzalloc(sizeof(long) * max, GFP_KERNEL);
+	if (!args.src)
+		goto out;
+	args.dst = kzalloc(sizeof(long) * max, GFP_KERNEL);
+	if (!args.dst)
+		goto out_free_src;
 
-	migrate.drm = drm;
-	migrate.vma = vma;
-	migrate.npages = npages;
 	for (i = 0; i < npages; i += c) {
-		unsigned long next;
-
 		c = min(SG_MAX_SINGLE_ALLOC, npages);
-		next = start + (c << PAGE_SHIFT);
-		ret = migrate_vma(&nouveau_dmem_migrate_ops, vma, start,
-				  next, src_pfns, dst_pfns, &migrate);
+		args.end = start + (c << PAGE_SHIFT);
+		ret = migrate_vma_setup(&args);
 		if (ret)
-			goto out;
-		start = next;
+			goto out_free_dst;
+
+		if (args.cpages)
+			nouveau_dmem_migrate_chunk(&args, &migrate);
+		args.start = args.end;
 	}
 
+	ret = 0;
+out_free_dst:
+	kfree(args.dst);
+out_free_src:
+	kfree(args.src);
 out:
-	kfree(dst_pfns);
-	kfree(src_pfns);
 	return ret;
 }
 
diff --git a/include/linux/migrate.h b/include/linux/migrate.h
index 7f04754c7f2b..093d67fcf6dd 100644
--- a/include/linux/migrate.h
+++ b/include/linux/migrate.h
@@ -182,107 +182,27 @@ static inline unsigned long migrate_pfn(unsigned long pfn)
 	return (pfn << MIGRATE_PFN_SHIFT) | MIGRATE_PFN_VALID;
 }
 
-/*
- * struct migrate_vma_ops - migrate operation callback
- *
- * @alloc_and_copy: alloc destination memory and copy source memory to it
- * @finalize_and_map: allow caller to map the successfully migrated pages
- *
- *
- * The alloc_and_copy() callback happens once all source pages have been locked,
- * unmapped and checked (checked whether pinned or not). All pages that can be
- * migrated will have an entry in the src array set with the pfn value of the
- * page and with the MIGRATE_PFN_VALID and MIGRATE_PFN_MIGRATE flag set (other
- * flags might be set but should be ignored by the callback).
- *
- * The alloc_and_copy() callback can then allocate destination memory and copy
- * source memory to it for all those entries (ie with MIGRATE_PFN_VALID and
- * MIGRATE_PFN_MIGRATE flag set). Once these are allocated and copied, the
- * callback must update each corresponding entry in the dst array with the pfn
- * value of the destination page and with the MIGRATE_PFN_VALID and
- * MIGRATE_PFN_LOCKED flags set (destination pages must have their struct pages
- * locked, via lock_page()).
- *
- * At this point the alloc_and_copy() callback is done and returns.
- *
- * Note that the callback does not have to migrate all the pages that are
- * marked with MIGRATE_PFN_MIGRATE flag in src array unless this is a migration
- * from device memory to system memory (ie the MIGRATE_PFN_DEVICE flag is also
- * set in the src array entry). If the device driver cannot migrate a device
- * page back to system memory, then it must set the corresponding dst array
- * entry to MIGRATE_PFN_ERROR. This will trigger a SIGBUS if CPU tries to
- * access any of the virtual addresses originally backed by this page. Because
- * a SIGBUS is such a severe result for the userspace process, the device
- * driver should avoid setting MIGRATE_PFN_ERROR unless it is really in an
- * unrecoverable state.
- *
- * For empty entry inside CPU page table (pte_none() or pmd_none() is true) we
- * do set MIGRATE_PFN_MIGRATE flag inside the corresponding source array thus
- * allowing device driver to allocate device memory for those unback virtual
- * address. For this the device driver simply have to allocate device memory
- * and properly set the destination entry like for regular migration. Note that
- * this can still fails and thus inside the device driver must check if the
- * migration was successful for those entry inside the finalize_and_map()
- * callback just like for regular migration.
- *
- * THE alloc_and_copy() CALLBACK MUST NOT CHANGE ANY OF THE SRC ARRAY ENTRIES
- * OR BAD THINGS WILL HAPPEN !
- *
- *
- * The finalize_and_map() callback happens after struct page migration from
- * source to destination (destination struct pages are the struct pages for the
- * memory allocated by the alloc_and_copy() callback).  Migration can fail, and
- * thus the finalize_and_map() allows the driver to inspect which pages were
- * successfully migrated, and which were not. Successfully migrated pages will
- * have the MIGRATE_PFN_MIGRATE flag set for their src array entry.
- *
- * It is safe to update device page table from within the finalize_and_map()
- * callback because both destination and source page are still locked, and the
- * mmap_sem is held in read mode (hence no one can unmap the range being
- * migrated).
- *
- * Once callback is done cleaning up things and updating its page table (if it
- * chose to do so, this is not an obligation) then it returns. At this point,
- * the HMM core will finish up the final steps, and the migration is complete.
- *
- * THE finalize_and_map() CALLBACK MUST NOT CHANGE ANY OF THE SRC OR DST ARRAY
- * ENTRIES OR BAD THINGS WILL HAPPEN !
- */
-struct migrate_vma_ops {
-	void (*alloc_and_copy)(struct vm_area_struct *vma,
-			       const unsigned long *src,
-			       unsigned long *dst,
-			       unsigned long start,
-			       unsigned long end,
-			       void *private);
-	void (*finalize_and_map)(struct vm_area_struct *vma,
-				 const unsigned long *src,
-				 const unsigned long *dst,
-				 unsigned long start,
-				 unsigned long end,
-				 void *private);
+struct migrate_vma {
+	struct vm_area_struct	*vma;
+ 	/*
+	 * Both src and dst array must be big enough for
+	 * (end - start) >> PAGE_SHIFT entries.
+	 *
+	 * The src array must not be modified by the caller after
+	 * migrate_vma_setup(), and must not change the dst array after
+	 * migrate_vma_pages() returns.
+	 */
+	unsigned long		*dst;
+	unsigned long		*src;
+	unsigned long		cpages;
+	unsigned long		npages;
+	unsigned long		start;
+	unsigned long		end;
 };
 
-#if defined(CONFIG_MIGRATE_VMA_HELPER)
-int migrate_vma(const struct migrate_vma_ops *ops,
-		struct vm_area_struct *vma,
-		unsigned long start,
-		unsigned long end,
-		unsigned long *src,
-		unsigned long *dst,
-		void *private);
-#else
-static inline int migrate_vma(const struct migrate_vma_ops *ops,
-			      struct vm_area_struct *vma,
-			      unsigned long start,
-			      unsigned long end,
-			      unsigned long *src,
-			      unsigned long *dst,
-			      void *private)
-{
-	return -EINVAL;
-}
-#endif /* IS_ENABLED(CONFIG_MIGRATE_VMA_HELPER) */
+int migrate_vma_setup(struct migrate_vma *args);
+void migrate_vma_pages(struct migrate_vma *migrate);
+void migrate_vma_finalize(struct migrate_vma *migrate);
 
 #endif /* CONFIG_MIGRATION */
 
diff --git a/mm/migrate.c b/mm/migrate.c
index 8992741f10aa..dc4e60a496f2 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -2118,16 +2118,6 @@ int migrate_misplaced_transhuge_page(struct mm_struct *mm,
 #endif /* CONFIG_NUMA */
 
 #if defined(CONFIG_MIGRATE_VMA_HELPER)
-struct migrate_vma {
-	struct vm_area_struct	*vma;
-	unsigned long		*dst;
-	unsigned long		*src;
-	unsigned long		cpages;
-	unsigned long		npages;
-	unsigned long		start;
-	unsigned long		end;
-};
-
 static int migrate_vma_collect_hole(unsigned long start,
 				    unsigned long end,
 				    struct mm_walk *walk)
@@ -2578,6 +2568,108 @@ static void migrate_vma_unmap(struct migrate_vma *migrate)
 	}
 }
 
+/**
+ * migrate_vma_setup() - prepare to migrate a range of memory
+ * @args: contains the vma, start, and and pfns arrays for the migration
+ *
+ * Returns: negative errno on failures, 0 when 0 or more pages were migrated
+ * without an error.
+ *
+ * Prepare to migrate a range of memory virtual address range by collecting all
+ * the pages backing each virtual address in the range, saving them inside the
+ * src array.  Then lock those pages and unmap them. Once the pages are locked
+ * and unmapped, check whether each page is pinned or not.  Pages that aren't
+ * pinned have the MIGRATE_PFN_MIGRATE flag set (by this function) in the
+ * corresponding src array entry.  Then restores any pages that are pinned, by
+ * remapping and unlocking those pages.
+ *
+ * The caller should then allocate destination memory and copy source memory to
+ * it for all those entries (ie with MIGRATE_PFN_VALID and MIGRATE_PFN_MIGRATE
+ * flag set).  Once these are allocated and copied, the caller must update each
+ * corresponding entry in the dst array with the pfn value of the destination
+ * page and with the MIGRATE_PFN_VALID and MIGRATE_PFN_LOCKED flags set
+ * (destination pages must have their struct pages locked, via lock_page()).
+ *
+ * Note that the caller does not have to migrate all the pages that are marked
+ * with MIGRATE_PFN_MIGRATE flag in src array unless this is a migration from
+ * device memory to system memory.  If the caller cannot migrate a device page
+ * back to system memory, then it must return VM_FAULT_SIGBUS, which will
+ * might have severe consequences for the userspace process, so it should best
+ * be avoided if possible.
+ *
+ * For empty entries inside CPU page table (pte_none() or pmd_none() is true) we
+ * do set MIGRATE_PFN_MIGRATE flag inside the corresponding source array thus
+ * allowing the caller to allocate device memory for those unback virtual
+ * address.  For this the caller simply havs to allocate device memory and
+ * properly set the destination entry like for regular migration.  Note that
+ * this can still fails and thus inside the device driver must check if the
+ * migration was successful for those entries after calling migrate_vma_pages()
+ * just like for regular migration.
+ *
+ * After that, the callers must call migrate_vma_pages() to go over each entry
+ * in the src array that has the MIGRATE_PFN_VALID and MIGRATE_PFN_MIGRATE flag
+ * set. If the corresponding entry in dst array has MIGRATE_PFN_VALID flag set,
+ * then migrate_vma_pages() to migrate struct page information from the source
+ * struct page to the destination struct page.  If it fails to migrate the
+ * struct page information, then it clears the MIGRATE_PFN_MIGRATE flag in the
+ * src array.
+ *
+ * At this point all successfully migrated pages have an entry in the src
+ * array with MIGRATE_PFN_VALID and MIGRATE_PFN_MIGRATE flag set and the dst
+ * array entry with MIGRATE_PFN_VALID flag set.
+ *
+ * Once migrate_vma_pages() returns the caller may inspect which pages were
+ * successfully migrated, and which were not.  Successfully migrated pages will
+ * have the MIGRATE_PFN_MIGRATE flag set for their src array entry.
+ *
+ * It is safe to update device page table from within the finalize_and_map()
+ * callback because both destination and source page are still locked, and the
+ * mmap_sem is held in read mode (hence no one can unmap the range being
+ * migrated).
+ *
+ * Once the caller is done cleaning up things and updating its page table (if it
+ * chose to do so, this is not an obligation) it finally calls
+ * migrate_vma_finalize() to update the CPU page table to point to new pages
+ * for successfully migrated pages or otherwise restore the CPU page table to
+ * point to the original source pages.
+ */
+int migrate_vma_setup(struct migrate_vma *args)
+{
+	long nr_pages = (args->end - args->start) >> PAGE_SHIFT;
+
+	args->start &= PAGE_MASK;
+	args->end &= PAGE_MASK;
+	if (!args->vma || is_vm_hugetlb_page(args->vma) ||
+	    (args->vma->vm_flags & VM_SPECIAL) || vma_is_dax(args->vma))
+		return -EINVAL;
+	if (nr_pages <= 0)
+		return -EINVAL;
+	if (args->start < args->vma->vm_start ||
+	    args->start >= args->vma->vm_end)
+		return -EINVAL;
+	if (args->end <= args->vma->vm_start || args->end > args->vma->vm_end)
+		return -EINVAL;
+	if (!args->src || !args->dst)
+		return -EINVAL;
+
+	memset(args->src, 0, sizeof(*args->src) * nr_pages);
+
+	migrate_vma_collect(args);
+	if (args->cpages)
+		migrate_vma_prepare(args);
+	if (args->cpages)
+		migrate_vma_unmap(args);
+
+	/*
+	 * At this point pages are locked and unmapped, and thus they have
+	 * stable content and can safely be copied to destination memory that
+	 * is allocated by the drivers.
+	 */
+	return 0;
+
+}
+EXPORT_SYMBOL(migrate_vma_setup);
+
 static void migrate_vma_insert_page(struct migrate_vma *migrate,
 				    unsigned long addr,
 				    struct page *page,
@@ -2709,7 +2801,7 @@ static void migrate_vma_insert_page(struct migrate_vma *migrate,
 	*src &= ~MIGRATE_PFN_MIGRATE;
 }
 
-/*
+/**
  * migrate_vma_pages() - migrate meta-data from src page to dst page
  * @migrate: migrate struct containing all migration information
  *
@@ -2717,7 +2809,7 @@ static void migrate_vma_insert_page(struct migrate_vma *migrate,
  * struct page. This effectively finishes the migration from source page to the
  * destination page.
  */
-static void migrate_vma_pages(struct migrate_vma *migrate)
+void migrate_vma_pages(struct migrate_vma *migrate)
 {
 	const unsigned long npages = migrate->npages;
 	const unsigned long start = migrate->start;
@@ -2791,8 +2883,9 @@ static void migrate_vma_pages(struct migrate_vma *migrate)
 	if (notified)
 		mmu_notifier_invalidate_range_only_end(&range);
 }
+EXPORT_SYMBOL(migrate_vma_pages);
 
-/*
+/**
  * migrate_vma_finalize() - restore CPU page table entry
  * @migrate: migrate struct containing all migration information
  *
@@ -2803,7 +2896,7 @@ static void migrate_vma_pages(struct migrate_vma *migrate)
  * This also unlocks the pages and puts them back on the lru, or drops the extra
  * refcount, for device pages.
  */
-static void migrate_vma_finalize(struct migrate_vma *migrate)
+void migrate_vma_finalize(struct migrate_vma *migrate)
 {
 	const unsigned long npages = migrate->npages;
 	unsigned long i;
@@ -2846,124 +2939,5 @@ static void migrate_vma_finalize(struct migrate_vma *migrate)
 		}
 	}
 }
-
-/*
- * migrate_vma() - migrate a range of memory inside vma
- *
- * @ops: migration callback for allocating destination memory and copying
- * @vma: virtual memory area containing the range to be migrated
- * @start: start address of the range to migrate (inclusive)
- * @end: end address of the range to migrate (exclusive)
- * @src: array of hmm_pfn_t containing source pfns
- * @dst: array of hmm_pfn_t containing destination pfns
- * @private: pointer passed back to each of the callback
- * Returns: 0 on success, error code otherwise
- *
- * This function tries to migrate a range of memory virtual address range, using
- * callbacks to allocate and copy memory from source to destination. First it
- * collects all the pages backing each virtual address in the range, saving this
- * inside the src array. Then it locks those pages and unmaps them. Once the pages
- * are locked and unmapped, it checks whether each page is pinned or not. Pages
- * that aren't pinned have the MIGRATE_PFN_MIGRATE flag set (by this function)
- * in the corresponding src array entry. It then restores any pages that are
- * pinned, by remapping and unlocking those pages.
- *
- * At this point it calls the alloc_and_copy() callback. For documentation on
- * what is expected from that callback, see struct migrate_vma_ops comments in
- * include/linux/migrate.h
- *
- * After the alloc_and_copy() callback, this function goes over each entry in
- * the src array that has the MIGRATE_PFN_VALID and MIGRATE_PFN_MIGRATE flag
- * set. If the corresponding entry in dst array has MIGRATE_PFN_VALID flag set,
- * then the function tries to migrate struct page information from the source
- * struct page to the destination struct page. If it fails to migrate the struct
- * page information, then it clears the MIGRATE_PFN_MIGRATE flag in the src
- * array.
- *
- * At this point all successfully migrated pages have an entry in the src
- * array with MIGRATE_PFN_VALID and MIGRATE_PFN_MIGRATE flag set and the dst
- * array entry with MIGRATE_PFN_VALID flag set.
- *
- * It then calls the finalize_and_map() callback. See comments for "struct
- * migrate_vma_ops", in include/linux/migrate.h for details about
- * finalize_and_map() behavior.
- *
- * After the finalize_and_map() callback, for successfully migrated pages, this
- * function updates the CPU page table to point to new pages, otherwise it
- * restores the CPU page table to point to the original source pages.
- *
- * Function returns 0 after the above steps, even if no pages were migrated
- * (The function only returns an error if any of the arguments are invalid.)
- *
- * Both src and dst array must be big enough for (end - start) >> PAGE_SHIFT
- * unsigned long entries.
- */
-int migrate_vma(const struct migrate_vma_ops *ops,
-		struct vm_area_struct *vma,
-		unsigned long start,
-		unsigned long end,
-		unsigned long *src,
-		unsigned long *dst,
-		void *private)
-{
-	struct migrate_vma migrate;
-
-	/* Sanity check the arguments */
-	start &= PAGE_MASK;
-	end &= PAGE_MASK;
-	if (!vma || is_vm_hugetlb_page(vma) || (vma->vm_flags & VM_SPECIAL) ||
-			vma_is_dax(vma))
-		return -EINVAL;
-	if (start < vma->vm_start || start >= vma->vm_end)
-		return -EINVAL;
-	if (end <= vma->vm_start || end > vma->vm_end)
-		return -EINVAL;
-	if (!ops || !src || !dst || start >= end)
-		return -EINVAL;
-
-	memset(src, 0, sizeof(*src) * ((end - start) >> PAGE_SHIFT));
-	migrate.src = src;
-	migrate.dst = dst;
-	migrate.start = start;
-	migrate.npages = 0;
-	migrate.cpages = 0;
-	migrate.end = end;
-	migrate.vma = vma;
-
-	/* Collect, and try to unmap source pages */
-	migrate_vma_collect(&migrate);
-	if (!migrate.cpages)
-		return 0;
-
-	/* Lock and isolate page */
-	migrate_vma_prepare(&migrate);
-	if (!migrate.cpages)
-		return 0;
-
-	/* Unmap pages */
-	migrate_vma_unmap(&migrate);
-	if (!migrate.cpages)
-		return 0;
-
-	/*
-	 * At this point pages are locked and unmapped, and thus they have
-	 * stable content and can safely be copied to destination memory that
-	 * is allocated by the callback.
-	 *
-	 * Note that migration can fail in migrate_vma_struct_page() for each
-	 * individual page.
-	 */
-	ops->alloc_and_copy(vma, src, dst, start, end, private);
-
-	/* This does the real migration of struct page */
-	migrate_vma_pages(&migrate);
-
-	ops->finalize_and_map(vma, src, dst, start, end, private);
-
-	/* Unlock and remap pages */
-	migrate_vma_finalize(&migrate);
-
-	return 0;
-}
-EXPORT_SYMBOL(migrate_vma);
+EXPORT_SYMBOL(migrate_vma_finalize);
 #endif /* defined(MIGRATE_VMA_HELPER) */
-- 
2.20.1


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH 2/9] nouveau: reset dma_nr in nouveau_dmem_migrate_alloc_and_copy
  2019-07-29 14:28 turn the hmm migrate_vma upside down Christoph Hellwig
  2019-07-29 14:28 ` [PATCH 1/9] mm: turn " Christoph Hellwig
@ 2019-07-29 14:28 ` Christoph Hellwig
  2019-07-29 23:18   ` Ralph Campbell
  2019-07-29 14:28 ` [PATCH 3/9] nouveau: factor out device memory address calculation Christoph Hellwig
                   ` (7 subsequent siblings)
  9 siblings, 1 reply; 30+ messages in thread
From: Christoph Hellwig @ 2019-07-29 14:28 UTC (permalink / raw)
  To: Jérôme Glisse, Jason Gunthorpe, Ben Skeggs
  Cc: Ralph Campbell, Bharata B Rao, Andrew Morton, linux-mm, nouveau,
	dri-devel, linux-kernel

When we start a new batch of dma_map operations we need to reset dma_nr,
as we start filling a newly allocated array.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/gpu/drm/nouveau/nouveau_dmem.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/nouveau/nouveau_dmem.c b/drivers/gpu/drm/nouveau/nouveau_dmem.c
index 38416798abd4..e696157f771e 100644
--- a/drivers/gpu/drm/nouveau/nouveau_dmem.c
+++ b/drivers/gpu/drm/nouveau/nouveau_dmem.c
@@ -682,6 +682,7 @@ nouveau_dmem_migrate_alloc_and_copy(struct vm_area_struct *vma,
 	migrate->dma = kmalloc(sizeof(*migrate->dma) * npages, GFP_KERNEL);
 	if (!migrate->dma)
 		goto error;
+	migrate->dma_nr = 0;
 
 	/* Copy things over */
 	copy = drm->dmem->migrate.copy_func;
-- 
2.20.1


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH 3/9] nouveau: factor out device memory address calculation
  2019-07-29 14:28 turn the hmm migrate_vma upside down Christoph Hellwig
  2019-07-29 14:28 ` [PATCH 1/9] mm: turn " Christoph Hellwig
  2019-07-29 14:28 ` [PATCH 2/9] nouveau: reset dma_nr in nouveau_dmem_migrate_alloc_and_copy Christoph Hellwig
@ 2019-07-29 14:28 ` Christoph Hellwig
  2019-07-29 23:21   ` Ralph Campbell
  2019-07-29 14:28 ` [PATCH 4/9] nouveau: factor out dmem fence completion Christoph Hellwig
                   ` (6 subsequent siblings)
  9 siblings, 1 reply; 30+ messages in thread
From: Christoph Hellwig @ 2019-07-29 14:28 UTC (permalink / raw)
  To: Jérôme Glisse, Jason Gunthorpe, Ben Skeggs
  Cc: Ralph Campbell, Bharata B Rao, Andrew Morton, linux-mm, nouveau,
	dri-devel, linux-kernel

Factor out the repeated device memory address calculation into
a helper.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/gpu/drm/nouveau/nouveau_dmem.c | 42 +++++++++++---------------
 1 file changed, 17 insertions(+), 25 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_dmem.c b/drivers/gpu/drm/nouveau/nouveau_dmem.c
index e696157f771e..d469bc334438 100644
--- a/drivers/gpu/drm/nouveau/nouveau_dmem.c
+++ b/drivers/gpu/drm/nouveau/nouveau_dmem.c
@@ -102,6 +102,14 @@ struct nouveau_migrate {
 	unsigned long dma_nr;
 };
 
+static unsigned long nouveau_dmem_page_addr(struct page *page)
+{
+	struct nouveau_dmem_chunk *chunk = page->zone_device_data;
+	unsigned long idx = page_to_pfn(page) - chunk->pfn_first;
+
+	return (idx << PAGE_SHIFT) + chunk->bo->bo.offset;
+}
+
 static void nouveau_dmem_page_free(struct page *page)
 {
 	struct nouveau_dmem_chunk *chunk = page->zone_device_data;
@@ -169,9 +177,7 @@ nouveau_dmem_fault_alloc_and_copy(struct vm_area_struct *vma,
 	/* Copy things over */
 	copy = drm->dmem->migrate.copy_func;
 	for (addr = start, i = 0; addr < end; addr += PAGE_SIZE, i++) {
-		struct nouveau_dmem_chunk *chunk;
 		struct page *spage, *dpage;
-		u64 src_addr, dst_addr;
 
 		dpage = migrate_pfn_to_page(dst_pfns[i]);
 		if (!dpage || dst_pfns[i] == MIGRATE_PFN_ERROR)
@@ -194,14 +200,10 @@ nouveau_dmem_fault_alloc_and_copy(struct vm_area_struct *vma,
 			continue;
 		}
 
-		dst_addr = fault->dma[fault->npages++];
-
-		chunk = spage->zone_device_data;
-		src_addr = page_to_pfn(spage) - chunk->pfn_first;
-		src_addr = (src_addr << PAGE_SHIFT) + chunk->bo->bo.offset;
-
-		ret = copy(drm, 1, NOUVEAU_APER_HOST, dst_addr,
-				   NOUVEAU_APER_VRAM, src_addr);
+		ret = copy(drm, 1, NOUVEAU_APER_HOST,
+				fault->dma[fault->npages++],
+				NOUVEAU_APER_VRAM,
+				nouveau_dmem_page_addr(spage));
 		if (ret) {
 			dst_pfns[i] = MIGRATE_PFN_ERROR;
 			__free_page(dpage);
@@ -687,18 +689,12 @@ nouveau_dmem_migrate_alloc_and_copy(struct vm_area_struct *vma,
 	/* Copy things over */
 	copy = drm->dmem->migrate.copy_func;
 	for (addr = start, i = 0; addr < end; addr += PAGE_SIZE, i++) {
-		struct nouveau_dmem_chunk *chunk;
 		struct page *spage, *dpage;
-		u64 src_addr, dst_addr;
 
 		dpage = migrate_pfn_to_page(dst_pfns[i]);
 		if (!dpage || dst_pfns[i] == MIGRATE_PFN_ERROR)
 			continue;
 
-		chunk = dpage->zone_device_data;
-		dst_addr = page_to_pfn(dpage) - chunk->pfn_first;
-		dst_addr = (dst_addr << PAGE_SHIFT) + chunk->bo->bo.offset;
-
 		spage = migrate_pfn_to_page(src_pfns[i]);
 		if (!spage || !(src_pfns[i] & MIGRATE_PFN_MIGRATE)) {
 			nouveau_dmem_page_free_locked(drm, dpage);
@@ -716,10 +712,10 @@ nouveau_dmem_migrate_alloc_and_copy(struct vm_area_struct *vma,
 			continue;
 		}
 
-		src_addr = migrate->dma[migrate->dma_nr++];
-
-		ret = copy(drm, 1, NOUVEAU_APER_VRAM, dst_addr,
-				   NOUVEAU_APER_HOST, src_addr);
+		ret = copy(drm, 1, NOUVEAU_APER_VRAM,
+				nouveau_dmem_page_addr(dpage),
+				NOUVEAU_APER_HOST,
+				migrate->dma[migrate->dma_nr++]);
 		if (ret) {
 			nouveau_dmem_page_free_locked(drm, dpage);
 			dst_pfns[i] = 0;
@@ -846,7 +842,6 @@ nouveau_dmem_convert_pfn(struct nouveau_drm *drm,
 
 	npages = (range->end - range->start) >> PAGE_SHIFT;
 	for (i = 0; i < npages; ++i) {
-		struct nouveau_dmem_chunk *chunk;
 		struct page *page;
 		uint64_t addr;
 
@@ -864,10 +859,7 @@ nouveau_dmem_convert_pfn(struct nouveau_drm *drm,
 			continue;
 		}
 
-		chunk = page->zone_device_data;
-		addr = page_to_pfn(page) - chunk->pfn_first;
-		addr = (addr + chunk->bo->bo.mem.start) << PAGE_SHIFT;
-
+		addr = nouveau_dmem_page_addr(page);
 		range->pfns[i] &= ((1UL << range->pfn_shift) - 1);
 		range->pfns[i] |= (addr >> PAGE_SHIFT) << range->pfn_shift;
 	}
-- 
2.20.1


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH 4/9] nouveau: factor out dmem fence completion
  2019-07-29 14:28 turn the hmm migrate_vma upside down Christoph Hellwig
                   ` (2 preceding siblings ...)
  2019-07-29 14:28 ` [PATCH 3/9] nouveau: factor out device memory address calculation Christoph Hellwig
@ 2019-07-29 14:28 ` Christoph Hellwig
  2019-07-29 23:23   ` Ralph Campbell
  2019-07-29 14:28 ` [PATCH 5/9] nouveau: simplify nouveau_dmem_migrate_to_ram Christoph Hellwig
                   ` (5 subsequent siblings)
  9 siblings, 1 reply; 30+ messages in thread
From: Christoph Hellwig @ 2019-07-29 14:28 UTC (permalink / raw)
  To: Jérôme Glisse, Jason Gunthorpe, Ben Skeggs
  Cc: Ralph Campbell, Bharata B Rao, Andrew Morton, linux-mm, nouveau,
	dri-devel, linux-kernel

Factor out the end of fencing logic from the two migration routines.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/gpu/drm/nouveau/nouveau_dmem.c | 33 ++++++++++++--------------
 1 file changed, 15 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_dmem.c b/drivers/gpu/drm/nouveau/nouveau_dmem.c
index d469bc334438..21052a4aaf69 100644
--- a/drivers/gpu/drm/nouveau/nouveau_dmem.c
+++ b/drivers/gpu/drm/nouveau/nouveau_dmem.c
@@ -133,6 +133,19 @@ static void nouveau_dmem_page_free(struct page *page)
 	spin_unlock(&chunk->lock);
 }
 
+static void nouveau_dmem_fence_done(struct nouveau_fence **fence)
+{
+	if (fence) {
+		nouveau_fence_wait(*fence, true, false);
+		nouveau_fence_unref(fence);
+	} else {
+		/*
+		 * FIXME wait for channel to be IDLE before calling finalizing
+		 * the hmem object.
+		 */
+	}
+}
+
 static void
 nouveau_dmem_fault_alloc_and_copy(struct vm_area_struct *vma,
 				  const unsigned long *src_pfns,
@@ -236,15 +249,7 @@ nouveau_dmem_fault_finalize_and_map(struct nouveau_dmem_fault *fault)
 {
 	struct nouveau_drm *drm = fault->drm;
 
-	if (fault->fence) {
-		nouveau_fence_wait(fault->fence, true, false);
-		nouveau_fence_unref(&fault->fence);
-	} else {
-		/*
-		 * FIXME wait for channel to be IDLE before calling finalizing
-		 * the hmem object below (nouveau_migrate_hmem_fini()).
-		 */
-	}
+	nouveau_dmem_fence_done(&fault->fence);
 
 	while (fault->npages--) {
 		dma_unmap_page(drm->dev->dev, fault->dma[fault->npages],
@@ -748,15 +753,7 @@ nouveau_dmem_migrate_finalize_and_map(struct nouveau_migrate *migrate)
 {
 	struct nouveau_drm *drm = migrate->drm;
 
-	if (migrate->fence) {
-		nouveau_fence_wait(migrate->fence, true, false);
-		nouveau_fence_unref(&migrate->fence);
-	} else {
-		/*
-		 * FIXME wait for channel to be IDLE before finalizing
-		 * the hmem object below (nouveau_migrate_hmem_fini()) ?
-		 */
-	}
+	nouveau_dmem_fence_done(&migrate->fence);
 
 	while (migrate->dma_nr--) {
 		dma_unmap_page(drm->dev->dev, migrate->dma[migrate->dma_nr],
-- 
2.20.1


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH 5/9] nouveau: simplify nouveau_dmem_migrate_to_ram
  2019-07-29 14:28 turn the hmm migrate_vma upside down Christoph Hellwig
                   ` (3 preceding siblings ...)
  2019-07-29 14:28 ` [PATCH 4/9] nouveau: factor out dmem fence completion Christoph Hellwig
@ 2019-07-29 14:28 ` Christoph Hellwig
  2019-07-29 23:26   ` Ralph Campbell
  2019-07-31  9:57   ` Bharata B Rao
  2019-07-29 14:28 ` [PATCH 6/9] nouveau: simplify nouveau_dmem_migrate_vma Christoph Hellwig
                   ` (4 subsequent siblings)
  9 siblings, 2 replies; 30+ messages in thread
From: Christoph Hellwig @ 2019-07-29 14:28 UTC (permalink / raw)
  To: Jérôme Glisse, Jason Gunthorpe, Ben Skeggs
  Cc: Ralph Campbell, Bharata B Rao, Andrew Morton, linux-mm, nouveau,
	dri-devel, linux-kernel

Factor the main copy page to ram routine out into a helper that acts on
a single page and which doesn't require the nouveau_dmem_fault
structure for argument passing.  Also remove the loop over multiple
pages as we only handle one at the moment, although the structure of
the main worker function makes it relatively easy to add multi page
support back if needed in the future.  But at least for now this avoid
the needed to dynamically allocate memory for the dma addresses in
what is essentially the page fault path.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/gpu/drm/nouveau/nouveau_dmem.c | 158 ++++++-------------------
 1 file changed, 39 insertions(+), 119 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_dmem.c b/drivers/gpu/drm/nouveau/nouveau_dmem.c
index 21052a4aaf69..036e6c07d489 100644
--- a/drivers/gpu/drm/nouveau/nouveau_dmem.c
+++ b/drivers/gpu/drm/nouveau/nouveau_dmem.c
@@ -86,13 +86,6 @@ static inline struct nouveau_dmem *page_to_dmem(struct page *page)
 	return container_of(page->pgmap, struct nouveau_dmem, pagemap);
 }
 
-struct nouveau_dmem_fault {
-	struct nouveau_drm *drm;
-	struct nouveau_fence *fence;
-	dma_addr_t *dma;
-	unsigned long npages;
-};
-
 struct nouveau_migrate {
 	struct vm_area_struct *vma;
 	struct nouveau_drm *drm;
@@ -146,130 +139,55 @@ static void nouveau_dmem_fence_done(struct nouveau_fence **fence)
 	}
 }
 
-static void
-nouveau_dmem_fault_alloc_and_copy(struct vm_area_struct *vma,
-				  const unsigned long *src_pfns,
-				  unsigned long *dst_pfns,
-				  unsigned long start,
-				  unsigned long end,
-				  struct nouveau_dmem_fault *fault)
+static vm_fault_t nouveau_dmem_fault_copy_one(struct nouveau_drm *drm,
+		struct vm_area_struct *vma, unsigned long addr,
+		unsigned long src, unsigned long *dst, dma_addr_t *dma_addr)
 {
-	struct nouveau_drm *drm = fault->drm;
 	struct device *dev = drm->dev->dev;
-	unsigned long addr, i, npages = 0;
-	nouveau_migrate_copy_t copy;
-	int ret;
-
-
-	/* First allocate new memory */
-	for (addr = start, i = 0; addr < end; addr += PAGE_SIZE, i++) {
-		struct page *dpage, *spage;
-
-		dst_pfns[i] = 0;
-		spage = migrate_pfn_to_page(src_pfns[i]);
-		if (!spage || !(src_pfns[i] & MIGRATE_PFN_MIGRATE))
-			continue;
+	struct page *dpage, *spage;
 
-		dpage = alloc_page_vma(GFP_HIGHUSER, vma, addr);
-		if (!dpage) {
-			dst_pfns[i] = MIGRATE_PFN_ERROR;
-			continue;
-		}
-		lock_page(dpage);
-
-		dst_pfns[i] = migrate_pfn(page_to_pfn(dpage)) |
-			      MIGRATE_PFN_LOCKED;
-		npages++;
-	}
+	spage = migrate_pfn_to_page(src);
+	if (!spage || !(src & MIGRATE_PFN_MIGRATE))
+		return 0;
 
-	/* Allocate storage for DMA addresses, so we can unmap later. */
-	fault->dma = kmalloc(sizeof(*fault->dma) * npages, GFP_KERNEL);
-	if (!fault->dma)
+	dpage = alloc_page_vma(GFP_HIGHUSER, args->vma, addr);
+	if (!dpage)
 		goto error;
+	lock_page(dpage);
 
-	/* Copy things over */
-	copy = drm->dmem->migrate.copy_func;
-	for (addr = start, i = 0; addr < end; addr += PAGE_SIZE, i++) {
-		struct page *spage, *dpage;
-
-		dpage = migrate_pfn_to_page(dst_pfns[i]);
-		if (!dpage || dst_pfns[i] == MIGRATE_PFN_ERROR)
-			continue;
-
-		spage = migrate_pfn_to_page(src_pfns[i]);
-		if (!spage || !(src_pfns[i] & MIGRATE_PFN_MIGRATE)) {
-			dst_pfns[i] = MIGRATE_PFN_ERROR;
-			__free_page(dpage);
-			continue;
-		}
-
-		fault->dma[fault->npages] =
-			dma_map_page_attrs(dev, dpage, 0, PAGE_SIZE,
-					   PCI_DMA_BIDIRECTIONAL,
-					   DMA_ATTR_SKIP_CPU_SYNC);
-		if (dma_mapping_error(dev, fault->dma[fault->npages])) {
-			dst_pfns[i] = MIGRATE_PFN_ERROR;
-			__free_page(dpage);
-			continue;
-		}
-
-		ret = copy(drm, 1, NOUVEAU_APER_HOST,
-				fault->dma[fault->npages++],
-				NOUVEAU_APER_VRAM,
-				nouveau_dmem_page_addr(spage));
-		if (ret) {
-			dst_pfns[i] = MIGRATE_PFN_ERROR;
-			__free_page(dpage);
-			continue;
-		}
-	}
+	*dma_addr = dma_map_page(dev, dpage, 0, PAGE_SIZE, DMA_BIDIRECTIONAL);
+	if (dma_mapping_error(dev, *dma_addr))
+		goto error_free_page;
 
-	nouveau_fence_new(drm->dmem->migrate.chan, false, &fault->fence);
+	if (drm->dmem->migrate.copy_func(drm, 1, NOUVEAU_APER_HOST, *dma_addr,
+			NOUVEAU_APER_VRAM, nouveau_dmem_page_addr(spage)))
+		goto error_dma_unmap;
 
-	return;
+	*dst = migrate_pfn(page_to_pfn(dpage)) | MIGRATE_PFN_LOCKED;
 
+error_dma_unmap:
+	dma_unmap_page(dev, *dma_addr, PAGE_SIZE, DMA_BIDIRECTIONAL);
+error_free_page:
+	__free_page(dpage);
 error:
-	for (addr = start, i = 0; addr < end; addr += PAGE_SIZE, ++i) {
-		struct page *page;
-
-		if (!dst_pfns[i] || dst_pfns[i] == MIGRATE_PFN_ERROR)
-			continue;
-
-		page = migrate_pfn_to_page(dst_pfns[i]);
-		dst_pfns[i] = MIGRATE_PFN_ERROR;
-		if (page == NULL)
-			continue;
-
-		__free_page(page);
-	}
-}
-
-static void
-nouveau_dmem_fault_finalize_and_map(struct nouveau_dmem_fault *fault)
-{
-	struct nouveau_drm *drm = fault->drm;
-
-	nouveau_dmem_fence_done(&fault->fence);
-
-	while (fault->npages--) {
-		dma_unmap_page(drm->dev->dev, fault->dma[fault->npages],
-			       PAGE_SIZE, PCI_DMA_BIDIRECTIONAL);
-	}
-	kfree(fault->dma);
+	return VM_FAULT_SIGBUS;
 }
 
 static vm_fault_t nouveau_dmem_migrate_to_ram(struct vm_fault *vmf)
 {
 	struct nouveau_dmem *dmem = page_to_dmem(vmf->page);
-	unsigned long src[1] = {0}, dst[1] = {0};
+	struct nouveau_drm *drm = dmem->drm;
+	struct nouveau_fence *fence;
+	unsigned long src = 0, dst = 0;
+	dma_addr_t dma_addr = 0;
+	vm_fault_t ret;
 	struct migrate_vma args = {
 		.vma		= vmf->vma,
 		.start		= vmf->address,
 		.end		= vmf->address + PAGE_SIZE,
-		.src		= src,
-		.dst		= dst,
+		.src		= &src,
+		.dst		= &dst,
 	};
-	struct nouveau_dmem_fault fault = { .drm = dmem->drm };
 
 	/*
 	 * FIXME what we really want is to find some heuristic to migrate more
@@ -281,16 +199,18 @@ static vm_fault_t nouveau_dmem_migrate_to_ram(struct vm_fault *vmf)
 	if (!args.cpages)
 		return 0;
 
-	nouveau_dmem_fault_alloc_and_copy(args.vma, src, dst, args.start,
-			args.end, &fault);
-	migrate_vma_pages(&args);
-	nouveau_dmem_fault_finalize_and_map(&fault);
+	ret = nouveau_dmem_fault_copy_one(drm, vmf->vma, vmf->address, src,
+			&dst, &dma_addr);
+	if (ret || dst == 0)
+		goto done;
 
+	nouveau_fence_new(dmem->migrate.chan, false, &fence);
+	migrate_vma_pages(&args);
+	nouveau_dmem_fence_done(&fence);
+	dma_unmap_page(drm->dev->dev, dma_addr, PAGE_SIZE, DMA_BIDIRECTIONAL);
+done:
 	migrate_vma_finalize(&args);
-	if (dst[0] == MIGRATE_PFN_ERROR)
-		return VM_FAULT_SIGBUS;
-
-	return 0;
+	return ret;
 }
 
 static const struct dev_pagemap_ops nouveau_dmem_pagemap_ops = {
-- 
2.20.1


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH 6/9] nouveau: simplify nouveau_dmem_migrate_vma
  2019-07-29 14:28 turn the hmm migrate_vma upside down Christoph Hellwig
                   ` (4 preceding siblings ...)
  2019-07-29 14:28 ` [PATCH 5/9] nouveau: simplify nouveau_dmem_migrate_to_ram Christoph Hellwig
@ 2019-07-29 14:28 ` Christoph Hellwig
  2019-07-29 23:27   ` Ralph Campbell
  2019-07-29 14:28 ` [PATCH 7/9] mm: remove the unused MIGRATE_PFN_ERROR flag Christoph Hellwig
                   ` (3 subsequent siblings)
  9 siblings, 1 reply; 30+ messages in thread
From: Christoph Hellwig @ 2019-07-29 14:28 UTC (permalink / raw)
  To: Jérôme Glisse, Jason Gunthorpe, Ben Skeggs
  Cc: Ralph Campbell, Bharata B Rao, Andrew Morton, linux-mm, nouveau,
	dri-devel, linux-kernel

Factor the main copy page to vram routine out into a helper that acts
on a single page and which doesn't require the nouveau_dmem_migrate
structure for argument passing.  As an added benefit the new version
only allocates the dma address array once and reuses it for each
subsequent chunk of work.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/gpu/drm/nouveau/nouveau_dmem.c | 185 ++++++++-----------------
 1 file changed, 56 insertions(+), 129 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_dmem.c b/drivers/gpu/drm/nouveau/nouveau_dmem.c
index 036e6c07d489..6cb930755970 100644
--- a/drivers/gpu/drm/nouveau/nouveau_dmem.c
+++ b/drivers/gpu/drm/nouveau/nouveau_dmem.c
@@ -44,8 +44,6 @@
 #define DMEM_CHUNK_SIZE (2UL << 20)
 #define DMEM_CHUNK_NPAGES (DMEM_CHUNK_SIZE >> PAGE_SHIFT)
 
-struct nouveau_migrate;
-
 enum nouveau_aper {
 	NOUVEAU_APER_VIRT,
 	NOUVEAU_APER_VRAM,
@@ -86,15 +84,6 @@ static inline struct nouveau_dmem *page_to_dmem(struct page *page)
 	return container_of(page->pgmap, struct nouveau_dmem, pagemap);
 }
 
-struct nouveau_migrate {
-	struct vm_area_struct *vma;
-	struct nouveau_drm *drm;
-	struct nouveau_fence *fence;
-	unsigned long npages;
-	dma_addr_t *dma;
-	unsigned long dma_nr;
-};
-
 static unsigned long nouveau_dmem_page_addr(struct page *page)
 {
 	struct nouveau_dmem_chunk *chunk = page->zone_device_data;
@@ -569,131 +558,67 @@ nouveau_dmem_init(struct nouveau_drm *drm)
 	drm->dmem = NULL;
 }
 
-static void
-nouveau_dmem_migrate_alloc_and_copy(struct vm_area_struct *vma,
-				    const unsigned long *src_pfns,
-				    unsigned long *dst_pfns,
-				    unsigned long start,
-				    unsigned long end,
-				    struct nouveau_migrate *migrate)
+static unsigned long nouveau_dmem_migrate_copy_one(struct nouveau_drm *drm,
+		struct vm_area_struct *vma, unsigned long addr,
+		unsigned long src, dma_addr_t *dma_addr)
 {
-	struct nouveau_drm *drm = migrate->drm;
 	struct device *dev = drm->dev->dev;
-	unsigned long addr, i, npages = 0;
-	nouveau_migrate_copy_t copy;
-	int ret;
-
-	/* First allocate new memory */
-	for (addr = start, i = 0; addr < end; addr += PAGE_SIZE, i++) {
-		struct page *dpage, *spage;
-
-		dst_pfns[i] = 0;
-		spage = migrate_pfn_to_page(src_pfns[i]);
-		if (!spage || !(src_pfns[i] & MIGRATE_PFN_MIGRATE))
-			continue;
-
-		dpage = nouveau_dmem_page_alloc_locked(drm);
-		if (!dpage)
-			continue;
-
-		dst_pfns[i] = migrate_pfn(page_to_pfn(dpage)) |
-			      MIGRATE_PFN_LOCKED |
-			      MIGRATE_PFN_DEVICE;
-		npages++;
-	}
-
-	if (!npages)
-		return;
-
-	/* Allocate storage for DMA addresses, so we can unmap later. */
-	migrate->dma = kmalloc(sizeof(*migrate->dma) * npages, GFP_KERNEL);
-	if (!migrate->dma)
-		goto error;
-	migrate->dma_nr = 0;
-
-	/* Copy things over */
-	copy = drm->dmem->migrate.copy_func;
-	for (addr = start, i = 0; addr < end; addr += PAGE_SIZE, i++) {
-		struct page *spage, *dpage;
-
-		dpage = migrate_pfn_to_page(dst_pfns[i]);
-		if (!dpage || dst_pfns[i] == MIGRATE_PFN_ERROR)
-			continue;
-
-		spage = migrate_pfn_to_page(src_pfns[i]);
-		if (!spage || !(src_pfns[i] & MIGRATE_PFN_MIGRATE)) {
-			nouveau_dmem_page_free_locked(drm, dpage);
-			dst_pfns[i] = 0;
-			continue;
-		}
-
-		migrate->dma[migrate->dma_nr] =
-			dma_map_page_attrs(dev, spage, 0, PAGE_SIZE,
-					   PCI_DMA_BIDIRECTIONAL,
-					   DMA_ATTR_SKIP_CPU_SYNC);
-		if (dma_mapping_error(dev, migrate->dma[migrate->dma_nr])) {
-			nouveau_dmem_page_free_locked(drm, dpage);
-			dst_pfns[i] = 0;
-			continue;
-		}
-
-		ret = copy(drm, 1, NOUVEAU_APER_VRAM,
-				nouveau_dmem_page_addr(dpage),
-				NOUVEAU_APER_HOST,
-				migrate->dma[migrate->dma_nr++]);
-		if (ret) {
-			nouveau_dmem_page_free_locked(drm, dpage);
-			dst_pfns[i] = 0;
-			continue;
-		}
-	}
+	struct page *dpage, *spage;
 
-	nouveau_fence_new(drm->dmem->migrate.chan, false, &migrate->fence);
+	spage = migrate_pfn_to_page(src);
+	if (!spage || !(src & MIGRATE_PFN_MIGRATE))
+		goto out;
 
-	return;
+	dpage = nouveau_dmem_page_alloc_locked(drm);
+	if (!dpage)
+		return 0;
 
-error:
-	for (addr = start, i = 0; addr < end; addr += PAGE_SIZE, ++i) {
-		struct page *page;
+	*dma_addr = dma_map_page(dev, spage, 0, PAGE_SIZE, DMA_BIDIRECTIONAL);
+	if (dma_mapping_error(dev, *dma_addr))
+		goto out_free_page;
 
-		if (!dst_pfns[i] || dst_pfns[i] == MIGRATE_PFN_ERROR)
-			continue;
+	if (drm->dmem->migrate.copy_func(drm, 1, NOUVEAU_APER_VRAM,
+			nouveau_dmem_page_addr(dpage), NOUVEAU_APER_HOST,
+			*dma_addr))
+		goto out_dma_unmap;
 
-		page = migrate_pfn_to_page(dst_pfns[i]);
-		dst_pfns[i] = MIGRATE_PFN_ERROR;
-		if (page == NULL)
-			continue;
+	return migrate_pfn(page_to_pfn(dpage)) |
+		MIGRATE_PFN_LOCKED | MIGRATE_PFN_DEVICE;
 
-		__free_page(page);
-	}
+out_dma_unmap:
+	dma_unmap_page(dev, *dma_addr, PAGE_SIZE, DMA_BIDIRECTIONAL);
+out_free_page:
+	nouveau_dmem_page_free_locked(drm, dpage);
+out:
+	return 0;
 }
 
-static void
-nouveau_dmem_migrate_finalize_and_map(struct nouveau_migrate *migrate)
+static void nouveau_dmem_migrate_chunk(struct migrate_vma *args,
+		struct nouveau_drm *drm, dma_addr_t *dma_addrs)
 {
-	struct nouveau_drm *drm = migrate->drm;
+	struct nouveau_fence *fence;
+	unsigned long addr = args->start, nr_dma = 0, i;
+
+	for (i = 0; addr < args->end; i++) {
+		args->dst[i] = nouveau_dmem_migrate_copy_one(drm, args->vma,
+				addr, args->src[i], &dma_addrs[nr_dma]);
+		if (args->dst[i])
+			nr_dma++;
+		addr += PAGE_SIZE;
+	}
 
-	nouveau_dmem_fence_done(&migrate->fence);
+	nouveau_fence_new(drm->dmem->migrate.chan, false, &fence);
+	migrate_vma_pages(args);
+	nouveau_dmem_fence_done(&fence);
 
-	while (migrate->dma_nr--) {
-		dma_unmap_page(drm->dev->dev, migrate->dma[migrate->dma_nr],
-			       PAGE_SIZE, PCI_DMA_BIDIRECTIONAL);
+	while (nr_dma--) {
+		dma_unmap_page(drm->dev->dev, dma_addrs[nr_dma], PAGE_SIZE,
+				DMA_BIDIRECTIONAL);
 	}
-	kfree(migrate->dma);
-
 	/*
-	 * FIXME optimization: update GPU page table to point to newly
-	 * migrated memory.
+	 * FIXME optimization: update GPU page table to point to newly migrated
+	 * memory.
 	 */
-}
-
-static void nouveau_dmem_migrate_chunk(struct migrate_vma *args,
-		struct nouveau_migrate *migrate)
-{
-	nouveau_dmem_migrate_alloc_and_copy(args->vma, args->src, args->dst,
-			args->start, args->end, migrate);
-	migrate_vma_pages(args);
-	nouveau_dmem_migrate_finalize_and_map(migrate);
 	migrate_vma_finalize(args);
 }
 
@@ -705,38 +630,40 @@ nouveau_dmem_migrate_vma(struct nouveau_drm *drm,
 {
 	unsigned long npages = (end - start) >> PAGE_SHIFT;
 	unsigned long max = min(SG_MAX_SINGLE_ALLOC, npages);
+	dma_addr_t *dma_addrs;
 	struct migrate_vma args = {
 		.vma		= vma,
 		.start		= start,
 	};
-	struct nouveau_migrate migrate = {
-		.drm		= drm,
-		.vma		= vma,
-		.npages		= npages,
-	};
 	unsigned long c, i;
 	int ret = -ENOMEM;
 
-	args.src = kzalloc(sizeof(long) * max, GFP_KERNEL);
+	args.src = kcalloc(max, sizeof(args.src), GFP_KERNEL);
 	if (!args.src)
 		goto out;
-	args.dst = kzalloc(sizeof(long) * max, GFP_KERNEL);
+	args.dst = kcalloc(max, sizeof(args.dst), GFP_KERNEL);
 	if (!args.dst)
 		goto out_free_src;
 
+	dma_addrs = kmalloc_array(max, sizeof(*dma_addrs), GFP_KERNEL);
+	if (!dma_addrs)
+		goto out_free_dst;
+
 	for (i = 0; i < npages; i += c) {
 		c = min(SG_MAX_SINGLE_ALLOC, npages);
 		args.end = start + (c << PAGE_SHIFT);
 		ret = migrate_vma_setup(&args);
 		if (ret)
-			goto out_free_dst;
+			goto out_free_dma;
 
 		if (args.cpages)
-			nouveau_dmem_migrate_chunk(&args, &migrate);
+			nouveau_dmem_migrate_chunk(&args, drm, dma_addrs);
 		args.start = args.end;
 	}
 
 	ret = 0;
+out_free_dma:
+	kfree(dma_addrs);
 out_free_dst:
 	kfree(args.dst);
 out_free_src:
-- 
2.20.1


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH 7/9] mm: remove the unused MIGRATE_PFN_ERROR flag
  2019-07-29 14:28 turn the hmm migrate_vma upside down Christoph Hellwig
                   ` (5 preceding siblings ...)
  2019-07-29 14:28 ` [PATCH 6/9] nouveau: simplify nouveau_dmem_migrate_vma Christoph Hellwig
@ 2019-07-29 14:28 ` Christoph Hellwig
  2019-07-29 23:29   ` Ralph Campbell
  2019-07-29 14:28 ` [PATCH 8/9] mm: remove the unused MIGRATE_PFN_DEVICE flag Christoph Hellwig
                   ` (2 subsequent siblings)
  9 siblings, 1 reply; 30+ messages in thread
From: Christoph Hellwig @ 2019-07-29 14:28 UTC (permalink / raw)
  To: Jérôme Glisse, Jason Gunthorpe, Ben Skeggs
  Cc: Ralph Campbell, Bharata B Rao, Andrew Morton, linux-mm, nouveau,
	dri-devel, linux-kernel

We don't use this flag anymore, so remove it.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 include/linux/migrate.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/include/linux/migrate.h b/include/linux/migrate.h
index 093d67fcf6dd..229153c2c496 100644
--- a/include/linux/migrate.h
+++ b/include/linux/migrate.h
@@ -167,7 +167,6 @@ static inline int migrate_misplaced_transhuge_page(struct mm_struct *mm,
 #define MIGRATE_PFN_LOCKED	(1UL << 2)
 #define MIGRATE_PFN_WRITE	(1UL << 3)
 #define MIGRATE_PFN_DEVICE	(1UL << 4)
-#define MIGRATE_PFN_ERROR	(1UL << 5)
 #define MIGRATE_PFN_SHIFT	6
 
 static inline struct page *migrate_pfn_to_page(unsigned long mpfn)
-- 
2.20.1


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH 8/9] mm: remove the unused MIGRATE_PFN_DEVICE flag
  2019-07-29 14:28 turn the hmm migrate_vma upside down Christoph Hellwig
                   ` (6 preceding siblings ...)
  2019-07-29 14:28 ` [PATCH 7/9] mm: remove the unused MIGRATE_PFN_ERROR flag Christoph Hellwig
@ 2019-07-29 14:28 ` Christoph Hellwig
  2019-07-29 23:31   ` Ralph Campbell
  2019-07-29 14:28 ` [PATCH 9/9] mm: remove the MIGRATE_PFN_WRITE flag Christoph Hellwig
  2019-07-30 12:32 ` turn the hmm migrate_vma upside down Jason Gunthorpe
  9 siblings, 1 reply; 30+ messages in thread
From: Christoph Hellwig @ 2019-07-29 14:28 UTC (permalink / raw)
  To: Jérôme Glisse, Jason Gunthorpe, Ben Skeggs
  Cc: Ralph Campbell, Bharata B Rao, Andrew Morton, linux-mm, nouveau,
	dri-devel, linux-kernel

No one ever checks this flag, and we could easily get that information
from the page if needed.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/gpu/drm/nouveau/nouveau_dmem.c | 3 +--
 include/linux/migrate.h                | 1 -
 mm/migrate.c                           | 4 ++--
 3 files changed, 3 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_dmem.c b/drivers/gpu/drm/nouveau/nouveau_dmem.c
index 6cb930755970..f04686a2c21f 100644
--- a/drivers/gpu/drm/nouveau/nouveau_dmem.c
+++ b/drivers/gpu/drm/nouveau/nouveau_dmem.c
@@ -582,8 +582,7 @@ static unsigned long nouveau_dmem_migrate_copy_one(struct nouveau_drm *drm,
 			*dma_addr))
 		goto out_dma_unmap;
 
-	return migrate_pfn(page_to_pfn(dpage)) |
-		MIGRATE_PFN_LOCKED | MIGRATE_PFN_DEVICE;
+	return migrate_pfn(page_to_pfn(dpage)) | MIGRATE_PFN_LOCKED;
 
 out_dma_unmap:
 	dma_unmap_page(dev, *dma_addr, PAGE_SIZE, DMA_BIDIRECTIONAL);
diff --git a/include/linux/migrate.h b/include/linux/migrate.h
index 229153c2c496..8b46cfdb1a0e 100644
--- a/include/linux/migrate.h
+++ b/include/linux/migrate.h
@@ -166,7 +166,6 @@ static inline int migrate_misplaced_transhuge_page(struct mm_struct *mm,
 #define MIGRATE_PFN_MIGRATE	(1UL << 1)
 #define MIGRATE_PFN_LOCKED	(1UL << 2)
 #define MIGRATE_PFN_WRITE	(1UL << 3)
-#define MIGRATE_PFN_DEVICE	(1UL << 4)
 #define MIGRATE_PFN_SHIFT	6
 
 static inline struct page *migrate_pfn_to_page(unsigned long mpfn)
diff --git a/mm/migrate.c b/mm/migrate.c
index dc4e60a496f2..74735256e260 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -2237,8 +2237,8 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp,
 				goto next;
 
 			page = device_private_entry_to_page(entry);
-			mpfn = migrate_pfn(page_to_pfn(page))|
-				MIGRATE_PFN_DEVICE | MIGRATE_PFN_MIGRATE;
+			mpfn = migrate_pfn(page_to_pfn(page)) |
+					MIGRATE_PFN_MIGRATE;
 			if (is_write_device_private_entry(entry))
 				mpfn |= MIGRATE_PFN_WRITE;
 		} else {
-- 
2.20.1


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH 9/9] mm: remove the MIGRATE_PFN_WRITE flag
  2019-07-29 14:28 turn the hmm migrate_vma upside down Christoph Hellwig
                   ` (7 preceding siblings ...)
  2019-07-29 14:28 ` [PATCH 8/9] mm: remove the unused MIGRATE_PFN_DEVICE flag Christoph Hellwig
@ 2019-07-29 14:28 ` Christoph Hellwig
  2019-07-29 23:30   ` Jerome Glisse
  2019-07-29 23:42   ` Ralph Campbell
  2019-07-30 12:32 ` turn the hmm migrate_vma upside down Jason Gunthorpe
  9 siblings, 2 replies; 30+ messages in thread
From: Christoph Hellwig @ 2019-07-29 14:28 UTC (permalink / raw)
  To: Jérôme Glisse, Jason Gunthorpe, Ben Skeggs
  Cc: Ralph Campbell, Bharata B Rao, Andrew Morton, linux-mm, nouveau,
	dri-devel, linux-kernel

The MIGRATE_PFN_WRITE is only used locally in migrate_vma_collect_pmd,
where it can be replaced with a simple boolean local variable.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 include/linux/migrate.h | 1 -
 mm/migrate.c            | 9 +++++----
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/include/linux/migrate.h b/include/linux/migrate.h
index 8b46cfdb1a0e..ba74ef5a7702 100644
--- a/include/linux/migrate.h
+++ b/include/linux/migrate.h
@@ -165,7 +165,6 @@ static inline int migrate_misplaced_transhuge_page(struct mm_struct *mm,
 #define MIGRATE_PFN_VALID	(1UL << 0)
 #define MIGRATE_PFN_MIGRATE	(1UL << 1)
 #define MIGRATE_PFN_LOCKED	(1UL << 2)
-#define MIGRATE_PFN_WRITE	(1UL << 3)
 #define MIGRATE_PFN_SHIFT	6
 
 static inline struct page *migrate_pfn_to_page(unsigned long mpfn)
diff --git a/mm/migrate.c b/mm/migrate.c
index 74735256e260..724f92dcc31b 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -2212,6 +2212,7 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp,
 		unsigned long mpfn, pfn;
 		struct page *page;
 		swp_entry_t entry;
+		bool writable = false;
 		pte_t pte;
 
 		pte = *ptep;
@@ -2240,7 +2241,7 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp,
 			mpfn = migrate_pfn(page_to_pfn(page)) |
 					MIGRATE_PFN_MIGRATE;
 			if (is_write_device_private_entry(entry))
-				mpfn |= MIGRATE_PFN_WRITE;
+				writable = true;
 		} else {
 			if (is_zero_pfn(pfn)) {
 				mpfn = MIGRATE_PFN_MIGRATE;
@@ -2250,7 +2251,8 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp,
 			}
 			page = vm_normal_page(migrate->vma, addr, pte);
 			mpfn = migrate_pfn(pfn) | MIGRATE_PFN_MIGRATE;
-			mpfn |= pte_write(pte) ? MIGRATE_PFN_WRITE : 0;
+			if (pte_write(pte))
+				writable = true;
 		}
 
 		/* FIXME support THP */
@@ -2284,8 +2286,7 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp,
 			ptep_get_and_clear(mm, addr, ptep);
 
 			/* Setup special migration page table entry */
-			entry = make_migration_entry(page, mpfn &
-						     MIGRATE_PFN_WRITE);
+			entry = make_migration_entry(page, writable);
 			swp_pte = swp_entry_to_pte(entry);
 			if (pte_soft_dirty(pte))
 				swp_pte = pte_swp_mksoft_dirty(swp_pte);
-- 
2.20.1


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 1/9] mm: turn migrate_vma upside down
  2019-07-29 14:28 ` [PATCH 1/9] mm: turn " Christoph Hellwig
@ 2019-07-29 23:12   ` Ralph Campbell
  2019-07-29 23:43   ` Jerome Glisse
  2019-07-31  1:46   ` Ralph Campbell
  2 siblings, 0 replies; 30+ messages in thread
From: Ralph Campbell @ 2019-07-29 23:12 UTC (permalink / raw)
  To: Christoph Hellwig, Jérôme Glisse, Jason Gunthorpe, Ben Skeggs
  Cc: Bharata B Rao, Andrew Morton, linux-mm, nouveau, dri-devel, linux-kernel


On 7/29/19 7:28 AM, Christoph Hellwig wrote:
> There isn't any good reason to pass callbacks to migrate_vma.  Instead
> we can just export the three steps done by this function to drivers and
> let them sequence the operation without callbacks.  This removes a lot
> of boilerplate code as-is, and will allow the drivers to drastically
> improve code flow and error handling further on.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>

Except for a few white space errors (<space><tab> and <space>$),
looks OK.

Reviewed-by: Ralph Campbell <rcampbell@nvidia.com>

> ---
>   Documentation/vm/hmm.rst               |  55 +-----
>   drivers/gpu/drm/nouveau/nouveau_dmem.c | 122 +++++++------
>   include/linux/migrate.h                | 118 ++----------
>   mm/migrate.c                           | 242 +++++++++++--------------
>   4 files changed, 193 insertions(+), 344 deletions(-)
> 
> diff --git a/Documentation/vm/hmm.rst b/Documentation/vm/hmm.rst
> index ddcb5ca8b296..ad880e3996b1 100644
> --- a/Documentation/vm/hmm.rst
> +++ b/Documentation/vm/hmm.rst
> @@ -339,58 +339,9 @@ Migration to and from device memory
>   ===================================
>   
>   Because the CPU cannot access device memory, migration must use the device DMA
> -engine to perform copy from and to device memory. For this we need a new
> -migration helper::
> -
> - int migrate_vma(const struct migrate_vma_ops *ops,
> -                 struct vm_area_struct *vma,
> -                 unsigned long mentries,
> -                 unsigned long start,
> -                 unsigned long end,
> -                 unsigned long *src,
> -                 unsigned long *dst,
> -                 void *private);
> -
> -Unlike other migration functions it works on a range of virtual address, there
> -are two reasons for that. First, device DMA copy has a high setup overhead cost
> -and thus batching multiple pages is needed as otherwise the migration overhead
> -makes the whole exercise pointless. The second reason is because the
> -migration might be for a range of addresses the device is actively accessing.
> -
> -The migrate_vma_ops struct defines two callbacks. First one (alloc_and_copy())
> -controls destination memory allocation and copy operation. Second one is there
> -to allow the device driver to perform cleanup operations after migration::
> -
> - struct migrate_vma_ops {
> -     void (*alloc_and_copy)(struct vm_area_struct *vma,
> -                            const unsigned long *src,
> -                            unsigned long *dst,
> -                            unsigned long start,
> -                            unsigned long end,
> -                            void *private);
> -     void (*finalize_and_map)(struct vm_area_struct *vma,
> -                              const unsigned long *src,
> -                              const unsigned long *dst,
> -                              unsigned long start,
> -                              unsigned long end,
> -                              void *private);
> - };
> -
> -It is important to stress that these migration helpers allow for holes in the
> -virtual address range. Some pages in the range might not be migrated for all
> -the usual reasons (page is pinned, page is locked, ...). This helper does not
> -fail but just skips over those pages.
> -
> -The alloc_and_copy() might decide to not migrate all pages in the
> -range (for reasons under the callback control). For those, the callback just
> -has to leave the corresponding dst entry empty.
> -
> -Finally, the migration of the struct page might fail (for file backed page) for
> -various reasons (failure to freeze reference, or update page cache, ...). If
> -that happens, then the finalize_and_map() can catch any pages that were not
> -migrated. Note those pages were still copied to a new page and thus we wasted
> -bandwidth but this is considered as a rare event and a price that we are
> -willing to pay to keep all the code simpler.
> +engine to perform copy from and to device memory. For this we need a new to
> +use migrate_vma_setup(), migrate_vma_pages(), and migrate_vma_finalize()
> +helpers.
>   
>   
>   Memory cgroup (memcg) and rss accounting
> diff --git a/drivers/gpu/drm/nouveau/nouveau_dmem.c b/drivers/gpu/drm/nouveau/nouveau_dmem.c
> index 345c63cb752a..38416798abd4 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_dmem.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_dmem.c
> @@ -131,9 +131,8 @@ nouveau_dmem_fault_alloc_and_copy(struct vm_area_struct *vma,
>   				  unsigned long *dst_pfns,
>   				  unsigned long start,
>   				  unsigned long end,
> -				  void *private)
> +				  struct nouveau_dmem_fault *fault)
>   {
> -	struct nouveau_dmem_fault *fault = private;
>   	struct nouveau_drm *drm = fault->drm;
>   	struct device *dev = drm->dev->dev;
>   	unsigned long addr, i, npages = 0;
> @@ -230,14 +229,9 @@ nouveau_dmem_fault_alloc_and_copy(struct vm_area_struct *vma,
>   	}
>   }
>   
> -void nouveau_dmem_fault_finalize_and_map(struct vm_area_struct *vma,
> -					 const unsigned long *src_pfns,
> -					 const unsigned long *dst_pfns,
> -					 unsigned long start,
> -					 unsigned long end,
> -					 void *private)
> +static void
> +nouveau_dmem_fault_finalize_and_map(struct nouveau_dmem_fault *fault)
>   {
> -	struct nouveau_dmem_fault *fault = private;
>   	struct nouveau_drm *drm = fault->drm;
>   
>   	if (fault->fence) {
> @@ -257,29 +251,35 @@ void nouveau_dmem_fault_finalize_and_map(struct vm_area_struct *vma,
>   	kfree(fault->dma);
>   }
>   
> -static const struct migrate_vma_ops nouveau_dmem_fault_migrate_ops = {
> -	.alloc_and_copy		= nouveau_dmem_fault_alloc_and_copy,
> -	.finalize_and_map	= nouveau_dmem_fault_finalize_and_map,
> -};
> -
>   static vm_fault_t nouveau_dmem_migrate_to_ram(struct vm_fault *vmf)
>   {
>   	struct nouveau_dmem *dmem = page_to_dmem(vmf->page);
>   	unsigned long src[1] = {0}, dst[1] = {0};
> +	struct migrate_vma args = {
> +		.vma		= vmf->vma,
> +		.start		= vmf->address,
> +		.end		= vmf->address + PAGE_SIZE,
> +		.src		= src,
> +		.dst		= dst,
> +	};
>   	struct nouveau_dmem_fault fault = { .drm = dmem->drm };
> -	int ret;
>   
>   	/*
>   	 * FIXME what we really want is to find some heuristic to migrate more
>   	 * than just one page on CPU fault. When such fault happens it is very
>   	 * likely that more surrounding page will CPU fault too.
>   	 */
> -	ret = migrate_vma(&nouveau_dmem_fault_migrate_ops, vmf->vma,
> -			vmf->address, vmf->address + PAGE_SIZE,
> -			src, dst, &fault);
> -	if (ret)
> +	if (migrate_vma_setup(&args) < 0)
>   		return VM_FAULT_SIGBUS;
> +	if (!args.cpages)
> +		return 0;
> +
> +	nouveau_dmem_fault_alloc_and_copy(args.vma, src, dst, args.start,
> +			args.end, &fault);
> +	migrate_vma_pages(&args);
> +	nouveau_dmem_fault_finalize_and_map(&fault);
>   
> +	migrate_vma_finalize(&args);
>   	if (dst[0] == MIGRATE_PFN_ERROR)
>   		return VM_FAULT_SIGBUS;
>   
> @@ -648,9 +648,8 @@ nouveau_dmem_migrate_alloc_and_copy(struct vm_area_struct *vma,
>   				    unsigned long *dst_pfns,
>   				    unsigned long start,
>   				    unsigned long end,
> -				    void *private)
> +				    struct nouveau_migrate *migrate)
>   {
> -	struct nouveau_migrate *migrate = private;
>   	struct nouveau_drm *drm = migrate->drm;
>   	struct device *dev = drm->dev->dev;
>   	unsigned long addr, i, npages = 0;
> @@ -747,14 +746,9 @@ nouveau_dmem_migrate_alloc_and_copy(struct vm_area_struct *vma,
>   	}
>   }
>   
> -void nouveau_dmem_migrate_finalize_and_map(struct vm_area_struct *vma,
> -					   const unsigned long *src_pfns,
> -					   const unsigned long *dst_pfns,
> -					   unsigned long start,
> -					   unsigned long end,
> -					   void *private)
> +static void
> +nouveau_dmem_migrate_finalize_and_map(struct nouveau_migrate *migrate)
>   {
> -	struct nouveau_migrate *migrate = private;
>   	struct nouveau_drm *drm = migrate->drm;
>   
>   	if (migrate->fence) {
> @@ -779,10 +773,15 @@ void nouveau_dmem_migrate_finalize_and_map(struct vm_area_struct *vma,
>   	 */
>   }
>   
> -static const struct migrate_vma_ops nouveau_dmem_migrate_ops = {
> -	.alloc_and_copy		= nouveau_dmem_migrate_alloc_and_copy,
> -	.finalize_and_map	= nouveau_dmem_migrate_finalize_and_map,
> -};
> +static void nouveau_dmem_migrate_chunk(struct migrate_vma *args,
> +		struct nouveau_migrate *migrate)
> +{
> +	nouveau_dmem_migrate_alloc_and_copy(args->vma, args->src, args->dst,
> +			args->start, args->end, migrate);
> +	migrate_vma_pages(args);
> +	nouveau_dmem_migrate_finalize_and_map(migrate);
> +	migrate_vma_finalize(args);
> +}
>   
>   int
>   nouveau_dmem_migrate_vma(struct nouveau_drm *drm,
> @@ -790,40 +789,45 @@ nouveau_dmem_migrate_vma(struct nouveau_drm *drm,
>   			 unsigned long start,
>   			 unsigned long end)
>   {
> -	unsigned long *src_pfns, *dst_pfns, npages;
> -	struct nouveau_migrate migrate = {0};
> -	unsigned long i, c, max;
> -	int ret = 0;
> -
> -	npages = (end - start) >> PAGE_SHIFT;
> -	max = min(SG_MAX_SINGLE_ALLOC, npages);
> -	src_pfns = kzalloc(sizeof(long) * max, GFP_KERNEL);
> -	if (src_pfns == NULL)
> -		return -ENOMEM;
> -	dst_pfns = kzalloc(sizeof(long) * max, GFP_KERNEL);
> -	if (dst_pfns == NULL) {
> -		kfree(src_pfns);
> -		return -ENOMEM;
> -	}
> +	unsigned long npages = (end - start) >> PAGE_SHIFT;
> +	unsigned long max = min(SG_MAX_SINGLE_ALLOC, npages);
> +	struct migrate_vma args = {
> +		.vma		= vma,
> +		.start		= start,
> +	};
> +	struct nouveau_migrate migrate = {
> +		.drm		= drm,
> +		.vma		= vma,
> +		.npages		= npages,
> +	};
> +	unsigned long c, i;
> +	int ret = -ENOMEM;
> +
> +	args.src = kzalloc(sizeof(long) * max, GFP_KERNEL);
> +	if (!args.src)
> +		goto out;
> +	args.dst = kzalloc(sizeof(long) * max, GFP_KERNEL);
> +	if (!args.dst)
> +		goto out_free_src;
>   
> -	migrate.drm = drm;
> -	migrate.vma = vma;
> -	migrate.npages = npages;
>   	for (i = 0; i < npages; i += c) {
> -		unsigned long next;
> -
>   		c = min(SG_MAX_SINGLE_ALLOC, npages);
> -		next = start + (c << PAGE_SHIFT);
> -		ret = migrate_vma(&nouveau_dmem_migrate_ops, vma, start,
> -				  next, src_pfns, dst_pfns, &migrate);
> +		args.end = start + (c << PAGE_SHIFT);
> +		ret = migrate_vma_setup(&args);
>   		if (ret)
> -			goto out;
> -		start = next;
> +			goto out_free_dst;
> +
> +		if (args.cpages)
> +			nouveau_dmem_migrate_chunk(&args, &migrate);
> +		args.start = args.end;
>   	}
>   
> +	ret = 0;
> +out_free_dst:
> +	kfree(args.dst);
> +out_free_src:
> +	kfree(args.src);
>   out:
> -	kfree(dst_pfns);
> -	kfree(src_pfns);
>   	return ret;
>   }
>   
> diff --git a/include/linux/migrate.h b/include/linux/migrate.h
> index 7f04754c7f2b..093d67fcf6dd 100644
> --- a/include/linux/migrate.h
> +++ b/include/linux/migrate.h
> @@ -182,107 +182,27 @@ static inline unsigned long migrate_pfn(unsigned long pfn)
>   	return (pfn << MIGRATE_PFN_SHIFT) | MIGRATE_PFN_VALID;
>   }
>   
> -/*
> - * struct migrate_vma_ops - migrate operation callback
> - *
> - * @alloc_and_copy: alloc destination memory and copy source memory to it
> - * @finalize_and_map: allow caller to map the successfully migrated pages
> - *
> - *
> - * The alloc_and_copy() callback happens once all source pages have been locked,
> - * unmapped and checked (checked whether pinned or not). All pages that can be
> - * migrated will have an entry in the src array set with the pfn value of the
> - * page and with the MIGRATE_PFN_VALID and MIGRATE_PFN_MIGRATE flag set (other
> - * flags might be set but should be ignored by the callback).
> - *
> - * The alloc_and_copy() callback can then allocate destination memory and copy
> - * source memory to it for all those entries (ie with MIGRATE_PFN_VALID and
> - * MIGRATE_PFN_MIGRATE flag set). Once these are allocated and copied, the
> - * callback must update each corresponding entry in the dst array with the pfn
> - * value of the destination page and with the MIGRATE_PFN_VALID and
> - * MIGRATE_PFN_LOCKED flags set (destination pages must have their struct pages
> - * locked, via lock_page()).
> - *
> - * At this point the alloc_and_copy() callback is done and returns.
> - *
> - * Note that the callback does not have to migrate all the pages that are
> - * marked with MIGRATE_PFN_MIGRATE flag in src array unless this is a migration
> - * from device memory to system memory (ie the MIGRATE_PFN_DEVICE flag is also
> - * set in the src array entry). If the device driver cannot migrate a device
> - * page back to system memory, then it must set the corresponding dst array
> - * entry to MIGRATE_PFN_ERROR. This will trigger a SIGBUS if CPU tries to
> - * access any of the virtual addresses originally backed by this page. Because
> - * a SIGBUS is such a severe result for the userspace process, the device
> - * driver should avoid setting MIGRATE_PFN_ERROR unless it is really in an
> - * unrecoverable state.
> - *
> - * For empty entry inside CPU page table (pte_none() or pmd_none() is true) we
> - * do set MIGRATE_PFN_MIGRATE flag inside the corresponding source array thus
> - * allowing device driver to allocate device memory for those unback virtual
> - * address. For this the device driver simply have to allocate device memory
> - * and properly set the destination entry like for regular migration. Note that
> - * this can still fails and thus inside the device driver must check if the
> - * migration was successful for those entry inside the finalize_and_map()
> - * callback just like for regular migration.
> - *
> - * THE alloc_and_copy() CALLBACK MUST NOT CHANGE ANY OF THE SRC ARRAY ENTRIES
> - * OR BAD THINGS WILL HAPPEN !
> - *
> - *
> - * The finalize_and_map() callback happens after struct page migration from
> - * source to destination (destination struct pages are the struct pages for the
> - * memory allocated by the alloc_and_copy() callback).  Migration can fail, and
> - * thus the finalize_and_map() allows the driver to inspect which pages were
> - * successfully migrated, and which were not. Successfully migrated pages will
> - * have the MIGRATE_PFN_MIGRATE flag set for their src array entry.
> - *
> - * It is safe to update device page table from within the finalize_and_map()
> - * callback because both destination and source page are still locked, and the
> - * mmap_sem is held in read mode (hence no one can unmap the range being
> - * migrated).
> - *
> - * Once callback is done cleaning up things and updating its page table (if it
> - * chose to do so, this is not an obligation) then it returns. At this point,
> - * the HMM core will finish up the final steps, and the migration is complete.
> - *
> - * THE finalize_and_map() CALLBACK MUST NOT CHANGE ANY OF THE SRC OR DST ARRAY
> - * ENTRIES OR BAD THINGS WILL HAPPEN !
> - */
> -struct migrate_vma_ops {
> -	void (*alloc_and_copy)(struct vm_area_struct *vma,
> -			       const unsigned long *src,
> -			       unsigned long *dst,
> -			       unsigned long start,
> -			       unsigned long end,
> -			       void *private);
> -	void (*finalize_and_map)(struct vm_area_struct *vma,
> -				 const unsigned long *src,
> -				 const unsigned long *dst,
> -				 unsigned long start,
> -				 unsigned long end,
> -				 void *private);
> +struct migrate_vma {
> +	struct vm_area_struct	*vma;
> + 	/*
> +	 * Both src and dst array must be big enough for
> +	 * (end - start) >> PAGE_SHIFT entries.
> +	 *
> +	 * The src array must not be modified by the caller after
> +	 * migrate_vma_setup(), and must not change the dst array after
> +	 * migrate_vma_pages() returns.
> +	 */
> +	unsigned long		*dst;
> +	unsigned long		*src;
> +	unsigned long		cpages;
> +	unsigned long		npages;
> +	unsigned long		start;
> +	unsigned long		end;
>   };
>   
> -#if defined(CONFIG_MIGRATE_VMA_HELPER)
> -int migrate_vma(const struct migrate_vma_ops *ops,
> -		struct vm_area_struct *vma,
> -		unsigned long start,
> -		unsigned long end,
> -		unsigned long *src,
> -		unsigned long *dst,
> -		void *private);
> -#else
> -static inline int migrate_vma(const struct migrate_vma_ops *ops,
> -			      struct vm_area_struct *vma,
> -			      unsigned long start,
> -			      unsigned long end,
> -			      unsigned long *src,
> -			      unsigned long *dst,
> -			      void *private)
> -{
> -	return -EINVAL;
> -}
> -#endif /* IS_ENABLED(CONFIG_MIGRATE_VMA_HELPER) */
> +int migrate_vma_setup(struct migrate_vma *args);
> +void migrate_vma_pages(struct migrate_vma *migrate);
> +void migrate_vma_finalize(struct migrate_vma *migrate);
>   
>   #endif /* CONFIG_MIGRATION */
>   
> diff --git a/mm/migrate.c b/mm/migrate.c
> index 8992741f10aa..dc4e60a496f2 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -2118,16 +2118,6 @@ int migrate_misplaced_transhuge_page(struct mm_struct *mm,
>   #endif /* CONFIG_NUMA */
>   
>   #if defined(CONFIG_MIGRATE_VMA_HELPER)
> -struct migrate_vma {
> -	struct vm_area_struct	*vma;
> -	unsigned long		*dst;
> -	unsigned long		*src;
> -	unsigned long		cpages;
> -	unsigned long		npages;
> -	unsigned long		start;
> -	unsigned long		end;
> -};
> -
>   static int migrate_vma_collect_hole(unsigned long start,
>   				    unsigned long end,
>   				    struct mm_walk *walk)
> @@ -2578,6 +2568,108 @@ static void migrate_vma_unmap(struct migrate_vma *migrate)
>   	}
>   }
>   
> +/**
> + * migrate_vma_setup() - prepare to migrate a range of memory
> + * @args: contains the vma, start, and and pfns arrays for the migration
> + *
> + * Returns: negative errno on failures, 0 when 0 or more pages were migrated
> + * without an error.
> + *
> + * Prepare to migrate a range of memory virtual address range by collecting all
> + * the pages backing each virtual address in the range, saving them inside the
> + * src array.  Then lock those pages and unmap them. Once the pages are locked
> + * and unmapped, check whether each page is pinned or not.  Pages that aren't
> + * pinned have the MIGRATE_PFN_MIGRATE flag set (by this function) in the
> + * corresponding src array entry.  Then restores any pages that are pinned, by
> + * remapping and unlocking those pages.
> + *
> + * The caller should then allocate destination memory and copy source memory to
> + * it for all those entries (ie with MIGRATE_PFN_VALID and MIGRATE_PFN_MIGRATE
> + * flag set).  Once these are allocated and copied, the caller must update each
> + * corresponding entry in the dst array with the pfn value of the destination
> + * page and with the MIGRATE_PFN_VALID and MIGRATE_PFN_LOCKED flags set
> + * (destination pages must have their struct pages locked, via lock_page()).
> + *
> + * Note that the caller does not have to migrate all the pages that are marked
> + * with MIGRATE_PFN_MIGRATE flag in src array unless this is a migration from
> + * device memory to system memory.  If the caller cannot migrate a device page
> + * back to system memory, then it must return VM_FAULT_SIGBUS, which will
> + * might have severe consequences for the userspace process, so it should best
> + * be avoided if possible.
> + *
> + * For empty entries inside CPU page table (pte_none() or pmd_none() is true) we
> + * do set MIGRATE_PFN_MIGRATE flag inside the corresponding source array thus
> + * allowing the caller to allocate device memory for those unback virtual
> + * address.  For this the caller simply havs to allocate device memory and
> + * properly set the destination entry like for regular migration.  Note that
> + * this can still fails and thus inside the device driver must check if the
> + * migration was successful for those entries after calling migrate_vma_pages()
> + * just like for regular migration.
> + *
> + * After that, the callers must call migrate_vma_pages() to go over each entry
> + * in the src array that has the MIGRATE_PFN_VALID and MIGRATE_PFN_MIGRATE flag
> + * set. If the corresponding entry in dst array has MIGRATE_PFN_VALID flag set,
> + * then migrate_vma_pages() to migrate struct page information from the source
> + * struct page to the destination struct page.  If it fails to migrate the
> + * struct page information, then it clears the MIGRATE_PFN_MIGRATE flag in the
> + * src array.
> + *
> + * At this point all successfully migrated pages have an entry in the src
> + * array with MIGRATE_PFN_VALID and MIGRATE_PFN_MIGRATE flag set and the dst
> + * array entry with MIGRATE_PFN_VALID flag set.
> + *
> + * Once migrate_vma_pages() returns the caller may inspect which pages were
> + * successfully migrated, and which were not.  Successfully migrated pages will
> + * have the MIGRATE_PFN_MIGRATE flag set for their src array entry.
> + *
> + * It is safe to update device page table from within the finalize_and_map()
> + * callback because both destination and source page are still locked, and the
> + * mmap_sem is held in read mode (hence no one can unmap the range being
> + * migrated).
> + *
> + * Once the caller is done cleaning up things and updating its page table (if it
> + * chose to do so, this is not an obligation) it finally calls
> + * migrate_vma_finalize() to update the CPU page table to point to new pages
> + * for successfully migrated pages or otherwise restore the CPU page table to
> + * point to the original source pages.
> + */
> +int migrate_vma_setup(struct migrate_vma *args)
> +{
> +	long nr_pages = (args->end - args->start) >> PAGE_SHIFT;
> +
> +	args->start &= PAGE_MASK;
> +	args->end &= PAGE_MASK;
> +	if (!args->vma || is_vm_hugetlb_page(args->vma) ||
> +	    (args->vma->vm_flags & VM_SPECIAL) || vma_is_dax(args->vma))
> +		return -EINVAL;
> +	if (nr_pages <= 0)
> +		return -EINVAL;
> +	if (args->start < args->vma->vm_start ||
> +	    args->start >= args->vma->vm_end)
> +		return -EINVAL;
> +	if (args->end <= args->vma->vm_start || args->end > args->vma->vm_end)
> +		return -EINVAL;
> +	if (!args->src || !args->dst)
> +		return -EINVAL;
> +
> +	memset(args->src, 0, sizeof(*args->src) * nr_pages);
> +
> +	migrate_vma_collect(args);
> +	if (args->cpages)
> +		migrate_vma_prepare(args);
> +	if (args->cpages)
> +		migrate_vma_unmap(args);
> +
> +	/*
> +	 * At this point pages are locked and unmapped, and thus they have
> +	 * stable content and can safely be copied to destination memory that
> +	 * is allocated by the drivers.
> +	 */
> +	return 0;
> +
> +}
> +EXPORT_SYMBOL(migrate_vma_setup);
> +
>   static void migrate_vma_insert_page(struct migrate_vma *migrate,
>   				    unsigned long addr,
>   				    struct page *page,
> @@ -2709,7 +2801,7 @@ static void migrate_vma_insert_page(struct migrate_vma *migrate,
>   	*src &= ~MIGRATE_PFN_MIGRATE;
>   }
>   
> -/*
> +/**
>    * migrate_vma_pages() - migrate meta-data from src page to dst page
>    * @migrate: migrate struct containing all migration information
>    *
> @@ -2717,7 +2809,7 @@ static void migrate_vma_insert_page(struct migrate_vma *migrate,
>    * struct page. This effectively finishes the migration from source page to the
>    * destination page.
>    */
> -static void migrate_vma_pages(struct migrate_vma *migrate)
> +void migrate_vma_pages(struct migrate_vma *migrate)
>   {
>   	const unsigned long npages = migrate->npages;
>   	const unsigned long start = migrate->start;
> @@ -2791,8 +2883,9 @@ static void migrate_vma_pages(struct migrate_vma *migrate)
>   	if (notified)
>   		mmu_notifier_invalidate_range_only_end(&range);
>   }
> +EXPORT_SYMBOL(migrate_vma_pages);
>   
> -/*
> +/**
>    * migrate_vma_finalize() - restore CPU page table entry
>    * @migrate: migrate struct containing all migration information
>    *
> @@ -2803,7 +2896,7 @@ static void migrate_vma_pages(struct migrate_vma *migrate)
>    * This also unlocks the pages and puts them back on the lru, or drops the extra
>    * refcount, for device pages.
>    */
> -static void migrate_vma_finalize(struct migrate_vma *migrate)
> +void migrate_vma_finalize(struct migrate_vma *migrate)
>   {
>   	const unsigned long npages = migrate->npages;
>   	unsigned long i;
> @@ -2846,124 +2939,5 @@ static void migrate_vma_finalize(struct migrate_vma *migrate)
>   		}
>   	}
>   }
> -
> -/*
> - * migrate_vma() - migrate a range of memory inside vma
> - *
> - * @ops: migration callback for allocating destination memory and copying
> - * @vma: virtual memory area containing the range to be migrated
> - * @start: start address of the range to migrate (inclusive)
> - * @end: end address of the range to migrate (exclusive)
> - * @src: array of hmm_pfn_t containing source pfns
> - * @dst: array of hmm_pfn_t containing destination pfns
> - * @private: pointer passed back to each of the callback
> - * Returns: 0 on success, error code otherwise
> - *
> - * This function tries to migrate a range of memory virtual address range, using
> - * callbacks to allocate and copy memory from source to destination. First it
> - * collects all the pages backing each virtual address in the range, saving this
> - * inside the src array. Then it locks those pages and unmaps them. Once the pages
> - * are locked and unmapped, it checks whether each page is pinned or not. Pages
> - * that aren't pinned have the MIGRATE_PFN_MIGRATE flag set (by this function)
> - * in the corresponding src array entry. It then restores any pages that are
> - * pinned, by remapping and unlocking those pages.
> - *
> - * At this point it calls the alloc_and_copy() callback. For documentation on
> - * what is expected from that callback, see struct migrate_vma_ops comments in
> - * include/linux/migrate.h
> - *
> - * After the alloc_and_copy() callback, this function goes over each entry in
> - * the src array that has the MIGRATE_PFN_VALID and MIGRATE_PFN_MIGRATE flag
> - * set. If the corresponding entry in dst array has MIGRATE_PFN_VALID flag set,
> - * then the function tries to migrate struct page information from the source
> - * struct page to the destination struct page. If it fails to migrate the struct
> - * page information, then it clears the MIGRATE_PFN_MIGRATE flag in the src
> - * array.
> - *
> - * At this point all successfully migrated pages have an entry in the src
> - * array with MIGRATE_PFN_VALID and MIGRATE_PFN_MIGRATE flag set and the dst
> - * array entry with MIGRATE_PFN_VALID flag set.
> - *
> - * It then calls the finalize_and_map() callback. See comments for "struct
> - * migrate_vma_ops", in include/linux/migrate.h for details about
> - * finalize_and_map() behavior.
> - *
> - * After the finalize_and_map() callback, for successfully migrated pages, this
> - * function updates the CPU page table to point to new pages, otherwise it
> - * restores the CPU page table to point to the original source pages.
> - *
> - * Function returns 0 after the above steps, even if no pages were migrated
> - * (The function only returns an error if any of the arguments are invalid.)
> - *
> - * Both src and dst array must be big enough for (end - start) >> PAGE_SHIFT
> - * unsigned long entries.
> - */
> -int migrate_vma(const struct migrate_vma_ops *ops,
> -		struct vm_area_struct *vma,
> -		unsigned long start,
> -		unsigned long end,
> -		unsigned long *src,
> -		unsigned long *dst,
> -		void *private)
> -{
> -	struct migrate_vma migrate;
> -
> -	/* Sanity check the arguments */
> -	start &= PAGE_MASK;
> -	end &= PAGE_MASK;
> -	if (!vma || is_vm_hugetlb_page(vma) || (vma->vm_flags & VM_SPECIAL) ||
> -			vma_is_dax(vma))
> -		return -EINVAL;
> -	if (start < vma->vm_start || start >= vma->vm_end)
> -		return -EINVAL;
> -	if (end <= vma->vm_start || end > vma->vm_end)
> -		return -EINVAL;
> -	if (!ops || !src || !dst || start >= end)
> -		return -EINVAL;
> -
> -	memset(src, 0, sizeof(*src) * ((end - start) >> PAGE_SHIFT));
> -	migrate.src = src;
> -	migrate.dst = dst;
> -	migrate.start = start;
> -	migrate.npages = 0;
> -	migrate.cpages = 0;
> -	migrate.end = end;
> -	migrate.vma = vma;
> -
> -	/* Collect, and try to unmap source pages */
> -	migrate_vma_collect(&migrate);
> -	if (!migrate.cpages)
> -		return 0;
> -
> -	/* Lock and isolate page */
> -	migrate_vma_prepare(&migrate);
> -	if (!migrate.cpages)
> -		return 0;
> -
> -	/* Unmap pages */
> -	migrate_vma_unmap(&migrate);
> -	if (!migrate.cpages)
> -		return 0;
> -
> -	/*
> -	 * At this point pages are locked and unmapped, and thus they have
> -	 * stable content and can safely be copied to destination memory that
> -	 * is allocated by the callback.
> -	 *
> -	 * Note that migration can fail in migrate_vma_struct_page() for each
> -	 * individual page.
> -	 */
> -	ops->alloc_and_copy(vma, src, dst, start, end, private);
> -
> -	/* This does the real migration of struct page */
> -	migrate_vma_pages(&migrate);
> -
> -	ops->finalize_and_map(vma, src, dst, start, end, private);
> -
> -	/* Unlock and remap pages */
> -	migrate_vma_finalize(&migrate);
> -
> -	return 0;
> -}
> -EXPORT_SYMBOL(migrate_vma);
> +EXPORT_SYMBOL(migrate_vma_finalize);
>   #endif /* defined(MIGRATE_VMA_HELPER) */
> 

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 2/9] nouveau: reset dma_nr in nouveau_dmem_migrate_alloc_and_copy
  2019-07-29 14:28 ` [PATCH 2/9] nouveau: reset dma_nr in nouveau_dmem_migrate_alloc_and_copy Christoph Hellwig
@ 2019-07-29 23:18   ` Ralph Campbell
  0 siblings, 0 replies; 30+ messages in thread
From: Ralph Campbell @ 2019-07-29 23:18 UTC (permalink / raw)
  To: Christoph Hellwig, Jérôme Glisse, Jason Gunthorpe, Ben Skeggs
  Cc: Bharata B Rao, Andrew Morton, linux-mm, nouveau, dri-devel, linux-kernel


On 7/29/19 7:28 AM, Christoph Hellwig wrote:
> When we start a new batch of dma_map operations we need to reset dma_nr,
> as we start filling a newly allocated array.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>

Reviewed-by: Ralph Campbell <rcampbell@nvidia.com>

> ---
>   drivers/gpu/drm/nouveau/nouveau_dmem.c | 1 +
>   1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/gpu/drm/nouveau/nouveau_dmem.c b/drivers/gpu/drm/nouveau/nouveau_dmem.c
> index 38416798abd4..e696157f771e 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_dmem.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_dmem.c
> @@ -682,6 +682,7 @@ nouveau_dmem_migrate_alloc_and_copy(struct vm_area_struct *vma,
>   	migrate->dma = kmalloc(sizeof(*migrate->dma) * npages, GFP_KERNEL);
>   	if (!migrate->dma)
>   		goto error;
> +	migrate->dma_nr = 0;
>   
>   	/* Copy things over */
>   	copy = drm->dmem->migrate.copy_func;
> 

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 3/9] nouveau: factor out device memory address calculation
  2019-07-29 14:28 ` [PATCH 3/9] nouveau: factor out device memory address calculation Christoph Hellwig
@ 2019-07-29 23:21   ` Ralph Campbell
  0 siblings, 0 replies; 30+ messages in thread
From: Ralph Campbell @ 2019-07-29 23:21 UTC (permalink / raw)
  To: Christoph Hellwig, Jérôme Glisse, Jason Gunthorpe, Ben Skeggs
  Cc: Bharata B Rao, Andrew Morton, linux-mm, nouveau, dri-devel, linux-kernel


On 7/29/19 7:28 AM, Christoph Hellwig wrote:
> Factor out the repeated device memory address calculation into
> a helper.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>

Reviewed-by: Ralph Campbell <rcampbell@nvidia.com>

> ---
>   drivers/gpu/drm/nouveau/nouveau_dmem.c | 42 +++++++++++---------------
>   1 file changed, 17 insertions(+), 25 deletions(-)
> 
> diff --git a/drivers/gpu/drm/nouveau/nouveau_dmem.c b/drivers/gpu/drm/nouveau/nouveau_dmem.c
> index e696157f771e..d469bc334438 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_dmem.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_dmem.c
> @@ -102,6 +102,14 @@ struct nouveau_migrate {
>   	unsigned long dma_nr;
>   };
>   
> +static unsigned long nouveau_dmem_page_addr(struct page *page)
> +{
> +	struct nouveau_dmem_chunk *chunk = page->zone_device_data;
> +	unsigned long idx = page_to_pfn(page) - chunk->pfn_first;
> +
> +	return (idx << PAGE_SHIFT) + chunk->bo->bo.offset;
> +}
> +
>   static void nouveau_dmem_page_free(struct page *page)
>   {
>   	struct nouveau_dmem_chunk *chunk = page->zone_device_data;
> @@ -169,9 +177,7 @@ nouveau_dmem_fault_alloc_and_copy(struct vm_area_struct *vma,
>   	/* Copy things over */
>   	copy = drm->dmem->migrate.copy_func;
>   	for (addr = start, i = 0; addr < end; addr += PAGE_SIZE, i++) {
> -		struct nouveau_dmem_chunk *chunk;
>   		struct page *spage, *dpage;
> -		u64 src_addr, dst_addr;
>   
>   		dpage = migrate_pfn_to_page(dst_pfns[i]);
>   		if (!dpage || dst_pfns[i] == MIGRATE_PFN_ERROR)
> @@ -194,14 +200,10 @@ nouveau_dmem_fault_alloc_and_copy(struct vm_area_struct *vma,
>   			continue;
>   		}
>   
> -		dst_addr = fault->dma[fault->npages++];
> -
> -		chunk = spage->zone_device_data;
> -		src_addr = page_to_pfn(spage) - chunk->pfn_first;
> -		src_addr = (src_addr << PAGE_SHIFT) + chunk->bo->bo.offset;
> -
> -		ret = copy(drm, 1, NOUVEAU_APER_HOST, dst_addr,
> -				   NOUVEAU_APER_VRAM, src_addr);
> +		ret = copy(drm, 1, NOUVEAU_APER_HOST,
> +				fault->dma[fault->npages++],
> +				NOUVEAU_APER_VRAM,
> +				nouveau_dmem_page_addr(spage));
>   		if (ret) {
>   			dst_pfns[i] = MIGRATE_PFN_ERROR;
>   			__free_page(dpage);
> @@ -687,18 +689,12 @@ nouveau_dmem_migrate_alloc_and_copy(struct vm_area_struct *vma,
>   	/* Copy things over */
>   	copy = drm->dmem->migrate.copy_func;
>   	for (addr = start, i = 0; addr < end; addr += PAGE_SIZE, i++) {
> -		struct nouveau_dmem_chunk *chunk;
>   		struct page *spage, *dpage;
> -		u64 src_addr, dst_addr;
>   
>   		dpage = migrate_pfn_to_page(dst_pfns[i]);
>   		if (!dpage || dst_pfns[i] == MIGRATE_PFN_ERROR)
>   			continue;
>   
> -		chunk = dpage->zone_device_data;
> -		dst_addr = page_to_pfn(dpage) - chunk->pfn_first;
> -		dst_addr = (dst_addr << PAGE_SHIFT) + chunk->bo->bo.offset;
> -
>   		spage = migrate_pfn_to_page(src_pfns[i]);
>   		if (!spage || !(src_pfns[i] & MIGRATE_PFN_MIGRATE)) {
>   			nouveau_dmem_page_free_locked(drm, dpage);
> @@ -716,10 +712,10 @@ nouveau_dmem_migrate_alloc_and_copy(struct vm_area_struct *vma,
>   			continue;
>   		}
>   
> -		src_addr = migrate->dma[migrate->dma_nr++];
> -
> -		ret = copy(drm, 1, NOUVEAU_APER_VRAM, dst_addr,
> -				   NOUVEAU_APER_HOST, src_addr);
> +		ret = copy(drm, 1, NOUVEAU_APER_VRAM,
> +				nouveau_dmem_page_addr(dpage),
> +				NOUVEAU_APER_HOST,
> +				migrate->dma[migrate->dma_nr++]);
>   		if (ret) {
>   			nouveau_dmem_page_free_locked(drm, dpage);
>   			dst_pfns[i] = 0;
> @@ -846,7 +842,6 @@ nouveau_dmem_convert_pfn(struct nouveau_drm *drm,
>   
>   	npages = (range->end - range->start) >> PAGE_SHIFT;
>   	for (i = 0; i < npages; ++i) {
> -		struct nouveau_dmem_chunk *chunk;
>   		struct page *page;
>   		uint64_t addr;
>   
> @@ -864,10 +859,7 @@ nouveau_dmem_convert_pfn(struct nouveau_drm *drm,
>   			continue;
>   		}
>   
> -		chunk = page->zone_device_data;
> -		addr = page_to_pfn(page) - chunk->pfn_first;
> -		addr = (addr + chunk->bo->bo.mem.start) << PAGE_SHIFT;
> -
> +		addr = nouveau_dmem_page_addr(page);
>   		range->pfns[i] &= ((1UL << range->pfn_shift) - 1);
>   		range->pfns[i] |= (addr >> PAGE_SHIFT) << range->pfn_shift;
>   	}
> 

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 4/9] nouveau: factor out dmem fence completion
  2019-07-29 14:28 ` [PATCH 4/9] nouveau: factor out dmem fence completion Christoph Hellwig
@ 2019-07-29 23:23   ` Ralph Campbell
  0 siblings, 0 replies; 30+ messages in thread
From: Ralph Campbell @ 2019-07-29 23:23 UTC (permalink / raw)
  To: Christoph Hellwig, Jérôme Glisse, Jason Gunthorpe, Ben Skeggs
  Cc: Bharata B Rao, Andrew Morton, linux-mm, nouveau, dri-devel, linux-kernel


On 7/29/19 7:28 AM, Christoph Hellwig wrote:
> Factor out the end of fencing logic from the two migration routines.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>

Reviewed-by: Ralph Campbell <rcampbell@nvidia.com>

> ---
>   drivers/gpu/drm/nouveau/nouveau_dmem.c | 33 ++++++++++++--------------
>   1 file changed, 15 insertions(+), 18 deletions(-)
> 
> diff --git a/drivers/gpu/drm/nouveau/nouveau_dmem.c b/drivers/gpu/drm/nouveau/nouveau_dmem.c
> index d469bc334438..21052a4aaf69 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_dmem.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_dmem.c
> @@ -133,6 +133,19 @@ static void nouveau_dmem_page_free(struct page *page)
>   	spin_unlock(&chunk->lock);
>   }
>   
> +static void nouveau_dmem_fence_done(struct nouveau_fence **fence)
> +{
> +	if (fence) {
> +		nouveau_fence_wait(*fence, true, false);
> +		nouveau_fence_unref(fence);
> +	} else {
> +		/*
> +		 * FIXME wait for channel to be IDLE before calling finalizing
> +		 * the hmem object.
> +		 */
> +	}
> +}
> +
>   static void
>   nouveau_dmem_fault_alloc_and_copy(struct vm_area_struct *vma,
>   				  const unsigned long *src_pfns,
> @@ -236,15 +249,7 @@ nouveau_dmem_fault_finalize_and_map(struct nouveau_dmem_fault *fault)
>   {
>   	struct nouveau_drm *drm = fault->drm;
>   
> -	if (fault->fence) {
> -		nouveau_fence_wait(fault->fence, true, false);
> -		nouveau_fence_unref(&fault->fence);
> -	} else {
> -		/*
> -		 * FIXME wait for channel to be IDLE before calling finalizing
> -		 * the hmem object below (nouveau_migrate_hmem_fini()).
> -		 */
> -	}
> +	nouveau_dmem_fence_done(&fault->fence);
>   
>   	while (fault->npages--) {
>   		dma_unmap_page(drm->dev->dev, fault->dma[fault->npages],
> @@ -748,15 +753,7 @@ nouveau_dmem_migrate_finalize_and_map(struct nouveau_migrate *migrate)
>   {
>   	struct nouveau_drm *drm = migrate->drm;
>   
> -	if (migrate->fence) {
> -		nouveau_fence_wait(migrate->fence, true, false);
> -		nouveau_fence_unref(&migrate->fence);
> -	} else {
> -		/*
> -		 * FIXME wait for channel to be IDLE before finalizing
> -		 * the hmem object below (nouveau_migrate_hmem_fini()) ?
> -		 */
> -	}
> +	nouveau_dmem_fence_done(&migrate->fence);
>   
>   	while (migrate->dma_nr--) {
>   		dma_unmap_page(drm->dev->dev, migrate->dma[migrate->dma_nr],
> 

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 5/9] nouveau: simplify nouveau_dmem_migrate_to_ram
  2019-07-29 14:28 ` [PATCH 5/9] nouveau: simplify nouveau_dmem_migrate_to_ram Christoph Hellwig
@ 2019-07-29 23:26   ` Ralph Campbell
  2019-07-31  9:57   ` Bharata B Rao
  1 sibling, 0 replies; 30+ messages in thread
From: Ralph Campbell @ 2019-07-29 23:26 UTC (permalink / raw)
  To: Christoph Hellwig, Jérôme Glisse, Jason Gunthorpe, Ben Skeggs
  Cc: Bharata B Rao, Andrew Morton, linux-mm, nouveau, dri-devel, linux-kernel


On 7/29/19 7:28 AM, Christoph Hellwig wrote:
> Factor the main copy page to ram routine out into a helper that acts on
> a single page and which doesn't require the nouveau_dmem_fault
> structure for argument passing.  Also remove the loop over multiple
> pages as we only handle one at the moment, although the structure of
> the main worker function makes it relatively easy to add multi page
> support back if needed in the future.  But at least for now this avoid
> the needed to dynamically allocate memory for the dma addresses in
> what is essentially the page fault path.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>

Reviewed-by: Ralph Campbell <rcampbell@nvidia.com>

> ---
>   drivers/gpu/drm/nouveau/nouveau_dmem.c | 158 ++++++-------------------
>   1 file changed, 39 insertions(+), 119 deletions(-)
> 
> diff --git a/drivers/gpu/drm/nouveau/nouveau_dmem.c b/drivers/gpu/drm/nouveau/nouveau_dmem.c
> index 21052a4aaf69..036e6c07d489 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_dmem.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_dmem.c
> @@ -86,13 +86,6 @@ static inline struct nouveau_dmem *page_to_dmem(struct page *page)
>   	return container_of(page->pgmap, struct nouveau_dmem, pagemap);
>   }
>   
> -struct nouveau_dmem_fault {
> -	struct nouveau_drm *drm;
> -	struct nouveau_fence *fence;
> -	dma_addr_t *dma;
> -	unsigned long npages;
> -};
> -
>   struct nouveau_migrate {
>   	struct vm_area_struct *vma;
>   	struct nouveau_drm *drm;
> @@ -146,130 +139,55 @@ static void nouveau_dmem_fence_done(struct nouveau_fence **fence)
>   	}
>   }
>   
> -static void
> -nouveau_dmem_fault_alloc_and_copy(struct vm_area_struct *vma,
> -				  const unsigned long *src_pfns,
> -				  unsigned long *dst_pfns,
> -				  unsigned long start,
> -				  unsigned long end,
> -				  struct nouveau_dmem_fault *fault)
> +static vm_fault_t nouveau_dmem_fault_copy_one(struct nouveau_drm *drm,
> +		struct vm_area_struct *vma, unsigned long addr,
> +		unsigned long src, unsigned long *dst, dma_addr_t *dma_addr)
>   {
> -	struct nouveau_drm *drm = fault->drm;
>   	struct device *dev = drm->dev->dev;
> -	unsigned long addr, i, npages = 0;
> -	nouveau_migrate_copy_t copy;
> -	int ret;
> -
> -
> -	/* First allocate new memory */
> -	for (addr = start, i = 0; addr < end; addr += PAGE_SIZE, i++) {
> -		struct page *dpage, *spage;
> -
> -		dst_pfns[i] = 0;
> -		spage = migrate_pfn_to_page(src_pfns[i]);
> -		if (!spage || !(src_pfns[i] & MIGRATE_PFN_MIGRATE))
> -			continue;
> +	struct page *dpage, *spage;
>   
> -		dpage = alloc_page_vma(GFP_HIGHUSER, vma, addr);
> -		if (!dpage) {
> -			dst_pfns[i] = MIGRATE_PFN_ERROR;
> -			continue;
> -		}
> -		lock_page(dpage);
> -
> -		dst_pfns[i] = migrate_pfn(page_to_pfn(dpage)) |
> -			      MIGRATE_PFN_LOCKED;
> -		npages++;
> -	}
> +	spage = migrate_pfn_to_page(src);
> +	if (!spage || !(src & MIGRATE_PFN_MIGRATE))
> +		return 0;
>   
> -	/* Allocate storage for DMA addresses, so we can unmap later. */
> -	fault->dma = kmalloc(sizeof(*fault->dma) * npages, GFP_KERNEL);
> -	if (!fault->dma)
> +	dpage = alloc_page_vma(GFP_HIGHUSER, args->vma, addr);
> +	if (!dpage)
>   		goto error;
> +	lock_page(dpage);
>   
> -	/* Copy things over */
> -	copy = drm->dmem->migrate.copy_func;
> -	for (addr = start, i = 0; addr < end; addr += PAGE_SIZE, i++) {
> -		struct page *spage, *dpage;
> -
> -		dpage = migrate_pfn_to_page(dst_pfns[i]);
> -		if (!dpage || dst_pfns[i] == MIGRATE_PFN_ERROR)
> -			continue;
> -
> -		spage = migrate_pfn_to_page(src_pfns[i]);
> -		if (!spage || !(src_pfns[i] & MIGRATE_PFN_MIGRATE)) {
> -			dst_pfns[i] = MIGRATE_PFN_ERROR;
> -			__free_page(dpage);
> -			continue;
> -		}
> -
> -		fault->dma[fault->npages] =
> -			dma_map_page_attrs(dev, dpage, 0, PAGE_SIZE,
> -					   PCI_DMA_BIDIRECTIONAL,
> -					   DMA_ATTR_SKIP_CPU_SYNC);
> -		if (dma_mapping_error(dev, fault->dma[fault->npages])) {
> -			dst_pfns[i] = MIGRATE_PFN_ERROR;
> -			__free_page(dpage);
> -			continue;
> -		}
> -
> -		ret = copy(drm, 1, NOUVEAU_APER_HOST,
> -				fault->dma[fault->npages++],
> -				NOUVEAU_APER_VRAM,
> -				nouveau_dmem_page_addr(spage));
> -		if (ret) {
> -			dst_pfns[i] = MIGRATE_PFN_ERROR;
> -			__free_page(dpage);
> -			continue;
> -		}
> -	}
> +	*dma_addr = dma_map_page(dev, dpage, 0, PAGE_SIZE, DMA_BIDIRECTIONAL);
> +	if (dma_mapping_error(dev, *dma_addr))
> +		goto error_free_page;
>   
> -	nouveau_fence_new(drm->dmem->migrate.chan, false, &fault->fence);
> +	if (drm->dmem->migrate.copy_func(drm, 1, NOUVEAU_APER_HOST, *dma_addr,
> +			NOUVEAU_APER_VRAM, nouveau_dmem_page_addr(spage)))
> +		goto error_dma_unmap;
>   
> -	return;
> +	*dst = migrate_pfn(page_to_pfn(dpage)) | MIGRATE_PFN_LOCKED;

Need a "return 0;" here or you undo the work done.

>   
> +error_dma_unmap:
> +	dma_unmap_page(dev, *dma_addr, PAGE_SIZE, DMA_BIDIRECTIONAL);
> +error_free_page:
> +	__free_page(dpage);
>   error:
> -	for (addr = start, i = 0; addr < end; addr += PAGE_SIZE, ++i) {
> -		struct page *page;
> -
> -		if (!dst_pfns[i] || dst_pfns[i] == MIGRATE_PFN_ERROR)
> -			continue;
> -
> -		page = migrate_pfn_to_page(dst_pfns[i]);
> -		dst_pfns[i] = MIGRATE_PFN_ERROR;
> -		if (page == NULL)
> -			continue;
> -
> -		__free_page(page);
> -	}
> -}
> -
> -static void
> -nouveau_dmem_fault_finalize_and_map(struct nouveau_dmem_fault *fault)
> -{
> -	struct nouveau_drm *drm = fault->drm;
> -
> -	nouveau_dmem_fence_done(&fault->fence);
> -
> -	while (fault->npages--) {
> -		dma_unmap_page(drm->dev->dev, fault->dma[fault->npages],
> -			       PAGE_SIZE, PCI_DMA_BIDIRECTIONAL);
> -	}
> -	kfree(fault->dma);
> +	return VM_FAULT_SIGBUS;
>   }
>   
>   static vm_fault_t nouveau_dmem_migrate_to_ram(struct vm_fault *vmf)
>   {
>   	struct nouveau_dmem *dmem = page_to_dmem(vmf->page);
> -	unsigned long src[1] = {0}, dst[1] = {0};
> +	struct nouveau_drm *drm = dmem->drm;
> +	struct nouveau_fence *fence;
> +	unsigned long src = 0, dst = 0;
> +	dma_addr_t dma_addr = 0;
> +	vm_fault_t ret;
>   	struct migrate_vma args = {
>   		.vma		= vmf->vma,
>   		.start		= vmf->address,
>   		.end		= vmf->address + PAGE_SIZE,
> -		.src		= src,
> -		.dst		= dst,
> +		.src		= &src,
> +		.dst		= &dst,
>   	};
> -	struct nouveau_dmem_fault fault = { .drm = dmem->drm };
>   
>   	/*
>   	 * FIXME what we really want is to find some heuristic to migrate more
> @@ -281,16 +199,18 @@ static vm_fault_t nouveau_dmem_migrate_to_ram(struct vm_fault *vmf)
>   	if (!args.cpages)
>   		return 0;
>   
> -	nouveau_dmem_fault_alloc_and_copy(args.vma, src, dst, args.start,
> -			args.end, &fault);
> -	migrate_vma_pages(&args);
> -	nouveau_dmem_fault_finalize_and_map(&fault);
> +	ret = nouveau_dmem_fault_copy_one(drm, vmf->vma, vmf->address, src,
> +			&dst, &dma_addr);
> +	if (ret || dst == 0)
> +		goto done;
>   
> +	nouveau_fence_new(dmem->migrate.chan, false, &fence);
> +	migrate_vma_pages(&args);
> +	nouveau_dmem_fence_done(&fence);
> +	dma_unmap_page(drm->dev->dev, dma_addr, PAGE_SIZE, DMA_BIDIRECTIONAL);
> +done:
>   	migrate_vma_finalize(&args);
> -	if (dst[0] == MIGRATE_PFN_ERROR)
> -		return VM_FAULT_SIGBUS;
> -
> -	return 0;
> +	return ret;
>   }
>   
>   static const struct dev_pagemap_ops nouveau_dmem_pagemap_ops = {
> 

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 6/9] nouveau: simplify nouveau_dmem_migrate_vma
  2019-07-29 14:28 ` [PATCH 6/9] nouveau: simplify nouveau_dmem_migrate_vma Christoph Hellwig
@ 2019-07-29 23:27   ` Ralph Campbell
  0 siblings, 0 replies; 30+ messages in thread
From: Ralph Campbell @ 2019-07-29 23:27 UTC (permalink / raw)
  To: Christoph Hellwig, Jérôme Glisse, Jason Gunthorpe, Ben Skeggs
  Cc: Bharata B Rao, Andrew Morton, linux-mm, nouveau, dri-devel, linux-kernel


On 7/29/19 7:28 AM, Christoph Hellwig wrote:
> Factor the main copy page to vram routine out into a helper that acts
> on a single page and which doesn't require the nouveau_dmem_migrate
> structure for argument passing.  As an added benefit the new version
> only allocates the dma address array once and reuses it for each
> subsequent chunk of work.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>

Reviewed-by: Ralph Campbell <rcampbell@nvidia.com>

> ---
>   drivers/gpu/drm/nouveau/nouveau_dmem.c | 185 ++++++++-----------------
>   1 file changed, 56 insertions(+), 129 deletions(-)
> 
> diff --git a/drivers/gpu/drm/nouveau/nouveau_dmem.c b/drivers/gpu/drm/nouveau/nouveau_dmem.c
> index 036e6c07d489..6cb930755970 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_dmem.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_dmem.c
> @@ -44,8 +44,6 @@
>   #define DMEM_CHUNK_SIZE (2UL << 20)
>   #define DMEM_CHUNK_NPAGES (DMEM_CHUNK_SIZE >> PAGE_SHIFT)
>   
> -struct nouveau_migrate;
> -
>   enum nouveau_aper {
>   	NOUVEAU_APER_VIRT,
>   	NOUVEAU_APER_VRAM,
> @@ -86,15 +84,6 @@ static inline struct nouveau_dmem *page_to_dmem(struct page *page)
>   	return container_of(page->pgmap, struct nouveau_dmem, pagemap);
>   }
>   
> -struct nouveau_migrate {
> -	struct vm_area_struct *vma;
> -	struct nouveau_drm *drm;
> -	struct nouveau_fence *fence;
> -	unsigned long npages;
> -	dma_addr_t *dma;
> -	unsigned long dma_nr;
> -};
> -
>   static unsigned long nouveau_dmem_page_addr(struct page *page)
>   {
>   	struct nouveau_dmem_chunk *chunk = page->zone_device_data;
> @@ -569,131 +558,67 @@ nouveau_dmem_init(struct nouveau_drm *drm)
>   	drm->dmem = NULL;
>   }
>   
> -static void
> -nouveau_dmem_migrate_alloc_and_copy(struct vm_area_struct *vma,
> -				    const unsigned long *src_pfns,
> -				    unsigned long *dst_pfns,
> -				    unsigned long start,
> -				    unsigned long end,
> -				    struct nouveau_migrate *migrate)
> +static unsigned long nouveau_dmem_migrate_copy_one(struct nouveau_drm *drm,
> +		struct vm_area_struct *vma, unsigned long addr,
> +		unsigned long src, dma_addr_t *dma_addr)
>   {
> -	struct nouveau_drm *drm = migrate->drm;
>   	struct device *dev = drm->dev->dev;
> -	unsigned long addr, i, npages = 0;
> -	nouveau_migrate_copy_t copy;
> -	int ret;
> -
> -	/* First allocate new memory */
> -	for (addr = start, i = 0; addr < end; addr += PAGE_SIZE, i++) {
> -		struct page *dpage, *spage;
> -
> -		dst_pfns[i] = 0;
> -		spage = migrate_pfn_to_page(src_pfns[i]);
> -		if (!spage || !(src_pfns[i] & MIGRATE_PFN_MIGRATE))
> -			continue;
> -
> -		dpage = nouveau_dmem_page_alloc_locked(drm);
> -		if (!dpage)
> -			continue;
> -
> -		dst_pfns[i] = migrate_pfn(page_to_pfn(dpage)) |
> -			      MIGRATE_PFN_LOCKED |
> -			      MIGRATE_PFN_DEVICE;
> -		npages++;
> -	}
> -
> -	if (!npages)
> -		return;
> -
> -	/* Allocate storage for DMA addresses, so we can unmap later. */
> -	migrate->dma = kmalloc(sizeof(*migrate->dma) * npages, GFP_KERNEL);
> -	if (!migrate->dma)
> -		goto error;
> -	migrate->dma_nr = 0;
> -
> -	/* Copy things over */
> -	copy = drm->dmem->migrate.copy_func;
> -	for (addr = start, i = 0; addr < end; addr += PAGE_SIZE, i++) {
> -		struct page *spage, *dpage;
> -
> -		dpage = migrate_pfn_to_page(dst_pfns[i]);
> -		if (!dpage || dst_pfns[i] == MIGRATE_PFN_ERROR)
> -			continue;
> -
> -		spage = migrate_pfn_to_page(src_pfns[i]);
> -		if (!spage || !(src_pfns[i] & MIGRATE_PFN_MIGRATE)) {
> -			nouveau_dmem_page_free_locked(drm, dpage);
> -			dst_pfns[i] = 0;
> -			continue;
> -		}
> -
> -		migrate->dma[migrate->dma_nr] =
> -			dma_map_page_attrs(dev, spage, 0, PAGE_SIZE,
> -					   PCI_DMA_BIDIRECTIONAL,
> -					   DMA_ATTR_SKIP_CPU_SYNC);
> -		if (dma_mapping_error(dev, migrate->dma[migrate->dma_nr])) {
> -			nouveau_dmem_page_free_locked(drm, dpage);
> -			dst_pfns[i] = 0;
> -			continue;
> -		}
> -
> -		ret = copy(drm, 1, NOUVEAU_APER_VRAM,
> -				nouveau_dmem_page_addr(dpage),
> -				NOUVEAU_APER_HOST,
> -				migrate->dma[migrate->dma_nr++]);
> -		if (ret) {
> -			nouveau_dmem_page_free_locked(drm, dpage);
> -			dst_pfns[i] = 0;
> -			continue;
> -		}
> -	}
> +	struct page *dpage, *spage;
>   
> -	nouveau_fence_new(drm->dmem->migrate.chan, false, &migrate->fence);
> +	spage = migrate_pfn_to_page(src);
> +	if (!spage || !(src & MIGRATE_PFN_MIGRATE))
> +		goto out;
>   
> -	return;
> +	dpage = nouveau_dmem_page_alloc_locked(drm);
> +	if (!dpage)
> +		return 0;
>   
> -error:
> -	for (addr = start, i = 0; addr < end; addr += PAGE_SIZE, ++i) {
> -		struct page *page;
> +	*dma_addr = dma_map_page(dev, spage, 0, PAGE_SIZE, DMA_BIDIRECTIONAL);
> +	if (dma_mapping_error(dev, *dma_addr))
> +		goto out_free_page;
>   
> -		if (!dst_pfns[i] || dst_pfns[i] == MIGRATE_PFN_ERROR)
> -			continue;
> +	if (drm->dmem->migrate.copy_func(drm, 1, NOUVEAU_APER_VRAM,
> +			nouveau_dmem_page_addr(dpage), NOUVEAU_APER_HOST,
> +			*dma_addr))
> +		goto out_dma_unmap;
>   
> -		page = migrate_pfn_to_page(dst_pfns[i]);
> -		dst_pfns[i] = MIGRATE_PFN_ERROR;
> -		if (page == NULL)
> -			continue;
> +	return migrate_pfn(page_to_pfn(dpage)) |
> +		MIGRATE_PFN_LOCKED | MIGRATE_PFN_DEVICE;
>   
> -		__free_page(page);
> -	}
> +out_dma_unmap:
> +	dma_unmap_page(dev, *dma_addr, PAGE_SIZE, DMA_BIDIRECTIONAL);
> +out_free_page:
> +	nouveau_dmem_page_free_locked(drm, dpage);
> +out:
> +	return 0;
>   }
>   
> -static void
> -nouveau_dmem_migrate_finalize_and_map(struct nouveau_migrate *migrate)
> +static void nouveau_dmem_migrate_chunk(struct migrate_vma *args,
> +		struct nouveau_drm *drm, dma_addr_t *dma_addrs)
>   {
> -	struct nouveau_drm *drm = migrate->drm;
> +	struct nouveau_fence *fence;
> +	unsigned long addr = args->start, nr_dma = 0, i;
> +
> +	for (i = 0; addr < args->end; i++) {
> +		args->dst[i] = nouveau_dmem_migrate_copy_one(drm, args->vma,
> +				addr, args->src[i], &dma_addrs[nr_dma]);
> +		if (args->dst[i])
> +			nr_dma++;
> +		addr += PAGE_SIZE;
> +	}
>   
> -	nouveau_dmem_fence_done(&migrate->fence);
> +	nouveau_fence_new(drm->dmem->migrate.chan, false, &fence);
> +	migrate_vma_pages(args);
> +	nouveau_dmem_fence_done(&fence);
>   
> -	while (migrate->dma_nr--) {
> -		dma_unmap_page(drm->dev->dev, migrate->dma[migrate->dma_nr],
> -			       PAGE_SIZE, PCI_DMA_BIDIRECTIONAL);
> +	while (nr_dma--) {
> +		dma_unmap_page(drm->dev->dev, dma_addrs[nr_dma], PAGE_SIZE,
> +				DMA_BIDIRECTIONAL);
>   	}
> -	kfree(migrate->dma);
> -
>   	/*
> -	 * FIXME optimization: update GPU page table to point to newly
> -	 * migrated memory.
> +	 * FIXME optimization: update GPU page table to point to newly migrated
> +	 * memory.
>   	 */
> -}
> -
> -static void nouveau_dmem_migrate_chunk(struct migrate_vma *args,
> -		struct nouveau_migrate *migrate)
> -{
> -	nouveau_dmem_migrate_alloc_and_copy(args->vma, args->src, args->dst,
> -			args->start, args->end, migrate);
> -	migrate_vma_pages(args);
> -	nouveau_dmem_migrate_finalize_and_map(migrate);
>   	migrate_vma_finalize(args);
>   }
>   
> @@ -705,38 +630,40 @@ nouveau_dmem_migrate_vma(struct nouveau_drm *drm,
>   {
>   	unsigned long npages = (end - start) >> PAGE_SHIFT;
>   	unsigned long max = min(SG_MAX_SINGLE_ALLOC, npages);
> +	dma_addr_t *dma_addrs;
>   	struct migrate_vma args = {
>   		.vma		= vma,
>   		.start		= start,
>   	};
> -	struct nouveau_migrate migrate = {
> -		.drm		= drm,
> -		.vma		= vma,
> -		.npages		= npages,
> -	};
>   	unsigned long c, i;
>   	int ret = -ENOMEM;
>   
> -	args.src = kzalloc(sizeof(long) * max, GFP_KERNEL);
> +	args.src = kcalloc(max, sizeof(args.src), GFP_KERNEL);
>   	if (!args.src)
>   		goto out;
> -	args.dst = kzalloc(sizeof(long) * max, GFP_KERNEL);
> +	args.dst = kcalloc(max, sizeof(args.dst), GFP_KERNEL);
>   	if (!args.dst)
>   		goto out_free_src;
>   
> +	dma_addrs = kmalloc_array(max, sizeof(*dma_addrs), GFP_KERNEL);
> +	if (!dma_addrs)
> +		goto out_free_dst;
> +
>   	for (i = 0; i < npages; i += c) {
>   		c = min(SG_MAX_SINGLE_ALLOC, npages);
>   		args.end = start + (c << PAGE_SHIFT);
>   		ret = migrate_vma_setup(&args);
>   		if (ret)
> -			goto out_free_dst;
> +			goto out_free_dma;
>   
>   		if (args.cpages)
> -			nouveau_dmem_migrate_chunk(&args, &migrate);
> +			nouveau_dmem_migrate_chunk(&args, drm, dma_addrs);
>   		args.start = args.end;
>   	}
>   
>   	ret = 0;
> +out_free_dma:
> +	kfree(dma_addrs);
>   out_free_dst:
>   	kfree(args.dst);
>   out_free_src:
> 

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 7/9] mm: remove the unused MIGRATE_PFN_ERROR flag
  2019-07-29 14:28 ` [PATCH 7/9] mm: remove the unused MIGRATE_PFN_ERROR flag Christoph Hellwig
@ 2019-07-29 23:29   ` Ralph Campbell
  0 siblings, 0 replies; 30+ messages in thread
From: Ralph Campbell @ 2019-07-29 23:29 UTC (permalink / raw)
  To: Christoph Hellwig, Jérôme Glisse, Jason Gunthorpe, Ben Skeggs
  Cc: Bharata B Rao, Andrew Morton, linux-mm, nouveau, dri-devel, linux-kernel


On 7/29/19 7:28 AM, Christoph Hellwig wrote:
> We don't use this flag anymore, so remove it.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>

Reviewed-by: Ralph Campbell <rcampbell@nvidia.com>

> ---
>   include/linux/migrate.h | 1 -
>   1 file changed, 1 deletion(-)
> 
> diff --git a/include/linux/migrate.h b/include/linux/migrate.h
> index 093d67fcf6dd..229153c2c496 100644
> --- a/include/linux/migrate.h
> +++ b/include/linux/migrate.h
> @@ -167,7 +167,6 @@ static inline int migrate_misplaced_transhuge_page(struct mm_struct *mm,
>   #define MIGRATE_PFN_LOCKED	(1UL << 2)
>   #define MIGRATE_PFN_WRITE	(1UL << 3)
>   #define MIGRATE_PFN_DEVICE	(1UL << 4)
> -#define MIGRATE_PFN_ERROR	(1UL << 5)
>   #define MIGRATE_PFN_SHIFT	6

The MIGRATE_PFN_SHIFT could be reduced to 5 since it is only used
to make room for the flags.

>   static inline struct page *migrate_pfn_to_page(unsigned long mpfn)
> 

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 9/9] mm: remove the MIGRATE_PFN_WRITE flag
  2019-07-29 14:28 ` [PATCH 9/9] mm: remove the MIGRATE_PFN_WRITE flag Christoph Hellwig
@ 2019-07-29 23:30   ` Jerome Glisse
  2019-07-30  5:46     ` Christoph Hellwig
  2019-07-29 23:42   ` Ralph Campbell
  1 sibling, 1 reply; 30+ messages in thread
From: Jerome Glisse @ 2019-07-29 23:30 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jason Gunthorpe, Ben Skeggs, Ralph Campbell, Bharata B Rao,
	Andrew Morton, linux-mm, nouveau, dri-devel, linux-kernel

On Mon, Jul 29, 2019 at 05:28:43PM +0300, Christoph Hellwig wrote:
> The MIGRATE_PFN_WRITE is only used locally in migrate_vma_collect_pmd,
> where it can be replaced with a simple boolean local variable.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>

NAK that flag is useful, for instance a anonymous vma might have
some of its page read only even if the vma has write permission.

It seems that the code in nouveau is wrong (probably lost that
in various rebase/rework) as this flag should be use to decide
wether to map the device memory with write permission or not.

I am traveling right now, i will investigate what happened to
nouveau code.

Cheers,
Jérôme

> ---
>  include/linux/migrate.h | 1 -
>  mm/migrate.c            | 9 +++++----
>  2 files changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/include/linux/migrate.h b/include/linux/migrate.h
> index 8b46cfdb1a0e..ba74ef5a7702 100644
> --- a/include/linux/migrate.h
> +++ b/include/linux/migrate.h
> @@ -165,7 +165,6 @@ static inline int migrate_misplaced_transhuge_page(struct mm_struct *mm,
>  #define MIGRATE_PFN_VALID	(1UL << 0)
>  #define MIGRATE_PFN_MIGRATE	(1UL << 1)
>  #define MIGRATE_PFN_LOCKED	(1UL << 2)
> -#define MIGRATE_PFN_WRITE	(1UL << 3)
>  #define MIGRATE_PFN_SHIFT	6
>  
>  static inline struct page *migrate_pfn_to_page(unsigned long mpfn)
> diff --git a/mm/migrate.c b/mm/migrate.c
> index 74735256e260..724f92dcc31b 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -2212,6 +2212,7 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp,
>  		unsigned long mpfn, pfn;
>  		struct page *page;
>  		swp_entry_t entry;
> +		bool writable = false;
>  		pte_t pte;
>  
>  		pte = *ptep;
> @@ -2240,7 +2241,7 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp,
>  			mpfn = migrate_pfn(page_to_pfn(page)) |
>  					MIGRATE_PFN_MIGRATE;
>  			if (is_write_device_private_entry(entry))
> -				mpfn |= MIGRATE_PFN_WRITE;
> +				writable = true;
>  		} else {
>  			if (is_zero_pfn(pfn)) {
>  				mpfn = MIGRATE_PFN_MIGRATE;
> @@ -2250,7 +2251,8 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp,
>  			}
>  			page = vm_normal_page(migrate->vma, addr, pte);
>  			mpfn = migrate_pfn(pfn) | MIGRATE_PFN_MIGRATE;
> -			mpfn |= pte_write(pte) ? MIGRATE_PFN_WRITE : 0;
> +			if (pte_write(pte))
> +				writable = true;
>  		}
>  
>  		/* FIXME support THP */
> @@ -2284,8 +2286,7 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp,
>  			ptep_get_and_clear(mm, addr, ptep);
>  
>  			/* Setup special migration page table entry */
> -			entry = make_migration_entry(page, mpfn &
> -						     MIGRATE_PFN_WRITE);
> +			entry = make_migration_entry(page, writable);
>  			swp_pte = swp_entry_to_pte(entry);
>  			if (pte_soft_dirty(pte))
>  				swp_pte = pte_swp_mksoft_dirty(swp_pte);
> -- 
> 2.20.1
> 

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 8/9] mm: remove the unused MIGRATE_PFN_DEVICE flag
  2019-07-29 14:28 ` [PATCH 8/9] mm: remove the unused MIGRATE_PFN_DEVICE flag Christoph Hellwig
@ 2019-07-29 23:31   ` Ralph Campbell
  0 siblings, 0 replies; 30+ messages in thread
From: Ralph Campbell @ 2019-07-29 23:31 UTC (permalink / raw)
  To: Christoph Hellwig, Jérôme Glisse, Jason Gunthorpe, Ben Skeggs
  Cc: Bharata B Rao, Andrew Morton, linux-mm, nouveau, dri-devel, linux-kernel


On 7/29/19 7:28 AM, Christoph Hellwig wrote:
> No one ever checks this flag, and we could easily get that information
> from the page if needed.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>

Reviewed-by: Ralph Campbell <rcampbell@nvidia.com>

> ---
>   drivers/gpu/drm/nouveau/nouveau_dmem.c | 3 +--
>   include/linux/migrate.h                | 1 -
>   mm/migrate.c                           | 4 ++--
>   3 files changed, 3 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/gpu/drm/nouveau/nouveau_dmem.c b/drivers/gpu/drm/nouveau/nouveau_dmem.c
> index 6cb930755970..f04686a2c21f 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_dmem.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_dmem.c
> @@ -582,8 +582,7 @@ static unsigned long nouveau_dmem_migrate_copy_one(struct nouveau_drm *drm,
>   			*dma_addr))
>   		goto out_dma_unmap;
>   
> -	return migrate_pfn(page_to_pfn(dpage)) |
> -		MIGRATE_PFN_LOCKED | MIGRATE_PFN_DEVICE;
> +	return migrate_pfn(page_to_pfn(dpage)) | MIGRATE_PFN_LOCKED;
>   
>   out_dma_unmap:
>   	dma_unmap_page(dev, *dma_addr, PAGE_SIZE, DMA_BIDIRECTIONAL);
> diff --git a/include/linux/migrate.h b/include/linux/migrate.h
> index 229153c2c496..8b46cfdb1a0e 100644
> --- a/include/linux/migrate.h
> +++ b/include/linux/migrate.h
> @@ -166,7 +166,6 @@ static inline int migrate_misplaced_transhuge_page(struct mm_struct *mm,
>   #define MIGRATE_PFN_MIGRATE	(1UL << 1)
>   #define MIGRATE_PFN_LOCKED	(1UL << 2)
>   #define MIGRATE_PFN_WRITE	(1UL << 3)
> -#define MIGRATE_PFN_DEVICE	(1UL << 4)
>   #define MIGRATE_PFN_SHIFT	6
>   
>   static inline struct page *migrate_pfn_to_page(unsigned long mpfn)
> diff --git a/mm/migrate.c b/mm/migrate.c
> index dc4e60a496f2..74735256e260 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -2237,8 +2237,8 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp,
>   				goto next;
>   
>   			page = device_private_entry_to_page(entry);
> -			mpfn = migrate_pfn(page_to_pfn(page))|
> -				MIGRATE_PFN_DEVICE | MIGRATE_PFN_MIGRATE;
> +			mpfn = migrate_pfn(page_to_pfn(page)) |
> +					MIGRATE_PFN_MIGRATE;
>   			if (is_write_device_private_entry(entry))
>   				mpfn |= MIGRATE_PFN_WRITE;
>   		} else {
> 

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 9/9] mm: remove the MIGRATE_PFN_WRITE flag
  2019-07-29 14:28 ` [PATCH 9/9] mm: remove the MIGRATE_PFN_WRITE flag Christoph Hellwig
  2019-07-29 23:30   ` Jerome Glisse
@ 2019-07-29 23:42   ` Ralph Campbell
  2019-07-29 23:46     ` Jerome Glisse
  1 sibling, 1 reply; 30+ messages in thread
From: Ralph Campbell @ 2019-07-29 23:42 UTC (permalink / raw)
  To: Christoph Hellwig, Jérôme Glisse, Jason Gunthorpe, Ben Skeggs
  Cc: Bharata B Rao, Andrew Morton, linux-mm, nouveau, dri-devel, linux-kernel


On 7/29/19 7:28 AM, Christoph Hellwig wrote:
> The MIGRATE_PFN_WRITE is only used locally in migrate_vma_collect_pmd,
> where it can be replaced with a simple boolean local variable.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>

Reviewed-by: Ralph Campbell <rcampbell@nvidia.com>

> ---
>   include/linux/migrate.h | 1 -
>   mm/migrate.c            | 9 +++++----
>   2 files changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/include/linux/migrate.h b/include/linux/migrate.h
> index 8b46cfdb1a0e..ba74ef5a7702 100644
> --- a/include/linux/migrate.h
> +++ b/include/linux/migrate.h
> @@ -165,7 +165,6 @@ static inline int migrate_misplaced_transhuge_page(struct mm_struct *mm,
>   #define MIGRATE_PFN_VALID	(1UL << 0)
>   #define MIGRATE_PFN_MIGRATE	(1UL << 1)
>   #define MIGRATE_PFN_LOCKED	(1UL << 2)
> -#define MIGRATE_PFN_WRITE	(1UL << 3)
>   #define MIGRATE_PFN_SHIFT	6
>   
>   static inline struct page *migrate_pfn_to_page(unsigned long mpfn)
> diff --git a/mm/migrate.c b/mm/migrate.c
> index 74735256e260..724f92dcc31b 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -2212,6 +2212,7 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp,
>   		unsigned long mpfn, pfn;
>   		struct page *page;
>   		swp_entry_t entry;
> +		bool writable = false;
>   		pte_t pte;
>   
>   		pte = *ptep;
> @@ -2240,7 +2241,7 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp,
>   			mpfn = migrate_pfn(page_to_pfn(page)) |
>   					MIGRATE_PFN_MIGRATE;
>   			if (is_write_device_private_entry(entry))
> -				mpfn |= MIGRATE_PFN_WRITE;
> +				writable = true;
>   		} else {
>   			if (is_zero_pfn(pfn)) {
>   				mpfn = MIGRATE_PFN_MIGRATE;
> @@ -2250,7 +2251,8 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp,
>   			}
>   			page = vm_normal_page(migrate->vma, addr, pte);
>   			mpfn = migrate_pfn(pfn) | MIGRATE_PFN_MIGRATE;
> -			mpfn |= pte_write(pte) ? MIGRATE_PFN_WRITE : 0;
> +			if (pte_write(pte))
> +				writable = true;
>   		}
>   
>   		/* FIXME support THP */
> @@ -2284,8 +2286,7 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp,
>   			ptep_get_and_clear(mm, addr, ptep);
>   
>   			/* Setup special migration page table entry */
> -			entry = make_migration_entry(page, mpfn &
> -						     MIGRATE_PFN_WRITE);
> +			entry = make_migration_entry(page, writable);
>   			swp_pte = swp_entry_to_pte(entry);
>   			if (pte_soft_dirty(pte))
>   				swp_pte = pte_swp_mksoft_dirty(swp_pte);
> 

MIGRATE_PFN_WRITE may mot being used but that seems like a bug to me.
If a page is migrated to device memory, it could be mapped at the same
time to avoid a device page fault but it would need the flag to know
whether to map it RW or RO. But I suppose that could be inferred from
the vma->vm_flags.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 1/9] mm: turn migrate_vma upside down
  2019-07-29 14:28 ` [PATCH 1/9] mm: turn " Christoph Hellwig
  2019-07-29 23:12   ` Ralph Campbell
@ 2019-07-29 23:43   ` Jerome Glisse
  2019-07-31  1:46   ` Ralph Campbell
  2 siblings, 0 replies; 30+ messages in thread
From: Jerome Glisse @ 2019-07-29 23:43 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jason Gunthorpe, Ben Skeggs, Ralph Campbell, Bharata B Rao,
	Andrew Morton, linux-mm, nouveau, dri-devel, linux-kernel

On Mon, Jul 29, 2019 at 05:28:35PM +0300, Christoph Hellwig wrote:
> There isn't any good reason to pass callbacks to migrate_vma.  Instead
> we can just export the three steps done by this function to drivers and
> let them sequence the operation without callbacks.  This removes a lot
> of boilerplate code as-is, and will allow the drivers to drastically
> improve code flow and error handling further on.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>


I haven't finished review, especialy the nouveau code, i will look
into this once i get back. In the meantime below are few corrections.

> ---
>  Documentation/vm/hmm.rst               |  55 +-----
>  drivers/gpu/drm/nouveau/nouveau_dmem.c | 122 +++++++------
>  include/linux/migrate.h                | 118 ++----------
>  mm/migrate.c                           | 242 +++++++++++--------------
>  4 files changed, 193 insertions(+), 344 deletions(-)
> 

[...]

> diff --git a/mm/migrate.c b/mm/migrate.c
> index 8992741f10aa..dc4e60a496f2 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -2118,16 +2118,6 @@ int migrate_misplaced_transhuge_page(struct mm_struct *mm,
>  #endif /* CONFIG_NUMA */
>  
>  #if defined(CONFIG_MIGRATE_VMA_HELPER)
> -struct migrate_vma {
> -	struct vm_area_struct	*vma;
> -	unsigned long		*dst;
> -	unsigned long		*src;
> -	unsigned long		cpages;
> -	unsigned long		npages;
> -	unsigned long		start;
> -	unsigned long		end;
> -};
> -
>  static int migrate_vma_collect_hole(unsigned long start,
>  				    unsigned long end,
>  				    struct mm_walk *walk)
> @@ -2578,6 +2568,108 @@ static void migrate_vma_unmap(struct migrate_vma *migrate)
>  	}
>  }
>  
> +/**
> + * migrate_vma_setup() - prepare to migrate a range of memory
> + * @args: contains the vma, start, and and pfns arrays for the migration
> + *
> + * Returns: negative errno on failures, 0 when 0 or more pages were migrated
> + * without an error.
> + *
> + * Prepare to migrate a range of memory virtual address range by collecting all
> + * the pages backing each virtual address in the range, saving them inside the
> + * src array.  Then lock those pages and unmap them. Once the pages are locked
> + * and unmapped, check whether each page is pinned or not.  Pages that aren't
> + * pinned have the MIGRATE_PFN_MIGRATE flag set (by this function) in the
> + * corresponding src array entry.  Then restores any pages that are pinned, by
> + * remapping and unlocking those pages.
> + *
> + * The caller should then allocate destination memory and copy source memory to
> + * it for all those entries (ie with MIGRATE_PFN_VALID and MIGRATE_PFN_MIGRATE
> + * flag set).  Once these are allocated and copied, the caller must update each
> + * corresponding entry in the dst array with the pfn value of the destination
> + * page and with the MIGRATE_PFN_VALID and MIGRATE_PFN_LOCKED flags set
> + * (destination pages must have their struct pages locked, via lock_page()).
> + *
> + * Note that the caller does not have to migrate all the pages that are marked
> + * with MIGRATE_PFN_MIGRATE flag in src array unless this is a migration from
> + * device memory to system memory.  If the caller cannot migrate a device page
> + * back to system memory, then it must return VM_FAULT_SIGBUS, which will
> + * might have severe consequences for the userspace process, so it should best

      ^s/might//                                                      ^s/should best/must/

> + * be avoided if possible.
                 ^s/if possible//

Maybe adding something about failing only on unrecoverable device error. The
only reason we allow failure for migration here is because GPU devices can
go into bad state (GPU lockup) and when that happens the GPU memory might be
corrupted (power to GPU memory might be cut by GPU driver to recover the
GPU).

So failing migration back to main memory is only a last resort event.


> + *
> + * For empty entries inside CPU page table (pte_none() or pmd_none() is true) we
> + * do set MIGRATE_PFN_MIGRATE flag inside the corresponding source array thus
> + * allowing the caller to allocate device memory for those unback virtual
> + * address.  For this the caller simply havs to allocate device memory and
                                           ^ haves


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 9/9] mm: remove the MIGRATE_PFN_WRITE flag
  2019-07-29 23:42   ` Ralph Campbell
@ 2019-07-29 23:46     ` Jerome Glisse
  0 siblings, 0 replies; 30+ messages in thread
From: Jerome Glisse @ 2019-07-29 23:46 UTC (permalink / raw)
  To: Ralph Campbell
  Cc: Christoph Hellwig, Jason Gunthorpe, Ben Skeggs, Bharata B Rao,
	Andrew Morton, linux-mm, nouveau, dri-devel, linux-kernel

On Mon, Jul 29, 2019 at 04:42:01PM -0700, Ralph Campbell wrote:
> 
> On 7/29/19 7:28 AM, Christoph Hellwig wrote:
> > The MIGRATE_PFN_WRITE is only used locally in migrate_vma_collect_pmd,
> > where it can be replaced with a simple boolean local variable.
> > 
> > Signed-off-by: Christoph Hellwig <hch@lst.de>
> 
> Reviewed-by: Ralph Campbell <rcampbell@nvidia.com>
> 
> > ---
> >   include/linux/migrate.h | 1 -
> >   mm/migrate.c            | 9 +++++----
> >   2 files changed, 5 insertions(+), 5 deletions(-)
> > 
> > diff --git a/include/linux/migrate.h b/include/linux/migrate.h
> > index 8b46cfdb1a0e..ba74ef5a7702 100644
> > --- a/include/linux/migrate.h
> > +++ b/include/linux/migrate.h
> > @@ -165,7 +165,6 @@ static inline int migrate_misplaced_transhuge_page(struct mm_struct *mm,
> >   #define MIGRATE_PFN_VALID	(1UL << 0)
> >   #define MIGRATE_PFN_MIGRATE	(1UL << 1)
> >   #define MIGRATE_PFN_LOCKED	(1UL << 2)
> > -#define MIGRATE_PFN_WRITE	(1UL << 3)
> >   #define MIGRATE_PFN_SHIFT	6
> >   static inline struct page *migrate_pfn_to_page(unsigned long mpfn)
> > diff --git a/mm/migrate.c b/mm/migrate.c
> > index 74735256e260..724f92dcc31b 100644
> > --- a/mm/migrate.c
> > +++ b/mm/migrate.c
> > @@ -2212,6 +2212,7 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp,
> >   		unsigned long mpfn, pfn;
> >   		struct page *page;
> >   		swp_entry_t entry;
> > +		bool writable = false;
> >   		pte_t pte;
> >   		pte = *ptep;
> > @@ -2240,7 +2241,7 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp,
> >   			mpfn = migrate_pfn(page_to_pfn(page)) |
> >   					MIGRATE_PFN_MIGRATE;
> >   			if (is_write_device_private_entry(entry))
> > -				mpfn |= MIGRATE_PFN_WRITE;
> > +				writable = true;
> >   		} else {
> >   			if (is_zero_pfn(pfn)) {
> >   				mpfn = MIGRATE_PFN_MIGRATE;
> > @@ -2250,7 +2251,8 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp,
> >   			}
> >   			page = vm_normal_page(migrate->vma, addr, pte);
> >   			mpfn = migrate_pfn(pfn) | MIGRATE_PFN_MIGRATE;
> > -			mpfn |= pte_write(pte) ? MIGRATE_PFN_WRITE : 0;
> > +			if (pte_write(pte))
> > +				writable = true;
> >   		}
> >   		/* FIXME support THP */
> > @@ -2284,8 +2286,7 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp,
> >   			ptep_get_and_clear(mm, addr, ptep);
> >   			/* Setup special migration page table entry */
> > -			entry = make_migration_entry(page, mpfn &
> > -						     MIGRATE_PFN_WRITE);
> > +			entry = make_migration_entry(page, writable);
> >   			swp_pte = swp_entry_to_pte(entry);
> >   			if (pte_soft_dirty(pte))
> >   				swp_pte = pte_swp_mksoft_dirty(swp_pte);
> > 
> 
> MIGRATE_PFN_WRITE may mot being used but that seems like a bug to me.
> If a page is migrated to device memory, it could be mapped at the same
> time to avoid a device page fault but it would need the flag to know
> whether to map it RW or RO. But I suppose that could be inferred from
> the vma->vm_flags.

It is a bug that it is not being use right now. I will have to dig my
git repo to see when that got kill. Will look into it once i get back.

The vma->vm_flags is of no use here. A page can be write protected
inside a writable vma for various reasons.

Cheers,
Jérôme

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 9/9] mm: remove the MIGRATE_PFN_WRITE flag
  2019-07-29 23:30   ` Jerome Glisse
@ 2019-07-30  5:46     ` Christoph Hellwig
  2019-07-30 15:51       ` Jerome Glisse
  0 siblings, 1 reply; 30+ messages in thread
From: Christoph Hellwig @ 2019-07-30  5:46 UTC (permalink / raw)
  To: Jerome Glisse
  Cc: Christoph Hellwig, Jason Gunthorpe, Ben Skeggs, Ralph Campbell,
	Bharata B Rao, Andrew Morton, linux-mm, nouveau, dri-devel,
	linux-kernel

On Mon, Jul 29, 2019 at 07:30:44PM -0400, Jerome Glisse wrote:
> On Mon, Jul 29, 2019 at 05:28:43PM +0300, Christoph Hellwig wrote:
> > The MIGRATE_PFN_WRITE is only used locally in migrate_vma_collect_pmd,
> > where it can be replaced with a simple boolean local variable.
> > 
> > Signed-off-by: Christoph Hellwig <hch@lst.de>
> 
> NAK that flag is useful, for instance a anonymous vma might have
> some of its page read only even if the vma has write permission.
> 
> It seems that the code in nouveau is wrong (probably lost that
> in various rebase/rework) as this flag should be use to decide
> wether to map the device memory with write permission or not.
> 
> I am traveling right now, i will investigate what happened to
> nouveau code.

We can add it back when needed pretty easily.  Much of this has bitrotted
way to fast, and the pending ppc kvmhmm code doesn't need it either.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: turn the hmm migrate_vma upside down
  2019-07-29 14:28 turn the hmm migrate_vma upside down Christoph Hellwig
                   ` (8 preceding siblings ...)
  2019-07-29 14:28 ` [PATCH 9/9] mm: remove the MIGRATE_PFN_WRITE flag Christoph Hellwig
@ 2019-07-30 12:32 ` Jason Gunthorpe
  2019-07-30 13:09   ` Christoph Hellwig
  9 siblings, 1 reply; 30+ messages in thread
From: Jason Gunthorpe @ 2019-07-30 12:32 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jérôme Glisse, Ben Skeggs, Ralph Campbell,
	Bharata B Rao, Andrew Morton, linux-mm, nouveau, dri-devel,
	linux-kernel

On Mon, Jul 29, 2019 at 05:28:34PM +0300, Christoph Hellwig wrote:
> Hi Jérôme, Ben and Jason,
> 
> below is a series against the hmm tree which starts revamping the
> migrate_vma functionality.  The prime idea is to export three slightly
> lower level functions and thus avoid the need for migrate_vma_ops
> callbacks.

I don't feel I can contribute a worthwhile review for this part of the
code right now.

Does this only impact hmm users, or does migrate.c have a broader
usage?

Who do we need on review to progress on this?

Thanks,
Jason

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: turn the hmm migrate_vma upside down
  2019-07-30 12:32 ` turn the hmm migrate_vma upside down Jason Gunthorpe
@ 2019-07-30 13:09   ` Christoph Hellwig
  0 siblings, 0 replies; 30+ messages in thread
From: Christoph Hellwig @ 2019-07-30 13:09 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Christoph Hellwig, Jérôme Glisse, Ben Skeggs,
	Ralph Campbell, Bharata B Rao, Andrew Morton, linux-mm, nouveau,
	dri-devel, linux-kernel

On Tue, Jul 30, 2019 at 12:32:24PM +0000, Jason Gunthorpe wrote:
> Does this only impact hmm users, or does migrate.c have a broader
> usage?

migrate.c really contains two things:  the traditional page migration
code implementing aops ->migrate semantics, and migrate_vma and its
callbacks.  The first part is broader, the second part is hmm specific
(and should probably have been in a file of its own, given that it is
guarded off CONFIG_MIGRATE_VMA_HELPER).  This series only touched the
latter part.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 9/9] mm: remove the MIGRATE_PFN_WRITE flag
  2019-07-30  5:46     ` Christoph Hellwig
@ 2019-07-30 15:51       ` Jerome Glisse
  0 siblings, 0 replies; 30+ messages in thread
From: Jerome Glisse @ 2019-07-30 15:51 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jason Gunthorpe, Ben Skeggs, Ralph Campbell, Bharata B Rao,
	Andrew Morton, linux-mm, nouveau, dri-devel, linux-kernel

On Tue, Jul 30, 2019 at 07:46:33AM +0200, Christoph Hellwig wrote:
> On Mon, Jul 29, 2019 at 07:30:44PM -0400, Jerome Glisse wrote:
> > On Mon, Jul 29, 2019 at 05:28:43PM +0300, Christoph Hellwig wrote:
> > > The MIGRATE_PFN_WRITE is only used locally in migrate_vma_collect_pmd,
> > > where it can be replaced with a simple boolean local variable.
> > > 
> > > Signed-off-by: Christoph Hellwig <hch@lst.de>
> > 
> > NAK that flag is useful, for instance a anonymous vma might have
> > some of its page read only even if the vma has write permission.
> > 
> > It seems that the code in nouveau is wrong (probably lost that
> > in various rebase/rework) as this flag should be use to decide
> > wether to map the device memory with write permission or not.
> > 
> > I am traveling right now, i will investigate what happened to
> > nouveau code.
> 
> We can add it back when needed pretty easily.  Much of this has bitrotted
> way to fast, and the pending ppc kvmhmm code doesn't need it either.

Not using is a serious bug, i will investigate this friday.

Cheers,
Jérôme

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 1/9] mm: turn migrate_vma upside down
  2019-07-29 14:28 ` [PATCH 1/9] mm: turn " Christoph Hellwig
  2019-07-29 23:12   ` Ralph Campbell
  2019-07-29 23:43   ` Jerome Glisse
@ 2019-07-31  1:46   ` Ralph Campbell
  2019-08-01  7:42     ` Christoph Hellwig
  2 siblings, 1 reply; 30+ messages in thread
From: Ralph Campbell @ 2019-07-31  1:46 UTC (permalink / raw)
  To: Christoph Hellwig, Jérôme Glisse, Jason Gunthorpe, Ben Skeggs
  Cc: Bharata B Rao, Andrew Morton, linux-mm, nouveau, dri-devel, linux-kernel


On 7/29/19 7:28 AM, Christoph Hellwig wrote:
> There isn't any good reason to pass callbacks to migrate_vma.  Instead
> we can just export the three steps done by this function to drivers and
> let them sequence the operation without callbacks.  This removes a lot
> of boilerplate code as-is, and will allow the drivers to drastically
> improve code flow and error handling further on.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>   Documentation/vm/hmm.rst               |  55 +-----
>   drivers/gpu/drm/nouveau/nouveau_dmem.c | 122 +++++++------
>   include/linux/migrate.h                | 118 ++----------
>   mm/migrate.c                           | 242 +++++++++++--------------
>   4 files changed, 193 insertions(+), 344 deletions(-)
> 
> diff --git a/Documentation/vm/hmm.rst b/Documentation/vm/hmm.rst
> index ddcb5ca8b296..ad880e3996b1 100644
> --- a/Documentation/vm/hmm.rst
> +++ b/Documentation/vm/hmm.rst
> @@ -339,58 +339,9 @@ Migration to and from device memory
>   ===================================
>   
>   Because the CPU cannot access device memory, migration must use the device DMA
> -engine to perform copy from and to device memory. For this we need a new
> -migration helper::
> -
> - int migrate_vma(const struct migrate_vma_ops *ops,
> -                 struct vm_area_struct *vma,
> -                 unsigned long mentries,
> -                 unsigned long start,
> -                 unsigned long end,
> -                 unsigned long *src,
> -                 unsigned long *dst,
> -                 void *private);
> -
> -Unlike other migration functions it works on a range of virtual address, there
> -are two reasons for that. First, device DMA copy has a high setup overhead cost
> -and thus batching multiple pages is needed as otherwise the migration overhead
> -makes the whole exercise pointless. The second reason is because the
> -migration might be for a range of addresses the device is actively accessing.
> -
> -The migrate_vma_ops struct defines two callbacks. First one (alloc_and_copy())
> -controls destination memory allocation and copy operation. Second one is there
> -to allow the device driver to perform cleanup operations after migration::
> -
> - struct migrate_vma_ops {
> -     void (*alloc_and_copy)(struct vm_area_struct *vma,
> -                            const unsigned long *src,
> -                            unsigned long *dst,
> -                            unsigned long start,
> -                            unsigned long end,
> -                            void *private);
> -     void (*finalize_and_map)(struct vm_area_struct *vma,
> -                              const unsigned long *src,
> -                              const unsigned long *dst,
> -                              unsigned long start,
> -                              unsigned long end,
> -                              void *private);
> - };
> -
> -It is important to stress that these migration helpers allow for holes in the
> -virtual address range. Some pages in the range might not be migrated for all
> -the usual reasons (page is pinned, page is locked, ...). This helper does not
> -fail but just skips over those pages.
> -
> -The alloc_and_copy() might decide to not migrate all pages in the
> -range (for reasons under the callback control). For those, the callback just
> -has to leave the corresponding dst entry empty.
> -
> -Finally, the migration of the struct page might fail (for file backed page) for
> -various reasons (failure to freeze reference, or update page cache, ...). If
> -that happens, then the finalize_and_map() can catch any pages that were not
> -migrated. Note those pages were still copied to a new page and thus we wasted
> -bandwidth but this is considered as a rare event and a price that we are
> -willing to pay to keep all the code simpler.
> +engine to perform copy from and to device memory. For this we need a new to
> +use migrate_vma_setup(), migrate_vma_pages(), and migrate_vma_finalize()
> +helpers.
>   
>   
>   Memory cgroup (memcg) and rss accounting
> diff --git a/drivers/gpu/drm/nouveau/nouveau_dmem.c b/drivers/gpu/drm/nouveau/nouveau_dmem.c
> index 345c63cb752a..38416798abd4 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_dmem.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_dmem.c
> @@ -131,9 +131,8 @@ nouveau_dmem_fault_alloc_and_copy(struct vm_area_struct *vma,
>   				  unsigned long *dst_pfns,
>   				  unsigned long start,
>   				  unsigned long end,
> -				  void *private)
> +				  struct nouveau_dmem_fault *fault)
>   {
> -	struct nouveau_dmem_fault *fault = private;
>   	struct nouveau_drm *drm = fault->drm;
>   	struct device *dev = drm->dev->dev;
>   	unsigned long addr, i, npages = 0;
> @@ -230,14 +229,9 @@ nouveau_dmem_fault_alloc_and_copy(struct vm_area_struct *vma,
>   	}
>   }
>   
> -void nouveau_dmem_fault_finalize_and_map(struct vm_area_struct *vma,
> -					 const unsigned long *src_pfns,
> -					 const unsigned long *dst_pfns,
> -					 unsigned long start,
> -					 unsigned long end,
> -					 void *private)
> +static void
> +nouveau_dmem_fault_finalize_and_map(struct nouveau_dmem_fault *fault)
>   {
> -	struct nouveau_dmem_fault *fault = private;
>   	struct nouveau_drm *drm = fault->drm;
>   
>   	if (fault->fence) {
> @@ -257,29 +251,35 @@ void nouveau_dmem_fault_finalize_and_map(struct vm_area_struct *vma,
>   	kfree(fault->dma);
>   }
>   
> -static const struct migrate_vma_ops nouveau_dmem_fault_migrate_ops = {
> -	.alloc_and_copy		= nouveau_dmem_fault_alloc_and_copy,
> -	.finalize_and_map	= nouveau_dmem_fault_finalize_and_map,
> -};
> -
>   static vm_fault_t nouveau_dmem_migrate_to_ram(struct vm_fault *vmf)
>   {
>   	struct nouveau_dmem *dmem = page_to_dmem(vmf->page);
>   	unsigned long src[1] = {0}, dst[1] = {0};
> +	struct migrate_vma args = {
> +		.vma		= vmf->vma,
> +		.start		= vmf->address,
> +		.end		= vmf->address + PAGE_SIZE,
> +		.src		= src,
> +		.dst		= dst,
> +	};
>   	struct nouveau_dmem_fault fault = { .drm = dmem->drm };
> -	int ret;
>   
>   	/*
>   	 * FIXME what we really want is to find some heuristic to migrate more
>   	 * than just one page on CPU fault. When such fault happens it is very
>   	 * likely that more surrounding page will CPU fault too.
>   	 */
> -	ret = migrate_vma(&nouveau_dmem_fault_migrate_ops, vmf->vma,
> -			vmf->address, vmf->address + PAGE_SIZE,
> -			src, dst, &fault);
> -	if (ret)
> +	if (migrate_vma_setup(&args) < 0)
>   		return VM_FAULT_SIGBUS;
> +	if (!args.cpages)
> +		return 0;
> +
> +	nouveau_dmem_fault_alloc_and_copy(args.vma, src, dst, args.start,
> +			args.end, &fault);
> +	migrate_vma_pages(&args);
> +	nouveau_dmem_fault_finalize_and_map(&fault);
>   
> +	migrate_vma_finalize(&args);
>   	if (dst[0] == MIGRATE_PFN_ERROR)
>   		return VM_FAULT_SIGBUS;
>   
> @@ -648,9 +648,8 @@ nouveau_dmem_migrate_alloc_and_copy(struct vm_area_struct *vma,
>   				    unsigned long *dst_pfns,
>   				    unsigned long start,
>   				    unsigned long end,
> -				    void *private)
> +				    struct nouveau_migrate *migrate)
>   {
> -	struct nouveau_migrate *migrate = private;
>   	struct nouveau_drm *drm = migrate->drm;
>   	struct device *dev = drm->dev->dev;
>   	unsigned long addr, i, npages = 0;
> @@ -747,14 +746,9 @@ nouveau_dmem_migrate_alloc_and_copy(struct vm_area_struct *vma,
>   	}
>   }
>   
> -void nouveau_dmem_migrate_finalize_and_map(struct vm_area_struct *vma,
> -					   const unsigned long *src_pfns,
> -					   const unsigned long *dst_pfns,
> -					   unsigned long start,
> -					   unsigned long end,
> -					   void *private)
> +static void
> +nouveau_dmem_migrate_finalize_and_map(struct nouveau_migrate *migrate)
>   {
> -	struct nouveau_migrate *migrate = private;
>   	struct nouveau_drm *drm = migrate->drm;
>   
>   	if (migrate->fence) {
> @@ -779,10 +773,15 @@ void nouveau_dmem_migrate_finalize_and_map(struct vm_area_struct *vma,
>   	 */
>   }
>   
> -static const struct migrate_vma_ops nouveau_dmem_migrate_ops = {
> -	.alloc_and_copy		= nouveau_dmem_migrate_alloc_and_copy,
> -	.finalize_and_map	= nouveau_dmem_migrate_finalize_and_map,
> -};
> +static void nouveau_dmem_migrate_chunk(struct migrate_vma *args,
> +		struct nouveau_migrate *migrate)
> +{
> +	nouveau_dmem_migrate_alloc_and_copy(args->vma, args->src, args->dst,
> +			args->start, args->end, migrate);
> +	migrate_vma_pages(args);
> +	nouveau_dmem_migrate_finalize_and_map(migrate);
> +	migrate_vma_finalize(args);
> +}
>   
>   int
>   nouveau_dmem_migrate_vma(struct nouveau_drm *drm,
> @@ -790,40 +789,45 @@ nouveau_dmem_migrate_vma(struct nouveau_drm *drm,
>   			 unsigned long start,
>   			 unsigned long end)
>   {
> -	unsigned long *src_pfns, *dst_pfns, npages;
> -	struct nouveau_migrate migrate = {0};
> -	unsigned long i, c, max;
> -	int ret = 0;
> -
> -	npages = (end - start) >> PAGE_SHIFT;
> -	max = min(SG_MAX_SINGLE_ALLOC, npages);
> -	src_pfns = kzalloc(sizeof(long) * max, GFP_KERNEL);
> -	if (src_pfns == NULL)
> -		return -ENOMEM;
> -	dst_pfns = kzalloc(sizeof(long) * max, GFP_KERNEL);
> -	if (dst_pfns == NULL) {
> -		kfree(src_pfns);
> -		return -ENOMEM;
> -	}
> +	unsigned long npages = (end - start) >> PAGE_SHIFT;
> +	unsigned long max = min(SG_MAX_SINGLE_ALLOC, npages);
> +	struct migrate_vma args = {
> +		.vma		= vma,
> +		.start		= start,
> +	};
> +	struct nouveau_migrate migrate = {
> +		.drm		= drm,
> +		.vma		= vma,
> +		.npages		= npages,
> +	};
> +	unsigned long c, i;
> +	int ret = -ENOMEM;
> +
> +	args.src = kzalloc(sizeof(long) * max, GFP_KERNEL);
> +	if (!args.src)
> +		goto out;
> +	args.dst = kzalloc(sizeof(long) * max, GFP_KERNEL);
> +	if (!args.dst)
> +		goto out_free_src;
>   
> -	migrate.drm = drm;
> -	migrate.vma = vma;
> -	migrate.npages = npages;
>   	for (i = 0; i < npages; i += c) {
> -		unsigned long next;
> -
>   		c = min(SG_MAX_SINGLE_ALLOC, npages);
> -		next = start + (c << PAGE_SHIFT);
> -		ret = migrate_vma(&nouveau_dmem_migrate_ops, vma, start,
> -				  next, src_pfns, dst_pfns, &migrate);
> +		args.end = start + (c << PAGE_SHIFT);

Since migrate_vma_setup() is called in a loop, either args.cpages and
args.npages need to be cleared here or cleared in migrate_vma_setup().

> +		ret = migrate_vma_setup(&args);
>   		if (ret)
> -			goto out;
> -		start = next;
> +			goto out_free_dst;
> +
> +		if (args.cpages)
> +			nouveau_dmem_migrate_chunk(&args, &migrate);
> +		args.start = args.end;
>   	}
>   
> +	ret = 0;
> +out_free_dst:
> +	kfree(args.dst);
> +out_free_src:
> +	kfree(args.src);
>   out:
> -	kfree(dst_pfns);
> -	kfree(src_pfns);
>   	return ret;
>   }
>   
> diff --git a/include/linux/migrate.h b/include/linux/migrate.h
> index 7f04754c7f2b..093d67fcf6dd 100644
> --- a/include/linux/migrate.h
> +++ b/include/linux/migrate.h
> @@ -182,107 +182,27 @@ static inline unsigned long migrate_pfn(unsigned long pfn)
>   	return (pfn << MIGRATE_PFN_SHIFT) | MIGRATE_PFN_VALID;
>   }
>   
> -/*
> - * struct migrate_vma_ops - migrate operation callback
> - *
> - * @alloc_and_copy: alloc destination memory and copy source memory to it
> - * @finalize_and_map: allow caller to map the successfully migrated pages
> - *
> - *
> - * The alloc_and_copy() callback happens once all source pages have been locked,
> - * unmapped and checked (checked whether pinned or not). All pages that can be
> - * migrated will have an entry in the src array set with the pfn value of the
> - * page and with the MIGRATE_PFN_VALID and MIGRATE_PFN_MIGRATE flag set (other
> - * flags might be set but should be ignored by the callback).
> - *
> - * The alloc_and_copy() callback can then allocate destination memory and copy
> - * source memory to it for all those entries (ie with MIGRATE_PFN_VALID and
> - * MIGRATE_PFN_MIGRATE flag set). Once these are allocated and copied, the
> - * callback must update each corresponding entry in the dst array with the pfn
> - * value of the destination page and with the MIGRATE_PFN_VALID and
> - * MIGRATE_PFN_LOCKED flags set (destination pages must have their struct pages
> - * locked, via lock_page()).
> - *
> - * At this point the alloc_and_copy() callback is done and returns.
> - *
> - * Note that the callback does not have to migrate all the pages that are
> - * marked with MIGRATE_PFN_MIGRATE flag in src array unless this is a migration
> - * from device memory to system memory (ie the MIGRATE_PFN_DEVICE flag is also
> - * set in the src array entry). If the device driver cannot migrate a device
> - * page back to system memory, then it must set the corresponding dst array
> - * entry to MIGRATE_PFN_ERROR. This will trigger a SIGBUS if CPU tries to
> - * access any of the virtual addresses originally backed by this page. Because
> - * a SIGBUS is such a severe result for the userspace process, the device
> - * driver should avoid setting MIGRATE_PFN_ERROR unless it is really in an
> - * unrecoverable state.
> - *
> - * For empty entry inside CPU page table (pte_none() or pmd_none() is true) we
> - * do set MIGRATE_PFN_MIGRATE flag inside the corresponding source array thus
> - * allowing device driver to allocate device memory for those unback virtual
> - * address. For this the device driver simply have to allocate device memory
> - * and properly set the destination entry like for regular migration. Note that
> - * this can still fails and thus inside the device driver must check if the
> - * migration was successful for those entry inside the finalize_and_map()
> - * callback just like for regular migration.
> - *
> - * THE alloc_and_copy() CALLBACK MUST NOT CHANGE ANY OF THE SRC ARRAY ENTRIES
> - * OR BAD THINGS WILL HAPPEN !
> - *
> - *
> - * The finalize_and_map() callback happens after struct page migration from
> - * source to destination (destination struct pages are the struct pages for the
> - * memory allocated by the alloc_and_copy() callback).  Migration can fail, and
> - * thus the finalize_and_map() allows the driver to inspect which pages were
> - * successfully migrated, and which were not. Successfully migrated pages will
> - * have the MIGRATE_PFN_MIGRATE flag set for their src array entry.
> - *
> - * It is safe to update device page table from within the finalize_and_map()
> - * callback because both destination and source page are still locked, and the
> - * mmap_sem is held in read mode (hence no one can unmap the range being
> - * migrated).
> - *
> - * Once callback is done cleaning up things and updating its page table (if it
> - * chose to do so, this is not an obligation) then it returns. At this point,
> - * the HMM core will finish up the final steps, and the migration is complete.
> - *
> - * THE finalize_and_map() CALLBACK MUST NOT CHANGE ANY OF THE SRC OR DST ARRAY
> - * ENTRIES OR BAD THINGS WILL HAPPEN !
> - */
> -struct migrate_vma_ops {
> -	void (*alloc_and_copy)(struct vm_area_struct *vma,
> -			       const unsigned long *src,
> -			       unsigned long *dst,
> -			       unsigned long start,
> -			       unsigned long end,
> -			       void *private);
> -	void (*finalize_and_map)(struct vm_area_struct *vma,
> -				 const unsigned long *src,
> -				 const unsigned long *dst,
> -				 unsigned long start,
> -				 unsigned long end,
> -				 void *private);
> +struct migrate_vma {
> +	struct vm_area_struct	*vma;
> + 	/*
> +	 * Both src and dst array must be big enough for
> +	 * (end - start) >> PAGE_SHIFT entries.
> +	 *
> +	 * The src array must not be modified by the caller after
> +	 * migrate_vma_setup(), and must not change the dst array after
> +	 * migrate_vma_pages() returns.
> +	 */
> +	unsigned long		*dst;
> +	unsigned long		*src;
> +	unsigned long		cpages;
> +	unsigned long		npages;
> +	unsigned long		start;
> +	unsigned long		end;
>   };
>   
> -#if defined(CONFIG_MIGRATE_VMA_HELPER)
> -int migrate_vma(const struct migrate_vma_ops *ops,
> -		struct vm_area_struct *vma,
> -		unsigned long start,
> -		unsigned long end,
> -		unsigned long *src,
> -		unsigned long *dst,
> -		void *private);
> -#else
> -static inline int migrate_vma(const struct migrate_vma_ops *ops,
> -			      struct vm_area_struct *vma,
> -			      unsigned long start,
> -			      unsigned long end,
> -			      unsigned long *src,
> -			      unsigned long *dst,
> -			      void *private)
> -{
> -	return -EINVAL;
> -}
> -#endif /* IS_ENABLED(CONFIG_MIGRATE_VMA_HELPER) */
> +int migrate_vma_setup(struct migrate_vma *args);
> +void migrate_vma_pages(struct migrate_vma *migrate);
> +void migrate_vma_finalize(struct migrate_vma *migrate);
>   
>   #endif /* CONFIG_MIGRATION */
>   
> diff --git a/mm/migrate.c b/mm/migrate.c
> index 8992741f10aa..dc4e60a496f2 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -2118,16 +2118,6 @@ int migrate_misplaced_transhuge_page(struct mm_struct *mm,
>   #endif /* CONFIG_NUMA */
>   
>   #if defined(CONFIG_MIGRATE_VMA_HELPER)
> -struct migrate_vma {
> -	struct vm_area_struct	*vma;
> -	unsigned long		*dst;
> -	unsigned long		*src;
> -	unsigned long		cpages;
> -	unsigned long		npages;
> -	unsigned long		start;
> -	unsigned long		end;
> -};
> -
>   static int migrate_vma_collect_hole(unsigned long start,
>   				    unsigned long end,
>   				    struct mm_walk *walk)
> @@ -2578,6 +2568,108 @@ static void migrate_vma_unmap(struct migrate_vma *migrate)
>   	}
>   }
>   
> +/**
> + * migrate_vma_setup() - prepare to migrate a range of memory
> + * @args: contains the vma, start, and and pfns arrays for the migration
> + *
> + * Returns: negative errno on failures, 0 when 0 or more pages were migrated
> + * without an error.
> + *
> + * Prepare to migrate a range of memory virtual address range by collecting all
> + * the pages backing each virtual address in the range, saving them inside the
> + * src array.  Then lock those pages and unmap them. Once the pages are locked
> + * and unmapped, check whether each page is pinned or not.  Pages that aren't
> + * pinned have the MIGRATE_PFN_MIGRATE flag set (by this function) in the
> + * corresponding src array entry.  Then restores any pages that are pinned, by
> + * remapping and unlocking those pages.
> + *
> + * The caller should then allocate destination memory and copy source memory to
> + * it for all those entries (ie with MIGRATE_PFN_VALID and MIGRATE_PFN_MIGRATE
> + * flag set).  Once these are allocated and copied, the caller must update each
> + * corresponding entry in the dst array with the pfn value of the destination
> + * page and with the MIGRATE_PFN_VALID and MIGRATE_PFN_LOCKED flags set
> + * (destination pages must have their struct pages locked, via lock_page()).
> + *
> + * Note that the caller does not have to migrate all the pages that are marked
> + * with MIGRATE_PFN_MIGRATE flag in src array unless this is a migration from
> + * device memory to system memory.  If the caller cannot migrate a device page
> + * back to system memory, then it must return VM_FAULT_SIGBUS, which will
> + * might have severe consequences for the userspace process, so it should best
> + * be avoided if possible.
> + *
> + * For empty entries inside CPU page table (pte_none() or pmd_none() is true) we
> + * do set MIGRATE_PFN_MIGRATE flag inside the corresponding source array thus
> + * allowing the caller to allocate device memory for those unback virtual
> + * address.  For this the caller simply havs to allocate device memory and
> + * properly set the destination entry like for regular migration.  Note that
> + * this can still fails and thus inside the device driver must check if the
> + * migration was successful for those entries after calling migrate_vma_pages()
> + * just like for regular migration.
> + *
> + * After that, the callers must call migrate_vma_pages() to go over each entry
> + * in the src array that has the MIGRATE_PFN_VALID and MIGRATE_PFN_MIGRATE flag
> + * set. If the corresponding entry in dst array has MIGRATE_PFN_VALID flag set,
> + * then migrate_vma_pages() to migrate struct page information from the source
> + * struct page to the destination struct page.  If it fails to migrate the
> + * struct page information, then it clears the MIGRATE_PFN_MIGRATE flag in the
> + * src array.
> + *
> + * At this point all successfully migrated pages have an entry in the src
> + * array with MIGRATE_PFN_VALID and MIGRATE_PFN_MIGRATE flag set and the dst
> + * array entry with MIGRATE_PFN_VALID flag set.
> + *
> + * Once migrate_vma_pages() returns the caller may inspect which pages were
> + * successfully migrated, and which were not.  Successfully migrated pages will
> + * have the MIGRATE_PFN_MIGRATE flag set for their src array entry.
> + *
> + * It is safe to update device page table from within the finalize_and_map()
> + * callback because both destination and source page are still locked, and the
> + * mmap_sem is held in read mode (hence no one can unmap the range being
> + * migrated).
> + *
> + * Once the caller is done cleaning up things and updating its page table (if it
> + * chose to do so, this is not an obligation) it finally calls
> + * migrate_vma_finalize() to update the CPU page table to point to new pages
> + * for successfully migrated pages or otherwise restore the CPU page table to
> + * point to the original source pages.
> + */
> +int migrate_vma_setup(struct migrate_vma *args)
> +{
> +	long nr_pages = (args->end - args->start) >> PAGE_SHIFT;
> +
> +	args->start &= PAGE_MASK;
> +	args->end &= PAGE_MASK;
> +	if (!args->vma || is_vm_hugetlb_page(args->vma) ||
> +	    (args->vma->vm_flags & VM_SPECIAL) || vma_is_dax(args->vma))
> +		return -EINVAL;
> +	if (nr_pages <= 0)
> +		return -EINVAL;
> +	if (args->start < args->vma->vm_start ||
> +	    args->start >= args->vma->vm_end)
> +		return -EINVAL;
> +	if (args->end <= args->vma->vm_start || args->end > args->vma->vm_end)
> +		return -EINVAL;
> +	if (!args->src || !args->dst)
> +		return -EINVAL;
> +
> +	memset(args->src, 0, sizeof(*args->src) * nr_pages);
> +
> +	migrate_vma_collect(args);
> +	if (args->cpages)
> +		migrate_vma_prepare(args);
> +	if (args->cpages)
> +		migrate_vma_unmap(args);
> +
> +	/*
> +	 * At this point pages are locked and unmapped, and thus they have
> +	 * stable content and can safely be copied to destination memory that
> +	 * is allocated by the drivers.
> +	 */
> +	return 0;
> +
> +}
> +EXPORT_SYMBOL(migrate_vma_setup);
> +
>   static void migrate_vma_insert_page(struct migrate_vma *migrate,
>   				    unsigned long addr,
>   				    struct page *page,
> @@ -2709,7 +2801,7 @@ static void migrate_vma_insert_page(struct migrate_vma *migrate,
>   	*src &= ~MIGRATE_PFN_MIGRATE;
>   }
>   
> -/*
> +/**
>    * migrate_vma_pages() - migrate meta-data from src page to dst page
>    * @migrate: migrate struct containing all migration information
>    *
> @@ -2717,7 +2809,7 @@ static void migrate_vma_insert_page(struct migrate_vma *migrate,
>    * struct page. This effectively finishes the migration from source page to the
>    * destination page.
>    */
> -static void migrate_vma_pages(struct migrate_vma *migrate)
> +void migrate_vma_pages(struct migrate_vma *migrate)
>   {
>   	const unsigned long npages = migrate->npages;
>   	const unsigned long start = migrate->start;
> @@ -2791,8 +2883,9 @@ static void migrate_vma_pages(struct migrate_vma *migrate)
>   	if (notified)
>   		mmu_notifier_invalidate_range_only_end(&range);
>   }
> +EXPORT_SYMBOL(migrate_vma_pages);
>   
> -/*
> +/**
>    * migrate_vma_finalize() - restore CPU page table entry
>    * @migrate: migrate struct containing all migration information
>    *
> @@ -2803,7 +2896,7 @@ static void migrate_vma_pages(struct migrate_vma *migrate)
>    * This also unlocks the pages and puts them back on the lru, or drops the extra
>    * refcount, for device pages.
>    */
> -static void migrate_vma_finalize(struct migrate_vma *migrate)
> +void migrate_vma_finalize(struct migrate_vma *migrate)
>   {
>   	const unsigned long npages = migrate->npages;
>   	unsigned long i;
> @@ -2846,124 +2939,5 @@ static void migrate_vma_finalize(struct migrate_vma *migrate)
>   		}
>   	}
>   }
> -
> -/*
> - * migrate_vma() - migrate a range of memory inside vma
> - *
> - * @ops: migration callback for allocating destination memory and copying
> - * @vma: virtual memory area containing the range to be migrated
> - * @start: start address of the range to migrate (inclusive)
> - * @end: end address of the range to migrate (exclusive)
> - * @src: array of hmm_pfn_t containing source pfns
> - * @dst: array of hmm_pfn_t containing destination pfns
> - * @private: pointer passed back to each of the callback
> - * Returns: 0 on success, error code otherwise
> - *
> - * This function tries to migrate a range of memory virtual address range, using
> - * callbacks to allocate and copy memory from source to destination. First it
> - * collects all the pages backing each virtual address in the range, saving this
> - * inside the src array. Then it locks those pages and unmaps them. Once the pages
> - * are locked and unmapped, it checks whether each page is pinned or not. Pages
> - * that aren't pinned have the MIGRATE_PFN_MIGRATE flag set (by this function)
> - * in the corresponding src array entry. It then restores any pages that are
> - * pinned, by remapping and unlocking those pages.
> - *
> - * At this point it calls the alloc_and_copy() callback. For documentation on
> - * what is expected from that callback, see struct migrate_vma_ops comments in
> - * include/linux/migrate.h
> - *
> - * After the alloc_and_copy() callback, this function goes over each entry in
> - * the src array that has the MIGRATE_PFN_VALID and MIGRATE_PFN_MIGRATE flag
> - * set. If the corresponding entry in dst array has MIGRATE_PFN_VALID flag set,
> - * then the function tries to migrate struct page information from the source
> - * struct page to the destination struct page. If it fails to migrate the struct
> - * page information, then it clears the MIGRATE_PFN_MIGRATE flag in the src
> - * array.
> - *
> - * At this point all successfully migrated pages have an entry in the src
> - * array with MIGRATE_PFN_VALID and MIGRATE_PFN_MIGRATE flag set and the dst
> - * array entry with MIGRATE_PFN_VALID flag set.
> - *
> - * It then calls the finalize_and_map() callback. See comments for "struct
> - * migrate_vma_ops", in include/linux/migrate.h for details about
> - * finalize_and_map() behavior.
> - *
> - * After the finalize_and_map() callback, for successfully migrated pages, this
> - * function updates the CPU page table to point to new pages, otherwise it
> - * restores the CPU page table to point to the original source pages.
> - *
> - * Function returns 0 after the above steps, even if no pages were migrated
> - * (The function only returns an error if any of the arguments are invalid.)
> - *
> - * Both src and dst array must be big enough for (end - start) >> PAGE_SHIFT
> - * unsigned long entries.
> - */
> -int migrate_vma(const struct migrate_vma_ops *ops,
> -		struct vm_area_struct *vma,
> -		unsigned long start,
> -		unsigned long end,
> -		unsigned long *src,
> -		unsigned long *dst,
> -		void *private)
> -{
> -	struct migrate_vma migrate;
> -
> -	/* Sanity check the arguments */
> -	start &= PAGE_MASK;
> -	end &= PAGE_MASK;
> -	if (!vma || is_vm_hugetlb_page(vma) || (vma->vm_flags & VM_SPECIAL) ||
> -			vma_is_dax(vma))
> -		return -EINVAL;
> -	if (start < vma->vm_start || start >= vma->vm_end)
> -		return -EINVAL;
> -	if (end <= vma->vm_start || end > vma->vm_end)
> -		return -EINVAL;
> -	if (!ops || !src || !dst || start >= end)
> -		return -EINVAL;
> -
> -	memset(src, 0, sizeof(*src) * ((end - start) >> PAGE_SHIFT));
> -	migrate.src = src;
> -	migrate.dst = dst;
> -	migrate.start = start;
> -	migrate.npages = 0;
> -	migrate.cpages = 0;
> -	migrate.end = end;
> -	migrate.vma = vma;
> -
> -	/* Collect, and try to unmap source pages */
> -	migrate_vma_collect(&migrate);
> -	if (!migrate.cpages)
> -		return 0;
> -
> -	/* Lock and isolate page */
> -	migrate_vma_prepare(&migrate);
> -	if (!migrate.cpages)
> -		return 0;
> -
> -	/* Unmap pages */
> -	migrate_vma_unmap(&migrate);
> -	if (!migrate.cpages)
> -		return 0;
> -
> -	/*
> -	 * At this point pages are locked and unmapped, and thus they have
> -	 * stable content and can safely be copied to destination memory that
> -	 * is allocated by the callback.
> -	 *
> -	 * Note that migration can fail in migrate_vma_struct_page() for each
> -	 * individual page.
> -	 */
> -	ops->alloc_and_copy(vma, src, dst, start, end, private);
> -
> -	/* This does the real migration of struct page */
> -	migrate_vma_pages(&migrate);
> -
> -	ops->finalize_and_map(vma, src, dst, start, end, private);
> -
> -	/* Unlock and remap pages */
> -	migrate_vma_finalize(&migrate);
> -
> -	return 0;
> -}
> -EXPORT_SYMBOL(migrate_vma);
> +EXPORT_SYMBOL(migrate_vma_finalize);
>   #endif /* defined(MIGRATE_VMA_HELPER) */
> 

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 5/9] nouveau: simplify nouveau_dmem_migrate_to_ram
  2019-07-29 14:28 ` [PATCH 5/9] nouveau: simplify nouveau_dmem_migrate_to_ram Christoph Hellwig
  2019-07-29 23:26   ` Ralph Campbell
@ 2019-07-31  9:57   ` Bharata B Rao
  2019-08-01  7:46     ` Christoph Hellwig
  1 sibling, 1 reply; 30+ messages in thread
From: Bharata B Rao @ 2019-07-31  9:57 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jérôme Glisse, Jason Gunthorpe, Ben Skeggs,
	Ralph Campbell, Andrew Morton, linux-mm, nouveau, dri-devel,
	linux-kernel

On Mon, Jul 29, 2019 at 05:28:39PM +0300, Christoph Hellwig wrote:
> Factor the main copy page to ram routine out into a helper that acts on
> a single page and which doesn't require the nouveau_dmem_fault
> structure for argument passing.  Also remove the loop over multiple
> pages as we only handle one at the moment, although the structure of
> the main worker function makes it relatively easy to add multi page
> support back if needed in the future.  But at least for now this avoid
> the needed to dynamically allocate memory for the dma addresses in
> what is essentially the page fault path.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>  drivers/gpu/drm/nouveau/nouveau_dmem.c | 158 ++++++-------------------
>  1 file changed, 39 insertions(+), 119 deletions(-)
> 
> diff --git a/drivers/gpu/drm/nouveau/nouveau_dmem.c b/drivers/gpu/drm/nouveau/nouveau_dmem.c
> index 21052a4aaf69..036e6c07d489 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_dmem.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_dmem.c
> @@ -86,13 +86,6 @@ static inline struct nouveau_dmem *page_to_dmem(struct page *page)
>  	return container_of(page->pgmap, struct nouveau_dmem, pagemap);
>  }
>  
> -struct nouveau_dmem_fault {
> -	struct nouveau_drm *drm;
> -	struct nouveau_fence *fence;
> -	dma_addr_t *dma;
> -	unsigned long npages;
> -};
> -
>  struct nouveau_migrate {
>  	struct vm_area_struct *vma;
>  	struct nouveau_drm *drm;
> @@ -146,130 +139,55 @@ static void nouveau_dmem_fence_done(struct nouveau_fence **fence)
>  	}
>  }
>  
> -static void
> -nouveau_dmem_fault_alloc_and_copy(struct vm_area_struct *vma,
> -				  const unsigned long *src_pfns,
> -				  unsigned long *dst_pfns,
> -				  unsigned long start,
> -				  unsigned long end,
> -				  struct nouveau_dmem_fault *fault)
> +static vm_fault_t nouveau_dmem_fault_copy_one(struct nouveau_drm *drm,
> +		struct vm_area_struct *vma, unsigned long addr,
> +		unsigned long src, unsigned long *dst, dma_addr_t *dma_addr)
>  {
> -	struct nouveau_drm *drm = fault->drm;
>  	struct device *dev = drm->dev->dev;
> -	unsigned long addr, i, npages = 0;
> -	nouveau_migrate_copy_t copy;
> -	int ret;
> -
> -
> -	/* First allocate new memory */
> -	for (addr = start, i = 0; addr < end; addr += PAGE_SIZE, i++) {
> -		struct page *dpage, *spage;
> -
> -		dst_pfns[i] = 0;
> -		spage = migrate_pfn_to_page(src_pfns[i]);
> -		if (!spage || !(src_pfns[i] & MIGRATE_PFN_MIGRATE))
> -			continue;
> +	struct page *dpage, *spage;
>  
> -		dpage = alloc_page_vma(GFP_HIGHUSER, vma, addr);
> -		if (!dpage) {
> -			dst_pfns[i] = MIGRATE_PFN_ERROR;
> -			continue;
> -		}
> -		lock_page(dpage);
> -
> -		dst_pfns[i] = migrate_pfn(page_to_pfn(dpage)) |
> -			      MIGRATE_PFN_LOCKED;
> -		npages++;
> -	}
> +	spage = migrate_pfn_to_page(src);
> +	if (!spage || !(src & MIGRATE_PFN_MIGRATE))
> +		return 0;
>  
> -	/* Allocate storage for DMA addresses, so we can unmap later. */
> -	fault->dma = kmalloc(sizeof(*fault->dma) * npages, GFP_KERNEL);
> -	if (!fault->dma)
> +	dpage = alloc_page_vma(GFP_HIGHUSER, args->vma, addr);
> +	if (!dpage)
>  		goto error;
> +	lock_page(dpage);
>  
> -	/* Copy things over */
> -	copy = drm->dmem->migrate.copy_func;
> -	for (addr = start, i = 0; addr < end; addr += PAGE_SIZE, i++) {
> -		struct page *spage, *dpage;
> -
> -		dpage = migrate_pfn_to_page(dst_pfns[i]);
> -		if (!dpage || dst_pfns[i] == MIGRATE_PFN_ERROR)
> -			continue;
> -
> -		spage = migrate_pfn_to_page(src_pfns[i]);
> -		if (!spage || !(src_pfns[i] & MIGRATE_PFN_MIGRATE)) {
> -			dst_pfns[i] = MIGRATE_PFN_ERROR;
> -			__free_page(dpage);
> -			continue;
> -		}
> -
> -		fault->dma[fault->npages] =
> -			dma_map_page_attrs(dev, dpage, 0, PAGE_SIZE,
> -					   PCI_DMA_BIDIRECTIONAL,
> -					   DMA_ATTR_SKIP_CPU_SYNC);
> -		if (dma_mapping_error(dev, fault->dma[fault->npages])) {
> -			dst_pfns[i] = MIGRATE_PFN_ERROR;
> -			__free_page(dpage);
> -			continue;
> -		}
> -
> -		ret = copy(drm, 1, NOUVEAU_APER_HOST,
> -				fault->dma[fault->npages++],
> -				NOUVEAU_APER_VRAM,
> -				nouveau_dmem_page_addr(spage));
> -		if (ret) {
> -			dst_pfns[i] = MIGRATE_PFN_ERROR;
> -			__free_page(dpage);
> -			continue;
> -		}
> -	}
> +	*dma_addr = dma_map_page(dev, dpage, 0, PAGE_SIZE, DMA_BIDIRECTIONAL);
> +	if (dma_mapping_error(dev, *dma_addr))
> +		goto error_free_page;
>  
> -	nouveau_fence_new(drm->dmem->migrate.chan, false, &fault->fence);
> +	if (drm->dmem->migrate.copy_func(drm, 1, NOUVEAU_APER_HOST, *dma_addr,
> +			NOUVEAU_APER_VRAM, nouveau_dmem_page_addr(spage)))
> +		goto error_dma_unmap;
>  
> -	return;
> +	*dst = migrate_pfn(page_to_pfn(dpage)) | MIGRATE_PFN_LOCKED;
>  
> +error_dma_unmap:
> +	dma_unmap_page(dev, *dma_addr, PAGE_SIZE, DMA_BIDIRECTIONAL);
> +error_free_page:
> +	__free_page(dpage);
>  error:
> -	for (addr = start, i = 0; addr < end; addr += PAGE_SIZE, ++i) {
> -		struct page *page;
> -
> -		if (!dst_pfns[i] || dst_pfns[i] == MIGRATE_PFN_ERROR)
> -			continue;
> -
> -		page = migrate_pfn_to_page(dst_pfns[i]);
> -		dst_pfns[i] = MIGRATE_PFN_ERROR;
> -		if (page == NULL)
> -			continue;
> -
> -		__free_page(page);
> -	}
> -}
> -
> -static void
> -nouveau_dmem_fault_finalize_and_map(struct nouveau_dmem_fault *fault)
> -{
> -	struct nouveau_drm *drm = fault->drm;
> -
> -	nouveau_dmem_fence_done(&fault->fence);
> -
> -	while (fault->npages--) {
> -		dma_unmap_page(drm->dev->dev, fault->dma[fault->npages],
> -			       PAGE_SIZE, PCI_DMA_BIDIRECTIONAL);
> -	}
> -	kfree(fault->dma);
> +	return VM_FAULT_SIGBUS;

Looks like nouveau_dmem_fault_copy_one() is now returning VM_FAULT_SIGBUS
for success case. Is this expected?

Regards,
Bharata.


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 1/9] mm: turn migrate_vma upside down
  2019-07-31  1:46   ` Ralph Campbell
@ 2019-08-01  7:42     ` Christoph Hellwig
  0 siblings, 0 replies; 30+ messages in thread
From: Christoph Hellwig @ 2019-08-01  7:42 UTC (permalink / raw)
  To: Ralph Campbell
  Cc: Christoph Hellwig, Jérôme Glisse, Jason Gunthorpe,
	Ben Skeggs, Bharata B Rao, Andrew Morton, linux-mm, nouveau,
	dri-devel, linux-kernel

On Tue, Jul 30, 2019 at 06:46:25PM -0700, Ralph Campbell wrote:
>>   	for (i = 0; i < npages; i += c) {
>> -		unsigned long next;
>> -
>>   		c = min(SG_MAX_SINGLE_ALLOC, npages);
>> -		next = start + (c << PAGE_SHIFT);
>> -		ret = migrate_vma(&nouveau_dmem_migrate_ops, vma, start,
>> -				  next, src_pfns, dst_pfns, &migrate);
>> +		args.end = start + (c << PAGE_SHIFT);
>
> Since migrate_vma_setup() is called in a loop, either args.cpages and
> args.npages need to be cleared here or cleared in migrate_vma_setup().

I think clearing everything that is not used for argument passing in
migrate_vma_setup is a good idea.  I'll do that.

Btw, it seems like this was a fullquote just for the little comment
as far as I could tell from wading through it.  It would be very
appreciated to properly quote the replies.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 5/9] nouveau: simplify nouveau_dmem_migrate_to_ram
  2019-07-31  9:57   ` Bharata B Rao
@ 2019-08-01  7:46     ` Christoph Hellwig
  0 siblings, 0 replies; 30+ messages in thread
From: Christoph Hellwig @ 2019-08-01  7:46 UTC (permalink / raw)
  To: Bharata B Rao
  Cc: Christoph Hellwig, Jérôme Glisse, Jason Gunthorpe,
	Ben Skeggs, Ralph Campbell, Andrew Morton, linux-mm, nouveau,
	dri-devel, linux-kernel

On Wed, Jul 31, 2019 at 03:27:35PM +0530, Bharata B Rao wrote:
> > -	kfree(fault->dma);
> > +	return VM_FAULT_SIGBUS;

> Looks like nouveau_dmem_fault_copy_one() is now returning VM_FAULT_SIGBUS
> for success case. Is this expected?

No, fixed.

^ permalink raw reply	[flat|nested] 30+ messages in thread

end of thread, other threads:[~2019-08-01  7:46 UTC | newest]

Thread overview: 30+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-07-29 14:28 turn the hmm migrate_vma upside down Christoph Hellwig
2019-07-29 14:28 ` [PATCH 1/9] mm: turn " Christoph Hellwig
2019-07-29 23:12   ` Ralph Campbell
2019-07-29 23:43   ` Jerome Glisse
2019-07-31  1:46   ` Ralph Campbell
2019-08-01  7:42     ` Christoph Hellwig
2019-07-29 14:28 ` [PATCH 2/9] nouveau: reset dma_nr in nouveau_dmem_migrate_alloc_and_copy Christoph Hellwig
2019-07-29 23:18   ` Ralph Campbell
2019-07-29 14:28 ` [PATCH 3/9] nouveau: factor out device memory address calculation Christoph Hellwig
2019-07-29 23:21   ` Ralph Campbell
2019-07-29 14:28 ` [PATCH 4/9] nouveau: factor out dmem fence completion Christoph Hellwig
2019-07-29 23:23   ` Ralph Campbell
2019-07-29 14:28 ` [PATCH 5/9] nouveau: simplify nouveau_dmem_migrate_to_ram Christoph Hellwig
2019-07-29 23:26   ` Ralph Campbell
2019-07-31  9:57   ` Bharata B Rao
2019-08-01  7:46     ` Christoph Hellwig
2019-07-29 14:28 ` [PATCH 6/9] nouveau: simplify nouveau_dmem_migrate_vma Christoph Hellwig
2019-07-29 23:27   ` Ralph Campbell
2019-07-29 14:28 ` [PATCH 7/9] mm: remove the unused MIGRATE_PFN_ERROR flag Christoph Hellwig
2019-07-29 23:29   ` Ralph Campbell
2019-07-29 14:28 ` [PATCH 8/9] mm: remove the unused MIGRATE_PFN_DEVICE flag Christoph Hellwig
2019-07-29 23:31   ` Ralph Campbell
2019-07-29 14:28 ` [PATCH 9/9] mm: remove the MIGRATE_PFN_WRITE flag Christoph Hellwig
2019-07-29 23:30   ` Jerome Glisse
2019-07-30  5:46     ` Christoph Hellwig
2019-07-30 15:51       ` Jerome Glisse
2019-07-29 23:42   ` Ralph Campbell
2019-07-29 23:46     ` Jerome Glisse
2019-07-30 12:32 ` turn the hmm migrate_vma upside down Jason Gunthorpe
2019-07-30 13:09   ` Christoph Hellwig

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).