Fast prefaulting for GTT mmappings

* Fast prefaulting for GTT mmappings
@ 2015-04-07 16:31 Chris Wilson
  2015-04-07 16:31 ` [PATCH 1/5] mutex: Export an interface to wrap a mutex lock Chris Wilson
                   ` (5 more replies)
  0 siblings, 6 replies; 22+ messages in thread
From: Chris Wilson @ 2015-04-07 16:31 UTC (permalink / raw)
  To: Joonas Lahtinen; +Cc: intel-gfx

Hi Joonas,

  since you were looking at extending the GTT fault capabilities, I
thought you might like to review these patches, as they may prove
beneficial for your use case as well.
-Chris

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

* [PATCH 1/5] mutex: Export an interface to wrap a mutex lock
  2015-04-07 16:31 Fast prefaulting for GTT mmappings Chris Wilson
@ 2015-04-07 16:31 ` Chris Wilson
  2015-04-09  7:46   ` Joonas Lahtinen
  2015-04-07 16:31   ` Chris Wilson
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 22+ messages in thread
From: Chris Wilson @ 2015-04-07 16:31 UTC (permalink / raw)
  To: Joonas Lahtinen; +Cc: intel-gfx, Ben Widawsky

In i915, we have a big mutex around our device struct - every time before
we attempt to communicate with the GPU, we acquire the mutex. This makes
it a convenient juncture to place our GPU error handling - before we take
the mutex we first check whether the GPU is hung or whether we are in
the process of recovering from a GPU hang. So we wrap the call to
mutex_lock() alongside our additional error handling routines.

The downside of using a wrapper around mutex_lock() is that lockdep and
lockstat cannot discriminate the true callers of mutex_lock() unless we
provide a means for the wrapper to pass that information down.

It also appears that i915 is almost unique in this manner of wrapping
mutex_lock(), with only one or two other potential candidates for using
this interface.
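
As a rough userspace illustration of the idea (not the kernel code), the
wrapper can capture its caller's return address and hand it to the core
lock routine, which is the role _RET_IP_ plays in the hunk below. All
names here (tracked_mutex, tracked_lock_common, device_lock,
tracked_unlock) are hypothetical, and __builtin_return_address() is a
GCC/Clang builtin assumed to be available.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdint.h>

/* Hypothetical userspace stand-in; this type does not exist in the kernel. */
struct tracked_mutex {
	pthread_mutex_t lock;
	uintptr_t last_ip;	/* address recorded for the most recent acquirer */
};

#define TRACKED_MUTEX_INIT { PTHREAD_MUTEX_INITIALIZER, 0 }

/* Core lock routine: takes the lock and records who asked for it. */
static int tracked_lock_common(struct tracked_mutex *m, uintptr_t ip)
{
	int ret = pthread_mutex_lock(&m->lock);

	if (ret == 0)
		m->last_ip = ip;
	return ret;
}

/*
 * Driver-style wrapper: run the extra checks first, then take the lock.
 * It is marked noinline so that __builtin_return_address(0) refers to
 * the wrapper's caller rather than to some outer frame.
 */
__attribute__((noinline))
int device_lock(struct tracked_mutex *m, int wedged)
{
	if (wedged)
		return -EIO;	/* stand-in for the GPU-hang checks */

	return tracked_lock_common(m, (uintptr_t)__builtin_return_address(0));
}

void tracked_unlock(struct tracked_mutex *m)
{
	int ret = pthread_mutex_unlock(&m->lock);

	assert(ret == 0);
	(void)ret;	/* keep NDEBUG builds free of unused-variable warnings */
}
```

With this shape, whatever consumes last_ip sees the call site of
device_lock() instead of a single address inside the wrapper, which is
the information lockdep and lockstat would otherwise lose.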

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ben Widawsky <benjamin.widawsky@intel.com>
---
 drivers/gpu/drm/i915/i915_gem.c |  4 +++-
 include/linux/mutex.h           |  9 +++++++++
 kernel/locking/mutex.c          | 36 ++++++++++++++++++++++++++++++++++++
 3 files changed, 48 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 267fdf0f46ae..7ab8e0039790 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -135,7 +135,9 @@ int i915_mutex_lock_interruptible(struct drm_device *dev)
 	if (ret)
 		return ret;
 
-	ret = mutex_lock_interruptible(&dev->struct_mutex);
+	ret = mutex_lock_wrapper(&dev->struct_mutex,
+				 TASK_INTERRUPTIBLE,
+				 _RET_IP_);
 	if (ret)
 		return ret;
 
diff --git a/include/linux/mutex.h b/include/linux/mutex.h
index 2cb7531e7d7a..3f6030b3f5aa 100644
--- a/include/linux/mutex.h
+++ b/include/linux/mutex.h
@@ -142,10 +142,15 @@ extern int __must_check mutex_lock_interruptible_nested(struct mutex *lock,
 					unsigned int subclass);
 extern int __must_check mutex_lock_killable_nested(struct mutex *lock,
 					unsigned int subclass);
+extern int __must_check mutex_lock_wrapper_nested(struct mutex *lock,
+						  unsigned int subclass,
+						  long state,
+						  unsigned long ip);
 
 #define mutex_lock(lock) mutex_lock_nested(lock, 0)
 #define mutex_lock_interruptible(lock) mutex_lock_interruptible_nested(lock, 0)
 #define mutex_lock_killable(lock) mutex_lock_killable_nested(lock, 0)
+#define mutex_lock_wrapper(lock, state, ip) mutex_lock_wrapper_nested(lock, 0, state, ip)
 
 #define mutex_lock_nest_lock(lock, nest_lock)				\
 do {									\
@@ -157,10 +162,14 @@ do {									\
 extern void mutex_lock(struct mutex *lock);
 extern int __must_check mutex_lock_interruptible(struct mutex *lock);
 extern int __must_check mutex_lock_killable(struct mutex *lock);
+extern int __must_check mutex_lock_wrapper(struct mutex *lock,
+					   long state,
+					   unsigned long ip);
 
 # define mutex_lock_nested(lock, subclass) mutex_lock(lock)
 # define mutex_lock_interruptible_nested(lock, subclass) mutex_lock_interruptible(lock)
 # define mutex_lock_killable_nested(lock, subclass) mutex_lock_killable(lock)
+# define mutex_lock_wrapper_nested(lock, subclass, state, ip) mutex_lock_wrapper(lock, state, ip)
 # define mutex_lock_nest_lock(lock, nest_lock) mutex_lock(lock)
 #endif
 
diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
index 94674e5919cb..098b9e71ada1 100644
--- a/kernel/locking/mutex.c
+++ b/kernel/locking/mutex.c
@@ -658,6 +658,17 @@ mutex_lock_interruptible_nested(struct mutex *lock, unsigned int subclass)
 
 EXPORT_SYMBOL_GPL(mutex_lock_interruptible_nested);
 
+int __sched
+mutex_lock_wrapper_nested(struct mutex *lock, unsigned int subclass,
+			  long state, unsigned long ip)
+{
+	might_sleep();
+	return __mutex_lock_common(lock, state,
+				   subclass, NULL, ip, NULL, 0);
+}
+
+EXPORT_SYMBOL_GPL(mutex_lock_wrapper_nested);
+
 static inline int
 ww_mutex_deadlock_injection(struct ww_mutex *lock, struct ww_acquire_ctx *ctx)
 {
@@ -780,6 +791,9 @@ __mutex_lock_killable_slowpath(struct mutex *lock);
 static noinline int __sched
 __mutex_lock_interruptible_slowpath(struct mutex *lock);
 
+static noinline int __sched
+__mutex_lock_wrapper_slowpath(struct mutex *lock, long state, unsigned long ip);
+
 /**
  * mutex_lock_interruptible - acquire the mutex, interruptible
  * @lock: the mutex to be acquired
@@ -806,6 +820,21 @@ int __sched mutex_lock_interruptible(struct mutex *lock)
 
 EXPORT_SYMBOL(mutex_lock_interruptible);
 
+int __sched mutex_lock_wrapper(struct mutex *lock, long state, unsigned long ip)
+{
+	int ret;
+
+	might_sleep();
+	ret =  __mutex_fastpath_lock_retval(&lock->count);
+	if (likely(!ret)) {
+		mutex_set_owner(lock);
+		return 0;
+	} else
+		return __mutex_lock_wrapper_slowpath(lock, state, ip);
+}
+
+EXPORT_SYMBOL(mutex_lock_wrapper);
+
 int __sched mutex_lock_killable(struct mutex *lock)
 {
 	int ret;
@@ -844,6 +873,13 @@ __mutex_lock_interruptible_slowpath(struct mutex *lock)
 }
 
 static noinline int __sched
+__mutex_lock_wrapper_slowpath(struct mutex *lock, long state, unsigned long ip)
+{
+	return __mutex_lock_common(lock, state, 0,
+				   NULL, ip, NULL, 0);
+}
+
+static noinline int __sched
 __ww_mutex_lock_slowpath(struct ww_mutex *lock, struct ww_acquire_ctx *ctx)
 {
 	return __mutex_lock_common(&lock->base, TASK_UNINTERRUPTIBLE, 0,
-- 
2.1.4

* [PATCH 2/5] mm: Refactor remap_pfn_range()
  2015-04-07 16:31 Fast prefaulting for GTT mmappings Chris Wilson
@ 2015-04-07 16:31   ` Chris Wilson
  2015-04-07 16:31   ` Chris Wilson
                     ` (4 subsequent siblings)
  5 siblings, 0 replies; 22+ messages in thread
From: Chris Wilson @ 2015-04-07 16:31 UTC (permalink / raw)
  To: Joonas Lahtinen
  Cc: intel-gfx, Chris Wilson, Andrew Morton, Kirill A. Shutemov,
	Peter Zijlstra, Rik van Riel, Mel Gorman, Cyrill Gorcunov,
	Johannes Weiner, linux-mm

In preparation for exporting very similar functionality through another
interface, gut the current remap_pfn_range(). The motivating factor here
is to reuse the PGD/PUD/PMD/PTE walker, but allow back propagation of
errors rather than BUG_ON.
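
The shape of the refactor can be sketched in userspace with a two-level
table: a small cursor struct carries the running address and value
through every level, and the leaf reports -EBUSY upwards instead of
asserting. This is only an illustration under simplified assumptions;
the table layout and the names remap_cursor, remap_one, remap_leaf and
remap_range are hypothetical and not part of the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define LEAF_SLOTS 4	/* hypothetical: entries per leaf table */
#define NUM_LEAVES 4	/* hypothetical: leaf tables per directory */

/* Cursor shared by every level, playing the role of struct remap_pfn. */
struct remap_cursor {
	size_t addr;		/* next slot index to fill */
	unsigned long pfn;	/* next value to store; 0 means "empty" */
};

/* Fill one slot, or report -EBUSY instead of asserting. */
static int remap_one(struct remap_cursor *r, unsigned long *slot)
{
	assert(r->pfn != 0);	/* 0 is reserved for empty slots */

	if (*slot != 0)
		return -EBUSY;

	*slot = r->pfn++;
	r->addr++;
	return 0;
}

/* Leaf level: advance the cursor until 'end' or the first error. */
static int remap_leaf(struct remap_cursor *r, unsigned long *leaf, size_t end)
{
	int err;

	do {
		err = remap_one(r, &leaf[r->addr % LEAF_SLOTS]);
	} while (err == 0 && r->addr < end);

	return err;
}

/* Top level: clamp the range to each leaf, then pass any error straight up. */
int remap_range(unsigned long table[NUM_LEAVES][LEAF_SLOTS],
		size_t addr, size_t end, unsigned long pfn)
{
	struct remap_cursor r = { addr, pfn };
	int err;

	if (addr >= end || end > NUM_LEAVES * LEAF_SLOTS || pfn == 0)
		return -EINVAL;

	do {
		size_t leaf = r.addr / LEAF_SLOTS;
		size_t next = (leaf + 1) * LEAF_SLOTS;

		err = remap_leaf(&r, table[leaf], next < end ? next : end);
	} while (err == 0 && r.addr < end);

	return err;
}
```

Because the cursor records how far the walk got, a caller that sees an
error knows exactly which prefix of the range was populated and can
decide for itself whether to undo it or to treat the error as fatal.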

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: linux-mm@kvack.org
---
 mm/memory.c | 102 +++++++++++++++++++++++++++++++++---------------------------
 1 file changed, 57 insertions(+), 45 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index 97839f5c8c30..acb06f40d614 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1614,71 +1614,81 @@ int vm_insert_mixed(struct vm_area_struct *vma, unsigned long addr,
 }
 EXPORT_SYMBOL(vm_insert_mixed);
 
+struct remap_pfn {
+	struct mm_struct *mm;
+	unsigned long addr;
+	unsigned long pfn;
+	pgprot_t prot;
+};
+
 /*
  * maps a range of physical memory into the requested pages. the old
  * mappings are removed. any references to nonexistent pages results
  * in null mappings (currently treated as "copy-on-access")
  */
-static int remap_pte_range(struct mm_struct *mm, pmd_t *pmd,
-			unsigned long addr, unsigned long end,
-			unsigned long pfn, pgprot_t prot)
+static inline int remap_pfn(struct remap_pfn *r, pte_t *pte)
+{
+	if (!pte_none(*pte))
+		return -EBUSY;
+
+	set_pte_at(r->mm, r->addr, pte,
+		   pte_mkspecial(pfn_pte(r->pfn, r->prot)));
+	r->pfn++;
+	r->addr += PAGE_SIZE;
+	return 0;
+}
+
+static int remap_pte_range(struct remap_pfn *r, pmd_t *pmd, unsigned long end)
 {
 	pte_t *pte;
 	spinlock_t *ptl;
+	int err;
 
-	pte = pte_alloc_map_lock(mm, pmd, addr, &ptl);
+	pte = pte_alloc_map_lock(r->mm, pmd, r->addr, &ptl);
 	if (!pte)
 		return -ENOMEM;
+
 	arch_enter_lazy_mmu_mode();
 	do {
-		BUG_ON(!pte_none(*pte));
-		set_pte_at(mm, addr, pte, pte_mkspecial(pfn_pte(pfn, prot)));
-		pfn++;
-	} while (pte++, addr += PAGE_SIZE, addr != end);
+		err = remap_pfn(r, pte++);
+	} while (err == 0 && r->addr < end);
 	arch_leave_lazy_mmu_mode();
+
 	pte_unmap_unlock(pte - 1, ptl);
-	return 0;
+	return err;
 }
 
-static inline int remap_pmd_range(struct mm_struct *mm, pud_t *pud,
-			unsigned long addr, unsigned long end,
-			unsigned long pfn, pgprot_t prot)
+static inline int remap_pmd_range(struct remap_pfn *r, pud_t *pud, unsigned long end)
 {
 	pmd_t *pmd;
-	unsigned long next;
+	int err;
 
-	pfn -= addr >> PAGE_SHIFT;
-	pmd = pmd_alloc(mm, pud, addr);
+	pmd = pmd_alloc(r->mm, pud, r->addr);
 	if (!pmd)
 		return -ENOMEM;
 	VM_BUG_ON(pmd_trans_huge(*pmd));
+
 	do {
-		next = pmd_addr_end(addr, end);
-		if (remap_pte_range(mm, pmd, addr, next,
-				pfn + (addr >> PAGE_SHIFT), prot))
-			return -ENOMEM;
-	} while (pmd++, addr = next, addr != end);
-	return 0;
+		err = remap_pte_range(r, pmd++, pmd_addr_end(r->addr, end));
+	} while (err == 0 && r->addr < end);
+
+	return err;
 }
 
-static inline int remap_pud_range(struct mm_struct *mm, pgd_t *pgd,
-			unsigned long addr, unsigned long end,
-			unsigned long pfn, pgprot_t prot)
+static inline int remap_pud_range(struct remap_pfn *r, pgd_t *pgd, unsigned long end)
 {
 	pud_t *pud;
-	unsigned long next;
+	int err;
 
-	pfn -= addr >> PAGE_SHIFT;
-	pud = pud_alloc(mm, pgd, addr);
+	pud = pud_alloc(r->mm, pgd, r->addr);
 	if (!pud)
 		return -ENOMEM;
+
 	do {
-		next = pud_addr_end(addr, end);
-		if (remap_pmd_range(mm, pud, addr, next,
-				pfn + (addr >> PAGE_SHIFT), prot))
-			return -ENOMEM;
-	} while (pud++, addr = next, addr != end);
-	return 0;
+		err = remap_pmd_range(r, pud++, pud_addr_end(r->addr, end));
+	} while (err == 0 && r->addr < end);
+
+	return err;
 }
 
 /**
@@ -1694,10 +1704,9 @@ static inline int remap_pud_range(struct mm_struct *mm, pgd_t *pgd,
 int remap_pfn_range(struct vm_area_struct *vma, unsigned long addr,
 		    unsigned long pfn, unsigned long size, pgprot_t prot)
 {
-	pgd_t *pgd;
-	unsigned long next;
 	unsigned long end = addr + PAGE_ALIGN(size);
-	struct mm_struct *mm = vma->vm_mm;
+	struct remap_pfn r;
+	pgd_t *pgd;
 	int err;
 
 	/*
@@ -1731,19 +1740,22 @@ int remap_pfn_range(struct vm_area_struct *vma, unsigned long addr,
 	vma->vm_flags |= VM_IO | VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP;
 
 	BUG_ON(addr >= end);
-	pfn -= addr >> PAGE_SHIFT;
-	pgd = pgd_offset(mm, addr);
 	flush_cache_range(vma, addr, end);
+
+	r.mm = vma->vm_mm;
+	r.addr = addr;
+	r.pfn = pfn;
+	r.prot = prot;
+
+	pgd = pgd_offset(r.mm, addr);
 	do {
-		next = pgd_addr_end(addr, end);
-		err = remap_pud_range(mm, pgd, addr, next,
-				pfn + (addr >> PAGE_SHIFT), prot);
-		if (err)
-			break;
-	} while (pgd++, addr = next, addr != end);
+		err = remap_pud_range(&r, pgd++, pgd_addr_end(r.addr, end));
+	} while (err == 0 && r.addr < end);
 
-	if (err)
+	if (err) {
 		untrack_pfn(vma, pfn, PAGE_ALIGN(size));
+		BUG_ON(err == -EBUSY);
+	}
 
 	return err;
 }
-- 
2.1.4

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

* [PATCH 3/5] io-mapping: Always create a struct to hold metadata about the io-mapping
  2015-04-07 16:31 Fast prefaulting for GTT mmappings Chris Wilson
@ 2015-04-07 16:31   ` Chris Wilson
  2015-04-07 16:31   ` Chris Wilson
                     ` (4 subsequent siblings)
  5 siblings, 0 replies; 22+ messages in thread
From: Chris Wilson @ 2015-04-07 16:31 UTC (permalink / raw)
  To: Joonas Lahtinen; +Cc: intel-gfx, Chris Wilson, linux-mm

Currently, we only allocate a structure to hold metadata if we need to
allocate an ioremap for every access, such as on x86-32. However, it
would be useful to store basic information about the io-mapping, such as
its page protection, on all platforms.
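
A minimal userspace sketch of the "always allocate the metadata" layout
follows. It assumes calloc() as a stand-in for ioremap_wc() and a
made-up protection constant; the names io_mapping_sketch,
sketch_create_wc, sketch_map_wc, sketch_free and SKETCH_PROT_WC are
hypothetical. Unlike the hunk below, the sketch also checks the result
of the mapping step before returning the object.

```c
#include <assert.h>
#include <stdlib.h>

#define SKETCH_PROT_WC 0x8UL	/* hypothetical "write-combining" protection value */

/* Userspace stand-in for struct io_mapping; field names follow the patch. */
struct io_mapping_sketch {
	unsigned long base;	/* start of the mapped resource */
	unsigned long size;	/* length in bytes */
	unsigned long prot;	/* protection recorded at creation time */
	unsigned char *iomem;	/* the linear mapping itself */
};

struct io_mapping_sketch *sketch_create_wc(unsigned long base, unsigned long size)
{
	struct io_mapping_sketch *map = malloc(sizeof(*map));

	if (!map)
		return NULL;

	/* calloc() stands in for ioremap_wc(); the result is checked here. */
	map->iomem = calloc(1, size);
	if (!map->iomem) {
		free(map);
		return NULL;
	}

	map->base = base;
	map->size = size;
	map->prot = SKETCH_PROT_WC;
	return map;
}

/* Non-atomic map: just an offset into the linear mapping. */
unsigned char *sketch_map_wc(struct io_mapping_sketch *map, unsigned long offset)
{
	assert(offset < map->size);
	return map->iomem + offset;
}

void sketch_free(struct io_mapping_sketch *map)
{
	free(map->iomem);
	free(map);
}
```

The point of the extra allocation is that prot travels with the mapping,
so later users can read it back rather than rediscovering it.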

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: linux-mm@kvack.org
---
 include/linux/io-mapping.h | 52 ++++++++++++++++++++++++++++------------------
 1 file changed, 32 insertions(+), 20 deletions(-)

diff --git a/include/linux/io-mapping.h b/include/linux/io-mapping.h
index 657fab4efab3..e053011f50bb 100644
--- a/include/linux/io-mapping.h
+++ b/include/linux/io-mapping.h
@@ -31,16 +31,17 @@
  * See Documentation/io-mapping.txt
  */
 
-#ifdef CONFIG_HAVE_ATOMIC_IOMAP
-
-#include <asm/iomap.h>
-
 struct io_mapping {
 	resource_size_t base;
 	unsigned long size;
 	pgprot_t prot;
+	void __iomem *iomem;
 };
 
+
+#ifdef CONFIG_HAVE_ATOMIC_IOMAP
+
+#include <asm/iomap.h>
 /*
  * For small address space machines, mapping large objects
  * into the kernel virtual space isn't practical. Where
@@ -119,48 +120,59 @@ io_mapping_unmap(void __iomem *vaddr)
 #else
 
 #include <linux/uaccess.h>
-
-/* this struct isn't actually defined anywhere */
-struct io_mapping;
+#include <asm/pgtable_types.h>
 
 /* Create the io_mapping object*/
 static inline struct io_mapping *
 io_mapping_create_wc(resource_size_t base, unsigned long size)
 {
-	return (struct io_mapping __force *) ioremap_wc(base, size);
+	struct io_mapping *iomap;
+
+	iomap = kmalloc(sizeof(*iomap), GFP_KERNEL);
+	if (!iomap)
+		return NULL;
+
+	iomap->base = base;
+	iomap->size = size;
+	iomap->iomem = ioremap_wc(base, size);
+	iomap->prot = pgprot_writecombine(PAGE_KERNEL_IO);
+
+	return iomap;
 }
 
 static inline void
 io_mapping_free(struct io_mapping *mapping)
 {
-	iounmap((void __force __iomem *) mapping);
+	iounmap(mapping->iomem);
+	kfree(mapping);
 }
 
-/* Atomic map/unmap */
+/* Non-atomic map/unmap */
 static inline void __iomem *
-io_mapping_map_atomic_wc(struct io_mapping *mapping,
-			 unsigned long offset)
+io_mapping_map_wc(struct io_mapping *mapping, unsigned long offset)
 {
-	pagefault_disable();
-	return ((char __force __iomem *) mapping) + offset;
+	return mapping->iomem + offset;
 }
 
 static inline void
-io_mapping_unmap_atomic(void __iomem *vaddr)
+io_mapping_unmap(void __iomem *vaddr)
 {
-	pagefault_enable();
 }
 
-/* Non-atomic map/unmap */
+/* Atomic map/unmap */
 static inline void __iomem *
-io_mapping_map_wc(struct io_mapping *mapping, unsigned long offset)
+io_mapping_map_atomic_wc(struct io_mapping *mapping,
+			 unsigned long offset)
 {
-	return ((char __force __iomem *) mapping) + offset;
+	pagefault_disable();
+	return io_mapping_map_wc(mapping, offset);
 }
 
 static inline void
-io_mapping_unmap(void __iomem *vaddr)
+io_mapping_unmap_atomic(void __iomem *vaddr)
 {
+	io_mapping_unmap(vaddr);
+	pagefault_enable();
 }
 
 #endif /* HAVE_ATOMIC_IOMAP */
-- 
2.1.4

* [PATCH 4/5] mm: Export remap_io_mapping()
  2015-04-07 16:31 Fast prefaulting for GTT mmappings Chris Wilson
@ 2015-04-07 16:31   ` Chris Wilson
  2015-04-07 16:31   ` Chris Wilson
                     ` (4 subsequent siblings)
  5 siblings, 0 replies; 22+ messages in thread
From: Chris Wilson @ 2015-04-07 16:31 UTC (permalink / raw)
  To: Joonas Lahtinen
  Cc: intel-gfx, Chris Wilson, Andrew Morton, Kirill A. Shutemov,
	Peter Zijlstra, Rik van Riel, Mel Gorman, Cyrill Gorcunov,
	Johannes Weiner, linux-mm

This is similar to remap_pfn_range(), and uses the recently refactored
code to do the page table walking. The key difference is that it back
propagates its error, as this is required for use from within a pagefault
handler. The other difference is that it combines the page protection
from the io-mapping, which is known from when the io-mapping is created,
with the per-vma page protection flags. This avoids having to walk the
entire system description to rediscover the special page protection
established for the io-mapping.
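
The protection merge amounts to a masked combine: cache-control bits
come from the io-mapping, everything else from the vma. A small sketch,
assuming an illustrative bit layout loosely modelled on x86 PTE flags;
the PROT_* names and combine_prot() are hypothetical and are not the
kernel's definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative bit layout only; the values are assumptions for this sketch. */
#define PROT_PRESENT	0x001ULL
#define PROT_WRITE	0x002ULL
#define PROT_USER	0x004ULL
#define PROT_PWT	0x008ULL	/* cache-control bit */
#define PROT_PCD	0x010ULL	/* cache-control bit */
#define PROT_PAT	0x080ULL	/* cache-control bit */

#define PROT_CACHE_MASK	(PROT_PWT | PROT_PCD | PROT_PAT)

/*
 * Take the caching attributes from the io-mapping and the access
 * permissions (present/write/user, ...) from the vma.
 */
uint64_t combine_prot(uint64_t iomap_prot, uint64_t vma_prot)
{
	return (iomap_prot & PROT_CACHE_MASK) |
	       (vma_prot & ~PROT_CACHE_MASK);
}
```

For example, an io-mapping carrying PROT_PWT combined with a vma
carrying PROT_PRESENT | PROT_USER | PROT_PCD yields
PROT_PRESENT | PROT_USER | PROT_PWT: the vma's own cache bit is dropped
and the io-mapping's write permission is not inherited.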

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: linux-mm@kvack.org
---
 include/linux/mm.h |  4 ++++
 mm/memory.c        | 46 ++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 50 insertions(+)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 47a93928b90f..3dfecd58adb0 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2083,6 +2083,10 @@ unsigned long change_prot_numa(struct vm_area_struct *vma,
 struct vm_area_struct *find_extend_vma(struct mm_struct *, unsigned long addr);
 int remap_pfn_range(struct vm_area_struct *, unsigned long addr,
 			unsigned long pfn, unsigned long size, pgprot_t);
+struct io_mapping;
+int remap_io_mapping(struct vm_area_struct *,
+		     unsigned long addr, unsigned long pfn, unsigned long size,
+		     struct io_mapping *iomap);
 int vm_insert_page(struct vm_area_struct *, unsigned long addr, struct page *);
 int vm_insert_pfn(struct vm_area_struct *vma, unsigned long addr,
 			unsigned long pfn);
diff --git a/mm/memory.c b/mm/memory.c
index acb06f40d614..83bc5df3fafc 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -61,6 +61,7 @@
 #include <linux/string.h>
 #include <linux/dma-debug.h>
 #include <linux/debugfs.h>
+#include <linux/io-mapping.h>
 
 #include <asm/io.h>
 #include <asm/pgalloc.h>
@@ -1762,6 +1763,51 @@ int remap_pfn_range(struct vm_area_struct *vma, unsigned long addr,
 EXPORT_SYMBOL(remap_pfn_range);
 
 /**
+ * remap_io_mapping - remap an IO mapping to userspace
+ * @vma: user vma to map to
+ * @addr: target user address to start at
+ * @pfn: physical address of kernel memory
+ * @size: size of map area
+ * @iomap: the source io_mapping
+ *
+ *  Note: this is only safe if the mm semaphore is held when called.
+ */
+int remap_io_mapping(struct vm_area_struct *vma,
+		     unsigned long addr, unsigned long pfn, unsigned long size,
+		     struct io_mapping *iomap)
+{
+	unsigned long end = addr + PAGE_ALIGN(size);
+	struct remap_pfn r;
+	pgd_t *pgd;
+	int err;
+
+	if (WARN_ON(addr >= end))
+		return -EINVAL;
+
+#define MUST_SET (VM_IO | VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP)
+	BUG_ON(is_cow_mapping(vma->vm_flags));
+	BUG_ON((vma->vm_flags & MUST_SET) != MUST_SET);
+#undef MUST_SET
+
+	r.mm = vma->vm_mm;
+	r.addr = addr;
+	r.pfn = pfn;
+	r.prot = __pgprot((pgprot_val(iomap->prot) & _PAGE_CACHE_MASK) |
+			  (pgprot_val(vma->vm_page_prot) & ~_PAGE_CACHE_MASK));
+
+	pgd = pgd_offset(r.mm, addr);
+	do {
+		err = remap_pud_range(&r, pgd++, pgd_addr_end(r.addr, end));
+	} while (err == 0 && r.addr < end);
+
+	if (err)
+		zap_page_range_single(vma, addr, r.addr - addr, NULL);
+
+	return err;
+}
+EXPORT_SYMBOL(remap_io_mapping);
+
+/**
  * vm_iomap_memory - remap memory to userspace
  * @vma: user vma to map to
  * @start: start of area
-- 
2.1.4

* [PATCH 4/5] mm: Export remap_io_mapping()
@ 2015-04-07 16:31   ` Chris Wilson
  0 siblings, 0 replies; 22+ messages in thread
From: Chris Wilson @ 2015-04-07 16:31 UTC (permalink / raw)
  To: Joonas Lahtinen
  Cc: Rik van Riel, Peter Zijlstra, intel-gfx, Cyrill Gorcunov,
	linux-mm, Mel Gorman, Johannes Weiner, Andrew Morton,
	Kirill A. Shutemov

This is similar to remap_pfn_range(), and uses the recently refactor
code to do the page table walking. The key difference is that is back
propagates its error as this is required for use from within a pagefault
handler. The other difference, is that it combine the page protection
from io-mapping, which is known from when the io-mapping is created,
with the per-vma page protection flags. This avoids having to walk the
entire system description to rediscover the special page protection
established for the io-mapping.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: linux-mm@kvack.org
---
 include/linux/mm.h |  4 ++++
 mm/memory.c        | 46 ++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 50 insertions(+)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 47a93928b90f..3dfecd58adb0 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2083,6 +2083,10 @@ unsigned long change_prot_numa(struct vm_area_struct *vma,
 struct vm_area_struct *find_extend_vma(struct mm_struct *, unsigned long addr);
 int remap_pfn_range(struct vm_area_struct *, unsigned long addr,
 			unsigned long pfn, unsigned long size, pgprot_t);
+struct io_mapping;
+int remap_io_mapping(struct vm_area_struct *,
+		     unsigned long addr, unsigned long pfn, unsigned long size,
+		     struct io_mapping *iomap);
 int vm_insert_page(struct vm_area_struct *, unsigned long addr, struct page *);
 int vm_insert_pfn(struct vm_area_struct *vma, unsigned long addr,
 			unsigned long pfn);
diff --git a/mm/memory.c b/mm/memory.c
index acb06f40d614..83bc5df3fafc 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -61,6 +61,7 @@
 #include <linux/string.h>
 #include <linux/dma-debug.h>
 #include <linux/debugfs.h>
+#include <linux/io-mapping.h>
 
 #include <asm/io.h>
 #include <asm/pgalloc.h>
@@ -1762,6 +1763,51 @@ int remap_pfn_range(struct vm_area_struct *vma, unsigned long addr,
 EXPORT_SYMBOL(remap_pfn_range);
 
 /**
+ * remap_io_mapping - remap an IO mapping to userspace
+ * @vma: user vma to map to
+ * @addr: target user address to start at
+ * @pfn: physical address of kernel memory
+ * @size: size of map area
+ * @iomap: the source io_mapping
+ *
+ *  Note: this is only safe if the mm semaphore is held when called.
+ */
+int remap_io_mapping(struct vm_area_struct *vma,
+		     unsigned long addr, unsigned long pfn, unsigned long size,
+		     struct io_mapping *iomap)
+{
+	unsigned long end = addr + PAGE_ALIGN(size);
+	struct remap_pfn r;
+	pgd_t *pgd;
+	int err;
+
+	if (WARN_ON(addr >= end))
+		return -EINVAL;
+
+#define MUST_SET (VM_IO | VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP)
+	BUG_ON(is_cow_mapping(vma->vm_flags));
+	BUG_ON((vma->vm_flags & MUST_SET) != MUST_SET);
+#undef MUST_SET
+
+	r.mm = vma->vm_mm;
+	r.addr = addr;
+	r.pfn = pfn;
+	r.prot = __pgprot((pgprot_val(iomap->prot) & _PAGE_CACHE_MASK) |
+			  (pgprot_val(vma->vm_page_prot) & ~_PAGE_CACHE_MASK));
+
+	pgd = pgd_offset(r.mm, addr);
+	do {
+		err = remap_pud_range(&r, pgd++, pgd_addr_end(r.addr, end));
+	} while (err == 0 && r.addr < end);
+
+	if (err)
+		zap_page_range_single(vma, addr, r.addr - addr, NULL);
+
+	return err;
+}
+EXPORT_SYMBOL(remap_io_mapping);
+
+/**
  * vm_iomap_memory - remap memory to userspace
  * @vma: user vma to map to
  * @start: start of area
-- 
2.1.4
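
The error handling in remap_io_mapping() above (walk until the first
failure, then undo whatever was already mapped and hand the error back)
can be sketched in user space as follows. This is only a toy model under
stated assumptions: the "page table" is an array of flags, and
demo_table, demo_map_one() and demo_remap_range() are hypothetical names
that do not exist in the kernel.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define DEMO_NPAGES 8

/* Hypothetical stand-in for a page table: one flag per page slot. */
struct demo_table {
	int mapped[DEMO_NPAGES];
	size_t fail_at;	/* slot whose mapping fails; >= DEMO_NPAGES means never */
};

/* Map one slot, failing on demand so the error path can be exercised. */
static int demo_map_one(struct demo_table *t, size_t idx)
{
	if (idx == t->fail_at)
		return -ENOMEM;
	t->mapped[idx] = 1;
	return 0;
}

/* Map [start, start + count); on failure undo partial work and return the error. */
static int demo_remap_range(struct demo_table *t, size_t start, size_t count)
{
	size_t cur = start;
	size_t end = start + count;
	int err;

	if (count == 0 || end > DEMO_NPAGES)
		return -EINVAL;

	do {
		err = demo_map_one(t, cur);
		if (err == 0)
			cur++;
	} while (err == 0 && cur < end);

	if (err) {
		/* Analogue of zapping [addr, r.addr): drop what was mapped so far. */
		for (size_t i = start; i < cur; i++)
			t->mapped[i] = 0;
	}
	return err;
}
```

The point of the shape is that the caller sees either a fully populated
range or an untouched one plus an error code, which is what a fault
handler needs in order to report the failure instead of crashing.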

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 5/5] drm/i915: Use remap_io_mapping() to prefault all PTE in a single pass
  2015-04-07 16:31 Fast prefaulting for GTT mmappings Chris Wilson
@ 2015-04-07 16:31   ` Chris Wilson
  2015-04-07 16:31   ` Chris Wilson
                     ` (4 subsequent siblings)
  5 siblings, 0 replies; 22+ messages in thread
From: Chris Wilson @ 2015-04-07 16:31 UTC (permalink / raw)
  To: Joonas Lahtinen; +Cc: intel-gfx, Chris Wilson, linux-mm

On an Ivybridge i7-3720qm with 1600MHz DDR3, with 32 fences,
Upload rate for 2 linear surfaces:  8134MiB/s -> 8154MiB/s
Upload rate for 2 tiled surfaces:   8625MiB/s -> 8632MiB/s
Upload rate for 4 linear surfaces:  8127MiB/s -> 8134MiB/s
Upload rate for 4 tiled surfaces:   8602MiB/s -> 8629MiB/s
Upload rate for 8 linear surfaces:  8124MiB/s -> 8137MiB/s
Upload rate for 8 tiled surfaces:   8603MiB/s -> 8624MiB/s
Upload rate for 16 linear surfaces: 8123MiB/s -> 8128MiB/s
Upload rate for 16 tiled surfaces:  8606MiB/s -> 8618MiB/s
Upload rate for 32 linear surfaces: 8121MiB/s -> 8128MiB/s
Upload rate for 32 tiled surfaces:  8605MiB/s -> 8614MiB/s
Upload rate for 64 linear surfaces: 8121MiB/s -> 8127MiB/s
Upload rate for 64 tiled surfaces:  3017MiB/s -> 5202MiB/s
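
To put the figures above in relative terms, a small helper such as the
following can be used (percent_change() is a hypothetical name introduced
only for this illustration). Applied to the table, every row changes by
well under 1% except the last one, 64 tiled surfaces, which improves by
roughly 72%. That is the only row where the number of tiled surfaces
exceeds the 32 fences mentioned in the first line, which is presumably
why it is the case that benefits.

```c
#include <assert.h>

/* Relative change in percent when going from `before` to `after` (MiB/s). */
static double percent_change(double before, double after)
{
	assert(before > 0.0);
	return (after - before) / before * 100.0;
}
```

For example, percent_change(3017.0, 5202.0) is about 72.4, while
percent_change(8134.0, 8154.0) is about 0.25.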

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Testcase: igt/gem_fence_upload/performance
Testcase: igt/gem_mmap_gtt
Reviewed-by: Brad Volkin <bradley.d.volkin@intel.com>
Cc: linux-mm@kvack.org
---
 drivers/gpu/drm/i915/i915_gem.c | 23 ++++++-----------------
 1 file changed, 6 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 7ab8e0039790..90d772f72276 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1667,25 +1667,14 @@ int i915_gem_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
 	pfn = dev_priv->gtt.mappable_base + i915_gem_obj_ggtt_offset(obj);
 	pfn >>= PAGE_SHIFT;
 
-	if (!obj->fault_mappable) {
-		unsigned long size = min_t(unsigned long,
-					   vma->vm_end - vma->vm_start,
-					   obj->base.size);
-		int i;
+	ret = remap_io_mapping(vma,
+			       vma->vm_start, pfn, vma->vm_end - vma->vm_start,
+			       dev_priv->gtt.mappable);
+	if (ret)
+		goto unpin;
 
-		for (i = 0; i < size >> PAGE_SHIFT; i++) {
-			ret = vm_insert_pfn(vma,
-					    (unsigned long)vma->vm_start + i * PAGE_SIZE,
-					    pfn + i);
-			if (ret)
-				break;
-		}
+	obj->fault_mappable = true;
 
-		obj->fault_mappable = true;
-	} else
-		ret = vm_insert_pfn(vma,
-				    (unsigned long)vmf->virtual_address,
-				    pfn + page_offset);
 unpin:
 	i915_gem_object_ggtt_unpin(obj);
 unlock:
-- 
2.1.4

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH 5/5] drm/i915: Use remap_io_mapping() to prefault all PTE in a single pass
  2015-04-07 16:31   ` Chris Wilson
  (?)
@ 2015-04-07 19:28   ` shuang.he
  -1 siblings, 0 replies; 22+ messages in thread
From: shuang.he @ 2015-04-07 19:28 UTC (permalink / raw)
  To: shuang.he, ethan.gao, intel-gfx, chris

Tested-By: Intel Graphics QA PRTS (Patch Regression Test System Contact: shuang.he@intel.com)
Task id: 6142
-------------------------------------Summary-------------------------------------
Platform          Delta          drm-intel-nightly          Series Applied
PNV                 -3              272/272              269/272
ILK                 -1              302/302              301/302
SNB                                  303/303              303/303
IVB                                  338/338              338/338
BYT                 -1              287/287              286/287
HSW                                  361/361              361/361
BDW                                  308/308              308/308
-------------------------------------Detailed-------------------------------------
Platform  Test                                drm-intel-nightly          Series Applied
 PNV  igt@gem_tiled_pread_pwrite      FAIL(3)PASS(13)      FAIL(1)PASS(1)
 PNV  igt@gem_userptr_blits@coherency-sync      CRASH(6)PASS(9)      CRASH(2)
 PNV  igt@gen3_render_tiledx_blits      FAIL(8)PASS(6)      FAIL(2)
*ILK  igt@kms_flip@flip-vs-rmfb-interruptible      PASS(3)      DMESG_WARN(1)PASS(1)
(dmesg patch applied)drm:intel_pch_fifo_underrun_irq_handler[i915]]*ERROR*PCH_transcoder_A_FIFO_underrun@PCH transcoder A FIFO underrun
*BYT  igt@gem_exec_bad_domains@conflicting-write-domain      PASS(22)      FAIL(1)PASS(1)
Note: Pay particular attention to lines starting with '*'

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 2/5] mm: Refactor remap_pfn_range()
  2015-04-07 16:31   ` Chris Wilson
  (?)
@ 2015-04-07 20:27   ` Andrew Morton
  2015-04-08  9:45     ` Peter Zijlstra
  -1 siblings, 1 reply; 22+ messages in thread
From: Andrew Morton @ 2015-04-07 20:27 UTC (permalink / raw)
  To: Chris Wilson
  Cc: Joonas Lahtinen, intel-gfx, Kirill A. Shutemov, Peter Zijlstra,
	Rik van Riel, Mel Gorman, Cyrill Gorcunov, Johannes Weiner,
	linux-mm

On Tue,  7 Apr 2015 17:31:36 +0100 Chris Wilson <chris@chris-wilson.co.uk> wrote:

> In preparation for exporting very similar functionality through another
> interface, gut the current remap_pfn_range(). The motivating factor here
> is to reuse the PGD/PUD/PMD/PTE walker, but allow back propagation of
> errors rather than BUG_ON.

I'm not on intel-gfx and for some reason these patches didn't show up on
linux-mm.  I wanted to comment on "mutex: Export an interface to wrap a
mutex lock" but
http://lists.freedesktop.org/archives/intel-gfx/2015-April/064063.html
doesn't tell me which mailing lists were cc'ed and I can't find that
patch on linux-kernel.

Can you please do something to make this easier for us??

And please fully document all the mutex interfaces which you just
added.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 2/5] mm: Refactor remap_pfn_range()
  2015-04-07 20:27   ` Andrew Morton
@ 2015-04-08  9:45     ` Peter Zijlstra
  0 siblings, 0 replies; 22+ messages in thread
From: Peter Zijlstra @ 2015-04-08  9:45 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Chris Wilson, Joonas Lahtinen, intel-gfx, Kirill A. Shutemov,
	Rik van Riel, Mel Gorman, Cyrill Gorcunov, Johannes Weiner,
	linux-mm

On Tue, Apr 07, 2015 at 01:27:21PM -0700, Andrew Morton wrote:
> On Tue,  7 Apr 2015 17:31:36 +0100 Chris Wilson <chris@chris-wilson.co.uk> wrote:
> 
> > In preparation for exporting very similar functionality through another
> > interface, gut the current remap_pfn_range(). The motivating factor here
> > is to reuse the PGD/PUD/PMD/PTE walker, but allow back propagation of
> > errors rather than BUG_ON.
> 
> I'm not on intel-gfx and for some reason these patches didn't show up on
> linux-mm.  I wanted to comment on "mutex: Export an interface to wrap a
> mutex lock" but
> http://lists.freedesktop.org/archives/intel-gfx/2015-April/064063.html
> doesn't tell me which mailing lists were cc'ed and I can't find that
> patch on linux-kernel.
> 
> Can you please do something to make this easier for us??
> 
> And please fully document all the mutex interfaces which you just
> added.

Also, please Cc locking people if you poke at mutexes..

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/5] mutex: Export an interface to wrap a mutex lock
  2015-04-07 16:31 ` [PATCH 1/5] mutex: Export an interface to wrap a mutex lock Chris Wilson
@ 2015-04-09  7:46   ` Joonas Lahtinen
  0 siblings, 0 replies; 22+ messages in thread
From: Joonas Lahtinen @ 2015-04-09  7:46 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx, Ben Widawsky

Hi,

On Tue, 2015-04-07 at 17:31 +0100, Chris Wilson wrote:
> In i915, we have a big mutex around our device struct - every time before
> we attempt to communicate with the GPU, we acquire the mutex. This makes
> it a convenient juncture to place our GPU error handling - before we take
> the mutex we first check whether the GPU is hung or whether we are in
> the process of recovering from a GPU hang. So we wrap the call to
> mutex_lock() alongside our additional error handling routines.
> 
> The downside of using a wrapper around mutex_lock() is that lockdep and
> lockstat cannot discriminate the true callers of mutex_lock(). Unless we
> provide a means for the wrapper to pass that information down.
> 
> It also appears that i915 is almost unique in this manner of wrapping
> mutex_lock(), with only one or two other potential candidates for using
> this interface.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Ben Widawsky <benjamin.widawsky@intel.com>

Seems like a reasonable thing to do. I'd split this into two completely
separate parts: the kernel change and the GEM side.

You might want to CC a somewhat bigger audience on the kernel side, as
this could be considered an invasive change. Not because it changes
existing code, but because somebody with more experience in the mutex
subsystem might object to exposing such functionality (perhaps to keep
people from using mutexes the way we do), for reasons beyond me.

Regards, joonas
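
For readers following the thread, the wrapping pattern described in the
quoted commit message can be sketched in user space roughly as follows.
This is a toy model, not the proposed kernel interface: the lock is a
plain flag rather than a mutex, and demo_device, demo_lock_with_ip() and
demo_device_lock() are hypothetical names. The only point is that the
wrapper performs its own checks and then passes its caller's return
address down, so that whatever records lock ownership sees the real call
site rather than the wrapper. __builtin_return_address() is a GCC/Clang
builtin.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct demo_device {
	bool locked;			/* toy lock state, not a real mutex */
	bool gpu_wedged;		/* stand-in for the driver's hang state */
	const void *last_locker_ip;	/* what a lock debugger would record */
};

/* Lock primitive that accepts the caller's address explicitly. */
static int demo_lock_with_ip(struct demo_device *dev, const void *ip)
{
	assert(!dev->locked);	/* toy lock: no contention handling */
	dev->locked = true;
	dev->last_locker_ip = ip;
	return 0;
}

/* Wrapper: error handling first, then lock on behalf of our caller. */
__attribute__((noinline))
static int demo_device_lock(struct demo_device *dev)
{
	if (dev->gpu_wedged)
		return -EIO;
	return demo_lock_with_ip(dev, __builtin_return_address(0));
}

static void demo_device_unlock(struct demo_device *dev)
{
	dev->locked = false;
}
```

Without the explicit ip argument, every acquisition would be attributed
to demo_device_lock() itself, which is the loss of information the commit
message describes for lockdep and lockstat.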

> ---
>  drivers/gpu/drm/i915/i915_gem.c |  4 +++-
>  include/linux/mutex.h           |  9 +++++++++
>  kernel/locking/mutex.c          | 36 ++++++++++++++++++++++++++++++++++++
>  3 files changed, 48 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 267fdf0f46ae..7ab8e0039790 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -135,7 +135,9 @@ int i915_mutex_lock_interruptible(struct drm_device *dev)
>  	if (ret)
>  		return ret;
>  
> -	ret = mutex_lock_interruptible(&dev->struct_mutex);
> +	ret = mutex_lock_wrapper(&dev->struct_mutex,
> +				 TASK_INTERRUPTIBLE,
> +				 _RET_IP_);
>  	if (ret)
>  		return ret;
>  
> diff --git a/include/linux/mutex.h b/include/linux/mutex.h
> index 2cb7531e7d7a..3f6030b3f5aa 100644
> --- a/include/linux/mutex.h
> +++ b/include/linux/mutex.h
> @@ -142,10 +142,15 @@ extern int __must_check mutex_lock_interruptible_nested(struct mutex *lock,
>  					unsigned int subclass);
>  extern int __must_check mutex_lock_killable_nested(struct mutex *lock,
>  					unsigned int subclass);
> +extern int __must_check mutex_lock_wrapper_nested(struct mutex *lock,
> +						  unsigned int subclass,
> +						  long state,
> +						  unsigned long ip);
>  
>  #define mutex_lock(lock) mutex_lock_nested(lock, 0)
>  #define mutex_lock_interruptible(lock) mutex_lock_interruptible_nested(lock, 0)
>  #define mutex_lock_killable(lock) mutex_lock_killable_nested(lock, 0)
> +#define mutex_lock_wrapper(lock, state, ip) mutex_lock_wrapper_nested(lock, 0, state, ip)
>  
>  #define mutex_lock_nest_lock(lock, nest_lock)				\
>  do {									\
> @@ -157,10 +162,14 @@ do {									\
>  extern void mutex_lock(struct mutex *lock);
>  extern int __must_check mutex_lock_interruptible(struct mutex *lock);
>  extern int __must_check mutex_lock_killable(struct mutex *lock);
> +extern int __must_check mutex_lock_wrapper(struct mutex *lock,
> +					   long state,
> +					   unsigned long ip);
>  
>  # define mutex_lock_nested(lock, subclass) mutex_lock(lock)
>  # define mutex_lock_interruptible_nested(lock, subclass) mutex_lock_interruptible(lock)
>  # define mutex_lock_killable_nested(lock, subclass) mutex_lock_killable(lock)
> +# define mutex_lock_wrapper_nested(lock, subclass, state, ip) mutex_lock_wrapper(lock, state, ip)
>  # define mutex_lock_nest_lock(lock, nest_lock) mutex_lock(lock)
>  #endif
>  
> diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
> index 94674e5919cb..098b9e71ada1 100644
> --- a/kernel/locking/mutex.c
> +++ b/kernel/locking/mutex.c
> @@ -658,6 +658,17 @@ mutex_lock_interruptible_nested(struct mutex *lock, unsigned int subclass)
>  
>  EXPORT_SYMBOL_GPL(mutex_lock_interruptible_nested);
>  
> +int __sched
> +mutex_lock_wrapper_nested(struct mutex *lock, unsigned int subclass,
> +			  long state, unsigned long ip)
> +{
> +	might_sleep();
> +	return __mutex_lock_common(lock, state,
> +				   subclass, NULL, ip, NULL, 0);
> +}
> +
> +EXPORT_SYMBOL_GPL(mutex_lock_wrapper_nested);
> +
>  static inline int
>  ww_mutex_deadlock_injection(struct ww_mutex *lock, struct ww_acquire_ctx *ctx)
>  {
> @@ -780,6 +791,9 @@ __mutex_lock_killable_slowpath(struct mutex *lock);
>  static noinline int __sched
>  __mutex_lock_interruptible_slowpath(struct mutex *lock);
>  
> +static noinline int __sched
> +__mutex_lock_wrapper_slowpath(struct mutex *lock, long state, unsigned long ip);
> +
>  /**
>   * mutex_lock_interruptible - acquire the mutex, interruptible
>   * @lock: the mutex to be acquired
> @@ -806,6 +820,21 @@ int __sched mutex_lock_interruptible(struct mutex *lock)
>  
>  EXPORT_SYMBOL(mutex_lock_interruptible);
>  
> +int __sched mutex_lock_wrapper(struct mutex *lock, long state, unsigned long ip)
> +{
> +	int ret;
> +
> +	might_sleep();
> +	ret =  __mutex_fastpath_lock_retval(&lock->count);
> +	if (likely(!ret)) {
> +		mutex_set_owner(lock);
> +		return 0;
> +	} else
> +		return __mutex_lock_wrapper_slowpath(lock, state, ip);
> +}
> +
> +EXPORT_SYMBOL(mutex_lock_wrapper);
> +
>  int __sched mutex_lock_killable(struct mutex *lock)
>  {
>  	int ret;
> @@ -844,6 +873,13 @@ __mutex_lock_interruptible_slowpath(struct mutex *lock)
>  }
>  
>  static noinline int __sched
> +__mutex_lock_wrapper_slowpath(struct mutex *lock, long state, unsigned long ip)
> +{
> +	return __mutex_lock_common(lock, state, 0,
> +				   NULL, ip, NULL, 0);
> +}
> +
> +static noinline int __sched
>  __ww_mutex_lock_slowpath(struct ww_mutex *lock, struct ww_acquire_ctx *ctx)
>  {
>  	return __mutex_lock_common(&lock->base, TASK_UNINTERRUPTIBLE, 0,


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 3/5] io-mapping: Always create a struct to hold metadata about the io-mapping
  2015-04-07 16:31   ` Chris Wilson
@ 2015-04-09  7:58     ` Joonas Lahtinen
  -1 siblings, 0 replies; 22+ messages in thread
From: Joonas Lahtinen @ 2015-04-09  7:58 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx, linux-mm

On Tue, 2015-04-07 at 17:31 +0100, Chris Wilson wrote:
> Currently, we only allocate a structure to hold metadata if we need to
> allocate an ioremap for every access, such as on x86-32. However, it
> would be useful to store basic information about the io-mapping, such as
> its page protection, on all platforms.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
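
As a rough user-space illustration of the idea in the quoted description
(always allocate a small struct that carries the mapping's metadata,
instead of casting the mapped pointer itself), consider the sketch below.
All demo_* names are hypothetical, calloc() stands in for ioremap_wc(),
and an unsigned long stands in for pgprot_t. The check on the mapping
step is an addition of this sketch and is not taken from the quoted
patch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

struct demo_io_mapping {
	uint64_t base;
	unsigned long size;
	unsigned long prot;	/* stand-in for pgprot_t */
	void *iomem;		/* stand-in for the void __iomem * mapping */
};

/* Allocate the metadata struct and the backing "mapping" together. */
static struct demo_io_mapping *
demo_io_mapping_create(uint64_t base, unsigned long size, unsigned long prot)
{
	struct demo_io_mapping *iomap = malloc(sizeof(*iomap));

	if (!iomap)
		return NULL;

	iomap->iomem = calloc(1, size);	/* stand-in for ioremap_wc() */
	if (!iomap->iomem) {
		free(iomap);
		return NULL;
	}
	iomap->base = base;
	iomap->size = size;
	iomap->prot = prot;
	return iomap;
}

/* Return a pointer at `offset` inside the mapping. */
static void *demo_io_mapping_map(struct demo_io_mapping *iomap,
				 unsigned long offset)
{
	assert(offset < iomap->size);
	return (char *)iomap->iomem + offset;
}

/* Release the backing mapping, then the metadata struct. */
static void demo_io_mapping_free(struct demo_io_mapping *iomap)
{
	free(iomap->iomem);
	free(iomap);
}
```

Because the protection is stored next to the mapping, a later consumer
can read it back directly instead of rediscovering it.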

> Cc: linux-mm@kvack.org
> ---
>  include/linux/io-mapping.h | 52 ++++++++++++++++++++++++++++------------------
>  1 file changed, 32 insertions(+), 20 deletions(-)
> 
> diff --git a/include/linux/io-mapping.h b/include/linux/io-mapping.h
> index 657fab4efab3..e053011f50bb 100644
> --- a/include/linux/io-mapping.h
> +++ b/include/linux/io-mapping.h
> @@ -31,16 +31,17 @@
>   * See Documentation/io-mapping.txt
>   */
>  
> -#ifdef CONFIG_HAVE_ATOMIC_IOMAP
> -
> -#include <asm/iomap.h>
> -
>  struct io_mapping {
>  	resource_size_t base;
>  	unsigned long size;
>  	pgprot_t prot;
> +	void __iomem *iomem;
>  };
>  
> +
> +#ifdef CONFIG_HAVE_ATOMIC_IOMAP
> +
> +#include <asm/iomap.h>
>  /*
>   * For small address space machines, mapping large objects
>   * into the kernel virtual space isn't practical. Where
> @@ -119,48 +120,59 @@ io_mapping_unmap(void __iomem *vaddr)
>  #else
>  
>  #include <linux/uaccess.h>
> -
> -/* this struct isn't actually defined anywhere */
> -struct io_mapping;
> +#include <asm/pgtable_types.h>
>  
>  /* Create the io_mapping object*/
>  static inline struct io_mapping *
>  io_mapping_create_wc(resource_size_t base, unsigned long size)
>  {
> -	return (struct io_mapping __force *) ioremap_wc(base, size);
> +	struct io_mapping *iomap;
> +
> +	iomap = kmalloc(sizeof(*iomap), GFP_KERNEL);
> +	if (!iomap)
> +		return NULL;
> +
> +	iomap->base = base;
> +	iomap->size = size;
> +	iomap->iomem = ioremap_wc(base, size);
> +	iomap->prot = pgprot_writecombine(PAGE_KERNEL_IO);
> +
> +	return iomap;
>  }
>  
>  static inline void
>  io_mapping_free(struct io_mapping *mapping)
>  {
> -	iounmap((void __force __iomem *) mapping);
> +	iounmap(mapping->iomem);
> +	kfree(mapping);
>  }
>  
> -/* Atomic map/unmap */
> +/* Non-atomic map/unmap */
>  static inline void __iomem *
> -io_mapping_map_atomic_wc(struct io_mapping *mapping,
> -			 unsigned long offset)
> +io_mapping_map_wc(struct io_mapping *mapping, unsigned long offset)
>  {
> -	pagefault_disable();
> -	return ((char __force __iomem *) mapping) + offset;
> +	return mapping->iomem + offset;
>  }
>  
>  static inline void
> -io_mapping_unmap_atomic(void __iomem *vaddr)
> +io_mapping_unmap(void __iomem *vaddr)
>  {
> -	pagefault_enable();
>  }
>  
> -/* Non-atomic map/unmap */
> +/* Atomic map/unmap */
>  static inline void __iomem *
> -io_mapping_map_wc(struct io_mapping *mapping, unsigned long offset)
> +io_mapping_map_atomic_wc(struct io_mapping *mapping,
> +			 unsigned long offset)
>  {
> -	return ((char __force __iomem *) mapping) + offset;
> +	pagefault_disable();
> +	return io_mapping_map_wc(mapping, offset);
>  }
>  
>  static inline void
> -io_mapping_unmap(void __iomem *vaddr)
> +io_mapping_unmap_atomic(void __iomem *vaddr)
>  {
> +	io_mapping_unmap(vaddr);
> +	pagefault_enable();
>  }
>  
>  #endif /* HAVE_ATOMIC_IOMAP */


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 5/5] drm/i915: Use remap_io_mapping() to prefault all PTE in a single pass
  2015-04-07 16:31   ` Chris Wilson
@ 2015-04-09  8:00     ` Joonas Lahtinen
  -1 siblings, 0 replies; 22+ messages in thread
From: Joonas Lahtinen @ 2015-04-09  8:00 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx, linux-mm

On Tue, 2015-04-07 at 17:31 +0100, Chris Wilson wrote:
> On an Ivybridge i7-3720qm with 1600MHz DDR3, with 32 fences,
> Upload rate for 2 linear surfaces:  8134MiB/s -> 8154MiB/s
> Upload rate for 2 tiled surfaces:   8625MiB/s -> 8632MiB/s
> Upload rate for 4 linear surfaces:  8127MiB/s -> 8134MiB/s
> Upload rate for 4 tiled surfaces:   8602MiB/s -> 8629MiB/s
> Upload rate for 8 linear surfaces:  8124MiB/s -> 8137MiB/s
> Upload rate for 8 tiled surfaces:   8603MiB/s -> 8624MiB/s
> Upload rate for 16 linear surfaces: 8123MiB/s -> 8128MiB/s
> Upload rate for 16 tiled surfaces:  8606MiB/s -> 8618MiB/s
> Upload rate for 32 linear surfaces: 8121MiB/s -> 8128MiB/s
> Upload rate for 32 tiled surfaces:  8605MiB/s -> 8614MiB/s
> Upload rate for 64 linear surfaces: 8121MiB/s -> 8127MiB/s
> Upload rate for 64 tiled surfaces:  3017MiB/s -> 5202MiB/s
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Testcase: igt/gem_fence_upload/performance
> Testcase: igt/gem_mmap_gtt
> Reviewed-by: Brad Volkin <bradley.d.volkin@intel.com>

Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

> Cc: linux-mm@kvack.org
> ---
>  drivers/gpu/drm/i915/i915_gem.c | 23 ++++++-----------------
>  1 file changed, 6 insertions(+), 17 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 7ab8e0039790..90d772f72276 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -1667,25 +1667,14 @@ int i915_gem_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
>  	pfn = dev_priv->gtt.mappable_base + i915_gem_obj_ggtt_offset(obj);
>  	pfn >>= PAGE_SHIFT;
>  
> -	if (!obj->fault_mappable) {
> -		unsigned long size = min_t(unsigned long,
> -					   vma->vm_end - vma->vm_start,
> -					   obj->base.size);
> -		int i;
> +	ret = remap_io_mapping(vma,
> +			       vma->vm_start, pfn, vma->vm_end - vma->vm_start,
> +			       dev_priv->gtt.mappable);
> +	if (ret)
> +		goto unpin;
>  
> -		for (i = 0; i < size >> PAGE_SHIFT; i++) {
> -			ret = vm_insert_pfn(vma,
> -					    (unsigned long)vma->vm_start + i * PAGE_SIZE,
> -					    pfn + i);
> -			if (ret)
> -				break;
> -		}
> +	obj->fault_mappable = true;
>  
> -		obj->fault_mappable = true;
> -	} else
> -		ret = vm_insert_pfn(vma,
> -				    (unsigned long)vmf->virtual_address,
> -				    pfn + page_offset);
>  unpin:
>  	i915_gem_object_ggtt_unpin(obj);
>  unlock:


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 5/5] drm/i915: Use remap_io_mapping() to prefault all PTE in a single pass
@ 2015-04-09  8:00     ` Joonas Lahtinen
  0 siblings, 0 replies; 22+ messages in thread
From: Joonas Lahtinen @ 2015-04-09  8:00 UTC (permalink / raw)
  To: Chris Wilson; +Cc: linux-mm, intel-gfx

On ti, 2015-04-07 at 17:31 +0100, Chris Wilson wrote:
> On an Ivybridge i7-3720qm with 1600MHz DDR3, with 32 fences,
> Upload rate for 2 linear surfaces:  8134MiB/s -> 8154MiB/s
> Upload rate for 2 tiled surfaces:   8625MiB/s -> 8632MiB/s
> Upload rate for 4 linear surfaces:  8127MiB/s -> 8134MiB/s
> Upload rate for 4 tiled surfaces:   8602MiB/s -> 8629MiB/s
> Upload rate for 8 linear surfaces:  8124MiB/s -> 8137MiB/s
> Upload rate for 8 tiled surfaces:   8603MiB/s -> 8624MiB/s
> Upload rate for 16 linear surfaces: 8123MiB/s -> 8128MiB/s
> Upload rate for 16 tiled surfaces:  8606MiB/s -> 8618MiB/s
> Upload rate for 32 linear surfaces: 8121MiB/s -> 8128MiB/s
> Upload rate for 32 tiled surfaces:  8605MiB/s -> 8614MiB/s
> Upload rate for 64 linear surfaces: 8121MiB/s -> 8127MiB/s
> Upload rate for 64 tiled surfaces:  3017MiB/s -> 5202MiB/s
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Testcase: igt/gem_fence_upload/performance
> Testcase: igt/gem_mmap_gtt
> Reviewed-by: Brad Volkin <bradley.d.volkin@intel.com>

Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

> Cc: linux-mm@kvack.org
> ---
>  drivers/gpu/drm/i915/i915_gem.c | 23 ++++++-----------------
>  1 file changed, 6 insertions(+), 17 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 7ab8e0039790..90d772f72276 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -1667,25 +1667,14 @@ int i915_gem_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
>  	pfn = dev_priv->gtt.mappable_base + i915_gem_obj_ggtt_offset(obj);
>  	pfn >>= PAGE_SHIFT;
>  
> -	if (!obj->fault_mappable) {
> -		unsigned long size = min_t(unsigned long,
> -					   vma->vm_end - vma->vm_start,
> -					   obj->base.size);
> -		int i;
> +	ret = remap_io_mapping(vma,
> +			       vma->vm_start, pfn, vma->vm_end - vma->vm_start,
> +			       dev_priv->gtt.mappable);
> +	if (ret)
> +		goto unpin;
>  
> -		for (i = 0; i < size >> PAGE_SHIFT; i++) {
> -			ret = vm_insert_pfn(vma,
> -					    (unsigned long)vma->vm_start + i * PAGE_SIZE,
> -					    pfn + i);
> -			if (ret)
> -				break;
> -		}
> +	obj->fault_mappable = true;
>  
> -		obj->fault_mappable = true;
> -	} else
> -		ret = vm_insert_pfn(vma,
> -				    (unsigned long)vmf->virtual_address,
> -				    pfn + page_offset);
>  unpin:
>  	i915_gem_object_ggtt_unpin(obj);
>  unlock:
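
To put the quoted figures in proportion, the relative change can be worked
out with a few lines of C. This is only a sketch: percent_change() is a
helper name made up here, and the inputs are the MiB/s values copied from
the table in the commit message.

```c
#include <assert.h>

/*
 * Relative change, in percent, between two throughput figures.
 * percent_change() is a hypothetical helper for this note; it is not
 * part of the patch or of the igt test.
 */
static double percent_change(double before, double after)
{
	assert(before > 0.0);	/* a zero baseline has no relative change */
	return (after - before) / before * 100.0;
}
```

Fed the 64 tiled surfaces row (3017 -> 5202), it gives a gain of a little
over 72%, whereas the 2 linear surfaces row (8134 -> 8154) comes out at
roughly a quarter of a percent, so the benefit is concentrated in the one
case where there are more surfaces than the 32 fences.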



* Re: [PATCH 4/5] mm: Export remap_io_mapping()
  2015-04-07 16:31   ` Chris Wilson
@ 2015-04-09  8:18     ` Joonas Lahtinen
  -1 siblings, 0 replies; 22+ messages in thread
From: Joonas Lahtinen @ 2015-04-09  8:18 UTC (permalink / raw)
  To: Chris Wilson
  Cc: intel-gfx, Andrew Morton, Kirill A. Shutemov, Peter Zijlstra,
	Rik van Riel, Mel Gorman, Cyrill Gorcunov, Johannes Weiner,
	linux-mm

On Tue, 2015-04-07 at 17:31 +0100, Chris Wilson wrote:
> This is similar to remap_pfn_range(), and uses the recently refactored
> code to do the page table walking. The key difference is that it back
> propagates its error as this is required for use from within a pagefault
> handler. The other difference is that it combines the page protection
> from io-mapping, which is known from when the io-mapping is created,
> with the per-vma page protection flags. This avoids having to walk the
> entire system description to rediscover the special page protection
> established for the io-mapping.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Rik van Riel <riel@redhat.com>
> Cc: Mel Gorman <mgorman@suse.de>
> Cc: Cyrill Gorcunov <gorcunov@gmail.com>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Cc: linux-mm@kvack.org
> ---
>  include/linux/mm.h |  4 ++++
>  mm/memory.c        | 46 ++++++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 50 insertions(+)
> 
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 47a93928b90f..3dfecd58adb0 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -2083,6 +2083,10 @@ unsigned long change_prot_numa(struct vm_area_struct *vma,
>  struct vm_area_struct *find_extend_vma(struct mm_struct *, unsigned long addr);
>  int remap_pfn_range(struct vm_area_struct *, unsigned long addr,
>  			unsigned long pfn, unsigned long size, pgprot_t);
> +struct io_mapping;

This is unconditional code, so just move the struct forward declaration
to the top of the file after "struct writeback_control" and others.

> +int remap_io_mapping(struct vm_area_struct *,
> +		     unsigned long addr, unsigned long pfn, unsigned long size,
> +		     struct io_mapping *iomap);
>  int vm_insert_page(struct vm_area_struct *, unsigned long addr, struct page *);
>  int vm_insert_pfn(struct vm_area_struct *vma, unsigned long addr,
>  			unsigned long pfn);
> diff --git a/mm/memory.c b/mm/memory.c
> index acb06f40d614..83bc5df3fafc 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -61,6 +61,7 @@
>  #include <linux/string.h>
>  #include <linux/dma-debug.h>
>  #include <linux/debugfs.h>
> +#include <linux/io-mapping.h>
>  
>  #include <asm/io.h>
>  #include <asm/pgalloc.h>
> @@ -1762,6 +1763,51 @@ int remap_pfn_range(struct vm_area_struct *vma, unsigned long addr,
>  EXPORT_SYMBOL(remap_pfn_range);
>  
>  /**
> + * remap_io_mapping - remap an IO mapping to userspace
> + * @vma: user vma to map to
> + * @addr: target user address to start at
> + * @pfn: physical address of kernel memory
> + * @size: size of map area
> + * @iomap: the source io_mapping
> + *
> + *  Note: this is only safe if the mm semaphore is held when called.
> + */
> +int remap_io_mapping(struct vm_area_struct *vma,
> +		     unsigned long addr, unsigned long pfn, unsigned long size,
> +		     struct io_mapping *iomap)
> +{
> +	unsigned long end = addr + PAGE_ALIGN(size);
> +	struct remap_pfn r;
> +	pgd_t *pgd;
> +	int err;
> +
> +	if (WARN_ON(addr >= end))
> +		return -EINVAL;
> +
> +#define MUST_SET (VM_IO | VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP)
> +	BUG_ON(is_cow_mapping(vma->vm_flags));
> +	BUG_ON((vma->vm_flags & MUST_SET) != MUST_SET);
> +#undef MUST_SET
> +

I think that is a bit general for a define name; maybe something along
the lines of REMAP_IO_NEEDED_FLAGS, defined outside of the function, so
that it doesn't have to be #undef'd. If it is kept inside the function,
then at least prefix it with an underscore. But I don't see why it
couldn't be made available outside too.
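
Roughly what I have in mind, as a userspace sketch rather than a tested
kernel change. The flag values below are stand-ins chosen for the
illustration (the real VM_* definitions are in include/linux/mm.h), and
remap_io_flags_ok() is a hypothetical helper name:

```c
#include <stdbool.h>

/* Stand-in bit values for illustration only, not the kernel's. */
#define VM_IO		0x1UL
#define VM_PFNMAP	0x2UL
#define VM_DONTEXPAND	0x4UL
#define VM_DONTDUMP	0x8UL

/* Defined at file scope with a specific name, so no #undef is needed. */
#define REMAP_IO_NEEDED_FLAGS \
	(VM_IO | VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP)

/* True only if every required flag is set in vm_flags. */
static bool remap_io_flags_ok(unsigned long vm_flags)
{
	return (vm_flags & REMAP_IO_NEEDED_FLAGS) == REMAP_IO_NEEDED_FLAGS;
}
```

The check is the same as the quoted BUG_ON() condition, just inverted and
given a name that says what it is for.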

Otherwise looking good.

Regards, Joonas

> +	r.mm = vma->vm_mm;
> +	r.addr = addr;
> +	r.pfn = pfn;
> +	r.prot = __pgprot((pgprot_val(iomap->prot) & _PAGE_CACHE_MASK) |
> +			  (pgprot_val(vma->vm_page_prot) & ~_PAGE_CACHE_MASK));
> +
> +	pgd = pgd_offset(r.mm, addr);
> +	do {
> +		err = remap_pud_range(&r, pgd++, pgd_addr_end(r.addr, end));
> +	} while (err == 0 && r.addr < end);
> +
> +	if (err)
> +		zap_page_range_single(vma, addr, r.addr - addr, NULL);
> +
> +	return err;
> +}
> +EXPORT_SYMBOL(remap_io_mapping);
> +
> +/**
>   * vm_iomap_memory - remap memory to userspace
>   * @vma: user vma to map to
>   * @start: start of area



* Re: [PATCH 2/5] mm: Refactor remap_pfn_range()
  2015-04-07 16:31   ` Chris Wilson
  (?)
  (?)
@ 2015-04-09  8:32   ` Joonas Lahtinen
  -1 siblings, 0 replies; 22+ messages in thread
From: Joonas Lahtinen @ 2015-04-09  8:32 UTC (permalink / raw)
  To: Chris Wilson
  Cc: intel-gfx, Andrew Morton, Kirill A. Shutemov, Peter Zijlstra,
	Rik van Riel, Mel Gorman, Cyrill Gorcunov, Johannes Weiner,
	linux-mm

On Tue, 2015-04-07 at 17:31 +0100, Chris Wilson wrote:
> In preparation for exporting very similar functionality through another
> interface, gut the current remap_pfn_range(). The motivating factor here
> is to reuse the PGD/PUD/PMD/PTE walker, but allow back propagation of
> errors rather than BUG_ON.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Rik van Riel <riel@redhat.com>
> Cc: Mel Gorman <mgorman@suse.de>
> Cc: Cyrill Gorcunov <gorcunov@gmail.com>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Cc: linux-mm@kvack.org
> ---
>  mm/memory.c | 102 +++++++++++++++++++++++++++++++++---------------------------
>  1 file changed, 57 insertions(+), 45 deletions(-)
> 
> diff --git a/mm/memory.c b/mm/memory.c
> index 97839f5c8c30..acb06f40d614 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -1614,71 +1614,81 @@ int vm_insert_mixed(struct vm_area_struct *vma, unsigned long addr,
>  }
>  EXPORT_SYMBOL(vm_insert_mixed);
>  
> +struct remap_pfn {
> +	struct mm_struct *mm;
> +	unsigned long addr;
> +	unsigned long pfn;
> +	pgprot_t prot;
> +};
> +
>  /*
>   * maps a range of physical memory into the requested pages. the old
>   * mappings are removed. any references to nonexistent pages results
>   * in null mappings (currently treated as "copy-on-access")
>   */
> -static int remap_pte_range(struct mm_struct *mm, pmd_t *pmd,
> -			unsigned long addr, unsigned long end,
> -			unsigned long pfn, pgprot_t prot)
> +static inline int remap_pfn(struct remap_pfn *r, pte_t *pte)

I think this function deserves a brief comment of its own; keep it below
the old comment so as not to cause unnecessary noise in the diff.

Otherwise looks good.

Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

> +{
> +	if (!pte_none(*pte))
> +		return -EBUSY;
> +
> +	set_pte_at(r->mm, r->addr, pte,
> +		   pte_mkspecial(pfn_pte(r->pfn, r->prot)));
> +	r->pfn++;
> +	r->addr += PAGE_SIZE;
> +	return 0;
> +}
> +
> +static int remap_pte_range(struct remap_pfn *r, pmd_t *pmd, unsigned long end)
>  {
>  	pte_t *pte;
>  	spinlock_t *ptl;
> +	int err;
>  
> -	pte = pte_alloc_map_lock(mm, pmd, addr, &ptl);
> +	pte = pte_alloc_map_lock(r->mm, pmd, r->addr, &ptl);
>  	if (!pte)
>  		return -ENOMEM;
> +
>  	arch_enter_lazy_mmu_mode();
>  	do {
> -		BUG_ON(!pte_none(*pte));
> -		set_pte_at(mm, addr, pte, pte_mkspecial(pfn_pte(pfn, prot)));
> -		pfn++;
> -	} while (pte++, addr += PAGE_SIZE, addr != end);
> +		err = remap_pfn(r, pte++);
> +	} while (err == 0 && r->addr < end);
>  	arch_leave_lazy_mmu_mode();
> +
>  	pte_unmap_unlock(pte - 1, ptl);
> -	return 0;
> +	return err;
>  }
>  
> -static inline int remap_pmd_range(struct mm_struct *mm, pud_t *pud,
> -			unsigned long addr, unsigned long end,
> -			unsigned long pfn, pgprot_t prot)
> +static inline int remap_pmd_range(struct remap_pfn *r, pud_t *pud, unsigned long end)
>  {
>  	pmd_t *pmd;
> -	unsigned long next;
> +	int err;
>  
> -	pfn -= addr >> PAGE_SHIFT;
> -	pmd = pmd_alloc(mm, pud, addr);
> +	pmd = pmd_alloc(r->mm, pud, r->addr);
>  	if (!pmd)
>  		return -ENOMEM;
>  	VM_BUG_ON(pmd_trans_huge(*pmd));
> +
>  	do {
> -		next = pmd_addr_end(addr, end);
> -		if (remap_pte_range(mm, pmd, addr, next,
> -				pfn + (addr >> PAGE_SHIFT), prot))
> -			return -ENOMEM;
> -	} while (pmd++, addr = next, addr != end);
> -	return 0;
> +		err = remap_pte_range(r, pmd++, pmd_addr_end(r->addr, end));
> +	} while (err == 0 && r->addr < end);
> +
> +	return err;
>  }
>  
> -static inline int remap_pud_range(struct mm_struct *mm, pgd_t *pgd,
> -			unsigned long addr, unsigned long end,
> -			unsigned long pfn, pgprot_t prot)
> +static inline int remap_pud_range(struct remap_pfn *r, pgd_t *pgd, unsigned long end)
>  {
>  	pud_t *pud;
> -	unsigned long next;
> +	int err;
>  
> -	pfn -= addr >> PAGE_SHIFT;
> -	pud = pud_alloc(mm, pgd, addr);
> +	pud = pud_alloc(r->mm, pgd, r->addr);
>  	if (!pud)
>  		return -ENOMEM;
> +
>  	do {
> -		next = pud_addr_end(addr, end);
> -		if (remap_pmd_range(mm, pud, addr, next,
> -				pfn + (addr >> PAGE_SHIFT), prot))
> -			return -ENOMEM;
> -	} while (pud++, addr = next, addr != end);
> -	return 0;
> +		err = remap_pmd_range(r, pud++, pud_addr_end(r->addr, end));
> +	} while (err == 0 && r->addr < end);
> +
> +	return err;
>  }
>  
>  /**
> @@ -1694,10 +1704,9 @@ static inline int remap_pud_range(struct mm_struct *mm, pgd_t *pgd,
>  int remap_pfn_range(struct vm_area_struct *vma, unsigned long addr,
>  		    unsigned long pfn, unsigned long size, pgprot_t prot)
>  {
> -	pgd_t *pgd;
> -	unsigned long next;
>  	unsigned long end = addr + PAGE_ALIGN(size);
> -	struct mm_struct *mm = vma->vm_mm;
> +	struct remap_pfn r;
> +	pgd_t *pgd;
>  	int err;
>  
>  	/*
> @@ -1731,19 +1740,22 @@ int remap_pfn_range(struct vm_area_struct *vma, unsigned long addr,
>  	vma->vm_flags |= VM_IO | VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP;
>  
>  	BUG_ON(addr >= end);
> -	pfn -= addr >> PAGE_SHIFT;
> -	pgd = pgd_offset(mm, addr);
>  	flush_cache_range(vma, addr, end);
> +
> +	r.mm = vma->vm_mm;
> +	r.addr = addr;
> +	r.pfn = pfn;
> +	r.prot = prot;
> +
> +	pgd = pgd_offset(r.mm, addr);
>  	do {
> -		next = pgd_addr_end(addr, end);
> -		err = remap_pud_range(mm, pgd, addr, next,
> -				pfn + (addr >> PAGE_SHIFT), prot);
> -		if (err)
> -			break;
> -	} while (pgd++, addr = next, addr != end);
> +		err = remap_pud_range(&r, pgd++, pgd_addr_end(r.addr, end));
> +	} while (err == 0 && r.addr < end);
>  
> -	if (err)
> +	if (err) {
>  		untrack_pfn(vma, pfn, PAGE_ALIGN(size));
> +		BUG_ON(err == -EBUSY);
> +	}
>  
>  	return err;
>  }
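
The shape of the refactored walker, reduced to a userspace model: one cursor
struct is threaded through the walk, and an occupied slot yields -EBUSY to
the caller instead of a BUG_ON(). This is a sketch under assumptions: a flat
array stands in for the page tables, a slot value of 0 means "empty", and
the names remap_cursor, remap_one() and remap_range() are invented here.

```c
#include <errno.h>
#include <stddef.h>

/* Cursor shared by the whole walk, modelled on struct remap_pfn. */
struct remap_cursor {
	unsigned long *slots;	/* flat stand-in for the page tables */
	size_t idx;		/* next slot to fill (role of r->addr) */
	unsigned long pfn;	/* next frame number; must be non-zero */
};

/* Install one entry, or report -EBUSY if the slot is already in use. */
static int remap_one(struct remap_cursor *r)
{
	if (r->slots[r->idx] != 0)
		return -EBUSY;

	r->slots[r->idx] = r->pfn;
	r->pfn++;
	r->idx++;
	return 0;
}

/* Fill [r->idx, end); stop at the first error and hand it back. */
static int remap_range(struct remap_cursor *r, size_t end)
{
	int err;

	if (r->idx >= end)
		return -EINVAL;

	do {
		err = remap_one(r);
	} while (err == 0 && r->idx < end);

	return err;
}
```

On failure the cursor is left pointing at the slot that could not be filled,
so the caller knows how far the walk got and can decide what to do about it.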



* Re: Fast prefaulting for GTT mmappings
  2015-04-07 16:31 Fast prefaulting for GTT mmappings Chris Wilson
                   ` (4 preceding siblings ...)
  2015-04-07 16:31   ` Chris Wilson
@ 2015-04-09  8:33 ` Joonas Lahtinen
  5 siblings, 0 replies; 22+ messages in thread
From: Joonas Lahtinen @ 2015-04-09  8:33 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

On Tue, 2015-04-07 at 17:31 +0100, Chris Wilson wrote:
> Hi Joonas,
> 
>   since you were looking at extending the GTT fault capabilities, I
> thought you might like to review these patches, as they may prove
> beneficial for your use case as well.

Did so, sent a couple of cosmetic comments.

Regards, Joonas

> -Chris
> 



Thread overview: 22+ messages
2015-04-07 16:31 Fast prefaulting for GTT mmappings Chris Wilson
2015-04-07 16:31 ` [PATCH 1/5] mutex: Export an interface to wrap a mutex lock Chris Wilson
2015-04-09  7:46   ` Joonas Lahtinen
2015-04-07 16:31 ` [PATCH 2/5] mm: Refactor remap_pfn_range() Chris Wilson
2015-04-07 16:31   ` Chris Wilson
2015-04-07 20:27   ` Andrew Morton
2015-04-08  9:45     ` Peter Zijlstra
2015-04-09  8:32   ` Joonas Lahtinen
2015-04-07 16:31 ` [PATCH 3/5] io-mapping: Always create a struct to hold metadata about the io-mapping Chris Wilson
2015-04-07 16:31   ` Chris Wilson
2015-04-09  7:58   ` Joonas Lahtinen
2015-04-09  7:58     ` Joonas Lahtinen
2015-04-07 16:31 ` [PATCH 4/5] mm: Export remap_io_mapping() Chris Wilson
2015-04-07 16:31   ` Chris Wilson
2015-04-09  8:18   ` Joonas Lahtinen
2015-04-09  8:18     ` Joonas Lahtinen
2015-04-07 16:31 ` [PATCH 5/5] drm/i915: Use remap_io_mapping() to prefault all PTE in a single pass Chris Wilson
2015-04-07 16:31   ` Chris Wilson
2015-04-07 19:28   ` shuang.he
2015-04-09  8:00   ` Joonas Lahtinen
2015-04-09  8:00     ` Joonas Lahtinen
2015-04-09  8:33 ` Fast prefaulting for GTT mmappings Joonas Lahtinen
