* [PATCH] drm/i915/userptr: Reinvent GGTT self-faulting protection
@ 2017-03-15 14:01 Chris Wilson
2017-03-15 15:28 ` ✗ Fi.CI.BAT: warning for " Patchwork
2017-03-16 9:56 ` [PATCH] " Tvrtko Ursulin
0 siblings, 2 replies; 4+ messages in thread
From: Chris Wilson @ 2017-03-15 14:01 UTC (permalink / raw)
To: intel-gfx
lockdep doesn't like us taking the mm->mmap_sem inside the get_pages
callback for a couple of reasons. The straightforward deadlock:
[13755.434059] =============================================
[13755.434061] [ INFO: possible recursive locking detected ]
[13755.434064] 4.11.0-rc1-CI-CI_DRM_297+ #1 Tainted: G U
[13755.434066] ---------------------------------------------
[13755.434068] gem_userptr_bli/8398 is trying to acquire lock:
[13755.434070] (&mm->mmap_sem){++++++}, at: [<ffffffffa00c988a>] i915_gem_userptr_get_pages+0x5a/0x2e0 [i915]
[13755.434096]
but task is already holding lock:
[13755.434098] (&mm->mmap_sem){++++++}, at: [<ffffffff8104d485>] __do_page_fault+0x105/0x560
[13755.434105]
other info that might help us debug this:
[13755.434108] Possible unsafe locking scenario:
[13755.434110] CPU0
[13755.434111] ----
[13755.434112] lock(&mm->mmap_sem);
[13755.434115] lock(&mm->mmap_sem);
[13755.434117]
*** DEADLOCK ***
[13755.434121] May be due to missing lock nesting notation
[13755.434126] 2 locks held by gem_userptr_bli/8398:
[13755.434128] #0: (&mm->mmap_sem){++++++}, at: [<ffffffff8104d485>] __do_page_fault+0x105/0x560
[13755.434135] #1: (&obj->mm.lock){+.+.+.}, at: [<ffffffffa00b887d>] __i915_gem_object_get_pages+0x1d/0x70 [i915]
[13755.434156]
stack backtrace:
[13755.434161] CPU: 3 PID: 8398 Comm: gem_userptr_bli Tainted: G U 4.11.0-rc1-CI-CI_DRM_297+ #1
[13755.434165] Hardware name: GIGABYTE GB-BKi7(H)A-7500/MFLP7AP-00, BIOS F4 02/20/2017
[13755.434169] Call Trace:
[13755.434174] dump_stack+0x67/0x92
[13755.434178] __lock_acquire+0x133a/0x1b50
[13755.434182] lock_acquire+0xc9/0x220
[13755.434200] ? i915_gem_userptr_get_pages+0x5a/0x2e0 [i915]
[13755.434204] down_read+0x42/0x70
[13755.434221] ? i915_gem_userptr_get_pages+0x5a/0x2e0 [i915]
[13755.434238] i915_gem_userptr_get_pages+0x5a/0x2e0 [i915]
[13755.434255] ____i915_gem_object_get_pages+0x25/0x60 [i915]
[13755.434272] __i915_gem_object_get_pages+0x59/0x70 [i915]
[13755.434288] i915_gem_fault+0x397/0x6a0 [i915]
[13755.434304] ? i915_gem_fault+0x1a1/0x6a0 [i915]
[13755.434308] ? __lock_acquire+0x449/0x1b50
[13755.434311] ? __lock_acquire+0x449/0x1b50
[13755.434315] ? vm_mmap_pgoff+0xa9/0xd0
[13755.434318] __do_fault+0x19/0x70
[13755.434321] __handle_mm_fault+0x863/0xe50
[13755.434325] handle_mm_fault+0x17f/0x370
[13755.434329] ? handle_mm_fault+0x40/0x370
[13755.434332] __do_page_fault+0x279/0x560
[13755.434336] do_page_fault+0xc/0x10
[13755.434339] page_fault+0x22/0x30
[13755.434342] RIP: 0033:0x7f5ab91b5880
[13755.434345] RSP: 002b:00007fff62922218 EFLAGS: 00010216
[13755.434348] RAX: 0000000000b74500 RBX: 00007f5ab7f81000 RCX: 0000000000000000
[13755.434352] RDX: 0000000000100000 RSI: 00007f5ab7f81000 RDI: 00007f5aba61c000
[13755.434355] RBP: 00007f5aba61c000 R08: 0000000000000007 R09: 0000000100000000
[13755.434359] R10: 000000000000037d R11: 00007f5ab91b5840 R12: 0000000000000001
[13755.434362] R13: 0000000000000005 R14: 0000000000000001 R15: 0000000000000000
and cyclic deadlocks:
[ 2566.458979] ======================================================
[ 2566.459054] [ INFO: possible circular locking dependency detected ]
[ 2566.459127] 4.11.0-rc1+ #26 Not tainted
[ 2566.459194] -------------------------------------------------------
[ 2566.459266] gem_streaming_w/759 is trying to acquire lock:
[ 2566.459334] (&obj->mm.lock){+.+.+.}, at: [<ffffffffa034bc80>] i915_gem_object_pin_pages+0x0/0xc0 [i915]
[ 2566.459605]
[ 2566.459605] but task is already holding lock:
[ 2566.459699] (&mm->mmap_sem){++++++}, at: [<ffffffff8106fd11>] __do_page_fault+0x121/0x500
[ 2566.459814]
[ 2566.459814] which lock already depends on the new lock.
[ 2566.459814]
[ 2566.459934]
[ 2566.459934] the existing dependency chain (in reverse order) is:
[ 2566.460030]
[ 2566.460030] -> #1 (&mm->mmap_sem){++++++}:
[ 2566.460139] lock_acquire+0xfe/0x220
[ 2566.460214] down_read+0x4e/0x90
[ 2566.460444] i915_gem_userptr_get_pages+0x6e/0x340 [i915]
[ 2566.460669] ____i915_gem_object_get_pages+0x8b/0xd0 [i915]
[ 2566.460900] __i915_gem_object_get_pages+0x6a/0x80 [i915]
[ 2566.461132] __i915_vma_do_pin+0x7fa/0x930 [i915]
[ 2566.461352] eb_add_vma+0x67b/0x830 [i915]
[ 2566.461572] eb_lookup_vmas+0xafe/0x1010 [i915]
[ 2566.461792] i915_gem_do_execbuffer+0x715/0x2870 [i915]
[ 2566.462012] i915_gem_execbuffer2+0x106/0x2b0 [i915]
[ 2566.462152] drm_ioctl+0x36c/0x670 [drm]
[ 2566.462236] do_vfs_ioctl+0x12c/0xa60
[ 2566.462317] SyS_ioctl+0x41/0x70
[ 2566.462399] entry_SYSCALL_64_fastpath+0x1c/0xb1
[ 2566.462477]
[ 2566.462477] -> #0 (&obj->mm.lock){+.+.+.}:
[ 2566.462587] __lock_acquire+0x1602/0x1790
[ 2566.462661] lock_acquire+0xfe/0x220
[ 2566.462893] i915_gem_object_pin_pages+0x4c/0xc0 [i915]
[ 2566.463116] i915_gem_fault+0x2c2/0x8c0 [i915]
[ 2566.463197] __do_fault+0x42/0x130
[ 2566.463276] __handle_mm_fault+0x92c/0x1280
[ 2566.463356] handle_mm_fault+0x1e2/0x440
[ 2566.463443] __do_page_fault+0x1c4/0x500
[ 2566.463529] do_page_fault+0xc/0x10
[ 2566.463613] page_fault+0x1f/0x30
[ 2566.463693]
[ 2566.463693] other info that might help us debug this:
[ 2566.463693]
[ 2566.463820] Possible unsafe locking scenario:
[ 2566.463820]
[ 2566.463918] CPU0 CPU1
[ 2566.463988] ---- ----
[ 2566.464068] lock(&mm->mmap_sem);
[ 2566.464143] lock(&obj->mm.lock);
[ 2566.464226] lock(&mm->mmap_sem);
[ 2566.464304] lock(&obj->mm.lock);
[ 2566.464378]
[ 2566.464378] *** DEADLOCK ***
[ 2566.464378]
[ 2566.464504] 1 lock held by gem_streaming_w/759:
[ 2566.464576] #0: (&mm->mmap_sem){++++++}, at: [<ffffffff8106fd11>] __do_page_fault+0x121/0x500
[ 2566.464699]
[ 2566.464699] stack backtrace:
[ 2566.464801] CPU: 0 PID: 759 Comm: gem_streaming_w Not tainted 4.11.0-rc1+ #26
[ 2566.464881] Hardware name: GIGABYTE GB-BXBT-1900/MZBAYAB-00, BIOS F8 03/02/2016
[ 2566.464983] Call Trace:
[ 2566.465061] dump_stack+0x68/0x9f
[ 2566.465144] print_circular_bug+0x20b/0x260
[ 2566.465234] __lock_acquire+0x1602/0x1790
[ 2566.465323] ? debug_check_no_locks_freed+0x1a0/0x1a0
[ 2566.465564] ? i915_gem_object_wait+0x238/0x650 [i915]
[ 2566.465657] ? debug_lockdep_rcu_enabled.part.4+0x1a/0x30
[ 2566.465749] lock_acquire+0xfe/0x220
[ 2566.465985] ? i915_sg_trim+0x1b0/0x1b0 [i915]
[ 2566.466223] i915_gem_object_pin_pages+0x4c/0xc0 [i915]
[ 2566.466461] ? i915_sg_trim+0x1b0/0x1b0 [i915]
[ 2566.466699] i915_gem_fault+0x2c2/0x8c0 [i915]
[ 2566.466939] ? i915_gem_pwrite_ioctl+0xce0/0xce0 [i915]
[ 2566.467030] ? __lock_acquire+0x642/0x1790
[ 2566.467122] ? __lock_acquire+0x642/0x1790
[ 2566.467209] ? debug_lockdep_rcu_enabled+0x35/0x40
[ 2566.467299] ? get_unmapped_area+0x1b4/0x1d0
[ 2566.467387] __do_fault+0x42/0x130
[ 2566.467474] __handle_mm_fault+0x92c/0x1280
[ 2566.467564] ? __pmd_alloc+0x1e0/0x1e0
[ 2566.467651] ? vm_mmap_pgoff+0x160/0x190
[ 2566.467740] ? handle_mm_fault+0x111/0x440
[ 2566.467827] handle_mm_fault+0x1e2/0x440
[ 2566.467914] ? handle_mm_fault+0x5d/0x440
[ 2566.468002] __do_page_fault+0x1c4/0x500
[ 2566.468090] do_page_fault+0xc/0x10
[ 2566.468180] page_fault+0x1f/0x30
[ 2566.468263] RIP: 0033:0x557895ced32a
[ 2566.468337] RSP: 002b:00007fffd6dd8a10 EFLAGS: 00010202
[ 2566.468419] RAX: 00007f659a4db000 RBX: 0000000000000003 RCX: 00007f659ad032da
[ 2566.468501] RDX: 0000000000000000 RSI: 0000000000100000 RDI: 0000000000000000
[ 2566.468586] RBP: 0000000000000007 R08: 0000000000000003 R09: 0000000100000000
[ 2566.468667] R10: 0000000000000001 R11: 0000000000000246 R12: 0000557895ceda60
[ 2566.468749] R13: 0000000000000001 R14: 00007fffd6dd8ac0 R15: 00007f659a4db000
Fixes: 1c8782dd313e ("drm/i915/userptr: Disallow wrapping GTT into a userptr")
Testcase: igt/gem_userptr_blits/map-fixed-invalidate-gup
Testcase: igt/gem_userptr_blits/dmabuf-sync
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
drivers/gpu/drm/i915/i915_gem_userptr.c | 54 ++++++++-------------------------
1 file changed, 13 insertions(+), 41 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_gem_userptr.c b/drivers/gpu/drm/i915/i915_gem_userptr.c
index 07046fa4c6a9..58ccf8b8ca1c 100644
--- a/drivers/gpu/drm/i915/i915_gem_userptr.c
+++ b/drivers/gpu/drm/i915/i915_gem_userptr.c
@@ -66,13 +66,18 @@ static void cancel_userptr(struct work_struct *work)
{
struct i915_mmu_object *mo = container_of(work, typeof(*mo), work);
struct drm_i915_gem_object *obj = mo->obj;
- struct drm_device *dev = obj->base.dev;
+ struct work_struct *active;
+
+ /* Cancel any active worker and force us to re-evaluate gup */
+ mutex_lock(&obj->mm.lock);
+ active = fetch_and_zero(&obj->userptr.work);
+ mutex_unlock(&obj->mm.lock);
+ if (active)
+ goto out;
i915_gem_object_wait(obj, I915_WAIT_ALL, MAX_SCHEDULE_TIMEOUT, NULL);
- mutex_lock(&dev->struct_mutex);
- /* Cancel any active worker and force us to re-evaluate gup */
- obj->userptr.work = NULL;
+ mutex_lock(&obj->base.dev->struct_mutex);
/* We are inside a kthread context and can't be interrupted */
if (i915_gem_object_unbind(obj) == 0)
@@ -83,8 +88,10 @@ static void cancel_userptr(struct work_struct *work)
atomic_read(&obj->mm.pages_pin_count),
obj->pin_display);
+ mutex_unlock(&obj->base.dev->struct_mutex);
+
+out:
i915_gem_object_put(obj);
- mutex_unlock(&dev->struct_mutex);
}
static void add_object(struct i915_mmu_object *mo)
@@ -488,37 +495,6 @@ __i915_gem_userptr_set_active(struct drm_i915_gem_object *obj,
return ret;
}
-static bool noncontiguous_or_overlaps_ggtt(struct drm_i915_gem_object *obj,
- struct mm_struct *mm)
-{
- const struct vm_operations_struct *gem_vm_ops =
- obj->base.dev->driver->gem_vm_ops;
- unsigned long addr = obj->userptr.ptr;
- const unsigned long end = addr + obj->base.size;
- struct vm_area_struct *vma;
-
- /* Check for a contiguous set of vma covering the userptr, if any
- * are absent, they will EFAULT. More importantly if any point back
- * to a drm_i915_gem_object GTT mmaping, we may trigger a deadlock
- * between the deferred gup of this userptr and the object being
- * unbound calling invalidate_range -> cancel_userptr.
- */
- for (vma = find_vma(mm, addr); vma; vma = vma->vm_next) {
- if (vma->vm_start > addr) /* gap */
- break;
-
- if (vma->vm_ops == gem_vm_ops) /* GTT mmapping */
- break;
-
- if (vma->vm_end >= end)
- return false;
-
- addr = vma->vm_end;
- }
-
- return true;
-}
-
static void
__i915_gem_userptr_get_pages_worker(struct work_struct *_work)
{
@@ -665,10 +641,7 @@ i915_gem_userptr_get_pages(struct drm_i915_gem_object *obj)
pvec = NULL;
pinned = 0;
- down_read(&mm->mmap_sem);
- if (unlikely(noncontiguous_or_overlaps_ggtt(obj, mm))) {
- pinned = -EFAULT;
- } else if (mm == current->mm) {
+ if (mm == current->mm) {
pvec = drm_malloc_gfp(num_pages, sizeof(struct page *),
GFP_TEMPORARY |
__GFP_NORETRY |
@@ -693,7 +666,6 @@ i915_gem_userptr_get_pages(struct drm_i915_gem_object *obj)
}
if (active)
__i915_gem_userptr_set_active(obj, true);
- up_read(&mm->mmap_sem);
if (IS_ERR(pages))
release_pages(pvec, pinned, 0);
--
2.11.0
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 4+ messages in thread
* ✗ Fi.CI.BAT: warning for drm/i915/userptr: Reinvent GGTT self-faulting protection
2017-03-15 14:01 [PATCH] drm/i915/userptr: Reinvent GGTT self-faulting protection Chris Wilson
@ 2017-03-15 15:28 ` Patchwork
2017-03-16 9:56 ` [PATCH] " Tvrtko Ursulin
1 sibling, 0 replies; 4+ messages in thread
From: Patchwork @ 2017-03-15 15:28 UTC (permalink / raw)
To: Chris Wilson; +Cc: intel-gfx
== Series Details ==
Series: drm/i915/userptr: Reinvent GGTT self-faulting protection
URL : https://patchwork.freedesktop.org/series/21291/
State : warning
== Summary ==
Series 21291v1 drm/i915/userptr: Reinvent GGTT self-faulting protection
https://patchwork.freedesktop.org/api/1.0/series/21291/revisions/1/mbox/
Test gem_exec_flush:
Subgroup basic-batch-kernel-default-uc:
fail -> PASS (fi-snb-2600) fdo#100007
Test gem_exec_suspend:
Subgroup basic-s4-devices:
pass -> DMESG-WARN (fi-snb-2600)
Test kms_force_connector_basic:
Subgroup prune-stale-modes:
skip -> PASS (fi-snb-2600)
fdo#100007 https://bugs.freedesktop.org/show_bug.cgi?id=100007
fi-bdw-5557u total:278 pass:267 dwarn:0 dfail:0 fail:0 skip:11 time: 457s
fi-bsw-n3050 total:278 pass:239 dwarn:0 dfail:0 fail:0 skip:39 time: 574s
fi-bxt-j4205 total:278 pass:259 dwarn:0 dfail:0 fail:0 skip:19 time: 538s
fi-bxt-t5700 total:278 pass:258 dwarn:0 dfail:0 fail:0 skip:20 time: 564s
fi-byt-j1900 total:278 pass:251 dwarn:0 dfail:0 fail:0 skip:27 time: 497s
fi-byt-n2820 total:278 pass:247 dwarn:0 dfail:0 fail:0 skip:31 time: 490s
fi-hsw-4770 total:278 pass:262 dwarn:0 dfail:0 fail:0 skip:16 time: 435s
fi-ivb-3520m total:278 pass:260 dwarn:0 dfail:0 fail:0 skip:18 time: 510s
fi-ivb-3770 total:278 pass:260 dwarn:0 dfail:0 fail:0 skip:18 time: 491s
fi-kbl-7500u total:278 pass:260 dwarn:0 dfail:0 fail:0 skip:18 time: 484s
fi-skl-6260u total:278 pass:268 dwarn:0 dfail:0 fail:0 skip:10 time: 484s
fi-skl-6700k total:278 pass:256 dwarn:4 dfail:0 fail:0 skip:18 time: 492s
fi-skl-6770hq total:278 pass:268 dwarn:0 dfail:0 fail:0 skip:10 time: 511s
fi-snb-2520m total:278 pass:250 dwarn:0 dfail:0 fail:0 skip:28 time: 547s
fi-snb-2600 total:278 pass:248 dwarn:1 dfail:0 fail:0 skip:29 time: 428s
25f04730646ed8f7bc59e9e9d1cc9b4c9ecbc09d drm-tip: 2017y-03m-15d-12h-56m-15s UTC integration manifest
f41d3f2 drm/i915/userptr: Reinvent GGTT self-faulting protection
== Logs ==
For more details see: https://intel-gfx-ci.01.org/CI/Patchwork_4184/
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH] drm/i915/userptr: Reinvent GGTT self-faulting protection
2017-03-15 14:01 [PATCH] drm/i915/userptr: Reinvent GGTT self-faulting protection Chris Wilson
2017-03-15 15:28 ` ✗ Fi.CI.BAT: warning for " Patchwork
@ 2017-03-16 9:56 ` Tvrtko Ursulin
2017-03-16 11:21 ` Chris Wilson
1 sibling, 1 reply; 4+ messages in thread
From: Tvrtko Ursulin @ 2017-03-16 9:56 UTC (permalink / raw)
To: Chris Wilson, intel-gfx
On 15/03/2017 14:01, Chris Wilson wrote:
> lockdep doesn't like us taking the mm->mmap_sem inside the get_pages
> callback for a couple of reasons. The straightforward deadlock:
>
> [13755.434059] =============================================
> [13755.434061] [ INFO: possible recursive locking detected ]
> [13755.434064] 4.11.0-rc1-CI-CI_DRM_297+ #1 Tainted: G U
> [13755.434066] ---------------------------------------------
> [13755.434068] gem_userptr_bli/8398 is trying to acquire lock:
> [13755.434070] (&mm->mmap_sem){++++++}, at: [<ffffffffa00c988a>] i915_gem_userptr_get_pages+0x5a/0x2e0 [i915]
> [13755.434096]
> but task is already holding lock:
> [13755.434098] (&mm->mmap_sem){++++++}, at: [<ffffffff8104d485>] __do_page_fault+0x105/0x560
> [13755.434105]
> other info that might help us debug this:
> [13755.434108] Possible unsafe locking scenario:
>
> [13755.434110] CPU0
> [13755.434111] ----
> [13755.434112] lock(&mm->mmap_sem);
> [13755.434115] lock(&mm->mmap_sem);
> [13755.434117]
> *** DEADLOCK ***
>
> [13755.434121] May be due to missing lock nesting notation
>
> [13755.434126] 2 locks held by gem_userptr_bli/8398:
> [13755.434128] #0: (&mm->mmap_sem){++++++}, at: [<ffffffff8104d485>] __do_page_fault+0x105/0x560
> [13755.434135] #1: (&obj->mm.lock){+.+.+.}, at: [<ffffffffa00b887d>] __i915_gem_object_get_pages+0x1d/0x70 [i915]
> [13755.434156]
> stack backtrace:
> [13755.434161] CPU: 3 PID: 8398 Comm: gem_userptr_bli Tainted: G U 4.11.0-rc1-CI-CI_DRM_297+ #1
> [13755.434165] Hardware name: GIGABYTE GB-BKi7(H)A-7500/MFLP7AP-00, BIOS F4 02/20/2017
> [13755.434169] Call Trace:
> [13755.434174] dump_stack+0x67/0x92
> [13755.434178] __lock_acquire+0x133a/0x1b50
> [13755.434182] lock_acquire+0xc9/0x220
> [13755.434200] ? i915_gem_userptr_get_pages+0x5a/0x2e0 [i915]
> [13755.434204] down_read+0x42/0x70
> [13755.434221] ? i915_gem_userptr_get_pages+0x5a/0x2e0 [i915]
> [13755.434238] i915_gem_userptr_get_pages+0x5a/0x2e0 [i915]
> [13755.434255] ____i915_gem_object_get_pages+0x25/0x60 [i915]
> [13755.434272] __i915_gem_object_get_pages+0x59/0x70 [i915]
> [13755.434288] i915_gem_fault+0x397/0x6a0 [i915]
> [13755.434304] ? i915_gem_fault+0x1a1/0x6a0 [i915]
> [13755.434308] ? __lock_acquire+0x449/0x1b50
> [13755.434311] ? __lock_acquire+0x449/0x1b50
> [13755.434315] ? vm_mmap_pgoff+0xa9/0xd0
> [13755.434318] __do_fault+0x19/0x70
> [13755.434321] __handle_mm_fault+0x863/0xe50
> [13755.434325] handle_mm_fault+0x17f/0x370
> [13755.434329] ? handle_mm_fault+0x40/0x370
> [13755.434332] __do_page_fault+0x279/0x560
> [13755.434336] do_page_fault+0xc/0x10
> [13755.434339] page_fault+0x22/0x30
> [13755.434342] RIP: 0033:0x7f5ab91b5880
> [13755.434345] RSP: 002b:00007fff62922218 EFLAGS: 00010216
> [13755.434348] RAX: 0000000000b74500 RBX: 00007f5ab7f81000 RCX: 0000000000000000
> [13755.434352] RDX: 0000000000100000 RSI: 00007f5ab7f81000 RDI: 00007f5aba61c000
> [13755.434355] RBP: 00007f5aba61c000 R08: 0000000000000007 R09: 0000000100000000
> [13755.434359] R10: 000000000000037d R11: 00007f5ab91b5840 R12: 0000000000000001
> [13755.434362] R13: 0000000000000005 R14: 0000000000000001 R15: 0000000000000000
>
> and cyclic deadlocks:
>
> [ 2566.458979] ======================================================
> [ 2566.459054] [ INFO: possible circular locking dependency detected ]
> [ 2566.459127] 4.11.0-rc1+ #26 Not tainted
> [ 2566.459194] -------------------------------------------------------
> [ 2566.459266] gem_streaming_w/759 is trying to acquire lock:
> [ 2566.459334] (&obj->mm.lock){+.+.+.}, at: [<ffffffffa034bc80>] i915_gem_object_pin_pages+0x0/0xc0 [i915]
> [ 2566.459605]
> [ 2566.459605] but task is already holding lock:
> [ 2566.459699] (&mm->mmap_sem){++++++}, at: [<ffffffff8106fd11>] __do_page_fault+0x121/0x500
> [ 2566.459814]
> [ 2566.459814] which lock already depends on the new lock.
> [ 2566.459814]
> [ 2566.459934]
> [ 2566.459934] the existing dependency chain (in reverse order) is:
> [ 2566.460030]
> [ 2566.460030] -> #1 (&mm->mmap_sem){++++++}:
> [ 2566.460139] lock_acquire+0xfe/0x220
> [ 2566.460214] down_read+0x4e/0x90
> [ 2566.460444] i915_gem_userptr_get_pages+0x6e/0x340 [i915]
> [ 2566.460669] ____i915_gem_object_get_pages+0x8b/0xd0 [i915]
> [ 2566.460900] __i915_gem_object_get_pages+0x6a/0x80 [i915]
> [ 2566.461132] __i915_vma_do_pin+0x7fa/0x930 [i915]
> [ 2566.461352] eb_add_vma+0x67b/0x830 [i915]
> [ 2566.461572] eb_lookup_vmas+0xafe/0x1010 [i915]
> [ 2566.461792] i915_gem_do_execbuffer+0x715/0x2870 [i915]
> [ 2566.462012] i915_gem_execbuffer2+0x106/0x2b0 [i915]
> [ 2566.462152] drm_ioctl+0x36c/0x670 [drm]
> [ 2566.462236] do_vfs_ioctl+0x12c/0xa60
> [ 2566.462317] SyS_ioctl+0x41/0x70
> [ 2566.462399] entry_SYSCALL_64_fastpath+0x1c/0xb1
> [ 2566.462477]
> [ 2566.462477] -> #0 (&obj->mm.lock){+.+.+.}:
> [ 2566.462587] __lock_acquire+0x1602/0x1790
> [ 2566.462661] lock_acquire+0xfe/0x220
> [ 2566.462893] i915_gem_object_pin_pages+0x4c/0xc0 [i915]
> [ 2566.463116] i915_gem_fault+0x2c2/0x8c0 [i915]
> [ 2566.463197] __do_fault+0x42/0x130
> [ 2566.463276] __handle_mm_fault+0x92c/0x1280
> [ 2566.463356] handle_mm_fault+0x1e2/0x440
> [ 2566.463443] __do_page_fault+0x1c4/0x500
> [ 2566.463529] do_page_fault+0xc/0x10
> [ 2566.463613] page_fault+0x1f/0x30
> [ 2566.463693]
> [ 2566.463693] other info that might help us debug this:
> [ 2566.463693]
> [ 2566.463820] Possible unsafe locking scenario:
> [ 2566.463820]
> [ 2566.463918] CPU0 CPU1
> [ 2566.463988] ---- ----
> [ 2566.464068] lock(&mm->mmap_sem);
> [ 2566.464143] lock(&obj->mm.lock);
> [ 2566.464226] lock(&mm->mmap_sem);
> [ 2566.464304] lock(&obj->mm.lock);
> [ 2566.464378]
> [ 2566.464378] *** DEADLOCK ***
> [ 2566.464378]
> [ 2566.464504] 1 lock held by gem_streaming_w/759:
> [ 2566.464576] #0: (&mm->mmap_sem){++++++}, at: [<ffffffff8106fd11>] __do_page_fault+0x121/0x500
> [ 2566.464699]
> [ 2566.464699] stack backtrace:
> [ 2566.464801] CPU: 0 PID: 759 Comm: gem_streaming_w Not tainted 4.11.0-rc1+ #26
> [ 2566.464881] Hardware name: GIGABYTE GB-BXBT-1900/MZBAYAB-00, BIOS F8 03/02/2016
> [ 2566.464983] Call Trace:
> [ 2566.465061] dump_stack+0x68/0x9f
> [ 2566.465144] print_circular_bug+0x20b/0x260
> [ 2566.465234] __lock_acquire+0x1602/0x1790
> [ 2566.465323] ? debug_check_no_locks_freed+0x1a0/0x1a0
> [ 2566.465564] ? i915_gem_object_wait+0x238/0x650 [i915]
> [ 2566.465657] ? debug_lockdep_rcu_enabled.part.4+0x1a/0x30
> [ 2566.465749] lock_acquire+0xfe/0x220
> [ 2566.465985] ? i915_sg_trim+0x1b0/0x1b0 [i915]
> [ 2566.466223] i915_gem_object_pin_pages+0x4c/0xc0 [i915]
> [ 2566.466461] ? i915_sg_trim+0x1b0/0x1b0 [i915]
> [ 2566.466699] i915_gem_fault+0x2c2/0x8c0 [i915]
> [ 2566.466939] ? i915_gem_pwrite_ioctl+0xce0/0xce0 [i915]
> [ 2566.467030] ? __lock_acquire+0x642/0x1790
> [ 2566.467122] ? __lock_acquire+0x642/0x1790
> [ 2566.467209] ? debug_lockdep_rcu_enabled+0x35/0x40
> [ 2566.467299] ? get_unmapped_area+0x1b4/0x1d0
> [ 2566.467387] __do_fault+0x42/0x130
> [ 2566.467474] __handle_mm_fault+0x92c/0x1280
> [ 2566.467564] ? __pmd_alloc+0x1e0/0x1e0
> [ 2566.467651] ? vm_mmap_pgoff+0x160/0x190
> [ 2566.467740] ? handle_mm_fault+0x111/0x440
> [ 2566.467827] handle_mm_fault+0x1e2/0x440
> [ 2566.467914] ? handle_mm_fault+0x5d/0x440
> [ 2566.468002] __do_page_fault+0x1c4/0x500
> [ 2566.468090] do_page_fault+0xc/0x10
> [ 2566.468180] page_fault+0x1f/0x30
> [ 2566.468263] RIP: 0033:0x557895ced32a
> [ 2566.468337] RSP: 002b:00007fffd6dd8a10 EFLAGS: 00010202
> [ 2566.468419] RAX: 00007f659a4db000 RBX: 0000000000000003 RCX: 00007f659ad032da
> [ 2566.468501] RDX: 0000000000000000 RSI: 0000000000100000 RDI: 0000000000000000
> [ 2566.468586] RBP: 0000000000000007 R08: 0000000000000003 R09: 0000000100000000
> [ 2566.468667] R10: 0000000000000001 R11: 0000000000000246 R12: 0000557895ceda60
> [ 2566.468749] R13: 0000000000000001 R14: 00007fffd6dd8ac0 R15: 00007f659a4db000
>
> Fixes: 1c8782dd313e ("drm/i915/userptr: Disallow wrapping GTT into a userptr")
> Testcase: igt/gem_userptr_blits/map-fixed-invalidate-gup
> Testcase: igt/gem_userptr_blits/dmabuf-sync
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Michał Winiarski <michal.winiarski@intel.com>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
> drivers/gpu/drm/i915/i915_gem_userptr.c | 54 ++++++++-------------------------
> 1 file changed, 13 insertions(+), 41 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem_userptr.c b/drivers/gpu/drm/i915/i915_gem_userptr.c
> index 07046fa4c6a9..58ccf8b8ca1c 100644
> --- a/drivers/gpu/drm/i915/i915_gem_userptr.c
> +++ b/drivers/gpu/drm/i915/i915_gem_userptr.c
> @@ -66,13 +66,18 @@ static void cancel_userptr(struct work_struct *work)
> {
> struct i915_mmu_object *mo = container_of(work, typeof(*mo), work);
> struct drm_i915_gem_object *obj = mo->obj;
> - struct drm_device *dev = obj->base.dev;
> + struct work_struct *active;
> +
> + /* Cancel any active worker and force us to re-evaluate gup */
> + mutex_lock(&obj->mm.lock);
> + active = fetch_and_zero(&obj->userptr.work);
> + mutex_unlock(&obj->mm.lock);
> + if (active)
> + goto out;
>
> i915_gem_object_wait(obj, I915_WAIT_ALL, MAX_SCHEDULE_TIMEOUT, NULL);
>
> - mutex_lock(&dev->struct_mutex);
> - /* Cancel any active worker and force us to re-evaluate gup */
> - obj->userptr.work = NULL;
> + mutex_lock(&obj->base.dev->struct_mutex);
>
> /* We are inside a kthread context and can't be interrupted */
> if (i915_gem_object_unbind(obj) == 0)
> @@ -83,8 +88,10 @@ static void cancel_userptr(struct work_struct *work)
> atomic_read(&obj->mm.pages_pin_count),
> obj->pin_display);
>
> + mutex_unlock(&obj->base.dev->struct_mutex);
> +
> +out:
> i915_gem_object_put(obj);
> - mutex_unlock(&dev->struct_mutex);
> }
>
> static void add_object(struct i915_mmu_object *mo)
> @@ -488,37 +495,6 @@ __i915_gem_userptr_set_active(struct drm_i915_gem_object *obj,
> return ret;
> }
>
> -static bool noncontiguous_or_overlaps_ggtt(struct drm_i915_gem_object *obj,
> - struct mm_struct *mm)
> -{
> - const struct vm_operations_struct *gem_vm_ops =
> - obj->base.dev->driver->gem_vm_ops;
> - unsigned long addr = obj->userptr.ptr;
> - const unsigned long end = addr + obj->base.size;
> - struct vm_area_struct *vma;
> -
> - /* Check for a contiguous set of vma covering the userptr, if any
> - * are absent, they will EFAULT. More importantly if any point back
> - * to a drm_i915_gem_object GTT mmaping, we may trigger a deadlock
> - * between the deferred gup of this userptr and the object being
> - * unbound calling invalidate_range -> cancel_userptr.
> - */
> - for (vma = find_vma(mm, addr); vma; vma = vma->vm_next) {
> - if (vma->vm_start > addr) /* gap */
> - break;
> -
> - if (vma->vm_ops == gem_vm_ops) /* GTT mmapping */
> - break;
> -
> - if (vma->vm_end >= end)
> - return false;
> -
> - addr = vma->vm_end;
> - }
> -
> - return true;
> -}
> -
> static void
> __i915_gem_userptr_get_pages_worker(struct work_struct *_work)
> {
> @@ -665,10 +641,7 @@ i915_gem_userptr_get_pages(struct drm_i915_gem_object *obj)
> pvec = NULL;
> pinned = 0;
>
> - down_read(&mm->mmap_sem);
> - if (unlikely(noncontiguous_or_overlaps_ggtt(obj, mm))) {
> - pinned = -EFAULT;
> - } else if (mm == current->mm) {
> + if (mm == current->mm) {
> pvec = drm_malloc_gfp(num_pages, sizeof(struct page *),
> GFP_TEMPORARY |
> __GFP_NORETRY |
> @@ -693,7 +666,6 @@ i915_gem_userptr_get_pages(struct drm_i915_gem_object *obj)
> }
> if (active)
> __i915_gem_userptr_set_active(obj, true);
> - up_read(&mm->mmap_sem);
>
> if (IS_ERR(pages))
> release_pages(pvec, pinned, 0);
>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Regards,
Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH] drm/i915/userptr: Reinvent GGTT self-faulting protection
2017-03-16 9:56 ` [PATCH] " Tvrtko Ursulin
@ 2017-03-16 11:21 ` Chris Wilson
0 siblings, 0 replies; 4+ messages in thread
From: Chris Wilson @ 2017-03-16 11:21 UTC (permalink / raw)
To: Tvrtko Ursulin; +Cc: intel-gfx
On Thu, Mar 16, 2017 at 09:56:19AM +0000, Tvrtko Ursulin wrote:
> >Fixes: 1c8782dd313e ("drm/i915/userptr: Disallow wrapping GTT into a userptr")
> >Testcase: igt/gem_userptr_blits/map-fixed-invalidate-gup
> >Testcase: igt/gem_userptr_blits/dmabuf-sync
> >Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> >Cc: Michał Winiarski <michal.winiarski@intel.com>
> >Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
I added a bit of blurb to the changelog to explain how the patch
circumvents the struct_mutex recursion (virtually cut'n'paste from our
irc discussion).
Pushed, thanks
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2017-03-16 11:21 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-03-15 14:01 [PATCH] drm/i915/userptr: Reinvent GGTT self-faulting protection Chris Wilson
2017-03-15 15:28 ` ✗ Fi.CI.BAT: warning for " Patchwork
2017-03-16 9:56 ` [PATCH] " Tvrtko Ursulin
2017-03-16 11:21 ` Chris Wilson
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.