* VM lockdep warning
@ 2012-04-21 11:39 Dave Airlie
  2012-04-21 12:35 ` Christian König
  0 siblings, 1 reply; 9+ messages in thread
From: Dave Airlie @ 2012-04-21 11:39 UTC (permalink / raw)
  To: dri-devel, Christian König

running 3.4.0-rc3 + Christian's reset patch series.

The locks are definitely taken in different orders between vm_bo_add
and cs ioctl.

Dave.

======================================================
[ INFO: possible circular locking dependency detected ]
3.4.0-rc3+ #33 Not tainted
-------------------------------------------------------
shader_runner/3090 is trying to acquire lock:
 (&vm->mutex){+.+...}, at: [<ffffffffa00c513f>]
radeon_cs_ioctl+0x438/0x5c1 [radeon]

but task is already holding lock:
 (&rdev->cs_mutex){+.+.+.}, at: [<ffffffffa00c4d3a>]
radeon_cs_ioctl+0x33/0x5c1 [radeon]

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #1 (&rdev->cs_mutex){+.+.+.}:
       [<ffffffff810757f5>] lock_acquire+0xf0/0x116
       [<ffffffff81427881>] mutex_lock_nested+0x6a/0x2bb
       [<ffffffffa00b5f4d>] radeon_vm_bo_add+0x118/0x1f5 [radeon]
       [<ffffffffa00b6479>] radeon_vm_init+0x6b/0x70 [radeon]
       [<ffffffffa00a3bfc>] radeon_driver_open_kms+0x68/0x9a [radeon]
       [<ffffffffa0019698>] drm_open+0x201/0x587 [drm]
       [<ffffffffa0019b0a>] drm_stub_open+0xec/0x14a [drm]
       [<ffffffff8110f788>] chrdev_open+0x11c/0x145
       [<ffffffff8110a23a>] __dentry_open+0x17e/0x29b
       [<ffffffff8110b138>] nameidata_to_filp+0x5b/0x62
       [<ffffffff811188d0>] do_last+0x75d/0x771
       [<ffffffff81118ab3>] path_openat+0xcb/0x380
       [<ffffffff81118e51>] do_filp_open+0x33/0x81
       [<ffffffff8110b23f>] do_sys_open+0x100/0x192
       [<ffffffff8110b2ed>] sys_open+0x1c/0x1e
       [<ffffffff81430722>] system_call_fastpath+0x16/0x1b

-> #0 (&vm->mutex){+.+...}:
       [<ffffffff81074c99>] __lock_acquire+0xfcd/0x1664
       [<ffffffff810757f5>] lock_acquire+0xf0/0x116
       [<ffffffff81427881>] mutex_lock_nested+0x6a/0x2bb
       [<ffffffffa00c513f>] radeon_cs_ioctl+0x438/0x5c1 [radeon]
       [<ffffffffa00187a9>] drm_ioctl+0x2d8/0x3a4 [drm]
       [<ffffffff8111afd6>] do_vfs_ioctl+0x469/0x4aa
       [<ffffffff8111b068>] sys_ioctl+0x51/0x75
       [<ffffffff81430722>] system_call_fastpath+0x16/0x1b

other info that might help us debug this:

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&rdev->cs_mutex);
                               lock(&vm->mutex);
                               lock(&rdev->cs_mutex);
  lock(&vm->mutex);

 *** DEADLOCK ***

1 lock held by shader_runner/3090:
 #0:  (&rdev->cs_mutex){+.+.+.}, at: [<ffffffffa00c4d3a>]
radeon_cs_ioctl+0x33/0x5c1 [radeon]

stack backtrace:
Pid: 3090, comm: shader_runner Not tainted 3.4.0-rc3+ #33
Call Trace:
 [<ffffffff81420ac7>] print_circular_bug+0x28a/0x29b
 [<ffffffff81074c99>] __lock_acquire+0xfcd/0x1664
 [<ffffffff810757f5>] lock_acquire+0xf0/0x116
 [<ffffffffa00c513f>] ? radeon_cs_ioctl+0x438/0x5c1 [radeon]
 [<ffffffff810db991>] ? might_fault+0x57/0xa7
 [<ffffffff81427881>] mutex_lock_nested+0x6a/0x2bb
 [<ffffffffa00c513f>] ? radeon_cs_ioctl+0x438/0x5c1 [radeon]
 [<ffffffffa00f4196>] ? evergreen_ib_parse+0x1b2/0x204 [radeon]
 [<ffffffffa00c513f>] radeon_cs_ioctl+0x438/0x5c1 [radeon]
 [<ffffffffa00187a9>] drm_ioctl+0x2d8/0x3a4 [drm]
 [<ffffffffa00c4d07>] ? radeon_cs_finish_pages+0xa3/0xa3 [radeon]
 [<ffffffff811ee4c4>] ? avc_has_perm_flags+0xd7/0x160
 [<ffffffff811ee413>] ? avc_has_perm_flags+0x26/0x160
 [<ffffffff8104bf6a>] ? up_read+0x1b/0x32
 [<ffffffff8111afd6>] do_vfs_ioctl+0x469/0x4aa
 [<ffffffff8111b068>] sys_ioctl+0x51/0x75
 [<ffffffff8104f955>] ? __wake_up+0x1d/0x48
 [<ffffffff81430722>] system_call_fastpath+0x16/0x1b
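
Reduced to its essentials, the chain above comes from two call paths taking the
same pair of locks in opposite orders. A minimal kernel-style sketch of those two
orderings (the function names open_path/cs_path are made up, the bodies are
elided, and the surrounding radeon structures are assumed; this is not the actual
driver code):

/* Chain #1: drm_open() -> radeon_driver_open_kms() -> radeon_vm_init()
 *           -> radeon_vm_bo_add(): vm->mutex first, then cs_mutex
 *           (taken to unbind/resize the VM).
 */
static int open_path(struct radeon_device *rdev, struct radeon_vm *vm)
{
        mutex_lock(&vm->mutex);
        radeon_mutex_lock(&rdev->cs_mutex);     /* nested inside vm->mutex */
        /* ... grow the VA space, unbind the vm ... */
        radeon_mutex_unlock(&rdev->cs_mutex);
        mutex_unlock(&vm->mutex);
        return 0;
}

/* Chain #0: radeon_cs_ioctl(): cs_mutex first (at +0x33), then vm->mutex
 *           (at +0x438) -- the reverse order, hence the report.
 */
static int cs_path(struct radeon_device *rdev, struct radeon_vm *vm)
{
        radeon_mutex_lock(&rdev->cs_mutex);
        mutex_lock(&vm->mutex);                 /* nested inside cs_mutex */
        /* ... bind the vm, parse and emit the command stream ... */
        mutex_unlock(&vm->mutex);
        radeon_mutex_unlock(&rdev->cs_mutex);
        return 0;
}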


* Re: VM lockdep warning
  2012-04-21 11:39 VM lockdep warning Dave Airlie
@ 2012-04-21 12:35 ` Christian König
  2012-04-21 14:08   ` Jerome Glisse
  0 siblings, 1 reply; 9+ messages in thread
From: Christian König @ 2012-04-21 12:35 UTC (permalink / raw)
  To: Dave Airlie; +Cc: dri-devel

Interesting, I'm pretty sure that I haven't touched the locking order of 
the cs_mutex vs. vm_mutex.

Maybe it is just some kind of side effect; going to look into it anyway.

Christian.

On 21.04.2012 13:39, Dave Airlie wrote:
> running 3.4.0-rc3 + Christian's reset patch series.
>
> The locks are definitely taken in different orders between vm_bo_add
> and cs ioctl.
>
> Dave.
>
> ======================================================
> [ INFO: possible circular locking dependency detected ]
> 3.4.0-rc3+ #33 Not tainted
> -------------------------------------------------------
> shader_runner/3090 is trying to acquire lock:
>   (&vm->mutex){+.+...}, at: [<ffffffffa00c513f>]
> radeon_cs_ioctl+0x438/0x5c1 [radeon]
>
> but task is already holding lock:
>   (&rdev->cs_mutex){+.+.+.}, at: [<ffffffffa00c4d3a>]
> radeon_cs_ioctl+0x33/0x5c1 [radeon]
>
> which lock already depends on the new lock.
>
>
> the existing dependency chain (in reverse order) is:
>
> ->  #1 (&rdev->cs_mutex){+.+.+.}:
>         [<ffffffff810757f5>] lock_acquire+0xf0/0x116
>         [<ffffffff81427881>] mutex_lock_nested+0x6a/0x2bb
>         [<ffffffffa00b5f4d>] radeon_vm_bo_add+0x118/0x1f5 [radeon]
>         [<ffffffffa00b6479>] radeon_vm_init+0x6b/0x70 [radeon]
>         [<ffffffffa00a3bfc>] radeon_driver_open_kms+0x68/0x9a [radeon]
>         [<ffffffffa0019698>] drm_open+0x201/0x587 [drm]
>         [<ffffffffa0019b0a>] drm_stub_open+0xec/0x14a [drm]
>         [<ffffffff8110f788>] chrdev_open+0x11c/0x145
>         [<ffffffff8110a23a>] __dentry_open+0x17e/0x29b
>         [<ffffffff8110b138>] nameidata_to_filp+0x5b/0x62
>         [<ffffffff811188d0>] do_last+0x75d/0x771
>         [<ffffffff81118ab3>] path_openat+0xcb/0x380
>         [<ffffffff81118e51>] do_filp_open+0x33/0x81
>         [<ffffffff8110b23f>] do_sys_open+0x100/0x192
>         [<ffffffff8110b2ed>] sys_open+0x1c/0x1e
>         [<ffffffff81430722>] system_call_fastpath+0x16/0x1b
>
> ->  #0 (&vm->mutex){+.+...}:
>         [<ffffffff81074c99>] __lock_acquire+0xfcd/0x1664
>         [<ffffffff810757f5>] lock_acquire+0xf0/0x116
>         [<ffffffff81427881>] mutex_lock_nested+0x6a/0x2bb
>         [<ffffffffa00c513f>] radeon_cs_ioctl+0x438/0x5c1 [radeon]
>         [<ffffffffa00187a9>] drm_ioctl+0x2d8/0x3a4 [drm]
>         [<ffffffff8111afd6>] do_vfs_ioctl+0x469/0x4aa
>         [<ffffffff8111b068>] sys_ioctl+0x51/0x75
>         [<ffffffff81430722>] system_call_fastpath+0x16/0x1b
>
> other info that might help us debug this:
>
>   Possible unsafe locking scenario:
>
>         CPU0                    CPU1
>         ----                    ----
>    lock(&rdev->cs_mutex);
>                                 lock(&vm->mutex);
>                                 lock(&rdev->cs_mutex);
>    lock(&vm->mutex);
>
>   *** DEADLOCK ***
>
> 1 lock held by shader_runner/3090:
>   #0:  (&rdev->cs_mutex){+.+.+.}, at: [<ffffffffa00c4d3a>]
> radeon_cs_ioctl+0x33/0x5c1 [radeon]
>
> stack backtrace:
> Pid: 3090, comm: shader_runner Not tainted 3.4.0-rc3+ #33
> Call Trace:
>   [<ffffffff81420ac7>] print_circular_bug+0x28a/0x29b
>   [<ffffffff81074c99>] __lock_acquire+0xfcd/0x1664
>   [<ffffffff810757f5>] lock_acquire+0xf0/0x116
>   [<ffffffffa00c513f>] ? radeon_cs_ioctl+0x438/0x5c1 [radeon]
>   [<ffffffff810db991>] ? might_fault+0x57/0xa7
>   [<ffffffff81427881>] mutex_lock_nested+0x6a/0x2bb
>   [<ffffffffa00c513f>] ? radeon_cs_ioctl+0x438/0x5c1 [radeon]
>   [<ffffffffa00f4196>] ? evergreen_ib_parse+0x1b2/0x204 [radeon]
>   [<ffffffffa00c513f>] radeon_cs_ioctl+0x438/0x5c1 [radeon]
>   [<ffffffffa00187a9>] drm_ioctl+0x2d8/0x3a4 [drm]
>   [<ffffffffa00c4d07>] ? radeon_cs_finish_pages+0xa3/0xa3 [radeon]
>   [<ffffffff811ee4c4>] ? avc_has_perm_flags+0xd7/0x160
>   [<ffffffff811ee413>] ? avc_has_perm_flags+0x26/0x160
>   [<ffffffff8104bf6a>] ? up_read+0x1b/0x32
>   [<ffffffff8111afd6>] do_vfs_ioctl+0x469/0x4aa
>   [<ffffffff8111b068>] sys_ioctl+0x51/0x75
>   [<ffffffff8104f955>] ? __wake_up+0x1d/0x48
>   [<ffffffff81430722>] system_call_fastpath+0x16/0x1b
>


* Re: VM lockdep warning
  2012-04-21 12:35 ` Christian König
@ 2012-04-21 14:08   ` Jerome Glisse
  2012-04-21 14:16     ` Christian König
  0 siblings, 1 reply; 9+ messages in thread
From: Jerome Glisse @ 2012-04-21 14:08 UTC (permalink / raw)
  To: Christian König; +Cc: dri-devel

2012/4/21 Christian König <deathsimple@vodafone.de>:
> Interesting, I'm pretty sure that I haven't touched the locking order of the
> cs_mutex vs. vm_mutex.
>
> Maybe it is just some kind of side effect, going to locking into it anyway.
>
> Christian.
>

It's the usage: the init path takes the locks in a different order than the cs path.

Cheers,
Jerome


* Re: VM lockdep warning
  2012-04-21 14:08   ` Jerome Glisse
@ 2012-04-21 14:16     ` Christian König
  2012-04-21 14:50       ` Jerome Glisse
  0 siblings, 1 reply; 9+ messages in thread
From: Christian König @ 2012-04-21 14:16 UTC (permalink / raw)
  To: Jerome Glisse; +Cc: dri-devel

On 21.04.2012 16:08, Jerome Glisse wrote:
> 2012/4/21 Christian König<deathsimple@vodafone.de>:
>> Interesting, I'm pretty sure that I haven't touched the locking order of the
>> cs_mutex vs. vm_mutex.
>>
>> Maybe it is just some kind of side effect, going to locking into it anyway.
>>
>> Christian.
>>
> It's the using, init path take lock in different order than cs path
Well, could you explain to me why the vm code takes the cs mutex in the
first place?

It clearly has its own mutex, and it doesn't look like it deals with any
cs-related data anyway.

Christian.


* Re: VM lockdep warning
  2012-04-21 14:16     ` Christian König
@ 2012-04-21 14:50       ` Jerome Glisse
  2012-04-21 15:57         ` Dave Airlie
  0 siblings, 1 reply; 9+ messages in thread
From: Jerome Glisse @ 2012-04-21 14:50 UTC (permalink / raw)
  To: Christian König; +Cc: dri-devel

2012/4/21 Christian König <deathsimple@vodafone.de>:
> On 21.04.2012 16:08, Jerome Glisse wrote:
>>
>> 2012/4/21 Christian König<deathsimple@vodafone.de>:
>>>
>>> Interesting, I'm pretty sure that I haven't touched the locking order of
>>> the
>>> cs_mutex vs. vm_mutex.
>>>
>>> Maybe it is just some kind of side effect, going to locking into it
>>> anyway.
>>>
>>> Christian.
>>>
>> It's the using, init path take lock in different order than cs path
>
> Well, could you explain to me why the vm code takes cs mutex in the first
> place?
>
> It clearly has it's own mutex and it doesn't looks like that it deals with
> any cs related data anyway.
>
> Christian.

Lock simplification is on my todo list. The issue is that the vm manager is
protected by cs_mutex. The vm->mutex is specific to each vm; it doesn't protect
the global vm management. I didn't want to introduce a new global vm mutex, as vm
activity is mostly triggered on behalf of cs, so I decided to use the cs mutex.

That's why the non-cs paths of the vm code need to take the cs mutex.

Cheers,
Jerome
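
Put as a sketch, the split Jerome describes looks roughly like this (field layout
heavily simplified and the vm_manager type guessed from context; the comments,
not the declarations, are the point):

struct radeon_vm {
        struct mutex            mutex;          /* per-VM: protects only this VM's
                                                 * list of bo_va mappings (vm->va) */
        struct list_head        va;
        /* ... */
};

struct radeon_device {
        struct radeon_mutex     cs_mutex;       /* guards CS submission *and* the
                                                 * global VM manager state (LRU,
                                                 * bind/unbind), which is why
                                                 * non-CS paths such as
                                                 * radeon_vm_bo_add() have to take
                                                 * it when they touch that state */
        struct radeon_vm_manager vm_manager;    /* global; no lock of its own */
        /* ... */
};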


* Re: VM lockdep warning
  2012-04-21 14:50       ` Jerome Glisse
@ 2012-04-21 15:57         ` Dave Airlie
  2012-04-21 16:32           ` Christian König
  0 siblings, 1 reply; 9+ messages in thread
From: Dave Airlie @ 2012-04-21 15:57 UTC (permalink / raw)
  To: Jerome Glisse; +Cc: Christian König, dri-devel

2012/4/21 Jerome Glisse <j.glisse@gmail.com>:
> 2012/4/21 Christian König <deathsimple@vodafone.de>:
>> On 21.04.2012 16:08, Jerome Glisse wrote:
>>>
>>> 2012/4/21 Christian König<deathsimple@vodafone.de>:
>>>>
>>>> Interesting, I'm pretty sure that I haven't touched the locking order of
>>>> the
>>>> cs_mutex vs. vm_mutex.
>>>>
>>>> Maybe it is just some kind of side effect, going to locking into it
>>>> anyway.
>>>>
>>>> Christian.
>>>>
>>> It's the using, init path take lock in different order than cs path
>>
>> Well, could you explain to me why the vm code takes cs mutex in the first
>> place?
>>
>> It clearly has it's own mutex and it doesn't looks like that it deals with
>> any cs related data anyway.
>>
>> Christian.
>
> Lock simplification is on my todo. The issue is that vm manager is protected by
> cs_mutex The vm.mutex is specific to each vm it doesn't protect the global vm
> management. I didn't wanted to introduce a new global vm mutex as vm activity
> is mostly trigger on behalf of cs so i dediced to use the cs mutex.
>
> That's why non cs path of vm need to take the cs mutex.

So if one app is adding a bo and another is doing CS, isn't deadlock a
real possibility?

I expect the VM code needs to take the CS mutex earlier then.

Dave.
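
The concern is the classic ABBA pattern. A self-contained userspace illustration
(generic names, not radeon code; build with gcc -pthread): run it a few times and
it will eventually hang, and that hang is exactly the interleaving lockdep is
warning about.

#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

static pthread_mutex_t cs_mutex = PTHREAD_MUTEX_INITIALIZER; /* plays rdev->cs_mutex */
static pthread_mutex_t vm_mutex = PTHREAD_MUTEX_INITIALIZER; /* plays vm->mutex */

static void *cs_like(void *arg)         /* CS path: cs_mutex, then vm_mutex */
{
        (void)arg;
        pthread_mutex_lock(&cs_mutex);
        usleep(1000);                   /* widen the race window */
        pthread_mutex_lock(&vm_mutex);
        puts("cs path got both locks");
        pthread_mutex_unlock(&vm_mutex);
        pthread_mutex_unlock(&cs_mutex);
        return NULL;
}

static void *bo_add_like(void *arg)     /* open/bo-add path: vm_mutex, then cs_mutex */
{
        (void)arg;
        pthread_mutex_lock(&vm_mutex);
        usleep(1000);
        pthread_mutex_lock(&cs_mutex);
        puts("vm path got both locks");
        pthread_mutex_unlock(&cs_mutex);
        pthread_mutex_unlock(&vm_mutex);
        return NULL;
}

int main(void)
{
        pthread_t a, b;
        pthread_create(&a, NULL, cs_like, NULL);
        pthread_create(&b, NULL, bo_add_like, NULL);
        pthread_join(a, NULL);
        pthread_join(b, NULL);
        return 0;
}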


* Re: VM lockdep warning
  2012-04-21 15:57         ` Dave Airlie
@ 2012-04-21 16:32           ` Christian König
  2012-04-21 17:30             ` Jerome Glisse
  0 siblings, 1 reply; 9+ messages in thread
From: Christian König @ 2012-04-21 16:32 UTC (permalink / raw)
  To: Dave Airlie; +Cc: dri-devel

[-- Attachment #1: Type: text/plain, Size: 1630 bytes --]

On 21.04.2012 17:57, Dave Airlie wrote:
> 2012/4/21 Jerome Glisse<j.glisse@gmail.com>:
>> 2012/4/21 Christian König<deathsimple@vodafone.de>:
>>> On 21.04.2012 16:08, Jerome Glisse wrote:
>>>> 2012/4/21 Christian König<deathsimple@vodafone.de>:
>>>>> Interesting, I'm pretty sure that I haven't touched the locking order of
>>>>> the
>>>>> cs_mutex vs. vm_mutex.
>>>>>
>>>>> Maybe it is just some kind of side effect, going to locking into it
>>>>> anyway.
>>>>>
>>>>> Christian.
>>>>>
>>>> It's the using, init path take lock in different order than cs path
>>> Well, could you explain to me why the vm code takes cs mutex in the first
>>> place?
>>>
>>> It clearly has it's own mutex and it doesn't looks like that it deals with
>>> any cs related data anyway.
>>>
>>> Christian.
>> Lock simplification is on my todo. The issue is that vm manager is protected by
>> cs_mutex The vm.mutex is specific to each vm it doesn't protect the global vm
>> management. I didn't wanted to introduce a new global vm mutex as vm activity
>> is mostly trigger on behalf of cs so i dediced to use the cs mutex.
>>
>> That's why non cs path of vm need to take the cs mutex.
> So if one app is adding a bo, and another doing CS, isn't deadlock a
> real possibility?
Yeah, I think so.
> I expect the VM code need to take CS mutex earlier then.

I would strongly suggest giving the vm code its own global mutex and
removing the per-vm mutex, because the latter is pretty superfluous if the
cs_mutex is also taken most of the time.

The attached patch is against drm-fixes and does exactly that.

Christian.

[-- Attachment #2: 0001-drm-radeon-use-a-global-mutex-instead-of-per-vm-one.patch --]
[-- Type: text/x-patch; name="0001-drm-radeon-use-a-global-mutex-instead-of-per-vm-one.patch", Size: 4775 bytes --]

From b6a79c2e54f8200e770c25e930b0784343105a2b Mon Sep 17 00:00:00 2001
From: Christian König <deathsimple@vodafone.de>
Date: Sat, 21 Apr 2012 18:29:34 +0200
Subject: [PATCH] drm/radeon: use a global mutex instead of per vm one
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Resolving deadlock problems with the cs_mutex.

Signed-off-by: Christian König <deathsimple@vodafone.de>
---
 drivers/gpu/drm/radeon/radeon.h        |    2 +-
 drivers/gpu/drm/radeon/radeon_device.c |    1 +
 drivers/gpu/drm/radeon/radeon_gart.c   |   25 +++++++++----------------
 3 files changed, 11 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
index 138b952..f35957d 100644
--- a/drivers/gpu/drm/radeon/radeon.h
+++ b/drivers/gpu/drm/radeon/radeon.h
@@ -680,7 +680,6 @@ struct radeon_vm {
 	u64				pt_gpu_addr;
 	u64				*pt;
 	struct radeon_sa_bo		sa_bo;
-	struct mutex			mutex;
 	/* last fence for cs using this vm */
 	struct radeon_fence		*fence;
 };
@@ -1527,6 +1526,7 @@ struct radeon_device {
 	struct radeon_pm		pm;
 	uint32_t			bios_scratch[RADEON_BIOS_NUM_SCRATCH];
 	struct radeon_mutex		cs_mutex;
+	struct mutex			vm_mutex;
 	struct radeon_wb		wb;
 	struct radeon_dummy_page	dummy_page;
 	bool				gpu_lockup;
diff --git a/drivers/gpu/drm/radeon/radeon_device.c b/drivers/gpu/drm/radeon/radeon_device.c
index ea7df16..cecb785 100644
--- a/drivers/gpu/drm/radeon/radeon_device.c
+++ b/drivers/gpu/drm/radeon/radeon_device.c
@@ -725,6 +725,7 @@ int radeon_device_init(struct radeon_device *rdev,
 	 * can recall function without having locking issues */
 	radeon_mutex_init(&rdev->cs_mutex);
 	radeon_mutex_init(&rdev->ib_pool.mutex);
+	mutex_init(&rdev->vm_mutex);
 	for (i = 0; i < RADEON_NUM_RINGS; ++i)
 		mutex_init(&rdev->ring[i].mutex);
 	mutex_init(&rdev->dc_hw_i2c_mutex);
diff --git a/drivers/gpu/drm/radeon/radeon_gart.c b/drivers/gpu/drm/radeon/radeon_gart.c
index c58a036..1b4933b 100644
--- a/drivers/gpu/drm/radeon/radeon_gart.c
+++ b/drivers/gpu/drm/radeon/radeon_gart.c
@@ -356,13 +356,13 @@ int radeon_vm_manager_suspend(struct radeon_device *rdev)
 {
 	struct radeon_vm *vm, *tmp;
 
-	radeon_mutex_lock(&rdev->cs_mutex);
+	mutex_lock(&rdev->vm_mutex);
 	/* unbind all active vm */
 	list_for_each_entry_safe(vm, tmp, &rdev->vm_manager.lru_vm, list) {
 		radeon_vm_unbind_locked(rdev, vm);
 	}
 	rdev->vm_manager.funcs->fini(rdev);
-	radeon_mutex_unlock(&rdev->cs_mutex);
+	mutex_unlock(&rdev->vm_mutex);
 	return radeon_sa_bo_manager_suspend(rdev, &rdev->vm_manager.sa_manager);
 }
 
@@ -476,13 +476,11 @@ int radeon_vm_bo_add(struct radeon_device *rdev,
 		return -EINVAL;
 	}
 
-	mutex_lock(&vm->mutex);
+	mutex_lock(&rdev->vm_mutex);
 	if (last_pfn > vm->last_pfn) {
 		/* grow va space 32M by 32M */
 		unsigned align = ((32 << 20) >> 12) - 1;
-		radeon_mutex_lock(&rdev->cs_mutex);
 		radeon_vm_unbind_locked(rdev, vm);
-		radeon_mutex_unlock(&rdev->cs_mutex);
 		vm->last_pfn = (last_pfn + align) & ~align;
 	}
 	head = &vm->va;
@@ -498,7 +496,7 @@ int radeon_vm_bo_add(struct radeon_device *rdev,
 				bo, (unsigned)bo_va->soffset, tmp->bo,
 				(unsigned)tmp->soffset, (unsigned)tmp->eoffset);
 			kfree(bo_va);
-			mutex_unlock(&vm->mutex);
+			mutex_unlock(&rdev->vm_mutex);
 			return -EINVAL;
 		}
 		last_offset = tmp->eoffset;
@@ -506,7 +504,7 @@ int radeon_vm_bo_add(struct radeon_device *rdev,
 	}
 	list_add(&bo_va->vm_list, head);
 	list_add_tail(&bo_va->bo_list, &bo->va);
-	mutex_unlock(&vm->mutex);
+	mutex_unlock(&rdev->vm_mutex);
 	return 0;
 }
 
@@ -597,13 +595,11 @@ int radeon_vm_bo_rmv(struct radeon_device *rdev,
 	if (bo_va == NULL)
 		return 0;
 
-	mutex_lock(&vm->mutex);
-	radeon_mutex_lock(&rdev->cs_mutex);
+	mutex_lock(&rdev->vm_mutex);
 	radeon_vm_bo_update_pte(rdev, vm, bo, NULL);
-	radeon_mutex_unlock(&rdev->cs_mutex);
 	list_del(&bo_va->vm_list);
-	mutex_unlock(&vm->mutex);
 	list_del(&bo_va->bo_list);
+	mutex_unlock(&rdev->vm_mutex);
 
 	kfree(bo_va);
 	return 0;
@@ -643,11 +639,8 @@ void radeon_vm_fini(struct radeon_device *rdev, struct radeon_vm *vm)
 	struct radeon_bo_va *bo_va, *tmp;
 	int r;
 
-	mutex_lock(&vm->mutex);
-
-	radeon_mutex_lock(&rdev->cs_mutex);
+	mutex_lock(&rdev->vm_mutex);
 	radeon_vm_unbind_locked(rdev, vm);
-	radeon_mutex_unlock(&rdev->cs_mutex);
 
 	/* remove all bo */
 	r = radeon_bo_reserve(rdev->ib_pool.sa_manager.bo, false);
@@ -670,5 +663,5 @@ void radeon_vm_fini(struct radeon_device *rdev, struct radeon_vm *vm)
 			kfree(bo_va);
 		}
 	}
-	mutex_unlock(&vm->mutex);
+	mutex_unlock(&rdev->vm_mutex);
 }
-- 
1.7.5.4


[-- Attachment #3: Type: text/plain, Size: 159 bytes --]

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


* Re: VM lockdep warning
  2012-04-21 16:32           ` Christian König
@ 2012-04-21 17:30             ` Jerome Glisse
  2012-04-21 19:26               ` Christian König
  0 siblings, 1 reply; 9+ messages in thread
From: Jerome Glisse @ 2012-04-21 17:30 UTC (permalink / raw)
  To: Christian König; +Cc: dri-devel

2012/4/21 Christian König <deathsimple@vodafone.de>:
> On 21.04.2012 17:57, Dave Airlie wrote:
>>
>> 2012/4/21 Jerome Glisse<j.glisse@gmail.com>:
>>>
>>> 2012/4/21 Christian König<deathsimple@vodafone.de>:
>>>>
>>>> On 21.04.2012 16:08, Jerome Glisse wrote:
>>>>>
>>>>> 2012/4/21 Christian König<deathsimple@vodafone.de>:
>>>>>>
>>>>>> Interesting, I'm pretty sure that I haven't touched the locking order
>>>>>> of
>>>>>> the
>>>>>> cs_mutex vs. vm_mutex.
>>>>>>
>>>>>> Maybe it is just some kind of side effect, going to locking into it
>>>>>> anyway.
>>>>>>
>>>>>> Christian.
>>>>>>
>>>>> It's the using, init path take lock in different order than cs path
>>>>
>>>> Well, could you explain to me why the vm code takes cs mutex in the
>>>> first
>>>> place?
>>>>
>>>> It clearly has it's own mutex and it doesn't looks like that it deals
>>>> with
>>>> any cs related data anyway.
>>>>
>>>> Christian.
>>>
>>> Lock simplification is on my todo. The issue is that vm manager is
>>> protected by
>>> cs_mutex The vm.mutex is specific to each vm it doesn't protect the
>>> global vm
>>> management. I didn't wanted to introduce a new global vm mutex as vm
>>> activity
>>> is mostly trigger on behalf of cs so i dediced to use the cs mutex.
>>>
>>> That's why non cs path of vm need to take the cs mutex.
>>
>> So if one app is adding a bo, and another doing CS, isn't deadlock a
>> real possibility?
>
> Yeah, I think so.

No it's not. Look at the code.

>> I expect the VM code need to take CS mutex earlier then.

No it does not. The idea is that when adding a bo we only need to take the
cs mutex if we need to resize the vm (and even that can be worked around).

So we will only need to take the cs mutex in very few cases (suspend, increasing
vm size).

>
> I would strongly suggest to give the vm code their own global mutex and
> remove the per vm mutex, cause the later is pretty superfluous if the
> cs_mutex is also taken most of the time.
>
> The attached patch is against drm-fixes and does exactly that.
>
> Christian.

NAK: with your change there will be lock contention if one app is in cs and
another tries to create a bo. Currently there is almost never contention. Once
I have ironed out the DP->VGA issue I will work on something to remove the cs
mutex from the vm path (i.e. remove it from the bo creation/deletion path).

Cheers,
Jerome
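
The shape Jerome describes is the one already visible in the pre-patch
radeon_vm_bo_add() hunk earlier in the thread, condensed here as a sketch
(error handling and the 32M alignment dropped): cs_mutex is only reached for
when the VA space has to grow, so the common bo add never contends with a task
sitting in the CS ioctl.

int radeon_vm_bo_add_condensed(struct radeon_device *rdev,
                               struct radeon_vm *vm, unsigned last_pfn)
{
        mutex_lock(&vm->mutex);
        if (last_pfn > vm->last_pfn) {
                /* rare: growing the VA space touches the global VM manager
                 * state, which today lives under cs_mutex */
                radeon_mutex_lock(&rdev->cs_mutex);
                radeon_vm_unbind_locked(rdev, vm);
                radeon_mutex_unlock(&rdev->cs_mutex);
                vm->last_pfn = last_pfn;
        }
        /* common: validate the range and insert the mapping into vm->va,
         * which only needs the per-VM mutex */
        mutex_unlock(&vm->mutex);
        return 0;
}

Note that lockdep still records the vm->mutex -> cs_mutex ordering from the rare
branch, which is what the report at the top of the thread is about, even if the
window for an actual deadlock is small.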


* Re: VM lockdep warning
  2012-04-21 17:30             ` Jerome Glisse
@ 2012-04-21 19:26               ` Christian König
  0 siblings, 0 replies; 9+ messages in thread
From: Christian König @ 2012-04-21 19:26 UTC (permalink / raw)
  To: Jerome Glisse; +Cc: dri-devel

On 21.04.2012 19:30, Jerome Glisse wrote:
> 2012/4/21 Christian König<deathsimple@vodafone.de>:
>> On 21.04.2012 17:57, Dave Airlie wrote:
>>> 2012/4/21 Jerome Glisse<j.glisse@gmail.com>:
>>>> 2012/4/21 Christian König<deathsimple@vodafone.de>:
>>>>> On 21.04.2012 16:08, Jerome Glisse wrote:
>>>>>> 2012/4/21 Christian König<deathsimple@vodafone.de>:
>>>>>>> Interesting, I'm pretty sure that I haven't touched the locking order
>>>>>>> of
>>>>>>> the
>>>>>>> cs_mutex vs. vm_mutex.
>>>>>>>
>>>>>>> Maybe it is just some kind of side effect, going to locking into it
>>>>>>> anyway.
>>>>>>>
>>>>>>> Christian.
>>>>>>>
>>>>>> It's the using, init path take lock in different order than cs path
>>>>> Well, could you explain to me why the vm code takes cs mutex in the
>>>>> first
>>>>> place?
>>>>>
>>>>> It clearly has it's own mutex and it doesn't looks like that it deals
>>>>> with
>>>>> any cs related data anyway.
>>>>>
>>>>> Christian.
>>>> Lock simplification is on my todo. The issue is that vm manager is
>>>> protected by
>>>> cs_mutex The vm.mutex is specific to each vm it doesn't protect the
>>>> global vm
>>>> management. I didn't wanted to introduce a new global vm mutex as vm
>>>> activity
>>>> is mostly trigger on behalf of cs so i dediced to use the cs mutex.
>>>>
>>>> That's why non cs path of vm need to take the cs mutex.
>>> So if one app is adding a bo, and another doing CS, isn't deadlock a
>>> real possibility?
>> Yeah, I think so.
> No it's not. Look at the code.
>
>>> I expect the VM code need to take CS mutex earlier then.
> No it does not. The idea is that when adding a bo we only need to take the
> cs mutex if we need to resize the vm size (and even that can be worked around).
>
> So we will need to take the cs ioctl in very few case (suspend, increasing vm
> size).
>
>> I would strongly suggest to give the vm code their own global mutex and
>> remove the per vm mutex, cause the later is pretty superfluous if the
>> cs_mutex is also taken most of the time.
>>
>> The attached patch is against drm-fixes and does exactly that.
>>
>> Christian.
> NAK with your change there will be lock contention if one app is in cs and
> another try to create bo. Currently there is allmost never contention. Once
> i ironed out the DP->VGA i will work on something to remove the cs mutex
> from vm path (ie remove it from bo creation/del path).

Ok, sounds like I don't understand the code deeply enough to fix this,
so I'm just going to wait for your fix.

By the way: if you are talking about the NUTMEG DP->VGA problem, I have
two systems with that hardware sitting directly beside me. So if you have
any patches, just leave me a note and I can try them.

Christian.

