All of lore.kernel.org
 help / color / mirror / Atom feed
* lock/unlock mismatch in ttm_bo.c
@ 2018-01-19 18:14 Tom St Denis
       [not found] ` <1713476e-c17f-9eea-ebd3-6999c760b678-5C7GfCeVMHo@public.gmane.org>
  0 siblings, 1 reply; 16+ messages in thread
From: Tom St Denis @ 2018-01-19 18:14 UTC (permalink / raw)
  To: amd-gfx mailing list

Hi all,

In the function ttm_bo_cleanup_refs() it seems possible to get to line 
551 without entering the block on 516 which means you'll be unlocking a 
mutex that wasn't locked.

Now it might be that in the course of the API this pattern cannot be 
expressed but it's not clear from the function alone that that is the case.

Thoughts?

Tom
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: lock/unlock mismatch in ttm_bo.c
       [not found] ` <1713476e-c17f-9eea-ebd3-6999c760b678-5C7GfCeVMHo@public.gmane.org>
@ 2018-01-19 18:23   ` Tom St Denis
       [not found]     ` <81e40cdf-6869-57fa-6858-eb77dbfb45a8-5C7GfCeVMHo@public.gmane.org>
  0 siblings, 1 reply; 16+ messages in thread
From: Tom St Denis @ 2018-01-19 18:23 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

On 19/01/18 01:14 PM, Tom St Denis wrote:
> Hi all,
> 
> In the function ttm_bo_cleanup_refs() it seems possible to get to line 
> 551 without entering the block on 516 which means you'll be unlocking a 
> mutex that wasn't locked.
> 
> Now it might be that in the course of the API this pattern cannot be 
> expressed but it's not clear from the function alone that that is the case.


Looking further it seems the behaviour depends on locking in parent 
callers.  That's kinda a no-no right?  Shouldn't the lock be 
taken/released in the same function ideally?

(also there are a handful of style issues I'll write up some patches for 
on Monday :-)).

Cheers,
Tom
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: lock/unlock mismatch in ttm_bo.c
       [not found]     ` <81e40cdf-6869-57fa-6858-eb77dbfb45a8-5C7GfCeVMHo@public.gmane.org>
@ 2018-01-20 14:46       ` Christian König
  2018-01-22  6:42       ` Chunming Zhou
  1 sibling, 0 replies; 16+ messages in thread
From: Christian König @ 2018-01-20 14:46 UTC (permalink / raw)
  To: Tom St Denis, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

Am 19.01.2018 um 19:23 schrieb Tom St Denis:
> On 19/01/18 01:14 PM, Tom St Denis wrote:
>> Hi all,
>>
>> In the function ttm_bo_cleanup_refs() it seems possible to get to 
>> line 551 without entering the block on 516 which means you'll be 
>> unlocking a mutex that wasn't locked.
>>
>> Now it might be that in the course of the API this pattern cannot be 
>> expressed but it's not clear from the function alone that that is the 
>> case.
>
>
> Looking further it seems the behaviour depends on locking in parent 
> callers.  That's kinda a no-no right?

Yeah, that used to be a really mess in TTM. Started to work on cleaning 
this up, but well you know only two hands and one head :)

> Shouldn't the lock be taken/released in the same function ideally?
>
> (also there are a handful of style issues I'll write up some patches 
> for on Monday :-)).

Feel free to provide cleanup patches for both issues, they would be very 
welcome I think.

Regards,
Christian.

>
> Cheers,
> Tom
> _______________________________________________
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: lock/unlock mismatch in ttm_bo.c
       [not found]     ` <81e40cdf-6869-57fa-6858-eb77dbfb45a8-5C7GfCeVMHo@public.gmane.org>
  2018-01-20 14:46       ` Christian König
@ 2018-01-22  6:42       ` Chunming Zhou
       [not found]         ` <57fdd509-5de6-a7d8-9588-cb2c63b8f451-5C7GfCeVMHo@public.gmane.org>
  1 sibling, 1 reply; 16+ messages in thread
From: Chunming Zhou @ 2018-01-22  6:42 UTC (permalink / raw)
  To: Tom St Denis, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW



On 2018年01月20日 02:23, Tom St Denis wrote:
> On 19/01/18 01:14 PM, Tom St Denis wrote:
>> Hi all,
>>
>> In the function ttm_bo_cleanup_refs() it seems possible to get to 
>> line 551 without entering the block on 516 which means you'll be 
>> unlocking a mutex that wasn't locked.
>>
>> Now it might be that in the course of the API this pattern cannot be 
>> expressed but it's not clear from the function alone that that is the 
>> case.
>
>
> Looking further it seems the behaviour depends on locking in parent 
> callers.  That's kinda a no-no right?  Shouldn't the lock be 
> taken/released in the same function ideally?
Same feelings

Regards,
David Zhou
>
> (also there are a handful of style issues I'll write up some patches 
> for on Monday :-)).
>
> Cheers,
> Tom
> _______________________________________________
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: lock/unlock mismatch in ttm_bo.c
       [not found]         ` <57fdd509-5de6-a7d8-9588-cb2c63b8f451-5C7GfCeVMHo@public.gmane.org>
@ 2018-01-22 11:29           ` Tom St Denis
  2018-01-23 19:25           ` Tom St Denis
  1 sibling, 0 replies; 16+ messages in thread
From: Tom St Denis @ 2018-01-22 11:29 UTC (permalink / raw)
  To: Chunming Zhou, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

On 22/01/18 01:42 AM, Chunming Zhou wrote:
> 
> 
> On 2018年01月20日 02:23, Tom St Denis wrote:
>> On 19/01/18 01:14 PM, Tom St Denis wrote:
>>> Hi all,
>>>
>>> In the function ttm_bo_cleanup_refs() it seems possible to get to 
>>> line 551 without entering the block on 516 which means you'll be 
>>> unlocking a mutex that wasn't locked.
>>>
>>> Now it might be that in the course of the API this pattern cannot be 
>>> expressed but it's not clear from the function alone that that is the 
>>> case.
>>
>>
>> Looking further it seems the behaviour depends on locking in parent 
>> callers.  That's kinda a no-no right?  Shouldn't the lock be 
>> taken/released in the same function ideally?
> Same feelings

Hi David,

Ok I'll see if I can sort this out.

Cheers,
Tom

> 
> Regards,
> David Zhou
>>
>> (also there are a handful of style issues I'll write up some patches 
>> for on Monday :-)).
>>
>> Cheers,
>> Tom
>> _______________________________________________
>> amd-gfx mailing list
>> amd-gfx@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
> 

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: lock/unlock mismatch in ttm_bo.c
       [not found]         ` <57fdd509-5de6-a7d8-9588-cb2c63b8f451-5C7GfCeVMHo@public.gmane.org>
  2018-01-22 11:29           ` Tom St Denis
@ 2018-01-23 19:25           ` Tom St Denis
       [not found]             ` <6c1faffd-5347-5763-e377-fd18f2e65928-5C7GfCeVMHo@public.gmane.org>
  1 sibling, 1 reply; 16+ messages in thread
From: Tom St Denis @ 2018-01-23 19:25 UTC (permalink / raw)
  To: Chunming Zhou; +Cc: Koenig, Christian, amd-gfx mailing list

[-- Attachment #1: Type: text/plain, Size: 1112 bytes --]

On 22/01/18 01:42 AM, Chunming Zhou wrote:
> 
> 
> On 2018年01月20日 02:23, Tom St Denis wrote:
>> On 19/01/18 01:14 PM, Tom St Denis wrote:
>>> Hi all,
>>>
>>> In the function ttm_bo_cleanup_refs() it seems possible to get to 
>>> line 551 without entering the block on 516 which means you'll be 
>>> unlocking a mutex that wasn't locked.
>>>
>>> Now it might be that in the course of the API this pattern cannot be 
>>> expressed but it's not clear from the function alone that that is the 
>>> case.
>>
>>
>> Looking further it seems the behaviour depends on locking in parent 
>> callers.  That's kinda a no-no right?  Shouldn't the lock be 
>> taken/released in the same function ideally?
> Same feelings
> 
> Regards,
> David Zhou

Attached is a patch that addresses this.

I can't see any obvious race in functions that call 
ttm_bo_cleanup_refs() between the time they let go of the lock and the 
time it's taken again in the call.

Running it on my system doesn't produce anything notable though the 
KASAN with DRI_PRIME=1 issue is still there (this patch neither causes 
that nor fixes it).

Tom

[-- Attachment #2: 0001-drm-ttm-Force-lock-to-be-taken-released-inside-ttm_b.patch --]
[-- Type: text/x-patch, Size: 2938 bytes --]

>From 017ebff407e8d9313f09e4ddd109ac54c3a23fba Mon Sep 17 00:00:00 2001
From: Tom St Denis <tom.stdenis-5C7GfCeVMHo@public.gmane.org>
Date: Mon, 22 Jan 2018 06:54:21 -0500
Subject: [PATCH] drm/ttm: Force lock to be taken/released inside
 ttm_bo_cleanup_refs()

The function ttm_bo_cleanup_refs() had conditional behaviour on
the global LRU lock taken outside the function.

This patch changes it so the lock is always untaken before the
function ttm_bo_cleanup_refs() is called and the function takes
the lock itself.

Signed-off-by: Tom St Denis <tom.stdenis-5C7GfCeVMHo@public.gmane.org>
---
 drivers/gpu/drm/ttm/ttm_bo.c | 14 +++++---------
 1 file changed, 5 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index d33a6bb742a1..cbc4277eb76f 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -487,9 +487,6 @@ static void ttm_bo_cleanup_refs_or_queue(struct ttm_buffer_object *bo)
  * If bo idle, remove from delayed- and lru lists, and unref.
  * If not idle, do nothing.
  *
- * Must be called with lru_lock and reservation held, this function
- * will drop the lru lock and optionally the reservation lock before returning.
- *
  * @interruptible         Any sleeps should occur interruptibly.
  * @no_wait_gpu           Never wait for gpu. Return -EBUSY instead.
  * @unlock_resv           Unlock the reservation lock as well.
@@ -503,6 +500,8 @@ static int ttm_bo_cleanup_refs(struct ttm_buffer_object *bo,
 	struct reservation_object *resv;
 	int ret;
 
+	spin_lock(&glob->lru_lock);
+
 	if (unlikely(list_empty(&bo->ddestroy)))
 		resv = bo->resv;
 	else
@@ -586,17 +585,12 @@ static bool ttm_bo_delayed_delete(struct ttm_bo_device *bdev, bool remove_all)
 		kref_get(&bo->list_kref);
 		list_move_tail(&bo->ddestroy, &removed);
 
+		spin_unlock(&glob->lru_lock);
 		if (remove_all || bo->resv != &bo->ttm_resv) {
-			spin_unlock(&glob->lru_lock);
 			reservation_object_lock(bo->resv, NULL);
-
-			spin_lock(&glob->lru_lock);
 			ttm_bo_cleanup_refs(bo, false, !remove_all, true);
-
 		} else if (reservation_object_trylock(bo->resv)) {
 			ttm_bo_cleanup_refs(bo, false, !remove_all, true);
-		} else {
-			spin_unlock(&glob->lru_lock);
 		}
 
 		kref_put(&bo->list_kref, ttm_bo_release_list);
@@ -782,6 +776,7 @@ static int ttm_mem_evict_first(struct ttm_bo_device *bdev,
 	kref_get(&bo->list_kref);
 
 	if (!list_empty(&bo->ddestroy)) {
+		spin_unlock(&glob->lru_lock);
 		ret = ttm_bo_cleanup_refs(bo, ctx->interruptible,
 					  ctx->no_wait_gpu, locked);
 		kref_put(&bo->list_kref, ttm_bo_release_list);
@@ -1730,6 +1725,7 @@ int ttm_bo_swapout(struct ttm_bo_global *glob, struct ttm_operation_ctx *ctx)
 	kref_get(&bo->list_kref);
 
 	if (!list_empty(&bo->ddestroy)) {
+		spin_unlock(&glob->lru_lock);
 		ret = ttm_bo_cleanup_refs(bo, false, false, locked);
 		kref_put(&bo->list_kref, ttm_bo_release_list);
 		return ret;
-- 
2.14.3


[-- Attachment #3: Type: text/plain, Size: 154 bytes --]

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* RE: lock/unlock mismatch in ttm_bo.c
       [not found]             ` <6c1faffd-5347-5763-e377-fd18f2e65928-5C7GfCeVMHo@public.gmane.org>
@ 2018-01-24  2:33               ` He, Roger
  2018-01-24  3:09               ` Chunming Zhou
  2018-01-24  9:03               ` Christian König
  2 siblings, 0 replies; 16+ messages in thread
From: He, Roger @ 2018-01-24  2:33 UTC (permalink / raw)
  To: StDenis, Tom, Zhou, David(ChunMing)
  Cc: Koenig, Christian, amd-gfx mailing list

The patch looks fine to me, please send it to dri mail list.

Thanks
Roger(Hongbo.He)

-----Original Message-----
From: amd-gfx [mailto:amd-gfx-bounces@lists.freedesktop.org] On Behalf Of Tom St Denis
Sent: Wednesday, January 24, 2018 3:25 AM
To: Zhou, David(ChunMing) <David1.Zhou@amd.com>
Cc: Koenig, Christian <Christian.Koenig@amd.com>; amd-gfx mailing list <amd-gfx@lists.freedesktop.org>
Subject: Re: lock/unlock mismatch in ttm_bo.c

On 22/01/18 01:42 AM, Chunming Zhou wrote:
> 
> 
> On 2018年01月20日 02:23, Tom St Denis wrote:
>> On 19/01/18 01:14 PM, Tom St Denis wrote:
>>> Hi all,
>>>
>>> In the function ttm_bo_cleanup_refs() it seems possible to get to 
>>> line 551 without entering the block on 516 which means you'll be 
>>> unlocking a mutex that wasn't locked.
>>>
>>> Now it might be that in the course of the API this pattern cannot be 
>>> expressed but it's not clear from the function alone that that is 
>>> the case.
>>
>>
>> Looking further it seems the behaviour depends on locking in parent 
>> callers.  That's kinda a no-no right?  Shouldn't the lock be 
>> taken/released in the same function ideally?
> Same feelings
> 
> Regards,
> David Zhou

Attached is a patch that addresses this.

I can't see any obvious race in functions that call
ttm_bo_cleanup_refs() between the time they let go of the lock and the time it's taken again in the call.

Running it on my system doesn't produce anything notable though the KASAN with DRI_PRIME=1 issue is still there (this patch neither causes that nor fixes it).

Tom
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: lock/unlock mismatch in ttm_bo.c
       [not found]             ` <6c1faffd-5347-5763-e377-fd18f2e65928-5C7GfCeVMHo@public.gmane.org>
  2018-01-24  2:33               ` He, Roger
@ 2018-01-24  3:09               ` Chunming Zhou
       [not found]                 ` <95054871-ebd3-06a4-e845-9e2d7c7a56f8-5C7GfCeVMHo@public.gmane.org>
  2018-01-24  9:03               ` Christian König
  2 siblings, 1 reply; 16+ messages in thread
From: Chunming Zhou @ 2018-01-24  3:09 UTC (permalink / raw)
  To: Tom St Denis; +Cc: Koenig, Christian, amd-gfx mailing list

[-- Attachment #1: Type: text/plain, Size: 1463 bytes --]

Hi Tom,

Your change looks ok, as Roger suggested, you can send both dri and amd 
mail lists.

In addition, when I review your patches, I found a bug as the attached, 
you can send it together with yours if you think that's a right fix.

Regards,

David Zhou


On 2018年01月24日 03:25, Tom St Denis wrote:
> On 22/01/18 01:42 AM, Chunming Zhou wrote:
>>
>>
>> On 2018年01月20日 02:23, Tom St Denis wrote:
>>> On 19/01/18 01:14 PM, Tom St Denis wrote:
>>>> Hi all,
>>>>
>>>> In the function ttm_bo_cleanup_refs() it seems possible to get to 
>>>> line 551 without entering the block on 516 which means you'll be 
>>>> unlocking a mutex that wasn't locked.
>>>>
>>>> Now it might be that in the course of the API this pattern cannot 
>>>> be expressed but it's not clear from the function alone that that 
>>>> is the case.
>>>
>>>
>>> Looking further it seems the behaviour depends on locking in parent 
>>> callers.  That's kinda a no-no right?  Shouldn't the lock be 
>>> taken/released in the same function ideally?
>> Same feelings
>>
>> Regards,
>> David Zhou
>
> Attached is a patch that addresses this.
>
> I can't see any obvious race in functions that call 
> ttm_bo_cleanup_refs() between the time they let go of the lock and the 
> time it's taken again in the call.
>
> Running it on my system doesn't produce anything notable though the 
> KASAN with DRI_PRIME=1 issue is still there (this patch neither causes 
> that nor fixes it).
>
> Tom


[-- Attachment #2: 0001-ttm-fix-double-trylock-reservation.patch --]
[-- Type: text/x-patch, Size: 1029 bytes --]

>From 1375451816c12b3ad0c9b59b7a36bb80d3eab071 Mon Sep 17 00:00:00 2001
From: Chunming Zhou <david1.zhou-5C7GfCeVMHo@public.gmane.org>
Date: Wed, 24 Jan 2018 10:59:05 +0800
Subject: [PATCH] ttm: fix double trylock reservation

Change-Id: Ie2d7af8b6ce4df8bb575f7959e17e36ac81092e5
Signed-off-by: Chunming Zhou <david1.zhou-5C7GfCeVMHo@public.gmane.org>
---
 drivers/gpu/drm/ttm/ttm_bo.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index d33a6bb742a1..975017bf6a85 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -530,7 +530,7 @@ static int ttm_bo_cleanup_refs(struct ttm_buffer_object *bo,
 			return -EBUSY;
 
 		spin_lock(&glob->lru_lock);
-		if (unlock_resv && !reservation_object_trylock(bo->resv)) {
+		if (!unlock_resv && !reservation_object_trylock(bo->resv)) {
 			/*
 			 * We raced, and lost, someone else holds the reservation now,
 			 * and is probably busy in ttm_bo_cleanup_memtype_use.
-- 
2.14.1


[-- Attachment #3: Type: text/plain, Size: 154 bytes --]

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: lock/unlock mismatch in ttm_bo.c
       [not found]                 ` <95054871-ebd3-06a4-e845-9e2d7c7a56f8-5C7GfCeVMHo@public.gmane.org>
@ 2018-01-24  5:46                   ` Chunming Zhou
       [not found]                     ` <7f86e76a-a61d-4420-a752-6efde8a18e88-5C7GfCeVMHo@public.gmane.org>
  0 siblings, 1 reply; 16+ messages in thread
From: Chunming Zhou @ 2018-01-24  5:46 UTC (permalink / raw)
  To: Tom St Denis; +Cc: Koenig, Christian, amd-gfx mailing list

[-- Attachment #1: Type: text/plain, Size: 1592 bytes --]

update the fix.


On 2018年01月24日 11:09, Chunming Zhou wrote:
> Hi Tom,
>
> Your change looks ok, as Roger suggested, you can send both dri and 
> amd mail lists.
>
> In addition, when I review your patches, I found a bug as the 
> attached, you can send it together with yours if you think that's a 
> right fix.
>
> Regards,
>
> David Zhou
>
>
> On 2018年01月24日 03:25, Tom St Denis wrote:
>> On 22/01/18 01:42 AM, Chunming Zhou wrote:
>>>
>>>
>>> On 2018年01月20日 02:23, Tom St Denis wrote:
>>>> On 19/01/18 01:14 PM, Tom St Denis wrote:
>>>>> Hi all,
>>>>>
>>>>> In the function ttm_bo_cleanup_refs() it seems possible to get to 
>>>>> line 551 without entering the block on 516 which means you'll be 
>>>>> unlocking a mutex that wasn't locked.
>>>>>
>>>>> Now it might be that in the course of the API this pattern cannot 
>>>>> be expressed but it's not clear from the function alone that that 
>>>>> is the case.
>>>>
>>>>
>>>> Looking further it seems the behaviour depends on locking in parent 
>>>> callers.  That's kinda a no-no right?  Shouldn't the lock be 
>>>> taken/released in the same function ideally?
>>> Same feelings
>>>
>>> Regards,
>>> David Zhou
>>
>> Attached is a patch that addresses this.
>>
>> I can't see any obvious race in functions that call 
>> ttm_bo_cleanup_refs() between the time they let go of the lock and 
>> the time it's taken again in the call.
>>
>> Running it on my system doesn't produce anything notable though the 
>> KASAN with DRI_PRIME=1 issue is still there (this patch neither 
>> causes that nor fixes it).
>>
>> Tom
>


[-- Attachment #2: 0001-ttm-fix-double-trylock-reservation.patch --]
[-- Type: text/x-patch, Size: 1206 bytes --]

>From 9f06e6619b47c506fc87471e38de4640c9879a9f Mon Sep 17 00:00:00 2001
From: Chunming Zhou <david1.zhou-5C7GfCeVMHo@public.gmane.org>
Date: Wed, 24 Jan 2018 10:59:05 +0800
Subject: [PATCH] ttm: fix double trylock reservation

Change-Id: Ie2d7af8b6ce4df8bb575f7959e17e36ac81092e5
Signed-off-by: Chunming Zhou <david1.zhou-5C7GfCeVMHo@public.gmane.org>
---
 drivers/gpu/drm/ttm/ttm_bo.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index d33a6bb742a1..e737515405fd 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -530,7 +530,7 @@ static int ttm_bo_cleanup_refs(struct ttm_buffer_object *bo,
 			return -EBUSY;
 
 		spin_lock(&glob->lru_lock);
-		if (unlock_resv && !reservation_object_trylock(bo->resv)) {
+		if (!unlock_resv && !reservation_object_trylock(bo->resv)) {
 			/*
 			 * We raced, and lost, someone else holds the reservation now,
 			 * and is probably busy in ttm_bo_cleanup_memtype_use.
@@ -542,6 +542,7 @@ static int ttm_bo_cleanup_refs(struct ttm_buffer_object *bo,
 			spin_unlock(&glob->lru_lock);
 			return 0;
 		}
+		unlock_resv = true;
 		ret = 0;
 	}
 
-- 
2.14.1


[-- Attachment #3: Type: text/plain, Size: 154 bytes --]

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: lock/unlock mismatch in ttm_bo.c
       [not found]                     ` <7f86e76a-a61d-4420-a752-6efde8a18e88-5C7GfCeVMHo@public.gmane.org>
@ 2018-01-24  8:58                       ` Christian König
       [not found]                         ` <6a54eb41-d251-3026-6a2a-71abc724e3ff-5C7GfCeVMHo@public.gmane.org>
  0 siblings, 1 reply; 16+ messages in thread
From: Christian König @ 2018-01-24  8:58 UTC (permalink / raw)
  To: Chunming Zhou, Tom St Denis; +Cc: amd-gfx mailing list

NAK, that doesn't looks correct to me. unlock_resv means that the 
function is allowed to unlock the reservation.

So the original logic seems to do exactly what it is supposed to do. 
Please keep in mind that reservation_object_trylock returns boolean, not 
negative error code.

Regards,
Christian.

Am 24.01.2018 um 06:46 schrieb Chunming Zhou:
> update the fix.
>
>
> On 2018年01月24日 11:09, Chunming Zhou wrote:
>> Hi Tom,
>>
>> Your change looks ok, as Roger suggested, you can send both dri and 
>> amd mail lists.
>>
>> In addition, when I review your patches, I found a bug as the 
>> attached, you can send it together with yours if you think that's a 
>> right fix.
>>
>> Regards,
>>
>> David Zhou
>>
>>
>> On 2018年01月24日 03:25, Tom St Denis wrote:
>>> On 22/01/18 01:42 AM, Chunming Zhou wrote:
>>>>
>>>>
>>>> On 2018年01月20日 02:23, Tom St Denis wrote:
>>>>> On 19/01/18 01:14 PM, Tom St Denis wrote:
>>>>>> Hi all,
>>>>>>
>>>>>> In the function ttm_bo_cleanup_refs() it seems possible to get to 
>>>>>> line 551 without entering the block on 516 which means you'll be 
>>>>>> unlocking a mutex that wasn't locked.
>>>>>>
>>>>>> Now it might be that in the course of the API this pattern cannot 
>>>>>> be expressed but it's not clear from the function alone that that 
>>>>>> is the case.
>>>>>
>>>>>
>>>>> Looking further it seems the behaviour depends on locking in 
>>>>> parent callers.  That's kinda a no-no right?  Shouldn't the lock 
>>>>> be taken/released in the same function ideally?
>>>> Same feelings
>>>>
>>>> Regards,
>>>> David Zhou
>>>
>>> Attached is a patch that addresses this.
>>>
>>> I can't see any obvious race in functions that call 
>>> ttm_bo_cleanup_refs() between the time they let go of the lock and 
>>> the time it's taken again in the call.
>>>
>>> Running it on my system doesn't produce anything notable though the 
>>> KASAN with DRI_PRIME=1 issue is still there (this patch neither 
>>> causes that nor fixes it).
>>>
>>> Tom
>>
>

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: lock/unlock mismatch in ttm_bo.c
       [not found]             ` <6c1faffd-5347-5763-e377-fd18f2e65928-5C7GfCeVMHo@public.gmane.org>
  2018-01-24  2:33               ` He, Roger
  2018-01-24  3:09               ` Chunming Zhou
@ 2018-01-24  9:03               ` Christian König
  2 siblings, 0 replies; 16+ messages in thread
From: Christian König @ 2018-01-24  9:03 UTC (permalink / raw)
  To: Tom St Denis, Chunming Zhou; +Cc: amd-gfx mailing list

That patch won't work correctly like this.

When the lock is dropped it is possible that the BO is removed from the 
ddelete list and ttm_bo_cleanup_refs() starts to wait for the wrong 
reservation object.

I think we can remove the wait for bo->resv now and always wait for 
bo->ttm_resv, but I'm not 100% sure.

Need to double check the code as well,
Christian.

Am 23.01.2018 um 20:25 schrieb Tom St Denis:
> On 22/01/18 01:42 AM, Chunming Zhou wrote:
>>
>>
>> On 2018年01月20日 02:23, Tom St Denis wrote:
>>> On 19/01/18 01:14 PM, Tom St Denis wrote:
>>>> Hi all,
>>>>
>>>> In the function ttm_bo_cleanup_refs() it seems possible to get to 
>>>> line 551 without entering the block on 516 which means you'll be 
>>>> unlocking a mutex that wasn't locked.
>>>>
>>>> Now it might be that in the course of the API this pattern cannot 
>>>> be expressed but it's not clear from the function alone that that 
>>>> is the case.
>>>
>>>
>>> Looking further it seems the behaviour depends on locking in parent 
>>> callers.  That's kinda a no-no right?  Shouldn't the lock be 
>>> taken/released in the same function ideally?
>> Same feelings
>>
>> Regards,
>> David Zhou
>
> Attached is a patch that addresses this.
>
> I can't see any obvious race in functions that call 
> ttm_bo_cleanup_refs() between the time they let go of the lock and the 
> time it's taken again in the call.
>
> Running it on my system doesn't produce anything notable though the 
> KASAN with DRI_PRIME=1 issue is still there (this patch neither causes 
> that nor fixes it).
>
> Tom

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: lock/unlock mismatch in ttm_bo.c
       [not found]                         ` <6a54eb41-d251-3026-6a2a-71abc724e3ff-5C7GfCeVMHo@public.gmane.org>
@ 2018-01-24  9:06                           ` Chunming Zhou
       [not found]                             ` <52f9c9e5-8838-448c-0bef-78d3821313a2-5C7GfCeVMHo@public.gmane.org>
  2018-01-24 11:09                           ` Tom St Denis
  1 sibling, 1 reply; 16+ messages in thread
From: Chunming Zhou @ 2018-01-24  9:06 UTC (permalink / raw)
  To: Christian König, Tom St Denis; +Cc: amd-gfx mailing list



On 2018年01月24日 16:58, Christian König wrote:
> NAK, that doesn't looks correct to me. unlock_resv means that the 
> function is allowed to unlock the reservation.
>
> So the original logic seems to do exactly what it is supposed to do. 
> Please keep in mind that reservation_object_trylock returns boolean, 
> not negative error code.
'unlock_resv = true' means ttm_bo_cleanup_refs() is locked outside, then 
reservation_object_trylock will return false, then original code 'if 
(unlock_resv && !reservation_object_trylock(bo->resv))' will always be 
true, and then return directly.
Is my understanding right? is that expected?

Regards,
David Zhou

>
> Regards,
> Christian.
>
> Am 24.01.2018 um 06:46 schrieb Chunming Zhou:
>> update the fix.
>>
>>
>> On 2018年01月24日 11:09, Chunming Zhou wrote:
>>> Hi Tom,
>>>
>>> Your change looks ok, as Roger suggested, you can send both dri and 
>>> amd mail lists.
>>>
>>> In addition, when I review your patches, I found a bug as the 
>>> attached, you can send it together with yours if you think that's a 
>>> right fix.
>>>
>>> Regards,
>>>
>>> David Zhou
>>>
>>>
>>> On 2018年01月24日 03:25, Tom St Denis wrote:
>>>> On 22/01/18 01:42 AM, Chunming Zhou wrote:
>>>>>
>>>>>
>>>>> On 2018年01月20日 02:23, Tom St Denis wrote:
>>>>>> On 19/01/18 01:14 PM, Tom St Denis wrote:
>>>>>>> Hi all,
>>>>>>>
>>>>>>> In the function ttm_bo_cleanup_refs() it seems possible to get 
>>>>>>> to line 551 without entering the block on 516 which means you'll 
>>>>>>> be unlocking a mutex that wasn't locked.
>>>>>>>
>>>>>>> Now it might be that in the course of the API this pattern 
>>>>>>> cannot be expressed but it's not clear from the function alone 
>>>>>>> that that is the case.
>>>>>>
>>>>>>
>>>>>> Looking further it seems the behaviour depends on locking in 
>>>>>> parent callers.  That's kinda a no-no right? Shouldn't the lock 
>>>>>> be taken/released in the same function ideally?
>>>>> Same feelings
>>>>>
>>>>> Regards,
>>>>> David Zhou
>>>>
>>>> Attached is a patch that addresses this.
>>>>
>>>> I can't see any obvious race in functions that call 
>>>> ttm_bo_cleanup_refs() between the time they let go of the lock and 
>>>> the time it's taken again in the call.
>>>>
>>>> Running it on my system doesn't produce anything notable though the 
>>>> KASAN with DRI_PRIME=1 issue is still there (this patch neither 
>>>> causes that nor fixes it).
>>>>
>>>> Tom
>>>
>>
>

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: lock/unlock mismatch in ttm_bo.c
       [not found]                             ` <52f9c9e5-8838-448c-0bef-78d3821313a2-5C7GfCeVMHo@public.gmane.org>
@ 2018-01-24  9:09                               ` Chunming Zhou
  2018-01-24  9:10                               ` Christian König
  1 sibling, 0 replies; 16+ messages in thread
From: Chunming Zhou @ 2018-01-24  9:09 UTC (permalink / raw)
  To: Christian König, Tom St Denis; +Cc: amd-gfx mailing list



On 2018年01月24日 17:06, Chunming Zhou wrote:
>
>
> On 2018年01月24日 16:58, Christian König wrote:
>> NAK, that doesn't looks correct to me. unlock_resv means that the 
>> function is allowed to unlock the reservation.
>>
>> So the original logic seems to do exactly what it is supposed to do. 
>> Please keep in mind that reservation_object_trylock returns boolean, 
>> not negative error code.
> 'unlock_resv = true' means ttm_bo_cleanup_refs() is locked outside, 
> then reservation_object_trylock will return false, then original code 
> 'if (unlock_resv && !reservation_object_trylock(bo->resv))' will 
> always be true, and then return directly.
> Is my understanding right? is that expected?
                 if (unlock_resv)
                         reservation_object_unlock(bo->resv);

Sorry for noise, it was unlocked before.

Regards,
David Zhou
>
> Regards,
> David Zhou
>
>>
>> Regards,
>> Christian.
>>
>> Am 24.01.2018 um 06:46 schrieb Chunming Zhou:
>>> update the fix.
>>>
>>>
>>> On 2018年01月24日 11:09, Chunming Zhou wrote:
>>>> Hi Tom,
>>>>
>>>> Your change looks ok, as Roger suggested, you can send both dri and 
>>>> amd mail lists.
>>>>
>>>> In addition, when I review your patches, I found a bug as the 
>>>> attached, you can send it together with yours if you think that's a 
>>>> right fix.
>>>>
>>>> Regards,
>>>>
>>>> David Zhou
>>>>
>>>>
>>>> On 2018年01月24日 03:25, Tom St Denis wrote:
>>>>> On 22/01/18 01:42 AM, Chunming Zhou wrote:
>>>>>>
>>>>>>
>>>>>> On 2018年01月20日 02:23, Tom St Denis wrote:
>>>>>>> On 19/01/18 01:14 PM, Tom St Denis wrote:
>>>>>>>> Hi all,
>>>>>>>>
>>>>>>>> In the function ttm_bo_cleanup_refs() it seems possible to get 
>>>>>>>> to line 551 without entering the block on 516 which means 
>>>>>>>> you'll be unlocking a mutex that wasn't locked.
>>>>>>>>
>>>>>>>> Now it might be that in the course of the API this pattern 
>>>>>>>> cannot be expressed but it's not clear from the function alone 
>>>>>>>> that that is the case.
>>>>>>>
>>>>>>>
>>>>>>> Looking further it seems the behaviour depends on locking in 
>>>>>>> parent callers.  That's kinda a no-no right? Shouldn't the lock 
>>>>>>> be taken/released in the same function ideally?
>>>>>> Same feelings
>>>>>>
>>>>>> Regards,
>>>>>> David Zhou
>>>>>
>>>>> Attached is a patch that addresses this.
>>>>>
>>>>> I can't see any obvious race in functions that call 
>>>>> ttm_bo_cleanup_refs() between the time they let go of the lock and 
>>>>> the time it's taken again in the call.
>>>>>
>>>>> Running it on my system doesn't produce anything notable though 
>>>>> the KASAN with DRI_PRIME=1 issue is still there (this patch 
>>>>> neither causes that nor fixes it).
>>>>>
>>>>> Tom
>>>>
>>>
>>
>
> _______________________________________________
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: lock/unlock mismatch in ttm_bo.c
       [not found]                             ` <52f9c9e5-8838-448c-0bef-78d3821313a2-5C7GfCeVMHo@public.gmane.org>
  2018-01-24  9:09                               ` Chunming Zhou
@ 2018-01-24  9:10                               ` Christian König
       [not found]                                 ` <2c32e0e5-b7bc-25f0-184f-6615d31d6a4d-5C7GfCeVMHo@public.gmane.org>
  1 sibling, 1 reply; 16+ messages in thread
From: Christian König @ 2018-01-24  9:10 UTC (permalink / raw)
  To: Chunming Zhou, Tom St Denis; +Cc: amd-gfx mailing list

Am 24.01.2018 um 10:06 schrieb Chunming Zhou:
>
>
> On 2018年01月24日 16:58, Christian König wrote:
>> NAK, that doesn't looks correct to me. unlock_resv means that the 
>> function is allowed to unlock the reservation.
>>
>> So the original logic seems to do exactly what it is supposed to do. 
>> Please keep in mind that reservation_object_trylock returns boolean, 
>> not negative error code.
> 'unlock_resv = true' means ttm_bo_cleanup_refs() is locked outside, 
> then reservation_object_trylock will return false, then original code 
> 'if (unlock_resv && !reservation_object_trylock(bo->resv))' will 
> always be true, and then return directly.
> Is my understanding right? is that expected?

Ah, crap you are right. That must be "!unlock_resv && !reserv...".

But why do you want to set unlock_resv to true?

Regards,
Christian.

>
> Regards,
> David Zhou
>
>>
>> Regards,
>> Christian.
>>
>> Am 24.01.2018 um 06:46 schrieb Chunming Zhou:
>>> update the fix.
>>>
>>>
>>> On 2018年01月24日 11:09, Chunming Zhou wrote:
>>>> Hi Tom,
>>>>
>>>> Your change looks ok, as Roger suggested, you can send both dri and 
>>>> amd mail lists.
>>>>
>>>> In addition, when I review your patches, I found a bug as the 
>>>> attached, you can send it together with yours if you think that's a 
>>>> right fix.
>>>>
>>>> Regards,
>>>>
>>>> David Zhou
>>>>
>>>>
>>>> On 2018年01月24日 03:25, Tom St Denis wrote:
>>>>> On 22/01/18 01:42 AM, Chunming Zhou wrote:
>>>>>>
>>>>>>
>>>>>> On 2018年01月20日 02:23, Tom St Denis wrote:
>>>>>>> On 19/01/18 01:14 PM, Tom St Denis wrote:
>>>>>>>> Hi all,
>>>>>>>>
>>>>>>>> In the function ttm_bo_cleanup_refs() it seems possible to get 
>>>>>>>> to line 551 without entering the block on 516 which means 
>>>>>>>> you'll be unlocking a mutex that wasn't locked.
>>>>>>>>
>>>>>>>> Now it might be that in the course of the API this pattern 
>>>>>>>> cannot be expressed but it's not clear from the function alone 
>>>>>>>> that that is the case.
>>>>>>>
>>>>>>>
>>>>>>> Looking further it seems the behaviour depends on locking in 
>>>>>>> parent callers.  That's kinda a no-no right? Shouldn't the lock 
>>>>>>> be taken/released in the same function ideally?
>>>>>> Same feelings
>>>>>>
>>>>>> Regards,
>>>>>> David Zhou
>>>>>
>>>>> Attached is a patch that addresses this.
>>>>>
>>>>> I can't see any obvious race in functions that call 
>>>>> ttm_bo_cleanup_refs() between the time they let go of the lock and 
>>>>> the time it's taken again in the call.
>>>>>
>>>>> Running it on my system doesn't produce anything notable though 
>>>>> the KASAN with DRI_PRIME=1 issue is still there (this patch 
>>>>> neither causes that nor fixes it).
>>>>>
>>>>> Tom
>>>>
>>>
>>
>

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: lock/unlock mismatch in ttm_bo.c
       [not found]                                 ` <2c32e0e5-b7bc-25f0-184f-6615d31d6a4d-5C7GfCeVMHo@public.gmane.org>
@ 2018-01-24  9:12                                   ` Christian König
  0 siblings, 0 replies; 16+ messages in thread
From: Christian König @ 2018-01-24  9:12 UTC (permalink / raw)
  To: Chunming Zhou, Tom St Denis; +Cc: amd-gfx mailing list

Am 24.01.2018 um 10:10 schrieb Christian König:
> Am 24.01.2018 um 10:06 schrieb Chunming Zhou:
>>
>>
>> On 2018年01月24日 16:58, Christian König wrote:
>>> NAK, that doesn't looks correct to me. unlock_resv means that the 
>>> function is allowed to unlock the reservation.
>>>
>>> So the original logic seems to do exactly what it is supposed to do. 
>>> Please keep in mind that reservation_object_trylock returns boolean, 
>>> not negative error code.
>> 'unlock_resv = true' means ttm_bo_cleanup_refs() is locked outside, 
>> then reservation_object_trylock will return false, then original code 
>> 'if (unlock_resv && !reservation_object_trylock(bo->resv))' will 
>> always be true, and then return directly.
>> Is my understanding right? is that expected?
>
> Ah, crap you are right. That must be "!unlock_resv && !reserv...".

No wait a moment. Your initial assumption seems to be incorrect: 
'unlock_resv = true' means ttm_bo_cleanup_refs() is locked outside

When unlock_resv = false then that means ttm_bo_cleanup_refs() is locked 
outside.

And when we assume this, the original code is indeed right.

Christian.

>
> But why do you want to set unlock_resv to true?
>
> Regards,
> Christian.
>
>>
>> Regards,
>> David Zhou
>>
>>>
>>> Regards,
>>> Christian.
>>>
>>> Am 24.01.2018 um 06:46 schrieb Chunming Zhou:
>>>> update the fix.
>>>>
>>>>
>>>> On 2018年01月24日 11:09, Chunming Zhou wrote:
>>>>> Hi Tom,
>>>>>
>>>>> Your change looks ok, as Roger suggested, you can send both dri 
>>>>> and amd mail lists.
>>>>>
>>>>> In addition, when I review your patches, I found a bug as the 
>>>>> attached, you can send it together with yours if you think that's 
>>>>> a right fix.
>>>>>
>>>>> Regards,
>>>>>
>>>>> David Zhou
>>>>>
>>>>>
>>>>> On 2018年01月24日 03:25, Tom St Denis wrote:
>>>>>> On 22/01/18 01:42 AM, Chunming Zhou wrote:
>>>>>>>
>>>>>>>
>>>>>>> On 2018年01月20日 02:23, Tom St Denis wrote:
>>>>>>>> On 19/01/18 01:14 PM, Tom St Denis wrote:
>>>>>>>>> Hi all,
>>>>>>>>>
>>>>>>>>> In the function ttm_bo_cleanup_refs() it seems possible to get 
>>>>>>>>> to line 551 without entering the block on 516 which means 
>>>>>>>>> you'll be unlocking a mutex that wasn't locked.
>>>>>>>>>
>>>>>>>>> Now it might be that in the course of the API this pattern 
>>>>>>>>> cannot be expressed but it's not clear from the function alone 
>>>>>>>>> that that is the case.
>>>>>>>>
>>>>>>>>
>>>>>>>> Looking further it seems the behaviour depends on locking in 
>>>>>>>> parent callers.  That's kinda a no-no right? Shouldn't the lock 
>>>>>>>> be taken/released in the same function ideally?
>>>>>>> Same feelings
>>>>>>>
>>>>>>> Regards,
>>>>>>> David Zhou
>>>>>>
>>>>>> Attached is a patch that addresses this.
>>>>>>
>>>>>> I can't see any obvious race in functions that call 
>>>>>> ttm_bo_cleanup_refs() between the time they let go of the lock 
>>>>>> and the time it's taken again in the call.
>>>>>>
>>>>>> Running it on my system doesn't produce anything notable though 
>>>>>> the KASAN with DRI_PRIME=1 issue is still there (this patch 
>>>>>> neither causes that nor fixes it).
>>>>>>
>>>>>> Tom
>>>>>
>>>>
>>>
>>
>

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: lock/unlock mismatch in ttm_bo.c
       [not found]                         ` <6a54eb41-d251-3026-6a2a-71abc724e3ff-5C7GfCeVMHo@public.gmane.org>
  2018-01-24  9:06                           ` Chunming Zhou
@ 2018-01-24 11:09                           ` Tom St Denis
  1 sibling, 0 replies; 16+ messages in thread
From: Tom St Denis @ 2018-01-24 11:09 UTC (permalink / raw)
  To: Christian König, Chunming Zhou; +Cc: amd-gfx mailing list

On 24/01/18 03:58 AM, Christian König wrote:
> NAK, that doesn't looks correct to me. unlock_resv means that the 
> function is allowed to unlock the reservation.
> 
> So the original logic seems to do exactly what it is supposed to do. 
> Please keep in mind that reservation_object_trylock returns boolean, not 
> negative error code.

If the cleanup_refs() function assumes the lock is held it shouldn't 
release it then.  My only goal here was to cleanup the lock usage so 
that it's only manipulated in a single context.

The lock is unlocked a bit before the function ends do the other ttm 
functions cleanup_refs() call require the ability to take the lock?

Tom


> 
> Regards,
> Christian.
> 
> Am 24.01.2018 um 06:46 schrieb Chunming Zhou:
>> update the fix.
>>
>>
>> On 2018年01月24日 11:09, Chunming Zhou wrote:
>>> Hi Tom,
>>>
>>> Your change looks ok, as Roger suggested, you can send both dri and 
>>> amd mail lists.
>>>
>>> In addition, when I review your patches, I found a bug as the 
>>> attached, you can send it together with yours if you think that's a 
>>> right fix.
>>>
>>> Regards,
>>>
>>> David Zhou
>>>
>>>
>>> On 2018年01月24日 03:25, Tom St Denis wrote:
>>>> On 22/01/18 01:42 AM, Chunming Zhou wrote:
>>>>>
>>>>>
>>>>> On 2018年01月20日 02:23, Tom St Denis wrote:
>>>>>> On 19/01/18 01:14 PM, Tom St Denis wrote:
>>>>>>> Hi all,
>>>>>>>
>>>>>>> In the function ttm_bo_cleanup_refs() it seems possible to get to 
>>>>>>> line 551 without entering the block on 516 which means you'll be 
>>>>>>> unlocking a mutex that wasn't locked.
>>>>>>>
>>>>>>> Now it might be that in the course of the API this pattern cannot 
>>>>>>> be expressed but it's not clear from the function alone that that 
>>>>>>> is the case.
>>>>>>
>>>>>>
>>>>>> Looking further it seems the behaviour depends on locking in 
>>>>>> parent callers.  That's kinda a no-no right?  Shouldn't the lock 
>>>>>> be taken/released in the same function ideally?
>>>>> Same feelings
>>>>>
>>>>> Regards,
>>>>> David Zhou
>>>>
>>>> Attached is a patch that addresses this.
>>>>
>>>> I can't see any obvious race in functions that call 
>>>> ttm_bo_cleanup_refs() between the time they let go of the lock and 
>>>> the time it's taken again in the call.
>>>>
>>>> Running it on my system doesn't produce anything notable though the 
>>>> KASAN with DRI_PRIME=1 issue is still there (this patch neither 
>>>> causes that nor fixes it).
>>>>
>>>> Tom
>>>
>>
> 

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2018-01-24 11:09 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-01-19 18:14 lock/unlock mismatch in ttm_bo.c Tom St Denis
     [not found] ` <1713476e-c17f-9eea-ebd3-6999c760b678-5C7GfCeVMHo@public.gmane.org>
2018-01-19 18:23   ` Tom St Denis
     [not found]     ` <81e40cdf-6869-57fa-6858-eb77dbfb45a8-5C7GfCeVMHo@public.gmane.org>
2018-01-20 14:46       ` Christian König
2018-01-22  6:42       ` Chunming Zhou
     [not found]         ` <57fdd509-5de6-a7d8-9588-cb2c63b8f451-5C7GfCeVMHo@public.gmane.org>
2018-01-22 11:29           ` Tom St Denis
2018-01-23 19:25           ` Tom St Denis
     [not found]             ` <6c1faffd-5347-5763-e377-fd18f2e65928-5C7GfCeVMHo@public.gmane.org>
2018-01-24  2:33               ` He, Roger
2018-01-24  3:09               ` Chunming Zhou
     [not found]                 ` <95054871-ebd3-06a4-e845-9e2d7c7a56f8-5C7GfCeVMHo@public.gmane.org>
2018-01-24  5:46                   ` Chunming Zhou
     [not found]                     ` <7f86e76a-a61d-4420-a752-6efde8a18e88-5C7GfCeVMHo@public.gmane.org>
2018-01-24  8:58                       ` Christian König
     [not found]                         ` <6a54eb41-d251-3026-6a2a-71abc724e3ff-5C7GfCeVMHo@public.gmane.org>
2018-01-24  9:06                           ` Chunming Zhou
     [not found]                             ` <52f9c9e5-8838-448c-0bef-78d3821313a2-5C7GfCeVMHo@public.gmane.org>
2018-01-24  9:09                               ` Chunming Zhou
2018-01-24  9:10                               ` Christian König
     [not found]                                 ` <2c32e0e5-b7bc-25f0-184f-6615d31d6a4d-5C7GfCeVMHo@public.gmane.org>
2018-01-24  9:12                                   ` Christian König
2018-01-24 11:09                           ` Tom St Denis
2018-01-24  9:03               ` Christian König

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.