* [PATCH 1/4] drm/amdgpu: Optimization of critical section
From: Alex Xie @ 2017-06-12 20:31 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW; +Cc: Alex Xie

Move the loop that unreferences BOs and several
memory free calls out of the critical section.

Signed-off-by: Alex Xie <AlexBin.Xie@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c
index a664987..02c138f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c
@@ -75,9 +75,12 @@ static void amdgpu_bo_list_destroy(struct amdgpu_fpriv *fpriv, int id)
 		/* Another user may have a reference to this list still */
 		mutex_lock(&list->lock);
 		mutex_unlock(&list->lock);
+		mutex_unlock(&fpriv->bo_list_lock);
 		amdgpu_bo_list_free(list);
 	}
-	mutex_unlock(&fpriv->bo_list_lock);
+	else {
+		mutex_unlock(&fpriv->bo_list_lock);
+	}
 }
 
 static int amdgpu_bo_list_set(struct amdgpu_device *adev,
-- 
2.7.4


* [PATCH 2/4] drm/amdgpu: Optimization of critical section
From: Alex Xie @ 2017-06-12 20:31 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW; +Cc: Alex Xie

Make the critical section smaller. There is no
need to protect the bo_list, because no other
task can access the newly created BO list yet.

Signed-off-by: Alex Xie <AlexBin.Xie@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c
index 02c138f..c994a04 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c
@@ -48,8 +48,9 @@ static int amdgpu_bo_list_create(struct amdgpu_fpriv *fpriv,
 	mutex_lock(&fpriv->bo_list_lock);
 	r = idr_alloc(&fpriv->bo_list_handles, *result,
 		      1, 0, GFP_KERNEL);
+	mutex_unlock(&fpriv->bo_list_lock);
+
 	if (r < 0) {
-		mutex_unlock(&fpriv->bo_list_lock);
 		kfree(*result);
 		return r;
 	}
@@ -60,7 +61,6 @@ static int amdgpu_bo_list_create(struct amdgpu_fpriv *fpriv,
 	(*result)->array = NULL;
 
 	mutex_lock(&(*result)->lock);
-	mutex_unlock(&fpriv->bo_list_lock);
 
 	return 0;
 }
-- 
2.7.4


* [PATCH 3/4] drm/amdgpu: Optimization of mutex usage
From: Alex Xie @ 2017-06-12 20:31 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW; +Cc: Alex Xie

There is no need to lock the bo_list mutex,
because no other task can access the newly
created BO list yet.

Signed-off-by: Alex Xie <AlexBin.Xie@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c
index c994a04..4363f28 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c
@@ -60,8 +60,6 @@ static int amdgpu_bo_list_create(struct amdgpu_fpriv *fpriv,
 	(*result)->num_entries = 0;
 	(*result)->array = NULL;
 
-	mutex_lock(&(*result)->lock);
-
 	return 0;
 }
 
@@ -282,7 +280,6 @@ int amdgpu_bo_list_ioctl(struct drm_device *dev, void *data,
 
 		r = amdgpu_bo_list_set(adev, filp, list, info,
 					      args->in.bo_number);
-		amdgpu_bo_list_put(list);
 		if (r)
 			goto error_free;
 
-- 
2.7.4


* [PATCH 4/4] drm/amdgpu: Optimize mutex usage
From: Alex Xie @ 2017-06-12 20:31 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW; +Cc: Alex Xie

Use an rw_semaphore instead of a mutex for bo_list_lock.

In the original amdgpu_bo_list_get, the wait for
result->lock can be quite long while the bo_list_lock
mutex is held. That can make other tasks wait a long
time for bo_list_lock as well. Changing bo_list_lock
to an rw_semaphore avoids most of that waiting.

Secondly, this patch allows several tasks (readers of
the idr) to proceed at the same time.

Signed-off-by: Alex Xie <AlexBin.Xie@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h         |  3 ++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c | 14 +++++++-------
 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c     |  3 +--
 3 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 063fc73..bfc40d7 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -35,6 +35,7 @@
 #include <linux/rbtree.h>
 #include <linux/hashtable.h>
 #include <linux/dma-fence.h>
+#include <linux/rwsem.h>
 
 #include <ttm/ttm_bo_api.h>
 #include <ttm/ttm_bo_driver.h>
@@ -859,7 +860,7 @@ void amdgpu_ctx_mgr_fini(struct amdgpu_ctx_mgr *mgr);
 struct amdgpu_fpriv {
 	struct amdgpu_vm	vm;
 	struct amdgpu_bo_va	*prt_va;
-	struct mutex		bo_list_lock;
+	struct rw_semaphore	bo_list_lock;
 	struct idr		bo_list_handles;
 	struct amdgpu_ctx_mgr	ctx_mgr;
 	u32			vram_lost_counter;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c
index 4363f28..bfe736e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c
@@ -45,10 +45,10 @@ static int amdgpu_bo_list_create(struct amdgpu_fpriv *fpriv,
 	if (!*result)
 		return -ENOMEM;
 
-	mutex_lock(&fpriv->bo_list_lock);
+	down_read(&fpriv->bo_list_lock);
 	r = idr_alloc(&fpriv->bo_list_handles, *result,
 		      1, 0, GFP_KERNEL);
-	mutex_unlock(&fpriv->bo_list_lock);
+	up_read(&fpriv->bo_list_lock);
 
 	if (r < 0) {
 		kfree(*result);
@@ -67,17 +67,17 @@ static void amdgpu_bo_list_destroy(struct amdgpu_fpriv *fpriv, int id)
 {
 	struct amdgpu_bo_list *list;
 
-	mutex_lock(&fpriv->bo_list_lock);
+	down_write(&fpriv->bo_list_lock);
 	list = idr_remove(&fpriv->bo_list_handles, id);
 	if (list) {
 		/* Another user may have a reference to this list still */
 		mutex_lock(&list->lock);
 		mutex_unlock(&list->lock);
-		mutex_unlock(&fpriv->bo_list_lock);
+		up_write(&fpriv->bo_list_lock);
 		amdgpu_bo_list_free(list);
 	}
 	else {
-		mutex_unlock(&fpriv->bo_list_lock);
+		up_write(&fpriv->bo_list_lock);
 	}
 }
 
@@ -173,11 +173,11 @@ amdgpu_bo_list_get(struct amdgpu_fpriv *fpriv, int id)
 {
 	struct amdgpu_bo_list *result;
 
-	mutex_lock(&fpriv->bo_list_lock);
+	down_read(&fpriv->bo_list_lock);
 	result = idr_find(&fpriv->bo_list_handles, id);
 	if (result)
 		mutex_lock(&result->lock);
-	mutex_unlock(&fpriv->bo_list_lock);
+	up_read(&fpriv->bo_list_lock);
 	return result;
 }
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
index f68ced6..c4cba81 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
@@ -841,7 +841,7 @@ int amdgpu_driver_open_kms(struct drm_device *dev, struct drm_file *file_priv)
 			goto out_suspend;
 	}
 
-	mutex_init(&fpriv->bo_list_lock);
+	init_rwsem(&fpriv->bo_list_lock);
 	idr_init(&fpriv->bo_list_handles);
 
 	amdgpu_ctx_mgr_init(&fpriv->ctx_mgr);
@@ -900,7 +900,6 @@ void amdgpu_driver_postclose_kms(struct drm_device *dev,
 		amdgpu_bo_list_free(list);
 
 	idr_destroy(&fpriv->bo_list_handles);
-	mutex_destroy(&fpriv->bo_list_lock);
 
 	kfree(fpriv);
 	file_priv->driver_priv = NULL;
-- 
2.7.4


* Re: [PATCH 2/4] drm/amdgpu: Optimization of critical section
From: Christian König @ 2017-06-12 22:47 UTC (permalink / raw)
  To: Alex Xie, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

On 12.06.2017 at 22:31, Alex Xie wrote:
> Make the critical section smaller. There is no
> need to protect the bo_list, because there is
> no other task can access the newly created BO
> list yet.

NAK, a task can guess the next id number and so could get the kernel to 
use the structure before it is initialized.

Christian.

>
> Signed-off-by: Alex Xie <AlexBin.Xie@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c | 4 ++--
>   1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c
> index 02c138f..c994a04 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c
> @@ -48,8 +48,9 @@ static int amdgpu_bo_list_create(struct amdgpu_fpriv *fpriv,
>   	mutex_lock(&fpriv->bo_list_lock);
>   	r = idr_alloc(&fpriv->bo_list_handles, *result,
>   		      1, 0, GFP_KERNEL);
> +	mutex_unlock(&fpriv->bo_list_lock);
> +
>   	if (r < 0) {
> -		mutex_unlock(&fpriv->bo_list_lock);
>   		kfree(*result);
>   		return r;
>   	}
> @@ -60,7 +61,6 @@ static int amdgpu_bo_list_create(struct amdgpu_fpriv *fpriv,
>   	(*result)->array = NULL;
>   
>   	mutex_lock(&(*result)->lock);
> -	mutex_unlock(&fpriv->bo_list_lock);
>   
>   	return 0;
>   }



* Re: [PATCH 3/4] drm/amdgpu: Optimization of mutex usage
From: Christian König @ 2017-06-12 22:48 UTC (permalink / raw)
  To: Alex Xie, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

On 12.06.2017 at 22:31, Alex Xie wrote:
> There is no need to lock the bo_list mutex,
> because there is no other task can access
> the newly created BO list yet.

Again NAK, this can race as well when a task guesses the next ID number or 
incorrectly uses an old one.

Christian.

>
> Signed-off-by: Alex Xie <AlexBin.Xie@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c | 3 ---
>   1 file changed, 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c
> index c994a04..4363f28 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c
> @@ -60,8 +60,6 @@ static int amdgpu_bo_list_create(struct amdgpu_fpriv *fpriv,
>   	(*result)->num_entries = 0;
>   	(*result)->array = NULL;
>   
> -	mutex_lock(&(*result)->lock);
> -
>   	return 0;
>   }
>   
> @@ -282,7 +280,6 @@ int amdgpu_bo_list_ioctl(struct drm_device *dev, void *data,
>   
>   		r = amdgpu_bo_list_set(adev, filp, list, info,
>   					      args->in.bo_number);
> -		amdgpu_bo_list_put(list);
>   		if (r)
>   			goto error_free;
>   



* Re: [PATCH 4/4] drm/amdgpu: Optimize mutex usage
From: Christian König @ 2017-06-12 22:54 UTC (permalink / raw)
  To: Alex Xie, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

On 12.06.2017 at 22:31, Alex Xie wrote:
> Use rw_semaphore instead of mutex for bo_lists.
>
> In original function amdgpu_bo_list_get, the waiting
> for result->lock can be quite long while mutex
> bo_list_lock was holding. It can make other tasks
> waiting for bo_list_lock for long period too.
> Change bo_list_lock to rw_semaphore can avoid most of
> such long waiting.
>
> Secondly, this patch allows several tasks(readers of idr)
> to proceed at the same time.
>
> Signed-off-by: Alex Xie <AlexBin.Xie@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu.h         |  3 ++-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c | 14 +++++++-------
>   drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c     |  3 +--
>   3 files changed, 10 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> index 063fc73..bfc40d7 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> @@ -35,6 +35,7 @@
>   #include <linux/rbtree.h>
>   #include <linux/hashtable.h>
>   #include <linux/dma-fence.h>
> +#include <linux/rwsem.h>
>   
>   #include <ttm/ttm_bo_api.h>
>   #include <ttm/ttm_bo_driver.h>
> @@ -859,7 +860,7 @@ void amdgpu_ctx_mgr_fini(struct amdgpu_ctx_mgr *mgr);
>   struct amdgpu_fpriv {
>   	struct amdgpu_vm	vm;
>   	struct amdgpu_bo_va	*prt_va;
> -	struct mutex		bo_list_lock;
> +	struct rw_semaphore	bo_list_lock;
>   	struct idr		bo_list_handles;
>   	struct amdgpu_ctx_mgr	ctx_mgr;
>   	u32			vram_lost_counter;
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c
> index 4363f28..bfe736e 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c
> @@ -45,10 +45,10 @@ static int amdgpu_bo_list_create(struct amdgpu_fpriv *fpriv,
>   	if (!*result)
>   		return -ENOMEM;
>   
> -	mutex_lock(&fpriv->bo_list_lock);
> +	down_read(&fpriv->bo_list_lock);
>   	r = idr_alloc(&fpriv->bo_list_handles, *result,
>   		      1, 0, GFP_KERNEL);
> -	mutex_unlock(&fpriv->bo_list_lock);
> +	up_read(&fpriv->bo_list_lock);

The protection here is incorrect, you need to take the write side here 
as well.

>   
>   	if (r < 0) {
>   		kfree(*result);
> @@ -67,17 +67,17 @@ static void amdgpu_bo_list_destroy(struct amdgpu_fpriv *fpriv, int id)
>   {
>   	struct amdgpu_bo_list *list;
>   
> -	mutex_lock(&fpriv->bo_list_lock);
> +	down_write(&fpriv->bo_list_lock);
>   	list = idr_remove(&fpriv->bo_list_handles, id);
>   	if (list) {
>   		/* Another user may have a reference to this list still */
>   		mutex_lock(&list->lock);
>   		mutex_unlock(&list->lock);
> -		mutex_unlock(&fpriv->bo_list_lock);
> +		up_write(&fpriv->bo_list_lock);
>   		amdgpu_bo_list_free(list);
>   	}
>   	else {
> -		mutex_unlock(&fpriv->bo_list_lock);
> +		up_write(&fpriv->bo_list_lock);
>   	}
>   }
>   
> @@ -173,11 +173,11 @@ amdgpu_bo_list_get(struct amdgpu_fpriv *fpriv, int id)
>   {
>   	struct amdgpu_bo_list *result;
>   
> -	mutex_lock(&fpriv->bo_list_lock);
> +	down_read(&fpriv->bo_list_lock);
>   	result = idr_find(&fpriv->bo_list_handles, id);
>   	if (result)
>   		mutex_lock(&result->lock);
> -	mutex_unlock(&fpriv->bo_list_lock);
> +	up_read(&fpriv->bo_list_lock);

Actually rw_semaphores are a bit less efficient than mutexes, and the 
contention on this lock is practically nonexistent since at least the open 
source stack makes all BO list operations from a single thread.

So I'm not sure if that is actually more efficient.

Christian.

>   	return result;
>   }
>   
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
> index f68ced6..c4cba81 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
> @@ -841,7 +841,7 @@ int amdgpu_driver_open_kms(struct drm_device *dev, struct drm_file *file_priv)
>   			goto out_suspend;
>   	}
>   
> -	mutex_init(&fpriv->bo_list_lock);
> +	init_rwsem(&fpriv->bo_list_lock);
>   	idr_init(&fpriv->bo_list_handles);
>   
>   	amdgpu_ctx_mgr_init(&fpriv->ctx_mgr);
> @@ -900,7 +900,6 @@ void amdgpu_driver_postclose_kms(struct drm_device *dev,
>   		amdgpu_bo_list_free(list);
>   
>   	idr_destroy(&fpriv->bo_list_handles);
> -	mutex_destroy(&fpriv->bo_list_lock);
>   
>   	kfree(fpriv);
>   	file_priv->driver_priv = NULL;



* Re: [PATCH 1/4] drm/amdgpu: Optimization of critical section
From: Christian König @ 2017-06-12 23:00 UTC (permalink / raw)
  To: Alex Xie, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

On 12.06.2017 at 22:31, Alex Xie wrote:
> This patch is to move a loop of unref BOs and
> several memory free function calls out of
> critical sections.
>
> Signed-off-by: Alex Xie <AlexBin.Xie@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c | 5 ++++-
>   1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c
> index a664987..02c138f 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c
> @@ -75,9 +75,12 @@ static void amdgpu_bo_list_destroy(struct amdgpu_fpriv *fpriv, int id)
>   		/* Another user may have a reference to this list still */
>   		mutex_lock(&list->lock);
>   		mutex_unlock(&list->lock);
> +		mutex_unlock(&fpriv->bo_list_lock);
>   		amdgpu_bo_list_free(list);
>   	}
> -	mutex_unlock(&fpriv->bo_list_lock);
> +	else {
> +		mutex_unlock(&fpriv->bo_list_lock);
> +	}

You could move the unlock of bo_list_lock even before the if.
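
For illustration, that first point alone would look roughly like this 
(sketch only, untested; it does not address the mutex lifetime concern 
below):

static void amdgpu_bo_list_destroy(struct amdgpu_fpriv *fpriv, int id)
{
	struct amdgpu_bo_list *list;

	mutex_lock(&fpriv->bo_list_lock);
	list = idr_remove(&fpriv->bo_list_handles, id);
	/* the entry is gone from the idr, so the idr lock is no longer needed */
	mutex_unlock(&fpriv->bo_list_lock);
	if (list) {
		/* Another user may have a reference to this list still */
		mutex_lock(&list->lock);
		mutex_unlock(&list->lock);
		amdgpu_bo_list_free(list);
	}
}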

But since you pointed it out there is quite a bug in this function:
>                 /* Another user may have a reference to this list still */
>                 mutex_lock(&list->lock);
>                 mutex_unlock(&list->lock);
Not sure if that is still up to date, but that use case used to be 
illegal with mutexes.

Christian.

>   }
>   
>   static int amdgpu_bo_list_set(struct amdgpu_device *adev,



* Re: [PATCH 1/4] drm/amdgpu: Optimization of critical section
From: Dave Airlie @ 2017-06-13  2:25 UTC (permalink / raw)
  To: Christian König; +Cc: amd-gfx mailing list, Alex Xie

On 13 June 2017 at 09:00, Christian König <deathsimple@vodafone.de> wrote:
> Am 12.06.2017 um 22:31 schrieb Alex Xie:
>>
>> This patch is to move a loop of unref BOs and
>> several memory free function calls out of
>> critical sections.
>>
>> Signed-off-by: Alex Xie <AlexBin.Xie@amd.com>
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c | 5 ++++-
>>   1 file changed, 4 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c
>> index a664987..02c138f 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c
>> @@ -75,9 +75,12 @@ static void amdgpu_bo_list_destroy(struct amdgpu_fpriv
>> *fpriv, int id)
>>                 /* Another user may have a reference to this list still */
>>                 mutex_lock(&list->lock);
>>                 mutex_unlock(&list->lock);
>> +               mutex_unlock(&fpriv->bo_list_lock);
>>                 amdgpu_bo_list_free(list);
>>         }
>> -       mutex_unlock(&fpriv->bo_list_lock);
>> +       else {
>> +               mutex_unlock(&fpriv->bo_list_lock);
>> +       }
>
>
> You could move the unlock of bo_list_lock even before the if.
>
> But since you pointed it out there is quite a bug in this function:
>>
>>                 /* Another user may have a reference to this list still */
>>                 mutex_lock(&list->lock);
>>                 mutex_unlock(&list->lock);
>
> Not sure if that is still up to date, but that use case used to be illegal
> with mutexes.

Oh wow, looks like this code has refcounting bugs, there should never
be a reason to lock/unlock to find another user.

Can anyone say krefs?

Dave.

* Re: [PATCH 4/4] drm/amdgpu: Optimize mutex usage
From: axie @ 2017-06-13  6:03 UTC (permalink / raw)
  To: Christian König, Alex Xie, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

Hi Christian,

Regarding comment 1.

Please read the file idr.c, lines 24 to 29.

Let me quote it here. Doesn't it surprise you? idr_alloc only needs a read lock.

...

  * Simultaneous modifications to the @idr are not allowed and should be
  * prevented by the user, usually with a lock.  idr_alloc() may be called
  * concurrently with read-only accesses to the @idr, such as idr_find() and
  * idr_for_each_entry().
  */
int idr_alloc(struct idr *idr, void *ptr, int start, int end, gfp_t gfp)

...

Regarding comment 2.

I have checked the relevant information on rw_semaphore. A read lock is as 
efficient as a mutex.

A write lock is less efficient than a mutex if "optimistic spinning on lock 
acquisition" is not configured.

But we only take the write lock when destroying a bo list. Most of the time, 
as you said, there is no contention.

So the write lock does not need to wait, and its lower efficiency is not a 
factor compared with a mutex.


To decide which lock is better, choosing the correct locking mechanism 
matters more than the efficiency of the lock itself. With the current 
bo_list, multiple readers can block each other.

And what is even worse, the get function can wait on something else while 
holding the mutex.


If, as you said, the user mode driver is single-threaded, then we don't 
need any mutex here at all...


-Alex Bin Xie



On 2017-06-12 06:54 PM, Christian König wrote:
> Am 12.06.2017 um 22:31 schrieb Alex Xie:
>> Use rw_semaphore instead of mutex for bo_lists.
>>
>> In original function amdgpu_bo_list_get, the waiting
>> for result->lock can be quite long while mutex
>> bo_list_lock was holding. It can make other tasks
>> waiting for bo_list_lock for long period too.
>> Change bo_list_lock to rw_semaphore can avoid most of
>> such long waiting.
>>
>> Secondly, this patch allows several tasks(readers of idr)
>> to proceed at the same time.
>>
>> Signed-off-by: Alex Xie <AlexBin.Xie@amd.com>
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu.h         |  3 ++-
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c | 14 +++++++-------
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c     |  3 +--
>>   3 files changed, 10 insertions(+), 10 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>> index 063fc73..bfc40d7 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>> @@ -35,6 +35,7 @@
>>   #include <linux/rbtree.h>
>>   #include <linux/hashtable.h>
>>   #include <linux/dma-fence.h>
>> +#include <linux/rwsem.h>
>>     #include <ttm/ttm_bo_api.h>
>>   #include <ttm/ttm_bo_driver.h>
>> @@ -859,7 +860,7 @@ void amdgpu_ctx_mgr_fini(struct amdgpu_ctx_mgr 
>> *mgr);
>>   struct amdgpu_fpriv {
>>       struct amdgpu_vm    vm;
>>       struct amdgpu_bo_va    *prt_va;
>> -    struct mutex        bo_list_lock;
>> +    struct rw_semaphore    bo_list_lock;
>>       struct idr        bo_list_handles;
>>       struct amdgpu_ctx_mgr    ctx_mgr;
>>       u32            vram_lost_counter;
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c
>> index 4363f28..bfe736e 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c
>> @@ -45,10 +45,10 @@ static int amdgpu_bo_list_create(struct 
>> amdgpu_fpriv *fpriv,
>>       if (!*result)
>>           return -ENOMEM;
>>   -    mutex_lock(&fpriv->bo_list_lock);
>> +    down_read(&fpriv->bo_list_lock);
>>       r = idr_alloc(&fpriv->bo_list_handles, *result,
>>                 1, 0, GFP_KERNEL);
>> -    mutex_unlock(&fpriv->bo_list_lock);
>> +    up_read(&fpriv->bo_list_lock);
>
> The protection here is incorrect, you need to take the write side here 
> as well.
>
>>         if (r < 0) {
>>           kfree(*result);
>> @@ -67,17 +67,17 @@ static void amdgpu_bo_list_destroy(struct 
>> amdgpu_fpriv *fpriv, int id)
>>   {
>>       struct amdgpu_bo_list *list;
>>   -    mutex_lock(&fpriv->bo_list_lock);
>> +    down_write(&fpriv->bo_list_lock);
>>       list = idr_remove(&fpriv->bo_list_handles, id);
>>       if (list) {
>>           /* Another user may have a reference to this list still */
>>           mutex_lock(&list->lock);
>>           mutex_unlock(&list->lock);
>> -        mutex_unlock(&fpriv->bo_list_lock);
>> +        up_write(&fpriv->bo_list_lock);
>>           amdgpu_bo_list_free(list);
>>       }
>>       else {
>> -        mutex_unlock(&fpriv->bo_list_lock);
>> +        up_write(&fpriv->bo_list_lock);
>>       }
>>   }
>>   @@ -173,11 +173,11 @@ amdgpu_bo_list_get(struct amdgpu_fpriv 
>> *fpriv, int id)
>>   {
>>       struct amdgpu_bo_list *result;
>>   -    mutex_lock(&fpriv->bo_list_lock);
>> +    down_read(&fpriv->bo_list_lock);
>>       result = idr_find(&fpriv->bo_list_handles, id);
>>       if (result)
>>           mutex_lock(&result->lock);
>> -    mutex_unlock(&fpriv->bo_list_lock);
>> +    up_read(&fpriv->bo_list_lock);
>
> Actually rw_semaphores are a bit less efficient than mutexes and the 
> contention on this is practically not existent since at least the open 
> source stack makes all BO list operations from a single thread.
>
> So I'm not sure if that is actually more efficient.
>
> Christian.
>
>>       return result;
>>   }
>>   diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
>> index f68ced6..c4cba81 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
>> @@ -841,7 +841,7 @@ int amdgpu_driver_open_kms(struct drm_device 
>> *dev, struct drm_file *file_priv)
>>               goto out_suspend;
>>       }
>>   -    mutex_init(&fpriv->bo_list_lock);
>> +    init_rwsem(&fpriv->bo_list_lock);
>>       idr_init(&fpriv->bo_list_handles);
>>         amdgpu_ctx_mgr_init(&fpriv->ctx_mgr);
>> @@ -900,7 +900,6 @@ void amdgpu_driver_postclose_kms(struct 
>> drm_device *dev,
>>           amdgpu_bo_list_free(list);
>>         idr_destroy(&fpriv->bo_list_handles);
>> -    mutex_destroy(&fpriv->bo_list_lock);
>>         kfree(fpriv);
>>       file_priv->driver_priv = NULL;
>
>


* Re: [PATCH 1/4] drm/amdgpu: Optimization of critical section
From: axie @ 2017-06-13  6:29 UTC (permalink / raw)
  To: Christian König, Alex Xie, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

On 2017-06-12 07:00 PM, Christian König wrote:
> Am 12.06.2017 um 22:31 schrieb Alex Xie:
>> This patch is to move a loop of unref BOs and
>> several memory free function calls out of
>> critical sections.
>>
>> Signed-off-by: Alex Xie <AlexBin.Xie@amd.com>
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c | 5 ++++-
>>   1 file changed, 4 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c
>> index a664987..02c138f 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c
>> @@ -75,9 +75,12 @@ static void amdgpu_bo_list_destroy(struct 
>> amdgpu_fpriv *fpriv, int id)
>>           /* Another user may have a reference to this list still */
>>           mutex_lock(&list->lock);
>>           mutex_unlock(&list->lock);
>> +        mutex_unlock(&fpriv->bo_list_lock);
>>           amdgpu_bo_list_free(list);
>>       }
>> -    mutex_unlock(&fpriv->bo_list_lock);
>> +    else {
>> +        mutex_unlock(&fpriv->bo_list_lock);
>> +    }
>
> You could move the unlock of bo_list_lock even before the if.
>
> But since you pointed it out there is quite a bug in this function:
>>                 /* Another user may have a reference to this list 
>> still */
>>                 mutex_lock(&list->lock);
>>                 mutex_unlock(&list->lock);
> Not sure if that is still up to date, but that use case used to be 
> illegal with mutexes.
As I understand this piece of code, this mutex_lock/mutex_unlock pair is 
used to make sure all other tasks have finished accessing the bo list. 
The other side of this story is in function amdgpu_bo_list_get. These two 
pieces of code together make sure we can safely destroy the bo list.

Otherwise we could easily simplify this locking.
Let me give an example here.

In function amdgpu_bo_list_get, assuming we change the code like this:
...
     down_read(&fpriv->bo_list_lock);
     result = idr_find(&fpriv->bo_list_handles, id);
     up_read(&fpriv->bo_list_lock);
     /**Line 1. Task A was scheduled away from CPU**/
     if (result)
         mutex_lock(&result->lock);
...

In function amdgpu_bo_list_destroy, assuming we change the code like this:
...
     down_write(&fpriv->bo_list_lock);
     list = idr_remove(&fpriv->bo_list_handles, id);
     up_write(&fpriv->bo_list_lock);
     if (list) {
         /* Another user may have a reference to this list still */
         mutex_lock(&list->lock);
         mutex_unlock(&list->lock);
         amdgpu_bo_list_free(list);
     }
}

When task A is running in function amdgpu_bo_list_get at line 1, the CPU 
scheduler takes the CPU away from task A. Then task B runs function 
amdgpu_bo_list_destroy. Task B can run all the way through destroying and 
freeing the mutex. Later, task A runs again. The mutex result->lock has 
already been destroyed by task B, so task A now tries to lock a mutex 
that has been destroyed and freed.


>
> Christian.
>
>>   }
>>     static int amdgpu_bo_list_set(struct amdgpu_device *adev,
>
>


* Re: [PATCH 2/4] drm/amdgpu: Optimization of critical section
From: axie @ 2017-06-13  6:54 UTC (permalink / raw)
  To: Christian König, Alex Xie, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW


On 2017-06-12 06:47 PM, Christian König wrote:
> Am 12.06.2017 um 22:31 schrieb Alex Xie:
>> Make the critical section smaller. There is no
>> need to protect the bo_list, because there is
>> no other task can access the newly created BO
>> list yet.
>
> NAK, a task can guess the next id number so could get the kernel to 
> use the structure before it is initialized.
>
> Christian.
>
How did you find such an extreme corner case? I am fine with this comment. 
Tuesday/Wednesday I will address it with the next version of the patch set.

Currently, there are 2 options:
Option 1: I may use a write lock in the create function and restore the 
original code for the creation of the BO list (a rough sketch follows below).
Option 2: I may move the idr_alloc call to the end of the BO list creation 
ioctl. This is more efficient, but the code looks dirty.
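
For option 1, the create path would roughly keep the original shape and 
just take the write side of the semaphore (sketch only, untested; the 
handle bookkeeping between the error check and the init is elided):

	down_write(&fpriv->bo_list_lock);
	r = idr_alloc(&fpriv->bo_list_handles, *result,
		      1, 0, GFP_KERNEL);
	if (r < 0) {
		up_write(&fpriv->bo_list_lock);
		kfree(*result);
		return r;
	}
	/* ... store the new handle r for the caller, as in the original ... */

	mutex_init(&(*result)->lock);
	(*result)->num_entries = 0;
	(*result)->array = NULL;

	mutex_lock(&(*result)->lock);
	up_write(&fpriv->bo_list_lock);	/* lookups were blocked until init finished */

	return 0;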


>>
>> Signed-off-by: Alex Xie <AlexBin.Xie@amd.com>
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c | 4 ++--
>>   1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c
>> index 02c138f..c994a04 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c
>> @@ -48,8 +48,9 @@ static int amdgpu_bo_list_create(struct 
>> amdgpu_fpriv *fpriv,
>>       mutex_lock(&fpriv->bo_list_lock);
>>       r = idr_alloc(&fpriv->bo_list_handles, *result,
>>                 1, 0, GFP_KERNEL);
>> +    mutex_unlock(&fpriv->bo_list_lock);
>> +
>>       if (r < 0) {
>> -        mutex_unlock(&fpriv->bo_list_lock);
>>           kfree(*result);
>>           return r;
>>       }
>> @@ -60,7 +61,6 @@ static int amdgpu_bo_list_create(struct 
>> amdgpu_fpriv *fpriv,
>>       (*result)->array = NULL;
>>         mutex_lock(&(*result)->lock);
>> -    mutex_unlock(&fpriv->bo_list_lock);
>>         return 0;
>>   }
>
>


* Re: [PATCH 1/4] drm/amdgpu: Optimization of critical section
From: Christian König @ 2017-06-13  7:57 UTC (permalink / raw)
  To: Dave Airlie; +Cc: amd-gfx mailing list, Alex Xie

On 13.06.2017 at 04:25, Dave Airlie wrote:
> On 13 June 2017 at 09:00, Christian König <deathsimple@vodafone.de> wrote:
>> Am 12.06.2017 um 22:31 schrieb Alex Xie:
>>> This patch is to move a loop of unref BOs and
>>> several memory free function calls out of
>>> critical sections.
>>>
>>> Signed-off-by: Alex Xie <AlexBin.Xie@amd.com>
>>> ---
>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c | 5 ++++-
>>>    1 file changed, 4 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c
>>> index a664987..02c138f 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c
>>> @@ -75,9 +75,12 @@ static void amdgpu_bo_list_destroy(struct amdgpu_fpriv
>>> *fpriv, int id)
>>>                  /* Another user may have a reference to this list still */
>>>                  mutex_lock(&list->lock);
>>>                  mutex_unlock(&list->lock);
>>> +               mutex_unlock(&fpriv->bo_list_lock);
>>>                  amdgpu_bo_list_free(list);
>>>          }
>>> -       mutex_unlock(&fpriv->bo_list_lock);
>>> +       else {
>>> +               mutex_unlock(&fpriv->bo_list_lock);
>>> +       }
>>
>> You could move the unlock of bo_list_lock even before the if.
>>
>> But since you pointed it out there is quite a bug in this function:
>>>                  /* Another user may have a reference to this list still */
>>>                  mutex_lock(&list->lock);
>>>                  mutex_unlock(&list->lock);
>> Not sure if that is still up to date, but that use case used to be illegal
>> with mutexes.
> Oh wow, looks like this code has refcounting bugs, there should never
> be a reason to lock/unlock to find another user.

Actually that is not the issue.

We remove the bo_list object from the idr first, which only works when 
nobody else is looking it up at the same time.

Taking and releasing the lock then should make sure that nobody is 
actually using the object any more in another thread.

> Can anyone say krefs?

The problem is rather that unlocking a mutex directly before it is freed 
used to be illegal, because mutex_unlock() can still touch the mutex 
structure after the lock appears released.

But the best solution would be to do the whole thing lockless with a 
kref and/or atomic updates.
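
As a rough sketch of that direction (hypothetical; assumes struct 
amdgpu_bo_list gains a "struct kref refcount" member), the destroy path 
would stop waiting on list->lock and simply drop its reference:

static void amdgpu_bo_list_release(struct kref *ref)
{
	struct amdgpu_bo_list *list =
		container_of(ref, struct amdgpu_bo_list, refcount);

	amdgpu_bo_list_free(list);
}

static void amdgpu_bo_list_destroy(struct amdgpu_fpriv *fpriv, int id)
{
	struct amdgpu_bo_list *list;

	mutex_lock(&fpriv->bo_list_lock);
	list = idr_remove(&fpriv->bo_list_handles, id);
	mutex_unlock(&fpriv->bo_list_lock);
	if (list)
		kref_put(&list->refcount, amdgpu_bo_list_release);
}

Every lookup would then take its own reference with kref_get() (or 
kref_get_unless_zero()) instead of holding list->lock for the whole time 
the list is in use.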

Regards,
Christian.

>
> Dave.



* Re: [PATCH 4/4] drm/amdgpu: Optimize mutex usage
From: Christian König @ 2017-06-13 10:07 UTC (permalink / raw)
  To: axie, Alex Xie, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

On 13.06.2017 at 08:03, axie wrote:
> Hi Christian,
>
> Regarding comment 1.
>
> Please read the file idr.c line 24 to line 29.
>
> Let me quote here. Isn't it surprise you? idr_alloc only need read lock.
>
> ...
>
>  * Simultaneous modifications to the @idr are not allowed and should be
>  * prevented by the user, usually with a lock.  idr_alloc() may be called
>  * concurrently with read-only accesses to the @idr, such as 
> idr_find() and
>  * idr_for_each_entry().
>  */
> int idr_alloc(struct idr *idr, void *ptr, int start, int end, gfp_t gfp)

As far as I can see you misunderstood the comment.

idr_alloc() can be called concurrently with idr_find() and 
idr_for_each_entry(), so you don't need a lock to cover those when you 
manage the lifetime of the mapped pointers with RCU, for example.

But you still need to prevent idr_alloc() from being called concurrently 
from multiple threads at the same time, so a read side lock isn't 
sufficient here.
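
In other words, for the allocation path the minimum correct protection 
would be something like (illustrative only):

	down_write(&fpriv->bo_list_lock);	/* allocators must exclude each other */
	r = idr_alloc(&fpriv->bo_list_handles, *result,
		      1, 0, GFP_KERNEL);
	up_write(&fpriv->bo_list_lock);

while idr_find() callers may keep using down_read(), or RCU if the 
lifetime of the mapped pointers is handled separately.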

>
> ...
>
> Regarding comment 2.
>
> I have checked relevant information of rw_semaphore. Read lock is as 
> efficient as mutex.
>
> Write lock is less efficient than mutex if "optimistic spinning on 
> lock acquisition" is not configured.
>
> But we only use write lock when destroy bo list. Most of time, as you 
> said, contention does not exist.
>
> So the write lock does not need to wait. So the less efficiency is not 
> a factor compared with mutex.
>
>
> To decide which lock is better, it is more important to choose correct 
> lock mechanism than
>
> the efficiency of the lock itself. For the current bo_list, the 
> multiple readers can block each other.
>
> And what is even worse, the get function can wait for other thing when 
> holding the mutex.

Actually it would be more efficient to not use a lock at all and 
properly reference count the objects instead.

Updates can then be made using RCU.

> If you said that the user mode driver was single thread, we don't need 
> any mutex here then...

No, even when userspace doesn't call functions from multiple threads, 
IOCTLs still need to be thread safe to the extent that they don't crash 
when you do so anyway.

Regards,
Christian.

>
>
> -Alex Bin Xie
>
>
>
> On 2017-06-12 06:54 PM, Christian König wrote:
>> Am 12.06.2017 um 22:31 schrieb Alex Xie:
>>> Use rw_semaphore instead of mutex for bo_lists.
>>>
>>> In original function amdgpu_bo_list_get, the waiting
>>> for result->lock can be quite long while mutex
>>> bo_list_lock was holding. It can make other tasks
>>> waiting for bo_list_lock for long period too.
>>> Change bo_list_lock to rw_semaphore can avoid most of
>>> such long waiting.
>>>
>>> Secondly, this patch allows several tasks(readers of idr)
>>> to proceed at the same time.
>>>
>>> Signed-off-by: Alex Xie <AlexBin.Xie@amd.com>
>>> ---
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu.h         |  3 ++-
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c | 14 +++++++-------
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c     |  3 +--
>>>   3 files changed, 10 insertions(+), 10 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>> index 063fc73..bfc40d7 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>> @@ -35,6 +35,7 @@
>>>   #include <linux/rbtree.h>
>>>   #include <linux/hashtable.h>
>>>   #include <linux/dma-fence.h>
>>> +#include <linux/rwsem.h>
>>>     #include <ttm/ttm_bo_api.h>
>>>   #include <ttm/ttm_bo_driver.h>
>>> @@ -859,7 +860,7 @@ void amdgpu_ctx_mgr_fini(struct amdgpu_ctx_mgr 
>>> *mgr);
>>>   struct amdgpu_fpriv {
>>>       struct amdgpu_vm    vm;
>>>       struct amdgpu_bo_va    *prt_va;
>>> -    struct mutex        bo_list_lock;
>>> +    struct rw_semaphore    bo_list_lock;
>>>       struct idr        bo_list_handles;
>>>       struct amdgpu_ctx_mgr    ctx_mgr;
>>>       u32            vram_lost_counter;
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c
>>> index 4363f28..bfe736e 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c
>>> @@ -45,10 +45,10 @@ static int amdgpu_bo_list_create(struct 
>>> amdgpu_fpriv *fpriv,
>>>       if (!*result)
>>>           return -ENOMEM;
>>>   -    mutex_lock(&fpriv->bo_list_lock);
>>> +    down_read(&fpriv->bo_list_lock);
>>>       r = idr_alloc(&fpriv->bo_list_handles, *result,
>>>                 1, 0, GFP_KERNEL);
>>> -    mutex_unlock(&fpriv->bo_list_lock);
>>> +    up_read(&fpriv->bo_list_lock);
>>
>> The protection here is incorrect, you need to take the write side 
>> here as well.
>>
>>>         if (r < 0) {
>>>           kfree(*result);
>>> @@ -67,17 +67,17 @@ static void amdgpu_bo_list_destroy(struct 
>>> amdgpu_fpriv *fpriv, int id)
>>>   {
>>>       struct amdgpu_bo_list *list;
>>>   -    mutex_lock(&fpriv->bo_list_lock);
>>> +    down_write(&fpriv->bo_list_lock);
>>>       list = idr_remove(&fpriv->bo_list_handles, id);
>>>       if (list) {
>>>           /* Another user may have a reference to this list still */
>>>           mutex_lock(&list->lock);
>>>           mutex_unlock(&list->lock);
>>> -        mutex_unlock(&fpriv->bo_list_lock);
>>> +        up_write(&fpriv->bo_list_lock);
>>>           amdgpu_bo_list_free(list);
>>>       }
>>>       else {
>>> -        mutex_unlock(&fpriv->bo_list_lock);
>>> +        up_write(&fpriv->bo_list_lock);
>>>       }
>>>   }
>>>   @@ -173,11 +173,11 @@ amdgpu_bo_list_get(struct amdgpu_fpriv 
>>> *fpriv, int id)
>>>   {
>>>       struct amdgpu_bo_list *result;
>>>   -    mutex_lock(&fpriv->bo_list_lock);
>>> +    down_read(&fpriv->bo_list_lock);
>>>       result = idr_find(&fpriv->bo_list_handles, id);
>>>       if (result)
>>>           mutex_lock(&result->lock);
>>> -    mutex_unlock(&fpriv->bo_list_lock);
>>> +    up_read(&fpriv->bo_list_lock);
>>
>> Actually rw_semaphores are a bit less efficient than mutexes and the 
>> contention on this is practically not existent since at least the 
>> open source stack makes all BO list operations from a single thread.
>>
>> So I'm not sure if that is actually more efficient.
>>
>> Christian.
>>
>>>       return result;
>>>   }
>>>   diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
>>> index f68ced6..c4cba81 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
>>> @@ -841,7 +841,7 @@ int amdgpu_driver_open_kms(struct drm_device 
>>> *dev, struct drm_file *file_priv)
>>>               goto out_suspend;
>>>       }
>>>   -    mutex_init(&fpriv->bo_list_lock);
>>> +    init_rwsem(&fpriv->bo_list_lock);
>>>       idr_init(&fpriv->bo_list_handles);
>>>         amdgpu_ctx_mgr_init(&fpriv->ctx_mgr);
>>> @@ -900,7 +900,6 @@ void amdgpu_driver_postclose_kms(struct 
>>> drm_device *dev,
>>>           amdgpu_bo_list_free(list);
>>>         idr_destroy(&fpriv->bo_list_handles);
>>> -    mutex_destroy(&fpriv->bo_list_lock);
>>>         kfree(fpriv);
>>>       file_priv->driver_priv = NULL;
>>
>>
>



* Re: [PATCH 2/4] drm/amdgpu: Optimization of critical section
From: Christian König @ 2017-06-13 10:14 UTC (permalink / raw)
  To: axie, Alex Xie, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

On 13.06.2017 at 08:54, axie wrote:
>
> On 2017-06-12 06:47 PM, Christian König wrote:
>> Am 12.06.2017 um 22:31 schrieb Alex Xie:
>>> Make the critical section smaller. There is no
>>> need to protect the bo_list, because there is
>>> no other task can access the newly created BO
>>> list yet.
>>
>> NAK, a task can guess the next id number so could get the kernel to 
>> use the structure before it is initialized.
>>
>> Christian.
>>
> How did you find such an extreme corner case?

Twenty-two years of experience working with that stuff; I have seen all of 
that before.

> I am fine with this comment. Tuesday/Wednesday I will address it with 
> next version of patch set.
>
> Currently, there are 2 options:
> Option 1: I may use a write_lock in the create function. And restore 
> the original code for the creation of BO list.
> Option 2: I may move the function call of idr_alloc to the end of the 
> creation of BO list ioctl. This is more efficient but
> the code look dirty.

Option #2 doesn't sound so dirty to me. Properly initializing the bo_list 
and then adding it to the idr actually sounds rather sane to me.

The only tricky part is that the current helper function doesn't really 
match that use case.
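
A rough sketch of how the create path of amdgpu_bo_list_ioctl() could be 
reordered for option #2 (illustrative only; it assumes 
amdgpu_bo_list_create() is reduced to allocation and initialization, and 
"handle" is a placeholder for however the new id is returned to userspace):

		r = amdgpu_bo_list_create(fpriv, &list);
		if (r)
			goto error_free;

		r = amdgpu_bo_list_set(adev, filp, list, info,
				       args->in.bo_number);
		if (r)
			goto error_free;

		/* publish the fully initialized list only now */
		mutex_lock(&fpriv->bo_list_lock);
		handle = idr_alloc(&fpriv->bo_list_handles, list,
				   1, 0, GFP_KERNEL);
		mutex_unlock(&fpriv->bo_list_lock);
		if (handle < 0) {
			r = handle;
			goto error_free;
		}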

BTW: Something I've just noticed while working on this:
> *result = kzalloc(sizeof(struct amdgpu_bo_list), GFP_KERNEL);
...
>         mutex_init(&(*result)->lock);
>         (*result)->num_entries = 0;
>         (*result)->array = NULL;

We can probably save quite some CPU cycles if we stop zero-initializing 
the structure twice.

Regards,
Christian.

>
>
>>>
>>> Signed-off-by: Alex Xie <AlexBin.Xie@amd.com>
>>> ---
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c | 4 ++--
>>>   1 file changed, 2 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c
>>> index 02c138f..c994a04 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c
>>> @@ -48,8 +48,9 @@ static int amdgpu_bo_list_create(struct 
>>> amdgpu_fpriv *fpriv,
>>>       mutex_lock(&fpriv->bo_list_lock);
>>>       r = idr_alloc(&fpriv->bo_list_handles, *result,
>>>                 1, 0, GFP_KERNEL);
>>> +    mutex_unlock(&fpriv->bo_list_lock);
>>> +
>>>       if (r < 0) {
>>> -        mutex_unlock(&fpriv->bo_list_lock);
>>>           kfree(*result);
>>>           return r;
>>>       }
>>> @@ -60,7 +61,6 @@ static int amdgpu_bo_list_create(struct 
>>> amdgpu_fpriv *fpriv,
>>>       (*result)->array = NULL;
>>>         mutex_lock(&(*result)->lock);
>>> -    mutex_unlock(&fpriv->bo_list_lock);
>>>         return 0;
>>>   }
>>
>>
>


* Re: [PATCH 1/4] drm/amdgpu: Optimization of critical section
From: Christian König @ 2017-06-13 10:23 UTC (permalink / raw)
  To: axie, Alex Xie, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

On 13.06.2017 at 08:29, axie wrote:
> On 2017-06-12 07:00 PM, Christian König wrote:
>> Am 12.06.2017 um 22:31 schrieb Alex Xie:
>>> This patch is to move a loop of unref BOs and
>>> several memory free function calls out of
>>> critical sections.
>>>
>>> Signed-off-by: Alex Xie <AlexBin.Xie@amd.com>
>>> ---
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c | 5 ++++-
>>>   1 file changed, 4 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c
>>> index a664987..02c138f 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c
>>> @@ -75,9 +75,12 @@ static void amdgpu_bo_list_destroy(struct 
>>> amdgpu_fpriv *fpriv, int id)
>>>           /* Another user may have a reference to this list still */
>>>           mutex_lock(&list->lock);
>>>           mutex_unlock(&list->lock);
>>> +        mutex_unlock(&fpriv->bo_list_lock);
>>>           amdgpu_bo_list_free(list);
>>>       }
>>> -    mutex_unlock(&fpriv->bo_list_lock);
>>> +    else {
>>> +        mutex_unlock(&fpriv->bo_list_lock);
>>> +    }
>>
>> You could move the unlock of bo_list_lock even before the if.
>>
>> But since you pointed it out there is quite a bug in this function:
>>>                 /* Another user may have a reference to this list 
>>> still */
>>>                 mutex_lock(&list->lock);
>>>                 mutex_unlock(&list->lock);
>> Not sure if that is still up to date, but that use case used to be 
>> illegal with mutexes.
> As I understand this piece of code, these mutex_lock and mutex_unlock 
> pair are used to make sure all other tasks have finished
> access of the bo list. Another side of this story is in function 
> amdgpu_bo_list_get. These two piece of codes together make sure
> we can safely destroy bo list.

Yeah, the idea behind the code is correct. But using mutexes in that way 
is illegal, see here https://lwn.net/Articles/575460/.

I'm not sure if that is still up to date, but in ancient times you 
needed to avoid patterns like this:
mutex_unlock(&obj->lock);
kfree(obj);

Anyway, I suggest we just replace the whole bo_list handling with an RCU- 
and refcount-based implementation. That should avoid locking entirely on 
the read-only code path.
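
A rough sketch of what the lookup side could look like with that approach 
(hypothetical; assumes struct amdgpu_bo_list gains "struct kref refcount" 
and "struct rcu_head rhead" members and that the free path uses 
kfree_rcu()):

struct amdgpu_bo_list *
amdgpu_bo_list_get(struct amdgpu_fpriv *fpriv, int id)
{
	struct amdgpu_bo_list *result;

	rcu_read_lock();
	result = idr_find(&fpriv->bo_list_handles, id);
	/* skip lists whose last reference is already gone */
	if (result && !kref_get_unless_zero(&result->refcount))
		result = NULL;
	rcu_read_unlock();

	return result;
}

The caller would then drop its reference with kref_put() when it is done, 
instead of unlocking result->lock.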

Regards,
Christian.

>
> Otherwise we can easily simplify these lockings.
> Let me give an example here.
>
> In function amdgpu_bo_list_get, assuming we change the code like this:
> ...
>     down_read(&fpriv->bo_list_lock);
>     result = idr_find(&fpriv->bo_list_handles, id);
>     up_read(&fpriv->bo_list_lock);
>     /**Line 1. Task A was scheduled away from CPU**/
>     if (result)
>         mutex_lock(&result->lock);
> ...
>
> In function amdgpu_bo_list_destroy, assuming we change the code like 
> this:
> ...
>     down_write(&fpriv->bo_list_lock);
>     list = idr_remove(&fpriv->bo_list_handles, id);
>     up_write(&fpriv->bo_list_lock);
>     if (list) {
>         /* Another user may have a reference to this list still */
>         mutex_lock(&list->lock);
>         mutex_unlock(&list->lock);
>         amdgpu_bo_list_free(list);
>     }
> }
>
> When task A is running in function amdgpu_bo_list_get in line 1, CPU 
> scheduler takes CPU away from task A.
> Then Task B run function amdgpu_bo_list_destroy. Task B can run all 
> the way to destroy and free mutex.
> Later Task A is back to run. The mutex result->lock has been destroyed 
> by task B. Now task A try to lock a mutex
> which has been destroyed and freed.
>
>
>>
>> Christian.
>>
>>>   }
>>>     static int amdgpu_bo_list_set(struct amdgpu_device *adev,
>>
>>
>



* Re: [PATCH 1/4] drm/amdgpu: Optimization of critical section
From: axie @ 2017-06-13 17:04 UTC (permalink / raw)
  To: Christian König, Alex Xie, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

On 2017-06-13 06:23 AM, Christian König wrote:
> Am 13.06.2017 um 08:29 schrieb axie:
>> On 2017-06-12 07:00 PM, Christian König wrote:
>>> Am 12.06.2017 um 22:31 schrieb Alex Xie:
>>>> This patch is to move a loop of unref BOs and
>>>> several memory free function calls out of
>>>> critical sections.
>>>>
>>>> Signed-off-by: Alex Xie <AlexBin.Xie@amd.com>
>>>> ---
>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c | 5 ++++-
>>>>   1 file changed, 4 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c 
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c
>>>> index a664987..02c138f 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c
>>>> @@ -75,9 +75,12 @@ static void amdgpu_bo_list_destroy(struct 
>>>> amdgpu_fpriv *fpriv, int id)
>>>>           /* Another user may have a reference to this list still */
>>>>           mutex_lock(&list->lock);
>>>>           mutex_unlock(&list->lock);
>>>> +        mutex_unlock(&fpriv->bo_list_lock);
>>>>           amdgpu_bo_list_free(list);
>>>>       }
>>>> -    mutex_unlock(&fpriv->bo_list_lock);
>>>> +    else {
>>>> +        mutex_unlock(&fpriv->bo_list_lock);
>>>> +    }
>>>
>>> You could move the unlock of bo_list_lock even before the if.
>>>
>>> But since you pointed it out there is quite a bug in this function:
>>>>                 /* Another user may have a reference to this list 
>>>> still */
>>>>                 mutex_lock(&list->lock);
>>>>                 mutex_unlock(&list->lock);
>>> Not sure if that is still up to date, but that use case used to be 
>>> illegal with mutexes.
>> As I understand this piece of code, the mutex_lock/mutex_unlock pair
>> is used to make sure all other tasks have finished accessing the bo
>> list. The other side of this story is in amdgpu_bo_list_get. These two
>> pieces of code together make sure we can safely destroy the bo list.
>
> Yeah, the idea behind the code is correct. But using mutexes in that 
> way is illegal, see here https://lwn.net/Articles/575460/.
>
> I'm not sure if that is still up to date, but in ancient times you 
> needed to avoid patterns like this:
> mutex_unlock(&obj->lock);
> kfree(obj);
>
> Anyway I suggest we just replace the whole bo_list handling with an
> RCU and refcount-based implementation. That should avoid locking
> entirely for the read-only code path.
>
> Regards,
> Christian.
>

The article may still be generally applicable, but it does not apply to
amdgpu_bo_list.c, which takes two extra steps to avoid a similar issue.

1. In function amdgpu_bo_list_destroy:
...
    down_write(&fpriv->bo_list_lock);  /* bo_list_lock is a mutex (or
                                        * write lock) here */
    list = idr_remove(&fpriv->bo_list_handles, id);
...
With the above statements, no new task (user) can access the bo list.

2. Given the above, once amdgpu_bo_list_destroy holds bo_list_lock, no
task can be waiting for result->lock.
Why? In amdgpu_bo_list_get, a task must hold the bo_list_lock mutex (or
read lock) before it can wait on result->lock:
amdgpu_bo_list_get(struct amdgpu_fpriv *fpriv, int id)
{
...
    down_read(&fpriv->bo_list_lock);
...
        mutex_lock(&result->lock);
...
    up_read(&fpriv->bo_list_lock);
}
So while amdgpu_bo_list_destroy is holding bo_list_lock, no other task is
waiting for result->lock.

3. From the above two points, while the destroy function is holding
bo_list_lock, at most one task can be holding list->lock. There is no
other scenario.

The following piece of code waits for any such existing task (user) to
finish accessing the bo list:
static void amdgpu_bo_list_destroy(struct amdgpu_fpriv *fpriv, int id)
{
...
    down_write(&fpriv->bo_list_lock);
...
        /* Another user may have a reference to this list still */
        mutex_lock(&list->lock);    /* Wait for the existing task/user to
                                     * unlock; after this wait we can
                                     * safely destroy and free the
                                     * mutex. [Alex Bin] */
        mutex_unlock(&list->lock);

        up_write(&fpriv->bo_list_lock);
        amdgpu_bo_list_free(list);
...
NOTE: While we are holding bo_list_lock, the only remaining possible use
of list->lock is the unlock in amdgpu_bo_list_put.
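
[Putting the fragments above together, a consolidated sketch of the two
paths, assuming bo_list_lock were converted to an rw_semaphore as
proposed; illustrative only, not the actual patch:]

struct amdgpu_bo_list *amdgpu_bo_list_get(struct amdgpu_fpriv *fpriv, int id)
{
	struct amdgpu_bo_list *result;

	down_read(&fpriv->bo_list_lock);
	result = idr_find(&fpriv->bo_list_handles, id);
	if (result)
		/* Taken while bo_list_lock is still held for read. */
		mutex_lock(&result->lock);
	up_read(&fpriv->bo_list_lock);

	return result;
}

static void amdgpu_bo_list_destroy(struct amdgpu_fpriv *fpriv, int id)
{
	struct amdgpu_bo_list *list;

	/* Excludes new lookups while the handle is removed. */
	down_write(&fpriv->bo_list_lock);
	list = idr_remove(&fpriv->bo_list_handles, id);
	if (list) {
		/* At most one holder of list->lock can remain; wait for it. */
		mutex_lock(&list->lock);
		mutex_unlock(&list->lock);
	}
	up_write(&fpriv->bo_list_lock);

	if (list)
		amdgpu_bo_list_free(list);
}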


Please give me an example if I am missing anything.

Thanks for all other comments.

RCU and kref might be a cleaner solution than an rw_semaphore, but that
requires a bigger change to the current implementation.
I will have a look at that later.
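
[A minimal sketch of what such an RCU/kref-based read path could look
like, assuming the bo list gained a 'refcount' kref and an 'rhead'
rcu_head and is freed through kfree_rcu(); names are illustrative and
not the eventual implementation:]

struct amdgpu_bo_list *amdgpu_bo_list_get(struct amdgpu_fpriv *fpriv, int id)
{
	struct amdgpu_bo_list *list;

	rcu_read_lock();
	list = idr_find(&fpriv->bo_list_handles, id);
	/* Take a reference only if the list has not already been released. */
	if (list && !kref_get_unless_zero(&list->refcount))
		list = NULL;
	rcu_read_unlock();

	return list;
}

static void amdgpu_bo_list_release_rcu(struct kref *ref)
{
	struct amdgpu_bo_list *list =
		container_of(ref, struct amdgpu_bo_list, refcount);

	/* Defer the actual free until all RCU readers are done. */
	kfree_rcu(list, rhead);
}

void amdgpu_bo_list_put(struct amdgpu_bo_list *list)
{
	kref_put(&list->refcount, amdgpu_bo_list_release_rcu);
}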

Regards,
Alex Bin Xie

>>
>> Otherwise we could easily simplify this locking.
>> Let me give an example here.
>>
>> In function amdgpu_bo_list_get, assuming we change the code like this:
>> ...
>>     down_read(&fpriv->bo_list_lock);
>>     result = idr_find(&fpriv->bo_list_handles, id);
>>     up_read(&fpriv->bo_list_lock);
>>     /**Line 1. Task A was scheduled away from CPU**/
>>     if (result)
>>         mutex_lock(&result->lock);
>> ...
>>
>> In function amdgpu_bo_list_destroy, assuming we change the code like 
>> this:
>> ...
>>     down_write(&fpriv->bo_list_lock);
>>     list = idr_remove(&fpriv->bo_list_handles, id);
>>     up_write(&fpriv->bo_list_lock);
>>     if (list) {
>>         /* Another user may have a reference to this list still */
>>         mutex_lock(&list->lock);
>>         mutex_unlock(&list->lock);
>>         amdgpu_bo_list_free(list);
>>     }
>> }
>>
>> When task A is running in amdgpu_bo_list_get at Line 1, the CPU
>> scheduler takes the CPU away from task A.
>> Task B then runs amdgpu_bo_list_destroy and can run all the way through
>> to destroying and freeing the mutex.
>> When task A resumes, result->lock has already been destroyed and freed
>> by task B, so task A tries to lock a mutex that no longer exists.
>>
>>
>>>
>>> Christian.
>>>
>>>>   }
>>>>     static int amdgpu_bo_list_set(struct amdgpu_device *adev,
>>>
>>>
>>
>> _______________________________________________
>> amd-gfx mailing list
>> amd-gfx@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>
>

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 2/4] drm/amdgpu: Optimization of critical section
       [not found]             ` <d4a8c004-8a60-83ea-be83-c827115e1a6b-5C7GfCeVMHo@public.gmane.org>
  2017-06-13 10:14               ` Christian König
@ 2017-06-13 23:08               ` Dave Airlie
  1 sibling, 0 replies; 18+ messages in thread
From: Dave Airlie @ 2017-06-13 23:08 UTC (permalink / raw)
  To: axie; +Cc: Christian König, amd-gfx mailing list, Alex Xie

On 13 June 2017 at 16:54, axie <axie@amd.com> wrote:
>
> On 2017-06-12 06:47 PM, Christian König wrote:
>>
>> Am 12.06.2017 um 22:31 schrieb Alex Xie:
>>>
>>> Make the critical section smaller. There is no
>>> need to protect the bo_list, because there is
>>> no other task can access the newly created BO
>>> list yet.
>>
>>
>> NAK, a task can guess the next id number so could get the kernel to use
>> the structure before it is initialized.
>>
>> Christian.
>>
> How did you find such an extreme corner case? I am fine with this comment.
> I will address it Tuesday/Wednesday with the next version of the patch set.
>
> Currently, there are two options:
> Option 1: use a write_lock in the create function and restore the
> original code for BO list creation.
> Option 2: move the idr_alloc call to the end of the BO list creation
> ioctl. This is more efficient, but the code looks dirty.

You can idr_alloc now and idr_replace later, btw. We do this in a few
places in the drm.
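
[A rough sketch of that reserve-then-publish pattern on the BO list
creation path, assuming the existing bo_list_lock mutex; illustrative
only, not the code that was eventually merged:]

	int id;

	/* Reserve a handle without publishing a pointer yet. */
	mutex_lock(&fpriv->bo_list_lock);
	id = idr_alloc(&fpriv->bo_list_handles, NULL, 1, 0, GFP_KERNEL);
	mutex_unlock(&fpriv->bo_list_lock);
	if (id < 0)
		return id;

	/* ... initialize the new bo_list without holding bo_list_lock ... */

	/*
	 * Publish the fully initialized list. A lookup that raced and saw
	 * NULL simply treats the handle as not yet valid.
	 */
	mutex_lock(&fpriv->bo_list_lock);
	idr_replace(&fpriv->bo_list_handles, list, id);
	mutex_unlock(&fpriv->bo_list_lock);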

Dave.
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2017-06-13 23:08 UTC | newest]

Thread overview: 18+ messages
2017-06-12 20:31 [PATCH 1/4] drm/amdgpu: Optimization of critical section Alex Xie
     [not found] ` <1497299485-7337-1-git-send-email-AlexBin.Xie-5C7GfCeVMHo@public.gmane.org>
2017-06-12 20:31   ` [PATCH 2/4] " Alex Xie
     [not found]     ` <1497299485-7337-2-git-send-email-AlexBin.Xie-5C7GfCeVMHo@public.gmane.org>
2017-06-12 22:47       ` Christian König
     [not found]         ` <0b8a7163-4f3f-d1b3-81ab-b9cb3c2ae9af-ANTagKRnAhcb1SvskN2V4Q@public.gmane.org>
2017-06-13  6:54           ` axie
     [not found]             ` <d4a8c004-8a60-83ea-be83-c827115e1a6b-5C7GfCeVMHo@public.gmane.org>
2017-06-13 10:14               ` Christian König
2017-06-13 23:08               ` Dave Airlie
2017-06-12 20:31   ` [PATCH 3/4] drm/amdgpu: Optimization of mutex usage Alex Xie
     [not found]     ` <1497299485-7337-3-git-send-email-AlexBin.Xie-5C7GfCeVMHo@public.gmane.org>
2017-06-12 22:48       ` Christian König
2017-06-12 20:31   ` [PATCH 4/4] drm/amdgpu: Optimize " Alex Xie
     [not found]     ` <1497299485-7337-4-git-send-email-AlexBin.Xie-5C7GfCeVMHo@public.gmane.org>
2017-06-12 22:54       ` Christian König
     [not found]         ` <e1a33855-cbc1-a317-90c1-0833e48fb791-ANTagKRnAhcb1SvskN2V4Q@public.gmane.org>
2017-06-13  6:03           ` axie
     [not found]             ` <b7673a0f-bb32-183c-1ef8-49802407e231-5C7GfCeVMHo@public.gmane.org>
2017-06-13 10:07               ` Christian König
2017-06-12 23:00   ` [PATCH 1/4] drm/amdgpu: Optimization of critical section Christian König
     [not found]     ` <c7b74eb8-65f4-2593-2fe8-dca25519e8c7-ANTagKRnAhcb1SvskN2V4Q@public.gmane.org>
2017-06-13  2:25       ` Dave Airlie
     [not found]         ` <CAPM=9tzgr4ik8yKv6pU5gA0rjVQ3O9-eL5g8iU_XJ_bAWzr9dA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2017-06-13  7:57           ` Christian König
2017-06-13  6:29       ` axie
     [not found]         ` <815cfedc-8eb8-37aa-e9f5-5882e9a5c950-5C7GfCeVMHo@public.gmane.org>
2017-06-13 10:23           ` Christian König
     [not found]             ` <0167e95a-2857-ba3a-f395-73163ba0c305-ANTagKRnAhcb1SvskN2V4Q@public.gmane.org>
2017-06-13 17:04               ` axie
