linux-kernel.vger.kernel.org archive mirror
* [PATCH] gpu: drm: remove redundant dma_fence_put() when drm_sched_job_add_dependency() fails
@ 2022-04-25  8:36 Hangyu Hua
  2022-04-25 15:42 ` Andrey Grodzovsky
  0 siblings, 1 reply; 10+ messages in thread
From: Hangyu Hua @ 2022-04-25  8:36 UTC (permalink / raw)
  To: yuq825, airlied, daniel, andrey.grodzovsky
  Cc: dri-devel, lima, linux-kernel, Hangyu Hua

When drm_sched_job_add_dependency() fails, dma_fence_put() will be called
internally. Calling it again after drm_sched_job_add_dependency() finishes
may result in a dangling pointer.

Fix this by removing redundant dma_fence_put().

Signed-off-by: Hangyu Hua <hbh25y@gmail.com>
---
 drivers/gpu/drm/lima/lima_gem.c        | 1 -
 drivers/gpu/drm/scheduler/sched_main.c | 1 -
 2 files changed, 2 deletions(-)

diff --git a/drivers/gpu/drm/lima/lima_gem.c b/drivers/gpu/drm/lima/lima_gem.c
index 55bb1ec3c4f7..99c8e7f6bb1c 100644
--- a/drivers/gpu/drm/lima/lima_gem.c
+++ b/drivers/gpu/drm/lima/lima_gem.c
@@ -291,7 +291,6 @@ static int lima_gem_add_deps(struct drm_file *file, struct lima_submit *submit)
 
 		err = drm_sched_job_add_dependency(&submit->task->base, fence);
 		if (err) {
-			dma_fence_put(fence);
 			return err;
 		}
 	}
diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index b81fceb0b8a2..ebab9eca37a8 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -708,7 +708,6 @@ int drm_sched_job_add_implicit_dependencies(struct drm_sched_job *job,
 		dma_fence_get(fence);
 		ret = drm_sched_job_add_dependency(job, fence);
 		if (ret) {
-			dma_fence_put(fence);
 			return ret;
 		}
 	}
-- 
2.25.1



* Re: [PATCH] gpu: drm: remove redundant dma_fence_put() when drm_sched_job_add_dependency() fails
  2022-04-25  8:36 [PATCH] gpu: drm: remove redundant dma_fence_put() when drm_sched_job_add_dependency() fails Hangyu Hua
@ 2022-04-25 15:42 ` Andrey Grodzovsky
  2022-04-26  2:54   ` Hangyu Hua
  0 siblings, 1 reply; 10+ messages in thread
From: Andrey Grodzovsky @ 2022-04-25 15:42 UTC (permalink / raw)
  To: Hangyu Hua, yuq825, airlied, daniel; +Cc: dri-devel, lima, linux-kernel

On 2022-04-25 04:36, Hangyu Hua wrote:

> When drm_sched_job_add_dependency() fails, dma_fence_put() will be called
> internally. Calling it again after drm_sched_job_add_dependency() finishes
> may result in a dangling pointer.
>
> Fix this by removing redundant dma_fence_put().
>
> Signed-off-by: Hangyu Hua <hbh25y@gmail.com>
> ---
>   drivers/gpu/drm/lima/lima_gem.c        | 1 -
>   drivers/gpu/drm/scheduler/sched_main.c | 1 -
>   2 files changed, 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/lima/lima_gem.c b/drivers/gpu/drm/lima/lima_gem.c
> index 55bb1ec3c4f7..99c8e7f6bb1c 100644
> --- a/drivers/gpu/drm/lima/lima_gem.c
> +++ b/drivers/gpu/drm/lima/lima_gem.c
> @@ -291,7 +291,6 @@ static int lima_gem_add_deps(struct drm_file *file, struct lima_submit *submit)
>   
>   		err = drm_sched_job_add_dependency(&submit->task->base, fence);
>   		if (err) {
> -			dma_fence_put(fence);
>   			return err;


Makes sense here


>   		}
>   	}
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> index b81fceb0b8a2..ebab9eca37a8 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -708,7 +708,6 @@ int drm_sched_job_add_implicit_dependencies(struct drm_sched_job *job,
>   		dma_fence_get(fence);
>   		ret = drm_sched_job_add_dependency(job, fence);
>   		if (ret) {
> -			dma_fence_put(fence);



Not sure about this one: if you look at the relevant commits -
'drm/scheduler: fix drm_sched_job_add_implicit_dependencies' and
'drm/scheduler: fix drm_sched_job_add_implicit_dependencies harder' -
you will see that the dma_fence_put() here balances the extra
dma_fence_get() above.

Andrey


>   			return ret;
>   		}
>   	}


* Re: [PATCH] gpu: drm: remove redundant dma_fence_put() when drm_sched_job_add_dependency() fails
  2022-04-25 15:42 ` Andrey Grodzovsky
@ 2022-04-26  2:54   ` Hangyu Hua
  2022-04-26 14:48     ` Andrey Grodzovsky
  2022-04-26 14:55     ` Andrey Grodzovsky
  0 siblings, 2 replies; 10+ messages in thread
From: Hangyu Hua @ 2022-04-26  2:54 UTC (permalink / raw)
  To: Andrey Grodzovsky, yuq825, airlied, daniel; +Cc: dri-devel, lima, linux-kernel

On 2022/4/25 23:42, Andrey Grodzovsky wrote:
> On 2022-04-25 04:36, Hangyu Hua wrote:
> 
>> When drm_sched_job_add_dependency() fails, dma_fence_put() will be called
>> internally. Calling it again after drm_sched_job_add_dependency() 
>> finishes
>> may result in a dangling pointer.
>>
>> Fix this by removing redundant dma_fence_put().
>>
>> Signed-off-by: Hangyu Hua <hbh25y@gmail.com>
>> ---
>>   drivers/gpu/drm/lima/lima_gem.c        | 1 -
>>   drivers/gpu/drm/scheduler/sched_main.c | 1 -
>>   2 files changed, 2 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/lima/lima_gem.c 
>> b/drivers/gpu/drm/lima/lima_gem.c
>> index 55bb1ec3c4f7..99c8e7f6bb1c 100644
>> --- a/drivers/gpu/drm/lima/lima_gem.c
>> +++ b/drivers/gpu/drm/lima/lima_gem.c
>> @@ -291,7 +291,6 @@ static int lima_gem_add_deps(struct drm_file 
>> *file, struct lima_submit *submit)
>>           err = drm_sched_job_add_dependency(&submit->task->base, fence);
>>           if (err) {
>> -            dma_fence_put(fence);
>>               return err;
> 
> 
> Makes sense here
> 
> 
>>           }
>>       }
>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
>> b/drivers/gpu/drm/scheduler/sched_main.c
>> index b81fceb0b8a2..ebab9eca37a8 100644
>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>> @@ -708,7 +708,6 @@ int drm_sched_job_add_implicit_dependencies(struct 
>> drm_sched_job *job,
>>           dma_fence_get(fence);
>>           ret = drm_sched_job_add_dependency(job, fence);
>>           if (ret) {
>> -            dma_fence_put(fence);
> 
> 
> 
> Not sure about this one since if you look at the relevant commits -
> 'drm/scheduler: fix drm_sched_job_add_implicit_dependencies' and
> 'drm/scheduler: fix drm_sched_job_add_implicit_dependencies harder'
> You will see that the dma_fence_put here balances the extra dma_fence_get
> above
> 
> Andrey
> 

I don't think so. I checked the call chain and found no additional 
dma_fence_get(). But dma_fence_get() needs to be called before 
drm_sched_job_add_dependency() to keep the counter balanced. On the 
other hand, dma_fence_get() and dma_fence_put() are meaningless here if 
there is an extra dma_fence_get(), because the counter will not drop to 
0 during drm_sched_job_add_dependency().

I checked the call chain as follows:

msm_ioctl_gem_submit()
-> submit_fence_sync()
-> drm_sched_job_add_implicit_dependencies()
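
For reference, the error path in question, condensed from 
drivers/gpu/drm/scheduler/sched_main.c:

	dma_fence_get(fence);		<--- takes a ref for the job
	ret = drm_sched_job_add_dependency(job, fence);
	if (ret) {
		dma_fence_put(fence);	<--- second put: the failed
					     drm_sched_job_add_dependency()
					     has already dropped our ref
		return ret;
	}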

Thanks,
Hangyu

> 
>>               return ret;
>>           }
>>       }


* Re: [PATCH] gpu: drm: remove redundant dma_fence_put() when drm_sched_job_add_dependency() fails
  2022-04-26  2:54   ` Hangyu Hua
@ 2022-04-26 14:48     ` Andrey Grodzovsky
  2022-04-26 14:55     ` Andrey Grodzovsky
  1 sibling, 0 replies; 10+ messages in thread
From: Andrey Grodzovsky @ 2022-04-26 14:48 UTC (permalink / raw)
  To: Hangyu Hua, yuq825, airlied, daniel; +Cc: dri-devel, lima, linux-kernel


On 2022-04-25 22:54, Hangyu Hua wrote:
> On 2022/4/25 23:42, Andrey Grodzovsky wrote:
>> On 2022-04-25 04:36, Hangyu Hua wrote:
>>
>>> When drm_sched_job_add_dependency() fails, dma_fence_put() will be 
>>> called
>>> internally. Calling it again after drm_sched_job_add_dependency() 
>>> finishes
>>> may result in a dangling pointer.
>>>
>>> Fix this by removing redundant dma_fence_put().
>>>
>>> Signed-off-by: Hangyu Hua <hbh25y@gmail.com>
>>> ---
>>>   drivers/gpu/drm/lima/lima_gem.c        | 1 -
>>>   drivers/gpu/drm/scheduler/sched_main.c | 1 -
>>>   2 files changed, 2 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/lima/lima_gem.c 
>>> b/drivers/gpu/drm/lima/lima_gem.c
>>> index 55bb1ec3c4f7..99c8e7f6bb1c 100644
>>> --- a/drivers/gpu/drm/lima/lima_gem.c
>>> +++ b/drivers/gpu/drm/lima/lima_gem.c
>>> @@ -291,7 +291,6 @@ static int lima_gem_add_deps(struct drm_file 
>>> *file, struct lima_submit *submit)
>>>           err = drm_sched_job_add_dependency(&submit->task->base, 
>>> fence);
>>>           if (err) {
>>> -            dma_fence_put(fence);
>>>               return err;
>>
>>
>> Makes sense here
>>
>>
>>>           }
>>>       }
>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
>>> b/drivers/gpu/drm/scheduler/sched_main.c
>>> index b81fceb0b8a2..ebab9eca37a8 100644
>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>>> @@ -708,7 +708,6 @@ int 
>>> drm_sched_job_add_implicit_dependencies(struct drm_sched_job *job,
>>>           dma_fence_get(fence);
>>>           ret = drm_sched_job_add_dependency(job, fence);
>>>           if (ret) {
>>> -            dma_fence_put(fence);
>>
>>
>>
>> Not sure about this one since if you look at the relevant commits -
>> 'drm/scheduler: fix drm_sched_job_add_implicit_dependencies' and
>> 'drm/scheduler: fix drm_sched_job_add_implicit_dependencies harder'
>> You will see that the dma_fence_put here balances the extra 
>> dma_fence_get
>> above
>>
>> Andrey
>>
>
> I don't think so. I checked the call chain and found no additional 
> dma_fence_get(). But dma_fence_get() needs to be called before 
> drm_sched_job_add_dependency() to keep the counter balanced. 


I didn't say there is an additional dma_fence_get(). From what I 
understand, the drm_sched_job_add_implicit_dependencies->dma_fence_get 
here is not balancing any counter but rather grabs an extra reference to 
account for adding the fence to the job's dependency array, and so when 
adding fails you call dma_fence_put to decrement the count back. This 
makes sense because drm_sched_job_add_dependency doesn't itself take a 
reference on the fences it stores.

> On the other hand, dma_fence_get() and dma_fence_put() are meaningless 
> here if there is an extra dma_fence_get() because counter will not 
> decrease to 0 during drm_sched_job_add_dependency().


Where is the extra dma_fence_get()?


>
> I check the call chain as follows:
>
> msm_ioctl_gem_submit()
> -> submit_fence_sync()
> -> drm_sched_job_add_implicit_dependencies()


Could you maybe print the buggy refcount sequence you say you discovered 
as an example? I fail to follow where the problem is.

Andrey


>
> Thanks,
> Hangyu
>
>>
>>>               return ret;
>>>           }
>>>       }


* Re: [PATCH] gpu: drm: remove redundant dma_fence_put() when drm_sched_job_add_dependency() fails
  2022-04-26  2:54   ` Hangyu Hua
  2022-04-26 14:48     ` Andrey Grodzovsky
@ 2022-04-26 14:55     ` Andrey Grodzovsky
  2022-04-27  2:31       ` Hangyu Hua
  1 sibling, 1 reply; 10+ messages in thread
From: Andrey Grodzovsky @ 2022-04-26 14:55 UTC (permalink / raw)
  To: Hangyu Hua, yuq825, airlied, daniel; +Cc: dri-devel, lima, linux-kernel


On 2022-04-25 22:54, Hangyu Hua wrote:
> On 2022/4/25 23:42, Andrey Grodzovsky wrote:
>> On 2022-04-25 04:36, Hangyu Hua wrote:
>>
>>> When drm_sched_job_add_dependency() fails, dma_fence_put() will be 
>>> called
>>> internally. Calling it again after drm_sched_job_add_dependency() 
>>> finishes
>>> may result in a dangling pointer.
>>>
>>> Fix this by removing redundant dma_fence_put().
>>>
>>> Signed-off-by: Hangyu Hua <hbh25y@gmail.com>
>>> ---
>>>   drivers/gpu/drm/lima/lima_gem.c        | 1 -
>>>   drivers/gpu/drm/scheduler/sched_main.c | 1 -
>>>   2 files changed, 2 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/lima/lima_gem.c 
>>> b/drivers/gpu/drm/lima/lima_gem.c
>>> index 55bb1ec3c4f7..99c8e7f6bb1c 100644
>>> --- a/drivers/gpu/drm/lima/lima_gem.c
>>> +++ b/drivers/gpu/drm/lima/lima_gem.c
>>> @@ -291,7 +291,6 @@ static int lima_gem_add_deps(struct drm_file 
>>> *file, struct lima_submit *submit)
>>>           err = drm_sched_job_add_dependency(&submit->task->base, 
>>> fence);
>>>           if (err) {
>>> -            dma_fence_put(fence);
>>>               return err;
>>
>>
>> Makes sense here
>>
>>
>>>           }
>>>       }
>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
>>> b/drivers/gpu/drm/scheduler/sched_main.c
>>> index b81fceb0b8a2..ebab9eca37a8 100644
>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>>> @@ -708,7 +708,6 @@ int 
>>> drm_sched_job_add_implicit_dependencies(struct drm_sched_job *job,
>>>           dma_fence_get(fence);
>>>           ret = drm_sched_job_add_dependency(job, fence);
>>>           if (ret) {
>>> -            dma_fence_put(fence);
>>
>>
>>
>> Not sure about this one since if you look at the relevant commits -
>> 'drm/scheduler: fix drm_sched_job_add_implicit_dependencies' and
>> 'drm/scheduler: fix drm_sched_job_add_implicit_dependencies harder'
>> You will see that the dma_fence_put here balances the extra 
>> dma_fence_get
>> above
>>
>> Andrey
>>
>
> I don't think so. I checked the call chain and found no additional 
> dma_fence_get(). But dma_fence_get() needs to be called before 
> drm_sched_job_add_dependency() to keep the counter balanced. 


I'm not saying there is an additional get, just that 
drm_sched_job_add_dependency() doesn't grab an extra reference to the 
fences it stores, so this needs to be done outside; for that, 
drm_sched_job_add_implicit_dependencies->dma_fence_get is called and, if 
this addition fails, you just call dma_fence_put to keep the counter 
balanced.


> On the other hand, dma_fence_get() and dma_fence_put() are meaningless 
> here if there is an extra dma_fence_get() because counter will not 
> decrease to 0 during drm_sched_job_add_dependency().
>
> I check the call chain as follows:
>
> msm_ioctl_gem_submit()
> -> submit_fence_sync()
> -> drm_sched_job_add_implicit_dependencies()


Can you maybe trace or print one such example of a problematic refcount 
that you are trying to fix? I still don't see where the problem is.

Andrey


>
> Thanks,
> Hangyu
>
>>
>>>               return ret;
>>>           }
>>>       }


* Re: [PATCH] gpu: drm: remove redundant dma_fence_put() when drm_sched_job_add_dependency() fails
  2022-04-26 14:55     ` Andrey Grodzovsky
@ 2022-04-27  2:31       ` Hangyu Hua
       [not found]         ` <65b6cc23-1a77-7df0-5768-f81cd03b6514@amd.com>
  0 siblings, 1 reply; 10+ messages in thread
From: Hangyu Hua @ 2022-04-27  2:31 UTC (permalink / raw)
  To: Andrey Grodzovsky, yuq825, airlied, daniel; +Cc: dri-devel, lima, linux-kernel

On 2022/4/26 22:55, Andrey Grodzovsky wrote:
> 
> On 2022-04-25 22:54, Hangyu Hua wrote:
>> On 2022/4/25 23:42, Andrey Grodzovsky wrote:
>>> On 2022-04-25 04:36, Hangyu Hua wrote:
>>>
>>>> When drm_sched_job_add_dependency() fails, dma_fence_put() will be 
>>>> called
>>>> internally. Calling it again after drm_sched_job_add_dependency() 
>>>> finishes
>>>> may result in a dangling pointer.
>>>>
>>>> Fix this by removing redundant dma_fence_put().
>>>>
>>>> Signed-off-by: Hangyu Hua <hbh25y@gmail.com>
>>>> ---
>>>>   drivers/gpu/drm/lima/lima_gem.c        | 1 -
>>>>   drivers/gpu/drm/scheduler/sched_main.c | 1 -
>>>>   2 files changed, 2 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/lima/lima_gem.c 
>>>> b/drivers/gpu/drm/lima/lima_gem.c
>>>> index 55bb1ec3c4f7..99c8e7f6bb1c 100644
>>>> --- a/drivers/gpu/drm/lima/lima_gem.c
>>>> +++ b/drivers/gpu/drm/lima/lima_gem.c
>>>> @@ -291,7 +291,6 @@ static int lima_gem_add_deps(struct drm_file 
>>>> *file, struct lima_submit *submit)
>>>>           err = drm_sched_job_add_dependency(&submit->task->base, 
>>>> fence);
>>>>           if (err) {
>>>> -            dma_fence_put(fence);
>>>>               return err;
>>>
>>>
>>> Makes sense here
>>>
>>>
>>>>           }
>>>>       }
>>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
>>>> b/drivers/gpu/drm/scheduler/sched_main.c
>>>> index b81fceb0b8a2..ebab9eca37a8 100644
>>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>>>> @@ -708,7 +708,6 @@ int 
>>>> drm_sched_job_add_implicit_dependencies(struct drm_sched_job *job,
>>>>           dma_fence_get(fence);
>>>>           ret = drm_sched_job_add_dependency(job, fence);
>>>>           if (ret) {
>>>> -            dma_fence_put(fence);
>>>
>>>
>>>
>>> Not sure about this one since if you look at the relevant commits -
>>> 'drm/scheduler: fix drm_sched_job_add_implicit_dependencies' and
>>> 'drm/scheduler: fix drm_sched_job_add_implicit_dependencies harder'
>>> You will see that the dma_fence_put here balances the extra 
>>> dma_fence_get
>>> above
>>>
>>> Andrey
>>>
>>
>> I don't think so. I checked the call chain and found no additional 
>> dma_fence_get(). But dma_fence_get() needs to be called before 
>> drm_sched_job_add_dependency() to keep the counter balanced. 
> 
> 
> I don't say there is an additional get, I just say that 
> drm_sched_job_add_dependency doesn't grab an extra reference to the 
> fences it stores so this needs to be done outside and for that
> drm_sched_job_add_implicit_dependencies->dma_fence_get is called and, if 
> this addition fails you just call dma_fence_put to keep the counter 
> balanced.
> 

drm_sched_job_add_implicit_dependencies() will call 
drm_sched_job_add_dependency(). And drm_sched_job_add_dependency() 
already calls dma_fence_put() when it fails. Calling dma_fence_put() 
twice doesn't make sense.

The dma_fence_get() is at [2]. But dma_fence_put() will be called at 
both [1] and [3] when xa_alloc() fails.


int drm_sched_job_add_dependency(struct drm_sched_job *job,
				 struct dma_fence *fence)
{
	...
	ret = xa_alloc(&job->dependencies, &id, fence, xa_limit_32b, GFP_KERNEL);
	if (ret != 0)
		dma_fence_put(fence);	<--- [1]

	return ret;
}
EXPORT_SYMBOL(drm_sched_job_add_dependency);


int drm_sched_job_add_implicit_dependencies(struct drm_sched_job *job,
					    struct drm_gem_object *obj,
					    bool write)
{
	struct dma_resv_iter cursor;
	struct dma_fence *fence;
	int ret;

	dma_resv_for_each_fence(&cursor, obj->resv, write, fence) {
		/* Make sure to grab an additional ref on the added fence */
		dma_fence_get(fence);	<--- [2]
		ret = drm_sched_job_add_dependency(job, fence);
		if (ret) {
			dma_fence_put(fence);	<--- [3]
			return ret;
		}
	}
	return 0;
}


> 
>> On the other hand, dma_fence_get() and dma_fence_put() are meaningless 
>> here if there is an extra dma_fence_get() because counter will not 
>> decrease to 0 during drm_sched_job_add_dependency().
>>
>> I check the call chain as follows:
>>
>> msm_ioctl_gem_submit()
>> -> submit_fence_sync()
>> -> drm_sched_job_add_implicit_dependencies()
> 
> 
> Can you maybe trace or print one such example of problematic refcount 
> that you are trying to fix ? I still don't see where is the problem.
> 
> Andrey
> 

I also wish I could. System logs would make this easy, but I don't have 
a corresponding physical GPU device, and 
drm_sched_job_add_implicit_dependencies() is only used by a few drivers.

Thanks.
> 
>>
>> Thanks,
>> Hangyu
>>
>>>
>>>>               return ret;
>>>>           }
>>>>       }


* Re: [PATCH] gpu: drm: remove redundant dma_fence_put() when drm_sched_job_add_dependency() fails
       [not found]         ` <65b6cc23-1a77-7df0-5768-f81cd03b6514@amd.com>
@ 2022-04-28  8:56           ` Hangyu Hua
  2022-04-28 15:27             ` Andrey Grodzovsky
  0 siblings, 1 reply; 10+ messages in thread
From: Hangyu Hua @ 2022-04-28  8:56 UTC (permalink / raw)
  To: Andrey Grodzovsky, yuq825, airlied, daniel; +Cc: dri-devel, lima, linux-kernel

On 2022/4/27 22:43, Andrey Grodzovsky wrote:
> 
> On 2022-04-26 22:31, Hangyu Hua wrote:
>> On 2022/4/26 22:55, Andrey Grodzovsky wrote:
>>>
>>> On 2022-04-25 22:54, Hangyu Hua wrote:
>>>> On 2022/4/25 23:42, Andrey Grodzovsky wrote:
>>>>> On 2022-04-25 04:36, Hangyu Hua wrote:
>>>>>
>>>>>> When drm_sched_job_add_dependency() fails, dma_fence_put() will be 
>>>>>> called
>>>>>> internally. Calling it again after drm_sched_job_add_dependency() 
>>>>>> finishes
>>>>>> may result in a dangling pointer.
>>>>>>
>>>>>> Fix this by removing redundant dma_fence_put().
>>>>>>
>>>>>> Signed-off-by: Hangyu Hua <hbh25y@gmail.com>
>>>>>> ---
>>>>>>   drivers/gpu/drm/lima/lima_gem.c        | 1 -
>>>>>>   drivers/gpu/drm/scheduler/sched_main.c | 1 -
>>>>>>   2 files changed, 2 deletions(-)
>>>>>>
>>>>>> diff --git a/drivers/gpu/drm/lima/lima_gem.c 
>>>>>> b/drivers/gpu/drm/lima/lima_gem.c
>>>>>> index 55bb1ec3c4f7..99c8e7f6bb1c 100644
>>>>>> --- a/drivers/gpu/drm/lima/lima_gem.c
>>>>>> +++ b/drivers/gpu/drm/lima/lima_gem.c
>>>>>> @@ -291,7 +291,6 @@ static int lima_gem_add_deps(struct drm_file 
>>>>>> *file, struct lima_submit *submit)
>>>>>>           err = drm_sched_job_add_dependency(&submit->task->base, 
>>>>>> fence);
>>>>>>           if (err) {
>>>>>> -            dma_fence_put(fence);
>>>>>>               return err;
>>>>>
>>>>>
>>>>> Makes sense here
>>>>>
>>>>>
>>>>>>           }
>>>>>>       }
>>>>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
>>>>>> b/drivers/gpu/drm/scheduler/sched_main.c
>>>>>> index b81fceb0b8a2..ebab9eca37a8 100644
>>>>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>>>>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>>>>>> @@ -708,7 +708,6 @@ int 
>>>>>> drm_sched_job_add_implicit_dependencies(struct drm_sched_job *job,
>>>>>>           dma_fence_get(fence);
>>>>>>           ret = drm_sched_job_add_dependency(job, fence);
>>>>>>           if (ret) {
>>>>>> -            dma_fence_put(fence);
>>>>>
>>>>>
>>>>>
>>>>> Not sure about this one since if you look at the relevant commits -
>>>>> 'drm/scheduler: fix drm_sched_job_add_implicit_dependencies' and
>>>>> 'drm/scheduler: fix drm_sched_job_add_implicit_dependencies harder'
>>>>> You will see that the dma_fence_put here balances the extra 
>>>>> dma_fence_get
>>>>> above
>>>>>
>>>>> Andrey
>>>>>
>>>>
>>>> I don't think so. I checked the call chain and found no additional 
>>>> dma_fence_get(). But dma_fence_get() needs to be called before 
>>>> drm_sched_job_add_dependency() to keep the counter balanced. 
>>>
>>>
>>> I don't say there is an additional get, I just say that 
>>> drm_sched_job_add_dependency doesn't grab an extra reference to the 
>>> fences it stores so this needs to be done outside and for that
>>> drm_sched_job_add_implicit_dependencies->dma_fence_get is called and, 
>>> if this addition fails you just call dma_fence_put to keep the 
>>> counter balanced.
>>>
>>
>> drm_sched_job_add_implicit_dependencies() will call 
>> drm_sched_job_add_dependency(). And drm_sched_job_add_dependency() 
>> already call dma_fence_put() when it fails. Calling dma_fence_put() 
>> twice doesn't make sense.
>>
>> dma_fence_get() is in [2]. But dma_fence_put() will be called in [1] 
>> and [3] when xa_alloc() fails.
> 
> 
> The way I see it, [2] and [3] are matching *get* and *put* 
> respectively. The [1] *put* is against the original 
> dma_fence_init->kref_init of the fence, which always sets the refcount 
> to 1.
> Also in support of this see commit 'drm/scheduler: fix 
> drm_sched_job_add_implicit_dependencies harder' - it says there 
> "drm_sched_job_add_dependency() could drop the last ref"  - this last 
> ref is the original refcount set by dma_fence_init->kref
> 
> Andrey


You can see that drm_sched_job_add_dependency() has three return paths: 
[4], [5] and [1]. [4] and [5] return 0; [1] returns an error.

There would be three weird problems if you were right:

1. The [5] path would trigger a refcount leak, because ret is 0 in the 
*if* at [6]. Otherwise, [2] and [5] are a matching *get* and *put* here.

2. The [4] path needs an additional dma_fence_get() to add the fence as 
a job dependency. The fence comes from obj->resv. Taking msm as an 
example, obj->resv comes from 
msm_ioctl_gem_submit()->submit_lookup_objects(). It is not possible for 
an object to have *refcount == 1* while being referenced in two places. 
So the dma_fence_get() called at [2] is for [4]. By the way, [3] doesn't 
execute in this case.

3. This one is a doubt. As you can see in "[PATCH] drm/scheduler: fix 
drm_sched_job_add_implicit_dependencies harder", 
drm_sched_job_add_dependency() could drop the last ref, so we need to do 
the dma_fence_get() first. But the last ref will still be dropped at [3] 
if drm_sched_job_add_dependency() takes path [1], and there is only a 
*return* between [1] and [3]. Is that necessary? I think Rob Clark wants 
to avoid the last ref being dropped in 
drm_sched_job_add_implicit_dependencies() because the fence is still 
used by obj->resv.


int drm_sched_job_add_dependency(struct drm_sched_job *job,
                                  struct dma_fence *fence)
{
         ...
         xa_for_each(&job->dependencies, index, entry) {
                 if (entry->context != fence->context)
                         continue;

                 if (dma_fence_is_later(fence, entry)) {
                         dma_fence_put(entry);
                        xa_store(&job->dependencies, index, fence,
                                 GFP_KERNEL);	<---- [4]
                 } else {
                         dma_fence_put(fence);	<---- [5]
                 }
                 return 0;
         }

         ret = xa_alloc(&job->dependencies, &id, fence, xa_limit_32b,
                        GFP_KERNEL);
         if (ret != 0)
                 dma_fence_put(fence);   <---- [1]

         return ret;
}


int drm_sched_job_add_implicit_dependencies(struct drm_sched_job *job,
                                             struct drm_gem_object *obj,
                                             bool write)
{
         struct dma_resv_iter cursor;
         struct dma_fence *fence;
         int ret;

         dma_resv_for_each_fence(&cursor, obj->resv, write, fence) {
                 /* Make sure to grab an additional ref on the added fence */
                 dma_fence_get(fence);   <---- [2]
                 ret = drm_sched_job_add_dependency(job, fence);
                 if (ret) {      <---- [6]
                         dma_fence_put(fence);   <---- [3]

                         return ret;
                 }
         }
         return 0;
}

Thanks,
Hangyu

> 
>>
>>
>> int drm_sched_job_add_dependency(struct drm_sched_job *job,
>>                  struct dma_fence *fence)
>> {
>>     ...
>>     ret = xa_alloc(&job->dependencies, &id, fence, xa_limit_32b, 
>> GFP_KERNEL);
>>     if (ret != 0)
>>         dma_fence_put(fence);    <--- [1]
>>
>>     return ret;
>> }
>> EXPORT_SYMBOL(drm_sched_job_add_dependency);
>>
>>
>> int drm_sched_job_add_implicit_dependencies(struct drm_sched_job *job,
>>                         struct drm_gem_object *obj,
>>                         bool write)
>> {
>>     struct dma_resv_iter cursor;
>>     struct dma_fence *fence;
>>     int ret;
>>
>>     dma_resv_for_each_fence(&cursor, obj->resv, write, fence) {
>>         /* Make sure to grab an additional ref on the added fence */
>>         dma_fence_get(fence);    <--- [2]
>>         ret = drm_sched_job_add_dependency(job, fence);
>>         if (ret) {
>>             dma_fence_put(fence);    <--- [3]
>>             return ret;
>>         }
>>     }
>>     return 0;
>> }
>>
>>
>>>
>>>> On the other hand, dma_fence_get() and dma_fence_put() are 
>>>> meaningless here if there is an extra dma_fence_get() because 
>>>> counter will not decrease to 0 during drm_sched_job_add_dependency().
>>>>
>>>> I check the call chain as follows:
>>>>
>>>> msm_ioctl_gem_submit()
>>>> -> submit_fence_sync()
>>>> -> drm_sched_job_add_implicit_dependencies()
>>>
>>>
>>> Can you maybe trace or print one such example of problematic refcount 
>>> that you are trying to fix ? I still don't see where is the problem.
>>>
>>> Andrey
>>>
>>
>> I also wish I could. System logs can make this easy. But i don't have 
>> a corresponding GPU physical device. 
>> drm_sched_job_add_implicit_dependencies is only used in a few devices.
>>
>> Thanks.
>>>
>>>>
>>>> Thanks,
>>>> Hangyu
>>>>
>>>>>
>>>>>>               return ret;
>>>>>>           }
>>>>>>       }


* Re: [PATCH] gpu: drm: remove redundant dma_fence_put() when drm_sched_job_add_dependency() fails
  2022-04-28  8:56           ` Hangyu Hua
@ 2022-04-28 15:27             ` Andrey Grodzovsky
  2022-04-29  3:03               ` Hangyu Hua
  0 siblings, 1 reply; 10+ messages in thread
From: Andrey Grodzovsky @ 2022-04-28 15:27 UTC (permalink / raw)
  To: Hangyu Hua, yuq825, airlied, daniel; +Cc: dri-devel, lima, linux-kernel


On 2022-04-28 04:56, Hangyu Hua wrote:
> On 2022/4/27 22:43, Andrey Grodzovsky wrote:
>>
>> On 2022-04-26 22:31, Hangyu Hua wrote:
>>> On 2022/4/26 22:55, Andrey Grodzovsky wrote:
>>>>
>>>> On 2022-04-25 22:54, Hangyu Hua wrote:
>>>>> On 2022/4/25 23:42, Andrey Grodzovsky wrote:
>>>>>> On 2022-04-25 04:36, Hangyu Hua wrote:
>>>>>>
>>>>>>> When drm_sched_job_add_dependency() fails, dma_fence_put() will 
>>>>>>> be called
>>>>>>> internally. Calling it again after 
>>>>>>> drm_sched_job_add_dependency() finishes
>>>>>>> may result in a dangling pointer.
>>>>>>>
>>>>>>> Fix this by removing redundant dma_fence_put().
>>>>>>>
>>>>>>> Signed-off-by: Hangyu Hua <hbh25y@gmail.com>
>>>>>>> ---
>>>>>>>   drivers/gpu/drm/lima/lima_gem.c        | 1 -
>>>>>>>   drivers/gpu/drm/scheduler/sched_main.c | 1 -
>>>>>>>   2 files changed, 2 deletions(-)
>>>>>>>
>>>>>>> diff --git a/drivers/gpu/drm/lima/lima_gem.c 
>>>>>>> b/drivers/gpu/drm/lima/lima_gem.c
>>>>>>> index 55bb1ec3c4f7..99c8e7f6bb1c 100644
>>>>>>> --- a/drivers/gpu/drm/lima/lima_gem.c
>>>>>>> +++ b/drivers/gpu/drm/lima/lima_gem.c
>>>>>>> @@ -291,7 +291,6 @@ static int lima_gem_add_deps(struct drm_file 
>>>>>>> *file, struct lima_submit *submit)
>>>>>>>           err = 
>>>>>>> drm_sched_job_add_dependency(&submit->task->base, fence);
>>>>>>>           if (err) {
>>>>>>> -            dma_fence_put(fence);
>>>>>>>               return err;
>>>>>>
>>>>>>
>>>>>> Makes sense here
>>>>>>
>>>>>>
>>>>>>>           }
>>>>>>>       }
>>>>>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
>>>>>>> b/drivers/gpu/drm/scheduler/sched_main.c
>>>>>>> index b81fceb0b8a2..ebab9eca37a8 100644
>>>>>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>>>>>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>>>>>>> @@ -708,7 +708,6 @@ int 
>>>>>>> drm_sched_job_add_implicit_dependencies(struct drm_sched_job *job,
>>>>>>>           dma_fence_get(fence);
>>>>>>>           ret = drm_sched_job_add_dependency(job, fence);
>>>>>>>           if (ret) {
>>>>>>> -            dma_fence_put(fence);
>>>>>>
>>>>>>
>>>>>>
>>>>>> Not sure about this one since if you look at the relevant commits -
>>>>>> 'drm/scheduler: fix drm_sched_job_add_implicit_dependencies' and
>>>>>> 'drm/scheduler: fix drm_sched_job_add_implicit_dependencies harder'
>>>>>> You will see that the dma_fence_put here balances the extra 
>>>>>> dma_fence_get
>>>>>> above
>>>>>>
>>>>>> Andrey
>>>>>>
>>>>>
>>>>> I don't think so. I checked the call chain and found no additional 
>>>>> dma_fence_get(). But dma_fence_get() needs to be called before 
>>>>> drm_sched_job_add_dependency() to keep the counter balanced. 
>>>>
>>>>
>>>> I don't say there is an additional get, I just say that 
>>>> drm_sched_job_add_dependency doesn't grab an extra reference to the 
>>>> fences it stores so this needs to be done outside and for that
>>>> drm_sched_job_add_implicit_dependencies->dma_fence_get is called 
>>>> and, if this addition fails you just call dma_fence_put to keep the 
>>>> counter balanced.
>>>>
>>>
>>> drm_sched_job_add_implicit_dependencies() will call 
>>> drm_sched_job_add_dependency(). And drm_sched_job_add_dependency() 
>>> already call dma_fence_put() when it fails. Calling dma_fence_put() 
>>> twice doesn't make sense.
>>>
>>> dma_fence_get() is in [2]. But dma_fence_put() will be called in [1] 
>>> and [3] when xa_alloc() fails.
>>
>>
>> The way I see it, [2] and [3] are matching *get* and *put* 
>> respectively. [1] *put* is against the original 
>> dma_fence_init->kref_init of the fence which always set the refcount 
>> at 1.
>> Also in support of this see commit 'drm/scheduler: fix 
>> drm_sched_job_add_implicit_dependencies harder' - it says there 
>> "drm_sched_job_add_dependency() could drop the last ref"  - this last 
>> ref is the original refcount set by dma_fence_init->kref
>>
>> Andrey
>
>
> You can see that drm_sched_job_add_dependency() has three return paths 
> they are [4], [5] and [1]. [4] and [5] will return 0. [1] will return 
> error.
>
> There will be three weird problems if you're right:
>
> 1. [5] path will trigger a refcount leak because ret is 0 in *if*[6]. 


Terminology confusion issue - [5] is a 'put', so it cannot cause a leak 
by definition. An extra unbalanced 'get' will cause a leak because the 
memory is never released; an extra 'put' will probably just cause a 
warning in kref_put, or maybe a double free.


> Otherwise [2] and [5] are matching *get* and *put* in here.


Exactly, they are matching - so up to this point all is good and there 
is no 'leak' then, no?


>
> 2. [4] path need a additional dma_fence_get() to adds the fence as a 
> job dependency. fence is from obj->resv. Taking msm as an example 
> obj->resv is from etnaviv_ioctl_gem_submit()->submit_lookup_objects(). 
> It is not possible that an object has *refcount == 1* but is 
> referenced in two places. So dma_fence_get() called in [2] is for [4]. 
> By the way, [3] don't execute in this case.


Still don't see the problem - [2] is the additional dma_fence_get() you 
need here (just as you say above).


>
> 3. This one is a doubt. You can see in "[PATCH] drm/scheduler: fix 
> drm_sched_job_add_implicit_dependencies harder". 
> drm_sched_job_add_dependency() could drop the last ref, so we need to do
> the dma_fence_get() first. But the last ref still will drop in [3] if 
> drm_sched_job_add_dependency() go path [1]. And there is only a 
> *return* between [1] and [3]. Is this necessary? I think Rob Clark 
> wants to avoid the last ref being dropped in 
> drm_sched_job_add_implicit_dependencies() because fence is still used 
> by obj->resv.


In the scenario above - if we go through path [1] - the refcount before 
[1] starts is 2: one from the original kref_init and one from [2], so it 
is balanced against 2 puts (one from [1] and one from [3]), and I still 
don't see a problem.

I suggest that you give a specific scenario, from a fence refcount 
perspective, that your patch fixes. I might be wrong, but unless you 
give a specific case where the 'put' at [3] is redundant, I just can't 
see it.

Andrey


>
>
> int drm_sched_job_add_dependency(struct drm_sched_job *job,
>                                  struct dma_fence *fence)
> {
>         ...
>         xa_for_each(&job->dependencies, index, entry) {
>                 if (entry->context != fence->context)
>                         continue;
>
>                 if (dma_fence_is_later(fence, entry)) {
>                         dma_fence_put(entry);
>                         xa_store(&job->dependencies, index, fence, 
> GFP_KERNEL);    <---- [4]
>                 } else {
>                         dma_fence_put(fence);    <---- [5]
>                 }
>                 return 0;
>         }
>
>         ret = xa_alloc(&job->dependencies, &id, fence, xa_limit_32b, 
> GFP_KERNEL);
>         if (ret != 0)
>                 dma_fence_put(fence);   <---- [1]
>
>         return ret;
> }
>
>
> int drm_sched_job_add_implicit_dependencies(struct drm_sched_job *job,
>                                             struct drm_gem_object *obj,
>                                             bool write)
> {
>         struct dma_resv_iter cursor;
>         struct dma_fence *fence;
>         int ret;
>
>         dma_resv_for_each_fence(&cursor, obj->resv, write, fence) {
>                 /* Make sure to grab an additional ref on the added 
> fence */
>                 dma_fence_get(fence);   <---- [2]
>                 ret = drm_sched_job_add_dependency(job, fence);
>                 if (ret) {      <---- [6]
>                         dma_fence_put(fence);   <---- [3]
>
>                         return ret;
>                 }
>         }
>         return 0;
> }
>
> Thanks,
> hangyu
>
>>
>>>
>>>
>>> int drm_sched_job_add_dependency(struct drm_sched_job *job,
>>>                  struct dma_fence *fence)
>>> {
>>>     ...
>>>     ret = xa_alloc(&job->dependencies, &id, fence, xa_limit_32b, 
>>> GFP_KERNEL);
>>>     if (ret != 0)
>>>         dma_fence_put(fence);    <--- [1]
>>>
>>>     return ret;
>>> }
>>> EXPORT_SYMBOL(drm_sched_job_add_dependency);
>>>
>>>
>>> int drm_sched_job_add_implicit_dependencies(struct drm_sched_job *job,
>>>                         struct drm_gem_object *obj,
>>>                         bool write)
>>> {
>>>     struct dma_resv_iter cursor;
>>>     struct dma_fence *fence;
>>>     int ret;
>>>
>>>     dma_resv_for_each_fence(&cursor, obj->resv, write, fence) {
>>>         /* Make sure to grab an additional ref on the added fence */
>>>         dma_fence_get(fence);    <--- [2]
>>>         ret = drm_sched_job_add_dependency(job, fence);
>>>         if (ret) {
>>>             dma_fence_put(fence);    <--- [3]
>>>             return ret;
>>>         }
>>>     }
>>>     return 0;
>>> }
>>>
>>>
>>>>
>>>>> On the other hand, dma_fence_get() and dma_fence_put() are 
>>>>> meaningless here if there is an extra dma_fence_get() because 
>>>>> counter will not decrease to 0 during drm_sched_job_add_dependency().
>>>>>
>>>>> I check the call chain as follows:
>>>>>
>>>>> msm_ioctl_gem_submit()
>>>>> -> submit_fence_sync()
>>>>> -> drm_sched_job_add_implicit_dependencies()
>>>>
>>>>
>>>> Can you maybe trace or print one such example of problematic 
>>>> refcount that you are trying to fix ? I still don't see where is 
>>>> the problem.
>>>>
>>>> Andrey
>>>>
>>>
>>> I also wish I could. System logs can make this easy. But i don't 
>>> have a corresponding GPU physical device. 
>>> drm_sched_job_add_implicit_dependencies is only used in a few devices.
>>>
>>> Thanks.
>>>>
>>>>>
>>>>> Thanks,
>>>>> Hangyu
>>>>>
>>>>>>
>>>>>>>               return ret;
>>>>>>>           }
>>>>>>>       }


* Re: [PATCH] gpu: drm: remove redundant dma_fence_put() when drm_sched_job_add_dependency() fails
  2022-04-28 15:27             ` Andrey Grodzovsky
@ 2022-04-29  3:03               ` Hangyu Hua
  2022-04-29 16:34                 ` Andrey Grodzovsky
  0 siblings, 1 reply; 10+ messages in thread
From: Hangyu Hua @ 2022-04-29  3:03 UTC (permalink / raw)
  To: Andrey Grodzovsky, yuq825, airlied, daniel; +Cc: dri-devel, lima, linux-kernel

On 2022/4/28 23:27, Andrey Grodzovsky wrote:
> 
> On 2022-04-28 04:56, Hangyu Hua wrote:
>> On 2022/4/27 22:43, Andrey Grodzovsky wrote:
>>>
>>> On 2022-04-26 22:31, Hangyu Hua wrote:
>>>> On 2022/4/26 22:55, Andrey Grodzovsky wrote:
>>>>>
>>>>> On 2022-04-25 22:54, Hangyu Hua wrote:
>>>>>> On 2022/4/25 23:42, Andrey Grodzovsky wrote:
>>>>>>> On 2022-04-25 04:36, Hangyu Hua wrote:
>>>>>>>
>>>>>>>> When drm_sched_job_add_dependency() fails, dma_fence_put() will 
>>>>>>>> be called
>>>>>>>> internally. Calling it again after 
>>>>>>>> drm_sched_job_add_dependency() finishes
>>>>>>>> may result in a dangling pointer.
>>>>>>>>
>>>>>>>> Fix this by removing redundant dma_fence_put().
>>>>>>>>
>>>>>>>> Signed-off-by: Hangyu Hua <hbh25y@gmail.com>
>>>>>>>> ---
>>>>>>>>   drivers/gpu/drm/lima/lima_gem.c        | 1 -
>>>>>>>>   drivers/gpu/drm/scheduler/sched_main.c | 1 -
>>>>>>>>   2 files changed, 2 deletions(-)
>>>>>>>>
>>>>>>>> diff --git a/drivers/gpu/drm/lima/lima_gem.c 
>>>>>>>> b/drivers/gpu/drm/lima/lima_gem.c
>>>>>>>> index 55bb1ec3c4f7..99c8e7f6bb1c 100644
>>>>>>>> --- a/drivers/gpu/drm/lima/lima_gem.c
>>>>>>>> +++ b/drivers/gpu/drm/lima/lima_gem.c
>>>>>>>> @@ -291,7 +291,6 @@ static int lima_gem_add_deps(struct drm_file 
>>>>>>>> *file, struct lima_submit *submit)
>>>>>>>>           err = 
>>>>>>>> drm_sched_job_add_dependency(&submit->task->base, fence);
>>>>>>>>           if (err) {
>>>>>>>> -            dma_fence_put(fence);
>>>>>>>>               return err;
>>>>>>>
>>>>>>>
>>>>>>> Makes sense here
>>>>>>>
>>>>>>>
>>>>>>>>           }
>>>>>>>>       }
>>>>>>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
>>>>>>>> b/drivers/gpu/drm/scheduler/sched_main.c
>>>>>>>> index b81fceb0b8a2..ebab9eca37a8 100644
>>>>>>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>>>>>>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>>>>>>>> @@ -708,7 +708,6 @@ int 
>>>>>>>> drm_sched_job_add_implicit_dependencies(struct drm_sched_job *job,
>>>>>>>>           dma_fence_get(fence);
>>>>>>>>           ret = drm_sched_job_add_dependency(job, fence);
>>>>>>>>           if (ret) {
>>>>>>>> -            dma_fence_put(fence);
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Not sure about this one since if you look at the relevant commits -
>>>>>>> 'drm/scheduler: fix drm_sched_job_add_implicit_dependencies' and
>>>>>>> 'drm/scheduler: fix drm_sched_job_add_implicit_dependencies harder'
>>>>>>> You will see that the dma_fence_put here balances the extra 
>>>>>>> dma_fence_get
>>>>>>> above
>>>>>>>
>>>>>>> Andrey
>>>>>>>
>>>>>>
>>>>>> I don't think so. I checked the call chain and found no additional 
>>>>>> dma_fence_get(). But dma_fence_get() needs to be called before 
>>>>>> drm_sched_job_add_dependency() to keep the counter balanced. 
>>>>>
>>>>>
>>>>> I don't say there is an additional get, I just say that 
>>>>> drm_sched_job_add_dependency doesn't grab an extra reference to the 
>>>>> fences it stores so this needs to be done outside and for that
>>>>> drm_sched_job_add_implicit_dependencies->dma_fence_get is called 
>>>>> and, if this addition fails you just call dma_fence_put to keep the 
>>>>> counter balanced.
>>>>>
>>>>
>>>> drm_sched_job_add_implicit_dependencies() will call 
>>>> drm_sched_job_add_dependency(). And drm_sched_job_add_dependency() 
>>>> already call dma_fence_put() when it fails. Calling dma_fence_put() 
>>>> twice doesn't make sense.
>>>>
>>>> dma_fence_get() is in [2]. But dma_fence_put() will be called in [1] 
>>>> and [3] when xa_alloc() fails.
>>>
>>>
>>> The way I see it, [2] and [3] are matching *get* and *put* 
>>> respectively. [1] *put* is against the original 
>>> dma_fence_init->kref_init of the fence which always set the refcount 
>>> at 1.
>>> Also in support of this see commit 'drm/scheduler: fix 
>>> drm_sched_job_add_implicit_dependencies harder' - it says there 
>>> "drm_sched_job_add_dependency() could drop the last ref"  - this last 
>>> ref is the original refcount set by dma_fence_init->kref
>>>
>>> Andrey
>>
>>
>> You can see that drm_sched_job_add_dependency() has three return paths 
>> they are [4], [5] and [1]. [4] and [5] will return 0. [1] will return 
>> error.
>>
>> There will be three weird problems if you're right:
>>
>> 1. [5] path will trigger a refcount leak because ret is 0 in *if*[6]. 
> 
> 
> Terminology confusion issue - [5] is a 'put' so it cannot cause a leak 
> by definition, extra unbalanced 'get' will cause a leak because memory 
> is never released, extra put will just probably cause a warning in 
> kref_put or maybe double free.
> 
> 
>> Otherwise [2] and [5] are matching *get* and *put* in here.
> 
> 
> Exactly, they are matching - so until this point all good and no 'leak' 
> then, no ?
> 

In fact, I just want to prove that [2] and [3] are not a matching pair 
when the path goes through [4] or [5]. It's less clear-cut when the path 
is [1], but that doesn't matter; please see my explanation below.

> 
>>
>> 2. [4] path need a additional dma_fence_get() to adds the fence as a 
>> job dependency. fence is from obj->resv. Taking msm as an example 
>> obj->resv is from etnaviv_ioctl_gem_submit()->submit_lookup_objects(). 
>> It is not possible that an object has *refcount == 1* but is 
>> referenced in two places. So dma_fence_get() called in [2] is for [4]. 
>> By the way, [3] don't execute in this case.
> 
> 
> Still don't see the problem - [2] is the additional dma_fence_get() you 
> need here (just as you say above).
> 
> 
>>
>> 3. This one is a doubt. You can see in "[PATCH] drm/scheduler: fix 
>> drm_sched_job_add_implicit_dependencies harder". 
>> drm_sched_job_add_dependency() could drop the last ref, so we need to do
>> the dma_fence_get() first. But the last ref still will drop in [3] if 
>> drm_sched_job_add_dependency() go path [1]. And there is only a 
>> *return* between [1] and [3]. Is this necessary? I think Rob Clark 
>> wants to avoid the last ref being dropped in 
>> drm_sched_job_add_implicit_dependencies() because fence is still used 
>> by obj->resv.
> 
> 
> In the scenario above - if we go thorough path [1] refcount before [1] 
> starts is 2 - one from original kref_init and one from [2] and so it's 
> balanced against 2 puts (one from [1] and one from [3]) so I still don't 
> see a problem.
>

We can't just drop the last refcount and release the fence in 
drm_sched_job_add_implicit_dependencies(). The fence comes from 
obj->resv, and taking msm as an example, obj->resv comes from 
msm_ioctl_gem_submit()->submit_lookup_objects().

static int submit_lookup_objects(struct msm_gem_submit *submit,
		struct drm_msm_gem_submit *args, struct drm_file *file)
{
	...
	for (i = 0; i < args->nr_bos; i++) {
		struct drm_gem_object *obj;

		/* normally use drm_gem_object_lookup(), but for bulk lookup
		 * all under single table_lock just hit object_idr directly:
		 */
		obj = idr_find(&file->object_idr, submit->bos[i].handle);	<---- we find the obj here by a user-controllable handle
		if (!obj) {
			DRM_ERROR("invalid handle %u at index %u\n", submit->bos[i].handle, i);
			ret = -EINVAL;
			goto out_unlock;
		}

		drm_gem_object_get(obj);

		submit->bos[i].obj = to_msm_bo(obj);	<---- we store it
	}
	...
}

Taking msm as an example, the path that calls 
drm_sched_job_add_implicit_dependencies() is 
msm_ioctl_gem_submit()->submit_fence_sync().

static int submit_fence_sync(struct msm_gem_submit *submit, bool 
no_implicit)
{
	int i, ret = 0;

	for (i = 0; i < submit->nr_bos; i++) {
		struct drm_gem_object *obj = &submit->bos[i].obj->base;	<---- get the obj
	...
		ret = drm_sched_job_add_implicit_dependencies(&submit->base,
							      obj,
							      write);
		if (ret)
			break;
	}

	return ret;
}

If the fence is released in drm_sched_job_add_implicit_dependencies(), a 
dangling pointer is left in obj->resv.

Specific scenario:
refcount = 1 after init, obj->resv->fence_excl = fence
refcount = 1 before drm_sched_job_add_implicit_dependencies()
refcount = 2 after [2]
refcount = 1 after [1]
refcount = 0 after [3] <--- fence released, but still referenced by obj->resv
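
If it helps, here is a minimal userspace model of that sequence - the 
names mirror the dma_fence API, but this only simulates the refcounting 
and is not kernel code:

#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* Toy stand-in for struct dma_fence; only the refcount is modelled. */
struct fence {
	int refcount;
};

static struct fence *fence_alloc(void)
{
	struct fence *f = malloc(sizeof(*f));

	f->refcount = 1;	/* dma_fence_init() starts the count at 1 */
	return f;
}

static void fence_get(struct fence *f)
{
	f->refcount++;
}

static void fence_put(struct fence *f)
{
	if (--f->refcount == 0) {
		printf("fence freed\n");
		free(f);
	}
}

/* Models drm_sched_job_add_dependency(): on failure it drops the
 * reference that was handed in, like the put at [1]. */
static int add_dependency(struct fence *f, int xa_alloc_fails)
{
	if (xa_alloc_fails) {
		fence_put(f);			/* [1] */
		return -ENOMEM;
	}
	return 0;
}

int main(void)
{
	struct fence *f = fence_alloc();	/* held by "obj->resv": refcount = 1 */
	int ret;

	fence_get(f);				/* [2]: refcount = 2 */
	ret = add_dependency(f, 1);		/* [1]: refcount = 1 */
	if (ret)
		fence_put(f);			/* [3]: refcount = 0, freed while
						 * "obj->resv" still points at f */
	return 0;
}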

Thanks,
Hangyu

> I suggest that you give a specific scenario  from fence ref-count 
> perspective that your patch fixes. I might be wrong but unless you give 
> a specific case where the 'put' in [3] is redundant I just can't see it.
> 
> Andrey >
> 
>>
>>
>> int drm_sched_job_add_dependency(struct drm_sched_job *job,
>>                                  struct dma_fence *fence)
>> {
>>         ...
>>         xa_for_each(&job->dependencies, index, entry) {
>>                 if (entry->context != fence->context)
>>                         continue;
>>
>>                 if (dma_fence_is_later(fence, entry)) {
>>                         dma_fence_put(entry);
>>                         xa_store(&job->dependencies, index, fence, 
>> GFP_KERNEL);    <---- [4]
>>                 } else {
>>                         dma_fence_put(fence);    <---- [5]
>>                 }
>>                 return 0;
>>         }
>>
>>         ret = xa_alloc(&job->dependencies, &id, fence, xa_limit_32b, 
>> GFP_KERNEL);
>>         if (ret != 0)
>>                 dma_fence_put(fence);   <---- [1]
>>
>>         return ret;
>> }
>>
>>
>> int drm_sched_job_add_implicit_dependencies(struct drm_sched_job *job,
>>                                             struct drm_gem_object *obj,
>>                                             bool write)
>> {
>>         struct dma_resv_iter cursor;
>>         struct dma_fence *fence;
>>         int ret;
>>
>>         dma_resv_for_each_fence(&cursor, obj->resv, write, fence) {
>>                 /* Make sure to grab an additional ref on the added 
>> fence */
>>                 dma_fence_get(fence);   <---- [2]
>>                 ret = drm_sched_job_add_dependency(job, fence);
>>                 if (ret) {      <---- [6]
>>                         dma_fence_put(fence);   <---- [3]
>>
>>                         return ret;
>>                 }
>>         }
>>         return 0;
>> }
>>
>> Thanks,
>> hangyu
>>
>>>
>>>>
>>>>
>>>> int drm_sched_job_add_dependency(struct drm_sched_job *job,
>>>>                  struct dma_fence *fence)
>>>> {
>>>>     ...
>>>>     ret = xa_alloc(&job->dependencies, &id, fence, xa_limit_32b, 
>>>> GFP_KERNEL);
>>>>     if (ret != 0)
>>>>         dma_fence_put(fence);    <--- [1]
>>>>
>>>>     return ret;
>>>> }
>>>> EXPORT_SYMBOL(drm_sched_job_add_dependency);
>>>>
>>>>
>>>> int drm_sched_job_add_implicit_dependencies(struct drm_sched_job *job,
>>>>                         struct drm_gem_object *obj,
>>>>                         bool write)
>>>> {
>>>>     struct dma_resv_iter cursor;
>>>>     struct dma_fence *fence;
>>>>     int ret;
>>>>
>>>>     dma_resv_for_each_fence(&cursor, obj->resv, write, fence) {
>>>>         /* Make sure to grab an additional ref on the added fence */
>>>>         dma_fence_get(fence);    <--- [2]
>>>>         ret = drm_sched_job_add_dependency(job, fence);
>>>>         if (ret) {
>>>>             dma_fence_put(fence);    <--- [3]
>>>>             return ret;
>>>>         }
>>>>     }
>>>>     return 0;
>>>> }
>>>>
>>>>
>>>>>
>>>>>> On the other hand, dma_fence_get() and dma_fence_put() are 
>>>>>> meaningless here if there is an extra dma_fence_get() because 
>>>>>> counter will not decrease to 0 during drm_sched_job_add_dependency().
>>>>>>
>>>>>> I check the call chain as follows:
>>>>>>
>>>>>> msm_ioctl_gem_submit()
>>>>>> -> submit_fence_sync()
>>>>>> -> drm_sched_job_add_implicit_dependencies()
>>>>>
>>>>>
>>>>> Can you maybe trace or print one such example of problematic 
>>>>> refcount that you are trying to fix ? I still don't see where is 
>>>>> the problem.
>>>>>
>>>>> Andrey
>>>>>
>>>>
>>>> I also wish I could. System logs can make this easy. But i don't 
>>>> have a corresponding GPU physical device. 
>>>> drm_sched_job_add_implicit_dependencies is only used in a few devices.
>>>>
>>>> Thanks.
>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>> Hangyu
>>>>>>
>>>>>>>
>>>>>>>>               return ret;
>>>>>>>>           }
>>>>>>>>       }


* Re: [PATCH] gpu: drm: remove redundant dma_fence_put() when drm_sched_job_add_dependency() fails
  2022-04-29  3:03               ` Hangyu Hua
@ 2022-04-29 16:34                 ` Andrey Grodzovsky
  0 siblings, 0 replies; 10+ messages in thread
From: Andrey Grodzovsky @ 2022-04-29 16:34 UTC (permalink / raw)
  To: Hangyu Hua, yuq825, airlied, daniel; +Cc: dri-devel, lima, linux-kernel

Now it's easy to see that it's redundant. It's a problematic design 
where the fence 'get' happens outside the job->dependencies array logic 
but the fence 'put' happens inside, which leads to confusion such as 
adding this 'put' in the first place. I will look into improving this if 
possible.
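
For reference, this is what the function looks like with the patch 
applied (condensed from sched_main.c as quoted above):

int drm_sched_job_add_implicit_dependencies(struct drm_sched_job *job,
					    struct drm_gem_object *obj,
					    bool write)
{
	struct dma_resv_iter cursor;
	struct dma_fence *fence;
	int ret;

	dma_resv_for_each_fence(&cursor, obj->resv, write, fence) {
		/* Grab an additional ref for the dependencies array;
		 * ownership passes to drm_sched_job_add_dependency(),
		 * which drops it itself on its failure path. */
		dma_fence_get(fence);
		ret = drm_sched_job_add_dependency(job, fence);
		if (ret)
			return ret;
	}
	return 0;
}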

Patch is Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>

Andrey

On 2022-04-28 23:03, Hangyu Hua wrote:
> If the fence is released in drm_sched_job_add_implicit_dependencies(), a 
> dangling pointer is left in obj->resv.
>
> Specific scenario:
> refcount = 1 after init, obj->resv->fence_excl = fence
> refcount = 1 before drm_sched_job_add_implicit_dependencies()
> refcount = 2 after [2]
> refcount = 1 after [1]
> refcount = 0 after [3] <--- fence released, but still referenced by obj->resv
>
> Thanks,
> Hangyu 


Thread overview: 10 messages
2022-04-25  8:36 [PATCH] gpu: drm: remove redundant dma_fence_put() when drm_sched_job_add_dependency() fails Hangyu Hua
2022-04-25 15:42 ` Andrey Grodzovsky
2022-04-26  2:54   ` Hangyu Hua
2022-04-26 14:48     ` Andrey Grodzovsky
2022-04-26 14:55     ` Andrey Grodzovsky
2022-04-27  2:31       ` Hangyu Hua
     [not found]         ` <65b6cc23-1a77-7df0-5768-f81cd03b6514@amd.com>
2022-04-28  8:56           ` Hangyu Hua
2022-04-28 15:27             ` Andrey Grodzovsky
2022-04-29  3:03               ` Hangyu Hua
2022-04-29 16:34                 ` Andrey Grodzovsky
