All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] drm/amdgpu: add the checking to avoid  NULL pointer dereference
@ 2018-11-22  6:56 Sharma, Deepak
       [not found] ` <20181122065523.5510-1-Deepak.Sharma-5C7GfCeVMHo@public.gmane.org>
  0 siblings, 1 reply; 17+ messages in thread
From: Sharma, Deepak @ 2018-11-22  6:56 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Zhou, David(ChunMing)
  Cc: Sharma, Deepak

when returned fence is not valid mostly due to userspace ignored
previous error causes NULL pointer dereference.

Signed-off-by: Deepak Sharma <Deepak.Sharma@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index 024dfbd87f11..14166cd8a12f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -1403,6 +1403,8 @@ static struct dma_fence *amdgpu_cs_get_fence(struct amdgpu_device *adev,
 
 	fence = amdgpu_ctx_get_fence(ctx, entity, user->seq_no);
 	amdgpu_ctx_put(ctx);
+	if(!fence)
+		return ERR_PTR(-EINVAL);
 
 	return fence;
 }
-- 
2.15.1

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH] drm/amdgpu: add the checking to avoid NULL pointer dereference
       [not found] ` <20181122065523.5510-1-Deepak.Sharma-5C7GfCeVMHo@public.gmane.org>
@ 2018-11-22  7:00   ` zhoucm1
  2018-11-22 11:25   ` Christian König
  1 sibling, 0 replies; 17+ messages in thread
From: zhoucm1 @ 2018-11-22  7:00 UTC (permalink / raw)
  To: Sharma, Deepak, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Zhou,
	David(ChunMing)



On 2018年11月22日 14:56, Sharma, Deepak wrote:
> when returned fence is not valid mostly due to userspace ignored
> previous error causes NULL pointer dereference.
>
> Signed-off-by: Deepak Sharma <Deepak.Sharma@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>

> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 ++
>   1 file changed, 2 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> index 024dfbd87f11..14166cd8a12f 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> @@ -1403,6 +1403,8 @@ static struct dma_fence *amdgpu_cs_get_fence(struct amdgpu_device *adev,
>   
>   	fence = amdgpu_ctx_get_fence(ctx, entity, user->seq_no);
>   	amdgpu_ctx_put(ctx);
> +	if(!fence)
> +		return ERR_PTR(-EINVAL);
>   
>   	return fence;
>   }

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] drm/amdgpu: add the checking to avoid NULL pointer dereference
       [not found] ` <20181122065523.5510-1-Deepak.Sharma-5C7GfCeVMHo@public.gmane.org>
  2018-11-22  7:00   ` zhoucm1
@ 2018-11-22 11:25   ` Christian König
       [not found]     ` <b8284caa-3f1d-5672-6317-36be83907e9d-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  1 sibling, 1 reply; 17+ messages in thread
From: Christian König @ 2018-11-22 11:25 UTC (permalink / raw)
  To: Sharma, Deepak, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Zhou,
	David(ChunMing)

Am 22.11.18 um 07:56 schrieb Sharma, Deepak:
> when returned fence is not valid mostly due to userspace ignored
> previous error causes NULL pointer dereference.

Again, this is clearly incorrect. The my other mails on the earlier patch.

If you have already pushed the patch then please revert.

Christian.

>
> Signed-off-by: Deepak Sharma <Deepak.Sharma@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 ++
>   1 file changed, 2 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> index 024dfbd87f11..14166cd8a12f 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> @@ -1403,6 +1403,8 @@ static struct dma_fence *amdgpu_cs_get_fence(struct amdgpu_device *adev,
>   
>   	fence = amdgpu_ctx_get_fence(ctx, entity, user->seq_no);
>   	amdgpu_ctx_put(ctx);
> +	if(!fence)
> +		return ERR_PTR(-EINVAL);
>   
>   	return fence;
>   }

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] drm/amdgpu: add the checking to avoid NULL pointer dereference
       [not found]     ` <b8284caa-3f1d-5672-6317-36be83907e9d-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2018-11-23 13:27       ` Chunming Zhou
       [not found]         ` <9350d13a-e745-f284-1c49-785fe523773d-5C7GfCeVMHo@public.gmane.org>
  0 siblings, 1 reply; 17+ messages in thread
From: Chunming Zhou @ 2018-11-23 13:27 UTC (permalink / raw)
  To: Koenig, Christian, Sharma, Deepak,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Zhou, David(ChunMing)



在 2018/11/22 19:25, Christian König 写道:
> Am 22.11.18 um 07:56 schrieb Sharma, Deepak:
>> when returned fence is not valid mostly due to userspace ignored
>> previous error causes NULL pointer dereference.
>
> Again, this is clearly incorrect. The my other mails on the earlier 
> patch.
Sorry for I didn't get your history, but looks from the patch itself, it 
is still a valid patch, isn't it?

-David
>
> If you have already pushed the patch then please revert.
>
> Christian.
>
>>
>> Signed-off-by: Deepak Sharma <Deepak.Sharma@amd.com>
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 ++
>>   1 file changed, 2 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>> index 024dfbd87f11..14166cd8a12f 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>> @@ -1403,6 +1403,8 @@ static struct dma_fence 
>> *amdgpu_cs_get_fence(struct amdgpu_device *adev,
>>         fence = amdgpu_ctx_get_fence(ctx, entity, user->seq_no);
>>       amdgpu_ctx_put(ctx);
>> +    if(!fence)
>> +        return ERR_PTR(-EINVAL);
>>         return fence;
>>   }
>

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] drm/amdgpu: add the checking to avoid NULL pointer dereference
       [not found]         ` <9350d13a-e745-f284-1c49-785fe523773d-5C7GfCeVMHo@public.gmane.org>
@ 2018-11-23 13:30           ` Koenig, Christian
       [not found]             ` <eae25687-23ce-d686-f5f0-deeb7c3d2e1c-5C7GfCeVMHo@public.gmane.org>
  0 siblings, 1 reply; 17+ messages in thread
From: Koenig, Christian @ 2018-11-23 13:30 UTC (permalink / raw)
  To: Zhou, David(ChunMing),
	Sharma, Deepak, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

Am 23.11.18 um 14:27 schrieb Zhou, David(ChunMing):
>
> 在 2018/11/22 19:25, Christian König 写道:
>> Am 22.11.18 um 07:56 schrieb Sharma, Deepak:
>>> when returned fence is not valid mostly due to userspace ignored
>>> previous error causes NULL pointer dereference.
>> Again, this is clearly incorrect. The my other mails on the earlier
>> patch.
> Sorry for I didn't get your history, but looks from the patch itself, it
> is still a valid patch, isn't it?

No, the semantic of amdgpu_ctx_get_fence() is that we return NULL when 
the fence is already signaled.

So this patch could totally break userspace because it changes the 
behavior when we try to sync to an already signaled fence.

If that patch was applied then please revert it immediately.

Christian.

>
> -David
>> If you have already pushed the patch then please revert.
>>
>> Christian.
>>
>>> Signed-off-by: Deepak Sharma <Deepak.Sharma@amd.com>
>>> ---
>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 ++
>>>    1 file changed, 2 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>> index 024dfbd87f11..14166cd8a12f 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>> @@ -1403,6 +1403,8 @@ static struct dma_fence
>>> *amdgpu_cs_get_fence(struct amdgpu_device *adev,
>>>          fence = amdgpu_ctx_get_fence(ctx, entity, user->seq_no);
>>>        amdgpu_ctx_put(ctx);
>>> +    if(!fence)
>>> +        return ERR_PTR(-EINVAL);
>>>          return fence;
>>>    }

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] drm/amdgpu: add the checking to avoid NULL pointer dereference
       [not found]             ` <eae25687-23ce-d686-f5f0-deeb7c3d2e1c-5C7GfCeVMHo@public.gmane.org>
@ 2018-11-23 14:10               ` Chunming Zhou
       [not found]                 ` <70a32625-7568-6253-6c5e-a72fe38f7986-5C7GfCeVMHo@public.gmane.org>
  0 siblings, 1 reply; 17+ messages in thread
From: Chunming Zhou @ 2018-11-23 14:10 UTC (permalink / raw)
  To: Koenig, Christian, Zhou, David(ChunMing),
	Sharma, Deepak, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

[-- Attachment #1: Type: text/plain, Size: 1801 bytes --]



在 2018/11/23 21:30, Koenig, Christian 写道:
> Am 23.11.18 um 14:27 schrieb Zhou, David(ChunMing):
>> 在 2018/11/22 19:25, Christian König 写道:
>>> Am 22.11.18 um 07:56 schrieb Sharma, Deepak:
>>>> when returned fence is not valid mostly due to userspace ignored
>>>> previous error causes NULL pointer dereference.
>>> Again, this is clearly incorrect. The my other mails on the earlier
>>> patch.
>> Sorry for I didn't get your history, but looks from the patch itself, it
>> is still a valid patch, isn't it?
> No, the semantic of amdgpu_ctx_get_fence() is that we return NULL when
> the fence is already signaled.
>
> So this patch could totally break userspace because it changes the
> behavior when we try to sync to an already signaled fence.
Ah, I got your meaning, how about attached patch?

-David
>
> If that patch was applied then please revert it immediately.
>
> Christian.
>
>> -David
>>> If you have already pushed the patch then please revert.
>>>
>>> Christian.
>>>
>>>> Signed-off-by: Deepak Sharma <Deepak.Sharma@amd.com>
>>>> ---
>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 ++
>>>>     1 file changed, 2 insertions(+)
>>>>
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>>> index 024dfbd87f11..14166cd8a12f 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>>> @@ -1403,6 +1403,8 @@ static struct dma_fence
>>>> *amdgpu_cs_get_fence(struct amdgpu_device *adev,
>>>>           fence = amdgpu_ctx_get_fence(ctx, entity, user->seq_no);
>>>>         amdgpu_ctx_put(ctx);
>>>> +    if(!fence)
>>>> +        return ERR_PTR(-EINVAL);
>>>>           return fence;
>>>>     }


[-- Attachment #2: 0001-drm-amdgpu-fix-signaled-fence-isn-t-handled.patch --]
[-- Type: text/plain, Size: 2164 bytes --]

From 3640a18c31e7b786129286615fcdf397e1142451 Mon Sep 17 00:00:00 2001
From: Chunming Zhou <david1.zhou@amd.com>
Date: Fri, 23 Nov 2018 22:05:19 +0800
Subject: [PATCH] drm/amdgpu: fix signaled fence isn't handled
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Signed-off-by: Chunming Zhou <david1.zhou@amd.com>
Cc: Sharma, Deepak <Deepak.Sharma@amd.com>
CC: Christian König <Christian.Koenig@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 3 +++
 drivers/gpu/drm/drm_syncobj.c          | 1 +
 include/drm/drm_syncobj.h              | 1 +
 3 files changed, 5 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index 6a823b58b3b8..e960f9864e9f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -1505,6 +1505,9 @@ int amdgpu_cs_fence_to_handle_ioctl(struct drm_device *dev, void *data,
 	if (IS_ERR(fence))
 		return PTR_ERR(fence);
 
+    /* that means fence was signaled */
+	if (!fence)
+		fence = drm_syncobj_get_stub_fence();
 	switch (info->in.what) {
 	case AMDGPU_FENCE_TO_HANDLE_GET_SYNCOBJ:
 		r = drm_syncobj_create(&syncobj, 0, fence);
diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c
index 5f2df10e51c3..e5621f80d501 100644
--- a/drivers/gpu/drm/drm_syncobj.c
+++ b/drivers/gpu/drm/drm_syncobj.c
@@ -97,6 +97,7 @@ struct dma_fence *drm_syncobj_get_stub_fence(void)
 
 	return dma_fence_get(&signaled_fence);
 }
+EXPORT_SYMBOL(drm_syncobj_get_stub_fence);
 /**
  * drm_syncobj_find - lookup and reference a sync object.
  * @file_private: drm file private pointer
diff --git a/include/drm/drm_syncobj.h b/include/drm/drm_syncobj.h
index 29244cbcd05e..93e9e9b159ab 100644
--- a/include/drm/drm_syncobj.h
+++ b/include/drm/drm_syncobj.h
@@ -147,5 +147,6 @@ int drm_syncobj_get_handle(struct drm_file *file_private,
 int drm_syncobj_get_fd(struct drm_syncobj *syncobj, int *p_fd);
 int drm_syncobj_search_fence(struct drm_syncobj *syncobj, u64 point, u64 flags,
 			     struct dma_fence **fence);
+struct dma_fence *drm_syncobj_get_stub_fence(void);
 
 #endif
-- 
2.17.1


[-- Attachment #3: Type: text/plain, Size: 154 bytes --]

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH] drm/amdgpu: add the checking to avoid NULL pointer dereference
       [not found]                 ` <70a32625-7568-6253-6c5e-a72fe38f7986-5C7GfCeVMHo@public.gmane.org>
@ 2018-11-23 18:10                   ` Koenig, Christian
       [not found]                     ` <138bad33-c604-c0c5-98cb-1ce71a432185-5C7GfCeVMHo@public.gmane.org>
  0 siblings, 1 reply; 17+ messages in thread
From: Koenig, Christian @ 2018-11-23 18:10 UTC (permalink / raw)
  To: Zhou, David(ChunMing),
	Sharma, Deepak, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

Am 23.11.18 um 15:10 schrieb Zhou, David(ChunMing):
>
> 在 2018/11/23 21:30, Koenig, Christian 写道:
>> Am 23.11.18 um 14:27 schrieb Zhou, David(ChunMing):
>>> 在 2018/11/22 19:25, Christian König 写道:
>>>> Am 22.11.18 um 07:56 schrieb Sharma, Deepak:
>>>>> when returned fence is not valid mostly due to userspace ignored
>>>>> previous error causes NULL pointer dereference.
>>>> Again, this is clearly incorrect. The my other mails on the earlier
>>>> patch.
>>> Sorry for I didn't get your history, but looks from the patch itself, it
>>> is still a valid patch, isn't it?
>> No, the semantic of amdgpu_ctx_get_fence() is that we return NULL when
>> the fence is already signaled.
>>
>> So this patch could totally break userspace because it changes the
>> behavior when we try to sync to an already signaled fence.
> Ah, I got your meaning, how about attached patch?

Yeah something like this, but I would just give the 
DRM_SYNCOBJ_CREATE_SIGNALED instead.

I mean that's what this flag is good for isn't it?

Christian.

>
> -David
>> If that patch was applied then please revert it immediately.
>>
>> Christian.
>>
>>> -David
>>>> If you have already pushed the patch then please revert.
>>>>
>>>> Christian.
>>>>
>>>>> Signed-off-by: Deepak Sharma <Deepak.Sharma@amd.com>
>>>>> ---
>>>>>      drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 ++
>>>>>      1 file changed, 2 insertions(+)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>>>> index 024dfbd87f11..14166cd8a12f 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>>>> @@ -1403,6 +1403,8 @@ static struct dma_fence
>>>>> *amdgpu_cs_get_fence(struct amdgpu_device *adev,
>>>>>            fence = amdgpu_ctx_get_fence(ctx, entity, user->seq_no);
>>>>>          amdgpu_ctx_put(ctx);
>>>>> +    if(!fence)
>>>>> +        return ERR_PTR(-EINVAL);
>>>>>            return fence;
>>>>>      }

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] drm/amdgpu: add the checking to avoid NULL pointer dereference
       [not found]                     ` <138bad33-c604-c0c5-98cb-1ce71a432185-5C7GfCeVMHo@public.gmane.org>
@ 2018-11-24  8:14                       ` Chunming Zhou
       [not found]                         ` <46007826-1a4a-40eb-5ee8-af2d0c93e04a-5C7GfCeVMHo@public.gmane.org>
  0 siblings, 1 reply; 17+ messages in thread
From: Chunming Zhou @ 2018-11-24  8:14 UTC (permalink / raw)
  To: Koenig, Christian, Zhou, David(ChunMing),
	Sharma, Deepak, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW



在 2018/11/24 2:10, Koenig, Christian 写道:
> Am 23.11.18 um 15:10 schrieb Zhou, David(ChunMing):
>> 在 2018/11/23 21:30, Koenig, Christian 写道:
>>> Am 23.11.18 um 14:27 schrieb Zhou, David(ChunMing):
>>>> 在 2018/11/22 19:25, Christian König 写道:
>>>>> Am 22.11.18 um 07:56 schrieb Sharma, Deepak:
>>>>>> when returned fence is not valid mostly due to userspace ignored
>>>>>> previous error causes NULL pointer dereference.
>>>>> Again, this is clearly incorrect. The my other mails on the earlier
>>>>> patch.
>>>> Sorry for I didn't get your history, but looks from the patch itself, it
>>>> is still a valid patch, isn't it?
>>> No, the semantic of amdgpu_ctx_get_fence() is that we return NULL when
>>> the fence is already signaled.
>>>
>>> So this patch could totally break userspace because it changes the
>>> behavior when we try to sync to an already signaled fence.
>> Ah, I got your meaning, how about attached patch?
> Yeah something like this, but I would just give the
> DRM_SYNCOBJ_CREATE_SIGNALED instead.
>
> I mean that's what this flag is good for isn't it?
Yeah, I give a flag initally when creating patch, but as you know, there 
is a swich case not be able to use that flag:

     case AMDGPU_FENCE_TO_HANDLE_GET_SYNC_FILE_FD:
         fd = get_unused_fd_flags(O_CLOEXEC);
         if (fd < 0) {
             dma_fence_put(fence);
             return fd;
         }

         sync_file = sync_file_create(fence);
         dma_fence_put(fence);
         if (!sync_file) {
             put_unused_fd(fd);
             return -ENOMEM;
         }

         fd_install(fd, sync_file->file);
         info->out.handle = fd;
         return 0;

So I change to stub fence instead.

-David
>
> Christian.
>
>> -David
>>> If that patch was applied then please revert it immediately.
>>>
>>> Christian.
>>>
>>>> -David
>>>>> If you have already pushed the patch then please revert.
>>>>>
>>>>> Christian.
>>>>>
>>>>>> Signed-off-by: Deepak Sharma <Deepak.Sharma@amd.com>
>>>>>> ---
>>>>>>       drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 ++
>>>>>>       1 file changed, 2 insertions(+)
>>>>>>
>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>>>>> index 024dfbd87f11..14166cd8a12f 100644
>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>>>>> @@ -1403,6 +1403,8 @@ static struct dma_fence
>>>>>> *amdgpu_cs_get_fence(struct amdgpu_device *adev,
>>>>>>             fence = amdgpu_ctx_get_fence(ctx, entity, user->seq_no);
>>>>>>           amdgpu_ctx_put(ctx);
>>>>>> +    if(!fence)
>>>>>> +        return ERR_PTR(-EINVAL);
>>>>>>             return fence;
>>>>>>       }

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 17+ messages in thread

* RE: [PATCH] drm/amdgpu: add the checking to avoid NULL pointer dereference
       [not found]                         ` <46007826-1a4a-40eb-5ee8-af2d0c93e04a-5C7GfCeVMHo@public.gmane.org>
@ 2018-11-26  1:59                           ` Sharma, Deepak
       [not found]                             ` <BY2PR12MB0694DBE621FC3BBC269E7D51E9D70-K//h7OWB4q7S9h1UXvMT+AdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
  0 siblings, 1 reply; 17+ messages in thread
From: Sharma, Deepak @ 2018-11-26  1:59 UTC (permalink / raw)
  To: Zhou, David(ChunMing),
	Koenig, Christian, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW



在 2018/11/24 2:10, Koenig, Christian 写道:
> Am 23.11.18 um 15:10 schrieb Zhou, David(ChunMing):
>> 在 2018/11/23 21:30, Koenig, Christian 写道:
>>> Am 23.11.18 um 14:27 schrieb Zhou, David(ChunMing):
>>>> 在 2018/11/22 19:25, Christian König 写道:
>>>>> Am 22.11.18 um 07:56 schrieb Sharma, Deepak:
>>>>>> when returned fence is not valid mostly due to userspace ignored 
>>>>>> previous error causes NULL pointer dereference.
>>>>> Again, this is clearly incorrect. The my other mails on the 
>>>>> earlier patch.
>>>> Sorry for I didn't get your history, but looks from the patch 
>>>> itself, it is still a valid patch, isn't it?
>>> No, the semantic of amdgpu_ctx_get_fence() is that we return NULL 
>>> when the fence is already signaled.
>>>
>>> So this patch could totally break userspace because it changes the 
>>> behavior when we try to sync to an already signaled fence.
>> Ah, I got your meaning, how about attached patch?
> Yeah something like this, but I would just give the 
> DRM_SYNCOBJ_CREATE_SIGNALED instead.
>
> I mean that's what this flag is good for isn't it?
Yeah, I give a flag initally when creating patch, but as you know, there is a swich case not be able to use that flag:

     case AMDGPU_FENCE_TO_HANDLE_GET_SYNC_FILE_FD:
         fd = get_unused_fd_flags(O_CLOEXEC);
         if (fd < 0) {
             dma_fence_put(fence);
             return fd;
         }

         sync_file = sync_file_create(fence);
         dma_fence_put(fence);
         if (!sync_file) {
             put_unused_fd(fd);
             return -ENOMEM;
         }

         fd_install(fd, sync_file->file);
         info->out.handle = fd;
         return 0;

So I change to stub fence instead.

-David

I have not applied this patch.
The issue was trying to address is when amdgpu_cs_ioctl() failed due to low memory (ENOMEM) but userspace chose to proceed and called amdgpu_cs_fence_to_handle_ioctl().
In amdgpu_cs_fence_to_handle_ioctl(), fence is null and later causing NULL pointer dereference, this patch was to avoid that and system panic
But I understand now that its a valid case retuning NULL if fence was already signaled but need to handle case so avoid kernel panic. Seems David patch should fix this, I will test it tomorrow. 

-Deepak
>
> Christian.
>
>> -David
>>> If that patch was applied then please revert it immediately.
>>>
>>> Christian.
>>>
>>>> -David
>>>>> If you have already pushed the patch then please revert.
>>>>>
>>>>> Christian.
>>>>>
>>>>>> Signed-off-by: Deepak Sharma <Deepak.Sharma@amd.com>
>>>>>> ---
>>>>>>       drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 ++
>>>>>>       1 file changed, 2 insertions(+)
>>>>>>
>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>>>>> index 024dfbd87f11..14166cd8a12f 100644
>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>>>>> @@ -1403,6 +1403,8 @@ static struct dma_fence 
>>>>>> *amdgpu_cs_get_fence(struct amdgpu_device *adev,
>>>>>>             fence = amdgpu_ctx_get_fence(ctx, entity, user->seq_no);
>>>>>>           amdgpu_ctx_put(ctx);
>>>>>> +    if(!fence)
>>>>>> +        return ERR_PTR(-EINVAL);
>>>>>>             return fence;
>>>>>>       }

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] drm/amdgpu: add the checking to avoid NULL pointer dereference
       [not found]                             ` <BY2PR12MB0694DBE621FC3BBC269E7D51E9D70-K//h7OWB4q7S9h1UXvMT+AdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
@ 2018-11-26  9:23                               ` Christian König
       [not found]                                 ` <38725288-c15f-58be-7d1a-422abe503a04-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  0 siblings, 1 reply; 17+ messages in thread
From: Christian König @ 2018-11-26  9:23 UTC (permalink / raw)
  To: Sharma, Deepak, Zhou, David(ChunMing),
	Koenig, Christian, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

Am 26.11.18 um 02:59 schrieb Sharma, Deepak:
>
> 在 2018/11/24 2:10, Koenig, Christian 写道:
>> Am 23.11.18 um 15:10 schrieb Zhou, David(ChunMing):
>>> 在 2018/11/23 21:30, Koenig, Christian 写道:
>>>> Am 23.11.18 um 14:27 schrieb Zhou, David(ChunMing):
>>>>> 在 2018/11/22 19:25, Christian König 写道:
>>>>>> Am 22.11.18 um 07:56 schrieb Sharma, Deepak:
>>>>>>> when returned fence is not valid mostly due to userspace ignored
>>>>>>> previous error causes NULL pointer dereference.
>>>>>> Again, this is clearly incorrect. The my other mails on the
>>>>>> earlier patch.
>>>>> Sorry for I didn't get your history, but looks from the patch
>>>>> itself, it is still a valid patch, isn't it?
>>>> No, the semantic of amdgpu_ctx_get_fence() is that we return NULL
>>>> when the fence is already signaled.
>>>>
>>>> So this patch could totally break userspace because it changes the
>>>> behavior when we try to sync to an already signaled fence.
>>> Ah, I got your meaning, how about attached patch?
>> Yeah something like this, but I would just give the
>> DRM_SYNCOBJ_CREATE_SIGNALED instead.
>>
>> I mean that's what this flag is good for isn't it?
> Yeah, I give a flag initally when creating patch, but as you know, there is a swich case not be able to use that flag:
>
>       case AMDGPU_FENCE_TO_HANDLE_GET_SYNC_FILE_FD:
>           fd = get_unused_fd_flags(O_CLOEXEC);
>           if (fd < 0) {
>               dma_fence_put(fence);
>               return fd;
>           }
>
>           sync_file = sync_file_create(fence);
>           dma_fence_put(fence);
>           if (!sync_file) {
>               put_unused_fd(fd);
>               return -ENOMEM;
>           }
>
>           fd_install(fd, sync_file->file);
>           info->out.handle = fd;
>           return 0;
>
> So I change to stub fence instead.

Yeah, I've missed that case. Not sure if the sync_file can deal with a 
NULL fence.

We should then probably move the stub fence function into 
dma_fence_stub.c under drivers/dma-buf to keep the stuff together.

>
> -David
>
> I have not applied this patch.
> The issue was trying to address is when amdgpu_cs_ioctl() failed due to low memory (ENOMEM) but userspace chose to proceed and called amdgpu_cs_fence_to_handle_ioctl().
> In amdgpu_cs_fence_to_handle_ioctl(), fence is null and later causing NULL pointer dereference, this patch was to avoid that and system panic
> But I understand now that its a valid case retuning NULL if fence was already signaled but need to handle case so avoid kernel panic. Seems David patch should fix this, I will test it tomorrow.

Mhm, but don't we bail out with an error if we ask for a failed command 
submission? If not that sounds like a bug as well.

Christian.

>
> -Deepak
>> Christian.
>>
>>> -David
>>>> If that patch was applied then please revert it immediately.
>>>>
>>>> Christian.
>>>>
>>>>> -David
>>>>>> If you have already pushed the patch then please revert.
>>>>>>
>>>>>> Christian.
>>>>>>
>>>>>>> Signed-off-by: Deepak Sharma <Deepak.Sharma@amd.com>
>>>>>>> ---
>>>>>>>        drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 ++
>>>>>>>        1 file changed, 2 insertions(+)
>>>>>>>
>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>>>>>> index 024dfbd87f11..14166cd8a12f 100644
>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>>>>>> @@ -1403,6 +1403,8 @@ static struct dma_fence
>>>>>>> *amdgpu_cs_get_fence(struct amdgpu_device *adev,
>>>>>>>              fence = amdgpu_ctx_get_fence(ctx, entity, user->seq_no);
>>>>>>>            amdgpu_ctx_put(ctx);
>>>>>>> +    if(!fence)
>>>>>>> +        return ERR_PTR(-EINVAL);
>>>>>>>              return fence;
>>>>>>>        }
> _______________________________________________
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 17+ messages in thread

* RE: [PATCH] drm/amdgpu: add the checking to avoid NULL pointer dereference
       [not found]                                 ` <38725288-c15f-58be-7d1a-422abe503a04-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2018-11-26  9:57                                   ` Zhou, David(ChunMing)
       [not found]                                     ` <BY1PR12MB0502AC0F928B517E35C58751B4D70-PicGAnIBOobrCwm+z9iKNgdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
  0 siblings, 1 reply; 17+ messages in thread
From: Zhou, David(ChunMing) @ 2018-11-26  9:57 UTC (permalink / raw)
  To: Koenig, Christian, Sharma, Deepak,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW



> -----Original Message-----
> From: Christian König <ckoenig.leichtzumerken@gmail.com>
> Sent: Monday, November 26, 2018 5:23 PM
> To: Sharma, Deepak <Deepak.Sharma@amd.com>; Zhou, David(ChunMing)
> <David1.Zhou@amd.com>; Koenig, Christian <Christian.Koenig@amd.com>;
> amd-gfx@lists.freedesktop.org
> Subject: Re: [PATCH] drm/amdgpu: add the checking to avoid NULL pointer
> dereference
> 
> Am 26.11.18 um 02:59 schrieb Sharma, Deepak:
> >
> > 在 2018/11/24 2:10, Koenig, Christian 写道:
> >> Am 23.11.18 um 15:10 schrieb Zhou, David(ChunMing):
> >>> 在 2018/11/23 21:30, Koenig, Christian 写道:
> >>>> Am 23.11.18 um 14:27 schrieb Zhou, David(ChunMing):
> >>>>> 在 2018/11/22 19:25, Christian König 写道:
> >>>>>> Am 22.11.18 um 07:56 schrieb Sharma, Deepak:
> >>>>>>> when returned fence is not valid mostly due to userspace ignored
> >>>>>>> previous error causes NULL pointer dereference.
> >>>>>> Again, this is clearly incorrect. The my other mails on the
> >>>>>> earlier patch.
> >>>>> Sorry for I didn't get your history, but looks from the patch
> >>>>> itself, it is still a valid patch, isn't it?
> >>>> No, the semantic of amdgpu_ctx_get_fence() is that we return NULL
> >>>> when the fence is already signaled.
> >>>>
> >>>> So this patch could totally break userspace because it changes the
> >>>> behavior when we try to sync to an already signaled fence.
> >>> Ah, I got your meaning, how about attached patch?
> >> Yeah something like this, but I would just give the
> >> DRM_SYNCOBJ_CREATE_SIGNALED instead.
> >>
> >> I mean that's what this flag is good for isn't it?
> > Yeah, I give a flag initally when creating patch, but as you know, there is a
> swich case not be able to use that flag:
> >
> >       case AMDGPU_FENCE_TO_HANDLE_GET_SYNC_FILE_FD:
> >           fd = get_unused_fd_flags(O_CLOEXEC);
> >           if (fd < 0) {
> >               dma_fence_put(fence);
> >               return fd;
> >           }
> >
> >           sync_file = sync_file_create(fence);
> >           dma_fence_put(fence);
> >           if (!sync_file) {
> >               put_unused_fd(fd);
> >               return -ENOMEM;
> >           }
> >
> >           fd_install(fd, sync_file->file);
> >           info->out.handle = fd;
> >           return 0;
> >
> > So I change to stub fence instead.
> 
> Yeah, I've missed that case. Not sure if the sync_file can deal with a NULL
> fence.
> 
> We should then probably move the stub fence function into
> dma_fence_stub.c under drivers/dma-buf to keep the stuff together.

Yes, you wrap it to review first with your stub fence, we can do it separately first.

-David 
> 
> >
> > -David
> >
> > I have not applied this patch.
> > The issue was trying to address is when amdgpu_cs_ioctl() failed due to
> low memory (ENOMEM) but userspace chose to proceed and called
> amdgpu_cs_fence_to_handle_ioctl().
> > In amdgpu_cs_fence_to_handle_ioctl(), fence is null and later causing
> > NULL pointer dereference, this patch was to avoid that and system panic
> But I understand now that its a valid case retuning NULL if fence was already
> signaled but need to handle case so avoid kernel panic. Seems David patch
> should fix this, I will test it tomorrow.
> 
> Mhm, but don't we bail out with an error if we ask for a failed command
> submission? If not that sounds like a bug as well.
> 
> Christian.
> 
> >
> > -Deepak
> >> Christian.
> >>
> >>> -David
> >>>> If that patch was applied then please revert it immediately.
> >>>>
> >>>> Christian.
> >>>>
> >>>>> -David
> >>>>>> If you have already pushed the patch then please revert.
> >>>>>>
> >>>>>> Christian.
> >>>>>>
> >>>>>>> Signed-off-by: Deepak Sharma <Deepak.Sharma@amd.com>
> >>>>>>> ---
> >>>>>>>        drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 ++
> >>>>>>>        1 file changed, 2 insertions(+)
> >>>>>>>
> >>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> >>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> >>>>>>> index 024dfbd87f11..14166cd8a12f 100644
> >>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> >>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> >>>>>>> @@ -1403,6 +1403,8 @@ static struct dma_fence
> >>>>>>> *amdgpu_cs_get_fence(struct amdgpu_device *adev,
> >>>>>>>              fence = amdgpu_ctx_get_fence(ctx, entity, user->seq_no);
> >>>>>>>            amdgpu_ctx_put(ctx);
> >>>>>>> +    if(!fence)
> >>>>>>> +        return ERR_PTR(-EINVAL);
> >>>>>>>              return fence;
> >>>>>>>        }
> > _______________________________________________
> > amd-gfx mailing list
> > amd-gfx@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/amd-gfx

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] drm/amdgpu: add the checking to avoid NULL pointer dereference
       [not found]                                     ` <BY1PR12MB0502AC0F928B517E35C58751B4D70-PicGAnIBOobrCwm+z9iKNgdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
@ 2018-11-27  0:40                                       ` Deepak Sharma
       [not found]                                         ` <9619e159-d47b-2078-974d-0d3b418f6d88-5C7GfCeVMHo@public.gmane.org>
  0 siblings, 1 reply; 17+ messages in thread
From: Deepak Sharma @ 2018-11-27  0:40 UTC (permalink / raw)
  To: Zhou, David(ChunMing),
	Koenig, Christian, Sharma, Deepak,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW



On 11/26/18 1:57 AM, Zhou, David(ChunMing) wrote:
> 
> 
>> -----Original Message-----
>> From: Christian König <ckoenig.leichtzumerken@gmail.com>
>> Sent: Monday, November 26, 2018 5:23 PM
>> To: Sharma, Deepak <Deepak.Sharma@amd.com>; Zhou, David(ChunMing)
>> <David1.Zhou@amd.com>; Koenig, Christian <Christian.Koenig@amd.com>;
>> amd-gfx@lists.freedesktop.org
>> Subject: Re: [PATCH] drm/amdgpu: add the checking to avoid NULL pointer
>> dereference
>>
>> Am 26.11.18 um 02:59 schrieb Sharma, Deepak:
>>>
>>> 在 2018/11/24 2:10, Koenig, Christian 写道:
>>>> Am 23.11.18 um 15:10 schrieb Zhou, David(ChunMing):
>>>>> 在 2018/11/23 21:30, Koenig, Christian 写道:
>>>>>> Am 23.11.18 um 14:27 schrieb Zhou, David(ChunMing):
>>>>>>> 在 2018/11/22 19:25, Christian König 写道:
>>>>>>>> Am 22.11.18 um 07:56 schrieb Sharma, Deepak:
>>>>>>>>> when returned fence is not valid mostly due to userspace ignored
>>>>>>>>> previous error causes NULL pointer dereference.
>>>>>>>> Again, this is clearly incorrect. The my other mails on the
>>>>>>>> earlier patch.
>>>>>>> Sorry for I didn't get your history, but looks from the patch
>>>>>>> itself, it is still a valid patch, isn't it?
>>>>>> No, the semantic of amdgpu_ctx_get_fence() is that we return NULL
>>>>>> when the fence is already signaled.
>>>>>>
>>>>>> So this patch could totally break userspace because it changes the
>>>>>> behavior when we try to sync to an already signaled fence.
>>>>> Ah, I got your meaning, how about attached patch?
>>>> Yeah something like this, but I would just give the
>>>> DRM_SYNCOBJ_CREATE_SIGNALED instead.
>>>>
>>>> I mean that's what this flag is good for isn't it?
>>> Yeah, I give a flag initally when creating patch, but as you know, there is a
>> swich case not be able to use that flag:
>>>
>>>        case AMDGPU_FENCE_TO_HANDLE_GET_SYNC_FILE_FD:
>>>            fd = get_unused_fd_flags(O_CLOEXEC);
>>>            if (fd < 0) {
>>>                dma_fence_put(fence);
>>>                return fd;
>>>            }
>>>
>>>            sync_file = sync_file_create(fence);
>>>            dma_fence_put(fence);
>>>            if (!sync_file) {
>>>                put_unused_fd(fd);
>>>                return -ENOMEM;
>>>            }
>>>
>>>            fd_install(fd, sync_file->file);
>>>            info->out.handle = fd;
>>>            return 0;
>>>
>>> So I change to stub fence instead.
>>
>> Yeah, I've missed that case. Not sure if the sync_file can deal with a NULL
>> fence.
>>
>> We should then probably move the stub fence function into
>> dma_fence_stub.c under drivers/dma-buf to keep the stuff together.
> 
> Yes, you wrap it to review first with your stub fence, we can do it separately first.
> 
> -David
>>
>>>
>>> -David
>>>
>>> I have not applied this patch.
>>> The issue was trying to address is when amdgpu_cs_ioctl() failed due to
>> low memory (ENOMEM) but userspace chose to proceed and called
>> amdgpu_cs_fence_to_handle_ioctl().
>>> In amdgpu_cs_fence_to_handle_ioctl(), fence is null and later causing
>>> NULL pointer dereference, this patch was to avoid that and system panic
>> But I understand now that its a valid case retuning NULL if fence was already
>> signaled but need to handle case so avoid kernel panic. Seems David patch
>> should fix this, I will test it tomorrow.
>>
>> Mhm, but don't we bail out with an error if we ask for a failed command
>> submission? If not that sounds like a bug as well.
>>
>> Christian.
>>
Where do we do that?
I see error
[drm:amdgpu_cs_ioctl] *ERROR* amdgpu_cs_list_validate(validated) failed.
[drm:amdgpu_cs_ioctl] *ERROR* Not enough memory for command submission!
BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
Did some more debugging,dma_fence_is_array() is causing NULL pointer 
dereference call through sync_file_ioctl.

Also I think changes in David patch cant be applied on 
amd-staging-drm-next, which all patches I should take to get it correctly?

>>>
>>> -Deepak
>>>> Christian.
>>>>
>>>>> -David
>>>>>> If that patch was applied then please revert it immediately.
>>>>>>
>>>>>> Christian.
>>>>>>
>>>>>>> -David
>>>>>>>> If you have already pushed the patch then please revert.
>>>>>>>>
>>>>>>>> Christian.
>>>>>>>>
>>>>>>>>> Signed-off-by: Deepak Sharma <Deepak.Sharma@amd.com>
>>>>>>>>> ---
>>>>>>>>>         drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 ++
>>>>>>>>>         1 file changed, 2 insertions(+)
>>>>>>>>>
>>>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>>>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>>>>>>>> index 024dfbd87f11..14166cd8a12f 100644
>>>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>>>>>>>> @@ -1403,6 +1403,8 @@ static struct dma_fence
>>>>>>>>> *amdgpu_cs_get_fence(struct amdgpu_device *adev,
>>>>>>>>>               fence = amdgpu_ctx_get_fence(ctx, entity, user->seq_no);
>>>>>>>>>             amdgpu_ctx_put(ctx);
>>>>>>>>> +    if(!fence)
>>>>>>>>> +        return ERR_PTR(-EINVAL);
>>>>>>>>>               return fence;
>>>>>>>>>         }
>>> _______________________________________________
>>> amd-gfx mailing list
>>> amd-gfx@lists.freedesktop.org
>>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
> 
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] drm/amdgpu: add the checking to avoid NULL pointer dereference
       [not found]                                         ` <9619e159-d47b-2078-974d-0d3b418f6d88-5C7GfCeVMHo@public.gmane.org>
@ 2018-11-27  2:40                                           ` zhoucm1
       [not found]                                             ` <174e311d-3214-1e70-c5d4-8b955b926a58-5C7GfCeVMHo@public.gmane.org>
  2018-11-27  9:52                                           ` Christian König
  1 sibling, 1 reply; 17+ messages in thread
From: zhoucm1 @ 2018-11-27  2:40 UTC (permalink / raw)
  To: Sharma, Deepak, Zhou, David(ChunMing),
	Koenig, Christian, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

[-- Attachment #1: Type: text/plain, Size: 5825 bytes --]

Yeah, you need another drm patch as well when you apply my patch. Attached.

-David


On 2018年11月27日 08:40, Sharma, Deepak wrote:
>
> On 11/26/18 1:57 AM, Zhou, David(ChunMing) wrote:
>>
>>> -----Original Message-----
>>> From: Christian König <ckoenig.leichtzumerken-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
>>> Sent: Monday, November 26, 2018 5:23 PM
>>> To: Sharma, Deepak <Deepak.Sharma-5C7GfCeVMHo@public.gmane.org>; Zhou, David(ChunMing)
>>> <David1.Zhou-5C7GfCeVMHo@public.gmane.org>; Koenig, Christian <Christian.Koenig-5C7GfCeVMHo@public.gmane.org>;
>>> amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org
>>> Subject: Re: [PATCH] drm/amdgpu: add the checking to avoid NULL pointer
>>> dereference
>>>
>>> Am 26.11.18 um 02:59 schrieb Sharma, Deepak:
>>>> 在 2018/11/24 2:10, Koenig, Christian 写道:
>>>>> Am 23.11.18 um 15:10 schrieb Zhou, David(ChunMing):
>>>>>> 在 2018/11/23 21:30, Koenig, Christian 写道:
>>>>>>> Am 23.11.18 um 14:27 schrieb Zhou, David(ChunMing):
>>>>>>>> 在 2018/11/22 19:25, Christian König 写道:
>>>>>>>>> Am 22.11.18 um 07:56 schrieb Sharma, Deepak:
>>>>>>>>>> when returned fence is not valid mostly due to userspace ignored
>>>>>>>>>> previous error causes NULL pointer dereference.
>>>>>>>>> Again, this is clearly incorrect. The my other mails on the
>>>>>>>>> earlier patch.
>>>>>>>> Sorry for I didn't get your history, but looks from the patch
>>>>>>>> itself, it is still a valid patch, isn't it?
>>>>>>> No, the semantic of amdgpu_ctx_get_fence() is that we return NULL
>>>>>>> when the fence is already signaled.
>>>>>>>
>>>>>>> So this patch could totally break userspace because it changes the
>>>>>>> behavior when we try to sync to an already signaled fence.
>>>>>> Ah, I got your meaning, how about attached patch?
>>>>> Yeah something like this, but I would just give the
>>>>> DRM_SYNCOBJ_CREATE_SIGNALED instead.
>>>>>
>>>>> I mean that's what this flag is good for isn't it?
>>>> Yeah, I give a flag initally when creating patch, but as you know, there is a
>>> swich case not be able to use that flag:
>>>>         case AMDGPU_FENCE_TO_HANDLE_GET_SYNC_FILE_FD:
>>>>             fd = get_unused_fd_flags(O_CLOEXEC);
>>>>             if (fd < 0) {
>>>>                 dma_fence_put(fence);
>>>>                 return fd;
>>>>             }
>>>>
>>>>             sync_file = sync_file_create(fence);
>>>>             dma_fence_put(fence);
>>>>             if (!sync_file) {
>>>>                 put_unused_fd(fd);
>>>>                 return -ENOMEM;
>>>>             }
>>>>
>>>>             fd_install(fd, sync_file->file);
>>>>             info->out.handle = fd;
>>>>             return 0;
>>>>
>>>> So I change to stub fence instead.
>>> Yeah, I've missed that case. Not sure if the sync_file can deal with a NULL
>>> fence.
>>>
>>> We should then probably move the stub fence function into
>>> dma_fence_stub.c under drivers/dma-buf to keep the stuff together.
>> Yes, you wrap it to review first with your stub fence, we can do it separately first.
>>
>> -David
>>>> -David
>>>>
>>>> I have not applied this patch.
>>>> The issue was trying to address is when amdgpu_cs_ioctl() failed due to
>>> low memory (ENOMEM) but userspace chose to proceed and called
>>> amdgpu_cs_fence_to_handle_ioctl().
>>>> In amdgpu_cs_fence_to_handle_ioctl(), fence is null and later causing
>>>> NULL pointer dereference, this patch was to avoid that and system panic
>>> But I understand now that its a valid case retuning NULL if fence was already
>>> signaled but need to handle case so avoid kernel panic. Seems David patch
>>> should fix this, I will test it tomorrow.
>>>
>>> Mhm, but don't we bail out with an error if we ask for a failed command
>>> submission? If not that sounds like a bug as well.
>>>
>>> Christian.
>>>
> Where do we do that?
> I see error
> [drm:amdgpu_cs_ioctl] *ERROR* amdgpu_cs_list_validate(validated) failed.
> [drm:amdgpu_cs_ioctl] *ERROR* Not enough memory for command submission!
> BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
> Did some more debugging,dma_fence_is_array() is causing NULL pointer
> dereference call through sync_file_ioctl.
>
> Also I think changes in David patch cant be applied on
> amd-staging-drm-next, which all patches I should take to get it correctly?
>
>>>> -Deepak
>>>>> Christian.
>>>>>
>>>>>> -David
>>>>>>> If that patch was applied then please revert it immediately.
>>>>>>>
>>>>>>> Christian.
>>>>>>>
>>>>>>>> -David
>>>>>>>>> If you have already pushed the patch then please revert.
>>>>>>>>>
>>>>>>>>> Christian.
>>>>>>>>>
>>>>>>>>>> Signed-off-by: Deepak Sharma <Deepak.Sharma-5C7GfCeVMHo@public.gmane.org>
>>>>>>>>>> ---
>>>>>>>>>>          drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 ++
>>>>>>>>>>          1 file changed, 2 insertions(+)
>>>>>>>>>>
>>>>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>>>>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>>>>>>>>> index 024dfbd87f11..14166cd8a12f 100644
>>>>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>>>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>>>>>>>>> @@ -1403,6 +1403,8 @@ static struct dma_fence
>>>>>>>>>> *amdgpu_cs_get_fence(struct amdgpu_device *adev,
>>>>>>>>>>                fence = amdgpu_ctx_get_fence(ctx, entity, user->seq_no);
>>>>>>>>>>              amdgpu_ctx_put(ctx);
>>>>>>>>>> +    if(!fence)
>>>>>>>>>> +        return ERR_PTR(-EINVAL);
>>>>>>>>>>                return fence;
>>>>>>>>>>          }
>>>> _______________________________________________
>>>> amd-gfx mailing list
>>>> amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org
>>>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: drm_syncobj: use only a single stub fence.patch --]
[-- Type: text/x-patch; name="drm_syncobj: use only a single stub fence.patch", Size: 19623 bytes --]

Received: from SN1PR12MB0509.namprd12.prod.outlook.com (2603:10b6:a02:a8::31)
 by BY1PR12MB0502.namprd12.prod.outlook.com with HTTPS via
 BYAPR03CA0018.NAMPRD03.PROD.OUTLOOK.COM; Thu, 15 Nov 2018 11:12:57 +0000
Received: from MWHPR12CA0057.namprd12.prod.outlook.com (2603:10b6:300:103::19)
 by SN1PR12MB0509.namprd12.prod.outlook.com (2a01:111:e400:5866::25) with
 Microsoft SMTP Server (version=TLS1_2,
 cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.1294.30; Thu, 15 Nov
 2018 11:12:55 +0000
Received: from BY2NAM03FT050.eop-NAM03.prod.protection.outlook.com
 (2a01:111:f400:7e4a::207) by MWHPR12CA0057.outlook.office365.com
 (2603:10b6:300:103::19) with Microsoft SMTP Server (version=TLS1_2,
 cipher=TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384) id 15.20.1294.26 via Frontend
 Transport; Thu, 15 Nov 2018 11:12:55 +0000
Authentication-Results: spf=pass (sender IP is 209.85.128.65)
 smtp.mailfrom=gmail.com; amd.com; dkim=pass (signature was verified)
 header.d=gmail.com;amd.com; dmarc=pass action=none
 header.from=gmail.com;compauth=pass reason=100
Received-SPF: Pass (protection.outlook.com: domain of gmail.com designates
 209.85.128.65 as permitted sender) receiver=protection.outlook.com;
 client-ip=209.85.128.65; helo=mail-wm1-f65.google.com;
Received: from mail-wm1-f65.google.com (209.85.128.65) by
 BY2NAM03FT050.mail.protection.outlook.com (10.152.85.137) with Microsoft SMTP
 Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA) id
 15.20.1339.10 via Frontend Transport; Thu, 15 Nov 2018 11:12:54 +0000
Received: by mail-wm1-f65.google.com with SMTP id r11-v6so18383279wmb.2
        for <David1.Zhou-5C7GfCeVMHo@public.gmane.org>; Thu, 15 Nov 2018 03:12:54 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=gmail.com; s=20161025;
        h=from:to:cc:subject:date:message-id:in-reply-to:references
         :mime-version:content-transfer-encoding;
        bh=UZ6p6tJtd2xYUOdw9+8vYF7sv90wvnT6I3AGPXZMrqU=;
        b=A6PG5sRgKxj6Gmja+v6mmeraXrBO4FtkWgIw9EIO7Ha9AWGj6pLKPY9cH+CdQT/sV2
         ayhpxgtaoQsIDOuy+DeFTVusKjxUG5/DzYV/Sw4EfAJF2/4Ere/65s3P325pcx06m1Iq
         HASqSFxh6ZzfOfWC/0SUW0bbk01hytKj2MawFuJAy0Khk3UC+HNy1xbOmCxIrdWex5GM
         dxHmBIQHllde1+HKso4UFL/5nHQ8JgycwYuOX/FnAHm6kUGorAw96Y9CPHnZnpIcngHe
         JqDuE0dCbpF74fbolQFOlWtAayXhgYiUbuvjadyHzV6mkhbQTo9T8k743dIxdiKsgg2+
         OcYA==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=1e100.net; s=20161025;
        h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to
         :references:mime-version:content-transfer-encoding;
        bh=UZ6p6tJtd2xYUOdw9+8vYF7sv90wvnT6I3AGPXZMrqU=;
        b=YkWxM644RKHsJHMUjEk1Cm85+T3r1PVeDjkbkziWSW9EmhYWP6vPEI2n8Rrg9Ae6pA
         2CGhTFfPEgeccrbRvGzO5FjTUWGILzCWVXOYOIZWasW4FAQBzAfjKsBtbzKlREaATgJ0
         hUHDocGPmJ0ug218dU46E3zAp9/zXcUP3NoxRZEnWmK/rG5h09IM0UQmiJTwW3V55tN2
         h9Tdw9+8DJCHY6Msrui2L3Wkh4/FWGz6+hZqahj5AvBhS6ulGmbW0z+eZm3pw10LL0si
         ulhu/tSgxZr8rpayMn74JRI+heJC2pPt0N3gO876YGvAziLY1QoCQgwCVRNHVrKXEroY
         Zwqw==
X-Gm-Message-State: AGRZ1gIs95+B9ZioOciQ/d8+4w/mpAf9p+TmRjU1W3P38ii5Vkmy4FP3
	JgBjvopgwu5stBPZdXaOThI=
X-Google-Smtp-Source: AJdET5cZxnF54rSC+TGsbDDyNh3dr69d/7VsdUw6jIC2SreyzuuIzf7VQTtx7Gr57NJXmrE8TktYWQ==
X-Received: by 2002:a1c:13d2:: with SMTP id 201-v6mr4734749wmt.58.1542280373550;
        Thu, 15 Nov 2018 03:12:53 -0800 (PST)
Return-Path: ckoenig.leichtzumerken-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org
Received: from baker.fritz.box ([2a02:908:1257:4460:683c:8090:4142:4ae3])
        by smtp.gmail.com with ESMTPSA id s66sm4935428wmf.34.2018.11.15.03.12.52
        (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);
        Thu, 15 Nov 2018 03:12:52 -0800 (PST)
From: "=?UTF-8?q?Christian=20K=C3=B6nig?=" <ckoenig.leichtzumerken-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
X-Google-Original-From: =?UTF-8?q?Christian=20K=C3=B6nig?= <christian.koenig-5C7GfCeVMHo@public.gmane.org>
To: dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org
Cc: chris-Y6uKTt2uX1cEflXRtASbqLVCufUGDwFn@public.gmane.org,
	daniel.vetter-/w4YWyX8dFk@public.gmane.org,
	eric-WhKQ6XTQaPysTnJN9+BGXg@public.gmane.org,
	David1.Zhou-5C7GfCeVMHo@public.gmane.org
Subject: [PATCH 4/7] drm/syncobj: use only a single stub fence
Date: Thu, 15 Nov 2018 12:12:42 +0100
Message-Id: <20181115111245.30161-5-christian.koenig-5C7GfCeVMHo@public.gmane.org>
X-Mailer: git-send-email 2.14.1
In-Reply-To: <20181115111245.30161-1-christian.koenig-5C7GfCeVMHo@public.gmane.org>
References: <20181115111245.30161-1-christian.koenig-5C7GfCeVMHo@public.gmane.org>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 8bit
X-MS-Exchange-Organization-ExpirationStartTime: 15 Nov 2018 11:12:54.8996
 (UTC)
X-MS-Exchange-Organization-ExpirationStartTimeReason: OriginalSubmit
X-MS-Exchange-Organization-ExpirationInterval: 2:00:00:00.0000000
X-MS-Exchange-Organization-ExpirationIntervalReason: OriginalSubmit
X-MS-Exchange-Organization-Network-Message-Id:
 87af999b-85f7-443c-c4ee-08d64aeb4a9b
X-EOPAttributedMessage: 0
X-EOPTenantAttributedMessage: 3dd8961f-e488-4e60-8e11-a82d994e183d:0
X-MS-Exchange-Organization-MessageDirectionality: Incoming
X-MS-Office365-Filtering-HT: Tenant
X-Forefront-Antispam-Report:
 CIP:209.85.128.65;IPV:NLI;CTRY:US;EFV:NLI;SFV:NSPM;SFS:(8156002)(2980300002)(438002)(189003)(199004)(336012)(66574009)(6666004)(356004)(73392003)(83322999)(50226002)(1096003)(5660300001)(26005)(82202002)(59536001)(2870700001)(76482006)(1076002)(73972006)(2351001)(14444005)(4326008)(2361001)(95326003)(53416004)(106466001)(87572001)(60616004)(5820100001)(93516011)(76176011)(33896004)(47776003)(2160300002)(305945005)(7636002)(34206002)(23676004)(60626007)(34756004)(7596002)(86362001)(8676002)(106002)(246002)(50466002)(446003)(476003)(486006)(9686003)(426003)(126002)(36756003)(11346002)(16003);DIR:INB;SFP:;SCL:1;SRVR:SN1PR12MB0509;H:mail-wm1-f65.google.com;FPR:;SPF:Pass;LANG:en;PTR:mail-wm1-f65.google.com;MX:1;A:1;
X-Microsoft-Exchange-Diagnostics:
 1;BY2NAM03FT050;1:1+TcSGNn67EepEIovtFbsVyy8YXJK2BaVZFlDwszS8R0ny8sv2gS8x97QUs+QmOQqh+2xU4DT2Kk1kzyvRd1YMME7PefuVMURpSuAeK1J19jF+BSoMdQ8yD0yqUnB3i/
X-MS-Exchange-Organization-AuthSource:
 BY2NAM03FT050.eop-NAM03.prod.protection.outlook.com
X-MS-Exchange-Organization-AuthAs: Anonymous
X-MS-PublicTrafficType: Email
X-MS-Office365-Filtering-Correlation-Id: 87af999b-85f7-443c-c4ee-08d64aeb4a9b
X-Microsoft-Antispam:
 BCL:0;PCL:0;RULEID:(2390098)(7020095)(4652040)(5600074)(711020)(4605076)(4608076)(4614076)(1401272)(8001031)(1421009)(1402068)(71702078);SRVR:SN1PR12MB0509;
X-Microsoft-Exchange-Diagnostics:
 1;SN1PR12MB0509;3:xqFPVoPjIcuuAzsNe/O2gvPfGGIwAGboqqAVDOaNt2ufWwYzK4a2IC+gJqL8a/zgn+uWWC0BCqN3woltt08f3mgZ5C6yTzU0STbRNyDAQN7Cfg00KGzrPd3B2SLEpDlOeuHQ1jmVyntAkpoB+SA0jPqQVhYp9nfH3SNEe/5Rg8yiQEMYvt/JLfInTVlidlTWSEumM4H8o2UoPlZzQ5j6todxk2JvRtRDY3hP9XeLiU9hbKtMAVXWdmad89LQAJ/q+IlTUhhoblZM3W2BWH6Pd5sP54JySFAZRjWaMEjgkAfuiJ1Qfnwl3VjcyBmkNjRkDUDwJaIGeHcRLTLCissR45NipnXi98ZZ5idzLKUcGFc=;25:Hx1IYCh1SyP7ya5cNi/773pp+8dO3hl5nMUq6/AwefXCBVJV463RQA5B/vUzhYsV22anpXHLGE6F04Vk+EwBVJTpKB3yNxXcvrvpL4DgyGz1JYaYg9/xcTGZ0IbD1m6qtAbG6Whpf8mNb5+gqMoPmNa+uyPzOeub27vNd2G3h6IRs5nNLFmpClIX89TvIyWtIJAb5FncMOlzN7o3RMbk0Jzd+7vupevEEPb45QtXM7eQUK0TSqL6fHxXMGHZJ3QBfIJVvUHMyEf6Cva13D6kzymQTfxuVforKZlitYojlr8qB6EBtSuO8yAt2nYXFzE5dPDg5UT0SNrWq200rVs6oCaBG6QaJL00rHB/zP5/02g=
X-MS-TrafficTypeDiagnostic: SN1PR12MB0509:
X-Microsoft-Exchange-Diagnostics:
 1;SN1PR12MB0509;31:9Vubw+JCtk/3yApdzJWHAC9mymdwNDbs1weUChZkNhRcx3K0XELy7/tbles71vMfOnlBGmqtMTZfALcK0NSqP3FIe1GO4SrxA/+DQZeeRAeMibzTrChV2612nXUa8DW/XoLzcuUb48CU1vynQyLcAZnj98h+mDCmjOMHLE6SWdkMoKPUWriFjAaUB+xM6K2WAtGq7lFpopGnnMvzBFc2dsBXz6LHDdEqUnrpmD5BLBY=;20:EiDTlzXvyXeDlc9VWHiWArvyzsCQJ+uCSa/acbYMKJOWhWyKzynw30VP2Ry59Lzo7/UjGT3xc9FXiDDojmatYZXnwzw5QRp3OtUFjIzaQjg9uMdZLIo1nvfy502PxUt473HFcQskXgGqDcDFIFXoYScpZ9q8miEGFC208y5IuianQyINEbOD1TTRN+rfuwzQb4Aldh/lrALfG3DceDADasgtxC7yT+ebWJfSGSX0thtI5v8wAm/Ldx8oI6YOn5QlZowBd9FAUd9jmTh2uiRVQOqGOSJSUmBfwiCCP3MaZiCVv8mtgccPwYslFi+ShSrwcjutDpgGZKx5raOcO0VciKzNVfiSufHYBzdFno2buSb6X0BUxMZ4pBzXrFcYz71O6c+qOyFKTYVJzSr3kApWhuIBhNvySnLFStbvlNB1uJP/8nLQ0GeChRI5quTGjN+svYUkK/GXDrjeBlg89LnksHkFoQHiBHuwZuRSk2tKggw+UYnych4hvdiR5NL99rW0
X-Exchange-Antispam-Report-Test: UriScan:(767451399110);
X-Exchange-Antispam-Report-CFA-Test:
 BCL:0;PCL:0;RULEID:(8211001083)(2018062399030)(2018011200283)(2401047)(8121501046)(52410047)(2018011210174)(2018011211064)(2018011212028)(2018011213028)(2018011214028)(2018011215028)(2018011216028)(2018011217028)(2018011218028)(2018011219092)(2018011220252)(2018011221063)(2018011222027)(2018011223027)(2018011224027)(2018011225035)(2018011229035)(2018011232269)(2018011233052)(2018021202149)(98810176)(2018021203149)(98815176)(823300264)(823350442)(823411253)(9101536074)(10201501046)(3002001)(3231415)(901025)(902075)(913088)(7045084)(9312024)(9311024)(944501410)(9300000249)(9301004277)(52103095)(52102170)(52105112)(52106170)(52408095)(98821027)(98822027)(52401380)(52601095)(52505095)(52406095)(52305095)(52206095)(88860335)(1102011)(93006095)(93005095)(1610001)(8301001075)(8301003183)(148016)
 (201708071742011)(7699051)(76991095);SRVR:SN1PR12MB0509;BCL:0;PCL:0;RULEID:;SRVR:SN1PR12MB0509;
X-Microsoft-Exchange-Diagnostics:
 1;SN1PR12MB0509;4:YdmWqLu05obWz6hGdCH/kNkxGHtt/ZzZQLFIL6VMeEE58iwgbSs9UMoN52amRpXam+wyEznGajuKBspp5K/EiWTHtLIDd9h3EEP0wSELw30QNdBPdtvgIjVl58OxFkrpUdVa57F6Bu31HDLZPis+fr1g1NVGmtSC/tLrNmZPMjDYExd894RucTOEdNpjEhGqwqs16ELXt9jxXiu06h+drbshanLg6wuz58E+IktHu4QqsYEENRdJOxRBBMuVhK0j4VNDfYzEEwjG0+3DHU+zTsjsdGyRe7CNasWvHb7zY6zOE4lbl0g6I5THpBS8WGBX
X-MS-Exchange-Organization-SCL: 1
X-Microsoft-Exchange-Diagnostics:
 =?utf-8?B?MTtTTjFQUjEyTUIwNTA5OzIzOjh5a29va2tBL2JJREhObjdTbm1wWExMYTZQ?=
 =?utf-8?B?VVJNU1hMQzh0bS9uSzhvOXFDdDY3d1AvYmVEdVVWQkZkVVBEUkZiZ21rdkpV?=
 =?utf-8?B?T3VlSkErWlZYbmxSbThKTHpuY2l5T2RZQkh4KzJKK29iYXhHeVYwMEo1U2VM?=
 =?utf-8?B?V2g5VmhFRlRZN05UbGZpQWpPQXdGZnQ4Q0JmOFZoWmhOWkV2R3I0dW5vS2h3?=
 =?utf-8?B?ckJhdTIxUHdMOWF3UmRObEd0MXM4N2dWeGE1Tkg4R3lENGZnSHJ4bHJjdnR4?=
 =?utf-8?B?eThVeWlES2FNWnhSZjlpYlpWUGNuNkxsV0FxZjFWQ0dsNDZySWw0ZkQvWmh4?=
 =?utf-8?B?TTFnZXNUNmM0d2FUZkdXL2J3R21BOThndXgzaDFUVk1VOS9KNGRFdCtmbDVk?=
 =?utf-8?B?UmY4YUhINjJRNFdUODE5Lzk0cm9zRE5ZbkNmK0d4ekptNWJ6U25RT0ZxeEVW?=
 =?utf-8?B?eGt1MDBrY3A5bzB2VDNSeXVWMHNoLzRKZ2lYTE1WZUh4NXQzeDZQN0lLLzNI?=
 =?utf-8?B?ejdibTVPTStFSk14akxwVTM1aEgxZ3kvQVNSQnVWVjlmTjVLTGdaZnZOYUgz?=
 =?utf-8?B?aFQ2MTZpV2p5QjV1L1V4TGVDTGNURU5XVWtTZGcyQ3VubHdHWnVEcHJCWTh3?=
 =?utf-8?B?N2xjNSsyckhkM0JNZkFrN2RDTW8wQS8yQ1FIZklWdittRmtSMVdwY0Zvbm1P?=
 =?utf-8?B?dy9xd3FFSWRBcHdUSVhyUFVPZTVBNXBLVHNkSHNERUVtOE9QV3dXK0kzUzBq?=
 =?utf-8?B?ZFFBZTczMlBTbytZa2hHMFpSOE5JRjFZVVlST0pISFI2QkdoQ1IrN3lENit6?=
 =?utf-8?B?dncrbGd0Y3dTMVZ2K0dWaTdqNDR4SVpvK2R5MzVFRDFYdk5meTBXemhhZk10?=
 =?utf-8?B?ZmladDF0cjhVSzNQYWpGN083K2ZRc3JEZU5wU1VrZGsxUExjeDN4UnA0ZVVE?=
 =?utf-8?B?dXF4dVlFMFlxcEtKQ2RuWDVkN01ubmlIM3VrVTd0Y2xWcHNIUmxZM1dHMUl6?=
 =?utf-8?B?TXJJRktKZVJ4ZjlveEFsbDc4OHA0Wk9DdC9UR3VXMkpuT0JmZS9IY3RiMzUy?=
 =?utf-8?B?aWRIVXRkS2JEQUpsaXBmVFlwUEo1SEh2SEo0V3d2OUk0Sks5c1hEcHU4a2Fw?=
 =?utf-8?B?MHBEb3htTmNhYW9MeExBdlk0NmZnYXJOc1JTRmVBMGp0RjkySHpwVmxwQ3Fx?=
 =?utf-8?B?VmszTjNNWVZrNkdoTi8rLzFWb2dUVmE3U25nTnJmNk52aHhHbE1GYkpKMCtm?=
 =?utf-8?B?TytkVFhnYzQzT2hMWTNXcERCcVg0Y0RBanZCU2pPeVFxSEw1RmdtT0c3ZklL?=
 =?utf-8?B?VkM4TVZucUtuM2k0MmxYNy80YU1VNS9mSGdkdzBHclNsTlNCQ2k2Zmh5NGti?=
 =?utf-8?B?dW5DRDRCa2VkVDhYTUdXRWFSUS9QTmVjSUNoWTFYTHZ1NlIxSGRZT3hrZ1Nz?=
 =?utf-8?B?RE45a1Qrblk0WHBYWkd3WUpxYlFXeEc3bGJvbGU3TFJiMk5DcXdtc1B0ckRT?=
 =?utf-8?B?b0dIN1VyYytsWDM5bU9ScS9HQjcrVmxOcWlETzE3T0VYOURrU256QUVmaEhU?=
 =?utf-8?B?NFRkd2pEZ29QdXNPWVhtOVMza1I4M0J6ckJneHFyN1pQYnhqTEhaQThoRUxu?=
 =?utf-8?B?Y08zUnlrTlp5SXJKMXh6QmVocnhrWkJMSFkrbHFGZ2xDaHFpNlZsNjJJbnF4?=
 =?utf-8?Q?gCAdBZ3dDm1uA+Px+Lm+3rpf+5PDj2tqSmk1MKA?=
X-Microsoft-Antispam-Message-Info:
 4efs+mDWTmS4jG83cAFmHfDfpvHRQ+MDbXrF9yHWq0iTHv3pj0vB0P99K7zT+d/z3thuoHpG3lwybQ7eAnY8CTu1Lfysglz1o5EPqk++AwoJ/18xWeyL0QvIPyc0IGnCz2DCW2tS2pwvDwMcG4vTKPNAgrNcdAEtg815QuhptoYuIaoXSzCb9TbpbU0V9OwPJCuGEEeplaP+sahZLPzueA7DsgRpf5FlFTaO9/WcEIWaKrP8eV0kVcuVPr9jiHs0vv3UKZ4IjAcU3HCMX291ItLSFqAqkG7f7pwmzGt4coUT3O9K+/NrjCNgOWTg7q3MCBm8j78Vk3Xuafot5gCjXKDA4ynsddczz4MsRIzeFV5TlE4mvwdbgc3M9529va1g3UYofzJiSmQZ6UiXctf4Za8fvFkt1BMV2/FnUCR1C5t2He+nCIM3rpVJyryxQEOnR47S0+g1qH5b4XtLVM58g8GAvk8afDfWaIXCgGKueYm64G2wt6h8R4CaIJHxwonCcntHycZSw1YsLQzuXDq7qcBj2iYgFZLdFepnm600UP64B/8f1h0U1v4ZtYpSfZTXQ6/iiQlxBhyc/wjn9+fW59O8a6xAQcigxQzOX9Vh2gGDnB/t9tZ+6Eoj7CC/zdUUZArDZZYsRwXBLHvfb4FMzLqPZA5bHg5q4EMy1dpAfivolF7qCoBUh2Ps4Kcts7SPAgs0ufnz44tbLAVqFvah5ERcLVs+7OR/ZiKpAMCs4+hvgVxLCB70t4J6x5J6VlWkhwTessjjq3TdJ+eYrPZN/g==
X-Microsoft-Exchange-Diagnostics:
 1;SN1PR12MB0509;6:7hZ/nwJxa9EwTcV7qn8L5XXEv1O3C+oqIGh/wv5HNiico7mIE2krvTOl3JrACheOfnaFKDUhVo6CdPaZHzfOVDZIAHxrmL0vJpR0Ky90dc6cgi3TEZmf0qlMU5SS2iBLnhrEc6ZimWmrCpbeYsoz4bEQW4QvY+8QKsP90DKevNquxdNS5qrHORfFAbNO/MyHWmVR+uyH4GmPdP3N/GTh5/WxaQ8Bdt/lpqcB8kewH7ZXIGoIzSsoQWe+21W7gJ8CsKq6WB9zWedJmt06G2OKQfUzNn19z8W5hi6M9KwZK2HiYI5C0bk6FnX4fm8V1Fbf5lU3+kn0tILyPRyp2Is+sGQIdZIXbmymjbr1a5jpeqQ5OmTYxSADw3Fybv4FL7LCtAozgEPf+Y8w1cJIZhNEzeLjQqcV52XQDuvLYQVHs8+fIYi9Ur1y+C1WuiVZqhNNK9mg+d5e1u7ufc6IdEMJNpeyiYmsK82GfIZ+uHm9mVY=;5:zJXkyjjW5eWN401pqKio7pmAk8zEA8hZUDvMLt1WOvf+xYJWRpE35mKU5yG+K9u3SeNwrDixfeS1r8urSI6isjpP7oFnApUMA1vF5jmKgdDoiye2KAoFI4e2ZsauxEgdN7ih3Q5uz0Z4T14hIDo60akSKFvieTNP+qIB8u62fNs=;7:75ZRC5ff5LRKAsLKNoUKJZPaQ3qISy8YRm2tDTc88NnV9u7I6I1l5ojxI5gvUpGTSUUNzbA6AbevJKSnXxm11aP/H7tln3YMjGXa458MwJkmzyj
 NhAXmxiw40ihVYTI5S8snpdTVjtwRDwsfNDhrDg==
SpamDiagnosticOutput: 1:99
SpamDiagnosticMetadata: NSPM
X-Microsoft-Exchange-Diagnostics:
 1;SN1PR12MB0509;20:c7sOrZuugwABGtzPGP5TPPmoqynZUm0bWLT1XjlPVZIrrxk3CWKsSowxWszp9rIlh7wAlbdXHUeiVMuWWN+7CVyAkbvvIOj8BBbJmgtZTEQ5NvUYtQrysc+VNWBspd1oIHq3GBosnhnjuEr3IzWiLqf0IaCoBh600DWimckScNSY6gHl6HOpKHjFtSNu17bhoFQN3TEfmT7CwubmJySkCiVuHCjINwLx25HdBmw/xqyVcbOowViLsKvWubHQCXb7
X-MS-Exchange-CrossTenant-OriginalArrivalTime: 15 Nov 2018 11:12:54.8840
 (UTC)
X-MS-Exchange-CrossTenant-Network-Message-Id: 87af999b-85f7-443c-c4ee-08d64aeb4a9b
X-MS-Exchange-CrossTenant-Id: 3dd8961f-e488-4e60-8e11-a82d994e183d
X-MS-Exchange-CrossTenant-FromEntityHeader: Internet
X-MS-Exchange-Transport-CrossTenantHeadersStamped: SN1PR12MB0509
X-MS-Exchange-Transport-EndToEndLatency: 00:00:02.5029573
X-MS-Exchange-Processed-By-BccFoldering: 15.20.1294.024
X-Microsoft-Exchange-Diagnostics:
	1;BY1PR12MB0502;27:4sSKTLKSPltGOEjGXjVa1ydopI3I+jKdU0VeDfAh3L2IGQICpmfpyFy1dcdsB5YrbiZG7gq7PKO6beXZUa2aHT6ErcOgGRBB4uwDNkKhIopTrqTMCiEK+bIU9D2zYQJDZwSbUxFUUKaT2c7eYZEX3oCDJZhUMnzh5dIEwA2W6QHpUukzH+rNF0a6nz6C2FctYJwp+O1AxZW1gm9mSMIASfnwgwuvG5t8CtbzxHwMrnHQ8oESfI7M5sGXHn599RdQ6lOUjhZarYIRSwvtttyQa5MmeM6biOk0Kt+P8mBJzgSfGwqaVMISW7yC/pkOuuePzwSn78LZgsFlDEUUdUNyfUpx+gCqIkqRzYV7UT91DlBtyd/r9iLQrIkLHw+mw4u90O1tEVe7ceC5Ccz4Ni8eOw==
X-Microsoft-Antispam-Mailbox-Delivery:
	ucf:0;jmr:0;ex:0;auth:0;dest:I;ENG:(20160514016)(750119)(520011016)(944506303)(944626516);
X-Microsoft-Antispam-Message-Info:
	Bn3vIf3ubgUZ61FidwuOD2dbJsVFctSUigGyLnKuT94MXC/3hc09VjpmpM7E3f5Fqj1GD/w9GllTnCDJVdRXe2PIiSQlEb/hItyJlCn7SdYavEqH8NuIX5cG5gpEybjkwJohxacLKdjW5I9KRbNGLsPQlQxA9OsltJ2ysebV1Z+KCGDnI3+A8qq9yi+gQAfnFn+5yNfjhifeu4LvKE8YBdgjHOeohH19fX/oY/aX4uJKhBdRVEBJd625Si0lQnK8dS4w6wpVi4jCD39bv5lLXsNiU2R3N3a01TJFZODVfry3AXf09JIE5vgrz2pZI94NJ8kIq7L9OscoxR/EwlCfR+fVbm1ZwjUa9b1X9K95Z/GBv17Vis4ZDJAkIm8+S5ijhB/ZDwylMkhN26uvi+rcyGpQG8UHfc4Yaql6KecZoYU7bfafHZ/+iFXNxwoFg34fBtFBIY5WurY/A/5sapdxoq0r/+nBkPP2nJr/v2KmJAch79XFHMvwGjjHSmcn2vo7Jxd7n/I8X5/WYrM8aij0BjxE+kw2J0w5pZVp10t2e7TLyx5fMbqIQDxrgOxOfw1+613Txwz1Hw1/N5eQYWznJHElpF97q1hrTj0Kgn+LKj33WoEk3JKLHPZi7Q8wENqttrIcuCTWw2WCC2wDC/6Wx4qSUfRxTmy2UkJL3ohZVRw047CKV/wIJtg4NOzY2ThRW0jGZmxq582aAzrlgOlpxJ1oiAMSxvoCRQ/ioOfYcnTPvVTPlFcINC25ZmmRsxCb0ctrF8mC8Wd+pmc+H2oemrSW0jHG2s7
 gaaCztjIRocRvZob4f2LT/Ti3TdvxoMGGbr3g9DnBx2aaoSOb8v3qxA==
MIME-Version: 1.0

Extract of useful code from the timeline work. Let's use just a single
stub fence instance instead of allocating a new one all the time.

Signed-off-by: Chunming Zhou <david1.zhou-5C7GfCeVMHo@public.gmane.org>
Signed-off-by: Christian König <christian.koenig-5C7GfCeVMHo@public.gmane.org>
---
 drivers/gpu/drm/drm_syncobj.c | 67 ++++++++++++++++++++++---------------------
 1 file changed, 35 insertions(+), 32 deletions(-)

diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c
index f190414511ae..4c45acb326b9 100644
--- a/drivers/gpu/drm/drm_syncobj.c
+++ b/drivers/gpu/drm/drm_syncobj.c
@@ -56,10 +56,8 @@
 #include "drm_internal.h"
 #include <drm/drm_syncobj.h>
 
-struct drm_syncobj_stub_fence {
-	struct dma_fence base;
-	spinlock_t lock;
-};
+static DEFINE_SPINLOCK(stub_fence_lock);
+static struct dma_fence stub_fence;
 
 static const char *drm_syncobj_stub_fence_get_name(struct dma_fence *fence)
 {
@@ -71,6 +69,25 @@ static const struct dma_fence_ops drm_syncobj_stub_fence_ops = {
 	.get_timeline_name = drm_syncobj_stub_fence_get_name,
 };
 
+/**
+ * drm_syncobj_get_stub_fence - return a signaled fence
+ *
+ * Return a stub fence which is already signaled.
+ */
+static struct dma_fence *drm_syncobj_get_stub_fence(void)
+{
+	spin_lock(&stub_fence_lock);
+	if (!stub_fence.ops) {
+		dma_fence_init(&stub_fence,
+			       &drm_syncobj_stub_fence_ops,
+			       &stub_fence_lock,
+			       0, 0);
+		dma_fence_signal_locked(&stub_fence);
+	}
+	spin_unlock(&stub_fence_lock);
+
+	return dma_fence_get(&stub_fence);
+}
 
 /**
  * drm_syncobj_find - lookup and reference a sync object.
@@ -190,23 +207,18 @@ void drm_syncobj_replace_fence(struct drm_syncobj *syncobj,
 }
 EXPORT_SYMBOL(drm_syncobj_replace_fence);
 
-static int drm_syncobj_assign_null_handle(struct drm_syncobj *syncobj)
+/**
+ * drm_syncobj_assign_null_handle - assign a stub fence to the sync object
+ * @syncobj: sync object to assign the fence on
+ *
+ * Assign a already signaled stub fence to the sync object.
+ */
+static void drm_syncobj_assign_null_handle(struct drm_syncobj *syncobj)
 {
-	struct drm_syncobj_stub_fence *fence;
-	fence = kzalloc(sizeof(*fence), GFP_KERNEL);
-	if (fence == NULL)
-		return -ENOMEM;
+	struct dma_fence *fence = drm_syncobj_get_stub_fence();
 
-	spin_lock_init(&fence->lock);
-	dma_fence_init(&fence->base, &drm_syncobj_stub_fence_ops,
-		       &fence->lock, 0, 0);
-	dma_fence_signal(&fence->base);
-
-	drm_syncobj_replace_fence(syncobj, &fence->base);
-
-	dma_fence_put(&fence->base);
-
-	return 0;
+	drm_syncobj_replace_fence(syncobj, fence);
+	dma_fence_put(fence);
 }
 
 /**
@@ -273,7 +285,6 @@ EXPORT_SYMBOL(drm_syncobj_free);
 int drm_syncobj_create(struct drm_syncobj **out_syncobj, uint32_t flags,
 		       struct dma_fence *fence)
 {
-	int ret;
 	struct drm_syncobj *syncobj;
 
 	syncobj = kzalloc(sizeof(struct drm_syncobj), GFP_KERNEL);
@@ -284,13 +295,8 @@ int drm_syncobj_create(struct drm_syncobj **out_syncobj, uint32_t flags,
 	INIT_LIST_HEAD(&syncobj->cb_list);
 	spin_lock_init(&syncobj->lock);
 
-	if (flags & DRM_SYNCOBJ_CREATE_SIGNALED) {
-		ret = drm_syncobj_assign_null_handle(syncobj);
-		if (ret < 0) {
-			drm_syncobj_put(syncobj);
-			return ret;
-		}
-	}
+	if (flags & DRM_SYNCOBJ_CREATE_SIGNALED)
+		drm_syncobj_assign_null_handle(syncobj);
 
 	if (fence)
 		drm_syncobj_replace_fence(syncobj, fence);
@@ -984,11 +990,8 @@ drm_syncobj_signal_ioctl(struct drm_device *dev, void *data,
 	if (ret < 0)
 		return ret;
 
-	for (i = 0; i < args->count_handles; i++) {
-		ret = drm_syncobj_assign_null_handle(syncobjs[i]);
-		if (ret < 0)
-			break;
-	}
+	for (i = 0; i < args->count_handles; i++)
+		drm_syncobj_assign_null_handle(syncobjs[i]);
 
 	drm_syncobj_array_free(syncobjs, args->count_handles);
 
-- 
2.14.1


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #3: 0001-drm-amdgpu-fix-signaled-fence-isn-t-handled.patch --]
[-- Type: text/x-patch; name="0001-drm-amdgpu-fix-signaled-fence-isn-t-handled.patch", Size: 2249 bytes --]

>From 3640a18c31e7b786129286615fcdf397e1142451 Mon Sep 17 00:00:00 2001
From: Chunming Zhou <david1.zhou-5C7GfCeVMHo@public.gmane.org>
Date: Fri, 23 Nov 2018 22:05:19 +0800
Subject: [PATCH] drm/amdgpu: fix signaled fence isn't handled
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Signed-off-by: Chunming Zhou <david1.zhou-5C7GfCeVMHo@public.gmane.org>
Cc: Sharma, Deepak <Deepak.Sharma-5C7GfCeVMHo@public.gmane.org>
CC: Christian König <Christian.Koenig-5C7GfCeVMHo@public.gmane.org>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 3 +++
 drivers/gpu/drm/drm_syncobj.c          | 1 +
 include/drm/drm_syncobj.h              | 1 +
 3 files changed, 5 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index 6a823b58b3b8..e960f9864e9f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -1505,6 +1505,9 @@ int amdgpu_cs_fence_to_handle_ioctl(struct drm_device *dev, void *data,
 	if (IS_ERR(fence))
 		return PTR_ERR(fence);
 
+    /* that means fence was signaled */
+	if (!fence)
+		fence = drm_syncobj_get_stub_fence();
 	switch (info->in.what) {
 	case AMDGPU_FENCE_TO_HANDLE_GET_SYNCOBJ:
 		r = drm_syncobj_create(&syncobj, 0, fence);
diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c
index 5f2df10e51c3..e5621f80d501 100644
--- a/drivers/gpu/drm/drm_syncobj.c
+++ b/drivers/gpu/drm/drm_syncobj.c
@@ -97,6 +97,7 @@ struct dma_fence *drm_syncobj_get_stub_fence(void)
 
 	return dma_fence_get(&signaled_fence);
 }
+EXPORT_SYMBOL(drm_syncobj_get_stub_fence);
 /**
  * drm_syncobj_find - lookup and reference a sync object.
  * @file_private: drm file private pointer
diff --git a/include/drm/drm_syncobj.h b/include/drm/drm_syncobj.h
index 29244cbcd05e..93e9e9b159ab 100644
--- a/include/drm/drm_syncobj.h
+++ b/include/drm/drm_syncobj.h
@@ -147,5 +147,6 @@ int drm_syncobj_get_handle(struct drm_file *file_private,
 int drm_syncobj_get_fd(struct drm_syncobj *syncobj, int *p_fd);
 int drm_syncobj_search_fence(struct drm_syncobj *syncobj, u64 point, u64 flags,
 			     struct dma_fence **fence);
+struct dma_fence *drm_syncobj_get_stub_fence(void);
 
 #endif
-- 
2.17.1


[-- Attachment #4: Type: text/plain, Size: 154 bytes --]

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH] drm/amdgpu: add the checking to avoid NULL pointer dereference
       [not found]                                         ` <9619e159-d47b-2078-974d-0d3b418f6d88-5C7GfCeVMHo@public.gmane.org>
  2018-11-27  2:40                                           ` zhoucm1
@ 2018-11-27  9:52                                           ` Christian König
       [not found]                                             ` <869bef1b-26f4-0fff-fa33-2720c2114a60-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  1 sibling, 1 reply; 17+ messages in thread
From: Christian König @ 2018-11-27  9:52 UTC (permalink / raw)
  To: Deepak Sharma, Zhou, David(ChunMing),
	Koenig, Christian, Sharma, Deepak,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

Am 27.11.18 um 01:40 schrieb Deepak Sharma:
>
> On 11/26/18 1:57 AM, Zhou, David(ChunMing) wrote:
>>
>>> -----Original Message-----
>>> From: Christian König <ckoenig.leichtzumerken@gmail.com>
>>> Sent: Monday, November 26, 2018 5:23 PM
>>> To: Sharma, Deepak <Deepak.Sharma@amd.com>; Zhou, David(ChunMing)
>>> <David1.Zhou@amd.com>; Koenig, Christian <Christian.Koenig@amd.com>;
>>> amd-gfx@lists.freedesktop.org
>>> Subject: Re: [PATCH] drm/amdgpu: add the checking to avoid NULL pointer
>>> dereference
>>>
>>> Am 26.11.18 um 02:59 schrieb Sharma, Deepak:
>>>> 在 2018/11/24 2:10, Koenig, Christian 写道:
>>>>> Am 23.11.18 um 15:10 schrieb Zhou, David(ChunMing):
>>>>>> 在 2018/11/23 21:30, Koenig, Christian 写道:
>>>>>>> Am 23.11.18 um 14:27 schrieb Zhou, David(ChunMing):
>>>>>>>> 在 2018/11/22 19:25, Christian König 写道:
>>>>>>>>> Am 22.11.18 um 07:56 schrieb Sharma, Deepak:
>>>>>>>>>> when returned fence is not valid mostly due to userspace ignored
>>>>>>>>>> previous error causes NULL pointer dereference.
>>>>>>>>> Again, this is clearly incorrect. The my other mails on the
>>>>>>>>> earlier patch.
>>>>>>>> Sorry for I didn't get your history, but looks from the patch
>>>>>>>> itself, it is still a valid patch, isn't it?
>>>>>>> No, the semantic of amdgpu_ctx_get_fence() is that we return NULL
>>>>>>> when the fence is already signaled.
>>>>>>>
>>>>>>> So this patch could totally break userspace because it changes the
>>>>>>> behavior when we try to sync to an already signaled fence.
>>>>>> Ah, I got your meaning, how about attached patch?
>>>>> Yeah something like this, but I would just give the
>>>>> DRM_SYNCOBJ_CREATE_SIGNALED instead.
>>>>>
>>>>> I mean that's what this flag is good for isn't it?
>>>> Yeah, I give a flag initally when creating patch, but as you know, there is a
>>> swich case not be able to use that flag:
>>>>         case AMDGPU_FENCE_TO_HANDLE_GET_SYNC_FILE_FD:
>>>>             fd = get_unused_fd_flags(O_CLOEXEC);
>>>>             if (fd < 0) {
>>>>                 dma_fence_put(fence);
>>>>                 return fd;
>>>>             }
>>>>
>>>>             sync_file = sync_file_create(fence);
>>>>             dma_fence_put(fence);
>>>>             if (!sync_file) {
>>>>                 put_unused_fd(fd);
>>>>                 return -ENOMEM;
>>>>             }
>>>>
>>>>             fd_install(fd, sync_file->file);
>>>>             info->out.handle = fd;
>>>>             return 0;
>>>>
>>>> So I change to stub fence instead.
>>> Yeah, I've missed that case. Not sure if the sync_file can deal with a NULL
>>> fence.
>>>
>>> We should then probably move the stub fence function into
>>> dma_fence_stub.c under drivers/dma-buf to keep the stuff together.
>> Yes, you wrap it to review first with your stub fence, we can do it separately first.
>>
>> -David
>>>> -David
>>>>
>>>> I have not applied this patch.
>>>> The issue was trying to address is when amdgpu_cs_ioctl() failed due to
>>> low memory (ENOMEM) but userspace chose to proceed and called
>>> amdgpu_cs_fence_to_handle_ioctl().
>>>> In amdgpu_cs_fence_to_handle_ioctl(), fence is null and later causing
>>>> NULL pointer dereference, this patch was to avoid that and system panic
>>> But I understand now that its a valid case retuning NULL if fence was already
>>> signaled but need to handle case so avoid kernel panic. Seems David patch
>>> should fix this, I will test it tomorrow.
>>>
>>> Mhm, but don't we bail out with an error if we ask for a failed command
>>> submission? If not that sounds like a bug as well.
>>>
>>> Christian.
>>>
> Where do we do that?
> I see error
> [drm:amdgpu_cs_ioctl] *ERROR* amdgpu_cs_list_validate(validated) failed.
> [drm:amdgpu_cs_ioctl] *ERROR* Not enough memory for command submission!
> BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
> Did some more debugging,dma_fence_is_array() is causing NULL pointer
> dereference call through sync_file_ioctl.
>
> Also I think changes in David patch cant be applied on
> amd-staging-drm-next, which all patches I should take to get it correctly?

Mhm, what you say here actually doesn't make much sense.

When the sequence number is invalid because the submission failed 
amdgpu_ctx_get_fence() returns an error:
>         if (seq >= centity->sequence) {
>                 spin_unlock(&ctx->ring_lock);
>                 return ERR_PTR(-EINVAL);
>         }

This error is then handled in amdgpu_cs_fence_to_handle_ioctl():
>         fence = amdgpu_cs_get_fence(adev, filp, &info->in.fence);
>         if (IS_ERR(fence))
>                 return PTR_ERR(fence);

So the error handling for this is actually in place.

The only thing that could go wrong is that userspace sends a sequence 
number which is to small.

The correct solution is to either set the flag as I suggested and make 
sync_file_create() capable of handling a NULL fence.

Or we make the stub fence global like David suggested.

Regards,
Christian.

>
>>>> -Deepak
>>>>> Christian.
>>>>>
>>>>>> -David
>>>>>>> If that patch was applied then please revert it immediately.
>>>>>>>
>>>>>>> Christian.
>>>>>>>
>>>>>>>> -David
>>>>>>>>> If you have already pushed the patch then please revert.
>>>>>>>>>
>>>>>>>>> Christian.
>>>>>>>>>
>>>>>>>>>> Signed-off-by: Deepak Sharma <Deepak.Sharma@amd.com>
>>>>>>>>>> ---
>>>>>>>>>>          drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 ++
>>>>>>>>>>          1 file changed, 2 insertions(+)
>>>>>>>>>>
>>>>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>>>>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>>>>>>>>> index 024dfbd87f11..14166cd8a12f 100644
>>>>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>>>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>>>>>>>>> @@ -1403,6 +1403,8 @@ static struct dma_fence
>>>>>>>>>> *amdgpu_cs_get_fence(struct amdgpu_device *adev,
>>>>>>>>>>                fence = amdgpu_ctx_get_fence(ctx, entity, user->seq_no);
>>>>>>>>>>              amdgpu_ctx_put(ctx);
>>>>>>>>>> +    if(!fence)
>>>>>>>>>> +        return ERR_PTR(-EINVAL);
>>>>>>>>>>                return fence;
>>>>>>>>>>          }
>>>> _______________________________________________
>>>> amd-gfx mailing list
>>>> amd-gfx@lists.freedesktop.org
>>>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
> _______________________________________________
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] drm/amdgpu: add the checking to avoid NULL pointer dereference
       [not found]                                             ` <174e311d-3214-1e70-c5d4-8b955b926a58-5C7GfCeVMHo@public.gmane.org>
@ 2018-11-27 23:12                                               ` Sharma, Deepak
       [not found]                                                 ` <BY2PR12MB06942FD453E4656A310A806AE9D00-K//h7OWB4q7S9h1UXvMT+AdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
  0 siblings, 1 reply; 17+ messages in thread
From: Sharma, Deepak @ 2018-11-27 23:12 UTC (permalink / raw)
  To: Zhou, David(ChunMing),
	Koenig, Christian, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW


[-- Attachment #1.1: Type: text/plain, Size: 6317 bytes --]

Thanks David ,

I did backport the drm patch to my kernel  after that I am not getting -ENOMEM from amdgpu_cs_ioctl while running tests  so have not been able to test patch to handle signaled fence. As this issue is hard to reproduce , will give some more try.

But yes the problem is there and need to handle when fence is null that your patch seems to handle it correctly.

-Deepak
________________________________
From: Zhou, David(ChunMing)
Sent: Monday, November 26, 2018 6:40 PM
To: Sharma, Deepak; Zhou, David(ChunMing); Koenig, Christian; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: add the checking to avoid NULL pointer dereference

Yeah, you need another drm patch as well when you apply my patch. Attached.

-David


On 2018年11月27日 08:40, Sharma, Deepak wrote:
>
> On 11/26/18 1:57 AM, Zhou, David(ChunMing) wrote:
>>
>>> -----Original Message-----
>>> From: Christian König <ckoenig.leichtzumerken@gmail.com>
>>> Sent: Monday, November 26, 2018 5:23 PM
>>> To: Sharma, Deepak <Deepak.Sharma@amd.com>; Zhou, David(ChunMing)
>>> <David1.Zhou@amd.com>; Koenig, Christian <Christian.Koenig@amd.com>;
>>> amd-gfx@lists.freedesktop.org
>>> Subject: Re: [PATCH] drm/amdgpu: add the checking to avoid NULL pointer
>>> dereference
>>>
>>> Am 26.11.18 um 02:59 schrieb Sharma, Deepak:
>>>> 在 2018/11/24 2:10, Koenig, Christian 写道:
>>>>> Am 23.11.18 um 15:10 schrieb Zhou, David(ChunMing):
>>>>>> 在 2018/11/23 21:30, Koenig, Christian 写道:
>>>>>>> Am 23.11.18 um 14:27 schrieb Zhou, David(ChunMing):
>>>>>>>> 在 2018/11/22 19:25, Christian König 写道:
>>>>>>>>> Am 22.11.18 um 07:56 schrieb Sharma, Deepak:
>>>>>>>>>> when returned fence is not valid mostly due to userspace ignored
>>>>>>>>>> previous error causes NULL pointer dereference.
>>>>>>>>> Again, this is clearly incorrect. The my other mails on the
>>>>>>>>> earlier patch.
>>>>>>>> Sorry for I didn't get your history, but looks from the patch
>>>>>>>> itself, it is still a valid patch, isn't it?
>>>>>>> No, the semantic of amdgpu_ctx_get_fence() is that we return NULL
>>>>>>> when the fence is already signaled.
>>>>>>>
>>>>>>> So this patch could totally break userspace because it changes the
>>>>>>> behavior when we try to sync to an already signaled fence.
>>>>>> Ah, I got your meaning, how about attached patch?
>>>>> Yeah something like this, but I would just give the
>>>>> DRM_SYNCOBJ_CREATE_SIGNALED instead.
>>>>>
>>>>> I mean that's what this flag is good for isn't it?
>>>> Yeah, I give a flag initally when creating patch, but as you know, there is a
>>> swich case not be able to use that flag:
>>>>         case AMDGPU_FENCE_TO_HANDLE_GET_SYNC_FILE_FD:
>>>>             fd = get_unused_fd_flags(O_CLOEXEC);
>>>>             if (fd < 0) {
>>>>                 dma_fence_put(fence);
>>>>                 return fd;
>>>>             }
>>>>
>>>>             sync_file = sync_file_create(fence);
>>>>             dma_fence_put(fence);
>>>>             if (!sync_file) {
>>>>                 put_unused_fd(fd);
>>>>                 return -ENOMEM;
>>>>             }
>>>>
>>>>             fd_install(fd, sync_file->file);
>>>>             info->out.handle = fd;
>>>>             return 0;
>>>>
>>>> So I change to stub fence instead.
>>> Yeah, I've missed that case. Not sure if the sync_file can deal with a NULL
>>> fence.
>>>
>>> We should then probably move the stub fence function into
>>> dma_fence_stub.c under drivers/dma-buf to keep the stuff together.
>> Yes, you wrap it to review first with your stub fence, we can do it separately first.
>>
>> -David
>>>> -David
>>>>
>>>> I have not applied this patch.
>>>> The issue was trying to address is when amdgpu_cs_ioctl() failed due to
>>> low memory (ENOMEM) but userspace chose to proceed and called
>>> amdgpu_cs_fence_to_handle_ioctl().
>>>> In amdgpu_cs_fence_to_handle_ioctl(), fence is null and later causing
>>>> NULL pointer dereference, this patch was to avoid that and system panic
>>> But I understand now that its a valid case retuning NULL if fence was already
>>> signaled but need to handle case so avoid kernel panic. Seems David patch
>>> should fix this, I will test it tomorrow.
>>>
>>> Mhm, but don't we bail out with an error if we ask for a failed command
>>> submission? If not that sounds like a bug as well.
>>>
>>> Christian.
>>>
> Where do we do that?
> I see error
> [drm:amdgpu_cs_ioctl] *ERROR* amdgpu_cs_list_validate(validated) failed.
> [drm:amdgpu_cs_ioctl] *ERROR* Not enough memory for command submission!
> BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
> Did some more debugging,dma_fence_is_array() is causing NULL pointer
> dereference call through sync_file_ioctl.
>
> Also I think changes in David patch cant be applied on
> amd-staging-drm-next, which all patches I should take to get it correctly?
>
>>>> -Deepak
>>>>> Christian.
>>>>>
>>>>>> -David
>>>>>>> If that patch was applied then please revert it immediately.
>>>>>>>
>>>>>>> Christian.
>>>>>>>
>>>>>>>> -David
>>>>>>>>> If you have already pushed the patch then please revert.
>>>>>>>>>
>>>>>>>>> Christian.
>>>>>>>>>
>>>>>>>>>> Signed-off-by: Deepak Sharma <Deepak.Sharma@amd.com>
>>>>>>>>>> ---
>>>>>>>>>>          drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 ++
>>>>>>>>>>          1 file changed, 2 insertions(+)
>>>>>>>>>>
>>>>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>>>>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>>>>>>>>> index 024dfbd87f11..14166cd8a12f 100644
>>>>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>>>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>>>>>>>>> @@ -1403,6 +1403,8 @@ static struct dma_fence
>>>>>>>>>> *amdgpu_cs_get_fence(struct amdgpu_device *adev,
>>>>>>>>>>                fence = amdgpu_ctx_get_fence(ctx, entity, user->seq_no);
>>>>>>>>>>              amdgpu_ctx_put(ctx);
>>>>>>>>>> +    if(!fence)
>>>>>>>>>> +        return ERR_PTR(-EINVAL);
>>>>>>>>>>                return fence;
>>>>>>>>>>          }
>>>> _______________________________________________
>>>> amd-gfx mailing list
>>>> amd-gfx@lists.freedesktop.org
>>>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[-- Attachment #1.2: Type: text/html, Size: 11372 bytes --]

[-- Attachment #2: Type: text/plain, Size: 154 bytes --]

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 17+ messages in thread

* RE: [PATCH] drm/amdgpu: add the checking to avoid NULL pointer dereference
       [not found]                                             ` <869bef1b-26f4-0fff-fa33-2720c2114a60-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2018-11-27 23:18                                               ` Sharma, Deepak
  0 siblings, 0 replies; 17+ messages in thread
From: Sharma, Deepak @ 2018-11-27 23:18 UTC (permalink / raw)
  To: Koenig, Christian, Zhou, David(ChunMing),
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW



-----Original Message-----
From: amd-gfx <amd-gfx-bounces@lists.freedesktop.org> On Behalf Of Christian König
Sent: Tuesday, November 27, 2018 1:52 AM
To: Sharma, Deepak <Deepak.Sharma@amd.com>; Zhou, David(ChunMing) <David1.Zhou@amd.com>; Koenig, Christian <Christian.Koenig@amd.com>; Sharma, Deepak <Deepak.Sharma@amd.com>; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: add the checking to avoid NULL pointer dereference

Am 27.11.18 um 01:40 schrieb Deepak Sharma:
>
> On 11/26/18 1:57 AM, Zhou, David(ChunMing) wrote:
>>
>>> -----Original Message-----
>>> From: Christian König <ckoenig.leichtzumerken@gmail.com>
>>> Sent: Monday, November 26, 2018 5:23 PM
>>> To: Sharma, Deepak <Deepak.Sharma@amd.com>; Zhou, David(ChunMing)
>>> <David1.Zhou@amd.com>; Koenig, Christian <Christian.Koenig@amd.com>;
>>> amd-gfx@lists.freedesktop.org
>>> Subject: Re: [PATCH] drm/amdgpu: add the checking to avoid NULL pointer
>>> dereference
>>>
>>> Am 26.11.18 um 02:59 schrieb Sharma, Deepak:
>>>> 在 2018/11/24 2:10, Koenig, Christian 写道:
>>>>> Am 23.11.18 um 15:10 schrieb Zhou, David(ChunMing):
>>>>>> 在 2018/11/23 21:30, Koenig, Christian 写道:
>>>>>>> Am 23.11.18 um 14:27 schrieb Zhou, David(ChunMing):
>>>>>>>> 在 2018/11/22 19:25, Christian König 写道:
>>>>>>>>> Am 22.11.18 um 07:56 schrieb Sharma, Deepak:
>>>>>>>>>> when returned fence is not valid mostly due to userspace ignored
>>>>>>>>>> previous error causes NULL pointer dereference.
>>>>>>>>> Again, this is clearly incorrect. The my other mails on the
>>>>>>>>> earlier patch.
>>>>>>>> Sorry for I didn't get your history, but looks from the patch
>>>>>>>> itself, it is still a valid patch, isn't it?
>>>>>>> No, the semantic of amdgpu_ctx_get_fence() is that we return NULL
>>>>>>> when the fence is already signaled.
>>>>>>>
>>>>>>> So this patch could totally break userspace because it changes the
>>>>>>> behavior when we try to sync to an already signaled fence.
>>>>>> Ah, I got your meaning, how about attached patch?
>>>>> Yeah something like this, but I would just give the
>>>>> DRM_SYNCOBJ_CREATE_SIGNALED instead.
>>>>>
>>>>> I mean that's what this flag is good for isn't it?
>>>> Yeah, I give a flag initally when creating patch, but as you know, there is a
>>> swich case not be able to use that flag:
>>>>         case AMDGPU_FENCE_TO_HANDLE_GET_SYNC_FILE_FD:
>>>>             fd = get_unused_fd_flags(O_CLOEXEC);
>>>>             if (fd < 0) {
>>>>                 dma_fence_put(fence);
>>>>                 return fd;
>>>>             }
>>>>
>>>>             sync_file = sync_file_create(fence);
>>>>             dma_fence_put(fence);
>>>>             if (!sync_file) {
>>>>                 put_unused_fd(fd);
>>>>                 return -ENOMEM;
>>>>             }
>>>>
>>>>             fd_install(fd, sync_file->file);
>>>>             info->out.handle = fd;
>>>>             return 0;
>>>>
>>>> So I change to stub fence instead.
>>> Yeah, I've missed that case. Not sure if the sync_file can deal with a NULL
>>> fence.
>>>
>>> We should then probably move the stub fence function into
>>> dma_fence_stub.c under drivers/dma-buf to keep the stuff together.
>> Yes, you wrap it to review first with your stub fence, we can do it separately first.
>>
>> -David
>>>> -David
>>>>
>>>> I have not applied this patch.
>>>> The issue was trying to address is when amdgpu_cs_ioctl() failed due to
>>> low memory (ENOMEM) but userspace chose to proceed and called
>>> amdgpu_cs_fence_to_handle_ioctl().
>>>> In amdgpu_cs_fence_to_handle_ioctl(), fence is null and later causing
>>>> NULL pointer dereference, this patch was to avoid that and system panic
>>> But I understand now that its a valid case retuning NULL if fence was already
>>> signaled but need to handle case so avoid kernel panic. Seems David patch
>>> should fix this, I will test it tomorrow.
>>>
>>> Mhm, but don't we bail out with an error if we ask for a failed command
>>> submission? If not that sounds like a bug as well.
>>>
>>> Christian.
>>>
> Where do we do that?
> I see error
> [drm:amdgpu_cs_ioctl] *ERROR* amdgpu_cs_list_validate(validated) failed.
> [drm:amdgpu_cs_ioctl] *ERROR* Not enough memory for command submission!
> BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
> Did some more debugging,dma_fence_is_array() is causing NULL pointer
> dereference call through sync_file_ioctl.
>
> Also I think changes in David patch cant be applied on
> amd-staging-drm-next, which all patches I should take to get it correctly?

> Mhm, what you say here actually doesn't make much sense.

Yes  I did mixed different issue here at end , please ignore it.

When the sequence number is invalid because the submission failed 
amdgpu_ctx_get_fence() returns an error:
>         if (seq >= centity->sequence) {
>                 spin_unlock(&ctx->ring_lock);
>                 return ERR_PTR(-EINVAL);
>         }

This error is then handled in amdgpu_cs_fence_to_handle_ioctl():
>         fence = amdgpu_cs_get_fence(adev, filp, &info->in.fence);
>         if (IS_ERR(fence))
>                 return PTR_ERR(fence);

So the error handling for this is actually in place.

The only thing that could go wrong is that userspace sends a sequence 
number which is to small.

The correct solution is to either set the flag as I suggested and make 
sync_file_create() capable of handling a NULL fence.

Or we make the stub fence global like David suggested.

Regards,
Christian.

Thanks Christian  for explanation , understood. We should take any of suggest approach in for solving issue.
Will give some more try to reproduce this and try the fix.

-Deepak


>
>>>> -Deepak
>>>>> Christian.
>>>>>
>>>>>> -David
>>>>>>> If that patch was applied then please revert it immediately.
>>>>>>>
>>>>>>> Christian.
>>>>>>>
>>>>>>>> -David
>>>>>>>>> If you have already pushed the patch then please revert.
>>>>>>>>>
>>>>>>>>> Christian.
>>>>>>>>>
>>>>>>>>>> Signed-off-by: Deepak Sharma <Deepak.Sharma@amd.com>
>>>>>>>>>> ---
>>>>>>>>>>          drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 ++
>>>>>>>>>>          1 file changed, 2 insertions(+)
>>>>>>>>>>
>>>>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>>>>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>>>>>>>>> index 024dfbd87f11..14166cd8a12f 100644
>>>>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>>>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>>>>>>>>> @@ -1403,6 +1403,8 @@ static struct dma_fence
>>>>>>>>>> *amdgpu_cs_get_fence(struct amdgpu_device *adev,
>>>>>>>>>>                fence = amdgpu_ctx_get_fence(ctx, entity, user->seq_no);
>>>>>>>>>>              amdgpu_ctx_put(ctx);
>>>>>>>>>> +    if(!fence)
>>>>>>>>>> +        return ERR_PTR(-EINVAL);
>>>>>>>>>>                return fence;
>>>>>>>>>>          }
>>>> _______________________________________________
>>>> amd-gfx mailing list
>>>> amd-gfx@lists.freedesktop.org
>>>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
> _______________________________________________
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] drm/amdgpu: add the checking to avoid NULL pointer dereference
       [not found]                                                 ` <BY2PR12MB06942FD453E4656A310A806AE9D00-K//h7OWB4q7S9h1UXvMT+AdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
@ 2018-12-01  3:42                                                   ` Sharma, Deepak
  0 siblings, 0 replies; 17+ messages in thread
From: Sharma, Deepak @ 2018-12-01  3:42 UTC (permalink / raw)
  To: Zhou, David(ChunMing),
	Koenig, Christian, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW


[-- Attachment #1.1: Type: text/plain, Size: 9702 bytes --]

Did find way to reproduce issue constantly. After applying David's patch "0001-drm-amdgpu-fix-signaled-fence-isn-t-handled" with minor change


-static struct dma_fence *drm_syncobj_get_stub_fence(void)
+struct dma_fence *drm_syncobj_get_stub_fence(void)


was able to avoid kernel panic due to NULL pointer dereference. So seems it is resolving the problem which I was trying to address.

Can you please put revision of this patch?



Though only one time I got reboot with dmesg..


[  213.714324] RIP: 0010:reservation_object_add_shared_fence+0x22e/0x245
[  213.714331] RSP: 0018:ffffa03900b5ba80 EFLAGS: 00010246
[  213.714338] RAX: 0000000000000004 RBX: ffffa03900b5bba8 RCX: ffff969421139d00
[  213.714343] RDX: 0000000000000000 RSI: ffff9693639662e0 RDI: ffff9694203db230
[  213.714349] RBP: ffffa03900b5bad0 R08: 000000000000002a R09: ffff969363966280
[  213.714354] R10: 0000000180800079 R11: ffffffffa3637f66 R12: ffff96941ea5cf00
[  213.714360] R13: 0000000000000000 R14: ffff9693639662e0 R15: ffff9694203db230
[  213.714366] FS:  0000786009e5f700(0000) GS:ffff96942ec00000(0000) knlGS:0000000000000000
[  213.714372] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  213.714377] CR2: 00007860023a4000 CR3: 000000011ea8a000 CR4: 00000000001406f0
[  213.714382] Call Trace:
[  213.714397]  ttm_eu_fence_buffer_objects+0x53/0x89
[  213.714407]  amdgpu_cs_ioctl+0x194f/0x196c
[  213.714420]  ? amdgpu_cs_report_moved_bytes+0x5f/0x5f
[  213.714427]  drm_ioctl_kernel+0x98/0xac
[  213.714435]  drm_ioctl+0x235/0x357
[  213.714443]  ? amdgpu_cs_report_moved_bytes+0x5f/0x5f
[  213.714450]  ? __switch_to_asm+0x34/0x70
[  213.714456]  ? __switch_to_asm+0x40/0x70
[  213.714462]  ? __switch_to_asm+0x34/0x70
[  213.714468]  ? __switch_to_asm+0x40/0x70
[  213.714476]  amdgpu_drm_ioctl+0x46/0x78
[  213.714500]  vfs_ioctl+0x1b/0x30
[  213.714507]  do_vfs_ioctl+0x492/0x6c1
[  213.714530]  SyS_ioctl+0x52/0x77
[  213.714537]  do_syscall_64+0x6d/0x81
[  213.714544]  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[  213.714551] RIP: 0033:0x78600c9c6967
[  213.714556] RSP: 002b:0000786009e5ec08 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[  213.714563] RAX: ffffffffffffffda RBX: 00000000c0186444 RCX: 000078600c9c6967
[  213.714568] RDX: 0000786009e5ec60 RSI: 00000000c0186444 RDI: 000000000000000d
[  213.714573] RBP: 0000786009e5ec40 R08: 0000786009e5ed00 R09: 0000786009e5ec50
[  213.714578] R10: 0000000000000000 R11: 0000000000000246 R12: 000000000000000d
[  213.714583] R13: 00003ba2573ff128 R14: 0000000000000000 R15: 0000786009e5ec60



Its worth to mention that I am not using latest kernel .relevant git log


71a0556255de125b7e3fc0dc6171fb31fab2b9ad drm/syncobj: Don't leak fences when WAIT_FOR_SUBMIT is set

4034318d2afac8f7f43457a4b480b877a94ec651  drm/syncobj: Stop reusing the same struct file for all syncobj -> fd


So have not taken any recent changes  from https://patchwork.kernel.org/patch/10575275/ series. only backported  attached two patches . I need to check if something is missing that can cause issue.


Thanks,

Deepak


________________________________
From: Sharma, Deepak
Sent: Tuesday, November 27, 2018 3:12 PM
To: Zhou, David(ChunMing); Koenig, Christian; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: add the checking to avoid NULL pointer dereference


Thanks David ,

I did backport the drm patch to my kernel  after that I am not getting -ENOMEM from amdgpu_cs_ioctl while running tests  so have not been able to test patch to handle signaled fence. As this issue is hard to reproduce , will give some more try.

But yes the problem is there and need to handle when fence is null that your patch seems to handle it correctly.

-Deepak
________________________________
From: Zhou, David(ChunMing)
Sent: Monday, November 26, 2018 6:40 PM
To: Sharma, Deepak; Zhou, David(ChunMing); Koenig, Christian; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: add the checking to avoid NULL pointer dereference

Yeah, you need another drm patch as well when you apply my patch. Attached.

-David


On 2018年11月27日 08:40, Sharma, Deepak wrote:
>
> On 11/26/18 1:57 AM, Zhou, David(ChunMing) wrote:
>>
>>> -----Original Message-----
>>> From: Christian König <ckoenig.leichtzumerken@gmail.com>
>>> Sent: Monday, November 26, 2018 5:23 PM
>>> To: Sharma, Deepak <Deepak.Sharma@amd.com>; Zhou, David(ChunMing)
>>> <David1.Zhou@amd.com>; Koenig, Christian <Christian.Koenig@amd.com>;
>>> amd-gfx@lists.freedesktop.org
>>> Subject: Re: [PATCH] drm/amdgpu: add the checking to avoid NULL pointer
>>> dereference
>>>
>>> Am 26.11.18 um 02:59 schrieb Sharma, Deepak:
>>>> 在 2018/11/24 2:10, Koenig, Christian 写道:
>>>>> Am 23.11.18 um 15:10 schrieb Zhou, David(ChunMing):
>>>>>> 在 2018/11/23 21:30, Koenig, Christian 写道:
>>>>>>> Am 23.11.18 um 14:27 schrieb Zhou, David(ChunMing):
>>>>>>>> 在 2018/11/22 19:25, Christian König 写道:
>>>>>>>>> Am 22.11.18 um 07:56 schrieb Sharma, Deepak:
>>>>>>>>>> when returned fence is not valid mostly due to userspace ignored
>>>>>>>>>> previous error causes NULL pointer dereference.
>>>>>>>>> Again, this is clearly incorrect. The my other mails on the
>>>>>>>>> earlier patch.
>>>>>>>> Sorry for I didn't get your history, but looks from the patch
>>>>>>>> itself, it is still a valid patch, isn't it?
>>>>>>> No, the semantic of amdgpu_ctx_get_fence() is that we return NULL
>>>>>>> when the fence is already signaled.
>>>>>>>
>>>>>>> So this patch could totally break userspace because it changes the
>>>>>>> behavior when we try to sync to an already signaled fence.
>>>>>> Ah, I got your meaning, how about attached patch?
>>>>> Yeah something like this, but I would just give the
>>>>> DRM_SYNCOBJ_CREATE_SIGNALED instead.
>>>>>
>>>>> I mean that's what this flag is good for isn't it?
>>>> Yeah, I give a flag initally when creating patch, but as you know, there is a
>>> swich case not be able to use that flag:
>>>>         case AMDGPU_FENCE_TO_HANDLE_GET_SYNC_FILE_FD:
>>>>             fd = get_unused_fd_flags(O_CLOEXEC);
>>>>             if (fd < 0) {
>>>>                 dma_fence_put(fence);
>>>>                 return fd;
>>>>             }
>>>>
>>>>             sync_file = sync_file_create(fence);
>>>>             dma_fence_put(fence);
>>>>             if (!sync_file) {
>>>>                 put_unused_fd(fd);
>>>>                 return -ENOMEM;
>>>>             }
>>>>
>>>>             fd_install(fd, sync_file->file);
>>>>             info->out.handle = fd;
>>>>             return 0;
>>>>
>>>> So I change to stub fence instead.
>>> Yeah, I've missed that case. Not sure if the sync_file can deal with a NULL
>>> fence.
>>>
>>> We should then probably move the stub fence function into
>>> dma_fence_stub.c under drivers/dma-buf to keep the stuff together.
>> Yes, you wrap it to review first with your stub fence, we can do it separately first.
>>
>> -David
>>>> -David
>>>>
>>>> I have not applied this patch.
>>>> The issue was trying to address is when amdgpu_cs_ioctl() failed due to
>>> low memory (ENOMEM) but userspace chose to proceed and called
>>> amdgpu_cs_fence_to_handle_ioctl().
>>>> In amdgpu_cs_fence_to_handle_ioctl(), fence is null and later causing
>>>> NULL pointer dereference, this patch was to avoid that and system panic
>>> But I understand now that its a valid case retuning NULL if fence was already
>>> signaled but need to handle case so avoid kernel panic. Seems David patch
>>> should fix this, I will test it tomorrow.
>>>
>>> Mhm, but don't we bail out with an error if we ask for a failed command
>>> submission? If not that sounds like a bug as well.
>>>
>>> Christian.
>>>
> Where do we do that?
> I see error
> [drm:amdgpu_cs_ioctl] *ERROR* amdgpu_cs_list_validate(validated) failed.
> [drm:amdgpu_cs_ioctl] *ERROR* Not enough memory for command submission!
> BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
> Did some more debugging,dma_fence_is_array() is causing NULL pointer
> dereference call through sync_file_ioctl.
>
> Also I think changes in David patch cant be applied on
> amd-staging-drm-next, which all patches I should take to get it correctly?
>
>>>> -Deepak
>>>>> Christian.
>>>>>
>>>>>> -David
>>>>>>> If that patch was applied then please revert it immediately.
>>>>>>>
>>>>>>> Christian.
>>>>>>>
>>>>>>>> -David
>>>>>>>>> If you have already pushed the patch then please revert.
>>>>>>>>>
>>>>>>>>> Christian.
>>>>>>>>>
>>>>>>>>>> Signed-off-by: Deepak Sharma <Deepak.Sharma@amd.com>
>>>>>>>>>> ---
>>>>>>>>>>          drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 ++
>>>>>>>>>>          1 file changed, 2 insertions(+)
>>>>>>>>>>
>>>>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>>>>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>>>>>>>>> index 024dfbd87f11..14166cd8a12f 100644
>>>>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>>>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>>>>>>>>> @@ -1403,6 +1403,8 @@ static struct dma_fence
>>>>>>>>>> *amdgpu_cs_get_fence(struct amdgpu_device *adev,
>>>>>>>>>>                fence = amdgpu_ctx_get_fence(ctx, entity, user->seq_no);
>>>>>>>>>>              amdgpu_ctx_put(ctx);
>>>>>>>>>> +    if(!fence)
>>>>>>>>>> +        return ERR_PTR(-EINVAL);
>>>>>>>>>>                return fence;
>>>>>>>>>>          }
>>>> _______________________________________________
>>>> amd-gfx mailing list
>>>> amd-gfx@lists.freedesktop.org
>>>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[-- Attachment #1.2: Type: text/html, Size: 17927 bytes --]

[-- Attachment #2: Type: text/plain, Size: 154 bytes --]

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2018-12-01  3:42 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-11-22  6:56 [PATCH] drm/amdgpu: add the checking to avoid NULL pointer dereference Sharma, Deepak
     [not found] ` <20181122065523.5510-1-Deepak.Sharma-5C7GfCeVMHo@public.gmane.org>
2018-11-22  7:00   ` zhoucm1
2018-11-22 11:25   ` Christian König
     [not found]     ` <b8284caa-3f1d-5672-6317-36be83907e9d-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2018-11-23 13:27       ` Chunming Zhou
     [not found]         ` <9350d13a-e745-f284-1c49-785fe523773d-5C7GfCeVMHo@public.gmane.org>
2018-11-23 13:30           ` Koenig, Christian
     [not found]             ` <eae25687-23ce-d686-f5f0-deeb7c3d2e1c-5C7GfCeVMHo@public.gmane.org>
2018-11-23 14:10               ` Chunming Zhou
     [not found]                 ` <70a32625-7568-6253-6c5e-a72fe38f7986-5C7GfCeVMHo@public.gmane.org>
2018-11-23 18:10                   ` Koenig, Christian
     [not found]                     ` <138bad33-c604-c0c5-98cb-1ce71a432185-5C7GfCeVMHo@public.gmane.org>
2018-11-24  8:14                       ` Chunming Zhou
     [not found]                         ` <46007826-1a4a-40eb-5ee8-af2d0c93e04a-5C7GfCeVMHo@public.gmane.org>
2018-11-26  1:59                           ` Sharma, Deepak
     [not found]                             ` <BY2PR12MB0694DBE621FC3BBC269E7D51E9D70-K//h7OWB4q7S9h1UXvMT+AdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
2018-11-26  9:23                               ` Christian König
     [not found]                                 ` <38725288-c15f-58be-7d1a-422abe503a04-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2018-11-26  9:57                                   ` Zhou, David(ChunMing)
     [not found]                                     ` <BY1PR12MB0502AC0F928B517E35C58751B4D70-PicGAnIBOobrCwm+z9iKNgdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
2018-11-27  0:40                                       ` Deepak Sharma
     [not found]                                         ` <9619e159-d47b-2078-974d-0d3b418f6d88-5C7GfCeVMHo@public.gmane.org>
2018-11-27  2:40                                           ` zhoucm1
     [not found]                                             ` <174e311d-3214-1e70-c5d4-8b955b926a58-5C7GfCeVMHo@public.gmane.org>
2018-11-27 23:12                                               ` Sharma, Deepak
     [not found]                                                 ` <BY2PR12MB06942FD453E4656A310A806AE9D00-K//h7OWB4q7S9h1UXvMT+AdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
2018-12-01  3:42                                                   ` Sharma, Deepak
2018-11-27  9:52                                           ` Christian König
     [not found]                                             ` <869bef1b-26f4-0fff-fa33-2720c2114a60-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2018-11-27 23:18                                               ` Sharma, Deepak

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.