All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] drm/amdgpu: fix the memory corruption on S3
@ 2017-06-29  7:03 Huang Rui
       [not found] ` <1498719798-8435-1-git-send-email-ray.huang-5C7GfCeVMHo@public.gmane.org>
  0 siblings, 1 reply; 6+ messages in thread
From: Huang Rui @ 2017-06-29  7:03 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Alex Deucher,
	Christian König
  Cc: Alvin Huan, Joe Qiao, Sonny Jiang, Huang Rui, Ken Wang, Xiaojie Yuan

psp->cmd will be used on resume phase, so we can not free it on hw_init.
Otherwise, a memory corruption will be triggered.

Signed-off-by: Huang Rui <ray.huang@amd.com>
---

Alex, Christian,

This is the final fix for vega10 S3. The random memory corruption issue is root
caused.

Thanks,
Ray

---
 drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
index 5041073..fcdd542 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
@@ -372,8 +372,6 @@ static int psp_load_fw(struct amdgpu_device *adev)
 	if (ret)
 		goto failed_mem;
 
-	kfree(cmd);
-
 	return 0;
 
 failed_mem:
@@ -384,6 +382,7 @@ static int psp_load_fw(struct amdgpu_device *adev)
 			      &psp->fw_pri_mc_addr, &psp->fw_pri_buf);
 failed:
 	kfree(cmd);
+	cmd = NULL;
 	return ret;
 }
 
@@ -443,6 +442,11 @@ static int psp_hw_fini(void *handle)
 		amdgpu_bo_free_kernel(&psp->fence_buf_bo,
 				      &psp->fence_buf_mc_addr, &psp->fence_buf);
 
+	if (!psp->cmd) {
+		kfree(psp->cmd);
+		psp->cmd = NULL;
+	}
+
 	return 0;
 }
 
-- 
2.7.4

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH] drm/amdgpu: fix the memory corruption on S3
       [not found] ` <1498719798-8435-1-git-send-email-ray.huang-5C7GfCeVMHo@public.gmane.org>
@ 2017-06-29  7:34   ` Michel Dänzer
       [not found]     ` <86b9418d-450e-9101-517d-8d1ae7c8b358-otUistvHUpPR7s880joybQ@public.gmane.org>
  0 siblings, 1 reply; 6+ messages in thread
From: Michel Dänzer @ 2017-06-29  7:34 UTC (permalink / raw)
  To: Huang Rui, Alex Deucher, Christian König
  Cc: Alvin Huan, Joe Qiao, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	Sonny Jiang, Ken Wang, Xiaojie Yuan

On 29/06/17 04:03 PM, Huang Rui wrote:
> psp->cmd will be used on resume phase, so we can not free it on hw_init.
> Otherwise, a memory corruption will be triggered.
> 
> Signed-off-by: Huang Rui <ray.huang@amd.com>
> ---
> 
> Alex, Christian,
> 
> This is the final fix for vega10 S3. The random memory corruption issue is root
> caused.
> 
> Thanks,
> Ray
> 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
> index 5041073..fcdd542 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
> @@ -372,8 +372,6 @@ static int psp_load_fw(struct amdgpu_device *adev)
>  	if (ret)
>  		goto failed_mem;
>  
> -	kfree(cmd);
> -
>  	return 0;

This looks like a good catch.


> @@ -384,6 +382,7 @@ static int psp_load_fw(struct amdgpu_device *adev)
>  			      &psp->fw_pri_mc_addr, &psp->fw_pri_buf);
>  failed:
>  	kfree(cmd);
> +	cmd = NULL;

This should probably be

	psp->cmd = NULL;

instead?


> @@ -443,6 +442,11 @@ static int psp_hw_fini(void *handle)
>  		amdgpu_bo_free_kernel(&psp->fence_buf_bo,
>  				      &psp->fence_buf_mc_addr, &psp->fence_buf);
>  
> +	if (!psp->cmd) {
> +		kfree(psp->cmd);
> +		psp->cmd = NULL;
> +	}

This should probably be

	if (psp->cmd) {

? If so, you could skip the if altogether, since kfree(NULL) is safe.


-- 
Earthling Michel Dänzer               |               http://www.amd.com
Libre software enthusiast             |             Mesa and X developer
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] drm/amdgpu: fix the memory corruption on S3
       [not found]     ` <86b9418d-450e-9101-517d-8d1ae7c8b358-otUistvHUpPR7s880joybQ@public.gmane.org>
@ 2017-06-29  7:59       ` Huang Rui
  2017-06-29  8:06         ` Christian König
  2017-06-29  8:07         ` Michel Dänzer
  0 siblings, 2 replies; 6+ messages in thread
From: Huang Rui @ 2017-06-29  7:59 UTC (permalink / raw)
  To: Michel Dänzer
  Cc: Huan, Alvin, Qiao, Joe(Markham),
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Jiang, Sonny, Deucher,
	Alexander, Wang, Ken, Koenig, Christian, Yuan, Xiaojie

On Thu, Jun 29, 2017 at 03:34:57PM +0800, Michel Dänzer wrote:
> On 29/06/17 04:03 PM, Huang Rui wrote:
> > psp->cmd will be used on resume phase, so we can not free it on hw_init.
> > Otherwise, a memory corruption will be triggered.
> >
> > Signed-off-by: Huang Rui <ray.huang@amd.com>
> > ---
> >
> > Alex, Christian,
> >
> > This is the final fix for vega10 S3. The random memory corruption issue is
> root
> > caused.
> >
> > Thanks,
> > Ray
> >
> > ---
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 8 ++++++--
> >  1 file changed, 6 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c b/drivers/gpu/drm/amd/
> amdgpu/amdgpu_psp.c
> > index 5041073..fcdd542 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
> > @@ -372,8 +372,6 @@ static int psp_load_fw(struct amdgpu_device *adev)
> >        if (ret)
> >                goto failed_mem;
> > 
> > -     kfree(cmd);
> > -
> >        return 0;
> 
> This looks like a good catch.
> 
> 
> > @@ -384,6 +382,7 @@ static int psp_load_fw(struct amdgpu_device *adev)
> >                              &psp->fw_pri_mc_addr, &psp->fw_pri_buf);
> >  failed:
> >        kfree(cmd);
> > +     cmd = NULL;
> 
> This should probably be
> 
>         psp->cmd = NULL;
> 
> instead?
> 

Actually, we set psp->cmd = cmd before.

But anyway, we needn't "cmd" member any more.

> 
> > @@ -443,6 +442,11 @@ static int psp_hw_fini(void *handle)
> >                amdgpu_bo_free_kernel(&psp->fence_buf_bo,
> >                                      &psp->fence_buf_mc_addr, &psp->
> fence_buf);
> > 
> > +     if (!psp->cmd) {
> > +             kfree(psp->cmd);
> > +             psp->cmd = NULL;
> > +     }
> 
> This should probably be
> 
>         if (psp->cmd) {
> 
> ? If so, you could skip the if altogether, since kfree(NULL) is safe.

Nice catch. My typo. ;-)

You're right. Will update it in V2.

Thanks,
Rui
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] drm/amdgpu: fix the memory corruption on S3
  2017-06-29  7:59       ` Huang Rui
@ 2017-06-29  8:06         ` Christian König
  2017-06-29  8:07         ` Michel Dänzer
  1 sibling, 0 replies; 6+ messages in thread
From: Christian König @ 2017-06-29  8:06 UTC (permalink / raw)
  To: Huang Rui, Michel Dänzer
  Cc: Huan, Alvin, Qiao, Joe(Markham),
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Jiang, Sonny, Deucher,
	Alexander, Wang, Ken, Yuan, Xiaojie

Indeed a nice catch.

Just skimming over the PSP code, wouldn't it be simpler to temporary 
allocate the cmd buffer in psp_np_fw_load()?

Doesn't looks like that is used outside that function. Might even be 
possible to just allocate the buffer on the stack.

Regards,
Christian.

Am 29.06.2017 um 09:59 schrieb Huang Rui:
> On Thu, Jun 29, 2017 at 03:34:57PM +0800, Michel Dänzer wrote:
>> On 29/06/17 04:03 PM, Huang Rui wrote:
>>> psp->cmd will be used on resume phase, so we can not free it on hw_init.
>>> Otherwise, a memory corruption will be triggered.
>>>
>>> Signed-off-by: Huang Rui <ray.huang@amd.com>
>>> ---
>>>
>>> Alex, Christian,
>>>
>>> This is the final fix for vega10 S3. The random memory corruption issue is
>> root
>>> caused.
>>>
>>> Thanks,
>>> Ray
>>>
>>> ---
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 8 ++++++--
>>>   1 file changed, 6 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c b/drivers/gpu/drm/amd/
>> amdgpu/amdgpu_psp.c
>>> index 5041073..fcdd542 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
>>> @@ -372,8 +372,6 @@ static int psp_load_fw(struct amdgpu_device *adev)
>>>         if (ret)
>>>                 goto failed_mem;
>>>
>>> -     kfree(cmd);
>>> -
>>>         return 0;
>> This looks like a good catch.
>>
>>
>>> @@ -384,6 +382,7 @@ static int psp_load_fw(struct amdgpu_device *adev)
>>>                               &psp->fw_pri_mc_addr, &psp->fw_pri_buf);
>>>   failed:
>>>         kfree(cmd);
>>> +     cmd = NULL;
>> This should probably be
>>
>>          psp->cmd = NULL;
>>
>> instead?
>>
> Actually, we set psp->cmd = cmd before.
>
> But anyway, we needn't "cmd" member any more.
>
>>> @@ -443,6 +442,11 @@ static int psp_hw_fini(void *handle)
>>>                 amdgpu_bo_free_kernel(&psp->fence_buf_bo,
>>>                                       &psp->fence_buf_mc_addr, &psp->
>> fence_buf);
>>> +     if (!psp->cmd) {
>>> +             kfree(psp->cmd);
>>> +             psp->cmd = NULL;
>>> +     }
>> This should probably be
>>
>>          if (psp->cmd) {
>>
>> ? If so, you could skip the if altogether, since kfree(NULL) is safe.
> Nice catch. My typo. ;-)
>
> You're right. Will update it in V2.
>
> Thanks,
> Rui


_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] drm/amdgpu: fix the memory corruption on S3
  2017-06-29  7:59       ` Huang Rui
  2017-06-29  8:06         ` Christian König
@ 2017-06-29  8:07         ` Michel Dänzer
       [not found]           ` <d86cec19-7927-c29e-d2fb-fdaf589e3369-otUistvHUpPR7s880joybQ@public.gmane.org>
  1 sibling, 1 reply; 6+ messages in thread
From: Michel Dänzer @ 2017-06-29  8:07 UTC (permalink / raw)
  To: Huang Rui
  Cc: Huan, Alvin, Qiao, Joe(Markham),
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Jiang, Sonny, Deucher,
	Alexander, Wang, Ken, Koenig, Christian, Yuan, Xiaojie

On 29/06/17 04:59 PM, Huang Rui wrote:
> On Thu, Jun 29, 2017 at 03:34:57PM +0800, Michel Dänzer wrote:
>> On 29/06/17 04:03 PM, Huang Rui wrote:
>>> psp->cmd will be used on resume phase, so we can not free it on hw_init.
>>> Otherwise, a memory corruption will be triggered.
>>>
>>> Signed-off-by: Huang Rui <ray.huang@amd.com>
>>> ---
>>>
>>> Alex, Christian,
>>>
>>> This is the final fix for vega10 S3. The random memory corruption issue is
>> root
>>> caused.
>>>
>>> Thanks,
>>> Ray
>>>
>>> ---
>>>  drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 8 ++++++--
>>>  1 file changed, 6 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c b/drivers/gpu/drm/amd/
>> amdgpu/amdgpu_psp.c
>>> index 5041073..fcdd542 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
>>> @@ -372,8 +372,6 @@ static int psp_load_fw(struct amdgpu_device *adev)
>>>        if (ret)
>>>                goto failed_mem;
>>>
>>> -     kfree(cmd);
>>> -
>>>        return 0;
>>
>> This looks like a good catch.
>>
>>
>>> @@ -384,6 +382,7 @@ static int psp_load_fw(struct amdgpu_device *adev)
>>>                              &psp->fw_pri_mc_addr, &psp->fw_pri_buf);
>>>  failed:
>>>        kfree(cmd);
>>> +     cmd = NULL;
>>
>> This should probably be
>>
>>         psp->cmd = NULL;
>>
>> instead?
>>
> 
> Actually, we set psp->cmd = cmd before.
> 
> But anyway, we needn't "cmd" member any more.

You should probably still set psp->cmd = NULL here, otherwise psp->cmd
still contains the pointer to the memory that is freed here, which could
result in use-after-free somewhere else.


-- 
Earthling Michel Dänzer               |               http://www.amd.com
Libre software enthusiast             |             Mesa and X developer
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] drm/amdgpu: fix the memory corruption on S3
       [not found]           ` <d86cec19-7927-c29e-d2fb-fdaf589e3369-otUistvHUpPR7s880joybQ@public.gmane.org>
@ 2017-06-29  8:15             ` Huang Rui
  0 siblings, 0 replies; 6+ messages in thread
From: Huang Rui @ 2017-06-29  8:15 UTC (permalink / raw)
  To: Michel Dänzer
  Cc: Huan, Alvin, Qiao, Joe(Markham),
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Jiang, Sonny, Deucher,
	Alexander, Wang, Ken, Koenig, Christian, Yuan, Xiaojie

On Thu, Jun 29, 2017 at 04:07:33PM +0800, Michel Dänzer wrote:
> On 29/06/17 04:59 PM, Huang Rui wrote:
> > On Thu, Jun 29, 2017 at 03:34:57PM +0800, Michel Dänzer wrote:
> >> On 29/06/17 04:03 PM, Huang Rui wrote:
> >>> psp->cmd will be used on resume phase, so we can not free it on hw_init.
> >>> Otherwise, a memory corruption will be triggered.
> >>>
> >>> Signed-off-by: Huang Rui <ray.huang@amd.com>
> >>> ---
> >>>
> >>> Alex, Christian,
> >>>
> >>> This is the final fix for vega10 S3. The random memory corruption issue is
> >> root
> >>> caused.
> >>>
> >>> Thanks,
> >>> Ray
> >>>
> >>> ---
> >>>  drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 8 ++++++--
> >>>  1 file changed, 6 insertions(+), 2 deletions(-)
> >>>
> >>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c b/drivers/gpu/drm/amd/
> >> amdgpu/amdgpu_psp.c
> >>> index 5041073..fcdd542 100644
> >>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
> >>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
> >>> @@ -372,8 +372,6 @@ static int psp_load_fw(struct amdgpu_device *adev)
> >>>        if (ret)
> >>>                goto failed_mem;
> >>>
> >>> -     kfree(cmd);
> >>> -
> >>>        return 0;
> >>
> >> This looks like a good catch.
> >>
> >>
> >>> @@ -384,6 +382,7 @@ static int psp_load_fw(struct amdgpu_device *adev)
> >>>                              &psp->fw_pri_mc_addr, &psp->fw_pri_buf);
> >>>  failed:
> >>>        kfree(cmd);
> >>> +     cmd = NULL;
> >>
> >> This should probably be
> >>
> >>         psp->cmd = NULL;
> >>
> >> instead?
> >>
> >
> > Actually, we set psp->cmd = cmd before.
> >
> > But anyway, we needn't "cmd" member any more.
> 
> You should probably still set psp->cmd = NULL here, otherwise psp->cmd
> still contains the pointer to the memory that is freed here, which could
> result in use-after-free somewhere else.
> 

Right, I already found it and update it in V2, please take a look.

Thanks,
Ray
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2017-06-29  8:15 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-06-29  7:03 [PATCH] drm/amdgpu: fix the memory corruption on S3 Huang Rui
     [not found] ` <1498719798-8435-1-git-send-email-ray.huang-5C7GfCeVMHo@public.gmane.org>
2017-06-29  7:34   ` Michel Dänzer
     [not found]     ` <86b9418d-450e-9101-517d-8d1ae7c8b358-otUistvHUpPR7s880joybQ@public.gmane.org>
2017-06-29  7:59       ` Huang Rui
2017-06-29  8:06         ` Christian König
2017-06-29  8:07         ` Michel Dänzer
     [not found]           ` <d86cec19-7927-c29e-d2fb-fdaf589e3369-otUistvHUpPR7s880joybQ@public.gmane.org>
2017-06-29  8:15             ` Huang Rui

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.