From mboxrd@z Thu Jan 1 00:00:00 1970 From: =?UTF-8?Q?Christian_K=c3=b6nig?= Subject: Re: [PATCH] dmr/amdgpu: Fix wrongly unref of BO Date: Tue, 18 Apr 2017 19:46:30 +0200 Message-ID: <7807726d-7318-b72f-1a14-2b37a73988bf@vodafone.de> References: <1492119298-8526-1-git-send-email-AlexBin.Xie@amd.com> <58F0495D.60607@amd.com> <58F57C14.20403@amd.com> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="===============0045821482==" Return-path: In-Reply-To: List-Id: Discussion list for AMD gfx List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: amd-gfx-bounces-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org Sender: "amd-gfx" To: "Xie, AlexBin" , "Zhou, David(ChunMing)" , "amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org" This is a multi-part message in MIME format. --===============0045821482== Content-Type: multipart/alternative; boundary="------------E8298088F81D3945B506DFA4" This is a multi-part message in MIME format. --------------E8298088F81D3945B506DFA4 Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 8bit Hi AlexBin, No, David is right. This is a very common coding pattern in the kernel module. Freeing up a BO while there still exists a kernel mapping is perfectly ok, the mapping will just be released a bit later on. So this code is actually perfectly ok and just an optimization, but your patch breaks it and creates a memory leak. Regards, Christian. Am 18.04.2017 um 17:17 schrieb Xie, AlexBin: > > Hi David, > > > When amdgpu_bo_reserve return errors, we cannot release the BO. This > is not a memory leak. General speaking, memory leak is unnoticed and > unintentional. > > > The caller of function amdgpu_vram_scratch_fini ignores the return > error value... > > > The "memory leak" is not caused by my patch. It is caused because > reserving BO fails. > > > This patch only aim to make function amdgpu_vram_scratch_fini behave > correctly. > > > To follow up, we can add a warning message when amdgpu_bo_reserve > error happens in a different patch. > > > If function call amdgpu_bo_reserve is changed to uninterruptible, this > changes driver behaviour. Without a substantial issue, I would be > cautious for such a change. > > > Thanks, > > Alex Bin Xie > > > ------------------------------------------------------------------------ > *From:* Zhou, David(ChunMing) > *Sent:* Monday, April 17, 2017 10:38 PM > *To:* Xie, AlexBin; amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org > *Subject:* Re: [PATCH] dmr/amdgpu: Fix wrongly unref of BO > > > On 2017年04月17日 22:54, Xie, AlexBin wrote: >> >> Hi David, >> >> >> Thanks for the comments. However, please have look at >> amdgpu_bo_reserve definition. >> >> static inline int amdgpu_bo_reserve(struct amdgpu_bo *bo, bool no_intr) >> > Ah, this is a wired wrapper for ttm_bo_reserve. > >> >> When we call this function like the following: >> >> r = amdgpu_bo_reserve(adev->vram_scratch.robj, false); >> The false means interruptible. >> >> >> On the other hand, when amdgpu_bo_reserve function return error, why >> do we unref BO without kunmap and unpin the BO? This is wrong >> implementation when amdgpu_bo_reserve return any error. >> > Yeah, I see your mean, it's in driver un-loading, How about changing > to no interruptible? Your patch will make a memleak if bo_reserve > fails, but it seems not matter. I have no strong preference. > > Regards, > David Zhou >> >> >> Thanks, >> Alex Bin Xie >> >> ------------------------------------------------------------------------ >> *From:* Zhou, David(ChunMing) >> *Sent:* Friday, April 14, 2017 12:00 AM >> *To:* Xie, AlexBin; amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org >> *Subject:* Re: [PATCH] dmr/amdgpu: Fix wrongly unref of BO >> >> >> On 2017年04月14日 05:34, Alex Xie wrote: >> > According to comment of amdgpu_bo_reserve, amdgpu_bo_reserve >> > can return with -ERESTARTSYS. When this function was interrupted >> > by a signal, BO should not be unref. Otherwise the BO might be >> > released while is kmapped and pinned, or BO MIGHT be deref >> > multiple times, etc. >> r = amdgpu_bo_reserve(adev->vram_scratch.robj, false); >> we have specified interruptible to false, so -ERESTARTSYS isn't possible >> here. >> >> Thanks, >> David Zhou >> > >> > Change-Id: If76071a768950a0d3ad9d5da7fcae04881807621 >> > Signed-off-by: Alex Xie >> > --- >> > drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 +- >> > 1 file changed, 1 insertion(+), 1 deletion(-) >> > >> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c >> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c >> > index 53996e3..1dcc2d1 100644 >> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c >> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c >> > @@ -355,8 +355,8 @@ static void amdgpu_vram_scratch_fini(struct >> amdgpu_device *adev) >> > amdgpu_bo_kunmap(adev->vram_scratch.robj); >> > amdgpu_bo_unpin(adev->vram_scratch.robj); >> > amdgpu_bo_unreserve(adev->vram_scratch.robj); >> > + amdgpu_bo_unref(&adev->vram_scratch.robj); >> > } >> > - amdgpu_bo_unref(&adev->vram_scratch.robj); >> > } >> > >> > /** >> > > > > _______________________________________________ > amd-gfx mailing list > amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org > https://lists.freedesktop.org/mailman/listinfo/amd-gfx --------------E8298088F81D3945B506DFA4 Content-Type: text/html; charset=utf-8 Content-Transfer-Encoding: 8bit
Hi AlexBin,

No, David is right. This is a very common coding pattern in the kernel module.

Freeing up a BO while there still exists a kernel mapping is perfectly ok, the mapping will just be released a bit later on.

So this code is actually perfectly ok and just an optimization, but your patch breaks it and creates a memory leak.

Regards,
Christian.

Am 18.04.2017 um 17:17 schrieb Xie, AlexBin:

Hi David,


When amdgpu_bo_reserve return errors, we cannot release the BO. This is not a memory leak. General speaking, memory leak is unnoticed and unintentional.


The caller of function amdgpu_vram_scratch_fini ignores the return error value...


The "memory leak" is not caused by my patch. It is caused because reserving BO fails.


This patch only aim to make function amdgpu_vram_scratch_fini behave correctly.


To follow up, we can add a warning message when amdgpu_bo_reserve error happens in a different patch.


If function call amdgpu_bo_reserve is changed to uninterruptible, this changes driver behaviour. Without a substantial issue, I would be cautious for such a change.


Thanks,

Alex Bin Xie



From: Zhou, David(ChunMing)
Sent: Monday, April 17, 2017 10:38 PM
To: Xie, AlexBin; amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org
Subject: Re: [PATCH] dmr/amdgpu: Fix wrongly unref of BO
 


On 2017年04月17日 22:54, Xie, AlexBin wrote:

Hi David,


Thanks for the comments. However, please have look at amdgpu_bo_reserve definition.

static inline int amdgpu_bo_reserve(struct amdgpu_bo *bo, bool no_intr)

Ah, this is a wired wrapper for ttm_bo_reserve.


When we call this function like the following:

         r = amdgpu_bo_reserve(adev->vram_scratch.robj, false);
The false means interruptible.


On the other hand,  when amdgpu_bo_reserve function return error, why do we unref BO without kunmap and unpin the BO? This is wrong implementation when amdgpu_bo_reserve return any error.

Yeah, I see your mean, it's in driver un-loading, How about changing to no interruptible? Your patch will make a memleak if bo_reserve fails, but it seems not matter. I have no strong preference.

Regards,
David Zhou


Thanks,
Alex Bin Xie


From: Zhou, David(ChunMing)
Sent: Friday, April 14, 2017 12:00 AM
To: Xie, AlexBin; amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org
Subject: Re: [PATCH] dmr/amdgpu: Fix wrongly unref of BO
 


On 2017年04月14日 05:34, Alex Xie wrote:
> According to comment of amdgpu_bo_reserve, amdgpu_bo_reserve
> can return with -ERESTARTSYS. When this function was interrupted
> by a signal, BO should not be unref. Otherwise the BO might be
> released while is kmapped and pinned, or BO MIGHT be deref
> multiple times, etc.
         r = amdgpu_bo_reserve(adev->vram_scratch.robj, false);
we have specified interruptible to false, so -ERESTARTSYS isn't possible
here.

Thanks,
David Zhou
>
> Change-Id: If76071a768950a0d3ad9d5da7fcae04881807621
> Signed-off-by: Alex Xie <AlexBin.Xie-5C7GfCeVMHo@public.gmane.org>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 53996e3..1dcc2d1 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -355,8 +355,8 @@ static void amdgpu_vram_scratch_fini(struct amdgpu_device *adev)
>                amdgpu_bo_kunmap(adev->vram_scratch.robj);
>                amdgpu_bo_unpin(adev->vram_scratch.robj);
>                amdgpu_bo_unreserve(adev->vram_scratch.robj);
> +             amdgpu_bo_unref(&adev->vram_scratch.robj);
>        }
> -     amdgpu_bo_unref(&adev->vram_scratch.robj);
>   }
>  
>   /**




_______________________________________________
amd-gfx mailing list
amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


--------------E8298088F81D3945B506DFA4-- --===============0045821482== Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: base64 Content-Disposition: inline X19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX18KYW1kLWdmeCBt YWlsaW5nIGxpc3QKYW1kLWdmeEBsaXN0cy5mcmVlZGVza3RvcC5vcmcKaHR0cHM6Ly9saXN0cy5m cmVlZGVza3RvcC5vcmcvbWFpbG1hbi9saXN0aW5mby9hbWQtZ2Z4Cg== --===============0045821482==--