Hi Christian, Have you found how/where/when? When you said "mapping will just be released a bit later on", you must know the answer. It is difficult to prove something does not exist. Anyway, I will give it a try to prove such "later on" does not exist. Function ttm_bo_kunmap is the only function to unmap. To prove this, search ttm_bo_map_iomap, only ttm_bo_kunmap use this enum to correctly kunmap. Function ttm_bo_kunmap is not called by ttm itself. This is a hint that all TTM delay delete mechanism or unref mechanism will NOT kunmap BO later on. Function ttm_bo_kunmap is called by AMDGPU function amdgpu_bo_kunmap and amdgpu_gem_prime_vunmap. Search AMDGPU for amdgpu_bo_kunmap. All matches do not kunmap for scratch VRAM BO. amdgpu_bo_free_kernel is a suspect but the answer is still NO. So all possibilities are searched. Did I miss anything? Thanks, Alex Bin Xie ________________________________ From: Xie, AlexBin Sent: Tuesday, April 18, 2017 2:04:33 PM To: Christian König; Zhou, David(ChunMing); amd-gfx@lists.freedesktop.org Subject: Re: [PATCH] dmr/amdgpu: Fix wrongly unref of BO Hi Christian, Would you point out where/when will kunmap happen for this BO when release? It must be somewhere in some function calls. I checked before I asked for review. But I did not see such obvious kunmap function call. If so, there should be a comment in function amdgpu_vram_scratch_fini to avoid future confusion. Thanks, Alex Bin Xie ________________________________ From: Christian König Sent: Tuesday, April 18, 2017 1:46 PM To: Xie, AlexBin; Zhou, David(ChunMing); amd-gfx@lists.freedesktop.org Subject: Re: [PATCH] dmr/amdgpu: Fix wrongly unref of BO Hi AlexBin, No, David is right. This is a very common coding pattern in the kernel module. Freeing up a BO while there still exists a kernel mapping is perfectly ok, the mapping will just be released a bit later on. So this code is actually perfectly ok and just an optimization, but your patch breaks it and creates a memory leak. Regards, Christian. Am 18.04.2017 um 17:17 schrieb Xie, AlexBin: Hi David, When amdgpu_bo_reserve return errors, we cannot release the BO. This is not a memory leak. General speaking, memory leak is unnoticed and unintentional. The caller of function amdgpu_vram_scratch_fini ignores the return error value... The "memory leak" is not caused by my patch. It is caused because reserving BO fails. This patch only aim to make function amdgpu_vram_scratch_fini behave correctly. To follow up, we can add a warning message when amdgpu_bo_reserve error happens in a different patch. If function call amdgpu_bo_reserve is changed to uninterruptible, this changes driver behaviour. Without a substantial issue, I would be cautious for such a change. Thanks, Alex Bin Xie ________________________________ From: Zhou, David(ChunMing) Sent: Monday, April 17, 2017 10:38 PM To: Xie, AlexBin; amd-gfx@lists.freedesktop.org Subject: Re: [PATCH] dmr/amdgpu: Fix wrongly unref of BO On 2017年04月17日 22:54, Xie, AlexBin wrote: Hi David, Thanks for the comments. However, please have look at amdgpu_bo_reserve definition. static inline int amdgpu_bo_reserve(struct amdgpu_bo *bo, bool no_intr) Ah, this is a wired wrapper for ttm_bo_reserve. When we call this function like the following: r = amdgpu_bo_reserve(adev->vram_scratch.robj, false); The false means interruptible. On the other hand, when amdgpu_bo_reserve function return error, why do we unref BO without kunmap and unpin the BO? This is wrong implementation when amdgpu_bo_reserve return any error. Yeah, I see your mean, it's in driver un-loading, How about changing to no interruptible? Your patch will make a memleak if bo_reserve fails, but it seems not matter. I have no strong preference. Regards, David Zhou Thanks, Alex Bin Xie ________________________________ From: Zhou, David(ChunMing) Sent: Friday, April 14, 2017 12:00 AM To: Xie, AlexBin; amd-gfx@lists.freedesktop.org Subject: Re: [PATCH] dmr/amdgpu: Fix wrongly unref of BO On 2017年04月14日 05:34, Alex Xie wrote: > According to comment of amdgpu_bo_reserve, amdgpu_bo_reserve > can return with -ERESTARTSYS. When this function was interrupted > by a signal, BO should not be unref. Otherwise the BO might be > released while is kmapped and pinned, or BO MIGHT be deref > multiple times, etc. r = amdgpu_bo_reserve(adev->vram_scratch.robj, false); we have specified interruptible to false, so -ERESTARTSYS isn't possible here. Thanks, David Zhou > > Change-Id: If76071a768950a0d3ad9d5da7fcae04881807621 > Signed-off-by: Alex Xie > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > index 53996e3..1dcc2d1 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > @@ -355,8 +355,8 @@ static void amdgpu_vram_scratch_fini(struct amdgpu_device *adev) > amdgpu_bo_kunmap(adev->vram_scratch.robj); > amdgpu_bo_unpin(adev->vram_scratch.robj); > amdgpu_bo_unreserve(adev->vram_scratch.robj); > + amdgpu_bo_unref(&adev->vram_scratch.robj); > } > - amdgpu_bo_unref(&adev->vram_scratch.robj); > } > > /** _______________________________________________ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx