Hi AlexBin, the answer is ttm_bo_kunmap isn't called at all and yes in the case of an iomap we leak the address space reserved. But that is completely harmless on a 64bit system compared to leaking the memory backing the address space. Using amdgpu_bo_free_kernel() instead of openly coding it here is probably a good idea. Additional to that it's probably a good idea to set the no_intr flag when reserving kernel BOs. It is impossible to receive a signal during module load/unload, but it's probably better to document that in the code as well. Regards, Christian. Am 18.04.2017 um 20:54 schrieb Xie, AlexBin: > Hi Christian, > > Have you found how/where/when? When you said "mapping will just be > released a bit later on", you must know the answer. > > It is difficult to prove something does not exist. Anyway, I will give > it a try to prove such "later on" does not exist. > > Function ttm_bo_kunmap is the only function to unmap. To prove this, > search ttm_bo_map_iomap, only ttm_bo_kunmap use this enum to correctly > kunmap. > > Function ttm_bo_kunmap is not called by ttm itself. This is a hint > that all TTM delay delete mechanism or unref mechanism will NOT kunmap > BO later on. > > Function ttm_bo_kunmap is called by AMDGPU function amdgpu_bo_kunmap > and amdgpu_gem_prime_vunmap. > > Search AMDGPU for amdgpu_bo_kunmap. All matches do not kunmap for > scratch VRAM BO. amdgpu_bo_free_kernel is a suspect but the answer is > still NO. > > So all possibilities are searched. Did I miss anything? > > Thanks, > Alex Bin Xie > > ------------------------------------------------------------------------ > *From:* Xie, AlexBin > *Sent:* Tuesday, April 18, 2017 2:04:33 PM > *To:* Christian König; Zhou, David(ChunMing); > amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org > *Subject:* Re: [PATCH] dmr/amdgpu: Fix wrongly unref of BO > > Hi Christian, > > > Would you point out where/when will kunmap happen for this BO when > release? It must be somewhere in some function calls. > > > I checked before I asked for review. But I did not see such obvious > kunmap function call. > > > If so, there should be a comment in function amdgpu_vram_scratch_fini > to avoid future confusion. > > > Thanks, > Alex Bin Xie > ------------------------------------------------------------------------ > *From:* Christian König > *Sent:* Tuesday, April 18, 2017 1:46 PM > *To:* Xie, AlexBin; Zhou, David(ChunMing); amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org > *Subject:* Re: [PATCH] dmr/amdgpu: Fix wrongly unref of BO > Hi AlexBin, > > No, David is right. This is a very common coding pattern in the kernel > module. > > Freeing up a BO while there still exists a kernel mapping is perfectly > ok, the mapping will just be released a bit later on. > > So this code is actually perfectly ok and just an optimization, but > your patch breaks it and creates a memory leak. > > Regards, > Christian. > > Am 18.04.2017 um 17:17 schrieb Xie, AlexBin: >> >> Hi David, >> >> >> When amdgpu_bo_reserve return errors, we cannot release the BO. This >> is not a memory leak. General speaking, memory leak is unnoticed and >> unintentional. >> >> >> The caller of function amdgpu_vram_scratch_fini ignores the return >> error value... >> >> >> The "memory leak" is not caused by my patch. It is caused because >> reserving BO fails. >> >> >> This patch only aim to make function amdgpu_vram_scratch_fini behave >> correctly. >> >> >> To follow up, we can add a warning message when amdgpu_bo_reserve >> error happens in a different patch. >> >> >> If function call amdgpu_bo_reserve is changed to uninterruptible, >> this changes driver behaviour. Without a substantial issue, I would >> be cautious for such a change. >> >> >> Thanks, >> >> Alex Bin Xie >> >> >> ------------------------------------------------------------------------ >> *From:* Zhou, David(ChunMing) >> *Sent:* Monday, April 17, 2017 10:38 PM >> *To:* Xie, AlexBin; amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org >> *Subject:* Re: [PATCH] dmr/amdgpu: Fix wrongly unref of BO >> >> >> On 2017年04月17日 22:54, Xie, AlexBin wrote: >>> >>> Hi David, >>> >>> >>> Thanks for the comments. However, please have look at >>> amdgpu_bo_reserve definition. >>> >>> static inline int amdgpu_bo_reserve(struct amdgpu_bo *bo, bool no_intr) >>> >> Ah, this is a wired wrapper for ttm_bo_reserve. >> >>> >>> When we call this function like the following: >>> >>> r = amdgpu_bo_reserve(adev->vram_scratch.robj, false); >>> The false means interruptible. >>> >>> >>> On the other hand, when amdgpu_bo_reserve function return error, >>> why do we unref BO without kunmap and unpin the BO? This is wrong >>> implementation when amdgpu_bo_reserve return any error. >>> >> Yeah, I see your mean, it's in driver un-loading, How about changing >> to no interruptible? Your patch will make a memleak if bo_reserve >> fails, but it seems not matter. I have no strong preference. >> >> Regards, >> David Zhou >>> >>> >>> Thanks, >>> Alex Bin Xie >>> >>> ------------------------------------------------------------------------ >>> *From:* Zhou, David(ChunMing) >>> *Sent:* Friday, April 14, 2017 12:00 AM >>> *To:* Xie, AlexBin; amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org >>> *Subject:* Re: [PATCH] dmr/amdgpu: Fix wrongly unref of BO >>> >>> >>> On 2017年04月14日 05:34, Alex Xie wrote: >>> > According to comment of amdgpu_bo_reserve, amdgpu_bo_reserve >>> > can return with -ERESTARTSYS. When this function was interrupted >>> > by a signal, BO should not be unref. Otherwise the BO might be >>> > released while is kmapped and pinned, or BO MIGHT be deref >>> > multiple times, etc. >>> r = amdgpu_bo_reserve(adev->vram_scratch.robj, false); >>> we have specified interruptible to false, so -ERESTARTSYS isn't >>> possible >>> here. >>> >>> Thanks, >>> David Zhou >>> > >>> > Change-Id: If76071a768950a0d3ad9d5da7fcae04881807621 >>> > Signed-off-by: Alex Xie >>> > --- >>> > drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 +- >>> > 1 file changed, 1 insertion(+), 1 deletion(-) >>> > >>> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c >>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c >>> > index 53996e3..1dcc2d1 100644 >>> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c >>> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c >>> > @@ -355,8 +355,8 @@ static void amdgpu_vram_scratch_fini(struct >>> amdgpu_device *adev) >>> > amdgpu_bo_kunmap(adev->vram_scratch.robj); >>> > amdgpu_bo_unpin(adev->vram_scratch.robj); >>> > amdgpu_bo_unreserve(adev->vram_scratch.robj); >>> > + amdgpu_bo_unref(&adev->vram_scratch.robj); >>> > } >>> > - amdgpu_bo_unref(&adev->vram_scratch.robj); >>> > } >>> > >>> > /** >>> >> >> >> >> _______________________________________________ >> amd-gfx mailing list >> amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org >> https://lists.freedesktop.org/mailman/listinfo/amd-gfx > > > > > _______________________________________________ > amd-gfx mailing list > amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org > https://lists.freedesktop.org/mailman/listinfo/amd-gfx