Re: [PATCH] drm/amdgpu: Fix a potential sdma invalid access

From: "Christian König" <christian.koenig@amd.com>
To: Qu Huang <jinsdb@126.com>,
	alexander.deucher@amd.com, airlied@linux.ie, daniel@ffwll.ch,
	sumit.semwal@linaro.org, airlied@redhat.com, ray.huang@amd.com,
	Mihir.Patel@amd.com, nirmoy.aiemd@gmail.com
Cc: amd-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org,
	linux-kernel@vger.kernel.org, linux-media@vger.kernel.org
Subject: Re: [PATCH] drm/amdgpu: Fix a potential sdma invalid access
Date: Sat, 3 Apr 2021 10:49:38 +0200	[thread overview]
Message-ID: <fc8cf44b-e952-1c48-137a-77c35bbfbb24@amd.com> (raw)
In-Reply-To: <84619284-d401-f018-d627-8e5dfef1ff1f@126.com>

Hi Qu,

Am 03.04.21 um 07:08 schrieb Qu Huang:
> Hi Christian,
>
> On 2021/4/3 0:25, Christian König wrote:
>> Hi Qu,
>>
>> Am 02.04.21 um 05:18 schrieb Qu Huang:
>>> Before dma_resv_lock(bo->base.resv, NULL) in 
>>> amdgpu_bo_release_notify(),
>>> the bo->base.resv lock may be held by ttm_mem_evict_first(),
>>
>> That can't happen since when bo_release_notify is called the BO has not
>> more references and is therefore deleted.
>>
>> And we never evict a deleted BO, we just wait for it to become idle.
>>
> Yes, the bo reference counter return to zero will enter
> ttm_bo_release(),but notify bo release (call amdgpu_bo_release_notify())
> first happen, and then test if a reservation object's fences have been
> signaled, and then mark bo as deleted and remove bo from the LRU list.
>
> When ttm_bo_release() and ttm_mem_evict_first() is concurrent,
> the Bo has not been removed from the LRU list and is not marked as
> deleted, this will happen.

Not sure on which code base you are, but I don't see how this can happen.

ttm_mem_evict_first() calls ttm_bo_get_unless_zero() and 
ttm_bo_release() is only called when the BO reference count becomes zero.

So ttm_mem_evict_first() will see that this BO is about to be destroyed 
and skips it.

>
> As a test, when we use CPU memset instead of SDMA fill in
> amdgpu_bo_release_notify(), the result is page fault:
>
> PID: 5490   TASK: ffff8e8136e04100  CPU: 4   COMMAND: "gemmPerf"
>   #0 [ffff8e79eaa17970] machine_kexec at ffffffffb2863784
>   #1 [ffff8e79eaa179d0] __crash_kexec at ffffffffb291ce92
>   #2 [ffff8e79eaa17aa0] crash_kexec at ffffffffb291cf80
>   #3 [ffff8e79eaa17ab8] oops_end at ffffffffb2f6c768
>   #4 [ffff8e79eaa17ae0] no_context at ffffffffb2f5aaa6
>   #5 [ffff8e79eaa17b30] __bad_area_nosemaphore at ffffffffb2f5ab3d
>   #6 [ffff8e79eaa17b80] bad_area_nosemaphore at ffffffffb2f5acae
>   #7 [ffff8e79eaa17b90] __do_page_fault at ffffffffb2f6f6c0
>   #8 [ffff8e79eaa17c00] do_page_fault at ffffffffb2f6f925
>   #9 [ffff8e79eaa17c30] page_fault at ffffffffb2f6b758
>      [exception RIP: memset+31]
>      RIP: ffffffffb2b8668f  RSP: ffff8e79eaa17ce8  RFLAGS: 00010a17
>      RAX: bebebebebebebebe  RBX: ffff8e747bff10c0  RCX: 0000060b00200000
>      RDX: 0000000000000000  RSI: 00000000000000be  RDI: ffffab807f000000
>      RBP: ffff8e79eaa17d10   R8: ffff8e79eaa14000   R9: ffffab7c80000000
>      R10: 000000000000bcba  R11: 00000000000001ba  R12: ffff8e79ebaa4050
>      R13: ffffab7c80000000  R14: 0000000000022600  R15: ffff8e8136e04100
>      ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
> #10 [ffff8e79eaa17ce8] amdgpu_bo_release_notify at ffffffffc092f2d1 
> [amdgpu]
> #11 [ffff8e79eaa17d18] ttm_bo_release at ffffffffc08f39dd [amdttm]
> #12 [ffff8e79eaa17d58] amdttm_bo_put at ffffffffc08f3c8c [amdttm]
> #13 [ffff8e79eaa17d68] amdttm_bo_vm_close at ffffffffc08f7ac9 [amdttm]
> #14 [ffff8e79eaa17d80] remove_vma at ffffffffb29ef115
> #15 [ffff8e79eaa17da0] exit_mmap at ffffffffb29f2c64
> #16 [ffff8e79eaa17e58] mmput at ffffffffb28940c7
> #17 [ffff8e79eaa17e78] do_exit at ffffffffb289dc95
> #18 [ffff8e79eaa17f10] do_group_exit at ffffffffb289e4cf
> #19 [ffff8e79eaa17f40] sys_exit_group at ffffffffb289e544
> #20 [ffff8e79eaa17f50] system_call_fastpath at ffffffffb2f74ddb

Well that might be perfectly expected. VRAM is not necessarily CPU 
accessible.

Regards,
Christian.

>
> Regards,
> Qu.
>
>
>> Regards,
>> Christian.
>>
>>> and the VRAM mem will be evicted, mem region was replaced
>>> by Gtt mem region. amdgpu_bo_release_notify() will then
>>> hold the bo->base.resv lock, and SDMA will get an invalid
>>> address in amdgpu_fill_buffer(), resulting in a VMFAULT
>>> or memory corruption.
>>>
>>> To avoid it, we have to hold bo->base.resv lock first, and
>>> check whether the mem.mem_type is TTM_PL_VRAM.
>>>
>>> Signed-off-by: Qu Huang <jinsdb@126.com>
>>> ---
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 8 ++++++--
>>>   1 file changed, 6 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>> index 4b29b82..8018574 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>> @@ -1300,12 +1300,16 @@ void amdgpu_bo_release_notify(struct
>>> ttm_buffer_object *bo)
>>>       if (bo->base.resv == &bo->base._resv)
>>>           amdgpu_amdkfd_remove_fence_on_pt_pd_bos(abo);
>>>
>>> -    if (bo->mem.mem_type != TTM_PL_VRAM || !bo->mem.mm_node ||
>>> -        !(abo->flags & AMDGPU_GEM_CREATE_VRAM_WIPE_ON_RELEASE))
>>> +    if (!(abo->flags & AMDGPU_GEM_CREATE_VRAM_WIPE_ON_RELEASE))
>>>           return;
>>>
>>>       dma_resv_lock(bo->base.resv, NULL);
>>>
>>> +    if (bo->mem.mem_type != TTM_PL_VRAM || !bo->mem.mm_node) {
>>> +        dma_resv_unlock(bo->base.resv);
>>> +        return;
>>> +    }
>>> +
>>>       r = amdgpu_fill_buffer(abo, AMDGPU_POISON, bo->base.resv, 
>>> &fence);
>>>       if (!WARN_ON(r)) {
>>>           amdgpu_bo_fence(abo, fence, false);
>>> -- 
>>> 1.8.3.1
>>>
>