From: Nirmoy <nirmodas@amd.com>
To: "Christian König" <christian.koenig@amd.com>,
	"Nirmoy Das" <nirmoy.das@amd.com>,
	amd-gfx@lists.freedesktop.org
Cc: alexander.deucher@amd.com
Subject: Re: [RFC drm-misc-next PATCH 1/1] drm/amdgpu: clean up bo in vce and vcn test
Date: Tue, 8 Dec 2020 15:49:42 +0100
Message-ID: <9c4046b2-630a-90bf-1b47-c0182bf247a2@amd.com>
In-Reply-To: <9e785a70-24a2-33ff-4cba-fd40cad53ad8@amd.com>


On 12/8/20 3:45 PM, Christian König wrote:
> Yes, correct.
>
> You could add an amdgpu_bo_free_reserved() function, but we only have 
> this one case for that so I think it's probably not worth it.
>
> Just one more comment below.
>
> Am 08.12.20 um 15:42 schrieb Nirmoy:
>> I think I know why I needed to keep that amdgpu_bo_unreserve() before 
>> calling amdgpu_bo_free_kernel().
>>
>> amdgpu_bo_free_kernel() --> amdgpu_bo_reserve() --> ttm_bo_reserve() -->
>> dma_resv_lock(bo->base.resv, ticket).
>>
>> amdgpu_bo_create_reserved() has already locked that dma_resv by calling
>> amdgpu_bo_reserve(). So "amdgpu_bo_create_reserved();
>> amdgpu_bo_free_kernel();" will lead to a deadlock.
>>
>>
>> I wonder if we should have a separate API to clean up BOs created by
>> amdgpu_bo_create_reserved().
>>
>>
>> Regards,
>>
>> Nirmoy
>>
>> On 12/8/20 3:23 PM, Nirmoy Das wrote:
>>> The BO created with amdgpu_bo_create_reserved() wasn't cleaned up
>>> properly before, which causes:
>>>
>>> [   21.056218] WARNING: CPU: 0 PID: 7 at 
>>> drivers/gpu/drm/ttm/ttm_bo.c:518 ttm_bo_release+0x2bf/0x310 [ttm]
>>> [   21.056219] Modules linked in: amdgpu(E) iommu_v2(E) gpu_sched(E) 
>>> drm_ttm_helper(E) ttm(E) drm_kms_helper(E) syscopyarea(E) 
>>> sysfillrect(E) sysimgblt(E) fb_sys_fops(E) cec(E) rc_core(E) 
>>> rfcomm(E) af_packet(E) cmac(E) algif_hash(E) algif_skcipher(E) 
>>> af_alg(E) xt_CHECKSUM(E) xt_MASQUERADE(E) nf_nat_tftp(E) 
>>> nf_conntrack_tftp(E) xt_CT(E) bridge(E) stp(E) llc(E) ip6t_REJECT(E) 
>>> nf_reject_ipv6(E) ip6t_rpfilter(E) xt_tcpudp(E) ipt_REJECT(E) 
>>> nf_reject_ipv4(E) xt_conntrack(E) ebtable_nat(E) ebtable_broute(E) 
>>> ip6table_nat(E) ip6table_mangle(E) ip6table_raw(E) 
>>> ip6table_security(E) iptable_nat(E) nf_nat(E) nf_conntrack(E) 
>>> nf_defrag_ipv6(E) nf_defrag_ipv4(E) iptable_mangle(E) iptable_raw(E) 
>>> iptable_security(E) ip_set(E) nfnetlink(E) ebtable_filter(E) 
>>> ebtables(E) ip6table_filter(E) ip6_tables(E) iptable_filter(E) 
>>> ip_tables(E) x_tables(E) bpfilter(E) joydev(E) hid_generic(E) 
>>> usbhid(E) bnep(E) btusb(E) btrtl(E) btbcm(E) btintel(E) bluetooth(E) 
>>> ecdh_generic(E) ecc(E) rfkill(E) dmi_sysfs(E) msr(E) wmi_bmof(E)
>>> [   21.056266]  mxm_wmi(E) snd_hda_codec_realtek(E) 
>>> snd_hda_codec_generic(E) ledtrig_audio(E) snd_hda_codec_hdmi(E) 
>>> kvm_amd(E) snd_hda_intel(E) kvm(E) snd_intel_dspcfg(E) 
>>> snd_hda_codec(E) snd_hwdep(E) snd_hda_core(E) irqbypass(E) 
>>> snd_pcm(E) snd_timer(E) snd(E) soundcore(E) igb(E) sp5100_tco(E) 
>>> pcspkr(E) i2c_piix4(E) k10temp(E) i2c_algo_bit(E) dca(E) wmi(E) 
>>> tiny_power_button(E) gpio_amdpt(E) gpio_generic(E) acpi_cpufreq(E) 
>>> button(E) drm(E) crct10dif_pclmul(E) crc32_pclmul(E) 
>>> ghash_clmulni_intel(E) aesni_intel(E) glue_helper(E) crypto_simd(E) 
>>> cryptd(E) xhci_pci(E) nvme(E) xhci_pci_renesas(E) nvme_core(E) 
>>> xhci_hcd(E) ccp(E) usbcore(E) btrfs(E) blake2b_generic(E) 
>>> libcrc32c(E) crc32c_intel(E) xor(E) raid6_pq(E) sg(E) 
>>> dm_multipath(E) dm_mod(E) scsi_dh_rdac(E) scsi_dh_emc(E) 
>>> scsi_dh_alua(E)
>>> [   21.056307] CPU: 0 PID: 7 Comm: kworker/0:1 Tainted: G            
>>> E     5.10.0-rc3-1-default+ #45
>>> [   21.056309] Hardware name: Gigabyte Technology Co., Ltd. X399 
>>> DESIGNARE EX/X399 DESIGNARE EX-CF, BIOS F12i 09/24/2019
>>> [   21.056409] Workqueue: events 
>>> amdgpu_device_delayed_init_work_handler [amdgpu]
>>> [   21.056415] RIP: 0010:ttm_bo_release+0x2bf/0x310 [ttm]
>>> [   21.056418] Code: e9 a1 fd ff ff e8 b1 00 f0 db e9 d2 fd ff ff 49 
>>> 8b 7e 88 b9 4c 1d 00 00 31 d2 be 01 00 00 00 e8 87 27 f0 db 49 8b 46 
>>> d8 eb 9e <0f> 0b 41 c7 86 9c 00 00 00 00 00 00 00 4c 89 ef e8 9c ef 
>>> ff ff 49
>>> [   21.056419] RSP: 0018:ffff9cca00107d48 EFLAGS: 00010202
>>> [   21.056421] RAX: 0000000000000001 RBX: 0000000000000000 RCX: 
>>> 000000001ec00000
>>> [   21.056423] RDX: 0000000000000001 RSI: 0000000000000246 RDI: 
>>> ffffffffc0e61b28
>>> [   21.056424] RBP: ffff90fa735e55b8 R08: ffff90fa731e8db8 R09: 
>>> 0000000000000000
>>> [   21.056425] R10: ffff90fa7335e000 R11: ffff90fa7335e000 R12: 
>>> ffffffffc0e61b28
>>> [   21.056426] R13: ffff90fa465f3c58 R14: ffff90fa465f3dc8 R15: 
>>> ffff90fa5622d600
>>> [   21.056427] FS:  0000000000000000(0000) GS:ffff91019ec00000(0000) 
>>> knlGS:0000000000000000
>>> [   21.056428] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>> [   21.056429] CR2: 00007f1150baeff8 CR3: 0000000116c70000 CR4: 
>>> 00000000003506f0
>>> [   21.056430] Call Trace:
>>> [   21.056525]  amdgpu_bo_unref+0x1a/0x30 [amdgpu]
>>> [   21.056635]  amdgpu_vcn_dec_send_msg+0x1b2/0x270 [amdgpu]
>>> [   21.056740]  amdgpu_vcn_dec_get_create_msg.constprop.0+0xd8/0x100 [amdgpu]
>>> [   21.056843]  amdgpu_vcn_dec_ring_test_ib+0x27/0x180 [amdgpu]
>>> [   21.056936]  amdgpu_ib_ring_tests+0xf1/0x150 [amdgpu]
>>> [   21.057024]  amdgpu_device_delayed_init_work_handler+0x11/0x30 [amdgpu]
>>> [   21.057030]  process_one_work+0x1df/0x370
>>> [   21.057033]  worker_thread+0x46/0x340
>>> [   21.057034]  ? process_one_work+0x370/0x370
>>> [   21.057037]  kthread+0x11b/0x140
>>> [   21.057039]  ? __kthread_bind_mask+0x60/0x60
>>> [   21.057043]  ret_from_fork+0x22/0x30
>>>
>>> Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
>>> ---
>>>
>>> I had to keep amdgpu_bo_unreserve() before calling
>>> amdgpu_bo_free_kernel(), or else amdgpu doesn't respond after loading.
>>> Is there a better solution?
>>>
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c |  2 +-
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c | 16 ++++++++++------
>>>   2 files changed, 11 insertions(+), 7 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
>>> index ecaa2d7483b2..78a4dd9bf11f 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
>>> @@ -1151,6 +1151,6 @@ int amdgpu_vce_ring_test_ib(struct amdgpu_ring 
>>> *ring, long timeout)
>>>   error:
>>>       dma_fence_put(fence);
>>>       amdgpu_bo_unreserve(bo);
>>> -    amdgpu_bo_unref(&bo);
>>> +    amdgpu_bo_free_kernel(&bo, NULL, NULL);
>>>       return r;
>>>   }
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
>>> index 7e19a6656715..dfcdd38ff9c2 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
>>> @@ -491,8 +491,6 @@ static int amdgpu_vcn_dec_send_msg(struct 
>>> amdgpu_ring *ring,
>>>           goto err_free;
>>>
>>>       amdgpu_bo_fence(bo, f, false);
>>> -    amdgpu_bo_unreserve(bo);
>>> -    amdgpu_bo_unref(&bo);
>>>
>>>       if (fence)
>>>           *fence = dma_fence_get(f);
>>> @@ -504,8 +502,6 @@ static int amdgpu_vcn_dec_send_msg(struct 
>>> amdgpu_ring *ring,
>>>       amdgpu_job_free(job);
>>>
>>>   err:
>>> -    amdgpu_bo_unreserve(bo);
>>> -    amdgpu_bo_unref(&bo);
>>>       return r;
>>>   }
>>>
>>> @@ -540,7 +536,11 @@ static int amdgpu_vcn_dec_get_create_msg(struct 
>>> amdgpu_ring *ring, uint32_t hand
>>>       for (i = 14; i < 1024; ++i)
>>>           msg[i] = cpu_to_le32(0x0);
>>>
>>> -    return amdgpu_vcn_dec_send_msg(ring, bo, fence);
>>> +    r = amdgpu_vcn_dec_send_msg(ring, bo, fence);
>>> +    amdgpu_bo_unreserve(bo);
>>> +    amdgpu_bo_free_kernel(&bo, NULL, (void **)&msg);
>
> Why did you move that here? As far as I can see, you just need to replace
> the amdgpu_bo_unref() above with amdgpu_bo_free_kernel().


The BO was created in amdgpu_vcn_dec_get_create_msg(), so I thought it
would be better to keep the cleanup code in the same function.
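
For illustration only, the locking constraint discussed earlier in this thread can be modelled in plain userspace C. This is a toy sketch, not driver code: struct toy_bo and every toy_* function are made-up stand-ins, the dma_resv lock is reduced to a boolean, and where the real amdgpu_bo_reserve() would block forever the model returns -EDEADLK so the problem is observable. toy_bo_free_reserved() is a guess at what the suggested amdgpu_bo_free_reserved() helper might look like; no such function exists in the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-in for a buffer object; NOT the real struct amdgpu_bo. */
struct toy_bo {
	bool reserved;	/* models the non-recursive dma_resv lock */
	bool pinned;
};

/*
 * Models amdgpu_bo_reserve(). The real lock would block if it is already
 * held; the toy reports -EDEADLK instead so the caller can see it.
 */
static int toy_bo_reserve(struct toy_bo *bo)
{
	if (bo->reserved)
		return -EDEADLK;
	bo->reserved = true;
	return 0;
}

/* Models amdgpu_bo_unreserve(). */
static void toy_bo_unreserve(struct toy_bo *bo)
{
	bo->reserved = false;
}

/* Models amdgpu_bo_create_reserved(): returns with the lock still held. */
static struct toy_bo *toy_bo_create_reserved(void)
{
	struct toy_bo *bo = calloc(1, sizeof(*bo));

	if (!bo)
		return NULL;
	bo->pinned = true;
	bo->reserved = true;
	return bo;
}

/* Models amdgpu_bo_free_kernel(): takes the reservation itself. */
static int toy_bo_free_kernel(struct toy_bo **bo)
{
	int r;

	if (!*bo)
		return 0;
	r = toy_bo_reserve(*bo);
	if (r)
		return r;	/* caller still holds the lock */
	(*bo)->pinned = false;
	toy_bo_unreserve(*bo);
	free(*bo);
	*bo = NULL;
	return 0;
}

/*
 * Hypothetical helper in the spirit of the suggested
 * amdgpu_bo_free_reserved(): expects the lock to be held already.
 */
static int toy_bo_free_reserved(struct toy_bo **bo)
{
	if (!*bo)
		return 0;
	if (!(*bo)->reserved)
		return -EINVAL;
	(*bo)->pinned = false;
	toy_bo_unreserve(*bo);
	free(*bo);
	*bo = NULL;
	return 0;
}
```

In this model, calling toy_bo_free_kernel() directly on a BO returned by toy_bo_create_reserved() fails with -EDEADLK, while calling toy_bo_unreserve() first (as the patch does) or using the hypothetical toy_bo_free_reserved() succeeds and clears the pointer.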


Regards,

Nirmoy


>
> Christian.
>
>>> +
>>> +    return r;
>>>   }
>>>
>>>   static int amdgpu_vcn_dec_get_destroy_msg(struct amdgpu_ring 
>>> *ring, uint32_t handle,
>>> @@ -566,7 +566,11 @@ static int 
>>> amdgpu_vcn_dec_get_destroy_msg(struct amdgpu_ring *ring, uint32_t han
>>>       for (i = 6; i < 1024; ++i)
>>>           msg[i] = cpu_to_le32(0x0);
>>>
>>> -    return amdgpu_vcn_dec_send_msg(ring, bo, fence);
>>> +    r = amdgpu_vcn_dec_send_msg(ring, bo, fence);
>>> +    amdgpu_bo_unreserve(bo);
>>> +    amdgpu_bo_free_kernel(&bo, NULL, (void **)&msg);
>>> +
>>> +    return r;
>>>   }
>>>
>>>   int amdgpu_vcn_dec_ring_test_ib(struct amdgpu_ring *ring, long 
>>> timeout)
>>> -- 
>>> 2.29.2
>>>
>>
>
