From: "Sharma, Shashank" <shashank.sharma@amd.com>
To: "Lazar, Lijo" <lijo.lazar@amd.com>,
"amd-gfx@lists.freedesktop.org" <amd-gfx@lists.freedesktop.org>
Cc: "Deucher, Alexander" <Alexander.Deucher@amd.com>,
"Somalapuram Amaranath" <Amaranath.Somalapuram@amd.com>,
"Christian König" <Christian.Koenig@amd.com>
Subject: Re: [PATCH 4/4] drm/amdgpu/nv: add navi GPU reset handler
Date: Fri, 4 Feb 2022 17:38:41 +0100
Message-ID: <817df2c3-e7af-92cb-53f8-8bc70b69b988@amd.com>
In-Reply-To: <a0693436-619c-efa1-b3f1-2fca6377e2fe@amd.com>
Hey Lijo,
I somehow missed responding to this comment; please find my reply inline:
Regards
Shashank
On 1/22/2022 7:42 AM, Lazar, Lijo wrote:
>
>
> On 1/22/2022 2:04 AM, Sharma, Shashank wrote:
>> From 899ec6060eb7d8a3d4d56ab439e4e6cdd74190a4 Mon Sep 17 00:00:00 2001
>> From: Somalapuram Amaranath <Amaranath.Somalapuram@amd.com>
>> Date: Fri, 21 Jan 2022 14:19:42 +0530
>> Subject: [PATCH 4/4] drm/amdgpu/nv: add navi GPU reset handler
>>
>> This patch adds a GPU reset handler for the Navi ASIC family, which
>> typically dumps some of the registers and sends a trace event.
>>
>> V2: Accommodated call to work function to send uevent
>>
>> Signed-off-by: Somalapuram Amaranath <Amaranath.Somalapuram@amd.com>
>> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
>> ---
>> drivers/gpu/drm/amd/amdgpu/nv.c | 28 ++++++++++++++++++++++++++++
>> 1 file changed, 28 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/nv.c
>> b/drivers/gpu/drm/amd/amdgpu/nv.c
>> index 01efda4398e5..ada35d4c5245 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/nv.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/nv.c
>> @@ -528,10 +528,38 @@ nv_asic_reset_method(struct amdgpu_device *adev)
>>  	}
>>  }
>>
>> +static void amdgpu_reset_dumps(struct amdgpu_device *adev)
>> +{
>> +	int r = 0, i;
>> +
>> +	/* original raven doesn't have full asic reset */
>> +	if ((adev->apu_flags & AMD_APU_IS_RAVEN) &&
>> +	    !(adev->apu_flags & AMD_APU_IS_RAVEN2))
>> +		return;
>> +	for (i = 0; i < adev->num_ip_blocks; i++) {
>> +		if (!adev->ip_blocks[i].status.valid)
>> +			continue;
>> +		if (!adev->ip_blocks[i].version->funcs->reset_reg_dumps)
>> +			continue;
>> +		r = adev->ip_blocks[i].version->funcs->reset_reg_dumps(adev);
>> +
>> +		if (r)
>> +			DRM_ERROR("reset_reg_dumps of IP block <%s> failed %d\n",
>> +				  adev->ip_blocks[i].version->funcs->name, r);
>> +	}
>> +
>> +	/* Schedule work to send uevent */
>> +	if (!queue_work(system_unbound_wq, &adev->gpu_reset_work))
>> +		DRM_ERROR("failed to add GPU reset work\n");
>> +
>> +	dump_stack();
>> +}
>> +
>>  static int nv_asic_reset(struct amdgpu_device *adev)
>>  {
>>  	int ret = 0;
>>
>> +	amdgpu_reset_dumps(adev);
>
> Had a comment on this before. Now there are different reasons (or even
> no reason like a precautionary reset) to perform reset. A user would be
> interested in a trace only if the reason is valid.
>
> To clarify on why a work shouldn't be scheduled on every reset, check
> here -
>
> https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c#L2188
In the example you pointed to, the kernel side itself has a criterion
for deciding what counts as a valid reset in its context, so it can
decide whether or not to act on it.
In our case, however, we want to send the trace event to user space,
along with some register values, on every reset. It is then up to the
profiling app to interpret it (including what it wants to call a GPU
reset). So I don't think this causes considerable overhead.
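
To make that split concrete, here is a minimal user-space sketch in C.
It is not part of this patch: struct reset_event, emit_reset_event(),
app_filter_nonzero() and count_interesting() are all hypothetical names,
and the "any non-zero register" policy is only an assumed example. The
producer emits an event on every reset, and the consumer decides which
ones matter:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical payload: a few register values captured at reset time. */
#define RESET_EVENT_MAX_REGS 4

struct reset_event {
	uint32_t regs[RESET_EVENT_MAX_REGS];
	size_t num_regs;
};

/* Hypothetical consumer callback, standing in for the profiling app. */
typedef bool (*reset_filter_fn)(const struct reset_event *ev);

/*
 * Producer side (models the driver): fills in an event on every reset,
 * without judging whether the reset is "valid".
 */
static void emit_reset_event(struct reset_event *ev,
			     const uint32_t *regs, size_t num_regs)
{
	size_t i;

	if (num_regs > RESET_EVENT_MAX_REGS)
		num_regs = RESET_EVENT_MAX_REGS;
	for (i = 0; i < num_regs; i++)
		ev->regs[i] = regs[i];
	ev->num_regs = num_regs;
}

/*
 * Consumer side (models the profiling app): assumed example policy that
 * treats a reset as interesting only if any captured register is non-zero.
 */
static bool app_filter_nonzero(const struct reset_event *ev)
{
	size_t i;

	for (i = 0; i < ev->num_regs; i++)
		if (ev->regs[i] != 0)
			return true;
	return false;
}

/* Count how many of the emitted events the consumer chooses to keep. */
static size_t count_interesting(const struct reset_event *evs, size_t n,
				reset_filter_fn filter)
{
	size_t i, kept = 0;

	for (i = 0; i < n; i++)
		if (filter(&evs[i]))
			kept++;
	return kept;
}
```

The point of passing the filter as a callback is that the policy lives
entirely on the consumer side and can change without touching the code
that emits the events.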
- Shashank
>
>
>
> Thanks,
> Lijo
>
>>  	switch (nv_asic_reset_method(adev)) {
>>  	case AMD_RESET_METHOD_PCI:
>>  		dev_info(adev->dev, "PCI reset\n");