From: "Lazar, Lijo" <lijo.lazar@amd.com>
To: "Christian König" <christian.koenig@amd.com>,
"Andrey Grodzovsky" <andrey.grodzovsky@amd.com>,
dri-devel@lists.freedesktop.org, amd-gfx@lists.freedesktop.org
Cc: horace.chen@amd.com, Monk.Liu@amd.com
Subject: Re: [RFC v2 4/8] drm/amdgpu: Serialize non TDR gpu recovery with TDRs
Date: Wed, 5 Jan 2022 18:56:40 +0530 [thread overview]
Message-ID: <6f64e1d2-eec5-04da-2497-9ee2e7dfcf12@amd.com> (raw)
In-Reply-To: <6be71531-40f1-fcad-f54c-03f6e14064f9@amd.com>
On 1/5/2022 6:45 PM, Christian König wrote:
> Am 05.01.22 um 14:11 schrieb Lazar, Lijo:
>> On 1/5/2022 6:01 PM, Christian König wrote:
>>> Am 05.01.22 um 10:54 schrieb Lazar, Lijo:
>>>> On 12/23/2021 3:35 AM, Andrey Grodzovsky wrote:
>>>>> Use reset domain wq also for non TDR gpu recovery triggers
>>>>> such as sysfs and RAS. We must serialize all possible
>>>>> GPU recoveries to guarantee no concurrency there.
>>>>> For TDR call the original recovery function directly since
>>>>> it's already executed from within the wq. For others just
>>>>> use a wrapper to queue work and wait on it to finish.
>>>>>
>>>>> v2: Rename to amdgpu_recover_work_struct
>>>>>
>>>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>>>>> ---
>>>>> drivers/gpu/drm/amd/amdgpu/amdgpu.h        |  2 ++
>>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 33 +++++++++++++++++++++-
>>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_job.c    |  2 +-
>>>>> 3 files changed, 35 insertions(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>>> index b5ff76aae7e0..8e96b9a14452 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>>> @@ -1296,6 +1296,8 @@ bool amdgpu_device_has_job_running(struct amdgpu_device *adev);
>>>>> bool amdgpu_device_should_recover_gpu(struct amdgpu_device *adev);
>>>>> int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
>>>>> struct amdgpu_job* job);
>>>>> +int amdgpu_device_gpu_recover_imp(struct amdgpu_device *adev,
>>>>> + struct amdgpu_job *job);
>>>>> void amdgpu_device_pci_config_reset(struct amdgpu_device *adev);
>>>>> int amdgpu_device_pci_reset(struct amdgpu_device *adev);
>>>>> bool amdgpu_device_need_post(struct amdgpu_device *adev);
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>> index 7c063fd37389..258ec3c0b2af 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>> @@ -4979,7 +4979,7 @@ static void amdgpu_device_recheck_guilty_jobs(
>>>>> * Returns 0 for success or an error on failure.
>>>>> */
>>>>> -int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
>>>>> +int amdgpu_device_gpu_recover_imp(struct amdgpu_device *adev,
>>>>> struct amdgpu_job *job)
>>>>> {
>>>>> struct list_head device_list, *device_list_handle = NULL;
>>>>> @@ -5237,6 +5237,37 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
>>>>> return r;
>>>>> }
>>>>> +struct amdgpu_recover_work_struct {
>>>>> + struct work_struct base;
>>>>> + struct amdgpu_device *adev;
>>>>> + struct amdgpu_job *job;
>>>>> + int ret;
>>>>> +};
>>>>> +
>>>>> +static void amdgpu_device_queue_gpu_recover_work(struct work_struct *work)
>>>>> +{
>>>>> + struct amdgpu_recover_work_struct *recover_work = container_of(work, struct amdgpu_recover_work_struct, base);
>>>>> +
>>>>> + recover_work->ret = amdgpu_device_gpu_recover_imp(recover_work->adev, recover_work->job);
>>>>> +}
>>>>> +/*
>>>>> + * Serialize gpu recover into reset domain single threaded wq
>>>>> + */
>>>>> +int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
>>>>> + struct amdgpu_job *job)
>>>>> +{
>>>>> + struct amdgpu_recover_work_struct work = {.adev = adev, .job = job};
>>>>> +
>>>>> + INIT_WORK(&work.base, amdgpu_device_queue_gpu_recover_work);
>>>>> +
>>>>> + if (!queue_work(adev->reset_domain.wq, &work.base))
>>>>> + return -EAGAIN;
>>>>> +
>>>>
>>>> The decision to schedule a reset is made at this point. Subsequent
>>>> accesses to hardware may not be reliable. So should the flag
>>>> in_reset be set here itself rather than waiting for the work to
>>>> start execution?
>>>
>>> No, when we race and lose the VM is completely lost and probably
>>> restarted by the hypervisor.
>>>
>>> And when we race and win we properly set the flag before signaling
>>> the hypervisor that it can continue with the reset.
>>>
>>
>> I was talking about the baremetal case. When this was synchronous, the
>> in_reset flag was set as one of the first things, and amdgpu_in_reset()
>> is checked to prevent further hardware accesses. This design only
>> changes the recovery part and doesn't change the hardware perspective.
>
>> Potential accesses from other processes need to be blocked as soon as
>> we determine a reset is required.
>
> That's an incorrect assumption.
>
> Accessing the hardware is perfectly ok as long as the reset hasn't
> started yet. In other words even when the hardware is locked up you can
> still happily read/write registers or access the VRAM BAR.
>
Not sure that is 100% correct for something like a recovery triggered by
a RAS error (it depends on the access being done).
Thanks,
Lijo
> Only when the hardware is currently performing a reset, then we can't
> touch it or there might be unfortunate consequences (usually complete
> system lockup).
>
> Regards,
> Christian.
>
>> Are we expecting the work to be immediately executed and set the flags?
>>
>> Thanks,
>> Lijo
>>
>>>> Also, what about having the reset_active or in_reset flag in the
>>>> reset_domain itself?
>>>
>>> Of hand that sounds like a good idea.
>>>
>>> Regards,
>>> Christian.
>>>
>>>>
>>>> Thanks,
>>>> Lijo
>>>>
>>>>> + flush_work(&work.base);
>>>>> +
>>>>> + return work.ret;
>>>>> +}
>>>>> +
>>>>> /**
>>>>> * amdgpu_device_get_pcie_info - fence pcie info about the PCIE slot
>>>>> *
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>>>>> index bfc47bea23db..38c9fd7b7ad4 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>>>>> @@ -63,7 +63,7 @@ static enum drm_gpu_sched_stat amdgpu_job_timedout(struct drm_sched_job *s_job)
>>>>> ti.process_name, ti.tgid, ti.task_name, ti.pid);
>>>>> if (amdgpu_device_should_recover_gpu(ring->adev)) {
>>>>> - amdgpu_device_gpu_recover(ring->adev, job);
>>>>> + amdgpu_device_gpu_recover_imp(ring->adev, job);
>>>>> } else {
>>>>> drm_sched_suspend_timeout(&ring->sched);
>>>>> if (amdgpu_sriov_vf(adev))
>>>>>
>>>
>