From: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
To: Shashank Sharma <contactshashanksharma@gmail.com>,
amd-gfx@lists.freedesktop.org
Cc: Alexander Deucher <alexander.deucher@amd.com>,
amaranath.somalapuram@amd.com,
Christian Koenig <christian.koenig@amd.com>,
shashank.sharma@amd.com
Subject: Re: [PATCH 2/2] drm/amdgpu: add work function for GPU reset event
Date: Tue, 8 Mar 2022 11:26:42 -0500
Message-ID: <9de42884-d1e2-309a-e669-5132539fbd22@amd.com>
In-Reply-To: <20220307162631.2496286-2-contactshashanksharma@gmail.com>
On 2022-03-07 11:26, Shashank Sharma wrote:
> From: Shashank Sharma <shashank.sharma@amd.com>
>
> This patch adds a work function, which will get scheduled
> in event of a GPU reset, and will send a uevent to user with
> some reset context information, like a PID and some flags.
Where is the actual scheduling of the work function? Shouldn't
there be a patch for that too?
Andrey
>
> The userspace can do some recovery and post-processing work
> based on this event.
>
> V2:
> - Changed the name of the work to gpu_reset_event_work
> (Christian)
> - Added a structure to accommodate some additional information
> (like a PID and some flags)
>
> Cc: Alexander Deucher <alexander.deucher@amd.com>
> Cc: Christian Koenig <christian.koenig@amd.com>
> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu.h | 7 +++++++
> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 19 +++++++++++++++++++
> 2 files changed, 26 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> index d8b854fcbffa..7df219fe363f 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> @@ -813,6 +813,11 @@ struct amd_powerplay {
> #define AMDGPU_RESET_MAGIC_NUM 64
> #define AMDGPU_MAX_DF_PERFMONS 4
> #define AMDGPU_PRODUCT_NAME_LEN 64
> +struct amdgpu_reset_event_ctx {
> + uint64_t pid;
> + uint32_t flags;
> +};
> +
> struct amdgpu_device {
> struct device *dev;
> struct pci_dev *pdev;
> @@ -1063,6 +1068,7 @@ struct amdgpu_device {
>
> int asic_reset_res;
> struct work_struct xgmi_reset_work;
> + struct work_struct gpu_reset_event_work;
> struct list_head reset_list;
>
> long gfx_timeout;
> @@ -1097,6 +1103,7 @@ struct amdgpu_device {
> pci_channel_state_t pci_channel_state;
>
> struct amdgpu_reset_control *reset_cntl;
> + struct amdgpu_reset_event_ctx reset_event_ctx;
> uint32_t ip_versions[MAX_HWIP][HWIP_MAX_INSTANCE];
>
> bool ram_is_direct_mapped;
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index ed077de426d9..c43d099da06d 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -73,6 +73,7 @@
> #include <linux/pm_runtime.h>
>
> #include <drm/drm_drv.h>
> +#include <drm/drm_sysfs.h>
>
> MODULE_FIRMWARE("amdgpu/vega10_gpu_info.bin");
> MODULE_FIRMWARE("amdgpu/vega12_gpu_info.bin");
> @@ -3277,6 +3278,23 @@ bool amdgpu_device_has_dc_support(struct amdgpu_device *adev)
> return amdgpu_device_asic_has_dc_support(adev->asic_type);
> }
>
> +static void amdgpu_device_reset_event_func(struct work_struct *__work)
> +{
> + struct amdgpu_device *adev = container_of(__work, struct amdgpu_device,
> + gpu_reset_event_work);
> + struct amdgpu_reset_event_ctx *event_ctx = &adev->reset_event_ctx;
> +
> + /*
> + * A GPU reset has happened, indicate the userspace and pass the
> + * following information:
> + * - pid of the process involved,
> + * - if the VRAM is valid or not,
> + * - indicate that userspace may want to collect the ftrace event
> + * data from the trace event.
> + */
> + drm_sysfs_reset_event(&adev->ddev, event_ctx->pid, event_ctx->flags);
> +}
> +
> static void amdgpu_device_xgmi_reset_func(struct work_struct *__work)
> {
> struct amdgpu_device *adev =
> @@ -3525,6 +3543,7 @@ int amdgpu_device_init(struct amdgpu_device *adev,
> amdgpu_device_delay_enable_gfx_off);
>
> INIT_WORK(&adev->xgmi_reset_work, amdgpu_device_xgmi_reset_func);
> + INIT_WORK(&adev->gpu_reset_event_work, amdgpu_device_reset_event_func);
>
> adev->gfx.gfx_off_req_count = 1;
> adev->pm.ac_power = power_supply_is_system_supplied() > 0;
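
To illustrate the question above about scheduling: the patch defines the
handler and calls INIT_WORK(), but nothing in it queues the work. Some reset
path would still have to fill reset_event_ctx and call schedule_work(). The
userspace sketch below mimics that pattern under stated assumptions. It is
not kernel code: work_struct, INIT_WORK and schedule_work are simplified
mocks (the real schedule_work() queues the work on a system workqueue and
returns immediately instead of running it inline), mock_reset_event stands in
for drm_sysfs_reset_event() from patch 1/2, and hypothetical_reset_path is an
invented call site that does not appear in this series.

```c
/* Userspace mock of the kernel workqueue pattern. NOT kernel code. */
#include <assert.h>
#include <inttypes.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Simplified mock of the kernel's struct work_struct. */
struct work_struct {
	void (*func)(struct work_struct *work);
};

/* Recover the enclosing structure from a pointer to one of its members. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Mock: bind a handler to a work item. */
#define INIT_WORK(w, f) ((w)->func = (f))

/*
 * Mock: runs the handler inline. The real schedule_work() only queues the
 * work item and returns; the handler runs later in a worker thread.
 */
static bool schedule_work(struct work_struct *work)
{
	assert(work->func != NULL); /* INIT_WORK() must have been called */
	work->func(work);
	return true;
}

/* Same layout as the structure added by the patch. */
struct amdgpu_reset_event_ctx {
	uint64_t pid;
	uint32_t flags;
};

/* Reduced to the two members the patch adds. */
struct amdgpu_device {
	struct work_struct gpu_reset_event_work;
	struct amdgpu_reset_event_ctx reset_event_ctx;
};

/* Bookkeeping so a caller can observe what the handler reported. */
static int mock_event_count;
static uint64_t mock_last_pid;
static uint32_t mock_last_flags;

/* Stand-in for drm_sysfs_reset_event(); records and prints its arguments. */
static void mock_reset_event(uint64_t pid, uint32_t flags)
{
	mock_event_count++;
	mock_last_pid = pid;
	mock_last_flags = flags;
	printf("reset event: pid=%" PRIu64 " flags=0x%" PRIx32 "\n", pid, flags);
}

/* Mirrors the handler in the patch, with the mocked event call. */
static void amdgpu_device_reset_event_func(struct work_struct *__work)
{
	struct amdgpu_device *adev = container_of(__work, struct amdgpu_device,
						  gpu_reset_event_work);
	struct amdgpu_reset_event_ctx *event_ctx = &adev->reset_event_ctx;

	mock_reset_event(event_ctx->pid, event_ctx->flags);
}

/*
 * Hypothetical call site (an assumption, not from the series): fill the
 * context first, then queue the work that reports it.
 */
static void hypothetical_reset_path(struct amdgpu_device *adev,
				    uint64_t pid, uint32_t flags)
{
	adev->reset_event_ctx.pid = pid;
	adev->reset_event_ctx.flags = flags;
	schedule_work(&adev->gpu_reset_event_work);
}
```

A caller would zero-initialize a struct amdgpu_device, call
INIT_WORK(&adev.gpu_reset_event_work, amdgpu_device_reset_event_func) once at
init time, and then invoke hypothetical_reset_path(&adev, pid, flags) from
wherever the reset is handled. In the real driver the context would also need
to stay stable until the queued work has run, since the handler reads it
asynchronously.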