From: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
To: Guchun Chen <guchun.chen@amd.com>,
	amd-gfx@lists.freedesktop.org, christian.koenig@amd.com,
	xinhui.pan@amd.com, alexander.deucher@amd.com
Subject: Re: [PATCH] drm/amdgpu: add missed write lock for pci detected state pci_channel_io_normal
Date: Thu, 30 Sep 2021 22:21:36 -0400
Message-ID: <b7febaef-5442-1503-d743-24a6c50fa179@amd.com>
In-Reply-To: <20211001020000.14501-1-guchun.chen@amd.com>

On 2021-09-30 10:00 p.m., Guchun Chen wrote:

> When a PCI error state pci_channel_io_normal is detected, it will
> report PCI_ERS_RESULT_CAN_RECOVER status to PCI driver, and PCI driver
> will continue the execution of PCI resume callback report_resume by
> pci_walk_bridge, and the callback will go into amdgpu_pci_resume
> finally, where write lock is released unconditionally without acquiring
> such lock.


Good catch, but the issue is even wider in scope: what about
drm_sched_resubmit_jobs and drm_sched_start being called without the
schedulers having been stopped first? It would be better to put the
entire body of this function under a flag that is set only in the
pci_channel_io_frozen case. As far as I remember, we don't need to do
anything in the pci_channel_io_normal case.
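
To make the suggestion concrete, here is a rough user-space sketch of the
flag-guarded approach. It is a model only, not driver code: struct mock_adev,
the pci_frozen_seen flag, the MOCK_* enums and both functions are
hypothetical names invented for illustration, and the boolean fields merely
stand in for the reset write lock and the scheduler state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel's PCI channel states and results. */
enum mock_channel_state { MOCK_IO_NORMAL, MOCK_IO_FROZEN };
enum mock_ers_result { MOCK_CAN_RECOVER, MOCK_NEED_RESET };

/* Hypothetical device model; not the real struct amdgpu_device. */
struct mock_adev {
	bool lock_held;       /* models the reset write lock */
	bool sched_stopped;   /* models stopped schedulers */
	bool pci_frozen_seen; /* hypothetical flag, set only for io_frozen */
	int unbalanced_ops;   /* counts unlock/start without lock/stop */
};

enum mock_ers_result
mock_error_detected(struct mock_adev *adev, enum mock_channel_state state)
{
	assert(adev != NULL);

	switch (state) {
	case MOCK_IO_NORMAL:
		/* Nothing to prepare: no lock taken, no scheduler stopped. */
		return MOCK_CAN_RECOVER;
	case MOCK_IO_FROZEN:
		adev->lock_held = true;
		adev->sched_stopped = true;
		adev->pci_frozen_seen = true;
		return MOCK_NEED_RESET;
	}
	return MOCK_NEED_RESET;
}

void mock_resume(struct mock_adev *adev)
{
	assert(adev != NULL);

	/* Skip the whole body unless the frozen path prepared the device. */
	if (!adev->pci_frozen_seen)
		return;

	/* Record a pairing bug if we got here without lock/stop. */
	if (!adev->sched_stopped || !adev->lock_held)
		adev->unbalanced_ops++;

	adev->sched_stopped = false;   /* stands in for resubmit + start */
	adev->lock_held = false;       /* stands in for releasing the lock */
	adev->pci_frozen_seen = false; /* re-arm for the next error */
}
```

With this shape, a resume that follows pci_channel_io_normal returns
immediately, so neither the unlock nor the scheduler restart can run
unpaired, while the pci_channel_io_frozen path still restarts the
schedulers and drops the lock exactly once.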

Andrey


>
> Fixes: c9a6b82f45e2 ("drm/amdgpu: Implement DPC recovery")
> Signed-off-by: Guchun Chen <guchun.chen@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 1 +
>   1 file changed, 1 insertion(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index bb5ad2b6ca13..12f822d51de2 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -5370,6 +5370,7 @@ pci_ers_result_t amdgpu_pci_error_detected(struct pci_dev *pdev, pci_channel_sta
>   
>   	switch (state) {
>   	case pci_channel_io_normal:
> +		amdgpu_device_lock_adev(adev, NULL);
>   		return PCI_ERS_RESULT_CAN_RECOVER;
>   	/* Fatal error, prepare for slot reset */
>   	case pci_channel_io_frozen:


Thread overview: 6+ messages
2021-10-01  2:00 [PATCH] drm/amdgpu: add missed write lock for pci detected state pci_channel_io_normal Guchun Chen
2021-10-01  2:21 ` Andrey Grodzovsky [this message]
2021-10-01  8:21   ` Chen, Guchun
2021-10-01 14:28     ` Andrey Grodzovsky
2021-10-01 15:21       ` Chen, Guchun
2021-10-02 15:20         ` Chen, Guchun
