* [PATCH] drm/amdgpu: Disable GPU reset on SRIOV before remove pci.
@ 2022-10-25 14:16 Gavin Wan
From: Gavin Wan @ 2022-10-25 14:16 UTC
  To: amd-gfx; +Cc: Alex Deucher, Gavin Wan

  Commit f5c7e7797060 ("drm/amdgpu: Adjust removal control flow
  for smu v13_0_2") introduced a bug in SRIOV environments:
  unloading amdgpu fails on a guest VM. The reason is that a
  VF FLR is requested while the amdgpu driver is unloading, but
  the SRIOV VF FLR sequence is wrong while the PCI device is
  being removed.

Signed-off-by: Gavin Wan <Gavin.Wan@amd.com>
Fixes: f5c7e7797060 ("drm/amdgpu: Adjust removal control flow for smu v13_0_2")
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Change-Id: I1ff8dcbffd85d7f3d8267d660fd8292423d2f70f
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 16f6a313335e..8e97e95aca8c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -2187,7 +2187,8 @@ amdgpu_pci_remove(struct pci_dev *pdev)
 		pm_runtime_forbid(dev->dev);
 	}
 
-	if (adev->ip_versions[MP1_HWIP][0] == IP_VERSION(13, 0, 2)) {
+	if (adev->ip_versions[MP1_HWIP][0] == IP_VERSION(13, 0, 2) &&
+	    !amdgpu_sriov_vf(adev)) {
 		bool need_to_reset_gpu = false;
 
 		if (adev->gmc.xgmi.num_physical_nodes > 1) {
-- 
2.34.1
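
For readers without the driver source at hand, the guard added by this patch can be sketched as a small standalone C program. Everything prefixed with sketch_ or SKETCH_ is a hypothetical stand-in: the struct fields replace adev->ip_versions[MP1_HWIP][0] and amdgpu_sriov_vf(adev), and the version-packing macro is only assumed to resemble the driver's IP_VERSION(). It illustrates the condition, not the driver's actual code.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption: packs major/minor/revision into one integer, in the spirit of
 * the driver's IP_VERSION(); the real macro may differ. */
#define SKETCH_IP_VERSION(mj, mn, rv) (((mj) << 16) | ((mn) << 8) | (rv))

/* Hypothetical stand-in for the two pieces of device state the guard reads. */
struct sketch_adev {
	unsigned int mp1_ip_version; /* stands in for adev->ip_versions[MP1_HWIP][0] */
	bool is_sriov_vf;            /* stands in for amdgpu_sriov_vf(adev) */
};

/* True when the remove path would enter the SMU 13.0.2 reset block:
 * the IP version must match and the device must not be an SRIOV VF. */
static bool sketch_enters_reset_block(const struct sketch_adev *adev)
{
	return adev->mp1_ip_version == SKETCH_IP_VERSION(13, 0, 2) &&
	       !adev->is_sriov_vf;
}

/* Walks the three cases that matter for this patch. */
static void sketch_self_check(void)
{
	struct sketch_adev bare_metal = { SKETCH_IP_VERSION(13, 0, 2), false };
	struct sketch_adev guest_vf = { SKETCH_IP_VERSION(13, 0, 2), true };
	struct sketch_adev other_asic = { SKETCH_IP_VERSION(11, 0, 7), false };

	assert(sketch_enters_reset_block(&bare_metal));  /* unchanged behavior */
	assert(!sketch_enters_reset_block(&guest_vf));   /* new: VF skips the reset */
	assert(!sketch_enters_reset_block(&other_asic)); /* unchanged behavior */
}
```

Only the guest VF case changes with the patch; bare-metal SMU 13.0.2 devices still take the reset path, and other ASICs never did.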



* Re: [PATCH] drm/amdgpu: Disable GPU reset on SRIOV before remove pci.
  2022-10-24 20:21 Gavin Wan
@ 2022-10-25  9:30 ` Christian König
From: Christian König @ 2022-10-25  9:30 UTC
  To: Gavin Wan, amd-gfx; +Cc: Alex Deucher



On 24.10.22 at 22:21, Gavin Wan wrote:
>    Commit f5c7e7797060 ("drm/amdgpu: Adjust removal control flow
>    for smu v13_0_2") introduced a bug in SRIOV environments:
>    unloading amdgpu fails on a guest VM. The reason is that a
>    VF FLR is requested while the amdgpu driver is unloading, but
>    the SRIOV VF FLR sequence is wrong while the PCI device is
>    being removed.
>
>    Fixes: f5c7e7797060 ("drm/amdgpu: Adjust removal control flow
>           for smu v13_0_2")

The Fixes line should look like a Signed-off-by or Acked-by line.

>
> Acked-by: Alex Deucher <alexander.deucher@amd.com>
> Signed-off-by: Gavin Wan <Gavin.Wan@amd.com>
> Change-Id: I1ff8dcbffd85d7f3d8267d660fd8292423d2f70f

The tag lines (the ones with a ":") are listed in chronological order.

E.g. in this case that should be:

Signed-off-by: ....
Fixes: ...
Acked-by: ...
...



> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 3 ++-
>   1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> index 16f6a313335e..ab0c856c13b0 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> @@ -2187,7 +2187,8 @@ amdgpu_pci_remove(struct pci_dev *pdev)
>   		pm_runtime_forbid(dev->dev);
>   	}
>   
> -	if (adev->ip_versions[MP1_HWIP][0] == IP_VERSION(13, 0, 2)) {
> +	if ((adev->ip_versions[MP1_HWIP][0] == IP_VERSION(13, 0, 2)) &&

Please drop the extra parentheses. They are only needed when you
explicitly want an assignment inside an "if", "for" or "while".

Regards,
Christian.

> +			!amdgpu_sriov_vf(adev)) {
>   		bool need_to_reset_gpu = false;
>   
>   		if (adev->gmc.xgmi.num_physical_nodes > 1) {
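
To illustrate the remark about parentheses, here is a minimal standalone sketch; the function names and the constant 42 are hypothetical and unrelated to the driver. Because == binds tighter than &&, wrapping the comparison in its own parentheses changes nothing. The doubled parentheses are conventionally kept for a deliberate assignment used as a condition, which is also what GCC's -Wparentheses warning expects.

```c
#include <assert.h>
#include <stdbool.h>

/* Redundant form: the inner parentheses around the comparison add nothing. */
static bool cond_with_extra_parens(int version, bool is_vf)
{
	return ((version == 42) && !is_vf);
}

/* Preferred form: relies on == having higher precedence than &&. */
static bool cond_plain(int version, bool is_vf)
{
	return version == 42 && !is_vf;
}

/* The case where extra parentheses are wanted: an intentional assignment
 * used as the loop condition. Sums values up to a terminating zero. */
static int sum_until_zero(const int *values)
{
	int v, sum = 0;

	while ((v = *values++))
		sum += v;
	return sum;
}
```

Both condition helpers return the same result for every input, so the plain form is the one to use in the patch.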



* [PATCH] drm/amdgpu: Disable GPU reset on SRIOV before remove pci.
@ 2022-10-24 20:21 Gavin Wan
  2022-10-25  9:30 ` Christian König
From: Gavin Wan @ 2022-10-24 20:21 UTC
  To: amd-gfx; +Cc: Alex Deucher, Gavin Wan

  Commit f5c7e7797060 ("drm/amdgpu: Adjust removal control flow
  for smu v13_0_2") introduced a bug in SRIOV environments:
  unloading amdgpu fails on a guest VM. The reason is that a
  VF FLR is requested while the amdgpu driver is unloading, but
  the SRIOV VF FLR sequence is wrong while the PCI device is
  being removed.

  Fixes: f5c7e7797060 ("drm/amdgpu: Adjust removal control flow
         for smu v13_0_2")

Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Gavin Wan <Gavin.Wan@amd.com>
Change-Id: I1ff8dcbffd85d7f3d8267d660fd8292423d2f70f
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 16f6a313335e..ab0c856c13b0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -2187,7 +2187,8 @@ amdgpu_pci_remove(struct pci_dev *pdev)
 		pm_runtime_forbid(dev->dev);
 	}
 
-	if (adev->ip_versions[MP1_HWIP][0] == IP_VERSION(13, 0, 2)) {
+	if ((adev->ip_versions[MP1_HWIP][0] == IP_VERSION(13, 0, 2)) &&
+			!amdgpu_sriov_vf(adev)) {
 		bool need_to_reset_gpu = false;
 
 		if (adev->gmc.xgmi.num_physical_nodes > 1) {
-- 
2.34.1



* Re: [PATCH] drm/amdgpu: Disable GPU reset on SRIOV before remove pci.
  2022-10-24 20:03 Gavin Wan
@ 2022-10-24 20:06 ` Alex Deucher
From: Alex Deucher @ 2022-10-24 20:06 UTC
  To: Gavin Wan; +Cc: amd-gfx

On Mon, Oct 24, 2022 at 4:03 PM Gavin Wan <Gavin.Wan@amd.com> wrote:
>
>   Fixes: f5c7e7797060 ("drm/amdgpu: Adjust removal control flow
>          for smu v13_0_2")
>
>   Commit f5c7e7797060 ("drm/amdgpu: Adjust removal control flow
>   for smu v13_0_2") introduced a bug in SRIOV environments:
>   unloading amdgpu fails on a guest VM. The reason is that a
>   VF FLR is requested while the amdgpu driver is unloading, but
>   the SRIOV VF FLR sequence is wrong while the PCI device is
>   being removed.

Please move the Fixes line down below the patch description and above
the Signed-off-by/Reviewed-by/Acked-by/etc. tags.  Feel free to
add:
Acked-by: Alex Deucher <alexander.deucher@amd.com>

Thanks,

Alex

>
> Signed-off-by: Gavin Wan <Gavin.Wan@amd.com>
> Change-Id: I1ff8dcbffd85d7f3d8267d660fd8292423d2f70f
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> index 16f6a313335e..ab0c856c13b0 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> @@ -2187,7 +2187,8 @@ amdgpu_pci_remove(struct pci_dev *pdev)
>                 pm_runtime_forbid(dev->dev);
>         }
>
> -       if (adev->ip_versions[MP1_HWIP][0] == IP_VERSION(13, 0, 2)) {
> +       if ((adev->ip_versions[MP1_HWIP][0] == IP_VERSION(13, 0, 2)) &&
> +                       !amdgpu_sriov_vf(adev)) {
>                 bool need_to_reset_gpu = false;
>
>                 if (adev->gmc.xgmi.num_physical_nodes > 1) {
> --
> 2.34.1
>


* [PATCH] drm/amdgpu: Disable GPU reset on SRIOV before remove pci.
@ 2022-10-24 20:03 Gavin Wan
  2022-10-24 20:06 ` Alex Deucher
From: Gavin Wan @ 2022-10-24 20:03 UTC
  To: amd-gfx; +Cc: Gavin Wan

  Fixes: f5c7e7797060 ("drm/amdgpu: Adjust removal control flow
         for smu v13_0_2")

  Commit f5c7e7797060 ("drm/amdgpu: Adjust removal control flow
  for smu v13_0_2") introduced a bug in SRIOV environments:
  unloading amdgpu fails on a guest VM. The reason is that a
  VF FLR is requested while the amdgpu driver is unloading, but
  the SRIOV VF FLR sequence is wrong while the PCI device is
  being removed.

Signed-off-by: Gavin Wan <Gavin.Wan@amd.com>
Change-Id: I1ff8dcbffd85d7f3d8267d660fd8292423d2f70f
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 16f6a313335e..ab0c856c13b0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -2187,7 +2187,8 @@ amdgpu_pci_remove(struct pci_dev *pdev)
 		pm_runtime_forbid(dev->dev);
 	}
 
-	if (adev->ip_versions[MP1_HWIP][0] == IP_VERSION(13, 0, 2)) {
+	if ((adev->ip_versions[MP1_HWIP][0] == IP_VERSION(13, 0, 2)) &&
+			!amdgpu_sriov_vf(adev)) {
 		bool need_to_reset_gpu = false;
 
 		if (adev->gmc.xgmi.num_physical_nodes > 1) {
-- 
2.34.1



* Re: [PATCH] drm/amdgpu: Disable GPU reset on SRIOV before remove pci.
  2022-10-24 19:45 Gavin Wan
@ 2022-10-24 19:52 ` Alex Deucher
From: Alex Deucher @ 2022-10-24 19:52 UTC
  To: Gavin Wan; +Cc: amd-gfx

On Mon, Oct 24, 2022 at 3:45 PM Gavin Wan <Gavin.Wan@amd.com> wrote:
>
>   The change "Adjust removal control flow for smu v13_0_2"

commit f5c7e7797060 ("drm/amdgpu: Adjust removal control flow for smu v13_0_2")

>   introduced a bug in SRIOV environments: unloading amdgpu
>   fails on a guest VM. The reason is that a VF FLR is
>   requested while the amdgpu driver is unloading, but the
>   SRIOV VF FLR sequence is wrong while the PCI device is
>   being removed.
>
> Signed-off-by: Gavin Wan <Gavin.Wan@amd.com>

Please add:
Fixes: f5c7e7797060 ("drm/amdgpu: Adjust removal control flow for smu v13_0_2")

With that,
Acked-by: Alex Deucher <alexander.deucher@amd.com>

> Change-Id: I1ff8dcbffd85d7f3d8267d660fd8292423d2f70f
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> index 16f6a313335e..ab0c856c13b0 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> @@ -2187,7 +2187,8 @@ amdgpu_pci_remove(struct pci_dev *pdev)
>                 pm_runtime_forbid(dev->dev);
>         }
>
> -       if (adev->ip_versions[MP1_HWIP][0] == IP_VERSION(13, 0, 2)) {
> +       if ((adev->ip_versions[MP1_HWIP][0] == IP_VERSION(13, 0, 2)) &&
> +                       !amdgpu_sriov_vf(adev)) {
>                 bool need_to_reset_gpu = false;
>
>                 if (adev->gmc.xgmi.num_physical_nodes > 1) {
> --
> 2.34.1
>


* [PATCH] drm/amdgpu: Disable GPU reset on SRIOV before remove pci.
@ 2022-10-24 19:45 Gavin Wan
  2022-10-24 19:52 ` Alex Deucher
From: Gavin Wan @ 2022-10-24 19:45 UTC
  To: amd-gfx; +Cc: Gavin Wan

  The change "Adjust removal control flow for smu v13_0_2"
  introduced a bug in SRIOV environments: unloading amdgpu
  fails on a guest VM. The reason is that a VF FLR is
  requested while the amdgpu driver is unloading, but the
  SRIOV VF FLR sequence is wrong while the PCI device is
  being removed.

Signed-off-by: Gavin Wan <Gavin.Wan@amd.com>
Change-Id: I1ff8dcbffd85d7f3d8267d660fd8292423d2f70f
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 16f6a313335e..ab0c856c13b0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -2187,7 +2187,8 @@ amdgpu_pci_remove(struct pci_dev *pdev)
 		pm_runtime_forbid(dev->dev);
 	}
 
-	if (adev->ip_versions[MP1_HWIP][0] == IP_VERSION(13, 0, 2)) {
+	if ((adev->ip_versions[MP1_HWIP][0] == IP_VERSION(13, 0, 2)) &&
+			!amdgpu_sriov_vf(adev)) {
 		bool need_to_reset_gpu = false;
 
 		if (adev->gmc.xgmi.num_physical_nodes > 1) {
-- 
2.34.1


