* [PATCH] drm/amdgpu: Disable GPU reset on SRIOV before remove pci.
@ 2022-10-25 14:16 Gavin Wan
0 siblings, 0 replies; 7+ messages in thread
From: Gavin Wan @ 2022-10-25 14:16 UTC (permalink / raw)
To: amd-gfx; +Cc: Alex Deucher, Gavin Wan
Commit f5c7e7797060 ("Adjust removal control
flow for smu v13_0_2") introduced a bug in SRIOV environments:
unloading amdgpu fails on a guest VM. The reason is that a
VF FLR is requested while unloading the amdgpu driver, but the
SRIOV VF FLR sequence is wrong while removing the PCI device.
Signed-off-by: Gavin Wan <Gavin.Wan@amd.com>
Fixes: f5c7e7797060 ("drm/amdgpu: Adjust removal control flow
for smu v13_0_2")
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Change-Id: I1ff8dcbffd85d7f3d8267d660fd8292423d2f70f
---
drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 16f6a313335e..8e97e95aca8c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -2187,7 +2187,8 @@ amdgpu_pci_remove(struct pci_dev *pdev)
pm_runtime_forbid(dev->dev);
}
- if (adev->ip_versions[MP1_HWIP][0] == IP_VERSION(13, 0, 2)) {
+ if (adev->ip_versions[MP1_HWIP][0] == IP_VERSION(13, 0, 2) &&
+ !amdgpu_sriov_vf(adev)) {
bool need_to_reset_gpu = false;
if (adev->gmc.xgmi.num_physical_nodes > 1) {
--
2.34.1
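The effect of the added check can be modelled outside the kernel. The following is a minimal sketch, not driver code: `struct fake_adev`, its fields and `enters_reset_block()` are hypothetical stand-ins for `adev->ip_versions[MP1_HWIP][0]` and `amdgpu_sriov_vf(adev)`, and the `IP_VERSION()` packing shown here is an assumption made only so the example compiles on its own.

```c
#include <stdbool.h>

/* Assumed packing, for illustration only; the driver has its own macro. */
#define IP_VERSION(mj, mn, rv) (((mj) << 16) | ((mn) << 8) | (rv))

/* Hypothetical stand-in for the two facts the patched condition reads. */
struct fake_adev {
	unsigned int mp1_version;	/* models adev->ip_versions[MP1_HWIP][0] */
	bool is_sriov_vf;		/* models amdgpu_sriov_vf(adev) */
};

/*
 * Mirrors the patched condition in amdgpu_pci_remove(): the smu v13_0_2
 * reset block is entered only when the device is not an SRIOV VF.
 */
static bool enters_reset_block(const struct fake_adev *adev)
{
	return adev->mp1_version == IP_VERSION(13, 0, 2) &&
	       !adev->is_sriov_vf;
}
```

Under these assumptions a bare-metal 13.0.2 device still enters the reset block, while the same device seen from a guest VM as a VF skips it, which is the behaviour the commit message asks for.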
* Re: [PATCH] drm/amdgpu: Disable GPU reset on SRIOV before remove pci.
2022-10-24 20:21 Gavin Wan
@ 2022-10-25 9:30 ` Christian König
0 siblings, 0 replies; 7+ messages in thread
From: Christian König @ 2022-10-25 9:30 UTC (permalink / raw)
To: Gavin Wan, amd-gfx; +Cc: Alex Deucher
Am 24.10.22 um 22:21 schrieb Gavin Wan:
> Commit f5c7e7797060 ("Adjust removal control
> flow for smu v13_0_2") introduced a bug in SRIOV environments:
> unloading amdgpu fails on a guest VM. The reason is that a
> VF FLR is requested while unloading the amdgpu driver, but the
> SRIOV VF FLR sequence is wrong while removing the PCI device.
>
> Fixes: f5c7e7797060 ("drm/amdgpu: Adjust removal control flow
> for smu v13_0_2")
The Fixes line should look like a Signed-off-by or Acked-by line.
>
> Acked-by: Alex Deucher <alexander.deucher@amd.com>
> Signed-off-by: Gavin Wan <Gavin.Wan@amd.com>
> Change-Id: I1ff8dcbffd85d7f3d8267d660fd8292423d2f70f
The tag lines (the ones containing a colon) are listed in chronological order.
E.g. in this case that should be:
Signed-off-by: ....
Fixes: ...
Acked-by: ...
...
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> index 16f6a313335e..ab0c856c13b0 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> @@ -2187,7 +2187,8 @@ amdgpu_pci_remove(struct pci_dev *pdev)
> pm_runtime_forbid(dev->dev);
> }
>
> - if (adev->ip_versions[MP1_HWIP][0] == IP_VERSION(13, 0, 2)) {
> + if ((adev->ip_versions[MP1_HWIP][0] == IP_VERSION(13, 0, 2)) &&
Please drop the extra () unless you explicitly want an
assignment inside an "if", "for" or "while".
Regards,
Christian.
> + !amdgpu_sriov_vf(adev)) {
> bool need_to_reset_gpu = false;
>
> if (adev->gmc.xgmi.num_physical_nodes > 1) {
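The remark about the extra () can be illustrated with a small standalone sketch. All names below (`next_value`, `cond_with_parens`, `cond_without_parens`, `drain`) are hypothetical and exist only for this example. Because `==` binds tighter than `&&` in C, the parenthesised and unparenthesised comparisons evaluate identically; doubled parentheses are conventionally kept for the case where an assignment is deliberately used as a condition.

```c
#include <stdbool.h>

/* Hypothetical helper: counts down and stops at zero. */
static int next_value(int v)
{
	return v > 0 ? v - 1 : 0;
}

/* Redundant parentheses around the comparison. */
static bool cond_with_parens(int ver, int want, bool vf)
{
	return (ver == want) && !vf;
}

/* Same result: '==' has higher precedence than '&&'. */
static bool cond_without_parens(int ver, int want, bool vf)
{
	return ver == want && !vf;
}

/*
 * Here the doubled parentheses signal that the assignment is intentional
 * and that its result is what the loop tests.
 */
static int drain(int start)
{
	int v = start;
	int steps = 0;

	while ((v = next_value(v)))
		steps++;

	return steps;
}
```

The two condition helpers agree for every input, so dropping the parentheses in the patch changes only how the line reads.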
* [PATCH] drm/amdgpu: Disable GPU reset on SRIOV before remove pci.
@ 2022-10-24 20:21 Gavin Wan
2022-10-25 9:30 ` Christian König
0 siblings, 1 reply; 7+ messages in thread
From: Gavin Wan @ 2022-10-24 20:21 UTC (permalink / raw)
To: amd-gfx; +Cc: Alex Deucher, Gavin Wan
Commit f5c7e7797060 ("Adjust removal control
flow for smu v13_0_2") introduced a bug in SRIOV environments:
unloading amdgpu fails on a guest VM. The reason is that a
VF FLR is requested while unloading the amdgpu driver, but the
SRIOV VF FLR sequence is wrong while removing the PCI device.
Fixes: f5c7e7797060 ("drm/amdgpu: Adjust removal control flow
for smu v13_0_2")
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Gavin Wan <Gavin.Wan@amd.com>
Change-Id: I1ff8dcbffd85d7f3d8267d660fd8292423d2f70f
---
drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 16f6a313335e..ab0c856c13b0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -2187,7 +2187,8 @@ amdgpu_pci_remove(struct pci_dev *pdev)
pm_runtime_forbid(dev->dev);
}
- if (adev->ip_versions[MP1_HWIP][0] == IP_VERSION(13, 0, 2)) {
+ if ((adev->ip_versions[MP1_HWIP][0] == IP_VERSION(13, 0, 2)) &&
+ !amdgpu_sriov_vf(adev)) {
bool need_to_reset_gpu = false;
if (adev->gmc.xgmi.num_physical_nodes > 1) {
--
2.34.1
* Re: [PATCH] drm/amdgpu: Disable GPU reset on SRIOV before remove pci.
2022-10-24 20:03 Gavin Wan
@ 2022-10-24 20:06 ` Alex Deucher
0 siblings, 0 replies; 7+ messages in thread
From: Alex Deucher @ 2022-10-24 20:06 UTC (permalink / raw)
To: Gavin Wan; +Cc: amd-gfx
On Mon, Oct 24, 2022 at 4:03 PM Gavin Wan <Gavin.Wan@amd.com> wrote:
>
> Fixes: f5c7e7797060 ("drm/amdgpu: Adjust removal control flow
> for smu v13_0_2")
>
> Commit f5c7e7797060 ("Adjust removal control
> flow for smu v13_0_2") introduced a bug in SRIOV environments:
> unloading amdgpu fails on a guest VM. The reason is that a
> VF FLR is requested while unloading the amdgpu driver, but the
> SRIOV VF FLR sequence is wrong while removing the PCI device.
Please move the Fixes line down below the patch description and above
the Signed-off-by/Reviewed-by/Acked-by/etc. tags. Feel free to
add:
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Thanks,
Alex
>
> Signed-off-by: Gavin Wan <Gavin.Wan@amd.com>
> Change-Id: I1ff8dcbffd85d7f3d8267d660fd8292423d2f70f
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> index 16f6a313335e..ab0c856c13b0 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> @@ -2187,7 +2187,8 @@ amdgpu_pci_remove(struct pci_dev *pdev)
> pm_runtime_forbid(dev->dev);
> }
>
> - if (adev->ip_versions[MP1_HWIP][0] == IP_VERSION(13, 0, 2)) {
> + if ((adev->ip_versions[MP1_HWIP][0] == IP_VERSION(13, 0, 2)) &&
> + !amdgpu_sriov_vf(adev)) {
> bool need_to_reset_gpu = false;
>
> if (adev->gmc.xgmi.num_physical_nodes > 1) {
> --
> 2.34.1
>
* [PATCH] drm/amdgpu: Disable GPU reset on SRIOV before remove pci.
@ 2022-10-24 20:03 Gavin Wan
2022-10-24 20:06 ` Alex Deucher
0 siblings, 1 reply; 7+ messages in thread
From: Gavin Wan @ 2022-10-24 20:03 UTC (permalink / raw)
To: amd-gfx; +Cc: Gavin Wan
Fixes: f5c7e7797060 ("drm/amdgpu: Adjust removal control flow
for smu v13_0_2")
Commit f5c7e7797060 ("Adjust removal control
flow for smu v13_0_2") introduced a bug in SRIOV environments:
unloading amdgpu fails on a guest VM. The reason is that a
VF FLR is requested while unloading the amdgpu driver, but the
SRIOV VF FLR sequence is wrong while removing the PCI device.
Signed-off-by: Gavin Wan <Gavin.Wan@amd.com>
Change-Id: I1ff8dcbffd85d7f3d8267d660fd8292423d2f70f
---
drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 16f6a313335e..ab0c856c13b0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -2187,7 +2187,8 @@ amdgpu_pci_remove(struct pci_dev *pdev)
pm_runtime_forbid(dev->dev);
}
- if (adev->ip_versions[MP1_HWIP][0] == IP_VERSION(13, 0, 2)) {
+ if ((adev->ip_versions[MP1_HWIP][0] == IP_VERSION(13, 0, 2)) &&
+ !amdgpu_sriov_vf(adev)) {
bool need_to_reset_gpu = false;
if (adev->gmc.xgmi.num_physical_nodes > 1) {
--
2.34.1
* Re: [PATCH] drm/amdgpu: Disable GPU reset on SRIOV before remove pci.
2022-10-24 19:45 Gavin Wan
@ 2022-10-24 19:52 ` Alex Deucher
0 siblings, 0 replies; 7+ messages in thread
From: Alex Deucher @ 2022-10-24 19:52 UTC (permalink / raw)
To: Gavin Wan; +Cc: amd-gfx
On Mon, Oct 24, 2022 at 3:45 PM Gavin Wan <Gavin.Wan@amd.com> wrote:
>
> The change "Adjust removal control flow for smu v13_0_2"
commit f5c7e7797060 ("drm/amdgpu: Adjust removal control flow for smu v13_0_2")
> brought a bug in the SRIOV environment. It caused unloading
> amdgpu to fail on the guest VM. The reason is that a VF FLR was
> requested while unloading the amdgpu driver, but the SRIOV VF FLR
> sequence is wrong while removing the PCI device.
>
> Signed-off-by: Gavin Wan <Gavin.Wan@amd.com>
Please add:
Fixes: f5c7e7797060 ("drm/amdgpu: Adjust removal control flow for smu v13_0_2")
With that,
Acked-by: Alex Deucher <alexander.deucher@amd.com>
> Change-Id: I1ff8dcbffd85d7f3d8267d660fd8292423d2f70f
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> index 16f6a313335e..ab0c856c13b0 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> @@ -2187,7 +2187,8 @@ amdgpu_pci_remove(struct pci_dev *pdev)
> pm_runtime_forbid(dev->dev);
> }
>
> - if (adev->ip_versions[MP1_HWIP][0] == IP_VERSION(13, 0, 2)) {
> + if ((adev->ip_versions[MP1_HWIP][0] == IP_VERSION(13, 0, 2)) &&
> + !amdgpu_sriov_vf(adev)) {
> bool need_to_reset_gpu = false;
>
> if (adev->gmc.xgmi.num_physical_nodes > 1) {
> --
> 2.34.1
>
* [PATCH] drm/amdgpu: Disable GPU reset on SRIOV before remove pci.
@ 2022-10-24 19:45 Gavin Wan
2022-10-24 19:52 ` Alex Deucher
0 siblings, 1 reply; 7+ messages in thread
From: Gavin Wan @ 2022-10-24 19:45 UTC (permalink / raw)
To: amd-gfx; +Cc: Gavin Wan
The change "Adjust removal control flow for smu v13_0_2"
brought a bug in the SRIOV environment. It caused unloading
amdgpu to fail on the guest VM. The reason is that a VF FLR was
requested while unloading the amdgpu driver, but the SRIOV VF FLR
sequence is wrong while removing the PCI device.
Signed-off-by: Gavin Wan <Gavin.Wan@amd.com>
Change-Id: I1ff8dcbffd85d7f3d8267d660fd8292423d2f70f
---
drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 16f6a313335e..ab0c856c13b0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -2187,7 +2187,8 @@ amdgpu_pci_remove(struct pci_dev *pdev)
pm_runtime_forbid(dev->dev);
}
- if (adev->ip_versions[MP1_HWIP][0] == IP_VERSION(13, 0, 2)) {
+ if ((adev->ip_versions[MP1_HWIP][0] == IP_VERSION(13, 0, 2)) &&
+ !amdgpu_sriov_vf(adev)) {
bool need_to_reset_gpu = false;
if (adev->gmc.xgmi.num_physical_nodes > 1) {
--
2.34.1
Thread overview: 7+ messages
2022-10-25 14:16 [PATCH] drm/amdgpu: Disable GPU reset on SRIOV before remove pci Gavin Wan
-- strict thread matches above, loose matches on Subject: below --
2022-10-24 20:21 Gavin Wan
2022-10-25 9:30 ` Christian König
2022-10-24 20:03 Gavin Wan
2022-10-24 20:06 ` Alex Deucher
2022-10-24 19:45 Gavin Wan
2022-10-24 19:52 ` Alex Deucher