All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 1/3] Revert "drm/amdgpu: TA unload messages are not actually sent to psp when amdgpu is uninstalled"
@ 2023-01-25 17:16 vitaly.prosyak
  2023-01-25 17:16 ` [PATCH 2/3] drm/amdgpu: always sending PSP messages LOAD_ASD and UNLOAD_TA vitaly.prosyak
  2023-01-25 17:16 ` [PATCH 3/3] drm/amdgpu: use pci_dev_is_disconnected vitaly.prosyak
  0 siblings, 2 replies; 5+ messages in thread
From: vitaly.prosyak @ 2023-01-25 17:16 UTC (permalink / raw)
  To: amd-gfx, christian.koenig, alexander.deucher, yipeng.chai; +Cc: Vitaly Prosyak

From: Vitaly Prosyak <vitaly.prosyak@amd.com>

This reverts commit fac53471d0ea9693d314aa2df08d62b2e7e3a0f8.
The following change: move the drm_dev_unplug call after
amdgpu_driver_unload_kms in amdgpu_pci_remove. The reason is
the following: amdgpu_pci_remove calls drm_dev_unregister
and it should be called first to ensure userspace can't access the
device instance anymore. If we call drm_dev_unplug after
amdgpu_driver_unload_kms then we observe IGT PCI software unplug
test failure (kernel hung) for all ASICs. This is how this
regression was found.

After this revert, the following commands do work not, but it would
be fixed in the next commit:
 - sudo modprobe -r amdgpu
 - sudo modprobe amdgpu

Signed-off-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Reviewed-by Alex Deucher <alexander.deucher@amd.com>
Change-Id: Ia5c6c174dddb89a33dd93b641fd05466ffb3500c
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 3 ++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c    | 4 ++--
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 0f9a5b12c3a5..a10b627c8357 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -4031,7 +4031,8 @@ void amdgpu_device_fini_hw(struct amdgpu_device *adev)
 
 	amdgpu_gart_dummy_page_fini(adev);
 
-	amdgpu_device_unmap_mmio(adev);
+	if (drm_dev_is_unplugged(adev_to_drm(adev)))
+		amdgpu_device_unmap_mmio(adev);
 
 }
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index a75dba2caeca..7edbaa90fac9 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -2227,6 +2227,8 @@ amdgpu_pci_remove(struct pci_dev *pdev)
 	struct drm_device *dev = pci_get_drvdata(pdev);
 	struct amdgpu_device *adev = drm_to_adev(dev);
 
+	drm_dev_unplug(dev);
+
 	if (adev->pm.rpm_mode != AMDGPU_RUNPM_NONE) {
 		pm_runtime_get_sync(dev->dev);
 		pm_runtime_forbid(dev->dev);
@@ -2266,8 +2268,6 @@ amdgpu_pci_remove(struct pci_dev *pdev)
 
 	amdgpu_driver_unload_kms(dev);
 
-	drm_dev_unplug(dev);
-
 	/*
 	 * Flush any in flight DMA operations from device.
 	 * Clear the Bus Master Enable bit and then wait on the PCIe Device
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH 2/3] drm/amdgpu: always sending PSP messages LOAD_ASD and UNLOAD_TA
  2023-01-25 17:16 [PATCH 1/3] Revert "drm/amdgpu: TA unload messages are not actually sent to psp when amdgpu is uninstalled" vitaly.prosyak
@ 2023-01-25 17:16 ` vitaly.prosyak
  2023-01-25 17:16 ` [PATCH 3/3] drm/amdgpu: use pci_dev_is_disconnected vitaly.prosyak
  1 sibling, 0 replies; 5+ messages in thread
From: vitaly.prosyak @ 2023-01-25 17:16 UTC (permalink / raw)
  To: amd-gfx, christian.koenig, alexander.deucher, yipeng.chai; +Cc: Vitaly Prosyak

From: Vitaly Prosyak <vitaly.prosyak@amd.com>

We allow sending PSP messages LOAD_ASD and UNLOAD_TA without
acquiring a lock in drm_dev_enter during driver unload
because we must call drm_dev_unplug as the beginning
of unload driver sequence.
Added WARNING if other PSP messages are sent without a lock.
After this commit, the following commands would work
 -sudo modprobe -r amdgpu
 -sudo modprobe amdgpu

Signed-off-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Reviewed-by Alex Deucher <alexander.deucher@amd.com>
Change-Id: I57f65fe820e2f7055f8065cd18c63fe6ff3ab694
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 16 +++++++++++++---
 1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
index a8391f269cd0..40929f34447c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
@@ -606,12 +606,21 @@ psp_cmd_submit_buf(struct psp_context *psp,
 	int timeout = 20000;
 	bool ras_intr = false;
 	bool skip_unsupport = false;
+	bool dev_entered;
 
 	if (psp->adev->no_hw_access)
 		return 0;
 
-	if (!drm_dev_enter(adev_to_drm(psp->adev), &idx))
-		return 0;
+	dev_entered = drm_dev_enter(adev_to_drm(psp->adev), &idx);
+	/*
+	 * We allow sending PSP messages LOAD_ASD and UNLOAD_TA without acquiring
+	 * a lock in drm_dev_enter during driver unload because we must call
+	 * drm_dev_unplug as the beginning  of unload driver sequence . It is very
+	 * crucial that userspace can't access device instances anymore.
+	 */
+	if (!dev_entered)
+		WARN_ON(psp->cmd_buf_mem->cmd_id != GFX_CMD_ID_LOAD_ASD &&
+			psp->cmd_buf_mem->cmd_id != GFX_CMD_ID_UNLOAD_TA);
 
 	memset(psp->cmd_buf_mem, 0, PSP_CMD_BUFFER_SIZE);
 
@@ -676,7 +685,8 @@ psp_cmd_submit_buf(struct psp_context *psp,
 	}
 
 exit:
-	drm_dev_exit(idx);
+	if (dev_entered)
+		drm_dev_exit(idx);
 	return ret;
 }
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH 3/3] drm/amdgpu: use pci_dev_is_disconnected
  2023-01-25 17:16 [PATCH 1/3] Revert "drm/amdgpu: TA unload messages are not actually sent to psp when amdgpu is uninstalled" vitaly.prosyak
  2023-01-25 17:16 ` [PATCH 2/3] drm/amdgpu: always sending PSP messages LOAD_ASD and UNLOAD_TA vitaly.prosyak
@ 2023-01-25 17:16 ` vitaly.prosyak
  2023-01-26  9:20   ` Christian König
  1 sibling, 1 reply; 5+ messages in thread
From: vitaly.prosyak @ 2023-01-25 17:16 UTC (permalink / raw)
  To: amd-gfx, christian.koenig, alexander.deucher, yipeng.chai; +Cc: Vitaly Prosyak

From: Vitaly Prosyak <vitaly.prosyak@amd.com>

Added condition for pci_dev_is_disconnected and keeps
drm_dev_is_unplugged to check whether we should unmap MMIO.
Suggested by Alex regarding pci_dev_is_disconnected.
Suggested by Christian keeping drm_dev_is_unplugged.

Signed-off-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Reviewed-by Alex Deucher <alexander.deucher@amd.com>
Reviewed-by Christian Koenig <christian.coenig@amd.com>
Change-Id: I618c471cd398437d4ed6dec6d22be78e12683ae6
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index a10b627c8357..d3568e1ded23 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -78,6 +78,8 @@
 
 #include <drm/drm_drv.h>
 
+#include "../../../../pci/pci.h"
+
 MODULE_FIRMWARE("amdgpu/vega10_gpu_info.bin");
 MODULE_FIRMWARE("amdgpu/vega12_gpu_info.bin");
 MODULE_FIRMWARE("amdgpu/raven_gpu_info.bin");
@@ -4031,7 +4033,8 @@ void amdgpu_device_fini_hw(struct amdgpu_device *adev)
 
 	amdgpu_gart_dummy_page_fini(adev);
 
-	if (drm_dev_is_unplugged(adev_to_drm(adev)))
+	if (pci_dev_is_disconnected(adev->pdev) &&
+		drm_dev_is_unplugged(adev_to_drm(adev)))
 		amdgpu_device_unmap_mmio(adev);
 
 }
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH 3/3] drm/amdgpu: use pci_dev_is_disconnected
  2023-01-25 17:16 ` [PATCH 3/3] drm/amdgpu: use pci_dev_is_disconnected vitaly.prosyak
@ 2023-01-26  9:20   ` Christian König
  2023-01-26 14:28     ` vitaly prosyak
  0 siblings, 1 reply; 5+ messages in thread
From: Christian König @ 2023-01-26  9:20 UTC (permalink / raw)
  To: vitaly.prosyak, amd-gfx, alexander.deucher, yipeng.chai

Am 25.01.23 um 18:16 schrieb vitaly.prosyak@amd.com:
> From: Vitaly Prosyak <vitaly.prosyak@amd.com>
>
> Added condition for pci_dev_is_disconnected and keeps
> drm_dev_is_unplugged to check whether we should unmap MMIO.
> Suggested by Alex regarding pci_dev_is_disconnected.
> Suggested by Christian keeping drm_dev_is_unplugged.
>
> Signed-off-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
> Reviewed-by Alex Deucher <alexander.deucher@amd.com>
> Reviewed-by Christian Koenig <christian.coenig@amd.com>

Did I gave my rb with this include path below???

> Change-Id: I618c471cd398437d4ed6dec6d22be78e12683ae6
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 5 ++++-
>   1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index a10b627c8357..d3568e1ded23 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -78,6 +78,8 @@
>   
>   #include <drm/drm_drv.h>
>   
> +#include "../../../../pci/pci.h"

That include path looks more than suspicious.

If we want to use pci_dev_is_disconnected() outside of the pci subsystem 
we should probably move it to include/linux/pci.h

Regards,
Christian.

> +
>   MODULE_FIRMWARE("amdgpu/vega10_gpu_info.bin");
>   MODULE_FIRMWARE("amdgpu/vega12_gpu_info.bin");
>   MODULE_FIRMWARE("amdgpu/raven_gpu_info.bin");
> @@ -4031,7 +4033,8 @@ void amdgpu_device_fini_hw(struct amdgpu_device *adev)
>   
>   	amdgpu_gart_dummy_page_fini(adev);
>   
> -	if (drm_dev_is_unplugged(adev_to_drm(adev)))
> +	if (pci_dev_is_disconnected(adev->pdev) &&
> +		drm_dev_is_unplugged(adev_to_drm(adev)))
>   		amdgpu_device_unmap_mmio(adev);
>   
>   }


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH 3/3] drm/amdgpu: use pci_dev_is_disconnected
  2023-01-26  9:20   ` Christian König
@ 2023-01-26 14:28     ` vitaly prosyak
  0 siblings, 0 replies; 5+ messages in thread
From: vitaly prosyak @ 2023-01-26 14:28 UTC (permalink / raw)
  To: Christian König, vitaly.prosyak, amd-gfx, alexander.deucher,
	yipeng.chai


On 2023-01-26 04:20, Christian König wrote:
> Am 25.01.23 um 18:16 schrieb vitaly.prosyak@amd.com:
>> From: Vitaly Prosyak <vitaly.prosyak@amd.com>
>>
>> Added condition for pci_dev_is_disconnected and keeps
>> drm_dev_is_unplugged to check whether we should unmap MMIO.
>> Suggested by Alex regarding pci_dev_is_disconnected.
>> Suggested by Christian keeping drm_dev_is_unplugged.
>>
>> Signed-off-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
>> Reviewed-by Alex Deucher <alexander.deucher@amd.com>
>> Reviewed-by Christian Koenig <christian.coenig@amd.com>
>
> Did I gave my rb with this include path below???
Explicitly there was no RB, but there was a private thread which does 
not include this include.
>
>> Change-Id: I618c471cd398437d4ed6dec6d22be78e12683ae6
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 5 ++++-
>>   1 file changed, 4 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> index a10b627c8357..d3568e1ded23 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> @@ -78,6 +78,8 @@
>>     #include <drm/drm_drv.h>
>>   +#include "../../../../pci/pci.h"
>
> That include path looks more than suspicious.
>
> If we want to use pci_dev_is_disconnected() outside of the pci 
> subsystem we should probably move it to include/linux/pci.h

Alright, I would drop this patch until the  It would be a separate patch 
to create a new:

include/linux/pci.h  with the following content:
#if IS_ENABLED(CONFIG_OF) && IS_ENABLED(CONFIG_PCI)
bool pci_dev_is_disconnected(const struct pci_dev *dev)
{
     return dev->error_state == pci_channel_io_perm_failure;
}
#else
bool pci_dev_is_disconnected(const struct pci_dev *dev)
{
     return true;
}

#endif

I am not sure about enablement  of amdgpu when PCI is off in the config, 
but this is related to move pci_dev_is_disconnected into separate file 
and receive review from PCI system/.

>
> Regards,
> Christian.
>
>> +
>>   MODULE_FIRMWARE("amdgpu/vega10_gpu_info.bin");
>>   MODULE_FIRMWARE("amdgpu/vega12_gpu_info.bin");
>>   MODULE_FIRMWARE("amdgpu/raven_gpu_info.bin");
>> @@ -4031,7 +4033,8 @@ void amdgpu_device_fini_hw(struct amdgpu_device 
>> *adev)
>>         amdgpu_gart_dummy_page_fini(adev);
>>   -    if (drm_dev_is_unplugged(adev_to_drm(adev)))
>> +    if (pci_dev_is_disconnected(adev->pdev) &&
>> +        drm_dev_is_unplugged(adev_to_drm(adev)))
>>           amdgpu_device_unmap_mmio(adev);
>>     }
>

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2023-01-26 14:28 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-01-25 17:16 [PATCH 1/3] Revert "drm/amdgpu: TA unload messages are not actually sent to psp when amdgpu is uninstalled" vitaly.prosyak
2023-01-25 17:16 ` [PATCH 2/3] drm/amdgpu: always sending PSP messages LOAD_ASD and UNLOAD_TA vitaly.prosyak
2023-01-25 17:16 ` [PATCH 3/3] drm/amdgpu: use pci_dev_is_disconnected vitaly.prosyak
2023-01-26  9:20   ` Christian König
2023-01-26 14:28     ` vitaly prosyak

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.