* [PATCH v4 00/14] RFC Support hot device unplug in amdgpu
@ 2021-01-18 21:01 ` Andrey Grodzovsky
From: Andrey Grodzovsky @ 2021-01-18 21:01 UTC (permalink / raw)
  To: amd-gfx, dri-devel, ckoenig.leichtzumerken, daniel.vetter, robh,
	l.stach, yuq825, eric
  Cc: Alexander.Deucher, gregkh

Until now, extracting a card, either by physical removal (e.g. an eGPU with a
Thunderbolt connection) or by emulation through sysfs
(/sys/bus/pci/devices/device_id/remove), would cause random crashes in user
apps. The random crashes were mostly due to an app that had mapped a
device-backed BO into its address space still trying to access the BO while
the backing device was gone.
To address this first problem, Christian suggested fixing the handling of
mapped memory in the clients when the device goes away, by forcibly unmapping
all buffers the user processes hold, i.e. clearing their respective VMAs
mapping the device BOs. Then, when the VMAs try to fill in the page tables
again, we check in the fault handler whether the device was removed and, if
so, return an error. This generates a SIGBUS to the application, which can
then cleanly terminate. This was indeed done, but it in turn created a problem
of kernel OOPSes: while the app was terminating because of the SIGBUS, it
would trigger a use-after-free in the driver by calling into device structures
that were already released by the PCI remove sequence. This was handled by
introducing a 'flush' sequence during device removal, where we wait for the
drm file reference count to drop to 0, meaning all user clients directly using
this device have terminated.
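
For illustration, the dropped approach boiled down to a check of this shape in
the fault handler (a simplified sketch, not code from the series):

static vm_fault_t bo_vm_fault_sketch(struct vm_fault *vmf)
{
	struct ttm_buffer_object *bo = vmf->vma->vm_private_data;

	/* Device already unplugged: fail the fault so the client gets
	 * SIGBUS and can terminate cleanly. */
	if (drm_dev_is_unplugged(bo->base.dev))
		return VM_FAULT_SIGBUS;

	/* ... normal TTM fault handling ... */
	return VM_FAULT_NOPAGE;
}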

v2:
Based on discussions on the mailing list with Daniel and Pekka [1], and on the
document Pekka produced from those discussions [2], the whole approach of
returning SIGBUS and waiting for all user clients with CPU mappings of device
BOs to die was dropped. Instead, as the document suggests, the device
structures are kept alive until the last reference to the device is dropped by
a user client, and in the meanwhile all existing and new CPU mappings of BOs
belonging to the device, directly or through dma-buf import, are rerouted to a
per-user-process dummy RW page (see the sketch just below). Also, I skipped
the 'Requirements for KMS UAPI' section of [2], since I am trying to get the
minimal set of requirements that still gives a useful solution to work - that
is the 'Requirements for Render and Cross-Device UAPI' section - and so my
test case is removing a secondary device, which is render-only and not
involved in KMS.
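
The heart of the new direction, simplified from patch 01 below (BO reservation
and unlock handling omitted for brevity):

vm_fault_t ttm_bo_vm_fault(struct vm_fault *vmf)
{
	struct ttm_buffer_object *bo = vmf->vma->vm_private_data;
	struct drm_device *ddev = bo->base.dev;
	vm_fault_t ret;
	int idx;

	if (drm_dev_enter(ddev, &idx)) {
		/* Device still present: fault in the real BO pages. */
		ret = ttm_bo_vm_fault_reserved(vmf, vmf->vma->vm_page_prot,
					       TTM_BO_VM_NUM_PREFAULT, 1);
		drm_dev_exit(idx);
	} else {
		/* Device gone: back the whole VMA with a zeroed dummy page. */
		ret = ttm_bo_vm_dummy_page(vmf, vmf->vma->vm_page_prot);
	}
	return ret;
}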

v3:
More updates following comments on v2, such as removing the loop to find the
DRM file when rerouting page faults to the dummy page, getting rid of
unnecessary sysfs handling refactoring, and moving the prevention of GPU
recovery post device unplug from amdgpu to the scheduler layer.
On top of that, added unplug support for IOMMU-enabled systems.

v4:
Drop the last sysfs hack and use a sysfs default attribute.
Guard against write accesses after device removal to avoid modifying released memory.
Update dummy page handling to on-demand allocation and release through the DRM
managed (drmm) framework.
Add a return value to the scheduler job timeout (TO) handler (by Luben Tuikov)
and use it in amdgpu to prevent GPU recovery post device unplug.
Also rebase on top of drm-misc-next instead of amd-staging-drm-next.

With these patches I am able to gracefully remove the secondary card using the
sysfs remove hook while glxgears is running off that card (DRI_PRIME=1),
without kernel oopses or hangs, and to keep working with the primary card or
soft reset the device without hangs or oopses.
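
Concretely, the test sequence amounts to the following (second command as
root; the PCI address is only an example):

  $ DRI_PRIME=1 glxgears &
  # echo 1 > /sys/bus/pci/devices/0000:03:00.0/remove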

TODOs for followup work:
Convert AMDGPU code to use devm (for hw stuff) and drmm (for sw stuff and allocations) (Daniel)
Support plugging the secondary device back after unplug - currently still experiencing HW error on plugging back.
Add support for the 'Requirements for KMS UAPI' section of [2] - unplugging the primary, display-connected card.

[1] - Discussions during v3 of the patchset https://www.spinics.net/lists/amd-gfx/msg55576.html
[2] - drm/doc: device hot-unplug for userspace https://www.spinics.net/lists/dri-devel/msg259755.html
[3] - Related gitlab ticket https://gitlab.freedesktop.org/drm/amd/-/issues/1081

Andrey Grodzovsky (13):
  drm/ttm: Remap all page faults to per process dummy page.
  drm: Unmap the entire device address space on device unplug
  drm/ttm: Expose ttm_tt_unpopulate for driver use
  drm/sched: Cancel and flush all outstanding jobs before finish.
  drm/amdgpu: Split amdgpu_device_fini into early and late
  drm/amdgpu: Add early fini callback
  drm/amdgpu: Register IOMMU topology notifier per device.
  drm/amdgpu: Fix a bunch of sdma code crash post device unplug
  drm/amdgpu: Remap all page faults to per process dummy page.
  drm/amdgpu: Move some sysfs attrs creation to default_attr
  drm/amdgpu: Guard against write accesses after device removal
  drm/sched: Make timeout timer rearm conditional.
  drm/amdgpu: Prevent any job recoveries after device is unplugged.

Luben Tuikov (1):
  drm/scheduler: Job timeout handler returns status

 drivers/gpu/drm/amd/amdgpu/amdgpu.h               |  11 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c      |  17 +--
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c        | 149 ++++++++++++++++++++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c           |  20 ++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c         |  15 ++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c          |   2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h          |   1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c           |   9 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c       |  25 ++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c           |  26 ++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_irq.h           |   3 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.c           |  19 ++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c           |  12 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.c        |  10 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.h        |   2 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c           |  53 +++++---
 drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h           |   3 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c           |   1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c          |  70 ++++++++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h          |  52 +-------
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c           |  21 ++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c            |   8 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c      |  14 +-
 drivers/gpu/drm/amd/amdgpu/cik_ih.c               |   2 +-
 drivers/gpu/drm/amd/amdgpu/cz_ih.c                |   2 +-
 drivers/gpu/drm/amd/amdgpu/iceland_ih.c           |   2 +-
 drivers/gpu/drm/amd/amdgpu/navi10_ih.c            |   2 +-
 drivers/gpu/drm/amd/amdgpu/psp_v11_0.c            |  16 +--
 drivers/gpu/drm/amd/amdgpu/psp_v12_0.c            |   8 +-
 drivers/gpu/drm/amd/amdgpu/psp_v3_1.c             |   8 +-
 drivers/gpu/drm/amd/amdgpu/si_ih.c                |   2 +-
 drivers/gpu/drm/amd/amdgpu/tonga_ih.c             |   2 +-
 drivers/gpu/drm/amd/amdgpu/vega10_ih.c            |   2 +-
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c |  12 +-
 drivers/gpu/drm/amd/include/amd_shared.h          |   2 +
 drivers/gpu/drm/drm_drv.c                         |   3 +
 drivers/gpu/drm/etnaviv/etnaviv_sched.c           |  10 +-
 drivers/gpu/drm/lima/lima_sched.c                 |   4 +-
 drivers/gpu/drm/panfrost/panfrost_job.c           |   9 +-
 drivers/gpu/drm/scheduler/sched_main.c            |  18 ++-
 drivers/gpu/drm/ttm/ttm_bo_vm.c                   |  82 +++++++++++-
 drivers/gpu/drm/ttm/ttm_tt.c                      |   1 +
 drivers/gpu/drm/v3d/v3d_sched.c                   |  32 ++---
 include/drm/gpu_scheduler.h                       |  17 ++-
 include/drm/ttm/ttm_bo_api.h                      |   2 +
 45 files changed, 583 insertions(+), 198 deletions(-)

-- 
2.7.4


* [PATCH v4 01/14] drm/ttm: Remap all page faults to per process dummy page.
@ 2021-01-18 21:01   ` Andrey Grodzovsky
From: Andrey Grodzovsky @ 2021-01-18 21:01 UTC (permalink / raw)
  To: amd-gfx, dri-devel, ckoenig.leichtzumerken, daniel.vetter, robh,
	l.stach, yuq825, eric
  Cc: Alexander.Deucher, gregkh

On device removal, reroute all CPU mappings to the dummy page.

v3:
Remove loop to find DRM file and instead access it
by vma->vm_file->private_data. Move dummy page installation
into a separate function.

v4:
Map the entire BO's VA space to an on-demand-allocated dummy page
on the first fault for that BO.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
---
 drivers/gpu/drm/ttm/ttm_bo_vm.c | 82 ++++++++++++++++++++++++++++++++++++++++-
 include/drm/ttm/ttm_bo_api.h    |  2 +
 2 files changed, 83 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c b/drivers/gpu/drm/ttm/ttm_bo_vm.c
index 6dc96cf..ed89da3 100644
--- a/drivers/gpu/drm/ttm/ttm_bo_vm.c
+++ b/drivers/gpu/drm/ttm/ttm_bo_vm.c
@@ -34,6 +34,8 @@
 #include <drm/ttm/ttm_bo_driver.h>
 #include <drm/ttm/ttm_placement.h>
 #include <drm/drm_vma_manager.h>
+#include <drm/drm_drv.h>
+#include <drm/drm_managed.h>
 #include <linux/mm.h>
 #include <linux/pfn_t.h>
 #include <linux/rbtree.h>
@@ -380,25 +382,103 @@ vm_fault_t ttm_bo_vm_fault_reserved(struct vm_fault *vmf,
 }
 EXPORT_SYMBOL(ttm_bo_vm_fault_reserved);
 
+static void ttm_bo_release_dummy_page(struct drm_device *dev, void *res)
+{
+	struct page *dummy_page = (struct page *)res;
+
+	__free_page(dummy_page);
+}
+
+vm_fault_t ttm_bo_vm_dummy_page(struct vm_fault *vmf, pgprot_t prot)
+{
+	struct vm_area_struct *vma = vmf->vma;
+	struct ttm_buffer_object *bo = vma->vm_private_data;
+	struct ttm_bo_device *bdev = bo->bdev;
+	struct drm_device *ddev = bo->base.dev;
+	vm_fault_t ret = VM_FAULT_NOPAGE;
+	unsigned long address = vma->vm_start;
+	unsigned long num_prefault = (vma->vm_end - vma->vm_start) >> PAGE_SHIFT;
+	unsigned long pfn;
+	struct page *page;
+	int i;
+
+	/*
+	 * Wait for buffer data in transit, due to a pipelined
+	 * move.
+	 */
+	ret = ttm_bo_vm_fault_idle(bo, vmf);
+	if (unlikely(ret != 0))
+		return ret;
+
+	/* Allocate a new dummy page to map the whole VA range of this VMA */
+	page = alloc_page(GFP_KERNEL | __GFP_ZERO);
+	if (!page)
+		return VM_FAULT_OOM;
+
+	pfn = page_to_pfn(page);
+
+	/*
+	 * Prefault the entire VMA range right away to avoid further faults
+	 */
+	for (i = 0; i < num_prefault; ++i) {
+
+		if (unlikely(address >= vma->vm_end))
+			break;
+
+		if (vma->vm_flags & VM_MIXEDMAP)
+			ret = vmf_insert_mixed_prot(vma, address,
+						    __pfn_to_pfn_t(pfn, PFN_DEV),
+						    prot);
+		else
+			ret = vmf_insert_pfn_prot(vma, address, pfn, prot);
+
+		/* Never error on prefaulted PTEs */
+		if (unlikely((ret & VM_FAULT_ERROR))) {
+			if (i == 0) {
+				/* Nothing mapped yet, don't leak the page */
+				__free_page(page);
+				return VM_FAULT_NOPAGE;
+			}
+			break;
+		}
+
+		address += PAGE_SIZE;
+	}
+
+	/* Set the page to be freed using drmm release action */
+	if (drmm_add_action_or_reset(ddev, ttm_bo_release_dummy_page, page))
+		return VM_FAULT_OOM;
+
+	return ret;
+}
+EXPORT_SYMBOL(ttm_bo_vm_dummy_page);
+
 vm_fault_t ttm_bo_vm_fault(struct vm_fault *vmf)
 {
 	struct vm_area_struct *vma = vmf->vma;
 	pgprot_t prot;
 	struct ttm_buffer_object *bo = vma->vm_private_data;
+	struct drm_device *ddev = bo->base.dev;
 	vm_fault_t ret;
+	int idx;
 
 	ret = ttm_bo_vm_reserve(bo, vmf);
 	if (ret)
 		return ret;
 
 	prot = vma->vm_page_prot;
-	ret = ttm_bo_vm_fault_reserved(vmf, prot, TTM_BO_VM_NUM_PREFAULT, 1);
+	if (drm_dev_enter(ddev, &idx)) {
+		ret = ttm_bo_vm_fault_reserved(vmf, prot, TTM_BO_VM_NUM_PREFAULT, 1);
+		drm_dev_exit(idx);
+	} else {
+		ret = ttm_bo_vm_dummy_page(vmf, prot);
+	}
 	if (ret == VM_FAULT_RETRY && !(vmf->flags & FAULT_FLAG_RETRY_NOWAIT))
 		return ret;
 
 	dma_resv_unlock(bo->base.resv);
 
 	return ret;
 }
 EXPORT_SYMBOL(ttm_bo_vm_fault);
 
diff --git a/include/drm/ttm/ttm_bo_api.h b/include/drm/ttm/ttm_bo_api.h
index e17be32..12fb240 100644
--- a/include/drm/ttm/ttm_bo_api.h
+++ b/include/drm/ttm/ttm_bo_api.h
@@ -643,4 +643,6 @@ void ttm_bo_vm_close(struct vm_area_struct *vma);
 int ttm_bo_vm_access(struct vm_area_struct *vma, unsigned long addr,
 		     void *buf, int len, int write);
 
+vm_fault_t ttm_bo_vm_dummy_page(struct vm_fault *vmf, pgprot_t prot);
+
 #endif
-- 
2.7.4


* [PATCH v4 02/14] drm: Unmap the entire device address space on device unplug
@ 2021-01-18 21:01   ` Andrey Grodzovsky
From: Andrey Grodzovsky @ 2021-01-18 21:01 UTC (permalink / raw)
  To: amd-gfx, dri-devel, ckoenig.leichtzumerken, daniel.vetter, robh,
	l.stach, yuq825, eric
  Cc: Alexander.Deucher, gregkh

Invalidate all BOs' CPU mappings once the device is removed.

v3: Move the code from TTM into drm_dev_unplug

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Daniel Vetter <daniel@ffwll.ch>
---
 drivers/gpu/drm/drm_drv.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
index d384a5b..20d22e4 100644
--- a/drivers/gpu/drm/drm_drv.c
+++ b/drivers/gpu/drm/drm_drv.c
@@ -469,6 +469,9 @@ void drm_dev_unplug(struct drm_device *dev)
 	synchronize_srcu(&drm_unplug_srcu);
 
 	drm_dev_unregister(dev);
+
+	/* Clear all CPU mappings pointing to this device */
+	unmap_mapping_range(dev->anon_inode->i_mapping, 0, 0, 1);
 }
 EXPORT_SYMBOL(drm_dev_unplug);
 
-- 
2.7.4


* [PATCH v4 03/14] drm/ttm: Expose ttm_tt_unpopulate for driver use
@ 2021-01-18 21:01   ` Andrey Grodzovsky
From: Andrey Grodzovsky @ 2021-01-18 21:01 UTC (permalink / raw)
  To: amd-gfx, dri-devel, ckoenig.leichtzumerken, daniel.vetter, robh,
	l.stach, yuq825, eric
  Cc: Alexander.Deucher, gregkh

It's needed to drop IOMMU-backed pages on device unplug,
before the device's IOMMU group is released.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
---
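A sketch of the intended driver-side usage; the helper below is illustrative
only, the real amdgpu user arrives with the IOMMU topology notifier patch
later in this series:

static void mydrv_unpopulate_bo(struct ttm_buffer_object *bo)
{
	/* Drop the BO's IOMMU-backed system pages while the device's
	 * IOMMU group still exists. */
	if (bo->ttm)
		ttm_tt_unpopulate(bo->bdev, bo->ttm);
}
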
 drivers/gpu/drm/ttm/ttm_tt.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/ttm/ttm_tt.c b/drivers/gpu/drm/ttm/ttm_tt.c
index 7f75a13..f9e0b0d 100644
--- a/drivers/gpu/drm/ttm/ttm_tt.c
+++ b/drivers/gpu/drm/ttm/ttm_tt.c
@@ -341,3 +341,4 @@ void ttm_tt_unpopulate(struct ttm_bo_device *bdev,
 		ttm_pool_free(&bdev->pool, ttm);
 	ttm->page_flags &= ~TTM_PAGE_FLAG_PRIV_POPULATED;
 }
+EXPORT_SYMBOL(ttm_tt_unpopulate);
-- 
2.7.4


* [PATCH v4 04/14] drm/sched: Cancel and flush all outstanding jobs before finish.
@ 2021-01-18 21:01   ` Andrey Grodzovsky
From: Andrey Grodzovsky @ 2021-01-18 21:01 UTC (permalink / raw)
  To: amd-gfx, dri-devel, ckoenig.leichtzumerken, daniel.vetter, robh,
	l.stach, yuq825, eric
  Cc: Alexander.Deucher, gregkh

To avoid any possible use-after-free.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
---
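The race being closed, in sketch form (illustrative, not code from the tree):

	/*
	 * drm_sched_fini(sched);
	 * kfree(ring);                  // scheduler memory freed
	 * ...
	 * sched->work_tdr fires late -> timeout handler touches freed
	 *                               scheduler/device structures
	 */

cancel_delayed_work_sync() waits for a running timeout handler to complete and
cancels a pending one, so no TDR work survives drm_sched_fini().
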
 drivers/gpu/drm/scheduler/sched_main.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index 997aa15..92637b7 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -899,6 +899,9 @@ void drm_sched_fini(struct drm_gpu_scheduler *sched)
 	if (sched->thread)
 		kthread_stop(sched->thread);
 
+	/* Confirm no work left behind accessing device structures */
+	cancel_delayed_work_sync(&sched->work_tdr);
+
 	sched->ready = false;
 }
 EXPORT_SYMBOL(drm_sched_fini);
-- 
2.7.4


* [PATCH v4 05/14] drm/amdgpu: Split amdgpu_device_fini into early and late
@ 2021-01-18 21:01   ` Andrey Grodzovsky
From: Andrey Grodzovsky @ 2021-01-18 21:01 UTC (permalink / raw)
  To: amd-gfx, dri-devel, ckoenig.leichtzumerken, daniel.vetter, robh,
	l.stach, yuq825, eric
  Cc: Alexander.Deucher, gregkh

Some of the work in amdgpu_device_fini, such as disabling HW interrupts
and finalizing pending fences, must be done right away on pci_remove,
while most of the work related to finalizing and releasing driver data
structures can be deferred until the drm_driver.release hook is called,
i.e. when the last device reference is dropped.

v4: Change functions prefix early->hw and late->sw

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
---
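For orientation, the teardown flow after this patch, condensed from the diff
below (bodies abbreviated):

static void amdgpu_pci_remove(struct pci_dev *pdev)	/* device gone now */
{
	struct drm_device *dev = pci_get_drvdata(pdev);

	drm_dev_unplug(dev);
	amdgpu_driver_unload_kms(dev);	/* -> amdgpu_device_fini_hw():
					 * irqs, fences, sysfs - the last
					 * HW accesses */
	pci_disable_device(pdev);
}

void amdgpu_driver_release_kms(struct drm_device *dev)	/* last ref dropped */
{
	struct amdgpu_device *adev = drm_to_adev(dev);

	amdgpu_device_fini_sw(adev);	/* free SW structures, pci_state */
	pci_set_drvdata(adev->pdev, NULL);
}
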
 drivers/gpu/drm/amd/amdgpu/amdgpu.h        |  6 +++++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 26 ++++++++++++++++++--------
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c    |  7 ++-----
 drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c  | 15 ++++++++++++++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c    | 26 ++++++++++++++++----------
 drivers/gpu/drm/amd/amdgpu/amdgpu_irq.h    |  3 ++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c    | 12 +++++++++++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c    |  1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h   |  3 ++-
 drivers/gpu/drm/amd/amdgpu/cik_ih.c        |  2 +-
 drivers/gpu/drm/amd/amdgpu/cz_ih.c         |  2 +-
 drivers/gpu/drm/amd/amdgpu/iceland_ih.c    |  2 +-
 drivers/gpu/drm/amd/amdgpu/navi10_ih.c     |  2 +-
 drivers/gpu/drm/amd/amdgpu/si_ih.c         |  2 +-
 drivers/gpu/drm/amd/amdgpu/tonga_ih.c      |  2 +-
 drivers/gpu/drm/amd/amdgpu/vega10_ih.c     |  2 +-
 16 files changed, 78 insertions(+), 35 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index f77443c..478a7d8 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -1060,7 +1060,9 @@ static inline struct amdgpu_device *amdgpu_ttm_adev(struct ttm_bo_device *bdev)
 
 int amdgpu_device_init(struct amdgpu_device *adev,
 		       uint32_t flags);
-void amdgpu_device_fini(struct amdgpu_device *adev);
+void amdgpu_device_fini_hw(struct amdgpu_device *adev);
+void amdgpu_device_fini_sw(struct amdgpu_device *adev);
+
 int amdgpu_gpu_wait_for_idle(struct amdgpu_device *adev);
 
 void amdgpu_device_vram_access(struct amdgpu_device *adev, loff_t pos,
@@ -1273,6 +1275,8 @@ void amdgpu_driver_lastclose_kms(struct drm_device *dev);
 int amdgpu_driver_open_kms(struct drm_device *dev, struct drm_file *file_priv);
 void amdgpu_driver_postclose_kms(struct drm_device *dev,
 				 struct drm_file *file_priv);
+void amdgpu_driver_release_kms(struct drm_device *dev);
+
 int amdgpu_device_ip_suspend(struct amdgpu_device *adev);
 int amdgpu_device_suspend(struct drm_device *dev, bool fbcon);
 int amdgpu_device_resume(struct drm_device *dev, bool fbcon);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 348ac67..90c8353 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -3579,14 +3579,12 @@ int amdgpu_device_init(struct amdgpu_device *adev,
  * Tear down the driver info (all asics).
  * Called at driver shutdown.
  */
-void amdgpu_device_fini(struct amdgpu_device *adev)
+void amdgpu_device_fini_hw(struct amdgpu_device *adev)
 {
 	dev_info(adev->dev, "amdgpu: finishing device.\n");
 	flush_delayed_work(&adev->delayed_init_work);
 	adev->shutdown = true;
 
-	kfree(adev->pci_state);
-
 	/* make sure IB test finished before entering exclusive mode
 	 * to avoid preemption on IB test
 	 * */
@@ -3603,11 +3601,24 @@ void amdgpu_device_fini(struct amdgpu_device *adev)
 		else
 			drm_atomic_helper_shutdown(adev_to_drm(adev));
 	}
-	amdgpu_fence_driver_fini(adev);
+	amdgpu_fence_driver_fini_hw(adev);
+
 	if (adev->pm_sysfs_en)
 		amdgpu_pm_sysfs_fini(adev);
+	if (adev->ucode_sysfs_en)
+		amdgpu_ucode_sysfs_fini(adev);
+	sysfs_remove_files(&adev->dev->kobj, amdgpu_dev_attributes);
+
+
 	amdgpu_fbdev_fini(adev);
+
+	amdgpu_irq_fini_hw(adev);
+}
+
+void amdgpu_device_fini_sw(struct amdgpu_device *adev)
+{
 	amdgpu_device_ip_fini(adev);
+	amdgpu_fence_driver_fini_sw(adev);
 	release_firmware(adev->firmware.gpu_info_fw);
 	adev->firmware.gpu_info_fw = NULL;
 	adev->accel_working = false;
@@ -3636,14 +3647,13 @@ void amdgpu_device_fini(struct amdgpu_device *adev)
 	adev->rmmio = NULL;
 	amdgpu_device_doorbell_fini(adev);
 
-	if (adev->ucode_sysfs_en)
-		amdgpu_ucode_sysfs_fini(adev);
-
-	sysfs_remove_files(&adev->dev->kobj, amdgpu_dev_attributes);
 	if (IS_ENABLED(CONFIG_PERF_EVENTS))
 		amdgpu_pmu_fini(adev);
 	if (adev->mman.discovery_bin)
 		amdgpu_discovery_fini(adev);
+
+	kfree(adev->pci_state);
+
 }
 
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 72efd57..9c0cd00 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -1238,14 +1238,10 @@ amdgpu_pci_remove(struct pci_dev *pdev)
 {
 	struct drm_device *dev = pci_get_drvdata(pdev);
 
-#ifdef MODULE
-	if (THIS_MODULE->state != MODULE_STATE_GOING)
-#endif
-		DRM_ERROR("Hotplug removal is not supported\n");
 	drm_dev_unplug(dev);
 	amdgpu_driver_unload_kms(dev);
+
 	pci_disable_device(pdev);
-	pci_set_drvdata(pdev, NULL);
 }
 
 static void
@@ -1569,6 +1565,7 @@ static const struct drm_driver amdgpu_kms_driver = {
 	.dumb_create = amdgpu_mode_dumb_create,
 	.dumb_map_offset = amdgpu_mode_dumb_mmap,
 	.fops = &amdgpu_driver_kms_fops,
+	.release = &amdgpu_driver_release_kms,
 
 	.prime_handle_to_fd = drm_gem_prime_handle_to_fd,
 	.prime_fd_to_handle = drm_gem_prime_fd_to_handle,
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
index d56f402..e19b74c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
@@ -523,7 +523,7 @@ int amdgpu_fence_driver_init(struct amdgpu_device *adev)
  *
  * Tear down the fence driver for all possible rings (all asics).
  */
-void amdgpu_fence_driver_fini(struct amdgpu_device *adev)
+void amdgpu_fence_driver_fini_hw(struct amdgpu_device *adev)
 {
 	unsigned i, j;
 	int r;
@@ -544,6 +544,19 @@ void amdgpu_fence_driver_fini(struct amdgpu_device *adev)
 		if (!ring->no_scheduler)
 			drm_sched_fini(&ring->sched);
 		del_timer_sync(&ring->fence_drv.fallback_timer);
+	}
+}
+
+void amdgpu_fence_driver_fini_sw(struct amdgpu_device *adev)
+{
+	unsigned int i, j;
+
+	for (i = 0; i < AMDGPU_MAX_RINGS; i++) {
+		struct amdgpu_ring *ring = adev->rings[i];
+
+		if (!ring || !ring->fence_drv.initialized)
+			continue;
+
 		for (j = 0; j <= ring->fence_drv.num_fences_mask; ++j)
 			dma_fence_put(ring->fence_drv.fences[j]);
 		kfree(ring->fence_drv.fences);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
index bea57e8..2f1cfc5 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
@@ -49,6 +49,7 @@
 #include <drm/drm_irq.h>
 #include <drm/drm_vblank.h>
 #include <drm/amdgpu_drm.h>
+#include <drm/drm_drv.h>
 #include "amdgpu.h"
 #include "amdgpu_ih.h"
 #include "atom.h"
@@ -313,6 +314,20 @@ int amdgpu_irq_init(struct amdgpu_device *adev)
 	return 0;
 }
 
+
+void amdgpu_irq_fini_hw(struct amdgpu_device *adev)
+{
+	if (adev->irq.installed) {
+		drm_irq_uninstall(&adev->ddev);
+		adev->irq.installed = false;
+		if (adev->irq.msi_enabled)
+			pci_free_irq_vectors(adev->pdev);
+
+		if (!amdgpu_device_has_dc_support(adev))
+			flush_work(&adev->hotplug_work);
+	}
+}
+
 /**
  * amdgpu_irq_fini - shut down interrupt handling
  *
@@ -322,19 +337,10 @@ int amdgpu_irq_init(struct amdgpu_device *adev)
  * functionality, shuts down vblank, hotplug and reset interrupt handling,
  * turns off interrupts from all sources (all ASICs).
  */
-void amdgpu_irq_fini(struct amdgpu_device *adev)
+void amdgpu_irq_fini_sw(struct amdgpu_device *adev)
 {
 	unsigned i, j;
 
-	if (adev->irq.installed) {
-		drm_irq_uninstall(adev_to_drm(adev));
-		adev->irq.installed = false;
-		if (adev->irq.msi_enabled)
-			pci_free_irq_vectors(adev->pdev);
-		if (!amdgpu_device_has_dc_support(adev))
-			flush_work(&adev->hotplug_work);
-	}
-
 	for (i = 0; i < AMDGPU_IRQ_CLIENTID_MAX; ++i) {
 		if (!adev->irq.client[i].sources)
 			continue;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.h
index ac527e5..392a732 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.h
@@ -104,7 +104,8 @@ void amdgpu_irq_disable_all(struct amdgpu_device *adev);
 irqreturn_t amdgpu_irq_handler(int irq, void *arg);
 
 int amdgpu_irq_init(struct amdgpu_device *adev);
-void amdgpu_irq_fini(struct amdgpu_device *adev);
+void amdgpu_irq_fini_sw(struct amdgpu_device *adev);
+void amdgpu_irq_fini_hw(struct amdgpu_device *adev);
 int amdgpu_irq_add_id(struct amdgpu_device *adev,
 		      unsigned client_id, unsigned src_id,
 		      struct amdgpu_irq_src *source);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
index b16b327..fee95d3 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
@@ -29,6 +29,7 @@
 #include "amdgpu.h"
 #include <drm/drm_debugfs.h>
 #include <drm/amdgpu_drm.h>
+#include <drm/drm_drv.h>
 #include "amdgpu_uvd.h"
 #include "amdgpu_vce.h"
 #include "atom.h"
@@ -93,7 +94,7 @@ void amdgpu_driver_unload_kms(struct drm_device *dev)
 	}
 
 	amdgpu_acpi_fini(adev);
-	amdgpu_device_fini(adev);
+	amdgpu_device_fini_hw(adev);
 }
 
 void amdgpu_register_gpu_instance(struct amdgpu_device *adev)
@@ -1153,6 +1154,15 @@ void amdgpu_driver_postclose_kms(struct drm_device *dev,
 	pm_runtime_put_autosuspend(dev->dev);
 }
 
+
+void amdgpu_driver_release_kms(struct drm_device *dev)
+{
+	struct amdgpu_device *adev = drm_to_adev(dev);
+
+	amdgpu_device_fini_sw(adev);
+	pci_set_drvdata(adev->pdev, NULL);
+}
+
 /*
  * VBlank related functions.
  */
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
index c136bd4..87eaf13 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
@@ -2142,6 +2142,7 @@ int amdgpu_ras_pre_fini(struct amdgpu_device *adev)
 	if (!con)
 		return 0;
 
+
 	/* Need disable ras on all IPs here before ip [hw/sw]fini */
 	amdgpu_ras_disable_all_features(adev, 0);
 	amdgpu_ras_recovery_fini(adev);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
index 7112137..accb243 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
@@ -107,7 +107,8 @@ struct amdgpu_fence_driver {
 };
 
 int amdgpu_fence_driver_init(struct amdgpu_device *adev);
-void amdgpu_fence_driver_fini(struct amdgpu_device *adev);
+void amdgpu_fence_driver_fini_hw(struct amdgpu_device *adev);
+void amdgpu_fence_driver_fini_sw(struct amdgpu_device *adev);
 void amdgpu_fence_driver_force_completion(struct amdgpu_ring *ring);
 
 int amdgpu_fence_driver_init_ring(struct amdgpu_ring *ring,
diff --git a/drivers/gpu/drm/amd/amdgpu/cik_ih.c b/drivers/gpu/drm/amd/amdgpu/cik_ih.c
index d374571..183d44a 100644
--- a/drivers/gpu/drm/amd/amdgpu/cik_ih.c
+++ b/drivers/gpu/drm/amd/amdgpu/cik_ih.c
@@ -309,7 +309,7 @@ static int cik_ih_sw_fini(void *handle)
 {
 	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
 
-	amdgpu_irq_fini(adev);
+	amdgpu_irq_fini_sw(adev);
 	amdgpu_ih_ring_fini(adev, &adev->irq.ih);
 	amdgpu_irq_remove_domain(adev);
 
diff --git a/drivers/gpu/drm/amd/amdgpu/cz_ih.c b/drivers/gpu/drm/amd/amdgpu/cz_ih.c
index da37f8a..ee824d7 100644
--- a/drivers/gpu/drm/amd/amdgpu/cz_ih.c
+++ b/drivers/gpu/drm/amd/amdgpu/cz_ih.c
@@ -290,7 +290,7 @@ static int cz_ih_sw_fini(void *handle)
 {
 	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
 
-	amdgpu_irq_fini(adev);
+	amdgpu_irq_fini_sw(adev);
 	amdgpu_ih_ring_fini(adev, &adev->irq.ih);
 	amdgpu_irq_remove_domain(adev);
 
diff --git a/drivers/gpu/drm/amd/amdgpu/iceland_ih.c b/drivers/gpu/drm/amd/amdgpu/iceland_ih.c
index 37d8b6c..b24f6fb 100644
--- a/drivers/gpu/drm/amd/amdgpu/iceland_ih.c
+++ b/drivers/gpu/drm/amd/amdgpu/iceland_ih.c
@@ -290,7 +290,7 @@ static int iceland_ih_sw_fini(void *handle)
 {
 	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
 
-	amdgpu_irq_fini(adev);
+	amdgpu_irq_fini_sw(adev);
 	amdgpu_ih_ring_fini(adev, &adev->irq.ih);
 	amdgpu_irq_remove_domain(adev);
 
diff --git a/drivers/gpu/drm/amd/amdgpu/navi10_ih.c b/drivers/gpu/drm/amd/amdgpu/navi10_ih.c
index 7ba229e..c191410 100644
--- a/drivers/gpu/drm/amd/amdgpu/navi10_ih.c
+++ b/drivers/gpu/drm/amd/amdgpu/navi10_ih.c
@@ -716,7 +716,7 @@ static int navi10_ih_sw_fini(void *handle)
 {
 	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
 
-	amdgpu_irq_fini(adev);
+	amdgpu_irq_fini_sw(adev);
 	amdgpu_ih_ring_fini(adev, &adev->irq.ih2);
 	amdgpu_ih_ring_fini(adev, &adev->irq.ih1);
 	amdgpu_ih_ring_fini(adev, &adev->irq.ih);
diff --git a/drivers/gpu/drm/amd/amdgpu/si_ih.c b/drivers/gpu/drm/amd/amdgpu/si_ih.c
index 51880f6..751307f 100644
--- a/drivers/gpu/drm/amd/amdgpu/si_ih.c
+++ b/drivers/gpu/drm/amd/amdgpu/si_ih.c
@@ -175,7 +175,7 @@ static int si_ih_sw_fini(void *handle)
 {
 	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
 
-	amdgpu_irq_fini(adev);
+	amdgpu_irq_fini_sw(adev);
 	amdgpu_ih_ring_fini(adev, &adev->irq.ih);
 
 	return 0;
diff --git a/drivers/gpu/drm/amd/amdgpu/tonga_ih.c b/drivers/gpu/drm/amd/amdgpu/tonga_ih.c
index ce33199..729aaaa 100644
--- a/drivers/gpu/drm/amd/amdgpu/tonga_ih.c
+++ b/drivers/gpu/drm/amd/amdgpu/tonga_ih.c
@@ -301,7 +301,7 @@ static int tonga_ih_sw_fini(void *handle)
 {
 	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
 
-	amdgpu_irq_fini(adev);
+	amdgpu_irq_fini_sw(adev);
 	amdgpu_ih_ring_fini(adev, &adev->irq.ih);
 	amdgpu_irq_remove_domain(adev);
 
diff --git a/drivers/gpu/drm/amd/amdgpu/vega10_ih.c b/drivers/gpu/drm/amd/amdgpu/vega10_ih.c
index e5ae31e..a342406 100644
--- a/drivers/gpu/drm/amd/amdgpu/vega10_ih.c
+++ b/drivers/gpu/drm/amd/amdgpu/vega10_ih.c
@@ -627,7 +627,7 @@ static int vega10_ih_sw_fini(void *handle)
 {
 	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
 
-	amdgpu_irq_fini(adev);
+	amdgpu_irq_fini_sw(adev);
 	amdgpu_ih_ring_fini(adev, &adev->irq.ih2);
 	amdgpu_ih_ring_fini(adev, &adev->irq.ih1);
 	amdgpu_ih_ring_fini(adev, &adev->irq.ih);
-- 
2.7.4

 {
 	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
 
-	amdgpu_irq_fini(adev);
+	amdgpu_irq_fini_sw(adev);
 	amdgpu_ih_ring_fini(adev, &adev->irq.ih);
 	amdgpu_irq_remove_domain(adev);
 
diff --git a/drivers/gpu/drm/amd/amdgpu/cz_ih.c b/drivers/gpu/drm/amd/amdgpu/cz_ih.c
index da37f8a..ee824d7 100644
--- a/drivers/gpu/drm/amd/amdgpu/cz_ih.c
+++ b/drivers/gpu/drm/amd/amdgpu/cz_ih.c
@@ -290,7 +290,7 @@ static int cz_ih_sw_fini(void *handle)
 {
 	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
 
-	amdgpu_irq_fini(adev);
+	amdgpu_irq_fini_sw(adev);
 	amdgpu_ih_ring_fini(adev, &adev->irq.ih);
 	amdgpu_irq_remove_domain(adev);
 
diff --git a/drivers/gpu/drm/amd/amdgpu/iceland_ih.c b/drivers/gpu/drm/amd/amdgpu/iceland_ih.c
index 37d8b6c..b24f6fb 100644
--- a/drivers/gpu/drm/amd/amdgpu/iceland_ih.c
+++ b/drivers/gpu/drm/amd/amdgpu/iceland_ih.c
@@ -290,7 +290,7 @@ static int iceland_ih_sw_fini(void *handle)
 {
 	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
 
-	amdgpu_irq_fini(adev);
+	amdgpu_irq_fini_sw(adev);
 	amdgpu_ih_ring_fini(adev, &adev->irq.ih);
 	amdgpu_irq_remove_domain(adev);
 
diff --git a/drivers/gpu/drm/amd/amdgpu/navi10_ih.c b/drivers/gpu/drm/amd/amdgpu/navi10_ih.c
index 7ba229e..c191410 100644
--- a/drivers/gpu/drm/amd/amdgpu/navi10_ih.c
+++ b/drivers/gpu/drm/amd/amdgpu/navi10_ih.c
@@ -716,7 +716,7 @@ static int navi10_ih_sw_fini(void *handle)
 {
 	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
 
-	amdgpu_irq_fini(adev);
+	amdgpu_irq_fini_sw(adev);
 	amdgpu_ih_ring_fini(adev, &adev->irq.ih2);
 	amdgpu_ih_ring_fini(adev, &adev->irq.ih1);
 	amdgpu_ih_ring_fini(adev, &adev->irq.ih);
diff --git a/drivers/gpu/drm/amd/amdgpu/si_ih.c b/drivers/gpu/drm/amd/amdgpu/si_ih.c
index 51880f6..751307f 100644
--- a/drivers/gpu/drm/amd/amdgpu/si_ih.c
+++ b/drivers/gpu/drm/amd/amdgpu/si_ih.c
@@ -175,7 +175,7 @@ static int si_ih_sw_fini(void *handle)
 {
 	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
 
-	amdgpu_irq_fini(adev);
+	amdgpu_irq_fini_sw(adev);
 	amdgpu_ih_ring_fini(adev, &adev->irq.ih);
 
 	return 0;
diff --git a/drivers/gpu/drm/amd/amdgpu/tonga_ih.c b/drivers/gpu/drm/amd/amdgpu/tonga_ih.c
index ce33199..729aaaa 100644
--- a/drivers/gpu/drm/amd/amdgpu/tonga_ih.c
+++ b/drivers/gpu/drm/amd/amdgpu/tonga_ih.c
@@ -301,7 +301,7 @@ static int tonga_ih_sw_fini(void *handle)
 {
 	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
 
-	amdgpu_irq_fini(adev);
+	amdgpu_irq_fini_sw(adev);
 	amdgpu_ih_ring_fini(adev, &adev->irq.ih);
 	amdgpu_irq_remove_domain(adev);
 
diff --git a/drivers/gpu/drm/amd/amdgpu/vega10_ih.c b/drivers/gpu/drm/amd/amdgpu/vega10_ih.c
index e5ae31e..a342406 100644
--- a/drivers/gpu/drm/amd/amdgpu/vega10_ih.c
+++ b/drivers/gpu/drm/amd/amdgpu/vega10_ih.c
@@ -627,7 +627,7 @@ static int vega10_ih_sw_fini(void *handle)
 {
 	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
 
-	amdgpu_irq_fini(adev);
+	amdgpu_irq_fini_sw(adev);
 	amdgpu_ih_ring_fini(adev, &adev->irq.ih2);
 	amdgpu_ih_ring_fini(adev, &adev->irq.ih1);
 	amdgpu_ih_ring_fini(adev, &adev->irq.ih);
-- 
2.7.4

* [PATCH v4 06/14] drm/amdgpu: Add early fini callback
  2021-01-18 21:01 ` Andrey Grodzovsky
@ 2021-01-18 21:01   ` Andrey Grodzovsky
  -1 siblings, 0 replies; 196+ messages in thread
From: Andrey Grodzovsky @ 2021-01-18 21:01 UTC (permalink / raw)
  To: amd-gfx, dri-devel, ckoenig.leichtzumerken, daniel.vetter, robh,
	l.stach, yuq825, eric
  Cc: Alexander.Deucher, gregkh

Use it to call display code that depends on device->drv_data
before it's set to NULL on device unplug.
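
As an illustration, an IP block opts into this new stage roughly as in
the sketch below (the example_* names are hypothetical; only the
amd_ip_funcs hook is real):

static int example_ip_early_fini(void *handle)
{
	struct amdgpu_device *adev = (struct amdgpu_device *)handle;

	/* Tear down anything that still dereferences dev->dev_private
	 * (drv_data) here, before the DRM layer clears it on unplug. */
	example_unbind_audio_component(adev);

	return 0;
}

static const struct amd_ip_funcs example_ip_funcs = {
	.name = "example",
	.early_fini = example_ip_early_fini,
	/* ... remaining sw/hw init and fini hooks ... */
};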

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c        | 20 ++++++++++++++++++++
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 12 ++++++++++--
 drivers/gpu/drm/amd/include/amd_shared.h          |  2 ++
 3 files changed, 32 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 90c8353..45e23e3 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2529,6 +2529,24 @@ static int amdgpu_device_ip_late_init(struct amdgpu_device *adev)
 	return 0;
 }
 
+static int amdgpu_device_ip_fini_early(struct amdgpu_device *adev)
+{
+	int i, r;
+
+	for (i = 0; i < adev->num_ip_blocks; i++) {
+		if (!adev->ip_blocks[i].version->funcs->early_fini)
+			continue;
+
+		r = adev->ip_blocks[i].version->funcs->early_fini((void *)adev);
+		if (r) {
+			DRM_DEBUG("early_fini of IP block <%s> failed %d\n",
+				  adev->ip_blocks[i].version->funcs->name, r);
+		}
+	}
+
+	return 0;
+}
+
 /**
  * amdgpu_device_ip_fini - run fini for hardware IPs
  *
@@ -3613,6 +3631,8 @@ void amdgpu_device_fini_hw(struct amdgpu_device *adev)
 	amdgpu_fbdev_fini(adev);
 
 	amdgpu_irq_fini_hw(adev);
+
+	amdgpu_device_ip_fini_early(adev);
 }
 
 void amdgpu_device_fini_sw(struct amdgpu_device *adev)
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 86c2b2c..9b24f3e 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -1156,6 +1156,15 @@ static int amdgpu_dm_init(struct amdgpu_device *adev)
 	return -EINVAL;
 }
 
+static int amdgpu_dm_early_fini(void *handle)
+{
+	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
+
+	amdgpu_dm_audio_fini(adev);
+
+	return 0;
+}
+
 static void amdgpu_dm_fini(struct amdgpu_device *adev)
 {
 	int i;
@@ -1164,8 +1173,6 @@ static void amdgpu_dm_fini(struct amdgpu_device *adev)
 		drm_encoder_cleanup(&adev->dm.mst_encoders[i].base);
 	}
 
-	amdgpu_dm_audio_fini(adev);
-
 	amdgpu_dm_destroy_drm_device(&adev->dm);
 
 #ifdef CONFIG_DRM_AMD_DC_HDCP
@@ -2175,6 +2182,7 @@ static const struct amd_ip_funcs amdgpu_dm_funcs = {
 	.late_init = dm_late_init,
 	.sw_init = dm_sw_init,
 	.sw_fini = dm_sw_fini,
+	.early_fini = amdgpu_dm_early_fini,
 	.hw_init = dm_hw_init,
 	.hw_fini = dm_hw_fini,
 	.suspend = dm_suspend,
diff --git a/drivers/gpu/drm/amd/include/amd_shared.h b/drivers/gpu/drm/amd/include/amd_shared.h
index 9676016..63bb846 100644
--- a/drivers/gpu/drm/amd/include/amd_shared.h
+++ b/drivers/gpu/drm/amd/include/amd_shared.h
@@ -239,6 +239,7 @@ enum amd_dpm_forced_level;
  * @late_init: sets up late driver/hw state (post hw_init) - Optional
  * @sw_init: sets up driver state, does not configure hw
  * @sw_fini: tears down driver state, does not configure hw
+ * @early_fini: tears down stuff before dev detached from driver
  * @hw_init: sets up the hw state
  * @hw_fini: tears down the hw state
  * @late_fini: final cleanup
@@ -267,6 +268,7 @@ struct amd_ip_funcs {
 	int (*late_init)(void *handle);
 	int (*sw_init)(void *handle);
 	int (*sw_fini)(void *handle);
+	int (*early_fini)(void *handle);
 	int (*hw_init)(void *handle);
 	int (*hw_fini)(void *handle);
 	void (*late_fini)(void *handle);
-- 
2.7.4

* [PATCH v4 07/14] drm/amdgpu: Register IOMMU topology notifier per device.
  2021-01-18 21:01 ` Andrey Grodzovsky
@ 2021-01-18 21:01   ` Andrey Grodzovsky
  -1 siblings, 0 replies; 196+ messages in thread
From: Andrey Grodzovsky @ 2021-01-18 21:01 UTC (permalink / raw)
  To: amd-gfx, dri-devel, ckoenig.leichtzumerken, daniel.vetter, robh,
	l.stach, yuq825, eric
  Cc: Alexander.Deucher, gregkh

Handle all DMA IOMMU group-related dependencies before the
group is removed.
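
A condensed sketch of the hook-up this patch performs (error handling
trimmed; the real callback body, which unpopulates TTM pages and frees
the IH rings and the GART dummy page, is in the diff below):

static int example_register_group_notifier(struct amdgpu_device *adev)
{
	adev->nb.notifier_call = amdgpu_iommu_group_notifier;

	/* Devices without an IOMMU group have nothing to watch. */
	if (!adev->dev->iommu_group)
		return 0;

	return iommu_group_register_notifier(adev->dev->iommu_group,
					     &adev->nb);
}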

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h        |  5 ++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 46 ++++++++++++++++++++++++++++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c   |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h   |  1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 10 +++++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.h |  2 ++
 6 files changed, 65 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 478a7d8..2953420 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -51,6 +51,7 @@
 #include <linux/dma-fence.h>
 #include <linux/pci.h>
 #include <linux/aer.h>
+#include <linux/notifier.h>
 
 #include <drm/ttm/ttm_bo_api.h>
 #include <drm/ttm/ttm_bo_driver.h>
@@ -1041,6 +1042,10 @@ struct amdgpu_device {
 
 	bool                            in_pci_err_recovery;
 	struct pci_saved_state          *pci_state;
+
+	struct notifier_block		nb;
+	struct blocking_notifier_head	notifier;
+	struct list_head		device_bo_list;
 };
 
 static inline struct amdgpu_device *drm_to_adev(struct drm_device *ddev)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 45e23e3..e99f4f1 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -70,6 +70,8 @@
 #include <drm/task_barrier.h>
 #include <linux/pm_runtime.h>
 
+#include <linux/iommu.h>
+
 MODULE_FIRMWARE("amdgpu/vega10_gpu_info.bin");
 MODULE_FIRMWARE("amdgpu/vega12_gpu_info.bin");
 MODULE_FIRMWARE("amdgpu/raven_gpu_info.bin");
@@ -3200,6 +3202,39 @@ static const struct attribute *amdgpu_dev_attributes[] = {
 };
 
 
+static int amdgpu_iommu_group_notifier(struct notifier_block *nb,
+				     unsigned long action, void *data)
+{
+	struct amdgpu_device *adev = container_of(nb, struct amdgpu_device, nb);
+	struct amdgpu_bo *bo = NULL;
+
+	/*
+	 * Following is a set of IOMMU group dependencies taken care of before
+	 * device's IOMMU group is removed
+	 */
+	if (action == IOMMU_GROUP_NOTIFY_DEL_DEVICE) {
+
+		spin_lock(&ttm_bo_glob.lru_lock);
+		list_for_each_entry(bo, &adev->device_bo_list, bo) {
+			if (bo->tbo.ttm)
+				ttm_tt_unpopulate(bo->tbo.bdev, bo->tbo.ttm);
+		}
+		spin_unlock(&ttm_bo_glob.lru_lock);
+
+		if (adev->irq.ih.use_bus_addr)
+			amdgpu_ih_ring_fini(adev, &adev->irq.ih);
+		if (adev->irq.ih1.use_bus_addr)
+			amdgpu_ih_ring_fini(adev, &adev->irq.ih1);
+		if (adev->irq.ih2.use_bus_addr)
+			amdgpu_ih_ring_fini(adev, &adev->irq.ih2);
+
+		amdgpu_gart_dummy_page_fini(adev);
+	}
+
+	return NOTIFY_OK;
+}
+
+
 /**
  * amdgpu_device_init - initialize the driver
  *
@@ -3304,6 +3339,8 @@ int amdgpu_device_init(struct amdgpu_device *adev,
 
 	INIT_WORK(&adev->xgmi_reset_work, amdgpu_device_xgmi_reset_func);
 
+	INIT_LIST_HEAD(&adev->device_bo_list);
+
 	adev->gfx.gfx_off_req_count = 1;
 	adev->pm.ac_power = power_supply_is_system_supplied() > 0;
 
@@ -3575,6 +3612,15 @@ int amdgpu_device_init(struct amdgpu_device *adev,
 	if (amdgpu_device_cache_pci_state(adev->pdev))
 		pci_restore_state(pdev);
 
+	BLOCKING_INIT_NOTIFIER_HEAD(&adev->notifier);
+	adev->nb.notifier_call = amdgpu_iommu_group_notifier;
+
+	if (adev->dev->iommu_group) {
+		r = iommu_group_register_notifier(adev->dev->iommu_group, &adev->nb);
+		if (r)
+			goto failed;
+	}
+
 	return 0;
 
 failed:
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
index 0db9330..486ad6d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
@@ -92,7 +92,7 @@ static int amdgpu_gart_dummy_page_init(struct amdgpu_device *adev)
  *
  * Frees the dummy page used by the driver (all asics).
  */
-static void amdgpu_gart_dummy_page_fini(struct amdgpu_device *adev)
+void amdgpu_gart_dummy_page_fini(struct amdgpu_device *adev)
 {
 	if (!adev->dummy_page_addr)
 		return;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h
index afa2e28..5678d9c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h
@@ -61,6 +61,7 @@ int amdgpu_gart_table_vram_pin(struct amdgpu_device *adev);
 void amdgpu_gart_table_vram_unpin(struct amdgpu_device *adev);
 int amdgpu_gart_init(struct amdgpu_device *adev);
 void amdgpu_gart_fini(struct amdgpu_device *adev);
+void amdgpu_gart_dummy_page_fini(struct amdgpu_device *adev);
 int amdgpu_gart_unbind(struct amdgpu_device *adev, uint64_t offset,
 		       int pages);
 int amdgpu_gart_map(struct amdgpu_device *adev, uint64_t offset,
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index 6cc9919..4a1de69 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -94,6 +94,10 @@ static void amdgpu_bo_destroy(struct ttm_buffer_object *tbo)
 	}
 	amdgpu_bo_unref(&bo->parent);
 
+	spin_lock(&ttm_bo_glob.lru_lock);
+	list_del(&bo->bo);
+	spin_unlock(&ttm_bo_glob.lru_lock);
+
 	kfree(bo->metadata);
 	kfree(bo);
 }
@@ -613,6 +617,12 @@ static int amdgpu_bo_do_create(struct amdgpu_device *adev,
 	if (bp->type == ttm_bo_type_device)
 		bo->flags &= ~AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED;
 
+	INIT_LIST_HEAD(&bo->bo);
+
+	spin_lock(&ttm_bo_glob.lru_lock);
+	list_add_tail(&bo->bo, &adev->device_bo_list);
+	spin_unlock(&ttm_bo_glob.lru_lock);
+
 	return 0;
 
 fail_unreserve:
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
index 9ac3756..5ae8555 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
@@ -110,6 +110,8 @@ struct amdgpu_bo {
 	struct list_head		shadow_list;
 
 	struct kgd_mem                  *kfd_bo;
+
+	struct list_head		bo;
 };
 
 static inline struct amdgpu_bo *ttm_to_amdgpu_bo(struct ttm_buffer_object *tbo)
-- 
2.7.4

* [PATCH v4 08/14] drm/amdgpu: Fix a bunch of sdma code crash post device unplug
  2021-01-18 21:01 ` Andrey Grodzovsky
@ 2021-01-18 21:01   ` Andrey Grodzovsky
  -1 siblings, 0 replies; 196+ messages in thread
From: Andrey Grodzovsky @ 2021-01-18 21:01 UTC (permalink / raw)
  To: amd-gfx, dri-devel, ckoenig.leichtzumerken, daniel.vetter, robh,
	l.stach, yuq825, eric
  Cc: Alexander.Deucher, gregkh

We can't allocate and submit IBs post device unplug.
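
The fix follows the drm_dev_enter()/drm_dev_exit() pattern; a minimal
sketch of the idea, simplified from the diff below:

static int example_vm_update(struct amdgpu_device *adev)
{
	int idx;

	/* Bail out before building SDMA IBs once the device is gone. */
	if (!drm_dev_enter(&adev->ddev, &idx))
		return -ENOENT;

	/* ... allocate and submit the page table update IB ... */

	drm_dev_exit(idx);
	return 0;
}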

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index ad91c0c..5096351 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -31,6 +31,7 @@
 #include <linux/dma-buf.h>
 
 #include <drm/amdgpu_drm.h>
+#include <drm/drm_drv.h>
 #include "amdgpu.h"
 #include "amdgpu_trace.h"
 #include "amdgpu_amdkfd.h"
@@ -1604,7 +1605,10 @@ static int amdgpu_vm_bo_update_mapping(struct amdgpu_device *adev,
 	struct amdgpu_vm_update_params params;
 	enum amdgpu_sync_mode sync_mode;
 	uint64_t pfn;
-	int r;
+	int r, idx;
+
+	if (!drm_dev_enter(&adev->ddev, &idx))
+		return -ENOENT;
 
 	memset(&params, 0, sizeof(params));
 	params.adev = adev;
@@ -1647,6 +1651,8 @@ static int amdgpu_vm_bo_update_mapping(struct amdgpu_device *adev,
 	if (r)
 		goto error_unlock;
 
+
+	drm_dev_exit(idx);
 	do {
 		uint64_t tmp, num_entries, addr;
 
-- 
2.7.4

* [PATCH v4 09/14] drm/amdgpu: Remap all page faults to per process dummy page.
  2021-01-18 21:01 ` Andrey Grodzovsky
@ 2021-01-18 21:01   ` Andrey Grodzovsky
  -1 siblings, 0 replies; 196+ messages in thread
From: Andrey Grodzovsky @ 2021-01-18 21:01 UTC (permalink / raw)
  To: amd-gfx, dri-devel, ckoenig.leichtzumerken, daniel.vetter, robh,
	l.stach, yuq825, eric
  Cc: Alexander.Deucher, gregkh

On device removal, reroute all CPU mappings to the dummy page
kept per drm_file instance or imported GEM object.

v4:
Update for modified ttm_bo_vm_dummy_page
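
The fault handler ends up split roughly as follows (a simplified
sketch; reservation handling is omitted, and ttm_bo_vm_dummy_page() is
the TTM helper added earlier in this series):

static vm_fault_t example_ttm_fault(struct vm_fault *vmf)
{
	struct ttm_buffer_object *bo = vmf->vma->vm_private_data;
	vm_fault_t ret;
	int idx;

	if (drm_dev_enter(bo->base.dev, &idx)) {
		/* Device is alive: take the normal TTM fault path. */
		ret = ttm_bo_vm_fault_reserved(vmf, vmf->vma->vm_page_prot,
					       TTM_BO_VM_NUM_PREFAULT, 1);
		drm_dev_exit(idx);
	} else {
		/* Device is gone: map the per-file dummy page instead. */
		ret = ttm_bo_vm_dummy_page(vmf, vmf->vma->vm_page_prot);
	}

	return ret;
}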

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 21 ++++++++++++++++-----
 1 file changed, 16 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index 9fd2157..550dc5e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -49,6 +49,7 @@
 
 #include <drm/drm_debugfs.h>
 #include <drm/amdgpu_drm.h>
+#include <drm/drm_drv.h>
 
 #include "amdgpu.h"
 #include "amdgpu_object.h"
@@ -1982,18 +1983,28 @@ void amdgpu_ttm_set_buffer_funcs_status(struct amdgpu_device *adev, bool enable)
 static vm_fault_t amdgpu_ttm_fault(struct vm_fault *vmf)
 {
 	struct ttm_buffer_object *bo = vmf->vma->vm_private_data;
+	struct drm_device *ddev = bo->base.dev;
 	vm_fault_t ret;
+	int idx;
 
 	ret = ttm_bo_vm_reserve(bo, vmf);
 	if (ret)
 		return ret;
 
-	ret = amdgpu_bo_fault_reserve_notify(bo);
-	if (ret)
-		goto unlock;
+	if (drm_dev_enter(ddev, &idx)) {
+		ret = amdgpu_bo_fault_reserve_notify(bo);
+		if (ret) {
+			drm_dev_exit(idx);
+			goto unlock;
+		}
 
-	ret = ttm_bo_vm_fault_reserved(vmf, vmf->vma->vm_page_prot,
-				       TTM_BO_VM_NUM_PREFAULT, 1);
+		 ret = ttm_bo_vm_fault_reserved(vmf, vmf->vma->vm_page_prot,
+						TTM_BO_VM_NUM_PREFAULT, 1);
+
+		 drm_dev_exit(idx);
+	} else {
+		ret = ttm_bo_vm_dummy_page(vmf, vmf->vma->vm_page_prot);
+	}
 	if (ret == VM_FAULT_RETRY && !(vmf->flags & FAULT_FLAG_RETRY_NOWAIT))
 		return ret;
 
-- 
2.7.4

* [PATCH v4 10/14] drm/amdgpu: Move some sysfs attrs creation to default_attr
  2021-01-18 21:01 ` Andrey Grodzovsky
@ 2021-01-18 21:01   ` Andrey Grodzovsky
  -1 siblings, 0 replies; 196+ messages in thread
From: Andrey Grodzovsky @ 2021-01-18 21:01 UTC (permalink / raw)
  To: amd-gfx, dri-devel, ckoenig.leichtzumerken, daniel.vetter, robh,
	l.stach, yuq825, eric
  Cc: Alexander.Deucher, gregkh

This allows removing the explicit creation and destruction
of those attrs, and thereby avoids warnings during device
finalization after physical device extraction.
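
For reference, the default-attribute pattern this switches to looks
like the sketch below: the driver core creates and removes the files
together with the device, so no explicit device_create_file() /
device_remove_file() calls are needed (the example_* names are
hypothetical):

static ssize_t example_stat_show(struct device *dev,
				 struct device_attribute *attr, char *buf)
{
	return sysfs_emit(buf, "0\n");
}
static DEVICE_ATTR_RO(example_stat);

static struct attribute *example_attrs[] = {
	&dev_attr_example_stat.attr,
	NULL
};

static const struct attribute_group example_attr_group = {
	.attrs = example_attrs
};

static const struct attribute_group *example_dev_groups[] = {
	&example_attr_group,
	NULL
};

static struct pci_driver example_pci_driver = {
	/* ... name/id_table/probe/remove ... */
	.driver.dev_groups = example_dev_groups,
};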

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c | 17 +++++++++--------
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c      | 13 +++++++++++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c  | 25 ++++++++++---------------
 drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 14 +++++---------
 4 files changed, 37 insertions(+), 32 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c
index 86add0f..0346e12 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c
@@ -1953,6 +1953,15 @@ static ssize_t amdgpu_atombios_get_vbios_version(struct device *dev,
 static DEVICE_ATTR(vbios_version, 0444, amdgpu_atombios_get_vbios_version,
 		   NULL);
 
+static struct attribute *amdgpu_vbios_version_attrs[] = {
+	&dev_attr_vbios_version.attr,
+	NULL
+};
+
+const struct attribute_group amdgpu_vbios_version_attr_group = {
+	.attrs = amdgpu_vbios_version_attrs
+};
+
 /**
  * amdgpu_atombios_fini - free the driver info and callbacks for atombios
  *
@@ -1972,7 +1981,6 @@ void amdgpu_atombios_fini(struct amdgpu_device *adev)
 	adev->mode_info.atom_context = NULL;
 	kfree(adev->mode_info.atom_card_info);
 	adev->mode_info.atom_card_info = NULL;
-	device_remove_file(adev->dev, &dev_attr_vbios_version);
 }
 
 /**
@@ -1989,7 +1997,6 @@ int amdgpu_atombios_init(struct amdgpu_device *adev)
 {
 	struct card_info *atom_card_info =
 	    kzalloc(sizeof(struct card_info), GFP_KERNEL);
-	int ret;
 
 	if (!atom_card_info)
 		return -ENOMEM;
@@ -2027,12 +2034,6 @@ int amdgpu_atombios_init(struct amdgpu_device *adev)
 		amdgpu_atombios_allocate_fb_scratch(adev);
 	}
 
-	ret = device_create_file(adev->dev, &dev_attr_vbios_version);
-	if (ret) {
-		DRM_ERROR("Failed to create device file for VBIOS version\n");
-		return ret;
-	}
-
 	return 0;
 }
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 9c0cd00..8fddd74 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -1587,6 +1587,18 @@ static struct pci_error_handlers amdgpu_pci_err_handler = {
 	.resume		= amdgpu_pci_resume,
 };
 
+extern const struct attribute_group amdgpu_vram_mgr_attr_group;
+extern const struct attribute_group amdgpu_gtt_mgr_attr_group;
+extern const struct attribute_group amdgpu_vbios_version_attr_group;
+
+static const struct attribute_group *amdgpu_sysfs_groups[] = {
+	&amdgpu_vram_mgr_attr_group,
+	&amdgpu_gtt_mgr_attr_group,
+	&amdgpu_vbios_version_attr_group,
+	NULL,
+};
+
+
 static struct pci_driver amdgpu_kms_pci_driver = {
 	.name = DRIVER_NAME,
 	.id_table = pciidlist,
@@ -1595,6 +1607,7 @@ static struct pci_driver amdgpu_kms_pci_driver = {
 	.shutdown = amdgpu_pci_shutdown,
 	.driver.pm = &amdgpu_pm_ops,
 	.err_handler = &amdgpu_pci_err_handler,
+	.driver.dev_groups = amdgpu_sysfs_groups,
 };
 
 static int __init amdgpu_init(void)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c
index 8980329..3b7150e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c
@@ -77,6 +77,16 @@ static DEVICE_ATTR(mem_info_gtt_total, S_IRUGO,
 static DEVICE_ATTR(mem_info_gtt_used, S_IRUGO,
 	           amdgpu_mem_info_gtt_used_show, NULL);
 
+static struct attribute *amdgpu_gtt_mgr_attributes[] = {
+	&dev_attr_mem_info_gtt_total.attr,
+	&dev_attr_mem_info_gtt_used.attr,
+	NULL
+};
+
+const struct attribute_group amdgpu_gtt_mgr_attr_group = {
+	.attrs = amdgpu_gtt_mgr_attributes
+};
+
 static const struct ttm_resource_manager_func amdgpu_gtt_mgr_func;
 /**
  * amdgpu_gtt_mgr_init - init GTT manager and DRM MM
@@ -91,7 +101,6 @@ int amdgpu_gtt_mgr_init(struct amdgpu_device *adev, uint64_t gtt_size)
 	struct amdgpu_gtt_mgr *mgr = &adev->mman.gtt_mgr;
 	struct ttm_resource_manager *man = &mgr->manager;
 	uint64_t start, size;
-	int ret;
 
 	man->use_tt = true;
 	man->func = &amdgpu_gtt_mgr_func;
@@ -104,17 +113,6 @@ int amdgpu_gtt_mgr_init(struct amdgpu_device *adev, uint64_t gtt_size)
 	spin_lock_init(&mgr->lock);
 	atomic64_set(&mgr->available, gtt_size >> PAGE_SHIFT);
 
-	ret = device_create_file(adev->dev, &dev_attr_mem_info_gtt_total);
-	if (ret) {
-		DRM_ERROR("Failed to create device file mem_info_gtt_total\n");
-		return ret;
-	}
-	ret = device_create_file(adev->dev, &dev_attr_mem_info_gtt_used);
-	if (ret) {
-		DRM_ERROR("Failed to create device file mem_info_gtt_used\n");
-		return ret;
-	}
-
 	ttm_set_driver_manager(&adev->mman.bdev, TTM_PL_TT, &mgr->manager);
 	ttm_resource_manager_set_used(man, true);
 	return 0;
@@ -144,9 +142,6 @@ void amdgpu_gtt_mgr_fini(struct amdgpu_device *adev)
 	drm_mm_takedown(&mgr->mm);
 	spin_unlock(&mgr->lock);
 
-	device_remove_file(adev->dev, &dev_attr_mem_info_gtt_total);
-	device_remove_file(adev->dev, &dev_attr_mem_info_gtt_used);
-
 	ttm_resource_manager_cleanup(man);
 	ttm_set_driver_manager(&adev->mman.bdev, TTM_PL_TT, NULL);
 }
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
index d2de2a7..9158d11 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
@@ -154,7 +154,7 @@ static DEVICE_ATTR(mem_info_vis_vram_used, S_IRUGO,
 static DEVICE_ATTR(mem_info_vram_vendor, S_IRUGO,
 		   amdgpu_mem_info_vram_vendor, NULL);
 
-static const struct attribute *amdgpu_vram_mgr_attributes[] = {
+static struct attribute *amdgpu_vram_mgr_attributes[] = {
 	&dev_attr_mem_info_vram_total.attr,
 	&dev_attr_mem_info_vis_vram_total.attr,
 	&dev_attr_mem_info_vram_used.attr,
@@ -163,6 +163,10 @@ static const struct attribute *amdgpu_vram_mgr_attributes[] = {
 	NULL
 };
 
+const struct attribute_group amdgpu_vram_mgr_attr_group = {
+	.attrs = amdgpu_vram_mgr_attributes
+};
+
 static const struct ttm_resource_manager_func amdgpu_vram_mgr_func;
 
 /**
@@ -176,7 +180,6 @@ int amdgpu_vram_mgr_init(struct amdgpu_device *adev)
 {
 	struct amdgpu_vram_mgr *mgr = &adev->mman.vram_mgr;
 	struct ttm_resource_manager *man = &mgr->manager;
-	int ret;
 
 	ttm_resource_manager_init(man, adev->gmc.real_vram_size >> PAGE_SHIFT);
 
@@ -187,11 +190,6 @@ int amdgpu_vram_mgr_init(struct amdgpu_device *adev)
 	INIT_LIST_HEAD(&mgr->reservations_pending);
 	INIT_LIST_HEAD(&mgr->reserved_pages);
 
-	/* Add the two VRAM-related sysfs files */
-	ret = sysfs_create_files(&adev->dev->kobj, amdgpu_vram_mgr_attributes);
-	if (ret)
-		DRM_ERROR("Failed to register sysfs\n");
-
 	ttm_set_driver_manager(&adev->mman.bdev, TTM_PL_VRAM, &mgr->manager);
 	ttm_resource_manager_set_used(man, true);
 	return 0;
@@ -229,8 +227,6 @@ void amdgpu_vram_mgr_fini(struct amdgpu_device *adev)
 	drm_mm_takedown(&mgr->mm);
 	spin_unlock(&mgr->lock);
 
-	sysfs_remove_files(&adev->dev->kobj, amdgpu_vram_mgr_attributes);
-
 	ttm_resource_manager_cleanup(man);
 	ttm_set_driver_manager(&adev->mman.bdev, TTM_PL_VRAM, NULL);
 }
-- 
2.7.4

* [PATCH v4 11/14] drm/amdgpu: Guard against write accesses after device removal
  2021-01-18 21:01 ` Andrey Grodzovsky
@ 2021-01-18 21:01   ` Andrey Grodzovsky
  -1 siblings, 0 replies; 196+ messages in thread
From: Andrey Grodzovsky @ 2021-01-18 21:01 UTC (permalink / raw)
  To: amd-gfx, dri-devel, ckoenig.leichtzumerken, daniel.vetter, robh,
	l.stach, yuq825, eric
  Cc: Alexander.Deucher, gregkh

This should prevent writes to memory or IO ranges that may already
have been reallocated for other uses after our device is removed.
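
Every MMIO, IO-port and doorbell write helper gets wrapped in the same
drm_dev_enter()/drm_dev_exit() section; the shape of the guard, in a
minimal sketch:

void example_mm_wreg(struct amdgpu_device *adev, u32 reg, u32 v)
{
	int idx;

	/* After drm_dev_unplug() this fails and the write is dropped,
	 * so we never touch a range that may have been reused. */
	if (!drm_dev_enter(&adev->ddev, &idx))
		return;

	writel(v, ((void __iomem *)adev->rmmio) + (reg * 4));

	drm_dev_exit(idx);
}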

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 57 ++++++++++++++++++++++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c    |  9 ++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c    | 53 +++++++++++++---------
 drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h    |  3 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c   | 70 ++++++++++++++++++++++++++++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h   | 49 ++-------------------
 drivers/gpu/drm/amd/amdgpu/psp_v11_0.c     | 16 ++-----
 drivers/gpu/drm/amd/amdgpu/psp_v12_0.c     |  8 +---
 drivers/gpu/drm/amd/amdgpu/psp_v3_1.c      |  8 +---
 9 files changed, 184 insertions(+), 89 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index e99f4f1..0a9d73c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -72,6 +72,8 @@
 
 #include <linux/iommu.h>
 
+#include <drm/drm_drv.h>
+
 MODULE_FIRMWARE("amdgpu/vega10_gpu_info.bin");
 MODULE_FIRMWARE("amdgpu/vega12_gpu_info.bin");
 MODULE_FIRMWARE("amdgpu/raven_gpu_info.bin");
@@ -404,13 +406,21 @@ uint8_t amdgpu_mm_rreg8(struct amdgpu_device *adev, uint32_t offset)
  */
 void amdgpu_mm_wreg8(struct amdgpu_device *adev, uint32_t offset, uint8_t value)
 {
+	int idx;
+
 	if (adev->in_pci_err_recovery)
 		return;
 
+
+	if (!drm_dev_enter(&adev->ddev, &idx))
+		return;
+
 	if (offset < adev->rmmio_size)
 		writeb(value, adev->rmmio + offset);
 	else
 		BUG();
+
+	drm_dev_exit(idx);
 }
 
 /**
@@ -427,9 +437,14 @@ void amdgpu_device_wreg(struct amdgpu_device *adev,
 			uint32_t reg, uint32_t v,
 			uint32_t acc_flags)
 {
+	int idx;
+
 	if (adev->in_pci_err_recovery)
 		return;
 
+	if (!drm_dev_enter(&adev->ddev, &idx))
+		return;
+
 	if ((reg * 4) < adev->rmmio_size) {
 		if (!(acc_flags & AMDGPU_REGS_NO_KIQ) &&
 		    amdgpu_sriov_runtime(adev) &&
@@ -444,6 +459,8 @@ void amdgpu_device_wreg(struct amdgpu_device *adev,
 	}
 
 	trace_amdgpu_device_wreg(adev->pdev->device, reg, v);
+
+	drm_dev_exit(idx);
 }
 
 /*
@@ -454,9 +471,14 @@ void amdgpu_device_wreg(struct amdgpu_device *adev,
 void amdgpu_mm_wreg_mmio_rlc(struct amdgpu_device *adev,
 			     uint32_t reg, uint32_t v)
 {
+	int idx;
+
 	if (adev->in_pci_err_recovery)
 		return;
 
+	if (!drm_dev_enter(&adev->ddev, &idx))
+		return;
+
 	if (amdgpu_sriov_fullaccess(adev) &&
 	    adev->gfx.rlc.funcs &&
 	    adev->gfx.rlc.funcs->is_rlcg_access_range) {
@@ -465,6 +487,8 @@ void amdgpu_mm_wreg_mmio_rlc(struct amdgpu_device *adev,
 	} else {
 		writel(v, ((void __iomem *)adev->rmmio) + (reg * 4));
 	}
+
+	drm_dev_exit(idx);
 }
 
 /**
@@ -499,15 +523,22 @@ u32 amdgpu_io_rreg(struct amdgpu_device *adev, u32 reg)
  */
 void amdgpu_io_wreg(struct amdgpu_device *adev, u32 reg, u32 v)
 {
+	int idx;
+
 	if (adev->in_pci_err_recovery)
 		return;
 
+	if (!drm_dev_enter(&adev->ddev, &idx))
+		return;
+
 	if ((reg * 4) < adev->rio_mem_size)
 		iowrite32(v, adev->rio_mem + (reg * 4));
 	else {
 		iowrite32((reg * 4), adev->rio_mem + (mmMM_INDEX * 4));
 		iowrite32(v, adev->rio_mem + (mmMM_DATA * 4));
 	}
+
+	drm_dev_exit(idx);
 }
 
 /**
@@ -544,14 +575,21 @@ u32 amdgpu_mm_rdoorbell(struct amdgpu_device *adev, u32 index)
  */
 void amdgpu_mm_wdoorbell(struct amdgpu_device *adev, u32 index, u32 v)
 {
+	int idx;
+
 	if (adev->in_pci_err_recovery)
 		return;
 
+	if (!drm_dev_enter(&adev->ddev, &idx))
+		return;
+
 	if (index < adev->doorbell.num_doorbells) {
 		writel(v, adev->doorbell.ptr + index);
 	} else {
 		DRM_ERROR("writing beyond doorbell aperture: 0x%08x!\n", index);
 	}
+
+	drm_dev_exit(idx);
 }
 
 /**
@@ -588,14 +626,21 @@ u64 amdgpu_mm_rdoorbell64(struct amdgpu_device *adev, u32 index)
  */
 void amdgpu_mm_wdoorbell64(struct amdgpu_device *adev, u32 index, u64 v)
 {
+	int idx;
+
 	if (adev->in_pci_err_recovery)
 		return;
 
+	if (!drm_dev_enter(&adev->ddev, &idx))
+		return;
+
 	if (index < adev->doorbell.num_doorbells) {
 		atomic64_set((atomic64_t *)(adev->doorbell.ptr + index), v);
 	} else {
 		DRM_ERROR("writing beyond doorbell aperture: 0x%08x!\n", index);
 	}
+
+	drm_dev_exit(idx);
 }
 
 /**
@@ -682,6 +727,10 @@ void amdgpu_device_indirect_wreg(struct amdgpu_device *adev,
 	unsigned long flags;
 	void __iomem *pcie_index_offset;
 	void __iomem *pcie_data_offset;
+	int idx;
+
+	if (!drm_dev_enter(&adev->ddev, &idx))
+		return;
 
 	spin_lock_irqsave(&adev->pcie_idx_lock, flags);
 	pcie_index_offset = (void __iomem *)adev->rmmio + pcie_index * 4;
@@ -692,6 +741,8 @@ void amdgpu_device_indirect_wreg(struct amdgpu_device *adev,
 	writel(reg_data, pcie_data_offset);
 	readl(pcie_data_offset);
 	spin_unlock_irqrestore(&adev->pcie_idx_lock, flags);
+
+	drm_dev_exit(idx);
 }
 
 /**
@@ -711,6 +762,10 @@ void amdgpu_device_indirect_wreg64(struct amdgpu_device *adev,
 	unsigned long flags;
 	void __iomem *pcie_index_offset;
 	void __iomem *pcie_data_offset;
+	int idx;
+
+	if (!drm_dev_enter(&adev->ddev, &idx))
+		return;
 
 	spin_lock_irqsave(&adev->pcie_idx_lock, flags);
 	pcie_index_offset = (void __iomem *)adev->rmmio + pcie_index * 4;
@@ -727,6 +782,8 @@ void amdgpu_device_indirect_wreg64(struct amdgpu_device *adev,
 	writel((u32)(reg_data >> 32), pcie_data_offset);
 	readl(pcie_data_offset);
 	spin_unlock_irqrestore(&adev->pcie_idx_lock, flags);
+
+	drm_dev_exit(idx);
 }
 
 /**
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
index fe1a39f..1beb4e6 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
@@ -31,6 +31,8 @@
 #include "amdgpu_ras.h"
 #include "amdgpu_xgmi.h"
 
+#include <drm/drm_drv.h>
+
 /**
  * amdgpu_gmc_get_pde_for_bo - get the PDE for a BO
  *
@@ -98,6 +100,10 @@ int amdgpu_gmc_set_pte_pde(struct amdgpu_device *adev, void *cpu_pt_addr,
 {
 	void __iomem *ptr = (void *)cpu_pt_addr;
 	uint64_t value;
+	int idx;
+
+	if (!drm_dev_enter(&adev->ddev, &idx))
+		return 0;
 
 	/*
 	 * The following is for PTE only. GART does not have PDEs.
@@ -105,6 +111,9 @@ int amdgpu_gmc_set_pte_pde(struct amdgpu_device *adev, void *cpu_pt_addr,
 	value = addr & 0x0000FFFFFFFFF000ULL;
 	value |= flags;
 	writeq(value, ptr + (gpu_page_idx * 8));
+
+	drm_dev_exit(idx);
+
 	return 0;
 }
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
index 523d22d..89e2bfe 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
@@ -37,6 +37,8 @@
 
 #include "amdgpu_ras.h"
 
+#include <drm/drm_drv.h>
+
 static int psp_sysfs_init(struct amdgpu_device *adev);
 static void psp_sysfs_fini(struct amdgpu_device *adev);
 
@@ -248,7 +250,7 @@ psp_cmd_submit_buf(struct psp_context *psp,
 		   struct psp_gfx_cmd_resp *cmd, uint64_t fence_mc_addr)
 {
 	int ret;
-	int index;
+	int index, idx;
 	int timeout = 2000;
 	bool ras_intr = false;
 	bool skip_unsupport = false;
@@ -256,6 +258,9 @@ psp_cmd_submit_buf(struct psp_context *psp,
 	if (psp->adev->in_pci_err_recovery)
 		return 0;
 
+	if (!drm_dev_enter(&psp->adev->ddev, &idx))
+		return 0;
+
 	mutex_lock(&psp->mutex);
 
 	memset(psp->cmd_buf_mem, 0, PSP_CMD_BUFFER_SIZE);
@@ -266,8 +271,7 @@ psp_cmd_submit_buf(struct psp_context *psp,
 	ret = psp_ring_cmd_submit(psp, psp->cmd_buf_mc_addr, fence_mc_addr, index);
 	if (ret) {
 		atomic_dec(&psp->fence_value);
-		mutex_unlock(&psp->mutex);
-		return ret;
+		goto exit;
 	}
 
 	amdgpu_asic_invalidate_hdp(psp->adev, NULL);
@@ -307,8 +311,8 @@ psp_cmd_submit_buf(struct psp_context *psp,
 			 psp->cmd_buf_mem->cmd_id,
 			 psp->cmd_buf_mem->resp.status);
 		if (!timeout) {
-			mutex_unlock(&psp->mutex);
-			return -EINVAL;
+			ret = -EINVAL;
+			goto exit;
 		}
 	}
 
@@ -316,8 +320,10 @@ psp_cmd_submit_buf(struct psp_context *psp,
 		ucode->tmr_mc_addr_lo = psp->cmd_buf_mem->resp.fw_addr_lo;
 		ucode->tmr_mc_addr_hi = psp->cmd_buf_mem->resp.fw_addr_hi;
 	}
-	mutex_unlock(&psp->mutex);
 
+exit:
+	mutex_unlock(&psp->mutex);
+	drm_dev_exit(idx);
 	return ret;
 }
 
@@ -354,8 +360,7 @@ static int psp_load_toc(struct psp_context *psp,
 	if (!cmd)
 		return -ENOMEM;
 	/* Copy toc to psp firmware private buffer */
-	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
-	memcpy(psp->fw_pri_buf, psp->toc_start_addr, psp->toc_bin_size);
+	psp_copy_fw(psp, psp->toc_start_addr, psp->toc_bin_size);
 
 	psp_prep_load_toc_cmd_buf(cmd, psp->fw_pri_mc_addr, psp->toc_bin_size);
 
@@ -570,8 +575,7 @@ static int psp_asd_load(struct psp_context *psp)
 	if (!cmd)
 		return -ENOMEM;
 
-	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
-	memcpy(psp->fw_pri_buf, psp->asd_start_addr, psp->asd_ucode_size);
+	psp_copy_fw(psp, psp->asd_start_addr, psp->asd_ucode_size);
 
 	psp_prep_asd_load_cmd_buf(cmd, psp->fw_pri_mc_addr,
 				  psp->asd_ucode_size);
@@ -726,8 +730,7 @@ static int psp_xgmi_load(struct psp_context *psp)
 	if (!cmd)
 		return -ENOMEM;
 
-	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
-	memcpy(psp->fw_pri_buf, psp->ta_xgmi_start_addr, psp->ta_xgmi_ucode_size);
+	psp_copy_fw(psp, psp->ta_xgmi_start_addr, psp->ta_xgmi_ucode_size);
 
 	psp_prep_ta_load_cmd_buf(cmd,
 				 psp->fw_pri_mc_addr,
@@ -982,8 +985,7 @@ static int psp_ras_load(struct psp_context *psp)
 	if (!cmd)
 		return -ENOMEM;
 
-	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
-	memcpy(psp->fw_pri_buf, psp->ta_ras_start_addr, psp->ta_ras_ucode_size);
+	psp_copy_fw(psp, psp->ta_ras_start_addr, psp->ta_ras_ucode_size);
 
 	psp_prep_ta_load_cmd_buf(cmd,
 				 psp->fw_pri_mc_addr,
@@ -1219,8 +1221,7 @@ static int psp_hdcp_load(struct psp_context *psp)
 	if (!cmd)
 		return -ENOMEM;
 
-	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
-	memcpy(psp->fw_pri_buf, psp->ta_hdcp_start_addr,
+	psp_copy_fw(psp, psp->ta_hdcp_start_addr,
 	       psp->ta_hdcp_ucode_size);
 
 	psp_prep_ta_load_cmd_buf(cmd,
@@ -1366,8 +1367,7 @@ static int psp_dtm_load(struct psp_context *psp)
 	if (!cmd)
 		return -ENOMEM;
 
-	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
-	memcpy(psp->fw_pri_buf, psp->ta_dtm_start_addr, psp->ta_dtm_ucode_size);
+	psp_copy_fw(psp, psp->ta_dtm_start_addr, psp->ta_dtm_ucode_size);
 
 	psp_prep_ta_load_cmd_buf(cmd,
 				 psp->fw_pri_mc_addr,
@@ -1507,8 +1507,7 @@ static int psp_rap_load(struct psp_context *psp)
 	if (!cmd)
 		return -ENOMEM;
 
-	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
-	memcpy(psp->fw_pri_buf, psp->ta_rap_start_addr, psp->ta_rap_ucode_size);
+	psp_copy_fw(psp, psp->ta_rap_start_addr, psp->ta_rap_ucode_size);
 
 	psp_prep_ta_load_cmd_buf(cmd,
 				 psp->fw_pri_mc_addr,
@@ -2778,6 +2777,20 @@ static ssize_t psp_usbc_pd_fw_sysfs_write(struct device *dev,
 	return count;
 }
 
+void psp_copy_fw(struct psp_context *psp, uint8_t *start_addr, uint32_t bin_size)
+{
+	int idx;
+
+	if (!drm_dev_enter(&psp->adev->ddev, &idx))
+		return;
+
+	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
+	memcpy(psp->fw_pri_buf, start_addr, bin_size);
+
+	drm_dev_exit(idx);
+}
+
+
 static DEVICE_ATTR(usbc_pd_fw, S_IRUGO | S_IWUSR,
 		   psp_usbc_pd_fw_sysfs_read,
 		   psp_usbc_pd_fw_sysfs_write);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
index da250bc..ac69314 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
@@ -400,4 +400,7 @@ int psp_init_ta_microcode(struct psp_context *psp,
 			  const char *chip_name);
 int psp_get_fw_attestation_records_addr(struct psp_context *psp,
 					uint64_t *output_ptr);
+
+void psp_copy_fw(struct psp_context *psp, uint8_t *start_addr, uint32_t bin_size);
+
 #endif
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
index 1a612f5..d656494 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
@@ -35,6 +35,8 @@
 #include "amdgpu.h"
 #include "atom.h"
 
+#include <drm/drm_drv.h>
+
 /*
  * Rings
  * Most engines on the GPU are fed via ring buffers.  Ring
@@ -463,3 +465,71 @@ int amdgpu_ring_test_helper(struct amdgpu_ring *ring)
 	ring->sched.ready = !r;
 	return r;
 }
+
+void amdgpu_ring_clear_ring(struct amdgpu_ring *ring)
+{
+	int idx;
+	int i = 0;
+
+	if (!drm_dev_enter(&ring->adev->ddev, &idx))
+		return;
+
+	while (i <= ring->buf_mask)
+		ring->ring[i++] = ring->funcs->nop;
+
+	drm_dev_exit(idx);
+
+}
+
+void amdgpu_ring_write(struct amdgpu_ring *ring, uint32_t v)
+{
+	int idx;
+
+	if (!drm_dev_enter(&ring->adev->ddev, &idx))
+		return;
+
+	if (ring->count_dw <= 0)
+		DRM_ERROR("amdgpu: writing more dwords to the ring than expected!\n");
+	ring->ring[ring->wptr++ & ring->buf_mask] = v;
+	ring->wptr &= ring->ptr_mask;
+	ring->count_dw--;
+
+	drm_dev_exit(idx);
+}
+
+void amdgpu_ring_write_multiple(struct amdgpu_ring *ring,
+					      void *src, int count_dw)
+{
+	unsigned occupied, chunk1, chunk2;
+	void *dst;
+	int idx;
+
+	if (!drm_dev_enter(&ring->adev->ddev, &idx))
+		return;
+
+	if (unlikely(ring->count_dw < count_dw))
+		DRM_ERROR("amdgpu: writing more dwords to the ring than expected!\n");
+
+	occupied = ring->wptr & ring->buf_mask;
+	dst = (void *)&ring->ring[occupied];
+	chunk1 = ring->buf_mask + 1 - occupied;
+	chunk1 = (chunk1 >= count_dw) ? count_dw: chunk1;
+	chunk2 = count_dw - chunk1;
+	chunk1 <<= 2;
+	chunk2 <<= 2;
+
+	if (chunk1)
+		memcpy(dst, src, chunk1);
+
+	if (chunk2) {
+		src += chunk1;
+		dst = (void *)ring->ring;
+		memcpy(dst, src, chunk2);
+	}
+
+	ring->wptr += count_dw;
+	ring->wptr &= ring->ptr_mask;
+	ring->count_dw -= count_dw;
+
+	drm_dev_exit(idx);
+}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
index accb243..f90b81f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
@@ -300,53 +300,12 @@ static inline void amdgpu_ring_set_preempt_cond_exec(struct amdgpu_ring *ring,
 	*ring->cond_exe_cpu_addr = cond_exec;
 }
 
-static inline void amdgpu_ring_clear_ring(struct amdgpu_ring *ring)
-{
-	int i = 0;
-	while (i <= ring->buf_mask)
-		ring->ring[i++] = ring->funcs->nop;
-
-}
-
-static inline void amdgpu_ring_write(struct amdgpu_ring *ring, uint32_t v)
-{
-	if (ring->count_dw <= 0)
-		DRM_ERROR("amdgpu: writing more dwords to the ring than expected!\n");
-	ring->ring[ring->wptr++ & ring->buf_mask] = v;
-	ring->wptr &= ring->ptr_mask;
-	ring->count_dw--;
-}
+void amdgpu_ring_clear_ring(struct amdgpu_ring *ring);
 
-static inline void amdgpu_ring_write_multiple(struct amdgpu_ring *ring,
-					      void *src, int count_dw)
-{
-	unsigned occupied, chunk1, chunk2;
-	void *dst;
-
-	if (unlikely(ring->count_dw < count_dw))
-		DRM_ERROR("amdgpu: writing more dwords to the ring than expected!\n");
-
-	occupied = ring->wptr & ring->buf_mask;
-	dst = (void *)&ring->ring[occupied];
-	chunk1 = ring->buf_mask + 1 - occupied;
-	chunk1 = (chunk1 >= count_dw) ? count_dw: chunk1;
-	chunk2 = count_dw - chunk1;
-	chunk1 <<= 2;
-	chunk2 <<= 2;
+void amdgpu_ring_write(struct amdgpu_ring *ring, uint32_t v);
 
-	if (chunk1)
-		memcpy(dst, src, chunk1);
-
-	if (chunk2) {
-		src += chunk1;
-		dst = (void *)ring->ring;
-		memcpy(dst, src, chunk2);
-	}
-
-	ring->wptr += count_dw;
-	ring->wptr &= ring->ptr_mask;
-	ring->count_dw -= count_dw;
-}
+void amdgpu_ring_write_multiple(struct amdgpu_ring *ring,
+					      void *src, int count_dw);
 
 int amdgpu_ring_test_helper(struct amdgpu_ring *ring);
 
diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c b/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
index bd4248c..b3ce5be 100644
--- a/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
@@ -269,10 +269,8 @@ static int psp_v11_0_bootloader_load_kdb(struct psp_context *psp)
 	if (ret)
 		return ret;
 
-	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
-
 	/* Copy PSP KDB binary to memory */
-	memcpy(psp->fw_pri_buf, psp->kdb_start_addr, psp->kdb_bin_size);
+	psp_copy_fw(psp, psp->kdb_start_addr, psp->kdb_bin_size);
 
 	/* Provide the PSP KDB to bootloader */
 	WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
@@ -302,10 +300,8 @@ static int psp_v11_0_bootloader_load_spl(struct psp_context *psp)
 	if (ret)
 		return ret;
 
-	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
-
 	/* Copy PSP SPL binary to memory */
-	memcpy(psp->fw_pri_buf, psp->spl_start_addr, psp->spl_bin_size);
+	psp_copy_fw(psp, psp->spl_start_addr, psp->spl_bin_size);
 
 	/* Provide the PSP SPL to bootloader */
 	WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
@@ -335,10 +331,8 @@ static int psp_v11_0_bootloader_load_sysdrv(struct psp_context *psp)
 	if (ret)
 		return ret;
 
-	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
-
 	/* Copy PSP System Driver binary to memory */
-	memcpy(psp->fw_pri_buf, psp->sys_start_addr, psp->sys_bin_size);
+	psp_copy_fw(psp, psp->sys_start_addr, psp->sys_bin_size);
 
 	/* Provide the sys driver to bootloader */
 	WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
@@ -371,10 +365,8 @@ static int psp_v11_0_bootloader_load_sos(struct psp_context *psp)
 	if (ret)
 		return ret;
 
-	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
-
 	/* Copy Secure OS binary to PSP memory */
-	memcpy(psp->fw_pri_buf, psp->sos_start_addr, psp->sos_bin_size);
+	psp_copy_fw(psp, psp->sos_start_addr, psp->sos_bin_size);
 
 	/* Provide the PSP secure OS to bootloader */
 	WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c b/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
index c4828bd..618e5b6 100644
--- a/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
@@ -138,10 +138,8 @@ static int psp_v12_0_bootloader_load_sysdrv(struct psp_context *psp)
 	if (ret)
 		return ret;
 
-	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
-
 	/* Copy PSP System Driver binary to memory */
-	memcpy(psp->fw_pri_buf, psp->sys_start_addr, psp->sys_bin_size);
+	psp_copy_fw(psp, psp->sys_start_addr, psp->sys_bin_size);
 
 	/* Provide the sys driver to bootloader */
 	WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
@@ -179,10 +177,8 @@ static int psp_v12_0_bootloader_load_sos(struct psp_context *psp)
 	if (ret)
 		return ret;
 
-	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
-
 	/* Copy Secure OS binary to PSP memory */
-	memcpy(psp->fw_pri_buf, psp->sos_start_addr, psp->sos_bin_size);
+	psp_copy_fw(psp, psp->sos_start_addr, psp->sos_bin_size);
 
 	/* Provide the PSP secure OS to bootloader */
 	WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c b/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
index f2e725f..d0a6cccd 100644
--- a/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
+++ b/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
@@ -102,10 +102,8 @@ static int psp_v3_1_bootloader_load_sysdrv(struct psp_context *psp)
 	if (ret)
 		return ret;
 
-	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
-
 	/* Copy PSP System Driver binary to memory */
-	memcpy(psp->fw_pri_buf, psp->sys_start_addr, psp->sys_bin_size);
+	psp_copy_fw(psp, psp->sys_start_addr, psp->sys_bin_size);
 
 	/* Provide the sys driver to bootloader */
 	WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
@@ -143,10 +141,8 @@ static int psp_v3_1_bootloader_load_sos(struct psp_context *psp)
 	if (ret)
 		return ret;
 
-	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
-
 	/* Copy Secure OS binary to PSP memory */
-	memcpy(psp->fw_pri_buf, psp->sos_start_addr, psp->sos_bin_size);
+	psp_copy_fw(psp, psp->sos_start_addr, psp->sos_bin_size);
 
 	/* Provide the PSP secure OS to bootloader */
 	WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
-- 
2.7.4

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 196+ messages in thread
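
The change above applies one pattern throughout: every hardware access
is wrapped in a drm_dev_enter()/drm_dev_exit() SRCU read-side section,
so that once drm_dev_unplug() has run from the PCI remove callback,
drm_dev_enter() returns false and the access is skipped instead of
writing to an MMIO range that may already have been reassigned. A
minimal sketch of the pattern (my_dev and its members are hypothetical,
not taken from the patch):

	#include <drm/drm_drv.h>
	#include <linux/io.h>

	struct my_dev {
		struct drm_device drm;
		void __iomem *mmio;	/* register BAR mapping */
	};

	static void my_dev_wreg(struct my_dev *mdev, u32 reg, u32 val)
	{
		int idx;

		/* Fails once drm_dev_unplug() has been called. */
		if (!drm_dev_enter(&mdev->drm, &idx))
			return;

		writel(val, mdev->mmio + (reg * 4));

		drm_dev_exit(idx);
	}

drm_dev_unplug() synchronizes against all in-flight enter/exit
sections before the remove path proceeds, which is what makes the
early return safe rather than racy.

As a worked example for the ring helper moved out of line above: with
buf_mask = 1023, wptr = 1020 and count_dw = 10,
amdgpu_ring_write_multiple() computes occupied = 1020,
chunk1 = min(1024 - 1020, 10) = 4 dwords (16 bytes) copied up to the
end of the ring, and chunk2 = 6 dwords (24 bytes) copied from the
start of the ring -- the shifts by 2 convert dword counts into byte
counts for memcpy().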

* [PATCH v4 12/14] drm/scheduler: Job timeout handler returns status
  2021-01-18 21:01 ` Andrey Grodzovsky
@ 2021-01-18 21:01   ` Andrey Grodzovsky
  0 siblings, 0 replies; 196+ messages in thread
From: Andrey Grodzovsky @ 2021-01-18 21:01 UTC (permalink / raw)
  To: amd-gfx, dri-devel, ckoenig.leichtzumerken, daniel.vetter, robh,
	l.stach, yuq825, eric
  Cc: Tomeu Vizoso, gregkh, Steven Price, Luben Tuikov,
	Alyssa Rosenzweig, Russell King, Alexander.Deucher,
	Christian König

From: Luben Tuikov <luben.tuikov@amd.com>

This patch does not change current behaviour.

The driver's job timeout handler now returns
status indicating back to the DRM layer whether
the task (job) was successfully aborted or whether
more time should be given to the task to complete.

Default behaviour, as of this patch, is preserved,
except in the obvious-by-comment case in the Panfrost
driver, as documented below.

All drivers which make use of the
drm_sched_backend_ops' .timedout_job() callback
have been updated accordingly and now return the
would-have-been default value of
DRM_TASK_STATUS_ALIVE to restart the task's
timeout timer--this is the old behaviour, and
is preserved by this patch.

In the case of the Panfrost driver, its timedout
callback correctly first checks if the job had
completed in due time and if so, it now returns
DRM_TASK_STATUS_COMPLETE to notify the DRM layer
that the task can be moved to the done list, to be
freed later. In the other two subsequent checks,
the value of DRM_TASK_STATUS_ALIVE is returned, as
per the default behaviour.

More involved driver solutions can be had
in subsequent patches.

v2: Use an enum as the status of a driver's job
    timeout callback method.

v4: (By Andrey Grodzovsky)
Replace DRM_TASK_STATUS_COMPLETE with DRM_TASK_STATUS_ENODEV
to enable a hint to the scheduler for when NOT to rearm the
timeout timer.

Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Lucas Stach <l.stach@pengutronix.de>
Cc: Russell King <linux+etnaviv@armlinux.org.uk>
Cc: Christian Gmeiner <christian.gmeiner@gmail.com>
Cc: Qiang Yu <yuq825@gmail.com>
Cc: Rob Herring <robh@kernel.org>
Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Cc: Steven Price <steven.price@arm.com>
Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Cc: Eric Anholt <eric@anholt.net>
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.c |  6 ++++--
 drivers/gpu/drm/etnaviv/etnaviv_sched.c | 10 +++++++++-
 drivers/gpu/drm/lima/lima_sched.c       |  4 +++-
 drivers/gpu/drm/panfrost/panfrost_job.c |  9 ++++++---
 drivers/gpu/drm/scheduler/sched_main.c  |  4 +---
 drivers/gpu/drm/v3d/v3d_sched.c         | 32 +++++++++++++++++---------------
 include/drm/gpu_scheduler.h             | 17 ++++++++++++++---
 7 files changed, 54 insertions(+), 28 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
index ff48101..a111326 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
@@ -28,7 +28,7 @@
 #include "amdgpu.h"
 #include "amdgpu_trace.h"
 
-static void amdgpu_job_timedout(struct drm_sched_job *s_job)
+static enum drm_task_status amdgpu_job_timedout(struct drm_sched_job *s_job)
 {
 	struct amdgpu_ring *ring = to_amdgpu_ring(s_job->sched);
 	struct amdgpu_job *job = to_amdgpu_job(s_job);
@@ -41,7 +41,7 @@ static void amdgpu_job_timedout(struct drm_sched_job *s_job)
 	    amdgpu_ring_soft_recovery(ring, job->vmid, s_job->s_fence->parent)) {
 		DRM_ERROR("ring %s timeout, but soft recovered\n",
 			  s_job->sched->name);
-		return;
+		return DRM_TASK_STATUS_ALIVE;
 	}
 
 	amdgpu_vm_get_task_info(ring->adev, job->pasid, &ti);
@@ -53,10 +53,12 @@ static void amdgpu_job_timedout(struct drm_sched_job *s_job)
 
 	if (amdgpu_device_should_recover_gpu(ring->adev)) {
 		amdgpu_device_gpu_recover(ring->adev, job);
+		return DRM_TASK_STATUS_ALIVE;
 	} else {
 		drm_sched_suspend_timeout(&ring->sched);
 		if (amdgpu_sriov_vf(adev))
 			adev->virt.tdr_debug = true;
+		return DRM_TASK_STATUS_ALIVE;
 	}
 }
 
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_sched.c b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
index cd46c88..c495169 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_sched.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
@@ -82,7 +82,8 @@ static struct dma_fence *etnaviv_sched_run_job(struct drm_sched_job *sched_job)
 	return fence;
 }
 
-static void etnaviv_sched_timedout_job(struct drm_sched_job *sched_job)
+static enum drm_task_status etnaviv_sched_timedout_job(struct drm_sched_job
+						       *sched_job)
 {
 	struct etnaviv_gem_submit *submit = to_etnaviv_submit(sched_job);
 	struct etnaviv_gpu *gpu = submit->gpu;
@@ -120,9 +121,16 @@ static void etnaviv_sched_timedout_job(struct drm_sched_job *sched_job)
 
 	drm_sched_resubmit_jobs(&gpu->sched);
 
+	/* Tell the DRM scheduler that this task needs
+	 * more time.
+	 */
+	drm_sched_start(&gpu->sched, true);
+	return DRM_TASK_STATUS_ALIVE;
+
 out_no_timeout:
 	/* restart scheduler after GPU is usable again */
 	drm_sched_start(&gpu->sched, true);
+	return DRM_TASK_STATUS_ALIVE;
 }
 
 static void etnaviv_sched_free_job(struct drm_sched_job *sched_job)
diff --git a/drivers/gpu/drm/lima/lima_sched.c b/drivers/gpu/drm/lima/lima_sched.c
index 63b4c56..66d9236 100644
--- a/drivers/gpu/drm/lima/lima_sched.c
+++ b/drivers/gpu/drm/lima/lima_sched.c
@@ -415,7 +415,7 @@ static void lima_sched_build_error_task_list(struct lima_sched_task *task)
 	mutex_unlock(&dev->error_task_list_lock);
 }
 
-static void lima_sched_timedout_job(struct drm_sched_job *job)
+static enum drm_task_status lima_sched_timedout_job(struct drm_sched_job *job)
 {
 	struct lima_sched_pipe *pipe = to_lima_pipe(job->sched);
 	struct lima_sched_task *task = to_lima_task(job);
@@ -449,6 +449,8 @@ static void lima_sched_timedout_job(struct drm_sched_job *job)
 
 	drm_sched_resubmit_jobs(&pipe->base);
 	drm_sched_start(&pipe->base, true);
+
+	return DRM_TASK_STATUS_ALIVE;
 }
 
 static void lima_sched_free_job(struct drm_sched_job *job)
diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
index 04e6f6f..10d41ac 100644
--- a/drivers/gpu/drm/panfrost/panfrost_job.c
+++ b/drivers/gpu/drm/panfrost/panfrost_job.c
@@ -432,7 +432,8 @@ static void panfrost_scheduler_start(struct panfrost_queue_state *queue)
 	mutex_unlock(&queue->lock);
 }
 
-static void panfrost_job_timedout(struct drm_sched_job *sched_job)
+static enum drm_task_status panfrost_job_timedout(struct drm_sched_job
+						  *sched_job)
 {
 	struct panfrost_job *job = to_panfrost_job(sched_job);
 	struct panfrost_device *pfdev = job->pfdev;
@@ -443,7 +444,7 @@ static void panfrost_job_timedout(struct drm_sched_job *sched_job)
 	 * spurious. Bail out.
 	 */
 	if (dma_fence_is_signaled(job->done_fence))
-		return;
+		return DRM_TASK_STATUS_ALIVE;
 
 	dev_err(pfdev->dev, "gpu sched timeout, js=%d, config=0x%x, status=0x%x, head=0x%x, tail=0x%x, sched_job=%p",
 		js,
@@ -455,11 +456,13 @@ static void panfrost_job_timedout(struct drm_sched_job *sched_job)
 
 	/* Scheduler is already stopped, nothing to do. */
 	if (!panfrost_scheduler_stop(&pfdev->js->queue[js], sched_job))
-		return;
+		return DRM_TASK_STATUS_ALIVE;
 
 	/* Schedule a reset if there's no reset in progress. */
 	if (!atomic_xchg(&pfdev->reset.pending, 1))
 		schedule_work(&pfdev->reset.work);
+
+	return DRM_TASK_STATUS_ALIVE;
 }
 
 static const struct drm_sched_backend_ops panfrost_sched_ops = {
diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index 92637b7..73fccc5 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -527,7 +527,7 @@ void drm_sched_start(struct drm_gpu_scheduler *sched, bool full_recovery)
 EXPORT_SYMBOL(drm_sched_start);
 
 /**
- * drm_sched_resubmit_jobs - helper to relunch job from pending ring list
+ * drm_sched_resubmit_jobs - helper to relaunch jobs from the pending list
  *
  * @sched: scheduler instance
  *
@@ -561,8 +561,6 @@ void drm_sched_resubmit_jobs(struct drm_gpu_scheduler *sched)
 		} else {
 			s_job->s_fence->parent = fence;
 		}
-
-
 	}
 }
 EXPORT_SYMBOL(drm_sched_resubmit_jobs);
diff --git a/drivers/gpu/drm/v3d/v3d_sched.c b/drivers/gpu/drm/v3d/v3d_sched.c
index 452682e..3740665e 100644
--- a/drivers/gpu/drm/v3d/v3d_sched.c
+++ b/drivers/gpu/drm/v3d/v3d_sched.c
@@ -259,7 +259,7 @@ v3d_cache_clean_job_run(struct drm_sched_job *sched_job)
 	return NULL;
 }
 
-static void
+static enum drm_task_status
 v3d_gpu_reset_for_timeout(struct v3d_dev *v3d, struct drm_sched_job *sched_job)
 {
 	enum v3d_queue q;
@@ -285,6 +285,8 @@ v3d_gpu_reset_for_timeout(struct v3d_dev *v3d, struct drm_sched_job *sched_job)
 	}
 
 	mutex_unlock(&v3d->reset_lock);
+
+	return DRM_TASK_STATUS_ALIVE;
 }
 
 /* If the current address or return address have changed, then the GPU
@@ -292,7 +294,7 @@ v3d_gpu_reset_for_timeout(struct v3d_dev *v3d, struct drm_sched_job *sched_job)
  * could fail if the GPU got in an infinite loop in the CL, but that
  * is pretty unlikely outside of an i-g-t testcase.
  */
-static void
+static enum drm_task_status
 v3d_cl_job_timedout(struct drm_sched_job *sched_job, enum v3d_queue q,
 		    u32 *timedout_ctca, u32 *timedout_ctra)
 {
@@ -304,39 +306,39 @@ v3d_cl_job_timedout(struct drm_sched_job *sched_job, enum v3d_queue q,
 	if (*timedout_ctca != ctca || *timedout_ctra != ctra) {
 		*timedout_ctca = ctca;
 		*timedout_ctra = ctra;
-		return;
+		return DRM_TASK_STATUS_ALIVE;
 	}
 
-	v3d_gpu_reset_for_timeout(v3d, sched_job);
+	return v3d_gpu_reset_for_timeout(v3d, sched_job);
 }
 
-static void
+static enum drm_task_status
 v3d_bin_job_timedout(struct drm_sched_job *sched_job)
 {
 	struct v3d_bin_job *job = to_bin_job(sched_job);
 
-	v3d_cl_job_timedout(sched_job, V3D_BIN,
-			    &job->timedout_ctca, &job->timedout_ctra);
+	return v3d_cl_job_timedout(sched_job, V3D_BIN,
+				   &job->timedout_ctca, &job->timedout_ctra);
 }
 
-static void
+static enum drm_task_status
 v3d_render_job_timedout(struct drm_sched_job *sched_job)
 {
 	struct v3d_render_job *job = to_render_job(sched_job);
 
-	v3d_cl_job_timedout(sched_job, V3D_RENDER,
-			    &job->timedout_ctca, &job->timedout_ctra);
+	return v3d_cl_job_timedout(sched_job, V3D_RENDER,
+				   &job->timedout_ctca, &job->timedout_ctra);
 }
 
-static void
+static enum drm_task_status
 v3d_generic_job_timedout(struct drm_sched_job *sched_job)
 {
 	struct v3d_job *job = to_v3d_job(sched_job);
 
-	v3d_gpu_reset_for_timeout(job->v3d, sched_job);
+	return v3d_gpu_reset_for_timeout(job->v3d, sched_job);
 }
 
-static void
+static enum drm_task_status
 v3d_csd_job_timedout(struct drm_sched_job *sched_job)
 {
 	struct v3d_csd_job *job = to_csd_job(sched_job);
@@ -348,10 +350,10 @@ v3d_csd_job_timedout(struct drm_sched_job *sched_job)
 	 */
 	if (job->timedout_batches != batches) {
 		job->timedout_batches = batches;
-		return;
+		return DRM_TASK_STATUS_ALIVE;
 	}
 
-	v3d_gpu_reset_for_timeout(v3d, sched_job);
+	return v3d_gpu_reset_for_timeout(v3d, sched_job);
 }
 
 static const struct drm_sched_backend_ops v3d_bin_sched_ops = {
diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
index 975e8a6..3ba36bc 100644
--- a/include/drm/gpu_scheduler.h
+++ b/include/drm/gpu_scheduler.h
@@ -206,6 +206,11 @@ static inline bool drm_sched_invalidate_job(struct drm_sched_job *s_job,
 	return s_job && atomic_inc_return(&s_job->karma) > threshold;
 }
 
+enum drm_task_status {
+	DRM_TASK_STATUS_ENODEV,
+	DRM_TASK_STATUS_ALIVE
+};
+
 /**
  * struct drm_sched_backend_ops
  *
@@ -230,10 +235,16 @@ struct drm_sched_backend_ops {
 	struct dma_fence *(*run_job)(struct drm_sched_job *sched_job);
 
 	/**
-         * @timedout_job: Called when a job has taken too long to execute,
-         * to trigger GPU recovery.
+	 * @timedout_job: Called when a job has taken too long to execute,
+	 * to trigger GPU recovery.
+	 *
+	 * Return DRM_TASK_STATUS_ALIVE, if the task (job) is healthy
+	 * and executing in the hardware, i.e. it needs more time.
+	 *
+	 * Return DRM_TASK_STATUS_ENODEV, if the task (job) has
+	 * been aborted.
 	 */
-	void (*timedout_job)(struct drm_sched_job *sched_job);
+	enum drm_task_status (*timedout_job)(struct drm_sched_job *sched_job);
 
 	/**
          * @free_job: Called once the job's finished fence has been signaled
-- 
2.7.4

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 196+ messages in thread
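
To show the new contract from a driver's perspective, here is a hedged
sketch of a .timedout_job() handler under the v4 semantics (the foo_*
names are hypothetical, not from the series): DRM_TASK_STATUS_ALIVE
tells the scheduler the task may still complete and the timeout timer
should be rearmed, while DRM_TASK_STATUS_ENODEV signals that the job
was aborted (e.g. the device is gone) and the timer must not be
rearmed:

	#include <drm/gpu_scheduler.h>

	static enum drm_task_status
	foo_timedout_job(struct drm_sched_job *sched_job)
	{
		struct foo_dev *fdev = to_foo_dev(sched_job->sched);

		/* Device unplugged: the job cannot make progress, so
		 * hint the scheduler not to rearm the timeout timer.
		 */
		if (foo_dev_unplugged(fdev))
			return DRM_TASK_STATUS_ENODEV;

		/* Otherwise attempt recovery and keep the task alive,
		 * matching the pre-patch default behaviour.
		 */
		foo_reset_hw(fdev);
		return DRM_TASK_STATUS_ALIVE;
	}

	static const struct drm_sched_backend_ops foo_sched_ops = {
		/* .run_job, .free_job etc. as in any scheduler driver */
		.timedout_job = foo_timedout_job,
	};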

-         * to trigger GPU recovery.
+	 * @timedout_job: Called when a job has taken too long to execute,
+	 * to trigger GPU recovery.
+	 *
+	 * Return DRM_TASK_STATUS_ALIVE, if the task (job) is healthy
+	 * and executing in the hardware, i.e. it needs more time.
+	 *
+	 * Return DRM_TASK_STATUS_ENODEV, if the task (job) has
+	 * been aborted.
 	 */
-	void (*timedout_job)(struct drm_sched_job *sched_job);
+	enum drm_task_status (*timedout_job)(struct drm_sched_job *sched_job);
 
 	/**
          * @free_job: Called once the job's finished fence has been signaled
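
A minimal sketch of how a new driver would wire up the reworked hook
(hypothetical my_* names; my_run_job and my_free_job are assumed, not
part of this series):

	static enum drm_task_status my_timedout_job(struct drm_sched_job *sched_job)
	{
		/* Nothing was aborted; ask the scheduler to rearm the timer. */
		return DRM_TASK_STATUS_ALIVE;
	}

	static const struct drm_sched_backend_ops my_sched_ops = {
		.run_job	= my_run_job,
		.timedout_job	= my_timedout_job,
		.free_job	= my_free_job,
	};
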
-- 
2.7.4

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply related	[flat|nested] 196+ messages in thread

* [PATCH v4 13/14] drm/sched: Make timeout timer rearm conditional.
  2021-01-18 21:01 ` Andrey Grodzovsky
@ 2021-01-18 21:01   ` Andrey Grodzovsky
  -1 siblings, 0 replies; 196+ messages in thread
From: Andrey Grodzovsky @ 2021-01-18 21:01 UTC (permalink / raw)
  To: amd-gfx, dri-devel, ckoenig.leichtzumerken, daniel.vetter, robh,
	l.stach, yuq825, eric
  Cc: Alexander.Deucher, gregkh

We don't want to rearm the timer if the driver hook reports
that the device is gone.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
---
 drivers/gpu/drm/scheduler/sched_main.c | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index 73fccc5..9552334 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -314,6 +314,7 @@ static void drm_sched_job_timedout(struct work_struct *work)
 {
 	struct drm_gpu_scheduler *sched;
 	struct drm_sched_job *job;
+	enum drm_task_status status = DRM_TASK_STATUS_ALIVE;
 
 	sched = container_of(work, struct drm_gpu_scheduler, work_tdr.work);
 
@@ -331,7 +332,7 @@ static void drm_sched_job_timedout(struct work_struct *work)
 		list_del_init(&job->list);
 		spin_unlock(&sched->job_list_lock);
 
-		job->sched->ops->timedout_job(job);
+		status = job->sched->ops->timedout_job(job);
 
 		/*
 		 * Guilty job did complete and hence needs to be manually removed
@@ -345,9 +346,11 @@ static void drm_sched_job_timedout(struct work_struct *work)
 		spin_unlock(&sched->job_list_lock);
 	}
 
-	spin_lock(&sched->job_list_lock);
-	drm_sched_start_timeout(sched);
-	spin_unlock(&sched->job_list_lock);
+	if (status != DRM_TASK_STATUS_ENODEV) {
+		spin_lock(&sched->job_list_lock);
+		drm_sched_start_timeout(sched);
+		spin_unlock(&sched->job_list_lock);
+	}
 }
 
  /**
-- 
2.7.4

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 196+ messages in thread

* [PATCH v4 13/14] drm/sched: Make timeout timer rearm conditional.
@ 2021-01-18 21:01   ` Andrey Grodzovsky
  0 siblings, 0 replies; 196+ messages in thread
From: Andrey Grodzovsky @ 2021-01-18 21:01 UTC (permalink / raw)
  To: amd-gfx, dri-devel, ckoenig.leichtzumerken, daniel.vetter, robh,
	l.stach, yuq825, eric
  Cc: Alexander.Deucher, gregkh, ppaalanen, Harry.Wentland, Andrey Grodzovsky

We don't want to rearm the timer if the driver hook reports
that the device is gone.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
---
 drivers/gpu/drm/scheduler/sched_main.c | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index 73fccc5..9552334 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -314,6 +314,7 @@ static void drm_sched_job_timedout(struct work_struct *work)
 {
 	struct drm_gpu_scheduler *sched;
 	struct drm_sched_job *job;
+	enum drm_task_status status = DRM_TASK_STATUS_ALIVE;
 
 	sched = container_of(work, struct drm_gpu_scheduler, work_tdr.work);
 
@@ -331,7 +332,7 @@ static void drm_sched_job_timedout(struct work_struct *work)
 		list_del_init(&job->list);
 		spin_unlock(&sched->job_list_lock);
 
-		job->sched->ops->timedout_job(job);
+		status = job->sched->ops->timedout_job(job);
 
 		/*
 		 * Guilty job did complete and hence needs to be manually removed
@@ -345,9 +346,11 @@ static void drm_sched_job_timedout(struct work_struct *work)
 		spin_unlock(&sched->job_list_lock);
 	}
 
-	spin_lock(&sched->job_list_lock);
-	drm_sched_start_timeout(sched);
-	spin_unlock(&sched->job_list_lock);
+	if (status != DRM_TASK_STATUS_ENODEV) {
+		spin_lock(&sched->job_list_lock);
+		drm_sched_start_timeout(sched);
+		spin_unlock(&sched->job_list_lock);
+	}
 }
 
  /**
-- 
2.7.4

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply related	[flat|nested] 196+ messages in thread

* [PATCH v4 14/14] drm/amdgpu: Prevent any job recoveries after device is unplugged.
  2021-01-18 21:01 ` Andrey Grodzovsky
@ 2021-01-18 21:01   ` Andrey Grodzovsky
  -1 siblings, 0 replies; 196+ messages in thread
From: Andrey Grodzovsky @ 2021-01-18 21:01 UTC (permalink / raw)
  To: amd-gfx, dri-devel, ckoenig.leichtzumerken, daniel.vetter, robh,
	l.stach, yuq825, eric
  Cc: Alexander.Deucher, gregkh

Return DRM_TASK_STATUS_ENODEV back to the scheduler when the device
is not present so the timeout timer will not be rearmed.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 19 ++++++++++++++++---
 1 file changed, 16 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
index a111326..e4aa5fe 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
@@ -25,6 +25,8 @@
 #include <linux/wait.h>
 #include <linux/sched.h>
 
+#include <drm/drm_drv.h>
+
 #include "amdgpu.h"
 #include "amdgpu_trace.h"
 
@@ -34,6 +36,15 @@ static enum drm_task_status amdgpu_job_timedout(struct drm_sched_job *s_job)
 	struct amdgpu_job *job = to_amdgpu_job(s_job);
 	struct amdgpu_task_info ti;
 	struct amdgpu_device *adev = ring->adev;
+	int idx;
+
+	if (!drm_dev_enter(&adev->ddev, &idx)) {
+		DRM_INFO("%s - device unplugged skipping recovery on scheduler:%s",
+			 __func__, s_job->sched->name);
+
+		/* Effectively the job is aborted as the device is gone */
+		return DRM_TASK_STATUS_ENODEV;
+	}
 
 	memset(&ti, 0, sizeof(struct amdgpu_task_info));
 
@@ -41,7 +52,7 @@ static enum drm_task_status amdgpu_job_timedout(struct drm_sched_job *s_job)
 	    amdgpu_ring_soft_recovery(ring, job->vmid, s_job->s_fence->parent)) {
 		DRM_ERROR("ring %s timeout, but soft recovered\n",
 			  s_job->sched->name);
-		return DRM_TASK_STATUS_ALIVE;
+		goto exit;
 	}
 
 	amdgpu_vm_get_task_info(ring->adev, job->pasid, &ti);
@@ -53,13 +64,15 @@ static enum drm_task_status amdgpu_job_timedout(struct drm_sched_job *s_job)
 
 	if (amdgpu_device_should_recover_gpu(ring->adev)) {
 		amdgpu_device_gpu_recover(ring->adev, job);
-		return DRM_TASK_STATUS_ALIVE;
 	} else {
 		drm_sched_suspend_timeout(&ring->sched);
 		if (amdgpu_sriov_vf(adev))
 			adev->virt.tdr_debug = true;
-		return DRM_TASK_STATUS_ALIVE;
 	}
+
+exit:
+	drm_dev_exit(idx);
+	return DRM_TASK_STATUS_ALIVE;
 }
 
 int amdgpu_job_alloc(struct amdgpu_device *adev, unsigned num_ibs,
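
The drm_dev_enter()/drm_dev_exit() pair above is the usual unplug
guard; reduced to a skeleton (a sketch, not verbatim from the patch):

	int idx;

	if (!drm_dev_enter(drm, &idx))
		return DRM_TASK_STATUS_ENODEV;	/* device already gone */

	/* ... hardware access is safe until drm_dev_exit() ... */

	drm_dev_exit(idx);
	return DRM_TASK_STATUS_ALIVE;
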
-- 
2.7.4

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 196+ messages in thread

* [PATCH v4 14/14] drm/amdgpu: Prevent any job recoveries after device is unplugged.
@ 2021-01-18 21:01   ` Andrey Grodzovsky
  0 siblings, 0 replies; 196+ messages in thread
From: Andrey Grodzovsky @ 2021-01-18 21:01 UTC (permalink / raw)
  To: amd-gfx, dri-devel, ckoenig.leichtzumerken, daniel.vetter, robh,
	l.stach, yuq825, eric
  Cc: Alexander.Deucher, gregkh, ppaalanen, Harry.Wentland, Andrey Grodzovsky

Return DRM_TASK_STATUS_ENODEV back to the scheduler when the device
is not present so the timeout timer will not be rearmed.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 19 ++++++++++++++++---
 1 file changed, 16 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
index a111326..e4aa5fe 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
@@ -25,6 +25,8 @@
 #include <linux/wait.h>
 #include <linux/sched.h>
 
+#include <drm/drm_drv.h>
+
 #include "amdgpu.h"
 #include "amdgpu_trace.h"
 
@@ -34,6 +36,15 @@ static enum drm_task_status amdgpu_job_timedout(struct drm_sched_job *s_job)
 	struct amdgpu_job *job = to_amdgpu_job(s_job);
 	struct amdgpu_task_info ti;
 	struct amdgpu_device *adev = ring->adev;
+	int idx;
+
+	if (!drm_dev_enter(&adev->ddev, &idx)) {
+		DRM_INFO("%s - device unplugged skipping recovery on scheduler:%s",
+			 __func__, s_job->sched->name);
+
+		/* Effectively the job is aborted as the device is gone */
+		return DRM_TASK_STATUS_ENODEV;
+	}
 
 	memset(&ti, 0, sizeof(struct amdgpu_task_info));
 
@@ -41,7 +52,7 @@ static enum drm_task_status amdgpu_job_timedout(struct drm_sched_job *s_job)
 	    amdgpu_ring_soft_recovery(ring, job->vmid, s_job->s_fence->parent)) {
 		DRM_ERROR("ring %s timeout, but soft recovered\n",
 			  s_job->sched->name);
-		return DRM_TASK_STATUS_ALIVE;
+		goto exit;
 	}
 
 	amdgpu_vm_get_task_info(ring->adev, job->pasid, &ti);
@@ -53,13 +64,15 @@ static enum drm_task_status amdgpu_job_timedout(struct drm_sched_job *s_job)
 
 	if (amdgpu_device_should_recover_gpu(ring->adev)) {
 		amdgpu_device_gpu_recover(ring->adev, job);
-		return DRM_TASK_STATUS_ALIVE;
 	} else {
 		drm_sched_suspend_timeout(&ring->sched);
 		if (amdgpu_sriov_vf(adev))
 			adev->virt.tdr_debug = true;
-		return DRM_TASK_STATUS_ALIVE;
 	}
+
+exit:
+	drm_dev_exit(idx);
+	return DRM_TASK_STATUS_ALIVE;
 }
 
 int amdgpu_job_alloc(struct amdgpu_device *adev, unsigned num_ibs,
-- 
2.7.4

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply related	[flat|nested] 196+ messages in thread

* Re: [PATCH v4 01/14] drm/ttm: Remap all page faults to per process dummy page.
  2021-01-18 21:01   ` Andrey Grodzovsky
@ 2021-01-18 21:48     ` Alex Deucher
  -1 siblings, 0 replies; 196+ messages in thread
From: Alex Deucher @ 2021-01-18 21:48 UTC (permalink / raw)
  To: Andrey Grodzovsky
  Cc: Greg KH, Christian König, Maling list - DRI developers,
	amd-gfx list, Daniel Vetter, Deucher, Alexander, Qiang Yu

On Mon, Jan 18, 2021 at 4:02 PM Andrey Grodzovsky
<andrey.grodzovsky@amd.com> wrote:
>
> On device removal reroute all CPU mappings to dummy page.
>
> v3:
> Remove loop to find DRM file and instead access it
> by vma->vm_file->private_data. Move dummy page installation
> into a separate function.
>
> v4:
> Map the entire BO's VA space to an on-demand allocated dummy page
> on the first fault for that BO.
>
> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> ---
>  drivers/gpu/drm/ttm/ttm_bo_vm.c | 82 ++++++++++++++++++++++++++++++++++++++++-
>  include/drm/ttm/ttm_bo_api.h    |  2 +
>  2 files changed, 83 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c b/drivers/gpu/drm/ttm/ttm_bo_vm.c
> index 6dc96cf..ed89da3 100644
> --- a/drivers/gpu/drm/ttm/ttm_bo_vm.c
> +++ b/drivers/gpu/drm/ttm/ttm_bo_vm.c
> @@ -34,6 +34,8 @@
>  #include <drm/ttm/ttm_bo_driver.h>
>  #include <drm/ttm/ttm_placement.h>
>  #include <drm/drm_vma_manager.h>
> +#include <drm/drm_drv.h>
> +#include <drm/drm_managed.h>
>  #include <linux/mm.h>
>  #include <linux/pfn_t.h>
>  #include <linux/rbtree.h>
> @@ -380,25 +382,103 @@ vm_fault_t ttm_bo_vm_fault_reserved(struct vm_fault *vmf,
>  }
>  EXPORT_SYMBOL(ttm_bo_vm_fault_reserved);
>
> +static void ttm_bo_release_dummy_page(struct drm_device *dev, void *res)
> +{
> +       struct page *dummy_page = (struct page *)res;
> +
> +       __free_page(dummy_page);
> +}
> +
> +vm_fault_t ttm_bo_vm_dummy_page(struct vm_fault *vmf, pgprot_t prot)
> +{
> +       struct vm_area_struct *vma = vmf->vma;
> +       struct ttm_buffer_object *bo = vma->vm_private_data;
> +       struct ttm_bo_device *bdev = bo->bdev;
> +       struct drm_device *ddev = bo->base.dev;
> +       vm_fault_t ret = VM_FAULT_NOPAGE;
> +       unsigned long address = vma->vm_start;
> +       unsigned long num_prefault = (vma->vm_end - vma->vm_start) >> PAGE_SHIFT;
> +       unsigned long pfn;
> +       struct page *page;
> +       int i;
> +
> +       /*
> +        * Wait for buffer data in transit, due to a pipelined
> +        * move.
> +        */
> +       ret = ttm_bo_vm_fault_idle(bo, vmf);
> +       if (unlikely(ret != 0))
> +               return ret;
> +
> +       /* Allocate new dummy page to map all the VA range in this VMA to it*/
> +       page = alloc_page(GFP_KERNEL | __GFP_ZERO);
> +       if (!page)
> +               return VM_FAULT_OOM;
> +
> +       pfn = page_to_pfn(page);
> +
> +       /*
> +        * Prefault the entire VMA range right away to avoid further faults
> +        */
> +       for (i = 0; i < num_prefault; ++i) {
> +
> +               if (unlikely(address >= vma->vm_end))
> +                       break;
> +
> +               if (vma->vm_flags & VM_MIXEDMAP)
> +                       ret = vmf_insert_mixed_prot(vma, address,
> +                                                   __pfn_to_pfn_t(pfn, PFN_DEV),
> +                                                   prot);
> +               else
> +                       ret = vmf_insert_pfn_prot(vma, address, pfn, prot);
> +
> +               /* Never error on prefaulted PTEs */
> +               if (unlikely((ret & VM_FAULT_ERROR))) {
> +                       if (i == 0)
> +                               return VM_FAULT_NOPAGE;
> +                       else
> +                               break;
> +               }
> +
> +               address += PAGE_SIZE;
> +       }
> +
> +       /* Set the page to be freed using drmm release action */
> +       if (drmm_add_action_or_reset(ddev, ttm_bo_release_dummy_page, page))
> +               return VM_FAULT_OOM;
> +
> +       return ret;
> +}
> +EXPORT_SYMBOL(ttm_bo_vm_dummy_page);
> +
>  vm_fault_t ttm_bo_vm_fault(struct vm_fault *vmf)
>  {
>         struct vm_area_struct *vma = vmf->vma;
>         pgprot_t prot;
>         struct ttm_buffer_object *bo = vma->vm_private_data;
> +       struct drm_device *ddev = bo->base.dev;
>         vm_fault_t ret;
> +       int idx;
>
>         ret = ttm_bo_vm_reserve(bo, vmf);
>         if (ret)
>                 return ret;
>
>         prot = vma->vm_page_prot;
> -       ret = ttm_bo_vm_fault_reserved(vmf, prot, TTM_BO_VM_NUM_PREFAULT, 1);
> +       if (drm_dev_enter(ddev, &idx)) {
> +               ret = ttm_bo_vm_fault_reserved(vmf, prot, TTM_BO_VM_NUM_PREFAULT, 1);
> +               drm_dev_exit(idx);
> +       } else {
> +               ret = ttm_bo_vm_dummy_page(vmf, prot);
> +       }
>         if (ret == VM_FAULT_RETRY && !(vmf->flags & FAULT_FLAG_RETRY_NOWAIT))
>                 return ret;
>
>         dma_resv_unlock(bo->base.resv);
>
>         return ret;
> +
> +       return ret;

Duplicate return here.

Alex
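
With the duplicate dropped, the tail of ttm_bo_vm_fault() would
presumably read (sketch):

	if (ret == VM_FAULT_RETRY && !(vmf->flags & FAULT_FLAG_RETRY_NOWAIT))
		return ret;

	dma_resv_unlock(bo->base.resv);

	return ret;
}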

>  }
>  EXPORT_SYMBOL(ttm_bo_vm_fault);
>
> diff --git a/include/drm/ttm/ttm_bo_api.h b/include/drm/ttm/ttm_bo_api.h
> index e17be32..12fb240 100644
> --- a/include/drm/ttm/ttm_bo_api.h
> +++ b/include/drm/ttm/ttm_bo_api.h
> @@ -643,4 +643,6 @@ void ttm_bo_vm_close(struct vm_area_struct *vma);
>  int ttm_bo_vm_access(struct vm_area_struct *vma, unsigned long addr,
>                      void *buf, int len, int write);
>
> +vm_fault_t ttm_bo_vm_dummy_page(struct vm_fault *vmf, pgprot_t prot);
> +
>  #endif
> --
> 2.7.4
>
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 196+ messages in thread

* Re: [PATCH v4 01/14] drm/ttm: Remap all page faults to per process dummy page.
@ 2021-01-18 21:48     ` Alex Deucher
  0 siblings, 0 replies; 196+ messages in thread
From: Alex Deucher @ 2021-01-18 21:48 UTC (permalink / raw)
  To: Andrey Grodzovsky
  Cc: Rob Herring, Greg KH, Christian König,
	Maling list - DRI developers, Eric Anholt, amd-gfx list,
	Daniel Vetter, Deucher, Alexander, Qiang Yu, Lucas Stach

On Mon, Jan 18, 2021 at 4:02 PM Andrey Grodzovsky
<andrey.grodzovsky@amd.com> wrote:
>
> On device removal reroute all CPU mappings to dummy page.
>
> v3:
> Remove loop to find DRM file and instead access it
> by vma->vm_file->private_data. Move dummy page installation
> into a separate function.
>
> v4:
> Map the entire BO's VA space to an on-demand allocated dummy page
> on the first fault for that BO.
>
> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> ---
>  drivers/gpu/drm/ttm/ttm_bo_vm.c | 82 ++++++++++++++++++++++++++++++++++++++++-
>  include/drm/ttm/ttm_bo_api.h    |  2 +
>  2 files changed, 83 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c b/drivers/gpu/drm/ttm/ttm_bo_vm.c
> index 6dc96cf..ed89da3 100644
> --- a/drivers/gpu/drm/ttm/ttm_bo_vm.c
> +++ b/drivers/gpu/drm/ttm/ttm_bo_vm.c
> @@ -34,6 +34,8 @@
>  #include <drm/ttm/ttm_bo_driver.h>
>  #include <drm/ttm/ttm_placement.h>
>  #include <drm/drm_vma_manager.h>
> +#include <drm/drm_drv.h>
> +#include <drm/drm_managed.h>
>  #include <linux/mm.h>
>  #include <linux/pfn_t.h>
>  #include <linux/rbtree.h>
> @@ -380,25 +382,103 @@ vm_fault_t ttm_bo_vm_fault_reserved(struct vm_fault *vmf,
>  }
>  EXPORT_SYMBOL(ttm_bo_vm_fault_reserved);
>
> +static void ttm_bo_release_dummy_page(struct drm_device *dev, void *res)
> +{
> +       struct page *dummy_page = (struct page *)res;
> +
> +       __free_page(dummy_page);
> +}
> +
> +vm_fault_t ttm_bo_vm_dummy_page(struct vm_fault *vmf, pgprot_t prot)
> +{
> +       struct vm_area_struct *vma = vmf->vma;
> +       struct ttm_buffer_object *bo = vma->vm_private_data;
> +       struct ttm_bo_device *bdev = bo->bdev;
> +       struct drm_device *ddev = bo->base.dev;
> +       vm_fault_t ret = VM_FAULT_NOPAGE;
> +       unsigned long address = vma->vm_start;
> +       unsigned long num_prefault = (vma->vm_end - vma->vm_start) >> PAGE_SHIFT;
> +       unsigned long pfn;
> +       struct page *page;
> +       int i;
> +
> +       /*
> +        * Wait for buffer data in transit, due to a pipelined
> +        * move.
> +        */
> +       ret = ttm_bo_vm_fault_idle(bo, vmf);
> +       if (unlikely(ret != 0))
> +               return ret;
> +
> +       /* Allocate new dummy page to map all the VA range in this VMA to it*/
> +       page = alloc_page(GFP_KERNEL | __GFP_ZERO);
> +       if (!page)
> +               return VM_FAULT_OOM;
> +
> +       pfn = page_to_pfn(page);
> +
> +       /*
> +        * Prefault the entire VMA range right away to avoid further faults
> +        */
> +       for (i = 0; i < num_prefault; ++i) {
> +
> +               if (unlikely(address >= vma->vm_end))
> +                       break;
> +
> +               if (vma->vm_flags & VM_MIXEDMAP)
> +                       ret = vmf_insert_mixed_prot(vma, address,
> +                                                   __pfn_to_pfn_t(pfn, PFN_DEV),
> +                                                   prot);
> +               else
> +                       ret = vmf_insert_pfn_prot(vma, address, pfn, prot);
> +
> +               /* Never error on prefaulted PTEs */
> +               if (unlikely((ret & VM_FAULT_ERROR))) {
> +                       if (i == 0)
> +                               return VM_FAULT_NOPAGE;
> +                       else
> +                               break;
> +               }
> +
> +               address += PAGE_SIZE;
> +       }
> +
> +       /* Set the page to be freed using drmm release action */
> +       if (drmm_add_action_or_reset(ddev, ttm_bo_release_dummy_page, page))
> +               return VM_FAULT_OOM;
> +
> +       return ret;
> +}
> +EXPORT_SYMBOL(ttm_bo_vm_dummy_page);
> +
>  vm_fault_t ttm_bo_vm_fault(struct vm_fault *vmf)
>  {
>         struct vm_area_struct *vma = vmf->vma;
>         pgprot_t prot;
>         struct ttm_buffer_object *bo = vma->vm_private_data;
> +       struct drm_device *ddev = bo->base.dev;
>         vm_fault_t ret;
> +       int idx;
>
>         ret = ttm_bo_vm_reserve(bo, vmf);
>         if (ret)
>                 return ret;
>
>         prot = vma->vm_page_prot;
> -       ret = ttm_bo_vm_fault_reserved(vmf, prot, TTM_BO_VM_NUM_PREFAULT, 1);
> +       if (drm_dev_enter(ddev, &idx)) {
> +               ret = ttm_bo_vm_fault_reserved(vmf, prot, TTM_BO_VM_NUM_PREFAULT, 1);
> +               drm_dev_exit(idx);
> +       } else {
> +               ret = ttm_bo_vm_dummy_page(vmf, prot);
> +       }
>         if (ret == VM_FAULT_RETRY && !(vmf->flags & FAULT_FLAG_RETRY_NOWAIT))
>                 return ret;
>
>         dma_resv_unlock(bo->base.resv);
>
>         return ret;
> +
> +       return ret;

Duplicate return here.

Alex

>  }
>  EXPORT_SYMBOL(ttm_bo_vm_fault);
>
> diff --git a/include/drm/ttm/ttm_bo_api.h b/include/drm/ttm/ttm_bo_api.h
> index e17be32..12fb240 100644
> --- a/include/drm/ttm/ttm_bo_api.h
> +++ b/include/drm/ttm/ttm_bo_api.h
> @@ -643,4 +643,6 @@ void ttm_bo_vm_close(struct vm_area_struct *vma);
>  int ttm_bo_vm_access(struct vm_area_struct *vma, unsigned long addr,
>                      void *buf, int len, int write);
>
> +vm_fault_t ttm_bo_vm_dummy_page(struct vm_fault *vmf, pgprot_t prot);
> +
>  #endif
> --
> 2.7.4
>
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 196+ messages in thread

* Re: [PATCH v4 04/14] drm/sched: Cancel and flush all oustatdning jobs before finish.
  2021-01-18 21:01   ` Andrey Grodzovsky
@ 2021-01-18 21:49     ` Alex Deucher
  -1 siblings, 0 replies; 196+ messages in thread
From: Alex Deucher @ 2021-01-18 21:49 UTC (permalink / raw)
  To: Andrey Grodzovsky
  Cc: Greg KH, Christian König, Maling list - DRI developers,
	amd-gfx list, Daniel Vetter, Deucher, Alexander, Qiang Yu

On Mon, Jan 18, 2021 at 4:02 PM Andrey Grodzovsky
<andrey.grodzovsky@amd.com> wrote:
>
> To avoid any possible use after free.
>
> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> Reviewed-by: Christian König <christian.koenig@amd.com>

In the subject:
oustatdning -> outstanding

Alex


> ---
>  drivers/gpu/drm/scheduler/sched_main.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> index 997aa15..92637b7 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -899,6 +899,9 @@ void drm_sched_fini(struct drm_gpu_scheduler *sched)
>         if (sched->thread)
>                 kthread_stop(sched->thread);
>
> +       /* Confirm no work left behind accessing device structures */
> +       cancel_delayed_work_sync(&sched->work_tdr);
> +
>         sched->ready = false;
>  }
>  EXPORT_SYMBOL(drm_sched_fini);
> --
> 2.7.4
>
> _______________________________________________
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 196+ messages in thread

* Re: [PATCH v4 04/14] drm/sched: Cancel and flush all oustatdning jobs before finish.
@ 2021-01-18 21:49     ` Alex Deucher
  0 siblings, 0 replies; 196+ messages in thread
From: Alex Deucher @ 2021-01-18 21:49 UTC (permalink / raw)
  To: Andrey Grodzovsky
  Cc: Rob Herring, Greg KH, Christian König,
	Maling list - DRI developers, Eric Anholt, Pekka Paalanen,
	amd-gfx list, Daniel Vetter, Deucher, Alexander, Qiang Yu,
	Wentland, Harry, Lucas Stach

On Mon, Jan 18, 2021 at 4:02 PM Andrey Grodzovsky
<andrey.grodzovsky@amd.com> wrote:
>
> To avoid any possible use after free.
>
> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> Reviewed-by: Christian König <christian.koenig@amd.com>

In the subject:
oustatdning -> outstanding

Alex


> ---
>  drivers/gpu/drm/scheduler/sched_main.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> index 997aa15..92637b7 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -899,6 +899,9 @@ void drm_sched_fini(struct drm_gpu_scheduler *sched)
>         if (sched->thread)
>                 kthread_stop(sched->thread);
>
> +       /* Confirm no work left behind accessing device structures */
> +       cancel_delayed_work_sync(&sched->work_tdr);
> +
>         sched->ready = false;
>  }
>  EXPORT_SYMBOL(drm_sched_fini);
> --
> 2.7.4
>
> _______________________________________________
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 196+ messages in thread

* Re: [PATCH v4 07/14] drm/amdgpu: Register IOMMU topology notifier per device.
  2021-01-18 21:01   ` Andrey Grodzovsky
@ 2021-01-18 21:52     ` Alex Deucher
  -1 siblings, 0 replies; 196+ messages in thread
From: Alex Deucher @ 2021-01-18 21:52 UTC (permalink / raw)
  To: Andrey Grodzovsky
  Cc: Greg KH, Christian König, Maling list - DRI developers,
	amd-gfx list, Daniel Vetter, Deucher, Alexander, Qiang Yu

On Mon, Jan 18, 2021 at 4:02 PM Andrey Grodzovsky
<andrey.grodzovsky@amd.com> wrote:
>
> Handle all DMA IOMMU gropup related dependencies before the

gropup -> group

Alex

> group is removed.
>
> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu.h        |  5 ++++
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 46 ++++++++++++++++++++++++++++++
>  drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c   |  2 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h   |  1 +
>  drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 10 +++++++
>  drivers/gpu/drm/amd/amdgpu/amdgpu_object.h |  2 ++
>  6 files changed, 65 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> index 478a7d8..2953420 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> @@ -51,6 +51,7 @@
>  #include <linux/dma-fence.h>
>  #include <linux/pci.h>
>  #include <linux/aer.h>
> +#include <linux/notifier.h>
>
>  #include <drm/ttm/ttm_bo_api.h>
>  #include <drm/ttm/ttm_bo_driver.h>
> @@ -1041,6 +1042,10 @@ struct amdgpu_device {
>
>         bool                            in_pci_err_recovery;
>         struct pci_saved_state          *pci_state;
> +
> +       struct notifier_block           nb;
> +       struct blocking_notifier_head   notifier;
> +       struct list_head                device_bo_list;
>  };
>
>  static inline struct amdgpu_device *drm_to_adev(struct drm_device *ddev)
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 45e23e3..e99f4f1 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -70,6 +70,8 @@
>  #include <drm/task_barrier.h>
>  #include <linux/pm_runtime.h>
>
> +#include <linux/iommu.h>
> +
>  MODULE_FIRMWARE("amdgpu/vega10_gpu_info.bin");
>  MODULE_FIRMWARE("amdgpu/vega12_gpu_info.bin");
>  MODULE_FIRMWARE("amdgpu/raven_gpu_info.bin");
> @@ -3200,6 +3202,39 @@ static const struct attribute *amdgpu_dev_attributes[] = {
>  };
>
>
> +static int amdgpu_iommu_group_notifier(struct notifier_block *nb,
> +                                    unsigned long action, void *data)
> +{
> +       struct amdgpu_device *adev = container_of(nb, struct amdgpu_device, nb);
> +       struct amdgpu_bo *bo = NULL;
> +
> +       /*
> +        * Following is a set of IOMMU group dependencies taken care of before
> +        * device's IOMMU group is removed
> +        */
> +       if (action == IOMMU_GROUP_NOTIFY_DEL_DEVICE) {
> +
> +               spin_lock(&ttm_bo_glob.lru_lock);
> +               list_for_each_entry(bo, &adev->device_bo_list, bo) {
> +                       if (bo->tbo.ttm)
> +                               ttm_tt_unpopulate(bo->tbo.bdev, bo->tbo.ttm);
> +               }
> +               spin_unlock(&ttm_bo_glob.lru_lock);
> +
> +               if (adev->irq.ih.use_bus_addr)
> +                       amdgpu_ih_ring_fini(adev, &adev->irq.ih);
> +               if (adev->irq.ih1.use_bus_addr)
> +                       amdgpu_ih_ring_fini(adev, &adev->irq.ih1);
> +               if (adev->irq.ih2.use_bus_addr)
> +                       amdgpu_ih_ring_fini(adev, &adev->irq.ih2);
> +
> +               amdgpu_gart_dummy_page_fini(adev);
> +       }
> +
> +       return NOTIFY_OK;
> +}
> +
> +
>  /**
>   * amdgpu_device_init - initialize the driver
>   *
> @@ -3304,6 +3339,8 @@ int amdgpu_device_init(struct amdgpu_device *adev,
>
>         INIT_WORK(&adev->xgmi_reset_work, amdgpu_device_xgmi_reset_func);
>
> +       INIT_LIST_HEAD(&adev->device_bo_list);
> +
>         adev->gfx.gfx_off_req_count = 1;
>         adev->pm.ac_power = power_supply_is_system_supplied() > 0;
>
> @@ -3575,6 +3612,15 @@ int amdgpu_device_init(struct amdgpu_device *adev,
>         if (amdgpu_device_cache_pci_state(adev->pdev))
>                 pci_restore_state(pdev);
>
> +       BLOCKING_INIT_NOTIFIER_HEAD(&adev->notifier);
> +       adev->nb.notifier_call = amdgpu_iommu_group_notifier;
> +
> +       if (adev->dev->iommu_group) {
> +               r = iommu_group_register_notifier(adev->dev->iommu_group, &adev->nb);
> +               if (r)
> +                       goto failed;
> +       }
> +
>         return 0;
>
>  failed:
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
> index 0db9330..486ad6d 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
> @@ -92,7 +92,7 @@ static int amdgpu_gart_dummy_page_init(struct amdgpu_device *adev)
>   *
>   * Frees the dummy page used by the driver (all asics).
>   */
> -static void amdgpu_gart_dummy_page_fini(struct amdgpu_device *adev)
> +void amdgpu_gart_dummy_page_fini(struct amdgpu_device *adev)
>  {
>         if (!adev->dummy_page_addr)
>                 return;
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h
> index afa2e28..5678d9c 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h
> @@ -61,6 +61,7 @@ int amdgpu_gart_table_vram_pin(struct amdgpu_device *adev);
>  void amdgpu_gart_table_vram_unpin(struct amdgpu_device *adev);
>  int amdgpu_gart_init(struct amdgpu_device *adev);
>  void amdgpu_gart_fini(struct amdgpu_device *adev);
> +void amdgpu_gart_dummy_page_fini(struct amdgpu_device *adev);
>  int amdgpu_gart_unbind(struct amdgpu_device *adev, uint64_t offset,
>                        int pages);
>  int amdgpu_gart_map(struct amdgpu_device *adev, uint64_t offset,
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> index 6cc9919..4a1de69 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> @@ -94,6 +94,10 @@ static void amdgpu_bo_destroy(struct ttm_buffer_object *tbo)
>         }
>         amdgpu_bo_unref(&bo->parent);
>
> +       spin_lock(&ttm_bo_glob.lru_lock);
> +       list_del(&bo->bo);
> +       spin_unlock(&ttm_bo_glob.lru_lock);
> +
>         kfree(bo->metadata);
>         kfree(bo);
>  }
> @@ -613,6 +617,12 @@ static int amdgpu_bo_do_create(struct amdgpu_device *adev,
>         if (bp->type == ttm_bo_type_device)
>                 bo->flags &= ~AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED;
>
> +       INIT_LIST_HEAD(&bo->bo);
> +
> +       spin_lock(&ttm_bo_glob.lru_lock);
> +       list_add_tail(&bo->bo, &adev->device_bo_list);
> +       spin_unlock(&ttm_bo_glob.lru_lock);
> +
>         return 0;
>
>  fail_unreserve:
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
> index 9ac3756..5ae8555 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
> @@ -110,6 +110,8 @@ struct amdgpu_bo {
>         struct list_head                shadow_list;
>
>         struct kgd_mem                  *kfd_bo;
> +
> +       struct list_head                bo;
>  };
>
>  static inline struct amdgpu_bo *ttm_to_amdgpu_bo(struct ttm_buffer_object *tbo)
> --
> 2.7.4
>
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 196+ messages in thread

* Re: [PATCH v4 07/14] drm/amdgpu: Register IOMMU topology notifier per device.
@ 2021-01-18 21:52     ` Alex Deucher
  0 siblings, 0 replies; 196+ messages in thread
From: Alex Deucher @ 2021-01-18 21:52 UTC (permalink / raw)
  To: Andrey Grodzovsky
  Cc: Rob Herring, Greg KH, Christian König,
	Maling list - DRI developers, Eric Anholt, amd-gfx list,
	Daniel Vetter, Deucher, Alexander, Qiang Yu, Lucas Stach

On Mon, Jan 18, 2021 at 4:02 PM Andrey Grodzovsky
<andrey.grodzovsky@amd.com> wrote:
>
> Handle all DMA IOMMU gropup related dependencies before the

gropup -> group

Alex

> group is removed.
>
> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu.h        |  5 ++++
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 46 ++++++++++++++++++++++++++++++
>  drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c   |  2 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h   |  1 +
>  drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 10 +++++++
>  drivers/gpu/drm/amd/amdgpu/amdgpu_object.h |  2 ++
>  6 files changed, 65 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> index 478a7d8..2953420 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> @@ -51,6 +51,7 @@
>  #include <linux/dma-fence.h>
>  #include <linux/pci.h>
>  #include <linux/aer.h>
> +#include <linux/notifier.h>
>
>  #include <drm/ttm/ttm_bo_api.h>
>  #include <drm/ttm/ttm_bo_driver.h>
> @@ -1041,6 +1042,10 @@ struct amdgpu_device {
>
>         bool                            in_pci_err_recovery;
>         struct pci_saved_state          *pci_state;
> +
> +       struct notifier_block           nb;
> +       struct blocking_notifier_head   notifier;
> +       struct list_head                device_bo_list;
>  };
>
>  static inline struct amdgpu_device *drm_to_adev(struct drm_device *ddev)
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 45e23e3..e99f4f1 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -70,6 +70,8 @@
>  #include <drm/task_barrier.h>
>  #include <linux/pm_runtime.h>
>
> +#include <linux/iommu.h>
> +
>  MODULE_FIRMWARE("amdgpu/vega10_gpu_info.bin");
>  MODULE_FIRMWARE("amdgpu/vega12_gpu_info.bin");
>  MODULE_FIRMWARE("amdgpu/raven_gpu_info.bin");
> @@ -3200,6 +3202,39 @@ static const struct attribute *amdgpu_dev_attributes[] = {
>  };
>
>
> +static int amdgpu_iommu_group_notifier(struct notifier_block *nb,
> +                                    unsigned long action, void *data)
> +{
> +       struct amdgpu_device *adev = container_of(nb, struct amdgpu_device, nb);
> +       struct amdgpu_bo *bo = NULL;
> +
> +       /*
> +        * Following is a set of IOMMU group dependencies taken care of before
> +        * device's IOMMU group is removed
> +        */
> +       if (action == IOMMU_GROUP_NOTIFY_DEL_DEVICE) {
> +
> +               spin_lock(&ttm_bo_glob.lru_lock);
> +               list_for_each_entry(bo, &adev->device_bo_list, bo) {
> +                       if (bo->tbo.ttm)
> +                               ttm_tt_unpopulate(bo->tbo.bdev, bo->tbo.ttm);
> +               }
> +               spin_unlock(&ttm_bo_glob.lru_lock);
> +
> +               if (adev->irq.ih.use_bus_addr)
> +                       amdgpu_ih_ring_fini(adev, &adev->irq.ih);
> +               if (adev->irq.ih1.use_bus_addr)
> +                       amdgpu_ih_ring_fini(adev, &adev->irq.ih1);
> +               if (adev->irq.ih2.use_bus_addr)
> +                       amdgpu_ih_ring_fini(adev, &adev->irq.ih2);
> +
> +               amdgpu_gart_dummy_page_fini(adev);
> +       }
> +
> +       return NOTIFY_OK;
> +}
> +
> +
>  /**
>   * amdgpu_device_init - initialize the driver
>   *
> @@ -3304,6 +3339,8 @@ int amdgpu_device_init(struct amdgpu_device *adev,
>
>         INIT_WORK(&adev->xgmi_reset_work, amdgpu_device_xgmi_reset_func);
>
> +       INIT_LIST_HEAD(&adev->device_bo_list);
> +
>         adev->gfx.gfx_off_req_count = 1;
>         adev->pm.ac_power = power_supply_is_system_supplied() > 0;
>
> @@ -3575,6 +3612,15 @@ int amdgpu_device_init(struct amdgpu_device *adev,
>         if (amdgpu_device_cache_pci_state(adev->pdev))
>                 pci_restore_state(pdev);
>
> +       BLOCKING_INIT_NOTIFIER_HEAD(&adev->notifier);
> +       adev->nb.notifier_call = amdgpu_iommu_group_notifier;
> +
> +       if (adev->dev->iommu_group) {
> +               r = iommu_group_register_notifier(adev->dev->iommu_group, &adev->nb);
> +               if (r)
> +                       goto failed;
> +       }
> +
>         return 0;
>
>  failed:
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
> index 0db9330..486ad6d 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
> @@ -92,7 +92,7 @@ static int amdgpu_gart_dummy_page_init(struct amdgpu_device *adev)
>   *
>   * Frees the dummy page used by the driver (all asics).
>   */
> -static void amdgpu_gart_dummy_page_fini(struct amdgpu_device *adev)
> +void amdgpu_gart_dummy_page_fini(struct amdgpu_device *adev)
>  {
>         if (!adev->dummy_page_addr)
>                 return;
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h
> index afa2e28..5678d9c 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h
> @@ -61,6 +61,7 @@ int amdgpu_gart_table_vram_pin(struct amdgpu_device *adev);
>  void amdgpu_gart_table_vram_unpin(struct amdgpu_device *adev);
>  int amdgpu_gart_init(struct amdgpu_device *adev);
>  void amdgpu_gart_fini(struct amdgpu_device *adev);
> +void amdgpu_gart_dummy_page_fini(struct amdgpu_device *adev);
>  int amdgpu_gart_unbind(struct amdgpu_device *adev, uint64_t offset,
>                        int pages);
>  int amdgpu_gart_map(struct amdgpu_device *adev, uint64_t offset,
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> index 6cc9919..4a1de69 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> @@ -94,6 +94,10 @@ static void amdgpu_bo_destroy(struct ttm_buffer_object *tbo)
>         }
>         amdgpu_bo_unref(&bo->parent);
>
> +       spin_lock(&ttm_bo_glob.lru_lock);
> +       list_del(&bo->bo);
> +       spin_unlock(&ttm_bo_glob.lru_lock);
> +
>         kfree(bo->metadata);
>         kfree(bo);
>  }
> @@ -613,6 +617,12 @@ static int amdgpu_bo_do_create(struct amdgpu_device *adev,
>         if (bp->type == ttm_bo_type_device)
>                 bo->flags &= ~AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED;
>
> +       INIT_LIST_HEAD(&bo->bo);
> +
> +       spin_lock(&ttm_bo_glob.lru_lock);
> +       list_add_tail(&bo->bo, &adev->device_bo_list);
> +       spin_unlock(&ttm_bo_glob.lru_lock);
> +
>         return 0;
>
>  fail_unreserve:
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
> index 9ac3756..5ae8555 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
> @@ -110,6 +110,8 @@ struct amdgpu_bo {
>         struct list_head                shadow_list;
>
>         struct kgd_mem                  *kfd_bo;
> +
> +       struct list_head                bo;
>  };
>
>  static inline struct amdgpu_bo *ttm_to_amdgpu_bo(struct ttm_buffer_object *tbo)
> --
> 2.7.4
>
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 196+ messages in thread

* Re: [PATCH v4 10/14] dmr/amdgpu: Move some sysfs attrs creation to default_attr
  2021-01-18 21:01   ` Andrey Grodzovsky
@ 2021-01-19  7:34     ` Greg KH
  -1 siblings, 0 replies; 196+ messages in thread
From: Greg KH @ 2021-01-19  7:34 UTC (permalink / raw)
  To: Andrey Grodzovsky
  Cc: ckoenig.leichtzumerken, dri-devel, amd-gfx, daniel.vetter,
	Alexander.Deucher, yuq825

On Mon, Jan 18, 2021 at 04:01:19PM -0500, Andrey Grodzovsky wrote:
>  static struct pci_driver amdgpu_kms_pci_driver = {
>  	.name = DRIVER_NAME,
>  	.id_table = pciidlist,
> @@ -1595,6 +1607,7 @@ static struct pci_driver amdgpu_kms_pci_driver = {
>  	.shutdown = amdgpu_pci_shutdown,
>  	.driver.pm = &amdgpu_pm_ops,
>  	.err_handler = &amdgpu_pci_err_handler,
> +	.driver.dev_groups = amdgpu_sysfs_groups,

Shouldn't this just be:
	groups = amdgpu_sysfs_groups,

Why go to the "driver root" here?

Other than that tiny thing, looks good to me, nice cleanup!

greg k-h
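
For context, the two fields attach attributes at different levels
(sketch of the relevant struct device_driver members of this era;
comments paraphrase their documented roles):

	struct device_driver {
		/* ... */
		/* created on the driver object itself */
		const struct attribute_group **groups;
		/* created on each device bound to the driver */
		const struct attribute_group **dev_groups;
		/* ... */
	};
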
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 196+ messages in thread

* Re: [PATCH v4 10/14] dmr/amdgpu: Move some sysfs attrs creation to default_attr
@ 2021-01-19  7:34     ` Greg KH
  0 siblings, 0 replies; 196+ messages in thread
From: Greg KH @ 2021-01-19  7:34 UTC (permalink / raw)
  To: Andrey Grodzovsky
  Cc: robh, ckoenig.leichtzumerken, dri-devel, eric, ppaalanen,
	amd-gfx, daniel.vetter, Alexander.Deucher, yuq825,
	Harry.Wentland, l.stach

On Mon, Jan 18, 2021 at 04:01:19PM -0500, Andrey Grodzovsky wrote:
>  static struct pci_driver amdgpu_kms_pci_driver = {
>  	.name = DRIVER_NAME,
>  	.id_table = pciidlist,
> @@ -1595,6 +1607,7 @@ static struct pci_driver amdgpu_kms_pci_driver = {
>  	.shutdown = amdgpu_pci_shutdown,
>  	.driver.pm = &amdgpu_pm_ops,
>  	.err_handler = &amdgpu_pci_err_handler,
> +	.driver.dev_groups = amdgpu_sysfs_groups,

Shouldn't this just be:
	groups = amdgpu_sysfs_groups,

Why go to the "driver root" here?

Other than that tiny thing, looks good to me, nice cleanup!

greg k-h
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 196+ messages in thread

* Re: [PATCH v4 12/14] drm/scheduler: Job timeout handler returns status
  2021-01-18 21:01   ` Andrey Grodzovsky
@ 2021-01-19  7:53     ` Christian König
  -1 siblings, 0 replies; 196+ messages in thread
From: Christian König @ 2021-01-19  7:53 UTC (permalink / raw)
  To: Andrey Grodzovsky, amd-gfx, dri-devel, ckoenig.leichtzumerken,
	daniel.vetter, robh, l.stach, yuq825, eric
  Cc: Tomeu Vizoso, gregkh, Steven Price, Luben Tuikov,
	Alyssa Rosenzweig, Russell King, Alexander.Deucher

Am 18.01.21 um 22:01 schrieb Andrey Grodzovsky:
> From: Luben Tuikov <luben.tuikov@amd.com>
>
> This patch does not change current behaviour.
>
> The driver's job timeout handler now returns
> status indicating back to the DRM layer whether
> the task (job) was successfully aborted or whether
> more time should be given to the task to complete.
>
> Default behaviour as of this patch is preserved,
> except in the obvious-by-comment case in the Panfrost
> driver, as documented below.
>
> All drivers which make use of the
> drm_sched_backend_ops' .timedout_job() callback
> have been accordingly renamed and return the
> would've-been default value of
> DRM_TASK_STATUS_ALIVE to restart the task's
> timeout timer--this is the old behaviour, and
> is preserved by this patch.
>
> In the case of the Panfrost driver, its timedout
> callback correctly first checks if the job had
> completed in due time and if so, it now returns
> DRM_TASK_STATUS_COMPLETE to notify the DRM layer
> that the task can be moved to the done list, to be
> freed later. In the other two subsequent checks,
> the value of DRM_TASK_STATUS_ALIVE is returned, as
> per the default behaviour.
>
> More involved driver solutions can be had
> in subsequent patches.
>
> v2: Use enum as the status of a driver's job
>      timeout callback method.
>
> v4: (By Andrey Grodzovsky)
> Replace DRM_TASK_STATUS_COMPLETE with DRM_TASK_STATUS_ENODEV
> to enable a hint to the scheduler for when NOT to rearm the
> timeout timer.

As Lukas pointed out, returning the job (or task) status doesn't make 
much sense.

What we return here is the status of the scheduler.

I would either rename the enum or completely drop it and return a 
negative error status.
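
In sketch form, that alternative might look like this (assuming the
drm_dev_enter() guard from patch 14/14; illustrative only):

	/* Return 0 to rearm the timeout, or a negative errno such as
	 * -ENODEV when the device is gone and recovery is pointless.
	 */
	static int amdgpu_job_timedout(struct drm_sched_job *s_job)
	{
		struct amdgpu_ring *ring = to_amdgpu_ring(s_job->sched);
		int idx;

		if (!drm_dev_enter(&ring->adev->ddev, &idx))
			return -ENODEV;

		/* ... normal timeout/recovery handling ... */

		drm_dev_exit(idx);
		return 0;
	}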

Apart from that looks fine to me,
Christian.


>
> Cc: Alexander Deucher <Alexander.Deucher@amd.com>
> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
> Cc: Christian König <christian.koenig@amd.com>
> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> Cc: Lucas Stach <l.stach@pengutronix.de>
> Cc: Russell King <linux+etnaviv@armlinux.org.uk>
> Cc: Christian Gmeiner <christian.gmeiner@gmail.com>
> Cc: Qiang Yu <yuq825@gmail.com>
> Cc: Rob Herring <robh@kernel.org>
> Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com>
> Cc: Steven Price <steven.price@arm.com>
> Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
> Cc: Eric Anholt <eric@anholt.net>
> Reported-by: kernel test robot <lkp@intel.com>
> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_job.c |  6 ++++--
>   drivers/gpu/drm/etnaviv/etnaviv_sched.c | 10 +++++++++-
>   drivers/gpu/drm/lima/lima_sched.c       |  4 +++-
>   drivers/gpu/drm/panfrost/panfrost_job.c |  9 ++++++---
>   drivers/gpu/drm/scheduler/sched_main.c  |  4 +---
>   drivers/gpu/drm/v3d/v3d_sched.c         | 32 +++++++++++++++++---------------
>   include/drm/gpu_scheduler.h             | 17 ++++++++++++++---
>   7 files changed, 54 insertions(+), 28 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> index ff48101..a111326 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> @@ -28,7 +28,7 @@
>   #include "amdgpu.h"
>   #include "amdgpu_trace.h"
>   
> -static void amdgpu_job_timedout(struct drm_sched_job *s_job)
> +static enum drm_task_status amdgpu_job_timedout(struct drm_sched_job *s_job)
>   {
>   	struct amdgpu_ring *ring = to_amdgpu_ring(s_job->sched);
>   	struct amdgpu_job *job = to_amdgpu_job(s_job);
> @@ -41,7 +41,7 @@ static void amdgpu_job_timedout(struct drm_sched_job *s_job)
>   	    amdgpu_ring_soft_recovery(ring, job->vmid, s_job->s_fence->parent)) {
>   		DRM_ERROR("ring %s timeout, but soft recovered\n",
>   			  s_job->sched->name);
> -		return;
> +		return DRM_TASK_STATUS_ALIVE;
>   	}
>   
>   	amdgpu_vm_get_task_info(ring->adev, job->pasid, &ti);
> @@ -53,10 +53,12 @@ static void amdgpu_job_timedout(struct drm_sched_job *s_job)
>   
>   	if (amdgpu_device_should_recover_gpu(ring->adev)) {
>   		amdgpu_device_gpu_recover(ring->adev, job);
> +		return DRM_TASK_STATUS_ALIVE;
>   	} else {
>   		drm_sched_suspend_timeout(&ring->sched);
>   		if (amdgpu_sriov_vf(adev))
>   			adev->virt.tdr_debug = true;
> +		return DRM_TASK_STATUS_ALIVE;
>   	}
>   }
>   
> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_sched.c b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
> index cd46c88..c495169 100644
> --- a/drivers/gpu/drm/etnaviv/etnaviv_sched.c
> +++ b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
> @@ -82,7 +82,8 @@ static struct dma_fence *etnaviv_sched_run_job(struct drm_sched_job *sched_job)
>   	return fence;
>   }
>   
> -static void etnaviv_sched_timedout_job(struct drm_sched_job *sched_job)
> +static enum drm_task_status etnaviv_sched_timedout_job(struct drm_sched_job
> +						       *sched_job)
>   {
>   	struct etnaviv_gem_submit *submit = to_etnaviv_submit(sched_job);
>   	struct etnaviv_gpu *gpu = submit->gpu;
> @@ -120,9 +121,16 @@ static void etnaviv_sched_timedout_job(struct drm_sched_job *sched_job)
>   
>   	drm_sched_resubmit_jobs(&gpu->sched);
>   
> +	/* Tell the DRM scheduler that this task needs
> +	 * more time.
> +	 */
> +	drm_sched_start(&gpu->sched, true);
> +	return DRM_TASK_STATUS_ALIVE;
> +
>   out_no_timeout:
>   	/* restart scheduler after GPU is usable again */
>   	drm_sched_start(&gpu->sched, true);
> +	return DRM_TASK_STATUS_ALIVE;
>   }
>   
>   static void etnaviv_sched_free_job(struct drm_sched_job *sched_job)
> diff --git a/drivers/gpu/drm/lima/lima_sched.c b/drivers/gpu/drm/lima/lima_sched.c
> index 63b4c56..66d9236 100644
> --- a/drivers/gpu/drm/lima/lima_sched.c
> +++ b/drivers/gpu/drm/lima/lima_sched.c
> @@ -415,7 +415,7 @@ static void lima_sched_build_error_task_list(struct lima_sched_task *task)
>   	mutex_unlock(&dev->error_task_list_lock);
>   }
>   
> -static void lima_sched_timedout_job(struct drm_sched_job *job)
> +static enum drm_task_status lima_sched_timedout_job(struct drm_sched_job *job)
>   {
>   	struct lima_sched_pipe *pipe = to_lima_pipe(job->sched);
>   	struct lima_sched_task *task = to_lima_task(job);
> @@ -449,6 +449,8 @@ static void lima_sched_timedout_job(struct drm_sched_job *job)
>   
>   	drm_sched_resubmit_jobs(&pipe->base);
>   	drm_sched_start(&pipe->base, true);
> +
> +	return DRM_TASK_STATUS_ALIVE;
>   }
>   
>   static void lima_sched_free_job(struct drm_sched_job *job)
> diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
> index 04e6f6f..10d41ac 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_job.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_job.c
> @@ -432,7 +432,8 @@ static void panfrost_scheduler_start(struct panfrost_queue_state *queue)
>   	mutex_unlock(&queue->lock);
>   }
>   
> -static void panfrost_job_timedout(struct drm_sched_job *sched_job)
> +static enum drm_task_status panfrost_job_timedout(struct drm_sched_job
> +						  *sched_job)
>   {
>   	struct panfrost_job *job = to_panfrost_job(sched_job);
>   	struct panfrost_device *pfdev = job->pfdev;
> @@ -443,7 +444,7 @@ static void panfrost_job_timedout(struct drm_sched_job *sched_job)
>   	 * spurious. Bail out.
>   	 */
>   	if (dma_fence_is_signaled(job->done_fence))
> -		return;
> +		return DRM_TASK_STATUS_ALIVE;
>   
>   	dev_err(pfdev->dev, "gpu sched timeout, js=%d, config=0x%x, status=0x%x, head=0x%x, tail=0x%x, sched_job=%p",
>   		js,
> @@ -455,11 +456,13 @@ static void panfrost_job_timedout(struct drm_sched_job *sched_job)
>   
>   	/* Scheduler is already stopped, nothing to do. */
>   	if (!panfrost_scheduler_stop(&pfdev->js->queue[js], sched_job))
> -		return;
> +		return DRM_TASK_STATUS_ALIVE;
>   
>   	/* Schedule a reset if there's no reset in progress. */
>   	if (!atomic_xchg(&pfdev->reset.pending, 1))
>   		schedule_work(&pfdev->reset.work);
> +
> +	return DRM_TASK_STATUS_ALIVE;
>   }
>   
>   static const struct drm_sched_backend_ops panfrost_sched_ops = {
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> index 92637b7..73fccc5 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -527,7 +527,7 @@ void drm_sched_start(struct drm_gpu_scheduler *sched, bool full_recovery)
>   EXPORT_SYMBOL(drm_sched_start);
>   
>   /**
> - * drm_sched_resubmit_jobs - helper to relunch job from pending ring list
> + * drm_sched_resubmit_jobs - helper to relaunch jobs from the pending list
>    *
>    * @sched: scheduler instance
>    *
> @@ -561,8 +561,6 @@ void drm_sched_resubmit_jobs(struct drm_gpu_scheduler *sched)
>   		} else {
>   			s_job->s_fence->parent = fence;
>   		}
> -
> -
>   	}
>   }
>   EXPORT_SYMBOL(drm_sched_resubmit_jobs);
> diff --git a/drivers/gpu/drm/v3d/v3d_sched.c b/drivers/gpu/drm/v3d/v3d_sched.c
> index 452682e..3740665e 100644
> --- a/drivers/gpu/drm/v3d/v3d_sched.c
> +++ b/drivers/gpu/drm/v3d/v3d_sched.c
> @@ -259,7 +259,7 @@ v3d_cache_clean_job_run(struct drm_sched_job *sched_job)
>   	return NULL;
>   }
>   
> -static void
> +static enum drm_task_status
>   v3d_gpu_reset_for_timeout(struct v3d_dev *v3d, struct drm_sched_job *sched_job)
>   {
>   	enum v3d_queue q;
> @@ -285,6 +285,8 @@ v3d_gpu_reset_for_timeout(struct v3d_dev *v3d, struct drm_sched_job *sched_job)
>   	}
>   
>   	mutex_unlock(&v3d->reset_lock);
> +
> +	return DRM_TASK_STATUS_ALIVE;
>   }
>   
>   /* If the current address or return address have changed, then the GPU
> @@ -292,7 +294,7 @@ v3d_gpu_reset_for_timeout(struct v3d_dev *v3d, struct drm_sched_job *sched_job)
>    * could fail if the GPU got in an infinite loop in the CL, but that
>    * is pretty unlikely outside of an i-g-t testcase.
>    */
> -static void
> +static enum drm_task_status
>   v3d_cl_job_timedout(struct drm_sched_job *sched_job, enum v3d_queue q,
>   		    u32 *timedout_ctca, u32 *timedout_ctra)
>   {
> @@ -304,39 +306,39 @@ v3d_cl_job_timedout(struct drm_sched_job *sched_job, enum v3d_queue q,
>   	if (*timedout_ctca != ctca || *timedout_ctra != ctra) {
>   		*timedout_ctca = ctca;
>   		*timedout_ctra = ctra;
> -		return;
> +		return DRM_TASK_STATUS_ALIVE;
>   	}
>   
> -	v3d_gpu_reset_for_timeout(v3d, sched_job);
> +	return v3d_gpu_reset_for_timeout(v3d, sched_job);
>   }
>   
> -static void
> +static enum drm_task_status
>   v3d_bin_job_timedout(struct drm_sched_job *sched_job)
>   {
>   	struct v3d_bin_job *job = to_bin_job(sched_job);
>   
> -	v3d_cl_job_timedout(sched_job, V3D_BIN,
> -			    &job->timedout_ctca, &job->timedout_ctra);
> +	return v3d_cl_job_timedout(sched_job, V3D_BIN,
> +				   &job->timedout_ctca, &job->timedout_ctra);
>   }
>   
> -static void
> +static enum drm_task_status
>   v3d_render_job_timedout(struct drm_sched_job *sched_job)
>   {
>   	struct v3d_render_job *job = to_render_job(sched_job);
>   
> -	v3d_cl_job_timedout(sched_job, V3D_RENDER,
> -			    &job->timedout_ctca, &job->timedout_ctra);
> +	return v3d_cl_job_timedout(sched_job, V3D_RENDER,
> +				   &job->timedout_ctca, &job->timedout_ctra);
>   }
>   
> -static void
> +static enum drm_task_status
>   v3d_generic_job_timedout(struct drm_sched_job *sched_job)
>   {
>   	struct v3d_job *job = to_v3d_job(sched_job);
>   
> -	v3d_gpu_reset_for_timeout(job->v3d, sched_job);
> +	return v3d_gpu_reset_for_timeout(job->v3d, sched_job);
>   }
>   
> -static void
> +static enum drm_task_status
>   v3d_csd_job_timedout(struct drm_sched_job *sched_job)
>   {
>   	struct v3d_csd_job *job = to_csd_job(sched_job);
> @@ -348,10 +350,10 @@ v3d_csd_job_timedout(struct drm_sched_job *sched_job)
>   	 */
>   	if (job->timedout_batches != batches) {
>   		job->timedout_batches = batches;
> -		return;
> +		return DRM_TASK_STATUS_ALIVE;
>   	}
>   
> -	v3d_gpu_reset_for_timeout(v3d, sched_job);
> +	return v3d_gpu_reset_for_timeout(v3d, sched_job);
>   }
>   
>   static const struct drm_sched_backend_ops v3d_bin_sched_ops = {
> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
> index 975e8a6..3ba36bc 100644
> --- a/include/drm/gpu_scheduler.h
> +++ b/include/drm/gpu_scheduler.h
> @@ -206,6 +206,11 @@ static inline bool drm_sched_invalidate_job(struct drm_sched_job *s_job,
>   	return s_job && atomic_inc_return(&s_job->karma) > threshold;
>   }
>   
> +enum drm_task_status {
> +	DRM_TASK_STATUS_ENODEV,
> +	DRM_TASK_STATUS_ALIVE
> +};
> +
>   /**
>    * struct drm_sched_backend_ops
>    *
> @@ -230,10 +235,16 @@ struct drm_sched_backend_ops {
>   	struct dma_fence *(*run_job)(struct drm_sched_job *sched_job);
>   
>   	/**
> -         * @timedout_job: Called when a job has taken too long to execute,
> -         * to trigger GPU recovery.
> +	 * @timedout_job: Called when a job has taken too long to execute,
> +	 * to trigger GPU recovery.
> +	 *
> +	 * Return DRM_TASK_STATUS_ALIVE, if the task (job) is healthy
> +	 * and executing in the hardware, i.e. it needs more time.
> +	 *
> +	 * Return DRM_TASK_STATUS_ENODEV, if the task (job) has
> +	 * been aborted.
>   	 */
> -	void (*timedout_job)(struct drm_sched_job *sched_job);
> +	enum drm_task_status (*timedout_job)(struct drm_sched_job *sched_job);
>   
>   	/**
>            * @free_job: Called once the job's finished fence has been signaled

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 196+ messages in thread
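
For reference, the contract documented in the gpu_scheduler.h hunk above,
condensed into a minimal, hypothetical driver handler (the foo_* names and
the foo_device layout are illustrative, not from the series):

static enum drm_task_status foo_timedout_job(struct drm_sched_job *sched_job)
{
	struct drm_gpu_scheduler *sched = sched_job->sched;
	struct foo_device *fdev = to_foo_device(sched);

	if (foo_device_unplugged(fdev))
		/* Job was aborted along with the device; don't rearm
		 * the timeout timer.
		 */
		return DRM_TASK_STATUS_ENODEV;

	/* Standard recovery: stop the scheduler, reset the hardware,
	 * resubmit pending jobs and restart, then ask for more time.
	 */
	drm_sched_stop(sched, sched_job);
	foo_hw_reset(fdev);
	drm_sched_resubmit_jobs(sched);
	drm_sched_start(sched, true);

	return DRM_TASK_STATUS_ALIVE;
}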

* Re: [PATCH v4 01/14] drm/ttm: Remap all page faults to per process dummy page.
  2021-01-18 21:01   ` Andrey Grodzovsky
@ 2021-01-19  8:41     ` Christian König
  -1 siblings, 0 replies; 196+ messages in thread
From: Christian König @ 2021-01-19  8:41 UTC (permalink / raw)
  To: Andrey Grodzovsky, amd-gfx, dri-devel, daniel.vetter, robh,
	l.stach, yuq825, eric
  Cc: Alexander.Deucher, gregkh

On 18.01.21 at 22:01, Andrey Grodzovsky wrote:
> On device removal, reroute all CPU mappings to a dummy page.
>
> v3:
> Remove loop to find DRM file and instead access it
> by vma->vm_file->private_data. Move dummy page installation
> into a separate function.
>
> v4:
> Map the entire BO's VA space to an on-demand allocated dummy page
> on the first fault for that BO.
>
> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> ---
>   drivers/gpu/drm/ttm/ttm_bo_vm.c | 82 ++++++++++++++++++++++++++++++++++++++++-
>   include/drm/ttm/ttm_bo_api.h    |  2 +
>   2 files changed, 83 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c b/drivers/gpu/drm/ttm/ttm_bo_vm.c
> index 6dc96cf..ed89da3 100644
> --- a/drivers/gpu/drm/ttm/ttm_bo_vm.c
> +++ b/drivers/gpu/drm/ttm/ttm_bo_vm.c
> @@ -34,6 +34,8 @@
>   #include <drm/ttm/ttm_bo_driver.h>
>   #include <drm/ttm/ttm_placement.h>
>   #include <drm/drm_vma_manager.h>
> +#include <drm/drm_drv.h>
> +#include <drm/drm_managed.h>
>   #include <linux/mm.h>
>   #include <linux/pfn_t.h>
>   #include <linux/rbtree.h>
> @@ -380,25 +382,103 @@ vm_fault_t ttm_bo_vm_fault_reserved(struct vm_fault *vmf,
>   }
>   EXPORT_SYMBOL(ttm_bo_vm_fault_reserved);
>   
> +static void ttm_bo_release_dummy_page(struct drm_device *dev, void *res)
> +{
> +	struct page *dummy_page = (struct page *)res;
> +
> +	__free_page(dummy_page);
> +}
> +
> +vm_fault_t ttm_bo_vm_dummy_page(struct vm_fault *vmf, pgprot_t prot)
> +{
> +	struct vm_area_struct *vma = vmf->vma;
> +	struct ttm_buffer_object *bo = vma->vm_private_data;
> +	struct ttm_bo_device *bdev = bo->bdev;
> +	struct drm_device *ddev = bo->base.dev;
> +	vm_fault_t ret = VM_FAULT_NOPAGE;
> +	unsigned long address = vma->vm_start;
> +	unsigned long num_prefault = (vma->vm_end - vma->vm_start) >> PAGE_SHIFT;
> +	unsigned long pfn;
> +	struct page *page;
> +	int i;
> +
> +	/*
> +	 * Wait for buffer data in transit, due to a pipelined
> +	 * move.
> +	 */
> +	ret = ttm_bo_vm_fault_idle(bo, vmf);
> +	if (unlikely(ret != 0))
> +		return ret;

This is superfluous and probably quite harmful here because we wait for 
the hardware to do something.

We map a dummy page instead of the real BO content to the whole range 
anyway, so no need to wait for the real BO content to show up.

> +
> +	/* Allocate new dummy page to map all the VA range in this VMA to it*/
> +	page = alloc_page(GFP_KERNEL | __GFP_ZERO);
> +	if (!page)
> +		return VM_FAULT_OOM;
> +
> +	pfn = page_to_pfn(page);
> +
> +	/*
> +	 * Prefault the entire VMA range right away to avoid further faults
> +	 */
> +	for (i = 0; i < num_prefault; ++i) {

Maybe rename the variable to num_pages. I was confused for a moment why 
we still prefault.

Alternatively, you can just drop i and do "for (addr = vma->vm_start;
addr < vma->vm_end; addr += PAGE_SIZE)".
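
Spelled out against the loop above, that alternative would look roughly
like this (a sketch only; the behaviour is unchanged):

	unsigned long addr;

	for (addr = vma->vm_start; addr < vma->vm_end; addr += PAGE_SIZE) {
		if (vma->vm_flags & VM_MIXEDMAP)
			ret = vmf_insert_mixed_prot(vma, addr,
						    __pfn_to_pfn_t(pfn, PFN_DEV),
						    prot);
		else
			ret = vmf_insert_pfn_prot(vma, addr, pfn, prot);

		if (unlikely(ret & VM_FAULT_ERROR))
			break;
	}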

> +
> +		if (unlikely(address >= vma->vm_end))
> +			break;
> +
> +		if (vma->vm_flags & VM_MIXEDMAP)
> +			ret = vmf_insert_mixed_prot(vma, address,
> +						    __pfn_to_pfn_t(pfn, PFN_DEV),
> +						    prot);
> +		else
> +			ret = vmf_insert_pfn_prot(vma, address, pfn, prot);
> +
> +		/* Never error on prefaulted PTEs */
> +		if (unlikely((ret & VM_FAULT_ERROR))) {
> +			if (i == 0)
> +				return VM_FAULT_NOPAGE;
> +			else
> +				break;

This should probably be modified to either always return the error or 
always ignore it.
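
For example, the always-return variant would reduce the check to the
following sketch (note the freshly allocated dummy page would still have
to be released on that path):

		if (unlikely(ret & VM_FAULT_ERROR))
			return ret;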

Apart from that looks good to me.

Christian.

> +		}
> +
> +		address += PAGE_SIZE;
> +	}
> +
> +	/* Set the page to be freed using drmm release action */
> +	if (drmm_add_action_or_reset(ddev, ttm_bo_release_dummy_page, page))
> +		return VM_FAULT_OOM;
> +
> +	return ret;
> +}
> +EXPORT_SYMBOL(ttm_bo_vm_dummy_page);
> +
>   vm_fault_t ttm_bo_vm_fault(struct vm_fault *vmf)
>   {
>   	struct vm_area_struct *vma = vmf->vma;
>   	pgprot_t prot;
>   	struct ttm_buffer_object *bo = vma->vm_private_data;
> +	struct drm_device *ddev = bo->base.dev;
>   	vm_fault_t ret;
> +	int idx;
>   
>   	ret = ttm_bo_vm_reserve(bo, vmf);
>   	if (ret)
>   		return ret;
>   
>   	prot = vma->vm_page_prot;
> -	ret = ttm_bo_vm_fault_reserved(vmf, prot, TTM_BO_VM_NUM_PREFAULT, 1);
> +	if (drm_dev_enter(ddev, &idx)) {
> +		ret = ttm_bo_vm_fault_reserved(vmf, prot, TTM_BO_VM_NUM_PREFAULT, 1);
> +		drm_dev_exit(idx);
> +	} else {
> +		ret = ttm_bo_vm_dummy_page(vmf, prot);
> +	}
>   	if (ret == VM_FAULT_RETRY && !(vmf->flags & FAULT_FLAG_RETRY_NOWAIT))
>   		return ret;
>   
>   	dma_resv_unlock(bo->base.resv);
>   
>   	return ret;
> +
> +	return ret;
>   }
>   EXPORT_SYMBOL(ttm_bo_vm_fault);
>   
> diff --git a/include/drm/ttm/ttm_bo_api.h b/include/drm/ttm/ttm_bo_api.h
> index e17be32..12fb240 100644
> --- a/include/drm/ttm/ttm_bo_api.h
> +++ b/include/drm/ttm/ttm_bo_api.h
> @@ -643,4 +643,6 @@ void ttm_bo_vm_close(struct vm_area_struct *vma);
>   int ttm_bo_vm_access(struct vm_area_struct *vma, unsigned long addr,
>   		     void *buf, int len, int write);
>   
> +vm_fault_t ttm_bo_vm_dummy_page(struct vm_fault *vmf, pgprot_t prot);
> +
>   #endif

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 196+ messages in thread
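
For reference, the unplug guard this patch adds to ttm_bo_vm_fault(),
condensed from the hunk quoted above:

	if (drm_dev_enter(ddev, &idx)) {
		/* Device still present: fault in the real BO pages. */
		ret = ttm_bo_vm_fault_reserved(vmf, prot,
					       TTM_BO_VM_NUM_PREFAULT, 1);
		drm_dev_exit(idx);
	} else {
		/* Device unplugged: reroute the whole VMA to the
		 * per-process dummy page.
		 */
		ret = ttm_bo_vm_dummy_page(vmf, prot);
	}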

* Re: [PATCH v4 04/14] drm/sched: Cancel and flush all outstanding jobs before finish.
  2021-01-18 21:01   ` Andrey Grodzovsky
@ 2021-01-19  8:42     ` Christian König
  -1 siblings, 0 replies; 196+ messages in thread
From: Christian König @ 2021-01-19  8:42 UTC (permalink / raw)
  To: Andrey Grodzovsky, amd-gfx, dri-devel, daniel.vetter, robh,
	l.stach, yuq825, eric
  Cc: Alexander.Deucher, gregkh

This is a bug fix and should probably be pushed separately to drm-misc-next.

Christian.

On 18.01.21 at 22:01, Andrey Grodzovsky wrote:
> To avoid any possible use after free.
>
> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> Reviewed-by: Christian König <christian.koenig@amd.com>
> ---
>   drivers/gpu/drm/scheduler/sched_main.c | 3 +++
>   1 file changed, 3 insertions(+)
>
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> index 997aa15..92637b7 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -899,6 +899,9 @@ void drm_sched_fini(struct drm_gpu_scheduler *sched)
>   	if (sched->thread)
>   		kthread_stop(sched->thread);
>   
> +	/* Confirm no work left behind accessing device structures */
> +	cancel_delayed_work_sync(&sched->work_tdr);
> +
>   	sched->ready = false;
>   }
>   EXPORT_SYMBOL(drm_sched_fini);

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 196+ messages in thread
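
The _sync variant is the point of the fix; an illustrative comment on the
line the patch adds:

	/* A plain cancel_delayed_work() only cancels a timeout handler
	 * that has not started running yet; cancel_delayed_work_sync()
	 * also waits for a handler that is already executing, so no TDR
	 * work can still dereference device structures once it returns.
	 */
	cancel_delayed_work_sync(&sched->work_tdr);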

* Re: [PATCH v4 05/14] drm/amdgpu: Split amdgpu_device_fini into early and late
  2021-01-18 21:01   ` Andrey Grodzovsky
@ 2021-01-19  8:45     ` Christian König
  -1 siblings, 0 replies; 196+ messages in thread
From: Christian König @ 2021-01-19  8:45 UTC (permalink / raw)
  To: Andrey Grodzovsky, amd-gfx, dri-devel, daniel.vetter, robh,
	l.stach, yuq825, eric
  Cc: Alexander.Deucher, gregkh

On 18.01.21 at 22:01, Andrey Grodzovsky wrote:
> Some of the work in amdgpu_device_fini, such as disabling HW
> interrupts and finalizing pending fences, must be done right away on
> pci_remove, while most of the work that relates to finalizing and
> releasing driver data structures can be deferred until the
> drm_driver.release hook is called, i.e. when the last device
> reference is dropped.
>
> v4: Change function prefixes early->hw and late->sw
>
> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>

The fence and irq changes look sane to me, no idea for the rest.

Acked-by: Christian König <christian.koenig@amd.com>
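
The resulting split, condensed from the hunks quoted below (sketch):

/* pci_remove time: quiesce the hardware immediately. */
static void amdgpu_pci_remove(struct pci_dev *pdev)
{
	struct drm_device *dev = pci_get_drvdata(pdev);

	drm_dev_unplug(dev);
	amdgpu_driver_unload_kms(dev);	/* -> amdgpu_device_fini_hw() */
	pci_disable_device(pdev);
}

/* Last drm_device reference dropped: free the software state. */
static void amdgpu_driver_release_kms(struct drm_device *dev)
{
	struct amdgpu_device *adev = drm_to_adev(dev);

	amdgpu_device_fini_sw(adev);
	pci_set_drvdata(adev->pdev, NULL);
}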

> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu.h        |  6 +++++-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 26 ++++++++++++++++++--------
>   drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c    |  7 ++-----
>   drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c  | 15 ++++++++++++++-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c    | 26 ++++++++++++++++----------
>   drivers/gpu/drm/amd/amdgpu/amdgpu_irq.h    |  3 ++-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c    | 12 +++++++++++-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c    |  1 +
>   drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h   |  3 ++-
>   drivers/gpu/drm/amd/amdgpu/cik_ih.c        |  2 +-
>   drivers/gpu/drm/amd/amdgpu/cz_ih.c         |  2 +-
>   drivers/gpu/drm/amd/amdgpu/iceland_ih.c    |  2 +-
>   drivers/gpu/drm/amd/amdgpu/navi10_ih.c     |  2 +-
>   drivers/gpu/drm/amd/amdgpu/si_ih.c         |  2 +-
>   drivers/gpu/drm/amd/amdgpu/tonga_ih.c      |  2 +-
>   drivers/gpu/drm/amd/amdgpu/vega10_ih.c     |  2 +-
>   16 files changed, 78 insertions(+), 35 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> index f77443c..478a7d8 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> @@ -1060,7 +1060,9 @@ static inline struct amdgpu_device *amdgpu_ttm_adev(struct ttm_bo_device *bdev)
>   
>   int amdgpu_device_init(struct amdgpu_device *adev,
>   		       uint32_t flags);
> -void amdgpu_device_fini(struct amdgpu_device *adev);
> +void amdgpu_device_fini_hw(struct amdgpu_device *adev);
> +void amdgpu_device_fini_sw(struct amdgpu_device *adev);
> +
>   int amdgpu_gpu_wait_for_idle(struct amdgpu_device *adev);
>   
>   void amdgpu_device_vram_access(struct amdgpu_device *adev, loff_t pos,
> @@ -1273,6 +1275,8 @@ void amdgpu_driver_lastclose_kms(struct drm_device *dev);
>   int amdgpu_driver_open_kms(struct drm_device *dev, struct drm_file *file_priv);
>   void amdgpu_driver_postclose_kms(struct drm_device *dev,
>   				 struct drm_file *file_priv);
> +void amdgpu_driver_release_kms(struct drm_device *dev);
> +
>   int amdgpu_device_ip_suspend(struct amdgpu_device *adev);
>   int amdgpu_device_suspend(struct drm_device *dev, bool fbcon);
>   int amdgpu_device_resume(struct drm_device *dev, bool fbcon);
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 348ac67..90c8353 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -3579,14 +3579,12 @@ int amdgpu_device_init(struct amdgpu_device *adev,
>    * Tear down the driver info (all asics).
>    * Called at driver shutdown.
>    */
> -void amdgpu_device_fini(struct amdgpu_device *adev)
> +void amdgpu_device_fini_hw(struct amdgpu_device *adev)
>   {
>   	dev_info(adev->dev, "amdgpu: finishing device.\n");
>   	flush_delayed_work(&adev->delayed_init_work);
>   	adev->shutdown = true;
>   
> -	kfree(adev->pci_state);
> -
>   	/* make sure IB test finished before entering exclusive mode
>   	 * to avoid preemption on IB test
>   	 * */
> @@ -3603,11 +3601,24 @@ void amdgpu_device_fini(struct amdgpu_device *adev)
>   		else
>   			drm_atomic_helper_shutdown(adev_to_drm(adev));
>   	}
> -	amdgpu_fence_driver_fini(adev);
> +	amdgpu_fence_driver_fini_hw(adev);
> +
>   	if (adev->pm_sysfs_en)
>   		amdgpu_pm_sysfs_fini(adev);
> +	if (adev->ucode_sysfs_en)
> +		amdgpu_ucode_sysfs_fini(adev);
> +	sysfs_remove_files(&adev->dev->kobj, amdgpu_dev_attributes);
> +
> +
>   	amdgpu_fbdev_fini(adev);
> +
> +	amdgpu_irq_fini_hw(adev);
> +}
> +
> +void amdgpu_device_fini_sw(struct amdgpu_device *adev)
> +{
>   	amdgpu_device_ip_fini(adev);
> +	amdgpu_fence_driver_fini_sw(adev);
>   	release_firmware(adev->firmware.gpu_info_fw);
>   	adev->firmware.gpu_info_fw = NULL;
>   	adev->accel_working = false;
> @@ -3636,14 +3647,13 @@ void amdgpu_device_fini(struct amdgpu_device *adev)
>   	adev->rmmio = NULL;
>   	amdgpu_device_doorbell_fini(adev);
>   
> -	if (adev->ucode_sysfs_en)
> -		amdgpu_ucode_sysfs_fini(adev);
> -
> -	sysfs_remove_files(&adev->dev->kobj, amdgpu_dev_attributes);
>   	if (IS_ENABLED(CONFIG_PERF_EVENTS))
>   		amdgpu_pmu_fini(adev);
>   	if (adev->mman.discovery_bin)
>   		amdgpu_discovery_fini(adev);
> +
> +	kfree(adev->pci_state);
> +
>   }
>   
>   
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> index 72efd57..9c0cd00 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> @@ -1238,14 +1238,10 @@ amdgpu_pci_remove(struct pci_dev *pdev)
>   {
>   	struct drm_device *dev = pci_get_drvdata(pdev);
>   
> -#ifdef MODULE
> -	if (THIS_MODULE->state != MODULE_STATE_GOING)
> -#endif
> -		DRM_ERROR("Hotplug removal is not supported\n");
>   	drm_dev_unplug(dev);
>   	amdgpu_driver_unload_kms(dev);
> +
>   	pci_disable_device(pdev);
> -	pci_set_drvdata(pdev, NULL);
>   }
>   
>   static void
> @@ -1569,6 +1565,7 @@ static const struct drm_driver amdgpu_kms_driver = {
>   	.dumb_create = amdgpu_mode_dumb_create,
>   	.dumb_map_offset = amdgpu_mode_dumb_mmap,
>   	.fops = &amdgpu_driver_kms_fops,
> +	.release = &amdgpu_driver_release_kms,
>   
>   	.prime_handle_to_fd = drm_gem_prime_handle_to_fd,
>   	.prime_fd_to_handle = drm_gem_prime_fd_to_handle,
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
> index d56f402..e19b74c 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
> @@ -523,7 +523,7 @@ int amdgpu_fence_driver_init(struct amdgpu_device *adev)
>    *
>    * Tear down the fence driver for all possible rings (all asics).
>    */
> -void amdgpu_fence_driver_fini(struct amdgpu_device *adev)
> +void amdgpu_fence_driver_fini_hw(struct amdgpu_device *adev)
>   {
>   	unsigned i, j;
>   	int r;
> @@ -544,6 +544,19 @@ void amdgpu_fence_driver_fini(struct amdgpu_device *adev)
>   		if (!ring->no_scheduler)
>   			drm_sched_fini(&ring->sched);
>   		del_timer_sync(&ring->fence_drv.fallback_timer);
> +	}
> +}
> +
> +void amdgpu_fence_driver_fini_sw(struct amdgpu_device *adev)
> +{
> +	unsigned int i, j;
> +
> +	for (i = 0; i < AMDGPU_MAX_RINGS; i++) {
> +		struct amdgpu_ring *ring = adev->rings[i];
> +
> +		if (!ring || !ring->fence_drv.initialized)
> +			continue;
> +
>   		for (j = 0; j <= ring->fence_drv.num_fences_mask; ++j)
>   			dma_fence_put(ring->fence_drv.fences[j]);
>   		kfree(ring->fence_drv.fences);
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
> index bea57e8..2f1cfc5 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
> @@ -49,6 +49,7 @@
>   #include <drm/drm_irq.h>
>   #include <drm/drm_vblank.h>
>   #include <drm/amdgpu_drm.h>
> +#include <drm/drm_drv.h>
>   #include "amdgpu.h"
>   #include "amdgpu_ih.h"
>   #include "atom.h"
> @@ -313,6 +314,20 @@ int amdgpu_irq_init(struct amdgpu_device *adev)
>   	return 0;
>   }
>   
> +
> +void amdgpu_irq_fini_hw(struct amdgpu_device *adev)
> +{
> +	if (adev->irq.installed) {
> +		drm_irq_uninstall(&adev->ddev);
> +		adev->irq.installed = false;
> +		if (adev->irq.msi_enabled)
> +			pci_free_irq_vectors(adev->pdev);
> +
> +		if (!amdgpu_device_has_dc_support(adev))
> +			flush_work(&adev->hotplug_work);
> +	}
> +}
> +
>   /**
>    * amdgpu_irq_fini - shut down interrupt handling
>    *
> @@ -322,19 +337,10 @@ int amdgpu_irq_init(struct amdgpu_device *adev)
>    * functionality, shuts down vblank, hotplug and reset interrupt handling,
>    * turns off interrupts from all sources (all ASICs).
>    */
> -void amdgpu_irq_fini(struct amdgpu_device *adev)
> +void amdgpu_irq_fini_sw(struct amdgpu_device *adev)
>   {
>   	unsigned i, j;
>   
> -	if (adev->irq.installed) {
> -		drm_irq_uninstall(adev_to_drm(adev));
> -		adev->irq.installed = false;
> -		if (adev->irq.msi_enabled)
> -			pci_free_irq_vectors(adev->pdev);
> -		if (!amdgpu_device_has_dc_support(adev))
> -			flush_work(&adev->hotplug_work);
> -	}
> -
>   	for (i = 0; i < AMDGPU_IRQ_CLIENTID_MAX; ++i) {
>   		if (!adev->irq.client[i].sources)
>   			continue;
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.h
> index ac527e5..392a732 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.h
> @@ -104,7 +104,8 @@ void amdgpu_irq_disable_all(struct amdgpu_device *adev);
>   irqreturn_t amdgpu_irq_handler(int irq, void *arg);
>   
>   int amdgpu_irq_init(struct amdgpu_device *adev);
> -void amdgpu_irq_fini(struct amdgpu_device *adev);
> +void amdgpu_irq_fini_sw(struct amdgpu_device *adev);
> +void amdgpu_irq_fini_hw(struct amdgpu_device *adev);
>   int amdgpu_irq_add_id(struct amdgpu_device *adev,
>   		      unsigned client_id, unsigned src_id,
>   		      struct amdgpu_irq_src *source);
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
> index b16b327..fee95d3 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
> @@ -29,6 +29,7 @@
>   #include "amdgpu.h"
>   #include <drm/drm_debugfs.h>
>   #include <drm/amdgpu_drm.h>
> +#include <drm/drm_drv.h>
>   #include "amdgpu_uvd.h"
>   #include "amdgpu_vce.h"
>   #include "atom.h"
> @@ -93,7 +94,7 @@ void amdgpu_driver_unload_kms(struct drm_device *dev)
>   	}
>   
>   	amdgpu_acpi_fini(adev);
> -	amdgpu_device_fini(adev);
> +	amdgpu_device_fini_hw(adev);
>   }
>   
>   void amdgpu_register_gpu_instance(struct amdgpu_device *adev)
> @@ -1153,6 +1154,15 @@ void amdgpu_driver_postclose_kms(struct drm_device *dev,
>   	pm_runtime_put_autosuspend(dev->dev);
>   }
>   
> +
> +void amdgpu_driver_release_kms(struct drm_device *dev)
> +{
> +	struct amdgpu_device *adev = drm_to_adev(dev);
> +
> +	amdgpu_device_fini_sw(adev);
> +	pci_set_drvdata(adev->pdev, NULL);
> +}
> +
>   /*
>    * VBlank related functions.
>    */
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> index c136bd4..87eaf13 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> @@ -2142,6 +2142,7 @@ int amdgpu_ras_pre_fini(struct amdgpu_device *adev)
>   	if (!con)
>   		return 0;
>   
> +
>   	/* Need disable ras on all IPs here before ip [hw/sw]fini */
>   	amdgpu_ras_disable_all_features(adev, 0);
>   	amdgpu_ras_recovery_fini(adev);
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
> index 7112137..accb243 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
> @@ -107,7 +107,8 @@ struct amdgpu_fence_driver {
>   };
>   
>   int amdgpu_fence_driver_init(struct amdgpu_device *adev);
> -void amdgpu_fence_driver_fini(struct amdgpu_device *adev);
> +void amdgpu_fence_driver_fini_hw(struct amdgpu_device *adev);
> +void amdgpu_fence_driver_fini_sw(struct amdgpu_device *adev);
>   void amdgpu_fence_driver_force_completion(struct amdgpu_ring *ring);
>   
>   int amdgpu_fence_driver_init_ring(struct amdgpu_ring *ring,
> diff --git a/drivers/gpu/drm/amd/amdgpu/cik_ih.c b/drivers/gpu/drm/amd/amdgpu/cik_ih.c
> index d374571..183d44a 100644
> --- a/drivers/gpu/drm/amd/amdgpu/cik_ih.c
> +++ b/drivers/gpu/drm/amd/amdgpu/cik_ih.c
> @@ -309,7 +309,7 @@ static int cik_ih_sw_fini(void *handle)
>   {
>   	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
>   
> -	amdgpu_irq_fini(adev);
> +	amdgpu_irq_fini_sw(adev);
>   	amdgpu_ih_ring_fini(adev, &adev->irq.ih);
>   	amdgpu_irq_remove_domain(adev);
>   
> diff --git a/drivers/gpu/drm/amd/amdgpu/cz_ih.c b/drivers/gpu/drm/amd/amdgpu/cz_ih.c
> index da37f8a..ee824d7 100644
> --- a/drivers/gpu/drm/amd/amdgpu/cz_ih.c
> +++ b/drivers/gpu/drm/amd/amdgpu/cz_ih.c
> @@ -290,7 +290,7 @@ static int cz_ih_sw_fini(void *handle)
>   {
>   	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
>   
> -	amdgpu_irq_fini(adev);
> +	amdgpu_irq_fini_sw(adev);
>   	amdgpu_ih_ring_fini(adev, &adev->irq.ih);
>   	amdgpu_irq_remove_domain(adev);
>   
> diff --git a/drivers/gpu/drm/amd/amdgpu/iceland_ih.c b/drivers/gpu/drm/amd/amdgpu/iceland_ih.c
> index 37d8b6c..b24f6fb 100644
> --- a/drivers/gpu/drm/amd/amdgpu/iceland_ih.c
> +++ b/drivers/gpu/drm/amd/amdgpu/iceland_ih.c
> @@ -290,7 +290,7 @@ static int iceland_ih_sw_fini(void *handle)
>   {
>   	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
>   
> -	amdgpu_irq_fini(adev);
> +	amdgpu_irq_fini_sw(adev);
>   	amdgpu_ih_ring_fini(adev, &adev->irq.ih);
>   	amdgpu_irq_remove_domain(adev);
>   
> diff --git a/drivers/gpu/drm/amd/amdgpu/navi10_ih.c b/drivers/gpu/drm/amd/amdgpu/navi10_ih.c
> index 7ba229e..c191410 100644
> --- a/drivers/gpu/drm/amd/amdgpu/navi10_ih.c
> +++ b/drivers/gpu/drm/amd/amdgpu/navi10_ih.c
> @@ -716,7 +716,7 @@ static int navi10_ih_sw_fini(void *handle)
>   {
>   	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
>   
> -	amdgpu_irq_fini(adev);
> +	amdgpu_irq_fini_sw(adev);
>   	amdgpu_ih_ring_fini(adev, &adev->irq.ih2);
>   	amdgpu_ih_ring_fini(adev, &adev->irq.ih1);
>   	amdgpu_ih_ring_fini(adev, &adev->irq.ih);
> diff --git a/drivers/gpu/drm/amd/amdgpu/si_ih.c b/drivers/gpu/drm/amd/amdgpu/si_ih.c
> index 51880f6..751307f 100644
> --- a/drivers/gpu/drm/amd/amdgpu/si_ih.c
> +++ b/drivers/gpu/drm/amd/amdgpu/si_ih.c
> @@ -175,7 +175,7 @@ static int si_ih_sw_fini(void *handle)
>   {
>   	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
>   
> -	amdgpu_irq_fini(adev);
> +	amdgpu_irq_fini_sw(adev);
>   	amdgpu_ih_ring_fini(adev, &adev->irq.ih);
>   
>   	return 0;
> diff --git a/drivers/gpu/drm/amd/amdgpu/tonga_ih.c b/drivers/gpu/drm/amd/amdgpu/tonga_ih.c
> index ce33199..729aaaa 100644
> --- a/drivers/gpu/drm/amd/amdgpu/tonga_ih.c
> +++ b/drivers/gpu/drm/amd/amdgpu/tonga_ih.c
> @@ -301,7 +301,7 @@ static int tonga_ih_sw_fini(void *handle)
>   {
>   	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
>   
> -	amdgpu_irq_fini(adev);
> +	amdgpu_irq_fini_sw(adev);
>   	amdgpu_ih_ring_fini(adev, &adev->irq.ih);
>   	amdgpu_irq_remove_domain(adev);
>   
> diff --git a/drivers/gpu/drm/amd/amdgpu/vega10_ih.c b/drivers/gpu/drm/amd/amdgpu/vega10_ih.c
> index e5ae31e..a342406 100644
> --- a/drivers/gpu/drm/amd/amdgpu/vega10_ih.c
> +++ b/drivers/gpu/drm/amd/amdgpu/vega10_ih.c
> @@ -627,7 +627,7 @@ static int vega10_ih_sw_fini(void *handle)
>   {
>   	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
>   
> -	amdgpu_irq_fini(adev);
> +	amdgpu_irq_fini_sw(adev);
>   	amdgpu_ih_ring_fini(adev, &adev->irq.ih2);
>   	amdgpu_ih_ring_fini(adev, &adev->irq.ih1);
>   	amdgpu_ih_ring_fini(adev, &adev->irq.ih);

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 196+ messages in thread

>    */
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> index c136bd4..87eaf13 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> @@ -2142,6 +2142,7 @@ int amdgpu_ras_pre_fini(struct amdgpu_device *adev)
>   	if (!con)
>   		return 0;
>   
> +
>   	/* Need disable ras on all IPs here before ip [hw/sw]fini */
>   	amdgpu_ras_disable_all_features(adev, 0);
>   	amdgpu_ras_recovery_fini(adev);
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
> index 7112137..accb243 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
> @@ -107,7 +107,8 @@ struct amdgpu_fence_driver {
>   };
>   
>   int amdgpu_fence_driver_init(struct amdgpu_device *adev);
> -void amdgpu_fence_driver_fini(struct amdgpu_device *adev);
> +void amdgpu_fence_driver_fini_hw(struct amdgpu_device *adev);
> +void amdgpu_fence_driver_fini_sw(struct amdgpu_device *adev);
>   void amdgpu_fence_driver_force_completion(struct amdgpu_ring *ring);
>   
>   int amdgpu_fence_driver_init_ring(struct amdgpu_ring *ring,
> diff --git a/drivers/gpu/drm/amd/amdgpu/cik_ih.c b/drivers/gpu/drm/amd/amdgpu/cik_ih.c
> index d374571..183d44a 100644
> --- a/drivers/gpu/drm/amd/amdgpu/cik_ih.c
> +++ b/drivers/gpu/drm/amd/amdgpu/cik_ih.c
> @@ -309,7 +309,7 @@ static int cik_ih_sw_fini(void *handle)
>   {
>   	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
>   
> -	amdgpu_irq_fini(adev);
> +	amdgpu_irq_fini_sw(adev);
>   	amdgpu_ih_ring_fini(adev, &adev->irq.ih);
>   	amdgpu_irq_remove_domain(adev);
>   
> diff --git a/drivers/gpu/drm/amd/amdgpu/cz_ih.c b/drivers/gpu/drm/amd/amdgpu/cz_ih.c
> index da37f8a..ee824d7 100644
> --- a/drivers/gpu/drm/amd/amdgpu/cz_ih.c
> +++ b/drivers/gpu/drm/amd/amdgpu/cz_ih.c
> @@ -290,7 +290,7 @@ static int cz_ih_sw_fini(void *handle)
>   {
>   	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
>   
> -	amdgpu_irq_fini(adev);
> +	amdgpu_irq_fini_sw(adev);
>   	amdgpu_ih_ring_fini(adev, &adev->irq.ih);
>   	amdgpu_irq_remove_domain(adev);
>   
> diff --git a/drivers/gpu/drm/amd/amdgpu/iceland_ih.c b/drivers/gpu/drm/amd/amdgpu/iceland_ih.c
> index 37d8b6c..b24f6fb 100644
> --- a/drivers/gpu/drm/amd/amdgpu/iceland_ih.c
> +++ b/drivers/gpu/drm/amd/amdgpu/iceland_ih.c
> @@ -290,7 +290,7 @@ static int iceland_ih_sw_fini(void *handle)
>   {
>   	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
>   
> -	amdgpu_irq_fini(adev);
> +	amdgpu_irq_fini_sw(adev);
>   	amdgpu_ih_ring_fini(adev, &adev->irq.ih);
>   	amdgpu_irq_remove_domain(adev);
>   
> diff --git a/drivers/gpu/drm/amd/amdgpu/navi10_ih.c b/drivers/gpu/drm/amd/amdgpu/navi10_ih.c
> index 7ba229e..c191410 100644
> --- a/drivers/gpu/drm/amd/amdgpu/navi10_ih.c
> +++ b/drivers/gpu/drm/amd/amdgpu/navi10_ih.c
> @@ -716,7 +716,7 @@ static int navi10_ih_sw_fini(void *handle)
>   {
>   	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
>   
> -	amdgpu_irq_fini(adev);
> +	amdgpu_irq_fini_sw(adev);
>   	amdgpu_ih_ring_fini(adev, &adev->irq.ih2);
>   	amdgpu_ih_ring_fini(adev, &adev->irq.ih1);
>   	amdgpu_ih_ring_fini(adev, &adev->irq.ih);
> diff --git a/drivers/gpu/drm/amd/amdgpu/si_ih.c b/drivers/gpu/drm/amd/amdgpu/si_ih.c
> index 51880f6..751307f 100644
> --- a/drivers/gpu/drm/amd/amdgpu/si_ih.c
> +++ b/drivers/gpu/drm/amd/amdgpu/si_ih.c
> @@ -175,7 +175,7 @@ static int si_ih_sw_fini(void *handle)
>   {
>   	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
>   
> -	amdgpu_irq_fini(adev);
> +	amdgpu_irq_fini_sw(adev);
>   	amdgpu_ih_ring_fini(adev, &adev->irq.ih);
>   
>   	return 0;
> diff --git a/drivers/gpu/drm/amd/amdgpu/tonga_ih.c b/drivers/gpu/drm/amd/amdgpu/tonga_ih.c
> index ce33199..729aaaa 100644
> --- a/drivers/gpu/drm/amd/amdgpu/tonga_ih.c
> +++ b/drivers/gpu/drm/amd/amdgpu/tonga_ih.c
> @@ -301,7 +301,7 @@ static int tonga_ih_sw_fini(void *handle)
>   {
>   	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
>   
> -	amdgpu_irq_fini(adev);
> +	amdgpu_irq_fini_sw(adev);
>   	amdgpu_ih_ring_fini(adev, &adev->irq.ih);
>   	amdgpu_irq_remove_domain(adev);
>   
> diff --git a/drivers/gpu/drm/amd/amdgpu/vega10_ih.c b/drivers/gpu/drm/amd/amdgpu/vega10_ih.c
> index e5ae31e..a342406 100644
> --- a/drivers/gpu/drm/amd/amdgpu/vega10_ih.c
> +++ b/drivers/gpu/drm/amd/amdgpu/vega10_ih.c
> @@ -627,7 +627,7 @@ static int vega10_ih_sw_fini(void *handle)
>   {
>   	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
>   
> -	amdgpu_irq_fini(adev);
> +	amdgpu_irq_fini_sw(adev);
>   	amdgpu_ih_ring_fini(adev, &adev->irq.ih2);
>   	amdgpu_ih_ring_fini(adev, &adev->irq.ih1);
>   	amdgpu_ih_ring_fini(adev, &adev->irq.ih);
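
Editor's note: the hunks above split device teardown in two: a hardware
part that runs at PCI remove time and a software part that is deferred
until the last reference to the drm_device drops. A minimal sketch of the
resulting call flow, reusing the function names from the patch but with
the bodies reduced to comments (a sketch only, not the complete driver
code):

static void amdgpu_pci_remove(struct pci_dev *pdev)
{
	struct drm_device *dev = pci_get_drvdata(pdev);

	drm_dev_unplug(dev);		/* mark the device as gone */
	amdgpu_driver_unload_kms(dev);	/* -> amdgpu_device_fini_hw():
					 * IRQs, sysfs, fbdev, fence HW */
	pci_disable_device(pdev);
}

/* Invoked by the DRM core only once the last drm_device reference is
 * dropped, i.e. after every user-space client has closed its fd. */
static void amdgpu_driver_release_kms(struct drm_device *dev)
{
	struct amdgpu_device *adev = drm_to_adev(dev);

	amdgpu_device_fini_sw(adev);	/* IP blocks, fence SW state,
					 * MMIO unmap, saved PCI state */
	pci_set_drvdata(adev->pdev, NULL);
}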


* Re: [PATCH v4 07/14] drm/amdgpu: Register IOMMU topology notifier per device.
  2021-01-18 21:01   ` Andrey Grodzovsky
@ 2021-01-19  8:48     ` Christian König
  -1 siblings, 0 replies; 196+ messages in thread
From: Christian König @ 2021-01-19  8:48 UTC (permalink / raw)
  To: Andrey Grodzovsky, amd-gfx, dri-devel, daniel.vetter, robh,
	l.stach, yuq825, eric
  Cc: Alexander.Deucher, gregkh

On 18.01.21 at 22:01, Andrey Grodzovsky wrote:
> Handle all DMA IOMMU group related dependencies before the
> group is removed.
>
> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu.h        |  5 ++++
>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 46 ++++++++++++++++++++++++++++++
>   drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c   |  2 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h   |  1 +
>   drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 10 +++++++
>   drivers/gpu/drm/amd/amdgpu/amdgpu_object.h |  2 ++
>   6 files changed, 65 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> index 478a7d8..2953420 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> @@ -51,6 +51,7 @@
>   #include <linux/dma-fence.h>
>   #include <linux/pci.h>
>   #include <linux/aer.h>
> +#include <linux/notifier.h>
>   
>   #include <drm/ttm/ttm_bo_api.h>
>   #include <drm/ttm/ttm_bo_driver.h>
> @@ -1041,6 +1042,10 @@ struct amdgpu_device {
>   
>   	bool                            in_pci_err_recovery;
>   	struct pci_saved_state          *pci_state;
> +
> +	struct notifier_block		nb;
> +	struct blocking_notifier_head	notifier;
> +	struct list_head		device_bo_list;
>   };
>   
>   static inline struct amdgpu_device *drm_to_adev(struct drm_device *ddev)
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 45e23e3..e99f4f1 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -70,6 +70,8 @@
>   #include <drm/task_barrier.h>
>   #include <linux/pm_runtime.h>
>   
> +#include <linux/iommu.h>
> +
>   MODULE_FIRMWARE("amdgpu/vega10_gpu_info.bin");
>   MODULE_FIRMWARE("amdgpu/vega12_gpu_info.bin");
>   MODULE_FIRMWARE("amdgpu/raven_gpu_info.bin");
> @@ -3200,6 +3202,39 @@ static const struct attribute *amdgpu_dev_attributes[] = {
>   };
>   
>   
> +static int amdgpu_iommu_group_notifier(struct notifier_block *nb,
> +				     unsigned long action, void *data)
> +{
> +	struct amdgpu_device *adev = container_of(nb, struct amdgpu_device, nb);
> +	struct amdgpu_bo *bo = NULL;
> +
> +	/*
> +	 * Following is a set of IOMMU group dependencies taken care of before
> +	 * device's IOMMU group is removed
> +	 */
> +	if (action == IOMMU_GROUP_NOTIFY_DEL_DEVICE) {
> +
> +		spin_lock(&ttm_bo_glob.lru_lock);
> +		list_for_each_entry(bo, &adev->device_bo_list, bo) {
> +			if (bo->tbo.ttm)
> +				ttm_tt_unpopulate(bo->tbo.bdev, bo->tbo.ttm);
> +		}
> +		spin_unlock(&ttm_bo_glob.lru_lock);

That approach won't work. ttm_tt_unpopulate() might sleep on an IOMMU lock.

You need to use a mutex here or, even better, make sure you can access the 
device_bo_list without a lock at this point.

Christian.
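
A minimal sketch of that suggestion, assuming a hypothetical sleepable
lock (device_bo_list_lock, a mutex added to amdgpu_device that would also
have to be taken wherever BOs are added to or removed from the list):

static int amdgpu_iommu_group_notifier(struct notifier_block *nb,
				       unsigned long action, void *data)
{
	struct amdgpu_device *adev = container_of(nb, struct amdgpu_device, nb);
	struct amdgpu_bo *bo;

	if (action != IOMMU_GROUP_NOTIFY_DEL_DEVICE)
		return NOTIFY_OK;

	/* A mutex makes the sleeping ttm_tt_unpopulate() call legal,
	 * unlike the ttm_bo_glob.lru_lock spinlock used in the patch. */
	mutex_lock(&adev->device_bo_list_lock);
	list_for_each_entry(bo, &adev->device_bo_list, bo) {
		if (bo->tbo.ttm)
			ttm_tt_unpopulate(bo->tbo.bdev, bo->tbo.ttm);
	}
	mutex_unlock(&adev->device_bo_list_lock);

	return NOTIFY_OK;
}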

> +
> +		if (adev->irq.ih.use_bus_addr)
> +			amdgpu_ih_ring_fini(adev, &adev->irq.ih);
> +		if (adev->irq.ih1.use_bus_addr)
> +			amdgpu_ih_ring_fini(adev, &adev->irq.ih1);
> +		if (adev->irq.ih2.use_bus_addr)
> +			amdgpu_ih_ring_fini(adev, &adev->irq.ih2);
> +
> +		amdgpu_gart_dummy_page_fini(adev);
> +	}
> +
> +	return NOTIFY_OK;
> +}
> +
> +
>   /**
>    * amdgpu_device_init - initialize the driver
>    *
> @@ -3304,6 +3339,8 @@ int amdgpu_device_init(struct amdgpu_device *adev,
>   
>   	INIT_WORK(&adev->xgmi_reset_work, amdgpu_device_xgmi_reset_func);
>   
> +	INIT_LIST_HEAD(&adev->device_bo_list);
> +
>   	adev->gfx.gfx_off_req_count = 1;
>   	adev->pm.ac_power = power_supply_is_system_supplied() > 0;
>   
> @@ -3575,6 +3612,15 @@ int amdgpu_device_init(struct amdgpu_device *adev,
>   	if (amdgpu_device_cache_pci_state(adev->pdev))
>   		pci_restore_state(pdev);
>   
> +	BLOCKING_INIT_NOTIFIER_HEAD(&adev->notifier);
> +	adev->nb.notifier_call = amdgpu_iommu_group_notifier;
> +
> +	if (adev->dev->iommu_group) {
> +		r = iommu_group_register_notifier(adev->dev->iommu_group, &adev->nb);
> +		if (r)
> +			goto failed;
> +	}
> +
>   	return 0;
>   
>   failed:
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
> index 0db9330..486ad6d 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
> @@ -92,7 +92,7 @@ static int amdgpu_gart_dummy_page_init(struct amdgpu_device *adev)
>    *
>    * Frees the dummy page used by the driver (all asics).
>    */
> -static void amdgpu_gart_dummy_page_fini(struct amdgpu_device *adev)
> +void amdgpu_gart_dummy_page_fini(struct amdgpu_device *adev)
>   {
>   	if (!adev->dummy_page_addr)
>   		return;
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h
> index afa2e28..5678d9c 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h
> @@ -61,6 +61,7 @@ int amdgpu_gart_table_vram_pin(struct amdgpu_device *adev);
>   void amdgpu_gart_table_vram_unpin(struct amdgpu_device *adev);
>   int amdgpu_gart_init(struct amdgpu_device *adev);
>   void amdgpu_gart_fini(struct amdgpu_device *adev);
> +void amdgpu_gart_dummy_page_fini(struct amdgpu_device *adev);
>   int amdgpu_gart_unbind(struct amdgpu_device *adev, uint64_t offset,
>   		       int pages);
>   int amdgpu_gart_map(struct amdgpu_device *adev, uint64_t offset,
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> index 6cc9919..4a1de69 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> @@ -94,6 +94,10 @@ static void amdgpu_bo_destroy(struct ttm_buffer_object *tbo)
>   	}
>   	amdgpu_bo_unref(&bo->parent);
>   
> +	spin_lock(&ttm_bo_glob.lru_lock);
> +	list_del(&bo->bo);
> +	spin_unlock(&ttm_bo_glob.lru_lock);
> +
>   	kfree(bo->metadata);
>   	kfree(bo);
>   }
> @@ -613,6 +617,12 @@ static int amdgpu_bo_do_create(struct amdgpu_device *adev,
>   	if (bp->type == ttm_bo_type_device)
>   		bo->flags &= ~AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED;
>   
> +	INIT_LIST_HEAD(&bo->bo);
> +
> +	spin_lock(&ttm_bo_glob.lru_lock);
> +	list_add_tail(&bo->bo, &adev->device_bo_list);
> +	spin_unlock(&ttm_bo_glob.lru_lock);
> +
>   	return 0;
>   
>   fail_unreserve:
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
> index 9ac3756..5ae8555 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
> @@ -110,6 +110,8 @@ struct amdgpu_bo {
>   	struct list_head		shadow_list;
>   
>   	struct kgd_mem                  *kfd_bo;
> +
> +	struct list_head		bo;
>   };
>   
>   static inline struct amdgpu_bo *ttm_to_amdgpu_bo(struct ttm_buffer_object *tbo)

* Re: [PATCH v4 08/14] drm/amdgpu: Fix a bunch of sdma code crash post device unplug
  2021-01-18 21:01   ` Andrey Grodzovsky
@ 2021-01-19  8:51     ` Christian König
  -1 siblings, 0 replies; 196+ messages in thread
From: Christian König @ 2021-01-19  8:51 UTC (permalink / raw)
  To: Andrey Grodzovsky, amd-gfx, dri-devel, daniel.vetter, robh,
	l.stach, yuq825, eric
  Cc: Alexander.Deucher, gregkh

On 18.01.21 at 22:01, Andrey Grodzovsky wrote:
> We can't allocate and submit IBs post device unplug.
>
> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 8 +++++++-
>   1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> index ad91c0c..5096351 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> @@ -31,6 +31,7 @@
>   #include <linux/dma-buf.h>
>   
>   #include <drm/amdgpu_drm.h>
> +#include <drm/drm_drv.h>
>   #include "amdgpu.h"
>   #include "amdgpu_trace.h"
>   #include "amdgpu_amdkfd.h"
> @@ -1604,7 +1605,10 @@ static int amdgpu_vm_bo_update_mapping(struct amdgpu_device *adev,
>   	struct amdgpu_vm_update_params params;
>   	enum amdgpu_sync_mode sync_mode;
>   	uint64_t pfn;
> -	int r;
> +	int r, idx;
> +
> +	if (!drm_dev_enter(&adev->ddev, &idx))
> +		return -ENOENT;

Why not -ENODEV?

>   
>   	memset(&params, 0, sizeof(params));
>   	params.adev = adev;
> @@ -1647,6 +1651,8 @@ static int amdgpu_vm_bo_update_mapping(struct amdgpu_device *adev,
>   	if (r)
>   		goto error_unlock;
>   
> +
> +	drm_dev_exit(idx);

That's too early. You probably need to do this much further below, after 
the commit.

Christian.
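
A sketch with both comments folded in, i.e. returning -ENODEV and keeping
the drm_dev_enter()/drm_dev_exit() critical section open until the page
table update has actually been committed. Parameters and body are
abbreviated, so this shows the intended shape rather than the real
function:

static int amdgpu_vm_bo_update_mapping_sketch(struct amdgpu_device *adev)
{
	int r = 0, idx;

	if (!drm_dev_enter(&adev->ddev, &idx))
		return -ENODEV;	/* per review: device is gone */

	/*
	 * ... set up params, sync to fences, walk the mapping and
	 * write the page table entries ...
	 *
	 * r = vm->update_funcs->commit(&params, fence);
	 */

	drm_dev_exit(idx);	/* per review: only after the commit */
	return r;
}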

>   	do {
>   		uint64_t tmp, num_entries, addr;
>   

* Re: [PATCH v4 09/14] drm/amdgpu: Remap all page faults to per process dummy page.
  2021-01-18 21:01   ` Andrey Grodzovsky
@ 2021-01-19  8:52     ` Christian König
  -1 siblings, 0 replies; 196+ messages in thread
From: Christian König @ 2021-01-19  8:52 UTC (permalink / raw)
  To: Andrey Grodzovsky, amd-gfx, dri-devel, daniel.vetter, robh,
	l.stach, yuq825, eric
  Cc: Alexander.Deucher, gregkh

On 18.01.21 at 22:01, Andrey Grodzovsky wrote:
> On device removal reroute all CPU mappings to a dummy page
> per drm_file instance or imported GEM object.
>
> v4:
> Update for modified ttm_bo_vm_dummy_page
>
> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>

Reviewed-by: Christian König <christian.koenig@amd.com>

> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 21 ++++++++++++++++-----
>   1 file changed, 16 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> index 9fd2157..550dc5e 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> @@ -49,6 +49,7 @@
>   
>   #include <drm/drm_debugfs.h>
>   #include <drm/amdgpu_drm.h>
> +#include <drm/drm_drv.h>
>   
>   #include "amdgpu.h"
>   #include "amdgpu_object.h"
> @@ -1982,18 +1983,28 @@ void amdgpu_ttm_set_buffer_funcs_status(struct amdgpu_device *adev, bool enable)
>   static vm_fault_t amdgpu_ttm_fault(struct vm_fault *vmf)
>   {
>   	struct ttm_buffer_object *bo = vmf->vma->vm_private_data;
> +	struct drm_device *ddev = bo->base.dev;
>   	vm_fault_t ret;
> +	int idx;
>   
>   	ret = ttm_bo_vm_reserve(bo, vmf);
>   	if (ret)
>   		return ret;
>   
> -	ret = amdgpu_bo_fault_reserve_notify(bo);
> -	if (ret)
> -		goto unlock;
> +	if (drm_dev_enter(ddev, &idx)) {
> +		ret = amdgpu_bo_fault_reserve_notify(bo);
> +		if (ret) {
> +			drm_dev_exit(idx);
> +			goto unlock;
> +		}
>   
> -	ret = ttm_bo_vm_fault_reserved(vmf, vmf->vma->vm_page_prot,
> -				       TTM_BO_VM_NUM_PREFAULT, 1);
> +		 ret = ttm_bo_vm_fault_reserved(vmf, vmf->vma->vm_page_prot,
> +						TTM_BO_VM_NUM_PREFAULT, 1);
> +
> +		 drm_dev_exit(idx);
> +	} else {
> +		ret = ttm_bo_vm_dummy_page(vmf, vmf->vma->vm_page_prot);
> +	}
>   	if (ret == VM_FAULT_RETRY && !(vmf->flags & FAULT_FLAG_RETRY_NOWAIT))
>   		return ret;
>   
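
For context, the ttm_bo_vm_dummy_page() helper used above is introduced
earlier in this series. A greatly simplified illustration of the idea,
not the actual TTM implementation, with the dummy page's allocation and
lifetime management omitted:

static vm_fault_t dummy_page_fault_sketch(struct vm_fault *vmf,
					  struct page *dummy_page)
{
	/* Back the faulting address with a shared, zero-filled page so
	 * the client keeps running instead of taking SIGBUS; writes end
	 * up discarded into the dummy page. */
	return vmf_insert_pfn(vmf->vma, vmf->address,
			      page_to_pfn(dummy_page));
}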

* Re: [PATCH v4 10/14] drm/amdgpu: Move some sysfs attrs creation to default_attr
  2021-01-18 21:01   ` Andrey Grodzovsky
@ 2021-01-19  8:53     ` Christian König
  -1 siblings, 0 replies; 196+ messages in thread
From: Christian König @ 2021-01-19  8:53 UTC (permalink / raw)
  To: Andrey Grodzovsky, amd-gfx, dri-devel, daniel.vetter, robh,
	l.stach, yuq825, eric
  Cc: Alexander.Deucher, gregkh

On 18.01.21 at 22:01, Andrey Grodzovsky wrote:
> This allows removing the explicit creation and destruction
> of those attrs and thereby avoids warnings on device
> finalizing post physical device extraction.
>
> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>

Acked-by: Christian König <christian.koenig@amd.com>

> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c | 17 +++++++++--------
>   drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c      | 13 +++++++++++++
>   drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c  | 25 ++++++++++---------------
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 14 +++++---------
>   4 files changed, 37 insertions(+), 32 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c
> index 86add0f..0346e12 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c
> @@ -1953,6 +1953,15 @@ static ssize_t amdgpu_atombios_get_vbios_version(struct device *dev,
>   static DEVICE_ATTR(vbios_version, 0444, amdgpu_atombios_get_vbios_version,
>   		   NULL);
>   
> +static struct attribute *amdgpu_vbios_version_attrs[] = {
> +	&dev_attr_vbios_version.attr,
> +	NULL
> +};
> +
> +const struct attribute_group amdgpu_vbios_version_attr_group = {
> +	.attrs = amdgpu_vbios_version_attrs
> +};
> +
>   /**
>    * amdgpu_atombios_fini - free the driver info and callbacks for atombios
>    *
> @@ -1972,7 +1981,6 @@ void amdgpu_atombios_fini(struct amdgpu_device *adev)
>   	adev->mode_info.atom_context = NULL;
>   	kfree(adev->mode_info.atom_card_info);
>   	adev->mode_info.atom_card_info = NULL;
> -	device_remove_file(adev->dev, &dev_attr_vbios_version);
>   }
>   
>   /**
> @@ -1989,7 +1997,6 @@ int amdgpu_atombios_init(struct amdgpu_device *adev)
>   {
>   	struct card_info *atom_card_info =
>   	    kzalloc(sizeof(struct card_info), GFP_KERNEL);
> -	int ret;
>   
>   	if (!atom_card_info)
>   		return -ENOMEM;
> @@ -2027,12 +2034,6 @@ int amdgpu_atombios_init(struct amdgpu_device *adev)
>   		amdgpu_atombios_allocate_fb_scratch(adev);
>   	}
>   
> -	ret = device_create_file(adev->dev, &dev_attr_vbios_version);
> -	if (ret) {
> -		DRM_ERROR("Failed to create device file for VBIOS version\n");
> -		return ret;
> -	}
> -
>   	return 0;
>   }
>   
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> index 9c0cd00..8fddd74 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> @@ -1587,6 +1587,18 @@ static struct pci_error_handlers amdgpu_pci_err_handler = {
>   	.resume		= amdgpu_pci_resume,
>   };
>   
> +extern const struct attribute_group amdgpu_vram_mgr_attr_group;
> +extern const struct attribute_group amdgpu_gtt_mgr_attr_group;
> +extern const struct attribute_group amdgpu_vbios_version_attr_group;
> +
> +static const struct attribute_group *amdgpu_sysfs_groups[] = {
> +	&amdgpu_vram_mgr_attr_group,
> +	&amdgpu_gtt_mgr_attr_group,
> +	&amdgpu_vbios_version_attr_group,
> +	NULL,
> +};
> +
> +
>   static struct pci_driver amdgpu_kms_pci_driver = {
>   	.name = DRIVER_NAME,
>   	.id_table = pciidlist,
> @@ -1595,6 +1607,7 @@ static struct pci_driver amdgpu_kms_pci_driver = {
>   	.shutdown = amdgpu_pci_shutdown,
>   	.driver.pm = &amdgpu_pm_ops,
>   	.err_handler = &amdgpu_pci_err_handler,
> +	.driver.dev_groups = amdgpu_sysfs_groups,
>   };
>   
>   static int __init amdgpu_init(void)
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c
> index 8980329..3b7150e 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c
> @@ -77,6 +77,16 @@ static DEVICE_ATTR(mem_info_gtt_total, S_IRUGO,
>   static DEVICE_ATTR(mem_info_gtt_used, S_IRUGO,
>   	           amdgpu_mem_info_gtt_used_show, NULL);
>   
> +static struct attribute *amdgpu_gtt_mgr_attributes[] = {
> +	&dev_attr_mem_info_gtt_total.attr,
> +	&dev_attr_mem_info_gtt_used.attr,
> +	NULL
> +};
> +
> +const struct attribute_group amdgpu_gtt_mgr_attr_group = {
> +	.attrs = amdgpu_gtt_mgr_attributes
> +};
> +
>   static const struct ttm_resource_manager_func amdgpu_gtt_mgr_func;
>   /**
>    * amdgpu_gtt_mgr_init - init GTT manager and DRM MM
> @@ -91,7 +101,6 @@ int amdgpu_gtt_mgr_init(struct amdgpu_device *adev, uint64_t gtt_size)
>   	struct amdgpu_gtt_mgr *mgr = &adev->mman.gtt_mgr;
>   	struct ttm_resource_manager *man = &mgr->manager;
>   	uint64_t start, size;
> -	int ret;
>   
>   	man->use_tt = true;
>   	man->func = &amdgpu_gtt_mgr_func;
> @@ -104,17 +113,6 @@ int amdgpu_gtt_mgr_init(struct amdgpu_device *adev, uint64_t gtt_size)
>   	spin_lock_init(&mgr->lock);
>   	atomic64_set(&mgr->available, gtt_size >> PAGE_SHIFT);
>   
> -	ret = device_create_file(adev->dev, &dev_attr_mem_info_gtt_total);
> -	if (ret) {
> -		DRM_ERROR("Failed to create device file mem_info_gtt_total\n");
> -		return ret;
> -	}
> -	ret = device_create_file(adev->dev, &dev_attr_mem_info_gtt_used);
> -	if (ret) {
> -		DRM_ERROR("Failed to create device file mem_info_gtt_used\n");
> -		return ret;
> -	}
> -
>   	ttm_set_driver_manager(&adev->mman.bdev, TTM_PL_TT, &mgr->manager);
>   	ttm_resource_manager_set_used(man, true);
>   	return 0;
> @@ -144,9 +142,6 @@ void amdgpu_gtt_mgr_fini(struct amdgpu_device *adev)
>   	drm_mm_takedown(&mgr->mm);
>   	spin_unlock(&mgr->lock);
>   
> -	device_remove_file(adev->dev, &dev_attr_mem_info_gtt_total);
> -	device_remove_file(adev->dev, &dev_attr_mem_info_gtt_used);
> -
>   	ttm_resource_manager_cleanup(man);
>   	ttm_set_driver_manager(&adev->mman.bdev, TTM_PL_TT, NULL);
>   }
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
> index d2de2a7..9158d11 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
> @@ -154,7 +154,7 @@ static DEVICE_ATTR(mem_info_vis_vram_used, S_IRUGO,
>   static DEVICE_ATTR(mem_info_vram_vendor, S_IRUGO,
>   		   amdgpu_mem_info_vram_vendor, NULL);
>   
> -static const struct attribute *amdgpu_vram_mgr_attributes[] = {
> +static struct attribute *amdgpu_vram_mgr_attributes[] = {
>   	&dev_attr_mem_info_vram_total.attr,
>   	&dev_attr_mem_info_vis_vram_total.attr,
>   	&dev_attr_mem_info_vram_used.attr,
> @@ -163,6 +163,10 @@ static const struct attribute *amdgpu_vram_mgr_attributes[] = {
>   	NULL
>   };
>   
> +const struct attribute_group amdgpu_vram_mgr_attr_group = {
> +	.attrs = amdgpu_vram_mgr_attributes
> +};
> +
>   static const struct ttm_resource_manager_func amdgpu_vram_mgr_func;
>   
>   /**
> @@ -176,7 +180,6 @@ int amdgpu_vram_mgr_init(struct amdgpu_device *adev)
>   {
>   	struct amdgpu_vram_mgr *mgr = &adev->mman.vram_mgr;
>   	struct ttm_resource_manager *man = &mgr->manager;
> -	int ret;
>   
>   	ttm_resource_manager_init(man, adev->gmc.real_vram_size >> PAGE_SHIFT);
>   
> @@ -187,11 +190,6 @@ int amdgpu_vram_mgr_init(struct amdgpu_device *adev)
>   	INIT_LIST_HEAD(&mgr->reservations_pending);
>   	INIT_LIST_HEAD(&mgr->reserved_pages);
>   
> -	/* Add the two VRAM-related sysfs files */
> -	ret = sysfs_create_files(&adev->dev->kobj, amdgpu_vram_mgr_attributes);
> -	if (ret)
> -		DRM_ERROR("Failed to register sysfs\n");
> -
>   	ttm_set_driver_manager(&adev->mman.bdev, TTM_PL_VRAM, &mgr->manager);
>   	ttm_resource_manager_set_used(man, true);
>   	return 0;
> @@ -229,8 +227,6 @@ void amdgpu_vram_mgr_fini(struct amdgpu_device *adev)
>   	drm_mm_takedown(&mgr->mm);
>   	spin_unlock(&mgr->lock);
>   
> -	sysfs_remove_files(&adev->dev->kobj, amdgpu_vram_mgr_attributes);
> -
>   	ttm_resource_manager_cleanup(man);
>   	ttm_set_driver_manager(&adev->mman.bdev, TTM_PL_VRAM, NULL);
>   }
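
As a side note, for the common single-group case <linux/sysfs.h> provides
the ATTRIBUTE_GROUPS() macro, which generates both the attribute_group
and the NULL-terminated group list from a *_attrs array. A hypothetical
equivalent shorthand (the patch keeps explicit groups because it merges
three arrays defined in different files):

static struct attribute *amdgpu_example_attrs[] = {
	&dev_attr_vbios_version.attr,	/* from a DEVICE_ATTR() above */
	NULL
};
ATTRIBUTE_GROUPS(amdgpu_example);	/* emits amdgpu_example_groups[] */

/* ...which a driver could then register with:
 *	.driver.dev_groups = amdgpu_example_groups,
 */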

* Re: [PATCH v4 11/14] drm/amdgpu: Guard against write accesses after device removal
  2021-01-18 21:01   ` Andrey Grodzovsky
@ 2021-01-19  8:55     ` Christian König
  -1 siblings, 0 replies; 196+ messages in thread
From: Christian König @ 2021-01-19  8:55 UTC (permalink / raw)
  To: Andrey Grodzovsky, amd-gfx, dri-devel, daniel.vetter, robh,
	l.stach, yuq825, eric
  Cc: Alexander.Deucher, gregkh

On 18.01.21 at 22:01, Andrey Grodzovsky wrote:
> This should prevent writing to memory or IO ranges possibly
> already allocated for other uses after our device is removed.

Wow, that adds quite some overhead to every register access. I'm not 
sure we can do this.

Christian.
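
For reference, drm_dev_enter()/drm_dev_exit() take an SRCU read-side
lock, so the fast path is cheap but not free. A sketch of how the pattern
repeated throughout this patch could be factored into one helper to at
least contain the boilerplate (the helper name is hypothetical):

#include <drm/drm_drv.h>

/* Hypothetical convenience wrapper: perform an MMIO write only while
 * the underlying drm_device is still plugged in. */
static inline void amdgpu_wreg_guarded(struct amdgpu_device *adev,
				       void __iomem *addr, u32 v)
{
	int idx;

	if (!drm_dev_enter(&adev->ddev, &idx))
		return;		/* device unplugged: drop the write */

	writel(v, addr);
	drm_dev_exit(idx);
}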

>
> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 57 ++++++++++++++++++++++++
>   drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c    |  9 ++++
>   drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c    | 53 +++++++++++++---------
>   drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h    |  3 ++
>   drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c   | 70 ++++++++++++++++++++++++++++++
>   drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h   | 49 ++-------------------
>   drivers/gpu/drm/amd/amdgpu/psp_v11_0.c     | 16 ++-----
>   drivers/gpu/drm/amd/amdgpu/psp_v12_0.c     |  8 +---
>   drivers/gpu/drm/amd/amdgpu/psp_v3_1.c      |  8 +---
>   9 files changed, 184 insertions(+), 89 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index e99f4f1..0a9d73c 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -72,6 +72,8 @@
>   
>   #include <linux/iommu.h>
>   
> +#include <drm/drm_drv.h>
> +
>   MODULE_FIRMWARE("amdgpu/vega10_gpu_info.bin");
>   MODULE_FIRMWARE("amdgpu/vega12_gpu_info.bin");
>   MODULE_FIRMWARE("amdgpu/raven_gpu_info.bin");
> @@ -404,13 +406,21 @@ uint8_t amdgpu_mm_rreg8(struct amdgpu_device *adev, uint32_t offset)
>    */
>   void amdgpu_mm_wreg8(struct amdgpu_device *adev, uint32_t offset, uint8_t value)
>   {
> +	int idx;
> +
>   	if (adev->in_pci_err_recovery)
>   		return;
>   
> +
> +	if (!drm_dev_enter(&adev->ddev, &idx))
> +		return;
> +
>   	if (offset < adev->rmmio_size)
>   		writeb(value, adev->rmmio + offset);
>   	else
>   		BUG();
> +
> +	drm_dev_exit(idx);
>   }
>   
>   /**
> @@ -427,9 +437,14 @@ void amdgpu_device_wreg(struct amdgpu_device *adev,
>   			uint32_t reg, uint32_t v,
>   			uint32_t acc_flags)
>   {
> +	int idx;
> +
>   	if (adev->in_pci_err_recovery)
>   		return;
>   
> +	if (!drm_dev_enter(&adev->ddev, &idx))
> +		return;
> +
>   	if ((reg * 4) < adev->rmmio_size) {
>   		if (!(acc_flags & AMDGPU_REGS_NO_KIQ) &&
>   		    amdgpu_sriov_runtime(adev) &&
> @@ -444,6 +459,8 @@ void amdgpu_device_wreg(struct amdgpu_device *adev,
>   	}
>   
>   	trace_amdgpu_device_wreg(adev->pdev->device, reg, v);
> +
> +	drm_dev_exit(idx);
>   }
>   
>   /*
> @@ -454,9 +471,14 @@ void amdgpu_device_wreg(struct amdgpu_device *adev,
>   void amdgpu_mm_wreg_mmio_rlc(struct amdgpu_device *adev,
>   			     uint32_t reg, uint32_t v)
>   {
> +	int idx;
> +
>   	if (adev->in_pci_err_recovery)
>   		return;
>   
> +	if (!drm_dev_enter(&adev->ddev, &idx))
> +		return;
> +
>   	if (amdgpu_sriov_fullaccess(adev) &&
>   	    adev->gfx.rlc.funcs &&
>   	    adev->gfx.rlc.funcs->is_rlcg_access_range) {
> @@ -465,6 +487,8 @@ void amdgpu_mm_wreg_mmio_rlc(struct amdgpu_device *adev,
>   	} else {
>   		writel(v, ((void __iomem *)adev->rmmio) + (reg * 4));
>   	}
> +
> +	drm_dev_exit(idx);
>   }
>   
>   /**
> @@ -499,15 +523,22 @@ u32 amdgpu_io_rreg(struct amdgpu_device *adev, u32 reg)
>    */
>   void amdgpu_io_wreg(struct amdgpu_device *adev, u32 reg, u32 v)
>   {
> +	int idx;
> +
>   	if (adev->in_pci_err_recovery)
>   		return;
>   
> +	if (!drm_dev_enter(&adev->ddev, &idx))
> +		return;
> +
>   	if ((reg * 4) < adev->rio_mem_size)
>   		iowrite32(v, adev->rio_mem + (reg * 4));
>   	else {
>   		iowrite32((reg * 4), adev->rio_mem + (mmMM_INDEX * 4));
>   		iowrite32(v, adev->rio_mem + (mmMM_DATA * 4));
>   	}
> +
> +	drm_dev_exit(idx);
>   }
>   
>   /**
> @@ -544,14 +575,21 @@ u32 amdgpu_mm_rdoorbell(struct amdgpu_device *adev, u32 index)
>    */
>   void amdgpu_mm_wdoorbell(struct amdgpu_device *adev, u32 index, u32 v)
>   {
> +	int idx;
> +
>   	if (adev->in_pci_err_recovery)
>   		return;
>   
> +	if (!drm_dev_enter(&adev->ddev, &idx))
> +		return;
> +
>   	if (index < adev->doorbell.num_doorbells) {
>   		writel(v, adev->doorbell.ptr + index);
>   	} else {
>   		DRM_ERROR("writing beyond doorbell aperture: 0x%08x!\n", index);
>   	}
> +
> +	drm_dev_exit(idx);
>   }
>   
>   /**
> @@ -588,14 +626,21 @@ u64 amdgpu_mm_rdoorbell64(struct amdgpu_device *adev, u32 index)
>    */
>   void amdgpu_mm_wdoorbell64(struct amdgpu_device *adev, u32 index, u64 v)
>   {
> +	int idx;
> +
>   	if (adev->in_pci_err_recovery)
>   		return;
>   
> +	if (!drm_dev_enter(&adev->ddev, &idx))
> +		return;
> +
>   	if (index < adev->doorbell.num_doorbells) {
>   		atomic64_set((atomic64_t *)(adev->doorbell.ptr + index), v);
>   	} else {
>   		DRM_ERROR("writing beyond doorbell aperture: 0x%08x!\n", index);
>   	}
> +
> +	drm_dev_exit(idx);
>   }
>   
>   /**
> @@ -682,6 +727,10 @@ void amdgpu_device_indirect_wreg(struct amdgpu_device *adev,
>   	unsigned long flags;
>   	void __iomem *pcie_index_offset;
>   	void __iomem *pcie_data_offset;
> +	int idx;
> +
> +	if (!drm_dev_enter(&adev->ddev, &idx))
> +		return;
>   
>   	spin_lock_irqsave(&adev->pcie_idx_lock, flags);
>   	pcie_index_offset = (void __iomem *)adev->rmmio + pcie_index * 4;
> @@ -692,6 +741,8 @@ void amdgpu_device_indirect_wreg(struct amdgpu_device *adev,
>   	writel(reg_data, pcie_data_offset);
>   	readl(pcie_data_offset);
>   	spin_unlock_irqrestore(&adev->pcie_idx_lock, flags);
> +
> +	drm_dev_exit(idx);
>   }
>   
>   /**
> @@ -711,6 +762,10 @@ void amdgpu_device_indirect_wreg64(struct amdgpu_device *adev,
>   	unsigned long flags;
>   	void __iomem *pcie_index_offset;
>   	void __iomem *pcie_data_offset;
> +	int idx;
> +
> +	if (!drm_dev_enter(&adev->ddev, &idx))
> +		return;
>   
>   	spin_lock_irqsave(&adev->pcie_idx_lock, flags);
>   	pcie_index_offset = (void __iomem *)adev->rmmio + pcie_index * 4;
> @@ -727,6 +782,8 @@ void amdgpu_device_indirect_wreg64(struct amdgpu_device *adev,
>   	writel((u32)(reg_data >> 32), pcie_data_offset);
>   	readl(pcie_data_offset);
>   	spin_unlock_irqrestore(&adev->pcie_idx_lock, flags);
> +
> +	drm_dev_exit(idx);
>   }
>   
>   /**
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> index fe1a39f..1beb4e6 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> @@ -31,6 +31,8 @@
>   #include "amdgpu_ras.h"
>   #include "amdgpu_xgmi.h"
>   
> +#include <drm/drm_drv.h>
> +
>   /**
>    * amdgpu_gmc_get_pde_for_bo - get the PDE for a BO
>    *
> @@ -98,6 +100,10 @@ int amdgpu_gmc_set_pte_pde(struct amdgpu_device *adev, void *cpu_pt_addr,
>   {
>   	void __iomem *ptr = (void *)cpu_pt_addr;
>   	uint64_t value;
> +	int idx;
> +
> +	if (!drm_dev_enter(&adev->ddev, &idx))
> +		return 0;
>   
>   	/*
>   	 * The following is for PTE only. GART does not have PDEs.
> @@ -105,6 +111,9 @@ int amdgpu_gmc_set_pte_pde(struct amdgpu_device *adev, void *cpu_pt_addr,
>   	value = addr & 0x0000FFFFFFFFF000ULL;
>   	value |= flags;
>   	writeq(value, ptr + (gpu_page_idx * 8));
> +
> +	drm_dev_exit(idx);
> +
>   	return 0;
>   }
>   
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
> index 523d22d..89e2bfe 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
> @@ -37,6 +37,8 @@
>   
>   #include "amdgpu_ras.h"
>   
> +#include <drm/drm_drv.h>
> +
>   static int psp_sysfs_init(struct amdgpu_device *adev);
>   static void psp_sysfs_fini(struct amdgpu_device *adev);
>   
> @@ -248,7 +250,7 @@ psp_cmd_submit_buf(struct psp_context *psp,
>   		   struct psp_gfx_cmd_resp *cmd, uint64_t fence_mc_addr)
>   {
>   	int ret;
> -	int index;
> +	int index, idx;
>   	int timeout = 2000;
>   	bool ras_intr = false;
>   	bool skip_unsupport = false;
> @@ -256,6 +258,9 @@ psp_cmd_submit_buf(struct psp_context *psp,
>   	if (psp->adev->in_pci_err_recovery)
>   		return 0;
>   
> +	if (!drm_dev_enter(&psp->adev->ddev, &idx))
> +		return 0;
> +
>   	mutex_lock(&psp->mutex);
>   
>   	memset(psp->cmd_buf_mem, 0, PSP_CMD_BUFFER_SIZE);
> @@ -266,8 +271,7 @@ psp_cmd_submit_buf(struct psp_context *psp,
>   	ret = psp_ring_cmd_submit(psp, psp->cmd_buf_mc_addr, fence_mc_addr, index);
>   	if (ret) {
>   		atomic_dec(&psp->fence_value);
> -		mutex_unlock(&psp->mutex);
> -		return ret;
> +		goto exit;
>   	}
>   
>   	amdgpu_asic_invalidate_hdp(psp->adev, NULL);
> @@ -307,8 +311,8 @@ psp_cmd_submit_buf(struct psp_context *psp,
>   			 psp->cmd_buf_mem->cmd_id,
>   			 psp->cmd_buf_mem->resp.status);
>   		if (!timeout) {
> -			mutex_unlock(&psp->mutex);
> -			return -EINVAL;
> +			ret = -EINVAL;
> +			goto exit;
>   		}
>   	}
>   
> @@ -316,8 +320,10 @@ psp_cmd_submit_buf(struct psp_context *psp,
>   		ucode->tmr_mc_addr_lo = psp->cmd_buf_mem->resp.fw_addr_lo;
>   		ucode->tmr_mc_addr_hi = psp->cmd_buf_mem->resp.fw_addr_hi;
>   	}
> -	mutex_unlock(&psp->mutex);
>   
> +exit:
> +	mutex_unlock(&psp->mutex);
> +	drm_dev_exit(idx);
>   	return ret;
>   }
>   
> @@ -354,8 +360,7 @@ static int psp_load_toc(struct psp_context *psp,
>   	if (!cmd)
>   		return -ENOMEM;
>   	/* Copy toc to psp firmware private buffer */
> -	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
> -	memcpy(psp->fw_pri_buf, psp->toc_start_addr, psp->toc_bin_size);
> +	psp_copy_fw(psp, psp->toc_start_addr, psp->toc_bin_size);
>   
>   	psp_prep_load_toc_cmd_buf(cmd, psp->fw_pri_mc_addr, psp->toc_bin_size);
>   
> @@ -570,8 +575,7 @@ static int psp_asd_load(struct psp_context *psp)
>   	if (!cmd)
>   		return -ENOMEM;
>   
> -	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
> -	memcpy(psp->fw_pri_buf, psp->asd_start_addr, psp->asd_ucode_size);
> +	psp_copy_fw(psp, psp->asd_start_addr, psp->asd_ucode_size);
>   
>   	psp_prep_asd_load_cmd_buf(cmd, psp->fw_pri_mc_addr,
>   				  psp->asd_ucode_size);
> @@ -726,8 +730,7 @@ static int psp_xgmi_load(struct psp_context *psp)
>   	if (!cmd)
>   		return -ENOMEM;
>   
> -	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
> -	memcpy(psp->fw_pri_buf, psp->ta_xgmi_start_addr, psp->ta_xgmi_ucode_size);
> +	psp_copy_fw(psp, psp->ta_xgmi_start_addr, psp->ta_xgmi_ucode_size);
>   
>   	psp_prep_ta_load_cmd_buf(cmd,
>   				 psp->fw_pri_mc_addr,
> @@ -982,8 +985,7 @@ static int psp_ras_load(struct psp_context *psp)
>   	if (!cmd)
>   		return -ENOMEM;
>   
> -	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
> -	memcpy(psp->fw_pri_buf, psp->ta_ras_start_addr, psp->ta_ras_ucode_size);
> +	psp_copy_fw(psp, psp->ta_ras_start_addr, psp->ta_ras_ucode_size);
>   
>   	psp_prep_ta_load_cmd_buf(cmd,
>   				 psp->fw_pri_mc_addr,
> @@ -1219,8 +1221,7 @@ static int psp_hdcp_load(struct psp_context *psp)
>   	if (!cmd)
>   		return -ENOMEM;
>   
> -	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
> -	memcpy(psp->fw_pri_buf, psp->ta_hdcp_start_addr,
> +	psp_copy_fw(psp, psp->ta_hdcp_start_addr,
>   	       psp->ta_hdcp_ucode_size);
>   
>   	psp_prep_ta_load_cmd_buf(cmd,
> @@ -1366,8 +1367,7 @@ static int psp_dtm_load(struct psp_context *psp)
>   	if (!cmd)
>   		return -ENOMEM;
>   
> -	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
> -	memcpy(psp->fw_pri_buf, psp->ta_dtm_start_addr, psp->ta_dtm_ucode_size);
> +	psp_copy_fw(psp, psp->ta_dtm_start_addr, psp->ta_dtm_ucode_size);
>   
>   	psp_prep_ta_load_cmd_buf(cmd,
>   				 psp->fw_pri_mc_addr,
> @@ -1507,8 +1507,7 @@ static int psp_rap_load(struct psp_context *psp)
>   	if (!cmd)
>   		return -ENOMEM;
>   
> -	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
> -	memcpy(psp->fw_pri_buf, psp->ta_rap_start_addr, psp->ta_rap_ucode_size);
> +	psp_copy_fw(psp, psp->ta_rap_start_addr, psp->ta_rap_ucode_size);
>   
>   	psp_prep_ta_load_cmd_buf(cmd,
>   				 psp->fw_pri_mc_addr,
> @@ -2778,6 +2777,20 @@ static ssize_t psp_usbc_pd_fw_sysfs_write(struct device *dev,
>   	return count;
>   }
>   
> +void psp_copy_fw(struct psp_context *psp, uint8_t *start_addr, uint32_t bin_size)
> +{
> +	int idx;
> +
> +	if (!drm_dev_enter(&psp->adev->ddev, &idx))
> +		return;
> +
> +	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
> +	memcpy(psp->fw_pri_buf, start_addr, bin_size);
> +
> +	drm_dev_exit(idx);
> +}
> +
> +
>   static DEVICE_ATTR(usbc_pd_fw, S_IRUGO | S_IWUSR,
>   		   psp_usbc_pd_fw_sysfs_read,
>   		   psp_usbc_pd_fw_sysfs_write);
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
> index da250bc..ac69314 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
> @@ -400,4 +400,7 @@ int psp_init_ta_microcode(struct psp_context *psp,
>   			  const char *chip_name);
>   int psp_get_fw_attestation_records_addr(struct psp_context *psp,
>   					uint64_t *output_ptr);
> +
> +void psp_copy_fw(struct psp_context *psp, uint8_t *start_addr, uint32_t bin_size);
> +
>   #endif
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
> index 1a612f5..d656494 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
> @@ -35,6 +35,8 @@
>   #include "amdgpu.h"
>   #include "atom.h"
>   
> +#include <drm/drm_drv.h>
> +
>   /*
>    * Rings
>    * Most engines on the GPU are fed via ring buffers.  Ring
> @@ -463,3 +465,71 @@ int amdgpu_ring_test_helper(struct amdgpu_ring *ring)
>   	ring->sched.ready = !r;
>   	return r;
>   }
> +
> +void amdgpu_ring_clear_ring(struct amdgpu_ring *ring)
> +{
> +	int idx;
> +	int i = 0;
> +
> +	if (!drm_dev_enter(&ring->adev->ddev, &idx))
> +		return;
> +
> +	while (i <= ring->buf_mask)
> +		ring->ring[i++] = ring->funcs->nop;
> +
> +	drm_dev_exit(idx);
> +
> +}
> +
> +void amdgpu_ring_write(struct amdgpu_ring *ring, uint32_t v)
> +{
> +	int idx;
> +
> +	if (!drm_dev_enter(&ring->adev->ddev, &idx))
> +		return;
> +
> +	if (ring->count_dw <= 0)
> +		DRM_ERROR("amdgpu: writing more dwords to the ring than expected!\n");
> +	ring->ring[ring->wptr++ & ring->buf_mask] = v;
> +	ring->wptr &= ring->ptr_mask;
> +	ring->count_dw--;
> +
> +	drm_dev_exit(idx);
> +}
> +
> +void amdgpu_ring_write_multiple(struct amdgpu_ring *ring,
> +					      void *src, int count_dw)
> +{
> +	unsigned occupied, chunk1, chunk2;
> +	void *dst;
> +	int idx;
> +
> +	if (!drm_dev_enter(&ring->adev->ddev, &idx))
> +		return;
> +
> +	if (unlikely(ring->count_dw < count_dw))
> +		DRM_ERROR("amdgpu: writing more dwords to the ring than expected!\n");
> +
> +	occupied = ring->wptr & ring->buf_mask;
> +	dst = (void *)&ring->ring[occupied];
> +	chunk1 = ring->buf_mask + 1 - occupied;
> +	chunk1 = (chunk1 >= count_dw) ? count_dw: chunk1;
> +	chunk2 = count_dw - chunk1;
> +	chunk1 <<= 2;
> +	chunk2 <<= 2;
> +
> +	if (chunk1)
> +		memcpy(dst, src, chunk1);
> +
> +	if (chunk2) {
> +		src += chunk1;
> +		dst = (void *)ring->ring;
> +		memcpy(dst, src, chunk2);
> +	}
> +
> +	ring->wptr += count_dw;
> +	ring->wptr &= ring->ptr_mask;
> +	ring->count_dw -= count_dw;
> +
> +	drm_dev_exit(idx);
> +}
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
> index accb243..f90b81f 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
> @@ -300,53 +300,12 @@ static inline void amdgpu_ring_set_preempt_cond_exec(struct amdgpu_ring *ring,
>   	*ring->cond_exe_cpu_addr = cond_exec;
>   }
>   
> -static inline void amdgpu_ring_clear_ring(struct amdgpu_ring *ring)
> -{
> -	int i = 0;
> -	while (i <= ring->buf_mask)
> -		ring->ring[i++] = ring->funcs->nop;
> -
> -}
> -
> -static inline void amdgpu_ring_write(struct amdgpu_ring *ring, uint32_t v)
> -{
> -	if (ring->count_dw <= 0)
> -		DRM_ERROR("amdgpu: writing more dwords to the ring than expected!\n");
> -	ring->ring[ring->wptr++ & ring->buf_mask] = v;
> -	ring->wptr &= ring->ptr_mask;
> -	ring->count_dw--;
> -}
> +void amdgpu_ring_clear_ring(struct amdgpu_ring *ring);
>   
> -static inline void amdgpu_ring_write_multiple(struct amdgpu_ring *ring,
> -					      void *src, int count_dw)
> -{
> -	unsigned occupied, chunk1, chunk2;
> -	void *dst;
> -
> -	if (unlikely(ring->count_dw < count_dw))
> -		DRM_ERROR("amdgpu: writing more dwords to the ring than expected!\n");
> -
> -	occupied = ring->wptr & ring->buf_mask;
> -	dst = (void *)&ring->ring[occupied];
> -	chunk1 = ring->buf_mask + 1 - occupied;
> -	chunk1 = (chunk1 >= count_dw) ? count_dw: chunk1;
> -	chunk2 = count_dw - chunk1;
> -	chunk1 <<= 2;
> -	chunk2 <<= 2;
> +void amdgpu_ring_write(struct amdgpu_ring *ring, uint32_t v);
>   
> -	if (chunk1)
> -		memcpy(dst, src, chunk1);
> -
> -	if (chunk2) {
> -		src += chunk1;
> -		dst = (void *)ring->ring;
> -		memcpy(dst, src, chunk2);
> -	}
> -
> -	ring->wptr += count_dw;
> -	ring->wptr &= ring->ptr_mask;
> -	ring->count_dw -= count_dw;
> -}
> +void amdgpu_ring_write_multiple(struct amdgpu_ring *ring,
> +					      void *src, int count_dw);
>   
>   int amdgpu_ring_test_helper(struct amdgpu_ring *ring);
>   
> diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c b/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
> index bd4248c..b3ce5be 100644
> --- a/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
> @@ -269,10 +269,8 @@ static int psp_v11_0_bootloader_load_kdb(struct psp_context *psp)
>   	if (ret)
>   		return ret;
>   
> -	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
> -
>   	/* Copy PSP KDB binary to memory */
> -	memcpy(psp->fw_pri_buf, psp->kdb_start_addr, psp->kdb_bin_size);
> +	psp_copy_fw(psp, psp->kdb_start_addr, psp->kdb_bin_size);
>   
>   	/* Provide the PSP KDB to bootloader */
>   	WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
> @@ -302,10 +300,8 @@ static int psp_v11_0_bootloader_load_spl(struct psp_context *psp)
>   	if (ret)
>   		return ret;
>   
> -	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
> -
>   	/* Copy PSP SPL binary to memory */
> -	memcpy(psp->fw_pri_buf, psp->spl_start_addr, psp->spl_bin_size);
> +	psp_copy_fw(psp, psp->spl_start_addr, psp->spl_bin_size);
>   
>   	/* Provide the PSP SPL to bootloader */
>   	WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
> @@ -335,10 +331,8 @@ static int psp_v11_0_bootloader_load_sysdrv(struct psp_context *psp)
>   	if (ret)
>   		return ret;
>   
> -	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
> -
>   	/* Copy PSP System Driver binary to memory */
> -	memcpy(psp->fw_pri_buf, psp->sys_start_addr, psp->sys_bin_size);
> +	psp_copy_fw(psp, psp->sys_start_addr, psp->sys_bin_size);
>   
>   	/* Provide the sys driver to bootloader */
>   	WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
> @@ -371,10 +365,8 @@ static int psp_v11_0_bootloader_load_sos(struct psp_context *psp)
>   	if (ret)
>   		return ret;
>   
> -	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
> -
>   	/* Copy Secure OS binary to PSP memory */
> -	memcpy(psp->fw_pri_buf, psp->sos_start_addr, psp->sos_bin_size);
> +	psp_copy_fw(psp, psp->sos_start_addr, psp->sos_bin_size);
>   
>   	/* Provide the PSP secure OS to bootloader */
>   	WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
> diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c b/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
> index c4828bd..618e5b6 100644
> --- a/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
> @@ -138,10 +138,8 @@ static int psp_v12_0_bootloader_load_sysdrv(struct psp_context *psp)
>   	if (ret)
>   		return ret;
>   
> -	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
> -
>   	/* Copy PSP System Driver binary to memory */
> -	memcpy(psp->fw_pri_buf, psp->sys_start_addr, psp->sys_bin_size);
> +	psp_copy_fw(psp, psp->sys_start_addr, psp->sys_bin_size);
>   
>   	/* Provide the sys driver to bootloader */
>   	WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
> @@ -179,10 +177,8 @@ static int psp_v12_0_bootloader_load_sos(struct psp_context *psp)
>   	if (ret)
>   		return ret;
>   
> -	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
> -
>   	/* Copy Secure OS binary to PSP memory */
> -	memcpy(psp->fw_pri_buf, psp->sos_start_addr, psp->sos_bin_size);
> +	psp_copy_fw(psp, psp->sos_start_addr, psp->sos_bin_size);
>   
>   	/* Provide the PSP secure OS to bootloader */
>   	WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
> diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c b/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
> index f2e725f..d0a6cccd 100644
> --- a/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
> +++ b/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
> @@ -102,10 +102,8 @@ static int psp_v3_1_bootloader_load_sysdrv(struct psp_context *psp)
>   	if (ret)
>   		return ret;
>   
> -	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
> -
>   	/* Copy PSP System Driver binary to memory */
> -	memcpy(psp->fw_pri_buf, psp->sys_start_addr, psp->sys_bin_size);
> +	psp_copy_fw(psp, psp->sys_start_addr, psp->sys_bin_size);
>   
>   	/* Provide the sys driver to bootloader */
>   	WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
> @@ -143,10 +141,8 @@ static int psp_v3_1_bootloader_load_sos(struct psp_context *psp)
>   	if (ret)
>   		return ret;
>   
> -	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
> -
>   	/* Copy Secure OS binary to PSP memory */
> -	memcpy(psp->fw_pri_buf, psp->sos_start_addr, psp->sos_bin_size);
> +	psp_copy_fw(psp, psp->sos_start_addr, psp->sos_bin_size);
>   
>   	/* Provide the PSP secure OS to bootloader */
>   	WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
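
For readers skimming the archive, the guard pattern this patch repeats
across all of the write paths above boils down to the following condensed
sketch. The function name is invented for illustration; drm_dev_enter()/
drm_dev_exit() are the real API from <drm/drm_drv.h>:

#include <drm/drm_drv.h>

/* Condensed sketch of the guard applied to every HW access path.
 * drm_dev_enter() takes an SRCU read-side lock and returns false once
 * drm_dev_unplug() has run, so no MMIO (or memory possibly reallocated
 * for other uses) is touched after hot removal. */
static void example_guarded_wreg(struct amdgpu_device *adev, u32 reg, u32 v)
{
	int idx;

	if (!drm_dev_enter(&adev->ddev, &idx))
		return; /* device is gone, drop the write */

	writel(v, ((void __iomem *)adev->rmmio) + (reg * 4));

	drm_dev_exit(idx);
}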


* Re: [PATCH v4 04/14] drm/sched: Cancel and flush all outstanding jobs before finish.
  2021-01-19  8:42     ` Christian König
@ 2021-01-19  9:50       ` Christian König
  -1 siblings, 0 replies; 196+ messages in thread
From: Christian König @ 2021-01-19  9:50 UTC (permalink / raw)
  To: Andrey Grodzovsky, amd-gfx, dri-devel, daniel.vetter, robh,
	l.stach, yuq825, eric
  Cc: Alexander.Deucher, gregkh

Added a CC: stable tag and pushed it.

Thanks,
Christian.

On 19.01.21 at 09:42, Christian König wrote:
> This is a bug fix and should probably be pushed separately to 
> drm-misc-next.
>
> Christian.
>
> On 18.01.21 at 22:01, Andrey Grodzovsky wrote:
>> To avoid any possible use-after-free.
>>
>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>> Reviewed-by: Christian König <christian.koenig@amd.com>
>> ---
>>   drivers/gpu/drm/scheduler/sched_main.c | 3 +++
>>   1 file changed, 3 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
>> b/drivers/gpu/drm/scheduler/sched_main.c
>> index 997aa15..92637b7 100644
>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>> @@ -899,6 +899,9 @@ void drm_sched_fini(struct drm_gpu_scheduler *sched)
>>       if (sched->thread)
>>           kthread_stop(sched->thread);
>>   +    /* Confirm no work left behind accessing device structures */
>> +    cancel_delayed_work_sync(&sched->work_tdr);
>> +
>>       sched->ready = false;
>>   }
>>   EXPORT_SYMBOL(drm_sched_fini);
>
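
As an aside for context, the race the added cancel closes can be pictured
with a minimal self-contained sketch; every name below is invented for
illustration and this is not the actual scheduler code:

#include <linux/workqueue.h>
#include <linux/slab.h>

struct toy_sched {
	struct delayed_work work_tdr; /* timeout (TDR) handler */
};

static void toy_sched_fini(struct toy_sched *sched)
{
	/* Without the _sync cancel, a timeout handler that is already
	 * running may keep dereferencing 'sched' after it is freed.
	 * cancel_delayed_work_sync() removes a pending instance and
	 * waits for a running one to finish before returning. */
	cancel_delayed_work_sync(&sched->work_tdr);
	kfree(sched);
}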


* Re: [PATCH v4 07/14] drm/amdgpu: Register IOMMU topology notifier per device.
  2021-01-19  8:48     ` Christian König
@ 2021-01-19 13:45       ` Daniel Vetter
  -1 siblings, 0 replies; 196+ messages in thread
From: Daniel Vetter @ 2021-01-19 13:45 UTC (permalink / raw)
  To: christian.koenig
  Cc: daniel.vetter, dri-devel, amd-gfx, gregkh, Alexander.Deucher, yuq825

On Tue, Jan 19, 2021 at 09:48:03AM +0100, Christian König wrote:
> On 18.01.21 at 22:01, Andrey Grodzovsky wrote:
> > Handle all DMA IOMMU group related dependencies before the
> > group is removed.
> > 
> > Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> > ---
> >   drivers/gpu/drm/amd/amdgpu/amdgpu.h        |  5 ++++
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 46 ++++++++++++++++++++++++++++++
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c   |  2 +-
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h   |  1 +
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 10 +++++++
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_object.h |  2 ++
> >   6 files changed, 65 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> > index 478a7d8..2953420 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> > @@ -51,6 +51,7 @@
> >   #include <linux/dma-fence.h>
> >   #include <linux/pci.h>
> >   #include <linux/aer.h>
> > +#include <linux/notifier.h>
> >   #include <drm/ttm/ttm_bo_api.h>
> >   #include <drm/ttm/ttm_bo_driver.h>
> > @@ -1041,6 +1042,10 @@ struct amdgpu_device {
> >   	bool                            in_pci_err_recovery;
> >   	struct pci_saved_state          *pci_state;
> > +
> > +	struct notifier_block		nb;
> > +	struct blocking_notifier_head	notifier;
> > +	struct list_head		device_bo_list;
> >   };
> >   static inline struct amdgpu_device *drm_to_adev(struct drm_device *ddev)
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > index 45e23e3..e99f4f1 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > @@ -70,6 +70,8 @@
> >   #include <drm/task_barrier.h>
> >   #include <linux/pm_runtime.h>
> > +#include <linux/iommu.h>
> > +
> >   MODULE_FIRMWARE("amdgpu/vega10_gpu_info.bin");
> >   MODULE_FIRMWARE("amdgpu/vega12_gpu_info.bin");
> >   MODULE_FIRMWARE("amdgpu/raven_gpu_info.bin");
> > @@ -3200,6 +3202,39 @@ static const struct attribute *amdgpu_dev_attributes[] = {
> >   };
> > +static int amdgpu_iommu_group_notifier(struct notifier_block *nb,
> > +				     unsigned long action, void *data)
> > +{
> > +	struct amdgpu_device *adev = container_of(nb, struct amdgpu_device, nb);
> > +	struct amdgpu_bo *bo = NULL;
> > +
> > +	/*
> > +	 * Following is a set of IOMMU group dependencies taken care of before
> > +	 * device's IOMMU group is removed
> > +	 */
> > +	if (action == IOMMU_GROUP_NOTIFY_DEL_DEVICE) {
> > +
> > +		spin_lock(&ttm_bo_glob.lru_lock);
> > +		list_for_each_entry(bo, &adev->device_bo_list, bo) {
> > +			if (bo->tbo.ttm)
> > +				ttm_tt_unpopulate(bo->tbo.bdev, bo->tbo.ttm);
> > +		}
> > +		spin_unlock(&ttm_bo_glob.lru_lock);
> 
> That approach won't work. ttm_tt_unpopulate() might sleep on an IOMMU lock.
> 
> You need to use a mutex here or even better make sure you can access the
> device_bo_list without a lock in this moment.

I'd also be worried about the notifier mutex getting really badly in the
way.

Plus I'm worried about why we even need this; it sounds a bit like papering over
the iommu subsystem. Assuming we clean up all our iommu mappings in our
device hotunplug/unload code, why do we still need to have an additional
iommu notifier on top, with all kinds of additional headaches? The iommu
shouldn't clean up before the devices in its group have cleaned up.

I think we need more info here on what the exact problem is first.
-Daniel

> 
> Christian.
> 
> > +
> > +		if (adev->irq.ih.use_bus_addr)
> > +			amdgpu_ih_ring_fini(adev, &adev->irq.ih);
> > +		if (adev->irq.ih1.use_bus_addr)
> > +			amdgpu_ih_ring_fini(adev, &adev->irq.ih1);
> > +		if (adev->irq.ih2.use_bus_addr)
> > +			amdgpu_ih_ring_fini(adev, &adev->irq.ih2);
> > +
> > +		amdgpu_gart_dummy_page_fini(adev);
> > +	}
> > +
> > +	return NOTIFY_OK;
> > +}
> > +
> > +
> >   /**
> >    * amdgpu_device_init - initialize the driver
> >    *
> > @@ -3304,6 +3339,8 @@ int amdgpu_device_init(struct amdgpu_device *adev,
> >   	INIT_WORK(&adev->xgmi_reset_work, amdgpu_device_xgmi_reset_func);
> > +	INIT_LIST_HEAD(&adev->device_bo_list);
> > +
> >   	adev->gfx.gfx_off_req_count = 1;
> >   	adev->pm.ac_power = power_supply_is_system_supplied() > 0;
> > @@ -3575,6 +3612,15 @@ int amdgpu_device_init(struct amdgpu_device *adev,
> >   	if (amdgpu_device_cache_pci_state(adev->pdev))
> >   		pci_restore_state(pdev);
> > +	BLOCKING_INIT_NOTIFIER_HEAD(&adev->notifier);
> > +	adev->nb.notifier_call = amdgpu_iommu_group_notifier;
> > +
> > +	if (adev->dev->iommu_group) {
> > +		r = iommu_group_register_notifier(adev->dev->iommu_group, &adev->nb);
> > +		if (r)
> > +			goto failed;
> > +	}
> > +
> >   	return 0;
> >   failed:
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
> > index 0db9330..486ad6d 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
> > @@ -92,7 +92,7 @@ static int amdgpu_gart_dummy_page_init(struct amdgpu_device *adev)
> >    *
> >    * Frees the dummy page used by the driver (all asics).
> >    */
> > -static void amdgpu_gart_dummy_page_fini(struct amdgpu_device *adev)
> > +void amdgpu_gart_dummy_page_fini(struct amdgpu_device *adev)
> >   {
> >   	if (!adev->dummy_page_addr)
> >   		return;
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h
> > index afa2e28..5678d9c 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h
> > @@ -61,6 +61,7 @@ int amdgpu_gart_table_vram_pin(struct amdgpu_device *adev);
> >   void amdgpu_gart_table_vram_unpin(struct amdgpu_device *adev);
> >   int amdgpu_gart_init(struct amdgpu_device *adev);
> >   void amdgpu_gart_fini(struct amdgpu_device *adev);
> > +void amdgpu_gart_dummy_page_fini(struct amdgpu_device *adev);
> >   int amdgpu_gart_unbind(struct amdgpu_device *adev, uint64_t offset,
> >   		       int pages);
> >   int amdgpu_gart_map(struct amdgpu_device *adev, uint64_t offset,
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> > index 6cc9919..4a1de69 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> > @@ -94,6 +94,10 @@ static void amdgpu_bo_destroy(struct ttm_buffer_object *tbo)
> >   	}
> >   	amdgpu_bo_unref(&bo->parent);
> > +	spin_lock(&ttm_bo_glob.lru_lock);
> > +	list_del(&bo->bo);
> > +	spin_unlock(&ttm_bo_glob.lru_lock);
> > +
> >   	kfree(bo->metadata);
> >   	kfree(bo);
> >   }
> > @@ -613,6 +617,12 @@ static int amdgpu_bo_do_create(struct amdgpu_device *adev,
> >   	if (bp->type == ttm_bo_type_device)
> >   		bo->flags &= ~AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED;
> > +	INIT_LIST_HEAD(&bo->bo);
> > +
> > +	spin_lock(&ttm_bo_glob.lru_lock);
> > +	list_add_tail(&bo->bo, &adev->device_bo_list);
> > +	spin_unlock(&ttm_bo_glob.lru_lock);
> > +
> >   	return 0;
> >   fail_unreserve:
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
> > index 9ac3756..5ae8555 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
> > @@ -110,6 +110,8 @@ struct amdgpu_bo {
> >   	struct list_head		shadow_list;
> >   	struct kgd_mem                  *kfd_bo;
> > +
> > +	struct list_head		bo;
> >   };
> >   static inline struct amdgpu_bo *ttm_to_amdgpu_bo(struct ttm_buffer_object *tbo)
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

* Re: [PATCH v4 01/14] drm/ttm: Remap all page faults to per process dummy page.
  2021-01-18 21:01   ` Andrey Grodzovsky
@ 2021-01-19 13:56     ` Daniel Vetter
  -1 siblings, 0 replies; 196+ messages in thread
From: Daniel Vetter @ 2021-01-19 13:56 UTC (permalink / raw)
  To: Andrey Grodzovsky
  Cc: gregkh, ckoenig.leichtzumerken, dri-devel, amd-gfx,
	daniel.vetter, Alexander.Deucher, yuq825

On Mon, Jan 18, 2021 at 04:01:10PM -0500, Andrey Grodzovsky wrote:
> On device removal, reroute all CPU mappings to a dummy page.
> 
> v3:
> Remove loop to find DRM file and instead access it
> by vma->vm_file->private_data. Move dummy page installation
> into a separate function.
> 
> v4:
> Map the entire BOs VA space into on demand allocated dummy page
> on the first fault for that BO.
> 
> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> ---
>  drivers/gpu/drm/ttm/ttm_bo_vm.c | 82 ++++++++++++++++++++++++++++++++++++++++-
>  include/drm/ttm/ttm_bo_api.h    |  2 +
>  2 files changed, 83 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c b/drivers/gpu/drm/ttm/ttm_bo_vm.c
> index 6dc96cf..ed89da3 100644
> --- a/drivers/gpu/drm/ttm/ttm_bo_vm.c
> +++ b/drivers/gpu/drm/ttm/ttm_bo_vm.c
> @@ -34,6 +34,8 @@
>  #include <drm/ttm/ttm_bo_driver.h>
>  #include <drm/ttm/ttm_placement.h>
>  #include <drm/drm_vma_manager.h>
> +#include <drm/drm_drv.h>
> +#include <drm/drm_managed.h>
>  #include <linux/mm.h>
>  #include <linux/pfn_t.h>
>  #include <linux/rbtree.h>
> @@ -380,25 +382,103 @@ vm_fault_t ttm_bo_vm_fault_reserved(struct vm_fault *vmf,
>  }
>  EXPORT_SYMBOL(ttm_bo_vm_fault_reserved);
>  
> +static void ttm_bo_release_dummy_page(struct drm_device *dev, void *res)
> +{
> +	struct page *dummy_page = (struct page *)res;
> +
> +	__free_page(dummy_page);
> +}
> +
> +vm_fault_t ttm_bo_vm_dummy_page(struct vm_fault *vmf, pgprot_t prot)
> +{
> +	struct vm_area_struct *vma = vmf->vma;
> +	struct ttm_buffer_object *bo = vma->vm_private_data;
> +	struct drm_device *ddev = bo->base.dev;
> +	vm_fault_t ret = VM_FAULT_NOPAGE;
> +	unsigned long address = vma->vm_start;
> +	unsigned long num_prefault = (vma->vm_end - vma->vm_start) >> PAGE_SHIFT;
> +	unsigned long pfn;
> +	struct page *page;
> +	int i;
> +
> +	/*
> +	 * Wait for buffer data in transit, due to a pipelined
> +	 * move.
> +	 */
> +	ret = ttm_bo_vm_fault_idle(bo, vmf);
> +	if (unlikely(ret != 0))
> +		return ret;
> +
> +	/* Allocate a new dummy page to map the entire VA range of this VMA to it */
> +	page = alloc_page(GFP_KERNEL | __GFP_ZERO);
> +	if (!page)
> +		return VM_FAULT_OOM;
> +
> +	pfn = page_to_pfn(page);
> +
> +	/*
> +	 * Prefault the entire VMA range right away to avoid further faults
> +	 */
> +	for (i = 0; i < num_prefault; ++i) {
> +
> +		if (unlikely(address >= vma->vm_end))
> +			break;
> +
> +		if (vma->vm_flags & VM_MIXEDMAP)
> +			ret = vmf_insert_mixed_prot(vma, address,
> +						    __pfn_to_pfn_t(pfn, PFN_DEV),
> +						    prot);
> +		else
> +			ret = vmf_insert_pfn_prot(vma, address, pfn, prot);
> +
> +		/* Never error on prefaulted PTEs */
> +		if (unlikely((ret & VM_FAULT_ERROR))) {
> +			if (i == 0)
> +				return VM_FAULT_NOPAGE;
> +			else
> +				break;
> +		}
> +
> +		address += PAGE_SIZE;
> +	}
> +
> +	/* Set the page to be freed using drmm release action */
> +	if (drmm_add_action_or_reset(ddev, ttm_bo_release_dummy_page, page))
> +		return VM_FAULT_OOM;
> +
> +	return ret;
> +}
> +EXPORT_SYMBOL(ttm_bo_vm_dummy_page);

I think we can lift this entire thing (once the ttm_bo_vm_fault_idle is
gone) to the drm level, since there is nothing ttm-specific in here. Probably
stuff it into drm_gem.c (but really it's not even gem-specific, it's a fully
generic "replace this vma with dummy pages, please" function).

Aside from this nit, I think the overall approach you have here is starting
to look good. Lots of work & polish, but imo we're getting there and can
start landing stuff soon.
-Daniel
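
To make the suggestion concrete, here is a minimal sketch (illustrative only,
not a tested implementation) of what such a generic helper could look like if
it lived in drm_gem.c. The entry point name drm_gem_vm_dummy_page() is made up
here; the vmf_insert_*() and drmm_add_action_or_reset() calls, and the includes
they need (<drm/drm_drv.h>, <drm/drm_managed.h>, <linux/mm.h>, <linux/pfn_t.h>),
are the same ones the TTM version above already uses:

static void drm_gem_release_dummy_page(struct drm_device *dev, void *res)
{
	__free_page((struct page *)res);
}

/* Back every PTE of vmf->vma with one zeroed page after unplug. */
vm_fault_t drm_gem_vm_dummy_page(struct drm_device *ddev,
				 struct vm_fault *vmf, pgprot_t prot)
{
	struct vm_area_struct *vma = vmf->vma;
	unsigned long address;
	unsigned long pfn;
	struct page *page;

	page = alloc_page(GFP_KERNEL | __GFP_ZERO);
	if (!page)
		return VM_FAULT_OOM;
	pfn = page_to_pfn(page);

	/* Prefault the entire VMA range right away to avoid further faults. */
	for (address = vma->vm_start; address < vma->vm_end;
	     address += PAGE_SIZE) {
		vm_fault_t ret;

		if (vma->vm_flags & VM_MIXEDMAP)
			ret = vmf_insert_mixed_prot(vma, address,
						    __pfn_to_pfn_t(pfn, PFN_DEV),
						    prot);
		else
			ret = vmf_insert_pfn_prot(vma, address, pfn, prot);

		/* Never error on prefaulted PTEs */
		if (ret & VM_FAULT_ERROR)
			break;
	}

	/* Tie the page's lifetime to the drm_device, not the VMA. */
	if (drmm_add_action_or_reset(ddev, drm_gem_release_dummy_page, page))
		return VM_FAULT_OOM;

	return VM_FAULT_NOPAGE;
}

Note that always registering the drmm action before returning also avoids
leaking the page when the very first insertion fails, which the early
VM_FAULT_NOPAGE return in the TTM version above can do.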

> +
>  vm_fault_t ttm_bo_vm_fault(struct vm_fault *vmf)
>  {
>  	struct vm_area_struct *vma = vmf->vma;
>  	pgprot_t prot;
>  	struct ttm_buffer_object *bo = vma->vm_private_data;
> +	struct drm_device *ddev = bo->base.dev;
>  	vm_fault_t ret;
> +	int idx;
>  
>  	ret = ttm_bo_vm_reserve(bo, vmf);
>  	if (ret)
>  		return ret;
>  
>  	prot = vma->vm_page_prot;
> -	ret = ttm_bo_vm_fault_reserved(vmf, prot, TTM_BO_VM_NUM_PREFAULT, 1);
> +	if (drm_dev_enter(ddev, &idx)) {
> +		ret = ttm_bo_vm_fault_reserved(vmf, prot, TTM_BO_VM_NUM_PREFAULT, 1);
> +		drm_dev_exit(idx);
> +	} else {
> +		ret = ttm_bo_vm_dummy_page(vmf, prot);
> +	}
>  	if (ret == VM_FAULT_RETRY && !(vmf->flags & FAULT_FLAG_RETRY_NOWAIT))
>  		return ret;
>  
>  	dma_resv_unlock(bo->base.resv);
>  
>  	return ret;
>  }
>  EXPORT_SYMBOL(ttm_bo_vm_fault);
>  
> diff --git a/include/drm/ttm/ttm_bo_api.h b/include/drm/ttm/ttm_bo_api.h
> index e17be32..12fb240 100644
> --- a/include/drm/ttm/ttm_bo_api.h
> +++ b/include/drm/ttm/ttm_bo_api.h
> @@ -643,4 +643,6 @@ void ttm_bo_vm_close(struct vm_area_struct *vma);
>  int ttm_bo_vm_access(struct vm_area_struct *vma, unsigned long addr,
>  		     void *buf, int len, int write);
>  
> +vm_fault_t ttm_bo_vm_dummy_page(struct vm_fault *vmf, pgprot_t prot);
> +
>  #endif
> -- 
> 2.7.4
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 196+ messages in thread

* Re: [PATCH v4 00/14] RFC Support hot device unplug in amdgpu
  2021-01-18 21:01 ` Andrey Grodzovsky
@ 2021-01-19 14:16   ` Daniel Vetter
  -1 siblings, 0 replies; 196+ messages in thread
From: Daniel Vetter @ 2021-01-19 14:16 UTC (permalink / raw)
  To: Andrey Grodzovsky
  Cc: gregkh, ckoenig.leichtzumerken, dri-devel, amd-gfx,
	daniel.vetter, Alexander.Deucher, yuq825

On Mon, Jan 18, 2021 at 04:01:09PM -0500, Andrey Grodzovsky wrote:
> Until now, extracting a card either by physical extraction (e.g. an eGPU with a
> thunderbolt connection) or by emulation through sysfs -> /sys/bus/pci/devices/device_id/remove
> would cause random crashes in user apps. The random crashes in apps were
> mostly due to an app that had mapped a device-backed BO into its address
> space still trying to access the BO while the backing device was gone.
> To address this first problem, Christian suggested fixing the handling of mapped
> memory in the clients when the device goes away by forcibly unmapping all buffers the
> user processes have, by clearing their respective VMAs mapping the device BOs.
> Then, when the VMAs try to fill in the page tables again, we check in the fault
> handler if the device is removed and, if so, return an error. This will generate a
> SIGBUS to the application, which can then cleanly terminate. This indeed was done,
> but it in turn created a problem of kernel OOPSes, which were due to the
> fact that while the app was terminating because of the SIGBUS, it would trigger a
> use-after-free in the driver by accessing device structures that were already
> released from the PCI remove sequence. This was handled by introducing a 'flush'
> sequence during device removal, where we wait for the drm file reference to drop to 0,
> meaning all user clients directly using this device have terminated.
> 
> v2:
> Based on discussions on the mailing list with Daniel and Pekka [1], and based on the document
> produced by Pekka from those discussions [2], the whole approach of returning SIGBUS and
> waiting for all user clients that have CPU mappings of device BOs to die was dropped.
> Instead, as per the document's suggestion, the device structures are kept alive until
> the last reference to the device is dropped by a user client, and in the meanwhile all
> existing and new CPU mappings of the BOs belonging to the device, directly or by dma-buf
> import, are rerouted to a per-user-process dummy rw page. Also, I skipped the
> 'Requirements for KMS UAPI' section of [2], since I am trying to get the minimal set of
> requirements that still gives a useful solution to work, and that is the 'Requirements
> for Render and Cross-Device UAPI' section; so my test case is removing a secondary
> device, which is render-only and is not involved in KMS.
> 
> v3:
> More updates following comments from v2, such as removing the loop to find the DRM file
> when rerouting page faults to the dummy page, getting rid of unnecessary sysfs handling
> refactoring, and moving prevention of GPU recovery post device unplug from amdgpu to the
> scheduler layer. On top of that, added unplug support for IOMMU-enabled systems.
> 
> v4:
> Drop the last sysfs hack and use a sysfs default attribute.
> Guard against write accesses after device removal to avoid modifying released memory.
> Update dummy page handling to on-demand allocation and release through the drm managed framework.
> Add a return value to the scheduler job timeout handler (by Luben Tuikov) and use it in amdgpu
> to prevent GPU recovery post device unplug.
> Also rebase on top of drm-misc-next instead of amd-staging-drm-next.
> 
> With these patches I am able to gracefully remove the secondary card using the sysfs remove hook
> while glxgears is running off of the secondary card (DRI_PRIME=1), without kernel oopses or hangs,
> and to keep working with the primary card or soft-reset the device without hangs or oopses.
> 
> TODOs for followup work:
> Convert AMDGPU code to use devm (for hw stuff) and drmm (for sw stuff and allocations) (Daniel)
> Support plugging the secondary device back after unplug - currently still experiencing HW error on plugging back.
> Add support for 'Requirements for KMS UAPI' section of [2] - unplugging primary, display connected card.
> 
> [1] - Discussions during v3 of the patchset https://www.spinics.net/lists/amd-gfx/msg55576.html
> [2] - drm/doc: device hot-unplug for userspace https://www.spinics.net/lists/dri-devel/msg259755.html
> [3] - Related gitlab ticket https://gitlab.freedesktop.org/drm/amd/-/issues/1081

btw have you tried this out with some of the igts we have? core_hotunplug
is the one I'm thinking of. Might be worth extending this with amdgpu-specific
stuff (like running some batches on it while hotunplugging); see the sketch below.

Since there are so many corner cases we need to test here (shared dma-buf,
shared dma_fence), I think it would make sense to have a shared testcase
across drivers. The only driver-specific thing would be some hooks to keep
the gpu busy in some fashion while we yank the driver. But just to get it
started you can throw in entirely amdgpu-specific subtests and just share
some of the test code.
-Daniel
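
Just to illustrate the shape of such a test, here is a rough standalone
sketch -- this is not actual IGT code, the device node and sysfs paths are
examples, and a real test would submit actual batches instead of looping on
a version ioctl (build with -ldrm -lpthread):

#include <errno.h>
#include <fcntl.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <xf86drm.h>

#define RENDER_NODE  "/dev/dri/renderD129"                        /* example */
#define SYSFS_REMOVE "/sys/bus/pci/devices/0000:05:00.0/remove"   /* example */

static volatile int stop;

static void *hammer(void *arg)
{
	int fd = *(int *)arg;

	/* Stand-in for real GPU work; a full test would submit batches. */
	while (!stop) {
		drmVersionPtr v = drmGetVersion(fd);

		if (!v)
			break;	/* device gone, ioctls now fail */
		drmFreeVersion(v);
	}
	return NULL;
}

int main(void)
{
	pthread_t thr;
	int fd, sysfs;

	fd = open(RENDER_NODE, O_RDWR);
	if (fd < 0) {
		perror("open render node");
		return 1;
	}
	pthread_create(&thr, NULL, hammer, &fd);

	/* Yank the device while the other thread is hammering it. */
	sysfs = open(SYSFS_REMOVE, O_WRONLY);
	if (sysfs < 0 || write(sysfs, "1", 1) != 1) {
		perror("sysfs remove");
		return 1;
	}
	close(sysfs);

	sleep(1);
	stop = 1;
	pthread_join(thr, NULL);

	/* Post-unplug accesses must fail cleanly, not oops the kernel. */
	printf("post-unplug ioctl: %s\n",
	       drmGetVersion(fd) ? "unexpectedly still works" : strerror(errno));
	close(fd);
	return 0;
}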

> 
> Andrey Grodzovsky (13):
>   drm/ttm: Remap all page faults to per process dummy page.
>   drm: Unmap the entire device address space on device unplug
>   drm/ttm: Expose ttm_tt_unpopulate for driver use
>   drm/sched: Cancel and flush all outstanding jobs before finish.
>   drm/amdgpu: Split amdgpu_device_fini into early and late
>   drm/amdgpu: Add early fini callback
>   drm/amdgpu: Register IOMMU topology notifier per device.
>   drm/amdgpu: Fix a bunch of sdma code crash post device unplug
>   drm/amdgpu: Remap all page faults to per process dummy page.
>   drm/amdgpu: Move some sysfs attrs creation to default_attr
>   drm/amdgpu: Guard against write accesses after device removal
>   drm/sched: Make timeout timer rearm conditional.
>   drm/amdgpu: Prevent any job recoveries after device is unplugged.
> 
> Luben Tuikov (1):
>   drm/scheduler: Job timeout handler returns status
> 
>  drivers/gpu/drm/amd/amdgpu/amdgpu.h               |  11 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c      |  17 +--
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c        | 149 ++++++++++++++++++++--
>  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c           |  20 ++-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c         |  15 ++-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c          |   2 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h          |   1 +
>  drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c           |   9 ++
>  drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c       |  25 ++--
>  drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c           |  26 ++--
>  drivers/gpu/drm/amd/amdgpu/amdgpu_irq.h           |   3 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_job.c           |  19 ++-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c           |  12 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_object.c        |  10 ++
>  drivers/gpu/drm/amd/amdgpu/amdgpu_object.h        |   2 +
>  drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c           |  53 +++++---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h           |   3 +
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c           |   1 +
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c          |  70 ++++++++++
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h          |  52 +-------
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c           |  21 ++-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c            |   8 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c      |  14 +-
>  drivers/gpu/drm/amd/amdgpu/cik_ih.c               |   2 +-
>  drivers/gpu/drm/amd/amdgpu/cz_ih.c                |   2 +-
>  drivers/gpu/drm/amd/amdgpu/iceland_ih.c           |   2 +-
>  drivers/gpu/drm/amd/amdgpu/navi10_ih.c            |   2 +-
>  drivers/gpu/drm/amd/amdgpu/psp_v11_0.c            |  16 +--
>  drivers/gpu/drm/amd/amdgpu/psp_v12_0.c            |   8 +-
>  drivers/gpu/drm/amd/amdgpu/psp_v3_1.c             |   8 +-
>  drivers/gpu/drm/amd/amdgpu/si_ih.c                |   2 +-
>  drivers/gpu/drm/amd/amdgpu/tonga_ih.c             |   2 +-
>  drivers/gpu/drm/amd/amdgpu/vega10_ih.c            |   2 +-
>  drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c |  12 +-
>  drivers/gpu/drm/amd/include/amd_shared.h          |   2 +
>  drivers/gpu/drm/drm_drv.c                         |   3 +
>  drivers/gpu/drm/etnaviv/etnaviv_sched.c           |  10 +-
>  drivers/gpu/drm/lima/lima_sched.c                 |   4 +-
>  drivers/gpu/drm/panfrost/panfrost_job.c           |   9 +-
>  drivers/gpu/drm/scheduler/sched_main.c            |  18 ++-
>  drivers/gpu/drm/ttm/ttm_bo_vm.c                   |  82 +++++++++++-
>  drivers/gpu/drm/ttm/ttm_tt.c                      |   1 +
>  drivers/gpu/drm/v3d/v3d_sched.c                   |  32 ++---
>  include/drm/gpu_scheduler.h                       |  17 ++-
>  include/drm/ttm/ttm_bo_api.h                      |   2 +
>  45 files changed, 583 insertions(+), 198 deletions(-)
> 
> -- 
> 2.7.4
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 196+ messages in thread

* Re: [PATCH v4 11/14] drm/amdgpu: Guard against write accesses after device removal
  2021-01-19  8:55     ` Christian König
@ 2021-01-19 15:35       ` Andrey Grodzovsky
  -1 siblings, 0 replies; 196+ messages in thread
From: Andrey Grodzovsky @ 2021-01-19 15:35 UTC (permalink / raw)
  To: christian.koenig, amd-gfx, dri-devel, daniel.vetter, robh,
	l.stach, yuq825, eric
  Cc: Alexander.Deucher, gregkh

There is really no other way, according to this article:
https://lwn.net/Articles/767885/

"A perfect solution seems nearly impossible though; we cannot acquire a mutex
on the user to prevent them from yanking a device and we cannot check for a
presence change after every device access for performance reasons."

But I assumed srcu_read_lock should be pretty seamless performance-wise, no?
The other solution would be, as I suggested, to keep all the device IO ranges
reserved and the system memory pages unfreed until the device is finalized in
the driver, but Daniel said this would upset the PCI layer (the MMIO ranges
reservation part).
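
For what it's worth, the read side of that guard really is just SRCU.
Roughly what drm_dev_enter()/drm_dev_exit() do in drm_drv.c (paraphrased
from memory, details may differ between kernel versions):

bool drm_dev_enter(struct drm_device *dev, int *idx)
{
	*idx = srcu_read_lock(&drm_unplug_srcu);

	if (dev->unplugged) {
		srcu_read_unlock(&drm_unplug_srcu, *idx);
		return false;
	}

	return true;
}

void drm_dev_exit(int idx)
{
	srcu_read_unlock(&drm_unplug_srcu, idx);
}

srcu_read_lock() is a per-CPU counter increment with no contended cache
lines, so the register-access hot path only pays for that plus the
dev->unplugged load; the expensive synchronize_srcu() happens once, on
the unplug path in drm_dev_unplug().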

Andrey




On 1/19/21 3:55 AM, Christian König wrote:
> Am 18.01.21 um 22:01 schrieb Andrey Grodzovsky:
>> This should prevent writing to memory or IO ranges possibly
>> already allocated for other uses after our device is removed.
>
> Wow, that adds quite some overhead to every register access. I'm not sure we 
> can do this.
>
> Christian.
>
>>
>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 57 ++++++++++++++++++++++++
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c    |  9 ++++
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c    | 53 +++++++++++++---------
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h    |  3 ++
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c   | 70 ++++++++++++++++++++++++++++++
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h   | 49 ++-------------------
>>   drivers/gpu/drm/amd/amdgpu/psp_v11_0.c     | 16 ++-----
>>   drivers/gpu/drm/amd/amdgpu/psp_v12_0.c     |  8 +---
>>   drivers/gpu/drm/amd/amdgpu/psp_v3_1.c      |  8 +---
>>   9 files changed, 184 insertions(+), 89 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> index e99f4f1..0a9d73c 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> @@ -72,6 +72,8 @@
>>     #include <linux/iommu.h>
>>   +#include <drm/drm_drv.h>
>> +
>>   MODULE_FIRMWARE("amdgpu/vega10_gpu_info.bin");
>>   MODULE_FIRMWARE("amdgpu/vega12_gpu_info.bin");
>>   MODULE_FIRMWARE("amdgpu/raven_gpu_info.bin");
>> @@ -404,13 +406,21 @@ uint8_t amdgpu_mm_rreg8(struct amdgpu_device *adev, 
>> uint32_t offset)
>>    */
>>   void amdgpu_mm_wreg8(struct amdgpu_device *adev, uint32_t offset, uint8_t 
>> value)
>>   {
>> +    int idx;
>> +
>>       if (adev->in_pci_err_recovery)
>>           return;
>>   +
>> +    if (!drm_dev_enter(&adev->ddev, &idx))
>> +        return;
>> +
>>       if (offset < adev->rmmio_size)
>>           writeb(value, adev->rmmio + offset);
>>       else
>>           BUG();
>> +
>> +    drm_dev_exit(idx);
>>   }
>>     /**
>> @@ -427,9 +437,14 @@ void amdgpu_device_wreg(struct amdgpu_device *adev,
>>               uint32_t reg, uint32_t v,
>>               uint32_t acc_flags)
>>   {
>> +    int idx;
>> +
>>       if (adev->in_pci_err_recovery)
>>           return;
>>   +    if (!drm_dev_enter(&adev->ddev, &idx))
>> +        return;
>> +
>>       if ((reg * 4) < adev->rmmio_size) {
>>           if (!(acc_flags & AMDGPU_REGS_NO_KIQ) &&
>>               amdgpu_sriov_runtime(adev) &&
>> @@ -444,6 +459,8 @@ void amdgpu_device_wreg(struct amdgpu_device *adev,
>>       }
>>         trace_amdgpu_device_wreg(adev->pdev->device, reg, v);
>> +
>> +    drm_dev_exit(idx);
>>   }
>>     /*
>> @@ -454,9 +471,14 @@ void amdgpu_device_wreg(struct amdgpu_device *adev,
>>   void amdgpu_mm_wreg_mmio_rlc(struct amdgpu_device *adev,
>>                    uint32_t reg, uint32_t v)
>>   {
>> +    int idx;
>> +
>>       if (adev->in_pci_err_recovery)
>>           return;
>>   +    if (!drm_dev_enter(&adev->ddev, &idx))
>> +        return;
>> +
>>       if (amdgpu_sriov_fullaccess(adev) &&
>>           adev->gfx.rlc.funcs &&
>>           adev->gfx.rlc.funcs->is_rlcg_access_range) {
>> @@ -465,6 +487,8 @@ void amdgpu_mm_wreg_mmio_rlc(struct amdgpu_device *adev,
>>       } else {
>>           writel(v, ((void __iomem *)adev->rmmio) + (reg * 4));
>>       }
>> +
>> +    drm_dev_exit(idx);
>>   }
>>     /**
>> @@ -499,15 +523,22 @@ u32 amdgpu_io_rreg(struct amdgpu_device *adev, u32 reg)
>>    */
>>   void amdgpu_io_wreg(struct amdgpu_device *adev, u32 reg, u32 v)
>>   {
>> +    int idx;
>> +
>>       if (adev->in_pci_err_recovery)
>>           return;
>>   +    if (!drm_dev_enter(&adev->ddev, &idx))
>> +        return;
>> +
>>       if ((reg * 4) < adev->rio_mem_size)
>>           iowrite32(v, adev->rio_mem + (reg * 4));
>>       else {
>>           iowrite32((reg * 4), adev->rio_mem + (mmMM_INDEX * 4));
>>           iowrite32(v, adev->rio_mem + (mmMM_DATA * 4));
>>       }
>> +
>> +    drm_dev_exit(idx);
>>   }
>>     /**
>> @@ -544,14 +575,21 @@ u32 amdgpu_mm_rdoorbell(struct amdgpu_device *adev, u32 
>> index)
>>    */
>>   void amdgpu_mm_wdoorbell(struct amdgpu_device *adev, u32 index, u32 v)
>>   {
>> +    int idx;
>> +
>>       if (adev->in_pci_err_recovery)
>>           return;
>>   +    if (!drm_dev_enter(&adev->ddev, &idx))
>> +        return;
>> +
>>       if (index < adev->doorbell.num_doorbells) {
>>           writel(v, adev->doorbell.ptr + index);
>>       } else {
>>           DRM_ERROR("writing beyond doorbell aperture: 0x%08x!\n", index);
>>       }
>> +
>> +    drm_dev_exit(idx);
>>   }
>>     /**
>> @@ -588,14 +626,21 @@ u64 amdgpu_mm_rdoorbell64(struct amdgpu_device *adev, 
>> u32 index)
>>    */
>>   void amdgpu_mm_wdoorbell64(struct amdgpu_device *adev, u32 index, u64 v)
>>   {
>> +    int idx;
>> +
>>       if (adev->in_pci_err_recovery)
>>           return;
>>   +    if (!drm_dev_enter(&adev->ddev, &idx))
>> +        return;
>> +
>>       if (index < adev->doorbell.num_doorbells) {
>>           atomic64_set((atomic64_t *)(adev->doorbell.ptr + index), v);
>>       } else {
>>           DRM_ERROR("writing beyond doorbell aperture: 0x%08x!\n", index);
>>       }
>> +
>> +    drm_dev_exit(idx);
>>   }
>>     /**
>> @@ -682,6 +727,10 @@ void amdgpu_device_indirect_wreg(struct amdgpu_device 
>> *adev,
>>       unsigned long flags;
>>       void __iomem *pcie_index_offset;
>>       void __iomem *pcie_data_offset;
>> +    int idx;
>> +
>> +    if (!drm_dev_enter(&adev->ddev, &idx))
>> +        return;
>>         spin_lock_irqsave(&adev->pcie_idx_lock, flags);
>>       pcie_index_offset = (void __iomem *)adev->rmmio + pcie_index * 4;
>> @@ -692,6 +741,8 @@ void amdgpu_device_indirect_wreg(struct amdgpu_device *adev,
>>       writel(reg_data, pcie_data_offset);
>>       readl(pcie_data_offset);
>>       spin_unlock_irqrestore(&adev->pcie_idx_lock, flags);
>> +
>> +    drm_dev_exit(idx);
>>   }
>>     /**
>> @@ -711,6 +762,10 @@ void amdgpu_device_indirect_wreg64(struct amdgpu_device 
>> *adev,
>>       unsigned long flags;
>>       void __iomem *pcie_index_offset;
>>       void __iomem *pcie_data_offset;
>> +    int idx;
>> +
>> +    if (!drm_dev_enter(&adev->ddev, &idx))
>> +        return;
>>         spin_lock_irqsave(&adev->pcie_idx_lock, flags);
>>       pcie_index_offset = (void __iomem *)adev->rmmio + pcie_index * 4;
>> @@ -727,6 +782,8 @@ void amdgpu_device_indirect_wreg64(struct amdgpu_device 
>> *adev,
>>       writel((u32)(reg_data >> 32), pcie_data_offset);
>>       readl(pcie_data_offset);
>>       spin_unlock_irqrestore(&adev->pcie_idx_lock, flags);
>> +
>> +    drm_dev_exit(idx);
>>   }
>>     /**
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
>> index fe1a39f..1beb4e6 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
>> @@ -31,6 +31,8 @@
>>   #include "amdgpu_ras.h"
>>   #include "amdgpu_xgmi.h"
>>   +#include <drm/drm_drv.h>
>> +
>>   /**
>>    * amdgpu_gmc_get_pde_for_bo - get the PDE for a BO
>>    *
>> @@ -98,6 +100,10 @@ int amdgpu_gmc_set_pte_pde(struct amdgpu_device *adev, 
>> void *cpu_pt_addr,
>>   {
>>       void __iomem *ptr = (void *)cpu_pt_addr;
>>       uint64_t value;
>> +    int idx;
>> +
>> +    if (!drm_dev_enter(&adev->ddev, &idx))
>> +        return 0;
>>         /*
>>        * The following is for PTE only. GART does not have PDEs.
>> @@ -105,6 +111,9 @@ int amdgpu_gmc_set_pte_pde(struct amdgpu_device *adev, 
>> void *cpu_pt_addr,
>>       value = addr & 0x0000FFFFFFFFF000ULL;
>>       value |= flags;
>>       writeq(value, ptr + (gpu_page_idx * 8));
>> +
>> +    drm_dev_exit(idx);
>> +
>>       return 0;
>>   }
>>   diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
>> index 523d22d..89e2bfe 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
>> @@ -37,6 +37,8 @@
>>     #include "amdgpu_ras.h"
>>   +#include <drm/drm_drv.h>
>> +
>>   static int psp_sysfs_init(struct amdgpu_device *adev);
>>   static void psp_sysfs_fini(struct amdgpu_device *adev);
>>   @@ -248,7 +250,7 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>              struct psp_gfx_cmd_resp *cmd, uint64_t fence_mc_addr)
>>   {
>>       int ret;
>> -    int index;
>> +    int index, idx;
>>       int timeout = 2000;
>>       bool ras_intr = false;
>>       bool skip_unsupport = false;
>> @@ -256,6 +258,9 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>       if (psp->adev->in_pci_err_recovery)
>>           return 0;
>>   +    if (!drm_dev_enter(&psp->adev->ddev, &idx))
>> +        return 0;
>> +
>>       mutex_lock(&psp->mutex);
>>         memset(psp->cmd_buf_mem, 0, PSP_CMD_BUFFER_SIZE);
>> @@ -266,8 +271,7 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>       ret = psp_ring_cmd_submit(psp, psp->cmd_buf_mc_addr, fence_mc_addr, 
>> index);
>>       if (ret) {
>>           atomic_dec(&psp->fence_value);
>> -        mutex_unlock(&psp->mutex);
>> -        return ret;
>> +        goto exit;
>>       }
>>         amdgpu_asic_invalidate_hdp(psp->adev, NULL);
>> @@ -307,8 +311,8 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>                psp->cmd_buf_mem->cmd_id,
>>                psp->cmd_buf_mem->resp.status);
>>           if (!timeout) {
>> -            mutex_unlock(&psp->mutex);
>> -            return -EINVAL;
>> +            ret = -EINVAL;
>> +            goto exit;
>>           }
>>       }
>>   @@ -316,8 +320,10 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>           ucode->tmr_mc_addr_lo = psp->cmd_buf_mem->resp.fw_addr_lo;
>>           ucode->tmr_mc_addr_hi = psp->cmd_buf_mem->resp.fw_addr_hi;
>>       }
>> -    mutex_unlock(&psp->mutex);
>>   +exit:
>> +    mutex_unlock(&psp->mutex);
>> +    drm_dev_exit(idx);
>>       return ret;
>>   }
>>   @@ -354,8 +360,7 @@ static int psp_load_toc(struct psp_context *psp,
>>       if (!cmd)
>>           return -ENOMEM;
>>       /* Copy toc to psp firmware private buffer */
>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>> -    memcpy(psp->fw_pri_buf, psp->toc_start_addr, psp->toc_bin_size);
>> +    psp_copy_fw(psp, psp->toc_start_addr, psp->toc_bin_size);
>>         psp_prep_load_toc_cmd_buf(cmd, psp->fw_pri_mc_addr, psp->toc_bin_size);
>>   @@ -570,8 +575,7 @@ static int psp_asd_load(struct psp_context *psp)
>>       if (!cmd)
>>           return -ENOMEM;
>>   -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>> -    memcpy(psp->fw_pri_buf, psp->asd_start_addr, psp->asd_ucode_size);
>> +    psp_copy_fw(psp, psp->asd_start_addr, psp->asd_ucode_size);
>>         psp_prep_asd_load_cmd_buf(cmd, psp->fw_pri_mc_addr,
>>                     psp->asd_ucode_size);
>> @@ -726,8 +730,7 @@ static int psp_xgmi_load(struct psp_context *psp)
>>       if (!cmd)
>>           return -ENOMEM;
>>   -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>> -    memcpy(psp->fw_pri_buf, psp->ta_xgmi_start_addr, psp->ta_xgmi_ucode_size);
>> +    psp_copy_fw(psp, psp->ta_xgmi_start_addr, psp->ta_xgmi_ucode_size);
>>         psp_prep_ta_load_cmd_buf(cmd,
>>                    psp->fw_pri_mc_addr,
>> @@ -982,8 +985,7 @@ static int psp_ras_load(struct psp_context *psp)
>>       if (!cmd)
>>           return -ENOMEM;
>>   -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>> -    memcpy(psp->fw_pri_buf, psp->ta_ras_start_addr, psp->ta_ras_ucode_size);
>> +    psp_copy_fw(psp, psp->ta_ras_start_addr, psp->ta_ras_ucode_size);
>>         psp_prep_ta_load_cmd_buf(cmd,
>>                    psp->fw_pri_mc_addr,
>> @@ -1219,8 +1221,7 @@ static int psp_hdcp_load(struct psp_context *psp)
>>       if (!cmd)
>>           return -ENOMEM;
>>   -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>> -    memcpy(psp->fw_pri_buf, psp->ta_hdcp_start_addr,
>> +    psp_copy_fw(psp, psp->ta_hdcp_start_addr,
>>              psp->ta_hdcp_ucode_size);
>>         psp_prep_ta_load_cmd_buf(cmd,
>> @@ -1366,8 +1367,7 @@ static int psp_dtm_load(struct psp_context *psp)
>>       if (!cmd)
>>           return -ENOMEM;
>>   -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>> -    memcpy(psp->fw_pri_buf, psp->ta_dtm_start_addr, psp->ta_dtm_ucode_size);
>> +    psp_copy_fw(psp, psp->ta_dtm_start_addr, psp->ta_dtm_ucode_size);
>>         psp_prep_ta_load_cmd_buf(cmd,
>>                    psp->fw_pri_mc_addr,
>> @@ -1507,8 +1507,7 @@ static int psp_rap_load(struct psp_context *psp)
>>       if (!cmd)
>>           return -ENOMEM;
>>   -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>> -    memcpy(psp->fw_pri_buf, psp->ta_rap_start_addr, psp->ta_rap_ucode_size);
>> +    psp_copy_fw(psp, psp->ta_rap_start_addr, psp->ta_rap_ucode_size);
>>         psp_prep_ta_load_cmd_buf(cmd,
>>                    psp->fw_pri_mc_addr,
>> @@ -2778,6 +2777,20 @@ static ssize_t psp_usbc_pd_fw_sysfs_write(struct 
>> device *dev,
>>       return count;
>>   }
>>   +void psp_copy_fw(struct psp_context *psp, uint8_t *start_addr, uint32_t 
>> bin_size)
>> +{
>> +    int idx;
>> +
>> +    if (!drm_dev_enter(&psp->adev->ddev, &idx))
>> +        return;
>> +
>> +    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>> +    memcpy(psp->fw_pri_buf, start_addr, bin_size);
>> +
>> +    drm_dev_exit(idx);
>> +}
>> +
>> +
>>   static DEVICE_ATTR(usbc_pd_fw, S_IRUGO | S_IWUSR,
>>              psp_usbc_pd_fw_sysfs_read,
>>              psp_usbc_pd_fw_sysfs_write);
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
>> index da250bc..ac69314 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
>> @@ -400,4 +400,7 @@ int psp_init_ta_microcode(struct psp_context *psp,
>>                 const char *chip_name);
>>   int psp_get_fw_attestation_records_addr(struct psp_context *psp,
>>                       uint64_t *output_ptr);
>> +
>> +void psp_copy_fw(struct psp_context *psp, uint8_t *start_addr, uint32_t 
>> bin_size);
>> +
>>   #endif
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>> index 1a612f5..d656494 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>> @@ -35,6 +35,8 @@
>>   #include "amdgpu.h"
>>   #include "atom.h"
>>   +#include <drm/drm_drv.h>
>> +
>>   /*
>>    * Rings
>>    * Most engines on the GPU are fed via ring buffers.  Ring
>> @@ -463,3 +465,71 @@ int amdgpu_ring_test_helper(struct amdgpu_ring *ring)
>>       ring->sched.ready = !r;
>>       return r;
>>   }
>> +
>> +void amdgpu_ring_clear_ring(struct amdgpu_ring *ring)
>> +{
>> +    int idx;
>> +    int i = 0;
>> +
>> +    if (!drm_dev_enter(&ring->adev->ddev, &idx))
>> +        return;
>> +
>> +    while (i <= ring->buf_mask)
>> +        ring->ring[i++] = ring->funcs->nop;
>> +
>> +    drm_dev_exit(idx);
>> +
>> +}
>> +
>> +void amdgpu_ring_write(struct amdgpu_ring *ring, uint32_t v)
>> +{
>> +    int idx;
>> +
>> +    if (!drm_dev_enter(&ring->adev->ddev, &idx))
>> +        return;
>> +
>> +    if (ring->count_dw <= 0)
>> +        DRM_ERROR("amdgpu: writing more dwords to the ring than expected!\n");
>> +    ring->ring[ring->wptr++ & ring->buf_mask] = v;
>> +    ring->wptr &= ring->ptr_mask;
>> +    ring->count_dw--;
>> +
>> +    drm_dev_exit(idx);
>> +}
>> +
>> +void amdgpu_ring_write_multiple(struct amdgpu_ring *ring,
>> +                          void *src, int count_dw)
>> +{
>> +    unsigned occupied, chunk1, chunk2;
>> +    void *dst;
>> +    int idx;
>> +
>> +    if (!drm_dev_enter(&ring->adev->ddev, &idx))
>> +        return;
>> +
>> +    if (unlikely(ring->count_dw < count_dw))
>> +        DRM_ERROR("amdgpu: writing more dwords to the ring than expected!\n");
>> +
>> +    occupied = ring->wptr & ring->buf_mask;
>> +    dst = (void *)&ring->ring[occupied];
>> +    chunk1 = ring->buf_mask + 1 - occupied;
>> +    chunk1 = (chunk1 >= count_dw) ? count_dw: chunk1;
>> +    chunk2 = count_dw - chunk1;
>> +    chunk1 <<= 2;
>> +    chunk2 <<= 2;
>> +
>> +    if (chunk1)
>> +        memcpy(dst, src, chunk1);
>> +
>> +    if (chunk2) {
>> +        src += chunk1;
>> +        dst = (void *)ring->ring;
>> +        memcpy(dst, src, chunk2);
>> +    }
>> +
>> +    ring->wptr += count_dw;
>> +    ring->wptr &= ring->ptr_mask;
>> +    ring->count_dw -= count_dw;
>> +
>> +    drm_dev_exit(idx);
>> +}
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>> index accb243..f90b81f 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>> @@ -300,53 +300,12 @@ static inline void 
>> amdgpu_ring_set_preempt_cond_exec(struct amdgpu_ring *ring,
>>       *ring->cond_exe_cpu_addr = cond_exec;
>>   }
>>   -static inline void amdgpu_ring_clear_ring(struct amdgpu_ring *ring)
>> -{
>> -    int i = 0;
>> -    while (i <= ring->buf_mask)
>> -        ring->ring[i++] = ring->funcs->nop;
>> -
>> -}
>> -
>> -static inline void amdgpu_ring_write(struct amdgpu_ring *ring, uint32_t v)
>> -{
>> -    if (ring->count_dw <= 0)
>> -        DRM_ERROR("amdgpu: writing more dwords to the ring than expected!\n");
>> -    ring->ring[ring->wptr++ & ring->buf_mask] = v;
>> -    ring->wptr &= ring->ptr_mask;
>> -    ring->count_dw--;
>> -}
>> +void amdgpu_ring_clear_ring(struct amdgpu_ring *ring);
>>   -static inline void amdgpu_ring_write_multiple(struct amdgpu_ring *ring,
>> -                          void *src, int count_dw)
>> -{
>> -    unsigned occupied, chunk1, chunk2;
>> -    void *dst;
>> -
>> -    if (unlikely(ring->count_dw < count_dw))
>> -        DRM_ERROR("amdgpu: writing more dwords to the ring than expected!\n");
>> -
>> -    occupied = ring->wptr & ring->buf_mask;
>> -    dst = (void *)&ring->ring[occupied];
>> -    chunk1 = ring->buf_mask + 1 - occupied;
>> -    chunk1 = (chunk1 >= count_dw) ? count_dw: chunk1;
>> -    chunk2 = count_dw - chunk1;
>> -    chunk1 <<= 2;
>> -    chunk2 <<= 2;
>> +void amdgpu_ring_write(struct amdgpu_ring *ring, uint32_t v);
>>   -    if (chunk1)
>> -        memcpy(dst, src, chunk1);
>> -
>> -    if (chunk2) {
>> -        src += chunk1;
>> -        dst = (void *)ring->ring;
>> -        memcpy(dst, src, chunk2);
>> -    }
>> -
>> -    ring->wptr += count_dw;
>> -    ring->wptr &= ring->ptr_mask;
>> -    ring->count_dw -= count_dw;
>> -}
>> +void amdgpu_ring_write_multiple(struct amdgpu_ring *ring,
>> +                          void *src, int count_dw);
>>     int amdgpu_ring_test_helper(struct amdgpu_ring *ring);
>>   diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c 
>> b/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
>> index bd4248c..b3ce5be 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
>> @@ -269,10 +269,8 @@ static int psp_v11_0_bootloader_load_kdb(struct 
>> psp_context *psp)
>>       if (ret)
>>           return ret;
>>   -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>> -
>>       /* Copy PSP KDB binary to memory */
>> -    memcpy(psp->fw_pri_buf, psp->kdb_start_addr, psp->kdb_bin_size);
>> +    psp_copy_fw(psp, psp->kdb_start_addr, psp->kdb_bin_size);
>>         /* Provide the PSP KDB to bootloader */
>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>> @@ -302,10 +300,8 @@ static int psp_v11_0_bootloader_load_spl(struct 
>> psp_context *psp)
>>       if (ret)
>>           return ret;
>>   -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>> -
>>       /* Copy PSP SPL binary to memory */
>> -    memcpy(psp->fw_pri_buf, psp->spl_start_addr, psp->spl_bin_size);
>> +    psp_copy_fw(psp, psp->spl_start_addr, psp->spl_bin_size);
>>         /* Provide the PSP SPL to bootloader */
>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>> @@ -335,10 +331,8 @@ static int psp_v11_0_bootloader_load_sysdrv(struct 
>> psp_context *psp)
>>       if (ret)
>>           return ret;
>>   -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>> -
>>       /* Copy PSP System Driver binary to memory */
>> -    memcpy(psp->fw_pri_buf, psp->sys_start_addr, psp->sys_bin_size);
>> +    psp_copy_fw(psp, psp->sys_start_addr, psp->sys_bin_size);
>>         /* Provide the sys driver to bootloader */
>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>> @@ -371,10 +365,8 @@ static int psp_v11_0_bootloader_load_sos(struct 
>> psp_context *psp)
>>       if (ret)
>>           return ret;
>>   -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>> -
>>       /* Copy Secure OS binary to PSP memory */
>> -    memcpy(psp->fw_pri_buf, psp->sos_start_addr, psp->sos_bin_size);
>> +    psp_copy_fw(psp, psp->sos_start_addr, psp->sos_bin_size);
>>         /* Provide the PSP secure OS to bootloader */
>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>> diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c 
>> b/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
>> index c4828bd..618e5b6 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
>> @@ -138,10 +138,8 @@ static int psp_v12_0_bootloader_load_sysdrv(struct 
>> psp_context *psp)
>>       if (ret)
>>           return ret;
>>   -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>> -
>>       /* Copy PSP System Driver binary to memory */
>> -    memcpy(psp->fw_pri_buf, psp->sys_start_addr, psp->sys_bin_size);
>> +    psp_copy_fw(psp, psp->sys_start_addr, psp->sys_bin_size);
>>         /* Provide the sys driver to bootloader */
>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>> @@ -179,10 +177,8 @@ static int psp_v12_0_bootloader_load_sos(struct 
>> psp_context *psp)
>>       if (ret)
>>           return ret;
>>   -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>> -
>>       /* Copy Secure OS binary to PSP memory */
>> -    memcpy(psp->fw_pri_buf, psp->sos_start_addr, psp->sos_bin_size);
>> +    psp_copy_fw(psp, psp->sos_start_addr, psp->sos_bin_size);
>>         /* Provide the PSP secure OS to bootloader */
>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>> diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c 
>> b/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
>> index f2e725f..d0a6cccd 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
>> @@ -102,10 +102,8 @@ static int psp_v3_1_bootloader_load_sysdrv(struct 
>> psp_context *psp)
>>       if (ret)
>>           return ret;
>>   -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>> -
>>       /* Copy PSP System Driver binary to memory */
>> -    memcpy(psp->fw_pri_buf, psp->sys_start_addr, psp->sys_bin_size);
>> +    psp_copy_fw(psp, psp->sys_start_addr, psp->sys_bin_size);
>>         /* Provide the sys driver to bootloader */
>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>> @@ -143,10 +141,8 @@ static int psp_v3_1_bootloader_load_sos(struct 
>> psp_context *psp)
>>       if (ret)
>>           return ret;
>>   -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>> -
>>       /* Copy Secure OS binary to PSP memory */
>> -    memcpy(psp->fw_pri_buf, psp->sos_start_addr, psp->sos_bin_size);
>> +    psp_copy_fw(psp, psp->sos_start_addr, psp->sos_bin_size);
>>         /* Provide the PSP secure OS to bootloader */
>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 196+ messages in thread

>> +        return;
>>         spin_lock_irqsave(&adev->pcie_idx_lock, flags);
>>       pcie_index_offset = (void __iomem *)adev->rmmio + pcie_index * 4;
>> @@ -727,6 +782,8 @@ void amdgpu_device_indirect_wreg64(struct amdgpu_device 
>> *adev,
>>       writel((u32)(reg_data >> 32), pcie_data_offset);
>>       readl(pcie_data_offset);
>>       spin_unlock_irqrestore(&adev->pcie_idx_lock, flags);
>> +
>> +    drm_dev_exit(idx);
>>   }
>>     /**
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
>> index fe1a39f..1beb4e6 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
>> @@ -31,6 +31,8 @@
>>   #include "amdgpu_ras.h"
>>   #include "amdgpu_xgmi.h"
>>   +#include <drm/drm_drv.h>
>> +
>>   /**
>>    * amdgpu_gmc_get_pde_for_bo - get the PDE for a BO
>>    *
>> @@ -98,6 +100,10 @@ int amdgpu_gmc_set_pte_pde(struct amdgpu_device *adev, 
>> void *cpu_pt_addr,
>>   {
>>       void __iomem *ptr = (void *)cpu_pt_addr;
>>       uint64_t value;
>> +    int idx;
>> +
>> +    if (!drm_dev_enter(&adev->ddev, &idx))
>> +        return 0;
>>         /*
>>        * The following is for PTE only. GART does not have PDEs.
>> @@ -105,6 +111,9 @@ int amdgpu_gmc_set_pte_pde(struct amdgpu_device *adev, 
>> void *cpu_pt_addr,
>>       value = addr & 0x0000FFFFFFFFF000ULL;
>>       value |= flags;
>>       writeq(value, ptr + (gpu_page_idx * 8));
>> +
>> +    drm_dev_exit(idx);
>> +
>>       return 0;
>>   }
>>   diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
>> index 523d22d..89e2bfe 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
>> @@ -37,6 +37,8 @@
>>     #include "amdgpu_ras.h"
>>   +#include <drm/drm_drv.h>
>> +
>>   static int psp_sysfs_init(struct amdgpu_device *adev);
>>   static void psp_sysfs_fini(struct amdgpu_device *adev);
>>   @@ -248,7 +250,7 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>              struct psp_gfx_cmd_resp *cmd, uint64_t fence_mc_addr)
>>   {
>>       int ret;
>> -    int index;
>> +    int index, idx;
>>       int timeout = 2000;
>>       bool ras_intr = false;
>>       bool skip_unsupport = false;
>> @@ -256,6 +258,9 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>       if (psp->adev->in_pci_err_recovery)
>>           return 0;
>>   +    if (!drm_dev_enter(&psp->adev->ddev, &idx))
>> +        return 0;
>> +
>>       mutex_lock(&psp->mutex);
>>         memset(psp->cmd_buf_mem, 0, PSP_CMD_BUFFER_SIZE);
>> @@ -266,8 +271,7 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>       ret = psp_ring_cmd_submit(psp, psp->cmd_buf_mc_addr, fence_mc_addr, 
>> index);
>>       if (ret) {
>>           atomic_dec(&psp->fence_value);
>> -        mutex_unlock(&psp->mutex);
>> -        return ret;
>> +        goto exit;
>>       }
>>         amdgpu_asic_invalidate_hdp(psp->adev, NULL);
>> @@ -307,8 +311,8 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>                psp->cmd_buf_mem->cmd_id,
>>                psp->cmd_buf_mem->resp.status);
>>           if (!timeout) {
>> -            mutex_unlock(&psp->mutex);
>> -            return -EINVAL;
>> +            ret = -EINVAL;
>> +            goto exit;
>>           }
>>       }
>>   @@ -316,8 +320,10 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>           ucode->tmr_mc_addr_lo = psp->cmd_buf_mem->resp.fw_addr_lo;
>>           ucode->tmr_mc_addr_hi = psp->cmd_buf_mem->resp.fw_addr_hi;
>>       }
>> -    mutex_unlock(&psp->mutex);
>>   +exit:
>> +    mutex_unlock(&psp->mutex);
>> +    drm_dev_exit(idx);
>>       return ret;
>>   }
>>   @@ -354,8 +360,7 @@ static int psp_load_toc(struct psp_context *psp,
>>       if (!cmd)
>>           return -ENOMEM;
>>       /* Copy toc to psp firmware private buffer */
>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>> -    memcpy(psp->fw_pri_buf, psp->toc_start_addr, psp->toc_bin_size);
>> +    psp_copy_fw(psp, psp->toc_start_addr, psp->toc_bin_size);
>>         psp_prep_load_toc_cmd_buf(cmd, psp->fw_pri_mc_addr, psp->toc_bin_size);
>>   @@ -570,8 +575,7 @@ static int psp_asd_load(struct psp_context *psp)
>>       if (!cmd)
>>           return -ENOMEM;
>>   -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>> -    memcpy(psp->fw_pri_buf, psp->asd_start_addr, psp->asd_ucode_size);
>> +    psp_copy_fw(psp, psp->asd_start_addr, psp->asd_ucode_size);
>>         psp_prep_asd_load_cmd_buf(cmd, psp->fw_pri_mc_addr,
>>                     psp->asd_ucode_size);
>> @@ -726,8 +730,7 @@ static int psp_xgmi_load(struct psp_context *psp)
>>       if (!cmd)
>>           return -ENOMEM;
>>   -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>> -    memcpy(psp->fw_pri_buf, psp->ta_xgmi_start_addr, psp->ta_xgmi_ucode_size);
>> +    psp_copy_fw(psp, psp->ta_xgmi_start_addr, psp->ta_xgmi_ucode_size);
>>         psp_prep_ta_load_cmd_buf(cmd,
>>                    psp->fw_pri_mc_addr,
>> @@ -982,8 +985,7 @@ static int psp_ras_load(struct psp_context *psp)
>>       if (!cmd)
>>           return -ENOMEM;
>>   -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>> -    memcpy(psp->fw_pri_buf, psp->ta_ras_start_addr, psp->ta_ras_ucode_size);
>> +    psp_copy_fw(psp, psp->ta_ras_start_addr, psp->ta_ras_ucode_size);
>>         psp_prep_ta_load_cmd_buf(cmd,
>>                    psp->fw_pri_mc_addr,
>> @@ -1219,8 +1221,7 @@ static int psp_hdcp_load(struct psp_context *psp)
>>       if (!cmd)
>>           return -ENOMEM;
>>   -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>> -    memcpy(psp->fw_pri_buf, psp->ta_hdcp_start_addr,
>> +    psp_copy_fw(psp, psp->ta_hdcp_start_addr,
>>              psp->ta_hdcp_ucode_size);
>>         psp_prep_ta_load_cmd_buf(cmd,
>> @@ -1366,8 +1367,7 @@ static int psp_dtm_load(struct psp_context *psp)
>>       if (!cmd)
>>           return -ENOMEM;
>>   -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>> -    memcpy(psp->fw_pri_buf, psp->ta_dtm_start_addr, psp->ta_dtm_ucode_size);
>> +    psp_copy_fw(psp, psp->ta_dtm_start_addr, psp->ta_dtm_ucode_size);
>>         psp_prep_ta_load_cmd_buf(cmd,
>>                    psp->fw_pri_mc_addr,
>> @@ -1507,8 +1507,7 @@ static int psp_rap_load(struct psp_context *psp)
>>       if (!cmd)
>>           return -ENOMEM;
>>   -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>> -    memcpy(psp->fw_pri_buf, psp->ta_rap_start_addr, psp->ta_rap_ucode_size);
>> +    psp_copy_fw(psp, psp->ta_rap_start_addr, psp->ta_rap_ucode_size);
>>         psp_prep_ta_load_cmd_buf(cmd,
>>                    psp->fw_pri_mc_addr,
>> @@ -2778,6 +2777,20 @@ static ssize_t psp_usbc_pd_fw_sysfs_write(struct 
>> device *dev,
>>       return count;
>>   }
>>   +void psp_copy_fw(struct psp_context *psp, uint8_t *start_addr, uint32_t 
>> bin_size)
>> +{
>> +    int idx;
>> +
>> +    if (!drm_dev_enter(&psp->adev->ddev, &idx))
>> +        return;
>> +
>> +    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>> +    memcpy(psp->fw_pri_buf, start_addr, bin_size);
>> +
>> +    drm_dev_exit(idx);
>> +}
>> +
>> +
>>   static DEVICE_ATTR(usbc_pd_fw, S_IRUGO | S_IWUSR,
>>              psp_usbc_pd_fw_sysfs_read,
>>              psp_usbc_pd_fw_sysfs_write);
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
>> index da250bc..ac69314 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
>> @@ -400,4 +400,7 @@ int psp_init_ta_microcode(struct psp_context *psp,
>>                 const char *chip_name);
>>   int psp_get_fw_attestation_records_addr(struct psp_context *psp,
>>                       uint64_t *output_ptr);
>> +
>> +void psp_copy_fw(struct psp_context *psp, uint8_t *start_addr, uint32_t 
>> bin_size);
>> +
>>   #endif
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>> index 1a612f5..d656494 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>> @@ -35,6 +35,8 @@
>>   #include "amdgpu.h"
>>   #include "atom.h"
>>   +#include <drm/drm_drv.h>
>> +
>>   /*
>>    * Rings
>>    * Most engines on the GPU are fed via ring buffers.  Ring
>> @@ -463,3 +465,71 @@ int amdgpu_ring_test_helper(struct amdgpu_ring *ring)
>>       ring->sched.ready = !r;
>>       return r;
>>   }
>> +
>> +void amdgpu_ring_clear_ring(struct amdgpu_ring *ring)
>> +{
>> +    int idx;
>> +    int i = 0;
>> +
>> +    if (!drm_dev_enter(&ring->adev->ddev, &idx))
>> +        return;
>> +
>> +    while (i <= ring->buf_mask)
>> +        ring->ring[i++] = ring->funcs->nop;
>> +
>> +    drm_dev_exit(idx);
>> +
>> +}
>> +
>> +void amdgpu_ring_write(struct amdgpu_ring *ring, uint32_t v)
>> +{
>> +    int idx;
>> +
>> +    if (!drm_dev_enter(&ring->adev->ddev, &idx))
>> +        return;
>> +
>> +    if (ring->count_dw <= 0)
>> +        DRM_ERROR("amdgpu: writing more dwords to the ring than expected!\n");
>> +    ring->ring[ring->wptr++ & ring->buf_mask] = v;
>> +    ring->wptr &= ring->ptr_mask;
>> +    ring->count_dw--;
>> +
>> +    drm_dev_exit(idx);
>> +}
>> +
>> +void amdgpu_ring_write_multiple(struct amdgpu_ring *ring,
>> +                          void *src, int count_dw)
>> +{
>> +    unsigned occupied, chunk1, chunk2;
>> +    void *dst;
>> +    int idx;
>> +
>> +    if (!drm_dev_enter(&ring->adev->ddev, &idx))
>> +        return;
>> +
>> +    if (unlikely(ring->count_dw < count_dw))
>> +        DRM_ERROR("amdgpu: writing more dwords to the ring than expected!\n");
>> +
>> +    occupied = ring->wptr & ring->buf_mask;
>> +    dst = (void *)&ring->ring[occupied];
>> +    chunk1 = ring->buf_mask + 1 - occupied;
>> +    chunk1 = (chunk1 >= count_dw) ? count_dw: chunk1;
>> +    chunk2 = count_dw - chunk1;
>> +    chunk1 <<= 2;
>> +    chunk2 <<= 2;
>> +
>> +    if (chunk1)
>> +        memcpy(dst, src, chunk1);
>> +
>> +    if (chunk2) {
>> +        src += chunk1;
>> +        dst = (void *)ring->ring;
>> +        memcpy(dst, src, chunk2);
>> +    }
>> +
>> +    ring->wptr += count_dw;
>> +    ring->wptr &= ring->ptr_mask;
>> +    ring->count_dw -= count_dw;
>> +
>> +    drm_dev_exit(idx);
>> +}
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>> index accb243..f90b81f 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>> @@ -300,53 +300,12 @@ static inline void 
>> amdgpu_ring_set_preempt_cond_exec(struct amdgpu_ring *ring,
>>       *ring->cond_exe_cpu_addr = cond_exec;
>>   }
>>   -static inline void amdgpu_ring_clear_ring(struct amdgpu_ring *ring)
>> -{
>> -    int i = 0;
>> -    while (i <= ring->buf_mask)
>> -        ring->ring[i++] = ring->funcs->nop;
>> -
>> -}
>> -
>> -static inline void amdgpu_ring_write(struct amdgpu_ring *ring, uint32_t v)
>> -{
>> -    if (ring->count_dw <= 0)
>> -        DRM_ERROR("amdgpu: writing more dwords to the ring than expected!\n");
>> -    ring->ring[ring->wptr++ & ring->buf_mask] = v;
>> -    ring->wptr &= ring->ptr_mask;
>> -    ring->count_dw--;
>> -}
>> +void amdgpu_ring_clear_ring(struct amdgpu_ring *ring);
>>   -static inline void amdgpu_ring_write_multiple(struct amdgpu_ring *ring,
>> -                          void *src, int count_dw)
>> -{
>> -    unsigned occupied, chunk1, chunk2;
>> -    void *dst;
>> -
>> -    if (unlikely(ring->count_dw < count_dw))
>> -        DRM_ERROR("amdgpu: writing more dwords to the ring than expected!\n");
>> -
>> -    occupied = ring->wptr & ring->buf_mask;
>> -    dst = (void *)&ring->ring[occupied];
>> -    chunk1 = ring->buf_mask + 1 - occupied;
>> -    chunk1 = (chunk1 >= count_dw) ? count_dw: chunk1;
>> -    chunk2 = count_dw - chunk1;
>> -    chunk1 <<= 2;
>> -    chunk2 <<= 2;
>> +void amdgpu_ring_write(struct amdgpu_ring *ring, uint32_t v);
>>   -    if (chunk1)
>> -        memcpy(dst, src, chunk1);
>> -
>> -    if (chunk2) {
>> -        src += chunk1;
>> -        dst = (void *)ring->ring;
>> -        memcpy(dst, src, chunk2);
>> -    }
>> -
>> -    ring->wptr += count_dw;
>> -    ring->wptr &= ring->ptr_mask;
>> -    ring->count_dw -= count_dw;
>> -}
>> +void amdgpu_ring_write_multiple(struct amdgpu_ring *ring,
>> +                          void *src, int count_dw);
>>     int amdgpu_ring_test_helper(struct amdgpu_ring *ring);
>>   diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c 
>> b/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
>> index bd4248c..b3ce5be 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
>> @@ -269,10 +269,8 @@ static int psp_v11_0_bootloader_load_kdb(struct 
>> psp_context *psp)
>>       if (ret)
>>           return ret;
>>   -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>> -
>>       /* Copy PSP KDB binary to memory */
>> -    memcpy(psp->fw_pri_buf, psp->kdb_start_addr, psp->kdb_bin_size);
>> +    psp_copy_fw(psp, psp->kdb_start_addr, psp->kdb_bin_size);
>>         /* Provide the PSP KDB to bootloader */
>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>> @@ -302,10 +300,8 @@ static int psp_v11_0_bootloader_load_spl(struct 
>> psp_context *psp)
>>       if (ret)
>>           return ret;
>>   -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>> -
>>       /* Copy PSP SPL binary to memory */
>> -    memcpy(psp->fw_pri_buf, psp->spl_start_addr, psp->spl_bin_size);
>> +    psp_copy_fw(psp, psp->spl_start_addr, psp->spl_bin_size);
>>         /* Provide the PSP SPL to bootloader */
>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>> @@ -335,10 +331,8 @@ static int psp_v11_0_bootloader_load_sysdrv(struct 
>> psp_context *psp)
>>       if (ret)
>>           return ret;
>>   -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>> -
>>       /* Copy PSP System Driver binary to memory */
>> -    memcpy(psp->fw_pri_buf, psp->sys_start_addr, psp->sys_bin_size);
>> +    psp_copy_fw(psp, psp->sys_start_addr, psp->sys_bin_size);
>>         /* Provide the sys driver to bootloader */
>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>> @@ -371,10 +365,8 @@ static int psp_v11_0_bootloader_load_sos(struct 
>> psp_context *psp)
>>       if (ret)
>>           return ret;
>>   -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>> -
>>       /* Copy Secure OS binary to PSP memory */
>> -    memcpy(psp->fw_pri_buf, psp->sos_start_addr, psp->sos_bin_size);
>> +    psp_copy_fw(psp, psp->sos_start_addr, psp->sos_bin_size);
>>         /* Provide the PSP secure OS to bootloader */
>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>> diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c 
>> b/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
>> index c4828bd..618e5b6 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
>> @@ -138,10 +138,8 @@ static int psp_v12_0_bootloader_load_sysdrv(struct 
>> psp_context *psp)
>>       if (ret)
>>           return ret;
>>   -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>> -
>>       /* Copy PSP System Driver binary to memory */
>> -    memcpy(psp->fw_pri_buf, psp->sys_start_addr, psp->sys_bin_size);
>> +    psp_copy_fw(psp, psp->sys_start_addr, psp->sys_bin_size);
>>         /* Provide the sys driver to bootloader */
>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>> @@ -179,10 +177,8 @@ static int psp_v12_0_bootloader_load_sos(struct 
>> psp_context *psp)
>>       if (ret)
>>           return ret;
>>   -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>> -
>>       /* Copy Secure OS binary to PSP memory */
>> -    memcpy(psp->fw_pri_buf, psp->sos_start_addr, psp->sos_bin_size);
>> +    psp_copy_fw(psp, psp->sos_start_addr, psp->sos_bin_size);
>>         /* Provide the PSP secure OS to bootloader */
>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>> diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c 
>> b/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
>> index f2e725f..d0a6cccd 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
>> @@ -102,10 +102,8 @@ static int psp_v3_1_bootloader_load_sysdrv(struct 
>> psp_context *psp)
>>       if (ret)
>>           return ret;
>>   -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>> -
>>       /* Copy PSP System Driver binary to memory */
>> -    memcpy(psp->fw_pri_buf, psp->sys_start_addr, psp->sys_bin_size);
>> +    psp_copy_fw(psp, psp->sys_start_addr, psp->sys_bin_size);
>>         /* Provide the sys driver to bootloader */
>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>> @@ -143,10 +141,8 @@ static int psp_v3_1_bootloader_load_sos(struct 
>> psp_context *psp)
>>       if (ret)
>>           return ret;
>>   -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>> -
>>       /* Copy Secure OS binary to PSP memory */
>> -    memcpy(psp->fw_pri_buf, psp->sos_start_addr, psp->sos_bin_size);
>> +    psp_copy_fw(psp, psp->sos_start_addr, psp->sos_bin_size);
>>         /* Provide the PSP secure OS to bootloader */
>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 196+ messages in thread

* Re: [PATCH v4 11/14] drm/amdgpu: Guard against write accesses after device removal
  2021-01-19 15:35       ` Andrey Grodzovsky
@ 2021-01-19 15:39         ` Christian König
  -1 siblings, 0 replies; 196+ messages in thread
From: Christian König @ 2021-01-19 15:39 UTC (permalink / raw)
  To: Andrey Grodzovsky, amd-gfx, dri-devel, daniel.vetter, robh,
	l.stach, yuq825, eric
  Cc: Alexander.Deucher, gregkh

There is also the possibility to take the drm_dev_enter/exit at a much 
higher level.

E.g. we should have it anyway on every IOCTL, and what then remains are 
work items, scheduler threads and interrupts.
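
A rough sketch of that idea, with a hypothetical wrapper around the
normal ioctl dispatch (the handler name and plumbing below are
illustrative only, not the actual DRM entry points):

#include <drm/drm_drv.h>
#include <drm/drm_file.h>
#include <drm/drm_ioctl.h>

static long example_guarded_ioctl(struct file *filp, unsigned int cmd,
                                  unsigned long arg)
{
    struct drm_file *file_priv = filp->private_data;
    struct drm_device *dev = file_priv->minor->dev;
    long ret;
    int idx;

    /* Take the unplug guard once per IOCTL instead of once per
     * register access */
    if (!drm_dev_enter(dev, &idx))
        return -ENODEV;

    ret = drm_ioctl(filp, cmd, arg);

    drm_dev_exit(idx);
    return ret;
}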

Christian.

Am 19.01.21 um 16:35 schrieb Andrey Grodzovsky:
> There is really no other way according to this article 
> https://lwn.net/Articles/767885/
>
> "A perfect solution seems nearly impossible though; we cannot acquire 
> a mutex on the user
> to prevent them from yanking a device and we cannot check for a 
> presence change after every
> device access for performance reasons. "
>
> But I assumed srcu_read_lock should be pretty seamless
> performance-wise, no?
> The other solution would be, as I suggested, to keep all the device IO
> ranges reserved and system memory pages unfreed until the device is
> finalized in the driver, but Daniel said this would upset the PCI layer
> (the MMIO ranges reservation part).
>
> Andrey
>
>
>
>
> On 1/19/21 3:55 AM, Christian König wrote:
>> Am 18.01.21 um 22:01 schrieb Andrey Grodzovsky:
>>> This should prevent writing to memory or IO ranges possibly
>>> already allocated for other uses after our device is removed.
>>
>> Wow, that adds quite some overhead to every register access. I'm not 
>> sure we can do this.
>>
>> Christian.
>>
>>>
>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>>> ---
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 57 
>>> ++++++++++++++++++++++++
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c    |  9 ++++
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c    | 53 
>>> +++++++++++++---------
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h    |  3 ++
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c   | 70 
>>> ++++++++++++++++++++++++++++++
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h   | 49 ++-------------------
>>>   drivers/gpu/drm/amd/amdgpu/psp_v11_0.c     | 16 ++-----
>>>   drivers/gpu/drm/amd/amdgpu/psp_v12_0.c     |  8 +---
>>>   drivers/gpu/drm/amd/amdgpu/psp_v3_1.c      |  8 +---
>>>   9 files changed, 184 insertions(+), 89 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>> index e99f4f1..0a9d73c 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>> @@ -72,6 +72,8 @@
>>>     #include <linux/iommu.h>
>>>   +#include <drm/drm_drv.h>
>>> +
>>>   MODULE_FIRMWARE("amdgpu/vega10_gpu_info.bin");
>>>   MODULE_FIRMWARE("amdgpu/vega12_gpu_info.bin");
>>>   MODULE_FIRMWARE("amdgpu/raven_gpu_info.bin");
>>> @@ -404,13 +406,21 @@ uint8_t amdgpu_mm_rreg8(struct amdgpu_device 
>>> *adev, uint32_t offset)
>>>    */
>>>   void amdgpu_mm_wreg8(struct amdgpu_device *adev, uint32_t offset, 
>>> uint8_t value)
>>>   {
>>> +    int idx;
>>> +
>>>       if (adev->in_pci_err_recovery)
>>>           return;
>>>   +
>>> +    if (!drm_dev_enter(&adev->ddev, &idx))
>>> +        return;
>>> +
>>>       if (offset < adev->rmmio_size)
>>>           writeb(value, adev->rmmio + offset);
>>>       else
>>>           BUG();
>>> +
>>> +    drm_dev_exit(idx);
>>>   }
>>>     /**
>>> @@ -427,9 +437,14 @@ void amdgpu_device_wreg(struct amdgpu_device 
>>> *adev,
>>>               uint32_t reg, uint32_t v,
>>>               uint32_t acc_flags)
>>>   {
>>> +    int idx;
>>> +
>>>       if (adev->in_pci_err_recovery)
>>>           return;
>>>   +    if (!drm_dev_enter(&adev->ddev, &idx))
>>> +        return;
>>> +
>>>       if ((reg * 4) < adev->rmmio_size) {
>>>           if (!(acc_flags & AMDGPU_REGS_NO_KIQ) &&
>>>               amdgpu_sriov_runtime(adev) &&
>>> @@ -444,6 +459,8 @@ void amdgpu_device_wreg(struct amdgpu_device *adev,
>>>       }
>>>         trace_amdgpu_device_wreg(adev->pdev->device, reg, v);
>>> +
>>> +    drm_dev_exit(idx);
>>>   }
>>>     /*
>>> @@ -454,9 +471,14 @@ void amdgpu_device_wreg(struct amdgpu_device 
>>> *adev,
>>>   void amdgpu_mm_wreg_mmio_rlc(struct amdgpu_device *adev,
>>>                    uint32_t reg, uint32_t v)
>>>   {
>>> +    int idx;
>>> +
>>>       if (adev->in_pci_err_recovery)
>>>           return;
>>>   +    if (!drm_dev_enter(&adev->ddev, &idx))
>>> +        return;
>>> +
>>>       if (amdgpu_sriov_fullaccess(adev) &&
>>>           adev->gfx.rlc.funcs &&
>>>           adev->gfx.rlc.funcs->is_rlcg_access_range) {
>>> @@ -465,6 +487,8 @@ void amdgpu_mm_wreg_mmio_rlc(struct 
>>> amdgpu_device *adev,
>>>       } else {
>>>           writel(v, ((void __iomem *)adev->rmmio) + (reg * 4));
>>>       }
>>> +
>>> +    drm_dev_exit(idx);
>>>   }
>>>     /**
>>> @@ -499,15 +523,22 @@ u32 amdgpu_io_rreg(struct amdgpu_device *adev, 
>>> u32 reg)
>>>    */
>>>   void amdgpu_io_wreg(struct amdgpu_device *adev, u32 reg, u32 v)
>>>   {
>>> +    int idx;
>>> +
>>>       if (adev->in_pci_err_recovery)
>>>           return;
>>>   +    if (!drm_dev_enter(&adev->ddev, &idx))
>>> +        return;
>>> +
>>>       if ((reg * 4) < adev->rio_mem_size)
>>>           iowrite32(v, adev->rio_mem + (reg * 4));
>>>       else {
>>>           iowrite32((reg * 4), adev->rio_mem + (mmMM_INDEX * 4));
>>>           iowrite32(v, adev->rio_mem + (mmMM_DATA * 4));
>>>       }
>>> +
>>> +    drm_dev_exit(idx);
>>>   }
>>>     /**
>>> @@ -544,14 +575,21 @@ u32 amdgpu_mm_rdoorbell(struct amdgpu_device 
>>> *adev, u32 index)
>>>    */
>>>   void amdgpu_mm_wdoorbell(struct amdgpu_device *adev, u32 index, 
>>> u32 v)
>>>   {
>>> +    int idx;
>>> +
>>>       if (adev->in_pci_err_recovery)
>>>           return;
>>>   +    if (!drm_dev_enter(&adev->ddev, &idx))
>>> +        return;
>>> +
>>>       if (index < adev->doorbell.num_doorbells) {
>>>           writel(v, adev->doorbell.ptr + index);
>>>       } else {
>>>           DRM_ERROR("writing beyond doorbell aperture: 0x%08x!\n", 
>>> index);
>>>       }
>>> +
>>> +    drm_dev_exit(idx);
>>>   }
>>>     /**
>>> @@ -588,14 +626,21 @@ u64 amdgpu_mm_rdoorbell64(struct amdgpu_device 
>>> *adev, u32 index)
>>>    */
>>>   void amdgpu_mm_wdoorbell64(struct amdgpu_device *adev, u32 index, 
>>> u64 v)
>>>   {
>>> +    int idx;
>>> +
>>>       if (adev->in_pci_err_recovery)
>>>           return;
>>>   +    if (!drm_dev_enter(&adev->ddev, &idx))
>>> +        return;
>>> +
>>>       if (index < adev->doorbell.num_doorbells) {
>>>           atomic64_set((atomic64_t *)(adev->doorbell.ptr + index), v);
>>>       } else {
>>>           DRM_ERROR("writing beyond doorbell aperture: 0x%08x!\n", 
>>> index);
>>>       }
>>> +
>>> +    drm_dev_exit(idx);
>>>   }
>>>     /**
>>> @@ -682,6 +727,10 @@ void amdgpu_device_indirect_wreg(struct 
>>> amdgpu_device *adev,
>>>       unsigned long flags;
>>>       void __iomem *pcie_index_offset;
>>>       void __iomem *pcie_data_offset;
>>> +    int idx;
>>> +
>>> +    if (!drm_dev_enter(&adev->ddev, &idx))
>>> +        return;
>>>         spin_lock_irqsave(&adev->pcie_idx_lock, flags);
>>>       pcie_index_offset = (void __iomem *)adev->rmmio + pcie_index * 4;
>>> @@ -692,6 +741,8 @@ void amdgpu_device_indirect_wreg(struct 
>>> amdgpu_device *adev,
>>>       writel(reg_data, pcie_data_offset);
>>>       readl(pcie_data_offset);
>>>       spin_unlock_irqrestore(&adev->pcie_idx_lock, flags);
>>> +
>>> +    drm_dev_exit(idx);
>>>   }
>>>     /**
>>> @@ -711,6 +762,10 @@ void amdgpu_device_indirect_wreg64(struct 
>>> amdgpu_device *adev,
>>>       unsigned long flags;
>>>       void __iomem *pcie_index_offset;
>>>       void __iomem *pcie_data_offset;
>>> +    int idx;
>>> +
>>> +    if (!drm_dev_enter(&adev->ddev, &idx))
>>> +        return;
>>>         spin_lock_irqsave(&adev->pcie_idx_lock, flags);
>>>       pcie_index_offset = (void __iomem *)adev->rmmio + pcie_index * 4;
>>> @@ -727,6 +782,8 @@ void amdgpu_device_indirect_wreg64(struct 
>>> amdgpu_device *adev,
>>>       writel((u32)(reg_data >> 32), pcie_data_offset);
>>>       readl(pcie_data_offset);
>>>       spin_unlock_irqrestore(&adev->pcie_idx_lock, flags);
>>> +
>>> +    drm_dev_exit(idx);
>>>   }
>>>     /**
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
>>> index fe1a39f..1beb4e6 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
>>> @@ -31,6 +31,8 @@
>>>   #include "amdgpu_ras.h"
>>>   #include "amdgpu_xgmi.h"
>>>   +#include <drm/drm_drv.h>
>>> +
>>>   /**
>>>    * amdgpu_gmc_get_pde_for_bo - get the PDE for a BO
>>>    *
>>> @@ -98,6 +100,10 @@ int amdgpu_gmc_set_pte_pde(struct amdgpu_device 
>>> *adev, void *cpu_pt_addr,
>>>   {
>>>       void __iomem *ptr = (void *)cpu_pt_addr;
>>>       uint64_t value;
>>> +    int idx;
>>> +
>>> +    if (!drm_dev_enter(&adev->ddev, &idx))
>>> +        return 0;
>>>         /*
>>>        * The following is for PTE only. GART does not have PDEs.
>>> @@ -105,6 +111,9 @@ int amdgpu_gmc_set_pte_pde(struct amdgpu_device 
>>> *adev, void *cpu_pt_addr,
>>>       value = addr & 0x0000FFFFFFFFF000ULL;
>>>       value |= flags;
>>>       writeq(value, ptr + (gpu_page_idx * 8));
>>> +
>>> +    drm_dev_exit(idx);
>>> +
>>>       return 0;
>>>   }
>>>   diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
>>> index 523d22d..89e2bfe 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
>>> @@ -37,6 +37,8 @@
>>>     #include "amdgpu_ras.h"
>>>   +#include <drm/drm_drv.h>
>>> +
>>>   static int psp_sysfs_init(struct amdgpu_device *adev);
>>>   static void psp_sysfs_fini(struct amdgpu_device *adev);
>>>   @@ -248,7 +250,7 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>>              struct psp_gfx_cmd_resp *cmd, uint64_t fence_mc_addr)
>>>   {
>>>       int ret;
>>> -    int index;
>>> +    int index, idx;
>>>       int timeout = 2000;
>>>       bool ras_intr = false;
>>>       bool skip_unsupport = false;
>>> @@ -256,6 +258,9 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>>       if (psp->adev->in_pci_err_recovery)
>>>           return 0;
>>>   +    if (!drm_dev_enter(&psp->adev->ddev, &idx))
>>> +        return 0;
>>> +
>>>       mutex_lock(&psp->mutex);
>>>         memset(psp->cmd_buf_mem, 0, PSP_CMD_BUFFER_SIZE);
>>> @@ -266,8 +271,7 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>>       ret = psp_ring_cmd_submit(psp, psp->cmd_buf_mc_addr, 
>>> fence_mc_addr, index);
>>>       if (ret) {
>>>           atomic_dec(&psp->fence_value);
>>> -        mutex_unlock(&psp->mutex);
>>> -        return ret;
>>> +        goto exit;
>>>       }
>>>         amdgpu_asic_invalidate_hdp(psp->adev, NULL);
>>> @@ -307,8 +311,8 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>>                psp->cmd_buf_mem->cmd_id,
>>>                psp->cmd_buf_mem->resp.status);
>>>           if (!timeout) {
>>> -            mutex_unlock(&psp->mutex);
>>> -            return -EINVAL;
>>> +            ret = -EINVAL;
>>> +            goto exit;
>>>           }
>>>       }
>>>   @@ -316,8 +320,10 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>>           ucode->tmr_mc_addr_lo = psp->cmd_buf_mem->resp.fw_addr_lo;
>>>           ucode->tmr_mc_addr_hi = psp->cmd_buf_mem->resp.fw_addr_hi;
>>>       }
>>> -    mutex_unlock(&psp->mutex);
>>>   +exit:
>>> +    mutex_unlock(&psp->mutex);
>>> +    drm_dev_exit(idx);
>>>       return ret;
>>>   }
>>>   @@ -354,8 +360,7 @@ static int psp_load_toc(struct psp_context *psp,
>>>       if (!cmd)
>>>           return -ENOMEM;
>>>       /* Copy toc to psp firmware private buffer */
>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>> -    memcpy(psp->fw_pri_buf, psp->toc_start_addr, psp->toc_bin_size);
>>> +    psp_copy_fw(psp, psp->toc_start_addr, psp->toc_bin_size);
>>>         psp_prep_load_toc_cmd_buf(cmd, psp->fw_pri_mc_addr, 
>>> psp->toc_bin_size);
>>>   @@ -570,8 +575,7 @@ static int psp_asd_load(struct psp_context *psp)
>>>       if (!cmd)
>>>           return -ENOMEM;
>>>   -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>> -    memcpy(psp->fw_pri_buf, psp->asd_start_addr, psp->asd_ucode_size);
>>> +    psp_copy_fw(psp, psp->asd_start_addr, psp->asd_ucode_size);
>>>         psp_prep_asd_load_cmd_buf(cmd, psp->fw_pri_mc_addr,
>>>                     psp->asd_ucode_size);
>>> @@ -726,8 +730,7 @@ static int psp_xgmi_load(struct psp_context *psp)
>>>       if (!cmd)
>>>           return -ENOMEM;
>>>   -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>> -    memcpy(psp->fw_pri_buf, psp->ta_xgmi_start_addr, 
>>> psp->ta_xgmi_ucode_size);
>>> +    psp_copy_fw(psp, psp->ta_xgmi_start_addr, 
>>> psp->ta_xgmi_ucode_size);
>>>         psp_prep_ta_load_cmd_buf(cmd,
>>>                    psp->fw_pri_mc_addr,
>>> @@ -982,8 +985,7 @@ static int psp_ras_load(struct psp_context *psp)
>>>       if (!cmd)
>>>           return -ENOMEM;
>>>   -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>> -    memcpy(psp->fw_pri_buf, psp->ta_ras_start_addr, 
>>> psp->ta_ras_ucode_size);
>>> +    psp_copy_fw(psp, psp->ta_ras_start_addr, psp->ta_ras_ucode_size);
>>>         psp_prep_ta_load_cmd_buf(cmd,
>>>                    psp->fw_pri_mc_addr,
>>> @@ -1219,8 +1221,7 @@ static int psp_hdcp_load(struct psp_context *psp)
>>>       if (!cmd)
>>>           return -ENOMEM;
>>>   -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>> -    memcpy(psp->fw_pri_buf, psp->ta_hdcp_start_addr,
>>> +    psp_copy_fw(psp, psp->ta_hdcp_start_addr,
>>>              psp->ta_hdcp_ucode_size);
>>>         psp_prep_ta_load_cmd_buf(cmd,
>>> @@ -1366,8 +1367,7 @@ static int psp_dtm_load(struct psp_context *psp)
>>>       if (!cmd)
>>>           return -ENOMEM;
>>>   -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>> -    memcpy(psp->fw_pri_buf, psp->ta_dtm_start_addr, 
>>> psp->ta_dtm_ucode_size);
>>> +    psp_copy_fw(psp, psp->ta_dtm_start_addr, psp->ta_dtm_ucode_size);
>>>         psp_prep_ta_load_cmd_buf(cmd,
>>>                    psp->fw_pri_mc_addr,
>>> @@ -1507,8 +1507,7 @@ static int psp_rap_load(struct psp_context *psp)
>>>       if (!cmd)
>>>           return -ENOMEM;
>>>   -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>> -    memcpy(psp->fw_pri_buf, psp->ta_rap_start_addr, 
>>> psp->ta_rap_ucode_size);
>>> +    psp_copy_fw(psp, psp->ta_rap_start_addr, psp->ta_rap_ucode_size);
>>>         psp_prep_ta_load_cmd_buf(cmd,
>>>                    psp->fw_pri_mc_addr,
>>> @@ -2778,6 +2777,20 @@ static ssize_t 
>>> psp_usbc_pd_fw_sysfs_write(struct device *dev,
>>>       return count;
>>>   }
>>>   +void psp_copy_fw(struct psp_context *psp, uint8_t *start_addr, 
>>> uint32_t bin_size)
>>> +{
>>> +    int idx;
>>> +
>>> +    if (!drm_dev_enter(&psp->adev->ddev, &idx))
>>> +        return;
>>> +
>>> +    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>> +    memcpy(psp->fw_pri_buf, start_addr, bin_size);
>>> +
>>> +    drm_dev_exit(idx);
>>> +}
>>> +
>>> +
>>>   static DEVICE_ATTR(usbc_pd_fw, S_IRUGO | S_IWUSR,
>>>              psp_usbc_pd_fw_sysfs_read,
>>>              psp_usbc_pd_fw_sysfs_write);
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
>>> index da250bc..ac69314 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
>>> @@ -400,4 +400,7 @@ int psp_init_ta_microcode(struct psp_context *psp,
>>>                 const char *chip_name);
>>>   int psp_get_fw_attestation_records_addr(struct psp_context *psp,
>>>                       uint64_t *output_ptr);
>>> +
>>> +void psp_copy_fw(struct psp_context *psp, uint8_t *start_addr, 
>>> uint32_t bin_size);
>>> +
>>>   #endif
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>>> index 1a612f5..d656494 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>>> @@ -35,6 +35,8 @@
>>>   #include "amdgpu.h"
>>>   #include "atom.h"
>>>   +#include <drm/drm_drv.h>
>>> +
>>>   /*
>>>    * Rings
>>>    * Most engines on the GPU are fed via ring buffers.  Ring
>>> @@ -463,3 +465,71 @@ int amdgpu_ring_test_helper(struct amdgpu_ring 
>>> *ring)
>>>       ring->sched.ready = !r;
>>>       return r;
>>>   }
>>> +
>>> +void amdgpu_ring_clear_ring(struct amdgpu_ring *ring)
>>> +{
>>> +    int idx;
>>> +    int i = 0;
>>> +
>>> +    if (!drm_dev_enter(&ring->adev->ddev, &idx))
>>> +        return;
>>> +
>>> +    while (i <= ring->buf_mask)
>>> +        ring->ring[i++] = ring->funcs->nop;
>>> +
>>> +    drm_dev_exit(idx);
>>> +
>>> +}
>>> +
>>> +void amdgpu_ring_write(struct amdgpu_ring *ring, uint32_t v)
>>> +{
>>> +    int idx;
>>> +
>>> +    if (!drm_dev_enter(&ring->adev->ddev, &idx))
>>> +        return;
>>> +
>>> +    if (ring->count_dw <= 0)
>>> +        DRM_ERROR("amdgpu: writing more dwords to the ring than 
>>> expected!\n");
>>> +    ring->ring[ring->wptr++ & ring->buf_mask] = v;
>>> +    ring->wptr &= ring->ptr_mask;
>>> +    ring->count_dw--;
>>> +
>>> +    drm_dev_exit(idx);
>>> +}
>>> +
>>> +void amdgpu_ring_write_multiple(struct amdgpu_ring *ring,
>>> +                          void *src, int count_dw)
>>> +{
>>> +    unsigned occupied, chunk1, chunk2;
>>> +    void *dst;
>>> +    int idx;
>>> +
>>> +    if (!drm_dev_enter(&ring->adev->ddev, &idx))
>>> +        return;
>>> +
>>> +    if (unlikely(ring->count_dw < count_dw))
>>> +        DRM_ERROR("amdgpu: writing more dwords to the ring than 
>>> expected!\n");
>>> +
>>> +    occupied = ring->wptr & ring->buf_mask;
>>> +    dst = (void *)&ring->ring[occupied];
>>> +    chunk1 = ring->buf_mask + 1 - occupied;
>>> +    chunk1 = (chunk1 >= count_dw) ? count_dw: chunk1;
>>> +    chunk2 = count_dw - chunk1;
>>> +    chunk1 <<= 2;
>>> +    chunk2 <<= 2;
>>> +
>>> +    if (chunk1)
>>> +        memcpy(dst, src, chunk1);
>>> +
>>> +    if (chunk2) {
>>> +        src += chunk1;
>>> +        dst = (void *)ring->ring;
>>> +        memcpy(dst, src, chunk2);
>>> +    }
>>> +
>>> +    ring->wptr += count_dw;
>>> +    ring->wptr &= ring->ptr_mask;
>>> +    ring->count_dw -= count_dw;
>>> +
>>> +    drm_dev_exit(idx);
>>> +}
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>>> index accb243..f90b81f 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>>> @@ -300,53 +300,12 @@ static inline void 
>>> amdgpu_ring_set_preempt_cond_exec(struct amdgpu_ring *ring,
>>>       *ring->cond_exe_cpu_addr = cond_exec;
>>>   }
>>>   -static inline void amdgpu_ring_clear_ring(struct amdgpu_ring *ring)
>>> -{
>>> -    int i = 0;
>>> -    while (i <= ring->buf_mask)
>>> -        ring->ring[i++] = ring->funcs->nop;
>>> -
>>> -}
>>> -
>>> -static inline void amdgpu_ring_write(struct amdgpu_ring *ring, 
>>> uint32_t v)
>>> -{
>>> -    if (ring->count_dw <= 0)
>>> -        DRM_ERROR("amdgpu: writing more dwords to the ring than 
>>> expected!\n");
>>> -    ring->ring[ring->wptr++ & ring->buf_mask] = v;
>>> -    ring->wptr &= ring->ptr_mask;
>>> -    ring->count_dw--;
>>> -}
>>> +void amdgpu_ring_clear_ring(struct amdgpu_ring *ring);
>>>   -static inline void amdgpu_ring_write_multiple(struct amdgpu_ring 
>>> *ring,
>>> -                          void *src, int count_dw)
>>> -{
>>> -    unsigned occupied, chunk1, chunk2;
>>> -    void *dst;
>>> -
>>> -    if (unlikely(ring->count_dw < count_dw))
>>> -        DRM_ERROR("amdgpu: writing more dwords to the ring than 
>>> expected!\n");
>>> -
>>> -    occupied = ring->wptr & ring->buf_mask;
>>> -    dst = (void *)&ring->ring[occupied];
>>> -    chunk1 = ring->buf_mask + 1 - occupied;
>>> -    chunk1 = (chunk1 >= count_dw) ? count_dw: chunk1;
>>> -    chunk2 = count_dw - chunk1;
>>> -    chunk1 <<= 2;
>>> -    chunk2 <<= 2;
>>> +void amdgpu_ring_write(struct amdgpu_ring *ring, uint32_t v);
>>>   -    if (chunk1)
>>> -        memcpy(dst, src, chunk1);
>>> -
>>> -    if (chunk2) {
>>> -        src += chunk1;
>>> -        dst = (void *)ring->ring;
>>> -        memcpy(dst, src, chunk2);
>>> -    }
>>> -
>>> -    ring->wptr += count_dw;
>>> -    ring->wptr &= ring->ptr_mask;
>>> -    ring->count_dw -= count_dw;
>>> -}
>>> +void amdgpu_ring_write_multiple(struct amdgpu_ring *ring,
>>> +                          void *src, int count_dw);
>>>     int amdgpu_ring_test_helper(struct amdgpu_ring *ring);
>>>   diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c 
>>> b/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
>>> index bd4248c..b3ce5be 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
>>> @@ -269,10 +269,8 @@ static int psp_v11_0_bootloader_load_kdb(struct 
>>> psp_context *psp)
>>>       if (ret)
>>>           return ret;
>>>   -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>> -
>>>       /* Copy PSP KDB binary to memory */
>>> -    memcpy(psp->fw_pri_buf, psp->kdb_start_addr, psp->kdb_bin_size);
>>> +    psp_copy_fw(psp, psp->kdb_start_addr, psp->kdb_bin_size);
>>>         /* Provide the PSP KDB to bootloader */
>>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>> @@ -302,10 +300,8 @@ static int psp_v11_0_bootloader_load_spl(struct 
>>> psp_context *psp)
>>>       if (ret)
>>>           return ret;
>>>   -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>> -
>>>       /* Copy PSP SPL binary to memory */
>>> -    memcpy(psp->fw_pri_buf, psp->spl_start_addr, psp->spl_bin_size);
>>> +    psp_copy_fw(psp, psp->spl_start_addr, psp->spl_bin_size);
>>>         /* Provide the PSP SPL to bootloader */
>>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>> @@ -335,10 +331,8 @@ static int 
>>> psp_v11_0_bootloader_load_sysdrv(struct psp_context *psp)
>>>       if (ret)
>>>           return ret;
>>>   -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>> -
>>>       /* Copy PSP System Driver binary to memory */
>>> -    memcpy(psp->fw_pri_buf, psp->sys_start_addr, psp->sys_bin_size);
>>> +    psp_copy_fw(psp, psp->sys_start_addr, psp->sys_bin_size);
>>>         /* Provide the sys driver to bootloader */
>>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>> @@ -371,10 +365,8 @@ static int psp_v11_0_bootloader_load_sos(struct 
>>> psp_context *psp)
>>>       if (ret)
>>>           return ret;
>>>   -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>> -
>>>       /* Copy Secure OS binary to PSP memory */
>>> -    memcpy(psp->fw_pri_buf, psp->sos_start_addr, psp->sos_bin_size);
>>> +    psp_copy_fw(psp, psp->sos_start_addr, psp->sos_bin_size);
>>>         /* Provide the PSP secure OS to bootloader */
>>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c 
>>> b/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
>>> index c4828bd..618e5b6 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
>>> @@ -138,10 +138,8 @@ static int 
>>> psp_v12_0_bootloader_load_sysdrv(struct psp_context *psp)
>>>       if (ret)
>>>           return ret;
>>>   -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>> -
>>>       /* Copy PSP System Driver binary to memory */
>>> -    memcpy(psp->fw_pri_buf, psp->sys_start_addr, psp->sys_bin_size);
>>> +    psp_copy_fw(psp, psp->sys_start_addr, psp->sys_bin_size);
>>>         /* Provide the sys driver to bootloader */
>>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>> @@ -179,10 +177,8 @@ static int psp_v12_0_bootloader_load_sos(struct 
>>> psp_context *psp)
>>>       if (ret)
>>>           return ret;
>>>   -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>> -
>>>       /* Copy Secure OS binary to PSP memory */
>>> -    memcpy(psp->fw_pri_buf, psp->sos_start_addr, psp->sos_bin_size);
>>> +    psp_copy_fw(psp, psp->sos_start_addr, psp->sos_bin_size);
>>>         /* Provide the PSP secure OS to bootloader */
>>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c 
>>> b/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
>>> index f2e725f..d0a6cccd 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
>>> @@ -102,10 +102,8 @@ static int 
>>> psp_v3_1_bootloader_load_sysdrv(struct psp_context *psp)
>>>       if (ret)
>>>           return ret;
>>>   -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>> -
>>>       /* Copy PSP System Driver binary to memory */
>>> -    memcpy(psp->fw_pri_buf, psp->sys_start_addr, psp->sys_bin_size);
>>> +    psp_copy_fw(psp, psp->sys_start_addr, psp->sys_bin_size);
>>>         /* Provide the sys driver to bootloader */
>>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>> @@ -143,10 +141,8 @@ static int psp_v3_1_bootloader_load_sos(struct 
>>> psp_context *psp)
>>>       if (ret)
>>>           return ret;
>>>   -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>> -
>>>       /* Copy Secure OS binary to PSP memory */
>>> -    memcpy(psp->fw_pri_buf, psp->sos_start_addr, psp->sos_bin_size);
>>> +    psp_copy_fw(psp, psp->sos_start_addr, psp->sos_bin_size);
>>>         /* Provide the PSP secure OS to bootloader */
>>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 196+ messages in thread

>>>       if (!cmd)
>>>           return -ENOMEM;
>>>   -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>> -    memcpy(psp->fw_pri_buf, psp->ta_rap_start_addr, 
>>> psp->ta_rap_ucode_size);
>>> +    psp_copy_fw(psp, psp->ta_rap_start_addr, psp->ta_rap_ucode_size);
>>>         psp_prep_ta_load_cmd_buf(cmd,
>>>                    psp->fw_pri_mc_addr,
>>> @@ -2778,6 +2777,20 @@ static ssize_t 
>>> psp_usbc_pd_fw_sysfs_write(struct device *dev,
>>>       return count;
>>>   }
>>>   +void psp_copy_fw(struct psp_context *psp, uint8_t *start_addr, 
>>> uint32_t bin_size)
>>> +{
>>> +    int idx;
>>> +
>>> +    if (!drm_dev_enter(&psp->adev->ddev, &idx))
>>> +        return;
>>> +
>>> +    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>> +    memcpy(psp->fw_pri_buf, start_addr, bin_size);
>>> +
>>> +    drm_dev_exit(idx);
>>> +}
>>> +
>>> +
>>>   static DEVICE_ATTR(usbc_pd_fw, S_IRUGO | S_IWUSR,
>>>              psp_usbc_pd_fw_sysfs_read,
>>>              psp_usbc_pd_fw_sysfs_write);
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
>>> index da250bc..ac69314 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
>>> @@ -400,4 +400,7 @@ int psp_init_ta_microcode(struct psp_context *psp,
>>>                 const char *chip_name);
>>>   int psp_get_fw_attestation_records_addr(struct psp_context *psp,
>>>                       uint64_t *output_ptr);
>>> +
>>> +void psp_copy_fw(struct psp_context *psp, uint8_t *start_addr, 
>>> uint32_t bin_size);
>>> +
>>>   #endif
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>>> index 1a612f5..d656494 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>>> @@ -35,6 +35,8 @@
>>>   #include "amdgpu.h"
>>>   #include "atom.h"
>>>   +#include <drm/drm_drv.h>
>>> +
>>>   /*
>>>    * Rings
>>>    * Most engines on the GPU are fed via ring buffers.  Ring
>>> @@ -463,3 +465,71 @@ int amdgpu_ring_test_helper(struct amdgpu_ring 
>>> *ring)
>>>       ring->sched.ready = !r;
>>>       return r;
>>>   }
>>> +
>>> +void amdgpu_ring_clear_ring(struct amdgpu_ring *ring)
>>> +{
>>> +    int idx;
>>> +    int i = 0;
>>> +
>>> +    if (!drm_dev_enter(&ring->adev->ddev, &idx))
>>> +        return;
>>> +
>>> +    while (i <= ring->buf_mask)
>>> +        ring->ring[i++] = ring->funcs->nop;
>>> +
>>> +    drm_dev_exit(idx);
>>> +
>>> +}
>>> +
>>> +void amdgpu_ring_write(struct amdgpu_ring *ring, uint32_t v)
>>> +{
>>> +    int idx;
>>> +
>>> +    if (!drm_dev_enter(&ring->adev->ddev, &idx))
>>> +        return;
>>> +
>>> +    if (ring->count_dw <= 0)
>>> +        DRM_ERROR("amdgpu: writing more dwords to the ring than 
>>> expected!\n");
>>> +    ring->ring[ring->wptr++ & ring->buf_mask] = v;
>>> +    ring->wptr &= ring->ptr_mask;
>>> +    ring->count_dw--;
>>> +
>>> +    drm_dev_exit(idx);
>>> +}
>>> +
>>> +void amdgpu_ring_write_multiple(struct amdgpu_ring *ring,
>>> +                          void *src, int count_dw)
>>> +{
>>> +    unsigned occupied, chunk1, chunk2;
>>> +    void *dst;
>>> +    int idx;
>>> +
>>> +    if (!drm_dev_enter(&ring->adev->ddev, &idx))
>>> +        return;
>>> +
>>> +    if (unlikely(ring->count_dw < count_dw))
>>> +        DRM_ERROR("amdgpu: writing more dwords to the ring than 
>>> expected!\n");
>>> +
>>> +    occupied = ring->wptr & ring->buf_mask;
>>> +    dst = (void *)&ring->ring[occupied];
>>> +    chunk1 = ring->buf_mask + 1 - occupied;
>>> +    chunk1 = (chunk1 >= count_dw) ? count_dw: chunk1;
>>> +    chunk2 = count_dw - chunk1;
>>> +    chunk1 <<= 2;
>>> +    chunk2 <<= 2;
>>> +
>>> +    if (chunk1)
>>> +        memcpy(dst, src, chunk1);
>>> +
>>> +    if (chunk2) {
>>> +        src += chunk1;
>>> +        dst = (void *)ring->ring;
>>> +        memcpy(dst, src, chunk2);
>>> +    }
>>> +
>>> +    ring->wptr += count_dw;
>>> +    ring->wptr &= ring->ptr_mask;
>>> +    ring->count_dw -= count_dw;
>>> +
>>> +    drm_dev_exit(idx);
>>> +}
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>>> index accb243..f90b81f 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>>> @@ -300,53 +300,12 @@ static inline void 
>>> amdgpu_ring_set_preempt_cond_exec(struct amdgpu_ring *ring,
>>>       *ring->cond_exe_cpu_addr = cond_exec;
>>>   }
>>>   -static inline void amdgpu_ring_clear_ring(struct amdgpu_ring *ring)
>>> -{
>>> -    int i = 0;
>>> -    while (i <= ring->buf_mask)
>>> -        ring->ring[i++] = ring->funcs->nop;
>>> -
>>> -}
>>> -
>>> -static inline void amdgpu_ring_write(struct amdgpu_ring *ring, 
>>> uint32_t v)
>>> -{
>>> -    if (ring->count_dw <= 0)
>>> -        DRM_ERROR("amdgpu: writing more dwords to the ring than 
>>> expected!\n");
>>> -    ring->ring[ring->wptr++ & ring->buf_mask] = v;
>>> -    ring->wptr &= ring->ptr_mask;
>>> -    ring->count_dw--;
>>> -}
>>> +void amdgpu_ring_clear_ring(struct amdgpu_ring *ring);
>>>   -static inline void amdgpu_ring_write_multiple(struct amdgpu_ring 
>>> *ring,
>>> -                          void *src, int count_dw)
>>> -{
>>> -    unsigned occupied, chunk1, chunk2;
>>> -    void *dst;
>>> -
>>> -    if (unlikely(ring->count_dw < count_dw))
>>> -        DRM_ERROR("amdgpu: writing more dwords to the ring than 
>>> expected!\n");
>>> -
>>> -    occupied = ring->wptr & ring->buf_mask;
>>> -    dst = (void *)&ring->ring[occupied];
>>> -    chunk1 = ring->buf_mask + 1 - occupied;
>>> -    chunk1 = (chunk1 >= count_dw) ? count_dw: chunk1;
>>> -    chunk2 = count_dw - chunk1;
>>> -    chunk1 <<= 2;
>>> -    chunk2 <<= 2;
>>> +void amdgpu_ring_write(struct amdgpu_ring *ring, uint32_t v);
>>>   -    if (chunk1)
>>> -        memcpy(dst, src, chunk1);
>>> -
>>> -    if (chunk2) {
>>> -        src += chunk1;
>>> -        dst = (void *)ring->ring;
>>> -        memcpy(dst, src, chunk2);
>>> -    }
>>> -
>>> -    ring->wptr += count_dw;
>>> -    ring->wptr &= ring->ptr_mask;
>>> -    ring->count_dw -= count_dw;
>>> -}
>>> +void amdgpu_ring_write_multiple(struct amdgpu_ring *ring,
>>> +                          void *src, int count_dw);
>>>     int amdgpu_ring_test_helper(struct amdgpu_ring *ring);
>>>   diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c 
>>> b/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
>>> index bd4248c..b3ce5be 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
>>> @@ -269,10 +269,8 @@ static int psp_v11_0_bootloader_load_kdb(struct 
>>> psp_context *psp)
>>>       if (ret)
>>>           return ret;
>>>   -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>> -
>>>       /* Copy PSP KDB binary to memory */
>>> -    memcpy(psp->fw_pri_buf, psp->kdb_start_addr, psp->kdb_bin_size);
>>> +    psp_copy_fw(psp, psp->kdb_start_addr, psp->kdb_bin_size);
>>>         /* Provide the PSP KDB to bootloader */
>>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>> @@ -302,10 +300,8 @@ static int psp_v11_0_bootloader_load_spl(struct 
>>> psp_context *psp)
>>>       if (ret)
>>>           return ret;
>>>   -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>> -
>>>       /* Copy PSP SPL binary to memory */
>>> -    memcpy(psp->fw_pri_buf, psp->spl_start_addr, psp->spl_bin_size);
>>> +    psp_copy_fw(psp, psp->spl_start_addr, psp->spl_bin_size);
>>>         /* Provide the PSP SPL to bootloader */
>>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>> @@ -335,10 +331,8 @@ static int 
>>> psp_v11_0_bootloader_load_sysdrv(struct psp_context *psp)
>>>       if (ret)
>>>           return ret;
>>>   -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>> -
>>>       /* Copy PSP System Driver binary to memory */
>>> -    memcpy(psp->fw_pri_buf, psp->sys_start_addr, psp->sys_bin_size);
>>> +    psp_copy_fw(psp, psp->sys_start_addr, psp->sys_bin_size);
>>>         /* Provide the sys driver to bootloader */
>>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>> @@ -371,10 +365,8 @@ static int psp_v11_0_bootloader_load_sos(struct 
>>> psp_context *psp)
>>>       if (ret)
>>>           return ret;
>>>   -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>> -
>>>       /* Copy Secure OS binary to PSP memory */
>>> -    memcpy(psp->fw_pri_buf, psp->sos_start_addr, psp->sos_bin_size);
>>> +    psp_copy_fw(psp, psp->sos_start_addr, psp->sos_bin_size);
>>>         /* Provide the PSP secure OS to bootloader */
>>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c 
>>> b/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
>>> index c4828bd..618e5b6 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
>>> @@ -138,10 +138,8 @@ static int 
>>> psp_v12_0_bootloader_load_sysdrv(struct psp_context *psp)
>>>       if (ret)
>>>           return ret;
>>>   -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>> -
>>>       /* Copy PSP System Driver binary to memory */
>>> -    memcpy(psp->fw_pri_buf, psp->sys_start_addr, psp->sys_bin_size);
>>> +    psp_copy_fw(psp, psp->sys_start_addr, psp->sys_bin_size);
>>>         /* Provide the sys driver to bootloader */
>>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>> @@ -179,10 +177,8 @@ static int psp_v12_0_bootloader_load_sos(struct 
>>> psp_context *psp)
>>>       if (ret)
>>>           return ret;
>>>   -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>> -
>>>       /* Copy Secure OS binary to PSP memory */
>>> -    memcpy(psp->fw_pri_buf, psp->sos_start_addr, psp->sos_bin_size);
>>> +    psp_copy_fw(psp, psp->sos_start_addr, psp->sos_bin_size);
>>>         /* Provide the PSP secure OS to bootloader */
>>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c 
>>> b/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
>>> index f2e725f..d0a6cccd 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
>>> @@ -102,10 +102,8 @@ static int 
>>> psp_v3_1_bootloader_load_sysdrv(struct psp_context *psp)
>>>       if (ret)
>>>           return ret;
>>>   -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>> -
>>>       /* Copy PSP System Driver binary to memory */
>>> -    memcpy(psp->fw_pri_buf, psp->sys_start_addr, psp->sys_bin_size);
>>> +    psp_copy_fw(psp, psp->sys_start_addr, psp->sys_bin_size);
>>>         /* Provide the sys driver to bootloader */
>>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>> @@ -143,10 +141,8 @@ static int psp_v3_1_bootloader_load_sos(struct 
>>> psp_context *psp)
>>>       if (ret)
>>>           return ret;
>>>   -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>> -
>>>       /* Copy Secure OS binary to PSP memory */
>>> -    memcpy(psp->fw_pri_buf, psp->sos_start_addr, psp->sos_bin_size);
>>> +    psp_copy_fw(psp, psp->sos_start_addr, psp->sos_bin_size);
>>>         /* Provide the PSP secure OS to bootloader */
>>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>
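
The guard added throughout the hunks above reduces to one small idiom: bracket every hardware access with drm_dev_enter()/drm_dev_exit() so it cannot race with PCI removal. A minimal sketch of the pattern, assuming only the drm_dev_enter()/drm_dev_exit() API from <drm/drm_drv.h> (the helper name and register offset are hypothetical):

#include <drm/drm_drv.h>

/* Hypothetical MMIO write helper showing the unplug guard. */
static void example_wreg(struct amdgpu_device *adev, u32 reg, u32 v)
{
	int idx;

	/* drm_dev_enter() returns false once drm_dev_unplug() has run,
	 * so the register write below can never touch a removed device.
	 */
	if (!drm_dev_enter(&adev->ddev, &idx))
		return;

	writel(v, (void __iomem *)adev->rmmio + reg * 4);

	drm_dev_exit(idx);
}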


^ permalink raw reply	[flat|nested] 196+ messages in thread

* Re: [PATCH v4 10/14] drm/amdgpu: Move some sysfs attrs creation to default_attr
  2021-01-19  7:34     ` Greg KH
@ 2021-01-19 16:36       ` Andrey Grodzovsky
  -1 siblings, 0 replies; 196+ messages in thread
From: Andrey Grodzovsky @ 2021-01-19 16:36 UTC (permalink / raw)
  To: Greg KH
  Cc: ckoenig.leichtzumerken, dri-devel, amd-gfx, daniel.vetter,
	Alexander.Deucher, yuq825


On 1/19/21 2:34 AM, Greg KH wrote:
> On Mon, Jan 18, 2021 at 04:01:19PM -0500, Andrey Grodzovsky wrote:
>>   static struct pci_driver amdgpu_kms_pci_driver = {
>>   	.name = DRIVER_NAME,
>>   	.id_table = pciidlist,
>> @@ -1595,6 +1607,7 @@ static struct pci_driver amdgpu_kms_pci_driver = {
>>   	.shutdown = amdgpu_pci_shutdown,
>>   	.driver.pm = &amdgpu_pm_ops,
>>   	.err_handler = &amdgpu_pci_err_handler,
>> +	.driver.dev_groups = amdgpu_sysfs_groups,
> Shouldn't this just be:
> 	groups = amdgpu_sysfs_groups,
>
> Why go to the "driver root" here?


Because I still haven't gotten to your suggestion of proposing a patch to add
groups to pci_driver; for now the field is located in the 'base' driver struct.

Andrey


>
> Other than that tiny thing, looks good to me, nice cleanup!
>
> greg k-h

^ permalink raw reply	[flat|nested] 196+ messages in thread

* Re: [PATCH v4 10/14] drm/amdgpu: Move some sysfs attrs creation to default_attr
@ 2021-01-19 16:36       ` Andrey Grodzovsky
  0 siblings, 0 replies; 196+ messages in thread
From: Andrey Grodzovsky @ 2021-01-19 16:36 UTC (permalink / raw)
  To: Greg KH
  Cc: robh, ckoenig.leichtzumerken, dri-devel, eric, ppaalanen,
	amd-gfx, daniel.vetter, Alexander.Deucher, yuq825,
	Harry.Wentland, l.stach


On 1/19/21 2:34 AM, Greg KH wrote:
> On Mon, Jan 18, 2021 at 04:01:19PM -0500, Andrey Grodzovsky wrote:
>>   static struct pci_driver amdgpu_kms_pci_driver = {
>>   	.name = DRIVER_NAME,
>>   	.id_table = pciidlist,
>> @@ -1595,6 +1607,7 @@ static struct pci_driver amdgpu_kms_pci_driver = {
>>   	.shutdown = amdgpu_pci_shutdown,
>>   	.driver.pm = &amdgpu_pm_ops,
>>   	.err_handler = &amdgpu_pci_err_handler,
>> +	.driver.dev_groups = amdgpu_sysfs_groups,
> Shouldn't this just be:
> 	groups = amdgpu_sysfs_groups,
>
> Why go to the "driver root" here?


Because I still haven't gotten to your suggestion of proposing a patch to add
groups to pci_driver; for now the field is located in the 'base' driver struct.

Andrey


>
> Other than that tiny thing, looks good to me, nice cleanup!
>
> greg k-h

^ permalink raw reply	[flat|nested] 196+ messages in thread

* Re: [PATCH v4 00/14] RFC Support hot device unplug in amdgpu
  2021-01-19 14:16   ` Daniel Vetter
@ 2021-01-19 17:31     ` Andrey Grodzovsky
  -1 siblings, 0 replies; 196+ messages in thread
From: Andrey Grodzovsky @ 2021-01-19 17:31 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: gregkh, ckoenig.leichtzumerken, dri-devel, amd-gfx,
	daniel.vetter, Alexander.Deucher, yuq825


On 1/19/21 9:16 AM, Daniel Vetter wrote:
> On Mon, Jan 18, 2021 at 04:01:09PM -0500, Andrey Grodzovsky wrote:
>> Until now extracting a card either by physical extraction (e.g. eGPU with
>> thunderbolt connection or by emulation through  syfs -> /sys/bus/pci/devices/device_id/remove)
>> would cause random crashes in user apps. The random crashes in apps were
>> mostly due to the app having mapped a device backed BO into its address
>> space was still trying to access the BO while the backing device was gone.
>> To answer this first problem Christian suggested to fix the handling of mapped
>> memory in the clients when the device goes away by forcibly unmap all buffers the
>> user processes has by clearing their respective VMAs mapping the device BOs.
>> Then when the VMAs try to fill in the page tables again we check in the fault
>> handlerif the device is removed and if so, return an error. This will generate a
>> SIGBUS to the application which can then cleanly terminate.This indeed was done
>> but this in turn created a problem of kernel OOPs were the OOPSes were due to the
>> fact that while the app was terminating because of the SIGBUSit would trigger use
>> after free in the driver by calling to accesses device structures that were already
>> released from the pci remove sequence.This was handled by introducing a 'flush'
>> sequence during device removal were we wait for drm file reference to drop to 0
>> meaning all user clients directly using this device terminated.
>>
>> v2:
>> Based on discussions in the mailing list with Daniel and Pekka [1] and based on the document
>> produced by Pekka from those discussions [2] the whole approach with returning SIGBUS and
>> waiting for all user clients having CPU mapping of device BOs to die was dropped.
>> Instead as per the document suggestion the device structures are kept alive until
>> the last reference to the device is dropped by user client and in the meanwhile all existing and new CPU mappings of the BOs
>> belonging to the device directly or by dma-buf import are rerouted to per user
>> process dummy rw page.Also, I skipped the 'Requirements for KMS UAPI' section of [2]
>> since i am trying to get the minimal set of requirements that still give useful solution
>> to work and this is the'Requirements for Render and Cross-Device UAPI' section and so my
>> test case is removing a secondary device, which is render only and is not involved
>> in KMS.
>>
>> v3:
>> More updates following comments from v2 such as removing loop to find DRM file when rerouting
>> page faults to dummy page,getting rid of unnecessary sysfs handling refactoring and moving
>> prevention of GPU recovery post device unplug from amdgpu to scheduler layer.
>> On top of that added unplug support for the IOMMU enabled system.
>>
>> v4:
>> Drop last sysfs hack and use sysfs default attribute.
>> Guard against write accesses after device removal to avoid modifying released memory.
>> Update dummy pages handling to on demand allocation and release through drm managed framework.
>> Add return value to scheduler job TO handler (by Luben Tuikov) and use this in amdgpu for prevention
>> of GPU recovery post device unplug
>> Also rebase on top of drm-misc-mext instead of amd-staging-drm-next
>>
>> With these patches I am able to gracefully remove the secondary card using sysfs remove hook while glxgears
>> is running off of secondary card (DRI_PRIME=1) without kernel oopses or hangs and keep working
>> with the primary card or soft reset the device without hangs or oopses
>>
>> TODOs for followup work:
>> Convert AMDGPU code to use devm (for hw stuff) and drmm (for sw stuff and allocations) (Daniel)
>> Support plugging the secondary device back after unplug - currently still experiencing HW error on plugging back.
>> Add support for 'Requirements for KMS UAPI' section of [2] - unplugging primary, display connected card.
>>
>> [1] - Discussions during v3 of the patchset https://www.spinics.net/lists/amd-gfx/msg55576.html
>> [2] - drm/doc: device hot-unplug for userspace https://www.spinics.net/lists/dri-devel/msg259755.html
>> [3] - Related gitlab ticket https://gitlab.freedesktop.org/drm/amd/-/issues/1081
> btw have you tried this out with some of the igts we have? core_hotunplug
> is the one I'm thinking of. Might be worth extending this for amdgpu
> specific stuff (like running some batches on it while hotunplugging).

No, I mostly just ran glxgears while testing, which already covers the
exported/imported dma-buf case, plus a few manually hacked tests in the
libdrm amdgpu test suite.


>
> Since there are so many corner cases we need to test here (shared dma-buf,
> shared dma_fence) I think it would make sense to have a shared testcase
> across drivers.


I'm not too familiar with IGT. Is there an easy way to set up shared dma-buf
and fence use cases there, or do you mean I need to add them now?
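
For reference, the shared dma-buf setup such a testcase needs boils down to
exporting a GEM handle from one device and importing it into another via the
PRIME ioctls. A rough libdrm sketch, with error handling omitted and the
device paths and the pre-created handle0 assumed:

#include <fcntl.h>
#include <stdint.h>
#include <xf86drm.h>

/* handle0 is assumed to be a GEM handle already created on fd0 */
static void share_buffer(uint32_t handle0)
{
	int fd0 = open("/dev/dri/renderD128", O_RDWR);	/* exporting device */
	int fd1 = open("/dev/dri/renderD129", O_RDWR);	/* importing device */
	int dmabuf_fd;
	uint32_t handle1;

	/* export the GEM handle from device 0 as a dma-buf file descriptor */
	drmPrimeHandleToFD(fd0, handle0, DRM_CLOEXEC | DRM_RDWR, &dmabuf_fd);

	/* import it into device 1; this import is what must keep working
	 * after device 0 is unplugged */
	drmPrimeFDToHandle(fd1, dmabuf_fd, &handle1);
}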


> Only specific thing would be some hooks to keep the gpu
> busy in some fashion while we yank the driver.


Do you mean starting X and some active rendering on top (like glxgears)
automatically from within IGT?


> But just to get it started
> you can throw in entirely amdgpu specific subtests and just share some of
> the test code.
> -Daniel


In general, I wasn't aware of this test suite, and it looks like it covers
what I test, among other things. I will definitely try running with it,
although the rescan part will not work: plugging the device back in is on my
TODO list and not in scope for this patchset, so I will probably comment the
re-scan section out while testing.

Andrey


>
>> Andrey Grodzovsky (13):
>>    drm/ttm: Remap all page faults to per process dummy page.
>>    drm: Unmap the entire device address space on device unplug
>>    drm/ttm: Expose ttm_tt_unpopulate for driver use
>>    drm/sched: Cancel and flush all outstanding jobs before finish.
>>    drm/amdgpu: Split amdgpu_device_fini into early and late
>>    drm/amdgpu: Add early fini callback
>>    drm/amdgpu: Register IOMMU topology notifier per device.
>>    drm/amdgpu: Fix a bunch of sdma code crash post device unplug
>>    drm/amdgpu: Remap all page faults to per process dummy page.
>>    drm/amdgpu: Move some sysfs attrs creation to default_attr
>>    drm/amdgpu: Guard against write accesses after device removal
>>    drm/sched: Make timeout timer rearm conditional.
>>    drm/amdgpu: Prevent any job recoveries after device is unplugged.
>>
>> Luben Tuikov (1):
>>    drm/scheduler: Job timeout handler returns status
>>
>>   drivers/gpu/drm/amd/amdgpu/amdgpu.h               |  11 +-
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c      |  17 +--
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c        | 149 ++++++++++++++++++++--
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c           |  20 ++-
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c         |  15 ++-
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c          |   2 +-
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h          |   1 +
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c           |   9 ++
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c       |  25 ++--
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c           |  26 ++--
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_irq.h           |   3 +-
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_job.c           |  19 ++-
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c           |  12 +-
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_object.c        |  10 ++
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_object.h        |   2 +
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c           |  53 +++++---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h           |   3 +
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c           |   1 +
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c          |  70 ++++++++++
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h          |  52 +-------
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c           |  21 ++-
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c            |   8 +-
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c      |  14 +-
>>   drivers/gpu/drm/amd/amdgpu/cik_ih.c               |   2 +-
>>   drivers/gpu/drm/amd/amdgpu/cz_ih.c                |   2 +-
>>   drivers/gpu/drm/amd/amdgpu/iceland_ih.c           |   2 +-
>>   drivers/gpu/drm/amd/amdgpu/navi10_ih.c            |   2 +-
>>   drivers/gpu/drm/amd/amdgpu/psp_v11_0.c            |  16 +--
>>   drivers/gpu/drm/amd/amdgpu/psp_v12_0.c            |   8 +-
>>   drivers/gpu/drm/amd/amdgpu/psp_v3_1.c             |   8 +-
>>   drivers/gpu/drm/amd/amdgpu/si_ih.c                |   2 +-
>>   drivers/gpu/drm/amd/amdgpu/tonga_ih.c             |   2 +-
>>   drivers/gpu/drm/amd/amdgpu/vega10_ih.c            |   2 +-
>>   drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c |  12 +-
>>   drivers/gpu/drm/amd/include/amd_shared.h          |   2 +
>>   drivers/gpu/drm/drm_drv.c                         |   3 +
>>   drivers/gpu/drm/etnaviv/etnaviv_sched.c           |  10 +-
>>   drivers/gpu/drm/lima/lima_sched.c                 |   4 +-
>>   drivers/gpu/drm/panfrost/panfrost_job.c           |   9 +-
>>   drivers/gpu/drm/scheduler/sched_main.c            |  18 ++-
>>   drivers/gpu/drm/ttm/ttm_bo_vm.c                   |  82 +++++++++++-
>>   drivers/gpu/drm/ttm/ttm_tt.c                      |   1 +
>>   drivers/gpu/drm/v3d/v3d_sched.c                   |  32 ++---
>>   include/drm/gpu_scheduler.h                       |  17 ++-
>>   include/drm/ttm/ttm_bo_api.h                      |   2 +
>>   45 files changed, 583 insertions(+), 198 deletions(-)
>>
>> -- 
>> 2.7.4
>>

^ permalink raw reply	[flat|nested] 196+ messages in thread

* Re: [PATCH v4 00/14] RFC Support hot device unplug in amdgpu
@ 2021-01-19 17:31     ` Andrey Grodzovsky
  0 siblings, 0 replies; 196+ messages in thread
From: Andrey Grodzovsky @ 2021-01-19 17:31 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: robh, gregkh, ckoenig.leichtzumerken, dri-devel, eric, ppaalanen,
	amd-gfx, daniel.vetter, Alexander.Deucher, yuq825,
	Harry.Wentland, l.stach


On 1/19/21 9:16 AM, Daniel Vetter wrote:
> On Mon, Jan 18, 2021 at 04:01:09PM -0500, Andrey Grodzovsky wrote:
>> Until now extracting a card either by physical extraction (e.g. eGPU with
>> thunderbolt connection or by emulation through  syfs -> /sys/bus/pci/devices/device_id/remove)
>> would cause random crashes in user apps. The random crashes in apps were
>> mostly due to the app having mapped a device backed BO into its address
>> space was still trying to access the BO while the backing device was gone.
>> To answer this first problem Christian suggested to fix the handling of mapped
>> memory in the clients when the device goes away by forcibly unmap all buffers the
>> user processes has by clearing their respective VMAs mapping the device BOs.
>> Then when the VMAs try to fill in the page tables again we check in the fault
>> handlerif the device is removed and if so, return an error. This will generate a
>> SIGBUS to the application which can then cleanly terminate.This indeed was done
>> but this in turn created a problem of kernel OOPs were the OOPSes were due to the
>> fact that while the app was terminating because of the SIGBUSit would trigger use
>> after free in the driver by calling to accesses device structures that were already
>> released from the pci remove sequence.This was handled by introducing a 'flush'
>> sequence during device removal were we wait for drm file reference to drop to 0
>> meaning all user clients directly using this device terminated.
>>
>> v2:
>> Based on discussions in the mailing list with Daniel and Pekka [1] and based on the document
>> produced by Pekka from those discussions [2] the whole approach with returning SIGBUS and
>> waiting for all user clients having CPU mapping of device BOs to die was dropped.
>> Instead as per the document suggestion the device structures are kept alive until
>> the last reference to the device is dropped by user client and in the meanwhile all existing and new CPU mappings of the BOs
>> belonging to the device directly or by dma-buf import are rerouted to per user
>> process dummy rw page.Also, I skipped the 'Requirements for KMS UAPI' section of [2]
>> since i am trying to get the minimal set of requirements that still give useful solution
>> to work and this is the'Requirements for Render and Cross-Device UAPI' section and so my
>> test case is removing a secondary device, which is render only and is not involved
>> in KMS.
>>
>> v3:
>> More updates following comments from v2 such as removing loop to find DRM file when rerouting
>> page faults to dummy page,getting rid of unnecessary sysfs handling refactoring and moving
>> prevention of GPU recovery post device unplug from amdgpu to scheduler layer.
>> On top of that added unplug support for the IOMMU enabled system.
>>
>> v4:
>> Drop last sysfs hack and use sysfs default attribute.
>> Guard against write accesses after device removal to avoid modifying released memory.
>> Update dummy pages handling to on demand allocation and release through drm managed framework.
>> Add return value to scheduler job TO handler (by Luben Tuikov) and use this in amdgpu for prevention
>> of GPU recovery post device unplug
>> Also rebase on top of drm-misc-mext instead of amd-staging-drm-next
>>
>> With these patches I am able to gracefully remove the secondary card using sysfs remove hook while glxgears
>> is running off of secondary card (DRI_PRIME=1) without kernel oopses or hangs and keep working
>> with the primary card or soft reset the device without hangs or oopses
>>
>> TODOs for followup work:
>> Convert AMDGPU code to use devm (for hw stuff) and drmm (for sw stuff and allocations) (Daniel)
>> Support plugging the secondary device back after unplug - currently still experiencing HW error on plugging back.
>> Add support for 'Requirements for KMS UAPI' section of [2] - unplugging primary, display connected card.
>>
>> [1] - Discussions during v3 of the patchset https://www.spinics.net/lists/amd-gfx/msg55576.html
>> [2] - drm/doc: device hot-unplug for userspace https://www.spinics.net/lists/dri-devel/msg259755.html
>> [3] - Related gitlab ticket https://gitlab.freedesktop.org/drm/amd/-/issues/1081
> btw have you tried this out with some of the igts we have? core_hotunplug
> is the one I'm thinking of. Might be worth extending this for amdgpu
> specific stuff (like running some batches on it while hotunplugging).

No, I mostly just ran glxgears while testing, which already covers the
exported/imported dma-buf case, plus a few manually hacked tests in the
libdrm amdgpu test suite.


>
> Since there are so many corner cases we need to test here (shared dma-buf,
> shared dma_fence) I think it would make sense to have a shared testcase
> across drivers.


I'm not too familiar with IGT. Is there an easy way to set up shared dma-buf
and fence use cases there, or do you mean I need to add them now?


> Only specific thing would be some hooks to keep the gpu
> busy in some fashion while we yank the driver.


Do you mean starting X and some active rendering on top (like glxgears)
automatically from within IGT?


> But just to get it started
> you can throw in entirely amdgpu specific subtests and just share some of
> the test code.
> -Daniel


In general, I wasn't aware of this test suite, and it looks like it covers
what I test, among other things. I will definitely try running with it,
although the rescan part will not work: plugging the device back in is on my
TODO list and not in scope for this patchset, so I will probably comment the
re-scan section out while testing.

Andrey


>
>> Andrey Grodzovsky (13):
>>    drm/ttm: Remap all page faults to per process dummy page.
>>    drm: Unmap the entire device address space on device unplug
>>    drm/ttm: Expose ttm_tt_unpopulate for driver use
>>    drm/sched: Cancel and flush all outstanding jobs before finish.
>>    drm/amdgpu: Split amdgpu_device_fini into early and late
>>    drm/amdgpu: Add early fini callback
>>    drm/amdgpu: Register IOMMU topology notifier per device.
>>    drm/amdgpu: Fix a bunch of sdma code crash post device unplug
>>    drm/amdgpu: Remap all page faults to per process dummy page.
>>    drm/amdgpu: Move some sysfs attrs creation to default_attr
>>    drm/amdgpu: Guard against write accesses after device removal
>>    drm/sched: Make timeout timer rearm conditional.
>>    drm/amdgpu: Prevent any job recoveries after device is unplugged.
>>
>> Luben Tuikov (1):
>>    drm/scheduler: Job timeout handler returns status
>>
>>   drivers/gpu/drm/amd/amdgpu/amdgpu.h               |  11 +-
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c      |  17 +--
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c        | 149 ++++++++++++++++++++--
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c           |  20 ++-
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c         |  15 ++-
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c          |   2 +-
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h          |   1 +
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c           |   9 ++
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c       |  25 ++--
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c           |  26 ++--
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_irq.h           |   3 +-
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_job.c           |  19 ++-
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c           |  12 +-
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_object.c        |  10 ++
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_object.h        |   2 +
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c           |  53 +++++---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h           |   3 +
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c           |   1 +
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c          |  70 ++++++++++
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h          |  52 +-------
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c           |  21 ++-
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c            |   8 +-
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c      |  14 +-
>>   drivers/gpu/drm/amd/amdgpu/cik_ih.c               |   2 +-
>>   drivers/gpu/drm/amd/amdgpu/cz_ih.c                |   2 +-
>>   drivers/gpu/drm/amd/amdgpu/iceland_ih.c           |   2 +-
>>   drivers/gpu/drm/amd/amdgpu/navi10_ih.c            |   2 +-
>>   drivers/gpu/drm/amd/amdgpu/psp_v11_0.c            |  16 +--
>>   drivers/gpu/drm/amd/amdgpu/psp_v12_0.c            |   8 +-
>>   drivers/gpu/drm/amd/amdgpu/psp_v3_1.c             |   8 +-
>>   drivers/gpu/drm/amd/amdgpu/si_ih.c                |   2 +-
>>   drivers/gpu/drm/amd/amdgpu/tonga_ih.c             |   2 +-
>>   drivers/gpu/drm/amd/amdgpu/vega10_ih.c            |   2 +-
>>   drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c |  12 +-
>>   drivers/gpu/drm/amd/include/amd_shared.h          |   2 +
>>   drivers/gpu/drm/drm_drv.c                         |   3 +
>>   drivers/gpu/drm/etnaviv/etnaviv_sched.c           |  10 +-
>>   drivers/gpu/drm/lima/lima_sched.c                 |   4 +-
>>   drivers/gpu/drm/panfrost/panfrost_job.c           |   9 +-
>>   drivers/gpu/drm/scheduler/sched_main.c            |  18 ++-
>>   drivers/gpu/drm/ttm/ttm_bo_vm.c                   |  82 +++++++++++-
>>   drivers/gpu/drm/ttm/ttm_tt.c                      |   1 +
>>   drivers/gpu/drm/v3d/v3d_sched.c                   |  32 ++---
>>   include/drm/gpu_scheduler.h                       |  17 ++-
>>   include/drm/ttm/ttm_bo_api.h                      |   2 +
>>   45 files changed, 583 insertions(+), 198 deletions(-)
>>
>> -- 
>> 2.7.4
>>

^ permalink raw reply	[flat|nested] 196+ messages in thread

* Re: [PATCH v4 10/14] drm/amdgpu: Move some sysfs attrs creation to default_attr
  2021-01-19 16:36       ` Andrey Grodzovsky
@ 2021-01-19 17:47         ` Greg KH
  -1 siblings, 0 replies; 196+ messages in thread
From: Greg KH @ 2021-01-19 17:47 UTC (permalink / raw)
  To: Andrey Grodzovsky
  Cc: ckoenig.leichtzumerken, dri-devel, amd-gfx, daniel.vetter,
	Alexander.Deucher, yuq825

On Tue, Jan 19, 2021 at 11:36:01AM -0500, Andrey Grodzovsky wrote:
> 
> On 1/19/21 2:34 AM, Greg KH wrote:
> > On Mon, Jan 18, 2021 at 04:01:19PM -0500, Andrey Grodzovsky wrote:
> > >   static struct pci_driver amdgpu_kms_pci_driver = {
> > >   	.name = DRIVER_NAME,
> > >   	.id_table = pciidlist,
> > > @@ -1595,6 +1607,7 @@ static struct pci_driver amdgpu_kms_pci_driver = {
> > >   	.shutdown = amdgpu_pci_shutdown,
> > >   	.driver.pm = &amdgpu_pm_ops,
> > >   	.err_handler = &amdgpu_pci_err_handler,
> > > +	.driver.dev_groups = amdgpu_sysfs_groups,
> > Shouldn't this just be:
> > 	groups = amdgpu_sysfs_groups,
> > 
> > Why go to the "driver root" here?
> 
> 
> Because I still haven't gotten to your suggestion of proposing a patch to add
> groups to pci_driver; for now the field is located in the 'base' driver struct.

You are a pci driver, you should never have to mess with the "base"
driver struct.  Look at commit 92d50fc1602e ("PCI/IB: add support for
pci driver attribute groups") which got merged in 4.14, way back in
2017 :)
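
For reference, the two placements differ in where the attributes land; a
sketch (field names per struct pci_driver and struct device_driver of that
era, the group arrays themselves hypothetical):

static struct pci_driver example_pci_driver = {
	/* groups on the driver object itself, i.e. visible under
	 * /sys/bus/pci/drivers/<name>/ -- what pci_driver gained in
	 * the commit above: */
	.groups = example_driver_groups,

	/* groups instantiated on every bound device, i.e. under
	 * /sys/bus/pci/devices/<bdf>/ -- currently only reachable
	 * through the embedded struct device_driver: */
	.driver.dev_groups = example_device_groups,
};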

driver.pm also looks odd, but I'm just going to ignore that for now...

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 196+ messages in thread

* Re: [PATCH v4 10/14] drm/amdgpu: Move some sysfs attrs creation to default_attr
@ 2021-01-19 17:47         ` Greg KH
  0 siblings, 0 replies; 196+ messages in thread
From: Greg KH @ 2021-01-19 17:47 UTC (permalink / raw)
  To: Andrey Grodzovsky
  Cc: robh, ckoenig.leichtzumerken, dri-devel, eric, ppaalanen,
	amd-gfx, daniel.vetter, Alexander.Deucher, yuq825,
	Harry.Wentland, l.stach

On Tue, Jan 19, 2021 at 11:36:01AM -0500, Andrey Grodzovsky wrote:
> 
> On 1/19/21 2:34 AM, Greg KH wrote:
> > On Mon, Jan 18, 2021 at 04:01:19PM -0500, Andrey Grodzovsky wrote:
> > >   static struct pci_driver amdgpu_kms_pci_driver = {
> > >   	.name = DRIVER_NAME,
> > >   	.id_table = pciidlist,
> > > @@ -1595,6 +1607,7 @@ static struct pci_driver amdgpu_kms_pci_driver = {
> > >   	.shutdown = amdgpu_pci_shutdown,
> > >   	.driver.pm = &amdgpu_pm_ops,
> > >   	.err_handler = &amdgpu_pci_err_handler,
> > > +	.driver.dev_groups = amdgpu_sysfs_groups,
> > Shouldn't this just be:
> > 	groups = amdgpu_sysfs_groups,
> > 
> > Why go to the "driver root" here?
> 
> 
> Because I still haven't gotten to your suggestion of proposing a patch to add
> groups to pci_driver; for now the field is located in the 'base' driver struct.

You are a pci driver, you should never have to mess with the "base"
driver struct.  Look at commit 92d50fc1602e ("PCI/IB: add support for
pci driver attribute groups") which got merged in 4.14, way back in
2017 :)

driver.pm also looks odd, but I'm just going to ignore that for now...

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 196+ messages in thread

* Re: [PATCH v4 12/14] drm/scheduler: Job timeout handler returns status
  2021-01-19  7:53     ` Christian König
@ 2021-01-19 17:47       ` Luben Tuikov
  -1 siblings, 0 replies; 196+ messages in thread
From: Luben Tuikov @ 2021-01-19 17:47 UTC (permalink / raw)
  To: Christian König, Andrey Grodzovsky, amd-gfx, dri-devel,
	ckoenig.leichtzumerken, daniel.vetter, robh, l.stach, yuq825,
	eric
  Cc: Tomeu Vizoso, gregkh, Steven Price, Alyssa Rosenzweig,
	Russell King, Alexander.Deucher

On 2021-01-19 2:53 a.m., Christian König wrote:
> Am 18.01.21 um 22:01 schrieb Andrey Grodzovsky:
>> From: Luben Tuikov <luben.tuikov@amd.com>
>>
>> This patch does not change current behaviour.
>>
>> The driver's job timeout handler now returns
>> status indicating back to the DRM layer whether
>> the task (job) was successfully aborted or whether
>> more time should be given to the task to complete.
>>
>> Default behaviour as of this patch, is preserved,
>> except in obvious-by-comment case in the Panfrost
>> driver, as documented below.
>>
>> All drivers which make use of the
>> drm_sched_backend_ops' .timedout_job() callback
>> have been accordingly renamed and return the
>> would've-been default value of
>> DRM_TASK_STATUS_ALIVE to restart the task's
>> timeout timer--this is the old behaviour, and
>> is preserved by this patch.
>>
>> In the case of the Panfrost driver, its timedout
>> callback correctly first checks if the job had
>> completed in due time and if so, it now returns
>> DRM_TASK_STATUS_COMPLETE to notify the DRM layer
>> that the task can be moved to the done list, to be
>> freed later. In the other two subsequent checks,
>> the value of DRM_TASK_STATUS_ALIVE is returned, as
>> per the default behaviour.
>>
>> More involved driver solutions can be had
>> in subsequent patches.
>>
>> v2: Use enum as the status of a driver's job
>>      timeout callback method.
>>
>> v4: (By Andrey Grodzovsky)
>> Replace DRM_TASK_STATUS_COMPLETE with DRM_TASK_STATUS_ENODEV
>> to enable a hint to the scheduler for when NOT to rearm the
>> timeout timer.
> As Lukas pointed out, returning the job (or task) status doesn't make
> much sense.
>
> What we return here is the status of the scheduler.
>
> I would either rename the enum or completely drop it and return a 
> negative error status.

Yes, that could be had.

That said, dropping the enum and returning [-1, 0] might make the meaning of
the return status vague. Using an enum with an appropriate name makes the
intention clear to the next programmer.

Now, Andrey did rename one of the enumerated values to
DRM_TASK_STATUS_ENODEV; perhaps the same, but with:

enum drm_sched_status {
    DRM_SCHED_STAT_NONE, /* Reserve 0 */
    DRM_SCHED_STAT_NOMINAL,
    DRM_SCHED_STAT_ENODEV,
};

and also renaming the enum itself as above; would that be acceptable?
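
For illustration, a driver callback under that naming could read as follows
(a sketch against the proposed enum only; the example_* helpers are
hypothetical and none of this is merged API):

static enum drm_sched_status example_timedout_job(struct drm_sched_job *sched_job)
{
	/* the job completed just as the timer fired; nothing to abort */
	if (example_job_is_signaled(sched_job))
		return DRM_SCHED_STAT_NOMINAL;	/* scheduler healthy, rearm */

	/* the device was unplugged in the meanwhile */
	if (example_device_is_gone(sched_job))
		return DRM_SCHED_STAT_ENODEV;	/* no device, do not rearm */

	example_trigger_reset(sched_job);	/* device recovery path */
	return DRM_SCHED_STAT_NOMINAL;
}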

Regards,
Luben

> Apart from that looks fine to me,
> Christian.
>
>
>> Cc: Alexander Deucher <Alexander.Deucher@amd.com>
>> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
>> Cc: Christian König <christian.koenig@amd.com>
>> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
>> Cc: Lucas Stach <l.stach@pengutronix.de>
>> Cc: Russell King <linux+etnaviv@armlinux.org.uk>
>> Cc: Christian Gmeiner <christian.gmeiner@gmail.com>
>> Cc: Qiang Yu <yuq825@gmail.com>
>> Cc: Rob Herring <robh@kernel.org>
>> Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com>
>> Cc: Steven Price <steven.price@arm.com>
>> Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
>> Cc: Eric Anholt <eric@anholt.net>
>> Reported-by: kernel test robot <lkp@intel.com>
>> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_job.c |  6 ++++--
>>   drivers/gpu/drm/etnaviv/etnaviv_sched.c | 10 +++++++++-
>>   drivers/gpu/drm/lima/lima_sched.c       |  4 +++-
>>   drivers/gpu/drm/panfrost/panfrost_job.c |  9 ++++++---
>>   drivers/gpu/drm/scheduler/sched_main.c  |  4 +---
>>   drivers/gpu/drm/v3d/v3d_sched.c         | 32 +++++++++++++++++---------------
>>   include/drm/gpu_scheduler.h             | 17 ++++++++++++++---
>>   7 files changed, 54 insertions(+), 28 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>> index ff48101..a111326 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>> @@ -28,7 +28,7 @@
>>   #include "amdgpu.h"
>>   #include "amdgpu_trace.h"
>>   
>> -static void amdgpu_job_timedout(struct drm_sched_job *s_job)
>> +static enum drm_task_status amdgpu_job_timedout(struct drm_sched_job *s_job)
>>   {
>>   	struct amdgpu_ring *ring = to_amdgpu_ring(s_job->sched);
>>   	struct amdgpu_job *job = to_amdgpu_job(s_job);
>> @@ -41,7 +41,7 @@ static void amdgpu_job_timedout(struct drm_sched_job *s_job)
>>   	    amdgpu_ring_soft_recovery(ring, job->vmid, s_job->s_fence->parent)) {
>>   		DRM_ERROR("ring %s timeout, but soft recovered\n",
>>   			  s_job->sched->name);
>> -		return;
>> +		return DRM_TASK_STATUS_ALIVE;
>>   	}
>>   
>>   	amdgpu_vm_get_task_info(ring->adev, job->pasid, &ti);
>> @@ -53,10 +53,12 @@ static void amdgpu_job_timedout(struct drm_sched_job *s_job)
>>   
>>   	if (amdgpu_device_should_recover_gpu(ring->adev)) {
>>   		amdgpu_device_gpu_recover(ring->adev, job);
>> +		return DRM_TASK_STATUS_ALIVE;
>>   	} else {
>>   		drm_sched_suspend_timeout(&ring->sched);
>>   		if (amdgpu_sriov_vf(adev))
>>   			adev->virt.tdr_debug = true;
>> +		return DRM_TASK_STATUS_ALIVE;
>>   	}
>>   }
>>   
>> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_sched.c b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
>> index cd46c88..c495169 100644
>> --- a/drivers/gpu/drm/etnaviv/etnaviv_sched.c
>> +++ b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
>> @@ -82,7 +82,8 @@ static struct dma_fence *etnaviv_sched_run_job(struct drm_sched_job *sched_job)
>>   	return fence;
>>   }
>>   
>> -static void etnaviv_sched_timedout_job(struct drm_sched_job *sched_job)
>> +static enum drm_task_status etnaviv_sched_timedout_job(struct drm_sched_job
>> +						       *sched_job)
>>   {
>>   	struct etnaviv_gem_submit *submit = to_etnaviv_submit(sched_job);
>>   	struct etnaviv_gpu *gpu = submit->gpu;
>> @@ -120,9 +121,16 @@ static void etnaviv_sched_timedout_job(struct drm_sched_job *sched_job)
>>   
>>   	drm_sched_resubmit_jobs(&gpu->sched);
>>   
>> +	/* Tell the DRM scheduler that this task needs
>> +	 * more time.
>> +	 */
>> +	drm_sched_start(&gpu->sched, true);
>> +	return DRM_TASK_STATUS_ALIVE;
>> +
>>   out_no_timeout:
>>   	/* restart scheduler after GPU is usable again */
>>   	drm_sched_start(&gpu->sched, true);
>> +	return DRM_TASK_STATUS_ALIVE;
>>   }
>>   
>>   static void etnaviv_sched_free_job(struct drm_sched_job *sched_job)
>> diff --git a/drivers/gpu/drm/lima/lima_sched.c b/drivers/gpu/drm/lima/lima_sched.c
>> index 63b4c56..66d9236 100644
>> --- a/drivers/gpu/drm/lima/lima_sched.c
>> +++ b/drivers/gpu/drm/lima/lima_sched.c
>> @@ -415,7 +415,7 @@ static void lima_sched_build_error_task_list(struct lima_sched_task *task)
>>   	mutex_unlock(&dev->error_task_list_lock);
>>   }
>>   
>> -static void lima_sched_timedout_job(struct drm_sched_job *job)
>> +static enum drm_task_status lima_sched_timedout_job(struct drm_sched_job *job)
>>   {
>>   	struct lima_sched_pipe *pipe = to_lima_pipe(job->sched);
>>   	struct lima_sched_task *task = to_lima_task(job);
>> @@ -449,6 +449,8 @@ static void lima_sched_timedout_job(struct drm_sched_job *job)
>>   
>>   	drm_sched_resubmit_jobs(&pipe->base);
>>   	drm_sched_start(&pipe->base, true);
>> +
>> +	return DRM_TASK_STATUS_ALIVE;
>>   }
>>   
>>   static void lima_sched_free_job(struct drm_sched_job *job)
>> diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
>> index 04e6f6f..10d41ac 100644
>> --- a/drivers/gpu/drm/panfrost/panfrost_job.c
>> +++ b/drivers/gpu/drm/panfrost/panfrost_job.c
>> @@ -432,7 +432,8 @@ static void panfrost_scheduler_start(struct panfrost_queue_state *queue)
>>   	mutex_unlock(&queue->lock);
>>   }
>>   
>> -static void panfrost_job_timedout(struct drm_sched_job *sched_job)
>> +static enum drm_task_status panfrost_job_timedout(struct drm_sched_job
>> +						  *sched_job)
>>   {
>>   	struct panfrost_job *job = to_panfrost_job(sched_job);
>>   	struct panfrost_device *pfdev = job->pfdev;
>> @@ -443,7 +444,7 @@ static void panfrost_job_timedout(struct drm_sched_job *sched_job)
>>   	 * spurious. Bail out.
>>   	 */
>>   	if (dma_fence_is_signaled(job->done_fence))
>> -		return;
>> +		return DRM_TASK_STATUS_ALIVE;
>>   
>>   	dev_err(pfdev->dev, "gpu sched timeout, js=%d, config=0x%x, status=0x%x, head=0x%x, tail=0x%x, sched_job=%p",
>>   		js,
>> @@ -455,11 +456,13 @@ static void panfrost_job_timedout(struct drm_sched_job *sched_job)
>>   
>>   	/* Scheduler is already stopped, nothing to do. */
>>   	if (!panfrost_scheduler_stop(&pfdev->js->queue[js], sched_job))
>> -		return;
>> +		return DRM_TASK_STATUS_ALIVE;
>>   
>>   	/* Schedule a reset if there's no reset in progress. */
>>   	if (!atomic_xchg(&pfdev->reset.pending, 1))
>>   		schedule_work(&pfdev->reset.work);
>> +
>> +	return DRM_TASK_STATUS_ALIVE;
>>   }
>>   
>>   static const struct drm_sched_backend_ops panfrost_sched_ops = {
>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
>> index 92637b7..73fccc5 100644
>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>> @@ -527,7 +527,7 @@ void drm_sched_start(struct drm_gpu_scheduler *sched, bool full_recovery)
>>   EXPORT_SYMBOL(drm_sched_start);
>>   
>>   /**
>> - * drm_sched_resubmit_jobs - helper to relunch job from pending ring list
>> + * drm_sched_resubmit_jobs - helper to relaunch jobs from the pending list
>>    *
>>    * @sched: scheduler instance
>>    *
>> @@ -561,8 +561,6 @@ void drm_sched_resubmit_jobs(struct drm_gpu_scheduler *sched)
>>   		} else {
>>   			s_job->s_fence->parent = fence;
>>   		}
>> -
>> -
>>   	}
>>   }
>>   EXPORT_SYMBOL(drm_sched_resubmit_jobs);
>> diff --git a/drivers/gpu/drm/v3d/v3d_sched.c b/drivers/gpu/drm/v3d/v3d_sched.c
>> index 452682e..3740665e 100644
>> --- a/drivers/gpu/drm/v3d/v3d_sched.c
>> +++ b/drivers/gpu/drm/v3d/v3d_sched.c
>> @@ -259,7 +259,7 @@ v3d_cache_clean_job_run(struct drm_sched_job *sched_job)
>>   	return NULL;
>>   }
>>   
>> -static void
>> +static enum drm_task_status
>>   v3d_gpu_reset_for_timeout(struct v3d_dev *v3d, struct drm_sched_job *sched_job)
>>   {
>>   	enum v3d_queue q;
>> @@ -285,6 +285,8 @@ v3d_gpu_reset_for_timeout(struct v3d_dev *v3d, struct drm_sched_job *sched_job)
>>   	}
>>   
>>   	mutex_unlock(&v3d->reset_lock);
>> +
>> +	return DRM_TASK_STATUS_ALIVE;
>>   }
>>   
>>   /* If the current address or return address have changed, then the GPU
>> @@ -292,7 +294,7 @@ v3d_gpu_reset_for_timeout(struct v3d_dev *v3d, struct drm_sched_job *sched_job)
>>    * could fail if the GPU got in an infinite loop in the CL, but that
>>    * is pretty unlikely outside of an i-g-t testcase.
>>    */
>> -static void
>> +static enum drm_task_status
>>   v3d_cl_job_timedout(struct drm_sched_job *sched_job, enum v3d_queue q,
>>   		    u32 *timedout_ctca, u32 *timedout_ctra)
>>   {
>> @@ -304,39 +306,39 @@ v3d_cl_job_timedout(struct drm_sched_job *sched_job, enum v3d_queue q,
>>   	if (*timedout_ctca != ctca || *timedout_ctra != ctra) {
>>   		*timedout_ctca = ctca;
>>   		*timedout_ctra = ctra;
>> -		return;
>> +		return DRM_TASK_STATUS_ALIVE;
>>   	}
>>   
>> -	v3d_gpu_reset_for_timeout(v3d, sched_job);
>> +	return v3d_gpu_reset_for_timeout(v3d, sched_job);
>>   }
>>   
>> -static void
>> +static enum drm_task_status
>>   v3d_bin_job_timedout(struct drm_sched_job *sched_job)
>>   {
>>   	struct v3d_bin_job *job = to_bin_job(sched_job);
>>   
>> -	v3d_cl_job_timedout(sched_job, V3D_BIN,
>> -			    &job->timedout_ctca, &job->timedout_ctra);
>> +	return v3d_cl_job_timedout(sched_job, V3D_BIN,
>> +				   &job->timedout_ctca, &job->timedout_ctra);
>>   }
>>   
>> -static void
>> +static enum drm_task_status
>>   v3d_render_job_timedout(struct drm_sched_job *sched_job)
>>   {
>>   	struct v3d_render_job *job = to_render_job(sched_job);
>>   
>> -	v3d_cl_job_timedout(sched_job, V3D_RENDER,
>> -			    &job->timedout_ctca, &job->timedout_ctra);
>> +	return v3d_cl_job_timedout(sched_job, V3D_RENDER,
>> +				   &job->timedout_ctca, &job->timedout_ctra);
>>   }
>>   
>> -static void
>> +static enum drm_task_status
>>   v3d_generic_job_timedout(struct drm_sched_job *sched_job)
>>   {
>>   	struct v3d_job *job = to_v3d_job(sched_job);
>>   
>> -	v3d_gpu_reset_for_timeout(job->v3d, sched_job);
>> +	return v3d_gpu_reset_for_timeout(job->v3d, sched_job);
>>   }
>>   
>> -static void
>> +static enum drm_task_status
>>   v3d_csd_job_timedout(struct drm_sched_job *sched_job)
>>   {
>>   	struct v3d_csd_job *job = to_csd_job(sched_job);
>> @@ -348,10 +350,10 @@ v3d_csd_job_timedout(struct drm_sched_job *sched_job)
>>   	 */
>>   	if (job->timedout_batches != batches) {
>>   		job->timedout_batches = batches;
>> -		return;
>> +		return DRM_TASK_STATUS_ALIVE;
>>   	}
>>   
>> -	v3d_gpu_reset_for_timeout(v3d, sched_job);
>> +	return v3d_gpu_reset_for_timeout(v3d, sched_job);
>>   }
>>   
>>   static const struct drm_sched_backend_ops v3d_bin_sched_ops = {
>> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
>> index 975e8a6..3ba36bc 100644
>> --- a/include/drm/gpu_scheduler.h
>> +++ b/include/drm/gpu_scheduler.h
>> @@ -206,6 +206,11 @@ static inline bool drm_sched_invalidate_job(struct drm_sched_job *s_job,
>>   	return s_job && atomic_inc_return(&s_job->karma) > threshold;
>>   }
>>   
>> +enum drm_task_status {
>> +	DRM_TASK_STATUS_ENODEV,
>> +	DRM_TASK_STATUS_ALIVE
>> +};
>> +
>>   /**
>>    * struct drm_sched_backend_ops
>>    *
>> @@ -230,10 +235,16 @@ struct drm_sched_backend_ops {
>>   	struct dma_fence *(*run_job)(struct drm_sched_job *sched_job);
>>   
>>   	/**
>> -         * @timedout_job: Called when a job has taken too long to execute,
>> -         * to trigger GPU recovery.
>> +	 * @timedout_job: Called when a job has taken too long to execute,
>> +	 * to trigger GPU recovery.
>> +	 *
>> +	 * Return DRM_TASK_STATUS_ALIVE, if the task (job) is healthy
>> +	 * and executing in the hardware, i.e. it needs more time.
>> +	 *
>> +	 * Return DRM_TASK_STATUS_ENODEV, if the task (job) has
>> +	 * been aborted.
>>   	 */
>> -	void (*timedout_job)(struct drm_sched_job *sched_job);
>> +	enum drm_task_status (*timedout_job)(struct drm_sched_job *sched_job);
>>   
>>   	/**
>>            * @free_job: Called once the job's finished fence has been signaled


* Re: [PATCH v4 12/14] drm/scheduler: Job timeout handler returns status
@ 2021-01-19 17:47       ` Luben Tuikov
  0 siblings, 0 replies; 196+ messages in thread
From: Luben Tuikov @ 2021-01-19 17:47 UTC (permalink / raw)
  To: Christian König, Andrey Grodzovsky, amd-gfx, dri-devel,
	ckoenig.leichtzumerken, daniel.vetter, robh, l.stach, yuq825,
	eric
  Cc: Tomeu Vizoso, gregkh, Christian Gmeiner, Steven Price, ppaalanen,
	Alyssa Rosenzweig, Russell King, Alexander.Deucher,
	Harry.Wentland

On 2021-01-19 2:53 a.m., Christian König wrote:
> Am 18.01.21 um 22:01 schrieb Andrey Grodzovsky:
>> From: Luben Tuikov <luben.tuikov@amd.com>
>>
>> This patch does not change current behaviour.
>>
>> The driver's job timeout handler now returns
>> status indicating back to the DRM layer whether
>> the task (job) was successfully aborted or whether
>> more time should be given to the task to complete.
>>
>> The default behaviour as of this patch is preserved,
>> except in the obvious-by-comment case in the Panfrost
>> driver, as documented below.
>>
>> All drivers which make use of the
>> drm_sched_backend_ops' .timedout_job() callback
>> have been updated accordingly and now return the
>> would've-been default value of
>> DRM_TASK_STATUS_ALIVE to restart the task's
>> timeout timer--this is the old behaviour, and
>> is preserved by this patch.
>>
>> In the case of the Panfrost driver, its timedout
>> callback correctly first checks if the job had
>> completed in due time and if so, it now returns
>> DRM_TASK_STATUS_COMPLETE to notify the DRM layer
>> that the task can be moved to the done list, to be
>> freed later. In the other two subsequent checks,
>> the value of DRM_TASK_STATUS_ALIVE is returned, as
>> per the default behaviour.
>>
>> More involved per-driver solutions can be had
>> in subsequent patches.
>>
>> v2: Use enum as the status of a driver's job
>>      timeout callback method.
>>
>> v4: (By Andrey Grodzovsky)
>> Replace DRM_TASK_STATUS_COMPLETE with DRM_TASK_STATUS_ENODEV
>> to enable a hint to the scheduler for when NOT to rearm the
>> timeout timer.
> As Lukas pointed out, returning the job (or task) status doesn't make
> much sense.
>
> What we return here is the status of the scheduler.
>
> I would either rename the enum or completely drop it and return a 
> negative error status.

Yes, that could be had.

Dropping the enum and returning [-1, 0], though, might make the
meaning of the return status vague. Using an enum with an appropriate
name makes the intention clear to the next programmer.

Now, Andrey did rename one of the enumerated values to
DRM_TASK_STATUS_ENODEV; perhaps the same, but with:

enum drm_sched_status {
    DRM_SCHED_STAT_NONE, /* Reserve 0 */
    DRM_SCHED_STAT_NOMINAL,
    DRM_SCHED_STAT_ENODEV,
};

and would renaming the enum to the above also be acceptable?
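
For illustration, a driver's timeout handler would then read roughly
like the sketch below (not a real patch; the example_* names and the
ddev member are made up, the rest is the existing scheduler/fence API):

static enum drm_sched_status
example_job_timedout(struct drm_sched_job *sched_job)
{
	struct example_dev *edev = to_example_dev(sched_job->sched);

	/* The job completed just as the timeout fired: report nominal
	 * so the scheduler simply rearms the timeout timer.
	 */
	if (sched_job->s_fence->parent &&
	    dma_fence_is_signaled(sched_job->s_fence->parent))
		return DRM_SCHED_STAT_NOMINAL;

	/* The device is gone: no point in rearming the timer. */
	if (drm_dev_is_unplugged(&edev->ddev))
		return DRM_SCHED_STAT_ENODEV;

	/* Otherwise recover the GPU and carry on as before. */
	example_gpu_reset(edev);
	return DRM_SCHED_STAT_NOMINAL;
}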

Regards,
Luben

> Apart from that looks fine to me,
> Christian.
>
>
>> Cc: Alexander Deucher <Alexander.Deucher@amd.com>
>> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
>> Cc: Christian König <christian.koenig@amd.com>
>> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
>> Cc: Lucas Stach <l.stach@pengutronix.de>
>> Cc: Russell King <linux+etnaviv@armlinux.org.uk>
>> Cc: Christian Gmeiner <christian.gmeiner@gmail.com>
>> Cc: Qiang Yu <yuq825@gmail.com>
>> Cc: Rob Herring <robh@kernel.org>
>> Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com>
>> Cc: Steven Price <steven.price@arm.com>
>> Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
>> Cc: Eric Anholt <eric@anholt.net>
>> Reported-by: kernel test robot <lkp@intel.com>
>> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_job.c |  6 ++++--
>>   drivers/gpu/drm/etnaviv/etnaviv_sched.c | 10 +++++++++-
>>   drivers/gpu/drm/lima/lima_sched.c       |  4 +++-
>>   drivers/gpu/drm/panfrost/panfrost_job.c |  9 ++++++---
>>   drivers/gpu/drm/scheduler/sched_main.c  |  4 +---
>>   drivers/gpu/drm/v3d/v3d_sched.c         | 32 +++++++++++++++++---------------
>>   include/drm/gpu_scheduler.h             | 17 ++++++++++++++---
>>   7 files changed, 54 insertions(+), 28 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>> index ff48101..a111326 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>> @@ -28,7 +28,7 @@
>>   #include "amdgpu.h"
>>   #include "amdgpu_trace.h"
>>   
>> -static void amdgpu_job_timedout(struct drm_sched_job *s_job)
>> +static enum drm_task_status amdgpu_job_timedout(struct drm_sched_job *s_job)
>>   {
>>   	struct amdgpu_ring *ring = to_amdgpu_ring(s_job->sched);
>>   	struct amdgpu_job *job = to_amdgpu_job(s_job);
>> @@ -41,7 +41,7 @@ static void amdgpu_job_timedout(struct drm_sched_job *s_job)
>>   	    amdgpu_ring_soft_recovery(ring, job->vmid, s_job->s_fence->parent)) {
>>   		DRM_ERROR("ring %s timeout, but soft recovered\n",
>>   			  s_job->sched->name);
>> -		return;
>> +		return DRM_TASK_STATUS_ALIVE;
>>   	}
>>   
>>   	amdgpu_vm_get_task_info(ring->adev, job->pasid, &ti);
>> @@ -53,10 +53,12 @@ static void amdgpu_job_timedout(struct drm_sched_job *s_job)
>>   
>>   	if (amdgpu_device_should_recover_gpu(ring->adev)) {
>>   		amdgpu_device_gpu_recover(ring->adev, job);
>> +		return DRM_TASK_STATUS_ALIVE;
>>   	} else {
>>   		drm_sched_suspend_timeout(&ring->sched);
>>   		if (amdgpu_sriov_vf(adev))
>>   			adev->virt.tdr_debug = true;
>> +		return DRM_TASK_STATUS_ALIVE;
>>   	}
>>   }
>>   
>> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_sched.c b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
>> index cd46c88..c495169 100644
>> --- a/drivers/gpu/drm/etnaviv/etnaviv_sched.c
>> +++ b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
>> @@ -82,7 +82,8 @@ static struct dma_fence *etnaviv_sched_run_job(struct drm_sched_job *sched_job)
>>   	return fence;
>>   }
>>   
>> -static void etnaviv_sched_timedout_job(struct drm_sched_job *sched_job)
>> +static enum drm_task_status etnaviv_sched_timedout_job(struct drm_sched_job
>> +						       *sched_job)
>>   {
>>   	struct etnaviv_gem_submit *submit = to_etnaviv_submit(sched_job);
>>   	struct etnaviv_gpu *gpu = submit->gpu;
>> @@ -120,9 +121,16 @@ static void etnaviv_sched_timedout_job(struct drm_sched_job *sched_job)
>>   
>>   	drm_sched_resubmit_jobs(&gpu->sched);
>>   
>> +	/* Tell the DRM scheduler that this task needs
>> +	 * more time.
>> +	 */
>> +	drm_sched_start(&gpu->sched, true);
>> +	return DRM_TASK_STATUS_ALIVE;
>> +
>>   out_no_timeout:
>>   	/* restart scheduler after GPU is usable again */
>>   	drm_sched_start(&gpu->sched, true);
>> +	return DRM_TASK_STATUS_ALIVE;
>>   }
>>   
>>   static void etnaviv_sched_free_job(struct drm_sched_job *sched_job)
>> diff --git a/drivers/gpu/drm/lima/lima_sched.c b/drivers/gpu/drm/lima/lima_sched.c
>> index 63b4c56..66d9236 100644
>> --- a/drivers/gpu/drm/lima/lima_sched.c
>> +++ b/drivers/gpu/drm/lima/lima_sched.c
>> @@ -415,7 +415,7 @@ static void lima_sched_build_error_task_list(struct lima_sched_task *task)
>>   	mutex_unlock(&dev->error_task_list_lock);
>>   }
>>   
>> -static void lima_sched_timedout_job(struct drm_sched_job *job)
>> +static enum drm_task_status lima_sched_timedout_job(struct drm_sched_job *job)
>>   {
>>   	struct lima_sched_pipe *pipe = to_lima_pipe(job->sched);
>>   	struct lima_sched_task *task = to_lima_task(job);
>> @@ -449,6 +449,8 @@ static void lima_sched_timedout_job(struct drm_sched_job *job)
>>   
>>   	drm_sched_resubmit_jobs(&pipe->base);
>>   	drm_sched_start(&pipe->base, true);
>> +
>> +	return DRM_TASK_STATUS_ALIVE;
>>   }
>>   
>>   static void lima_sched_free_job(struct drm_sched_job *job)
>> diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
>> index 04e6f6f..10d41ac 100644
>> --- a/drivers/gpu/drm/panfrost/panfrost_job.c
>> +++ b/drivers/gpu/drm/panfrost/panfrost_job.c
>> @@ -432,7 +432,8 @@ static void panfrost_scheduler_start(struct panfrost_queue_state *queue)
>>   	mutex_unlock(&queue->lock);
>>   }
>>   
>> -static void panfrost_job_timedout(struct drm_sched_job *sched_job)
>> +static enum drm_task_status panfrost_job_timedout(struct drm_sched_job
>> +						  *sched_job)
>>   {
>>   	struct panfrost_job *job = to_panfrost_job(sched_job);
>>   	struct panfrost_device *pfdev = job->pfdev;
>> @@ -443,7 +444,7 @@ static void panfrost_job_timedout(struct drm_sched_job *sched_job)
>>   	 * spurious. Bail out.
>>   	 */
>>   	if (dma_fence_is_signaled(job->done_fence))
>> -		return;
>> +		return DRM_TASK_STATUS_ALIVE;
>>   
>>   	dev_err(pfdev->dev, "gpu sched timeout, js=%d, config=0x%x, status=0x%x, head=0x%x, tail=0x%x, sched_job=%p",
>>   		js,
>> @@ -455,11 +456,13 @@ static void panfrost_job_timedout(struct drm_sched_job *sched_job)
>>   
>>   	/* Scheduler is already stopped, nothing to do. */
>>   	if (!panfrost_scheduler_stop(&pfdev->js->queue[js], sched_job))
>> -		return;
>> +		return DRM_TASK_STATUS_ALIVE;
>>   
>>   	/* Schedule a reset if there's no reset in progress. */
>>   	if (!atomic_xchg(&pfdev->reset.pending, 1))
>>   		schedule_work(&pfdev->reset.work);
>> +
>> +	return DRM_TASK_STATUS_ALIVE;
>>   }
>>   
>>   static const struct drm_sched_backend_ops panfrost_sched_ops = {
>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
>> index 92637b7..73fccc5 100644
>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>> @@ -527,7 +527,7 @@ void drm_sched_start(struct drm_gpu_scheduler *sched, bool full_recovery)
>>   EXPORT_SYMBOL(drm_sched_start);
>>   
>>   /**
>> - * drm_sched_resubmit_jobs - helper to relunch job from pending ring list
>> + * drm_sched_resubmit_jobs - helper to relaunch jobs from the pending list
>>    *
>>    * @sched: scheduler instance
>>    *
>> @@ -561,8 +561,6 @@ void drm_sched_resubmit_jobs(struct drm_gpu_scheduler *sched)
>>   		} else {
>>   			s_job->s_fence->parent = fence;
>>   		}
>> -
>> -
>>   	}
>>   }
>>   EXPORT_SYMBOL(drm_sched_resubmit_jobs);
>> diff --git a/drivers/gpu/drm/v3d/v3d_sched.c b/drivers/gpu/drm/v3d/v3d_sched.c
>> index 452682e..3740665e 100644
>> --- a/drivers/gpu/drm/v3d/v3d_sched.c
>> +++ b/drivers/gpu/drm/v3d/v3d_sched.c
>> @@ -259,7 +259,7 @@ v3d_cache_clean_job_run(struct drm_sched_job *sched_job)
>>   	return NULL;
>>   }
>>   
>> -static void
>> +static enum drm_task_status
>>   v3d_gpu_reset_for_timeout(struct v3d_dev *v3d, struct drm_sched_job *sched_job)
>>   {
>>   	enum v3d_queue q;
>> @@ -285,6 +285,8 @@ v3d_gpu_reset_for_timeout(struct v3d_dev *v3d, struct drm_sched_job *sched_job)
>>   	}
>>   
>>   	mutex_unlock(&v3d->reset_lock);
>> +
>> +	return DRM_TASK_STATUS_ALIVE;
>>   }
>>   
>>   /* If the current address or return address have changed, then the GPU
>> @@ -292,7 +294,7 @@ v3d_gpu_reset_for_timeout(struct v3d_dev *v3d, struct drm_sched_job *sched_job)
>>    * could fail if the GPU got in an infinite loop in the CL, but that
>>    * is pretty unlikely outside of an i-g-t testcase.
>>    */
>> -static void
>> +static enum drm_task_status
>>   v3d_cl_job_timedout(struct drm_sched_job *sched_job, enum v3d_queue q,
>>   		    u32 *timedout_ctca, u32 *timedout_ctra)
>>   {
>> @@ -304,39 +306,39 @@ v3d_cl_job_timedout(struct drm_sched_job *sched_job, enum v3d_queue q,
>>   	if (*timedout_ctca != ctca || *timedout_ctra != ctra) {
>>   		*timedout_ctca = ctca;
>>   		*timedout_ctra = ctra;
>> -		return;
>> +		return DRM_TASK_STATUS_ALIVE;
>>   	}
>>   
>> -	v3d_gpu_reset_for_timeout(v3d, sched_job);
>> +	return v3d_gpu_reset_for_timeout(v3d, sched_job);
>>   }
>>   
>> -static void
>> +static enum drm_task_status
>>   v3d_bin_job_timedout(struct drm_sched_job *sched_job)
>>   {
>>   	struct v3d_bin_job *job = to_bin_job(sched_job);
>>   
>> -	v3d_cl_job_timedout(sched_job, V3D_BIN,
>> -			    &job->timedout_ctca, &job->timedout_ctra);
>> +	return v3d_cl_job_timedout(sched_job, V3D_BIN,
>> +				   &job->timedout_ctca, &job->timedout_ctra);
>>   }
>>   
>> -static void
>> +static enum drm_task_status
>>   v3d_render_job_timedout(struct drm_sched_job *sched_job)
>>   {
>>   	struct v3d_render_job *job = to_render_job(sched_job);
>>   
>> -	v3d_cl_job_timedout(sched_job, V3D_RENDER,
>> -			    &job->timedout_ctca, &job->timedout_ctra);
>> +	return v3d_cl_job_timedout(sched_job, V3D_RENDER,
>> +				   &job->timedout_ctca, &job->timedout_ctra);
>>   }
>>   
>> -static void
>> +static enum drm_task_status
>>   v3d_generic_job_timedout(struct drm_sched_job *sched_job)
>>   {
>>   	struct v3d_job *job = to_v3d_job(sched_job);
>>   
>> -	v3d_gpu_reset_for_timeout(job->v3d, sched_job);
>> +	return v3d_gpu_reset_for_timeout(job->v3d, sched_job);
>>   }
>>   
>> -static void
>> +static enum drm_task_status
>>   v3d_csd_job_timedout(struct drm_sched_job *sched_job)
>>   {
>>   	struct v3d_csd_job *job = to_csd_job(sched_job);
>> @@ -348,10 +350,10 @@ v3d_csd_job_timedout(struct drm_sched_job *sched_job)
>>   	 */
>>   	if (job->timedout_batches != batches) {
>>   		job->timedout_batches = batches;
>> -		return;
>> +		return DRM_TASK_STATUS_ALIVE;
>>   	}
>>   
>> -	v3d_gpu_reset_for_timeout(v3d, sched_job);
>> +	return v3d_gpu_reset_for_timeout(v3d, sched_job);
>>   }
>>   
>>   static const struct drm_sched_backend_ops v3d_bin_sched_ops = {
>> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
>> index 975e8a6..3ba36bc 100644
>> --- a/include/drm/gpu_scheduler.h
>> +++ b/include/drm/gpu_scheduler.h
>> @@ -206,6 +206,11 @@ static inline bool drm_sched_invalidate_job(struct drm_sched_job *s_job,
>>   	return s_job && atomic_inc_return(&s_job->karma) > threshold;
>>   }
>>   
>> +enum drm_task_status {
>> +	DRM_TASK_STATUS_ENODEV,
>> +	DRM_TASK_STATUS_ALIVE
>> +};
>> +
>>   /**
>>    * struct drm_sched_backend_ops
>>    *
>> @@ -230,10 +235,16 @@ struct drm_sched_backend_ops {
>>   	struct dma_fence *(*run_job)(struct drm_sched_job *sched_job);
>>   
>>   	/**
>> -         * @timedout_job: Called when a job has taken too long to execute,
>> -         * to trigger GPU recovery.
>> +	 * @timedout_job: Called when a job has taken too long to execute,
>> +	 * to trigger GPU recovery.
>> +	 *
>> +	 * Return DRM_TASK_STATUS_ALIVE, if the task (job) is healthy
>> +	 * and executing in the hardware, i.e. it needs more time.
>> +	 *
>> +	 * Return DRM_TASK_STATUS_ENODEV, if the task (job) has
>> +	 * been aborted.
>>   	 */
>> -	void (*timedout_job)(struct drm_sched_job *sched_job);
>> +	enum drm_task_status (*timedout_job)(struct drm_sched_job *sched_job);
>>   
>>   	/**
>>            * @free_job: Called once the job's finished fence has been signaled


* Re: [PATCH v4 11/14] drm/amdgpu: Guard against write accesses after device removal
  2021-01-19 15:35       ` Andrey Grodzovsky
@ 2021-01-19 18:05         ` Daniel Vetter
  -1 siblings, 0 replies; 196+ messages in thread
From: Daniel Vetter @ 2021-01-19 18:05 UTC (permalink / raw)
  To: Andrey Grodzovsky
  Cc: amd-gfx list, Greg KH, dri-devel, Qiang Yu, Alex Deucher,
	Christian König

On Tue, Jan 19, 2021 at 4:35 PM Andrey Grodzovsky
<Andrey.Grodzovsky@amd.com> wrote:
>
> There is really no other way, according to this article:
> https://lwn.net/Articles/767885/
>
> "A perfect solution seems nearly impossible though; we cannot acquire a mutex on
> the user
> to prevent them from yanking a device and we cannot check for a presence change
> after every
> device access for performance reasons. "
>
> But I assumed srcu_read_lock should be pretty seamless performance-wise, no?

The read side is supposed to be dirt cheap; the write side is where we
just stall for all readers to eventually complete on their own. It
should definitely be much cheaper than an mmio read; on the mmio write
side it might actually hurt a bit. OTOH I think those don't stall the
cpu by default when they're timing out, so maybe if the overhead is
too much for those, we could omit them?
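
For reference, the enter/exit pair is just a thin SRCU read-side
critical section. Roughly (a simplified sketch of the drm_drv.c
internals, from memory):

bool drm_dev_enter(struct drm_device *dev, int *idx)
{
	*idx = srcu_read_lock(&drm_unplug_srcu);	/* dirt cheap */

	if (dev->unplugged) {
		srcu_read_unlock(&drm_unplug_srcu, *idx);
		return false;
	}

	return true;
}

/* The write side, drm_dev_unplug(), sets dev->unplugged and then calls
 * synchronize_srcu(&drm_unplug_srcu), i.e. it stalls until all readers
 * have left their critical sections.
 */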

Maybe just do a small microbenchmark for these for testing, with a
register that doesn't change hw state. So with and without
drm_dev_enter/exit, and also one with the hw plugged out so that we
have actual timeouts in the transactions.
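
Something like this, roughly (just a sketch; mmSCRATCH_REG0 is a
stand-in for whatever register is safe to hammer):

	u64 start, end;
	int i;

	start = ktime_get_ns();
	for (i = 0; i < 1000000; i++)
		RREG32(mmSCRATCH_REG0);
	end = ktime_get_ns();
	DRM_INFO("MMIO read: %llu ns on average\n",
		 (end - start) / 1000000);

Run it once as-is, once with the drm_dev_enter/exit guards removed, and
once with the device yanked, then compare the averages.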
-Daniel

> The other solution would be, as I suggested, to keep all the device IO ranges
> reserved and system memory pages unfreed until the device is finalized in
> the driver, but Daniel said this would upset the PCI layer (the MMIO ranges
> reservation part).
>
> Andrey
>
>
>
>
> On 1/19/21 3:55 AM, Christian König wrote:
> > Am 18.01.21 um 22:01 schrieb Andrey Grodzovsky:
> >> This should prevent writing to memory or IO ranges possibly
> >> already allocated for other uses after our device is removed.
> >
> > Wow, that adds quite some overhead to every register access. I'm not sure we
> > can do this.
> >
> > Christian.
> >
> >>
> >> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> >> ---
> >>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 57 ++++++++++++++++++++++++
> >>   drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c    |  9 ++++
> >>   drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c    | 53 +++++++++++++---------
> >>   drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h    |  3 ++
> >>   drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c   | 70 ++++++++++++++++++++++++++++++
> >>   drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h   | 49 ++-------------------
> >>   drivers/gpu/drm/amd/amdgpu/psp_v11_0.c     | 16 ++-----
> >>   drivers/gpu/drm/amd/amdgpu/psp_v12_0.c     |  8 +---
> >>   drivers/gpu/drm/amd/amdgpu/psp_v3_1.c      |  8 +---
> >>   9 files changed, 184 insertions(+), 89 deletions(-)
> >>
> >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> >> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> >> index e99f4f1..0a9d73c 100644
> >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> >> @@ -72,6 +72,8 @@
> >>     #include <linux/iommu.h>
> >>   +#include <drm/drm_drv.h>
> >> +
> >>   MODULE_FIRMWARE("amdgpu/vega10_gpu_info.bin");
> >>   MODULE_FIRMWARE("amdgpu/vega12_gpu_info.bin");
> >>   MODULE_FIRMWARE("amdgpu/raven_gpu_info.bin");
> >> @@ -404,13 +406,21 @@ uint8_t amdgpu_mm_rreg8(struct amdgpu_device *adev,
> >> uint32_t offset)
> >>    */
> >>   void amdgpu_mm_wreg8(struct amdgpu_device *adev, uint32_t offset, uint8_t
> >> value)
> >>   {
> >> +    int idx;
> >> +
> >>       if (adev->in_pci_err_recovery)
> >>           return;
> >>   +
> >> +    if (!drm_dev_enter(&adev->ddev, &idx))
> >> +        return;
> >> +
> >>       if (offset < adev->rmmio_size)
> >>           writeb(value, adev->rmmio + offset);
> >>       else
> >>           BUG();
> >> +
> >> +    drm_dev_exit(idx);
> >>   }
> >>     /**
> >> @@ -427,9 +437,14 @@ void amdgpu_device_wreg(struct amdgpu_device *adev,
> >>               uint32_t reg, uint32_t v,
> >>               uint32_t acc_flags)
> >>   {
> >> +    int idx;
> >> +
> >>       if (adev->in_pci_err_recovery)
> >>           return;
> >>   +    if (!drm_dev_enter(&adev->ddev, &idx))
> >> +        return;
> >> +
> >>       if ((reg * 4) < adev->rmmio_size) {
> >>           if (!(acc_flags & AMDGPU_REGS_NO_KIQ) &&
> >>               amdgpu_sriov_runtime(adev) &&
> >> @@ -444,6 +459,8 @@ void amdgpu_device_wreg(struct amdgpu_device *adev,
> >>       }
> >>         trace_amdgpu_device_wreg(adev->pdev->device, reg, v);
> >> +
> >> +    drm_dev_exit(idx);
> >>   }
> >>     /*
> >> @@ -454,9 +471,14 @@ void amdgpu_device_wreg(struct amdgpu_device *adev,
> >>   void amdgpu_mm_wreg_mmio_rlc(struct amdgpu_device *adev,
> >>                    uint32_t reg, uint32_t v)
> >>   {
> >> +    int idx;
> >> +
> >>       if (adev->in_pci_err_recovery)
> >>           return;
> >>   +    if (!drm_dev_enter(&adev->ddev, &idx))
> >> +        return;
> >> +
> >>       if (amdgpu_sriov_fullaccess(adev) &&
> >>           adev->gfx.rlc.funcs &&
> >>           adev->gfx.rlc.funcs->is_rlcg_access_range) {
> >> @@ -465,6 +487,8 @@ void amdgpu_mm_wreg_mmio_rlc(struct amdgpu_device *adev,
> >>       } else {
> >>           writel(v, ((void __iomem *)adev->rmmio) + (reg * 4));
> >>       }
> >> +
> >> +    drm_dev_exit(idx);
> >>   }
> >>     /**
> >> @@ -499,15 +523,22 @@ u32 amdgpu_io_rreg(struct amdgpu_device *adev, u32 reg)
> >>    */
> >>   void amdgpu_io_wreg(struct amdgpu_device *adev, u32 reg, u32 v)
> >>   {
> >> +    int idx;
> >> +
> >>       if (adev->in_pci_err_recovery)
> >>           return;
> >>   +    if (!drm_dev_enter(&adev->ddev, &idx))
> >> +        return;
> >> +
> >>       if ((reg * 4) < adev->rio_mem_size)
> >>           iowrite32(v, adev->rio_mem + (reg * 4));
> >>       else {
> >>           iowrite32((reg * 4), adev->rio_mem + (mmMM_INDEX * 4));
> >>           iowrite32(v, adev->rio_mem + (mmMM_DATA * 4));
> >>       }
> >> +
> >> +    drm_dev_exit(idx);
> >>   }
> >>     /**
> >> @@ -544,14 +575,21 @@ u32 amdgpu_mm_rdoorbell(struct amdgpu_device *adev, u32
> >> index)
> >>    */
> >>   void amdgpu_mm_wdoorbell(struct amdgpu_device *adev, u32 index, u32 v)
> >>   {
> >> +    int idx;
> >> +
> >>       if (adev->in_pci_err_recovery)
> >>           return;
> >>   +    if (!drm_dev_enter(&adev->ddev, &idx))
> >> +        return;
> >> +
> >>       if (index < adev->doorbell.num_doorbells) {
> >>           writel(v, adev->doorbell.ptr + index);
> >>       } else {
> >>           DRM_ERROR("writing beyond doorbell aperture: 0x%08x!\n", index);
> >>       }
> >> +
> >> +    drm_dev_exit(idx);
> >>   }
> >>     /**
> >> @@ -588,14 +626,21 @@ u64 amdgpu_mm_rdoorbell64(struct amdgpu_device *adev,
> >> u32 index)
> >>    */
> >>   void amdgpu_mm_wdoorbell64(struct amdgpu_device *adev, u32 index, u64 v)
> >>   {
> >> +    int idx;
> >> +
> >>       if (adev->in_pci_err_recovery)
> >>           return;
> >>   +    if (!drm_dev_enter(&adev->ddev, &idx))
> >> +        return;
> >> +
> >>       if (index < adev->doorbell.num_doorbells) {
> >>           atomic64_set((atomic64_t *)(adev->doorbell.ptr + index), v);
> >>       } else {
> >>           DRM_ERROR("writing beyond doorbell aperture: 0x%08x!\n", index);
> >>       }
> >> +
> >> +    drm_dev_exit(idx);
> >>   }
> >>     /**
> >> @@ -682,6 +727,10 @@ void amdgpu_device_indirect_wreg(struct amdgpu_device
> >> *adev,
> >>       unsigned long flags;
> >>       void __iomem *pcie_index_offset;
> >>       void __iomem *pcie_data_offset;
> >> +    int idx;
> >> +
> >> +    if (!drm_dev_enter(&adev->ddev, &idx))
> >> +        return;
> >>         spin_lock_irqsave(&adev->pcie_idx_lock, flags);
> >>       pcie_index_offset = (void __iomem *)adev->rmmio + pcie_index * 4;
> >> @@ -692,6 +741,8 @@ void amdgpu_device_indirect_wreg(struct amdgpu_device *adev,
> >>       writel(reg_data, pcie_data_offset);
> >>       readl(pcie_data_offset);
> >>       spin_unlock_irqrestore(&adev->pcie_idx_lock, flags);
> >> +
> >> +    drm_dev_exit(idx);
> >>   }
> >>     /**
> >> @@ -711,6 +762,10 @@ void amdgpu_device_indirect_wreg64(struct amdgpu_device
> >> *adev,
> >>       unsigned long flags;
> >>       void __iomem *pcie_index_offset;
> >>       void __iomem *pcie_data_offset;
> >> +    int idx;
> >> +
> >> +    if (!drm_dev_enter(&adev->ddev, &idx))
> >> +        return;
> >>         spin_lock_irqsave(&adev->pcie_idx_lock, flags);
> >>       pcie_index_offset = (void __iomem *)adev->rmmio + pcie_index * 4;
> >> @@ -727,6 +782,8 @@ void amdgpu_device_indirect_wreg64(struct amdgpu_device
> >> *adev,
> >>       writel((u32)(reg_data >> 32), pcie_data_offset);
> >>       readl(pcie_data_offset);
> >>       spin_unlock_irqrestore(&adev->pcie_idx_lock, flags);
> >> +
> >> +    drm_dev_exit(idx);
> >>   }
> >>     /**
> >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> >> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> >> index fe1a39f..1beb4e6 100644
> >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> >> @@ -31,6 +31,8 @@
> >>   #include "amdgpu_ras.h"
> >>   #include "amdgpu_xgmi.h"
> >>   +#include <drm/drm_drv.h>
> >> +
> >>   /**
> >>    * amdgpu_gmc_get_pde_for_bo - get the PDE for a BO
> >>    *
> >> @@ -98,6 +100,10 @@ int amdgpu_gmc_set_pte_pde(struct amdgpu_device *adev,
> >> void *cpu_pt_addr,
> >>   {
> >>       void __iomem *ptr = (void *)cpu_pt_addr;
> >>       uint64_t value;
> >> +    int idx;
> >> +
> >> +    if (!drm_dev_enter(&adev->ddev, &idx))
> >> +        return 0;
> >>         /*
> >>        * The following is for PTE only. GART does not have PDEs.
> >> @@ -105,6 +111,9 @@ int amdgpu_gmc_set_pte_pde(struct amdgpu_device *adev,
> >> void *cpu_pt_addr,
> >>       value = addr & 0x0000FFFFFFFFF000ULL;
> >>       value |= flags;
> >>       writeq(value, ptr + (gpu_page_idx * 8));
> >> +
> >> +    drm_dev_exit(idx);
> >> +
> >>       return 0;
> >>   }
> >>   diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
> >> b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
> >> index 523d22d..89e2bfe 100644
> >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
> >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
> >> @@ -37,6 +37,8 @@
> >>     #include "amdgpu_ras.h"
> >>   +#include <drm/drm_drv.h>
> >> +
> >>   static int psp_sysfs_init(struct amdgpu_device *adev);
> >>   static void psp_sysfs_fini(struct amdgpu_device *adev);
> >>   @@ -248,7 +250,7 @@ psp_cmd_submit_buf(struct psp_context *psp,
> >>              struct psp_gfx_cmd_resp *cmd, uint64_t fence_mc_addr)
> >>   {
> >>       int ret;
> >> -    int index;
> >> +    int index, idx;
> >>       int timeout = 2000;
> >>       bool ras_intr = false;
> >>       bool skip_unsupport = false;
> >> @@ -256,6 +258,9 @@ psp_cmd_submit_buf(struct psp_context *psp,
> >>       if (psp->adev->in_pci_err_recovery)
> >>           return 0;
> >>   +    if (!drm_dev_enter(&psp->adev->ddev, &idx))
> >> +        return 0;
> >> +
> >>       mutex_lock(&psp->mutex);
> >>         memset(psp->cmd_buf_mem, 0, PSP_CMD_BUFFER_SIZE);
> >> @@ -266,8 +271,7 @@ psp_cmd_submit_buf(struct psp_context *psp,
> >>       ret = psp_ring_cmd_submit(psp, psp->cmd_buf_mc_addr, fence_mc_addr,
> >> index);
> >>       if (ret) {
> >>           atomic_dec(&psp->fence_value);
> >> -        mutex_unlock(&psp->mutex);
> >> -        return ret;
> >> +        goto exit;
> >>       }
> >>         amdgpu_asic_invalidate_hdp(psp->adev, NULL);
> >> @@ -307,8 +311,8 @@ psp_cmd_submit_buf(struct psp_context *psp,
> >>                psp->cmd_buf_mem->cmd_id,
> >>                psp->cmd_buf_mem->resp.status);
> >>           if (!timeout) {
> >> -            mutex_unlock(&psp->mutex);
> >> -            return -EINVAL;
> >> +            ret = -EINVAL;
> >> +            goto exit;
> >>           }
> >>       }
> >>   @@ -316,8 +320,10 @@ psp_cmd_submit_buf(struct psp_context *psp,
> >>           ucode->tmr_mc_addr_lo = psp->cmd_buf_mem->resp.fw_addr_lo;
> >>           ucode->tmr_mc_addr_hi = psp->cmd_buf_mem->resp.fw_addr_hi;
> >>       }
> >> -    mutex_unlock(&psp->mutex);
> >>   +exit:
> >> +    mutex_unlock(&psp->mutex);
> >> +    drm_dev_exit(idx);
> >>       return ret;
> >>   }
> >>   @@ -354,8 +360,7 @@ static int psp_load_toc(struct psp_context *psp,
> >>       if (!cmd)
> >>           return -ENOMEM;
> >>       /* Copy toc to psp firmware private buffer */
> >> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
> >> -    memcpy(psp->fw_pri_buf, psp->toc_start_addr, psp->toc_bin_size);
> >> +    psp_copy_fw(psp, psp->toc_start_addr, psp->toc_bin_size);
> >>         psp_prep_load_toc_cmd_buf(cmd, psp->fw_pri_mc_addr, psp->toc_bin_size);
> >>   @@ -570,8 +575,7 @@ static int psp_asd_load(struct psp_context *psp)
> >>       if (!cmd)
> >>           return -ENOMEM;
> >>   -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
> >> -    memcpy(psp->fw_pri_buf, psp->asd_start_addr, psp->asd_ucode_size);
> >> +    psp_copy_fw(psp, psp->asd_start_addr, psp->asd_ucode_size);
> >>         psp_prep_asd_load_cmd_buf(cmd, psp->fw_pri_mc_addr,
> >>                     psp->asd_ucode_size);
> >> @@ -726,8 +730,7 @@ static int psp_xgmi_load(struct psp_context *psp)
> >>       if (!cmd)
> >>           return -ENOMEM;
> >>   -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
> >> -    memcpy(psp->fw_pri_buf, psp->ta_xgmi_start_addr, psp->ta_xgmi_ucode_size);
> >> +    psp_copy_fw(psp, psp->ta_xgmi_start_addr, psp->ta_xgmi_ucode_size);
> >>         psp_prep_ta_load_cmd_buf(cmd,
> >>                    psp->fw_pri_mc_addr,
> >> @@ -982,8 +985,7 @@ static int psp_ras_load(struct psp_context *psp)
> >>       if (!cmd)
> >>           return -ENOMEM;
> >>   -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
> >> -    memcpy(psp->fw_pri_buf, psp->ta_ras_start_addr, psp->ta_ras_ucode_size);
> >> +    psp_copy_fw(psp, psp->ta_ras_start_addr, psp->ta_ras_ucode_size);
> >>         psp_prep_ta_load_cmd_buf(cmd,
> >>                    psp->fw_pri_mc_addr,
> >> @@ -1219,8 +1221,7 @@ static int psp_hdcp_load(struct psp_context *psp)
> >>       if (!cmd)
> >>           return -ENOMEM;
> >>   -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
> >> -    memcpy(psp->fw_pri_buf, psp->ta_hdcp_start_addr,
> >> +    psp_copy_fw(psp, psp->ta_hdcp_start_addr,
> >>              psp->ta_hdcp_ucode_size);
> >>         psp_prep_ta_load_cmd_buf(cmd,
> >> @@ -1366,8 +1367,7 @@ static int psp_dtm_load(struct psp_context *psp)
> >>       if (!cmd)
> >>           return -ENOMEM;
> >>   -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
> >> -    memcpy(psp->fw_pri_buf, psp->ta_dtm_start_addr, psp->ta_dtm_ucode_size);
> >> +    psp_copy_fw(psp, psp->ta_dtm_start_addr, psp->ta_dtm_ucode_size);
> >>         psp_prep_ta_load_cmd_buf(cmd,
> >>                    psp->fw_pri_mc_addr,
> >> @@ -1507,8 +1507,7 @@ static int psp_rap_load(struct psp_context *psp)
> >>       if (!cmd)
> >>           return -ENOMEM;
> >>   -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
> >> -    memcpy(psp->fw_pri_buf, psp->ta_rap_start_addr, psp->ta_rap_ucode_size);
> >> +    psp_copy_fw(psp, psp->ta_rap_start_addr, psp->ta_rap_ucode_size);
> >>         psp_prep_ta_load_cmd_buf(cmd,
> >>                    psp->fw_pri_mc_addr,
> >> @@ -2778,6 +2777,20 @@ static ssize_t psp_usbc_pd_fw_sysfs_write(struct
> >> device *dev,
> >>       return count;
> >>   }
> >>   +void psp_copy_fw(struct psp_context *psp, uint8_t *start_addr, uint32_t
> >> bin_size)
> >> +{
> >> +    int idx;
> >> +
> >> +    if (!drm_dev_enter(&psp->adev->ddev, &idx))
> >> +        return;
> >> +
> >> +    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
> >> +    memcpy(psp->fw_pri_buf, start_addr, bin_size);
> >> +
> >> +    drm_dev_exit(idx);
> >> +}
> >> +
> >> +
> >>   static DEVICE_ATTR(usbc_pd_fw, S_IRUGO | S_IWUSR,
> >>              psp_usbc_pd_fw_sysfs_read,
> >>              psp_usbc_pd_fw_sysfs_write);
> >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
> >> b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
> >> index da250bc..ac69314 100644
> >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
> >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
> >> @@ -400,4 +400,7 @@ int psp_init_ta_microcode(struct psp_context *psp,
> >>                 const char *chip_name);
> >>   int psp_get_fw_attestation_records_addr(struct psp_context *psp,
> >>                       uint64_t *output_ptr);
> >> +
> >> +void psp_copy_fw(struct psp_context *psp, uint8_t *start_addr, uint32_t
> >> bin_size);
> >> +
> >>   #endif
> >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
> >> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
> >> index 1a612f5..d656494 100644
> >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
> >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
> >> @@ -35,6 +35,8 @@
> >>   #include "amdgpu.h"
> >>   #include "atom.h"
> >>   +#include <drm/drm_drv.h>
> >> +
> >>   /*
> >>    * Rings
> >>    * Most engines on the GPU are fed via ring buffers.  Ring
> >> @@ -463,3 +465,71 @@ int amdgpu_ring_test_helper(struct amdgpu_ring *ring)
> >>       ring->sched.ready = !r;
> >>       return r;
> >>   }
> >> +
> >> +void amdgpu_ring_clear_ring(struct amdgpu_ring *ring)
> >> +{
> >> +    int idx;
> >> +    int i = 0;
> >> +
> >> +    if (!drm_dev_enter(&ring->adev->ddev, &idx))
> >> +        return;
> >> +
> >> +    while (i <= ring->buf_mask)
> >> +        ring->ring[i++] = ring->funcs->nop;
> >> +
> >> +    drm_dev_exit(idx);
> >> +
> >> +}
> >> +
> >> +void amdgpu_ring_write(struct amdgpu_ring *ring, uint32_t v)
> >> +{
> >> +    int idx;
> >> +
> >> +    if (!drm_dev_enter(&ring->adev->ddev, &idx))
> >> +        return;
> >> +
> >> +    if (ring->count_dw <= 0)
> >> +        DRM_ERROR("amdgpu: writing more dwords to the ring than expected!\n");
> >> +    ring->ring[ring->wptr++ & ring->buf_mask] = v;
> >> +    ring->wptr &= ring->ptr_mask;
> >> +    ring->count_dw--;
> >> +
> >> +    drm_dev_exit(idx);
> >> +}
> >> +
> >> +void amdgpu_ring_write_multiple(struct amdgpu_ring *ring,
> >> +                          void *src, int count_dw)
> >> +{
> >> +    unsigned occupied, chunk1, chunk2;
> >> +    void *dst;
> >> +    int idx;
> >> +
> >> +    if (!drm_dev_enter(&ring->adev->ddev, &idx))
> >> +        return;
> >> +
> >> +    if (unlikely(ring->count_dw < count_dw))
> >> +        DRM_ERROR("amdgpu: writing more dwords to the ring than expected!\n");
> >> +
> >> +    occupied = ring->wptr & ring->buf_mask;
> >> +    dst = (void *)&ring->ring[occupied];
> >> +    chunk1 = ring->buf_mask + 1 - occupied;
> >> +    chunk1 = (chunk1 >= count_dw) ? count_dw: chunk1;
> >> +    chunk2 = count_dw - chunk1;
> >> +    chunk1 <<= 2;
> >> +    chunk2 <<= 2;
> >> +
> >> +    if (chunk1)
> >> +        memcpy(dst, src, chunk1);
> >> +
> >> +    if (chunk2) {
> >> +        src += chunk1;
> >> +        dst = (void *)ring->ring;
> >> +        memcpy(dst, src, chunk2);
> >> +    }
> >> +
> >> +    ring->wptr += count_dw;
> >> +    ring->wptr &= ring->ptr_mask;
> >> +    ring->count_dw -= count_dw;
> >> +
> >> +    drm_dev_exit(idx);
> >> +}
> >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
> >> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
> >> index accb243..f90b81f 100644
> >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
> >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
> >> @@ -300,53 +300,12 @@ static inline void
> >> amdgpu_ring_set_preempt_cond_exec(struct amdgpu_ring *ring,
> >>       *ring->cond_exe_cpu_addr = cond_exec;
> >>   }
> >>   -static inline void amdgpu_ring_clear_ring(struct amdgpu_ring *ring)
> >> -{
> >> -    int i = 0;
> >> -    while (i <= ring->buf_mask)
> >> -        ring->ring[i++] = ring->funcs->nop;
> >> -
> >> -}
> >> -
> >> -static inline void amdgpu_ring_write(struct amdgpu_ring *ring, uint32_t v)
> >> -{
> >> -    if (ring->count_dw <= 0)
> >> -        DRM_ERROR("amdgpu: writing more dwords to the ring than expected!\n");
> >> -    ring->ring[ring->wptr++ & ring->buf_mask] = v;
> >> -    ring->wptr &= ring->ptr_mask;
> >> -    ring->count_dw--;
> >> -}
> >> +void amdgpu_ring_clear_ring(struct amdgpu_ring *ring);
> >>   -static inline void amdgpu_ring_write_multiple(struct amdgpu_ring *ring,
> >> -                          void *src, int count_dw)
> >> -{
> >> -    unsigned occupied, chunk1, chunk2;
> >> -    void *dst;
> >> -
> >> -    if (unlikely(ring->count_dw < count_dw))
> >> -        DRM_ERROR("amdgpu: writing more dwords to the ring than expected!\n");
> >> -
> >> -    occupied = ring->wptr & ring->buf_mask;
> >> -    dst = (void *)&ring->ring[occupied];
> >> -    chunk1 = ring->buf_mask + 1 - occupied;
> >> -    chunk1 = (chunk1 >= count_dw) ? count_dw: chunk1;
> >> -    chunk2 = count_dw - chunk1;
> >> -    chunk1 <<= 2;
> >> -    chunk2 <<= 2;
> >> +void amdgpu_ring_write(struct amdgpu_ring *ring, uint32_t v);
> >>   -    if (chunk1)
> >> -        memcpy(dst, src, chunk1);
> >> -
> >> -    if (chunk2) {
> >> -        src += chunk1;
> >> -        dst = (void *)ring->ring;
> >> -        memcpy(dst, src, chunk2);
> >> -    }
> >> -
> >> -    ring->wptr += count_dw;
> >> -    ring->wptr &= ring->ptr_mask;
> >> -    ring->count_dw -= count_dw;
> >> -}
> >> +void amdgpu_ring_write_multiple(struct amdgpu_ring *ring,
> >> +                          void *src, int count_dw);
> >>     int amdgpu_ring_test_helper(struct amdgpu_ring *ring);
> >>   diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
> >> b/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
> >> index bd4248c..b3ce5be 100644
> >> --- a/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
> >> +++ b/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
> >> @@ -269,10 +269,8 @@ static int psp_v11_0_bootloader_load_kdb(struct
> >> psp_context *psp)
> >>       if (ret)
> >>           return ret;
> >>   -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
> >> -
> >>       /* Copy PSP KDB binary to memory */
> >> -    memcpy(psp->fw_pri_buf, psp->kdb_start_addr, psp->kdb_bin_size);
> >> +    psp_copy_fw(psp, psp->kdb_start_addr, psp->kdb_bin_size);
> >>         /* Provide the PSP KDB to bootloader */
> >>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
> >> @@ -302,10 +300,8 @@ static int psp_v11_0_bootloader_load_spl(struct
> >> psp_context *psp)
> >>       if (ret)
> >>           return ret;
> >>   -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
> >> -
> >>       /* Copy PSP SPL binary to memory */
> >> -    memcpy(psp->fw_pri_buf, psp->spl_start_addr, psp->spl_bin_size);
> >> +    psp_copy_fw(psp, psp->spl_start_addr, psp->spl_bin_size);
> >>         /* Provide the PSP SPL to bootloader */
> >>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
> >> @@ -335,10 +331,8 @@ static int psp_v11_0_bootloader_load_sysdrv(struct
> >> psp_context *psp)
> >>       if (ret)
> >>           return ret;
> >>   -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
> >> -
> >>       /* Copy PSP System Driver binary to memory */
> >> -    memcpy(psp->fw_pri_buf, psp->sys_start_addr, psp->sys_bin_size);
> >> +    psp_copy_fw(psp, psp->sys_start_addr, psp->sys_bin_size);
> >>         /* Provide the sys driver to bootloader */
> >>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
> >> @@ -371,10 +365,8 @@ static int psp_v11_0_bootloader_load_sos(struct
> >> psp_context *psp)
> >>       if (ret)
> >>           return ret;
> >>   -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
> >> -
> >>       /* Copy Secure OS binary to PSP memory */
> >> -    memcpy(psp->fw_pri_buf, psp->sos_start_addr, psp->sos_bin_size);
> >> +    psp_copy_fw(psp, psp->sos_start_addr, psp->sos_bin_size);
> >>         /* Provide the PSP secure OS to bootloader */
> >>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
> >> diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
> >> b/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
> >> index c4828bd..618e5b6 100644
> >> --- a/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
> >> +++ b/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
> >> @@ -138,10 +138,8 @@ static int psp_v12_0_bootloader_load_sysdrv(struct
> >> psp_context *psp)
> >>       if (ret)
> >>           return ret;
> >>   -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
> >> -
> >>       /* Copy PSP System Driver binary to memory */
> >> -    memcpy(psp->fw_pri_buf, psp->sys_start_addr, psp->sys_bin_size);
> >> +    psp_copy_fw(psp, psp->sys_start_addr, psp->sys_bin_size);
> >>         /* Provide the sys driver to bootloader */
> >>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
> >> @@ -179,10 +177,8 @@ static int psp_v12_0_bootloader_load_sos(struct
> >> psp_context *psp)
> >>       if (ret)
> >>           return ret;
> >>   -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
> >> -
> >>       /* Copy Secure OS binary to PSP memory */
> >> -    memcpy(psp->fw_pri_buf, psp->sos_start_addr, psp->sos_bin_size);
> >> +    psp_copy_fw(psp, psp->sos_start_addr, psp->sos_bin_size);
> >>         /* Provide the PSP secure OS to bootloader */
> >>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
> >> diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
> >> b/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
> >> index f2e725f..d0a6cccd 100644
> >> --- a/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
> >> +++ b/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
> >> @@ -102,10 +102,8 @@ static int psp_v3_1_bootloader_load_sysdrv(struct
> >> psp_context *psp)
> >>       if (ret)
> >>           return ret;
> >>   -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
> >> -
> >>       /* Copy PSP System Driver binary to memory */
> >> -    memcpy(psp->fw_pri_buf, psp->sys_start_addr, psp->sys_bin_size);
> >> +    psp_copy_fw(psp, psp->sys_start_addr, psp->sys_bin_size);
> >>         /* Provide the sys driver to bootloader */
> >>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
> >> @@ -143,10 +141,8 @@ static int psp_v3_1_bootloader_load_sos(struct
> >> psp_context *psp)
> >>       if (ret)
> >>           return ret;
> >>   -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
> >> -
> >>       /* Copy Secure OS binary to PSP memory */
> >> -    memcpy(psp->fw_pri_buf, psp->sos_start_addr, psp->sos_bin_size);
> >> +    psp_copy_fw(psp, psp->sos_start_addr, psp->sos_bin_size);
> >>         /* Provide the PSP secure OS to bootloader */
> >>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
> >



-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

* Re: [PATCH v4 11/14] drm/amdgpu: Guard against write accesses after device removal
@ 2021-01-19 18:05         ` Daniel Vetter
  0 siblings, 0 replies; 196+ messages in thread
From: Daniel Vetter @ 2021-01-19 18:05 UTC (permalink / raw)
  To: Andrey Grodzovsky
  Cc: Rob Herring, amd-gfx list, Greg KH, dri-devel, Anholt, Eric,
	Pekka Paalanen, Qiang Yu, Alex Deucher, Wentland, Harry,
	Christian König, Lucas Stach

On Tue, Jan 19, 2021 at 4:35 PM Andrey Grodzovsky
<Andrey.Grodzovsky@amd.com> wrote:
>
> There is really no other way, according to this article:
> https://lwn.net/Articles/767885/
>
> "A perfect solution seems nearly impossible though; we cannot acquire a mutex on
> the user
> to prevent them from yanking a device and we cannot check for a presence change
> after every
> device access for performance reasons. "
>
> But I assumed srcu_read_lock should be pretty seamless performance-wise, no?

The read side is supposed to be dirt cheap; the write side is where we
just stall for all readers to eventually complete on their own. It
should definitely be much cheaper than an mmio read; on the mmio write
side it might actually hurt a bit. OTOH I think those don't stall the
cpu by default when they're timing out, so maybe if the overhead is
too much for those, we could omit them?

Maybe just do a small microbenchmark for these for testing, with a
register that doesn't change hw state. So with and without
drm_dev_enter/exit, and also one with the hw plugged out so that we
have actual timeouts in the transactions.
-Daniel

> The other solution would be, as I suggested, to keep all the device IO ranges
> reserved and system memory pages unfreed until the device is finalized in
> the driver, but Daniel said this would upset the PCI layer (the MMIO ranges
> reservation part).
>
> Andrey
>
>
>
>
> On 1/19/21 3:55 AM, Christian König wrote:
> > Am 18.01.21 um 22:01 schrieb Andrey Grodzovsky:
> >> This should prevent writing to memory or IO ranges possibly
> >> already allocated for other uses after our device is removed.
> >
> > Wow, that adds quite some overhead to every register access. I'm not sure we
> > can do this.
> >
> > Christian.
> >
> >>
> >> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> >> ---
> >>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 57 ++++++++++++++++++++++++
> >>   drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c    |  9 ++++
> >>   drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c    | 53 +++++++++++++---------
> >>   drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h    |  3 ++
> >>   drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c   | 70 ++++++++++++++++++++++++++++++
> >>   drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h   | 49 ++-------------------
> >>   drivers/gpu/drm/amd/amdgpu/psp_v11_0.c     | 16 ++-----
> >>   drivers/gpu/drm/amd/amdgpu/psp_v12_0.c     |  8 +---
> >>   drivers/gpu/drm/amd/amdgpu/psp_v3_1.c      |  8 +---
> >>   9 files changed, 184 insertions(+), 89 deletions(-)
> >>
> >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> >> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> >> index e99f4f1..0a9d73c 100644
> >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> >> @@ -72,6 +72,8 @@
> >>     #include <linux/iommu.h>
> >>   +#include <drm/drm_drv.h>
> >> +
> >>   MODULE_FIRMWARE("amdgpu/vega10_gpu_info.bin");
> >>   MODULE_FIRMWARE("amdgpu/vega12_gpu_info.bin");
> >>   MODULE_FIRMWARE("amdgpu/raven_gpu_info.bin");
> >> @@ -404,13 +406,21 @@ uint8_t amdgpu_mm_rreg8(struct amdgpu_device *adev,
> >> uint32_t offset)
> >>    */
> >>   void amdgpu_mm_wreg8(struct amdgpu_device *adev, uint32_t offset, uint8_t
> >> value)
> >>   {
> >> +    int idx;
> >> +
> >>       if (adev->in_pci_err_recovery)
> >>           return;
> >>   +
> >> +    if (!drm_dev_enter(&adev->ddev, &idx))
> >> +        return;
> >> +
> >>       if (offset < adev->rmmio_size)
> >>           writeb(value, adev->rmmio + offset);
> >>       else
> >>           BUG();
> >> +
> >> +    drm_dev_exit(idx);
> >>   }
> >>     /**
> >> @@ -427,9 +437,14 @@ void amdgpu_device_wreg(struct amdgpu_device *adev,
> >>               uint32_t reg, uint32_t v,
> >>               uint32_t acc_flags)
> >>   {
> >> +    int idx;
> >> +
> >>       if (adev->in_pci_err_recovery)
> >>           return;
> >>   +    if (!drm_dev_enter(&adev->ddev, &idx))
> >> +        return;
> >> +
> >>       if ((reg * 4) < adev->rmmio_size) {
> >>           if (!(acc_flags & AMDGPU_REGS_NO_KIQ) &&
> >>               amdgpu_sriov_runtime(adev) &&
> >> @@ -444,6 +459,8 @@ void amdgpu_device_wreg(struct amdgpu_device *adev,
> >>       }
> >>         trace_amdgpu_device_wreg(adev->pdev->device, reg, v);
> >> +
> >> +    drm_dev_exit(idx);
> >>   }
> >>     /*
> >> @@ -454,9 +471,14 @@ void amdgpu_device_wreg(struct amdgpu_device *adev,
> >>   void amdgpu_mm_wreg_mmio_rlc(struct amdgpu_device *adev,
> >>                    uint32_t reg, uint32_t v)
> >>   {
> >> +    int idx;
> >> +
> >>       if (adev->in_pci_err_recovery)
> >>           return;
> >>   +    if (!drm_dev_enter(&adev->ddev, &idx))
> >> +        return;
> >> +
> >>       if (amdgpu_sriov_fullaccess(adev) &&
> >>           adev->gfx.rlc.funcs &&
> >>           adev->gfx.rlc.funcs->is_rlcg_access_range) {
> >> @@ -465,6 +487,8 @@ void amdgpu_mm_wreg_mmio_rlc(struct amdgpu_device *adev,
> >>       } else {
> >>           writel(v, ((void __iomem *)adev->rmmio) + (reg * 4));
> >>       }
> >> +
> >> +    drm_dev_exit(idx);
> >>   }
> >>     /**
> >> @@ -499,15 +523,22 @@ u32 amdgpu_io_rreg(struct amdgpu_device *adev, u32 reg)
> >>    */
> >>   void amdgpu_io_wreg(struct amdgpu_device *adev, u32 reg, u32 v)
> >>   {
> >> +    int idx;
> >> +
> >>       if (adev->in_pci_err_recovery)
> >>           return;
> >>   +    if (!drm_dev_enter(&adev->ddev, &idx))
> >> +        return;
> >> +
> >>       if ((reg * 4) < adev->rio_mem_size)
> >>           iowrite32(v, adev->rio_mem + (reg * 4));
> >>       else {
> >>           iowrite32((reg * 4), adev->rio_mem + (mmMM_INDEX * 4));
> >>           iowrite32(v, adev->rio_mem + (mmMM_DATA * 4));
> >>       }
> >> +
> >> +    drm_dev_exit(idx);
> >>   }
> >>     /**
> >> @@ -544,14 +575,21 @@ u32 amdgpu_mm_rdoorbell(struct amdgpu_device *adev, u32
> >> index)
> >>    */
> >>   void amdgpu_mm_wdoorbell(struct amdgpu_device *adev, u32 index, u32 v)
> >>   {
> >> +    int idx;
> >> +
> >>       if (adev->in_pci_err_recovery)
> >>           return;
> >>   +    if (!drm_dev_enter(&adev->ddev, &idx))
> >> +        return;
> >> +
> >>       if (index < adev->doorbell.num_doorbells) {
> >>           writel(v, adev->doorbell.ptr + index);
> >>       } else {
> >>           DRM_ERROR("writing beyond doorbell aperture: 0x%08x!\n", index);
> >>       }
> >> +
> >> +    drm_dev_exit(idx);
> >>   }
> >>     /**
> >> @@ -588,14 +626,21 @@ u64 amdgpu_mm_rdoorbell64(struct amdgpu_device *adev,
> >> u32 index)
> >>    */
> >>   void amdgpu_mm_wdoorbell64(struct amdgpu_device *adev, u32 index, u64 v)
> >>   {
> >> +    int idx;
> >> +
> >>       if (adev->in_pci_err_recovery)
> >>           return;
> >>   +    if (!drm_dev_enter(&adev->ddev, &idx))
> >> +        return;
> >> +
> >>       if (index < adev->doorbell.num_doorbells) {
> >>           atomic64_set((atomic64_t *)(adev->doorbell.ptr + index), v);
> >>       } else {
> >>           DRM_ERROR("writing beyond doorbell aperture: 0x%08x!\n", index);
> >>       }
> >> +
> >> +    drm_dev_exit(idx);
> >>   }
> >>     /**
> >> @@ -682,6 +727,10 @@ void amdgpu_device_indirect_wreg(struct amdgpu_device
> >> *adev,
> >>       unsigned long flags;
> >>       void __iomem *pcie_index_offset;
> >>       void __iomem *pcie_data_offset;
> >> +    int idx;
> >> +
> >> +    if (!drm_dev_enter(&adev->ddev, &idx))
> >> +        return;
> >>         spin_lock_irqsave(&adev->pcie_idx_lock, flags);
> >>       pcie_index_offset = (void __iomem *)adev->rmmio + pcie_index * 4;
> >> @@ -692,6 +741,8 @@ void amdgpu_device_indirect_wreg(struct amdgpu_device *adev,
> >>       writel(reg_data, pcie_data_offset);
> >>       readl(pcie_data_offset);
> >>       spin_unlock_irqrestore(&adev->pcie_idx_lock, flags);
> >> +
> >> +    drm_dev_exit(idx);
> >>   }
> >>     /**
> >> @@ -711,6 +762,10 @@ void amdgpu_device_indirect_wreg64(struct amdgpu_device
> >> *adev,
> >>       unsigned long flags;
> >>       void __iomem *pcie_index_offset;
> >>       void __iomem *pcie_data_offset;
> >> +    int idx;
> >> +
> >> +    if (!drm_dev_enter(&adev->ddev, &idx))
> >> +        return;
> >>         spin_lock_irqsave(&adev->pcie_idx_lock, flags);
> >>       pcie_index_offset = (void __iomem *)adev->rmmio + pcie_index * 4;
> >> @@ -727,6 +782,8 @@ void amdgpu_device_indirect_wreg64(struct amdgpu_device
> >> *adev,
> >>       writel((u32)(reg_data >> 32), pcie_data_offset);
> >>       readl(pcie_data_offset);
> >>       spin_unlock_irqrestore(&adev->pcie_idx_lock, flags);
> >> +
> >> +    drm_dev_exit(idx);
> >>   }
> >>     /**
> >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> >> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> >> index fe1a39f..1beb4e6 100644
> >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> >> @@ -31,6 +31,8 @@
> >>   #include "amdgpu_ras.h"
> >>   #include "amdgpu_xgmi.h"
> >>   +#include <drm/drm_drv.h>
> >> +
> >>   /**
> >>    * amdgpu_gmc_get_pde_for_bo - get the PDE for a BO
> >>    *
> >> @@ -98,6 +100,10 @@ int amdgpu_gmc_set_pte_pde(struct amdgpu_device *adev,
> >> void *cpu_pt_addr,
> >>   {
> >>       void __iomem *ptr = (void *)cpu_pt_addr;
> >>       uint64_t value;
> >> +    int idx;
> >> +
> >> +    if (!drm_dev_enter(&adev->ddev, &idx))
> >> +        return 0;
> >>         /*
> >>        * The following is for PTE only. GART does not have PDEs.
> >> @@ -105,6 +111,9 @@ int amdgpu_gmc_set_pte_pde(struct amdgpu_device *adev,
> >> void *cpu_pt_addr,
> >>       value = addr & 0x0000FFFFFFFFF000ULL;
> >>       value |= flags;
> >>       writeq(value, ptr + (gpu_page_idx * 8));
> >> +
> >> +    drm_dev_exit(idx);
> >> +
> >>       return 0;
> >>   }
> >>   diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
> >> b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
> >> index 523d22d..89e2bfe 100644
> >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
> >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
> >> @@ -37,6 +37,8 @@
> >>     #include "amdgpu_ras.h"
> >>   +#include <drm/drm_drv.h>
> >> +
> >>   static int psp_sysfs_init(struct amdgpu_device *adev);
> >>   static void psp_sysfs_fini(struct amdgpu_device *adev);
> >>   @@ -248,7 +250,7 @@ psp_cmd_submit_buf(struct psp_context *psp,
> >>              struct psp_gfx_cmd_resp *cmd, uint64_t fence_mc_addr)
> >>   {
> >>       int ret;
> >> -    int index;
> >> +    int index, idx;
> >>       int timeout = 2000;
> >>       bool ras_intr = false;
> >>       bool skip_unsupport = false;
> >> @@ -256,6 +258,9 @@ psp_cmd_submit_buf(struct psp_context *psp,
> >>       if (psp->adev->in_pci_err_recovery)
> >>           return 0;
> >>   +    if (!drm_dev_enter(&psp->adev->ddev, &idx))
> >> +        return 0;
> >> +
> >>       mutex_lock(&psp->mutex);
> >>         memset(psp->cmd_buf_mem, 0, PSP_CMD_BUFFER_SIZE);
> >> @@ -266,8 +271,7 @@ psp_cmd_submit_buf(struct psp_context *psp,
> >>       ret = psp_ring_cmd_submit(psp, psp->cmd_buf_mc_addr, fence_mc_addr,
> >> index);
> >>       if (ret) {
> >>           atomic_dec(&psp->fence_value);
> >> -        mutex_unlock(&psp->mutex);
> >> -        return ret;
> >> +        goto exit;
> >>       }
> >>         amdgpu_asic_invalidate_hdp(psp->adev, NULL);
> >> @@ -307,8 +311,8 @@ psp_cmd_submit_buf(struct psp_context *psp,
> >>                psp->cmd_buf_mem->cmd_id,
> >>                psp->cmd_buf_mem->resp.status);
> >>           if (!timeout) {
> >> -            mutex_unlock(&psp->mutex);
> >> -            return -EINVAL;
> >> +            ret = -EINVAL;
> >> +            goto exit;
> >>           }
> >>       }
> >>   @@ -316,8 +320,10 @@ psp_cmd_submit_buf(struct psp_context *psp,
> >>           ucode->tmr_mc_addr_lo = psp->cmd_buf_mem->resp.fw_addr_lo;
> >>           ucode->tmr_mc_addr_hi = psp->cmd_buf_mem->resp.fw_addr_hi;
> >>       }
> >> -    mutex_unlock(&psp->mutex);
> >>   +exit:
> >> +    mutex_unlock(&psp->mutex);
> >> +    drm_dev_exit(idx);
> >>       return ret;
> >>   }
> >>   @@ -354,8 +360,7 @@ static int psp_load_toc(struct psp_context *psp,
> >>       if (!cmd)
> >>           return -ENOMEM;
> >>       /* Copy toc to psp firmware private buffer */
> >> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
> >> -    memcpy(psp->fw_pri_buf, psp->toc_start_addr, psp->toc_bin_size);
> >> +    psp_copy_fw(psp, psp->toc_start_addr, psp->toc_bin_size);
> >>         psp_prep_load_toc_cmd_buf(cmd, psp->fw_pri_mc_addr, psp->toc_bin_size);
> >>   @@ -570,8 +575,7 @@ static int psp_asd_load(struct psp_context *psp)
> >>       if (!cmd)
> >>           return -ENOMEM;
> >>   -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
> >> -    memcpy(psp->fw_pri_buf, psp->asd_start_addr, psp->asd_ucode_size);
> >> +    psp_copy_fw(psp, psp->asd_start_addr, psp->asd_ucode_size);
> >>         psp_prep_asd_load_cmd_buf(cmd, psp->fw_pri_mc_addr,
> >>                     psp->asd_ucode_size);
> >> @@ -726,8 +730,7 @@ static int psp_xgmi_load(struct psp_context *psp)
> >>       if (!cmd)
> >>           return -ENOMEM;
> >>   -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
> >> -    memcpy(psp->fw_pri_buf, psp->ta_xgmi_start_addr, psp->ta_xgmi_ucode_size);
> >> +    psp_copy_fw(psp, psp->ta_xgmi_start_addr, psp->ta_xgmi_ucode_size);
> >>         psp_prep_ta_load_cmd_buf(cmd,
> >>                    psp->fw_pri_mc_addr,
> >> @@ -982,8 +985,7 @@ static int psp_ras_load(struct psp_context *psp)
> >>       if (!cmd)
> >>           return -ENOMEM;
> >>   -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
> >> -    memcpy(psp->fw_pri_buf, psp->ta_ras_start_addr, psp->ta_ras_ucode_size);
> >> +    psp_copy_fw(psp, psp->ta_ras_start_addr, psp->ta_ras_ucode_size);
> >>         psp_prep_ta_load_cmd_buf(cmd,
> >>                    psp->fw_pri_mc_addr,
> >> @@ -1219,8 +1221,7 @@ static int psp_hdcp_load(struct psp_context *psp)
> >>       if (!cmd)
> >>           return -ENOMEM;
> >>   -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
> >> -    memcpy(psp->fw_pri_buf, psp->ta_hdcp_start_addr,
> >> +    psp_copy_fw(psp, psp->ta_hdcp_start_addr,
> >>              psp->ta_hdcp_ucode_size);
> >>         psp_prep_ta_load_cmd_buf(cmd,
> >> @@ -1366,8 +1367,7 @@ static int psp_dtm_load(struct psp_context *psp)
> >>       if (!cmd)
> >>           return -ENOMEM;
> >>   -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
> >> -    memcpy(psp->fw_pri_buf, psp->ta_dtm_start_addr, psp->ta_dtm_ucode_size);
> >> +    psp_copy_fw(psp, psp->ta_dtm_start_addr, psp->ta_dtm_ucode_size);
> >>         psp_prep_ta_load_cmd_buf(cmd,
> >>                    psp->fw_pri_mc_addr,
> >> @@ -1507,8 +1507,7 @@ static int psp_rap_load(struct psp_context *psp)
> >>       if (!cmd)
> >>           return -ENOMEM;
> >>   -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
> >> -    memcpy(psp->fw_pri_buf, psp->ta_rap_start_addr, psp->ta_rap_ucode_size);
> >> +    psp_copy_fw(psp, psp->ta_rap_start_addr, psp->ta_rap_ucode_size);
> >>         psp_prep_ta_load_cmd_buf(cmd,
> >>                    psp->fw_pri_mc_addr,
> >> @@ -2778,6 +2777,20 @@ static ssize_t psp_usbc_pd_fw_sysfs_write(struct device *dev,
> >>       return count;
> >>   }
> >>   +void psp_copy_fw(struct psp_context *psp, uint8_t *start_addr, uint32_t bin_size)
> >> +{
> >> +    int idx;
> >> +
> >> +    if (!drm_dev_enter(&psp->adev->ddev, &idx))
> >> +        return;
> >> +
> >> +    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
> >> +    memcpy(psp->fw_pri_buf, start_addr, bin_size);
> >> +
> >> +    drm_dev_exit(idx);
> >> +}
> >> +
> >> +
> >>   static DEVICE_ATTR(usbc_pd_fw, S_IRUGO | S_IWUSR,
> >>              psp_usbc_pd_fw_sysfs_read,
> >>              psp_usbc_pd_fw_sysfs_write);
> >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
> >> index da250bc..ac69314 100644
> >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
> >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
> >> @@ -400,4 +400,7 @@ int psp_init_ta_microcode(struct psp_context *psp,
> >>                 const char *chip_name);
> >>   int psp_get_fw_attestation_records_addr(struct psp_context *psp,
> >>                       uint64_t *output_ptr);
> >> +
> >> +void psp_copy_fw(struct psp_context *psp, uint8_t *start_addr, uint32_t bin_size);
> >> +
> >>   #endif
> >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
> >> index 1a612f5..d656494 100644
> >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
> >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
> >> @@ -35,6 +35,8 @@
> >>   #include "amdgpu.h"
> >>   #include "atom.h"
> >>   +#include <drm/drm_drv.h>
> >> +
> >>   /*
> >>    * Rings
> >>    * Most engines on the GPU are fed via ring buffers.  Ring
> >> @@ -463,3 +465,71 @@ int amdgpu_ring_test_helper(struct amdgpu_ring *ring)
> >>       ring->sched.ready = !r;
> >>       return r;
> >>   }
> >> +
> >> +void amdgpu_ring_clear_ring(struct amdgpu_ring *ring)
> >> +{
> >> +    int idx;
> >> +    int i = 0;
> >> +
> >> +    if (!drm_dev_enter(&ring->adev->ddev, &idx))
> >> +        return;
> >> +
> >> +    while (i <= ring->buf_mask)
> >> +        ring->ring[i++] = ring->funcs->nop;
> >> +
> >> +    drm_dev_exit(idx);
> >> +
> >> +}
> >> +
> >> +void amdgpu_ring_write(struct amdgpu_ring *ring, uint32_t v)
> >> +{
> >> +    int idx;
> >> +
> >> +    if (!drm_dev_enter(&ring->adev->ddev, &idx))
> >> +        return;
> >> +
> >> +    if (ring->count_dw <= 0)
> >> +        DRM_ERROR("amdgpu: writing more dwords to the ring than expected!\n");
> >> +    ring->ring[ring->wptr++ & ring->buf_mask] = v;
> >> +    ring->wptr &= ring->ptr_mask;
> >> +    ring->count_dw--;
> >> +
> >> +    drm_dev_exit(idx);
> >> +}
> >> +
> >> +void amdgpu_ring_write_multiple(struct amdgpu_ring *ring,
> >> +                          void *src, int count_dw)
> >> +{
> >> +    unsigned occupied, chunk1, chunk2;
> >> +    void *dst;
> >> +    int idx;
> >> +
> >> +    if (!drm_dev_enter(&ring->adev->ddev, &idx))
> >> +        return;
> >> +
> >> +    if (unlikely(ring->count_dw < count_dw))
> >> +        DRM_ERROR("amdgpu: writing more dwords to the ring than expected!\n");
> >> +
> >> +    occupied = ring->wptr & ring->buf_mask;
> >> +    dst = (void *)&ring->ring[occupied];
> >> +    chunk1 = ring->buf_mask + 1 - occupied;
> >> +    chunk1 = (chunk1 >= count_dw) ? count_dw: chunk1;
> >> +    chunk2 = count_dw - chunk1;
> >> +    chunk1 <<= 2;
> >> +    chunk2 <<= 2;
> >> +
> >> +    if (chunk1)
> >> +        memcpy(dst, src, chunk1);
> >> +
> >> +    if (chunk2) {
> >> +        src += chunk1;
> >> +        dst = (void *)ring->ring;
> >> +        memcpy(dst, src, chunk2);
> >> +    }
> >> +
> >> +    ring->wptr += count_dw;
> >> +    ring->wptr &= ring->ptr_mask;
> >> +    ring->count_dw -= count_dw;
> >> +
> >> +    drm_dev_exit(idx);
> >> +}
> >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
> >> index accb243..f90b81f 100644
> >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
> >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
> >> @@ -300,53 +300,12 @@ static inline void amdgpu_ring_set_preempt_cond_exec(struct amdgpu_ring *ring,
> >>       *ring->cond_exe_cpu_addr = cond_exec;
> >>   }
> >>   -static inline void amdgpu_ring_clear_ring(struct amdgpu_ring *ring)
> >> -{
> >> -    int i = 0;
> >> -    while (i <= ring->buf_mask)
> >> -        ring->ring[i++] = ring->funcs->nop;
> >> -
> >> -}
> >> -
> >> -static inline void amdgpu_ring_write(struct amdgpu_ring *ring, uint32_t v)
> >> -{
> >> -    if (ring->count_dw <= 0)
> >> -        DRM_ERROR("amdgpu: writing more dwords to the ring than expected!\n");
> >> -    ring->ring[ring->wptr++ & ring->buf_mask] = v;
> >> -    ring->wptr &= ring->ptr_mask;
> >> -    ring->count_dw--;
> >> -}
> >> +void amdgpu_ring_clear_ring(struct amdgpu_ring *ring);
> >>   -static inline void amdgpu_ring_write_multiple(struct amdgpu_ring *ring,
> >> -                          void *src, int count_dw)
> >> -{
> >> -    unsigned occupied, chunk1, chunk2;
> >> -    void *dst;
> >> -
> >> -    if (unlikely(ring->count_dw < count_dw))
> >> -        DRM_ERROR("amdgpu: writing more dwords to the ring than expected!\n");
> >> -
> >> -    occupied = ring->wptr & ring->buf_mask;
> >> -    dst = (void *)&ring->ring[occupied];
> >> -    chunk1 = ring->buf_mask + 1 - occupied;
> >> -    chunk1 = (chunk1 >= count_dw) ? count_dw: chunk1;
> >> -    chunk2 = count_dw - chunk1;
> >> -    chunk1 <<= 2;
> >> -    chunk2 <<= 2;
> >> +void amdgpu_ring_write(struct amdgpu_ring *ring, uint32_t v);
> >>   -    if (chunk1)
> >> -        memcpy(dst, src, chunk1);
> >> -
> >> -    if (chunk2) {
> >> -        src += chunk1;
> >> -        dst = (void *)ring->ring;
> >> -        memcpy(dst, src, chunk2);
> >> -    }
> >> -
> >> -    ring->wptr += count_dw;
> >> -    ring->wptr &= ring->ptr_mask;
> >> -    ring->count_dw -= count_dw;
> >> -}
> >> +void amdgpu_ring_write_multiple(struct amdgpu_ring *ring,
> >> +                          void *src, int count_dw);
> >>     int amdgpu_ring_test_helper(struct amdgpu_ring *ring);
> >>   diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c b/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
> >> index bd4248c..b3ce5be 100644
> >> --- a/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
> >> +++ b/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
> >> @@ -269,10 +269,8 @@ static int psp_v11_0_bootloader_load_kdb(struct psp_context *psp)
> >>       if (ret)
> >>           return ret;
> >>   -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
> >> -
> >>       /* Copy PSP KDB binary to memory */
> >> -    memcpy(psp->fw_pri_buf, psp->kdb_start_addr, psp->kdb_bin_size);
> >> +    psp_copy_fw(psp, psp->kdb_start_addr, psp->kdb_bin_size);
> >>         /* Provide the PSP KDB to bootloader */
> >>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
> >> @@ -302,10 +300,8 @@ static int psp_v11_0_bootloader_load_spl(struct psp_context *psp)
> >>       if (ret)
> >>           return ret;
> >>   -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
> >> -
> >>       /* Copy PSP SPL binary to memory */
> >> -    memcpy(psp->fw_pri_buf, psp->spl_start_addr, psp->spl_bin_size);
> >> +    psp_copy_fw(psp, psp->spl_start_addr, psp->spl_bin_size);
> >>         /* Provide the PSP SPL to bootloader */
> >>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
> >> @@ -335,10 +331,8 @@ static int psp_v11_0_bootloader_load_sysdrv(struct psp_context *psp)
> >>       if (ret)
> >>           return ret;
> >>   -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
> >> -
> >>       /* Copy PSP System Driver binary to memory */
> >> -    memcpy(psp->fw_pri_buf, psp->sys_start_addr, psp->sys_bin_size);
> >> +    psp_copy_fw(psp, psp->sys_start_addr, psp->sys_bin_size);
> >>         /* Provide the sys driver to bootloader */
> >>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
> >> @@ -371,10 +365,8 @@ static int psp_v11_0_bootloader_load_sos(struct psp_context *psp)
> >>       if (ret)
> >>           return ret;
> >>   -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
> >> -
> >>       /* Copy Secure OS binary to PSP memory */
> >> -    memcpy(psp->fw_pri_buf, psp->sos_start_addr, psp->sos_bin_size);
> >> +    psp_copy_fw(psp, psp->sos_start_addr, psp->sos_bin_size);
> >>         /* Provide the PSP secure OS to bootloader */
> >>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
> >> diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c b/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
> >> index c4828bd..618e5b6 100644
> >> --- a/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
> >> +++ b/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
> >> @@ -138,10 +138,8 @@ static int psp_v12_0_bootloader_load_sysdrv(struct psp_context *psp)
> >>       if (ret)
> >>           return ret;
> >>   -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
> >> -
> >>       /* Copy PSP System Driver binary to memory */
> >> -    memcpy(psp->fw_pri_buf, psp->sys_start_addr, psp->sys_bin_size);
> >> +    psp_copy_fw(psp, psp->sys_start_addr, psp->sys_bin_size);
> >>         /* Provide the sys driver to bootloader */
> >>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
> >> @@ -179,10 +177,8 @@ static int psp_v12_0_bootloader_load_sos(struct psp_context *psp)
> >>       if (ret)
> >>           return ret;
> >>   -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
> >> -
> >>       /* Copy Secure OS binary to PSP memory */
> >> -    memcpy(psp->fw_pri_buf, psp->sos_start_addr, psp->sos_bin_size);
> >> +    psp_copy_fw(psp, psp->sos_start_addr, psp->sos_bin_size);
> >>         /* Provide the PSP secure OS to bootloader */
> >>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
> >> diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c b/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
> >> index f2e725f..d0a6cccd 100644
> >> --- a/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
> >> +++ b/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
> >> @@ -102,10 +102,8 @@ static int psp_v3_1_bootloader_load_sysdrv(struct psp_context *psp)
> >>       if (ret)
> >>           return ret;
> >>   -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
> >> -
> >>       /* Copy PSP System Driver binary to memory */
> >> -    memcpy(psp->fw_pri_buf, psp->sys_start_addr, psp->sys_bin_size);
> >> +    psp_copy_fw(psp, psp->sys_start_addr, psp->sys_bin_size);
> >>         /* Provide the sys driver to bootloader */
> >>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
> >> @@ -143,10 +141,8 @@ static int psp_v3_1_bootloader_load_sos(struct psp_context *psp)
> >>       if (ret)
> >>           return ret;
> >>   -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
> >> -
> >>       /* Copy Secure OS binary to PSP memory */
> >> -    memcpy(psp->fw_pri_buf, psp->sos_start_addr, psp->sos_bin_size);
> >> +    psp_copy_fw(psp, psp->sos_start_addr, psp->sos_bin_size);
> >>         /* Provide the PSP secure OS to bootloader */
> >>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
> >
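
The guard idiom the patch applies throughout is worth spelling out once:
bracket every hardware access with drm_dev_enter()/drm_dev_exit(), which
starts failing once drm_dev_unplug() has run. A minimal sketch of the
pattern, where foo_device and foo_hw_write() are hypothetical stand-ins
for the psp and ring code above:

    #include <drm/drm_drv.h>

    /* Sketch only: foo_device and foo_hw_write() are hypothetical
     * stand-ins for a driver device struct and its MMIO write helper. */
    static void foo_write(struct foo_device *fdev, u32 reg, u32 val)
    {
        int idx;

        /* Returns false once drm_dev_unplug() was called; otherwise
         * holds an SRCU read section open until drm_dev_exit(idx). */
        if (!drm_dev_enter(&fdev->ddev, &idx))
            return; /* device is gone, skip the hardware access */

        foo_hw_write(fdev, reg, val);

        drm_dev_exit(idx);
    }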



-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 196+ messages in thread

* Re: [PATCH v4 00/14] RFC Support hot device unplug in amdgpu
  2021-01-19 17:31     ` Andrey Grodzovsky
@ 2021-01-19 18:08       ` Daniel Vetter
  -1 siblings, 0 replies; 196+ messages in thread
From: Daniel Vetter @ 2021-01-19 18:08 UTC (permalink / raw)
  To: Andrey Grodzovsky
  Cc: amd-gfx list, Christian König, dri-devel, Qiang Yu, Greg KH,
	Alex Deucher

On Tue, Jan 19, 2021 at 6:31 PM Andrey Grodzovsky
<Andrey.Grodzovsky@amd.com> wrote:
>
>
> On 1/19/21 9:16 AM, Daniel Vetter wrote:
> > On Mon, Jan 18, 2021 at 04:01:09PM -0500, Andrey Grodzovsky wrote:
> >> Until now extracting a card either by physical extraction (e.g. eGPU with
> >> thunderbolt connection or by emulation through  syfs -> /sys/bus/pci/devices/device_id/remove)
> >> would cause random crashes in user apps. The random crashes in apps were
> >> mostly due to the app having mapped a device backed BO into its address
> >> space was still trying to access the BO while the backing device was gone.
> >> To answer this first problem Christian suggested to fix the handling of mapped
> >> memory in the clients when the device goes away by forcibly unmap all buffers the
> >> user processes has by clearing their respective VMAs mapping the device BOs.
> >> Then when the VMAs try to fill in the page tables again we check in the fault
> >> handlerif the device is removed and if so, return an error. This will generate a
> >> SIGBUS to the application which can then cleanly terminate.This indeed was done
> >> but this in turn created a problem of kernel OOPs were the OOPSes were due to the
> >> fact that while the app was terminating because of the SIGBUSit would trigger use
> >> after free in the driver by calling to accesses device structures that were already
> >> released from the pci remove sequence.This was handled by introducing a 'flush'
> >> sequence during device removal were we wait for drm file reference to drop to 0
> >> meaning all user clients directly using this device terminated.
> >>
> >> v2:
> >> Based on discussions in the mailing list with Daniel and Pekka [1] and based on the document
> >> produced by Pekka from those discussions [2] the whole approach with returning SIGBUS and
> >> waiting for all user clients having CPU mapping of device BOs to die was dropped.
> >> Instead as per the document suggestion the device structures are kept alive until
> >> the last reference to the device is dropped by user client and in the meanwhile all existing and new CPU mappings of the BOs
> >> belonging to the device directly or by dma-buf import are rerouted to per user
> >> process dummy rw page.Also, I skipped the 'Requirements for KMS UAPI' section of [2]
> >> since i am trying to get the minimal set of requirements that still give useful solution
> >> to work and this is the'Requirements for Render and Cross-Device UAPI' section and so my
> >> test case is removing a secondary device, which is render only and is not involved
> >> in KMS.
> >>
> >> v3:
> >> More updates following comments from v2 such as removing loop to find DRM file when rerouting
> >> page faults to dummy page,getting rid of unnecessary sysfs handling refactoring and moving
> >> prevention of GPU recovery post device unplug from amdgpu to scheduler layer.
> >> On top of that added unplug support for the IOMMU enabled system.
> >>
> >> v4:
> >> Drop last sysfs hack and use sysfs default attribute.
> >> Guard against write accesses after device removal to avoid modifying released memory.
> >> Update dummy pages handling to on demand allocation and release through drm managed framework.
> >> Add return value to scheduler job TO handler (by Luben Tuikov) and use this in amdgpu for prevention
> >> of GPU recovery post device unplug
> >> Also rebase on top of drm-misc-mext instead of amd-staging-drm-next
> >>
> >> With these patches I am able to gracefully remove the secondary card using sysfs remove hook while glxgears
> >> is running off of secondary card (DRI_PRIME=1) without kernel oopses or hangs and keep working
> >> with the primary card or soft reset the device without hangs or oopses
> >>
> >> TODOs for followup work:
> >> Convert AMDGPU code to use devm (for hw stuff) and drmm (for sw stuff and allocations) (Daniel)
> >> Support plugging the secondary device back after unplug - currently still experiencing HW error on plugging back.
> >> Add support for 'Requirements for KMS UAPI' section of [2] - unplugging primary, display connected card.
> >>
> >> [1] - Discussions during v3 of the patchset https://www.spinics.net/lists/amd-gfx/msg55576.html
> >> [2] - drm/doc: device hot-unplug for userspace https://www.spinics.net/lists/dri-devel/msg259755.html
> >> [3] - Related gitlab ticket https://gitlab.freedesktop.org/drm/amd/-/issues/1081
> > btw have you tried this out with some of the igts we have? core_hotunplug
> > is the one I'm thinking of. Might be worth extending this for
> > amdgpu-specific stuff (like running some batches on it while hotunplugging).
>
> No, while testing I mostly just ran glxgears, which already covers the
> exported/imported dma-buf case, plus a few manually hacked tests in the
> libdrm amdgpu test suite.
>
>
> >
> > Since there are so many corner cases we need to test here (shared dma-buf,
> > shared dma_fence), I think it would make sense to have a shared testcase
> > across drivers.
>
>
> I'm not too familiar with IGT. Is there an easy way to set up shared dma-buf
> and fence use cases there, or do you mean I need to add them now?

We do have test infrastructure for all of that, but the hotunplug test
doesn't have that yet I think.
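
Roughly, a shared dma-buf subtest would have this shape (a sketch only,
create_bo() and submit_nop() stand for the driver-specific helpers such
a test would have to provide):

    #include <stdint.h>
    #include <xf86drm.h>

    /* Export a BO from the device about to be unplugged, import it on
     * a second device, then yank the first one. create_bo() and
     * submit_nop() are hypothetical driver-specific helpers. */
    static void share_then_unplug(int fd_unplug, int fd_other)
    {
        uint32_t handle = create_bo(fd_unplug);
        uint32_t imported;
        int prime_fd;

        drmPrimeHandleToFD(fd_unplug, handle, DRM_CLOEXEC, &prime_fd);
        drmPrimeFDToHandle(fd_other, prime_fd, &imported);

        submit_nop(fd_unplug); /* queue some work on the victim */

        /* unplug fd_unplug's device via sysfs here; touching
         * 'imported' afterwards must not oops the kernel */
    }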

> > The only specific thing would be some hooks to keep the gpu
> > busy in some fashion while we yank the driver.
>
>
> Do you mean like starting X and some active rendering on top (like glxgears)
> automatically from within IGT?

Nope, igt is meant for bare-metal testing, so you don't have to drag
the entire winsys around (which, in a wayland world, is not really good
for driver testing anyway, since everything is different). We use this
for our pre-merge CI for drm/i915.

> > But just to get it started,
> > you can throw in entirely amdgpu-specific subtests and just share some of
> > the test code.
> > -Daniel
>
>
> In general, I wasn't aware of this test suite, and it looks like it covers
> what I test, among other stuff.
> I will definitely try to run with it, although the rescan part will not work:
> plugging the device back is on my TODO list and not part of the scope for this
> patchset, so I will probably comment the re-scan section out while testing.
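
For reference, the remove/re-scan part is just the PCI sysfs interface,
something like the sketch below (the PCI address is an example, substitute
the real one for the secondary card):

    #include <stdio.h>

    /* Sketch of the sysfs unplug flow; 0000:0b:00.0 is an example BDF. */
    static void unplug_device(void)
    {
        FILE *f = fopen("/sys/bus/pci/devices/0000:0b:00.0/remove", "w");

        if (f) {
            fputs("1", f);
            fclose(f);
        }
        /* plugging back would be: echo 1 > /sys/bus/pci/rescan */
    }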

The amd team has been using libdrm-amd thus far iirc, but for things like
this I think it'd be worth at least considering switching. The display
team has already started to use some of the tests and contribute stuff
(I think the VRR testcase is from amd).
-Daniel

>
> Andrey
>
>
> >
> >> Andrey Grodzovsky (13):
> >>    drm/ttm: Remap all page faults to per process dummy page.
> >>    drm: Unamp the entire device address space on device unplug
> >>    drm/ttm: Expose ttm_tt_unpopulate for driver use
> >>    drm/sched: Cancel and flush all oustatdning jobs before finish.
> >>    drm/amdgpu: Split amdgpu_device_fini into early and late
> >>    drm/amdgpu: Add early fini callback
> >>    drm/amdgpu: Register IOMMU topology notifier per device.
> >>    drm/amdgpu: Fix a bunch of sdma code crash post device unplug
> >>    drm/amdgpu: Remap all page faults to per process dummy page.
> >>    dmr/amdgpu: Move some sysfs attrs creation to default_attr
> >>    drm/amdgpu: Guard against write accesses after device removal
> >>    drm/sched: Make timeout timer rearm conditional.
> >>    drm/amdgpu: Prevent any job recoveries after device is unplugged.
> >>
> >> Luben Tuikov (1):
> >>    drm/scheduler: Job timeout handler returns status
> >>
> >>   drivers/gpu/drm/amd/amdgpu/amdgpu.h               |  11 +-
> >>   drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c      |  17 +--
> >>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c        | 149 ++++++++++++++++++++--
> >>   drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c           |  20 ++-
> >>   drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c         |  15 ++-
> >>   drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c          |   2 +-
> >>   drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h          |   1 +
> >>   drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c           |   9 ++
> >>   drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c       |  25 ++--
> >>   drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c           |  26 ++--
> >>   drivers/gpu/drm/amd/amdgpu/amdgpu_irq.h           |   3 +-
> >>   drivers/gpu/drm/amd/amdgpu/amdgpu_job.c           |  19 ++-
> >>   drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c           |  12 +-
> >>   drivers/gpu/drm/amd/amdgpu/amdgpu_object.c        |  10 ++
> >>   drivers/gpu/drm/amd/amdgpu/amdgpu_object.h        |   2 +
> >>   drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c           |  53 +++++---
> >>   drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h           |   3 +
> >>   drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c           |   1 +
> >>   drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c          |  70 ++++++++++
> >>   drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h          |  52 +-------
> >>   drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c           |  21 ++-
> >>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c            |   8 +-
> >>   drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c      |  14 +-
> >>   drivers/gpu/drm/amd/amdgpu/cik_ih.c               |   2 +-
> >>   drivers/gpu/drm/amd/amdgpu/cz_ih.c                |   2 +-
> >>   drivers/gpu/drm/amd/amdgpu/iceland_ih.c           |   2 +-
> >>   drivers/gpu/drm/amd/amdgpu/navi10_ih.c            |   2 +-
> >>   drivers/gpu/drm/amd/amdgpu/psp_v11_0.c            |  16 +--
> >>   drivers/gpu/drm/amd/amdgpu/psp_v12_0.c            |   8 +-
> >>   drivers/gpu/drm/amd/amdgpu/psp_v3_1.c             |   8 +-
> >>   drivers/gpu/drm/amd/amdgpu/si_ih.c                |   2 +-
> >>   drivers/gpu/drm/amd/amdgpu/tonga_ih.c             |   2 +-
> >>   drivers/gpu/drm/amd/amdgpu/vega10_ih.c            |   2 +-
> >>   drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c |  12 +-
> >>   drivers/gpu/drm/amd/include/amd_shared.h          |   2 +
> >>   drivers/gpu/drm/drm_drv.c                         |   3 +
> >>   drivers/gpu/drm/etnaviv/etnaviv_sched.c           |  10 +-
> >>   drivers/gpu/drm/lima/lima_sched.c                 |   4 +-
> >>   drivers/gpu/drm/panfrost/panfrost_job.c           |   9 +-
> >>   drivers/gpu/drm/scheduler/sched_main.c            |  18 ++-
> >>   drivers/gpu/drm/ttm/ttm_bo_vm.c                   |  82 +++++++++++-
> >>   drivers/gpu/drm/ttm/ttm_tt.c                      |   1 +
> >>   drivers/gpu/drm/v3d/v3d_sched.c                   |  32 ++---
> >>   include/drm/gpu_scheduler.h                       |  17 ++-
> >>   include/drm/ttm/ttm_bo_api.h                      |   2 +
> >>   45 files changed, 583 insertions(+), 198 deletions(-)
> >>
> >> --
> >> 2.7.4
> >>



-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 196+ messages in thread

* Re: [PATCH v4 00/14] RFC Support hot device unplug in amdgpu
  2021-01-19 18:08       ` Daniel Vetter
@ 2021-01-19 18:18         ` Andrey Grodzovsky
  -1 siblings, 0 replies; 196+ messages in thread
From: Andrey Grodzovsky @ 2021-01-19 18:18 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: amd-gfx list, Christian König, dri-devel, Qiang Yu, Greg KH,
	Alex Deucher


On 1/19/21 1:08 PM, Daniel Vetter wrote:
> On Tue, Jan 19, 2021 at 6:31 PM Andrey Grodzovsky
> <Andrey.Grodzovsky@amd.com> wrote:
>>
>> On 1/19/21 9:16 AM, Daniel Vetter wrote:
>>> On Mon, Jan 18, 2021 at 04:01:09PM -0500, Andrey Grodzovsky wrote:
>>>> Until now extracting a card either by physical extraction (e.g. eGPU with
>>>> thunderbolt connection or by emulation through  syfs -> /sys/bus/pci/devices/device_id/remove)
>>>> would cause random crashes in user apps. The random crashes in apps were
>>>> mostly due to the app having mapped a device backed BO into its address
>>>> space was still trying to access the BO while the backing device was gone.
>>>> To answer this first problem Christian suggested to fix the handling of mapped
>>>> memory in the clients when the device goes away by forcibly unmap all buffers the
>>>> user processes has by clearing their respective VMAs mapping the device BOs.
>>>> Then when the VMAs try to fill in the page tables again we check in the fault
>>>> handlerif the device is removed and if so, return an error. This will generate a
>>>> SIGBUS to the application which can then cleanly terminate.This indeed was done
>>>> but this in turn created a problem of kernel OOPs were the OOPSes were due to the
>>>> fact that while the app was terminating because of the SIGBUSit would trigger use
>>>> after free in the driver by calling to accesses device structures that were already
>>>> released from the pci remove sequence.This was handled by introducing a 'flush'
>>>> sequence during device removal were we wait for drm file reference to drop to 0
>>>> meaning all user clients directly using this device terminated.
>>>>
>>>> v2:
>>>> Based on discussions in the mailing list with Daniel and Pekka [1] and based on the document
>>>> produced by Pekka from those discussions [2] the whole approach with returning SIGBUS and
>>>> waiting for all user clients having CPU mapping of device BOs to die was dropped.
>>>> Instead as per the document suggestion the device structures are kept alive until
>>>> the last reference to the device is dropped by user client and in the meanwhile all existing and new CPU mappings of the BOs
>>>> belonging to the device directly or by dma-buf import are rerouted to per user
>>>> process dummy rw page.Also, I skipped the 'Requirements for KMS UAPI' section of [2]
>>>> since i am trying to get the minimal set of requirements that still give useful solution
>>>> to work and this is the'Requirements for Render and Cross-Device UAPI' section and so my
>>>> test case is removing a secondary device, which is render only and is not involved
>>>> in KMS.
>>>>
>>>> v3:
>>>> More updates following comments from v2 such as removing loop to find DRM file when rerouting
>>>> page faults to dummy page,getting rid of unnecessary sysfs handling refactoring and moving
>>>> prevention of GPU recovery post device unplug from amdgpu to scheduler layer.
>>>> On top of that added unplug support for the IOMMU enabled system.
>>>>
>>>> v4:
>>>> Drop last sysfs hack and use sysfs default attribute.
>>>> Guard against write accesses after device removal to avoid modifying released memory.
>>>> Update dummy pages handling to on demand allocation and release through drm managed framework.
>>>> Add return value to scheduler job TO handler (by Luben Tuikov) and use this in amdgpu for prevention
>>>> of GPU recovery post device unplug
>>>> Also rebase on top of drm-misc-mext instead of amd-staging-drm-next
>>>>
>>>> With these patches I am able to gracefully remove the secondary card using sysfs remove hook while glxgears
>>>> is running off of secondary card (DRI_PRIME=1) without kernel oopses or hangs and keep working
>>>> with the primary card or soft reset the device without hangs or oopses
>>>>
>>>> TODOs for followup work:
>>>> Convert AMDGPU code to use devm (for hw stuff) and drmm (for sw stuff and allocations) (Daniel)
>>>> Support plugging the secondary device back after unplug - currently still experiencing HW error on plugging back.
>>>> Add support for 'Requirements for KMS UAPI' section of [2] - unplugging primary, display connected card.
>>>>
>>>> [1] - Discussions during v3 of the patchset https://www.spinics.net/lists/amd-gfx/msg55576.html
>>>> [2] - drm/doc: device hot-unplug for userspace https://www.spinics.net/lists/dri-devel/msg259755.html
>>>> [3] - Related gitlab ticket https://gitlab.freedesktop.org/drm/amd/-/issues/1081
>>> btw have you tried this out with some of the igts we have? core_hotunplug
>>> is the one I'm thinking of. Might be worth extending this for
>>> amdgpu-specific stuff (like running some batches on it while hotunplugging).
>> No, while testing I mostly just ran glxgears, which already covers the
>> exported/imported dma-buf case, plus a few manually hacked tests in the
>> libdrm amdgpu test suite.
>>
>>
>>> Since there are so many corner cases we need to test here (shared dma-buf,
>>> shared dma_fence), I think it would make sense to have a shared testcase
>>> across drivers.
>>
>> I'm not too familiar with IGT. Is there an easy way to set up shared dma-buf
>> and fence use cases there, or do you mean I need to add them now?
> We do have test infrastructure for all of that, but the hotunplug test
> doesn't have that yet I think.
>
>>> The only specific thing would be some hooks to keep the gpu
>>> busy in some fashion while we yank the driver.
>>
>> Do you mean like starting X and some active rendering on top (like glxgears)
>> automatically from within IGT?
> Nope, igt is meant for bare-metal testing, so you don't have to drag
> the entire winsys around (which, in a wayland world, is not really good
> for driver testing anyway, since everything is different). We use this
> for our pre-merge CI for drm/i915.


So I keep it busy with X/glxgears, which is a manual operation. What you
suggest, then, is some client within IGT which opens the device and starts
submitting jobs (much like what the libdrm amdgpu tests already do)? And this
part is the amdgpu-specific code I just need to port from libdrm to here?
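
Something with roughly this shape is what I have in mind, where submit_nop()
below is the hypothetical driver-specific submission helper that would come
from the ported libdrm code:

    #include <signal.h>
    #include <sys/types.h>
    #include <sys/wait.h>
    #include <unistd.h>

    /* Sketch only: fork a child that hammers the device while the parent
     * yanks it via sysfs. submit_nop() is a hypothetical helper. */
    static void busy_while_unplug(int fd)
    {
        pid_t pid = fork();

        if (pid == 0) /* child: keep the GPU busy */
            for (;;)
                submit_nop(fd);

        sleep(1); /* parent: let some work queue up */
        /* unplug via the sysfs remove hook here, then reap the
         * child and check that nothing oopsed */
        kill(pid, SIGKILL);
        waitpid(pid, NULL, 0);
    }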

Andrey


>
>>> But just to get it started,
>>> you can throw in entirely amdgpu-specific subtests and just share some of
>>> the test code.
>>> -Daniel
>>
>> In general, I wasn't aware of this test suite, and it looks like it covers
>> what I test, among other stuff.
>> I will definitely try to run with it, although the rescan part will not work:
>> plugging the device back is on my TODO list and not part of the scope for this
>> patchset, so I will probably comment the re-scan section out while testing.
> The amd team has been using libdrm-amd thus far iirc, but for things like
> this I think it'd be worth at least considering switching. The display
> team has already started to use some of the tests and contribute stuff
> (I think the VRR testcase is from amd).
> -Daniel
>
>> Andrey
>>
>>
>>>> Andrey Grodzovsky (13):
>>>>     drm/ttm: Remap all page faults to per process dummy page.
>>>>     drm: Unamp the entire device address space on device unplug
>>>>     drm/ttm: Expose ttm_tt_unpopulate for driver use
>>>>     drm/sched: Cancel and flush all oustatdning jobs before finish.
>>>>     drm/amdgpu: Split amdgpu_device_fini into early and late
>>>>     drm/amdgpu: Add early fini callback
>>>>     drm/amdgpu: Register IOMMU topology notifier per device.
>>>>     drm/amdgpu: Fix a bunch of sdma code crash post device unplug
>>>>     drm/amdgpu: Remap all page faults to per process dummy page.
>>>>     dmr/amdgpu: Move some sysfs attrs creation to default_attr
>>>>     drm/amdgpu: Guard against write accesses after device removal
>>>>     drm/sched: Make timeout timer rearm conditional.
>>>>     drm/amdgpu: Prevent any job recoveries after device is unplugged.
>>>>
>>>> Luben Tuikov (1):
>>>>     drm/scheduler: Job timeout handler returns status
>>>>
>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu.h               |  11 +-
>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c      |  17 +--
>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_device.c        | 149 ++++++++++++++++++++--
>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c           |  20 ++-
>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c         |  15 ++-
>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c          |   2 +-
>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h          |   1 +
>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c           |   9 ++
>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c       |  25 ++--
>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c           |  26 ++--
>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_irq.h           |   3 +-
>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_job.c           |  19 ++-
>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c           |  12 +-
>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_object.c        |  10 ++
>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_object.h        |   2 +
>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c           |  53 +++++---
>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h           |   3 +
>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c           |   1 +
>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c          |  70 ++++++++++
>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h          |  52 +-------
>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c           |  21 ++-
>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c            |   8 +-
>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c      |  14 +-
>>>>    drivers/gpu/drm/amd/amdgpu/cik_ih.c               |   2 +-
>>>>    drivers/gpu/drm/amd/amdgpu/cz_ih.c                |   2 +-
>>>>    drivers/gpu/drm/amd/amdgpu/iceland_ih.c           |   2 +-
>>>>    drivers/gpu/drm/amd/amdgpu/navi10_ih.c            |   2 +-
>>>>    drivers/gpu/drm/amd/amdgpu/psp_v11_0.c            |  16 +--
>>>>    drivers/gpu/drm/amd/amdgpu/psp_v12_0.c            |   8 +-
>>>>    drivers/gpu/drm/amd/amdgpu/psp_v3_1.c             |   8 +-
>>>>    drivers/gpu/drm/amd/amdgpu/si_ih.c                |   2 +-
>>>>    drivers/gpu/drm/amd/amdgpu/tonga_ih.c             |   2 +-
>>>>    drivers/gpu/drm/amd/amdgpu/vega10_ih.c            |   2 +-
>>>>    drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c |  12 +-
>>>>    drivers/gpu/drm/amd/include/amd_shared.h          |   2 +
>>>>    drivers/gpu/drm/drm_drv.c                         |   3 +
>>>>    drivers/gpu/drm/etnaviv/etnaviv_sched.c           |  10 +-
>>>>    drivers/gpu/drm/lima/lima_sched.c                 |   4 +-
>>>>    drivers/gpu/drm/panfrost/panfrost_job.c           |   9 +-
>>>>    drivers/gpu/drm/scheduler/sched_main.c            |  18 ++-
>>>>    drivers/gpu/drm/ttm/ttm_bo_vm.c                   |  82 +++++++++++-
>>>>    drivers/gpu/drm/ttm/ttm_tt.c                      |   1 +
>>>>    drivers/gpu/drm/v3d/v3d_sched.c                   |  32 ++---
>>>>    include/drm/gpu_scheduler.h                       |  17 ++-
>>>>    include/drm/ttm/ttm_bo_api.h                      |   2 +
>>>>    45 files changed, 583 insertions(+), 198 deletions(-)
>>>>
>>>> --
>>>> 2.7.4
>>>>
>
>
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 196+ messages in thread

* Re: [PATCH v4 00/14] RFC Support hot device unplug in amdgpu
@ 2021-01-19 18:18         ` Andrey Grodzovsky
  0 siblings, 0 replies; 196+ messages in thread
From: Andrey Grodzovsky @ 2021-01-19 18:18 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Rob Herring, amd-gfx list, Christian König, dri-devel,
	Anholt, Eric, Pekka Paalanen, Qiang Yu, Greg KH, Alex Deucher,
	Wentland, Harry, Lucas Stach


On 1/19/21 1:08 PM, Daniel Vetter wrote:
> On Tue, Jan 19, 2021 at 6:31 PM Andrey Grodzovsky
> <Andrey.Grodzovsky@amd.com> wrote:
>>
>> On 1/19/21 9:16 AM, Daniel Vetter wrote:
>>> On Mon, Jan 18, 2021 at 04:01:09PM -0500, Andrey Grodzovsky wrote:
>>>> Until now extracting a card either by physical extraction (e.g. eGPU with
>>>> thunderbolt connection or by emulation through  syfs -> /sys/bus/pci/devices/device_id/remove)
>>>> would cause random crashes in user apps. The random crashes in apps were
>>>> mostly due to the app having mapped a device backed BO into its address
>>>> space was still trying to access the BO while the backing device was gone.
>>>> To answer this first problem Christian suggested to fix the handling of mapped
>>>> memory in the clients when the device goes away by forcibly unmap all buffers the
>>>> user processes has by clearing their respective VMAs mapping the device BOs.
>>>> Then when the VMAs try to fill in the page tables again we check in the fault
>>>> handlerif the device is removed and if so, return an error. This will generate a
>>>> SIGBUS to the application which can then cleanly terminate.This indeed was done
>>>> but this in turn created a problem of kernel OOPs were the OOPSes were due to the
>>>> fact that while the app was terminating because of the SIGBUSit would trigger use
>>>> after free in the driver by calling to accesses device structures that were already
>>>> released from the pci remove sequence.This was handled by introducing a 'flush'
>>>> sequence during device removal were we wait for drm file reference to drop to 0
>>>> meaning all user clients directly using this device terminated.
>>>>
>>>> v2:
>>>> Based on discussions in the mailing list with Daniel and Pekka [1] and based on the document
>>>> produced by Pekka from those discussions [2] the whole approach with returning SIGBUS and
>>>> waiting for all user clients having CPU mapping of device BOs to die was dropped.
>>>> Instead as per the document suggestion the device structures are kept alive until
>>>> the last reference to the device is dropped by user client and in the meanwhile all existing and new CPU mappings of the BOs
>>>> belonging to the device directly or by dma-buf import are rerouted to per user
>>>> process dummy rw page.Also, I skipped the 'Requirements for KMS UAPI' section of [2]
>>>> since i am trying to get the minimal set of requirements that still give useful solution
>>>> to work and this is the'Requirements for Render and Cross-Device UAPI' section and so my
>>>> test case is removing a secondary device, which is render only and is not involved
>>>> in KMS.
>>>>
>>>> v3:
>>>> More updates following comments from v2 such as removing loop to find DRM file when rerouting
>>>> page faults to dummy page,getting rid of unnecessary sysfs handling refactoring and moving
>>>> prevention of GPU recovery post device unplug from amdgpu to scheduler layer.
>>>> On top of that added unplug support for the IOMMU enabled system.
>>>>
>>>> v4:
>>>> Drop last sysfs hack and use sysfs default attribute.
>>>> Guard against write accesses after device removal to avoid modifying released memory.
>>>> Update dummy pages handling to on demand allocation and release through drm managed framework.
>>>> Add return value to scheduler job TO handler (by Luben Tuikov) and use this in amdgpu for prevention
>>>> of GPU recovery post device unplug
>>>> Also rebase on top of drm-misc-mext instead of amd-staging-drm-next
>>>>
>>>> With these patches I am able to gracefully remove the secondary card using sysfs remove hook while glxgears
>>>> is running off of secondary card (DRI_PRIME=1) without kernel oopses or hangs and keep working
>>>> with the primary card or soft reset the device without hangs or oopses
>>>>
>>>> TODOs for followup work:
>>>> Convert AMDGPU code to use devm (for hw stuff) and drmm (for sw stuff and allocations) (Daniel)
>>>> Support plugging the secondary device back after unplug - currently still experiencing HW error on plugging back.
>>>> Add support for 'Requirements for KMS UAPI' section of [2] - unplugging primary, display connected card.
>>>>
>>>> [1] - Discussions during v3 of the patchset https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.spinics.net%2Flists%2Famd-gfx%2Fmsg55576.html&amp;data=04%7C01%7CAndrey.Grodzovsky%40amd.com%7C9055ea164ca14a0cbce108d8bca53d37%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637466765176719365%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&amp;sdata=AqqeqmhF%2BZ1%2BRwMgtpmfoW1gtEnLGxiy3U5OMm%2BBqk8%3D&amp;reserved=0
>>>> [2] - drm/doc: device hot-unplug for userspace https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.spinics.net%2Flists%2Fdri-devel%2Fmsg259755.html&amp;data=04%7C01%7CAndrey.Grodzovsky%40amd.com%7C9055ea164ca14a0cbce108d8bca53d37%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637466765176719365%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&amp;sdata=oHHyRtTMTNQAnkzptG0B8%2FeeniU1z2DSca8L4yCYJcE%3D&amp;reserved=0
>>>> [3] - Related gitlab ticket https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgitlab.freedesktop.org%2Fdrm%2Famd%2F-%2Fissues%2F1081&amp;data=04%7C01%7CAndrey.Grodzovsky%40amd.com%7C9055ea164ca14a0cbce108d8bca53d37%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637466765176719365%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&amp;sdata=inKlV%2F5QIPw%2BhHvLM46X27%2Fcjr%2FXyhxmrC0xYXBhHuE%3D&amp;reserved=0
>>> btw have you tried this out with some of the igts we have? core_hotunplug
>>> is the one I'm thinking of. Might be worth to extend this for amdgpu
>>> specific stuff (like run some batches on it while hotunplugging).
>> No, I mostly used just running glxgears while testing which covers already
>> exported/imported dma-buf case and a few manually hacked tests in libdrm amdgpu
>> test suite
>>
>>
>>> Since there's so many corner cases we need to test here (shared dma-buf,
>>> shared dma_fence) I think it would make sense to have a shared testcase
>>> across drivers.
>>
>> I'm not too familiar with IGT. Is there an easy way to set up the shared
>> dma-buf and fence use cases there, or do you mean I need to add them now?
> We do have test infrastructure for all of that, but the hotunplug test
> doesn't have that yet I think.
>
>>> Only specific thing would be some hooks to keep the gpu
>>> busy in some fashion while we yank the driver.
>>
>> Do you mean something like starting X and some active rendering on top (like
>> glxgears) automatically from within IGT?
> Nope, igt is meant for bare-metal testing, so you don't have to drag
> the entire winsys around (which in a wayland world is not really good
> for driver testing anyway, since everything is different). We use this
> for our pre-merge CI for drm/i915.


So today I keep it busy via X/glxgears, which is a manual operation. What you
suggest then is some client within IGT which opens the device and starts
submitting jobs (much like what the libdrm amdgpu tests already do)? And this
part is the amdgpu-specific code I just need to port from libdrm to IGT?

Andrey
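
A minimal bare-metal sketch of such a client, assuming a secondary card exposed
as renderD129 and an example PCI address; the libdrm calls are the standard
amdgpu_device_initialize()/amdgpu_device_deinitialize() pair, and the
job-submission loop is left as a stub. This is a hypothetical illustration, not
actual IGT code:

/*
 * Hypothetical sketch: open the secondary render node through libdrm,
 * keep it busy, then yank the device via sysfs while the handle is
 * still live. The device path and PCI BDF below are examples only.
 */
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>
#include <amdgpu.h>

int main(void)
{
	uint32_t major, minor;
	amdgpu_device_handle dev;
	FILE *rm;
	int fd = open("/dev/dri/renderD129", O_RDWR);

	if (fd < 0 || amdgpu_device_initialize(fd, &major, &minor, &dev))
		return 1;

	/* Keep the GPU busy here, e.g. submit nop IBs in a loop the way
	 * the libdrm amdgpu tests do. */

	rm = fopen("/sys/bus/pci/devices/0000:03:00.0/remove", "w");
	if (rm) {
		fputs("1", rm);	/* hot-unplug under load */
		fclose(rm);
	}

	/* Everything from here on should fail cleanly, without an oops. */
	amdgpu_device_deinitialize(dev);
	close(fd);
	return 0;
}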


>
>>> But just to get it started
>>> you can throw in entirely amdgpu specific subtests and just share some of
>>> the test code.
>>> -Daniel
>>
>> In general, I wasn't aware of this test suite, and it looks like it covers
>> what I test, among other stuff.
>> I will definitely try to run with it, although the rescan part will not work:
>> plugging the device back is on my TODO list and out of scope for this
>> patchset, so I will probably comment the re-scan section out while testing.
> The amd team has been using libdrm-amd thus far iirc, but for things like
> this I think it'd be worth at least considering switching. The display
> team has already started to use some of the tests and contribute stuff
> (I think the VRR testcase is from amd).
> -Daniel
>
>> Andrey
>>
>>
>>>> Andrey Grodzovsky (13):
>>>>     drm/ttm: Remap all page faults to per process dummy page.
>>>>     drm: Unmap the entire device address space on device unplug
>>>>     drm/ttm: Expose ttm_tt_unpopulate for driver use
>>>>     drm/sched: Cancel and flush all outstanding jobs before finish.
>>>>     drm/amdgpu: Split amdgpu_device_fini into early and late
>>>>     drm/amdgpu: Add early fini callback
>>>>     drm/amdgpu: Register IOMMU topology notifier per device.
>>>>     drm/amdgpu: Fix a bunch of sdma code crash post device unplug
>>>>     drm/amdgpu: Remap all page faults to per process dummy page.
>>>>     drm/amdgpu: Move some sysfs attrs creation to default_attr
>>>>     drm/amdgpu: Guard against write accesses after device removal
>>>>     drm/sched: Make timeout timer rearm conditional.
>>>>     drm/amdgpu: Prevent any job recoveries after device is unplugged.
>>>>
>>>> Luben Tuikov (1):
>>>>     drm/scheduler: Job timeout handler returns status
>>>>
>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu.h               |  11 +-
>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c      |  17 +--
>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_device.c        | 149 ++++++++++++++++++++--
>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c           |  20 ++-
>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c         |  15 ++-
>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c          |   2 +-
>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h          |   1 +
>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c           |   9 ++
>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c       |  25 ++--
>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c           |  26 ++--
>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_irq.h           |   3 +-
>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_job.c           |  19 ++-
>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c           |  12 +-
>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_object.c        |  10 ++
>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_object.h        |   2 +
>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c           |  53 +++++---
>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h           |   3 +
>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c           |   1 +
>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c          |  70 ++++++++++
>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h          |  52 +-------
>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c           |  21 ++-
>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c            |   8 +-
>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c      |  14 +-
>>>>    drivers/gpu/drm/amd/amdgpu/cik_ih.c               |   2 +-
>>>>    drivers/gpu/drm/amd/amdgpu/cz_ih.c                |   2 +-
>>>>    drivers/gpu/drm/amd/amdgpu/iceland_ih.c           |   2 +-
>>>>    drivers/gpu/drm/amd/amdgpu/navi10_ih.c            |   2 +-
>>>>    drivers/gpu/drm/amd/amdgpu/psp_v11_0.c            |  16 +--
>>>>    drivers/gpu/drm/amd/amdgpu/psp_v12_0.c            |   8 +-
>>>>    drivers/gpu/drm/amd/amdgpu/psp_v3_1.c             |   8 +-
>>>>    drivers/gpu/drm/amd/amdgpu/si_ih.c                |   2 +-
>>>>    drivers/gpu/drm/amd/amdgpu/tonga_ih.c             |   2 +-
>>>>    drivers/gpu/drm/amd/amdgpu/vega10_ih.c            |   2 +-
>>>>    drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c |  12 +-
>>>>    drivers/gpu/drm/amd/include/amd_shared.h          |   2 +
>>>>    drivers/gpu/drm/drm_drv.c                         |   3 +
>>>>    drivers/gpu/drm/etnaviv/etnaviv_sched.c           |  10 +-
>>>>    drivers/gpu/drm/lima/lima_sched.c                 |   4 +-
>>>>    drivers/gpu/drm/panfrost/panfrost_job.c           |   9 +-
>>>>    drivers/gpu/drm/scheduler/sched_main.c            |  18 ++-
>>>>    drivers/gpu/drm/ttm/ttm_bo_vm.c                   |  82 +++++++++++-
>>>>    drivers/gpu/drm/ttm/ttm_tt.c                      |   1 +
>>>>    drivers/gpu/drm/v3d/v3d_sched.c                   |  32 ++---
>>>>    include/drm/gpu_scheduler.h                       |  17 ++-
>>>>    include/drm/ttm/ttm_bo_api.h                      |   2 +
>>>>    45 files changed, 583 insertions(+), 198 deletions(-)
>>>>
>>>> --
>>>> 2.7.4
>>>>
>
>
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 196+ messages in thread

* Re: [PATCH v4 11/14] drm/amdgpu: Guard against write accesses after device removal
  2021-01-19 18:05         ` Daniel Vetter
@ 2021-01-19 18:22           ` Andrey Grodzovsky
  -1 siblings, 0 replies; 196+ messages in thread
From: Andrey Grodzovsky @ 2021-01-19 18:22 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: amd-gfx list, Greg KH, dri-devel, Qiang Yu, Alex Deucher,
	Christian König


On 1/19/21 1:05 PM, Daniel Vetter wrote:
> On Tue, Jan 19, 2021 at 4:35 PM Andrey Grodzovsky
> <Andrey.Grodzovsky@amd.com> wrote:
>> There is really no other way according to this article:
>> https://lwn.net/Articles/767885/
>>
>> "A perfect solution seems nearly impossible though; we cannot acquire a mutex
>> on the user to prevent them from yanking a device and we cannot check for a
>> presence change after every device access for performance reasons."
>>
>> But I assumed srcu_read_lock should be pretty seamless performance-wise, no?
> The read side is supposed to be dirt cheap; the write side is where we
> just stall for all readers to eventually complete on their own.
> Definitely should be much cheaper than an mmio read; on the mmio write
> side it might actually hurt a bit. Otoh I think those don't stall the
> cpu by default when they're timing out, so maybe if the overhead is
> too much for those, we could omit them?
>
> Maybe just do a small microbenchmark for these for testing, with a
> register that doesn't change hw state. So with and without
> drm_dev_enter/exit, and also one with the hw plugged out so that we
> have actual timeouts in the transactions.
> -Daniel
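
For context, drm_dev_enter()/drm_dev_exit() are thin wrappers around SRCU;
simplified from the actual code in drm_drv.c, the read side is roughly:

bool drm_dev_enter(struct drm_device *dev, int *idx)
{
	/* Read side: essentially a per-cpu counter bump, no atomics
	 * or heavy barriers on the fast path. */
	*idx = srcu_read_lock(&drm_unplug_srcu);

	if (dev->unplugged) {
		srcu_read_unlock(&drm_unplug_srcu, *idx);
		return false;
	}

	return true;
}

while the write side runs once per device lifetime in drm_dev_unplug(): flip
the flag, then stall until every in-flight reader has drained on its own:

	dev->unplugged = true;
	synchronize_srcu(&drm_unplug_srcu);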


So, say, writing to some harmless scratch register in a loop many times, for
both the plugged and the unplugged case, and measuring the total time delta?

Andrey
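
A minimal sketch of such a loop, assuming it lives inside amdgpu where the
usual headers and the WREG32() MMIO write macro are available; the register
choice and iteration count below are placeholders:

/* Hypothetical microbenchmark: time N writes to a harmless scratch
 * register; run once on the guarded path and once with the
 * drm_dev_enter/exit calls compiled out for comparison. */
static void amdgpu_bench_scratch_writes(struct amdgpu_device *adev)
{
	const int n = 1000000;
	ktime_t start = ktime_get();
	int i;

	for (i = 0; i < n; i++)
		WREG32(mmSCRATCH_REG0, 0xdeadbeef);	/* placeholder register */

	dev_info(adev->dev, "%d scratch writes took %lld ns\n", n,
		 ktime_to_ns(ktime_sub(ktime_get(), start)));
}

Repeating the run after the sysfs remove would then show the cost of the
timing-out transactions Daniel mentions.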


>
>> The other solution would be, as I suggested, to keep all the device IO ranges
>> reserved and the system memory pages unfreed until the device is finalized in
>> the driver, but Daniel said this would upset the PCI layer (the MMIO ranges
>> reservation part).
>>
>> Andrey
>>
>>
>>
>>
>> On 1/19/21 3:55 AM, Christian König wrote:
>>> Am 18.01.21 um 22:01 schrieb Andrey Grodzovsky:
>>>> This should prevent writing to memory or IO ranges possibly
>>>> already allocated for other uses after our device is removed.
>>> Wow, that adds quite some overhead to every register access. I'm not sure we
>>> can do this.
>>>
>>> Christian.
>>>
>>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>>>> ---
>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 57 ++++++++++++++++++++++++
>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c    |  9 ++++
>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c    | 53 +++++++++++++---------
>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h    |  3 ++
>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c   | 70 ++++++++++++++++++++++++++++++
>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h   | 49 ++-------------------
>>>>    drivers/gpu/drm/amd/amdgpu/psp_v11_0.c     | 16 ++-----
>>>>    drivers/gpu/drm/amd/amdgpu/psp_v12_0.c     |  8 +---
>>>>    drivers/gpu/drm/amd/amdgpu/psp_v3_1.c      |  8 +---
>>>>    9 files changed, 184 insertions(+), 89 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>> index e99f4f1..0a9d73c 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>> @@ -72,6 +72,8 @@
>>>>      #include <linux/iommu.h>
>>>>    +#include <drm/drm_drv.h>
>>>> +
>>>>    MODULE_FIRMWARE("amdgpu/vega10_gpu_info.bin");
>>>>    MODULE_FIRMWARE("amdgpu/vega12_gpu_info.bin");
>>>>    MODULE_FIRMWARE("amdgpu/raven_gpu_info.bin");
>>>> @@ -404,13 +406,21 @@ uint8_t amdgpu_mm_rreg8(struct amdgpu_device *adev,
>>>> uint32_t offset)
>>>>     */
>>>>    void amdgpu_mm_wreg8(struct amdgpu_device *adev, uint32_t offset, uint8_t
>>>> value)
>>>>    {
>>>> +    int idx;
>>>> +
>>>>        if (adev->in_pci_err_recovery)
>>>>            return;
>>>>    +
>>>> +    if (!drm_dev_enter(&adev->ddev, &idx))
>>>> +        return;
>>>> +
>>>>        if (offset < adev->rmmio_size)
>>>>            writeb(value, adev->rmmio + offset);
>>>>        else
>>>>            BUG();
>>>> +
>>>> +    drm_dev_exit(idx);
>>>>    }
>>>>      /**
>>>> @@ -427,9 +437,14 @@ void amdgpu_device_wreg(struct amdgpu_device *adev,
>>>>                uint32_t reg, uint32_t v,
>>>>                uint32_t acc_flags)
>>>>    {
>>>> +    int idx;
>>>> +
>>>>        if (adev->in_pci_err_recovery)
>>>>            return;
>>>>    +    if (!drm_dev_enter(&adev->ddev, &idx))
>>>> +        return;
>>>> +
>>>>        if ((reg * 4) < adev->rmmio_size) {
>>>>            if (!(acc_flags & AMDGPU_REGS_NO_KIQ) &&
>>>>                amdgpu_sriov_runtime(adev) &&
>>>> @@ -444,6 +459,8 @@ void amdgpu_device_wreg(struct amdgpu_device *adev,
>>>>        }
>>>>          trace_amdgpu_device_wreg(adev->pdev->device, reg, v);
>>>> +
>>>> +    drm_dev_exit(idx);
>>>>    }
>>>>      /*
>>>> @@ -454,9 +471,14 @@ void amdgpu_device_wreg(struct amdgpu_device *adev,
>>>>    void amdgpu_mm_wreg_mmio_rlc(struct amdgpu_device *adev,
>>>>                     uint32_t reg, uint32_t v)
>>>>    {
>>>> +    int idx;
>>>> +
>>>>        if (adev->in_pci_err_recovery)
>>>>            return;
>>>>    +    if (!drm_dev_enter(&adev->ddev, &idx))
>>>> +        return;
>>>> +
>>>>        if (amdgpu_sriov_fullaccess(adev) &&
>>>>            adev->gfx.rlc.funcs &&
>>>>            adev->gfx.rlc.funcs->is_rlcg_access_range) {
>>>> @@ -465,6 +487,8 @@ void amdgpu_mm_wreg_mmio_rlc(struct amdgpu_device *adev,
>>>>        } else {
>>>>            writel(v, ((void __iomem *)adev->rmmio) + (reg * 4));
>>>>        }
>>>> +
>>>> +    drm_dev_exit(idx);
>>>>    }
>>>>      /**
>>>> @@ -499,15 +523,22 @@ u32 amdgpu_io_rreg(struct amdgpu_device *adev, u32 reg)
>>>>     */
>>>>    void amdgpu_io_wreg(struct amdgpu_device *adev, u32 reg, u32 v)
>>>>    {
>>>> +    int idx;
>>>> +
>>>>        if (adev->in_pci_err_recovery)
>>>>            return;
>>>>    +    if (!drm_dev_enter(&adev->ddev, &idx))
>>>> +        return;
>>>> +
>>>>        if ((reg * 4) < adev->rio_mem_size)
>>>>            iowrite32(v, adev->rio_mem + (reg * 4));
>>>>        else {
>>>>            iowrite32((reg * 4), adev->rio_mem + (mmMM_INDEX * 4));
>>>>            iowrite32(v, adev->rio_mem + (mmMM_DATA * 4));
>>>>        }
>>>> +
>>>> +    drm_dev_exit(idx);
>>>>    }
>>>>      /**
>>>> @@ -544,14 +575,21 @@ u32 amdgpu_mm_rdoorbell(struct amdgpu_device *adev, u32
>>>> index)
>>>>     */
>>>>    void amdgpu_mm_wdoorbell(struct amdgpu_device *adev, u32 index, u32 v)
>>>>    {
>>>> +    int idx;
>>>> +
>>>>        if (adev->in_pci_err_recovery)
>>>>            return;
>>>>    +    if (!drm_dev_enter(&adev->ddev, &idx))
>>>> +        return;
>>>> +
>>>>        if (index < adev->doorbell.num_doorbells) {
>>>>            writel(v, adev->doorbell.ptr + index);
>>>>        } else {
>>>>            DRM_ERROR("writing beyond doorbell aperture: 0x%08x!\n", index);
>>>>        }
>>>> +
>>>> +    drm_dev_exit(idx);
>>>>    }
>>>>      /**
>>>> @@ -588,14 +626,21 @@ u64 amdgpu_mm_rdoorbell64(struct amdgpu_device *adev,
>>>> u32 index)
>>>>     */
>>>>    void amdgpu_mm_wdoorbell64(struct amdgpu_device *adev, u32 index, u64 v)
>>>>    {
>>>> +    int idx;
>>>> +
>>>>        if (adev->in_pci_err_recovery)
>>>>            return;
>>>>    +    if (!drm_dev_enter(&adev->ddev, &idx))
>>>> +        return;
>>>> +
>>>>        if (index < adev->doorbell.num_doorbells) {
>>>>            atomic64_set((atomic64_t *)(adev->doorbell.ptr + index), v);
>>>>        } else {
>>>>            DRM_ERROR("writing beyond doorbell aperture: 0x%08x!\n", index);
>>>>        }
>>>> +
>>>> +    drm_dev_exit(idx);
>>>>    }
>>>>      /**
>>>> @@ -682,6 +727,10 @@ void amdgpu_device_indirect_wreg(struct amdgpu_device
>>>> *adev,
>>>>        unsigned long flags;
>>>>        void __iomem *pcie_index_offset;
>>>>        void __iomem *pcie_data_offset;
>>>> +    int idx;
>>>> +
>>>> +    if (!drm_dev_enter(&adev->ddev, &idx))
>>>> +        return;
>>>>          spin_lock_irqsave(&adev->pcie_idx_lock, flags);
>>>>        pcie_index_offset = (void __iomem *)adev->rmmio + pcie_index * 4;
>>>> @@ -692,6 +741,8 @@ void amdgpu_device_indirect_wreg(struct amdgpu_device *adev,
>>>>        writel(reg_data, pcie_data_offset);
>>>>        readl(pcie_data_offset);
>>>>        spin_unlock_irqrestore(&adev->pcie_idx_lock, flags);
>>>> +
>>>> +    drm_dev_exit(idx);
>>>>    }
>>>>      /**
>>>> @@ -711,6 +762,10 @@ void amdgpu_device_indirect_wreg64(struct amdgpu_device
>>>> *adev,
>>>>        unsigned long flags;
>>>>        void __iomem *pcie_index_offset;
>>>>        void __iomem *pcie_data_offset;
>>>> +    int idx;
>>>> +
>>>> +    if (!drm_dev_enter(&adev->ddev, &idx))
>>>> +        return;
>>>>          spin_lock_irqsave(&adev->pcie_idx_lock, flags);
>>>>        pcie_index_offset = (void __iomem *)adev->rmmio + pcie_index * 4;
>>>> @@ -727,6 +782,8 @@ void amdgpu_device_indirect_wreg64(struct amdgpu_device
>>>> *adev,
>>>>        writel((u32)(reg_data >> 32), pcie_data_offset);
>>>>        readl(pcie_data_offset);
>>>>        spin_unlock_irqrestore(&adev->pcie_idx_lock, flags);
>>>> +
>>>> +    drm_dev_exit(idx);
>>>>    }
>>>>      /**
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
>>>> index fe1a39f..1beb4e6 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
>>>> @@ -31,6 +31,8 @@
>>>>    #include "amdgpu_ras.h"
>>>>    #include "amdgpu_xgmi.h"
>>>>    +#include <drm/drm_drv.h>
>>>> +
>>>>    /**
>>>>     * amdgpu_gmc_get_pde_for_bo - get the PDE for a BO
>>>>     *
>>>> @@ -98,6 +100,10 @@ int amdgpu_gmc_set_pte_pde(struct amdgpu_device *adev,
>>>> void *cpu_pt_addr,
>>>>    {
>>>>        void __iomem *ptr = (void *)cpu_pt_addr;
>>>>        uint64_t value;
>>>> +    int idx;
>>>> +
>>>> +    if (!drm_dev_enter(&adev->ddev, &idx))
>>>> +        return 0;
>>>>          /*
>>>>         * The following is for PTE only. GART does not have PDEs.
>>>> @@ -105,6 +111,9 @@ int amdgpu_gmc_set_pte_pde(struct amdgpu_device *adev,
>>>> void *cpu_pt_addr,
>>>>        value = addr & 0x0000FFFFFFFFF000ULL;
>>>>        value |= flags;
>>>>        writeq(value, ptr + (gpu_page_idx * 8));
>>>> +
>>>> +    drm_dev_exit(idx);
>>>> +
>>>>        return 0;
>>>>    }
>>>>    diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
>>>> index 523d22d..89e2bfe 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
>>>> @@ -37,6 +37,8 @@
>>>>      #include "amdgpu_ras.h"
>>>>    +#include <drm/drm_drv.h>
>>>> +
>>>>    static int psp_sysfs_init(struct amdgpu_device *adev);
>>>>    static void psp_sysfs_fini(struct amdgpu_device *adev);
>>>>    @@ -248,7 +250,7 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>>>               struct psp_gfx_cmd_resp *cmd, uint64_t fence_mc_addr)
>>>>    {
>>>>        int ret;
>>>> -    int index;
>>>> +    int index, idx;
>>>>        int timeout = 2000;
>>>>        bool ras_intr = false;
>>>>        bool skip_unsupport = false;
>>>> @@ -256,6 +258,9 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>>>        if (psp->adev->in_pci_err_recovery)
>>>>            return 0;
>>>>    +    if (!drm_dev_enter(&psp->adev->ddev, &idx))
>>>> +        return 0;
>>>> +
>>>>        mutex_lock(&psp->mutex);
>>>>          memset(psp->cmd_buf_mem, 0, PSP_CMD_BUFFER_SIZE);
>>>> @@ -266,8 +271,7 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>>>        ret = psp_ring_cmd_submit(psp, psp->cmd_buf_mc_addr, fence_mc_addr,
>>>> index);
>>>>        if (ret) {
>>>>            atomic_dec(&psp->fence_value);
>>>> -        mutex_unlock(&psp->mutex);
>>>> -        return ret;
>>>> +        goto exit;
>>>>        }
>>>>          amdgpu_asic_invalidate_hdp(psp->adev, NULL);
>>>> @@ -307,8 +311,8 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>>>                 psp->cmd_buf_mem->cmd_id,
>>>>                 psp->cmd_buf_mem->resp.status);
>>>>            if (!timeout) {
>>>> -            mutex_unlock(&psp->mutex);
>>>> -            return -EINVAL;
>>>> +            ret = -EINVAL;
>>>> +            goto exit;
>>>>            }
>>>>        }
>>>>    @@ -316,8 +320,10 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>>>            ucode->tmr_mc_addr_lo = psp->cmd_buf_mem->resp.fw_addr_lo;
>>>>            ucode->tmr_mc_addr_hi = psp->cmd_buf_mem->resp.fw_addr_hi;
>>>>        }
>>>> -    mutex_unlock(&psp->mutex);
>>>>    +exit:
>>>> +    mutex_unlock(&psp->mutex);
>>>> +    drm_dev_exit(idx);
>>>>        return ret;
>>>>    }
>>>>    @@ -354,8 +360,7 @@ static int psp_load_toc(struct psp_context *psp,
>>>>        if (!cmd)
>>>>            return -ENOMEM;
>>>>        /* Copy toc to psp firmware private buffer */
>>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>> -    memcpy(psp->fw_pri_buf, psp->toc_start_addr, psp->toc_bin_size);
>>>> +    psp_copy_fw(psp, psp->toc_start_addr, psp->toc_bin_size);
>>>>          psp_prep_load_toc_cmd_buf(cmd, psp->fw_pri_mc_addr, psp->toc_bin_size);
>>>>    @@ -570,8 +575,7 @@ static int psp_asd_load(struct psp_context *psp)
>>>>        if (!cmd)
>>>>            return -ENOMEM;
>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>> -    memcpy(psp->fw_pri_buf, psp->asd_start_addr, psp->asd_ucode_size);
>>>> +    psp_copy_fw(psp, psp->asd_start_addr, psp->asd_ucode_size);
>>>>          psp_prep_asd_load_cmd_buf(cmd, psp->fw_pri_mc_addr,
>>>>                      psp->asd_ucode_size);
>>>> @@ -726,8 +730,7 @@ static int psp_xgmi_load(struct psp_context *psp)
>>>>        if (!cmd)
>>>>            return -ENOMEM;
>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>> -    memcpy(psp->fw_pri_buf, psp->ta_xgmi_start_addr, psp->ta_xgmi_ucode_size);
>>>> +    psp_copy_fw(psp, psp->ta_xgmi_start_addr, psp->ta_xgmi_ucode_size);
>>>>          psp_prep_ta_load_cmd_buf(cmd,
>>>>                     psp->fw_pri_mc_addr,
>>>> @@ -982,8 +985,7 @@ static int psp_ras_load(struct psp_context *psp)
>>>>        if (!cmd)
>>>>            return -ENOMEM;
>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>> -    memcpy(psp->fw_pri_buf, psp->ta_ras_start_addr, psp->ta_ras_ucode_size);
>>>> +    psp_copy_fw(psp, psp->ta_ras_start_addr, psp->ta_ras_ucode_size);
>>>>          psp_prep_ta_load_cmd_buf(cmd,
>>>>                     psp->fw_pri_mc_addr,
>>>> @@ -1219,8 +1221,7 @@ static int psp_hdcp_load(struct psp_context *psp)
>>>>        if (!cmd)
>>>>            return -ENOMEM;
>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>> -    memcpy(psp->fw_pri_buf, psp->ta_hdcp_start_addr,
>>>> +    psp_copy_fw(psp, psp->ta_hdcp_start_addr,
>>>>               psp->ta_hdcp_ucode_size);
>>>>          psp_prep_ta_load_cmd_buf(cmd,
>>>> @@ -1366,8 +1367,7 @@ static int psp_dtm_load(struct psp_context *psp)
>>>>        if (!cmd)
>>>>            return -ENOMEM;
>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>> -    memcpy(psp->fw_pri_buf, psp->ta_dtm_start_addr, psp->ta_dtm_ucode_size);
>>>> +    psp_copy_fw(psp, psp->ta_dtm_start_addr, psp->ta_dtm_ucode_size);
>>>>          psp_prep_ta_load_cmd_buf(cmd,
>>>>                     psp->fw_pri_mc_addr,
>>>> @@ -1507,8 +1507,7 @@ static int psp_rap_load(struct psp_context *psp)
>>>>        if (!cmd)
>>>>            return -ENOMEM;
>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>> -    memcpy(psp->fw_pri_buf, psp->ta_rap_start_addr, psp->ta_rap_ucode_size);
>>>> +    psp_copy_fw(psp, psp->ta_rap_start_addr, psp->ta_rap_ucode_size);
>>>>          psp_prep_ta_load_cmd_buf(cmd,
>>>>                     psp->fw_pri_mc_addr,
>>>> @@ -2778,6 +2777,20 @@ static ssize_t psp_usbc_pd_fw_sysfs_write(struct
>>>> device *dev,
>>>>        return count;
>>>>    }
>>>>    +void psp_copy_fw(struct psp_context *psp, uint8_t *start_addr, uint32_t
>>>> bin_size)
>>>> +{
>>>> +    int idx;
>>>> +
>>>> +    if (!drm_dev_enter(&psp->adev->ddev, &idx))
>>>> +        return;
>>>> +
>>>> +    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>> +    memcpy(psp->fw_pri_buf, start_addr, bin_size);
>>>> +
>>>> +    drm_dev_exit(idx);
>>>> +}
>>>> +
>>>> +
>>>>    static DEVICE_ATTR(usbc_pd_fw, S_IRUGO | S_IWUSR,
>>>>               psp_usbc_pd_fw_sysfs_read,
>>>>               psp_usbc_pd_fw_sysfs_write);
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
>>>> index da250bc..ac69314 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
>>>> @@ -400,4 +400,7 @@ int psp_init_ta_microcode(struct psp_context *psp,
>>>>                  const char *chip_name);
>>>>    int psp_get_fw_attestation_records_addr(struct psp_context *psp,
>>>>                        uint64_t *output_ptr);
>>>> +
>>>> +void psp_copy_fw(struct psp_context *psp, uint8_t *start_addr, uint32_t
>>>> bin_size);
>>>> +
>>>>    #endif
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>>>> index 1a612f5..d656494 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>>>> @@ -35,6 +35,8 @@
>>>>    #include "amdgpu.h"
>>>>    #include "atom.h"
>>>>    +#include <drm/drm_drv.h>
>>>> +
>>>>    /*
>>>>     * Rings
>>>>     * Most engines on the GPU are fed via ring buffers.  Ring
>>>> @@ -463,3 +465,71 @@ int amdgpu_ring_test_helper(struct amdgpu_ring *ring)
>>>>        ring->sched.ready = !r;
>>>>        return r;
>>>>    }
>>>> +
>>>> +void amdgpu_ring_clear_ring(struct amdgpu_ring *ring)
>>>> +{
>>>> +    int idx;
>>>> +    int i = 0;
>>>> +
>>>> +    if (!drm_dev_enter(&ring->adev->ddev, &idx))
>>>> +        return;
>>>> +
>>>> +    while (i <= ring->buf_mask)
>>>> +        ring->ring[i++] = ring->funcs->nop;
>>>> +
>>>> +    drm_dev_exit(idx);
>>>> +
>>>> +}
>>>> +
>>>> +void amdgpu_ring_write(struct amdgpu_ring *ring, uint32_t v)
>>>> +{
>>>> +    int idx;
>>>> +
>>>> +    if (!drm_dev_enter(&ring->adev->ddev, &idx))
>>>> +        return;
>>>> +
>>>> +    if (ring->count_dw <= 0)
>>>> +        DRM_ERROR("amdgpu: writing more dwords to the ring than expected!\n");
>>>> +    ring->ring[ring->wptr++ & ring->buf_mask] = v;
>>>> +    ring->wptr &= ring->ptr_mask;
>>>> +    ring->count_dw--;
>>>> +
>>>> +    drm_dev_exit(idx);
>>>> +}
>>>> +
>>>> +void amdgpu_ring_write_multiple(struct amdgpu_ring *ring,
>>>> +                          void *src, int count_dw)
>>>> +{
>>>> +    unsigned occupied, chunk1, chunk2;
>>>> +    void *dst;
>>>> +    int idx;
>>>> +
>>>> +    if (!drm_dev_enter(&ring->adev->ddev, &idx))
>>>> +        return;
>>>> +
>>>> +    if (unlikely(ring->count_dw < count_dw))
>>>> +        DRM_ERROR("amdgpu: writing more dwords to the ring than expected!\n");
>>>> +
>>>> +    occupied = ring->wptr & ring->buf_mask;
>>>> +    dst = (void *)&ring->ring[occupied];
>>>> +    chunk1 = ring->buf_mask + 1 - occupied;
>>>> +    chunk1 = (chunk1 >= count_dw) ? count_dw: chunk1;
>>>> +    chunk2 = count_dw - chunk1;
>>>> +    chunk1 <<= 2;
>>>> +    chunk2 <<= 2;
>>>> +
>>>> +    if (chunk1)
>>>> +        memcpy(dst, src, chunk1);
>>>> +
>>>> +    if (chunk2) {
>>>> +        src += chunk1;
>>>> +        dst = (void *)ring->ring;
>>>> +        memcpy(dst, src, chunk2);
>>>> +    }
>>>> +
>>>> +    ring->wptr += count_dw;
>>>> +    ring->wptr &= ring->ptr_mask;
>>>> +    ring->count_dw -= count_dw;
>>>> +
>>>> +    drm_dev_exit(idx);
>>>> +}
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>>>> index accb243..f90b81f 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>>>> @@ -300,53 +300,12 @@ static inline void
>>>> amdgpu_ring_set_preempt_cond_exec(struct amdgpu_ring *ring,
>>>>        *ring->cond_exe_cpu_addr = cond_exec;
>>>>    }
>>>>    -static inline void amdgpu_ring_clear_ring(struct amdgpu_ring *ring)
>>>> -{
>>>> -    int i = 0;
>>>> -    while (i <= ring->buf_mask)
>>>> -        ring->ring[i++] = ring->funcs->nop;
>>>> -
>>>> -}
>>>> -
>>>> -static inline void amdgpu_ring_write(struct amdgpu_ring *ring, uint32_t v)
>>>> -{
>>>> -    if (ring->count_dw <= 0)
>>>> -        DRM_ERROR("amdgpu: writing more dwords to the ring than expected!\n");
>>>> -    ring->ring[ring->wptr++ & ring->buf_mask] = v;
>>>> -    ring->wptr &= ring->ptr_mask;
>>>> -    ring->count_dw--;
>>>> -}
>>>> +void amdgpu_ring_clear_ring(struct amdgpu_ring *ring);
>>>>    -static inline void amdgpu_ring_write_multiple(struct amdgpu_ring *ring,
>>>> -                          void *src, int count_dw)
>>>> -{
>>>> -    unsigned occupied, chunk1, chunk2;
>>>> -    void *dst;
>>>> -
>>>> -    if (unlikely(ring->count_dw < count_dw))
>>>> -        DRM_ERROR("amdgpu: writing more dwords to the ring than expected!\n");
>>>> -
>>>> -    occupied = ring->wptr & ring->buf_mask;
>>>> -    dst = (void *)&ring->ring[occupied];
>>>> -    chunk1 = ring->buf_mask + 1 - occupied;
>>>> -    chunk1 = (chunk1 >= count_dw) ? count_dw: chunk1;
>>>> -    chunk2 = count_dw - chunk1;
>>>> -    chunk1 <<= 2;
>>>> -    chunk2 <<= 2;
>>>> +void amdgpu_ring_write(struct amdgpu_ring *ring, uint32_t v);
>>>>    -    if (chunk1)
>>>> -        memcpy(dst, src, chunk1);
>>>> -
>>>> -    if (chunk2) {
>>>> -        src += chunk1;
>>>> -        dst = (void *)ring->ring;
>>>> -        memcpy(dst, src, chunk2);
>>>> -    }
>>>> -
>>>> -    ring->wptr += count_dw;
>>>> -    ring->wptr &= ring->ptr_mask;
>>>> -    ring->count_dw -= count_dw;
>>>> -}
>>>> +void amdgpu_ring_write_multiple(struct amdgpu_ring *ring,
>>>> +                          void *src, int count_dw);
>>>>      int amdgpu_ring_test_helper(struct amdgpu_ring *ring);
>>>>    diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
>>>> b/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
>>>> index bd4248c..b3ce5be 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
>>>> @@ -269,10 +269,8 @@ static int psp_v11_0_bootloader_load_kdb(struct
>>>> psp_context *psp)
>>>>        if (ret)
>>>>            return ret;
>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>> -
>>>>        /* Copy PSP KDB binary to memory */
>>>> -    memcpy(psp->fw_pri_buf, psp->kdb_start_addr, psp->kdb_bin_size);
>>>> +    psp_copy_fw(psp, psp->kdb_start_addr, psp->kdb_bin_size);
>>>>          /* Provide the PSP KDB to bootloader */
>>>>        WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>> @@ -302,10 +300,8 @@ static int psp_v11_0_bootloader_load_spl(struct
>>>> psp_context *psp)
>>>>        if (ret)
>>>>            return ret;
>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>> -
>>>>        /* Copy PSP SPL binary to memory */
>>>> -    memcpy(psp->fw_pri_buf, psp->spl_start_addr, psp->spl_bin_size);
>>>> +    psp_copy_fw(psp, psp->spl_start_addr, psp->spl_bin_size);
>>>>          /* Provide the PSP SPL to bootloader */
>>>>        WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>> @@ -335,10 +331,8 @@ static int psp_v11_0_bootloader_load_sysdrv(struct
>>>> psp_context *psp)
>>>>        if (ret)
>>>>            return ret;
>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>> -
>>>>        /* Copy PSP System Driver binary to memory */
>>>> -    memcpy(psp->fw_pri_buf, psp->sys_start_addr, psp->sys_bin_size);
>>>> +    psp_copy_fw(psp, psp->sys_start_addr, psp->sys_bin_size);
>>>>          /* Provide the sys driver to bootloader */
>>>>        WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>> @@ -371,10 +365,8 @@ static int psp_v11_0_bootloader_load_sos(struct
>>>> psp_context *psp)
>>>>        if (ret)
>>>>            return ret;
>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>> -
>>>>        /* Copy Secure OS binary to PSP memory */
>>>> -    memcpy(psp->fw_pri_buf, psp->sos_start_addr, psp->sos_bin_size);
>>>> +    psp_copy_fw(psp, psp->sos_start_addr, psp->sos_bin_size);
>>>>          /* Provide the PSP secure OS to bootloader */
>>>>        WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
>>>> b/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
>>>> index c4828bd..618e5b6 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
>>>> @@ -138,10 +138,8 @@ static int psp_v12_0_bootloader_load_sysdrv(struct
>>>> psp_context *psp)
>>>>        if (ret)
>>>>            return ret;
>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>> -
>>>>        /* Copy PSP System Driver binary to memory */
>>>> -    memcpy(psp->fw_pri_buf, psp->sys_start_addr, psp->sys_bin_size);
>>>> +    psp_copy_fw(psp, psp->sys_start_addr, psp->sys_bin_size);
>>>>          /* Provide the sys driver to bootloader */
>>>>        WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>> @@ -179,10 +177,8 @@ static int psp_v12_0_bootloader_load_sos(struct
>>>> psp_context *psp)
>>>>        if (ret)
>>>>            return ret;
>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>> -
>>>>        /* Copy Secure OS binary to PSP memory */
>>>> -    memcpy(psp->fw_pri_buf, psp->sos_start_addr, psp->sos_bin_size);
>>>> +    psp_copy_fw(psp, psp->sos_start_addr, psp->sos_bin_size);
>>>>          /* Provide the PSP secure OS to bootloader */
>>>>        WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
>>>> b/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
>>>> index f2e725f..d0a6cccd 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
>>>> @@ -102,10 +102,8 @@ static int psp_v3_1_bootloader_load_sysdrv(struct
>>>> psp_context *psp)
>>>>        if (ret)
>>>>            return ret;
>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>> -
>>>>        /* Copy PSP System Driver binary to memory */
>>>> -    memcpy(psp->fw_pri_buf, psp->sys_start_addr, psp->sys_bin_size);
>>>> +    psp_copy_fw(psp, psp->sys_start_addr, psp->sys_bin_size);
>>>>          /* Provide the sys driver to bootloader */
>>>>        WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>> @@ -143,10 +141,8 @@ static int psp_v3_1_bootloader_load_sos(struct
>>>> psp_context *psp)
>>>>        if (ret)
>>>>            return ret;
>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>> -
>>>>        /* Copy Secure OS binary to PSP memory */
>>>> -    memcpy(psp->fw_pri_buf, psp->sos_start_addr, psp->sos_bin_size);
>>>> +    psp_copy_fw(psp, psp->sos_start_addr, psp->sos_bin_size);
>>>>          /* Provide the PSP secure OS to bootloader */
>>>>        WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>
>
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 196+ messages in thread

>>>>        WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>
>

^ permalink raw reply	[flat|nested] 196+ messages in thread

* Re: [PATCH v4 12/14] drm/scheduler: Job timeout handler returns status
  2021-01-19 17:47       ` Luben Tuikov
@ 2021-01-19 18:53         ` Christian König
  -1 siblings, 0 replies; 196+ messages in thread
From: Christian König @ 2021-01-19 18:53 UTC (permalink / raw)
  To: Luben Tuikov, Andrey Grodzovsky, amd-gfx, dri-devel,
	ckoenig.leichtzumerken, daniel.vetter, robh, l.stach, yuq825,
	eric
  Cc: Tomeu Vizoso, gregkh, Steven Price, Alyssa Rosenzweig,
	Russell King, Alexander.Deucher

On 19.01.21 at 18:47, Luben Tuikov wrote:
> On 2021-01-19 2:53 a.m., Christian König wrote:
>> On 18.01.21 at 22:01, Andrey Grodzovsky wrote:
>>> From: Luben Tuikov <luben.tuikov@amd.com>
>>>
>>> This patch does not change current behaviour.
>>>
>>> The driver's job timeout handler now returns
>>> status indicating back to the DRM layer whether
>>> the task (job) was successfully aborted or whether
>>> more time should be given to the task to complete.
>>>
>>> Default behaviour as of this patch is preserved,
>>> except in the obvious-by-comment case in the Panfrost
>>> driver, as documented below.
>>>
>>> All drivers which make use of the
>>> drm_sched_backend_ops' .timedout_job() callback
>>> have been accordingly renamed and return the
>>> would've-been default value of
>>> DRM_TASK_STATUS_ALIVE to restart the task's
>>> timeout timer--this is the old behaviour, and
>>> is preserved by this patch.
>>>
>>> In the case of the Panfrost driver, its timedout
>>> callback correctly first checks if the job had
>>> completed in due time and if so, it now returns
>>> DRM_TASK_STATUS_COMPLETE to notify the DRM layer
>>> that the task can be moved to the done list, to be
>>> freed later. In the other two subsequent checks,
>>> the value of DRM_TASK_STATUS_ALIVE is returned, as
>>> per the default behaviour.
>>>
>>> More involved driver solutions can be had
>>> in subsequent patches.
>>>
>>> v2: Use enum as the status of a driver's job
>>>       timeout callback method.
>>>
>>> v4: (By Andrey Grodzovsky)
>>> Replace DRM_TASK_STATUS_COMPLETE with DRM_TASK_STATUS_ENODEV
>>> to enable a hint to the scheduler for when NOT to rearm the
>>> timeout timer.
>> As Lucas pointed out, returning the job (or task) status doesn't make
>> much sense.
>>
>> What we return here is the status of the scheduler.
>>
>> I would either rename the enum or completely drop it and return a
>> negative error status.
> Yes, that could be had.
>
> Although dropping the enum and returning [-1, 0] might make the meaning
> of the return status vague. Using an enum with an appropriate name makes
> the intention clear to the next programmer.

Completely agree, but -ENODEV and 0 could work.

On the other hand using DRM_SCHED_* is perfectly fine with me as well.

Christian.
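
For illustration, under the DRM_SCHED_* naming proposed below, a driver's
timeout callback could look roughly like this minimal sketch (the my_dev
helpers are hypothetical and not taken from any driver in this series):

static enum drm_sched_status my_timedout_job(struct drm_sched_job *sched_job)
{
	struct my_dev *mdev = to_my_dev(sched_job->sched);

	/* The job finished just before the timer fired: nothing to abort. */
	if (dma_fence_is_signaled(sched_job->s_fence->parent))
		return DRM_SCHED_STAT_NOMINAL;

	/* Device is gone: hint the scheduler not to rearm the timeout timer. */
	if (my_dev_is_unplugged(mdev))
		return DRM_SCHED_STAT_ENODEV;

	/* Otherwise attempt recovery and keep the job's timer running. */
	my_dev_reset(mdev);
	return DRM_SCHED_STAT_NOMINAL;
}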

>
> Now, Andrey did rename one of the enumerated values to
> DRM_TASK_STATUS_ENODEV, perhaps the same but with:
>
> enum drm_sched_status {
>      DRM_SCHED_STAT_NONE, /* Reserve 0 */
>      DRM_SCHED_STAT_NOMINAL,
>      DRM_SCHED_STAT_ENODEV,
> };
>
> and also renaming the enum to the above would be acceptable?
>
> Regards,
> Luben
>
>> Apart from that looks fine to me,
>> Christian.
>>
>>
>>> Cc: Alexander Deucher <Alexander.Deucher@amd.com>
>>> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
>>> Cc: Christian König <christian.koenig@amd.com>
>>> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
>>> Cc: Lucas Stach <l.stach@pengutronix.de>
>>> Cc: Russell King <linux+etnaviv@armlinux.org.uk>
>>> Cc: Christian Gmeiner <christian.gmeiner@gmail.com>
>>> Cc: Qiang Yu <yuq825@gmail.com>
>>> Cc: Rob Herring <robh@kernel.org>
>>> Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com>
>>> Cc: Steven Price <steven.price@arm.com>
>>> Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
>>> Cc: Eric Anholt <eric@anholt.net>
>>> Reported-by: kernel test robot <lkp@intel.com>
>>> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>>> ---
>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_job.c |  6 ++++--
>>>    drivers/gpu/drm/etnaviv/etnaviv_sched.c | 10 +++++++++-
>>>    drivers/gpu/drm/lima/lima_sched.c       |  4 +++-
>>>    drivers/gpu/drm/panfrost/panfrost_job.c |  9 ++++++---
>>>    drivers/gpu/drm/scheduler/sched_main.c  |  4 +---
>>>    drivers/gpu/drm/v3d/v3d_sched.c         | 32 +++++++++++++++++---------------
>>>    include/drm/gpu_scheduler.h             | 17 ++++++++++++++---
>>>    7 files changed, 54 insertions(+), 28 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>>> index ff48101..a111326 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>>> @@ -28,7 +28,7 @@
>>>    #include "amdgpu.h"
>>>    #include "amdgpu_trace.h"
>>>    
>>> -static void amdgpu_job_timedout(struct drm_sched_job *s_job)
>>> +static enum drm_task_status amdgpu_job_timedout(struct drm_sched_job *s_job)
>>>    {
>>>    	struct amdgpu_ring *ring = to_amdgpu_ring(s_job->sched);
>>>    	struct amdgpu_job *job = to_amdgpu_job(s_job);
>>> @@ -41,7 +41,7 @@ static void amdgpu_job_timedout(struct drm_sched_job *s_job)
>>>    	    amdgpu_ring_soft_recovery(ring, job->vmid, s_job->s_fence->parent)) {
>>>    		DRM_ERROR("ring %s timeout, but soft recovered\n",
>>>    			  s_job->sched->name);
>>> -		return;
>>> +		return DRM_TASK_STATUS_ALIVE;
>>>    	}
>>>    
>>>    	amdgpu_vm_get_task_info(ring->adev, job->pasid, &ti);
>>> @@ -53,10 +53,12 @@ static void amdgpu_job_timedout(struct drm_sched_job *s_job)
>>>    
>>>    	if (amdgpu_device_should_recover_gpu(ring->adev)) {
>>>    		amdgpu_device_gpu_recover(ring->adev, job);
>>> +		return DRM_TASK_STATUS_ALIVE;
>>>    	} else {
>>>    		drm_sched_suspend_timeout(&ring->sched);
>>>    		if (amdgpu_sriov_vf(adev))
>>>    			adev->virt.tdr_debug = true;
>>> +		return DRM_TASK_STATUS_ALIVE;
>>>    	}
>>>    }
>>>    
>>> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_sched.c b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
>>> index cd46c88..c495169 100644
>>> --- a/drivers/gpu/drm/etnaviv/etnaviv_sched.c
>>> +++ b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
>>> @@ -82,7 +82,8 @@ static struct dma_fence *etnaviv_sched_run_job(struct drm_sched_job *sched_job)
>>>    	return fence;
>>>    }
>>>    
>>> -static void etnaviv_sched_timedout_job(struct drm_sched_job *sched_job)
>>> +static enum drm_task_status etnaviv_sched_timedout_job(struct drm_sched_job
>>> +						       *sched_job)
>>>    {
>>>    	struct etnaviv_gem_submit *submit = to_etnaviv_submit(sched_job);
>>>    	struct etnaviv_gpu *gpu = submit->gpu;
>>> @@ -120,9 +121,16 @@ static void etnaviv_sched_timedout_job(struct drm_sched_job *sched_job)
>>>    
>>>    	drm_sched_resubmit_jobs(&gpu->sched);
>>>    
>>> +	/* Tell the DRM scheduler that this task needs
>>> +	 * more time.
>>> +	 */
>>> +	drm_sched_start(&gpu->sched, true);
>>> +	return DRM_TASK_STATUS_ALIVE;
>>> +
>>>    out_no_timeout:
>>>    	/* restart scheduler after GPU is usable again */
>>>    	drm_sched_start(&gpu->sched, true);
>>> +	return DRM_TASK_STATUS_ALIVE;
>>>    }
>>>    
>>>    static void etnaviv_sched_free_job(struct drm_sched_job *sched_job)
>>> diff --git a/drivers/gpu/drm/lima/lima_sched.c b/drivers/gpu/drm/lima/lima_sched.c
>>> index 63b4c56..66d9236 100644
>>> --- a/drivers/gpu/drm/lima/lima_sched.c
>>> +++ b/drivers/gpu/drm/lima/lima_sched.c
>>> @@ -415,7 +415,7 @@ static void lima_sched_build_error_task_list(struct lima_sched_task *task)
>>>    	mutex_unlock(&dev->error_task_list_lock);
>>>    }
>>>    
>>> -static void lima_sched_timedout_job(struct drm_sched_job *job)
>>> +static enum drm_task_status lima_sched_timedout_job(struct drm_sched_job *job)
>>>    {
>>>    	struct lima_sched_pipe *pipe = to_lima_pipe(job->sched);
>>>    	struct lima_sched_task *task = to_lima_task(job);
>>> @@ -449,6 +449,8 @@ static void lima_sched_timedout_job(struct drm_sched_job *job)
>>>    
>>>    	drm_sched_resubmit_jobs(&pipe->base);
>>>    	drm_sched_start(&pipe->base, true);
>>> +
>>> +	return DRM_TASK_STATUS_ALIVE;
>>>    }
>>>    
>>>    static void lima_sched_free_job(struct drm_sched_job *job)
>>> diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
>>> index 04e6f6f..10d41ac 100644
>>> --- a/drivers/gpu/drm/panfrost/panfrost_job.c
>>> +++ b/drivers/gpu/drm/panfrost/panfrost_job.c
>>> @@ -432,7 +432,8 @@ static void panfrost_scheduler_start(struct panfrost_queue_state *queue)
>>>    	mutex_unlock(&queue->lock);
>>>    }
>>>    
>>> -static void panfrost_job_timedout(struct drm_sched_job *sched_job)
>>> +static enum drm_task_status panfrost_job_timedout(struct drm_sched_job
>>> +						  *sched_job)
>>>    {
>>>    	struct panfrost_job *job = to_panfrost_job(sched_job);
>>>    	struct panfrost_device *pfdev = job->pfdev;
>>> @@ -443,7 +444,7 @@ static void panfrost_job_timedout(struct drm_sched_job *sched_job)
>>>    	 * spurious. Bail out.
>>>    	 */
>>>    	if (dma_fence_is_signaled(job->done_fence))
>>> -		return;
>>> +		return DRM_TASK_STATUS_ALIVE;
>>>    
>>>    	dev_err(pfdev->dev, "gpu sched timeout, js=%d, config=0x%x, status=0x%x, head=0x%x, tail=0x%x, sched_job=%p",
>>>    		js,
>>> @@ -455,11 +456,13 @@ static void panfrost_job_timedout(struct drm_sched_job *sched_job)
>>>    
>>>    	/* Scheduler is already stopped, nothing to do. */
>>>    	if (!panfrost_scheduler_stop(&pfdev->js->queue[js], sched_job))
>>> -		return;
>>> +		return DRM_TASK_STATUS_ALIVE;
>>>    
>>>    	/* Schedule a reset if there's no reset in progress. */
>>>    	if (!atomic_xchg(&pfdev->reset.pending, 1))
>>>    		schedule_work(&pfdev->reset.work);
>>> +
>>> +	return DRM_TASK_STATUS_ALIVE;
>>>    }
>>>    
>>>    static const struct drm_sched_backend_ops panfrost_sched_ops = {
>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
>>> index 92637b7..73fccc5 100644
>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>>> @@ -527,7 +527,7 @@ void drm_sched_start(struct drm_gpu_scheduler *sched, bool full_recovery)
>>>    EXPORT_SYMBOL(drm_sched_start);
>>>    
>>>    /**
>>> - * drm_sched_resubmit_jobs - helper to relunch job from pending ring list
>>> + * drm_sched_resubmit_jobs - helper to relaunch jobs from the pending list
>>>     *
>>>     * @sched: scheduler instance
>>>     *
>>> @@ -561,8 +561,6 @@ void drm_sched_resubmit_jobs(struct drm_gpu_scheduler *sched)
>>>    		} else {
>>>    			s_job->s_fence->parent = fence;
>>>    		}
>>> -
>>> -
>>>    	}
>>>    }
>>>    EXPORT_SYMBOL(drm_sched_resubmit_jobs);
>>> diff --git a/drivers/gpu/drm/v3d/v3d_sched.c b/drivers/gpu/drm/v3d/v3d_sched.c
>>> index 452682e..3740665e 100644
>>> --- a/drivers/gpu/drm/v3d/v3d_sched.c
>>> +++ b/drivers/gpu/drm/v3d/v3d_sched.c
>>> @@ -259,7 +259,7 @@ v3d_cache_clean_job_run(struct drm_sched_job *sched_job)
>>>    	return NULL;
>>>    }
>>>    
>>> -static void
>>> +static enum drm_task_status
>>>    v3d_gpu_reset_for_timeout(struct v3d_dev *v3d, struct drm_sched_job *sched_job)
>>>    {
>>>    	enum v3d_queue q;
>>> @@ -285,6 +285,8 @@ v3d_gpu_reset_for_timeout(struct v3d_dev *v3d, struct drm_sched_job *sched_job)
>>>    	}
>>>    
>>>    	mutex_unlock(&v3d->reset_lock);
>>> +
>>> +	return DRM_TASK_STATUS_ALIVE;
>>>    }
>>>    
>>>    /* If the current address or return address have changed, then the GPU
>>> @@ -292,7 +294,7 @@ v3d_gpu_reset_for_timeout(struct v3d_dev *v3d, struct drm_sched_job *sched_job)
>>>     * could fail if the GPU got in an infinite loop in the CL, but that
>>>     * is pretty unlikely outside of an i-g-t testcase.
>>>     */
>>> -static void
>>> +static enum drm_task_status
>>>    v3d_cl_job_timedout(struct drm_sched_job *sched_job, enum v3d_queue q,
>>>    		    u32 *timedout_ctca, u32 *timedout_ctra)
>>>    {
>>> @@ -304,39 +306,39 @@ v3d_cl_job_timedout(struct drm_sched_job *sched_job, enum v3d_queue q,
>>>    	if (*timedout_ctca != ctca || *timedout_ctra != ctra) {
>>>    		*timedout_ctca = ctca;
>>>    		*timedout_ctra = ctra;
>>> -		return;
>>> +		return DRM_TASK_STATUS_ALIVE;
>>>    	}
>>>    
>>> -	v3d_gpu_reset_for_timeout(v3d, sched_job);
>>> +	return v3d_gpu_reset_for_timeout(v3d, sched_job);
>>>    }
>>>    
>>> -static void
>>> +static enum drm_task_status
>>>    v3d_bin_job_timedout(struct drm_sched_job *sched_job)
>>>    {
>>>    	struct v3d_bin_job *job = to_bin_job(sched_job);
>>>    
>>> -	v3d_cl_job_timedout(sched_job, V3D_BIN,
>>> -			    &job->timedout_ctca, &job->timedout_ctra);
>>> +	return v3d_cl_job_timedout(sched_job, V3D_BIN,
>>> +				   &job->timedout_ctca, &job->timedout_ctra);
>>>    }
>>>    
>>> -static void
>>> +static enum drm_task_status
>>>    v3d_render_job_timedout(struct drm_sched_job *sched_job)
>>>    {
>>>    	struct v3d_render_job *job = to_render_job(sched_job);
>>>    
>>> -	v3d_cl_job_timedout(sched_job, V3D_RENDER,
>>> -			    &job->timedout_ctca, &job->timedout_ctra);
>>> +	return v3d_cl_job_timedout(sched_job, V3D_RENDER,
>>> +				   &job->timedout_ctca, &job->timedout_ctra);
>>>    }
>>>    
>>> -static void
>>> +static enum drm_task_status
>>>    v3d_generic_job_timedout(struct drm_sched_job *sched_job)
>>>    {
>>>    	struct v3d_job *job = to_v3d_job(sched_job);
>>>    
>>> -	v3d_gpu_reset_for_timeout(job->v3d, sched_job);
>>> +	return v3d_gpu_reset_for_timeout(job->v3d, sched_job);
>>>    }
>>>    
>>> -static void
>>> +static enum drm_task_status
>>>    v3d_csd_job_timedout(struct drm_sched_job *sched_job)
>>>    {
>>>    	struct v3d_csd_job *job = to_csd_job(sched_job);
>>> @@ -348,10 +350,10 @@ v3d_csd_job_timedout(struct drm_sched_job *sched_job)
>>>    	 */
>>>    	if (job->timedout_batches != batches) {
>>>    		job->timedout_batches = batches;
>>> -		return;
>>> +		return DRM_TASK_STATUS_ALIVE;
>>>    	}
>>>    
>>> -	v3d_gpu_reset_for_timeout(v3d, sched_job);
>>> +	return v3d_gpu_reset_for_timeout(v3d, sched_job);
>>>    }
>>>    
>>>    static const struct drm_sched_backend_ops v3d_bin_sched_ops = {
>>> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
>>> index 975e8a6..3ba36bc 100644
>>> --- a/include/drm/gpu_scheduler.h
>>> +++ b/include/drm/gpu_scheduler.h
>>> @@ -206,6 +206,11 @@ static inline bool drm_sched_invalidate_job(struct drm_sched_job *s_job,
>>>    	return s_job && atomic_inc_return(&s_job->karma) > threshold;
>>>    }
>>>    
>>> +enum drm_task_status {
>>> +	DRM_TASK_STATUS_ENODEV,
>>> +	DRM_TASK_STATUS_ALIVE
>>> +};
>>> +
>>>    /**
>>>     * struct drm_sched_backend_ops
>>>     *
>>> @@ -230,10 +235,16 @@ struct drm_sched_backend_ops {
>>>    	struct dma_fence *(*run_job)(struct drm_sched_job *sched_job);
>>>    
>>>    	/**
>>> -         * @timedout_job: Called when a job has taken too long to execute,
>>> -         * to trigger GPU recovery.
>>> +	 * @timedout_job: Called when a job has taken too long to execute,
>>> +	 * to trigger GPU recovery.
>>> +	 *
>>> +	 * Return DRM_TASK_STATUS_ALIVE, if the task (job) is healthy
>>> +	 * and executing in the hardware, i.e. it needs more time.
>>> +	 *
>>> +	 * Return DRM_TASK_STATUS_ENODEV, if the task (job) has
>>> +	 * been aborted.
>>>    	 */
>>> -	void (*timedout_job)(struct drm_sched_job *sched_job);
>>> +	enum drm_task_status (*timedout_job)(struct drm_sched_job *sched_job);
>>>    
>>>    	/**
>>>             * @free_job: Called once the job's finished fence has been signaled


^ permalink raw reply	[flat|nested] 196+ messages in thread

* Re: [PATCH v4 11/14] drm/amdgpu: Guard against write accesses after device removal
  2021-01-19 18:22           ` Andrey Grodzovsky
@ 2021-01-19 18:59             ` Christian König
  -1 siblings, 0 replies; 196+ messages in thread
From: Christian König @ 2021-01-19 18:59 UTC (permalink / raw)
  To: Andrey Grodzovsky, Daniel Vetter
  Cc: Greg KH, dri-devel, amd-gfx list, Alex Deucher,
	Christian König, Qiang Yu

On 19.01.21 at 19:22, Andrey Grodzovsky wrote:
>
> On 1/19/21 1:05 PM, Daniel Vetter wrote:
>> On Tue, Jan 19, 2021 at 4:35 PM Andrey Grodzovsky
>> <Andrey.Grodzovsky@amd.com> wrote:
>>> There is really no other way according to this article
>>> https://lwn.net/Articles/767885/
>>>
>>>
>>> "A perfect solution seems nearly impossible though; we cannot 
>>> acquire a mutex on
>>> the user
>>> to prevent them from yanking a device and we cannot check for a 
>>> presence change
>>> after every
>>> device access for performance reasons. "
>>>
>>> But I assumed srcu_read_lock should be pretty seamless
>>> performance-wise, no?
>> The read side is supposed to be dirt cheap; the write side is where we
>> just stall for all readers to eventually complete on their own. It
>> should definitely be much cheaper than an MMIO read; on the MMIO write
>> side it might actually hurt a bit. OTOH I think those don't stall the
>> CPU by default when they're timing out, so maybe if the overhead is
>> too much for those, we could omit them?
>>
>> Maybe just do a small microbenchmark for these for testing, with a
>> register that doesn't change hw state. So with and without
>> drm_dev_enter/exit, and also one with the hw plugged out so that we
>> have actual timeouts in the transactions.
>> -Daniel
>
>
> So, say, writing in a loop to some harmless scratch register many times,
> both for the plugged and the unplugged case, and measuring the total
> time delta?

I think we should at least measure the following:

1. Writing X times to a scratch reg without your patch.
2. Writing X times to a scratch reg with your patch.
3. Writing X times to a scratch reg with the hardware physically 
disconnected.

I suggest to repeat that once for Polaris (or older) and once for Vega 
or Navi.

The SRBM on Polaris is meant to introduce some delay in each access, so 
it might react differently than the newer hardware.
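
A rough sketch of such a measurement loop (hypothetical helper; assumes a
scratch register that is harmless to hammer, and relies on WREG32 resolving
to amdgpu_device_wreg() so that the drm_dev_enter()/drm_dev_exit() overhead
is included once the patch is applied):

static void scratch_write_bench(struct amdgpu_device *adev, u32 reg)
{
	const int iters = 100000;
	ktime_t start = ktime_get();
	int i;

	/* Each write goes through amdgpu_device_wreg() and thus through
	 * the drm_dev_enter()/drm_dev_exit() pair under test. */
	for (i = 0; i < iters; i++)
		WREG32(reg, 0xCAFEBABE);

	dev_info(adev->dev, "%d scratch writes took %lld ns\n",
		 iters, ktime_to_ns(ktime_sub(ktime_get(), start)));
}

Running that once per case above should give us comparable deltas.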

Christian.

>
> Andrey
>
>
>>
>>> The other solution would be, as I suggested, to keep all the device IO
>>> ranges reserved and the system memory pages unfreed until the device is
>>> finalized in the driver, but Daniel said this would upset the PCI layer
>>> (the MMIO ranges reservation part).
>>>
>>> Andrey
>>>
>>>
>>>
>>>
>>> On 1/19/21 3:55 AM, Christian König wrote:
>>>> On 18.01.21 at 22:01, Andrey Grodzovsky wrote:
>>>>> This should prevent writing to memory or IO ranges possibly
>>>>> already allocated for other uses after our device is removed.
>>>> Wow, that adds quite some overhead to every register access. I'm not
>>>> sure we can do this.
>>>>
>>>> Christian.
>>>>
>>>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>>>>> ---
>>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 57 ++++++++++++++++++++++++
>>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c    |  9 ++++
>>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c    | 53 +++++++++++++---------
>>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h    |  3 ++
>>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c   | 70 ++++++++++++++++++++++++++++++
>>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h   | 49 ++-------------------
>>>>>    drivers/gpu/drm/amd/amdgpu/psp_v11_0.c     | 16 ++-----
>>>>>    drivers/gpu/drm/amd/amdgpu/psp_v12_0.c     |  8 +---
>>>>>    drivers/gpu/drm/amd/amdgpu/psp_v3_1.c      |  8 +---
>>>>>    9 files changed, 184 insertions(+), 89 deletions(-)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>> index e99f4f1..0a9d73c 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>> @@ -72,6 +72,8 @@
>>>>>      #include <linux/iommu.h>
>>>>>    +#include <drm/drm_drv.h>
>>>>> +
>>>>>    MODULE_FIRMWARE("amdgpu/vega10_gpu_info.bin");
>>>>>    MODULE_FIRMWARE("amdgpu/vega12_gpu_info.bin");
>>>>>    MODULE_FIRMWARE("amdgpu/raven_gpu_info.bin");
>>>>> @@ -404,13 +406,21 @@ uint8_t amdgpu_mm_rreg8(struct amdgpu_device 
>>>>> *adev,
>>>>> uint32_t offset)
>>>>>     */
>>>>>    void amdgpu_mm_wreg8(struct amdgpu_device *adev, uint32_t 
>>>>> offset, uint8_t
>>>>> value)
>>>>>    {
>>>>> +    int idx;
>>>>> +
>>>>>        if (adev->in_pci_err_recovery)
>>>>>            return;
>>>>>    +
>>>>> +    if (!drm_dev_enter(&adev->ddev, &idx))
>>>>> +        return;
>>>>> +
>>>>>        if (offset < adev->rmmio_size)
>>>>>            writeb(value, adev->rmmio + offset);
>>>>>        else
>>>>>            BUG();
>>>>> +
>>>>> +    drm_dev_exit(idx);
>>>>>    }
>>>>>      /**
>>>>> @@ -427,9 +437,14 @@ void amdgpu_device_wreg(struct amdgpu_device 
>>>>> *adev,
>>>>>                uint32_t reg, uint32_t v,
>>>>>                uint32_t acc_flags)
>>>>>    {
>>>>> +    int idx;
>>>>> +
>>>>>        if (adev->in_pci_err_recovery)
>>>>>            return;
>>>>>    +    if (!drm_dev_enter(&adev->ddev, &idx))
>>>>> +        return;
>>>>> +
>>>>>        if ((reg * 4) < adev->rmmio_size) {
>>>>>            if (!(acc_flags & AMDGPU_REGS_NO_KIQ) &&
>>>>>                amdgpu_sriov_runtime(adev) &&
>>>>> @@ -444,6 +459,8 @@ void amdgpu_device_wreg(struct amdgpu_device 
>>>>> *adev,
>>>>>        }
>>>>> trace_amdgpu_device_wreg(adev->pdev->device, reg, v);
>>>>> +
>>>>> +    drm_dev_exit(idx);
>>>>>    }
>>>>>      /*
>>>>> @@ -454,9 +471,14 @@ void amdgpu_device_wreg(struct amdgpu_device 
>>>>> *adev,
>>>>>    void amdgpu_mm_wreg_mmio_rlc(struct amdgpu_device *adev,
>>>>>                     uint32_t reg, uint32_t v)
>>>>>    {
>>>>> +    int idx;
>>>>> +
>>>>>        if (adev->in_pci_err_recovery)
>>>>>            return;
>>>>>    +    if (!drm_dev_enter(&adev->ddev, &idx))
>>>>> +        return;
>>>>> +
>>>>>        if (amdgpu_sriov_fullaccess(adev) &&
>>>>>            adev->gfx.rlc.funcs &&
>>>>> adev->gfx.rlc.funcs->is_rlcg_access_range) {
>>>>> @@ -465,6 +487,8 @@ void amdgpu_mm_wreg_mmio_rlc(struct 
>>>>> amdgpu_device *adev,
>>>>>        } else {
>>>>>            writel(v, ((void __iomem *)adev->rmmio) + (reg * 4));
>>>>>        }
>>>>> +
>>>>> +    drm_dev_exit(idx);
>>>>>    }
>>>>>      /**
>>>>> @@ -499,15 +523,22 @@ u32 amdgpu_io_rreg(struct amdgpu_device 
>>>>> *adev, u32 reg)
>>>>>     */
>>>>>    void amdgpu_io_wreg(struct amdgpu_device *adev, u32 reg, u32 v)
>>>>>    {
>>>>> +    int idx;
>>>>> +
>>>>>        if (adev->in_pci_err_recovery)
>>>>>            return;
>>>>>    +    if (!drm_dev_enter(&adev->ddev, &idx))
>>>>> +        return;
>>>>> +
>>>>>        if ((reg * 4) < adev->rio_mem_size)
>>>>>            iowrite32(v, adev->rio_mem + (reg * 4));
>>>>>        else {
>>>>>            iowrite32((reg * 4), adev->rio_mem + (mmMM_INDEX * 4));
>>>>>            iowrite32(v, adev->rio_mem + (mmMM_DATA * 4));
>>>>>        }
>>>>> +
>>>>> +    drm_dev_exit(idx);
>>>>>    }
>>>>>      /**
>>>>> @@ -544,14 +575,21 @@ u32 amdgpu_mm_rdoorbell(struct amdgpu_device 
>>>>> *adev, u32
>>>>> index)
>>>>>     */
>>>>>    void amdgpu_mm_wdoorbell(struct amdgpu_device *adev, u32 index, 
>>>>> u32 v)
>>>>>    {
>>>>> +    int idx;
>>>>> +
>>>>>        if (adev->in_pci_err_recovery)
>>>>>            return;
>>>>>    +    if (!drm_dev_enter(&adev->ddev, &idx))
>>>>> +        return;
>>>>> +
>>>>>        if (index < adev->doorbell.num_doorbells) {
>>>>>            writel(v, adev->doorbell.ptr + index);
>>>>>        } else {
>>>>>            DRM_ERROR("writing beyond doorbell aperture: 
>>>>> 0x%08x!\n", index);
>>>>>        }
>>>>> +
>>>>> +    drm_dev_exit(idx);
>>>>>    }
>>>>>      /**
>>>>> @@ -588,14 +626,21 @@ u64 amdgpu_mm_rdoorbell64(struct 
>>>>> amdgpu_device *adev,
>>>>> u32 index)
>>>>>     */
>>>>>    void amdgpu_mm_wdoorbell64(struct amdgpu_device *adev, u32 
>>>>> index, u64 v)
>>>>>    {
>>>>> +    int idx;
>>>>> +
>>>>>        if (adev->in_pci_err_recovery)
>>>>>            return;
>>>>>    +    if (!drm_dev_enter(&adev->ddev, &idx))
>>>>> +        return;
>>>>> +
>>>>>        if (index < adev->doorbell.num_doorbells) {
>>>>>            atomic64_set((atomic64_t *)(adev->doorbell.ptr + 
>>>>> index), v);
>>>>>        } else {
>>>>>            DRM_ERROR("writing beyond doorbell aperture: 
>>>>> 0x%08x!\n", index);
>>>>>        }
>>>>> +
>>>>> +    drm_dev_exit(idx);
>>>>>    }
>>>>>      /**
>>>>> @@ -682,6 +727,10 @@ void amdgpu_device_indirect_wreg(struct 
>>>>> amdgpu_device
>>>>> *adev,
>>>>>        unsigned long flags;
>>>>>        void __iomem *pcie_index_offset;
>>>>>        void __iomem *pcie_data_offset;
>>>>> +    int idx;
>>>>> +
>>>>> +    if (!drm_dev_enter(&adev->ddev, &idx))
>>>>> +        return;
>>>>>          spin_lock_irqsave(&adev->pcie_idx_lock, flags);
>>>>>        pcie_index_offset = (void __iomem *)adev->rmmio + 
>>>>> pcie_index * 4;
>>>>> @@ -692,6 +741,8 @@ void amdgpu_device_indirect_wreg(struct 
>>>>> amdgpu_device *adev,
>>>>>        writel(reg_data, pcie_data_offset);
>>>>>        readl(pcie_data_offset);
>>>>>        spin_unlock_irqrestore(&adev->pcie_idx_lock, flags);
>>>>> +
>>>>> +    drm_dev_exit(idx);
>>>>>    }
>>>>>      /**
>>>>> @@ -711,6 +762,10 @@ void amdgpu_device_indirect_wreg64(struct 
>>>>> amdgpu_device
>>>>> *adev,
>>>>>        unsigned long flags;
>>>>>        void __iomem *pcie_index_offset;
>>>>>        void __iomem *pcie_data_offset;
>>>>> +    int idx;
>>>>> +
>>>>> +    if (!drm_dev_enter(&adev->ddev, &idx))
>>>>> +        return;
>>>>>          spin_lock_irqsave(&adev->pcie_idx_lock, flags);
>>>>>        pcie_index_offset = (void __iomem *)adev->rmmio + 
>>>>> pcie_index * 4;
>>>>> @@ -727,6 +782,8 @@ void amdgpu_device_indirect_wreg64(struct 
>>>>> amdgpu_device
>>>>> *adev,
>>>>>        writel((u32)(reg_data >> 32), pcie_data_offset);
>>>>>        readl(pcie_data_offset);
>>>>>        spin_unlock_irqrestore(&adev->pcie_idx_lock, flags);
>>>>> +
>>>>> +    drm_dev_exit(idx);
>>>>>    }
>>>>>      /**
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
>>>>> index fe1a39f..1beb4e6 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
>>>>> @@ -31,6 +31,8 @@
>>>>>    #include "amdgpu_ras.h"
>>>>>    #include "amdgpu_xgmi.h"
>>>>>    +#include <drm/drm_drv.h>
>>>>> +
>>>>>    /**
>>>>>     * amdgpu_gmc_get_pde_for_bo - get the PDE for a BO
>>>>>     *
>>>>> @@ -98,6 +100,10 @@ int amdgpu_gmc_set_pte_pde(struct 
>>>>> amdgpu_device *adev,
>>>>> void *cpu_pt_addr,
>>>>>    {
>>>>>        void __iomem *ptr = (void *)cpu_pt_addr;
>>>>>        uint64_t value;
>>>>> +    int idx;
>>>>> +
>>>>> +    if (!drm_dev_enter(&adev->ddev, &idx))
>>>>> +        return 0;
>>>>>          /*
>>>>>         * The following is for PTE only. GART does not have PDEs.
>>>>> @@ -105,6 +111,9 @@ int amdgpu_gmc_set_pte_pde(struct 
>>>>> amdgpu_device *adev,
>>>>> void *cpu_pt_addr,
>>>>>        value = addr & 0x0000FFFFFFFFF000ULL;
>>>>>        value |= flags;
>>>>>        writeq(value, ptr + (gpu_page_idx * 8));
>>>>> +
>>>>> +    drm_dev_exit(idx);
>>>>> +
>>>>>        return 0;
>>>>>    }
>>>>>    diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
>>>>> index 523d22d..89e2bfe 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
>>>>> @@ -37,6 +37,8 @@
>>>>>      #include "amdgpu_ras.h"
>>>>>    +#include <drm/drm_drv.h>
>>>>> +
>>>>>    static int psp_sysfs_init(struct amdgpu_device *adev);
>>>>>    static void psp_sysfs_fini(struct amdgpu_device *adev);
>>>>>    @@ -248,7 +250,7 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>>>>               struct psp_gfx_cmd_resp *cmd, uint64_t fence_mc_addr)
>>>>>    {
>>>>>        int ret;
>>>>> -    int index;
>>>>> +    int index, idx;
>>>>>        int timeout = 2000;
>>>>>        bool ras_intr = false;
>>>>>        bool skip_unsupport = false;
>>>>> @@ -256,6 +258,9 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>>>>        if (psp->adev->in_pci_err_recovery)
>>>>>            return 0;
>>>>>    +    if (!drm_dev_enter(&psp->adev->ddev, &idx))
>>>>> +        return 0;
>>>>> +
>>>>>        mutex_lock(&psp->mutex);
>>>>>          memset(psp->cmd_buf_mem, 0, PSP_CMD_BUFFER_SIZE);
>>>>> @@ -266,8 +271,7 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>>>>        ret = psp_ring_cmd_submit(psp, psp->cmd_buf_mc_addr, 
>>>>> fence_mc_addr,
>>>>> index);
>>>>>        if (ret) {
>>>>>            atomic_dec(&psp->fence_value);
>>>>> -        mutex_unlock(&psp->mutex);
>>>>> -        return ret;
>>>>> +        goto exit;
>>>>>        }
>>>>>          amdgpu_asic_invalidate_hdp(psp->adev, NULL);
>>>>> @@ -307,8 +311,8 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>>>>                 psp->cmd_buf_mem->cmd_id,
>>>>>                 psp->cmd_buf_mem->resp.status);
>>>>>            if (!timeout) {
>>>>> -            mutex_unlock(&psp->mutex);
>>>>> -            return -EINVAL;
>>>>> +            ret = -EINVAL;
>>>>> +            goto exit;
>>>>>            }
>>>>>        }
>>>>>    @@ -316,8 +320,10 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>>>>            ucode->tmr_mc_addr_lo = psp->cmd_buf_mem->resp.fw_addr_lo;
>>>>>            ucode->tmr_mc_addr_hi = psp->cmd_buf_mem->resp.fw_addr_hi;
>>>>>        }
>>>>> -    mutex_unlock(&psp->mutex);
>>>>>    +exit:
>>>>> +    mutex_unlock(&psp->mutex);
>>>>> +    drm_dev_exit(idx);
>>>>>        return ret;
>>>>>    }
>>>>>    @@ -354,8 +360,7 @@ static int psp_load_toc(struct psp_context 
>>>>> *psp,
>>>>>        if (!cmd)
>>>>>            return -ENOMEM;
>>>>>        /* Copy toc to psp firmware private buffer */
>>>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>> -    memcpy(psp->fw_pri_buf, psp->toc_start_addr, psp->toc_bin_size);
>>>>> +    psp_copy_fw(psp, psp->toc_start_addr, psp->toc_bin_size);
>>>>>          psp_prep_load_toc_cmd_buf(cmd, psp->fw_pri_mc_addr, 
>>>>> psp->toc_bin_size);
>>>>>    @@ -570,8 +575,7 @@ static int psp_asd_load(struct psp_context 
>>>>> *psp)
>>>>>        if (!cmd)
>>>>>            return -ENOMEM;
>>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>> -    memcpy(psp->fw_pri_buf, psp->asd_start_addr, 
>>>>> psp->asd_ucode_size);
>>>>> +    psp_copy_fw(psp, psp->asd_start_addr, psp->asd_ucode_size);
>>>>>          psp_prep_asd_load_cmd_buf(cmd, psp->fw_pri_mc_addr,
>>>>>                      psp->asd_ucode_size);
>>>>> @@ -726,8 +730,7 @@ static int psp_xgmi_load(struct psp_context *psp)
>>>>>        if (!cmd)
>>>>>            return -ENOMEM;
>>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>> -    memcpy(psp->fw_pri_buf, psp->ta_xgmi_start_addr, 
>>>>> psp->ta_xgmi_ucode_size);
>>>>> +    psp_copy_fw(psp, psp->ta_xgmi_start_addr, 
>>>>> psp->ta_xgmi_ucode_size);
>>>>>          psp_prep_ta_load_cmd_buf(cmd,
>>>>>                     psp->fw_pri_mc_addr,
>>>>> @@ -982,8 +985,7 @@ static int psp_ras_load(struct psp_context *psp)
>>>>>        if (!cmd)
>>>>>            return -ENOMEM;
>>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>> -    memcpy(psp->fw_pri_buf, psp->ta_ras_start_addr, 
>>>>> psp->ta_ras_ucode_size);
>>>>> +    psp_copy_fw(psp, psp->ta_ras_start_addr, 
>>>>> psp->ta_ras_ucode_size);
>>>>>          psp_prep_ta_load_cmd_buf(cmd,
>>>>>                     psp->fw_pri_mc_addr,
>>>>> @@ -1219,8 +1221,7 @@ static int psp_hdcp_load(struct psp_context 
>>>>> *psp)
>>>>>        if (!cmd)
>>>>>            return -ENOMEM;
>>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>> -    memcpy(psp->fw_pri_buf, psp->ta_hdcp_start_addr,
>>>>> +    psp_copy_fw(psp, psp->ta_hdcp_start_addr,
>>>>>               psp->ta_hdcp_ucode_size);
>>>>>          psp_prep_ta_load_cmd_buf(cmd,
>>>>> @@ -1366,8 +1367,7 @@ static int psp_dtm_load(struct psp_context 
>>>>> *psp)
>>>>>        if (!cmd)
>>>>>            return -ENOMEM;
>>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>> -    memcpy(psp->fw_pri_buf, psp->ta_dtm_start_addr, 
>>>>> psp->ta_dtm_ucode_size);
>>>>> +    psp_copy_fw(psp, psp->ta_dtm_start_addr, 
>>>>> psp->ta_dtm_ucode_size);
>>>>>          psp_prep_ta_load_cmd_buf(cmd,
>>>>>                     psp->fw_pri_mc_addr,
>>>>> @@ -1507,8 +1507,7 @@ static int psp_rap_load(struct psp_context 
>>>>> *psp)
>>>>>        if (!cmd)
>>>>>            return -ENOMEM;
>>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>> -    memcpy(psp->fw_pri_buf, psp->ta_rap_start_addr, 
>>>>> psp->ta_rap_ucode_size);
>>>>> +    psp_copy_fw(psp, psp->ta_rap_start_addr, 
>>>>> psp->ta_rap_ucode_size);
>>>>>          psp_prep_ta_load_cmd_buf(cmd,
>>>>>                     psp->fw_pri_mc_addr,
>>>>> @@ -2778,6 +2777,20 @@ static ssize_t 
>>>>> psp_usbc_pd_fw_sysfs_write(struct
>>>>> device *dev,
>>>>>        return count;
>>>>>    }
>>>>>    +void psp_copy_fw(struct psp_context *psp, uint8_t *start_addr, 
>>>>> uint32_t
>>>>> bin_size)
>>>>> +{
>>>>> +    int idx;
>>>>> +
>>>>> +    if (!drm_dev_enter(&psp->adev->ddev, &idx))
>>>>> +        return;
>>>>> +
>>>>> +    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>> +    memcpy(psp->fw_pri_buf, start_addr, bin_size);
>>>>> +
>>>>> +    drm_dev_exit(idx);
>>>>> +}
>>>>> +
>>>>> +
>>>>>    static DEVICE_ATTR(usbc_pd_fw, S_IRUGO | S_IWUSR,
>>>>>               psp_usbc_pd_fw_sysfs_read,
>>>>>               psp_usbc_pd_fw_sysfs_write);
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
>>>>> index da250bc..ac69314 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
>>>>> @@ -400,4 +400,7 @@ int psp_init_ta_microcode(struct psp_context 
>>>>> *psp,
>>>>>                  const char *chip_name);
>>>>>    int psp_get_fw_attestation_records_addr(struct psp_context *psp,
>>>>>                        uint64_t *output_ptr);
>>>>> +
>>>>> +void psp_copy_fw(struct psp_context *psp, uint8_t *start_addr, 
>>>>> uint32_t
>>>>> bin_size);
>>>>> +
>>>>>    #endif
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>>>>> index 1a612f5..d656494 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>>>>> @@ -35,6 +35,8 @@
>>>>>    #include "amdgpu.h"
>>>>>    #include "atom.h"
>>>>>    +#include <drm/drm_drv.h>
>>>>> +
>>>>>    /*
>>>>>     * Rings
>>>>>     * Most engines on the GPU are fed via ring buffers. Ring
>>>>> @@ -463,3 +465,71 @@ int amdgpu_ring_test_helper(struct 
>>>>> amdgpu_ring *ring)
>>>>>        ring->sched.ready = !r;
>>>>>        return r;
>>>>>    }
>>>>> +
>>>>> +void amdgpu_ring_clear_ring(struct amdgpu_ring *ring)
>>>>> +{
>>>>> +    int idx;
>>>>> +    int i = 0;
>>>>> +
>>>>> +    if (!drm_dev_enter(&ring->adev->ddev, &idx))
>>>>> +        return;
>>>>> +
>>>>> +    while (i <= ring->buf_mask)
>>>>> +        ring->ring[i++] = ring->funcs->nop;
>>>>> +
>>>>> +    drm_dev_exit(idx);
>>>>> +
>>>>> +}
>>>>> +
>>>>> +void amdgpu_ring_write(struct amdgpu_ring *ring, uint32_t v)
>>>>> +{
>>>>> +    int idx;
>>>>> +
>>>>> +    if (!drm_dev_enter(&ring->adev->ddev, &idx))
>>>>> +        return;
>>>>> +
>>>>> +    if (ring->count_dw <= 0)
>>>>> +        DRM_ERROR("amdgpu: writing more dwords to the ring than 
>>>>> expected!\n");
>>>>> +    ring->ring[ring->wptr++ & ring->buf_mask] = v;
>>>>> +    ring->wptr &= ring->ptr_mask;
>>>>> +    ring->count_dw--;
>>>>> +
>>>>> +    drm_dev_exit(idx);
>>>>> +}
>>>>> +
>>>>> +void amdgpu_ring_write_multiple(struct amdgpu_ring *ring,
>>>>> +                          void *src, int count_dw)
>>>>> +{
>>>>> +    unsigned occupied, chunk1, chunk2;
>>>>> +    void *dst;
>>>>> +    int idx;
>>>>> +
>>>>> +    if (!drm_dev_enter(&ring->adev->ddev, &idx))
>>>>> +        return;
>>>>> +
>>>>> +    if (unlikely(ring->count_dw < count_dw))
>>>>> +        DRM_ERROR("amdgpu: writing more dwords to the ring than 
>>>>> expected!\n");
>>>>> +
>>>>> +    occupied = ring->wptr & ring->buf_mask;
>>>>> +    dst = (void *)&ring->ring[occupied];
>>>>> +    chunk1 = ring->buf_mask + 1 - occupied;
>>>>> +    chunk1 = (chunk1 >= count_dw) ? count_dw: chunk1;
>>>>> +    chunk2 = count_dw - chunk1;
>>>>> +    chunk1 <<= 2;
>>>>> +    chunk2 <<= 2;
>>>>> +
>>>>> +    if (chunk1)
>>>>> +        memcpy(dst, src, chunk1);
>>>>> +
>>>>> +    if (chunk2) {
>>>>> +        src += chunk1;
>>>>> +        dst = (void *)ring->ring;
>>>>> +        memcpy(dst, src, chunk2);
>>>>> +    }
>>>>> +
>>>>> +    ring->wptr += count_dw;
>>>>> +    ring->wptr &= ring->ptr_mask;
>>>>> +    ring->count_dw -= count_dw;
>>>>> +
>>>>> +    drm_dev_exit(idx);
>>>>> +}
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>>>>> index accb243..f90b81f 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>>>>> @@ -300,53 +300,12 @@ static inline void
>>>>> amdgpu_ring_set_preempt_cond_exec(struct amdgpu_ring *ring,
>>>>>        *ring->cond_exe_cpu_addr = cond_exec;
>>>>>    }
>>>>>    -static inline void amdgpu_ring_clear_ring(struct amdgpu_ring 
>>>>> *ring)
>>>>> -{
>>>>> -    int i = 0;
>>>>> -    while (i <= ring->buf_mask)
>>>>> -        ring->ring[i++] = ring->funcs->nop;
>>>>> -
>>>>> -}
>>>>> -
>>>>> -static inline void amdgpu_ring_write(struct amdgpu_ring *ring, 
>>>>> uint32_t v)
>>>>> -{
>>>>> -    if (ring->count_dw <= 0)
>>>>> -        DRM_ERROR("amdgpu: writing more dwords to the ring than 
>>>>> expected!\n");
>>>>> -    ring->ring[ring->wptr++ & ring->buf_mask] = v;
>>>>> -    ring->wptr &= ring->ptr_mask;
>>>>> -    ring->count_dw--;
>>>>> -}
>>>>> +void amdgpu_ring_clear_ring(struct amdgpu_ring *ring);
>>>>>    -static inline void amdgpu_ring_write_multiple(struct 
>>>>> amdgpu_ring *ring,
>>>>> -                          void *src, int count_dw)
>>>>> -{
>>>>> -    unsigned occupied, chunk1, chunk2;
>>>>> -    void *dst;
>>>>> -
>>>>> -    if (unlikely(ring->count_dw < count_dw))
>>>>> -        DRM_ERROR("amdgpu: writing more dwords to the ring than 
>>>>> expected!\n");
>>>>> -
>>>>> -    occupied = ring->wptr & ring->buf_mask;
>>>>> -    dst = (void *)&ring->ring[occupied];
>>>>> -    chunk1 = ring->buf_mask + 1 - occupied;
>>>>> -    chunk1 = (chunk1 >= count_dw) ? count_dw: chunk1;
>>>>> -    chunk2 = count_dw - chunk1;
>>>>> -    chunk1 <<= 2;
>>>>> -    chunk2 <<= 2;
>>>>> +void amdgpu_ring_write(struct amdgpu_ring *ring, uint32_t v);
>>>>>    -    if (chunk1)
>>>>> -        memcpy(dst, src, chunk1);
>>>>> -
>>>>> -    if (chunk2) {
>>>>> -        src += chunk1;
>>>>> -        dst = (void *)ring->ring;
>>>>> -        memcpy(dst, src, chunk2);
>>>>> -    }
>>>>> -
>>>>> -    ring->wptr += count_dw;
>>>>> -    ring->wptr &= ring->ptr_mask;
>>>>> -    ring->count_dw -= count_dw;
>>>>> -}
>>>>> +void amdgpu_ring_write_multiple(struct amdgpu_ring *ring,
>>>>> +                          void *src, int count_dw);
>>>>>      int amdgpu_ring_test_helper(struct amdgpu_ring *ring);
>>>>>    diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
>>>>> b/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
>>>>> index bd4248c..b3ce5be 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
>>>>> @@ -269,10 +269,8 @@ static int psp_v11_0_bootloader_load_kdb(struct
>>>>> psp_context *psp)
>>>>>        if (ret)
>>>>>            return ret;
>>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>> -
>>>>>        /* Copy PSP KDB binary to memory */
>>>>> -    memcpy(psp->fw_pri_buf, psp->kdb_start_addr, psp->kdb_bin_size);
>>>>> +    psp_copy_fw(psp, psp->kdb_start_addr, psp->kdb_bin_size);
>>>>>          /* Provide the PSP KDB to bootloader */
>>>>>        WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>>> @@ -302,10 +300,8 @@ static int psp_v11_0_bootloader_load_spl(struct
>>>>> psp_context *psp)
>>>>>        if (ret)
>>>>>            return ret;
>>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>> -
>>>>>        /* Copy PSP SPL binary to memory */
>>>>> -    memcpy(psp->fw_pri_buf, psp->spl_start_addr, psp->spl_bin_size);
>>>>> +    psp_copy_fw(psp, psp->spl_start_addr, psp->spl_bin_size);
>>>>>          /* Provide the PSP SPL to bootloader */
>>>>>        WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>>> @@ -335,10 +331,8 @@ static int 
>>>>> psp_v11_0_bootloader_load_sysdrv(struct
>>>>> psp_context *psp)
>>>>>        if (ret)
>>>>>            return ret;
>>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>> -
>>>>>        /* Copy PSP System Driver binary to memory */
>>>>> -    memcpy(psp->fw_pri_buf, psp->sys_start_addr, psp->sys_bin_size);
>>>>> +    psp_copy_fw(psp, psp->sys_start_addr, psp->sys_bin_size);
>>>>>          /* Provide the sys driver to bootloader */
>>>>>        WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>>> @@ -371,10 +365,8 @@ static int psp_v11_0_bootloader_load_sos(struct
>>>>> psp_context *psp)
>>>>>        if (ret)
>>>>>            return ret;
>>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>> -
>>>>>        /* Copy Secure OS binary to PSP memory */
>>>>> -    memcpy(psp->fw_pri_buf, psp->sos_start_addr, psp->sos_bin_size);
>>>>> +    psp_copy_fw(psp, psp->sos_start_addr, psp->sos_bin_size);
>>>>>          /* Provide the PSP secure OS to bootloader */
>>>>>        WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
>>>>> b/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
>>>>> index c4828bd..618e5b6 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
>>>>> @@ -138,10 +138,8 @@ static int 
>>>>> psp_v12_0_bootloader_load_sysdrv(struct
>>>>> psp_context *psp)
>>>>>        if (ret)
>>>>>            return ret;
>>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>> -
>>>>>        /* Copy PSP System Driver binary to memory */
>>>>> -    memcpy(psp->fw_pri_buf, psp->sys_start_addr, psp->sys_bin_size);
>>>>> +    psp_copy_fw(psp, psp->sys_start_addr, psp->sys_bin_size);
>>>>>          /* Provide the sys driver to bootloader */
>>>>>        WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>>> @@ -179,10 +177,8 @@ static int psp_v12_0_bootloader_load_sos(struct
>>>>> psp_context *psp)
>>>>>        if (ret)
>>>>>            return ret;
>>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>> -
>>>>>        /* Copy Secure OS binary to PSP memory */
>>>>> -    memcpy(psp->fw_pri_buf, psp->sos_start_addr, psp->sos_bin_size);
>>>>> +    psp_copy_fw(psp, psp->sos_start_addr, psp->sos_bin_size);
>>>>>          /* Provide the PSP secure OS to bootloader */
>>>>>        WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
>>>>> b/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
>>>>> index f2e725f..d0a6cccd 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
>>>>> @@ -102,10 +102,8 @@ static int 
>>>>> psp_v3_1_bootloader_load_sysdrv(struct
>>>>> psp_context *psp)
>>>>>        if (ret)
>>>>>            return ret;
>>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>> -
>>>>>        /* Copy PSP System Driver binary to memory */
>>>>> -    memcpy(psp->fw_pri_buf, psp->sys_start_addr, psp->sys_bin_size);
>>>>> +    psp_copy_fw(psp, psp->sys_start_addr, psp->sys_bin_size);
>>>>>          /* Provide the sys driver to bootloader */
>>>>>        WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>>> @@ -143,10 +141,8 @@ static int psp_v3_1_bootloader_load_sos(struct
>>>>> psp_context *psp)
>>>>>        if (ret)
>>>>>            return ret;
>>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>> -
>>>>>        /* Copy Secure OS binary to PSP memory */
>>>>> -    memcpy(psp->fw_pri_buf, psp->sos_start_addr, psp->sos_bin_size);
>>>>> +    psp_copy_fw(psp, psp->sos_start_addr, psp->sos_bin_size);
>>>>>          /* Provide the PSP secure OS to bootloader */
>>>>>        WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>
>>
> _______________________________________________
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 196+ messages in thread

* Re: [PATCH v4 11/14] drm/amdgpu: Guard against write accesses after device removal
@ 2021-01-19 18:59             ` Christian König
  0 siblings, 0 replies; 196+ messages in thread
From: Christian König @ 2021-01-19 18:59 UTC (permalink / raw)
  To: Andrey Grodzovsky, Daniel Vetter
  Cc: Rob Herring, Greg KH, dri-devel, Anholt, Eric, Pekka Paalanen,
	amd-gfx list, Alex Deucher, Lucas Stach, Wentland, Harry,
	Christian König, Qiang Yu

On 19.01.21 at 19:22, Andrey Grodzovsky wrote:
>
> On 1/19/21 1:05 PM, Daniel Vetter wrote:
>> On Tue, Jan 19, 2021 at 4:35 PM Andrey Grodzovsky
>> <Andrey.Grodzovsky@amd.com> wrote:
>>> There is really no other way according to this article
>>> https://lwn.net/Articles/767885/
>>>
>>>
>>> "A perfect solution seems nearly impossible though; we cannot 
>>> acquire a mutex on
>>> the user
>>> to prevent them from yanking a device and we cannot check for a 
>>> presence change
>>> after every
>>> device access for performance reasons. "
>>>
>>> But I assumed srcu_read_lock should be pretty seamless performance-wise, no?
>> The read side is supposed to be dirt cheap, the write side is where we
>> just stall for all readers to eventually complete on their own.
>> Definitely should be much cheaper than mmio read, on the mmio write
>> side it might actually hurt a bit. Otoh I think those don't stall the
>> cpu by default when they're timing out, so maybe if the overhead is
>> too much for those, we could omit them?
>>
>> Maybe just do a small microbenchmark for these for testing, with a
>> register that doesn't change hw state. So with and without
>> drm_dev_enter/exit, and also one with the hw plugged out so that we
>> have actual timeouts in the transactions.
>> -Daniel
>
>
> So, say, writing to some harmless scratch register in a loop many times,
> for both the plugged and unplugged cases, and measuring the total time
> delta?

I think we should at least measure the following:

1. Writing X times to a scratch reg without your patch.
2. Writing X times to a scratch reg with your patch.
3. Writing X times to a scratch reg with the hardware physically 
disconnected.

I suggest repeating that once for Polaris (or older) and once for Vega
or Navi.

The SRBM on Polaris is meant to introduce some delay in each access, so 
it might react differently than the newer hardware.
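
Roughly something like this minimal sketch (the scratch register, the
iteration count and however you trigger it are placeholders, not actual
amdgpu code):

/* Time num_writes writes to a harmless scratch register; run it once
 * before and once after the unplug, with and without the patch.
 * mmSCRATCH_REG0 is an assumed register name for this sketch.
 */
static void amdgpu_bench_scratch_writes(struct amdgpu_device *adev)
{
	const u32 num_writes = 1000000;
	ktime_t start, end;
	u32 i;

	start = ktime_get();
	for (i = 0; i < num_writes; i++)
		WREG32(mmSCRATCH_REG0, i);
	end = ktime_get();

	DRM_INFO("%u scratch reg writes took %lld ns\n", num_writes,
		 ktime_to_ns(ktime_sub(end, start)));
}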

Christian.

>
> Andrey
>
>
>>
>>> The other solution would be, as I suggested, to keep all the device
>>> IO ranges reserved and the system memory pages unfreed until the
>>> device is finalized in the driver, but Daniel said this would upset
>>> the PCI layer (the MMIO ranges reservation part).
>>>
>>> Andrey
>>>
>>>
>>>
>>>
>>> On 1/19/21 3:55 AM, Christian König wrote:
>>>> On 18.01.21 at 22:01, Andrey Grodzovsky wrote:
>>>>> This should prevent writing to memory or IO ranges possibly
>>>>> already allocated for other uses after our device is removed.
>>>> Wow, that adds quite some overhead to every register access. I'm 
>>>> not sure we
>>>> can do this.
>>>>
>>>> Christian.
>>>>
>>>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>>>>> ---
>>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 57 
>>>>> ++++++++++++++++++++++++
>>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c    |  9 ++++
>>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c    | 53 
>>>>> +++++++++++++---------
>>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h    |  3 ++
>>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c   | 70 
>>>>> ++++++++++++++++++++++++++++++
>>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h   | 49 
>>>>> ++-------------------
>>>>>    drivers/gpu/drm/amd/amdgpu/psp_v11_0.c     | 16 ++-----
>>>>>    drivers/gpu/drm/amd/amdgpu/psp_v12_0.c     |  8 +---
>>>>>    drivers/gpu/drm/amd/amdgpu/psp_v3_1.c      |  8 +---
>>>>>    9 files changed, 184 insertions(+), 89 deletions(-)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>> index e99f4f1..0a9d73c 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>> @@ -72,6 +72,8 @@
>>>>>      #include <linux/iommu.h>
>>>>>    +#include <drm/drm_drv.h>
>>>>> +
>>>>>    MODULE_FIRMWARE("amdgpu/vega10_gpu_info.bin");
>>>>>    MODULE_FIRMWARE("amdgpu/vega12_gpu_info.bin");
>>>>>    MODULE_FIRMWARE("amdgpu/raven_gpu_info.bin");
>>>>> @@ -404,13 +406,21 @@ uint8_t amdgpu_mm_rreg8(struct amdgpu_device 
>>>>> *adev,
>>>>> uint32_t offset)
>>>>>     */
>>>>>    void amdgpu_mm_wreg8(struct amdgpu_device *adev, uint32_t 
>>>>> offset, uint8_t
>>>>> value)
>>>>>    {
>>>>> +    int idx;
>>>>> +
>>>>>        if (adev->in_pci_err_recovery)
>>>>>            return;
>>>>>    +
>>>>> +    if (!drm_dev_enter(&adev->ddev, &idx))
>>>>> +        return;
>>>>> +
>>>>>        if (offset < adev->rmmio_size)
>>>>>            writeb(value, adev->rmmio + offset);
>>>>>        else
>>>>>            BUG();
>>>>> +
>>>>> +    drm_dev_exit(idx);
>>>>>    }
>>>>>      /**
>>>>> @@ -427,9 +437,14 @@ void amdgpu_device_wreg(struct amdgpu_device 
>>>>> *adev,
>>>>>                uint32_t reg, uint32_t v,
>>>>>                uint32_t acc_flags)
>>>>>    {
>>>>> +    int idx;
>>>>> +
>>>>>        if (adev->in_pci_err_recovery)
>>>>>            return;
>>>>>    +    if (!drm_dev_enter(&adev->ddev, &idx))
>>>>> +        return;
>>>>> +
>>>>>        if ((reg * 4) < adev->rmmio_size) {
>>>>>            if (!(acc_flags & AMDGPU_REGS_NO_KIQ) &&
>>>>>                amdgpu_sriov_runtime(adev) &&
>>>>> @@ -444,6 +459,8 @@ void amdgpu_device_wreg(struct amdgpu_device 
>>>>> *adev,
>>>>>        }
>>>>> trace_amdgpu_device_wreg(adev->pdev->device, reg, v);
>>>>> +
>>>>> +    drm_dev_exit(idx);
>>>>>    }
>>>>>      /*
>>>>> @@ -454,9 +471,14 @@ void amdgpu_device_wreg(struct amdgpu_device 
>>>>> *adev,
>>>>>    void amdgpu_mm_wreg_mmio_rlc(struct amdgpu_device *adev,
>>>>>                     uint32_t reg, uint32_t v)
>>>>>    {
>>>>> +    int idx;
>>>>> +
>>>>>        if (adev->in_pci_err_recovery)
>>>>>            return;
>>>>>    +    if (!drm_dev_enter(&adev->ddev, &idx))
>>>>> +        return;
>>>>> +
>>>>>        if (amdgpu_sriov_fullaccess(adev) &&
>>>>>            adev->gfx.rlc.funcs &&
>>>>> adev->gfx.rlc.funcs->is_rlcg_access_range) {
>>>>> @@ -465,6 +487,8 @@ void amdgpu_mm_wreg_mmio_rlc(struct 
>>>>> amdgpu_device *adev,
>>>>>        } else {
>>>>>            writel(v, ((void __iomem *)adev->rmmio) + (reg * 4));
>>>>>        }
>>>>> +
>>>>> +    drm_dev_exit(idx);
>>>>>    }
>>>>>      /**
>>>>> @@ -499,15 +523,22 @@ u32 amdgpu_io_rreg(struct amdgpu_device 
>>>>> *adev, u32 reg)
>>>>>     */
>>>>>    void amdgpu_io_wreg(struct amdgpu_device *adev, u32 reg, u32 v)
>>>>>    {
>>>>> +    int idx;
>>>>> +
>>>>>        if (adev->in_pci_err_recovery)
>>>>>            return;
>>>>>    +    if (!drm_dev_enter(&adev->ddev, &idx))
>>>>> +        return;
>>>>> +
>>>>>        if ((reg * 4) < adev->rio_mem_size)
>>>>>            iowrite32(v, adev->rio_mem + (reg * 4));
>>>>>        else {
>>>>>            iowrite32((reg * 4), adev->rio_mem + (mmMM_INDEX * 4));
>>>>>            iowrite32(v, adev->rio_mem + (mmMM_DATA * 4));
>>>>>        }
>>>>> +
>>>>> +    drm_dev_exit(idx);
>>>>>    }
>>>>>      /**
>>>>> @@ -544,14 +575,21 @@ u32 amdgpu_mm_rdoorbell(struct amdgpu_device 
>>>>> *adev, u32
>>>>> index)
>>>>>     */
>>>>>    void amdgpu_mm_wdoorbell(struct amdgpu_device *adev, u32 index, 
>>>>> u32 v)
>>>>>    {
>>>>> +    int idx;
>>>>> +
>>>>>        if (adev->in_pci_err_recovery)
>>>>>            return;
>>>>>    +    if (!drm_dev_enter(&adev->ddev, &idx))
>>>>> +        return;
>>>>> +
>>>>>        if (index < adev->doorbell.num_doorbells) {
>>>>>            writel(v, adev->doorbell.ptr + index);
>>>>>        } else {
>>>>>            DRM_ERROR("writing beyond doorbell aperture: 
>>>>> 0x%08x!\n", index);
>>>>>        }
>>>>> +
>>>>> +    drm_dev_exit(idx);
>>>>>    }
>>>>>      /**
>>>>> @@ -588,14 +626,21 @@ u64 amdgpu_mm_rdoorbell64(struct 
>>>>> amdgpu_device *adev,
>>>>> u32 index)
>>>>>     */
>>>>>    void amdgpu_mm_wdoorbell64(struct amdgpu_device *adev, u32 
>>>>> index, u64 v)
>>>>>    {
>>>>> +    int idx;
>>>>> +
>>>>>        if (adev->in_pci_err_recovery)
>>>>>            return;
>>>>>    +    if (!drm_dev_enter(&adev->ddev, &idx))
>>>>> +        return;
>>>>> +
>>>>>        if (index < adev->doorbell.num_doorbells) {
>>>>>            atomic64_set((atomic64_t *)(adev->doorbell.ptr + 
>>>>> index), v);
>>>>>        } else {
>>>>>            DRM_ERROR("writing beyond doorbell aperture: 
>>>>> 0x%08x!\n", index);
>>>>>        }
>>>>> +
>>>>> +    drm_dev_exit(idx);
>>>>>    }
>>>>>      /**
>>>>> @@ -682,6 +727,10 @@ void amdgpu_device_indirect_wreg(struct 
>>>>> amdgpu_device
>>>>> *adev,
>>>>>        unsigned long flags;
>>>>>        void __iomem *pcie_index_offset;
>>>>>        void __iomem *pcie_data_offset;
>>>>> +    int idx;
>>>>> +
>>>>> +    if (!drm_dev_enter(&adev->ddev, &idx))
>>>>> +        return;
>>>>>          spin_lock_irqsave(&adev->pcie_idx_lock, flags);
>>>>>        pcie_index_offset = (void __iomem *)adev->rmmio + 
>>>>> pcie_index * 4;
>>>>> @@ -692,6 +741,8 @@ void amdgpu_device_indirect_wreg(struct 
>>>>> amdgpu_device *adev,
>>>>>        writel(reg_data, pcie_data_offset);
>>>>>        readl(pcie_data_offset);
>>>>>        spin_unlock_irqrestore(&adev->pcie_idx_lock, flags);
>>>>> +
>>>>> +    drm_dev_exit(idx);
>>>>>    }
>>>>>      /**
>>>>> @@ -711,6 +762,10 @@ void amdgpu_device_indirect_wreg64(struct 
>>>>> amdgpu_device
>>>>> *adev,
>>>>>        unsigned long flags;
>>>>>        void __iomem *pcie_index_offset;
>>>>>        void __iomem *pcie_data_offset;
>>>>> +    int idx;
>>>>> +
>>>>> +    if (!drm_dev_enter(&adev->ddev, &idx))
>>>>> +        return;
>>>>>          spin_lock_irqsave(&adev->pcie_idx_lock, flags);
>>>>>        pcie_index_offset = (void __iomem *)adev->rmmio + 
>>>>> pcie_index * 4;
>>>>> @@ -727,6 +782,8 @@ void amdgpu_device_indirect_wreg64(struct 
>>>>> amdgpu_device
>>>>> *adev,
>>>>>        writel((u32)(reg_data >> 32), pcie_data_offset);
>>>>>        readl(pcie_data_offset);
>>>>>        spin_unlock_irqrestore(&adev->pcie_idx_lock, flags);
>>>>> +
>>>>> +    drm_dev_exit(idx);
>>>>>    }
>>>>>      /**
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
>>>>> index fe1a39f..1beb4e6 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
>>>>> @@ -31,6 +31,8 @@
>>>>>    #include "amdgpu_ras.h"
>>>>>    #include "amdgpu_xgmi.h"
>>>>>    +#include <drm/drm_drv.h>
>>>>> +
>>>>>    /**
>>>>>     * amdgpu_gmc_get_pde_for_bo - get the PDE for a BO
>>>>>     *
>>>>> @@ -98,6 +100,10 @@ int amdgpu_gmc_set_pte_pde(struct 
>>>>> amdgpu_device *adev,
>>>>> void *cpu_pt_addr,
>>>>>    {
>>>>>        void __iomem *ptr = (void *)cpu_pt_addr;
>>>>>        uint64_t value;
>>>>> +    int idx;
>>>>> +
>>>>> +    if (!drm_dev_enter(&adev->ddev, &idx))
>>>>> +        return 0;
>>>>>          /*
>>>>>         * The following is for PTE only. GART does not have PDEs.
>>>>> @@ -105,6 +111,9 @@ int amdgpu_gmc_set_pte_pde(struct 
>>>>> amdgpu_device *adev,
>>>>> void *cpu_pt_addr,
>>>>>        value = addr & 0x0000FFFFFFFFF000ULL;
>>>>>        value |= flags;
>>>>>        writeq(value, ptr + (gpu_page_idx * 8));
>>>>> +
>>>>> +    drm_dev_exit(idx);
>>>>> +
>>>>>        return 0;
>>>>>    }
>>>>>    diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
>>>>> index 523d22d..89e2bfe 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
>>>>> @@ -37,6 +37,8 @@
>>>>>      #include "amdgpu_ras.h"
>>>>>    +#include <drm/drm_drv.h>
>>>>> +
>>>>>    static int psp_sysfs_init(struct amdgpu_device *adev);
>>>>>    static void psp_sysfs_fini(struct amdgpu_device *adev);
>>>>>    @@ -248,7 +250,7 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>>>>               struct psp_gfx_cmd_resp *cmd, uint64_t fence_mc_addr)
>>>>>    {
>>>>>        int ret;
>>>>> -    int index;
>>>>> +    int index, idx;
>>>>>        int timeout = 2000;
>>>>>        bool ras_intr = false;
>>>>>        bool skip_unsupport = false;
>>>>> @@ -256,6 +258,9 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>>>>        if (psp->adev->in_pci_err_recovery)
>>>>>            return 0;
>>>>>    +    if (!drm_dev_enter(&psp->adev->ddev, &idx))
>>>>> +        return 0;
>>>>> +
>>>>>        mutex_lock(&psp->mutex);
>>>>>          memset(psp->cmd_buf_mem, 0, PSP_CMD_BUFFER_SIZE);
>>>>> @@ -266,8 +271,7 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>>>>        ret = psp_ring_cmd_submit(psp, psp->cmd_buf_mc_addr, 
>>>>> fence_mc_addr,
>>>>> index);
>>>>>        if (ret) {
>>>>>            atomic_dec(&psp->fence_value);
>>>>> -        mutex_unlock(&psp->mutex);
>>>>> -        return ret;
>>>>> +        goto exit;
>>>>>        }
>>>>>          amdgpu_asic_invalidate_hdp(psp->adev, NULL);
>>>>> @@ -307,8 +311,8 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>>>>                 psp->cmd_buf_mem->cmd_id,
>>>>>                 psp->cmd_buf_mem->resp.status);
>>>>>            if (!timeout) {
>>>>> -            mutex_unlock(&psp->mutex);
>>>>> -            return -EINVAL;
>>>>> +            ret = -EINVAL;
>>>>> +            goto exit;
>>>>>            }
>>>>>        }
>>>>>    @@ -316,8 +320,10 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>>>>            ucode->tmr_mc_addr_lo = psp->cmd_buf_mem->resp.fw_addr_lo;
>>>>>            ucode->tmr_mc_addr_hi = psp->cmd_buf_mem->resp.fw_addr_hi;
>>>>>        }
>>>>> -    mutex_unlock(&psp->mutex);
>>>>>    +exit:
>>>>> +    mutex_unlock(&psp->mutex);
>>>>> +    drm_dev_exit(idx);
>>>>>        return ret;
>>>>>    }
>>>>>    @@ -354,8 +360,7 @@ static int psp_load_toc(struct psp_context 
>>>>> *psp,
>>>>>        if (!cmd)
>>>>>            return -ENOMEM;
>>>>>        /* Copy toc to psp firmware private buffer */
>>>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>> -    memcpy(psp->fw_pri_buf, psp->toc_start_addr, psp->toc_bin_size);
>>>>> +    psp_copy_fw(psp, psp->toc_start_addr, psp->toc_bin_size);
>>>>>          psp_prep_load_toc_cmd_buf(cmd, psp->fw_pri_mc_addr, 
>>>>> psp->toc_bin_size);
>>>>>    @@ -570,8 +575,7 @@ static int psp_asd_load(struct psp_context 
>>>>> *psp)
>>>>>        if (!cmd)
>>>>>            return -ENOMEM;
>>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>> -    memcpy(psp->fw_pri_buf, psp->asd_start_addr, 
>>>>> psp->asd_ucode_size);
>>>>> +    psp_copy_fw(psp, psp->asd_start_addr, psp->asd_ucode_size);
>>>>>          psp_prep_asd_load_cmd_buf(cmd, psp->fw_pri_mc_addr,
>>>>>                      psp->asd_ucode_size);
>>>>> @@ -726,8 +730,7 @@ static int psp_xgmi_load(struct psp_context *psp)
>>>>>        if (!cmd)
>>>>>            return -ENOMEM;
>>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>> -    memcpy(psp->fw_pri_buf, psp->ta_xgmi_start_addr, 
>>>>> psp->ta_xgmi_ucode_size);
>>>>> +    psp_copy_fw(psp, psp->ta_xgmi_start_addr, 
>>>>> psp->ta_xgmi_ucode_size);
>>>>>          psp_prep_ta_load_cmd_buf(cmd,
>>>>>                     psp->fw_pri_mc_addr,
>>>>> @@ -982,8 +985,7 @@ static int psp_ras_load(struct psp_context *psp)
>>>>>        if (!cmd)
>>>>>            return -ENOMEM;
>>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>> -    memcpy(psp->fw_pri_buf, psp->ta_ras_start_addr, 
>>>>> psp->ta_ras_ucode_size);
>>>>> +    psp_copy_fw(psp, psp->ta_ras_start_addr, 
>>>>> psp->ta_ras_ucode_size);
>>>>>          psp_prep_ta_load_cmd_buf(cmd,
>>>>>                     psp->fw_pri_mc_addr,
>>>>> @@ -1219,8 +1221,7 @@ static int psp_hdcp_load(struct psp_context 
>>>>> *psp)
>>>>>        if (!cmd)
>>>>>            return -ENOMEM;
>>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>> -    memcpy(psp->fw_pri_buf, psp->ta_hdcp_start_addr,
>>>>> +    psp_copy_fw(psp, psp->ta_hdcp_start_addr,
>>>>>               psp->ta_hdcp_ucode_size);
>>>>>          psp_prep_ta_load_cmd_buf(cmd,
>>>>> @@ -1366,8 +1367,7 @@ static int psp_dtm_load(struct psp_context 
>>>>> *psp)
>>>>>        if (!cmd)
>>>>>            return -ENOMEM;
>>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>> -    memcpy(psp->fw_pri_buf, psp->ta_dtm_start_addr, 
>>>>> psp->ta_dtm_ucode_size);
>>>>> +    psp_copy_fw(psp, psp->ta_dtm_start_addr, 
>>>>> psp->ta_dtm_ucode_size);
>>>>>          psp_prep_ta_load_cmd_buf(cmd,
>>>>>                     psp->fw_pri_mc_addr,
>>>>> @@ -1507,8 +1507,7 @@ static int psp_rap_load(struct psp_context 
>>>>> *psp)
>>>>>        if (!cmd)
>>>>>            return -ENOMEM;
>>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>> -    memcpy(psp->fw_pri_buf, psp->ta_rap_start_addr, 
>>>>> psp->ta_rap_ucode_size);
>>>>> +    psp_copy_fw(psp, psp->ta_rap_start_addr, 
>>>>> psp->ta_rap_ucode_size);
>>>>>          psp_prep_ta_load_cmd_buf(cmd,
>>>>>                     psp->fw_pri_mc_addr,
>>>>> @@ -2778,6 +2777,20 @@ static ssize_t 
>>>>> psp_usbc_pd_fw_sysfs_write(struct
>>>>> device *dev,
>>>>>        return count;
>>>>>    }
>>>>>    +void psp_copy_fw(struct psp_context *psp, uint8_t *start_addr, 
>>>>> uint32_t
>>>>> bin_size)
>>>>> +{
>>>>> +    int idx;
>>>>> +
>>>>> +    if (!drm_dev_enter(&psp->adev->ddev, &idx))
>>>>> +        return;
>>>>> +
>>>>> +    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>> +    memcpy(psp->fw_pri_buf, start_addr, bin_size);
>>>>> +
>>>>> +    drm_dev_exit(idx);
>>>>> +}
>>>>> +
>>>>> +
>>>>>    static DEVICE_ATTR(usbc_pd_fw, S_IRUGO | S_IWUSR,
>>>>>               psp_usbc_pd_fw_sysfs_read,
>>>>>               psp_usbc_pd_fw_sysfs_write);
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
>>>>> index da250bc..ac69314 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
>>>>> @@ -400,4 +400,7 @@ int psp_init_ta_microcode(struct psp_context 
>>>>> *psp,
>>>>>                  const char *chip_name);
>>>>>    int psp_get_fw_attestation_records_addr(struct psp_context *psp,
>>>>>                        uint64_t *output_ptr);
>>>>> +
>>>>> +void psp_copy_fw(struct psp_context *psp, uint8_t *start_addr, 
>>>>> uint32_t
>>>>> bin_size);
>>>>> +
>>>>>    #endif
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>>>>> index 1a612f5..d656494 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>>>>> @@ -35,6 +35,8 @@
>>>>>    #include "amdgpu.h"
>>>>>    #include "atom.h"
>>>>>    +#include <drm/drm_drv.h>
>>>>> +
>>>>>    /*
>>>>>     * Rings
>>>>>     * Most engines on the GPU are fed via ring buffers. Ring
>>>>> @@ -463,3 +465,71 @@ int amdgpu_ring_test_helper(struct 
>>>>> amdgpu_ring *ring)
>>>>>        ring->sched.ready = !r;
>>>>>        return r;
>>>>>    }
>>>>> +
>>>>> +void amdgpu_ring_clear_ring(struct amdgpu_ring *ring)
>>>>> +{
>>>>> +    int idx;
>>>>> +    int i = 0;
>>>>> +
>>>>> +    if (!drm_dev_enter(&ring->adev->ddev, &idx))
>>>>> +        return;
>>>>> +
>>>>> +    while (i <= ring->buf_mask)
>>>>> +        ring->ring[i++] = ring->funcs->nop;
>>>>> +
>>>>> +    drm_dev_exit(idx);
>>>>> +
>>>>> +}
>>>>> +
>>>>> +void amdgpu_ring_write(struct amdgpu_ring *ring, uint32_t v)
>>>>> +{
>>>>> +    int idx;
>>>>> +
>>>>> +    if (!drm_dev_enter(&ring->adev->ddev, &idx))
>>>>> +        return;
>>>>> +
>>>>> +    if (ring->count_dw <= 0)
>>>>> +        DRM_ERROR("amdgpu: writing more dwords to the ring than 
>>>>> expected!\n");
>>>>> +    ring->ring[ring->wptr++ & ring->buf_mask] = v;
>>>>> +    ring->wptr &= ring->ptr_mask;
>>>>> +    ring->count_dw--;
>>>>> +
>>>>> +    drm_dev_exit(idx);
>>>>> +}
>>>>> +
>>>>> +void amdgpu_ring_write_multiple(struct amdgpu_ring *ring,
>>>>> +                          void *src, int count_dw)
>>>>> +{
>>>>> +    unsigned occupied, chunk1, chunk2;
>>>>> +    void *dst;
>>>>> +    int idx;
>>>>> +
>>>>> +    if (!drm_dev_enter(&ring->adev->ddev, &idx))
>>>>> +        return;
>>>>> +
>>>>> +    if (unlikely(ring->count_dw < count_dw))
>>>>> +        DRM_ERROR("amdgpu: writing more dwords to the ring than 
>>>>> expected!\n");
>>>>> +
>>>>> +    occupied = ring->wptr & ring->buf_mask;
>>>>> +    dst = (void *)&ring->ring[occupied];
>>>>> +    chunk1 = ring->buf_mask + 1 - occupied;
>>>>> +    chunk1 = (chunk1 >= count_dw) ? count_dw: chunk1;
>>>>> +    chunk2 = count_dw - chunk1;
>>>>> +    chunk1 <<= 2;
>>>>> +    chunk2 <<= 2;
>>>>> +
>>>>> +    if (chunk1)
>>>>> +        memcpy(dst, src, chunk1);
>>>>> +
>>>>> +    if (chunk2) {
>>>>> +        src += chunk1;
>>>>> +        dst = (void *)ring->ring;
>>>>> +        memcpy(dst, src, chunk2);
>>>>> +    }
>>>>> +
>>>>> +    ring->wptr += count_dw;
>>>>> +    ring->wptr &= ring->ptr_mask;
>>>>> +    ring->count_dw -= count_dw;
>>>>> +
>>>>> +    drm_dev_exit(idx);
>>>>> +}
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>>>>> index accb243..f90b81f 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>>>>> @@ -300,53 +300,12 @@ static inline void
>>>>> amdgpu_ring_set_preempt_cond_exec(struct amdgpu_ring *ring,
>>>>>        *ring->cond_exe_cpu_addr = cond_exec;
>>>>>    }
>>>>>    -static inline void amdgpu_ring_clear_ring(struct amdgpu_ring 
>>>>> *ring)
>>>>> -{
>>>>> -    int i = 0;
>>>>> -    while (i <= ring->buf_mask)
>>>>> -        ring->ring[i++] = ring->funcs->nop;
>>>>> -
>>>>> -}
>>>>> -
>>>>> -static inline void amdgpu_ring_write(struct amdgpu_ring *ring, 
>>>>> uint32_t v)
>>>>> -{
>>>>> -    if (ring->count_dw <= 0)
>>>>> -        DRM_ERROR("amdgpu: writing more dwords to the ring than 
>>>>> expected!\n");
>>>>> -    ring->ring[ring->wptr++ & ring->buf_mask] = v;
>>>>> -    ring->wptr &= ring->ptr_mask;
>>>>> -    ring->count_dw--;
>>>>> -}
>>>>> +void amdgpu_ring_clear_ring(struct amdgpu_ring *ring);
>>>>>    -static inline void amdgpu_ring_write_multiple(struct 
>>>>> amdgpu_ring *ring,
>>>>> -                          void *src, int count_dw)
>>>>> -{
>>>>> -    unsigned occupied, chunk1, chunk2;
>>>>> -    void *dst;
>>>>> -
>>>>> -    if (unlikely(ring->count_dw < count_dw))
>>>>> -        DRM_ERROR("amdgpu: writing more dwords to the ring than 
>>>>> expected!\n");
>>>>> -
>>>>> -    occupied = ring->wptr & ring->buf_mask;
>>>>> -    dst = (void *)&ring->ring[occupied];
>>>>> -    chunk1 = ring->buf_mask + 1 - occupied;
>>>>> -    chunk1 = (chunk1 >= count_dw) ? count_dw: chunk1;
>>>>> -    chunk2 = count_dw - chunk1;
>>>>> -    chunk1 <<= 2;
>>>>> -    chunk2 <<= 2;
>>>>> +void amdgpu_ring_write(struct amdgpu_ring *ring, uint32_t v);
>>>>>    -    if (chunk1)
>>>>> -        memcpy(dst, src, chunk1);
>>>>> -
>>>>> -    if (chunk2) {
>>>>> -        src += chunk1;
>>>>> -        dst = (void *)ring->ring;
>>>>> -        memcpy(dst, src, chunk2);
>>>>> -    }
>>>>> -
>>>>> -    ring->wptr += count_dw;
>>>>> -    ring->wptr &= ring->ptr_mask;
>>>>> -    ring->count_dw -= count_dw;
>>>>> -}
>>>>> +void amdgpu_ring_write_multiple(struct amdgpu_ring *ring,
>>>>> +                          void *src, int count_dw);
>>>>>      int amdgpu_ring_test_helper(struct amdgpu_ring *ring);
>>>>>    diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
>>>>> b/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
>>>>> index bd4248c..b3ce5be 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
>>>>> @@ -269,10 +269,8 @@ static int psp_v11_0_bootloader_load_kdb(struct
>>>>> psp_context *psp)
>>>>>        if (ret)
>>>>>            return ret;
>>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>> -
>>>>>        /* Copy PSP KDB binary to memory */
>>>>> -    memcpy(psp->fw_pri_buf, psp->kdb_start_addr, psp->kdb_bin_size);
>>>>> +    psp_copy_fw(psp, psp->kdb_start_addr, psp->kdb_bin_size);
>>>>>          /* Provide the PSP KDB to bootloader */
>>>>>        WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>>> @@ -302,10 +300,8 @@ static int psp_v11_0_bootloader_load_spl(struct
>>>>> psp_context *psp)
>>>>>        if (ret)
>>>>>            return ret;
>>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>> -
>>>>>        /* Copy PSP SPL binary to memory */
>>>>> -    memcpy(psp->fw_pri_buf, psp->spl_start_addr, psp->spl_bin_size);
>>>>> +    psp_copy_fw(psp, psp->spl_start_addr, psp->spl_bin_size);
>>>>>          /* Provide the PSP SPL to bootloader */
>>>>>        WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>>> @@ -335,10 +331,8 @@ static int 
>>>>> psp_v11_0_bootloader_load_sysdrv(struct
>>>>> psp_context *psp)
>>>>>        if (ret)
>>>>>            return ret;
>>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>> -
>>>>>        /* Copy PSP System Driver binary to memory */
>>>>> -    memcpy(psp->fw_pri_buf, psp->sys_start_addr, psp->sys_bin_size);
>>>>> +    psp_copy_fw(psp, psp->sys_start_addr, psp->sys_bin_size);
>>>>>          /* Provide the sys driver to bootloader */
>>>>>        WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>>> @@ -371,10 +365,8 @@ static int psp_v11_0_bootloader_load_sos(struct
>>>>> psp_context *psp)
>>>>>        if (ret)
>>>>>            return ret;
>>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>> -
>>>>>        /* Copy Secure OS binary to PSP memory */
>>>>> -    memcpy(psp->fw_pri_buf, psp->sos_start_addr, psp->sos_bin_size);
>>>>> +    psp_copy_fw(psp, psp->sos_start_addr, psp->sos_bin_size);
>>>>>          /* Provide the PSP secure OS to bootloader */
>>>>>        WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
>>>>> b/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
>>>>> index c4828bd..618e5b6 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
>>>>> @@ -138,10 +138,8 @@ static int 
>>>>> psp_v12_0_bootloader_load_sysdrv(struct
>>>>> psp_context *psp)
>>>>>        if (ret)
>>>>>            return ret;
>>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>> -
>>>>>        /* Copy PSP System Driver binary to memory */
>>>>> -    memcpy(psp->fw_pri_buf, psp->sys_start_addr, psp->sys_bin_size);
>>>>> +    psp_copy_fw(psp, psp->sys_start_addr, psp->sys_bin_size);
>>>>>          /* Provide the sys driver to bootloader */
>>>>>        WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>>> @@ -179,10 +177,8 @@ static int psp_v12_0_bootloader_load_sos(struct
>>>>> psp_context *psp)
>>>>>        if (ret)
>>>>>            return ret;
>>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>> -
>>>>>        /* Copy Secure OS binary to PSP memory */
>>>>> -    memcpy(psp->fw_pri_buf, psp->sos_start_addr, psp->sos_bin_size);
>>>>> +    psp_copy_fw(psp, psp->sos_start_addr, psp->sos_bin_size);
>>>>>          /* Provide the PSP secure OS to bootloader */
>>>>>        WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
>>>>> b/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
>>>>> index f2e725f..d0a6cccd 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
>>>>> @@ -102,10 +102,8 @@ static int 
>>>>> psp_v3_1_bootloader_load_sysdrv(struct
>>>>> psp_context *psp)
>>>>>        if (ret)
>>>>>            return ret;
>>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>> -
>>>>>        /* Copy PSP System Driver binary to memory */
>>>>> -    memcpy(psp->fw_pri_buf, psp->sys_start_addr, psp->sys_bin_size);
>>>>> +    psp_copy_fw(psp, psp->sys_start_addr, psp->sys_bin_size);
>>>>>          /* Provide the sys driver to bootloader */
>>>>>        WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>>> @@ -143,10 +141,8 @@ static int psp_v3_1_bootloader_load_sos(struct
>>>>> psp_context *psp)
>>>>>        if (ret)
>>>>>            return ret;
>>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>> -
>>>>>        /* Copy Secure OS binary to PSP memory */
>>>>> -    memcpy(psp->fw_pri_buf, psp->sos_start_addr, psp->sos_bin_size);
>>>>> +    psp_copy_fw(psp, psp->sos_start_addr, psp->sos_bin_size);
>>>>>          /* Provide the PSP secure OS to bootloader */
>>>>>        WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>
>>
> _______________________________________________
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 196+ messages in thread

* Re: [PATCH v4 10/14] drm/amdgpu: Move some sysfs attrs creation to default_attr
  2021-01-19 17:47         ` Greg KH
@ 2021-01-19 19:04           ` Alex Deucher
  -1 siblings, 0 replies; 196+ messages in thread
From: Alex Deucher @ 2021-01-19 19:04 UTC (permalink / raw)
  To: Greg KH
  Cc: Daniel Vetter, amd-gfx list, Maling list - DRI developers,
	Christian König, Deucher, Alexander, Qiang Yu

On Tue, Jan 19, 2021 at 1:26 PM Greg KH <gregkh@linuxfoundation.org> wrote:
>
> On Tue, Jan 19, 2021 at 11:36:01AM -0500, Andrey Grodzovsky wrote:
> >
> > On 1/19/21 2:34 AM, Greg KH wrote:
> > > On Mon, Jan 18, 2021 at 04:01:19PM -0500, Andrey Grodzovsky wrote:
> > > >   static struct pci_driver amdgpu_kms_pci_driver = {
> > > >           .name = DRIVER_NAME,
> > > >           .id_table = pciidlist,
> > > > @@ -1595,6 +1607,7 @@ static struct pci_driver amdgpu_kms_pci_driver = {
> > > >           .shutdown = amdgpu_pci_shutdown,
> > > >           .driver.pm = &amdgpu_pm_ops,
> > > >           .err_handler = &amdgpu_pci_err_handler,
> > > > + .driver.dev_groups = amdgpu_sysfs_groups,
> > > Shouldn't this just be:
> > >     groups = amdgpu_sysfs_groups,
> > >
> > > Why go to the "driver root" here?
> >
> >
> > Because I still didn't get to your suggestion to propose a patch to add groups to
> > pci_driver, it's located in 'base' driver struct.
>
> You are a pci driver, you should never have to mess with the "base"
> driver struct.  Look at commit 92d50fc1602e ("PCI/IB: add support for
> pci driver attribute groups") which got merged in 4.14, way back in
> 2017 :)

Per the previous discussion of this patch set:
https://www.mail-archive.com/amd-gfx@lists.freedesktop.org/msg56019.html

Alex

>
> driver.pm also looks odd, but I'm just going to ignore that for now...
>
> thanks,
>
> greg k-h
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 196+ messages in thread

* Re: [PATCH v4 10/14] drm/amdgpu: Move some sysfs attrs creation to default_attr
@ 2021-01-19 19:04           ` Alex Deucher
  0 siblings, 0 replies; 196+ messages in thread
From: Alex Deucher @ 2021-01-19 19:04 UTC (permalink / raw)
  To: Greg KH
  Cc: Andrey Grodzovsky, Daniel Vetter, amd-gfx list,
	Maling list - DRI developers, Christian König, Deucher,
	Alexander, Qiang Yu

On Tue, Jan 19, 2021 at 1:26 PM Greg KH <gregkh@linuxfoundation.org> wrote:
>
> On Tue, Jan 19, 2021 at 11:36:01AM -0500, Andrey Grodzovsky wrote:
> >
> > On 1/19/21 2:34 AM, Greg KH wrote:
> > > On Mon, Jan 18, 2021 at 04:01:19PM -0500, Andrey Grodzovsky wrote:
> > > >   static struct pci_driver amdgpu_kms_pci_driver = {
> > > >           .name = DRIVER_NAME,
> > > >           .id_table = pciidlist,
> > > > @@ -1595,6 +1607,7 @@ static struct pci_driver amdgpu_kms_pci_driver = {
> > > >           .shutdown = amdgpu_pci_shutdown,
> > > >           .driver.pm = &amdgpu_pm_ops,
> > > >           .err_handler = &amdgpu_pci_err_handler,
> > > > + .driver.dev_groups = amdgpu_sysfs_groups,
> > > Shouldn't this just be:
> > >     groups = amdgpu_sysfs_groups,
> > >
> > > Why go to the "driver root" here?
> >
> >
> > Because I still didn't get to your suggestion to propose a patch to add groups to
> > pci_driver, it's located in 'base' driver struct.
>
> You are a pci driver, you should never have to mess with the "base"
> driver struct.  Look at commit 92d50fc1602e ("PCI/IB: add support for
> pci driver attribute groups") which got merged in 4.14, way back in
> 2017 :)

Per the previous discussion of this patch set:
https://www.mail-archive.com/amd-gfx@lists.freedesktop.org/msg56019.html

Alex

>
> driver.pm also looks odd, but I'm just going to ignore that for now...
>
> thanks,
>
> greg k-h
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 196+ messages in thread

* Re: [PATCH v4 10/14] drm/amdgpu: Move some sysfs attrs creation to default_attr
  2021-01-19 19:04           ` Alex Deucher
@ 2021-01-19 19:16             ` Andrey Grodzovsky
  -1 siblings, 0 replies; 196+ messages in thread
From: Andrey Grodzovsky @ 2021-01-19 19:16 UTC (permalink / raw)
  To: Alex Deucher, Greg KH
  Cc: Christian König, Maling list - DRI developers, amd-gfx list,
	Qiang Yu, Daniel Vetter, Deucher, Alexander


On 1/19/21 2:04 PM, Alex Deucher wrote:
> On Tue, Jan 19, 2021 at 1:26 PM Greg KH <gregkh@linuxfoundation.org> wrote:
>> On Tue, Jan 19, 2021 at 11:36:01AM -0500, Andrey Grodzovsky wrote:
>>> On 1/19/21 2:34 AM, Greg KH wrote:
>>>> On Mon, Jan 18, 2021 at 04:01:19PM -0500, Andrey Grodzovsky wrote:
>>>>>    static struct pci_driver amdgpu_kms_pci_driver = {
>>>>>            .name = DRIVER_NAME,
>>>>>            .id_table = pciidlist,
>>>>> @@ -1595,6 +1607,7 @@ static struct pci_driver amdgpu_kms_pci_driver = {
>>>>>            .shutdown = amdgpu_pci_shutdown,
>>>>>            .driver.pm = &amdgpu_pm_ops,
>>>>>            .err_handler = &amdgpu_pci_err_handler,
>>>>> + .driver.dev_groups = amdgpu_sysfs_groups,
>>>> Shouldn't this just be:
>>>>      groups = amdgpu_sysfs_groups,
>>>>
>>>> Why go to the "driver root" here?
>>>
>>> Because I still didn't get to your suggestion to propose a patch to add groups to
>>> pci_driver, it's located in 'base' driver struct.
>> You are a pci driver, you should never have to mess with the "base"
>> driver struct.  Look at commit 92d50fc1602e ("PCI/IB: add support for
>> pci driver attribute groups") which got merged in 4.14, way back in
>> 2017 :)
> Per the previous discussion of this patch set:
> https://www.mail-archive.com/amd-gfx@lists.freedesktop.org/msg56019.html
>
> Alex


Got it. Next iteration I will include a patch like the above to pci-devel
as part of the series and will update this patch accordingly.
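
Roughly, the idea would look like this (a sketch only; the .dev_groups
field on struct pci_driver is hypothetical until such a patch lands, and
the probe/remove callbacks are just filled in for completeness):

static struct pci_driver amdgpu_kms_pci_driver = {
	.name		= DRIVER_NAME,
	.id_table	= pciidlist,
	.probe		= amdgpu_pci_probe,
	.remove		= amdgpu_pci_remove,
	.shutdown	= amdgpu_pci_shutdown,
	.driver.pm	= &amdgpu_pm_ops,
	.err_handler	= &amdgpu_pci_err_handler,
	/* today: reach into the base device_driver struct */
	.driver.dev_groups = amdgpu_sysfs_groups,
	/* after a pci_driver change analogous to commit 92d50fc1602e,
	 * hypothetically just:
	 * .dev_groups = amdgpu_sysfs_groups,
	 */
};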

Andrey


>
>> driver.pm also looks odd, but I'm just going to ignore that for now...
>>
>> thanks,
>>
>> greg k-h
>> _______________________________________________
>> dri-devel mailing list
>> dri-devel@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/dri-devel
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 196+ messages in thread

* Re: [PATCH v4 10/14] drm/amdgpu: Move some sysfs attrs creation to default_attr
@ 2021-01-19 19:16             ` Andrey Grodzovsky
  0 siblings, 0 replies; 196+ messages in thread
From: Andrey Grodzovsky @ 2021-01-19 19:16 UTC (permalink / raw)
  To: Alex Deucher, Greg KH
  Cc: Christian König, Maling list - DRI developers, amd-gfx list,
	Qiang Yu, Daniel Vetter, Deucher, Alexander


On 1/19/21 2:04 PM, Alex Deucher wrote:
> On Tue, Jan 19, 2021 at 1:26 PM Greg KH <gregkh@linuxfoundation.org> wrote:
>> On Tue, Jan 19, 2021 at 11:36:01AM -0500, Andrey Grodzovsky wrote:
>>> On 1/19/21 2:34 AM, Greg KH wrote:
>>>> On Mon, Jan 18, 2021 at 04:01:19PM -0500, Andrey Grodzovsky wrote:
>>>>>    static struct pci_driver amdgpu_kms_pci_driver = {
>>>>>            .name = DRIVER_NAME,
>>>>>            .id_table = pciidlist,
>>>>> @@ -1595,6 +1607,7 @@ static struct pci_driver amdgpu_kms_pci_driver = {
>>>>>            .shutdown = amdgpu_pci_shutdown,
>>>>>            .driver.pm = &amdgpu_pm_ops,
>>>>>            .err_handler = &amdgpu_pci_err_handler,
>>>>> + .driver.dev_groups = amdgpu_sysfs_groups,
>>>> Shouldn't this just be:
>>>>      groups = amdgpu_sysfs_groups,
>>>>
>>>> Why go to the "driver root" here?
>>>
>>> Because I still didn't get to your suggestion to propose a patch to add groups to
>>> pci_driver, it's located in 'base' driver struct.
>> You are a pci driver, you should never have to mess with the "base"
>> driver struct.  Look at commit 92d50fc1602e ("PCI/IB: add support for
>> pci driver attribute groups") which got merged in 4.14, way back in
>> 2017 :)
> Per the previous discussion of this patch set:
> https://www.mail-archive.com/amd-gfx@lists.freedesktop.org/msg56019.html
>
> Alex


Got it. Next iteration I will include a patch like the above to pci-devel
as part of the series and will update this patch accordingly.

Andrey


>
>> driver.pm also looks odd, but I'm just going to ignore that for now...
>>
>> thanks,
>>
>> greg k-h
>> _______________________________________________
>> dri-devel mailing list
>> dri-devel@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/dri-devel
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 196+ messages in thread

* Re: [PATCH v4 11/14] drm/amdgpu: Guard against write accesses after device removal
  2021-01-19 18:59             ` Christian König
@ 2021-01-19 19:16               ` Andrey Grodzovsky
  -1 siblings, 0 replies; 196+ messages in thread
From: Andrey Grodzovsky @ 2021-01-19 19:16 UTC (permalink / raw)
  To: christian.koenig, Daniel Vetter
  Cc: Greg KH, dri-devel, amd-gfx list, Alex Deucher, Qiang Yu


On 1/19/21 1:59 PM, Christian König wrote:
> On 19.01.21 at 19:22, Andrey Grodzovsky wrote:
>>
>> On 1/19/21 1:05 PM, Daniel Vetter wrote:
>>> On Tue, Jan 19, 2021 at 4:35 PM Andrey Grodzovsky
>>> <Andrey.Grodzovsky@amd.com> wrote:
>>>> There is really no other way according to this article
>>>> https://lwn.net/Articles/767885/
>>>>
>>>>
>>>> "A perfect solution seems nearly impossible though; we cannot acquire a 
>>>> mutex on
>>>> the user
>>>> to prevent them from yanking a device and we cannot check for a presence 
>>>> change
>>>> after every
>>>> device access for performance reasons. "
>>>>
>>>> But I assumed srcu_read_lock should be pretty seamless performance-wise, no?
>>> The read side is supposed to be dirt cheap, the write side is where we
>>> just stall for all readers to eventually complete on their own.
>>> Definitely should be much cheaper than mmio read, on the mmio write
>>> side it might actually hurt a bit. Otoh I think those don't stall the
>>> cpu by default when they're timing out, so maybe if the overhead is
>>> too much for those, we could omit them?
>>>
>>> Maybe just do a small microbenchmark for these for testing, with a
>>> register that doesn't change hw state. So with and without
>>> drm_dev_enter/exit, and also one with the hw plugged out so that we
>>> have actual timeouts in the transactions.
>>> -Daniel
>>
>>
>> So, say, writing to some harmless scratch register in a loop many times,
>> for both the plugged and unplugged cases, and measuring the total time
>> delta?
>
> I think we should at least measure the following:
>
> 1. Writing X times to a scratch reg without your patch.
> 2. Writing X times to a scratch reg with your patch.
> 3. Writing X times to a scratch reg with the hardware physically disconnected.
>
> I suggest repeating that once for Polaris (or older) and once for Vega or Navi.
>
> The SRBM on Polaris is meant to introduce some delay in each access, so it 
> might react differently than the newer hardware.
>
> Christian.


Will do.
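
(For anyone measuring along: the machinery under test is plain SRCU.
drm_dev_enter()/drm_dev_exit() boil down to roughly the sketch below,
simplified from drivers/gpu/drm/drm_drv.c and not line-for-line the real
implementation, so the read side is just a srcu_read_lock() and the only
expensive part is the one-time synchronize_srcu() on unplug.)

DEFINE_STATIC_SRCU(drm_unplug_srcu);

bool drm_dev_enter(struct drm_device *dev, int *idx)
{
	*idx = srcu_read_lock(&drm_unplug_srcu);	/* dirt cheap */

	if (dev->unplugged) {
		srcu_read_unlock(&drm_unplug_srcu, *idx);
		return false;
	}
	return true;
}

void drm_dev_exit(int idx)
{
	srcu_read_unlock(&drm_unplug_srcu, idx);
}

/* The write side runs once, on unplug, and stalls until every reader
 * currently inside an enter/exit section has finished. */
void drm_dev_unplug(struct drm_device *dev)
{
	dev->unplugged = true;
	synchronize_srcu(&drm_unplug_srcu);
}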

Andrey


>
>>
>> Andrey
>>
>>
>>>
>>>> The other solution would be, as I suggested, to keep all the device
>>>> IO ranges reserved and the system memory pages unfreed until the
>>>> device is finalized in the driver, but Daniel said this would upset
>>>> the PCI layer (the MMIO ranges reservation part).
>>>>
>>>> Andrey
>>>>
>>>>
>>>>
>>>>
>>>> On 1/19/21 3:55 AM, Christian König wrote:
>>>>> Am 18.01.21 um 22:01 schrieb Andrey Grodzovsky:
>>>>>> This should prevent writing to memory or IO ranges possibly
>>>>>> already allocated for other uses after our device is removed.
>>>>> Wow, that adds quite some overhead to every register access. I'm not sure we
>>>>> can do this.
>>>>>
>>>>> Christian.
>>>>>
>>>>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>>>>>> ---
>>>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 57 ++++++++++++++++++++++++
>>>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c    |  9 ++++
>>>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c    | 53 +++++++++++++---------
>>>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h    |  3 ++
>>>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c   | 70 
>>>>>> ++++++++++++++++++++++++++++++
>>>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h   | 49 ++-------------------
>>>>>>    drivers/gpu/drm/amd/amdgpu/psp_v11_0.c     | 16 ++-----
>>>>>>    drivers/gpu/drm/amd/amdgpu/psp_v12_0.c     |  8 +---
>>>>>>    drivers/gpu/drm/amd/amdgpu/psp_v3_1.c      |  8 +---
>>>>>>    9 files changed, 184 insertions(+), 89 deletions(-)
>>>>>>
>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>>> index e99f4f1..0a9d73c 100644
>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>>> @@ -72,6 +72,8 @@
>>>>>>      #include <linux/iommu.h>
>>>>>>    +#include <drm/drm_drv.h>
>>>>>> +
>>>>>>    MODULE_FIRMWARE("amdgpu/vega10_gpu_info.bin");
>>>>>>    MODULE_FIRMWARE("amdgpu/vega12_gpu_info.bin");
>>>>>>    MODULE_FIRMWARE("amdgpu/raven_gpu_info.bin");
>>>>>> @@ -404,13 +406,21 @@ uint8_t amdgpu_mm_rreg8(struct amdgpu_device *adev,
>>>>>> uint32_t offset)
>>>>>>     */
>>>>>>    void amdgpu_mm_wreg8(struct amdgpu_device *adev, uint32_t offset, uint8_t
>>>>>> value)
>>>>>>    {
>>>>>> +    int idx;
>>>>>> +
>>>>>>        if (adev->in_pci_err_recovery)
>>>>>>            return;
>>>>>>    +
>>>>>> +    if (!drm_dev_enter(&adev->ddev, &idx))
>>>>>> +        return;
>>>>>> +
>>>>>>        if (offset < adev->rmmio_size)
>>>>>>            writeb(value, adev->rmmio + offset);
>>>>>>        else
>>>>>>            BUG();
>>>>>> +
>>>>>> +    drm_dev_exit(idx);
>>>>>>    }
>>>>>>      /**
>>>>>> @@ -427,9 +437,14 @@ void amdgpu_device_wreg(struct amdgpu_device *adev,
>>>>>>                uint32_t reg, uint32_t v,
>>>>>>                uint32_t acc_flags)
>>>>>>    {
>>>>>> +    int idx;
>>>>>> +
>>>>>>        if (adev->in_pci_err_recovery)
>>>>>>            return;
>>>>>>    +    if (!drm_dev_enter(&adev->ddev, &idx))
>>>>>> +        return;
>>>>>> +
>>>>>>        if ((reg * 4) < adev->rmmio_size) {
>>>>>>            if (!(acc_flags & AMDGPU_REGS_NO_KIQ) &&
>>>>>>                amdgpu_sriov_runtime(adev) &&
>>>>>> @@ -444,6 +459,8 @@ void amdgpu_device_wreg(struct amdgpu_device *adev,
>>>>>>        }
>>>>>> trace_amdgpu_device_wreg(adev->pdev->device, reg, v);
>>>>>> +
>>>>>> +    drm_dev_exit(idx);
>>>>>>    }
>>>>>>      /*
>>>>>> @@ -454,9 +471,14 @@ void amdgpu_device_wreg(struct amdgpu_device *adev,
>>>>>>    void amdgpu_mm_wreg_mmio_rlc(struct amdgpu_device *adev,
>>>>>>                     uint32_t reg, uint32_t v)
>>>>>>    {
>>>>>> +    int idx;
>>>>>> +
>>>>>>        if (adev->in_pci_err_recovery)
>>>>>>            return;
>>>>>>    +    if (!drm_dev_enter(&adev->ddev, &idx))
>>>>>> +        return;
>>>>>> +
>>>>>>        if (amdgpu_sriov_fullaccess(adev) &&
>>>>>>            adev->gfx.rlc.funcs &&
>>>>>> adev->gfx.rlc.funcs->is_rlcg_access_range) {
>>>>>> @@ -465,6 +487,8 @@ void amdgpu_mm_wreg_mmio_rlc(struct amdgpu_device *adev,
>>>>>>        } else {
>>>>>>            writel(v, ((void __iomem *)adev->rmmio) + (reg * 4));
>>>>>>        }
>>>>>> +
>>>>>> +    drm_dev_exit(idx);
>>>>>>    }
>>>>>>      /**
>>>>>> @@ -499,15 +523,22 @@ u32 amdgpu_io_rreg(struct amdgpu_device *adev, u32 
>>>>>> reg)
>>>>>>     */
>>>>>>    void amdgpu_io_wreg(struct amdgpu_device *adev, u32 reg, u32 v)
>>>>>>    {
>>>>>> +    int idx;
>>>>>> +
>>>>>>        if (adev->in_pci_err_recovery)
>>>>>>            return;
>>>>>>    +    if (!drm_dev_enter(&adev->ddev, &idx))
>>>>>> +        return;
>>>>>> +
>>>>>>        if ((reg * 4) < adev->rio_mem_size)
>>>>>>            iowrite32(v, adev->rio_mem + (reg * 4));
>>>>>>        else {
>>>>>>            iowrite32((reg * 4), adev->rio_mem + (mmMM_INDEX * 4));
>>>>>>            iowrite32(v, adev->rio_mem + (mmMM_DATA * 4));
>>>>>>        }
>>>>>> +
>>>>>> +    drm_dev_exit(idx);
>>>>>>    }
>>>>>>      /**
>>>>>> @@ -544,14 +575,21 @@ u32 amdgpu_mm_rdoorbell(struct amdgpu_device *adev, 
>>>>>> u32
>>>>>> index)
>>>>>>     */
>>>>>>    void amdgpu_mm_wdoorbell(struct amdgpu_device *adev, u32 index, u32 v)
>>>>>>    {
>>>>>> +    int idx;
>>>>>> +
>>>>>>        if (adev->in_pci_err_recovery)
>>>>>>            return;
>>>>>>    +    if (!drm_dev_enter(&adev->ddev, &idx))
>>>>>> +        return;
>>>>>> +
>>>>>>        if (index < adev->doorbell.num_doorbells) {
>>>>>>            writel(v, adev->doorbell.ptr + index);
>>>>>>        } else {
>>>>>>            DRM_ERROR("writing beyond doorbell aperture: 0x%08x!\n", index);
>>>>>>        }
>>>>>> +
>>>>>> +    drm_dev_exit(idx);
>>>>>>    }
>>>>>>      /**
>>>>>> @@ -588,14 +626,21 @@ u64 amdgpu_mm_rdoorbell64(struct amdgpu_device *adev,
>>>>>> u32 index)
>>>>>>     */
>>>>>>    void amdgpu_mm_wdoorbell64(struct amdgpu_device *adev, u32 index, u64 v)
>>>>>>    {
>>>>>> +    int idx;
>>>>>> +
>>>>>>        if (adev->in_pci_err_recovery)
>>>>>>            return;
>>>>>>    +    if (!drm_dev_enter(&adev->ddev, &idx))
>>>>>> +        return;
>>>>>> +
>>>>>>        if (index < adev->doorbell.num_doorbells) {
>>>>>>            atomic64_set((atomic64_t *)(adev->doorbell.ptr + index), v);
>>>>>>        } else {
>>>>>>            DRM_ERROR("writing beyond doorbell aperture: 0x%08x!\n", index);
>>>>>>        }
>>>>>> +
>>>>>> +    drm_dev_exit(idx);
>>>>>>    }
>>>>>>      /**
>>>>>> @@ -682,6 +727,10 @@ void amdgpu_device_indirect_wreg(struct amdgpu_device
>>>>>> *adev,
>>>>>>        unsigned long flags;
>>>>>>        void __iomem *pcie_index_offset;
>>>>>>        void __iomem *pcie_data_offset;
>>>>>> +    int idx;
>>>>>> +
>>>>>> +    if (!drm_dev_enter(&adev->ddev, &idx))
>>>>>> +        return;
>>>>>>          spin_lock_irqsave(&adev->pcie_idx_lock, flags);
>>>>>>        pcie_index_offset = (void __iomem *)adev->rmmio + pcie_index * 4;
>>>>>> @@ -692,6 +741,8 @@ void amdgpu_device_indirect_wreg(struct amdgpu_device 
>>>>>> *adev,
>>>>>>        writel(reg_data, pcie_data_offset);
>>>>>>        readl(pcie_data_offset);
>>>>>> spin_unlock_irqrestore(&adev->pcie_idx_lock, flags);
>>>>>> +
>>>>>> +    drm_dev_exit(idx);
>>>>>>    }
>>>>>>      /**
>>>>>> @@ -711,6 +762,10 @@ void amdgpu_device_indirect_wreg64(struct amdgpu_device
>>>>>> *adev,
>>>>>>        unsigned long flags;
>>>>>>        void __iomem *pcie_index_offset;
>>>>>>        void __iomem *pcie_data_offset;
>>>>>> +    int idx;
>>>>>> +
>>>>>> +    if (!drm_dev_enter(&adev->ddev, &idx))
>>>>>> +        return;
>>>>>>          spin_lock_irqsave(&adev->pcie_idx_lock, flags);
>>>>>>        pcie_index_offset = (void __iomem *)adev->rmmio + pcie_index * 4;
>>>>>> @@ -727,6 +782,8 @@ void amdgpu_device_indirect_wreg64(struct amdgpu_device
>>>>>> *adev,
>>>>>>        writel((u32)(reg_data >> 32), pcie_data_offset);
>>>>>>        readl(pcie_data_offset);
>>>>>> spin_unlock_irqrestore(&adev->pcie_idx_lock, flags);
>>>>>> +
>>>>>> +    drm_dev_exit(idx);
>>>>>>    }
>>>>>>      /**
>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
>>>>>> index fe1a39f..1beb4e6 100644
>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
>>>>>> @@ -31,6 +31,8 @@
>>>>>>    #include "amdgpu_ras.h"
>>>>>>    #include "amdgpu_xgmi.h"
>>>>>>    +#include <drm/drm_drv.h>
>>>>>> +
>>>>>>    /**
>>>>>>     * amdgpu_gmc_get_pde_for_bo - get the PDE for a BO
>>>>>>     *
>>>>>> @@ -98,6 +100,10 @@ int amdgpu_gmc_set_pte_pde(struct amdgpu_device *adev,
>>>>>> void *cpu_pt_addr,
>>>>>>    {
>>>>>>        void __iomem *ptr = (void *)cpu_pt_addr;
>>>>>>        uint64_t value;
>>>>>> +    int idx;
>>>>>> +
>>>>>> +    if (!drm_dev_enter(&adev->ddev, &idx))
>>>>>> +        return 0;
>>>>>>          /*
>>>>>>         * The following is for PTE only. GART does not have PDEs.
>>>>>> @@ -105,6 +111,9 @@ int amdgpu_gmc_set_pte_pde(struct amdgpu_device *adev,
>>>>>> void *cpu_pt_addr,
>>>>>>        value = addr & 0x0000FFFFFFFFF000ULL;
>>>>>>        value |= flags;
>>>>>>        writeq(value, ptr + (gpu_page_idx * 8));
>>>>>> +
>>>>>> +    drm_dev_exit(idx);
>>>>>> +
>>>>>>        return 0;
>>>>>>    }
>>>>>>    diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
>>>>>> index 523d22d..89e2bfe 100644
>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
>>>>>> @@ -37,6 +37,8 @@
>>>>>>      #include "amdgpu_ras.h"
>>>>>>    +#include <drm/drm_drv.h>
>>>>>> +
>>>>>>    static int psp_sysfs_init(struct amdgpu_device *adev);
>>>>>>    static void psp_sysfs_fini(struct amdgpu_device *adev);
>>>>>>    @@ -248,7 +250,7 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>>>>>               struct psp_gfx_cmd_resp *cmd, uint64_t fence_mc_addr)
>>>>>>    {
>>>>>>        int ret;
>>>>>> -    int index;
>>>>>> +    int index, idx;
>>>>>>        int timeout = 2000;
>>>>>>        bool ras_intr = false;
>>>>>>        bool skip_unsupport = false;
>>>>>> @@ -256,6 +258,9 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>>>>>        if (psp->adev->in_pci_err_recovery)
>>>>>>            return 0;
>>>>>>    +    if (!drm_dev_enter(&psp->adev->ddev, &idx))
>>>>>> +        return 0;
>>>>>> +
>>>>>>        mutex_lock(&psp->mutex);
>>>>>>          memset(psp->cmd_buf_mem, 0, PSP_CMD_BUFFER_SIZE);
>>>>>> @@ -266,8 +271,7 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>>>>>        ret = psp_ring_cmd_submit(psp, psp->cmd_buf_mc_addr, fence_mc_addr,
>>>>>> index);
>>>>>>        if (ret) {
>>>>>>            atomic_dec(&psp->fence_value);
>>>>>> -        mutex_unlock(&psp->mutex);
>>>>>> -        return ret;
>>>>>> +        goto exit;
>>>>>>        }
>>>>>>          amdgpu_asic_invalidate_hdp(psp->adev, NULL);
>>>>>> @@ -307,8 +311,8 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>>>>>                 psp->cmd_buf_mem->cmd_id,
>>>>>>                 psp->cmd_buf_mem->resp.status);
>>>>>>            if (!timeout) {
>>>>>> -            mutex_unlock(&psp->mutex);
>>>>>> -            return -EINVAL;
>>>>>> +            ret = -EINVAL;
>>>>>> +            goto exit;
>>>>>>            }
>>>>>>        }
>>>>>>    @@ -316,8 +320,10 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>>>>>            ucode->tmr_mc_addr_lo = psp->cmd_buf_mem->resp.fw_addr_lo;
>>>>>>            ucode->tmr_mc_addr_hi = psp->cmd_buf_mem->resp.fw_addr_hi;
>>>>>>        }
>>>>>> -    mutex_unlock(&psp->mutex);
>>>>>>    +exit:
>>>>>> +    mutex_unlock(&psp->mutex);
>>>>>> +    drm_dev_exit(idx);
>>>>>>        return ret;
>>>>>>    }
>>>>>>    @@ -354,8 +360,7 @@ static int psp_load_toc(struct psp_context *psp,
>>>>>>        if (!cmd)
>>>>>>            return -ENOMEM;
>>>>>>        /* Copy toc to psp firmware private buffer */
>>>>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>>> -    memcpy(psp->fw_pri_buf, psp->toc_start_addr, psp->toc_bin_size);
>>>>>> +    psp_copy_fw(psp, psp->toc_start_addr, psp->toc_bin_size);
>>>>>>          psp_prep_load_toc_cmd_buf(cmd, psp->fw_pri_mc_addr, 
>>>>>> psp->toc_bin_size);
>>>>>>    @@ -570,8 +575,7 @@ static int psp_asd_load(struct psp_context *psp)
>>>>>>        if (!cmd)
>>>>>>            return -ENOMEM;
>>>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>>> -    memcpy(psp->fw_pri_buf, psp->asd_start_addr, psp->asd_ucode_size);
>>>>>> +    psp_copy_fw(psp, psp->asd_start_addr, psp->asd_ucode_size);
>>>>>>          psp_prep_asd_load_cmd_buf(cmd, psp->fw_pri_mc_addr,
>>>>>>                      psp->asd_ucode_size);
>>>>>> @@ -726,8 +730,7 @@ static int psp_xgmi_load(struct psp_context *psp)
>>>>>>        if (!cmd)
>>>>>>            return -ENOMEM;
>>>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>>> -    memcpy(psp->fw_pri_buf, psp->ta_xgmi_start_addr, 
>>>>>> psp->ta_xgmi_ucode_size);
>>>>>> +    psp_copy_fw(psp, psp->ta_xgmi_start_addr, psp->ta_xgmi_ucode_size);
>>>>>>          psp_prep_ta_load_cmd_buf(cmd,
>>>>>>                     psp->fw_pri_mc_addr,
>>>>>> @@ -982,8 +985,7 @@ static int psp_ras_load(struct psp_context *psp)
>>>>>>        if (!cmd)
>>>>>>            return -ENOMEM;
>>>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>>> -    memcpy(psp->fw_pri_buf, psp->ta_ras_start_addr, 
>>>>>> psp->ta_ras_ucode_size);
>>>>>> +    psp_copy_fw(psp, psp->ta_ras_start_addr, psp->ta_ras_ucode_size);
>>>>>>          psp_prep_ta_load_cmd_buf(cmd,
>>>>>>                     psp->fw_pri_mc_addr,
>>>>>> @@ -1219,8 +1221,7 @@ static int psp_hdcp_load(struct psp_context *psp)
>>>>>>        if (!cmd)
>>>>>>            return -ENOMEM;
>>>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>>> -    memcpy(psp->fw_pri_buf, psp->ta_hdcp_start_addr,
>>>>>> +    psp_copy_fw(psp, psp->ta_hdcp_start_addr,
>>>>>>               psp->ta_hdcp_ucode_size);
>>>>>>          psp_prep_ta_load_cmd_buf(cmd,
>>>>>> @@ -1366,8 +1367,7 @@ static int psp_dtm_load(struct psp_context *psp)
>>>>>>        if (!cmd)
>>>>>>            return -ENOMEM;
>>>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>>> -    memcpy(psp->fw_pri_buf, psp->ta_dtm_start_addr, 
>>>>>> psp->ta_dtm_ucode_size);
>>>>>> +    psp_copy_fw(psp, psp->ta_dtm_start_addr, psp->ta_dtm_ucode_size);
>>>>>>          psp_prep_ta_load_cmd_buf(cmd,
>>>>>>                     psp->fw_pri_mc_addr,
>>>>>> @@ -1507,8 +1507,7 @@ static int psp_rap_load(struct psp_context *psp)
>>>>>>        if (!cmd)
>>>>>>            return -ENOMEM;
>>>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>>> -    memcpy(psp->fw_pri_buf, psp->ta_rap_start_addr, 
>>>>>> psp->ta_rap_ucode_size);
>>>>>> +    psp_copy_fw(psp, psp->ta_rap_start_addr, psp->ta_rap_ucode_size);
>>>>>>          psp_prep_ta_load_cmd_buf(cmd,
>>>>>>                     psp->fw_pri_mc_addr,
>>>>>> @@ -2778,6 +2777,20 @@ static ssize_t psp_usbc_pd_fw_sysfs_write(struct
>>>>>> device *dev,
>>>>>>        return count;
>>>>>>    }
>>>>>>    +void psp_copy_fw(struct psp_context *psp, uint8_t *start_addr, uint32_t
>>>>>> bin_size)
>>>>>> +{
>>>>>> +    int idx;
>>>>>> +
>>>>>> +    if (!drm_dev_enter(&psp->adev->ddev, &idx))
>>>>>> +        return;
>>>>>> +
>>>>>> +    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>>> +    memcpy(psp->fw_pri_buf, start_addr, bin_size);
>>>>>> +
>>>>>> +    drm_dev_exit(idx);
>>>>>> +}
>>>>>> +
>>>>>> +
>>>>>>    static DEVICE_ATTR(usbc_pd_fw, S_IRUGO | S_IWUSR,
>>>>>>               psp_usbc_pd_fw_sysfs_read,
>>>>>>               psp_usbc_pd_fw_sysfs_write);
>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
>>>>>> index da250bc..ac69314 100644
>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
>>>>>> @@ -400,4 +400,7 @@ int psp_init_ta_microcode(struct psp_context *psp,
>>>>>>                  const char *chip_name);
>>>>>>    int psp_get_fw_attestation_records_addr(struct psp_context *psp,
>>>>>>                        uint64_t *output_ptr);
>>>>>> +
>>>>>> +void psp_copy_fw(struct psp_context *psp, uint8_t *start_addr, uint32_t
>>>>>> bin_size);
>>>>>> +
>>>>>>    #endif
>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>>>>>> index 1a612f5..d656494 100644
>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>>>>>> @@ -35,6 +35,8 @@
>>>>>>    #include "amdgpu.h"
>>>>>>    #include "atom.h"
>>>>>>    +#include <drm/drm_drv.h>
>>>>>> +
>>>>>>    /*
>>>>>>     * Rings
>>>>>>     * Most engines on the GPU are fed via ring buffers. Ring
>>>>>> @@ -463,3 +465,71 @@ int amdgpu_ring_test_helper(struct amdgpu_ring *ring)
>>>>>>        ring->sched.ready = !r;
>>>>>>        return r;
>>>>>>    }
>>>>>> +
>>>>>> +void amdgpu_ring_clear_ring(struct amdgpu_ring *ring)
>>>>>> +{
>>>>>> +    int idx;
>>>>>> +    int i = 0;
>>>>>> +
>>>>>> +    if (!drm_dev_enter(&ring->adev->ddev, &idx))
>>>>>> +        return;
>>>>>> +
>>>>>> +    while (i <= ring->buf_mask)
>>>>>> +        ring->ring[i++] = ring->funcs->nop;
>>>>>> +
>>>>>> +    drm_dev_exit(idx);
>>>>>> +
>>>>>> +}
>>>>>> +
>>>>>> +void amdgpu_ring_write(struct amdgpu_ring *ring, uint32_t v)
>>>>>> +{
>>>>>> +    int idx;
>>>>>> +
>>>>>> +    if (!drm_dev_enter(&ring->adev->ddev, &idx))
>>>>>> +        return;
>>>>>> +
>>>>>> +    if (ring->count_dw <= 0)
>>>>>> +        DRM_ERROR("amdgpu: writing more dwords to the ring than 
>>>>>> expected!\n");
>>>>>> +    ring->ring[ring->wptr++ & ring->buf_mask] = v;
>>>>>> +    ring->wptr &= ring->ptr_mask;
>>>>>> +    ring->count_dw--;
>>>>>> +
>>>>>> +    drm_dev_exit(idx);
>>>>>> +}
>>>>>> +
>>>>>> +void amdgpu_ring_write_multiple(struct amdgpu_ring *ring,
>>>>>> +                          void *src, int count_dw)
>>>>>> +{
>>>>>> +    unsigned occupied, chunk1, chunk2;
>>>>>> +    void *dst;
>>>>>> +    int idx;
>>>>>> +
>>>>>> +    if (!drm_dev_enter(&ring->adev->ddev, &idx))
>>>>>> +        return;
>>>>>> +
>>>>>> +    if (unlikely(ring->count_dw < count_dw))
>>>>>> +        DRM_ERROR("amdgpu: writing more dwords to the ring than 
>>>>>> expected!\n");
>>>>>> +
>>>>>> +    occupied = ring->wptr & ring->buf_mask;
>>>>>> +    dst = (void *)&ring->ring[occupied];
>>>>>> +    chunk1 = ring->buf_mask + 1 - occupied;
>>>>>> +    chunk1 = (chunk1 >= count_dw) ? count_dw: chunk1;
>>>>>> +    chunk2 = count_dw - chunk1;
>>>>>> +    chunk1 <<= 2;
>>>>>> +    chunk2 <<= 2;
>>>>>> +
>>>>>> +    if (chunk1)
>>>>>> +        memcpy(dst, src, chunk1);
>>>>>> +
>>>>>> +    if (chunk2) {
>>>>>> +        src += chunk1;
>>>>>> +        dst = (void *)ring->ring;
>>>>>> +        memcpy(dst, src, chunk2);
>>>>>> +    }
>>>>>> +
>>>>>> +    ring->wptr += count_dw;
>>>>>> +    ring->wptr &= ring->ptr_mask;
>>>>>> +    ring->count_dw -= count_dw;
>>>>>> +
>>>>>> +    drm_dev_exit(idx);
>>>>>> +}
>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>>>>>> index accb243..f90b81f 100644
>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>>>>>> @@ -300,53 +300,12 @@ static inline void
>>>>>> amdgpu_ring_set_preempt_cond_exec(struct amdgpu_ring *ring,
>>>>>>        *ring->cond_exe_cpu_addr = cond_exec;
>>>>>>    }
>>>>>>    -static inline void amdgpu_ring_clear_ring(struct amdgpu_ring *ring)
>>>>>> -{
>>>>>> -    int i = 0;
>>>>>> -    while (i <= ring->buf_mask)
>>>>>> -        ring->ring[i++] = ring->funcs->nop;
>>>>>> -
>>>>>> -}
>>>>>> -
>>>>>> -static inline void amdgpu_ring_write(struct amdgpu_ring *ring, uint32_t v)
>>>>>> -{
>>>>>> -    if (ring->count_dw <= 0)
>>>>>> -        DRM_ERROR("amdgpu: writing more dwords to the ring than 
>>>>>> expected!\n");
>>>>>> -    ring->ring[ring->wptr++ & ring->buf_mask] = v;
>>>>>> -    ring->wptr &= ring->ptr_mask;
>>>>>> -    ring->count_dw--;
>>>>>> -}
>>>>>> +void amdgpu_ring_clear_ring(struct amdgpu_ring *ring);
>>>>>>    -static inline void amdgpu_ring_write_multiple(struct amdgpu_ring *ring,
>>>>>> -                          void *src, int count_dw)
>>>>>> -{
>>>>>> -    unsigned occupied, chunk1, chunk2;
>>>>>> -    void *dst;
>>>>>> -
>>>>>> -    if (unlikely(ring->count_dw < count_dw))
>>>>>> -        DRM_ERROR("amdgpu: writing more dwords to the ring than 
>>>>>> expected!\n");
>>>>>> -
>>>>>> -    occupied = ring->wptr & ring->buf_mask;
>>>>>> -    dst = (void *)&ring->ring[occupied];
>>>>>> -    chunk1 = ring->buf_mask + 1 - occupied;
>>>>>> -    chunk1 = (chunk1 >= count_dw) ? count_dw: chunk1;
>>>>>> -    chunk2 = count_dw - chunk1;
>>>>>> -    chunk1 <<= 2;
>>>>>> -    chunk2 <<= 2;
>>>>>> +void amdgpu_ring_write(struct amdgpu_ring *ring, uint32_t v);
>>>>>>    -    if (chunk1)
>>>>>> -        memcpy(dst, src, chunk1);
>>>>>> -
>>>>>> -    if (chunk2) {
>>>>>> -        src += chunk1;
>>>>>> -        dst = (void *)ring->ring;
>>>>>> -        memcpy(dst, src, chunk2);
>>>>>> -    }
>>>>>> -
>>>>>> -    ring->wptr += count_dw;
>>>>>> -    ring->wptr &= ring->ptr_mask;
>>>>>> -    ring->count_dw -= count_dw;
>>>>>> -}
>>>>>> +void amdgpu_ring_write_multiple(struct amdgpu_ring *ring,
>>>>>> +                          void *src, int count_dw);
>>>>>>      int amdgpu_ring_test_helper(struct amdgpu_ring *ring);
>>>>>>    diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
>>>>>> b/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
>>>>>> index bd4248c..b3ce5be 100644
>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
>>>>>> @@ -269,10 +269,8 @@ static int psp_v11_0_bootloader_load_kdb(struct
>>>>>> psp_context *psp)
>>>>>>        if (ret)
>>>>>>            return ret;
>>>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>>> -
>>>>>>        /* Copy PSP KDB binary to memory */
>>>>>> -    memcpy(psp->fw_pri_buf, psp->kdb_start_addr, psp->kdb_bin_size);
>>>>>> +    psp_copy_fw(psp, psp->kdb_start_addr, psp->kdb_bin_size);
>>>>>>          /* Provide the PSP KDB to bootloader */
>>>>>>        WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>>>> @@ -302,10 +300,8 @@ static int psp_v11_0_bootloader_load_spl(struct
>>>>>> psp_context *psp)
>>>>>>        if (ret)
>>>>>>            return ret;
>>>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>>> -
>>>>>>        /* Copy PSP SPL binary to memory */
>>>>>> -    memcpy(psp->fw_pri_buf, psp->spl_start_addr, psp->spl_bin_size);
>>>>>> +    psp_copy_fw(psp, psp->spl_start_addr, psp->spl_bin_size);
>>>>>>          /* Provide the PSP SPL to bootloader */
>>>>>>        WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>>>> @@ -335,10 +331,8 @@ static int psp_v11_0_bootloader_load_sysdrv(struct
>>>>>> psp_context *psp)
>>>>>>        if (ret)
>>>>>>            return ret;
>>>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>>> -
>>>>>>        /* Copy PSP System Driver binary to memory */
>>>>>> -    memcpy(psp->fw_pri_buf, psp->sys_start_addr, psp->sys_bin_size);
>>>>>> +    psp_copy_fw(psp, psp->sys_start_addr, psp->sys_bin_size);
>>>>>>          /* Provide the sys driver to bootloader */
>>>>>>        WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>>>> @@ -371,10 +365,8 @@ static int psp_v11_0_bootloader_load_sos(struct
>>>>>> psp_context *psp)
>>>>>>        if (ret)
>>>>>>            return ret;
>>>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>>> -
>>>>>>        /* Copy Secure OS binary to PSP memory */
>>>>>> -    memcpy(psp->fw_pri_buf, psp->sos_start_addr, psp->sos_bin_size);
>>>>>> +    psp_copy_fw(psp, psp->sos_start_addr, psp->sos_bin_size);
>>>>>>          /* Provide the PSP secure OS to bootloader */
>>>>>>        WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
>>>>>> b/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
>>>>>> index c4828bd..618e5b6 100644
>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
>>>>>> @@ -138,10 +138,8 @@ static int psp_v12_0_bootloader_load_sysdrv(struct
>>>>>> psp_context *psp)
>>>>>>        if (ret)
>>>>>>            return ret;
>>>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>>> -
>>>>>>        /* Copy PSP System Driver binary to memory */
>>>>>> -    memcpy(psp->fw_pri_buf, psp->sys_start_addr, psp->sys_bin_size);
>>>>>> +    psp_copy_fw(psp, psp->sys_start_addr, psp->sys_bin_size);
>>>>>>          /* Provide the sys driver to bootloader */
>>>>>>        WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>>>> @@ -179,10 +177,8 @@ static int psp_v12_0_bootloader_load_sos(struct
>>>>>> psp_context *psp)
>>>>>>        if (ret)
>>>>>>            return ret;
>>>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>>> -
>>>>>>        /* Copy Secure OS binary to PSP memory */
>>>>>> -    memcpy(psp->fw_pri_buf, psp->sos_start_addr, psp->sos_bin_size);
>>>>>> +    psp_copy_fw(psp, psp->sos_start_addr, psp->sos_bin_size);
>>>>>>          /* Provide the PSP secure OS to bootloader */
>>>>>>        WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
>>>>>> b/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
>>>>>> index f2e725f..d0a6cccd 100644
>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
>>>>>> @@ -102,10 +102,8 @@ static int psp_v3_1_bootloader_load_sysdrv(struct
>>>>>> psp_context *psp)
>>>>>>        if (ret)
>>>>>>            return ret;
>>>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>>> -
>>>>>>        /* Copy PSP System Driver binary to memory */
>>>>>> -    memcpy(psp->fw_pri_buf, psp->sys_start_addr, psp->sys_bin_size);
>>>>>> +    psp_copy_fw(psp, psp->sys_start_addr, psp->sys_bin_size);
>>>>>>          /* Provide the sys driver to bootloader */
>>>>>>        WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>>>> @@ -143,10 +141,8 @@ static int psp_v3_1_bootloader_load_sos(struct
>>>>>> psp_context *psp)
>>>>>>        if (ret)
>>>>>>            return ret;
>>>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>>> -
>>>>>>        /* Copy Secure OS binary to PSP memory */
>>>>>> -    memcpy(psp->fw_pri_buf, psp->sos_start_addr, psp->sos_bin_size);
>>>>>> +    psp_copy_fw(psp, psp->sos_start_addr, psp->sos_bin_size);
>>>>>>          /* Provide the PSP secure OS to bootloader */
>>>>>>        WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>
>>>
>
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 196+ messages in thread

* Re: [PATCH v4 11/14] drm/amdgpu: Guard against write accesses after device removal
@ 2021-01-19 19:16               ` Andrey Grodzovsky
  0 siblings, 0 replies; 196+ messages in thread
From: Andrey Grodzovsky @ 2021-01-19 19:16 UTC (permalink / raw)
  To: christian.koenig, Daniel Vetter
  Cc: Rob Herring, Greg KH, dri-devel, Anholt, Eric, Pekka Paalanen,
	amd-gfx list, Alex Deucher, Lucas Stach, Wentland, Harry,
	Qiang Yu


On 1/19/21 1:59 PM, Christian König wrote:
> On 19.01.21 at 19:22, Andrey Grodzovsky wrote:
>>
>> On 1/19/21 1:05 PM, Daniel Vetter wrote:
>>> On Tue, Jan 19, 2021 at 4:35 PM Andrey Grodzovsky
>>> <Andrey.Grodzovsky@amd.com> wrote:
>>>> There is really no other way according to this article
>>>> https://lwn.net/Articles/767885/
>>>>
>>>>
>>>> "A perfect solution seems nearly impossible though; we cannot acquire a 
>>>> mutex on
>>>> the user
>>>> to prevent them from yanking a device and we cannot check for a presence 
>>>> change
>>>> after every
>>>> device access for performance reasons. "
>>>>
>>>> But I assumed srcu_read_lock should be pretty much seamless performance-wise, no?
>>> The read side is supposed to be dirt cheap, the write side is where we
>>> just stall for all readers to eventually complete on their own.
>>> Definitely should be much cheaper than an mmio read; on the mmio write
>>> side it might actually hurt a bit. Otoh I think those don't stall the
>>> cpu by default when they're timing out, so maybe if the overhead is
>>> too much for those, we could omit them?
>>>
>>> Maybe just do a small microbenchmark for these for testing, with a
>>> register that doesn't change hw state. So with and without
>>> drm_dev_enter/exit, and also one with the hw plugged out so that we
>>> have actual timeouts in the transactions.
>>> -Daniel
>>
>>
>> So, say, writing to some harmless scratch register in a loop many times,
>> both for the plugged and the unplugged case, and measuring the total time
>> delta?
>
> I think we should at least measure the following:
>
> 1. Writing X times to a scratch reg without your patch.
> 2. Writing X times to a scratch reg with your patch.
> 3. Writing X times to a scratch reg with the hardware physically disconnected.
>
> I suggest repeating that once for Polaris (or older) and once for Vega or Navi.
>
> The SRBM on Polaris is meant to introduce some delay into each access, so it
> might react differently than the newer hardware.
>
> Christian.


Will do.
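
Something along these lines is what I have in mind (rough, untested
sketch; mmSCRATCH_REG0 is just a stand-in for whichever harmless register
we settle on):

	static void amdgpu_wreg_benchmark(struct amdgpu_device *adev)
	{
		ktime_t start = ktime_get();
		int i;

		/* hammer a register that doesn't change hw state */
		for (i = 0; i < 1000000; i++)
			WREG32(mmSCRATCH_REG0, i);

		/* one readback to flush the posted writes */
		(void)RREG32(mmSCRATCH_REG0);

		DRM_INFO("1M scratch reg writes took %lld us\n",
			 ktime_us_delta(ktime_get(), start));
	}

Run once without the patch, once with it, and once with the card plugged
out, on both generations as you suggest.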

Andrey


>
>>
>> Andrey
>>
>>
>>>
>>>> The other solution would be, as I suggested, to keep all the device IO
>>>> ranges reserved and the system memory pages unfreed until the device is
>>>> finalized in the driver, but Daniel said this would upset the PCI layer
>>>> (the MMIO ranges reservation part).
>>>>
>>>> Andrey
>>>>
>>>>
>>>>
>>>>
>>>> On 1/19/21 3:55 AM, Christian König wrote:
>>>>> On 18.01.21 at 22:01, Andrey Grodzovsky wrote:
>>>>>> This should prevent writing to memory or IO ranges possibly
>>>>>> already allocated for other uses after our device is removed.
>>>>> Wow, that adds quite some overhead to every register access. I'm not sure we
>>>>> can do this.
>>>>>
>>>>> Christian.
>>>>>
>>>>>> [full patch snipped]
>
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 196+ messages in thread

* Re: [PATCH v4 10/14] dmr/amdgpu: Move some sysfs attrs creation to default_attr
  2021-01-19 19:04           ` Alex Deucher
@ 2021-01-19 19:41             ` Greg KH
  -1 siblings, 0 replies; 196+ messages in thread
From: Greg KH @ 2021-01-19 19:41 UTC (permalink / raw)
  To: Alex Deucher
  Cc: Daniel Vetter, amd-gfx list, Maling list - DRI developers,
	Christian König, Deucher, Alexander, Qiang Yu

On Tue, Jan 19, 2021 at 02:04:48PM -0500, Alex Deucher wrote:
> On Tue, Jan 19, 2021 at 1:26 PM Greg KH <gregkh@linuxfoundation.org> wrote:
> >
> > On Tue, Jan 19, 2021 at 11:36:01AM -0500, Andrey Grodzovsky wrote:
> > >
> > > On 1/19/21 2:34 AM, Greg KH wrote:
> > > > On Mon, Jan 18, 2021 at 04:01:19PM -0500, Andrey Grodzovsky wrote:
> > > > >   static struct pci_driver amdgpu_kms_pci_driver = {
> > > > >           .name = DRIVER_NAME,
> > > > >           .id_table = pciidlist,
> > > > > @@ -1595,6 +1607,7 @@ static struct pci_driver amdgpu_kms_pci_driver = {
> > > > >           .shutdown = amdgpu_pci_shutdown,
> > > > >           .driver.pm = &amdgpu_pm_ops,
> > > > >           .err_handler = &amdgpu_pci_err_handler,
> > > > > + .driver.dev_groups = amdgpu_sysfs_groups,
> > > > Shouldn't this just be:
> > > >     groups = amdgpu_sysfs_groups,
> > > >
> > > > Why go to the "driver root" here?
> > >
> > >
> > > Because I still didn't get to your suggestion to propose a patch adding
> > > groups to pci_driver; for now it's located in the 'base' driver struct.
> >
> > You are a pci driver, you should never have to mess with the "base"
> > driver struct.  Look at commit 92d50fc1602e ("PCI/IB: add support for
> > pci driver attribute groups") which got merged in 4.14, way back in
> > 2017 :)
> 
> Per the previous discussion of this patch set:
> https://www.mail-archive.com/amd-gfx@lists.freedesktop.org/msg56019.html

Hey, at least I'm consistent :)
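
To spell out what I'm suggesting, something like this (untested sketch,
just illustrating where the amdgpu_sysfs_groups array from the patch would
hang):

	static struct pci_driver amdgpu_kms_pci_driver = {
		.name		= DRIVER_NAME,
		.id_table	= pciidlist,
		...
		.groups		= amdgpu_sysfs_groups,
	};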
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 196+ messages in thread

* Re: [PATCH v4 07/14] drm/amdgpu: Register IOMMU topology notifier per device.
  2021-01-19 13:45       ` Daniel Vetter
@ 2021-01-19 21:21         ` Andrey Grodzovsky
  -1 siblings, 0 replies; 196+ messages in thread
From: Andrey Grodzovsky @ 2021-01-19 21:21 UTC (permalink / raw)
  To: Daniel Vetter, christian.koenig
  Cc: amd-gfx, daniel.vetter, dri-devel, yuq825, gregkh, Alexander.Deucher


On 1/19/21 8:45 AM, Daniel Vetter wrote:
> On Tue, Jan 19, 2021 at 09:48:03AM +0100, Christian König wrote:
>> On 18.01.21 at 22:01, Andrey Grodzovsky wrote:
>>> Handle all DMA IOMMU group related dependencies before the
>>> group is removed.
>>>
>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>>> ---
>>>    drivers/gpu/drm/amd/amdgpu/amdgpu.h        |  5 ++++
>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 46 ++++++++++++++++++++++++++++++
>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c   |  2 +-
>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h   |  1 +
>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 10 +++++++
>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_object.h |  2 ++
>>>    6 files changed, 65 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>> index 478a7d8..2953420 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>> @@ -51,6 +51,7 @@
>>>    #include <linux/dma-fence.h>
>>>    #include <linux/pci.h>
>>>    #include <linux/aer.h>
>>> +#include <linux/notifier.h>
>>>    #include <drm/ttm/ttm_bo_api.h>
>>>    #include <drm/ttm/ttm_bo_driver.h>
>>> @@ -1041,6 +1042,10 @@ struct amdgpu_device {
>>>    	bool                            in_pci_err_recovery;
>>>    	struct pci_saved_state          *pci_state;
>>> +
>>> +	struct notifier_block		nb;
>>> +	struct blocking_notifier_head	notifier;
>>> +	struct list_head		device_bo_list;
>>>    };
>>>    static inline struct amdgpu_device *drm_to_adev(struct drm_device *ddev)
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>> index 45e23e3..e99f4f1 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>> @@ -70,6 +70,8 @@
>>>    #include <drm/task_barrier.h>
>>>    #include <linux/pm_runtime.h>
>>> +#include <linux/iommu.h>
>>> +
>>>    MODULE_FIRMWARE("amdgpu/vega10_gpu_info.bin");
>>>    MODULE_FIRMWARE("amdgpu/vega12_gpu_info.bin");
>>>    MODULE_FIRMWARE("amdgpu/raven_gpu_info.bin");
>>> @@ -3200,6 +3202,39 @@ static const struct attribute *amdgpu_dev_attributes[] = {
>>>    };
>>> +static int amdgpu_iommu_group_notifier(struct notifier_block *nb,
>>> +				     unsigned long action, void *data)
>>> +{
>>> +	struct amdgpu_device *adev = container_of(nb, struct amdgpu_device, nb);
>>> +	struct amdgpu_bo *bo = NULL;
>>> +
>>> +	/*
>>> +	 * Following is a set of IOMMU group dependencies taken care of before
>>> +	 * device's IOMMU group is removed
>>> +	 */
>>> +	if (action == IOMMU_GROUP_NOTIFY_DEL_DEVICE) {
>>> +
>>> +		spin_lock(&ttm_bo_glob.lru_lock);
>>> +		list_for_each_entry(bo, &adev->device_bo_list, bo) {
>>> +			if (bo->tbo.ttm)
>>> +				ttm_tt_unpopulate(bo->tbo.bdev, bo->tbo.ttm);
>>> +		}
>>> +		spin_unlock(&ttm_bo_glob.lru_lock);
>> That approach won't work. ttm_tt_unpopulate() might sleep on an IOMMU lock.
>>
>> You need to use a mutex here or even better make sure you can access the
>> device_bo_list without a lock in this moment.
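
(A minimal sketch of what Christian is suggesting: a dedicated, sleepable
mutex protecting the BO list, so that ttm_tt_unpopulate() is free to sleep
while the list is walked. The member name device_bo_list_lock is
hypothetical, not from the patch:)

static int amdgpu_iommu_group_notifier(struct notifier_block *nb,
				       unsigned long action, void *data)
{
	struct amdgpu_device *adev = container_of(nb, struct amdgpu_device, nb);
	struct amdgpu_bo *bo;

	if (action == IOMMU_GROUP_NOTIFY_DEL_DEVICE) {
		/* A mutex instead of ttm_bo_glob.lru_lock: ttm_tt_unpopulate()
		 * may take an IOMMU lock and sleep, which is not allowed
		 * inside a spinlock critical section. */
		mutex_lock(&adev->device_bo_list_lock);
		list_for_each_entry(bo, &adev->device_bo_list, bo) {
			if (bo->tbo.ttm)
				ttm_tt_unpopulate(bo->tbo.bdev, bo->tbo.ttm);
		}
		mutex_unlock(&adev->device_bo_list_lock);
	}

	return NOTIFY_OK;
}
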
> I'd also be worried about the notifier mutex getting really badly in the
> way.
>
> Plus I'm worried why we even need this, it sounds a bit like papering over
> the iommu subsystem. Assuming we clean up all our iommu mappings in our
> device hotunplug/unload code, why do we still need to have an additional
> iommu notifier on top, with all kinds of additional headaches? The iommu
> shouldn't clean up before the devices in its group have cleaned up.
>
> I think we need more info here on what the exact problem is first.
> -Daniel


Originally I hit the crash below on an IOMMU-enabled device. It happens post
device removal from the PCI topology, while the last user client holding a
reference to the drm device file (X in my case) is shutting down.
The crash occurs because by that point the struct device->iommu_group pointer
is already NULL, since the IOMMU group for the device is unset during PCI
removal. So this contradicts what you said above, that the iommu shouldn't
clean up before the devices in its group have cleaned up.
So instead of guessing where the right place for all the IOMMU related
cleanups is, it makes sense to get a notification from the IOMMU subsystem in
the form of the IOMMU_GROUP_NOTIFY_DEL_DEVICE event and do all the relevant
cleanups there.

Andrey


[  123.810074 <   28.126960>] BUG: kernel NULL pointer dereference, address: 00000000000000c8
[  123.810080 <    0.000006>] #PF: supervisor read access in kernel mode
[  123.810082 <    0.000002>] #PF: error_code(0x0000) - not-present page
[  123.810085 <    0.000003>] PGD 0 P4D 0
[  123.810089 <    0.000004>] Oops: 0000 [#1] SMP NOPTI
[  123.810094 <    0.000005>] CPU: 5 PID: 1418 Comm: Xorg:shlo4 Tainted: G           O      5.9.0-rc2-dev+ #59
[  123.810096 <    0.000002>] Hardware name: System manufacturer System Product Name/PRIME X470-PRO, BIOS 4406 02/28/2019
[  123.810105 <    0.000009>] RIP: 0010:iommu_get_dma_domain+0x10/0x20
[  123.810108 <    0.000003>] Code: b0 48 c7 87 98 00 00 00 00 00 00 00 31 c0 c3 b8 f4 ff ff ff eb a6 0f 1f 40 00 0f 1f 44 00 00 48 8b 87 d0 02 00 00 55 48 89 e5 <48> 8b 80 c8 00 00 00 5d c3 0f 1f 80 00 00 00 00 0f 1f 44 00 00 48
[  123.810111 <    0.000003>] RSP: 0018:ffffa2e201f7f980 EFLAGS: 00010246
[  123.810114 <    0.000003>] RAX: 0000000000000000 RBX: 0000000000001000 RCX: 0000000000000000
[  123.810116 <    0.000002>] RDX: 0000000000001000 RSI: 00000000bf5cb000 RDI: ffff93c259dc60b0
[  123.810117 <    0.000001>] RBP: ffffa2e201f7f980 R08: 0000000000000000 R09: 0000000000000000
[  123.810119 <    0.000002>] R10: ffffa2e201f7faf0 R11: 0000000000000001 R12: 00000000bf5cb000
[  123.810121 <    0.000002>] R13: 0000000000001000 R14: ffff93c24cef9c50 R15: ffff93c256c05688
[  123.810124 <    0.000003>] FS:  00007f5e5e8d3700(0000) GS:ffff93c25e940000(0000) knlGS:0000000000000000
[  123.810126 <    0.000002>] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  123.810128 <    0.000002>] CR2: 00000000000000c8 CR3: 000000027fe0a000 CR4: 00000000003506e0
[  123.810130 <    0.000002>] Call Trace:
[  123.810136 <    0.000006>]  __iommu_dma_unmap+0x2e/0x100
[  123.810141 <    0.000005>]  ? kfree+0x389/0x3a0
[  123.810144 <    0.000003>]  iommu_dma_unmap_page+0xe/0x10
[  123.810149 <    0.000005>]  dma_unmap_page_attrs+0x4d/0xf0
[  123.810159 <    0.000010>]  ? ttm_bo_del_from_lru+0x8e/0xb0 [ttm]
[  123.810165 <    0.000006>]  ttm_unmap_and_unpopulate_pages+0x8e/0xc0 [ttm]
[  123.810252 <    0.000087>]  amdgpu_ttm_tt_unpopulate+0xaa/0xd0 [amdgpu]
[  123.810258 <    0.000006>]  ttm_tt_unpopulate+0x59/0x70 [ttm]
[  123.810264 <    0.000006>]  ttm_tt_destroy+0x6a/0x70 [ttm]
[  123.810270 <    0.000006>]  ttm_bo_cleanup_memtype_use+0x36/0xa0 [ttm]
[  123.810276 <    0.000006>]  ttm_bo_put+0x1e7/0x400 [ttm]
[  123.810358 <    0.000082>]  amdgpu_bo_unref+0x1e/0x30 [amdgpu]
[  123.810440 <    0.000082>]  amdgpu_gem_object_free+0x37/0x50 [amdgpu]
[  123.810459 <    0.000019>]  drm_gem_object_free+0x35/0x40 [drm]
[  123.810476 <    0.000017>]  drm_gem_object_handle_put_unlocked+0x9d/0xd0 [drm]
[  123.810494 <    0.000018>]  drm_gem_object_release_handle+0x74/0x90 [drm]
[  123.810511 <    0.000017>]  ? drm_gem_object_handle_put_unlocked+0xd0/0xd0 [drm]
[  123.810516 <    0.000005>]  idr_for_each+0x4d/0xd0
[  123.810534 <    0.000018>]  drm_gem_release+0x20/0x30 [drm]
[  123.810550 <    0.000016>]  drm_file_free+0x251/0x2a0 [drm]
[  123.810567 <    0.000017>]  drm_close_helper.isra.14+0x61/0x70 [drm]
[  123.810583 <    0.000016>]  drm_release+0x6a/0xe0 [drm]
[  123.810588 <    0.000005>]  __fput+0xa2/0x250
[  123.810592 <    0.000004>]  ____fput+0xe/0x10
[  123.810595 <    0.000003>]  task_work_run+0x6c/0xa0
[  123.810600 <    0.000005>]  do_exit+0x376/0xb60
[  123.810604 <    0.000004>]  do_group_exit+0x43/0xa0
[  123.810608 <    0.000004>]  get_signal+0x18b/0x8e0
[  123.810612 <    0.000004>]  ? do_futex+0x595/0xc20
[  123.810617 <    0.000005>]  arch_do_signal+0x34/0x880
[  123.810620 <    0.000003>]  ? check_preempt_curr+0x50/0x60
[  123.810623 <    0.000003>]  ? ttwu_do_wakeup+0x1e/0x160
[  123.810626 <    0.000003>]  ? ttwu_do_activate+0x61/0x70
[  123.810630 <    0.000004>]  exit_to_user_mode_prepare+0x124/0x1b0
[  123.810635 <    0.000005>]  syscall_exit_to_user_mode+0x31/0x170
[  123.810639 <    0.000004>]  do_syscall_64+0x43/0x80
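
(The faulting address 0xc8 matches a read through the NULL group pointer: a
simplified sketch of iommu_get_dma_domain() as of 5.9-era kernels; the exact
field layout is an approximation, not taken from the thread:)

struct iommu_domain *iommu_get_dma_domain(struct device *dev)
{
	/* dev->iommu_group is already NULL after PCI removal, so loading
	 * default_domain (at offset 0xc8 into struct iommu_group here)
	 * faults at address 00000000000000c8. */
	return dev->iommu_group->default_domain;
}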


Andrey


>
>> Christian.
>>
>>> +
>>> +		if (adev->irq.ih.use_bus_addr)
>>> +			amdgpu_ih_ring_fini(adev, &adev->irq.ih);
>>> +		if (adev->irq.ih1.use_bus_addr)
>>> +			amdgpu_ih_ring_fini(adev, &adev->irq.ih1);
>>> +		if (adev->irq.ih2.use_bus_addr)
>>> +			amdgpu_ih_ring_fini(adev, &adev->irq.ih2);
>>> +
>>> +		amdgpu_gart_dummy_page_fini(adev);
>>> +	}
>>> +
>>> +	return NOTIFY_OK;
>>> +}
>>> +
>>> +
>>>    /**
>>>     * amdgpu_device_init - initialize the driver
>>>     *
>>> @@ -3304,6 +3339,8 @@ int amdgpu_device_init(struct amdgpu_device *adev,
>>>    	INIT_WORK(&adev->xgmi_reset_work, amdgpu_device_xgmi_reset_func);
>>> +	INIT_LIST_HEAD(&adev->device_bo_list);
>>> +
>>>    	adev->gfx.gfx_off_req_count = 1;
>>>    	adev->pm.ac_power = power_supply_is_system_supplied() > 0;
>>> @@ -3575,6 +3612,15 @@ int amdgpu_device_init(struct amdgpu_device *adev,
>>>    	if (amdgpu_device_cache_pci_state(adev->pdev))
>>>    		pci_restore_state(pdev);
>>> +	BLOCKING_INIT_NOTIFIER_HEAD(&adev->notifier);
>>> +	adev->nb.notifier_call = amdgpu_iommu_group_notifier;
>>> +
>>> +	if (adev->dev->iommu_group) {
>>> +		r = iommu_group_register_notifier(adev->dev->iommu_group, &adev->nb);
>>> +		if (r)
>>> +			goto failed;
>>> +	}
>>> +
>>>    	return 0;
>>>    failed:
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
>>> index 0db9330..486ad6d 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
>>> @@ -92,7 +92,7 @@ static int amdgpu_gart_dummy_page_init(struct amdgpu_device *adev)
>>>     *
>>>     * Frees the dummy page used by the driver (all asics).
>>>     */
>>> -static void amdgpu_gart_dummy_page_fini(struct amdgpu_device *adev)
>>> +void amdgpu_gart_dummy_page_fini(struct amdgpu_device *adev)
>>>    {
>>>    	if (!adev->dummy_page_addr)
>>>    		return;
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h
>>> index afa2e28..5678d9c 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h
>>> @@ -61,6 +61,7 @@ int amdgpu_gart_table_vram_pin(struct amdgpu_device *adev);
>>>    void amdgpu_gart_table_vram_unpin(struct amdgpu_device *adev);
>>>    int amdgpu_gart_init(struct amdgpu_device *adev);
>>>    void amdgpu_gart_fini(struct amdgpu_device *adev);
>>> +void amdgpu_gart_dummy_page_fini(struct amdgpu_device *adev);
>>>    int amdgpu_gart_unbind(struct amdgpu_device *adev, uint64_t offset,
>>>    		       int pages);
>>>    int amdgpu_gart_map(struct amdgpu_device *adev, uint64_t offset,
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>> index 6cc9919..4a1de69 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>> @@ -94,6 +94,10 @@ static void amdgpu_bo_destroy(struct ttm_buffer_object *tbo)
>>>    	}
>>>    	amdgpu_bo_unref(&bo->parent);
>>> +	spin_lock(&ttm_bo_glob.lru_lock);
>>> +	list_del(&bo->bo);
>>> +	spin_unlock(&ttm_bo_glob.lru_lock);
>>> +
>>>    	kfree(bo->metadata);
>>>    	kfree(bo);
>>>    }
>>> @@ -613,6 +617,12 @@ static int amdgpu_bo_do_create(struct amdgpu_device *adev,
>>>    	if (bp->type == ttm_bo_type_device)
>>>    		bo->flags &= ~AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED;
>>> +	INIT_LIST_HEAD(&bo->bo);
>>> +
>>> +	spin_lock(&ttm_bo_glob.lru_lock);
>>> +	list_add_tail(&bo->bo, &adev->device_bo_list);
>>> +	spin_unlock(&ttm_bo_glob.lru_lock);
>>> +
>>>    	return 0;
>>>    fail_unreserve:
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
>>> index 9ac3756..5ae8555 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
>>> @@ -110,6 +110,8 @@ struct amdgpu_bo {
>>>    	struct list_head		shadow_list;
>>>    	struct kgd_mem                  *kfd_bo;
>>> +
>>> +	struct list_head		bo;
>>>    };
>>>    static inline struct amdgpu_bo *ttm_to_amdgpu_bo(struct ttm_buffer_object *tbo)



_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 196+ messages in thread

* Re: [PATCH v4 07/14] drm/amdgpu: Register IOMMU topology notifier per device.
  2021-01-19 21:21         ` Andrey Grodzovsky
@ 2021-01-19 22:01           ` Daniel Vetter
  -1 siblings, 0 replies; 196+ messages in thread
From: Daniel Vetter @ 2021-01-19 22:01 UTC (permalink / raw)
  To: Andrey Grodzovsky
  Cc: amd-gfx list, Greg KH, dri-devel, Qiang Yu, Alex Deucher,
	Christian König

On Tue, Jan 19, 2021 at 10:22 PM Andrey Grodzovsky
<Andrey.Grodzovsky@amd.com> wrote:
>
>
> On 1/19/21 8:45 AM, Daniel Vetter wrote:
>
> On Tue, Jan 19, 2021 at 09:48:03AM +0100, Christian König wrote:
>
> Am 18.01.21 um 22:01 schrieb Andrey Grodzovsky:
>
> Handle all DMA IOMMU group related dependencies before the
> group is removed.
>
> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu.h        |  5 ++++
>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 46 ++++++++++++++++++++++++++++++
>   drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c   |  2 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h   |  1 +
>   drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 10 +++++++
>   drivers/gpu/drm/amd/amdgpu/amdgpu_object.h |  2 ++
>   6 files changed, 65 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> index 478a7d8..2953420 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> @@ -51,6 +51,7 @@
>   #include <linux/dma-fence.h>
>   #include <linux/pci.h>
>   #include <linux/aer.h>
> +#include <linux/notifier.h>
>   #include <drm/ttm/ttm_bo_api.h>
>   #include <drm/ttm/ttm_bo_driver.h>
> @@ -1041,6 +1042,10 @@ struct amdgpu_device {
>   bool                            in_pci_err_recovery;
>   struct pci_saved_state          *pci_state;
> +
> + struct notifier_block nb;
> + struct blocking_notifier_head notifier;
> + struct list_head device_bo_list;
>   };
>   static inline struct amdgpu_device *drm_to_adev(struct drm_device *ddev)
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 45e23e3..e99f4f1 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -70,6 +70,8 @@
>   #include <drm/task_barrier.h>
>   #include <linux/pm_runtime.h>
> +#include <linux/iommu.h>
> +
>   MODULE_FIRMWARE("amdgpu/vega10_gpu_info.bin");
>   MODULE_FIRMWARE("amdgpu/vega12_gpu_info.bin");
>   MODULE_FIRMWARE("amdgpu/raven_gpu_info.bin");
> @@ -3200,6 +3202,39 @@ static const struct attribute *amdgpu_dev_attributes[] = {
>   };
> +static int amdgpu_iommu_group_notifier(struct notifier_block *nb,
> +     unsigned long action, void *data)
> +{
> + struct amdgpu_device *adev = container_of(nb, struct amdgpu_device, nb);
> + struct amdgpu_bo *bo = NULL;
> +
> + /*
> + * Following is a set of IOMMU group dependencies taken care of before
> + * device's IOMMU group is removed
> + */
> + if (action == IOMMU_GROUP_NOTIFY_DEL_DEVICE) {
> +
> + spin_lock(&ttm_bo_glob.lru_lock);
> + list_for_each_entry(bo, &adev->device_bo_list, bo) {
> + if (bo->tbo.ttm)
> + ttm_tt_unpopulate(bo->tbo.bdev, bo->tbo.ttm);
> + }
> + spin_unlock(&ttm_bo_glob.lru_lock);
>
> That approach won't work. ttm_tt_unpopulate() might sleep on an IOMMU lock.
>
> You need to use a mutex here or even better make sure you can access the
> device_bo_list without a lock in this moment.
>
> I'd also be worried about the notifier mutex getting really badly in the
> way.
>
> Plus I'm worried why we even need this, it sounds a bit like papering over
> the iommu subsystem. Assuming we clean up all our iommu mappings in our
> device hotunplug/unload code, why do we still need to have an additional
> iommu notifier on top, with all kinds of additional headaches? The iommu
> shouldn't clean up before the devices in its group have cleaned up.
>
> I think we need more info here on what the exact problem is first.
> -Daniel
>
>
> Originally I hit the crash below on an IOMMU-enabled device. It happens post device removal from the PCI topology,
> while the last user client holding a reference to the drm device file (X in my case) is shutting down.
> The crash occurs because by that point the struct device->iommu_group pointer is already NULL,
> since the IOMMU group for the device is unset during PCI removal. So this contradicts what you said above,
> that the iommu shouldn't clean up before the devices in its group have cleaned up.
> So instead of guessing where the right place for all the IOMMU related cleanups is, it makes sense
> to get a notification from the IOMMU subsystem in the form of the IOMMU_GROUP_NOTIFY_DEL_DEVICE event
> and do all the relevant cleanups there.

Yeah that goes boom, but you shouldn't need this special iommu cleanup
handler. Making sure that all the dma-api mappings are gone needs to
be done as part of the device hotunplug, you can't delay that to the
last drm_device cleanup.

So I think most of the patch here, with pulling that out (it should be
outright removed from the final release code, even), is good; just not yet
how you call that new code. Probably these bits (aside from walking all
buffers and unpopulating the tt) should be done from the early_free
callback you're adding.

Also what I just realized: For normal unload you need to make sure the
hw is actually stopped first, before we unmap buffers. Otherwise
driver unload will likely result in wedged hw, probably not what you
want for debugging.
-Daniel
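
(A rough sketch of the ordering Daniel describes, under the assumption of the
hw/sw fini split from this series; amdgpu_unmap_all_dma_mappings is a
hypothetical placeholder, not an existing function:)

static void amdgpu_pci_remove(struct pci_dev *pdev)
{
	struct drm_device *dev = pci_get_drvdata(pdev);
	struct amdgpu_device *adev = drm_to_adev(dev);

	drm_dev_unplug(dev);

	/* 1. Stop the hardware first, so nothing is still DMA-ing. */
	amdgpu_device_fini_hw(adev);

	/* 2. Only now tear down the dma-api/IOMMU mappings; this must not
	 * be deferred to the last drm_device reference drop. */
	amdgpu_unmap_all_dma_mappings(adev);	/* hypothetical helper */

	/* 3. SW state lives on until the last drm_device reference drops. */
	drm_dev_put(dev);
}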

> Andrey
>
>
> [  123.810074 <   28.126960>] BUG: kernel NULL pointer dereference, address: 00000000000000c8
> [  123.810080 <    0.000006>] #PF: supervisor read access in kernel mode
> [  123.810082 <    0.000002>] #PF: error_code(0x0000) - not-present page
> [  123.810085 <    0.000003>] PGD 0 P4D 0
> [  123.810089 <    0.000004>] Oops: 0000 [#1] SMP NOPTI
> [  123.810094 <    0.000005>] CPU: 5 PID: 1418 Comm: Xorg:shlo4 Tainted: G           O      5.9.0-rc2-dev+ #59
> [  123.810096 <    0.000002>] Hardware name: System manufacturer System Product Name/PRIME X470-PRO, BIOS 4406 02/28/2019
> [  123.810105 <    0.000009>] RIP: 0010:iommu_get_dma_domain+0x10/0x20
> [  123.810108 <    0.000003>] Code: b0 48 c7 87 98 00 00 00 00 00 00 00 31 c0 c3 b8 f4 ff ff ff eb a6 0f 1f 40 00 0f 1f 44 00 00 48 8b 87 d0 02 00 00 55 48 89 e5 <48> 8b 80 c8 00 00 00 5d c3 0f 1f 80 00 00 00 00 0f 1f 44 00 00 48
> [  123.810111 <    0.000003>] RSP: 0018:ffffa2e201f7f980 EFLAGS: 00010246
> [  123.810114 <    0.000003>] RAX: 0000000000000000 RBX: 0000000000001000 RCX: 0000000000000000
> [  123.810116 <    0.000002>] RDX: 0000000000001000 RSI: 00000000bf5cb000 RDI: ffff93c259dc60b0
> [  123.810117 <    0.000001>] RBP: ffffa2e201f7f980 R08: 0000000000000000 R09: 0000000000000000
> [  123.810119 <    0.000002>] R10: ffffa2e201f7faf0 R11: 0000000000000001 R12: 00000000bf5cb000
> [  123.810121 <    0.000002>] R13: 0000000000001000 R14: ffff93c24cef9c50 R15: ffff93c256c05688
> [  123.810124 <    0.000003>] FS:  00007f5e5e8d3700(0000) GS:ffff93c25e940000(0000) knlGS:0000000000000000
> [  123.810126 <    0.000002>] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  123.810128 <    0.000002>] CR2: 00000000000000c8 CR3: 000000027fe0a000 CR4: 00000000003506e0
> [  123.810130 <    0.000002>] Call Trace:
> [  123.810136 <    0.000006>]  __iommu_dma_unmap+0x2e/0x100
> [  123.810141 <    0.000005>]  ? kfree+0x389/0x3a0
> [  123.810144 <    0.000003>]  iommu_dma_unmap_page+0xe/0x10
> [  123.810149 <    0.000005>] dma_unmap_page_attrs+0x4d/0xf0
> [  123.810159 <    0.000010>]  ? ttm_bo_del_from_lru+0x8e/0xb0 [ttm]
> [  123.810165 <    0.000006>] ttm_unmap_and_unpopulate_pages+0x8e/0xc0 [ttm]
> [  123.810252 <    0.000087>] amdgpu_ttm_tt_unpopulate+0xaa/0xd0 [amdgpu]
> [  123.810258 <    0.000006>]  ttm_tt_unpopulate+0x59/0x70 [ttm]
> [  123.810264 <    0.000006>]  ttm_tt_destroy+0x6a/0x70 [ttm]
> [  123.810270 <    0.000006>] ttm_bo_cleanup_memtype_use+0x36/0xa0 [ttm]
> [  123.810276 <    0.000006>]  ttm_bo_put+0x1e7/0x400 [ttm]
> [  123.810358 <    0.000082>]  amdgpu_bo_unref+0x1e/0x30 [amdgpu]
> [  123.810440 <    0.000082>] amdgpu_gem_object_free+0x37/0x50 [amdgpu]
> [  123.810459 <    0.000019>]  drm_gem_object_free+0x35/0x40 [drm]
> [  123.810476 <    0.000017>] drm_gem_object_handle_put_unlocked+0x9d/0xd0 [drm]
> [  123.810494 <    0.000018>] drm_gem_object_release_handle+0x74/0x90 [drm]
> [  123.810511 <    0.000017>]  ? drm_gem_object_handle_put_unlocked+0xd0/0xd0 [drm]
> [  123.810516 <    0.000005>]  idr_for_each+0x4d/0xd0
> [  123.810534 <    0.000018>]  drm_gem_release+0x20/0x30 [drm]
> [  123.810550 <    0.000016>]  drm_file_free+0x251/0x2a0 [drm]
> [  123.810567 <    0.000017>] drm_close_helper.isra.14+0x61/0x70 [drm]
> [  123.810583 <    0.000016>]  drm_release+0x6a/0xe0 [drm]
> [  123.810588 <    0.000005>]  __fput+0xa2/0x250
> [  123.810592 <    0.000004>]  ____fput+0xe/0x10
> [  123.810595 <    0.000003>]  task_work_run+0x6c/0xa0
> [  123.810600 <    0.000005>]  do_exit+0x376/0xb60
> [  123.810604 <    0.000004>]  do_group_exit+0x43/0xa0
> [  123.810608 <    0.000004>]  get_signal+0x18b/0x8e0
> [  123.810612 <    0.000004>]  ? do_futex+0x595/0xc20
> [  123.810617 <    0.000005>]  arch_do_signal+0x34/0x880
> [  123.810620 <    0.000003>]  ? check_preempt_curr+0x50/0x60
> [  123.810623 <    0.000003>]  ? ttwu_do_wakeup+0x1e/0x160
> [  123.810626 <    0.000003>]  ? ttwu_do_activate+0x61/0x70
> [  123.810630 <    0.000004>] exit_to_user_mode_prepare+0x124/0x1b0
> [  123.810635 <    0.000005>] syscall_exit_to_user_mode+0x31/0x170
> [  123.810639 <    0.000004>]  do_syscall_64+0x43/0x80
>
>
> Andrey
>
>
>
> Christian.
>
> +
> + if (adev->irq.ih.use_bus_addr)
> + amdgpu_ih_ring_fini(adev, &adev->irq.ih);
> + if (adev->irq.ih1.use_bus_addr)
> + amdgpu_ih_ring_fini(adev, &adev->irq.ih1);
> + if (adev->irq.ih2.use_bus_addr)
> + amdgpu_ih_ring_fini(adev, &adev->irq.ih2);
> +
> + amdgpu_gart_dummy_page_fini(adev);
> + }
> +
> + return NOTIFY_OK;
> +}
> +
> +
>   /**
>    * amdgpu_device_init - initialize the driver
>    *
> @@ -3304,6 +3339,8 @@ int amdgpu_device_init(struct amdgpu_device *adev,
>   INIT_WORK(&adev->xgmi_reset_work, amdgpu_device_xgmi_reset_func);
> + INIT_LIST_HEAD(&adev->device_bo_list);
> +
>   adev->gfx.gfx_off_req_count = 1;
>   adev->pm.ac_power = power_supply_is_system_supplied() > 0;
> @@ -3575,6 +3612,15 @@ int amdgpu_device_init(struct amdgpu_device *adev,
>   if (amdgpu_device_cache_pci_state(adev->pdev))
>   pci_restore_state(pdev);
> + BLOCKING_INIT_NOTIFIER_HEAD(&adev->notifier);
> + adev->nb.notifier_call = amdgpu_iommu_group_notifier;
> +
> + if (adev->dev->iommu_group) {
> + r = iommu_group_register_notifier(adev->dev->iommu_group, &adev->nb);
> + if (r)
> + goto failed;
> + }
> +
>   return 0;
>   failed:
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
> index 0db9330..486ad6d 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
> @@ -92,7 +92,7 @@ static int amdgpu_gart_dummy_page_init(struct amdgpu_device *adev)
>    *
>    * Frees the dummy page used by the driver (all asics).
>    */
> -static void amdgpu_gart_dummy_page_fini(struct amdgpu_device *adev)
> +void amdgpu_gart_dummy_page_fini(struct amdgpu_device *adev)
>   {
>   if (!adev->dummy_page_addr)
>   return;
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h
> index afa2e28..5678d9c 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h
> @@ -61,6 +61,7 @@ int amdgpu_gart_table_vram_pin(struct amdgpu_device *adev);
>   void amdgpu_gart_table_vram_unpin(struct amdgpu_device *adev);
>   int amdgpu_gart_init(struct amdgpu_device *adev);
>   void amdgpu_gart_fini(struct amdgpu_device *adev);
> +void amdgpu_gart_dummy_page_fini(struct amdgpu_device *adev);
>   int amdgpu_gart_unbind(struct amdgpu_device *adev, uint64_t offset,
>         int pages);
>   int amdgpu_gart_map(struct amdgpu_device *adev, uint64_t offset,
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> index 6cc9919..4a1de69 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> @@ -94,6 +94,10 @@ static void amdgpu_bo_destroy(struct ttm_buffer_object *tbo)
>   }
>   amdgpu_bo_unref(&bo->parent);
> + spin_lock(&ttm_bo_glob.lru_lock);
> + list_del(&bo->bo);
> + spin_unlock(&ttm_bo_glob.lru_lock);
> +
>   kfree(bo->metadata);
>   kfree(bo);
>   }
> @@ -613,6 +617,12 @@ static int amdgpu_bo_do_create(struct amdgpu_device *adev,
>   if (bp->type == ttm_bo_type_device)
>   bo->flags &= ~AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED;
> + INIT_LIST_HEAD(&bo->bo);
> +
> + spin_lock(&ttm_bo_glob.lru_lock);
> + list_add_tail(&bo->bo, &adev->device_bo_list);
> + spin_unlock(&ttm_bo_glob.lru_lock);
> +
>   return 0;
>   fail_unreserve:
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
> index 9ac3756..5ae8555 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
> @@ -110,6 +110,8 @@ struct amdgpu_bo {
>   struct list_head shadow_list;
>   struct kgd_mem                  *kfd_bo;
> +
> + struct list_head bo;
>   };
>   static inline struct amdgpu_bo *ttm_to_amdgpu_bo(struct ttm_buffer_object *tbo)



-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 196+ messages in thread

* Re: [PATCH v4 07/14] drm/amdgpu: Register IOMMU topology notifier per device.
  2021-01-19 22:01           ` Daniel Vetter
@ 2021-01-20  4:21             ` Andrey Grodzovsky
  -1 siblings, 0 replies; 196+ messages in thread
From: Andrey Grodzovsky @ 2021-01-20  4:21 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Greg KH, dri-devel, amd-gfx list, Alex Deucher,
	Christian König, Qiang Yu


On 1/19/21 5:01 PM, Daniel Vetter wrote:
> On Tue, Jan 19, 2021 at 10:22 PM Andrey Grodzovsky
> <Andrey.Grodzovsky@amd.com> wrote:
>>
>> On 1/19/21 8:45 AM, Daniel Vetter wrote:
>>
>> On Tue, Jan 19, 2021 at 09:48:03AM +0100, Christian König wrote:
>>
>> Am 18.01.21 um 22:01 schrieb Andrey Grodzovsky:
>>
>> Handle all DMA IOMMU group related dependencies before the
>> group is removed.
>>
>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>> ---
>>    drivers/gpu/drm/amd/amdgpu/amdgpu.h        |  5 ++++
>>    drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 46 ++++++++++++++++++++++++++++++
>>    drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c   |  2 +-
>>    drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h   |  1 +
>>    drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 10 +++++++
>>    drivers/gpu/drm/amd/amdgpu/amdgpu_object.h |  2 ++
>>    6 files changed, 65 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>> index 478a7d8..2953420 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>> @@ -51,6 +51,7 @@
>>    #include <linux/dma-fence.h>
>>    #include <linux/pci.h>
>>    #include <linux/aer.h>
>> +#include <linux/notifier.h>
>>    #include <drm/ttm/ttm_bo_api.h>
>>    #include <drm/ttm/ttm_bo_driver.h>
>> @@ -1041,6 +1042,10 @@ struct amdgpu_device {
>>    bool                            in_pci_err_recovery;
>>    struct pci_saved_state          *pci_state;
>> +
>> + struct notifier_block nb;
>> + struct blocking_notifier_head notifier;
>> + struct list_head device_bo_list;
>>    };
>>    static inline struct amdgpu_device *drm_to_adev(struct drm_device *ddev)
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> index 45e23e3..e99f4f1 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> @@ -70,6 +70,8 @@
>>    #include <drm/task_barrier.h>
>>    #include <linux/pm_runtime.h>
>> +#include <linux/iommu.h>
>> +
>>    MODULE_FIRMWARE("amdgpu/vega10_gpu_info.bin");
>>    MODULE_FIRMWARE("amdgpu/vega12_gpu_info.bin");
>>    MODULE_FIRMWARE("amdgpu/raven_gpu_info.bin");
>> @@ -3200,6 +3202,39 @@ static const struct attribute *amdgpu_dev_attributes[] = {
>>    };
>> +static int amdgpu_iommu_group_notifier(struct notifier_block *nb,
>> +     unsigned long action, void *data)
>> +{
>> + struct amdgpu_device *adev = container_of(nb, struct amdgpu_device, nb);
>> + struct amdgpu_bo *bo = NULL;
>> +
>> + /*
>> + * Following is a set of IOMMU group dependencies taken care of before
>> + * device's IOMMU group is removed
>> + */
>> + if (action == IOMMU_GROUP_NOTIFY_DEL_DEVICE) {
>> +
>> + spin_lock(&ttm_bo_glob.lru_lock);
>> + list_for_each_entry(bo, &adev->device_bo_list, bo) {
>> + if (bo->tbo.ttm)
>> + ttm_tt_unpopulate(bo->tbo.bdev, bo->tbo.ttm);
>> + }
>> + spin_unlock(&ttm_bo_glob.lru_lock);
>>
>> That approach won't work. ttm_tt_unpopulate() might sleep on an IOMMU lock.
>>
>> You need to use a mutex here or even better make sure you can access the
>> device_bo_list without a lock in this moment.
>>
>> I'd also be worried about the notifier mutex getting really badly in the
>> way.
>>
>> Plus I'm worried why we even need this, it sounds a bit like papering over
>> the iommu subsystem. Assuming we clean up all our iommu mappings in our
>> device hotunplug/unload code, why do we still need to have an additional
>> iommu notifier on top, with all kinds of additional headaches? The iommu
>> shouldn't clean up before the devices in its group have cleaned up.
>>
>> I think we need more info here on what the exact problem is first.
>> -Daniel
>>
>>
>> Originally I hit the crash below on an IOMMU-enabled device. It happens post device removal from the PCI topology,
>> while the last user client holding a reference to the drm device file (X in my case) is shutting down.
>> The crash occurs because by that point the struct device->iommu_group pointer is already NULL,
>> since the IOMMU group for the device is unset during PCI removal. So this contradicts what you said above,
>> that the iommu shouldn't clean up before the devices in its group have cleaned up.
>> So instead of guessing where the right place for all the IOMMU related cleanups is, it makes sense
>> to get a notification from the IOMMU subsystem in the form of the IOMMU_GROUP_NOTIFY_DEL_DEVICE event
>> and do all the relevant cleanups there.
> Yeah that goes boom, but you shouldn't need this special iommu cleanup
> handler. Making sure that all the dma-api mappings are gone needs to
> be done as part of the device hotunplug, you can't delay that to the
> last drm_device cleanup.
>
> So I think most of the patch here, with pulling that out (it should be
> outright removed from the final release code, even), is good; just not yet
> how you call that new code. Probably these bits (aside from walking all
> buffers and unpopulating the tt) should be done from the early_free
> callback you're adding.
>
> Also what I just realized: For normal unload you need to make sure the
> hw is actually stopped first, before we unmap buffers. Otherwise
> driver unload will likely result in wedged hw, probably not what you
> want for debugging.
> -Daniel

Since device removal from the IOMMU group, and this hook in particular,
takes place before the call to amdgpu_pci_remove, essentially it means
that for the IOMMU use case the entire amdgpu_device_fini_hw function
should be called here to stop the HW, instead of from amdgpu_pci_remove.
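
Roughly, I mean something like the below in the notifier (just a sketch,
assuming amdgpu_device_fini_hw from this series can be made safe to call
from notifier context):

    static int amdgpu_iommu_group_notifier(struct notifier_block *nb,
                                           unsigned long action, void *data)
    {
            struct amdgpu_device *adev = container_of(nb, struct amdgpu_device, nb);

            if (action == IOMMU_GROUP_NOTIFY_DEL_DEVICE)
                    /* Stop the HW and tear down DMA state while the
                     * device's IOMMU group is still present, instead of
                     * doing it later from amdgpu_pci_remove. */
                    amdgpu_device_fini_hw(adev);

            return NOTIFY_OK;
    }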

Looking at this from another perspective: AFAIK on each new device probe,
whether due to a PCI bus rescan or a driver reload, we reset the ASIC before doing
any init operations (assuming we successfully gained MMIO access), so maybe
your concern is not an issue?

Andrey


>
>> Andrey
>>
>>
>> [  123.810074 <   28.126960>] BUG: kernel NULL pointer dereference, address: 00000000000000c8
>> [  123.810080 <    0.000006>] #PF: supervisor read access in kernel mode
>> [  123.810082 <    0.000002>] #PF: error_code(0x0000) - not-present page
>> [  123.810085 <    0.000003>] PGD 0 P4D 0
>> [  123.810089 <    0.000004>] Oops: 0000 [#1] SMP NOPTI
>> [  123.810094 <    0.000005>] CPU: 5 PID: 1418 Comm: Xorg:shlo4 Tainted: G           O      5.9.0-rc2-dev+ #59
>> [  123.810096 <    0.000002>] Hardware name: System manufacturer System Product Name/PRIME X470-PRO, BIOS 4406 02/28/2019
>> [  123.810105 <    0.000009>] RIP: 0010:iommu_get_dma_domain+0x10/0x20
>> [  123.810108 <    0.000003>] Code: b0 48 c7 87 98 00 00 00 00 00 00 00 31 c0 c3 b8 f4 ff ff ff eb a6 0f 1f 40 00 0f 1f 44 00 00 48 8b 87 d0 02 00 00 55 48 89 e5 <48> 8b 80 c8 00 00 00 5d c3 0f 1f 80 00 00 00 00 0f 1f 44 00 00 48
>> [  123.810111 <    0.000003>] RSP: 0018:ffffa2e201f7f980 EFLAGS: 00010246
>> [  123.810114 <    0.000003>] RAX: 0000000000000000 RBX: 0000000000001000 RCX: 0000000000000000
>> [  123.810116 <    0.000002>] RDX: 0000000000001000 RSI: 00000000bf5cb000 RDI: ffff93c259dc60b0
>> [  123.810117 <    0.000001>] RBP: ffffa2e201f7f980 R08: 0000000000000000 R09: 0000000000000000
>> [  123.810119 <    0.000002>] R10: ffffa2e201f7faf0 R11: 0000000000000001 R12: 00000000bf5cb000
>> [  123.810121 <    0.000002>] R13: 0000000000001000 R14: ffff93c24cef9c50 R15: ffff93c256c05688
>> [  123.810124 <    0.000003>] FS:  00007f5e5e8d3700(0000) GS:ffff93c25e940000(0000) knlGS:0000000000000000
>> [  123.810126 <    0.000002>] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [  123.810128 <    0.000002>] CR2: 00000000000000c8 CR3: 000000027fe0a000 CR4: 00000000003506e0
>> [  123.810130 <    0.000002>] Call Trace:
>> [  123.810136 <    0.000006>]  __iommu_dma_unmap+0x2e/0x100
>> [  123.810141 <    0.000005>]  ? kfree+0x389/0x3a0
>> [  123.810144 <    0.000003>]  iommu_dma_unmap_page+0xe/0x10
>> [  123.810149 <    0.000005>] dma_unmap_page_attrs+0x4d/0xf0
>> [  123.810159 <    0.000010>]  ? ttm_bo_del_from_lru+0x8e/0xb0 [ttm]
>> [  123.810165 <    0.000006>] ttm_unmap_and_unpopulate_pages+0x8e/0xc0 [ttm]
>> [  123.810252 <    0.000087>] amdgpu_ttm_tt_unpopulate+0xaa/0xd0 [amdgpu]
>> [  123.810258 <    0.000006>]  ttm_tt_unpopulate+0x59/0x70 [ttm]
>> [  123.810264 <    0.000006>]  ttm_tt_destroy+0x6a/0x70 [ttm]
>> [  123.810270 <    0.000006>] ttm_bo_cleanup_memtype_use+0x36/0xa0 [ttm]
>> [  123.810276 <    0.000006>]  ttm_bo_put+0x1e7/0x400 [ttm]
>> [  123.810358 <    0.000082>]  amdgpu_bo_unref+0x1e/0x30 [amdgpu]
>> [  123.810440 <    0.000082>] amdgpu_gem_object_free+0x37/0x50 [amdgpu]
>> [  123.810459 <    0.000019>]  drm_gem_object_free+0x35/0x40 [drm]
>> [  123.810476 <    0.000017>] drm_gem_object_handle_put_unlocked+0x9d/0xd0 [drm]
>> [  123.810494 <    0.000018>] drm_gem_object_release_handle+0x74/0x90 [drm]
>> [  123.810511 <    0.000017>]  ? drm_gem_object_handle_put_unlocked+0xd0/0xd0 [drm]
>> [  123.810516 <    0.000005>]  idr_for_each+0x4d/0xd0
>> [  123.810534 <    0.000018>]  drm_gem_release+0x20/0x30 [drm]
>> [  123.810550 <    0.000016>]  drm_file_free+0x251/0x2a0 [drm]
>> [  123.810567 <    0.000017>] drm_close_helper.isra.14+0x61/0x70 [drm]
>> [  123.810583 <    0.000016>]  drm_release+0x6a/0xe0 [drm]
>> [  123.810588 <    0.000005>]  __fput+0xa2/0x250
>> [  123.810592 <    0.000004>]  ____fput+0xe/0x10
>> [  123.810595 <    0.000003>]  task_work_run+0x6c/0xa0
>> [  123.810600 <    0.000005>]  do_exit+0x376/0xb60
>> [  123.810604 <    0.000004>]  do_group_exit+0x43/0xa0
>> [  123.810608 <    0.000004>]  get_signal+0x18b/0x8e0
>> [  123.810612 <    0.000004>]  ? do_futex+0x595/0xc20
>> [  123.810617 <    0.000005>]  arch_do_signal+0x34/0x880
>> [  123.810620 <    0.000003>]  ? check_preempt_curr+0x50/0x60
>> [  123.810623 <    0.000003>]  ? ttwu_do_wakeup+0x1e/0x160
>> [  123.810626 <    0.000003>]  ? ttwu_do_activate+0x61/0x70
>> [  123.810630 <    0.000004>] exit_to_user_mode_prepare+0x124/0x1b0
>> [  123.810635 <    0.000005>] syscall_exit_to_user_mode+0x31/0x170
>> [  123.810639 <    0.000004>]  do_syscall_64+0x43/0x80
>>
>>
>> Andrey
>>
>>
>>
>> Christian.
>>
>> +
>> + if (adev->irq.ih.use_bus_addr)
>> + amdgpu_ih_ring_fini(adev, &adev->irq.ih);
>> + if (adev->irq.ih1.use_bus_addr)
>> + amdgpu_ih_ring_fini(adev, &adev->irq.ih1);
>> + if (adev->irq.ih2.use_bus_addr)
>> + amdgpu_ih_ring_fini(adev, &adev->irq.ih2);
>> +
>> + amdgpu_gart_dummy_page_fini(adev);
>> + }
>> +
>> + return NOTIFY_OK;
>> +}
>> +
>> +
>>    /**
>>     * amdgpu_device_init - initialize the driver
>>     *
>> @@ -3304,6 +3339,8 @@ int amdgpu_device_init(struct amdgpu_device *adev,
>>    INIT_WORK(&adev->xgmi_reset_work, amdgpu_device_xgmi_reset_func);
>> + INIT_LIST_HEAD(&adev->device_bo_list);
>> +
>>    adev->gfx.gfx_off_req_count = 1;
>>    adev->pm.ac_power = power_supply_is_system_supplied() > 0;
>> @@ -3575,6 +3612,15 @@ int amdgpu_device_init(struct amdgpu_device *adev,
>>    if (amdgpu_device_cache_pci_state(adev->pdev))
>>    pci_restore_state(pdev);
>> + BLOCKING_INIT_NOTIFIER_HEAD(&adev->notifier);
>> + adev->nb.notifier_call = amdgpu_iommu_group_notifier;
>> +
>> + if (adev->dev->iommu_group) {
>> + r = iommu_group_register_notifier(adev->dev->iommu_group, &adev->nb);
>> + if (r)
>> + goto failed;
>> + }
>> +
>>    return 0;
>>    failed:
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
>> index 0db9330..486ad6d 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
>> @@ -92,7 +92,7 @@ static int amdgpu_gart_dummy_page_init(struct amdgpu_device *adev)
>>     *
>>     * Frees the dummy page used by the driver (all asics).
>>     */
>> -static void amdgpu_gart_dummy_page_fini(struct amdgpu_device *adev)
>> +void amdgpu_gart_dummy_page_fini(struct amdgpu_device *adev)
>>    {
>>    if (!adev->dummy_page_addr)
>>    return;
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h
>> index afa2e28..5678d9c 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h
>> @@ -61,6 +61,7 @@ int amdgpu_gart_table_vram_pin(struct amdgpu_device *adev);
>>    void amdgpu_gart_table_vram_unpin(struct amdgpu_device *adev);
>>    int amdgpu_gart_init(struct amdgpu_device *adev);
>>    void amdgpu_gart_fini(struct amdgpu_device *adev);
>> +void amdgpu_gart_dummy_page_fini(struct amdgpu_device *adev);
>>    int amdgpu_gart_unbind(struct amdgpu_device *adev, uint64_t offset,
>>          int pages);
>>    int amdgpu_gart_map(struct amdgpu_device *adev, uint64_t offset,
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>> index 6cc9919..4a1de69 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>> @@ -94,6 +94,10 @@ static void amdgpu_bo_destroy(struct ttm_buffer_object *tbo)
>>    }
>>    amdgpu_bo_unref(&bo->parent);
>> + spin_lock(&ttm_bo_glob.lru_lock);
>> + list_del(&bo->bo);
>> + spin_unlock(&ttm_bo_glob.lru_lock);
>> +
>>    kfree(bo->metadata);
>>    kfree(bo);
>>    }
>> @@ -613,6 +617,12 @@ static int amdgpu_bo_do_create(struct amdgpu_device *adev,
>>    if (bp->type == ttm_bo_type_device)
>>    bo->flags &= ~AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED;
>> + INIT_LIST_HEAD(&bo->bo);
>> +
>> + spin_lock(&ttm_bo_glob.lru_lock);
>> + list_add_tail(&bo->bo, &adev->device_bo_list);
>> + spin_unlock(&ttm_bo_glob.lru_lock);
>> +
>>    return 0;
>>    fail_unreserve:
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
>> index 9ac3756..5ae8555 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
>> @@ -110,6 +110,8 @@ struct amdgpu_bo {
>>    struct list_head shadow_list;
>>    struct kgd_mem                  *kfd_bo;
>> +
>> + struct list_head bo;
>>    };
>>    static inline struct amdgpu_bo *ttm_to_amdgpu_bo(struct ttm_buffer_object *tbo)
>
>
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 196+ messages in thread

* Re: [PATCH v4 07/14] drm/amdgpu: Register IOMMU topology notifier per device.
  2021-01-19  8:48     ` Christian König
@ 2021-01-20  5:01       ` Andrey Grodzovsky
  -1 siblings, 0 replies; 196+ messages in thread
From: Andrey Grodzovsky @ 2021-01-20  5:01 UTC (permalink / raw)
  To: christian.koenig, amd-gfx, dri-devel, daniel.vetter, robh,
	l.stach, yuq825, eric
  Cc: Alexander.Deucher, gregkh


On 1/19/21 3:48 AM, Christian König wrote:
> On 18.01.21 at 22:01, Andrey Grodzovsky wrote:
>> Handle all DMA IOMMU group related dependencies before the
>> group is removed.
>>
>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu.h        |  5 ++++
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 46 ++++++++++++++++++++++++++++++
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c   |  2 +-
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h   |  1 +
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 10 +++++++
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_object.h |  2 ++
>>   6 files changed, 65 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>> index 478a7d8..2953420 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>> @@ -51,6 +51,7 @@
>>   #include <linux/dma-fence.h>
>>   #include <linux/pci.h>
>>   #include <linux/aer.h>
>> +#include <linux/notifier.h>
>>     #include <drm/ttm/ttm_bo_api.h>
>>   #include <drm/ttm/ttm_bo_driver.h>
>> @@ -1041,6 +1042,10 @@ struct amdgpu_device {
>>         bool                            in_pci_err_recovery;
>>       struct pci_saved_state          *pci_state;
>> +
>> +    struct notifier_block        nb;
>> +    struct blocking_notifier_head    notifier;
>> +    struct list_head        device_bo_list;
>>   };
>>     static inline struct amdgpu_device *drm_to_adev(struct drm_device *ddev)
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> index 45e23e3..e99f4f1 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> @@ -70,6 +70,8 @@
>>   #include <drm/task_barrier.h>
>>   #include <linux/pm_runtime.h>
>>   +#include <linux/iommu.h>
>> +
>>   MODULE_FIRMWARE("amdgpu/vega10_gpu_info.bin");
>>   MODULE_FIRMWARE("amdgpu/vega12_gpu_info.bin");
>>   MODULE_FIRMWARE("amdgpu/raven_gpu_info.bin");
>> @@ -3200,6 +3202,39 @@ static const struct attribute *amdgpu_dev_attributes[] 
>> = {
>>   };
>>     +static int amdgpu_iommu_group_notifier(struct notifier_block *nb,
>> +                     unsigned long action, void *data)
>> +{
>> +    struct amdgpu_device *adev = container_of(nb, struct amdgpu_device, nb);
>> +    struct amdgpu_bo *bo = NULL;
>> +
>> +    /*
>> +     * Following is a set of IOMMU group dependencies taken care of before
>> +     * device's IOMMU group is removed
>> +     */
>> +    if (action == IOMMU_GROUP_NOTIFY_DEL_DEVICE) {
>> +
>> +        spin_lock(&ttm_bo_glob.lru_lock);
>> +        list_for_each_entry(bo, &adev->device_bo_list, bo) {
>> +            if (bo->tbo.ttm)
>> +                ttm_tt_unpopulate(bo->tbo.bdev, bo->tbo.ttm);
>> +        }
>> +        spin_unlock(&ttm_bo_glob.lru_lock);
>
> That approach won't work. ttm_tt_unpopulate() might sleep on an IOMMU lock.
>
> You need to use a mutex here or even better make sure you can access the 
> device_bo_list without a lock in this moment.
>
> Christian.


I can think of switching to an RCU list? Otherwise, elements are added
on BO create and deleted on BO destroy; how can I prevent either of those from
happening while in this section, besides a mutex? Make a copy of the list and run over that
instead?
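
E.g. with a dedicated mutex it would look roughly like this (sketch only;
device_bo_list_lock is a hypothetical new field, and the BO create/destroy
paths would have to take the same mutex when touching device_bo_list):

    /* Walk the list under a mutex instead of ttm_bo_glob.lru_lock,
     * so that ttm_tt_unpopulate() is free to sleep on the IOMMU lock. */
    mutex_lock(&adev->device_bo_list_lock);
    list_for_each_entry(bo, &adev->device_bo_list, bo) {
            if (bo->tbo.ttm)
                    ttm_tt_unpopulate(bo->tbo.bdev, bo->tbo.ttm);
    }
    mutex_unlock(&adev->device_bo_list_lock);

But that serializes every BO create/destroy against the notifier, which is
exactly the kind of cost I am asking about.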

Andrey


>
>> +
>> +        if (adev->irq.ih.use_bus_addr)
>> +            amdgpu_ih_ring_fini(adev, &adev->irq.ih);
>> +        if (adev->irq.ih1.use_bus_addr)
>> +            amdgpu_ih_ring_fini(adev, &adev->irq.ih1);
>> +        if (adev->irq.ih2.use_bus_addr)
>> +            amdgpu_ih_ring_fini(adev, &adev->irq.ih2);
>> +
>> +        amdgpu_gart_dummy_page_fini(adev);
>> +    }
>> +
>> +    return NOTIFY_OK;
>> +}
>> +
>> +
>>   /**
>>    * amdgpu_device_init - initialize the driver
>>    *
>> @@ -3304,6 +3339,8 @@ int amdgpu_device_init(struct amdgpu_device *adev,
>>         INIT_WORK(&adev->xgmi_reset_work, amdgpu_device_xgmi_reset_func);
>>   +    INIT_LIST_HEAD(&adev->device_bo_list);
>> +
>>       adev->gfx.gfx_off_req_count = 1;
>>       adev->pm.ac_power = power_supply_is_system_supplied() > 0;
>>   @@ -3575,6 +3612,15 @@ int amdgpu_device_init(struct amdgpu_device *adev,
>>       if (amdgpu_device_cache_pci_state(adev->pdev))
>>           pci_restore_state(pdev);
>>   +    BLOCKING_INIT_NOTIFIER_HEAD(&adev->notifier);
>> +    adev->nb.notifier_call = amdgpu_iommu_group_notifier;
>> +
>> +    if (adev->dev->iommu_group) {
>> +        r = iommu_group_register_notifier(adev->dev->iommu_group, &adev->nb);
>> +        if (r)
>> +            goto failed;
>> +    }
>> +
>>       return 0;
>>     failed:
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
>> index 0db9330..486ad6d 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
>> @@ -92,7 +92,7 @@ static int amdgpu_gart_dummy_page_init(struct amdgpu_device 
>> *adev)
>>    *
>>    * Frees the dummy page used by the driver (all asics).
>>    */
>> -static void amdgpu_gart_dummy_page_fini(struct amdgpu_device *adev)
>> +void amdgpu_gart_dummy_page_fini(struct amdgpu_device *adev)
>>   {
>>       if (!adev->dummy_page_addr)
>>           return;
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h
>> index afa2e28..5678d9c 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h
>> @@ -61,6 +61,7 @@ int amdgpu_gart_table_vram_pin(struct amdgpu_device *adev);
>>   void amdgpu_gart_table_vram_unpin(struct amdgpu_device *adev);
>>   int amdgpu_gart_init(struct amdgpu_device *adev);
>>   void amdgpu_gart_fini(struct amdgpu_device *adev);
>> +void amdgpu_gart_dummy_page_fini(struct amdgpu_device *adev);
>>   int amdgpu_gart_unbind(struct amdgpu_device *adev, uint64_t offset,
>>                  int pages);
>>   int amdgpu_gart_map(struct amdgpu_device *adev, uint64_t offset,
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>> index 6cc9919..4a1de69 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>> @@ -94,6 +94,10 @@ static void amdgpu_bo_destroy(struct ttm_buffer_object *tbo)
>>       }
>>       amdgpu_bo_unref(&bo->parent);
>>   +    spin_lock(&ttm_bo_glob.lru_lock);
>> +    list_del(&bo->bo);
>> +    spin_unlock(&ttm_bo_glob.lru_lock);
>> +
>>       kfree(bo->metadata);
>>       kfree(bo);
>>   }
>> @@ -613,6 +617,12 @@ static int amdgpu_bo_do_create(struct amdgpu_device *adev,
>>       if (bp->type == ttm_bo_type_device)
>>           bo->flags &= ~AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED;
>>   +    INIT_LIST_HEAD(&bo->bo);
>> +
>> +    spin_lock(&ttm_bo_glob.lru_lock);
>> +    list_add_tail(&bo->bo, &adev->device_bo_list);
>> +    spin_unlock(&ttm_bo_glob.lru_lock);
>> +
>>       return 0;
>>     fail_unreserve:
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
>> index 9ac3756..5ae8555 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
>> @@ -110,6 +110,8 @@ struct amdgpu_bo {
>>       struct list_head        shadow_list;
>>         struct kgd_mem                  *kfd_bo;
>> +
>> +    struct list_head        bo;
>>   };
>>     static inline struct amdgpu_bo *ttm_to_amdgpu_bo(struct ttm_buffer_object 
>> *tbo)
>
> _______________________________________________
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 196+ messages in thread

* Re: [PATCH v4 07/14] drm/amdgpu: Register IOMMU topology notifier per device.
  2021-01-20  4:21             ` Andrey Grodzovsky
@ 2021-01-20  8:38               ` Daniel Vetter
  -1 siblings, 0 replies; 196+ messages in thread
From: Daniel Vetter @ 2021-01-20  8:38 UTC (permalink / raw)
  To: Andrey Grodzovsky
  Cc: Greg KH, dri-devel, amd-gfx list, Alex Deucher,
	Christian König, Qiang Yu

On Wed, Jan 20, 2021 at 5:21 AM Andrey Grodzovsky
<Andrey.Grodzovsky@amd.com> wrote:
>
>
> On 1/19/21 5:01 PM, Daniel Vetter wrote:
> > On Tue, Jan 19, 2021 at 10:22 PM Andrey Grodzovsky
> > <Andrey.Grodzovsky@amd.com> wrote:
> >>
> >> On 1/19/21 8:45 AM, Daniel Vetter wrote:
> >>
> >> On Tue, Jan 19, 2021 at 09:48:03AM +0100, Christian König wrote:
> >>
> >> On 18.01.21 at 22:01, Andrey Grodzovsky wrote:
> >>
> >> Handle all DMA IOMMU group related dependencies before the
> >> group is removed.
> >>
> >> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> >> ---
> >>    drivers/gpu/drm/amd/amdgpu/amdgpu.h        |  5 ++++
> >>    drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 46 ++++++++++++++++++++++++++++++
> >>    drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c   |  2 +-
> >>    drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h   |  1 +
> >>    drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 10 +++++++
> >>    drivers/gpu/drm/amd/amdgpu/amdgpu_object.h |  2 ++
> >>    6 files changed, 65 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> >> index 478a7d8..2953420 100644
> >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> >> @@ -51,6 +51,7 @@
> >>    #include <linux/dma-fence.h>
> >>    #include <linux/pci.h>
> >>    #include <linux/aer.h>
> >> +#include <linux/notifier.h>
> >>    #include <drm/ttm/ttm_bo_api.h>
> >>    #include <drm/ttm/ttm_bo_driver.h>
> >> @@ -1041,6 +1042,10 @@ struct amdgpu_device {
> >>    bool                            in_pci_err_recovery;
> >>    struct pci_saved_state          *pci_state;
> >> +
> >> + struct notifier_block nb;
> >> + struct blocking_notifier_head notifier;
> >> + struct list_head device_bo_list;
> >>    };
> >>    static inline struct amdgpu_device *drm_to_adev(struct drm_device *ddev)
> >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> >> index 45e23e3..e99f4f1 100644
> >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> >> @@ -70,6 +70,8 @@
> >>    #include <drm/task_barrier.h>
> >>    #include <linux/pm_runtime.h>
> >> +#include <linux/iommu.h>
> >> +
> >>    MODULE_FIRMWARE("amdgpu/vega10_gpu_info.bin");
> >>    MODULE_FIRMWARE("amdgpu/vega12_gpu_info.bin");
> >>    MODULE_FIRMWARE("amdgpu/raven_gpu_info.bin");
> >> @@ -3200,6 +3202,39 @@ static const struct attribute *amdgpu_dev_attributes[] = {
> >>    };
> >> +static int amdgpu_iommu_group_notifier(struct notifier_block *nb,
> >> +     unsigned long action, void *data)
> >> +{
> >> + struct amdgpu_device *adev = container_of(nb, struct amdgpu_device, nb);
> >> + struct amdgpu_bo *bo = NULL;
> >> +
> >> + /*
> >> + * Following is a set of IOMMU group dependencies taken care of before
> >> + * device's IOMMU group is removed
> >> + */
> >> + if (action == IOMMU_GROUP_NOTIFY_DEL_DEVICE) {
> >> +
> >> + spin_lock(&ttm_bo_glob.lru_lock);
> >> + list_for_each_entry(bo, &adev->device_bo_list, bo) {
> >> + if (bo->tbo.ttm)
> >> + ttm_tt_unpopulate(bo->tbo.bdev, bo->tbo.ttm);
> >> + }
> >> + spin_unlock(&ttm_bo_glob.lru_lock);
> >>
> >> That approach won't work. ttm_tt_unpopulate() might sleep on an IOMMU lock.
> >>
> >> You need to use a mutex here or even better make sure you can access the
> >> device_bo_list without a lock in this moment.
> >>
> >> I'd also be worried about the notifier mutex getting really badly in the
> >> way.
> >>
> >> Plus I'm worried why we even need this, it sounds a bit like papering over
> >> the iommu subsystem. Assuming we clean up all our iommu mappings in our
> >> device hotunplug/unload code, why do we still need to have an additional
> >> iommu notifier on top, with all kinds of additional headaches? The iommu
> >> shouldn't clean up before the devices in its group have cleaned up.
> >>
> >> I think we need more info here on what the exact problem is first.
> >> -Daniel
> >>
> >>
> >> Originally I experienced the crash below on an IOMMU enabled device. It happens post device removal from the PCI
> >> topology, during shutdown of the user client holding the last reference to the drm device file (X in my case).
> >> The crash happens because by the time I get to this point the struct device->iommu_group pointer is already NULL,
> >> since the IOMMU group for the device is unset during PCI removal. So this contradicts what you said above,
> >> that the iommu shouldn't clean up before the devices in its group have cleaned up.
> >> So instead of guessing where the right place for all the IOMMU related cleanups is, it makes sense
> >> to get a notification from the IOMMU subsystem in the form of the IOMMU_GROUP_NOTIFY_DEL_DEVICE event
> >> and do all the relevant cleanups there.
> > Yeah that goes boom, but you shouldn't need this special iommu cleanup
> > handler. Making sure that all the dma-api mappings are gone needs to
> > be done as part of the device hotunplug, you can't delay that to the
> > last drm_device cleanup.
> >
> > So I think most of the patch here, with pulling that out (it should be
> > outright removed from the final release code even), is good, just not
> > yet how you call that new code. Probably these bits (aside from walking
> > all buffers and unpopulating the tt) should be done from the early_free
> > callback you're adding.
> >
> > Also what I just realized: For normal unload you need to make sure the
> > hw is actually stopped first, before we unmap buffers. Otherwise
> > driver unload will likely result in wedged hw, probably not what you
> > want for debugging.
> > -Daniel
>
> Since device removal from the IOMMU group, and this hook in particular,
> takes place before the call to amdgpu_pci_remove, essentially it means
> that for the IOMMU use case the entire amdgpu_device_fini_hw function
> should be called here to stop the HW, instead of from amdgpu_pci_remove.

The crash you showed was on final drm_close, which should happen after
device removal, so that's clearly buggy. If the iommu subsystem
removes its state before the driver has cleaned up, then I think
that's an iommu bug or a dma-api bug. Just plain use of dma_map/unmap and
friends really shouldn't need notifier hacks like you're implementing
here. Can you please show me a backtrace where dma_unmap_sg blows up when
it's put into the pci_driver remove callback?

> Looking at this from another perspective: AFAIK on each new device probe,
> whether due to a PCI bus rescan or a driver reload, we reset the ASIC before doing
> any init operations (assuming we successfully gained MMIO access), so maybe
> your concern is not an issue?

Reset on probe is too late. The problem is that if you just remove the
driver, your device is doing dma at that moment. And you kinda have to
stop that before you free the mappings/memory. Of course when the
device is actually hotunplugged, then dma is guaranteed to have
stopped already. I'm not sure whether disabling the pci device is
enough to guarantee that no more dma happens; it might well be.
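
I.e. the remove callback should roughly follow this ordering (a sketch of
the sequencing only, with helper names borrowed from this series rather
than a definitive implementation):

    static void amdgpu_pci_remove(struct pci_dev *pdev)
    {
            struct drm_device *ddev = pci_get_drvdata(pdev);
            struct amdgpu_device *adev = drm_to_adev(ddev);

            drm_dev_unplug(ddev);        /* cut off new userspace access */
            amdgpu_device_fini_hw(adev); /* stop the hw, so no dma is in flight */

            /* only now is it safe to unmap dma-api mappings and free the
             * backing memory, then drop the final drm_device reference */
            drm_dev_put(ddev);
    }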
-Daniel

> Andrey
>
>
> >
> >> Andrey
> >>
> >>
> >> [  123.810074 <   28.126960>] BUG: kernel NULL pointer dereference, address: 00000000000000c8
> >> [  123.810080 <    0.000006>] #PF: supervisor read access in kernel mode
> >> [  123.810082 <    0.000002>] #PF: error_code(0x0000) - not-present page
> >> [  123.810085 <    0.000003>] PGD 0 P4D 0
> >> [  123.810089 <    0.000004>] Oops: 0000 [#1] SMP NOPTI
> >> [  123.810094 <    0.000005>] CPU: 5 PID: 1418 Comm: Xorg:shlo4 Tainted: G           O      5.9.0-rc2-dev+ #59
> >> [  123.810096 <    0.000002>] Hardware name: System manufacturer System Product Name/PRIME X470-PRO, BIOS 4406 02/28/2019
> >> [  123.810105 <    0.000009>] RIP: 0010:iommu_get_dma_domain+0x10/0x20
> >> [  123.810108 <    0.000003>] Code: b0 48 c7 87 98 00 00 00 00 00 00 00 31 c0 c3 b8 f4 ff ff ff eb a6 0f 1f 40 00 0f 1f 44 00 00 48 8b 87 d0 02 00 00 55 48 89 e5 <48> 8b 80 c8 00 00 00 5d c3 0f 1f 80 00 00 00 00 0f 1f 44 00 00 48
> >> [  123.810111 <    0.000003>] RSP: 0018:ffffa2e201f7f980 EFLAGS: 00010246
> >> [  123.810114 <    0.000003>] RAX: 0000000000000000 RBX: 0000000000001000 RCX: 0000000000000000
> >> [  123.810116 <    0.000002>] RDX: 0000000000001000 RSI: 00000000bf5cb000 RDI: ffff93c259dc60b0
> >> [  123.810117 <    0.000001>] RBP: ffffa2e201f7f980 R08: 0000000000000000 R09: 0000000000000000
> >> [  123.810119 <    0.000002>] R10: ffffa2e201f7faf0 R11: 0000000000000001 R12: 00000000bf5cb000
> >> [  123.810121 <    0.000002>] R13: 0000000000001000 R14: ffff93c24cef9c50 R15: ffff93c256c05688
> >> [  123.810124 <    0.000003>] FS:  00007f5e5e8d3700(0000) GS:ffff93c25e940000(0000) knlGS:0000000000000000
> >> [  123.810126 <    0.000002>] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> >> [  123.810128 <    0.000002>] CR2: 00000000000000c8 CR3: 000000027fe0a000 CR4: 00000000003506e0
> >> [  123.810130 <    0.000002>] Call Trace:
> >> [  123.810136 <    0.000006>]  __iommu_dma_unmap+0x2e/0x100
> >> [  123.810141 <    0.000005>]  ? kfree+0x389/0x3a0
> >> [  123.810144 <    0.000003>]  iommu_dma_unmap_page+0xe/0x10
> >> [  123.810149 <    0.000005>] dma_unmap_page_attrs+0x4d/0xf0
> >> [  123.810159 <    0.000010>]  ? ttm_bo_del_from_lru+0x8e/0xb0 [ttm]
> >> [  123.810165 <    0.000006>] ttm_unmap_and_unpopulate_pages+0x8e/0xc0 [ttm]
> >> [  123.810252 <    0.000087>] amdgpu_ttm_tt_unpopulate+0xaa/0xd0 [amdgpu]
> >> [  123.810258 <    0.000006>]  ttm_tt_unpopulate+0x59/0x70 [ttm]
> >> [  123.810264 <    0.000006>]  ttm_tt_destroy+0x6a/0x70 [ttm]
> >> [  123.810270 <    0.000006>] ttm_bo_cleanup_memtype_use+0x36/0xa0 [ttm]
> >> [  123.810276 <    0.000006>]  ttm_bo_put+0x1e7/0x400 [ttm]
> >> [  123.810358 <    0.000082>]  amdgpu_bo_unref+0x1e/0x30 [amdgpu]
> >> [  123.810440 <    0.000082>] amdgpu_gem_object_free+0x37/0x50 [amdgpu]
> >> [  123.810459 <    0.000019>]  drm_gem_object_free+0x35/0x40 [drm]
> >> [  123.810476 <    0.000017>] drm_gem_object_handle_put_unlocked+0x9d/0xd0 [drm]
> >> [  123.810494 <    0.000018>] drm_gem_object_release_handle+0x74/0x90 [drm]
> >> [  123.810511 <    0.000017>]  ? drm_gem_object_handle_put_unlocked+0xd0/0xd0 [drm]
> >> [  123.810516 <    0.000005>]  idr_for_each+0x4d/0xd0
> >> [  123.810534 <    0.000018>]  drm_gem_release+0x20/0x30 [drm]
> >> [  123.810550 <    0.000016>]  drm_file_free+0x251/0x2a0 [drm]
> >> [  123.810567 <    0.000017>] drm_close_helper.isra.14+0x61/0x70 [drm]
> >> [  123.810583 <    0.000016>]  drm_release+0x6a/0xe0 [drm]
> >> [  123.810588 <    0.000005>]  __fput+0xa2/0x250
> >> [  123.810592 <    0.000004>]  ____fput+0xe/0x10
> >> [  123.810595 <    0.000003>]  task_work_run+0x6c/0xa0
> >> [  123.810600 <    0.000005>]  do_exit+0x376/0xb60
> >> [  123.810604 <    0.000004>]  do_group_exit+0x43/0xa0
> >> [  123.810608 <    0.000004>]  get_signal+0x18b/0x8e0
> >> [  123.810612 <    0.000004>]  ? do_futex+0x595/0xc20
> >> [  123.810617 <    0.000005>]  arch_do_signal+0x34/0x880
> >> [  123.810620 <    0.000003>]  ? check_preempt_curr+0x50/0x60
> >> [  123.810623 <    0.000003>]  ? ttwu_do_wakeup+0x1e/0x160
> >> [  123.810626 <    0.000003>]  ? ttwu_do_activate+0x61/0x70
> >> [  123.810630 <    0.000004>] exit_to_user_mode_prepare+0x124/0x1b0
> >> [  123.810635 <    0.000005>] syscall_exit_to_user_mode+0x31/0x170
> >> [  123.810639 <    0.000004>]  do_syscall_64+0x43/0x80
> >>
> >>
> >> Andrey
> >>
> >>
> >>
> >> Christian.
> >>
> >> +
> >> + if (adev->irq.ih.use_bus_addr)
> >> + amdgpu_ih_ring_fini(adev, &adev->irq.ih);
> >> + if (adev->irq.ih1.use_bus_addr)
> >> + amdgpu_ih_ring_fini(adev, &adev->irq.ih1);
> >> + if (adev->irq.ih2.use_bus_addr)
> >> + amdgpu_ih_ring_fini(adev, &adev->irq.ih2);
> >> +
> >> + amdgpu_gart_dummy_page_fini(adev);
> >> + }
> >> +
> >> + return NOTIFY_OK;
> >> +}
> >> +
> >> +
> >>    /**
> >>     * amdgpu_device_init - initialize the driver
> >>     *
> >> @@ -3304,6 +3339,8 @@ int amdgpu_device_init(struct amdgpu_device *adev,
> >>    INIT_WORK(&adev->xgmi_reset_work, amdgpu_device_xgmi_reset_func);
> >> + INIT_LIST_HEAD(&adev->device_bo_list);
> >> +
> >>    adev->gfx.gfx_off_req_count = 1;
> >>    adev->pm.ac_power = power_supply_is_system_supplied() > 0;
> >> @@ -3575,6 +3612,15 @@ int amdgpu_device_init(struct amdgpu_device *adev,
> >>    if (amdgpu_device_cache_pci_state(adev->pdev))
> >>    pci_restore_state(pdev);
> >> + BLOCKING_INIT_NOTIFIER_HEAD(&adev->notifier);
> >> + adev->nb.notifier_call = amdgpu_iommu_group_notifier;
> >> +
> >> + if (adev->dev->iommu_group) {
> >> + r = iommu_group_register_notifier(adev->dev->iommu_group, &adev->nb);
> >> + if (r)
> >> + goto failed;
> >> + }
> >> +
> >>    return 0;
> >>    failed:
> >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
> >> index 0db9330..486ad6d 100644
> >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
> >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
> >> @@ -92,7 +92,7 @@ static int amdgpu_gart_dummy_page_init(struct amdgpu_device *adev)
> >>     *
> >>     * Frees the dummy page used by the driver (all asics).
> >>     */
> >> -static void amdgpu_gart_dummy_page_fini(struct amdgpu_device *adev)
> >> +void amdgpu_gart_dummy_page_fini(struct amdgpu_device *adev)
> >>    {
> >>    if (!adev->dummy_page_addr)
> >>    return;
> >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h
> >> index afa2e28..5678d9c 100644
> >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h
> >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h
> >> @@ -61,6 +61,7 @@ int amdgpu_gart_table_vram_pin(struct amdgpu_device *adev);
> >>    void amdgpu_gart_table_vram_unpin(struct amdgpu_device *adev);
> >>    int amdgpu_gart_init(struct amdgpu_device *adev);
> >>    void amdgpu_gart_fini(struct amdgpu_device *adev);
> >> +void amdgpu_gart_dummy_page_fini(struct amdgpu_device *adev);
> >>    int amdgpu_gart_unbind(struct amdgpu_device *adev, uint64_t offset,
> >>          int pages);
> >>    int amdgpu_gart_map(struct amdgpu_device *adev, uint64_t offset,
> >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> >> index 6cc9919..4a1de69 100644
> >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> >> @@ -94,6 +94,10 @@ static void amdgpu_bo_destroy(struct ttm_buffer_object *tbo)
> >>    }
> >>    amdgpu_bo_unref(&bo->parent);
> >> + spin_lock(&ttm_bo_glob.lru_lock);
> >> + list_del(&bo->bo);
> >> + spin_unlock(&ttm_bo_glob.lru_lock);
> >> +
> >>    kfree(bo->metadata);
> >>    kfree(bo);
> >>    }
> >> @@ -613,6 +617,12 @@ static int amdgpu_bo_do_create(struct amdgpu_device *adev,
> >>    if (bp->type == ttm_bo_type_device)
> >>    bo->flags &= ~AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED;
> >> + INIT_LIST_HEAD(&bo->bo);
> >> +
> >> + spin_lock(&ttm_bo_glob.lru_lock);
> >> + list_add_tail(&bo->bo, &adev->device_bo_list);
> >> + spin_unlock(&ttm_bo_glob.lru_lock);
> >> +
> >>    return 0;
> >>    fail_unreserve:
> >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
> >> index 9ac3756..5ae8555 100644
> >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
> >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
> >> @@ -110,6 +110,8 @@ struct amdgpu_bo {
> >>    struct list_head shadow_list;
> >>    struct kgd_mem                  *kfd_bo;
> >> +
> >> + struct list_head bo;
> >>    };
> >>    static inline struct amdgpu_bo *ttm_to_amdgpu_bo(struct ttm_buffer_object *tbo)
> >
> >



-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 196+ messages in thread

* Re: [PATCH v4 07/14] drm/amdgpu: Register IOMMU topology notifier per device.
@ 2021-01-20  8:38               ` Daniel Vetter
  0 siblings, 0 replies; 196+ messages in thread
From: Daniel Vetter @ 2021-01-20  8:38 UTC (permalink / raw)
  To: Andrey Grodzovsky
  Cc: Rob Herring, Greg KH, dri-devel, Anholt, Eric, Pekka Paalanen,
	amd-gfx list, Alex Deucher, Lucas Stach, Wentland, Harry,
	Christian König, Qiang Yu

On Wed, Jan 20, 2021 at 5:21 AM Andrey Grodzovsky
<Andrey.Grodzovsky@amd.com> wrote:
>
>
> On 1/19/21 5:01 PM, Daniel Vetter wrote:
> > On Tue, Jan 19, 2021 at 10:22 PM Andrey Grodzovsky
> > <Andrey.Grodzovsky@amd.com> wrote:
> >>
> >> On 1/19/21 8:45 AM, Daniel Vetter wrote:
> >>
> >> On Tue, Jan 19, 2021 at 09:48:03AM +0100, Christian König wrote:
> >>
> >> Am 18.01.21 um 22:01 schrieb Andrey Grodzovsky:
> >>
> >> Handle all DMA IOMMU group related dependencies before the
> >> group is removed.
> >>
> >> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> >> ---
> >>    drivers/gpu/drm/amd/amdgpu/amdgpu.h        |  5 ++++
> >>    drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 46 ++++++++++++++++++++++++++++++
> >>    drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c   |  2 +-
> >>    drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h   |  1 +
> >>    drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 10 +++++++
> >>    drivers/gpu/drm/amd/amdgpu/amdgpu_object.h |  2 ++
> >>    6 files changed, 65 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> >> index 478a7d8..2953420 100644
> >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> >> @@ -51,6 +51,7 @@
> >>    #include <linux/dma-fence.h>
> >>    #include <linux/pci.h>
> >>    #include <linux/aer.h>
> >> +#include <linux/notifier.h>
> >>    #include <drm/ttm/ttm_bo_api.h>
> >>    #include <drm/ttm/ttm_bo_driver.h>
> >> @@ -1041,6 +1042,10 @@ struct amdgpu_device {
> >>    bool                            in_pci_err_recovery;
> >>    struct pci_saved_state          *pci_state;
> >> +
> >> + struct notifier_block nb;
> >> + struct blocking_notifier_head notifier;
> >> + struct list_head device_bo_list;
> >>    };
> >>    static inline struct amdgpu_device *drm_to_adev(struct drm_device *ddev)
> >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> >> index 45e23e3..e99f4f1 100644
> >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> >> @@ -70,6 +70,8 @@
> >>    #include <drm/task_barrier.h>
> >>    #include <linux/pm_runtime.h>
> >> +#include <linux/iommu.h>
> >> +
> >>    MODULE_FIRMWARE("amdgpu/vega10_gpu_info.bin");
> >>    MODULE_FIRMWARE("amdgpu/vega12_gpu_info.bin");
> >>    MODULE_FIRMWARE("amdgpu/raven_gpu_info.bin");
> >> @@ -3200,6 +3202,39 @@ static const struct attribute *amdgpu_dev_attributes[] = {
> >>    };
> >> +static int amdgpu_iommu_group_notifier(struct notifier_block *nb,
> >> +     unsigned long action, void *data)
> >> +{
> >> + struct amdgpu_device *adev = container_of(nb, struct amdgpu_device, nb);
> >> + struct amdgpu_bo *bo = NULL;
> >> +
> >> + /*
> >> + * Following is a set of IOMMU group dependencies taken care of before
> >> + * device's IOMMU group is removed
> >> + */
> >> + if (action == IOMMU_GROUP_NOTIFY_DEL_DEVICE) {
> >> +
> >> + spin_lock(&ttm_bo_glob.lru_lock);
> >> + list_for_each_entry(bo, &adev->device_bo_list, bo) {
> >> + if (bo->tbo.ttm)
> >> + ttm_tt_unpopulate(bo->tbo.bdev, bo->tbo.ttm);
> >> + }
> >> + spin_unlock(&ttm_bo_glob.lru_lock);
> >>
> >> That approach won't work. ttm_tt_unpopulate() might sleep on an IOMMU lock.
> >>
> >> You need to use a mutex here, or even better make sure you can access the
> >> device_bo_list without a lock at this point.
> >>
> >> I'd also be worried about the notifier mutex getting really badly in the
> >> way.
> >>
> >> Plus I'm worried why we even need this, it sounds a bit like papering over
> >> the iommu subsystem. Assuming we clean up all our iommu mappings in our
> >> device hotunplug/unload code, why do we still need to have an additional
> >> iommu notifier on top, with all kinds of additional headaches? The iommu
> >> shouldn't clean up before the devices in its group have cleaned up.
> >>
> >> I think we need more info here on what the exact problem is first.
> >> -Daniel
> >>
> >>
> >> Originally I experienced the crash below on an IOMMU enabled device; it happens post device removal from the PCI topology,
> >> during shutdown of the user client holding the last reference to the drm device file (X in my case).
> >> The crash happens because by the time I get to this point the struct device->iommu_group pointer is already NULL,
> >> since the IOMMU group for the device is unset during PCI removal. So this contradicts what you said above,
> >> that the iommu shouldn't clean up before the devices in its group have cleaned up.
> >> So instead of guessing where the right place is to put all the IOMMU related cleanups, it makes sense
> >> to get a notification from the IOMMU subsystem in the form of the IOMMU_GROUP_NOTIFY_DEL_DEVICE event
> >> and use that place to do all the relevant cleanups.
> > Yeah that goes boom, but you shouldn't need this special iommu cleanup
> > handler. Making sure that all the dma-api mappings are gone needs to
> > be done as part of the device hotunplug, you can't delay that to the
> > last drm_device cleanup.
> >
> > So I think most of the patch here, with pulling that out (it should be
> > outright removed from the final release code even), is good, just not yet how
> > you call that new code. Probably these bits (aside from walking all
> > buffers and unpopulating the tt) should be done from the early_free
> > callback you're adding.
> >
> > Also what I just realized: For normal unload you need to make sure the
> > hw is actually stopped first, before we unmap buffers. Otherwise
> > driver unload will likely result in wedged hw, probably not what you
> > want for debugging.
> > -Daniel
>
> Since device removal from the IOMMU group, and this hook in particular,
> takes place before the call to amdgpu_pci_remove, essentially it means
> that for the IOMMU use case the entire amdgpu_device_fini_hw function
> should be called here to stop the HW, instead of from amdgpu_pci_remove.

The crash you showed was on final drm_close, which should happen after
device removal, so that's clearly buggy. If the iommu subsystem
removes stuff before the driver has been able to clean up, then I think
that's an iommu bug or a dma-api bug. Just plain using dma_map/unmap and
friends really shouldn't need notifier hacks like you're implementing
here. Can you pls show me a backtrace where dma_unmap_sg blows up when
it's put into the pci_driver remove callback?
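
As a minimal sketch of what this would look like, assuming hypothetical
helpers (amdgpu_stop_all_engines() and amdgpu_unmap_all_dma_buffers() do
not exist in the driver), the unmap work would live directly in the
pci_driver remove path, after the hardware is quiesced:

	static void amdgpu_pci_remove_sketch(struct pci_dev *pdev)
	{
		struct drm_device *ddev = pci_get_drvdata(pdev);

		/* Quiesce the hardware first so no DMA is in flight. */
		amdgpu_stop_all_engines(ddev);          /* hypothetical */

		/* The device is still in its iommu group at this point,
		 * so dma_unmap_sg() and friends are safe to call. */
		amdgpu_unmap_all_dma_buffers(ddev);     /* hypothetical */

		drm_dev_unplug(ddev);
	}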

> Looking at this from another perspective, AFAIK on each new device probe,
> whether due to a PCI bus rescan or a driver reload, we reset the ASIC before doing
> any init operations (assuming we successfully gained MMIO access), so maybe
> your concern is not an issue ?

Reset on probe is too late. The problem is that if you just remove the
driver, your device is doing dma at that moment. And you kinda have to
stop that before you free the mappings/memory. Of course when the
device is actually hotunplugged, then dma is guaranteed to have
stopped already. I'm not sure whether disabling the pci device is
enough to guarantee that no more dma happens, but it could well be.
-Daniel
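
To make Christian's locking concern above concrete: ttm_tt_unpopulate()
may sleep, so the notifier's buffer walk cannot run under
ttm_bo_glob.lru_lock. A sketch of the mutex variant he suggests, assuming
a new per-device mutex (device_bo_lock below is hypothetical, not part of
the patch):

	/* Sleepable lock, so ttm_tt_unpopulate() may sleep in the walk. */
	mutex_lock(&adev->device_bo_lock);              /* hypothetical */
	list_for_each_entry(bo, &adev->device_bo_list, bo) {
		if (bo->tbo.ttm)
			ttm_tt_unpopulate(bo->tbo.bdev, bo->tbo.ttm);
	}
	mutex_unlock(&adev->device_bo_lock);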

> Andrey
>
>
> >
> >> Andrey
> >>
> >>
> >> [  123.810074 <   28.126960>] BUG: kernel NULL pointer dereference, address: 00000000000000c8
> >> [  123.810080 <    0.000006>] #PF: supervisor read access in kernel mode
> >> [  123.810082 <    0.000002>] #PF: error_code(0x0000) - not-present page
> >> [  123.810085 <    0.000003>] PGD 0 P4D 0
> >> [  123.810089 <    0.000004>] Oops: 0000 [#1] SMP NOPTI
> >> [  123.810094 <    0.000005>] CPU: 5 PID: 1418 Comm: Xorg:shlo4 Tainted: G           O      5.9.0-rc2-dev+ #59
> >> [  123.810096 <    0.000002>] Hardware name: System manufacturer System Product Name/PRIME X470-PRO, BIOS 4406 02/28/2019
> >> [  123.810105 <    0.000009>] RIP: 0010:iommu_get_dma_domain+0x10/0x20
> >> [  123.810108 <    0.000003>] Code: b0 48 c7 87 98 00 00 00 00 00 00 00 31 c0 c3 b8 f4 ff ff ff eb a6 0f 1f 40 00 0f 1f 44 00 00 48 8b 87 d0 02 00 00 55 48 89 e5 <48> 8b 80 c8 00 00 00 5d c3 0f 1f 80 00 00 00 00 0f 1f 44 00 00 48
> >> [  123.810111 <    0.000003>] RSP: 0018:ffffa2e201f7f980 EFLAGS: 00010246
> >> [  123.810114 <    0.000003>] RAX: 0000000000000000 RBX: 0000000000001000 RCX: 0000000000000000
> >> [  123.810116 <    0.000002>] RDX: 0000000000001000 RSI: 00000000bf5cb000 RDI: ffff93c259dc60b0
> >> [  123.810117 <    0.000001>] RBP: ffffa2e201f7f980 R08: 0000000000000000 R09: 0000000000000000
> >> [  123.810119 <    0.000002>] R10: ffffa2e201f7faf0 R11: 0000000000000001 R12: 00000000bf5cb000
> >> [  123.810121 <    0.000002>] R13: 0000000000001000 R14: ffff93c24cef9c50 R15: ffff93c256c05688
> >> [  123.810124 <    0.000003>] FS:  00007f5e5e8d3700(0000) GS:ffff93c25e940000(0000) knlGS:0000000000000000
> >> [  123.810126 <    0.000002>] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> >> [  123.810128 <    0.000002>] CR2: 00000000000000c8 CR3: 000000027fe0a000 CR4: 00000000003506e0
> >> [  123.810130 <    0.000002>] Call Trace:
> >> [  123.810136 <    0.000006>]  __iommu_dma_unmap+0x2e/0x100
> >> [  123.810141 <    0.000005>]  ? kfree+0x389/0x3a0
> >> [  123.810144 <    0.000003>]  iommu_dma_unmap_page+0xe/0x10
> >> [  123.810149 <    0.000005>] dma_unmap_page_attrs+0x4d/0xf0
> >> [  123.810159 <    0.000010>]  ? ttm_bo_del_from_lru+0x8e/0xb0 [ttm]
> >> [  123.810165 <    0.000006>] ttm_unmap_and_unpopulate_pages+0x8e/0xc0 [ttm]
> >> [  123.810252 <    0.000087>] amdgpu_ttm_tt_unpopulate+0xaa/0xd0 [amdgpu]
> >> [  123.810258 <    0.000006>]  ttm_tt_unpopulate+0x59/0x70 [ttm]
> >> [  123.810264 <    0.000006>]  ttm_tt_destroy+0x6a/0x70 [ttm]
> >> [  123.810270 <    0.000006>] ttm_bo_cleanup_memtype_use+0x36/0xa0 [ttm]
> >> [  123.810276 <    0.000006>]  ttm_bo_put+0x1e7/0x400 [ttm]
> >> [  123.810358 <    0.000082>]  amdgpu_bo_unref+0x1e/0x30 [amdgpu]
> >> [  123.810440 <    0.000082>] amdgpu_gem_object_free+0x37/0x50 [amdgpu]
> >> [  123.810459 <    0.000019>]  drm_gem_object_free+0x35/0x40 [drm]
> >> [  123.810476 <    0.000017>] drm_gem_object_handle_put_unlocked+0x9d/0xd0 [drm]
> >> [  123.810494 <    0.000018>] drm_gem_object_release_handle+0x74/0x90 [drm]
> >> [  123.810511 <    0.000017>]  ? drm_gem_object_handle_put_unlocked+0xd0/0xd0 [drm]
> >> [  123.810516 <    0.000005>]  idr_for_each+0x4d/0xd0
> >> [  123.810534 <    0.000018>]  drm_gem_release+0x20/0x30 [drm]
> >> [  123.810550 <    0.000016>]  drm_file_free+0x251/0x2a0 [drm]
> >> [  123.810567 <    0.000017>] drm_close_helper.isra.14+0x61/0x70 [drm]
> >> [  123.810583 <    0.000016>]  drm_release+0x6a/0xe0 [drm]
> >> [  123.810588 <    0.000005>]  __fput+0xa2/0x250
> >> [  123.810592 <    0.000004>]  ____fput+0xe/0x10
> >> [  123.810595 <    0.000003>]  task_work_run+0x6c/0xa0
> >> [  123.810600 <    0.000005>]  do_exit+0x376/0xb60
> >> [  123.810604 <    0.000004>]  do_group_exit+0x43/0xa0
> >> [  123.810608 <    0.000004>]  get_signal+0x18b/0x8e0
> >> [  123.810612 <    0.000004>]  ? do_futex+0x595/0xc20
> >> [  123.810617 <    0.000005>]  arch_do_signal+0x34/0x880
> >> [  123.810620 <    0.000003>]  ? check_preempt_curr+0x50/0x60
> >> [  123.810623 <    0.000003>]  ? ttwu_do_wakeup+0x1e/0x160
> >> [  123.810626 <    0.000003>]  ? ttwu_do_activate+0x61/0x70
> >> [  123.810630 <    0.000004>] exit_to_user_mode_prepare+0x124/0x1b0
> >> [  123.810635 <    0.000005>] syscall_exit_to_user_mode+0x31/0x170
> >> [  123.810639 <    0.000004>]  do_syscall_64+0x43/0x80
> >>
> >>
> >> Andrey
> >>
> >>
> >>
> >> Christian.
> >>
> >> +
> >> + if (adev->irq.ih.use_bus_addr)
> >> + amdgpu_ih_ring_fini(adev, &adev->irq.ih);
> >> + if (adev->irq.ih1.use_bus_addr)
> >> + amdgpu_ih_ring_fini(adev, &adev->irq.ih1);
> >> + if (adev->irq.ih2.use_bus_addr)
> >> + amdgpu_ih_ring_fini(adev, &adev->irq.ih2);
> >> +
> >> + amdgpu_gart_dummy_page_fini(adev);
> >> + }
> >> +
> >> + return NOTIFY_OK;
> >> +}
> >> +
> >> +
> >>    /**
> >>     * amdgpu_device_init - initialize the driver
> >>     *
> >> @@ -3304,6 +3339,8 @@ int amdgpu_device_init(struct amdgpu_device *adev,
> >>    INIT_WORK(&adev->xgmi_reset_work, amdgpu_device_xgmi_reset_func);
> >> + INIT_LIST_HEAD(&adev->device_bo_list);
> >> +
> >>    adev->gfx.gfx_off_req_count = 1;
> >>    adev->pm.ac_power = power_supply_is_system_supplied() > 0;
> >> @@ -3575,6 +3612,15 @@ int amdgpu_device_init(struct amdgpu_device *adev,
> >>    if (amdgpu_device_cache_pci_state(adev->pdev))
> >>    pci_restore_state(pdev);
> >> + BLOCKING_INIT_NOTIFIER_HEAD(&adev->notifier);
> >> + adev->nb.notifier_call = amdgpu_iommu_group_notifier;
> >> +
> >> + if (adev->dev->iommu_group) {
> >> + r = iommu_group_register_notifier(adev->dev->iommu_group, &adev->nb);
> >> + if (r)
> >> + goto failed;
> >> + }
> >> +
> >>    return 0;
> >>    failed:
> >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
> >> index 0db9330..486ad6d 100644
> >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
> >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
> >> @@ -92,7 +92,7 @@ static int amdgpu_gart_dummy_page_init(struct amdgpu_device *adev)
> >>     *
> >>     * Frees the dummy page used by the driver (all asics).
> >>     */
> >> -static void amdgpu_gart_dummy_page_fini(struct amdgpu_device *adev)
> >> +void amdgpu_gart_dummy_page_fini(struct amdgpu_device *adev)
> >>    {
> >>    if (!adev->dummy_page_addr)
> >>    return;
> >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h
> >> index afa2e28..5678d9c 100644
> >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h
> >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h
> >> @@ -61,6 +61,7 @@ int amdgpu_gart_table_vram_pin(struct amdgpu_device *adev);
> >>    void amdgpu_gart_table_vram_unpin(struct amdgpu_device *adev);
> >>    int amdgpu_gart_init(struct amdgpu_device *adev);
> >>    void amdgpu_gart_fini(struct amdgpu_device *adev);
> >> +void amdgpu_gart_dummy_page_fini(struct amdgpu_device *adev);
> >>    int amdgpu_gart_unbind(struct amdgpu_device *adev, uint64_t offset,
> >>          int pages);
> >>    int amdgpu_gart_map(struct amdgpu_device *adev, uint64_t offset,
> >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> >> index 6cc9919..4a1de69 100644
> >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> >> @@ -94,6 +94,10 @@ static void amdgpu_bo_destroy(struct ttm_buffer_object *tbo)
> >>    }
> >>    amdgpu_bo_unref(&bo->parent);
> >> + spin_lock(&ttm_bo_glob.lru_lock);
> >> + list_del(&bo->bo);
> >> + spin_unlock(&ttm_bo_glob.lru_lock);
> >> +
> >>    kfree(bo->metadata);
> >>    kfree(bo);
> >>    }
> >> @@ -613,6 +617,12 @@ static int amdgpu_bo_do_create(struct amdgpu_device *adev,
> >>    if (bp->type == ttm_bo_type_device)
> >>    bo->flags &= ~AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED;
> >> + INIT_LIST_HEAD(&bo->bo);
> >> +
> >> + spin_lock(&ttm_bo_glob.lru_lock);
> >> + list_add_tail(&bo->bo, &adev->device_bo_list);
> >> + spin_unlock(&ttm_bo_glob.lru_lock);
> >> +
> >>    return 0;
> >>    fail_unreserve:
> >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
> >> index 9ac3756..5ae8555 100644
> >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
> >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
> >> @@ -110,6 +110,8 @@ struct amdgpu_bo {
> >>    struct list_head shadow_list;
> >>    struct kgd_mem                  *kfd_bo;
> >> +
> >> + struct list_head bo;
> >>    };
> >>    static inline struct amdgpu_bo *ttm_to_amdgpu_bo(struct ttm_buffer_object *tbo)
> >
> >



-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 196+ messages in thread

* Re: [PATCH v4 00/14] RFC Support hot device unplug in amdgpu
  2021-01-19 18:18         ` Andrey Grodzovsky
@ 2021-01-20  9:05           ` Daniel Vetter
  -1 siblings, 0 replies; 196+ messages in thread
From: Daniel Vetter @ 2021-01-20  9:05 UTC (permalink / raw)
  To: Andrey Grodzovsky
  Cc: Christian König, dri-devel, amd-gfx list, Greg KH,
	Alex Deucher, Qiang Yu

On Tue, Jan 19, 2021 at 01:18:15PM -0500, Andrey Grodzovsky wrote:
> 
> On 1/19/21 1:08 PM, Daniel Vetter wrote:
> > On Tue, Jan 19, 2021 at 6:31 PM Andrey Grodzovsky
> > <Andrey.Grodzovsky@amd.com> wrote:
> > > 
> > > On 1/19/21 9:16 AM, Daniel Vetter wrote:
> > > > On Mon, Jan 18, 2021 at 04:01:09PM -0500, Andrey Grodzovsky wrote:
> > > > > Until now extracting a card either by physical extraction (e.g. eGPU with
> > > > > thunderbolt connection or by emulation through  syfs -> /sys/bus/pci/devices/device_id/remove)
> > > > > would cause random crashes in user apps. The random crashes in apps were
> > > > > mostly due to the app having mapped a device backed BO into its address
> > > > > space was still trying to access the BO while the backing device was gone.
> > > > > To answer this first problem Christian suggested to fix the handling of mapped
> > > > > memory in the clients when the device goes away by forcibly unmap all buffers the
> > > > > user processes has by clearing their respective VMAs mapping the device BOs.
> > > > > Then when the VMAs try to fill in the page tables again we check in the fault
> > > > > handlerif the device is removed and if so, return an error. This will generate a
> > > > > SIGBUS to the application which can then cleanly terminate.This indeed was done
> > > > > but this in turn created a problem of kernel OOPs were the OOPSes were due to the
> > > > > fact that while the app was terminating because of the SIGBUSit would trigger use
> > > > > after free in the driver by calling to accesses device structures that were already
> > > > > released from the pci remove sequence.This was handled by introducing a 'flush'
> > > > > sequence during device removal were we wait for drm file reference to drop to 0
> > > > > meaning all user clients directly using this device terminated.
> > > > > 
> > > > > v2:
> > > > > Based on discussions in the mailing list with Daniel and Pekka [1] and based on the document
> > > > > produced by Pekka from those discussions [2] the whole approach with returning SIGBUS and
> > > > > waiting for all user clients having CPU mapping of device BOs to die was dropped.
> > > > > Instead as per the document suggestion the device structures are kept alive until
> > > > > the last reference to the device is dropped by user client and in the meanwhile all existing and new CPU mappings of the BOs
> > > > > belonging to the device directly or by dma-buf import are rerouted to per user
> > > > > process dummy rw page.Also, I skipped the 'Requirements for KMS UAPI' section of [2]
> > > > > since i am trying to get the minimal set of requirements that still give useful solution
> > > > > to work and this is the'Requirements for Render and Cross-Device UAPI' section and so my
> > > > > test case is removing a secondary device, which is render only and is not involved
> > > > > in KMS.
> > > > > 
> > > > > v3:
> > > > > More updates following comments from v2 such as removing loop to find DRM file when rerouting
> > > > > page faults to dummy page,getting rid of unnecessary sysfs handling refactoring and moving
> > > > > prevention of GPU recovery post device unplug from amdgpu to scheduler layer.
> > > > > On top of that added unplug support for the IOMMU enabled system.
> > > > > 
> > > > > v4:
> > > > > Drop last sysfs hack and use sysfs default attribute.
> > > > > Guard against write accesses after device removal to avoid modifying released memory.
> > > > > Update dummy pages handling to on demand allocation and release through drm managed framework.
> > > > > Add return value to scheduler job TO handler (by Luben Tuikov) and use this in amdgpu for prevention
> > > > > of GPU recovery post device unplug
> > > > > Also rebase on top of drm-misc-mext instead of amd-staging-drm-next
> > > > > 
> > > > > With these patches I am able to gracefully remove the secondary card using sysfs remove hook while glxgears
> > > > > is running off of secondary card (DRI_PRIME=1) without kernel oopses or hangs and keep working
> > > > > with the primary card or soft reset the device without hangs or oopses
> > > > > 
> > > > > TODOs for followup work:
> > > > > Convert AMDGPU code to use devm (for hw stuff) and drmm (for sw stuff and allocations) (Daniel)
> > > > > Support plugging the secondary device back after unplug - currently still experiencing HW error on plugging back.
> > > > > Add support for 'Requirements for KMS UAPI' section of [2] - unplugging primary, display connected card.
> > > > > 
> > > > > [1] - Discussions during v3 of the patchset https://www.spinics.net/lists/amd-gfx/msg55576.html
> > > > > [2] - drm/doc: device hot-unplug for userspace https://www.spinics.net/lists/dri-devel/msg259755.html
> > > > > [3] - Related gitlab ticket https://gitlab.freedesktop.org/drm/amd/-/issues/1081
> > > > btw have you tried this out with some of the igts we have? core_hotunplug
> > > > is the one I'm thinking of. Might be worth to extend this for amdgpu
> > > > specific stuff (like run some batches on it while hotunplugging).
> > > No, while testing I mostly just ran glxgears, which already covers the
> > > exported/imported dma-buf case, plus a few manually hacked tests in the libdrm amdgpu
> > > test suite
> > > 
> > > 
> > > > Since there's so many corner cases we need to test here (shared dma-buf,
> > > > shared dma_fence) I think it would make sense to have a shared testcase
> > > > across drivers.
> > > 
> > > I'm not too familiar with IGT; is there an easy way to set up shared dma-buf
> > > and fence
> > > use cases there, or do you mean I need to add them now ?
> > We do have test infrastructure for all of that, but the hotunplug test
> > doesn't have that yet I think.
> > 
> > > > Only specific thing would be some hooks to keep the gpu
> > > > busy in some fashion while we yank the driver.
> > > 
> > > Do you mean like starting X and some active rendering on top (like glxgears)
> > > automatically from within IGT ?
> > Nope, igt is meant to be bare metal testing so you don't have to drag
> > the entire winsys around (which in a wayland world, is not really good
> > for driver testing anyway, since everything is different). We use this
> > for our pre-merge ci for drm/i915.
> 
> 
> So I keep it busy with X/glxgears, which is a manual operation. What you suggest
> then is some client within IGT which opens the device and starts submitting jobs
> (which is much like what the libdrm amdgpu tests already do) ? And this
> part is the amdgpu specific code I just need to port from libdrm to here ?

Yup. For i915 tests we have an entire library already for small workloads,
including some that just spin forever (useful for reset testing, and could
also come in handy for unload testing).
-Daniel
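
A sketch of what such a subtest skeleton could look like in IGT —
igt_main, igt_fixture, igt_subtest and drm_open_driver() are existing IGT
constructs, while start_background_workload() is a hypothetical
placeholder for the amdgpu submission helper being discussed:

	#include <unistd.h>
	#include "igt.h"

	igt_main
	{
		int fd = -1;

		igt_fixture
			fd = drm_open_driver(DRIVER_AMDGPU);

		igt_subtest("unplug-while-busy") {
			/* Hypothetical: keep the GPU busy with small
			 * submissions, like the libdrm amdgpu tests do. */
			start_background_workload(fd);
			/* Then remove the device via the sysfs remove
			 * hook and check that nothing oopses. */
		}

		igt_fixture
			close(fd);
	}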

> 
> Andrey
> 
> 
> > 
> > > > But just to get it started
> > > > you can throw in entirely amdgpu specific subtests and just share some of
> > > > the test code.
> > > > -Daniel
> > > 
> > > In general, I wasn't aware of this test suite, and it looks like it does what I test,
> > > among other stuff.
> > > I will definitely try to run with it, although the rescan part will not work since
> > > plugging
> > > the device back is on my TODO list and not part of the scope for this patchset,
> > > and so I will
> > > probably comment the re-scan section out while testing.
> > amd gem has been using libdrm-amd thus far iirc, but for things like
> > this I think it'd be worth at least considering a switch. The display
> > team has already started to use some of the tests and contribute stuff
> > (I think the VRR testcase is from amd).
> > -Daniel
> > 
> > > Andrey
> > > 
> > > 
> > > > > Andrey Grodzovsky (13):
> > > > >     drm/ttm: Remap all page faults to per process dummy page.
> > > > >     drm: Unmap the entire device address space on device unplug
> > > > >     drm/ttm: Expose ttm_tt_unpopulate for driver use
> > > > >     drm/sched: Cancel and flush all outstanding jobs before finish.
> > > > >     drm/amdgpu: Split amdgpu_device_fini into early and late
> > > > >     drm/amdgpu: Add early fini callback
> > > > >     drm/amdgpu: Register IOMMU topology notifier per device.
> > > > >     drm/amdgpu: Fix a bunch of sdma code crash post device unplug
> > > > >     drm/amdgpu: Remap all page faults to per process dummy page.
> > > > >     drm/amdgpu: Move some sysfs attrs creation to default_attr
> > > > >     drm/amdgpu: Guard against write accesses after device removal
> > > > >     drm/sched: Make timeout timer rearm conditional.
> > > > >     drm/amdgpu: Prevent any job recoveries after device is unplugged.
> > > > > 
> > > > > Luben Tuikov (1):
> > > > >     drm/scheduler: Job timeout handler returns status
> > > > > 
> > > > >    drivers/gpu/drm/amd/amdgpu/amdgpu.h               |  11 +-
> > > > >    drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c      |  17 +--
> > > > >    drivers/gpu/drm/amd/amdgpu/amdgpu_device.c        | 149 ++++++++++++++++++++--
> > > > >    drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c           |  20 ++-
> > > > >    drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c         |  15 ++-
> > > > >    drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c          |   2 +-
> > > > >    drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h          |   1 +
> > > > >    drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c           |   9 ++
> > > > >    drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c       |  25 ++--
> > > > >    drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c           |  26 ++--
> > > > >    drivers/gpu/drm/amd/amdgpu/amdgpu_irq.h           |   3 +-
> > > > >    drivers/gpu/drm/amd/amdgpu/amdgpu_job.c           |  19 ++-
> > > > >    drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c           |  12 +-
> > > > >    drivers/gpu/drm/amd/amdgpu/amdgpu_object.c        |  10 ++
> > > > >    drivers/gpu/drm/amd/amdgpu/amdgpu_object.h        |   2 +
> > > > >    drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c           |  53 +++++---
> > > > >    drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h           |   3 +
> > > > >    drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c           |   1 +
> > > > >    drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c          |  70 ++++++++++
> > > > >    drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h          |  52 +-------
> > > > >    drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c           |  21 ++-
> > > > >    drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c            |   8 +-
> > > > >    drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c      |  14 +-
> > > > >    drivers/gpu/drm/amd/amdgpu/cik_ih.c               |   2 +-
> > > > >    drivers/gpu/drm/amd/amdgpu/cz_ih.c                |   2 +-
> > > > >    drivers/gpu/drm/amd/amdgpu/iceland_ih.c           |   2 +-
> > > > >    drivers/gpu/drm/amd/amdgpu/navi10_ih.c            |   2 +-
> > > > >    drivers/gpu/drm/amd/amdgpu/psp_v11_0.c            |  16 +--
> > > > >    drivers/gpu/drm/amd/amdgpu/psp_v12_0.c            |   8 +-
> > > > >    drivers/gpu/drm/amd/amdgpu/psp_v3_1.c             |   8 +-
> > > > >    drivers/gpu/drm/amd/amdgpu/si_ih.c                |   2 +-
> > > > >    drivers/gpu/drm/amd/amdgpu/tonga_ih.c             |   2 +-
> > > > >    drivers/gpu/drm/amd/amdgpu/vega10_ih.c            |   2 +-
> > > > >    drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c |  12 +-
> > > > >    drivers/gpu/drm/amd/include/amd_shared.h          |   2 +
> > > > >    drivers/gpu/drm/drm_drv.c                         |   3 +
> > > > >    drivers/gpu/drm/etnaviv/etnaviv_sched.c           |  10 +-
> > > > >    drivers/gpu/drm/lima/lima_sched.c                 |   4 +-
> > > > >    drivers/gpu/drm/panfrost/panfrost_job.c           |   9 +-
> > > > >    drivers/gpu/drm/scheduler/sched_main.c            |  18 ++-
> > > > >    drivers/gpu/drm/ttm/ttm_bo_vm.c                   |  82 +++++++++++-
> > > > >    drivers/gpu/drm/ttm/ttm_tt.c                      |   1 +
> > > > >    drivers/gpu/drm/v3d/v3d_sched.c                   |  32 ++---
> > > > >    include/drm/gpu_scheduler.h                       |  17 ++-
> > > > >    include/drm/ttm/ttm_bo_api.h                      |   2 +
> > > > >    45 files changed, 583 insertions(+), 198 deletions(-)
> > > > > 
> > > > > --
> > > > > 2.7.4
> > > > > 
> > 
> > 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 196+ messages in thread

* Re: [PATCH v4 00/14] RFC Support hot device unplug in amdgpu
@ 2021-01-20  9:05           ` Daniel Vetter
  0 siblings, 0 replies; 196+ messages in thread
From: Daniel Vetter @ 2021-01-20  9:05 UTC (permalink / raw)
  To: Andrey Grodzovsky
  Cc: Rob Herring, Christian König, dri-devel, Anholt, Eric,
	Pekka Paalanen, amd-gfx list, Daniel Vetter, Greg KH,
	Alex Deucher, Qiang Yu, Wentland, Harry, Lucas Stach

On Tue, Jan 19, 2021 at 01:18:15PM -0500, Andrey Grodzovsky wrote:
> 
> On 1/19/21 1:08 PM, Daniel Vetter wrote:
> > On Tue, Jan 19, 2021 at 6:31 PM Andrey Grodzovsky
> > <Andrey.Grodzovsky@amd.com> wrote:
> > > 
> > > On 1/19/21 9:16 AM, Daniel Vetter wrote:
> > > > On Mon, Jan 18, 2021 at 04:01:09PM -0500, Andrey Grodzovsky wrote:
> > > > > Until now extracting a card either by physical extraction (e.g. eGPU with
> > > > > thunderbolt connection or by emulation through  syfs -> /sys/bus/pci/devices/device_id/remove)
> > > > > would cause random crashes in user apps. The random crashes in apps were
> > > > > mostly due to the app having mapped a device backed BO into its address
> > > > > space was still trying to access the BO while the backing device was gone.
> > > > > To answer this first problem Christian suggested to fix the handling of mapped
> > > > > memory in the clients when the device goes away by forcibly unmap all buffers the
> > > > > user processes has by clearing their respective VMAs mapping the device BOs.
> > > > > Then when the VMAs try to fill in the page tables again we check in the fault
> > > > > handlerif the device is removed and if so, return an error. This will generate a
> > > > > SIGBUS to the application which can then cleanly terminate.This indeed was done
> > > > > but this in turn created a problem of kernel OOPs were the OOPSes were due to the
> > > > > fact that while the app was terminating because of the SIGBUSit would trigger use
> > > > > after free in the driver by calling to accesses device structures that were already
> > > > > released from the pci remove sequence.This was handled by introducing a 'flush'
> > > > > sequence during device removal were we wait for drm file reference to drop to 0
> > > > > meaning all user clients directly using this device terminated.
> > > > > 
> > > > > v2:
> > > > > Based on discussions in the mailing list with Daniel and Pekka [1] and based on the document
> > > > > produced by Pekka from those discussions [2] the whole approach with returning SIGBUS and
> > > > > waiting for all user clients having CPU mapping of device BOs to die was dropped.
> > > > > Instead as per the document suggestion the device structures are kept alive until
> > > > > the last reference to the device is dropped by user client and in the meanwhile all existing and new CPU mappings of the BOs
> > > > > belonging to the device directly or by dma-buf import are rerouted to per user
> > > > > process dummy rw page.Also, I skipped the 'Requirements for KMS UAPI' section of [2]
> > > > > since i am trying to get the minimal set of requirements that still give useful solution
> > > > > to work and this is the'Requirements for Render and Cross-Device UAPI' section and so my
> > > > > test case is removing a secondary device, which is render only and is not involved
> > > > > in KMS.
> > > > > 
> > > > > v3:
> > > > > More updates following comments from v2 such as removing loop to find DRM file when rerouting
> > > > > page faults to dummy page,getting rid of unnecessary sysfs handling refactoring and moving
> > > > > prevention of GPU recovery post device unplug from amdgpu to scheduler layer.
> > > > > On top of that added unplug support for the IOMMU enabled system.
> > > > > 
> > > > > v4:
> > > > > Drop last sysfs hack and use sysfs default attribute.
> > > > > Guard against write accesses after device removal to avoid modifying released memory.
> > > > > Update dummy pages handling to on demand allocation and release through drm managed framework.
> > > > > Add return value to scheduler job TO handler (by Luben Tuikov) and use this in amdgpu for prevention
> > > > > of GPU recovery post device unplug
> > > > > Also rebase on top of drm-misc-mext instead of amd-staging-drm-next
> > > > > 
> > > > > With these patches I am able to gracefully remove the secondary card using sysfs remove hook while glxgears
> > > > > is running off of secondary card (DRI_PRIME=1) without kernel oopses or hangs and keep working
> > > > > with the primary card or soft reset the device without hangs or oopses
> > > > > 
> > > > > TODOs for followup work:
> > > > > Convert AMDGPU code to use devm (for hw stuff) and drmm (for sw stuff and allocations) (Daniel)
> > > > > Support plugging the secondary device back after unplug - currently still experiencing HW error on plugging back.
> > > > > Add support for 'Requirements for KMS UAPI' section of [2] - unplugging primary, display connected card.
> > > > > 
> > > > > [1] - Discussions during v3 of the patchset https://www.spinics.net/lists/amd-gfx/msg55576.html
> > > > > [2] - drm/doc: device hot-unplug for userspace https://www.spinics.net/lists/dri-devel/msg259755.html
> > > > > [3] - Related gitlab ticket https://gitlab.freedesktop.org/drm/amd/-/issues/1081
> > > > btw have you tried this out with some of the igts we have? core_hotunplug
> > > > is the one I'm thinking of. Might be worth to extend this for amdgpu
> > > > specific stuff (like run some batches on it while hotunplugging).
> > > No, while testing I mostly just ran glxgears, which already covers the
> > > exported/imported dma-buf case, plus a few manually hacked tests in the libdrm amdgpu
> > > test suite
> > > 
> > > 
> > > > Since there's so many corner cases we need to test here (shared dma-buf,
> > > > shared dma_fence) I think it would make sense to have a shared testcase
> > > > across drivers.
> > > 
> > > I'm not too familiar with IGT; is there an easy way to set up shared dma-buf
> > > and fence
> > > use cases there, or do you mean I need to add them now ?
> > We do have test infrastructure for all of that, but the hotunplug test
> > doesn't have that yet I think.
> > 
> > > > Only specific thing would be some hooks to keep the gpu
> > > > busy in some fashion while we yank the driver.
> > > 
> > > Do you mean like starting X and some active rendering on top (like glxgears)
> > > automatically from within IGT ?
> > Nope, igt is meant to be bare metal testing so you don't have to drag
> > the entire winsys around (which in a wayland world, is not really good
> > for driver testing anyway, since everything is different). We use this
> > for our pre-merge ci for drm/i915.
> 
> 
> So I keep it busy with X/glxgears, which is a manual operation. What you suggest
> then is some client within IGT which opens the device and starts submitting jobs
> (which is much like what the libdrm amdgpu tests already do) ? And this
> part is the amdgpu specific code I just need to port from libdrm to here ?

Yup. For i915 tests we have an entire library already for small workloads,
including some that just spin forever (useful for reset testing, and could
also come in handy for unload testing).
-Daniel

> 
> Andrey
> 
> 
> > 
> > > > But just to get it started
> > > > you can throw in entirely amdgpu specific subtests and just share some of
> > > > the test code.
> > > > -Daniel
> > > 
> > > In general, I wasn't aware of this test suite, and it looks like it does what I test,
> > > among other stuff.
> > > I will definitely try to run with it, although the rescan part will not work since
> > > plugging
> > > the device back is on my TODO list and not part of the scope for this patchset,
> > > and so I will
> > > probably comment the re-scan section out while testing.
> > amd gem has been using libdrm-amd thus far iirc, but for things like
> > this I think it'd be worth at least considering a switch. The display
> > team has already started to use some of the tests and contribute stuff
> > (I think the VRR testcase is from amd).
> > -Daniel
> > 
> > > Andrey
> > > 
> > > 
> > > > > Andrey Grodzovsky (13):
> > > > >     drm/ttm: Remap all page faults to per process dummy page.
> > > > >     drm: Unmap the entire device address space on device unplug
> > > > >     drm/ttm: Expose ttm_tt_unpopulate for driver use
> > > > >     drm/sched: Cancel and flush all outstanding jobs before finish.
> > > > >     drm/amdgpu: Split amdgpu_device_fini into early and late
> > > > >     drm/amdgpu: Add early fini callback
> > > > >     drm/amdgpu: Register IOMMU topology notifier per device.
> > > > >     drm/amdgpu: Fix a bunch of sdma code crash post device unplug
> > > > >     drm/amdgpu: Remap all page faults to per process dummy page.
> > > > >     drm/amdgpu: Move some sysfs attrs creation to default_attr
> > > > >     drm/amdgpu: Guard against write accesses after device removal
> > > > >     drm/sched: Make timeout timer rearm conditional.
> > > > >     drm/amdgpu: Prevent any job recoveries after device is unplugged.
> > > > > 
> > > > > Luben Tuikov (1):
> > > > >     drm/scheduler: Job timeout handler returns status
> > > > > 
> > > > >    drivers/gpu/drm/amd/amdgpu/amdgpu.h               |  11 +-
> > > > >    drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c      |  17 +--
> > > > >    drivers/gpu/drm/amd/amdgpu/amdgpu_device.c        | 149 ++++++++++++++++++++--
> > > > >    drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c           |  20 ++-
> > > > >    drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c         |  15 ++-
> > > > >    drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c          |   2 +-
> > > > >    drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h          |   1 +
> > > > >    drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c           |   9 ++
> > > > >    drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c       |  25 ++--
> > > > >    drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c           |  26 ++--
> > > > >    drivers/gpu/drm/amd/amdgpu/amdgpu_irq.h           |   3 +-
> > > > >    drivers/gpu/drm/amd/amdgpu/amdgpu_job.c           |  19 ++-
> > > > >    drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c           |  12 +-
> > > > >    drivers/gpu/drm/amd/amdgpu/amdgpu_object.c        |  10 ++
> > > > >    drivers/gpu/drm/amd/amdgpu/amdgpu_object.h        |   2 +
> > > > >    drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c           |  53 +++++---
> > > > >    drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h           |   3 +
> > > > >    drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c           |   1 +
> > > > >    drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c          |  70 ++++++++++
> > > > >    drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h          |  52 +-------
> > > > >    drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c           |  21 ++-
> > > > >    drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c            |   8 +-
> > > > >    drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c      |  14 +-
> > > > >    drivers/gpu/drm/amd/amdgpu/cik_ih.c               |   2 +-
> > > > >    drivers/gpu/drm/amd/amdgpu/cz_ih.c                |   2 +-
> > > > >    drivers/gpu/drm/amd/amdgpu/iceland_ih.c           |   2 +-
> > > > >    drivers/gpu/drm/amd/amdgpu/navi10_ih.c            |   2 +-
> > > > >    drivers/gpu/drm/amd/amdgpu/psp_v11_0.c            |  16 +--
> > > > >    drivers/gpu/drm/amd/amdgpu/psp_v12_0.c            |   8 +-
> > > > >    drivers/gpu/drm/amd/amdgpu/psp_v3_1.c             |   8 +-
> > > > >    drivers/gpu/drm/amd/amdgpu/si_ih.c                |   2 +-
> > > > >    drivers/gpu/drm/amd/amdgpu/tonga_ih.c             |   2 +-
> > > > >    drivers/gpu/drm/amd/amdgpu/vega10_ih.c            |   2 +-
> > > > >    drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c |  12 +-
> > > > >    drivers/gpu/drm/amd/include/amd_shared.h          |   2 +
> > > > >    drivers/gpu/drm/drm_drv.c                         |   3 +
> > > > >    drivers/gpu/drm/etnaviv/etnaviv_sched.c           |  10 +-
> > > > >    drivers/gpu/drm/lima/lima_sched.c                 |   4 +-
> > > > >    drivers/gpu/drm/panfrost/panfrost_job.c           |   9 +-
> > > > >    drivers/gpu/drm/scheduler/sched_main.c            |  18 ++-
> > > > >    drivers/gpu/drm/ttm/ttm_bo_vm.c                   |  82 +++++++++++-
> > > > >    drivers/gpu/drm/ttm/ttm_tt.c                      |   1 +
> > > > >    drivers/gpu/drm/v3d/v3d_sched.c                   |  32 ++---
> > > > >    include/drm/gpu_scheduler.h                       |  17 ++-
> > > > >    include/drm/ttm/ttm_bo_api.h                      |   2 +
> > > > >    45 files changed, 583 insertions(+), 198 deletions(-)
> > > > > 
> > > > > --
> > > > > 2.7.4
> > > > > 
> > 
> > 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 196+ messages in thread

* Re: [PATCH v4 00/14] RFC Support hot device unplug in amdgpu
  2021-01-20  9:05           ` Daniel Vetter
@ 2021-01-20 14:19             ` Andrey Grodzovsky
  -1 siblings, 0 replies; 196+ messages in thread
From: Andrey Grodzovsky @ 2021-01-20 14:19 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: amd-gfx list, Christian König, dri-devel, Qiang Yu, Greg KH,
	Alex Deucher


On 1/20/21 4:05 AM, Daniel Vetter wrote:
> On Tue, Jan 19, 2021 at 01:18:15PM -0500, Andrey Grodzovsky wrote:
>> On 1/19/21 1:08 PM, Daniel Vetter wrote:
>>> On Tue, Jan 19, 2021 at 6:31 PM Andrey Grodzovsky
>>> <Andrey.Grodzovsky@amd.com> wrote:
>>>> On 1/19/21 9:16 AM, Daniel Vetter wrote:
>>>>> On Mon, Jan 18, 2021 at 04:01:09PM -0500, Andrey Grodzovsky wrote:
>>>>>> Until now extracting a card either by physical extraction (e.g. eGPU with
>>>>>> thunderbolt connection or by emulation through  syfs -> /sys/bus/pci/devices/device_id/remove)
>>>>>> would cause random crashes in user apps. The random crashes in apps were
>>>>>> mostly due to the app having mapped a device backed BO into its address
>>>>>> space was still trying to access the BO while the backing device was gone.
>>>>>> To answer this first problem Christian suggested to fix the handling of mapped
>>>>>> memory in the clients when the device goes away by forcibly unmap all buffers the
>>>>>> user processes has by clearing their respective VMAs mapping the device BOs.
>>>>>> Then when the VMAs try to fill in the page tables again we check in the fault
>>>>>> handlerif the device is removed and if so, return an error. This will generate a
>>>>>> SIGBUS to the application which can then cleanly terminate.This indeed was done
>>>>>> but this in turn created a problem of kernel OOPs were the OOPSes were due to the
>>>>>> fact that while the app was terminating because of the SIGBUSit would trigger use
>>>>>> after free in the driver by calling to accesses device structures that were already
>>>>>> released from the pci remove sequence.This was handled by introducing a 'flush'
>>>>>> sequence during device removal were we wait for drm file reference to drop to 0
>>>>>> meaning all user clients directly using this device terminated.
>>>>>>
>>>>>> v2:
>>>>>> Based on discussions in the mailing list with Daniel and Pekka [1] and based on the document
>>>>>> produced by Pekka from those discussions [2] the whole approach with returning SIGBUS and
>>>>>> waiting for all user clients having CPU mapping of device BOs to die was dropped.
>>>>>> Instead as per the document suggestion the device structures are kept alive until
>>>>>> the last reference to the device is dropped by user client and in the meanwhile all existing and new CPU mappings of the BOs
>>>>>> belonging to the device directly or by dma-buf import are rerouted to per user
>>>>>> process dummy rw page.Also, I skipped the 'Requirements for KMS UAPI' section of [2]
>>>>>> since i am trying to get the minimal set of requirements that still give useful solution
>>>>>> to work and this is the'Requirements for Render and Cross-Device UAPI' section and so my
>>>>>> test case is removing a secondary device, which is render only and is not involved
>>>>>> in KMS.
>>>>>>
>>>>>> v3:
>>>>>> More updates following comments from v2 such as removing loop to find DRM file when rerouting
>>>>>> page faults to dummy page,getting rid of unnecessary sysfs handling refactoring and moving
>>>>>> prevention of GPU recovery post device unplug from amdgpu to scheduler layer.
>>>>>> On top of that added unplug support for the IOMMU enabled system.
>>>>>>
>>>>>> v4:
>>>>>> Drop last sysfs hack and use sysfs default attribute.
>>>>>> Guard against write accesses after device removal to avoid modifying released memory.
>>>>>> Update dummy pages handling to on demand allocation and release through drm managed framework.
>>>>>> Add return value to scheduler job TO handler (by Luben Tuikov) and use this in amdgpu for prevention
>>>>>> of GPU recovery post device unplug
>>>>>> Also rebase on top of drm-misc-mext instead of amd-staging-drm-next
>>>>>>
>>>>>> With these patches I am able to gracefully remove the secondary card using sysfs remove hook while glxgears
>>>>>> is running off of secondary card (DRI_PRIME=1) without kernel oopses or hangs and keep working
>>>>>> with the primary card or soft reset the device without hangs or oopses
>>>>>>
>>>>>> TODOs for followup work:
>>>>>> Convert AMDGPU code to use devm (for hw stuff) and drmm (for sw stuff and allocations) (Daniel)
>>>>>> Support plugging the secondary device back after unplug - currently still experiencing HW error on plugging back.
>>>>>> Add support for 'Requirements for KMS UAPI' section of [2] - unplugging primary, display connected card.
>>>>>>
>>>>>> [1] - Discussions during v3 of the patchset https://www.spinics.net/lists/amd-gfx/msg55576.html
>>>>>> [2] - drm/doc: device hot-unplug for userspace https://www.spinics.net/lists/dri-devel/msg259755.html
>>>>>> [3] - Related gitlab ticket https://gitlab.freedesktop.org/drm/amd/-/issues/1081
>>>>> btw have you tried this out with some of the igts we have? core_hotunplug
>>>>> is the one I'm thinking of. Might be worth to extend this for amdgpu
>>>>> specific stuff (like run some batches on it while hotunplugging).
>>>> No, while testing I mostly just ran glxgears, which already covers the
>>>> exported/imported dma-buf case, plus a few manually hacked tests in the libdrm amdgpu
>>>> test suite
>>>>
>>>>
>>>>> Since there's so many corner cases we need to test here (shared dma-buf,
>>>>> shared dma_fence) I think it would make sense to have a shared testcase
>>>>> across drivers.
>>>> I'm not too familiar with IGT; is there an easy way to set up shared dma-buf
>>>> and fence
>>>> use cases there, or do you mean I need to add them now ?
>>> We do have test infrastructure for all of that, but the hotunplug test
>>> doesn't have that yet I think.
>>>
>>>>> Only specific thing would be some hooks to keep the gpu
>>>>> busy in some fashion while we yank the driver.
>>>> Do you mean like starting X and some active rendering on top (like glxgears)
>>>> automatically from within IGT ?
>>> Nope, igt is meant to be bare metal testing so you don't have to drag
>>> the entire winsys around (which in a wayland world, is not really good
>>> for driver testing anyway, since everything is different). We use this
>>> for our pre-merge ci for drm/i915.
>>
>> So I keep it busy with X/glxgears, which is a manual operation. What you suggest
>> then is some client within IGT which opens the device and starts submitting jobs
>> (which is much like what the libdrm amdgpu tests already do) ? And this
>> part is the amdgpu specific code I just need to port from libdrm to here ?
> Yup. For i915 tests we have an entire library already for small workloads,
> including some that just spin forever (useful for reset testing, and could
> also come in handy for unload testing).
> -Daniel


Does it mean I would have to drag in, from within the libdrm amdgpu code,
the entire infrastructure that allows for command submissions through
our IOCTLs ?

Andrey
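
For reference, the libdrm amdgpu bring-up in question is fairly small. A
hedged sketch of the minimal open/context scaffolding that would need
porting — the render node path is an assumption, and a real workload
would additionally need IB/BO setup for amdgpu_cs_submit(), omitted here:

	#include <fcntl.h>
	#include <unistd.h>
	#include <amdgpu.h>

	int main(void)
	{
		uint32_t major, minor;
		amdgpu_device_handle dev;
		amdgpu_context_handle ctx;
		/* Device path is an assumption; IGT enumerates devices itself. */
		int fd = open("/dev/dri/renderD128", O_RDWR);

		amdgpu_device_initialize(fd, &major, &minor, &dev);
		amdgpu_cs_ctx_create(dev, &ctx);

		/* ... loop submitting small IBs here while the test yanks
		 * the device via the sysfs remove hook ... */

		amdgpu_cs_ctx_free(ctx);
		amdgpu_device_deinitialize(dev);
		close(fd);
		return 0;
	}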

>
>> Andrey
>>
>>
>>>>> But just to get it started
>>>>> you can throw in entirely amdgpu specific subtests and just share some of
>>>>> the test code.
>>>>> -Daniel
>>>> In general, I wasn't aware of this test suite, and it looks like it does what I test,
>>>> among other stuff.
>>>> I will definitely try to run with it, although the rescan part will not work since
>>>> plugging
>>>> the device back is on my TODO list and not part of the scope for this patchset,
>>>> and so I will
>>>> probably comment the re-scan section out while testing.
>>> amd gem has been using libdrm-amd thus far iirc, but for things like
>>> this I think it'd be worth at least considering a switch. The display
>>> team has already started to use some of the tests and contribute stuff
>>> (I think the VRR testcase is from amd).
>>> -Daniel
>>>
>>>> Andrey
>>>>
>>>>
>>>>>> Andrey Grodzovsky (13):
>>>>>>      drm/ttm: Remap all page faults to per process dummy page.
>>>>>>      drm: Unmap the entire device address space on device unplug
>>>>>>      drm/ttm: Expose ttm_tt_unpopulate for driver use
>>>>>>      drm/sched: Cancel and flush all outstanding jobs before finish.
>>>>>>      drm/amdgpu: Split amdgpu_device_fini into early and late
>>>>>>      drm/amdgpu: Add early fini callback
>>>>>>      drm/amdgpu: Register IOMMU topology notifier per device.
>>>>>>      drm/amdgpu: Fix a bunch of sdma code crash post device unplug
>>>>>>      drm/amdgpu: Remap all page faults to per process dummy page.
>>>>>>      drm/amdgpu: Move some sysfs attrs creation to default_attr
>>>>>>      drm/amdgpu: Guard against write accesses after device removal
>>>>>>      drm/sched: Make timeout timer rearm conditional.
>>>>>>      drm/amdgpu: Prevent any job recoveries after device is unplugged.
>>>>>>
>>>>>> Luben Tuikov (1):
>>>>>>      drm/scheduler: Job timeout handler returns status
>>>>>>
>>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu.h               |  11 +-
>>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c      |  17 +--
>>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_device.c        | 149 ++++++++++++++++++++--
>>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c           |  20 ++-
>>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c         |  15 ++-
>>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c          |   2 +-
>>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h          |   1 +
>>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c           |   9 ++
>>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c       |  25 ++--
>>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c           |  26 ++--
>>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_irq.h           |   3 +-
>>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_job.c           |  19 ++-
>>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c           |  12 +-
>>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_object.c        |  10 ++
>>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_object.h        |   2 +
>>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c           |  53 +++++---
>>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h           |   3 +
>>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c           |   1 +
>>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c          |  70 ++++++++++
>>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h          |  52 +-------
>>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c           |  21 ++-
>>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c            |   8 +-
>>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c      |  14 +-
>>>>>>     drivers/gpu/drm/amd/amdgpu/cik_ih.c               |   2 +-
>>>>>>     drivers/gpu/drm/amd/amdgpu/cz_ih.c                |   2 +-
>>>>>>     drivers/gpu/drm/amd/amdgpu/iceland_ih.c           |   2 +-
>>>>>>     drivers/gpu/drm/amd/amdgpu/navi10_ih.c            |   2 +-
>>>>>>     drivers/gpu/drm/amd/amdgpu/psp_v11_0.c            |  16 +--
>>>>>>     drivers/gpu/drm/amd/amdgpu/psp_v12_0.c            |   8 +-
>>>>>>     drivers/gpu/drm/amd/amdgpu/psp_v3_1.c             |   8 +-
>>>>>>     drivers/gpu/drm/amd/amdgpu/si_ih.c                |   2 +-
>>>>>>     drivers/gpu/drm/amd/amdgpu/tonga_ih.c             |   2 +-
>>>>>>     drivers/gpu/drm/amd/amdgpu/vega10_ih.c            |   2 +-
>>>>>>     drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c |  12 +-
>>>>>>     drivers/gpu/drm/amd/include/amd_shared.h          |   2 +
>>>>>>     drivers/gpu/drm/drm_drv.c                         |   3 +
>>>>>>     drivers/gpu/drm/etnaviv/etnaviv_sched.c           |  10 +-
>>>>>>     drivers/gpu/drm/lima/lima_sched.c                 |   4 +-
>>>>>>     drivers/gpu/drm/panfrost/panfrost_job.c           |   9 +-
>>>>>>     drivers/gpu/drm/scheduler/sched_main.c            |  18 ++-
>>>>>>     drivers/gpu/drm/ttm/ttm_bo_vm.c                   |  82 +++++++++++-
>>>>>>     drivers/gpu/drm/ttm/ttm_tt.c                      |   1 +
>>>>>>     drivers/gpu/drm/v3d/v3d_sched.c                   |  32 ++---
>>>>>>     include/drm/gpu_scheduler.h                       |  17 ++-
>>>>>>     include/drm/ttm/ttm_bo_api.h                      |   2 +
>>>>>>     45 files changed, 583 insertions(+), 198 deletions(-)
>>>>>>
>>>>>> --
>>>>>> 2.7.4
>>>>>>
>>>
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 196+ messages in thread

* Re: [PATCH v4 00/14] RFC Support hot device unplug in amdgpu
@ 2021-01-20 14:19             ` Andrey Grodzovsky
  0 siblings, 0 replies; 196+ messages in thread
From: Andrey Grodzovsky @ 2021-01-20 14:19 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Rob Herring, amd-gfx list, Christian König, dri-devel,
	Anholt, Eric, Pekka Paalanen, Qiang Yu, Greg KH, Alex Deucher,
	Wentland, Harry, Lucas Stach


On 1/20/21 4:05 AM, Daniel Vetter wrote:
> On Tue, Jan 19, 2021 at 01:18:15PM -0500, Andrey Grodzovsky wrote:
>> On 1/19/21 1:08 PM, Daniel Vetter wrote:
>>> On Tue, Jan 19, 2021 at 6:31 PM Andrey Grodzovsky
>>> <Andrey.Grodzovsky@amd.com> wrote:
>>>> On 1/19/21 9:16 AM, Daniel Vetter wrote:
>>>>> On Mon, Jan 18, 2021 at 04:01:09PM -0500, Andrey Grodzovsky wrote:
>>>>>> Until now extracting a card either by physical extraction (e.g. eGPU with
>>>>>> thunderbolt connection or by emulation through  syfs -> /sys/bus/pci/devices/device_id/remove)
>>>>>> would cause random crashes in user apps. The random crashes in apps were
>>>>>> mostly due to the app having mapped a device backed BO into its address
>>>>>> space was still trying to access the BO while the backing device was gone.
>>>>>> To answer this first problem Christian suggested to fix the handling of mapped
>>>>>> memory in the clients when the device goes away by forcibly unmap all buffers the
>>>>>> user processes has by clearing their respective VMAs mapping the device BOs.
>>>>>> Then when the VMAs try to fill in the page tables again we check in the fault
>>>>>> handlerif the device is removed and if so, return an error. This will generate a
>>>>>> SIGBUS to the application which can then cleanly terminate.This indeed was done
>>>>>> but this in turn created a problem of kernel OOPs were the OOPSes were due to the
>>>>>> fact that while the app was terminating because of the SIGBUSit would trigger use
>>>>>> after free in the driver by calling to accesses device structures that were already
>>>>>> released from the pci remove sequence.This was handled by introducing a 'flush'
>>>>>> sequence during device removal were we wait for drm file reference to drop to 0
>>>>>> meaning all user clients directly using this device terminated.
>>>>>>
>>>>>> v2:
>>>>>> Based on discussions in the mailing list with Daniel and Pekka [1] and based on the document
>>>>>> produced by Pekka from those discussions [2] the whole approach with returning SIGBUS and
>>>>>> waiting for all user clients having CPU mapping of device BOs to die was dropped.
>>>>>> Instead as per the document suggestion the device structures are kept alive until
>>>>>> the last reference to the device is dropped by user client and in the meanwhile all existing and new CPU mappings of the BOs
>>>>>> belonging to the device directly or by dma-buf import are rerouted to per user
>>>>>> process dummy rw page.Also, I skipped the 'Requirements for KMS UAPI' section of [2]
>>>>>> since i am trying to get the minimal set of requirements that still give useful solution
>>>>>> to work and this is the'Requirements for Render and Cross-Device UAPI' section and so my
>>>>>> test case is removing a secondary device, which is render only and is not involved
>>>>>> in KMS.
>>>>>>
>>>>>> v3:
>>>>>> More updates following comments from v2 such as removing loop to find DRM file when rerouting
>>>>>> page faults to dummy page,getting rid of unnecessary sysfs handling refactoring and moving
>>>>>> prevention of GPU recovery post device unplug from amdgpu to scheduler layer.
>>>>>> On top of that added unplug support for the IOMMU enabled system.
>>>>>>
>>>>>> v4:
>>>>>> Drop last sysfs hack and use sysfs default attribute.
>>>>>> Guard against write accesses after device removal to avoid modifying released memory.
>>>>>> Update dummy pages handling to on demand allocation and release through drm managed framework.
>>>>>> Add return value to scheduler job TO handler (by Luben Tuikov) and use this in amdgpu for prevention
>>>>>> of GPU recovery post device unplug
>>>>>> Also rebase on top of drm-misc-mext instead of amd-staging-drm-next
>>>>>>
>>>>>> With these patches I am able to gracefully remove the secondary card using sysfs remove hook while glxgears
>>>>>> is running off of secondary card (DRI_PRIME=1) without kernel oopses or hangs and keep working
>>>>>> with the primary card or soft reset the device without hangs or oopses
>>>>>>
>>>>>> TODOs for followup work:
>>>>>> Convert AMDGPU code to use devm (for hw stuff) and drmm (for sw stuff and allocations) (Daniel)
>>>>>> Support plugging the secondary device back after unplug - currently still experiencing HW error on plugging back.
>>>>>> Add support for 'Requirements for KMS UAPI' section of [2] - unplugging primary, display connected card.
>>>>>>
>>>>>> [1] - Discussions during v3 of the patchset https://www.spinics.net/lists/amd-gfx/msg55576.html
>>>>>> [2] - drm/doc: device hot-unplug for userspace https://www.spinics.net/lists/dri-devel/msg259755.html
>>>>>> [3] - Related gitlab ticket https://gitlab.freedesktop.org/drm/amd/-/issues/1081
>>>>> btw have you tried this out with some of the igts we have? core_hotunplug
>>>>> is the one I'm thinking of. Might be worth extending this with amdgpu
>>>>> specific stuff (like running some batches on it while hotunplugging).
>>>> No, I mostly just ran glxgears while testing, which already covers the
>>>> exported/imported dma-buf case, plus a few manually hacked tests in the
>>>> libdrm amdgpu test suite.
>>>>
>>>>
>>>>> Since there are so many corner cases we need to test here (shared dma-buf,
>>>>> shared dma_fence) I think it would make sense to have a shared testcase
>>>>> across drivers.
>>>> I'm not too familiar with IGT; is there an easy way to set up shared dma-buf
>>>> and fence use cases there, or do you mean I need to add them now?
>>> We do have test infrastructure for all of that, but the hotunplug test
>>> doesn't have that yet I think.
>>>
>>>>> The only specific thing would be some hooks to keep the gpu
>>>>> busy in some fashion while we yank the driver.
>>>> Do you mean like starting X and some active rendering on top (like glxgears)
>>>> automatically from within IGT?
>>> Nope, igt is meant for bare-metal testing so you don't have to drag
>>> the entire winsys around (which, in a wayland world, is not really good
>>> for driver testing anyway, since everything is different). We use this
>>> for our pre-merge CI for drm/i915.
>>
>> So I keep it busy via X/glxgears, which is a manual operation. What you suggest
>> then is some client within IGT which opens the device and starts submitting jobs
>> (much like what the libdrm amdgpu tests already do)? And this
>> part is the amdgpu-specific code I just need to port from libdrm to here?
> Yup. For i915 tests we have an entire library already for small workloads,
> including some that just spin forever (useful for reset testing and could
> also come in handy for unload testing).
> -Daniel


Does it mean I would have to drag in the entire infrastructure code from
the libdrm amdgpu code that allows for command submission through
our IOCTLs?

Andrey

>
>> Andrey
>>
>>
>>>>> But just to get it started
>>>>> you can throw in entirely amdgpu specific subtests and just share some of
>>>>> the test code.
>>>>> -Daniel
>>>> In general, I wasn't aware of this test suite, and it looks like it does what
>>>> I test, among other stuff.
>>>> I will definitely try to run with it, although the rescan part will not work
>>>> since plugging the device back is on my TODO list and not part of the scope of
>>>> this patchset, so I will probably comment the re-scan section out while testing.
>>> amd gem has been using libdrm-amd thus far iirc, but for things like
>>> this I think it'd be worth at least considering switching. The display
>>> team has already started to use some of the tests and contribute stuff
>>> (I think the VRR testcase is from amd).
>>> -Daniel
>>>
>>>> Andrey
>>>>
>>>>
>>>>>> Andrey Grodzovsky (13):
>>>>>>      drm/ttm: Remap all page faults to per process dummy page.
>>>>>>      drm: Unmap the entire device address space on device unplug
>>>>>>      drm/ttm: Expose ttm_tt_unpopulate for driver use
>>>>>>      drm/sched: Cancel and flush all outstanding jobs before finish.
>>>>>>      drm/amdgpu: Split amdgpu_device_fini into early and late
>>>>>>      drm/amdgpu: Add early fini callback
>>>>>>      drm/amdgpu: Register IOMMU topology notifier per device.
>>>>>>      drm/amdgpu: Fix a bunch of sdma code crash post device unplug
>>>>>>      drm/amdgpu: Remap all page faults to per process dummy page.
>>>>>>      drm/amdgpu: Move some sysfs attrs creation to default_attr
>>>>>>      drm/amdgpu: Guard against write accesses after device removal
>>>>>>      drm/sched: Make timeout timer rearm conditional.
>>>>>>      drm/amdgpu: Prevent any job recoveries after device is unplugged.
>>>>>>
>>>>>> Luben Tuikov (1):
>>>>>>      drm/scheduler: Job timeout handler returns status
>>>>>>
>>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu.h               |  11 +-
>>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c      |  17 +--
>>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_device.c        | 149 ++++++++++++++++++++--
>>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c           |  20 ++-
>>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c         |  15 ++-
>>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c          |   2 +-
>>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h          |   1 +
>>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c           |   9 ++
>>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c       |  25 ++--
>>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c           |  26 ++--
>>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_irq.h           |   3 +-
>>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_job.c           |  19 ++-
>>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c           |  12 +-
>>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_object.c        |  10 ++
>>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_object.h        |   2 +
>>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c           |  53 +++++---
>>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h           |   3 +
>>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c           |   1 +
>>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c          |  70 ++++++++++
>>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h          |  52 +-------
>>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c           |  21 ++-
>>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c            |   8 +-
>>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c      |  14 +-
>>>>>>     drivers/gpu/drm/amd/amdgpu/cik_ih.c               |   2 +-
>>>>>>     drivers/gpu/drm/amd/amdgpu/cz_ih.c                |   2 +-
>>>>>>     drivers/gpu/drm/amd/amdgpu/iceland_ih.c           |   2 +-
>>>>>>     drivers/gpu/drm/amd/amdgpu/navi10_ih.c            |   2 +-
>>>>>>     drivers/gpu/drm/amd/amdgpu/psp_v11_0.c            |  16 +--
>>>>>>     drivers/gpu/drm/amd/amdgpu/psp_v12_0.c            |   8 +-
>>>>>>     drivers/gpu/drm/amd/amdgpu/psp_v3_1.c             |   8 +-
>>>>>>     drivers/gpu/drm/amd/amdgpu/si_ih.c                |   2 +-
>>>>>>     drivers/gpu/drm/amd/amdgpu/tonga_ih.c             |   2 +-
>>>>>>     drivers/gpu/drm/amd/amdgpu/vega10_ih.c            |   2 +-
>>>>>>     drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c |  12 +-
>>>>>>     drivers/gpu/drm/amd/include/amd_shared.h          |   2 +
>>>>>>     drivers/gpu/drm/drm_drv.c                         |   3 +
>>>>>>     drivers/gpu/drm/etnaviv/etnaviv_sched.c           |  10 +-
>>>>>>     drivers/gpu/drm/lima/lima_sched.c                 |   4 +-
>>>>>>     drivers/gpu/drm/panfrost/panfrost_job.c           |   9 +-
>>>>>>     drivers/gpu/drm/scheduler/sched_main.c            |  18 ++-
>>>>>>     drivers/gpu/drm/ttm/ttm_bo_vm.c                   |  82 +++++++++++-
>>>>>>     drivers/gpu/drm/ttm/ttm_tt.c                      |   1 +
>>>>>>     drivers/gpu/drm/v3d/v3d_sched.c                   |  32 ++---
>>>>>>     include/drm/gpu_scheduler.h                       |  17 ++-
>>>>>>     include/drm/ttm/ttm_bo_api.h                      |   2 +
>>>>>>     45 files changed, 583 insertions(+), 198 deletions(-)
>>>>>>
>>>>>> --
>>>>>> 2.7.4
>>>>>>
>>>
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 196+ messages in thread

* Re: [PATCH v4 00/14] RFC Support hot device unplug in amdgpu
  2021-01-20 14:19             ` Andrey Grodzovsky
@ 2021-01-20 15:59               ` Daniel Vetter
  -1 siblings, 0 replies; 196+ messages in thread
From: Daniel Vetter @ 2021-01-20 15:59 UTC (permalink / raw)
  To: Andrey Grodzovsky
  Cc: amd-gfx list, Christian König, dri-devel, Qiang Yu, Greg KH,
	Alex Deucher

On Wed, Jan 20, 2021 at 3:20 PM Andrey Grodzovsky
<Andrey.Grodzovsky@amd.com> wrote:
>
>
> On 1/20/21 4:05 AM, Daniel Vetter wrote:
> > On Tue, Jan 19, 2021 at 01:18:15PM -0500, Andrey Grodzovsky wrote:
> >> On 1/19/21 1:08 PM, Daniel Vetter wrote:
> >>> On Tue, Jan 19, 2021 at 6:31 PM Andrey Grodzovsky
> >>> <Andrey.Grodzovsky@amd.com> wrote:
> >>>> On 1/19/21 9:16 AM, Daniel Vetter wrote:
> >>>>> On Mon, Jan 18, 2021 at 04:01:09PM -0500, Andrey Grodzovsky wrote:
> >>>>>> Until now extracting a card either by physical extraction (e.g. eGPU with
> >>>>>> thunderbolt connection or by emulation through  syfs -> /sys/bus/pci/devices/device_id/remove)
> >>>>>> would cause random crashes in user apps. The random crashes in apps were
> >>>>>> mostly due to the app having mapped a device backed BO into its address
> >>>>>> space was still trying to access the BO while the backing device was gone.
> >>>>>> To answer this first problem Christian suggested to fix the handling of mapped
> >>>>>> memory in the clients when the device goes away by forcibly unmap all buffers the
> >>>>>> user processes has by clearing their respective VMAs mapping the device BOs.
> >>>>>> Then when the VMAs try to fill in the page tables again we check in the fault
> >>>>>> handlerif the device is removed and if so, return an error. This will generate a
> >>>>>> SIGBUS to the application which can then cleanly terminate.This indeed was done
> >>>>>> but this in turn created a problem of kernel OOPs were the OOPSes were due to the
> >>>>>> fact that while the app was terminating because of the SIGBUSit would trigger use
> >>>>>> after free in the driver by calling to accesses device structures that were already
> >>>>>> released from the pci remove sequence.This was handled by introducing a 'flush'
> >>>>>> sequence during device removal were we wait for drm file reference to drop to 0
> >>>>>> meaning all user clients directly using this device terminated.
> >>>>>>
> >>>>>> v2:
> >>>>>> Based on discussions in the mailing list with Daniel and Pekka [1] and based on the document
> >>>>>> produced by Pekka from those discussions [2] the whole approach with returning SIGBUS and
> >>>>>> waiting for all user clients having CPU mapping of device BOs to die was dropped.
> >>>>>> Instead as per the document suggestion the device structures are kept alive until
> >>>>>> the last reference to the device is dropped by user client and in the meanwhile all existing and new CPU mappings of the BOs
> >>>>>> belonging to the device directly or by dma-buf import are rerouted to per user
> >>>>>> process dummy rw page.Also, I skipped the 'Requirements for KMS UAPI' section of [2]
> >>>>>> since i am trying to get the minimal set of requirements that still give useful solution
> >>>>>> to work and this is the'Requirements for Render and Cross-Device UAPI' section and so my
> >>>>>> test case is removing a secondary device, which is render only and is not involved
> >>>>>> in KMS.
> >>>>>>
> >>>>>> v3:
> >>>>>> More updates following comments from v2 such as removing loop to find DRM file when rerouting
> >>>>>> page faults to dummy page,getting rid of unnecessary sysfs handling refactoring and moving
> >>>>>> prevention of GPU recovery post device unplug from amdgpu to scheduler layer.
> >>>>>> On top of that added unplug support for the IOMMU enabled system.
> >>>>>>
> >>>>>> v4:
> >>>>>> Drop last sysfs hack and use sysfs default attribute.
> >>>>>> Guard against write accesses after device removal to avoid modifying released memory.
> >>>>>> Update dummy pages handling to on demand allocation and release through drm managed framework.
> >>>>>> Add return value to scheduler job TO handler (by Luben Tuikov) and use this in amdgpu for prevention
> >>>>>> of GPU recovery post device unplug
> >>>>>> Also rebase on top of drm-misc-mext instead of amd-staging-drm-next
> >>>>>>
> >>>>>> With these patches I am able to gracefully remove the secondary card using sysfs remove hook while glxgears
> >>>>>> is running off of secondary card (DRI_PRIME=1) without kernel oopses or hangs and keep working
> >>>>>> with the primary card or soft reset the device without hangs or oopses
> >>>>>>
> >>>>>> TODOs for followup work:
> >>>>>> Convert AMDGPU code to use devm (for hw stuff) and drmm (for sw stuff and allocations) (Daniel)
> >>>>>> Support plugging the secondary device back after unplug - currently still experiencing HW error on plugging back.
> >>>>>> Add support for 'Requirements for KMS UAPI' section of [2] - unplugging primary, display connected card.
> >>>>>>
> >>>>>> [1] - Discussions during v3 of the patchset https://www.spinics.net/lists/amd-gfx/msg55576.html
> >>>>>> [2] - drm/doc: device hot-unplug for userspace https://www.spinics.net/lists/dri-devel/msg259755.html
> >>>>>> [3] - Related gitlab ticket https://gitlab.freedesktop.org/drm/amd/-/issues/1081
> >>>>> btw have you tried this out with some of the igts we have? core_hotunplug
> >>>>> is the one I'm thinking of. Might be worth extending this with amdgpu
> >>>>> specific stuff (like running some batches on it while hotunplugging).
> >>>> No, I mostly just ran glxgears while testing, which already covers the
> >>>> exported/imported dma-buf case, plus a few manually hacked tests in the
> >>>> libdrm amdgpu test suite.
> >>>>
> >>>>
> >>>>> Since there are so many corner cases we need to test here (shared dma-buf,
> >>>>> shared dma_fence) I think it would make sense to have a shared testcase
> >>>>> across drivers.
> >>>> I'm not too familiar with IGT; is there an easy way to set up shared dma-buf
> >>>> and fence use cases there, or do you mean I need to add them now?
> >>> We do have test infrastructure for all of that, but the hotunplug test
> >>> doesn't have that yet I think.
> >>>
> >>>>> The only specific thing would be some hooks to keep the gpu
> >>>>> busy in some fashion while we yank the driver.
> >>>> Do you mean like starting X and some active rendering on top (like glxgears)
> >>>> automatically from within IGT?
> >>> Nope, igt is meant for bare-metal testing so you don't have to drag
> >>> the entire winsys around (which, in a wayland world, is not really good
> >>> for driver testing anyway, since everything is different). We use this
> >>> for our pre-merge CI for drm/i915.
> >>
> >> So I keep it busy via X/glxgears, which is a manual operation. What you suggest
> >> then is some client within IGT which opens the device and starts submitting jobs
> >> (much like what the libdrm amdgpu tests already do)? And this
> >> part is the amdgpu-specific code I just need to port from libdrm to here?
> > Yup. For i915 tests we have an entire library already for small workloads,
> > including some that just spin forever (useful for reset testing and could
> > also come in handy for unload testing).
> > -Daniel
>
>
> Does it mean I would have to drag in the entire infrastructure code from
> the libdrm amdgpu code that allows for command submission through
> our IOCTLs?

No, it's perfectly fine to use libdrm in igt tests, we do that too. I
just mean we have some additional helpers to submit specific workloads
for intel gpus, like rendercopy to move data with the 3d engine (just
using the copy engines isn't always good enough for testing), or
the special hanging batchbuffers we use for reset testing, or in
general for having precise control over race conditions and things
like that.

One thing that was somewhat annoying for i915 but shouldn't be a
problem for amdgpu is that igt also builds on non-intel platforms. So
we have stub functions for libdrm-intel, since libdrm-intel doesn't
build on arm. Shouldn't be a problem for you.
-Daniel
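For a concrete starting point, a minimal IGT-style subtest along these lines
could look roughly like the sketch below. This is only an illustration: it
assumes the existing libdrm amdgpu entry points (amdgpu_device_initialize()
and friends) plus IGT's igt_* helpers, and the subtest name and the busy-work
placeholder are made up.

    #include <unistd.h>
    #include <amdgpu.h>
    #include "igt.h"

    igt_main
    {
            amdgpu_device_handle dev;
            uint32_t major, minor;
            int fd = -1;

            igt_fixture {
                    /* open the render device under test */
                    fd = drm_open_driver(DRIVER_AMDGPU);
                    igt_assert_eq(amdgpu_device_initialize(fd, &major,
                                                           &minor, &dev), 0);
            }

            igt_subtest("unplug-while-busy") {
                    /* submit a long-running job here (e.g. via
                     * amdgpu_cs_submit()), then unbind the device through
                     * the sysfs remove hook while it is still executing
                     * and check that nothing oopses */
            }

            igt_fixture {
                    amdgpu_device_deinitialize(dev);
                    close(fd);
            }
    }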


> Andrey
>
> >
> >> Andrey
> >>
> >>
> >>>>> But just to get it started
> >>>>> you can throw in entirely amdgpu specific subtests and just share some of
> >>>>> the test code.
> >>>>> -Daniel
> >>>> In general, I wasn't aware of this test suite, and it looks like it does
> >>>> what I test, among other stuff.
> >>>> I will definitely try to run with it, although the rescan part will not work
> >>>> since plugging the device back is on my TODO list and not part of the scope of
> >>>> this patchset, so I will probably comment the re-scan section out while testing.
> >>> amd gem has been using libdrm-amd thus far iirc, but for things like
> >>> this I think it'd be worth at least considering switching. The display
> >>> team has already started to use some of the tests and contribute stuff
> >>> (I think the VRR testcase is from amd).
> >>> -Daniel
> >>>
> >>>> Andrey
> >>>>
> >>>>
> >>>>>> Andrey Grodzovsky (13):
> >>>>>>      drm/ttm: Remap all page faults to per process dummy page.
> >>>>>>      drm: Unmap the entire device address space on device unplug
> >>>>>>      drm/ttm: Expose ttm_tt_unpopulate for driver use
> >>>>>>      drm/sched: Cancel and flush all outstanding jobs before finish.
> >>>>>>      drm/amdgpu: Split amdgpu_device_fini into early and late
> >>>>>>      drm/amdgpu: Add early fini callback
> >>>>>>      drm/amdgpu: Register IOMMU topology notifier per device.
> >>>>>>      drm/amdgpu: Fix a bunch of sdma code crash post device unplug
> >>>>>>      drm/amdgpu: Remap all page faults to per process dummy page.
> >>>>>>      drm/amdgpu: Move some sysfs attrs creation to default_attr
> >>>>>>      drm/amdgpu: Guard against write accesses after device removal
> >>>>>>      drm/sched: Make timeout timer rearm conditional.
> >>>>>>      drm/amdgpu: Prevent any job recoveries after device is unplugged.
> >>>>>>
> >>>>>> Luben Tuikov (1):
> >>>>>>      drm/scheduler: Job timeout handler returns status
> >>>>>>
> >>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu.h               |  11 +-
> >>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c      |  17 +--
> >>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_device.c        | 149 ++++++++++++++++++++--
> >>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c           |  20 ++-
> >>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c         |  15 ++-
> >>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c          |   2 +-
> >>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h          |   1 +
> >>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c           |   9 ++
> >>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c       |  25 ++--
> >>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c           |  26 ++--
> >>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_irq.h           |   3 +-
> >>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_job.c           |  19 ++-
> >>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c           |  12 +-
> >>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_object.c        |  10 ++
> >>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_object.h        |   2 +
> >>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c           |  53 +++++---
> >>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h           |   3 +
> >>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c           |   1 +
> >>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c          |  70 ++++++++++
> >>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h          |  52 +-------
> >>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c           |  21 ++-
> >>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c            |   8 +-
> >>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c      |  14 +-
> >>>>>>     drivers/gpu/drm/amd/amdgpu/cik_ih.c               |   2 +-
> >>>>>>     drivers/gpu/drm/amd/amdgpu/cz_ih.c                |   2 +-
> >>>>>>     drivers/gpu/drm/amd/amdgpu/iceland_ih.c           |   2 +-
> >>>>>>     drivers/gpu/drm/amd/amdgpu/navi10_ih.c            |   2 +-
> >>>>>>     drivers/gpu/drm/amd/amdgpu/psp_v11_0.c            |  16 +--
> >>>>>>     drivers/gpu/drm/amd/amdgpu/psp_v12_0.c            |   8 +-
> >>>>>>     drivers/gpu/drm/amd/amdgpu/psp_v3_1.c             |   8 +-
> >>>>>>     drivers/gpu/drm/amd/amdgpu/si_ih.c                |   2 +-
> >>>>>>     drivers/gpu/drm/amd/amdgpu/tonga_ih.c             |   2 +-
> >>>>>>     drivers/gpu/drm/amd/amdgpu/vega10_ih.c            |   2 +-
> >>>>>>     drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c |  12 +-
> >>>>>>     drivers/gpu/drm/amd/include/amd_shared.h          |   2 +
> >>>>>>     drivers/gpu/drm/drm_drv.c                         |   3 +
> >>>>>>     drivers/gpu/drm/etnaviv/etnaviv_sched.c           |  10 +-
> >>>>>>     drivers/gpu/drm/lima/lima_sched.c                 |   4 +-
> >>>>>>     drivers/gpu/drm/panfrost/panfrost_job.c           |   9 +-
> >>>>>>     drivers/gpu/drm/scheduler/sched_main.c            |  18 ++-
> >>>>>>     drivers/gpu/drm/ttm/ttm_bo_vm.c                   |  82 +++++++++++-
> >>>>>>     drivers/gpu/drm/ttm/ttm_tt.c                      |   1 +
> >>>>>>     drivers/gpu/drm/v3d/v3d_sched.c                   |  32 ++---
> >>>>>>     include/drm/gpu_scheduler.h                       |  17 ++-
> >>>>>>     include/drm/ttm/ttm_bo_api.h                      |   2 +
> >>>>>>     45 files changed, 583 insertions(+), 198 deletions(-)
> >>>>>>
> >>>>>> --
> >>>>>> 2.7.4
> >>>>>>
> >>>



-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 196+ messages in thread

* Re: [PATCH v4 11/14] drm/amdgpu: Guard against write accesses after device removal
  2021-01-19 19:16               ` Andrey Grodzovsky
@ 2021-01-20 19:34                 ` Andrey Grodzovsky
  -1 siblings, 0 replies; 196+ messages in thread
From: Andrey Grodzovsky @ 2021-01-20 19:34 UTC (permalink / raw)
  To: christian.koenig, Daniel Vetter
  Cc: Greg KH, dri-devel, amd-gfx list, Alex Deucher, Qiang Yu


On 1/19/21 2:16 PM, Andrey Grodzovsky wrote:
>
> On 1/19/21 1:59 PM, Christian König wrote:
>> Am 19.01.21 um 19:22 schrieb Andrey Grodzovsky:
>>>
>>> On 1/19/21 1:05 PM, Daniel Vetter wrote:
>>>> On Tue, Jan 19, 2021 at 4:35 PM Andrey Grodzovsky
>>>> <Andrey.Grodzovsky@amd.com> wrote:
>>>>> There is really no other way according to this article
>>>>> https://lwn.net/Articles/767885/
>>>>>
>>>>>
>>>>> "A perfect solution seems nearly impossible though; we cannot acquire a 
>>>>> mutex on
>>>>> the user
>>>>> to prevent them from yanking a device and we cannot check for a presence 
>>>>> change
>>>>> after every
>>>>> device access for performance reasons. "
>>>>>
>>>>> But I assumed srcu_read_lock should be pretty seamless performance-wise, no?
>>>> The read side is supposed to be dirt cheap, the write side is where we
>>>> just stall for all readers to eventually complete on their own.
>>>> Definitely should be much cheaper than mmio read, on the mmio write
>>>> side it might actually hurt a bit. Otoh I think those don't stall the
>>>> cpu by default when they're timing out, so maybe if the overhead is
>>>> too much for those, we could omit them?
>>>>
>>>> Maybe just do a small microbenchmark for these for testing, with a
>>>> register that doesn't change hw state. So with and without
>>>> drm_dev_enter/exit, and also one with the hw plugged out so that we
>>>> have actual timeouts in the transactions.
>>>> -Daniel
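(For reference: the write side these guards synchronize against is
drm_dev_unplug(), which marks the device unplugged and then waits, via SRCU,
for all outstanding drm_dev_enter()/drm_dev_exit() sections to drain. A rough
sketch of how the two sides pair up; the variable names are placeholders, not
taken from the patch:)

    /* read side, once per MMIO access: a cheap SRCU read lock */
    if (!drm_dev_enter(ddev, &idx))
            return;         /* device already gone, skip the access */
    writel(v, mmio + (reg * 4));
    drm_dev_exit(idx);

    /* write side, runs once from the PCI remove callback */
    drm_dev_unplug(ddev);   /* sets the unplugged flag, then synchronize_srcu() */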
>>>
>>>
>>> So, say, writing in a loop to some harmless scratch register many times,
>>> both for the plugged and unplugged case, and measuring the total time delta?
>>
>> I think we should at least measure the following:
>>
>> 1. Writing X times to a scratch reg without your patch.
>> 2. Writing X times to a scratch reg with your patch.
>> 3. Writing X times to a scratch reg with the hardware physically disconnected.


Just realized, I can't test this part since I don't have an eGPU to yank out.

Andrey


>>
>> I suggest to repeat that once for Polaris (or older) and once for Vega or Navi.
>>
>> The SRBM on Polaris is meant to introduce some delay in each access, so it
>> might react differently than the newer hardware.
>>
>> Christian.
>
>
> Will do.
>
> Andrey
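The measurement Christian asks for could be as simple as the sketch below (an
illustration only, assuming a scratch register offset that is safe to hammer
and amdgpu's standard WREG32()/ktime_* helpers):

    static u64 time_scratch_writes(struct amdgpu_device *adev, u32 reg, u32 n)
    {
            ktime_t start = ktime_get();
            u32 i;

            /* each write goes through amdgpu_device_wreg(), i.e. through
             * the drm_dev_enter()/drm_dev_exit() guard being measured */
            for (i = 0; i < n; i++)
                    WREG32(reg, i);

            return ktime_to_ns(ktime_sub(ktime_get(), start));
    }

Running this once without the patch, once with it, and (where the hardware
allows) once after yanking the device gives the three numbers to compare.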
>
>
>>
>>>
>>> Andrey
>>>
>>>
>>>>
>>>>> The other solution would be, as I suggested, to keep all the device IO
>>>>> ranges reserved and the system memory pages unfreed until the device is
>>>>> finalized in the driver, but Daniel said this would upset the PCI layer
>>>>> (the MMIO ranges reservation part).
>>>>>
>>>>> Andrey
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On 1/19/21 3:55 AM, Christian König wrote:
>>>>>> Am 18.01.21 um 22:01 schrieb Andrey Grodzovsky:
>>>>>>> This should prevent writing to memory or IO ranges possibly
>>>>>>> already allocated for other uses after our device is removed.
>>>>>> Wow, that adds quite some overhead to every register access. I'm not sure we
>>>>>> can do this.
>>>>>>
>>>>>> Christian.
>>>>>>
>>>>>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>>>>>>> ---
>>>>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 57 ++++++++++++++++++++++++
>>>>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c    |  9 ++++
>>>>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c    | 53 +++++++++++++---------
>>>>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h    |  3 ++
>>>>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c   | 70 ++++++++++++++++++++++++++++++
>>>>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h   | 49 ++-------------------
>>>>>>>    drivers/gpu/drm/amd/amdgpu/psp_v11_0.c     | 16 ++-----
>>>>>>>    drivers/gpu/drm/amd/amdgpu/psp_v12_0.c     |  8 +---
>>>>>>>    drivers/gpu/drm/amd/amdgpu/psp_v3_1.c      |  8 +---
>>>>>>>    9 files changed, 184 insertions(+), 89 deletions(-)
>>>>>>>
>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>>>> index e99f4f1..0a9d73c 100644
>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>>>> @@ -72,6 +72,8 @@
>>>>>>>      #include <linux/iommu.h>
>>>>>>>    +#include <drm/drm_drv.h>
>>>>>>> +
>>>>>>>    MODULE_FIRMWARE("amdgpu/vega10_gpu_info.bin");
>>>>>>>    MODULE_FIRMWARE("amdgpu/vega12_gpu_info.bin");
>>>>>>>    MODULE_FIRMWARE("amdgpu/raven_gpu_info.bin");
>>>>>>> @@ -404,13 +406,21 @@ uint8_t amdgpu_mm_rreg8(struct amdgpu_device *adev, uint32_t offset)
>>>>>>>     */
>>>>>>>    void amdgpu_mm_wreg8(struct amdgpu_device *adev, uint32_t offset, uint8_t value)
>>>>>>>    {
>>>>>>> +    int idx;
>>>>>>> +
>>>>>>>        if (adev->in_pci_err_recovery)
>>>>>>>            return;
>>>>>>>    +
>>>>>>> +    if (!drm_dev_enter(&adev->ddev, &idx))
>>>>>>> +        return;
>>>>>>> +
>>>>>>>        if (offset < adev->rmmio_size)
>>>>>>>            writeb(value, adev->rmmio + offset);
>>>>>>>        else
>>>>>>>            BUG();
>>>>>>> +
>>>>>>> +    drm_dev_exit(idx);
>>>>>>>    }
>>>>>>>      /**
>>>>>>> @@ -427,9 +437,14 @@ void amdgpu_device_wreg(struct amdgpu_device *adev,
>>>>>>>                uint32_t reg, uint32_t v,
>>>>>>>                uint32_t acc_flags)
>>>>>>>    {
>>>>>>> +    int idx;
>>>>>>> +
>>>>>>>        if (adev->in_pci_err_recovery)
>>>>>>>            return;
>>>>>>>    +    if (!drm_dev_enter(&adev->ddev, &idx))
>>>>>>> +        return;
>>>>>>> +
>>>>>>>        if ((reg * 4) < adev->rmmio_size) {
>>>>>>>            if (!(acc_flags & AMDGPU_REGS_NO_KIQ) &&
>>>>>>>                amdgpu_sriov_runtime(adev) &&
>>>>>>> @@ -444,6 +459,8 @@ void amdgpu_device_wreg(struct amdgpu_device *adev,
>>>>>>>        }
>>>>>>> trace_amdgpu_device_wreg(adev->pdev->device, reg, v);
>>>>>>> +
>>>>>>> +    drm_dev_exit(idx);
>>>>>>>    }
>>>>>>>      /*
>>>>>>> @@ -454,9 +471,14 @@ void amdgpu_device_wreg(struct amdgpu_device *adev,
>>>>>>>    void amdgpu_mm_wreg_mmio_rlc(struct amdgpu_device *adev,
>>>>>>>                     uint32_t reg, uint32_t v)
>>>>>>>    {
>>>>>>> +    int idx;
>>>>>>> +
>>>>>>>        if (adev->in_pci_err_recovery)
>>>>>>>            return;
>>>>>>>    +    if (!drm_dev_enter(&adev->ddev, &idx))
>>>>>>> +        return;
>>>>>>> +
>>>>>>>        if (amdgpu_sriov_fullaccess(adev) &&
>>>>>>>            adev->gfx.rlc.funcs &&
>>>>>>> adev->gfx.rlc.funcs->is_rlcg_access_range) {
>>>>>>> @@ -465,6 +487,8 @@ void amdgpu_mm_wreg_mmio_rlc(struct amdgpu_device *adev,
>>>>>>>        } else {
>>>>>>>            writel(v, ((void __iomem *)adev->rmmio) + (reg * 4));
>>>>>>>        }
>>>>>>> +
>>>>>>> +    drm_dev_exit(idx);
>>>>>>>    }
>>>>>>>      /**
>>>>>>> @@ -499,15 +523,22 @@ u32 amdgpu_io_rreg(struct amdgpu_device *adev, u32 reg)
>>>>>>>     */
>>>>>>>    void amdgpu_io_wreg(struct amdgpu_device *adev, u32 reg, u32 v)
>>>>>>>    {
>>>>>>> +    int idx;
>>>>>>> +
>>>>>>>        if (adev->in_pci_err_recovery)
>>>>>>>            return;
>>>>>>>    +    if (!drm_dev_enter(&adev->ddev, &idx))
>>>>>>> +        return;
>>>>>>> +
>>>>>>>        if ((reg * 4) < adev->rio_mem_size)
>>>>>>>            iowrite32(v, adev->rio_mem + (reg * 4));
>>>>>>>        else {
>>>>>>>            iowrite32((reg * 4), adev->rio_mem + (mmMM_INDEX * 4));
>>>>>>>            iowrite32(v, adev->rio_mem + (mmMM_DATA * 4));
>>>>>>>        }
>>>>>>> +
>>>>>>> +    drm_dev_exit(idx);
>>>>>>>    }
>>>>>>>      /**
>>>>>>> @@ -544,14 +575,21 @@ u32 amdgpu_mm_rdoorbell(struct amdgpu_device *adev, u32 index)
>>>>>>>     */
>>>>>>>    void amdgpu_mm_wdoorbell(struct amdgpu_device *adev, u32 index, u32 v)
>>>>>>>    {
>>>>>>> +    int idx;
>>>>>>> +
>>>>>>>        if (adev->in_pci_err_recovery)
>>>>>>>            return;
>>>>>>>    +    if (!drm_dev_enter(&adev->ddev, &idx))
>>>>>>> +        return;
>>>>>>> +
>>>>>>>        if (index < adev->doorbell.num_doorbells) {
>>>>>>>            writel(v, adev->doorbell.ptr + index);
>>>>>>>        } else {
>>>>>>>            DRM_ERROR("writing beyond doorbell aperture: 0x%08x!\n", index);
>>>>>>>        }
>>>>>>> +
>>>>>>> +    drm_dev_exit(idx);
>>>>>>>    }
>>>>>>>      /**
>>>>>>> @@ -588,14 +626,21 @@ u64 amdgpu_mm_rdoorbell64(struct amdgpu_device *adev, u32 index)
>>>>>>>     */
>>>>>>>    void amdgpu_mm_wdoorbell64(struct amdgpu_device *adev, u32 index, u64 v)
>>>>>>>    {
>>>>>>> +    int idx;
>>>>>>> +
>>>>>>>        if (adev->in_pci_err_recovery)
>>>>>>>            return;
>>>>>>>    +    if (!drm_dev_enter(&adev->ddev, &idx))
>>>>>>> +        return;
>>>>>>> +
>>>>>>>        if (index < adev->doorbell.num_doorbells) {
>>>>>>>            atomic64_set((atomic64_t *)(adev->doorbell.ptr + index), v);
>>>>>>>        } else {
>>>>>>>            DRM_ERROR("writing beyond doorbell aperture: 0x%08x!\n", index);
>>>>>>>        }
>>>>>>> +
>>>>>>> +    drm_dev_exit(idx);
>>>>>>>    }
>>>>>>>      /**
>>>>>>> @@ -682,6 +727,10 @@ void amdgpu_device_indirect_wreg(struct amdgpu_device *adev,
>>>>>>>        unsigned long flags;
>>>>>>>        void __iomem *pcie_index_offset;
>>>>>>>        void __iomem *pcie_data_offset;
>>>>>>> +    int idx;
>>>>>>> +
>>>>>>> +    if (!drm_dev_enter(&adev->ddev, &idx))
>>>>>>> +        return;
>>>>>>> spin_lock_irqsave(&adev->pcie_idx_lock, flags);
>>>>>>>        pcie_index_offset = (void __iomem *)adev->rmmio + pcie_index * 4;
>>>>>>> @@ -692,6 +741,8 @@ void amdgpu_device_indirect_wreg(struct amdgpu_device *adev,
>>>>>>>        writel(reg_data, pcie_data_offset);
>>>>>>>        readl(pcie_data_offset);
>>>>>>> spin_unlock_irqrestore(&adev->pcie_idx_lock, flags);
>>>>>>> +
>>>>>>> +    drm_dev_exit(idx);
>>>>>>>    }
>>>>>>>      /**
>>>>>>> @@ -711,6 +762,10 @@ void amdgpu_device_indirect_wreg64(struct amdgpu_device *adev,
>>>>>>>        unsigned long flags;
>>>>>>>        void __iomem *pcie_index_offset;
>>>>>>>        void __iomem *pcie_data_offset;
>>>>>>> +    int idx;
>>>>>>> +
>>>>>>> +    if (!drm_dev_enter(&adev->ddev, &idx))
>>>>>>> +        return;
>>>>>>> spin_lock_irqsave(&adev->pcie_idx_lock, flags);
>>>>>>>        pcie_index_offset = (void __iomem *)adev->rmmio + pcie_index * 4;
>>>>>>> @@ -727,6 +782,8 @@ void amdgpu_device_indirect_wreg64(struct amdgpu_device *adev,
>>>>>>>        writel((u32)(reg_data >> 32), pcie_data_offset);
>>>>>>>        readl(pcie_data_offset);
>>>>>>> spin_unlock_irqrestore(&adev->pcie_idx_lock, flags);
>>>>>>> +
>>>>>>> +    drm_dev_exit(idx);
>>>>>>>    }
>>>>>>>      /**
>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
>>>>>>> index fe1a39f..1beb4e6 100644
>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
>>>>>>> @@ -31,6 +31,8 @@
>>>>>>>    #include "amdgpu_ras.h"
>>>>>>>    #include "amdgpu_xgmi.h"
>>>>>>>    +#include <drm/drm_drv.h>
>>>>>>> +
>>>>>>>    /**
>>>>>>>     * amdgpu_gmc_get_pde_for_bo - get the PDE for a BO
>>>>>>>     *
>>>>>>> @@ -98,6 +100,10 @@ int amdgpu_gmc_set_pte_pde(struct amdgpu_device *adev, void *cpu_pt_addr,
>>>>>>>    {
>>>>>>>        void __iomem *ptr = (void *)cpu_pt_addr;
>>>>>>>        uint64_t value;
>>>>>>> +    int idx;
>>>>>>> +
>>>>>>> +    if (!drm_dev_enter(&adev->ddev, &idx))
>>>>>>> +        return 0;
>>>>>>>          /*
>>>>>>>         * The following is for PTE only. GART does not have PDEs.
>>>>>>> @@ -105,6 +111,9 @@ int amdgpu_gmc_set_pte_pde(struct amdgpu_device *adev,
>>>>>>> void *cpu_pt_addr,
>>>>>>>        value = addr & 0x0000FFFFFFFFF000ULL;
>>>>>>>        value |= flags;
>>>>>>>        writeq(value, ptr + (gpu_page_idx * 8));
>>>>>>> +
>>>>>>> +    drm_dev_exit(idx);
>>>>>>> +
>>>>>>>        return 0;
>>>>>>>    }
>>>>>>>    diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
>>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
>>>>>>> index 523d22d..89e2bfe 100644
>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
>>>>>>> @@ -37,6 +37,8 @@
>>>>>>>      #include "amdgpu_ras.h"
>>>>>>>    +#include <drm/drm_drv.h>
>>>>>>> +
>>>>>>>    static int psp_sysfs_init(struct amdgpu_device *adev);
>>>>>>>    static void psp_sysfs_fini(struct amdgpu_device *adev);
>>>>>>>    @@ -248,7 +250,7 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>>>>>>               struct psp_gfx_cmd_resp *cmd, uint64_t fence_mc_addr)
>>>>>>>    {
>>>>>>>        int ret;
>>>>>>> -    int index;
>>>>>>> +    int index, idx;
>>>>>>>        int timeout = 2000;
>>>>>>>        bool ras_intr = false;
>>>>>>>        bool skip_unsupport = false;
>>>>>>> @@ -256,6 +258,9 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>>>>>>        if (psp->adev->in_pci_err_recovery)
>>>>>>>            return 0;
>>>>>>>    +    if (!drm_dev_enter(&psp->adev->ddev, &idx))
>>>>>>> +        return 0;
>>>>>>> +
>>>>>>>        mutex_lock(&psp->mutex);
>>>>>>>          memset(psp->cmd_buf_mem, 0, PSP_CMD_BUFFER_SIZE);
>>>>>>> @@ -266,8 +271,7 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>>>>>>        ret = psp_ring_cmd_submit(psp, psp->cmd_buf_mc_addr, fence_mc_addr,
>>>>>>> index);
>>>>>>>        if (ret) {
>>>>>>>            atomic_dec(&psp->fence_value);
>>>>>>> -        mutex_unlock(&psp->mutex);
>>>>>>> -        return ret;
>>>>>>> +        goto exit;
>>>>>>>        }
>>>>>>>          amdgpu_asic_invalidate_hdp(psp->adev, NULL);
>>>>>>> @@ -307,8 +311,8 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>>>>>>                 psp->cmd_buf_mem->cmd_id,
>>>>>>>                 psp->cmd_buf_mem->resp.status);
>>>>>>>            if (!timeout) {
>>>>>>> -            mutex_unlock(&psp->mutex);
>>>>>>> -            return -EINVAL;
>>>>>>> +            ret = -EINVAL;
>>>>>>> +            goto exit;
>>>>>>>            }
>>>>>>>        }
>>>>>>>    @@ -316,8 +320,10 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>>>>>>            ucode->tmr_mc_addr_lo = psp->cmd_buf_mem->resp.fw_addr_lo;
>>>>>>>            ucode->tmr_mc_addr_hi = psp->cmd_buf_mem->resp.fw_addr_hi;
>>>>>>>        }
>>>>>>> -    mutex_unlock(&psp->mutex);
>>>>>>>    +exit:
>>>>>>> +    mutex_unlock(&psp->mutex);
>>>>>>> +    drm_dev_exit(idx);
>>>>>>>        return ret;
>>>>>>>    }
>>>>>>>    @@ -354,8 +360,7 @@ static int psp_load_toc(struct psp_context *psp,
>>>>>>>        if (!cmd)
>>>>>>>            return -ENOMEM;
>>>>>>>        /* Copy toc to psp firmware private buffer */
>>>>>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>>>> -    memcpy(psp->fw_pri_buf, psp->toc_start_addr, psp->toc_bin_size);
>>>>>>> +    psp_copy_fw(psp, psp->toc_start_addr, psp->toc_bin_size);
>>>>>>>      psp_prep_load_toc_cmd_buf(cmd, psp->fw_pri_mc_addr, psp->toc_bin_size);
>>>>>>>    @@ -570,8 +575,7 @@ static int psp_asd_load(struct psp_context *psp)
>>>>>>>        if (!cmd)
>>>>>>>            return -ENOMEM;
>>>>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>>>> -    memcpy(psp->fw_pri_buf, psp->asd_start_addr, psp->asd_ucode_size);
>>>>>>> +    psp_copy_fw(psp, psp->asd_start_addr, psp->asd_ucode_size);
>>>>>>>          psp_prep_asd_load_cmd_buf(cmd, psp->fw_pri_mc_addr,
>>>>>>>                      psp->asd_ucode_size);
>>>>>>> @@ -726,8 +730,7 @@ static int psp_xgmi_load(struct psp_context *psp)
>>>>>>>        if (!cmd)
>>>>>>>            return -ENOMEM;
>>>>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>>>> -    memcpy(psp->fw_pri_buf, psp->ta_xgmi_start_addr, psp->ta_xgmi_ucode_size);
>>>>>>> +    psp_copy_fw(psp, psp->ta_xgmi_start_addr, psp->ta_xgmi_ucode_size);
>>>>>>>          psp_prep_ta_load_cmd_buf(cmd,
>>>>>>>                     psp->fw_pri_mc_addr,
>>>>>>> @@ -982,8 +985,7 @@ static int psp_ras_load(struct psp_context *psp)
>>>>>>>        if (!cmd)
>>>>>>>            return -ENOMEM;
>>>>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>>>> -    memcpy(psp->fw_pri_buf, psp->ta_ras_start_addr, psp->ta_ras_ucode_size);
>>>>>>> +    psp_copy_fw(psp, psp->ta_ras_start_addr, psp->ta_ras_ucode_size);
>>>>>>>          psp_prep_ta_load_cmd_buf(cmd,
>>>>>>>                     psp->fw_pri_mc_addr,
>>>>>>> @@ -1219,8 +1221,7 @@ static int psp_hdcp_load(struct psp_context *psp)
>>>>>>>        if (!cmd)
>>>>>>>            return -ENOMEM;
>>>>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>>>> -    memcpy(psp->fw_pri_buf, psp->ta_hdcp_start_addr,
>>>>>>> +    psp_copy_fw(psp, psp->ta_hdcp_start_addr,
>>>>>>>               psp->ta_hdcp_ucode_size);
>>>>>>>          psp_prep_ta_load_cmd_buf(cmd,
>>>>>>> @@ -1366,8 +1367,7 @@ static int psp_dtm_load(struct psp_context *psp)
>>>>>>>        if (!cmd)
>>>>>>>            return -ENOMEM;
>>>>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>>>> -    memcpy(psp->fw_pri_buf, psp->ta_dtm_start_addr, psp->ta_dtm_ucode_size);
>>>>>>> +    psp_copy_fw(psp, psp->ta_dtm_start_addr, psp->ta_dtm_ucode_size);
>>>>>>>          psp_prep_ta_load_cmd_buf(cmd,
>>>>>>>                     psp->fw_pri_mc_addr,
>>>>>>> @@ -1507,8 +1507,7 @@ static int psp_rap_load(struct psp_context *psp)
>>>>>>>        if (!cmd)
>>>>>>>            return -ENOMEM;
>>>>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>>>> -    memcpy(psp->fw_pri_buf, psp->ta_rap_start_addr, psp->ta_rap_ucode_size);
>>>>>>> +    psp_copy_fw(psp, psp->ta_rap_start_addr, psp->ta_rap_ucode_size);
>>>>>>>          psp_prep_ta_load_cmd_buf(cmd,
>>>>>>>                     psp->fw_pri_mc_addr,
>>>>>>> @@ -2778,6 +2777,20 @@ static ssize_t psp_usbc_pd_fw_sysfs_write(struct
>>>>>>> device *dev,
>>>>>>>        return count;
>>>>>>>    }
>>>>>>>    +void psp_copy_fw(struct psp_context *psp, uint8_t *start_addr, uint32_t bin_size)
>>>>>>> +{
>>>>>>> +    int idx;
>>>>>>> +
>>>>>>> +    if (!drm_dev_enter(&psp->adev->ddev, &idx))
>>>>>>> +        return;
>>>>>>> +
>>>>>>> +    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>>>> +    memcpy(psp->fw_pri_buf, start_addr, bin_size);
>>>>>>> +
>>>>>>> +    drm_dev_exit(idx);
>>>>>>> +}
>>>>>>> +
>>>>>>> +
>>>>>>>    static DEVICE_ATTR(usbc_pd_fw, S_IRUGO | S_IWUSR,
>>>>>>>               psp_usbc_pd_fw_sysfs_read,
>>>>>>>               psp_usbc_pd_fw_sysfs_write);
>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
>>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
>>>>>>> index da250bc..ac69314 100644
>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
>>>>>>> @@ -400,4 +400,7 @@ int psp_init_ta_microcode(struct psp_context *psp,
>>>>>>>                  const char *chip_name);
>>>>>>>    int psp_get_fw_attestation_records_addr(struct psp_context *psp,
>>>>>>>                        uint64_t *output_ptr);
>>>>>>> +
>>>>>>> +void psp_copy_fw(struct psp_context *psp, uint8_t *start_addr, uint32_t bin_size);
>>>>>>> +
>>>>>>>    #endif
>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>>>>>>> index 1a612f5..d656494 100644
>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>>>>>>> @@ -35,6 +35,8 @@
>>>>>>>    #include "amdgpu.h"
>>>>>>>    #include "atom.h"
>>>>>>>    +#include <drm/drm_drv.h>
>>>>>>> +
>>>>>>>    /*
>>>>>>>     * Rings
>>>>>>>     * Most engines on the GPU are fed via ring buffers. Ring
>>>>>>> @@ -463,3 +465,71 @@ int amdgpu_ring_test_helper(struct amdgpu_ring *ring)
>>>>>>>        ring->sched.ready = !r;
>>>>>>>        return r;
>>>>>>>    }
>>>>>>> +
>>>>>>> +void amdgpu_ring_clear_ring(struct amdgpu_ring *ring)
>>>>>>> +{
>>>>>>> +    int idx;
>>>>>>> +    int i = 0;
>>>>>>> +
>>>>>>> +    if (!drm_dev_enter(&ring->adev->ddev, &idx))
>>>>>>> +        return;
>>>>>>> +
>>>>>>> +    while (i <= ring->buf_mask)
>>>>>>> +        ring->ring[i++] = ring->funcs->nop;
>>>>>>> +
>>>>>>> +    drm_dev_exit(idx);
>>>>>>> +
>>>>>>> +}
>>>>>>> +
>>>>>>> +void amdgpu_ring_write(struct amdgpu_ring *ring, uint32_t v)
>>>>>>> +{
>>>>>>> +    int idx;
>>>>>>> +
>>>>>>> +    if (!drm_dev_enter(&ring->adev->ddev, &idx))
>>>>>>> +        return;
>>>>>>> +
>>>>>>> +    if (ring->count_dw <= 0)
>>>>>>> +        DRM_ERROR("amdgpu: writing more dwords to the ring than expected!\n");
>>>>>>> +    ring->ring[ring->wptr++ & ring->buf_mask] = v;
>>>>>>> +    ring->wptr &= ring->ptr_mask;
>>>>>>> +    ring->count_dw--;
>>>>>>> +
>>>>>>> +    drm_dev_exit(idx);
>>>>>>> +}
>>>>>>> +
>>>>>>> +void amdgpu_ring_write_multiple(struct amdgpu_ring *ring,
>>>>>>> +                          void *src, int count_dw)
>>>>>>> +{
>>>>>>> +    unsigned occupied, chunk1, chunk2;
>>>>>>> +    void *dst;
>>>>>>> +    int idx;
>>>>>>> +
>>>>>>> +    if (!drm_dev_enter(&ring->adev->ddev, &idx))
>>>>>>> +        return;
>>>>>>> +
>>>>>>> +    if (unlikely(ring->count_dw < count_dw))
>>>>>>> +        DRM_ERROR("amdgpu: writing more dwords to the ring than expected!\n");
>>>>>>> +
>>>>>>> +    occupied = ring->wptr & ring->buf_mask;
>>>>>>> +    dst = (void *)&ring->ring[occupied];
>>>>>>> +    chunk1 = ring->buf_mask + 1 - occupied;
>>>>>>> +    chunk1 = (chunk1 >= count_dw) ? count_dw: chunk1;
>>>>>>> +    chunk2 = count_dw - chunk1;
>>>>>>> +    chunk1 <<= 2;
>>>>>>> +    chunk2 <<= 2;
>>>>>>> +
>>>>>>> +    if (chunk1)
>>>>>>> +        memcpy(dst, src, chunk1);
>>>>>>> +
>>>>>>> +    if (chunk2) {
>>>>>>> +        src += chunk1;
>>>>>>> +        dst = (void *)ring->ring;
>>>>>>> +        memcpy(dst, src, chunk2);
>>>>>>> +    }
>>>>>>> +
>>>>>>> +    ring->wptr += count_dw;
>>>>>>> +    ring->wptr &= ring->ptr_mask;
>>>>>>> +    ring->count_dw -= count_dw;
>>>>>>> +
>>>>>>> +    drm_dev_exit(idx);
>>>>>>> +}
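
(Worked example of the two-chunk copy above: with buf_mask = 7, i.e. an
8-dword ring, wptr = 6 and count_dw = 4, we get occupied = 6, chunk1 = 2
dwords copied to the tail of the ring and chunk2 = 2 dwords wrapped around
to its start; the two "<<= 2" shifts convert dword counts into the byte
sizes memcpy() expects.)
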
>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>>>>>>> index accb243..f90b81f 100644
>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>>>>>>> @@ -300,53 +300,12 @@ static inline void
>>>>>>> amdgpu_ring_set_preempt_cond_exec(struct amdgpu_ring *ring,
>>>>>>>        *ring->cond_exe_cpu_addr = cond_exec;
>>>>>>>    }
>>>>>>>    -static inline void amdgpu_ring_clear_ring(struct amdgpu_ring *ring)
>>>>>>> -{
>>>>>>> -    int i = 0;
>>>>>>> -    while (i <= ring->buf_mask)
>>>>>>> -        ring->ring[i++] = ring->funcs->nop;
>>>>>>> -
>>>>>>> -}
>>>>>>> -
>>>>>>> -static inline void amdgpu_ring_write(struct amdgpu_ring *ring, uint32_t v)
>>>>>>> -{
>>>>>>> -    if (ring->count_dw <= 0)
>>>>>>> -        DRM_ERROR("amdgpu: writing more dwords to the ring than expected!\n");
>>>>>>> -    ring->ring[ring->wptr++ & ring->buf_mask] = v;
>>>>>>> -    ring->wptr &= ring->ptr_mask;
>>>>>>> -    ring->count_dw--;
>>>>>>> -}
>>>>>>> +void amdgpu_ring_clear_ring(struct amdgpu_ring *ring);
>>>>>>>    -static inline void amdgpu_ring_write_multiple(struct amdgpu_ring *ring,
>>>>>>> -                          void *src, int count_dw)
>>>>>>> -{
>>>>>>> -    unsigned occupied, chunk1, chunk2;
>>>>>>> -    void *dst;
>>>>>>> -
>>>>>>> -    if (unlikely(ring->count_dw < count_dw))
>>>>>>> -        DRM_ERROR("amdgpu: writing more dwords to the ring than expected!\n");
>>>>>>> -
>>>>>>> -    occupied = ring->wptr & ring->buf_mask;
>>>>>>> -    dst = (void *)&ring->ring[occupied];
>>>>>>> -    chunk1 = ring->buf_mask + 1 - occupied;
>>>>>>> -    chunk1 = (chunk1 >= count_dw) ? count_dw: chunk1;
>>>>>>> -    chunk2 = count_dw - chunk1;
>>>>>>> -    chunk1 <<= 2;
>>>>>>> -    chunk2 <<= 2;
>>>>>>> +void amdgpu_ring_write(struct amdgpu_ring *ring, uint32_t v);
>>>>>>>    -    if (chunk1)
>>>>>>> -        memcpy(dst, src, chunk1);
>>>>>>> -
>>>>>>> -    if (chunk2) {
>>>>>>> -        src += chunk1;
>>>>>>> -        dst = (void *)ring->ring;
>>>>>>> -        memcpy(dst, src, chunk2);
>>>>>>> -    }
>>>>>>> -
>>>>>>> -    ring->wptr += count_dw;
>>>>>>> -    ring->wptr &= ring->ptr_mask;
>>>>>>> -    ring->count_dw -= count_dw;
>>>>>>> -}
>>>>>>> +void amdgpu_ring_write_multiple(struct amdgpu_ring *ring,
>>>>>>> +                          void *src, int count_dw);
>>>>>>>      int amdgpu_ring_test_helper(struct amdgpu_ring *ring);
>>>>>>>    diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
>>>>>>> b/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
>>>>>>> index bd4248c..b3ce5be 100644
>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
>>>>>>> @@ -269,10 +269,8 @@ static int psp_v11_0_bootloader_load_kdb(struct
>>>>>>> psp_context *psp)
>>>>>>>        if (ret)
>>>>>>>            return ret;
>>>>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>>>> -
>>>>>>>        /* Copy PSP KDB binary to memory */
>>>>>>> -    memcpy(psp->fw_pri_buf, psp->kdb_start_addr, psp->kdb_bin_size);
>>>>>>> +    psp_copy_fw(psp, psp->kdb_start_addr, psp->kdb_bin_size);
>>>>>>>          /* Provide the PSP KDB to bootloader */
>>>>>>>        WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>>>>> @@ -302,10 +300,8 @@ static int psp_v11_0_bootloader_load_spl(struct
>>>>>>> psp_context *psp)
>>>>>>>        if (ret)
>>>>>>>            return ret;
>>>>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>>>> -
>>>>>>>        /* Copy PSP SPL binary to memory */
>>>>>>> -    memcpy(psp->fw_pri_buf, psp->spl_start_addr, psp->spl_bin_size);
>>>>>>> +    psp_copy_fw(psp, psp->spl_start_addr, psp->spl_bin_size);
>>>>>>>          /* Provide the PSP SPL to bootloader */
>>>>>>>        WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>>>>> @@ -335,10 +331,8 @@ static int psp_v11_0_bootloader_load_sysdrv(struct
>>>>>>> psp_context *psp)
>>>>>>>        if (ret)
>>>>>>>            return ret;
>>>>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>>>> -
>>>>>>>        /* Copy PSP System Driver binary to memory */
>>>>>>> -    memcpy(psp->fw_pri_buf, psp->sys_start_addr, psp->sys_bin_size);
>>>>>>> +    psp_copy_fw(psp, psp->sys_start_addr, psp->sys_bin_size);
>>>>>>>          /* Provide the sys driver to bootloader */
>>>>>>>        WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>>>>> @@ -371,10 +365,8 @@ static int psp_v11_0_bootloader_load_sos(struct
>>>>>>> psp_context *psp)
>>>>>>>        if (ret)
>>>>>>>            return ret;
>>>>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>>>> -
>>>>>>>        /* Copy Secure OS binary to PSP memory */
>>>>>>> -    memcpy(psp->fw_pri_buf, psp->sos_start_addr, psp->sos_bin_size);
>>>>>>> +    psp_copy_fw(psp, psp->sos_start_addr, psp->sos_bin_size);
>>>>>>>          /* Provide the PSP secure OS to bootloader */
>>>>>>>        WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
>>>>>>> b/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
>>>>>>> index c4828bd..618e5b6 100644
>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
>>>>>>> @@ -138,10 +138,8 @@ static int psp_v12_0_bootloader_load_sysdrv(struct
>>>>>>> psp_context *psp)
>>>>>>>        if (ret)
>>>>>>>            return ret;
>>>>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>>>> -
>>>>>>>        /* Copy PSP System Driver binary to memory */
>>>>>>> -    memcpy(psp->fw_pri_buf, psp->sys_start_addr, psp->sys_bin_size);
>>>>>>> +    psp_copy_fw(psp, psp->sys_start_addr, psp->sys_bin_size);
>>>>>>>          /* Provide the sys driver to bootloader */
>>>>>>>        WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>>>>> @@ -179,10 +177,8 @@ static int psp_v12_0_bootloader_load_sos(struct
>>>>>>> psp_context *psp)
>>>>>>>        if (ret)
>>>>>>>            return ret;
>>>>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>>>> -
>>>>>>>        /* Copy Secure OS binary to PSP memory */
>>>>>>> -    memcpy(psp->fw_pri_buf, psp->sos_start_addr, psp->sos_bin_size);
>>>>>>> +    psp_copy_fw(psp, psp->sos_start_addr, psp->sos_bin_size);
>>>>>>>          /* Provide the PSP secure OS to bootloader */
>>>>>>>        WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
>>>>>>> b/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
>>>>>>> index f2e725f..d0a6cccd 100644
>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
>>>>>>> @@ -102,10 +102,8 @@ static int psp_v3_1_bootloader_load_sysdrv(struct
>>>>>>> psp_context *psp)
>>>>>>>        if (ret)
>>>>>>>            return ret;
>>>>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>>>> -
>>>>>>>        /* Copy PSP System Driver binary to memory */
>>>>>>> -    memcpy(psp->fw_pri_buf, psp->sys_start_addr, psp->sys_bin_size);
>>>>>>> +    psp_copy_fw(psp, psp->sys_start_addr, psp->sys_bin_size);
>>>>>>>          /* Provide the sys driver to bootloader */
>>>>>>>        WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>>>>> @@ -143,10 +141,8 @@ static int psp_v3_1_bootloader_load_sos(struct
>>>>>>> psp_context *psp)
>>>>>>>        if (ret)
>>>>>>>            return ret;
>>>>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>>>> -
>>>>>>>        /* Copy Secure OS binary to PSP memory */
>>>>>>> -    memcpy(psp->fw_pri_buf, psp->sos_start_addr, psp->sos_bin_size);
>>>>>>> +    psp_copy_fw(psp, psp->sos_start_addr, psp->sos_bin_size);
>>>>>>>          /* Provide the PSP secure OS to bootloader */
>>>>>>>        WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>>
>>>>
>>> _______________________________________________
>>> amd-gfx mailing list
>>> amd-gfx@lists.freedesktop.org
>>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>>>
>>
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 196+ messages in thread

* Re: [PATCH v4 11/14] drm/amdgpu: Guard against write accesses after device removal
@ 2021-01-20 19:34                 ` Andrey Grodzovsky
  0 siblings, 0 replies; 196+ messages in thread
From: Andrey Grodzovsky @ 2021-01-20 19:34 UTC (permalink / raw)
  To: christian.koenig, Daniel Vetter
  Cc: Rob Herring, Greg KH, dri-devel, Anholt, Eric, Pekka Paalanen,
	amd-gfx list, Alex Deucher, Lucas Stach, Wentland, Harry,
	Qiang Yu


On 1/19/21 2:16 PM, Andrey Grodzovsky wrote:
>
> On 1/19/21 1:59 PM, Christian König wrote:
>> On 1/19/21 7:22 PM, Andrey Grodzovsky wrote:
>>>
>>> On 1/19/21 1:05 PM, Daniel Vetter wrote:
>>>> On Tue, Jan 19, 2021 at 4:35 PM Andrey Grodzovsky
>>>> <Andrey.Grodzovsky@amd.com> wrote:
>>>>> There is really no other way according to this article
>>>>> https://lwn.net/Articles/767885/
>>>>>
>>>>> "A perfect solution seems nearly impossible though; we cannot acquire a
>>>>> mutex on the user to prevent them from yanking a device and we cannot
>>>>> check for a presence change after every device access for performance
>>>>> reasons."
>>>>>
>>>>> But I assumed srcu_read_lock should be pretty seamless performance-wise, no?
>>>> The read side is supposed to be dirt cheap; the write side is where we
>>>> just stall for all readers to eventually complete on their own.
>>>> It should definitely be much cheaper than an mmio read; on the mmio write
>>>> side it might actually hurt a bit. OTOH I think those don't stall the
>>>> cpu by default when they're timing out, so if the overhead is
>>>> too much for those, maybe we could omit them?
>>>>
>>>> Maybe just do a small microbenchmark for these for testing, with a
>>>> register that doesn't change hw state. So with and without
>>>> drm_dev_enter/exit, and also one with the hw plugged out so that we
>>>> have actual timeouts in the transactions.
>>>> -Daniel
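
For reference, drm_dev_enter()/drm_dev_exit() are built on SRCU, so the read
side Daniel describes really is cheap. A simplified sketch of the read-side
check in the DRM core (close to, but not necessarily verbatim,
drivers/gpu/drm/drm_drv.c):

  bool drm_dev_enter(struct drm_device *dev, int *idx)
  {
          /* SRCU read lock: essentially a per-cpu counter bump,
           * cheap compared to an MMIO transaction. */
          *idx = srcu_read_lock(&drm_unplug_srcu);

          if (dev->unplugged) {
                  srcu_read_unlock(&drm_unplug_srcu, *idx);
                  return false;
          }

          return true;
  }

The expensive part lives on the write side: drm_dev_unplug() sets
dev->unplugged and then calls synchronize_srcu(), which is where the stall
for all readers happens.
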
>>>
>>>
>>> So, say, writing to some harmless scratch register in a loop many times,
>>> both for the plugged and unplugged cases, and measuring the total time
>>> delta?
>>
>> I think we should at least measure the following:
>>
>> 1. Writing X times to a scratch reg without your patch.
>> 2. Writing X times to a scratch reg with your patch.
>> 3. Writing X times to a scratch reg with the hardware physically disconnected.


Just realized, I can't test this part since I don't have an eGPU to yank out.

Andrey
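
A minimal sketch of what measurements 1-3 above could look like; the scratch
register, iteration count, and entry point here are illustrative assumptions,
not the actual test code used for this series:

  /* Time N writes to a harmless scratch register; run once on a baseline
   * kernel, once with the drm_dev_enter/exit patch applied, and once with
   * the device physically removed. mmSCRATCH_REG0 stands in for whatever
   * register is safe to hammer on the target ASIC. */
  static void amdgpu_bench_scratch_writes(struct amdgpu_device *adev)
  {
          const u32 count = 1000000;
          ktime_t start, delta;
          u32 i;

          start = ktime_get();
          for (i = 0; i < count; i++)
                  WREG32(mmSCRATCH_REG0, i); /* goes through amdgpu_device_wreg() */
          delta = ktime_sub(ktime_get(), start);

          DRM_INFO("%u scratch writes took %lld ns\n", count, ktime_to_ns(delta));
  }
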


>>
>> I suggest repeating that once for Polaris (or older) and once for Vega or Navi.
>>
>> The SRBM on Polaris is meant to introduce some delay in each access, so it
>> might react differently than the newer hardware.
>>
>> Christian.
>
>
> Will do.
>
> Andrey
>
>
>>
>>>
>>> Andrey
>>>
>>>
>>>>
>>>>> The other solution would be, as I suggested, to keep all the device IO
>>>>> ranges reserved and the system memory pages unfreed until the device is
>>>>> finalized in the driver, but Daniel said this would upset the PCI layer
>>>>> (the MMIO range reservation part).
>>>>>
>>>>> Andrey
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On 1/19/21 3:55 AM, Christian König wrote:
>>>>>> On 1/18/21 10:01 PM, Andrey Grodzovsky wrote:
>>>>>>> This should prevent writing to memory or IO ranges possibly
>>>>>>> already allocated for other uses after our device is removed.
>>>>>> Wow, that adds quite some overhead to every register access. I'm not sure we
>>>>>> can do this.
>>>>>>
>>>>>> Christian.
>>>>>>
>>>>>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>>>>>>> ---
>>>>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 57 ++++++++++++++++++++++++
>>>>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c    |  9 ++++
>>>>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c    | 53 +++++++++++++---------
>>>>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h    |  3 ++
>>>>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c   | 70 ++++++++++++++++++++++++++++++
>>>>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h   | 49 ++-------------------
>>>>>>>    drivers/gpu/drm/amd/amdgpu/psp_v11_0.c     | 16 ++-----
>>>>>>>    drivers/gpu/drm/amd/amdgpu/psp_v12_0.c     |  8 +---
>>>>>>>    drivers/gpu/drm/amd/amdgpu/psp_v3_1.c      |  8 +---
>>>>>>>    9 files changed, 184 insertions(+), 89 deletions(-)
>>>>>>>
>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>>>> index e99f4f1..0a9d73c 100644
>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>>>> @@ -72,6 +72,8 @@
>>>>>>>      #include <linux/iommu.h>
>>>>>>>    +#include <drm/drm_drv.h>
>>>>>>> +
>>>>>>>    MODULE_FIRMWARE("amdgpu/vega10_gpu_info.bin");
>>>>>>>    MODULE_FIRMWARE("amdgpu/vega12_gpu_info.bin");
>>>>>>>    MODULE_FIRMWARE("amdgpu/raven_gpu_info.bin");
>>>>>>> @@ -404,13 +406,21 @@ uint8_t amdgpu_mm_rreg8(struct amdgpu_device *adev,
>>>>>>> uint32_t offset)
>>>>>>>     */
>>>>>>>    void amdgpu_mm_wreg8(struct amdgpu_device *adev, uint32_t offset, uint8_t value)
>>>>>>>    {
>>>>>>> +    int idx;
>>>>>>> +
>>>>>>>        if (adev->in_pci_err_recovery)
>>>>>>>            return;
>>>>>>>    +
>>>>>>> +    if (!drm_dev_enter(&adev->ddev, &idx))
>>>>>>> +        return;
>>>>>>> +
>>>>>>>        if (offset < adev->rmmio_size)
>>>>>>>            writeb(value, adev->rmmio + offset);
>>>>>>>        else
>>>>>>>            BUG();
>>>>>>> +
>>>>>>> +    drm_dev_exit(idx);
>>>>>>>    }
>>>>>>>      /**
>>>>>>> @@ -427,9 +437,14 @@ void amdgpu_device_wreg(struct amdgpu_device *adev,
>>>>>>>                uint32_t reg, uint32_t v,
>>>>>>>                uint32_t acc_flags)
>>>>>>>    {
>>>>>>> +    int idx;
>>>>>>> +
>>>>>>>        if (adev->in_pci_err_recovery)
>>>>>>>            return;
>>>>>>>    +    if (!drm_dev_enter(&adev->ddev, &idx))
>>>>>>> +        return;
>>>>>>> +
>>>>>>>        if ((reg * 4) < adev->rmmio_size) {
>>>>>>>            if (!(acc_flags & AMDGPU_REGS_NO_KIQ) &&
>>>>>>>                amdgpu_sriov_runtime(adev) &&
>>>>>>> @@ -444,6 +459,8 @@ void amdgpu_device_wreg(struct amdgpu_device *adev,
>>>>>>>        }
>>>>>>> trace_amdgpu_device_wreg(adev->pdev->device, reg, v);
>>>>>>> +
>>>>>>> +    drm_dev_exit(idx);
>>>>>>>    }
>>>>>>>      /*
>>>>>>> @@ -454,9 +471,14 @@ void amdgpu_device_wreg(struct amdgpu_device *adev,
>>>>>>>    void amdgpu_mm_wreg_mmio_rlc(struct amdgpu_device *adev,
>>>>>>>                     uint32_t reg, uint32_t v)
>>>>>>>    {
>>>>>>> +    int idx;
>>>>>>> +
>>>>>>>        if (adev->in_pci_err_recovery)
>>>>>>>            return;
>>>>>>>    +    if (!drm_dev_enter(&adev->ddev, &idx))
>>>>>>> +        return;
>>>>>>> +
>>>>>>>        if (amdgpu_sriov_fullaccess(adev) &&
>>>>>>>            adev->gfx.rlc.funcs &&
>>>>>>> adev->gfx.rlc.funcs->is_rlcg_access_range) {
>>>>>>> @@ -465,6 +487,8 @@ void amdgpu_mm_wreg_mmio_rlc(struct amdgpu_device 
>>>>>>> *adev,
>>>>>>>        } else {
>>>>>>>            writel(v, ((void __iomem *)adev->rmmio) + (reg * 4));
>>>>>>>        }
>>>>>>> +
>>>>>>> +    drm_dev_exit(idx);
>>>>>>>    }
>>>>>>>      /**
>>>>>>> @@ -499,15 +523,22 @@ u32 amdgpu_io_rreg(struct amdgpu_device *adev, u32 
>>>>>>> reg)
>>>>>>>     */
>>>>>>>    void amdgpu_io_wreg(struct amdgpu_device *adev, u32 reg, u32 v)
>>>>>>>    {
>>>>>>> +    int idx;
>>>>>>> +
>>>>>>>        if (adev->in_pci_err_recovery)
>>>>>>>            return;
>>>>>>>    +    if (!drm_dev_enter(&adev->ddev, &idx))
>>>>>>> +        return;
>>>>>>> +
>>>>>>>        if ((reg * 4) < adev->rio_mem_size)
>>>>>>>            iowrite32(v, adev->rio_mem + (reg * 4));
>>>>>>>        else {
>>>>>>>            iowrite32((reg * 4), adev->rio_mem + (mmMM_INDEX * 4));
>>>>>>>            iowrite32(v, adev->rio_mem + (mmMM_DATA * 4));
>>>>>>>        }
>>>>>>> +
>>>>>>> +    drm_dev_exit(idx);
>>>>>>>    }
>>>>>>>      /**
>>>>>>> @@ -544,14 +575,21 @@ u32 amdgpu_mm_rdoorbell(struct amdgpu_device 
>>>>>>> *adev, u32
>>>>>>> index)
>>>>>>>     */
>>>>>>>    void amdgpu_mm_wdoorbell(struct amdgpu_device *adev, u32 index, u32 v)
>>>>>>>    {
>>>>>>> +    int idx;
>>>>>>> +
>>>>>>>        if (adev->in_pci_err_recovery)
>>>>>>>            return;
>>>>>>>    +    if (!drm_dev_enter(&adev->ddev, &idx))
>>>>>>> +        return;
>>>>>>> +
>>>>>>>        if (index < adev->doorbell.num_doorbells) {
>>>>>>>            writel(v, adev->doorbell.ptr + index);
>>>>>>>        } else {
>>>>>>>            DRM_ERROR("writing beyond doorbell aperture: 0x%08x!\n", index);
>>>>>>>        }
>>>>>>> +
>>>>>>> +    drm_dev_exit(idx);
>>>>>>>    }
>>>>>>>      /**
>>>>>>> @@ -588,14 +626,21 @@ u64 amdgpu_mm_rdoorbell64(struct amdgpu_device *adev,
>>>>>>> u32 index)
>>>>>>>     */
>>>>>>>    void amdgpu_mm_wdoorbell64(struct amdgpu_device *adev, u32 index, u64 v)
>>>>>>>    {
>>>>>>> +    int idx;
>>>>>>> +
>>>>>>>        if (adev->in_pci_err_recovery)
>>>>>>>            return;
>>>>>>>    +    if (!drm_dev_enter(&adev->ddev, &idx))
>>>>>>> +        return;
>>>>>>> +
>>>>>>>        if (index < adev->doorbell.num_doorbells) {
>>>>>>>            atomic64_set((atomic64_t *)(adev->doorbell.ptr + index), v);
>>>>>>>        } else {
>>>>>>>            DRM_ERROR("writing beyond doorbell aperture: 0x%08x!\n", index);
>>>>>>>        }
>>>>>>> +
>>>>>>> +    drm_dev_exit(idx);
>>>>>>>    }
>>>>>>>      /**
>>>>>>> @@ -682,6 +727,10 @@ void amdgpu_device_indirect_wreg(struct amdgpu_device
>>>>>>> *adev,
>>>>>>>        unsigned long flags;
>>>>>>>        void __iomem *pcie_index_offset;
>>>>>>>        void __iomem *pcie_data_offset;
>>>>>>> +    int idx;
>>>>>>> +
>>>>>>> +    if (!drm_dev_enter(&adev->ddev, &idx))
>>>>>>> +        return;
>>>>>>> spin_lock_irqsave(&adev->pcie_idx_lock, flags);
>>>>>>>        pcie_index_offset = (void __iomem *)adev->rmmio + pcie_index * 4;
>>>>>>> @@ -692,6 +741,8 @@ void amdgpu_device_indirect_wreg(struct 
>>>>>>> amdgpu_device *adev,
>>>>>>>        writel(reg_data, pcie_data_offset);
>>>>>>>        readl(pcie_data_offset);
>>>>>>> spin_unlock_irqrestore(&adev->pcie_idx_lock, flags);
>>>>>>> +
>>>>>>> +    drm_dev_exit(idx);
>>>>>>>    }
>>>>>>>      /**
>>>>>>> @@ -711,6 +762,10 @@ void amdgpu_device_indirect_wreg64(struct 
>>>>>>> amdgpu_device
>>>>>>> *adev,
>>>>>>>        unsigned long flags;
>>>>>>>        void __iomem *pcie_index_offset;
>>>>>>>        void __iomem *pcie_data_offset;
>>>>>>> +    int idx;
>>>>>>> +
>>>>>>> +    if (!drm_dev_enter(&adev->ddev, &idx))
>>>>>>> +        return;
>>>>>>> spin_lock_irqsave(&adev->pcie_idx_lock, flags);
>>>>>>>        pcie_index_offset = (void __iomem *)adev->rmmio + pcie_index * 4;
>>>>>>> @@ -727,6 +782,8 @@ void amdgpu_device_indirect_wreg64(struct amdgpu_device
>>>>>>> *adev,
>>>>>>>        writel((u32)(reg_data >> 32), pcie_data_offset);
>>>>>>>        readl(pcie_data_offset);
>>>>>>> spin_unlock_irqrestore(&adev->pcie_idx_lock, flags);
>>>>>>> +
>>>>>>> +    drm_dev_exit(idx);
>>>>>>>    }
>>>>>>>      /**
>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
>>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
>>>>>>> index fe1a39f..1beb4e6 100644
>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
>>>>>>> @@ -31,6 +31,8 @@
>>>>>>>    #include "amdgpu_ras.h"
>>>>>>>    #include "amdgpu_xgmi.h"
>>>>>>>    +#include <drm/drm_drv.h>
>>>>>>> +
>>>>>>>    /**
>>>>>>>     * amdgpu_gmc_get_pde_for_bo - get the PDE for a BO
>>>>>>>     *
>>>>>>> @@ -98,6 +100,10 @@ int amdgpu_gmc_set_pte_pde(struct amdgpu_device *adev,
>>>>>>> void *cpu_pt_addr,
>>>>>>>    {
>>>>>>>        void __iomem *ptr = (void *)cpu_pt_addr;
>>>>>>>        uint64_t value;
>>>>>>> +    int idx;
>>>>>>> +
>>>>>>> +    if (!drm_dev_enter(&adev->ddev, &idx))
>>>>>>> +        return 0;
>>>>>>>          /*
>>>>>>>         * The following is for PTE only. GART does not have PDEs.
>>>>>>> @@ -105,6 +111,9 @@ int amdgpu_gmc_set_pte_pde(struct amdgpu_device *adev,
>>>>>>> void *cpu_pt_addr,
>>>>>>>        value = addr & 0x0000FFFFFFFFF000ULL;
>>>>>>>        value |= flags;
>>>>>>>        writeq(value, ptr + (gpu_page_idx * 8));
>>>>>>> +
>>>>>>> +    drm_dev_exit(idx);
>>>>>>> +
>>>>>>>        return 0;
>>>>>>>    }
>>>>>>>    diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
>>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
>>>>>>> index 523d22d..89e2bfe 100644
>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
>>>>>>> @@ -37,6 +37,8 @@
>>>>>>>      #include "amdgpu_ras.h"
>>>>>>>    +#include <drm/drm_drv.h>
>>>>>>> +
>>>>>>>    static int psp_sysfs_init(struct amdgpu_device *adev);
>>>>>>>    static void psp_sysfs_fini(struct amdgpu_device *adev);
>>>>>>>    @@ -248,7 +250,7 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>>>>>>               struct psp_gfx_cmd_resp *cmd, uint64_t fence_mc_addr)
>>>>>>>    {
>>>>>>>        int ret;
>>>>>>> -    int index;
>>>>>>> +    int index, idx;
>>>>>>>        int timeout = 2000;
>>>>>>>        bool ras_intr = false;
>>>>>>>        bool skip_unsupport = false;
>>>>>>> @@ -256,6 +258,9 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>>>>>>        if (psp->adev->in_pci_err_recovery)
>>>>>>>            return 0;
>>>>>>>    +    if (!drm_dev_enter(&psp->adev->ddev, &idx))
>>>>>>> +        return 0;
>>>>>>> +
>>>>>>>        mutex_lock(&psp->mutex);
>>>>>>>          memset(psp->cmd_buf_mem, 0, PSP_CMD_BUFFER_SIZE);
>>>>>>> @@ -266,8 +271,7 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>>>>>>        ret = psp_ring_cmd_submit(psp, psp->cmd_buf_mc_addr, fence_mc_addr,
>>>>>>> index);
>>>>>>>        if (ret) {
>>>>>>>            atomic_dec(&psp->fence_value);
>>>>>>> -        mutex_unlock(&psp->mutex);
>>>>>>> -        return ret;
>>>>>>> +        goto exit;
>>>>>>>        }
>>>>>>>          amdgpu_asic_invalidate_hdp(psp->adev, NULL);
>>>>>>> @@ -307,8 +311,8 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>>>>>>                 psp->cmd_buf_mem->cmd_id,
>>>>>>>                 psp->cmd_buf_mem->resp.status);
>>>>>>>            if (!timeout) {
>>>>>>> -            mutex_unlock(&psp->mutex);
>>>>>>> -            return -EINVAL;
>>>>>>> +            ret = -EINVAL;
>>>>>>> +            goto exit;
>>>>>>>            }
>>>>>>>        }
>>>>>>>    @@ -316,8 +320,10 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>>>>>>            ucode->tmr_mc_addr_lo = psp->cmd_buf_mem->resp.fw_addr_lo;
>>>>>>>            ucode->tmr_mc_addr_hi = psp->cmd_buf_mem->resp.fw_addr_hi;
>>>>>>>        }
>>>>>>> -    mutex_unlock(&psp->mutex);
>>>>>>>    +exit:
>>>>>>> +    mutex_unlock(&psp->mutex);
>>>>>>> +    drm_dev_exit(idx);
>>>>>>>        return ret;
>>>>>>>    }
>>>>>>>    @@ -354,8 +360,7 @@ static int psp_load_toc(struct psp_context *psp,
>>>>>>>        if (!cmd)
>>>>>>>            return -ENOMEM;
>>>>>>>        /* Copy toc to psp firmware private buffer */
>>>>>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>>>> -    memcpy(psp->fw_pri_buf, psp->toc_start_addr, psp->toc_bin_size);
>>>>>>> +    psp_copy_fw(psp, psp->toc_start_addr, psp->toc_bin_size);
>>>>>>>          psp_prep_load_toc_cmd_buf(cmd, psp->fw_pri_mc_addr, psp->toc_bin_size);
>>>>>>>    @@ -570,8 +575,7 @@ static int psp_asd_load(struct psp_context *psp)
>>>>>>>        if (!cmd)
>>>>>>>            return -ENOMEM;
>>>>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>>>> -    memcpy(psp->fw_pri_buf, psp->asd_start_addr, psp->asd_ucode_size);
>>>>>>> +    psp_copy_fw(psp, psp->asd_start_addr, psp->asd_ucode_size);
>>>>>>>          psp_prep_asd_load_cmd_buf(cmd, psp->fw_pri_mc_addr,
>>>>>>>                      psp->asd_ucode_size);
>>>>>>> @@ -726,8 +730,7 @@ static int psp_xgmi_load(struct psp_context *psp)
>>>>>>>        if (!cmd)
>>>>>>>            return -ENOMEM;
>>>>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>>>> -    memcpy(psp->fw_pri_buf, psp->ta_xgmi_start_addr, psp->ta_xgmi_ucode_size);
>>>>>>> +    psp_copy_fw(psp, psp->ta_xgmi_start_addr, psp->ta_xgmi_ucode_size);
>>>>>>>          psp_prep_ta_load_cmd_buf(cmd,
>>>>>>>                     psp->fw_pri_mc_addr,
>>>>>>> @@ -982,8 +985,7 @@ static int psp_ras_load(struct psp_context *psp)
>>>>>>>        if (!cmd)
>>>>>>>            return -ENOMEM;
>>>>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>>>> -    memcpy(psp->fw_pri_buf, psp->ta_ras_start_addr, psp->ta_ras_ucode_size);
>>>>>>> +    psp_copy_fw(psp, psp->ta_ras_start_addr, psp->ta_ras_ucode_size);
>>>>>>>          psp_prep_ta_load_cmd_buf(cmd,
>>>>>>>                     psp->fw_pri_mc_addr,
>>>>>>> @@ -1219,8 +1221,7 @@ static int psp_hdcp_load(struct psp_context *psp)
>>>>>>>        if (!cmd)
>>>>>>>            return -ENOMEM;
>>>>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>>>> -    memcpy(psp->fw_pri_buf, psp->ta_hdcp_start_addr,
>>>>>>> +    psp_copy_fw(psp, psp->ta_hdcp_start_addr,
>>>>>>>               psp->ta_hdcp_ucode_size);
>>>>>>>          psp_prep_ta_load_cmd_buf(cmd,
>>>>>>> @@ -1366,8 +1367,7 @@ static int psp_dtm_load(struct psp_context *psp)
>>>>>>>        if (!cmd)
>>>>>>>            return -ENOMEM;
>>>>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>>>> -    memcpy(psp->fw_pri_buf, psp->ta_dtm_start_addr, psp->ta_dtm_ucode_size);
>>>>>>> +    psp_copy_fw(psp, psp->ta_dtm_start_addr, psp->ta_dtm_ucode_size);
>>>>>>>          psp_prep_ta_load_cmd_buf(cmd,
>>>>>>>                     psp->fw_pri_mc_addr,
>>>>>>> @@ -1507,8 +1507,7 @@ static int psp_rap_load(struct psp_context *psp)
>>>>>>>        if (!cmd)
>>>>>>>            return -ENOMEM;
>>>>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>>>> -    memcpy(psp->fw_pri_buf, psp->ta_rap_start_addr, psp->ta_rap_ucode_size);
>>>>>>> +    psp_copy_fw(psp, psp->ta_rap_start_addr, psp->ta_rap_ucode_size);
>>>>>>>          psp_prep_ta_load_cmd_buf(cmd,
>>>>>>>                     psp->fw_pri_mc_addr,
>>>>>>> @@ -2778,6 +2777,20 @@ static ssize_t psp_usbc_pd_fw_sysfs_write(struct
>>>>>>> device *dev,
>>>>>>>        return count;
>>>>>>>    }
>>>>>>>    +void psp_copy_fw(struct psp_context *psp, uint8_t *start_addr, uint32_t bin_size)
>>>>>>> +{
>>>>>>> +    int idx;
>>>>>>> +
>>>>>>> +    if (!drm_dev_enter(&psp->adev->ddev, &idx))
>>>>>>> +        return;
>>>>>>> +
>>>>>>> +    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>>>> +    memcpy(psp->fw_pri_buf, start_addr, bin_size);
>>>>>>> +
>>>>>>> +    drm_dev_exit(idx);
>>>>>>> +}
>>>>>>> +
>>>>>>> +
>>>>>>>    static DEVICE_ATTR(usbc_pd_fw, S_IRUGO | S_IWUSR,
>>>>>>>               psp_usbc_pd_fw_sysfs_read,
>>>>>>>               psp_usbc_pd_fw_sysfs_write);
>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
>>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
>>>>>>> index da250bc..ac69314 100644
>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
>>>>>>> @@ -400,4 +400,7 @@ int psp_init_ta_microcode(struct psp_context *psp,
>>>>>>>                  const char *chip_name);
>>>>>>>    int psp_get_fw_attestation_records_addr(struct psp_context *psp,
>>>>>>>                        uint64_t *output_ptr);
>>>>>>> +
>>>>>>> +void psp_copy_fw(struct psp_context *psp, uint8_t *start_addr, uint32_t bin_size);
>>>>>>> +
>>>>>>>    #endif
>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>>>>>>> index 1a612f5..d656494 100644
>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>>>>>>> @@ -35,6 +35,8 @@
>>>>>>>    #include "amdgpu.h"
>>>>>>>    #include "atom.h"
>>>>>>>    +#include <drm/drm_drv.h>
>>>>>>> +
>>>>>>>    /*
>>>>>>>     * Rings
>>>>>>>     * Most engines on the GPU are fed via ring buffers. Ring
>>>>>>> @@ -463,3 +465,71 @@ int amdgpu_ring_test_helper(struct amdgpu_ring *ring)
>>>>>>>        ring->sched.ready = !r;
>>>>>>>        return r;
>>>>>>>    }
>>>>>>> +
>>>>>>> +void amdgpu_ring_clear_ring(struct amdgpu_ring *ring)
>>>>>>> +{
>>>>>>> +    int idx;
>>>>>>> +    int i = 0;
>>>>>>> +
>>>>>>> +    if (!drm_dev_enter(&ring->adev->ddev, &idx))
>>>>>>> +        return;
>>>>>>> +
>>>>>>> +    while (i <= ring->buf_mask)
>>>>>>> +        ring->ring[i++] = ring->funcs->nop;
>>>>>>> +
>>>>>>> +    drm_dev_exit(idx);
>>>>>>> +
>>>>>>> +}
>>>>>>> +
>>>>>>> +void amdgpu_ring_write(struct amdgpu_ring *ring, uint32_t v)
>>>>>>> +{
>>>>>>> +    int idx;
>>>>>>> +
>>>>>>> +    if (!drm_dev_enter(&ring->adev->ddev, &idx))
>>>>>>> +        return;
>>>>>>> +
>>>>>>> +    if (ring->count_dw <= 0)
>>>>>>> +        DRM_ERROR("amdgpu: writing more dwords to the ring than expected!\n");
>>>>>>> +    ring->ring[ring->wptr++ & ring->buf_mask] = v;
>>>>>>> +    ring->wptr &= ring->ptr_mask;
>>>>>>> +    ring->count_dw--;
>>>>>>> +
>>>>>>> +    drm_dev_exit(idx);
>>>>>>> +}
>>>>>>> +
>>>>>>> +void amdgpu_ring_write_multiple(struct amdgpu_ring *ring,
>>>>>>> +                          void *src, int count_dw)
>>>>>>> +{
>>>>>>> +    unsigned occupied, chunk1, chunk2;
>>>>>>> +    void *dst;
>>>>>>> +    int idx;
>>>>>>> +
>>>>>>> +    if (!drm_dev_enter(&ring->adev->ddev, &idx))
>>>>>>> +        return;
>>>>>>> +
>>>>>>> +    if (unlikely(ring->count_dw < count_dw))
>>>>>>> +        DRM_ERROR("amdgpu: writing more dwords to the ring than expected!\n");
>>>>>>> +
>>>>>>> +    occupied = ring->wptr & ring->buf_mask;
>>>>>>> +    dst = (void *)&ring->ring[occupied];
>>>>>>> +    chunk1 = ring->buf_mask + 1 - occupied;
>>>>>>> +    chunk1 = (chunk1 >= count_dw) ? count_dw: chunk1;
>>>>>>> +    chunk2 = count_dw - chunk1;
>>>>>>> +    chunk1 <<= 2;
>>>>>>> +    chunk2 <<= 2;
>>>>>>> +
>>>>>>> +    if (chunk1)
>>>>>>> +        memcpy(dst, src, chunk1);
>>>>>>> +
>>>>>>> +    if (chunk2) {
>>>>>>> +        src += chunk1;
>>>>>>> +        dst = (void *)ring->ring;
>>>>>>> +        memcpy(dst, src, chunk2);
>>>>>>> +    }
>>>>>>> +
>>>>>>> +    ring->wptr += count_dw;
>>>>>>> +    ring->wptr &= ring->ptr_mask;
>>>>>>> +    ring->count_dw -= count_dw;
>>>>>>> +
>>>>>>> +    drm_dev_exit(idx);
>>>>>>> +}
>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>>>>>>> index accb243..f90b81f 100644
>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>>>>>>> @@ -300,53 +300,12 @@ static inline void
>>>>>>> amdgpu_ring_set_preempt_cond_exec(struct amdgpu_ring *ring,
>>>>>>>        *ring->cond_exe_cpu_addr = cond_exec;
>>>>>>>    }
>>>>>>>    -static inline void amdgpu_ring_clear_ring(struct amdgpu_ring *ring)
>>>>>>> -{
>>>>>>> -    int i = 0;
>>>>>>> -    while (i <= ring->buf_mask)
>>>>>>> -        ring->ring[i++] = ring->funcs->nop;
>>>>>>> -
>>>>>>> -}
>>>>>>> -
>>>>>>> -static inline void amdgpu_ring_write(struct amdgpu_ring *ring, uint32_t v)
>>>>>>> -{
>>>>>>> -    if (ring->count_dw <= 0)
>>>>>>> -        DRM_ERROR("amdgpu: writing more dwords to the ring than expected!\n");
>>>>>>> -    ring->ring[ring->wptr++ & ring->buf_mask] = v;
>>>>>>> -    ring->wptr &= ring->ptr_mask;
>>>>>>> -    ring->count_dw--;
>>>>>>> -}
>>>>>>> +void amdgpu_ring_clear_ring(struct amdgpu_ring *ring);
>>>>>>>    -static inline void amdgpu_ring_write_multiple(struct amdgpu_ring *ring,
>>>>>>> -                          void *src, int count_dw)
>>>>>>> -{
>>>>>>> -    unsigned occupied, chunk1, chunk2;
>>>>>>> -    void *dst;
>>>>>>> -
>>>>>>> -    if (unlikely(ring->count_dw < count_dw))
>>>>>>> -        DRM_ERROR("amdgpu: writing more dwords to the ring than expected!\n");
>>>>>>> -
>>>>>>> -    occupied = ring->wptr & ring->buf_mask;
>>>>>>> -    dst = (void *)&ring->ring[occupied];
>>>>>>> -    chunk1 = ring->buf_mask + 1 - occupied;
>>>>>>> -    chunk1 = (chunk1 >= count_dw) ? count_dw: chunk1;
>>>>>>> -    chunk2 = count_dw - chunk1;
>>>>>>> -    chunk1 <<= 2;
>>>>>>> -    chunk2 <<= 2;
>>>>>>> +void amdgpu_ring_write(struct amdgpu_ring *ring, uint32_t v);
>>>>>>>    -    if (chunk1)
>>>>>>> -        memcpy(dst, src, chunk1);
>>>>>>> -
>>>>>>> -    if (chunk2) {
>>>>>>> -        src += chunk1;
>>>>>>> -        dst = (void *)ring->ring;
>>>>>>> -        memcpy(dst, src, chunk2);
>>>>>>> -    }
>>>>>>> -
>>>>>>> -    ring->wptr += count_dw;
>>>>>>> -    ring->wptr &= ring->ptr_mask;
>>>>>>> -    ring->count_dw -= count_dw;
>>>>>>> -}
>>>>>>> +void amdgpu_ring_write_multiple(struct amdgpu_ring *ring,
>>>>>>> +                          void *src, int count_dw);
>>>>>>>      int amdgpu_ring_test_helper(struct amdgpu_ring *ring);
>>>>>>>    diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
>>>>>>> b/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
>>>>>>> index bd4248c..b3ce5be 100644
>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
>>>>>>> @@ -269,10 +269,8 @@ static int psp_v11_0_bootloader_load_kdb(struct
>>>>>>> psp_context *psp)
>>>>>>>        if (ret)
>>>>>>>            return ret;
>>>>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>>>> -
>>>>>>>        /* Copy PSP KDB binary to memory */
>>>>>>> -    memcpy(psp->fw_pri_buf, psp->kdb_start_addr, psp->kdb_bin_size);
>>>>>>> +    psp_copy_fw(psp, psp->kdb_start_addr, psp->kdb_bin_size);
>>>>>>>          /* Provide the PSP KDB to bootloader */
>>>>>>>        WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>>>>> @@ -302,10 +300,8 @@ static int psp_v11_0_bootloader_load_spl(struct
>>>>>>> psp_context *psp)
>>>>>>>        if (ret)
>>>>>>>            return ret;
>>>>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>>>> -
>>>>>>>        /* Copy PSP SPL binary to memory */
>>>>>>> -    memcpy(psp->fw_pri_buf, psp->spl_start_addr, psp->spl_bin_size);
>>>>>>> +    psp_copy_fw(psp, psp->spl_start_addr, psp->spl_bin_size);
>>>>>>>          /* Provide the PSP SPL to bootloader */
>>>>>>>        WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>>>>> @@ -335,10 +331,8 @@ static int psp_v11_0_bootloader_load_sysdrv(struct
>>>>>>> psp_context *psp)
>>>>>>>        if (ret)
>>>>>>>            return ret;
>>>>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>>>> -
>>>>>>>        /* Copy PSP System Driver binary to memory */
>>>>>>> -    memcpy(psp->fw_pri_buf, psp->sys_start_addr, psp->sys_bin_size);
>>>>>>> +    psp_copy_fw(psp, psp->sys_start_addr, psp->sys_bin_size);
>>>>>>>          /* Provide the sys driver to bootloader */
>>>>>>>        WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>>>>> @@ -371,10 +365,8 @@ static int psp_v11_0_bootloader_load_sos(struct
>>>>>>> psp_context *psp)
>>>>>>>        if (ret)
>>>>>>>            return ret;
>>>>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>>>> -
>>>>>>>        /* Copy Secure OS binary to PSP memory */
>>>>>>> -    memcpy(psp->fw_pri_buf, psp->sos_start_addr, psp->sos_bin_size);
>>>>>>> +    psp_copy_fw(psp, psp->sos_start_addr, psp->sos_bin_size);
>>>>>>>          /* Provide the PSP secure OS to bootloader */
>>>>>>>        WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
>>>>>>> b/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
>>>>>>> index c4828bd..618e5b6 100644
>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
>>>>>>> @@ -138,10 +138,8 @@ static int psp_v12_0_bootloader_load_sysdrv(struct
>>>>>>> psp_context *psp)
>>>>>>>        if (ret)
>>>>>>>            return ret;
>>>>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>>>> -
>>>>>>>        /* Copy PSP System Driver binary to memory */
>>>>>>> -    memcpy(psp->fw_pri_buf, psp->sys_start_addr, psp->sys_bin_size);
>>>>>>> +    psp_copy_fw(psp, psp->sys_start_addr, psp->sys_bin_size);
>>>>>>>          /* Provide the sys driver to bootloader */
>>>>>>>        WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>>>>> @@ -179,10 +177,8 @@ static int psp_v12_0_bootloader_load_sos(struct
>>>>>>> psp_context *psp)
>>>>>>>        if (ret)
>>>>>>>            return ret;
>>>>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>>>> -
>>>>>>>        /* Copy Secure OS binary to PSP memory */
>>>>>>> -    memcpy(psp->fw_pri_buf, psp->sos_start_addr, psp->sos_bin_size);
>>>>>>> +    psp_copy_fw(psp, psp->sos_start_addr, psp->sos_bin_size);
>>>>>>>          /* Provide the PSP secure OS to bootloader */
>>>>>>>        WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
>>>>>>> b/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
>>>>>>> index f2e725f..d0a6cccd 100644
>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
>>>>>>> @@ -102,10 +102,8 @@ static int psp_v3_1_bootloader_load_sysdrv(struct
>>>>>>> psp_context *psp)
>>>>>>>        if (ret)
>>>>>>>            return ret;
>>>>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>>>> -
>>>>>>>        /* Copy PSP System Driver binary to memory */
>>>>>>> -    memcpy(psp->fw_pri_buf, psp->sys_start_addr, psp->sys_bin_size);
>>>>>>> +    psp_copy_fw(psp, psp->sys_start_addr, psp->sys_bin_size);
>>>>>>>          /* Provide the sys driver to bootloader */
>>>>>>>        WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>>>>> @@ -143,10 +141,8 @@ static int psp_v3_1_bootloader_load_sos(struct
>>>>>>> psp_context *psp)
>>>>>>>        if (ret)
>>>>>>>            return ret;
>>>>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>>>> -
>>>>>>>        /* Copy Secure OS binary to PSP memory */
>>>>>>> -    memcpy(psp->fw_pri_buf, psp->sos_start_addr, psp->sos_bin_size);
>>>>>>> +    psp_copy_fw(psp, psp->sos_start_addr, psp->sos_bin_size);
>>>>>>>          /* Provide the PSP secure OS to bootloader */
>>>>>>>        WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>>
>>>>
>>> _______________________________________________
>>> amd-gfx mailing list
>>> amd-gfx@lists.freedesktop.org
>>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>>>
>>
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 196+ messages in thread

* Re: [PATCH v4 07/14] drm/amdgpu: Register IOMMU topology notifier per device.
  2021-01-20  5:01       ` Andrey Grodzovsky
@ 2021-01-20 19:38         ` Andrey Grodzovsky
  -1 siblings, 0 replies; 196+ messages in thread
From: Andrey Grodzovsky @ 2021-01-20 19:38 UTC (permalink / raw)
  To: christian.koenig, amd-gfx, dri-devel, daniel.vetter, robh,
	l.stach, yuq825, eric
  Cc: Alexander.Deucher, gregkh

Ping

Andrey

On 1/20/21 12:01 AM, Andrey Grodzovsky wrote:
>
> On 1/19/21 3:48 AM, Christian König wrote:
>> On 1/18/21 10:01 PM, Andrey Grodzovsky wrote:
>>> Handle all DMA IOMMU group-related dependencies before the
>>> group is removed.
>>>
>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>>> ---
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu.h        |  5 ++++
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 46 ++++++++++++++++++++++++++++++
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c   |  2 +-
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h   |  1 +
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 10 +++++++
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_object.h |  2 ++
>>>   6 files changed, 65 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>> index 478a7d8..2953420 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>> @@ -51,6 +51,7 @@
>>>   #include <linux/dma-fence.h>
>>>   #include <linux/pci.h>
>>>   #include <linux/aer.h>
>>> +#include <linux/notifier.h>
>>>     #include <drm/ttm/ttm_bo_api.h>
>>>   #include <drm/ttm/ttm_bo_driver.h>
>>> @@ -1041,6 +1042,10 @@ struct amdgpu_device {
>>>         bool                            in_pci_err_recovery;
>>>       struct pci_saved_state          *pci_state;
>>> +
>>> +    struct notifier_block        nb;
>>> +    struct blocking_notifier_head    notifier;
>>> +    struct list_head        device_bo_list;
>>>   };
>>>     static inline struct amdgpu_device *drm_to_adev(struct drm_device *ddev)
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>> index 45e23e3..e99f4f1 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>> @@ -70,6 +70,8 @@
>>>   #include <drm/task_barrier.h>
>>>   #include <linux/pm_runtime.h>
>>>   +#include <linux/iommu.h>
>>> +
>>>   MODULE_FIRMWARE("amdgpu/vega10_gpu_info.bin");
>>>   MODULE_FIRMWARE("amdgpu/vega12_gpu_info.bin");
>>>   MODULE_FIRMWARE("amdgpu/raven_gpu_info.bin");
>>> @@ -3200,6 +3202,39 @@ static const struct attribute 
>>> *amdgpu_dev_attributes[] = {
>>>   };
>>>     +static int amdgpu_iommu_group_notifier(struct notifier_block *nb,
>>> +                     unsigned long action, void *data)
>>> +{
>>> +    struct amdgpu_device *adev = container_of(nb, struct amdgpu_device, nb);
>>> +    struct amdgpu_bo *bo = NULL;
>>> +
>>> +    /*
>>> +     * Following is a set of IOMMU group dependencies taken care of before
>>> +     * device's IOMMU group is removed
>>> +     */
>>> +    if (action == IOMMU_GROUP_NOTIFY_DEL_DEVICE) {
>>> +
>>> +        spin_lock(&ttm_bo_glob.lru_lock);
>>> +        list_for_each_entry(bo, &adev->device_bo_list, bo) {
>>> +            if (bo->tbo.ttm)
>>> +                ttm_tt_unpopulate(bo->tbo.bdev, bo->tbo.ttm);
>>> +        }
>>> +        spin_unlock(&ttm_bo_glob.lru_lock);
>>
>> That approach won't work. ttm_tt_unpopulate() might sleep on an IOMMU lock.
>>
>> You need to use a mutex here, or even better make sure you can access the
>> device_bo_list without a lock at this point.
>>
>> Christian.
>
>
> I can think of switching to an RCU list? Otherwise, elements are added
> on BO create and deleted on BO destroy; how can I prevent either of those from
> happening while in this section, besides a mutex? Make a copy of the list and
> run over it instead?
>
> Andrey
>
>
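
For what it's worth, a sketch of the mutex variant Christian suggests; the
lock name is illustrative and not from the series:

  /* amdgpu_bo_do_create()/amdgpu_bo_destroy(): guard the per-device BO
   * list with a dedicated mutex instead of ttm_bo_glob.lru_lock. */
  mutex_lock(&adev->device_bo_mutex);
  list_add_tail(&bo->bo, &adev->device_bo_list);
  mutex_unlock(&adev->device_bo_mutex);

  /* amdgpu_iommu_group_notifier(): the traversal may now call
   * sleeping functions such as ttm_tt_unpopulate(). */
  mutex_lock(&adev->device_bo_mutex);
  list_for_each_entry(bo, &adev->device_bo_list, bo) {
          if (bo->tbo.ttm)
                  ttm_tt_unpopulate(bo->tbo.bdev, bo->tbo.ttm);
  }
  mutex_unlock(&adev->device_bo_mutex);

An RCU list would make the traversal lock-free, but an rcu_read_lock()
section may not sleep either, so it would not help with the
ttm_tt_unpopulate() call; snapshotting the list under a short lock and
walking the copy would work, but then the BO lifetimes need extra care.
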
>>
>>> +
>>> +        if (adev->irq.ih.use_bus_addr)
>>> +            amdgpu_ih_ring_fini(adev, &adev->irq.ih);
>>> +        if (adev->irq.ih1.use_bus_addr)
>>> +            amdgpu_ih_ring_fini(adev, &adev->irq.ih1);
>>> +        if (adev->irq.ih2.use_bus_addr)
>>> +            amdgpu_ih_ring_fini(adev, &adev->irq.ih2);
>>> +
>>> +        amdgpu_gart_dummy_page_fini(adev);
>>> +    }
>>> +
>>> +    return NOTIFY_OK;
>>> +}
>>> +
>>> +
>>>   /**
>>>    * amdgpu_device_init - initialize the driver
>>>    *
>>> @@ -3304,6 +3339,8 @@ int amdgpu_device_init(struct amdgpu_device *adev,
>>>         INIT_WORK(&adev->xgmi_reset_work, amdgpu_device_xgmi_reset_func);
>>>   +    INIT_LIST_HEAD(&adev->device_bo_list);
>>> +
>>>       adev->gfx.gfx_off_req_count = 1;
>>>       adev->pm.ac_power = power_supply_is_system_supplied() > 0;
>>>   @@ -3575,6 +3612,15 @@ int amdgpu_device_init(struct amdgpu_device *adev,
>>>       if (amdgpu_device_cache_pci_state(adev->pdev))
>>>           pci_restore_state(pdev);
>>>   +    BLOCKING_INIT_NOTIFIER_HEAD(&adev->notifier);
>>> +    adev->nb.notifier_call = amdgpu_iommu_group_notifier;
>>> +
>>> +    if (adev->dev->iommu_group) {
>>> +        r = iommu_group_register_notifier(adev->dev->iommu_group, &adev->nb);
>>> +        if (r)
>>> +            goto failed;
>>> +    }
>>> +
>>>       return 0;
>>>     failed:
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
>>> index 0db9330..486ad6d 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
>>> @@ -92,7 +92,7 @@ static int amdgpu_gart_dummy_page_init(struct 
>>> amdgpu_device *adev)
>>>    *
>>>    * Frees the dummy page used by the driver (all asics).
>>>    */
>>> -static void amdgpu_gart_dummy_page_fini(struct amdgpu_device *adev)
>>> +void amdgpu_gart_dummy_page_fini(struct amdgpu_device *adev)
>>>   {
>>>       if (!adev->dummy_page_addr)
>>>           return;
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h
>>> index afa2e28..5678d9c 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h
>>> @@ -61,6 +61,7 @@ int amdgpu_gart_table_vram_pin(struct amdgpu_device *adev);
>>>   void amdgpu_gart_table_vram_unpin(struct amdgpu_device *adev);
>>>   int amdgpu_gart_init(struct amdgpu_device *adev);
>>>   void amdgpu_gart_fini(struct amdgpu_device *adev);
>>> +void amdgpu_gart_dummy_page_fini(struct amdgpu_device *adev);
>>>   int amdgpu_gart_unbind(struct amdgpu_device *adev, uint64_t offset,
>>>                  int pages);
>>>   int amdgpu_gart_map(struct amdgpu_device *adev, uint64_t offset,
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>> index 6cc9919..4a1de69 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>> @@ -94,6 +94,10 @@ static void amdgpu_bo_destroy(struct ttm_buffer_object *tbo)
>>>       }
>>>       amdgpu_bo_unref(&bo->parent);
>>>   +    spin_lock(&ttm_bo_glob.lru_lock);
>>> +    list_del(&bo->bo);
>>> +    spin_unlock(&ttm_bo_glob.lru_lock);
>>> +
>>>       kfree(bo->metadata);
>>>       kfree(bo);
>>>   }
>>> @@ -613,6 +617,12 @@ static int amdgpu_bo_do_create(struct amdgpu_device *adev,
>>>       if (bp->type == ttm_bo_type_device)
>>>           bo->flags &= ~AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED;
>>>   +    INIT_LIST_HEAD(&bo->bo);
>>> +
>>> +    spin_lock(&ttm_bo_glob.lru_lock);
>>> +    list_add_tail(&bo->bo, &adev->device_bo_list);
>>> +    spin_unlock(&ttm_bo_glob.lru_lock);
>>> +
>>>       return 0;
>>>     fail_unreserve:
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
>>> index 9ac3756..5ae8555 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
>>> @@ -110,6 +110,8 @@ struct amdgpu_bo {
>>>       struct list_head        shadow_list;
>>>         struct kgd_mem                  *kfd_bo;
>>> +
>>> +    struct list_head        bo;
>>>   };
>>>     static inline struct amdgpu_bo *ttm_to_amdgpu_bo(struct 
>>> ttm_buffer_object *tbo)
>>
>> _______________________________________________
>> amd-gfx mailing list
>> amd-gfx@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>>
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 196+ messages in thread

* Re: [PATCH v4 07/14] drm/amdgpu: Register IOMMU topology notifier per device.
@ 2021-01-20 19:38         ` Andrey Grodzovsky
  0 siblings, 0 replies; 196+ messages in thread
From: Andrey Grodzovsky @ 2021-01-20 19:38 UTC (permalink / raw)
  To: christian.koenig, amd-gfx, dri-devel, daniel.vetter, robh,
	l.stach, yuq825, eric
  Cc: Alexander.Deucher, gregkh, ppaalanen, Harry.Wentland

Ping

Andrey

On 1/20/21 12:01 AM, Andrey Grodzovsky wrote:
>
> On 1/19/21 3:48 AM, Christian König wrote:
>> On 18.01.21 at 22:01, Andrey Grodzovsky wrote:
>>> Handle all DMA IOMMU group related dependencies before the
>>> group is removed.
>>>
>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>>> ---
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu.h        |  5 ++++
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 46 
>>> ++++++++++++++++++++++++++++++
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c   |  2 +-
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h   |  1 +
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 10 +++++++
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_object.h |  2 ++
>>>   6 files changed, 65 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>> index 478a7d8..2953420 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>> @@ -51,6 +51,7 @@
>>>   #include <linux/dma-fence.h>
>>>   #include <linux/pci.h>
>>>   #include <linux/aer.h>
>>> +#include <linux/notifier.h>
>>>     #include <drm/ttm/ttm_bo_api.h>
>>>   #include <drm/ttm/ttm_bo_driver.h>
>>> @@ -1041,6 +1042,10 @@ struct amdgpu_device {
>>>         bool                            in_pci_err_recovery;
>>>       struct pci_saved_state          *pci_state;
>>> +
>>> +    struct notifier_block        nb;
>>> +    struct blocking_notifier_head    notifier;
>>> +    struct list_head        device_bo_list;
>>>   };
>>>     static inline struct amdgpu_device *drm_to_adev(struct drm_device *ddev)
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>> index 45e23e3..e99f4f1 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>> @@ -70,6 +70,8 @@
>>>   #include <drm/task_barrier.h>
>>>   #include <linux/pm_runtime.h>
>>>   +#include <linux/iommu.h>
>>> +
>>>   MODULE_FIRMWARE("amdgpu/vega10_gpu_info.bin");
>>>   MODULE_FIRMWARE("amdgpu/vega12_gpu_info.bin");
>>>   MODULE_FIRMWARE("amdgpu/raven_gpu_info.bin");
>>> @@ -3200,6 +3202,39 @@ static const struct attribute 
>>> *amdgpu_dev_attributes[] = {
>>>   };
>>>     +static int amdgpu_iommu_group_notifier(struct notifier_block *nb,
>>> +                     unsigned long action, void *data)
>>> +{
>>> +    struct amdgpu_device *adev = container_of(nb, struct amdgpu_device, nb);
>>> +    struct amdgpu_bo *bo = NULL;
>>> +
>>> +    /*
>>> +     * Following is a set of IOMMU group dependencies taken care of before
>>> +     * device's IOMMU group is removed
>>> +     */
>>> +    if (action == IOMMU_GROUP_NOTIFY_DEL_DEVICE) {
>>> +
>>> +        spin_lock(&ttm_bo_glob.lru_lock);
>>> +        list_for_each_entry(bo, &adev->device_bo_list, bo) {
>>> +            if (bo->tbo.ttm)
>>> +                ttm_tt_unpopulate(bo->tbo.bdev, bo->tbo.ttm);
>>> +        }
>>> +        spin_unlock(&ttm_bo_glob.lru_lock);
>>
>> That approach won't work. ttm_tt_unpopulate() might sleep on an IOMMU lock.
>>
>> You need to use a mutex here or even better make sure you can access the 
>> device_bo_list without a lock at this point.
>>
>> Christian.
>
>
> I can think of switching to an RCU list? Otherwise, elements are added
> on BO create and deleted on BO destroy, so how can I prevent either of those
> from happening while in this section, other than with a mutex? Make a copy
> of the list and run over it instead?
>
> Andrey
>
>
>>
>>> +
>>> +        if (adev->irq.ih.use_bus_addr)
>>> +            amdgpu_ih_ring_fini(adev, &adev->irq.ih);
>>> +        if (adev->irq.ih1.use_bus_addr)
>>> +            amdgpu_ih_ring_fini(adev, &adev->irq.ih1);
>>> +        if (adev->irq.ih2.use_bus_addr)
>>> +            amdgpu_ih_ring_fini(adev, &adev->irq.ih2);
>>> +
>>> +        amdgpu_gart_dummy_page_fini(adev);
>>> +    }
>>> +
>>> +    return NOTIFY_OK;
>>> +}
>>> +
>>> +
>>>   /**
>>>    * amdgpu_device_init - initialize the driver
>>>    *
>>> @@ -3304,6 +3339,8 @@ int amdgpu_device_init(struct amdgpu_device *adev,
>>>         INIT_WORK(&adev->xgmi_reset_work, amdgpu_device_xgmi_reset_func);
>>>   +    INIT_LIST_HEAD(&adev->device_bo_list);
>>> +
>>>       adev->gfx.gfx_off_req_count = 1;
>>>       adev->pm.ac_power = power_supply_is_system_supplied() > 0;
>>>   @@ -3575,6 +3612,15 @@ int amdgpu_device_init(struct amdgpu_device *adev,
>>>       if (amdgpu_device_cache_pci_state(adev->pdev))
>>>           pci_restore_state(pdev);
>>>   +    BLOCKING_INIT_NOTIFIER_HEAD(&adev->notifier);
>>> +    adev->nb.notifier_call = amdgpu_iommu_group_notifier;
>>> +
>>> +    if (adev->dev->iommu_group) {
>>> +        r = iommu_group_register_notifier(adev->dev->iommu_group, &adev->nb);
>>> +        if (r)
>>> +            goto failed;
>>> +    }
>>> +
>>>       return 0;
>>>     failed:
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
>>> index 0db9330..486ad6d 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
>>> @@ -92,7 +92,7 @@ static int amdgpu_gart_dummy_page_init(struct 
>>> amdgpu_device *adev)
>>>    *
>>>    * Frees the dummy page used by the driver (all asics).
>>>    */
>>> -static void amdgpu_gart_dummy_page_fini(struct amdgpu_device *adev)
>>> +void amdgpu_gart_dummy_page_fini(struct amdgpu_device *adev)
>>>   {
>>>       if (!adev->dummy_page_addr)
>>>           return;
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h
>>> index afa2e28..5678d9c 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h
>>> @@ -61,6 +61,7 @@ int amdgpu_gart_table_vram_pin(struct amdgpu_device *adev);
>>>   void amdgpu_gart_table_vram_unpin(struct amdgpu_device *adev);
>>>   int amdgpu_gart_init(struct amdgpu_device *adev);
>>>   void amdgpu_gart_fini(struct amdgpu_device *adev);
>>> +void amdgpu_gart_dummy_page_fini(struct amdgpu_device *adev);
>>>   int amdgpu_gart_unbind(struct amdgpu_device *adev, uint64_t offset,
>>>                  int pages);
>>>   int amdgpu_gart_map(struct amdgpu_device *adev, uint64_t offset,
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>> index 6cc9919..4a1de69 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>> @@ -94,6 +94,10 @@ static void amdgpu_bo_destroy(struct ttm_buffer_object *tbo)
>>>       }
>>>       amdgpu_bo_unref(&bo->parent);
>>>   +    spin_lock(&ttm_bo_glob.lru_lock);
>>> +    list_del(&bo->bo);
>>> +    spin_unlock(&ttm_bo_glob.lru_lock);
>>> +
>>>       kfree(bo->metadata);
>>>       kfree(bo);
>>>   }
>>> @@ -613,6 +617,12 @@ static int amdgpu_bo_do_create(struct amdgpu_device *adev,
>>>       if (bp->type == ttm_bo_type_device)
>>>           bo->flags &= ~AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED;
>>>   +    INIT_LIST_HEAD(&bo->bo);
>>> +
>>> +    spin_lock(&ttm_bo_glob.lru_lock);
>>> +    list_add_tail(&bo->bo, &adev->device_bo_list);
>>> +    spin_unlock(&ttm_bo_glob.lru_lock);
>>> +
>>>       return 0;
>>>     fail_unreserve:
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
>>> index 9ac3756..5ae8555 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
>>> @@ -110,6 +110,8 @@ struct amdgpu_bo {
>>>       struct list_head        shadow_list;
>>>         struct kgd_mem                  *kfd_bo;
>>> +
>>> +    struct list_head        bo;
>>>   };
>>>     static inline struct amdgpu_bo *ttm_to_amdgpu_bo(struct 
>>> ttm_buffer_object *tbo)
>>
>> _______________________________________________
>> amd-gfx mailing list
>> amd-gfx@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>>
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 196+ messages in thread

* Re: [PATCH v4 07/14] drm/amdgpu: Register IOMMU topology notifier per device.
  2021-01-20 19:38         ` Andrey Grodzovsky
@ 2021-01-21 10:42           ` Christian König
  -1 siblings, 0 replies; 196+ messages in thread
From: Christian König @ 2021-01-21 10:42 UTC (permalink / raw)
  To: Andrey Grodzovsky, amd-gfx, dri-devel, daniel.vetter, robh,
	l.stach, yuq825, eric
  Cc: Alexander.Deucher, gregkh

On 20.01.21 at 20:38, Andrey Grodzovsky wrote:
> Ping
>
> Andrey
>
> On 1/20/21 12:01 AM, Andrey Grodzovsky wrote:
>>
>> On 1/19/21 3:48 AM, Christian König wrote:
>>> On 18.01.21 at 22:01, Andrey Grodzovsky wrote:
>>>> Handle all DMA IOMMU group related dependencies before the
>>>> group is removed.
>>>>
>>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>>>> ---
>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu.h        |  5 ++++
>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 46 
>>>> ++++++++++++++++++++++++++++++
>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c   |  2 +-
>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h   |  1 +
>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 10 +++++++
>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_object.h |  2 ++
>>>>   6 files changed, 65 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>> index 478a7d8..2953420 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>> @@ -51,6 +51,7 @@
>>>>   #include <linux/dma-fence.h>
>>>>   #include <linux/pci.h>
>>>>   #include <linux/aer.h>
>>>> +#include <linux/notifier.h>
>>>>     #include <drm/ttm/ttm_bo_api.h>
>>>>   #include <drm/ttm/ttm_bo_driver.h>
>>>> @@ -1041,6 +1042,10 @@ struct amdgpu_device {
>>>>         bool                            in_pci_err_recovery;
>>>>       struct pci_saved_state          *pci_state;
>>>> +
>>>> +    struct notifier_block        nb;
>>>> +    struct blocking_notifier_head    notifier;
>>>> +    struct list_head        device_bo_list;
>>>>   };
>>>>     static inline struct amdgpu_device *drm_to_adev(struct 
>>>> drm_device *ddev)
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>> index 45e23e3..e99f4f1 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>> @@ -70,6 +70,8 @@
>>>>   #include <drm/task_barrier.h>
>>>>   #include <linux/pm_runtime.h>
>>>>   +#include <linux/iommu.h>
>>>> +
>>>>   MODULE_FIRMWARE("amdgpu/vega10_gpu_info.bin");
>>>>   MODULE_FIRMWARE("amdgpu/vega12_gpu_info.bin");
>>>>   MODULE_FIRMWARE("amdgpu/raven_gpu_info.bin");
>>>> @@ -3200,6 +3202,39 @@ static const struct attribute 
>>>> *amdgpu_dev_attributes[] = {
>>>>   };
>>>>     +static int amdgpu_iommu_group_notifier(struct notifier_block *nb,
>>>> +                     unsigned long action, void *data)
>>>> +{
>>>> +    struct amdgpu_device *adev = container_of(nb, struct 
>>>> amdgpu_device, nb);
>>>> +    struct amdgpu_bo *bo = NULL;
>>>> +
>>>> +    /*
>>>> +     * Following is a set of IOMMU group dependencies taken care 
>>>> of before
>>>> +     * device's IOMMU group is removed
>>>> +     */
>>>> +    if (action == IOMMU_GROUP_NOTIFY_DEL_DEVICE) {
>>>> +
>>>> +        spin_lock(&ttm_bo_glob.lru_lock);
>>>> +        list_for_each_entry(bo, &adev->device_bo_list, bo) {
>>>> +            if (bo->tbo.ttm)
>>>> +                ttm_tt_unpopulate(bo->tbo.bdev, bo->tbo.ttm);
>>>> +        }
>>>> +        spin_unlock(&ttm_bo_glob.lru_lock);
>>>
>>> That approach won't work. ttm_tt_unpopulate() might sleep on an 
>>> IOMMU lock.
>>>
>>> You need to use a mutex here or even better make sure you can access 
>>> the device_bo_list without a lock at this point.
>>>
>>> Christian.
>>
>>
>> I can think of switching to an RCU list? Otherwise, elements are added
>> on BO create and deleted on BO destroy, so how can I prevent either of
>> those from happening while in this section, other than with a mutex?
>> Make a copy of the list and run over it instead?

RCU won't work since the BO is not RCU protected.

What you can try is something like this:

spin_lock(&ttm_bo_glob.lru_lock);
while (!list_empty(&adev->device_bo_list)) {
	bo = list_first_entry(&adev->device_bo_list, struct amdgpu_bo, bo);
	/* Unlink the BO, then drop the spinlock before the sleeping call. */
	list_del_init(&bo->bo);
	spin_unlock(&ttm_bo_glob.lru_lock);
	if (bo->tbo.ttm)
		ttm_tt_unpopulate(bo->tbo.bdev, bo->tbo.ttm);
	spin_lock(&ttm_bo_glob.lru_lock);
}
spin_unlock(&ttm_bo_glob.lru_lock);

Regards,
Christian.
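
For illustration, here is a sketch of how that drain loop could slot into
the amdgpu_iommu_group_notifier() from the patch. This is a sketch only,
not the final patch: it is trimmed to the BO handling, omits the IH ring
and dummy page teardown from the original hunk, and ignores BO refcounting
across the unlock:

static int amdgpu_iommu_group_notifier(struct notifier_block *nb,
				       unsigned long action, void *data)
{
	struct amdgpu_device *adev = container_of(nb, struct amdgpu_device, nb);
	struct amdgpu_bo *bo;

	if (action != IOMMU_GROUP_NOTIFY_DEL_DEVICE)
		return NOTIFY_OK;

	spin_lock(&ttm_bo_glob.lru_lock);
	while (!list_empty(&adev->device_bo_list)) {
		bo = list_first_entry(&adev->device_bo_list,
				      struct amdgpu_bo, bo);
		/* Unlink first so the loop makes progress, then drop the
		 * spinlock: ttm_tt_unpopulate() may sleep on an IOMMU lock. */
		list_del_init(&bo->bo);
		spin_unlock(&ttm_bo_glob.lru_lock);
		if (bo->tbo.ttm)
			ttm_tt_unpopulate(bo->tbo.bdev, bo->tbo.ttm);
		spin_lock(&ttm_bo_glob.lru_lock);
	}
	spin_unlock(&ttm_bo_glob.lru_lock);

	return NOTIFY_OK;
}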

>>
>> Andrey
>>
>>
>>>
>>>> +
>>>> +        if (adev->irq.ih.use_bus_addr)
>>>> +            amdgpu_ih_ring_fini(adev, &adev->irq.ih);
>>>> +        if (adev->irq.ih1.use_bus_addr)
>>>> +            amdgpu_ih_ring_fini(adev, &adev->irq.ih1);
>>>> +        if (adev->irq.ih2.use_bus_addr)
>>>> +            amdgpu_ih_ring_fini(adev, &adev->irq.ih2);
>>>> +
>>>> +        amdgpu_gart_dummy_page_fini(adev);
>>>> +    }
>>>> +
>>>> +    return NOTIFY_OK;
>>>> +}
>>>> +
>>>> +
>>>>   /**
>>>>    * amdgpu_device_init - initialize the driver
>>>>    *
>>>> @@ -3304,6 +3339,8 @@ int amdgpu_device_init(struct amdgpu_device 
>>>> *adev,
>>>>         INIT_WORK(&adev->xgmi_reset_work, 
>>>> amdgpu_device_xgmi_reset_func);
>>>>   +    INIT_LIST_HEAD(&adev->device_bo_list);
>>>> +
>>>>       adev->gfx.gfx_off_req_count = 1;
>>>>       adev->pm.ac_power = power_supply_is_system_supplied() > 0;
>>>>   @@ -3575,6 +3612,15 @@ int amdgpu_device_init(struct 
>>>> amdgpu_device *adev,
>>>>       if (amdgpu_device_cache_pci_state(adev->pdev))
>>>>           pci_restore_state(pdev);
>>>>   +    BLOCKING_INIT_NOTIFIER_HEAD(&adev->notifier);
>>>> +    adev->nb.notifier_call = amdgpu_iommu_group_notifier;
>>>> +
>>>> +    if (adev->dev->iommu_group) {
>>>> +        r = iommu_group_register_notifier(adev->dev->iommu_group, 
>>>> &adev->nb);
>>>> +        if (r)
>>>> +            goto failed;
>>>> +    }
>>>> +
>>>>       return 0;
>>>>     failed:
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c 
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
>>>> index 0db9330..486ad6d 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
>>>> @@ -92,7 +92,7 @@ static int amdgpu_gart_dummy_page_init(struct 
>>>> amdgpu_device *adev)
>>>>    *
>>>>    * Frees the dummy page used by the driver (all asics).
>>>>    */
>>>> -static void amdgpu_gart_dummy_page_fini(struct amdgpu_device *adev)
>>>> +void amdgpu_gart_dummy_page_fini(struct amdgpu_device *adev)
>>>>   {
>>>>       if (!adev->dummy_page_addr)
>>>>           return;
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h 
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h
>>>> index afa2e28..5678d9c 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h
>>>> @@ -61,6 +61,7 @@ int amdgpu_gart_table_vram_pin(struct 
>>>> amdgpu_device *adev);
>>>>   void amdgpu_gart_table_vram_unpin(struct amdgpu_device *adev);
>>>>   int amdgpu_gart_init(struct amdgpu_device *adev);
>>>>   void amdgpu_gart_fini(struct amdgpu_device *adev);
>>>> +void amdgpu_gart_dummy_page_fini(struct amdgpu_device *adev);
>>>>   int amdgpu_gart_unbind(struct amdgpu_device *adev, uint64_t offset,
>>>>                  int pages);
>>>>   int amdgpu_gart_map(struct amdgpu_device *adev, uint64_t offset,
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c 
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>>> index 6cc9919..4a1de69 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>>> @@ -94,6 +94,10 @@ static void amdgpu_bo_destroy(struct 
>>>> ttm_buffer_object *tbo)
>>>>       }
>>>>       amdgpu_bo_unref(&bo->parent);
>>>>   +    spin_lock(&ttm_bo_glob.lru_lock);
>>>> +    list_del(&bo->bo);
>>>> +    spin_unlock(&ttm_bo_glob.lru_lock);
>>>> +
>>>>       kfree(bo->metadata);
>>>>       kfree(bo);
>>>>   }
>>>> @@ -613,6 +617,12 @@ static int amdgpu_bo_do_create(struct 
>>>> amdgpu_device *adev,
>>>>       if (bp->type == ttm_bo_type_device)
>>>>           bo->flags &= ~AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED;
>>>>   +    INIT_LIST_HEAD(&bo->bo);
>>>> +
>>>> +    spin_lock(&ttm_bo_glob.lru_lock);
>>>> +    list_add_tail(&bo->bo, &adev->device_bo_list);
>>>> +    spin_unlock(&ttm_bo_glob.lru_lock);
>>>> +
>>>>       return 0;
>>>>     fail_unreserve:
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h 
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
>>>> index 9ac3756..5ae8555 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
>>>> @@ -110,6 +110,8 @@ struct amdgpu_bo {
>>>>       struct list_head        shadow_list;
>>>>         struct kgd_mem                  *kfd_bo;
>>>> +
>>>> +    struct list_head        bo;
>>>>   };
>>>>     static inline struct amdgpu_bo *ttm_to_amdgpu_bo(struct 
>>>> ttm_buffer_object *tbo)
>>>
>>> _______________________________________________
>>> amd-gfx mailing list
>>> amd-gfx@lists.freedesktop.org
>>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>>>

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 196+ messages in thread

* Re: [PATCH v4 07/14] drm/amdgpu: Register IOMMU topology notifier per device.
@ 2021-01-21 10:42           ` Christian König
  0 siblings, 0 replies; 196+ messages in thread
From: Christian König @ 2021-01-21 10:42 UTC (permalink / raw)
  To: Andrey Grodzovsky, amd-gfx, dri-devel, daniel.vetter, robh,
	l.stach, yuq825, eric
  Cc: Alexander.Deucher, gregkh, ppaalanen, Harry.Wentland

On 20.01.21 at 20:38, Andrey Grodzovsky wrote:
> Ping
>
> Andrey
>
> On 1/20/21 12:01 AM, Andrey Grodzovsky wrote:
>>
>> On 1/19/21 3:48 AM, Christian König wrote:
>>> On 18.01.21 at 22:01, Andrey Grodzovsky wrote:
>>>> Handle all DMA IOMMU group related dependencies before the
>>>> group is removed.
>>>>
>>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>>>> ---
>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu.h        |  5 ++++
>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 46 
>>>> ++++++++++++++++++++++++++++++
>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c   |  2 +-
>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h   |  1 +
>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 10 +++++++
>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_object.h |  2 ++
>>>>   6 files changed, 65 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>> index 478a7d8..2953420 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>> @@ -51,6 +51,7 @@
>>>>   #include <linux/dma-fence.h>
>>>>   #include <linux/pci.h>
>>>>   #include <linux/aer.h>
>>>> +#include <linux/notifier.h>
>>>>     #include <drm/ttm/ttm_bo_api.h>
>>>>   #include <drm/ttm/ttm_bo_driver.h>
>>>> @@ -1041,6 +1042,10 @@ struct amdgpu_device {
>>>>         bool                            in_pci_err_recovery;
>>>>       struct pci_saved_state          *pci_state;
>>>> +
>>>> +    struct notifier_block        nb;
>>>> +    struct blocking_notifier_head    notifier;
>>>> +    struct list_head        device_bo_list;
>>>>   };
>>>>     static inline struct amdgpu_device *drm_to_adev(struct 
>>>> drm_device *ddev)
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>> index 45e23e3..e99f4f1 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>> @@ -70,6 +70,8 @@
>>>>   #include <drm/task_barrier.h>
>>>>   #include <linux/pm_runtime.h>
>>>>   +#include <linux/iommu.h>
>>>> +
>>>>   MODULE_FIRMWARE("amdgpu/vega10_gpu_info.bin");
>>>>   MODULE_FIRMWARE("amdgpu/vega12_gpu_info.bin");
>>>>   MODULE_FIRMWARE("amdgpu/raven_gpu_info.bin");
>>>> @@ -3200,6 +3202,39 @@ static const struct attribute 
>>>> *amdgpu_dev_attributes[] = {
>>>>   };
>>>>     +static int amdgpu_iommu_group_notifier(struct notifier_block *nb,
>>>> +                     unsigned long action, void *data)
>>>> +{
>>>> +    struct amdgpu_device *adev = container_of(nb, struct 
>>>> amdgpu_device, nb);
>>>> +    struct amdgpu_bo *bo = NULL;
>>>> +
>>>> +    /*
>>>> +     * Following is a set of IOMMU group dependencies taken care 
>>>> of before
>>>> +     * device's IOMMU group is removed
>>>> +     */
>>>> +    if (action == IOMMU_GROUP_NOTIFY_DEL_DEVICE) {
>>>> +
>>>> +        spin_lock(&ttm_bo_glob.lru_lock);
>>>> +        list_for_each_entry(bo, &adev->device_bo_list, bo) {
>>>> +            if (bo->tbo.ttm)
>>>> +                ttm_tt_unpopulate(bo->tbo.bdev, bo->tbo.ttm);
>>>> +        }
>>>> +        spin_unlock(&ttm_bo_glob.lru_lock);
>>>
>>> That approach won't work. ttm_tt_unpopulate() might sleep on an 
>>> IOMMU lock.
>>>
>>> You need to use a mutex here or even better make sure you can access 
>>> the device_bo_list without a lock at this point.
>>>
>>> Christian.
>>
>>
>> I can think of switching to an RCU list? Otherwise, elements are added
>> on BO create and deleted on BO destroy, so how can I prevent either of
>> those from happening while in this section, other than with a mutex?
>> Make a copy of the list and run over it instead?

RCU won't work since the BO is not RCU protected.

What you can try is something like this:

spin_lock(&ttm_bo_glob.lru_lock);
while (!list_empty(&adev->device_bo_list)) {
	bo = list_first_entry(&adev->device_bo_list, struct amdgpu_bo, bo);
	/* Unlink the BO, then drop the spinlock before the sleeping call. */
	list_del_init(&bo->bo);
	spin_unlock(&ttm_bo_glob.lru_lock);
	if (bo->tbo.ttm)
		ttm_tt_unpopulate(bo->tbo.bdev, bo->tbo.ttm);
	spin_lock(&ttm_bo_glob.lru_lock);
}
spin_unlock(&ttm_bo_glob.lru_lock);

Regards,
Christian.
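
For illustration, here is a sketch of how that drain loop could slot into
the amdgpu_iommu_group_notifier() from the patch. This is a sketch only,
not the final patch: it is trimmed to the BO handling, omits the IH ring
and dummy page teardown from the original hunk, and ignores BO refcounting
across the unlock:

static int amdgpu_iommu_group_notifier(struct notifier_block *nb,
				       unsigned long action, void *data)
{
	struct amdgpu_device *adev = container_of(nb, struct amdgpu_device, nb);
	struct amdgpu_bo *bo;

	if (action != IOMMU_GROUP_NOTIFY_DEL_DEVICE)
		return NOTIFY_OK;

	spin_lock(&ttm_bo_glob.lru_lock);
	while (!list_empty(&adev->device_bo_list)) {
		bo = list_first_entry(&adev->device_bo_list,
				      struct amdgpu_bo, bo);
		/* Unlink first so the loop makes progress, then drop the
		 * spinlock: ttm_tt_unpopulate() may sleep on an IOMMU lock. */
		list_del_init(&bo->bo);
		spin_unlock(&ttm_bo_glob.lru_lock);
		if (bo->tbo.ttm)
			ttm_tt_unpopulate(bo->tbo.bdev, bo->tbo.ttm);
		spin_lock(&ttm_bo_glob.lru_lock);
	}
	spin_unlock(&ttm_bo_glob.lru_lock);

	return NOTIFY_OK;
}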

>>
>> Andrey
>>
>>
>>>
>>>> +
>>>> +        if (adev->irq.ih.use_bus_addr)
>>>> +            amdgpu_ih_ring_fini(adev, &adev->irq.ih);
>>>> +        if (adev->irq.ih1.use_bus_addr)
>>>> +            amdgpu_ih_ring_fini(adev, &adev->irq.ih1);
>>>> +        if (adev->irq.ih2.use_bus_addr)
>>>> +            amdgpu_ih_ring_fini(adev, &adev->irq.ih2);
>>>> +
>>>> +        amdgpu_gart_dummy_page_fini(adev);
>>>> +    }
>>>> +
>>>> +    return NOTIFY_OK;
>>>> +}
>>>> +
>>>> +
>>>>   /**
>>>>    * amdgpu_device_init - initialize the driver
>>>>    *
>>>> @@ -3304,6 +3339,8 @@ int amdgpu_device_init(struct amdgpu_device 
>>>> *adev,
>>>>         INIT_WORK(&adev->xgmi_reset_work, 
>>>> amdgpu_device_xgmi_reset_func);
>>>>   +    INIT_LIST_HEAD(&adev->device_bo_list);
>>>> +
>>>>       adev->gfx.gfx_off_req_count = 1;
>>>>       adev->pm.ac_power = power_supply_is_system_supplied() > 0;
>>>>   @@ -3575,6 +3612,15 @@ int amdgpu_device_init(struct 
>>>> amdgpu_device *adev,
>>>>       if (amdgpu_device_cache_pci_state(adev->pdev))
>>>>           pci_restore_state(pdev);
>>>>   +    BLOCKING_INIT_NOTIFIER_HEAD(&adev->notifier);
>>>> +    adev->nb.notifier_call = amdgpu_iommu_group_notifier;
>>>> +
>>>> +    if (adev->dev->iommu_group) {
>>>> +        r = iommu_group_register_notifier(adev->dev->iommu_group, 
>>>> &adev->nb);
>>>> +        if (r)
>>>> +            goto failed;
>>>> +    }
>>>> +
>>>>       return 0;
>>>>     failed:
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c 
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
>>>> index 0db9330..486ad6d 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
>>>> @@ -92,7 +92,7 @@ static int amdgpu_gart_dummy_page_init(struct 
>>>> amdgpu_device *adev)
>>>>    *
>>>>    * Frees the dummy page used by the driver (all asics).
>>>>    */
>>>> -static void amdgpu_gart_dummy_page_fini(struct amdgpu_device *adev)
>>>> +void amdgpu_gart_dummy_page_fini(struct amdgpu_device *adev)
>>>>   {
>>>>       if (!adev->dummy_page_addr)
>>>>           return;
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h 
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h
>>>> index afa2e28..5678d9c 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h
>>>> @@ -61,6 +61,7 @@ int amdgpu_gart_table_vram_pin(struct 
>>>> amdgpu_device *adev);
>>>>   void amdgpu_gart_table_vram_unpin(struct amdgpu_device *adev);
>>>>   int amdgpu_gart_init(struct amdgpu_device *adev);
>>>>   void amdgpu_gart_fini(struct amdgpu_device *adev);
>>>> +void amdgpu_gart_dummy_page_fini(struct amdgpu_device *adev);
>>>>   int amdgpu_gart_unbind(struct amdgpu_device *adev, uint64_t offset,
>>>>                  int pages);
>>>>   int amdgpu_gart_map(struct amdgpu_device *adev, uint64_t offset,
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c 
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>>> index 6cc9919..4a1de69 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>>> @@ -94,6 +94,10 @@ static void amdgpu_bo_destroy(struct 
>>>> ttm_buffer_object *tbo)
>>>>       }
>>>>       amdgpu_bo_unref(&bo->parent);
>>>>   +    spin_lock(&ttm_bo_glob.lru_lock);
>>>> +    list_del(&bo->bo);
>>>> +    spin_unlock(&ttm_bo_glob.lru_lock);
>>>> +
>>>>       kfree(bo->metadata);
>>>>       kfree(bo);
>>>>   }
>>>> @@ -613,6 +617,12 @@ static int amdgpu_bo_do_create(struct 
>>>> amdgpu_device *adev,
>>>>       if (bp->type == ttm_bo_type_device)
>>>>           bo->flags &= ~AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED;
>>>>   +    INIT_LIST_HEAD(&bo->bo);
>>>> +
>>>> +    spin_lock(&ttm_bo_glob.lru_lock);
>>>> +    list_add_tail(&bo->bo, &adev->device_bo_list);
>>>> +    spin_unlock(&ttm_bo_glob.lru_lock);
>>>> +
>>>>       return 0;
>>>>     fail_unreserve:
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h 
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
>>>> index 9ac3756..5ae8555 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
>>>> @@ -110,6 +110,8 @@ struct amdgpu_bo {
>>>>       struct list_head        shadow_list;
>>>>         struct kgd_mem                  *kfd_bo;
>>>> +
>>>> +    struct list_head        bo;
>>>>   };
>>>>     static inline struct amdgpu_bo *ttm_to_amdgpu_bo(struct 
>>>> ttm_buffer_object *tbo)
>>>
>>> _______________________________________________
>>> amd-gfx mailing list
>>> amd-gfx@lists.freedesktop.org
>>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>>>

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 196+ messages in thread

* Re: [PATCH v4 07/14] drm/amdgpu: Register IOMMU topology notifier per device.
       [not found]               ` <1a5f7ccb-1f91-91be-1cb1-e7cb43ac2c13@amd.com>
@ 2021-01-21 10:48                   ` Daniel Vetter
  0 siblings, 0 replies; 196+ messages in thread
From: Daniel Vetter @ 2021-01-21 10:48 UTC (permalink / raw)
  To: Andrey Grodzovsky
  Cc: Greg KH, dri-devel, amd-gfx list, Alex Deucher,
	Christian König, Qiang Yu

On Wed, Jan 20, 2021 at 8:16 PM Andrey Grodzovsky
<Andrey.Grodzovsky@amd.com> wrote:
>
>
> On 1/20/21 3:38 AM, Daniel Vetter wrote:
> > On Wed, Jan 20, 2021 at 5:21 AM Andrey Grodzovsky
> > <Andrey.Grodzovsky@amd.com> wrote:
> >>
> >> On 1/19/21 5:01 PM, Daniel Vetter wrote:
> >>> On Tue, Jan 19, 2021 at 10:22 PM Andrey Grodzovsky
> >>> <Andrey.Grodzovsky@amd.com> wrote:
> >>>> On 1/19/21 8:45 AM, Daniel Vetter wrote:
> >>>>
> >>>> On Tue, Jan 19, 2021 at 09:48:03AM +0100, Christian König wrote:
> >>>>
> >>>> On 18.01.21 at 22:01, Andrey Grodzovsky wrote:
> >>>>
> >>>> Handle all DMA IOMMU group related dependencies before the
> >>>> group is removed.
> >>>>
> >>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> >>>> ---
> >>>>     drivers/gpu/drm/amd/amdgpu/amdgpu.h        |  5 ++++
> >>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 46 ++++++++++++++++++++++++++++++
> >>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c   |  2 +-
> >>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h   |  1 +
> >>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 10 +++++++
> >>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_object.h |  2 ++
> >>>>     6 files changed, 65 insertions(+), 1 deletion(-)
> >>>>
> >>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> >>>> index 478a7d8..2953420 100644
> >>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> >>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> >>>> @@ -51,6 +51,7 @@
> >>>>     #include <linux/dma-fence.h>
> >>>>     #include <linux/pci.h>
> >>>>     #include <linux/aer.h>
> >>>> +#include <linux/notifier.h>
> >>>>     #include <drm/ttm/ttm_bo_api.h>
> >>>>     #include <drm/ttm/ttm_bo_driver.h>
> >>>> @@ -1041,6 +1042,10 @@ struct amdgpu_device {
> >>>>     bool                            in_pci_err_recovery;
> >>>>     struct pci_saved_state          *pci_state;
> >>>> +
> >>>> + struct notifier_block nb;
> >>>> + struct blocking_notifier_head notifier;
> >>>> + struct list_head device_bo_list;
> >>>>     };
> >>>>     static inline struct amdgpu_device *drm_to_adev(struct drm_device *ddev)
> >>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> >>>> index 45e23e3..e99f4f1 100644
> >>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> >>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> >>>> @@ -70,6 +70,8 @@
> >>>>     #include <drm/task_barrier.h>
> >>>>     #include <linux/pm_runtime.h>
> >>>> +#include <linux/iommu.h>
> >>>> +
> >>>>     MODULE_FIRMWARE("amdgpu/vega10_gpu_info.bin");
> >>>>     MODULE_FIRMWARE("amdgpu/vega12_gpu_info.bin");
> >>>>     MODULE_FIRMWARE("amdgpu/raven_gpu_info.bin");
> >>>> @@ -3200,6 +3202,39 @@ static const struct attribute *amdgpu_dev_attributes[] = {
> >>>>     };
> >>>> +static int amdgpu_iommu_group_notifier(struct notifier_block *nb,
> >>>> +     unsigned long action, void *data)
> >>>> +{
> >>>> + struct amdgpu_device *adev = container_of(nb, struct amdgpu_device, nb);
> >>>> + struct amdgpu_bo *bo = NULL;
> >>>> +
> >>>> + /*
> >>>> + * Following is a set of IOMMU group dependencies taken care of before
> >>>> + * device's IOMMU group is removed
> >>>> + */
> >>>> + if (action == IOMMU_GROUP_NOTIFY_DEL_DEVICE) {
> >>>> +
> >>>> + spin_lock(&ttm_bo_glob.lru_lock);
> >>>> + list_for_each_entry(bo, &adev->device_bo_list, bo) {
> >>>> + if (bo->tbo.ttm)
> >>>> + ttm_tt_unpopulate(bo->tbo.bdev, bo->tbo.ttm);
> >>>> + }
> >>>> + spin_unlock(&ttm_bo_glob.lru_lock);
> >>>>
> >>>> That approach won't work. ttm_tt_unpopulate() might sleep on an IOMMU lock.
> >>>>
> >>>> You need to use a mutex here or even better make sure you can access the
> >>>> device_bo_list without a lock at this point.
> >>>>
> >>>> I'd also be worried about the notifier mutex getting really badly in the
> >>>> way.
> >>>>
> >>>> Plus I'm worried why we even need this, it sounds a bit like papering over
> >>>> the iommu subsystem. Assuming we clean up all our iommu mappings in our
> >>>> device hotunplug/unload code, why do we still need to have an additional
> >>>> iommu notifier on top, with all kinds of additional headaches? The iommu
> >>>> shouldn't clean up before the devices in its group have cleaned up.
> >>>>
> >>>> I think we need more info here on what the exact problem is first.
> >>>> -Daniel
> >>>>
> >>>>
> >>>> Originally I experienced the crash below on an IOMMU enabled device; it happens post device removal from the PCI topology,
> >>>> during shutdown of the user client holding the last reference to the drm device file (X in my case).
> >>>> The crash occurs because by the time I get to this point the struct device->iommu_group pointer is already NULL,
> >>>> since the IOMMU group for the device is unset during PCI removal. So this contradicts what you said above,
> >>>> that the iommu shouldn't clean up before the devices in its group have cleaned up.
> >>>> So instead of guessing where the right place for all the IOMMU related cleanups is, it makes sense
> >>>> to get a notification from the IOMMU subsystem in the form of the IOMMU_GROUP_NOTIFY_DEL_DEVICE event
> >>>> and use that place to do all the relevant cleanups.
> >>> Yeah that goes boom, but you shouldn't need this special iommu cleanup
> >>> handler. Making sure that all the dma-api mappings are gone needs to
> >>> be done as part of the device hotunplug, you can't delay that to the
> >>> last drm_device cleanup.
> >>>
> >>> So I think most of the patch here, with pulling that out (it should even
> >>> be outright removed from the final release code), is good, just not yet how
> >>> you call that new code. Probably these bits (aside from walking all
> >>> buffers and unpopulating the tt) should be done from the early_free
> >>> callback you're adding.
> >>>
> >>> Also what I just realized: For normal unload you need to make sure the
> >>> hw is actually stopped first, before we unmap buffers. Otherwise
> >>> driver unload will likely result in wedged hw, probably not what you
> >>> want for debugging.
> >>> -Daniel
> >> Since device removal from the IOMMU group, and this hook in particular,
> >> takes place before the call to amdgpu_pci_remove, it essentially means
> >> that for the IOMMU use case the entire amdgpu_device_fini_hw function
> >> should be called here to stop the HW, instead of from amdgpu_pci_remove.
> > The crash you showed was on final drm_close, which should happen after
> > device removal, so that's clearly buggy. If the iommu subsystem
> > removes stuff before the driver could clean up already, then I think
> > that's an iommu bug or dma-api bug. Just plain using dma_map/unmap and
> > friends really shouldn't need notifier hacks like you're implementing
> > here. Can you pls show me a backtrace where dma_unmap_sg blows up when
> > it's put into the pci_driver remove callback?
>
>
> It's not blowing up, and it has the same effect as using this notifier, because the setting
> of the device->iommu_group pointer to NULL takes place in the same call stack but
> after amdgpu_pci_remove is called (see pci_stop_and_remove_bus_device).
> But I think that using the notifier callback is better than just sticking
> the cleanup code in amdgpu_pci_remove, because this is IOMMU specific cleanup and
> it is also coupled
> in the code to the place where device->iommu_group is unset.

Notifiers are a locking pain, plus dma_unmap_* is really just plain
normal driver cleanup work. If you want neat & automatic cleanup, look
at the devm_ family of functions. There's imo really no reason to have
this notifier, and only reasons against it.

I think, as an intermediate step, the cleanest solution is to put each
cleanup into a corresponding hw_fini callback (early_free is maybe a bit
ambiguous about what exactly it means).
-Daniel
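
A minimal sketch of the devm_-based route mentioned above, for reference.
devm_add_action_or_reset() is the real helper from linux/device.h;
amdgpu_unmap_all_dma() is a made-up name standing in for whatever walks
the BOs and releases their DMA mappings:

/* Hypothetical helper: the name and body are illustrative only. */
static void amdgpu_unmap_all_dma(void *data)
{
	struct amdgpu_device *adev = data;

	/* dma_unmap_* / ttm_tt_unpopulate() work for adev's BOs goes here. */
}

/* In amdgpu_device_init(): the action runs automatically when the driver
 * is unbound from the device, with no notifier and no extra locking. */
r = devm_add_action_or_reset(adev->dev, amdgpu_unmap_all_dma, adev);
if (r)
	return r;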

> Andrey
>
>
> >
> >> Looking at this from another perspective, AFAIK on each new device probing,
> >> either due to a PCI bus rescan or a driver reload, we reset the ASIC before doing
> >> any init operations (assuming we successfully gained MMIO access), so maybe
> >> your concern is not an issue?
> > Reset on probe is too late. The problem is that if you just remove the
> > driver, your device is doing dma at that moment. And you kinda have to
> > stop that before you free the mappings/memory. Of course when the
> > device is actually hotunplugged, then dma is guaranteed to have
> > stopped already. I'm not sure whether disabling the pci device is
> > enough to make sure no more dma happens; it could be that it is.
> > -Daniel
> >
> >> Andrey
> >>
> >>
> >>>> Andrey
> >>>>
> >>>>
> >>>> [  123.810074 <   28.126960>] BUG: kernel NULL pointer dereference, address: 00000000000000c8
> >>>> [  123.810080 <    0.000006>] #PF: supervisor read access in kernel mode
> >>>> [  123.810082 <    0.000002>] #PF: error_code(0x0000) - not-present page
> >>>> [  123.810085 <    0.000003>] PGD 0 P4D 0
> >>>> [  123.810089 <    0.000004>] Oops: 0000 [#1] SMP NOPTI
> >>>> [  123.810094 <    0.000005>] CPU: 5 PID: 1418 Comm: Xorg:shlo4 Tainted: G           O      5.9.0-rc2-dev+ #59
> >>>> [  123.810096 <    0.000002>] Hardware name: System manufacturer System Product Name/PRIME X470-PRO, BIOS 4406 02/28/2019
> >>>> [  123.810105 <    0.000009>] RIP: 0010:iommu_get_dma_domain+0x10/0x20
> >>>> [  123.810108 <    0.000003>] Code: b0 48 c7 87 98 00 00 00 00 00 00 00 31 c0 c3 b8 f4 ff ff ff eb a6 0f 1f 40 00 0f 1f 44 00 00 48 8b 87 d0 02 00 00 55 48 89 e5 <48> 8b 80 c8 00 00 00 5d c3 0f 1f 80 00 00 00 00 0f 1f 44 00 00 48
> >>>> [  123.810111 <    0.000003>] RSP: 0018:ffffa2e201f7f980 EFLAGS: 00010246
> >>>> [  123.810114 <    0.000003>] RAX: 0000000000000000 RBX: 0000000000001000 RCX: 0000000000000000
> >>>> [  123.810116 <    0.000002>] RDX: 0000000000001000 RSI: 00000000bf5cb000 RDI: ffff93c259dc60b0
> >>>> [  123.810117 <    0.000001>] RBP: ffffa2e201f7f980 R08: 0000000000000000 R09: 0000000000000000
> >>>> [  123.810119 <    0.000002>] R10: ffffa2e201f7faf0 R11: 0000000000000001 R12: 00000000bf5cb000
> >>>> [  123.810121 <    0.000002>] R13: 0000000000001000 R14: ffff93c24cef9c50 R15: ffff93c256c05688
> >>>> [  123.810124 <    0.000003>] FS:  00007f5e5e8d3700(0000) GS:ffff93c25e940000(0000) knlGS:0000000000000000
> >>>> [  123.810126 <    0.000002>] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> >>>> [  123.810128 <    0.000002>] CR2: 00000000000000c8 CR3: 000000027fe0a000 CR4: 00000000003506e0
> >>>> [  123.810130 <    0.000002>] Call Trace:
> >>>> [  123.810136 <    0.000006>]  __iommu_dma_unmap+0x2e/0x100
> >>>> [  123.810141 <    0.000005>]  ? kfree+0x389/0x3a0
> >>>> [  123.810144 <    0.000003>]  iommu_dma_unmap_page+0xe/0x10
> >>>> [  123.810149 <    0.000005>] dma_unmap_page_attrs+0x4d/0xf0
> >>>> [  123.810159 <    0.000010>]  ? ttm_bo_del_from_lru+0x8e/0xb0 [ttm]
> >>>> [  123.810165 <    0.000006>] ttm_unmap_and_unpopulate_pages+0x8e/0xc0 [ttm]
> >>>> [  123.810252 <    0.000087>] amdgpu_ttm_tt_unpopulate+0xaa/0xd0 [amdgpu]
> >>>> [  123.810258 <    0.000006>]  ttm_tt_unpopulate+0x59/0x70 [ttm]
> >>>> [  123.810264 <    0.000006>]  ttm_tt_destroy+0x6a/0x70 [ttm]
> >>>> [  123.810270 <    0.000006>] ttm_bo_cleanup_memtype_use+0x36/0xa0 [ttm]
> >>>> [  123.810276 <    0.000006>]  ttm_bo_put+0x1e7/0x400 [ttm]
> >>>> [  123.810358 <    0.000082>]  amdgpu_bo_unref+0x1e/0x30 [amdgpu]
> >>>> [  123.810440 <    0.000082>] amdgpu_gem_object_free+0x37/0x50 [amdgpu]
> >>>> [  123.810459 <    0.000019>]  drm_gem_object_free+0x35/0x40 [drm]
> >>>> [  123.810476 <    0.000017>] drm_gem_object_handle_put_unlocked+0x9d/0xd0 [drm]
> >>>> [  123.810494 <    0.000018>] drm_gem_object_release_handle+0x74/0x90 [drm]
> >>>> [  123.810511 <    0.000017>]  ? drm_gem_object_handle_put_unlocked+0xd0/0xd0 [drm]
> >>>> [  123.810516 <    0.000005>]  idr_for_each+0x4d/0xd0
> >>>> [  123.810534 <    0.000018>]  drm_gem_release+0x20/0x30 [drm]
> >>>> [  123.810550 <    0.000016>]  drm_file_free+0x251/0x2a0 [drm]
> >>>> [  123.810567 <    0.000017>] drm_close_helper.isra.14+0x61/0x70 [drm]
> >>>> [  123.810583 <    0.000016>]  drm_release+0x6a/0xe0 [drm]
> >>>> [  123.810588 <    0.000005>]  __fput+0xa2/0x250
> >>>> [  123.810592 <    0.000004>]  ____fput+0xe/0x10
> >>>> [  123.810595 <    0.000003>]  task_work_run+0x6c/0xa0
> >>>> [  123.810600 <    0.000005>]  do_exit+0x376/0xb60
> >>>> [  123.810604 <    0.000004>]  do_group_exit+0x43/0xa0
> >>>> [  123.810608 <    0.000004>]  get_signal+0x18b/0x8e0
> >>>> [  123.810612 <    0.000004>]  ? do_futex+0x595/0xc20
> >>>> [  123.810617 <    0.000005>]  arch_do_signal+0x34/0x880
> >>>> [  123.810620 <    0.000003>]  ? check_preempt_curr+0x50/0x60
> >>>> [  123.810623 <    0.000003>]  ? ttwu_do_wakeup+0x1e/0x160
> >>>> [  123.810626 <    0.000003>]  ? ttwu_do_activate+0x61/0x70
> >>>> [  123.810630 <    0.000004>] exit_to_user_mode_prepare+0x124/0x1b0
> >>>> [  123.810635 <    0.000005>] syscall_exit_to_user_mode+0x31/0x170
> >>>> [  123.810639 <    0.000004>]  do_syscall_64+0x43/0x80
> >>>>
> >>>>
> >>>> Andrey
> >>>>
> >>>>
> >>>>
> >>>> Christian.
> >>>>
> >>>> +
> >>>> + if (adev->irq.ih.use_bus_addr)
> >>>> + amdgpu_ih_ring_fini(adev, &adev->irq.ih);
> >>>> + if (adev->irq.ih1.use_bus_addr)
> >>>> + amdgpu_ih_ring_fini(adev, &adev->irq.ih1);
> >>>> + if (adev->irq.ih2.use_bus_addr)
> >>>> + amdgpu_ih_ring_fini(adev, &adev->irq.ih2);
> >>>> +
> >>>> + amdgpu_gart_dummy_page_fini(adev);
> >>>> + }
> >>>> +
> >>>> + return NOTIFY_OK;
> >>>> +}
> >>>> +
> >>>> +
> >>>>     /**
> >>>>      * amdgpu_device_init - initialize the driver
> >>>>      *
> >>>> @@ -3304,6 +3339,8 @@ int amdgpu_device_init(struct amdgpu_device *adev,
> >>>>     INIT_WORK(&adev->xgmi_reset_work, amdgpu_device_xgmi_reset_func);
> >>>> + INIT_LIST_HEAD(&adev->device_bo_list);
> >>>> +
> >>>>     adev->gfx.gfx_off_req_count = 1;
> >>>>     adev->pm.ac_power = power_supply_is_system_supplied() > 0;
> >>>> @@ -3575,6 +3612,15 @@ int amdgpu_device_init(struct amdgpu_device *adev,
> >>>>     if (amdgpu_device_cache_pci_state(adev->pdev))
> >>>>     pci_restore_state(pdev);
> >>>> + BLOCKING_INIT_NOTIFIER_HEAD(&adev->notifier);
> >>>> + adev->nb.notifier_call = amdgpu_iommu_group_notifier;
> >>>> +
> >>>> + if (adev->dev->iommu_group) {
> >>>> + r = iommu_group_register_notifier(adev->dev->iommu_group, &adev->nb);
> >>>> + if (r)
> >>>> + goto failed;
> >>>> + }
> >>>> +
> >>>>     return 0;
> >>>>     failed:
> >>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
> >>>> index 0db9330..486ad6d 100644
> >>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
> >>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
> >>>> @@ -92,7 +92,7 @@ static int amdgpu_gart_dummy_page_init(struct amdgpu_device *adev)
> >>>>      *
> >>>>      * Frees the dummy page used by the driver (all asics).
> >>>>      */
> >>>> -static void amdgpu_gart_dummy_page_fini(struct amdgpu_device *adev)
> >>>> +void amdgpu_gart_dummy_page_fini(struct amdgpu_device *adev)
> >>>>     {
> >>>>     if (!adev->dummy_page_addr)
> >>>>     return;
> >>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h
> >>>> index afa2e28..5678d9c 100644
> >>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h
> >>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h
> >>>> @@ -61,6 +61,7 @@ int amdgpu_gart_table_vram_pin(struct amdgpu_device *adev);
> >>>>     void amdgpu_gart_table_vram_unpin(struct amdgpu_device *adev);
> >>>>     int amdgpu_gart_init(struct amdgpu_device *adev);
> >>>>     void amdgpu_gart_fini(struct amdgpu_device *adev);
> >>>> +void amdgpu_gart_dummy_page_fini(struct amdgpu_device *adev);
> >>>>     int amdgpu_gart_unbind(struct amdgpu_device *adev, uint64_t offset,
> >>>>           int pages);
> >>>>     int amdgpu_gart_map(struct amdgpu_device *adev, uint64_t offset,
> >>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> >>>> index 6cc9919..4a1de69 100644
> >>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> >>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> >>>> @@ -94,6 +94,10 @@ static void amdgpu_bo_destroy(struct ttm_buffer_object *tbo)
> >>>>     }
> >>>>     amdgpu_bo_unref(&bo->parent);
> >>>> + spin_lock(&ttm_bo_glob.lru_lock);
> >>>> + list_del(&bo->bo);
> >>>> + spin_unlock(&ttm_bo_glob.lru_lock);
> >>>> +
> >>>>     kfree(bo->metadata);
> >>>>     kfree(bo);
> >>>>     }
> >>>> @@ -613,6 +617,12 @@ static int amdgpu_bo_do_create(struct amdgpu_device *adev,
> >>>>     if (bp->type == ttm_bo_type_device)
> >>>>     bo->flags &= ~AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED;
> >>>> + INIT_LIST_HEAD(&bo->bo);
> >>>> +
> >>>> + spin_lock(&ttm_bo_glob.lru_lock);
> >>>> + list_add_tail(&bo->bo, &adev->device_bo_list);
> >>>> + spin_unlock(&ttm_bo_glob.lru_lock);
> >>>> +
> >>>>     return 0;
> >>>>     fail_unreserve:
> >>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
> >>>> index 9ac3756..5ae8555 100644
> >>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
> >>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
> >>>> @@ -110,6 +110,8 @@ struct amdgpu_bo {
> >>>>     struct list_head shadow_list;
> >>>>     struct kgd_mem                  *kfd_bo;
> >>>> +
> >>>> + struct list_head bo;
> >>>>     };
> >>>>     static inline struct amdgpu_bo *ttm_to_amdgpu_bo(struct ttm_buffer_object *tbo)
> >>>
> >
> >



-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 196+ messages in thread

* Re: [PATCH v4 07/14] drm/amdgpu: Register IOMMU topology notifier per device.
@ 2021-01-21 10:48                   ` Daniel Vetter
  0 siblings, 0 replies; 196+ messages in thread
From: Daniel Vetter @ 2021-01-21 10:48 UTC (permalink / raw)
  To: Andrey Grodzovsky
  Cc: Rob Herring, Greg KH, dri-devel, Anholt, Eric, Pekka Paalanen,
	amd-gfx list, Alex Deucher, Lucas Stach, Wentland, Harry,
	Christian König, Qiang Yu

On Wed, Jan 20, 2021 at 8:16 PM Andrey Grodzovsky
<Andrey.Grodzovsky@amd.com> wrote:
>
>
> On 1/20/21 3:38 AM, Daniel Vetter wrote:
> > On Wed, Jan 20, 2021 at 5:21 AM Andrey Grodzovsky
> > <Andrey.Grodzovsky@amd.com> wrote:
> >>
> >> On 1/19/21 5:01 PM, Daniel Vetter wrote:
> >>> On Tue, Jan 19, 2021 at 10:22 PM Andrey Grodzovsky
> >>> <Andrey.Grodzovsky@amd.com> wrote:
> >>>> On 1/19/21 8:45 AM, Daniel Vetter wrote:
> >>>>
> >>>> On Tue, Jan 19, 2021 at 09:48:03AM +0100, Christian König wrote:
> >>>>
> >>>> On 18.01.21 at 22:01, Andrey Grodzovsky wrote:
> >>>>
> >>>> Handle all DMA IOMMU group related dependencies before the
> >>>> group is removed.
> >>>>
> >>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> >>>> ---
> >>>>     drivers/gpu/drm/amd/amdgpu/amdgpu.h        |  5 ++++
> >>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 46 ++++++++++++++++++++++++++++++
> >>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c   |  2 +-
> >>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h   |  1 +
> >>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 10 +++++++
> >>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_object.h |  2 ++
> >>>>     6 files changed, 65 insertions(+), 1 deletion(-)
> >>>>
> >>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> >>>> index 478a7d8..2953420 100644
> >>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> >>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> >>>> @@ -51,6 +51,7 @@
> >>>>     #include <linux/dma-fence.h>
> >>>>     #include <linux/pci.h>
> >>>>     #include <linux/aer.h>
> >>>> +#include <linux/notifier.h>
> >>>>     #include <drm/ttm/ttm_bo_api.h>
> >>>>     #include <drm/ttm/ttm_bo_driver.h>
> >>>> @@ -1041,6 +1042,10 @@ struct amdgpu_device {
> >>>>     bool                            in_pci_err_recovery;
> >>>>     struct pci_saved_state          *pci_state;
> >>>> +
> >>>> + struct notifier_block nb;
> >>>> + struct blocking_notifier_head notifier;
> >>>> + struct list_head device_bo_list;
> >>>>     };
> >>>>     static inline struct amdgpu_device *drm_to_adev(struct drm_device *ddev)
> >>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> >>>> index 45e23e3..e99f4f1 100644
> >>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> >>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> >>>> @@ -70,6 +70,8 @@
> >>>>     #include <drm/task_barrier.h>
> >>>>     #include <linux/pm_runtime.h>
> >>>> +#include <linux/iommu.h>
> >>>> +
> >>>>     MODULE_FIRMWARE("amdgpu/vega10_gpu_info.bin");
> >>>>     MODULE_FIRMWARE("amdgpu/vega12_gpu_info.bin");
> >>>>     MODULE_FIRMWARE("amdgpu/raven_gpu_info.bin");
> >>>> @@ -3200,6 +3202,39 @@ static const struct attribute *amdgpu_dev_attributes[] = {
> >>>>     };
> >>>> +static int amdgpu_iommu_group_notifier(struct notifier_block *nb,
> >>>> +     unsigned long action, void *data)
> >>>> +{
> >>>> + struct amdgpu_device *adev = container_of(nb, struct amdgpu_device, nb);
> >>>> + struct amdgpu_bo *bo = NULL;
> >>>> +
> >>>> + /*
> >>>> + * Following is a set of IOMMU group dependencies taken care of before
> >>>> + * device's IOMMU group is removed
> >>>> + */
> >>>> + if (action == IOMMU_GROUP_NOTIFY_DEL_DEVICE) {
> >>>> +
> >>>> + spin_lock(&ttm_bo_glob.lru_lock);
> >>>> + list_for_each_entry(bo, &adev->device_bo_list, bo) {
> >>>> + if (bo->tbo.ttm)
> >>>> + ttm_tt_unpopulate(bo->tbo.bdev, bo->tbo.ttm);
> >>>> + }
> >>>> + spin_unlock(&ttm_bo_glob.lru_lock);
> >>>>
> >>>> That approach won't work. ttm_tt_unpopulate() might sleep on an IOMMU lock.
> >>>>
> >>>> You need to use a mutex here or even better make sure you can access the
> >>>> device_bo_list without a lock at this point.
> >>>>
> >>>> I'd also be worried about the notifier mutex getting really badly in the
> >>>> way.
> >>>>
> >>>> Plus I'm worried why we even need this, it sounds a bit like papering over
> >>>> the iommu subsystem. Assuming we clean up all our iommu mappings in our
> >>>> device hotunplug/unload code, why do we still need to have an additional
> >>>> iommu notifier on top, with all kinds of additional headaches? The iommu
> >>>> shouldn't clean up before the devices in its group have cleaned up.
> >>>>
> >>>> I think we need more info here on what the exact problem is first.
> >>>> -Daniel
> >>>>
> >>>>
> >>>> Originally I experienced the crash below on an IOMMU enabled device; it happens post device removal from the PCI topology,
> >>>> during shutdown of the user client holding the last reference to the drm device file (X in my case).
> >>>> The crash occurs because by the time I get to this point the struct device->iommu_group pointer is already NULL,
> >>>> since the IOMMU group for the device is unset during PCI removal. So this contradicts what you said above,
> >>>> that the iommu shouldn't clean up before the devices in its group have cleaned up.
> >>>> So instead of guessing where the right place for all the IOMMU related cleanups is, it makes sense
> >>>> to get a notification from the IOMMU subsystem in the form of the IOMMU_GROUP_NOTIFY_DEL_DEVICE event
> >>>> and use that place to do all the relevant cleanups.
> >>> Yeah that goes boom, but you shouldn't need this special iommu cleanup
> >>> handler. Making sure that all the dma-api mappings are gone needs to
> >>> be done as part of the device hotunplug, you can't delay that to the
> >>> last drm_device cleanup.
> >>>
> >>> So I think most of the patch here, with pulling that out (it should even
> >>> be outright removed from the final release code), is good, just not yet how
> >>> you call that new code. Probably these bits (aside from walking all
> >>> buffers and unpopulating the tt) should be done from the early_free
> >>> callback you're adding.
> >>>
> >>> Also what I just realized: For normal unload you need to make sure the
> >>> hw is actually stopped first, before we unmap buffers. Otherwise
> >>> driver unload will likely result in wedged hw, probably not what you
> >>> want for debugging.
> >>> -Daniel
> >> Since device removal from the IOMMU group, and this hook in particular,
> >> takes place before the call to amdgpu_pci_remove, it essentially means
> >> that for the IOMMU use case the entire amdgpu_device_fini_hw function
> >> should be called here to stop the HW, instead of from amdgpu_pci_remove.
> > The crash you showed was on final drm_close, which should happen after
> > device removal, so that's clearly buggy. If the iommu subsystem
> > removes stuff before the driver could clean up already, then I think
> > that's an iommu bug or dma-api bug. Just plain using dma_map/unmap and
> > friends really shouldn't need notifier hacks like you're implementing
> > here. Can you pls show me a backtrace where dma_unmap_sg blows up when
> > it's put into the pci_driver remove callback?
>
>
> It's not blowing up, and it has the same effect as using this notifier, because the setting
> of the device->iommu_group pointer to NULL takes place in the same call stack but
> after amdgpu_pci_remove is called (see pci_stop_and_remove_bus_device).
> But I think that using the notifier callback is better than just sticking
> the cleanup code in amdgpu_pci_remove, because this is IOMMU specific cleanup and
> it is also coupled
> in the code to the place where device->iommu_group is unset.

Notifiers are a locking pain, plus dma_unmap_* is really just plain
normal driver cleanup work. If you want neat & automatic cleanup, look
at the devm_ family of functions. There's imo really no reason to have
this notifier, and only reasons against it.

I think, as an intermediate step, the cleanest solution is to put each
cleanup into a corresponding hw_fini callback (early_free is maybe a bit
ambiguous about what exactly it means).
-Daniel
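
A minimal sketch of the devm_-based route mentioned above, for reference.
devm_add_action_or_reset() is the real helper from linux/device.h;
amdgpu_unmap_all_dma() is a made-up name standing in for whatever walks
the BOs and releases their DMA mappings:

/* Hypothetical helper: the name and body are illustrative only. */
static void amdgpu_unmap_all_dma(void *data)
{
	struct amdgpu_device *adev = data;

	/* dma_unmap_* / ttm_tt_unpopulate() work for adev's BOs goes here. */
}

/* In amdgpu_device_init(): the action runs automatically when the driver
 * is unbound from the device, with no notifier and no extra locking. */
r = devm_add_action_or_reset(adev->dev, amdgpu_unmap_all_dma, adev);
if (r)
	return r;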

> Andrey
>
>
> >
> >> Looking at this from another perspective, AFAIK on each new device probing,
> >> either due to a PCI bus rescan or a driver reload, we reset the ASIC before doing
> >> any init operations (assuming we successfully gained MMIO access), so maybe
> >> your concern is not an issue?
> > Reset on probe is too late. The problem is that if you just remove the
> > driver, your device is doing dma at that moment. And you kinda have to
> > stop that before you free the mappings/memory. Of course when the
> > device is actually hotunplugged, then dma is guaranteed to have
> > stopped already. I'm not sure whether disabling the pci device is
> > enough to make sure no more dma happens; it could be that it is.
> > -Daniel
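
A sketch of the unload ordering described above (amdgpu_free_all_dma_mappings()
is a hypothetical placeholder, not an existing amdgpu function):

	/* 1. Quiesce the device so it stops issuing DMA ... */
	pci_clear_master(pdev);
	pci_disable_device(pdev);

	/* 2. ... and only then tear down the DMA mappings and free memory */
	amdgpu_free_all_dma_mappings(pdev);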
> >
> >> Andrey
> >>
> >>
> >>>> Andrey
> >>>>
> >>>>
> >>>> [  123.810074 <   28.126960>] BUG: kernel NULL pointer dereference, address: 00000000000000c8
> >>>> [  123.810080 <    0.000006>] #PF: supervisor read access in kernel mode
> >>>> [  123.810082 <    0.000002>] #PF: error_code(0x0000) - not-present page
> >>>> [  123.810085 <    0.000003>] PGD 0 P4D 0
> >>>> [  123.810089 <    0.000004>] Oops: 0000 [#1] SMP NOPTI
> >>>> [  123.810094 <    0.000005>] CPU: 5 PID: 1418 Comm: Xorg:shlo4 Tainted: G           O      5.9.0-rc2-dev+ #59
> >>>> [  123.810096 <    0.000002>] Hardware name: System manufacturer System Product Name/PRIME X470-PRO, BIOS 4406 02/28/2019
> >>>> [  123.810105 <    0.000009>] RIP: 0010:iommu_get_dma_domain+0x10/0x20
> >>>> [  123.810108 <    0.000003>] Code: b0 48 c7 87 98 00 00 00 00 00 00 00 31 c0 c3 b8 f4 ff ff ff eb a6 0f 1f 40 00 0f 1f 44 00 00 48 8b 87 d0 02 00 00 55 48 89 e5 <48> 8b 80 c8 00 00 00 5d c3 0f 1f 80 00 00 00 00 0f 1f 44 00 00 48
> >>>> [  123.810111 <    0.000003>] RSP: 0018:ffffa2e201f7f980 EFLAGS: 00010246
> >>>> [  123.810114 <    0.000003>] RAX: 0000000000000000 RBX: 0000000000001000 RCX: 0000000000000000
> >>>> [  123.810116 <    0.000002>] RDX: 0000000000001000 RSI: 00000000bf5cb000 RDI: ffff93c259dc60b0
> >>>> [  123.810117 <    0.000001>] RBP: ffffa2e201f7f980 R08: 0000000000000000 R09: 0000000000000000
> >>>> [  123.810119 <    0.000002>] R10: ffffa2e201f7faf0 R11: 0000000000000001 R12: 00000000bf5cb000
> >>>> [  123.810121 <    0.000002>] R13: 0000000000001000 R14: ffff93c24cef9c50 R15: ffff93c256c05688
> >>>> [  123.810124 <    0.000003>] FS:  00007f5e5e8d3700(0000) GS:ffff93c25e940000(0000) knlGS:0000000000000000
> >>>> [  123.810126 <    0.000002>] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> >>>> [  123.810128 <    0.000002>] CR2: 00000000000000c8 CR3: 000000027fe0a000 CR4: 00000000003506e0
> >>>> [  123.810130 <    0.000002>] Call Trace:
> >>>> [  123.810136 <    0.000006>]  __iommu_dma_unmap+0x2e/0x100
> >>>> [  123.810141 <    0.000005>]  ? kfree+0x389/0x3a0
> >>>> [  123.810144 <    0.000003>]  iommu_dma_unmap_page+0xe/0x10
> >>>> [  123.810149 <    0.000005>] dma_unmap_page_attrs+0x4d/0xf0
> >>>> [  123.810159 <    0.000010>]  ? ttm_bo_del_from_lru+0x8e/0xb0 [ttm]
> >>>> [  123.810165 <    0.000006>] ttm_unmap_and_unpopulate_pages+0x8e/0xc0 [ttm]
> >>>> [  123.810252 <    0.000087>] amdgpu_ttm_tt_unpopulate+0xaa/0xd0 [amdgpu]
> >>>> [  123.810258 <    0.000006>]  ttm_tt_unpopulate+0x59/0x70 [ttm]
> >>>> [  123.810264 <    0.000006>]  ttm_tt_destroy+0x6a/0x70 [ttm]
> >>>> [  123.810270 <    0.000006>] ttm_bo_cleanup_memtype_use+0x36/0xa0 [ttm]
> >>>> [  123.810276 <    0.000006>]  ttm_bo_put+0x1e7/0x400 [ttm]
> >>>> [  123.810358 <    0.000082>]  amdgpu_bo_unref+0x1e/0x30 [amdgpu]
> >>>> [  123.810440 <    0.000082>] amdgpu_gem_object_free+0x37/0x50 [amdgpu]
> >>>> [  123.810459 <    0.000019>]  drm_gem_object_free+0x35/0x40 [drm]
> >>>> [  123.810476 <    0.000017>] drm_gem_object_handle_put_unlocked+0x9d/0xd0 [drm]
> >>>> [  123.810494 <    0.000018>] drm_gem_object_release_handle+0x74/0x90 [drm]
> >>>> [  123.810511 <    0.000017>]  ? drm_gem_object_handle_put_unlocked+0xd0/0xd0 [drm]
> >>>> [  123.810516 <    0.000005>]  idr_for_each+0x4d/0xd0
> >>>> [  123.810534 <    0.000018>]  drm_gem_release+0x20/0x30 [drm]
> >>>> [  123.810550 <    0.000016>]  drm_file_free+0x251/0x2a0 [drm]
> >>>> [  123.810567 <    0.000017>] drm_close_helper.isra.14+0x61/0x70 [drm]
> >>>> [  123.810583 <    0.000016>]  drm_release+0x6a/0xe0 [drm]
> >>>> [  123.810588 <    0.000005>]  __fput+0xa2/0x250
> >>>> [  123.810592 <    0.000004>]  ____fput+0xe/0x10
> >>>> [  123.810595 <    0.000003>]  task_work_run+0x6c/0xa0
> >>>> [  123.810600 <    0.000005>]  do_exit+0x376/0xb60
> >>>> [  123.810604 <    0.000004>]  do_group_exit+0x43/0xa0
> >>>> [  123.810608 <    0.000004>]  get_signal+0x18b/0x8e0
> >>>> [  123.810612 <    0.000004>]  ? do_futex+0x595/0xc20
> >>>> [  123.810617 <    0.000005>]  arch_do_signal+0x34/0x880
> >>>> [  123.810620 <    0.000003>]  ? check_preempt_curr+0x50/0x60
> >>>> [  123.810623 <    0.000003>]  ? ttwu_do_wakeup+0x1e/0x160
> >>>> [  123.810626 <    0.000003>]  ? ttwu_do_activate+0x61/0x70
> >>>> [  123.810630 <    0.000004>] exit_to_user_mode_prepare+0x124/0x1b0
> >>>> [  123.810635 <    0.000005>] syscall_exit_to_user_mode+0x31/0x170
> >>>> [  123.810639 <    0.000004>]  do_syscall_64+0x43/0x80
> >>>>
> >>>>
> >>>> Andrey
> >>>>
> >>>>
> >>>>
> >>>> Christian.
> >>>>
> >>>> +
> >>>> +		if (adev->irq.ih.use_bus_addr)
> >>>> +			amdgpu_ih_ring_fini(adev, &adev->irq.ih);
> >>>> +		if (adev->irq.ih1.use_bus_addr)
> >>>> +			amdgpu_ih_ring_fini(adev, &adev->irq.ih1);
> >>>> +		if (adev->irq.ih2.use_bus_addr)
> >>>> +			amdgpu_ih_ring_fini(adev, &adev->irq.ih2);
> >>>> +
> >>>> +		amdgpu_gart_dummy_page_fini(adev);
> >>>> +	}
> >>>> +
> >>>> +	return NOTIFY_OK;
> >>>> +}
> >>>> +
> >>>> +
> >>>>     /**
> >>>>      * amdgpu_device_init - initialize the driver
> >>>>      *
> >>>> @@ -3304,6 +3339,8 @@ int amdgpu_device_init(struct amdgpu_device *adev,
> >>>>     INIT_WORK(&adev->xgmi_reset_work, amdgpu_device_xgmi_reset_func);
> >>>> +	INIT_LIST_HEAD(&adev->device_bo_list);
> >>>> +
> >>>>     adev->gfx.gfx_off_req_count = 1;
> >>>>     adev->pm.ac_power = power_supply_is_system_supplied() > 0;
> >>>> @@ -3575,6 +3612,15 @@ int amdgpu_device_init(struct amdgpu_device *adev,
> >>>>     if (amdgpu_device_cache_pci_state(adev->pdev))
> >>>>     pci_restore_state(pdev);
> >>>> +	BLOCKING_INIT_NOTIFIER_HEAD(&adev->notifier);
> >>>> +	adev->nb.notifier_call = amdgpu_iommu_group_notifier;
> >>>> +
> >>>> +	if (adev->dev->iommu_group) {
> >>>> +		r = iommu_group_register_notifier(adev->dev->iommu_group, &adev->nb);
> >>>> +		if (r)
> >>>> +			goto failed;
> >>>> +	}
> >>>> +
> >>>>     return 0;
> >>>>     failed:
> >>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
> >>>> index 0db9330..486ad6d 100644
> >>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
> >>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
> >>>> @@ -92,7 +92,7 @@ static int amdgpu_gart_dummy_page_init(struct amdgpu_device *adev)
> >>>>      *
> >>>>      * Frees the dummy page used by the driver (all asics).
> >>>>      */
> >>>> -static void amdgpu_gart_dummy_page_fini(struct amdgpu_device *adev)
> >>>> +void amdgpu_gart_dummy_page_fini(struct amdgpu_device *adev)
> >>>>     {
> >>>>     if (!adev->dummy_page_addr)
> >>>>     return;
> >>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h
> >>>> index afa2e28..5678d9c 100644
> >>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h
> >>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h
> >>>> @@ -61,6 +61,7 @@ int amdgpu_gart_table_vram_pin(struct amdgpu_device *adev);
> >>>>     void amdgpu_gart_table_vram_unpin(struct amdgpu_device *adev);
> >>>>     int amdgpu_gart_init(struct amdgpu_device *adev);
> >>>>     void amdgpu_gart_fini(struct amdgpu_device *adev);
> >>>> +void amdgpu_gart_dummy_page_fini(struct amdgpu_device *adev);
> >>>>     int amdgpu_gart_unbind(struct amdgpu_device *adev, uint64_t offset,
> >>>>           int pages);
> >>>>     int amdgpu_gart_map(struct amdgpu_device *adev, uint64_t offset,
> >>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> >>>> index 6cc9919..4a1de69 100644
> >>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> >>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> >>>> @@ -94,6 +94,10 @@ static void amdgpu_bo_destroy(struct ttm_buffer_object *tbo)
> >>>>     }
> >>>>     amdgpu_bo_unref(&bo->parent);
> >>>> +	spin_lock(&ttm_bo_glob.lru_lock);
> >>>> +	list_del(&bo->bo);
> >>>> +	spin_unlock(&ttm_bo_glob.lru_lock);
> >>>> +
> >>>>     kfree(bo->metadata);
> >>>>     kfree(bo);
> >>>>     }
> >>>> @@ -613,6 +617,12 @@ static int amdgpu_bo_do_create(struct amdgpu_device *adev,
> >>>>     if (bp->type == ttm_bo_type_device)
> >>>>     bo->flags &= ~AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED;
> >>>> +	INIT_LIST_HEAD(&bo->bo);
> >>>> +
> >>>> +	spin_lock(&ttm_bo_glob.lru_lock);
> >>>> +	list_add_tail(&bo->bo, &adev->device_bo_list);
> >>>> +	spin_unlock(&ttm_bo_glob.lru_lock);
> >>>> +
> >>>>     return 0;
> >>>>     fail_unreserve:
> >>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
> >>>> index 9ac3756..5ae8555 100644
> >>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
> >>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
> >>>> @@ -110,6 +110,8 @@ struct amdgpu_bo {
> >>>>     struct list_head shadow_list;
> >>>>     struct kgd_mem                  *kfd_bo;
> >>>> +
> >>>> +	struct list_head		bo;
> >>>>     };
> >>>>     static inline struct amdgpu_bo *ttm_to_amdgpu_bo(struct ttm_buffer_object *tbo)
> >>>
> >
> >



-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 196+ messages in thread

* Re: [PATCH v4 01/14] drm/ttm: Remap all page faults to per process dummy page.
  2021-01-19 13:56     ` Daniel Vetter
@ 2021-01-25 15:28       ` Andrey Grodzovsky
  -1 siblings, 0 replies; 196+ messages in thread
From: Andrey Grodzovsky @ 2021-01-25 15:28 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: gregkh, ckoenig.leichtzumerken, dri-devel, amd-gfx,
	daniel.vetter, Alexander.Deucher, yuq825


On 1/19/21 8:56 AM, Daniel Vetter wrote:
> On Mon, Jan 18, 2021 at 04:01:10PM -0500, Andrey Grodzovsky wrote:
>> On device removal reroute all CPU mappings to dummy page.
>>
>> v3:
>> Remove loop to find DRM file and instead access it
>> by vma->vm_file->private_data. Move dummy page installation
>> into a separate function.
>>
>> v4:
>> Map the entire BOs VA space into on demand allocated dummy page
>> on the first fault for that BO.
>>
>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>> ---
>>   drivers/gpu/drm/ttm/ttm_bo_vm.c | 82 ++++++++++++++++++++++++++++++++++++++++-
>>   include/drm/ttm/ttm_bo_api.h    |  2 +
>>   2 files changed, 83 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c b/drivers/gpu/drm/ttm/ttm_bo_vm.c
>> index 6dc96cf..ed89da3 100644
>> --- a/drivers/gpu/drm/ttm/ttm_bo_vm.c
>> +++ b/drivers/gpu/drm/ttm/ttm_bo_vm.c
>> @@ -34,6 +34,8 @@
>>   #include <drm/ttm/ttm_bo_driver.h>
>>   #include <drm/ttm/ttm_placement.h>
>>   #include <drm/drm_vma_manager.h>
>> +#include <drm/drm_drv.h>
>> +#include <drm/drm_managed.h>
>>   #include <linux/mm.h>
>>   #include <linux/pfn_t.h>
>>   #include <linux/rbtree.h>
>> @@ -380,25 +382,103 @@ vm_fault_t ttm_bo_vm_fault_reserved(struct vm_fault *vmf,
>>   }
>>   EXPORT_SYMBOL(ttm_bo_vm_fault_reserved);
>>   
>> +static void ttm_bo_release_dummy_page(struct drm_device *dev, void *res)
>> +{
>> +	struct page *dummy_page = (struct page *)res;
>> +
>> +	__free_page(dummy_page);
>> +}
>> +
>> +vm_fault_t ttm_bo_vm_dummy_page(struct vm_fault *vmf, pgprot_t prot)
>> +{
>> +	struct vm_area_struct *vma = vmf->vma;
>> +	struct ttm_buffer_object *bo = vma->vm_private_data;
>> +	struct ttm_bo_device *bdev = bo->bdev;
>> +	struct drm_device *ddev = bo->base.dev;
>> +	vm_fault_t ret = VM_FAULT_NOPAGE;
>> +	unsigned long address = vma->vm_start;
>> +	unsigned long num_prefault = (vma->vm_end - vma->vm_start) >> PAGE_SHIFT;
>> +	unsigned long pfn;
>> +	struct page *page;
>> +	int i;
>> +
>> +	/*
>> +	 * Wait for buffer data in transit, due to a pipelined
>> +	 * move.
>> +	 */
>> +	ret = ttm_bo_vm_fault_idle(bo, vmf);
>> +	if (unlikely(ret != 0))
>> +		return ret;
>> +
>> +	/* Allocate new dummy page to map all the VA range in this VMA to it*/
>> +	page = alloc_page(GFP_KERNEL | __GFP_ZERO);
>> +	if (!page)
>> +		return VM_FAULT_OOM;
>> +
>> +	pfn = page_to_pfn(page);
>> +
>> +	/*
>> +	 * Prefault the entire VMA range right away to avoid further faults
>> +	 */
>> +	for (i = 0; i < num_prefault; ++i) {
>> +
>> +		if (unlikely(address >= vma->vm_end))
>> +			break;
>> +
>> +		if (vma->vm_flags & VM_MIXEDMAP)
>> +			ret = vmf_insert_mixed_prot(vma, address,
>> +						    __pfn_to_pfn_t(pfn, PFN_DEV),
>> +						    prot);
>> +		else
>> +			ret = vmf_insert_pfn_prot(vma, address, pfn, prot);
>> +
>> +		/* Never error on prefaulted PTEs */
>> +		if (unlikely((ret & VM_FAULT_ERROR))) {
>> +			if (i == 0)
>> +				return VM_FAULT_NOPAGE;
>> +			else
>> +				break;
>> +		}
>> +
>> +		address += PAGE_SIZE;
>> +	}
>> +
>> +	/* Set the page to be freed using drmm release action */
>> +	if (drmm_add_action_or_reset(ddev, ttm_bo_release_dummy_page, page))
>> +		return VM_FAULT_OOM;
>> +
>> +	return ret;
>> +}
>> +EXPORT_SYMBOL(ttm_bo_vm_dummy_page);
> I think we can lift this entire thing (once the ttm_bo_vm_fault_idle is
> gone) to the drm level, since there's nothing ttm specific in here. Probably stuff
> it into drm_gem.c (but really it's not even gem specific, it's a fully
> generic "replace this vma with dummy pages pls" function).


Once I started on this I noticed that drmm_add_action_or_reset depends
on struct drm_device *ddev = bo->base.dev, and bo is the private data
we embed at the TTM level when setting up the mapping. So lifting this out
forces moving drmm_add_action_or_reset out of this function into every client
that uses it, which separates the logic of page allocation from its release.
So I suggest we keep it as is.
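
For reference, a rough sketch of the generic helper being discussed
(drm_vma_map_dummy_page() is a hypothetical name, not an existing DRM API);
note how the caller would then own freeing the page, which is exactly the
drmm coupling issue above:

vm_fault_t drm_vma_map_dummy_page(struct vm_fault *vmf, pgprot_t prot,
				  struct page **out_page)
{
	struct vm_area_struct *vma = vmf->vma;
	unsigned long addr, pfn;
	struct page *page;
	vm_fault_t ret = VM_FAULT_NOPAGE;

	page = alloc_page(GFP_KERNEL | __GFP_ZERO);
	if (!page)
		return VM_FAULT_OOM;
	pfn = page_to_pfn(page);

	/* Back the entire VMA with the single zeroed dummy page */
	for (addr = vma->vm_start; addr < vma->vm_end; addr += PAGE_SIZE) {
		ret = vmf_insert_pfn_prot(vma, addr, pfn, prot);
		if (ret & VM_FAULT_ERROR)
			break;
	}

	/* The caller must arrange the release, e.g. via
	 * drmm_add_action_or_reset() on its own drm_device */
	*out_page = page;
	return ret;
}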

Andrey


>
> Aside from this nit I think the overall approach you have here is starting
> to look good. Lots of work&polish, but imo we're getting there and can
> start landing stuff soon.
> -Daniel
>
>> +
>>   vm_fault_t ttm_bo_vm_fault(struct vm_fault *vmf)
>>   {
>>   	struct vm_area_struct *vma = vmf->vma;
>>   	pgprot_t prot;
>>   	struct ttm_buffer_object *bo = vma->vm_private_data;
>> +	struct drm_device *ddev = bo->base.dev;
>>   	vm_fault_t ret;
>> +	int idx;
>>   
>>   	ret = ttm_bo_vm_reserve(bo, vmf);
>>   	if (ret)
>>   		return ret;
>>   
>>   	prot = vma->vm_page_prot;
>> -	ret = ttm_bo_vm_fault_reserved(vmf, prot, TTM_BO_VM_NUM_PREFAULT, 1);
>> +	if (drm_dev_enter(ddev, &idx)) {
>> +		ret = ttm_bo_vm_fault_reserved(vmf, prot, TTM_BO_VM_NUM_PREFAULT, 1);
>> +		drm_dev_exit(idx);
>> +	} else {
>> +		ret = ttm_bo_vm_dummy_page(vmf, prot);
>> +	}
>>   	if (ret == VM_FAULT_RETRY && !(vmf->flags & FAULT_FLAG_RETRY_NOWAIT))
>>   		return ret;
>>   
>>   	dma_resv_unlock(bo->base.resv);
>>   
>>   	return ret;
>> +
>> +	return ret;
>>   }
>>   EXPORT_SYMBOL(ttm_bo_vm_fault);
>>   
>> diff --git a/include/drm/ttm/ttm_bo_api.h b/include/drm/ttm/ttm_bo_api.h
>> index e17be32..12fb240 100644
>> --- a/include/drm/ttm/ttm_bo_api.h
>> +++ b/include/drm/ttm/ttm_bo_api.h
>> @@ -643,4 +643,6 @@ void ttm_bo_vm_close(struct vm_area_struct *vma);
>>   int ttm_bo_vm_access(struct vm_area_struct *vma, unsigned long addr,
>>   		     void *buf, int len, int write);
>>   
>> +vm_fault_t ttm_bo_vm_dummy_page(struct vm_fault *vmf, pgprot_t prot);
>> +
>>   #endif
>> -- 
>> 2.7.4
>>
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 196+ messages in thread

* Re: [PATCH v4 01/14] drm/ttm: Remap all page faults to per process dummy page.
  2021-01-25 15:28       ` Andrey Grodzovsky
@ 2021-01-27 14:29         ` Andrey Grodzovsky
  -1 siblings, 0 replies; 196+ messages in thread
From: Andrey Grodzovsky @ 2021-01-27 14:29 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: gregkh, ckoenig.leichtzumerken, dri-devel, amd-gfx,
	daniel.vetter, Alexander.Deucher, yuq825

Hey Daniel, just a ping.

Andrey

On 1/25/21 10:28 AM, Andrey Grodzovsky wrote:
>
> On 1/19/21 8:56 AM, Daniel Vetter wrote:
>> On Mon, Jan 18, 2021 at 04:01:10PM -0500, Andrey Grodzovsky wrote:
>>> On device removal reroute all CPU mappings to dummy page.
>>>
>>> v3:
>>> Remove loop to find DRM file and instead access it
>>> by vma->vm_file->private_data. Move dummy page installation
>>> into a separate function.
>>>
>>> v4:
>>> Map the entire BOs VA space into on demand allocated dummy page
>>> on the first fault for that BO.
>>>
>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>>> ---
>>>   drivers/gpu/drm/ttm/ttm_bo_vm.c | 82 ++++++++++++++++++++++++++++++++++++++++-
>>>   include/drm/ttm/ttm_bo_api.h    |  2 +
>>>   2 files changed, 83 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c b/drivers/gpu/drm/ttm/ttm_bo_vm.c
>>> index 6dc96cf..ed89da3 100644
>>> --- a/drivers/gpu/drm/ttm/ttm_bo_vm.c
>>> +++ b/drivers/gpu/drm/ttm/ttm_bo_vm.c
>>> @@ -34,6 +34,8 @@
>>>   #include <drm/ttm/ttm_bo_driver.h>
>>>   #include <drm/ttm/ttm_placement.h>
>>>   #include <drm/drm_vma_manager.h>
>>> +#include <drm/drm_drv.h>
>>> +#include <drm/drm_managed.h>
>>>   #include <linux/mm.h>
>>>   #include <linux/pfn_t.h>
>>>   #include <linux/rbtree.h>
>>> @@ -380,25 +382,103 @@ vm_fault_t ttm_bo_vm_fault_reserved(struct vm_fault *vmf,
>>>   }
>>>   EXPORT_SYMBOL(ttm_bo_vm_fault_reserved);
>>>   +static void ttm_bo_release_dummy_page(struct drm_device *dev, void *res)
>>> +{
>>> +    struct page *dummy_page = (struct page *)res;
>>> +
>>> +    __free_page(dummy_page);
>>> +}
>>> +
>>> +vm_fault_t ttm_bo_vm_dummy_page(struct vm_fault *vmf, pgprot_t prot)
>>> +{
>>> +    struct vm_area_struct *vma = vmf->vma;
>>> +    struct ttm_buffer_object *bo = vma->vm_private_data;
>>> +    struct ttm_bo_device *bdev = bo->bdev;
>>> +    struct drm_device *ddev = bo->base.dev;
>>> +    vm_fault_t ret = VM_FAULT_NOPAGE;
>>> +    unsigned long address = vma->vm_start;
>>> +    unsigned long num_prefault = (vma->vm_end - vma->vm_start) >> PAGE_SHIFT;
>>> +    unsigned long pfn;
>>> +    struct page *page;
>>> +    int i;
>>> +
>>> +    /*
>>> +     * Wait for buffer data in transit, due to a pipelined
>>> +     * move.
>>> +     */
>>> +    ret = ttm_bo_vm_fault_idle(bo, vmf);
>>> +    if (unlikely(ret != 0))
>>> +        return ret;
>>> +
>>> +    /* Allocate new dummy page to map all the VA range in this VMA to it*/
>>> +    page = alloc_page(GFP_KERNEL | __GFP_ZERO);
>>> +    if (!page)
>>> +        return VM_FAULT_OOM;
>>> +
>>> +    pfn = page_to_pfn(page);
>>> +
>>> +    /*
>>> +     * Prefault the entire VMA range right away to avoid further faults
>>> +     */
>>> +    for (i = 0; i < num_prefault; ++i) {
>>> +
>>> +        if (unlikely(address >= vma->vm_end))
>>> +            break;
>>> +
>>> +        if (vma->vm_flags & VM_MIXEDMAP)
>>> +            ret = vmf_insert_mixed_prot(vma, address,
>>> +                            __pfn_to_pfn_t(pfn, PFN_DEV),
>>> +                            prot);
>>> +        else
>>> +            ret = vmf_insert_pfn_prot(vma, address, pfn, prot);
>>> +
>>> +        /* Never error on prefaulted PTEs */
>>> +        if (unlikely((ret & VM_FAULT_ERROR))) {
>>> +            if (i == 0)
>>> +                return VM_FAULT_NOPAGE;
>>> +            else
>>> +                break;
>>> +        }
>>> +
>>> +        address += PAGE_SIZE;
>>> +    }
>>> +
>>> +    /* Set the page to be freed using drmm release action */
>>> +    if (drmm_add_action_or_reset(ddev, ttm_bo_release_dummy_page, page))
>>> +        return VM_FAULT_OOM;
>>> +
>>> +    return ret;
>>> +}
>>> +EXPORT_SYMBOL(ttm_bo_vm_dummy_page);
>> I think we can lift this entire thing (once the ttm_bo_vm_fault_idle is
>> gone) to the drm level, since there's nothing ttm specific in here. Probably stuff
>> it into drm_gem.c (but really it's not even gem specific, it's a fully
>> generic "replace this vma with dummy pages pls" function).
>
>
> Once I started on this I noticed that drmm_add_action_or_reset depends
> on struct drm_device *ddev = bo->base.dev, and bo is the private data
> we embed at the TTM level when setting up the mapping. So lifting this out
> forces moving drmm_add_action_or_reset out of this function into every client
> that uses it, which separates the logic of page allocation from its release.
> So I suggest we keep it as is.
>
> Andrey
>
>
>>
>> Aside from this nit I think the overall approach you have here is starting
>> to look good. Lots of work&polish, but imo we're getting there and can
>> start landing stuff soon.
>> -Daniel
>>
>>> +
>>>   vm_fault_t ttm_bo_vm_fault(struct vm_fault *vmf)
>>>   {
>>>       struct vm_area_struct *vma = vmf->vma;
>>>       pgprot_t prot;
>>>       struct ttm_buffer_object *bo = vma->vm_private_data;
>>> +    struct drm_device *ddev = bo->base.dev;
>>>       vm_fault_t ret;
>>> +    int idx;
>>>         ret = ttm_bo_vm_reserve(bo, vmf);
>>>       if (ret)
>>>           return ret;
>>>         prot = vma->vm_page_prot;
>>> -    ret = ttm_bo_vm_fault_reserved(vmf, prot, TTM_BO_VM_NUM_PREFAULT, 1);
>>> +    if (drm_dev_enter(ddev, &idx)) {
>>> +        ret = ttm_bo_vm_fault_reserved(vmf, prot, TTM_BO_VM_NUM_PREFAULT, 1);
>>> +        drm_dev_exit(idx);
>>> +    } else {
>>> +        ret = ttm_bo_vm_dummy_page(vmf, prot);
>>> +    }
>>>       if (ret == VM_FAULT_RETRY && !(vmf->flags & FAULT_FLAG_RETRY_NOWAIT))
>>>           return ret;
>>>         dma_resv_unlock(bo->base.resv);
>>>         return ret;
>>> +
>>> +    return ret;
>>>   }
>>>   EXPORT_SYMBOL(ttm_bo_vm_fault);
>>>   diff --git a/include/drm/ttm/ttm_bo_api.h b/include/drm/ttm/ttm_bo_api.h
>>> index e17be32..12fb240 100644
>>> --- a/include/drm/ttm/ttm_bo_api.h
>>> +++ b/include/drm/ttm/ttm_bo_api.h
>>> @@ -643,4 +643,6 @@ void ttm_bo_vm_close(struct vm_area_struct *vma);
>>>   int ttm_bo_vm_access(struct vm_area_struct *vma, unsigned long addr,
>>>                void *buf, int len, int write);
>>>   +vm_fault_t ttm_bo_vm_dummy_page(struct vm_fault *vmf, pgprot_t prot);
>>> +
>>>   #endif
>>> -- 
>>> 2.7.4
>>>
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 196+ messages in thread

* Re: [PATCH v4 11/14] drm/amdgpu: Guard against write accesses after device removal
  2021-01-19 18:59             ` Christian König
@ 2021-01-28 17:23               ` Andrey Grodzovsky
  -1 siblings, 0 replies; 196+ messages in thread
From: Andrey Grodzovsky @ 2021-01-28 17:23 UTC (permalink / raw)
  To: christian.koenig, Daniel Vetter
  Cc: Greg KH, dri-devel, amd-gfx list, Alex Deucher, Qiang Yu



On 1/19/21 1:59 PM, Christian König wrote:
> Am 19.01.21 um 19:22 schrieb Andrey Grodzovsky:
>>
>> On 1/19/21 1:05 PM, Daniel Vetter wrote:
>>> On Tue, Jan 19, 2021 at 4:35 PM Andrey Grodzovsky
>>> <Andrey.Grodzovsky@amd.com> wrote:
>>>> There is really no other way according to this article
> >>>> https://lwn.net/Articles/767885/
>>>>
>>>>
>>>> "A perfect solution seems nearly impossible though; we cannot acquire a 
>>>> mutex on
>>>> the user
>>>> to prevent them from yanking a device and we cannot check for a presence 
>>>> change
>>>> after every
>>>> device access for performance reasons. "
>>>>
> >>>> But I assumed srcu_read_lock should be pretty seamless performance-wise, no?
> >>> The read side is supposed to be dirt cheap, the write side is where we
>>> just stall for all readers to eventually complete on their own.
>>> Definitely should be much cheaper than mmio read, on the mmio write
>>> side it might actually hurt a bit. Otoh I think those don't stall the
>>> cpu by default when they're timing out, so maybe if the overhead is
>>> too much for those, we could omit them?
>>>
>>> Maybe just do a small microbenchmark for these for testing, with a
>>> register that doesn't change hw state. So with and without
>>> drm_dev_enter/exit, and also one with the hw plugged out so that we
>>> have actual timeouts in the transactions.
>>> -Daniel
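
A sketch of the SRCU-backed API under discussion (drm_dev_enter(),
drm_dev_exit() and drm_dev_unplug() are the real DRM helpers; ddev, reg and v
are placeholder variables):

	int idx;

	/* Read side - dirt cheap, an SRCU read lock under the hood */
	if (drm_dev_enter(ddev, &idx)) {
		writel(v, reg);		/* device guaranteed not unplugged */
		drm_dev_exit(idx);
	}

	/* Write side - runs once on hotunplug, stalls until readers drain */
	drm_dev_unplug(ddev);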
>>
>>
>> So, say, writing to some harmless scratch register in a loop many times, both
>> for the plugged and the unplugged case, and measuring the total time delta?
>
> I think we should at least measure the following:
>
> 1. Writing X times to a scratch reg without your patch.
> 2. Writing X times to a scratch reg with your patch.
> 3. Writing X times to a scratch reg with the hardware physically disconnected.
>
> I suggest to repeat that once for Polaris (or older) and once for Vega or Navi.
>
> The SRBM on Polaris is meant to introduce some delay in each access, so it
> might react differently than the newer hardware.
>
> Christian.


See attached results and the testing code. Ran on Polaris (gfx8) and Vega10 (gfx9).

In summary, over 1 million WREG32 calls in a loop, with and without this patch, you get
around 10 ms of accumulated overhead (10 ms / 10^6 writes, i.e. roughly a 10 ns penalty
per WREG32) for using the drm_dev_enter check when writing registers.

P.S. Bullet 3 I cannot test, as I need an eGPU and currently don't have one.

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
index 3763921..1650549 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
@@ -873,6 +873,11 @@ static int gfx_v8_0_ring_test_ring(struct amdgpu_ring *ring)
         if (i >= adev->usec_timeout)
                 r = -ETIMEDOUT;

+       DRM_ERROR("Before write 1M times to scratch register");
+       for (i = 0; i < 1000000; i++)
+               WREG32(scratch, 0xDEADBEEF);
+       DRM_ERROR("After write 1M times to scratch register");
+
  error_free_scratch:
         amdgpu_gfx_scratch_free(adev, scratch);
         return r;
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
index 5f4805e..7ecbfef 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
@@ -1063,6 +1063,11 @@ static int gfx_v9_0_ring_test_ring(struct amdgpu_ring *ring)
         if (i >= adev->usec_timeout)
                 r = -ETIMEDOUT;

+       DRM_ERROR("Before write 1M times to scratch register");
+       for (i = 0; i < 1000000; i++)
+               WREG32(scratch, 0xDEADBEEF);
+       DRM_ERROR("After write 1M times to scratch register");
+
  error_free_scratch:
         amdgpu_gfx_scratch_free(adev, scratch);
         return r;
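
A sketch of how the same loop could be timed with ktime instead of reading the
dmesg timestamp deltas (hypothetical, not part of the posted test; reuses the
scratch register from the diff above):

	ktime_t start, end;
	int i;

	start = ktime_get();
	for (i = 0; i < 1000000; i++)
		WREG32(scratch, 0xDEADBEEF);
	end = ktime_get();
	/* ~10 ms total for 1M writes => ~10 ns per WREG32 */
	DRM_INFO("1M WREG32 took %lld ns\n",
		 ktime_to_ns(ktime_sub(end, start)));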


Andrey





>
>>
>> Andrey
>>
>>
>>>
>>>> The other solution would be as I suggested to keep all the device IO ranges
>>>> reserved and system
>>>> memory pages unfreed until the device is finalized in the driver but Daniel 
>>>> said
>>>> this would upset the PCI layer (the MMIO ranges reservation part).
>>>>
>>>> Andrey
>>>>
>>>>
>>>>
>>>>
>>>> On 1/19/21 3:55 AM, Christian König wrote:
>>>>> Am 18.01.21 um 22:01 schrieb Andrey Grodzovsky:
>>>>>> This should prevent writing to memory or IO ranges possibly
>>>>>> already allocated for other uses after our device is removed.
>>>>> Wow, that adds quite some overhead to every register access. I'm not sure we
>>>>> can do this.
>>>>>
>>>>> Christian.
>>>>>
>>>>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>>>>>> ---
>>>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 57 ++++++++++++++++++++++++
>>>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c    |  9 ++++
>>>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c    | 53 +++++++++++++---------
>>>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h    |  3 ++
>>>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c   | 70 
>>>>>> ++++++++++++++++++++++++++++++
>>>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h   | 49 ++-------------------
>>>>>>    drivers/gpu/drm/amd/amdgpu/psp_v11_0.c     | 16 ++-----
>>>>>>    drivers/gpu/drm/amd/amdgpu/psp_v12_0.c     |  8 +---
>>>>>>    drivers/gpu/drm/amd/amdgpu/psp_v3_1.c      |  8 +---
>>>>>>    9 files changed, 184 insertions(+), 89 deletions(-)
>>>>>>
>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>>> index e99f4f1..0a9d73c 100644
>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>>> @@ -72,6 +72,8 @@
>>>>>>      #include <linux/iommu.h>
>>>>>>    +#include <drm/drm_drv.h>
>>>>>> +
>>>>>>    MODULE_FIRMWARE("amdgpu/vega10_gpu_info.bin");
>>>>>>    MODULE_FIRMWARE("amdgpu/vega12_gpu_info.bin");
>>>>>>    MODULE_FIRMWARE("amdgpu/raven_gpu_info.bin");
>>>>>> @@ -404,13 +406,21 @@ uint8_t amdgpu_mm_rreg8(struct amdgpu_device *adev,
>>>>>> uint32_t offset)
>>>>>>     */
>>>>>>    void amdgpu_mm_wreg8(struct amdgpu_device *adev, uint32_t offset, uint8_t
>>>>>> value)
>>>>>>    {
>>>>>> +    int idx;
>>>>>> +
>>>>>>        if (adev->in_pci_err_recovery)
>>>>>>            return;
>>>>>>    +
>>>>>> +    if (!drm_dev_enter(&adev->ddev, &idx))
>>>>>> +        return;
>>>>>> +
>>>>>>        if (offset < adev->rmmio_size)
>>>>>>            writeb(value, adev->rmmio + offset);
>>>>>>        else
>>>>>>            BUG();
>>>>>> +
>>>>>> +    drm_dev_exit(idx);
>>>>>>    }
>>>>>>      /**
>>>>>> @@ -427,9 +437,14 @@ void amdgpu_device_wreg(struct amdgpu_device *adev,
>>>>>>                uint32_t reg, uint32_t v,
>>>>>>                uint32_t acc_flags)
>>>>>>    {
>>>>>> +    int idx;
>>>>>> +
>>>>>>        if (adev->in_pci_err_recovery)
>>>>>>            return;
>>>>>>    +    if (!drm_dev_enter(&adev->ddev, &idx))
>>>>>> +        return;
>>>>>> +
>>>>>>        if ((reg * 4) < adev->rmmio_size) {
>>>>>>            if (!(acc_flags & AMDGPU_REGS_NO_KIQ) &&
>>>>>>                amdgpu_sriov_runtime(adev) &&
>>>>>> @@ -444,6 +459,8 @@ void amdgpu_device_wreg(struct amdgpu_device *adev,
>>>>>>        }
>>>>>> trace_amdgpu_device_wreg(adev->pdev->device, reg, v);
>>>>>> +
>>>>>> +    drm_dev_exit(idx);
>>>>>>    }
>>>>>>      /*
>>>>>> @@ -454,9 +471,14 @@ void amdgpu_device_wreg(struct amdgpu_device *adev,
>>>>>>    void amdgpu_mm_wreg_mmio_rlc(struct amdgpu_device *adev,
>>>>>>                     uint32_t reg, uint32_t v)
>>>>>>    {
>>>>>> +    int idx;
>>>>>> +
>>>>>>        if (adev->in_pci_err_recovery)
>>>>>>            return;
>>>>>>    +    if (!drm_dev_enter(&adev->ddev, &idx))
>>>>>> +        return;
>>>>>> +
>>>>>>        if (amdgpu_sriov_fullaccess(adev) &&
>>>>>>            adev->gfx.rlc.funcs &&
>>>>>> adev->gfx.rlc.funcs->is_rlcg_access_range) {
>>>>>> @@ -465,6 +487,8 @@ void amdgpu_mm_wreg_mmio_rlc(struct amdgpu_device *adev,
>>>>>>        } else {
>>>>>>            writel(v, ((void __iomem *)adev->rmmio) + (reg * 4));
>>>>>>        }
>>>>>> +
>>>>>> +    drm_dev_exit(idx);
>>>>>>    }
>>>>>>      /**
>>>>>> @@ -499,15 +523,22 @@ u32 amdgpu_io_rreg(struct amdgpu_device *adev, u32 
>>>>>> reg)
>>>>>>     */
>>>>>>    void amdgpu_io_wreg(struct amdgpu_device *adev, u32 reg, u32 v)
>>>>>>    {
>>>>>> +    int idx;
>>>>>> +
>>>>>>        if (adev->in_pci_err_recovery)
>>>>>>            return;
>>>>>>    +    if (!drm_dev_enter(&adev->ddev, &idx))
>>>>>> +        return;
>>>>>> +
>>>>>>        if ((reg * 4) < adev->rio_mem_size)
>>>>>>            iowrite32(v, adev->rio_mem + (reg * 4));
>>>>>>        else {
>>>>>>            iowrite32((reg * 4), adev->rio_mem + (mmMM_INDEX * 4));
>>>>>>            iowrite32(v, adev->rio_mem + (mmMM_DATA * 4));
>>>>>>        }
>>>>>> +
>>>>>> +    drm_dev_exit(idx);
>>>>>>    }
>>>>>>      /**
>>>>>> @@ -544,14 +575,21 @@ u32 amdgpu_mm_rdoorbell(struct amdgpu_device *adev, 
>>>>>> u32
>>>>>> index)
>>>>>>     */
>>>>>>    void amdgpu_mm_wdoorbell(struct amdgpu_device *adev, u32 index, u32 v)
>>>>>>    {
>>>>>> +    int idx;
>>>>>> +
>>>>>>        if (adev->in_pci_err_recovery)
>>>>>>            return;
>>>>>>    +    if (!drm_dev_enter(&adev->ddev, &idx))
>>>>>> +        return;
>>>>>> +
>>>>>>        if (index < adev->doorbell.num_doorbells) {
>>>>>>            writel(v, adev->doorbell.ptr + index);
>>>>>>        } else {
>>>>>>            DRM_ERROR("writing beyond doorbell aperture: 0x%08x!\n", index);
>>>>>>        }
>>>>>> +
>>>>>> +    drm_dev_exit(idx);
>>>>>>    }
>>>>>>      /**
>>>>>> @@ -588,14 +626,21 @@ u64 amdgpu_mm_rdoorbell64(struct amdgpu_device *adev,
>>>>>> u32 index)
>>>>>>     */
>>>>>>    void amdgpu_mm_wdoorbell64(struct amdgpu_device *adev, u32 index, u64 v)
>>>>>>    {
>>>>>> +    int idx;
>>>>>> +
>>>>>>        if (adev->in_pci_err_recovery)
>>>>>>            return;
>>>>>>    +    if (!drm_dev_enter(&adev->ddev, &idx))
>>>>>> +        return;
>>>>>> +
>>>>>>        if (index < adev->doorbell.num_doorbells) {
>>>>>>            atomic64_set((atomic64_t *)(adev->doorbell.ptr + index), v);
>>>>>>        } else {
>>>>>>            DRM_ERROR("writing beyond doorbell aperture: 0x%08x!\n", index);
>>>>>>        }
>>>>>> +
>>>>>> +    drm_dev_exit(idx);
>>>>>>    }
>>>>>>      /**
>>>>>> @@ -682,6 +727,10 @@ void amdgpu_device_indirect_wreg(struct amdgpu_device
>>>>>> *adev,
>>>>>>        unsigned long flags;
>>>>>>        void __iomem *pcie_index_offset;
>>>>>>        void __iomem *pcie_data_offset;
>>>>>> +    int idx;
>>>>>> +
>>>>>> +    if (!drm_dev_enter(&adev->ddev, &idx))
>>>>>> +        return;
>>>>>>          spin_lock_irqsave(&adev->pcie_idx_lock, flags);
>>>>>>        pcie_index_offset = (void __iomem *)adev->rmmio + pcie_index * 4;
>>>>>> @@ -692,6 +741,8 @@ void amdgpu_device_indirect_wreg(struct amdgpu_device 
>>>>>> *adev,
>>>>>>        writel(reg_data, pcie_data_offset);
>>>>>>        readl(pcie_data_offset);
>>>>>> spin_unlock_irqrestore(&adev->pcie_idx_lock, flags);
>>>>>> +
>>>>>> +    drm_dev_exit(idx);
>>>>>>    }
>>>>>>      /**
>>>>>> @@ -711,6 +762,10 @@ void amdgpu_device_indirect_wreg64(struct amdgpu_device
>>>>>> *adev,
>>>>>>        unsigned long flags;
>>>>>>        void __iomem *pcie_index_offset;
>>>>>>        void __iomem *pcie_data_offset;
>>>>>> +    int idx;
>>>>>> +
>>>>>> +    if (!drm_dev_enter(&adev->ddev, &idx))
>>>>>> +        return;
>>>>>>          spin_lock_irqsave(&adev->pcie_idx_lock, flags);
>>>>>>        pcie_index_offset = (void __iomem *)adev->rmmio + pcie_index * 4;
>>>>>> @@ -727,6 +782,8 @@ void amdgpu_device_indirect_wreg64(struct amdgpu_device
>>>>>> *adev,
>>>>>>        writel((u32)(reg_data >> 32), pcie_data_offset);
>>>>>>        readl(pcie_data_offset);
>>>>>> spin_unlock_irqrestore(&adev->pcie_idx_lock, flags);
>>>>>> +
>>>>>> +    drm_dev_exit(idx);
>>>>>>    }
>>>>>>      /**
>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
>>>>>> index fe1a39f..1beb4e6 100644
>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
>>>>>> @@ -31,6 +31,8 @@
>>>>>>    #include "amdgpu_ras.h"
>>>>>>    #include "amdgpu_xgmi.h"
>>>>>>    +#include <drm/drm_drv.h>
>>>>>> +
>>>>>>    /**
>>>>>>     * amdgpu_gmc_get_pde_for_bo - get the PDE for a BO
>>>>>>     *
>>>>>> @@ -98,6 +100,10 @@ int amdgpu_gmc_set_pte_pde(struct amdgpu_device *adev,
>>>>>> void *cpu_pt_addr,
>>>>>>    {
>>>>>>        void __iomem *ptr = (void *)cpu_pt_addr;
>>>>>>        uint64_t value;
>>>>>> +    int idx;
>>>>>> +
>>>>>> +    if (!drm_dev_enter(&adev->ddev, &idx))
>>>>>> +        return 0;
>>>>>>          /*
>>>>>>         * The following is for PTE only. GART does not have PDEs.
>>>>>> @@ -105,6 +111,9 @@ int amdgpu_gmc_set_pte_pde(struct amdgpu_device *adev,
>>>>>> void *cpu_pt_addr,
>>>>>>        value = addr & 0x0000FFFFFFFFF000ULL;
>>>>>>        value |= flags;
>>>>>>        writeq(value, ptr + (gpu_page_idx * 8));
>>>>>> +
>>>>>> +    drm_dev_exit(idx);
>>>>>> +
>>>>>>        return 0;
>>>>>>    }
>>>>>>    diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
>>>>>> index 523d22d..89e2bfe 100644
>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
>>>>>> @@ -37,6 +37,8 @@
>>>>>>      #include "amdgpu_ras.h"
>>>>>>    +#include <drm/drm_drv.h>
>>>>>> +
>>>>>>    static int psp_sysfs_init(struct amdgpu_device *adev);
>>>>>>    static void psp_sysfs_fini(struct amdgpu_device *adev);
>>>>>>    @@ -248,7 +250,7 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>>>>>               struct psp_gfx_cmd_resp *cmd, uint64_t fence_mc_addr)
>>>>>>    {
>>>>>>        int ret;
>>>>>> -    int index;
>>>>>> +    int index, idx;
>>>>>>        int timeout = 2000;
>>>>>>        bool ras_intr = false;
>>>>>>        bool skip_unsupport = false;
>>>>>> @@ -256,6 +258,9 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>>>>>        if (psp->adev->in_pci_err_recovery)
>>>>>>            return 0;
>>>>>>    +    if (!drm_dev_enter(&psp->adev->ddev, &idx))
>>>>>> +        return 0;
>>>>>> +
>>>>>>        mutex_lock(&psp->mutex);
>>>>>>          memset(psp->cmd_buf_mem, 0, PSP_CMD_BUFFER_SIZE);
>>>>>> @@ -266,8 +271,7 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>>>>>        ret = psp_ring_cmd_submit(psp, psp->cmd_buf_mc_addr, fence_mc_addr,
>>>>>> index);
>>>>>>        if (ret) {
>>>>>>            atomic_dec(&psp->fence_value);
>>>>>> -        mutex_unlock(&psp->mutex);
>>>>>> -        return ret;
>>>>>> +        goto exit;
>>>>>>        }
>>>>>>          amdgpu_asic_invalidate_hdp(psp->adev, NULL);
>>>>>> @@ -307,8 +311,8 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>>>>>                 psp->cmd_buf_mem->cmd_id,
>>>>>>                 psp->cmd_buf_mem->resp.status);
>>>>>>            if (!timeout) {
>>>>>> -            mutex_unlock(&psp->mutex);
>>>>>> -            return -EINVAL;
>>>>>> +            ret = -EINVAL;
>>>>>> +            goto exit;
>>>>>>            }
>>>>>>        }
>>>>>>    @@ -316,8 +320,10 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>>>>>            ucode->tmr_mc_addr_lo = psp->cmd_buf_mem->resp.fw_addr_lo;
>>>>>>            ucode->tmr_mc_addr_hi = psp->cmd_buf_mem->resp.fw_addr_hi;
>>>>>>        }
>>>>>> -    mutex_unlock(&psp->mutex);
>>>>>>    +exit:
>>>>>> +    mutex_unlock(&psp->mutex);
>>>>>> +    drm_dev_exit(idx);
>>>>>>        return ret;
>>>>>>    }
>>>>>>    @@ -354,8 +360,7 @@ static int psp_load_toc(struct psp_context *psp,
>>>>>>        if (!cmd)
>>>>>>            return -ENOMEM;
>>>>>>        /* Copy toc to psp firmware private buffer */
>>>>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>>> -    memcpy(psp->fw_pri_buf, psp->toc_start_addr, psp->toc_bin_size);
>>>>>> +    psp_copy_fw(psp, psp->toc_start_addr, psp->toc_bin_size);
>>>>>>          psp_prep_load_toc_cmd_buf(cmd, psp->fw_pri_mc_addr, 
>>>>>> psp->toc_bin_size);
>>>>>>    @@ -570,8 +575,7 @@ static int psp_asd_load(struct psp_context *psp)
>>>>>>        if (!cmd)
>>>>>>            return -ENOMEM;
>>>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>>> -    memcpy(psp->fw_pri_buf, psp->asd_start_addr, psp->asd_ucode_size);
>>>>>> +    psp_copy_fw(psp, psp->asd_start_addr, psp->asd_ucode_size);
>>>>>>          psp_prep_asd_load_cmd_buf(cmd, psp->fw_pri_mc_addr,
>>>>>>                      psp->asd_ucode_size);
>>>>>> @@ -726,8 +730,7 @@ static int psp_xgmi_load(struct psp_context *psp)
>>>>>>        if (!cmd)
>>>>>>            return -ENOMEM;
>>>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>>> -    memcpy(psp->fw_pri_buf, psp->ta_xgmi_start_addr, 
>>>>>> psp->ta_xgmi_ucode_size);
>>>>>> +    psp_copy_fw(psp, psp->ta_xgmi_start_addr, psp->ta_xgmi_ucode_size);
>>>>>>          psp_prep_ta_load_cmd_buf(cmd,
>>>>>>                     psp->fw_pri_mc_addr,
>>>>>> @@ -982,8 +985,7 @@ static int psp_ras_load(struct psp_context *psp)
>>>>>>        if (!cmd)
>>>>>>            return -ENOMEM;
>>>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>>> -    memcpy(psp->fw_pri_buf, psp->ta_ras_start_addr, 
>>>>>> psp->ta_ras_ucode_size);
>>>>>> +    psp_copy_fw(psp, psp->ta_ras_start_addr, psp->ta_ras_ucode_size);
>>>>>>          psp_prep_ta_load_cmd_buf(cmd,
>>>>>>                     psp->fw_pri_mc_addr,
>>>>>> @@ -1219,8 +1221,7 @@ static int psp_hdcp_load(struct psp_context *psp)
>>>>>>        if (!cmd)
>>>>>>            return -ENOMEM;
>>>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>>> -    memcpy(psp->fw_pri_buf, psp->ta_hdcp_start_addr,
>>>>>> +    psp_copy_fw(psp, psp->ta_hdcp_start_addr,
>>>>>>               psp->ta_hdcp_ucode_size);
>>>>>>          psp_prep_ta_load_cmd_buf(cmd,
>>>>>> @@ -1366,8 +1367,7 @@ static int psp_dtm_load(struct psp_context *psp)
>>>>>>        if (!cmd)
>>>>>>            return -ENOMEM;
>>>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>>> -    memcpy(psp->fw_pri_buf, psp->ta_dtm_start_addr, 
>>>>>> psp->ta_dtm_ucode_size);
>>>>>> +    psp_copy_fw(psp, psp->ta_dtm_start_addr, psp->ta_dtm_ucode_size);
>>>>>>          psp_prep_ta_load_cmd_buf(cmd,
>>>>>>                     psp->fw_pri_mc_addr,
>>>>>> @@ -1507,8 +1507,7 @@ static int psp_rap_load(struct psp_context *psp)
>>>>>>        if (!cmd)
>>>>>>            return -ENOMEM;
>>>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>>> -    memcpy(psp->fw_pri_buf, psp->ta_rap_start_addr, 
>>>>>> psp->ta_rap_ucode_size);
>>>>>> +    psp_copy_fw(psp, psp->ta_rap_start_addr, psp->ta_rap_ucode_size);
>>>>>>          psp_prep_ta_load_cmd_buf(cmd,
>>>>>>                     psp->fw_pri_mc_addr,
>>>>>> @@ -2778,6 +2777,20 @@ static ssize_t psp_usbc_pd_fw_sysfs_write(struct
>>>>>> device *dev,
>>>>>>        return count;
>>>>>>    }
>>>>>>    +void psp_copy_fw(struct psp_context *psp, uint8_t *start_addr, uint32_t
>>>>>> bin_size)
>>>>>> +{
>>>>>> +    int idx;
>>>>>> +
>>>>>> +    if (!drm_dev_enter(&psp->adev->ddev, &idx))
>>>>>> +        return;
>>>>>> +
>>>>>> +    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>>> +    memcpy(psp->fw_pri_buf, start_addr, bin_size);
>>>>>> +
>>>>>> +    drm_dev_exit(idx);
>>>>>> +}
>>>>>> +
>>>>>> +
>>>>>>    static DEVICE_ATTR(usbc_pd_fw, S_IRUGO | S_IWUSR,
>>>>>>               psp_usbc_pd_fw_sysfs_read,
>>>>>>               psp_usbc_pd_fw_sysfs_write);
>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
>>>>>> index da250bc..ac69314 100644
>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
>>>>>> @@ -400,4 +400,7 @@ int psp_init_ta_microcode(struct psp_context *psp,
>>>>>>                  const char *chip_name);
>>>>>>    int psp_get_fw_attestation_records_addr(struct psp_context *psp,
>>>>>>                        uint64_t *output_ptr);
>>>>>> +
>>>>>> +void psp_copy_fw(struct psp_context *psp, uint8_t *start_addr, uint32_t
>>>>>> bin_size);
>>>>>> +
>>>>>>    #endif
>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>>>>>> index 1a612f5..d656494 100644
>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>>>>>> @@ -35,6 +35,8 @@
>>>>>>    #include "amdgpu.h"
>>>>>>    #include "atom.h"
>>>>>>    +#include <drm/drm_drv.h>
>>>>>> +
>>>>>>    /*
>>>>>>     * Rings
>>>>>>     * Most engines on the GPU are fed via ring buffers. Ring
>>>>>> @@ -463,3 +465,71 @@ int amdgpu_ring_test_helper(struct amdgpu_ring *ring)
>>>>>>        ring->sched.ready = !r;
>>>>>>        return r;
>>>>>>    }
>>>>>> +
>>>>>> +void amdgpu_ring_clear_ring(struct amdgpu_ring *ring)
>>>>>> +{
>>>>>> +    int idx;
>>>>>> +    int i = 0;
>>>>>> +
>>>>>> +    if (!drm_dev_enter(&ring->adev->ddev, &idx))
>>>>>> +        return;
>>>>>> +
>>>>>> +    while (i <= ring->buf_mask)
>>>>>> +        ring->ring[i++] = ring->funcs->nop;
>>>>>> +
>>>>>> +    drm_dev_exit(idx);
>>>>>> +
>>>>>> +}
>>>>>> +
>>>>>> +void amdgpu_ring_write(struct amdgpu_ring *ring, uint32_t v)
>>>>>> +{
>>>>>> +    int idx;
>>>>>> +
>>>>>> +    if (!drm_dev_enter(&ring->adev->ddev, &idx))
>>>>>> +        return;
>>>>>> +
>>>>>> +    if (ring->count_dw <= 0)
>>>>>> +        DRM_ERROR("amdgpu: writing more dwords to the ring than 
>>>>>> expected!\n");
>>>>>> +    ring->ring[ring->wptr++ & ring->buf_mask] = v;
>>>>>> +    ring->wptr &= ring->ptr_mask;
>>>>>> +    ring->count_dw--;
>>>>>> +
>>>>>> +    drm_dev_exit(idx);
>>>>>> +}
>>>>>> +
>>>>>> +void amdgpu_ring_write_multiple(struct amdgpu_ring *ring,
>>>>>> +                          void *src, int count_dw)
>>>>>> +{
>>>>>> +    unsigned occupied, chunk1, chunk2;
>>>>>> +    void *dst;
>>>>>> +    int idx;
>>>>>> +
>>>>>> +    if (!drm_dev_enter(&ring->adev->ddev, &idx))
>>>>>> +        return;
>>>>>> +
>>>>>> +    if (unlikely(ring->count_dw < count_dw))
>>>>>> +        DRM_ERROR("amdgpu: writing more dwords to the ring than 
>>>>>> expected!\n");
>>>>>> +
>>>>>> +    occupied = ring->wptr & ring->buf_mask;
>>>>>> +    dst = (void *)&ring->ring[occupied];
>>>>>> +    chunk1 = ring->buf_mask + 1 - occupied;
>>>>>> +    chunk1 = (chunk1 >= count_dw) ? count_dw: chunk1;
>>>>>> +    chunk2 = count_dw - chunk1;
>>>>>> +    chunk1 <<= 2;
>>>>>> +    chunk2 <<= 2;
>>>>>> +
>>>>>> +    if (chunk1)
>>>>>> +        memcpy(dst, src, chunk1);
>>>>>> +
>>>>>> +    if (chunk2) {
>>>>>> +        src += chunk1;
>>>>>> +        dst = (void *)ring->ring;
>>>>>> +        memcpy(dst, src, chunk2);
>>>>>> +    }
>>>>>> +
>>>>>> +    ring->wptr += count_dw;
>>>>>> +    ring->wptr &= ring->ptr_mask;
>>>>>> +    ring->count_dw -= count_dw;
>>>>>> +
>>>>>> +    drm_dev_exit(idx);
>>>>>> +}
>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>>>>>> index accb243..f90b81f 100644
>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>>>>>> @@ -300,53 +300,12 @@ static inline void
>>>>>> amdgpu_ring_set_preempt_cond_exec(struct amdgpu_ring *ring,
>>>>>>        *ring->cond_exe_cpu_addr = cond_exec;
>>>>>>    }
>>>>>>    -static inline void amdgpu_ring_clear_ring(struct amdgpu_ring *ring)
>>>>>> -{
>>>>>> -    int i = 0;
>>>>>> -    while (i <= ring->buf_mask)
>>>>>> -        ring->ring[i++] = ring->funcs->nop;
>>>>>> -
>>>>>> -}
>>>>>> -
>>>>>> -static inline void amdgpu_ring_write(struct amdgpu_ring *ring, uint32_t v)
>>>>>> -{
>>>>>> -    if (ring->count_dw <= 0)
>>>>>> -        DRM_ERROR("amdgpu: writing more dwords to the ring than 
>>>>>> expected!\n");
>>>>>> -    ring->ring[ring->wptr++ & ring->buf_mask] = v;
>>>>>> -    ring->wptr &= ring->ptr_mask;
>>>>>> -    ring->count_dw--;
>>>>>> -}
>>>>>> +void amdgpu_ring_clear_ring(struct amdgpu_ring *ring);
>>>>>>    -static inline void amdgpu_ring_write_multiple(struct amdgpu_ring *ring,
>>>>>> -                          void *src, int count_dw)
>>>>>> -{
>>>>>> -    unsigned occupied, chunk1, chunk2;
>>>>>> -    void *dst;
>>>>>> -
>>>>>> -    if (unlikely(ring->count_dw < count_dw))
>>>>>> -        DRM_ERROR("amdgpu: writing more dwords to the ring than 
>>>>>> expected!\n");
>>>>>> -
>>>>>> -    occupied = ring->wptr & ring->buf_mask;
>>>>>> -    dst = (void *)&ring->ring[occupied];
>>>>>> -    chunk1 = ring->buf_mask + 1 - occupied;
>>>>>> -    chunk1 = (chunk1 >= count_dw) ? count_dw: chunk1;
>>>>>> -    chunk2 = count_dw - chunk1;
>>>>>> -    chunk1 <<= 2;
>>>>>> -    chunk2 <<= 2;
>>>>>> +void amdgpu_ring_write(struct amdgpu_ring *ring, uint32_t v);
>>>>>>    -    if (chunk1)
>>>>>> -        memcpy(dst, src, chunk1);
>>>>>> -
>>>>>> -    if (chunk2) {
>>>>>> -        src += chunk1;
>>>>>> -        dst = (void *)ring->ring;
>>>>>> -        memcpy(dst, src, chunk2);
>>>>>> -    }
>>>>>> -
>>>>>> -    ring->wptr += count_dw;
>>>>>> -    ring->wptr &= ring->ptr_mask;
>>>>>> -    ring->count_dw -= count_dw;
>>>>>> -}
>>>>>> +void amdgpu_ring_write_multiple(struct amdgpu_ring *ring,
>>>>>> +                          void *src, int count_dw);
>>>>>>      int amdgpu_ring_test_helper(struct amdgpu_ring *ring);
>>>>>>    diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
>>>>>> b/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
>>>>>> index bd4248c..b3ce5be 100644
>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
>>>>>> @@ -269,10 +269,8 @@ static int psp_v11_0_bootloader_load_kdb(struct
>>>>>> psp_context *psp)
>>>>>>        if (ret)
>>>>>>            return ret;
>>>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>>> -
>>>>>>        /* Copy PSP KDB binary to memory */
>>>>>> -    memcpy(psp->fw_pri_buf, psp->kdb_start_addr, psp->kdb_bin_size);
>>>>>> +    psp_copy_fw(psp, psp->kdb_start_addr, psp->kdb_bin_size);
>>>>>>          /* Provide the PSP KDB to bootloader */
>>>>>>        WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>>>> @@ -302,10 +300,8 @@ static int psp_v11_0_bootloader_load_spl(struct
>>>>>> psp_context *psp)
>>>>>>        if (ret)
>>>>>>            return ret;
>>>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>>> -
>>>>>>        /* Copy PSP SPL binary to memory */
>>>>>> -    memcpy(psp->fw_pri_buf, psp->spl_start_addr, psp->spl_bin_size);
>>>>>> +    psp_copy_fw(psp, psp->spl_start_addr, psp->spl_bin_size);
>>>>>>          /* Provide the PSP SPL to bootloader */
>>>>>>        WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>>>> @@ -335,10 +331,8 @@ static int psp_v11_0_bootloader_load_sysdrv(struct
>>>>>> psp_context *psp)
>>>>>>        if (ret)
>>>>>>            return ret;
>>>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>>> -
>>>>>>        /* Copy PSP System Driver binary to memory */
>>>>>> -    memcpy(psp->fw_pri_buf, psp->sys_start_addr, psp->sys_bin_size);
>>>>>> +    psp_copy_fw(psp, psp->sys_start_addr, psp->sys_bin_size);
>>>>>>          /* Provide the sys driver to bootloader */
>>>>>>        WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>>>> @@ -371,10 +365,8 @@ static int psp_v11_0_bootloader_load_sos(struct
>>>>>> psp_context *psp)
>>>>>>        if (ret)
>>>>>>            return ret;
>>>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>>> -
>>>>>>        /* Copy Secure OS binary to PSP memory */
>>>>>> -    memcpy(psp->fw_pri_buf, psp->sos_start_addr, psp->sos_bin_size);
>>>>>> +    psp_copy_fw(psp, psp->sos_start_addr, psp->sos_bin_size);
>>>>>>          /* Provide the PSP secure OS to bootloader */
>>>>>>        WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
>>>>>> b/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
>>>>>> index c4828bd..618e5b6 100644
>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
>>>>>> @@ -138,10 +138,8 @@ static int psp_v12_0_bootloader_load_sysdrv(struct
>>>>>> psp_context *psp)
>>>>>>        if (ret)
>>>>>>            return ret;
>>>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>>> -
>>>>>>        /* Copy PSP System Driver binary to memory */
>>>>>> -    memcpy(psp->fw_pri_buf, psp->sys_start_addr, psp->sys_bin_size);
>>>>>> +    psp_copy_fw(psp, psp->sys_start_addr, psp->sys_bin_size);
>>>>>>          /* Provide the sys driver to bootloader */
>>>>>>        WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>>>> @@ -179,10 +177,8 @@ static int psp_v12_0_bootloader_load_sos(struct
>>>>>> psp_context *psp)
>>>>>>        if (ret)
>>>>>>            return ret;
>>>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>>> -
>>>>>>        /* Copy Secure OS binary to PSP memory */
>>>>>> -    memcpy(psp->fw_pri_buf, psp->sos_start_addr, psp->sos_bin_size);
>>>>>> +    psp_copy_fw(psp, psp->sos_start_addr, psp->sos_bin_size);
>>>>>>          /* Provide the PSP secure OS to bootloader */
>>>>>>        WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
>>>>>> b/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
>>>>>> index f2e725f..d0a6cccd 100644
>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
>>>>>> @@ -102,10 +102,8 @@ static int psp_v3_1_bootloader_load_sysdrv(struct
>>>>>> psp_context *psp)
>>>>>>        if (ret)
>>>>>>            return ret;
>>>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>>> -
>>>>>>        /* Copy PSP System Driver binary to memory */
>>>>>> -    memcpy(psp->fw_pri_buf, psp->sys_start_addr, psp->sys_bin_size);
>>>>>> +    psp_copy_fw(psp, psp->sys_start_addr, psp->sys_bin_size);
>>>>>>          /* Provide the sys driver to bootloader */
>>>>>>        WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>>>> @@ -143,10 +141,8 @@ static int psp_v3_1_bootloader_load_sos(struct
>>>>>> psp_context *psp)
>>>>>>        if (ret)
>>>>>>            return ret;
>>>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>>> -
>>>>>>        /* Copy Secure OS binary to PSP memory */
>>>>>> -    memcpy(psp->fw_pri_buf, psp->sos_start_addr, psp->sos_bin_size);
>>>>>> +    psp_copy_fw(psp, psp->sos_start_addr, psp->sos_bin_size);
>>>>>>          /* Provide the PSP secure OS to bootloader */
>>>>>>        WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>
>>>
>> _______________________________________________
>> amd-gfx mailing list
>> amd-gfx@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>>
>

[-- Attachment #2: results.log --]
[-- Type: text/x-log, Size: 9299 bytes --]

with drm_dev_enter


[   20.606168 <    0.053014>] [drm:gfx_v8_0_ring_test_ring [amdgpu]] *ERROR* Before write 1M times to scratch register
[   20.914669 <    0.308501>] [drm:gfx_v8_0_ring_test_ring [amdgpu]] *ERROR* After write 1M times to scratch register
[   20.914857 <    0.000188>] [drm:gfx_v8_0_ring_test_ring [amdgpu]] *ERROR* Before write 1M times to scratch register
[   21.224795 <    0.309938>] [drm:gfx_v8_0_ring_test_ring [amdgpu]] *ERROR* After write 1M times to scratch register
[   21.224986 <    0.000191>] [drm:gfx_v8_0_ring_test_ring [amdgpu]] *ERROR* Before write 1M times to scratch register
[   21.533422 <    0.308436>] [drm:gfx_v8_0_ring_test_ring [amdgpu]] *ERROR* After write 1M times to scratch register
[   21.843633 <    0.000164>] [drm:gfx_v8_0_ring_test_ring [amdgpu]] *ERROR* Before write 1M times to scratch register
[   22.152756 <    0.309123>] [drm:gfx_v8_0_ring_test_ring [amdgpu]] *ERROR* After write 1M times to scratch register
[   22.152934 <    0.000178>] [drm:gfx_v8_0_ring_test_ring [amdgpu]] *ERROR* Before write 1M times to scratch register
[   22.462544 <    0.309610>] [drm:gfx_v8_0_ring_test_ring [amdgpu]] *ERROR* After write 1M times to scratch register
[   22.462721 <    0.000177>] [drm:gfx_v8_0_ring_test_ring [amdgpu]] *ERROR* Before write 1M times to scratch register
[   22.771921 <    0.309200>] [drm:gfx_v8_0_ring_test_ring [amdgpu]] *ERROR* After write 1M times to scratch register
[   22.772128 <    0.000207>] [drm:gfx_v8_0_ring_test_ring [amdgpu]] *ERROR* Before write 1M times to scratch register
[   23.081148 <    0.309020>] [drm:gfx_v8_0_ring_test_ring [amdgpu]] *ERROR* After write 1M times to scratch register
[   23.081331 <    0.000183>] [drm:gfx_v8_0_ring_test_ring [amdgpu]] *ERROR* Before write 1M times to scratch register
[   23.391600 <    0.310269>] [drm:gfx_v8_0_ring_test_ring [amdgpu]] *ERROR* After write 1M times to scratch register
[   23.391783 <    0.000183>] [drm:gfx_v8_0_ring_test_ring [amdgpu]] *ERROR* Before write 1M times to scratch register
[   23.702026 <    0.310243>] [drm:gfx_v8_0_ring_test_ring [amdgpu]] *ERROR* After write 1M times to scratch register




[   25.330532 <    0.000112>] [drm:gfx_v9_0_ring_test_ring [amdgpu]] *ERROR* Before write 1M times to scratch register
[   25.642806 <    0.312274>] [drm:gfx_v9_0_ring_test_ring [amdgpu]] *ERROR* After write 1M times to scratch register
[   25.643123 <    0.000317>] [drm:gfx_v9_0_ring_test_ring [amdgpu]] *ERROR* Before write 1M times to scratch register
[   25.954685 <    0.311562>] [drm:gfx_v9_0_ring_test_ring [amdgpu]] *ERROR* After write 1M times to scratch register
[   25.954906 <    0.000221>] [drm:gfx_v9_0_ring_test_ring [amdgpu]] *ERROR* Before write 1M times to scratch register
[   26.266457 <    0.311551>] [drm:gfx_v9_0_ring_test_ring [amdgpu]] *ERROR* After write 1M times to scratch register
[   26.266675 <    0.000218>] [drm:gfx_v9_0_ring_test_ring [amdgpu]] *ERROR* Before write 1M times to scratch register
[   26.579848 <    0.313173>] [drm:gfx_v9_0_ring_test_ring [amdgpu]] *ERROR* After write 1M times to scratch register
[   26.580066 <    0.000218>] [drm:gfx_v9_0_ring_test_ring [amdgpu]] *ERROR* Before write 1M times to scratch register
[   26.891740 <    0.311674>] [drm:gfx_v9_0_ring_test_ring [amdgpu]] *ERROR* After write 1M times to scratch register
[   26.891958 <    0.000218>] [drm:gfx_v9_0_ring_test_ring [amdgpu]] *ERROR* Before write 1M times to scratch register
[   27.203947 <    0.311989>] [drm:gfx_v9_0_ring_test_ring [amdgpu]] *ERROR* After write 1M times to scratch register
[   27.204166 <    0.000219>] [drm:gfx_v9_0_ring_test_ring [amdgpu]] *ERROR* Before write 1M times to scratch register
[   27.516040 <    0.311874>] [drm:gfx_v9_0_ring_test_ring [amdgpu]] *ERROR* After write 1M times to scratch register
[   27.516265 <    0.000225>] [drm:gfx_v9_0_ring_test_ring [amdgpu]] *ERROR* Before write 1M times to scratch register
[   27.828137 <    0.311872>] [drm:gfx_v9_0_ring_test_ring [amdgpu]] *ERROR* After write 1M times to scratch register
[   27.828356 <    0.000219>] [drm:gfx_v9_0_ring_test_ring [amdgpu]] *ERROR* Before write 1M times to scratch register
[   28.140182 <    0.311826>] [drm:gfx_v9_0_ring_test_ring [amdgpu]] *ERROR* After write 1M times to scratch register
[   28.140395 <    0.000213>] [drm:gfx_v9_0_ring_test_ring [amdgpu]] *ERROR* Before write 1M times to scratch register
[   28.452405 <    0.312010>] [drm:gfx_v9_0_ring_test_ring [amdgpu]] *ERROR* After write 1M times to scratch register

without drm_dev_enter


[   28.519096 <    0.049775>] [drm:gfx_v8_0_ring_test_ring [amdgpu]] *ERROR* Before write 1M times to scratch register
[   28.728785 <    0.209689>] [drm:gfx_v8_0_ring_test_ring [amdgpu]] *ERROR* After write 1M times to scratch register
[   28.728946 <    0.000161>] [drm:gfx_v8_0_ring_test_ring [amdgpu]] *ERROR* Before write 1M times to scratch register
[   28.938814 <    0.209868>] [drm:gfx_v8_0_ring_test_ring [amdgpu]] *ERROR* After write 1M times to scratch register
[   28.938979 <    0.000165>] [drm:gfx_v8_0_ring_test_ring [amdgpu]] *ERROR* Before write 1M times to scratch register
[   29.148659 <    0.209680>] [drm:gfx_v8_0_ring_test_ring [amdgpu]] *ERROR* After write 1M times to scratch register
[   29.148809 <    0.000150>] [drm:gfx_v8_0_ring_test_ring [amdgpu]] *ERROR* Before write 1M times to scratch register
[   29.358514 <    0.209705>] [drm:gfx_v8_0_ring_test_ring [amdgpu]] *ERROR* After write 1M times to scratch register
[   29.358664 <    0.000150>] [drm:gfx_v8_0_ring_test_ring [amdgpu]] *ERROR* Before write 1M times to scratch register
[   29.568358 <    0.209694>] [drm:gfx_v8_0_ring_test_ring [amdgpu]] *ERROR* After write 1M times to scratch register
[   29.568508 <    0.000150>] [drm:gfx_v8_0_ring_test_ring [amdgpu]] *ERROR* Before write 1M times to scratch register
[   29.778389 <    0.209881>] [drm:gfx_v8_0_ring_test_ring [amdgpu]] *ERROR* After write 1M times to scratch register
[   29.778539 <    0.000150>] [drm:gfx_v8_0_ring_test_ring [amdgpu]] *ERROR* Before write 1M times to scratch register
[   29.988240 <    0.209701>] [drm:gfx_v8_0_ring_test_ring [amdgpu]] *ERROR* After write 1M times to scratch register
[   29.988391 <    0.000151>] [drm:gfx_v8_0_ring_test_ring [amdgpu]] *ERROR* Before write 1M times to scratch register
[   30.198077 <    0.209686>] [drm:gfx_v8_0_ring_test_ring [amdgpu]] *ERROR* After write 1M times to scratch register
[   30.198228 <    0.000151>] [drm:gfx_v8_0_ring_test_ring [amdgpu]] *ERROR* Before write 1M times to scratch register
[   30.407928 <    0.209700>] [drm:gfx_v8_0_ring_test_ring [amdgpu]] *ERROR* After write 1M times to scratch register
[   30.408079 <    0.000151>] [drm:gfx_v8_0_ring_test_ring [amdgpu]] *ERROR* Before write 1M times to scratch register
[   30.617796 <    0.209717>] [drm:gfx_v8_0_ring_test_ring [amdgpu]] *ERROR* After write 1M times to scratch register





[   32.222242 <    0.000086>] [drm:gfx_v9_0_ring_test_ring [amdgpu]] *ERROR* Before write 1M times to scratch register
[   32.431423 <    0.209181>] [drm:gfx_v9_0_ring_test_ring [amdgpu]] *ERROR* After write 1M times to scratch register
[   32.431834 <    0.000411>] [drm:gfx_v9_0_ring_test_ring [amdgpu]] *ERROR* Before write 1M times to scratch register
[   32.641075 <    0.209241>] [drm:gfx_v9_0_ring_test_ring [amdgpu]] *ERROR* After write 1M times to scratch register
[   32.641268 <    0.000193>] [drm:gfx_v9_0_ring_test_ring [amdgpu]] *ERROR* Before write 1M times to scratch register
[   32.850482 <    0.209214>] [drm:gfx_v9_0_ring_test_ring [amdgpu]] *ERROR* After write 1M times to scratch register
[   32.850671 <    0.000189>] [drm:gfx_v9_0_ring_test_ring [amdgpu]] *ERROR* Before write 1M times to scratch register
[   33.059900 <    0.209229>] [drm:gfx_v9_0_ring_test_ring [amdgpu]] *ERROR* After write 1M times to scratch register
[   33.060092 <    0.000192>] [drm:gfx_v9_0_ring_test_ring [amdgpu]] *ERROR* Before write 1M times to scratch register
[   33.269336 <    0.209244>] [drm:gfx_v9_0_ring_test_ring [amdgpu]] *ERROR* After write 1M times to scratch register
[   33.269526 <    0.000190>] [drm:gfx_v9_0_ring_test_ring [amdgpu]] *ERROR* Before write 1M times to scratch register
[   33.478741 <    0.209215>] [drm:gfx_v9_0_ring_test_ring [amdgpu]] *ERROR* After write 1M times to scratch register
[   33.478931 <    0.000190>] [drm:gfx_v9_0_ring_test_ring [amdgpu]] *ERROR* Before write 1M times to scratch register
[   33.688259 <    0.209328>] [drm:gfx_v9_0_ring_test_ring [amdgpu]] *ERROR* After write 1M times to scratch register
[   33.688449 <    0.000190>] [drm:gfx_v9_0_ring_test_ring [amdgpu]] *ERROR* Before write 1M times to scratch register
[   33.897756 <    0.209307>] [drm:gfx_v9_0_ring_test_ring [amdgpu]] *ERROR* After write 1M times to scratch register
[   33.897946 <    0.000190>] [drm:gfx_v9_0_ring_test_ring [amdgpu]] *ERROR* Before write 1M times to scratch register
[   34.107256 <    0.209310>] [drm:gfx_v9_0_ring_test_ring [amdgpu]] *ERROR* After write 1M times to scratch register
[   34.107445 <    0.000189>] [drm:gfx_v9_0_ring_test_ring [amdgpu]] *ERROR* Before write 1M times to scratch register
[   34.316758 <    0.209313>] [drm:gfx_v9_0_ring_test_ring [amdgpu]] *ERROR* After write 1M times to scratch register


[-- Attachment #3: Type: text/plain, Size: 160 bytes --]

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 196+ messages in thread

* Re: [PATCH v4 11/14] drm/amdgpu: Guard against write accesses after device removal
@ 2021-01-28 17:23               ` Andrey Grodzovsky
  0 siblings, 0 replies; 196+ messages in thread
From: Andrey Grodzovsky @ 2021-01-28 17:23 UTC (permalink / raw)
  To: christian.koenig, Daniel Vetter
  Cc: Rob Herring, Greg KH, dri-devel, Anholt, Eric, Pekka Paalanen,
	amd-gfx list, Alex Deucher, Lucas Stach, Wentland, Harry,
	Qiang Yu

[-- Attachment #1: Type: text/plain, Size: 33539 bytes --]


On 1/19/21 1:59 PM, Christian König wrote:
> On 19.01.21 at 19:22, Andrey Grodzovsky wrote:
>>
>> On 1/19/21 1:05 PM, Daniel Vetter wrote:
>>> On Tue, Jan 19, 2021 at 4:35 PM Andrey Grodzovsky
>>> <Andrey.Grodzovsky@amd.com> wrote:
>>>> There is really no other way according to this article
>>>> https://lwn.net/Articles/767885/
>>>>
>>>>
>>>> "A perfect solution seems nearly impossible though; we cannot acquire a 
>>>> mutex on
>>>> the user
>>>> to prevent them from yanking a device and we cannot check for a presence 
>>>> change
>>>> after every
>>>> device access for performance reasons. "
>>>>
>>>> But I assumed srcu_read_lock should be pretty seamless performance-wise, no?
>>> The read side is supposed to be dirt cheap; the write side is where we
>>> just stall for all readers to eventually complete on their own.
>>> Definitely should be much cheaper than an mmio read; on the mmio write
>>> side it might actually hurt a bit. Otoh I think those don't stall the
>>> cpu by default when they're timing out, so maybe if the overhead is
>>> too much for those, we could omit them?
>>>
>>> Maybe just do a small microbenchmark for these for testing, with a
>>> register that doesn't change hw state. So with and without
>>> drm_dev_enter/exit, and also one with the hw plugged out so that we
>>> have actual timeouts in the transactions.
>>> -Daniel
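
In code, the scheme being discussed boils down to roughly the following (a
minimal sketch, not the actual amdgpu code: "struct mydev", its embedded
drm_device and the MMIO mapping are hypothetical names; drm_dev_enter()/
drm_dev_exit() are the SRCU read side, drm_dev_unplug() in the PCI remove
path is the write side):

#include <drm/drm_drv.h>
#include <linux/io.h>

struct mydev {
	struct drm_device ddev;		/* embedded DRM device */
	void __iomem *mmio;		/* register BAR mapping */
};

static void mydev_wreg(struct mydev *mdev, u32 reg, u32 v)
{
	int idx;

	/* Read side: srcu_read_lock() under the hood, nearly free. */
	if (!drm_dev_enter(&mdev->ddev, &idx))
		return;			/* device gone, drop the write */

	writel(v, mdev->mmio + reg);

	drm_dev_exit(idx);		/* srcu_read_unlock() */
}

/*
 * Write side: the PCI remove callback calls drm_dev_unplug(), which
 * marks the device as gone and then does synchronize_srcu(), i.e. it
 * stalls until every in-flight drm_dev_enter()/drm_dev_exit() section
 * has completed. After that no reader can reach the hardware anymore.
 */

This is the "stall for all readers" on the write side mentioned above.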
>>
>>
>> So, say, writing to some harmless scratch register in a loop many times,
>> both for the plugged and the unplugged case, and measuring the total time
>> delta?
>
> I think we should at least measure the following:
>
> 1. Writing X times to a scratch reg without your patch.
> 2. Writing X times to a scratch reg with your patch.
> 3. Writing X times to a scratch reg with the hardware physically disconnected.
>
> I suggest repeating that once for Polaris (or older) and once for Vega or Navi.
>
> The SRBM on Polaris is meant to introduce some delay in each access, so it
> might react differently than the newer hardware.
>
> Christian.


See attached results and the testing code. Ran on Polaris (gfx8) and Vega10 (gfx9).

In summary, over 1 million WREG32 calls in a loop, with and without this patch,
the attached logs show around 100ms of accumulated overhead (~310ms per million
writes with the check vs ~210ms without, i.e. a penalty of roughly 0.0001
millisecond, or ~100ns, per WREG32) for using the drm_dev_enter check when
writing registers.

P.S. Bullet 3 I cannot test, as it requires an eGPU and I currently don't have one.

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
index 3763921..1650549 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
@@ -873,6 +873,11 @@ static int gfx_v8_0_ring_test_ring(struct amdgpu_ring *ring)
         if (i >= adev->usec_timeout)
                 r = -ETIMEDOUT;

+       DRM_ERROR("Before write 1M times to scratch register");
+       for (i = 0; i < 1000000; i++)
+               WREG32(scratch, 0xDEADBEEF);
+       DRM_ERROR("After write 1M times to scratch register");
+
  error_free_scratch:
         amdgpu_gfx_scratch_free(adev, scratch);
         return r;
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
index 5f4805e..7ecbfef 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
@@ -1063,6 +1063,11 @@ static int gfx_v9_0_ring_test_ring(struct amdgpu_ring *ring)
         if (i >= adev->usec_timeout)
                 r = -ETIMEDOUT;

+       DRM_ERROR("Before write 1M times to scratch register");
+       for (i = 0; i < 1000000; i++)
+               WREG32(scratch, 0xDEADBEEF);
+       DRM_ERROR("After write 1M times to scratch register");
+
  error_free_scratch:
         amdgpu_gfx_scratch_free(adev, scratch);
         return r;
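
For reference, the loop can also be timed explicitly with ktime instead of
reading the deltas off the dmesg timestamps. A sketch along these lines (not
part of the patch; WREG32(), adev and the scratch register are reused from
the test hunks above):

#include <linux/ktime.h>

static void time_scratch_writes(struct amdgpu_device *adev, uint32_t scratch)
{
	ktime_t start;
	s64 total_ns;
	int i;

	start = ktime_get();
	for (i = 0; i < 1000000; i++)
		WREG32(scratch, 0xDEADBEEF);
	total_ns = ktime_to_ns(ktime_sub(ktime_get(), start));

	/* From the logs above: ~310ms per million writes with the
	 * drm_dev_enter check vs ~210ms without, i.e. ~100ns per write. */
	DRM_INFO("1M scratch writes: %lld ns total, %lld ns per write\n",
		 total_ns, total_ns / 1000000);
}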


Andrey




>
>>
>> Andrey
>>
>>
>>>
>>>> The other solution would be, as I suggested, to keep all the device IO
>>>> ranges reserved and the system memory pages unfreed until the device is
>>>> finalized in the driver, but Daniel said this would upset the PCI layer
>>>> (the MMIO ranges reservation part).
>>>>
>>>> Andrey
>>>>
>>>>
>>>>
>>>>
>>>> On 1/19/21 3:55 AM, Christian König wrote:
>>>>> On 18.01.21 at 22:01, Andrey Grodzovsky wrote:
>>>>>> This should prevent writing to memory or IO ranges possibly
>>>>>> already allocated for other uses after our device is removed.
>>>>> Wow, that adds quite some overhead to every register access. I'm not sure we
>>>>> can do this.
>>>>>
>>>>> Christian.
>>>>>
>>>>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>>>>>> ---
>>>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 57 ++++++++++++++++++++++++
>>>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c    |  9 ++++
>>>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c    | 53 +++++++++++++---------
>>>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h    |  3 ++
>>>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c   | 70 
>>>>>> ++++++++++++++++++++++++++++++
>>>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h   | 49 ++-------------------
>>>>>>    drivers/gpu/drm/amd/amdgpu/psp_v11_0.c     | 16 ++-----
>>>>>>    drivers/gpu/drm/amd/amdgpu/psp_v12_0.c     |  8 +---
>>>>>>    drivers/gpu/drm/amd/amdgpu/psp_v3_1.c      |  8 +---
>>>>>>    9 files changed, 184 insertions(+), 89 deletions(-)
>>>>>>
>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>>> index e99f4f1..0a9d73c 100644
>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>>> @@ -72,6 +72,8 @@
>>>>>>      #include <linux/iommu.h>
>>>>>>    +#include <drm/drm_drv.h>
>>>>>> +
>>>>>>    MODULE_FIRMWARE("amdgpu/vega10_gpu_info.bin");
>>>>>>    MODULE_FIRMWARE("amdgpu/vega12_gpu_info.bin");
>>>>>>    MODULE_FIRMWARE("amdgpu/raven_gpu_info.bin");
>>>>>> @@ -404,13 +406,21 @@ uint8_t amdgpu_mm_rreg8(struct amdgpu_device *adev,
>>>>>> uint32_t offset)
>>>>>>     */
>>>>>>    void amdgpu_mm_wreg8(struct amdgpu_device *adev, uint32_t offset, uint8_t
>>>>>> value)
>>>>>>    {
>>>>>> +    int idx;
>>>>>> +
>>>>>>        if (adev->in_pci_err_recovery)
>>>>>>            return;
>>>>>>    +
>>>>>> +    if (!drm_dev_enter(&adev->ddev, &idx))
>>>>>> +        return;
>>>>>> +
>>>>>>        if (offset < adev->rmmio_size)
>>>>>>            writeb(value, adev->rmmio + offset);
>>>>>>        else
>>>>>>            BUG();
>>>>>> +
>>>>>> +    drm_dev_exit(idx);
>>>>>>    }
>>>>>>      /**
>>>>>> @@ -427,9 +437,14 @@ void amdgpu_device_wreg(struct amdgpu_device *adev,
>>>>>>                uint32_t reg, uint32_t v,
>>>>>>                uint32_t acc_flags)
>>>>>>    {
>>>>>> +    int idx;
>>>>>> +
>>>>>>        if (adev->in_pci_err_recovery)
>>>>>>            return;
>>>>>>    +    if (!drm_dev_enter(&adev->ddev, &idx))
>>>>>> +        return;
>>>>>> +
>>>>>>        if ((reg * 4) < adev->rmmio_size) {
>>>>>>            if (!(acc_flags & AMDGPU_REGS_NO_KIQ) &&
>>>>>>                amdgpu_sriov_runtime(adev) &&
>>>>>> @@ -444,6 +459,8 @@ void amdgpu_device_wreg(struct amdgpu_device *adev,
>>>>>>        }
>>>>>> trace_amdgpu_device_wreg(adev->pdev->device, reg, v);
>>>>>> +
>>>>>> +    drm_dev_exit(idx);
>>>>>>    }
>>>>>>      /*
>>>>>> @@ -454,9 +471,14 @@ void amdgpu_device_wreg(struct amdgpu_device *adev,
>>>>>>    void amdgpu_mm_wreg_mmio_rlc(struct amdgpu_device *adev,
>>>>>>                     uint32_t reg, uint32_t v)
>>>>>>    {
>>>>>> +    int idx;
>>>>>> +
>>>>>>        if (adev->in_pci_err_recovery)
>>>>>>            return;
>>>>>>    +    if (!drm_dev_enter(&adev->ddev, &idx))
>>>>>> +        return;
>>>>>> +
>>>>>>        if (amdgpu_sriov_fullaccess(adev) &&
>>>>>>            adev->gfx.rlc.funcs &&
>>>>>> adev->gfx.rlc.funcs->is_rlcg_access_range) {
>>>>>> @@ -465,6 +487,8 @@ void amdgpu_mm_wreg_mmio_rlc(struct amdgpu_device *adev,
>>>>>>        } else {
>>>>>>            writel(v, ((void __iomem *)adev->rmmio) + (reg * 4));
>>>>>>        }
>>>>>> +
>>>>>> +    drm_dev_exit(idx);
>>>>>>    }
>>>>>>      /**
>>>>>> @@ -499,15 +523,22 @@ u32 amdgpu_io_rreg(struct amdgpu_device *adev, u32 
>>>>>> reg)
>>>>>>     */
>>>>>>    void amdgpu_io_wreg(struct amdgpu_device *adev, u32 reg, u32 v)
>>>>>>    {
>>>>>> +    int idx;
>>>>>> +
>>>>>>        if (adev->in_pci_err_recovery)
>>>>>>            return;
>>>>>>    +    if (!drm_dev_enter(&adev->ddev, &idx))
>>>>>> +        return;
>>>>>> +
>>>>>>        if ((reg * 4) < adev->rio_mem_size)
>>>>>>            iowrite32(v, adev->rio_mem + (reg * 4));
>>>>>>        else {
>>>>>>            iowrite32((reg * 4), adev->rio_mem + (mmMM_INDEX * 4));
>>>>>>            iowrite32(v, adev->rio_mem + (mmMM_DATA * 4));
>>>>>>        }
>>>>>> +
>>>>>> +    drm_dev_exit(idx);
>>>>>>    }
>>>>>>      /**
>>>>>> @@ -544,14 +575,21 @@ u32 amdgpu_mm_rdoorbell(struct amdgpu_device *adev, 
>>>>>> u32
>>>>>> index)
>>>>>>     */
>>>>>>    void amdgpu_mm_wdoorbell(struct amdgpu_device *adev, u32 index, u32 v)
>>>>>>    {
>>>>>> +    int idx;
>>>>>> +
>>>>>>        if (adev->in_pci_err_recovery)
>>>>>>            return;
>>>>>>    +    if (!drm_dev_enter(&adev->ddev, &idx))
>>>>>> +        return;
>>>>>> +
>>>>>>        if (index < adev->doorbell.num_doorbells) {
>>>>>>            writel(v, adev->doorbell.ptr + index);
>>>>>>        } else {
>>>>>>            DRM_ERROR("writing beyond doorbell aperture: 0x%08x!\n", index);
>>>>>>        }
>>>>>> +
>>>>>> +    drm_dev_exit(idx);
>>>>>>    }
>>>>>>      /**
>>>>>> @@ -588,14 +626,21 @@ u64 amdgpu_mm_rdoorbell64(struct amdgpu_device *adev,
>>>>>> u32 index)
>>>>>>     */
>>>>>>    void amdgpu_mm_wdoorbell64(struct amdgpu_device *adev, u32 index, u64 v)
>>>>>>    {
>>>>>> +    int idx;
>>>>>> +
>>>>>>        if (adev->in_pci_err_recovery)
>>>>>>            return;
>>>>>>    +    if (!drm_dev_enter(&adev->ddev, &idx))
>>>>>> +        return;
>>>>>> +
>>>>>>        if (index < adev->doorbell.num_doorbells) {
>>>>>>            atomic64_set((atomic64_t *)(adev->doorbell.ptr + index), v);
>>>>>>        } else {
>>>>>>            DRM_ERROR("writing beyond doorbell aperture: 0x%08x!\n", index);
>>>>>>        }
>>>>>> +
>>>>>> +    drm_dev_exit(idx);
>>>>>>    }
>>>>>>      /**
>>>>>> @@ -682,6 +727,10 @@ void amdgpu_device_indirect_wreg(struct amdgpu_device
>>>>>> *adev,
>>>>>>        unsigned long flags;
>>>>>>        void __iomem *pcie_index_offset;
>>>>>>        void __iomem *pcie_data_offset;
>>>>>> +    int idx;
>>>>>> +
>>>>>> +    if (!drm_dev_enter(&adev->ddev, &idx))
>>>>>> +        return;
>>>>>>          spin_lock_irqsave(&adev->pcie_idx_lock, flags);
>>>>>>        pcie_index_offset = (void __iomem *)adev->rmmio + pcie_index * 4;
>>>>>> @@ -692,6 +741,8 @@ void amdgpu_device_indirect_wreg(struct amdgpu_device 
>>>>>> *adev,
>>>>>>        writel(reg_data, pcie_data_offset);
>>>>>>        readl(pcie_data_offset);
>>>>>> spin_unlock_irqrestore(&adev->pcie_idx_lock, flags);
>>>>>> +
>>>>>> +    drm_dev_exit(idx);
>>>>>>    }
>>>>>>      /**
>>>>>> @@ -711,6 +762,10 @@ void amdgpu_device_indirect_wreg64(struct amdgpu_device
>>>>>> *adev,
>>>>>>        unsigned long flags;
>>>>>>        void __iomem *pcie_index_offset;
>>>>>>        void __iomem *pcie_data_offset;
>>>>>> +    int idx;
>>>>>> +
>>>>>> +    if (!drm_dev_enter(&adev->ddev, &idx))
>>>>>> +        return;
>>>>>>          spin_lock_irqsave(&adev->pcie_idx_lock, flags);
>>>>>>        pcie_index_offset = (void __iomem *)adev->rmmio + pcie_index * 4;
>>>>>> @@ -727,6 +782,8 @@ void amdgpu_device_indirect_wreg64(struct amdgpu_device
>>>>>> *adev,
>>>>>>        writel((u32)(reg_data >> 32), pcie_data_offset);
>>>>>>        readl(pcie_data_offset);
>>>>>> spin_unlock_irqrestore(&adev->pcie_idx_lock, flags);
>>>>>> +
>>>>>> +    drm_dev_exit(idx);
>>>>>>    }
>>>>>>      /**
>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
>>>>>> index fe1a39f..1beb4e6 100644
>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
>>>>>> @@ -31,6 +31,8 @@
>>>>>>    #include "amdgpu_ras.h"
>>>>>>    #include "amdgpu_xgmi.h"
>>>>>>    +#include <drm/drm_drv.h>
>>>>>> +
>>>>>>    /**
>>>>>>     * amdgpu_gmc_get_pde_for_bo - get the PDE for a BO
>>>>>>     *
>>>>>> @@ -98,6 +100,10 @@ int amdgpu_gmc_set_pte_pde(struct amdgpu_device *adev,
>>>>>> void *cpu_pt_addr,
>>>>>>    {
>>>>>>        void __iomem *ptr = (void *)cpu_pt_addr;
>>>>>>        uint64_t value;
>>>>>> +    int idx;
>>>>>> +
>>>>>> +    if (!drm_dev_enter(&adev->ddev, &idx))
>>>>>> +        return 0;
>>>>>>          /*
>>>>>>         * The following is for PTE only. GART does not have PDEs.
>>>>>> @@ -105,6 +111,9 @@ int amdgpu_gmc_set_pte_pde(struct amdgpu_device *adev,
>>>>>> void *cpu_pt_addr,
>>>>>>        value = addr & 0x0000FFFFFFFFF000ULL;
>>>>>>        value |= flags;
>>>>>>        writeq(value, ptr + (gpu_page_idx * 8));
>>>>>> +
>>>>>> +    drm_dev_exit(idx);
>>>>>> +
>>>>>>        return 0;
>>>>>>    }
>>>>>>    diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
>>>>>> index 523d22d..89e2bfe 100644
>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
>>>>>> @@ -37,6 +37,8 @@
>>>>>>      #include "amdgpu_ras.h"
>>>>>>    +#include <drm/drm_drv.h>
>>>>>> +
>>>>>>    static int psp_sysfs_init(struct amdgpu_device *adev);
>>>>>>    static void psp_sysfs_fini(struct amdgpu_device *adev);
>>>>>>    @@ -248,7 +250,7 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>>>>>               struct psp_gfx_cmd_resp *cmd, uint64_t fence_mc_addr)
>>>>>>    {
>>>>>>        int ret;
>>>>>> -    int index;
>>>>>> +    int index, idx;
>>>>>>        int timeout = 2000;
>>>>>>        bool ras_intr = false;
>>>>>>        bool skip_unsupport = false;
>>>>>> @@ -256,6 +258,9 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>>>>>        if (psp->adev->in_pci_err_recovery)
>>>>>>            return 0;
>>>>>>    +    if (!drm_dev_enter(&psp->adev->ddev, &idx))
>>>>>> +        return 0;
>>>>>> +
>>>>>>        mutex_lock(&psp->mutex);
>>>>>>          memset(psp->cmd_buf_mem, 0, PSP_CMD_BUFFER_SIZE);
>>>>>> @@ -266,8 +271,7 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>>>>>        ret = psp_ring_cmd_submit(psp, psp->cmd_buf_mc_addr, fence_mc_addr,
>>>>>> index);
>>>>>>        if (ret) {
>>>>>>            atomic_dec(&psp->fence_value);
>>>>>> -        mutex_unlock(&psp->mutex);
>>>>>> -        return ret;
>>>>>> +        goto exit;
>>>>>>        }
>>>>>>          amdgpu_asic_invalidate_hdp(psp->adev, NULL);
>>>>>> @@ -307,8 +311,8 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>>>>>                 psp->cmd_buf_mem->cmd_id,
>>>>>>                 psp->cmd_buf_mem->resp.status);
>>>>>>            if (!timeout) {
>>>>>> -            mutex_unlock(&psp->mutex);
>>>>>> -            return -EINVAL;
>>>>>> +            ret = -EINVAL;
>>>>>> +            goto exit;
>>>>>>            }
>>>>>>        }
>>>>>>    @@ -316,8 +320,10 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>>>>>            ucode->tmr_mc_addr_lo = psp->cmd_buf_mem->resp.fw_addr_lo;
>>>>>>            ucode->tmr_mc_addr_hi = psp->cmd_buf_mem->resp.fw_addr_hi;
>>>>>>        }
>>>>>> -    mutex_unlock(&psp->mutex);
>>>>>>    +exit:
>>>>>> +    mutex_unlock(&psp->mutex);
>>>>>> +    drm_dev_exit(idx);
>>>>>>        return ret;
>>>>>>    }
>>>>>>    @@ -354,8 +360,7 @@ static int psp_load_toc(struct psp_context *psp,
>>>>>>        if (!cmd)
>>>>>>            return -ENOMEM;
>>>>>>        /* Copy toc to psp firmware private buffer */
>>>>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>>> -    memcpy(psp->fw_pri_buf, psp->toc_start_addr, psp->toc_bin_size);
>>>>>> +    psp_copy_fw(psp, psp->toc_start_addr, psp->toc_bin_size);
>>>>>>          psp_prep_load_toc_cmd_buf(cmd, psp->fw_pri_mc_addr, 
>>>>>> psp->toc_bin_size);
>>>>>>    @@ -570,8 +575,7 @@ static int psp_asd_load(struct psp_context *psp)
>>>>>>        if (!cmd)
>>>>>>            return -ENOMEM;
>>>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>>> -    memcpy(psp->fw_pri_buf, psp->asd_start_addr, psp->asd_ucode_size);
>>>>>> +    psp_copy_fw(psp, psp->asd_start_addr, psp->asd_ucode_size);
>>>>>>          psp_prep_asd_load_cmd_buf(cmd, psp->fw_pri_mc_addr,
>>>>>>                      psp->asd_ucode_size);
>>>>>> @@ -726,8 +730,7 @@ static int psp_xgmi_load(struct psp_context *psp)
>>>>>>        if (!cmd)
>>>>>>            return -ENOMEM;
>>>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>>> -    memcpy(psp->fw_pri_buf, psp->ta_xgmi_start_addr, 
>>>>>> psp->ta_xgmi_ucode_size);
>>>>>> +    psp_copy_fw(psp, psp->ta_xgmi_start_addr, psp->ta_xgmi_ucode_size);
>>>>>>          psp_prep_ta_load_cmd_buf(cmd,
>>>>>>                     psp->fw_pri_mc_addr,
>>>>>> @@ -982,8 +985,7 @@ static int psp_ras_load(struct psp_context *psp)
>>>>>>        if (!cmd)
>>>>>>            return -ENOMEM;
>>>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>>> -    memcpy(psp->fw_pri_buf, psp->ta_ras_start_addr, 
>>>>>> psp->ta_ras_ucode_size);
>>>>>> +    psp_copy_fw(psp, psp->ta_ras_start_addr, psp->ta_ras_ucode_size);
>>>>>>          psp_prep_ta_load_cmd_buf(cmd,
>>>>>>                     psp->fw_pri_mc_addr,
>>>>>> @@ -1219,8 +1221,7 @@ static int psp_hdcp_load(struct psp_context *psp)
>>>>>>        if (!cmd)
>>>>>>            return -ENOMEM;
>>>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>>> -    memcpy(psp->fw_pri_buf, psp->ta_hdcp_start_addr,
>>>>>> +    psp_copy_fw(psp, psp->ta_hdcp_start_addr,
>>>>>>               psp->ta_hdcp_ucode_size);
>>>>>>          psp_prep_ta_load_cmd_buf(cmd,
>>>>>> @@ -1366,8 +1367,7 @@ static int psp_dtm_load(struct psp_context *psp)
>>>>>>        if (!cmd)
>>>>>>            return -ENOMEM;
>>>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>>> -    memcpy(psp->fw_pri_buf, psp->ta_dtm_start_addr, 
>>>>>> psp->ta_dtm_ucode_size);
>>>>>> +    psp_copy_fw(psp, psp->ta_dtm_start_addr, psp->ta_dtm_ucode_size);
>>>>>>          psp_prep_ta_load_cmd_buf(cmd,
>>>>>>                     psp->fw_pri_mc_addr,
>>>>>> @@ -1507,8 +1507,7 @@ static int psp_rap_load(struct psp_context *psp)
>>>>>>        if (!cmd)
>>>>>>            return -ENOMEM;
>>>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>>> -    memcpy(psp->fw_pri_buf, psp->ta_rap_start_addr, 
>>>>>> psp->ta_rap_ucode_size);
>>>>>> +    psp_copy_fw(psp, psp->ta_rap_start_addr, psp->ta_rap_ucode_size);
>>>>>>          psp_prep_ta_load_cmd_buf(cmd,
>>>>>>                     psp->fw_pri_mc_addr,
>>>>>> @@ -2778,6 +2777,20 @@ static ssize_t psp_usbc_pd_fw_sysfs_write(struct
>>>>>> device *dev,
>>>>>>        return count;
>>>>>>    }
>>>>>>    +void psp_copy_fw(struct psp_context *psp, uint8_t *start_addr, uint32_t
>>>>>> bin_size)
>>>>>> +{
>>>>>> +    int idx;
>>>>>> +
>>>>>> +    if (!drm_dev_enter(&psp->adev->ddev, &idx))
>>>>>> +        return;
>>>>>> +
>>>>>> +    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>>> +    memcpy(psp->fw_pri_buf, start_addr, bin_size);
>>>>>> +
>>>>>> +    drm_dev_exit(idx);
>>>>>> +}
>>>>>> +
>>>>>> +
>>>>>>    static DEVICE_ATTR(usbc_pd_fw, S_IRUGO | S_IWUSR,
>>>>>>               psp_usbc_pd_fw_sysfs_read,
>>>>>>               psp_usbc_pd_fw_sysfs_write);
>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
>>>>>> index da250bc..ac69314 100644
>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
>>>>>> @@ -400,4 +400,7 @@ int psp_init_ta_microcode(struct psp_context *psp,
>>>>>>                  const char *chip_name);
>>>>>>    int psp_get_fw_attestation_records_addr(struct psp_context *psp,
>>>>>>                        uint64_t *output_ptr);
>>>>>> +
>>>>>> +void psp_copy_fw(struct psp_context *psp, uint8_t *start_addr, uint32_t
>>>>>> bin_size);
>>>>>> +
>>>>>>    #endif
>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>>>>>> index 1a612f5..d656494 100644
>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>>>>>> @@ -35,6 +35,8 @@
>>>>>>    #include "amdgpu.h"
>>>>>>    #include "atom.h"
>>>>>>    +#include <drm/drm_drv.h>
>>>>>> +
>>>>>>    /*
>>>>>>     * Rings
>>>>>>     * Most engines on the GPU are fed via ring buffers. Ring
>>>>>> @@ -463,3 +465,71 @@ int amdgpu_ring_test_helper(struct amdgpu_ring *ring)
>>>>>>        ring->sched.ready = !r;
>>>>>>        return r;
>>>>>>    }
>>>>>> +
>>>>>> +void amdgpu_ring_clear_ring(struct amdgpu_ring *ring)
>>>>>> +{
>>>>>> +    int idx;
>>>>>> +    int i = 0;
>>>>>> +
>>>>>> +    if (!drm_dev_enter(&ring->adev->ddev, &idx))
>>>>>> +        return;
>>>>>> +
>>>>>> +    while (i <= ring->buf_mask)
>>>>>> +        ring->ring[i++] = ring->funcs->nop;
>>>>>> +
>>>>>> +    drm_dev_exit(idx);
>>>>>> +
>>>>>> +}
>>>>>> +
>>>>>> +void amdgpu_ring_write(struct amdgpu_ring *ring, uint32_t v)
>>>>>> +{
>>>>>> +    int idx;
>>>>>> +
>>>>>> +    if (!drm_dev_enter(&ring->adev->ddev, &idx))
>>>>>> +        return;
>>>>>> +
>>>>>> +    if (ring->count_dw <= 0)
>>>>>> +        DRM_ERROR("amdgpu: writing more dwords to the ring than 
>>>>>> expected!\n");
>>>>>> +    ring->ring[ring->wptr++ & ring->buf_mask] = v;
>>>>>> +    ring->wptr &= ring->ptr_mask;
>>>>>> +    ring->count_dw--;
>>>>>> +
>>>>>> +    drm_dev_exit(idx);
>>>>>> +}
>>>>>> +
>>>>>> +void amdgpu_ring_write_multiple(struct amdgpu_ring *ring,
>>>>>> +                          void *src, int count_dw)
>>>>>> +{
>>>>>> +    unsigned occupied, chunk1, chunk2;
>>>>>> +    void *dst;
>>>>>> +    int idx;
>>>>>> +
>>>>>> +    if (!drm_dev_enter(&ring->adev->ddev, &idx))
>>>>>> +        return;
>>>>>> +
>>>>>> +    if (unlikely(ring->count_dw < count_dw))
>>>>>> +        DRM_ERROR("amdgpu: writing more dwords to the ring than 
>>>>>> expected!\n");
>>>>>> +
>>>>>> +    occupied = ring->wptr & ring->buf_mask;
>>>>>> +    dst = (void *)&ring->ring[occupied];
>>>>>> +    chunk1 = ring->buf_mask + 1 - occupied;
>>>>>> +    chunk1 = (chunk1 >= count_dw) ? count_dw: chunk1;
>>>>>> +    chunk2 = count_dw - chunk1;
>>>>>> +    chunk1 <<= 2;
>>>>>> +    chunk2 <<= 2;
>>>>>> +
>>>>>> +    if (chunk1)
>>>>>> +        memcpy(dst, src, chunk1);
>>>>>> +
>>>>>> +    if (chunk2) {
>>>>>> +        src += chunk1;
>>>>>> +        dst = (void *)ring->ring;
>>>>>> +        memcpy(dst, src, chunk2);
>>>>>> +    }
>>>>>> +
>>>>>> +    ring->wptr += count_dw;
>>>>>> +    ring->wptr &= ring->ptr_mask;
>>>>>> +    ring->count_dw -= count_dw;
>>>>>> +
>>>>>> +    drm_dev_exit(idx);
>>>>>> +}
>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>>>>>> index accb243..f90b81f 100644
>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>>>>>> @@ -300,53 +300,12 @@ static inline void
>>>>>> amdgpu_ring_set_preempt_cond_exec(struct amdgpu_ring *ring,
>>>>>>        *ring->cond_exe_cpu_addr = cond_exec;
>>>>>>    }
>>>>>>    -static inline void amdgpu_ring_clear_ring(struct amdgpu_ring *ring)
>>>>>> -{
>>>>>> -    int i = 0;
>>>>>> -    while (i <= ring->buf_mask)
>>>>>> -        ring->ring[i++] = ring->funcs->nop;
>>>>>> -
>>>>>> -}
>>>>>> -
>>>>>> -static inline void amdgpu_ring_write(struct amdgpu_ring *ring, uint32_t v)
>>>>>> -{
>>>>>> -    if (ring->count_dw <= 0)
>>>>>> -        DRM_ERROR("amdgpu: writing more dwords to the ring than 
>>>>>> expected!\n");
>>>>>> -    ring->ring[ring->wptr++ & ring->buf_mask] = v;
>>>>>> -    ring->wptr &= ring->ptr_mask;
>>>>>> -    ring->count_dw--;
>>>>>> -}
>>>>>> +void amdgpu_ring_clear_ring(struct amdgpu_ring *ring);
>>>>>>    -static inline void amdgpu_ring_write_multiple(struct amdgpu_ring *ring,
>>>>>> -                          void *src, int count_dw)
>>>>>> -{
>>>>>> -    unsigned occupied, chunk1, chunk2;
>>>>>> -    void *dst;
>>>>>> -
>>>>>> -    if (unlikely(ring->count_dw < count_dw))
>>>>>> -        DRM_ERROR("amdgpu: writing more dwords to the ring than 
>>>>>> expected!\n");
>>>>>> -
>>>>>> -    occupied = ring->wptr & ring->buf_mask;
>>>>>> -    dst = (void *)&ring->ring[occupied];
>>>>>> -    chunk1 = ring->buf_mask + 1 - occupied;
>>>>>> -    chunk1 = (chunk1 >= count_dw) ? count_dw: chunk1;
>>>>>> -    chunk2 = count_dw - chunk1;
>>>>>> -    chunk1 <<= 2;
>>>>>> -    chunk2 <<= 2;
>>>>>> +void amdgpu_ring_write(struct amdgpu_ring *ring, uint32_t v);
>>>>>>    -    if (chunk1)
>>>>>> -        memcpy(dst, src, chunk1);
>>>>>> -
>>>>>> -    if (chunk2) {
>>>>>> -        src += chunk1;
>>>>>> -        dst = (void *)ring->ring;
>>>>>> -        memcpy(dst, src, chunk2);
>>>>>> -    }
>>>>>> -
>>>>>> -    ring->wptr += count_dw;
>>>>>> -    ring->wptr &= ring->ptr_mask;
>>>>>> -    ring->count_dw -= count_dw;
>>>>>> -}
>>>>>> +void amdgpu_ring_write_multiple(struct amdgpu_ring *ring,
>>>>>> +                          void *src, int count_dw);
>>>>>>      int amdgpu_ring_test_helper(struct amdgpu_ring *ring);
>>>>>>    diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
>>>>>> b/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
>>>>>> index bd4248c..b3ce5be 100644
>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
>>>>>> @@ -269,10 +269,8 @@ static int psp_v11_0_bootloader_load_kdb(struct
>>>>>> psp_context *psp)
>>>>>>        if (ret)
>>>>>>            return ret;
>>>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>>> -
>>>>>>        /* Copy PSP KDB binary to memory */
>>>>>> -    memcpy(psp->fw_pri_buf, psp->kdb_start_addr, psp->kdb_bin_size);
>>>>>> +    psp_copy_fw(psp, psp->kdb_start_addr, psp->kdb_bin_size);
>>>>>>          /* Provide the PSP KDB to bootloader */
>>>>>>        WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>>>> @@ -302,10 +300,8 @@ static int psp_v11_0_bootloader_load_spl(struct
>>>>>> psp_context *psp)
>>>>>>        if (ret)
>>>>>>            return ret;
>>>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>>> -
>>>>>>        /* Copy PSP SPL binary to memory */
>>>>>> -    memcpy(psp->fw_pri_buf, psp->spl_start_addr, psp->spl_bin_size);
>>>>>> +    psp_copy_fw(psp, psp->spl_start_addr, psp->spl_bin_size);
>>>>>>          /* Provide the PSP SPL to bootloader */
>>>>>>        WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>>>> @@ -335,10 +331,8 @@ static int psp_v11_0_bootloader_load_sysdrv(struct
>>>>>> psp_context *psp)
>>>>>>        if (ret)
>>>>>>            return ret;
>>>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>>> -
>>>>>>        /* Copy PSP System Driver binary to memory */
>>>>>> -    memcpy(psp->fw_pri_buf, psp->sys_start_addr, psp->sys_bin_size);
>>>>>> +    psp_copy_fw(psp, psp->sys_start_addr, psp->sys_bin_size);
>>>>>>          /* Provide the sys driver to bootloader */
>>>>>>        WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>>>> @@ -371,10 +365,8 @@ static int psp_v11_0_bootloader_load_sos(struct
>>>>>> psp_context *psp)
>>>>>>        if (ret)
>>>>>>            return ret;
>>>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>>> -
>>>>>>        /* Copy Secure OS binary to PSP memory */
>>>>>> -    memcpy(psp->fw_pri_buf, psp->sos_start_addr, psp->sos_bin_size);
>>>>>> +    psp_copy_fw(psp, psp->sos_start_addr, psp->sos_bin_size);
>>>>>>          /* Provide the PSP secure OS to bootloader */
>>>>>>        WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
>>>>>> b/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
>>>>>> index c4828bd..618e5b6 100644
>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
>>>>>> @@ -138,10 +138,8 @@ static int psp_v12_0_bootloader_load_sysdrv(struct
>>>>>> psp_context *psp)
>>>>>>        if (ret)
>>>>>>            return ret;
>>>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>>> -
>>>>>>        /* Copy PSP System Driver binary to memory */
>>>>>> -    memcpy(psp->fw_pri_buf, psp->sys_start_addr, psp->sys_bin_size);
>>>>>> +    psp_copy_fw(psp, psp->sys_start_addr, psp->sys_bin_size);
>>>>>>          /* Provide the sys driver to bootloader */
>>>>>>        WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>>>> @@ -179,10 +177,8 @@ static int psp_v12_0_bootloader_load_sos(struct
>>>>>> psp_context *psp)
>>>>>>        if (ret)
>>>>>>            return ret;
>>>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>>> -
>>>>>>        /* Copy Secure OS binary to PSP memory */
>>>>>> -    memcpy(psp->fw_pri_buf, psp->sos_start_addr, psp->sos_bin_size);
>>>>>> +    psp_copy_fw(psp, psp->sos_start_addr, psp->sos_bin_size);
>>>>>>          /* Provide the PSP secure OS to bootloader */
>>>>>>        WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
>>>>>> b/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
>>>>>> index f2e725f..d0a6cccd 100644
>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
>>>>>> @@ -102,10 +102,8 @@ static int psp_v3_1_bootloader_load_sysdrv(struct
>>>>>> psp_context *psp)
>>>>>>        if (ret)
>>>>>>            return ret;
>>>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>>> -
>>>>>>        /* Copy PSP System Driver binary to memory */
>>>>>> -    memcpy(psp->fw_pri_buf, psp->sys_start_addr, psp->sys_bin_size);
>>>>>> +    psp_copy_fw(psp, psp->sys_start_addr, psp->sys_bin_size);
>>>>>>          /* Provide the sys driver to bootloader */
>>>>>>        WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>>>> @@ -143,10 +141,8 @@ static int psp_v3_1_bootloader_load_sos(struct
>>>>>> psp_context *psp)
>>>>>>        if (ret)
>>>>>>            return ret;
>>>>>>    -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>>> -
>>>>>>        /* Copy Secure OS binary to PSP memory */
>>>>>> -    memcpy(psp->fw_pri_buf, psp->sos_start_addr, psp->sos_bin_size);
>>>>>> +    psp_copy_fw(psp, psp->sos_start_addr, psp->sos_bin_size);
>>>>>>          /* Provide the PSP secure OS to bootloader */
>>>>>>        WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>
>>>
>> _______________________________________________
>> amd-gfx mailing list
>> amd-gfx@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>>
>

[-- Attachment #2: results.log --]
[-- Type: text/x-log, Size: 9299 bytes --]

with drm_dev_enter


[   20.606168 <    0.053014>] [drm:gfx_v8_0_ring_test_ring [amdgpu]] *ERROR* Before write 1M times to scratch register
[   20.914669 <    0.308501>] [drm:gfx_v8_0_ring_test_ring [amdgpu]] *ERROR* After write 1M times to scratch register
[   20.914857 <    0.000188>] [drm:gfx_v8_0_ring_test_ring [amdgpu]] *ERROR* Before write 1M times to scratch register
[   21.224795 <    0.309938>] [drm:gfx_v8_0_ring_test_ring [amdgpu]] *ERROR* After write 1M times to scratch register
[   21.224986 <    0.000191>] [drm:gfx_v8_0_ring_test_ring [amdgpu]] *ERROR* Before write 1M times to scratch register
[   21.533422 <    0.308436>] [drm:gfx_v8_0_ring_test_ring [amdgpu]] *ERROR* After write 1M times to scratch register
[   21.843633 <    0.000164>] [drm:gfx_v8_0_ring_test_ring [amdgpu]] *ERROR* Before write 1M times to scratch register
[   22.152756 <    0.309123>] [drm:gfx_v8_0_ring_test_ring [amdgpu]] *ERROR* After write 1M times to scratch register
[   22.152934 <    0.000178>] [drm:gfx_v8_0_ring_test_ring [amdgpu]] *ERROR* Before write 1M times to scratch register
[   22.462544 <    0.309610>] [drm:gfx_v8_0_ring_test_ring [amdgpu]] *ERROR* After write 1M times to scratch register
[   22.462721 <    0.000177>] [drm:gfx_v8_0_ring_test_ring [amdgpu]] *ERROR* Before write 1M times to scratch register
[   22.771921 <    0.309200>] [drm:gfx_v8_0_ring_test_ring [amdgpu]] *ERROR* After write 1M times to scratch register
[   22.772128 <    0.000207>] [drm:gfx_v8_0_ring_test_ring [amdgpu]] *ERROR* Before write 1M times to scratch register
[   23.081148 <    0.309020>] [drm:gfx_v8_0_ring_test_ring [amdgpu]] *ERROR* After write 1M times to scratch register
[   23.081331 <    0.000183>] [drm:gfx_v8_0_ring_test_ring [amdgpu]] *ERROR* Before write 1M times to scratch register
[   23.391600 <    0.310269>] [drm:gfx_v8_0_ring_test_ring [amdgpu]] *ERROR* After write 1M times to scratch register
[   23.391783 <    0.000183>] [drm:gfx_v8_0_ring_test_ring [amdgpu]] *ERROR* Before write 1M times to scratch register
[   23.702026 <    0.310243>] [drm:gfx_v8_0_ring_test_ring [amdgpu]] *ERROR* After write 1M times to scratch register




[   25.330532 <    0.000112>] [drm:gfx_v9_0_ring_test_ring [amdgpu]] *ERROR* Before write 1M times to scratch register
[   25.642806 <    0.312274>] [drm:gfx_v9_0_ring_test_ring [amdgpu]] *ERROR* After write 1M times to scratch register
[   25.643123 <    0.000317>] [drm:gfx_v9_0_ring_test_ring [amdgpu]] *ERROR* Before write 1M times to scratch register
[   25.954685 <    0.311562>] [drm:gfx_v9_0_ring_test_ring [amdgpu]] *ERROR* After write 1M times to scratch register
[   25.954906 <    0.000221>] [drm:gfx_v9_0_ring_test_ring [amdgpu]] *ERROR* Before write 1M times to scratch register
[   26.266457 <    0.311551>] [drm:gfx_v9_0_ring_test_ring [amdgpu]] *ERROR* After write 1M times to scratch register
[   26.266675 <    0.000218>] [drm:gfx_v9_0_ring_test_ring [amdgpu]] *ERROR* Before write 1M times to scratch register
[   26.579848 <    0.313173>] [drm:gfx_v9_0_ring_test_ring [amdgpu]] *ERROR* After write 1M times to scratch register
[   26.580066 <    0.000218>] [drm:gfx_v9_0_ring_test_ring [amdgpu]] *ERROR* Before write 1M times to scratch register
[   26.891740 <    0.311674>] [drm:gfx_v9_0_ring_test_ring [amdgpu]] *ERROR* After write 1M times to scratch register
[   26.891958 <    0.000218>] [drm:gfx_v9_0_ring_test_ring [amdgpu]] *ERROR* Before write 1M times to scratch register
[   27.203947 <    0.311989>] [drm:gfx_v9_0_ring_test_ring [amdgpu]] *ERROR* After write 1M times to scratch register
[   27.204166 <    0.000219>] [drm:gfx_v9_0_ring_test_ring [amdgpu]] *ERROR* Before write 1M times to scratch register
[   27.516040 <    0.311874>] [drm:gfx_v9_0_ring_test_ring [amdgpu]] *ERROR* After write 1M times to scratch register
[   27.516265 <    0.000225>] [drm:gfx_v9_0_ring_test_ring [amdgpu]] *ERROR* Before write 1M times to scratch register
[   27.828137 <    0.311872>] [drm:gfx_v9_0_ring_test_ring [amdgpu]] *ERROR* After write 1M times to scratch register
[   27.828356 <    0.000219>] [drm:gfx_v9_0_ring_test_ring [amdgpu]] *ERROR* Before write 1M times to scratch register
[   28.140182 <    0.311826>] [drm:gfx_v9_0_ring_test_ring [amdgpu]] *ERROR* After write 1M times to scratch register
[   28.140395 <    0.000213>] [drm:gfx_v9_0_ring_test_ring [amdgpu]] *ERROR* Before write 1M times to scratch register
[   28.452405 <    0.312010>] [drm:gfx_v9_0_ring_test_ring [amdgpu]] *ERROR* After write 1M times to scratch register

without drm_dev_enter


[   28.519096 <    0.049775>] [drm:gfx_v8_0_ring_test_ring [amdgpu]] *ERROR* Before write 1M times to scratch register
[   28.728785 <    0.209689>] [drm:gfx_v8_0_ring_test_ring [amdgpu]] *ERROR* After write 1M times to scratch register
[   28.728946 <    0.000161>] [drm:gfx_v8_0_ring_test_ring [amdgpu]] *ERROR* Before write 1M times to scratch register
[   28.938814 <    0.209868>] [drm:gfx_v8_0_ring_test_ring [amdgpu]] *ERROR* After write 1M times to scratch register
[   28.938979 <    0.000165>] [drm:gfx_v8_0_ring_test_ring [amdgpu]] *ERROR* Before write 1M times to scratch register
[   29.148659 <    0.209680>] [drm:gfx_v8_0_ring_test_ring [amdgpu]] *ERROR* After write 1M times to scratch register
[   29.148809 <    0.000150>] [drm:gfx_v8_0_ring_test_ring [amdgpu]] *ERROR* Before write 1M times to scratch register
[   29.358514 <    0.209705>] [drm:gfx_v8_0_ring_test_ring [amdgpu]] *ERROR* After write 1M times to scratch register
[   29.358664 <    0.000150>] [drm:gfx_v8_0_ring_test_ring [amdgpu]] *ERROR* Before write 1M times to scratch register
[   29.568358 <    0.209694>] [drm:gfx_v8_0_ring_test_ring [amdgpu]] *ERROR* After write 1M times to scratch register
[   29.568508 <    0.000150>] [drm:gfx_v8_0_ring_test_ring [amdgpu]] *ERROR* Before write 1M times to scratch register
[   29.778389 <    0.209881>] [drm:gfx_v8_0_ring_test_ring [amdgpu]] *ERROR* After write 1M times to scratch register
[   29.778539 <    0.000150>] [drm:gfx_v8_0_ring_test_ring [amdgpu]] *ERROR* Before write 1M times to scratch register
[   29.988240 <    0.209701>] [drm:gfx_v8_0_ring_test_ring [amdgpu]] *ERROR* After write 1M times to scratch register
[   29.988391 <    0.000151>] [drm:gfx_v8_0_ring_test_ring [amdgpu]] *ERROR* Before write 1M times to scratch register
[   30.198077 <    0.209686>] [drm:gfx_v8_0_ring_test_ring [amdgpu]] *ERROR* After write 1M times to scratch register
[   30.198228 <    0.000151>] [drm:gfx_v8_0_ring_test_ring [amdgpu]] *ERROR* Before write 1M times to scratch register
[   30.407928 <    0.209700>] [drm:gfx_v8_0_ring_test_ring [amdgpu]] *ERROR* After write 1M times to scratch register
[   30.408079 <    0.000151>] [drm:gfx_v8_0_ring_test_ring [amdgpu]] *ERROR* Before write 1M times to scratch register
[   30.617796 <    0.209717>] [drm:gfx_v8_0_ring_test_ring [amdgpu]] *ERROR* After write 1M times to scratch register





[   32.222242 <    0.000086>] [drm:gfx_v9_0_ring_test_ring [amdgpu]] *ERROR* Before write 1M times to scratch register
[   32.431423 <    0.209181>] [drm:gfx_v9_0_ring_test_ring [amdgpu]] *ERROR* After write 1M times to scratch register
[   32.431834 <    0.000411>] [drm:gfx_v9_0_ring_test_ring [amdgpu]] *ERROR* Before write 1M times to scratch register
[   32.641075 <    0.209241>] [drm:gfx_v9_0_ring_test_ring [amdgpu]] *ERROR* After write 1M times to scratch register
[   32.641268 <    0.000193>] [drm:gfx_v9_0_ring_test_ring [amdgpu]] *ERROR* Before write 1M times to scratch register
[   32.850482 <    0.209214>] [drm:gfx_v9_0_ring_test_ring [amdgpu]] *ERROR* After write 1M times to scratch register
[   32.850671 <    0.000189>] [drm:gfx_v9_0_ring_test_ring [amdgpu]] *ERROR* Before write 1M times to scratch register
[   33.059900 <    0.209229>] [drm:gfx_v9_0_ring_test_ring [amdgpu]] *ERROR* After write 1M times to scratch register
[   33.060092 <    0.000192>] [drm:gfx_v9_0_ring_test_ring [amdgpu]] *ERROR* Before write 1M times to scratch register
[   33.269336 <    0.209244>] [drm:gfx_v9_0_ring_test_ring [amdgpu]] *ERROR* After write 1M times to scratch register
[   33.269526 <    0.000190>] [drm:gfx_v9_0_ring_test_ring [amdgpu]] *ERROR* Before write 1M times to scratch register
[   33.478741 <    0.209215>] [drm:gfx_v9_0_ring_test_ring [amdgpu]] *ERROR* After write 1M times to scratch register
[   33.478931 <    0.000190>] [drm:gfx_v9_0_ring_test_ring [amdgpu]] *ERROR* Before write 1M times to scratch register
[   33.688259 <    0.209328>] [drm:gfx_v9_0_ring_test_ring [amdgpu]] *ERROR* After write 1M times to scratch register
[   33.688449 <    0.000190>] [drm:gfx_v9_0_ring_test_ring [amdgpu]] *ERROR* Before write 1M times to scratch register
[   33.897756 <    0.209307>] [drm:gfx_v9_0_ring_test_ring [amdgpu]] *ERROR* After write 1M times to scratch register
[   33.897946 <    0.000190>] [drm:gfx_v9_0_ring_test_ring [amdgpu]] *ERROR* Before write 1M times to scratch register
[   34.107256 <    0.209310>] [drm:gfx_v9_0_ring_test_ring [amdgpu]] *ERROR* After write 1M times to scratch register
[   34.107445 <    0.000189>] [drm:gfx_v9_0_ring_test_ring [amdgpu]] *ERROR* Before write 1M times to scratch register
[   34.316758 <    0.209313>] [drm:gfx_v9_0_ring_test_ring [amdgpu]] *ERROR* After write 1M times to scratch register
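
The two log lines bracketing each run come from a simple register write loop; a minimal sketch of that benchmark is below (reconstructed for illustration only; the actual testing code was posted as an attachment, and the function name here is made up):

static void amdgpu_scratch_write_bench(struct amdgpu_device *adev,
                                       u32 scratch_reg)
{
        u32 i;

        /* The dmesg timestamp delta between the two DRM_ERROR lines
         * is the time for one million writes. Assumes a harmless
         * scratch register offset in 'scratch_reg'. */
        DRM_ERROR("Before write 1M times to scratch register\n");
        for (i = 0; i < 1000000; i++)
                WREG32(scratch_reg, i); /* goes through the drm_dev_enter
                                         * guard when the patch is applied */
        DRM_ERROR("After write 1M times to scratch register\n");
}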


[-- Attachment #3: Type: text/plain, Size: 154 bytes --]

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply related	[flat|nested] 196+ messages in thread

* Re: [PATCH v4 11/14] drm/amdgpu: Guard against write accesses after device removal
  2021-01-28 17:23               ` Andrey Grodzovsky
@ 2021-01-29 15:16                 ` Christian König
  -1 siblings, 0 replies; 196+ messages in thread
From: Christian König @ 2021-01-29 15:16 UTC (permalink / raw)
  To: Andrey Grodzovsky, Daniel Vetter
  Cc: Greg KH, dri-devel, amd-gfx list, Alex Deucher, Qiang Yu

Am 28.01.21 um 18:23 schrieb Andrey Grodzovsky:
>
> On 1/19/21 1:59 PM, Christian König wrote:
>> Am 19.01.21 um 19:22 schrieb Andrey Grodzovsky:
>>>
>>> On 1/19/21 1:05 PM, Daniel Vetter wrote:
>>>> [SNIP]
>>> So, say, writing in a loop to some harmless scratch register many 
>>> times, both for the plugged
>>> and unplugged case, and measuring the total time delta?
>>
>> I think we should at least measure the following:
>>
>> 1. Writing X times to a scratch reg without your patch.
>> 2. Writing X times to a scratch reg with your patch.
>> 3. Writing X times to a scratch reg with the hardware physically 
>> disconnected.
>>
>> I suggest to repeat that once for Polaris (or older) and once for 
>> Vega or Navi.
>>
>> The SRBM on Polaris is meant to introduce some delay in each access, 
>> so it might react differently than the newer hardware.
>>
>> Christian.
>
>
> See attached results and the testing code. Ran on Polaris (gfx8) and 
> Vega10(gfx9)
>
> In summary, over 1 million WWREG32 calls in a loop, with and without this 
> patch, you get around 10ms of accumulated overhead (so a 0.00001 millisecond 
> penalty for each WWREG32) for using the drm_dev_enter check when writing 
> registers.
>
> P.S. Bullet 3 I cannot test, as I need an eGPU and currently don't have one.

Well, if I'm not completely mistaken that's 100ms of accumulated 
overhead, so around 100ns per write. An even bigger problem is that 
this is a ~67% increase.

I'm not sure how many writes we do during normal operation, but that 
sounds like a bit much. Ideas?

Christian.
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 196+ messages in thread

* Re: [PATCH v4 11/14] drm/amdgpu: Guard against write accesses after device removal
  2021-01-29 15:16                 ` Christian König
@ 2021-01-29 17:35                   ` Andrey Grodzovsky
  -1 siblings, 0 replies; 196+ messages in thread
From: Andrey Grodzovsky @ 2021-01-29 17:35 UTC (permalink / raw)
  To: Christian König, Daniel Vetter
  Cc: Greg KH, dri-devel, amd-gfx list, Alex Deucher, Qiang Yu


On 1/29/21 10:16 AM, Christian König wrote:
> Am 28.01.21 um 18:23 schrieb Andrey Grodzovsky:
>>
>> On 1/19/21 1:59 PM, Christian König wrote:
>>> Am 19.01.21 um 19:22 schrieb Andrey Grodzovsky:
>>>>
>>>> On 1/19/21 1:05 PM, Daniel Vetter wrote:
>>>>> [SNIP]
>>>> So, say, writing in a loop to some harmless scratch register many times, 
>>>> both for the plugged
>>>> and unplugged case, and measuring the total time delta?
>>>
>>> I think we should at least measure the following:
>>>
>>> 1. Writing X times to a scratch reg without your patch.
>>> 2. Writing X times to a scratch reg with your patch.
>>> 3. Writing X times to a scratch reg with the hardware physically disconnected.
>>>
>>> I suggest to repeat that once for Polaris (or older) and once for Vega or Navi.
>>>
>>> The SRBM on Polaris is meant to introduce some delay in each access, so it 
>>> might react differently then the newer hardware.
>>>
>>> Christian.
>>
>>
>> See attached results and the testing code. Ran on Polaris (gfx8) and 
>> Vega10(gfx9)
>>
>> In summary, over 1 million WWREG32 calls in a loop, with and without this 
>> patch, you get around 10ms of accumulated overhead (so a 0.00001 millisecond 
>> penalty for each WWREG32) for using the drm_dev_enter check when writing registers.
>>
>> P.S. Bullet 3 I cannot test, as I need an eGPU and currently don't have one.
>
> Well, if I'm not completely mistaken that's 100ms of accumulated overhead, so 
> around 100ns per write. An even bigger problem is that this is a ~67% increase.


My bad. And the ~67%, relative to what? How did you calculate that?


>
> I'm not sure how many writes we do during normal operation, but that sounds 
> like a bit much. Ideas?


Well, you suggested moving the drm_dev_enter way up, but as I see it the problem 
with this is that it increases the chance of a race where the
device is extracted after we check drm_dev_enter (there is also such a chance 
even when it's placed inside WWREG, but it's lower).
Earlier I proposed that, instead of scattering all those guards all over the 
code, we simply delay the release of system memory pages and the unreserve of
MMIO ranges until after the device itself is gone, i.e. after the last drm device 
reference is dropped. But Daniel opposes delaying the MMIO range unreserve to after
the PCI remove code because, according to him, it will upset the PCI subsystem.

Andrey

>
> Christian.
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 196+ messages in thread

* Re: [PATCH v4 11/14] drm/amdgpu: Guard against write accesses after device removal
  2021-01-29 17:35                   ` Andrey Grodzovsky
@ 2021-01-29 19:25                     ` Christian König
  -1 siblings, 0 replies; 196+ messages in thread
From: Christian König @ 2021-01-29 19:25 UTC (permalink / raw)
  To: Andrey Grodzovsky, Christian König, Daniel Vetter
  Cc: Greg KH, amd-gfx list, dri-devel, Alex Deucher, Qiang Yu

Am 29.01.21 um 18:35 schrieb Andrey Grodzovsky:
>
> On 1/29/21 10:16 AM, Christian König wrote:
>> Am 28.01.21 um 18:23 schrieb Andrey Grodzovsky:
>>>
>>> On 1/19/21 1:59 PM, Christian König wrote:
>>>> Am 19.01.21 um 19:22 schrieb Andrey Grodzovsky:
>>>>>
>>>>> On 1/19/21 1:05 PM, Daniel Vetter wrote:
>>>>>> [SNIP]
>>>>> So, say, writing in a loop to some harmless scratch register 
>>>>> many times, both for the plugged
>>>>> and unplugged case, and measuring the total time delta?
>>>>
>>>> I think we should at least measure the following:
>>>>
>>>> 1. Writing X times to a scratch reg without your patch.
>>>> 2. Writing X times to a scratch reg with your patch.
>>>> 3. Writing X times to a scratch reg with the hardware physically 
>>>> disconnected.
>>>>
>>>> I suggest to repeat that once for Polaris (or older) and once for 
>>>> Vega or Navi.
>>>>
>>>> The SRBM on Polaris is meant to introduce some delay in each 
>>>> access, so it might react differently than the newer hardware.
>>>>
>>>> Christian.
>>>
>>>
>>> See attached results and the testing code. Ran on Polaris (gfx8) and 
>>> Vega10(gfx9)
>>>
>>> In summary, over 1 million WWREG32 calls in a loop, with and without this 
>>> patch, you get around 10ms of accumulated overhead (so a 0.00001 
>>> millisecond penalty for each WWREG32) for using the drm_dev_enter check 
>>> when writing registers.
>>>
>>> P.S. Bullet 3 I cannot test, as I need an eGPU and currently don't have one.
>>
>> Well, if I'm not completely mistaken that's 100ms of accumulated 
>> overhead, so around 100ns per write. An even bigger problem is that 
>> this is a ~67% increase.
>
>
> My bad. And the ~67%, relative to what? How did you calculate that?

My bad, (308501-209689)/209689=47% increase.

>>
>> I'm not sure how many writes we do during normal operation, but that 
>> sounds like a bit much. Ideas?
>
> Well, you suggested moving the drm_dev_enter way up, but as I see it the 
> problem with this is that it increases the chance of a race where the
> device is extracted after we check drm_dev_enter (there is also 
> such a chance even when it's placed inside WWREG, but it's lower).
> Earlier I proposed that, instead of scattering all those guards all 
> over the code, we simply delay the release of system memory pages and 
> the unreserve of
> MMIO ranges until after the device itself is gone, i.e. after the last drm 
> device reference is dropped. But Daniel opposes delaying the MMIO range 
> unreserve to after
> the PCI remove code because, according to him, it will upset the PCI subsystem.

Yeah, that's most likely true as well.

Maybe Daniel has another idea when he's back from vacation.

Christian.

>
> Andrey
>
>>
>> Christian.
> _______________________________________________
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 196+ messages in thread

* Re: [PATCH v4 01/14] drm/ttm: Remap all page faults to per process dummy page.
  2021-01-27 14:29         ` Andrey Grodzovsky
@ 2021-02-02 14:21           ` Daniel Vetter
  -1 siblings, 0 replies; 196+ messages in thread
From: Daniel Vetter @ 2021-02-02 14:21 UTC (permalink / raw)
  To: Andrey Grodzovsky
  Cc: gregkh, ckoenig.leichtzumerken, dri-devel, amd-gfx,
	daniel.vetter, Alexander.Deucher, yuq825

On Wed, Jan 27, 2021 at 09:29:41AM -0500, Andrey Grodzovsky wrote:
> Hey Daniel, just a ping.

Was on vacation last week.

> Andrey
> 
> On 1/25/21 10:28 AM, Andrey Grodzovsky wrote:
> > 
> > On 1/19/21 8:56 AM, Daniel Vetter wrote:
> > > On Mon, Jan 18, 2021 at 04:01:10PM -0500, Andrey Grodzovsky wrote:
> > > > On device removal reroute all CPU mappings to dummy page.
> > > > 
> > > > v3:
> > > > Remove loop to find DRM file and instead access it
> > > > by vma->vm_file->private_data. Move dummy page installation
> > > > into a separate function.
> > > > 
> > > > v4:
> > > > Map the entire BOs VA space into on demand allocated dummy page
> > > > on the first fault for that BO.
> > > > 
> > > > Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> > > > ---
> > > >   drivers/gpu/drm/ttm/ttm_bo_vm.c | 82
> > > > ++++++++++++++++++++++++++++++++++++++++-
> > > >   include/drm/ttm/ttm_bo_api.h    |  2 +
> > > >   2 files changed, 83 insertions(+), 1 deletion(-)
> > > > 
> > > > diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c b/drivers/gpu/drm/ttm/ttm_bo_vm.c
> > > > index 6dc96cf..ed89da3 100644
> > > > --- a/drivers/gpu/drm/ttm/ttm_bo_vm.c
> > > > +++ b/drivers/gpu/drm/ttm/ttm_bo_vm.c
> > > > @@ -34,6 +34,8 @@
> > > >   #include <drm/ttm/ttm_bo_driver.h>
> > > >   #include <drm/ttm/ttm_placement.h>
> > > >   #include <drm/drm_vma_manager.h>
> > > > +#include <drm/drm_drv.h>
> > > > +#include <drm/drm_managed.h>
> > > >   #include <linux/mm.h>
> > > >   #include <linux/pfn_t.h>
> > > >   #include <linux/rbtree.h>
> > > > @@ -380,25 +382,103 @@ vm_fault_t
> > > > ttm_bo_vm_fault_reserved(struct vm_fault *vmf,
> > > >   }
> > > >   EXPORT_SYMBOL(ttm_bo_vm_fault_reserved);
> > > >   +static void ttm_bo_release_dummy_page(struct drm_device *dev, void *res)
> > > > +{
> > > > +    struct page *dummy_page = (struct page *)res;
> > > > +
> > > > +    __free_page(dummy_page);
> > > > +}
> > > > +
> > > > +vm_fault_t ttm_bo_vm_dummy_page(struct vm_fault *vmf, pgprot_t prot)
> > > > +{
> > > > +    struct vm_area_struct *vma = vmf->vma;
> > > > +    struct ttm_buffer_object *bo = vma->vm_private_data;
> > > > +    struct ttm_bo_device *bdev = bo->bdev;
> > > > +    struct drm_device *ddev = bo->base.dev;
> > > > +    vm_fault_t ret = VM_FAULT_NOPAGE;
> > > > +    unsigned long address = vma->vm_start;
> > > > +    unsigned long num_prefault = (vma->vm_end - vma->vm_start) >> PAGE_SHIFT;
> > > > +    unsigned long pfn;
> > > > +    struct page *page;
> > > > +    int i;
> > > > +
> > > > +    /*
> > > > +     * Wait for buffer data in transit, due to a pipelined
> > > > +     * move.
> > > > +     */
> > > > +    ret = ttm_bo_vm_fault_idle(bo, vmf);
> > > > +    if (unlikely(ret != 0))
> > > > +        return ret;
> > > > +
> > > > +    /* Allocate a new dummy page and map the entire VA range of this VMA to it */
> > > > +    page = alloc_page(GFP_KERNEL | __GFP_ZERO);
> > > > +    if (!page)
> > > > +        return VM_FAULT_OOM;
> > > > +
> > > > +    pfn = page_to_pfn(page);
> > > > +
> > > > +    /*
> > > > +     * Prefault the entire VMA range right away to avoid further faults
> > > > +     */
> > > > +    for (i = 0; i < num_prefault; ++i) {
> > > > +
> > > > +        if (unlikely(address >= vma->vm_end))
> > > > +            break;
> > > > +
> > > > +        if (vma->vm_flags & VM_MIXEDMAP)
> > > > +            ret = vmf_insert_mixed_prot(vma, address,
> > > > +                            __pfn_to_pfn_t(pfn, PFN_DEV),
> > > > +                            prot);
> > > > +        else
> > > > +            ret = vmf_insert_pfn_prot(vma, address, pfn, prot);
> > > > +
> > > > +        /* Never error on prefaulted PTEs */
> > > > +        if (unlikely((ret & VM_FAULT_ERROR))) {
> > > > +            if (i == 0)
> > > > +                return VM_FAULT_NOPAGE;
> > > > +            else
> > > > +                break;
> > > > +        }
> > > > +
> > > > +        address += PAGE_SIZE;
> > > > +    }
> > > > +
> > > > +    /* Set the page to be freed using drmm release action */
> > > > +    if (drmm_add_action_or_reset(ddev, ttm_bo_release_dummy_page, page))
> > > > +        return VM_FAULT_OOM;
> > > > +
> > > > +    return ret;
> > > > +}
> > > > +EXPORT_SYMBOL(ttm_bo_vm_dummy_page);
> > > I think we can lift this entire thing (once the ttm_bo_vm_fault_idle is
> > > gone) to the drm level, since there's nothing ttm-specific in here. Probably stuff
> > > it into drm_gem.c (but really it's not even gem-specific, it's a fully
> > > generic "replace this vma with dummy pages pls" function).
> > 
> > 
> > Once I started on this I noticed that drmm_add_action_or_reset depends
> > on struct drm_device *ddev = bo->base.dev, and bo is the private data
> > we embed at the TTM level when setting up the mapping. So this forces us
> > to move drmm_add_action_or_reset out of this function into every client who uses
> > this function, and then you separate the logic of page allocation from
> > its release.
> > So I suggest we keep it as is.

Uh, disappointing. Thing is, ttm essentially means drm devices with gem, except for
vmwgfx, which is a drm_device without gem. And I think one of the
remaining ttm refactors in this area is to move ttm_device over into
drm_device at some point, and then we'd have bo->base.dev always set to
something that drmm_add_action_or_reset can use.

I guess hand-rolling for now and jotting this down as a TODO item is fine
too, but it would be good to get this addressed, since that's another reason
to do this. Maybe sync with Christian on how best to do this.
-Daniel
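
(For illustration, the fully generic form suggested above might look roughly like the sketch below; the names are hypothetical, no such helper exists in drm_gem.c, and the VM_MIXEDMAP and fault-idle handling are omitted for brevity.)

/* Hypothetical generic helper, sketched from the review comment above;
 * this is not an existing drm API. */
static void drm_vma_dummy_page_release(struct drm_device *dev, void *res)
{
        __free_page((struct page *)res);
}

vm_fault_t drm_vma_map_dummy_page(struct drm_device *ddev,
                                  struct vm_fault *vmf, pgprot_t prot)
{
        struct vm_area_struct *vma = vmf->vma;
        unsigned long addr, pfn;
        struct page *page;
        vm_fault_t ret = VM_FAULT_NOPAGE;

        page = alloc_page(GFP_KERNEL | __GFP_ZERO);
        if (!page)
                return VM_FAULT_OOM;
        pfn = page_to_pfn(page);

        /* Map every address in the VMA to the same zeroed dummy page. */
        for (addr = vma->vm_start; addr < vma->vm_end; addr += PAGE_SIZE) {
                ret = vmf_insert_pfn_prot(vma, addr, pfn, prot);
                if (ret & VM_FAULT_ERROR)
                        break;
        }

        /* Free the dummy page together with the drm_device. */
        if (drmm_add_action_or_reset(ddev, drm_vma_dummy_page_release, page))
                return VM_FAULT_OOM;

        return ret;
}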

> > 
> > Andrey
> > 
> > 
> > > 
> > > Aside from this nit I think the overall approach you have here is starting
> > > to look good. Lots of work & polish still ahead, but imo we're getting there
> > > and can start landing stuff soon.
> > > -Daniel
> > > 
> > > > +
> > > >   vm_fault_t ttm_bo_vm_fault(struct vm_fault *vmf)
> > > >   {
> > > >       struct vm_area_struct *vma = vmf->vma;
> > > >       pgprot_t prot;
> > > >       struct ttm_buffer_object *bo = vma->vm_private_data;
> > > > +    struct drm_device *ddev = bo->base.dev;
> > > >       vm_fault_t ret;
> > > > +    int idx;
> > > >         ret = ttm_bo_vm_reserve(bo, vmf);
> > > >       if (ret)
> > > >           return ret;
> > > >         prot = vma->vm_page_prot;
> > > > -    ret = ttm_bo_vm_fault_reserved(vmf, prot, TTM_BO_VM_NUM_PREFAULT, 1);
> > > > +    if (drm_dev_enter(ddev, &idx)) {
> > > > +        ret = ttm_bo_vm_fault_reserved(vmf, prot, TTM_BO_VM_NUM_PREFAULT, 1);
> > > > +        drm_dev_exit(idx);
> > > > +    } else {
> > > > +        ret = ttm_bo_vm_dummy_page(vmf, prot);
> > > > +    }
> > > >       if (ret == VM_FAULT_RETRY && !(vmf->flags & FAULT_FLAG_RETRY_NOWAIT))
> > > >           return ret;
> > > >         dma_resv_unlock(bo->base.resv);
> > > >         return ret;
> > > >   }
> > > >   EXPORT_SYMBOL(ttm_bo_vm_fault);
> > > >   diff --git a/include/drm/ttm/ttm_bo_api.h b/include/drm/ttm/ttm_bo_api.h
> > > > index e17be32..12fb240 100644
> > > > --- a/include/drm/ttm/ttm_bo_api.h
> > > > +++ b/include/drm/ttm/ttm_bo_api.h
> > > > @@ -643,4 +643,6 @@ void ttm_bo_vm_close(struct vm_area_struct *vma);
> > > >   int ttm_bo_vm_access(struct vm_area_struct *vma, unsigned long addr,
> > > >                void *buf, int len, int write);
> > > >   +vm_fault_t ttm_bo_vm_dummy_page(struct vm_fault *vmf, pgprot_t prot);
> > > > +
> > > >   #endif
> > > > -- 
> > > > 2.7.4
> > > > 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 196+ messages in thread

* Re: [PATCH v4 11/14] drm/amdgpu: Guard against write accesses after device removal
  2021-01-29 19:25                     ` Christian König
@ 2021-02-05 16:22                       ` Andrey Grodzovsky
  -1 siblings, 0 replies; 196+ messages in thread
From: Andrey Grodzovsky @ 2021-02-05 16:22 UTC (permalink / raw)
  To: christian.koenig, Daniel Vetter
  Cc: Greg KH, amd-gfx list, dri-devel, Alex Deucher, Qiang Yu

Daniel, ping. Also, please refer to the other thread with Bjorn from pci-dev
on the same topic I added you to.

Andrey

On 1/29/21 2:25 PM, Christian König wrote:
> Am 29.01.21 um 18:35 schrieb Andrey Grodzovsky:
>>
>> On 1/29/21 10:16 AM, Christian König wrote:
>>> Am 28.01.21 um 18:23 schrieb Andrey Grodzovsky:
>>>>
>>>> On 1/19/21 1:59 PM, Christian König wrote:
>>>>> Am 19.01.21 um 19:22 schrieb Andrey Grodzovsky:
>>>>>>
>>>>>> On 1/19/21 1:05 PM, Daniel Vetter wrote:
>>>>>>> [SNIP]
>>>>>> So, say, writing in a loop to some harmless scratch register many times, 
>>>>>> both for the plugged
>>>>>> and unplugged case, and measuring the total time delta?
>>>>>
>>>>> I think we should at least measure the following:
>>>>>
>>>>> 1. Writing X times to a scratch reg without your patch.
>>>>> 2. Writing X times to a scratch reg with your patch.
>>>>> 3. Writing X times to a scratch reg with the hardware physically disconnected.
>>>>>
>>>>> I suggest to repeat that once for Polaris (or older) and once for Vega or 
>>>>> Navi.
>>>>>
>>>>> The SRBM on Polaris is meant to introduce some delay in each access, so it 
>>>>> might react differently than the newer hardware.
>>>>>
>>>>> Christian.
>>>>
>>>>
>>>> See attached results and the testing code. Ran on Polaris (gfx8) and 
>>>> Vega10(gfx9)
>>>>
>>>> In summary, over 1 million WWREG32 calls in a loop, with and without this 
>>>> patch, you get around 10ms of accumulated overhead (so a 0.00001 millisecond 
>>>> penalty for each WWREG32) for using the drm_dev_enter check when writing registers.
>>>>
>>>> P.S. Bullet 3 I cannot test, as I need an eGPU and currently don't have one.
>>>
>>> Well, if I'm not completely mistaken that's 100ms of accumulated overhead, 
>>> so around 100ns per write. An even bigger problem is that this is a ~67% 
>>> increase.
>>
>>
>> My bad. And the ~67%, relative to what? How did you calculate that?
> 
> My bad, (308501-209689)/209689=47% increase.
> 
>>>
>>> I'm not sure how many writes we do during normal operation, but that sounds 
>>> like a bit much. Ideas?
>>
>> Well, you suggested moving the drm_dev_enter way up, but as I see it the problem 
>> with this is that it increases the chance of a race where the
>> device is extracted after we check drm_dev_enter (there is also such a 
>> chance even when it's placed inside WWREG, but it's lower).
>> Earlier I proposed that, instead of scattering all those guards all over 
>> the code, we simply delay the release of system memory pages and the unreserve of
>> MMIO ranges until after the device itself is gone, i.e. after the last drm device 
>> reference is dropped. But Daniel opposes delaying the MMIO range unreserve to after
>> the PCI remove code because, according to him, it will upset the PCI subsystem.
> 
> Yeah, that's most likely true as well.
> 
> Maybe Daniel has another idea when he's back from vacation.
> 
> Christian.
> 
>>
>> Andrey
>>
>>>
>>> Christian.
>> _______________________________________________
>> amd-gfx mailing list
>> amd-gfx@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>>
> 
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 196+ messages in thread

* Re: [PATCH v4 11/14] drm/amdgpu: Guard against write accesses after device removal
  2021-02-05 16:22                       ` Andrey Grodzovsky
@ 2021-02-05 22:10                         ` Daniel Vetter
  -1 siblings, 0 replies; 196+ messages in thread
From: Daniel Vetter @ 2021-02-05 22:10 UTC (permalink / raw)
  To: Andrey Grodzovsky
  Cc: Greg KH, dri-devel, amd-gfx list, Alex Deucher, Qiang Yu,
	Christian König

On Fri, Feb 5, 2021 at 5:22 PM Andrey Grodzovsky
<Andrey.Grodzovsky@amd.com> wrote:
>
> Daniel, ping. Also, please refer to the other thread with Bjorn from pci-dev
> on the same topic I added you to.

Summarizing my take from over there, plus maybe some more
clarification. There are two problems:

- You must guarantee that after the ->remove callback of your driver
is finished, there's no more mmio or any other hw access. A
combination of stopping stuff and drm_dev_enter/exit can help with
that. This prevents the use-after-free issue.

- For the actual hotunplug time, i.e. anything that can run while your
driver is in use up to the point where the ->remove callback has finished
stopping hw access, you must guarantee that the code doesn't blow up when it
gets bogus reads (in the form of 0xff values). drm_dev_enter/exit
can't help you with that. Plus you should make sure that we're not
spending forever waiting for a big pile of mmio accesses all timing out
because you never bail out; some coarse-grained drm_dev_enter/exit
might help here.

Plus finally the userspace access problem: you must guarantee that
after ->remove has finished, none of the uapi or cross-driver
access points (driver ioctl, dma-buf, dma-fence, anything else that
hangs around) can reach the data structures/memory mappings/whatever
which have been released as part of your ->remove callback.
drm_dev_enter/exit is again the tool of choice here.

So you have to use drm_dev_enter/exit for some of the problems we face
on hotunplug, but it's not the tool that can handle the actual hw
hotunplug race conditions for you.
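
(To make the split concrete: the per-access guard discussed in this thread boils down to something like the sketch below. The function name is made up and the real amdgpu register helpers handle more cases; this only shows the drm_dev_enter/exit pattern.)

/* Sketch: a drm_dev_enter/exit-guarded register write. This stops hw
 * access once the device is unplugged, but does nothing about a
 * physical removal racing in right after the check. */
static void amdgpu_wreg_guarded(struct amdgpu_device *adev, u32 reg, u32 v)
{
        int idx;

        if (!drm_dev_enter(adev_to_drm(adev), &idx))
                return; /* device gone, drop the write */

        writel(v, adev->rmmio + (reg << 2));
        drm_dev_exit(idx);
}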

Unfortunately the hw hotunplug race condition is an utter pain to
test, since essentially you need to validate your driver against
spurious 0xff reads at any moment. And I don't even have a clever idea
to simulate this, e.g. by forcefully replacing the iobar mapping: What
we'd need is a mapping that allows reads (so we can fill a page with
0xff and use that everywhere), but instead of rejecting writes, allows
them, but drops them (so that the 0xff stays intact). Maybe we could
simulate this with some kernel debug tricks (kinda like mmiotrace)
with a read-only mapping and dropping every write every time we fault.
But ugh ...

Otoh validating an entire driver like amdgpu without such a trick
against 0xff reads is practically impossible. So maybe you need to add
this as one of the tasks here?
-Daniel
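
(One common way PCI drivers cope with the spurious-0xff problem at individual call sites, shown here as a rough sketch with a made-up helper name, is to treat an all-ones read as a hint that the device may be gone:)

/* Sketch: flag all-ones reads as a possible surprise removal. Note
 * that 0xffffffff can also be a legitimate register value, so callers
 * need a register that is known to never read as all ones to
 * actually confirm removal. */
static u32 amdgpu_rreg_checked(struct amdgpu_device *adev, u32 reg)
{
        u32 val = readl(adev->rmmio + (reg << 2));

        if (val == 0xffffffff)
                dev_warn(adev->dev,
                         "all-ones read at 0x%x, device may be unplugged\n",
                         reg);

        return val;
}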

>
> Andrey
>
> On 1/29/21 2:25 PM, Christian König wrote:
> > Am 29.01.21 um 18:35 schrieb Andrey Grodzovsky:
> >>
> >> On 1/29/21 10:16 AM, Christian König wrote:
> >>> Am 28.01.21 um 18:23 schrieb Andrey Grodzovsky:
> >>>>
> >>>> On 1/19/21 1:59 PM, Christian König wrote:
> >>>>> Am 19.01.21 um 19:22 schrieb Andrey Grodzovsky:
> >>>>>>
> >>>>>> On 1/19/21 1:05 PM, Daniel Vetter wrote:
> >>>>>>> [SNIP]
> >>>>>> So, say, writing in a loop to some harmless scratch register many times,
> >>>>>> both for the plugged
> >>>>>> and unplugged case, and measuring the total time delta?
> >>>>>
> >>>>> I think we should at least measure the following:
> >>>>>
> >>>>> 1. Writing X times to a scratch reg without your patch.
> >>>>> 2. Writing X times to a scratch reg with your patch.
> >>>>> 3. Writing X times to a scratch reg with the hardware physically disconnected.
> >>>>>
> >>>>> I suggest to repeat that once for Polaris (or older) and once for Vega or
> >>>>> Navi.
> >>>>>
> >>>>> The SRBM on Polaris is meant to introduce some delay in each access, so it
> >>>>> might react differently than the newer hardware.
> >>>>>
> >>>>> Christian.
> >>>>
> >>>>
> >>>> See attached results and the testing code. Ran on Polaris (gfx8) and
> >>>> Vega10(gfx9)
> >>>>
> >>>> In summary, over 1 million WWREG32 calls in a loop, with and without this
> >>>> patch, you get around 10ms of accumulated overhead (so a 0.00001 millisecond
> >>>> penalty for each WWREG32) for using the drm_dev_enter check when writing registers.
> >>>>
> >>>> P.S. Bullet 3 I cannot test, as I need an eGPU and currently don't have one.
> >>>
> >>> Well, if I'm not completely mistaken that's 100ms of accumulated overhead,
> >>> so around 100ns per write. An even bigger problem is that this is a ~67%
> >>> increase.
> >>
> >>
> >> My bad. And the ~67%, relative to what? How did you calculate that?
> >
> > My bad, (308501-209689)/209689=47% increase.
> >
> >>>
> >>> I'm not sure how many writes we do during normal operation, but that sounds
> >>> like a bit much. Ideas?
> >>
> >> Well, you suggested moving the drm_dev_enter way up, but as I see it the problem
> >> with this is that it increases the chance of a race where the
> >> device is extracted after we check drm_dev_enter (there is also such a
> >> chance even when it's placed inside WWREG, but it's lower).
> >> Earlier I proposed that, instead of scattering all those guards all over
> >> the code, we simply delay the release of system memory pages and the unreserve of
> >> MMIO ranges until after the device itself is gone, i.e. after the last drm device
> >> reference is dropped. But Daniel opposes delaying the MMIO range unreserve to after
> >> the PCI remove code because, according to him, it will upset the PCI subsystem.
> >
> > Yeah, that's most likely true as well.
> >
> > Maybe Daniel has another idea when he's back from vacation.
> >
> > Christian.
> >
> >>
> >> Andrey
> >>
> >>>
> >>> Christian.
> >> _______________________________________________
> >> amd-gfx mailing list
> >> amd-gfx@lists.freedesktop.org
> >> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
> >>
> >



-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 196+ messages in thread

* Re: [PATCH v4 11/14] drm/amdgpu: Guard against write accesses after device removal
@ 2021-02-05 22:10                         ` Daniel Vetter
  0 siblings, 0 replies; 196+ messages in thread
From: Daniel Vetter @ 2021-02-05 22:10 UTC (permalink / raw)
  To: Andrey Grodzovsky
  Cc: Rob Herring, Greg KH, dri-devel, Anholt, Eric, Pekka Paalanen,
	amd-gfx list, Alex Deucher, Qiang Yu, Wentland, Harry,
	Christian König, Lucas Stach

On Fri, Feb 5, 2021 at 5:22 PM Andrey Grodzovsky
<Andrey.Grodzovsky@amd.com> wrote:
>
> Daniel, ping. Also, please refer to the other thread with Bjorn from pci-dev
> on the same topic I added you to.

Summarizing my take over there for here plus maybe some more
clarification. There's two problems:

- You must guarantee that after the ->remove callback of your driver
is finished, there's no more mmio or any other hw access. A
combination of stopping stuff and drm_dev_enter/exit can help with
that. This prevents the use-after-free issue.

- For the actual hotunplug time, i.e. anything that can run while your
driver is used up to the point where ->remove callback has finished
stopp hw access you must guarantee that code doesn't blow up when it
gets bogus reads (in the form of 0xff values). drm_dev_enter/exit
can't help you with that. Plus you should make sure that we're not
spending forever waiting for a big pile of mmio access all to time out
because you never bail out - some coarse-grained drm_dev_enter/exit
might help here.

Plus finally the userspace access problem: You must guarantee that
after ->remove has finished that none of the uapi or cross-driver
access points (driver ioctl, dma-buf, dma-fence, anything else that
hangs around) can reach the data structures/memory mappings/whatever
which have been released as part of your ->remove callback.
drm_dev_enter/exit is again the tool of choice here.

So you have to use drm_dev_enter/exit for some of the problems we face
on hotunplug, but it's not the tool that can handle the actual hw
hotunplug race conditions for you.

Unfortunately the hw hotunplug race condition is an utter pain to
test, since essentially you need to validate your driver against
spurious 0xff reads at any moment. And I don't even have a clever idea
to simulate this, e.g. by forcefully replacing the iobar mapping: What
we'd need is a mapping that allows reads (so we can fill a page with
0xff and use that everywhere), but instead of rejecting writes, allows
them, but drops them (so that the 0xff stays intact). Maybe we could
simulate this with some kernel debug tricks (kinda like mmiotrace)
with a read-only mapping and dropping every write every time we fault.
But ugh ...

Otoh validating an entire driver like amdgpu without such a trick
against 0xff reads is practically impossible. So maybe you need to add
this as one of the tasks here?
-Daniel

>
> Andrey
>
> On 1/29/21 2:25 PM, Christian König wrote:
> > Am 29.01.21 um 18:35 schrieb Andrey Grodzovsky:
> >>
> >> On 1/29/21 10:16 AM, Christian König wrote:
> >>> Am 28.01.21 um 18:23 schrieb Andrey Grodzovsky:
> >>>>
> >>>> On 1/19/21 1:59 PM, Christian König wrote:
> >>>>> Am 19.01.21 um 19:22 schrieb Andrey Grodzovsky:
> >>>>>>
> >>>>>> On 1/19/21 1:05 PM, Daniel Vetter wrote:
> >>>>>>> [SNIP]
> >>>>>> So say writing in a loop to some harmless scratch register for many times
> >>>>>> both for plugged
> >>>>>> and unplugged case and measure total time delta ?
> >>>>>
> >>>>> I think we should at least measure the following:
> >>>>>
> >>>>> 1. Writing X times to a scratch reg without your patch.
> >>>>> 2. Writing X times to a scratch reg with your patch.
> >>>>> 3. Writing X times to a scratch reg with the hardware physically disconnected.
> >>>>>
> >>>>> I suggest to repeat that once for Polaris (or older) and once for Vega or
> >>>>> Navi.
> >>>>>
> >>>>> The SRBM on Polaris is meant to introduce some delay in each access, so it
> >>>>> might react differently then the newer hardware.
> >>>>>
> >>>>> Christian.
> >>>>
> >>>>
> >>>> See attached results and the testing code. Ran on Polaris (gfx8) and
> >>>> Vega10(gfx9)
> >>>>
> >>>> In summary, over 1 million WWREG32 in loop with and without this patch you
> >>>> get around 10ms of accumulated overhead ( so 0.00001 millisecond penalty for
> >>>> each WWREG32) for using drm_dev_enter check when writing registers.
> >>>>
> >>>> P.S Bullet 3 I cannot test as I need eGPU and currently I don't have one.
> >>>
> >>> Well if I'm not completely mistaken that are 100ms of accumulated overhead.
> >>> So around 100ns per write. And even bigger problem is that this is a ~67%
> >>> increase.
> >>
> >>
> >> My bad, and 67% from what ? How u calculate ?
> >
> > My bad, (308501-209689)/209689=47% increase.
> >
> >>>
> >>> I'm not sure how many write we do during normal operation, but that sounds
> >>> like a bit much. Ideas?
> >>
> >> Well, you suggested to move the drm_dev_enter way up, but as I see it the problem
> >> with this is that it increases the chance of a race where the
> >> device is extracted after we check for drm_dev_enter (there is also such a
> >> chance even when it's placed inside WWREG, but it's lower).
> >> Earlier I proposed that instead of doing all those guards scattered all over
> >> the code we simply delay the release of system memory pages and the unreserve of
> >> MMIO ranges until after the device itself is gone, after the last drm device
> >> reference is dropped. But Daniel opposes delaying the MMIO range unreserve to
> >> after the PCI remove code because according to him it will upset the PCI subsystem.
> >
> > Yeah, that's most likely true as well.
> >
> > Maybe Daniel has another idea when he's back from vacation.
> >
> > Christian.
> >
> >>
> >> Andrey
> >>
> >>>
> >>> Christian.
> >> _______________________________________________
> >> amd-gfx mailing list
> >> amd-gfx@lists.freedesktop.org
> >> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
> >>
> >



-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 196+ messages in thread

* Re: [PATCH v4 11/14] drm/amdgpu: Guard against write accesses after device removal
  2021-02-05 22:10                         ` Daniel Vetter
@ 2021-02-05 23:09                           ` Andrey Grodzovsky
  -1 siblings, 0 replies; 196+ messages in thread
From: Andrey Grodzovsky @ 2021-02-05 23:09 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Greg KH, dri-devel, amd-gfx list, Alex Deucher, Qiang Yu,
	Christian König



On 2/5/21 5:10 PM, Daniel Vetter wrote:
> On Fri, Feb 5, 2021 at 5:22 PM Andrey Grodzovsky
> <Andrey.Grodzovsky@amd.com> wrote:
>>
>> Daniel, ping. Also, please refer to the other thread with Bjorn from pci-dev
>> on the same topic I added you to.
> 
> Summarizing my take over there for here plus maybe some more
> clarification. There are two problems:
> 
> - You must guarantee that after the ->remove callback of your driver
> is finished, there's no more mmio or any other hw access. A
> combination of stopping stuff and drm_dev_enter/exit can help with
> that. This prevents the use-after-free issue.
> 
> - For the actual hotunplug time, i.e. anything that can run while your
> driver is in use up to the point where the ->remove callback has finished
> stopping hw access, you must guarantee that code doesn't blow up when it
> gets bogus reads (in the form of 0xff values). drm_dev_enter/exit
> can't help you with that. Plus you should make sure that we're not
> spending forever waiting for a big pile of mmio accesses all timing out
> because you never bail out - some coarse-grained drm_dev_enter/exit
> might help here.
> 
> Plus finally the userspace access problem: You must guarantee that
> after ->remove has finished, none of the uapi or cross-driver
> access points (driver ioctl, dma-buf, dma-fence, anything else that
> hangs around) can reach the data structures/memory mappings/whatever
> which have been released as part of your ->remove callback.
> drm_dev_enter/exit is again the tool of choice here.
> 
> So you have to use drm_dev_enter/exit for some of the problems we face
> on hotunplug, but it's not the tool that can handle the actual hw
> hotunplug race conditions for you.
> 
> Unfortunately the hw hotunplug race condition is an utter pain to
> test, since essentially you need to validate your driver against
> spurious 0xff reads at any moment. And I don't even have a clever idea
> to simulate this, e.g. by forcefully replacing the iobar mapping: What
> we'd need is a mapping that allows reads (so we can fill a page with
> 0xff and use that everywhere), but instead of rejecting writes, allows
> them, but drops them (so that the 0xff stays intact). Maybe we could
> simulate this with some kernel debug tricks (kinda like mmiotrace)
> with a read-only mapping and dropping every write every time we fault.
> But ugh ...
> 
> Otoh validating an entire driver like amdgpu without such a trick
> against 0xff reads is practically impossible. So maybe you need to add
> this as one of the tasks here?
> -Daniel

Not sure it's not a dumb idea but still, worth asking - what if I
just quietly return early from the .remove callback without doing
anything there? The driver will not be aware that the device is
removed and will at least try to continue working as usual, including
IOCTLs, job scheduling etc. On the other hand, all MMIO read accesses
will start returning ~0. Regarding rejecting writes - I don't see
anywhere that we test for the result of a write (e.g. amdgpu_mm_wreg8),
so it seems they would just seamlessly go through... Or is it the
pci_dev that will be freed by the PCI core itself, so that I will
immediately crash?
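
Roughly what I mean, as a hypothetical sketch (not the real amdgpu
hook):

#include <linux/pci.h>

static void my_pci_remove(struct pci_dev *pdev)
{
        /*
         * Deliberately do no teardown at all: the driver stays unaware
         * of the removal, keeps accepting IOCTLs and scheduling jobs,
         * and every MMIO read simply comes back as ~0.
         */
}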

Andrey

> [SNIP]
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 196+ messages in thread

* Re: [PATCH v4 11/14] drm/amdgpu: Guard against write accesses after device removal
  2021-02-05 23:09                           ` Andrey Grodzovsky
@ 2021-02-06 14:18                             ` Daniel Vetter
  -1 siblings, 0 replies; 196+ messages in thread
From: Daniel Vetter @ 2021-02-06 14:18 UTC (permalink / raw)
  To: Andrey Grodzovsky
  Cc: Greg KH, dri-devel, amd-gfx list, Alex Deucher, Qiang Yu,
	Christian König

On Sat, Feb 6, 2021 at 12:09 AM Andrey Grodzovsky
<Andrey.Grodzovsky@amd.com> wrote:
>
>
>
> On 2/5/21 5:10 PM, Daniel Vetter wrote:
> > [SNIP]
>
> Not sure it's not a dumb idea but still, worth asking - what if I
> just quietly return early from the .remove callback without doing
> anything there? The driver will not be aware that the device is
> removed and will at least try to continue working as usual, including
> IOCTLs, job scheduling etc. On the other hand, all MMIO read accesses
> will start returning ~0. Regarding rejecting writes - I don't see
> anywhere that we test for the result of a write (e.g. amdgpu_mm_wreg8),
> so it seems they would just seamlessly go through... Or is it the
> pci_dev that will be freed by the PCI core itself, so that I will
> immediately crash?

This still requires that you physically unplug the device, so it's not
something you can do in CI. Plus it doesn't allow you to easily fake a
hotunplug in the middle of something interesting like an atomic
modeset commit. If you instead punch out the mmio mapping with some
pte trick, you can intercept the faults and count down until you
actually switch over to only returning 0xff. This allows you to sweep
through entire complex execution flows so that you have a guarantee
you've actually caught everything.
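
As a very rough sketch of that countdown idea (all names invented
here, and the actual fault interception is the hard part that's left
out):

#include <linux/atomic.h>
#include <linux/io.h>

static atomic_t fake_unplug_countdown = ATOMIC_INIT(1000);

static u32 intercepted_rreg(void __iomem *addr)
{
        /* Every intercepted access brings the fake unplug closer ... */
        if (atomic_dec_if_positive(&fake_unplug_countdown) < 0)
                return 0xffffffff;      /* ... then reads see all-ones */

        return readl(addr);             /* device still "present" */
}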

If otoh you just hotunplug and don't clean up (or equivalent, insert a
long sleep at the beginning of your ->remove hook) then all you can
verify is that at the beginning of each operation there's a check that
bails out.
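
I.e. the quick-and-dirty variant would be something like this (sketch
only, not real code):

#include <linux/delay.h>
#include <linux/pci.h>

static void my_pci_remove(struct pci_dev *pdev)
{
        /* Stall teardown so everything races against dead hw for a while. */
        msleep(60 * 1000);

        /* ... only then would the real cleanup start ... */
}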

It's better than nothing for prototyping, but I don't think it's
useful in a CI setting to ensure stuff stays fixed.
-Daniel


> [SNIP]



-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 196+ messages in thread

* Re: [PATCH v4 11/14] drm/amdgpu: Guard against write accesses after device removal
  2021-02-05 22:10                         ` Daniel Vetter
@ 2021-02-07 21:28                           ` Andrey Grodzovsky
  -1 siblings, 0 replies; 196+ messages in thread
From: Andrey Grodzovsky @ 2021-02-07 21:28 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Greg KH, dri-devel, amd-gfx list, Alex Deucher, Qiang Yu,
	Christian König



On 2/5/21 5:10 PM, Daniel Vetter wrote:
> On Fri, Feb 5, 2021 at 5:22 PM Andrey Grodzovsky
> <Andrey.Grodzovsky@amd.com> wrote:
>>
>> Daniel, ping. Also, please refer to the other thread with Bjorn from pci-dev
>> on the same topic I added you to.
> 
> [SNIP]
> 
> Unfortunately the hw hotunplug race condition is an utter pain to
> test, since essentially you need to validate your driver against
> spurious 0xff reads at any moment. And I don't even have a clever idea
> to simulate this, e.g. by forcefully replacing the iobar mapping: What
> we'd need is a mapping that allows reads (so we can fill a page with
> 0xff and use that everywhere), but instead of rejecting writes, allows
> them, but drops them (so that the 0xff stays intact). Maybe we could
> simulate this with some kernel debug tricks (kinda like mmiotrace)
> with a read-only mapping and dropping every write every time we fault.

Clarification - as far as I know there are no page fault handlers for kernel
mappings. And we are talking about kernel mappings here, right? If there were,
I could solve all those issues the same way I do for user mappings, by
invalidating all existing mappings in the kernel (both kmaps and ioremaps) and
inserting a dummy zero or ~0 filled page instead.
Also, I assume forcefully remapping the IO BAR to a ~0 filled page would involve
the ioremap API, and it's not something that I think can be easily done, according
to an answer I got on a related topic a few weeks ago:
https://www.spinics.net/lists/linux-pci/msg103396.html (that was the only reply
I got)


> But ugh ...
> 
> Otoh validating an entire driver like amdgpu without such a trick
> against 0xff reads is practically impossible. So maybe you need to add
> this as one of the tasks here?

Or, for validation purposes, I could just return ~0 from all reg reads in the
code and ignore writes if drm_dev_unplugged; this could already easily validate
a big portion of the code flow under such a scenario.
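
Roughly like this, as a sketch (the helper names and the my_device
struct with its ddev and mmio members are made up):

#include <linux/io.h>
#include <drm/drm_drv.h>

struct my_device {
        struct drm_device ddev;
        void __iomem *mmio;
};

static u32 rreg_validate(struct my_device *mdev, u32 reg)
{
        /* Pretend the device is gone: a dead bus reads as all-ones. */
        if (drm_dev_is_unplugged(&mdev->ddev))
                return ~0U;

        return readl(mdev->mmio + reg);
}

static void wreg_validate(struct my_device *mdev, u32 reg, u32 val)
{
        /* Drop the write on the floor, as real unplugged hw would. */
        if (drm_dev_is_unplugged(&mdev->ddev))
                return;

        writel(val, mdev->mmio + reg);
}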

Andrey

> [SNIP]
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 196+ messages in thread

* Re: [PATCH v4 11/14] drm/amdgpu: Guard against write accesses after device removal
  2021-02-07 21:28                           ` Andrey Grodzovsky
@ 2021-02-07 21:50                             ` Daniel Vetter
  -1 siblings, 0 replies; 196+ messages in thread
From: Daniel Vetter @ 2021-02-07 21:50 UTC (permalink / raw)
  To: Andrey Grodzovsky
  Cc: Greg KH, dri-devel, amd-gfx list, Alex Deucher, Qiang Yu,
	Christian König

On Sun, Feb 7, 2021 at 10:28 PM Andrey Grodzovsky
<Andrey.Grodzovsky@amd.com> wrote:
>
>
>
> On 2/5/21 5:10 PM, Daniel Vetter wrote:
> > [SNIP]
>
> Clarification - as far as I know there are no page fault handlers for kernel
> mappings. And we are talking about kernel mappings here, right? If there were,
> I could solve all those issues the same way I do for user mappings, by
> invalidating all existing mappings in the kernel (both kmaps and ioremaps) and
> inserting a dummy zero or ~0 filled page instead.
> Also, I assume forcefully remapping the IO BAR to a ~0 filled page would involve
> the ioremap API, and it's not something that I think can be easily done, according
> to an answer I got on a related topic a few weeks ago:
> https://www.spinics.net/lists/linux-pci/msg103396.html (that was the only reply
> I got)

mmiotrace can, but only for debug, and only on x86 platforms:

https://www.kernel.org/doc/html/latest/trace/mmiotrace.html

Should be feasible (but maybe not worth the effort) to extend this to
support fake unplug.

>
> > But ugh ...
> >
> > Otoh validating an entire driver like amdgpu without such a trick
> > against 0xff reads is practically impossible. So maybe you need to add
> > this as one of the tasks here?
>
> Or, for validation purposes, I could just return ~0 from all reg reads in the
> code and ignore writes if drm_dev_unplugged; this could already easily validate
> a big portion of the code flow under such a scenario.

Hm yeah, if you really wrap them all, that should work too. Since
io mappings have the __iomem pointer type, as long as amdgpu is sparse
warning free, it should be doable to guarantee this.
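
For example, sparse flags a direct dereference of an __iomem pointer,
which is what makes "wrap them all" statically checkable (illustrative
snippet only):

#include <linux/io.h>

static u32 bad_read(u32 __iomem *reg)
{
        return *reg;            /* sparse: dereference of noderef expression */
}

static u32 good_read(u32 __iomem *reg)
{
        return readl(reg);      /* accessor keeps sparse quiet */
}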
-Daniel

> [SNIP]



-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 196+ messages in thread

* Re: [PATCH v4 11/14] drm/amdgpu: Guard against write accesses after device removal
@ 2021-02-07 21:50                             ` Daniel Vetter
  0 siblings, 0 replies; 196+ messages in thread
From: Daniel Vetter @ 2021-02-07 21:50 UTC (permalink / raw)
  To: Andrey Grodzovsky
  Cc: Rob Herring, Greg KH, dri-devel, Anholt, Eric, Pekka Paalanen,
	amd-gfx list, Alex Deucher, Qiang Yu, Wentland, Harry,
	Christian König, Lucas Stach

On Sun, Feb 7, 2021 at 10:28 PM Andrey Grodzovsky
<Andrey.Grodzovsky@amd.com> wrote:
>
>
>
> On 2/5/21 5:10 PM, Daniel Vetter wrote:
> > On Fri, Feb 5, 2021 at 5:22 PM Andrey Grodzovsky
> > <Andrey.Grodzovsky@amd.com> wrote:
> >>
> >> Daniel, ping. Also, please refer to the other thread with Bjorn from pci-dev
> >> on the same topic I added you to.
> >
> > Summarizing my take over there for here, plus maybe some more
> > clarification. There are two problems:
> >
> > - You must guarantee that after the ->remove callback of your driver
> > is finished, there's no more mmio or any other hw access. A
> > combination of stopping stuff and drm_dev_enter/exit can help with
> > that. This prevents the use-after-free issue.
> >
> > - For the actual hotunplug time, i.e. anything that can run while your
> > driver is in use up to the point where the ->remove callback has
> > finished stopping hw access, you must guarantee that code doesn't blow
> > up when it gets bogus reads (in the form of 0xff values).
> > drm_dev_enter/exit can't help you with that. Plus you should make sure
> > that we're not spending forever waiting for a big pile of mmio accesses
> > all timing out because you never bail out - some coarse-grained
> > drm_dev_enter/exit might help here.
> >
> > Plus finally the userspace access problem: you must guarantee that
> > after ->remove has finished, none of the uapi or cross-driver
> > access points (driver ioctl, dma-buf, dma-fence, anything else that
> > hangs around) can reach the data structures/memory mappings/whatever
> > which have been released as part of your ->remove callback.
> > drm_dev_enter/exit is again the tool of choice here.
> >
> > So you have to use drm_dev_enter/exit for some of the problems we face
> > on hotunplug, but it's not the tool that can handle the actual hw
> > hotunplug race conditions for you.
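To make that last point concrete, here is a minimal sketch of such a guard
around a uapi entry point. drm_dev_enter()/drm_dev_exit() are the real
drm_drv.h API; the ioctl shape and the do_hw_work() helper are hypothetical
stand-ins, not the actual amdgpu code:

/* Sketch only: bail out of a uapi path once drm_dev_unplug() has run,
 * so nothing here can reach state freed by the ->remove callback.
 * Assumes #include <drm/drm_drv.h>.
 */
static int example_ioctl(struct drm_device *dev, void *data,
			 struct drm_file *file)
{
	int idx, ret;

	if (!drm_dev_enter(dev, &idx))
		return -ENODEV;		/* device already unplugged */

	ret = do_hw_work(dev, data);	/* hypothetical hw-touching helper */

	drm_dev_exit(idx);
	return ret;
}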
> >
> > Unfortunately the hw hotunplug race condition is an utter pain to
> > test, since essentially you need to validate your driver against
> > spurious 0xff reads at any moment. And I don't even have a clever idea
> > for simulating this, e.g. by forcefully replacing the iobar mapping:
> > what we'd need is a mapping that allows reads (so we can fill a page
> > with 0xff and use that everywhere), but that, instead of rejecting
> > writes, accepts and drops them (so that the 0xff stays intact). Maybe
> > we could simulate this with some kernel debug tricks (kinda like
> > mmiotrace), with a read-only mapping and dropping every write every
> > time we fault.
>
> Clarification - as far as I know there are no page fault handlers for kernel
> mappings. And we are talking about kernel mappings here, right ? If there were,
> I could solve all those issues the same way I do for user mappings, by
> invalidating all existing mappings in the kernel (both kmaps and ioremaps) and
> inserting a dummy zero or ~0 filled page instead.
> Also, I assume forcefully remapping the IO BAR to a ~0 filled page would involve
> the ioremap API, and it's not something that I think can be easily done, according
> to an answer I got on a related topic a few weeks ago
> https://www.spinics.net/lists/linux-pci/msg103396.html (that was the only reply
> I got)
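For context, the user-mapping rerouting mentioned above looks roughly like
this in a fault handler. This is a sketch only: get_dummy_page() and
real_fault_path() are hypothetical stand-ins for the per-file dummy page
management and the normal TTM fault path, while drm_dev_enter() and
vmf_insert_pfn() are real kernel APIs:

/* Sketch only: route CPU faults on BOs of an unplugged device to a
 * dummy page instead of raising SIGBUS.
 */
static vm_fault_t example_vm_fault(struct vm_fault *vmf)
{
	struct ttm_buffer_object *bo = vmf->vma->vm_private_data;
	struct drm_device *ddev = bo->base.dev;
	vm_fault_t ret;
	int idx;

	if (drm_dev_enter(ddev, &idx)) {
		ret = real_fault_path(vmf);		/* hypothetical */
		drm_dev_exit(idx);
	} else {
		struct page *page = get_dummy_page(ddev);	/* hypothetical */

		ret = vmf_insert_pfn(vmf->vma, vmf->address,
				     page_to_pfn(page));
	}
	return ret;
}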

mmiotrace can, but only for debug, and only on x86 platforms:

https://www.kernel.org/doc/html/latest/trace/mmiotrace.html

Should be feasible (but maybe not worth the effort) to extend this to
support fake unplug.

>
> > But ugh ...
> >
> > Otoh validating an entire driver like amdgpu without such a trick
> > against 0xff reads is practically impossible. So maybe you need to add
> > this as one of the tasks here?
>
> Or, for validation purposes, I could just return ~0 from all reg reads in the
> code and ignore writes if drm_dev_unplugged; this could already easily validate
> a big portion of the code flow under such a scenario.

Hm yeah, if you really wrap them all, that should work too. Since
io mappings have __iomem pointer type, as long as amdgpu is sparse
warning free, it should be doable to guarantee this.
-Daniel
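As a sketch of what such wrapped register accessors could look like
(adev->rmmio and adev_to_drm() follow amdgpu naming conventions, but this is
illustrative only, not the actual patch):

/* Sketch only: reads of an unplugged device return all ones, mimicking
 * the bus; writes are silently dropped.
 */
static u32 guarded_rreg32(struct amdgpu_device *adev, u32 reg)
{
	u32 ret = ~0;
	int idx;

	if (drm_dev_enter(adev_to_drm(adev), &idx)) {
		ret = readl(adev->rmmio + reg * 4);
		drm_dev_exit(idx);
	}
	return ret;
}

static void guarded_wreg32(struct amdgpu_device *adev, u32 reg, u32 v)
{
	int idx;

	if (drm_dev_enter(adev_to_drm(adev), &idx)) {
		writel(v, adev->rmmio + reg * 4);
		drm_dev_exit(idx);
	}
}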

> Andrey
>
> > -Daniel
> >
> >>
> >> Andrey
> >>
> >> On 1/29/21 2:25 PM, Christian König wrote:
> >>> Am 29.01.21 um 18:35 schrieb Andrey Grodzovsky:
> >>>>
> >>>> On 1/29/21 10:16 AM, Christian König wrote:
> >>>>> Am 28.01.21 um 18:23 schrieb Andrey Grodzovsky:
> >>>>>>
> >>>>>> On 1/19/21 1:59 PM, Christian König wrote:
> >>>>>>> Am 19.01.21 um 19:22 schrieb Andrey Grodzovsky:
> >>>>>>>>
> >>>>>>>> On 1/19/21 1:05 PM, Daniel Vetter wrote:
> >>>>>>>>> [SNIP]
> >>>>>>>> So say writing to some harmless scratch register in a loop many times,
> >>>>>>>> both for the plugged
> >>>>>>>> and unplugged case, and measuring the total time delta ?
> >>>>>>>
> >>>>>>> I think we should at least measure the following:
> >>>>>>>
> >>>>>>> 1. Writing X times to a scratch reg without your patch.
> >>>>>>> 2. Writing X times to a scratch reg with your patch.
> >>>>>>> 3. Writing X times to a scratch reg with the hardware physically disconnected.
> >>>>>>>
> >>>>>>> I suggest repeating that once for Polaris (or older) and once for Vega or
> >>>>>>> Navi.
> >>>>>>>
> >>>>>>> The SRBM on Polaris is meant to introduce some delay in each access, so it
> >>>>>>> might react differently than the newer hardware.
> >>>>>>>
> >>>>>>> Christian.
> >>>>>>
> >>>>>>
> >>>>>> See attached results and the testing code. Ran on Polaris (gfx8) and
> >>>>>> Vega10 (gfx9).
> >>>>>>
> >>>>>> In summary, over 1 million WWREG32 calls in a loop, with and without this
> >>>>>> patch, you get around 10ms of accumulated overhead (so a 0.00001 millisecond
> >>>>>> penalty for each WWREG32) for using the drm_dev_enter check when writing
> >>>>>> registers.
> >>>>>>
> >>>>>> P.S. Bullet 3 I cannot test as I need an eGPU and currently I don't have one.
> >>>>>
> >>>>> Well, if I'm not completely mistaken, that's 100ms of accumulated overhead,
> >>>>> so around 100ns per write. And an even bigger problem is that this is a ~67%
> >>>>> increase.
> >>>>
> >>>>
> >>>> My bad, but 67% of what ? How did you calculate that ?
> >>>
> >>> My bad, (308501-209689)/209689 = 47% increase.
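(Spelled out, and assuming the totals are in microseconds over the 1 million
writes: 308501 - 209689 = 98812 us of added time, i.e. roughly 99 ns of
overhead per write, and 98812 / 209689 is indeed about a 47% relative
increase, consistent with the ~100 ns per write figure above.)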
> >>>
> >>>>>
> >>>>> I'm not sure how many writes we do during normal operation, but that sounds
> >>>>> like a bit much. Ideas?
> >>>>
> >>>> Well, you suggested moving the drm_dev_enter way up, but as I see it the
> >>>> problem with this is that it increases the chance of a race where the
> >>>> device is extracted after we check for drm_dev_enter (there is also such a
> >>>> chance even when it's placed inside WWREG, but it's lower).
> >>>> Earlier I proposed that instead of doing all those guards scattered all over
> >>>> the code, we simply delay the release of system memory pages and the unreserve
> >>>> of MMIO ranges until after the device itself is gone, once the last drm device
> >>>> reference is dropped. But Daniel opposes delaying the MMIO ranges unreserve to
> >>>> after the PCI remove code because, according to him, it will upset the PCI
> >>>> subsystem.
> >>>
> >>> Yeah, that's most likely true as well.
> >>>
> >>> Maybe Daniel has another idea when he's back from vacation.
> >>>
> >>> Christian.
> >>>
> >>>>
> >>>> Andrey
> >>>>
> >>>>>
> >>>>> Christian.
> >>>> _______________________________________________
> >>>> amd-gfx mailing list
> >>>> amd-gfx@lists.freedesktop.org
> >>>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
> >>>>
> >>>
> >
> >
> >



-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 196+ messages in thread

* Re: [PATCH v4 00/14] RFC Support hot device unplug in amdgpu
  2021-01-20 15:59               ` Daniel Vetter
@ 2021-02-08  5:59                 ` Andrey Grodzovsky
  -1 siblings, 0 replies; 196+ messages in thread
From: Andrey Grodzovsky @ 2021-02-08  5:59 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: amd-gfx list, Christian König, dri-devel, Qiang Yu, Greg KH,
	Alex Deucher

[-- Attachment #1: Type: text/plain, Size: 14986 bytes --]


On 1/20/21 10:59 AM, Daniel Vetter wrote:
> On Wed, Jan 20, 2021 at 3:20 PM Andrey Grodzovsky
> <Andrey.Grodzovsky@amd.com> wrote:
>>
>> On 1/20/21 4:05 AM, Daniel Vetter wrote:
>>> On Tue, Jan 19, 2021 at 01:18:15PM -0500, Andrey Grodzovsky wrote:
>>>> On 1/19/21 1:08 PM, Daniel Vetter wrote:
>>>>> On Tue, Jan 19, 2021 at 6:31 PM Andrey Grodzovsky
>>>>> <Andrey.Grodzovsky@amd.com> wrote:
>>>>>> On 1/19/21 9:16 AM, Daniel Vetter wrote:
>>>>>>> On Mon, Jan 18, 2021 at 04:01:09PM -0500, Andrey Grodzovsky wrote:
>>>>>>>> Until now extracting a card either by physical extraction (e.g. eGPU with
>>>>>>>> thunderbolt connection or by emulation through  syfs -> /sys/bus/pci/devices/device_id/remove)
>>>>>>>> would cause random crashes in user apps. The random crashes in apps were
>>>>>>>> mostly due to the app having mapped a device backed BO into its address
>>>>>>>> space was still trying to access the BO while the backing device was gone.
>>>>>>>> To answer this first problem Christian suggested to fix the handling of mapped
>>>>>>>> memory in the clients when the device goes away by forcibly unmap all buffers the
>>>>>>>> user processes has by clearing their respective VMAs mapping the device BOs.
>>>>>>>> Then when the VMAs try to fill in the page tables again we check in the fault
>>>>>>>> handlerif the device is removed and if so, return an error. This will generate a
>>>>>>>> SIGBUS to the application which can then cleanly terminate.This indeed was done
>>>>>>>> but this in turn created a problem of kernel OOPs were the OOPSes were due to the
>>>>>>>> fact that while the app was terminating because of the SIGBUSit would trigger use
>>>>>>>> after free in the driver by calling to accesses device structures that were already
>>>>>>>> released from the pci remove sequence.This was handled by introducing a 'flush'
>>>>>>>> sequence during device removal were we wait for drm file reference to drop to 0
>>>>>>>> meaning all user clients directly using this device terminated.
>>>>>>>>
>>>>>>>> v2:
>>>>>>>> Based on discussions in the mailing list with Daniel and Pekka [1] and based on the document
>>>>>>>> produced by Pekka from those discussions [2] the whole approach with returning SIGBUS and
>>>>>>>> waiting for all user clients having CPU mapping of device BOs to die was dropped.
>>>>>>>> Instead as per the document suggestion the device structures are kept alive until
>>>>>>>> the last reference to the device is dropped by user client and in the meanwhile all existing and new CPU mappings of the BOs
>>>>>>>> belonging to the device directly or by dma-buf import are rerouted to per user
>>>>>>>> process dummy rw page.Also, I skipped the 'Requirements for KMS UAPI' section of [2]
>>>>>>>> since i am trying to get the minimal set of requirements that still give useful solution
>>>>>>>> to work and this is the'Requirements for Render and Cross-Device UAPI' section and so my
>>>>>>>> test case is removing a secondary device, which is render only and is not involved
>>>>>>>> in KMS.
>>>>>>>>
>>>>>>>> v3:
>>>>>>>> More updates following comments from v2 such as removing loop to find DRM file when rerouting
>>>>>>>> page faults to dummy page,getting rid of unnecessary sysfs handling refactoring and moving
>>>>>>>> prevention of GPU recovery post device unplug from amdgpu to scheduler layer.
>>>>>>>> On top of that added unplug support for the IOMMU enabled system.
>>>>>>>>
>>>>>>>> v4:
>>>>>>>> Drop last sysfs hack and use sysfs default attribute.
>>>>>>>> Guard against write accesses after device removal to avoid modifying released memory.
>>>>>>>> Update dummy pages handling to on demand allocation and release through drm managed framework.
>>>>>>>> Add return value to scheduler job TO handler (by Luben Tuikov) and use this in amdgpu for prevention
>>>>>>>> of GPU recovery post device unplug
>>>>>>>> Also rebase on top of drm-misc-mext instead of amd-staging-drm-next
>>>>>>>>
>>>>>>>> With these patches I am able to gracefully remove the secondary card using sysfs remove hook while glxgears
>>>>>>>> is running off of secondary card (DRI_PRIME=1) without kernel oopses or hangs and keep working
>>>>>>>> with the primary card or soft reset the device without hangs or oopses
>>>>>>>>
>>>>>>>> TODOs for followup work:
>>>>>>>> Convert AMDGPU code to use devm (for hw stuff) and drmm (for sw stuff and allocations) (Daniel)
>>>>>>>> Support plugging the secondary device back after unplug - currently still experiencing HW error on plugging back.
>>>>>>>> Add support for 'Requirements for KMS UAPI' section of [2] - unplugging primary, display connected card.
>>>>>>>>
>>>>>>>> [1] - Discussions during v3 of the patchset https://www.spinics.net/lists/amd-gfx/msg55576.html
>>>>>>>> [2] - drm/doc: device hot-unplug for userspace https://www.spinics.net/lists/dri-devel/msg259755.html
>>>>>>>> [3] - Related gitlab ticket https://gitlab.freedesktop.org/drm/amd/-/issues/1081
>>>>>>> btw have you tried this out with some of the igts we have? core_hotunplug
>>>>>>> is the one I'm thinking of. Might be worth extending it for amdgpu
>>>>>>> specific stuff (like running some batches on it while hotunplugging).
>>>>>> No, I mostly just ran glxgears while testing, which already covers the
>>>>>> exported/imported dma-buf case, plus a few manually hacked tests in the
>>>>>> libdrm amdgpu test suite.
>>>>>>
>>>>>>
>>>>>>> Since there are so many corner cases we need to test here (shared dma-buf,
>>>>>>> shared dma_fence), I think it would make sense to have a shared testcase
>>>>>>> across drivers.
>>>>>> I'm not too familiar with IGT; is there an easy way to set up shared dma-buf
>>>>>> and fence use cases there, or do you mean I need to add them now ?
>>>>> We do have test infrastructure for all of that, but the hotunplug test
>>>>> doesn't have that yet I think.
>>>>>
>>>>>>> The only specific thing would be some hooks to keep the gpu
>>>>>>> busy in some fashion while we yank the driver.
>>>>>> Do you mean like starting X and some active rendering on top (like glxgears)
>>>>>> automatically from within IGT ?
>>>>> Nope, igt is meant to be bare metal testing so you don't have to drag
>>>>> the entire winsys around (which in a wayland world, is not really good
>>>>> for driver testing anyway, since everything is different). We use this
>>>>> for our pre-merge ci for drm/i915.
>>>> So I keep it busy via X/glxgears, which is a manual operation. What you suggest
>>>> then is some client within IGT which opens the device and starts submitting jobs
>>>> (much like what the libdrm amdgpu tests already do) ? And this
>>>> part is the amdgpu-specific code I just need to port from libdrm to here ?
>>> Yup. For i915 tests we have an entire library already for small workloads,
>>> including some that just spin forever (useful for reset testing and could
>>> also come handy for unload testing).
>>> -Daniel
>>
>> Does that mean I would have to drag in the entire infrastructure code from
>> within the libdrm amdgpu code that allows for command submission through
>> our IOCTLs ?
> No, it's perfectly fine to use libdrm in igt tests, we do that too. I
> just mean we have some additional helpers to submit specific workloads
> for intel gpus, like rendercopy to move data with the 3d engine (just
> using copy engines only isn't good enough sometimes for testing), or
> the special hanging batchbuffers we use for reset testing, or in
> general for having precise control over race conditions and things
> like that.
>
> One thing that was somewhat annoying for i915 but shouldn't be a
> problem for amdgpu is that igt builds on intel. So we have stub
> functions for libdrm-intel, since libdrm-intel doesn't build on arm.
> Shouldn't be a problem for you.
> -Daniel


Tested with the igt hot-unplug test. Passed unbind_rebind, unplug-rescan,
hot-unbind-rebind and hotunplug-rescan
when disabling the rescan part, as I don't support plug-back for now. Also added
command submission for amdgpu.
Attached a draft of submitting a workload while unbinding the driver or simulating
detach. Caught 2 issues with unplug if command submission is in flight during
unplug:
an unsignaled fence causing a hang in amdgpu_cs_sync (see the sketch below for
one possible way to bound that wait), and hitting a BUG_ON in
gfx_v9_0_ring_emit_patch_cond_exec (which is expected, I guess).
I guess glxgears submits commands at a much slower rate, so this was missed.
Is that what you meant for this test ?
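One possible mitigation for the unsignaled-fence hang, sketched under the same
includes as the attached draft, is to bound the wait with a finite timeout
instead of AMDGPU_TIMEOUT_INFINITE. The 10 second value here is an arbitrary
choice for illustration, not something from the draft:

/* Sketch only: a bounded variant of amdgpu_cs_sync() so the test cannot
 * hang forever on a fence that can no longer signal after unplug.
 */
static void amdgpu_cs_sync_bounded(amdgpu_context_handle context,
				   unsigned int ip_type, int ring,
				   unsigned int seqno)
{
	struct amdgpu_cs_fence fence = {
		.context = context,
		.ip_type = ip_type,
		.ring = ring,
		.fence = seqno,
	};
	uint32_t expired;

	/* Errors are expected once the device is gone and are ignored. */
	amdgpu_cs_query_fence_status(&fence, 10ULL * 1000000000ULL,
				     0, &expired);
}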

Andrey


>
>
>> Andrey
>>
>>>> Andrey
>>>>
>>>>
>>>>>>> But just to get it started
>>>>>>> you can throw in entirely amdgpu specific subtests and just share some of
>>>>>>> the test code.
>>>>>>> -Daniel
>>>>>> In general, I wasn't aware of this test suite, and it looks like it does what
>>>>>> I test, among other stuff.
>>>>>> I will definitely try to run with it, although the rescan part will not work,
>>>>>> as plugging
>>>>>> the device back is on my TODO list and not part of the scope for this patchset,
>>>>>> so I will
>>>>>> probably comment the re-scan section out while testing.
>>>>> amd gem has been using libdrm-amd thus far iirc, but for things like
>>>>> this I think it'd be worth at least considering switching. The display
>>>>> team has already started to use some of the tests and contribute stuff
>>>>> (I think the VRR testcase is from amd).
>>>>> -Daniel
>>>>>
>>>>>> Andrey
>>>>>>
>>>>>>
>>>>>>>> Andrey Grodzovsky (13):
>>>>>>>>       drm/ttm: Remap all page faults to per process dummy page.
>>>>>>>>       drm: Unmap the entire device address space on device unplug
>>>>>>>>       drm/ttm: Expose ttm_tt_unpopulate for driver use
>>>>>>>>       drm/sched: Cancel and flush all outstanding jobs before finish.
>>>>>>>>       drm/amdgpu: Split amdgpu_device_fini into early and late
>>>>>>>>       drm/amdgpu: Add early fini callback
>>>>>>>>       drm/amdgpu: Register IOMMU topology notifier per device.
>>>>>>>>       drm/amdgpu: Fix a bunch of sdma code crash post device unplug
>>>>>>>>       drm/amdgpu: Remap all page faults to per process dummy page.
>>>>>>>>       drm/amdgpu: Move some sysfs attrs creation to default_attr
>>>>>>>>       drm/amdgpu: Guard against write accesses after device removal
>>>>>>>>       drm/sched: Make timeout timer rearm conditional.
>>>>>>>>       drm/amdgpu: Prevent any job recoveries after device is unplugged.
>>>>>>>>
>>>>>>>> Luben Tuikov (1):
>>>>>>>>       drm/scheduler: Job timeout handler returns status
>>>>>>>>
>>>>>>>>      drivers/gpu/drm/amd/amdgpu/amdgpu.h               |  11 +-
>>>>>>>>      drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c      |  17 +--
>>>>>>>>      drivers/gpu/drm/amd/amdgpu/amdgpu_device.c        | 149 ++++++++++++++++++++--
>>>>>>>>      drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c           |  20 ++-
>>>>>>>>      drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c         |  15 ++-
>>>>>>>>      drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c          |   2 +-
>>>>>>>>      drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h          |   1 +
>>>>>>>>      drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c           |   9 ++
>>>>>>>>      drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c       |  25 ++--
>>>>>>>>      drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c           |  26 ++--
>>>>>>>>      drivers/gpu/drm/amd/amdgpu/amdgpu_irq.h           |   3 +-
>>>>>>>>      drivers/gpu/drm/amd/amdgpu/amdgpu_job.c           |  19 ++-
>>>>>>>>      drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c           |  12 +-
>>>>>>>>      drivers/gpu/drm/amd/amdgpu/amdgpu_object.c        |  10 ++
>>>>>>>>      drivers/gpu/drm/amd/amdgpu/amdgpu_object.h        |   2 +
>>>>>>>>      drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c           |  53 +++++---
>>>>>>>>      drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h           |   3 +
>>>>>>>>      drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c           |   1 +
>>>>>>>>      drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c          |  70 ++++++++++
>>>>>>>>      drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h          |  52 +-------
>>>>>>>>      drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c           |  21 ++-
>>>>>>>>      drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c            |   8 +-
>>>>>>>>      drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c      |  14 +-
>>>>>>>>      drivers/gpu/drm/amd/amdgpu/cik_ih.c               |   2 +-
>>>>>>>>      drivers/gpu/drm/amd/amdgpu/cz_ih.c                |   2 +-
>>>>>>>>      drivers/gpu/drm/amd/amdgpu/iceland_ih.c           |   2 +-
>>>>>>>>      drivers/gpu/drm/amd/amdgpu/navi10_ih.c            |   2 +-
>>>>>>>>      drivers/gpu/drm/amd/amdgpu/psp_v11_0.c            |  16 +--
>>>>>>>>      drivers/gpu/drm/amd/amdgpu/psp_v12_0.c            |   8 +-
>>>>>>>>      drivers/gpu/drm/amd/amdgpu/psp_v3_1.c             |   8 +-
>>>>>>>>      drivers/gpu/drm/amd/amdgpu/si_ih.c                |   2 +-
>>>>>>>>      drivers/gpu/drm/amd/amdgpu/tonga_ih.c             |   2 +-
>>>>>>>>      drivers/gpu/drm/amd/amdgpu/vega10_ih.c            |   2 +-
>>>>>>>>      drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c |  12 +-
>>>>>>>>      drivers/gpu/drm/amd/include/amd_shared.h          |   2 +
>>>>>>>>      drivers/gpu/drm/drm_drv.c                         |   3 +
>>>>>>>>      drivers/gpu/drm/etnaviv/etnaviv_sched.c           |  10 +-
>>>>>>>>      drivers/gpu/drm/lima/lima_sched.c                 |   4 +-
>>>>>>>>      drivers/gpu/drm/panfrost/panfrost_job.c           |   9 +-
>>>>>>>>      drivers/gpu/drm/scheduler/sched_main.c            |  18 ++-
>>>>>>>>      drivers/gpu/drm/ttm/ttm_bo_vm.c                   |  82 +++++++++++-
>>>>>>>>      drivers/gpu/drm/ttm/ttm_tt.c                      |   1 +
>>>>>>>>      drivers/gpu/drm/v3d/v3d_sched.c                   |  32 ++---
>>>>>>>>      include/drm/gpu_scheduler.h                       |  17 ++-
>>>>>>>>      include/drm/ttm/ttm_bo_api.h                      |   2 +
>>>>>>>>      45 files changed, 583 insertions(+), 198 deletions(-)
>>>>>>>>
>>>>>>>> --
>>>>>>>> 2.7.4
>>>>>>>>
>
>

[-- Attachment #2: 0001-DRAFT-Add-amdgpu-command-submission-while-unplug.patch --]
[-- Type: text/x-patch, Size: 6525 bytes --]

From af658ef6b7e5b044d2566104137ee1cb34e52c59 Mon Sep 17 00:00:00 2001
From: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Date: Mon, 8 Feb 2021 00:41:28 -0500
Subject: DRAFT: Add amdgpu command submission while unplug.

---
 tests/core_hotunplug.c | 218 ++++++++++++++++++++++++++++++++++++++++++++++++-
 tests/meson.build      |   2 +-
 2 files changed, 216 insertions(+), 4 deletions(-)

diff --git a/tests/core_hotunplug.c b/tests/core_hotunplug.c
index e7d2a44..5a4dcb4 100644
--- a/tests/core_hotunplug.c
+++ b/tests/core_hotunplug.c
@@ -52,6 +52,190 @@ struct hotunplug {
 	bool need_healthcheck;
 };
 
+/* amdgpu specific code */
+
+#include <amdgpu.h>
+#include <amdgpu_drm.h>
+#include <pthread.h>
+
+#define GFX_COMPUTE_NOP  0xffff1000
+
+
+static volatile bool do_cs; /* written by the main thread, polled by the CS thread */
+
+static int
+amdgpu_bo_alloc_and_map(amdgpu_device_handle dev, unsigned size,
+			unsigned alignment, unsigned heap, uint64_t flags,
+			amdgpu_bo_handle *bo, void **cpu, uint64_t *mc_address,
+			amdgpu_va_handle *va_handle)
+{
+	struct amdgpu_bo_alloc_request request = {
+		.alloc_size = size,
+		.phys_alignment = alignment,
+		.preferred_heap = heap,
+		.flags = flags,
+	};
+	amdgpu_bo_handle buf_handle;
+	amdgpu_va_handle handle;
+	uint64_t vmc_addr;
+	int r;
+
+	r = amdgpu_bo_alloc(dev, &request, &buf_handle);
+	if (r)
+		return r;
+
+	r = amdgpu_va_range_alloc(dev,
+				  amdgpu_gpu_va_range_general,
+				  size, alignment, 0, &vmc_addr,
+				  &handle, 0);
+	if (r)
+		goto error_va_alloc;
+
+	r = amdgpu_bo_va_op(buf_handle, 0, size, vmc_addr, 0, AMDGPU_VA_OP_MAP);
+	if (r)
+		goto error_va_map;
+
+	r = amdgpu_bo_cpu_map(buf_handle, cpu);
+	if (r)
+		goto error_cpu_map;
+
+	*bo = buf_handle;
+	*mc_address = vmc_addr;
+	*va_handle = handle;
+
+	return 0;
+
+error_cpu_map:
+	amdgpu_bo_cpu_unmap(buf_handle);
+
+error_va_map:
+	amdgpu_bo_va_op(buf_handle, 0, size, vmc_addr, 0, AMDGPU_VA_OP_UNMAP);
+
+error_va_alloc:
+	amdgpu_bo_free(buf_handle);
+	return r;
+}
+
+static void
+amdgpu_bo_unmap_and_free(amdgpu_bo_handle bo, amdgpu_va_handle va_handle,
+			 uint64_t mc_addr, uint64_t size)
+{
+	amdgpu_bo_cpu_unmap(bo);
+	amdgpu_bo_va_op(bo, 0, size, mc_addr, 0, AMDGPU_VA_OP_UNMAP);
+	amdgpu_va_range_free(va_handle);
+	amdgpu_bo_free(bo);
+}
+
+static void amdgpu_cs_sync(amdgpu_context_handle context,
+			   unsigned int ip_type,
+			   int ring,
+			   unsigned int seqno)
+{
+	struct amdgpu_cs_fence fence = {
+		.context = context,
+		.ip_type = ip_type,
+		.ring = ring,
+		.fence = seqno,
+	};
+	uint32_t expired;
+	int err;
+
+	err = amdgpu_cs_query_fence_status(&fence,
+					   AMDGPU_TIMEOUT_INFINITE,
+					   0, &expired);
+}
+
+static void *amdgpu_nop_cs(void *p)
+{
+	int fd = *(int *)p;
+	amdgpu_bo_handle ib_result_handle;
+	void *ib_result_cpu;
+	uint64_t ib_result_mc_address;
+	uint32_t *ptr;
+	int i, r;
+	amdgpu_bo_list_handle bo_list;
+	amdgpu_va_handle va_handle;
+	uint32_t major, minor;
+	amdgpu_device_handle device;
+	amdgpu_context_handle context;
+	struct amdgpu_cs_request ibs_request;
+	struct amdgpu_cs_ib_info ib_info;
+
+
+	r = amdgpu_device_initialize(fd, &major, &minor, &device);
+	igt_require(r == 0);
+
+	r = amdgpu_cs_ctx_create(device, &context);
+	igt_assert_eq(r, 0);
+
+	r = amdgpu_bo_alloc_and_map(device, 4096, 4096,
+				    AMDGPU_GEM_DOMAIN_GTT, 0,
+				    &ib_result_handle, &ib_result_cpu,
+				    &ib_result_mc_address, &va_handle);
+	igt_assert_eq(r, 0);
+
+	ptr = ib_result_cpu;
+	for (i = 0; i < 16; ++i)
+		ptr[i] = GFX_COMPUTE_NOP;
+
+	r = amdgpu_bo_list_create(device, 1, &ib_result_handle, NULL, &bo_list);
+	igt_assert_eq(r, 0);
+
+	memset(&ib_info, 0, sizeof(struct amdgpu_cs_ib_info));
+	ib_info.ib_mc_address = ib_result_mc_address;
+	ib_info.size = 16;
+
+	memset(&ibs_request, 0, sizeof(struct amdgpu_cs_request));
+	ibs_request.ip_type = AMDGPU_HW_IP_GFX;
+	ibs_request.ring = 0;
+	ibs_request.number_of_ibs = 1;
+	ibs_request.ibs = &ib_info;
+	ibs_request.resources = bo_list;
+
+	while (do_cs)
+		amdgpu_cs_submit(context, 0, &ibs_request, 1);
+
+	amdgpu_cs_sync(context, AMDGPU_HW_IP_GFX, 0, ibs_request.seq_no);
+
+	amdgpu_bo_list_destroy(bo_list);
+
+	amdgpu_bo_unmap_and_free(ib_result_handle, va_handle,
+				 ib_result_mc_address, 4096);
+
+	amdgpu_cs_ctx_free(context);
+	amdgpu_device_deinitialize(device);
+
+	return (void *)0;
+}
+
+static pthread_t* amdgpu_create_cs_thread(int *fd)
+{
+	int r;
+	pthread_t *thread = malloc(sizeof(*thread));
+
+	do_cs = true;
+
+	r = pthread_create(thread, NULL, amdgpu_nop_cs, (void *)fd);
+	igt_assert_eq(r, 0);
+
+	/* Give the thread enough time to start */
+	usleep(100000);
+	return thread;
+}
+
+static void amdgpu_destroy_cs_thread(pthread_t *thread)
+{
+	void *status;
+
+	do_cs = false;
+
+	pthread_join(*thread, &status);
+	igt_assert(status == 0);
+
+	free(thread);
+}
+
+

@@ -455,15 +645,26 @@ static void unplug_rescan(struct hotunplug *priv)
 
 static void hotunbind_rebind(struct hotunplug *priv)
 {
+	pthread_t *thread = NULL;
+
 	pre_check(priv);
 
 	priv->fd.drm = local_drm_open_driver(false, "", " for hot unbind");
 
+	if (is_amdgpu_device(priv->fd.drm))
+		thread = amdgpu_create_cs_thread(&priv->fd.drm);
+
 	driver_unbind(priv, "hot ", 0);
 
+	if (thread)
+		amdgpu_destroy_cs_thread(thread);
+
+
 	priv->fd.drm = close_device(priv->fd.drm, "late ", "unbound ");
 	igt_assert_eq(priv->fd.drm, -1);
 
 	driver_bind(priv, 0);
 
 	igt_assert_f(healthcheck(priv, false), "%s\n", priv->failure);
@@ -471,15 +672,25 @@ static void hotunbind_rebind(struct hotunplug *priv)
 
 static void hotunplug_rescan(struct hotunplug *priv)
 {
+	pthread_t *thread = NULL;
+
 	pre_check(priv);
 
 	priv->fd.drm = local_drm_open_driver(false, "", " for hot unplug");
 
+	if (is_amdgpu_device(priv->fd.drm))
+		thread = amdgpu_create_cs_thread(&priv->fd.drm);
+
 	device_unplug(priv, "hot ", 0);
 
+	if (thread)
+		amdgpu_destroy_cs_thread(thread);
+
 	priv->fd.drm = close_device(priv->fd.drm, "late ", "removed ");
 	igt_assert_eq(priv->fd.drm, -1);

 	bus_rescan(priv, 0);
 
 	igt_assert_f(healthcheck(priv, false), "%s\n", priv->failure);
@@ -543,6 +754,7 @@ static void hotreplug_lateclose(struct hotunplug *priv)
 	igt_assert_f(healthcheck(priv, false), "%s\n", priv->failure);
 }
 
diff --git a/tests/meson.build b/tests/meson.build
index 825e018..1de6cc5 100644
--- a/tests/meson.build
+++ b/tests/meson.build
@@ -243,7 +243,7 @@ i915_progs = [
 	'sysfs_timeslice_duration',
 ]
 
-test_deps = [ igt_deps ]
+test_deps = [ igt_deps + [ libdrm_amdgpu ] ]
 
 if libdrm_nouveau.found()
 	test_progs += [
-- 
2.7.4


[-- Attachment #3: Type: text/plain, Size: 160 bytes --]

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 196+ messages in thread

* Re: [PATCH v4 00/14] RFC Support hot device unplug in amdgpu
  2021-02-08  5:59                 ` Andrey Grodzovsky
@ 2021-02-08  7:27                   ` Daniel Vetter
  -1 siblings, 0 replies; 196+ messages in thread
From: Daniel Vetter @ 2021-02-08  7:27 UTC (permalink / raw)
  To: Andrey Grodzovsky
  Cc: amd-gfx list, Christian König, dri-devel, Qiang Yu, Greg KH,
	Alex Deucher

On Mon, Feb 8, 2021 at 6:59 AM Andrey Grodzovsky
<Andrey.Grodzovsky@amd.com> wrote:
>
>
> On 1/20/21 10:59 AM, Daniel Vetter wrote:
> > On Wed, Jan 20, 2021 at 3:20 PM Andrey Grodzovsky
> > <Andrey.Grodzovsky@amd.com> wrote:
> >>
> >> On 1/20/21 4:05 AM, Daniel Vetter wrote:
> >>> On Tue, Jan 19, 2021 at 01:18:15PM -0500, Andrey Grodzovsky wrote:
> >>>> On 1/19/21 1:08 PM, Daniel Vetter wrote:
> >>>>> On Tue, Jan 19, 2021 at 6:31 PM Andrey Grodzovsky
> >>>>> <Andrey.Grodzovsky@amd.com> wrote:
> >>>>>> On 1/19/21 9:16 AM, Daniel Vetter wrote:
> >>>>>>> On Mon, Jan 18, 2021 at 04:01:09PM -0500, Andrey Grodzovsky wrote:
> >>>>>>>> Until now extracting a card either by physical extraction (e.g. eGPU with
> >>>>>>>> thunderbolt connection or by emulation through  syfs -> /sys/bus/pci/devices/device_id/remove)
> >>>>>>>> would cause random crashes in user apps. The random crashes in apps were
> >>>>>>>> mostly due to the app having mapped a device backed BO into its address
> >>>>>>>> space was still trying to access the BO while the backing device was gone.
> >>>>>>>> To answer this first problem Christian suggested to fix the handling of mapped
> >>>>>>>> memory in the clients when the device goes away by forcibly unmap all buffers the
> >>>>>>>> user processes has by clearing their respective VMAs mapping the device BOs.
> >>>>>>>> Then when the VMAs try to fill in the page tables again we check in the fault
> >>>>>>>> handlerif the device is removed and if so, return an error. This will generate a
> >>>>>>>> SIGBUS to the application which can then cleanly terminate.This indeed was done
> >>>>>>>> but this in turn created a problem of kernel OOPs were the OOPSes were due to the
> >>>>>>>> fact that while the app was terminating because of the SIGBUSit would trigger use
> >>>>>>>> after free in the driver by calling to accesses device structures that were already
> >>>>>>>> released from the pci remove sequence.This was handled by introducing a 'flush'
> >>>>>>>> sequence during device removal were we wait for drm file reference to drop to 0
> >>>>>>>> meaning all user clients directly using this device terminated.
> >>>>>>>>
> >>>>>>>> v2:
> >>>>>>>> Based on discussions in the mailing list with Daniel and Pekka [1] and based on the document
> >>>>>>>> produced by Pekka from those discussions [2] the whole approach with returning SIGBUS and
> >>>>>>>> waiting for all user clients having CPU mapping of device BOs to die was dropped.
> >>>>>>>> Instead as per the document suggestion the device structures are kept alive until
> >>>>>>>> the last reference to the device is dropped by user client and in the meanwhile all existing and new CPU mappings of the BOs
> >>>>>>>> belonging to the device directly or by dma-buf import are rerouted to per user
> >>>>>>>> process dummy rw page.Also, I skipped the 'Requirements for KMS UAPI' section of [2]
> >>>>>>>> since i am trying to get the minimal set of requirements that still give useful solution
> >>>>>>>> to work and this is the'Requirements for Render and Cross-Device UAPI' section and so my
> >>>>>>>> test case is removing a secondary device, which is render only and is not involved
> >>>>>>>> in KMS.
> >>>>>>>>
> >>>>>>>> v3:
> >>>>>>>> More updates following comments from v2 such as removing loop to find DRM file when rerouting
> >>>>>>>> page faults to dummy page,getting rid of unnecessary sysfs handling refactoring and moving
> >>>>>>>> prevention of GPU recovery post device unplug from amdgpu to scheduler layer.
> >>>>>>>> On top of that added unplug support for the IOMMU enabled system.
> >>>>>>>>
> >>>>>>>> v4:
> >>>>>>>> Drop last sysfs hack and use sysfs default attribute.
> >>>>>>>> Guard against write accesses after device removal to avoid modifying released memory.
> >>>>>>>> Update dummy pages handling to on demand allocation and release through drm managed framework.
> >>>>>>>> Add return value to scheduler job TO handler (by Luben Tuikov) and use this in amdgpu for prevention
> >>>>>>>> of GPU recovery post device unplug
> >>>>>>>> Also rebase on top of drm-misc-mext instead of amd-staging-drm-next
> >>>>>>>>
> >>>>>>>> With these patches I am able to gracefully remove the secondary card using sysfs remove hook while glxgears
> >>>>>>>> is running off of secondary card (DRI_PRIME=1) without kernel oopses or hangs and keep working
> >>>>>>>> with the primary card or soft reset the device without hangs or oopses
> >>>>>>>>
> >>>>>>>> TODOs for followup work:
> >>>>>>>> Convert AMDGPU code to use devm (for hw stuff) and drmm (for sw stuff and allocations) (Daniel)
> >>>>>>>> Support plugging the secondary device back after unplug - currently still experiencing HW error on plugging back.
> >>>>>>>> Add support for 'Requirements for KMS UAPI' section of [2] - unplugging primary, display connected card.
> >>>>>>>>
> >>>>>>>> [1] - Discussions during v3 of the patchset https://www.spinics.net/lists/amd-gfx/msg55576.html
> >>>>>>>> [2] - drm/doc: device hot-unplug for userspace https://www.spinics.net/lists/dri-devel/msg259755.html
> >>>>>>>> [3] - Related gitlab ticket https://gitlab.freedesktop.org/drm/amd/-/issues/1081
> >>>>>>> btw have you tried this out with some of the igts we have? core_hotunplug
> >>>>>>> is the one I'm thinking of. Might be worth extending this for amdgpu
> >>>>>>> specific stuff (like running some batches on it while hotunplugging).
> >>>>>> No, while testing I mostly just ran glxgears, which already covers the
> >>>>>> exported/imported dma-buf case, plus a few manually hacked tests in the
> >>>>>> libdrm amdgpu test suite.
> >>>>>>
> >>>>>>
> >>>>>>> Since there are so many corner cases we need to test here (shared dma-buf,
> >>>>>>> shared dma_fence), I think it would make sense to have a shared testcase
> >>>>>>> across drivers.
> >>>>>> I'm not too familiar with IGT; is there an easy way to set up shared dma-buf
> >>>>>> and fence use cases there, or do you mean I need to add them now?
> >>>>> We do have test infrastructure for all of that, but the hotunplug test
> >>>>> doesn't have that yet I think.
> >>>>>
> >>>>>>> The only specific thing would be some hooks to keep the gpu
> >>>>>>> busy in some fashion while we yank the driver.
> >>>>>> Do you mean like starting X and some active rendering on top (like glxgears)
> >>>>>> automatically from within IGT?
> >>>>> Nope, igt is meant for bare metal testing, so you don't have to drag
> >>>>> the entire winsys around (which, in a wayland world, is not really good
> >>>>> for driver testing anyway, since everything is different). We use this
> >>>>> for our pre-merge ci for drm/i915.
> >>>> So I keep it busy via X/glxgears, which is a manual operation. What you suggest
> >>>> then is some client within IGT which opens the device and starts submitting jobs
> >>>> (much like what the libdrm amdgpu tests already do)? And this
> >>>> part is the amdgpu specific code I just need to port from libdrm to here?
> >>> Yup. For i915 tests we already have an entire library for small workloads,
> >>> including some that just spin forever (useful for reset testing, and it could
> >>> also come in handy for unload testing).
> >>> -Daniel
> >>
> >> Does that mean I would have to drag in the entire infrastructure code from
> >> the libdrm amdgpu code that allows for command submissions through
> >> our IOCTLs?
> > No, it's perfectly fine to use libdrm in igt tests, we do that too. I
> > just mean we have some additional helpers to submit specific workloads
> > for the intel gpu, like rendercopy to move data with the 3d engine (just
> > using the copy engines isn't always good enough for testing), or
> > the special hanging batchbuffers we use for reset testing, or in
> > general helpers for having precise control over race conditions and things
> > like that.
> >
> > One thing that was somewhat annoying for i915 but shouldn't be a
> > problem for amdgpu is that igt also has to build on non-intel platforms.
> > So we have stub functions for libdrm-intel, since libdrm-intel doesn't
> > build on arm. Shouldn't be a problem for you.
> > -Daniel
>
>
> Tested with the igt hot-unplug test. Passed unbind_rebind, unplug-rescan,
> hot-unbind-rebind and hotunplug-rescan when disabling the rescan part, as I
> don't support plug-back for now. Also added command submission for amdgpu.
> Attached a draft of submitting a workload while unbinding the driver or
> simulating detach. Caught 2 issues with unplug if command submission is in
> flight during unplug: an unsignaled fence causing a hang in amdgpu_cs_sync,
> and hitting a BUG_ON in gfx_v9_0_ring_emit_patch_cond_exec (which is
> expected, I guess).
> I guess glxgears submits commands at a much slower rate, so this was missed.
> Is that what you meant for this test?

Yup. Would be good if you can submit this one for inclusion.
-Daniel

>
> Andrey
>
>
> >
> >
> >> Andrey
> >>
> >>>> Andrey
> >>>>
> >>>>
> >>>>>>> But just to get it started
> >>>>>>> you can throw in entirely amdgpu specific subtests and just share some of
> >>>>>>> the test code.
> >>>>>>> -Daniel
> >>>>>> In general, I wasn't aware of this test suite, and it looks like it does
> >>>>>> what I test, among other stuff.
> >>>>>> I will definitely try to run with it, although the rescan part will not
> >>>>>> work, as plugging the device back is on my TODO list and not part of the
> >>>>>> scope for this patchset, so I will probably comment the re-scan section
> >>>>>> out while testing.
> >>>>> amd gem has been using libdrm-amd thus far iirc, but for things like
> >>>>> this I think it'd be worth at least considering switching. The display
> >>>>> team has already started to use some of the tests and contribute stuff
> >>>>> (I think the VRR testcase is from amd).
> >>>>> -Daniel
> >>>>>
> >>>>>> Andrey
> >>>>>>
> >>>>>>
> >>>>>>>> Andrey Grodzovsky (13):
> >>>>>>>>       drm/ttm: Remap all page faults to per process dummy page.
> >>>>>>>>       drm: Unamp the entire device address space on device unplug
> >>>>>>>>       drm/ttm: Expose ttm_tt_unpopulate for driver use
> >>>>>>>>       drm/sched: Cancel and flush all oustatdning jobs before finish.
> >>>>>>>>       drm/amdgpu: Split amdgpu_device_fini into early and late
> >>>>>>>>       drm/amdgpu: Add early fini callback
> >>>>>>>>       drm/amdgpu: Register IOMMU topology notifier per device.
> >>>>>>>>       drm/amdgpu: Fix a bunch of sdma code crash post device unplug
> >>>>>>>>       drm/amdgpu: Remap all page faults to per process dummy page.
> >>>>>>>>       dmr/amdgpu: Move some sysfs attrs creation to default_attr
> >>>>>>>>       drm/amdgpu: Guard against write accesses after device removal
> >>>>>>>>       drm/sched: Make timeout timer rearm conditional.
> >>>>>>>>       drm/amdgpu: Prevent any job recoveries after device is unplugged.
> >>>>>>>>
> >>>>>>>> Luben Tuikov (1):
> >>>>>>>>       drm/scheduler: Job timeout handler returns status
> >>>>>>>>
> >>>>>>>>      drivers/gpu/drm/amd/amdgpu/amdgpu.h               |  11 +-
> >>>>>>>>      drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c      |  17 +--
> >>>>>>>>      drivers/gpu/drm/amd/amdgpu/amdgpu_device.c        | 149 ++++++++++++++++++++--
> >>>>>>>>      drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c           |  20 ++-
> >>>>>>>>      drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c         |  15 ++-
> >>>>>>>>      drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c          |   2 +-
> >>>>>>>>      drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h          |   1 +
> >>>>>>>>      drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c           |   9 ++
> >>>>>>>>      drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c       |  25 ++--
> >>>>>>>>      drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c           |  26 ++--
> >>>>>>>>      drivers/gpu/drm/amd/amdgpu/amdgpu_irq.h           |   3 +-
> >>>>>>>>      drivers/gpu/drm/amd/amdgpu/amdgpu_job.c           |  19 ++-
> >>>>>>>>      drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c           |  12 +-
> >>>>>>>>      drivers/gpu/drm/amd/amdgpu/amdgpu_object.c        |  10 ++
> >>>>>>>>      drivers/gpu/drm/amd/amdgpu/amdgpu_object.h        |   2 +
> >>>>>>>>      drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c           |  53 +++++---
> >>>>>>>>      drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h           |   3 +
> >>>>>>>>      drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c           |   1 +
> >>>>>>>>      drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c          |  70 ++++++++++
> >>>>>>>>      drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h          |  52 +-------
> >>>>>>>>      drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c           |  21 ++-
> >>>>>>>>      drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c            |   8 +-
> >>>>>>>>      drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c      |  14 +-
> >>>>>>>>      drivers/gpu/drm/amd/amdgpu/cik_ih.c               |   2 +-
> >>>>>>>>      drivers/gpu/drm/amd/amdgpu/cz_ih.c                |   2 +-
> >>>>>>>>      drivers/gpu/drm/amd/amdgpu/iceland_ih.c           |   2 +-
> >>>>>>>>      drivers/gpu/drm/amd/amdgpu/navi10_ih.c            |   2 +-
> >>>>>>>>      drivers/gpu/drm/amd/amdgpu/psp_v11_0.c            |  16 +--
> >>>>>>>>      drivers/gpu/drm/amd/amdgpu/psp_v12_0.c            |   8 +-
> >>>>>>>>      drivers/gpu/drm/amd/amdgpu/psp_v3_1.c             |   8 +-
> >>>>>>>>      drivers/gpu/drm/amd/amdgpu/si_ih.c                |   2 +-
> >>>>>>>>      drivers/gpu/drm/amd/amdgpu/tonga_ih.c             |   2 +-
> >>>>>>>>      drivers/gpu/drm/amd/amdgpu/vega10_ih.c            |   2 +-
> >>>>>>>>      drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c |  12 +-
> >>>>>>>>      drivers/gpu/drm/amd/include/amd_shared.h          |   2 +
> >>>>>>>>      drivers/gpu/drm/drm_drv.c                         |   3 +
> >>>>>>>>      drivers/gpu/drm/etnaviv/etnaviv_sched.c           |  10 +-
> >>>>>>>>      drivers/gpu/drm/lima/lima_sched.c                 |   4 +-
> >>>>>>>>      drivers/gpu/drm/panfrost/panfrost_job.c           |   9 +-
> >>>>>>>>      drivers/gpu/drm/scheduler/sched_main.c            |  18 ++-
> >>>>>>>>      drivers/gpu/drm/ttm/ttm_bo_vm.c                   |  82 +++++++++++-
> >>>>>>>>      drivers/gpu/drm/ttm/ttm_tt.c                      |   1 +
> >>>>>>>>      drivers/gpu/drm/v3d/v3d_sched.c                   |  32 ++---
> >>>>>>>>      include/drm/gpu_scheduler.h                       |  17 ++-
> >>>>>>>>      include/drm/ttm/ttm_bo_api.h                      |   2 +
> >>>>>>>>      45 files changed, 583 insertions(+), 198 deletions(-)
> >>>>>>>>
> >>>>>>>> --
> >>>>>>>> 2.7.4
> >>>>>>>>
> >
> >



-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 196+ messages in thread
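
A rough sketch of the kind of test described above: keep the device busy
from a worker thread while the main thread simulates hot-unplug through
sysfs. This is illustrative only, not the actual IGT code; poke_device()
stands in for a real libdrm amdgpu command submission, and the render node
and PCI slot paths are assumptions.

/*
 * Sketch: poke the device in a loop while yanking it via sysfs.
 * Build with: gcc unplug_busy.c $(pkg-config --cflags --libs libdrm) -lpthread
 */
#include <fcntl.h>
#include <pthread.h>
#include <stdatomic.h>
#include <unistd.h>
#include <xf86drm.h>

static atomic_bool stop;

/* Stand-in for real work: a trivial ioctl against the device. A real test
 * would build and submit an actual command stream via the libdrm amdgpu API. */
static int poke_device(int fd)
{
	drmVersionPtr v = drmGetVersion(fd);

	if (!v)
		return -1;
	drmFreeVersion(v);
	return 0;
}

static void *submit_loop(void *arg)
{
	int fd = *(int *)arg;

	while (!atomic_load(&stop))
		if (poke_device(fd) < 0)
			break;	/* expected to start failing once the device is gone */
	return NULL;
}

int main(void)
{
	/* Assumed paths: render node of the secondary card and its PCI slot. */
	int fd = open("/dev/dri/renderD129", O_RDWR);
	int sysfs = open("/sys/bus/pci/devices/0000:03:00.0/remove", O_WRONLY);
	pthread_t thr;

	if (fd < 0 || sysfs < 0)
		return 1;

	pthread_create(&thr, NULL, submit_loop, &fd);
	usleep(100 * 1000);	/* let some submissions get in flight */
	write(sysfs, "1", 1);	/* simulate the hot-unplug */
	atomic_store(&stop, true);
	pthread_join(thr, NULL);
	close(fd);		/* drop the last reference to the dead device */
	return 0;
}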

* Re: [PATCH v4 11/14] drm/amdgpu: Guard against write accesses after device removal
  2021-02-07 21:50                             ` Daniel Vetter
@ 2021-02-08  9:37                               ` Christian König
  -1 siblings, 0 replies; 196+ messages in thread
From: Christian König @ 2021-02-08  9:37 UTC (permalink / raw)
  To: Daniel Vetter, Andrey Grodzovsky
  Cc: Greg KH, amd-gfx list, dri-devel, Alex Deucher,
	Christian König, Qiang Yu

Am 07.02.21 um 22:50 schrieb Daniel Vetter:
> [SNIP]
>> Clarification - as far as I know there are no page fault handlers for kernel
>> mappings. And we are talking about kernel mappings here, right? If there were,
>> I could solve all those issues the same way as I do for user mappings, by
>> invalidating all existing mappings in the kernel (both kmaps and ioremaps) and
>> inserting a dummy zero or ~0 filled page instead.
>> Also, I assume forcefully remapping the IO BAR to a ~0 filled page would involve
>> the ioremap API, and it's not something that I think can be easily done,
>> according to an answer I got to a related topic a few weeks ago:
>> https://www.spinics.net/lists/linux-pci/msg103396.html (that was the only reply
>> I got)
> mmiotrace can, but only for debug, and only on x86 platforms:
>
> https://www.kernel.org/doc/html/latest/trace/mmiotrace.html
>
> Should be feasible (but maybe not worth the effort) to extend this to
> support fake unplug.

Mhm, interesting idea you guys brought up here.

We don't need a page fault for this to work; all we need to do is insert 
dummy PTEs into the kernel's page table at the place where the MMIO 
mapping previously was.

>>> But ugh ...
>>>
>>> Otoh validating an entire driver like amdgpu without such a trick
>>> against 0xff reads is practically impossible. So maybe you need to add
>>> this as one of the tasks here?
>> Or, just for validation purposes, I could return ~0 from all reg reads in the
>> code and ignore writes if drm_dev_unplugged; this alone could already validate
>> a big portion of the code flow under such a scenario.
> Hm, yeah, if you really wrap them all, that should work too. Since
> iomappings have the __iomem pointer type, as long as amdgpu is sparse
> warning free, it should be doable to guarantee this.

The problem is that ~0 is not always a valid register value.

You would need to audit every register read to make sure it doesn't use the 
returned value blindly as an index or similar. That is quite a bit of work.

Regards,
Christian.

> -Daniel
>
>> Andrey
>>

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 196+ messages in thread
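
What the "return ~0 from all reg reads" idea amounts to, as a minimal
sketch. The struct and function names here are illustrative, not the
actual amdgpu helpers; the only assumptions are that all MMIO access is
funneled through one read and one write function and that
drm_dev_is_unplugged() is usable in that path.

#include <drm/drm_drv.h>
#include <linux/io.h>

/* Illustrative device structure; the real driver has its own. */
struct my_gpu_device {
	struct drm_device ddev;
	void __iomem *rmmio;	/* MMIO BAR mapping */
};

static u32 my_gpu_rreg(struct my_gpu_device *adev, u32 reg)
{
	/* After unplug, behave like a missing PCI device: reads return ~0. */
	if (drm_dev_is_unplugged(&adev->ddev))
		return ~0U;

	return readl(adev->rmmio + reg);
}

static void my_gpu_wreg(struct my_gpu_device *adev, u32 reg, u32 v)
{
	/* ...and writes are silently dropped. */
	if (drm_dev_is_unplugged(&adev->ddev))
		return;

	writel(v, adev->rmmio + reg);
}

Christian's caveat above still applies: every caller has to tolerate the
~0, e.g. never use it blindly as an index.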

* Re: [PATCH v4 11/14] drm/amdgpu: Guard against write accesses after device removal
  2021-02-08  9:37                               ` Christian König
@ 2021-02-08  9:48                                 ` Daniel Vetter
  -1 siblings, 0 replies; 196+ messages in thread
From: Daniel Vetter @ 2021-02-08  9:48 UTC (permalink / raw)
  To: christian.koenig
  Cc: Daniel Vetter, dri-devel, amd-gfx list, Greg KH, Alex Deucher, Qiang Yu

On Mon, Feb 08, 2021 at 10:37:19AM +0100, Christian König wrote:
> Am 07.02.21 um 22:50 schrieb Daniel Vetter:
> > [SNIP]
> > > Clarification - as far as I know there are no page fault handlers for kernel
> > > mappings. And we are talking about kernel mappings here, right? If there were,
> > > I could solve all those issues the same way as I do for user mappings, by
> > > invalidating all existing mappings in the kernel (both kmaps and ioremaps) and
> > > inserting a dummy zero or ~0 filled page instead.
> > > Also, I assume forcefully remapping the IO BAR to a ~0 filled page would involve
> > > the ioremap API, and it's not something that I think can be easily done,
> > > according to an answer I got to a related topic a few weeks ago:
> > > https://www.spinics.net/lists/linux-pci/msg103396.html (that was the only reply
> > > I got)
> > mmiotrace can, but only for debug, and only on x86 platforms:
> > 
> > https://www.kernel.org/doc/html/latest/trace/mmiotrace.html
> > 
> > Should be feasible (but maybe not worth the effort) to extend this to
> > support fake unplug.
> 
> Mhm, interesting idea you guys brought up here.
> 
> We don't need a page fault for this to work; all we need to do is insert
> dummy PTEs into the kernel's page table at the place where the
> MMIO mapping previously was.

A simple pte trick isn't enough, because we need:
- all writes dropped silently
- all reads returning 0xff

ptes can't do that by themselves; we minimally need write protection and then
to silently proceed on each write fault without restarting the instruction.
Better would be to only catch reads, but x86 doesn't do write-only pte
permissions afaik.

> > > > But ugh ...
> > > > 
> > > > Otoh validating an entire driver like amdgpu without such a trick
> > > > against 0xff reads is practically impossible. So maybe you need to add
> > > > this as one of the tasks here?
> > > Or, just for validation purposes, I could return ~0 from all reg reads in the
> > > code and ignore writes if drm_dev_unplugged; this alone could already validate
> > > a big portion of the code flow under such a scenario.
> > Hm, yeah, if you really wrap them all, that should work too. Since
> > iomappings have the __iomem pointer type, as long as amdgpu is sparse
> > warning free, it should be doable to guarantee this.
> 
> The problem is that ~0 is not always a valid register value.
> 
> You would need to audit every register read to make sure it doesn't use the
> returned value blindly as an index or similar. That is quite a bit of work.

Yeah that's the entire crux here :-/
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 196+ messages in thread
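
The guard that patch 11 of the series is about can be sketched with the
drm_dev_enter()/drm_dev_exit() SRCU section, which skips an entire
hardware programming sequence once the device is gone instead of trying
to trap individual accesses. The register offsets and device struct below
are illustrative, not real amdgpu definitions.

#include <drm/drm_drv.h>
#include <linux/io.h>
#include <linux/kernel.h>

#define RING_BASE	0x0100	/* illustrative register offsets */
#define RING_ENABLE	0x0104

struct my_gpu_device {
	struct drm_device ddev;
	void __iomem *rmmio;
};

static void my_gpu_program_ring(struct my_gpu_device *adev, u64 gpu_addr)
{
	int idx;

	/* drm_dev_enter() fails once drm_dev_unplug() has been called, so
	 * the whole write sequence below is dropped for a removed device. */
	if (!drm_dev_enter(&adev->ddev, &idx))
		return;

	writel(lower_32_bits(gpu_addr), adev->rmmio + RING_BASE);
	writel(1, adev->rmmio + RING_ENABLE);

	drm_dev_exit(idx);	/* close the unplug-protected section */
}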

* Re: [PATCH v4 11/14] drm/amdgpu: Guard against write accesses after device removal
  2021-02-08  9:48                                 ` Daniel Vetter
@ 2021-02-08 10:03                                   ` Christian König
  -1 siblings, 0 replies; 196+ messages in thread
From: Christian König @ 2021-02-08 10:03 UTC (permalink / raw)
  To: Daniel Vetter, christian.koenig
  Cc: Daniel Vetter, dri-devel, amd-gfx list, Greg KH, Alex Deucher, Qiang Yu

Am 08.02.21 um 10:48 schrieb Daniel Vetter:
> On Mon, Feb 08, 2021 at 10:37:19AM +0100, Christian König wrote:
>> Am 07.02.21 um 22:50 schrieb Daniel Vetter:
>>> [SNIP]
>>>> Clarification - as far as I know there are no page fault handlers for kernel
>>>> mappings. And we are talking about kernel mappings here, right? If there were,
>>>> I could solve all those issues the same way as I do for user mappings, by
>>>> invalidating all existing mappings in the kernel (both kmaps and ioremaps) and
>>>> inserting a dummy zero or ~0 filled page instead.
>>>> Also, I assume forcefully remapping the IO BAR to a ~0 filled page would involve
>>>> the ioremap API, and it's not something that I think can be easily done,
>>>> according to an answer I got to a related topic a few weeks ago:
>>>> https://www.spinics.net/lists/linux-pci/msg103396.html (that was the only reply
>>>> I got)
>>> mmiotrace can, but only for debug, and only on x86 platforms:
>>>
>>> https://www.kernel.org/doc/html/latest/trace/mmiotrace.html
>>>
>>> Should be feasible (but maybe not worth the effort) to extend this to
>>> support fake unplug.
>> Mhm, interesting idea you guys brought up here.
>>
>> We don't need a page fault for this to work; all we need to do is insert
>> dummy PTEs into the kernel's page table at the place where the
>> MMIO mapping previously was.
> A simple pte trick isn't enough, because we need:
> - all writes dropped silently
> - all reads returning 0xff
>
> ptes can't do that by themselves; we minimally need write protection and then
> to silently proceed on each write fault without restarting the instruction.
> Better would be to only catch reads, but x86 doesn't do write-only pte
> permissions afaik.

You are not thinking far enough :)

The dummy PTE points to a dummy MMIO page which is just never used.

That has the exact same properties as our removed MMIO space, it just 
doesn't go bananas when a new device is MMIO mapped into that range and our 
driver still tries to write there.

Regards,
Christian.


>
>>>>> But ugh ...
>>>>>
>>>>> Otoh validating an entire driver like amdgpu without such a trick
>>>>> against 0xff reads is practically impossible. So maybe you need to add
>>>>> this as one of the tasks here?
>>>> Or, just for validation purposes, I could return ~0 from all reg reads in the
>>>> code and ignore writes if drm_dev_unplugged; this alone could already validate
>>>> a big portion of the code flow under such a scenario.
>>> Hm, yeah, if you really wrap them all, that should work too. Since
>>> iomappings have the __iomem pointer type, as long as amdgpu is sparse
>>> warning free, it should be doable to guarantee this.
>> The problem is that ~0 is not always a valid register value.
>>
>> You would need to audit every register read to make sure it doesn't use the
>> returned value blindly as an index or similar. That is quite a bit of work.
> Yeah that's the entire crux here :-/
> -Daniel

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 196+ messages in thread
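
For contrast with the kernel-mapping problem discussed here: the
user-mapping side of the series can rely on the CPU fault handler, which
is where the per-process dummy page comes in. A sketch of that shape
(not the actual TTM code; resolve_real_fault() and my_gpu_dummy_page()
are hypothetical helpers):

#include <drm/drm_drv.h>
#include <linux/mm.h>

/* Hypothetical helpers: the normal fault path, and a lookup of the
 * per-process dummy page the cover letter describes. */
vm_fault_t resolve_real_fault(struct vm_fault *vmf);
struct page *my_gpu_dummy_page(struct vm_fault *vmf);

static vm_fault_t my_gpu_fault(struct vm_fault *vmf)
{
	struct drm_device *ddev = vmf->vma->vm_private_data; /* illustrative */
	vm_fault_t ret;
	int idx;

	if (drm_dev_enter(ddev, &idx)) {
		/* Device still present: fill the PTE from the real BO. */
		ret = resolve_real_fault(vmf);
		drm_dev_exit(idx);
	} else {
		/* Device gone: back the VMA with a dummy page so the app
		 * keeps running (on stale data) instead of taking SIGBUS. */
		ret = vmf_insert_page(vmf->vma, vmf->address,
				      my_gpu_dummy_page(vmf));
	}
	return ret;
}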

* Re: [PATCH v4 11/14] drm/amdgpu: Guard against write accesses after device removal
  2021-02-08 10:03                                   ` Christian König
@ 2021-02-08 10:11                                     ` Daniel Vetter
  -1 siblings, 0 replies; 196+ messages in thread
From: Daniel Vetter @ 2021-02-08 10:11 UTC (permalink / raw)
  To: christian.koenig
  Cc: Daniel Vetter, dri-devel, amd-gfx list, Greg KH, Alex Deucher, Qiang Yu

On Mon, Feb 08, 2021 at 11:03:15AM +0100, Christian König wrote:
> Am 08.02.21 um 10:48 schrieb Daniel Vetter:
> > On Mon, Feb 08, 2021 at 10:37:19AM +0100, Christian König wrote:
> > > Am 07.02.21 um 22:50 schrieb Daniel Vetter:
> > > > [SNIP]
> > > > > Clarification - as far as I know there are no page fault handlers for kernel
> > > > > mappings. And we are talking about kernel mappings here, right? If there were,
> > > > > I could solve all those issues the same way as I do for user mappings, by
> > > > > invalidating all existing mappings in the kernel (both kmaps and ioremaps) and
> > > > > inserting a dummy zero or ~0 filled page instead.
> > > > > Also, I assume forcefully remapping the IO BAR to a ~0 filled page would involve
> > > > > the ioremap API, and it's not something that I think can be easily done,
> > > > > according to an answer I got to a related topic a few weeks ago:
> > > > > https://www.spinics.net/lists/linux-pci/msg103396.html (that was the only reply
> > > > > I got)
> > > > mmiotrace can, but only for debug, and only on x86 platforms:
> > > > 
> > > > https://www.kernel.org/doc/html/latest/trace/mmiotrace.html
> > > > 
> > > > Should be feasible (but maybe not worth the effort) to extend this to
> > > > support fake unplug.
> > > Mhm, interesting idea you guys brought up here.
> > > 
> > > We don't need a page fault for this to work; all we need to do is insert
> > > dummy PTEs into the kernel's page table at the place where the
> > > MMIO mapping previously was.
> > A simple pte trick isn't enough, because we need:
> > - all writes dropped silently
> > - all reads returning 0xff
> > 
> > ptes can't do that by themselves; we minimally need write protection and then
> > to silently proceed on each write fault without restarting the instruction.
> > Better would be to only catch reads, but x86 doesn't do write-only pte
> > permissions afaik.
> 
> You are not thinking far enough :)
> 
> The dummy PTE points to a dummy MMIO page which is just never used.
> 
> That has the exact same properties as our removed MMIO space, it just doesn't
> go bananas when a new device is MMIO mapped into that range and our driver still
> tries to write there.

Hm, but where do we get such a "guaranteed never used" mmio page from?

It's a nifty idea indeed otherwise ...
-Daniel

> 
> Regards,
> Christian.
> 
> 
> > 
> > > > > > But ugh ...
> > > > > > 
> > > > > > Otoh validating an entire driver like amdgpu without such a trick
> > > > > > against 0xff reads is practically impossible. So maybe you need to add
> > > > > > this as one of the tasks here?
> > > > > Or, just for validation purposes, I could return ~0 from all reg reads in the
> > > > > code and ignore writes if drm_dev_unplugged; this alone could already validate
> > > > > a big portion of the code flow under such a scenario.
> > > > Hm, yeah, if you really wrap them all, that should work too. Since
> > > > iomappings have the __iomem pointer type, as long as amdgpu is sparse
> > > > warning free, it should be doable to guarantee this.
> > > The problem is that ~0 is not always a valid register value.
> > > 
> > > You would need to audit every register read to make sure it doesn't use the
> > > returned value blindly as an index or similar. That is quite a bit of work.
> > Yeah that's the entire crux here :-/
> > -Daniel
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 196+ messages in thread

* Re: [PATCH v4 11/14] drm/amdgpu: Guard against write accesses after device removal
  2021-02-08 10:11                                     ` Daniel Vetter
@ 2021-02-08 13:59                                       ` Christian König
  -1 siblings, 0 replies; 196+ messages in thread
From: Christian König @ 2021-02-08 13:59 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Daniel Vetter, dri-devel, amd-gfx list, Greg KH, Alex Deucher, Qiang Yu

Am 08.02.21 um 11:11 schrieb Daniel Vetter:
> On Mon, Feb 08, 2021 at 11:03:15AM +0100, Christian König wrote:
>> Am 08.02.21 um 10:48 schrieb Daniel Vetter:
>>> On Mon, Feb 08, 2021 at 10:37:19AM +0100, Christian König wrote:
>>>> Am 07.02.21 um 22:50 schrieb Daniel Vetter:
>>>>> [SNIP]
>>>>>> Clarification - as far as I know there are no page fault handlers for kernel
>>>>>> mappings. And we are talking about kernel mappings here, right? If there were,
>>>>>> I could solve all those issues the same way as I do for user mappings, by
>>>>>> invalidating all existing mappings in the kernel (both kmaps and ioremaps) and
>>>>>> inserting a dummy zero or ~0 filled page instead.
>>>>>> Also, I assume forcefully remapping the IO BAR to a ~0 filled page would involve
>>>>>> the ioremap API, and it's not something that I think can be easily done,
>>>>>> according to an answer I got to a related topic a few weeks ago:
>>>>>> https://www.spinics.net/lists/linux-pci/msg103396.html (that was the only reply
>>>>>> I got)
>>>>> mmiotrace can, but only for debug, and only on x86 platforms:
>>>>>
>>>>> https://www.kernel.org/doc/html/latest/trace/mmiotrace.html
>>>>>
>>>>> Should be feasible (but maybe not worth the effort) to extend this to
>>>>> support fake unplug.
>>>> Mhm, interesting idea you guys brought up here.
>>>>
>>>> We don't need a page fault for this to work; all we need to do is insert
>>>> dummy PTEs into the kernel's page table at the place where the MMIO
>>>> mapping previously was.
>>> A simple pte trick isn't enough, because we need to:
>>> - drop all writes silently
>>> - have all reads return 0xff
>>>
>>> ptes can't do that by themselves; we minimally need write protection and
>>> then silently proceed on each write fault without restarting the
>>> instruction. Better would be to catch only reads, but x86 doesn't do
>>> write-only pte permissions afaik.
>> You are not thinking far enough :)
>>
>> The dummy PTE would point to a dummy MMIO page which is just never used.
>>
>> That has the exact same properties as our removed MMIO space, it just
>> doesn't go bananas when a new device is MMIO mapped into that range and
>> our driver still tries to write there.
> Hm, but where do we get such a "guaranteed never used" mmio page from?

Well, we have tons of unused IO space on 64bit systems these days.

It doesn't really need to be PCIe address space, does it?

Christian.

>
> It's a nifty idea indeed otherwise ...
> -Daniel
>
>> Regards,
>> Christian.
>>
>>
>>>>>>> But ugh ...
>>>>>>>
>>>>>>> Otoh validating an entire driver like amdgpu without such a trick
>>>>>>> against 0xff reads is practically impossible. So maybe you need to add
>>>>>>> this as one of the tasks here?
>>>>>> Or I could, just for validation purposes, return ~0 from all reg reads in the code
>>>>>> and ignore writes if drm_dev_unplugged; this could already easily validate a big
>>>>>> portion of the code flow under such a scenario.
>>>>> Hm yeah, if you really wrap them all, that should work too. Since
>>>>> io mappings have __iomem pointer type, as long as amdgpu is sparse
>>>>> warning free, it should be doable to guarantee this.
>>>> The problem is that ~0 is not always a valid register value.
>>>>
>>>> You would need to audit every register read to ensure it doesn't use the
>>>> returned value blindly as an index or similar. That is quite a bit of work.
>>> Yeah that's the entire crux here :-/
>>> -Daniel

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 196+ messages in thread

* Re: [PATCH v4 11/14] drm/amdgpu: Guard against write accesses after device removal
  2021-02-08 13:59                                       ` Christian König
@ 2021-02-08 16:23                                         ` Daniel Vetter
  -1 siblings, 0 replies; 196+ messages in thread
From: Daniel Vetter @ 2021-02-08 16:23 UTC (permalink / raw)
  To: Christian König
  Cc: Greg KH, amd-gfx list, dri-devel, Alex Deucher, Qiang Yu

On Mon, Feb 8, 2021 at 3:00 PM Christian König <christian.koenig@amd.com> wrote:
>
> On 08.02.21 at 11:11, Daniel Vetter wrote:
> > On Mon, Feb 08, 2021 at 11:03:15AM +0100, Christian König wrote:
> >> On 08.02.21 at 10:48, Daniel Vetter wrote:
> >>> On Mon, Feb 08, 2021 at 10:37:19AM +0100, Christian König wrote:
> >>>> On 07.02.21 at 22:50, Daniel Vetter wrote:
> >>>>> [SNIP]
> >>>>>> Clarification - as far as I know there are no page fault handlers for kernel
> >>>>>> mappings. And we are talking about kernel mappings here, right? If there were,
> >>>>>> I could solve all those issues the same way as I do for user mappings, by
> >>>>>> invalidating all existing mappings in the kernel (both kmaps and ioremaps) and
> >>>>>> inserting a dummy zero or ~0 filled page instead.
> >>>>>> Also, I assume forcefully remapping the IO BAR to a ~0 filled page would involve
> >>>>>> the ioremap API, and it's not something that I think can be easily done according
> >>>>>> to an answer I got on a related topic a few weeks ago
> >>>>>> https://www.spinics.net/lists/linux-pci/msg103396.html (that was the only reply
> >>>>>> I got)
> >>>>> mmiotrace can, but only for debug, and only on x86 platforms:
> >>>>>
> >>>>> https://www.kernel.org/doc/html/latest/trace/mmiotrace.html
> >>>>>
> >>>>> Should be feasible (but maybe not worth the effort) to extend this to
> >>>>> support fake unplug.
> >>>> Mhm, interesting idea you guys brought up here.
> >>>>
> >>>> We don't need a page fault for this to work; all we need to do is insert
> >>>> dummy PTEs into the kernel's page table at the place where the MMIO
> >>>> mapping previously was.
> >>> A simple pte trick isn't enough, because we need to:
> >>> - drop all writes silently
> >>> - have all reads return 0xff
> >>>
> >>> ptes can't do that by themselves; we minimally need write protection and
> >>> then silently proceed on each write fault without restarting the
> >>> instruction. Better would be to catch only reads, but x86 doesn't do
> >>> write-only pte permissions afaik.
> >> You are not thinking far enough :)
> >>
> >> The dummy PTE would point to a dummy MMIO page which is just never used.
> >>
> >> That has the exact same properties as our removed MMIO space, it just
> >> doesn't go bananas when a new device is MMIO mapped into that range and
> >> our driver still tries to write there.
> > Hm, but where do we get such a "guaranteed never used" mmio page from?
>
> Well, we have tons of unused IO space on 64bit systems these days.
>
> It doesn't really need to be PCIe address space, does it?

That sounds very trusting that modern systems don't decode random
ranges. E.g. the pci code stopped extending the host bridge windows on
its own, entirely relying on the acpi provided ranges, to avoid
stomping on stuff that's there but not listed anywhere.

I guess if we have a range behind a pci bridge which isn't used by
any device, but is decoded by the bridge, then that should be safe
enough. Maybe we could even have an option upstream to do that on
unplug, if a certain flag is set, or via a cmdline option.
-Daniel

>
> Christian.
>
> >
> > It's a nifty idea indeed otherwise ...
> > -Daniel
> >
> >> Regards,
> >> Christian.
> >>
> >>
> >>>>>>> But ugh ...
> >>>>>>>
> >>>>>>> Otoh validating an entire driver like amdgpu without such a trick
> >>>>>>> against 0xff reads is practically impossible. So maybe you need to add
> >>>>>>> this as one of the tasks here?
> >>>>>> Or I could, just for validation purposes, return ~0 from all reg reads in the code
> >>>>>> and ignore writes if drm_dev_unplugged; this could already easily validate a big
> >>>>>> portion of the code flow under such a scenario.
> >>>>> Hm yeah, if you really wrap them all, that should work too. Since
> >>>>> io mappings have __iomem pointer type, as long as amdgpu is sparse
> >>>>> warning free, it should be doable to guarantee this.
> >>>> The problem is that ~0 is not always a valid register value.
> >>>>
> >>>> You would need to audit every register read to ensure it doesn't use the
> >>>> returned value blindly as an index or similar. That is quite a bit of work.
> >>> Yeah that's the entire crux here :-/
> >>> -Daniel
>


-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 196+ messages in thread
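
Daniel's "range decoded by a bridge but used by no device" idea could look
roughly like the sketch below. This is purely an illustration under stated
assumptions, not anything in the posted patches: it walks the memory
windows of the device's parent bus and carves one free page out of the
first window that has room, leaving validation of the window choice
hand-waved.

#include <linux/pci.h>
#include <linux/ioport.h>
#include <linux/io.h>

/* Hypothetical helper: reserve a bridge-decoded but device-free page of
 * MMIO space and ioremap it as the "guaranteed never used" dummy target. */
static void __iomem *map_dummy_mmio(struct pci_dev *pdev, struct resource *res)
{
	struct resource *win;
	int i;

	pci_bus_for_each_resource(pdev->bus, win, i) {
		if (!win || !(win->flags & IORESOURCE_MEM))
			continue;
		res->name = "dummy-mmio";
		res->flags = IORESOURCE_MEM;
		/* grab any free page-sized chunk inside this window */
		if (!allocate_resource(win, res, PAGE_SIZE, win->start,
				       win->end, PAGE_SIZE, NULL, NULL))
			return ioremap(res->start, PAGE_SIZE);
	}
	return NULL;
}

Whether such a free chunk exists at all depends on the firmware-assigned
windows, which is exactly the trust problem Daniel raises above.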

* Re: [PATCH v4 11/14] drm/amdgpu: Guard against write accesses after device removal
  2021-02-08  9:37                               ` Christian König
@ 2021-02-08 22:09                                 ` Andrey Grodzovsky
  -1 siblings, 0 replies; 196+ messages in thread
From: Andrey Grodzovsky @ 2021-02-08 22:09 UTC (permalink / raw)
  To: christian.koenig, Daniel Vetter
  Cc: Greg KH, amd-gfx list, dri-devel, Alex Deucher, Qiang Yu




On 2/8/21 4:37 AM, Christian König wrote:
> On 07.02.21 at 22:50, Daniel Vetter wrote:
>> [SNIP]
>>> Clarification - as far as I know there are no page fault handlers for kernel
>>> mappings. And we are talking about kernel mappings here, right? If there were,
>>> I could solve all those issues the same way as I do for user mappings, by
>>> invalidating all existing mappings in the kernel (both kmaps and ioremaps) and
>>> inserting a dummy zero or ~0 filled page instead.
>>> Also, I assume forcefully remapping the IO BAR to a ~0 filled page would involve
>>> the ioremap API, and it's not something that I think can be easily done according
>>> to an answer I got on a related topic a few weeks ago
>>> https://www.spinics.net/lists/linux-pci/msg103396.html (that was the only reply
>>> I got)
>> mmiotrace can, but only for debug, and only on x86 platforms:
>>
>> https://www.kernel.org/doc/html/latest/trace/mmiotrace.html
>>
>>
>> Should be feasible (but maybe not worth the effort) to extend this to
>> support fake unplug.
>
> Mhm, interesting idea you guys brought up here.
>
> We don't need a page fault for this to work; all we need to do is insert
> dummy PTEs into the kernel's page table at the place where the MMIO
> mapping previously was.


But that is exactly what Matthew from linux-mm says is not a trivial thing to do, quote:

"

ioremap() is done through the vmalloc space.  It would, in theory, be
possible to reprogram the page tables used for vmalloc to point to your
magic page.  I don't think we have such a mechanism today, and there are
lots of problems with things like TLB flushes.  It's probably going to
be harder than you think.
"

If you believe it's actually doable then it would be useful not only for simulating the
device-unplugged situation with all MMIOs returning 0xff..., but also for actual handling
of driver accesses to MMIO after the device is gone, and we could then drop this patch
entirely as there would be no need to guard against such accesses post device unplug.


>
>>>> But ugh ...
>>>>
>>>> Otoh validating an entire driver like amdgpu without such a trick
>>>> against 0xff reads is practically impossible. So maybe you need to add
>>>> this as one of the tasks here?
>>> Or I could, just for validation purposes, return ~0 from all reg reads in the
>>> code and ignore writes if drm_dev_unplugged; this could already easily
>>> validate a big portion of the code flow under such a scenario.
>> Hm yeah, if you really wrap them all, that should work too. Since
>> io mappings have __iomem pointer type, as long as amdgpu is sparse
>> warning free, it should be doable to guarantee this.
>
> The problem is that ~0 is not always a valid register value.
>
> You would need to audit every register read to ensure it doesn't use the
> returned value blindly as an index or similar. That is quite a bit of work.


But ~0 is the value that will be returned for every read post device unplug,
regardless of whether it's valid or not, and we have to cope with
it then, no?

Andrey


>
> Regards,
> Christian.
>
>> -Daniel
>>
>>> Andrey
>>>
>


_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 196+ messages in thread
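
For contrast, the user-mapping side Andrey refers to is much simpler,
because CPU faults on userspace mappings already funnel through the
driver's vm_operations_struct. A rough sketch of that path, assuming a
preallocated dummy_page and a placeholder for the live-device fault
handling (neither name is from the actual patches):

#include <linux/mm.h>
#include <drm/drm_drv.h>

static struct page *dummy_page;	/* assumed allocated elsewhere, per process */

static vm_fault_t handle_live_device_fault(struct vm_fault *vmf); /* placeholder */

/* Fault handler sketch: once the device is unplugged, back the VMA with a
 * dummy system page instead of the now-gone device memory, so the user
 * process keeps running instead of crashing. */
static vm_fault_t my_bo_vm_fault(struct vm_fault *vmf)
{
	struct drm_device *ddev = vmf->vma->vm_private_data; /* assumption */
	vm_fault_t ret;
	int idx;

	if (!drm_dev_enter(ddev, &idx))
		return vmf_insert_page(vmf->vma, vmf->address, dummy_page);

	ret = handle_live_device_fault(vmf);	/* real fault path when alive */
	drm_dev_exit(idx);
	return ret;
}

The kernel-side ioremap case has no such hook, which is why Matthew's
reply above calls reprogramming the vmalloc page tables non-trivial.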

* Re: [PATCH v4 11/14] drm/amdgpu: Guard against write accesses after device removal
  2021-02-08 16:23                                         ` Daniel Vetter
@ 2021-02-08 22:15                                           ` Andrey Grodzovsky
  -1 siblings, 0 replies; 196+ messages in thread
From: Andrey Grodzovsky @ 2021-02-08 22:15 UTC (permalink / raw)
  To: Daniel Vetter, Christian König
  Cc: Greg KH, amd-gfx list, dri-devel, Alex Deucher, Qiang Yu


On 2/8/21 11:23 AM, Daniel Vetter wrote:
> On Mon, Feb 8, 2021 at 3:00 PM Christian König <christian.koenig@amd.com> wrote:
>> On 08.02.21 at 11:11, Daniel Vetter wrote:
>>> On Mon, Feb 08, 2021 at 11:03:15AM +0100, Christian König wrote:
>>>> On 08.02.21 at 10:48, Daniel Vetter wrote:
>>>>> On Mon, Feb 08, 2021 at 10:37:19AM +0100, Christian König wrote:
>>>>>> On 07.02.21 at 22:50, Daniel Vetter wrote:
>>>>>>> [SNIP]
>>>>>>>> Clarification - as far as I know there are no page fault handlers for kernel
>>>>>>>> mappings. And we are talking about kernel mappings here, right? If there were,
>>>>>>>> I could solve all those issues the same way as I do for user mappings, by
>>>>>>>> invalidating all existing mappings in the kernel (both kmaps and ioremaps) and
>>>>>>>> inserting a dummy zero or ~0 filled page instead.
>>>>>>>> Also, I assume forcefully remapping the IO BAR to a ~0 filled page would involve
>>>>>>>> the ioremap API, and it's not something that I think can be easily done according
>>>>>>>> to an answer I got on a related topic a few weeks ago
>>>>>>>> https://www.spinics.net/lists/linux-pci/msg103396.html (that was the only reply
>>>>>>>> I got)
>>>>>>> mmiotrace can, but only for debug, and only on x86 platforms:
>>>>>>>
>>>>>>> https://www.kernel.org/doc/html/latest/trace/mmiotrace.html
>>>>>>>
>>>>>>> Should be feasible (but maybe not worth the effort) to extend this to
>>>>>>> support fake unplug.
>>>>>> Mhm, interesting idea you guys brought up here.
>>>>>>
>>>>>> We don't need a page fault for this to work; all we need to do is insert
>>>>>> dummy PTEs into the kernel's page table at the place where the MMIO
>>>>>> mapping previously was.
>>>>> A simple pte trick isn't enough, because we need to:
>>>>> - drop all writes silently
>>>>> - have all reads return 0xff
>>>>>
>>>>> ptes can't do that by themselves; we minimally need write protection and
>>>>> then silently proceed on each write fault without restarting the
>>>>> instruction. Better would be to catch only reads, but x86 doesn't do
>>>>> write-only pte permissions afaik.
>>>> You are not thinking far enough :)
>>>>
>>>> The dummy PTE would point to a dummy MMIO page which is just never used.
>>>>
>>>> That has the exact same properties as our removed MMIO space, it just
>>>> doesn't go bananas when a new device is MMIO mapped into that range and
>>>> our driver still tries to write there.
>>> Hm, but where do we get such a "guaranteed never used" mmio page from?
>> Well, we have tons of unused IO space on 64bit systems these days.
>>
>> It doesn't really need to be PCIe address space, does it?
> That sounds very trusting that modern systems don't decode random
> ranges. E.g. the pci code stopped extending the host bridge windows on
> its own, entirely relying on the acpi provided ranges, to avoid
> stomping on stuff that's there but not listed anywhere.
>
> I guess if we have a range behind a pci bridge which isn't used by
> any device, but is decoded by the bridge, then that should be safe
> enough. Maybe we could even have an option upstream to do that on
> unplug, if a certain flag is set, or via a cmdline option.
> -Daniel


Question - why can't we just set those PTEs to point to system memory
(another RO dummy page) filled with 1s?

Andrey


>
>> Christian.
>>
>>> It's a nifty idea indeed otherwise ...
>>> -Daniel
>>>
>>>> Regards,
>>>> Christian.
>>>>
>>>>
>>>>>>>>> But ugh ...
>>>>>>>>>
>>>>>>>>> Otoh validating an entire driver like amdgpu without such a trick
>>>>>>>>> against 0xff reads is practically impossible. So maybe you need to add
>>>>>>>>> this as one of the tasks here?
>>>>>>>> Or I could, just for validation purposes, return ~0 from all reg reads in the code
>>>>>>>> and ignore writes if drm_dev_unplugged; this could already easily validate a big
>>>>>>>> portion of the code flow under such a scenario.
>>>>>>> Hm yeah, if you really wrap them all, that should work too. Since
>>>>>>> io mappings have __iomem pointer type, as long as amdgpu is sparse
>>>>>>> warning free, it should be doable to guarantee this.
>>>>>> The problem is that ~0 is not always a valid register value.
>>>>>>
>>>>>> You would need to audit every register read to ensure it doesn't use the
>>>>>> returned value blindly as an index or similar. That is quite a bit of work.
>>>>> Yeah that's the entire crux here :-/
>>>>> -Daniel
>
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 196+ messages in thread
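
Creating the 1s-filled system page Andrey asks about is itself trivial; a
minimal sketch of just that piece (the open question above is whether
pointing formerly-MMIO kernel PTEs at it is safe, not the allocation):

#include <linux/gfp.h>
#include <linux/mm.h>
#include <linux/string.h>

/* Allocate a system page filled with 0xff so reads through it mimic a
 * dead PCIe device. Mapped read-only, writes would fault instead of being
 * silently dropped - the trade-off discussed in this thread. */
static struct page *alloc_ones_page(void)
{
	struct page *page = alloc_page(GFP_KERNEL);

	if (page)
		memset(page_address(page), 0xff, PAGE_SIZE);
	return page;
}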

* Re: [PATCH v4 00/14] RFC Support hot device unplug in amdgpu
  2021-02-08  7:27                   ` Daniel Vetter
@ 2021-02-09  4:01                     ` Andrey Grodzovsky
  -1 siblings, 0 replies; 196+ messages in thread
From: Andrey Grodzovsky @ 2021-02-09  4:01 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: amd-gfx list, Christian König, dri-devel, Qiang Yu, Greg KH,
	Alex Deucher


On 2/8/21 2:27 AM, Daniel Vetter wrote:
> On Mon, Feb 8, 2021 at 6:59 AM Andrey Grodzovsky
> <Andrey.Grodzovsky@amd.com> wrote:
>>
>> On 1/20/21 10:59 AM, Daniel Vetter wrote:
>>> On Wed, Jan 20, 2021 at 3:20 PM Andrey Grodzovsky
>>> <Andrey.Grodzovsky@amd.com> wrote:
>>>> On 1/20/21 4:05 AM, Daniel Vetter wrote:
>>>>> On Tue, Jan 19, 2021 at 01:18:15PM -0500, Andrey Grodzovsky wrote:
>>>>>> On 1/19/21 1:08 PM, Daniel Vetter wrote:
>>>>>>> On Tue, Jan 19, 2021 at 6:31 PM Andrey Grodzovsky
>>>>>>> <Andrey.Grodzovsky@amd.com> wrote:
>>>>>>>> On 1/19/21 9:16 AM, Daniel Vetter wrote:
>>>>>>>>> On Mon, Jan 18, 2021 at 04:01:09PM -0500, Andrey Grodzovsky wrote:
>>>>>>>>>> Until now extracting a card either by physical extraction (e.g. eGPU with
>>>>>>>>>> thunderbolt connection or by emulation through  syfs -> /sys/bus/pci/devices/device_id/remove)
>>>>>>>>>> would cause random crashes in user apps. The random crashes in apps were
>>>>>>>>>> mostly due to the app having mapped a device backed BO into its address
>>>>>>>>>> space was still trying to access the BO while the backing device was gone.
>>>>>>>>>> To answer this first problem Christian suggested to fix the handling of mapped
>>>>>>>>>> memory in the clients when the device goes away by forcibly unmap all buffers the
>>>>>>>>>> user processes has by clearing their respective VMAs mapping the device BOs.
>>>>>>>>>> Then when the VMAs try to fill in the page tables again we check in the fault
>>>>>>>>>> handlerif the device is removed and if so, return an error. This will generate a
>>>>>>>>>> SIGBUS to the application which can then cleanly terminate.This indeed was done
>>>>>>>>>> but this in turn created a problem of kernel OOPs were the OOPSes were due to the
>>>>>>>>>> fact that while the app was terminating because of the SIGBUSit would trigger use
>>>>>>>>>> after free in the driver by calling to accesses device structures that were already
>>>>>>>>>> released from the pci remove sequence.This was handled by introducing a 'flush'
>>>>>>>>>> sequence during device removal were we wait for drm file reference to drop to 0
>>>>>>>>>> meaning all user clients directly using this device terminated.
>>>>>>>>>>
>>>>>>>>>> v2:
>>>>>>>>>> Based on discussions in the mailing list with Daniel and Pekka [1] and based on the document
>>>>>>>>>> produced by Pekka from those discussions [2] the whole approach with returning SIGBUS and
>>>>>>>>>> waiting for all user clients having CPU mapping of device BOs to die was dropped.
>>>>>>>>>> Instead as per the document suggestion the device structures are kept alive until
>>>>>>>>>> the last reference to the device is dropped by user client and in the meanwhile all existing and new CPU mappings of the BOs
>>>>>>>>>> belonging to the device directly or by dma-buf import are rerouted to per user
>>>>>>>>>> process dummy rw page.Also, I skipped the 'Requirements for KMS UAPI' section of [2]
>>>>>>>>>> since i am trying to get the minimal set of requirements that still give useful solution
>>>>>>>>>> to work and this is the'Requirements for Render and Cross-Device UAPI' section and so my
>>>>>>>>>> test case is removing a secondary device, which is render only and is not involved
>>>>>>>>>> in KMS.
>>>>>>>>>>
>>>>>>>>>> v3:
>>>>>>>>>> More updates following comments from v2 such as removing loop to find DRM file when rerouting
>>>>>>>>>> page faults to dummy page,getting rid of unnecessary sysfs handling refactoring and moving
>>>>>>>>>> prevention of GPU recovery post device unplug from amdgpu to scheduler layer.
>>>>>>>>>> On top of that added unplug support for the IOMMU enabled system.
>>>>>>>>>>
>>>>>>>>>> v4:
>>>>>>>>>> Drop last sysfs hack and use sysfs default attribute.
>>>>>>>>>> Guard against write accesses after device removal to avoid modifying released memory.
>>>>>>>>>> Update dummy pages handling to on demand allocation and release through drm managed framework.
>>>>>>>>>> Add return value to scheduler job TO handler (by Luben Tuikov) and use this in amdgpu for prevention
>>>>>>>>>> of GPU recovery post device unplug
>>>>>>>>>> Also rebase on top of drm-misc-mext instead of amd-staging-drm-next
>>>>>>>>>>
>>>>>>>>>> With these patches I am able to gracefully remove the secondary card using sysfs remove hook while glxgears
>>>>>>>>>> is running off of secondary card (DRI_PRIME=1) without kernel oopses or hangs and keep working
>>>>>>>>>> with the primary card or soft reset the device without hangs or oopses
>>>>>>>>>>
>>>>>>>>>> TODOs for followup work:
>>>>>>>>>> Convert AMDGPU code to use devm (for hw stuff) and drmm (for sw stuff and allocations) (Daniel)
>>>>>>>>>> Support plugging the secondary device back after unplug - currently still experiencing HW error on plugging back.
>>>>>>>>>> Add support for 'Requirements for KMS UAPI' section of [2] - unplugging primary, display connected card.
>>>>>>>>>>
>>>>>>>>>> [1] - Discussions during v3 of the patchset https://www.spinics.net/lists/amd-gfx/msg55576.html
>>>>>>>>>> [2] - drm/doc: device hot-unplug for userspace https://www.spinics.net/lists/dri-devel/msg259755.html
>>>>>>>>>> [3] - Related gitlab ticket https://gitlab.freedesktop.org/drm/amd/-/issues/1081
>>>>>>>>> btw have you tried this out with some of the igts we have? core_hotunplug
>>>>>>>>> is the one I'm thinking of. Might be worth extending this for amdgpu-
>>>>>>>>> specific stuff (like running some batches on it while hotunplugging).
>>>>>>>> No, I mostly just ran glxgears while testing, which already covers the
>>>>>>>> exported/imported dma-buf case, plus a few manually hacked tests in the
>>>>>>>> libdrm amdgpu test suite
>>>>>>>>
>>>>>>>>
>>>>>>>>> Since there's so many corner cases we need to test here (shared dma-buf,
>>>>>>>>> shared dma_fence) I think it would make sense to have a shared testcase
>>>>>>>>> across drivers.
>>>>>>>> I'm not too familiar with IGT; is there an easy way to set up shared dma-buf
>>>>>>>> and fence use cases there, or do you mean I need to add them now?
>>>>>>> We do have test infrastructure for all of that, but the hotunplug test
>>>>>>> doesn't have that yet I think.
>>>>>>>
>>>>>>>>> Only specific thing would be some hooks to keep the gpu
>>>>>>>>> busy in some fashion while we yank the driver.
>>>>>>>> Do you mean like starting X and some active rendering on top (like glxgears)
>>>>>>>> automatically from within IGT?
>>>>>>> Nope, igt is meant to be bare metal testing so you don't have to drag
>>>>>>> the entire winsys around (which in a wayland world, is not really good
>>>>>>> for driver testing anyway, since everything is different). We use this
>>>>>>> for our pre-merge ci for drm/i915.
>>>>>> So I keep it busy with X/glxgears, which is a manual operation. What you suggest
>>>>>> then is some client within IGT which opens the device and starts submitting jobs
>>>>>> (which is much like what the libdrm amdgpu tests already do)? And this
>>>>>> part is the amdgpu-specific code I just need to port from libdrm to here?
>>>>> Yup. For i915 tests we have an entire library already for small workloads,
>>>>> including some that just spin forever (useful for reset testing and could
>>>>> also come handy for unload testing).
>>>>> -Daniel
>>>> Does it mean I would have to drag in the entire infrastructure code from
>>>> within libdrm amdgpu code that allows for command submissions through
>>>> our IOCTLs ?
>>> No it's perfectly fine to use libdrm in igt tests, we do that too. I
>>> just mean we have some additional helpers to submit specific workloads
>>> for intel gpu, like rendercpy to move data with the 3d engine (just
>>> using copy engines only isn't good enough sometimes for testing), or
>>> the special hanging batchbuffers we use for reset testing, or in
>>> general for having precise control over race conditions and things
>>> like that.
>>>
>>> One thing that was somewhat annoying for i915 but shouldn't be a
>>> problem for amdgpu is that igt builds on intel. So we have stub
>>> functions for libdrm-intel, since libdrm-intel doesn't build on arm.
>>> Shouldn't be a problem for you.
>>> -Daniel
>>
>> Tested with the igt hot-unplug test. Passed unbind_rebind, unplug-rescan,
>> hot-unbind-rebind and hotunplug-rescan
>> when disabling the rescan part, as I don't support plug-back for now. Also added
>> command submission for amdgpu.
>> Attached a draft of submitting a workload while unbinding the driver or simulating
>> detach. Caught 2 issues with unplug if command submission is in flight during
>> unplug -
>> (an unsignaled fence causing a hang in amdgpu_cs_sync, and hitting a BUG_ON in
>> gfx_v9_0_ring_emit_patch_cond_exec, which is expected I guess).
>> I guess glxgears submits commands at a much slower rate so this was missed.
>> Is that what you meant for this test?
> Yup. Would be good if you can submit this one for inclusion.
> -Daniel


Will do, together with the exported dma-buf test once I write it.

P.S. How am I supposed to do the exported fence test? Exporting a fence from device
A, importing it into device B, unplugging
device A, then signaling the fence from device B - this is supposed to call a fence
cb which was registered
by the exporter, which by now is dead, and hence will cause a 'use after free'?

Andrey

>
>> Andrey
>>
>>
>>>
>>>> Andrey
>>>>
>>>>>> Andrey
>>>>>>
>>>>>>
>>>>>>>>> But just to get it started
>>>>>>>>> you can throw in entirely amdgpu specific subtests and just share some of
>>>>>>>>> the test code.
>>>>>>>>> -Daniel
>>>>>>>> In general, I wasn't aware of this test suite, and it looks like it does what I test
>>>>>>>> among other stuff.
>>>>>>>> I will definitely try to run with it, although the rescan part will not work as
>>>>>>>> plugging
>>>>>>>> the device back is on my TODO list and not part of the scope for this patchset,
>>>>>>>> and so I will
>>>>>>>> probably comment the re-scan section out while testing.
>>>>>>> amd gem has been using libdrm-amd thus far iirc, but for things like
>>>>>>> this I think it'd be worth to at least consider switching. Display
>>>>>>> team has already started to use some of the test and contribute stuff
>>>>>>> (I think the VRR testcase is from amd).
>>>>>>> -Daniel
>>>>>>>
>>>>>>>> Andrey
>>>>>>>>
>>>>>>>>
>>>>>>>>>> Andrey Grodzovsky (13):
>>>>>>>>>>        drm/ttm: Remap all page faults to per process dummy page.
>>>>>>>>>>        drm: Unamp the entire device address space on device unplug
>>>>>>>>>>        drm/ttm: Expose ttm_tt_unpopulate for driver use
>>>>>>>>>>        drm/sched: Cancel and flush all oustatdning jobs before finish.
>>>>>>>>>>        drm/amdgpu: Split amdgpu_device_fini into early and late
>>>>>>>>>>        drm/amdgpu: Add early fini callback
>>>>>>>>>>        drm/amdgpu: Register IOMMU topology notifier per device.
>>>>>>>>>>        drm/amdgpu: Fix a bunch of sdma code crash post device unplug
>>>>>>>>>>        drm/amdgpu: Remap all page faults to per process dummy page.
>>>>>>>>>>        dmr/amdgpu: Move some sysfs attrs creation to default_attr
>>>>>>>>>>        drm/amdgpu: Guard against write accesses after device removal
>>>>>>>>>>        drm/sched: Make timeout timer rearm conditional.
>>>>>>>>>>        drm/amdgpu: Prevent any job recoveries after device is unplugged.
>>>>>>>>>>
>>>>>>>>>> Luben Tuikov (1):
>>>>>>>>>>        drm/scheduler: Job timeout handler returns status
>>>>>>>>>>
>>>>>>>>>>       drivers/gpu/drm/amd/amdgpu/amdgpu.h               |  11 +-
>>>>>>>>>>       drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c      |  17 +--
>>>>>>>>>>       drivers/gpu/drm/amd/amdgpu/amdgpu_device.c        | 149 ++++++++++++++++++++--
>>>>>>>>>>       drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c           |  20 ++-
>>>>>>>>>>       drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c         |  15 ++-
>>>>>>>>>>       drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c          |   2 +-
>>>>>>>>>>       drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h          |   1 +
>>>>>>>>>>       drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c           |   9 ++
>>>>>>>>>>       drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c       |  25 ++--
>>>>>>>>>>       drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c           |  26 ++--
>>>>>>>>>>       drivers/gpu/drm/amd/amdgpu/amdgpu_irq.h           |   3 +-
>>>>>>>>>>       drivers/gpu/drm/amd/amdgpu/amdgpu_job.c           |  19 ++-
>>>>>>>>>>       drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c           |  12 +-
>>>>>>>>>>       drivers/gpu/drm/amd/amdgpu/amdgpu_object.c        |  10 ++
>>>>>>>>>>       drivers/gpu/drm/amd/amdgpu/amdgpu_object.h        |   2 +
>>>>>>>>>>       drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c           |  53 +++++---
>>>>>>>>>>       drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h           |   3 +
>>>>>>>>>>       drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c           |   1 +
>>>>>>>>>>       drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c          |  70 ++++++++++
>>>>>>>>>>       drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h          |  52 +-------
>>>>>>>>>>       drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c           |  21 ++-
>>>>>>>>>>       drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c            |   8 +-
>>>>>>>>>>       drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c      |  14 +-
>>>>>>>>>>       drivers/gpu/drm/amd/amdgpu/cik_ih.c               |   2 +-
>>>>>>>>>>       drivers/gpu/drm/amd/amdgpu/cz_ih.c                |   2 +-
>>>>>>>>>>       drivers/gpu/drm/amd/amdgpu/iceland_ih.c           |   2 +-
>>>>>>>>>>       drivers/gpu/drm/amd/amdgpu/navi10_ih.c            |   2 +-
>>>>>>>>>>       drivers/gpu/drm/amd/amdgpu/psp_v11_0.c            |  16 +--
>>>>>>>>>>       drivers/gpu/drm/amd/amdgpu/psp_v12_0.c            |   8 +-
>>>>>>>>>>       drivers/gpu/drm/amd/amdgpu/psp_v3_1.c             |   8 +-
>>>>>>>>>>       drivers/gpu/drm/amd/amdgpu/si_ih.c                |   2 +-
>>>>>>>>>>       drivers/gpu/drm/amd/amdgpu/tonga_ih.c             |   2 +-
>>>>>>>>>>       drivers/gpu/drm/amd/amdgpu/vega10_ih.c            |   2 +-
>>>>>>>>>>       drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c |  12 +-
>>>>>>>>>>       drivers/gpu/drm/amd/include/amd_shared.h          |   2 +
>>>>>>>>>>       drivers/gpu/drm/drm_drv.c                         |   3 +
>>>>>>>>>>       drivers/gpu/drm/etnaviv/etnaviv_sched.c           |  10 +-
>>>>>>>>>>       drivers/gpu/drm/lima/lima_sched.c                 |   4 +-
>>>>>>>>>>       drivers/gpu/drm/panfrost/panfrost_job.c           |   9 +-
>>>>>>>>>>       drivers/gpu/drm/scheduler/sched_main.c            |  18 ++-
>>>>>>>>>>       drivers/gpu/drm/ttm/ttm_bo_vm.c                   |  82 +++++++++++-
>>>>>>>>>>       drivers/gpu/drm/ttm/ttm_tt.c                      |   1 +
>>>>>>>>>>       drivers/gpu/drm/v3d/v3d_sched.c                   |  32 ++---
>>>>>>>>>>       include/drm/gpu_scheduler.h                       |  17 ++-
>>>>>>>>>>       include/drm/ttm/ttm_bo_api.h                      |   2 +
>>>>>>>>>>       45 files changed, 583 insertions(+), 198 deletions(-)
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> 2.7.4
>>>>>>>>>>
>>>
>
>
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 196+ messages in thread

* Re: [PATCH v4 00/14] RFC Support hot device unplug in amdgpu
@ 2021-02-09  4:01                     ` Andrey Grodzovsky
  0 siblings, 0 replies; 196+ messages in thread
From: Andrey Grodzovsky @ 2021-02-09  4:01 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Rob Herring, amd-gfx list, Christian König, dri-devel,
	Anholt, Eric, Pekka Paalanen, Qiang Yu, Greg KH, Alex Deucher,
	Wentland, Harry, Lucas Stach


On 2/8/21 2:27 AM, Daniel Vetter wrote:
> On Mon, Feb 8, 2021 at 6:59 AM Andrey Grodzovsky
> <Andrey.Grodzovsky@amd.com> wrote:
>>
>> On 1/20/21 10:59 AM, Daniel Vetter wrote:
>>> On Wed, Jan 20, 2021 at 3:20 PM Andrey Grodzovsky
>>> <Andrey.Grodzovsky@amd.com> wrote:
>>>> On 1/20/21 4:05 AM, Daniel Vetter wrote:
>>>>> On Tue, Jan 19, 2021 at 01:18:15PM -0500, Andrey Grodzovsky wrote:
>>>>>> On 1/19/21 1:08 PM, Daniel Vetter wrote:
>>>>>>> On Tue, Jan 19, 2021 at 6:31 PM Andrey Grodzovsky
>>>>>>> <Andrey.Grodzovsky@amd.com> wrote:
>>>>>>>> On 1/19/21 9:16 AM, Daniel Vetter wrote:
>>>>>>>>> On Mon, Jan 18, 2021 at 04:01:09PM -0500, Andrey Grodzovsky wrote:
>>>>>>>>>> Until now, extracting a card either by physical extraction (e.g. an eGPU with a
>>>>>>>>>> thunderbolt connection) or by emulation through sysfs -> /sys/bus/pci/devices/device_id/remove
>>>>>>>>>> would cause random crashes in user apps. The random crashes in apps were
>>>>>>>>>> mostly due to an app having mapped a device backed BO into its address
>>>>>>>>>> space and still trying to access the BO while the backing device was gone.
>>>>>>>>>> To answer this first problem Christian suggested fixing the handling of mapped
>>>>>>>>>> memory in the clients when the device goes away by forcibly unmapping all buffers the
>>>>>>>>>> user processes have, by clearing their respective VMAs mapping the device BOs.
>>>>>>>>>> Then when the VMAs try to fill in the page tables again we check in the fault
>>>>>>>>>> handler if the device is removed and if so, return an error. This will generate a
>>>>>>>>>> SIGBUS to the application, which can then cleanly terminate. This indeed was done,
>>>>>>>>>> but it in turn created a problem of kernel OOPSes, which were due to the
>>>>>>>>>> fact that while the app was terminating because of the SIGBUS it would trigger a use
>>>>>>>>>> after free in the driver by accessing device structures that were already
>>>>>>>>>> released from the pci remove sequence. This was handled by introducing a 'flush'
>>>>>>>>>> sequence during device removal where we wait for the drm file reference to drop to 0,
>>>>>>>>>> meaning all user clients directly using this device have terminated.
>>>>>>>>>>
>>>>>>>>>> v2:
>>>>>>>>>> Based on discussions in the mailing list with Daniel and Pekka [1], and based on the document
>>>>>>>>>> produced by Pekka from those discussions [2], the whole approach of returning SIGBUS and
>>>>>>>>>> waiting for all user clients having CPU mappings of device BOs to die was dropped.
>>>>>>>>>> Instead, as per the document's suggestion, the device structures are kept alive until
>>>>>>>>>> the last reference to the device is dropped by a user client, and in the meanwhile all existing and new CPU mappings of the BOs
>>>>>>>>>> belonging to the device, directly or by dma-buf import, are rerouted to a per user
>>>>>>>>>> process dummy rw page. Also, I skipped the 'Requirements for KMS UAPI' section of [2]
>>>>>>>>>> since I am trying to get the minimal set of requirements that still gives a useful solution
>>>>>>>>>> to work, and this is the 'Requirements for Render and Cross-Device UAPI' section, and so my
>>>>>>>>>> test case is removing a secondary device, which is render only and is not involved
>>>>>>>>>> in KMS.
>>>>>>>>>>
>>>>>>>>>> v3:
>>>>>>>>>> More updates following comments on v2, such as removing the loop to find the DRM file when rerouting
>>>>>>>>>> page faults to the dummy page, getting rid of unnecessary sysfs handling refactoring, and moving
>>>>>>>>>> prevention of GPU recovery post device unplug from amdgpu to the scheduler layer.
>>>>>>>>>> On top of that, added unplug support for IOMMU enabled systems.
>>>>>>>>>>
>>>>>>>>>> v4:
>>>>>>>>>> Drop the last sysfs hack and use a sysfs default attribute.
>>>>>>>>>> Guard against write accesses after device removal to avoid modifying released memory.
>>>>>>>>>> Update dummy pages handling to on demand allocation and release through the drm managed framework.
>>>>>>>>>> Add a return value to the scheduler job TO handler (by Luben Tuikov) and use this in amdgpu for prevention
>>>>>>>>>> of GPU recovery post device unplug.
>>>>>>>>>> Also rebase on top of drm-misc-next instead of amd-staging-drm-next.
>>>>>>>>>>
>>>>>>>>>> With these patches I am able to gracefully remove the secondary card using the sysfs remove hook while glxgears
>>>>>>>>>> is running off of the secondary card (DRI_PRIME=1), without kernel oopses or hangs, and keep working
>>>>>>>>>> with the primary card, or soft reset the device, without hangs or oopses.
>>>>>>>>>>
>>>>>>>>>> TODOs for followup work:
>>>>>>>>>> Convert AMDGPU code to use devm (for hw stuff) and drmm (for sw stuff and allocations) (Daniel)
>>>>>>>>>> Support plugging the secondary device back after unplug - currently still experiencing a HW error on plugging back.
>>>>>>>>>> Add support for the 'Requirements for KMS UAPI' section of [2] - unplugging a primary, display connected card.
>>>>>>>>>>
>>>>>>>>>> [1] - Discussions during v3 of the patchset https://www.spinics.net/lists/amd-gfx/msg55576.html
>>>>>>>>>> [2] - drm/doc: device hot-unplug for userspace https://www.spinics.net/lists/dri-devel/msg259755.html
>>>>>>>>>> [3] - Related gitlab ticket https://gitlab.freedesktop.org/drm/amd/-/issues/1081
>>>>>>>>> btw have you tried this out with some of the igts we have? core_hotunplug
>>>>>>>>> is the one I'm thinking of. Might be worth extending this for amdgpu
>>>>>>>>> specific stuff (like running some batches on it while hotunplugging).
>>>>>>>> No, I mostly used just running glxgears while testing, which already covers
>>>>>>>> the exported/imported dma-buf case, plus a few manually hacked tests in the
>>>>>>>> libdrm amdgpu test suite.
>>>>>>>>
>>>>>>>>
>>>>>>>>> Since there are so many corner cases we need to test here (shared dma-buf,
>>>>>>>>> shared dma_fence), I think it would make sense to have a shared testcase
>>>>>>>>> across drivers.
>>>>>>>> I'm not too familiar with IGT; is there an easy way to set up shared dma-buf
>>>>>>>> and fence
>>>>>>>> use cases there, or do you mean I need to add them now?
>>>>>>> We do have test infrastructure for all of that, but the hotunplug test
>>>>>>> doesn't have that yet I think.
>>>>>>>
>>>>>>>>> The only specific thing would be some hooks to keep the gpu
>>>>>>>>> busy in some fashion while we yank the driver.
>>>>>>>> Do you mean like starting X and some active rendering on top (like glxgears)
>>>>>>>> automatically from within IGT?
>>>>>>> Nope, igt is meant for bare metal testing, so you don't have to drag
>>>>>>> the entire winsys around (which in a wayland world, is not really good
>>>>>>> for driver testing anyway, since everything is different). We use this
>>>>>>> for our pre-merge ci for drm/i915.
>>>>>> So I keep it busy via X/glxgears, which is a manual operation. What you suggest
>>>>>> then is some client within IGT which opens the device and starts submitting jobs
>>>>>> (which is much like what the libdrm amdgpu tests already do)? And this
>>>>>> part is the amdgpu specific code I just need to port from libdrm to here?
>>>>> Yup. For i915 tests we have an entire library already for small workloads,
>>>>> including some that just spin forever (useful for reset testing and could
>>>>> also come in handy for unload testing).
>>>>> -Daniel
>>>> Does that mean I would have to drag in the entire infrastructure from
>>>> within the libdrm amdgpu code that allows for command submission through
>>>> our IOCTLs?
>>> No it's perfectly fine to use libdrm in igt tests, we do that too. I
>>> just mean we have some additional helpers to submit specific workloads
>>> for intel gpu, like rendercpy to move data with the 3d engine (just
>>> using copy engines only isn't good enough sometimes for testing), or
>>> the special hanging batchbuffers we use for reset testing, or in
>>> general for having precise control over race conditions and things
>>> like that.
>>>
>>> One thing that was somewhat annoying for i915 but shouldn't be a
>>> problem for amdgpu is that igt builds on intel. So we have stub
>>> functions for libdrm-intel, since libdrm-intel doesn't build on arm.
>>> Shouldn't be a problem for you.
>>> -Daniel
>>
>> Tested with the igt hot-unplug test. Passed unbind_rebind, unplug-rescan,
>> hot-unbind-rebind and hotunplug-rescan,
>> if disabling the rescan part, as I don't support plug-back for now. Also added
>> command submission for amdgpu.
>> Attached a draft of submitting a workload while unbinding the driver or simulating
>> detach. Caught 2 issues with unplug if command submission is in flight during
>> unplug -
>> an unsignaled fence causing a hang in amdgpu_cs_sync, and hitting a BUG_ON in
>> gfx_v9_0_ring_emit_patch_cond_exec (which is expected, I guess).
>> I guess glxgears' command submission is at a much slower rate, so this was missed.
>> Is that what you meant for this test?
> Yup. Would be good if you can submit this one for inclusion.
> -Daniel


Will do, together with the exported dma-buf test once I write it.
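
For reference, the shape of the draft's submit-while-unbind flow is roughly the
following (a sketch only - submit_nop_ib() stands in for the libdrm amdgpu
context/IB submission helpers, which are not shown, and the render node and
PCI BDF are just examples):

/* Sketch: keep the GPU busy with submissions while the device is
 * removed through sysfs. submit_nop_ib() is a placeholder for the
 * libdrm amdgpu setup + amdgpu_cs_submit() path.
 */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/wait.h>

int submit_nop_ib(int drm_fd);	/* placeholder, returns < 0 on failure */

static void unplug_device(const char *bdf)
{
	char path[128];
	int fd;

	snprintf(path, sizeof(path), "/sys/bus/pci/devices/%s/remove", bdf);
	fd = open(path, O_WRONLY);
	if (fd >= 0) {
		write(fd, "1", 1);	/* kicks off the pci remove sequence */
		close(fd);
	}
}

int main(void)
{
	int drm_fd = open("/dev/dri/renderD129", O_RDWR); /* secondary card */

	if (fork() == 0) {
		/* keep submitting until submissions start failing */
		while (submit_nop_ib(drm_fd) == 0)
			;
		_exit(0);
	}

	usleep(100000);			/* let submissions get in flight */
	unplug_device("0000:03:00.0");	/* example BDF of the card */
	wait(NULL);
	close(drm_fd);
	return 0;
}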

P.S. How am I supposed to do the exported fence test? Exporting a fence from device
A, importing it into device B, unplugging
device A, then signaling the fence from device B - this is supposed to call a fence
callback which was registered
by the exporter, which by now is dead, and hence will cause a use-after-free?
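
I.e., roughly this sequence (a minimal kernel-side sketch; exporter_cb() is
illustrative and is the part that would dereference freed device A state):

#include <linux/dma-fence.h>

static struct dma_fence_cb a_cb;	/* owned by the exporter, device A */

static void exporter_cb(struct dma_fence *f, struct dma_fence_cb *cb)
{
	/* dereferences device A structures - use-after-free once A is gone */
}

static void demo(struct dma_fence *shared_fence)
{
	if (dma_fence_add_callback(shared_fence, &a_cb, exporter_cb))
		return;	/* fence was already signaled */

	/* ... device A is unplugged here ... */

	dma_fence_signal(shared_fence);	/* device B signals -> exporter_cb() runs */
}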

Andrey

>
>> Andrey
>>
>>
>>>
>>>> Andrey
>>>>
>>>>>> Andrey
>>>>>>
>>>>>>
>>>>>>>>> But just to get it started
>>>>>>>>> you can throw in entirely amdgpu specific subtests and just share some of
>>>>>>>>> the test code.
>>>>>>>>> -Daniel
>>>>>>>> In general, I wasn't aware of this test suite, and it looks like it does what I test,
>>>>>>>> among other stuff.
>>>>>>>> I will definitely try to run with it, although the rescan part will not work, as
>>>>>>>> plugging
>>>>>>>> the device back is on my TODO list and not part of the scope for this patchset,
>>>>>>>> and so I will
>>>>>>>> probably comment the re-scan section out while testing.
>>>>>>> amd gem has been using libdrm-amd thus far iirc, but for things like
>>>>>>> this I think it'd be worth at least considering switching. The display
>>>>>>> team has already started to use some of the tests and contribute stuff
>>>>>>> (I think the VRR testcase is from amd).
>>>>>>> -Daniel
>>>>>>>
>>>>>>>> Andrey
>>>>>>>>
>>>>>>>>
>>>>>>>>>> Andrey Grodzovsky (13):
>>>>>>>>>>        drm/ttm: Remap all page faults to per process dummy page.
>>>>>>>>>>        drm: Unmap the entire device address space on device unplug
>>>>>>>>>>        drm/ttm: Expose ttm_tt_unpopulate for driver use
>>>>>>>>>>        drm/sched: Cancel and flush all outstanding jobs before finish.
>>>>>>>>>>        drm/amdgpu: Split amdgpu_device_fini into early and late
>>>>>>>>>>        drm/amdgpu: Add early fini callback
>>>>>>>>>>        drm/amdgpu: Register IOMMU topology notifier per device.
>>>>>>>>>>        drm/amdgpu: Fix a bunch of sdma code crash post device unplug
>>>>>>>>>>        drm/amdgpu: Remap all page faults to per process dummy page.
>>>>>>>>>>        drm/amdgpu: Move some sysfs attrs creation to default_attr
>>>>>>>>>>        drm/amdgpu: Guard against write accesses after device removal
>>>>>>>>>>        drm/sched: Make timeout timer rearm conditional.
>>>>>>>>>>        drm/amdgpu: Prevent any job recoveries after device is unplugged.
>>>>>>>>>>
>>>>>>>>>> Luben Tuikov (1):
>>>>>>>>>>        drm/scheduler: Job timeout handler returns status
>>>>>>>>>>
>>>>>>>>>>       drivers/gpu/drm/amd/amdgpu/amdgpu.h               |  11 +-
>>>>>>>>>>       drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c      |  17 +--
>>>>>>>>>>       drivers/gpu/drm/amd/amdgpu/amdgpu_device.c        | 149 ++++++++++++++++++++--
>>>>>>>>>>       drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c           |  20 ++-
>>>>>>>>>>       drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c         |  15 ++-
>>>>>>>>>>       drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c          |   2 +-
>>>>>>>>>>       drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h          |   1 +
>>>>>>>>>>       drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c           |   9 ++
>>>>>>>>>>       drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c       |  25 ++--
>>>>>>>>>>       drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c           |  26 ++--
>>>>>>>>>>       drivers/gpu/drm/amd/amdgpu/amdgpu_irq.h           |   3 +-
>>>>>>>>>>       drivers/gpu/drm/amd/amdgpu/amdgpu_job.c           |  19 ++-
>>>>>>>>>>       drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c           |  12 +-
>>>>>>>>>>       drivers/gpu/drm/amd/amdgpu/amdgpu_object.c        |  10 ++
>>>>>>>>>>       drivers/gpu/drm/amd/amdgpu/amdgpu_object.h        |   2 +
>>>>>>>>>>       drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c           |  53 +++++---
>>>>>>>>>>       drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h           |   3 +
>>>>>>>>>>       drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c           |   1 +
>>>>>>>>>>       drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c          |  70 ++++++++++
>>>>>>>>>>       drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h          |  52 +-------
>>>>>>>>>>       drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c           |  21 ++-
>>>>>>>>>>       drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c            |   8 +-
>>>>>>>>>>       drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c      |  14 +-
>>>>>>>>>>       drivers/gpu/drm/amd/amdgpu/cik_ih.c               |   2 +-
>>>>>>>>>>       drivers/gpu/drm/amd/amdgpu/cz_ih.c                |   2 +-
>>>>>>>>>>       drivers/gpu/drm/amd/amdgpu/iceland_ih.c           |   2 +-
>>>>>>>>>>       drivers/gpu/drm/amd/amdgpu/navi10_ih.c            |   2 +-
>>>>>>>>>>       drivers/gpu/drm/amd/amdgpu/psp_v11_0.c            |  16 +--
>>>>>>>>>>       drivers/gpu/drm/amd/amdgpu/psp_v12_0.c            |   8 +-
>>>>>>>>>>       drivers/gpu/drm/amd/amdgpu/psp_v3_1.c             |   8 +-
>>>>>>>>>>       drivers/gpu/drm/amd/amdgpu/si_ih.c                |   2 +-
>>>>>>>>>>       drivers/gpu/drm/amd/amdgpu/tonga_ih.c             |   2 +-
>>>>>>>>>>       drivers/gpu/drm/amd/amdgpu/vega10_ih.c            |   2 +-
>>>>>>>>>>       drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c |  12 +-
>>>>>>>>>>       drivers/gpu/drm/amd/include/amd_shared.h          |   2 +
>>>>>>>>>>       drivers/gpu/drm/drm_drv.c                         |   3 +
>>>>>>>>>>       drivers/gpu/drm/etnaviv/etnaviv_sched.c           |  10 +-
>>>>>>>>>>       drivers/gpu/drm/lima/lima_sched.c                 |   4 +-
>>>>>>>>>>       drivers/gpu/drm/panfrost/panfrost_job.c           |   9 +-
>>>>>>>>>>       drivers/gpu/drm/scheduler/sched_main.c            |  18 ++-
>>>>>>>>>>       drivers/gpu/drm/ttm/ttm_bo_vm.c                   |  82 +++++++++++-
>>>>>>>>>>       drivers/gpu/drm/ttm/ttm_tt.c                      |   1 +
>>>>>>>>>>       drivers/gpu/drm/v3d/v3d_sched.c                   |  32 ++---
>>>>>>>>>>       include/drm/gpu_scheduler.h                       |  17 ++-
>>>>>>>>>>       include/drm/ttm/ttm_bo_api.h                      |   2 +
>>>>>>>>>>       45 files changed, 583 insertions(+), 198 deletions(-)
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> 2.7.4
>>>>>>>>>>
>>>
>
>
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 196+ messages in thread

* Re: [PATCH v4 11/14] drm/amdgpu: Guard against write accesses after device removal
  2021-02-08 22:15                                           ` Andrey Grodzovsky
@ 2021-02-09  7:58                                             ` Christian König
  -1 siblings, 0 replies; 196+ messages in thread
From: Christian König @ 2021-02-09  7:58 UTC (permalink / raw)
  To: Andrey Grodzovsky, Daniel Vetter
  Cc: Greg KH, amd-gfx list, dri-devel, Alex Deucher, Qiang Yu


On 08.02.21 at 23:15, Andrey Grodzovsky wrote:
>
> On 2/8/21 11:23 AM, Daniel Vetter wrote:
>> On Mon, Feb 8, 2021 at 3:00 PM Christian König 
>> <christian.koenig@amd.com> wrote:
>>> On 08.02.21 at 11:11, Daniel Vetter wrote:
>>>> On Mon, Feb 08, 2021 at 11:03:15AM +0100, Christian König wrote:
>>>>> On 08.02.21 at 10:48, Daniel Vetter wrote:
>>>>>> On Mon, Feb 08, 2021 at 10:37:19AM +0100, Christian König wrote:
>>>>>>> On 07.02.21 at 22:50, Daniel Vetter wrote:
>>>>>>>> [SNIP]
>>>>>>>>> Clarification - as far as I know there are no page fault
>>>>>>>>> handlers for kernel
>>>>>>>>> mappings. And we are talking about kernel mappings here, right?
>>>>>>>>> If there were,
>>>>>>>>> I could solve all those issues the same as I do for user
>>>>>>>>> mappings, by
>>>>>>>>> invalidating all existing mappings in the kernel (both kmaps
>>>>>>>>> and ioremaps) and
>>>>>>>>> inserting a dummy zero or ~0 filled page instead.
>>>>>>>>> Also, I assume forcefully remapping the IO BAR to a ~0 filled
>>>>>>>>> page would involve
>>>>>>>>> the ioremap API, and it's not something that I think can be easily
>>>>>>>>> done, according to
>>>>>>>>> an answer I got on a related topic a few weeks ago:
>>>>>>>>> https://www.spinics.net/lists/linux-pci/msg103396.html
>>>>>>>>> (that was the only reply
>>>>>>>>> I got)
>>>>>>>> mmiotrace can, but only for debug, and only on x86 platforms:
>>>>>>>>
>>>>>>>> https://www.kernel.org/doc/html/latest/trace/mmiotrace.html
>>>>>>>>
>>>>>>>>
>>>>>>>> Should be feasible (but maybe not worth the effort) to extend 
>>>>>>>> this to
>>>>>>>> support fake unplug.
>>>>>>> Mhm, interesting idea you guys brought up here.
>>>>>>>
>>>>>>> We don't need a page fault for this to work; all we need to do
>>>>>>> is to insert
>>>>>>> dummy PTEs into the kernel's page tables at the place where
>>>>>>> previously the
>>>>>>> MMIO mapping has been.
>>>>>> A simple pte trick isn't enough, because we need:
>>>>>> - drop all writes silently
>>>>>> - all reads return 0xff
>>>>>>
>>>>>> ptes can't do that themselves, we minimally need write protection 
>>>>>> and then
>>>>>> silently proceed on each write fault without restarting the 
>>>>>> instruction.
>>>>>> Better would be to only catch reads, but x86 doesn't do 
>>>>>> write-only pte
>>>>>> permissions afaik.
>>>>> You are not thinking far enough :)
>>>>>
>>>>> The dummy PTE points to a dummy MMIO page which is just never used.
>>>>>
>>>>> That has the exact same properties as our removed MMIO space,
>>>>> it just doesn't
>>>>> go bananas when a new device is MMIO mapped into it and our
>>>>> driver still
>>>>> tries to write there.
>>>> Hm, but where do we get such a "guaranteed never used" mmio page from?
>>> Well we have tons of unused IO space on 64bit systems these days.
>>>
>>> It doesn't really need to be PCIe address space, does it?
>> That sounds very trusting to modern systems not decoding random
>> ranges. E.g. the pci code stopped extending the host bridge windows on
>> its own, entirely relying on the acpi provided ranges, to avoid
>> stomping on stuff that's there but not listed anywhere.
>>
>> I guess if we have a range behind a pci bridge, which isn't used by
>> any device, but decoded by the bridge, then that should be safe
>> enough. Maybe we could even have an option in upstream to do that on
>> unplug, if a certain flag is set, or a cmdline option.
>> -Daniel
>
>
> Question - Why can't we just set those PTEs to point to system memory 
> (another RO dummy page)
> filled with 1s ?


Then writes are not discarded. E.g. the 1s would change to something else.
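
For reference, the driver-side guard under discussion is roughly of this
shape - a minimal sketch using drm_dev_enter()/drm_dev_exit(), where the
wrapper names are illustrative, not the actual patch:

/* Minimal sketch of guarded MMIO accessors; wrapper names are
 * illustrative, drm_dev_enter()/drm_dev_exit() are the real API.
 */
static void wreg_guarded(struct amdgpu_device *adev, u32 reg, u32 v)
{
	int idx;

	if (!drm_dev_enter(adev_to_drm(adev), &idx))
		return;		/* device gone: drop the write silently */

	writel(v, adev->rmmio + (reg << 2));
	drm_dev_exit(idx);
}

static u32 rreg_guarded(struct amdgpu_device *adev, u32 reg)
{
	u32 val = ~0;	/* what a removed PCI device would return anyway */
	int idx;

	if (drm_dev_enter(adev_to_drm(adev), &idx)) {
		val = readl(adev->rmmio + (reg << 2));
		drm_dev_exit(idx);
	}
	return val;
}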

Christian.


>
> Andrey
>
>
>>
>>> Christian.
>>>
>>>> It's a nifty idea indeed otherwise ...
>>>> -Daniel
>>>>
>>>>> Regards,
>>>>> Christian.
>>>>>
>>>>>
>>>>>>>>>> But ugh ...
>>>>>>>>>>
>>>>>>>>>> Otoh validating an entire driver like amdgpu without such a 
>>>>>>>>>> trick
>>>>>>>>>> against 0xff reads is practically impossible. So maybe you 
>>>>>>>>>> need to add
>>>>>>>>>> this as one of the tasks here?
>>>>>>>>> Or I could, just for validation purposes, return ~0 from all reg
>>>>>>>>> reads in the code
>>>>>>>>> and ignore writes if drm_dev_unplugged; this could already
>>>>>>>>> easily validate a big
>>>>>>>>> portion of the code flow under such a scenario.
>>>>>>>> Hm yeah, if you really wrap them all, that should work too. Since
>>>>>>>> io mappings have __iomem pointer type, as long as amdgpu is sparse
>>>>>>>> warning free, it should be doable to guarantee this.
>>>>>>> Problem is that ~0 is not always a valid register value.
>>>>>>>
>>>>>>> You would need to audit every register read to ensure it doesn't use
>>>>>>> the returned
>>>>>>> value blindly as an index or similar. That is quite a bit of work.
>>>>>> Yeah that's the entire crux here :-/
>>>>>> -Daniel
>>
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 196+ messages in thread

* Re: [PATCH v4 11/14] drm/amdgpu: Guard against write accesses after device removal
  2021-02-08 22:09                                 ` Andrey Grodzovsky
@ 2021-02-09  8:27                                   ` Christian König
  -1 siblings, 0 replies; 196+ messages in thread
From: Christian König @ 2021-02-09  8:27 UTC (permalink / raw)
  To: Andrey Grodzovsky, christian.koenig, Daniel Vetter
  Cc: Greg KH, dri-devel, amd-gfx list, Alex Deucher, Qiang Yu



On 08.02.21 at 23:09, Andrey Grodzovsky wrote:
>
>
> On 2/8/21 4:37 AM, Christian König wrote:
>> On 07.02.21 at 22:50, Daniel Vetter wrote:
>>> [SNIP]
>>>> Clarification - as far as I know there are no page fault handlers
>>>> for kernel
>>>> mappings. And we are talking about kernel mappings here, right?
>>>> If there were,
>>>> I could solve all those issues the same as I do for user mappings, by
>>>> invalidating all existing mappings in the kernel (both kmaps and
>>>> ioremaps) and
>>>> inserting a dummy zero or ~0 filled page instead.
>>>> Also, I assume forcefully remapping the IO BAR to a ~0 filled page
>>>> would involve
>>>> the ioremap API, and it's not something that I think can be easily done,
>>>> according to
>>>> an answer I got on a related topic a few weeks ago:
>>>> https://www.spinics.net/lists/linux-pci/msg103396.html
>>>> (that was the only reply
>>>> I got)
>>> mmiotrace can, but only for debug, and only on x86 platforms:
>>>
>>> https://www.kernel.org/doc/html/latest/trace/mmiotrace.html
>>>
>>>
>>> Should be feasible (but maybe not worth the effort) to extend this to
>>> support fake unplug.
>>
>> Mhm, interesting idea you guys brought up here.
>>
>> We don't need a page fault for this to work; all we need to do is to
>> insert dummy PTEs into the kernel's page tables at the place where
>> previously the MMIO mapping has been.
>
>
> But that's exactly what Matthew from linux-mm says is not a trivial thing
> to do, quote:
>
> "
>
> ioremap() is done through the vmalloc space.  It would, in theory, be
> possible to reprogram the page tables used for vmalloc to point to your
> magic page.  I don't think we have such a mechanism today, and there are
> lots of problems with things like TLB flushes.  It's probably going to
> be harder than you think.
> "

I haven't followed the full discussion, but I don't see much preventing 
this.

All you need is a new ioremap_dummy() function which takes the old start 
and length of the mapping.

Still a bit of core and maybe even platform code, but rather useful I think.
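
Something like this minimal sketch - ioremap_dummy() itself and the
PAGE_KERNEL_IO/dummy page choice are hypothetical (the function does not
exist upstream), while apply_to_page_range() and flush_tlb_kernel_range()
are existing kernel helpers, and imperfect TLB flushing is tolerated as
discussed above:

/* Hypothetical sketch - redirects an existing ioremap range to one
 * dummy MMIO page by rewriting the vmalloc-space PTEs in place.
 */
#include <linux/mm.h>
#include <asm/tlbflush.h>

static int set_dummy_pte(pte_t *pte, unsigned long addr, void *data)
{
	phys_addr_t dummy = *(phys_addr_t *)data;

	set_pte_at(&init_mm, addr, pte,
		   pfn_pte(dummy >> PAGE_SHIFT, PAGE_KERNEL_IO));
	return 0;
}

void ioremap_dummy(void __iomem *start, size_t size, phys_addr_t dummy_page)
{
	unsigned long addr = (unsigned long)start;

	/* rewrite the kernel PTEs of the old MMIO mapping in place */
	apply_to_page_range(&init_mm, addr, size, set_dummy_pte, &dummy_page);
	flush_tlb_kernel_range(addr, addr + size);
}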

Christian.

>
> If you believe it's actually doable, then it would be useful not only for simulating the device
> unplugged situation with all MMIOs returning 0xff... but also for actual handling of driver accesses
> to MMIO after the device is gone, and we could then drop this patch entirely, as there would be no need
> to guard against such accesses post device unplug.


>
>   
>>
>>>>> But ugh ...
>>>>>
>>>>> Otoh validating an entire driver like amdgpu without such a trick
>>>>> against 0xff reads is practically impossible. So maybe you need to 
>>>>> add
>>>>> this as one of the tasks here?
>>>> Or I could, just for validation purposes, return ~0 from all reg
>>>> reads in the code
>>>> and ignore writes if drm_dev_unplugged; this could already easily
>>>> validate a big
>>>> portion of the code flow under such a scenario.
>>> Hm yeah, if you really wrap them all, that should work too. Since
>>> io mappings have __iomem pointer type, as long as amdgpu is sparse
>>> warning free, it should be doable to guarantee this.
>>
>> Problem is that ~0 is not always a valid register value.
>>
>> You would need to audit every register read to ensure it doesn't use the
>> returned value blindly as an index or similar. That is quite a bit of work.
>
>
> But ~0 is the value that will be returned for every read post device
> unplug, regardless of whether it's valid or not, and we have to cope with
> it then, no?
>
> Andrey
>
>
>>
>> Regards,
>> Christian.
>>
>>> -Daniel
>>>
>>>> Andrey
>>>>
>>
>
> _______________________________________________
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx



_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 196+ messages in thread

* Re: [PATCH v4 11/14] drm/amdgpu: Guard against write accesses after device removal
  2021-02-09  8:27                                   ` Christian König
@ 2021-02-09  9:46                                     ` Daniel Vetter
  -1 siblings, 0 replies; 196+ messages in thread
From: Daniel Vetter @ 2021-02-09  9:46 UTC (permalink / raw)
  To: Christian König
  Cc: Greg KH, amd-gfx list, dri-devel, Daniel Vetter, Alex Deucher,
	Qiang Yu, christian.koenig

On Tue, Feb 09, 2021 at 09:27:03AM +0100, Christian König wrote:
> 
> 
> On 08.02.21 at 23:09, Andrey Grodzovsky wrote:
> > 
> > 
> > On 2/8/21 4:37 AM, Christian König wrote:
> > > On 07.02.21 at 22:50, Daniel Vetter wrote:
> > > > [SNIP]
> > > > > Clarification - as far as I know there are no page fault
> > > > > handlers for kernel
> > > > > mappings. And we are talking about kernel mappings here,
> > > > > right? If there were,
> > > > > I could solve all those issues the same as I do for user mappings, by
> > > > > invalidating all existing mappings in the kernel (both kmaps
> > > > > and ioremaps) and
> > > > > inserting a dummy zero or ~0 filled page instead.
> > > > > Also, I assume forcefully remapping the IO BAR to a ~0 filled
> > > > > page would involve
> > > > > the ioremap API, and it's not something that I think can be
> > > > > easily done, according to
> > > > > an answer I got on a related topic a few weeks ago:
> > > > > https://www.spinics.net/lists/linux-pci/msg103396.html
> > > > > (that was the only reply
> > > > > I got)
> > > > mmiotrace can, but only for debug, and only on x86 platforms:
> > > > 
> > > > https://www.kernel.org/doc/html/latest/trace/mmiotrace.html
> > > > 
> > > > 
> > > > Should be feasible (but maybe not worth the effort) to extend this to
> > > > support fake unplug.
> > > 
> > > Mhm, interesting idea you guys brought up here.
> > > 
> > > We don't need a page fault for this to work; all we need to do is to
> > > insert dummy PTEs into the kernel's page tables at the place where
> > > previously the MMIO mapping has been.
> > 
> > 
> > But that's exactly what Matthew from linux-mm says is not a trivial thing
> > to do, quote:
> > 
> > "
> > 
> > ioremap() is done through the vmalloc space.  It would, in theory, be
> > possible to reprogram the page tables used for vmalloc to point to your
> > magic page.  I don't think we have such a mechanism today, and there are
> > lots of problems with things like TLB flushes.  It's probably going to
> > be harder than you think.
> > "
> 
> I haven't followed the full discussion, but I don't see much preventing
> this.
> 
> All you need is a new ioremap_dummy() function which takes the old start and
> length of the mapping.
> 
> Still a bit of core and maybe even platform code, but rather useful I think.

Yeah we don't care about races, so if the tlb flushing isn't perfect
that's fine.

Also if we glue this into the mmiotrace infrastructure, that already has
all the fault handling. So on x86 I think we could even make it perfect
(but that feels like overkill) and fully atomic. Plus the mmiotrace
overhead (even if we don't capture anything) is probably a bit much even
for testing in CI or somewhere like that.
-Daniel

> 
> Christian.
> 
> > 
> > If you believe it's actually doable, then it would be useful not only for simulating the device
> > unplugged situation with all MMIOs returning 0xff... but also for actual handling of driver accesses
> > to MMIO after the device is gone, and we could then drop this patch entirely, as there would be no need
> > to guard against such accesses post device unplug.
> 
> 
> > 
> > > 
> > > > > > But ugh ...
> > > > > > 
> > > > > > Otoh validating an entire driver like amdgpu without such a trick
> > > > > > against 0xff reads is practically impossible. So maybe
> > > > > > you need to add
> > > > > > this as one of the tasks here?
> > > > > Or I could, just for validation purposes, return ~0 from all
> > > > > reg reads in the code
> > > > > and ignore writes if drm_dev_unplugged; this could already
> > > > > easily validate a big
> > > > > portion of the code flow under such a scenario.
> > > > Hm yeah, if you really wrap them all, that should work too. Since
> > > > io mappings have __iomem pointer type, as long as amdgpu is sparse
> > > > warning free, it should be doable to guarantee this.
> > > 
> > > Problem is that ~0 is not always a valid register value.
> > > 
> > > You would need to audit every register read to ensure it doesn't use the
> > > returned value blindly as an index or similar. That is quite a bit of
> > > work.
> > 
> > 
> > But ~0 is the value that will be returned for every read post device
> > unplug, regardless of whether it's valid or not, and we have to cope with
> > it then, no?
> > 
> > Andrey
> > 
> > 
> > > 
> > > Regards,
> > > Christian.
> > > 
> > > > -Daniel
> > > > 
> > > > > Andrey
> > > > > 
> > > 
> > 
> > _______________________________________________
> > amd-gfx mailing list
> > amd-gfx@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/amd-gfx
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 196+ messages in thread

* Re: [PATCH v4 00/14] RFC Support hot device unplug in amdgpu
  2021-02-09  4:01                     ` Andrey Grodzovsky
@ 2021-02-09  9:50                       ` Daniel Vetter
  -1 siblings, 0 replies; 196+ messages in thread
From: Daniel Vetter @ 2021-02-09  9:50 UTC (permalink / raw)
  To: Andrey Grodzovsky
  Cc: Christian König, dri-devel, amd-gfx list, Greg KH,
	Alex Deucher, Qiang Yu

On Mon, Feb 08, 2021 at 11:01:14PM -0500, Andrey Grodzovsky wrote:
> 
> On 2/8/21 2:27 AM, Daniel Vetter wrote:
> > On Mon, Feb 8, 2021 at 6:59 AM Andrey Grodzovsky
> > <Andrey.Grodzovsky@amd.com> wrote:
> > > 
> > > On 1/20/21 10:59 AM, Daniel Vetter wrote:
> > > > On Wed, Jan 20, 2021 at 3:20 PM Andrey Grodzovsky
> > > > <Andrey.Grodzovsky@amd.com> wrote:
> > > > > On 1/20/21 4:05 AM, Daniel Vetter wrote:
> > > > > > On Tue, Jan 19, 2021 at 01:18:15PM -0500, Andrey Grodzovsky wrote:
> > > > > > > On 1/19/21 1:08 PM, Daniel Vetter wrote:
> > > > > > > > On Tue, Jan 19, 2021 at 6:31 PM Andrey Grodzovsky
> > > > > > > > <Andrey.Grodzovsky@amd.com> wrote:
> > > > > > > > > On 1/19/21 9:16 AM, Daniel Vetter wrote:
> > > > > > > > > > On Mon, Jan 18, 2021 at 04:01:09PM -0500, Andrey Grodzovsky wrote:
> > > > > > > > > > > Until now extracting a card either by physical extraction (e.g. eGPU with
> > > > > > > > > > > thunderbolt connection or by emulation through  syfs -> /sys/bus/pci/devices/device_id/remove)
> > > > > > > > > > > would cause random crashes in user apps. The random crashes in apps were
> > > > > > > > > > > mostly due to the app having mapped a device backed BO into its address
> > > > > > > > > > > space was still trying to access the BO while the backing device was gone.
> > > > > > > > > > > To answer this first problem Christian suggested to fix the handling of mapped
> > > > > > > > > > > memory in the clients when the device goes away by forcibly unmap all buffers the
> > > > > > > > > > > user processes has by clearing their respective VMAs mapping the device BOs.
> > > > > > > > > > > Then when the VMAs try to fill in the page tables again we check in the fault
> > > > > > > > > > > handlerif the device is removed and if so, return an error. This will generate a
> > > > > > > > > > > SIGBUS to the application which can then cleanly terminate.This indeed was done
> > > > > > > > > > > but this in turn created a problem of kernel OOPs were the OOPSes were due to the
> > > > > > > > > > > fact that while the app was terminating because of the SIGBUSit would trigger use
> > > > > > > > > > > after free in the driver by calling to accesses device structures that were already
> > > > > > > > > > > released from the pci remove sequence.This was handled by introducing a 'flush'
> > > > > > > > > > > sequence during device removal were we wait for drm file reference to drop to 0
> > > > > > > > > > > meaning all user clients directly using this device terminated.
> > > > > > > > > > > 
> > > > > > > > > > > v2:
> > > > > > > > > > > Based on discussions in the mailing list with Daniel and Pekka [1] and based on the document
> > > > > > > > > > > produced by Pekka from those discussions [2] the whole approach with returning SIGBUS and
> > > > > > > > > > > waiting for all user clients having CPU mapping of device BOs to die was dropped.
> > > > > > > > > > > Instead as per the document suggestion the device structures are kept alive until
> > > > > > > > > > > the last reference to the device is dropped by user client and in the meanwhile all existing and new CPU mappings of the BOs
> > > > > > > > > > > belonging to the device directly or by dma-buf import are rerouted to per user
> > > > > > > > > > > process dummy rw page.Also, I skipped the 'Requirements for KMS UAPI' section of [2]
> > > > > > > > > > > since i am trying to get the minimal set of requirements that still give useful solution
> > > > > > > > > > > to work and this is the'Requirements for Render and Cross-Device UAPI' section and so my
> > > > > > > > > > > test case is removing a secondary device, which is render only and is not involved
> > > > > > > > > > > in KMS.
> > > > > > > > > > > 
> > > > > > > > > > > v3:
> > > > > > > > > > > More updates following comments from v2, such as removing the loop to find the DRM file when rerouting
> > > > > > > > > > > page faults to the dummy page, getting rid of unnecessary sysfs handling refactoring, and moving
> > > > > > > > > > > prevention of GPU recovery post device unplug from amdgpu to the scheduler layer.
> > > > > > > > > > > On top of that, added unplug support for IOMMU enabled systems.
> > > > > > > > > > > 
> > > > > > > > > > > v4:
> > > > > > > > > > > Drop the last sysfs hack and use a sysfs default attribute.
> > > > > > > > > > > Guard against write accesses after device removal to avoid modifying released memory.
> > > > > > > > > > > Update dummy pages handling to on demand allocation and release through the drm managed framework.
> > > > > > > > > > > Add a return value to the scheduler job TO handler (by Luben Tuikov) and use it in amdgpu to prevent
> > > > > > > > > > > GPU recovery post device unplug.
> > > > > > > > > > > Also rebase on top of drm-misc-next instead of amd-staging-drm-next.
> > > > > > > > > > > 
> > > > > > > > > > > With these patches I am able to gracefully remove the secondary card using the sysfs remove hook while glxgears
> > > > > > > > > > > is running off of the secondary card (DRI_PRIME=1), without kernel oopses or hangs, and keep working
> > > > > > > > > > > with the primary card, or soft reset the device without hangs or oopses.
> > > > > > > > > > > 
> > > > > > > > > > > TODOs for followup work:
> > > > > > > > > > > Convert AMDGPU code to use devm (for hw stuff) and drmm (for sw stuff and allocations) (Daniel).
> > > > > > > > > > > Support plugging the secondary device back in after unplug - currently still experiencing a HW error on plug-back.
> > > > > > > > > > > Add support for the 'Requirements for KMS UAPI' section of [2] - unplugging the primary, display connected card.
> > > > > > > > > > > 
> > > > > > > > > > > [1] - Discussions during v3 of the patchset https://www.spinics.net/lists/amd-gfx/msg55576.html
> > > > > > > > > > > [2] - drm/doc: device hot-unplug for userspace https://www.spinics.net/lists/dri-devel/msg259755.html
> > > > > > > > > > > [3] - Related gitlab ticket https://gitlab.freedesktop.org/drm/amd/-/issues/1081
> > > > > > > > > > btw have you tried this out with some of the igts we have? core_hotunplug
> > > > > > > > > > is the one I'm thinking of. Might be worth extending this for amdgpu
> > > > > > > > > > specific stuff (like running some batches on it while hotunplugging).
> > > > > > > > > No, I mostly just used running glxgears while testing, which already covers the
> > > > > > > > > exported/imported dma-buf case, plus a few manually hacked tests in the libdrm
> > > > > > > > > amdgpu test suite.
> > > > > > > > > 
> > > > > > > > > 
> > > > > > > > > > Since there are so many corner cases we need to test here (shared dma-buf,
> > > > > > > > > > shared dma_fence) I think it would make sense to have a shared testcase
> > > > > > > > > > across drivers.
> > > > > > > > > I'm not too familiar with IGT; is there an easy way to set up shared dma-buf
> > > > > > > > > and fence use cases there, or do you mean I need to add them now?
> > > > > > > > We do have test infrastructure for all of that, but the hotunplug test
> > > > > > > > doesn't have that yet I think.
> > > > > > > > 
> > > > > > > > > > Only specific thing would be some hooks to keep the gpu
> > > > > > > > > > busy in some fashion while we yank the driver.
> > > > > > > > > Do you mean like starting X and some active rendering on top (like glxgears)
> > > > > > > > > automatically from within IGT?
> > > > > > > > Nope, igt is meant for bare metal testing, so you don't have to drag
> > > > > > > > the entire winsys around (which in a wayland world is not really good
> > > > > > > > for driver testing anyway, since everything is different). We use this
> > > > > > > > for our pre-merge ci for drm/i915.
> > > > > > > So I keep it busy via X/glxgears, which is a manual operation. What you suggest
> > > > > > > then is some client within IGT which opens the device and starts submitting jobs
> > > > > > > (which is much like what the libdrm amdgpu tests already do)? And this
> > > > > > > part is the amdgpu specific code I just need to port from libdrm to here?
> > > > > > Yup. For i915 tests we have an entire library already for small workloads,
> > > > > > including some that just spin forever (useful for reset testing and could
> > > > > > also come in handy for unload testing).
> > > > > > -Daniel
> > > > > Does it mean I would have to drag in the entire infrastructure code from
> > > > > within the libdrm amdgpu code that allows for command submission through
> > > > > our IOCTLs?
> > > > No, it's perfectly fine to use libdrm in igt tests, we do that too. I
> > > > just mean we have some additional helpers to submit specific workloads
> > > > for intel gpus, like rendercopy to move data with the 3d engine (just
> > > > using the copy engines only isn't good enough sometimes for testing), or
> > > > the special hanging batchbuffers we use for reset testing, or in
> > > > general for having precise control over race conditions and things
> > > > like that.
> > > > 
> > > > One thing that was somewhat annoying for i915 but shouldn't be a
> > > > problem for amdgpu is that igt builds on non-intel platforms too. So we have stub
> > > > functions for libdrm-intel, since libdrm-intel doesn't build on arm.
> > > > Shouldn't be a problem for you.
> > > > -Daniel
> > > 
> > > Tested with the igt hot-unplug tests. Passed unbind_rebind, unplug-rescan,
> > > hot-unbind-rebind and hotunplug-rescan
> > > if disabling the rescan part, as I don't support plug-back for now. Also added
> > > command submission for amdgpu.
> > > Attached a draft of submitting a workload while unbinding the driver or simulating
> > > detach. Caught 2 issues with unplug if command submission is in flight during
> > > unplug -
> > > an unsignaled fence causing a hang in amdgpu_cs_sync, and hitting a BUG_ON in
> > > gfx_v9_0_ring_emit_patch_cond_exec (which is expected, I guess).
> > > Guess glxgears submits commands at a much slower rate, so this was missed.
> > > Is that what you meant for this test?
> > Yup. Would be good if you can submit this one for inclusion.
> > -Daniel
> 
> 
> Will do, together with the exported dma-buf test, once I do it.
> 
> P.S. How am I supposed to do the exported fence test? Exporting a fence from
> device A, importing it into device B, unplugging
> device A, then signaling the fence from device B - this is supposed to call a
> fence cb which was registered
> by the exporter, which by now is dead, and hence will cause a 'use after free'?

Yeah, in the end we'd need 2 hw devices for testing full fence
functionality. A useful intermediate step would be to just export the
fence (either as a sync_file, which I think amdgpu doesn't support because
there's no android egl support in mesa, or as a drm_syncobj, which you can
do as a standalone fd too iirc), and then just use the fence a bit from
userspace (like wait on it or get its status) after the device is
unplugged.

I think this should cover most of the cross-driver issues that fences
bring in, and I think we can worry about the other problems once we spot them.
-Daniel
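
A minimal userspace sketch of that intermediate step, using the libdrm
syncobj API (error handling trimmed; the amdgpu CS call that actually
attaches a fence to the syncobj is driver specific and only hinted at in a
comment):

#include <stdint.h>
#include <xf86drm.h>

static int syncobj_after_unplug(int drm_fd)
{
	uint32_t handle;
	int sobj_fd, ret;

	ret = drmSyncobjCreate(drm_fd, 0, &handle);
	if (ret)
		return ret;

	/* ... submit a job that signals this syncobj (amdgpu CS, not shown) ... */

	/* export as a standalone fd, as Daniel suggests */
	ret = drmSyncobjHandleToFD(drm_fd, handle, &sobj_fd);
	if (ret)
		return ret;

	/* the caller unplugs the device here:
	 * echo 1 > /sys/bus/pci/devices/<bdf>/remove */

	/* using the fence from userspace must degrade gracefully, not oops */
	return drmSyncobjWait(drm_fd, &handle, 1, 0 /* timeout_nsec */,
			      DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT, NULL);
}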

> 
> Andrey
> 
> > 
> > > Andrey
> > > 
> > > 
> > > > 
> > > > > Andrey
> > > > > 
> > > > > > > Andrey
> > > > > > > 
> > > > > > > 
> > > > > > > > > > But just to get it started
> > > > > > > > > > you can throw in entirely amdgpu specific subtests and just share some of
> > > > > > > > > > the test code.
> > > > > > > > > > -Daniel
> > > > > > > > > In general, I wasn't aware of this test suite, and it looks like it does what I
> > > > > > > > > test, among other stuff.
> > > > > > > > > I will definitely try to run with it, although the rescan part will not work,
> > > > > > > > > as plugging the device back is on my TODO list and not part of the scope for
> > > > > > > > > this patchset, and so I will probably comment the re-scan section out while
> > > > > > > > > testing.
> > > > > > > > amd gem has been using libdrm-amd thus far iirc, but for things like
> > > > > > > > this I think it'd be worth at least considering switching. The display
> > > > > > > > team has already started to use some of the tests and contribute stuff
> > > > > > > > (I think the VRR testcase is from amd).
> > > > > > > > -Daniel
> > > > > > > > 
> > > > > > > > > Andrey
> > > > > > > > > 
> > > > > > > > > 
> > > > > > > > > > > Andrey Grodzovsky (13):
> > > > > > > > > > >        drm/ttm: Remap all page faults to per process dummy page.
> > > > > > > > > > >        drm: Unmap the entire device address space on device unplug
> > > > > > > > > > >        drm/ttm: Expose ttm_tt_unpopulate for driver use
> > > > > > > > > > >        drm/sched: Cancel and flush all outstanding jobs before finish.
> > > > > > > > > > >        drm/amdgpu: Split amdgpu_device_fini into early and late
> > > > > > > > > > >        drm/amdgpu: Add early fini callback
> > > > > > > > > > >        drm/amdgpu: Register IOMMU topology notifier per device.
> > > > > > > > > > >        drm/amdgpu: Fix a bunch of sdma code crash post device unplug
> > > > > > > > > > >        drm/amdgpu: Remap all page faults to per process dummy page.
> > > > > > > > > > >        drm/amdgpu: Move some sysfs attrs creation to default_attr
> > > > > > > > > > >        drm/amdgpu: Guard against write accesses after device removal
> > > > > > > > > > >        drm/sched: Make timeout timer rearm conditional.
> > > > > > > > > > >        drm/amdgpu: Prevent any job recoveries after device is unplugged.
> > > > > > > > > > > 
> > > > > > > > > > > Luben Tuikov (1):
> > > > > > > > > > >        drm/scheduler: Job timeout handler returns status
> > > > > > > > > > > 
> > > > > > > > > > >       drivers/gpu/drm/amd/amdgpu/amdgpu.h               |  11 +-
> > > > > > > > > > >       drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c      |  17 +--
> > > > > > > > > > >       drivers/gpu/drm/amd/amdgpu/amdgpu_device.c        | 149 ++++++++++++++++++++--
> > > > > > > > > > >       drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c           |  20 ++-
> > > > > > > > > > >       drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c         |  15 ++-
> > > > > > > > > > >       drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c          |   2 +-
> > > > > > > > > > >       drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h          |   1 +
> > > > > > > > > > >       drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c           |   9 ++
> > > > > > > > > > >       drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c       |  25 ++--
> > > > > > > > > > >       drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c           |  26 ++--
> > > > > > > > > > >       drivers/gpu/drm/amd/amdgpu/amdgpu_irq.h           |   3 +-
> > > > > > > > > > >       drivers/gpu/drm/amd/amdgpu/amdgpu_job.c           |  19 ++-
> > > > > > > > > > >       drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c           |  12 +-
> > > > > > > > > > >       drivers/gpu/drm/amd/amdgpu/amdgpu_object.c        |  10 ++
> > > > > > > > > > >       drivers/gpu/drm/amd/amdgpu/amdgpu_object.h        |   2 +
> > > > > > > > > > >       drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c           |  53 +++++---
> > > > > > > > > > >       drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h           |   3 +
> > > > > > > > > > >       drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c           |   1 +
> > > > > > > > > > >       drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c          |  70 ++++++++++
> > > > > > > > > > >       drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h          |  52 +-------
> > > > > > > > > > >       drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c           |  21 ++-
> > > > > > > > > > >       drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c            |   8 +-
> > > > > > > > > > >       drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c      |  14 +-
> > > > > > > > > > >       drivers/gpu/drm/amd/amdgpu/cik_ih.c               |   2 +-
> > > > > > > > > > >       drivers/gpu/drm/amd/amdgpu/cz_ih.c                |   2 +-
> > > > > > > > > > >       drivers/gpu/drm/amd/amdgpu/iceland_ih.c           |   2 +-
> > > > > > > > > > >       drivers/gpu/drm/amd/amdgpu/navi10_ih.c            |   2 +-
> > > > > > > > > > >       drivers/gpu/drm/amd/amdgpu/psp_v11_0.c            |  16 +--
> > > > > > > > > > >       drivers/gpu/drm/amd/amdgpu/psp_v12_0.c            |   8 +-
> > > > > > > > > > >       drivers/gpu/drm/amd/amdgpu/psp_v3_1.c             |   8 +-
> > > > > > > > > > >       drivers/gpu/drm/amd/amdgpu/si_ih.c                |   2 +-
> > > > > > > > > > >       drivers/gpu/drm/amd/amdgpu/tonga_ih.c             |   2 +-
> > > > > > > > > > >       drivers/gpu/drm/amd/amdgpu/vega10_ih.c            |   2 +-
> > > > > > > > > > >       drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c |  12 +-
> > > > > > > > > > >       drivers/gpu/drm/amd/include/amd_shared.h          |   2 +
> > > > > > > > > > >       drivers/gpu/drm/drm_drv.c                         |   3 +
> > > > > > > > > > >       drivers/gpu/drm/etnaviv/etnaviv_sched.c           |  10 +-
> > > > > > > > > > >       drivers/gpu/drm/lima/lima_sched.c                 |   4 +-
> > > > > > > > > > >       drivers/gpu/drm/panfrost/panfrost_job.c           |   9 +-
> > > > > > > > > > >       drivers/gpu/drm/scheduler/sched_main.c            |  18 ++-
> > > > > > > > > > >       drivers/gpu/drm/ttm/ttm_bo_vm.c                   |  82 +++++++++++-
> > > > > > > > > > >       drivers/gpu/drm/ttm/ttm_tt.c                      |   1 +
> > > > > > > > > > >       drivers/gpu/drm/v3d/v3d_sched.c                   |  32 ++---
> > > > > > > > > > >       include/drm/gpu_scheduler.h                       |  17 ++-
> > > > > > > > > > >       include/drm/ttm/ttm_bo_api.h                      |   2 +
> > > > > > > > > > >       45 files changed, 583 insertions(+), 198 deletions(-)
> > > > > > > > > > > 
> > > > > > > > > > > --
> > > > > > > > > > > 2.7.4
> > > > > > > > > > > 
> > > > 
> > 
> > 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 196+ messages in thread

* Re: [PATCH v4 11/14] drm/amdgpu: Guard against write accesses after device removal
  2021-02-09  7:58                                             ` Christian König
@ 2021-02-09 14:30                                               ` Andrey Grodzovsky
  -1 siblings, 0 replies; 196+ messages in thread
From: Andrey Grodzovsky @ 2021-02-09 14:30 UTC (permalink / raw)
  To: Christian König, Daniel Vetter
  Cc: Greg KH, amd-gfx list, dri-devel, Alex Deucher, Qiang Yu


On 2/9/21 2:58 AM, Christian König wrote:
>
> Am 08.02.21 um 23:15 schrieb Andrey Grodzovsky:
>>
>> On 2/8/21 11:23 AM, Daniel Vetter wrote:
>>> On Mon, Feb 8, 2021 at 3:00 PM Christian König <christian.koenig@amd.com> 
>>> wrote:
>>>> Am 08.02.21 um 11:11 schrieb Daniel Vetter:
>>>>> On Mon, Feb 08, 2021 at 11:03:15AM +0100, Christian König wrote:
>>>>>> Am 08.02.21 um 10:48 schrieb Daniel Vetter:
>>>>>>> On Mon, Feb 08, 2021 at 10:37:19AM +0100, Christian König wrote:
>>>>>>>> Am 07.02.21 um 22:50 schrieb Daniel Vetter:
>>>>>>>>> [SNIP]
>>>>>>>>>> Clarification - as far as I know there are no page fault handlers for
>>>>>>>>>> kernel mappings. And we are talking about kernel mappings here,
>>>>>>>>>> right? If there were, I could solve all those issues the same way I
>>>>>>>>>> do for user mappings, by invalidating all existing mappings in the
>>>>>>>>>> kernel (both kmaps and ioremaps) and inserting a dummy zero or ~0
>>>>>>>>>> filled page instead.
>>>>>>>>>> Also, I assume forcefully remapping the IO BAR to a ~0 filled page
>>>>>>>>>> would involve the ioremap API, and it's not something that I think
>>>>>>>>>> can be easily done, according to an answer I got to a related topic
>>>>>>>>>> a few weeks ago
>>>>>>>>>> https://www.spinics.net/lists/linux-pci/msg103396.html
>>>>>>>>>> (that was the only reply I got)
>>>>>>>>> mmiotrace can, but only for debug, and only on x86 platforms:
>>>>>>>>>
>>>>>>>>> https://www.kernel.org/doc/html/latest/trace/mmiotrace.html
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Should be feasible (but maybe not worth the effort) to extend this to
>>>>>>>>> support fake unplug.
>>>>>>>> Mhm, interesting idea you guys brought up here.
>>>>>>>>
>>>>>>>> We don't need a page fault for this to work, all we need to do is
>>>>>>>> insert dummy PTEs into the kernel's page table at the place where
>>>>>>>> the MMIO mapping previously was.
>>>>>>> A simple pte trick isn't enough, because we need to:
>>>>>>> - drop all writes silently
>>>>>>> - have all reads return 0xff
>>>>>>>
>>>>>>> ptes can't do that by themselves, we minimally need write protection and
>>>>>>> then to silently proceed on each write fault without restarting the
>>>>>>> instruction. Better would be to only catch reads, but x86 doesn't do
>>>>>>> write-only pte permissions afaik.
>>>>>> You are not thinking far enough :)
>>>>>>
>>>>>> The dummy PTE points to a dummy MMIO page which is just never used.
>>>>>>
>>>>>> That has the exact same properties as our removed MMIO space, it just
>>>>>> doesn't go bananas when a new device is MMIO mapped into that range and
>>>>>> our driver still tries to write there.
>>>>> Hm, but where do we get such a "guaranteed never used" mmio page from?
>>>> Well, we have tons of unused IO space on 64bit systems these days.
>>>>
>>>> It doesn't really need to be PCIe address space, does it?
>>> That sounds very trusting of modern systems not to decode random
>>> ranges. E.g. the pci code stopped extending the host bridge windows on
>>> its own, entirely relying on the acpi provided ranges, to avoid
>>> stomping on stuff that's there but not listed anywhere.
>>>
>>> I guess if we have a range behind a pci bridge which isn't used by
>>> any device, but is decoded by the bridge, then that should be safe
>>> enough. Maybe we could even have an option in upstream to do that on
>>> unplug, if a certain flag is set, or a cmdline option.
>>> -Daniel
>>
>>
>> Question - Why can't we just set those PTEs to point to system memory 
>> (another RO dummy page)
>> filled with 1s ?
>
>
> Then writes are not discarded. E.g. the 1s would change to something else.
>
> Christian.


I see, but what about marking the mappings as RO and discarding the write access
page faults continuously until the device is finalized?
Regarding using an unused range behind the upper bridge as Daniel suggested, I
wonder whether this will interfere with
the upcoming feature to support BAR movement during hot plug -
https://www.spinics.net/lists/linux-pci/msg103195.html ?

Andrey
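
As a side note, the "return ~0 on read, drop writes" validation idea above
maps naturally onto the drm_dev_enter()/drm_dev_exit() SRCU helpers. A rough
sketch of what such guarded accessors could look like (the wrapper names are
made up here, these are not the actual amdgpu register macros):

static inline u32 rreg32_guarded(struct amdgpu_device *adev, u32 reg)
{
	u32 val = ~0U;	/* what a read from a removed PCIe device returns */
	int idx;

	if (drm_dev_enter(adev_to_drm(adev), &idx)) {
		val = readl(adev->rmmio + (reg * 4));
		drm_dev_exit(idx);
	}

	return val;
}

static inline void wreg32_guarded(struct amdgpu_device *adev, u32 reg, u32 v)
{
	int idx;

	/* silently drop the write once the device is gone */
	if (drm_dev_enter(adev_to_drm(adev), &idx)) {
		writel(v, adev->rmmio + (reg * 4));
		drm_dev_exit(idx);
	}
}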


>
>
>>
>> Andrey
>>
>>
>>>
>>>> Christian.
>>>>
>>>>> It's a nifty idea indeed otherwise ...
>>>>> -Daniel
>>>>>
>>>>>> Regards,
>>>>>> Christian.
>>>>>>
>>>>>>
>>>>>>>>>>> But ugh ...
>>>>>>>>>>>
>>>>>>>>>>> Otoh validating an entire driver like amdgpu without such a trick
>>>>>>>>>>> against 0xff reads is practically impossible. So maybe you need to add
>>>>>>>>>>> this as one of the tasks here?
>>>>>>>>>> Or I could, just for validation purposes, return ~0 from all reg
>>>>>>>>>> reads in the code and ignore writes if drm_dev_unplugged; this could
>>>>>>>>>> already easily validate a big portion of the code flow under such a
>>>>>>>>>> scenario.
>>>>>>>>> Hm yeah, if you really wrap them all, that should work too. Since
>>>>>>>>> io mappings have the __iomem pointer type, as long as amdgpu is
>>>>>>>>> sparse warning free, it should be doable to guarantee this.
>>>>>>>> Problem is that ~0 is not always a valid register value.
>>>>>>>>
>>>>>>>> You would need to audit every register read to make sure it doesn't
>>>>>>>> use the returned value blindly as an index or similar. That is quite a
>>>>>>>> bit of work.
>>>>>>> Yeah that's the entire crux here :-/
>>>>>>> -Daniel
>>>
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 196+ messages in thread

* Re: [PATCH v4 00/14] RFC Support hot device unplug in amdgpu
  2021-02-09  9:50                       ` Daniel Vetter
@ 2021-02-09 15:34                         ` Andrey Grodzovsky
  -1 siblings, 0 replies; 196+ messages in thread
From: Andrey Grodzovsky @ 2021-02-09 15:34 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: amd-gfx list, Christian König, dri-devel, Qiang Yu, Greg KH,
	Alex Deucher


On 2/9/21 4:50 AM, Daniel Vetter wrote:
> On Mon, Feb 08, 2021 at 11:01:14PM -0500, Andrey Grodzovsky wrote:
>> On 2/8/21 2:27 AM, Daniel Vetter wrote:
>>> On Mon, Feb 8, 2021 at 6:59 AM Andrey Grodzovsky
>>> <Andrey.Grodzovsky@amd.com> wrote:
>>>> On 1/20/21 10:59 AM, Daniel Vetter wrote:
>>>>> On Wed, Jan 20, 2021 at 3:20 PM Andrey Grodzovsky
>>>>> <Andrey.Grodzovsky@amd.com> wrote:
>>>>>> On 1/20/21 4:05 AM, Daniel Vetter wrote:
>>>>>>> On Tue, Jan 19, 2021 at 01:18:15PM -0500, Andrey Grodzovsky wrote:
>>>>>>>> On 1/19/21 1:08 PM, Daniel Vetter wrote:
>>>>>>>>> On Tue, Jan 19, 2021 at 6:31 PM Andrey Grodzovsky
>>>>>>>>> <Andrey.Grodzovsky@amd.com> wrote:
>>>>>>>>>> On 1/19/21 9:16 AM, Daniel Vetter wrote:
>>>>>>>>>>> On Mon, Jan 18, 2021 at 04:01:09PM -0500, Andrey Grodzovsky wrote:
>>>>>>>>>>>> Until now extracting a card either by physical extraction (e.g. eGPU with
>>>>>>>>>>>> thunderbolt connection or by emulation through  syfs -> /sys/bus/pci/devices/device_id/remove)
>>>>>>>>>>>> would cause random crashes in user apps. The random crashes in apps were
>>>>>>>>>>>> mostly due to the app having mapped a device backed BO into its address
>>>>>>>>>>>> space was still trying to access the BO while the backing device was gone.
>>>>>>>>>>>> To answer this first problem Christian suggested to fix the handling of mapped
>>>>>>>>>>>> memory in the clients when the device goes away by forcibly unmap all buffers the
>>>>>>>>>>>> user processes has by clearing their respective VMAs mapping the device BOs.
>>>>>>>>>>>> Then when the VMAs try to fill in the page tables again we check in the fault
>>>>>>>>>>>> handlerif the device is removed and if so, return an error. This will generate a
>>>>>>>>>>>> SIGBUS to the application which can then cleanly terminate.This indeed was done
>>>>>>>>>>>> but this in turn created a problem of kernel OOPs were the OOPSes were due to the
>>>>>>>>>>>> fact that while the app was terminating because of the SIGBUSit would trigger use
>>>>>>>>>>>> after free in the driver by calling to accesses device structures that were already
>>>>>>>>>>>> released from the pci remove sequence.This was handled by introducing a 'flush'
>>>>>>>>>>>> sequence during device removal were we wait for drm file reference to drop to 0
>>>>>>>>>>>> meaning all user clients directly using this device terminated.
>>>>>>>>>>>>
>>>>>>>>>>>> v2:
>>>>>>>>>>>> Based on discussions in the mailing list with Daniel and Pekka [1] and based on the document
>>>>>>>>>>>> produced by Pekka from those discussions [2] the whole approach with returning SIGBUS and
>>>>>>>>>>>> waiting for all user clients having CPU mapping of device BOs to die was dropped.
>>>>>>>>>>>> Instead as per the document suggestion the device structures are kept alive until
>>>>>>>>>>>> the last reference to the device is dropped by user client and in the meanwhile all existing and new CPU mappings of the BOs
>>>>>>>>>>>> belonging to the device directly or by dma-buf import are rerouted to per user
>>>>>>>>>>>> process dummy rw page.Also, I skipped the 'Requirements for KMS UAPI' section of [2]
>>>>>>>>>>>> since i am trying to get the minimal set of requirements that still give useful solution
>>>>>>>>>>>> to work and this is the'Requirements for Render and Cross-Device UAPI' section and so my
>>>>>>>>>>>> test case is removing a secondary device, which is render only and is not involved
>>>>>>>>>>>> in KMS.
>>>>>>>>>>>>
>>>>>>>>>>>> v3:
>>>>>>>>>>>> More updates following comments from v2 such as removing loop to find DRM file when rerouting
>>>>>>>>>>>> page faults to dummy page,getting rid of unnecessary sysfs handling refactoring and moving
>>>>>>>>>>>> prevention of GPU recovery post device unplug from amdgpu to scheduler layer.
>>>>>>>>>>>> On top of that added unplug support for the IOMMU enabled system.
>>>>>>>>>>>>
>>>>>>>>>>>> v4:
>>>>>>>>>>>> Drop last sysfs hack and use sysfs default attribute.
>>>>>>>>>>>> Guard against write accesses after device removal to avoid modifying released memory.
>>>>>>>>>>>> Update dummy pages handling to on demand allocation and release through drm managed framework.
>>>>>>>>>>>> Add return value to scheduler job TO handler (by Luben Tuikov) and use this in amdgpu for prevention
>>>>>>>>>>>> of GPU recovery post device unplug
>>>>>>>>>>>> Also rebase on top of drm-misc-mext instead of amd-staging-drm-next
>>>>>>>>>>>>
>>>>>>>>>>>> With these patches I am able to gracefully remove the secondary card using sysfs remove hook while glxgears
>>>>>>>>>>>> is running off of secondary card (DRI_PRIME=1) without kernel oopses or hangs and keep working
>>>>>>>>>>>> with the primary card or soft reset the device without hangs or oopses
>>>>>>>>>>>>
>>>>>>>>>>>> TODOs for followup work:
>>>>>>>>>>>> Convert AMDGPU code to use devm (for hw stuff) and drmm (for sw stuff and allocations) (Daniel)
>>>>>>>>>>>> Support plugging the secondary device back after unplug - currently still experiencing HW error on plugging back.
>>>>>>>>>>>> Add support for 'Requirements for KMS UAPI' section of [2] - unplugging primary, display connected card.
>>>>>>>>>>>>
>>>>>>>>>>>> [1] - Discussions during v3 of the patchset https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.spinics.net%2Flists%2Famd-gfx%2Fmsg55576.html&amp;data=04%7C01%7CAndrey.Grodzovsky%40amd.com%7C49a8be16944441aca3ce08d8cce01c88%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637484610239738569%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&amp;sdata=PkvEkJt2zQ9G0dpiGYtSuVLpziwQ0FklkjdMLV2ZOnE%3D&amp;reserved=0
>>>>>>>>>>>> [2] - drm/doc: device hot-unplug for userspace https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.spinics.net%2Flists%2Fdri-devel%2Fmsg259755.html&amp;data=04%7C01%7CAndrey.Grodzovsky%40amd.com%7C49a8be16944441aca3ce08d8cce01c88%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637484610239738569%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&amp;sdata=akobkV%2FtjcDck7sHlGbgE5n1FCyGmcklhu%2Bx3goGRuw%3D&amp;reserved=0
>>>>>>>>>>>> [3] - Related gitlab ticket https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgitlab.freedesktop.org%2Fdrm%2Famd%2F-%2Fissues%2F1081&amp;data=04%7C01%7CAndrey.Grodzovsky%40amd.com%7C49a8be16944441aca3ce08d8cce01c88%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637484610239738569%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&amp;sdata=P0O5VHLkRczar2XHXyM8UysK%2BaD4hpUuoT%2FCBCnOW2g%3D&amp;reserved=0
>>>>>>>>>>> btw have you tried this out with some of the igts we have? core_hotunplug
>>>>>>>>>>> is the one I'm thinking of. Might be worth to extend this for amdgpu
>>>>>>>>>>> specific stuff (like run some batches on it while hotunplugging).
>>>>>>>>>> No, I mostly used just running glxgears while testing which covers already
>>>>>>>>>> exported/imported dma-buf case and a few manually hacked tests in libdrm amdgpu
>>>>>>>>>> test suite
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> Since there's so many corner cases we need to test here (shared dma-buf,
>>>>>>>>>>> shared dma_fence) I think it would make sense to have a shared testcase
>>>>>>>>>>> across drivers.
>>>>>>>>>> Not familiar with IGT too much, is there an easy way to setup shared dma bufs
>>>>>>>>>> and fences
>>>>>>>>>> use cases there or you mean I need to add them now ?
>>>>>>>>> We do have test infrastructure for all of that, but the hotunplug test
>>>>>>>>> doesn't have that yet I think.
>>>>>>>>>
>>>>>>>>>>> Only specific thing would be some hooks to keep the gpu
>>>>>>>>>>> busy in some fashion while we yank the driver.
>>>>>>>>>> Do you mean like staring X and some active rendering on top (like glxgears)
>>>>>>>>>> automatically from within IGT ?
>>>>>>>>> Nope, igt is meant to be bare metal testing so you don't have to drag
>>>>>>>>> the entire winsys around (which in a wayland world, is not really good
>>>>>>>>> for driver testing anyway, since everything is different). We use this
>>>>>>>>> for our pre-merge ci for drm/i915.
>>>>>>>> So I keep it busy via X/glxgears, which is a manual operation. What you suggest
>>>>>>>> then is some client within IGT which opens the device and starts submitting jobs
>>>>>>>> (which is much like what the libdrm amdgpu tests already do) ? And this
>>>>>>>> part is the amdgpu specific code I just need to port from libdrm to here ?
>>>>>>> Yup. For i915 tests we have an entire library already for small workloads,
>>>>>>> including some that just spin forever (useful for reset testing, and they could
>>>>>>> also come in handy for unload testing).
>>>>>>> -Daniel
>>>>>> Does that mean I would have to drag in the entire infrastructure code from
>>>>>> the libdrm amdgpu code that allows for command submission through
>>>>>> our IOCTLs ?
>>>>> No it's perfectly fine to use libdrm in igt tests, we do that too. I
>>>>> just mean we have some additional helpers to submit specific workloads
>>>>> for intel gpu, like rendercpy to move data with the 3d engine (just
>>>>> using copy engines only isn't good enough sometimes for testing), or
>>>>> the special hanging batchbuffers we use for reset testing, or in
>>>>> general for having precise control over race conditions and things
>>>>> like that.
>>>>>
>>>>> One thing that was somewhat annoying for i915, but shouldn't be a
>>>>> problem for amdgpu, is that igt has to build on non-intel platforms too. So we
>>>>> have stub functions for libdrm-intel, since libdrm-intel doesn't build on arm.
>>>>> Shouldn't be a problem for you.
>>>>> -Daniel
>>>> Tested with the igt hot-unplug test. Passed unbind_rebind, unplug-rescan,
>>>> hot-unbind-rebind and hotunplug-rescan,
>>>> after disabling the rescan part, as I don't support plug-back for now. Also added
>>>> command submission for amdgpu.
>>>> Attached a draft of submitting a workload while unbinding the driver or simulating
>>>> detach. Caught 2 issues with unplug if a command submission is in flight during
>>>> unplug -
>>>> an unsignaled fence causing a hang in amdgpu_cs_sync, and hitting a BUG_ON in
>>>> gfx_v9_0_ring_emit_patch_cond_exec (which is expected, I guess).
>>>> I guess glxgears submits commands at a much slower rate, so this was missed.
>>>> Is that what you meant for this test ?
>>> Yup. Would be good if you can submit this one for inclusion.
>>> -Daniel
>>
>> Will do together with exported dma-buf test once I do it.
>>
>> P.S. How am I supposed to do the exported fence test ? Exporting a fence from
>> device A, importing it into device B, unplugging
>> device A, then signaling the fence from device B - this is supposed to call a
>> fence cb which was registered
>> by the exporter, which by now is dead, and hence will cause a 'use after free' ?
> Yeah in the end we'd need 2 hw devices for testing full fence
> functionality. A useful intermediate step would be to just export the
> fence (either as sync_file, which I think amdgpu doesn't support because
> no android egl support in mesa) or drm_syncobj (which you can do as
> standalone fd too iirc), and then just using the fence a bit from
> userspace (like wait on it or get its status) after the device is
> unplugged.
>
> I think this should cover most of the cross-driver issues that fences
> bring in, and I think for the other problems we can worry once we spot.
> -Daniel


OK, will write up all the tests and submit a merge request for all of them 
together to IGT gitlab

Andrey


>
>> Andrey
>>
>>>> Andrey
>>>>
>>>>
>>>>>> Andrey
>>>>>>
>>>>>>>> Andrey
>>>>>>>>
>>>>>>>>
>>>>>>>>>>> But just to get it started
>>>>>>>>>>> you can throw in entirely amdgpu specific subtests and just share some of
>>>>>>>>>>> the test code.
>>>>>>>>>>> -Daniel
>>>>>>>>>> In general, I wasn't aware of this test suite, and it looks like it does what I test,
>>>>>>>>>> among other stuff.
>>>>>>>>>> I will definitely try to run with it, although the rescan part will not work, as
>>>>>>>>>> plugging
>>>>>>>>>> the device back is on my TODO list and not part of the scope for this patchset,
>>>>>>>>>> and so I will
>>>>>>>>>> probably comment the re-scan section out while testing.
>>>>>>>>> amd gem has been using libdrm-amd thus far iirc, but for things like
>>>>>>>>> this I think it'd be worth at least considering switching. The display
>>>>>>>>> team has already started to use some of the tests and contribute stuff
>>>>>>>>> (I think the VRR testcase is from amd).
>>>>>>>>> -Daniel
>>>>>>>>>
>>>>>>>>>> Andrey
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>> Andrey Grodzovsky (13):
>>>>>>>>>>>>         drm/ttm: Remap all page faults to per process dummy page.
>>>>>>>>>>>>         drm: Unmap the entire device address space on device unplug
>>>>>>>>>>>>         drm/ttm: Expose ttm_tt_unpopulate for driver use
>>>>>>>>>>>>         drm/sched: Cancel and flush all outstanding jobs before finish.
>>>>>>>>>>>>         drm/amdgpu: Split amdgpu_device_fini into early and late
>>>>>>>>>>>>         drm/amdgpu: Add early fini callback
>>>>>>>>>>>>         drm/amdgpu: Register IOMMU topology notifier per device.
>>>>>>>>>>>>         drm/amdgpu: Fix a bunch of sdma code crash post device unplug
>>>>>>>>>>>>         drm/amdgpu: Remap all page faults to per process dummy page.
>>>>>>>>>>>>         drm/amdgpu: Move some sysfs attrs creation to default_attr
>>>>>>>>>>>>         drm/amdgpu: Guard against write accesses after device removal
>>>>>>>>>>>>         drm/sched: Make timeout timer rearm conditional.
>>>>>>>>>>>>         drm/amdgpu: Prevent any job recoveries after device is unplugged.
>>>>>>>>>>>>
>>>>>>>>>>>> Luben Tuikov (1):
>>>>>>>>>>>>         drm/scheduler: Job timeout handler returns status
>>>>>>>>>>>>
>>>>>>>>>>>>        drivers/gpu/drm/amd/amdgpu/amdgpu.h               |  11 +-
>>>>>>>>>>>>        drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c      |  17 +--
>>>>>>>>>>>>        drivers/gpu/drm/amd/amdgpu/amdgpu_device.c        | 149 ++++++++++++++++++++--
>>>>>>>>>>>>        drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c           |  20 ++-
>>>>>>>>>>>>        drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c         |  15 ++-
>>>>>>>>>>>>        drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c          |   2 +-
>>>>>>>>>>>>        drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h          |   1 +
>>>>>>>>>>>>        drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c           |   9 ++
>>>>>>>>>>>>        drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c       |  25 ++--
>>>>>>>>>>>>        drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c           |  26 ++--
>>>>>>>>>>>>        drivers/gpu/drm/amd/amdgpu/amdgpu_irq.h           |   3 +-
>>>>>>>>>>>>        drivers/gpu/drm/amd/amdgpu/amdgpu_job.c           |  19 ++-
>>>>>>>>>>>>        drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c           |  12 +-
>>>>>>>>>>>>        drivers/gpu/drm/amd/amdgpu/amdgpu_object.c        |  10 ++
>>>>>>>>>>>>        drivers/gpu/drm/amd/amdgpu/amdgpu_object.h        |   2 +
>>>>>>>>>>>>        drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c           |  53 +++++---
>>>>>>>>>>>>        drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h           |   3 +
>>>>>>>>>>>>        drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c           |   1 +
>>>>>>>>>>>>        drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c          |  70 ++++++++++
>>>>>>>>>>>>        drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h          |  52 +-------
>>>>>>>>>>>>        drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c           |  21 ++-
>>>>>>>>>>>>        drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c            |   8 +-
>>>>>>>>>>>>        drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c      |  14 +-
>>>>>>>>>>>>        drivers/gpu/drm/amd/amdgpu/cik_ih.c               |   2 +-
>>>>>>>>>>>>        drivers/gpu/drm/amd/amdgpu/cz_ih.c                |   2 +-
>>>>>>>>>>>>        drivers/gpu/drm/amd/amdgpu/iceland_ih.c           |   2 +-
>>>>>>>>>>>>        drivers/gpu/drm/amd/amdgpu/navi10_ih.c            |   2 +-
>>>>>>>>>>>>        drivers/gpu/drm/amd/amdgpu/psp_v11_0.c            |  16 +--
>>>>>>>>>>>>        drivers/gpu/drm/amd/amdgpu/psp_v12_0.c            |   8 +-
>>>>>>>>>>>>        drivers/gpu/drm/amd/amdgpu/psp_v3_1.c             |   8 +-
>>>>>>>>>>>>        drivers/gpu/drm/amd/amdgpu/si_ih.c                |   2 +-
>>>>>>>>>>>>        drivers/gpu/drm/amd/amdgpu/tonga_ih.c             |   2 +-
>>>>>>>>>>>>        drivers/gpu/drm/amd/amdgpu/vega10_ih.c            |   2 +-
>>>>>>>>>>>>        drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c |  12 +-
>>>>>>>>>>>>        drivers/gpu/drm/amd/include/amd_shared.h          |   2 +
>>>>>>>>>>>>        drivers/gpu/drm/drm_drv.c                         |   3 +
>>>>>>>>>>>>        drivers/gpu/drm/etnaviv/etnaviv_sched.c           |  10 +-
>>>>>>>>>>>>        drivers/gpu/drm/lima/lima_sched.c                 |   4 +-
>>>>>>>>>>>>        drivers/gpu/drm/panfrost/panfrost_job.c           |   9 +-
>>>>>>>>>>>>        drivers/gpu/drm/scheduler/sched_main.c            |  18 ++-
>>>>>>>>>>>>        drivers/gpu/drm/ttm/ttm_bo_vm.c                   |  82 +++++++++++-
>>>>>>>>>>>>        drivers/gpu/drm/ttm/ttm_tt.c                      |   1 +
>>>>>>>>>>>>        drivers/gpu/drm/v3d/v3d_sched.c                   |  32 ++---
>>>>>>>>>>>>        include/drm/gpu_scheduler.h                       |  17 ++-
>>>>>>>>>>>>        include/drm/ttm/ttm_bo_api.h                      |   2 +
>>>>>>>>>>>>        45 files changed, 583 insertions(+), 198 deletions(-)
>>>>>>>>>>>>
>>>>>>>>>>>> --
>>>>>>>>>>>> 2.7.4
>>>>>>>>>>>>
>>>

* Re: [PATCH v4 11/14] drm/amdgpu: Guard against write accesses after device removal
  2021-02-09 14:30                                               ` Andrey Grodzovsky
@ 2021-02-09 15:40                                                 ` Christian König
From: Christian König @ 2021-02-09 15:40 UTC (permalink / raw)
  To: Andrey Grodzovsky, Daniel Vetter
  Cc: Greg KH, amd-gfx list, dri-devel, Alex Deucher, Qiang Yu

Am 09.02.21 um 15:30 schrieb Andrey Grodzovsky:
> [SNIP]
>>> Question - Why can't we just set those PTEs to point to system 
>>> memory (another RO dummy page)
>>> filled with 1s ?
>>
>>
>> Then writes are not discarded. E.g. the 1s would change to something 
>> else.
>>
>> Christian.
>
>
> I see, but what about marking the mappings as RO and discarding the
> write-access page faults continuously until the device is finalized ?
> Regarding using an unused range behind the upper bridge as Daniel
> suggested, I wonder whether this will interfere with
> the upcoming feature to support BAR movement during hot plug -
> https://www.spinics.net/lists/linux-pci/msg103195.html ?

In the picture in my head the address will never be used.

But it doesn't even need to be an unused range of the root bridge. That 
one is usually stuffed full by the BIOS.

For my BAR resize work I looked at quite a bunch of NB documentation. At 
least for AMD CPUs we should always have an MMIO address which is never 
ever used. That makes this platform/CPU dependent, but the code is just 
minimal.

The really really nice thing about this approach is that you could unit 
test and audit the code for problems even without *real* hotplug 
hardware. E.g. we can swap the kernel PTEs and see how the whole power 
and display code reacts to that etc....

Christian.
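
(Christian's "platform/CPU dependent, but minimal" lookup might be sketched
roughly as below. This is purely illustrative and not from the thread:
allocate_resource() is the real kernel API for claiming a free hole in the
iomem tree, but whether such a hole is truly never decoded by anything is
exactly the platform-dependent part he mentions, and the function and
resource names here are made up.)

    #include <linux/ioport.h>

    static struct resource fake_mmio_res = {
            .name  = "fake-mmio-sink",      /* made-up name */
            .flags = IORESOURCE_MEM,
    };

    /* Carve one unclaimed page out of the physical MMIO space below 4G
     * to serve as the write-discarding dummy target. */
    static resource_size_t get_fake_mmio_address(void)
    {
            if (allocate_resource(&iomem_resource, &fake_mmio_res,
                                  PAGE_SIZE, 0, 0xffffffffULL,
                                  PAGE_SIZE, NULL, NULL))
                    return 0;       /* no free hole found */

            return fake_mmio_res.start;
    }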

>
> Andrey
>
>
>>
>>
>>>
>>> Andrey
>>>
>>>
>>>>
>>>>> Christian.
>>>>>
>>>>>> It's a nifty idea indeed otherwise ...
>>>>>> -Daniel
>>>>>>
>>>>>>> Regards,
>>>>>>> Christian.
>>>>>>>
>>>>>>>
>>>>>>>>>>>> But ugh ...
>>>>>>>>>>>>
>>>>>>>>>>>> Otoh validating an entire driver like amdgpu without such a 
>>>>>>>>>>>> trick
>>>>>>>>>>>> against 0xff reads is practically impossible. So maybe you 
>>>>>>>>>>>> need to add
>>>>>>>>>>>> this as one of the tasks here?
>>>>>>>>>>> Or I could just for validation purposes return ~0 from all 
>>>>>>>>>>> reg reads in the code
>>>>>>>>>>> and ignore writes if drm_dev_unplugged, this could already 
>>>>>>>>>>> easily validate a big
>>>>>>>>>>> portion of the code flow under such scenario.
>>>>>>>>>> Hm yeah, if you really wrap them all, that should work too. Since
>>>>>>>>>> io mappings have __iomem pointer type, as long as amdgpu is sparse
>>>>>>>>>> warning free, it should be doable to guarantee this.
>>>>>>>>> Problem is that ~0 is not always a valid register value.
>>>>>>>>>
>>>>>>>>> You would need to audit every register read to make sure it doesn't
>>>>>>>>> use the returned
>>>>>>>>> value blindly as an index or similar. That is quite a bit of work.
>>>>>>>> Yeah that's the entire crux here :-/
>>>>>>>> -Daniel
>>>>
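
(The wrapper being discussed in the quoted exchange above might look roughly
like this sketch. It is illustrative only - my_rreg and the dword-offset math
are assumptions rather than the actual amdgpu accessors - while
drm_dev_enter()/drm_dev_exit() are the real DRM helpers for guarding against
unplug.)

    /* Sketch: guard a register read so that a read racing with or
     * following unplug returns all ones instead of touching the BAR. */
    static u32 my_rreg(struct amdgpu_device *adev, u32 reg)
    {
            u32 val = ~0;
            int idx;

            if (drm_dev_enter(adev_to_drm(adev), &idx)) {
                    val = readl(adev->rmmio + reg * 4);
                    drm_dev_exit(idx);
            }
            return val;
    }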

* Re: [PATCH v4 11/14] drm/amdgpu: Guard against write accesses after device removal
  2021-02-09 15:40                                                 ` Christian König
@ 2021-02-10 22:01                                                   ` Andrey Grodzovsky
From: Andrey Grodzovsky @ 2021-02-10 22:01 UTC (permalink / raw)
  To: Christian König, Daniel Vetter
  Cc: Greg KH, amd-gfx list, dri-devel, Alex Deucher, Qiang Yu


On 2/9/21 10:40 AM, Christian König wrote:
> Am 09.02.21 um 15:30 schrieb Andrey Grodzovsky:
>> [SNIP]
>>>> Question - Why can't we just set those PTEs to point to system memory 
>>>> (another RO dummy page)
>>>> filled with 1s ?
>>>
>>>
>>> Then writes are not discarded. E.g. the 1s would change to something else.
>>>
>>> Christian.
>>
>>
>> I see, but what about marking the mappings as RO and discarding the write
>> access page faults continuously until the device is finalized ?
>> Regarding using an unused range behind the upper bridge as Daniel suggested,
>> I wonder whether this will interfere with
>> the upcoming feature to support BAR movement during hot plug -
>> https://www.spinics.net/lists/linux-pci/msg103195.html ?
>
> In the picture in my head the address will never be used.
>
> But it doesn't even need to be an unused range of the root bridge. That one 
> is usually stuffed full by the BIOS.
>
> For my BAR resize work I looked at quite a bunch of NB documentation. At least 
> for AMD CPUs we should always have an MMIO address which is never ever used. 
> That makes this platform/CPU dependent, but the code is just minimal.
>
> The really really nice thing about this approach is that you could unit test 
> and audit the code for problems even without *real* hotplug hardware. E.g. we 
> can swap the kernel PTEs and see how the whole power and display code reacts 
> to that etc....
>
> Christian.


Tried playing with hacking the mmio tracer as Daniel suggested, but that just hung 
the machine, so...

Can you tell me how to dynamically obtain this kind of unused MMIO address ? Given 
such an address, where writes
are dropped and reads return all 1s, I can then do something like 
below, if that is what you meant -

for (address = (unsigned long)adev->rmmio;
     address < (unsigned long)adev->rmmio + adev->rmmio_size;
     address += PAGE_SIZE) {

     spinlock_t *ptl;
     pte_t *old_pte = get_locked_pte(&init_mm, address, &ptl);

     /* pgprot is a guess; the original sketch passed 0 here */
     set_pte(old_pte, pfn_pte(fake_mmio_address >> PAGE_SHIFT, PAGE_KERNEL_IO));
     pte_unmap_unlock(old_pte, ptl);
}

/* flush_tlb ??? - presumably: */
flush_tlb_kernel_range((unsigned long)adev->rmmio,
                       (unsigned long)adev->rmmio + adev->rmmio_size);

P.S. I hope to obtain a thunderbolt eGPU adapter soon, so even if this won't work I 
will still be able to test how the driver handles all 1s.

Andrey

>
>>
>> Andrey
>>
>>
>>>
>>>
>>>>
>>>> Andrey
>>>>
>>>>
>>>>>
>>>>>> Christian.
>>>>>>
>>>>>>> It's a nifty idea indeed otherwise ...
>>>>>>> -Daniel
>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Christian.
>>>>>>>>
>>>>>>>>
>>>>>>>>>>>>> But ugh ...
>>>>>>>>>>>>>
>>>>>>>>>>>>> Otoh validating an entire driver like amdgpu without such a trick
>>>>>>>>>>>>> against 0xff reads is practically impossible. So maybe you need to 
>>>>>>>>>>>>> add
>>>>>>>>>>>>> this as one of the tasks here?
>>>>>>>>>>>> Or I could just for validation purposes return ~0 from all reg 
>>>>>>>>>>>> reads in the code
>>>>>>>>>>>> and ignore writes if drm_dev_unplugged, this could already easily 
>>>>>>>>>>>> validate a big
>>>>>>>>>>>> portion of the code flow under such scenario.
>>>>>>>>>>>> Hm yeah, if you really wrap them all, that should work too. Since
>>>>>>>>>>>> io mappings have __iomem pointer type, as long as amdgpu is sparse
>>>>>>>>>>>> warning free, it should be doable to guarantee this.
>>>>>>>>>>> Problem is that ~0 is not always a valid register value.
>>>>>>>>>>>
>>>>>>>>>>> You would need to audit every register read to make sure it doesn't use the
>>>>>>>>>>> returned
>>>>>>>>>>> value blindly as an index or similar. That is quite a bit of work.
>>>>>>>>> Yeah that's the entire crux here :-/
>>>>>>>>> -Daniel
>>>>>
>

* Re: [PATCH v4 11/14] drm/amdgpu: Guard against write accesses after device removal
  2021-02-10 22:01                                                   ` Andrey Grodzovsky
@ 2021-02-12 15:00                                                     ` Andrey Grodzovsky
From: Andrey Grodzovsky @ 2021-02-12 15:00 UTC (permalink / raw)
  To: Christian König, Daniel Vetter
  Cc: Greg KH, amd-gfx list, dri-devel, Alex Deucher, Qiang Yu

Ping

Andrey

On 2/10/21 5:01 PM, Andrey Grodzovsky wrote:
>
> On 2/9/21 10:40 AM, Christian König wrote:
>> Am 09.02.21 um 15:30 schrieb Andrey Grodzovsky:
>>> [SNIP]
>>>>> Question - Why can't we just set those PTEs to point to system memory 
>>>>> (another RO dummy page)
>>>>> filled with 1s ?
>>>>
>>>>
>>>> Then writes are not discarded. E.g. the 1s would change to something else.
>>>>
>>>> Christian.
>>>
>>>
>>> I see, but what about marking the mappings as RO and discarding the write
>>> access page faults continuously until the device is finalized ?
>>> Regarding using an unused range behind the upper bridge as Daniel suggested,
>>> I wonder whether this will interfere with
>>> the upcoming feature to support BAR movement during hot plug -
>>> https://www.spinics.net/lists/linux-pci/msg103195.html ?
>>
>> In the picture in my head the address will never be used.
>>
>> But it doesn't even need to be an unused range of the root bridge. That one 
>> is usually stuffed full by the BIOS.
>>
>> For my BAR resize work I looked at quite a bunch of NB documentation. At 
>> least for AMD CPUs we should always have an MMIO address which is never ever 
>> used. That makes this platform/CPU dependent, but the code is just minimal.
>>
>> The really really nice thing about this approach is that you could unit test 
>> and audit the code for problems even without *real* hotplug hardware. E.g. we 
>> can swap the kernel PTEs and see how the whole power and display code reacts 
>> to that etc....
>>
>> Christian.
>
>
> Tried playing with hacking the mmio tracer as Daniel suggested, but that just hung 
> the machine, so...
>
> Can you tell me how to dynamically obtain this kind of unused MMIO address ? Given 
> such an address, where writes
> are dropped and reads return all 1s, I can then do something like 
> below, if that is what you meant -
>
> for (address = (unsigned long)adev->rmmio;
>      address < (unsigned long)adev->rmmio + adev->rmmio_size;
>      address += PAGE_SIZE) {
>
>      spinlock_t *ptl;
>      pte_t *old_pte = get_locked_pte(&init_mm, address, &ptl);
>
>      /* pgprot is a guess; the original sketch passed 0 here */
>      set_pte(old_pte, pfn_pte(fake_mmio_address >> PAGE_SHIFT, PAGE_KERNEL_IO));
>      pte_unmap_unlock(old_pte, ptl);
> }
>
> /* flush_tlb ??? - presumably: */
> flush_tlb_kernel_range((unsigned long)adev->rmmio,
>                        (unsigned long)adev->rmmio + adev->rmmio_size);
>
> P.S. I hope to obtain a thunderbolt eGPU adapter soon, so even if this won't work 
> I will still be able to test how the driver handles all 1s.
>
> Andrey
>
>>
>>>
>>> Andrey
>>>
>>>
>>>>
>>>>
>>>>>
>>>>> Andrey
>>>>>
>>>>>
>>>>>>
>>>>>>> Christian.
>>>>>>>
>>>>>>>> It's a nifty idea indeed otherwise ...
>>>>>>>> -Daniel
>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>> Christian.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>>>>> But ugh ...
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Otoh validating an entire driver like amdgpu without such a trick
>>>>>>>>>>>>>> against 0xff reads is practically impossible. So maybe you need 
>>>>>>>>>>>>>> to add
>>>>>>>>>>>>>> this as one of the tasks here?
>>>>>>>>>>>>> Or I could just for validation purposes return ~0 from all reg 
>>>>>>>>>>>>> reads in the code
>>>>>>>>>>>>> and ignore writes if drm_dev_unplugged, this could already easily 
>>>>>>>>>>>>> validate a big
>>>>>>>>>>>>> portion of the code flow under such scenario.
>>>>>>>>>>>> Hm yeah, if you really wrap them all, that should work too. Since
>>>>>>>>>>>> io mappings have __iomem pointer type, as long as amdgpu is sparse
>>>>>>>>>>>> warning free, it should be doable to guarantee this.
>>>>>>>>>>> Problem is that ~0 is not always a valid register value.
>>>>>>>>>>>
>>>>>>>>>>> You would need to audit every register read to make sure it doesn't use the
>>>>>>>>>>> returned
>>>>>>>>>>> value blindly as an index or similar. That is quite a bit of work.
>>>>>>>>>> Yeah that's the entire crux here :-/
>>>>>>>>>> -Daniel
>>>>>>
>>

* Re: [PATCH v4 00/14] RFC Support hot device unplug in amdgpu
  2021-02-09  9:50                       ` Daniel Vetter
@ 2021-02-18 20:03                         ` Andrey Grodzovsky
From: Andrey Grodzovsky @ 2021-02-18 20:03 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: amd-gfx list, Christian König, dri-devel, Qiang Yu, Greg KH,
	Alex Deucher

I looked a bit into it. I want to export a sync object to an FD and import from that FD,
such that I wait on the imported sync object handle from one thread while
signaling the exported sync object handle from another (post device unplug).

My problem is how to create a sync object with a non-signaled 'fake' fence;
I only see an API that creates it with an already signaled fence (or none) - 
https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/drm_syncobj.c#L56

P.S. I expect the kernel to crash since, unlike with dma-bufs, we don't hold
a drm device reference here on export.

Andrey
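
(For reference: a binary drm_syncobj created without the DRM_SYNCOBJ_CREATE_SIGNALED
flag starts out with no fence attached, and a waiter can pass
DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT to block until a fence shows up. A minimal
sketch at the ioctl level - the helper name is made up:)

    #include <stdint.h>
    #include <xf86drm.h>

    /* Create a binary syncobj with no fence attached, i.e. unsignaled. */
    static int create_unsignaled_syncobj(int fd, uint32_t *handle)
    {
            struct drm_syncobj_create create = { .flags = 0 };
            int ret = drmIoctl(fd, DRM_IOCTL_SYNCOBJ_CREATE, &create);

            *handle = create.handle;
            return ret;
    }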

On 2/9/21 4:50 AM, Daniel Vetter wrote:
> Yeah in the end we'd need 2 hw devices for testing full fence
> functionality. A useful intermediate step would be to just export the
> fence (either as sync_file, which I think amdgpu doesn't support because
> no android egl support in mesa) or drm_syncobj (which you can do as
> standalone fd too iirc), and then just using the fence a bit from
> userspace (like wait on it or get its status) after the device is
> unplugged.

* Re: [PATCH v4 00/14] RFC Support hot device unplug in amdgpu
  2021-02-18 20:03                         ` Andrey Grodzovsky
@ 2021-02-19 10:24                           ` Daniel Vetter
From: Daniel Vetter @ 2021-02-19 10:24 UTC (permalink / raw)
  To: Andrey Grodzovsky
  Cc: amd-gfx list, Christian König, dri-devel, Qiang Yu, Greg KH,
	Alex Deucher

On Thu, Feb 18, 2021 at 9:03 PM Andrey Grodzovsky
<Andrey.Grodzovsky@amd.com> wrote:
>
> I looked a bit into it. I want to export a sync object to an FD and import from that FD,
> such that I wait on the imported sync object handle from one thread while
> signaling the exported sync object handle from another (post device unplug).
>
> My problem is how to create a sync object with a non-signaled 'fake' fence;
> I only see an API that creates it with an already signaled fence (or none) -
> https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/drm_syncobj.c#L56
>
> P.S. I expect the kernel to crash since, unlike with dma-bufs, we don't hold
> a drm device reference here on export.

Well, maybe there's no crash. I think if you go through all the dma_fences
that you have and force-complete them, then external callers won't go into
the driver anymore. There are still pointers potentially pointing at your
device struct and all that, but it should work. Still needs some audit, of
course.

Wrt how you get such a free-standing fence, that's amdgpu specific. Roughly
- submit cs
- get the fence for that (either sync_file, but I don't think amdgpu
supports that, or maybe through drm_syncobj)
- hotunplug
- wait on that fence somehow (drm_syncobj has direct uapi for this,
same for sync_file I think)
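
(Put together, that flow might look like the sketch below, using libdrm's
syncobj helpers. Assumptions: the render-node path is made up, error handling
is dropped, and the CS step that would attach a real unsignaled fence to the
syncobj is elided.)

    #include <stdint.h>
    #include <fcntl.h>
    #include <unistd.h>
    #include <xf86drm.h>

    int main(void)
    {
            int fd = open("/dev/dri/renderD129", O_RDWR); /* assumed secondary card */
            uint32_t handle, imported;
            int obj_fd;

            drmSyncobjCreate(fd, 0, &handle);          /* starts with no fence */
            /* ... submit a CS here and have it attach its fence to 'handle' ... */
            drmSyncobjHandleToFD(fd, handle, &obj_fd); /* export as standalone FD */
            drmSyncobjFDToHandle(fd, obj_fd, &imported);

            /* <-- hotunplug the device here (sysfs remove) --> */

            /* wait on the re-imported syncobj; must not oops post-unplug */
            drmSyncobjWait(fd, &imported, 1, INT64_MAX,
                           DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT, NULL);
            close(obj_fd);
            return 0;
    }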

Cheers, Daniel

>
> Andrey
>
> On 2/9/21 4:50 AM, Daniel Vetter wrote:
> > Yeah in the end we'd need 2 hw devices for testing full fence
> > functionality. A useful intermediate step would be to just export the
> > fence (either as sync_file, which I think amdgpu doesn't support because
> > no android egl support in mesa) or drm_syncobj (which you can do as
> > standalone fd too iirc), and then just using the fence a bit from
> > userspace (like wait on it or get its status) after the device is
> > unplugged.



-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

* Re: [PATCH v4 00/14] RFC Support hot device unplug in amdgpu
  2021-02-19 10:24                           ` Daniel Vetter
@ 2021-02-24 16:30                             ` Andrey Grodzovsky
  -1 siblings, 0 replies; 196+ messages in thread
From: Andrey Grodzovsky @ 2021-02-24 16:30 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: amd-gfx list, Christian König, dri-devel, Qiang Yu, Greg KH,
	Alex Deucher


On 2021-02-19 5:24 a.m., Daniel Vetter wrote:
> On Thu, Feb 18, 2021 at 9:03 PM Andrey Grodzovsky
> <Andrey.Grodzovsky@amd.com> wrote:
>> Looked a bit into it, I want to export sync_object to FD and import  from that FD
>> such that I will wait on the imported sync object handle from one thread while
>> signaling the exported sync object handle from another (post device unplug) ?
>>
>> My problem is how to create a sync object with a non signaled 'fake' fence ?
>> I only see API that creates it with already signaled fence (or none) -
>> https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/drm_syncobj.c#L56
>>
>> P.S I expect the kernel to crash since unlike with dma_bufs we don't hold
>> drm device reference here on export.
> Well maybe there's no crash. I think if you go through all your
> dma_fence that you have and force-complete them, then I think external
> callers wont go into the driver anymore. But there's still pointers
> potentially pointing at your device struct and all that, but should
> work. Still needs some audit ofc.
>
> Wrt how you get such a free-standing fence, that's amdgpu specific. Roughly
> - submit cs
> - get the fence for that (either sync_file, but I don't think amdgpu
> supports that, or maybe through drm_syncobj)
> - hotunplug
> - wait on that fence somehow (drm_syncobj has direct uapi for this,
> same for sync_file I think)
>
> Cheers, Daniel


Indeed, it worked fine; I did it with 2 devices. Since the syncobj is
refcounted, even after I destroyed the original syncobj and unplugged
the device, the exported syncobj and the fence inside didn't go
anywhere.
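
Roughly the part of the test that exercises this (a minimal sketch;
the handles are assumed to come from an export/import sequence like
the one sketched earlier in the thread):

#include <stdint.h>
#include <xf86drm.h>

/* Destroying the original handle doesn't free the syncobj: the
 * imported handle still holds a reference on the same refcounted
 * drm_syncobj, so the object and its fence stay alive. */
int check_refcount_survival(int drm_fd, uint32_t exported,
			    uint32_t imported)
{
	drmSyncobjDestroy(drm_fd, exported);  /* original handle gone */
	/* ... device unplug happens here ... */
	return drmSyncobjWait(drm_fd, &imported, 1, INT64_MAX,
			      DRM_SYNCOBJ_WAIT_FLAGS_WAIT_ALL, NULL);
}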

See my 3 tests in my branch on GitLab:
https://gitlab.freedesktop.org/agrodzov/igt-gpu-tools/-/commits/master
and let me know if I should go ahead and do a merge request (into
which target project/branch?) or whether you have more comments.

Andrey


>
>> Andrey
>>
>> On 2/9/21 4:50 AM, Daniel Vetter wrote:
>>> Yeah in the end we'd need 2 hw devices for testing full fence
>>> functionality. A useful intermediate step would be to just export the
>>> fence (either as sync_file, which I think amdgpu doesn't support because
>>> no android egl support in mesa) or drm_syncobj (which you can do as
>>> standalone fd too iirc), and then just using the fence a bit from
>>> userspace (like wait on it or get its status) after the device is
>>> unplugged.
>
>
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 196+ messages in thread

* Re: [PATCH v4 00/14] RFC Support hot device unplug in amdgpu
  2021-02-24 16:30                             ` Andrey Grodzovsky
@ 2021-02-25 10:25                               ` Daniel Vetter
  -1 siblings, 0 replies; 196+ messages in thread
From: Daniel Vetter @ 2021-02-25 10:25 UTC (permalink / raw)
  To: Andrey Grodzovsky
  Cc: Christian König, dri-devel, amd-gfx list, Greg KH,
	Alex Deucher, Qiang Yu

On Wed, Feb 24, 2021 at 11:30:50AM -0500, Andrey Grodzovsky wrote:
> 
> On 2021-02-19 5:24 a.m., Daniel Vetter wrote:
> > On Thu, Feb 18, 2021 at 9:03 PM Andrey Grodzovsky
> > <Andrey.Grodzovsky@amd.com> wrote:
> > > Looked a bit into it, I want to export sync_object to FD and import  from that FD
> > > such that I will wait on the imported sync object handle from one thread while
> > > signaling the exported sync object handle from another (post device unplug) ?
> > > 
> > > My problem is how to create a sync object with a non signaled 'fake' fence ?
> > > I only see API that creates it with already signaled fence (or none) -
> > > https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/drm_syncobj.c#L56
> > > 
> > > P.S I expect the kernel to crash since unlike with dma_bufs we don't hold
> > > drm device reference here on export.
> > Well maybe there's no crash. I think if you go through all your
> > dma_fence that you have and force-complete them, then I think external
> > callers wont go into the driver anymore. But there's still pointers
> > potentially pointing at your device struct and all that, but should
> > work. Still needs some audit ofc.
> > 
> > Wrt how you get such a free-standing fence, that's amdgpu specific. Roughly
> > - submit cs
> > - get the fence for that (either sync_file, but I don't think amdgpu
> > supports that, or maybe through drm_syncobj)
> > - hotunplug
> > - wait on that fence somehow (drm_syncobj has direct uapi for this,
> > same for sync_file I think)
> > 
> > Cheers, Daniel
> 
> 
> Indeed worked fine, did with 2 devices. Since syncobj is refcounted, even
> after I
> destroyed the original syncobj and unplugged the device, the exported
> syncobj and the
> fence inside didn't go anywhere.
> 
> See my 3 tests in my branch on Gitlab
> https://gitlab.freedesktop.org/agrodzov/igt-gpu-tools/-/commits/master
> and let me know if I should go ahead and do a merge request (into which
> target project/branch ?) or you
> have more comments.

igt still works with patch submission (i.e. patches to the mailing
list rather than a merge request).
-Daniel

> 
> Andrey
> 
> 
> > 
> > > Andrey
> > > 
> > > On 2/9/21 4:50 AM, Daniel Vetter wrote:
> > > > Yeah in the end we'd need 2 hw devices for testing full fence
> > > > functionality. A useful intermediate step would be to just export the
> > > > fence (either as sync_file, which I think amdgpu doesn't support because
> > > > no android egl support in mesa) or drm_syncobj (which you can do as
> > > > standalone fd too iirc), and then just using the fence a bit from
> > > > userspace (like wait on it or get its status) after the device is
> > > > unplugged.
> > 
> > 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 196+ messages in thread

* Re: [PATCH v4 00/14] RFC Support hot device unplug in amdgpu
  2021-02-25 10:25                               ` Daniel Vetter
@ 2021-02-25 16:12                                 ` Andrey Grodzovsky
  -1 siblings, 0 replies; 196+ messages in thread
From: Andrey Grodzovsky @ 2021-02-25 16:12 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Christian König, amd-gfx list, dri-devel, Greg KH,
	Alex Deucher, Qiang Yu


On 2021-02-25 5:25 a.m., Daniel Vetter wrote:
> On Wed, Feb 24, 2021 at 11:30:50AM -0500, Andrey Grodzovsky wrote:
>> On 2021-02-19 5:24 a.m., Daniel Vetter wrote:
>>> On Thu, Feb 18, 2021 at 9:03 PM Andrey Grodzovsky
>>> <Andrey.Grodzovsky@amd.com> wrote:
>>>> Looked a bit into it, I want to export sync_object to FD and import  from that FD
>>>> such that I will wait on the imported sync object handle from one thread while
>>>> signaling the exported sync object handle from another (post device unplug) ?
>>>>
>>>> My problem is how to create a sync object with a non signaled 'fake' fence ?
>>>> I only see API that creates it with already signaled fence (or none) -
>>>> https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/drm_syncobj.c#L56
>>>>
>>>> P.S I expect the kernel to crash since unlike with dma_bufs we don't hold
>>>> drm device reference here on export.
>>> Well maybe there's no crash. I think if you go through all your
>>> dma_fence that you have and force-complete them, then I think external
>>> callers wont go into the driver anymore. But there's still pointers
>>> potentially pointing at your device struct and all that, but should
>>> work. Still needs some audit ofc.
>>>
>>> Wrt how you get such a free-standing fence, that's amdgpu specific. Roughly
>>> - submit cs
>>> - get the fence for that (either sync_file, but I don't think amdgpu
>>> supports that, or maybe through drm_syncobj)
>>> - hotunplug
>>> - wait on that fence somehow (drm_syncobj has direct uapi for this,
>>> same for sync_file I think)
>>>
>>> Cheers, Daniel
>>
>> Indeed worked fine, did with 2 devices. Since syncobj is refcounted, even
>> after I
>> destroyed the original syncobj and unplugged the device, the exported
>> syncobj and the
>> fence inside didn't go anywhere.
>>
>> See my 3 tests in my branch on Gitlab
>> https://gitlab.freedesktop.org/agrodzov/igt-gpu-tools/-/commits/master
>> and let me know if I should go ahead and do a merge request (into which
>> target project/branch ?) or you
>> have more comments.
> igt still works with patch submission.
> -Daniel


I see. I need to divert to other work for a while; I will get to it
once I am back on device unplug.

Andrey


>
>> Andrey
>>
>>
>>>> Andrey
>>>>
>>>> On 2/9/21 4:50 AM, Daniel Vetter wrote:
>>>>> Yeah in the end we'd need 2 hw devices for testing full fence
>>>>> functionality. A useful intermediate step would be to just export the
>>>>> fence (either as sync_file, which I think amdgpu doesn't support because
>>>>> no android egl support in mesa) or drm_syncobj (which you can do as
>>>>> standalone fd too iirc), and then just using the fence a bit from
>>>>> userspace (like wait on it or get its status) after the device is
>>>>> unplugged.
>>>
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 196+ messages in thread

end of thread (newest: 2021-02-25 16:12 UTC)

Thread overview: 196+ messages
2021-01-18 21:01 [PATCH v4 00/14] RFC Support hot device unplug in amdgpu Andrey Grodzovsky
2021-01-18 21:01 ` Andrey Grodzovsky
2021-01-18 21:01 ` [PATCH v4 01/14] drm/ttm: Remap all page faults to per process dummy page Andrey Grodzovsky
2021-01-18 21:01   ` Andrey Grodzovsky
2021-01-18 21:48   ` Alex Deucher
2021-01-18 21:48     ` Alex Deucher
2021-01-19  8:41   ` Christian König
2021-01-19  8:41     ` Christian König
2021-01-19 13:56   ` Daniel Vetter
2021-01-19 13:56     ` Daniel Vetter
2021-01-25 15:28     ` Andrey Grodzovsky
2021-01-25 15:28       ` Andrey Grodzovsky
2021-01-27 14:29       ` Andrey Grodzovsky
2021-01-27 14:29         ` Andrey Grodzovsky
2021-02-02 14:21         ` Daniel Vetter
2021-02-02 14:21           ` Daniel Vetter
2021-01-18 21:01 ` [PATCH v4 02/14] drm: Unmap the entire device address space on device unplug Andrey Grodzovsky
2021-01-18 21:01   ` Andrey Grodzovsky
2021-01-18 21:01 ` [PATCH v4 03/14] drm/ttm: Expose ttm_tt_unpopulate for driver use Andrey Grodzovsky
2021-01-18 21:01   ` Andrey Grodzovsky
2021-01-18 21:01 ` [PATCH v4 04/14] drm/sched: Cancel and flush all outstanding jobs before finish Andrey Grodzovsky
2021-01-18 21:01   ` Andrey Grodzovsky
2021-01-18 21:49   ` Alex Deucher
2021-01-18 21:49     ` Alex Deucher
2021-01-19  8:42   ` Christian König
2021-01-19  8:42     ` Christian König
2021-01-19  9:50     ` Christian König
2021-01-19  9:50       ` Christian König
2021-01-18 21:01 ` [PATCH v4 05/14] drm/amdgpu: Split amdgpu_device_fini into early and late Andrey Grodzovsky
2021-01-18 21:01   ` Andrey Grodzovsky
2021-01-19  8:45   ` Christian König
2021-01-19  8:45     ` Christian König
2021-01-18 21:01 ` [PATCH v4 06/14] drm/amdgpu: Add early fini callback Andrey Grodzovsky
2021-01-18 21:01   ` Andrey Grodzovsky
2021-01-18 21:01 ` [PATCH v4 07/14] drm/amdgpu: Register IOMMU topology notifier per device Andrey Grodzovsky
2021-01-18 21:01   ` Andrey Grodzovsky
2021-01-18 21:52   ` Alex Deucher
2021-01-18 21:52     ` Alex Deucher
2021-01-19  8:48   ` Christian König
2021-01-19  8:48     ` Christian König
2021-01-19 13:45     ` Daniel Vetter
2021-01-19 13:45       ` Daniel Vetter
2021-01-19 21:21       ` Andrey Grodzovsky
2021-01-19 21:21         ` Andrey Grodzovsky
2021-01-19 22:01         ` Daniel Vetter
2021-01-19 22:01           ` Daniel Vetter
2021-01-20  4:21           ` Andrey Grodzovsky
2021-01-20  4:21             ` Andrey Grodzovsky
2021-01-20  8:38             ` Daniel Vetter
2021-01-20  8:38               ` Daniel Vetter
     [not found]               ` <1a5f7ccb-1f91-91be-1cb1-e7cb43ac2c13@amd.com>
2021-01-21 10:48                 ` Daniel Vetter
2021-01-21 10:48                   ` Daniel Vetter
2021-01-20  5:01     ` Andrey Grodzovsky
2021-01-20  5:01       ` Andrey Grodzovsky
2021-01-20 19:38       ` Andrey Grodzovsky
2021-01-20 19:38         ` Andrey Grodzovsky
2021-01-21 10:42         ` Christian König
2021-01-21 10:42           ` Christian König
2021-01-18 21:01 ` [PATCH v4 08/14] drm/amdgpu: Fix a bunch of sdma code crash post device unplug Andrey Grodzovsky
2021-01-18 21:01   ` Andrey Grodzovsky
2021-01-19  8:51   ` Christian König
2021-01-19  8:51     ` Christian König
2021-01-18 21:01 ` [PATCH v4 09/14] drm/amdgpu: Remap all page faults to per process dummy page Andrey Grodzovsky
2021-01-18 21:01   ` Andrey Grodzovsky
2021-01-19  8:52   ` Christian König
2021-01-19  8:52     ` Christian König
2021-01-18 21:01 ` [PATCH v4 10/14] drm/amdgpu: Move some sysfs attrs creation to default_attr Andrey Grodzovsky
2021-01-18 21:01   ` Andrey Grodzovsky
2021-01-19  7:34   ` Greg KH
2021-01-19  7:34     ` Greg KH
2021-01-19 16:36     ` Andrey Grodzovsky
2021-01-19 16:36       ` Andrey Grodzovsky
2021-01-19 17:47       ` Greg KH
2021-01-19 17:47         ` Greg KH
2021-01-19 19:04         ` Alex Deucher
2021-01-19 19:04           ` Alex Deucher
2021-01-19 19:16           ` Andrey Grodzovsky
2021-01-19 19:16             ` Andrey Grodzovsky
2021-01-19 19:41           ` Greg KH
2021-01-19 19:41             ` Greg KH
2021-01-19  8:53   ` Christian König
2021-01-19  8:53     ` Christian König
2021-01-18 21:01 ` [PATCH v4 11/14] drm/amdgpu: Guard against write accesses after device removal Andrey Grodzovsky
2021-01-18 21:01   ` Andrey Grodzovsky
2021-01-19  8:55   ` Christian König
2021-01-19  8:55     ` Christian König
2021-01-19 15:35     ` Andrey Grodzovsky
2021-01-19 15:35       ` Andrey Grodzovsky
2021-01-19 15:39       ` Christian König
2021-01-19 15:39         ` Christian König
2021-01-19 18:05       ` Daniel Vetter
2021-01-19 18:05         ` Daniel Vetter
2021-01-19 18:22         ` Andrey Grodzovsky
2021-01-19 18:22           ` Andrey Grodzovsky
2021-01-19 18:59           ` Christian König
2021-01-19 18:59             ` Christian König
2021-01-19 19:16             ` Andrey Grodzovsky
2021-01-19 19:16               ` Andrey Grodzovsky
2021-01-20 19:34               ` Andrey Grodzovsky
2021-01-20 19:34                 ` Andrey Grodzovsky
2021-01-28 17:23             ` Andrey Grodzovsky
2021-01-28 17:23               ` Andrey Grodzovsky
2021-01-29 15:16               ` Christian König
2021-01-29 15:16                 ` Christian König
2021-01-29 17:35                 ` Andrey Grodzovsky
2021-01-29 17:35                   ` Andrey Grodzovsky
2021-01-29 19:25                   ` Christian König
2021-01-29 19:25                     ` Christian König
2021-02-05 16:22                     ` Andrey Grodzovsky
2021-02-05 16:22                       ` Andrey Grodzovsky
2021-02-05 22:10                       ` Daniel Vetter
2021-02-05 22:10                         ` Daniel Vetter
2021-02-05 23:09                         ` Andrey Grodzovsky
2021-02-05 23:09                           ` Andrey Grodzovsky
2021-02-06 14:18                           ` Daniel Vetter
2021-02-06 14:18                             ` Daniel Vetter
2021-02-07 21:28                         ` Andrey Grodzovsky
2021-02-07 21:28                           ` Andrey Grodzovsky
2021-02-07 21:50                           ` Daniel Vetter
2021-02-07 21:50                             ` Daniel Vetter
2021-02-08  9:37                             ` Christian König
2021-02-08  9:37                               ` Christian König
2021-02-08  9:48                               ` Daniel Vetter
2021-02-08  9:48                                 ` Daniel Vetter
2021-02-08 10:03                                 ` Christian König
2021-02-08 10:03                                   ` Christian König
2021-02-08 10:11                                   ` Daniel Vetter
2021-02-08 10:11                                     ` Daniel Vetter
2021-02-08 13:59                                     ` Christian König
2021-02-08 13:59                                       ` Christian König
2021-02-08 16:23                                       ` Daniel Vetter
2021-02-08 16:23                                         ` Daniel Vetter
2021-02-08 22:15                                         ` Andrey Grodzovsky
2021-02-08 22:15                                           ` Andrey Grodzovsky
2021-02-09  7:58                                           ` Christian König
2021-02-09  7:58                                             ` Christian König
2021-02-09 14:30                                             ` Andrey Grodzovsky
2021-02-09 14:30                                               ` Andrey Grodzovsky
2021-02-09 15:40                                               ` Christian König
2021-02-09 15:40                                                 ` Christian König
2021-02-10 22:01                                                 ` Andrey Grodzovsky
2021-02-10 22:01                                                   ` Andrey Grodzovsky
2021-02-12 15:00                                                   ` Andrey Grodzovsky
2021-02-12 15:00                                                     ` Andrey Grodzovsky
2021-02-08 22:09                               ` Andrey Grodzovsky
2021-02-08 22:09                                 ` Andrey Grodzovsky
2021-02-09  8:27                                 ` Christian König
2021-02-09  8:27                                   ` Christian König
2021-02-09  9:46                                   ` Daniel Vetter
2021-02-09  9:46                                     ` Daniel Vetter
2021-01-18 21:01 ` [PATCH v4 12/14] drm/scheduler: Job timeout handler returns status Andrey Grodzovsky
2021-01-18 21:01   ` Andrey Grodzovsky
2021-01-19  7:53   ` Christian König
2021-01-19  7:53     ` Christian König
2021-01-19 17:47     ` Luben Tuikov
2021-01-19 17:47       ` Luben Tuikov
2021-01-19 18:53       ` Christian König
2021-01-19 18:53         ` Christian König
2021-01-18 21:01 ` [PATCH v4 13/14] drm/sched: Make timeout timer rearm conditional Andrey Grodzovsky
2021-01-18 21:01   ` Andrey Grodzovsky
2021-01-18 21:01 ` [PATCH v4 14/14] drm/amdgpu: Prevent any job recoveries after device is unplugged Andrey Grodzovsky
2021-01-18 21:01   ` Andrey Grodzovsky
2021-01-19 14:16 ` [PATCH v4 00/14] RFC Support hot device unplug in amdgpu Daniel Vetter
2021-01-19 14:16   ` Daniel Vetter
2021-01-19 17:31   ` Andrey Grodzovsky
2021-01-19 17:31     ` Andrey Grodzovsky
2021-01-19 18:08     ` Daniel Vetter
2021-01-19 18:08       ` Daniel Vetter
2021-01-19 18:18       ` Andrey Grodzovsky
2021-01-19 18:18         ` Andrey Grodzovsky
2021-01-20  9:05         ` Daniel Vetter
2021-01-20  9:05           ` Daniel Vetter
2021-01-20 14:19           ` Andrey Grodzovsky
2021-01-20 14:19             ` Andrey Grodzovsky
2021-01-20 15:59             ` Daniel Vetter
2021-01-20 15:59               ` Daniel Vetter
2021-02-08  5:59               ` Andrey Grodzovsky
2021-02-08  5:59                 ` Andrey Grodzovsky
2021-02-08  7:27                 ` Daniel Vetter
2021-02-08  7:27                   ` Daniel Vetter
2021-02-09  4:01                   ` Andrey Grodzovsky
2021-02-09  4:01                     ` Andrey Grodzovsky
2021-02-09  9:50                     ` Daniel Vetter
2021-02-09  9:50                       ` Daniel Vetter
2021-02-09 15:34                       ` Andrey Grodzovsky
2021-02-09 15:34                         ` Andrey Grodzovsky
2021-02-18 20:03                       ` Andrey Grodzovsky
2021-02-18 20:03                         ` Andrey Grodzovsky
2021-02-19 10:24                         ` Daniel Vetter
2021-02-19 10:24                           ` Daniel Vetter
2021-02-24 16:30                           ` Andrey Grodzovsky
2021-02-24 16:30                             ` Andrey Grodzovsky
2021-02-25 10:25                             ` Daniel Vetter
2021-02-25 10:25                               ` Daniel Vetter
2021-02-25 16:12                               ` Andrey Grodzovsky
2021-02-25 16:12                                 ` Andrey Grodzovsky
