* [PATCH v6 00/16] RFC Support hot device unplug in amdgpu
@ 2021-05-10 16:36 ` Andrey Grodzovsky
  0 siblings, 0 replies; 126+ messages in thread
From: Andrey Grodzovsky @ 2021-05-10 16:36 UTC (permalink / raw)
  To: dri-devel, amd-gfx, linux-pci, ckoenig.leichtzumerken,
	daniel.vetter, Harry.Wentland
  Cc: ppaalanen, Alexander.Deucher, gregkh, helgaas, Felix.Kuehling,
	Andrey Grodzovsky

Until now, extracting a card, either physically (e.g. an eGPU on a
Thunderbolt connection) or by emulation through sysfs
(/sys/bus/pci/devices/device_id/remove), would cause random crashes in user
apps. The crashes were mostly due to an app having mapped a device-backed BO
into its address space and still trying to access the BO after the backing
device was gone.
To address this first problem, Christian suggested fixing the handling of
mapped memory in the clients when the device goes away by forcibly unmapping
all buffers the user processes have, by clearing the respective VMAs that map
the device BOs. Then, when the VMAs try to fill in the page tables again, we
check in the fault handler whether the device is removed and, if so, return an
error. This generates a SIGBUS to the application, which can then terminate
cleanly. This was indeed done, but it in turn created a problem of kernel
oopses: while the app was terminating because of the SIGBUS, it would trigger
a use-after-free in the driver by accessing device structures that had already
been released by the PCI remove sequence. This was handled by introducing a
'flush' sequence during device removal, in which we wait for the drm file
reference count to drop to 0, meaning that all user clients directly using the
device have terminated.
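
As a rough userspace illustration of the failure mode described above (not
code from this series), the sketch below maps a temporary file, truncates it
so that the backing store disappears, and then touches the mapping. On Linux
the access raises SIGBUS, which is caught so the caller can clean up instead
of crashing. The helper name touch_after_backing_gone() and the temporary
file path are assumptions made for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <setjmp.h>
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static sigjmp_buf bus_jmp;

static void on_sigbus(int sig)
{
	(void)sig;
	siglongjmp(bus_jmp, 1);
}

/*
 * Hypothetical helper, for illustration only.
 * Returns 1 if touching the mapping raised SIGBUS, 0 if the access
 * succeeded, -1 on setup failure.
 */
int touch_after_backing_gone(void)
{
	char path[] = "/tmp/unplug-demo-XXXXXX";
	long psz = sysconf(_SC_PAGESIZE);
	struct sigaction sa, old;
	volatile char *p;
	int fd, ret = -1;

	fd = mkstemp(path);
	if (fd < 0)
		return -1;
	unlink(path);

	if (ftruncate(fd, psz) < 0)
		goto out_close;

	p = mmap(NULL, psz, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (p == MAP_FAILED)
		goto out_close;

	p[0] = 1;	/* backing store present: the access works */

	if (ftruncate(fd, 0) < 0)	/* the backing store goes away */
		goto out_unmap;

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = on_sigbus;
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGBUS, &sa, &old) < 0)
		goto out_unmap;

	if (sigsetjmp(bus_jmp, 1) == 0) {
		p[0] = 2;	/* faults: nothing backs this address now */
		ret = 0;
	} else {
		ret = 1;	/* SIGBUS was delivered and handled */
	}
	sigaction(SIGBUS, &old, NULL);

out_unmap:
	munmap((void *)p, psz);
out_close:
	close(fd);
	return ret;
}
```

A real client would do its cleanup in the branch that sets ret to 1; the
truncated file only stands in for a device whose memory is no longer there.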

v2:
Based on discussions on the mailing list with Daniel and Pekka [1], and on the
document Pekka produced from those discussions [2], the whole approach of
returning SIGBUS and waiting for all user clients with CPU mappings of device
BOs to die was dropped. Instead, as the document suggests, the device
structures are kept alive until the last reference to the device is dropped by
a user client, and in the meantime all existing and new CPU mappings of BOs
belonging to the device, directly or through dma-buf import, are rerouted to a
per-user-process dummy rw page. Also, I skipped the 'Requirements for KMS UAPI'
section of [2], since I am trying to get the minimal set of requirements that
still gives a useful solution to work, and this is the 'Requirements for Render
and Cross-Device UAPI' section. My test case is therefore removing a secondary
device, which is render-only and not involved in KMS.
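
The rerouting idea can be pictured with a small userspace analogy (again not
code from this series): an existing shared mapping is replaced in place by an
anonymous zero-filled page using MAP_FIXED, so later reads and writes through
the same pointer keep working after the original backing store is gone. The
helper name reroute_to_dummy_page() and the temporary file are assumptions
made for the example.

```c
#define _GNU_SOURCE
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper, for illustration only.
 * Returns 0 if, after rerouting, reads see a zeroed dummy page and
 * writes to it succeed; -1 otherwise.
 */
int reroute_to_dummy_page(void)
{
	char path[] = "/tmp/dummy-page-demo-XXXXXX";
	long psz = sysconf(_SC_PAGESIZE);
	volatile unsigned char *p;
	void *dummy;
	int fd, ret = -1;

	fd = mkstemp(path);
	if (fd < 0)
		return -1;
	unlink(path);

	if (ftruncate(fd, psz) < 0)
		goto out_close;

	p = mmap(NULL, psz, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (p == MAP_FAILED)
		goto out_close;

	p[0] = 0xab;	/* the "device" memory holds real data */

	/* Device goes away: replace the mapping in place with a dummy page */
	dummy = mmap((void *)p, psz, PROT_READ | PROT_WRITE,
		     MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
	if (dummy == MAP_FAILED)
		goto out_unmap;

	/* The old backing store can now vanish without hurting the app */
	if (ftruncate(fd, 0) < 0)
		goto out_unmap;

	if (p[0] != 0)		/* reads see the zeroed dummy page */
		goto out_unmap;

	p[0] = 0x5a;		/* writes succeed but land in the dummy page */
	if (p[0] != 0x5a)
		goto out_unmap;

	ret = 0;

out_unmap:
	munmap((void *)p, psz);
out_close:
	close(fd);
	return ret;
}
```

The design trade-off is visible here: the application never sees a signal, but
it also gets no indication from the mapping itself that the data it reads and
writes is no longer connected to a device.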

v3:
More updates following comments on v2, such as removing the loop to find the DRM
file when rerouting page faults to the dummy page, getting rid of unnecessary
sysfs handling refactoring, and moving prevention of GPU recovery after device
unplug from amdgpu to the scheduler layer.
On top of that, added unplug support for IOMMU-enabled systems.

v4:
Drop the last sysfs hack and use the sysfs default attribute.
Guard against write accesses after device removal to avoid modifying released memory.
Update dummy page handling to on-demand allocation and release through the drm managed framework.
Add a return value to the scheduler job timeout (TO) handler (by Luben Tuikov) and use it in amdgpu
to prevent GPU recovery after device unplug.
Also rebase on top of drm-misc-next instead of amd-staging-drm-next.

v5:
The most significant change in this series is the improved protection against the kernel driver
accessing MMIO ranges that were allocated for the device once the device is gone. To do this, a
patch 'drm/amdgpu: Unmap all MMIO mappings' is introduced first. This patch unmaps all MMIO mapped
into the kernel address space in the form of BARs and kernel BOs with CPU-visible VRAM mappings.
This helped to discover multiple such access points, because a page fault is generated immediately
on access. Most of them were solved by moving HW fini code into the pci_remove stage (patch
'drm/amdgpu: Add early fini callback'), and for those that were harder to unwind,
drm_dev_enter/exit scoping was used. In addition, all the IOCTLs and all background work and timers
are now protected with drm_dev_enter/exit at their root, with the intent that once drm_dev_unplug
has finished none of them run anymore and the pci_remove thread is the only thread executing that
might touch the HW. To prevent deadlocks in that case against threads stuck on various HW or SW
fences, patches 'drm/amdgpu: Finalise device fences on device remove' and 'drm/amdgpu: Add rw_sem
to pushing job into sched queue' take care of force-signaling all such existing fences and
rejecting any newly added ones.
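
The scoping described above can be modelled in a few lines of userspace C. This
is a toy model under stated assumptions, not the kernel implementation: the
real drm_dev_enter()/drm_dev_exit()/drm_dev_unplug() are built on SRCU, while
the sketch uses a C11 atomic flag and an in-flight counter. All toy_* names
and the hw_reg field are made up for the example.

```c
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Toy stand-in for a device; not struct drm_device. */
struct toy_dev {
	atomic_bool unplugged;
	atomic_int  active;	/* sections currently inside enter/exit */
	int         hw_reg;	/* stands in for an MMIO register */
};

/* Returns true if the caller may touch the hardware. */
bool toy_dev_enter(struct toy_dev *dev)
{
	/* Announce the section first, then check the flag. */
	atomic_fetch_add(&dev->active, 1);
	if (atomic_load(&dev->unplugged)) {
		atomic_fetch_sub(&dev->active, 1);
		return false;
	}
	return true;
}

void toy_dev_exit(struct toy_dev *dev)
{
	atomic_fetch_sub(&dev->active, 1);
}

/* Marks the device gone and waits for in-flight sections to drain. */
void toy_dev_unplug(struct toy_dev *dev)
{
	atomic_store(&dev->unplugged, true);
	while (atomic_load(&dev->active) > 0)
		;	/* busy-wait; the kernel sleeps here instead */
}

/* An IOCTL-like entry point guarded at its root. */
int toy_ioctl_write_reg(struct toy_dev *dev, int val)
{
	if (!toy_dev_enter(dev))
		return -ENODEV;

	dev->hw_reg = val;	/* only reached while the device is present */
	toy_dev_exit(dev);
	return 0;
}
```

In this model it is easy to see why a section that blocks indefinitely between
enter and exit would stall toy_dev_unplug(); a wait on a fence that is only
signalled later in the removal path would be one such case.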

v6:
Drop using drm_dev_enter/exit in conjunction with signalling HW fences before setting drm_dev_unplug.
We need to devise a more robust cross-DRM approach to the problem of dma fence waits falling
inside drm_dev_enter/exit scopes -> moved to TODO.

With these patches I am able to gracefully remove the secondary card using the sysfs remove hook
while glxgears is running off the secondary card (DRI_PRIME=1), without kernel oopses or hangs,
and keep working with the primary card or soft reset the device without hangs or oopses.
Also, as per Daniel's comment, I added 3 tests to the IGT [4] core_hotunplug test suite: remove
the device while commands are submitted, with an exported BO, and with an exported fence (not pushed yet).
It is also now possible to plug the device back in after unplug.
Some users can now successfully use these patches with eGPU boxes [3].
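
For readers who want to drive the sysfs remove hook from a test program rather
than a shell, a minimal sketch follows. It only builds the path
/sys/bus/pci/devices/<BDF>/remove and writes "1" to it; the helper names and
the PCI address used in the note below are assumptions, and the write needs
root on a real system.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper, for illustration only.
 * Builds the sysfs remove path for a PCI address such as "0000:03:00.0".
 * Returns the path length, or -1 if it does not fit in buf.
 */
int pci_remove_path(const char *bdf, char *buf, size_t len)
{
	int n = snprintf(buf, len, "/sys/bus/pci/devices/%s/remove", bdf);

	if (n < 0 || (size_t)n >= len)
		return -1;
	return n;
}

/* Writes "1" to a sysfs control file. Returns 0 on success, -1 on error. */
int sysfs_write_one(const char *path)
{
	FILE *f = fopen(path, "w");

	if (!f)
		return -1;
	if (fputs("1", f) == EOF) {
		fclose(f);
		return -1;
	}
	return fclose(f) == 0 ? 0 : -1;
}
```

A caller would pass the secondary card's address to pci_remove_path() and then
hand the result to sysfs_write_one(). Writing "1" to /sys/bus/pci/rescan with
the same helper asks the kernel to rescan the bus, which is one way to bring a
removed device back.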

TODOs for followup work:
Convert AMDGPU code to use devm (for hw stuff) and drmm (for sw stuff and allocations) (Daniel).
Add support for the 'Requirements for KMS UAPI' section of [2]: unplugging a primary, display-connected card.
Annotate drm_dev_enter/exit against dma_fence_waits as a first step in deciding where to use drm_dev_enter/exit
in code for device unplug.

[1] - Discussions during v5 of the patchset https://lore.kernel.org/amd-gfx/20210428151207.1212258-1-andrey.grodzovsky@amd.com/
[2] - drm/doc: device hot-unplug for userspace https://www.spinics.net/lists/dri-devel/msg259755.html
[3] - Related gitlab ticket https://gitlab.freedesktop.org/drm/amd/-/issues/1081
[4] - Related IGT tests https://gitlab.freedesktop.org/agrodzov/igt-gpu-tools/-/commits/master

Andrey Grodzovsky (16):
  drm/ttm: Remap all page faults to per process dummy page.
  drm/ttm: Expose ttm_tt_unpopulate for driver use
  drm/amdgpu: Split amdgpu_device_fini into early and late
  drm/amdkfd: Split kfd suspend from devie exit
  drm/amdgpu: Add early fini callback
  drm/amdgpu: Handle IOMMU enabled case.
  drm/amdgpu: Remap all page faults to per process dummy page.
  PCI: Add support for dev_groups to struct pci_device_driver
  drm/amdgpu: Convert driver sysfs attributes to static attributes
  drm/amdgpu: Guard against write accesses after device removal
  drm/sched: Make timeout timer rearm conditional.
  drm/amdgpu: Prevent any job recoveries after device is unplugged.
  drm/amdgpu: Fix hang on device removal.
  drm/scheduler: Fix hang when sched_entity released
  drm/amd/display: Remove superflous drm_mode_config_cleanup
  drm/amdgpu: Verify DMA opearations from device are done

 drivers/gpu/drm/amd/amdgpu/amdgpu.h           |  6 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c    |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h    |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c  | 17 ++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c    | 98 +++++++++++++------
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c       | 26 ++++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c     | 31 ++++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c      |  3 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h      |  1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c       |  9 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c   | 25 ++---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c        | 17 +++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c       | 35 +++++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_irq.h       |  3 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.c       | 19 +++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c       | 12 ++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c       | 63 +++++++-----
 drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h       |  2 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c       |  1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c      | 70 +++++++++++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h      | 52 ++--------
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c       | 21 +++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c       | 31 ++++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c       | 11 ++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c       | 22 +++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c        |  7 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c  | 14 +--
 drivers/gpu/drm/amd/amdgpu/cik_ih.c           |  3 +-
 drivers/gpu/drm/amd/amdgpu/cz_ih.c            |  3 +-
 drivers/gpu/drm/amd/amdgpu/iceland_ih.c       |  3 +-
 drivers/gpu/drm/amd/amdgpu/navi10_ih.c        |  5 +-
 drivers/gpu/drm/amd/amdgpu/psp_v11_0.c        | 44 ++++-----
 drivers/gpu/drm/amd/amdgpu/psp_v12_0.c        |  8 +-
 drivers/gpu/drm/amd/amdgpu/psp_v3_1.c         |  8 +-
 drivers/gpu/drm/amd/amdgpu/si_ih.c            |  3 +-
 drivers/gpu/drm/amd/amdgpu/tonga_ih.c         |  3 +-
 drivers/gpu/drm/amd/amdgpu/vce_v4_0.c         | 26 +++--
 drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c         | 22 +++--
 drivers/gpu/drm/amd/amdgpu/vega10_ih.c        |  5 +-
 drivers/gpu/drm/amd/amdgpu/vega20_ih.c        |  2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_device.c       |  3 +-
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 13 ++-
 drivers/gpu/drm/amd/include/amd_shared.h      |  2 +
 .../drm/amd/pm/powerplay/smumgr/smu7_smumgr.c |  2 +
 drivers/gpu/drm/scheduler/sched_entity.c      |  3 +-
 drivers/gpu/drm/scheduler/sched_main.c        | 35 ++++++-
 drivers/gpu/drm/ttm/ttm_bo_vm.c               | 57 ++++++++++-
 drivers/gpu/drm/ttm/ttm_tt.c                  |  1 +
 drivers/pci/pci-driver.c                      |  1 +
 include/drm/ttm/ttm_bo_api.h                  |  2 +
 include/linux/pci.h                           |  3 +
 51 files changed, 585 insertions(+), 272 deletions(-)

-- 
2.25.1


* [PATCH v6 01/16] drm/ttm: Remap all page faults to per process dummy page.
  2021-05-10 16:36 ` Andrey Grodzovsky
  (?)
@ 2021-05-10 16:36   ` Andrey Grodzovsky
  -1 siblings, 0 replies; 126+ messages in thread
From: Andrey Grodzovsky @ 2021-05-10 16:36 UTC (permalink / raw)
  To: dri-devel, amd-gfx, linux-pci, ckoenig.leichtzumerken,
	daniel.vetter, Harry.Wentland
  Cc: ppaalanen, Alexander.Deucher, gregkh, helgaas, Felix.Kuehling,
	Andrey Grodzovsky

On device removal, reroute all CPU mappings to a dummy page.

v3:
Remove loop to find DRM file and instead access it
by vma->vm_file->private_data. Move dummy page installation
into a separate function.

v4:
Map the entire BO's VA space to an on-demand allocated dummy page
on the first fault for that BO.

v5: Remove duplicate return.

v6: Polish ttm_bo_vm_dummy_page, remove superfluous code.
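
The "whole VA range backed by one page" idea from v4 can be tried out in
userspace. The sketch below is an analogy under stated assumptions, not what
the kernel patch does internally: it reserves a range of virtual pages and
then maps every one of them onto offset 0 of a single-page temporary file, so
all pages alias the same memory. The helper name map_range_to_one_page() and
the temporary file are made up for the example.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper, for illustration only.
 * Maps npages of virtual address space so that every page aliases the
 * same zero-filled backing page. Returns the base address, or NULL.
 */
volatile unsigned char *map_range_to_one_page(size_t npages)
{
	char path[] = "/tmp/one-page-demo-XXXXXX";
	long psz = sysconf(_SC_PAGESIZE);
	unsigned char *base;
	size_t i;
	int fd;

	fd = mkstemp(path);
	if (fd < 0)
		return NULL;
	unlink(path);

	if (ftruncate(fd, psz) < 0)
		goto err_close;

	/* Reserve the whole VA range first */
	base = mmap(NULL, npages * psz, PROT_NONE,
		    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (base == MAP_FAILED)
		goto err_close;

	/* Point each page of the range at the single backing page */
	for (i = 0; i < npages; i++) {
		if (mmap(base + i * psz, psz, PROT_READ | PROT_WRITE,
			 MAP_SHARED | MAP_FIXED, fd, 0) == MAP_FAILED) {
			munmap(base, npages * psz);
			goto err_close;
		}
	}

	close(fd);	/* the mappings keep the backing page alive */
	return base;

err_close:
	close(fd);
	return NULL;
}
```

A write through any page of the range is then visible through every other
page, which is why one dummy page is enough to absorb accesses to a buffer of
any size.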

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
---
 drivers/gpu/drm/ttm/ttm_bo_vm.c | 57 ++++++++++++++++++++++++++++++++-
 include/drm/ttm/ttm_bo_api.h    |  2 ++
 2 files changed, 58 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c b/drivers/gpu/drm/ttm/ttm_bo_vm.c
index b31b18058965..e5a9615519d1 100644
--- a/drivers/gpu/drm/ttm/ttm_bo_vm.c
+++ b/drivers/gpu/drm/ttm/ttm_bo_vm.c
@@ -34,6 +34,8 @@
 #include <drm/ttm/ttm_bo_driver.h>
 #include <drm/ttm/ttm_placement.h>
 #include <drm/drm_vma_manager.h>
+#include <drm/drm_drv.h>
+#include <drm/drm_managed.h>
 #include <linux/mm.h>
 #include <linux/pfn_t.h>
 #include <linux/rbtree.h>
@@ -380,19 +382,72 @@ vm_fault_t ttm_bo_vm_fault_reserved(struct vm_fault *vmf,
 }
 EXPORT_SYMBOL(ttm_bo_vm_fault_reserved);
 
+static void ttm_bo_release_dummy_page(struct drm_device *dev, void *res)
+{
+	struct page *dummy_page = (struct page *)res;
+
+	__free_page(dummy_page);
+}
+
+vm_fault_t ttm_bo_vm_dummy_page(struct vm_fault *vmf, pgprot_t prot)
+{
+	struct vm_area_struct *vma = vmf->vma;
+	struct ttm_buffer_object *bo = vma->vm_private_data;
+	struct drm_device *ddev = bo->base.dev;
+	vm_fault_t ret = VM_FAULT_NOPAGE;
+	unsigned long address;
+	unsigned long pfn;
+	struct page *page;
+
+	/* Allocate new dummy page to map all the VA range in this VMA to it*/
+	page = alloc_page(GFP_KERNEL | __GFP_ZERO);
+	if (!page)
+		return VM_FAULT_OOM;
+
+	pfn = page_to_pfn(page);
+
+	/* Prefault the entire VMA range right away to avoid further faults */
+	for (address = vma->vm_start; address < vma->vm_end; address += PAGE_SIZE) {
+
+		if (unlikely(address >= vma->vm_end))
+			break;
+
+		if (vma->vm_flags & VM_MIXEDMAP)
+			ret = vmf_insert_mixed_prot(vma, address,
+						    __pfn_to_pfn_t(pfn, PFN_DEV),
+						    prot);
+		else
+			ret = vmf_insert_pfn_prot(vma, address, pfn, prot);
+	}
+
+	/* Set the page to be freed using drmm release action */
+	if (drmm_add_action_or_reset(ddev, ttm_bo_release_dummy_page, page))
+		return VM_FAULT_OOM;
+
+	return ret;
+}
+EXPORT_SYMBOL(ttm_bo_vm_dummy_page);
+
 vm_fault_t ttm_bo_vm_fault(struct vm_fault *vmf)
 {
 	struct vm_area_struct *vma = vmf->vma;
 	pgprot_t prot;
 	struct ttm_buffer_object *bo = vma->vm_private_data;
+	struct drm_device *ddev = bo->base.dev;
 	vm_fault_t ret;
+	int idx;
 
 	ret = ttm_bo_vm_reserve(bo, vmf);
 	if (ret)
 		return ret;
 
 	prot = vma->vm_page_prot;
-	ret = ttm_bo_vm_fault_reserved(vmf, prot, TTM_BO_VM_NUM_PREFAULT, 1);
+	if (drm_dev_enter(ddev, &idx)) {
+		ret = ttm_bo_vm_fault_reserved(vmf, prot, TTM_BO_VM_NUM_PREFAULT, 1);
+		drm_dev_exit(idx);
+	} else {
+		ret = ttm_bo_vm_dummy_page(vmf, prot);
+	}
 	if (ret == VM_FAULT_RETRY && !(vmf->flags & FAULT_FLAG_RETRY_NOWAIT))
 		return ret;
 
diff --git a/include/drm/ttm/ttm_bo_api.h b/include/drm/ttm/ttm_bo_api.h
index 639521880c29..254ede97f8e3 100644
--- a/include/drm/ttm/ttm_bo_api.h
+++ b/include/drm/ttm/ttm_bo_api.h
@@ -620,4 +620,6 @@ int ttm_bo_vm_access(struct vm_area_struct *vma, unsigned long addr,
 		     void *buf, int len, int write);
 bool ttm_bo_delayed_delete(struct ttm_device *bdev, bool remove_all);
 
+vm_fault_t ttm_bo_vm_dummy_page(struct vm_fault *vmf, pgprot_t prot);
+
 #endif
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 126+ messages in thread
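
The fault-handler hunk in the patch above branches on drm_dev_enter(): a
normal fault inside the enter/exit section, the dummy page otherwise. As a
minimal sketch of that guard pattern, assuming nothing beyond C11 atomics,
the code below models it in userspace. All toy_* names are hypothetical. The
real drm_dev_enter()/drm_dev_exit() pair is SRCU based and passes an index
between the two calls; this sketch uses a plain counter instead, so it
illustrates the control flow only.

```c
#include <assert.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the unplug state kept by the DRM core. */
struct toy_dev {
	atomic_bool unplugged;
	atomic_int active;	/* callers inside an enter/exit section */
};

static void toy_dev_init(struct toy_dev *d)
{
	atomic_init(&d->unplugged, false);
	atomic_init(&d->active, 0);
}

/* Returns true and opens a section if the device is still present. */
static bool toy_dev_enter(struct toy_dev *d)
{
	atomic_fetch_add(&d->active, 1);
	if (atomic_load(&d->unplugged)) {
		atomic_fetch_sub(&d->active, 1);
		return false;
	}
	return true;
}

static void toy_dev_exit(struct toy_dev *d)
{
	int prev = atomic_fetch_sub(&d->active, 1);

	assert(prev > 0);
	(void)prev;
}

/* Mark the device gone, then wait for sections already in flight. */
static void toy_dev_unplug(struct toy_dev *d)
{
	atomic_store(&d->unplugged, true);
	while (atomic_load(&d->active) > 0)
		sched_yield();
}

enum toy_fault_path { TOY_FAULT_DEVICE, TOY_FAULT_DUMMY };

/* Same branch shape as the patched ttm_bo_vm_fault(). */
static enum toy_fault_path toy_fault(struct toy_dev *d)
{
	if (toy_dev_enter(d)) {
		/* Device-backed memory may be touched only in here. */
		toy_dev_exit(d);
		return TOY_FAULT_DEVICE;
	}
	return TOY_FAULT_DUMMY;
}
```

The ordering matters: the flag is set before waiting, so once
toy_dev_unplug() returns, no caller can still be inside a section and every
later toy_fault() takes the dummy path.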


* [PATCH v6 02/16] drm/ttm: Expose ttm_tt_unpopulate for driver use
  2021-05-10 16:36 ` Andrey Grodzovsky
  (?)
@ 2021-05-10 16:36   ` Andrey Grodzovsky
  -1 siblings, 0 replies; 126+ messages in thread
From: Andrey Grodzovsky @ 2021-05-10 16:36 UTC (permalink / raw)
  To: dri-devel, amd-gfx, linux-pci, ckoenig.leichtzumerken,
	daniel.vetter, Harry.Wentland
  Cc: ppaalanen, Alexander.Deucher, gregkh, helgaas, Felix.Kuehling,
	Andrey Grodzovsky

This is needed to drop IOMMU-backed pages on device unplug
before the device's IOMMU group is released.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
---
 drivers/gpu/drm/ttm/ttm_tt.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/ttm/ttm_tt.c b/drivers/gpu/drm/ttm/ttm_tt.c
index 539e0232cb3b..dfbe1ea8763f 100644
--- a/drivers/gpu/drm/ttm/ttm_tt.c
+++ b/drivers/gpu/drm/ttm/ttm_tt.c
@@ -433,3 +433,4 @@ void ttm_tt_mgr_init(unsigned long num_pages, unsigned long num_dma32_pages)
 	if (!ttm_dma32_pages_limit)
 		ttm_dma32_pages_limit = num_dma32_pages;
 }
+EXPORT_SYMBOL(ttm_tt_unpopulate);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 126+ messages in thread


* [PATCH v6 03/16] drm/amdgpu: Split amdgpu_device_fini into early and late
  2021-05-10 16:36 ` Andrey Grodzovsky
  (?)
@ 2021-05-10 16:36   ` Andrey Grodzovsky
  -1 siblings, 0 replies; 126+ messages in thread
From: Andrey Grodzovsky @ 2021-05-10 16:36 UTC (permalink / raw)
  To: dri-devel, amd-gfx, linux-pci, ckoenig.leichtzumerken,
	daniel.vetter, Harry.Wentland
  Cc: ppaalanen, Alexander.Deucher, gregkh, helgaas, Felix.Kuehling,
	Andrey Grodzovsky, Christian König, Alex Deucher

Some of the work in amdgpu_device_fini, such as disabling HW
interrupts and finalizing pending fences, must be done right away on
pci_remove, while most of the work that finalizes and releases
driver data structures can be deferred until the
drm_driver.release hook is called, i.e. when the last device
reference is dropped.

v4: Change functions prefix early->hw and late->sw

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h        |  6 ++++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 26 +++++++++++++++-------
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c    |  7 ++----
 drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c  | 15 ++++++++++++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c    | 26 +++++++++++++---------
 drivers/gpu/drm/amd/amdgpu/amdgpu_irq.h    |  3 ++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c    | 12 +++++++++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c    |  1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h   |  3 ++-
 drivers/gpu/drm/amd/amdgpu/cik_ih.c        |  2 +-
 drivers/gpu/drm/amd/amdgpu/cz_ih.c         |  2 +-
 drivers/gpu/drm/amd/amdgpu/iceland_ih.c    |  2 +-
 drivers/gpu/drm/amd/amdgpu/navi10_ih.c     |  2 +-
 drivers/gpu/drm/amd/amdgpu/si_ih.c         |  2 +-
 drivers/gpu/drm/amd/amdgpu/tonga_ih.c      |  2 +-
 drivers/gpu/drm/amd/amdgpu/vega10_ih.c     |  2 +-
 drivers/gpu/drm/amd/amdgpu/vega20_ih.c     |  2 +-
 17 files changed, 79 insertions(+), 36 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 380801b59b07..d830a541ba89 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -1099,7 +1099,9 @@ static inline struct amdgpu_device *amdgpu_ttm_adev(struct ttm_device *bdev)
 
 int amdgpu_device_init(struct amdgpu_device *adev,
 		       uint32_t flags);
-void amdgpu_device_fini(struct amdgpu_device *adev);
+void amdgpu_device_fini_hw(struct amdgpu_device *adev);
+void amdgpu_device_fini_sw(struct amdgpu_device *adev);
+
 int amdgpu_gpu_wait_for_idle(struct amdgpu_device *adev);
 
 void amdgpu_device_vram_access(struct amdgpu_device *adev, loff_t pos,
@@ -1319,6 +1321,8 @@ void amdgpu_driver_lastclose_kms(struct drm_device *dev);
 int amdgpu_driver_open_kms(struct drm_device *dev, struct drm_file *file_priv);
 void amdgpu_driver_postclose_kms(struct drm_device *dev,
 				 struct drm_file *file_priv);
+void amdgpu_driver_release_kms(struct drm_device *dev);
+
 int amdgpu_device_ip_suspend(struct amdgpu_device *adev);
 int amdgpu_device_suspend(struct drm_device *dev, bool fbcon);
 int amdgpu_device_resume(struct drm_device *dev, bool fbcon);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index b4ad1c055c70..3760ce7d8ff8 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -3648,15 +3648,13 @@ int amdgpu_device_init(struct amdgpu_device *adev,
  * Tear down the driver info (all asics).
  * Called at driver shutdown.
  */
-void amdgpu_device_fini(struct amdgpu_device *adev)
+void amdgpu_device_fini_hw(struct amdgpu_device *adev)
 {
 	dev_info(adev->dev, "amdgpu: finishing device.\n");
 	flush_delayed_work(&adev->delayed_init_work);
 	ttm_bo_lock_delayed_workqueue(&adev->mman.bdev);
 	adev->shutdown = true;
 
-	kfree(adev->pci_state);
-
 	/* make sure IB test finished before entering exclusive mode
 	 * to avoid preemption on IB test
 	 * */
@@ -3673,11 +3671,24 @@ void amdgpu_device_fini(struct amdgpu_device *adev)
 		else
 			drm_atomic_helper_shutdown(adev_to_drm(adev));
 	}
-	amdgpu_fence_driver_fini(adev);
+	amdgpu_fence_driver_fini_hw(adev);
+
 	if (adev->pm_sysfs_en)
 		amdgpu_pm_sysfs_fini(adev);
+	if (adev->ucode_sysfs_en)
+		amdgpu_ucode_sysfs_fini(adev);
+	sysfs_remove_files(&adev->dev->kobj, amdgpu_dev_attributes);
+
+
 	amdgpu_fbdev_fini(adev);
+
+	amdgpu_irq_fini_hw(adev);
+}
+
+void amdgpu_device_fini_sw(struct amdgpu_device *adev)
+{
 	amdgpu_device_ip_fini(adev);
+	amdgpu_fence_driver_fini_sw(adev);
 	release_firmware(adev->firmware.gpu_info_fw);
 	adev->firmware.gpu_info_fw = NULL;
 	adev->accel_working = false;
@@ -3703,14 +3714,13 @@ void amdgpu_device_fini(struct amdgpu_device *adev)
 	adev->rmmio = NULL;
 	amdgpu_device_doorbell_fini(adev);
 
-	if (adev->ucode_sysfs_en)
-		amdgpu_ucode_sysfs_fini(adev);
-
-	sysfs_remove_files(&adev->dev->kobj, amdgpu_dev_attributes);
 	if (IS_ENABLED(CONFIG_PERF_EVENTS))
 		amdgpu_pmu_fini(adev);
 	if (adev->mman.discovery_bin)
 		amdgpu_discovery_fini(adev);
+
+	kfree(adev->pci_state);
+
 }
 
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 6cf573293823..5ebed4c7d9c0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -1311,14 +1311,10 @@ amdgpu_pci_remove(struct pci_dev *pdev)
 {
 	struct drm_device *dev = pci_get_drvdata(pdev);
 
-#ifdef MODULE
-	if (THIS_MODULE->state != MODULE_STATE_GOING)
-#endif
-		DRM_ERROR("Hotplug removal is not supported\n");
 	drm_dev_unplug(dev);
 	amdgpu_driver_unload_kms(dev);
+
 	pci_disable_device(pdev);
-	pci_set_drvdata(pdev, NULL);
 }
 
 static void
@@ -1748,6 +1744,7 @@ static const struct drm_driver amdgpu_kms_driver = {
 	.dumb_create = amdgpu_mode_dumb_create,
 	.dumb_map_offset = amdgpu_mode_dumb_mmap,
 	.fops = &amdgpu_driver_kms_fops,
+	.release = &amdgpu_driver_release_kms,
 
 	.prime_handle_to_fd = drm_gem_prime_handle_to_fd,
 	.prime_fd_to_handle = drm_gem_prime_fd_to_handle,
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
index 47ea46859618..1ffb36bd0b19 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
@@ -523,7 +523,7 @@ int amdgpu_fence_driver_init(struct amdgpu_device *adev)
  *
  * Tear down the fence driver for all possible rings (all asics).
  */
-void amdgpu_fence_driver_fini(struct amdgpu_device *adev)
+void amdgpu_fence_driver_fini_hw(struct amdgpu_device *adev)
 {
 	unsigned i, j;
 	int r;
@@ -545,6 +545,19 @@ void amdgpu_fence_driver_fini(struct amdgpu_device *adev)
 				       ring->fence_drv.irq_type);
 
 		del_timer_sync(&ring->fence_drv.fallback_timer);
+	}
+}
+
+void amdgpu_fence_driver_fini_sw(struct amdgpu_device *adev)
+{
+	unsigned int i, j;
+
+	for (i = 0; i < AMDGPU_MAX_RINGS; i++) {
+		struct amdgpu_ring *ring = adev->rings[i];
+
+		if (!ring || !ring->fence_drv.initialized)
+			continue;
+
 		for (j = 0; j <= ring->fence_drv.num_fences_mask; ++j)
 			dma_fence_put(ring->fence_drv.fences[j]);
 		kfree(ring->fence_drv.fences);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
index 90f50561b43a..233b64dab94b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
@@ -49,6 +49,7 @@
 #include <drm/drm_irq.h>
 #include <drm/drm_vblank.h>
 #include <drm/amdgpu_drm.h>
+#include <drm/drm_drv.h>
 #include "amdgpu.h"
 #include "amdgpu_ih.h"
 #include "atom.h"
@@ -348,6 +349,20 @@ int amdgpu_irq_init(struct amdgpu_device *adev)
 	return 0;
 }
 
+
+void amdgpu_irq_fini_hw(struct amdgpu_device *adev)
+{
+	if (adev->irq.installed) {
+		drm_irq_uninstall(&adev->ddev);
+		adev->irq.installed = false;
+		if (adev->irq.msi_enabled)
+			pci_free_irq_vectors(adev->pdev);
+
+		if (!amdgpu_device_has_dc_support(adev))
+			flush_work(&adev->hotplug_work);
+	}
+}
+
 /**
  * amdgpu_irq_fini - shut down interrupt handling
  *
@@ -357,19 +372,10 @@ int amdgpu_irq_init(struct amdgpu_device *adev)
  * functionality, shuts down vblank, hotplug and reset interrupt handling,
  * turns off interrupts from all sources (all ASICs).
  */
-void amdgpu_irq_fini(struct amdgpu_device *adev)
+void amdgpu_irq_fini_sw(struct amdgpu_device *adev)
 {
 	unsigned i, j;
 
-	if (adev->irq.installed) {
-		drm_irq_uninstall(adev_to_drm(adev));
-		adev->irq.installed = false;
-		if (adev->irq.msi_enabled)
-			pci_free_irq_vectors(adev->pdev);
-		if (!amdgpu_device_has_dc_support(adev))
-			flush_work(&adev->hotplug_work);
-	}
-
 	for (i = 0; i < AMDGPU_IRQ_CLIENTID_MAX; ++i) {
 		if (!adev->irq.client[i].sources)
 			continue;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.h
index cf6116648322..78ad4784cc74 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.h
@@ -103,7 +103,8 @@ void amdgpu_irq_disable_all(struct amdgpu_device *adev);
 irqreturn_t amdgpu_irq_handler(int irq, void *arg);
 
 int amdgpu_irq_init(struct amdgpu_device *adev);
-void amdgpu_irq_fini(struct amdgpu_device *adev);
+void amdgpu_irq_fini_sw(struct amdgpu_device *adev);
+void amdgpu_irq_fini_hw(struct amdgpu_device *adev);
 int amdgpu_irq_add_id(struct amdgpu_device *adev,
 		      unsigned client_id, unsigned src_id,
 		      struct amdgpu_irq_src *source);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
index 39ee88d29cca..f3ecada208b0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
@@ -28,6 +28,7 @@
 
 #include "amdgpu.h"
 #include <drm/amdgpu_drm.h>
+#include <drm/drm_drv.h>
 #include "amdgpu_uvd.h"
 #include "amdgpu_vce.h"
 #include "atom.h"
@@ -92,7 +93,7 @@ void amdgpu_driver_unload_kms(struct drm_device *dev)
 	}
 
 	amdgpu_acpi_fini(adev);
-	amdgpu_device_fini(adev);
+	amdgpu_device_fini_hw(adev);
 }
 
 void amdgpu_register_gpu_instance(struct amdgpu_device *adev)
@@ -1219,6 +1220,15 @@ void amdgpu_driver_postclose_kms(struct drm_device *dev,
 	pm_runtime_put_autosuspend(dev->dev);
 }
 
+
+void amdgpu_driver_release_kms(struct drm_device *dev)
+{
+	struct amdgpu_device *adev = drm_to_adev(dev);
+
+	amdgpu_device_fini_sw(adev);
+	pci_set_drvdata(adev->pdev, NULL);
+}
+
 /*
  * VBlank related functions.
  */
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
index 0541196ae1ed..844a667f655b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
@@ -2325,6 +2325,7 @@ int amdgpu_ras_pre_fini(struct amdgpu_device *adev)
 	if (!adev->ras_features || !con)
 		return 0;
 
+
 	/* Need disable ras on all IPs here before ip [hw/sw]fini */
 	amdgpu_ras_disable_all_features(adev, 0);
 	amdgpu_ras_recovery_fini(adev);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
index ca1622835296..e7d3d0dbdd96 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
@@ -107,7 +107,8 @@ struct amdgpu_fence_driver {
 };
 
 int amdgpu_fence_driver_init(struct amdgpu_device *adev);
-void amdgpu_fence_driver_fini(struct amdgpu_device *adev);
+void amdgpu_fence_driver_fini_hw(struct amdgpu_device *adev);
+void amdgpu_fence_driver_fini_sw(struct amdgpu_device *adev);
 void amdgpu_fence_driver_force_completion(struct amdgpu_ring *ring);
 
 int amdgpu_fence_driver_init_ring(struct amdgpu_ring *ring,
diff --git a/drivers/gpu/drm/amd/amdgpu/cik_ih.c b/drivers/gpu/drm/amd/amdgpu/cik_ih.c
index d3745711d55f..183d44a6583c 100644
--- a/drivers/gpu/drm/amd/amdgpu/cik_ih.c
+++ b/drivers/gpu/drm/amd/amdgpu/cik_ih.c
@@ -309,7 +309,7 @@ static int cik_ih_sw_fini(void *handle)
 {
 	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
 
-	amdgpu_irq_fini(adev);
+	amdgpu_irq_fini_sw(adev);
 	amdgpu_ih_ring_fini(adev, &adev->irq.ih);
 	amdgpu_irq_remove_domain(adev);
 
diff --git a/drivers/gpu/drm/amd/amdgpu/cz_ih.c b/drivers/gpu/drm/amd/amdgpu/cz_ih.c
index 307c01301c87..d32743949003 100644
--- a/drivers/gpu/drm/amd/amdgpu/cz_ih.c
+++ b/drivers/gpu/drm/amd/amdgpu/cz_ih.c
@@ -301,7 +301,7 @@ static int cz_ih_sw_fini(void *handle)
 {
 	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
 
-	amdgpu_irq_fini(adev);
+	amdgpu_irq_fini_sw(adev);
 	amdgpu_ih_ring_fini(adev, &adev->irq.ih);
 	amdgpu_irq_remove_domain(adev);
 
diff --git a/drivers/gpu/drm/amd/amdgpu/iceland_ih.c b/drivers/gpu/drm/amd/amdgpu/iceland_ih.c
index cc957471f31e..da96c6013477 100644
--- a/drivers/gpu/drm/amd/amdgpu/iceland_ih.c
+++ b/drivers/gpu/drm/amd/amdgpu/iceland_ih.c
@@ -300,7 +300,7 @@ static int iceland_ih_sw_fini(void *handle)
 {
 	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
 
-	amdgpu_irq_fini(adev);
+	amdgpu_irq_fini_sw(adev);
 	amdgpu_ih_ring_fini(adev, &adev->irq.ih);
 	amdgpu_irq_remove_domain(adev);
 
diff --git a/drivers/gpu/drm/amd/amdgpu/navi10_ih.c b/drivers/gpu/drm/amd/amdgpu/navi10_ih.c
index f4e4040bbd25..5eea4550b856 100644
--- a/drivers/gpu/drm/amd/amdgpu/navi10_ih.c
+++ b/drivers/gpu/drm/amd/amdgpu/navi10_ih.c
@@ -569,7 +569,7 @@ static int navi10_ih_sw_fini(void *handle)
 {
 	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
 
-	amdgpu_irq_fini(adev);
+	amdgpu_irq_fini_sw(adev);
 	amdgpu_ih_ring_fini(adev, &adev->irq.ih_soft);
 	amdgpu_ih_ring_fini(adev, &adev->irq.ih2);
 	amdgpu_ih_ring_fini(adev, &adev->irq.ih1);
diff --git a/drivers/gpu/drm/amd/amdgpu/si_ih.c b/drivers/gpu/drm/amd/amdgpu/si_ih.c
index 51880f6ef634..751307f3252c 100644
--- a/drivers/gpu/drm/amd/amdgpu/si_ih.c
+++ b/drivers/gpu/drm/amd/amdgpu/si_ih.c
@@ -175,7 +175,7 @@ static int si_ih_sw_fini(void *handle)
 {
 	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
 
-	amdgpu_irq_fini(adev);
+	amdgpu_irq_fini_sw(adev);
 	amdgpu_ih_ring_fini(adev, &adev->irq.ih);
 
 	return 0;
diff --git a/drivers/gpu/drm/amd/amdgpu/tonga_ih.c b/drivers/gpu/drm/amd/amdgpu/tonga_ih.c
index 249fcbee7871..973d80ec7f6c 100644
--- a/drivers/gpu/drm/amd/amdgpu/tonga_ih.c
+++ b/drivers/gpu/drm/amd/amdgpu/tonga_ih.c
@@ -312,7 +312,7 @@ static int tonga_ih_sw_fini(void *handle)
 {
 	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
 
-	amdgpu_irq_fini(adev);
+	amdgpu_irq_fini_sw(adev);
 	amdgpu_ih_ring_fini(adev, &adev->irq.ih);
 	amdgpu_irq_remove_domain(adev);
 
diff --git a/drivers/gpu/drm/amd/amdgpu/vega10_ih.c b/drivers/gpu/drm/amd/amdgpu/vega10_ih.c
index ca8efa5c6978..dead9c2fbd4c 100644
--- a/drivers/gpu/drm/amd/amdgpu/vega10_ih.c
+++ b/drivers/gpu/drm/amd/amdgpu/vega10_ih.c
@@ -513,7 +513,7 @@ static int vega10_ih_sw_fini(void *handle)
 {
 	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
 
-	amdgpu_irq_fini(adev);
+	amdgpu_irq_fini_sw(adev);
 	amdgpu_ih_ring_fini(adev, &adev->irq.ih_soft);
 	amdgpu_ih_ring_fini(adev, &adev->irq.ih2);
 	amdgpu_ih_ring_fini(adev, &adev->irq.ih1);
diff --git a/drivers/gpu/drm/amd/amdgpu/vega20_ih.c b/drivers/gpu/drm/amd/amdgpu/vega20_ih.c
index 8a122b413bf5..58993ae1fe11 100644
--- a/drivers/gpu/drm/amd/amdgpu/vega20_ih.c
+++ b/drivers/gpu/drm/amd/amdgpu/vega20_ih.c
@@ -565,7 +565,7 @@ static int vega20_ih_sw_fini(void *handle)
 {
 	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
 
-	amdgpu_irq_fini(adev);
+	amdgpu_irq_fini_sw(adev);
 	amdgpu_ih_ring_fini(adev, &adev->irq.ih_soft);
 	amdgpu_ih_ring_fini(adev, &adev->irq.ih2);
 	amdgpu_ih_ring_fini(adev, &adev->irq.ih1);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 126+ messages in thread

* [PATCH v6 03/16] drm/amdgpu: Split amdgpu_device_fini into early and late
@ 2021-05-10 16:36   ` Andrey Grodzovsky
  0 siblings, 0 replies; 126+ messages in thread
From: Andrey Grodzovsky @ 2021-05-10 16:36 UTC (permalink / raw)
  To: dri-devel, amd-gfx, linux-pci, ckoenig.leichtzumerken,
	daniel.vetter, Harry.Wentland
  Cc: gregkh, Felix.Kuehling, helgaas, Alex Deucher, Christian König

Some of the work in amdgpu_device_fini, such as disabling HW
interrupts and finalizing pending fences, must be done right away on
pci_remove, while most of the work that finalizes and releases
driver data structures can be deferred until the
drm_driver.release hook is called, i.e. when the last device
reference is dropped.

v4: Change functions prefix early->hw and late->sw

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h        |  6 ++++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 26 +++++++++++++++-------
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c    |  7 ++----
 drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c  | 15 ++++++++++++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c    | 26 +++++++++++++---------
 drivers/gpu/drm/amd/amdgpu/amdgpu_irq.h    |  3 ++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c    | 12 +++++++++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c    |  1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h   |  3 ++-
 drivers/gpu/drm/amd/amdgpu/cik_ih.c        |  2 +-
 drivers/gpu/drm/amd/amdgpu/cz_ih.c         |  2 +-
 drivers/gpu/drm/amd/amdgpu/iceland_ih.c    |  2 +-
 drivers/gpu/drm/amd/amdgpu/navi10_ih.c     |  2 +-
 drivers/gpu/drm/amd/amdgpu/si_ih.c         |  2 +-
 drivers/gpu/drm/amd/amdgpu/tonga_ih.c      |  2 +-
 drivers/gpu/drm/amd/amdgpu/vega10_ih.c     |  2 +-
 drivers/gpu/drm/amd/amdgpu/vega20_ih.c     |  2 +-
 17 files changed, 79 insertions(+), 36 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 380801b59b07..d830a541ba89 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -1099,7 +1099,9 @@ static inline struct amdgpu_device *amdgpu_ttm_adev(struct ttm_device *bdev)
 
 int amdgpu_device_init(struct amdgpu_device *adev,
 		       uint32_t flags);
-void amdgpu_device_fini(struct amdgpu_device *adev);
+void amdgpu_device_fini_hw(struct amdgpu_device *adev);
+void amdgpu_device_fini_sw(struct amdgpu_device *adev);
+
 int amdgpu_gpu_wait_for_idle(struct amdgpu_device *adev);
 
 void amdgpu_device_vram_access(struct amdgpu_device *adev, loff_t pos,
@@ -1319,6 +1321,8 @@ void amdgpu_driver_lastclose_kms(struct drm_device *dev);
 int amdgpu_driver_open_kms(struct drm_device *dev, struct drm_file *file_priv);
 void amdgpu_driver_postclose_kms(struct drm_device *dev,
 				 struct drm_file *file_priv);
+void amdgpu_driver_release_kms(struct drm_device *dev);
+
 int amdgpu_device_ip_suspend(struct amdgpu_device *adev);
 int amdgpu_device_suspend(struct drm_device *dev, bool fbcon);
 int amdgpu_device_resume(struct drm_device *dev, bool fbcon);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index b4ad1c055c70..3760ce7d8ff8 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -3648,15 +3648,13 @@ int amdgpu_device_init(struct amdgpu_device *adev,
  * Tear down the driver info (all asics).
  * Called at driver shutdown.
  */
-void amdgpu_device_fini(struct amdgpu_device *adev)
+void amdgpu_device_fini_hw(struct amdgpu_device *adev)
 {
 	dev_info(adev->dev, "amdgpu: finishing device.\n");
 	flush_delayed_work(&adev->delayed_init_work);
 	ttm_bo_lock_delayed_workqueue(&adev->mman.bdev);
 	adev->shutdown = true;
 
-	kfree(adev->pci_state);
-
 	/* make sure IB test finished before entering exclusive mode
 	 * to avoid preemption on IB test
 	 * */
@@ -3673,11 +3671,24 @@ void amdgpu_device_fini(struct amdgpu_device *adev)
 		else
 			drm_atomic_helper_shutdown(adev_to_drm(adev));
 	}
-	amdgpu_fence_driver_fini(adev);
+	amdgpu_fence_driver_fini_hw(adev);
+
 	if (adev->pm_sysfs_en)
 		amdgpu_pm_sysfs_fini(adev);
+	if (adev->ucode_sysfs_en)
+		amdgpu_ucode_sysfs_fini(adev);
+	sysfs_remove_files(&adev->dev->kobj, amdgpu_dev_attributes);
+
+
 	amdgpu_fbdev_fini(adev);
+
+	amdgpu_irq_fini_hw(adev);
+}
+
+void amdgpu_device_fini_sw(struct amdgpu_device *adev)
+{
 	amdgpu_device_ip_fini(adev);
+	amdgpu_fence_driver_fini_sw(adev);
 	release_firmware(adev->firmware.gpu_info_fw);
 	adev->firmware.gpu_info_fw = NULL;
 	adev->accel_working = false;
@@ -3703,14 +3714,13 @@ void amdgpu_device_fini(struct amdgpu_device *adev)
 	adev->rmmio = NULL;
 	amdgpu_device_doorbell_fini(adev);
 
-	if (adev->ucode_sysfs_en)
-		amdgpu_ucode_sysfs_fini(adev);
-
-	sysfs_remove_files(&adev->dev->kobj, amdgpu_dev_attributes);
 	if (IS_ENABLED(CONFIG_PERF_EVENTS))
 		amdgpu_pmu_fini(adev);
 	if (adev->mman.discovery_bin)
 		amdgpu_discovery_fini(adev);
+
+	kfree(adev->pci_state);
+
 }
 
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 6cf573293823..5ebed4c7d9c0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -1311,14 +1311,10 @@ amdgpu_pci_remove(struct pci_dev *pdev)
 {
 	struct drm_device *dev = pci_get_drvdata(pdev);
 
-#ifdef MODULE
-	if (THIS_MODULE->state != MODULE_STATE_GOING)
-#endif
-		DRM_ERROR("Hotplug removal is not supported\n");
 	drm_dev_unplug(dev);
 	amdgpu_driver_unload_kms(dev);
+
 	pci_disable_device(pdev);
-	pci_set_drvdata(pdev, NULL);
 }
 
 static void
@@ -1748,6 +1744,7 @@ static const struct drm_driver amdgpu_kms_driver = {
 	.dumb_create = amdgpu_mode_dumb_create,
 	.dumb_map_offset = amdgpu_mode_dumb_mmap,
 	.fops = &amdgpu_driver_kms_fops,
+	.release = &amdgpu_driver_release_kms,
 
 	.prime_handle_to_fd = drm_gem_prime_handle_to_fd,
 	.prime_fd_to_handle = drm_gem_prime_fd_to_handle,
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
index 47ea46859618..1ffb36bd0b19 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
@@ -523,7 +523,7 @@ int amdgpu_fence_driver_init(struct amdgpu_device *adev)
  *
  * Tear down the fence driver for all possible rings (all asics).
  */
-void amdgpu_fence_driver_fini(struct amdgpu_device *adev)
+void amdgpu_fence_driver_fini_hw(struct amdgpu_device *adev)
 {
 	unsigned i, j;
 	int r;
@@ -545,6 +545,19 @@ void amdgpu_fence_driver_fini(struct amdgpu_device *adev)
 				       ring->fence_drv.irq_type);
 
 		del_timer_sync(&ring->fence_drv.fallback_timer);
+	}
+}
+
+void amdgpu_fence_driver_fini_sw(struct amdgpu_device *adev)
+{
+	unsigned int i, j;
+
+	for (i = 0; i < AMDGPU_MAX_RINGS; i++) {
+		struct amdgpu_ring *ring = adev->rings[i];
+
+		if (!ring || !ring->fence_drv.initialized)
+			continue;
+
 		for (j = 0; j <= ring->fence_drv.num_fences_mask; ++j)
 			dma_fence_put(ring->fence_drv.fences[j]);
 		kfree(ring->fence_drv.fences);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
index 90f50561b43a..233b64dab94b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
@@ -49,6 +49,7 @@
 #include <drm/drm_irq.h>
 #include <drm/drm_vblank.h>
 #include <drm/amdgpu_drm.h>
+#include <drm/drm_drv.h>
 #include "amdgpu.h"
 #include "amdgpu_ih.h"
 #include "atom.h"
@@ -348,6 +349,20 @@ int amdgpu_irq_init(struct amdgpu_device *adev)
 	return 0;
 }
 
+
+void amdgpu_irq_fini_hw(struct amdgpu_device *adev)
+{
+	if (adev->irq.installed) {
+		drm_irq_uninstall(&adev->ddev);
+		adev->irq.installed = false;
+		if (adev->irq.msi_enabled)
+			pci_free_irq_vectors(adev->pdev);
+
+		if (!amdgpu_device_has_dc_support(adev))
+			flush_work(&adev->hotplug_work);
+	}
+}
+
 /**
  * amdgpu_irq_fini - shut down interrupt handling
  *
@@ -357,19 +372,10 @@ int amdgpu_irq_init(struct amdgpu_device *adev)
  * functionality, shuts down vblank, hotplug and reset interrupt handling,
  * turns off interrupts from all sources (all ASICs).
  */
-void amdgpu_irq_fini(struct amdgpu_device *adev)
+void amdgpu_irq_fini_sw(struct amdgpu_device *adev)
 {
 	unsigned i, j;
 
-	if (adev->irq.installed) {
-		drm_irq_uninstall(adev_to_drm(adev));
-		adev->irq.installed = false;
-		if (adev->irq.msi_enabled)
-			pci_free_irq_vectors(adev->pdev);
-		if (!amdgpu_device_has_dc_support(adev))
-			flush_work(&adev->hotplug_work);
-	}
-
 	for (i = 0; i < AMDGPU_IRQ_CLIENTID_MAX; ++i) {
 		if (!adev->irq.client[i].sources)
 			continue;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.h
index cf6116648322..78ad4784cc74 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.h
@@ -103,7 +103,8 @@ void amdgpu_irq_disable_all(struct amdgpu_device *adev);
 irqreturn_t amdgpu_irq_handler(int irq, void *arg);
 
 int amdgpu_irq_init(struct amdgpu_device *adev);
-void amdgpu_irq_fini(struct amdgpu_device *adev);
+void amdgpu_irq_fini_sw(struct amdgpu_device *adev);
+void amdgpu_irq_fini_hw(struct amdgpu_device *adev);
 int amdgpu_irq_add_id(struct amdgpu_device *adev,
 		      unsigned client_id, unsigned src_id,
 		      struct amdgpu_irq_src *source);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
index 39ee88d29cca..f3ecada208b0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
@@ -28,6 +28,7 @@
 
 #include "amdgpu.h"
 #include <drm/amdgpu_drm.h>
+#include <drm/drm_drv.h>
 #include "amdgpu_uvd.h"
 #include "amdgpu_vce.h"
 #include "atom.h"
@@ -92,7 +93,7 @@ void amdgpu_driver_unload_kms(struct drm_device *dev)
 	}
 
 	amdgpu_acpi_fini(adev);
-	amdgpu_device_fini(adev);
+	amdgpu_device_fini_hw(adev);
 }
 
 void amdgpu_register_gpu_instance(struct amdgpu_device *adev)
@@ -1219,6 +1220,15 @@ void amdgpu_driver_postclose_kms(struct drm_device *dev,
 	pm_runtime_put_autosuspend(dev->dev);
 }
 
+
+void amdgpu_driver_release_kms(struct drm_device *dev)
+{
+	struct amdgpu_device *adev = drm_to_adev(dev);
+
+	amdgpu_device_fini_sw(adev);
+	pci_set_drvdata(adev->pdev, NULL);
+}
+
 /*
  * VBlank related functions.
  */
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
index 0541196ae1ed..844a667f655b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
@@ -2325,6 +2325,7 @@ int amdgpu_ras_pre_fini(struct amdgpu_device *adev)
 	if (!adev->ras_features || !con)
 		return 0;
 
+
 	/* Need disable ras on all IPs here before ip [hw/sw]fini */
 	amdgpu_ras_disable_all_features(adev, 0);
 	amdgpu_ras_recovery_fini(adev);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
index ca1622835296..e7d3d0dbdd96 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
@@ -107,7 +107,8 @@ struct amdgpu_fence_driver {
 };
 
 int amdgpu_fence_driver_init(struct amdgpu_device *adev);
-void amdgpu_fence_driver_fini(struct amdgpu_device *adev);
+void amdgpu_fence_driver_fini_hw(struct amdgpu_device *adev);
+void amdgpu_fence_driver_fini_sw(struct amdgpu_device *adev);
 void amdgpu_fence_driver_force_completion(struct amdgpu_ring *ring);
 
 int amdgpu_fence_driver_init_ring(struct amdgpu_ring *ring,
diff --git a/drivers/gpu/drm/amd/amdgpu/cik_ih.c b/drivers/gpu/drm/amd/amdgpu/cik_ih.c
index d3745711d55f..183d44a6583c 100644
--- a/drivers/gpu/drm/amd/amdgpu/cik_ih.c
+++ b/drivers/gpu/drm/amd/amdgpu/cik_ih.c
@@ -309,7 +309,7 @@ static int cik_ih_sw_fini(void *handle)
 {
 	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
 
-	amdgpu_irq_fini(adev);
+	amdgpu_irq_fini_sw(adev);
 	amdgpu_ih_ring_fini(adev, &adev->irq.ih);
 	amdgpu_irq_remove_domain(adev);
 
diff --git a/drivers/gpu/drm/amd/amdgpu/cz_ih.c b/drivers/gpu/drm/amd/amdgpu/cz_ih.c
index 307c01301c87..d32743949003 100644
--- a/drivers/gpu/drm/amd/amdgpu/cz_ih.c
+++ b/drivers/gpu/drm/amd/amdgpu/cz_ih.c
@@ -301,7 +301,7 @@ static int cz_ih_sw_fini(void *handle)
 {
 	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
 
-	amdgpu_irq_fini(adev);
+	amdgpu_irq_fini_sw(adev);
 	amdgpu_ih_ring_fini(adev, &adev->irq.ih);
 	amdgpu_irq_remove_domain(adev);
 
diff --git a/drivers/gpu/drm/amd/amdgpu/iceland_ih.c b/drivers/gpu/drm/amd/amdgpu/iceland_ih.c
index cc957471f31e..da96c6013477 100644
--- a/drivers/gpu/drm/amd/amdgpu/iceland_ih.c
+++ b/drivers/gpu/drm/amd/amdgpu/iceland_ih.c
@@ -300,7 +300,7 @@ static int iceland_ih_sw_fini(void *handle)
 {
 	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
 
-	amdgpu_irq_fini(adev);
+	amdgpu_irq_fini_sw(adev);
 	amdgpu_ih_ring_fini(adev, &adev->irq.ih);
 	amdgpu_irq_remove_domain(adev);
 
diff --git a/drivers/gpu/drm/amd/amdgpu/navi10_ih.c b/drivers/gpu/drm/amd/amdgpu/navi10_ih.c
index f4e4040bbd25..5eea4550b856 100644
--- a/drivers/gpu/drm/amd/amdgpu/navi10_ih.c
+++ b/drivers/gpu/drm/amd/amdgpu/navi10_ih.c
@@ -569,7 +569,7 @@ static int navi10_ih_sw_fini(void *handle)
 {
 	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
 
-	amdgpu_irq_fini(adev);
+	amdgpu_irq_fini_sw(adev);
 	amdgpu_ih_ring_fini(adev, &adev->irq.ih_soft);
 	amdgpu_ih_ring_fini(adev, &adev->irq.ih2);
 	amdgpu_ih_ring_fini(adev, &adev->irq.ih1);
diff --git a/drivers/gpu/drm/amd/amdgpu/si_ih.c b/drivers/gpu/drm/amd/amdgpu/si_ih.c
index 51880f6ef634..751307f3252c 100644
--- a/drivers/gpu/drm/amd/amdgpu/si_ih.c
+++ b/drivers/gpu/drm/amd/amdgpu/si_ih.c
@@ -175,7 +175,7 @@ static int si_ih_sw_fini(void *handle)
 {
 	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
 
-	amdgpu_irq_fini(adev);
+	amdgpu_irq_fini_sw(adev);
 	amdgpu_ih_ring_fini(adev, &adev->irq.ih);
 
 	return 0;
diff --git a/drivers/gpu/drm/amd/amdgpu/tonga_ih.c b/drivers/gpu/drm/amd/amdgpu/tonga_ih.c
index 249fcbee7871..973d80ec7f6c 100644
--- a/drivers/gpu/drm/amd/amdgpu/tonga_ih.c
+++ b/drivers/gpu/drm/amd/amdgpu/tonga_ih.c
@@ -312,7 +312,7 @@ static int tonga_ih_sw_fini(void *handle)
 {
 	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
 
-	amdgpu_irq_fini(adev);
+	amdgpu_irq_fini_sw(adev);
 	amdgpu_ih_ring_fini(adev, &adev->irq.ih);
 	amdgpu_irq_remove_domain(adev);
 
diff --git a/drivers/gpu/drm/amd/amdgpu/vega10_ih.c b/drivers/gpu/drm/amd/amdgpu/vega10_ih.c
index ca8efa5c6978..dead9c2fbd4c 100644
--- a/drivers/gpu/drm/amd/amdgpu/vega10_ih.c
+++ b/drivers/gpu/drm/amd/amdgpu/vega10_ih.c
@@ -513,7 +513,7 @@ static int vega10_ih_sw_fini(void *handle)
 {
 	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
 
-	amdgpu_irq_fini(adev);
+	amdgpu_irq_fini_sw(adev);
 	amdgpu_ih_ring_fini(adev, &adev->irq.ih_soft);
 	amdgpu_ih_ring_fini(adev, &adev->irq.ih2);
 	amdgpu_ih_ring_fini(adev, &adev->irq.ih1);
diff --git a/drivers/gpu/drm/amd/amdgpu/vega20_ih.c b/drivers/gpu/drm/amd/amdgpu/vega20_ih.c
index 8a122b413bf5..58993ae1fe11 100644
--- a/drivers/gpu/drm/amd/amdgpu/vega20_ih.c
+++ b/drivers/gpu/drm/amd/amdgpu/vega20_ih.c
@@ -565,7 +565,7 @@ static int vega20_ih_sw_fini(void *handle)
 {
 	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
 
-	amdgpu_irq_fini(adev);
+	amdgpu_irq_fini_sw(adev);
 	amdgpu_ih_ring_fini(adev, &adev->irq.ih_soft);
 	amdgpu_ih_ring_fini(adev, &adev->irq.ih2);
 	amdgpu_ih_ring_fini(adev, &adev->irq.ih1);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 126+ messages in thread

* [PATCH v6 04/16] drm/amdkfd: Split kfd suspend from device exit
  2021-05-10 16:36 ` Andrey Grodzovsky
  (?)
@ 2021-05-10 16:36   ` Andrey Grodzovsky
  -1 siblings, 0 replies; 126+ messages in thread
From: Andrey Grodzovsky @ 2021-05-10 16:36 UTC (permalink / raw)
  To: dri-devel, amd-gfx, linux-pci, ckoenig.leichtzumerken,
	daniel.vetter, Harry.Wentland
  Cc: ppaalanen, Alexander.Deucher, gregkh, helgaas, Felix.Kuehling,
	Andrey Grodzovsky

Helps to expedite HW-related teardown in amdgpu_pci_remove by taking
kgd2kfd_suspend out of kgd2kfd_device_exit.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h | 2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_device.c    | 3 ++-
 3 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
index 5f6696a3c778..2b06dee9a0ce 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
@@ -170,7 +170,7 @@ void amdgpu_amdkfd_device_init(struct amdgpu_device *adev)
 	}
 }
 
-void amdgpu_amdkfd_device_fini(struct amdgpu_device *adev)
+void amdgpu_amdkfd_device_fini_sw(struct amdgpu_device *adev)
 {
 	if (adev->kfd.dev) {
 		kgd2kfd_device_exit(adev->kfd.dev);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
index 14f68c028126..f8e10af99c28 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
@@ -127,7 +127,7 @@ void amdgpu_amdkfd_interrupt(struct amdgpu_device *adev,
 			const void *ih_ring_entry);
 void amdgpu_amdkfd_device_probe(struct amdgpu_device *adev);
 void amdgpu_amdkfd_device_init(struct amdgpu_device *adev);
-void amdgpu_amdkfd_device_fini(struct amdgpu_device *adev);
+void amdgpu_amdkfd_device_fini_sw(struct amdgpu_device *adev);
 int amdgpu_amdkfd_submit_ib(struct kgd_dev *kgd, enum kgd_engine_type engine,
 				uint32_t vmid, uint64_t gpu_addr,
 				uint32_t *ib_cmd, uint32_t ib_len);
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
index 357b9bf62a1c..ab6d2a43c9a3 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
@@ -858,10 +858,11 @@ bool kgd2kfd_device_init(struct kfd_dev *kfd,
 	return kfd->init_complete;
 }
 
+
+
 void kgd2kfd_device_exit(struct kfd_dev *kfd)
 {
 	if (kfd->init_complete) {
-		kgd2kfd_suspend(kfd, false);
 		device_queue_manager_uninit(kfd->dqm);
 		kfd_interrupt_exit(kfd);
 		kfd_topology_remove_device(kfd);
-- 
2.25.1
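
The split made by this patch can be pictured with a toy model. The sketch below is not amdkfd code: `toy_dev`, `toy_suspend()` and `toy_exit_sw()` are hypothetical names. It assumes only what the diff shows, namely that the exit path no longer suspends on its own, so the caller has to quiesce the hardware first and release software state afterwards.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a device; not a real amdkfd structure. */
struct toy_dev {
	bool init_complete;	/* software state is still allocated */
	bool hw_running;	/* hardware is still being driven */
	int hw_accesses;	/* counts simulated register accesses */
};

/* HW stage: quiesce the device. Intended to run while it is still present. */
static void toy_suspend(struct toy_dev *d)
{
	if (!d->hw_running)
		return;

	d->hw_accesses++;	/* the only place that touches "hardware" */
	d->hw_running = false;
}

/* SW stage: release software state only, with no hardware access. */
static void toy_exit_sw(struct toy_dev *d)
{
	assert(!d->hw_running);	/* the caller must have suspended already */
	d->init_complete = false;
}
```

In this model a caller runs `toy_suspend()` from its remove path and `toy_exit_sw()` later; the second step performs no simulated hardware access, which is the property the split is after.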


^ permalink raw reply related	[flat|nested] 126+ messages in thread

* [PATCH v6 05/16] drm/amdgpu: Add early fini callback
  2021-05-10 16:36 ` Andrey Grodzovsky
  (?)
@ 2021-05-10 16:36   ` Andrey Grodzovsky
  -1 siblings, 0 replies; 126+ messages in thread
From: Andrey Grodzovsky @ 2021-05-10 16:36 UTC (permalink / raw)
  To: dri-devel, amd-gfx, linux-pci, ckoenig.leichtzumerken,
	daniel.vetter, Harry.Wentland
  Cc: ppaalanen, Alexander.Deucher, gregkh, helgaas, Felix.Kuehling,
	Andrey Grodzovsky

Use it to call display code that depends on device->drv_data
before it is set to NULL on device unplug.

v5: Move HW finalization into this callback to prevent MMIO accesses
    after PCI remove.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c    | 59 +++++++++++++------
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 12 +++-
 drivers/gpu/drm/amd/include/amd_shared.h      |  2 +
 3 files changed, 52 insertions(+), 21 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 3760ce7d8ff8..18598eda18f6 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2558,34 +2558,26 @@ static int amdgpu_device_ip_late_init(struct amdgpu_device *adev)
 	return 0;
 }
 
-/**
- * amdgpu_device_ip_fini - run fini for hardware IPs
- *
- * @adev: amdgpu_device pointer
- *
- * Main teardown pass for hardware IPs.  The list of all the hardware
- * IPs that make up the asic is walked and the hw_fini and sw_fini callbacks
- * are run.  hw_fini tears down the hardware associated with each IP
- * and sw_fini tears down any software state associated with each IP.
- * Returns 0 on success, negative error code on failure.
- */
-static int amdgpu_device_ip_fini(struct amdgpu_device *adev)
+static int amdgpu_device_ip_fini_early(struct amdgpu_device *adev)
 {
 	int i, r;
 
-	if (amdgpu_sriov_vf(adev) && adev->virt.ras_init_done)
-		amdgpu_virt_release_ras_err_handler_data(adev);
+	for (i = 0; i < adev->num_ip_blocks; i++) {
+		if (!adev->ip_blocks[i].version->funcs->early_fini)
+			continue;
 
-	amdgpu_ras_pre_fini(adev);
+		r = adev->ip_blocks[i].version->funcs->early_fini((void *)adev);
+		if (r) {
+			DRM_DEBUG("early_fini of IP block <%s> failed %d\n",
+				  adev->ip_blocks[i].version->funcs->name, r);
+		}
+	}
 
-	if (adev->gmc.xgmi.num_physical_nodes > 1)
-		amdgpu_xgmi_remove_device(adev);
+	amdgpu_amdkfd_suspend(adev, false);
 
 	amdgpu_device_set_pg_state(adev, AMD_PG_STATE_UNGATE);
 	amdgpu_device_set_cg_state(adev, AMD_CG_STATE_UNGATE);
 
-	amdgpu_amdkfd_device_fini(adev);
-
 	/* need to disable SMC first */
 	for (i = 0; i < adev->num_ip_blocks; i++) {
 		if (!adev->ip_blocks[i].status.hw)
@@ -2616,6 +2608,33 @@ static int amdgpu_device_ip_fini(struct amdgpu_device *adev)
 		adev->ip_blocks[i].status.hw = false;
 	}
 
+	return 0;
+}
+
+/**
+ * amdgpu_device_ip_fini - run fini for hardware IPs
+ *
+ * @adev: amdgpu_device pointer
+ *
+ * Main teardown pass for hardware IPs.  The list of all the hardware
+ * IPs that make up the asic is walked and the hw_fini and sw_fini callbacks
+ * are run.  hw_fini tears down the hardware associated with each IP
+ * and sw_fini tears down any software state associated with each IP.
+ * Returns 0 on success, negative error code on failure.
+ */
+static int amdgpu_device_ip_fini(struct amdgpu_device *adev)
+{
+	int i, r;
+
+	if (amdgpu_sriov_vf(adev) && adev->virt.ras_init_done)
+		amdgpu_virt_release_ras_err_handler_data(adev);
+
+	amdgpu_ras_pre_fini(adev);
+
+	if (adev->gmc.xgmi.num_physical_nodes > 1)
+		amdgpu_xgmi_remove_device(adev);
+
+	amdgpu_amdkfd_device_fini_sw(adev);
 
 	for (i = adev->num_ip_blocks - 1; i >= 0; i--) {
 		if (!adev->ip_blocks[i].status.sw)
@@ -3683,6 +3702,8 @@ void amdgpu_device_fini_hw(struct amdgpu_device *adev)
 	amdgpu_fbdev_fini(adev);
 
 	amdgpu_irq_fini_hw(adev);
+
+	amdgpu_device_ip_fini_early(adev);
 }
 
 void amdgpu_device_fini_sw(struct amdgpu_device *adev)
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 296704ce3768..6c2c6a51ce6c 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -1251,6 +1251,15 @@ static int amdgpu_dm_init(struct amdgpu_device *adev)
 	return -EINVAL;
 }
 
+static int amdgpu_dm_early_fini(void *handle)
+{
+	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
+
+	amdgpu_dm_audio_fini(adev);
+
+	return 0;
+}
+
 static void amdgpu_dm_fini(struct amdgpu_device *adev)
 {
 	int i;
@@ -1259,8 +1268,6 @@ static void amdgpu_dm_fini(struct amdgpu_device *adev)
 		drm_encoder_cleanup(&adev->dm.mst_encoders[i].base);
 	}
 
-	amdgpu_dm_audio_fini(adev);
-
 	amdgpu_dm_destroy_drm_device(&adev->dm);
 
 #if defined(CONFIG_DRM_AMD_SECURE_DISPLAY)
@@ -2298,6 +2305,7 @@ static const struct amd_ip_funcs amdgpu_dm_funcs = {
 	.late_init = dm_late_init,
 	.sw_init = dm_sw_init,
 	.sw_fini = dm_sw_fini,
+	.early_fini = amdgpu_dm_early_fini,
 	.hw_init = dm_hw_init,
 	.hw_fini = dm_hw_fini,
 	.suspend = dm_suspend,
diff --git a/drivers/gpu/drm/amd/include/amd_shared.h b/drivers/gpu/drm/amd/include/amd_shared.h
index 43ed6291b2b8..1ad56da486e4 100644
--- a/drivers/gpu/drm/amd/include/amd_shared.h
+++ b/drivers/gpu/drm/amd/include/amd_shared.h
@@ -240,6 +240,7 @@ enum amd_dpm_forced_level;
  * @late_init: sets up late driver/hw state (post hw_init) - Optional
  * @sw_init: sets up driver state, does not configure hw
  * @sw_fini: tears down driver state, does not configure hw
+ * @early_fini: tears down stuff before dev detached from driver
  * @hw_init: sets up the hw state
  * @hw_fini: tears down the hw state
  * @late_fini: final cleanup
@@ -268,6 +269,7 @@ struct amd_ip_funcs {
 	int (*late_init)(void *handle);
 	int (*sw_init)(void *handle);
 	int (*sw_fini)(void *handle);
+	int (*early_fini)(void *handle);
 	int (*hw_init)(void *handle);
 	int (*hw_fini)(void *handle);
 	void (*late_fini)(void *handle);
-- 
2.25.1
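
The loop added in amdgpu_device_ip_fini_early treats early_fini as an optional callback: blocks without one are skipped, and a failure is logged without aborting the walk. A minimal, self-contained sketch of that pattern follows. `toy_ip_funcs`, `toy_run_early_fini()` and the two sample callbacks are hypothetical, and unlike the patch (which returns 0) the helper returns the number of callbacks invoked so the behaviour is easy to check.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, cut-down analogue of an IP block's callback table. */
struct toy_ip_funcs {
	const char *name;
	int (*early_fini)(void *handle);	/* optional, may be NULL */
};

/* Call every early_fini that exists; log failures but keep going. */
static int toy_run_early_fini(const struct toy_ip_funcs *blocks, size_t n,
			      void *handle)
{
	int called = 0;

	assert(blocks != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		int r;

		if (!blocks[i].early_fini)
			continue;

		r = blocks[i].early_fini(handle);
		called++;
		if (r)
			fprintf(stderr, "early_fini of <%s> failed %d\n",
				blocks[i].name, r);
	}

	return called;
}

/* Sample callbacks: both bump a counter, one reports an error. */
static int toy_ok(void *handle)
{
	*(int *)handle += 1;
	return 0;
}

static int toy_fail(void *handle)
{
	*(int *)handle += 1;
	return -5;
}
```

With a table holding `toy_ok`, a NULL entry, `toy_fail` and `toy_ok`, the helper invokes three callbacks, and the failing one does not stop the last one from running.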


^ permalink raw reply related	[flat|nested] 126+ messages in thread

* [PATCH v6 06/16] drm/amdgpu: Handle IOMMU enabled case.
  2021-05-10 16:36 ` Andrey Grodzovsky
  (?)
@ 2021-05-10 16:36   ` Andrey Grodzovsky
  -1 siblings, 0 replies; 126+ messages in thread
From: Andrey Grodzovsky @ 2021-05-10 16:36 UTC (permalink / raw)
  To: dri-devel, amd-gfx, linux-pci, ckoenig.leichtzumerken,
	daniel.vetter, Harry.Wentland
  Cc: ppaalanen, Alexander.Deucher, gregkh, helgaas, Felix.Kuehling,
	Andrey Grodzovsky

Handle all DMA IOMMU group related dependencies before the
group is removed.

v5: Drop IOMMU notifier and switch to lockless call to ttm_tt_unpopulate
v6: Drop the BO unmap list

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 4 ++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c   | 3 +--
 drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h   | 1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c    | 9 +++++++++
 drivers/gpu/drm/amd/amdgpu/cik_ih.c        | 1 -
 drivers/gpu/drm/amd/amdgpu/cz_ih.c         | 1 -
 drivers/gpu/drm/amd/amdgpu/iceland_ih.c    | 1 -
 drivers/gpu/drm/amd/amdgpu/navi10_ih.c     | 3 ---
 drivers/gpu/drm/amd/amdgpu/si_ih.c         | 1 -
 drivers/gpu/drm/amd/amdgpu/tonga_ih.c      | 1 -
 drivers/gpu/drm/amd/amdgpu/vega10_ih.c     | 3 ---
 11 files changed, 13 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 18598eda18f6..a0bff4713672 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -3256,7 +3256,6 @@ static const struct attribute *amdgpu_dev_attributes[] = {
 	NULL
 };
 
-
 /**
  * amdgpu_device_init - initialize the driver
  *
@@ -3698,12 +3697,13 @@ void amdgpu_device_fini_hw(struct amdgpu_device *adev)
 		amdgpu_ucode_sysfs_fini(adev);
 	sysfs_remove_files(&adev->dev->kobj, amdgpu_dev_attributes);
 
-
 	amdgpu_fbdev_fini(adev);
 
 	amdgpu_irq_fini_hw(adev);
 
 	amdgpu_device_ip_fini_early(adev);
+
+	amdgpu_gart_dummy_page_fini(adev);
 }
 
 void amdgpu_device_fini_sw(struct amdgpu_device *adev)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
index c5a9a4fb10d2..354e68081b53 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
@@ -92,7 +92,7 @@ static int amdgpu_gart_dummy_page_init(struct amdgpu_device *adev)
  *
  * Frees the dummy page used by the driver (all asics).
  */
-static void amdgpu_gart_dummy_page_fini(struct amdgpu_device *adev)
+void amdgpu_gart_dummy_page_fini(struct amdgpu_device *adev)
 {
 	if (!adev->dummy_page_addr)
 		return;
@@ -375,5 +375,4 @@ int amdgpu_gart_init(struct amdgpu_device *adev)
  */
 void amdgpu_gart_fini(struct amdgpu_device *adev)
 {
-	amdgpu_gart_dummy_page_fini(adev);
 }
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h
index a25fe97b0196..78dc7a23da56 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h
@@ -58,6 +58,7 @@ int amdgpu_gart_table_vram_pin(struct amdgpu_device *adev);
 void amdgpu_gart_table_vram_unpin(struct amdgpu_device *adev);
 int amdgpu_gart_init(struct amdgpu_device *adev);
 void amdgpu_gart_fini(struct amdgpu_device *adev);
+void amdgpu_gart_dummy_page_fini(struct amdgpu_device *adev);
 int amdgpu_gart_unbind(struct amdgpu_device *adev, uint64_t offset,
 		       int pages);
 int amdgpu_gart_map(struct amdgpu_device *adev, uint64_t offset,
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
index 233b64dab94b..a14973a7a9c9 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
@@ -361,6 +361,15 @@ void amdgpu_irq_fini_hw(struct amdgpu_device *adev)
 		if (!amdgpu_device_has_dc_support(adev))
 			flush_work(&adev->hotplug_work);
 	}
+
+	if (adev->irq.ih_soft.ring)
+		amdgpu_ih_ring_fini(adev, &adev->irq.ih_soft);
+	if (adev->irq.ih.ring)
+		amdgpu_ih_ring_fini(adev, &adev->irq.ih);
+	if (adev->irq.ih1.ring)
+		amdgpu_ih_ring_fini(adev, &adev->irq.ih1);
+	if (adev->irq.ih2.ring)
+		amdgpu_ih_ring_fini(adev, &adev->irq.ih2);
 }
 
 /**
diff --git a/drivers/gpu/drm/amd/amdgpu/cik_ih.c b/drivers/gpu/drm/amd/amdgpu/cik_ih.c
index 183d44a6583c..df385ffc9768 100644
--- a/drivers/gpu/drm/amd/amdgpu/cik_ih.c
+++ b/drivers/gpu/drm/amd/amdgpu/cik_ih.c
@@ -310,7 +310,6 @@ static int cik_ih_sw_fini(void *handle)
 	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
 
 	amdgpu_irq_fini_sw(adev);
-	amdgpu_ih_ring_fini(adev, &adev->irq.ih);
 	amdgpu_irq_remove_domain(adev);
 
 	return 0;
diff --git a/drivers/gpu/drm/amd/amdgpu/cz_ih.c b/drivers/gpu/drm/amd/amdgpu/cz_ih.c
index d32743949003..b8c47e0cf37a 100644
--- a/drivers/gpu/drm/amd/amdgpu/cz_ih.c
+++ b/drivers/gpu/drm/amd/amdgpu/cz_ih.c
@@ -302,7 +302,6 @@ static int cz_ih_sw_fini(void *handle)
 	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
 
 	amdgpu_irq_fini_sw(adev);
-	amdgpu_ih_ring_fini(adev, &adev->irq.ih);
 	amdgpu_irq_remove_domain(adev);
 
 	return 0;
diff --git a/drivers/gpu/drm/amd/amdgpu/iceland_ih.c b/drivers/gpu/drm/amd/amdgpu/iceland_ih.c
index da96c6013477..ddfe4eaeea05 100644
--- a/drivers/gpu/drm/amd/amdgpu/iceland_ih.c
+++ b/drivers/gpu/drm/amd/amdgpu/iceland_ih.c
@@ -301,7 +301,6 @@ static int iceland_ih_sw_fini(void *handle)
 	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
 
 	amdgpu_irq_fini_sw(adev);
-	amdgpu_ih_ring_fini(adev, &adev->irq.ih);
 	amdgpu_irq_remove_domain(adev);
 
 	return 0;
diff --git a/drivers/gpu/drm/amd/amdgpu/navi10_ih.c b/drivers/gpu/drm/amd/amdgpu/navi10_ih.c
index 5eea4550b856..e171a9e78544 100644
--- a/drivers/gpu/drm/amd/amdgpu/navi10_ih.c
+++ b/drivers/gpu/drm/amd/amdgpu/navi10_ih.c
@@ -571,9 +571,6 @@ static int navi10_ih_sw_fini(void *handle)
 
 	amdgpu_irq_fini_sw(adev);
 	amdgpu_ih_ring_fini(adev, &adev->irq.ih_soft);
-	amdgpu_ih_ring_fini(adev, &adev->irq.ih2);
-	amdgpu_ih_ring_fini(adev, &adev->irq.ih1);
-	amdgpu_ih_ring_fini(adev, &adev->irq.ih);
 
 	return 0;
 }
diff --git a/drivers/gpu/drm/amd/amdgpu/si_ih.c b/drivers/gpu/drm/amd/amdgpu/si_ih.c
index 751307f3252c..9a24f17a5750 100644
--- a/drivers/gpu/drm/amd/amdgpu/si_ih.c
+++ b/drivers/gpu/drm/amd/amdgpu/si_ih.c
@@ -176,7 +176,6 @@ static int si_ih_sw_fini(void *handle)
 	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
 
 	amdgpu_irq_fini_sw(adev);
-	amdgpu_ih_ring_fini(adev, &adev->irq.ih);
 
 	return 0;
 }
diff --git a/drivers/gpu/drm/amd/amdgpu/tonga_ih.c b/drivers/gpu/drm/amd/amdgpu/tonga_ih.c
index 973d80ec7f6c..b08905d1c00f 100644
--- a/drivers/gpu/drm/amd/amdgpu/tonga_ih.c
+++ b/drivers/gpu/drm/amd/amdgpu/tonga_ih.c
@@ -313,7 +313,6 @@ static int tonga_ih_sw_fini(void *handle)
 	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
 
 	amdgpu_irq_fini_sw(adev);
-	amdgpu_ih_ring_fini(adev, &adev->irq.ih);
 	amdgpu_irq_remove_domain(adev);
 
 	return 0;
diff --git a/drivers/gpu/drm/amd/amdgpu/vega10_ih.c b/drivers/gpu/drm/amd/amdgpu/vega10_ih.c
index dead9c2fbd4c..d78b8abe993a 100644
--- a/drivers/gpu/drm/amd/amdgpu/vega10_ih.c
+++ b/drivers/gpu/drm/amd/amdgpu/vega10_ih.c
@@ -515,9 +515,6 @@ static int vega10_ih_sw_fini(void *handle)
 
 	amdgpu_irq_fini_sw(adev);
 	amdgpu_ih_ring_fini(adev, &adev->irq.ih_soft);
-	amdgpu_ih_ring_fini(adev, &adev->irq.ih2);
-	amdgpu_ih_ring_fini(adev, &adev->irq.ih1);
-	amdgpu_ih_ring_fini(adev, &adev->irq.ih);
 
 	return 0;
 }
-- 
2.25.1


* [PATCH v6 07/16] drm/amdgpu: Remap all page faults to per process dummy page.
@ 2021-05-10 16:36   ` Andrey Grodzovsky
  -1 siblings, 0 replies; 126+ messages in thread
From: Andrey Grodzovsky @ 2021-05-10 16:36 UTC (permalink / raw)
  To: dri-devel, amd-gfx, linux-pci, ckoenig.leichtzumerken,
	daniel.vetter, Harry.Wentland
  Cc: ppaalanen, Alexander.Deucher, gregkh, helgaas, Felix.Kuehling,
	Andrey Grodzovsky, Christian König

On device removal, reroute all CPU mappings to a dummy page,
one per drm_file instance or imported GEM object.

v4:
Update for modified ttm_bo_vm_dummy_page

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
---
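Note for readers: the reworked fault path can be pictured with a small
userspace model. This is only a sketch under stated assumptions, not the
driver code: struct model_device, model_dev_enter(), model_dev_exit() and
model_fault() are hypothetical names standing in for drm_device,
drm_dev_enter(), drm_dev_exit() and amdgpu_ttm_fault(), and a plain counter
replaces the synchronization used by the real helpers.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct drm_device; not the real DRM type. */
struct model_device {
	bool unplugged;	/* set once the device has been removed */
	int active;	/* sections currently between enter and exit */
};

/* Models drm_dev_enter(): succeeds only while the device is present. */
static bool model_dev_enter(struct model_device *dev)
{
	if (dev->unplugged)
		return false;
	dev->active++;
	return true;
}

/* Models drm_dev_exit(): closes a section opened by model_dev_enter(). */
static void model_dev_exit(struct model_device *dev)
{
	assert(dev->active > 0);
	dev->active--;
}

enum model_fault_result {
	MODEL_FAULT_DEVICE_PAGE,	/* fault served from the real buffer */
	MODEL_FAULT_DUMMY_PAGE,		/* fault served from the dummy page */
	MODEL_FAULT_ERROR,		/* notify step failed */
};

/*
 * Models the shape of the reworked fault handler: device-backed work is
 * done only inside an enter/exit section, and every path that entered
 * the section also leaves it. notify_fails simulates an error from the
 * reserve-notify step.
 */
static enum model_fault_result model_fault(struct model_device *dev,
					   bool notify_fails)
{
	enum model_fault_result ret;

	if (model_dev_enter(dev)) {
		if (notify_fails) {
			/* The real handler jumps to its unlock label here. */
			model_dev_exit(dev);
			return MODEL_FAULT_ERROR;
		}
		ret = MODEL_FAULT_DEVICE_PAGE;
		model_dev_exit(dev);
	} else {
		/* Device is gone: hand out the dummy page instead. */
		ret = MODEL_FAULT_DUMMY_PAGE;
	}

	return ret;
}
```

While the device is present, model_fault() reports the device page. Once
unplugged is set it reports the dummy page. The active count is back at zero
after every call, including the error path.
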
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 21 ++++++++++++++++-----
 1 file changed, 16 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index 8c7ec09eb1a4..0d54e70278ca 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -48,6 +48,7 @@
 #include <drm/ttm/ttm_placement.h>
 
 #include <drm/amdgpu_drm.h>
+#include <drm/drm_drv.h>
 
 #include "amdgpu.h"
 #include "amdgpu_object.h"
@@ -1905,18 +1906,28 @@ void amdgpu_ttm_set_buffer_funcs_status(struct amdgpu_device *adev, bool enable)
 static vm_fault_t amdgpu_ttm_fault(struct vm_fault *vmf)
 {
 	struct ttm_buffer_object *bo = vmf->vma->vm_private_data;
+	struct drm_device *ddev = bo->base.dev;
 	vm_fault_t ret;
+	int idx;
 
 	ret = ttm_bo_vm_reserve(bo, vmf);
 	if (ret)
 		return ret;
 
-	ret = amdgpu_bo_fault_reserve_notify(bo);
-	if (ret)
-		goto unlock;
+	if (drm_dev_enter(ddev, &idx)) {
+		ret = amdgpu_bo_fault_reserve_notify(bo);
+		if (ret) {
+			drm_dev_exit(idx);
+			goto unlock;
+		}
 
-	ret = ttm_bo_vm_fault_reserved(vmf, vmf->vma->vm_page_prot,
-				       TTM_BO_VM_NUM_PREFAULT, 1);
+		ret = ttm_bo_vm_fault_reserved(vmf, vmf->vma->vm_page_prot,
+					       TTM_BO_VM_NUM_PREFAULT, 1);
+
+		drm_dev_exit(idx);
+	} else {
+		ret = ttm_bo_vm_dummy_page(vmf, vmf->vma->vm_page_prot);
+	}
 	if (ret == VM_FAULT_RETRY && !(vmf->flags & FAULT_FLAG_RETRY_NOWAIT))
 		return ret;
 
-- 
2.25.1


* [PATCH v6 08/16] PCI: Add support for dev_groups to struct pci_driver
@ 2021-05-10 16:36   ` Andrey Grodzovsky
  -1 siblings, 0 replies; 126+ messages in thread
From: Andrey Grodzovsky @ 2021-05-10 16:36 UTC (permalink / raw)
  To: dri-devel, amd-gfx, linux-pci, ckoenig.leichtzumerken,
	daniel.vetter, Harry.Wentland
  Cc: ppaalanen, Alexander.Deucher, gregkh, helgaas, Felix.Kuehling,
	Andrey Grodzovsky

This helps convert PCI drivers' sysfs attributes to static attributes.

Analogous to b71b283e3d6d ("USB: add support for dev_groups to
struct usb_driver")

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
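Note for readers: the one-line change to registration can be modelled in
userspace as below. This is a hedged sketch, not the PCI core: all model_*
types and functions are hypothetical, simplified stand-ins for
struct pci_driver, struct device_driver and __pci_register_driver(), and
only the two pointer copies are represented.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct attribute_group. */
struct model_attribute_group {
	const char *name;
};

/* Stand-in for the generic struct device_driver. */
struct model_device_driver {
	const struct model_attribute_group **groups;	 /* driver-level */
	const struct model_attribute_group **dev_groups; /* per bound device */
};

/* Stand-in for struct pci_driver with the new dev_groups member. */
struct model_pci_driver {
	const char *name;
	const struct model_attribute_group **groups;
	const struct model_attribute_group **dev_groups;
	struct model_device_driver driver;	/* embedded generic driver */
};

/*
 * Models the part of registration touched by the patch: the bus-specific
 * driver forwards both attribute-group arrays to the embedded generic
 * driver, which is the object the driver core looks at.
 */
static int model_pci_register_driver(struct model_pci_driver *drv)
{
	assert(drv != NULL);
	drv->driver.groups = drv->groups;
	drv->driver.dev_groups = drv->dev_groups;
	return 0;
}

/* Counts the entries of a NULL-terminated group array (NULL counts as 0). */
static size_t model_count_groups(const struct model_attribute_group **groups)
{
	size_t n = 0;

	if (!groups)
		return 0;
	while (groups[n])
		n++;
	return n;
}
```

In the model, a driver that fills in only dev_groups ends up with
driver.dev_groups pointing at its array and driver.groups left NULL, which
mirrors how the patch makes the new member visible to the generic driver.
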
 drivers/pci/pci-driver.c | 1 +
 include/linux/pci.h      | 3 +++
 2 files changed, 4 insertions(+)

diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
index ec44a79e951a..3a72352aa5cf 100644
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -1385,6 +1385,7 @@ int __pci_register_driver(struct pci_driver *drv, struct module *owner,
 	drv->driver.owner = owner;
 	drv->driver.mod_name = mod_name;
 	drv->driver.groups = drv->groups;
+	drv->driver.dev_groups = drv->dev_groups;
 
 	spin_lock_init(&drv->dynids.lock);
 	INIT_LIST_HEAD(&drv->dynids.list);
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 86c799c97b77..b57755b03009 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -858,6 +858,8 @@ struct module;
  *		number of VFs to enable via sysfs "sriov_numvfs" file.
  * @err_handler: See Documentation/PCI/pci-error-recovery.rst
  * @groups:	Sysfs attribute groups.
+ * @dev_groups: Attributes attached to the device that will be
+ *              created once it is bound to the driver.
  * @driver:	Driver model structure.
  * @dynids:	List of dynamically added device IDs.
  */
@@ -873,6 +875,7 @@ struct pci_driver {
 	int  (*sriov_configure)(struct pci_dev *dev, int num_vfs); /* On PF */
 	const struct pci_error_handlers *err_handler;
 	const struct attribute_group **groups;
+	const struct attribute_group **dev_groups;
 	struct device_driver	driver;
 	struct pci_dynids	dynids;
 };
-- 
2.25.1


* [PATCH v6 09/16] drm/amdgpu: Convert driver sysfs attributes to static attributes
@ 2021-05-10 16:36   ` Andrey Grodzovsky
  -1 siblings, 0 replies; 126+ messages in thread
From: Andrey Grodzovsky @ 2021-05-10 16:36 UTC (permalink / raw)
  To: dri-devel, amd-gfx, linux-pci, ckoenig.leichtzumerken,
	daniel.vetter, Harry.Wentland
  Cc: ppaalanen, Alexander.Deucher, gregkh, helgaas, Felix.Kuehling,
	Andrey Grodzovsky, Christian König

This allows removing the explicit creation and destruction
of those attributes, which avoids warnings during device
finalization after physical device extraction.

v5: Use newly added pci_driver.dev_groups directly

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
---
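Note for readers: the effect of moving to static attribute groups can be
sketched with a userspace model. Everything named model_* is hypothetical
and only imitates what a driver core does with a NULL-terminated group
array at bind and unbind time. The attribute names are the ones visible in
this patch. The point is that the driver's own init and fini paths no
longer create or remove files themselves.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MODEL_MAX_FILES 8

/* Hypothetical stand-in for an attribute group: a list of file names. */
struct model_attr_group {
	const char *const *attrs;	/* NULL-terminated */
};

/* Hypothetical device: tracks which sysfs-like files are visible. */
struct model_device {
	const char *files[MODEL_MAX_FILES];
	size_t nfiles;
};

/* Models bind: the core walks every group and creates each attribute. */
static void model_bind(struct model_device *dev,
		       const struct model_attr_group *const *groups)
{
	for (size_t g = 0; groups && groups[g]; g++) {
		for (size_t a = 0; groups[g]->attrs[a]; a++) {
			assert(dev->nfiles < MODEL_MAX_FILES);
			dev->files[dev->nfiles++] = groups[g]->attrs[a];
		}
	}
}

/* Models unbind: the core drops all files without driver involvement. */
static void model_unbind(struct model_device *dev)
{
	dev->nfiles = 0;
}

/* Returns 1 if a file with this name is currently visible, else 0. */
static int model_has_file(const struct model_device *dev, const char *name)
{
	for (size_t i = 0; i < dev->nfiles; i++)
		if (strcmp(dev->files[i], name) == 0)
			return 1;
	return 0;
}
```

Binding a device with a GTT group and a VBIOS group makes all three files
appear, and unbinding removes them again. No per-attribute create or remove
call is left in the driver that could run against a device that is already
gone.
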
 drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c | 17 ++++++-------
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c      | 13 ++++++++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c  | 25 ++++++++------------
 drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 14 ++++-------
 4 files changed, 37 insertions(+), 32 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c
index 494b2e1717d5..879ed3e50a6e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c
@@ -1768,6 +1768,15 @@ static ssize_t amdgpu_atombios_get_vbios_version(struct device *dev,
 static DEVICE_ATTR(vbios_version, 0444, amdgpu_atombios_get_vbios_version,
 		   NULL);
 
+static struct attribute *amdgpu_vbios_version_attrs[] = {
+	&dev_attr_vbios_version.attr,
+	NULL
+};
+
+const struct attribute_group amdgpu_vbios_version_attr_group = {
+	.attrs = amdgpu_vbios_version_attrs
+};
+
 /**
  * amdgpu_atombios_fini - free the driver info and callbacks for atombios
  *
@@ -1787,7 +1796,6 @@ void amdgpu_atombios_fini(struct amdgpu_device *adev)
 	adev->mode_info.atom_context = NULL;
 	kfree(adev->mode_info.atom_card_info);
 	adev->mode_info.atom_card_info = NULL;
-	device_remove_file(adev->dev, &dev_attr_vbios_version);
 }
 
 /**
@@ -1804,7 +1812,6 @@ int amdgpu_atombios_init(struct amdgpu_device *adev)
 {
 	struct card_info *atom_card_info =
 	    kzalloc(sizeof(struct card_info), GFP_KERNEL);
-	int ret;
 
 	if (!atom_card_info)
 		return -ENOMEM;
@@ -1833,12 +1840,6 @@ int amdgpu_atombios_init(struct amdgpu_device *adev)
 		amdgpu_atombios_allocate_fb_scratch(adev);
 	}
 
-	ret = device_create_file(adev->dev, &dev_attr_vbios_version);
-	if (ret) {
-		DRM_ERROR("Failed to create device file for VBIOS version\n");
-		return ret;
-	}
-
 	return 0;
 }
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 5ebed4c7d9c0..83006f45b10b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -1766,6 +1766,18 @@ static struct pci_error_handlers amdgpu_pci_err_handler = {
 	.resume		= amdgpu_pci_resume,
 };
 
+extern const struct attribute_group amdgpu_vram_mgr_attr_group;
+extern const struct attribute_group amdgpu_gtt_mgr_attr_group;
+extern const struct attribute_group amdgpu_vbios_version_attr_group;
+
+static const struct attribute_group *amdgpu_sysfs_groups[] = {
+	&amdgpu_vram_mgr_attr_group,
+	&amdgpu_gtt_mgr_attr_group,
+	&amdgpu_vbios_version_attr_group,
+	NULL,
+};
+
+
 static struct pci_driver amdgpu_kms_pci_driver = {
 	.name = DRIVER_NAME,
 	.id_table = pciidlist,
@@ -1774,6 +1786,7 @@ static struct pci_driver amdgpu_kms_pci_driver = {
 	.shutdown = amdgpu_pci_shutdown,
 	.driver.pm = &amdgpu_pm_ops,
 	.err_handler = &amdgpu_pci_err_handler,
+	.dev_groups = amdgpu_sysfs_groups,
 };
 
 static int __init amdgpu_init(void)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c
index 72962de4c04c..a4404da8ca6d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c
@@ -75,6 +75,16 @@ static DEVICE_ATTR(mem_info_gtt_total, S_IRUGO,
 static DEVICE_ATTR(mem_info_gtt_used, S_IRUGO,
 	           amdgpu_mem_info_gtt_used_show, NULL);
 
+static struct attribute *amdgpu_gtt_mgr_attributes[] = {
+	&dev_attr_mem_info_gtt_total.attr,
+	&dev_attr_mem_info_gtt_used.attr,
+	NULL
+};
+
+const struct attribute_group amdgpu_gtt_mgr_attr_group = {
+	.attrs = amdgpu_gtt_mgr_attributes
+};
+
 static const struct ttm_resource_manager_func amdgpu_gtt_mgr_func;
 /**
  * amdgpu_gtt_mgr_init - init GTT manager and DRM MM
@@ -89,7 +99,6 @@ int amdgpu_gtt_mgr_init(struct amdgpu_device *adev, uint64_t gtt_size)
 	struct amdgpu_gtt_mgr *mgr = &adev->mman.gtt_mgr;
 	struct ttm_resource_manager *man = &mgr->manager;
 	uint64_t start, size;
-	int ret;
 
 	man->use_tt = true;
 	man->func = &amdgpu_gtt_mgr_func;
@@ -102,17 +111,6 @@ int amdgpu_gtt_mgr_init(struct amdgpu_device *adev, uint64_t gtt_size)
 	spin_lock_init(&mgr->lock);
 	atomic64_set(&mgr->available, gtt_size >> PAGE_SHIFT);
 
-	ret = device_create_file(adev->dev, &dev_attr_mem_info_gtt_total);
-	if (ret) {
-		DRM_ERROR("Failed to create device file mem_info_gtt_total\n");
-		return ret;
-	}
-	ret = device_create_file(adev->dev, &dev_attr_mem_info_gtt_used);
-	if (ret) {
-		DRM_ERROR("Failed to create device file mem_info_gtt_used\n");
-		return ret;
-	}
-
 	ttm_set_driver_manager(&adev->mman.bdev, TTM_PL_TT, &mgr->manager);
 	ttm_resource_manager_set_used(man, true);
 	return 0;
@@ -142,9 +140,6 @@ void amdgpu_gtt_mgr_fini(struct amdgpu_device *adev)
 	drm_mm_takedown(&mgr->mm);
 	spin_unlock(&mgr->lock);
 
-	device_remove_file(adev->dev, &dev_attr_mem_info_gtt_total);
-	device_remove_file(adev->dev, &dev_attr_mem_info_gtt_used);
-
 	ttm_resource_manager_cleanup(man);
 	ttm_set_driver_manager(&adev->mman.bdev, TTM_PL_TT, NULL);
 }
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
index 2344aba9dca3..8543d6486018 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
@@ -152,7 +152,7 @@ static DEVICE_ATTR(mem_info_vis_vram_used, S_IRUGO,
 static DEVICE_ATTR(mem_info_vram_vendor, S_IRUGO,
 		   amdgpu_mem_info_vram_vendor, NULL);
 
-static const struct attribute *amdgpu_vram_mgr_attributes[] = {
+static struct attribute *amdgpu_vram_mgr_attributes[] = {
 	&dev_attr_mem_info_vram_total.attr,
 	&dev_attr_mem_info_vis_vram_total.attr,
 	&dev_attr_mem_info_vram_used.attr,
@@ -161,6 +161,10 @@ static const struct attribute *amdgpu_vram_mgr_attributes[] = {
 	NULL
 };
 
+const struct attribute_group amdgpu_vram_mgr_attr_group = {
+	.attrs = amdgpu_vram_mgr_attributes
+};
+
 static const struct ttm_resource_manager_func amdgpu_vram_mgr_func;
 
 /**
@@ -174,7 +178,6 @@ int amdgpu_vram_mgr_init(struct amdgpu_device *adev)
 {
 	struct amdgpu_vram_mgr *mgr = &adev->mman.vram_mgr;
 	struct ttm_resource_manager *man = &mgr->manager;
-	int ret;
 
 	ttm_resource_manager_init(man, adev->gmc.real_vram_size >> PAGE_SHIFT);
 
@@ -185,11 +188,6 @@ int amdgpu_vram_mgr_init(struct amdgpu_device *adev)
 	INIT_LIST_HEAD(&mgr->reservations_pending);
 	INIT_LIST_HEAD(&mgr->reserved_pages);
 
-	/* Add the two VRAM-related sysfs files */
-	ret = sysfs_create_files(&adev->dev->kobj, amdgpu_vram_mgr_attributes);
-	if (ret)
-		DRM_ERROR("Failed to register sysfs\n");
-
 	ttm_set_driver_manager(&adev->mman.bdev, TTM_PL_VRAM, &mgr->manager);
 	ttm_resource_manager_set_used(man, true);
 	return 0;
@@ -227,8 +225,6 @@ void amdgpu_vram_mgr_fini(struct amdgpu_device *adev)
 	drm_mm_takedown(&mgr->mm);
 	spin_unlock(&mgr->lock);
 
-	sysfs_remove_files(&adev->dev->kobj, amdgpu_vram_mgr_attributes);
-
 	ttm_resource_manager_cleanup(man);
 	ttm_set_driver_manager(&adev->mman.bdev, TTM_PL_VRAM, NULL);
 }
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 126+ messages in thread

* [PATCH v6 10/16] drm/amdgpu: Guard against write accesses after device removal
  2021-05-10 16:36 ` Andrey Grodzovsky
  (?)
@ 2021-05-10 16:36   ` Andrey Grodzovsky
  -1 siblings, 0 replies; 126+ messages in thread
From: Andrey Grodzovsky @ 2021-05-10 16:36 UTC (permalink / raw)
  To: dri-devel, amd-gfx, linux-pci, ckoenig.leichtzumerken,
	daniel.vetter, Harry.Wentland
  Cc: ppaalanen, Alexander.Deucher, gregkh, helgaas, Felix.Kuehling,
	Andrey Grodzovsky

This should prevent writing to memory or IO ranges possibly
already allocated for other uses after our device is removed.

v5:
Protect more places where memcpy_to/from_io takes place
Protect IB submissions

v6: Switch to an early return on !drm_dev_enter() instead of
scoping the entire code with braces.
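
For readers unfamiliar with the pattern, here is a minimal user-space
sketch of the early-return guard described in the v6 note. It is not
the driver code: struct mock_dev, mock_dev_enter(), mock_dev_exit() and
guarded_write() are hypothetical stand-ins invented for illustration.
The real drm_dev_enter()/drm_dev_exit() are SRCU based, and
drm_dev_exit() takes only the index, so treat this purely as an
illustration of the control flow.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for struct drm_device (assumption, not kernel code). */
struct mock_dev {
	bool unplugged;	/* set once the device has been removed */
	int active;	/* number of currently open enter/exit sections */
	uint32_t reg;	/* stands in for a device-backed MMIO location */
};

/* Same shape as drm_dev_enter(): returns false once the device is gone. */
static bool mock_dev_enter(struct mock_dev *dev, int *idx)
{
	if (dev->unplugged)
		return false;
	*idx = dev->active++;
	return true;
}

/* Closes a section opened by mock_dev_enter(). */
static void mock_dev_exit(struct mock_dev *dev, int idx)
{
	(void)idx;
	assert(dev->active > 0);
	dev->active--;
}

/*
 * Early-return style: bail out before touching the device, so the body
 * stays at its original indentation instead of being wrapped in
 * "if (mock_dev_enter(...)) { ... }".
 */
static int guarded_write(struct mock_dev *dev, uint32_t value)
{
	int idx;

	if (!mock_dev_enter(dev, &idx))
		return -ENODEV;

	dev->reg = value;	/* the access being protected */

	mock_dev_exit(dev, idx);
	return 0;
}
```

Every exit path taken after a successful enter has to reach the matching
exit call, which is why several hunks below convert "return r" into
"goto exit".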

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c    | 11 ++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c       |  9 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c        | 17 +++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c       | 63 +++++++++++------
 drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h       |  2 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c      | 70 +++++++++++++++++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h      | 49 ++-----------
 drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c       | 31 +++++---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c       | 11 ++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c       | 22 ++++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c        |  7 +-
 drivers/gpu/drm/amd/amdgpu/psp_v11_0.c        | 44 ++++++------
 drivers/gpu/drm/amd/amdgpu/psp_v12_0.c        |  8 +--
 drivers/gpu/drm/amd/amdgpu/psp_v3_1.c         |  8 +--
 drivers/gpu/drm/amd/amdgpu/vce_v4_0.c         | 26 ++++---
 drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c         | 22 +++---
 .../drm/amd/pm/powerplay/smumgr/smu7_smumgr.c |  2 +
 17 files changed, 257 insertions(+), 145 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index a0bff4713672..94c415176cdc 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -71,6 +71,8 @@
 #include <drm/task_barrier.h>
 #include <linux/pm_runtime.h>
 
+#include <drm/drm_drv.h>
+
 MODULE_FIRMWARE("amdgpu/vega10_gpu_info.bin");
 MODULE_FIRMWARE("amdgpu/vega12_gpu_info.bin");
 MODULE_FIRMWARE("amdgpu/raven_gpu_info.bin");
@@ -281,7 +283,10 @@ void amdgpu_device_vram_access(struct amdgpu_device *adev, loff_t pos,
 	unsigned long flags;
 	uint32_t hi = ~0;
 	uint64_t last;
+	int idx;
 
+	 if (!drm_dev_enter(&adev->ddev, &idx))
+		 return;
 
 #ifdef CONFIG_64BIT
 	last = min(pos + size, adev->gmc.visible_vram_size);
@@ -299,8 +304,10 @@ void amdgpu_device_vram_access(struct amdgpu_device *adev, loff_t pos,
 			memcpy_fromio(buf, addr, count);
 		}
 
-		if (count == size)
+		if (count == size) {
+			drm_dev_exit(idx);
 			return;
+		}
 
 		pos += count;
 		buf += count / 4;
@@ -323,6 +330,8 @@ void amdgpu_device_vram_access(struct amdgpu_device *adev, loff_t pos,
 			*buf++ = RREG32_NO_KIQ(mmMM_DATA);
 	}
 	spin_unlock_irqrestore(&adev->mmio_idx_lock, flags);
+
+	drm_dev_exit(idx);
 }
 
 /*
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
index 4d32233cde92..04ba5eef1e88 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
@@ -31,6 +31,8 @@
 #include "amdgpu_ras.h"
 #include "amdgpu_xgmi.h"
 
+#include <drm/drm_drv.h>
+
 /**
  * amdgpu_gmc_pdb0_alloc - allocate vram for pdb0
  *
@@ -151,6 +153,10 @@ int amdgpu_gmc_set_pte_pde(struct amdgpu_device *adev, void *cpu_pt_addr,
 {
 	void __iomem *ptr = (void *)cpu_pt_addr;
 	uint64_t value;
+	int idx;
+
+	if (!drm_dev_enter(&adev->ddev, &idx))
+		return 0;
 
 	/*
 	 * The following is for PTE only. GART does not have PDEs.
@@ -158,6 +164,9 @@ int amdgpu_gmc_set_pte_pde(struct amdgpu_device *adev, void *cpu_pt_addr,
 	value = addr & 0x0000FFFFFFFFF000ULL;
 	value |= flags;
 	writeq(value, ptr + (gpu_page_idx * 8));
+
+	drm_dev_exit(idx);
+
 	return 0;
 }
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
index 148a3b481b12..62fcbd446c71 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
@@ -30,6 +30,7 @@
 #include <linux/slab.h>
 
 #include <drm/amdgpu_drm.h>
+#include <drm/drm_drv.h>
 
 #include "amdgpu.h"
 #include "atom.h"
@@ -137,7 +138,7 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, unsigned num_ibs,
 	bool secure;
 
 	unsigned i;
-	int r = 0;
+	int idx, r = 0;
 	bool need_pipe_sync = false;
 
 	if (num_ibs == 0)
@@ -169,13 +170,16 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, unsigned num_ibs,
 		return -EINVAL;
 	}
 
+	if (!drm_dev_enter(&adev->ddev, &idx))
+		return -ENODEV;
+
 	alloc_size = ring->funcs->emit_frame_size + num_ibs *
 		ring->funcs->emit_ib_size;
 
 	r = amdgpu_ring_alloc(ring, alloc_size);
 	if (r) {
 		dev_err(adev->dev, "scheduling IB failed (%d).\n", r);
-		return r;
+		goto exit;
 	}
 
 	need_ctx_switch = ring->current_ctx != fence_ctx;
@@ -205,7 +209,7 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, unsigned num_ibs,
 		r = amdgpu_vm_flush(ring, job, need_pipe_sync);
 		if (r) {
 			amdgpu_ring_undo(ring);
-			return r;
+			goto exit;
 		}
 	}
 
@@ -286,7 +290,7 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, unsigned num_ibs,
 		if (job && job->vmid)
 			amdgpu_vmid_reset(adev, ring->funcs->vmhub, job->vmid);
 		amdgpu_ring_undo(ring);
-		return r;
+		goto exit;
 	}
 
 	if (ring->funcs->insert_end)
@@ -304,7 +308,10 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, unsigned num_ibs,
 		ring->funcs->emit_wave_limit(ring, false);
 
 	amdgpu_ring_commit(ring);
-	return 0;
+
+exit:
+	drm_dev_exit(idx);
+	return r;
 }
 
 /**
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
index 9e769cf6095b..bb6afee61666 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
@@ -25,6 +25,7 @@
 
 #include <linux/firmware.h>
 #include <linux/dma-mapping.h>
+#include <drm/drm_drv.h>
 
 #include "amdgpu.h"
 #include "amdgpu_psp.h"
@@ -39,6 +40,8 @@
 #include "amdgpu_ras.h"
 #include "amdgpu_securedisplay.h"
 
+#include <drm/drm_drv.h>
+
 static int psp_sysfs_init(struct amdgpu_device *adev);
 static void psp_sysfs_fini(struct amdgpu_device *adev);
 
@@ -253,7 +256,7 @@ psp_cmd_submit_buf(struct psp_context *psp,
 		   struct psp_gfx_cmd_resp *cmd, uint64_t fence_mc_addr)
 {
 	int ret;
-	int index;
+	int index, idx;
 	int timeout = 20000;
 	bool ras_intr = false;
 	bool skip_unsupport = false;
@@ -261,6 +264,9 @@ psp_cmd_submit_buf(struct psp_context *psp,
 	if (psp->adev->in_pci_err_recovery)
 		return 0;
 
+	if (!drm_dev_enter(&psp->adev->ddev, &idx))
+		return 0;
+
 	mutex_lock(&psp->mutex);
 
 	memset(psp->cmd_buf_mem, 0, PSP_CMD_BUFFER_SIZE);
@@ -271,8 +277,7 @@ psp_cmd_submit_buf(struct psp_context *psp,
 	ret = psp_ring_cmd_submit(psp, psp->cmd_buf_mc_addr, fence_mc_addr, index);
 	if (ret) {
 		atomic_dec(&psp->fence_value);
-		mutex_unlock(&psp->mutex);
-		return ret;
+		goto exit;
 	}
 
 	amdgpu_asic_invalidate_hdp(psp->adev, NULL);
@@ -312,8 +317,8 @@ psp_cmd_submit_buf(struct psp_context *psp,
 			 psp->cmd_buf_mem->cmd_id,
 			 psp->cmd_buf_mem->resp.status);
 		if (!timeout) {
-			mutex_unlock(&psp->mutex);
-			return -EINVAL;
+			ret = -EINVAL;
+			goto exit;
 		}
 	}
 
@@ -321,8 +326,10 @@ psp_cmd_submit_buf(struct psp_context *psp,
 		ucode->tmr_mc_addr_lo = psp->cmd_buf_mem->resp.fw_addr_lo;
 		ucode->tmr_mc_addr_hi = psp->cmd_buf_mem->resp.fw_addr_hi;
 	}
-	mutex_unlock(&psp->mutex);
 
+exit:
+	mutex_unlock(&psp->mutex);
+	drm_dev_exit(idx);
 	return ret;
 }
 
@@ -359,8 +366,7 @@ static int psp_load_toc(struct psp_context *psp,
 	if (!cmd)
 		return -ENOMEM;
 	/* Copy toc to psp firmware private buffer */
-	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
-	memcpy(psp->fw_pri_buf, psp->toc_start_addr, psp->toc_bin_size);
+	psp_copy_fw(psp, psp->toc_start_addr, psp->toc_bin_size);
 
 	psp_prep_load_toc_cmd_buf(cmd, psp->fw_pri_mc_addr, psp->toc_bin_size);
 
@@ -625,8 +631,7 @@ static int psp_asd_load(struct psp_context *psp)
 	if (!cmd)
 		return -ENOMEM;
 
-	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
-	memcpy(psp->fw_pri_buf, psp->asd_start_addr, psp->asd_ucode_size);
+	psp_copy_fw(psp, psp->asd_start_addr, psp->asd_ucode_size);
 
 	psp_prep_asd_load_cmd_buf(cmd, psp->fw_pri_mc_addr,
 				  psp->asd_ucode_size);
@@ -781,8 +786,7 @@ static int psp_xgmi_load(struct psp_context *psp)
 	if (!cmd)
 		return -ENOMEM;
 
-	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
-	memcpy(psp->fw_pri_buf, psp->ta_xgmi_start_addr, psp->ta_xgmi_ucode_size);
+	psp_copy_fw(psp, psp->ta_xgmi_start_addr, psp->ta_xgmi_ucode_size);
 
 	psp_prep_ta_load_cmd_buf(cmd,
 				 psp->fw_pri_mc_addr,
@@ -1038,8 +1042,7 @@ static int psp_ras_load(struct psp_context *psp)
 	if (!cmd)
 		return -ENOMEM;
 
-	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
-	memcpy(psp->fw_pri_buf, psp->ta_ras_start_addr, psp->ta_ras_ucode_size);
+	psp_copy_fw(psp, psp->ta_ras_start_addr, psp->ta_ras_ucode_size);
 
 	psp_prep_ta_load_cmd_buf(cmd,
 				 psp->fw_pri_mc_addr,
@@ -1275,8 +1278,7 @@ static int psp_hdcp_load(struct psp_context *psp)
 	if (!cmd)
 		return -ENOMEM;
 
-	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
-	memcpy(psp->fw_pri_buf, psp->ta_hdcp_start_addr,
+	psp_copy_fw(psp, psp->ta_hdcp_start_addr,
 	       psp->ta_hdcp_ucode_size);
 
 	psp_prep_ta_load_cmd_buf(cmd,
@@ -1427,8 +1429,7 @@ static int psp_dtm_load(struct psp_context *psp)
 	if (!cmd)
 		return -ENOMEM;
 
-	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
-	memcpy(psp->fw_pri_buf, psp->ta_dtm_start_addr, psp->ta_dtm_ucode_size);
+	psp_copy_fw(psp, psp->ta_dtm_start_addr, psp->ta_dtm_ucode_size);
 
 	psp_prep_ta_load_cmd_buf(cmd,
 				 psp->fw_pri_mc_addr,
@@ -1573,8 +1574,7 @@ static int psp_rap_load(struct psp_context *psp)
 	if (!cmd)
 		return -ENOMEM;
 
-	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
-	memcpy(psp->fw_pri_buf, psp->ta_rap_start_addr, psp->ta_rap_ucode_size);
+	psp_copy_fw(psp, psp->ta_rap_start_addr, psp->ta_rap_ucode_size);
 
 	psp_prep_ta_load_cmd_buf(cmd,
 				 psp->fw_pri_mc_addr,
@@ -3022,7 +3022,7 @@ static ssize_t psp_usbc_pd_fw_sysfs_write(struct device *dev,
 	struct amdgpu_device *adev = drm_to_adev(ddev);
 	void *cpu_addr;
 	dma_addr_t dma_addr;
-	int ret;
+	int ret, idx;
 	char fw_name[100];
 	const struct firmware *usbc_pd_fw;
 
@@ -3031,6 +3031,9 @@ static ssize_t psp_usbc_pd_fw_sysfs_write(struct device *dev,
 		return -EBUSY;
 	}
 
+	if (!drm_dev_enter(ddev, &idx))
+		return -ENODEV;
+
 	snprintf(fw_name, sizeof(fw_name), "amdgpu/%s", buf);
 	ret = request_firmware(&usbc_pd_fw, fw_name, adev->dev);
 	if (ret)
@@ -3062,16 +3065,30 @@ static ssize_t psp_usbc_pd_fw_sysfs_write(struct device *dev,
 rel_buf:
 	dma_free_coherent(adev->dev, usbc_pd_fw->size, cpu_addr, dma_addr);
 	release_firmware(usbc_pd_fw);
-
 fail:
 	if (ret) {
 		DRM_ERROR("Failed to load USBC PD FW, err = %d", ret);
-		return ret;
+		count = ret;
 	}
 
+	drm_dev_exit(idx);
 	return count;
 }
 
+void psp_copy_fw(struct psp_context *psp, uint8_t *start_addr, uint32_t bin_size)
+{
+	int idx;
+
+	if (!drm_dev_enter(&psp->adev->ddev, &idx))
+		return;
+
+	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
+	memcpy(psp->fw_pri_buf, start_addr, bin_size);
+
+	drm_dev_exit(idx);
+}
+
+
 static DEVICE_ATTR(usbc_pd_fw, S_IRUGO | S_IWUSR,
 		   psp_usbc_pd_fw_sysfs_read,
 		   psp_usbc_pd_fw_sysfs_write);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
index 46a5328e00e0..2bfdc278817f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
@@ -423,4 +423,6 @@ int psp_get_fw_attestation_records_addr(struct psp_context *psp,
 
 int psp_load_fw_list(struct psp_context *psp,
 		     struct amdgpu_firmware_info **ucode_list, int ucode_count);
+void psp_copy_fw(struct psp_context *psp, uint8_t *start_addr, uint32_t bin_size);
+
 #endif
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
index 688624ebe421..e1985bc34436 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
@@ -35,6 +35,8 @@
 #include "amdgpu.h"
 #include "atom.h"
 
+#include <drm/drm_drv.h>
+
 /*
  * Rings
  * Most engines on the GPU are fed via ring buffers.  Ring
@@ -461,3 +463,71 @@ int amdgpu_ring_test_helper(struct amdgpu_ring *ring)
 	ring->sched.ready = !r;
 	return r;
 }
+
+void amdgpu_ring_clear_ring(struct amdgpu_ring *ring)
+{
+	int idx;
+	int i = 0;
+
+	if (!drm_dev_enter(&ring->adev->ddev, &idx))
+		return;
+
+	while (i <= ring->buf_mask)
+		ring->ring[i++] = ring->funcs->nop;
+
+	drm_dev_exit(idx);
+
+}
+
+void amdgpu_ring_write(struct amdgpu_ring *ring, uint32_t v)
+{
+	int idx;
+
+	if (!drm_dev_enter(&ring->adev->ddev, &idx))
+		return;
+
+	if (ring->count_dw <= 0)
+		DRM_ERROR("amdgpu: writing more dwords to the ring than expected!\n");
+	ring->ring[ring->wptr++ & ring->buf_mask] = v;
+	ring->wptr &= ring->ptr_mask;
+	ring->count_dw--;
+
+	drm_dev_exit(idx);
+}
+
+void amdgpu_ring_write_multiple(struct amdgpu_ring *ring,
+					      void *src, int count_dw)
+{
+	unsigned occupied, chunk1, chunk2;
+	void *dst;
+	int idx;
+
+	if (!drm_dev_enter(&ring->adev->ddev, &idx))
+		return;
+
+	if (unlikely(ring->count_dw < count_dw))
+		DRM_ERROR("amdgpu: writing more dwords to the ring than expected!\n");
+
+	occupied = ring->wptr & ring->buf_mask;
+	dst = (void *)&ring->ring[occupied];
+	chunk1 = ring->buf_mask + 1 - occupied;
+	chunk1 = (chunk1 >= count_dw) ? count_dw: chunk1;
+	chunk2 = count_dw - chunk1;
+	chunk1 <<= 2;
+	chunk2 <<= 2;
+
+	if (chunk1)
+		memcpy(dst, src, chunk1);
+
+	if (chunk2) {
+		src += chunk1;
+		dst = (void *)ring->ring;
+		memcpy(dst, src, chunk2);
+	}
+
+	ring->wptr += count_dw;
+	ring->wptr &= ring->ptr_mask;
+	ring->count_dw -= count_dw;
+
+	drm_dev_exit(idx);
+}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
index e7d3d0dbdd96..c67bc6d3d039 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
@@ -299,53 +299,12 @@ static inline void amdgpu_ring_set_preempt_cond_exec(struct amdgpu_ring *ring,
 	*ring->cond_exe_cpu_addr = cond_exec;
 }
 
-static inline void amdgpu_ring_clear_ring(struct amdgpu_ring *ring)
-{
-	int i = 0;
-	while (i <= ring->buf_mask)
-		ring->ring[i++] = ring->funcs->nop;
-
-}
-
-static inline void amdgpu_ring_write(struct amdgpu_ring *ring, uint32_t v)
-{
-	if (ring->count_dw <= 0)
-		DRM_ERROR("amdgpu: writing more dwords to the ring than expected!\n");
-	ring->ring[ring->wptr++ & ring->buf_mask] = v;
-	ring->wptr &= ring->ptr_mask;
-	ring->count_dw--;
-}
+void amdgpu_ring_clear_ring(struct amdgpu_ring *ring);
 
-static inline void amdgpu_ring_write_multiple(struct amdgpu_ring *ring,
-					      void *src, int count_dw)
-{
-	unsigned occupied, chunk1, chunk2;
-	void *dst;
-
-	if (unlikely(ring->count_dw < count_dw))
-		DRM_ERROR("amdgpu: writing more dwords to the ring than expected!\n");
-
-	occupied = ring->wptr & ring->buf_mask;
-	dst = (void *)&ring->ring[occupied];
-	chunk1 = ring->buf_mask + 1 - occupied;
-	chunk1 = (chunk1 >= count_dw) ? count_dw: chunk1;
-	chunk2 = count_dw - chunk1;
-	chunk1 <<= 2;
-	chunk2 <<= 2;
+void amdgpu_ring_write(struct amdgpu_ring *ring, uint32_t v);
 
-	if (chunk1)
-		memcpy(dst, src, chunk1);
-
-	if (chunk2) {
-		src += chunk1;
-		dst = (void *)ring->ring;
-		memcpy(dst, src, chunk2);
-	}
-
-	ring->wptr += count_dw;
-	ring->wptr &= ring->ptr_mask;
-	ring->count_dw -= count_dw;
-}
+void amdgpu_ring_write_multiple(struct amdgpu_ring *ring,
+					      void *src, int count_dw);
 
 int amdgpu_ring_test_helper(struct amdgpu_ring *ring);
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
index c6dbc0801604..82f0542c7792 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
@@ -32,6 +32,7 @@
 #include <linux/module.h>
 
 #include <drm/drm.h>
+#include <drm/drm_drv.h>
 
 #include "amdgpu.h"
 #include "amdgpu_pm.h"
@@ -375,7 +376,7 @@ int amdgpu_uvd_suspend(struct amdgpu_device *adev)
 {
 	unsigned size;
 	void *ptr;
-	int i, j;
+	int i, j, idx;
 	bool in_ras_intr = amdgpu_ras_intr_triggered();
 
 	cancel_delayed_work_sync(&adev->uvd.idle_work);
@@ -403,11 +404,15 @@ int amdgpu_uvd_suspend(struct amdgpu_device *adev)
 		if (!adev->uvd.inst[j].saved_bo)
 			return -ENOMEM;
 
-		/* re-write 0 since err_event_athub will corrupt VCPU buffer */
-		if (in_ras_intr)
-			memset(adev->uvd.inst[j].saved_bo, 0, size);
-		else
-			memcpy_fromio(adev->uvd.inst[j].saved_bo, ptr, size);
+		if (drm_dev_enter(&adev->ddev, &idx)) {
+			/* re-write 0 since err_event_athub will corrupt VCPU buffer */
+			if (in_ras_intr)
+				memset(adev->uvd.inst[j].saved_bo, 0, size);
+			else
+				memcpy_fromio(adev->uvd.inst[j].saved_bo, ptr, size);
+
+			drm_dev_exit(idx);
+		}
 	}
 
 	if (in_ras_intr)
@@ -420,7 +425,7 @@ int amdgpu_uvd_resume(struct amdgpu_device *adev)
 {
 	unsigned size;
 	void *ptr;
-	int i;
+	int i, idx;
 
 	for (i = 0; i < adev->uvd.num_uvd_inst; i++) {
 		if (adev->uvd.harvest_config & (1 << i))
@@ -432,7 +437,10 @@ int amdgpu_uvd_resume(struct amdgpu_device *adev)
 		ptr = adev->uvd.inst[i].cpu_addr;
 
 		if (adev->uvd.inst[i].saved_bo != NULL) {
-			memcpy_toio(ptr, adev->uvd.inst[i].saved_bo, size);
+			if (drm_dev_enter(&adev->ddev, &idx)) {
+				memcpy_toio(ptr, adev->uvd.inst[i].saved_bo, size);
+				drm_dev_exit(idx);
+			}
 			kvfree(adev->uvd.inst[i].saved_bo);
 			adev->uvd.inst[i].saved_bo = NULL;
 		} else {
@@ -442,8 +450,11 @@ int amdgpu_uvd_resume(struct amdgpu_device *adev)
 			hdr = (const struct common_firmware_header *)adev->uvd.fw->data;
 			if (adev->firmware.load_type != AMDGPU_FW_LOAD_PSP) {
 				offset = le32_to_cpu(hdr->ucode_array_offset_bytes);
-				memcpy_toio(adev->uvd.inst[i].cpu_addr, adev->uvd.fw->data + offset,
-					    le32_to_cpu(hdr->ucode_size_bytes));
+				if (drm_dev_enter(&adev->ddev, &idx)) {
+					memcpy_toio(adev->uvd.inst[i].cpu_addr, adev->uvd.fw->data + offset,
+						    le32_to_cpu(hdr->ucode_size_bytes));
+					drm_dev_exit(idx);
+				}
 				size -= le32_to_cpu(hdr->ucode_size_bytes);
 				ptr += le32_to_cpu(hdr->ucode_size_bytes);
 			}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
index ea6a62f67e38..833203401ef4 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
@@ -29,6 +29,7 @@
 #include <linux/module.h>
 
 #include <drm/drm.h>
+#include <drm/drm_drv.h>
 
 #include "amdgpu.h"
 #include "amdgpu_pm.h"
@@ -293,7 +294,7 @@ int amdgpu_vce_resume(struct amdgpu_device *adev)
 	void *cpu_addr;
 	const struct common_firmware_header *hdr;
 	unsigned offset;
-	int r;
+	int r, idx;
 
 	if (adev->vce.vcpu_bo == NULL)
 		return -EINVAL;
@@ -313,8 +314,12 @@ int amdgpu_vce_resume(struct amdgpu_device *adev)
 
 	hdr = (const struct common_firmware_header *)adev->vce.fw->data;
 	offset = le32_to_cpu(hdr->ucode_array_offset_bytes);
-	memcpy_toio(cpu_addr, adev->vce.fw->data + offset,
-		    adev->vce.fw->size - offset);
+
+	if (drm_dev_enter(&adev->ddev, &idx)) {
+		memcpy_toio(cpu_addr, adev->vce.fw->data + offset,
+			    adev->vce.fw->size - offset);
+		drm_dev_exit(idx);
+	}
 
 	amdgpu_bo_kunmap(adev->vce.vcpu_bo);
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
index 201645963ba5..21f7d3644d70 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
@@ -27,6 +27,7 @@
 #include <linux/firmware.h>
 #include <linux/module.h>
 #include <linux/pci.h>
+#include <drm/drm_drv.h>
 
 #include "amdgpu.h"
 #include "amdgpu_pm.h"
@@ -275,7 +276,7 @@ int amdgpu_vcn_suspend(struct amdgpu_device *adev)
 {
 	unsigned size;
 	void *ptr;
-	int i;
+	int i, idx;
 
 	cancel_delayed_work_sync(&adev->vcn.idle_work);
 
@@ -292,7 +293,10 @@ int amdgpu_vcn_suspend(struct amdgpu_device *adev)
 		if (!adev->vcn.inst[i].saved_bo)
 			return -ENOMEM;
 
-		memcpy_fromio(adev->vcn.inst[i].saved_bo, ptr, size);
+		if (drm_dev_enter(&adev->ddev, &idx)) {
+			memcpy_fromio(adev->vcn.inst[i].saved_bo, ptr, size);
+			drm_dev_exit(idx);
+		}
 	}
 	return 0;
 }
@@ -301,7 +305,7 @@ int amdgpu_vcn_resume(struct amdgpu_device *adev)
 {
 	unsigned size;
 	void *ptr;
-	int i;
+	int i, idx;
 
 	for (i = 0; i < adev->vcn.num_vcn_inst; ++i) {
 		if (adev->vcn.harvest_config & (1 << i))
@@ -313,7 +317,10 @@ int amdgpu_vcn_resume(struct amdgpu_device *adev)
 		ptr = adev->vcn.inst[i].cpu_addr;
 
 		if (adev->vcn.inst[i].saved_bo != NULL) {
-			memcpy_toio(ptr, adev->vcn.inst[i].saved_bo, size);
+			if (drm_dev_enter(&adev->ddev, &idx)) {
+				memcpy_toio(ptr, adev->vcn.inst[i].saved_bo, size);
+				drm_dev_exit(idx);
+			}
 			kvfree(adev->vcn.inst[i].saved_bo);
 			adev->vcn.inst[i].saved_bo = NULL;
 		} else {
@@ -323,8 +330,11 @@ int amdgpu_vcn_resume(struct amdgpu_device *adev)
 			hdr = (const struct common_firmware_header *)adev->vcn.fw->data;
 			if (adev->firmware.load_type != AMDGPU_FW_LOAD_PSP) {
 				offset = le32_to_cpu(hdr->ucode_array_offset_bytes);
-				memcpy_toio(adev->vcn.inst[i].cpu_addr, adev->vcn.fw->data + offset,
-					    le32_to_cpu(hdr->ucode_size_bytes));
+				if (drm_dev_enter(&adev->ddev, &idx)) {
+					memcpy_toio(adev->vcn.inst[i].cpu_addr, adev->vcn.fw->data + offset,
+						    le32_to_cpu(hdr->ucode_size_bytes));
+					drm_dev_exit(idx);
+				}
 				size -= le32_to_cpu(hdr->ucode_size_bytes);
 				ptr += le32_to_cpu(hdr->ucode_size_bytes);
 			}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 9f868cf3b832..7dd5f10ab570 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -32,6 +32,7 @@
 #include <linux/dma-buf.h>
 
 #include <drm/amdgpu_drm.h>
+#include <drm/drm_drv.h>
 #include "amdgpu.h"
 #include "amdgpu_trace.h"
 #include "amdgpu_amdkfd.h"
@@ -1606,7 +1607,10 @@ static int amdgpu_vm_bo_update_mapping(struct amdgpu_device *adev,
 	struct amdgpu_vm_update_params params;
 	enum amdgpu_sync_mode sync_mode;
 	uint64_t pfn;
-	int r;
+	int r, idx;
+
+	if (!drm_dev_enter(&adev->ddev, &idx))
+		return -ENODEV;
 
 	memset(&params, 0, sizeof(params));
 	params.adev = adev;
@@ -1715,6 +1719,7 @@ static int amdgpu_vm_bo_update_mapping(struct amdgpu_device *adev,
 
 error_unlock:
 	amdgpu_vm_eviction_unlock(vm);
+	drm_dev_exit(idx);
 	return r;
 }
 
diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c b/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
index 589410c32d09..2cec71e823f5 100644
--- a/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
@@ -23,6 +23,7 @@
 #include <linux/firmware.h>
 #include <linux/module.h>
 #include <linux/vmalloc.h>
+#include <drm/drm_drv.h>
 
 #include "amdgpu.h"
 #include "amdgpu_psp.h"
@@ -269,10 +270,8 @@ static int psp_v11_0_bootloader_load_kdb(struct psp_context *psp)
 	if (ret)
 		return ret;
 
-	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
-
 	/* Copy PSP KDB binary to memory */
-	memcpy(psp->fw_pri_buf, psp->kdb_start_addr, psp->kdb_bin_size);
+	psp_copy_fw(psp, psp->kdb_start_addr, psp->kdb_bin_size);
 
 	/* Provide the PSP KDB to bootloader */
 	WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
@@ -302,10 +301,8 @@ static int psp_v11_0_bootloader_load_spl(struct psp_context *psp)
 	if (ret)
 		return ret;
 
-	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
-
 	/* Copy PSP SPL binary to memory */
-	memcpy(psp->fw_pri_buf, psp->spl_start_addr, psp->spl_bin_size);
+	psp_copy_fw(psp, psp->spl_start_addr, psp->spl_bin_size);
 
 	/* Provide the PSP SPL to bootloader */
 	WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
@@ -335,10 +332,8 @@ static int psp_v11_0_bootloader_load_sysdrv(struct psp_context *psp)
 	if (ret)
 		return ret;
 
-	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
-
 	/* Copy PSP System Driver binary to memory */
-	memcpy(psp->fw_pri_buf, psp->sys_start_addr, psp->sys_bin_size);
+	psp_copy_fw(psp, psp->sys_start_addr, psp->sys_bin_size);
 
 	/* Provide the sys driver to bootloader */
 	WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
@@ -371,10 +366,8 @@ static int psp_v11_0_bootloader_load_sos(struct psp_context *psp)
 	if (ret)
 		return ret;
 
-	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
-
 	/* Copy Secure OS binary to PSP memory */
-	memcpy(psp->fw_pri_buf, psp->sos_start_addr, psp->sos_bin_size);
+	psp_copy_fw(psp, psp->sos_start_addr, psp->sos_bin_size);
 
 	/* Provide the PSP secure OS to bootloader */
 	WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
@@ -608,7 +601,7 @@ static int psp_v11_0_memory_training(struct psp_context *psp, uint32_t ops)
 	uint32_t p2c_header[4];
 	uint32_t sz;
 	void *buf;
-	int ret;
+	int ret, idx;
 
 	if (ctx->init == PSP_MEM_TRAIN_NOT_SUPPORT) {
 		DRM_DEBUG("Memory training is not supported.\n");
@@ -681,17 +674,24 @@ static int psp_v11_0_memory_training(struct psp_context *psp, uint32_t ops)
 			return -ENOMEM;
 		}
 
-		memcpy_fromio(buf, adev->mman.aper_base_kaddr, sz);
-		ret = psp_v11_0_memory_training_send_msg(psp, PSP_BL__DRAM_LONG_TRAIN);
-		if (ret) {
-			DRM_ERROR("Send long training msg failed.\n");
+		if (drm_dev_enter(&adev->ddev, &idx)) {
+			memcpy_fromio(buf, adev->mman.aper_base_kaddr, sz);
+			ret = psp_v11_0_memory_training_send_msg(psp, PSP_BL__DRAM_LONG_TRAIN);
+			if (ret) {
+				DRM_ERROR("Send long training msg failed.\n");
+				vfree(buf);
+				drm_dev_exit(idx);
+				return ret;
+			}
+
+			memcpy_toio(adev->mman.aper_base_kaddr, buf, sz);
+			adev->hdp.funcs->flush_hdp(adev, NULL);
 			vfree(buf);
-			return ret;
+			drm_dev_exit(idx);
+		} else {
+			vfree(buf);
+			return -ENODEV;
 		}
-
-		memcpy_toio(adev->mman.aper_base_kaddr, buf, sz);
-		adev->hdp.funcs->flush_hdp(adev, NULL);
-		vfree(buf);
 	}
 
 	if (ops & PSP_MEM_TRAIN_SAVE) {
diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c b/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
index c4828bd3264b..618e5b6b85d9 100644
--- a/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
@@ -138,10 +138,8 @@ static int psp_v12_0_bootloader_load_sysdrv(struct psp_context *psp)
 	if (ret)
 		return ret;
 
-	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
-
 	/* Copy PSP System Driver binary to memory */
-	memcpy(psp->fw_pri_buf, psp->sys_start_addr, psp->sys_bin_size);
+	psp_copy_fw(psp, psp->sys_start_addr, psp->sys_bin_size);
 
 	/* Provide the sys driver to bootloader */
 	WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
@@ -179,10 +177,8 @@ static int psp_v12_0_bootloader_load_sos(struct psp_context *psp)
 	if (ret)
 		return ret;
 
-	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
-
 	/* Copy Secure OS binary to PSP memory */
-	memcpy(psp->fw_pri_buf, psp->sos_start_addr, psp->sos_bin_size);
+	psp_copy_fw(psp, psp->sos_start_addr, psp->sos_bin_size);
 
 	/* Provide the PSP secure OS to bootloader */
 	WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c b/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
index f2e725f72d2f..d0a6cccd0897 100644
--- a/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
+++ b/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
@@ -102,10 +102,8 @@ static int psp_v3_1_bootloader_load_sysdrv(struct psp_context *psp)
 	if (ret)
 		return ret;
 
-	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
-
 	/* Copy PSP System Driver binary to memory */
-	memcpy(psp->fw_pri_buf, psp->sys_start_addr, psp->sys_bin_size);
+	psp_copy_fw(psp, psp->sys_start_addr, psp->sys_bin_size);
 
 	/* Provide the sys driver to bootloader */
 	WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
@@ -143,10 +141,8 @@ static int psp_v3_1_bootloader_load_sos(struct psp_context *psp)
 	if (ret)
 		return ret;
 
-	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
-
 	/* Copy Secure OS binary to PSP memory */
-	memcpy(psp->fw_pri_buf, psp->sos_start_addr, psp->sos_bin_size);
+	psp_copy_fw(psp, psp->sos_start_addr, psp->sos_bin_size);
 
 	/* Provide the PSP secure OS to bootloader */
 	WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
index 8e238dea7bef..90910d19db12 100644
--- a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
@@ -25,6 +25,7 @@
  */
 
 #include <linux/firmware.h>
+#include <drm/drm_drv.h>
 
 #include "amdgpu.h"
 #include "amdgpu_vce.h"
@@ -555,16 +556,19 @@ static int vce_v4_0_hw_fini(void *handle)
 static int vce_v4_0_suspend(void *handle)
 {
 	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
-	int r;
+	int r, idx;
 
 	if (adev->vce.vcpu_bo == NULL)
 		return 0;
 
-	if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) {
-		unsigned size = amdgpu_bo_size(adev->vce.vcpu_bo);
-		void *ptr = adev->vce.cpu_addr;
+	if (drm_dev_enter(&adev->ddev, &idx)) {
+		if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) {
+			unsigned size = amdgpu_bo_size(adev->vce.vcpu_bo);
+			void *ptr = adev->vce.cpu_addr;
 
-		memcpy_fromio(adev->vce.saved_bo, ptr, size);
+			memcpy_fromio(adev->vce.saved_bo, ptr, size);
+		}
+		drm_dev_exit(idx);
 	}
 
 	r = vce_v4_0_hw_fini(adev);
@@ -577,16 +581,20 @@ static int vce_v4_0_suspend(void *handle)
 static int vce_v4_0_resume(void *handle)
 {
 	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
-	int r;
+	int r, idx;
 
 	if (adev->vce.vcpu_bo == NULL)
 		return -EINVAL;
 
 	if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) {
-		unsigned size = amdgpu_bo_size(adev->vce.vcpu_bo);
-		void *ptr = adev->vce.cpu_addr;
 
-		memcpy_toio(ptr, adev->vce.saved_bo, size);
+		if (drm_dev_enter(&adev->ddev, &idx)) {
+			unsigned size = amdgpu_bo_size(adev->vce.vcpu_bo);
+			void *ptr = adev->vce.cpu_addr;
+
+			memcpy_toio(ptr, adev->vce.saved_bo, size);
+			drm_dev_exit(idx);
+		}
 	} else {
 		r = amdgpu_vce_resume(adev);
 		if (r)
diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c b/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
index 3f15bf34123a..df34be8ec82d 100644
--- a/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
@@ -34,6 +34,8 @@
 #include "vcn/vcn_3_0_0_sh_mask.h"
 #include "ivsrcid/vcn/irqsrcs_vcn_2_0.h"
 
+#include <drm/drm_drv.h>
+
 #define mmUVD_CONTEXT_ID_INTERNAL_OFFSET			0x27
 #define mmUVD_GPCOM_VCPU_CMD_INTERNAL_OFFSET			0x0f
 #define mmUVD_GPCOM_VCPU_DATA0_INTERNAL_OFFSET			0x10
@@ -268,16 +270,20 @@ static int vcn_v3_0_sw_init(void *handle)
 static int vcn_v3_0_sw_fini(void *handle)
 {
 	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
-	int i, r;
+	int i, r, idx;
 
-	for (i = 0; i < adev->vcn.num_vcn_inst; i++) {
-		volatile struct amdgpu_fw_shared *fw_shared;
+	if (drm_dev_enter(&adev->ddev, &idx)) {
+		for (i = 0; i < adev->vcn.num_vcn_inst; i++) {
+			volatile struct amdgpu_fw_shared *fw_shared;
 
-		if (adev->vcn.harvest_config & (1 << i))
-			continue;
-		fw_shared = adev->vcn.inst[i].fw_shared_cpu_addr;
-		fw_shared->present_flag_0 = 0;
-		fw_shared->sw_ring.is_enabled = false;
+			if (adev->vcn.harvest_config & (1 << i))
+				continue;
+			fw_shared = adev->vcn.inst[i].fw_shared_cpu_addr;
+			fw_shared->present_flag_0 = 0;
+			fw_shared->sw_ring.is_enabled = false;
+		}
+
+		drm_dev_exit(idx);
 	}
 
 	if (amdgpu_sriov_vf(adev))
diff --git a/drivers/gpu/drm/amd/pm/powerplay/smumgr/smu7_smumgr.c b/drivers/gpu/drm/amd/pm/powerplay/smumgr/smu7_smumgr.c
index aae25243eb10..d628b91846c9 100644
--- a/drivers/gpu/drm/amd/pm/powerplay/smumgr/smu7_smumgr.c
+++ b/drivers/gpu/drm/amd/pm/powerplay/smumgr/smu7_smumgr.c
@@ -405,6 +405,8 @@ int smu7_request_smu_load_fw(struct pp_hwmgr *hwmgr)
 				UCODE_ID_MEC_STORAGE, &toc->entry[toc->num_entries++]),
 				"Failed to Get Firmware Entry.", r = -EINVAL; goto failed);
 	}
+
+	/* AG TODO: Can't call drm_dev_enter/exit here because adev->ddev can't be accessed from here ... */
 	memcpy_toio(smu_data->header_buffer.kaddr, smu_data->toc,
 		    sizeof(struct SMU_DRAMData_TOC));
 	smum_send_msg_to_smc_with_parameter(hwmgr,
-- 
2.25.1



* [PATCH v6 10/16] drm/amdgpu: Guard against write accesses after device removal
@ 2021-05-10 16:36   ` Andrey Grodzovsky
  0 siblings, 0 replies; 126+ messages in thread
From: Andrey Grodzovsky @ 2021-05-10 16:36 UTC (permalink / raw)
  To: dri-devel, amd-gfx, linux-pci, ckoenig.leichtzumerken,
	daniel.vetter, Harry.Wentland
  Cc: gregkh, Felix.Kuehling, helgaas, Alexander.Deucher

This should prevent writing to memory or IO ranges possibly
already allocated for other uses after our device is removed.

v5:
Protect more places where memcpy_toio/memcpy_fromio takes place
Protect IB submissions

v6: Switch to an early return on !drm_dev_enter() instead of
wrapping the entire code in braces.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
---
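Illustration only, not part of the change: a minimal userspace sketch of
the early-return guard described in the v6 note, combined with the single
exit label used in amdgpu_ib_schedule() and psp_cmd_submit_buf(). All
names here (struct mock_dev, mock_dev_enter, mock_dev_exit, guarded_copy)
are hypothetical stand-ins, not the kernel API. In particular the real
drm_dev_exit() takes only the index, and the real helpers are SRCU based
rather than a plain counter.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct drm_device: tracks only unplug state. */
struct mock_dev {
	bool unplugged;        /* set once the device has been removed */
	int active_sections;   /* number of currently open enter/exit sections */
};

/* Mock of drm_dev_enter(): refuses to open a section after removal. */
static bool mock_dev_enter(struct mock_dev *dev, int *idx)
{
	if (dev->unplugged)
		return false;
	*idx = dev->active_sections++;
	return true;
}

/* Mock of drm_dev_exit(): closes the section opened by mock_dev_enter(). */
static void mock_dev_exit(struct mock_dev *dev, int idx)
{
	(void)idx;
	dev->active_sections--;
}

/*
 * v6 style: return early when the device is gone, and route every later
 * failure through one label so the section is always closed.
 */
static int guarded_copy(struct mock_dev *dev, char *dst, size_t dst_len,
			const char *src, size_t len)
{
	int idx, ret = 0;

	if (!mock_dev_enter(dev, &idx))
		return -ENODEV;

	if (len > dst_len) {
		ret = -EINVAL;
		goto exit;
	}

	memcpy(dst, src, len);

exit:
	mock_dev_exit(dev, idx);
	return ret;
}
```

The early return keeps the guarded body at its original indentation, which
is what the v6 note is about. The cost is that every return after the
enter call has to go through the exit label, otherwise the section is
left open.
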
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c    | 11 ++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c       |  9 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c        | 17 +++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c       | 63 +++++++++++------
 drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h       |  2 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c      | 70 +++++++++++++++++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h      | 49 ++-----------
 drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c       | 31 +++++---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c       | 11 ++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c       | 22 ++++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c        |  7 +-
 drivers/gpu/drm/amd/amdgpu/psp_v11_0.c        | 44 ++++++------
 drivers/gpu/drm/amd/amdgpu/psp_v12_0.c        |  8 +--
 drivers/gpu/drm/amd/amdgpu/psp_v3_1.c         |  8 +--
 drivers/gpu/drm/amd/amdgpu/vce_v4_0.c         | 26 ++++---
 drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c         | 22 +++---
 .../drm/amd/pm/powerplay/smumgr/smu7_smumgr.c |  2 +
 17 files changed, 257 insertions(+), 145 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index a0bff4713672..94c415176cdc 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -71,6 +71,8 @@
 #include <drm/task_barrier.h>
 #include <linux/pm_runtime.h>
 
+#include <drm/drm_drv.h>
+
 MODULE_FIRMWARE("amdgpu/vega10_gpu_info.bin");
 MODULE_FIRMWARE("amdgpu/vega12_gpu_info.bin");
 MODULE_FIRMWARE("amdgpu/raven_gpu_info.bin");
@@ -281,7 +283,10 @@ void amdgpu_device_vram_access(struct amdgpu_device *adev, loff_t pos,
 	unsigned long flags;
 	uint32_t hi = ~0;
 	uint64_t last;
+	int idx;
 
+	 if (!drm_dev_enter(&adev->ddev, &idx))
+		 return;
 
 #ifdef CONFIG_64BIT
 	last = min(pos + size, adev->gmc.visible_vram_size);
@@ -299,8 +304,10 @@ void amdgpu_device_vram_access(struct amdgpu_device *adev, loff_t pos,
 			memcpy_fromio(buf, addr, count);
 		}
 
-		if (count == size)
+		if (count == size) {
+			drm_dev_exit(idx);
 			return;
+		}
 
 		pos += count;
 		buf += count / 4;
@@ -323,6 +330,8 @@ void amdgpu_device_vram_access(struct amdgpu_device *adev, loff_t pos,
 			*buf++ = RREG32_NO_KIQ(mmMM_DATA);
 	}
 	spin_unlock_irqrestore(&adev->mmio_idx_lock, flags);
+
+	drm_dev_exit(idx);
 }
 
 /*
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
index 4d32233cde92..04ba5eef1e88 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
@@ -31,6 +31,8 @@
 #include "amdgpu_ras.h"
 #include "amdgpu_xgmi.h"
 
+#include <drm/drm_drv.h>
+
 /**
  * amdgpu_gmc_pdb0_alloc - allocate vram for pdb0
  *
@@ -151,6 +153,10 @@ int amdgpu_gmc_set_pte_pde(struct amdgpu_device *adev, void *cpu_pt_addr,
 {
 	void __iomem *ptr = (void *)cpu_pt_addr;
 	uint64_t value;
+	int idx;
+
+	if (!drm_dev_enter(&adev->ddev, &idx))
+		return 0;
 
 	/*
 	 * The following is for PTE only. GART does not have PDEs.
@@ -158,6 +164,9 @@ int amdgpu_gmc_set_pte_pde(struct amdgpu_device *adev, void *cpu_pt_addr,
 	value = addr & 0x0000FFFFFFFFF000ULL;
 	value |= flags;
 	writeq(value, ptr + (gpu_page_idx * 8));
+
+	drm_dev_exit(idx);
+
 	return 0;
 }
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
index 148a3b481b12..62fcbd446c71 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
@@ -30,6 +30,7 @@
 #include <linux/slab.h>
 
 #include <drm/amdgpu_drm.h>
+#include <drm/drm_drv.h>
 
 #include "amdgpu.h"
 #include "atom.h"
@@ -137,7 +138,7 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, unsigned num_ibs,
 	bool secure;
 
 	unsigned i;
-	int r = 0;
+	int idx, r = 0;
 	bool need_pipe_sync = false;
 
 	if (num_ibs == 0)
@@ -169,13 +170,16 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, unsigned num_ibs,
 		return -EINVAL;
 	}
 
+	if (!drm_dev_enter(&adev->ddev, &idx))
+		return -ENODEV;
+
 	alloc_size = ring->funcs->emit_frame_size + num_ibs *
 		ring->funcs->emit_ib_size;
 
 	r = amdgpu_ring_alloc(ring, alloc_size);
 	if (r) {
 		dev_err(adev->dev, "scheduling IB failed (%d).\n", r);
-		return r;
+		goto exit;
 	}
 
 	need_ctx_switch = ring->current_ctx != fence_ctx;
@@ -205,7 +209,7 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, unsigned num_ibs,
 		r = amdgpu_vm_flush(ring, job, need_pipe_sync);
 		if (r) {
 			amdgpu_ring_undo(ring);
-			return r;
+			goto exit;
 		}
 	}
 
@@ -286,7 +290,7 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, unsigned num_ibs,
 		if (job && job->vmid)
 			amdgpu_vmid_reset(adev, ring->funcs->vmhub, job->vmid);
 		amdgpu_ring_undo(ring);
-		return r;
+		goto exit;
 	}
 
 	if (ring->funcs->insert_end)
@@ -304,7 +308,10 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, unsigned num_ibs,
 		ring->funcs->emit_wave_limit(ring, false);
 
 	amdgpu_ring_commit(ring);
-	return 0;
+
+exit:
+	drm_dev_exit(idx);
+	return r;
 }
 
 /**
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
index 9e769cf6095b..bb6afee61666 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
@@ -25,6 +25,7 @@
 
 #include <linux/firmware.h>
 #include <linux/dma-mapping.h>
+#include <drm/drm_drv.h>
 
 #include "amdgpu.h"
 #include "amdgpu_psp.h"
@@ -39,6 +40,8 @@
 #include "amdgpu_ras.h"
 #include "amdgpu_securedisplay.h"
 
+#include <drm/drm_drv.h>
+
 static int psp_sysfs_init(struct amdgpu_device *adev);
 static void psp_sysfs_fini(struct amdgpu_device *adev);
 
@@ -253,7 +256,7 @@ psp_cmd_submit_buf(struct psp_context *psp,
 		   struct psp_gfx_cmd_resp *cmd, uint64_t fence_mc_addr)
 {
 	int ret;
-	int index;
+	int index, idx;
 	int timeout = 20000;
 	bool ras_intr = false;
 	bool skip_unsupport = false;
@@ -261,6 +264,9 @@ psp_cmd_submit_buf(struct psp_context *psp,
 	if (psp->adev->in_pci_err_recovery)
 		return 0;
 
+	if (!drm_dev_enter(&psp->adev->ddev, &idx))
+		return 0;
+
 	mutex_lock(&psp->mutex);
 
 	memset(psp->cmd_buf_mem, 0, PSP_CMD_BUFFER_SIZE);
@@ -271,8 +277,7 @@ psp_cmd_submit_buf(struct psp_context *psp,
 	ret = psp_ring_cmd_submit(psp, psp->cmd_buf_mc_addr, fence_mc_addr, index);
 	if (ret) {
 		atomic_dec(&psp->fence_value);
-		mutex_unlock(&psp->mutex);
-		return ret;
+		goto exit;
 	}
 
 	amdgpu_asic_invalidate_hdp(psp->adev, NULL);
@@ -312,8 +317,8 @@ psp_cmd_submit_buf(struct psp_context *psp,
 			 psp->cmd_buf_mem->cmd_id,
 			 psp->cmd_buf_mem->resp.status);
 		if (!timeout) {
-			mutex_unlock(&psp->mutex);
-			return -EINVAL;
+			ret = -EINVAL;
+			goto exit;
 		}
 	}
 
@@ -321,8 +326,10 @@ psp_cmd_submit_buf(struct psp_context *psp,
 		ucode->tmr_mc_addr_lo = psp->cmd_buf_mem->resp.fw_addr_lo;
 		ucode->tmr_mc_addr_hi = psp->cmd_buf_mem->resp.fw_addr_hi;
 	}
-	mutex_unlock(&psp->mutex);
 
+exit:
+	mutex_unlock(&psp->mutex);
+	drm_dev_exit(idx);
 	return ret;
 }
 
@@ -359,8 +366,7 @@ static int psp_load_toc(struct psp_context *psp,
 	if (!cmd)
 		return -ENOMEM;
 	/* Copy toc to psp firmware private buffer */
-	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
-	memcpy(psp->fw_pri_buf, psp->toc_start_addr, psp->toc_bin_size);
+	psp_copy_fw(psp, psp->toc_start_addr, psp->toc_bin_size);
 
 	psp_prep_load_toc_cmd_buf(cmd, psp->fw_pri_mc_addr, psp->toc_bin_size);
 
@@ -625,8 +631,7 @@ static int psp_asd_load(struct psp_context *psp)
 	if (!cmd)
 		return -ENOMEM;
 
-	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
-	memcpy(psp->fw_pri_buf, psp->asd_start_addr, psp->asd_ucode_size);
+	psp_copy_fw(psp, psp->asd_start_addr, psp->asd_ucode_size);
 
 	psp_prep_asd_load_cmd_buf(cmd, psp->fw_pri_mc_addr,
 				  psp->asd_ucode_size);
@@ -781,8 +786,7 @@ static int psp_xgmi_load(struct psp_context *psp)
 	if (!cmd)
 		return -ENOMEM;
 
-	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
-	memcpy(psp->fw_pri_buf, psp->ta_xgmi_start_addr, psp->ta_xgmi_ucode_size);
+	psp_copy_fw(psp, psp->ta_xgmi_start_addr, psp->ta_xgmi_ucode_size);
 
 	psp_prep_ta_load_cmd_buf(cmd,
 				 psp->fw_pri_mc_addr,
@@ -1038,8 +1042,7 @@ static int psp_ras_load(struct psp_context *psp)
 	if (!cmd)
 		return -ENOMEM;
 
-	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
-	memcpy(psp->fw_pri_buf, psp->ta_ras_start_addr, psp->ta_ras_ucode_size);
+	psp_copy_fw(psp, psp->ta_ras_start_addr, psp->ta_ras_ucode_size);
 
 	psp_prep_ta_load_cmd_buf(cmd,
 				 psp->fw_pri_mc_addr,
@@ -1275,8 +1278,7 @@ static int psp_hdcp_load(struct psp_context *psp)
 	if (!cmd)
 		return -ENOMEM;
 
-	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
-	memcpy(psp->fw_pri_buf, psp->ta_hdcp_start_addr,
+	psp_copy_fw(psp, psp->ta_hdcp_start_addr,
 	       psp->ta_hdcp_ucode_size);
 
 	psp_prep_ta_load_cmd_buf(cmd,
@@ -1427,8 +1429,7 @@ static int psp_dtm_load(struct psp_context *psp)
 	if (!cmd)
 		return -ENOMEM;
 
-	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
-	memcpy(psp->fw_pri_buf, psp->ta_dtm_start_addr, psp->ta_dtm_ucode_size);
+	psp_copy_fw(psp, psp->ta_dtm_start_addr, psp->ta_dtm_ucode_size);
 
 	psp_prep_ta_load_cmd_buf(cmd,
 				 psp->fw_pri_mc_addr,
@@ -1573,8 +1574,7 @@ static int psp_rap_load(struct psp_context *psp)
 	if (!cmd)
 		return -ENOMEM;
 
-	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
-	memcpy(psp->fw_pri_buf, psp->ta_rap_start_addr, psp->ta_rap_ucode_size);
+	psp_copy_fw(psp, psp->ta_rap_start_addr, psp->ta_rap_ucode_size);
 
 	psp_prep_ta_load_cmd_buf(cmd,
 				 psp->fw_pri_mc_addr,
@@ -3022,7 +3022,7 @@ static ssize_t psp_usbc_pd_fw_sysfs_write(struct device *dev,
 	struct amdgpu_device *adev = drm_to_adev(ddev);
 	void *cpu_addr;
 	dma_addr_t dma_addr;
-	int ret;
+	int ret, idx;
 	char fw_name[100];
 	const struct firmware *usbc_pd_fw;
 
@@ -3031,6 +3031,9 @@ static ssize_t psp_usbc_pd_fw_sysfs_write(struct device *dev,
 		return -EBUSY;
 	}
 
+	if (!drm_dev_enter(ddev, &idx))
+		return -ENODEV;
+
 	snprintf(fw_name, sizeof(fw_name), "amdgpu/%s", buf);
 	ret = request_firmware(&usbc_pd_fw, fw_name, adev->dev);
 	if (ret)
@@ -3062,16 +3065,30 @@ static ssize_t psp_usbc_pd_fw_sysfs_write(struct device *dev,
 rel_buf:
 	dma_free_coherent(adev->dev, usbc_pd_fw->size, cpu_addr, dma_addr);
 	release_firmware(usbc_pd_fw);
-
 fail:
 	if (ret) {
 		DRM_ERROR("Failed to load USBC PD FW, err = %d", ret);
-		return ret;
+		count = ret;
 	}
 
+	drm_dev_exit(idx);
 	return count;
 }
 
+void psp_copy_fw(struct psp_context *psp, uint8_t *start_addr, uint32_t bin_size)
+{
+	int idx;
+
+	if (!drm_dev_enter(&psp->adev->ddev, &idx))
+		return;
+
+	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
+	memcpy(psp->fw_pri_buf, start_addr, bin_size);
+
+	drm_dev_exit(idx);
+}
+
+
 static DEVICE_ATTR(usbc_pd_fw, S_IRUGO | S_IWUSR,
 		   psp_usbc_pd_fw_sysfs_read,
 		   psp_usbc_pd_fw_sysfs_write);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
index 46a5328e00e0..2bfdc278817f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
@@ -423,4 +423,6 @@ int psp_get_fw_attestation_records_addr(struct psp_context *psp,
 
 int psp_load_fw_list(struct psp_context *psp,
 		     struct amdgpu_firmware_info **ucode_list, int ucode_count);
+void psp_copy_fw(struct psp_context *psp, uint8_t *start_addr, uint32_t bin_size);
+
 #endif
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
index 688624ebe421..e1985bc34436 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
@@ -35,6 +35,8 @@
 #include "amdgpu.h"
 #include "atom.h"
 
+#include <drm/drm_drv.h>
+
 /*
  * Rings
  * Most engines on the GPU are fed via ring buffers.  Ring
@@ -461,3 +463,71 @@ int amdgpu_ring_test_helper(struct amdgpu_ring *ring)
 	ring->sched.ready = !r;
 	return r;
 }
+
+void amdgpu_ring_clear_ring(struct amdgpu_ring *ring)
+{
+	int idx;
+	int i = 0;
+
+	if (!drm_dev_enter(&ring->adev->ddev, &idx))
+		return;
+
+	while (i <= ring->buf_mask)
+		ring->ring[i++] = ring->funcs->nop;
+
+	drm_dev_exit(idx);
+
+}
+
+void amdgpu_ring_write(struct amdgpu_ring *ring, uint32_t v)
+{
+	int idx;
+
+	if (!drm_dev_enter(&ring->adev->ddev, &idx))
+		return;
+
+	if (ring->count_dw <= 0)
+		DRM_ERROR("amdgpu: writing more dwords to the ring than expected!\n");
+	ring->ring[ring->wptr++ & ring->buf_mask] = v;
+	ring->wptr &= ring->ptr_mask;
+	ring->count_dw--;
+
+	drm_dev_exit(idx);
+}
+
+void amdgpu_ring_write_multiple(struct amdgpu_ring *ring,
+					      void *src, int count_dw)
+{
+	unsigned occupied, chunk1, chunk2;
+	void *dst;
+	int idx;
+
+	if (!drm_dev_enter(&ring->adev->ddev, &idx))
+		return;
+
+	if (unlikely(ring->count_dw < count_dw))
+		DRM_ERROR("amdgpu: writing more dwords to the ring than expected!\n");
+
+	occupied = ring->wptr & ring->buf_mask;
+	dst = (void *)&ring->ring[occupied];
+	chunk1 = ring->buf_mask + 1 - occupied;
+	chunk1 = (chunk1 >= count_dw) ? count_dw: chunk1;
+	chunk2 = count_dw - chunk1;
+	chunk1 <<= 2;
+	chunk2 <<= 2;
+
+	if (chunk1)
+		memcpy(dst, src, chunk1);
+
+	if (chunk2) {
+		src += chunk1;
+		dst = (void *)ring->ring;
+		memcpy(dst, src, chunk2);
+	}
+
+	ring->wptr += count_dw;
+	ring->wptr &= ring->ptr_mask;
+	ring->count_dw -= count_dw;
+
+	drm_dev_exit(idx);
+}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
index e7d3d0dbdd96..c67bc6d3d039 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
@@ -299,53 +299,12 @@ static inline void amdgpu_ring_set_preempt_cond_exec(struct amdgpu_ring *ring,
 	*ring->cond_exe_cpu_addr = cond_exec;
 }
 
-static inline void amdgpu_ring_clear_ring(struct amdgpu_ring *ring)
-{
-	int i = 0;
-	while (i <= ring->buf_mask)
-		ring->ring[i++] = ring->funcs->nop;
-
-}
-
-static inline void amdgpu_ring_write(struct amdgpu_ring *ring, uint32_t v)
-{
-	if (ring->count_dw <= 0)
-		DRM_ERROR("amdgpu: writing more dwords to the ring than expected!\n");
-	ring->ring[ring->wptr++ & ring->buf_mask] = v;
-	ring->wptr &= ring->ptr_mask;
-	ring->count_dw--;
-}
+void amdgpu_ring_clear_ring(struct amdgpu_ring *ring);
 
-static inline void amdgpu_ring_write_multiple(struct amdgpu_ring *ring,
-					      void *src, int count_dw)
-{
-	unsigned occupied, chunk1, chunk2;
-	void *dst;
-
-	if (unlikely(ring->count_dw < count_dw))
-		DRM_ERROR("amdgpu: writing more dwords to the ring than expected!\n");
-
-	occupied = ring->wptr & ring->buf_mask;
-	dst = (void *)&ring->ring[occupied];
-	chunk1 = ring->buf_mask + 1 - occupied;
-	chunk1 = (chunk1 >= count_dw) ? count_dw: chunk1;
-	chunk2 = count_dw - chunk1;
-	chunk1 <<= 2;
-	chunk2 <<= 2;
+void amdgpu_ring_write(struct amdgpu_ring *ring, uint32_t v);
 
-	if (chunk1)
-		memcpy(dst, src, chunk1);
-
-	if (chunk2) {
-		src += chunk1;
-		dst = (void *)ring->ring;
-		memcpy(dst, src, chunk2);
-	}
-
-	ring->wptr += count_dw;
-	ring->wptr &= ring->ptr_mask;
-	ring->count_dw -= count_dw;
-}
+void amdgpu_ring_write_multiple(struct amdgpu_ring *ring,
+					      void *src, int count_dw);
 
 int amdgpu_ring_test_helper(struct amdgpu_ring *ring);
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
index c6dbc0801604..82f0542c7792 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
@@ -32,6 +32,7 @@
 #include <linux/module.h>
 
 #include <drm/drm.h>
+#include <drm/drm_drv.h>
 
 #include "amdgpu.h"
 #include "amdgpu_pm.h"
@@ -375,7 +376,7 @@ int amdgpu_uvd_suspend(struct amdgpu_device *adev)
 {
 	unsigned size;
 	void *ptr;
-	int i, j;
+	int i, j, idx;
 	bool in_ras_intr = amdgpu_ras_intr_triggered();
 
 	cancel_delayed_work_sync(&adev->uvd.idle_work);
@@ -403,11 +404,15 @@ int amdgpu_uvd_suspend(struct amdgpu_device *adev)
 		if (!adev->uvd.inst[j].saved_bo)
 			return -ENOMEM;
 
-		/* re-write 0 since err_event_athub will corrupt VCPU buffer */
-		if (in_ras_intr)
-			memset(adev->uvd.inst[j].saved_bo, 0, size);
-		else
-			memcpy_fromio(adev->uvd.inst[j].saved_bo, ptr, size);
+		if (drm_dev_enter(&adev->ddev, &idx)) {
+			/* re-write 0 since err_event_athub will corrupt VCPU buffer */
+			if (in_ras_intr)
+				memset(adev->uvd.inst[j].saved_bo, 0, size);
+			else
+				memcpy_fromio(adev->uvd.inst[j].saved_bo, ptr, size);
+
+			drm_dev_exit(idx);
+		}
 	}
 
 	if (in_ras_intr)
@@ -420,7 +425,7 @@ int amdgpu_uvd_resume(struct amdgpu_device *adev)
 {
 	unsigned size;
 	void *ptr;
-	int i;
+	int i, idx;
 
 	for (i = 0; i < adev->uvd.num_uvd_inst; i++) {
 		if (adev->uvd.harvest_config & (1 << i))
@@ -432,7 +437,10 @@ int amdgpu_uvd_resume(struct amdgpu_device *adev)
 		ptr = adev->uvd.inst[i].cpu_addr;
 
 		if (adev->uvd.inst[i].saved_bo != NULL) {
-			memcpy_toio(ptr, adev->uvd.inst[i].saved_bo, size);
+			if (drm_dev_enter(&adev->ddev, &idx)) {
+				memcpy_toio(ptr, adev->uvd.inst[i].saved_bo, size);
+				drm_dev_exit(idx);
+			}
 			kvfree(adev->uvd.inst[i].saved_bo);
 			adev->uvd.inst[i].saved_bo = NULL;
 		} else {
@@ -442,8 +450,11 @@ int amdgpu_uvd_resume(struct amdgpu_device *adev)
 			hdr = (const struct common_firmware_header *)adev->uvd.fw->data;
 			if (adev->firmware.load_type != AMDGPU_FW_LOAD_PSP) {
 				offset = le32_to_cpu(hdr->ucode_array_offset_bytes);
-				memcpy_toio(adev->uvd.inst[i].cpu_addr, adev->uvd.fw->data + offset,
-					    le32_to_cpu(hdr->ucode_size_bytes));
+				if (drm_dev_enter(&adev->ddev, &idx)) {
+					memcpy_toio(adev->uvd.inst[i].cpu_addr, adev->uvd.fw->data + offset,
+						    le32_to_cpu(hdr->ucode_size_bytes));
+					drm_dev_exit(idx);
+				}
 				size -= le32_to_cpu(hdr->ucode_size_bytes);
 				ptr += le32_to_cpu(hdr->ucode_size_bytes);
 			}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
index ea6a62f67e38..833203401ef4 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
@@ -29,6 +29,7 @@
 #include <linux/module.h>
 
 #include <drm/drm.h>
+#include <drm/drm_drv.h>
 
 #include "amdgpu.h"
 #include "amdgpu_pm.h"
@@ -293,7 +294,7 @@ int amdgpu_vce_resume(struct amdgpu_device *adev)
 	void *cpu_addr;
 	const struct common_firmware_header *hdr;
 	unsigned offset;
-	int r;
+	int r, idx;
 
 	if (adev->vce.vcpu_bo == NULL)
 		return -EINVAL;
@@ -313,8 +314,12 @@ int amdgpu_vce_resume(struct amdgpu_device *adev)
 
 	hdr = (const struct common_firmware_header *)adev->vce.fw->data;
 	offset = le32_to_cpu(hdr->ucode_array_offset_bytes);
-	memcpy_toio(cpu_addr, adev->vce.fw->data + offset,
-		    adev->vce.fw->size - offset);
+
+	if (drm_dev_enter(&adev->ddev, &idx)) {
+		memcpy_toio(cpu_addr, adev->vce.fw->data + offset,
+			    adev->vce.fw->size - offset);
+		drm_dev_exit(idx);
+	}
 
 	amdgpu_bo_kunmap(adev->vce.vcpu_bo);
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
index 201645963ba5..21f7d3644d70 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
@@ -27,6 +27,7 @@
 #include <linux/firmware.h>
 #include <linux/module.h>
 #include <linux/pci.h>
+#include <drm/drm_drv.h>
 
 #include "amdgpu.h"
 #include "amdgpu_pm.h"
@@ -275,7 +276,7 @@ int amdgpu_vcn_suspend(struct amdgpu_device *adev)
 {
 	unsigned size;
 	void *ptr;
-	int i;
+	int i, idx;
 
 	cancel_delayed_work_sync(&adev->vcn.idle_work);
 
@@ -292,7 +293,10 @@ int amdgpu_vcn_suspend(struct amdgpu_device *adev)
 		if (!adev->vcn.inst[i].saved_bo)
 			return -ENOMEM;
 
-		memcpy_fromio(adev->vcn.inst[i].saved_bo, ptr, size);
+		if (drm_dev_enter(&adev->ddev, &idx)) {
+			memcpy_fromio(adev->vcn.inst[i].saved_bo, ptr, size);
+			drm_dev_exit(idx);
+		}
 	}
 	return 0;
 }
@@ -301,7 +305,7 @@ int amdgpu_vcn_resume(struct amdgpu_device *adev)
 {
 	unsigned size;
 	void *ptr;
-	int i;
+	int i, idx;
 
 	for (i = 0; i < adev->vcn.num_vcn_inst; ++i) {
 		if (adev->vcn.harvest_config & (1 << i))
@@ -313,7 +317,10 @@ int amdgpu_vcn_resume(struct amdgpu_device *adev)
 		ptr = adev->vcn.inst[i].cpu_addr;
 
 		if (adev->vcn.inst[i].saved_bo != NULL) {
-			memcpy_toio(ptr, adev->vcn.inst[i].saved_bo, size);
+			if (drm_dev_enter(&adev->ddev, &idx)) {
+				memcpy_toio(ptr, adev->vcn.inst[i].saved_bo, size);
+				drm_dev_exit(idx);
+			}
 			kvfree(adev->vcn.inst[i].saved_bo);
 			adev->vcn.inst[i].saved_bo = NULL;
 		} else {
@@ -323,8 +330,11 @@ int amdgpu_vcn_resume(struct amdgpu_device *adev)
 			hdr = (const struct common_firmware_header *)adev->vcn.fw->data;
 			if (adev->firmware.load_type != AMDGPU_FW_LOAD_PSP) {
 				offset = le32_to_cpu(hdr->ucode_array_offset_bytes);
-				memcpy_toio(adev->vcn.inst[i].cpu_addr, adev->vcn.fw->data + offset,
-					    le32_to_cpu(hdr->ucode_size_bytes));
+				if (drm_dev_enter(&adev->ddev, &idx)) {
+					memcpy_toio(adev->vcn.inst[i].cpu_addr, adev->vcn.fw->data + offset,
+						    le32_to_cpu(hdr->ucode_size_bytes));
+					drm_dev_exit(idx);
+				}
 				size -= le32_to_cpu(hdr->ucode_size_bytes);
 				ptr += le32_to_cpu(hdr->ucode_size_bytes);
 			}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 9f868cf3b832..7dd5f10ab570 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -32,6 +32,7 @@
 #include <linux/dma-buf.h>
 
 #include <drm/amdgpu_drm.h>
+#include <drm/drm_drv.h>
 #include "amdgpu.h"
 #include "amdgpu_trace.h"
 #include "amdgpu_amdkfd.h"
@@ -1606,7 +1607,10 @@ static int amdgpu_vm_bo_update_mapping(struct amdgpu_device *adev,
 	struct amdgpu_vm_update_params params;
 	enum amdgpu_sync_mode sync_mode;
 	uint64_t pfn;
-	int r;
+	int r, idx;
+
+	if (!drm_dev_enter(&adev->ddev, &idx))
+		return -ENODEV;
 
 	memset(&params, 0, sizeof(params));
 	params.adev = adev;
@@ -1715,6 +1719,7 @@ static int amdgpu_vm_bo_update_mapping(struct amdgpu_device *adev,
 
 error_unlock:
 	amdgpu_vm_eviction_unlock(vm);
+	drm_dev_exit(idx);
 	return r;
 }
 
diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c b/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
index 589410c32d09..2cec71e823f5 100644
--- a/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
@@ -23,6 +23,7 @@
 #include <linux/firmware.h>
 #include <linux/module.h>
 #include <linux/vmalloc.h>
+#include <drm/drm_drv.h>
 
 #include "amdgpu.h"
 #include "amdgpu_psp.h"
@@ -269,10 +270,8 @@ static int psp_v11_0_bootloader_load_kdb(struct psp_context *psp)
 	if (ret)
 		return ret;
 
-	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
-
 	/* Copy PSP KDB binary to memory */
-	memcpy(psp->fw_pri_buf, psp->kdb_start_addr, psp->kdb_bin_size);
+	psp_copy_fw(psp, psp->kdb_start_addr, psp->kdb_bin_size);
 
 	/* Provide the PSP KDB to bootloader */
 	WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
@@ -302,10 +301,8 @@ static int psp_v11_0_bootloader_load_spl(struct psp_context *psp)
 	if (ret)
 		return ret;
 
-	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
-
 	/* Copy PSP SPL binary to memory */
-	memcpy(psp->fw_pri_buf, psp->spl_start_addr, psp->spl_bin_size);
+	psp_copy_fw(psp, psp->spl_start_addr, psp->spl_bin_size);
 
 	/* Provide the PSP SPL to bootloader */
 	WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
@@ -335,10 +332,8 @@ static int psp_v11_0_bootloader_load_sysdrv(struct psp_context *psp)
 	if (ret)
 		return ret;
 
-	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
-
 	/* Copy PSP System Driver binary to memory */
-	memcpy(psp->fw_pri_buf, psp->sys_start_addr, psp->sys_bin_size);
+	psp_copy_fw(psp, psp->sys_start_addr, psp->sys_bin_size);
 
 	/* Provide the sys driver to bootloader */
 	WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
@@ -371,10 +366,8 @@ static int psp_v11_0_bootloader_load_sos(struct psp_context *psp)
 	if (ret)
 		return ret;
 
-	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
-
 	/* Copy Secure OS binary to PSP memory */
-	memcpy(psp->fw_pri_buf, psp->sos_start_addr, psp->sos_bin_size);
+	psp_copy_fw(psp, psp->sos_start_addr, psp->sos_bin_size);
 
 	/* Provide the PSP secure OS to bootloader */
 	WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
@@ -608,7 +601,7 @@ static int psp_v11_0_memory_training(struct psp_context *psp, uint32_t ops)
 	uint32_t p2c_header[4];
 	uint32_t sz;
 	void *buf;
-	int ret;
+	int ret, idx;
 
 	if (ctx->init == PSP_MEM_TRAIN_NOT_SUPPORT) {
 		DRM_DEBUG("Memory training is not supported.\n");
@@ -681,17 +674,24 @@ static int psp_v11_0_memory_training(struct psp_context *psp, uint32_t ops)
 			return -ENOMEM;
 		}
 
-		memcpy_fromio(buf, adev->mman.aper_base_kaddr, sz);
-		ret = psp_v11_0_memory_training_send_msg(psp, PSP_BL__DRAM_LONG_TRAIN);
-		if (ret) {
-			DRM_ERROR("Send long training msg failed.\n");
+		if (drm_dev_enter(&adev->ddev, &idx)) {
+			memcpy_fromio(buf, adev->mman.aper_base_kaddr, sz);
+			ret = psp_v11_0_memory_training_send_msg(psp, PSP_BL__DRAM_LONG_TRAIN);
+			if (ret) {
+				DRM_ERROR("Send long training msg failed.\n");
+				vfree(buf);
+				drm_dev_exit(idx);
+				return ret;
+			}
+
+			memcpy_toio(adev->mman.aper_base_kaddr, buf, sz);
+			adev->hdp.funcs->flush_hdp(adev, NULL);
 			vfree(buf);
-			return ret;
+			drm_dev_exit(idx);
+		} else {
+			vfree(buf);
+			return -ENODEV;
 		}
-
-		memcpy_toio(adev->mman.aper_base_kaddr, buf, sz);
-		adev->hdp.funcs->flush_hdp(adev, NULL);
-		vfree(buf);
 	}
 
 	if (ops & PSP_MEM_TRAIN_SAVE) {
diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c b/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
index c4828bd3264b..618e5b6b85d9 100644
--- a/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
@@ -138,10 +138,8 @@ static int psp_v12_0_bootloader_load_sysdrv(struct psp_context *psp)
 	if (ret)
 		return ret;
 
-	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
-
 	/* Copy PSP System Driver binary to memory */
-	memcpy(psp->fw_pri_buf, psp->sys_start_addr, psp->sys_bin_size);
+	psp_copy_fw(psp, psp->sys_start_addr, psp->sys_bin_size);
 
 	/* Provide the sys driver to bootloader */
 	WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
@@ -179,10 +177,8 @@ static int psp_v12_0_bootloader_load_sos(struct psp_context *psp)
 	if (ret)
 		return ret;
 
-	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
-
 	/* Copy Secure OS binary to PSP memory */
-	memcpy(psp->fw_pri_buf, psp->sos_start_addr, psp->sos_bin_size);
+	psp_copy_fw(psp, psp->sos_start_addr, psp->sos_bin_size);
 
 	/* Provide the PSP secure OS to bootloader */
 	WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c b/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
index f2e725f72d2f..d0a6cccd0897 100644
--- a/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
+++ b/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
@@ -102,10 +102,8 @@ static int psp_v3_1_bootloader_load_sysdrv(struct psp_context *psp)
 	if (ret)
 		return ret;
 
-	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
-
 	/* Copy PSP System Driver binary to memory */
-	memcpy(psp->fw_pri_buf, psp->sys_start_addr, psp->sys_bin_size);
+	psp_copy_fw(psp, psp->sys_start_addr, psp->sys_bin_size);
 
 	/* Provide the sys driver to bootloader */
 	WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
@@ -143,10 +141,8 @@ static int psp_v3_1_bootloader_load_sos(struct psp_context *psp)
 	if (ret)
 		return ret;
 
-	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
-
 	/* Copy Secure OS binary to PSP memory */
-	memcpy(psp->fw_pri_buf, psp->sos_start_addr, psp->sos_bin_size);
+	psp_copy_fw(psp, psp->sos_start_addr, psp->sos_bin_size);
 
 	/* Provide the PSP secure OS to bootloader */
 	WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
index 8e238dea7bef..90910d19db12 100644
--- a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
@@ -25,6 +25,7 @@
  */
 
 #include <linux/firmware.h>
+#include <drm/drm_drv.h>
 
 #include "amdgpu.h"
 #include "amdgpu_vce.h"
@@ -555,16 +556,19 @@ static int vce_v4_0_hw_fini(void *handle)
 static int vce_v4_0_suspend(void *handle)
 {
 	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
-	int r;
+	int r, idx;
 
 	if (adev->vce.vcpu_bo == NULL)
 		return 0;
 
-	if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) {
-		unsigned size = amdgpu_bo_size(adev->vce.vcpu_bo);
-		void *ptr = adev->vce.cpu_addr;
+	if (drm_dev_enter(&adev->ddev, &idx)) {
+		if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) {
+			unsigned size = amdgpu_bo_size(adev->vce.vcpu_bo);
+			void *ptr = adev->vce.cpu_addr;
 
-		memcpy_fromio(adev->vce.saved_bo, ptr, size);
+			memcpy_fromio(adev->vce.saved_bo, ptr, size);
+		}
+		drm_dev_exit(idx);
 	}
 
 	r = vce_v4_0_hw_fini(adev);
@@ -577,16 +581,20 @@ static int vce_v4_0_suspend(void *handle)
 static int vce_v4_0_resume(void *handle)
 {
 	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
-	int r;
+	int r, idx;
 
 	if (adev->vce.vcpu_bo == NULL)
 		return -EINVAL;
 
 	if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) {
-		unsigned size = amdgpu_bo_size(adev->vce.vcpu_bo);
-		void *ptr = adev->vce.cpu_addr;
 
-		memcpy_toio(ptr, adev->vce.saved_bo, size);
+		if (drm_dev_enter(&adev->ddev, &idx)) {
+			unsigned size = amdgpu_bo_size(adev->vce.vcpu_bo);
+			void *ptr = adev->vce.cpu_addr;
+
+			memcpy_toio(ptr, adev->vce.saved_bo, size);
+			drm_dev_exit(idx);
+		}
 	} else {
 		r = amdgpu_vce_resume(adev);
 		if (r)
diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c b/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
index 3f15bf34123a..df34be8ec82d 100644
--- a/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
@@ -34,6 +34,8 @@
 #include "vcn/vcn_3_0_0_sh_mask.h"
 #include "ivsrcid/vcn/irqsrcs_vcn_2_0.h"
 
+#include <drm/drm_drv.h>
+
 #define mmUVD_CONTEXT_ID_INTERNAL_OFFSET			0x27
 #define mmUVD_GPCOM_VCPU_CMD_INTERNAL_OFFSET			0x0f
 #define mmUVD_GPCOM_VCPU_DATA0_INTERNAL_OFFSET			0x10
@@ -268,16 +270,20 @@ static int vcn_v3_0_sw_init(void *handle)
 static int vcn_v3_0_sw_fini(void *handle)
 {
 	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
-	int i, r;
+	int i, r, idx;
 
-	for (i = 0; i < adev->vcn.num_vcn_inst; i++) {
-		volatile struct amdgpu_fw_shared *fw_shared;
+	if (drm_dev_enter(&adev->ddev, &idx)) {
+		for (i = 0; i < adev->vcn.num_vcn_inst; i++) {
+			volatile struct amdgpu_fw_shared *fw_shared;
 
-		if (adev->vcn.harvest_config & (1 << i))
-			continue;
-		fw_shared = adev->vcn.inst[i].fw_shared_cpu_addr;
-		fw_shared->present_flag_0 = 0;
-		fw_shared->sw_ring.is_enabled = false;
+			if (adev->vcn.harvest_config & (1 << i))
+				continue;
+			fw_shared = adev->vcn.inst[i].fw_shared_cpu_addr;
+			fw_shared->present_flag_0 = 0;
+			fw_shared->sw_ring.is_enabled = false;
+		}
+
+		drm_dev_exit(idx);
 	}
 
 	if (amdgpu_sriov_vf(adev))
diff --git a/drivers/gpu/drm/amd/pm/powerplay/smumgr/smu7_smumgr.c b/drivers/gpu/drm/amd/pm/powerplay/smumgr/smu7_smumgr.c
index aae25243eb10..d628b91846c9 100644
--- a/drivers/gpu/drm/amd/pm/powerplay/smumgr/smu7_smumgr.c
+++ b/drivers/gpu/drm/amd/pm/powerplay/smumgr/smu7_smumgr.c
@@ -405,6 +405,8 @@ int smu7_request_smu_load_fw(struct pp_hwmgr *hwmgr)
 				UCODE_ID_MEC_STORAGE, &toc->entry[toc->num_entries++]),
 				"Failed to Get Firmware Entry.", r = -EINVAL; goto failed);
 	}
+
+	/* AG TODO: Can't call drm_dev_enter/exit here because adev->ddev can't be accessed from this code */
 	memcpy_toio(smu_data->header_buffer.kaddr, smu_data->toc,
 		    sizeof(struct SMU_DRAMData_TOC));
 	smum_send_msg_to_smc_with_parameter(hwmgr,
-- 
2.25.1



* [PATCH v6 10/16] drm/amdgpu: Guard against write accesses after device removal
@ 2021-05-10 16:36   ` Andrey Grodzovsky
  0 siblings, 0 replies; 126+ messages in thread
From: Andrey Grodzovsky @ 2021-05-10 16:36 UTC (permalink / raw)
  To: dri-devel, amd-gfx, linux-pci, ckoenig.leichtzumerken,
	daniel.vetter, Harry.Wentland
  Cc: Andrey Grodzovsky, gregkh, Felix.Kuehling, ppaalanen, helgaas,
	Alexander.Deucher

This should prevent writing to memory or IO ranges possibly
already allocated for other uses after our device is removed.

v5:
Protect more places where memcpy_toio/memcpy_fromio take place
Protect IB submissions

v6: Switch to an early return on !drm_dev_enter() instead of scoping
the entire code with brackets.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c    | 11 ++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c       |  9 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c        | 17 +++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c       | 63 +++++++++++------
 drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h       |  2 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c      | 70 +++++++++++++++++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h      | 49 ++-----------
 drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c       | 31 +++++---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c       | 11 ++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c       | 22 ++++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c        |  7 +-
 drivers/gpu/drm/amd/amdgpu/psp_v11_0.c        | 44 ++++++------
 drivers/gpu/drm/amd/amdgpu/psp_v12_0.c        |  8 +--
 drivers/gpu/drm/amd/amdgpu/psp_v3_1.c         |  8 +--
 drivers/gpu/drm/amd/amdgpu/vce_v4_0.c         | 26 ++++---
 drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c         | 22 +++---
 .../drm/amd/pm/powerplay/smumgr/smu7_smumgr.c |  2 +
 17 files changed, 257 insertions(+), 145 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index a0bff4713672..94c415176cdc 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -71,6 +71,8 @@
 #include <drm/task_barrier.h>
 #include <linux/pm_runtime.h>
 
+#include <drm/drm_drv.h>
+
 MODULE_FIRMWARE("amdgpu/vega10_gpu_info.bin");
 MODULE_FIRMWARE("amdgpu/vega12_gpu_info.bin");
 MODULE_FIRMWARE("amdgpu/raven_gpu_info.bin");
@@ -281,7 +283,10 @@ void amdgpu_device_vram_access(struct amdgpu_device *adev, loff_t pos,
 	unsigned long flags;
 	uint32_t hi = ~0;
 	uint64_t last;
+	int idx;
 
+	if (!drm_dev_enter(&adev->ddev, &idx))
+		return;
 
 #ifdef CONFIG_64BIT
 	last = min(pos + size, adev->gmc.visible_vram_size);
@@ -299,8 +304,10 @@ void amdgpu_device_vram_access(struct amdgpu_device *adev, loff_t pos,
 			memcpy_fromio(buf, addr, count);
 		}
 
-		if (count == size)
+		if (count == size) {
+			drm_dev_exit(idx);
 			return;
+		}
 
 		pos += count;
 		buf += count / 4;
@@ -323,6 +330,8 @@ void amdgpu_device_vram_access(struct amdgpu_device *adev, loff_t pos,
 			*buf++ = RREG32_NO_KIQ(mmMM_DATA);
 	}
 	spin_unlock_irqrestore(&adev->mmio_idx_lock, flags);
+
+	drm_dev_exit(idx);
 }
 
 /*
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
index 4d32233cde92..04ba5eef1e88 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
@@ -31,6 +31,8 @@
 #include "amdgpu_ras.h"
 #include "amdgpu_xgmi.h"
 
+#include <drm/drm_drv.h>
+
 /**
  * amdgpu_gmc_pdb0_alloc - allocate vram for pdb0
  *
@@ -151,6 +153,10 @@ int amdgpu_gmc_set_pte_pde(struct amdgpu_device *adev, void *cpu_pt_addr,
 {
 	void __iomem *ptr = (void *)cpu_pt_addr;
 	uint64_t value;
+	int idx;
+
+	if (!drm_dev_enter(&adev->ddev, &idx))
+		return 0;
 
 	/*
 	 * The following is for PTE only. GART does not have PDEs.
@@ -158,6 +164,9 @@ int amdgpu_gmc_set_pte_pde(struct amdgpu_device *adev, void *cpu_pt_addr,
 	value = addr & 0x0000FFFFFFFFF000ULL;
 	value |= flags;
 	writeq(value, ptr + (gpu_page_idx * 8));
+
+	drm_dev_exit(idx);
+
 	return 0;
 }
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
index 148a3b481b12..62fcbd446c71 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
@@ -30,6 +30,7 @@
 #include <linux/slab.h>
 
 #include <drm/amdgpu_drm.h>
+#include <drm/drm_drv.h>
 
 #include "amdgpu.h"
 #include "atom.h"
@@ -137,7 +138,7 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, unsigned num_ibs,
 	bool secure;
 
 	unsigned i;
-	int r = 0;
+	int idx, r = 0;
 	bool need_pipe_sync = false;
 
 	if (num_ibs == 0)
@@ -169,13 +170,16 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, unsigned num_ibs,
 		return -EINVAL;
 	}
 
+	if (!drm_dev_enter(&adev->ddev, &idx))
+		return -ENODEV;
+
 	alloc_size = ring->funcs->emit_frame_size + num_ibs *
 		ring->funcs->emit_ib_size;
 
 	r = amdgpu_ring_alloc(ring, alloc_size);
 	if (r) {
 		dev_err(adev->dev, "scheduling IB failed (%d).\n", r);
-		return r;
+		goto exit;
 	}
 
 	need_ctx_switch = ring->current_ctx != fence_ctx;
@@ -205,7 +209,7 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, unsigned num_ibs,
 		r = amdgpu_vm_flush(ring, job, need_pipe_sync);
 		if (r) {
 			amdgpu_ring_undo(ring);
-			return r;
+			goto exit;
 		}
 	}
 
@@ -286,7 +290,7 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, unsigned num_ibs,
 		if (job && job->vmid)
 			amdgpu_vmid_reset(adev, ring->funcs->vmhub, job->vmid);
 		amdgpu_ring_undo(ring);
-		return r;
+		goto exit;
 	}
 
 	if (ring->funcs->insert_end)
@@ -304,7 +308,10 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, unsigned num_ibs,
 		ring->funcs->emit_wave_limit(ring, false);
 
 	amdgpu_ring_commit(ring);
-	return 0;
+
+exit:
+	drm_dev_exit(idx);
+	return r;
 }
 
 /**
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
index 9e769cf6095b..bb6afee61666 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
@@ -25,6 +25,7 @@
 
 #include <linux/firmware.h>
 #include <linux/dma-mapping.h>
+#include <drm/drm_drv.h>
 
 #include "amdgpu.h"
 #include "amdgpu_psp.h"
@@ -39,6 +40,8 @@
 #include "amdgpu_ras.h"
 #include "amdgpu_securedisplay.h"
 
+#include <drm/drm_drv.h>
+
 static int psp_sysfs_init(struct amdgpu_device *adev);
 static void psp_sysfs_fini(struct amdgpu_device *adev);
 
@@ -253,7 +256,7 @@ psp_cmd_submit_buf(struct psp_context *psp,
 		   struct psp_gfx_cmd_resp *cmd, uint64_t fence_mc_addr)
 {
 	int ret;
-	int index;
+	int index, idx;
 	int timeout = 20000;
 	bool ras_intr = false;
 	bool skip_unsupport = false;
@@ -261,6 +264,9 @@ psp_cmd_submit_buf(struct psp_context *psp,
 	if (psp->adev->in_pci_err_recovery)
 		return 0;
 
+	if (!drm_dev_enter(&psp->adev->ddev, &idx))
+		return 0;
+
 	mutex_lock(&psp->mutex);
 
 	memset(psp->cmd_buf_mem, 0, PSP_CMD_BUFFER_SIZE);
@@ -271,8 +277,7 @@ psp_cmd_submit_buf(struct psp_context *psp,
 	ret = psp_ring_cmd_submit(psp, psp->cmd_buf_mc_addr, fence_mc_addr, index);
 	if (ret) {
 		atomic_dec(&psp->fence_value);
-		mutex_unlock(&psp->mutex);
-		return ret;
+		goto exit;
 	}
 
 	amdgpu_asic_invalidate_hdp(psp->adev, NULL);
@@ -312,8 +317,8 @@ psp_cmd_submit_buf(struct psp_context *psp,
 			 psp->cmd_buf_mem->cmd_id,
 			 psp->cmd_buf_mem->resp.status);
 		if (!timeout) {
-			mutex_unlock(&psp->mutex);
-			return -EINVAL;
+			ret = -EINVAL;
+			goto exit;
 		}
 	}
 
@@ -321,8 +326,10 @@ psp_cmd_submit_buf(struct psp_context *psp,
 		ucode->tmr_mc_addr_lo = psp->cmd_buf_mem->resp.fw_addr_lo;
 		ucode->tmr_mc_addr_hi = psp->cmd_buf_mem->resp.fw_addr_hi;
 	}
-	mutex_unlock(&psp->mutex);
 
+exit:
+	mutex_unlock(&psp->mutex);
+	drm_dev_exit(idx);
 	return ret;
 }
 
@@ -359,8 +366,7 @@ static int psp_load_toc(struct psp_context *psp,
 	if (!cmd)
 		return -ENOMEM;
 	/* Copy toc to psp firmware private buffer */
-	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
-	memcpy(psp->fw_pri_buf, psp->toc_start_addr, psp->toc_bin_size);
+	psp_copy_fw(psp, psp->toc_start_addr, psp->toc_bin_size);
 
 	psp_prep_load_toc_cmd_buf(cmd, psp->fw_pri_mc_addr, psp->toc_bin_size);
 
@@ -625,8 +631,7 @@ static int psp_asd_load(struct psp_context *psp)
 	if (!cmd)
 		return -ENOMEM;
 
-	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
-	memcpy(psp->fw_pri_buf, psp->asd_start_addr, psp->asd_ucode_size);
+	psp_copy_fw(psp, psp->asd_start_addr, psp->asd_ucode_size);
 
 	psp_prep_asd_load_cmd_buf(cmd, psp->fw_pri_mc_addr,
 				  psp->asd_ucode_size);
@@ -781,8 +786,7 @@ static int psp_xgmi_load(struct psp_context *psp)
 	if (!cmd)
 		return -ENOMEM;
 
-	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
-	memcpy(psp->fw_pri_buf, psp->ta_xgmi_start_addr, psp->ta_xgmi_ucode_size);
+	psp_copy_fw(psp, psp->ta_xgmi_start_addr, psp->ta_xgmi_ucode_size);
 
 	psp_prep_ta_load_cmd_buf(cmd,
 				 psp->fw_pri_mc_addr,
@@ -1038,8 +1042,7 @@ static int psp_ras_load(struct psp_context *psp)
 	if (!cmd)
 		return -ENOMEM;
 
-	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
-	memcpy(psp->fw_pri_buf, psp->ta_ras_start_addr, psp->ta_ras_ucode_size);
+	psp_copy_fw(psp, psp->ta_ras_start_addr, psp->ta_ras_ucode_size);
 
 	psp_prep_ta_load_cmd_buf(cmd,
 				 psp->fw_pri_mc_addr,
@@ -1275,8 +1278,7 @@ static int psp_hdcp_load(struct psp_context *psp)
 	if (!cmd)
 		return -ENOMEM;
 
-	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
-	memcpy(psp->fw_pri_buf, psp->ta_hdcp_start_addr,
+	psp_copy_fw(psp, psp->ta_hdcp_start_addr,
 	       psp->ta_hdcp_ucode_size);
 
 	psp_prep_ta_load_cmd_buf(cmd,
@@ -1427,8 +1429,7 @@ static int psp_dtm_load(struct psp_context *psp)
 	if (!cmd)
 		return -ENOMEM;
 
-	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
-	memcpy(psp->fw_pri_buf, psp->ta_dtm_start_addr, psp->ta_dtm_ucode_size);
+	psp_copy_fw(psp, psp->ta_dtm_start_addr, psp->ta_dtm_ucode_size);
 
 	psp_prep_ta_load_cmd_buf(cmd,
 				 psp->fw_pri_mc_addr,
@@ -1573,8 +1574,7 @@ static int psp_rap_load(struct psp_context *psp)
 	if (!cmd)
 		return -ENOMEM;
 
-	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
-	memcpy(psp->fw_pri_buf, psp->ta_rap_start_addr, psp->ta_rap_ucode_size);
+	psp_copy_fw(psp, psp->ta_rap_start_addr, psp->ta_rap_ucode_size);
 
 	psp_prep_ta_load_cmd_buf(cmd,
 				 psp->fw_pri_mc_addr,
@@ -3022,7 +3022,7 @@ static ssize_t psp_usbc_pd_fw_sysfs_write(struct device *dev,
 	struct amdgpu_device *adev = drm_to_adev(ddev);
 	void *cpu_addr;
 	dma_addr_t dma_addr;
-	int ret;
+	int ret, idx;
 	char fw_name[100];
 	const struct firmware *usbc_pd_fw;
 
@@ -3031,6 +3031,9 @@ static ssize_t psp_usbc_pd_fw_sysfs_write(struct device *dev,
 		return -EBUSY;
 	}
 
+	if (!drm_dev_enter(ddev, &idx))
+		return -ENODEV;
+
 	snprintf(fw_name, sizeof(fw_name), "amdgpu/%s", buf);
 	ret = request_firmware(&usbc_pd_fw, fw_name, adev->dev);
 	if (ret)
@@ -3062,16 +3065,30 @@ static ssize_t psp_usbc_pd_fw_sysfs_write(struct device *dev,
 rel_buf:
 	dma_free_coherent(adev->dev, usbc_pd_fw->size, cpu_addr, dma_addr);
 	release_firmware(usbc_pd_fw);
-
 fail:
 	if (ret) {
 		DRM_ERROR("Failed to load USBC PD FW, err = %d", ret);
-		return ret;
+		count = ret;
 	}
 
+	drm_dev_exit(idx);
 	return count;
 }
 
+void psp_copy_fw(struct psp_context *psp, uint8_t *start_addr, uint32_t bin_size)
+{
+	int idx;
+
+	if (!drm_dev_enter(&psp->adev->ddev, &idx))
+		return;
+
+	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
+	memcpy(psp->fw_pri_buf, start_addr, bin_size);
+
+	drm_dev_exit(idx);
+}
+
+
 static DEVICE_ATTR(usbc_pd_fw, S_IRUGO | S_IWUSR,
 		   psp_usbc_pd_fw_sysfs_read,
 		   psp_usbc_pd_fw_sysfs_write);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
index 46a5328e00e0..2bfdc278817f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
@@ -423,4 +423,6 @@ int psp_get_fw_attestation_records_addr(struct psp_context *psp,
 
 int psp_load_fw_list(struct psp_context *psp,
 		     struct amdgpu_firmware_info **ucode_list, int ucode_count);
+void psp_copy_fw(struct psp_context *psp, uint8_t *start_addr, uint32_t bin_size);
+
 #endif
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
index 688624ebe421..e1985bc34436 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
@@ -35,6 +35,8 @@
 #include "amdgpu.h"
 #include "atom.h"
 
+#include <drm/drm_drv.h>
+
 /*
  * Rings
  * Most engines on the GPU are fed via ring buffers.  Ring
@@ -461,3 +463,71 @@ int amdgpu_ring_test_helper(struct amdgpu_ring *ring)
 	ring->sched.ready = !r;
 	return r;
 }
+
+void amdgpu_ring_clear_ring(struct amdgpu_ring *ring)
+{
+	int idx;
+	int i = 0;
+
+	if (!drm_dev_enter(&ring->adev->ddev, &idx))
+		return;
+
+	while (i <= ring->buf_mask)
+		ring->ring[i++] = ring->funcs->nop;
+
+	drm_dev_exit(idx);
+
+}
+
+void amdgpu_ring_write(struct amdgpu_ring *ring, uint32_t v)
+{
+	int idx;
+
+	if (!drm_dev_enter(&ring->adev->ddev, &idx))
+		return;
+
+	if (ring->count_dw <= 0)
+		DRM_ERROR("amdgpu: writing more dwords to the ring than expected!\n");
+	ring->ring[ring->wptr++ & ring->buf_mask] = v;
+	ring->wptr &= ring->ptr_mask;
+	ring->count_dw--;
+
+	drm_dev_exit(idx);
+}
+
+void amdgpu_ring_write_multiple(struct amdgpu_ring *ring,
+					      void *src, int count_dw)
+{
+	unsigned occupied, chunk1, chunk2;
+	void *dst;
+	int idx;
+
+	if (!drm_dev_enter(&ring->adev->ddev, &idx))
+		return;
+
+	if (unlikely(ring->count_dw < count_dw))
+		DRM_ERROR("amdgpu: writing more dwords to the ring than expected!\n");
+
+	occupied = ring->wptr & ring->buf_mask;
+	dst = (void *)&ring->ring[occupied];
+	chunk1 = ring->buf_mask + 1 - occupied;
+	chunk1 = (chunk1 >= count_dw) ? count_dw: chunk1;
+	chunk2 = count_dw - chunk1;
+	chunk1 <<= 2;
+	chunk2 <<= 2;
+
+	if (chunk1)
+		memcpy(dst, src, chunk1);
+
+	if (chunk2) {
+		src += chunk1;
+		dst = (void *)ring->ring;
+		memcpy(dst, src, chunk2);
+	}
+
+	ring->wptr += count_dw;
+	ring->wptr &= ring->ptr_mask;
+	ring->count_dw -= count_dw;
+
+	drm_dev_exit(idx);
+}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
index e7d3d0dbdd96..c67bc6d3d039 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
@@ -299,53 +299,12 @@ static inline void amdgpu_ring_set_preempt_cond_exec(struct amdgpu_ring *ring,
 	*ring->cond_exe_cpu_addr = cond_exec;
 }
 
-static inline void amdgpu_ring_clear_ring(struct amdgpu_ring *ring)
-{
-	int i = 0;
-	while (i <= ring->buf_mask)
-		ring->ring[i++] = ring->funcs->nop;
-
-}
-
-static inline void amdgpu_ring_write(struct amdgpu_ring *ring, uint32_t v)
-{
-	if (ring->count_dw <= 0)
-		DRM_ERROR("amdgpu: writing more dwords to the ring than expected!\n");
-	ring->ring[ring->wptr++ & ring->buf_mask] = v;
-	ring->wptr &= ring->ptr_mask;
-	ring->count_dw--;
-}
+void amdgpu_ring_clear_ring(struct amdgpu_ring *ring);
 
-static inline void amdgpu_ring_write_multiple(struct amdgpu_ring *ring,
-					      void *src, int count_dw)
-{
-	unsigned occupied, chunk1, chunk2;
-	void *dst;
-
-	if (unlikely(ring->count_dw < count_dw))
-		DRM_ERROR("amdgpu: writing more dwords to the ring than expected!\n");
-
-	occupied = ring->wptr & ring->buf_mask;
-	dst = (void *)&ring->ring[occupied];
-	chunk1 = ring->buf_mask + 1 - occupied;
-	chunk1 = (chunk1 >= count_dw) ? count_dw: chunk1;
-	chunk2 = count_dw - chunk1;
-	chunk1 <<= 2;
-	chunk2 <<= 2;
+void amdgpu_ring_write(struct amdgpu_ring *ring, uint32_t v);
 
-	if (chunk1)
-		memcpy(dst, src, chunk1);
-
-	if (chunk2) {
-		src += chunk1;
-		dst = (void *)ring->ring;
-		memcpy(dst, src, chunk2);
-	}
-
-	ring->wptr += count_dw;
-	ring->wptr &= ring->ptr_mask;
-	ring->count_dw -= count_dw;
-}
+void amdgpu_ring_write_multiple(struct amdgpu_ring *ring,
+					      void *src, int count_dw);
 
 int amdgpu_ring_test_helper(struct amdgpu_ring *ring);
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
index c6dbc0801604..82f0542c7792 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
@@ -32,6 +32,7 @@
 #include <linux/module.h>
 
 #include <drm/drm.h>
+#include <drm/drm_drv.h>
 
 #include "amdgpu.h"
 #include "amdgpu_pm.h"
@@ -375,7 +376,7 @@ int amdgpu_uvd_suspend(struct amdgpu_device *adev)
 {
 	unsigned size;
 	void *ptr;
-	int i, j;
+	int i, j, idx;
 	bool in_ras_intr = amdgpu_ras_intr_triggered();
 
 	cancel_delayed_work_sync(&adev->uvd.idle_work);
@@ -403,11 +404,15 @@ int amdgpu_uvd_suspend(struct amdgpu_device *adev)
 		if (!adev->uvd.inst[j].saved_bo)
 			return -ENOMEM;
 
-		/* re-write 0 since err_event_athub will corrupt VCPU buffer */
-		if (in_ras_intr)
-			memset(adev->uvd.inst[j].saved_bo, 0, size);
-		else
-			memcpy_fromio(adev->uvd.inst[j].saved_bo, ptr, size);
+		if (drm_dev_enter(&adev->ddev, &idx)) {
+			/* re-write 0 since err_event_athub will corrupt VCPU buffer */
+			if (in_ras_intr)
+				memset(adev->uvd.inst[j].saved_bo, 0, size);
+			else
+				memcpy_fromio(adev->uvd.inst[j].saved_bo, ptr, size);
+
+			drm_dev_exit(idx);
+		}
 	}
 
 	if (in_ras_intr)
@@ -420,7 +425,7 @@ int amdgpu_uvd_resume(struct amdgpu_device *adev)
 {
 	unsigned size;
 	void *ptr;
-	int i;
+	int i, idx;
 
 	for (i = 0; i < adev->uvd.num_uvd_inst; i++) {
 		if (adev->uvd.harvest_config & (1 << i))
@@ -432,7 +437,10 @@ int amdgpu_uvd_resume(struct amdgpu_device *adev)
 		ptr = adev->uvd.inst[i].cpu_addr;
 
 		if (adev->uvd.inst[i].saved_bo != NULL) {
-			memcpy_toio(ptr, adev->uvd.inst[i].saved_bo, size);
+			if (drm_dev_enter(&adev->ddev, &idx)) {
+				memcpy_toio(ptr, adev->uvd.inst[i].saved_bo, size);
+				drm_dev_exit(idx);
+			}
 			kvfree(adev->uvd.inst[i].saved_bo);
 			adev->uvd.inst[i].saved_bo = NULL;
 		} else {
@@ -442,8 +450,11 @@ int amdgpu_uvd_resume(struct amdgpu_device *adev)
 			hdr = (const struct common_firmware_header *)adev->uvd.fw->data;
 			if (adev->firmware.load_type != AMDGPU_FW_LOAD_PSP) {
 				offset = le32_to_cpu(hdr->ucode_array_offset_bytes);
-				memcpy_toio(adev->uvd.inst[i].cpu_addr, adev->uvd.fw->data + offset,
-					    le32_to_cpu(hdr->ucode_size_bytes));
+				if (drm_dev_enter(&adev->ddev, &idx)) {
+					memcpy_toio(adev->uvd.inst[i].cpu_addr, adev->uvd.fw->data + offset,
+						    le32_to_cpu(hdr->ucode_size_bytes));
+					drm_dev_exit(idx);
+				}
 				size -= le32_to_cpu(hdr->ucode_size_bytes);
 				ptr += le32_to_cpu(hdr->ucode_size_bytes);
 			}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
index ea6a62f67e38..833203401ef4 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
@@ -29,6 +29,7 @@
 #include <linux/module.h>
 
 #include <drm/drm.h>
+#include <drm/drm_drv.h>
 
 #include "amdgpu.h"
 #include "amdgpu_pm.h"
@@ -293,7 +294,7 @@ int amdgpu_vce_resume(struct amdgpu_device *adev)
 	void *cpu_addr;
 	const struct common_firmware_header *hdr;
 	unsigned offset;
-	int r;
+	int r, idx;
 
 	if (adev->vce.vcpu_bo == NULL)
 		return -EINVAL;
@@ -313,8 +314,12 @@ int amdgpu_vce_resume(struct amdgpu_device *adev)
 
 	hdr = (const struct common_firmware_header *)adev->vce.fw->data;
 	offset = le32_to_cpu(hdr->ucode_array_offset_bytes);
-	memcpy_toio(cpu_addr, adev->vce.fw->data + offset,
-		    adev->vce.fw->size - offset);
+
+	if (drm_dev_enter(&adev->ddev, &idx)) {
+		memcpy_toio(cpu_addr, adev->vce.fw->data + offset,
+			    adev->vce.fw->size - offset);
+		drm_dev_exit(idx);
+	}
 
 	amdgpu_bo_kunmap(adev->vce.vcpu_bo);
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
index 201645963ba5..21f7d3644d70 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
@@ -27,6 +27,7 @@
 #include <linux/firmware.h>
 #include <linux/module.h>
 #include <linux/pci.h>
+#include <drm/drm_drv.h>
 
 #include "amdgpu.h"
 #include "amdgpu_pm.h"
@@ -275,7 +276,7 @@ int amdgpu_vcn_suspend(struct amdgpu_device *adev)
 {
 	unsigned size;
 	void *ptr;
-	int i;
+	int i, idx;
 
 	cancel_delayed_work_sync(&adev->vcn.idle_work);
 
@@ -292,7 +293,10 @@ int amdgpu_vcn_suspend(struct amdgpu_device *adev)
 		if (!adev->vcn.inst[i].saved_bo)
 			return -ENOMEM;
 
-		memcpy_fromio(adev->vcn.inst[i].saved_bo, ptr, size);
+		if (drm_dev_enter(&adev->ddev, &idx)) {
+			memcpy_fromio(adev->vcn.inst[i].saved_bo, ptr, size);
+			drm_dev_exit(idx);
+		}
 	}
 	return 0;
 }
@@ -301,7 +305,7 @@ int amdgpu_vcn_resume(struct amdgpu_device *adev)
 {
 	unsigned size;
 	void *ptr;
-	int i;
+	int i, idx;
 
 	for (i = 0; i < adev->vcn.num_vcn_inst; ++i) {
 		if (adev->vcn.harvest_config & (1 << i))
@@ -313,7 +317,10 @@ int amdgpu_vcn_resume(struct amdgpu_device *adev)
 		ptr = adev->vcn.inst[i].cpu_addr;
 
 		if (adev->vcn.inst[i].saved_bo != NULL) {
-			memcpy_toio(ptr, adev->vcn.inst[i].saved_bo, size);
+			if (drm_dev_enter(&adev->ddev, &idx)) {
+				memcpy_toio(ptr, adev->vcn.inst[i].saved_bo, size);
+				drm_dev_exit(idx);
+			}
 			kvfree(adev->vcn.inst[i].saved_bo);
 			adev->vcn.inst[i].saved_bo = NULL;
 		} else {
@@ -323,8 +330,11 @@ int amdgpu_vcn_resume(struct amdgpu_device *adev)
 			hdr = (const struct common_firmware_header *)adev->vcn.fw->data;
 			if (adev->firmware.load_type != AMDGPU_FW_LOAD_PSP) {
 				offset = le32_to_cpu(hdr->ucode_array_offset_bytes);
-				memcpy_toio(adev->vcn.inst[i].cpu_addr, adev->vcn.fw->data + offset,
-					    le32_to_cpu(hdr->ucode_size_bytes));
+				if (drm_dev_enter(&adev->ddev, &idx)) {
+					memcpy_toio(adev->vcn.inst[i].cpu_addr, adev->vcn.fw->data + offset,
+						    le32_to_cpu(hdr->ucode_size_bytes));
+					drm_dev_exit(idx);
+				}
 				size -= le32_to_cpu(hdr->ucode_size_bytes);
 				ptr += le32_to_cpu(hdr->ucode_size_bytes);
 			}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 9f868cf3b832..7dd5f10ab570 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -32,6 +32,7 @@
 #include <linux/dma-buf.h>
 
 #include <drm/amdgpu_drm.h>
+#include <drm/drm_drv.h>
 #include "amdgpu.h"
 #include "amdgpu_trace.h"
 #include "amdgpu_amdkfd.h"
@@ -1606,7 +1607,10 @@ static int amdgpu_vm_bo_update_mapping(struct amdgpu_device *adev,
 	struct amdgpu_vm_update_params params;
 	enum amdgpu_sync_mode sync_mode;
 	uint64_t pfn;
-	int r;
+	int r, idx;
+
+	if (!drm_dev_enter(&adev->ddev, &idx))
+		return -ENODEV;
 
 	memset(&params, 0, sizeof(params));
 	params.adev = adev;
@@ -1715,6 +1719,7 @@ static int amdgpu_vm_bo_update_mapping(struct amdgpu_device *adev,
 
 error_unlock:
 	amdgpu_vm_eviction_unlock(vm);
+	drm_dev_exit(idx);
 	return r;
 }
 
diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c b/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
index 589410c32d09..2cec71e823f5 100644
--- a/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
@@ -23,6 +23,7 @@
 #include <linux/firmware.h>
 #include <linux/module.h>
 #include <linux/vmalloc.h>
+#include <drm/drm_drv.h>
 
 #include "amdgpu.h"
 #include "amdgpu_psp.h"
@@ -269,10 +270,8 @@ static int psp_v11_0_bootloader_load_kdb(struct psp_context *psp)
 	if (ret)
 		return ret;
 
-	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
-
 	/* Copy PSP KDB binary to memory */
-	memcpy(psp->fw_pri_buf, psp->kdb_start_addr, psp->kdb_bin_size);
+	psp_copy_fw(psp, psp->kdb_start_addr, psp->kdb_bin_size);
 
 	/* Provide the PSP KDB to bootloader */
 	WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
@@ -302,10 +301,8 @@ static int psp_v11_0_bootloader_load_spl(struct psp_context *psp)
 	if (ret)
 		return ret;
 
-	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
-
 	/* Copy PSP SPL binary to memory */
-	memcpy(psp->fw_pri_buf, psp->spl_start_addr, psp->spl_bin_size);
+	psp_copy_fw(psp, psp->spl_start_addr, psp->spl_bin_size);
 
 	/* Provide the PSP SPL to bootloader */
 	WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
@@ -335,10 +332,8 @@ static int psp_v11_0_bootloader_load_sysdrv(struct psp_context *psp)
 	if (ret)
 		return ret;
 
-	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
-
 	/* Copy PSP System Driver binary to memory */
-	memcpy(psp->fw_pri_buf, psp->sys_start_addr, psp->sys_bin_size);
+	psp_copy_fw(psp, psp->sys_start_addr, psp->sys_bin_size);
 
 	/* Provide the sys driver to bootloader */
 	WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
@@ -371,10 +366,8 @@ static int psp_v11_0_bootloader_load_sos(struct psp_context *psp)
 	if (ret)
 		return ret;
 
-	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
-
 	/* Copy Secure OS binary to PSP memory */
-	memcpy(psp->fw_pri_buf, psp->sos_start_addr, psp->sos_bin_size);
+	psp_copy_fw(psp, psp->sos_start_addr, psp->sos_bin_size);
 
 	/* Provide the PSP secure OS to bootloader */
 	WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
@@ -608,7 +601,7 @@ static int psp_v11_0_memory_training(struct psp_context *psp, uint32_t ops)
 	uint32_t p2c_header[4];
 	uint32_t sz;
 	void *buf;
-	int ret;
+	int ret, idx;
 
 	if (ctx->init == PSP_MEM_TRAIN_NOT_SUPPORT) {
 		DRM_DEBUG("Memory training is not supported.\n");
@@ -681,17 +674,24 @@ static int psp_v11_0_memory_training(struct psp_context *psp, uint32_t ops)
 			return -ENOMEM;
 		}
 
-		memcpy_fromio(buf, adev->mman.aper_base_kaddr, sz);
-		ret = psp_v11_0_memory_training_send_msg(psp, PSP_BL__DRAM_LONG_TRAIN);
-		if (ret) {
-			DRM_ERROR("Send long training msg failed.\n");
+		if (drm_dev_enter(&adev->ddev, &idx)) {
+			memcpy_fromio(buf, adev->mman.aper_base_kaddr, sz);
+			ret = psp_v11_0_memory_training_send_msg(psp, PSP_BL__DRAM_LONG_TRAIN);
+			if (ret) {
+				DRM_ERROR("Send long training msg failed.\n");
+				vfree(buf);
+				drm_dev_exit(idx);
+				return ret;
+			}
+
+			memcpy_toio(adev->mman.aper_base_kaddr, buf, sz);
+			adev->hdp.funcs->flush_hdp(adev, NULL);
 			vfree(buf);
-			return ret;
+			drm_dev_exit(idx);
+		} else {
+			vfree(buf);
+			return -ENODEV;
 		}
-
-		memcpy_toio(adev->mman.aper_base_kaddr, buf, sz);
-		adev->hdp.funcs->flush_hdp(adev, NULL);
-		vfree(buf);
 	}
 
 	if (ops & PSP_MEM_TRAIN_SAVE) {
diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c b/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
index c4828bd3264b..618e5b6b85d9 100644
--- a/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
@@ -138,10 +138,8 @@ static int psp_v12_0_bootloader_load_sysdrv(struct psp_context *psp)
 	if (ret)
 		return ret;
 
-	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
-
 	/* Copy PSP System Driver binary to memory */
-	memcpy(psp->fw_pri_buf, psp->sys_start_addr, psp->sys_bin_size);
+	psp_copy_fw(psp, psp->sys_start_addr, psp->sys_bin_size);
 
 	/* Provide the sys driver to bootloader */
 	WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
@@ -179,10 +177,8 @@ static int psp_v12_0_bootloader_load_sos(struct psp_context *psp)
 	if (ret)
 		return ret;
 
-	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
-
 	/* Copy Secure OS binary to PSP memory */
-	memcpy(psp->fw_pri_buf, psp->sos_start_addr, psp->sos_bin_size);
+	psp_copy_fw(psp, psp->sos_start_addr, psp->sos_bin_size);
 
 	/* Provide the PSP secure OS to bootloader */
 	WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c b/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
index f2e725f72d2f..d0a6cccd0897 100644
--- a/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
+++ b/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
@@ -102,10 +102,8 @@ static int psp_v3_1_bootloader_load_sysdrv(struct psp_context *psp)
 	if (ret)
 		return ret;
 
-	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
-
 	/* Copy PSP System Driver binary to memory */
-	memcpy(psp->fw_pri_buf, psp->sys_start_addr, psp->sys_bin_size);
+	psp_copy_fw(psp, psp->sys_start_addr, psp->sys_bin_size);
 
 	/* Provide the sys driver to bootloader */
 	WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
@@ -143,10 +141,8 @@ static int psp_v3_1_bootloader_load_sos(struct psp_context *psp)
 	if (ret)
 		return ret;
 
-	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
-
 	/* Copy Secure OS binary to PSP memory */
-	memcpy(psp->fw_pri_buf, psp->sos_start_addr, psp->sos_bin_size);
+	psp_copy_fw(psp, psp->sos_start_addr, psp->sos_bin_size);
 
 	/* Provide the PSP secure OS to bootloader */
 	WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
index 8e238dea7bef..90910d19db12 100644
--- a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
@@ -25,6 +25,7 @@
  */
 
 #include <linux/firmware.h>
+#include <drm/drm_drv.h>
 
 #include "amdgpu.h"
 #include "amdgpu_vce.h"
@@ -555,16 +556,19 @@ static int vce_v4_0_hw_fini(void *handle)
 static int vce_v4_0_suspend(void *handle)
 {
 	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
-	int r;
+	int r, idx;
 
 	if (adev->vce.vcpu_bo == NULL)
 		return 0;
 
-	if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) {
-		unsigned size = amdgpu_bo_size(adev->vce.vcpu_bo);
-		void *ptr = adev->vce.cpu_addr;
+	if (drm_dev_enter(&adev->ddev, &idx)) {
+		if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) {
+			unsigned size = amdgpu_bo_size(adev->vce.vcpu_bo);
+			void *ptr = adev->vce.cpu_addr;
 
-		memcpy_fromio(adev->vce.saved_bo, ptr, size);
+			memcpy_fromio(adev->vce.saved_bo, ptr, size);
+		}
+		drm_dev_exit(idx);
 	}
 
 	r = vce_v4_0_hw_fini(adev);
@@ -577,16 +581,20 @@ static int vce_v4_0_suspend(void *handle)
 static int vce_v4_0_resume(void *handle)
 {
 	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
-	int r;
+	int r, idx;
 
 	if (adev->vce.vcpu_bo == NULL)
 		return -EINVAL;
 
 	if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) {
-		unsigned size = amdgpu_bo_size(adev->vce.vcpu_bo);
-		void *ptr = adev->vce.cpu_addr;
 
-		memcpy_toio(ptr, adev->vce.saved_bo, size);
+		if (drm_dev_enter(&adev->ddev, &idx)) {
+			unsigned size = amdgpu_bo_size(adev->vce.vcpu_bo);
+			void *ptr = adev->vce.cpu_addr;
+
+			memcpy_toio(ptr, adev->vce.saved_bo, size);
+			drm_dev_exit(idx);
+		}
 	} else {
 		r = amdgpu_vce_resume(adev);
 		if (r)
diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c b/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
index 3f15bf34123a..df34be8ec82d 100644
--- a/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
@@ -34,6 +34,8 @@
 #include "vcn/vcn_3_0_0_sh_mask.h"
 #include "ivsrcid/vcn/irqsrcs_vcn_2_0.h"
 
+#include <drm/drm_drv.h>
+
 #define mmUVD_CONTEXT_ID_INTERNAL_OFFSET			0x27
 #define mmUVD_GPCOM_VCPU_CMD_INTERNAL_OFFSET			0x0f
 #define mmUVD_GPCOM_VCPU_DATA0_INTERNAL_OFFSET			0x10
@@ -268,16 +270,20 @@ static int vcn_v3_0_sw_init(void *handle)
 static int vcn_v3_0_sw_fini(void *handle)
 {
 	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
-	int i, r;
+	int i, r, idx;
 
-	for (i = 0; i < adev->vcn.num_vcn_inst; i++) {
-		volatile struct amdgpu_fw_shared *fw_shared;
+	if (drm_dev_enter(&adev->ddev, &idx)) {
+		for (i = 0; i < adev->vcn.num_vcn_inst; i++) {
+			volatile struct amdgpu_fw_shared *fw_shared;
 
-		if (adev->vcn.harvest_config & (1 << i))
-			continue;
-		fw_shared = adev->vcn.inst[i].fw_shared_cpu_addr;
-		fw_shared->present_flag_0 = 0;
-		fw_shared->sw_ring.is_enabled = false;
+			if (adev->vcn.harvest_config & (1 << i))
+				continue;
+			fw_shared = adev->vcn.inst[i].fw_shared_cpu_addr;
+			fw_shared->present_flag_0 = 0;
+			fw_shared->sw_ring.is_enabled = false;
+		}
+
+		drm_dev_exit(idx);
 	}
 
 	if (amdgpu_sriov_vf(adev))
diff --git a/drivers/gpu/drm/amd/pm/powerplay/smumgr/smu7_smumgr.c b/drivers/gpu/drm/amd/pm/powerplay/smumgr/smu7_smumgr.c
index aae25243eb10..d628b91846c9 100644
--- a/drivers/gpu/drm/amd/pm/powerplay/smumgr/smu7_smumgr.c
+++ b/drivers/gpu/drm/amd/pm/powerplay/smumgr/smu7_smumgr.c
@@ -405,6 +405,8 @@ int smu7_request_smu_load_fw(struct pp_hwmgr *hwmgr)
 				UCODE_ID_MEC_STORAGE, &toc->entry[toc->num_entries++]),
 				"Failed to Get Firmware Entry.", r = -EINVAL; goto failed);
 	}
+
+	/* AG TODO Can't call drm_dev_enter/exit because there is no access to adev->ddev here ... */
 	memcpy_toio(smu_data->header_buffer.kaddr, smu_data->toc,
 		    sizeof(struct SMU_DRAMData_TOC));
 	smum_send_msg_to_smc_with_parameter(hwmgr,
-- 
2.25.1

^ permalink raw reply related	[flat|nested] 126+ messages in thread

* [PATCH v6 11/16] drm/sched: Make timeout timer rearm conditional.
  2021-05-10 16:36 ` Andrey Grodzovsky
  (?)
@ 2021-05-10 16:36   ` Andrey Grodzovsky
  -1 siblings, 0 replies; 126+ messages in thread
From: Andrey Grodzovsky @ 2021-05-10 16:36 UTC (permalink / raw)
  To: dri-devel, amd-gfx, linux-pci, ckoenig.leichtzumerken,
	daniel.vetter, Harry.Wentland
  Cc: ppaalanen, Alexander.Deucher, gregkh, helgaas, Felix.Kuehling,
	Andrey Grodzovsky

Don't rearm the timeout timer if the driver's timedout_job hook
reports that the device is gone.

v5: Update drm_gpu_sched_stat values in code.
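
For readers who want to try the idea outside the kernel, here is a minimal
user-space sketch of the conditional rearm. It is not the scheduler code:
toy_sched, toy_timedout_job, toy_job_timedout and the SCHED_STAT_* values
are hypothetical stand-ins for the scheduler, the driver hook, the timeout
work handler and enum drm_gpu_sched_stat.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; names are illustrative only. */
enum sched_stat {
	SCHED_STAT_NOMINAL,	/* hook handled the timeout, keep the timer going */
	SCHED_STAT_ENODEV,	/* hook reports that the device is gone */
};

struct toy_sched {
	bool device_present;	/* what the driver hook would find out */
	int timer_armed;	/* how many times the timeout timer was rearmed */
};

/* Models the driver's timedout_job hook. */
static enum sched_stat toy_timedout_job(struct toy_sched *sched)
{
	return sched->device_present ? SCHED_STAT_NOMINAL : SCHED_STAT_ENODEV;
}

/* Models the timeout work handler: rearm only if the device is still there. */
static void toy_job_timedout(struct toy_sched *sched)
{
	enum sched_stat status;

	assert(sched != NULL);
	status = toy_timedout_job(sched);

	if (status != SCHED_STAT_ENODEV)
		sched->timer_armed++;
}
```

Calling toy_job_timedout() on a scheduler whose device is present bumps
timer_armed; calling it after the device is gone leaves the counter alone,
which is the behaviour this patch is after.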

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
---
 drivers/gpu/drm/scheduler/sched_main.c | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index f4f474944169..8d1211e87101 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -314,6 +314,7 @@ static void drm_sched_job_timedout(struct work_struct *work)
 {
 	struct drm_gpu_scheduler *sched;
 	struct drm_sched_job *job;
+	enum drm_gpu_sched_stat status = DRM_GPU_SCHED_STAT_NOMINAL;
 
 	sched = container_of(work, struct drm_gpu_scheduler, work_tdr.work);
 
@@ -331,7 +332,7 @@ static void drm_sched_job_timedout(struct work_struct *work)
 		list_del_init(&job->list);
 		spin_unlock(&sched->job_list_lock);
 
-		job->sched->ops->timedout_job(job);
+		status = job->sched->ops->timedout_job(job);
 
 		/*
 		 * Guilty job did complete and hence needs to be manually removed
@@ -345,9 +346,11 @@ static void drm_sched_job_timedout(struct work_struct *work)
 		spin_unlock(&sched->job_list_lock);
 	}
 
-	spin_lock(&sched->job_list_lock);
-	drm_sched_start_timeout(sched);
-	spin_unlock(&sched->job_list_lock);
+	if (status != DRM_GPU_SCHED_STAT_ENODEV) {
+		spin_lock(&sched->job_list_lock);
+		drm_sched_start_timeout(sched);
+		spin_unlock(&sched->job_list_lock);
+	}
 }
 
  /**
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 126+ messages in thread

* [PATCH v6 12/16] drm/amdgpu: Prevent any job recoveries after device is unplugged.
  2021-05-10 16:36 ` Andrey Grodzovsky
  (?)
@ 2021-05-10 16:36   ` Andrey Grodzovsky
  -1 siblings, 0 replies; 126+ messages in thread
From: Andrey Grodzovsky @ 2021-05-10 16:36 UTC (permalink / raw)
  To: dri-devel, amd-gfx, linux-pci, ckoenig.leichtzumerken,
	daniel.vetter, Harry.Wentland
  Cc: ppaalanen, Alexander.Deucher, gregkh, helgaas, Felix.Kuehling,
	Andrey Grodzovsky

Return DRM_GPU_SCHED_STAT_ENODEV back to the scheduler when the device
is not present so the timeout timer will not be rearmed.

v5: Update to match updated return values in enum drm_gpu_sched_stat
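
The control flow can be sketched in user space as below. This is only an
illustration under assumptions: toy_dev, toy_dev_enter, toy_dev_exit and
toy_handle_timeout are hypothetical names that model drm_dev_enter(),
drm_dev_exit() and the timeout handler, and the counters exist only to
check that every successful enter is matched by an exit.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical status values modelled on enum drm_gpu_sched_stat. */
enum toy_stat { TOY_STAT_NOMINAL, TOY_STAT_ENODEV };

struct toy_dev {
	bool unplugged;		/* set once the device has been removed */
	int active_sections;	/* enter calls not yet matched by an exit */
	int recoveries;		/* how many full recoveries were attempted */
};

/* Models drm_dev_enter(): fails once the device is marked unplugged. */
static bool toy_dev_enter(struct toy_dev *dev)
{
	if (dev->unplugged)
		return false;
	dev->active_sections++;
	return true;
}

/* Models drm_dev_exit(): closes the section opened by toy_dev_enter(). */
static void toy_dev_exit(struct toy_dev *dev)
{
	assert(dev->active_sections > 0);
	dev->active_sections--;
}

/* Models the timeout handler: one early ENODEV return, one common exit. */
static enum toy_stat toy_handle_timeout(struct toy_dev *dev, bool soft_recovered)
{
	if (!toy_dev_enter(dev))
		return TOY_STAT_ENODEV;	/* nothing to undo, enter failed */

	if (soft_recovered)
		goto exit;

	dev->recoveries++;		/* stands in for the real GPU recovery */

exit:
	toy_dev_exit(dev);
	return TOY_STAT_NOMINAL;
}
```

Funnelling the success paths through a single exit label is what keeps
enter and exit balanced; only the failed-enter path returns directly.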

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 19 ++++++++++++++++---
 1 file changed, 16 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
index 759b34799221..d33e6d97cc89 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
@@ -25,6 +25,8 @@
 #include <linux/wait.h>
 #include <linux/sched.h>
 
+#include <drm/drm_drv.h>
+
 #include "amdgpu.h"
 #include "amdgpu_trace.h"
 
@@ -34,6 +36,15 @@ static enum drm_gpu_sched_stat amdgpu_job_timedout(struct drm_sched_job *s_job)
 	struct amdgpu_job *job = to_amdgpu_job(s_job);
 	struct amdgpu_task_info ti;
 	struct amdgpu_device *adev = ring->adev;
+	int idx;
+
+	if (!drm_dev_enter(&adev->ddev, &idx)) {
+		DRM_INFO("%s - device unplugged, skipping recovery on scheduler: %s\n",
+			 __func__, s_job->sched->name);
+
+		/* Effectively the job is aborted as the device is gone */
+		return DRM_GPU_SCHED_STAT_ENODEV;
+	}
 
 	memset(&ti, 0, sizeof(struct amdgpu_task_info));
 
@@ -41,7 +52,7 @@ static enum drm_gpu_sched_stat amdgpu_job_timedout(struct drm_sched_job *s_job)
 	    amdgpu_ring_soft_recovery(ring, job->vmid, s_job->s_fence->parent)) {
 		DRM_ERROR("ring %s timeout, but soft recovered\n",
 			  s_job->sched->name);
-		return DRM_GPU_SCHED_STAT_NOMINAL;
+		goto exit;
 	}
 
 	amdgpu_vm_get_task_info(ring->adev, job->pasid, &ti);
@@ -53,13 +64,15 @@ static enum drm_gpu_sched_stat amdgpu_job_timedout(struct drm_sched_job *s_job)
 
 	if (amdgpu_device_should_recover_gpu(ring->adev)) {
 		amdgpu_device_gpu_recover(ring->adev, job);
-		return DRM_GPU_SCHED_STAT_NOMINAL;
 	} else {
 		drm_sched_suspend_timeout(&ring->sched);
 		if (amdgpu_sriov_vf(adev))
 			adev->virt.tdr_debug = true;
-		return DRM_GPU_SCHED_STAT_NOMINAL;
 	}
+
+exit:
+	drm_dev_exit(idx);
+	return DRM_GPU_SCHED_STAT_NOMINAL;
 }
 
 int amdgpu_job_alloc(struct amdgpu_device *adev, unsigned num_ibs,
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 126+ messages in thread

* [PATCH v6 12/16] drm/amdgpu: Prevent any job recoveries after device is unplugged.
@ 2021-05-10 16:36   ` Andrey Grodzovsky
  0 siblings, 0 replies; 126+ messages in thread
From: Andrey Grodzovsky @ 2021-05-10 16:36 UTC (permalink / raw)
  To: dri-devel, amd-gfx, linux-pci, ckoenig.leichtzumerken,
	daniel.vetter, Harry.Wentland
  Cc: gregkh, Felix.Kuehling, helgaas, Alexander.Deucher

Return DRM_TASK_STATUS_ENODEV back to the scheduler when device
is not present so they timeout timer will not be rearmed.

v5: Update to match updated return values in enum drm_gpu_sched_stat

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 19 ++++++++++++++++---
 1 file changed, 16 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
index 759b34799221..d33e6d97cc89 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
@@ -25,6 +25,8 @@
 #include <linux/wait.h>
 #include <linux/sched.h>
 
+#include <drm/drm_drv.h>
+
 #include "amdgpu.h"
 #include "amdgpu_trace.h"
 
@@ -34,6 +36,15 @@ static enum drm_gpu_sched_stat amdgpu_job_timedout(struct drm_sched_job *s_job)
 	struct amdgpu_job *job = to_amdgpu_job(s_job);
 	struct amdgpu_task_info ti;
 	struct amdgpu_device *adev = ring->adev;
+	int idx;
+
+	if (!drm_dev_enter(&adev->ddev, &idx)) {
+		DRM_INFO("%s - device unplugged skipping recovery on scheduler:%s",
+			 __func__, s_job->sched->name);
+
+		/* Effectively the job is aborted as the device is gone */
+		return DRM_GPU_SCHED_STAT_ENODEV;
+	}
 
 	memset(&ti, 0, sizeof(struct amdgpu_task_info));
 
@@ -41,7 +52,7 @@ static enum drm_gpu_sched_stat amdgpu_job_timedout(struct drm_sched_job *s_job)
 	    amdgpu_ring_soft_recovery(ring, job->vmid, s_job->s_fence->parent)) {
 		DRM_ERROR("ring %s timeout, but soft recovered\n",
 			  s_job->sched->name);
-		return DRM_GPU_SCHED_STAT_NOMINAL;
+		goto exit;
 	}
 
 	amdgpu_vm_get_task_info(ring->adev, job->pasid, &ti);
@@ -53,13 +64,15 @@ static enum drm_gpu_sched_stat amdgpu_job_timedout(struct drm_sched_job *s_job)
 
 	if (amdgpu_device_should_recover_gpu(ring->adev)) {
 		amdgpu_device_gpu_recover(ring->adev, job);
-		return DRM_GPU_SCHED_STAT_NOMINAL;
 	} else {
 		drm_sched_suspend_timeout(&ring->sched);
 		if (amdgpu_sriov_vf(adev))
 			adev->virt.tdr_debug = true;
-		return DRM_GPU_SCHED_STAT_NOMINAL;
 	}
+
+exit:
+	drm_dev_exit(idx);
+	return DRM_GPU_SCHED_STAT_NOMINAL;
 }
 
 int amdgpu_job_alloc(struct amdgpu_device *adev, unsigned num_ibs,
-- 
2.25.1
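
A minimal userspace sketch of the control flow this patch introduces,
for readers who want to try the pattern outside the kernel. It is not
the driver code: fake_dev, fake_dev_enter, fake_dev_exit and
fake_job_timedout are hypothetical stand-ins for the DRM device,
drm_dev_enter, drm_dev_exit and amdgpu_job_timedout, and the recovery
step is reduced to a counter. It only illustrates that a failed enter
returns ENODEV without touching the device, and that every other path
leaves through a single exit label so enter and exit stay paired.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel types; not the real DRM API. */
enum sched_stat { SCHED_STAT_NOMINAL, SCHED_STAT_ENODEV };

struct fake_dev {
	bool unplugged;   /* set once the device has been removed */
	int  users;       /* callers currently inside an enter/exit section */
	int  recoveries;  /* how many times full recovery was attempted */
};

/* Models drm_dev_enter(): fails once the device is marked unplugged. */
static bool fake_dev_enter(struct fake_dev *dev, int *idx)
{
	if (dev->unplugged)
		return false;
	*idx = dev->users++;
	return true;
}

/* Models drm_dev_exit(): must pair with every successful enter. */
static void fake_dev_exit(struct fake_dev *dev, int idx)
{
	dev->users--;
	assert(dev->users == idx);
}

/* Models the timeout handler: bail out early if the device is gone. */
static enum sched_stat fake_job_timedout(struct fake_dev *dev,
					 bool soft_recovered)
{
	int idx;

	if (!fake_dev_enter(dev, &idx))
		return SCHED_STAT_ENODEV; /* job is effectively aborted */

	if (soft_recovered)
		goto exit;                /* nothing more to do */

	dev->recoveries++;                /* stands in for GPU recovery */

exit:
	fake_dev_exit(dev, idx);
	return SCHED_STAT_NOMINAL;
}
```

Calling fake_job_timedout() on a device with unplugged set returns
SCHED_STAT_ENODEV and leaves the recovery counter untouched, which is
the behaviour the scheduler relies on to avoid rearming the timer.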


^ permalink raw reply related	[flat|nested] 126+ messages in thread

* [PATCH v6 13/16] drm/amdgpu: Fix hang on device removal.
  2021-05-10 16:36 ` Andrey Grodzovsky
  (?)
@ 2021-05-10 16:36   ` Andrey Grodzovsky
  -1 siblings, 0 replies; 126+ messages in thread
From: Andrey Grodzovsky @ 2021-05-10 16:36 UTC (permalink / raw)
  To: dri-devel, amd-gfx, linux-pci, ckoenig.leichtzumerken,
	daniel.vetter, Harry.Wentland
  Cc: ppaalanen, Alexander.Deucher, gregkh, helgaas, Felix.Kuehling,
	Andrey Grodzovsky

If the device is removed while commands are in flight, you cannot wait
to flush the HW fences on a ring, since the device is gone.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 16 ++++++++++------
 1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
index 1ffb36bd0b19..fa03702ecbfb 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
@@ -36,6 +36,7 @@
 #include <linux/firmware.h>
 #include <linux/pm_runtime.h>
 
+#include <drm/drm_drv.h>
 #include "amdgpu.h"
 #include "amdgpu_trace.h"
 
@@ -525,8 +526,7 @@ int amdgpu_fence_driver_init(struct amdgpu_device *adev)
  */
 void amdgpu_fence_driver_fini_hw(struct amdgpu_device *adev)
 {
-	unsigned i, j;
-	int r;
+	int i, r;
 
 	for (i = 0; i < AMDGPU_MAX_RINGS; i++) {
 		struct amdgpu_ring *ring = adev->rings[i];
@@ -535,11 +535,15 @@ void amdgpu_fence_driver_fini_hw(struct amdgpu_device *adev)
 			continue;
 		if (!ring->no_scheduler)
 			drm_sched_fini(&ring->sched);
-		r = amdgpu_fence_wait_empty(ring);
-		if (r) {
-			/* no need to trigger GPU reset as we are unloading */
+		/* You can't wait for HW to signal if it's gone */
+		if (!drm_dev_is_unplugged(&adev->ddev))
+			r = amdgpu_fence_wait_empty(ring);
+		else
+			r = -ENODEV;
+		/* no need to trigger GPU reset as we are unloading */
+		if (r)
 			amdgpu_fence_driver_force_completion(ring);
-		}
+
 		if (ring->fence_drv.irq_src)
 			amdgpu_irq_put(adev, ring->fence_drv.irq_src,
 				       ring->fence_drv.irq_type);
-- 
2.25.1
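
A minimal userspace sketch of the teardown decision in this patch,
under stated assumptions. fake_ring, fake_fence_wait_empty,
fake_force_completion and fake_ring_fini are hypothetical names, and
the "wait" here returns immediately instead of blocking on hardware.
The point it illustrates is only the ordering: when the device is
unplugged the wait is skipped entirely, an error is assumed, and the
outstanding fences are completed in software.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of a ring's fence state; not the amdgpu structs. */
struct fake_ring {
	unsigned int emitted;   /* last fence sequence number submitted */
	unsigned int signaled;  /* last sequence number the "HW" completed */
	bool waited;            /* records whether teardown tried to wait */
};

/* Models the HW wait: succeeds only if everything already signaled. */
static int fake_fence_wait_empty(struct fake_ring *ring)
{
	ring->waited = true;
	return ring->signaled == ring->emitted ? 0 : -ETIMEDOUT;
}

/* Models force completion: software marks every fence as signaled. */
static void fake_force_completion(struct fake_ring *ring)
{
	ring->signaled = ring->emitted;
}

/* Models the per-ring body of the fini loop after this patch. */
static int fake_ring_fini(struct fake_ring *ring, bool unplugged)
{
	int r;

	/* You can't wait for HW to signal if it's gone. */
	if (!unplugged)
		r = fake_fence_wait_empty(ring);
	else
		r = -ENODEV;

	/* No reset is attempted; pending fences are simply completed. */
	if (r)
		fake_force_completion(ring);

	return r;
}
```

With unplugged set, fake_ring_fini() never calls the wait helper, so a
ring with fences in flight cannot stall the removal path.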


^ permalink raw reply related	[flat|nested] 126+ messages in thread

* [PATCH v6 14/16] drm/scheduler: Fix hang when sched_entity released
  2021-05-10 16:36 ` Andrey Grodzovsky
  (?)
@ 2021-05-10 16:36   ` Andrey Grodzovsky
  -1 siblings, 0 replies; 126+ messages in thread
From: Andrey Grodzovsky @ 2021-05-10 16:36 UTC (permalink / raw)
  To: dri-devel, amd-gfx, linux-pci, ckoenig.leichtzumerken,
	daniel.vetter, Harry.Wentland
  Cc: ppaalanen, Alexander.Deucher, gregkh, helgaas, Felix.Kuehling,
	Andrey Grodzovsky, Christian König

Problem: If the scheduler is already stopped by the time a sched_entity
is released and the entity's job_queue is not empty, I encountered
a hang in drm_sched_entity_flush. This is because drm_sched_entity_is_idle
never becomes true.

Fix: In drm_sched_fini, mark all sched_entities on the scheduler's
run queues as stopped. This will satisfy drm_sched_entity_is_idle.
Also wake up all processes stuck in sched_entity flushing, as the
scheduler main thread which normally wakes them up is stopped by now.

v2:
Reverse the order of drm_sched_rq_remove_entity and marking
s_entity as stopped to prevent reinsertion back to the rq due
to a race.

v3:
Drop drm_sched_rq_remove_entity, only modify entity->stopped
and check for it in drm_sched_entity_is_idle

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/scheduler/sched_entity.c |  3 ++-
 drivers/gpu/drm/scheduler/sched_main.c   | 24 ++++++++++++++++++++++++
 2 files changed, 26 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
index 0249c7450188..2e93e881b65f 100644
--- a/drivers/gpu/drm/scheduler/sched_entity.c
+++ b/drivers/gpu/drm/scheduler/sched_entity.c
@@ -116,7 +116,8 @@ static bool drm_sched_entity_is_idle(struct drm_sched_entity *entity)
 	rmb(); /* for list_empty to work without lock */
 
 	if (list_empty(&entity->list) ||
-	    spsc_queue_count(&entity->job_queue) == 0)
+	    spsc_queue_count(&entity->job_queue) == 0 ||
+	    entity->stopped)
 		return true;
 
 	return false;
diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index 8d1211e87101..a2a953693b45 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -898,9 +898,33 @@ EXPORT_SYMBOL(drm_sched_init);
  */
 void drm_sched_fini(struct drm_gpu_scheduler *sched)
 {
+	struct drm_sched_entity *s_entity;
+	int i;
+
 	if (sched->thread)
 		kthread_stop(sched->thread);
 
+	for (i = DRM_SCHED_PRIORITY_COUNT - 1; i >= DRM_SCHED_PRIORITY_MIN; i--) {
+		struct drm_sched_rq *rq = &sched->sched_rq[i];
+
+		if (!rq)
+			continue;
+
+		spin_lock(&rq->lock);
+		list_for_each_entry(s_entity, &rq->entities, list)
+			/*
+			 * Prevents reinsertion and marks job_queue as idle,
+			 * it will be removed from rq in drm_sched_entity_fini
+			 * eventually
+			 */
+			s_entity->stopped = true;
+		spin_unlock(&rq->lock);
+
+	}
+
+	/* Wakeup everyone stuck in drm_sched_entity_flush for this scheduler */
+	wake_up_all(&sched->job_scheduled);
+
 	/* Confirm no work left behind accessing device structures */
 	cancel_delayed_work_sync(&sched->work_tdr);
 
-- 
2.25.1
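
A minimal userspace sketch of the idle predicate after this patch,
assuming a simplified entity. fake_entity, fake_entity_is_idle and
fake_sched_fini are hypothetical names; the real code checks a list
head and an spsc queue under memory barriers and wakes sleepers with
wake_up_all, none of which is modelled here. The sketch only shows why
setting the stopped flag is enough to release a flusher even though
jobs are still queued.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of a scheduler entity. */
struct fake_entity {
	bool on_rq;        /* entity is linked into a run queue */
	int  queued_jobs;  /* jobs still waiting in the entity's queue */
	bool stopped;      /* set when the scheduler is torn down */
};

/* Mirrors the three-way condition in the patched idle check. */
static bool fake_entity_is_idle(const struct fake_entity *e)
{
	return !e->on_rq || e->queued_jobs == 0 || e->stopped;
}

/* Models the loop added to scheduler teardown: mark, do not remove. */
static void fake_sched_fini(struct fake_entity *entities, size_t n)
{
	for (size_t i = 0; i < n; i++)
		entities[i].stopped = true;
	/* The kernel patch then wakes all waiters so they re-check. */
}
```

An entity that is still on a run queue with queued jobs is not idle
before fake_sched_fini() and is idle afterwards, while its job count is
left untouched for the later entity teardown to clean up.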


^ permalink raw reply related	[flat|nested] 126+ messages in thread

* [PATCH v6 15/16] drm/amd/display: Remove superfluous drm_mode_config_cleanup
  2021-05-10 16:36 ` Andrey Grodzovsky
  (?)
@ 2021-05-10 16:36   ` Andrey Grodzovsky
  -1 siblings, 0 replies; 126+ messages in thread
From: Andrey Grodzovsky @ 2021-05-10 16:36 UTC (permalink / raw)
  To: dri-devel, amd-gfx, linux-pci, ckoenig.leichtzumerken,
	daniel.vetter, Harry.Wentland
  Cc: ppaalanen, Alexander.Deucher, gregkh, helgaas, Felix.Kuehling,
	Andrey Grodzovsky

It's already being released by DRM core through devm

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
---
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 6c2c6a51ce6c..9728a0158bcb 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -3757,7 +3757,6 @@ static int amdgpu_dm_initialize_drm_device(struct amdgpu_device *adev)
 
 static void amdgpu_dm_destroy_drm_device(struct amdgpu_display_manager *dm)
 {
-	drm_mode_config_cleanup(dm->ddev);
 	drm_atomic_private_obj_fini(&dm->atomic_obj);
 	return;
 }
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 126+ messages in thread

* [PATCH v6 16/16] drm/amdgpu: Verify DMA operations from device are done
  2021-05-10 16:36 ` Andrey Grodzovsky
  (?)
@ 2021-05-10 16:36   ` Andrey Grodzovsky
  -1 siblings, 0 replies; 126+ messages in thread
From: Andrey Grodzovsky @ 2021-05-10 16:36 UTC (permalink / raw)
  To: dri-devel, amd-gfx, linux-pci, ckoenig.leichtzumerken,
	daniel.vetter, Harry.Wentland
  Cc: ppaalanen, Alexander.Deucher, gregkh, helgaas, Felix.Kuehling,
	Andrey Grodzovsky

In case the device removal is only simulated through sysfs, verify that
the device doesn't keep doing DMA to the released memory after
pci_remove is done.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 83006f45b10b..5e6af9e0b7bf 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -1314,7 +1314,13 @@ amdgpu_pci_remove(struct pci_dev *pdev)
 	drm_dev_unplug(dev);
 	amdgpu_driver_unload_kms(dev);
 
+	/*
+	 * Flush any in flight DMA operations from device.
+	 * Clear the Bus Master Enable bit and then wait on the PCIe Device
+	 * Status Transactions Pending bit.
+	 */
 	pci_disable_device(pdev);
+	pci_wait_for_pending_transaction(pdev);
 }
 
 static void
-- 
2.25.1
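
A minimal userspace sketch of what "wait on the Transactions Pending
bit" amounts to, under stated assumptions. The register read is
replaced by a hypothetical callback (read_devsta_fn), the helper names
wait_for_pending_transactions and fake_read_devsta are invented for
illustration, and no sleeping between polls is modelled. The bit value
0x0020 is the Transactions Pending bit of the PCIe Device Status
register. This is not the kernel's pci_wait_for_pending_transaction,
only an illustration of bounded polling on that bit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEVSTA_TRPND 0x0020u /* Transactions Pending, PCIe Device Status */

/* Hypothetical callback returning the current Device Status value. */
typedef uint16_t (*read_devsta_fn)(void *ctx);

/*
 * Poll until the device reports no pending transactions, or give up
 * after max_polls reads. Returns true if the bit cleared in time.
 */
static bool wait_for_pending_transactions(read_devsta_fn read_devsta,
					  void *ctx, int max_polls)
{
	for (int i = 0; i < max_polls; i++) {
		/* A real implementation would sleep between polls. */
		if (!(read_devsta(ctx) & DEVSTA_TRPND))
			return true;
	}
	return false;
}

/* Test double: reports pending for *remaining reads, then clear. */
static uint16_t fake_read_devsta(void *ctx)
{
	int *remaining = ctx;

	if (*remaining > 0) {
		(*remaining)--;
		return DEVSTA_TRPND;
	}
	return 0;
}
```

The bounded loop matters for the simulated-removal case: if the device
never drains its DMA, the caller learns that from the false return
rather than blocking the remove path indefinitely.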


^ permalink raw reply related	[flat|nested] 126+ messages in thread

* Re: [PATCH v6 02/16] drm/ttm: Expose ttm_tt_unpopulate for driver use
  2021-05-10 16:36   ` Andrey Grodzovsky
  (?)
@ 2021-05-10 18:27     ` Felix Kuehling
  -1 siblings, 0 replies; 126+ messages in thread
From: Felix Kuehling @ 2021-05-10 18:27 UTC (permalink / raw)
  To: Andrey Grodzovsky, dri-devel, amd-gfx, linux-pci,
	ckoenig.leichtzumerken, daniel.vetter, Harry.Wentland
  Cc: ppaalanen, Alexander.Deucher, gregkh, helgaas

On 2021-05-10 at 12:36 p.m., Andrey Grodzovsky wrote:
> It's needed to drop iommu backed pages on device unplug
> before device's IOMMU group is released.

I don't see any calls to ttm_tt_unpopulate in the rest of the series
now. Is that an accident, or can this patch be dropped?

Regards,
  Felix


>
> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> ---
>  drivers/gpu/drm/ttm/ttm_tt.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/drivers/gpu/drm/ttm/ttm_tt.c b/drivers/gpu/drm/ttm/ttm_tt.c
> index 539e0232cb3b..dfbe1ea8763f 100644
> --- a/drivers/gpu/drm/ttm/ttm_tt.c
> +++ b/drivers/gpu/drm/ttm/ttm_tt.c
> @@ -433,3 +433,4 @@ void ttm_tt_mgr_init(unsigned long num_pages, unsigned long num_dma32_pages)
>  	if (!ttm_dma32_pages_limit)
>  		ttm_dma32_pages_limit = num_dma32_pages;
>  }
> +EXPORT_SYMBOL(ttm_tt_unpopulate);

^ permalink raw reply	[flat|nested] 126+ messages in thread

* Re: [PATCH v6 02/16] drm/ttm: Expose ttm_tt_unpopulate for driver use
  2021-05-10 18:27     ` Felix Kuehling
  (?)
@ 2021-05-10 18:32       ` Andrey Grodzovsky
  -1 siblings, 0 replies; 126+ messages in thread
From: Andrey Grodzovsky @ 2021-05-10 18:32 UTC (permalink / raw)
  To: Felix Kuehling, dri-devel, amd-gfx, linux-pci,
	ckoenig.leichtzumerken, daniel.vetter, Harry.Wentland
  Cc: ppaalanen, Alexander.Deucher, gregkh, helgaas


On 2021-05-10 2:27 p.m., Felix Kuehling wrote:
> On 2021-05-10 at 12:36 p.m., Andrey Grodzovsky wrote:
>> It's needed to drop iommu backed pages on device unplug
>> before device's IOMMU group is released.
> I don't see any calls to ttm_tt_unpopulate in the rest of the series
> now. Is that an accident, or can this patch be dropped?
>
> Regards,
>    Felix


You are right, it can be dropped because it's not required post 5.11
kernel (at least not in the use cases I tested).

Andrey

>
>
>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>> ---
>>   drivers/gpu/drm/ttm/ttm_tt.c | 1 +
>>   1 file changed, 1 insertion(+)
>>
>> diff --git a/drivers/gpu/drm/ttm/ttm_tt.c b/drivers/gpu/drm/ttm/ttm_tt.c
>> index 539e0232cb3b..dfbe1ea8763f 100644
>> --- a/drivers/gpu/drm/ttm/ttm_tt.c
>> +++ b/drivers/gpu/drm/ttm/ttm_tt.c
>> @@ -433,3 +433,4 @@ void ttm_tt_mgr_init(unsigned long num_pages, unsigned long num_dma32_pages)
>>   	if (!ttm_dma32_pages_limit)
>>   		ttm_dma32_pages_limit = num_dma32_pages;
>>   }
>> +EXPORT_SYMBOL(ttm_tt_unpopulate);

^ permalink raw reply	[flat|nested] 126+ messages in thread

* Re: [PATCH v6 08/16] PCI: Add support for dev_groups to struct pci_device_driver
  2021-05-10 16:36   ` Andrey Grodzovsky
  (?)
@ 2021-05-10 20:56     ` Bjorn Helgaas
  -1 siblings, 0 replies; 126+ messages in thread
From: Bjorn Helgaas @ 2021-05-10 20:56 UTC (permalink / raw)
  To: Andrey Grodzovsky
  Cc: dri-devel, amd-gfx, linux-pci, ckoenig.leichtzumerken,
	daniel.vetter, Harry.Wentland, ppaalanen, Alexander.Deucher,
	gregkh, Felix.Kuehling

In subject:

  PCI: Add support for dev_groups to struct pci_driver

(not "struct pci_device_driver," which does not exist)

On Mon, May 10, 2021 at 12:36:17PM -0400, Andrey Grodzovsky wrote:
> This helps convert PCI drivers' sysfs attributes to static ones.
> 
> Analogous to b71b283e3d6d ("USB: add support for dev_groups to
> struct usb_driver")
> 
> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

With the subject change above,

Acked-by: Bjorn Helgaas <bhelgaas@google.com>

> ---
>  drivers/pci/pci-driver.c | 1 +
>  include/linux/pci.h      | 3 +++
>  2 files changed, 4 insertions(+)
> 
> diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
> index ec44a79e951a..3a72352aa5cf 100644
> --- a/drivers/pci/pci-driver.c
> +++ b/drivers/pci/pci-driver.c
> @@ -1385,6 +1385,7 @@ int __pci_register_driver(struct pci_driver *drv, struct module *owner,
>  	drv->driver.owner = owner;
>  	drv->driver.mod_name = mod_name;
>  	drv->driver.groups = drv->groups;
> +	drv->driver.dev_groups = drv->dev_groups;
>  
>  	spin_lock_init(&drv->dynids.lock);
>  	INIT_LIST_HEAD(&drv->dynids.list);
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index 86c799c97b77..b57755b03009 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -858,6 +858,8 @@ struct module;
>   *		number of VFs to enable via sysfs "sriov_numvfs" file.
>   * @err_handler: See Documentation/PCI/pci-error-recovery.rst
>   * @groups:	Sysfs attribute groups.
> + * @dev_groups: Attributes attached to the device that will be
> + *              created once it is bound to the driver.
>   * @driver:	Driver model structure.
>   * @dynids:	List of dynamically added device IDs.
>   */
> @@ -873,6 +875,7 @@ struct pci_driver {
>  	int  (*sriov_configure)(struct pci_dev *dev, int num_vfs); /* On PF */
>  	const struct pci_error_handlers *err_handler;
>  	const struct attribute_group **groups;
> +	const struct attribute_group **dev_groups;
>  	struct device_driver	driver;
>  	struct pci_dynids	dynids;
>  };
> -- 
> 2.25.1
> 
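
For readers of the archive, the effect of the one-line change to
__pci_register_driver() can be sketched outside the kernel. The following
is a minimal userspace model, not kernel code: the struct definitions are
simplified stand-ins, and model_pci_register_driver(), example_dev_group,
example_dev_groups and example_pci_driver are hypothetical names used only
for illustration. It shows a driver supplying a static, NULL-terminated
dev_groups array that registration copies into the embedded
struct device_driver.

```c
/*
 * Userspace model, NOT kernel code: the structs below are simplified
 * stand-ins for the kernel's struct device_driver and struct pci_driver.
 */
#include <stddef.h>

struct attribute_group {
	const char *name;
};

struct device_driver {
	const struct attribute_group **groups;     /* driver-level attributes */
	const struct attribute_group **dev_groups; /* per-device attributes */
};

struct pci_driver {
	const char *name;
	const struct attribute_group **groups;
	const struct attribute_group **dev_groups; /* field added by the patch */
	struct device_driver driver;
};

/* Mirrors the two group assignments visible in the quoted diff. */
static int model_pci_register_driver(struct pci_driver *drv)
{
	if (!drv)
		return -1;
	drv->driver.groups = drv->groups;
	drv->driver.dev_groups = drv->dev_groups;
	return 0;
}

/* Hypothetical driver with one static, NULL-terminated group list. */
static const struct attribute_group example_dev_group = {
	.name = "example",
};

static const struct attribute_group *example_dev_groups[] = {
	&example_dev_group,
	NULL,
};

static struct pci_driver example_pci_driver = {
	.name = "example",
	.dev_groups = example_dev_groups,
};
```

After registration, example_pci_driver.driver.dev_groups points at the
same static array. In the real kernel the driver core is then expected to
create those attributes when a device is bound to the driver, as the
@dev_groups kernel-doc in the patch describes, so the driver needs no
explicit sysfs create/remove calls for them.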

^ permalink raw reply	[flat|nested] 126+ messages in thread

* Re: [PATCH v6 15/16] drm/amd/display: Remove superflous drm_mode_config_cleanup
  2021-05-10 16:36   ` Andrey Grodzovsky
  (?)
@ 2021-05-10 21:38     ` Rodrigo Siqueira
  -1 siblings, 0 replies; 126+ messages in thread
From: Rodrigo Siqueira @ 2021-05-10 21:38 UTC (permalink / raw)
  To: Andrey Grodzovsky
  Cc: dri-devel, amd-gfx, linux-pci, ckoenig.leichtzumerken,
	daniel.vetter, Harry.Wentland, gregkh, Felix.Kuehling, ppaalanen,
	helgaas, Alexander.Deucher

[-- Attachment #1: Type: text/plain, Size: 1526 bytes --]

lgtm,

Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>

On 05/10, Andrey Grodzovsky wrote:
> It's already being released by DRM core through devm
> 
> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> ---
>  drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> index 6c2c6a51ce6c..9728a0158bcb 100644
> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> @@ -3757,7 +3757,6 @@ static int amdgpu_dm_initialize_drm_device(struct amdgpu_device *adev)
>  
>  static void amdgpu_dm_destroy_drm_device(struct amdgpu_display_manager *dm)
>  {
> -	drm_mode_config_cleanup(dm->ddev);
>  	drm_atomic_private_obj_fini(&dm->atomic_obj);
>  	return;
>  }
> -- 
> 2.25.1
> 

-- 
Rodrigo Siqueira
https://siqueira.tech

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 126+ messages in thread

* Re: [PATCH v6 03/16] drm/amdgpu: Split amdgpu_device_fini into early and late
  2021-05-10 16:36   ` Andrey Grodzovsky
  (?)
  (?)
@ 2021-05-10 23:49     ` kernel test robot
  -1 siblings, 0 replies; 126+ messages in thread
From: kernel test robot @ 2021-05-10 23:49 UTC (permalink / raw)
  To: Andrey Grodzovsky, dri-devel, amd-gfx, linux-pci,
	ckoenig.leichtzumerken, daniel.vetter, Harry.Wentland
  Cc: kbuild-all, ppaalanen, Alexander.Deucher, gregkh, helgaas

[-- Attachment #1: Type: text/plain, Size: 5082 bytes --]

Hi Andrey,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on drm-intel/for-linux-next]
[also build test WARNING on drm-tip/drm-tip drm-exynos/exynos-drm-next tegra-drm/drm/tegra/for-next linus/master v5.13-rc1 next-20210510]
[cannot apply to pci/next drm/drm-next]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Andrey-Grodzovsky/RFC-Support-hot-device-unplug-in-amdgpu/20210511-003754
base:   git://anongit.freedesktop.org/drm-intel for-linux-next
config: x86_64-randconfig-a012-20210510 (attached as .config)
compiler: clang version 13.0.0 (https://github.com/llvm/llvm-project 492173d42b32cb91d5d0d72d5ed84fcab80d059a)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # install x86_64 cross compiling tool for clang build
        # apt-get install binutils-x86-64-linux-gnu
        # https://github.com/0day-ci/linux/commit/28901216b0a25add4057d60c10eb305d4a32535e
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Andrey-Grodzovsky/RFC-Support-hot-device-unplug-in-amdgpu/20210511-003754
        git checkout 28901216b0a25add4057d60c10eb305d4a32535e
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 ARCH=x86_64 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

   drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c:444: warning: Function parameter or member 'sched_score' not described in 'amdgpu_fence_driver_init_ring'
>> drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c:527: warning: expecting prototype for amdgpu_fence_driver_fini(). Prototype was for amdgpu_fence_driver_fini_hw() instead
--
>> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:3652: warning: expecting prototype for amdgpu_device_fini(). Prototype was for amdgpu_device_fini_hw() instead
--
>> drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c:376: warning: expecting prototype for amdgpu_irq_fini(). Prototype was for amdgpu_irq_fini_sw() instead


vim +527 drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c

d38ceaf99ed015 Alex Deucher      2015-04-20  517  
d38ceaf99ed015 Alex Deucher      2015-04-20  518  /**
d38ceaf99ed015 Alex Deucher      2015-04-20  519   * amdgpu_fence_driver_fini - tear down the fence driver
d38ceaf99ed015 Alex Deucher      2015-04-20  520   * for all possible rings.
d38ceaf99ed015 Alex Deucher      2015-04-20  521   *
d38ceaf99ed015 Alex Deucher      2015-04-20  522   * @adev: amdgpu device pointer
d38ceaf99ed015 Alex Deucher      2015-04-20  523   *
d38ceaf99ed015 Alex Deucher      2015-04-20  524   * Tear down the fence driver for all possible rings (all asics).
d38ceaf99ed015 Alex Deucher      2015-04-20  525   */
28901216b0a25a Andrey Grodzovsky 2021-05-10  526  void amdgpu_fence_driver_fini_hw(struct amdgpu_device *adev)
d38ceaf99ed015 Alex Deucher      2015-04-20 @527  {
c89377d10a11e5 Christian König   2016-03-13  528  	unsigned i, j;
c89377d10a11e5 Christian König   2016-03-13  529  	int r;
d38ceaf99ed015 Alex Deucher      2015-04-20  530  
d38ceaf99ed015 Alex Deucher      2015-04-20  531  	for (i = 0; i < AMDGPU_MAX_RINGS; i++) {
d38ceaf99ed015 Alex Deucher      2015-04-20  532  		struct amdgpu_ring *ring = adev->rings[i];
c2776afe740db5 Christian König   2015-11-03  533  
d38ceaf99ed015 Alex Deucher      2015-04-20  534  		if (!ring || !ring->fence_drv.initialized)
d38ceaf99ed015 Alex Deucher      2015-04-20  535  			continue;
bb0cd09be45ea4 Emily Deng        2021-03-04  536  		if (!ring->no_scheduler)
bb0cd09be45ea4 Emily Deng        2021-03-04  537  			drm_sched_fini(&ring->sched);
d38ceaf99ed015 Alex Deucher      2015-04-20  538  		r = amdgpu_fence_wait_empty(ring);
d38ceaf99ed015 Alex Deucher      2015-04-20  539  		if (r) {
d38ceaf99ed015 Alex Deucher      2015-04-20  540  			/* no need to trigger GPU reset as we are unloading */
2f9d4084cac96a Monk Liu          2017-10-16  541  			amdgpu_fence_driver_force_completion(ring);
d38ceaf99ed015 Alex Deucher      2015-04-20  542  		}
55611b507fd645 Jack Xiao         2019-06-05  543  		if (ring->fence_drv.irq_src)
c6a4079badc2f0 Chunming Zhou     2015-06-01  544  			amdgpu_irq_put(adev, ring->fence_drv.irq_src,
c6a4079badc2f0 Chunming Zhou     2015-06-01  545  				       ring->fence_drv.irq_type);
bb0cd09be45ea4 Emily Deng        2021-03-04  546  
8c5e13ec6a2c26 Andrey Grodzovsky 2018-09-21  547  		del_timer_sync(&ring->fence_drv.fallback_timer);
28901216b0a25a Andrey Grodzovsky 2021-05-10  548  	}
28901216b0a25a Andrey Grodzovsky 2021-05-10  549  }
28901216b0a25a Andrey Grodzovsky 2021-05-10  550  
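
The three new warnings above all have the same shape: a kernel-doc comment
still names the old function while the prototype below it was renamed. A
minimal sketch of one likely fix for the amdgpu_fence.c case, assuming the
comment text is otherwise kept unchanged, is to rename the kernel-doc
header to match the prototype. struct amdgpu_device is reduced to a
stand-in here so the fragment compiles outside the kernel tree, and the
function body is elided.

```c
/* Stand-in type so this sketch compiles outside the kernel tree. */
struct amdgpu_device {
	int placeholder;
};

/**
 * amdgpu_fence_driver_fini_hw - tear down the fence driver
 * for all possible rings.
 *
 * @adev: amdgpu device pointer
 *
 * Tear down the fence driver for all possible rings (all asics).
 */
void amdgpu_fence_driver_fini_hw(struct amdgpu_device *adev)
{
	/* Body elided; only the comment/prototype pairing matters here. */
	(void)adev;
}
```

The same rename would presumably apply to the comments above
amdgpu_device_fini_hw() and amdgpu_irq_fini_sw().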

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 36692 bytes --]

^ permalink raw reply	[flat|nested] 126+ messages in thread

* Re: [PATCH v6 03/16] drm/amdgpu: Split amdgpu_device_fini into early and late
@ 2021-05-10 23:49     ` kernel test robot
  0 siblings, 0 replies; 126+ messages in thread
From: kernel test robot @ 2021-05-10 23:49 UTC (permalink / raw)
  To: Andrey Grodzovsky, dri-devel, amd-gfx, linux-pci,
	ckoenig.leichtzumerken, daniel.vetter, Harry.Wentland
  Cc: Alexander.Deucher, gregkh, ppaalanen, helgaas, kbuild-all

[-- Attachment #1: Type: text/plain, Size: 5082 bytes --]

Hi Andrey,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on drm-intel/for-linux-next]
[also build test WARNING on drm-tip/drm-tip drm-exynos/exynos-drm-next tegra-drm/drm/tegra/for-next linus/master v5.13-rc1 next-20210510]
[cannot apply to pci/next drm/drm-next]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Andrey-Grodzovsky/RFC-Support-hot-device-unplug-in-amdgpu/20210511-003754
base:   git://anongit.freedesktop.org/drm-intel for-linux-next
config: x86_64-randconfig-a012-20210510 (attached as .config)
compiler: clang version 13.0.0 (https://github.com/llvm/llvm-project 492173d42b32cb91d5d0d72d5ed84fcab80d059a)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # install x86_64 cross compiling tool for clang build
        # apt-get install binutils-x86-64-linux-gnu
        # https://github.com/0day-ci/linux/commit/28901216b0a25add4057d60c10eb305d4a32535e
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Andrey-Grodzovsky/RFC-Support-hot-device-unplug-in-amdgpu/20210511-003754
        git checkout 28901216b0a25add4057d60c10eb305d4a32535e
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 ARCH=x86_64 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

   drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c:444: warning: Function parameter or member 'sched_score' not described in 'amdgpu_fence_driver_init_ring'
>> drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c:527: warning: expecting prototype for amdgpu_fence_driver_fini(). Prototype was for amdgpu_fence_driver_fini_hw() instead
--
>> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:3652: warning: expecting prototype for amdgpu_device_fini(). Prototype was for amdgpu_device_fini_hw() instead
--
>> drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c:376: warning: expecting prototype for amdgpu_irq_fini(). Prototype was for amdgpu_irq_fini_sw() instead


vim +527 drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c

d38ceaf99ed015 Alex Deucher      2015-04-20  517  
d38ceaf99ed015 Alex Deucher      2015-04-20  518  /**
d38ceaf99ed015 Alex Deucher      2015-04-20  519   * amdgpu_fence_driver_fini - tear down the fence driver
d38ceaf99ed015 Alex Deucher      2015-04-20  520   * for all possible rings.
d38ceaf99ed015 Alex Deucher      2015-04-20  521   *
d38ceaf99ed015 Alex Deucher      2015-04-20  522   * @adev: amdgpu device pointer
d38ceaf99ed015 Alex Deucher      2015-04-20  523   *
d38ceaf99ed015 Alex Deucher      2015-04-20  524   * Tear down the fence driver for all possible rings (all asics).
d38ceaf99ed015 Alex Deucher      2015-04-20  525   */
28901216b0a25a Andrey Grodzovsky 2021-05-10  526  void amdgpu_fence_driver_fini_hw(struct amdgpu_device *adev)
d38ceaf99ed015 Alex Deucher      2015-04-20 @527  {
c89377d10a11e5 Christian König   2016-03-13  528  	unsigned i, j;
c89377d10a11e5 Christian König   2016-03-13  529  	int r;
d38ceaf99ed015 Alex Deucher      2015-04-20  530  
d38ceaf99ed015 Alex Deucher      2015-04-20  531  	for (i = 0; i < AMDGPU_MAX_RINGS; i++) {
d38ceaf99ed015 Alex Deucher      2015-04-20  532  		struct amdgpu_ring *ring = adev->rings[i];
c2776afe740db5 Christian König   2015-11-03  533  
d38ceaf99ed015 Alex Deucher      2015-04-20  534  		if (!ring || !ring->fence_drv.initialized)
d38ceaf99ed015 Alex Deucher      2015-04-20  535  			continue;
bb0cd09be45ea4 Emily Deng        2021-03-04  536  		if (!ring->no_scheduler)
bb0cd09be45ea4 Emily Deng        2021-03-04  537  			drm_sched_fini(&ring->sched);
d38ceaf99ed015 Alex Deucher      2015-04-20  538  		r = amdgpu_fence_wait_empty(ring);
d38ceaf99ed015 Alex Deucher      2015-04-20  539  		if (r) {
d38ceaf99ed015 Alex Deucher      2015-04-20  540  			/* no need to trigger GPU reset as we are unloading */
2f9d4084cac96a Monk Liu          2017-10-16  541  			amdgpu_fence_driver_force_completion(ring);
d38ceaf99ed015 Alex Deucher      2015-04-20  542  		}
55611b507fd645 Jack Xiao         2019-06-05  543  		if (ring->fence_drv.irq_src)
c6a4079badc2f0 Chunming Zhou     2015-06-01  544  			amdgpu_irq_put(adev, ring->fence_drv.irq_src,
c6a4079badc2f0 Chunming Zhou     2015-06-01  545  				       ring->fence_drv.irq_type);
bb0cd09be45ea4 Emily Deng        2021-03-04  546  
8c5e13ec6a2c26 Andrey Grodzovsky 2018-09-21  547  		del_timer_sync(&ring->fence_drv.fallback_timer);
28901216b0a25a Andrey Grodzovsky 2021-05-10  548  	}
28901216b0a25a Andrey Grodzovsky 2021-05-10  549  }
28901216b0a25a Andrey Grodzovsky 2021-05-10  550  
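
The W=1 warning at line 527 comes from the kernel-doc comment still naming amdgpu_fence_driver_fini while the function below it is now amdgpu_fence_driver_fini_hw. A minimal sketch of the kind of change that would likely silence it, assuming only the name in the comment needs updating. The struct below and the function body are hypothetical stand-ins so the fragment compiles on its own; they are not the driver's code:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in; the real struct amdgpu_device lives in the driver. */
struct amdgpu_device {
	int fence_hw_torn_down;
};

/**
 * amdgpu_fence_driver_fini_hw - tear down the fence driver
 * for all possible rings.
 *
 * @adev: amdgpu device pointer
 *
 * Tear down the fence driver for all possible rings (all asics).
 */
void amdgpu_fence_driver_fini_hw(struct amdgpu_device *adev)
{
	assert(adev != NULL);
	/* Placeholder body: the real function walks adev->rings[]. */
	adev->fence_hw_torn_down = 1;
}
```

The same rename in the comment header would presumably apply to the amdgpu_device_fini_hw() and amdgpu_irq_fini_sw() warnings listed above.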

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 36692 bytes --]

^ permalink raw reply	[flat|nested] 126+ messages in thread

* Re: [PATCH v6 01/16] drm/ttm: Remap all page faults to per process dummy page.
  2021-05-10 16:36   ` Andrey Grodzovsky
  (?)
@ 2021-05-11  6:38     ` Christian König
  -1 siblings, 0 replies; 126+ messages in thread
From: Christian König @ 2021-05-11  6:38 UTC (permalink / raw)
  To: Andrey Grodzovsky, dri-devel, amd-gfx, linux-pci, daniel.vetter,
	Harry.Wentland
  Cc: ppaalanen, Alexander.Deucher, gregkh, helgaas, Felix.Kuehling

Am 10.05.21 um 18:36 schrieb Andrey Grodzovsky:
> On device removal reroute all CPU mappings to dummy page.
>
> v3:
> Remove loop to find DRM file and instead access it
> by vma->vm_file->private_data. Move dummy page installation
> into a separate function.
>
> v4:
> Map the entire BOs VA space into on demand allocated dummy page
> on the first fault for that BO.
>
> v5: Remove duplicate return.
>
> v6: Polish ttm_bo_vm_dummy_page, remove superfluous code.
>
> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> ---
>   drivers/gpu/drm/ttm/ttm_bo_vm.c | 57 ++++++++++++++++++++++++++++++++-
>   include/drm/ttm/ttm_bo_api.h    |  2 ++
>   2 files changed, 58 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c b/drivers/gpu/drm/ttm/ttm_bo_vm.c
> index b31b18058965..e5a9615519d1 100644
> --- a/drivers/gpu/drm/ttm/ttm_bo_vm.c
> +++ b/drivers/gpu/drm/ttm/ttm_bo_vm.c
> @@ -34,6 +34,8 @@
>   #include <drm/ttm/ttm_bo_driver.h>
>   #include <drm/ttm/ttm_placement.h>
>   #include <drm/drm_vma_manager.h>
> +#include <drm/drm_drv.h>
> +#include <drm/drm_managed.h>
>   #include <linux/mm.h>
>   #include <linux/pfn_t.h>
>   #include <linux/rbtree.h>
> @@ -380,19 +382,72 @@ vm_fault_t ttm_bo_vm_fault_reserved(struct vm_fault *vmf,
>   }
>   EXPORT_SYMBOL(ttm_bo_vm_fault_reserved);
>   
> +static void ttm_bo_release_dummy_page(struct drm_device *dev, void *res)
> +{
> +	struct page *dummy_page = (struct page *)res;
> +
> +	__free_page(dummy_page);
> +}
> +
> +vm_fault_t ttm_bo_vm_dummy_page(struct vm_fault *vmf, pgprot_t prot)
> +{
> +	struct vm_area_struct *vma = vmf->vma;
> +	struct ttm_buffer_object *bo = vma->vm_private_data;
> +	struct drm_device *ddev = bo->base.dev;
> +	vm_fault_t ret = VM_FAULT_NOPAGE;
> +	unsigned long address;
> +	unsigned long pfn;
> +	struct page *page;
> +
> +	/* Allocate new dummy page to map all the VA range in this VMA to it*/
> +	page = alloc_page(GFP_KERNEL | __GFP_ZERO);
> +	if (!page)
> +		return VM_FAULT_OOM;
> +
> +	pfn = page_to_pfn(page);
> +
> +	/* Prefault the entire VMA range right away to avoid further faults */
> +	for (address = vma->vm_start; address < vma->vm_end; address += PAGE_SIZE) {
> +

> +		if (unlikely(address >= vma->vm_end))
> +			break;

That extra check can be removed as far as I can see.


> +
> +		if (vma->vm_flags & VM_MIXEDMAP)
> +			ret = vmf_insert_mixed_prot(vma, address,
> +						    __pfn_to_pfn_t(pfn, PFN_DEV),
> +						    prot);
> +		else
> +			ret = vmf_insert_pfn_prot(vma, address, pfn, prot);
> +	}
> +

> +	/* Set the page to be freed using drmm release action */
> +	if (drmm_add_action_or_reset(ddev, ttm_bo_release_dummy_page, page))
> +		return VM_FAULT_OOM;

You should probably move that before inserting the page into the VMA and 
also free the allocated page if it goes wrong.

Apart from that patch looks good to me,
Christian.

> +
> +	return ret;
> +}
> +EXPORT_SYMBOL(ttm_bo_vm_dummy_page);
> +
>   vm_fault_t ttm_bo_vm_fault(struct vm_fault *vmf)
>   {
>   	struct vm_area_struct *vma = vmf->vma;
>   	pgprot_t prot;
>   	struct ttm_buffer_object *bo = vma->vm_private_data;
> +	struct drm_device *ddev = bo->base.dev;
>   	vm_fault_t ret;
> +	int idx;
>   
>   	ret = ttm_bo_vm_reserve(bo, vmf);
>   	if (ret)
>   		return ret;
>   
>   	prot = vma->vm_page_prot;
> -	ret = ttm_bo_vm_fault_reserved(vmf, prot, TTM_BO_VM_NUM_PREFAULT, 1);
> +	if (drm_dev_enter(ddev, &idx)) {
> +		ret = ttm_bo_vm_fault_reserved(vmf, prot, TTM_BO_VM_NUM_PREFAULT, 1);
> +		drm_dev_exit(idx);
> +	} else {
> +		ret = ttm_bo_vm_dummy_page(vmf, prot);
> +	}
>   	if (ret == VM_FAULT_RETRY && !(vmf->flags & FAULT_FLAG_RETRY_NOWAIT))
>   		return ret;
>   
> diff --git a/include/drm/ttm/ttm_bo_api.h b/include/drm/ttm/ttm_bo_api.h
> index 639521880c29..254ede97f8e3 100644
> --- a/include/drm/ttm/ttm_bo_api.h
> +++ b/include/drm/ttm/ttm_bo_api.h
> @@ -620,4 +620,6 @@ int ttm_bo_vm_access(struct vm_area_struct *vma, unsigned long addr,
>   		     void *buf, int len, int write);
>   bool ttm_bo_delayed_delete(struct ttm_device *bdev, bool remove_all);
>   
> +vm_fault_t ttm_bo_vm_dummy_page(struct vm_fault *vmf, pgprot_t prot);
> +
>   #endif
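
The two review comments above (drop the redundant bounds check, and register the release action before touching the VMA) can be sketched in a userspace model. This is only an illustration under stated assumptions: every mock_* name, the counters and the fail flag are hypothetical, calloc/free stand in for alloc_page/__free_page, and the "or reset" helper is modelled as running the action itself on failure, which is how I understand drmm_add_action_or_reset to behave.

```c
#include <assert.h>
#include <stdlib.h>

#define PAGE_SIZE 4096UL

/* Hypothetical stand-in for the VMA; not the real struct vm_area_struct. */
struct mock_vma {
	unsigned long vm_start;
	unsigned long vm_end;
};

static int pages_freed;        /* how often the release action ran */
static int inserted_ptes;      /* simulated page table insertions */
static void *registered_page;  /* page now owned by the mock managed list */

static void release_dummy_page(void *page)
{
	free(page);
	pages_freed++;
}

/* Mock "add action or reset": on failure the action is run right away. */
static int mock_add_action_or_reset(void (*action)(void *), void *data, int fail)
{
	if (fail) {
		action(data);
		return -1;
	}
	registered_page = data;
	return 0;
}

/* Returns 0 on success, -1 on a simulated out-of-memory condition. */
static int mock_dummy_page_fault(struct mock_vma *vma, int fail_register)
{
	unsigned long address;
	void *page = calloc(1, PAGE_SIZE); /* zeroed, like __GFP_ZERO */

	assert(vma->vm_start <= vma->vm_end);
	if (!page)
		return -1;

	/* Register the release action first, before any mapping exists. */
	if (mock_add_action_or_reset(release_dummy_page, page, fail_register))
		return -1; /* the reset path has already freed the page */

	/* The loop condition alone bounds the range; no inner check needed. */
	for (address = vma->vm_start; address < vma->vm_end; address += PAGE_SIZE)
		inserted_ptes++;

	return 0;
}
```

With this ordering a failed registration returns before any PTE points at the page, so nothing is left mapped to memory that was just freed. Calling mock_dummy_page_fault() on a four-page range with fail_register set to 0 performs four insertions and frees nothing; with fail_register set to 1 it performs no insertions and the page is released exactly once.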


^ permalink raw reply	[flat|nested] 126+ messages in thread

* Re: [PATCH v6 04/16] drm/amdkfd: Split kfd suspend from devie exit
  2021-05-10 16:36   ` Andrey Grodzovsky
  (?)
@ 2021-05-11  6:40     ` Christian König
  -1 siblings, 0 replies; 126+ messages in thread
From: Christian König @ 2021-05-11  6:40 UTC (permalink / raw)
  To: Andrey Grodzovsky, dri-devel, amd-gfx, linux-pci, daniel.vetter,
	Harry.Wentland
  Cc: ppaalanen, Alexander.Deucher, gregkh, helgaas, Felix.Kuehling

Am 10.05.21 um 18:36 schrieb Andrey Grodzovsky:
> Helps to expedite HW related stuff to amdgpu_pci_remove
>
> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 2 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h | 2 +-
>   drivers/gpu/drm/amd/amdkfd/kfd_device.c    | 3 ++-
>   3 files changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> index 5f6696a3c778..2b06dee9a0ce 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> @@ -170,7 +170,7 @@ void amdgpu_amdkfd_device_init(struct amdgpu_device *adev)
>   	}
>   }
>   
> -void amdgpu_amdkfd_device_fini(struct amdgpu_device *adev)
> +void amdgpu_amdkfd_device_fini_sw(struct amdgpu_device *adev)
>   {
>   	if (adev->kfd.dev) {
>   		kgd2kfd_device_exit(adev->kfd.dev);
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
> index 14f68c028126..f8e10af99c28 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
> @@ -127,7 +127,7 @@ void amdgpu_amdkfd_interrupt(struct amdgpu_device *adev,
>   			const void *ih_ring_entry);
>   void amdgpu_amdkfd_device_probe(struct amdgpu_device *adev);
>   void amdgpu_amdkfd_device_init(struct amdgpu_device *adev);
> -void amdgpu_amdkfd_device_fini(struct amdgpu_device *adev);
> +void amdgpu_amdkfd_device_fini_sw(struct amdgpu_device *adev);
>   int amdgpu_amdkfd_submit_ib(struct kgd_dev *kgd, enum kgd_engine_type engine,
>   				uint32_t vmid, uint64_t gpu_addr,
>   				uint32_t *ib_cmd, uint32_t ib_len);
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> index 357b9bf62a1c..ab6d2a43c9a3 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> @@ -858,10 +858,11 @@ bool kgd2kfd_device_init(struct kfd_dev *kfd,
>   	return kfd->init_complete;
>   }
>   
> +
> +

Looks like unnecessary white space change to me.

>   void kgd2kfd_device_exit(struct kfd_dev *kfd)
>   {
>   	if (kfd->init_complete) {
> -		kgd2kfd_suspend(kfd, false);

Where is the call to this function now?

Christian.

>   		device_queue_manager_uninit(kfd->dqm);
>   		kfd_interrupt_exit(kfd);
>   		kfd_topology_remove_device(kfd);


^ permalink raw reply	[flat|nested] 126+ messages in thread

* Re: [PATCH v6 04/16] drm/amdkfd: Split kfd suspend from devie exit
@ 2021-05-11  6:40     ` Christian König
  0 siblings, 0 replies; 126+ messages in thread
From: Christian König @ 2021-05-11  6:40 UTC (permalink / raw)
  To: Andrey Grodzovsky, dri-devel, amd-gfx, linux-pci, daniel.vetter,
	Harry.Wentland
  Cc: Alexander.Deucher, gregkh, helgaas, Felix.Kuehling

Am 10.05.21 um 18:36 schrieb Andrey Grodzovsky:
> Helps to expdite HW related stuff to amdgpu_pci_remove
>
> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 2 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h | 2 +-
>   drivers/gpu/drm/amd/amdkfd/kfd_device.c    | 3 ++-
>   3 files changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> index 5f6696a3c778..2b06dee9a0ce 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> @@ -170,7 +170,7 @@ void amdgpu_amdkfd_device_init(struct amdgpu_device *adev)
>   	}
>   }
>   
> -void amdgpu_amdkfd_device_fini(struct amdgpu_device *adev)
> +void amdgpu_amdkfd_device_fini_sw(struct amdgpu_device *adev)
>   {
>   	if (adev->kfd.dev) {
>   		kgd2kfd_device_exit(adev->kfd.dev);
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
> index 14f68c028126..f8e10af99c28 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
> @@ -127,7 +127,7 @@ void amdgpu_amdkfd_interrupt(struct amdgpu_device *adev,
>   			const void *ih_ring_entry);
>   void amdgpu_amdkfd_device_probe(struct amdgpu_device *adev);
>   void amdgpu_amdkfd_device_init(struct amdgpu_device *adev);
> -void amdgpu_amdkfd_device_fini(struct amdgpu_device *adev);
> +void amdgpu_amdkfd_device_fini_sw(struct amdgpu_device *adev);
>   int amdgpu_amdkfd_submit_ib(struct kgd_dev *kgd, enum kgd_engine_type engine,
>   				uint32_t vmid, uint64_t gpu_addr,
>   				uint32_t *ib_cmd, uint32_t ib_len);
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> index 357b9bf62a1c..ab6d2a43c9a3 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> @@ -858,10 +858,11 @@ bool kgd2kfd_device_init(struct kfd_dev *kfd,
>   	return kfd->init_complete;
>   }
>   
> +
> +

Looks like unnecessary white space change to me.

>   void kgd2kfd_device_exit(struct kfd_dev *kfd)
>   {
>   	if (kfd->init_complete) {
> -		kgd2kfd_suspend(kfd, false);

Where is the call to this function now?

Christian.

>   		device_queue_manager_uninit(kfd->dqm);
>   		kfd_interrupt_exit(kfd);
>   		kfd_topology_remove_device(kfd);


^ permalink raw reply	[flat|nested] 126+ messages in thread

* Re: [PATCH v6 04/16] drm/amdkfd: Split kfd suspend from devie exit
@ 2021-05-11  6:40     ` Christian König
  0 siblings, 0 replies; 126+ messages in thread
From: Christian König @ 2021-05-11  6:40 UTC (permalink / raw)
  To: Andrey Grodzovsky, dri-devel, amd-gfx, linux-pci, daniel.vetter,
	Harry.Wentland
  Cc: Alexander.Deucher, gregkh, ppaalanen, helgaas, Felix.Kuehling

Am 10.05.21 um 18:36 schrieb Andrey Grodzovsky:
> Helps to expdite HW related stuff to amdgpu_pci_remove
>
> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 2 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h | 2 +-
>   drivers/gpu/drm/amd/amdkfd/kfd_device.c    | 3 ++-
>   3 files changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> index 5f6696a3c778..2b06dee9a0ce 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> @@ -170,7 +170,7 @@ void amdgpu_amdkfd_device_init(struct amdgpu_device *adev)
>   	}
>   }
>   
> -void amdgpu_amdkfd_device_fini(struct amdgpu_device *adev)
> +void amdgpu_amdkfd_device_fini_sw(struct amdgpu_device *adev)
>   {
>   	if (adev->kfd.dev) {
>   		kgd2kfd_device_exit(adev->kfd.dev);
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
> index 14f68c028126..f8e10af99c28 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
> @@ -127,7 +127,7 @@ void amdgpu_amdkfd_interrupt(struct amdgpu_device *adev,
>   			const void *ih_ring_entry);
>   void amdgpu_amdkfd_device_probe(struct amdgpu_device *adev);
>   void amdgpu_amdkfd_device_init(struct amdgpu_device *adev);
> -void amdgpu_amdkfd_device_fini(struct amdgpu_device *adev);
> +void amdgpu_amdkfd_device_fini_sw(struct amdgpu_device *adev);
>   int amdgpu_amdkfd_submit_ib(struct kgd_dev *kgd, enum kgd_engine_type engine,
>   				uint32_t vmid, uint64_t gpu_addr,
>   				uint32_t *ib_cmd, uint32_t ib_len);
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> index 357b9bf62a1c..ab6d2a43c9a3 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> @@ -858,10 +858,11 @@ bool kgd2kfd_device_init(struct kfd_dev *kfd,
>   	return kfd->init_complete;
>   }
>   
> +
> +

Looks like an unnecessary whitespace change to me.

>   void kgd2kfd_device_exit(struct kfd_dev *kfd)
>   {
>   	if (kfd->init_complete) {
> -		kgd2kfd_suspend(kfd, false);

Where is the call to this function now?

Christian.

>   		device_queue_manager_uninit(kfd->dqm);
>   		kfd_interrupt_exit(kfd);
>   		kfd_topology_remove_device(kfd);

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 126+ messages in thread

* Re: [PATCH v6 05/16] drm/amdgpu: Add early fini callback
  2021-05-10 16:36   ` Andrey Grodzovsky
  (?)
@ 2021-05-11  6:41     ` Christian König
  -1 siblings, 0 replies; 126+ messages in thread
From: Christian König @ 2021-05-11  6:41 UTC (permalink / raw)
  To: Andrey Grodzovsky, dri-devel, amd-gfx, linux-pci, daniel.vetter,
	Harry.Wentland
  Cc: ppaalanen, Alexander.Deucher, gregkh, helgaas, Felix.Kuehling

On 10.05.21 at 18:36, Andrey Grodzovsky wrote:
> Use it to call display code dependent on device->drv_data
> before it's set to NULL on device unplug
>
> v5: Move HW finalization into this callback to prevent MMIO accesses
>      post PCI remove.
>
> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>

Acked-by: Christian König <christian.koenig@amd.com>
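
For readers following along, the optional-callback walk that the quoted amdgpu_device_ip_fini_early() performs can be sketched in isolation. This is only an illustration under stated assumptions: struct ip_funcs, struct device_ctx and the two example callbacks are hypothetical stand-ins, not the real amdgpu types, and unlike the patch this version returns the number of failed callbacks so the behaviour is observable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, simplified stand-in for struct amd_ip_funcs. */
struct ip_funcs {
	const char *name;
	int (*early_fini)(void *handle);	/* optional, may be NULL */
};

/* Hypothetical, simplified stand-in for the device and its IP block list. */
struct device_ctx {
	const struct ip_funcs *blocks[8];
	int num_blocks;
	int early_fini_calls;	/* bookkeeping for this example only */
};

/* Example callback that succeeds. */
static int display_early_fini(void *handle)
{
	struct device_ctx *dev = handle;

	dev->early_fini_calls++;
	return 0;
}

/* Example callback that reports an error. */
static int broken_early_fini(void *handle)
{
	struct device_ctx *dev = handle;

	dev->early_fini_calls++;
	return -5;
}

/*
 * Walk all blocks in order, skip those without an early_fini callback,
 * log failures and keep going. Returns the number of failed callbacks
 * (an assumption of this sketch; the quoted patch returns 0).
 */
static int device_ip_fini_early(struct device_ctx *dev)
{
	int i, r, failed = 0;

	assert(dev != NULL);

	for (i = 0; i < dev->num_blocks; i++) {
		if (!dev->blocks[i]->early_fini)
			continue;

		r = dev->blocks[i]->early_fini(dev);
		if (r) {
			fprintf(stderr, "early_fini of IP block <%s> failed %d\n",
				dev->blocks[i]->name, r);
			failed++;
		}
	}

	return failed;
}
```

A failing block does not stop the walk, which matches the quoted loop: on unplug, the remaining blocks still get their chance to tear down state.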

> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c    | 59 +++++++++++++------
>   .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 12 +++-
>   drivers/gpu/drm/amd/include/amd_shared.h      |  2 +
>   3 files changed, 52 insertions(+), 21 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 3760ce7d8ff8..18598eda18f6 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -2558,34 +2558,26 @@ static int amdgpu_device_ip_late_init(struct amdgpu_device *adev)
>   	return 0;
>   }
>   
> -/**
> - * amdgpu_device_ip_fini - run fini for hardware IPs
> - *
> - * @adev: amdgpu_device pointer
> - *
> - * Main teardown pass for hardware IPs.  The list of all the hardware
> - * IPs that make up the asic is walked and the hw_fini and sw_fini callbacks
> - * are run.  hw_fini tears down the hardware associated with each IP
> - * and sw_fini tears down any software state associated with each IP.
> - * Returns 0 on success, negative error code on failure.
> - */
> -static int amdgpu_device_ip_fini(struct amdgpu_device *adev)
> +static int amdgpu_device_ip_fini_early(struct amdgpu_device *adev)
>   {
>   	int i, r;
>   
> -	if (amdgpu_sriov_vf(adev) && adev->virt.ras_init_done)
> -		amdgpu_virt_release_ras_err_handler_data(adev);
> +	for (i = 0; i < adev->num_ip_blocks; i++) {
> +		if (!adev->ip_blocks[i].version->funcs->early_fini)
> +			continue;
>   
> -	amdgpu_ras_pre_fini(adev);
> +		r = adev->ip_blocks[i].version->funcs->early_fini((void *)adev);
> +		if (r) {
> +			DRM_DEBUG("early_fini of IP block <%s> failed %d\n",
> +				  adev->ip_blocks[i].version->funcs->name, r);
> +		}
> +	}
>   
> -	if (adev->gmc.xgmi.num_physical_nodes > 1)
> -		amdgpu_xgmi_remove_device(adev);
> +	amdgpu_amdkfd_suspend(adev, false);
>   
>   	amdgpu_device_set_pg_state(adev, AMD_PG_STATE_UNGATE);
>   	amdgpu_device_set_cg_state(adev, AMD_CG_STATE_UNGATE);
>   
> -	amdgpu_amdkfd_device_fini(adev);
> -
>   	/* need to disable SMC first */
>   	for (i = 0; i < adev->num_ip_blocks; i++) {
>   		if (!adev->ip_blocks[i].status.hw)
> @@ -2616,6 +2608,33 @@ static int amdgpu_device_ip_fini(struct amdgpu_device *adev)
>   		adev->ip_blocks[i].status.hw = false;
>   	}
>   
> +	return 0;
> +}
> +
> +/**
> + * amdgpu_device_ip_fini - run fini for hardware IPs
> + *
> + * @adev: amdgpu_device pointer
> + *
> + * Main teardown pass for hardware IPs.  The list of all the hardware
> + * IPs that make up the asic is walked and the hw_fini and sw_fini callbacks
> + * are run.  hw_fini tears down the hardware associated with each IP
> + * and sw_fini tears down any software state associated with each IP.
> + * Returns 0 on success, negative error code on failure.
> + */
> +static int amdgpu_device_ip_fini(struct amdgpu_device *adev)
> +{
> +	int i, r;
> +
> +	if (amdgpu_sriov_vf(adev) && adev->virt.ras_init_done)
> +		amdgpu_virt_release_ras_err_handler_data(adev);
> +
> +	amdgpu_ras_pre_fini(adev);
> +
> +	if (adev->gmc.xgmi.num_physical_nodes > 1)
> +		amdgpu_xgmi_remove_device(adev);
> +
> +	amdgpu_amdkfd_device_fini_sw(adev);
>   
>   	for (i = adev->num_ip_blocks - 1; i >= 0; i--) {
>   		if (!adev->ip_blocks[i].status.sw)
> @@ -3683,6 +3702,8 @@ void amdgpu_device_fini_hw(struct amdgpu_device *adev)
>   	amdgpu_fbdev_fini(adev);
>   
>   	amdgpu_irq_fini_hw(adev);
> +
> +	amdgpu_device_ip_fini_early(adev);
>   }
>   
>   void amdgpu_device_fini_sw(struct amdgpu_device *adev)
> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> index 296704ce3768..6c2c6a51ce6c 100644
> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> @@ -1251,6 +1251,15 @@ static int amdgpu_dm_init(struct amdgpu_device *adev)
>   	return -EINVAL;
>   }
>   
> +static int amdgpu_dm_early_fini(void *handle)
> +{
> +	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
> +
> +	amdgpu_dm_audio_fini(adev);
> +
> +	return 0;
> +}
> +
>   static void amdgpu_dm_fini(struct amdgpu_device *adev)
>   {
>   	int i;
> @@ -1259,8 +1268,6 @@ static void amdgpu_dm_fini(struct amdgpu_device *adev)
>   		drm_encoder_cleanup(&adev->dm.mst_encoders[i].base);
>   	}
>   
> -	amdgpu_dm_audio_fini(adev);
> -
>   	amdgpu_dm_destroy_drm_device(&adev->dm);
>   
>   #if defined(CONFIG_DRM_AMD_SECURE_DISPLAY)
> @@ -2298,6 +2305,7 @@ static const struct amd_ip_funcs amdgpu_dm_funcs = {
>   	.late_init = dm_late_init,
>   	.sw_init = dm_sw_init,
>   	.sw_fini = dm_sw_fini,
> +	.early_fini = amdgpu_dm_early_fini,
>   	.hw_init = dm_hw_init,
>   	.hw_fini = dm_hw_fini,
>   	.suspend = dm_suspend,
> diff --git a/drivers/gpu/drm/amd/include/amd_shared.h b/drivers/gpu/drm/amd/include/amd_shared.h
> index 43ed6291b2b8..1ad56da486e4 100644
> --- a/drivers/gpu/drm/amd/include/amd_shared.h
> +++ b/drivers/gpu/drm/amd/include/amd_shared.h
> @@ -240,6 +240,7 @@ enum amd_dpm_forced_level;
>    * @late_init: sets up late driver/hw state (post hw_init) - Optional
>    * @sw_init: sets up driver state, does not configure hw
>    * @sw_fini: tears down driver state, does not configure hw
> + * @early_fini: tears down stuff before dev detached from driver
>    * @hw_init: sets up the hw state
>    * @hw_fini: tears down the hw state
>    * @late_fini: final cleanup
> @@ -268,6 +269,7 @@ struct amd_ip_funcs {
>   	int (*late_init)(void *handle);
>   	int (*sw_init)(void *handle);
>   	int (*sw_fini)(void *handle);
> +	int (*early_fini)(void *handle);
>   	int (*hw_init)(void *handle);
>   	int (*hw_fini)(void *handle);
>   	void (*late_fini)(void *handle);


^ permalink raw reply	[flat|nested] 126+ messages in thread

* Re: [PATCH v6 06/16] drm/amdgpu: Handle IOMMU enabled case.
  2021-05-10 16:36   ` Andrey Grodzovsky
  (?)
@ 2021-05-11  6:44     ` Christian König
  -1 siblings, 0 replies; 126+ messages in thread
From: Christian König @ 2021-05-11  6:44 UTC (permalink / raw)
  To: Andrey Grodzovsky, dri-devel, amd-gfx, linux-pci, daniel.vetter,
	Harry.Wentland
  Cc: ppaalanen, Alexander.Deucher, gregkh, helgaas, Felix.Kuehling

On 10.05.21 at 18:36, Andrey Grodzovsky wrote:
> Handle all DMA IOMMU group related dependencies before the
> group is removed.
>
> v5: Drop IOMMU notifier and switch to lockless call to ttm_tt_unpopulate
> v6: Drop the BO unmap list
>
> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 4 ++--
>   drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c   | 3 +--
>   drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h   | 1 +
>   drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c    | 9 +++++++++
>   drivers/gpu/drm/amd/amdgpu/cik_ih.c        | 1 -
>   drivers/gpu/drm/amd/amdgpu/cz_ih.c         | 1 -
>   drivers/gpu/drm/amd/amdgpu/iceland_ih.c    | 1 -
>   drivers/gpu/drm/amd/amdgpu/navi10_ih.c     | 3 ---
>   drivers/gpu/drm/amd/amdgpu/si_ih.c         | 1 -
>   drivers/gpu/drm/amd/amdgpu/tonga_ih.c      | 1 -
>   drivers/gpu/drm/amd/amdgpu/vega10_ih.c     | 3 ---
>   11 files changed, 13 insertions(+), 15 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 18598eda18f6..a0bff4713672 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -3256,7 +3256,6 @@ static const struct attribute *amdgpu_dev_attributes[] = {
>   	NULL
>   };
>   
> -
>   /**
>    * amdgpu_device_init - initialize the driver
>    *
> @@ -3698,12 +3697,13 @@ void amdgpu_device_fini_hw(struct amdgpu_device *adev)
>   		amdgpu_ucode_sysfs_fini(adev);
>   	sysfs_remove_files(&adev->dev->kobj, amdgpu_dev_attributes);
>   
> -
>   	amdgpu_fbdev_fini(adev);
>   
>   	amdgpu_irq_fini_hw(adev);
>   
>   	amdgpu_device_ip_fini_early(adev);
> +
> +	amdgpu_gart_dummy_page_fini(adev);

I think you should probably just call amdgpu_gart_fini() here.

>   }
>   
>   void amdgpu_device_fini_sw(struct amdgpu_device *adev)
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
> index c5a9a4fb10d2..354e68081b53 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
> @@ -92,7 +92,7 @@ static int amdgpu_gart_dummy_page_init(struct amdgpu_device *adev)
>    *
>    * Frees the dummy page used by the driver (all asics).
>    */
> -static void amdgpu_gart_dummy_page_fini(struct amdgpu_device *adev)
> +void amdgpu_gart_dummy_page_fini(struct amdgpu_device *adev)
>   {
>   	if (!adev->dummy_page_addr)
>   		return;
> @@ -375,5 +375,4 @@ int amdgpu_gart_init(struct amdgpu_device *adev)
>    */
>   void amdgpu_gart_fini(struct amdgpu_device *adev)
>   {
> -	amdgpu_gart_dummy_page_fini(adev);
>   }

Well, either remove amdgpu_gart_fini() entirely or just call
amdgpu_gart_fini() instead of amdgpu_gart_dummy_page_fini().

> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h
> index a25fe97b0196..78dc7a23da56 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h
> @@ -58,6 +58,7 @@ int amdgpu_gart_table_vram_pin(struct amdgpu_device *adev);
>   void amdgpu_gart_table_vram_unpin(struct amdgpu_device *adev);
>   int amdgpu_gart_init(struct amdgpu_device *adev);
>   void amdgpu_gart_fini(struct amdgpu_device *adev);
> +void amdgpu_gart_dummy_page_fini(struct amdgpu_device *adev);
>   int amdgpu_gart_unbind(struct amdgpu_device *adev, uint64_t offset,
>   		       int pages);
>   int amdgpu_gart_map(struct amdgpu_device *adev, uint64_t offset,
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
> index 233b64dab94b..a14973a7a9c9 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
> @@ -361,6 +361,15 @@ void amdgpu_irq_fini_hw(struct amdgpu_device *adev)
>   		if (!amdgpu_device_has_dc_support(adev))
>   			flush_work(&adev->hotplug_work);
>   	}
> +
> +	if (adev->irq.ih_soft.ring)
> +		amdgpu_ih_ring_fini(adev, &adev->irq.ih_soft);
> +	if (adev->irq.ih.ring)
> +		amdgpu_ih_ring_fini(adev, &adev->irq.ih);
> +	if (adev->irq.ih1.ring)
> +		amdgpu_ih_ring_fini(adev, &adev->irq.ih1);
> +	if (adev->irq.ih2.ring)
> +		amdgpu_ih_ring_fini(adev, &adev->irq.ih2);

You should probably make the function NULL-safe instead of checking here.

Christian.
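
To illustrate the suggestion, a NULL-safe teardown moves the check into the fini function so callers can invoke it unconditionally. The following is a minimal sketch, not the amdgpu implementation: struct ih_ring, ih_ring_init() and ih_ring_fini() are hypothetical stand-ins that use plain heap memory instead of DMA allocations.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for struct amdgpu_ih_ring. */
struct ih_ring {
	unsigned int *ring;	/* NULL when never initialised or already freed */
	unsigned int ring_size;
};

/* Allocate the ring buffer; returns 0 on success, -1 on allocation failure. */
static int ih_ring_init(struct ih_ring *ih, unsigned int ring_size)
{
	assert(ih != NULL);

	ih->ring = calloc(ring_size, sizeof(*ih->ring));
	if (!ih->ring)
		return -1;

	ih->ring_size = ring_size;
	return 0;
}

/*
 * NULL-safe teardown: a ring that was never set up, or that was already
 * torn down, is silently ignored, so callers need no check of their own.
 */
static void ih_ring_fini(struct ih_ring *ih)
{
	if (!ih || !ih->ring)
		return;

	free(ih->ring);
	ih->ring = NULL;	/* makes a repeated call harmless */
	ih->ring_size = 0;
}
```

With the check inside the function, the four conditional calls in the hunk above could become four unconditional ones, and a ring that a given ASIC never uses costs nothing at teardown.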

>   }
>   
>   /**
> diff --git a/drivers/gpu/drm/amd/amdgpu/cik_ih.c b/drivers/gpu/drm/amd/amdgpu/cik_ih.c
> index 183d44a6583c..df385ffc9768 100644
> --- a/drivers/gpu/drm/amd/amdgpu/cik_ih.c
> +++ b/drivers/gpu/drm/amd/amdgpu/cik_ih.c
> @@ -310,7 +310,6 @@ static int cik_ih_sw_fini(void *handle)
>   	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
>   
>   	amdgpu_irq_fini_sw(adev);
> -	amdgpu_ih_ring_fini(adev, &adev->irq.ih);
>   	amdgpu_irq_remove_domain(adev);
>   
>   	return 0;
> diff --git a/drivers/gpu/drm/amd/amdgpu/cz_ih.c b/drivers/gpu/drm/amd/amdgpu/cz_ih.c
> index d32743949003..b8c47e0cf37a 100644
> --- a/drivers/gpu/drm/amd/amdgpu/cz_ih.c
> +++ b/drivers/gpu/drm/amd/amdgpu/cz_ih.c
> @@ -302,7 +302,6 @@ static int cz_ih_sw_fini(void *handle)
>   	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
>   
>   	amdgpu_irq_fini_sw(adev);
> -	amdgpu_ih_ring_fini(adev, &adev->irq.ih);
>   	amdgpu_irq_remove_domain(adev);
>   
>   	return 0;
> diff --git a/drivers/gpu/drm/amd/amdgpu/iceland_ih.c b/drivers/gpu/drm/amd/amdgpu/iceland_ih.c
> index da96c6013477..ddfe4eaeea05 100644
> --- a/drivers/gpu/drm/amd/amdgpu/iceland_ih.c
> +++ b/drivers/gpu/drm/amd/amdgpu/iceland_ih.c
> @@ -301,7 +301,6 @@ static int iceland_ih_sw_fini(void *handle)
>   	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
>   
>   	amdgpu_irq_fini_sw(adev);
> -	amdgpu_ih_ring_fini(adev, &adev->irq.ih);
>   	amdgpu_irq_remove_domain(adev);
>   
>   	return 0;
> diff --git a/drivers/gpu/drm/amd/amdgpu/navi10_ih.c b/drivers/gpu/drm/amd/amdgpu/navi10_ih.c
> index 5eea4550b856..e171a9e78544 100644
> --- a/drivers/gpu/drm/amd/amdgpu/navi10_ih.c
> +++ b/drivers/gpu/drm/amd/amdgpu/navi10_ih.c
> @@ -571,9 +571,6 @@ static int navi10_ih_sw_fini(void *handle)
>   
>   	amdgpu_irq_fini_sw(adev);
>   	amdgpu_ih_ring_fini(adev, &adev->irq.ih_soft);
> -	amdgpu_ih_ring_fini(adev, &adev->irq.ih2);
> -	amdgpu_ih_ring_fini(adev, &adev->irq.ih1);
> -	amdgpu_ih_ring_fini(adev, &adev->irq.ih);
>   
>   	return 0;
>   }
> diff --git a/drivers/gpu/drm/amd/amdgpu/si_ih.c b/drivers/gpu/drm/amd/amdgpu/si_ih.c
> index 751307f3252c..9a24f17a5750 100644
> --- a/drivers/gpu/drm/amd/amdgpu/si_ih.c
> +++ b/drivers/gpu/drm/amd/amdgpu/si_ih.c
> @@ -176,7 +176,6 @@ static int si_ih_sw_fini(void *handle)
>   	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
>   
>   	amdgpu_irq_fini_sw(adev);
> -	amdgpu_ih_ring_fini(adev, &adev->irq.ih);
>   
>   	return 0;
>   }
> diff --git a/drivers/gpu/drm/amd/amdgpu/tonga_ih.c b/drivers/gpu/drm/amd/amdgpu/tonga_ih.c
> index 973d80ec7f6c..b08905d1c00f 100644
> --- a/drivers/gpu/drm/amd/amdgpu/tonga_ih.c
> +++ b/drivers/gpu/drm/amd/amdgpu/tonga_ih.c
> @@ -313,7 +313,6 @@ static int tonga_ih_sw_fini(void *handle)
>   	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
>   
>   	amdgpu_irq_fini_sw(adev);
> -	amdgpu_ih_ring_fini(adev, &adev->irq.ih);
>   	amdgpu_irq_remove_domain(adev);
>   
>   	return 0;
> diff --git a/drivers/gpu/drm/amd/amdgpu/vega10_ih.c b/drivers/gpu/drm/amd/amdgpu/vega10_ih.c
> index dead9c2fbd4c..d78b8abe993a 100644
> --- a/drivers/gpu/drm/amd/amdgpu/vega10_ih.c
> +++ b/drivers/gpu/drm/amd/amdgpu/vega10_ih.c
> @@ -515,9 +515,6 @@ static int vega10_ih_sw_fini(void *handle)
>   
>   	amdgpu_irq_fini_sw(adev);
>   	amdgpu_ih_ring_fini(adev, &adev->irq.ih_soft);
> -	amdgpu_ih_ring_fini(adev, &adev->irq.ih2);
> -	amdgpu_ih_ring_fini(adev, &adev->irq.ih1);
> -	amdgpu_ih_ring_fini(adev, &adev->irq.ih);
>   
>   	return 0;
>   }


^ permalink raw reply	[flat|nested] 126+ messages in thread

* Re: [PATCH v6 06/16] drm/amdgpu: Handle IOMMU enabled case.
@ 2021-05-11  6:44     ` Christian König
  0 siblings, 0 replies; 126+ messages in thread
From: Christian König @ 2021-05-11  6:44 UTC (permalink / raw)
  To: Andrey Grodzovsky, dri-devel, amd-gfx, linux-pci, daniel.vetter,
	Harry.Wentland
  Cc: Alexander.Deucher, gregkh, helgaas, Felix.Kuehling

On 10.05.21 at 18:36, Andrey Grodzovsky wrote:
> Handle all DMA IOMMU group related dependencies before the
> group is removed.
>
> v5: Drop IOMMU notifier and switch to lockless call to ttm_tt_unpopulate
> v6: Drop the BO unmap list
>
> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 4 ++--
>   drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c   | 3 +--
>   drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h   | 1 +
>   drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c    | 9 +++++++++
>   drivers/gpu/drm/amd/amdgpu/cik_ih.c        | 1 -
>   drivers/gpu/drm/amd/amdgpu/cz_ih.c         | 1 -
>   drivers/gpu/drm/amd/amdgpu/iceland_ih.c    | 1 -
>   drivers/gpu/drm/amd/amdgpu/navi10_ih.c     | 3 ---
>   drivers/gpu/drm/amd/amdgpu/si_ih.c         | 1 -
>   drivers/gpu/drm/amd/amdgpu/tonga_ih.c      | 1 -
>   drivers/gpu/drm/amd/amdgpu/vega10_ih.c     | 3 ---
>   11 files changed, 13 insertions(+), 15 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 18598eda18f6..a0bff4713672 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -3256,7 +3256,6 @@ static const struct attribute *amdgpu_dev_attributes[] = {
>   	NULL
>   };
>   
> -
>   /**
>    * amdgpu_device_init - initialize the driver
>    *
> @@ -3698,12 +3697,13 @@ void amdgpu_device_fini_hw(struct amdgpu_device *adev)
>   		amdgpu_ucode_sysfs_fini(adev);
>   	sysfs_remove_files(&adev->dev->kobj, amdgpu_dev_attributes);
>   
> -
>   	amdgpu_fbdev_fini(adev);
>   
>   	amdgpu_irq_fini_hw(adev);
>   
>   	amdgpu_device_ip_fini_early(adev);
> +
> +	amdgpu_gart_dummy_page_fini(adev);

I think you should probably just call amdgpu_gart_fini() here.

>   }
>   
>   void amdgpu_device_fini_sw(struct amdgpu_device *adev)
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
> index c5a9a4fb10d2..354e68081b53 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
> @@ -92,7 +92,7 @@ static int amdgpu_gart_dummy_page_init(struct amdgpu_device *adev)
>    *
>    * Frees the dummy page used by the driver (all asics).
>    */
> -static void amdgpu_gart_dummy_page_fini(struct amdgpu_device *adev)
> +void amdgpu_gart_dummy_page_fini(struct amdgpu_device *adev)
>   {
>   	if (!adev->dummy_page_addr)
>   		return;
> @@ -375,5 +375,4 @@ int amdgpu_gart_init(struct amdgpu_device *adev)
>    */
>   void amdgpu_gart_fini(struct amdgpu_device *adev)
>   {
> -	amdgpu_gart_dummy_page_fini(adev);
>   }

Well, either remove amdgpu_gart_fini() entirely, since it is now empty, or
just call amdgpu_gart_fini() here instead of amdgpu_gart_dummy_page_fini().
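
For illustration, the second option could look roughly like the userspace
sketch below: the dummy-page helper stays file-local and the hardware
teardown path goes through the public wrapper. All names here (fake_dev,
gart_init, gart_fini, device_fini_hw) are hypothetical stand-ins, not the
driver's code, and malloc()/free() stand in for the real page allocation
and DMA unmapping.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the device state touched by the GART code. */
struct fake_dev {
	void *dummy_page;	/* NULL when nothing is allocated */
};

/* Hypothetical init counterpart; returns 0 on success. */
int gart_init(struct fake_dev *dev)
{
	dev->dummy_page = malloc(4096);
	return dev->dummy_page ? 0 : -1;
}

/* Helper stays file-local and returns early when nothing is allocated. */
static void gart_dummy_page_fini(struct fake_dev *dev)
{
	if (!dev->dummy_page)
		return;
	free(dev->dummy_page);
	dev->dummy_page = NULL;
}

/* Public teardown entry point: the only symbol other files need. */
void gart_fini(struct fake_dev *dev)
{
	assert(dev != NULL);
	gart_dummy_page_fini(dev);
}

/* Models the hardware teardown path calling the public wrapper. */
void device_fini_hw(struct fake_dev *dev)
{
	gart_fini(dev);
}
```

With this shape nothing new has to be exported from the header, and calling
the wrapper a second time is harmless because the helper returns early once
the page is gone.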

> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h
> index a25fe97b0196..78dc7a23da56 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h
> @@ -58,6 +58,7 @@ int amdgpu_gart_table_vram_pin(struct amdgpu_device *adev);
>   void amdgpu_gart_table_vram_unpin(struct amdgpu_device *adev);
>   int amdgpu_gart_init(struct amdgpu_device *adev);
>   void amdgpu_gart_fini(struct amdgpu_device *adev);
> +void amdgpu_gart_dummy_page_fini(struct amdgpu_device *adev);
>   int amdgpu_gart_unbind(struct amdgpu_device *adev, uint64_t offset,
>   		       int pages);
>   int amdgpu_gart_map(struct amdgpu_device *adev, uint64_t offset,
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
> index 233b64dab94b..a14973a7a9c9 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
> @@ -361,6 +361,15 @@ void amdgpu_irq_fini_hw(struct amdgpu_device *adev)
>   		if (!amdgpu_device_has_dc_support(adev))
>   			flush_work(&adev->hotplug_work);
>   	}
> +
> +	if (adev->irq.ih_soft.ring)
> +		amdgpu_ih_ring_fini(adev, &adev->irq.ih_soft);
> +	if (adev->irq.ih.ring)
> +		amdgpu_ih_ring_fini(adev, &adev->irq.ih);
> +	if (adev->irq.ih1.ring)
> +		amdgpu_ih_ring_fini(adev, &adev->irq.ih1);
> +	if (adev->irq.ih2.ring)
> +		amdgpu_ih_ring_fini(adev, &adev->irq.ih2);

You should probably make amdgpu_ih_ring_fini() NULL safe instead of checking here.

Christian.
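
A minimal sketch of what a NULL-safe teardown could look like, modeled in
userspace. The struct and function names (ih_ring, ih_ring_init,
ih_ring_fini) are hypothetical stand-ins for struct amdgpu_ih_ring and
amdgpu_ih_ring_fini(), and calloc()/free() replace the real allocation.
The int return value exists only so the behavior is observable here.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct amdgpu_ih_ring; only the field used here. */
struct ih_ring {
	unsigned int *ring;	/* NULL when the ring was never allocated */
};

/* Hypothetical helper: allocate a ring of n dwords; returns 0 on success. */
static int ih_ring_init(struct ih_ring *ih, size_t n)
{
	ih->ring = calloc(n, sizeof(*ih->ring));
	return ih->ring ? 0 : -1;
}

/*
 * NULL-safe teardown: a ring that was never set up (or was already torn
 * down) is silently skipped. Returns 1 if memory was freed, 0 otherwise.
 */
static int ih_ring_fini(struct ih_ring *ih)
{
	assert(ih != NULL);

	if (!ih->ring)
		return 0;

	free(ih->ring);
	ih->ring = NULL;	/* makes a repeated call a no-op */
	return 1;
}
```

Callers such as the quoted amdgpu_irq_fini_hw() hunk could then invoke the
teardown unconditionally for ih_soft, ih, ih1 and ih2, without the four
if checks.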

>   }
>   
>   /**
> diff --git a/drivers/gpu/drm/amd/amdgpu/cik_ih.c b/drivers/gpu/drm/amd/amdgpu/cik_ih.c
> index 183d44a6583c..df385ffc9768 100644
> --- a/drivers/gpu/drm/amd/amdgpu/cik_ih.c
> +++ b/drivers/gpu/drm/amd/amdgpu/cik_ih.c
> @@ -310,7 +310,6 @@ static int cik_ih_sw_fini(void *handle)
>   	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
>   
>   	amdgpu_irq_fini_sw(adev);
> -	amdgpu_ih_ring_fini(adev, &adev->irq.ih);
>   	amdgpu_irq_remove_domain(adev);
>   
>   	return 0;
> diff --git a/drivers/gpu/drm/amd/amdgpu/cz_ih.c b/drivers/gpu/drm/amd/amdgpu/cz_ih.c
> index d32743949003..b8c47e0cf37a 100644
> --- a/drivers/gpu/drm/amd/amdgpu/cz_ih.c
> +++ b/drivers/gpu/drm/amd/amdgpu/cz_ih.c
> @@ -302,7 +302,6 @@ static int cz_ih_sw_fini(void *handle)
>   	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
>   
>   	amdgpu_irq_fini_sw(adev);
> -	amdgpu_ih_ring_fini(adev, &adev->irq.ih);
>   	amdgpu_irq_remove_domain(adev);
>   
>   	return 0;
> diff --git a/drivers/gpu/drm/amd/amdgpu/iceland_ih.c b/drivers/gpu/drm/amd/amdgpu/iceland_ih.c
> index da96c6013477..ddfe4eaeea05 100644
> --- a/drivers/gpu/drm/amd/amdgpu/iceland_ih.c
> +++ b/drivers/gpu/drm/amd/amdgpu/iceland_ih.c
> @@ -301,7 +301,6 @@ static int iceland_ih_sw_fini(void *handle)
>   	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
>   
>   	amdgpu_irq_fini_sw(adev);
> -	amdgpu_ih_ring_fini(adev, &adev->irq.ih);
>   	amdgpu_irq_remove_domain(adev);
>   
>   	return 0;
> diff --git a/drivers/gpu/drm/amd/amdgpu/navi10_ih.c b/drivers/gpu/drm/amd/amdgpu/navi10_ih.c
> index 5eea4550b856..e171a9e78544 100644
> --- a/drivers/gpu/drm/amd/amdgpu/navi10_ih.c
> +++ b/drivers/gpu/drm/amd/amdgpu/navi10_ih.c
> @@ -571,9 +571,6 @@ static int navi10_ih_sw_fini(void *handle)
>   
>   	amdgpu_irq_fini_sw(adev);
>   	amdgpu_ih_ring_fini(adev, &adev->irq.ih_soft);
> -	amdgpu_ih_ring_fini(adev, &adev->irq.ih2);
> -	amdgpu_ih_ring_fini(adev, &adev->irq.ih1);
> -	amdgpu_ih_ring_fini(adev, &adev->irq.ih);
>   
>   	return 0;
>   }
> diff --git a/drivers/gpu/drm/amd/amdgpu/si_ih.c b/drivers/gpu/drm/amd/amdgpu/si_ih.c
> index 751307f3252c..9a24f17a5750 100644
> --- a/drivers/gpu/drm/amd/amdgpu/si_ih.c
> +++ b/drivers/gpu/drm/amd/amdgpu/si_ih.c
> @@ -176,7 +176,6 @@ static int si_ih_sw_fini(void *handle)
>   	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
>   
>   	amdgpu_irq_fini_sw(adev);
> -	amdgpu_ih_ring_fini(adev, &adev->irq.ih);
>   
>   	return 0;
>   }
> diff --git a/drivers/gpu/drm/amd/amdgpu/tonga_ih.c b/drivers/gpu/drm/amd/amdgpu/tonga_ih.c
> index 973d80ec7f6c..b08905d1c00f 100644
> --- a/drivers/gpu/drm/amd/amdgpu/tonga_ih.c
> +++ b/drivers/gpu/drm/amd/amdgpu/tonga_ih.c
> @@ -313,7 +313,6 @@ static int tonga_ih_sw_fini(void *handle)
>   	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
>   
>   	amdgpu_irq_fini_sw(adev);
> -	amdgpu_ih_ring_fini(adev, &adev->irq.ih);
>   	amdgpu_irq_remove_domain(adev);
>   
>   	return 0;
> diff --git a/drivers/gpu/drm/amd/amdgpu/vega10_ih.c b/drivers/gpu/drm/amd/amdgpu/vega10_ih.c
> index dead9c2fbd4c..d78b8abe993a 100644
> --- a/drivers/gpu/drm/amd/amdgpu/vega10_ih.c
> +++ b/drivers/gpu/drm/amd/amdgpu/vega10_ih.c
> @@ -515,9 +515,6 @@ static int vega10_ih_sw_fini(void *handle)
>   
>   	amdgpu_irq_fini_sw(adev);
>   	amdgpu_ih_ring_fini(adev, &adev->irq.ih_soft);
> -	amdgpu_ih_ring_fini(adev, &adev->irq.ih2);
> -	amdgpu_ih_ring_fini(adev, &adev->irq.ih1);
> -	amdgpu_ih_ring_fini(adev, &adev->irq.ih);
>   
>   	return 0;
>   }



* Re: [PATCH v6 10/16] drm/amdgpu: Guard against write accesses after device removal
  2021-05-10 16:36   ` Andrey Grodzovsky
@ 2021-05-11  6:50     ` Christian König
  -1 siblings, 0 replies; 126+ messages in thread
From: Christian König @ 2021-05-11  6:50 UTC (permalink / raw)
  To: Andrey Grodzovsky, dri-devel, amd-gfx, linux-pci, daniel.vetter,
	Harry.Wentland
  Cc: ppaalanen, Alexander.Deucher, gregkh, helgaas, Felix.Kuehling

Am 10.05.21 um 18:36 schrieb Andrey Grodzovsky:
> This should prevent writing to memory or IO ranges possibly
> already allocated for other uses after our device is removed.
>
> v5:
> Protect more places where memcpy_to/from_io takes place
> Protect IB submissions
>
> v6: Switch to !drm_dev_enter instead of scoping entire code
> with brackets.
>
> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c    | 11 ++-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c       |  9 +++
>   drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c        | 17 +++--
>   drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c       | 63 +++++++++++------
>   drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h       |  2 +
>   drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c      | 70 +++++++++++++++++++
>   drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h      | 49 ++-----------
>   drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c       | 31 +++++---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c       | 11 ++-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c       | 22 ++++--
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c        |  7 +-
>   drivers/gpu/drm/amd/amdgpu/psp_v11_0.c        | 44 ++++++------
>   drivers/gpu/drm/amd/amdgpu/psp_v12_0.c        |  8 +--
>   drivers/gpu/drm/amd/amdgpu/psp_v3_1.c         |  8 +--
>   drivers/gpu/drm/amd/amdgpu/vce_v4_0.c         | 26 ++++---
>   drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c         | 22 +++---
>   .../drm/amd/pm/powerplay/smumgr/smu7_smumgr.c |  2 +
>   17 files changed, 257 insertions(+), 145 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index a0bff4713672..94c415176cdc 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -71,6 +71,8 @@
>   #include <drm/task_barrier.h>
>   #include <linux/pm_runtime.h>
>   
> +#include <drm/drm_drv.h>
> +
>   MODULE_FIRMWARE("amdgpu/vega10_gpu_info.bin");
>   MODULE_FIRMWARE("amdgpu/vega12_gpu_info.bin");
>   MODULE_FIRMWARE("amdgpu/raven_gpu_info.bin");
> @@ -281,7 +283,10 @@ void amdgpu_device_vram_access(struct amdgpu_device *adev, loff_t pos,
>   	unsigned long flags;
>   	uint32_t hi = ~0;
>   	uint64_t last;
> +	int idx;
>   
> +	 if (!drm_dev_enter(&adev->ddev, &idx))
> +		 return;
>   
>   #ifdef CONFIG_64BIT
>   	last = min(pos + size, adev->gmc.visible_vram_size);
> @@ -299,8 +304,10 @@ void amdgpu_device_vram_access(struct amdgpu_device *adev, loff_t pos,
>   			memcpy_fromio(buf, addr, count);
>   		}
>   
> -		if (count == size)
> +		if (count == size) {
> +			drm_dev_exit(idx);
>   			return;
> +		}

Maybe use a goto instead, but that is really just a nitpick.
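
As a rough illustration of the goto variant, here is a userspace sketch
with a single exit label that pairs every successful enter with exactly
one exit. dev_enter()/dev_exit() and struct fake_dev are hypothetical
models of drm_dev_enter()/drm_dev_exit(); unlike the real drm_dev_exit(),
the model takes the device so that it can count open sections.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a device; 'unplugged' models a removed device. */
struct fake_dev {
	bool unplugged;
	int active_sections;	/* enter calls not yet matched by an exit */
	unsigned char vram[64];
};

/* Model of drm_dev_enter(): fails once the device is gone. */
static bool dev_enter(struct fake_dev *dev, int *idx)
{
	if (dev->unplugged)
		return false;
	*idx = dev->active_sections++;
	return true;
}

/* Model of drm_dev_exit(): must follow every successful dev_enter(). */
static void dev_exit(struct fake_dev *dev, int idx)
{
	(void)idx;
	assert(dev->active_sections > 0);
	dev->active_sections--;
}

/* Copies up to size bytes starting at pos; returns the bytes copied. */
static size_t vram_read(struct fake_dev *dev, size_t pos, void *buf, size_t size)
{
	size_t count = 0;
	int idx;

	if (!dev_enter(dev, &idx))
		return 0;

	if (pos >= sizeof(dev->vram))
		goto out;	/* early exit still leaves the section */

	count = sizeof(dev->vram) - pos;
	if (count > size)
		count = size;
	memcpy(buf, dev->vram + pos, count);

out:
	dev_exit(dev, idx);
	return count;
}
```

The point of the single label is that adding another early exit later
cannot forget the matching exit call.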



>   
>   		pos += count;
>   		buf += count / 4;
> @@ -323,6 +330,8 @@ void amdgpu_device_vram_access(struct amdgpu_device *adev, loff_t pos,
>   			*buf++ = RREG32_NO_KIQ(mmMM_DATA);
>   	}
>   	spin_unlock_irqrestore(&adev->mmio_idx_lock, flags);
> +
> +	drm_dev_exit(idx);
>   }
>   
>   /*
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> index 4d32233cde92..04ba5eef1e88 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> @@ -31,6 +31,8 @@
>   #include "amdgpu_ras.h"
>   #include "amdgpu_xgmi.h"
>   
> +#include <drm/drm_drv.h>
> +
>   /**
>    * amdgpu_gmc_pdb0_alloc - allocate vram for pdb0
>    *
> @@ -151,6 +153,10 @@ int amdgpu_gmc_set_pte_pde(struct amdgpu_device *adev, void *cpu_pt_addr,
>   {
>   	void __iomem *ptr = (void *)cpu_pt_addr;
>   	uint64_t value;
> +	int idx;
> +
> +	if (!drm_dev_enter(&adev->ddev, &idx))
> +		return 0;
>   
>   	/*
>   	 * The following is for PTE only. GART does not have PDEs.
> @@ -158,6 +164,9 @@ int amdgpu_gmc_set_pte_pde(struct amdgpu_device *adev, void *cpu_pt_addr,
>   	value = addr & 0x0000FFFFFFFFF000ULL;
>   	value |= flags;
>   	writeq(value, ptr + (gpu_page_idx * 8));
> +
> +	drm_dev_exit(idx);
> +
>   	return 0;
>   }
>   
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
> index 148a3b481b12..62fcbd446c71 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
> @@ -30,6 +30,7 @@
>   #include <linux/slab.h>
>   
>   #include <drm/amdgpu_drm.h>
> +#include <drm/drm_drv.h>
>   
>   #include "amdgpu.h"
>   #include "atom.h"
> @@ -137,7 +138,7 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, unsigned num_ibs,
>   	bool secure;
>   
>   	unsigned i;
> -	int r = 0;
> +	int idx, r = 0;
>   	bool need_pipe_sync = false;
>   
>   	if (num_ibs == 0)
> @@ -169,13 +170,16 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, unsigned num_ibs,
>   		return -EINVAL;
>   	}
>   
> +	if (!drm_dev_enter(&adev->ddev, &idx))
> +		return -ENODEV;
> +
>   	alloc_size = ring->funcs->emit_frame_size + num_ibs *
>   		ring->funcs->emit_ib_size;
>   
>   	r = amdgpu_ring_alloc(ring, alloc_size);
>   	if (r) {
>   		dev_err(adev->dev, "scheduling IB failed (%d).\n", r);
> -		return r;
> +		goto exit;
>   	}
>   
>   	need_ctx_switch = ring->current_ctx != fence_ctx;
> @@ -205,7 +209,7 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, unsigned num_ibs,
>   		r = amdgpu_vm_flush(ring, job, need_pipe_sync);
>   		if (r) {
>   			amdgpu_ring_undo(ring);
> -			return r;
> +			goto exit;
>   		}
>   	}
>   
> @@ -286,7 +290,7 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, unsigned num_ibs,
>   		if (job && job->vmid)
>   			amdgpu_vmid_reset(adev, ring->funcs->vmhub, job->vmid);
>   		amdgpu_ring_undo(ring);
> -		return r;
> +		goto exit;
>   	}
>   
>   	if (ring->funcs->insert_end)
> @@ -304,7 +308,10 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, unsigned num_ibs,
>   		ring->funcs->emit_wave_limit(ring, false);
>   
>   	amdgpu_ring_commit(ring);
> -	return 0;
> +
> +exit:
> +	drm_dev_exit(idx);
> +	return r;
>   }
>   
>   /**
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
> index 9e769cf6095b..bb6afee61666 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
> @@ -25,6 +25,7 @@
>   
>   #include <linux/firmware.h>
>   #include <linux/dma-mapping.h>
> +#include <drm/drm_drv.h>
>   
>   #include "amdgpu.h"
>   #include "amdgpu_psp.h"
> @@ -39,6 +40,8 @@
>   #include "amdgpu_ras.h"
>   #include "amdgpu_securedisplay.h"
>   
> +#include <drm/drm_drv.h>
> +
>   static int psp_sysfs_init(struct amdgpu_device *adev);
>   static void psp_sysfs_fini(struct amdgpu_device *adev);
>   
> @@ -253,7 +256,7 @@ psp_cmd_submit_buf(struct psp_context *psp,
>   		   struct psp_gfx_cmd_resp *cmd, uint64_t fence_mc_addr)
>   {
>   	int ret;
> -	int index;
> +	int index, idx;
>   	int timeout = 20000;
>   	bool ras_intr = false;
>   	bool skip_unsupport = false;
> @@ -261,6 +264,9 @@ psp_cmd_submit_buf(struct psp_context *psp,
>   	if (psp->adev->in_pci_err_recovery)
>   		return 0;
>   
> +	if (!drm_dev_enter(&psp->adev->ddev, &idx))
> +		return 0;
> +
>   	mutex_lock(&psp->mutex);
>   
>   	memset(psp->cmd_buf_mem, 0, PSP_CMD_BUFFER_SIZE);
> @@ -271,8 +277,7 @@ psp_cmd_submit_buf(struct psp_context *psp,
>   	ret = psp_ring_cmd_submit(psp, psp->cmd_buf_mc_addr, fence_mc_addr, index);
>   	if (ret) {
>   		atomic_dec(&psp->fence_value);
> -		mutex_unlock(&psp->mutex);
> -		return ret;
> +		goto exit;
>   	}
>   
>   	amdgpu_asic_invalidate_hdp(psp->adev, NULL);
> @@ -312,8 +317,8 @@ psp_cmd_submit_buf(struct psp_context *psp,
>   			 psp->cmd_buf_mem->cmd_id,
>   			 psp->cmd_buf_mem->resp.status);
>   		if (!timeout) {
> -			mutex_unlock(&psp->mutex);
> -			return -EINVAL;
> +			ret = -EINVAL;
> +			goto exit;
>   		}
>   	}
>   
> @@ -321,8 +326,10 @@ psp_cmd_submit_buf(struct psp_context *psp,
>   		ucode->tmr_mc_addr_lo = psp->cmd_buf_mem->resp.fw_addr_lo;
>   		ucode->tmr_mc_addr_hi = psp->cmd_buf_mem->resp.fw_addr_hi;
>   	}
> -	mutex_unlock(&psp->mutex);
>   
> +exit:
> +	mutex_unlock(&psp->mutex);
> +	drm_dev_exit(idx);
>   	return ret;
>   }
>   
> @@ -359,8 +366,7 @@ static int psp_load_toc(struct psp_context *psp,
>   	if (!cmd)
>   		return -ENOMEM;
>   	/* Copy toc to psp firmware private buffer */
> -	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
> -	memcpy(psp->fw_pri_buf, psp->toc_start_addr, psp->toc_bin_size);
> +	psp_copy_fw(psp, psp->toc_start_addr, psp->toc_bin_size);
>   
>   	psp_prep_load_toc_cmd_buf(cmd, psp->fw_pri_mc_addr, psp->toc_bin_size);
>   
> @@ -625,8 +631,7 @@ static int psp_asd_load(struct psp_context *psp)
>   	if (!cmd)
>   		return -ENOMEM;
>   
> -	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
> -	memcpy(psp->fw_pri_buf, psp->asd_start_addr, psp->asd_ucode_size);
> +	psp_copy_fw(psp, psp->asd_start_addr, psp->asd_ucode_size);
>   
>   	psp_prep_asd_load_cmd_buf(cmd, psp->fw_pri_mc_addr,
>   				  psp->asd_ucode_size);
> @@ -781,8 +786,7 @@ static int psp_xgmi_load(struct psp_context *psp)
>   	if (!cmd)
>   		return -ENOMEM;
>   
> -	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
> -	memcpy(psp->fw_pri_buf, psp->ta_xgmi_start_addr, psp->ta_xgmi_ucode_size);
> +	psp_copy_fw(psp, psp->ta_xgmi_start_addr, psp->ta_xgmi_ucode_size);
>   
>   	psp_prep_ta_load_cmd_buf(cmd,
>   				 psp->fw_pri_mc_addr,
> @@ -1038,8 +1042,7 @@ static int psp_ras_load(struct psp_context *psp)
>   	if (!cmd)
>   		return -ENOMEM;
>   
> -	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
> -	memcpy(psp->fw_pri_buf, psp->ta_ras_start_addr, psp->ta_ras_ucode_size);
> +	psp_copy_fw(psp, psp->ta_ras_start_addr, psp->ta_ras_ucode_size);
>   
>   	psp_prep_ta_load_cmd_buf(cmd,
>   				 psp->fw_pri_mc_addr,
> @@ -1275,8 +1278,7 @@ static int psp_hdcp_load(struct psp_context *psp)
>   	if (!cmd)
>   		return -ENOMEM;
>   
> -	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
> -	memcpy(psp->fw_pri_buf, psp->ta_hdcp_start_addr,
> +	psp_copy_fw(psp, psp->ta_hdcp_start_addr,
>   	       psp->ta_hdcp_ucode_size);
>   
>   	psp_prep_ta_load_cmd_buf(cmd,
> @@ -1427,8 +1429,7 @@ static int psp_dtm_load(struct psp_context *psp)
>   	if (!cmd)
>   		return -ENOMEM;
>   
> -	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
> -	memcpy(psp->fw_pri_buf, psp->ta_dtm_start_addr, psp->ta_dtm_ucode_size);
> +	psp_copy_fw(psp, psp->ta_dtm_start_addr, psp->ta_dtm_ucode_size);
>   
>   	psp_prep_ta_load_cmd_buf(cmd,
>   				 psp->fw_pri_mc_addr,
> @@ -1573,8 +1574,7 @@ static int psp_rap_load(struct psp_context *psp)
>   	if (!cmd)
>   		return -ENOMEM;
>   
> -	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
> -	memcpy(psp->fw_pri_buf, psp->ta_rap_start_addr, psp->ta_rap_ucode_size);
> +	psp_copy_fw(psp, psp->ta_rap_start_addr, psp->ta_rap_ucode_size);
>   
>   	psp_prep_ta_load_cmd_buf(cmd,
>   				 psp->fw_pri_mc_addr,
> @@ -3022,7 +3022,7 @@ static ssize_t psp_usbc_pd_fw_sysfs_write(struct device *dev,
>   	struct amdgpu_device *adev = drm_to_adev(ddev);
>   	void *cpu_addr;
>   	dma_addr_t dma_addr;
> -	int ret;
> +	int ret, idx;
>   	char fw_name[100];
>   	const struct firmware *usbc_pd_fw;
>   
> @@ -3031,6 +3031,9 @@ static ssize_t psp_usbc_pd_fw_sysfs_write(struct device *dev,
>   		return -EBUSY;
>   	}
>   
> +	if (!drm_dev_enter(ddev, &idx))
> +		return -ENODEV;
> +
>   	snprintf(fw_name, sizeof(fw_name), "amdgpu/%s", buf);
>   	ret = request_firmware(&usbc_pd_fw, fw_name, adev->dev);
>   	if (ret)
> @@ -3062,16 +3065,30 @@ static ssize_t psp_usbc_pd_fw_sysfs_write(struct device *dev,
>   rel_buf:
>   	dma_free_coherent(adev->dev, usbc_pd_fw->size, cpu_addr, dma_addr);
>   	release_firmware(usbc_pd_fw);
> -
>   fail:
>   	if (ret) {
>   		DRM_ERROR("Failed to load USBC PD FW, err = %d", ret);
> -		return ret;
> +		count = ret;
>   	}
>   
> +	drm_dev_exit(idx);
>   	return count;
>   }
>   
> +void psp_copy_fw(struct psp_context *psp, uint8_t *start_addr, uint32_t bin_size)
> +{
> +	int idx;
> +
> +	if (!drm_dev_enter(&psp->adev->ddev, &idx))
> +		return;
> +
> +	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
> +	memcpy(psp->fw_pri_buf, start_addr, bin_size);
> +
> +	drm_dev_exit(idx);
> +}
> +
> +
>   static DEVICE_ATTR(usbc_pd_fw, S_IRUGO | S_IWUSR,
>   		   psp_usbc_pd_fw_sysfs_read,
>   		   psp_usbc_pd_fw_sysfs_write);
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
> index 46a5328e00e0..2bfdc278817f 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
> @@ -423,4 +423,6 @@ int psp_get_fw_attestation_records_addr(struct psp_context *psp,
>   
>   int psp_load_fw_list(struct psp_context *psp,
>   		     struct amdgpu_firmware_info **ucode_list, int ucode_count);
> +void psp_copy_fw(struct psp_context *psp, uint8_t *start_addr, uint32_t bin_size);
> +
>   #endif
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
> index 688624ebe421..e1985bc34436 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
> @@ -35,6 +35,8 @@
>   #include "amdgpu.h"
>   #include "atom.h"
>   
> +#include <drm/drm_drv.h>
> +
>   /*
>    * Rings
>    * Most engines on the GPU are fed via ring buffers.  Ring
> @@ -461,3 +463,71 @@ int amdgpu_ring_test_helper(struct amdgpu_ring *ring)
>   	ring->sched.ready = !r;
>   	return r;
>   }
> +
> +void amdgpu_ring_clear_ring(struct amdgpu_ring *ring)
> +{
> +	int idx;
> +	int i = 0;
> +
> +	if (!drm_dev_enter(&ring->adev->ddev, &idx))
> +		return;
> +
> +	while (i <= ring->buf_mask)
> +		ring->ring[i++] = ring->funcs->nop;
> +
> +	drm_dev_exit(idx);
> +
> +}
> +
> +void amdgpu_ring_write(struct amdgpu_ring *ring, uint32_t v)
> +{
> +	int idx;
> +
> +	if (!drm_dev_enter(&ring->adev->ddev, &idx))
> +		return;
> +
> +	if (ring->count_dw <= 0)
> +		DRM_ERROR("amdgpu: writing more dwords to the ring than expected!\n");
> +	ring->ring[ring->wptr++ & ring->buf_mask] = v;
> +	ring->wptr &= ring->ptr_mask;
> +	ring->count_dw--;
> +
> +	drm_dev_exit(idx);
> +}
> +
> +void amdgpu_ring_write_multiple(struct amdgpu_ring *ring,
> +					      void *src, int count_dw)
> +{
> +	unsigned occupied, chunk1, chunk2;
> +	void *dst;
> +	int idx;
> +
> +	if (!drm_dev_enter(&ring->adev->ddev, &idx))
> +		return;
> +
> +	if (unlikely(ring->count_dw < count_dw))
> +		DRM_ERROR("amdgpu: writing more dwords to the ring than expected!\n");
> +
> +	occupied = ring->wptr & ring->buf_mask;
> +	dst = (void *)&ring->ring[occupied];
> +	chunk1 = ring->buf_mask + 1 - occupied;
> +	chunk1 = (chunk1 >= count_dw) ? count_dw: chunk1;
> +	chunk2 = count_dw - chunk1;
> +	chunk1 <<= 2;
> +	chunk2 <<= 2;
> +
> +	if (chunk1)
> +		memcpy(dst, src, chunk1);
> +
> +	if (chunk2) {
> +		src += chunk1;
> +		dst = (void *)ring->ring;
> +		memcpy(dst, src, chunk2);
> +	}
> +
> +	ring->wptr += count_dw;
> +	ring->wptr &= ring->ptr_mask;
> +	ring->count_dw -= count_dw;
> +
> +	drm_dev_exit(idx);
> +}

The ring should never be in MMIO memory, so you can completely drop that 
as far as I can see.

Maybe split that patch by use case so that we can more easily review/ack it.

Thanks,
Christian.
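
For reference, the wrap-around copy in the quoted
amdgpu_ring_write_multiple() only touches an ordinary array, so it can be
modeled in plain system memory. The sketch below is a simplified userspace
model under that assumption: fake_ring, RING_DW and ring_write_multiple()
are hypothetical names, and the ptr_mask and count_dw bookkeeping of the
original is left out.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define RING_DW 8u	/* hypothetical ring size in dwords; power of two */

struct fake_ring {
	uint32_t ring[RING_DW];	/* ordinary system memory, not MMIO */
	uint32_t buf_mask;	/* RING_DW - 1 */
	uint64_t wptr;		/* write pointer in dwords */
};

/* Copies count_dw dwords into the ring, splitting the copy at the wrap. */
static void ring_write_multiple(struct fake_ring *r, const uint32_t *src,
				unsigned int count_dw)
{
	unsigned int occupied, chunk1, chunk2;

	assert(count_dw <= RING_DW);

	occupied = (unsigned int)(r->wptr & r->buf_mask);
	chunk1 = r->buf_mask + 1 - occupied;	/* dwords left before the end */
	if (chunk1 > count_dw)
		chunk1 = count_dw;
	chunk2 = count_dw - chunk1;		/* dwords that wrap to the start */

	memcpy(&r->ring[occupied], src, chunk1 * sizeof(uint32_t));
	if (chunk2)
		memcpy(r->ring, src + chunk1, chunk2 * sizeof(uint32_t));

	r->wptr += count_dw;
}
```

Both copies are plain memcpy() calls on CPU memory, which is why no
memcpy_toio() is involved in this path.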

> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
> index e7d3d0dbdd96..c67bc6d3d039 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
> @@ -299,53 +299,12 @@ static inline void amdgpu_ring_set_preempt_cond_exec(struct amdgpu_ring *ring,
>   	*ring->cond_exe_cpu_addr = cond_exec;
>   }
>   
> -static inline void amdgpu_ring_clear_ring(struct amdgpu_ring *ring)
> -{
> -	int i = 0;
> -	while (i <= ring->buf_mask)
> -		ring->ring[i++] = ring->funcs->nop;
> -
> -}
> -
> -static inline void amdgpu_ring_write(struct amdgpu_ring *ring, uint32_t v)
> -{
> -	if (ring->count_dw <= 0)
> -		DRM_ERROR("amdgpu: writing more dwords to the ring than expected!\n");
> -	ring->ring[ring->wptr++ & ring->buf_mask] = v;
> -	ring->wptr &= ring->ptr_mask;
> -	ring->count_dw--;
> -}
> +void amdgpu_ring_clear_ring(struct amdgpu_ring *ring);
>   
> -static inline void amdgpu_ring_write_multiple(struct amdgpu_ring *ring,
> -					      void *src, int count_dw)
> -{
> -	unsigned occupied, chunk1, chunk2;
> -	void *dst;
> -
> -	if (unlikely(ring->count_dw < count_dw))
> -		DRM_ERROR("amdgpu: writing more dwords to the ring than expected!\n");
> -
> -	occupied = ring->wptr & ring->buf_mask;
> -	dst = (void *)&ring->ring[occupied];
> -	chunk1 = ring->buf_mask + 1 - occupied;
> -	chunk1 = (chunk1 >= count_dw) ? count_dw: chunk1;
> -	chunk2 = count_dw - chunk1;
> -	chunk1 <<= 2;
> -	chunk2 <<= 2;
> +void amdgpu_ring_write(struct amdgpu_ring *ring, uint32_t v);
>   
> -	if (chunk1)
> -		memcpy(dst, src, chunk1);
> -
> -	if (chunk2) {
> -		src += chunk1;
> -		dst = (void *)ring->ring;
> -		memcpy(dst, src, chunk2);
> -	}
> -
> -	ring->wptr += count_dw;
> -	ring->wptr &= ring->ptr_mask;
> -	ring->count_dw -= count_dw;
> -}
> +void amdgpu_ring_write_multiple(struct amdgpu_ring *ring,
> +					      void *src, int count_dw);
>   
>   int amdgpu_ring_test_helper(struct amdgpu_ring *ring);
>   
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
> index c6dbc0801604..82f0542c7792 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
> @@ -32,6 +32,7 @@
>   #include <linux/module.h>
>   
>   #include <drm/drm.h>
> +#include <drm/drm_drv.h>
>   
>   #include "amdgpu.h"
>   #include "amdgpu_pm.h"
> @@ -375,7 +376,7 @@ int amdgpu_uvd_suspend(struct amdgpu_device *adev)
>   {
>   	unsigned size;
>   	void *ptr;
> -	int i, j;
> +	int i, j, idx;
>   	bool in_ras_intr = amdgpu_ras_intr_triggered();
>   
>   	cancel_delayed_work_sync(&adev->uvd.idle_work);
> @@ -403,11 +404,15 @@ int amdgpu_uvd_suspend(struct amdgpu_device *adev)
>   		if (!adev->uvd.inst[j].saved_bo)
>   			return -ENOMEM;
>   
> -		/* re-write 0 since err_event_athub will corrupt VCPU buffer */
> -		if (in_ras_intr)
> -			memset(adev->uvd.inst[j].saved_bo, 0, size);
> -		else
> -			memcpy_fromio(adev->uvd.inst[j].saved_bo, ptr, size);
> +		if (drm_dev_enter(&adev->ddev, &idx)) {
> +			/* re-write 0 since err_event_athub will corrupt VCPU buffer */
> +			if (in_ras_intr)
> +				memset(adev->uvd.inst[j].saved_bo, 0, size);
> +			else
> +				memcpy_fromio(adev->uvd.inst[j].saved_bo, ptr, size);
> +
> +			drm_dev_exit(idx);
> +		}
>   	}
>   
>   	if (in_ras_intr)
> @@ -420,7 +425,7 @@ int amdgpu_uvd_resume(struct amdgpu_device *adev)
>   {
>   	unsigned size;
>   	void *ptr;
> -	int i;
> +	int i, idx;
>   
>   	for (i = 0; i < adev->uvd.num_uvd_inst; i++) {
>   		if (adev->uvd.harvest_config & (1 << i))
> @@ -432,7 +437,10 @@ int amdgpu_uvd_resume(struct amdgpu_device *adev)
>   		ptr = adev->uvd.inst[i].cpu_addr;
>   
>   		if (adev->uvd.inst[i].saved_bo != NULL) {
> -			memcpy_toio(ptr, adev->uvd.inst[i].saved_bo, size);
> +			if (drm_dev_enter(&adev->ddev, &idx)) {
> +				memcpy_toio(ptr, adev->uvd.inst[i].saved_bo, size);
> +				drm_dev_exit(idx);
> +			}
>   			kvfree(adev->uvd.inst[i].saved_bo);
>   			adev->uvd.inst[i].saved_bo = NULL;
>   		} else {
> @@ -442,8 +450,11 @@ int amdgpu_uvd_resume(struct amdgpu_device *adev)
>   			hdr = (const struct common_firmware_header *)adev->uvd.fw->data;
>   			if (adev->firmware.load_type != AMDGPU_FW_LOAD_PSP) {
>   				offset = le32_to_cpu(hdr->ucode_array_offset_bytes);
> -				memcpy_toio(adev->uvd.inst[i].cpu_addr, adev->uvd.fw->data + offset,
> -					    le32_to_cpu(hdr->ucode_size_bytes));
> +				if (drm_dev_enter(&adev->ddev, &idx)) {
> +					memcpy_toio(adev->uvd.inst[i].cpu_addr, adev->uvd.fw->data + offset,
> +						    le32_to_cpu(hdr->ucode_size_bytes));
> +					drm_dev_exit(idx);
> +				}
>   				size -= le32_to_cpu(hdr->ucode_size_bytes);
>   				ptr += le32_to_cpu(hdr->ucode_size_bytes);
>   			}
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
> index ea6a62f67e38..833203401ef4 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
> @@ -29,6 +29,7 @@
>   #include <linux/module.h>
>   
>   #include <drm/drm.h>
> +#include <drm/drm_drv.h>
>   
>   #include "amdgpu.h"
>   #include "amdgpu_pm.h"
> @@ -293,7 +294,7 @@ int amdgpu_vce_resume(struct amdgpu_device *adev)
>   	void *cpu_addr;
>   	const struct common_firmware_header *hdr;
>   	unsigned offset;
> -	int r;
> +	int r, idx;
>   
>   	if (adev->vce.vcpu_bo == NULL)
>   		return -EINVAL;
> @@ -313,8 +314,12 @@ int amdgpu_vce_resume(struct amdgpu_device *adev)
>   
>   	hdr = (const struct common_firmware_header *)adev->vce.fw->data;
>   	offset = le32_to_cpu(hdr->ucode_array_offset_bytes);
> -	memcpy_toio(cpu_addr, adev->vce.fw->data + offset,
> -		    adev->vce.fw->size - offset);
> +
> +	if (drm_dev_enter(&adev->ddev, &idx)) {
> +		memcpy_toio(cpu_addr, adev->vce.fw->data + offset,
> +			    adev->vce.fw->size - offset);
> +		drm_dev_exit(idx);
> +	}
>   
>   	amdgpu_bo_kunmap(adev->vce.vcpu_bo);
>   
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
> index 201645963ba5..21f7d3644d70 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
> @@ -27,6 +27,7 @@
>   #include <linux/firmware.h>
>   #include <linux/module.h>
>   #include <linux/pci.h>
> +#include <drm/drm_drv.h>
>   
>   #include "amdgpu.h"
>   #include "amdgpu_pm.h"
> @@ -275,7 +276,7 @@ int amdgpu_vcn_suspend(struct amdgpu_device *adev)
>   {
>   	unsigned size;
>   	void *ptr;
> -	int i;
> +	int i, idx;
>   
>   	cancel_delayed_work_sync(&adev->vcn.idle_work);
>   
> @@ -292,7 +293,10 @@ int amdgpu_vcn_suspend(struct amdgpu_device *adev)
>   		if (!adev->vcn.inst[i].saved_bo)
>   			return -ENOMEM;
>   
> -		memcpy_fromio(adev->vcn.inst[i].saved_bo, ptr, size);
> +		if (drm_dev_enter(&adev->ddev, &idx)) {
> +			memcpy_fromio(adev->vcn.inst[i].saved_bo, ptr, size);
> +			drm_dev_exit(idx);
> +		}
>   	}
>   	return 0;
>   }
> @@ -301,7 +305,7 @@ int amdgpu_vcn_resume(struct amdgpu_device *adev)
>   {
>   	unsigned size;
>   	void *ptr;
> -	int i;
> +	int i, idx;
>   
>   	for (i = 0; i < adev->vcn.num_vcn_inst; ++i) {
>   		if (adev->vcn.harvest_config & (1 << i))
> @@ -313,7 +317,10 @@ int amdgpu_vcn_resume(struct amdgpu_device *adev)
>   		ptr = adev->vcn.inst[i].cpu_addr;
>   
>   		if (adev->vcn.inst[i].saved_bo != NULL) {
> -			memcpy_toio(ptr, adev->vcn.inst[i].saved_bo, size);
> +			if (drm_dev_enter(&adev->ddev, &idx)) {
> +				memcpy_toio(ptr, adev->vcn.inst[i].saved_bo, size);
> +				drm_dev_exit(idx);
> +			}
>   			kvfree(adev->vcn.inst[i].saved_bo);
>   			adev->vcn.inst[i].saved_bo = NULL;
>   		} else {
> @@ -323,8 +330,11 @@ int amdgpu_vcn_resume(struct amdgpu_device *adev)
>   			hdr = (const struct common_firmware_header *)adev->vcn.fw->data;
>   			if (adev->firmware.load_type != AMDGPU_FW_LOAD_PSP) {
>   				offset = le32_to_cpu(hdr->ucode_array_offset_bytes);
> -				memcpy_toio(adev->vcn.inst[i].cpu_addr, adev->vcn.fw->data + offset,
> -					    le32_to_cpu(hdr->ucode_size_bytes));
> +				if (drm_dev_enter(&adev->ddev, &idx)) {
> +					memcpy_toio(adev->vcn.inst[i].cpu_addr, adev->vcn.fw->data + offset,
> +						    le32_to_cpu(hdr->ucode_size_bytes));
> +					drm_dev_exit(idx);
> +				}
>   				size -= le32_to_cpu(hdr->ucode_size_bytes);
>   				ptr += le32_to_cpu(hdr->ucode_size_bytes);
>   			}
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> index 9f868cf3b832..7dd5f10ab570 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> @@ -32,6 +32,7 @@
>   #include <linux/dma-buf.h>
>   
>   #include <drm/amdgpu_drm.h>
> +#include <drm/drm_drv.h>
>   #include "amdgpu.h"
>   #include "amdgpu_trace.h"
>   #include "amdgpu_amdkfd.h"
> @@ -1606,7 +1607,10 @@ static int amdgpu_vm_bo_update_mapping(struct amdgpu_device *adev,
>   	struct amdgpu_vm_update_params params;
>   	enum amdgpu_sync_mode sync_mode;
>   	uint64_t pfn;
> -	int r;
> +	int r, idx;
> +
> +	if (!drm_dev_enter(&adev->ddev, &idx))
> +		return -ENODEV;
>   
>   	memset(&params, 0, sizeof(params));
>   	params.adev = adev;
> @@ -1715,6 +1719,7 @@ static int amdgpu_vm_bo_update_mapping(struct amdgpu_device *adev,
>   
>   error_unlock:
>   	amdgpu_vm_eviction_unlock(vm);
> +	drm_dev_exit(idx);
>   	return r;
>   }
>   
> diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c b/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
> index 589410c32d09..2cec71e823f5 100644
> --- a/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
> @@ -23,6 +23,7 @@
>   #include <linux/firmware.h>
>   #include <linux/module.h>
>   #include <linux/vmalloc.h>
> +#include <drm/drm_drv.h>
>   
>   #include "amdgpu.h"
>   #include "amdgpu_psp.h"
> @@ -269,10 +270,8 @@ static int psp_v11_0_bootloader_load_kdb(struct psp_context *psp)
>   	if (ret)
>   		return ret;
>   
> -	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
> -
>   	/* Copy PSP KDB binary to memory */
> -	memcpy(psp->fw_pri_buf, psp->kdb_start_addr, psp->kdb_bin_size);
> +	psp_copy_fw(psp, psp->kdb_start_addr, psp->kdb_bin_size);
>   
>   	/* Provide the PSP KDB to bootloader */
>   	WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
> @@ -302,10 +301,8 @@ static int psp_v11_0_bootloader_load_spl(struct psp_context *psp)
>   	if (ret)
>   		return ret;
>   
> -	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
> -
>   	/* Copy PSP SPL binary to memory */
> -	memcpy(psp->fw_pri_buf, psp->spl_start_addr, psp->spl_bin_size);
> +	psp_copy_fw(psp, psp->spl_start_addr, psp->spl_bin_size);
>   
>   	/* Provide the PSP SPL to bootloader */
>   	WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
> @@ -335,10 +332,8 @@ static int psp_v11_0_bootloader_load_sysdrv(struct psp_context *psp)
>   	if (ret)
>   		return ret;
>   
> -	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
> -
>   	/* Copy PSP System Driver binary to memory */
> -	memcpy(psp->fw_pri_buf, psp->sys_start_addr, psp->sys_bin_size);
> +	psp_copy_fw(psp, psp->sys_start_addr, psp->sys_bin_size);
>   
>   	/* Provide the sys driver to bootloader */
>   	WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
> @@ -371,10 +366,8 @@ static int psp_v11_0_bootloader_load_sos(struct psp_context *psp)
>   	if (ret)
>   		return ret;
>   
> -	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
> -
>   	/* Copy Secure OS binary to PSP memory */
> -	memcpy(psp->fw_pri_buf, psp->sos_start_addr, psp->sos_bin_size);
> +	psp_copy_fw(psp, psp->sos_start_addr, psp->sos_bin_size);
>   
>   	/* Provide the PSP secure OS to bootloader */
>   	WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
> @@ -608,7 +601,7 @@ static int psp_v11_0_memory_training(struct psp_context *psp, uint32_t ops)
>   	uint32_t p2c_header[4];
>   	uint32_t sz;
>   	void *buf;
> -	int ret;
> +	int ret, idx;
>   
>   	if (ctx->init == PSP_MEM_TRAIN_NOT_SUPPORT) {
>   		DRM_DEBUG("Memory training is not supported.\n");
> @@ -681,17 +674,24 @@ static int psp_v11_0_memory_training(struct psp_context *psp, uint32_t ops)
>   			return -ENOMEM;
>   		}
>   
> -		memcpy_fromio(buf, adev->mman.aper_base_kaddr, sz);
> -		ret = psp_v11_0_memory_training_send_msg(psp, PSP_BL__DRAM_LONG_TRAIN);
> -		if (ret) {
> -			DRM_ERROR("Send long training msg failed.\n");
> +		if (drm_dev_enter(&adev->ddev, &idx)) {
> +			memcpy_fromio(buf, adev->mman.aper_base_kaddr, sz);
> +			ret = psp_v11_0_memory_training_send_msg(psp, PSP_BL__DRAM_LONG_TRAIN);
> +			if (ret) {
> +				DRM_ERROR("Send long training msg failed.\n");
> +				vfree(buf);
> +				drm_dev_exit(idx);
> +				return ret;
> +			}
> +
> +			memcpy_toio(adev->mman.aper_base_kaddr, buf, sz);
> +			adev->hdp.funcs->flush_hdp(adev, NULL);
>   			vfree(buf);
> -			return ret;
> +			drm_dev_exit(idx);
> +		} else {
> +			vfree(buf);
> +			return -ENODEV;
>   		}
> -
> -		memcpy_toio(adev->mman.aper_base_kaddr, buf, sz);
> -		adev->hdp.funcs->flush_hdp(adev, NULL);
> -		vfree(buf);
>   	}
>   
>   	if (ops & PSP_MEM_TRAIN_SAVE) {
> diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c b/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
> index c4828bd3264b..618e5b6b85d9 100644
> --- a/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
> @@ -138,10 +138,8 @@ static int psp_v12_0_bootloader_load_sysdrv(struct psp_context *psp)
>   	if (ret)
>   		return ret;
>   
> -	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
> -
>   	/* Copy PSP System Driver binary to memory */
> -	memcpy(psp->fw_pri_buf, psp->sys_start_addr, psp->sys_bin_size);
> +	psp_copy_fw(psp, psp->sys_start_addr, psp->sys_bin_size);
>   
>   	/* Provide the sys driver to bootloader */
>   	WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
> @@ -179,10 +177,8 @@ static int psp_v12_0_bootloader_load_sos(struct psp_context *psp)
>   	if (ret)
>   		return ret;
>   
> -	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
> -
>   	/* Copy Secure OS binary to PSP memory */
> -	memcpy(psp->fw_pri_buf, psp->sos_start_addr, psp->sos_bin_size);
> +	psp_copy_fw(psp, psp->sos_start_addr, psp->sos_bin_size);
>   
>   	/* Provide the PSP secure OS to bootloader */
>   	WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
> diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c b/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
> index f2e725f72d2f..d0a6cccd0897 100644
> --- a/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
> +++ b/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
> @@ -102,10 +102,8 @@ static int psp_v3_1_bootloader_load_sysdrv(struct psp_context *psp)
>   	if (ret)
>   		return ret;
>   
> -	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
> -
>   	/* Copy PSP System Driver binary to memory */
> -	memcpy(psp->fw_pri_buf, psp->sys_start_addr, psp->sys_bin_size);
> +	psp_copy_fw(psp, psp->sys_start_addr, psp->sys_bin_size);
>   
>   	/* Provide the sys driver to bootloader */
>   	WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
> @@ -143,10 +141,8 @@ static int psp_v3_1_bootloader_load_sos(struct psp_context *psp)
>   	if (ret)
>   		return ret;
>   
> -	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
> -
>   	/* Copy Secure OS binary to PSP memory */
> -	memcpy(psp->fw_pri_buf, psp->sos_start_addr, psp->sos_bin_size);
> +	psp_copy_fw(psp, psp->sos_start_addr, psp->sos_bin_size);
>   
>   	/* Provide the PSP secure OS to bootloader */
>   	WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
> diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
> index 8e238dea7bef..90910d19db12 100644
> --- a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
> @@ -25,6 +25,7 @@
>    */
>   
>   #include <linux/firmware.h>
> +#include <drm/drm_drv.h>
>   
>   #include "amdgpu.h"
>   #include "amdgpu_vce.h"
> @@ -555,16 +556,19 @@ static int vce_v4_0_hw_fini(void *handle)
>   static int vce_v4_0_suspend(void *handle)
>   {
>   	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
> -	int r;
> +	int r, idx;
>   
>   	if (adev->vce.vcpu_bo == NULL)
>   		return 0;
>   
> -	if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) {
> -		unsigned size = amdgpu_bo_size(adev->vce.vcpu_bo);
> -		void *ptr = adev->vce.cpu_addr;
> +	if (drm_dev_enter(&adev->ddev, &idx)) {
> +		if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) {
> +			unsigned size = amdgpu_bo_size(adev->vce.vcpu_bo);
> +			void *ptr = adev->vce.cpu_addr;
>   
> -		memcpy_fromio(adev->vce.saved_bo, ptr, size);
> +			memcpy_fromio(adev->vce.saved_bo, ptr, size);
> +		}
> +		drm_dev_exit(idx);
>   	}
>   
>   	r = vce_v4_0_hw_fini(adev);
> @@ -577,16 +581,20 @@ static int vce_v4_0_suspend(void *handle)
>   static int vce_v4_0_resume(void *handle)
>   {
>   	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
> -	int r;
> +	int r, idx;
>   
>   	if (adev->vce.vcpu_bo == NULL)
>   		return -EINVAL;
>   
>   	if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) {
> -		unsigned size = amdgpu_bo_size(adev->vce.vcpu_bo);
> -		void *ptr = adev->vce.cpu_addr;
>   
> -		memcpy_toio(ptr, adev->vce.saved_bo, size);
> +		if (drm_dev_enter(&adev->ddev, &idx)) {
> +			unsigned size = amdgpu_bo_size(adev->vce.vcpu_bo);
> +			void *ptr = adev->vce.cpu_addr;
> +
> +			memcpy_toio(ptr, adev->vce.saved_bo, size);
> +			drm_dev_exit(idx);
> +		}
>   	} else {
>   		r = amdgpu_vce_resume(adev);
>   		if (r)
> diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c b/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
> index 3f15bf34123a..df34be8ec82d 100644
> --- a/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
> @@ -34,6 +34,8 @@
>   #include "vcn/vcn_3_0_0_sh_mask.h"
>   #include "ivsrcid/vcn/irqsrcs_vcn_2_0.h"
>   
> +#include <drm/drm_drv.h>
> +
>   #define mmUVD_CONTEXT_ID_INTERNAL_OFFSET			0x27
>   #define mmUVD_GPCOM_VCPU_CMD_INTERNAL_OFFSET			0x0f
>   #define mmUVD_GPCOM_VCPU_DATA0_INTERNAL_OFFSET			0x10
> @@ -268,16 +270,20 @@ static int vcn_v3_0_sw_init(void *handle)
>   static int vcn_v3_0_sw_fini(void *handle)
>   {
>   	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
> -	int i, r;
> +	int i, r, idx;
>   
> -	for (i = 0; i < adev->vcn.num_vcn_inst; i++) {
> -		volatile struct amdgpu_fw_shared *fw_shared;
> +	if (drm_dev_enter(&adev->ddev, &idx)) {
> +		for (i = 0; i < adev->vcn.num_vcn_inst; i++) {
> +			volatile struct amdgpu_fw_shared *fw_shared;
>   
> -		if (adev->vcn.harvest_config & (1 << i))
> -			continue;
> -		fw_shared = adev->vcn.inst[i].fw_shared_cpu_addr;
> -		fw_shared->present_flag_0 = 0;
> -		fw_shared->sw_ring.is_enabled = false;
> +			if (adev->vcn.harvest_config & (1 << i))
> +				continue;
> +			fw_shared = adev->vcn.inst[i].fw_shared_cpu_addr;
> +			fw_shared->present_flag_0 = 0;
> +			fw_shared->sw_ring.is_enabled = false;
> +		}
> +
> +		drm_dev_exit(idx);
>   	}
>   
>   	if (amdgpu_sriov_vf(adev))
> diff --git a/drivers/gpu/drm/amd/pm/powerplay/smumgr/smu7_smumgr.c b/drivers/gpu/drm/amd/pm/powerplay/smumgr/smu7_smumgr.c
> index aae25243eb10..d628b91846c9 100644
> --- a/drivers/gpu/drm/amd/pm/powerplay/smumgr/smu7_smumgr.c
> +++ b/drivers/gpu/drm/amd/pm/powerplay/smumgr/smu7_smumgr.c
> @@ -405,6 +405,8 @@ int smu7_request_smu_load_fw(struct pp_hwmgr *hwmgr)
>   				UCODE_ID_MEC_STORAGE, &toc->entry[toc->num_entries++]),
>   				"Failed to Get Firmware Entry.", r = -EINVAL; goto failed);
>   	}
> +
> +	/* AG TODO Can't call drm_dev_enter/exit because access adev->ddev here ... */
>   	memcpy_toio(smu_data->header_buffer.kaddr, smu_data->toc,
>   		    sizeof(struct SMU_DRAMData_TOC));
>   	smum_send_msg_to_smc_with_parameter(hwmgr,



* Re: [PATCH v6 10/16] drm/amdgpu: Guard against write accesses after device removal
@ 2021-05-11  6:50     ` Christian König
  0 siblings, 0 replies; 126+ messages in thread
From: Christian König @ 2021-05-11  6:50 UTC (permalink / raw)
  To: Andrey Grodzovsky, dri-devel, amd-gfx, linux-pci, daniel.vetter,
	Harry.Wentland
  Cc: Alexander.Deucher, gregkh, helgaas, Felix.Kuehling

On 10.05.21 at 18:36, Andrey Grodzovsky wrote:
> This should prevent writing to memory or IO ranges possibly
> already allocated for other uses after our device is removed.
>
> v5:
> Protect more places where memcpy_toio/memcpy_fromio take place
> Protect IB submissions
>
> v6: Switch to !drm_dev_enter instead of scoping entire code
> with brackets.
>
> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c    | 11 ++-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c       |  9 +++
>   drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c        | 17 +++--
>   drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c       | 63 +++++++++++------
>   drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h       |  2 +
>   drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c      | 70 +++++++++++++++++++
>   drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h      | 49 ++-----------
>   drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c       | 31 +++++---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c       | 11 ++-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c       | 22 ++++--
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c        |  7 +-
>   drivers/gpu/drm/amd/amdgpu/psp_v11_0.c        | 44 ++++++------
>   drivers/gpu/drm/amd/amdgpu/psp_v12_0.c        |  8 +--
>   drivers/gpu/drm/amd/amdgpu/psp_v3_1.c         |  8 +--
>   drivers/gpu/drm/amd/amdgpu/vce_v4_0.c         | 26 ++++---
>   drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c         | 22 +++---
>   .../drm/amd/pm/powerplay/smumgr/smu7_smumgr.c |  2 +
>   17 files changed, 257 insertions(+), 145 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index a0bff4713672..94c415176cdc 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -71,6 +71,8 @@
>   #include <drm/task_barrier.h>
>   #include <linux/pm_runtime.h>
>   
> +#include <drm/drm_drv.h>
> +
>   MODULE_FIRMWARE("amdgpu/vega10_gpu_info.bin");
>   MODULE_FIRMWARE("amdgpu/vega12_gpu_info.bin");
>   MODULE_FIRMWARE("amdgpu/raven_gpu_info.bin");
> @@ -281,7 +283,10 @@ void amdgpu_device_vram_access(struct amdgpu_device *adev, loff_t pos,
>   	unsigned long flags;
>   	uint32_t hi = ~0;
>   	uint64_t last;
> +	int idx;
>   
> +	if (!drm_dev_enter(&adev->ddev, &idx))
> +		return;
>   
>   #ifdef CONFIG_64BIT
>   	last = min(pos + size, adev->gmc.visible_vram_size);
> @@ -299,8 +304,10 @@ void amdgpu_device_vram_access(struct amdgpu_device *adev, loff_t pos,
>   			memcpy_fromio(buf, addr, count);
>   		}
>   
> -		if (count == size)
> +		if (count == size) {
> +			drm_dev_exit(idx);
>   			return;
> +		}

Maybe use a goto instead, but that is really just a nitpick.
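
To illustrate the goto suggestion, here is a minimal userspace sketch of a
single-exit function. It is not kernel code: mock_dev, mock_dev_enter(),
mock_dev_exit() and mock_vram_read() are hypothetical stand-ins that only
mimic the enter/exit pairing of drm_dev_enter()/drm_dev_exit(), and the
fast/slow split is a simplified assumption about the shape of
amdgpu_device_vram_access().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a DRM device; not the kernel structure. */
struct mock_dev {
	bool unplugged;
	int enter_depth;	/* enter calls not yet matched by an exit */
};

/* Mimics drm_dev_enter(): fails once the device is gone. */
static bool mock_dev_enter(struct mock_dev *dev, int *idx)
{
	if (dev->unplugged)
		return false;
	*idx = dev->enter_depth++;
	return true;
}

/* Mimics drm_dev_exit(): must be called once per successful enter. */
static void mock_dev_exit(struct mock_dev *dev, int idx)
{
	dev->enter_depth--;
	assert(dev->enter_depth == idx);
}

/*
 * Copy `size` bytes from src to buf. The first `fast_len` bytes use a
 * bulk copy, the remainder a byte-wise loop. Every path after a
 * successful enter leaves through the single `exit` label.
 */
static size_t mock_vram_read(struct mock_dev *dev, const char *src,
			     size_t fast_len, char *buf, size_t size)
{
	size_t done = 0, count;
	int idx;

	if (!mock_dev_enter(dev, &idx))
		return 0;

	count = size < fast_len ? size : fast_len;
	memcpy(buf, src, count);
	done = count;

	if (count == size)
		goto exit;	/* instead of exit + return inside braces */

	while (done < size) {
		buf[done] = src[done];
		done++;
	}

exit:
	mock_dev_exit(dev, idx);
	return done;
}
```

With one label, adding another early-out later cannot forget the matching
exit call, which is the usual argument for this style.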



>   
>   		pos += count;
>   		buf += count / 4;
> @@ -323,6 +330,8 @@ void amdgpu_device_vram_access(struct amdgpu_device *adev, loff_t pos,
>   			*buf++ = RREG32_NO_KIQ(mmMM_DATA);
>   	}
>   	spin_unlock_irqrestore(&adev->mmio_idx_lock, flags);
> +
> +	drm_dev_exit(idx);
>   }
>   
>   /*
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> index 4d32233cde92..04ba5eef1e88 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> @@ -31,6 +31,8 @@
>   #include "amdgpu_ras.h"
>   #include "amdgpu_xgmi.h"
>   
> +#include <drm/drm_drv.h>
> +
>   /**
>    * amdgpu_gmc_pdb0_alloc - allocate vram for pdb0
>    *
> @@ -151,6 +153,10 @@ int amdgpu_gmc_set_pte_pde(struct amdgpu_device *adev, void *cpu_pt_addr,
>   {
>   	void __iomem *ptr = (void *)cpu_pt_addr;
>   	uint64_t value;
> +	int idx;
> +
> +	if (!drm_dev_enter(&adev->ddev, &idx))
> +		return 0;
>   
>   	/*
>   	 * The following is for PTE only. GART does not have PDEs.
> @@ -158,6 +164,9 @@ int amdgpu_gmc_set_pte_pde(struct amdgpu_device *adev, void *cpu_pt_addr,
>   	value = addr & 0x0000FFFFFFFFF000ULL;
>   	value |= flags;
>   	writeq(value, ptr + (gpu_page_idx * 8));
> +
> +	drm_dev_exit(idx);
> +
>   	return 0;
>   }
>   
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
> index 148a3b481b12..62fcbd446c71 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
> @@ -30,6 +30,7 @@
>   #include <linux/slab.h>
>   
>   #include <drm/amdgpu_drm.h>
> +#include <drm/drm_drv.h>
>   
>   #include "amdgpu.h"
>   #include "atom.h"
> @@ -137,7 +138,7 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, unsigned num_ibs,
>   	bool secure;
>   
>   	unsigned i;
> -	int r = 0;
> +	int idx, r = 0;
>   	bool need_pipe_sync = false;
>   
>   	if (num_ibs == 0)
> @@ -169,13 +170,16 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, unsigned num_ibs,
>   		return -EINVAL;
>   	}
>   
> +	if (!drm_dev_enter(&adev->ddev, &idx))
> +		return -ENODEV;
> +
>   	alloc_size = ring->funcs->emit_frame_size + num_ibs *
>   		ring->funcs->emit_ib_size;
>   
>   	r = amdgpu_ring_alloc(ring, alloc_size);
>   	if (r) {
>   		dev_err(adev->dev, "scheduling IB failed (%d).\n", r);
> -		return r;
> +		goto exit;
>   	}
>   
>   	need_ctx_switch = ring->current_ctx != fence_ctx;
> @@ -205,7 +209,7 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, unsigned num_ibs,
>   		r = amdgpu_vm_flush(ring, job, need_pipe_sync);
>   		if (r) {
>   			amdgpu_ring_undo(ring);
> -			return r;
> +			goto exit;
>   		}
>   	}
>   
> @@ -286,7 +290,7 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, unsigned num_ibs,
>   		if (job && job->vmid)
>   			amdgpu_vmid_reset(adev, ring->funcs->vmhub, job->vmid);
>   		amdgpu_ring_undo(ring);
> -		return r;
> +		goto exit;
>   	}
>   
>   	if (ring->funcs->insert_end)
> @@ -304,7 +308,10 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, unsigned num_ibs,
>   		ring->funcs->emit_wave_limit(ring, false);
>   
>   	amdgpu_ring_commit(ring);
> -	return 0;
> +
> +exit:
> +	drm_dev_exit(idx);
> +	return r;
>   }
>   
>   /**
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
> index 9e769cf6095b..bb6afee61666 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
> @@ -25,6 +25,7 @@
>   
>   #include <linux/firmware.h>
>   #include <linux/dma-mapping.h>
> +#include <drm/drm_drv.h>
>   
>   #include "amdgpu.h"
>   #include "amdgpu_psp.h"
> @@ -39,6 +40,8 @@
>   #include "amdgpu_ras.h"
>   #include "amdgpu_securedisplay.h"
>   
> +#include <drm/drm_drv.h>
> +
>   static int psp_sysfs_init(struct amdgpu_device *adev);
>   static void psp_sysfs_fini(struct amdgpu_device *adev);
>   
> @@ -253,7 +256,7 @@ psp_cmd_submit_buf(struct psp_context *psp,
>   		   struct psp_gfx_cmd_resp *cmd, uint64_t fence_mc_addr)
>   {
>   	int ret;
> -	int index;
> +	int index, idx;
>   	int timeout = 20000;
>   	bool ras_intr = false;
>   	bool skip_unsupport = false;
> @@ -261,6 +264,9 @@ psp_cmd_submit_buf(struct psp_context *psp,
>   	if (psp->adev->in_pci_err_recovery)
>   		return 0;
>   
> +	if (!drm_dev_enter(&psp->adev->ddev, &idx))
> +		return 0;
> +
>   	mutex_lock(&psp->mutex);
>   
>   	memset(psp->cmd_buf_mem, 0, PSP_CMD_BUFFER_SIZE);
> @@ -271,8 +277,7 @@ psp_cmd_submit_buf(struct psp_context *psp,
>   	ret = psp_ring_cmd_submit(psp, psp->cmd_buf_mc_addr, fence_mc_addr, index);
>   	if (ret) {
>   		atomic_dec(&psp->fence_value);
> -		mutex_unlock(&psp->mutex);
> -		return ret;
> +		goto exit;
>   	}
>   
>   	amdgpu_asic_invalidate_hdp(psp->adev, NULL);
> @@ -312,8 +317,8 @@ psp_cmd_submit_buf(struct psp_context *psp,
>   			 psp->cmd_buf_mem->cmd_id,
>   			 psp->cmd_buf_mem->resp.status);
>   		if (!timeout) {
> -			mutex_unlock(&psp->mutex);
> -			return -EINVAL;
> +			ret = -EINVAL;
> +			goto exit;
>   		}
>   	}
>   
> @@ -321,8 +326,10 @@ psp_cmd_submit_buf(struct psp_context *psp,
>   		ucode->tmr_mc_addr_lo = psp->cmd_buf_mem->resp.fw_addr_lo;
>   		ucode->tmr_mc_addr_hi = psp->cmd_buf_mem->resp.fw_addr_hi;
>   	}
> -	mutex_unlock(&psp->mutex);
>   
> +exit:
> +	mutex_unlock(&psp->mutex);
> +	drm_dev_exit(idx);
>   	return ret;
>   }
>   
> @@ -359,8 +366,7 @@ static int psp_load_toc(struct psp_context *psp,
>   	if (!cmd)
>   		return -ENOMEM;
>   	/* Copy toc to psp firmware private buffer */
> -	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
> -	memcpy(psp->fw_pri_buf, psp->toc_start_addr, psp->toc_bin_size);
> +	psp_copy_fw(psp, psp->toc_start_addr, psp->toc_bin_size);
>   
>   	psp_prep_load_toc_cmd_buf(cmd, psp->fw_pri_mc_addr, psp->toc_bin_size);
>   
> @@ -625,8 +631,7 @@ static int psp_asd_load(struct psp_context *psp)
>   	if (!cmd)
>   		return -ENOMEM;
>   
> -	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
> -	memcpy(psp->fw_pri_buf, psp->asd_start_addr, psp->asd_ucode_size);
> +	psp_copy_fw(psp, psp->asd_start_addr, psp->asd_ucode_size);
>   
>   	psp_prep_asd_load_cmd_buf(cmd, psp->fw_pri_mc_addr,
>   				  psp->asd_ucode_size);
> @@ -781,8 +786,7 @@ static int psp_xgmi_load(struct psp_context *psp)
>   	if (!cmd)
>   		return -ENOMEM;
>   
> -	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
> -	memcpy(psp->fw_pri_buf, psp->ta_xgmi_start_addr, psp->ta_xgmi_ucode_size);
> +	psp_copy_fw(psp, psp->ta_xgmi_start_addr, psp->ta_xgmi_ucode_size);
>   
>   	psp_prep_ta_load_cmd_buf(cmd,
>   				 psp->fw_pri_mc_addr,
> @@ -1038,8 +1042,7 @@ static int psp_ras_load(struct psp_context *psp)
>   	if (!cmd)
>   		return -ENOMEM;
>   
> -	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
> -	memcpy(psp->fw_pri_buf, psp->ta_ras_start_addr, psp->ta_ras_ucode_size);
> +	psp_copy_fw(psp, psp->ta_ras_start_addr, psp->ta_ras_ucode_size);
>   
>   	psp_prep_ta_load_cmd_buf(cmd,
>   				 psp->fw_pri_mc_addr,
> @@ -1275,8 +1278,7 @@ static int psp_hdcp_load(struct psp_context *psp)
>   	if (!cmd)
>   		return -ENOMEM;
>   
> -	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
> -	memcpy(psp->fw_pri_buf, psp->ta_hdcp_start_addr,
> +	psp_copy_fw(psp, psp->ta_hdcp_start_addr,
>   	       psp->ta_hdcp_ucode_size);
>   
>   	psp_prep_ta_load_cmd_buf(cmd,
> @@ -1427,8 +1429,7 @@ static int psp_dtm_load(struct psp_context *psp)
>   	if (!cmd)
>   		return -ENOMEM;
>   
> -	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
> -	memcpy(psp->fw_pri_buf, psp->ta_dtm_start_addr, psp->ta_dtm_ucode_size);
> +	psp_copy_fw(psp, psp->ta_dtm_start_addr, psp->ta_dtm_ucode_size);
>   
>   	psp_prep_ta_load_cmd_buf(cmd,
>   				 psp->fw_pri_mc_addr,
> @@ -1573,8 +1574,7 @@ static int psp_rap_load(struct psp_context *psp)
>   	if (!cmd)
>   		return -ENOMEM;
>   
> -	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
> -	memcpy(psp->fw_pri_buf, psp->ta_rap_start_addr, psp->ta_rap_ucode_size);
> +	psp_copy_fw(psp, psp->ta_rap_start_addr, psp->ta_rap_ucode_size);
>   
>   	psp_prep_ta_load_cmd_buf(cmd,
>   				 psp->fw_pri_mc_addr,
> @@ -3022,7 +3022,7 @@ static ssize_t psp_usbc_pd_fw_sysfs_write(struct device *dev,
>   	struct amdgpu_device *adev = drm_to_adev(ddev);
>   	void *cpu_addr;
>   	dma_addr_t dma_addr;
> -	int ret;
> +	int ret, idx;
>   	char fw_name[100];
>   	const struct firmware *usbc_pd_fw;
>   
> @@ -3031,6 +3031,9 @@ static ssize_t psp_usbc_pd_fw_sysfs_write(struct device *dev,
>   		return -EBUSY;
>   	}
>   
> +	if (!drm_dev_enter(ddev, &idx))
> +		return -ENODEV;
> +
>   	snprintf(fw_name, sizeof(fw_name), "amdgpu/%s", buf);
>   	ret = request_firmware(&usbc_pd_fw, fw_name, adev->dev);
>   	if (ret)
> @@ -3062,16 +3065,30 @@ static ssize_t psp_usbc_pd_fw_sysfs_write(struct device *dev,
>   rel_buf:
>   	dma_free_coherent(adev->dev, usbc_pd_fw->size, cpu_addr, dma_addr);
>   	release_firmware(usbc_pd_fw);
> -
>   fail:
>   	if (ret) {
>   		DRM_ERROR("Failed to load USBC PD FW, err = %d", ret);
> -		return ret;
> +		count = ret;
>   	}
>   
> +	drm_dev_exit(idx);
>   	return count;
>   }
>   
> +void psp_copy_fw(struct psp_context *psp, uint8_t *start_addr, uint32_t bin_size)
> +{
> +	int idx;
> +
> +	if (!drm_dev_enter(&psp->adev->ddev, &idx))
> +		return;
> +
> +	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
> +	memcpy(psp->fw_pri_buf, start_addr, bin_size);
> +
> +	drm_dev_exit(idx);
> +}
> +
> +
>   static DEVICE_ATTR(usbc_pd_fw, S_IRUGO | S_IWUSR,
>   		   psp_usbc_pd_fw_sysfs_read,
>   		   psp_usbc_pd_fw_sysfs_write);
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
> index 46a5328e00e0..2bfdc278817f 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
> @@ -423,4 +423,6 @@ int psp_get_fw_attestation_records_addr(struct psp_context *psp,
>   
>   int psp_load_fw_list(struct psp_context *psp,
>   		     struct amdgpu_firmware_info **ucode_list, int ucode_count);
> +void psp_copy_fw(struct psp_context *psp, uint8_t *start_addr, uint32_t bin_size);
> +
>   #endif
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
> index 688624ebe421..e1985bc34436 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
> @@ -35,6 +35,8 @@
>   #include "amdgpu.h"
>   #include "atom.h"
>   
> +#include <drm/drm_drv.h>
> +
>   /*
>    * Rings
>    * Most engines on the GPU are fed via ring buffers.  Ring
> @@ -461,3 +463,71 @@ int amdgpu_ring_test_helper(struct amdgpu_ring *ring)
>   	ring->sched.ready = !r;
>   	return r;
>   }
> +
> +void amdgpu_ring_clear_ring(struct amdgpu_ring *ring)
> +{
> +	int idx;
> +	int i = 0;
> +
> +	if (!drm_dev_enter(&ring->adev->ddev, &idx))
> +		return;
> +
> +	while (i <= ring->buf_mask)
> +		ring->ring[i++] = ring->funcs->nop;
> +
> +	drm_dev_exit(idx);
> +
> +}
> +
> +void amdgpu_ring_write(struct amdgpu_ring *ring, uint32_t v)
> +{
> +	int idx;
> +
> +	if (!drm_dev_enter(&ring->adev->ddev, &idx))
> +		return;
> +
> +	if (ring->count_dw <= 0)
> +		DRM_ERROR("amdgpu: writing more dwords to the ring than expected!\n");
> +	ring->ring[ring->wptr++ & ring->buf_mask] = v;
> +	ring->wptr &= ring->ptr_mask;
> +	ring->count_dw--;
> +
> +	drm_dev_exit(idx);
> +}
> +
> +void amdgpu_ring_write_multiple(struct amdgpu_ring *ring,
> +					      void *src, int count_dw)
> +{
> +	unsigned occupied, chunk1, chunk2;
> +	void *dst;
> +	int idx;
> +
> +	if (!drm_dev_enter(&ring->adev->ddev, &idx))
> +		return;
> +
> +	if (unlikely(ring->count_dw < count_dw))
> +		DRM_ERROR("amdgpu: writing more dwords to the ring than expected!\n");
> +
> +	occupied = ring->wptr & ring->buf_mask;
> +	dst = (void *)&ring->ring[occupied];
> +	chunk1 = ring->buf_mask + 1 - occupied;
> +	chunk1 = (chunk1 >= count_dw) ? count_dw: chunk1;
> +	chunk2 = count_dw - chunk1;
> +	chunk1 <<= 2;
> +	chunk2 <<= 2;
> +
> +	if (chunk1)
> +		memcpy(dst, src, chunk1);
> +
> +	if (chunk2) {
> +		src += chunk1;
> +		dst = (void *)ring->ring;
> +		memcpy(dst, src, chunk2);
> +	}
> +
> +	ring->wptr += count_dw;
> +	ring->wptr &= ring->ptr_mask;
> +	ring->count_dw -= count_dw;
> +
> +	drm_dev_exit(idx);
> +}

The ring should never be in MMIO memory, so you can completely drop that 
as far as I can see.
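
As a rough illustration of why no guard is needed when the ring sits in
ordinary memory, here is a userspace sketch of the wrap-around copy from
amdgpu_ring_write_multiple() with the drm_dev_enter() calls left out.
struct mock_ring and mock_ring_write_multiple() are hypothetical names, the
structure is cut down to the fields the copy touches, and the backing store
is assumed to be a plain array.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, cut-down model of a ring kept in system memory. */
struct mock_ring {
	uint32_t *ring;		/* backing storage, buf_mask + 1 dwords */
	uint32_t buf_mask;	/* size in dwords minus one (power of two) */
	uint64_t ptr_mask;	/* mask applied to the write pointer */
	uint64_t wptr;		/* write pointer, in dwords */
	int count_dw;		/* dwords still reserved for writing */
};

/* Copy count_dw dwords into the ring, splitting the copy at the wrap. */
static void mock_ring_write_multiple(struct mock_ring *ring,
				     const void *src, int count_dw)
{
	const uint8_t *s = src;
	unsigned occupied, chunk1, chunk2;

	assert(count_dw >= 0 && ring->count_dw >= count_dw);

	occupied = ring->wptr & ring->buf_mask;
	chunk1 = ring->buf_mask + 1 - occupied;	/* dwords until the end */
	if (chunk1 > (unsigned)count_dw)
		chunk1 = count_dw;
	chunk2 = count_dw - chunk1;		/* dwords after the wrap */

	/* Plain memcpy is enough: the destination is normal memory. */
	if (chunk1)
		memcpy(&ring->ring[occupied], s, chunk1 * 4);
	if (chunk2)
		memcpy(ring->ring, s + chunk1 * 4, chunk2 * 4);

	ring->wptr += count_dw;
	ring->wptr &= ring->ptr_mask;
	ring->count_dw -= count_dw;
}
```

Nothing here touches an I/O mapping, so a removed device cannot turn these
stores into writes to a reassigned BAR; that only applies to the
memcpy_toio()/memcpy_fromio() call sites elsewhere in the patch.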

Maybe split that patch by use case so that we can more easily review/ack it.

Thanks,
Christian.

> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
> index e7d3d0dbdd96..c67bc6d3d039 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
> @@ -299,53 +299,12 @@ static inline void amdgpu_ring_set_preempt_cond_exec(struct amdgpu_ring *ring,
>   	*ring->cond_exe_cpu_addr = cond_exec;
>   }
>   
> -static inline void amdgpu_ring_clear_ring(struct amdgpu_ring *ring)
> -{
> -	int i = 0;
> -	while (i <= ring->buf_mask)
> -		ring->ring[i++] = ring->funcs->nop;
> -
> -}
> -
> -static inline void amdgpu_ring_write(struct amdgpu_ring *ring, uint32_t v)
> -{
> -	if (ring->count_dw <= 0)
> -		DRM_ERROR("amdgpu: writing more dwords to the ring than expected!\n");
> -	ring->ring[ring->wptr++ & ring->buf_mask] = v;
> -	ring->wptr &= ring->ptr_mask;
> -	ring->count_dw--;
> -}
> +void amdgpu_ring_clear_ring(struct amdgpu_ring *ring);
>   
> -static inline void amdgpu_ring_write_multiple(struct amdgpu_ring *ring,
> -					      void *src, int count_dw)
> -{
> -	unsigned occupied, chunk1, chunk2;
> -	void *dst;
> -
> -	if (unlikely(ring->count_dw < count_dw))
> -		DRM_ERROR("amdgpu: writing more dwords to the ring than expected!\n");
> -
> -	occupied = ring->wptr & ring->buf_mask;
> -	dst = (void *)&ring->ring[occupied];
> -	chunk1 = ring->buf_mask + 1 - occupied;
> -	chunk1 = (chunk1 >= count_dw) ? count_dw: chunk1;
> -	chunk2 = count_dw - chunk1;
> -	chunk1 <<= 2;
> -	chunk2 <<= 2;
> +void amdgpu_ring_write(struct amdgpu_ring *ring, uint32_t v);
>   
> -	if (chunk1)
> -		memcpy(dst, src, chunk1);
> -
> -	if (chunk2) {
> -		src += chunk1;
> -		dst = (void *)ring->ring;
> -		memcpy(dst, src, chunk2);
> -	}
> -
> -	ring->wptr += count_dw;
> -	ring->wptr &= ring->ptr_mask;
> -	ring->count_dw -= count_dw;
> -}
> +void amdgpu_ring_write_multiple(struct amdgpu_ring *ring,
> +					      void *src, int count_dw);
>   
>   int amdgpu_ring_test_helper(struct amdgpu_ring *ring);
>   
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
> index c6dbc0801604..82f0542c7792 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
> @@ -32,6 +32,7 @@
>   #include <linux/module.h>
>   
>   #include <drm/drm.h>
> +#include <drm/drm_drv.h>
>   
>   #include "amdgpu.h"
>   #include "amdgpu_pm.h"
> @@ -375,7 +376,7 @@ int amdgpu_uvd_suspend(struct amdgpu_device *adev)
>   {
>   	unsigned size;
>   	void *ptr;
> -	int i, j;
> +	int i, j, idx;
>   	bool in_ras_intr = amdgpu_ras_intr_triggered();
>   
>   	cancel_delayed_work_sync(&adev->uvd.idle_work);
> @@ -403,11 +404,15 @@ int amdgpu_uvd_suspend(struct amdgpu_device *adev)
>   		if (!adev->uvd.inst[j].saved_bo)
>   			return -ENOMEM;
>   
> -		/* re-write 0 since err_event_athub will corrupt VCPU buffer */
> -		if (in_ras_intr)
> -			memset(adev->uvd.inst[j].saved_bo, 0, size);
> -		else
> -			memcpy_fromio(adev->uvd.inst[j].saved_bo, ptr, size);
> +		if (drm_dev_enter(&adev->ddev, &idx)) {
> +			/* re-write 0 since err_event_athub will corrupt VCPU buffer */
> +			if (in_ras_intr)
> +				memset(adev->uvd.inst[j].saved_bo, 0, size);
> +			else
> +				memcpy_fromio(adev->uvd.inst[j].saved_bo, ptr, size);
> +
> +			drm_dev_exit(idx);
> +		}
>   	}
>   
>   	if (in_ras_intr)
> @@ -420,7 +425,7 @@ int amdgpu_uvd_resume(struct amdgpu_device *adev)
>   {
>   	unsigned size;
>   	void *ptr;
> -	int i;
> +	int i, idx;
>   
>   	for (i = 0; i < adev->uvd.num_uvd_inst; i++) {
>   		if (adev->uvd.harvest_config & (1 << i))
> @@ -432,7 +437,10 @@ int amdgpu_uvd_resume(struct amdgpu_device *adev)
>   		ptr = adev->uvd.inst[i].cpu_addr;
>   
>   		if (adev->uvd.inst[i].saved_bo != NULL) {
> -			memcpy_toio(ptr, adev->uvd.inst[i].saved_bo, size);
> +			if (drm_dev_enter(&adev->ddev, &idx)) {
> +				memcpy_toio(ptr, adev->uvd.inst[i].saved_bo, size);
> +				drm_dev_exit(idx);
> +			}
>   			kvfree(adev->uvd.inst[i].saved_bo);
>   			adev->uvd.inst[i].saved_bo = NULL;
>   		} else {
> @@ -442,8 +450,11 @@ int amdgpu_uvd_resume(struct amdgpu_device *adev)
>   			hdr = (const struct common_firmware_header *)adev->uvd.fw->data;
>   			if (adev->firmware.load_type != AMDGPU_FW_LOAD_PSP) {
>   				offset = le32_to_cpu(hdr->ucode_array_offset_bytes);
> -				memcpy_toio(adev->uvd.inst[i].cpu_addr, adev->uvd.fw->data + offset,
> -					    le32_to_cpu(hdr->ucode_size_bytes));
> +				if (drm_dev_enter(&adev->ddev, &idx)) {
> +					memcpy_toio(adev->uvd.inst[i].cpu_addr, adev->uvd.fw->data + offset,
> +						    le32_to_cpu(hdr->ucode_size_bytes));
> +					drm_dev_exit(idx);
> +				}
>   				size -= le32_to_cpu(hdr->ucode_size_bytes);
>   				ptr += le32_to_cpu(hdr->ucode_size_bytes);
>   			}
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
> index ea6a62f67e38..833203401ef4 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
> @@ -29,6 +29,7 @@
>   #include <linux/module.h>
>   
>   #include <drm/drm.h>
> +#include <drm/drm_drv.h>
>   
>   #include "amdgpu.h"
>   #include "amdgpu_pm.h"
> @@ -293,7 +294,7 @@ int amdgpu_vce_resume(struct amdgpu_device *adev)
>   	void *cpu_addr;
>   	const struct common_firmware_header *hdr;
>   	unsigned offset;
> -	int r;
> +	int r, idx;
>   
>   	if (adev->vce.vcpu_bo == NULL)
>   		return -EINVAL;
> @@ -313,8 +314,12 @@ int amdgpu_vce_resume(struct amdgpu_device *adev)
>   
>   	hdr = (const struct common_firmware_header *)adev->vce.fw->data;
>   	offset = le32_to_cpu(hdr->ucode_array_offset_bytes);
> -	memcpy_toio(cpu_addr, adev->vce.fw->data + offset,
> -		    adev->vce.fw->size - offset);
> +
> +	if (drm_dev_enter(&adev->ddev, &idx)) {
> +		memcpy_toio(cpu_addr, adev->vce.fw->data + offset,
> +			    adev->vce.fw->size - offset);
> +		drm_dev_exit(idx);
> +	}
>   
>   	amdgpu_bo_kunmap(adev->vce.vcpu_bo);
>   
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
> index 201645963ba5..21f7d3644d70 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
> @@ -27,6 +27,7 @@
>   #include <linux/firmware.h>
>   #include <linux/module.h>
>   #include <linux/pci.h>
> +#include <drm/drm_drv.h>
>   
>   #include "amdgpu.h"
>   #include "amdgpu_pm.h"
> @@ -275,7 +276,7 @@ int amdgpu_vcn_suspend(struct amdgpu_device *adev)
>   {
>   	unsigned size;
>   	void *ptr;
> -	int i;
> +	int i, idx;
>   
>   	cancel_delayed_work_sync(&adev->vcn.idle_work);
>   
> @@ -292,7 +293,10 @@ int amdgpu_vcn_suspend(struct amdgpu_device *adev)
>   		if (!adev->vcn.inst[i].saved_bo)
>   			return -ENOMEM;
>   
> -		memcpy_fromio(adev->vcn.inst[i].saved_bo, ptr, size);
> +		if (drm_dev_enter(&adev->ddev, &idx)) {
> +			memcpy_fromio(adev->vcn.inst[i].saved_bo, ptr, size);
> +			drm_dev_exit(idx);
> +		}
>   	}
>   	return 0;
>   }
> @@ -301,7 +305,7 @@ int amdgpu_vcn_resume(struct amdgpu_device *adev)
>   {
>   	unsigned size;
>   	void *ptr;
> -	int i;
> +	int i, idx;
>   
>   	for (i = 0; i < adev->vcn.num_vcn_inst; ++i) {
>   		if (adev->vcn.harvest_config & (1 << i))
> @@ -313,7 +317,10 @@ int amdgpu_vcn_resume(struct amdgpu_device *adev)
>   		ptr = adev->vcn.inst[i].cpu_addr;
>   
>   		if (adev->vcn.inst[i].saved_bo != NULL) {
> -			memcpy_toio(ptr, adev->vcn.inst[i].saved_bo, size);
> +			if (drm_dev_enter(&adev->ddev, &idx)) {
> +				memcpy_toio(ptr, adev->vcn.inst[i].saved_bo, size);
> +				drm_dev_exit(idx);
> +			}
>   			kvfree(adev->vcn.inst[i].saved_bo);
>   			adev->vcn.inst[i].saved_bo = NULL;
>   		} else {
> @@ -323,8 +330,11 @@ int amdgpu_vcn_resume(struct amdgpu_device *adev)
>   			hdr = (const struct common_firmware_header *)adev->vcn.fw->data;
>   			if (adev->firmware.load_type != AMDGPU_FW_LOAD_PSP) {
>   				offset = le32_to_cpu(hdr->ucode_array_offset_bytes);
> -				memcpy_toio(adev->vcn.inst[i].cpu_addr, adev->vcn.fw->data + offset,
> -					    le32_to_cpu(hdr->ucode_size_bytes));
> +				if (drm_dev_enter(&adev->ddev, &idx)) {
> +					memcpy_toio(adev->vcn.inst[i].cpu_addr, adev->vcn.fw->data + offset,
> +						    le32_to_cpu(hdr->ucode_size_bytes));
> +					drm_dev_exit(idx);
> +				}
>   				size -= le32_to_cpu(hdr->ucode_size_bytes);
>   				ptr += le32_to_cpu(hdr->ucode_size_bytes);
>   			}
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> index 9f868cf3b832..7dd5f10ab570 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> @@ -32,6 +32,7 @@
>   #include <linux/dma-buf.h>
>   
>   #include <drm/amdgpu_drm.h>
> +#include <drm/drm_drv.h>
>   #include "amdgpu.h"
>   #include "amdgpu_trace.h"
>   #include "amdgpu_amdkfd.h"
> @@ -1606,7 +1607,10 @@ static int amdgpu_vm_bo_update_mapping(struct amdgpu_device *adev,
>   	struct amdgpu_vm_update_params params;
>   	enum amdgpu_sync_mode sync_mode;
>   	uint64_t pfn;
> -	int r;
> +	int r, idx;
> +
> +	if (!drm_dev_enter(&adev->ddev, &idx))
> +		return -ENODEV;
>   
>   	memset(&params, 0, sizeof(params));
>   	params.adev = adev;
> @@ -1715,6 +1719,7 @@ static int amdgpu_vm_bo_update_mapping(struct amdgpu_device *adev,
>   
>   error_unlock:
>   	amdgpu_vm_eviction_unlock(vm);
> +	drm_dev_exit(idx);
>   	return r;
>   }
>   
> diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c b/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
> index 589410c32d09..2cec71e823f5 100644
> --- a/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
> @@ -23,6 +23,7 @@
>   #include <linux/firmware.h>
>   #include <linux/module.h>
>   #include <linux/vmalloc.h>
> +#include <drm/drm_drv.h>
>   
>   #include "amdgpu.h"
>   #include "amdgpu_psp.h"
> @@ -269,10 +270,8 @@ static int psp_v11_0_bootloader_load_kdb(struct psp_context *psp)
>   	if (ret)
>   		return ret;
>   
> -	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
> -
>   	/* Copy PSP KDB binary to memory */
> -	memcpy(psp->fw_pri_buf, psp->kdb_start_addr, psp->kdb_bin_size);
> +	psp_copy_fw(psp, psp->kdb_start_addr, psp->kdb_bin_size);
>   
>   	/* Provide the PSP KDB to bootloader */
>   	WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
> @@ -302,10 +301,8 @@ static int psp_v11_0_bootloader_load_spl(struct psp_context *psp)
>   	if (ret)
>   		return ret;
>   
> -	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
> -
>   	/* Copy PSP SPL binary to memory */
> -	memcpy(psp->fw_pri_buf, psp->spl_start_addr, psp->spl_bin_size);
> +	psp_copy_fw(psp, psp->spl_start_addr, psp->spl_bin_size);
>   
>   	/* Provide the PSP SPL to bootloader */
>   	WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
> @@ -335,10 +332,8 @@ static int psp_v11_0_bootloader_load_sysdrv(struct psp_context *psp)
>   	if (ret)
>   		return ret;
>   
> -	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
> -
>   	/* Copy PSP System Driver binary to memory */
> -	memcpy(psp->fw_pri_buf, psp->sys_start_addr, psp->sys_bin_size);
> +	psp_copy_fw(psp, psp->sys_start_addr, psp->sys_bin_size);
>   
>   	/* Provide the sys driver to bootloader */
>   	WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
> @@ -371,10 +366,8 @@ static int psp_v11_0_bootloader_load_sos(struct psp_context *psp)
>   	if (ret)
>   		return ret;
>   
> -	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
> -
>   	/* Copy Secure OS binary to PSP memory */
> -	memcpy(psp->fw_pri_buf, psp->sos_start_addr, psp->sos_bin_size);
> +	psp_copy_fw(psp, psp->sos_start_addr, psp->sos_bin_size);
>   
>   	/* Provide the PSP secure OS to bootloader */
>   	WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
> @@ -608,7 +601,7 @@ static int psp_v11_0_memory_training(struct psp_context *psp, uint32_t ops)
>   	uint32_t p2c_header[4];
>   	uint32_t sz;
>   	void *buf;
> -	int ret;
> +	int ret, idx;
>   
>   	if (ctx->init == PSP_MEM_TRAIN_NOT_SUPPORT) {
>   		DRM_DEBUG("Memory training is not supported.\n");
> @@ -681,17 +674,24 @@ static int psp_v11_0_memory_training(struct psp_context *psp, uint32_t ops)
>   			return -ENOMEM;
>   		}
>   
> -		memcpy_fromio(buf, adev->mman.aper_base_kaddr, sz);
> -		ret = psp_v11_0_memory_training_send_msg(psp, PSP_BL__DRAM_LONG_TRAIN);
> -		if (ret) {
> -			DRM_ERROR("Send long training msg failed.\n");
> +		if (drm_dev_enter(&adev->ddev, &idx)) {
> +			memcpy_fromio(buf, adev->mman.aper_base_kaddr, sz);
> +			ret = psp_v11_0_memory_training_send_msg(psp, PSP_BL__DRAM_LONG_TRAIN);
> +			if (ret) {
> +				DRM_ERROR("Send long training msg failed.\n");
> +				vfree(buf);
> +				drm_dev_exit(idx);
> +				return ret;
> +			}
> +
> +			memcpy_toio(adev->mman.aper_base_kaddr, buf, sz);
> +			adev->hdp.funcs->flush_hdp(adev, NULL);
>   			vfree(buf);
> -			return ret;
> +			drm_dev_exit(idx);
> +		} else {
> +			vfree(buf);
> +			return -ENODEV;
>   		}
> -
> -		memcpy_toio(adev->mman.aper_base_kaddr, buf, sz);
> -		adev->hdp.funcs->flush_hdp(adev, NULL);
> -		vfree(buf);
>   	}
>   
>   	if (ops & PSP_MEM_TRAIN_SAVE) {
> diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c b/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
> index c4828bd3264b..618e5b6b85d9 100644
> --- a/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
> @@ -138,10 +138,8 @@ static int psp_v12_0_bootloader_load_sysdrv(struct psp_context *psp)
>   	if (ret)
>   		return ret;
>   
> -	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
> -
>   	/* Copy PSP System Driver binary to memory */
> -	memcpy(psp->fw_pri_buf, psp->sys_start_addr, psp->sys_bin_size);
> +	psp_copy_fw(psp, psp->sys_start_addr, psp->sys_bin_size);
>   
>   	/* Provide the sys driver to bootloader */
>   	WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
> @@ -179,10 +177,8 @@ static int psp_v12_0_bootloader_load_sos(struct psp_context *psp)
>   	if (ret)
>   		return ret;
>   
> -	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
> -
>   	/* Copy Secure OS binary to PSP memory */
> -	memcpy(psp->fw_pri_buf, psp->sos_start_addr, psp->sos_bin_size);
> +	psp_copy_fw(psp, psp->sos_start_addr, psp->sos_bin_size);
>   
>   	/* Provide the PSP secure OS to bootloader */
>   	WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
> diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c b/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
> index f2e725f72d2f..d0a6cccd0897 100644
> --- a/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
> +++ b/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
> @@ -102,10 +102,8 @@ static int psp_v3_1_bootloader_load_sysdrv(struct psp_context *psp)
>   	if (ret)
>   		return ret;
>   
> -	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
> -
>   	/* Copy PSP System Driver binary to memory */
> -	memcpy(psp->fw_pri_buf, psp->sys_start_addr, psp->sys_bin_size);
> +	psp_copy_fw(psp, psp->sys_start_addr, psp->sys_bin_size);
>   
>   	/* Provide the sys driver to bootloader */
>   	WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
> @@ -143,10 +141,8 @@ static int psp_v3_1_bootloader_load_sos(struct psp_context *psp)
>   	if (ret)
>   		return ret;
>   
> -	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
> -
>   	/* Copy Secure OS binary to PSP memory */
> -	memcpy(psp->fw_pri_buf, psp->sos_start_addr, psp->sos_bin_size);
> +	psp_copy_fw(psp, psp->sos_start_addr, psp->sos_bin_size);
>   
>   	/* Provide the PSP secure OS to bootloader */
>   	WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
> diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
> index 8e238dea7bef..90910d19db12 100644
> --- a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
> @@ -25,6 +25,7 @@
>    */
>   
>   #include <linux/firmware.h>
> +#include <drm/drm_drv.h>
>   
>   #include "amdgpu.h"
>   #include "amdgpu_vce.h"
> @@ -555,16 +556,19 @@ static int vce_v4_0_hw_fini(void *handle)
>   static int vce_v4_0_suspend(void *handle)
>   {
>   	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
> -	int r;
> +	int r, idx;
>   
>   	if (adev->vce.vcpu_bo == NULL)
>   		return 0;
>   
> -	if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) {
> -		unsigned size = amdgpu_bo_size(adev->vce.vcpu_bo);
> -		void *ptr = adev->vce.cpu_addr;
> +	if (drm_dev_enter(&adev->ddev, &idx)) {
> +		if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) {
> +			unsigned size = amdgpu_bo_size(adev->vce.vcpu_bo);
> +			void *ptr = adev->vce.cpu_addr;
>   
> -		memcpy_fromio(adev->vce.saved_bo, ptr, size);
> +			memcpy_fromio(adev->vce.saved_bo, ptr, size);
> +		}
> +		drm_dev_exit(idx);
>   	}
>   
>   	r = vce_v4_0_hw_fini(adev);
> @@ -577,16 +581,20 @@ static int vce_v4_0_suspend(void *handle)
>   static int vce_v4_0_resume(void *handle)
>   {
>   	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
> -	int r;
> +	int r, idx;
>   
>   	if (adev->vce.vcpu_bo == NULL)
>   		return -EINVAL;
>   
>   	if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) {
> -		unsigned size = amdgpu_bo_size(adev->vce.vcpu_bo);
> -		void *ptr = adev->vce.cpu_addr;
>   
> -		memcpy_toio(ptr, adev->vce.saved_bo, size);
> +		if (drm_dev_enter(&adev->ddev, &idx)) {
> +			unsigned size = amdgpu_bo_size(adev->vce.vcpu_bo);
> +			void *ptr = adev->vce.cpu_addr;
> +
> +			memcpy_toio(ptr, adev->vce.saved_bo, size);
> +			drm_dev_exit(idx);
> +		}
>   	} else {
>   		r = amdgpu_vce_resume(adev);
>   		if (r)
> diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c b/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
> index 3f15bf34123a..df34be8ec82d 100644
> --- a/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
> @@ -34,6 +34,8 @@
>   #include "vcn/vcn_3_0_0_sh_mask.h"
>   #include "ivsrcid/vcn/irqsrcs_vcn_2_0.h"
>   
> +#include <drm/drm_drv.h>
> +
>   #define mmUVD_CONTEXT_ID_INTERNAL_OFFSET			0x27
>   #define mmUVD_GPCOM_VCPU_CMD_INTERNAL_OFFSET			0x0f
>   #define mmUVD_GPCOM_VCPU_DATA0_INTERNAL_OFFSET			0x10
> @@ -268,16 +270,20 @@ static int vcn_v3_0_sw_init(void *handle)
>   static int vcn_v3_0_sw_fini(void *handle)
>   {
>   	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
> -	int i, r;
> +	int i, r, idx;
>   
> -	for (i = 0; i < adev->vcn.num_vcn_inst; i++) {
> -		volatile struct amdgpu_fw_shared *fw_shared;
> +	if (drm_dev_enter(&adev->ddev, &idx)) {
> +		for (i = 0; i < adev->vcn.num_vcn_inst; i++) {
> +			volatile struct amdgpu_fw_shared *fw_shared;
>   
> -		if (adev->vcn.harvest_config & (1 << i))
> -			continue;
> -		fw_shared = adev->vcn.inst[i].fw_shared_cpu_addr;
> -		fw_shared->present_flag_0 = 0;
> -		fw_shared->sw_ring.is_enabled = false;
> +			if (adev->vcn.harvest_config & (1 << i))
> +				continue;
> +			fw_shared = adev->vcn.inst[i].fw_shared_cpu_addr;
> +			fw_shared->present_flag_0 = 0;
> +			fw_shared->sw_ring.is_enabled = false;
> +		}
> +
> +		drm_dev_exit(idx);
>   	}
>   
>   	if (amdgpu_sriov_vf(adev))
> diff --git a/drivers/gpu/drm/amd/pm/powerplay/smumgr/smu7_smumgr.c b/drivers/gpu/drm/amd/pm/powerplay/smumgr/smu7_smumgr.c
> index aae25243eb10..d628b91846c9 100644
> --- a/drivers/gpu/drm/amd/pm/powerplay/smumgr/smu7_smumgr.c
> +++ b/drivers/gpu/drm/amd/pm/powerplay/smumgr/smu7_smumgr.c
> @@ -405,6 +405,8 @@ int smu7_request_smu_load_fw(struct pp_hwmgr *hwmgr)
>   				UCODE_ID_MEC_STORAGE, &toc->entry[toc->num_entries++]),
>   				"Failed to Get Firmware Entry.", r = -EINVAL; goto failed);
>   	}
> +
> +	/* AG TODO Can't call drm_dev_enter/exit because access adev->ddev here ... */
>   	memcpy_toio(smu_data->header_buffer.kaddr, smu_data->toc,
>   		    sizeof(struct SMU_DRAMData_TOC));
>   	smum_send_msg_to_smc_with_parameter(hwmgr,



* Re: [PATCH v6 10/16] drm/amdgpu: Guard against write accesses after device removal
@ 2021-05-11  6:50     ` Christian König
  0 siblings, 0 replies; 126+ messages in thread
From: Christian König @ 2021-05-11  6:50 UTC (permalink / raw)
  To: Andrey Grodzovsky, dri-devel, amd-gfx, linux-pci, daniel.vetter,
	Harry.Wentland
  Cc: Alexander.Deucher, gregkh, ppaalanen, helgaas, Felix.Kuehling

On 10.05.21 at 18:36, Andrey Grodzovsky wrote:
> This should prevent writing to memory or IO ranges possibly
> already allocated for other uses after our device is removed.
>
> v5:
> Protect more places where memcpy_toio/memcpy_fromio take place
> Protect IB submissions
>
> v6: Switch to !drm_dev_enter instead of scoping entire code
> with brackets.
>
> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c    | 11 ++-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c       |  9 +++
>   drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c        | 17 +++--
>   drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c       | 63 +++++++++++------
>   drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h       |  2 +
>   drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c      | 70 +++++++++++++++++++
>   drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h      | 49 ++-----------
>   drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c       | 31 +++++---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c       | 11 ++-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c       | 22 ++++--
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c        |  7 +-
>   drivers/gpu/drm/amd/amdgpu/psp_v11_0.c        | 44 ++++++------
>   drivers/gpu/drm/amd/amdgpu/psp_v12_0.c        |  8 +--
>   drivers/gpu/drm/amd/amdgpu/psp_v3_1.c         |  8 +--
>   drivers/gpu/drm/amd/amdgpu/vce_v4_0.c         | 26 ++++---
>   drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c         | 22 +++---
>   .../drm/amd/pm/powerplay/smumgr/smu7_smumgr.c |  2 +
>   17 files changed, 257 insertions(+), 145 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index a0bff4713672..94c415176cdc 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -71,6 +71,8 @@
>   #include <drm/task_barrier.h>
>   #include <linux/pm_runtime.h>
>   
> +#include <drm/drm_drv.h>
> +
>   MODULE_FIRMWARE("amdgpu/vega10_gpu_info.bin");
>   MODULE_FIRMWARE("amdgpu/vega12_gpu_info.bin");
>   MODULE_FIRMWARE("amdgpu/raven_gpu_info.bin");
> @@ -281,7 +283,10 @@ void amdgpu_device_vram_access(struct amdgpu_device *adev, loff_t pos,
>   	unsigned long flags;
>   	uint32_t hi = ~0;
>   	uint64_t last;
> +	int idx;
>   
> +	 if (!drm_dev_enter(&adev->ddev, &idx))
> +		 return;
>   
>   #ifdef CONFIG_64BIT
>   	last = min(pos + size, adev->gmc.visible_vram_size);
> @@ -299,8 +304,10 @@ void amdgpu_device_vram_access(struct amdgpu_device *adev, loff_t pos,
>   			memcpy_fromio(buf, addr, count);
>   		}
>   
> -		if (count == size)
> +		if (count == size) {
> +			drm_dev_exit(idx);
>   			return;
> +		}

Maybe use a goto instead, but that's really just a nitpick.
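
Something along these lines, as a minimal userspace sketch only. `mock_dev_enter`, `mock_dev_exit` and `mock_vram_read` are hypothetical stand-ins that mimic the shape of drm_dev_enter()/drm_dev_exit() and of the fast path in amdgpu_device_vram_access(); they are not the real functions, and the slow path is just a placeholder.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for drm_dev_enter()/drm_dev_exit(); not the DRM API. */
struct mock_dev {
	bool unplugged;
};

static int mock_active_sections; /* enter/exit balance, for checking only */

static bool mock_dev_enter(struct mock_dev *dev, int *idx)
{
	if (dev->unplugged)
		return false;
	*idx = mock_active_sections++;
	return true;
}

static void mock_dev_exit(int idx)
{
	(void)idx;
	mock_active_sections--;
}

/*
 * Read size bytes at pos. Bytes inside the visible window are copied
 * directly; the return value is how many bytes that fast path served.
 */
static size_t mock_vram_read(struct mock_dev *dev, const char *window,
			     size_t visible, size_t pos, char *buf, size_t size)
{
	size_t count = 0;
	int idx;

	if (!mock_dev_enter(dev, &idx))
		return 0;

	if (pos < visible) {
		count = visible - pos;
		if (count > size)
			count = size;
		memcpy(buf, window + pos, count);

		if (count == size)
			goto exit; /* all served, skip the slow path */
	}

	/* Slow path placeholder; the real code uses indexed register access. */
	memset(buf + count, 0, size - count);

exit:
	mock_dev_exit(idx); /* single exit keeps enter/exit paired */
	return count;
}
```

With one label, every path after a successful enter goes through the same exit call, so a later early return cannot forget it.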



>   
>   		pos += count;
>   		buf += count / 4;
> @@ -323,6 +330,8 @@ void amdgpu_device_vram_access(struct amdgpu_device *adev, loff_t pos,
>   			*buf++ = RREG32_NO_KIQ(mmMM_DATA);
>   	}
>   	spin_unlock_irqrestore(&adev->mmio_idx_lock, flags);
> +
> +	drm_dev_exit(idx);
>   }
>   
>   /*
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> index 4d32233cde92..04ba5eef1e88 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> @@ -31,6 +31,8 @@
>   #include "amdgpu_ras.h"
>   #include "amdgpu_xgmi.h"
>   
> +#include <drm/drm_drv.h>
> +
>   /**
>    * amdgpu_gmc_pdb0_alloc - allocate vram for pdb0
>    *
> @@ -151,6 +153,10 @@ int amdgpu_gmc_set_pte_pde(struct amdgpu_device *adev, void *cpu_pt_addr,
>   {
>   	void __iomem *ptr = (void *)cpu_pt_addr;
>   	uint64_t value;
> +	int idx;
> +
> +	if (!drm_dev_enter(&adev->ddev, &idx))
> +		return 0;
>   
>   	/*
>   	 * The following is for PTE only. GART does not have PDEs.
> @@ -158,6 +164,9 @@ int amdgpu_gmc_set_pte_pde(struct amdgpu_device *adev, void *cpu_pt_addr,
>   	value = addr & 0x0000FFFFFFFFF000ULL;
>   	value |= flags;
>   	writeq(value, ptr + (gpu_page_idx * 8));
> +
> +	drm_dev_exit(idx);
> +
>   	return 0;
>   }
>   
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
> index 148a3b481b12..62fcbd446c71 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
> @@ -30,6 +30,7 @@
>   #include <linux/slab.h>
>   
>   #include <drm/amdgpu_drm.h>
> +#include <drm/drm_drv.h>
>   
>   #include "amdgpu.h"
>   #include "atom.h"
> @@ -137,7 +138,7 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, unsigned num_ibs,
>   	bool secure;
>   
>   	unsigned i;
> -	int r = 0;
> +	int idx, r = 0;
>   	bool need_pipe_sync = false;
>   
>   	if (num_ibs == 0)
> @@ -169,13 +170,16 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, unsigned num_ibs,
>   		return -EINVAL;
>   	}
>   
> +	if (!drm_dev_enter(&adev->ddev, &idx))
> +		return -ENODEV;
> +
>   	alloc_size = ring->funcs->emit_frame_size + num_ibs *
>   		ring->funcs->emit_ib_size;
>   
>   	r = amdgpu_ring_alloc(ring, alloc_size);
>   	if (r) {
>   		dev_err(adev->dev, "scheduling IB failed (%d).\n", r);
> -		return r;
> +		goto exit;
>   	}
>   
>   	need_ctx_switch = ring->current_ctx != fence_ctx;
> @@ -205,7 +209,7 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, unsigned num_ibs,
>   		r = amdgpu_vm_flush(ring, job, need_pipe_sync);
>   		if (r) {
>   			amdgpu_ring_undo(ring);
> -			return r;
> +			goto exit;
>   		}
>   	}
>   
> @@ -286,7 +290,7 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, unsigned num_ibs,
>   		if (job && job->vmid)
>   			amdgpu_vmid_reset(adev, ring->funcs->vmhub, job->vmid);
>   		amdgpu_ring_undo(ring);
> -		return r;
> +		goto exit;
>   	}
>   
>   	if (ring->funcs->insert_end)
> @@ -304,7 +308,10 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, unsigned num_ibs,
>   		ring->funcs->emit_wave_limit(ring, false);
>   
>   	amdgpu_ring_commit(ring);
> -	return 0;
> +
> +exit:
> +	drm_dev_exit(idx);
> +	return r;
>   }
>   
>   /**
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
> index 9e769cf6095b..bb6afee61666 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
> @@ -25,6 +25,7 @@
>   
>   #include <linux/firmware.h>
>   #include <linux/dma-mapping.h>
> +#include <drm/drm_drv.h>
>   
>   #include "amdgpu.h"
>   #include "amdgpu_psp.h"
> @@ -39,6 +40,8 @@
>   #include "amdgpu_ras.h"
>   #include "amdgpu_securedisplay.h"
>   
> +#include <drm/drm_drv.h>
> +
>   static int psp_sysfs_init(struct amdgpu_device *adev);
>   static void psp_sysfs_fini(struct amdgpu_device *adev);
>   
> @@ -253,7 +256,7 @@ psp_cmd_submit_buf(struct psp_context *psp,
>   		   struct psp_gfx_cmd_resp *cmd, uint64_t fence_mc_addr)
>   {
>   	int ret;
> -	int index;
> +	int index, idx;
>   	int timeout = 20000;
>   	bool ras_intr = false;
>   	bool skip_unsupport = false;
> @@ -261,6 +264,9 @@ psp_cmd_submit_buf(struct psp_context *psp,
>   	if (psp->adev->in_pci_err_recovery)
>   		return 0;
>   
> +	if (!drm_dev_enter(&psp->adev->ddev, &idx))
> +		return 0;
> +
>   	mutex_lock(&psp->mutex);
>   
>   	memset(psp->cmd_buf_mem, 0, PSP_CMD_BUFFER_SIZE);
> @@ -271,8 +277,7 @@ psp_cmd_submit_buf(struct psp_context *psp,
>   	ret = psp_ring_cmd_submit(psp, psp->cmd_buf_mc_addr, fence_mc_addr, index);
>   	if (ret) {
>   		atomic_dec(&psp->fence_value);
> -		mutex_unlock(&psp->mutex);
> -		return ret;
> +		goto exit;
>   	}
>   
>   	amdgpu_asic_invalidate_hdp(psp->adev, NULL);
> @@ -312,8 +317,8 @@ psp_cmd_submit_buf(struct psp_context *psp,
>   			 psp->cmd_buf_mem->cmd_id,
>   			 psp->cmd_buf_mem->resp.status);
>   		if (!timeout) {
> -			mutex_unlock(&psp->mutex);
> -			return -EINVAL;
> +			ret = -EINVAL;
> +			goto exit;
>   		}
>   	}
>   
> @@ -321,8 +326,10 @@ psp_cmd_submit_buf(struct psp_context *psp,
>   		ucode->tmr_mc_addr_lo = psp->cmd_buf_mem->resp.fw_addr_lo;
>   		ucode->tmr_mc_addr_hi = psp->cmd_buf_mem->resp.fw_addr_hi;
>   	}
> -	mutex_unlock(&psp->mutex);
>   
> +exit:
> +	mutex_unlock(&psp->mutex);
> +	drm_dev_exit(idx);
>   	return ret;
>   }
>   
> @@ -359,8 +366,7 @@ static int psp_load_toc(struct psp_context *psp,
>   	if (!cmd)
>   		return -ENOMEM;
>   	/* Copy toc to psp firmware private buffer */
> -	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
> -	memcpy(psp->fw_pri_buf, psp->toc_start_addr, psp->toc_bin_size);
> +	psp_copy_fw(psp, psp->toc_start_addr, psp->toc_bin_size);
>   
>   	psp_prep_load_toc_cmd_buf(cmd, psp->fw_pri_mc_addr, psp->toc_bin_size);
>   
> @@ -625,8 +631,7 @@ static int psp_asd_load(struct psp_context *psp)
>   	if (!cmd)
>   		return -ENOMEM;
>   
> -	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
> -	memcpy(psp->fw_pri_buf, psp->asd_start_addr, psp->asd_ucode_size);
> +	psp_copy_fw(psp, psp->asd_start_addr, psp->asd_ucode_size);
>   
>   	psp_prep_asd_load_cmd_buf(cmd, psp->fw_pri_mc_addr,
>   				  psp->asd_ucode_size);
> @@ -781,8 +786,7 @@ static int psp_xgmi_load(struct psp_context *psp)
>   	if (!cmd)
>   		return -ENOMEM;
>   
> -	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
> -	memcpy(psp->fw_pri_buf, psp->ta_xgmi_start_addr, psp->ta_xgmi_ucode_size);
> +	psp_copy_fw(psp, psp->ta_xgmi_start_addr, psp->ta_xgmi_ucode_size);
>   
>   	psp_prep_ta_load_cmd_buf(cmd,
>   				 psp->fw_pri_mc_addr,
> @@ -1038,8 +1042,7 @@ static int psp_ras_load(struct psp_context *psp)
>   	if (!cmd)
>   		return -ENOMEM;
>   
> -	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
> -	memcpy(psp->fw_pri_buf, psp->ta_ras_start_addr, psp->ta_ras_ucode_size);
> +	psp_copy_fw(psp, psp->ta_ras_start_addr, psp->ta_ras_ucode_size);
>   
>   	psp_prep_ta_load_cmd_buf(cmd,
>   				 psp->fw_pri_mc_addr,
> @@ -1275,8 +1278,7 @@ static int psp_hdcp_load(struct psp_context *psp)
>   	if (!cmd)
>   		return -ENOMEM;
>   
> -	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
> -	memcpy(psp->fw_pri_buf, psp->ta_hdcp_start_addr,
> +	psp_copy_fw(psp, psp->ta_hdcp_start_addr,
>   	       psp->ta_hdcp_ucode_size);
>   
>   	psp_prep_ta_load_cmd_buf(cmd,
> @@ -1427,8 +1429,7 @@ static int psp_dtm_load(struct psp_context *psp)
>   	if (!cmd)
>   		return -ENOMEM;
>   
> -	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
> -	memcpy(psp->fw_pri_buf, psp->ta_dtm_start_addr, psp->ta_dtm_ucode_size);
> +	psp_copy_fw(psp, psp->ta_dtm_start_addr, psp->ta_dtm_ucode_size);
>   
>   	psp_prep_ta_load_cmd_buf(cmd,
>   				 psp->fw_pri_mc_addr,
> @@ -1573,8 +1574,7 @@ static int psp_rap_load(struct psp_context *psp)
>   	if (!cmd)
>   		return -ENOMEM;
>   
> -	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
> -	memcpy(psp->fw_pri_buf, psp->ta_rap_start_addr, psp->ta_rap_ucode_size);
> +	psp_copy_fw(psp, psp->ta_rap_start_addr, psp->ta_rap_ucode_size);
>   
>   	psp_prep_ta_load_cmd_buf(cmd,
>   				 psp->fw_pri_mc_addr,
> @@ -3022,7 +3022,7 @@ static ssize_t psp_usbc_pd_fw_sysfs_write(struct device *dev,
>   	struct amdgpu_device *adev = drm_to_adev(ddev);
>   	void *cpu_addr;
>   	dma_addr_t dma_addr;
> -	int ret;
> +	int ret, idx;
>   	char fw_name[100];
>   	const struct firmware *usbc_pd_fw;
>   
> @@ -3031,6 +3031,9 @@ static ssize_t psp_usbc_pd_fw_sysfs_write(struct device *dev,
>   		return -EBUSY;
>   	}
>   
> +	if (!drm_dev_enter(ddev, &idx))
> +		return -ENODEV;
> +
>   	snprintf(fw_name, sizeof(fw_name), "amdgpu/%s", buf);
>   	ret = request_firmware(&usbc_pd_fw, fw_name, adev->dev);
>   	if (ret)
> @@ -3062,16 +3065,30 @@ static ssize_t psp_usbc_pd_fw_sysfs_write(struct device *dev,
>   rel_buf:
>   	dma_free_coherent(adev->dev, usbc_pd_fw->size, cpu_addr, dma_addr);
>   	release_firmware(usbc_pd_fw);
> -
>   fail:
>   	if (ret) {
>   		DRM_ERROR("Failed to load USBC PD FW, err = %d", ret);
> -		return ret;
> +		count = ret;
>   	}
>   
> +	drm_dev_exit(idx);
>   	return count;
>   }
>   
> +void psp_copy_fw(struct psp_context *psp, uint8_t *start_addr, uint32_t bin_size)
> +{
> +	int idx;
> +
> +	if (!drm_dev_enter(&psp->adev->ddev, &idx))
> +		return;
> +
> +	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
> +	memcpy(psp->fw_pri_buf, start_addr, bin_size);
> +
> +	drm_dev_exit(idx);
> +}
> +
> +
>   static DEVICE_ATTR(usbc_pd_fw, S_IRUGO | S_IWUSR,
>   		   psp_usbc_pd_fw_sysfs_read,
>   		   psp_usbc_pd_fw_sysfs_write);
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
> index 46a5328e00e0..2bfdc278817f 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
> @@ -423,4 +423,6 @@ int psp_get_fw_attestation_records_addr(struct psp_context *psp,
>   
>   int psp_load_fw_list(struct psp_context *psp,
>   		     struct amdgpu_firmware_info **ucode_list, int ucode_count);
> +void psp_copy_fw(struct psp_context *psp, uint8_t *start_addr, uint32_t bin_size);
> +
>   #endif
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
> index 688624ebe421..e1985bc34436 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
> @@ -35,6 +35,8 @@
>   #include "amdgpu.h"
>   #include "atom.h"
>   
> +#include <drm/drm_drv.h>
> +
>   /*
>    * Rings
>    * Most engines on the GPU are fed via ring buffers.  Ring
> @@ -461,3 +463,71 @@ int amdgpu_ring_test_helper(struct amdgpu_ring *ring)
>   	ring->sched.ready = !r;
>   	return r;
>   }
> +
> +void amdgpu_ring_clear_ring(struct amdgpu_ring *ring)
> +{
> +	int idx;
> +	int i = 0;
> +
> +	if (!drm_dev_enter(&ring->adev->ddev, &idx))
> +		return;
> +
> +	while (i <= ring->buf_mask)
> +		ring->ring[i++] = ring->funcs->nop;
> +
> +	drm_dev_exit(idx);
> +
> +}
> +
> +void amdgpu_ring_write(struct amdgpu_ring *ring, uint32_t v)
> +{
> +	int idx;
> +
> +	if (!drm_dev_enter(&ring->adev->ddev, &idx))
> +		return;
> +
> +	if (ring->count_dw <= 0)
> +		DRM_ERROR("amdgpu: writing more dwords to the ring than expected!\n");
> +	ring->ring[ring->wptr++ & ring->buf_mask] = v;
> +	ring->wptr &= ring->ptr_mask;
> +	ring->count_dw--;
> +
> +	drm_dev_exit(idx);
> +}
> +
> +void amdgpu_ring_write_multiple(struct amdgpu_ring *ring,
> +					      void *src, int count_dw)
> +{
> +	unsigned occupied, chunk1, chunk2;
> +	void *dst;
> +	int idx;
> +
> +	if (!drm_dev_enter(&ring->adev->ddev, &idx))
> +		return;
> +
> +	if (unlikely(ring->count_dw < count_dw))
> +		DRM_ERROR("amdgpu: writing more dwords to the ring than expected!\n");
> +
> +	occupied = ring->wptr & ring->buf_mask;
> +	dst = (void *)&ring->ring[occupied];
> +	chunk1 = ring->buf_mask + 1 - occupied;
> +	chunk1 = (chunk1 >= count_dw) ? count_dw: chunk1;
> +	chunk2 = count_dw - chunk1;
> +	chunk1 <<= 2;
> +	chunk2 <<= 2;
> +
> +	if (chunk1)
> +		memcpy(dst, src, chunk1);
> +
> +	if (chunk2) {
> +		src += chunk1;
> +		dst = (void *)ring->ring;
> +		memcpy(dst, src, chunk2);
> +	}
> +
> +	ring->wptr += count_dw;
> +	ring->wptr &= ring->ptr_mask;
> +	ring->count_dw -= count_dw;
> +
> +	drm_dev_exit(idx);
> +}

The ring should never be in MMIO memory, so you can completely drop that 
as far as I can see.
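
To illustrate the point, here is a minimal userspace sketch of the wrap-around copy from the quoted amdgpu_ring_write_multiple(), assuming (as stated above) that the ring lives in ordinary memory. It is not driver code: struct toy_ring, toy_ring_init(), toy_ring_write_multiple() and the 8-dword ring size are hypothetical names and values chosen for illustration. Only the chunk1/chunk2 split mirrors the quoted function.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical userspace stand-in for the ring fields used in the quoted code. */
struct toy_ring {
	uint32_t ring[8];	/* backing store: plain memory, not MMIO */
	uint32_t buf_mask;	/* ring size in dwords minus one (power of two) */
	uint64_t wptr;		/* write pointer, in dwords */
	uint64_t ptr_mask;	/* wrap mask applied to wptr */
	int count_dw;		/* dwords still reserved for writing */
};

void toy_ring_init(struct toy_ring *r)
{
	memset(r, 0, sizeof(*r));
	r->buf_mask = 7;
	r->ptr_mask = 7;
	r->count_dw = 8;
}

/* Copy count_dw dwords into the ring, splitting the copy at the wrap point. */
void toy_ring_write_multiple(struct toy_ring *r, const uint32_t *src,
			     unsigned int count_dw)
{
	unsigned int occupied, chunk1, chunk2;

	assert(count_dw <= r->buf_mask + 1);

	occupied = r->wptr & r->buf_mask;
	chunk1 = r->buf_mask + 1 - occupied;	/* room left before the end */
	if (chunk1 > count_dw)
		chunk1 = count_dw;
	chunk2 = count_dw - chunk1;		/* remainder wraps to index 0 */

	if (chunk1)
		memcpy(&r->ring[occupied], src, chunk1 * sizeof(uint32_t));
	if (chunk2)
		memcpy(&r->ring[0], src + chunk1, chunk2 * sizeof(uint32_t));

	r->wptr = (r->wptr + count_dw) & r->ptr_mask;
	r->count_dw -= (int)count_dw;
}
```

Every store in this model is a plain memcpy() into an array, which is the case where a drm_dev_enter()/drm_dev_exit() section around the write adds nothing.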

Maybe split that patch by use case so that we can more easily review/ack it.

Thanks,
Christian.

> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
> index e7d3d0dbdd96..c67bc6d3d039 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
> @@ -299,53 +299,12 @@ static inline void amdgpu_ring_set_preempt_cond_exec(struct amdgpu_ring *ring,
>   	*ring->cond_exe_cpu_addr = cond_exec;
>   }
>   
> -static inline void amdgpu_ring_clear_ring(struct amdgpu_ring *ring)
> -{
> -	int i = 0;
> -	while (i <= ring->buf_mask)
> -		ring->ring[i++] = ring->funcs->nop;
> -
> -}
> -
> -static inline void amdgpu_ring_write(struct amdgpu_ring *ring, uint32_t v)
> -{
> -	if (ring->count_dw <= 0)
> -		DRM_ERROR("amdgpu: writing more dwords to the ring than expected!\n");
> -	ring->ring[ring->wptr++ & ring->buf_mask] = v;
> -	ring->wptr &= ring->ptr_mask;
> -	ring->count_dw--;
> -}
> +void amdgpu_ring_clear_ring(struct amdgpu_ring *ring);
>   
> -static inline void amdgpu_ring_write_multiple(struct amdgpu_ring *ring,
> -					      void *src, int count_dw)
> -{
> -	unsigned occupied, chunk1, chunk2;
> -	void *dst;
> -
> -	if (unlikely(ring->count_dw < count_dw))
> -		DRM_ERROR("amdgpu: writing more dwords to the ring than expected!\n");
> -
> -	occupied = ring->wptr & ring->buf_mask;
> -	dst = (void *)&ring->ring[occupied];
> -	chunk1 = ring->buf_mask + 1 - occupied;
> -	chunk1 = (chunk1 >= count_dw) ? count_dw: chunk1;
> -	chunk2 = count_dw - chunk1;
> -	chunk1 <<= 2;
> -	chunk2 <<= 2;
> +void amdgpu_ring_write(struct amdgpu_ring *ring, uint32_t v);
>   
> -	if (chunk1)
> -		memcpy(dst, src, chunk1);
> -
> -	if (chunk2) {
> -		src += chunk1;
> -		dst = (void *)ring->ring;
> -		memcpy(dst, src, chunk2);
> -	}
> -
> -	ring->wptr += count_dw;
> -	ring->wptr &= ring->ptr_mask;
> -	ring->count_dw -= count_dw;
> -}
> +void amdgpu_ring_write_multiple(struct amdgpu_ring *ring,
> +					      void *src, int count_dw);
>   
>   int amdgpu_ring_test_helper(struct amdgpu_ring *ring);
>   
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
> index c6dbc0801604..82f0542c7792 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
> @@ -32,6 +32,7 @@
>   #include <linux/module.h>
>   
>   #include <drm/drm.h>
> +#include <drm/drm_drv.h>
>   
>   #include "amdgpu.h"
>   #include "amdgpu_pm.h"
> @@ -375,7 +376,7 @@ int amdgpu_uvd_suspend(struct amdgpu_device *adev)
>   {
>   	unsigned size;
>   	void *ptr;
> -	int i, j;
> +	int i, j, idx;
>   	bool in_ras_intr = amdgpu_ras_intr_triggered();
>   
>   	cancel_delayed_work_sync(&adev->uvd.idle_work);
> @@ -403,11 +404,15 @@ int amdgpu_uvd_suspend(struct amdgpu_device *adev)
>   		if (!adev->uvd.inst[j].saved_bo)
>   			return -ENOMEM;
>   
> -		/* re-write 0 since err_event_athub will corrupt VCPU buffer */
> -		if (in_ras_intr)
> -			memset(adev->uvd.inst[j].saved_bo, 0, size);
> -		else
> -			memcpy_fromio(adev->uvd.inst[j].saved_bo, ptr, size);
> +		if (drm_dev_enter(&adev->ddev, &idx)) {
> +			/* re-write 0 since err_event_athub will corrupt VCPU buffer */
> +			if (in_ras_intr)
> +				memset(adev->uvd.inst[j].saved_bo, 0, size);
> +			else
> +				memcpy_fromio(adev->uvd.inst[j].saved_bo, ptr, size);
> +
> +			drm_dev_exit(idx);
> +		}
>   	}
>   
>   	if (in_ras_intr)
> @@ -420,7 +425,7 @@ int amdgpu_uvd_resume(struct amdgpu_device *adev)
>   {
>   	unsigned size;
>   	void *ptr;
> -	int i;
> +	int i, idx;
>   
>   	for (i = 0; i < adev->uvd.num_uvd_inst; i++) {
>   		if (adev->uvd.harvest_config & (1 << i))
> @@ -432,7 +437,10 @@ int amdgpu_uvd_resume(struct amdgpu_device *adev)
>   		ptr = adev->uvd.inst[i].cpu_addr;
>   
>   		if (adev->uvd.inst[i].saved_bo != NULL) {
> -			memcpy_toio(ptr, adev->uvd.inst[i].saved_bo, size);
> +			if (drm_dev_enter(&adev->ddev, &idx)) {
> +				memcpy_toio(ptr, adev->uvd.inst[i].saved_bo, size);
> +				drm_dev_exit(idx);
> +			}
>   			kvfree(adev->uvd.inst[i].saved_bo);
>   			adev->uvd.inst[i].saved_bo = NULL;
>   		} else {
> @@ -442,8 +450,11 @@ int amdgpu_uvd_resume(struct amdgpu_device *adev)
>   			hdr = (const struct common_firmware_header *)adev->uvd.fw->data;
>   			if (adev->firmware.load_type != AMDGPU_FW_LOAD_PSP) {
>   				offset = le32_to_cpu(hdr->ucode_array_offset_bytes);
> -				memcpy_toio(adev->uvd.inst[i].cpu_addr, adev->uvd.fw->data + offset,
> -					    le32_to_cpu(hdr->ucode_size_bytes));
> +				if (drm_dev_enter(&adev->ddev, &idx)) {
> +					memcpy_toio(adev->uvd.inst[i].cpu_addr, adev->uvd.fw->data + offset,
> +						    le32_to_cpu(hdr->ucode_size_bytes));
> +					drm_dev_exit(idx);
> +				}
>   				size -= le32_to_cpu(hdr->ucode_size_bytes);
>   				ptr += le32_to_cpu(hdr->ucode_size_bytes);
>   			}
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
> index ea6a62f67e38..833203401ef4 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
> @@ -29,6 +29,7 @@
>   #include <linux/module.h>
>   
>   #include <drm/drm.h>
> +#include <drm/drm_drv.h>
>   
>   #include "amdgpu.h"
>   #include "amdgpu_pm.h"
> @@ -293,7 +294,7 @@ int amdgpu_vce_resume(struct amdgpu_device *adev)
>   	void *cpu_addr;
>   	const struct common_firmware_header *hdr;
>   	unsigned offset;
> -	int r;
> +	int r, idx;
>   
>   	if (adev->vce.vcpu_bo == NULL)
>   		return -EINVAL;
> @@ -313,8 +314,12 @@ int amdgpu_vce_resume(struct amdgpu_device *adev)
>   
>   	hdr = (const struct common_firmware_header *)adev->vce.fw->data;
>   	offset = le32_to_cpu(hdr->ucode_array_offset_bytes);
> -	memcpy_toio(cpu_addr, adev->vce.fw->data + offset,
> -		    adev->vce.fw->size - offset);
> +
> +	if (drm_dev_enter(&adev->ddev, &idx)) {
> +		memcpy_toio(cpu_addr, adev->vce.fw->data + offset,
> +			    adev->vce.fw->size - offset);
> +		drm_dev_exit(idx);
> +	}
>   
>   	amdgpu_bo_kunmap(adev->vce.vcpu_bo);
>   
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
> index 201645963ba5..21f7d3644d70 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
> @@ -27,6 +27,7 @@
>   #include <linux/firmware.h>
>   #include <linux/module.h>
>   #include <linux/pci.h>
> +#include <drm/drm_drv.h>
>   
>   #include "amdgpu.h"
>   #include "amdgpu_pm.h"
> @@ -275,7 +276,7 @@ int amdgpu_vcn_suspend(struct amdgpu_device *adev)
>   {
>   	unsigned size;
>   	void *ptr;
> -	int i;
> +	int i, idx;
>   
>   	cancel_delayed_work_sync(&adev->vcn.idle_work);
>   
> @@ -292,7 +293,10 @@ int amdgpu_vcn_suspend(struct amdgpu_device *adev)
>   		if (!adev->vcn.inst[i].saved_bo)
>   			return -ENOMEM;
>   
> -		memcpy_fromio(adev->vcn.inst[i].saved_bo, ptr, size);
> +		if (drm_dev_enter(&adev->ddev, &idx)) {
> +			memcpy_fromio(adev->vcn.inst[i].saved_bo, ptr, size);
> +			drm_dev_exit(idx);
> +		}
>   	}
>   	return 0;
>   }
> @@ -301,7 +305,7 @@ int amdgpu_vcn_resume(struct amdgpu_device *adev)
>   {
>   	unsigned size;
>   	void *ptr;
> -	int i;
> +	int i, idx;
>   
>   	for (i = 0; i < adev->vcn.num_vcn_inst; ++i) {
>   		if (adev->vcn.harvest_config & (1 << i))
> @@ -313,7 +317,10 @@ int amdgpu_vcn_resume(struct amdgpu_device *adev)
>   		ptr = adev->vcn.inst[i].cpu_addr;
>   
>   		if (adev->vcn.inst[i].saved_bo != NULL) {
> -			memcpy_toio(ptr, adev->vcn.inst[i].saved_bo, size);
> +			if (drm_dev_enter(&adev->ddev, &idx)) {
> +				memcpy_toio(ptr, adev->vcn.inst[i].saved_bo, size);
> +				drm_dev_exit(idx);
> +			}
>   			kvfree(adev->vcn.inst[i].saved_bo);
>   			adev->vcn.inst[i].saved_bo = NULL;
>   		} else {
> @@ -323,8 +330,11 @@ int amdgpu_vcn_resume(struct amdgpu_device *adev)
>   			hdr = (const struct common_firmware_header *)adev->vcn.fw->data;
>   			if (adev->firmware.load_type != AMDGPU_FW_LOAD_PSP) {
>   				offset = le32_to_cpu(hdr->ucode_array_offset_bytes);
> -				memcpy_toio(adev->vcn.inst[i].cpu_addr, adev->vcn.fw->data + offset,
> -					    le32_to_cpu(hdr->ucode_size_bytes));
> +				if (drm_dev_enter(&adev->ddev, &idx)) {
> +					memcpy_toio(adev->vcn.inst[i].cpu_addr, adev->vcn.fw->data + offset,
> +						    le32_to_cpu(hdr->ucode_size_bytes));
> +					drm_dev_exit(idx);
> +				}
>   				size -= le32_to_cpu(hdr->ucode_size_bytes);
>   				ptr += le32_to_cpu(hdr->ucode_size_bytes);
>   			}
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> index 9f868cf3b832..7dd5f10ab570 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> @@ -32,6 +32,7 @@
>   #include <linux/dma-buf.h>
>   
>   #include <drm/amdgpu_drm.h>
> +#include <drm/drm_drv.h>
>   #include "amdgpu.h"
>   #include "amdgpu_trace.h"
>   #include "amdgpu_amdkfd.h"
> @@ -1606,7 +1607,10 @@ static int amdgpu_vm_bo_update_mapping(struct amdgpu_device *adev,
>   	struct amdgpu_vm_update_params params;
>   	enum amdgpu_sync_mode sync_mode;
>   	uint64_t pfn;
> -	int r;
> +	int r, idx;
> +
> +	if (!drm_dev_enter(&adev->ddev, &idx))
> +		return -ENODEV;
>   
>   	memset(&params, 0, sizeof(params));
>   	params.adev = adev;
> @@ -1715,6 +1719,7 @@ static int amdgpu_vm_bo_update_mapping(struct amdgpu_device *adev,
>   
>   error_unlock:
>   	amdgpu_vm_eviction_unlock(vm);
> +	drm_dev_exit(idx);
>   	return r;
>   }
>   
> diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c b/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
> index 589410c32d09..2cec71e823f5 100644
> --- a/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
> @@ -23,6 +23,7 @@
>   #include <linux/firmware.h>
>   #include <linux/module.h>
>   #include <linux/vmalloc.h>
> +#include <drm/drm_drv.h>
>   
>   #include "amdgpu.h"
>   #include "amdgpu_psp.h"
> @@ -269,10 +270,8 @@ static int psp_v11_0_bootloader_load_kdb(struct psp_context *psp)
>   	if (ret)
>   		return ret;
>   
> -	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
> -
>   	/* Copy PSP KDB binary to memory */
> -	memcpy(psp->fw_pri_buf, psp->kdb_start_addr, psp->kdb_bin_size);
> +	psp_copy_fw(psp, psp->kdb_start_addr, psp->kdb_bin_size);
>   
>   	/* Provide the PSP KDB to bootloader */
>   	WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
> @@ -302,10 +301,8 @@ static int psp_v11_0_bootloader_load_spl(struct psp_context *psp)
>   	if (ret)
>   		return ret;
>   
> -	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
> -
>   	/* Copy PSP SPL binary to memory */
> -	memcpy(psp->fw_pri_buf, psp->spl_start_addr, psp->spl_bin_size);
> +	psp_copy_fw(psp, psp->spl_start_addr, psp->spl_bin_size);
>   
>   	/* Provide the PSP SPL to bootloader */
>   	WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
> @@ -335,10 +332,8 @@ static int psp_v11_0_bootloader_load_sysdrv(struct psp_context *psp)
>   	if (ret)
>   		return ret;
>   
> -	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
> -
>   	/* Copy PSP System Driver binary to memory */
> -	memcpy(psp->fw_pri_buf, psp->sys_start_addr, psp->sys_bin_size);
> +	psp_copy_fw(psp, psp->sys_start_addr, psp->sys_bin_size);
>   
>   	/* Provide the sys driver to bootloader */
>   	WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
> @@ -371,10 +366,8 @@ static int psp_v11_0_bootloader_load_sos(struct psp_context *psp)
>   	if (ret)
>   		return ret;
>   
> -	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
> -
>   	/* Copy Secure OS binary to PSP memory */
> -	memcpy(psp->fw_pri_buf, psp->sos_start_addr, psp->sos_bin_size);
> +	psp_copy_fw(psp, psp->sos_start_addr, psp->sos_bin_size);
>   
>   	/* Provide the PSP secure OS to bootloader */
>   	WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
> @@ -608,7 +601,7 @@ static int psp_v11_0_memory_training(struct psp_context *psp, uint32_t ops)
>   	uint32_t p2c_header[4];
>   	uint32_t sz;
>   	void *buf;
> -	int ret;
> +	int ret, idx;
>   
>   	if (ctx->init == PSP_MEM_TRAIN_NOT_SUPPORT) {
>   		DRM_DEBUG("Memory training is not supported.\n");
> @@ -681,17 +674,24 @@ static int psp_v11_0_memory_training(struct psp_context *psp, uint32_t ops)
>   			return -ENOMEM;
>   		}
>   
> -		memcpy_fromio(buf, adev->mman.aper_base_kaddr, sz);
> -		ret = psp_v11_0_memory_training_send_msg(psp, PSP_BL__DRAM_LONG_TRAIN);
> -		if (ret) {
> -			DRM_ERROR("Send long training msg failed.\n");
> +		if (drm_dev_enter(&adev->ddev, &idx)) {
> +			memcpy_fromio(buf, adev->mman.aper_base_kaddr, sz);
> +			ret = psp_v11_0_memory_training_send_msg(psp, PSP_BL__DRAM_LONG_TRAIN);
> +			if (ret) {
> +				DRM_ERROR("Send long training msg failed.\n");
> +				vfree(buf);
> +				drm_dev_exit(idx);
> +				return ret;
> +			}
> +
> +			memcpy_toio(adev->mman.aper_base_kaddr, buf, sz);
> +			adev->hdp.funcs->flush_hdp(adev, NULL);
>   			vfree(buf);
> -			return ret;
> +			drm_dev_exit(idx);
> +		} else {
> +			vfree(buf);
> +			return -ENODEV;
>   		}
> -
> -		memcpy_toio(adev->mman.aper_base_kaddr, buf, sz);
> -		adev->hdp.funcs->flush_hdp(adev, NULL);
> -		vfree(buf);
>   	}
>   
>   	if (ops & PSP_MEM_TRAIN_SAVE) {
> diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c b/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
> index c4828bd3264b..618e5b6b85d9 100644
> --- a/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
> @@ -138,10 +138,8 @@ static int psp_v12_0_bootloader_load_sysdrv(struct psp_context *psp)
>   	if (ret)
>   		return ret;
>   
> -	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
> -
>   	/* Copy PSP System Driver binary to memory */
> -	memcpy(psp->fw_pri_buf, psp->sys_start_addr, psp->sys_bin_size);
> +	psp_copy_fw(psp, psp->sys_start_addr, psp->sys_bin_size);
>   
>   	/* Provide the sys driver to bootloader */
>   	WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
> @@ -179,10 +177,8 @@ static int psp_v12_0_bootloader_load_sos(struct psp_context *psp)
>   	if (ret)
>   		return ret;
>   
> -	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
> -
>   	/* Copy Secure OS binary to PSP memory */
> -	memcpy(psp->fw_pri_buf, psp->sos_start_addr, psp->sos_bin_size);
> +	psp_copy_fw(psp, psp->sos_start_addr, psp->sos_bin_size);
>   
>   	/* Provide the PSP secure OS to bootloader */
>   	WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
> diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c b/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
> index f2e725f72d2f..d0a6cccd0897 100644
> --- a/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
> +++ b/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
> @@ -102,10 +102,8 @@ static int psp_v3_1_bootloader_load_sysdrv(struct psp_context *psp)
>   	if (ret)
>   		return ret;
>   
> -	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
> -
>   	/* Copy PSP System Driver binary to memory */
> -	memcpy(psp->fw_pri_buf, psp->sys_start_addr, psp->sys_bin_size);
> +	psp_copy_fw(psp, psp->sys_start_addr, psp->sys_bin_size);
>   
>   	/* Provide the sys driver to bootloader */
>   	WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
> @@ -143,10 +141,8 @@ static int psp_v3_1_bootloader_load_sos(struct psp_context *psp)
>   	if (ret)
>   		return ret;
>   
> -	memset(psp->fw_pri_buf, 0, PSP_1_MEG);
> -
>   	/* Copy Secure OS binary to PSP memory */
> -	memcpy(psp->fw_pri_buf, psp->sos_start_addr, psp->sos_bin_size);
> +	psp_copy_fw(psp, psp->sos_start_addr, psp->sos_bin_size);
>   
>   	/* Provide the PSP secure OS to bootloader */
>   	WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
> diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
> index 8e238dea7bef..90910d19db12 100644
> --- a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
> @@ -25,6 +25,7 @@
>    */
>   
>   #include <linux/firmware.h>
> +#include <drm/drm_drv.h>
>   
>   #include "amdgpu.h"
>   #include "amdgpu_vce.h"
> @@ -555,16 +556,19 @@ static int vce_v4_0_hw_fini(void *handle)
>   static int vce_v4_0_suspend(void *handle)
>   {
>   	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
> -	int r;
> +	int r, idx;
>   
>   	if (adev->vce.vcpu_bo == NULL)
>   		return 0;
>   
> -	if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) {
> -		unsigned size = amdgpu_bo_size(adev->vce.vcpu_bo);
> -		void *ptr = adev->vce.cpu_addr;
> +	if (drm_dev_enter(&adev->ddev, &idx)) {
> +		if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) {
> +			unsigned size = amdgpu_bo_size(adev->vce.vcpu_bo);
> +			void *ptr = adev->vce.cpu_addr;
>   
> -		memcpy_fromio(adev->vce.saved_bo, ptr, size);
> +			memcpy_fromio(adev->vce.saved_bo, ptr, size);
> +		}
> +		drm_dev_exit(idx);
>   	}
>   
>   	r = vce_v4_0_hw_fini(adev);
> @@ -577,16 +581,20 @@ static int vce_v4_0_suspend(void *handle)
>   static int vce_v4_0_resume(void *handle)
>   {
>   	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
> -	int r;
> +	int r, idx;
>   
>   	if (adev->vce.vcpu_bo == NULL)
>   		return -EINVAL;
>   
>   	if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) {
> -		unsigned size = amdgpu_bo_size(adev->vce.vcpu_bo);
> -		void *ptr = adev->vce.cpu_addr;
>   
> -		memcpy_toio(ptr, adev->vce.saved_bo, size);
> +		if (drm_dev_enter(&adev->ddev, &idx)) {
> +			unsigned size = amdgpu_bo_size(adev->vce.vcpu_bo);
> +			void *ptr = adev->vce.cpu_addr;
> +
> +			memcpy_toio(ptr, adev->vce.saved_bo, size);
> +			drm_dev_exit(idx);
> +		}
>   	} else {
>   		r = amdgpu_vce_resume(adev);
>   		if (r)
> diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c b/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
> index 3f15bf34123a..df34be8ec82d 100644
> --- a/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
> @@ -34,6 +34,8 @@
>   #include "vcn/vcn_3_0_0_sh_mask.h"
>   #include "ivsrcid/vcn/irqsrcs_vcn_2_0.h"
>   
> +#include <drm/drm_drv.h>
> +
>   #define mmUVD_CONTEXT_ID_INTERNAL_OFFSET			0x27
>   #define mmUVD_GPCOM_VCPU_CMD_INTERNAL_OFFSET			0x0f
>   #define mmUVD_GPCOM_VCPU_DATA0_INTERNAL_OFFSET			0x10
> @@ -268,16 +270,20 @@ static int vcn_v3_0_sw_init(void *handle)
>   static int vcn_v3_0_sw_fini(void *handle)
>   {
>   	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
> -	int i, r;
> +	int i, r, idx;
>   
> -	for (i = 0; i < adev->vcn.num_vcn_inst; i++) {
> -		volatile struct amdgpu_fw_shared *fw_shared;
> +	if (drm_dev_enter(&adev->ddev, &idx)) {
> +		for (i = 0; i < adev->vcn.num_vcn_inst; i++) {
> +			volatile struct amdgpu_fw_shared *fw_shared;
>   
> -		if (adev->vcn.harvest_config & (1 << i))
> -			continue;
> -		fw_shared = adev->vcn.inst[i].fw_shared_cpu_addr;
> -		fw_shared->present_flag_0 = 0;
> -		fw_shared->sw_ring.is_enabled = false;
> +			if (adev->vcn.harvest_config & (1 << i))
> +				continue;
> +			fw_shared = adev->vcn.inst[i].fw_shared_cpu_addr;
> +			fw_shared->present_flag_0 = 0;
> +			fw_shared->sw_ring.is_enabled = false;
> +		}
> +
> +		drm_dev_exit(idx);
>   	}
>   
>   	if (amdgpu_sriov_vf(adev))
> diff --git a/drivers/gpu/drm/amd/pm/powerplay/smumgr/smu7_smumgr.c b/drivers/gpu/drm/amd/pm/powerplay/smumgr/smu7_smumgr.c
> index aae25243eb10..d628b91846c9 100644
> --- a/drivers/gpu/drm/amd/pm/powerplay/smumgr/smu7_smumgr.c
> +++ b/drivers/gpu/drm/amd/pm/powerplay/smumgr/smu7_smumgr.c
> @@ -405,6 +405,8 @@ int smu7_request_smu_load_fw(struct pp_hwmgr *hwmgr)
>   				UCODE_ID_MEC_STORAGE, &toc->entry[toc->num_entries++]),
>   				"Failed to Get Firmware Entry.", r = -EINVAL; goto failed);
>   	}
> +
> +	/* AG TODO Can't call drm_dev_enter/exit because access adev->ddev here ... */
>   	memcpy_toio(smu_data->header_buffer.kaddr, smu_data->toc,
>   		    sizeof(struct SMU_DRAMData_TOC));
>   	smum_send_msg_to_smc_with_parameter(hwmgr,



* Re: [PATCH v6 11/16] drm/sched: Make timeout timer rearm conditional.
  2021-05-10 16:36   ` Andrey Grodzovsky
  (?)
@ 2021-05-11  6:52     ` Christian König
  -1 siblings, 0 replies; 126+ messages in thread
From: Christian König @ 2021-05-11  6:52 UTC (permalink / raw)
  To: Andrey Grodzovsky, dri-devel, amd-gfx, linux-pci, daniel.vetter,
	Harry.Wentland
  Cc: ppaalanen, Alexander.Deucher, gregkh, helgaas, Felix.Kuehling

On 10.05.21 at 18:36, Andrey Grodzovsky wrote:
> We don't want to rearm the timer if the driver hook reports
> that the device is gone.
>
> v5: Update drm_gpu_sched_stat values in code.
>
> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>

Reviewed-by: Christian König <christian.koenig@amd.com>

> ---
>   drivers/gpu/drm/scheduler/sched_main.c | 11 +++++++----
>   1 file changed, 7 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> index f4f474944169..8d1211e87101 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -314,6 +314,7 @@ static void drm_sched_job_timedout(struct work_struct *work)
>   {
>   	struct drm_gpu_scheduler *sched;
>   	struct drm_sched_job *job;
> +	enum drm_gpu_sched_stat status = DRM_GPU_SCHED_STAT_NOMINAL;
>   
>   	sched = container_of(work, struct drm_gpu_scheduler, work_tdr.work);
>   
> @@ -331,7 +332,7 @@ static void drm_sched_job_timedout(struct work_struct *work)
>   		list_del_init(&job->list);
>   		spin_unlock(&sched->job_list_lock);
>   
> -		job->sched->ops->timedout_job(job);
> +		status = job->sched->ops->timedout_job(job);
>   
>   		/*
>   		 * Guilty job did complete and hence needs to be manually removed
> @@ -345,9 +346,11 @@ static void drm_sched_job_timedout(struct work_struct *work)
>   		spin_unlock(&sched->job_list_lock);
>   	}
>   
> -	spin_lock(&sched->job_list_lock);
> -	drm_sched_start_timeout(sched);
> -	spin_unlock(&sched->job_list_lock);
> +	if (status != DRM_GPU_SCHED_STAT_ENODEV) {
> +		spin_lock(&sched->job_list_lock);
> +		drm_sched_start_timeout(sched);
> +		spin_unlock(&sched->job_list_lock);
> +	}
>   }
>   
>    /**
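
For readers following along, the conditional rearm in the quoted hunk can be modelled in userspace roughly as below. This is only a sketch under stated assumptions: enum toy_sched_stat, struct toy_sched, toy_sched_job_timedout() and the two toy hooks are hypothetical stand-ins, and the job list and its locking are left out entirely.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical mirror of the two status values used in the quoted patch. */
enum toy_sched_stat {
	TOY_SCHED_STAT_NOMINAL,
	TOY_SCHED_STAT_ENODEV,
};

struct toy_sched {
	/* driver hook, standing in for ops->timedout_job() */
	enum toy_sched_stat (*timedout_job)(void *job);
	void *pending_job;	/* NULL when no job is pending */
	int timer_armed;	/* 1 while the timeout timer is queued */
};

/* Model of the timeout handler: rearm only if the device is still there. */
void toy_sched_job_timedout(struct toy_sched *sched)
{
	enum toy_sched_stat status = TOY_SCHED_STAT_NOMINAL;

	assert(sched->timedout_job != NULL);
	sched->timer_armed = 0;		/* the timer has just fired */

	if (sched->pending_job)
		status = sched->timedout_job(sched->pending_job);

	/* Skip the rearm when the hook reported that the device is gone. */
	if (status != TOY_SCHED_STAT_ENODEV)
		sched->timer_armed = 1;
}

/* Toy hooks: one for a healthy device, one for an unplugged device. */
enum toy_sched_stat toy_hook_nominal(void *job)
{
	(void)job;
	return TOY_SCHED_STAT_NOMINAL;
}

enum toy_sched_stat toy_hook_gone(void *job)
{
	(void)job;
	return TOY_SCHED_STAT_ENODEV;
}
```

Because status starts out as the nominal value, the path with no pending job still rearms the timer, as it did before the patch.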


* Re: [PATCH v6 12/16] drm/amdgpu: Prevent any job recoveries after device is unplugged.
  2021-05-10 16:36   ` Andrey Grodzovsky
  (?)
@ 2021-05-11  6:53     ` Christian König
  -1 siblings, 0 replies; 126+ messages in thread
From: Christian König @ 2021-05-11  6:53 UTC (permalink / raw)
  To: Andrey Grodzovsky, dri-devel, amd-gfx, linux-pci, daniel.vetter,
	Harry.Wentland
  Cc: ppaalanen, Alexander.Deucher, gregkh, helgaas, Felix.Kuehling

On 10.05.21 at 18:36, Andrey Grodzovsky wrote:
> Return DRM_GPU_SCHED_STAT_ENODEV back to the scheduler when the device
> is not present so the timeout timer will not be rearmed.
>
> v5: Update to match updated return values in enum drm_gpu_sched_stat
>
> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>

Reviewed-by: Christian König <christian.koenig@amd.com>

> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 19 ++++++++++++++++---
>   1 file changed, 16 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> index 759b34799221..d33e6d97cc89 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> @@ -25,6 +25,8 @@
>   #include <linux/wait.h>
>   #include <linux/sched.h>
>   
> +#include <drm/drm_drv.h>
> +
>   #include "amdgpu.h"
>   #include "amdgpu_trace.h"
>   
> @@ -34,6 +36,15 @@ static enum drm_gpu_sched_stat amdgpu_job_timedout(struct drm_sched_job *s_job)
>   	struct amdgpu_job *job = to_amdgpu_job(s_job);
>   	struct amdgpu_task_info ti;
>   	struct amdgpu_device *adev = ring->adev;
> +	int idx;
> +
> +	if (!drm_dev_enter(&adev->ddev, &idx)) {
> +		DRM_INFO("%s - device unplugged skipping recovery on scheduler:%s",
> +			 __func__, s_job->sched->name);
> +
> +		/* Effectively the job is aborted as the device is gone */
> +		return DRM_GPU_SCHED_STAT_ENODEV;
> +	}
>   
>   	memset(&ti, 0, sizeof(struct amdgpu_task_info));
>   
> @@ -41,7 +52,7 @@ static enum drm_gpu_sched_stat amdgpu_job_timedout(struct drm_sched_job *s_job)
>   	    amdgpu_ring_soft_recovery(ring, job->vmid, s_job->s_fence->parent)) {
>   		DRM_ERROR("ring %s timeout, but soft recovered\n",
>   			  s_job->sched->name);
> -		return DRM_GPU_SCHED_STAT_NOMINAL;
> +		goto exit;
>   	}
>   
>   	amdgpu_vm_get_task_info(ring->adev, job->pasid, &ti);
> @@ -53,13 +64,15 @@ static enum drm_gpu_sched_stat amdgpu_job_timedout(struct drm_sched_job *s_job)
>   
>   	if (amdgpu_device_should_recover_gpu(ring->adev)) {
>   		amdgpu_device_gpu_recover(ring->adev, job);
> -		return DRM_GPU_SCHED_STAT_NOMINAL;
>   	} else {
>   		drm_sched_suspend_timeout(&ring->sched);
>   		if (amdgpu_sriov_vf(adev))
>   			adev->virt.tdr_debug = true;
> -		return DRM_GPU_SCHED_STAT_NOMINAL;
>   	}
> +
> +exit:
> +	drm_dev_exit(idx);
> +	return DRM_GPU_SCHED_STAT_NOMINAL;
>   }
>   
>   int amdgpu_job_alloc(struct amdgpu_device *adev, unsigned num_ibs,

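The control flow in the hunk above can be modelled in userspace. The following is only a minimal sketch: fake_dev, fake_dev_enter() and fake_dev_exit() are hypothetical stand-ins for the DRM helpers, not their real definitions. It shows why the early ENODEV return must not call the exit helper, while every later path funnels through a single exit label.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins; these are not the DRM definitions. */
enum sched_stat { STAT_NOMINAL, STAT_ENODEV };

struct fake_dev {
	bool unplugged;   /* set once by the remove path */
	int  in_section;  /* callers currently between enter and exit */
	int  recoveries;  /* number of full recovery attempts */
};

/* Models drm_dev_enter(): refuse entry once the device is unplugged. */
static bool fake_dev_enter(struct fake_dev *dev)
{
	if (dev->unplugged)
		return false;
	dev->in_section++;
	return true;
}

/* Models drm_dev_exit(): only valid after a successful enter. */
static void fake_dev_exit(struct fake_dev *dev)
{
	assert(dev->in_section > 0);
	dev->in_section--;
}

static enum sched_stat fake_job_timedout(struct fake_dev *dev,
					 bool soft_recovered)
{
	/* Enter failed, so there is nothing to exit on this path. */
	if (!fake_dev_enter(dev))
		return STAT_ENODEV;

	if (soft_recovered)
		goto out;

	dev->recoveries++;	/* stands in for the full GPU recovery */

out:
	fake_dev_exit(dev);
	return STAT_NOMINAL;
}
```

Every path that passed the enter check leaves through the same label, so the enter and exit calls stay balanced regardless of which branch ran.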

^ permalink raw reply	[flat|nested] 126+ messages in thread

* Re: [PATCH v6 13/16] drm/amdgpu: Fix hang on device removal.
  2021-05-10 16:36   ` Andrey Grodzovsky
  (?)
@ 2021-05-11  6:54     ` Christian König
  -1 siblings, 0 replies; 126+ messages in thread
From: Christian König @ 2021-05-11  6:54 UTC (permalink / raw)
  To: Andrey Grodzovsky, dri-devel, amd-gfx, linux-pci, daniel.vetter,
	Harry.Wentland
  Cc: ppaalanen, Alexander.Deucher, gregkh, helgaas, Felix.Kuehling



Am 10.05.21 um 18:36 schrieb Andrey Grodzovsky:
> If the device is removed while commands are in flight, you cannot wait
> to flush the HW fences on a ring since the device is gone.
>
> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 16 ++++++++++------
>   1 file changed, 10 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
> index 1ffb36bd0b19..fa03702ecbfb 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
> @@ -36,6 +36,7 @@
>   #include <linux/firmware.h>
>   #include <linux/pm_runtime.h>
>   
> +#include <drm/drm_drv.h>
>   #include "amdgpu.h"
>   #include "amdgpu_trace.h"
>   
> @@ -525,8 +526,7 @@ int amdgpu_fence_driver_init(struct amdgpu_device *adev)
>    */
>   void amdgpu_fence_driver_fini_hw(struct amdgpu_device *adev)
>   {
> -	unsigned i, j;
> -	int r;
> +	int i, r;

Is j not used here any more?

Christian.

>   
>   	for (i = 0; i < AMDGPU_MAX_RINGS; i++) {
>   		struct amdgpu_ring *ring = adev->rings[i];
> @@ -535,11 +535,15 @@ void amdgpu_fence_driver_fini_hw(struct amdgpu_device *adev)
>   			continue;
>   		if (!ring->no_scheduler)
>   			drm_sched_fini(&ring->sched);
> -		r = amdgpu_fence_wait_empty(ring);
> -		if (r) {
> -			/* no need to trigger GPU reset as we are unloading */
> +		/* You can't wait for HW to signal if it's gone */
> +		if (!drm_dev_is_unplugged(&adev->ddev))
> +			r = amdgpu_fence_wait_empty(ring);
> +		else
> +			r = -ENODEV;
> +		/* no need to trigger GPU reset as we are unloading */
> +		if (r)
>   			amdgpu_fence_driver_force_completion(ring);
> -		}
> +
>   		if (ring->fence_drv.irq_src)
>   			amdgpu_irq_put(adev, ring->fence_drv.irq_src,
>   				       ring->fence_drv.irq_type);

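The teardown decision in the second hunk can be sketched in userspace as follows. All names here (fake_ring, fake_wait_empty, fake_force_completion) are hypothetical models, and the fence state is reduced to two sequence counters. The point is only that an unplugged device skips the wait entirely and goes straight to forced completion.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical userspace model of one ring's fence state. */
struct fake_ring {
	unsigned int emitted;   /* last sequence number handed to hardware */
	unsigned int signaled;  /* last sequence number completed */
	bool hw_waited;         /* records whether we tried to wait on HW */
};

/* Stand-in for waiting on hardware: succeeds only if nothing is pending. */
static int fake_wait_empty(struct fake_ring *ring)
{
	ring->hw_waited = true;
	return ring->signaled == ring->emitted ? 0 : -ETIMEDOUT;
}

/* Stand-in for force completion: software marks every fence signaled. */
static void fake_force_completion(struct fake_ring *ring)
{
	ring->signaled = ring->emitted;
}

/* Teardown for one ring; returns the wait result for inspection. */
static int fake_ring_fini_hw(struct fake_ring *ring, bool unplugged)
{
	int r;

	/* Hardware that is gone can never signal, so do not wait for it. */
	if (!unplugged)
		r = fake_wait_empty(ring);
	else
		r = -ENODEV;

	/* Any failure is resolved in software; no reset is attempted. */
	if (r)
		fake_force_completion(ring);

	return r;
}
```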

^ permalink raw reply	[flat|nested] 126+ messages in thread

* Re: [PATCH v6 16/16] drm/amdgpu: Verify DMA operations from device are done
  2021-05-10 16:36   ` Andrey Grodzovsky
  (?)
@ 2021-05-11  6:56     ` Christian König
  -1 siblings, 0 replies; 126+ messages in thread
From: Christian König @ 2021-05-11  6:56 UTC (permalink / raw)
  To: Andrey Grodzovsky, dri-devel, amd-gfx, linux-pci, daniel.vetter,
	Harry.Wentland
  Cc: ppaalanen, Alexander.Deucher, gregkh, helgaas, Felix.Kuehling

Am 10.05.21 um 18:36 schrieb Andrey Grodzovsky:
> In case device removal is only simulated through sysfs, verify that
> the device doesn't keep doing DMA to the released memory after
> pci_remove is done.
>
> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>

Acked-by: Christian König <christian.koenig@amd.com>

> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 6 ++++++
>   1 file changed, 6 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> index 83006f45b10b..5e6af9e0b7bf 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> @@ -1314,7 +1314,13 @@ amdgpu_pci_remove(struct pci_dev *pdev)
>   	drm_dev_unplug(dev);
>   	amdgpu_driver_unload_kms(dev);
>   
> +	/*
> +	 * Flush any in flight DMA operations from device.
> +	 * Clear the Bus Master Enable bit and then wait on the PCIe Device
> +	 * Status Transactions Pending bit.
> +	 */
>   	pci_disable_device(pdev);
> +	pci_wait_for_pending_transaction(pdev);
>   }
>   
>   static void

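The two-step sequence described in the comment above (clear Bus Master Enable, then poll Transactions Pending) can be illustrated with a small userspace model. The struct and the fake_* helpers are hypothetical; only the two bit values follow the PCI Command and PCIe Device Status register layouts. The kernel helper's timing between polls is not modelled here, a simple poll counter stands in for it.

```c
#include <stdbool.h>
#include <stdint.h>

#define CMD_BUS_MASTER 0x0004u  /* Bus Master Enable, PCI Command register */
#define DEVSTA_TRPND   0x0020u  /* Transactions Pending, PCIe Device Status */

/* Hypothetical model of the two registers involved. */
struct fake_pci_dev {
	uint16_t command;
	uint16_t devsta;
	int      pending_polls;  /* polls left until in-flight requests drain */
};

/* Step 1: stop the device from issuing new DMA requests. */
static void fake_disable_bus_master(struct fake_pci_dev *pdev)
{
	pdev->command &= (uint16_t)~CMD_BUS_MASTER;
}

/* Simulated register read: the pending bit clears after enough polls. */
static uint16_t fake_read_devsta(struct fake_pci_dev *pdev)
{
	if (pdev->pending_polls > 0 && --pdev->pending_polls == 0)
		pdev->devsta &= (uint16_t)~DEVSTA_TRPND;
	return pdev->devsta;
}

/* Step 2: returns true once nothing is pending, false if we gave up. */
static bool fake_wait_for_pending(struct fake_pci_dev *pdev, int max_polls)
{
	for (int i = 0; i < max_polls; i++) {
		if (!(fake_read_devsta(pdev) & DEVSTA_TRPND))
			return true;
	}
	return false;
}
```

The order matters in this model as in the patch: waiting first would race with new requests the device could still issue while bus mastering is enabled.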

^ permalink raw reply	[flat|nested] 126+ messages in thread

* Re: [PATCH v6 04/16] drm/amdkfd: Split kfd suspend from devie exit
  2021-05-10 16:36   ` Andrey Grodzovsky
@ 2021-05-11 13:24     ` Deucher, Alexander
  -1 siblings, 0 replies; 126+ messages in thread
From: Deucher, Alexander @ 2021-05-11 13:24 UTC (permalink / raw)
  To: Grodzovsky, Andrey, dri-devel, amd-gfx, linux-pci,
	ckoenig.leichtzumerken, daniel.vetter, Wentland, Harry
  Cc: gregkh, helgaas, Kuehling, Felix

[-- Attachment #1: Type: text/plain, Size: 3280 bytes --]

[AMD Public Use]

Typo in the subject: devie > device

Alex
________________________________
From: Grodzovsky, Andrey <Andrey.Grodzovsky@amd.com>
Sent: Monday, May 10, 2021 12:36 PM
To: dri-devel@lists.freedesktop.org <dri-devel@lists.freedesktop.org>; amd-gfx@lists.freedesktop.org <amd-gfx@lists.freedesktop.org>; linux-pci@vger.kernel.org <linux-pci@vger.kernel.org>; ckoenig.leichtzumerken@gmail.com <ckoenig.leichtzumerken@gmail.com>; daniel.vetter@ffwll.ch <daniel.vetter@ffwll.ch>; Wentland, Harry <Harry.Wentland@amd.com>
Cc: ppaalanen@gmail.com <ppaalanen@gmail.com>; Deucher, Alexander <Alexander.Deucher@amd.com>; gregkh@linuxfoundation.org <gregkh@linuxfoundation.org>; helgaas@kernel.org <helgaas@kernel.org>; Kuehling, Felix <Felix.Kuehling@amd.com>; Grodzovsky, Andrey <Andrey.Grodzovsky@amd.com>
Subject: [PATCH v6 04/16] drm/amdkfd: Split kfd suspend from devie exit

Helps to expedite HW related stuff to amdgpu_pci_remove

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h | 2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_device.c    | 3 ++-
 3 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
index 5f6696a3c778..2b06dee9a0ce 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
@@ -170,7 +170,7 @@ void amdgpu_amdkfd_device_init(struct amdgpu_device *adev)
         }
 }

-void amdgpu_amdkfd_device_fini(struct amdgpu_device *adev)
+void amdgpu_amdkfd_device_fini_sw(struct amdgpu_device *adev)
 {
         if (adev->kfd.dev) {
                 kgd2kfd_device_exit(adev->kfd.dev);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
index 14f68c028126..f8e10af99c28 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
@@ -127,7 +127,7 @@ void amdgpu_amdkfd_interrupt(struct amdgpu_device *adev,
                         const void *ih_ring_entry);
 void amdgpu_amdkfd_device_probe(struct amdgpu_device *adev);
 void amdgpu_amdkfd_device_init(struct amdgpu_device *adev);
-void amdgpu_amdkfd_device_fini(struct amdgpu_device *adev);
+void amdgpu_amdkfd_device_fini_sw(struct amdgpu_device *adev);
 int amdgpu_amdkfd_submit_ib(struct kgd_dev *kgd, enum kgd_engine_type engine,
                                 uint32_t vmid, uint64_t gpu_addr,
                                 uint32_t *ib_cmd, uint32_t ib_len);
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
index 357b9bf62a1c..ab6d2a43c9a3 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
@@ -858,10 +858,11 @@ bool kgd2kfd_device_init(struct kfd_dev *kfd,
         return kfd->init_complete;
 }

+
+
 void kgd2kfd_device_exit(struct kfd_dev *kfd)
 {
         if (kfd->init_complete) {
-               kgd2kfd_suspend(kfd, false);
                 device_queue_manager_uninit(kfd->dqm);
                 kfd_interrupt_exit(kfd);
                 kfd_topology_remove_device(kfd);
--
2.25.1


[-- Attachment #2: Type: text/html, Size: 5781 bytes --]

^ permalink raw reply related	[flat|nested] 126+ messages in thread

* Re: [PATCH v6 01/16] drm/ttm: Remap all page faults to per process dummy page.
  2021-05-11  6:38     ` Christian König
  (?)
@ 2021-05-11 14:44       ` Andrey Grodzovsky
  -1 siblings, 0 replies; 126+ messages in thread
From: Andrey Grodzovsky @ 2021-05-11 14:44 UTC (permalink / raw)
  To: Christian König, dri-devel, amd-gfx, linux-pci,
	daniel.vetter, Harry.Wentland
  Cc: ppaalanen, Alexander.Deucher, gregkh, helgaas, Felix.Kuehling


On 2021-05-11 2:38 a.m., Christian König wrote:
> Am 10.05.21 um 18:36 schrieb Andrey Grodzovsky:
>> On device removal reroute all CPU mappings to dummy page.
>>
>> v3:
>> Remove loop to find DRM file and instead access it
>> by vma->vm_file->private_data. Move dummy page installation
>> into a separate function.
>>
>> v4:
>> Map the entire BO's VA space to an on-demand allocated dummy page
>> on the first fault for that BO.
>>
>> v5: Remove duplicate return.
>>
>> v6: Polish ttm_bo_vm_dummy_page, remove superfluous code.
>>
>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>> ---
>>   drivers/gpu/drm/ttm/ttm_bo_vm.c | 57 ++++++++++++++++++++++++++++++++-
>>   include/drm/ttm/ttm_bo_api.h    |  2 ++
>>   2 files changed, 58 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c 
>> b/drivers/gpu/drm/ttm/ttm_bo_vm.c
>> index b31b18058965..e5a9615519d1 100644
>> --- a/drivers/gpu/drm/ttm/ttm_bo_vm.c
>> +++ b/drivers/gpu/drm/ttm/ttm_bo_vm.c
>> @@ -34,6 +34,8 @@
>>   #include <drm/ttm/ttm_bo_driver.h>
>>   #include <drm/ttm/ttm_placement.h>
>>   #include <drm/drm_vma_manager.h>
>> +#include <drm/drm_drv.h>
>> +#include <drm/drm_managed.h>
>>   #include <linux/mm.h>
>>   #include <linux/pfn_t.h>
>>   #include <linux/rbtree.h>
>> @@ -380,19 +382,72 @@ vm_fault_t ttm_bo_vm_fault_reserved(struct 
>> vm_fault *vmf,
>>   }
>>   EXPORT_SYMBOL(ttm_bo_vm_fault_reserved);
>>   +static void ttm_bo_release_dummy_page(struct drm_device *dev, void 
>> *res)
>> +{
>> +    struct page *dummy_page = (struct page *)res;
>> +
>> +    __free_page(dummy_page);
>> +}
>> +
>> +vm_fault_t ttm_bo_vm_dummy_page(struct vm_fault *vmf, pgprot_t prot)
>> +{
>> +    struct vm_area_struct *vma = vmf->vma;
>> +    struct ttm_buffer_object *bo = vma->vm_private_data;
>> +    struct drm_device *ddev = bo->base.dev;
>> +    vm_fault_t ret = VM_FAULT_NOPAGE;
>> +    unsigned long address;
>> +    unsigned long pfn;
>> +    struct page *page;
>> +
>> +    /* Allocate new dummy page to map all the VA range in this VMA 
>> to it*/
>> +    page = alloc_page(GFP_KERNEL | __GFP_ZERO);
>> +    if (!page)
>> +        return VM_FAULT_OOM;
>> +
>> +    pfn = page_to_pfn(page);
>> +
>> +    /* Prefault the entire VMA range right away to avoid further 
>> faults */
>> +    for (address = vma->vm_start; address < vma->vm_end; address += 
>> PAGE_SIZE) {
>> +
>
>> +        if (unlikely(address >= vma->vm_end))
>> +            break;
>
> That extra check can be removed as far as I can see.
>
>
>> +
>> +        if (vma->vm_flags & VM_MIXEDMAP)
>> +            ret = vmf_insert_mixed_prot(vma, address,
>> +                            __pfn_to_pfn_t(pfn, PFN_DEV),
>> +                            prot);
>> +        else
>> +            ret = vmf_insert_pfn_prot(vma, address, pfn, prot);
>> +    }
>> +
>
>> +    /* Set the page to be freed using drmm release action */
>> +    if (drmm_add_action_or_reset(ddev, ttm_bo_release_dummy_page, 
>> page))
>> +        return VM_FAULT_OOM;
>
> You should probably move that before inserting the page into the VMA 
> and also free the allocated page if it goes wrong.


drmm_add_action_or_reset will automatically release the page if the add 
action fails; that's the 'reset' part of the function.

Andrey

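To illustrate the 'reset' semantics described in this reply, here is a rough userspace sketch. The fake_managed_dev type and the fake_* functions are hypothetical and only mimic the behaviour: when registering the release action fails, the action runs immediately, so the caller has nothing left to free. The fixed capacity of two and the release order are arbitrary choices of this sketch.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_ACTIONS 2  /* tiny capacity so the failure path is easy to hit */

typedef void (*release_fn)(void *data);

/* Hypothetical model of a device with managed release actions. */
struct fake_managed_dev {
	release_fn fn[MAX_ACTIONS];
	void *data[MAX_ACTIONS];
	size_t count;
};

/* Register an action; fails when the table is full. */
static int fake_add_action(struct fake_managed_dev *dev,
			   release_fn fn, void *data)
{
	if (dev->count == MAX_ACTIONS)
		return -ENOMEM;
	dev->fn[dev->count] = fn;
	dev->data[dev->count] = data;
	dev->count++;
	return 0;
}

/* Same, but on failure run the action right away (the "reset"). */
static int fake_add_action_or_reset(struct fake_managed_dev *dev,
				    release_fn fn, void *data)
{
	int ret = fake_add_action(dev, fn, data);

	if (ret)
		fn(data);
	return ret;
}

/* Device teardown: run every registered action. */
static void fake_release_all(struct fake_managed_dev *dev)
{
	while (dev->count > 0) {
		dev->count--;
		dev->fn[dev->count](dev->data[dev->count]);
	}
}

/* Example action: record that the resource was released. */
static void mark_released(void *data)
{
	*(bool *)data = true;
}
```

In this model a caller that gets an error back must not free the resource again, since the action already ran once.
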

>
> Apart from that patch looks good to me,
> Christian.
>
>> +
>> +    return ret;
>> +}
>> +EXPORT_SYMBOL(ttm_bo_vm_dummy_page);
>> +
>>   vm_fault_t ttm_bo_vm_fault(struct vm_fault *vmf)
>>   {
>>       struct vm_area_struct *vma = vmf->vma;
>>       pgprot_t prot;
>>       struct ttm_buffer_object *bo = vma->vm_private_data;
>> +    struct drm_device *ddev = bo->base.dev;
>>       vm_fault_t ret;
>> +    int idx;
>>         ret = ttm_bo_vm_reserve(bo, vmf);
>>       if (ret)
>>           return ret;
>>         prot = vma->vm_page_prot;
>> -    ret = ttm_bo_vm_fault_reserved(vmf, prot, 
>> TTM_BO_VM_NUM_PREFAULT, 1);
>> +    if (drm_dev_enter(ddev, &idx)) {
>> +        ret = ttm_bo_vm_fault_reserved(vmf, prot, 
>> TTM_BO_VM_NUM_PREFAULT, 1);
>> +        drm_dev_exit(idx);
>> +    } else {
>> +        ret = ttm_bo_vm_dummy_page(vmf, prot);
>> +    }
>>       if (ret == VM_FAULT_RETRY && !(vmf->flags & 
>> FAULT_FLAG_RETRY_NOWAIT))
>>           return ret;
>>   diff --git a/include/drm/ttm/ttm_bo_api.h 
>> b/include/drm/ttm/ttm_bo_api.h
>> index 639521880c29..254ede97f8e3 100644
>> --- a/include/drm/ttm/ttm_bo_api.h
>> +++ b/include/drm/ttm/ttm_bo_api.h
>> @@ -620,4 +620,6 @@ int ttm_bo_vm_access(struct vm_area_struct *vma, 
>> unsigned long addr,
>>                void *buf, int len, int write);
>>   bool ttm_bo_delayed_delete(struct ttm_device *bdev, bool remove_all);
>>   +vm_fault_t ttm_bo_vm_dummy_page(struct vm_fault *vmf, pgprot_t prot);
>> +
>>   #endif
>

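Christian's remark about the extra bounds check in the prefault loop can be seen in a small userspace model. This is a sketch only: fake_prefault_dummy() and its table argument are hypothetical, with a plain array standing in for the page-table insertions. Because the loop condition already guarantees address < end, a second check inside the body can never fire.

```c
#include <stddef.h>

#define FAKE_PAGE_SIZE 4096UL

/*
 * Map every page in [start, end) to the same dummy frame number.
 * table[] must have one slot per page in the range.
 * Returns the number of pages mapped.
 */
static size_t fake_prefault_dummy(unsigned long start, unsigned long end,
				  unsigned long dummy_pfn,
				  unsigned long *table)
{
	size_t n = 0;

	/* address < end holds for every iteration, so no inner check. */
	for (unsigned long address = start; address < end;
	     address += FAKE_PAGE_SIZE)
		table[n++] = dummy_pfn;

	return n;
}
```
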
^ permalink raw reply	[flat|nested] 126+ messages in thread

* Re: [PATCH v6 01/16] drm/ttm: Remap all page faults to per process dummy page.
@ 2021-05-11 14:44       ` Andrey Grodzovsky
  0 siblings, 0 replies; 126+ messages in thread
From: Andrey Grodzovsky @ 2021-05-11 14:44 UTC (permalink / raw)
  To: Christian König, dri-devel, amd-gfx, linux-pci,
	daniel.vetter, Harry.Wentland
  Cc: Alexander.Deucher, gregkh, helgaas, Felix.Kuehling


On 2021-05-11 2:38 a.m., Christian König wrote:
> Am 10.05.21 um 18:36 schrieb Andrey Grodzovsky:
>> On device removal reroute all CPU mappings to dummy page.
>>
>> v3:
>> Remove loop to find DRM file and instead access it
>> by vma->vm_file->private_data. Move dummy page installation
>> into a separate function.
>>
>> v4:
>> Map the entire BO's VA space to an on-demand allocated dummy page
>> on the first fault for that BO.
>>
>> v5: Remove duplicate return.
>>
>> v6: Polish ttm_bo_vm_dummy_page, remove superfluous code.
>>
>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>> ---
>>   drivers/gpu/drm/ttm/ttm_bo_vm.c | 57 ++++++++++++++++++++++++++++++++-
>>   include/drm/ttm/ttm_bo_api.h    |  2 ++
>>   2 files changed, 58 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c 
>> b/drivers/gpu/drm/ttm/ttm_bo_vm.c
>> index b31b18058965..e5a9615519d1 100644
>> --- a/drivers/gpu/drm/ttm/ttm_bo_vm.c
>> +++ b/drivers/gpu/drm/ttm/ttm_bo_vm.c
>> @@ -34,6 +34,8 @@
>>   #include <drm/ttm/ttm_bo_driver.h>
>>   #include <drm/ttm/ttm_placement.h>
>>   #include <drm/drm_vma_manager.h>
>> +#include <drm/drm_drv.h>
>> +#include <drm/drm_managed.h>
>>   #include <linux/mm.h>
>>   #include <linux/pfn_t.h>
>>   #include <linux/rbtree.h>
>> @@ -380,19 +382,72 @@ vm_fault_t ttm_bo_vm_fault_reserved(struct 
>> vm_fault *vmf,
>>   }
>>   EXPORT_SYMBOL(ttm_bo_vm_fault_reserved);
>>   +static void ttm_bo_release_dummy_page(struct drm_device *dev, void 
>> *res)
>> +{
>> +    struct page *dummy_page = (struct page *)res;
>> +
>> +    __free_page(dummy_page);
>> +}
>> +
>> +vm_fault_t ttm_bo_vm_dummy_page(struct vm_fault *vmf, pgprot_t prot)
>> +{
>> +    struct vm_area_struct *vma = vmf->vma;
>> +    struct ttm_buffer_object *bo = vma->vm_private_data;
>> +    struct drm_device *ddev = bo->base.dev;
>> +    vm_fault_t ret = VM_FAULT_NOPAGE;
>> +    unsigned long address;
>> +    unsigned long pfn;
>> +    struct page *page;
>> +
>> +    /* Allocate new dummy page to map all the VA range in this VMA 
>> to it*/
>> +    page = alloc_page(GFP_KERNEL | __GFP_ZERO);
>> +    if (!page)
>> +        return VM_FAULT_OOM;
>> +
>> +    pfn = page_to_pfn(page);
>> +
>> +    /* Prefault the entire VMA range right away to avoid further 
>> faults */
>> +    for (address = vma->vm_start; address < vma->vm_end; address += 
>> PAGE_SIZE) {
>> +
>
>> +        if (unlikely(address >= vma->vm_end))
>> +            break;
>
> That extra check can be removed as far as I can see.
>
>
>> +
>> +        if (vma->vm_flags & VM_MIXEDMAP)
>> +            ret = vmf_insert_mixed_prot(vma, address,
>> +                            __pfn_to_pfn_t(pfn, PFN_DEV),
>> +                            prot);
>> +        else
>> +            ret = vmf_insert_pfn_prot(vma, address, pfn, prot);
>> +    }
>> +
>
>> +    /* Set the page to be freed using drmm release action */
>> +    if (drmm_add_action_or_reset(ddev, ttm_bo_release_dummy_page, 
>> page))
>> +        return VM_FAULT_OOM;
>
> You should probably move that before inserting the page into the VMA 
> and also free the allocated page if it goes wrong.


drmm_add_action_or_reset will automatically release the page if adding 
the action fails; that's the 'reset' part of the function.
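
To illustrate the "or reset" behaviour, here is a minimal userspace sketch. It is only a model under stated assumptions: struct managed_dev, add_action_or_reset(), release_page() and the fail_next_add knob are hypothetical names invented for this example and are not the DRM API. The point is only that a failed registration runs the release callback immediately, so the caller must not free the resource a second time.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical userspace model; none of these names are the DRM API. */
struct managed_dev;
typedef void (*release_fn)(struct managed_dev *dev, void *res);

struct action {
    release_fn fn;
    void *res;
};

#define MAX_ACTIONS 4

struct managed_dev {
    struct action actions[MAX_ACTIONS];
    int nr_actions;
    int fail_next_add; /* test knob: force the next registration to fail */
    int released;      /* how many resources have been released so far */
};

/* Release callback: frees the "page" and counts the release. */
static void release_page(struct managed_dev *dev, void *res)
{
    free(res);
    dev->released++;
}

/*
 * Register fn(dev, res) to run at teardown. If registration fails,
 * run the callback right away and return nonzero: the "reset" part.
 */
static int add_action_or_reset(struct managed_dev *dev, release_fn fn, void *res)
{
    assert(fn != NULL);
    if (dev->fail_next_add || dev->nr_actions == MAX_ACTIONS) {
        dev->fail_next_add = 0;
        fn(dev, res);
        return -1;
    }
    dev->actions[dev->nr_actions].fn = fn;
    dev->actions[dev->nr_actions].res = res;
    dev->nr_actions++;
    return 0;
}

/* Teardown: run the registered actions in reverse order of registration. */
static void run_release_actions(struct managed_dev *dev)
{
    while (dev->nr_actions > 0) {
        struct action *a = &dev->actions[--dev->nr_actions];
        a->fn(dev, a->res);
    }
}
```

In this model a caller that sees a nonzero return only has to propagate the error; the resource is already gone at that point.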

Andrey


>
> Apart from that patch looks good to me,
> Christian.
>
>> +
>> +    return ret;
>> +}
>> +EXPORT_SYMBOL(ttm_bo_vm_dummy_page);
>> +
>>   vm_fault_t ttm_bo_vm_fault(struct vm_fault *vmf)
>>   {
>>       struct vm_area_struct *vma = vmf->vma;
>>       pgprot_t prot;
>>       struct ttm_buffer_object *bo = vma->vm_private_data;
>> +    struct drm_device *ddev = bo->base.dev;
>>       vm_fault_t ret;
>> +    int idx;
>>         ret = ttm_bo_vm_reserve(bo, vmf);
>>       if (ret)
>>           return ret;
>>         prot = vma->vm_page_prot;
>> -    ret = ttm_bo_vm_fault_reserved(vmf, prot, 
>> TTM_BO_VM_NUM_PREFAULT, 1);
>> +    if (drm_dev_enter(ddev, &idx)) {
>> +        ret = ttm_bo_vm_fault_reserved(vmf, prot, 
>> TTM_BO_VM_NUM_PREFAULT, 1);
>> +        drm_dev_exit(idx);
>> +    } else {
>> +        ret = ttm_bo_vm_dummy_page(vmf, prot);
>> +    }
>>       if (ret == VM_FAULT_RETRY && !(vmf->flags & 
>> FAULT_FLAG_RETRY_NOWAIT))
>>           return ret;
>>   diff --git a/include/drm/ttm/ttm_bo_api.h 
>> b/include/drm/ttm/ttm_bo_api.h
>> index 639521880c29..254ede97f8e3 100644
>> --- a/include/drm/ttm/ttm_bo_api.h
>> +++ b/include/drm/ttm/ttm_bo_api.h
>> @@ -620,4 +620,6 @@ int ttm_bo_vm_access(struct vm_area_struct *vma, 
>> unsigned long addr,
>>                void *buf, int len, int write);
>>   bool ttm_bo_delayed_delete(struct ttm_device *bdev, bool remove_all);
>>   +vm_fault_t ttm_bo_vm_dummy_page(struct vm_fault *vmf, pgprot_t prot);
>> +
>>   #endif
>

^ permalink raw reply	[flat|nested] 126+ messages in thread

* Re: [PATCH v6 04/16] drm/amdkfd: Split kfd suspend from devie exit
  2021-05-11  6:40     ` Christian König
  (?)
@ 2021-05-11 14:52       ` Andrey Grodzovsky
  -1 siblings, 0 replies; 126+ messages in thread
From: Andrey Grodzovsky @ 2021-05-11 14:52 UTC (permalink / raw)
  To: Christian König, dri-devel, amd-gfx, linux-pci,
	daniel.vetter, Harry.Wentland
  Cc: ppaalanen, Alexander.Deucher, gregkh, helgaas, Felix.Kuehling



On 2021-05-11 2:40 a.m., Christian König wrote:
> Am 10.05.21 um 18:36 schrieb Andrey Grodzovsky:
>> Helps to expedite HW related stuff to amdgpu_pci_remove
>>
>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 2 +-
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h | 2 +-
>>   drivers/gpu/drm/amd/amdkfd/kfd_device.c    | 3 ++-
>>   3 files changed, 4 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
>> index 5f6696a3c778..2b06dee9a0ce 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
>> @@ -170,7 +170,7 @@ void amdgpu_amdkfd_device_init(struct 
>> amdgpu_device *adev)
>>       }
>>   }
>> -void amdgpu_amdkfd_device_fini(struct amdgpu_device *adev)
>> +void amdgpu_amdkfd_device_fini_sw(struct amdgpu_device *adev)
>>   {
>>       if (adev->kfd.dev) {
>>           kgd2kfd_device_exit(adev->kfd.dev);
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
>> index 14f68c028126..f8e10af99c28 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
>> @@ -127,7 +127,7 @@ void amdgpu_amdkfd_interrupt(struct amdgpu_device 
>> *adev,
>>               const void *ih_ring_entry);
>>   void amdgpu_amdkfd_device_probe(struct amdgpu_device *adev);
>>   void amdgpu_amdkfd_device_init(struct amdgpu_device *adev);
>> -void amdgpu_amdkfd_device_fini(struct amdgpu_device *adev);
>> +void amdgpu_amdkfd_device_fini_sw(struct amdgpu_device *adev);
>>   int amdgpu_amdkfd_submit_ib(struct kgd_dev *kgd, enum 
>> kgd_engine_type engine,
>>                   uint32_t vmid, uint64_t gpu_addr,
>>                   uint32_t *ib_cmd, uint32_t ib_len);
>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c 
>> b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
>> index 357b9bf62a1c..ab6d2a43c9a3 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
>> @@ -858,10 +858,11 @@ bool kgd2kfd_device_init(struct kfd_dev *kfd,
>>       return kfd->init_complete;
>>   }
>> +
>> +
> 
> Looks like unnecessary white space change to me.
> 
>>   void kgd2kfd_device_exit(struct kfd_dev *kfd)
>>   {
>>       if (kfd->init_complete) {
>> -        kgd2kfd_suspend(kfd, false);
> 
> Where is the call to this function now?
> 
> Christian.

It moved to the patch 'drm/amdgpu: Add early fini callback', where it is called via
amdgpu_device_ip_fini_early->amdgpu_amdkfd_suspend->kgd2kfd_suspend
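
As a rough illustration of the split, here is a small userspace model of the intended ordering. The struct model_kfd type and the model_* functions are hypothetical stand-ins invented for this sketch; the real functions in the call chain above do considerably more. It only shows that the suspend step runs in the early phase and that the later software-only exit no longer suspends.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical model: the names echo the thread, the bodies are invented. */
struct model_kfd {
    int init_complete;
    int suspended;
    char log[64]; /* records the order in which the phases ran */
};

/* Stand-in for the suspend step that used to live in the exit path. */
static void model_suspend(struct model_kfd *kfd)
{
    if (!kfd->init_complete || kfd->suspended)
        return;
    kfd->suspended = 1;
    strcat(kfd->log, "suspend;");
}

/* Early phase: quiesce the hardware-facing side first. */
static void model_fini_early(struct model_kfd *kfd)
{
    model_suspend(kfd);
}

/* Late phase: tear down software state only; it no longer suspends. */
static void model_fini_sw(struct model_kfd *kfd)
{
    if (!kfd->init_complete)
        return;
    assert(kfd->suspended); /* the early phase must already have run */
    strcat(kfd->log, "exit;");
    kfd->init_complete = 0;
}
```

Calling model_fini_early() and then model_fini_sw() mirrors the order described above: suspend first, software exit afterwards.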

Andrey

> 
>>           device_queue_manager_uninit(kfd->dqm);
>>           kfd_interrupt_exit(kfd);
>>           kfd_topology_remove_device(kfd);
> 

^ permalink raw reply	[flat|nested] 126+ messages in thread

* Re: [PATCH v6 01/16] drm/ttm: Remap all page faults to per process dummy page.
  2021-05-11 14:44       ` Andrey Grodzovsky
  (?)
@ 2021-05-11 15:12         ` Christian König
  -1 siblings, 0 replies; 126+ messages in thread
From: Christian König @ 2021-05-11 15:12 UTC (permalink / raw)
  To: Andrey Grodzovsky, dri-devel, amd-gfx, linux-pci, daniel.vetter,
	Harry.Wentland
  Cc: ppaalanen, Alexander.Deucher, gregkh, helgaas, Felix.Kuehling



Am 11.05.21 um 16:44 schrieb Andrey Grodzovsky:
>
> On 2021-05-11 2:38 a.m., Christian König wrote:
>> Am 10.05.21 um 18:36 schrieb Andrey Grodzovsky:
>>> On device removal reroute all CPU mappings to dummy page.
>>>
>>> v3:
>>> Remove loop to find DRM file and instead access it
>>> by vma->vm_file->private_data. Move dummy page installation
>>> into a separate function.
>>>
>>> v4:
>>> Map the entire BOs VA space into on demand allocated dummy page
>>> on the first fault for that BO.
>>>
>>> v5: Remove duplicate return.
>>>
>>> v6: Polish ttm_bo_vm_dummy_page, remove superfluous code.
>>>
>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>>> ---
>>>   drivers/gpu/drm/ttm/ttm_bo_vm.c | 57 
>>> ++++++++++++++++++++++++++++++++-
>>>   include/drm/ttm/ttm_bo_api.h    |  2 ++
>>>   2 files changed, 58 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c 
>>> b/drivers/gpu/drm/ttm/ttm_bo_vm.c
>>> index b31b18058965..e5a9615519d1 100644
>>> --- a/drivers/gpu/drm/ttm/ttm_bo_vm.c
>>> +++ b/drivers/gpu/drm/ttm/ttm_bo_vm.c
>>> @@ -34,6 +34,8 @@
>>>   #include <drm/ttm/ttm_bo_driver.h>
>>>   #include <drm/ttm/ttm_placement.h>
>>>   #include <drm/drm_vma_manager.h>
>>> +#include <drm/drm_drv.h>
>>> +#include <drm/drm_managed.h>
>>>   #include <linux/mm.h>
>>>   #include <linux/pfn_t.h>
>>>   #include <linux/rbtree.h>
>>> @@ -380,19 +382,72 @@ vm_fault_t ttm_bo_vm_fault_reserved(struct 
>>> vm_fault *vmf,
>>>   }
>>>   EXPORT_SYMBOL(ttm_bo_vm_fault_reserved);
>>>   +static void ttm_bo_release_dummy_page(struct drm_device *dev, 
>>> void *res)
>>> +{
>>> +    struct page *dummy_page = (struct page *)res;
>>> +
>>> +    __free_page(dummy_page);
>>> +}
>>> +
>>> +vm_fault_t ttm_bo_vm_dummy_page(struct vm_fault *vmf, pgprot_t prot)
>>> +{
>>> +    struct vm_area_struct *vma = vmf->vma;
>>> +    struct ttm_buffer_object *bo = vma->vm_private_data;
>>> +    struct drm_device *ddev = bo->base.dev;
>>> +    vm_fault_t ret = VM_FAULT_NOPAGE;
>>> +    unsigned long address;
>>> +    unsigned long pfn;
>>> +    struct page *page;
>>> +
>>> +    /* Allocate new dummy page to map all the VA range in this VMA 
>>> to it*/
>>> +    page = alloc_page(GFP_KERNEL | __GFP_ZERO);
>>> +    if (!page)
>>> +        return VM_FAULT_OOM;
>>> +
>>> +    pfn = page_to_pfn(page);
>>> +
>>> +    /* Prefault the entire VMA range right away to avoid further 
>>> faults */
>>> +    for (address = vma->vm_start; address < vma->vm_end; address += 
>>> PAGE_SIZE) {
>>> +
>>
>>> +        if (unlikely(address >= vma->vm_end))
>>> +            break;
>>
>> That extra check can be removed as far as I can see.
>>
>>
>>> +
>>> +        if (vma->vm_flags & VM_MIXEDMAP)
>>> +            ret = vmf_insert_mixed_prot(vma, address,
>>> +                            __pfn_to_pfn_t(pfn, PFN_DEV),
>>> +                            prot);
>>> +        else
>>> +            ret = vmf_insert_pfn_prot(vma, address, pfn, prot);
>>> +    }
>>> +
>>
>>> +    /* Set the page to be freed using drmm release action */
>>> +    if (drmm_add_action_or_reset(ddev, ttm_bo_release_dummy_page, 
>>> page))
>>> +        return VM_FAULT_OOM;
>>
>> You should probably move that before inserting the page into the VMA 
>> and also free the allocated page if it goes wrong.
>
>
> drmm_add_action_or_reset will automatically release the page if adding 
> the action fails; that's the 'reset' part of the function.

Ah! OK, that makes it even more important to do this before you 
insert the page into any VMA.

Otherwise userspace keeps access to a freed page, with rather ugly 
consequences.
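
The ordering can be sketched with a small userspace model. Everything here is a hypothetical stand-in made up for illustration (struct fake_dev, struct fake_vma, add_release_or_reset(), map_dummy_page()); it is not the TTM or DRM code. The release action is registered before any slot of the fake VMA points at the page, so a failed registration, which already frees the page, leaves no mapping behind.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace model; these are not the TTM/DRM types. */
enum fault_result { FAULT_NOPAGE, FAULT_OOM };

#define NR_SLOTS 8
#define MAX_MANAGED 4

struct fake_vma {
    void *slot[NR_SLOTS]; /* stand-in for the page table entries */
};

struct fake_dev {
    void *managed[MAX_MANAGED]; /* pages to free at device release */
    size_t nr_managed;
    bool fail_add;              /* test knob: make registration fail */
};

/* Track the page for later release, or free it right away on failure. */
static int add_release_or_reset(struct fake_dev *dev, void *page)
{
    if (dev->fail_add || dev->nr_managed == MAX_MANAGED) {
        free(page);
        return -1;
    }
    dev->managed[dev->nr_managed++] = page;
    return 0;
}

static enum fault_result map_dummy_page(struct fake_dev *dev,
                                        struct fake_vma *vma)
{
    void *page = calloc(1, 4096);

    if (!page)
        return FAULT_OOM;

    /*
     * Register the release action first. If this fails the page is
     * already freed, and nothing in the VMA refers to it yet.
     */
    if (add_release_or_reset(dev, page))
        return FAULT_OOM;

    /* Only now point every slot of the VMA at the dummy page. */
    for (size_t i = 0; i < NR_SLOTS; i++)
        vma->slot[i] = page;

    return FAULT_NOPAGE;
}

/* Device release: free every page that was registered. */
static void release_all(struct fake_dev *dev)
{
    while (dev->nr_managed > 0)
        free(dev->managed[--dev->nr_managed]);
}
```

With the opposite order (map first, register afterwards), the failure path in this model would leave all slots pointing at memory that has already been freed.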

Christian.

>
> Andrey
>
>
>>
>> Apart from that patch looks good to me,
>> Christian.
>>
>>> +
>>> +    return ret;
>>> +}
>>> +EXPORT_SYMBOL(ttm_bo_vm_dummy_page);
>>> +
>>>   vm_fault_t ttm_bo_vm_fault(struct vm_fault *vmf)
>>>   {
>>>       struct vm_area_struct *vma = vmf->vma;
>>>       pgprot_t prot;
>>>       struct ttm_buffer_object *bo = vma->vm_private_data;
>>> +    struct drm_device *ddev = bo->base.dev;
>>>       vm_fault_t ret;
>>> +    int idx;
>>>         ret = ttm_bo_vm_reserve(bo, vmf);
>>>       if (ret)
>>>           return ret;
>>>         prot = vma->vm_page_prot;
>>> -    ret = ttm_bo_vm_fault_reserved(vmf, prot, 
>>> TTM_BO_VM_NUM_PREFAULT, 1);
>>> +    if (drm_dev_enter(ddev, &idx)) {
>>> +        ret = ttm_bo_vm_fault_reserved(vmf, prot, 
>>> TTM_BO_VM_NUM_PREFAULT, 1);
>>> +        drm_dev_exit(idx);
>>> +    } else {
>>> +        ret = ttm_bo_vm_dummy_page(vmf, prot);
>>> +    }
>>>       if (ret == VM_FAULT_RETRY && !(vmf->flags & 
>>> FAULT_FLAG_RETRY_NOWAIT))
>>>           return ret;
>>>   diff --git a/include/drm/ttm/ttm_bo_api.h 
>>> b/include/drm/ttm/ttm_bo_api.h
>>> index 639521880c29..254ede97f8e3 100644
>>> --- a/include/drm/ttm/ttm_bo_api.h
>>> +++ b/include/drm/ttm/ttm_bo_api.h
>>> @@ -620,4 +620,6 @@ int ttm_bo_vm_access(struct vm_area_struct *vma, 
>>> unsigned long addr,
>>>                void *buf, int len, int write);
>>>   bool ttm_bo_delayed_delete(struct ttm_device *bdev, bool remove_all);
>>>   +vm_fault_t ttm_bo_vm_dummy_page(struct vm_fault *vmf, pgprot_t 
>>> prot);
>>> +
>>>   #endif
>>


^ permalink raw reply	[flat|nested] 126+ messages in thread

* Re: [PATCH v6 01/16] drm/ttm: Remap all page faults to per process dummy page.
@ 2021-05-11 15:12         ` Christian König
  0 siblings, 0 replies; 126+ messages in thread
From: Christian König @ 2021-05-11 15:12 UTC (permalink / raw)
  To: Andrey Grodzovsky, dri-devel, amd-gfx, linux-pci, daniel.vetter,
	Harry.Wentland
  Cc: Alexander.Deucher, gregkh, ppaalanen, helgaas, Felix.Kuehling



Am 11.05.21 um 16:44 schrieb Andrey Grodzovsky:
>
> On 2021-05-11 2:38 a.m., Christian König wrote:
>> Am 10.05.21 um 18:36 schrieb Andrey Grodzovsky:
>>> On device removal reroute all CPU mappings to dummy page.
>>>
>>> v3:
>>> Remove loop to find DRM file and instead access it
>>> by vma->vm_file->private_data. Move dummy page installation
>>> into a separate function.
>>>
>>> v4:
>>> Map the entire BOs VA space into on demand allocated dummy page
>>> on the first fault for that BO.
>>>
>>> v5: Remove duplicate return.
>>>
>>> v6: Polish ttm_bo_vm_dummy_page, remove superflous code.
>>>
>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>>> ---
>>>   drivers/gpu/drm/ttm/ttm_bo_vm.c | 57 
>>> ++++++++++++++++++++++++++++++++-
>>>   include/drm/ttm/ttm_bo_api.h    |  2 ++
>>>   2 files changed, 58 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c 
>>> b/drivers/gpu/drm/ttm/ttm_bo_vm.c
>>> index b31b18058965..e5a9615519d1 100644
>>> --- a/drivers/gpu/drm/ttm/ttm_bo_vm.c
>>> +++ b/drivers/gpu/drm/ttm/ttm_bo_vm.c
>>> @@ -34,6 +34,8 @@
>>>   #include <drm/ttm/ttm_bo_driver.h>
>>>   #include <drm/ttm/ttm_placement.h>
>>>   #include <drm/drm_vma_manager.h>
>>> +#include <drm/drm_drv.h>
>>> +#include <drm/drm_managed.h>
>>>   #include <linux/mm.h>
>>>   #include <linux/pfn_t.h>
>>>   #include <linux/rbtree.h>
>>> @@ -380,19 +382,72 @@ vm_fault_t ttm_bo_vm_fault_reserved(struct 
>>> vm_fault *vmf,
>>>   }
>>>   EXPORT_SYMBOL(ttm_bo_vm_fault_reserved);
>>>   +static void ttm_bo_release_dummy_page(struct drm_device *dev, 
>>> void *res)
>>> +{
>>> +    struct page *dummy_page = (struct page *)res;
>>> +
>>> +    __free_page(dummy_page);
>>> +}
>>> +
>>> +vm_fault_t ttm_bo_vm_dummy_page(struct vm_fault *vmf, pgprot_t prot)
>>> +{
>>> +    struct vm_area_struct *vma = vmf->vma;
>>> +    struct ttm_buffer_object *bo = vma->vm_private_data;
>>> +    struct drm_device *ddev = bo->base.dev;
>>> +    vm_fault_t ret = VM_FAULT_NOPAGE;
>>> +    unsigned long address;
>>> +    unsigned long pfn;
>>> +    struct page *page;
>>> +
>>> +    /* Allocate new dummy page to map all the VA range in this VMA 
>>> to it*/
>>> +    page = alloc_page(GFP_KERNEL | __GFP_ZERO);
>>> +    if (!page)
>>> +        return VM_FAULT_OOM;
>>> +
>>> +    pfn = page_to_pfn(page);
>>> +
>>> +    /* Prefault the entire VMA range right away to avoid further 
>>> faults */
>>> +    for (address = vma->vm_start; address < vma->vm_end; address += 
>>> PAGE_SIZE) {
>>> +
>>
>>> +        if (unlikely(address >= vma->vm_end))
>>> +            break;
>>
>> That extra check can be removed as far as I can see.
>>
>>
>>> +
>>> +        if (vma->vm_flags & VM_MIXEDMAP)
>>> +            ret = vmf_insert_mixed_prot(vma, address,
>>> +                            __pfn_to_pfn_t(pfn, PFN_DEV),
>>> +                            prot);
>>> +        else
>>> +            ret = vmf_insert_pfn_prot(vma, address, pfn, prot);
>>> +    }
>>> +
>>
>>> +    /* Set the page to be freed using drmm release action */
>>> +    if (drmm_add_action_or_reset(ddev, ttm_bo_release_dummy_page, 
>>> page))
>>> +        return VM_FAULT_OOM;
>>
>> You should probably move that before inserting the page into the VMA 
>> and also free the allocated page if it goes wrong.
>
>
> drmm_add_action_or_reset will automatically release the page if the 
> add action fails, that's the 'reset' part of the function.

Ah! OK, that makes it even more important that you do this before you 
insert the page into any VMA.

Otherwise, if adding the action fails after the page is already mapped, 
userspace keeps access to a freed page, with rather ugly consequences.

Christian.
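
To make the ordering concern concrete, here is a minimal userspace sketch
in plain C, not kernel code. It assumes a hypothetical cleanup list whose
add_action_or_reset() mirrors the contract described above for
drmm_add_action_or_reset() (the release callback runs immediately if
registration fails). struct cleanup_list, struct mapping, map_dummy_page()
and run_cleanup() are illustrative names only and do not exist in TTM or
DRM. The release is registered before the page is published into any
slot, so a failed registration never leaves a freed page visible.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_ACTIONS 4
#define NSLOTS      8
#define PAGE_BYTES  4096

/* Hypothetical stand-in for a managed-release list (models drmm actions). */
struct cleanup_list {
	void (*fn[MAX_ACTIONS])(void *);
	void *arg[MAX_ACTIONS];
	size_t count;
	bool fail_next; /* test hook: force the next registration to fail */
};

/* Models the "add action or reset" contract: on failure the action is run
 * right away, so the resource is already released when this returns. */
static int add_action_or_reset(struct cleanup_list *l,
			       void (*fn)(void *), void *arg)
{
	assert(fn != NULL);
	if (l->fail_next || l->count == MAX_ACTIONS) {
		l->fail_next = false;
		fn(arg);
		return -1;
	}
	l->fn[l->count] = fn;
	l->arg[l->count] = arg;
	l->count++;
	return 0;
}

/* Runs and drops all registered actions, newest first. */
static void run_cleanup(struct cleanup_list *l)
{
	while (l->count > 0) {
		l->count--;
		l->fn[l->count](l->arg[l->count]);
	}
}

static void release_page(void *page)
{
	free(page);
}

/* Models a VMA: every slot is something "userspace" could dereference. */
struct mapping {
	void *slot[NSLOTS];
};

static int map_dummy_page(struct cleanup_list *l, struct mapping *m)
{
	void *page = calloc(1, PAGE_BYTES);
	size_t i;

	if (!page)
		return -1;

	/* Register the release first. If this fails the page is already
	 * freed, and nothing has been published yet. */
	if (add_action_or_reset(l, release_page, page))
		return -1;

	/* Only now make the page visible through the mapping. */
	for (i = 0; i < NSLOTS; i++)
		m->slot[i] = page;

	return 0;
}
```

With this ordering a failed registration returns before any slot points
at the page. In the patch, the equivalent change would be moving the
drmm_add_action_or_reset() call above the vmf_insert_*() loop.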

>
> Andrey
>
>
>>
>> Apart from that patch looks good to me,
>> Christian.
>>
>>> +
>>> +    return ret;
>>> +}
>>> +EXPORT_SYMBOL(ttm_bo_vm_dummy_page);
>>> +
>>>   vm_fault_t ttm_bo_vm_fault(struct vm_fault *vmf)
>>>   {
>>>       struct vm_area_struct *vma = vmf->vma;
>>>       pgprot_t prot;
>>>       struct ttm_buffer_object *bo = vma->vm_private_data;
>>> +    struct drm_device *ddev = bo->base.dev;
>>>       vm_fault_t ret;
>>> +    int idx;
>>>         ret = ttm_bo_vm_reserve(bo, vmf);
>>>       if (ret)
>>>           return ret;
>>>         prot = vma->vm_page_prot;
>>> -    ret = ttm_bo_vm_fault_reserved(vmf, prot, 
>>> TTM_BO_VM_NUM_PREFAULT, 1);
>>> +    if (drm_dev_enter(ddev, &idx)) {
>>> +        ret = ttm_bo_vm_fault_reserved(vmf, prot, 
>>> TTM_BO_VM_NUM_PREFAULT, 1);
>>> +        drm_dev_exit(idx);
>>> +    } else {
>>> +        ret = ttm_bo_vm_dummy_page(vmf, prot);
>>> +    }
>>>       if (ret == VM_FAULT_RETRY && !(vmf->flags & 
>>> FAULT_FLAG_RETRY_NOWAIT))
>>>           return ret;
>>>   diff --git a/include/drm/ttm/ttm_bo_api.h 
>>> b/include/drm/ttm/ttm_bo_api.h
>>> index 639521880c29..254ede97f8e3 100644
>>> --- a/include/drm/ttm/ttm_bo_api.h
>>> +++ b/include/drm/ttm/ttm_bo_api.h
>>> @@ -620,4 +620,6 @@ int ttm_bo_vm_access(struct vm_area_struct *vma, 
>>> unsigned long addr,
>>>                void *buf, int len, int write);
>>>   bool ttm_bo_delayed_delete(struct ttm_device *bdev, bool remove_all);
>>>   +vm_fault_t ttm_bo_vm_dummy_page(struct vm_fault *vmf, pgprot_t 
>>> prot);
>>> +
>>>   #endif
>>
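
The drm_dev_enter()/drm_dev_exit() bracket in the ttm_bo_vm_fault() hunk
quoted above can be pictured with a small userspace model. This is only a
sketch under simplifying assumptions: the real drm_dev_enter() takes an
int *idx and is SRCU-backed, so drm_dev_unplug() waits for sections that
are still in flight, whereas the model below is single-threaded and uses a
plain flag and counter. struct model_dev, model_dev_enter(),
model_dev_exit() and handle_fault() are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum fault_result {
	FAULT_REAL_PAGE,  /* device present: normal fault path */
	FAULT_DUMMY_PAGE, /* device gone: route the mapping to a dummy page */
};

/* Hypothetical device model; not struct drm_device. */
struct model_dev {
	bool unplugged;
	int readers; /* sections currently between enter and exit */
};

/* Returns true and opens a section only while the device is present. */
static bool model_dev_enter(struct model_dev *dev)
{
	if (dev->unplugged)
		return false;
	dev->readers++;
	return true;
}

/* Closes a section opened by a successful model_dev_enter(). */
static void model_dev_exit(struct model_dev *dev)
{
	dev->readers--;
}

static enum fault_result handle_fault(struct model_dev *dev)
{
	enum fault_result ret;

	assert(dev != NULL);
	if (model_dev_enter(dev)) {
		/* Hardware may be touched only inside this section. */
		ret = FAULT_REAL_PAGE;
		model_dev_exit(dev);
	} else {
		/* No hardware access at all on this path. */
		ret = FAULT_DUMMY_PAGE;
	}
	return ret;
}
```

The point of the bracket is that the exit call is made only on the branch
where enter succeeded, which is the same shape as the patched fault
handler.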

^ permalink raw reply	[flat|nested] 126+ messages in thread

* Re: [PATCH v6 06/16] drm/amdgpu: Handle IOMMU enabled case.
  2021-05-11  6:44     ` Christian König
  (?)
@ 2021-05-11 15:46       ` Andrey Grodzovsky
  -1 siblings, 0 replies; 126+ messages in thread
From: Andrey Grodzovsky @ 2021-05-11 15:46 UTC (permalink / raw)
  To: Christian König, dri-devel, amd-gfx, linux-pci,
	daniel.vetter, Harry.Wentland
  Cc: ppaalanen, Alexander.Deucher, gregkh, helgaas, Felix.Kuehling



On 2021-05-11 2:44 a.m., Christian König wrote:
> Am 10.05.21 um 18:36 schrieb Andrey Grodzovsky:
>> Handle all DMA IOMMU group related dependencies before the
>> group is removed.
>>
>> v5: Drop IOMMU notifier and switch to lockless call to ttm_tt_unpopulate
>> v6: Drop the BO unmap list
>>
>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 4 ++--
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c   | 3 +--
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h   | 1 +
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c    | 9 +++++++++
>>   drivers/gpu/drm/amd/amdgpu/cik_ih.c        | 1 -
>>   drivers/gpu/drm/amd/amdgpu/cz_ih.c         | 1 -
>>   drivers/gpu/drm/amd/amdgpu/iceland_ih.c    | 1 -
>>   drivers/gpu/drm/amd/amdgpu/navi10_ih.c     | 3 ---
>>   drivers/gpu/drm/amd/amdgpu/si_ih.c         | 1 -
>>   drivers/gpu/drm/amd/amdgpu/tonga_ih.c      | 1 -
>>   drivers/gpu/drm/amd/amdgpu/vega10_ih.c     | 3 ---
>>   11 files changed, 13 insertions(+), 15 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> index 18598eda18f6..a0bff4713672 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> @@ -3256,7 +3256,6 @@ static const struct attribute 
>> *amdgpu_dev_attributes[] = {
>>       NULL
>>   };
>> -
>>   /**
>>    * amdgpu_device_init - initialize the driver
>>    *
>> @@ -3698,12 +3697,13 @@ void amdgpu_device_fini_hw(struct 
>> amdgpu_device *adev)
>>           amdgpu_ucode_sysfs_fini(adev);
>>       sysfs_remove_files(&adev->dev->kobj, amdgpu_dev_attributes);
>> -
>>       amdgpu_fbdev_fini(adev);
>>       amdgpu_irq_fini_hw(adev);
>>       amdgpu_device_ip_fini_early(adev);
>> +
>> +    amdgpu_gart_dummy_page_fini(adev);
> 
> I think you should probably just call amdgpu_gart_fini() here.
> 
>>   }
>>   void amdgpu_device_fini_sw(struct amdgpu_device *adev)
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
>> index c5a9a4fb10d2..354e68081b53 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
>> @@ -92,7 +92,7 @@ static int amdgpu_gart_dummy_page_init(struct 
>> amdgpu_device *adev)
>>    *
>>    * Frees the dummy page used by the driver (all asics).
>>    */
>> -static void amdgpu_gart_dummy_page_fini(struct amdgpu_device *adev)
>> +void amdgpu_gart_dummy_page_fini(struct amdgpu_device *adev)
>>   {
>>       if (!adev->dummy_page_addr)
>>           return;
>> @@ -375,5 +375,4 @@ int amdgpu_gart_init(struct amdgpu_device *adev)
>>    */
>>   void amdgpu_gart_fini(struct amdgpu_device *adev)
>>   {
>> -    amdgpu_gart_dummy_page_fini(adev);
>>   }
> 
> Well either you remove amdgpu_gart_fini() or just call 
> amdgpu_gart_fini() instead of amdgpu_gart_dummy_page_fini().
> 
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h
>> index a25fe97b0196..78dc7a23da56 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h
>> @@ -58,6 +58,7 @@ int amdgpu_gart_table_vram_pin(struct amdgpu_device 
>> *adev);
>>   void amdgpu_gart_table_vram_unpin(struct amdgpu_device *adev);
>>   int amdgpu_gart_init(struct amdgpu_device *adev);
>>   void amdgpu_gart_fini(struct amdgpu_device *adev);
>> +void amdgpu_gart_dummy_page_fini(struct amdgpu_device *adev);
>>   int amdgpu_gart_unbind(struct amdgpu_device *adev, uint64_t offset,
>>                  int pages);
>>   int amdgpu_gart_map(struct amdgpu_device *adev, uint64_t offset,
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
>> index 233b64dab94b..a14973a7a9c9 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
>> @@ -361,6 +361,15 @@ void amdgpu_irq_fini_hw(struct amdgpu_device *adev)
>>           if (!amdgpu_device_has_dc_support(adev))
>>               flush_work(&adev->hotplug_work);
>>       }
>> +
>> +    if (adev->irq.ih_soft.ring)
>> +        amdgpu_ih_ring_fini(adev, &adev->irq.ih_soft);
>> +    if (adev->irq.ih.ring)
>> +        amdgpu_ih_ring_fini(adev, &adev->irq.ih);
>> +    if (adev->irq.ih1.ring)
>> +        amdgpu_ih_ring_fini(adev, &adev->irq.ih1);
>> +    if (adev->irq.ih2.ring)
>> +        amdgpu_ih_ring_fini(adev, &adev->irq.ih2);
> 
> You should probably make the function NULL safe instead of checking here.
> 
> Christian.

Agreed, in fact amdgpu_ih_ring_fini already does this check internally,
so I will just drop the checks.

Andrey
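
As a rough illustration of the NULL-safe teardown pattern discussed here,
the following userspace sketch shows a fini function that checks its own
ring pointer, so callers can invoke it unconditionally and more than once.
It is a model only: struct ih_ring, struct irq_state, ih_ring_init(),
ih_ring_fini() and irq_fini_hw() are hypothetical stand-ins, not the
amdgpu definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical model of an interrupt-handler ring. */
struct ih_ring {
	unsigned int *ring; /* NULL when never set up or already torn down */
	size_t size;
};

static int ih_ring_init(struct ih_ring *ih, size_t entries)
{
	assert(entries > 0);
	ih->ring = calloc(entries, sizeof(*ih->ring));
	if (!ih->ring)
		return -1;
	ih->size = entries;
	return 0;
}

/* NULL-safe and idempotent: the check lives here, not in the callers. */
static void ih_ring_fini(struct ih_ring *ih)
{
	if (!ih->ring)
		return;
	free(ih->ring);
	ih->ring = NULL;
	ih->size = 0;
}

/* Hypothetical container for the four rings named in the patch. */
struct irq_state {
	struct ih_ring ih, ih1, ih2, ih_soft;
};

/* Callers need no per-ring checks; unused rings are simply skipped. */
static void irq_fini_hw(struct irq_state *irq)
{
	ih_ring_fini(&irq->ih_soft);
	ih_ring_fini(&irq->ih);
	ih_ring_fini(&irq->ih1);
	ih_ring_fini(&irq->ih2);
}
```

Clearing the pointer after freeing is what makes a second call harmless,
which matters when teardown can be reached from more than one path.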

> 
>>   }
>>   /**
>> diff --git a/drivers/gpu/drm/amd/amdgpu/cik_ih.c 
>> b/drivers/gpu/drm/amd/amdgpu/cik_ih.c
>> index 183d44a6583c..df385ffc9768 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/cik_ih.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/cik_ih.c
>> @@ -310,7 +310,6 @@ static int cik_ih_sw_fini(void *handle)
>>       struct amdgpu_device *adev = (struct amdgpu_device *)handle;
>>       amdgpu_irq_fini_sw(adev);
>> -    amdgpu_ih_ring_fini(adev, &adev->irq.ih);
>>       amdgpu_irq_remove_domain(adev);
>>       return 0;
>> diff --git a/drivers/gpu/drm/amd/amdgpu/cz_ih.c 
>> b/drivers/gpu/drm/amd/amdgpu/cz_ih.c
>> index d32743949003..b8c47e0cf37a 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/cz_ih.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/cz_ih.c
>> @@ -302,7 +302,6 @@ static int cz_ih_sw_fini(void *handle)
>>       struct amdgpu_device *adev = (struct amdgpu_device *)handle;
>>       amdgpu_irq_fini_sw(adev);
>> -    amdgpu_ih_ring_fini(adev, &adev->irq.ih);
>>       amdgpu_irq_remove_domain(adev);
>>       return 0;
>> diff --git a/drivers/gpu/drm/amd/amdgpu/iceland_ih.c 
>> b/drivers/gpu/drm/amd/amdgpu/iceland_ih.c
>> index da96c6013477..ddfe4eaeea05 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/iceland_ih.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/iceland_ih.c
>> @@ -301,7 +301,6 @@ static int iceland_ih_sw_fini(void *handle)
>>       struct amdgpu_device *adev = (struct amdgpu_device *)handle;
>>       amdgpu_irq_fini_sw(adev);
>> -    amdgpu_ih_ring_fini(adev, &adev->irq.ih);
>>       amdgpu_irq_remove_domain(adev);
>>       return 0;
>> diff --git a/drivers/gpu/drm/amd/amdgpu/navi10_ih.c 
>> b/drivers/gpu/drm/amd/amdgpu/navi10_ih.c
>> index 5eea4550b856..e171a9e78544 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/navi10_ih.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/navi10_ih.c
>> @@ -571,9 +571,6 @@ static int navi10_ih_sw_fini(void *handle)
>>       amdgpu_irq_fini_sw(adev);
>>       amdgpu_ih_ring_fini(adev, &adev->irq.ih_soft);
>> -    amdgpu_ih_ring_fini(adev, &adev->irq.ih2);
>> -    amdgpu_ih_ring_fini(adev, &adev->irq.ih1);
>> -    amdgpu_ih_ring_fini(adev, &adev->irq.ih);
>>       return 0;
>>   }
>> diff --git a/drivers/gpu/drm/amd/amdgpu/si_ih.c 
>> b/drivers/gpu/drm/amd/amdgpu/si_ih.c
>> index 751307f3252c..9a24f17a5750 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/si_ih.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/si_ih.c
>> @@ -176,7 +176,6 @@ static int si_ih_sw_fini(void *handle)
>>       struct amdgpu_device *adev = (struct amdgpu_device *)handle;
>>       amdgpu_irq_fini_sw(adev);
>> -    amdgpu_ih_ring_fini(adev, &adev->irq.ih);
>>       return 0;
>>   }
>> diff --git a/drivers/gpu/drm/amd/amdgpu/tonga_ih.c 
>> b/drivers/gpu/drm/amd/amdgpu/tonga_ih.c
>> index 973d80ec7f6c..b08905d1c00f 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/tonga_ih.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/tonga_ih.c
>> @@ -313,7 +313,6 @@ static int tonga_ih_sw_fini(void *handle)
>>       struct amdgpu_device *adev = (struct amdgpu_device *)handle;
>>       amdgpu_irq_fini_sw(adev);
>> -    amdgpu_ih_ring_fini(adev, &adev->irq.ih);
>>       amdgpu_irq_remove_domain(adev);
>>       return 0;
>> diff --git a/drivers/gpu/drm/amd/amdgpu/vega10_ih.c 
>> b/drivers/gpu/drm/amd/amdgpu/vega10_ih.c
>> index dead9c2fbd4c..d78b8abe993a 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/vega10_ih.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/vega10_ih.c
>> @@ -515,9 +515,6 @@ static int vega10_ih_sw_fini(void *handle)
>>       amdgpu_irq_fini_sw(adev);
>>       amdgpu_ih_ring_fini(adev, &adev->irq.ih_soft);
>> -    amdgpu_ih_ring_fini(adev, &adev->irq.ih2);
>> -    amdgpu_ih_ring_fini(adev, &adev->irq.ih1);
>> -    amdgpu_ih_ring_fini(adev, &adev->irq.ih);
>>       return 0;
>>   }
> 

^ permalink raw reply	[flat|nested] 126+ messages in thread


* Re: [PATCH v6 06/16] drm/amdgpu: Handle IOMMU enabled case.
  2021-05-10 16:36   ` Andrey Grodzovsky
  (?)
@ 2021-05-11 15:56     ` Alex Deucher
  -1 siblings, 0 replies; 126+ messages in thread
From: Alex Deucher @ 2021-05-11 15:56 UTC (permalink / raw)
  To: Andrey Grodzovsky
  Cc: Maling list - DRI developers, amd-gfx list, Linux PCI,
	Christian König, Daniel Vetter, Wentland, Harry, Greg KH,
	Kuehling, Felix, Pekka Paalanen, Bjorn Helgaas, Deucher,
	Alexander

On Mon, May 10, 2021 at 12:37 PM Andrey Grodzovsky
<andrey.grodzovsky@amd.com> wrote:
>
> Handle all DMA IOMMU group related dependencies before the
> group is removed.
>
> v5: Drop IOMMU notifier and switch to lockless call to ttm_tt_unpopulate
> v6: Drop the BO unmap list
>
> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 4 ++--
>  drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c   | 3 +--
>  drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h   | 1 +
>  drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c    | 9 +++++++++
>  drivers/gpu/drm/amd/amdgpu/cik_ih.c        | 1 -
>  drivers/gpu/drm/amd/amdgpu/cz_ih.c         | 1 -
>  drivers/gpu/drm/amd/amdgpu/iceland_ih.c    | 1 -
>  drivers/gpu/drm/amd/amdgpu/navi10_ih.c     | 3 ---
>  drivers/gpu/drm/amd/amdgpu/si_ih.c         | 1 -
>  drivers/gpu/drm/amd/amdgpu/tonga_ih.c      | 1 -
>  drivers/gpu/drm/amd/amdgpu/vega10_ih.c     | 3 ---
>  11 files changed, 13 insertions(+), 15 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 18598eda18f6..a0bff4713672 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -3256,7 +3256,6 @@ static const struct attribute *amdgpu_dev_attributes[] = {
>         NULL
>  };
>
> -
>  /**
>   * amdgpu_device_init - initialize the driver
>   *
> @@ -3698,12 +3697,13 @@ void amdgpu_device_fini_hw(struct amdgpu_device *adev)
>                 amdgpu_ucode_sysfs_fini(adev);
>         sysfs_remove_files(&adev->dev->kobj, amdgpu_dev_attributes);
>
> -
>         amdgpu_fbdev_fini(adev);
>
>         amdgpu_irq_fini_hw(adev);
>
>         amdgpu_device_ip_fini_early(adev);
> +
> +       amdgpu_gart_dummy_page_fini(adev);
>  }
>
>  void amdgpu_device_fini_sw(struct amdgpu_device *adev)
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
> index c5a9a4fb10d2..354e68081b53 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
> @@ -92,7 +92,7 @@ static int amdgpu_gart_dummy_page_init(struct amdgpu_device *adev)
>   *
>   * Frees the dummy page used by the driver (all asics).
>   */
> -static void amdgpu_gart_dummy_page_fini(struct amdgpu_device *adev)
> +void amdgpu_gart_dummy_page_fini(struct amdgpu_device *adev)
>  {
>         if (!adev->dummy_page_addr)
>                 return;
> @@ -375,5 +375,4 @@ int amdgpu_gart_init(struct amdgpu_device *adev)
>   */
>  void amdgpu_gart_fini(struct amdgpu_device *adev)
>  {
> -       amdgpu_gart_dummy_page_fini(adev);
>  }
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h
> index a25fe97b0196..78dc7a23da56 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h
> @@ -58,6 +58,7 @@ int amdgpu_gart_table_vram_pin(struct amdgpu_device *adev);
>  void amdgpu_gart_table_vram_unpin(struct amdgpu_device *adev);
>  int amdgpu_gart_init(struct amdgpu_device *adev);
>  void amdgpu_gart_fini(struct amdgpu_device *adev);
> +void amdgpu_gart_dummy_page_fini(struct amdgpu_device *adev);
>  int amdgpu_gart_unbind(struct amdgpu_device *adev, uint64_t offset,
>                        int pages);
>  int amdgpu_gart_map(struct amdgpu_device *adev, uint64_t offset,
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
> index 233b64dab94b..a14973a7a9c9 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
> @@ -361,6 +361,15 @@ void amdgpu_irq_fini_hw(struct amdgpu_device *adev)
>                 if (!amdgpu_device_has_dc_support(adev))
>                         flush_work(&adev->hotplug_work);
>         }
> +
> +       if (adev->irq.ih_soft.ring)
> +               amdgpu_ih_ring_fini(adev, &adev->irq.ih_soft);

Why is the ih_soft handled here and in the various ih sw_fini functions?

> +       if (adev->irq.ih.ring)
> +               amdgpu_ih_ring_fini(adev, &adev->irq.ih);
> +       if (adev->irq.ih1.ring)
> +               amdgpu_ih_ring_fini(adev, &adev->irq.ih1);
> +       if (adev->irq.ih2.ring)
> +               amdgpu_ih_ring_fini(adev, &adev->irq.ih2);
>  }
>
>  /**
> diff --git a/drivers/gpu/drm/amd/amdgpu/cik_ih.c b/drivers/gpu/drm/amd/amdgpu/cik_ih.c
> index 183d44a6583c..df385ffc9768 100644
> --- a/drivers/gpu/drm/amd/amdgpu/cik_ih.c
> +++ b/drivers/gpu/drm/amd/amdgpu/cik_ih.c
> @@ -310,7 +310,6 @@ static int cik_ih_sw_fini(void *handle)
>         struct amdgpu_device *adev = (struct amdgpu_device *)handle;
>
>         amdgpu_irq_fini_sw(adev);
> -       amdgpu_ih_ring_fini(adev, &adev->irq.ih);
>         amdgpu_irq_remove_domain(adev);
>
>         return 0;
> diff --git a/drivers/gpu/drm/amd/amdgpu/cz_ih.c b/drivers/gpu/drm/amd/amdgpu/cz_ih.c
> index d32743949003..b8c47e0cf37a 100644
> --- a/drivers/gpu/drm/amd/amdgpu/cz_ih.c
> +++ b/drivers/gpu/drm/amd/amdgpu/cz_ih.c
> @@ -302,7 +302,6 @@ static int cz_ih_sw_fini(void *handle)
>         struct amdgpu_device *adev = (struct amdgpu_device *)handle;
>
>         amdgpu_irq_fini_sw(adev);
> -       amdgpu_ih_ring_fini(adev, &adev->irq.ih);
>         amdgpu_irq_remove_domain(adev);
>
>         return 0;
> diff --git a/drivers/gpu/drm/amd/amdgpu/iceland_ih.c b/drivers/gpu/drm/amd/amdgpu/iceland_ih.c
> index da96c6013477..ddfe4eaeea05 100644
> --- a/drivers/gpu/drm/amd/amdgpu/iceland_ih.c
> +++ b/drivers/gpu/drm/amd/amdgpu/iceland_ih.c
> @@ -301,7 +301,6 @@ static int iceland_ih_sw_fini(void *handle)
>         struct amdgpu_device *adev = (struct amdgpu_device *)handle;
>
>         amdgpu_irq_fini_sw(adev);
> -       amdgpu_ih_ring_fini(adev, &adev->irq.ih);
>         amdgpu_irq_remove_domain(adev);
>
>         return 0;
> diff --git a/drivers/gpu/drm/amd/amdgpu/navi10_ih.c b/drivers/gpu/drm/amd/amdgpu/navi10_ih.c
> index 5eea4550b856..e171a9e78544 100644
> --- a/drivers/gpu/drm/amd/amdgpu/navi10_ih.c
> +++ b/drivers/gpu/drm/amd/amdgpu/navi10_ih.c
> @@ -571,9 +571,6 @@ static int navi10_ih_sw_fini(void *handle)
>
>         amdgpu_irq_fini_sw(adev);
>         amdgpu_ih_ring_fini(adev, &adev->irq.ih_soft);
> -       amdgpu_ih_ring_fini(adev, &adev->irq.ih2);
> -       amdgpu_ih_ring_fini(adev, &adev->irq.ih1);
> -       amdgpu_ih_ring_fini(adev, &adev->irq.ih);
>
>         return 0;
>  }
> diff --git a/drivers/gpu/drm/amd/amdgpu/si_ih.c b/drivers/gpu/drm/amd/amdgpu/si_ih.c
> index 751307f3252c..9a24f17a5750 100644
> --- a/drivers/gpu/drm/amd/amdgpu/si_ih.c
> +++ b/drivers/gpu/drm/amd/amdgpu/si_ih.c
> @@ -176,7 +176,6 @@ static int si_ih_sw_fini(void *handle)
>         struct amdgpu_device *adev = (struct amdgpu_device *)handle;
>
>         amdgpu_irq_fini_sw(adev);
> -       amdgpu_ih_ring_fini(adev, &adev->irq.ih);
>
>         return 0;
>  }
> diff --git a/drivers/gpu/drm/amd/amdgpu/tonga_ih.c b/drivers/gpu/drm/amd/amdgpu/tonga_ih.c
> index 973d80ec7f6c..b08905d1c00f 100644
> --- a/drivers/gpu/drm/amd/amdgpu/tonga_ih.c
> +++ b/drivers/gpu/drm/amd/amdgpu/tonga_ih.c
> @@ -313,7 +313,6 @@ static int tonga_ih_sw_fini(void *handle)
>         struct amdgpu_device *adev = (struct amdgpu_device *)handle;
>
>         amdgpu_irq_fini_sw(adev);
> -       amdgpu_ih_ring_fini(adev, &adev->irq.ih);
>         amdgpu_irq_remove_domain(adev);
>
>         return 0;
> diff --git a/drivers/gpu/drm/amd/amdgpu/vega10_ih.c b/drivers/gpu/drm/amd/amdgpu/vega10_ih.c
> index dead9c2fbd4c..d78b8abe993a 100644
> --- a/drivers/gpu/drm/amd/amdgpu/vega10_ih.c
> +++ b/drivers/gpu/drm/amd/amdgpu/vega10_ih.c
> @@ -515,9 +515,6 @@ static int vega10_ih_sw_fini(void *handle)
>
>         amdgpu_irq_fini_sw(adev);
>         amdgpu_ih_ring_fini(adev, &adev->irq.ih_soft);
> -       amdgpu_ih_ring_fini(adev, &adev->irq.ih2);
> -       amdgpu_ih_ring_fini(adev, &adev->irq.ih1);
> -       amdgpu_ih_ring_fini(adev, &adev->irq.ih);
>
>         return 0;
>  }
> --
> 2.25.1

^ permalink raw reply	[flat|nested] 126+ messages in thread

* Re: [PATCH v6 06/16] drm/amdgpu: Handle IOMMU enabled case.
  2021-05-11 15:56     ` Alex Deucher
  (?)
@ 2021-05-11 15:59       ` Andrey Grodzovsky
  -1 siblings, 0 replies; 126+ messages in thread
From: Andrey Grodzovsky @ 2021-05-11 15:59 UTC (permalink / raw)
  To: Alex Deucher
  Cc: Mailing list - DRI developers, amd-gfx list, Linux PCI,
	Christian König, Daniel Vetter, Wentland, Harry, Greg KH,
	Kuehling, Felix, Pekka Paalanen, Bjorn Helgaas, Deucher,
	Alexander



On 2021-05-11 11:56 a.m., Alex Deucher wrote:
> On Mon, May 10, 2021 at 12:37 PM Andrey Grodzovsky
> <andrey.grodzovsky@amd.com> wrote:
>>
>> Handle all DMA IOMMU group related dependencies before the
>> group is removed.
>>
>> v5: Drop IOMMU notifier and switch to lockless call to ttm_tt_unpopulate
>> v6: Drop the BO unmap list
>>
>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 4 ++--
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c   | 3 +--
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h   | 1 +
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c    | 9 +++++++++
>>   drivers/gpu/drm/amd/amdgpu/cik_ih.c        | 1 -
>>   drivers/gpu/drm/amd/amdgpu/cz_ih.c         | 1 -
>>   drivers/gpu/drm/amd/amdgpu/iceland_ih.c    | 1 -
>>   drivers/gpu/drm/amd/amdgpu/navi10_ih.c     | 3 ---
>>   drivers/gpu/drm/amd/amdgpu/si_ih.c         | 1 -
>>   drivers/gpu/drm/amd/amdgpu/tonga_ih.c      | 1 -
>>   drivers/gpu/drm/amd/amdgpu/vega10_ih.c     | 3 ---
>>   11 files changed, 13 insertions(+), 15 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> index 18598eda18f6..a0bff4713672 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> @@ -3256,7 +3256,6 @@ static const struct attribute *amdgpu_dev_attributes[] = {
>>          NULL
>>   };
>>
>> -
>>   /**
>>    * amdgpu_device_init - initialize the driver
>>    *
>> @@ -3698,12 +3697,13 @@ void amdgpu_device_fini_hw(struct amdgpu_device *adev)
>>                  amdgpu_ucode_sysfs_fini(adev);
>>          sysfs_remove_files(&adev->dev->kobj, amdgpu_dev_attributes);
>>
>> -
>>          amdgpu_fbdev_fini(adev);
>>
>>          amdgpu_irq_fini_hw(adev);
>>
>>          amdgpu_device_ip_fini_early(adev);
>> +
>> +       amdgpu_gart_dummy_page_fini(adev);
>>   }
>>
>>   void amdgpu_device_fini_sw(struct amdgpu_device *adev)
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
>> index c5a9a4fb10d2..354e68081b53 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
>> @@ -92,7 +92,7 @@ static int amdgpu_gart_dummy_page_init(struct amdgpu_device *adev)
>>    *
>>    * Frees the dummy page used by the driver (all asics).
>>    */
>> -static void amdgpu_gart_dummy_page_fini(struct amdgpu_device *adev)
>> +void amdgpu_gart_dummy_page_fini(struct amdgpu_device *adev)
>>   {
>>          if (!adev->dummy_page_addr)
>>                  return;
>> @@ -375,5 +375,4 @@ int amdgpu_gart_init(struct amdgpu_device *adev)
>>    */
>>   void amdgpu_gart_fini(struct amdgpu_device *adev)
>>   {
>> -       amdgpu_gart_dummy_page_fini(adev);
>>   }
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h
>> index a25fe97b0196..78dc7a23da56 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h
>> @@ -58,6 +58,7 @@ int amdgpu_gart_table_vram_pin(struct amdgpu_device *adev);
>>   void amdgpu_gart_table_vram_unpin(struct amdgpu_device *adev);
>>   int amdgpu_gart_init(struct amdgpu_device *adev);
>>   void amdgpu_gart_fini(struct amdgpu_device *adev);
>> +void amdgpu_gart_dummy_page_fini(struct amdgpu_device *adev);
>>   int amdgpu_gart_unbind(struct amdgpu_device *adev, uint64_t offset,
>>                         int pages);
>>   int amdgpu_gart_map(struct amdgpu_device *adev, uint64_t offset,
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
>> index 233b64dab94b..a14973a7a9c9 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
>> @@ -361,6 +361,15 @@ void amdgpu_irq_fini_hw(struct amdgpu_device *adev)
>>                  if (!amdgpu_device_has_dc_support(adev))
>>                          flush_work(&adev->hotplug_work);
>>          }
>> +
>> +       if (adev->irq.ih_soft.ring)
>> +               amdgpu_ih_ring_fini(adev, &adev->irq.ih_soft);
> 
> Why is the ih_soft handled here and in the various ih sw_fini functions?

I think new ASICs were added since the last rebase, and I missed them.
I'm fixing this now, together with Christian's earlier comment.

Andrey
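
For readers following the exchange above, here is a minimal, self-contained
sketch of the concern: a ring that is released in the early (hardware)
teardown and then released again from a per-ASIC sw_fini path. All names here
(demo_ring, demo_irq, demo_ring_fini, and so on) are hypothetical stand-ins
and not the amdgpu types or functions; the sketch only assumes that a fini
helper clears its pointer after freeing, and makes no claim about how
amdgpu_ih_ring_fini itself behaves.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for an interrupt-handler ring (not the amdgpu type). */
struct demo_ring {
	unsigned int *ring;	/* backing storage, NULL when not allocated */
	size_t entries;
};

/* Hypothetical container mirroring the four rings named in the patch. */
struct demo_irq {
	struct demo_ring ih, ih1, ih2, ih_soft;
};

/* Allocate the ring storage; returns 0 on success, -1 on failure. */
static int demo_ring_init(struct demo_ring *r, size_t entries)
{
	r->ring = calloc(entries, sizeof(*r->ring));
	if (!r->ring)
		return -1;
	r->entries = entries;
	return 0;
}

/*
 * Free the ring and clear the pointer, so that a second call is a no-op
 * instead of a double free. Returns true only if storage was released.
 */
static bool demo_ring_fini(struct demo_ring *r)
{
	if (!r->ring)
		return false;
	free(r->ring);
	r->ring = NULL;
	r->entries = 0;
	return true;
}

/* Early teardown: release every ring that was set up; returns the count. */
static int demo_irq_fini_hw(struct demo_irq *irq)
{
	int released = 0;

	released += demo_ring_fini(&irq->ih_soft);
	released += demo_ring_fini(&irq->ih);
	released += demo_ring_fini(&irq->ih1);
	released += demo_ring_fini(&irq->ih2);
	return released;
}

/* Late per-ASIC teardown that still tries to release ih_soft. */
static int demo_ih_sw_fini(struct demo_irq *irq)
{
	return demo_ring_fini(&irq->ih_soft) ? 1 : 0;
}
```

With a helper shaped like this, the leftover call in the late path is
redundant but harmless; without the pointer reset it would be a double free,
which is why keeping the release in exactly one place is the cleaner fix.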

> 
>> +       if (adev->irq.ih.ring)
>> +               amdgpu_ih_ring_fini(adev, &adev->irq.ih);
>> +       if (adev->irq.ih1.ring)
>> +               amdgpu_ih_ring_fini(adev, &adev->irq.ih1);
>> +       if (adev->irq.ih2.ring)
>> +               amdgpu_ih_ring_fini(adev, &adev->irq.ih2);
>>   }
>>
>>   /**
>> diff --git a/drivers/gpu/drm/amd/amdgpu/cik_ih.c b/drivers/gpu/drm/amd/amdgpu/cik_ih.c
>> index 183d44a6583c..df385ffc9768 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/cik_ih.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/cik_ih.c
>> @@ -310,7 +310,6 @@ static int cik_ih_sw_fini(void *handle)
>>          struct amdgpu_device *adev = (struct amdgpu_device *)handle;
>>
>>          amdgpu_irq_fini_sw(adev);
>> -       amdgpu_ih_ring_fini(adev, &adev->irq.ih);
>>          amdgpu_irq_remove_domain(adev);
>>
>>          return 0;
>> diff --git a/drivers/gpu/drm/amd/amdgpu/cz_ih.c b/drivers/gpu/drm/amd/amdgpu/cz_ih.c
>> index d32743949003..b8c47e0cf37a 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/cz_ih.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/cz_ih.c
>> @@ -302,7 +302,6 @@ static int cz_ih_sw_fini(void *handle)
>>          struct amdgpu_device *adev = (struct amdgpu_device *)handle;
>>
>>          amdgpu_irq_fini_sw(adev);
>> -       amdgpu_ih_ring_fini(adev, &adev->irq.ih);
>>          amdgpu_irq_remove_domain(adev);
>>
>>          return 0;
>> diff --git a/drivers/gpu/drm/amd/amdgpu/iceland_ih.c b/drivers/gpu/drm/amd/amdgpu/iceland_ih.c
>> index da96c6013477..ddfe4eaeea05 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/iceland_ih.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/iceland_ih.c
>> @@ -301,7 +301,6 @@ static int iceland_ih_sw_fini(void *handle)
>>          struct amdgpu_device *adev = (struct amdgpu_device *)handle;
>>
>>          amdgpu_irq_fini_sw(adev);
>> -       amdgpu_ih_ring_fini(adev, &adev->irq.ih);
>>          amdgpu_irq_remove_domain(adev);
>>
>>          return 0;
>> diff --git a/drivers/gpu/drm/amd/amdgpu/navi10_ih.c b/drivers/gpu/drm/amd/amdgpu/navi10_ih.c
>> index 5eea4550b856..e171a9e78544 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/navi10_ih.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/navi10_ih.c
>> @@ -571,9 +571,6 @@ static int navi10_ih_sw_fini(void *handle)
>>
>>          amdgpu_irq_fini_sw(adev);
>>          amdgpu_ih_ring_fini(adev, &adev->irq.ih_soft);
>> -       amdgpu_ih_ring_fini(adev, &adev->irq.ih2);
>> -       amdgpu_ih_ring_fini(adev, &adev->irq.ih1);
>> -       amdgpu_ih_ring_fini(adev, &adev->irq.ih);
>>
>>          return 0;
>>   }
>> diff --git a/drivers/gpu/drm/amd/amdgpu/si_ih.c b/drivers/gpu/drm/amd/amdgpu/si_ih.c
>> index 751307f3252c..9a24f17a5750 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/si_ih.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/si_ih.c
>> @@ -176,7 +176,6 @@ static int si_ih_sw_fini(void *handle)
>>          struct amdgpu_device *adev = (struct amdgpu_device *)handle;
>>
>>          amdgpu_irq_fini_sw(adev);
>> -       amdgpu_ih_ring_fini(adev, &adev->irq.ih);
>>
>>          return 0;
>>   }
>> diff --git a/drivers/gpu/drm/amd/amdgpu/tonga_ih.c b/drivers/gpu/drm/amd/amdgpu/tonga_ih.c
>> index 973d80ec7f6c..b08905d1c00f 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/tonga_ih.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/tonga_ih.c
>> @@ -313,7 +313,6 @@ static int tonga_ih_sw_fini(void *handle)
>>          struct amdgpu_device *adev = (struct amdgpu_device *)handle;
>>
>>          amdgpu_irq_fini_sw(adev);
>> -       amdgpu_ih_ring_fini(adev, &adev->irq.ih);
>>          amdgpu_irq_remove_domain(adev);
>>
>>          return 0;
>> diff --git a/drivers/gpu/drm/amd/amdgpu/vega10_ih.c b/drivers/gpu/drm/amd/amdgpu/vega10_ih.c
>> index dead9c2fbd4c..d78b8abe993a 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/vega10_ih.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/vega10_ih.c
>> @@ -515,9 +515,6 @@ static int vega10_ih_sw_fini(void *handle)
>>
>>          amdgpu_irq_fini_sw(adev);
>>          amdgpu_ih_ring_fini(adev, &adev->irq.ih_soft);
>> -       amdgpu_ih_ring_fini(adev, &adev->irq.ih2);
>> -       amdgpu_ih_ring_fini(adev, &adev->irq.ih1);
>> -       amdgpu_ih_ring_fini(adev, &adev->irq.ih);
>>
>>          return 0;
>>   }
>> --
>> 2.25.1
>>
>> _______________________________________________
>> amd-gfx mailing list
>> amd-gfx@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 126+ messages in thread

* Re: [PATCH v6 10/16] drm/amdgpu: Guard against write accesses after device removal
  2021-05-11  6:50     ` Christian König
  (?)
@ 2021-05-11 17:52       ` Andrey Grodzovsky
  -1 siblings, 0 replies; 126+ messages in thread
From: Andrey Grodzovsky @ 2021-05-11 17:52 UTC (permalink / raw)
  To: Christian König, dri-devel, amd-gfx, linux-pci,
	daniel.vetter, Harry.Wentland
  Cc: ppaalanen, Alexander.Deucher, gregkh, helgaas, Felix.Kuehling



On 2021-05-11 2:50 a.m., Christian König wrote:
> Am 10.05.21 um 18:36 schrieb Andrey Grodzovsky:
>> This should prevent writing to memory or IO ranges possibly
>> already allocated for other uses after our device is removed.
>>
>> v5:
>> Protect more places where memcpy_toio/memcpy_fromio take place
>> Protect IB submissions
>>
>> v6: Switch to !drm_dev_enter instead of scoping entire code
>> with brackets.
>>
>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c    | 11 ++-
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c       |  9 +++
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c        | 17 +++--
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c       | 63 +++++++++++------
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h       |  2 +
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c      | 70 +++++++++++++++++++
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h      | 49 ++-----------
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c       | 31 +++++---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c       | 11 ++-
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c       | 22 ++++--
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c        |  7 +-
>>   drivers/gpu/drm/amd/amdgpu/psp_v11_0.c        | 44 ++++++------
>>   drivers/gpu/drm/amd/amdgpu/psp_v12_0.c        |  8 +--
>>   drivers/gpu/drm/amd/amdgpu/psp_v3_1.c         |  8 +--
>>   drivers/gpu/drm/amd/amdgpu/vce_v4_0.c         | 26 ++++---
>>   drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c         | 22 +++---
>>   .../drm/amd/pm/powerplay/smumgr/smu7_smumgr.c |  2 +
>>   17 files changed, 257 insertions(+), 145 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> index a0bff4713672..94c415176cdc 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> @@ -71,6 +71,8 @@
>>   #include <drm/task_barrier.h>
>>   #include <linux/pm_runtime.h>
>> +#include <drm/drm_drv.h>
>> +
>>   MODULE_FIRMWARE("amdgpu/vega10_gpu_info.bin");
>>   MODULE_FIRMWARE("amdgpu/vega12_gpu_info.bin");
>>   MODULE_FIRMWARE("amdgpu/raven_gpu_info.bin");
>> @@ -281,7 +283,10 @@ void amdgpu_device_vram_access(struct amdgpu_device *adev, loff_t pos,
>>       unsigned long flags;
>>       uint32_t hi = ~0;
>>       uint64_t last;
>> +    int idx;
>> +     if (!drm_dev_enter(&adev->ddev, &idx))
>> +         return;
>>   #ifdef CONFIG_64BIT
>>       last = min(pos + size, adev->gmc.visible_vram_size);
>> @@ -299,8 +304,10 @@ void amdgpu_device_vram_access(struct amdgpu_device *adev, loff_t pos,
>>               memcpy_fromio(buf, addr, count);
>>           }
>> -        if (count == size)
>> +        if (count == size) {
>> +            drm_dev_exit(idx);
>>               return;
>> +        }
> 
> Maybe use a goto instead, but that's really just a nitpick.
> 
> 
> 
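For illustration only: a minimal userspace sketch of the single-exit shape
suggested above, where every return taken after a successful enter goes
through one label. The names fake_dev, fake_dev_enter(), fake_dev_exit() and
fake_vram_read() are hypothetical stand-ins and are not part of amdgpu or DRM.
The real drm_dev_exit() takes only the index; the mock also takes the device
so that it can check that sections stay balanced.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a device that can be unplugged at any time. */
struct fake_dev {
	bool unplugged;
	int active_sections;	/* number of currently open enter/exit sections */
};

/* Models drm_dev_enter(): refuses to open a section once the device is gone. */
static bool fake_dev_enter(struct fake_dev *dev, int *idx)
{
	if (dev->unplugged)
		return false;
	*idx = dev->active_sections++;
	return true;
}

/* Models drm_dev_exit(): closes the section opened by fake_dev_enter(). */
static void fake_dev_exit(struct fake_dev *dev, int idx)
{
	(void)idx;
	assert(dev->active_sections > 0);
	dev->active_sections--;
}

/*
 * Reads `size` bytes from `vram` into `buf`. The first `visible` bytes are
 * copied in one go (fast path), the rest byte by byte (slow path). Returns
 * the number of bytes copied, or 0 if the device is already unplugged.
 */
static size_t fake_vram_read(struct fake_dev *dev, const char *vram,
			     size_t visible, char *buf, size_t size)
{
	size_t count, done;
	int idx;

	if (!fake_dev_enter(dev, &idx))
		return 0;

	count = size < visible ? size : visible;
	memcpy(buf, vram, count);
	done = count;
	if (count == size)
		goto out;	/* fast path covered the whole request */

	for (; done < size; done++)
		buf[done] = vram[done];

out:
	/* Single exit point: the section is closed exactly once. */
	fake_dev_exit(dev, idx);
	return done;
}
```

With this shape, adding another early exit later only needs a "goto out"
rather than a second copy of the exit call, which is the usual argument for
the goto form in kernel code.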
>>           pos += count;
>>           buf += count / 4;
>> @@ -323,6 +330,8 @@ void amdgpu_device_vram_access(struct amdgpu_device *adev, loff_t pos,
>>               *buf++ = RREG32_NO_KIQ(mmMM_DATA);
>>       }
>>       spin_unlock_irqrestore(&adev->mmio_idx_lock, flags);
>> +
>> +    drm_dev_exit(idx);
>>   }
>>   /*
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
>> index 4d32233cde92..04ba5eef1e88 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
>> @@ -31,6 +31,8 @@
>>   #include "amdgpu_ras.h"
>>   #include "amdgpu_xgmi.h"
>> +#include <drm/drm_drv.h>
>> +
>>   /**
>>    * amdgpu_gmc_pdb0_alloc - allocate vram for pdb0
>>    *
>> @@ -151,6 +153,10 @@ int amdgpu_gmc_set_pte_pde(struct amdgpu_device *adev, void *cpu_pt_addr,
>>   {
>>       void __iomem *ptr = (void *)cpu_pt_addr;
>>       uint64_t value;
>> +    int idx;
>> +
>> +    if (!drm_dev_enter(&adev->ddev, &idx))
>> +        return 0;
>>       /*
>>        * The following is for PTE only. GART does not have PDEs.
>> @@ -158,6 +164,9 @@ int amdgpu_gmc_set_pte_pde(struct amdgpu_device *adev, void *cpu_pt_addr,
>>       value = addr & 0x0000FFFFFFFFF000ULL;
>>       value |= flags;
>>       writeq(value, ptr + (gpu_page_idx * 8));
>> +
>> +    drm_dev_exit(idx);
>> +
>>       return 0;
>>   }
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
>> index 148a3b481b12..62fcbd446c71 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
>> @@ -30,6 +30,7 @@
>>   #include <linux/slab.h>
>>   #include <drm/amdgpu_drm.h>
>> +#include <drm/drm_drv.h>
>>   #include "amdgpu.h"
>>   #include "atom.h"
>> @@ -137,7 +138,7 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, unsigned num_ibs,
>>       bool secure;
>>       unsigned i;
>> -    int r = 0;
>> +    int idx, r = 0;
>>       bool need_pipe_sync = false;
>>       if (num_ibs == 0)
>> @@ -169,13 +170,16 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, unsigned num_ibs,
>>           return -EINVAL;
>>       }
>> +    if (!drm_dev_enter(&adev->ddev, &idx))
>> +        return -ENODEV;
>> +
>>       alloc_size = ring->funcs->emit_frame_size + num_ibs *
>>           ring->funcs->emit_ib_size;
>>       r = amdgpu_ring_alloc(ring, alloc_size);
>>       if (r) {
>>           dev_err(adev->dev, "scheduling IB failed (%d).\n", r);
>> -        return r;
>> +        goto exit;
>>       }
>>       need_ctx_switch = ring->current_ctx != fence_ctx;
>> @@ -205,7 +209,7 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, unsigned num_ibs,
>>           r = amdgpu_vm_flush(ring, job, need_pipe_sync);
>>           if (r) {
>>               amdgpu_ring_undo(ring);
>> -            return r;
>> +            goto exit;
>>           }
>>       }
>> @@ -286,7 +290,7 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, unsigned num_ibs,
>>           if (job && job->vmid)
>>               amdgpu_vmid_reset(adev, ring->funcs->vmhub, job->vmid);
>>           amdgpu_ring_undo(ring);
>> -        return r;
>> +        goto exit;
>>       }
>>       if (ring->funcs->insert_end)
>> @@ -304,7 +308,10 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, unsigned num_ibs,
>>           ring->funcs->emit_wave_limit(ring, false);
>>       amdgpu_ring_commit(ring);
>> -    return 0;
>> +
>> +exit:
>> +    drm_dev_exit(idx);
>> +    return r;
>>   }
>>   /**
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
>> index 9e769cf6095b..bb6afee61666 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
>> @@ -25,6 +25,7 @@
>>   #include <linux/firmware.h>
>>   #include <linux/dma-mapping.h>
>> +#include <drm/drm_drv.h>
>>   #include "amdgpu.h"
>>   #include "amdgpu_psp.h"
>> @@ -39,6 +40,8 @@
>>   #include "amdgpu_ras.h"
>>   #include "amdgpu_securedisplay.h"
>> +#include <drm/drm_drv.h>
>> +
>>   static int psp_sysfs_init(struct amdgpu_device *adev);
>>   static void psp_sysfs_fini(struct amdgpu_device *adev);
>> @@ -253,7 +256,7 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>              struct psp_gfx_cmd_resp *cmd, uint64_t fence_mc_addr)
>>   {
>>       int ret;
>> -    int index;
>> +    int index, idx;
>>       int timeout = 20000;
>>       bool ras_intr = false;
>>       bool skip_unsupport = false;
>> @@ -261,6 +264,9 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>       if (psp->adev->in_pci_err_recovery)
>>           return 0;
>> +    if (!drm_dev_enter(&psp->adev->ddev, &idx))
>> +        return 0;
>> +
>>       mutex_lock(&psp->mutex);
>>       memset(psp->cmd_buf_mem, 0, PSP_CMD_BUFFER_SIZE);
>> @@ -271,8 +277,7 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>       ret = psp_ring_cmd_submit(psp, psp->cmd_buf_mc_addr, fence_mc_addr, index);
>>       if (ret) {
>>           atomic_dec(&psp->fence_value);
>> -        mutex_unlock(&psp->mutex);
>> -        return ret;
>> +        goto exit;
>>       }
>>       amdgpu_asic_invalidate_hdp(psp->adev, NULL);
>> @@ -312,8 +317,8 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>                psp->cmd_buf_mem->cmd_id,
>>                psp->cmd_buf_mem->resp.status);
>>           if (!timeout) {
>> -            mutex_unlock(&psp->mutex);
>> -            return -EINVAL;
>> +            ret = -EINVAL;
>> +            goto exit;
>>           }
>>       }
>> @@ -321,8 +326,10 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>           ucode->tmr_mc_addr_lo = psp->cmd_buf_mem->resp.fw_addr_lo;
>>           ucode->tmr_mc_addr_hi = psp->cmd_buf_mem->resp.fw_addr_hi;
>>       }
>> -    mutex_unlock(&psp->mutex);
>> +exit:
>> +    mutex_unlock(&psp->mutex);
>> +    drm_dev_exit(idx);
>>       return ret;
>>   }
>> @@ -359,8 +366,7 @@ static int psp_load_toc(struct psp_context *psp,
>>       if (!cmd)
>>           return -ENOMEM;
>>       /* Copy toc to psp firmware private buffer */
>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>> -    memcpy(psp->fw_pri_buf, psp->toc_start_addr, psp->toc_bin_size);
>> +    psp_copy_fw(psp, psp->toc_start_addr, psp->toc_bin_size);
>>       psp_prep_load_toc_cmd_buf(cmd, psp->fw_pri_mc_addr, psp->toc_bin_size);
>> @@ -625,8 +631,7 @@ static int psp_asd_load(struct psp_context *psp)
>>       if (!cmd)
>>           return -ENOMEM;
>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>> -    memcpy(psp->fw_pri_buf, psp->asd_start_addr, psp->asd_ucode_size);
>> +    psp_copy_fw(psp, psp->asd_start_addr, psp->asd_ucode_size);
>>       psp_prep_asd_load_cmd_buf(cmd, psp->fw_pri_mc_addr,
>>                     psp->asd_ucode_size);
>> @@ -781,8 +786,7 @@ static int psp_xgmi_load(struct psp_context *psp)
>>       if (!cmd)
>>           return -ENOMEM;
>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>> -    memcpy(psp->fw_pri_buf, psp->ta_xgmi_start_addr, 
>> psp->ta_xgmi_ucode_size);
>> +    psp_copy_fw(psp, psp->ta_xgmi_start_addr, psp->ta_xgmi_ucode_size);
>>       psp_prep_ta_load_cmd_buf(cmd,
>>                    psp->fw_pri_mc_addr,
>> @@ -1038,8 +1042,7 @@ static int psp_ras_load(struct psp_context *psp)
>>       if (!cmd)
>>           return -ENOMEM;
>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>> -    memcpy(psp->fw_pri_buf, psp->ta_ras_start_addr, 
>> psp->ta_ras_ucode_size);
>> +    psp_copy_fw(psp, psp->ta_ras_start_addr, psp->ta_ras_ucode_size);
>>       psp_prep_ta_load_cmd_buf(cmd,
>>                    psp->fw_pri_mc_addr,
>> @@ -1275,8 +1278,7 @@ static int psp_hdcp_load(struct psp_context *psp)
>>       if (!cmd)
>>           return -ENOMEM;
>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>> -    memcpy(psp->fw_pri_buf, psp->ta_hdcp_start_addr,
>> +    psp_copy_fw(psp, psp->ta_hdcp_start_addr,
>>              psp->ta_hdcp_ucode_size);
>>       psp_prep_ta_load_cmd_buf(cmd,
>> @@ -1427,8 +1429,7 @@ static int psp_dtm_load(struct psp_context *psp)
>>       if (!cmd)
>>           return -ENOMEM;
>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>> -    memcpy(psp->fw_pri_buf, psp->ta_dtm_start_addr, 
>> psp->ta_dtm_ucode_size);
>> +    psp_copy_fw(psp, psp->ta_dtm_start_addr, psp->ta_dtm_ucode_size);
>>       psp_prep_ta_load_cmd_buf(cmd,
>>                    psp->fw_pri_mc_addr,
>> @@ -1573,8 +1574,7 @@ static int psp_rap_load(struct psp_context *psp)
>>       if (!cmd)
>>           return -ENOMEM;
>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>> -    memcpy(psp->fw_pri_buf, psp->ta_rap_start_addr, 
>> psp->ta_rap_ucode_size);
>> +    psp_copy_fw(psp, psp->ta_rap_start_addr, psp->ta_rap_ucode_size);
>>       psp_prep_ta_load_cmd_buf(cmd,
>>                    psp->fw_pri_mc_addr,
>> @@ -3022,7 +3022,7 @@ static ssize_t psp_usbc_pd_fw_sysfs_write(struct 
>> device *dev,
>>       struct amdgpu_device *adev = drm_to_adev(ddev);
>>       void *cpu_addr;
>>       dma_addr_t dma_addr;
>> -    int ret;
>> +    int ret, idx;
>>       char fw_name[100];
>>       const struct firmware *usbc_pd_fw;
>> @@ -3031,6 +3031,9 @@ static ssize_t psp_usbc_pd_fw_sysfs_write(struct 
>> device *dev,
>>           return -EBUSY;
>>       }
>> +    if (!drm_dev_enter(ddev, &idx))
>> +        return -ENODEV;
>> +
>>       snprintf(fw_name, sizeof(fw_name), "amdgpu/%s", buf);
>>       ret = request_firmware(&usbc_pd_fw, fw_name, adev->dev);
>>       if (ret)
>> @@ -3062,16 +3065,30 @@ static ssize_t 
>> psp_usbc_pd_fw_sysfs_write(struct device *dev,
>>   rel_buf:
>>       dma_free_coherent(adev->dev, usbc_pd_fw->size, cpu_addr, dma_addr);
>>       release_firmware(usbc_pd_fw);
>> -
>>   fail:
>>       if (ret) {
>>           DRM_ERROR("Failed to load USBC PD FW, err = %d", ret);
>> -        return ret;
>> +        count = ret;
>>       }
>> +    drm_dev_exit(idx);
>>       return count;
>>   }
>> +void psp_copy_fw(struct psp_context *psp, uint8_t *start_addr, 
>> uint32_t bin_size)
>> +{
>> +    int idx;
>> +
>> +    if (!drm_dev_enter(&psp->adev->ddev, &idx))
>> +        return;
>> +
>> +    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>> +    memcpy(psp->fw_pri_buf, start_addr, bin_size);
>> +
>> +    drm_dev_exit(idx);
>> +}
>> +
>> +
>>   static DEVICE_ATTR(usbc_pd_fw, S_IRUGO | S_IWUSR,
>>              psp_usbc_pd_fw_sysfs_read,
>>              psp_usbc_pd_fw_sysfs_write);
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
>> index 46a5328e00e0..2bfdc278817f 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
>> @@ -423,4 +423,6 @@ int psp_get_fw_attestation_records_addr(struct 
>> psp_context *psp,
>>   int psp_load_fw_list(struct psp_context *psp,
>>                struct amdgpu_firmware_info **ucode_list, int 
>> ucode_count);
>> +void psp_copy_fw(struct psp_context *psp, uint8_t *start_addr, 
>> uint32_t bin_size);
>> +
>>   #endif
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>> index 688624ebe421..e1985bc34436 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>> @@ -35,6 +35,8 @@
>>   #include "amdgpu.h"
>>   #include "atom.h"
>> +#include <drm/drm_drv.h>
>> +
>>   /*
>>    * Rings
>>    * Most engines on the GPU are fed via ring buffers.  Ring
>> @@ -461,3 +463,71 @@ int amdgpu_ring_test_helper(struct amdgpu_ring 
>> *ring)
>>       ring->sched.ready = !r;
>>       return r;
>>   }
>> +
>> +void amdgpu_ring_clear_ring(struct amdgpu_ring *ring)
>> +{
>> +    int idx;
>> +    int i = 0;
>> +
>> +    if (!drm_dev_enter(&ring->adev->ddev, &idx))
>> +        return;
>> +
>> +    while (i <= ring->buf_mask)
>> +        ring->ring[i++] = ring->funcs->nop;
>> +
>> +    drm_dev_exit(idx);
>> +
>> +}
>> +
>> +void amdgpu_ring_write(struct amdgpu_ring *ring, uint32_t v)
>> +{
>> +    int idx;
>> +
>> +    if (!drm_dev_enter(&ring->adev->ddev, &idx))
>> +        return;
>> +
>> +    if (ring->count_dw <= 0)
>> +        DRM_ERROR("amdgpu: writing more dwords to the ring than 
>> expected!\n");
>> +    ring->ring[ring->wptr++ & ring->buf_mask] = v;
>> +    ring->wptr &= ring->ptr_mask;
>> +    ring->count_dw--;
>> +
>> +    drm_dev_exit(idx);
>> +}
>> +
>> +void amdgpu_ring_write_multiple(struct amdgpu_ring *ring,
>> +                          void *src, int count_dw)
>> +{
>> +    unsigned occupied, chunk1, chunk2;
>> +    void *dst;
>> +    int idx;
>> +
>> +    if (!drm_dev_enter(&ring->adev->ddev, &idx))
>> +        return;
>> +
>> +    if (unlikely(ring->count_dw < count_dw))
>> +        DRM_ERROR("amdgpu: writing more dwords to the ring than 
>> expected!\n");
>> +
>> +    occupied = ring->wptr & ring->buf_mask;
>> +    dst = (void *)&ring->ring[occupied];
>> +    chunk1 = ring->buf_mask + 1 - occupied;
>> +    chunk1 = (chunk1 >= count_dw) ? count_dw: chunk1;
>> +    chunk2 = count_dw - chunk1;
>> +    chunk1 <<= 2;
>> +    chunk2 <<= 2;
>> +
>> +    if (chunk1)
>> +        memcpy(dst, src, chunk1);
>> +
>> +    if (chunk2) {
>> +        src += chunk1;
>> +        dst = (void *)ring->ring;
>> +        memcpy(dst, src, chunk2);
>> +    }
>> +
>> +    ring->wptr += count_dw;
>> +    ring->wptr &= ring->ptr_mask;
>> +    ring->count_dw -= count_dw;
>> +
>> +    drm_dev_exit(idx);
>> +}
> 
> The ring should never be in MMIO memory, so you can completely drop that 
> as far as I can see.

Yeah, it's all in GART, I missed it for some reason...
> 
> Maybe split that patch by use case so that we can more easily review/ack 
> it.

In fact, everything here is the same use case: once I added unmapping
of all MMIO ranges (both registers and VRAM), I got a lot of page faults
on device removal around any memcpy to/from IO. That is where I put the
drm_dev_enter/exit scope. I also searched the code and preemptively
added guards to any other such place. I did drop amdgpu_ib_schedule
from this patch because it has dma_fence_wait inside, so we will take
care of it once we decide how to handle dma_fence waits.
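
To make the guard pattern concrete, here is a rough userspace sketch of
what such a scope does around a copy. It is only an illustration, not
the kernel code: struct fake_dev, dev_enter(), dev_exit() and
guarded_copy_to_dev() are hypothetical stand-ins (the real
drm_dev_enter/drm_dev_exit are SRCU based and drm_dev_exit takes only
the index).

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a DRM device; it only tracks unplug state. */
struct fake_dev {
	bool unplugged;      /* set once the device has been removed */
	int  active_readers; /* sections currently between enter and exit */
};

/* Mimics drm_dev_enter(): refuses to open a section on a removed device. */
static bool dev_enter(struct fake_dev *dev, int *idx)
{
	if (dev->unplugged)
		return false;
	*idx = dev->active_readers++;
	return true;
}

/* Mimics drm_dev_exit(): closes the section opened by dev_enter(). */
static void dev_exit(struct fake_dev *dev, int idx)
{
	(void)idx; /* the index is unused in this simplified model */
	dev->active_readers--;
}

/*
 * Guarded copy: touches the (pretend) device memory only while the
 * device is still present. Returns the number of bytes copied, or 0
 * if the device is already gone.
 */
static size_t guarded_copy_to_dev(struct fake_dev *dev, void *dst,
				  const void *src, size_t n)
{
	int idx;

	if (!dev_enter(dev, &idx))
		return 0;

	memcpy(dst, src, n);

	dev_exit(dev, idx);
	return n;
}
```

In this model a copy issued after unplug becomes a no-op instead of a
fault, which is the behaviour the guards in the patch are aiming for.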

Andrey

> 
> Thanks,
> Christian.
> 
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>> index e7d3d0dbdd96..c67bc6d3d039 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>> @@ -299,53 +299,12 @@ static inline void 
>> amdgpu_ring_set_preempt_cond_exec(struct amdgpu_ring *ring,
>>       *ring->cond_exe_cpu_addr = cond_exec;
>>   }
>> -static inline void amdgpu_ring_clear_ring(struct amdgpu_ring *ring)
>> -{
>> -    int i = 0;
>> -    while (i <= ring->buf_mask)
>> -        ring->ring[i++] = ring->funcs->nop;
>> -
>> -}
>> -
>> -static inline void amdgpu_ring_write(struct amdgpu_ring *ring, 
>> uint32_t v)
>> -{
>> -    if (ring->count_dw <= 0)
>> -        DRM_ERROR("amdgpu: writing more dwords to the ring than 
>> expected!\n");
>> -    ring->ring[ring->wptr++ & ring->buf_mask] = v;
>> -    ring->wptr &= ring->ptr_mask;
>> -    ring->count_dw--;
>> -}
>> +void amdgpu_ring_clear_ring(struct amdgpu_ring *ring);
>> -static inline void amdgpu_ring_write_multiple(struct amdgpu_ring *ring,
>> -                          void *src, int count_dw)
>> -{
>> -    unsigned occupied, chunk1, chunk2;
>> -    void *dst;
>> -
>> -    if (unlikely(ring->count_dw < count_dw))
>> -        DRM_ERROR("amdgpu: writing more dwords to the ring than 
>> expected!\n");
>> -
>> -    occupied = ring->wptr & ring->buf_mask;
>> -    dst = (void *)&ring->ring[occupied];
>> -    chunk1 = ring->buf_mask + 1 - occupied;
>> -    chunk1 = (chunk1 >= count_dw) ? count_dw: chunk1;
>> -    chunk2 = count_dw - chunk1;
>> -    chunk1 <<= 2;
>> -    chunk2 <<= 2;
>> +void amdgpu_ring_write(struct amdgpu_ring *ring, uint32_t v);
>> -    if (chunk1)
>> -        memcpy(dst, src, chunk1);
>> -
>> -    if (chunk2) {
>> -        src += chunk1;
>> -        dst = (void *)ring->ring;
>> -        memcpy(dst, src, chunk2);
>> -    }
>> -
>> -    ring->wptr += count_dw;
>> -    ring->wptr &= ring->ptr_mask;
>> -    ring->count_dw -= count_dw;
>> -}
>> +void amdgpu_ring_write_multiple(struct amdgpu_ring *ring,
>> +                          void *src, int count_dw);
>>   int amdgpu_ring_test_helper(struct amdgpu_ring *ring);
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
>> index c6dbc0801604..82f0542c7792 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
>> @@ -32,6 +32,7 @@
>>   #include <linux/module.h>
>>   #include <drm/drm.h>
>> +#include <drm/drm_drv.h>
>>   #include "amdgpu.h"
>>   #include "amdgpu_pm.h"
>> @@ -375,7 +376,7 @@ int amdgpu_uvd_suspend(struct amdgpu_device *adev)
>>   {
>>       unsigned size;
>>       void *ptr;
>> -    int i, j;
>> +    int i, j, idx;
>>       bool in_ras_intr = amdgpu_ras_intr_triggered();
>>       cancel_delayed_work_sync(&adev->uvd.idle_work);
>> @@ -403,11 +404,15 @@ int amdgpu_uvd_suspend(struct amdgpu_device *adev)
>>           if (!adev->uvd.inst[j].saved_bo)
>>               return -ENOMEM;
>> -        /* re-write 0 since err_event_athub will corrupt VCPU buffer */
>> -        if (in_ras_intr)
>> -            memset(adev->uvd.inst[j].saved_bo, 0, size);
>> -        else
>> -            memcpy_fromio(adev->uvd.inst[j].saved_bo, ptr, size);
>> +        if (drm_dev_enter(&adev->ddev, &idx)) {
>> +            /* re-write 0 since err_event_athub will corrupt VCPU 
>> buffer */
>> +            if (in_ras_intr)
>> +                memset(adev->uvd.inst[j].saved_bo, 0, size);
>> +            else
>> +                memcpy_fromio(adev->uvd.inst[j].saved_bo, ptr, size);
>> +
>> +            drm_dev_exit(idx);
>> +        }
>>       }
>>       if (in_ras_intr)
>> @@ -420,7 +425,7 @@ int amdgpu_uvd_resume(struct amdgpu_device *adev)
>>   {
>>       unsigned size;
>>       void *ptr;
>> -    int i;
>> +    int i, idx;
>>       for (i = 0; i < adev->uvd.num_uvd_inst; i++) {
>>           if (adev->uvd.harvest_config & (1 << i))
>> @@ -432,7 +437,10 @@ int amdgpu_uvd_resume(struct amdgpu_device *adev)
>>           ptr = adev->uvd.inst[i].cpu_addr;
>>           if (adev->uvd.inst[i].saved_bo != NULL) {
>> -            memcpy_toio(ptr, adev->uvd.inst[i].saved_bo, size);
>> +            if (drm_dev_enter(&adev->ddev, &idx)) {
>> +                memcpy_toio(ptr, adev->uvd.inst[i].saved_bo, size);
>> +                drm_dev_exit(idx);
>> +            }
>>               kvfree(adev->uvd.inst[i].saved_bo);
>>               adev->uvd.inst[i].saved_bo = NULL;
>>           } else {
>> @@ -442,8 +450,11 @@ int amdgpu_uvd_resume(struct amdgpu_device *adev)
>>               hdr = (const struct common_firmware_header 
>> *)adev->uvd.fw->data;
>>               if (adev->firmware.load_type != AMDGPU_FW_LOAD_PSP) {
>>                   offset = le32_to_cpu(hdr->ucode_array_offset_bytes);
>> -                memcpy_toio(adev->uvd.inst[i].cpu_addr, 
>> adev->uvd.fw->data + offset,
>> -                        le32_to_cpu(hdr->ucode_size_bytes));
>> +                if (drm_dev_enter(&adev->ddev, &idx)) {
>> +                    memcpy_toio(adev->uvd.inst[i].cpu_addr, 
>> adev->uvd.fw->data + offset,
>> +                            le32_to_cpu(hdr->ucode_size_bytes));
>> +                    drm_dev_exit(idx);
>> +                }
>>                   size -= le32_to_cpu(hdr->ucode_size_bytes);
>>                   ptr += le32_to_cpu(hdr->ucode_size_bytes);
>>               }
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
>> index ea6a62f67e38..833203401ef4 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
>> @@ -29,6 +29,7 @@
>>   #include <linux/module.h>
>>   #include <drm/drm.h>
>> +#include <drm/drm_drv.h>
>>   #include "amdgpu.h"
>>   #include "amdgpu_pm.h"
>> @@ -293,7 +294,7 @@ int amdgpu_vce_resume(struct amdgpu_device *adev)
>>       void *cpu_addr;
>>       const struct common_firmware_header *hdr;
>>       unsigned offset;
>> -    int r;
>> +    int r, idx;
>>       if (adev->vce.vcpu_bo == NULL)
>>           return -EINVAL;
>> @@ -313,8 +314,12 @@ int amdgpu_vce_resume(struct amdgpu_device *adev)
>>       hdr = (const struct common_firmware_header *)adev->vce.fw->data;
>>       offset = le32_to_cpu(hdr->ucode_array_offset_bytes);
>> -    memcpy_toio(cpu_addr, adev->vce.fw->data + offset,
>> -            adev->vce.fw->size - offset);
>> +
>> +    if (drm_dev_enter(&adev->ddev, &idx)) {
>> +        memcpy_toio(cpu_addr, adev->vce.fw->data + offset,
>> +                adev->vce.fw->size - offset);
>> +        drm_dev_exit(idx);
>> +    }
>>       amdgpu_bo_kunmap(adev->vce.vcpu_bo);
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
>> index 201645963ba5..21f7d3644d70 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
>> @@ -27,6 +27,7 @@
>>   #include <linux/firmware.h>
>>   #include <linux/module.h>
>>   #include <linux/pci.h>
>> +#include <drm/drm_drv.h>
>>   #include "amdgpu.h"
>>   #include "amdgpu_pm.h"
>> @@ -275,7 +276,7 @@ int amdgpu_vcn_suspend(struct amdgpu_device *adev)
>>   {
>>       unsigned size;
>>       void *ptr;
>> -    int i;
>> +    int i, idx;
>>       cancel_delayed_work_sync(&adev->vcn.idle_work);
>> @@ -292,7 +293,10 @@ int amdgpu_vcn_suspend(struct amdgpu_device *adev)
>>           if (!adev->vcn.inst[i].saved_bo)
>>               return -ENOMEM;
>> -        memcpy_fromio(adev->vcn.inst[i].saved_bo, ptr, size);
>> +        if (drm_dev_enter(&adev->ddev, &idx)) {
>> +            memcpy_fromio(adev->vcn.inst[i].saved_bo, ptr, size);
>> +            drm_dev_exit(idx);
>> +        }
>>       }
>>       return 0;
>>   }
>> @@ -301,7 +305,7 @@ int amdgpu_vcn_resume(struct amdgpu_device *adev)
>>   {
>>       unsigned size;
>>       void *ptr;
>> -    int i;
>> +    int i, idx;
>>       for (i = 0; i < adev->vcn.num_vcn_inst; ++i) {
>>           if (adev->vcn.harvest_config & (1 << i))
>> @@ -313,7 +317,10 @@ int amdgpu_vcn_resume(struct amdgpu_device *adev)
>>           ptr = adev->vcn.inst[i].cpu_addr;
>>           if (adev->vcn.inst[i].saved_bo != NULL) {
>> -            memcpy_toio(ptr, adev->vcn.inst[i].saved_bo, size);
>> +            if (drm_dev_enter(&adev->ddev, &idx)) {
>> +                memcpy_toio(ptr, adev->vcn.inst[i].saved_bo, size);
>> +                drm_dev_exit(idx);
>> +            }
>>               kvfree(adev->vcn.inst[i].saved_bo);
>>               adev->vcn.inst[i].saved_bo = NULL;
>>           } else {
>> @@ -323,8 +330,11 @@ int amdgpu_vcn_resume(struct amdgpu_device *adev)
>>               hdr = (const struct common_firmware_header 
>> *)adev->vcn.fw->data;
>>               if (adev->firmware.load_type != AMDGPU_FW_LOAD_PSP) {
>>                   offset = le32_to_cpu(hdr->ucode_array_offset_bytes);
>> -                memcpy_toio(adev->vcn.inst[i].cpu_addr, 
>> adev->vcn.fw->data + offset,
>> -                        le32_to_cpu(hdr->ucode_size_bytes));
>> +                if (drm_dev_enter(&adev->ddev, &idx)) {
>> +                    memcpy_toio(adev->vcn.inst[i].cpu_addr, 
>> adev->vcn.fw->data + offset,
>> +                            le32_to_cpu(hdr->ucode_size_bytes));
>> +                    drm_dev_exit(idx);
>> +                }
>>                   size -= le32_to_cpu(hdr->ucode_size_bytes);
>>                   ptr += le32_to_cpu(hdr->ucode_size_bytes);
>>               }
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> index 9f868cf3b832..7dd5f10ab570 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> @@ -32,6 +32,7 @@
>>   #include <linux/dma-buf.h>
>>   #include <drm/amdgpu_drm.h>
>> +#include <drm/drm_drv.h>
>>   #include "amdgpu.h"
>>   #include "amdgpu_trace.h"
>>   #include "amdgpu_amdkfd.h"
>> @@ -1606,7 +1607,10 @@ static int amdgpu_vm_bo_update_mapping(struct 
>> amdgpu_device *adev,
>>       struct amdgpu_vm_update_params params;
>>       enum amdgpu_sync_mode sync_mode;
>>       uint64_t pfn;
>> -    int r;
>> +    int r, idx;
>> +
>> +    if (!drm_dev_enter(&adev->ddev, &idx))
>> +        return -ENODEV;
>>       memset(&params, 0, sizeof(params));
>>       params.adev = adev;
>> @@ -1715,6 +1719,7 @@ static int amdgpu_vm_bo_update_mapping(struct 
>> amdgpu_device *adev,
>>   error_unlock:
>>       amdgpu_vm_eviction_unlock(vm);
>> +    drm_dev_exit(idx);
>>       return r;
>>   }
>> diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c 
>> b/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
>> index 589410c32d09..2cec71e823f5 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
>> @@ -23,6 +23,7 @@
>>   #include <linux/firmware.h>
>>   #include <linux/module.h>
>>   #include <linux/vmalloc.h>
>> +#include <drm/drm_drv.h>
>>   #include "amdgpu.h"
>>   #include "amdgpu_psp.h"
>> @@ -269,10 +270,8 @@ static int psp_v11_0_bootloader_load_kdb(struct 
>> psp_context *psp)
>>       if (ret)
>>           return ret;
>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>> -
>>       /* Copy PSP KDB binary to memory */
>> -    memcpy(psp->fw_pri_buf, psp->kdb_start_addr, psp->kdb_bin_size);
>> +    psp_copy_fw(psp, psp->kdb_start_addr, psp->kdb_bin_size);
>>       /* Provide the PSP KDB to bootloader */
>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>> @@ -302,10 +301,8 @@ static int psp_v11_0_bootloader_load_spl(struct 
>> psp_context *psp)
>>       if (ret)
>>           return ret;
>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>> -
>>       /* Copy PSP SPL binary to memory */
>> -    memcpy(psp->fw_pri_buf, psp->spl_start_addr, psp->spl_bin_size);
>> +    psp_copy_fw(psp, psp->spl_start_addr, psp->spl_bin_size);
>>       /* Provide the PSP SPL to bootloader */
>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>> @@ -335,10 +332,8 @@ static int 
>> psp_v11_0_bootloader_load_sysdrv(struct psp_context *psp)
>>       if (ret)
>>           return ret;
>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>> -
>>       /* Copy PSP System Driver binary to memory */
>> -    memcpy(psp->fw_pri_buf, psp->sys_start_addr, psp->sys_bin_size);
>> +    psp_copy_fw(psp, psp->sys_start_addr, psp->sys_bin_size);
>>       /* Provide the sys driver to bootloader */
>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>> @@ -371,10 +366,8 @@ static int psp_v11_0_bootloader_load_sos(struct 
>> psp_context *psp)
>>       if (ret)
>>           return ret;
>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>> -
>>       /* Copy Secure OS binary to PSP memory */
>> -    memcpy(psp->fw_pri_buf, psp->sos_start_addr, psp->sos_bin_size);
>> +    psp_copy_fw(psp, psp->sos_start_addr, psp->sos_bin_size);
>>       /* Provide the PSP secure OS to bootloader */
>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>> @@ -608,7 +601,7 @@ static int psp_v11_0_memory_training(struct 
>> psp_context *psp, uint32_t ops)
>>       uint32_t p2c_header[4];
>>       uint32_t sz;
>>       void *buf;
>> -    int ret;
>> +    int ret, idx;
>>       if (ctx->init == PSP_MEM_TRAIN_NOT_SUPPORT) {
>>           DRM_DEBUG("Memory training is not supported.\n");
>> @@ -681,17 +674,24 @@ static int psp_v11_0_memory_training(struct 
>> psp_context *psp, uint32_t ops)
>>               return -ENOMEM;
>>           }
>> -        memcpy_fromio(buf, adev->mman.aper_base_kaddr, sz);
>> -        ret = psp_v11_0_memory_training_send_msg(psp, 
>> PSP_BL__DRAM_LONG_TRAIN);
>> -        if (ret) {
>> -            DRM_ERROR("Send long training msg failed.\n");
>> +        if (drm_dev_enter(&adev->ddev, &idx)) {
>> +            memcpy_fromio(buf, adev->mman.aper_base_kaddr, sz);
>> +            ret = psp_v11_0_memory_training_send_msg(psp, 
>> PSP_BL__DRAM_LONG_TRAIN);
>> +            if (ret) {
>> +                DRM_ERROR("Send long training msg failed.\n");
>> +                vfree(buf);
>> +                drm_dev_exit(idx);
>> +                return ret;
>> +            }
>> +
>> +            memcpy_toio(adev->mman.aper_base_kaddr, buf, sz);
>> +            adev->hdp.funcs->flush_hdp(adev, NULL);
>>               vfree(buf);
>> -            return ret;
>> +            drm_dev_exit(idx);
>> +        } else {
>> +            vfree(buf);
>> +            return -ENODEV;
>>           }
>> -
>> -        memcpy_toio(adev->mman.aper_base_kaddr, buf, sz);
>> -        adev->hdp.funcs->flush_hdp(adev, NULL);
>> -        vfree(buf);
>>       }
>>       if (ops & PSP_MEM_TRAIN_SAVE) {
>> diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c 
>> b/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
>> index c4828bd3264b..618e5b6b85d9 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
>> @@ -138,10 +138,8 @@ static int 
>> psp_v12_0_bootloader_load_sysdrv(struct psp_context *psp)
>>       if (ret)
>>           return ret;
>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>> -
>>       /* Copy PSP System Driver binary to memory */
>> -    memcpy(psp->fw_pri_buf, psp->sys_start_addr, psp->sys_bin_size);
>> +    psp_copy_fw(psp, psp->sys_start_addr, psp->sys_bin_size);
>>       /* Provide the sys driver to bootloader */
>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>> @@ -179,10 +177,8 @@ static int psp_v12_0_bootloader_load_sos(struct 
>> psp_context *psp)
>>       if (ret)
>>           return ret;
>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>> -
>>       /* Copy Secure OS binary to PSP memory */
>> -    memcpy(psp->fw_pri_buf, psp->sos_start_addr, psp->sos_bin_size);
>> +    psp_copy_fw(psp, psp->sos_start_addr, psp->sos_bin_size);
>>       /* Provide the PSP secure OS to bootloader */
>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>> diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c 
>> b/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
>> index f2e725f72d2f..d0a6cccd0897 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
>> @@ -102,10 +102,8 @@ static int psp_v3_1_bootloader_load_sysdrv(struct 
>> psp_context *psp)
>>       if (ret)
>>           return ret;
>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>> -
>>       /* Copy PSP System Driver binary to memory */
>> -    memcpy(psp->fw_pri_buf, psp->sys_start_addr, psp->sys_bin_size);
>> +    psp_copy_fw(psp, psp->sys_start_addr, psp->sys_bin_size);
>>       /* Provide the sys driver to bootloader */
>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>> @@ -143,10 +141,8 @@ static int psp_v3_1_bootloader_load_sos(struct 
>> psp_context *psp)
>>       if (ret)
>>           return ret;
>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>> -
>>       /* Copy Secure OS binary to PSP memory */
>> -    memcpy(psp->fw_pri_buf, psp->sos_start_addr, psp->sos_bin_size);
>> +    psp_copy_fw(psp, psp->sos_start_addr, psp->sos_bin_size);
>>       /* Provide the PSP secure OS to bootloader */
>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>> diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c 
>> b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
>> index 8e238dea7bef..90910d19db12 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
>> @@ -25,6 +25,7 @@
>>    */
>>   #include <linux/firmware.h>
>> +#include <drm/drm_drv.h>
>>   #include "amdgpu.h"
>>   #include "amdgpu_vce.h"
>> @@ -555,16 +556,19 @@ static int vce_v4_0_hw_fini(void *handle)
>>   static int vce_v4_0_suspend(void *handle)
>>   {
>>       struct amdgpu_device *adev = (struct amdgpu_device *)handle;
>> -    int r;
>> +    int r, idx;
>>       if (adev->vce.vcpu_bo == NULL)
>>           return 0;
>> -    if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) {
>> -        unsigned size = amdgpu_bo_size(adev->vce.vcpu_bo);
>> -        void *ptr = adev->vce.cpu_addr;
>> +    if (drm_dev_enter(&adev->ddev, &idx)) {
>> +        if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) {
>> +            unsigned size = amdgpu_bo_size(adev->vce.vcpu_bo);
>> +            void *ptr = adev->vce.cpu_addr;
>> -        memcpy_fromio(adev->vce.saved_bo, ptr, size);
>> +            memcpy_fromio(adev->vce.saved_bo, ptr, size);
>> +        }
>> +        drm_dev_exit(idx);
>>       }
>>       r = vce_v4_0_hw_fini(adev);
>> @@ -577,16 +581,20 @@ static int vce_v4_0_suspend(void *handle)
>>   static int vce_v4_0_resume(void *handle)
>>   {
>>       struct amdgpu_device *adev = (struct amdgpu_device *)handle;
>> -    int r;
>> +    int r, idx;
>>       if (adev->vce.vcpu_bo == NULL)
>>           return -EINVAL;
>>       if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) {
>> -        unsigned size = amdgpu_bo_size(adev->vce.vcpu_bo);
>> -        void *ptr = adev->vce.cpu_addr;
>> -        memcpy_toio(ptr, adev->vce.saved_bo, size);
>> +        if (drm_dev_enter(&adev->ddev, &idx)) {
>> +            unsigned size = amdgpu_bo_size(adev->vce.vcpu_bo);
>> +            void *ptr = adev->vce.cpu_addr;
>> +
>> +            memcpy_toio(ptr, adev->vce.saved_bo, size);
>> +            drm_dev_exit(idx);
>> +        }
>>       } else {
>>           r = amdgpu_vce_resume(adev);
>>           if (r)
>> diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c 
>> b/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
>> index 3f15bf34123a..df34be8ec82d 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
>> @@ -34,6 +34,8 @@
>>   #include "vcn/vcn_3_0_0_sh_mask.h"
>>   #include "ivsrcid/vcn/irqsrcs_vcn_2_0.h"
>> +#include <drm/drm_drv.h>
>> +
>>   #define mmUVD_CONTEXT_ID_INTERNAL_OFFSET            0x27
>>   #define mmUVD_GPCOM_VCPU_CMD_INTERNAL_OFFSET            0x0f
>>   #define mmUVD_GPCOM_VCPU_DATA0_INTERNAL_OFFSET            0x10
>> @@ -268,16 +270,20 @@ static int vcn_v3_0_sw_init(void *handle)
>>   static int vcn_v3_0_sw_fini(void *handle)
>>   {
>>       struct amdgpu_device *adev = (struct amdgpu_device *)handle;
>> -    int i, r;
>> +    int i, r, idx;
>> -    for (i = 0; i < adev->vcn.num_vcn_inst; i++) {
>> -        volatile struct amdgpu_fw_shared *fw_shared;
>> +    if (drm_dev_enter(&adev->ddev, &idx)) {
>> +        for (i = 0; i < adev->vcn.num_vcn_inst; i++) {
>> +            volatile struct amdgpu_fw_shared *fw_shared;
>> -        if (adev->vcn.harvest_config & (1 << i))
>> -            continue;
>> -        fw_shared = adev->vcn.inst[i].fw_shared_cpu_addr;
>> -        fw_shared->present_flag_0 = 0;
>> -        fw_shared->sw_ring.is_enabled = false;
>> +            if (adev->vcn.harvest_config & (1 << i))
>> +                continue;
>> +            fw_shared = adev->vcn.inst[i].fw_shared_cpu_addr;
>> +            fw_shared->present_flag_0 = 0;
>> +            fw_shared->sw_ring.is_enabled = false;
>> +        }
>> +
>> +        drm_dev_exit(idx);
>>       }
>>       if (amdgpu_sriov_vf(adev))
>> diff --git a/drivers/gpu/drm/amd/pm/powerplay/smumgr/smu7_smumgr.c 
>> b/drivers/gpu/drm/amd/pm/powerplay/smumgr/smu7_smumgr.c
>> index aae25243eb10..d628b91846c9 100644
>> --- a/drivers/gpu/drm/amd/pm/powerplay/smumgr/smu7_smumgr.c
>> +++ b/drivers/gpu/drm/amd/pm/powerplay/smumgr/smu7_smumgr.c
>> @@ -405,6 +405,8 @@ int smu7_request_smu_load_fw(struct pp_hwmgr *hwmgr)
>>                   UCODE_ID_MEC_STORAGE, &toc->entry[toc->num_entries++]),
>>                   "Failed to Get Firmware Entry.", r = -EINVAL; goto 
>> failed);
>>       }
>> +
>> +    /* AG TODO Can't call drm_dev_enter/exit because access 
>> adev->ddev here ... */
>>       memcpy_toio(smu_data->header_buffer.kaddr, smu_data->toc,
>>               sizeof(struct SMU_DRAMData_TOC));
>>       smum_send_msg_to_smc_with_parameter(hwmgr,
> 


* Re: [PATCH v6 10/16] drm/amdgpu: Guard against write accesses after device removal
@ 2021-05-11 17:52       ` Andrey Grodzovsky
  0 siblings, 0 replies; 126+ messages in thread
From: Andrey Grodzovsky @ 2021-05-11 17:52 UTC (permalink / raw)
  To: Christian König, dri-devel, amd-gfx, linux-pci,
	daniel.vetter, Harry.Wentland
  Cc: Alexander.Deucher, gregkh, helgaas, Felix.Kuehling



On 2021-05-11 2:50 a.m., Christian König wrote:
> Am 10.05.21 um 18:36 schrieb Andrey Grodzovsky:
>> This should prevent writing to memory or IO ranges possibly
>> already allocated for other uses after our device is removed.
>>
>> v5:
>> Protect more places where memcpy_toio/memcpy_fromio take place
>> Protect IB submissions
>>
>> v6: Switch to !drm_dev_enter instead of scoping entire code
>> with brackets.
>>
>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c    | 11 ++-
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c       |  9 +++
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c        | 17 +++--
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c       | 63 +++++++++++------
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h       |  2 +
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c      | 70 +++++++++++++++++++
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h      | 49 ++-----------
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c       | 31 +++++---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c       | 11 ++-
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c       | 22 ++++--
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c        |  7 +-
>>   drivers/gpu/drm/amd/amdgpu/psp_v11_0.c        | 44 ++++++------
>>   drivers/gpu/drm/amd/amdgpu/psp_v12_0.c        |  8 +--
>>   drivers/gpu/drm/amd/amdgpu/psp_v3_1.c         |  8 +--
>>   drivers/gpu/drm/amd/amdgpu/vce_v4_0.c         | 26 ++++---
>>   drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c         | 22 +++---
>>   .../drm/amd/pm/powerplay/smumgr/smu7_smumgr.c |  2 +
>>   17 files changed, 257 insertions(+), 145 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> index a0bff4713672..94c415176cdc 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> @@ -71,6 +71,8 @@
>>   #include <drm/task_barrier.h>
>>   #include <linux/pm_runtime.h>
>> +#include <drm/drm_drv.h>
>> +
>>   MODULE_FIRMWARE("amdgpu/vega10_gpu_info.bin");
>>   MODULE_FIRMWARE("amdgpu/vega12_gpu_info.bin");
>>   MODULE_FIRMWARE("amdgpu/raven_gpu_info.bin");
>> @@ -281,7 +283,10 @@ void amdgpu_device_vram_access(struct 
>> amdgpu_device *adev, loff_t pos,
>>       unsigned long flags;
>>       uint32_t hi = ~0;
>>       uint64_t last;
>> +    int idx;
>> +    if (!drm_dev_enter(&adev->ddev, &idx))
>> +        return;
>>   #ifdef CONFIG_64BIT
>>       last = min(pos + size, adev->gmc.visible_vram_size);
>> @@ -299,8 +304,10 @@ void amdgpu_device_vram_access(struct 
>> amdgpu_device *adev, loff_t pos,
>>               memcpy_fromio(buf, addr, count);
>>           }
>> -        if (count == size)
>> +        if (count == size) {
>> +            drm_dev_exit(idx);
>>               return;
>> +        }
> 
> Maybe use a goto instead, but that's really just a nitpick.
> 
> 
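For illustration, the goto variant suggested above could look roughly like the sketch below. It is a user-space mock, not the driver code: mock_dev, mock_dev_enter(), mock_dev_exit() and mock_vram_read() are hypothetical stand-ins for the drm_device, drm_dev_enter(), drm_dev_exit() and amdgpu_device_vram_access(), and the byte-wise loop only stands in for the slower register-based path. The point is that every path after a successful enter leaves through one label, so the exit call cannot be forgotten on an early return.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical user-space stand-in for a DRM device. */
struct mock_dev {
	bool unplugged;  /* set once the device is gone */
	int sections;    /* open enter/exit sections, used to check balance */
};

/* Mock of drm_dev_enter(): fails once the device is unplugged. */
static bool mock_dev_enter(struct mock_dev *dev, int *idx)
{
	if (dev->unplugged)
		return false;
	*idx = dev->sections++;
	return true;
}

/* Mock of drm_dev_exit(): closes the section opened by mock_dev_enter(). */
static void mock_dev_exit(struct mock_dev *dev, int idx)
{
	(void)idx;
	dev->sections--;
}

/*
 * Copy 'size' bytes. The first 'visible' bytes take the fast path (one
 * memcpy); any remainder takes a slower byte-wise path. Returns the
 * number of bytes copied, or 0 if the device is unplugged.
 */
static size_t mock_vram_read(struct mock_dev *dev, char *dst,
			     const char *src, size_t size, size_t visible)
{
	size_t count, i;
	int idx;

	assert(dev && dst && src);

	if (!mock_dev_enter(dev, &idx))
		return 0;

	count = size < visible ? size : visible;
	memcpy(dst, src, count);
	if (count == size)
		goto exit;  /* fast path covered the whole request */

	for (i = count; i < size; i++)
		dst[i] = src[i];
	count = size;

exit:
	mock_dev_exit(dev, idx);  /* single exit point for all paths */
	return count;
}
```

With the label in place, adding another early-out later only needs a "goto exit" rather than a duplicated exit call.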
> 
>>           pos += count;
>>           buf += count / 4;
>> @@ -323,6 +330,8 @@ void amdgpu_device_vram_access(struct 
>> amdgpu_device *adev, loff_t pos,
>>               *buf++ = RREG32_NO_KIQ(mmMM_DATA);
>>       }
>>       spin_unlock_irqrestore(&adev->mmio_idx_lock, flags);
>> +
>> +    drm_dev_exit(idx);
>>   }
>>   /*
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
>> index 4d32233cde92..04ba5eef1e88 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
>> @@ -31,6 +31,8 @@
>>   #include "amdgpu_ras.h"
>>   #include "amdgpu_xgmi.h"
>> +#include <drm/drm_drv.h>
>> +
>>   /**
>>    * amdgpu_gmc_pdb0_alloc - allocate vram for pdb0
>>    *
>> @@ -151,6 +153,10 @@ int amdgpu_gmc_set_pte_pde(struct amdgpu_device 
>> *adev, void *cpu_pt_addr,
>>   {
>>       void __iomem *ptr = (void *)cpu_pt_addr;
>>       uint64_t value;
>> +    int idx;
>> +
>> +    if (!drm_dev_enter(&adev->ddev, &idx))
>> +        return 0;
>>       /*
>>        * The following is for PTE only. GART does not have PDEs.
>> @@ -158,6 +164,9 @@ int amdgpu_gmc_set_pte_pde(struct amdgpu_device 
>> *adev, void *cpu_pt_addr,
>>       value = addr & 0x0000FFFFFFFFF000ULL;
>>       value |= flags;
>>       writeq(value, ptr + (gpu_page_idx * 8));
>> +
>> +    drm_dev_exit(idx);
>> +
>>       return 0;
>>   }
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
>> index 148a3b481b12..62fcbd446c71 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
>> @@ -30,6 +30,7 @@
>>   #include <linux/slab.h>
>>   #include <drm/amdgpu_drm.h>
>> +#include <drm/drm_drv.h>
>>   #include "amdgpu.h"
>>   #include "atom.h"
>> @@ -137,7 +138,7 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, 
>> unsigned num_ibs,
>>       bool secure;
>>       unsigned i;
>> -    int r = 0;
>> +    int idx, r = 0;
>>       bool need_pipe_sync = false;
>>       if (num_ibs == 0)
>> @@ -169,13 +170,16 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, 
>> unsigned num_ibs,
>>           return -EINVAL;
>>       }
>> +    if (!drm_dev_enter(&adev->ddev, &idx))
>> +        return -ENODEV;
>> +
>>       alloc_size = ring->funcs->emit_frame_size + num_ibs *
>>           ring->funcs->emit_ib_size;
>>       r = amdgpu_ring_alloc(ring, alloc_size);
>>       if (r) {
>>           dev_err(adev->dev, "scheduling IB failed (%d).\n", r);
>> -        return r;
>> +        goto exit;
>>       }
>>       need_ctx_switch = ring->current_ctx != fence_ctx;
>> @@ -205,7 +209,7 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, 
>> unsigned num_ibs,
>>           r = amdgpu_vm_flush(ring, job, need_pipe_sync);
>>           if (r) {
>>               amdgpu_ring_undo(ring);
>> -            return r;
>> +            goto exit;
>>           }
>>       }
>> @@ -286,7 +290,7 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, 
>> unsigned num_ibs,
>>           if (job && job->vmid)
>>               amdgpu_vmid_reset(adev, ring->funcs->vmhub, job->vmid);
>>           amdgpu_ring_undo(ring);
>> -        return r;
>> +        goto exit;
>>       }
>>       if (ring->funcs->insert_end)
>> @@ -304,7 +308,10 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, 
>> unsigned num_ibs,
>>           ring->funcs->emit_wave_limit(ring, false);
>>       amdgpu_ring_commit(ring);
>> -    return 0;
>> +
>> +exit:
>> +    drm_dev_exit(idx);
>> +    return r;
>>   }
>>   /**
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
>> index 9e769cf6095b..bb6afee61666 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
>> @@ -25,6 +25,7 @@
>>   #include <linux/firmware.h>
>>   #include <linux/dma-mapping.h>
>> +#include <drm/drm_drv.h>
>>   #include "amdgpu.h"
>>   #include "amdgpu_psp.h"
>> @@ -39,6 +40,8 @@
>>   #include "amdgpu_ras.h"
>>   #include "amdgpu_securedisplay.h"
>> +#include <drm/drm_drv.h>
>> +
>>   static int psp_sysfs_init(struct amdgpu_device *adev);
>>   static void psp_sysfs_fini(struct amdgpu_device *adev);
>> @@ -253,7 +256,7 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>              struct psp_gfx_cmd_resp *cmd, uint64_t fence_mc_addr)
>>   {
>>       int ret;
>> -    int index;
>> +    int index, idx;
>>       int timeout = 20000;
>>       bool ras_intr = false;
>>       bool skip_unsupport = false;
>> @@ -261,6 +264,9 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>       if (psp->adev->in_pci_err_recovery)
>>           return 0;
>> +    if (!drm_dev_enter(&psp->adev->ddev, &idx))
>> +        return 0;
>> +
>>       mutex_lock(&psp->mutex);
>>       memset(psp->cmd_buf_mem, 0, PSP_CMD_BUFFER_SIZE);
>> @@ -271,8 +277,7 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>       ret = psp_ring_cmd_submit(psp, psp->cmd_buf_mc_addr, 
>> fence_mc_addr, index);
>>       if (ret) {
>>           atomic_dec(&psp->fence_value);
>> -        mutex_unlock(&psp->mutex);
>> -        return ret;
>> +        goto exit;
>>       }
>>       amdgpu_asic_invalidate_hdp(psp->adev, NULL);
>> @@ -312,8 +317,8 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>                psp->cmd_buf_mem->cmd_id,
>>                psp->cmd_buf_mem->resp.status);
>>           if (!timeout) {
>> -            mutex_unlock(&psp->mutex);
>> -            return -EINVAL;
>> +            ret = -EINVAL;
>> +            goto exit;
>>           }
>>       }
>> @@ -321,8 +326,10 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>           ucode->tmr_mc_addr_lo = psp->cmd_buf_mem->resp.fw_addr_lo;
>>           ucode->tmr_mc_addr_hi = psp->cmd_buf_mem->resp.fw_addr_hi;
>>       }
>> -    mutex_unlock(&psp->mutex);
>> +exit:
>> +    mutex_unlock(&psp->mutex);
>> +    drm_dev_exit(idx);
>>       return ret;
>>   }
>> @@ -359,8 +366,7 @@ static int psp_load_toc(struct psp_context *psp,
>>       if (!cmd)
>>           return -ENOMEM;
>>       /* Copy toc to psp firmware private buffer */
>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>> -    memcpy(psp->fw_pri_buf, psp->toc_start_addr, psp->toc_bin_size);
>> +    psp_copy_fw(psp, psp->toc_start_addr, psp->toc_bin_size);
>>       psp_prep_load_toc_cmd_buf(cmd, psp->fw_pri_mc_addr, 
>> psp->toc_bin_size);
>> @@ -625,8 +631,7 @@ static int psp_asd_load(struct psp_context *psp)
>>       if (!cmd)
>>           return -ENOMEM;
>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>> -    memcpy(psp->fw_pri_buf, psp->asd_start_addr, psp->asd_ucode_size);
>> +    psp_copy_fw(psp, psp->asd_start_addr, psp->asd_ucode_size);
>>       psp_prep_asd_load_cmd_buf(cmd, psp->fw_pri_mc_addr,
>>                     psp->asd_ucode_size);
>> @@ -781,8 +786,7 @@ static int psp_xgmi_load(struct psp_context *psp)
>>       if (!cmd)
>>           return -ENOMEM;
>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>> -    memcpy(psp->fw_pri_buf, psp->ta_xgmi_start_addr, 
>> psp->ta_xgmi_ucode_size);
>> +    psp_copy_fw(psp, psp->ta_xgmi_start_addr, psp->ta_xgmi_ucode_size);
>>       psp_prep_ta_load_cmd_buf(cmd,
>>                    psp->fw_pri_mc_addr,
>> @@ -1038,8 +1042,7 @@ static int psp_ras_load(struct psp_context *psp)
>>       if (!cmd)
>>           return -ENOMEM;
>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>> -    memcpy(psp->fw_pri_buf, psp->ta_ras_start_addr, 
>> psp->ta_ras_ucode_size);
>> +    psp_copy_fw(psp, psp->ta_ras_start_addr, psp->ta_ras_ucode_size);
>>       psp_prep_ta_load_cmd_buf(cmd,
>>                    psp->fw_pri_mc_addr,
>> @@ -1275,8 +1278,7 @@ static int psp_hdcp_load(struct psp_context *psp)
>>       if (!cmd)
>>           return -ENOMEM;
>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>> -    memcpy(psp->fw_pri_buf, psp->ta_hdcp_start_addr,
>> +    psp_copy_fw(psp, psp->ta_hdcp_start_addr,
>>              psp->ta_hdcp_ucode_size);
>>       psp_prep_ta_load_cmd_buf(cmd,
>> @@ -1427,8 +1429,7 @@ static int psp_dtm_load(struct psp_context *psp)
>>       if (!cmd)
>>           return -ENOMEM;
>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>> -    memcpy(psp->fw_pri_buf, psp->ta_dtm_start_addr, 
>> psp->ta_dtm_ucode_size);
>> +    psp_copy_fw(psp, psp->ta_dtm_start_addr, psp->ta_dtm_ucode_size);
>>       psp_prep_ta_load_cmd_buf(cmd,
>>                    psp->fw_pri_mc_addr,
>> @@ -1573,8 +1574,7 @@ static int psp_rap_load(struct psp_context *psp)
>>       if (!cmd)
>>           return -ENOMEM;
>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>> -    memcpy(psp->fw_pri_buf, psp->ta_rap_start_addr, 
>> psp->ta_rap_ucode_size);
>> +    psp_copy_fw(psp, psp->ta_rap_start_addr, psp->ta_rap_ucode_size);
>>       psp_prep_ta_load_cmd_buf(cmd,
>>                    psp->fw_pri_mc_addr,
>> @@ -3022,7 +3022,7 @@ static ssize_t psp_usbc_pd_fw_sysfs_write(struct 
>> device *dev,
>>       struct amdgpu_device *adev = drm_to_adev(ddev);
>>       void *cpu_addr;
>>       dma_addr_t dma_addr;
>> -    int ret;
>> +    int ret, idx;
>>       char fw_name[100];
>>       const struct firmware *usbc_pd_fw;
>> @@ -3031,6 +3031,9 @@ static ssize_t psp_usbc_pd_fw_sysfs_write(struct 
>> device *dev,
>>           return -EBUSY;
>>       }
>> +    if (!drm_dev_enter(ddev, &idx))
>> +        return -ENODEV;
>> +
>>       snprintf(fw_name, sizeof(fw_name), "amdgpu/%s", buf);
>>       ret = request_firmware(&usbc_pd_fw, fw_name, adev->dev);
>>       if (ret)
>> @@ -3062,16 +3065,30 @@ static ssize_t 
>> psp_usbc_pd_fw_sysfs_write(struct device *dev,
>>   rel_buf:
>>       dma_free_coherent(adev->dev, usbc_pd_fw->size, cpu_addr, dma_addr);
>>       release_firmware(usbc_pd_fw);
>> -
>>   fail:
>>       if (ret) {
>>           DRM_ERROR("Failed to load USBC PD FW, err = %d", ret);
>> -        return ret;
>> +        count = ret;
>>       }
>> +    drm_dev_exit(idx);
>>       return count;
>>   }
>> +void psp_copy_fw(struct psp_context *psp, uint8_t *start_addr, 
>> uint32_t bin_size)
>> +{
>> +    int idx;
>> +
>> +    if (!drm_dev_enter(&psp->adev->ddev, &idx))
>> +        return;
>> +
>> +    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>> +    memcpy(psp->fw_pri_buf, start_addr, bin_size);
>> +
>> +    drm_dev_exit(idx);
>> +}
>> +
>> +
>>   static DEVICE_ATTR(usbc_pd_fw, S_IRUGO | S_IWUSR,
>>              psp_usbc_pd_fw_sysfs_read,
>>              psp_usbc_pd_fw_sysfs_write);
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
>> index 46a5328e00e0..2bfdc278817f 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
>> @@ -423,4 +423,6 @@ int psp_get_fw_attestation_records_addr(struct 
>> psp_context *psp,
>>   int psp_load_fw_list(struct psp_context *psp,
>>                struct amdgpu_firmware_info **ucode_list, int 
>> ucode_count);
>> +void psp_copy_fw(struct psp_context *psp, uint8_t *start_addr, 
>> uint32_t bin_size);
>> +
>>   #endif
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>> index 688624ebe421..e1985bc34436 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>> @@ -35,6 +35,8 @@
>>   #include "amdgpu.h"
>>   #include "atom.h"
>> +#include <drm/drm_drv.h>
>> +
>>   /*
>>    * Rings
>>    * Most engines on the GPU are fed via ring buffers.  Ring
>> @@ -461,3 +463,71 @@ int amdgpu_ring_test_helper(struct amdgpu_ring 
>> *ring)
>>       ring->sched.ready = !r;
>>       return r;
>>   }
>> +
>> +void amdgpu_ring_clear_ring(struct amdgpu_ring *ring)
>> +{
>> +    int idx;
>> +    int i = 0;
>> +
>> +    if (!drm_dev_enter(&ring->adev->ddev, &idx))
>> +        return;
>> +
>> +    while (i <= ring->buf_mask)
>> +        ring->ring[i++] = ring->funcs->nop;
>> +
>> +    drm_dev_exit(idx);
>> +
>> +}
>> +
>> +void amdgpu_ring_write(struct amdgpu_ring *ring, uint32_t v)
>> +{
>> +    int idx;
>> +
>> +    if (!drm_dev_enter(&ring->adev->ddev, &idx))
>> +        return;
>> +
>> +    if (ring->count_dw <= 0)
>> +        DRM_ERROR("amdgpu: writing more dwords to the ring than 
>> expected!\n");
>> +    ring->ring[ring->wptr++ & ring->buf_mask] = v;
>> +    ring->wptr &= ring->ptr_mask;
>> +    ring->count_dw--;
>> +
>> +    drm_dev_exit(idx);
>> +}
>> +
>> +void amdgpu_ring_write_multiple(struct amdgpu_ring *ring,
>> +                          void *src, int count_dw)
>> +{
>> +    unsigned occupied, chunk1, chunk2;
>> +    void *dst;
>> +    int idx;
>> +
>> +    if (!drm_dev_enter(&ring->adev->ddev, &idx))
>> +        return;
>> +
>> +    if (unlikely(ring->count_dw < count_dw))
>> +        DRM_ERROR("amdgpu: writing more dwords to the ring than 
>> expected!\n");
>> +
>> +    occupied = ring->wptr & ring->buf_mask;
>> +    dst = (void *)&ring->ring[occupied];
>> +    chunk1 = ring->buf_mask + 1 - occupied;
>> +    chunk1 = (chunk1 >= count_dw) ? count_dw: chunk1;
>> +    chunk2 = count_dw - chunk1;
>> +    chunk1 <<= 2;
>> +    chunk2 <<= 2;
>> +
>> +    if (chunk1)
>> +        memcpy(dst, src, chunk1);
>> +
>> +    if (chunk2) {
>> +        src += chunk1;
>> +        dst = (void *)ring->ring;
>> +        memcpy(dst, src, chunk2);
>> +    }
>> +
>> +    ring->wptr += count_dw;
>> +    ring->wptr &= ring->ptr_mask;
>> +    ring->count_dw -= count_dw;
>> +
>> +    drm_dev_exit(idx);
>> +}
> 
> The ring should never be in MMIO memory, so you can drop that completely 
> as far as I can see.

Yeah, it's all in GART, I missed it for some reason...
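Since the ring lives in ordinary system memory, the multi-dword write needs no unplug guard and reduces to a wrap-around copy. The sketch below is a user-space approximation of the two-chunk arithmetic in the quoted amdgpu_ring_write_multiple(); struct mock_ring, RING_DW and mock_ring_write_multiple() are hypothetical names, and the ring size of 8 dwords is chosen only to keep the example small.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define RING_DW 8u  /* hypothetical ring size in dwords, a power of two */

/* Hypothetical, trimmed-down stand-in for struct amdgpu_ring. */
struct mock_ring {
	uint32_t ring[RING_DW];
	uint32_t buf_mask;  /* RING_DW - 1 */
	uint64_t wptr;      /* write pointer, in dwords */
	uint64_t ptr_mask;  /* mask applied to wptr after each write */
	int count_dw;       /* dwords still reserved for the caller */
};

/* Copy count_dw dwords into the ring, wrapping at the end of the buffer. */
static void mock_ring_write_multiple(struct mock_ring *r, const void *src,
				     unsigned count_dw)
{
	unsigned occupied, chunk1, chunk2;

	assert(r && src && count_dw <= RING_DW);

	occupied = (unsigned)(r->wptr & r->buf_mask);
	chunk1 = r->buf_mask + 1 - occupied;  /* dwords left before the wrap */
	if (chunk1 > count_dw)
		chunk1 = count_dw;
	chunk2 = count_dw - chunk1;           /* dwords that land at the start */

	memcpy(&r->ring[occupied], src, chunk1 * sizeof(uint32_t));
	if (chunk2)
		memcpy(r->ring, (const uint32_t *)src + chunk1,
		       chunk2 * sizeof(uint32_t));

	r->wptr = (r->wptr + count_dw) & r->ptr_mask;
	r->count_dw -= (int)count_dw;
}
```

For example, with wptr at 6 in an 8-dword ring, writing 4 dwords puts two at indices 6 and 7 and the other two at indices 0 and 1.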
> 
> Maybe split that patch by use case so that we can more easily review/ack 
> it.

In fact everything here is the same use case: once I added the unmap of
all MMIO ranges (both registers and VRAM), I got a lot of page faults
on device removal around any memcpy to/from IO. That is where I put the
drm_dev_enter/exit scope. I also searched the code and preemptively
added guards to any other such place. I did drop amdgpu_ib_schedule
from this patch because it has a dma_fence_wait inside, so we
will take care of it once we decide how to handle dma_fence waits.

Andrey
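The guard pattern described above (skip the IO copy once the device is gone, but still run the CPU-side cleanup) can be sketched in user space as follows. This is only a mock under stated assumptions: mock_dev, mock_dev_enter(), mock_dev_exit() and mock_restore() are hypothetical names, the io array stands in for an unmapped-on-removal MMIO/VRAM range, and plain memcpy() stands in for memcpy_toio(). The shape loosely follows the saved_bo handling in the quoted amdgpu_uvd_resume() hunk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical user-space stand-in for a device with an IO range. */
struct mock_dev {
	bool unplugged;  /* set once the device is removed */
	int sections;    /* open enter/exit sections, used to check balance */
	char io[16];     /* stands in for a mapped MMIO/VRAM range */
};

/* Mock of drm_dev_enter(): fails once the device is unplugged. */
static bool mock_dev_enter(struct mock_dev *dev, int *idx)
{
	if (dev->unplugged)
		return false;
	*idx = dev->sections++;
	return true;
}

/* Mock of drm_dev_exit(): closes the section opened by mock_dev_enter(). */
static void mock_dev_exit(struct mock_dev *dev, int idx)
{
	(void)idx;
	dev->sections--;
}

/*
 * Restore a saved image into device memory. The copy is skipped when the
 * device is unplugged, but the saved buffer is freed either way.
 * Returns true if the copy actually happened.
 */
static bool mock_restore(struct mock_dev *dev, char **saved, size_t size)
{
	bool copied = false;
	int idx;

	assert(dev && saved && size <= sizeof(dev->io));

	if (*saved == NULL)
		return false;

	if (mock_dev_enter(dev, &idx)) {
		memcpy(dev->io, *saved, size);  /* stands in for memcpy_toio() */
		mock_dev_exit(dev, idx);
		copied = true;
	}

	/* CPU-side cleanup must not depend on the device being present. */
	free(*saved);
	*saved = NULL;
	return copied;
}
```

Keeping the free outside the guarded scope is the design choice worth noting: only the access to the device range is conditional, so removal does not leak the saved copy.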

> 
> Thanks,
> Christian.
> 
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>> index e7d3d0dbdd96..c67bc6d3d039 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>> @@ -299,53 +299,12 @@ static inline void 
>> amdgpu_ring_set_preempt_cond_exec(struct amdgpu_ring *ring,
>>       *ring->cond_exe_cpu_addr = cond_exec;
>>   }
>> -static inline void amdgpu_ring_clear_ring(struct amdgpu_ring *ring)
>> -{
>> -    int i = 0;
>> -    while (i <= ring->buf_mask)
>> -        ring->ring[i++] = ring->funcs->nop;
>> -
>> -}
>> -
>> -static inline void amdgpu_ring_write(struct amdgpu_ring *ring, 
>> uint32_t v)
>> -{
>> -    if (ring->count_dw <= 0)
>> -        DRM_ERROR("amdgpu: writing more dwords to the ring than 
>> expected!\n");
>> -    ring->ring[ring->wptr++ & ring->buf_mask] = v;
>> -    ring->wptr &= ring->ptr_mask;
>> -    ring->count_dw--;
>> -}
>> +void amdgpu_ring_clear_ring(struct amdgpu_ring *ring);
>> -static inline void amdgpu_ring_write_multiple(struct amdgpu_ring *ring,
>> -                          void *src, int count_dw)
>> -{
>> -    unsigned occupied, chunk1, chunk2;
>> -    void *dst;
>> -
>> -    if (unlikely(ring->count_dw < count_dw))
>> -        DRM_ERROR("amdgpu: writing more dwords to the ring than 
>> expected!\n");
>> -
>> -    occupied = ring->wptr & ring->buf_mask;
>> -    dst = (void *)&ring->ring[occupied];
>> -    chunk1 = ring->buf_mask + 1 - occupied;
>> -    chunk1 = (chunk1 >= count_dw) ? count_dw: chunk1;
>> -    chunk2 = count_dw - chunk1;
>> -    chunk1 <<= 2;
>> -    chunk2 <<= 2;
>> +void amdgpu_ring_write(struct amdgpu_ring *ring, uint32_t v);
>> -    if (chunk1)
>> -        memcpy(dst, src, chunk1);
>> -
>> -    if (chunk2) {
>> -        src += chunk1;
>> -        dst = (void *)ring->ring;
>> -        memcpy(dst, src, chunk2);
>> -    }
>> -
>> -    ring->wptr += count_dw;
>> -    ring->wptr &= ring->ptr_mask;
>> -    ring->count_dw -= count_dw;
>> -}
>> +void amdgpu_ring_write_multiple(struct amdgpu_ring *ring,
>> +                          void *src, int count_dw);
>>   int amdgpu_ring_test_helper(struct amdgpu_ring *ring);
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
>> index c6dbc0801604..82f0542c7792 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
>> @@ -32,6 +32,7 @@
>>   #include <linux/module.h>
>>   #include <drm/drm.h>
>> +#include <drm/drm_drv.h>
>>   #include "amdgpu.h"
>>   #include "amdgpu_pm.h"
>> @@ -375,7 +376,7 @@ int amdgpu_uvd_suspend(struct amdgpu_device *adev)
>>   {
>>       unsigned size;
>>       void *ptr;
>> -    int i, j;
>> +    int i, j, idx;
>>       bool in_ras_intr = amdgpu_ras_intr_triggered();
>>       cancel_delayed_work_sync(&adev->uvd.idle_work);
>> @@ -403,11 +404,15 @@ int amdgpu_uvd_suspend(struct amdgpu_device *adev)
>>           if (!adev->uvd.inst[j].saved_bo)
>>               return -ENOMEM;
>> -        /* re-write 0 since err_event_athub will corrupt VCPU buffer */
>> -        if (in_ras_intr)
>> -            memset(adev->uvd.inst[j].saved_bo, 0, size);
>> -        else
>> -            memcpy_fromio(adev->uvd.inst[j].saved_bo, ptr, size);
>> +        if (drm_dev_enter(&adev->ddev, &idx)) {
>> +            /* re-write 0 since err_event_athub will corrupt VCPU 
>> buffer */
>> +            if (in_ras_intr)
>> +                memset(adev->uvd.inst[j].saved_bo, 0, size);
>> +            else
>> +                memcpy_fromio(adev->uvd.inst[j].saved_bo, ptr, size);
>> +
>> +            drm_dev_exit(idx);
>> +        }
>>       }
>>       if (in_ras_intr)
>> @@ -420,7 +425,7 @@ int amdgpu_uvd_resume(struct amdgpu_device *adev)
>>   {
>>       unsigned size;
>>       void *ptr;
>> -    int i;
>> +    int i, idx;
>>       for (i = 0; i < adev->uvd.num_uvd_inst; i++) {
>>           if (adev->uvd.harvest_config & (1 << i))
>> @@ -432,7 +437,10 @@ int amdgpu_uvd_resume(struct amdgpu_device *adev)
>>           ptr = adev->uvd.inst[i].cpu_addr;
>>           if (adev->uvd.inst[i].saved_bo != NULL) {
>> -            memcpy_toio(ptr, adev->uvd.inst[i].saved_bo, size);
>> +            if (drm_dev_enter(&adev->ddev, &idx)) {
>> +                memcpy_toio(ptr, adev->uvd.inst[i].saved_bo, size);
>> +                drm_dev_exit(idx);
>> +            }
>>               kvfree(adev->uvd.inst[i].saved_bo);
>>               adev->uvd.inst[i].saved_bo = NULL;
>>           } else {
>> @@ -442,8 +450,11 @@ int amdgpu_uvd_resume(struct amdgpu_device *adev)
>>               hdr = (const struct common_firmware_header 
>> *)adev->uvd.fw->data;
>>               if (adev->firmware.load_type != AMDGPU_FW_LOAD_PSP) {
>>                   offset = le32_to_cpu(hdr->ucode_array_offset_bytes);
>> -                memcpy_toio(adev->uvd.inst[i].cpu_addr, 
>> adev->uvd.fw->data + offset,
>> -                        le32_to_cpu(hdr->ucode_size_bytes));
>> +                if (drm_dev_enter(&adev->ddev, &idx)) {
>> +                    memcpy_toio(adev->uvd.inst[i].cpu_addr, 
>> adev->uvd.fw->data + offset,
>> +                            le32_to_cpu(hdr->ucode_size_bytes));
>> +                    drm_dev_exit(idx);
>> +                }
>>                   size -= le32_to_cpu(hdr->ucode_size_bytes);
>>                   ptr += le32_to_cpu(hdr->ucode_size_bytes);
>>               }
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
>> index ea6a62f67e38..833203401ef4 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
>> @@ -29,6 +29,7 @@
>>   #include <linux/module.h>
>>   #include <drm/drm.h>
>> +#include <drm/drm_drv.h>
>>   #include "amdgpu.h"
>>   #include "amdgpu_pm.h"
>> @@ -293,7 +294,7 @@ int amdgpu_vce_resume(struct amdgpu_device *adev)
>>       void *cpu_addr;
>>       const struct common_firmware_header *hdr;
>>       unsigned offset;
>> -    int r;
>> +    int r, idx;
>>       if (adev->vce.vcpu_bo == NULL)
>>           return -EINVAL;
>> @@ -313,8 +314,12 @@ int amdgpu_vce_resume(struct amdgpu_device *adev)
>>       hdr = (const struct common_firmware_header *)adev->vce.fw->data;
>>       offset = le32_to_cpu(hdr->ucode_array_offset_bytes);
>> -    memcpy_toio(cpu_addr, adev->vce.fw->data + offset,
>> -            adev->vce.fw->size - offset);
>> +
>> +    if (drm_dev_enter(&adev->ddev, &idx)) {
>> +        memcpy_toio(cpu_addr, adev->vce.fw->data + offset,
>> +                adev->vce.fw->size - offset);
>> +        drm_dev_exit(idx);
>> +    }
>>       amdgpu_bo_kunmap(adev->vce.vcpu_bo);
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
>> index 201645963ba5..21f7d3644d70 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
>> @@ -27,6 +27,7 @@
>>   #include <linux/firmware.h>
>>   #include <linux/module.h>
>>   #include <linux/pci.h>
>> +#include <drm/drm_drv.h>
>>   #include "amdgpu.h"
>>   #include "amdgpu_pm.h"
>> @@ -275,7 +276,7 @@ int amdgpu_vcn_suspend(struct amdgpu_device *adev)
>>   {
>>       unsigned size;
>>       void *ptr;
>> -    int i;
>> +    int i, idx;
>>       cancel_delayed_work_sync(&adev->vcn.idle_work);
>> @@ -292,7 +293,10 @@ int amdgpu_vcn_suspend(struct amdgpu_device *adev)
>>           if (!adev->vcn.inst[i].saved_bo)
>>               return -ENOMEM;
>> -        memcpy_fromio(adev->vcn.inst[i].saved_bo, ptr, size);
>> +        if (drm_dev_enter(&adev->ddev, &idx)) {
>> +            memcpy_fromio(adev->vcn.inst[i].saved_bo, ptr, size);
>> +            drm_dev_exit(idx);
>> +        }
>>       }
>>       return 0;
>>   }
>> @@ -301,7 +305,7 @@ int amdgpu_vcn_resume(struct amdgpu_device *adev)
>>   {
>>       unsigned size;
>>       void *ptr;
>> -    int i;
>> +    int i, idx;
>>       for (i = 0; i < adev->vcn.num_vcn_inst; ++i) {
>>           if (adev->vcn.harvest_config & (1 << i))
>> @@ -313,7 +317,10 @@ int amdgpu_vcn_resume(struct amdgpu_device *adev)
>>           ptr = adev->vcn.inst[i].cpu_addr;
>>           if (adev->vcn.inst[i].saved_bo != NULL) {
>> -            memcpy_toio(ptr, adev->vcn.inst[i].saved_bo, size);
>> +            if (drm_dev_enter(&adev->ddev, &idx)) {
>> +                memcpy_toio(ptr, adev->vcn.inst[i].saved_bo, size);
>> +                drm_dev_exit(idx);
>> +            }
>>               kvfree(adev->vcn.inst[i].saved_bo);
>>               adev->vcn.inst[i].saved_bo = NULL;
>>           } else {
>> @@ -323,8 +330,11 @@ int amdgpu_vcn_resume(struct amdgpu_device *adev)
>>               hdr = (const struct common_firmware_header 
>> *)adev->vcn.fw->data;
>>               if (adev->firmware.load_type != AMDGPU_FW_LOAD_PSP) {
>>                   offset = le32_to_cpu(hdr->ucode_array_offset_bytes);
>> -                memcpy_toio(adev->vcn.inst[i].cpu_addr, 
>> adev->vcn.fw->data + offset,
>> -                        le32_to_cpu(hdr->ucode_size_bytes));
>> +                if (drm_dev_enter(&adev->ddev, &idx)) {
>> +                    memcpy_toio(adev->vcn.inst[i].cpu_addr, 
>> adev->vcn.fw->data + offset,
>> +                            le32_to_cpu(hdr->ucode_size_bytes));
>> +                    drm_dev_exit(idx);
>> +                }
>>                   size -= le32_to_cpu(hdr->ucode_size_bytes);
>>                   ptr += le32_to_cpu(hdr->ucode_size_bytes);
>>               }
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> index 9f868cf3b832..7dd5f10ab570 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> @@ -32,6 +32,7 @@
>>   #include <linux/dma-buf.h>
>>   #include <drm/amdgpu_drm.h>
>> +#include <drm/drm_drv.h>
>>   #include "amdgpu.h"
>>   #include "amdgpu_trace.h"
>>   #include "amdgpu_amdkfd.h"
>> @@ -1606,7 +1607,10 @@ static int amdgpu_vm_bo_update_mapping(struct 
>> amdgpu_device *adev,
>>       struct amdgpu_vm_update_params params;
>>       enum amdgpu_sync_mode sync_mode;
>>       uint64_t pfn;
>> -    int r;
>> +    int r, idx;
>> +
>> +    if (!drm_dev_enter(&adev->ddev, &idx))
>> +        return -ENODEV;
>>       memset(&params, 0, sizeof(params));
>>       params.adev = adev;
>> @@ -1715,6 +1719,7 @@ static int amdgpu_vm_bo_update_mapping(struct 
>> amdgpu_device *adev,
>>   error_unlock:
>>       amdgpu_vm_eviction_unlock(vm);
>> +    drm_dev_exit(idx);
>>       return r;
>>   }
>> diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c 
>> b/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
>> index 589410c32d09..2cec71e823f5 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
>> @@ -23,6 +23,7 @@
>>   #include <linux/firmware.h>
>>   #include <linux/module.h>
>>   #include <linux/vmalloc.h>
>> +#include <drm/drm_drv.h>
>>   #include "amdgpu.h"
>>   #include "amdgpu_psp.h"
>> @@ -269,10 +270,8 @@ static int psp_v11_0_bootloader_load_kdb(struct 
>> psp_context *psp)
>>       if (ret)
>>           return ret;
>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>> -
>>       /* Copy PSP KDB binary to memory */
>> -    memcpy(psp->fw_pri_buf, psp->kdb_start_addr, psp->kdb_bin_size);
>> +    psp_copy_fw(psp, psp->kdb_start_addr, psp->kdb_bin_size);
>>       /* Provide the PSP KDB to bootloader */
>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>> @@ -302,10 +301,8 @@ static int psp_v11_0_bootloader_load_spl(struct 
>> psp_context *psp)
>>       if (ret)
>>           return ret;
>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>> -
>>       /* Copy PSP SPL binary to memory */
>> -    memcpy(psp->fw_pri_buf, psp->spl_start_addr, psp->spl_bin_size);
>> +    psp_copy_fw(psp, psp->spl_start_addr, psp->spl_bin_size);
>>       /* Provide the PSP SPL to bootloader */
>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>> @@ -335,10 +332,8 @@ static int 
>> psp_v11_0_bootloader_load_sysdrv(struct psp_context *psp)
>>       if (ret)
>>           return ret;
>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>> -
>>       /* Copy PSP System Driver binary to memory */
>> -    memcpy(psp->fw_pri_buf, psp->sys_start_addr, psp->sys_bin_size);
>> +    psp_copy_fw(psp, psp->sys_start_addr, psp->sys_bin_size);
>>       /* Provide the sys driver to bootloader */
>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>> @@ -371,10 +366,8 @@ static int psp_v11_0_bootloader_load_sos(struct 
>> psp_context *psp)
>>       if (ret)
>>           return ret;
>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>> -
>>       /* Copy Secure OS binary to PSP memory */
>> -    memcpy(psp->fw_pri_buf, psp->sos_start_addr, psp->sos_bin_size);
>> +    psp_copy_fw(psp, psp->sos_start_addr, psp->sos_bin_size);
>>       /* Provide the PSP secure OS to bootloader */
>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>> @@ -608,7 +601,7 @@ static int psp_v11_0_memory_training(struct 
>> psp_context *psp, uint32_t ops)
>>       uint32_t p2c_header[4];
>>       uint32_t sz;
>>       void *buf;
>> -    int ret;
>> +    int ret, idx;
>>       if (ctx->init == PSP_MEM_TRAIN_NOT_SUPPORT) {
>>           DRM_DEBUG("Memory training is not supported.\n");
>> @@ -681,17 +674,24 @@ static int psp_v11_0_memory_training(struct 
>> psp_context *psp, uint32_t ops)
>>               return -ENOMEM;
>>           }
>> -        memcpy_fromio(buf, adev->mman.aper_base_kaddr, sz);
>> -        ret = psp_v11_0_memory_training_send_msg(psp, 
>> PSP_BL__DRAM_LONG_TRAIN);
>> -        if (ret) {
>> -            DRM_ERROR("Send long training msg failed.\n");
>> +        if (drm_dev_enter(&adev->ddev, &idx)) {
>> +            memcpy_fromio(buf, adev->mman.aper_base_kaddr, sz);
>> +            ret = psp_v11_0_memory_training_send_msg(psp, 
>> PSP_BL__DRAM_LONG_TRAIN);
>> +            if (ret) {
>> +                DRM_ERROR("Send long training msg failed.\n");
>> +                vfree(buf);
>> +                drm_dev_exit(idx);
>> +                return ret;
>> +            }
>> +
>> +            memcpy_toio(adev->mman.aper_base_kaddr, buf, sz);
>> +            adev->hdp.funcs->flush_hdp(adev, NULL);
>>               vfree(buf);
>> -            return ret;
>> +            drm_dev_exit(idx);
>> +        } else {
>> +            vfree(buf);
>> +            return -ENODEV;
>>           }
>> -
>> -        memcpy_toio(adev->mman.aper_base_kaddr, buf, sz);
>> -        adev->hdp.funcs->flush_hdp(adev, NULL);
>> -        vfree(buf);
>>       }
>>       if (ops & PSP_MEM_TRAIN_SAVE) {
>> diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c 
>> b/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
>> index c4828bd3264b..618e5b6b85d9 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
>> @@ -138,10 +138,8 @@ static int 
>> psp_v12_0_bootloader_load_sysdrv(struct psp_context *psp)
>>       if (ret)
>>           return ret;
>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>> -
>>       /* Copy PSP System Driver binary to memory */
>> -    memcpy(psp->fw_pri_buf, psp->sys_start_addr, psp->sys_bin_size);
>> +    psp_copy_fw(psp, psp->sys_start_addr, psp->sys_bin_size);
>>       /* Provide the sys driver to bootloader */
>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>> @@ -179,10 +177,8 @@ static int psp_v12_0_bootloader_load_sos(struct 
>> psp_context *psp)
>>       if (ret)
>>           return ret;
>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>> -
>>       /* Copy Secure OS binary to PSP memory */
>> -    memcpy(psp->fw_pri_buf, psp->sos_start_addr, psp->sos_bin_size);
>> +    psp_copy_fw(psp, psp->sos_start_addr, psp->sos_bin_size);
>>       /* Provide the PSP secure OS to bootloader */
>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>> diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c 
>> b/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
>> index f2e725f72d2f..d0a6cccd0897 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
>> @@ -102,10 +102,8 @@ static int psp_v3_1_bootloader_load_sysdrv(struct 
>> psp_context *psp)
>>       if (ret)
>>           return ret;
>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>> -
>>       /* Copy PSP System Driver binary to memory */
>> -    memcpy(psp->fw_pri_buf, psp->sys_start_addr, psp->sys_bin_size);
>> +    psp_copy_fw(psp, psp->sys_start_addr, psp->sys_bin_size);
>>       /* Provide the sys driver to bootloader */
>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>> @@ -143,10 +141,8 @@ static int psp_v3_1_bootloader_load_sos(struct 
>> psp_context *psp)
>>       if (ret)
>>           return ret;
>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>> -
>>       /* Copy Secure OS binary to PSP memory */
>> -    memcpy(psp->fw_pri_buf, psp->sos_start_addr, psp->sos_bin_size);
>> +    psp_copy_fw(psp, psp->sos_start_addr, psp->sos_bin_size);
>>       /* Provide the PSP secure OS to bootloader */
>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>> diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c 
>> b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
>> index 8e238dea7bef..90910d19db12 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
>> @@ -25,6 +25,7 @@
>>    */
>>   #include <linux/firmware.h>
>> +#include <drm/drm_drv.h>
>>   #include "amdgpu.h"
>>   #include "amdgpu_vce.h"
>> @@ -555,16 +556,19 @@ static int vce_v4_0_hw_fini(void *handle)
>>   static int vce_v4_0_suspend(void *handle)
>>   {
>>       struct amdgpu_device *adev = (struct amdgpu_device *)handle;
>> -    int r;
>> +    int r, idx;
>>       if (adev->vce.vcpu_bo == NULL)
>>           return 0;
>> -    if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) {
>> -        unsigned size = amdgpu_bo_size(adev->vce.vcpu_bo);
>> -        void *ptr = adev->vce.cpu_addr;
>> +    if (drm_dev_enter(&adev->ddev, &idx)) {
>> +        if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) {
>> +            unsigned size = amdgpu_bo_size(adev->vce.vcpu_bo);
>> +            void *ptr = adev->vce.cpu_addr;
>> -        memcpy_fromio(adev->vce.saved_bo, ptr, size);
>> +            memcpy_fromio(adev->vce.saved_bo, ptr, size);
>> +        }
>> +        drm_dev_exit(idx);
>>       }
>>       r = vce_v4_0_hw_fini(adev);
>> @@ -577,16 +581,20 @@ static int vce_v4_0_suspend(void *handle)
>>   static int vce_v4_0_resume(void *handle)
>>   {
>>       struct amdgpu_device *adev = (struct amdgpu_device *)handle;
>> -    int r;
>> +    int r, idx;
>>       if (adev->vce.vcpu_bo == NULL)
>>           return -EINVAL;
>>       if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) {
>> -        unsigned size = amdgpu_bo_size(adev->vce.vcpu_bo);
>> -        void *ptr = adev->vce.cpu_addr;
>> -        memcpy_toio(ptr, adev->vce.saved_bo, size);
>> +        if (drm_dev_enter(&adev->ddev, &idx)) {
>> +            unsigned size = amdgpu_bo_size(adev->vce.vcpu_bo);
>> +            void *ptr = adev->vce.cpu_addr;
>> +
>> +            memcpy_toio(ptr, adev->vce.saved_bo, size);
>> +            drm_dev_exit(idx);
>> +        }
>>       } else {
>>           r = amdgpu_vce_resume(adev);
>>           if (r)
>> diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c 
>> b/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
>> index 3f15bf34123a..df34be8ec82d 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
>> @@ -34,6 +34,8 @@
>>   #include "vcn/vcn_3_0_0_sh_mask.h"
>>   #include "ivsrcid/vcn/irqsrcs_vcn_2_0.h"
>> +#include <drm/drm_drv.h>
>> +
>>   #define mmUVD_CONTEXT_ID_INTERNAL_OFFSET            0x27
>>   #define mmUVD_GPCOM_VCPU_CMD_INTERNAL_OFFSET            0x0f
>>   #define mmUVD_GPCOM_VCPU_DATA0_INTERNAL_OFFSET            0x10
>> @@ -268,16 +270,20 @@ static int vcn_v3_0_sw_init(void *handle)
>>   static int vcn_v3_0_sw_fini(void *handle)
>>   {
>>       struct amdgpu_device *adev = (struct amdgpu_device *)handle;
>> -    int i, r;
>> +    int i, r, idx;
>> -    for (i = 0; i < adev->vcn.num_vcn_inst; i++) {
>> -        volatile struct amdgpu_fw_shared *fw_shared;
>> +    if (drm_dev_enter(&adev->ddev, &idx)) {
>> +        for (i = 0; i < adev->vcn.num_vcn_inst; i++) {
>> +            volatile struct amdgpu_fw_shared *fw_shared;
>> -        if (adev->vcn.harvest_config & (1 << i))
>> -            continue;
>> -        fw_shared = adev->vcn.inst[i].fw_shared_cpu_addr;
>> -        fw_shared->present_flag_0 = 0;
>> -        fw_shared->sw_ring.is_enabled = false;
>> +            if (adev->vcn.harvest_config & (1 << i))
>> +                continue;
>> +            fw_shared = adev->vcn.inst[i].fw_shared_cpu_addr;
>> +            fw_shared->present_flag_0 = 0;
>> +            fw_shared->sw_ring.is_enabled = false;
>> +        }
>> +
>> +        drm_dev_exit(idx);
>>       }
>>       if (amdgpu_sriov_vf(adev))
>> diff --git a/drivers/gpu/drm/amd/pm/powerplay/smumgr/smu7_smumgr.c 
>> b/drivers/gpu/drm/amd/pm/powerplay/smumgr/smu7_smumgr.c
>> index aae25243eb10..d628b91846c9 100644
>> --- a/drivers/gpu/drm/amd/pm/powerplay/smumgr/smu7_smumgr.c
>> +++ b/drivers/gpu/drm/amd/pm/powerplay/smumgr/smu7_smumgr.c
>> @@ -405,6 +405,8 @@ int smu7_request_smu_load_fw(struct pp_hwmgr *hwmgr)
>>                   UCODE_ID_MEC_STORAGE, &toc->entry[toc->num_entries++]),
>>                   "Failed to Get Firmware Entry.", r = -EINVAL; goto 
>> failed);
>>       }
>> +
>> +    /* AG TODO Can't call drm_dev_enter/exit because access 
>> adev->ddev here ... */
>>       memcpy_toio(smu_data->header_buffer.kaddr, smu_data->toc,
>>               sizeof(struct SMU_DRAMData_TOC));
>>       smum_send_msg_to_smc_with_parameter(hwmgr,
> 

* Re: [PATCH v6 10/16] drm/amdgpu: Guard against write accesses after device removal
@ 2021-05-11 17:52       ` Andrey Grodzovsky
  0 siblings, 0 replies; 126+ messages in thread
From: Andrey Grodzovsky @ 2021-05-11 17:52 UTC (permalink / raw)
  To: Christian König, dri-devel, amd-gfx, linux-pci,
	daniel.vetter, Harry.Wentland
  Cc: Alexander.Deucher, gregkh, ppaalanen, helgaas, Felix.Kuehling



On 2021-05-11 2:50 a.m., Christian König wrote:
> On 10.05.21 at 18:36, Andrey Grodzovsky wrote:
>> This should prevent writing to memory or IO ranges possibly
>> already allocated for other uses after our device is removed.
>>
>> v5:
>> Protect more places where memcpy_to/from_io takes place
>> Protect IB submissions
>>
>> v6: Switch to !drm_dev_enter instead of scoping entire code
>> with brackets.
>>
>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c    | 11 ++-
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c       |  9 +++
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c        | 17 +++--
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c       | 63 +++++++++++------
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h       |  2 +
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c      | 70 +++++++++++++++++++
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h      | 49 ++-----------
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c       | 31 +++++---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c       | 11 ++-
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c       | 22 ++++--
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c        |  7 +-
>>   drivers/gpu/drm/amd/amdgpu/psp_v11_0.c        | 44 ++++++------
>>   drivers/gpu/drm/amd/amdgpu/psp_v12_0.c        |  8 +--
>>   drivers/gpu/drm/amd/amdgpu/psp_v3_1.c         |  8 +--
>>   drivers/gpu/drm/amd/amdgpu/vce_v4_0.c         | 26 ++++---
>>   drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c         | 22 +++---
>>   .../drm/amd/pm/powerplay/smumgr/smu7_smumgr.c |  2 +
>>   17 files changed, 257 insertions(+), 145 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> index a0bff4713672..94c415176cdc 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> @@ -71,6 +71,8 @@
>>   #include <drm/task_barrier.h>
>>   #include <linux/pm_runtime.h>
>> +#include <drm/drm_drv.h>
>> +
>>   MODULE_FIRMWARE("amdgpu/vega10_gpu_info.bin");
>>   MODULE_FIRMWARE("amdgpu/vega12_gpu_info.bin");
>>   MODULE_FIRMWARE("amdgpu/raven_gpu_info.bin");
>> @@ -281,7 +283,10 @@ void amdgpu_device_vram_access(struct 
>> amdgpu_device *adev, loff_t pos,
>>       unsigned long flags;
>>       uint32_t hi = ~0;
>>       uint64_t last;
>> +    int idx;
>> +     if (!drm_dev_enter(&adev->ddev, &idx))
>> +         return;
>>   #ifdef CONFIG_64BIT
>>       last = min(pos + size, adev->gmc.visible_vram_size);
>> @@ -299,8 +304,10 @@ void amdgpu_device_vram_access(struct 
>> amdgpu_device *adev, loff_t pos,
>>               memcpy_fromio(buf, addr, count);
>>           }
>> -        if (count == size)
>> +        if (count == size) {
>> +            drm_dev_exit(idx);
>>               return;
>> +        }
> 
> Maybe use a goto instead, but that's really just a nitpick.
> 
> 
> 
>>           pos += count;
>>           buf += count / 4;
>> @@ -323,6 +330,8 @@ void amdgpu_device_vram_access(struct 
>> amdgpu_device *adev, loff_t pos,
>>               *buf++ = RREG32_NO_KIQ(mmMM_DATA);
>>       }
>>       spin_unlock_irqrestore(&adev->mmio_idx_lock, flags);
>> +
>> +    drm_dev_exit(idx);
>>   }
>>   /*
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
>> index 4d32233cde92..04ba5eef1e88 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
>> @@ -31,6 +31,8 @@
>>   #include "amdgpu_ras.h"
>>   #include "amdgpu_xgmi.h"
>> +#include <drm/drm_drv.h>
>> +
>>   /**
>>    * amdgpu_gmc_pdb0_alloc - allocate vram for pdb0
>>    *
>> @@ -151,6 +153,10 @@ int amdgpu_gmc_set_pte_pde(struct amdgpu_device 
>> *adev, void *cpu_pt_addr,
>>   {
>>       void __iomem *ptr = (void *)cpu_pt_addr;
>>       uint64_t value;
>> +    int idx;
>> +
>> +    if (!drm_dev_enter(&adev->ddev, &idx))
>> +        return 0;
>>       /*
>>        * The following is for PTE only. GART does not have PDEs.
>> @@ -158,6 +164,9 @@ int amdgpu_gmc_set_pte_pde(struct amdgpu_device 
>> *adev, void *cpu_pt_addr,
>>       value = addr & 0x0000FFFFFFFFF000ULL;
>>       value |= flags;
>>       writeq(value, ptr + (gpu_page_idx * 8));
>> +
>> +    drm_dev_exit(idx);
>> +
>>       return 0;
>>   }
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
>> index 148a3b481b12..62fcbd446c71 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
>> @@ -30,6 +30,7 @@
>>   #include <linux/slab.h>
>>   #include <drm/amdgpu_drm.h>
>> +#include <drm/drm_drv.h>
>>   #include "amdgpu.h"
>>   #include "atom.h"
>> @@ -137,7 +138,7 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, 
>> unsigned num_ibs,
>>       bool secure;
>>       unsigned i;
>> -    int r = 0;
>> +    int idx, r = 0;
>>       bool need_pipe_sync = false;
>>       if (num_ibs == 0)
>> @@ -169,13 +170,16 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, 
>> unsigned num_ibs,
>>           return -EINVAL;
>>       }
>> +    if (!drm_dev_enter(&adev->ddev, &idx))
>> +        return -ENODEV;
>> +
>>       alloc_size = ring->funcs->emit_frame_size + num_ibs *
>>           ring->funcs->emit_ib_size;
>>       r = amdgpu_ring_alloc(ring, alloc_size);
>>       if (r) {
>>           dev_err(adev->dev, "scheduling IB failed (%d).\n", r);
>> -        return r;
>> +        goto exit;
>>       }
>>       need_ctx_switch = ring->current_ctx != fence_ctx;
>> @@ -205,7 +209,7 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, 
>> unsigned num_ibs,
>>           r = amdgpu_vm_flush(ring, job, need_pipe_sync);
>>           if (r) {
>>               amdgpu_ring_undo(ring);
>> -            return r;
>> +            goto exit;
>>           }
>>       }
>> @@ -286,7 +290,7 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, 
>> unsigned num_ibs,
>>           if (job && job->vmid)
>>               amdgpu_vmid_reset(adev, ring->funcs->vmhub, job->vmid);
>>           amdgpu_ring_undo(ring);
>> -        return r;
>> +        goto exit;
>>       }
>>       if (ring->funcs->insert_end)
>> @@ -304,7 +308,10 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, 
>> unsigned num_ibs,
>>           ring->funcs->emit_wave_limit(ring, false);
>>       amdgpu_ring_commit(ring);
>> -    return 0;
>> +
>> +exit:
>> +    drm_dev_exit(idx);
>> +    return r;
>>   }
>>   /**
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
>> index 9e769cf6095b..bb6afee61666 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
>> @@ -25,6 +25,7 @@
>>   #include <linux/firmware.h>
>>   #include <linux/dma-mapping.h>
>> +#include <drm/drm_drv.h>
>>   #include "amdgpu.h"
>>   #include "amdgpu_psp.h"
>> @@ -39,6 +40,8 @@
>>   #include "amdgpu_ras.h"
>>   #include "amdgpu_securedisplay.h"
>> +#include <drm/drm_drv.h>
>> +
>>   static int psp_sysfs_init(struct amdgpu_device *adev);
>>   static void psp_sysfs_fini(struct amdgpu_device *adev);
>> @@ -253,7 +256,7 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>              struct psp_gfx_cmd_resp *cmd, uint64_t fence_mc_addr)
>>   {
>>       int ret;
>> -    int index;
>> +    int index, idx;
>>       int timeout = 20000;
>>       bool ras_intr = false;
>>       bool skip_unsupport = false;
>> @@ -261,6 +264,9 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>       if (psp->adev->in_pci_err_recovery)
>>           return 0;
>> +    if (!drm_dev_enter(&psp->adev->ddev, &idx))
>> +        return 0;
>> +
>>       mutex_lock(&psp->mutex);
>>       memset(psp->cmd_buf_mem, 0, PSP_CMD_BUFFER_SIZE);
>> @@ -271,8 +277,7 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>       ret = psp_ring_cmd_submit(psp, psp->cmd_buf_mc_addr, 
>> fence_mc_addr, index);
>>       if (ret) {
>>           atomic_dec(&psp->fence_value);
>> -        mutex_unlock(&psp->mutex);
>> -        return ret;
>> +        goto exit;
>>       }
>>       amdgpu_asic_invalidate_hdp(psp->adev, NULL);
>> @@ -312,8 +317,8 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>                psp->cmd_buf_mem->cmd_id,
>>                psp->cmd_buf_mem->resp.status);
>>           if (!timeout) {
>> -            mutex_unlock(&psp->mutex);
>> -            return -EINVAL;
>> +            ret = -EINVAL;
>> +            goto exit;
>>           }
>>       }
>> @@ -321,8 +326,10 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>           ucode->tmr_mc_addr_lo = psp->cmd_buf_mem->resp.fw_addr_lo;
>>           ucode->tmr_mc_addr_hi = psp->cmd_buf_mem->resp.fw_addr_hi;
>>       }
>> -    mutex_unlock(&psp->mutex);
>> +exit:
>> +    mutex_unlock(&psp->mutex);
>> +    drm_dev_exit(idx);
>>       return ret;
>>   }
>> @@ -359,8 +366,7 @@ static int psp_load_toc(struct psp_context *psp,
>>       if (!cmd)
>>           return -ENOMEM;
>>       /* Copy toc to psp firmware private buffer */
>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>> -    memcpy(psp->fw_pri_buf, psp->toc_start_addr, psp->toc_bin_size);
>> +    psp_copy_fw(psp, psp->toc_start_addr, psp->toc_bin_size);
>>       psp_prep_load_toc_cmd_buf(cmd, psp->fw_pri_mc_addr, 
>> psp->toc_bin_size);
>> @@ -625,8 +631,7 @@ static int psp_asd_load(struct psp_context *psp)
>>       if (!cmd)
>>           return -ENOMEM;
>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>> -    memcpy(psp->fw_pri_buf, psp->asd_start_addr, psp->asd_ucode_size);
>> +    psp_copy_fw(psp, psp->asd_start_addr, psp->asd_ucode_size);
>>       psp_prep_asd_load_cmd_buf(cmd, psp->fw_pri_mc_addr,
>>                     psp->asd_ucode_size);
>> @@ -781,8 +786,7 @@ static int psp_xgmi_load(struct psp_context *psp)
>>       if (!cmd)
>>           return -ENOMEM;
>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>> -    memcpy(psp->fw_pri_buf, psp->ta_xgmi_start_addr, 
>> psp->ta_xgmi_ucode_size);
>> +    psp_copy_fw(psp, psp->ta_xgmi_start_addr, psp->ta_xgmi_ucode_size);
>>       psp_prep_ta_load_cmd_buf(cmd,
>>                    psp->fw_pri_mc_addr,
>> @@ -1038,8 +1042,7 @@ static int psp_ras_load(struct psp_context *psp)
>>       if (!cmd)
>>           return -ENOMEM;
>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>> -    memcpy(psp->fw_pri_buf, psp->ta_ras_start_addr, 
>> psp->ta_ras_ucode_size);
>> +    psp_copy_fw(psp, psp->ta_ras_start_addr, psp->ta_ras_ucode_size);
>>       psp_prep_ta_load_cmd_buf(cmd,
>>                    psp->fw_pri_mc_addr,
>> @@ -1275,8 +1278,7 @@ static int psp_hdcp_load(struct psp_context *psp)
>>       if (!cmd)
>>           return -ENOMEM;
>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>> -    memcpy(psp->fw_pri_buf, psp->ta_hdcp_start_addr,
>> +    psp_copy_fw(psp, psp->ta_hdcp_start_addr,
>>              psp->ta_hdcp_ucode_size);
>>       psp_prep_ta_load_cmd_buf(cmd,
>> @@ -1427,8 +1429,7 @@ static int psp_dtm_load(struct psp_context *psp)
>>       if (!cmd)
>>           return -ENOMEM;
>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>> -    memcpy(psp->fw_pri_buf, psp->ta_dtm_start_addr, 
>> psp->ta_dtm_ucode_size);
>> +    psp_copy_fw(psp, psp->ta_dtm_start_addr, psp->ta_dtm_ucode_size);
>>       psp_prep_ta_load_cmd_buf(cmd,
>>                    psp->fw_pri_mc_addr,
>> @@ -1573,8 +1574,7 @@ static int psp_rap_load(struct psp_context *psp)
>>       if (!cmd)
>>           return -ENOMEM;
>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>> -    memcpy(psp->fw_pri_buf, psp->ta_rap_start_addr, 
>> psp->ta_rap_ucode_size);
>> +    psp_copy_fw(psp, psp->ta_rap_start_addr, psp->ta_rap_ucode_size);
>>       psp_prep_ta_load_cmd_buf(cmd,
>>                    psp->fw_pri_mc_addr,
>> @@ -3022,7 +3022,7 @@ static ssize_t psp_usbc_pd_fw_sysfs_write(struct 
>> device *dev,
>>       struct amdgpu_device *adev = drm_to_adev(ddev);
>>       void *cpu_addr;
>>       dma_addr_t dma_addr;
>> -    int ret;
>> +    int ret, idx;
>>       char fw_name[100];
>>       const struct firmware *usbc_pd_fw;
>> @@ -3031,6 +3031,9 @@ static ssize_t psp_usbc_pd_fw_sysfs_write(struct 
>> device *dev,
>>           return -EBUSY;
>>       }
>> +    if (!drm_dev_enter(ddev, &idx))
>> +        return -ENODEV;
>> +
>>       snprintf(fw_name, sizeof(fw_name), "amdgpu/%s", buf);
>>       ret = request_firmware(&usbc_pd_fw, fw_name, adev->dev);
>>       if (ret)
>> @@ -3062,16 +3065,30 @@ static ssize_t 
>> psp_usbc_pd_fw_sysfs_write(struct device *dev,
>>   rel_buf:
>>       dma_free_coherent(adev->dev, usbc_pd_fw->size, cpu_addr, dma_addr);
>>       release_firmware(usbc_pd_fw);
>> -
>>   fail:
>>       if (ret) {
>>           DRM_ERROR("Failed to load USBC PD FW, err = %d", ret);
>> -        return ret;
>> +        count = ret;
>>       }
>> +    drm_dev_exit(idx);
>>       return count;
>>   }
>> +void psp_copy_fw(struct psp_context *psp, uint8_t *start_addr, 
>> uint32_t bin_size)
>> +{
>> +    int idx;
>> +
>> +    if (!drm_dev_enter(&psp->adev->ddev, &idx))
>> +        return;
>> +
>> +    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>> +    memcpy(psp->fw_pri_buf, start_addr, bin_size);
>> +
>> +    drm_dev_exit(idx);
>> +}
>> +
>> +
>>   static DEVICE_ATTR(usbc_pd_fw, S_IRUGO | S_IWUSR,
>>              psp_usbc_pd_fw_sysfs_read,
>>              psp_usbc_pd_fw_sysfs_write);
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
>> index 46a5328e00e0..2bfdc278817f 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
>> @@ -423,4 +423,6 @@ int psp_get_fw_attestation_records_addr(struct 
>> psp_context *psp,
>>   int psp_load_fw_list(struct psp_context *psp,
>>                struct amdgpu_firmware_info **ucode_list, int 
>> ucode_count);
>> +void psp_copy_fw(struct psp_context *psp, uint8_t *start_addr, 
>> uint32_t bin_size);
>> +
>>   #endif
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>> index 688624ebe421..e1985bc34436 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>> @@ -35,6 +35,8 @@
>>   #include "amdgpu.h"
>>   #include "atom.h"
>> +#include <drm/drm_drv.h>
>> +
>>   /*
>>    * Rings
>>    * Most engines on the GPU are fed via ring buffers.  Ring
>> @@ -461,3 +463,71 @@ int amdgpu_ring_test_helper(struct amdgpu_ring 
>> *ring)
>>       ring->sched.ready = !r;
>>       return r;
>>   }
>> +
>> +void amdgpu_ring_clear_ring(struct amdgpu_ring *ring)
>> +{
>> +    int idx;
>> +    int i = 0;
>> +
>> +    if (!drm_dev_enter(&ring->adev->ddev, &idx))
>> +        return;
>> +
>> +    while (i <= ring->buf_mask)
>> +        ring->ring[i++] = ring->funcs->nop;
>> +
>> +    drm_dev_exit(idx);
>> +
>> +}
>> +
>> +void amdgpu_ring_write(struct amdgpu_ring *ring, uint32_t v)
>> +{
>> +    int idx;
>> +
>> +    if (!drm_dev_enter(&ring->adev->ddev, &idx))
>> +        return;
>> +
>> +    if (ring->count_dw <= 0)
>> +        DRM_ERROR("amdgpu: writing more dwords to the ring than 
>> expected!\n");
>> +    ring->ring[ring->wptr++ & ring->buf_mask] = v;
>> +    ring->wptr &= ring->ptr_mask;
>> +    ring->count_dw--;
>> +
>> +    drm_dev_exit(idx);
>> +}
>> +
>> +void amdgpu_ring_write_multiple(struct amdgpu_ring *ring,
>> +                          void *src, int count_dw)
>> +{
>> +    unsigned occupied, chunk1, chunk2;
>> +    void *dst;
>> +    int idx;
>> +
>> +    if (!drm_dev_enter(&ring->adev->ddev, &idx))
>> +        return;
>> +
>> +    if (unlikely(ring->count_dw < count_dw))
>> +        DRM_ERROR("amdgpu: writing more dwords to the ring than 
>> expected!\n");
>> +
>> +    occupied = ring->wptr & ring->buf_mask;
>> +    dst = (void *)&ring->ring[occupied];
>> +    chunk1 = ring->buf_mask + 1 - occupied;
>> +    chunk1 = (chunk1 >= count_dw) ? count_dw: chunk1;
>> +    chunk2 = count_dw - chunk1;
>> +    chunk1 <<= 2;
>> +    chunk2 <<= 2;
>> +
>> +    if (chunk1)
>> +        memcpy(dst, src, chunk1);
>> +
>> +    if (chunk2) {
>> +        src += chunk1;
>> +        dst = (void *)ring->ring;
>> +        memcpy(dst, src, chunk2);
>> +    }
>> +
>> +    ring->wptr += count_dw;
>> +    ring->wptr &= ring->ptr_mask;
>> +    ring->count_dw -= count_dw;
>> +
>> +    drm_dev_exit(idx);
>> +}
> 
> The ring should never be in MMIO memory, so you can completely drop that 
> as far as I can see.

Yeah, it's all in GART; I missed that for some reason...
> 
> Maybe split that patch by use case so that we can more easily review/ack 
> it.

In fact everything here is the same use case. Once I added unmapping of
all MMIO ranges (both registers and VRAM), I got a lot of page faults
on device removal around any memcpy to/from IO. That is where I put the
drm_dev_enter/exit scope. I also searched the code and preemptively
added guards to any other such place. I did drop amdgpu_schedule_ib
from this patch because it has a dma_fence_wait inside, so we
will take care of it once we decide how to handle dma_fence waits.

Andrey
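
For readers following the thread, the guard pattern discussed above can be
illustrated with a minimal userspace sketch. This is not code from the patch:
struct fake_dev, fake_dev_enter(), fake_dev_exit() and guarded_copy_to_dev()
are hypothetical stand-ins that only mimic the shape of
drm_dev_enter()/drm_dev_exit(). Unlike the real drm_dev_exit(), the stub exit
function also takes the device pointer so it can keep a simple counter. The
single "out" label follows the goto style suggested in the review.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a device; 'unplugged' mimics the state
 * that a hot-unplug would set in the real driver. */
struct fake_dev {
    bool unplugged;
    int active_sections;     /* currently open guarded sections */
    unsigned char vram[16];  /* stand-in for an MMIO/VRAM mapping */
};

/* Stub with the same shape as drm_dev_enter(): fails once the device is gone. */
static bool fake_dev_enter(struct fake_dev *dev, int *idx)
{
    if (dev->unplugged)
        return false;
    *idx = dev->active_sections++;
    return true;
}

/* Stub for drm_dev_exit(); takes 'dev' only to maintain the counter. */
static void fake_dev_exit(struct fake_dev *dev, int idx)
{
    (void)idx;
    assert(dev->active_sections > 0); /* catch an unbalanced exit */
    dev->active_sections--;
}

/* Guarded write: skip the access if the device is gone, and leave the
 * guarded section through one exit label on every other path. */
static int guarded_copy_to_dev(struct fake_dev *dev, const void *src, size_t len)
{
    int idx, r = 0;

    if (!fake_dev_enter(dev, &idx))
        return -ENODEV;

    if (len > sizeof(dev->vram)) {
        r = -EINVAL;
        goto out;
    }
    memcpy(dev->vram, src, len);

out:
    fake_dev_exit(dev, idx);
    return r;
}
```

The point of the single exit label is that every path taken after a successful
enter runs the matching exit exactly once, which is easy to get wrong with
multiple early returns.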

> 
> Thanks,
> Christian.
> 
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>> index e7d3d0dbdd96..c67bc6d3d039 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>> @@ -299,53 +299,12 @@ static inline void 
>> amdgpu_ring_set_preempt_cond_exec(struct amdgpu_ring *ring,
>>       *ring->cond_exe_cpu_addr = cond_exec;
>>   }
>> -static inline void amdgpu_ring_clear_ring(struct amdgpu_ring *ring)
>> -{
>> -    int i = 0;
>> -    while (i <= ring->buf_mask)
>> -        ring->ring[i++] = ring->funcs->nop;
>> -
>> -}
>> -
>> -static inline void amdgpu_ring_write(struct amdgpu_ring *ring, 
>> uint32_t v)
>> -{
>> -    if (ring->count_dw <= 0)
>> -        DRM_ERROR("amdgpu: writing more dwords to the ring than 
>> expected!\n");
>> -    ring->ring[ring->wptr++ & ring->buf_mask] = v;
>> -    ring->wptr &= ring->ptr_mask;
>> -    ring->count_dw--;
>> -}
>> +void amdgpu_ring_clear_ring(struct amdgpu_ring *ring);
>> -static inline void amdgpu_ring_write_multiple(struct amdgpu_ring *ring,
>> -                          void *src, int count_dw)
>> -{
>> -    unsigned occupied, chunk1, chunk2;
>> -    void *dst;
>> -
>> -    if (unlikely(ring->count_dw < count_dw))
>> -        DRM_ERROR("amdgpu: writing more dwords to the ring than 
>> expected!\n");
>> -
>> -    occupied = ring->wptr & ring->buf_mask;
>> -    dst = (void *)&ring->ring[occupied];
>> -    chunk1 = ring->buf_mask + 1 - occupied;
>> -    chunk1 = (chunk1 >= count_dw) ? count_dw: chunk1;
>> -    chunk2 = count_dw - chunk1;
>> -    chunk1 <<= 2;
>> -    chunk2 <<= 2;
>> +void amdgpu_ring_write(struct amdgpu_ring *ring, uint32_t v);
>> -    if (chunk1)
>> -        memcpy(dst, src, chunk1);
>> -
>> -    if (chunk2) {
>> -        src += chunk1;
>> -        dst = (void *)ring->ring;
>> -        memcpy(dst, src, chunk2);
>> -    }
>> -
>> -    ring->wptr += count_dw;
>> -    ring->wptr &= ring->ptr_mask;
>> -    ring->count_dw -= count_dw;
>> -}
>> +void amdgpu_ring_write_multiple(struct amdgpu_ring *ring,
>> +                          void *src, int count_dw);
>>   int amdgpu_ring_test_helper(struct amdgpu_ring *ring);
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
>> index c6dbc0801604..82f0542c7792 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
>> @@ -32,6 +32,7 @@
>>   #include <linux/module.h>
>>   #include <drm/drm.h>
>> +#include <drm/drm_drv.h>
>>   #include "amdgpu.h"
>>   #include "amdgpu_pm.h"
>> @@ -375,7 +376,7 @@ int amdgpu_uvd_suspend(struct amdgpu_device *adev)
>>   {
>>       unsigned size;
>>       void *ptr;
>> -    int i, j;
>> +    int i, j, idx;
>>       bool in_ras_intr = amdgpu_ras_intr_triggered();
>>       cancel_delayed_work_sync(&adev->uvd.idle_work);
>> @@ -403,11 +404,15 @@ int amdgpu_uvd_suspend(struct amdgpu_device *adev)
>>           if (!adev->uvd.inst[j].saved_bo)
>>               return -ENOMEM;
>> -        /* re-write 0 since err_event_athub will corrupt VCPU buffer */
>> -        if (in_ras_intr)
>> -            memset(adev->uvd.inst[j].saved_bo, 0, size);
>> -        else
>> -            memcpy_fromio(adev->uvd.inst[j].saved_bo, ptr, size);
>> +        if (drm_dev_enter(&adev->ddev, &idx)) {
>> +            /* re-write 0 since err_event_athub will corrupt VCPU 
>> buffer */
>> +            if (in_ras_intr)
>> +                memset(adev->uvd.inst[j].saved_bo, 0, size);
>> +            else
>> +                memcpy_fromio(adev->uvd.inst[j].saved_bo, ptr, size);
>> +
>> +            drm_dev_exit(idx);
>> +        }
>>       }
>>       if (in_ras_intr)
>> @@ -420,7 +425,7 @@ int amdgpu_uvd_resume(struct amdgpu_device *adev)
>>   {
>>       unsigned size;
>>       void *ptr;
>> -    int i;
>> +    int i, idx;
>>       for (i = 0; i < adev->uvd.num_uvd_inst; i++) {
>>           if (adev->uvd.harvest_config & (1 << i))
>> @@ -432,7 +437,10 @@ int amdgpu_uvd_resume(struct amdgpu_device *adev)
>>           ptr = adev->uvd.inst[i].cpu_addr;
>>           if (adev->uvd.inst[i].saved_bo != NULL) {
>> -            memcpy_toio(ptr, adev->uvd.inst[i].saved_bo, size);
>> +            if (drm_dev_enter(&adev->ddev, &idx)) {
>> +                memcpy_toio(ptr, adev->uvd.inst[i].saved_bo, size);
>> +                drm_dev_exit(idx);
>> +            }
>>               kvfree(adev->uvd.inst[i].saved_bo);
>>               adev->uvd.inst[i].saved_bo = NULL;
>>           } else {
>> @@ -442,8 +450,11 @@ int amdgpu_uvd_resume(struct amdgpu_device *adev)
>>               hdr = (const struct common_firmware_header 
>> *)adev->uvd.fw->data;
>>               if (adev->firmware.load_type != AMDGPU_FW_LOAD_PSP) {
>>                   offset = le32_to_cpu(hdr->ucode_array_offset_bytes);
>> -                memcpy_toio(adev->uvd.inst[i].cpu_addr, 
>> adev->uvd.fw->data + offset,
>> -                        le32_to_cpu(hdr->ucode_size_bytes));
>> +                if (drm_dev_enter(&adev->ddev, &idx)) {
>> +                    memcpy_toio(adev->uvd.inst[i].cpu_addr, 
>> adev->uvd.fw->data + offset,
>> +                            le32_to_cpu(hdr->ucode_size_bytes));
>> +                    drm_dev_exit(idx);
>> +                }
>>                   size -= le32_to_cpu(hdr->ucode_size_bytes);
>>                   ptr += le32_to_cpu(hdr->ucode_size_bytes);
>>               }
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
>> index ea6a62f67e38..833203401ef4 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
>> @@ -29,6 +29,7 @@
>>   #include <linux/module.h>
>>   #include <drm/drm.h>
>> +#include <drm/drm_drv.h>
>>   #include "amdgpu.h"
>>   #include "amdgpu_pm.h"
>> @@ -293,7 +294,7 @@ int amdgpu_vce_resume(struct amdgpu_device *adev)
>>       void *cpu_addr;
>>       const struct common_firmware_header *hdr;
>>       unsigned offset;
>> -    int r;
>> +    int r, idx;
>>       if (adev->vce.vcpu_bo == NULL)
>>           return -EINVAL;
>> @@ -313,8 +314,12 @@ int amdgpu_vce_resume(struct amdgpu_device *adev)
>>       hdr = (const struct common_firmware_header *)adev->vce.fw->data;
>>       offset = le32_to_cpu(hdr->ucode_array_offset_bytes);
>> -    memcpy_toio(cpu_addr, adev->vce.fw->data + offset,
>> -            adev->vce.fw->size - offset);
>> +
>> +    if (drm_dev_enter(&adev->ddev, &idx)) {
>> +        memcpy_toio(cpu_addr, adev->vce.fw->data + offset,
>> +                adev->vce.fw->size - offset);
>> +        drm_dev_exit(idx);
>> +    }
>>       amdgpu_bo_kunmap(adev->vce.vcpu_bo);
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
>> index 201645963ba5..21f7d3644d70 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
>> @@ -27,6 +27,7 @@
>>   #include <linux/firmware.h>
>>   #include <linux/module.h>
>>   #include <linux/pci.h>
>> +#include <drm/drm_drv.h>
>>   #include "amdgpu.h"
>>   #include "amdgpu_pm.h"
>> @@ -275,7 +276,7 @@ int amdgpu_vcn_suspend(struct amdgpu_device *adev)
>>   {
>>       unsigned size;
>>       void *ptr;
>> -    int i;
>> +    int i, idx;
>>       cancel_delayed_work_sync(&adev->vcn.idle_work);
>> @@ -292,7 +293,10 @@ int amdgpu_vcn_suspend(struct amdgpu_device *adev)
>>           if (!adev->vcn.inst[i].saved_bo)
>>               return -ENOMEM;
>> -        memcpy_fromio(adev->vcn.inst[i].saved_bo, ptr, size);
>> +        if (drm_dev_enter(&adev->ddev, &idx)) {
>> +            memcpy_fromio(adev->vcn.inst[i].saved_bo, ptr, size);
>> +            drm_dev_exit(idx);
>> +        }
>>       }
>>       return 0;
>>   }
>> @@ -301,7 +305,7 @@ int amdgpu_vcn_resume(struct amdgpu_device *adev)
>>   {
>>       unsigned size;
>>       void *ptr;
>> -    int i;
>> +    int i, idx;
>>       for (i = 0; i < adev->vcn.num_vcn_inst; ++i) {
>>           if (adev->vcn.harvest_config & (1 << i))
>> @@ -313,7 +317,10 @@ int amdgpu_vcn_resume(struct amdgpu_device *adev)
>>           ptr = adev->vcn.inst[i].cpu_addr;
>>           if (adev->vcn.inst[i].saved_bo != NULL) {
>> -            memcpy_toio(ptr, adev->vcn.inst[i].saved_bo, size);
>> +            if (drm_dev_enter(&adev->ddev, &idx)) {
>> +                memcpy_toio(ptr, adev->vcn.inst[i].saved_bo, size);
>> +                drm_dev_exit(idx);
>> +            }
>>               kvfree(adev->vcn.inst[i].saved_bo);
>>               adev->vcn.inst[i].saved_bo = NULL;
>>           } else {
>> @@ -323,8 +330,11 @@ int amdgpu_vcn_resume(struct amdgpu_device *adev)
>>               hdr = (const struct common_firmware_header 
>> *)adev->vcn.fw->data;
>>               if (adev->firmware.load_type != AMDGPU_FW_LOAD_PSP) {
>>                   offset = le32_to_cpu(hdr->ucode_array_offset_bytes);
>> -                memcpy_toio(adev->vcn.inst[i].cpu_addr, 
>> adev->vcn.fw->data + offset,
>> -                        le32_to_cpu(hdr->ucode_size_bytes));
>> +                if (drm_dev_enter(&adev->ddev, &idx)) {
>> +                    memcpy_toio(adev->vcn.inst[i].cpu_addr, 
>> adev->vcn.fw->data + offset,
>> +                            le32_to_cpu(hdr->ucode_size_bytes));
>> +                    drm_dev_exit(idx);
>> +                }
>>                   size -= le32_to_cpu(hdr->ucode_size_bytes);
>>                   ptr += le32_to_cpu(hdr->ucode_size_bytes);
>>               }
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> index 9f868cf3b832..7dd5f10ab570 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> @@ -32,6 +32,7 @@
>>   #include <linux/dma-buf.h>
>>   #include <drm/amdgpu_drm.h>
>> +#include <drm/drm_drv.h>
>>   #include "amdgpu.h"
>>   #include "amdgpu_trace.h"
>>   #include "amdgpu_amdkfd.h"
>> @@ -1606,7 +1607,10 @@ static int amdgpu_vm_bo_update_mapping(struct 
>> amdgpu_device *adev,
>>       struct amdgpu_vm_update_params params;
>>       enum amdgpu_sync_mode sync_mode;
>>       uint64_t pfn;
>> -    int r;
>> +    int r, idx;
>> +
>> +    if (!drm_dev_enter(&adev->ddev, &idx))
>> +        return -ENODEV;
>>       memset(&params, 0, sizeof(params));
>>       params.adev = adev;
>> @@ -1715,6 +1719,7 @@ static int amdgpu_vm_bo_update_mapping(struct 
>> amdgpu_device *adev,
>>   error_unlock:
>>       amdgpu_vm_eviction_unlock(vm);
>> +    drm_dev_exit(idx);
>>       return r;
>>   }
>> diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c 
>> b/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
>> index 589410c32d09..2cec71e823f5 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
>> @@ -23,6 +23,7 @@
>>   #include <linux/firmware.h>
>>   #include <linux/module.h>
>>   #include <linux/vmalloc.h>
>> +#include <drm/drm_drv.h>
>>   #include "amdgpu.h"
>>   #include "amdgpu_psp.h"
>> @@ -269,10 +270,8 @@ static int psp_v11_0_bootloader_load_kdb(struct 
>> psp_context *psp)
>>       if (ret)
>>           return ret;
>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>> -
>>       /* Copy PSP KDB binary to memory */
>> -    memcpy(psp->fw_pri_buf, psp->kdb_start_addr, psp->kdb_bin_size);
>> +    psp_copy_fw(psp, psp->kdb_start_addr, psp->kdb_bin_size);
>>       /* Provide the PSP KDB to bootloader */
>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>> @@ -302,10 +301,8 @@ static int psp_v11_0_bootloader_load_spl(struct 
>> psp_context *psp)
>>       if (ret)
>>           return ret;
>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>> -
>>       /* Copy PSP SPL binary to memory */
>> -    memcpy(psp->fw_pri_buf, psp->spl_start_addr, psp->spl_bin_size);
>> +    psp_copy_fw(psp, psp->spl_start_addr, psp->spl_bin_size);
>>       /* Provide the PSP SPL to bootloader */
>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>> @@ -335,10 +332,8 @@ static int 
>> psp_v11_0_bootloader_load_sysdrv(struct psp_context *psp)
>>       if (ret)
>>           return ret;
>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>> -
>>       /* Copy PSP System Driver binary to memory */
>> -    memcpy(psp->fw_pri_buf, psp->sys_start_addr, psp->sys_bin_size);
>> +    psp_copy_fw(psp, psp->sys_start_addr, psp->sys_bin_size);
>>       /* Provide the sys driver to bootloader */
>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>> @@ -371,10 +366,8 @@ static int psp_v11_0_bootloader_load_sos(struct 
>> psp_context *psp)
>>       if (ret)
>>           return ret;
>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>> -
>>       /* Copy Secure OS binary to PSP memory */
>> -    memcpy(psp->fw_pri_buf, psp->sos_start_addr, psp->sos_bin_size);
>> +    psp_copy_fw(psp, psp->sos_start_addr, psp->sos_bin_size);
>>       /* Provide the PSP secure OS to bootloader */
>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>> @@ -608,7 +601,7 @@ static int psp_v11_0_memory_training(struct 
>> psp_context *psp, uint32_t ops)
>>       uint32_t p2c_header[4];
>>       uint32_t sz;
>>       void *buf;
>> -    int ret;
>> +    int ret, idx;
>>       if (ctx->init == PSP_MEM_TRAIN_NOT_SUPPORT) {
>>           DRM_DEBUG("Memory training is not supported.\n");
>> @@ -681,17 +674,24 @@ static int psp_v11_0_memory_training(struct 
>> psp_context *psp, uint32_t ops)
>>               return -ENOMEM;
>>           }
>> -        memcpy_fromio(buf, adev->mman.aper_base_kaddr, sz);
>> -        ret = psp_v11_0_memory_training_send_msg(psp, 
>> PSP_BL__DRAM_LONG_TRAIN);
>> -        if (ret) {
>> -            DRM_ERROR("Send long training msg failed.\n");
>> +        if (drm_dev_enter(&adev->ddev, &idx)) {
>> +            memcpy_fromio(buf, adev->mman.aper_base_kaddr, sz);
>> +            ret = psp_v11_0_memory_training_send_msg(psp, 
>> PSP_BL__DRAM_LONG_TRAIN);
>> +            if (ret) {
>> +                DRM_ERROR("Send long training msg failed.\n");
>> +                vfree(buf);
>> +                drm_dev_exit(idx);
>> +                return ret;
>> +            }
>> +
>> +            memcpy_toio(adev->mman.aper_base_kaddr, buf, sz);
>> +            adev->hdp.funcs->flush_hdp(adev, NULL);
>>               vfree(buf);
>> -            return ret;
>> +            drm_dev_exit(idx);
>> +        } else {
>> +            vfree(buf);
>> +            return -ENODEV;
>>           }
>> -
>> -        memcpy_toio(adev->mman.aper_base_kaddr, buf, sz);
>> -        adev->hdp.funcs->flush_hdp(adev, NULL);
>> -        vfree(buf);
>>       }
>>       if (ops & PSP_MEM_TRAIN_SAVE) {
>> diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c 
>> b/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
>> index c4828bd3264b..618e5b6b85d9 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
>> @@ -138,10 +138,8 @@ static int 
>> psp_v12_0_bootloader_load_sysdrv(struct psp_context *psp)
>>       if (ret)
>>           return ret;
>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>> -
>>       /* Copy PSP System Driver binary to memory */
>> -    memcpy(psp->fw_pri_buf, psp->sys_start_addr, psp->sys_bin_size);
>> +    psp_copy_fw(psp, psp->sys_start_addr, psp->sys_bin_size);
>>       /* Provide the sys driver to bootloader */
>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>> @@ -179,10 +177,8 @@ static int psp_v12_0_bootloader_load_sos(struct 
>> psp_context *psp)
>>       if (ret)
>>           return ret;
>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>> -
>>       /* Copy Secure OS binary to PSP memory */
>> -    memcpy(psp->fw_pri_buf, psp->sos_start_addr, psp->sos_bin_size);
>> +    psp_copy_fw(psp, psp->sos_start_addr, psp->sos_bin_size);
>>       /* Provide the PSP secure OS to bootloader */
>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>> diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c 
>> b/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
>> index f2e725f72d2f..d0a6cccd0897 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
>> @@ -102,10 +102,8 @@ static int psp_v3_1_bootloader_load_sysdrv(struct 
>> psp_context *psp)
>>       if (ret)
>>           return ret;
>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>> -
>>       /* Copy PSP System Driver binary to memory */
>> -    memcpy(psp->fw_pri_buf, psp->sys_start_addr, psp->sys_bin_size);
>> +    psp_copy_fw(psp, psp->sys_start_addr, psp->sys_bin_size);
>>       /* Provide the sys driver to bootloader */
>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>> @@ -143,10 +141,8 @@ static int psp_v3_1_bootloader_load_sos(struct 
>> psp_context *psp)
>>       if (ret)
>>           return ret;
>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>> -
>>       /* Copy Secure OS binary to PSP memory */
>> -    memcpy(psp->fw_pri_buf, psp->sos_start_addr, psp->sos_bin_size);
>> +    psp_copy_fw(psp, psp->sos_start_addr, psp->sos_bin_size);
>>       /* Provide the PSP secure OS to bootloader */
>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>> diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c 
>> b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
>> index 8e238dea7bef..90910d19db12 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
>> @@ -25,6 +25,7 @@
>>    */
>>   #include <linux/firmware.h>
>> +#include <drm/drm_drv.h>
>>   #include "amdgpu.h"
>>   #include "amdgpu_vce.h"
>> @@ -555,16 +556,19 @@ static int vce_v4_0_hw_fini(void *handle)
>>   static int vce_v4_0_suspend(void *handle)
>>   {
>>       struct amdgpu_device *adev = (struct amdgpu_device *)handle;
>> -    int r;
>> +    int r, idx;
>>       if (adev->vce.vcpu_bo == NULL)
>>           return 0;
>> -    if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) {
>> -        unsigned size = amdgpu_bo_size(adev->vce.vcpu_bo);
>> -        void *ptr = adev->vce.cpu_addr;
>> +    if (drm_dev_enter(&adev->ddev, &idx)) {
>> +        if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) {
>> +            unsigned size = amdgpu_bo_size(adev->vce.vcpu_bo);
>> +            void *ptr = adev->vce.cpu_addr;
>> -        memcpy_fromio(adev->vce.saved_bo, ptr, size);
>> +            memcpy_fromio(adev->vce.saved_bo, ptr, size);
>> +        }
>> +        drm_dev_exit(idx);
>>       }
>>       r = vce_v4_0_hw_fini(adev);
>> @@ -577,16 +581,20 @@ static int vce_v4_0_suspend(void *handle)
>>   static int vce_v4_0_resume(void *handle)
>>   {
>>       struct amdgpu_device *adev = (struct amdgpu_device *)handle;
>> -    int r;
>> +    int r, idx;
>>       if (adev->vce.vcpu_bo == NULL)
>>           return -EINVAL;
>>       if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) {
>> -        unsigned size = amdgpu_bo_size(adev->vce.vcpu_bo);
>> -        void *ptr = adev->vce.cpu_addr;
>> -        memcpy_toio(ptr, adev->vce.saved_bo, size);
>> +        if (drm_dev_enter(&adev->ddev, &idx)) {
>> +            unsigned size = amdgpu_bo_size(adev->vce.vcpu_bo);
>> +            void *ptr = adev->vce.cpu_addr;
>> +
>> +            memcpy_toio(ptr, adev->vce.saved_bo, size);
>> +            drm_dev_exit(idx);
>> +        }
>>       } else {
>>           r = amdgpu_vce_resume(adev);
>>           if (r)
>> diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c 
>> b/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
>> index 3f15bf34123a..df34be8ec82d 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
>> @@ -34,6 +34,8 @@
>>   #include "vcn/vcn_3_0_0_sh_mask.h"
>>   #include "ivsrcid/vcn/irqsrcs_vcn_2_0.h"
>> +#include <drm/drm_drv.h>
>> +
>>   #define mmUVD_CONTEXT_ID_INTERNAL_OFFSET            0x27
>>   #define mmUVD_GPCOM_VCPU_CMD_INTERNAL_OFFSET            0x0f
>>   #define mmUVD_GPCOM_VCPU_DATA0_INTERNAL_OFFSET            0x10
>> @@ -268,16 +270,20 @@ static int vcn_v3_0_sw_init(void *handle)
>>   static int vcn_v3_0_sw_fini(void *handle)
>>   {
>>       struct amdgpu_device *adev = (struct amdgpu_device *)handle;
>> -    int i, r;
>> +    int i, r, idx;
>> -    for (i = 0; i < adev->vcn.num_vcn_inst; i++) {
>> -        volatile struct amdgpu_fw_shared *fw_shared;
>> +    if (drm_dev_enter(&adev->ddev, &idx)) {
>> +        for (i = 0; i < adev->vcn.num_vcn_inst; i++) {
>> +            volatile struct amdgpu_fw_shared *fw_shared;
>> -        if (adev->vcn.harvest_config & (1 << i))
>> -            continue;
>> -        fw_shared = adev->vcn.inst[i].fw_shared_cpu_addr;
>> -        fw_shared->present_flag_0 = 0;
>> -        fw_shared->sw_ring.is_enabled = false;
>> +            if (adev->vcn.harvest_config & (1 << i))
>> +                continue;
>> +            fw_shared = adev->vcn.inst[i].fw_shared_cpu_addr;
>> +            fw_shared->present_flag_0 = 0;
>> +            fw_shared->sw_ring.is_enabled = false;
>> +        }
>> +
>> +        drm_dev_exit(idx);
>>       }
>>       if (amdgpu_sriov_vf(adev))
>> diff --git a/drivers/gpu/drm/amd/pm/powerplay/smumgr/smu7_smumgr.c 
>> b/drivers/gpu/drm/amd/pm/powerplay/smumgr/smu7_smumgr.c
>> index aae25243eb10..d628b91846c9 100644
>> --- a/drivers/gpu/drm/amd/pm/powerplay/smumgr/smu7_smumgr.c
>> +++ b/drivers/gpu/drm/amd/pm/powerplay/smumgr/smu7_smumgr.c
>> @@ -405,6 +405,8 @@ int smu7_request_smu_load_fw(struct pp_hwmgr *hwmgr)
>>                   UCODE_ID_MEC_STORAGE, &toc->entry[toc->num_entries++]),
>>                   "Failed to Get Firmware Entry.", r = -EINVAL; goto 
>> failed);
>>       }
>> +
>> +    /* AG TODO Can't call drm_dev_enter/exit because access 
>> adev->ddev here ... */
>>       memcpy_toio(smu_data->header_buffer.kaddr, smu_data->toc,
>>               sizeof(struct SMU_DRAMData_TOC));
>>       smum_send_msg_to_smc_with_parameter(hwmgr,
> 


* Re: [PATCH v6 10/16] drm/amdgpu: Guard against write accesses after device removal
  2021-05-11 17:52       ` Andrey Grodzovsky
  (?)
@ 2021-05-12 14:01         ` Andrey Grodzovsky
  -1 siblings, 0 replies; 126+ messages in thread
From: Andrey Grodzovsky @ 2021-05-12 14:01 UTC (permalink / raw)
  To: Christian König, dri-devel, amd-gfx, linux-pci,
	daniel.vetter, Harry.Wentland
  Cc: ppaalanen, Alexander.Deucher, gregkh, helgaas, Felix.Kuehling

Ping - I need confirmation that it's OK to keep this as a single patch, given
my explanation below.

Andrey

On 2021-05-11 1:52 p.m., Andrey Grodzovsky wrote:
> 
> 
> On 2021-05-11 2:50 a.m., Christian König wrote:
>> Am 10.05.21 um 18:36 schrieb Andrey Grodzovsky:
>>> This should prevent writing to memory or IO ranges possibly
>>> already allocated for other uses after our device is removed.
>>>
>>> v5:
>>> Protect more places where memcpy_to/from_io takes place
>>> Protect IB submissions
>>>
>>> v6: Switch to !drm_dev_enter instead of scoping entire code
>>> with brackets.
>>>
>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>>> ---
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c    | 11 ++-
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c       |  9 +++
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c        | 17 +++--
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c       | 63 +++++++++++------
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h       |  2 +
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c      | 70 +++++++++++++++++++
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h      | 49 ++-----------
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c       | 31 +++++---
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c       | 11 ++-
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c       | 22 ++++--
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c        |  7 +-
>>>   drivers/gpu/drm/amd/amdgpu/psp_v11_0.c        | 44 ++++++------
>>>   drivers/gpu/drm/amd/amdgpu/psp_v12_0.c        |  8 +--
>>>   drivers/gpu/drm/amd/amdgpu/psp_v3_1.c         |  8 +--
>>>   drivers/gpu/drm/amd/amdgpu/vce_v4_0.c         | 26 ++++---
>>>   drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c         | 22 +++---
>>>   .../drm/amd/pm/powerplay/smumgr/smu7_smumgr.c |  2 +
>>>   17 files changed, 257 insertions(+), 145 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>> index a0bff4713672..94c415176cdc 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>> @@ -71,6 +71,8 @@
>>>   #include <drm/task_barrier.h>
>>>   #include <linux/pm_runtime.h>
>>> +#include <drm/drm_drv.h>
>>> +
>>>   MODULE_FIRMWARE("amdgpu/vega10_gpu_info.bin");
>>>   MODULE_FIRMWARE("amdgpu/vega12_gpu_info.bin");
>>>   MODULE_FIRMWARE("amdgpu/raven_gpu_info.bin");
>>> @@ -281,7 +283,10 @@ void amdgpu_device_vram_access(struct 
>>> amdgpu_device *adev, loff_t pos,
>>>       unsigned long flags;
>>>       uint32_t hi = ~0;
>>>       uint64_t last;
>>> +    int idx;
>>> +     if (!drm_dev_enter(&adev->ddev, &idx))
>>> +         return;
>>>   #ifdef CONFIG_64BIT
>>>       last = min(pos + size, adev->gmc.visible_vram_size);
>>> @@ -299,8 +304,10 @@ void amdgpu_device_vram_access(struct 
>>> amdgpu_device *adev, loff_t pos,
>>>               memcpy_fromio(buf, addr, count);
>>>           }
>>> -        if (count == size)
>>> +        if (count == size) {
>>> +            drm_dev_exit(idx);
>>>               return;
>>> +        }
>>
>> Maybe use a goto instead, but that's really just a nitpick.
>>
>>
>>
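A minimal userspace sketch of the goto variant suggested above, assuming
simplified stand-ins for the real helpers: fake_dev, fake_dev_enter(),
fake_dev_exit() and fake_vram_read() are hypothetical names used only for
illustration and are not part of the patch or of DRM. The point is that the
fast-path early return and the normal return share one exit label, so the
exit call is written exactly once.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace stand-ins for drm_dev_enter()/drm_dev_exit(). */
struct fake_dev {
	bool unplugged;	/* set once the device has been removed */
	int active;	/* currently open enter/exit sections */
};

static bool fake_dev_enter(struct fake_dev *dev, int *idx)
{
	if (dev->unplugged)
		return false;
	*idx = dev->active++;
	return true;
}

static void fake_dev_exit(struct fake_dev *dev, int idx)
{
	/* Sections must be closed in the reverse order they were opened. */
	assert(dev->active > 0 && idx == dev->active - 1);
	dev->active--;
}

/*
 * Read size bytes starting at pos. Bytes below `visible` are copied in
 * one go (the fast path); the rest is read byte by byte, standing in for
 * the indexed register path. Returns the number of bytes read.
 */
static size_t fake_vram_read(struct fake_dev *dev, const unsigned char *vram,
			     size_t visible, size_t pos,
			     unsigned char *buf, size_t size)
{
	size_t done = 0;
	int idx;

	if (!fake_dev_enter(dev, &idx))
		return 0;

	if (pos < visible) {
		size_t count = visible - pos;

		if (count > size)
			count = size;
		memcpy(buf, vram + pos, count);
		done = count;
		if (done == size)
			goto exit;	/* fast path covered the whole request */
		pos += count;
	}

	/* Slow path for whatever the fast path could not reach. */
	while (done < size)
		buf[done++] = vram[pos++];

exit:
	fake_dev_exit(dev, idx);
	return done;
}
```

With a single label, any exit path added later only needs a "goto exit"
and cannot forget the matching exit call.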
>>>           pos += count;
>>>           buf += count / 4;
>>> @@ -323,6 +330,8 @@ void amdgpu_device_vram_access(struct 
>>> amdgpu_device *adev, loff_t pos,
>>>               *buf++ = RREG32_NO_KIQ(mmMM_DATA);
>>>       }
>>>       spin_unlock_irqrestore(&adev->mmio_idx_lock, flags);
>>> +
>>> +    drm_dev_exit(idx);
>>>   }
>>>   /*
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
>>> index 4d32233cde92..04ba5eef1e88 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
>>> @@ -31,6 +31,8 @@
>>>   #include "amdgpu_ras.h"
>>>   #include "amdgpu_xgmi.h"
>>> +#include <drm/drm_drv.h>
>>> +
>>>   /**
>>>    * amdgpu_gmc_pdb0_alloc - allocate vram for pdb0
>>>    *
>>> @@ -151,6 +153,10 @@ int amdgpu_gmc_set_pte_pde(struct amdgpu_device 
>>> *adev, void *cpu_pt_addr,
>>>   {
>>>       void __iomem *ptr = (void *)cpu_pt_addr;
>>>       uint64_t value;
>>> +    int idx;
>>> +
>>> +    if (!drm_dev_enter(&adev->ddev, &idx))
>>> +        return 0;
>>>       /*
>>>        * The following is for PTE only. GART does not have PDEs.
>>> @@ -158,6 +164,9 @@ int amdgpu_gmc_set_pte_pde(struct amdgpu_device 
>>> *adev, void *cpu_pt_addr,
>>>       value = addr & 0x0000FFFFFFFFF000ULL;
>>>       value |= flags;
>>>       writeq(value, ptr + (gpu_page_idx * 8));
>>> +
>>> +    drm_dev_exit(idx);
>>> +
>>>       return 0;
>>>   }
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
>>> index 148a3b481b12..62fcbd446c71 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
>>> @@ -30,6 +30,7 @@
>>>   #include <linux/slab.h>
>>>   #include <drm/amdgpu_drm.h>
>>> +#include <drm/drm_drv.h>
>>>   #include "amdgpu.h"
>>>   #include "atom.h"
>>> @@ -137,7 +138,7 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, 
>>> unsigned num_ibs,
>>>       bool secure;
>>>       unsigned i;
>>> -    int r = 0;
>>> +    int idx, r = 0;
>>>       bool need_pipe_sync = false;
>>>       if (num_ibs == 0)
>>> @@ -169,13 +170,16 @@ int amdgpu_ib_schedule(struct amdgpu_ring 
>>> *ring, unsigned num_ibs,
>>>           return -EINVAL;
>>>       }
>>> +    if (!drm_dev_enter(&adev->ddev, &idx))
>>> +        return -ENODEV;
>>> +
>>>       alloc_size = ring->funcs->emit_frame_size + num_ibs *
>>>           ring->funcs->emit_ib_size;
>>>       r = amdgpu_ring_alloc(ring, alloc_size);
>>>       if (r) {
>>>           dev_err(adev->dev, "scheduling IB failed (%d).\n", r);
>>> -        return r;
>>> +        goto exit;
>>>       }
>>>       need_ctx_switch = ring->current_ctx != fence_ctx;
>>> @@ -205,7 +209,7 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, 
>>> unsigned num_ibs,
>>>           r = amdgpu_vm_flush(ring, job, need_pipe_sync);
>>>           if (r) {
>>>               amdgpu_ring_undo(ring);
>>> -            return r;
>>> +            goto exit;
>>>           }
>>>       }
>>> @@ -286,7 +290,7 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, 
>>> unsigned num_ibs,
>>>           if (job && job->vmid)
>>>               amdgpu_vmid_reset(adev, ring->funcs->vmhub, job->vmid);
>>>           amdgpu_ring_undo(ring);
>>> -        return r;
>>> +        goto exit;
>>>       }
>>>       if (ring->funcs->insert_end)
>>> @@ -304,7 +308,10 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, 
>>> unsigned num_ibs,
>>>           ring->funcs->emit_wave_limit(ring, false);
>>>       amdgpu_ring_commit(ring);
>>> -    return 0;
>>> +
>>> +exit:
>>> +    drm_dev_exit(idx);
>>> +    return r;
>>>   }
>>>   /**
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
>>> index 9e769cf6095b..bb6afee61666 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
>>> @@ -25,6 +25,7 @@
>>>   #include <linux/firmware.h>
>>>   #include <linux/dma-mapping.h>
>>> +#include <drm/drm_drv.h>
>>>   #include "amdgpu.h"
>>>   #include "amdgpu_psp.h"
>>> @@ -39,6 +40,8 @@
>>>   #include "amdgpu_ras.h"
>>>   #include "amdgpu_securedisplay.h"
>>> +#include <drm/drm_drv.h>
>>> +
>>>   static int psp_sysfs_init(struct amdgpu_device *adev);
>>>   static void psp_sysfs_fini(struct amdgpu_device *adev);
>>> @@ -253,7 +256,7 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>>              struct psp_gfx_cmd_resp *cmd, uint64_t fence_mc_addr)
>>>   {
>>>       int ret;
>>> -    int index;
>>> +    int index, idx;
>>>       int timeout = 20000;
>>>       bool ras_intr = false;
>>>       bool skip_unsupport = false;
>>> @@ -261,6 +264,9 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>>       if (psp->adev->in_pci_err_recovery)
>>>           return 0;
>>> +    if (!drm_dev_enter(&psp->adev->ddev, &idx))
>>> +        return 0;
>>> +
>>>       mutex_lock(&psp->mutex);
>>>       memset(psp->cmd_buf_mem, 0, PSP_CMD_BUFFER_SIZE);
>>> @@ -271,8 +277,7 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>>       ret = psp_ring_cmd_submit(psp, psp->cmd_buf_mc_addr, 
>>> fence_mc_addr, index);
>>>       if (ret) {
>>>           atomic_dec(&psp->fence_value);
>>> -        mutex_unlock(&psp->mutex);
>>> -        return ret;
>>> +        goto exit;
>>>       }
>>>       amdgpu_asic_invalidate_hdp(psp->adev, NULL);
>>> @@ -312,8 +317,8 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>>                psp->cmd_buf_mem->cmd_id,
>>>                psp->cmd_buf_mem->resp.status);
>>>           if (!timeout) {
>>> -            mutex_unlock(&psp->mutex);
>>> -            return -EINVAL;
>>> +            ret = -EINVAL;
>>> +            goto exit;
>>>           }
>>>       }
>>> @@ -321,8 +326,10 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>>           ucode->tmr_mc_addr_lo = psp->cmd_buf_mem->resp.fw_addr_lo;
>>>           ucode->tmr_mc_addr_hi = psp->cmd_buf_mem->resp.fw_addr_hi;
>>>       }
>>> -    mutex_unlock(&psp->mutex);
>>> +exit:
>>> +    mutex_unlock(&psp->mutex);
>>> +    drm_dev_exit(idx);
>>>       return ret;
>>>   }
>>> @@ -359,8 +366,7 @@ static int psp_load_toc(struct psp_context *psp,
>>>       if (!cmd)
>>>           return -ENOMEM;
>>>       /* Copy toc to psp firmware private buffer */
>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>> -    memcpy(psp->fw_pri_buf, psp->toc_start_addr, psp->toc_bin_size);
>>> +    psp_copy_fw(psp, psp->toc_start_addr, psp->toc_bin_size);
>>>       psp_prep_load_toc_cmd_buf(cmd, psp->fw_pri_mc_addr, 
>>> psp->toc_bin_size);
>>> @@ -625,8 +631,7 @@ static int psp_asd_load(struct psp_context *psp)
>>>       if (!cmd)
>>>           return -ENOMEM;
>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>> -    memcpy(psp->fw_pri_buf, psp->asd_start_addr, psp->asd_ucode_size);
>>> +    psp_copy_fw(psp, psp->asd_start_addr, psp->asd_ucode_size);
>>>       psp_prep_asd_load_cmd_buf(cmd, psp->fw_pri_mc_addr,
>>>                     psp->asd_ucode_size);
>>> @@ -781,8 +786,7 @@ static int psp_xgmi_load(struct psp_context *psp)
>>>       if (!cmd)
>>>           return -ENOMEM;
>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>> -    memcpy(psp->fw_pri_buf, psp->ta_xgmi_start_addr, 
>>> psp->ta_xgmi_ucode_size);
>>> +    psp_copy_fw(psp, psp->ta_xgmi_start_addr, psp->ta_xgmi_ucode_size);
>>>       psp_prep_ta_load_cmd_buf(cmd,
>>>                    psp->fw_pri_mc_addr,
>>> @@ -1038,8 +1042,7 @@ static int psp_ras_load(struct psp_context *psp)
>>>       if (!cmd)
>>>           return -ENOMEM;
>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>> -    memcpy(psp->fw_pri_buf, psp->ta_ras_start_addr, 
>>> psp->ta_ras_ucode_size);
>>> +    psp_copy_fw(psp, psp->ta_ras_start_addr, psp->ta_ras_ucode_size);
>>>       psp_prep_ta_load_cmd_buf(cmd,
>>>                    psp->fw_pri_mc_addr,
>>> @@ -1275,8 +1278,7 @@ static int psp_hdcp_load(struct psp_context *psp)
>>>       if (!cmd)
>>>           return -ENOMEM;
>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>> -    memcpy(psp->fw_pri_buf, psp->ta_hdcp_start_addr,
>>> +    psp_copy_fw(psp, psp->ta_hdcp_start_addr,
>>>              psp->ta_hdcp_ucode_size);
>>>       psp_prep_ta_load_cmd_buf(cmd,
>>> @@ -1427,8 +1429,7 @@ static int psp_dtm_load(struct psp_context *psp)
>>>       if (!cmd)
>>>           return -ENOMEM;
>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>> -    memcpy(psp->fw_pri_buf, psp->ta_dtm_start_addr, 
>>> psp->ta_dtm_ucode_size);
>>> +    psp_copy_fw(psp, psp->ta_dtm_start_addr, psp->ta_dtm_ucode_size);
>>>       psp_prep_ta_load_cmd_buf(cmd,
>>>                    psp->fw_pri_mc_addr,
>>> @@ -1573,8 +1574,7 @@ static int psp_rap_load(struct psp_context *psp)
>>>       if (!cmd)
>>>           return -ENOMEM;
>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>> -    memcpy(psp->fw_pri_buf, psp->ta_rap_start_addr, 
>>> psp->ta_rap_ucode_size);
>>> +    psp_copy_fw(psp, psp->ta_rap_start_addr, psp->ta_rap_ucode_size);
>>>       psp_prep_ta_load_cmd_buf(cmd,
>>>                    psp->fw_pri_mc_addr,
>>> @@ -3022,7 +3022,7 @@ static ssize_t 
>>> psp_usbc_pd_fw_sysfs_write(struct device *dev,
>>>       struct amdgpu_device *adev = drm_to_adev(ddev);
>>>       void *cpu_addr;
>>>       dma_addr_t dma_addr;
>>> -    int ret;
>>> +    int ret, idx;
>>>       char fw_name[100];
>>>       const struct firmware *usbc_pd_fw;
>>> @@ -3031,6 +3031,9 @@ static ssize_t 
>>> psp_usbc_pd_fw_sysfs_write(struct device *dev,
>>>           return -EBUSY;
>>>       }
>>> +    if (!drm_dev_enter(ddev, &idx))
>>> +        return -ENODEV;
>>> +
>>>       snprintf(fw_name, sizeof(fw_name), "amdgpu/%s", buf);
>>>       ret = request_firmware(&usbc_pd_fw, fw_name, adev->dev);
>>>       if (ret)
>>> @@ -3062,16 +3065,30 @@ static ssize_t 
>>> psp_usbc_pd_fw_sysfs_write(struct device *dev,
>>>   rel_buf:
>>>       dma_free_coherent(adev->dev, usbc_pd_fw->size, cpu_addr, 
>>> dma_addr);
>>>       release_firmware(usbc_pd_fw);
>>> -
>>>   fail:
>>>       if (ret) {
>>>           DRM_ERROR("Failed to load USBC PD FW, err = %d", ret);
>>> -        return ret;
>>> +        count = ret;
>>>       }
>>> +    drm_dev_exit(idx);
>>>       return count;
>>>   }
>>> +void psp_copy_fw(struct psp_context *psp, uint8_t *start_addr, 
>>> uint32_t bin_size)
>>> +{
>>> +    int idx;
>>> +
>>> +    if (!drm_dev_enter(&psp->adev->ddev, &idx))
>>> +        return;
>>> +
>>> +    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>> +    memcpy(psp->fw_pri_buf, start_addr, bin_size);
>>> +
>>> +    drm_dev_exit(idx);
>>> +}
>>> +
>>> +
>>>   static DEVICE_ATTR(usbc_pd_fw, S_IRUGO | S_IWUSR,
>>>              psp_usbc_pd_fw_sysfs_read,
>>>              psp_usbc_pd_fw_sysfs_write);
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
>>> index 46a5328e00e0..2bfdc278817f 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
>>> @@ -423,4 +423,6 @@ int psp_get_fw_attestation_records_addr(struct 
>>> psp_context *psp,
>>>   int psp_load_fw_list(struct psp_context *psp,
>>>                struct amdgpu_firmware_info **ucode_list, int 
>>> ucode_count);
>>> +void psp_copy_fw(struct psp_context *psp, uint8_t *start_addr, 
>>> uint32_t bin_size);
>>> +
>>>   #endif
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>>> index 688624ebe421..e1985bc34436 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>>> @@ -35,6 +35,8 @@
>>>   #include "amdgpu.h"
>>>   #include "atom.h"
>>> +#include <drm/drm_drv.h>
>>> +
>>>   /*
>>>    * Rings
>>>    * Most engines on the GPU are fed via ring buffers.  Ring
>>> @@ -461,3 +463,71 @@ int amdgpu_ring_test_helper(struct amdgpu_ring 
>>> *ring)
>>>       ring->sched.ready = !r;
>>>       return r;
>>>   }
>>> +
>>> +void amdgpu_ring_clear_ring(struct amdgpu_ring *ring)
>>> +{
>>> +    int idx;
>>> +    int i = 0;
>>> +
>>> +    if (!drm_dev_enter(&ring->adev->ddev, &idx))
>>> +        return;
>>> +
>>> +    while (i <= ring->buf_mask)
>>> +        ring->ring[i++] = ring->funcs->nop;
>>> +
>>> +    drm_dev_exit(idx);
>>> +
>>> +}
>>> +
>>> +void amdgpu_ring_write(struct amdgpu_ring *ring, uint32_t v)
>>> +{
>>> +    int idx;
>>> +
>>> +    if (!drm_dev_enter(&ring->adev->ddev, &idx))
>>> +        return;
>>> +
>>> +    if (ring->count_dw <= 0)
>>> +        DRM_ERROR("amdgpu: writing more dwords to the ring than 
>>> expected!\n");
>>> +    ring->ring[ring->wptr++ & ring->buf_mask] = v;
>>> +    ring->wptr &= ring->ptr_mask;
>>> +    ring->count_dw--;
>>> +
>>> +    drm_dev_exit(idx);
>>> +}
>>> +
>>> +void amdgpu_ring_write_multiple(struct amdgpu_ring *ring,
>>> +                          void *src, int count_dw)
>>> +{
>>> +    unsigned occupied, chunk1, chunk2;
>>> +    void *dst;
>>> +    int idx;
>>> +
>>> +    if (!drm_dev_enter(&ring->adev->ddev, &idx))
>>> +        return;
>>> +
>>> +    if (unlikely(ring->count_dw < count_dw))
>>> +        DRM_ERROR("amdgpu: writing more dwords to the ring than 
>>> expected!\n");
>>> +
>>> +    occupied = ring->wptr & ring->buf_mask;
>>> +    dst = (void *)&ring->ring[occupied];
>>> +    chunk1 = ring->buf_mask + 1 - occupied;
>>> +    chunk1 = (chunk1 >= count_dw) ? count_dw: chunk1;
>>> +    chunk2 = count_dw - chunk1;
>>> +    chunk1 <<= 2;
>>> +    chunk2 <<= 2;
>>> +
>>> +    if (chunk1)
>>> +        memcpy(dst, src, chunk1);
>>> +
>>> +    if (chunk2) {
>>> +        src += chunk1;
>>> +        dst = (void *)ring->ring;
>>> +        memcpy(dst, src, chunk2);
>>> +    }
>>> +
>>> +    ring->wptr += count_dw;
>>> +    ring->wptr &= ring->ptr_mask;
>>> +    ring->count_dw -= count_dw;
>>> +
>>> +    drm_dev_exit(idx);
>>> +}
>>
>> The ring should never be in MMIO memory, so you can completely drop 
>> that as far as I can see.
> 
> Yeah, it's all in GART, I missed it for some reason...
>>
>> Maybe split that patch by use case so that we can more easily 
>> review/ack it.
> 
> In fact everything here is the same use case: once I added unmapping of
> all MMIO ranges (both registers and VRAM) I got a lot of page faults
> on device remove around any memcpy to/from IO. That's where I put the
> drm_dev_enter/exit scope. I also searched the code and preemptively
> added guards to any other such place. I did drop amdgpu_ib_schedule
> from this patch because it has a dma_fence_wait inside, so we
> will take care of it once we decide how to handle dma_fence waits.
> 
> Andrey
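
To make the guard pattern described above concrete, here is a small
userspace sketch, under stated assumptions: fake_psp, fake_psp_enter(),
fake_psp_exit(), fake_copy_fw() and FAKE_FW_BUF_SIZE are hypothetical
stand-ins, not the driver's API. It mirrors the shape of psp_copy_fw() in
the patch (early return when the device is gone, then clear and copy), with
one deliberate difference: the sketch returns an error code so the caller
can tell that the copy was skipped, whereas the helper in the patch returns
void.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define FAKE_FW_BUF_SIZE 64	/* hypothetical; stands in for PSP_1_MEG */

/* Hypothetical stand-in for the PSP context and its device state. */
struct fake_psp {
	bool unplugged;			/* set once the device is removed */
	int active;			/* currently open guarded sections */
	uint8_t fw_pri_buf[FAKE_FW_BUF_SIZE];
};

static bool fake_psp_enter(struct fake_psp *psp, int *idx)
{
	if (psp->unplugged)
		return false;
	*idx = psp->active++;
	return true;
}

static void fake_psp_exit(struct fake_psp *psp, int idx)
{
	assert(psp->active > 0 && idx == psp->active - 1);
	psp->active--;
}

/*
 * Clear the private buffer and copy a firmware image into it, but only
 * while the device is still present. Returns 0 on success, -EINVAL if
 * the image does not fit, -ENODEV if the device has been removed.
 */
static int fake_copy_fw(struct fake_psp *psp, const uint8_t *start,
			uint32_t size)
{
	int idx;

	if (size > FAKE_FW_BUF_SIZE)
		return -EINVAL;

	if (!fake_psp_enter(psp, &idx))
		return -ENODEV;	/* buffer is left untouched */

	memset(psp->fw_pri_buf, 0, FAKE_FW_BUF_SIZE);
	memcpy(psp->fw_pri_buf, start, size);

	fake_psp_exit(psp, idx);
	return 0;
}
```

Putting the check inside one helper keeps every call site a one-liner,
which is the same consolidation the patch does for the repeated
memset/memcpy pairs.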
> 
>>
>> Thanks,
>> Christian.
>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>>> index e7d3d0dbdd96..c67bc6d3d039 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>>> @@ -299,53 +299,12 @@ static inline void 
>>> amdgpu_ring_set_preempt_cond_exec(struct amdgpu_ring *ring,
>>>       *ring->cond_exe_cpu_addr = cond_exec;
>>>   }
>>> -static inline void amdgpu_ring_clear_ring(struct amdgpu_ring *ring)
>>> -{
>>> -    int i = 0;
>>> -    while (i <= ring->buf_mask)
>>> -        ring->ring[i++] = ring->funcs->nop;
>>> -
>>> -}
>>> -
>>> -static inline void amdgpu_ring_write(struct amdgpu_ring *ring, 
>>> uint32_t v)
>>> -{
>>> -    if (ring->count_dw <= 0)
>>> -        DRM_ERROR("amdgpu: writing more dwords to the ring than 
>>> expected!\n");
>>> -    ring->ring[ring->wptr++ & ring->buf_mask] = v;
>>> -    ring->wptr &= ring->ptr_mask;
>>> -    ring->count_dw--;
>>> -}
>>> +void amdgpu_ring_clear_ring(struct amdgpu_ring *ring);
>>> -static inline void amdgpu_ring_write_multiple(struct amdgpu_ring *ring,
>>> -                          void *src, int count_dw)
>>> -{
>>> -    unsigned occupied, chunk1, chunk2;
>>> -    void *dst;
>>> -
>>> -    if (unlikely(ring->count_dw < count_dw))
>>> -        DRM_ERROR("amdgpu: writing more dwords to the ring than 
>>> expected!\n");
>>> -
>>> -    occupied = ring->wptr & ring->buf_mask;
>>> -    dst = (void *)&ring->ring[occupied];
>>> -    chunk1 = ring->buf_mask + 1 - occupied;
>>> -    chunk1 = (chunk1 >= count_dw) ? count_dw: chunk1;
>>> -    chunk2 = count_dw - chunk1;
>>> -    chunk1 <<= 2;
>>> -    chunk2 <<= 2;
>>> +void amdgpu_ring_write(struct amdgpu_ring *ring, uint32_t v);
>>> -    if (chunk1)
>>> -        memcpy(dst, src, chunk1);
>>> -
>>> -    if (chunk2) {
>>> -        src += chunk1;
>>> -        dst = (void *)ring->ring;
>>> -        memcpy(dst, src, chunk2);
>>> -    }
>>> -
>>> -    ring->wptr += count_dw;
>>> -    ring->wptr &= ring->ptr_mask;
>>> -    ring->count_dw -= count_dw;
>>> -}
>>> +void amdgpu_ring_write_multiple(struct amdgpu_ring *ring,
>>> +                          void *src, int count_dw);
>>>   int amdgpu_ring_test_helper(struct amdgpu_ring *ring);
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
>>> index c6dbc0801604..82f0542c7792 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
>>> @@ -32,6 +32,7 @@
>>>   #include <linux/module.h>
>>>   #include <drm/drm.h>
>>> +#include <drm/drm_drv.h>
>>>   #include "amdgpu.h"
>>>   #include "amdgpu_pm.h"
>>> @@ -375,7 +376,7 @@ int amdgpu_uvd_suspend(struct amdgpu_device *adev)
>>>   {
>>>       unsigned size;
>>>       void *ptr;
>>> -    int i, j;
>>> +    int i, j, idx;
>>>       bool in_ras_intr = amdgpu_ras_intr_triggered();
>>>       cancel_delayed_work_sync(&adev->uvd.idle_work);
>>> @@ -403,11 +404,15 @@ int amdgpu_uvd_suspend(struct amdgpu_device *adev)
>>>           if (!adev->uvd.inst[j].saved_bo)
>>>               return -ENOMEM;
>>> -        /* re-write 0 since err_event_athub will corrupt VCPU buffer */
>>> -        if (in_ras_intr)
>>> -            memset(adev->uvd.inst[j].saved_bo, 0, size);
>>> -        else
>>> -            memcpy_fromio(adev->uvd.inst[j].saved_bo, ptr, size);
>>> +        if (drm_dev_enter(&adev->ddev, &idx)) {
>>> +            /* re-write 0 since err_event_athub will corrupt VCPU 
>>> buffer */
>>> +            if (in_ras_intr)
>>> +                memset(adev->uvd.inst[j].saved_bo, 0, size);
>>> +            else
>>> +                memcpy_fromio(adev->uvd.inst[j].saved_bo, ptr, size);
>>> +
>>> +            drm_dev_exit(idx);
>>> +        }
>>>       }
>>>       if (in_ras_intr)
>>> @@ -420,7 +425,7 @@ int amdgpu_uvd_resume(struct amdgpu_device *adev)
>>>   {
>>>       unsigned size;
>>>       void *ptr;
>>> -    int i;
>>> +    int i, idx;
>>>       for (i = 0; i < adev->uvd.num_uvd_inst; i++) {
>>>           if (adev->uvd.harvest_config & (1 << i))
>>> @@ -432,7 +437,10 @@ int amdgpu_uvd_resume(struct amdgpu_device *adev)
>>>           ptr = adev->uvd.inst[i].cpu_addr;
>>>           if (adev->uvd.inst[i].saved_bo != NULL) {
>>> -            memcpy_toio(ptr, adev->uvd.inst[i].saved_bo, size);
>>> +            if (drm_dev_enter(&adev->ddev, &idx)) {
>>> +                memcpy_toio(ptr, adev->uvd.inst[i].saved_bo, size);
>>> +                drm_dev_exit(idx);
>>> +            }
>>>               kvfree(adev->uvd.inst[i].saved_bo);
>>>               adev->uvd.inst[i].saved_bo = NULL;
>>>           } else {
>>> @@ -442,8 +450,11 @@ int amdgpu_uvd_resume(struct amdgpu_device *adev)
>>>               hdr = (const struct common_firmware_header 
>>> *)adev->uvd.fw->data;
>>>               if (adev->firmware.load_type != AMDGPU_FW_LOAD_PSP) {
>>>                   offset = le32_to_cpu(hdr->ucode_array_offset_bytes);
>>> -                memcpy_toio(adev->uvd.inst[i].cpu_addr, 
>>> adev->uvd.fw->data + offset,
>>> -                        le32_to_cpu(hdr->ucode_size_bytes));
>>> +                if (drm_dev_enter(&adev->ddev, &idx)) {
>>> +                    memcpy_toio(adev->uvd.inst[i].cpu_addr, 
>>> adev->uvd.fw->data + offset,
>>> +                            le32_to_cpu(hdr->ucode_size_bytes));
>>> +                    drm_dev_exit(idx);
>>> +                }
>>>                   size -= le32_to_cpu(hdr->ucode_size_bytes);
>>>                   ptr += le32_to_cpu(hdr->ucode_size_bytes);
>>>               }
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
>>> index ea6a62f67e38..833203401ef4 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
>>> @@ -29,6 +29,7 @@
>>>   #include <linux/module.h>
>>>   #include <drm/drm.h>
>>> +#include <drm/drm_drv.h>
>>>   #include "amdgpu.h"
>>>   #include "amdgpu_pm.h"
>>> @@ -293,7 +294,7 @@ int amdgpu_vce_resume(struct amdgpu_device *adev)
>>>       void *cpu_addr;
>>>       const struct common_firmware_header *hdr;
>>>       unsigned offset;
>>> -    int r;
>>> +    int r, idx;
>>>       if (adev->vce.vcpu_bo == NULL)
>>>           return -EINVAL;
>>> @@ -313,8 +314,12 @@ int amdgpu_vce_resume(struct amdgpu_device *adev)
>>>       hdr = (const struct common_firmware_header *)adev->vce.fw->data;
>>>       offset = le32_to_cpu(hdr->ucode_array_offset_bytes);
>>> -    memcpy_toio(cpu_addr, adev->vce.fw->data + offset,
>>> -            adev->vce.fw->size - offset);
>>> +
>>> +    if (drm_dev_enter(&adev->ddev, &idx)) {
>>> +        memcpy_toio(cpu_addr, adev->vce.fw->data + offset,
>>> +                adev->vce.fw->size - offset);
>>> +        drm_dev_exit(idx);
>>> +    }
>>>       amdgpu_bo_kunmap(adev->vce.vcpu_bo);
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
>>> index 201645963ba5..21f7d3644d70 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
>>> @@ -27,6 +27,7 @@
>>>   #include <linux/firmware.h>
>>>   #include <linux/module.h>
>>>   #include <linux/pci.h>
>>> +#include <drm/drm_drv.h>
>>>   #include "amdgpu.h"
>>>   #include "amdgpu_pm.h"
>>> @@ -275,7 +276,7 @@ int amdgpu_vcn_suspend(struct amdgpu_device *adev)
>>>   {
>>>       unsigned size;
>>>       void *ptr;
>>> -    int i;
>>> +    int i, idx;
>>>       cancel_delayed_work_sync(&adev->vcn.idle_work);
>>> @@ -292,7 +293,10 @@ int amdgpu_vcn_suspend(struct amdgpu_device *adev)
>>>           if (!adev->vcn.inst[i].saved_bo)
>>>               return -ENOMEM;
>>> -        memcpy_fromio(adev->vcn.inst[i].saved_bo, ptr, size);
>>> +        if (drm_dev_enter(&adev->ddev, &idx)) {
>>> +            memcpy_fromio(adev->vcn.inst[i].saved_bo, ptr, size);
>>> +            drm_dev_exit(idx);
>>> +        }
>>>       }
>>>       return 0;
>>>   }
>>> @@ -301,7 +305,7 @@ int amdgpu_vcn_resume(struct amdgpu_device *adev)
>>>   {
>>>       unsigned size;
>>>       void *ptr;
>>> -    int i;
>>> +    int i, idx;
>>>       for (i = 0; i < adev->vcn.num_vcn_inst; ++i) {
>>>           if (adev->vcn.harvest_config & (1 << i))
>>> @@ -313,7 +317,10 @@ int amdgpu_vcn_resume(struct amdgpu_device *adev)
>>>           ptr = adev->vcn.inst[i].cpu_addr;
>>>           if (adev->vcn.inst[i].saved_bo != NULL) {
>>> -            memcpy_toio(ptr, adev->vcn.inst[i].saved_bo, size);
>>> +            if (drm_dev_enter(&adev->ddev, &idx)) {
>>> +                memcpy_toio(ptr, adev->vcn.inst[i].saved_bo, size);
>>> +                drm_dev_exit(idx);
>>> +            }
>>>               kvfree(adev->vcn.inst[i].saved_bo);
>>>               adev->vcn.inst[i].saved_bo = NULL;
>>>           } else {
>>> @@ -323,8 +330,11 @@ int amdgpu_vcn_resume(struct amdgpu_device *adev)
>>>               hdr = (const struct common_firmware_header 
>>> *)adev->vcn.fw->data;
>>>               if (adev->firmware.load_type != AMDGPU_FW_LOAD_PSP) {
>>>                   offset = le32_to_cpu(hdr->ucode_array_offset_bytes);
>>> -                memcpy_toio(adev->vcn.inst[i].cpu_addr, 
>>> adev->vcn.fw->data + offset,
>>> -                        le32_to_cpu(hdr->ucode_size_bytes));
>>> +                if (drm_dev_enter(&adev->ddev, &idx)) {
>>> +                    memcpy_toio(adev->vcn.inst[i].cpu_addr, 
>>> adev->vcn.fw->data + offset,
>>> +                            le32_to_cpu(hdr->ucode_size_bytes));
>>> +                    drm_dev_exit(idx);
>>> +                }
>>>                   size -= le32_to_cpu(hdr->ucode_size_bytes);
>>>                   ptr += le32_to_cpu(hdr->ucode_size_bytes);
>>>               }
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>> index 9f868cf3b832..7dd5f10ab570 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>> @@ -32,6 +32,7 @@
>>>   #include <linux/dma-buf.h>
>>>   #include <drm/amdgpu_drm.h>
>>> +#include <drm/drm_drv.h>
>>>   #include "amdgpu.h"
>>>   #include "amdgpu_trace.h"
>>>   #include "amdgpu_amdkfd.h"
>>> @@ -1606,7 +1607,10 @@ static int amdgpu_vm_bo_update_mapping(struct 
>>> amdgpu_device *adev,
>>>       struct amdgpu_vm_update_params params;
>>>       enum amdgpu_sync_mode sync_mode;
>>>       uint64_t pfn;
>>> -    int r;
>>> +    int r, idx;
>>> +
>>> +    if (!drm_dev_enter(&adev->ddev, &idx))
>>> +        return -ENODEV;
>>>       memset(&params, 0, sizeof(params));
>>>       params.adev = adev;
>>> @@ -1715,6 +1719,7 @@ static int amdgpu_vm_bo_update_mapping(struct 
>>> amdgpu_device *adev,
>>>   error_unlock:
>>>       amdgpu_vm_eviction_unlock(vm);
>>> +    drm_dev_exit(idx);
>>>       return r;
>>>   }
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c 
>>> b/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
>>> index 589410c32d09..2cec71e823f5 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
>>> @@ -23,6 +23,7 @@
>>>   #include <linux/firmware.h>
>>>   #include <linux/module.h>
>>>   #include <linux/vmalloc.h>
>>> +#include <drm/drm_drv.h>
>>>   #include "amdgpu.h"
>>>   #include "amdgpu_psp.h"
>>> @@ -269,10 +270,8 @@ static int psp_v11_0_bootloader_load_kdb(struct 
>>> psp_context *psp)
>>>       if (ret)
>>>           return ret;
>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>> -
>>>       /* Copy PSP KDB binary to memory */
>>> -    memcpy(psp->fw_pri_buf, psp->kdb_start_addr, psp->kdb_bin_size);
>>> +    psp_copy_fw(psp, psp->kdb_start_addr, psp->kdb_bin_size);
>>>       /* Provide the PSP KDB to bootloader */
>>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>> @@ -302,10 +301,8 @@ static int psp_v11_0_bootloader_load_spl(struct 
>>> psp_context *psp)
>>>       if (ret)
>>>           return ret;
>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>> -
>>>       /* Copy PSP SPL binary to memory */
>>> -    memcpy(psp->fw_pri_buf, psp->spl_start_addr, psp->spl_bin_size);
>>> +    psp_copy_fw(psp, psp->spl_start_addr, psp->spl_bin_size);
>>>       /* Provide the PSP SPL to bootloader */
>>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>> @@ -335,10 +332,8 @@ static int 
>>> psp_v11_0_bootloader_load_sysdrv(struct psp_context *psp)
>>>       if (ret)
>>>           return ret;
>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>> -
>>>       /* Copy PSP System Driver binary to memory */
>>> -    memcpy(psp->fw_pri_buf, psp->sys_start_addr, psp->sys_bin_size);
>>> +    psp_copy_fw(psp, psp->sys_start_addr, psp->sys_bin_size);
>>>       /* Provide the sys driver to bootloader */
>>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>> @@ -371,10 +366,8 @@ static int psp_v11_0_bootloader_load_sos(struct 
>>> psp_context *psp)
>>>       if (ret)
>>>           return ret;
>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>> -
>>>       /* Copy Secure OS binary to PSP memory */
>>> -    memcpy(psp->fw_pri_buf, psp->sos_start_addr, psp->sos_bin_size);
>>> +    psp_copy_fw(psp, psp->sos_start_addr, psp->sos_bin_size);
>>>       /* Provide the PSP secure OS to bootloader */
>>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>> @@ -608,7 +601,7 @@ static int psp_v11_0_memory_training(struct 
>>> psp_context *psp, uint32_t ops)
>>>       uint32_t p2c_header[4];
>>>       uint32_t sz;
>>>       void *buf;
>>> -    int ret;
>>> +    int ret, idx;
>>>       if (ctx->init == PSP_MEM_TRAIN_NOT_SUPPORT) {
>>>           DRM_DEBUG("Memory training is not supported.\n");
>>> @@ -681,17 +674,24 @@ static int psp_v11_0_memory_training(struct 
>>> psp_context *psp, uint32_t ops)
>>>               return -ENOMEM;
>>>           }
>>> -        memcpy_fromio(buf, adev->mman.aper_base_kaddr, sz);
>>> -        ret = psp_v11_0_memory_training_send_msg(psp, 
>>> PSP_BL__DRAM_LONG_TRAIN);
>>> -        if (ret) {
>>> -            DRM_ERROR("Send long training msg failed.\n");
>>> +        if (drm_dev_enter(&adev->ddev, &idx)) {
>>> +            memcpy_fromio(buf, adev->mman.aper_base_kaddr, sz);
>>> +            ret = psp_v11_0_memory_training_send_msg(psp, 
>>> PSP_BL__DRAM_LONG_TRAIN);
>>> +            if (ret) {
>>> +                DRM_ERROR("Send long training msg failed.\n");
>>> +                vfree(buf);
>>> +                drm_dev_exit(idx);
>>> +                return ret;
>>> +            }
>>> +
>>> +            memcpy_toio(adev->mman.aper_base_kaddr, buf, sz);
>>> +            adev->hdp.funcs->flush_hdp(adev, NULL);
>>>               vfree(buf);
>>> -            return ret;
>>> +            drm_dev_exit(idx);
>>> +        } else {
>>> +            vfree(buf);
>>> +            return -ENODEV;
>>>           }
>>> -
>>> -        memcpy_toio(adev->mman.aper_base_kaddr, buf, sz);
>>> -        adev->hdp.funcs->flush_hdp(adev, NULL);
>>> -        vfree(buf);
>>>       }
>>>       if (ops & PSP_MEM_TRAIN_SAVE) {
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c 
>>> b/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
>>> index c4828bd3264b..618e5b6b85d9 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
>>> @@ -138,10 +138,8 @@ static int 
>>> psp_v12_0_bootloader_load_sysdrv(struct psp_context *psp)
>>>       if (ret)
>>>           return ret;
>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>> -
>>>       /* Copy PSP System Driver binary to memory */
>>> -    memcpy(psp->fw_pri_buf, psp->sys_start_addr, psp->sys_bin_size);
>>> +    psp_copy_fw(psp, psp->sys_start_addr, psp->sys_bin_size);
>>>       /* Provide the sys driver to bootloader */
>>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>> @@ -179,10 +177,8 @@ static int psp_v12_0_bootloader_load_sos(struct 
>>> psp_context *psp)
>>>       if (ret)
>>>           return ret;
>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>> -
>>>       /* Copy Secure OS binary to PSP memory */
>>> -    memcpy(psp->fw_pri_buf, psp->sos_start_addr, psp->sos_bin_size);
>>> +    psp_copy_fw(psp, psp->sos_start_addr, psp->sos_bin_size);
>>>       /* Provide the PSP secure OS to bootloader */
>>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c 
>>> b/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
>>> index f2e725f72d2f..d0a6cccd0897 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
>>> @@ -102,10 +102,8 @@ static int 
>>> psp_v3_1_bootloader_load_sysdrv(struct psp_context *psp)
>>>       if (ret)
>>>           return ret;
>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>> -
>>>       /* Copy PSP System Driver binary to memory */
>>> -    memcpy(psp->fw_pri_buf, psp->sys_start_addr, psp->sys_bin_size);
>>> +    psp_copy_fw(psp, psp->sys_start_addr, psp->sys_bin_size);
>>>       /* Provide the sys driver to bootloader */
>>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>> @@ -143,10 +141,8 @@ static int psp_v3_1_bootloader_load_sos(struct 
>>> psp_context *psp)
>>>       if (ret)
>>>           return ret;
>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>> -
>>>       /* Copy Secure OS binary to PSP memory */
>>> -    memcpy(psp->fw_pri_buf, psp->sos_start_addr, psp->sos_bin_size);
>>> +    psp_copy_fw(psp, psp->sos_start_addr, psp->sos_bin_size);
>>>       /* Provide the PSP secure OS to bootloader */
>>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c 
>>> b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
>>> index 8e238dea7bef..90910d19db12 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
>>> @@ -25,6 +25,7 @@
>>>    */
>>>   #include <linux/firmware.h>
>>> +#include <drm/drm_drv.h>
>>>   #include "amdgpu.h"
>>>   #include "amdgpu_vce.h"
>>> @@ -555,16 +556,19 @@ static int vce_v4_0_hw_fini(void *handle)
>>>   static int vce_v4_0_suspend(void *handle)
>>>   {
>>>       struct amdgpu_device *adev = (struct amdgpu_device *)handle;
>>> -    int r;
>>> +    int r, idx;
>>>       if (adev->vce.vcpu_bo == NULL)
>>>           return 0;
>>> -    if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) {
>>> -        unsigned size = amdgpu_bo_size(adev->vce.vcpu_bo);
>>> -        void *ptr = adev->vce.cpu_addr;
>>> +    if (drm_dev_enter(&adev->ddev, &idx)) {
>>> +        if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) {
>>> +            unsigned size = amdgpu_bo_size(adev->vce.vcpu_bo);
>>> +            void *ptr = adev->vce.cpu_addr;
>>> -        memcpy_fromio(adev->vce.saved_bo, ptr, size);
>>> +            memcpy_fromio(adev->vce.saved_bo, ptr, size);
>>> +        }
>>> +        drm_dev_exit(idx);
>>>       }
>>>       r = vce_v4_0_hw_fini(adev);
>>> @@ -577,16 +581,20 @@ static int vce_v4_0_suspend(void *handle)
>>>   static int vce_v4_0_resume(void *handle)
>>>   {
>>>       struct amdgpu_device *adev = (struct amdgpu_device *)handle;
>>> -    int r;
>>> +    int r, idx;
>>>       if (adev->vce.vcpu_bo == NULL)
>>>           return -EINVAL;
>>>       if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) {
>>> -        unsigned size = amdgpu_bo_size(adev->vce.vcpu_bo);
>>> -        void *ptr = adev->vce.cpu_addr;
>>> -        memcpy_toio(ptr, adev->vce.saved_bo, size);
>>> +        if (drm_dev_enter(&adev->ddev, &idx)) {
>>> +            unsigned size = amdgpu_bo_size(adev->vce.vcpu_bo);
>>> +            void *ptr = adev->vce.cpu_addr;
>>> +
>>> +            memcpy_toio(ptr, adev->vce.saved_bo, size);
>>> +            drm_dev_exit(idx);
>>> +        }
>>>       } else {
>>>           r = amdgpu_vce_resume(adev);
>>>           if (r)
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c 
>>> b/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
>>> index 3f15bf34123a..df34be8ec82d 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
>>> @@ -34,6 +34,8 @@
>>>   #include "vcn/vcn_3_0_0_sh_mask.h"
>>>   #include "ivsrcid/vcn/irqsrcs_vcn_2_0.h"
>>> +#include <drm/drm_drv.h>
>>> +
>>>   #define mmUVD_CONTEXT_ID_INTERNAL_OFFSET            0x27
>>>   #define mmUVD_GPCOM_VCPU_CMD_INTERNAL_OFFSET            0x0f
>>>   #define mmUVD_GPCOM_VCPU_DATA0_INTERNAL_OFFSET            0x10
>>> @@ -268,16 +270,20 @@ static int vcn_v3_0_sw_init(void *handle)
>>>   static int vcn_v3_0_sw_fini(void *handle)
>>>   {
>>>       struct amdgpu_device *adev = (struct amdgpu_device *)handle;
>>> -    int i, r;
>>> +    int i, r, idx;
>>> -    for (i = 0; i < adev->vcn.num_vcn_inst; i++) {
>>> -        volatile struct amdgpu_fw_shared *fw_shared;
>>> +    if (drm_dev_enter(&adev->ddev, &idx)) {
>>> +        for (i = 0; i < adev->vcn.num_vcn_inst; i++) {
>>> +            volatile struct amdgpu_fw_shared *fw_shared;
>>> -        if (adev->vcn.harvest_config & (1 << i))
>>> -            continue;
>>> -        fw_shared = adev->vcn.inst[i].fw_shared_cpu_addr;
>>> -        fw_shared->present_flag_0 = 0;
>>> -        fw_shared->sw_ring.is_enabled = false;
>>> +            if (adev->vcn.harvest_config & (1 << i))
>>> +                continue;
>>> +            fw_shared = adev->vcn.inst[i].fw_shared_cpu_addr;
>>> +            fw_shared->present_flag_0 = 0;
>>> +            fw_shared->sw_ring.is_enabled = false;
>>> +        }
>>> +
>>> +        drm_dev_exit(idx);
>>>       }
>>>       if (amdgpu_sriov_vf(adev))
>>> diff --git a/drivers/gpu/drm/amd/pm/powerplay/smumgr/smu7_smumgr.c 
>>> b/drivers/gpu/drm/amd/pm/powerplay/smumgr/smu7_smumgr.c
>>> index aae25243eb10..d628b91846c9 100644
>>> --- a/drivers/gpu/drm/amd/pm/powerplay/smumgr/smu7_smumgr.c
>>> +++ b/drivers/gpu/drm/amd/pm/powerplay/smumgr/smu7_smumgr.c
>>> @@ -405,6 +405,8 @@ int smu7_request_smu_load_fw(struct pp_hwmgr *hwmgr)
>>>                   UCODE_ID_MEC_STORAGE, 
>>> &toc->entry[toc->num_entries++]),
>>>                   "Failed to Get Firmware Entry.", r = -EINVAL; goto 
>>> failed);
>>>       }
>>> +
>>> +    /* AG TODO Can't call drm_dev_enter/exit because access 
>>> adev->ddev here ... */
>>>       memcpy_toio(smu_data->header_buffer.kaddr, smu_data->toc,
>>>               sizeof(struct SMU_DRAMData_TOC));
>>>       smum_send_msg_to_smc_with_parameter(hwmgr,
>>

* Re: [PATCH v6 10/16] drm/amdgpu: Guard against write accesses after device removal
@ 2021-05-12 14:01         ` Andrey Grodzovsky
From: Andrey Grodzovsky @ 2021-05-12 14:01 UTC (permalink / raw)
  To: Christian König, dri-devel, amd-gfx, linux-pci,
	daniel.vetter, Harry.Wentland
  Cc: Alexander.Deucher, gregkh, helgaas, Felix.Kuehling

Ping: I need confirmation that it's OK to keep this as a single patch,
given my explanation below.

Andrey

On 2021-05-11 1:52 p.m., Andrey Grodzovsky wrote:
> 
> 
> On 2021-05-11 2:50 a.m., Christian König wrote:
>> Am 10.05.21 um 18:36 schrieb Andrey Grodzovsky:
>>> This should prevent writing to memory or IO ranges possibly
>>> already allocated for other uses after our device is removed.
>>>
>>> v5:
>>> Protect more places where memcpy_toio/memcpy_fromio takes place
>>> Protect IB submissions
>>>
>>> v6: Switch to !drm_dev_enter instead of scoping entire code
>>> with brackets.
>>>
>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>>> ---
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c    | 11 ++-
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c       |  9 +++
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c        | 17 +++--
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c       | 63 +++++++++++------
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h       |  2 +
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c      | 70 +++++++++++++++++++
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h      | 49 ++-----------
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c       | 31 +++++---
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c       | 11 ++-
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c       | 22 ++++--
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c        |  7 +-
>>>   drivers/gpu/drm/amd/amdgpu/psp_v11_0.c        | 44 ++++++------
>>>   drivers/gpu/drm/amd/amdgpu/psp_v12_0.c        |  8 +--
>>>   drivers/gpu/drm/amd/amdgpu/psp_v3_1.c         |  8 +--
>>>   drivers/gpu/drm/amd/amdgpu/vce_v4_0.c         | 26 ++++---
>>>   drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c         | 22 +++---
>>>   .../drm/amd/pm/powerplay/smumgr/smu7_smumgr.c |  2 +
>>>   17 files changed, 257 insertions(+), 145 deletions(-)
>>>
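
For readers following the changelog above, here is a minimal userspace
sketch of the two guard shapes the v6 note contrasts: wrapping the
protected work in an "if (drm_dev_enter(...)) { ... }" block versus
bailing out early on "!drm_dev_enter(...)". Everything named fake_*,
as well as copy_scoped() and copy_early_return(), is a hypothetical
stand-in made up for illustration. The real drm_dev_enter() and
drm_dev_exit() are SRCU-based kernel functions; they are only modelled
here with a flag and a counter.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace stand-in for a DRM device, not the kernel struct. */
struct fake_dev {
	bool unplugged;	/* set once the device has been removed */
};

/* Number of currently open enter/exit sections (model only). */
static int fake_active_sections;

/* Models drm_dev_enter(): fails once the device is gone. */
static bool fake_dev_enter(struct fake_dev *dev, int *idx)
{
	if (dev->unplugged)
		return false;
	*idx = fake_active_sections++;
	return true;
}

/* Models drm_dev_exit(): closes the most recently opened section. */
static void fake_dev_exit(int idx)
{
	assert(idx == fake_active_sections - 1);
	fake_active_sections--;
}

/* Scoped form: the guarded work is nested inside the if block. */
static int copy_scoped(struct fake_dev *dev, char *dst,
		       const char *src, size_t n)
{
	int idx;

	if (fake_dev_enter(dev, &idx)) {
		memcpy(dst, src, n);
		fake_dev_exit(idx);
	}
	/* The copy is silently skipped when the device is gone. */
	return 0;
}

/* Early-return form: bail out first, keep the function body flat. */
static int copy_early_return(struct fake_dev *dev, char *dst,
			     const char *src, size_t n)
{
	int idx;

	if (!fake_dev_enter(dev, &idx))
		return -ENODEV;

	memcpy(dst, src, n);
	fake_dev_exit(idx);
	return 0;
}
```

In this model both forms leave the destination untouched after removal.
The early-return form can additionally report -ENODEV and avoids
re-indenting the whole body, at the cost of requiring every later exit
path to call the exit function.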
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>> index a0bff4713672..94c415176cdc 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>> @@ -71,6 +71,8 @@
>>>   #include <drm/task_barrier.h>
>>>   #include <linux/pm_runtime.h>
>>> +#include <drm/drm_drv.h>
>>> +
>>>   MODULE_FIRMWARE("amdgpu/vega10_gpu_info.bin");
>>>   MODULE_FIRMWARE("amdgpu/vega12_gpu_info.bin");
>>>   MODULE_FIRMWARE("amdgpu/raven_gpu_info.bin");
>>> @@ -281,7 +283,10 @@ void amdgpu_device_vram_access(struct 
>>> amdgpu_device *adev, loff_t pos,
>>>       unsigned long flags;
>>>       uint32_t hi = ~0;
>>>       uint64_t last;
>>> +    int idx;
>>> +     if (!drm_dev_enter(&adev->ddev, &idx))
>>> +         return;
>>>   #ifdef CONFIG_64BIT
>>>       last = min(pos + size, adev->gmc.visible_vram_size);
>>> @@ -299,8 +304,10 @@ void amdgpu_device_vram_access(struct 
>>> amdgpu_device *adev, loff_t pos,
>>>               memcpy_fromio(buf, addr, count);
>>>           }
>>> -        if (count == size)
>>> +        if (count == size) {
>>> +            drm_dev_exit(idx);
>>>               return;
>>> +        }
>>
>> Maybe use a goto instead, but that's really just a nitpick.
>>
>>
>>
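
A rough sketch of what the goto variant suggested above could look like,
written as a self-contained userspace model rather than as the actual
amdgpu_device_vram_access() change. The fake_* helpers and
read_two_stage() are hypothetical names invented for this example, and
the function returns a byte count only so that it can be checked; the
real function returns void.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace stand-in for a DRM device, not the kernel struct. */
struct fake_dev {
	bool unplugged;	/* set once the device has been removed */
};

/* Number of currently open enter/exit sections (model only). */
static int fake_active_sections;

/* Models drm_dev_enter(): fails once the device is gone. */
static bool fake_dev_enter(struct fake_dev *dev, int *idx)
{
	if (dev->unplugged)
		return false;
	*idx = fake_active_sections++;
	return true;
}

/* Models drm_dev_exit(): closes the most recently opened section. */
static void fake_dev_exit(int idx)
{
	assert(idx == fake_active_sections - 1);
	fake_active_sections--;
}

/*
 * Two-stage read: a bulk copy of the first `visible` bytes, then a
 * byte-by-byte loop standing in for the indexed register path.
 */
static size_t read_two_stage(struct fake_dev *dev, const char *src,
			     size_t visible, char *buf, size_t size)
{
	size_t done;
	int idx;

	if (!fake_dev_enter(dev, &idx))
		return 0;

	/* Fast path: copy whatever is directly visible. */
	done = size < visible ? size : visible;
	memcpy(buf, src, done);
	if (done == size)
		goto exit;	/* nothing left for the slow path */

	/* Slow path: remaining bytes, one at a time. */
	while (done < size) {
		buf[done] = src[done];
		done++;
	}

exit:
	/* Single place where the section is closed. */
	fake_dev_exit(idx);
	return done;
}
```

With a single exit label the exit call appears once, so a path added
later cannot forget it; the braces-plus-return form in the hunk above
behaves the same for the paths that exist today.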
>>>           pos += count;
>>>           buf += count / 4;
>>> @@ -323,6 +330,8 @@ void amdgpu_device_vram_access(struct 
>>> amdgpu_device *adev, loff_t pos,
>>>               *buf++ = RREG32_NO_KIQ(mmMM_DATA);
>>>       }
>>>       spin_unlock_irqrestore(&adev->mmio_idx_lock, flags);
>>> +
>>> +    drm_dev_exit(idx);
>>>   }
>>>   /*
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
>>> index 4d32233cde92..04ba5eef1e88 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
>>> @@ -31,6 +31,8 @@
>>>   #include "amdgpu_ras.h"
>>>   #include "amdgpu_xgmi.h"
>>> +#include <drm/drm_drv.h>
>>> +
>>>   /**
>>>    * amdgpu_gmc_pdb0_alloc - allocate vram for pdb0
>>>    *
>>> @@ -151,6 +153,10 @@ int amdgpu_gmc_set_pte_pde(struct amdgpu_device 
>>> *adev, void *cpu_pt_addr,
>>>   {
>>>       void __iomem *ptr = (void *)cpu_pt_addr;
>>>       uint64_t value;
>>> +    int idx;
>>> +
>>> +    if (!drm_dev_enter(&adev->ddev, &idx))
>>> +        return 0;
>>>       /*
>>>        * The following is for PTE only. GART does not have PDEs.
>>> @@ -158,6 +164,9 @@ int amdgpu_gmc_set_pte_pde(struct amdgpu_device 
>>> *adev, void *cpu_pt_addr,
>>>       value = addr & 0x0000FFFFFFFFF000ULL;
>>>       value |= flags;
>>>       writeq(value, ptr + (gpu_page_idx * 8));
>>> +
>>> +    drm_dev_exit(idx);
>>> +
>>>       return 0;
>>>   }
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
>>> index 148a3b481b12..62fcbd446c71 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
>>> @@ -30,6 +30,7 @@
>>>   #include <linux/slab.h>
>>>   #include <drm/amdgpu_drm.h>
>>> +#include <drm/drm_drv.h>
>>>   #include "amdgpu.h"
>>>   #include "atom.h"
>>> @@ -137,7 +138,7 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, 
>>> unsigned num_ibs,
>>>       bool secure;
>>>       unsigned i;
>>> -    int r = 0;
>>> +    int idx, r = 0;
>>>       bool need_pipe_sync = false;
>>>       if (num_ibs == 0)
>>> @@ -169,13 +170,16 @@ int amdgpu_ib_schedule(struct amdgpu_ring 
>>> *ring, unsigned num_ibs,
>>>           return -EINVAL;
>>>       }
>>> +    if (!drm_dev_enter(&adev->ddev, &idx))
>>> +        return -ENODEV;
>>> +
>>>       alloc_size = ring->funcs->emit_frame_size + num_ibs *
>>>           ring->funcs->emit_ib_size;
>>>       r = amdgpu_ring_alloc(ring, alloc_size);
>>>       if (r) {
>>>           dev_err(adev->dev, "scheduling IB failed (%d).\n", r);
>>> -        return r;
>>> +        goto exit;
>>>       }
>>>       need_ctx_switch = ring->current_ctx != fence_ctx;
>>> @@ -205,7 +209,7 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, 
>>> unsigned num_ibs,
>>>           r = amdgpu_vm_flush(ring, job, need_pipe_sync);
>>>           if (r) {
>>>               amdgpu_ring_undo(ring);
>>> -            return r;
>>> +            goto exit;
>>>           }
>>>       }
>>> @@ -286,7 +290,7 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, 
>>> unsigned num_ibs,
>>>           if (job && job->vmid)
>>>               amdgpu_vmid_reset(adev, ring->funcs->vmhub, job->vmid);
>>>           amdgpu_ring_undo(ring);
>>> -        return r;
>>> +        goto exit;
>>>       }
>>>       if (ring->funcs->insert_end)
>>> @@ -304,7 +308,10 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, 
>>> unsigned num_ibs,
>>>           ring->funcs->emit_wave_limit(ring, false);
>>>       amdgpu_ring_commit(ring);
>>> -    return 0;
>>> +
>>> +exit:
>>> +    drm_dev_exit(idx);
>>> +    return r;
>>>   }
>>>   /**
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
>>> index 9e769cf6095b..bb6afee61666 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
>>> @@ -25,6 +25,7 @@
>>>   #include <linux/firmware.h>
>>>   #include <linux/dma-mapping.h>
>>> +#include <drm/drm_drv.h>
>>>   #include "amdgpu.h"
>>>   #include "amdgpu_psp.h"
>>> @@ -39,6 +40,8 @@
>>>   #include "amdgpu_ras.h"
>>>   #include "amdgpu_securedisplay.h"
>>> +#include <drm/drm_drv.h>
>>> +
>>>   static int psp_sysfs_init(struct amdgpu_device *adev);
>>>   static void psp_sysfs_fini(struct amdgpu_device *adev);
>>> @@ -253,7 +256,7 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>>              struct psp_gfx_cmd_resp *cmd, uint64_t fence_mc_addr)
>>>   {
>>>       int ret;
>>> -    int index;
>>> +    int index, idx;
>>>       int timeout = 20000;
>>>       bool ras_intr = false;
>>>       bool skip_unsupport = false;
>>> @@ -261,6 +264,9 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>>       if (psp->adev->in_pci_err_recovery)
>>>           return 0;
>>> +    if (!drm_dev_enter(&psp->adev->ddev, &idx))
>>> +        return 0;
>>> +
>>>       mutex_lock(&psp->mutex);
>>>       memset(psp->cmd_buf_mem, 0, PSP_CMD_BUFFER_SIZE);
>>> @@ -271,8 +277,7 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>>       ret = psp_ring_cmd_submit(psp, psp->cmd_buf_mc_addr, 
>>> fence_mc_addr, index);
>>>       if (ret) {
>>>           atomic_dec(&psp->fence_value);
>>> -        mutex_unlock(&psp->mutex);
>>> -        return ret;
>>> +        goto exit;
>>>       }
>>>       amdgpu_asic_invalidate_hdp(psp->adev, NULL);
>>> @@ -312,8 +317,8 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>>                psp->cmd_buf_mem->cmd_id,
>>>                psp->cmd_buf_mem->resp.status);
>>>           if (!timeout) {
>>> -            mutex_unlock(&psp->mutex);
>>> -            return -EINVAL;
>>> +            ret = -EINVAL;
>>> +            goto exit;
>>>           }
>>>       }
>>> @@ -321,8 +326,10 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>>           ucode->tmr_mc_addr_lo = psp->cmd_buf_mem->resp.fw_addr_lo;
>>>           ucode->tmr_mc_addr_hi = psp->cmd_buf_mem->resp.fw_addr_hi;
>>>       }
>>> -    mutex_unlock(&psp->mutex);
>>> +exit:
>>> +    mutex_unlock(&psp->mutex);
>>> +    drm_dev_exit(idx);
>>>       return ret;
>>>   }
>>> @@ -359,8 +366,7 @@ static int psp_load_toc(struct psp_context *psp,
>>>       if (!cmd)
>>>           return -ENOMEM;
>>>       /* Copy toc to psp firmware private buffer */
>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>> -    memcpy(psp->fw_pri_buf, psp->toc_start_addr, psp->toc_bin_size);
>>> +    psp_copy_fw(psp, psp->toc_start_addr, psp->toc_bin_size);
>>>       psp_prep_load_toc_cmd_buf(cmd, psp->fw_pri_mc_addr, 
>>> psp->toc_bin_size);
>>> @@ -625,8 +631,7 @@ static int psp_asd_load(struct psp_context *psp)
>>>       if (!cmd)
>>>           return -ENOMEM;
>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>> -    memcpy(psp->fw_pri_buf, psp->asd_start_addr, psp->asd_ucode_size);
>>> +    psp_copy_fw(psp, psp->asd_start_addr, psp->asd_ucode_size);
>>>       psp_prep_asd_load_cmd_buf(cmd, psp->fw_pri_mc_addr,
>>>                     psp->asd_ucode_size);
>>> @@ -781,8 +786,7 @@ static int psp_xgmi_load(struct psp_context *psp)
>>>       if (!cmd)
>>>           return -ENOMEM;
>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>> -    memcpy(psp->fw_pri_buf, psp->ta_xgmi_start_addr, 
>>> psp->ta_xgmi_ucode_size);
>>> +    psp_copy_fw(psp, psp->ta_xgmi_start_addr, psp->ta_xgmi_ucode_size);
>>>       psp_prep_ta_load_cmd_buf(cmd,
>>>                    psp->fw_pri_mc_addr,
>>> @@ -1038,8 +1042,7 @@ static int psp_ras_load(struct psp_context *psp)
>>>       if (!cmd)
>>>           return -ENOMEM;
>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>> -    memcpy(psp->fw_pri_buf, psp->ta_ras_start_addr, 
>>> psp->ta_ras_ucode_size);
>>> +    psp_copy_fw(psp, psp->ta_ras_start_addr, psp->ta_ras_ucode_size);
>>>       psp_prep_ta_load_cmd_buf(cmd,
>>>                    psp->fw_pri_mc_addr,
>>> @@ -1275,8 +1278,7 @@ static int psp_hdcp_load(struct psp_context *psp)
>>>       if (!cmd)
>>>           return -ENOMEM;
>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>> -    memcpy(psp->fw_pri_buf, psp->ta_hdcp_start_addr,
>>> +    psp_copy_fw(psp, psp->ta_hdcp_start_addr,
>>>              psp->ta_hdcp_ucode_size);
>>>       psp_prep_ta_load_cmd_buf(cmd,
>>> @@ -1427,8 +1429,7 @@ static int psp_dtm_load(struct psp_context *psp)
>>>       if (!cmd)
>>>           return -ENOMEM;
>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>> -    memcpy(psp->fw_pri_buf, psp->ta_dtm_start_addr, 
>>> psp->ta_dtm_ucode_size);
>>> +    psp_copy_fw(psp, psp->ta_dtm_start_addr, psp->ta_dtm_ucode_size);
>>>       psp_prep_ta_load_cmd_buf(cmd,
>>>                    psp->fw_pri_mc_addr,
>>> @@ -1573,8 +1574,7 @@ static int psp_rap_load(struct psp_context *psp)
>>>       if (!cmd)
>>>           return -ENOMEM;
>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>> -    memcpy(psp->fw_pri_buf, psp->ta_rap_start_addr, 
>>> psp->ta_rap_ucode_size);
>>> +    psp_copy_fw(psp, psp->ta_rap_start_addr, psp->ta_rap_ucode_size);
>>>       psp_prep_ta_load_cmd_buf(cmd,
>>>                    psp->fw_pri_mc_addr,
>>> @@ -3022,7 +3022,7 @@ static ssize_t 
>>> psp_usbc_pd_fw_sysfs_write(struct device *dev,
>>>       struct amdgpu_device *adev = drm_to_adev(ddev);
>>>       void *cpu_addr;
>>>       dma_addr_t dma_addr;
>>> -    int ret;
>>> +    int ret, idx;
>>>       char fw_name[100];
>>>       const struct firmware *usbc_pd_fw;
>>> @@ -3031,6 +3031,9 @@ static ssize_t 
>>> psp_usbc_pd_fw_sysfs_write(struct device *dev,
>>>           return -EBUSY;
>>>       }
>>> +    if (!drm_dev_enter(ddev, &idx))
>>> +        return -ENODEV;
>>> +
>>>       snprintf(fw_name, sizeof(fw_name), "amdgpu/%s", buf);
>>>       ret = request_firmware(&usbc_pd_fw, fw_name, adev->dev);
>>>       if (ret)
>>> @@ -3062,16 +3065,30 @@ static ssize_t 
>>> psp_usbc_pd_fw_sysfs_write(struct device *dev,
>>>   rel_buf:
>>>       dma_free_coherent(adev->dev, usbc_pd_fw->size, cpu_addr, 
>>> dma_addr);
>>>       release_firmware(usbc_pd_fw);
>>> -
>>>   fail:
>>>       if (ret) {
>>>           DRM_ERROR("Failed to load USBC PD FW, err = %d", ret);
>>> -        return ret;
>>> +        count = ret;
>>>       }
>>> +    drm_dev_exit(idx);
>>>       return count;
>>>   }
>>> +void psp_copy_fw(struct psp_context *psp, uint8_t *start_addr, 
>>> uint32_t bin_size)
>>> +{
>>> +    int idx;
>>> +
>>> +    if (!drm_dev_enter(&psp->adev->ddev, &idx))
>>> +        return;
>>> +
>>> +    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>> +    memcpy(psp->fw_pri_buf, start_addr, bin_size);
>>> +
>>> +    drm_dev_exit(idx);
>>> +}
>>> +
>>> +
>>>   static DEVICE_ATTR(usbc_pd_fw, S_IRUGO | S_IWUSR,
>>>              psp_usbc_pd_fw_sysfs_read,
>>>              psp_usbc_pd_fw_sysfs_write);
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
>>> index 46a5328e00e0..2bfdc278817f 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
>>> @@ -423,4 +423,6 @@ int psp_get_fw_attestation_records_addr(struct 
>>> psp_context *psp,
>>>   int psp_load_fw_list(struct psp_context *psp,
>>>                struct amdgpu_firmware_info **ucode_list, int 
>>> ucode_count);
>>> +void psp_copy_fw(struct psp_context *psp, uint8_t *start_addr, 
>>> uint32_t bin_size);
>>> +
>>>   #endif
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>>> index 688624ebe421..e1985bc34436 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>>> @@ -35,6 +35,8 @@
>>>   #include "amdgpu.h"
>>>   #include "atom.h"
>>> +#include <drm/drm_drv.h>
>>> +
>>>   /*
>>>    * Rings
>>>    * Most engines on the GPU are fed via ring buffers.  Ring
>>> @@ -461,3 +463,71 @@ int amdgpu_ring_test_helper(struct amdgpu_ring 
>>> *ring)
>>>       ring->sched.ready = !r;
>>>       return r;
>>>   }
>>> +
>>> +void amdgpu_ring_clear_ring(struct amdgpu_ring *ring)
>>> +{
>>> +    int idx;
>>> +    int i = 0;
>>> +
>>> +    if (!drm_dev_enter(&ring->adev->ddev, &idx))
>>> +        return;
>>> +
>>> +    while (i <= ring->buf_mask)
>>> +        ring->ring[i++] = ring->funcs->nop;
>>> +
>>> +    drm_dev_exit(idx);
>>> +
>>> +}
>>> +
>>> +void amdgpu_ring_write(struct amdgpu_ring *ring, uint32_t v)
>>> +{
>>> +    int idx;
>>> +
>>> +    if (!drm_dev_enter(&ring->adev->ddev, &idx))
>>> +        return;
>>> +
>>> +    if (ring->count_dw <= 0)
>>> +        DRM_ERROR("amdgpu: writing more dwords to the ring than 
>>> expected!\n");
>>> +    ring->ring[ring->wptr++ & ring->buf_mask] = v;
>>> +    ring->wptr &= ring->ptr_mask;
>>> +    ring->count_dw--;
>>> +
>>> +    drm_dev_exit(idx);
>>> +}
>>> +
>>> +void amdgpu_ring_write_multiple(struct amdgpu_ring *ring,
>>> +                          void *src, int count_dw)
>>> +{
>>> +    unsigned occupied, chunk1, chunk2;
>>> +    void *dst;
>>> +    int idx;
>>> +
>>> +    if (!drm_dev_enter(&ring->adev->ddev, &idx))
>>> +        return;
>>> +
>>> +    if (unlikely(ring->count_dw < count_dw))
>>> +        DRM_ERROR("amdgpu: writing more dwords to the ring than 
>>> expected!\n");
>>> +
>>> +    occupied = ring->wptr & ring->buf_mask;
>>> +    dst = (void *)&ring->ring[occupied];
>>> +    chunk1 = ring->buf_mask + 1 - occupied;
>>> +    chunk1 = (chunk1 >= count_dw) ? count_dw: chunk1;
>>> +    chunk2 = count_dw - chunk1;
>>> +    chunk1 <<= 2;
>>> +    chunk2 <<= 2;
>>> +
>>> +    if (chunk1)
>>> +        memcpy(dst, src, chunk1);
>>> +
>>> +    if (chunk2) {
>>> +        src += chunk1;
>>> +        dst = (void *)ring->ring;
>>> +        memcpy(dst, src, chunk2);
>>> +    }
>>> +
>>> +    ring->wptr += count_dw;
>>> +    ring->wptr &= ring->ptr_mask;
>>> +    ring->count_dw -= count_dw;
>>> +
>>> +    drm_dev_exit(idx);
>>> +}
>>
>> The ring should never be in MMIO memory, so you can completely drop 
>> that as far as I can see.
> 
> Yeah, it's all in GART, I missed it for some reason...
>>
>> Maybe split that patch by use case so that we can more easily 
>> review/ack it.
> 
> In fact everything here is the same use case. Once I added unmap of
> all MMIO ranges (both registers and VRAM) I got a lot of page faults
> on device remove around any memcpy to/from IO. That is where I put the
> drm_dev_enter/exit scope. I also searched the code and preemptively
> added guards to any other such place. I did drop amdgpu_schedule_ib
> from this patch because it has dma_fence_wait inside, so we will take
> care of it once we decide how to handle dma_fence waits.
> 
> Andrey
> 
>>
>> Thanks,
>> Christian.
>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>>> index e7d3d0dbdd96..c67bc6d3d039 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>>> @@ -299,53 +299,12 @@ static inline void 
>>> amdgpu_ring_set_preempt_cond_exec(struct amdgpu_ring *ring,
>>>       *ring->cond_exe_cpu_addr = cond_exec;
>>>   }
>>> -static inline void amdgpu_ring_clear_ring(struct amdgpu_ring *ring)
>>> -{
>>> -    int i = 0;
>>> -    while (i <= ring->buf_mask)
>>> -        ring->ring[i++] = ring->funcs->nop;
>>> -
>>> -}
>>> -
>>> -static inline void amdgpu_ring_write(struct amdgpu_ring *ring, 
>>> uint32_t v)
>>> -{
>>> -    if (ring->count_dw <= 0)
>>> -        DRM_ERROR("amdgpu: writing more dwords to the ring than 
>>> expected!\n");
>>> -    ring->ring[ring->wptr++ & ring->buf_mask] = v;
>>> -    ring->wptr &= ring->ptr_mask;
>>> -    ring->count_dw--;
>>> -}
>>> +void amdgpu_ring_clear_ring(struct amdgpu_ring *ring);
>>> -static inline void amdgpu_ring_write_multiple(struct amdgpu_ring *ring,
>>> -                          void *src, int count_dw)
>>> -{
>>> -    unsigned occupied, chunk1, chunk2;
>>> -    void *dst;
>>> -
>>> -    if (unlikely(ring->count_dw < count_dw))
>>> -        DRM_ERROR("amdgpu: writing more dwords to the ring than 
>>> expected!\n");
>>> -
>>> -    occupied = ring->wptr & ring->buf_mask;
>>> -    dst = (void *)&ring->ring[occupied];
>>> -    chunk1 = ring->buf_mask + 1 - occupied;
>>> -    chunk1 = (chunk1 >= count_dw) ? count_dw: chunk1;
>>> -    chunk2 = count_dw - chunk1;
>>> -    chunk1 <<= 2;
>>> -    chunk2 <<= 2;
>>> +void amdgpu_ring_write(struct amdgpu_ring *ring, uint32_t v);
>>> -    if (chunk1)
>>> -        memcpy(dst, src, chunk1);
>>> -
>>> -    if (chunk2) {
>>> -        src += chunk1;
>>> -        dst = (void *)ring->ring;
>>> -        memcpy(dst, src, chunk2);
>>> -    }
>>> -
>>> -    ring->wptr += count_dw;
>>> -    ring->wptr &= ring->ptr_mask;
>>> -    ring->count_dw -= count_dw;
>>> -}
>>> +void amdgpu_ring_write_multiple(struct amdgpu_ring *ring,
>>> +                          void *src, int count_dw);
>>>   int amdgpu_ring_test_helper(struct amdgpu_ring *ring);
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
>>> index c6dbc0801604..82f0542c7792 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
>>> @@ -32,6 +32,7 @@
>>>   #include <linux/module.h>
>>>   #include <drm/drm.h>
>>> +#include <drm/drm_drv.h>
>>>   #include "amdgpu.h"
>>>   #include "amdgpu_pm.h"
>>> @@ -375,7 +376,7 @@ int amdgpu_uvd_suspend(struct amdgpu_device *adev)
>>>   {
>>>       unsigned size;
>>>       void *ptr;
>>> -    int i, j;
>>> +    int i, j, idx;
>>>       bool in_ras_intr = amdgpu_ras_intr_triggered();
>>>       cancel_delayed_work_sync(&adev->uvd.idle_work);
>>> @@ -403,11 +404,15 @@ int amdgpu_uvd_suspend(struct amdgpu_device *adev)
>>>           if (!adev->uvd.inst[j].saved_bo)
>>>               return -ENOMEM;
>>> -        /* re-write 0 since err_event_athub will corrupt VCPU buffer */
>>> -        if (in_ras_intr)
>>> -            memset(adev->uvd.inst[j].saved_bo, 0, size);
>>> -        else
>>> -            memcpy_fromio(adev->uvd.inst[j].saved_bo, ptr, size);
>>> +        if (drm_dev_enter(&adev->ddev, &idx)) {
>>> +            /* re-write 0 since err_event_athub will corrupt VCPU 
>>> buffer */
>>> +            if (in_ras_intr)
>>> +                memset(adev->uvd.inst[j].saved_bo, 0, size);
>>> +            else
>>> +                memcpy_fromio(adev->uvd.inst[j].saved_bo, ptr, size);
>>> +
>>> +            drm_dev_exit(idx);
>>> +        }
>>>       }
>>>       if (in_ras_intr)
>>> @@ -420,7 +425,7 @@ int amdgpu_uvd_resume(struct amdgpu_device *adev)
>>>   {
>>>       unsigned size;
>>>       void *ptr;
>>> -    int i;
>>> +    int i, idx;
>>>       for (i = 0; i < adev->uvd.num_uvd_inst; i++) {
>>>           if (adev->uvd.harvest_config & (1 << i))
>>> @@ -432,7 +437,10 @@ int amdgpu_uvd_resume(struct amdgpu_device *adev)
>>>           ptr = adev->uvd.inst[i].cpu_addr;
>>>           if (adev->uvd.inst[i].saved_bo != NULL) {
>>> -            memcpy_toio(ptr, adev->uvd.inst[i].saved_bo, size);
>>> +            if (drm_dev_enter(&adev->ddev, &idx)) {
>>> +                memcpy_toio(ptr, adev->uvd.inst[i].saved_bo, size);
>>> +                drm_dev_exit(idx);
>>> +            }
>>>               kvfree(adev->uvd.inst[i].saved_bo);
>>>               adev->uvd.inst[i].saved_bo = NULL;
>>>           } else {
>>> @@ -442,8 +450,11 @@ int amdgpu_uvd_resume(struct amdgpu_device *adev)
>>>               hdr = (const struct common_firmware_header 
>>> *)adev->uvd.fw->data;
>>>               if (adev->firmware.load_type != AMDGPU_FW_LOAD_PSP) {
>>>                   offset = le32_to_cpu(hdr->ucode_array_offset_bytes);
>>> -                memcpy_toio(adev->uvd.inst[i].cpu_addr, 
>>> adev->uvd.fw->data + offset,
>>> -                        le32_to_cpu(hdr->ucode_size_bytes));
>>> +                if (drm_dev_enter(&adev->ddev, &idx)) {
>>> +                    memcpy_toio(adev->uvd.inst[i].cpu_addr, 
>>> adev->uvd.fw->data + offset,
>>> +                            le32_to_cpu(hdr->ucode_size_bytes));
>>> +                    drm_dev_exit(idx);
>>> +                }
>>>                   size -= le32_to_cpu(hdr->ucode_size_bytes);
>>>                   ptr += le32_to_cpu(hdr->ucode_size_bytes);
>>>               }
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
>>> index ea6a62f67e38..833203401ef4 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
>>> @@ -29,6 +29,7 @@
>>>   #include <linux/module.h>
>>>   #include <drm/drm.h>
>>> +#include <drm/drm_drv.h>
>>>   #include "amdgpu.h"
>>>   #include "amdgpu_pm.h"
>>> @@ -293,7 +294,7 @@ int amdgpu_vce_resume(struct amdgpu_device *adev)
>>>       void *cpu_addr;
>>>       const struct common_firmware_header *hdr;
>>>       unsigned offset;
>>> -    int r;
>>> +    int r, idx;
>>>       if (adev->vce.vcpu_bo == NULL)
>>>           return -EINVAL;
>>> @@ -313,8 +314,12 @@ int amdgpu_vce_resume(struct amdgpu_device *adev)
>>>       hdr = (const struct common_firmware_header *)adev->vce.fw->data;
>>>       offset = le32_to_cpu(hdr->ucode_array_offset_bytes);
>>> -    memcpy_toio(cpu_addr, adev->vce.fw->data + offset,
>>> -            adev->vce.fw->size - offset);
>>> +
>>> +    if (drm_dev_enter(&adev->ddev, &idx)) {
>>> +        memcpy_toio(cpu_addr, adev->vce.fw->data + offset,
>>> +                adev->vce.fw->size - offset);
>>> +        drm_dev_exit(idx);
>>> +    }
>>>       amdgpu_bo_kunmap(adev->vce.vcpu_bo);
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
>>> index 201645963ba5..21f7d3644d70 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
>>> @@ -27,6 +27,7 @@
>>>   #include <linux/firmware.h>
>>>   #include <linux/module.h>
>>>   #include <linux/pci.h>
>>> +#include <drm/drm_drv.h>
>>>   #include "amdgpu.h"
>>>   #include "amdgpu_pm.h"
>>> @@ -275,7 +276,7 @@ int amdgpu_vcn_suspend(struct amdgpu_device *adev)
>>>   {
>>>       unsigned size;
>>>       void *ptr;
>>> -    int i;
>>> +    int i, idx;
>>>       cancel_delayed_work_sync(&adev->vcn.idle_work);
>>> @@ -292,7 +293,10 @@ int amdgpu_vcn_suspend(struct amdgpu_device *adev)
>>>           if (!adev->vcn.inst[i].saved_bo)
>>>               return -ENOMEM;
>>> -        memcpy_fromio(adev->vcn.inst[i].saved_bo, ptr, size);
>>> +        if (drm_dev_enter(&adev->ddev, &idx)) {
>>> +            memcpy_fromio(adev->vcn.inst[i].saved_bo, ptr, size);
>>> +            drm_dev_exit(idx);
>>> +        }
>>>       }
>>>       return 0;
>>>   }
>>> @@ -301,7 +305,7 @@ int amdgpu_vcn_resume(struct amdgpu_device *adev)
>>>   {
>>>       unsigned size;
>>>       void *ptr;
>>> -    int i;
>>> +    int i, idx;
>>>       for (i = 0; i < adev->vcn.num_vcn_inst; ++i) {
>>>           if (adev->vcn.harvest_config & (1 << i))
>>> @@ -313,7 +317,10 @@ int amdgpu_vcn_resume(struct amdgpu_device *adev)
>>>           ptr = adev->vcn.inst[i].cpu_addr;
>>>           if (adev->vcn.inst[i].saved_bo != NULL) {
>>> -            memcpy_toio(ptr, adev->vcn.inst[i].saved_bo, size);
>>> +            if (drm_dev_enter(&adev->ddev, &idx)) {
>>> +                memcpy_toio(ptr, adev->vcn.inst[i].saved_bo, size);
>>> +                drm_dev_exit(idx);
>>> +            }
>>>               kvfree(adev->vcn.inst[i].saved_bo);
>>>               adev->vcn.inst[i].saved_bo = NULL;
>>>           } else {
>>> @@ -323,8 +330,11 @@ int amdgpu_vcn_resume(struct amdgpu_device *adev)
>>>               hdr = (const struct common_firmware_header 
>>> *)adev->vcn.fw->data;
>>>               if (adev->firmware.load_type != AMDGPU_FW_LOAD_PSP) {
>>>                   offset = le32_to_cpu(hdr->ucode_array_offset_bytes);
>>> -                memcpy_toio(adev->vcn.inst[i].cpu_addr, 
>>> adev->vcn.fw->data + offset,
>>> -                        le32_to_cpu(hdr->ucode_size_bytes));
>>> +                if (drm_dev_enter(&adev->ddev, &idx)) {
>>> +                    memcpy_toio(adev->vcn.inst[i].cpu_addr, 
>>> adev->vcn.fw->data + offset,
>>> +                            le32_to_cpu(hdr->ucode_size_bytes));
>>> +                    drm_dev_exit(idx);
>>> +                }
>>>                   size -= le32_to_cpu(hdr->ucode_size_bytes);
>>>                   ptr += le32_to_cpu(hdr->ucode_size_bytes);
>>>               }
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>> index 9f868cf3b832..7dd5f10ab570 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>> @@ -32,6 +32,7 @@
>>>   #include <linux/dma-buf.h>
>>>   #include <drm/amdgpu_drm.h>
>>> +#include <drm/drm_drv.h>
>>>   #include "amdgpu.h"
>>>   #include "amdgpu_trace.h"
>>>   #include "amdgpu_amdkfd.h"
>>> @@ -1606,7 +1607,10 @@ static int amdgpu_vm_bo_update_mapping(struct 
>>> amdgpu_device *adev,
>>>       struct amdgpu_vm_update_params params;
>>>       enum amdgpu_sync_mode sync_mode;
>>>       uint64_t pfn;
>>> -    int r;
>>> +    int r, idx;
>>> +
>>> +    if (!drm_dev_enter(&adev->ddev, &idx))
>>> +        return -ENODEV;
>>>       memset(&params, 0, sizeof(params));
>>>       params.adev = adev;
>>> @@ -1715,6 +1719,7 @@ static int amdgpu_vm_bo_update_mapping(struct 
>>> amdgpu_device *adev,
>>>   error_unlock:
>>>       amdgpu_vm_eviction_unlock(vm);
>>> +    drm_dev_exit(idx);
>>>       return r;
>>>   }
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c 
>>> b/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
>>> index 589410c32d09..2cec71e823f5 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
>>> @@ -23,6 +23,7 @@
>>>   #include <linux/firmware.h>
>>>   #include <linux/module.h>
>>>   #include <linux/vmalloc.h>
>>> +#include <drm/drm_drv.h>
>>>   #include "amdgpu.h"
>>>   #include "amdgpu_psp.h"
>>> @@ -269,10 +270,8 @@ static int psp_v11_0_bootloader_load_kdb(struct 
>>> psp_context *psp)
>>>       if (ret)
>>>           return ret;
>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>> -
>>>       /* Copy PSP KDB binary to memory */
>>> -    memcpy(psp->fw_pri_buf, psp->kdb_start_addr, psp->kdb_bin_size);
>>> +    psp_copy_fw(psp, psp->kdb_start_addr, psp->kdb_bin_size);
>>>       /* Provide the PSP KDB to bootloader */
>>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>> @@ -302,10 +301,8 @@ static int psp_v11_0_bootloader_load_spl(struct 
>>> psp_context *psp)
>>>       if (ret)
>>>           return ret;
>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>> -
>>>       /* Copy PSP SPL binary to memory */
>>> -    memcpy(psp->fw_pri_buf, psp->spl_start_addr, psp->spl_bin_size);
>>> +    psp_copy_fw(psp, psp->spl_start_addr, psp->spl_bin_size);
>>>       /* Provide the PSP SPL to bootloader */
>>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>> @@ -335,10 +332,8 @@ static int 
>>> psp_v11_0_bootloader_load_sysdrv(struct psp_context *psp)
>>>       if (ret)
>>>           return ret;
>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>> -
>>>       /* Copy PSP System Driver binary to memory */
>>> -    memcpy(psp->fw_pri_buf, psp->sys_start_addr, psp->sys_bin_size);
>>> +    psp_copy_fw(psp, psp->sys_start_addr, psp->sys_bin_size);
>>>       /* Provide the sys driver to bootloader */
>>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>> @@ -371,10 +366,8 @@ static int psp_v11_0_bootloader_load_sos(struct 
>>> psp_context *psp)
>>>       if (ret)
>>>           return ret;
>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>> -
>>>       /* Copy Secure OS binary to PSP memory */
>>> -    memcpy(psp->fw_pri_buf, psp->sos_start_addr, psp->sos_bin_size);
>>> +    psp_copy_fw(psp, psp->sos_start_addr, psp->sos_bin_size);
>>>       /* Provide the PSP secure OS to bootloader */
>>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>> @@ -608,7 +601,7 @@ static int psp_v11_0_memory_training(struct 
>>> psp_context *psp, uint32_t ops)
>>>       uint32_t p2c_header[4];
>>>       uint32_t sz;
>>>       void *buf;
>>> -    int ret;
>>> +    int ret, idx;
>>>       if (ctx->init == PSP_MEM_TRAIN_NOT_SUPPORT) {
>>>           DRM_DEBUG("Memory training is not supported.\n");
>>> @@ -681,17 +674,24 @@ static int psp_v11_0_memory_training(struct 
>>> psp_context *psp, uint32_t ops)
>>>               return -ENOMEM;
>>>           }
>>> -        memcpy_fromio(buf, adev->mman.aper_base_kaddr, sz);
>>> -        ret = psp_v11_0_memory_training_send_msg(psp, 
>>> PSP_BL__DRAM_LONG_TRAIN);
>>> -        if (ret) {
>>> -            DRM_ERROR("Send long training msg failed.\n");
>>> +        if (drm_dev_enter(&adev->ddev, &idx)) {
>>> +            memcpy_fromio(buf, adev->mman.aper_base_kaddr, sz);
>>> +            ret = psp_v11_0_memory_training_send_msg(psp, 
>>> PSP_BL__DRAM_LONG_TRAIN);
>>> +            if (ret) {
>>> +                DRM_ERROR("Send long training msg failed.\n");
>>> +                vfree(buf);
>>> +                drm_dev_exit(idx);
>>> +                return ret;
>>> +            }
>>> +
>>> +            memcpy_toio(adev->mman.aper_base_kaddr, buf, sz);
>>> +            adev->hdp.funcs->flush_hdp(adev, NULL);
>>>               vfree(buf);
>>> -            return ret;
>>> +            drm_dev_exit(idx);
>>> +        } else {
>>> +            vfree(buf);
>>> +            return -ENODEV;
>>>           }
>>> -
>>> -        memcpy_toio(adev->mman.aper_base_kaddr, buf, sz);
>>> -        adev->hdp.funcs->flush_hdp(adev, NULL);
>>> -        vfree(buf);
>>>       }
>>>       if (ops & PSP_MEM_TRAIN_SAVE) {
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c 
>>> b/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
>>> index c4828bd3264b..618e5b6b85d9 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
>>> @@ -138,10 +138,8 @@ static int 
>>> psp_v12_0_bootloader_load_sysdrv(struct psp_context *psp)
>>>       if (ret)
>>>           return ret;
>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>> -
>>>       /* Copy PSP System Driver binary to memory */
>>> -    memcpy(psp->fw_pri_buf, psp->sys_start_addr, psp->sys_bin_size);
>>> +    psp_copy_fw(psp, psp->sys_start_addr, psp->sys_bin_size);
>>>       /* Provide the sys driver to bootloader */
>>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>> @@ -179,10 +177,8 @@ static int psp_v12_0_bootloader_load_sos(struct 
>>> psp_context *psp)
>>>       if (ret)
>>>           return ret;
>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>> -
>>>       /* Copy Secure OS binary to PSP memory */
>>> -    memcpy(psp->fw_pri_buf, psp->sos_start_addr, psp->sos_bin_size);
>>> +    psp_copy_fw(psp, psp->sos_start_addr, psp->sos_bin_size);
>>>       /* Provide the PSP secure OS to bootloader */
>>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c 
>>> b/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
>>> index f2e725f72d2f..d0a6cccd0897 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
>>> @@ -102,10 +102,8 @@ static int 
>>> psp_v3_1_bootloader_load_sysdrv(struct psp_context *psp)
>>>       if (ret)
>>>           return ret;
>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>> -
>>>       /* Copy PSP System Driver binary to memory */
>>> -    memcpy(psp->fw_pri_buf, psp->sys_start_addr, psp->sys_bin_size);
>>> +    psp_copy_fw(psp, psp->sys_start_addr, psp->sys_bin_size);
>>>       /* Provide the sys driver to bootloader */
>>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>> @@ -143,10 +141,8 @@ static int psp_v3_1_bootloader_load_sos(struct 
>>> psp_context *psp)
>>>       if (ret)
>>>           return ret;
>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>> -
>>>       /* Copy Secure OS binary to PSP memory */
>>> -    memcpy(psp->fw_pri_buf, psp->sos_start_addr, psp->sos_bin_size);
>>> +    psp_copy_fw(psp, psp->sos_start_addr, psp->sos_bin_size);
>>>       /* Provide the PSP secure OS to bootloader */
>>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c 
>>> b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
>>> index 8e238dea7bef..90910d19db12 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
>>> @@ -25,6 +25,7 @@
>>>    */
>>>   #include <linux/firmware.h>
>>> +#include <drm/drm_drv.h>
>>>   #include "amdgpu.h"
>>>   #include "amdgpu_vce.h"
>>> @@ -555,16 +556,19 @@ static int vce_v4_0_hw_fini(void *handle)
>>>   static int vce_v4_0_suspend(void *handle)
>>>   {
>>>       struct amdgpu_device *adev = (struct amdgpu_device *)handle;
>>> -    int r;
>>> +    int r, idx;
>>>       if (adev->vce.vcpu_bo == NULL)
>>>           return 0;
>>> -    if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) {
>>> -        unsigned size = amdgpu_bo_size(adev->vce.vcpu_bo);
>>> -        void *ptr = adev->vce.cpu_addr;
>>> +    if (drm_dev_enter(&adev->ddev, &idx)) {
>>> +        if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) {
>>> +            unsigned size = amdgpu_bo_size(adev->vce.vcpu_bo);
>>> +            void *ptr = adev->vce.cpu_addr;
>>> -        memcpy_fromio(adev->vce.saved_bo, ptr, size);
>>> +            memcpy_fromio(adev->vce.saved_bo, ptr, size);
>>> +        }
>>> +        drm_dev_exit(idx);
>>>       }
>>>       r = vce_v4_0_hw_fini(adev);
>>> @@ -577,16 +581,20 @@ static int vce_v4_0_suspend(void *handle)
>>>   static int vce_v4_0_resume(void *handle)
>>>   {
>>>       struct amdgpu_device *adev = (struct amdgpu_device *)handle;
>>> -    int r;
>>> +    int r, idx;
>>>       if (adev->vce.vcpu_bo == NULL)
>>>           return -EINVAL;
>>>       if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) {
>>> -        unsigned size = amdgpu_bo_size(adev->vce.vcpu_bo);
>>> -        void *ptr = adev->vce.cpu_addr;
>>> -        memcpy_toio(ptr, adev->vce.saved_bo, size);
>>> +        if (drm_dev_enter(&adev->ddev, &idx)) {
>>> +            unsigned size = amdgpu_bo_size(adev->vce.vcpu_bo);
>>> +            void *ptr = adev->vce.cpu_addr;
>>> +
>>> +            memcpy_toio(ptr, adev->vce.saved_bo, size);
>>> +            drm_dev_exit(idx);
>>> +        }
>>>       } else {
>>>           r = amdgpu_vce_resume(adev);
>>>           if (r)
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c 
>>> b/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
>>> index 3f15bf34123a..df34be8ec82d 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
>>> @@ -34,6 +34,8 @@
>>>   #include "vcn/vcn_3_0_0_sh_mask.h"
>>>   #include "ivsrcid/vcn/irqsrcs_vcn_2_0.h"
>>> +#include <drm/drm_drv.h>
>>> +
>>>   #define mmUVD_CONTEXT_ID_INTERNAL_OFFSET            0x27
>>>   #define mmUVD_GPCOM_VCPU_CMD_INTERNAL_OFFSET            0x0f
>>>   #define mmUVD_GPCOM_VCPU_DATA0_INTERNAL_OFFSET            0x10
>>> @@ -268,16 +270,20 @@ static int vcn_v3_0_sw_init(void *handle)
>>>   static int vcn_v3_0_sw_fini(void *handle)
>>>   {
>>>       struct amdgpu_device *adev = (struct amdgpu_device *)handle;
>>> -    int i, r;
>>> +    int i, r, idx;
>>> -    for (i = 0; i < adev->vcn.num_vcn_inst; i++) {
>>> -        volatile struct amdgpu_fw_shared *fw_shared;
>>> +    if (drm_dev_enter(&adev->ddev, &idx)) {
>>> +        for (i = 0; i < adev->vcn.num_vcn_inst; i++) {
>>> +            volatile struct amdgpu_fw_shared *fw_shared;
>>> -        if (adev->vcn.harvest_config & (1 << i))
>>> -            continue;
>>> -        fw_shared = adev->vcn.inst[i].fw_shared_cpu_addr;
>>> -        fw_shared->present_flag_0 = 0;
>>> -        fw_shared->sw_ring.is_enabled = false;
>>> +            if (adev->vcn.harvest_config & (1 << i))
>>> +                continue;
>>> +            fw_shared = adev->vcn.inst[i].fw_shared_cpu_addr;
>>> +            fw_shared->present_flag_0 = 0;
>>> +            fw_shared->sw_ring.is_enabled = false;
>>> +        }
>>> +
>>> +        drm_dev_exit(idx);
>>>       }
>>>       if (amdgpu_sriov_vf(adev))
>>> diff --git a/drivers/gpu/drm/amd/pm/powerplay/smumgr/smu7_smumgr.c 
>>> b/drivers/gpu/drm/amd/pm/powerplay/smumgr/smu7_smumgr.c
>>> index aae25243eb10..d628b91846c9 100644
>>> --- a/drivers/gpu/drm/amd/pm/powerplay/smumgr/smu7_smumgr.c
>>> +++ b/drivers/gpu/drm/amd/pm/powerplay/smumgr/smu7_smumgr.c
>>> @@ -405,6 +405,8 @@ int smu7_request_smu_load_fw(struct pp_hwmgr *hwmgr)
>>>                   UCODE_ID_MEC_STORAGE, 
>>> &toc->entry[toc->num_entries++]),
>>>                   "Failed to Get Firmware Entry.", r = -EINVAL; goto 
>>> failed);
>>>       }
>>> +
>>> +    /* AG TODO Can't call drm_dev_enter/exit because access 
>>> adev->ddev here ... */
>>>       memcpy_toio(smu_data->header_buffer.kaddr, smu_data->toc,
>>>               sizeof(struct SMU_DRAMData_TOC));
>>>       smum_send_msg_to_smc_with_parameter(hwmgr,
>>


* Re: [PATCH v6 10/16] drm/amdgpu: Guard against write accesses after device removal
@ 2021-05-12 14:01         ` Andrey Grodzovsky
  0 siblings, 0 replies; 126+ messages in thread
From: Andrey Grodzovsky @ 2021-05-12 14:01 UTC (permalink / raw)
  To: Christian König, dri-devel, amd-gfx, linux-pci,
	daniel.vetter, Harry.Wentland
  Cc: Alexander.Deucher, gregkh, ppaalanen, helgaas, Felix.Kuehling

Ping - I need confirmation that it's OK to keep this as a single patch,
given my explanation below.

Andrey
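
For readers following the thread, the guard pattern under discussion can
be sketched in plain userspace C. This is only an illustration, not the
driver code: mock_dev, mock_dev_enter(), mock_dev_exit() and
guarded_copy() are hypothetical stand-ins, and the mock tracks the
critical section with a simple counter, whereas the kernel's
drm_dev_enter()/drm_dev_exit() rely on SRCU and drm_dev_exit() takes only
the index. The sketch shows the early-return form used in v6 together
with the single exit label suggested in review.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a device; not a kernel structure. */
struct mock_dev {
	bool unplugged;          /* set once the device has been removed */
	int active_sections;     /* number of guarded sections in flight */
	unsigned char vram[16];  /* pretend device-backed memory */
};

/* Mock of drm_dev_enter(): refuse entry after unplug. */
static bool mock_dev_enter(struct mock_dev *dev, int *idx)
{
	if (dev->unplugged)
		return false;
	*idx = dev->active_sections++;
	return true;
}

/* Mock of drm_dev_exit(): leave the guarded section. */
static void mock_dev_exit(struct mock_dev *dev, int idx)
{
	(void)idx;
	dev->active_sections--;
}

/* Copy into device memory only while the device is still present. */
static int guarded_copy(struct mock_dev *dev, const void *src, size_t len)
{
	int idx, ret = 0;

	if (!mock_dev_enter(dev, &idx))
		return -ENODEV;

	if (len > sizeof(dev->vram)) {
		ret = -EINVAL;
		goto exit;       /* single exit keeps enter/exit balanced */
	}

	memcpy(dev->vram, src, len);

exit:
	mock_dev_exit(dev, idx);
	return ret;
}
```

With this shape every path that passed the enter check also runs the exit
call, so adding another error branch later cannot leave the section open.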

On 2021-05-11 1:52 p.m., Andrey Grodzovsky wrote:
> 
> 
> On 2021-05-11 2:50 a.m., Christian König wrote:
>> Am 10.05.21 um 18:36 schrieb Andrey Grodzovsky:
>>> This should prevent writing to memory or IO ranges possibly
>>> already allocated for other uses after our device is removed.
>>>
>>> v5:
>>> Protect more places where memcpy_toio/memcpy_fromio take place
>>> Protect IB submissions
>>>
>>> v6: Switch to !drm_dev_enter instead of scoping entire code
>>> with brackets.
>>>
>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>>> ---
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c    | 11 ++-
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c       |  9 +++
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c        | 17 +++--
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c       | 63 +++++++++++------
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h       |  2 +
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c      | 70 +++++++++++++++++++
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h      | 49 ++-----------
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c       | 31 +++++---
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c       | 11 ++-
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c       | 22 ++++--
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c        |  7 +-
>>>   drivers/gpu/drm/amd/amdgpu/psp_v11_0.c        | 44 ++++++------
>>>   drivers/gpu/drm/amd/amdgpu/psp_v12_0.c        |  8 +--
>>>   drivers/gpu/drm/amd/amdgpu/psp_v3_1.c         |  8 +--
>>>   drivers/gpu/drm/amd/amdgpu/vce_v4_0.c         | 26 ++++---
>>>   drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c         | 22 +++---
>>>   .../drm/amd/pm/powerplay/smumgr/smu7_smumgr.c |  2 +
>>>   17 files changed, 257 insertions(+), 145 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>> index a0bff4713672..94c415176cdc 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>> @@ -71,6 +71,8 @@
>>>   #include <drm/task_barrier.h>
>>>   #include <linux/pm_runtime.h>
>>> +#include <drm/drm_drv.h>
>>> +
>>>   MODULE_FIRMWARE("amdgpu/vega10_gpu_info.bin");
>>>   MODULE_FIRMWARE("amdgpu/vega12_gpu_info.bin");
>>>   MODULE_FIRMWARE("amdgpu/raven_gpu_info.bin");
>>> @@ -281,7 +283,10 @@ void amdgpu_device_vram_access(struct 
>>> amdgpu_device *adev, loff_t pos,
>>>       unsigned long flags;
>>>       uint32_t hi = ~0;
>>>       uint64_t last;
>>> +    int idx;
>>> +     if (!drm_dev_enter(&adev->ddev, &idx))
>>> +         return;
>>>   #ifdef CONFIG_64BIT
>>>       last = min(pos + size, adev->gmc.visible_vram_size);
>>> @@ -299,8 +304,10 @@ void amdgpu_device_vram_access(struct 
>>> amdgpu_device *adev, loff_t pos,
>>>               memcpy_fromio(buf, addr, count);
>>>           }
>>> -        if (count == size)
>>> +        if (count == size) {
>>> +            drm_dev_exit(idx);
>>>               return;
>>> +        }
>>
>> Maybe use a goto instead, but really just a nit pick.
>>
>>
>>
>>>           pos += count;
>>>           buf += count / 4;
>>> @@ -323,6 +330,8 @@ void amdgpu_device_vram_access(struct 
>>> amdgpu_device *adev, loff_t pos,
>>>               *buf++ = RREG32_NO_KIQ(mmMM_DATA);
>>>       }
>>>       spin_unlock_irqrestore(&adev->mmio_idx_lock, flags);
>>> +
>>> +    drm_dev_exit(idx);
>>>   }
>>>   /*
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
>>> index 4d32233cde92..04ba5eef1e88 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
>>> @@ -31,6 +31,8 @@
>>>   #include "amdgpu_ras.h"
>>>   #include "amdgpu_xgmi.h"
>>> +#include <drm/drm_drv.h>
>>> +
>>>   /**
>>>    * amdgpu_gmc_pdb0_alloc - allocate vram for pdb0
>>>    *
>>> @@ -151,6 +153,10 @@ int amdgpu_gmc_set_pte_pde(struct amdgpu_device 
>>> *adev, void *cpu_pt_addr,
>>>   {
>>>       void __iomem *ptr = (void *)cpu_pt_addr;
>>>       uint64_t value;
>>> +    int idx;
>>> +
>>> +    if (!drm_dev_enter(&adev->ddev, &idx))
>>> +        return 0;
>>>       /*
>>>        * The following is for PTE only. GART does not have PDEs.
>>> @@ -158,6 +164,9 @@ int amdgpu_gmc_set_pte_pde(struct amdgpu_device 
>>> *adev, void *cpu_pt_addr,
>>>       value = addr & 0x0000FFFFFFFFF000ULL;
>>>       value |= flags;
>>>       writeq(value, ptr + (gpu_page_idx * 8));
>>> +
>>> +    drm_dev_exit(idx);
>>> +
>>>       return 0;
>>>   }
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
>>> index 148a3b481b12..62fcbd446c71 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
>>> @@ -30,6 +30,7 @@
>>>   #include <linux/slab.h>
>>>   #include <drm/amdgpu_drm.h>
>>> +#include <drm/drm_drv.h>
>>>   #include "amdgpu.h"
>>>   #include "atom.h"
>>> @@ -137,7 +138,7 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, 
>>> unsigned num_ibs,
>>>       bool secure;
>>>       unsigned i;
>>> -    int r = 0;
>>> +    int idx, r = 0;
>>>       bool need_pipe_sync = false;
>>>       if (num_ibs == 0)
>>> @@ -169,13 +170,16 @@ int amdgpu_ib_schedule(struct amdgpu_ring 
>>> *ring, unsigned num_ibs,
>>>           return -EINVAL;
>>>       }
>>> +    if (!drm_dev_enter(&adev->ddev, &idx))
>>> +        return -ENODEV;
>>> +
>>>       alloc_size = ring->funcs->emit_frame_size + num_ibs *
>>>           ring->funcs->emit_ib_size;
>>>       r = amdgpu_ring_alloc(ring, alloc_size);
>>>       if (r) {
>>>           dev_err(adev->dev, "scheduling IB failed (%d).\n", r);
>>> -        return r;
>>> +        goto exit;
>>>       }
>>>       need_ctx_switch = ring->current_ctx != fence_ctx;
>>> @@ -205,7 +209,7 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, 
>>> unsigned num_ibs,
>>>           r = amdgpu_vm_flush(ring, job, need_pipe_sync);
>>>           if (r) {
>>>               amdgpu_ring_undo(ring);
>>> -            return r;
>>> +            goto exit;
>>>           }
>>>       }
>>> @@ -286,7 +290,7 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, 
>>> unsigned num_ibs,
>>>           if (job && job->vmid)
>>>               amdgpu_vmid_reset(adev, ring->funcs->vmhub, job->vmid);
>>>           amdgpu_ring_undo(ring);
>>> -        return r;
>>> +        goto exit;
>>>       }
>>>       if (ring->funcs->insert_end)
>>> @@ -304,7 +308,10 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, 
>>> unsigned num_ibs,
>>>           ring->funcs->emit_wave_limit(ring, false);
>>>       amdgpu_ring_commit(ring);
>>> -    return 0;
>>> +
>>> +exit:
>>> +    drm_dev_exit(idx);
>>> +    return r;
>>>   }
>>>   /**
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
>>> index 9e769cf6095b..bb6afee61666 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
>>> @@ -25,6 +25,7 @@
>>>   #include <linux/firmware.h>
>>>   #include <linux/dma-mapping.h>
>>> +#include <drm/drm_drv.h>
>>>   #include "amdgpu.h"
>>>   #include "amdgpu_psp.h"
>>> @@ -39,6 +40,8 @@
>>>   #include "amdgpu_ras.h"
>>>   #include "amdgpu_securedisplay.h"
>>> +#include <drm/drm_drv.h>
>>> +
>>>   static int psp_sysfs_init(struct amdgpu_device *adev);
>>>   static void psp_sysfs_fini(struct amdgpu_device *adev);
>>> @@ -253,7 +256,7 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>>              struct psp_gfx_cmd_resp *cmd, uint64_t fence_mc_addr)
>>>   {
>>>       int ret;
>>> -    int index;
>>> +    int index, idx;
>>>       int timeout = 20000;
>>>       bool ras_intr = false;
>>>       bool skip_unsupport = false;
>>> @@ -261,6 +264,9 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>>       if (psp->adev->in_pci_err_recovery)
>>>           return 0;
>>> +    if (!drm_dev_enter(&psp->adev->ddev, &idx))
>>> +        return 0;
>>> +
>>>       mutex_lock(&psp->mutex);
>>>       memset(psp->cmd_buf_mem, 0, PSP_CMD_BUFFER_SIZE);
>>> @@ -271,8 +277,7 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>>       ret = psp_ring_cmd_submit(psp, psp->cmd_buf_mc_addr, 
>>> fence_mc_addr, index);
>>>       if (ret) {
>>>           atomic_dec(&psp->fence_value);
>>> -        mutex_unlock(&psp->mutex);
>>> -        return ret;
>>> +        goto exit;
>>>       }
>>>       amdgpu_asic_invalidate_hdp(psp->adev, NULL);
>>> @@ -312,8 +317,8 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>>                psp->cmd_buf_mem->cmd_id,
>>>                psp->cmd_buf_mem->resp.status);
>>>           if (!timeout) {
>>> -            mutex_unlock(&psp->mutex);
>>> -            return -EINVAL;
>>> +            ret = -EINVAL;
>>> +            goto exit;
>>>           }
>>>       }
>>> @@ -321,8 +326,10 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>>           ucode->tmr_mc_addr_lo = psp->cmd_buf_mem->resp.fw_addr_lo;
>>>           ucode->tmr_mc_addr_hi = psp->cmd_buf_mem->resp.fw_addr_hi;
>>>       }
>>> -    mutex_unlock(&psp->mutex);
>>> +exit:
>>> +    mutex_unlock(&psp->mutex);
>>> +    drm_dev_exit(idx);
>>>       return ret;
>>>   }
>>> @@ -359,8 +366,7 @@ static int psp_load_toc(struct psp_context *psp,
>>>       if (!cmd)
>>>           return -ENOMEM;
>>>       /* Copy toc to psp firmware private buffer */
>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>> -    memcpy(psp->fw_pri_buf, psp->toc_start_addr, psp->toc_bin_size);
>>> +    psp_copy_fw(psp, psp->toc_start_addr, psp->toc_bin_size);
>>>       psp_prep_load_toc_cmd_buf(cmd, psp->fw_pri_mc_addr, 
>>> psp->toc_bin_size);
>>> @@ -625,8 +631,7 @@ static int psp_asd_load(struct psp_context *psp)
>>>       if (!cmd)
>>>           return -ENOMEM;
>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>> -    memcpy(psp->fw_pri_buf, psp->asd_start_addr, psp->asd_ucode_size);
>>> +    psp_copy_fw(psp, psp->asd_start_addr, psp->asd_ucode_size);
>>>       psp_prep_asd_load_cmd_buf(cmd, psp->fw_pri_mc_addr,
>>>                     psp->asd_ucode_size);
>>> @@ -781,8 +786,7 @@ static int psp_xgmi_load(struct psp_context *psp)
>>>       if (!cmd)
>>>           return -ENOMEM;
>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>> -    memcpy(psp->fw_pri_buf, psp->ta_xgmi_start_addr, 
>>> psp->ta_xgmi_ucode_size);
>>> +    psp_copy_fw(psp, psp->ta_xgmi_start_addr, psp->ta_xgmi_ucode_size);
>>>       psp_prep_ta_load_cmd_buf(cmd,
>>>                    psp->fw_pri_mc_addr,
>>> @@ -1038,8 +1042,7 @@ static int psp_ras_load(struct psp_context *psp)
>>>       if (!cmd)
>>>           return -ENOMEM;
>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>> -    memcpy(psp->fw_pri_buf, psp->ta_ras_start_addr, 
>>> psp->ta_ras_ucode_size);
>>> +    psp_copy_fw(psp, psp->ta_ras_start_addr, psp->ta_ras_ucode_size);
>>>       psp_prep_ta_load_cmd_buf(cmd,
>>>                    psp->fw_pri_mc_addr,
>>> @@ -1275,8 +1278,7 @@ static int psp_hdcp_load(struct psp_context *psp)
>>>       if (!cmd)
>>>           return -ENOMEM;
>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>> -    memcpy(psp->fw_pri_buf, psp->ta_hdcp_start_addr,
>>> +    psp_copy_fw(psp, psp->ta_hdcp_start_addr,
>>>              psp->ta_hdcp_ucode_size);
>>>       psp_prep_ta_load_cmd_buf(cmd,
>>> @@ -1427,8 +1429,7 @@ static int psp_dtm_load(struct psp_context *psp)
>>>       if (!cmd)
>>>           return -ENOMEM;
>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>> -    memcpy(psp->fw_pri_buf, psp->ta_dtm_start_addr, 
>>> psp->ta_dtm_ucode_size);
>>> +    psp_copy_fw(psp, psp->ta_dtm_start_addr, psp->ta_dtm_ucode_size);
>>>       psp_prep_ta_load_cmd_buf(cmd,
>>>                    psp->fw_pri_mc_addr,
>>> @@ -1573,8 +1574,7 @@ static int psp_rap_load(struct psp_context *psp)
>>>       if (!cmd)
>>>           return -ENOMEM;
>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>> -    memcpy(psp->fw_pri_buf, psp->ta_rap_start_addr, 
>>> psp->ta_rap_ucode_size);
>>> +    psp_copy_fw(psp, psp->ta_rap_start_addr, psp->ta_rap_ucode_size);
>>>       psp_prep_ta_load_cmd_buf(cmd,
>>>                    psp->fw_pri_mc_addr,
>>> @@ -3022,7 +3022,7 @@ static ssize_t 
>>> psp_usbc_pd_fw_sysfs_write(struct device *dev,
>>>       struct amdgpu_device *adev = drm_to_adev(ddev);
>>>       void *cpu_addr;
>>>       dma_addr_t dma_addr;
>>> -    int ret;
>>> +    int ret, idx;
>>>       char fw_name[100];
>>>       const struct firmware *usbc_pd_fw;
>>> @@ -3031,6 +3031,9 @@ static ssize_t 
>>> psp_usbc_pd_fw_sysfs_write(struct device *dev,
>>>           return -EBUSY;
>>>       }
>>> +    if (!drm_dev_enter(ddev, &idx))
>>> +        return -ENODEV;
>>> +
>>>       snprintf(fw_name, sizeof(fw_name), "amdgpu/%s", buf);
>>>       ret = request_firmware(&usbc_pd_fw, fw_name, adev->dev);
>>>       if (ret)
>>> @@ -3062,16 +3065,30 @@ static ssize_t 
>>> psp_usbc_pd_fw_sysfs_write(struct device *dev,
>>>   rel_buf:
>>>       dma_free_coherent(adev->dev, usbc_pd_fw->size, cpu_addr, 
>>> dma_addr);
>>>       release_firmware(usbc_pd_fw);
>>> -
>>>   fail:
>>>       if (ret) {
>>>           DRM_ERROR("Failed to load USBC PD FW, err = %d", ret);
>>> -        return ret;
>>> +        count = ret;
>>>       }
>>> +    drm_dev_exit(idx);
>>>       return count;
>>>   }
>>> +void psp_copy_fw(struct psp_context *psp, uint8_t *start_addr, 
>>> uint32_t bin_size)
>>> +{
>>> +    int idx;
>>> +
>>> +    if (!drm_dev_enter(&psp->adev->ddev, &idx))
>>> +        return;
>>> +
>>> +    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>> +    memcpy(psp->fw_pri_buf, start_addr, bin_size);
>>> +
>>> +    drm_dev_exit(idx);
>>> +}
>>> +
>>> +
>>>   static DEVICE_ATTR(usbc_pd_fw, S_IRUGO | S_IWUSR,
>>>              psp_usbc_pd_fw_sysfs_read,
>>>              psp_usbc_pd_fw_sysfs_write);
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
>>> index 46a5328e00e0..2bfdc278817f 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
>>> @@ -423,4 +423,6 @@ int psp_get_fw_attestation_records_addr(struct 
>>> psp_context *psp,
>>>   int psp_load_fw_list(struct psp_context *psp,
>>>                struct amdgpu_firmware_info **ucode_list, int 
>>> ucode_count);
>>> +void psp_copy_fw(struct psp_context *psp, uint8_t *start_addr, 
>>> uint32_t bin_size);
>>> +
>>>   #endif
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>>> index 688624ebe421..e1985bc34436 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>>> @@ -35,6 +35,8 @@
>>>   #include "amdgpu.h"
>>>   #include "atom.h"
>>> +#include <drm/drm_drv.h>
>>> +
>>>   /*
>>>    * Rings
>>>    * Most engines on the GPU are fed via ring buffers.  Ring
>>> @@ -461,3 +463,71 @@ int amdgpu_ring_test_helper(struct amdgpu_ring 
>>> *ring)
>>>       ring->sched.ready = !r;
>>>       return r;
>>>   }
>>> +
>>> +void amdgpu_ring_clear_ring(struct amdgpu_ring *ring)
>>> +{
>>> +    int idx;
>>> +    int i = 0;
>>> +
>>> +    if (!drm_dev_enter(&ring->adev->ddev, &idx))
>>> +        return;
>>> +
>>> +    while (i <= ring->buf_mask)
>>> +        ring->ring[i++] = ring->funcs->nop;
>>> +
>>> +    drm_dev_exit(idx);
>>> +
>>> +}
>>> +
>>> +void amdgpu_ring_write(struct amdgpu_ring *ring, uint32_t v)
>>> +{
>>> +    int idx;
>>> +
>>> +    if (!drm_dev_enter(&ring->adev->ddev, &idx))
>>> +        return;
>>> +
>>> +    if (ring->count_dw <= 0)
>>> +        DRM_ERROR("amdgpu: writing more dwords to the ring than 
>>> expected!\n");
>>> +    ring->ring[ring->wptr++ & ring->buf_mask] = v;
>>> +    ring->wptr &= ring->ptr_mask;
>>> +    ring->count_dw--;
>>> +
>>> +    drm_dev_exit(idx);
>>> +}
>>> +
>>> +void amdgpu_ring_write_multiple(struct amdgpu_ring *ring,
>>> +                          void *src, int count_dw)
>>> +{
>>> +    unsigned occupied, chunk1, chunk2;
>>> +    void *dst;
>>> +    int idx;
>>> +
>>> +    if (!drm_dev_enter(&ring->adev->ddev, &idx))
>>> +        return;
>>> +
>>> +    if (unlikely(ring->count_dw < count_dw))
>>> +        DRM_ERROR("amdgpu: writing more dwords to the ring than 
>>> expected!\n");
>>> +
>>> +    occupied = ring->wptr & ring->buf_mask;
>>> +    dst = (void *)&ring->ring[occupied];
>>> +    chunk1 = ring->buf_mask + 1 - occupied;
>>> +    chunk1 = (chunk1 >= count_dw) ? count_dw: chunk1;
>>> +    chunk2 = count_dw - chunk1;
>>> +    chunk1 <<= 2;
>>> +    chunk2 <<= 2;
>>> +
>>> +    if (chunk1)
>>> +        memcpy(dst, src, chunk1);
>>> +
>>> +    if (chunk2) {
>>> +        src += chunk1;
>>> +        dst = (void *)ring->ring;
>>> +        memcpy(dst, src, chunk2);
>>> +    }
>>> +
>>> +    ring->wptr += count_dw;
>>> +    ring->wptr &= ring->ptr_mask;
>>> +    ring->count_dw -= count_dw;
>>> +
>>> +    drm_dev_exit(idx);
>>> +}
>>
>> The ring should never be in MMIO memory, so you can drop that
>> completely as far as I can see.
> 
> Yeah, it's all in GART, I missed it for some reason...
>>
>> Maybe split that patch by use case so that we can more easily 
>> review/ack it.
> 
> In fact everything here is the same use case. Once I added unmapping of
> all MMIO ranges (both registers and VRAM), I got a lot of page faults
> on device removal around any memcpy to/from IO. That is where I put the
> drm_dev_enter/exit scope. I also searched the code and preemptively
> added guards in any other such place. I did drop amdgpu_ib_schedule
> from this patch because it has a dma_fence_wait inside, so we
> will take care of it once we decide how to handle dma_fence waits.
> 
> Andrey
> 
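The guard pattern described above can be sketched in userspace as follows. This is only an illustration under stated assumptions: struct sketch_dev and the sketch_* helpers are hypothetical, a plain memcpy() stands in for memcpy_toio(), and the enter/exit pair is reduced to a flag check (the real drm_dev_enter()/drm_dev_exit() do more than that). It mirrors the shape of the resume hunks in the patch: only the access to the device mapping is skipped after unplug, while the system-memory cleanup still runs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical device model; none of these names exist in the kernel. */
struct sketch_dev {
	bool unplugged;		/* set once the device is gone */
	unsigned char io[16];	/* stands in for an MMIO/VRAM mapping */
	unsigned char *saved;	/* system-memory backup, like saved_bo */
};

/* Simplified stand-ins for drm_dev_enter()/drm_dev_exit(). */
static bool sketch_dev_enter(struct sketch_dev *dev)
{
	return !dev->unplugged;
}

static void sketch_dev_exit(struct sketch_dev *dev)
{
	(void)dev;
}

/* Returns true if the backup was actually written to the device mapping. */
static bool sketch_resume(struct sketch_dev *dev)
{
	bool restored = false;

	if (!dev->saved)
		return false;

	/* Only the access to the device mapping is guarded. */
	if (sketch_dev_enter(dev)) {
		memcpy(dev->io, dev->saved, sizeof(dev->io));
		restored = true;
		sketch_dev_exit(dev);
	}

	/* Cleanup of system memory happens whether or not the device is present. */
	free(dev->saved);
	dev->saved = NULL;
	return restored;
}
```

Keeping the guarded region this narrow means the backup is not leaked when the device has already been removed.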
>>
>> Thanks,
>> Christian.
>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>>> index e7d3d0dbdd96..c67bc6d3d039 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>>> @@ -299,53 +299,12 @@ static inline void 
>>> amdgpu_ring_set_preempt_cond_exec(struct amdgpu_ring *ring,
>>>       *ring->cond_exe_cpu_addr = cond_exec;
>>>   }
>>> -static inline void amdgpu_ring_clear_ring(struct amdgpu_ring *ring)
>>> -{
>>> -    int i = 0;
>>> -    while (i <= ring->buf_mask)
>>> -        ring->ring[i++] = ring->funcs->nop;
>>> -
>>> -}
>>> -
>>> -static inline void amdgpu_ring_write(struct amdgpu_ring *ring, 
>>> uint32_t v)
>>> -{
>>> -    if (ring->count_dw <= 0)
>>> -        DRM_ERROR("amdgpu: writing more dwords to the ring than 
>>> expected!\n");
>>> -    ring->ring[ring->wptr++ & ring->buf_mask] = v;
>>> -    ring->wptr &= ring->ptr_mask;
>>> -    ring->count_dw--;
>>> -}
>>> +void amdgpu_ring_clear_ring(struct amdgpu_ring *ring);
>>> -static inline void amdgpu_ring_write_multiple(struct amdgpu_ring *ring,
>>> -                          void *src, int count_dw)
>>> -{
>>> -    unsigned occupied, chunk1, chunk2;
>>> -    void *dst;
>>> -
>>> -    if (unlikely(ring->count_dw < count_dw))
>>> -        DRM_ERROR("amdgpu: writing more dwords to the ring than 
>>> expected!\n");
>>> -
>>> -    occupied = ring->wptr & ring->buf_mask;
>>> -    dst = (void *)&ring->ring[occupied];
>>> -    chunk1 = ring->buf_mask + 1 - occupied;
>>> -    chunk1 = (chunk1 >= count_dw) ? count_dw: chunk1;
>>> -    chunk2 = count_dw - chunk1;
>>> -    chunk1 <<= 2;
>>> -    chunk2 <<= 2;
>>> +void amdgpu_ring_write(struct amdgpu_ring *ring, uint32_t v);
>>> -    if (chunk1)
>>> -        memcpy(dst, src, chunk1);
>>> -
>>> -    if (chunk2) {
>>> -        src += chunk1;
>>> -        dst = (void *)ring->ring;
>>> -        memcpy(dst, src, chunk2);
>>> -    }
>>> -
>>> -    ring->wptr += count_dw;
>>> -    ring->wptr &= ring->ptr_mask;
>>> -    ring->count_dw -= count_dw;
>>> -}
>>> +void amdgpu_ring_write_multiple(struct amdgpu_ring *ring,
>>> +                          void *src, int count_dw);
>>>   int amdgpu_ring_test_helper(struct amdgpu_ring *ring);
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
>>> index c6dbc0801604..82f0542c7792 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
>>> @@ -32,6 +32,7 @@
>>>   #include <linux/module.h>
>>>   #include <drm/drm.h>
>>> +#include <drm/drm_drv.h>
>>>   #include "amdgpu.h"
>>>   #include "amdgpu_pm.h"
>>> @@ -375,7 +376,7 @@ int amdgpu_uvd_suspend(struct amdgpu_device *adev)
>>>   {
>>>       unsigned size;
>>>       void *ptr;
>>> -    int i, j;
>>> +    int i, j, idx;
>>>       bool in_ras_intr = amdgpu_ras_intr_triggered();
>>>       cancel_delayed_work_sync(&adev->uvd.idle_work);
>>> @@ -403,11 +404,15 @@ int amdgpu_uvd_suspend(struct amdgpu_device *adev)
>>>           if (!adev->uvd.inst[j].saved_bo)
>>>               return -ENOMEM;
>>> -        /* re-write 0 since err_event_athub will corrupt VCPU buffer */
>>> -        if (in_ras_intr)
>>> -            memset(adev->uvd.inst[j].saved_bo, 0, size);
>>> -        else
>>> -            memcpy_fromio(adev->uvd.inst[j].saved_bo, ptr, size);
>>> +        if (drm_dev_enter(&adev->ddev, &idx)) {
>>> +            /* re-write 0 since err_event_athub will corrupt VCPU 
>>> buffer */
>>> +            if (in_ras_intr)
>>> +                memset(adev->uvd.inst[j].saved_bo, 0, size);
>>> +            else
>>> +                memcpy_fromio(adev->uvd.inst[j].saved_bo, ptr, size);
>>> +
>>> +            drm_dev_exit(idx);
>>> +        }
>>>       }
>>>       if (in_ras_intr)
>>> @@ -420,7 +425,7 @@ int amdgpu_uvd_resume(struct amdgpu_device *adev)
>>>   {
>>>       unsigned size;
>>>       void *ptr;
>>> -    int i;
>>> +    int i, idx;
>>>       for (i = 0; i < adev->uvd.num_uvd_inst; i++) {
>>>           if (adev->uvd.harvest_config & (1 << i))
>>> @@ -432,7 +437,10 @@ int amdgpu_uvd_resume(struct amdgpu_device *adev)
>>>           ptr = adev->uvd.inst[i].cpu_addr;
>>>           if (adev->uvd.inst[i].saved_bo != NULL) {
>>> -            memcpy_toio(ptr, adev->uvd.inst[i].saved_bo, size);
>>> +            if (drm_dev_enter(&adev->ddev, &idx)) {
>>> +                memcpy_toio(ptr, adev->uvd.inst[i].saved_bo, size);
>>> +                drm_dev_exit(idx);
>>> +            }
>>>               kvfree(adev->uvd.inst[i].saved_bo);
>>>               adev->uvd.inst[i].saved_bo = NULL;
>>>           } else {
>>> @@ -442,8 +450,11 @@ int amdgpu_uvd_resume(struct amdgpu_device *adev)
>>>               hdr = (const struct common_firmware_header 
>>> *)adev->uvd.fw->data;
>>>               if (adev->firmware.load_type != AMDGPU_FW_LOAD_PSP) {
>>>                   offset = le32_to_cpu(hdr->ucode_array_offset_bytes);
>>> -                memcpy_toio(adev->uvd.inst[i].cpu_addr, 
>>> adev->uvd.fw->data + offset,
>>> -                        le32_to_cpu(hdr->ucode_size_bytes));
>>> +                if (drm_dev_enter(&adev->ddev, &idx)) {
>>> +                    memcpy_toio(adev->uvd.inst[i].cpu_addr, 
>>> adev->uvd.fw->data + offset,
>>> +                            le32_to_cpu(hdr->ucode_size_bytes));
>>> +                    drm_dev_exit(idx);
>>> +                }
>>>                   size -= le32_to_cpu(hdr->ucode_size_bytes);
>>>                   ptr += le32_to_cpu(hdr->ucode_size_bytes);
>>>               }
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
>>> index ea6a62f67e38..833203401ef4 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
>>> @@ -29,6 +29,7 @@
>>>   #include <linux/module.h>
>>>   #include <drm/drm.h>
>>> +#include <drm/drm_drv.h>
>>>   #include "amdgpu.h"
>>>   #include "amdgpu_pm.h"
>>> @@ -293,7 +294,7 @@ int amdgpu_vce_resume(struct amdgpu_device *adev)
>>>       void *cpu_addr;
>>>       const struct common_firmware_header *hdr;
>>>       unsigned offset;
>>> -    int r;
>>> +    int r, idx;
>>>       if (adev->vce.vcpu_bo == NULL)
>>>           return -EINVAL;
>>> @@ -313,8 +314,12 @@ int amdgpu_vce_resume(struct amdgpu_device *adev)
>>>       hdr = (const struct common_firmware_header *)adev->vce.fw->data;
>>>       offset = le32_to_cpu(hdr->ucode_array_offset_bytes);
>>> -    memcpy_toio(cpu_addr, adev->vce.fw->data + offset,
>>> -            adev->vce.fw->size - offset);
>>> +
>>> +    if (drm_dev_enter(&adev->ddev, &idx)) {
>>> +        memcpy_toio(cpu_addr, adev->vce.fw->data + offset,
>>> +                adev->vce.fw->size - offset);
>>> +        drm_dev_exit(idx);
>>> +    }
>>>       amdgpu_bo_kunmap(adev->vce.vcpu_bo);
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
>>> index 201645963ba5..21f7d3644d70 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
>>> @@ -27,6 +27,7 @@
>>>   #include <linux/firmware.h>
>>>   #include <linux/module.h>
>>>   #include <linux/pci.h>
>>> +#include <drm/drm_drv.h>
>>>   #include "amdgpu.h"
>>>   #include "amdgpu_pm.h"
>>> @@ -275,7 +276,7 @@ int amdgpu_vcn_suspend(struct amdgpu_device *adev)
>>>   {
>>>       unsigned size;
>>>       void *ptr;
>>> -    int i;
>>> +    int i, idx;
>>>       cancel_delayed_work_sync(&adev->vcn.idle_work);
>>> @@ -292,7 +293,10 @@ int amdgpu_vcn_suspend(struct amdgpu_device *adev)
>>>           if (!adev->vcn.inst[i].saved_bo)
>>>               return -ENOMEM;
>>> -        memcpy_fromio(adev->vcn.inst[i].saved_bo, ptr, size);
>>> +        if (drm_dev_enter(&adev->ddev, &idx)) {
>>> +            memcpy_fromio(adev->vcn.inst[i].saved_bo, ptr, size);
>>> +            drm_dev_exit(idx);
>>> +        }
>>>       }
>>>       return 0;
>>>   }
>>> @@ -301,7 +305,7 @@ int amdgpu_vcn_resume(struct amdgpu_device *adev)
>>>   {
>>>       unsigned size;
>>>       void *ptr;
>>> -    int i;
>>> +    int i, idx;
>>>       for (i = 0; i < adev->vcn.num_vcn_inst; ++i) {
>>>           if (adev->vcn.harvest_config & (1 << i))
>>> @@ -313,7 +317,10 @@ int amdgpu_vcn_resume(struct amdgpu_device *adev)
>>>           ptr = adev->vcn.inst[i].cpu_addr;
>>>           if (adev->vcn.inst[i].saved_bo != NULL) {
>>> -            memcpy_toio(ptr, adev->vcn.inst[i].saved_bo, size);
>>> +            if (drm_dev_enter(&adev->ddev, &idx)) {
>>> +                memcpy_toio(ptr, adev->vcn.inst[i].saved_bo, size);
>>> +                drm_dev_exit(idx);
>>> +            }
>>>               kvfree(adev->vcn.inst[i].saved_bo);
>>>               adev->vcn.inst[i].saved_bo = NULL;
>>>           } else {
>>> @@ -323,8 +330,11 @@ int amdgpu_vcn_resume(struct amdgpu_device *adev)
>>>               hdr = (const struct common_firmware_header 
>>> *)adev->vcn.fw->data;
>>>               if (adev->firmware.load_type != AMDGPU_FW_LOAD_PSP) {
>>>                   offset = le32_to_cpu(hdr->ucode_array_offset_bytes);
>>> -                memcpy_toio(adev->vcn.inst[i].cpu_addr, 
>>> adev->vcn.fw->data + offset,
>>> -                        le32_to_cpu(hdr->ucode_size_bytes));
>>> +                if (drm_dev_enter(&adev->ddev, &idx)) {
>>> +                    memcpy_toio(adev->vcn.inst[i].cpu_addr, 
>>> adev->vcn.fw->data + offset,
>>> +                            le32_to_cpu(hdr->ucode_size_bytes));
>>> +                    drm_dev_exit(idx);
>>> +                }
>>>                   size -= le32_to_cpu(hdr->ucode_size_bytes);
>>>                   ptr += le32_to_cpu(hdr->ucode_size_bytes);
>>>               }
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>> index 9f868cf3b832..7dd5f10ab570 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>> @@ -32,6 +32,7 @@
>>>   #include <linux/dma-buf.h>
>>>   #include <drm/amdgpu_drm.h>
>>> +#include <drm/drm_drv.h>
>>>   #include "amdgpu.h"
>>>   #include "amdgpu_trace.h"
>>>   #include "amdgpu_amdkfd.h"
>>> @@ -1606,7 +1607,10 @@ static int amdgpu_vm_bo_update_mapping(struct 
>>> amdgpu_device *adev,
>>>       struct amdgpu_vm_update_params params;
>>>       enum amdgpu_sync_mode sync_mode;
>>>       uint64_t pfn;
>>> -    int r;
>>> +    int r, idx;
>>> +
>>> +    if (!drm_dev_enter(&adev->ddev, &idx))
>>> +        return -ENODEV;
>>>       memset(&params, 0, sizeof(params));
>>>       params.adev = adev;
>>> @@ -1715,6 +1719,7 @@ static int amdgpu_vm_bo_update_mapping(struct 
>>> amdgpu_device *adev,
>>>   error_unlock:
>>>       amdgpu_vm_eviction_unlock(vm);
>>> +    drm_dev_exit(idx);
>>>       return r;
>>>   }
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c 
>>> b/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
>>> index 589410c32d09..2cec71e823f5 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
>>> @@ -23,6 +23,7 @@
>>>   #include <linux/firmware.h>
>>>   #include <linux/module.h>
>>>   #include <linux/vmalloc.h>
>>> +#include <drm/drm_drv.h>
>>>   #include "amdgpu.h"
>>>   #include "amdgpu_psp.h"
>>> @@ -269,10 +270,8 @@ static int psp_v11_0_bootloader_load_kdb(struct 
>>> psp_context *psp)
>>>       if (ret)
>>>           return ret;
>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>> -
>>>       /* Copy PSP KDB binary to memory */
>>> -    memcpy(psp->fw_pri_buf, psp->kdb_start_addr, psp->kdb_bin_size);
>>> +    psp_copy_fw(psp, psp->kdb_start_addr, psp->kdb_bin_size);
>>>       /* Provide the PSP KDB to bootloader */
>>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>> @@ -302,10 +301,8 @@ static int psp_v11_0_bootloader_load_spl(struct 
>>> psp_context *psp)
>>>       if (ret)
>>>           return ret;
>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>> -
>>>       /* Copy PSP SPL binary to memory */
>>> -    memcpy(psp->fw_pri_buf, psp->spl_start_addr, psp->spl_bin_size);
>>> +    psp_copy_fw(psp, psp->spl_start_addr, psp->spl_bin_size);
>>>       /* Provide the PSP SPL to bootloader */
>>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>> @@ -335,10 +332,8 @@ static int 
>>> psp_v11_0_bootloader_load_sysdrv(struct psp_context *psp)
>>>       if (ret)
>>>           return ret;
>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>> -
>>>       /* Copy PSP System Driver binary to memory */
>>> -    memcpy(psp->fw_pri_buf, psp->sys_start_addr, psp->sys_bin_size);
>>> +    psp_copy_fw(psp, psp->sys_start_addr, psp->sys_bin_size);
>>>       /* Provide the sys driver to bootloader */
>>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>> @@ -371,10 +366,8 @@ static int psp_v11_0_bootloader_load_sos(struct 
>>> psp_context *psp)
>>>       if (ret)
>>>           return ret;
>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>> -
>>>       /* Copy Secure OS binary to PSP memory */
>>> -    memcpy(psp->fw_pri_buf, psp->sos_start_addr, psp->sos_bin_size);
>>> +    psp_copy_fw(psp, psp->sos_start_addr, psp->sos_bin_size);
>>>       /* Provide the PSP secure OS to bootloader */
>>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>> @@ -608,7 +601,7 @@ static int psp_v11_0_memory_training(struct 
>>> psp_context *psp, uint32_t ops)
>>>       uint32_t p2c_header[4];
>>>       uint32_t sz;
>>>       void *buf;
>>> -    int ret;
>>> +    int ret, idx;
>>>       if (ctx->init == PSP_MEM_TRAIN_NOT_SUPPORT) {
>>>           DRM_DEBUG("Memory training is not supported.\n");
>>> @@ -681,17 +674,24 @@ static int psp_v11_0_memory_training(struct 
>>> psp_context *psp, uint32_t ops)
>>>               return -ENOMEM;
>>>           }
>>> -        memcpy_fromio(buf, adev->mman.aper_base_kaddr, sz);
>>> -        ret = psp_v11_0_memory_training_send_msg(psp, 
>>> PSP_BL__DRAM_LONG_TRAIN);
>>> -        if (ret) {
>>> -            DRM_ERROR("Send long training msg failed.\n");
>>> +        if (drm_dev_enter(&adev->ddev, &idx)) {
>>> +            memcpy_fromio(buf, adev->mman.aper_base_kaddr, sz);
>>> +            ret = psp_v11_0_memory_training_send_msg(psp, 
>>> PSP_BL__DRAM_LONG_TRAIN);
>>> +            if (ret) {
>>> +                DRM_ERROR("Send long training msg failed.\n");
>>> +                vfree(buf);
>>> +                drm_dev_exit(idx);
>>> +                return ret;
>>> +            }
>>> +
>>> +            memcpy_toio(adev->mman.aper_base_kaddr, buf, sz);
>>> +            adev->hdp.funcs->flush_hdp(adev, NULL);
>>>               vfree(buf);
>>> -            return ret;
>>> +            drm_dev_exit(idx);
>>> +        } else {
>>> +            vfree(buf);
>>> +            return -ENODEV;
>>>           }
>>> -
>>> -        memcpy_toio(adev->mman.aper_base_kaddr, buf, sz);
>>> -        adev->hdp.funcs->flush_hdp(adev, NULL);
>>> -        vfree(buf);
>>>       }
>>>       if (ops & PSP_MEM_TRAIN_SAVE) {
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c 
>>> b/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
>>> index c4828bd3264b..618e5b6b85d9 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
>>> @@ -138,10 +138,8 @@ static int 
>>> psp_v12_0_bootloader_load_sysdrv(struct psp_context *psp)
>>>       if (ret)
>>>           return ret;
>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>> -
>>>       /* Copy PSP System Driver binary to memory */
>>> -    memcpy(psp->fw_pri_buf, psp->sys_start_addr, psp->sys_bin_size);
>>> +    psp_copy_fw(psp, psp->sys_start_addr, psp->sys_bin_size);
>>>       /* Provide the sys driver to bootloader */
>>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>> @@ -179,10 +177,8 @@ static int psp_v12_0_bootloader_load_sos(struct 
>>> psp_context *psp)
>>>       if (ret)
>>>           return ret;
>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>> -
>>>       /* Copy Secure OS binary to PSP memory */
>>> -    memcpy(psp->fw_pri_buf, psp->sos_start_addr, psp->sos_bin_size);
>>> +    psp_copy_fw(psp, psp->sos_start_addr, psp->sos_bin_size);
>>>       /* Provide the PSP secure OS to bootloader */
>>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c 
>>> b/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
>>> index f2e725f72d2f..d0a6cccd0897 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
>>> @@ -102,10 +102,8 @@ static int 
>>> psp_v3_1_bootloader_load_sysdrv(struct psp_context *psp)
>>>       if (ret)
>>>           return ret;
>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>> -
>>>       /* Copy PSP System Driver binary to memory */
>>> -    memcpy(psp->fw_pri_buf, psp->sys_start_addr, psp->sys_bin_size);
>>> +    psp_copy_fw(psp, psp->sys_start_addr, psp->sys_bin_size);
>>>       /* Provide the sys driver to bootloader */
>>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>> @@ -143,10 +141,8 @@ static int psp_v3_1_bootloader_load_sos(struct 
>>> psp_context *psp)
>>>       if (ret)
>>>           return ret;
>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>> -
>>>       /* Copy Secure OS binary to PSP memory */
>>> -    memcpy(psp->fw_pri_buf, psp->sos_start_addr, psp->sos_bin_size);
>>> +    psp_copy_fw(psp, psp->sos_start_addr, psp->sos_bin_size);
>>>       /* Provide the PSP secure OS to bootloader */
>>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c 
>>> b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
>>> index 8e238dea7bef..90910d19db12 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
>>> @@ -25,6 +25,7 @@
>>>    */
>>>   #include <linux/firmware.h>
>>> +#include <drm/drm_drv.h>
>>>   #include "amdgpu.h"
>>>   #include "amdgpu_vce.h"
>>> @@ -555,16 +556,19 @@ static int vce_v4_0_hw_fini(void *handle)
>>>   static int vce_v4_0_suspend(void *handle)
>>>   {
>>>       struct amdgpu_device *adev = (struct amdgpu_device *)handle;
>>> -    int r;
>>> +    int r, idx;
>>>       if (adev->vce.vcpu_bo == NULL)
>>>           return 0;
>>> -    if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) {
>>> -        unsigned size = amdgpu_bo_size(adev->vce.vcpu_bo);
>>> -        void *ptr = adev->vce.cpu_addr;
>>> +    if (drm_dev_enter(&adev->ddev, &idx)) {
>>> +        if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) {
>>> +            unsigned size = amdgpu_bo_size(adev->vce.vcpu_bo);
>>> +            void *ptr = adev->vce.cpu_addr;
>>> -        memcpy_fromio(adev->vce.saved_bo, ptr, size);
>>> +            memcpy_fromio(adev->vce.saved_bo, ptr, size);
>>> +        }
>>> +        drm_dev_exit(idx);
>>>       }
>>>       r = vce_v4_0_hw_fini(adev);
>>> @@ -577,16 +581,20 @@ static int vce_v4_0_suspend(void *handle)
>>>   static int vce_v4_0_resume(void *handle)
>>>   {
>>>       struct amdgpu_device *adev = (struct amdgpu_device *)handle;
>>> -    int r;
>>> +    int r, idx;
>>>       if (adev->vce.vcpu_bo == NULL)
>>>           return -EINVAL;
>>>       if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) {
>>> -        unsigned size = amdgpu_bo_size(adev->vce.vcpu_bo);
>>> -        void *ptr = adev->vce.cpu_addr;
>>> -        memcpy_toio(ptr, adev->vce.saved_bo, size);
>>> +        if (drm_dev_enter(&adev->ddev, &idx)) {
>>> +            unsigned size = amdgpu_bo_size(adev->vce.vcpu_bo);
>>> +            void *ptr = adev->vce.cpu_addr;
>>> +
>>> +            memcpy_toio(ptr, adev->vce.saved_bo, size);
>>> +            drm_dev_exit(idx);
>>> +        }
>>>       } else {
>>>           r = amdgpu_vce_resume(adev);
>>>           if (r)
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c 
>>> b/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
>>> index 3f15bf34123a..df34be8ec82d 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
>>> @@ -34,6 +34,8 @@
>>>   #include "vcn/vcn_3_0_0_sh_mask.h"
>>>   #include "ivsrcid/vcn/irqsrcs_vcn_2_0.h"
>>> +#include <drm/drm_drv.h>
>>> +
>>>   #define mmUVD_CONTEXT_ID_INTERNAL_OFFSET            0x27
>>>   #define mmUVD_GPCOM_VCPU_CMD_INTERNAL_OFFSET            0x0f
>>>   #define mmUVD_GPCOM_VCPU_DATA0_INTERNAL_OFFSET            0x10
>>> @@ -268,16 +270,20 @@ static int vcn_v3_0_sw_init(void *handle)
>>>   static int vcn_v3_0_sw_fini(void *handle)
>>>   {
>>>       struct amdgpu_device *adev = (struct amdgpu_device *)handle;
>>> -    int i, r;
>>> +    int i, r, idx;
>>> -    for (i = 0; i < adev->vcn.num_vcn_inst; i++) {
>>> -        volatile struct amdgpu_fw_shared *fw_shared;
>>> +    if (drm_dev_enter(&adev->ddev, &idx)) {
>>> +        for (i = 0; i < adev->vcn.num_vcn_inst; i++) {
>>> +            volatile struct amdgpu_fw_shared *fw_shared;
>>> -        if (adev->vcn.harvest_config & (1 << i))
>>> -            continue;
>>> -        fw_shared = adev->vcn.inst[i].fw_shared_cpu_addr;
>>> -        fw_shared->present_flag_0 = 0;
>>> -        fw_shared->sw_ring.is_enabled = false;
>>> +            if (adev->vcn.harvest_config & (1 << i))
>>> +                continue;
>>> +            fw_shared = adev->vcn.inst[i].fw_shared_cpu_addr;
>>> +            fw_shared->present_flag_0 = 0;
>>> +            fw_shared->sw_ring.is_enabled = false;
>>> +        }
>>> +
>>> +        drm_dev_exit(idx);
>>>       }
>>>       if (amdgpu_sriov_vf(adev))
>>> diff --git a/drivers/gpu/drm/amd/pm/powerplay/smumgr/smu7_smumgr.c 
>>> b/drivers/gpu/drm/amd/pm/powerplay/smumgr/smu7_smumgr.c
>>> index aae25243eb10..d628b91846c9 100644
>>> --- a/drivers/gpu/drm/amd/pm/powerplay/smumgr/smu7_smumgr.c
>>> +++ b/drivers/gpu/drm/amd/pm/powerplay/smumgr/smu7_smumgr.c
>>> @@ -405,6 +405,8 @@ int smu7_request_smu_load_fw(struct pp_hwmgr *hwmgr)
>>>                   UCODE_ID_MEC_STORAGE, 
>>> &toc->entry[toc->num_entries++]),
>>>                   "Failed to Get Firmware Entry.", r = -EINVAL; goto 
>>> failed);
>>>       }
>>> +
>>> +    /* AG TODO Can't call drm_dev_enter/exit because access 
>>> adev->ddev here ... */
>>>       memcpy_toio(smu_data->header_buffer.kaddr, smu_data->toc,
>>>               sizeof(struct SMU_DRAMData_TOC));
>>>       smum_send_msg_to_smc_with_parameter(hwmgr,
>>

* Re: [PATCH v6 10/16] drm/amdgpu: Guard against write accesses after device removal
  2021-05-12 14:01         ` Andrey Grodzovsky
  (?)
@ 2021-05-12 14:06           ` Christian König
  -1 siblings, 0 replies; 126+ messages in thread
From: Christian König @ 2021-05-12 14:06 UTC (permalink / raw)
  To: Andrey Grodzovsky, dri-devel, amd-gfx, linux-pci, daniel.vetter,
	Harry.Wentland
  Cc: ppaalanen, Alexander.Deucher, gregkh, helgaas, Felix.Kuehling

Am 12.05.21 um 16:01 schrieb Andrey Grodzovsky:
> Ping - I need confirmation that it's OK to keep this as a single patch, given
> my explanation below.

It was just a suggestion. The key point is that the approach sounds sane to me,
but I can't say much about the PSP code, for example.

So the most I can give you is an Acked-by for that.

Christian.

>
> Andrey
>
> On 2021-05-11 1:52 p.m., Andrey Grodzovsky wrote:
>>
>>
>> On 2021-05-11 2:50 a.m., Christian König wrote:
>>> Am 10.05.21 um 18:36 schrieb Andrey Grodzovsky:
>>>> This should prevent writing to memory or IO ranges possibly
>>>> already allocated for other uses after our device is removed.
>>>>
>>>> v5:
>>>> Protect more places where memcpy_toio/memcpy_fromio takes place
>>>> Protect IB submissions
>>>>
>>>> v6: Switch to !drm_dev_enter instead of scoping entire code
>>>> with brackets.
>>>>
>>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>>>> ---
>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c    | 11 ++-
>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c       |  9 +++
>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c        | 17 +++--
>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c       | 63 +++++++++++------
>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h       |  2 +
>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c      | 70 
>>>> +++++++++++++++++++
>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h      | 49 ++-----------
>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c       | 31 +++++---
>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c       | 11 ++-
>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c       | 22 ++++--
>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c        |  7 +-
>>>>   drivers/gpu/drm/amd/amdgpu/psp_v11_0.c        | 44 ++++++------
>>>>   drivers/gpu/drm/amd/amdgpu/psp_v12_0.c        |  8 +--
>>>>   drivers/gpu/drm/amd/amdgpu/psp_v3_1.c         |  8 +--
>>>>   drivers/gpu/drm/amd/amdgpu/vce_v4_0.c         | 26 ++++---
>>>>   drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c         | 22 +++---
>>>>   .../drm/amd/pm/powerplay/smumgr/smu7_smumgr.c |  2 +
>>>>   17 files changed, 257 insertions(+), 145 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>> index a0bff4713672..94c415176cdc 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>> @@ -71,6 +71,8 @@
>>>>   #include <drm/task_barrier.h>
>>>>   #include <linux/pm_runtime.h>
>>>> +#include <drm/drm_drv.h>
>>>> +
>>>>   MODULE_FIRMWARE("amdgpu/vega10_gpu_info.bin");
>>>>   MODULE_FIRMWARE("amdgpu/vega12_gpu_info.bin");
>>>>   MODULE_FIRMWARE("amdgpu/raven_gpu_info.bin");
>>>> @@ -281,7 +283,10 @@ void amdgpu_device_vram_access(struct 
>>>> amdgpu_device *adev, loff_t pos,
>>>>       unsigned long flags;
>>>>       uint32_t hi = ~0;
>>>>       uint64_t last;
>>>> +    int idx;
>>>> +    if (!drm_dev_enter(&adev->ddev, &idx))
>>>> +        return;
>>>>   #ifdef CONFIG_64BIT
>>>>       last = min(pos + size, adev->gmc.visible_vram_size);
>>>> @@ -299,8 +304,10 @@ void amdgpu_device_vram_access(struct 
>>>> amdgpu_device *adev, loff_t pos,
>>>>               memcpy_fromio(buf, addr, count);
>>>>           }
>>>> -        if (count == size)
>>>> +        if (count == size) {
>>>> +            drm_dev_exit(idx);
>>>>               return;
>>>> +        }
>>>
>>> Maybe use a goto instead, but really just a nit pick.
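The goto variant suggested here can be sketched as a userspace C mock rather than kernel code. Everything named `mock_*` is a hypothetical stand-in: `mock_dev_enter()` and `mock_dev_exit()` only imitate the pairing of drm_dev_enter() and drm_dev_exit() (the real drm_dev_exit() takes just the index), and `mock_vram_access()` is not amdgpu_device_vram_access(). Only the control flow is the point: every path that passed the enter check leaves through one label.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the DRM device; not the kernel structure. */
struct mock_dev {
    bool unplugged;   /* set once the device has been removed */
    int  active;      /* enter calls not yet matched by an exit */
};

/* Mock of drm_dev_enter(): fails once the device is gone. */
static bool mock_dev_enter(struct mock_dev *dev, int *idx)
{
    if (dev->unplugged)
        return false;
    *idx = dev->active++;
    return true;
}

/* Mock of drm_dev_exit(): closes the section opened by mock_dev_enter(). */
static void mock_dev_exit(struct mock_dev *dev, int idx)
{
    (void)idx;
    assert(dev->active > 0);
    dev->active--;
}

/*
 * Copy `size` bytes; at most `visible` bytes go through the fast path.
 * Returns the number of bytes copied. The early finish and the slow path
 * both leave through the `out` label, so the exit call is never skipped.
 */
static size_t mock_vram_access(struct mock_dev *dev, char *dst,
                               const char *src, size_t size, size_t visible)
{
    size_t count, done = 0;
    int idx;

    if (!mock_dev_enter(dev, &idx))
        return 0;                    /* device gone: touch nothing */

    count = size < visible ? size : visible;
    memcpy(dst, src, count);         /* fast path */
    done = count;
    if (count == size)
        goto out;                    /* early finish, still exits below */

    memcpy(dst + count, src + count, size - count);   /* slow path */
    done = size;
out:
    mock_dev_exit(dev, idx);
    return done;
}
```

With a single exit label, adding another early return later only needs a `goto out`, not a new brace block with its own exit call.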
>>>
>>>
>>>
>>>>           pos += count;
>>>>           buf += count / 4;
>>>> @@ -323,6 +330,8 @@ void amdgpu_device_vram_access(struct 
>>>> amdgpu_device *adev, loff_t pos,
>>>>               *buf++ = RREG32_NO_KIQ(mmMM_DATA);
>>>>       }
>>>>       spin_unlock_irqrestore(&adev->mmio_idx_lock, flags);
>>>> +
>>>> +    drm_dev_exit(idx);
>>>>   }
>>>>   /*
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c 
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
>>>> index 4d32233cde92..04ba5eef1e88 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
>>>> @@ -31,6 +31,8 @@
>>>>   #include "amdgpu_ras.h"
>>>>   #include "amdgpu_xgmi.h"
>>>> +#include <drm/drm_drv.h>
>>>> +
>>>>   /**
>>>>    * amdgpu_gmc_pdb0_alloc - allocate vram for pdb0
>>>>    *
>>>> @@ -151,6 +153,10 @@ int amdgpu_gmc_set_pte_pde(struct 
>>>> amdgpu_device *adev, void *cpu_pt_addr,
>>>>   {
>>>>       void __iomem *ptr = (void *)cpu_pt_addr;
>>>>       uint64_t value;
>>>> +    int idx;
>>>> +
>>>> +    if (!drm_dev_enter(&adev->ddev, &idx))
>>>> +        return 0;
>>>>       /*
>>>>        * The following is for PTE only. GART does not have PDEs.
>>>> @@ -158,6 +164,9 @@ int amdgpu_gmc_set_pte_pde(struct amdgpu_device 
>>>> *adev, void *cpu_pt_addr,
>>>>       value = addr & 0x0000FFFFFFFFF000ULL;
>>>>       value |= flags;
>>>>       writeq(value, ptr + (gpu_page_idx * 8));
>>>> +
>>>> +    drm_dev_exit(idx);
>>>> +
>>>>       return 0;
>>>>   }
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c 
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
>>>> index 148a3b481b12..62fcbd446c71 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
>>>> @@ -30,6 +30,7 @@
>>>>   #include <linux/slab.h>
>>>>   #include <drm/amdgpu_drm.h>
>>>> +#include <drm/drm_drv.h>
>>>>   #include "amdgpu.h"
>>>>   #include "atom.h"
>>>> @@ -137,7 +138,7 @@ int amdgpu_ib_schedule(struct amdgpu_ring 
>>>> *ring, unsigned num_ibs,
>>>>       bool secure;
>>>>       unsigned i;
>>>> -    int r = 0;
>>>> +    int idx, r = 0;
>>>>       bool need_pipe_sync = false;
>>>>       if (num_ibs == 0)
>>>> @@ -169,13 +170,16 @@ int amdgpu_ib_schedule(struct amdgpu_ring 
>>>> *ring, unsigned num_ibs,
>>>>           return -EINVAL;
>>>>       }
>>>> +    if (!drm_dev_enter(&adev->ddev, &idx))
>>>> +        return -ENODEV;
>>>> +
>>>>       alloc_size = ring->funcs->emit_frame_size + num_ibs *
>>>>           ring->funcs->emit_ib_size;
>>>>       r = amdgpu_ring_alloc(ring, alloc_size);
>>>>       if (r) {
>>>>           dev_err(adev->dev, "scheduling IB failed (%d).\n", r);
>>>> -        return r;
>>>> +        goto exit;
>>>>       }
>>>>       need_ctx_switch = ring->current_ctx != fence_ctx;
>>>> @@ -205,7 +209,7 @@ int amdgpu_ib_schedule(struct amdgpu_ring 
>>>> *ring, unsigned num_ibs,
>>>>           r = amdgpu_vm_flush(ring, job, need_pipe_sync);
>>>>           if (r) {
>>>>               amdgpu_ring_undo(ring);
>>>> -            return r;
>>>> +            goto exit;
>>>>           }
>>>>       }
>>>> @@ -286,7 +290,7 @@ int amdgpu_ib_schedule(struct amdgpu_ring 
>>>> *ring, unsigned num_ibs,
>>>>           if (job && job->vmid)
>>>>               amdgpu_vmid_reset(adev, ring->funcs->vmhub, job->vmid);
>>>>           amdgpu_ring_undo(ring);
>>>> -        return r;
>>>> +        goto exit;
>>>>       }
>>>>       if (ring->funcs->insert_end)
>>>> @@ -304,7 +308,10 @@ int amdgpu_ib_schedule(struct amdgpu_ring 
>>>> *ring, unsigned num_ibs,
>>>>           ring->funcs->emit_wave_limit(ring, false);
>>>>       amdgpu_ring_commit(ring);
>>>> -    return 0;
>>>> +
>>>> +exit:
>>>> +    drm_dev_exit(idx);
>>>> +    return r;
>>>>   }
>>>>   /**
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c 
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
>>>> index 9e769cf6095b..bb6afee61666 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
>>>> @@ -25,6 +25,7 @@
>>>>   #include <linux/firmware.h>
>>>>   #include <linux/dma-mapping.h>
>>>> +#include <drm/drm_drv.h>
>>>>   #include "amdgpu.h"
>>>>   #include "amdgpu_psp.h"
>>>> @@ -39,6 +40,8 @@
>>>>   #include "amdgpu_ras.h"
>>>>   #include "amdgpu_securedisplay.h"
>>>> +#include <drm/drm_drv.h>
>>>> +
>>>>   static int psp_sysfs_init(struct amdgpu_device *adev);
>>>>   static void psp_sysfs_fini(struct amdgpu_device *adev);
>>>> @@ -253,7 +256,7 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>>>              struct psp_gfx_cmd_resp *cmd, uint64_t fence_mc_addr)
>>>>   {
>>>>       int ret;
>>>> -    int index;
>>>> +    int index, idx;
>>>>       int timeout = 20000;
>>>>       bool ras_intr = false;
>>>>       bool skip_unsupport = false;
>>>> @@ -261,6 +264,9 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>>>       if (psp->adev->in_pci_err_recovery)
>>>>           return 0;
>>>> +    if (!drm_dev_enter(&psp->adev->ddev, &idx))
>>>> +        return 0;
>>>> +
>>>>       mutex_lock(&psp->mutex);
>>>>       memset(psp->cmd_buf_mem, 0, PSP_CMD_BUFFER_SIZE);
>>>> @@ -271,8 +277,7 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>>>       ret = psp_ring_cmd_submit(psp, psp->cmd_buf_mc_addr, 
>>>> fence_mc_addr, index);
>>>>       if (ret) {
>>>>           atomic_dec(&psp->fence_value);
>>>> -        mutex_unlock(&psp->mutex);
>>>> -        return ret;
>>>> +        goto exit;
>>>>       }
>>>>       amdgpu_asic_invalidate_hdp(psp->adev, NULL);
>>>> @@ -312,8 +317,8 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>>>                psp->cmd_buf_mem->cmd_id,
>>>>                psp->cmd_buf_mem->resp.status);
>>>>           if (!timeout) {
>>>> -            mutex_unlock(&psp->mutex);
>>>> -            return -EINVAL;
>>>> +            ret = -EINVAL;
>>>> +            goto exit;
>>>>           }
>>>>       }
>>>> @@ -321,8 +326,10 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>>>           ucode->tmr_mc_addr_lo = psp->cmd_buf_mem->resp.fw_addr_lo;
>>>>           ucode->tmr_mc_addr_hi = psp->cmd_buf_mem->resp.fw_addr_hi;
>>>>       }
>>>> -    mutex_unlock(&psp->mutex);
>>>> +exit:
>>>> +    mutex_unlock(&psp->mutex);
>>>> +    drm_dev_exit(idx);
>>>>       return ret;
>>>>   }
>>>> @@ -359,8 +366,7 @@ static int psp_load_toc(struct psp_context *psp,
>>>>       if (!cmd)
>>>>           return -ENOMEM;
>>>>       /* Copy toc to psp firmware private buffer */
>>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>> -    memcpy(psp->fw_pri_buf, psp->toc_start_addr, psp->toc_bin_size);
>>>> +    psp_copy_fw(psp, psp->toc_start_addr, psp->toc_bin_size);
>>>>       psp_prep_load_toc_cmd_buf(cmd, psp->fw_pri_mc_addr, 
>>>> psp->toc_bin_size);
>>>> @@ -625,8 +631,7 @@ static int psp_asd_load(struct psp_context *psp)
>>>>       if (!cmd)
>>>>           return -ENOMEM;
>>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>> -    memcpy(psp->fw_pri_buf, psp->asd_start_addr, 
>>>> psp->asd_ucode_size);
>>>> +    psp_copy_fw(psp, psp->asd_start_addr, psp->asd_ucode_size);
>>>>       psp_prep_asd_load_cmd_buf(cmd, psp->fw_pri_mc_addr,
>>>>                     psp->asd_ucode_size);
>>>> @@ -781,8 +786,7 @@ static int psp_xgmi_load(struct psp_context *psp)
>>>>       if (!cmd)
>>>>           return -ENOMEM;
>>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>> -    memcpy(psp->fw_pri_buf, psp->ta_xgmi_start_addr, 
>>>> psp->ta_xgmi_ucode_size);
>>>> +    psp_copy_fw(psp, psp->ta_xgmi_start_addr, 
>>>> psp->ta_xgmi_ucode_size);
>>>>       psp_prep_ta_load_cmd_buf(cmd,
>>>>                    psp->fw_pri_mc_addr,
>>>> @@ -1038,8 +1042,7 @@ static int psp_ras_load(struct psp_context *psp)
>>>>       if (!cmd)
>>>>           return -ENOMEM;
>>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>> -    memcpy(psp->fw_pri_buf, psp->ta_ras_start_addr, 
>>>> psp->ta_ras_ucode_size);
>>>> +    psp_copy_fw(psp, psp->ta_ras_start_addr, psp->ta_ras_ucode_size);
>>>>       psp_prep_ta_load_cmd_buf(cmd,
>>>>                    psp->fw_pri_mc_addr,
>>>> @@ -1275,8 +1278,7 @@ static int psp_hdcp_load(struct psp_context 
>>>> *psp)
>>>>       if (!cmd)
>>>>           return -ENOMEM;
>>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>> -    memcpy(psp->fw_pri_buf, psp->ta_hdcp_start_addr,
>>>> +    psp_copy_fw(psp, psp->ta_hdcp_start_addr,
>>>>              psp->ta_hdcp_ucode_size);
>>>>       psp_prep_ta_load_cmd_buf(cmd,
>>>> @@ -1427,8 +1429,7 @@ static int psp_dtm_load(struct psp_context *psp)
>>>>       if (!cmd)
>>>>           return -ENOMEM;
>>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>> -    memcpy(psp->fw_pri_buf, psp->ta_dtm_start_addr, 
>>>> psp->ta_dtm_ucode_size);
>>>> +    psp_copy_fw(psp, psp->ta_dtm_start_addr, psp->ta_dtm_ucode_size);
>>>>       psp_prep_ta_load_cmd_buf(cmd,
>>>>                    psp->fw_pri_mc_addr,
>>>> @@ -1573,8 +1574,7 @@ static int psp_rap_load(struct psp_context *psp)
>>>>       if (!cmd)
>>>>           return -ENOMEM;
>>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>> -    memcpy(psp->fw_pri_buf, psp->ta_rap_start_addr, 
>>>> psp->ta_rap_ucode_size);
>>>> +    psp_copy_fw(psp, psp->ta_rap_start_addr, psp->ta_rap_ucode_size);
>>>>       psp_prep_ta_load_cmd_buf(cmd,
>>>>                    psp->fw_pri_mc_addr,
>>>> @@ -3022,7 +3022,7 @@ static ssize_t 
>>>> psp_usbc_pd_fw_sysfs_write(struct device *dev,
>>>>       struct amdgpu_device *adev = drm_to_adev(ddev);
>>>>       void *cpu_addr;
>>>>       dma_addr_t dma_addr;
>>>> -    int ret;
>>>> +    int ret, idx;
>>>>       char fw_name[100];
>>>>       const struct firmware *usbc_pd_fw;
>>>> @@ -3031,6 +3031,9 @@ static ssize_t 
>>>> psp_usbc_pd_fw_sysfs_write(struct device *dev,
>>>>           return -EBUSY;
>>>>       }
>>>> +    if (!drm_dev_enter(ddev, &idx))
>>>> +        return -ENODEV;
>>>> +
>>>>       snprintf(fw_name, sizeof(fw_name), "amdgpu/%s", buf);
>>>>       ret = request_firmware(&usbc_pd_fw, fw_name, adev->dev);
>>>>       if (ret)
>>>> @@ -3062,16 +3065,30 @@ static ssize_t 
>>>> psp_usbc_pd_fw_sysfs_write(struct device *dev,
>>>>   rel_buf:
>>>>       dma_free_coherent(adev->dev, usbc_pd_fw->size, cpu_addr, 
>>>> dma_addr);
>>>>       release_firmware(usbc_pd_fw);
>>>> -
>>>>   fail:
>>>>       if (ret) {
>>>>           DRM_ERROR("Failed to load USBC PD FW, err = %d", ret);
>>>> -        return ret;
>>>> +        count = ret;
>>>>       }
>>>> +    drm_dev_exit(idx);
>>>>       return count;
>>>>   }
>>>> +void psp_copy_fw(struct psp_context *psp, uint8_t *start_addr, 
>>>> uint32_t bin_size)
>>>> +{
>>>> +    int idx;
>>>> +
>>>> +    if (!drm_dev_enter(&psp->adev->ddev, &idx))
>>>> +        return;
>>>> +
>>>> +    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>> +    memcpy(psp->fw_pri_buf, start_addr, bin_size);
>>>> +
>>>> +    drm_dev_exit(idx);
>>>> +}
>>>> +
>>>> +
>>>>   static DEVICE_ATTR(usbc_pd_fw, S_IRUGO | S_IWUSR,
>>>>              psp_usbc_pd_fw_sysfs_read,
>>>>              psp_usbc_pd_fw_sysfs_write);
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h 
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
>>>> index 46a5328e00e0..2bfdc278817f 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
>>>> @@ -423,4 +423,6 @@ int psp_get_fw_attestation_records_addr(struct 
>>>> psp_context *psp,
>>>>   int psp_load_fw_list(struct psp_context *psp,
>>>>                struct amdgpu_firmware_info **ucode_list, int 
>>>> ucode_count);
>>>> +void psp_copy_fw(struct psp_context *psp, uint8_t *start_addr, 
>>>> uint32_t bin_size);
>>>> +
>>>>   #endif
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c 
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>>>> index 688624ebe421..e1985bc34436 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>>>> @@ -35,6 +35,8 @@
>>>>   #include "amdgpu.h"
>>>>   #include "atom.h"
>>>> +#include <drm/drm_drv.h>
>>>> +
>>>>   /*
>>>>    * Rings
>>>>    * Most engines on the GPU are fed via ring buffers.  Ring
>>>> @@ -461,3 +463,71 @@ int amdgpu_ring_test_helper(struct amdgpu_ring 
>>>> *ring)
>>>>       ring->sched.ready = !r;
>>>>       return r;
>>>>   }
>>>> +
>>>> +void amdgpu_ring_clear_ring(struct amdgpu_ring *ring)
>>>> +{
>>>> +    int idx;
>>>> +    int i = 0;
>>>> +
>>>> +    if (!drm_dev_enter(&ring->adev->ddev, &idx))
>>>> +        return;
>>>> +
>>>> +    while (i <= ring->buf_mask)
>>>> +        ring->ring[i++] = ring->funcs->nop;
>>>> +
>>>> +    drm_dev_exit(idx);
>>>> +
>>>> +}
>>>> +
>>>> +void amdgpu_ring_write(struct amdgpu_ring *ring, uint32_t v)
>>>> +{
>>>> +    int idx;
>>>> +
>>>> +    if (!drm_dev_enter(&ring->adev->ddev, &idx))
>>>> +        return;
>>>> +
>>>> +    if (ring->count_dw <= 0)
>>>> +        DRM_ERROR("amdgpu: writing more dwords to the ring than 
>>>> expected!\n");
>>>> +    ring->ring[ring->wptr++ & ring->buf_mask] = v;
>>>> +    ring->wptr &= ring->ptr_mask;
>>>> +    ring->count_dw--;
>>>> +
>>>> +    drm_dev_exit(idx);
>>>> +}
>>>> +
>>>> +void amdgpu_ring_write_multiple(struct amdgpu_ring *ring,
>>>> +                          void *src, int count_dw)
>>>> +{
>>>> +    unsigned occupied, chunk1, chunk2;
>>>> +    void *dst;
>>>> +    int idx;
>>>> +
>>>> +    if (!drm_dev_enter(&ring->adev->ddev, &idx))
>>>> +        return;
>>>> +
>>>> +    if (unlikely(ring->count_dw < count_dw))
>>>> +        DRM_ERROR("amdgpu: writing more dwords to the ring than 
>>>> expected!\n");
>>>> +
>>>> +    occupied = ring->wptr & ring->buf_mask;
>>>> +    dst = (void *)&ring->ring[occupied];
>>>> +    chunk1 = ring->buf_mask + 1 - occupied;
>>>> +    chunk1 = (chunk1 >= count_dw) ? count_dw: chunk1;
>>>> +    chunk2 = count_dw - chunk1;
>>>> +    chunk1 <<= 2;
>>>> +    chunk2 <<= 2;
>>>> +
>>>> +    if (chunk1)
>>>> +        memcpy(dst, src, chunk1);
>>>> +
>>>> +    if (chunk2) {
>>>> +        src += chunk1;
>>>> +        dst = (void *)ring->ring;
>>>> +        memcpy(dst, src, chunk2);
>>>> +    }
>>>> +
>>>> +    ring->wptr += count_dw;
>>>> +    ring->wptr &= ring->ptr_mask;
>>>> +    ring->count_dw -= count_dw;
>>>> +
>>>> +    drm_dev_exit(idx);
>>>> +}
>>>
>>> The ring should never be in MMIO memory, so you can completely drop 
>>> that as far as I can see.
>>
>> Yeah, it's all in GART; I missed it for some reason...
>>>
>>> Maybe split that patch by use case so that we can more easily 
>>> review/ack it.
>>
>> In fact everything here is the same use case: once I added the unmap of
>> all MMIO ranges (both registers and VRAM) I got a lot of page faults
>> on device removal around any memcpy to/from IO. That's where I put the
>> drm_dev_enter/exit scope. I also searched the code and preemptively
>> added guards to any other such place. I did drop amdgpu_ib_schedule
>> from this patch because it has a dma_fence_wait inside, so we
>> will take care of it once we decide how to handle dma_fence waits.
>>
>> Andrey
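The guard described above, skipping a memcpy to IO once the device is gone, can be illustrated with a small userspace model. This is only a sketch under stated assumptions: `struct fake_gpu`, `fake_dev_enter()`, `fake_dev_exit()` and `fake_restore_fw()` are hypothetical names, the plain memcpy() stands in for memcpy_toio(), and the -1 return value is an arbitrary choice for the model.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FAKE_BAR_SIZE 16

/* Hypothetical model of a device with one MMIO window; not amdgpu code. */
struct fake_gpu {
    bool    present;                 /* false after hot unplug */
    uint8_t bar[FAKE_BAR_SIZE];      /* stands in for the mapped MMIO range */
    int     io_writes;               /* copies that actually reached the window */
};

/* Stand-in for drm_dev_enter(): succeeds only while the device is present. */
static bool fake_dev_enter(const struct fake_gpu *gpu)
{
    return gpu->present;
}

/* Stand-in for drm_dev_exit(): nothing to release in this model. */
static void fake_dev_exit(const struct fake_gpu *gpu)
{
    (void)gpu;
}

/*
 * Guarded equivalent of a memcpy_toio() call site. Returns 0 on success and
 * -1 if the device is gone, in which case the window is left untouched.
 */
static int fake_restore_fw(struct fake_gpu *gpu, const uint8_t *saved,
                           size_t size)
{
    assert(size <= FAKE_BAR_SIZE);

    if (!fake_dev_enter(gpu))
        return -1;                   /* unplugged: skip the IO copy */

    memcpy(gpu->bar, saved, size);   /* memcpy_toio() in a real driver */
    gpu->io_writes++;

    fake_dev_exit(gpu);
    return 0;
}
```

After `present` is cleared, a call to `fake_restore_fw()` leaves both `bar` and `io_writes` unchanged, which is the behaviour the guards are meant to give once the MMIO ranges have been unmapped.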
>>
>>>
>>> Thanks,
>>> Christian.
>>>
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h 
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>>>> index e7d3d0dbdd96..c67bc6d3d039 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>>>> @@ -299,53 +299,12 @@ static inline void 
>>>> amdgpu_ring_set_preempt_cond_exec(struct amdgpu_ring *ring,
>>>>       *ring->cond_exe_cpu_addr = cond_exec;
>>>>   }
>>>> -static inline void amdgpu_ring_clear_ring(struct amdgpu_ring *ring)
>>>> -{
>>>> -    int i = 0;
>>>> -    while (i <= ring->buf_mask)
>>>> -        ring->ring[i++] = ring->funcs->nop;
>>>> -
>>>> -}
>>>> -
>>>> -static inline void amdgpu_ring_write(struct amdgpu_ring *ring, 
>>>> uint32_t v)
>>>> -{
>>>> -    if (ring->count_dw <= 0)
>>>> -        DRM_ERROR("amdgpu: writing more dwords to the ring than 
>>>> expected!\n");
>>>> -    ring->ring[ring->wptr++ & ring->buf_mask] = v;
>>>> -    ring->wptr &= ring->ptr_mask;
>>>> -    ring->count_dw--;
>>>> -}
>>>> +void amdgpu_ring_clear_ring(struct amdgpu_ring *ring);
>>>> -static inline void amdgpu_ring_write_multiple(struct amdgpu_ring 
>>>> *ring,
>>>> -                          void *src, int count_dw)
>>>> -{
>>>> -    unsigned occupied, chunk1, chunk2;
>>>> -    void *dst;
>>>> -
>>>> -    if (unlikely(ring->count_dw < count_dw))
>>>> -        DRM_ERROR("amdgpu: writing more dwords to the ring than 
>>>> expected!\n");
>>>> -
>>>> -    occupied = ring->wptr & ring->buf_mask;
>>>> -    dst = (void *)&ring->ring[occupied];
>>>> -    chunk1 = ring->buf_mask + 1 - occupied;
>>>> -    chunk1 = (chunk1 >= count_dw) ? count_dw: chunk1;
>>>> -    chunk2 = count_dw - chunk1;
>>>> -    chunk1 <<= 2;
>>>> -    chunk2 <<= 2;
>>>> +void amdgpu_ring_write(struct amdgpu_ring *ring, uint32_t v);
>>>> -    if (chunk1)
>>>> -        memcpy(dst, src, chunk1);
>>>> -
>>>> -    if (chunk2) {
>>>> -        src += chunk1;
>>>> -        dst = (void *)ring->ring;
>>>> -        memcpy(dst, src, chunk2);
>>>> -    }
>>>> -
>>>> -    ring->wptr += count_dw;
>>>> -    ring->wptr &= ring->ptr_mask;
>>>> -    ring->count_dw -= count_dw;
>>>> -}
>>>> +void amdgpu_ring_write_multiple(struct amdgpu_ring *ring,
>>>> +                          void *src, int count_dw);
>>>>   int amdgpu_ring_test_helper(struct amdgpu_ring *ring);
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c 
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
>>>> index c6dbc0801604..82f0542c7792 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
>>>> @@ -32,6 +32,7 @@
>>>>   #include <linux/module.h>
>>>>   #include <drm/drm.h>
>>>> +#include <drm/drm_drv.h>
>>>>   #include "amdgpu.h"
>>>>   #include "amdgpu_pm.h"
>>>> @@ -375,7 +376,7 @@ int amdgpu_uvd_suspend(struct amdgpu_device *adev)
>>>>   {
>>>>       unsigned size;
>>>>       void *ptr;
>>>> -    int i, j;
>>>> +    int i, j, idx;
>>>>       bool in_ras_intr = amdgpu_ras_intr_triggered();
>>>>       cancel_delayed_work_sync(&adev->uvd.idle_work);
>>>> @@ -403,11 +404,15 @@ int amdgpu_uvd_suspend(struct amdgpu_device 
>>>> *adev)
>>>>           if (!adev->uvd.inst[j].saved_bo)
>>>>               return -ENOMEM;
>>>> -        /* re-write 0 since err_event_athub will corrupt VCPU 
>>>> buffer */
>>>> -        if (in_ras_intr)
>>>> -            memset(adev->uvd.inst[j].saved_bo, 0, size);
>>>> -        else
>>>> -            memcpy_fromio(adev->uvd.inst[j].saved_bo, ptr, size);
>>>> +        if (drm_dev_enter(&adev->ddev, &idx)) {
>>>> +            /* re-write 0 since err_event_athub will corrupt VCPU 
>>>> buffer */
>>>> +            if (in_ras_intr)
>>>> +                memset(adev->uvd.inst[j].saved_bo, 0, size);
>>>> +            else
>>>> +                memcpy_fromio(adev->uvd.inst[j].saved_bo, ptr, size);
>>>> +
>>>> +            drm_dev_exit(idx);
>>>> +        }
>>>>       }
>>>>       if (in_ras_intr)
>>>> @@ -420,7 +425,7 @@ int amdgpu_uvd_resume(struct amdgpu_device *adev)
>>>>   {
>>>>       unsigned size;
>>>>       void *ptr;
>>>> -    int i;
>>>> +    int i, idx;
>>>>       for (i = 0; i < adev->uvd.num_uvd_inst; i++) {
>>>>           if (adev->uvd.harvest_config & (1 << i))
>>>> @@ -432,7 +437,10 @@ int amdgpu_uvd_resume(struct amdgpu_device *adev)
>>>>           ptr = adev->uvd.inst[i].cpu_addr;
>>>>           if (adev->uvd.inst[i].saved_bo != NULL) {
>>>> -            memcpy_toio(ptr, adev->uvd.inst[i].saved_bo, size);
>>>> +            if (drm_dev_enter(&adev->ddev, &idx)) {
>>>> +                memcpy_toio(ptr, adev->uvd.inst[i].saved_bo, size);
>>>> +                drm_dev_exit(idx);
>>>> +            }
>>>>               kvfree(adev->uvd.inst[i].saved_bo);
>>>>               adev->uvd.inst[i].saved_bo = NULL;
>>>>           } else {
>>>> @@ -442,8 +450,11 @@ int amdgpu_uvd_resume(struct amdgpu_device *adev)
>>>>               hdr = (const struct common_firmware_header 
>>>> *)adev->uvd.fw->data;
>>>>               if (adev->firmware.load_type != AMDGPU_FW_LOAD_PSP) {
>>>>                   offset = le32_to_cpu(hdr->ucode_array_offset_bytes);
>>>> -                memcpy_toio(adev->uvd.inst[i].cpu_addr, adev->uvd.fw->data + offset,
>>>> -                            le32_to_cpu(hdr->ucode_size_bytes));
>>>> +                if (drm_dev_enter(&adev->ddev, &idx)) {
>>>> +                    memcpy_toio(adev->uvd.inst[i].cpu_addr, adev->uvd.fw->data + offset,
>>>> +                                le32_to_cpu(hdr->ucode_size_bytes));
>>>> +                    drm_dev_exit(idx);
>>>> +                }
>>>>                   size -= le32_to_cpu(hdr->ucode_size_bytes);
>>>>                   ptr += le32_to_cpu(hdr->ucode_size_bytes);
>>>>               }
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c 
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
>>>> index ea6a62f67e38..833203401ef4 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
>>>> @@ -29,6 +29,7 @@
>>>>   #include <linux/module.h>
>>>>   #include <drm/drm.h>
>>>> +#include <drm/drm_drv.h>
>>>>   #include "amdgpu.h"
>>>>   #include "amdgpu_pm.h"
>>>> @@ -293,7 +294,7 @@ int amdgpu_vce_resume(struct amdgpu_device *adev)
>>>>       void *cpu_addr;
>>>>       const struct common_firmware_header *hdr;
>>>>       unsigned offset;
>>>> -    int r;
>>>> +    int r, idx;
>>>>       if (adev->vce.vcpu_bo == NULL)
>>>>           return -EINVAL;
>>>> @@ -313,8 +314,12 @@ int amdgpu_vce_resume(struct amdgpu_device *adev)
>>>>       hdr = (const struct common_firmware_header *)adev->vce.fw->data;
>>>>       offset = le32_to_cpu(hdr->ucode_array_offset_bytes);
>>>> -    memcpy_toio(cpu_addr, adev->vce.fw->data + offset,
>>>> -            adev->vce.fw->size - offset);
>>>> +
>>>> +    if (drm_dev_enter(&adev->ddev, &idx)) {
>>>> +        memcpy_toio(cpu_addr, adev->vce.fw->data + offset,
>>>> +                adev->vce.fw->size - offset);
>>>> +        drm_dev_exit(idx);
>>>> +    }
>>>>       amdgpu_bo_kunmap(adev->vce.vcpu_bo);
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c 
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
>>>> index 201645963ba5..21f7d3644d70 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
>>>> @@ -27,6 +27,7 @@
>>>>   #include <linux/firmware.h>
>>>>   #include <linux/module.h>
>>>>   #include <linux/pci.h>
>>>> +#include <drm/drm_drv.h>
>>>>   #include "amdgpu.h"
>>>>   #include "amdgpu_pm.h"
>>>> @@ -275,7 +276,7 @@ int amdgpu_vcn_suspend(struct amdgpu_device *adev)
>>>>   {
>>>>       unsigned size;
>>>>       void *ptr;
>>>> -    int i;
>>>> +    int i, idx;
>>>>       cancel_delayed_work_sync(&adev->vcn.idle_work);
>>>> @@ -292,7 +293,10 @@ int amdgpu_vcn_suspend(struct amdgpu_device 
>>>> *adev)
>>>>           if (!adev->vcn.inst[i].saved_bo)
>>>>               return -ENOMEM;
>>>> -        memcpy_fromio(adev->vcn.inst[i].saved_bo, ptr, size);
>>>> +        if (drm_dev_enter(&adev->ddev, &idx)) {
>>>> +            memcpy_fromio(adev->vcn.inst[i].saved_bo, ptr, size);
>>>> +            drm_dev_exit(idx);
>>>> +        }
>>>>       }
>>>>       return 0;
>>>>   }
>>>> @@ -301,7 +305,7 @@ int amdgpu_vcn_resume(struct amdgpu_device *adev)
>>>>   {
>>>>       unsigned size;
>>>>       void *ptr;
>>>> -    int i;
>>>> +    int i, idx;
>>>>       for (i = 0; i < adev->vcn.num_vcn_inst; ++i) {
>>>>           if (adev->vcn.harvest_config & (1 << i))
>>>> @@ -313,7 +317,10 @@ int amdgpu_vcn_resume(struct amdgpu_device *adev)
>>>>           ptr = adev->vcn.inst[i].cpu_addr;
>>>>           if (adev->vcn.inst[i].saved_bo != NULL) {
>>>> -            memcpy_toio(ptr, adev->vcn.inst[i].saved_bo, size);
>>>> +            if (drm_dev_enter(&adev->ddev, &idx)) {
>>>> +                memcpy_toio(ptr, adev->vcn.inst[i].saved_bo, size);
>>>> +                drm_dev_exit(idx);
>>>> +            }
>>>>               kvfree(adev->vcn.inst[i].saved_bo);
>>>>               adev->vcn.inst[i].saved_bo = NULL;
>>>>           } else {
>>>> @@ -323,8 +330,11 @@ int amdgpu_vcn_resume(struct amdgpu_device *adev)
>>>>               hdr = (const struct common_firmware_header 
>>>> *)adev->vcn.fw->data;
>>>>               if (adev->firmware.load_type != AMDGPU_FW_LOAD_PSP) {
>>>>                   offset = le32_to_cpu(hdr->ucode_array_offset_bytes);
>>>> -                memcpy_toio(adev->vcn.inst[i].cpu_addr, adev->vcn.fw->data + offset,
>>>> -                            le32_to_cpu(hdr->ucode_size_bytes));
>>>> +                if (drm_dev_enter(&adev->ddev, &idx)) {
>>>> +                    memcpy_toio(adev->vcn.inst[i].cpu_addr, adev->vcn.fw->data + offset,
>>>> +                                le32_to_cpu(hdr->ucode_size_bytes));
>>>> +                    drm_dev_exit(idx);
>>>> +                }
>>>>                   size -= le32_to_cpu(hdr->ucode_size_bytes);
>>>>                   ptr += le32_to_cpu(hdr->ucode_size_bytes);
>>>>               }
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>> index 9f868cf3b832..7dd5f10ab570 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>> @@ -32,6 +32,7 @@
>>>>   #include <linux/dma-buf.h>
>>>>   #include <drm/amdgpu_drm.h>
>>>> +#include <drm/drm_drv.h>
>>>>   #include "amdgpu.h"
>>>>   #include "amdgpu_trace.h"
>>>>   #include "amdgpu_amdkfd.h"
>>>> @@ -1606,7 +1607,10 @@ static int 
>>>> amdgpu_vm_bo_update_mapping(struct amdgpu_device *adev,
>>>>       struct amdgpu_vm_update_params params;
>>>>       enum amdgpu_sync_mode sync_mode;
>>>>       uint64_t pfn;
>>>> -    int r;
>>>> +    int r, idx;
>>>> +
>>>> +    if (!drm_dev_enter(&adev->ddev, &idx))
>>>> +        return -ENODEV;
>>>>       memset(&params, 0, sizeof(params));
>>>>       params.adev = adev;
>>>> @@ -1715,6 +1719,7 @@ static int amdgpu_vm_bo_update_mapping(struct 
>>>> amdgpu_device *adev,
>>>>   error_unlock:
>>>>       amdgpu_vm_eviction_unlock(vm);
>>>> +    drm_dev_exit(idx);
>>>>       return r;
>>>>   }
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c 
>>>> b/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
>>>> index 589410c32d09..2cec71e823f5 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
>>>> @@ -23,6 +23,7 @@
>>>>   #include <linux/firmware.h>
>>>>   #include <linux/module.h>
>>>>   #include <linux/vmalloc.h>
>>>> +#include <drm/drm_drv.h>
>>>>   #include "amdgpu.h"
>>>>   #include "amdgpu_psp.h"
>>>> @@ -269,10 +270,8 @@ static int 
>>>> psp_v11_0_bootloader_load_kdb(struct psp_context *psp)
>>>>       if (ret)
>>>>           return ret;
>>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>> -
>>>>       /* Copy PSP KDB binary to memory */
>>>> -    memcpy(psp->fw_pri_buf, psp->kdb_start_addr, psp->kdb_bin_size);
>>>> +    psp_copy_fw(psp, psp->kdb_start_addr, psp->kdb_bin_size);
>>>>       /* Provide the PSP KDB to bootloader */
>>>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>> @@ -302,10 +301,8 @@ static int 
>>>> psp_v11_0_bootloader_load_spl(struct psp_context *psp)
>>>>       if (ret)
>>>>           return ret;
>>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>> -
>>>>       /* Copy PSP SPL binary to memory */
>>>> -    memcpy(psp->fw_pri_buf, psp->spl_start_addr, psp->spl_bin_size);
>>>> +    psp_copy_fw(psp, psp->spl_start_addr, psp->spl_bin_size);
>>>>       /* Provide the PSP SPL to bootloader */
>>>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>> @@ -335,10 +332,8 @@ static int 
>>>> psp_v11_0_bootloader_load_sysdrv(struct psp_context *psp)
>>>>       if (ret)
>>>>           return ret;
>>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>> -
>>>>       /* Copy PSP System Driver binary to memory */
>>>> -    memcpy(psp->fw_pri_buf, psp->sys_start_addr, psp->sys_bin_size);
>>>> +    psp_copy_fw(psp, psp->sys_start_addr, psp->sys_bin_size);
>>>>       /* Provide the sys driver to bootloader */
>>>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>> @@ -371,10 +366,8 @@ static int 
>>>> psp_v11_0_bootloader_load_sos(struct psp_context *psp)
>>>>       if (ret)
>>>>           return ret;
>>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>> -
>>>>       /* Copy Secure OS binary to PSP memory */
>>>> -    memcpy(psp->fw_pri_buf, psp->sos_start_addr, psp->sos_bin_size);
>>>> +    psp_copy_fw(psp, psp->sos_start_addr, psp->sos_bin_size);
>>>>       /* Provide the PSP secure OS to bootloader */
>>>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>> @@ -608,7 +601,7 @@ static int psp_v11_0_memory_training(struct 
>>>> psp_context *psp, uint32_t ops)
>>>>       uint32_t p2c_header[4];
>>>>       uint32_t sz;
>>>>       void *buf;
>>>> -    int ret;
>>>> +    int ret, idx;
>>>>       if (ctx->init == PSP_MEM_TRAIN_NOT_SUPPORT) {
>>>>           DRM_DEBUG("Memory training is not supported.\n");
>>>> @@ -681,17 +674,24 @@ static int psp_v11_0_memory_training(struct 
>>>> psp_context *psp, uint32_t ops)
>>>>               return -ENOMEM;
>>>>           }
>>>> -        memcpy_fromio(buf, adev->mman.aper_base_kaddr, sz);
>>>> -        ret = psp_v11_0_memory_training_send_msg(psp, 
>>>> PSP_BL__DRAM_LONG_TRAIN);
>>>> -        if (ret) {
>>>> -            DRM_ERROR("Send long training msg failed.\n");
>>>> +        if (drm_dev_enter(&adev->ddev, &idx)) {
>>>> +            memcpy_fromio(buf, adev->mman.aper_base_kaddr, sz);
>>>> +            ret = psp_v11_0_memory_training_send_msg(psp, 
>>>> PSP_BL__DRAM_LONG_TRAIN);
>>>> +            if (ret) {
>>>> +                DRM_ERROR("Send long training msg failed.\n");
>>>> +                vfree(buf);
>>>> +                drm_dev_exit(idx);
>>>> +                return ret;
>>>> +            }
>>>> +
>>>> +            memcpy_toio(adev->mman.aper_base_kaddr, buf, sz);
>>>> +            adev->hdp.funcs->flush_hdp(adev, NULL);
>>>>               vfree(buf);
>>>> -            return ret;
>>>> +            drm_dev_exit(idx);
>>>> +        } else {
>>>> +            vfree(buf);
>>>> +            return -ENODEV;
>>>>           }
>>>> -
>>>> -        memcpy_toio(adev->mman.aper_base_kaddr, buf, sz);
>>>> -        adev->hdp.funcs->flush_hdp(adev, NULL);
>>>> -        vfree(buf);
>>>>       }
>>>>       if (ops & PSP_MEM_TRAIN_SAVE) {
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c 
>>>> b/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
>>>> index c4828bd3264b..618e5b6b85d9 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
>>>> @@ -138,10 +138,8 @@ static int 
>>>> psp_v12_0_bootloader_load_sysdrv(struct psp_context *psp)
>>>>       if (ret)
>>>>           return ret;
>>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>> -
>>>>       /* Copy PSP System Driver binary to memory */
>>>> -    memcpy(psp->fw_pri_buf, psp->sys_start_addr, psp->sys_bin_size);
>>>> +    psp_copy_fw(psp, psp->sys_start_addr, psp->sys_bin_size);
>>>>       /* Provide the sys driver to bootloader */
>>>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>> @@ -179,10 +177,8 @@ static int 
>>>> psp_v12_0_bootloader_load_sos(struct psp_context *psp)
>>>>       if (ret)
>>>>           return ret;
>>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>> -
>>>>       /* Copy Secure OS binary to PSP memory */
>>>> -    memcpy(psp->fw_pri_buf, psp->sos_start_addr, psp->sos_bin_size);
>>>> +    psp_copy_fw(psp, psp->sos_start_addr, psp->sos_bin_size);
>>>>       /* Provide the PSP secure OS to bootloader */
>>>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c 
>>>> b/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
>>>> index f2e725f72d2f..d0a6cccd0897 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
>>>> @@ -102,10 +102,8 @@ static int 
>>>> psp_v3_1_bootloader_load_sysdrv(struct psp_context *psp)
>>>>       if (ret)
>>>>           return ret;
>>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>> -
>>>>       /* Copy PSP System Driver binary to memory */
>>>> -    memcpy(psp->fw_pri_buf, psp->sys_start_addr, psp->sys_bin_size);
>>>> +    psp_copy_fw(psp, psp->sys_start_addr, psp->sys_bin_size);
>>>>       /* Provide the sys driver to bootloader */
>>>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>> @@ -143,10 +141,8 @@ static int psp_v3_1_bootloader_load_sos(struct 
>>>> psp_context *psp)
>>>>       if (ret)
>>>>           return ret;
>>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>> -
>>>>       /* Copy Secure OS binary to PSP memory */
>>>> -    memcpy(psp->fw_pri_buf, psp->sos_start_addr, psp->sos_bin_size);
>>>> +    psp_copy_fw(psp, psp->sos_start_addr, psp->sos_bin_size);
>>>>       /* Provide the PSP secure OS to bootloader */
>>>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c 
>>>> b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
>>>> index 8e238dea7bef..90910d19db12 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
>>>> @@ -25,6 +25,7 @@
>>>>    */
>>>>   #include <linux/firmware.h>
>>>> +#include <drm/drm_drv.h>
>>>>   #include "amdgpu.h"
>>>>   #include "amdgpu_vce.h"
>>>> @@ -555,16 +556,19 @@ static int vce_v4_0_hw_fini(void *handle)
>>>>   static int vce_v4_0_suspend(void *handle)
>>>>   {
>>>>       struct amdgpu_device *adev = (struct amdgpu_device *)handle;
>>>> -    int r;
>>>> +    int r, idx;
>>>>       if (adev->vce.vcpu_bo == NULL)
>>>>           return 0;
>>>> -    if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) {
>>>> -        unsigned size = amdgpu_bo_size(adev->vce.vcpu_bo);
>>>> -        void *ptr = adev->vce.cpu_addr;
>>>> +    if (drm_dev_enter(&adev->ddev, &idx)) {
>>>> +        if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) {
>>>> +            unsigned size = amdgpu_bo_size(adev->vce.vcpu_bo);
>>>> +            void *ptr = adev->vce.cpu_addr;
>>>> -        memcpy_fromio(adev->vce.saved_bo, ptr, size);
>>>> +            memcpy_fromio(adev->vce.saved_bo, ptr, size);
>>>> +        }
>>>> +        drm_dev_exit(idx);
>>>>       }
>>>>       r = vce_v4_0_hw_fini(adev);
>>>> @@ -577,16 +581,20 @@ static int vce_v4_0_suspend(void *handle)
>>>>   static int vce_v4_0_resume(void *handle)
>>>>   {
>>>>       struct amdgpu_device *adev = (struct amdgpu_device *)handle;
>>>> -    int r;
>>>> +    int r, idx;
>>>>       if (adev->vce.vcpu_bo == NULL)
>>>>           return -EINVAL;
>>>>       if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) {
>>>> -        unsigned size = amdgpu_bo_size(adev->vce.vcpu_bo);
>>>> -        void *ptr = adev->vce.cpu_addr;
>>>> -        memcpy_toio(ptr, adev->vce.saved_bo, size);
>>>> +        if (drm_dev_enter(&adev->ddev, &idx)) {
>>>> +            unsigned size = amdgpu_bo_size(adev->vce.vcpu_bo);
>>>> +            void *ptr = adev->vce.cpu_addr;
>>>> +
>>>> +            memcpy_toio(ptr, adev->vce.saved_bo, size);
>>>> +            drm_dev_exit(idx);
>>>> +        }
>>>>       } else {
>>>>           r = amdgpu_vce_resume(adev);
>>>>           if (r)
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c 
>>>> b/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
>>>> index 3f15bf34123a..df34be8ec82d 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
>>>> @@ -34,6 +34,8 @@
>>>>   #include "vcn/vcn_3_0_0_sh_mask.h"
>>>>   #include "ivsrcid/vcn/irqsrcs_vcn_2_0.h"
>>>> +#include <drm/drm_drv.h>
>>>> +
>>>>   #define mmUVD_CONTEXT_ID_INTERNAL_OFFSET            0x27
>>>>   #define mmUVD_GPCOM_VCPU_CMD_INTERNAL_OFFSET 0x0f
>>>>   #define mmUVD_GPCOM_VCPU_DATA0_INTERNAL_OFFSET 0x10
>>>> @@ -268,16 +270,20 @@ static int vcn_v3_0_sw_init(void *handle)
>>>>   static int vcn_v3_0_sw_fini(void *handle)
>>>>   {
>>>>       struct amdgpu_device *adev = (struct amdgpu_device *)handle;
>>>> -    int i, r;
>>>> +    int i, r, idx;
>>>> -    for (i = 0; i < adev->vcn.num_vcn_inst; i++) {
>>>> -        volatile struct amdgpu_fw_shared *fw_shared;
>>>> +    if (drm_dev_enter(&adev->ddev, &idx)) {
>>>> +        for (i = 0; i < adev->vcn.num_vcn_inst; i++) {
>>>> +            volatile struct amdgpu_fw_shared *fw_shared;
>>>> -        if (adev->vcn.harvest_config & (1 << i))
>>>> -            continue;
>>>> -        fw_shared = adev->vcn.inst[i].fw_shared_cpu_addr;
>>>> -        fw_shared->present_flag_0 = 0;
>>>> -        fw_shared->sw_ring.is_enabled = false;
>>>> +            if (adev->vcn.harvest_config & (1 << i))
>>>> +                continue;
>>>> +            fw_shared = adev->vcn.inst[i].fw_shared_cpu_addr;
>>>> +            fw_shared->present_flag_0 = 0;
>>>> +            fw_shared->sw_ring.is_enabled = false;
>>>> +        }
>>>> +
>>>> +        drm_dev_exit(idx);
>>>>       }
>>>>       if (amdgpu_sriov_vf(adev))
>>>> diff --git a/drivers/gpu/drm/amd/pm/powerplay/smumgr/smu7_smumgr.c 
>>>> b/drivers/gpu/drm/amd/pm/powerplay/smumgr/smu7_smumgr.c
>>>> index aae25243eb10..d628b91846c9 100644
>>>> --- a/drivers/gpu/drm/amd/pm/powerplay/smumgr/smu7_smumgr.c
>>>> +++ b/drivers/gpu/drm/amd/pm/powerplay/smumgr/smu7_smumgr.c
>>>> @@ -405,6 +405,8 @@ int smu7_request_smu_load_fw(struct pp_hwmgr 
>>>> *hwmgr)
>>>>                   UCODE_ID_MEC_STORAGE, 
>>>> &toc->entry[toc->num_entries++]),
>>>>                   "Failed to Get Firmware Entry.", r = -EINVAL; 
>>>> goto failed);
>>>>       }
>>>> +
>>>> +    /* AG TODO Can't call drm_dev_enter/exit because access 
>>>> adev->ddev here ... */
>>>>       memcpy_toio(smu_data->header_buffer.kaddr, smu_data->toc,
>>>>               sizeof(struct SMU_DRAMData_TOC));
>>>>       smum_send_msg_to_smc_with_parameter(hwmgr,
>>>



* Re: [PATCH v6 10/16] drm/amdgpu: Guard against write accesses after device removal
@ 2021-05-12 14:06           ` Christian König
  0 siblings, 0 replies; 126+ messages in thread
From: Christian König @ 2021-05-12 14:06 UTC (permalink / raw)
  To: Andrey Grodzovsky, dri-devel, amd-gfx, linux-pci, daniel.vetter,
	Harry.Wentland
  Cc: Alexander.Deucher, gregkh, helgaas, Felix.Kuehling

Am 12.05.21 um 16:01 schrieb Andrey Grodzovsky:
> Ping - need a confirmation it's ok to keep this as a single patch given
> my explanation below.

It was just a suggestion. The key point is that the approach sounds sane
to me, but I can't say much about the PSP code, for example.

So the most I can give you is an Acked-by for that.

Christian.

>
> Andrey
>
> On 2021-05-11 1:52 p.m., Andrey Grodzovsky wrote:
>>
>>
>> On 2021-05-11 2:50 a.m., Christian König wrote:
>>> Am 10.05.21 um 18:36 schrieb Andrey Grodzovsky:
>>>> This should prevent writing to memory or IO ranges possibly
>>>> already allocated for other uses after our device is removed.
>>>>
>>>> v5:
>>>> Protect more places where memcpy_toio/memcpy_fromio takes place
>>>> Protect IB submissions
>>>>
>>>> v6: Switch to !drm_dev_enter instead of scoping entire code
>>>> with brackets.
>>>>
>>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>>>> ---
>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c    | 11 ++-
>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c       |  9 +++
>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c        | 17 +++--
>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c       | 63 +++++++++++------
>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h       |  2 +
>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c      | 70 
>>>> +++++++++++++++++++
>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h      | 49 ++-----------
>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c       | 31 +++++---
>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c       | 11 ++-
>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c       | 22 ++++--
>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c        |  7 +-
>>>>   drivers/gpu/drm/amd/amdgpu/psp_v11_0.c        | 44 ++++++------
>>>>   drivers/gpu/drm/amd/amdgpu/psp_v12_0.c        |  8 +--
>>>>   drivers/gpu/drm/amd/amdgpu/psp_v3_1.c         |  8 +--
>>>>   drivers/gpu/drm/amd/amdgpu/vce_v4_0.c         | 26 ++++---
>>>>   drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c         | 22 +++---
>>>>   .../drm/amd/pm/powerplay/smumgr/smu7_smumgr.c |  2 +
>>>>   17 files changed, 257 insertions(+), 145 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>> index a0bff4713672..94c415176cdc 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>> @@ -71,6 +71,8 @@
>>>>   #include <drm/task_barrier.h>
>>>>   #include <linux/pm_runtime.h>
>>>> +#include <drm/drm_drv.h>
>>>> +
>>>>   MODULE_FIRMWARE("amdgpu/vega10_gpu_info.bin");
>>>>   MODULE_FIRMWARE("amdgpu/vega12_gpu_info.bin");
>>>>   MODULE_FIRMWARE("amdgpu/raven_gpu_info.bin");
>>>> @@ -281,7 +283,10 @@ void amdgpu_device_vram_access(struct 
>>>> amdgpu_device *adev, loff_t pos,
>>>>       unsigned long flags;
>>>>       uint32_t hi = ~0;
>>>>       uint64_t last;
>>>> +    int idx;
>>>> +     if (!drm_dev_enter(&adev->ddev, &idx))
>>>> +         return;
>>>>   #ifdef CONFIG_64BIT
>>>>       last = min(pos + size, adev->gmc.visible_vram_size);
>>>> @@ -299,8 +304,10 @@ void amdgpu_device_vram_access(struct 
>>>> amdgpu_device *adev, loff_t pos,
>>>>               memcpy_fromio(buf, addr, count);
>>>>           }
>>>> -        if (count == size)
>>>> +        if (count == size) {
>>>> +            drm_dev_exit(idx);
>>>>               return;
>>>> +        }
>>>
>>> Maybe use a goto instead, but really just a nit pick.
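The goto suggestion above amounts to routing every return path through one label, so the guard is released in exactly one place. The following is only a userspace sketch of that shape, not the driver code: `struct guard`, `guard_enter()`, `guard_exit()` and `read_window()` are hypothetical stand-ins invented for illustration, and the zero-fill replaces whatever the real slow path would do.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical guard state; stands in for drm_dev_enter()/drm_dev_exit(). */
struct guard {
	bool removed;	/* true once the device is gone */
	int depth;	/* enter calls not yet matched by an exit */
};

static bool guard_enter(struct guard *g)
{
	if (g->removed)
		return false;
	g->depth++;
	return true;
}

static void guard_exit(struct guard *g)
{
	assert(g->depth > 0);
	g->depth--;
}

/*
 * Copy up to `size` bytes from a small window into `buf`. If the window
 * covers the whole request we are done early; otherwise the remainder is
 * zero-filled as a stand-in for a slower access path. Returns the number
 * of bytes that came from the window, or 0 if the device is gone.
 */
static size_t read_window(struct guard *g, const char *window,
			  size_t window_len, char *buf, size_t size)
{
	size_t count;

	if (!guard_enter(g))
		return 0;

	count = size < window_len ? size : window_len;
	memcpy(buf, window, count);
	if (count == size)
		goto out;	/* fast path finished; still release the guard */

	memset(buf + count, 0, size - count);

out:
	guard_exit(g);
	return count;
}
```

With a single `out:` label, adding another early exit later cannot forget the release call, which is the main argument for this layout over repeating the exit before each return.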
>>>
>>>
>>>
>>>>           pos += count;
>>>>           buf += count / 4;
>>>> @@ -323,6 +330,8 @@ void amdgpu_device_vram_access(struct 
>>>> amdgpu_device *adev, loff_t pos,
>>>>               *buf++ = RREG32_NO_KIQ(mmMM_DATA);
>>>>       }
>>>>       spin_unlock_irqrestore(&adev->mmio_idx_lock, flags);
>>>> +
>>>> +    drm_dev_exit(idx);
>>>>   }
>>>>   /*
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c 
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
>>>> index 4d32233cde92..04ba5eef1e88 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
>>>> @@ -31,6 +31,8 @@
>>>>   #include "amdgpu_ras.h"
>>>>   #include "amdgpu_xgmi.h"
>>>> +#include <drm/drm_drv.h>
>>>> +
>>>>   /**
>>>>    * amdgpu_gmc_pdb0_alloc - allocate vram for pdb0
>>>>    *
>>>> @@ -151,6 +153,10 @@ int amdgpu_gmc_set_pte_pde(struct 
>>>> amdgpu_device *adev, void *cpu_pt_addr,
>>>>   {
>>>>       void __iomem *ptr = (void *)cpu_pt_addr;
>>>>       uint64_t value;
>>>> +    int idx;
>>>> +
>>>> +    if (!drm_dev_enter(&adev->ddev, &idx))
>>>> +        return 0;
>>>>       /*
>>>>        * The following is for PTE only. GART does not have PDEs.
>>>> @@ -158,6 +164,9 @@ int amdgpu_gmc_set_pte_pde(struct amdgpu_device 
>>>> *adev, void *cpu_pt_addr,
>>>>       value = addr & 0x0000FFFFFFFFF000ULL;
>>>>       value |= flags;
>>>>       writeq(value, ptr + (gpu_page_idx * 8));
>>>> +
>>>> +    drm_dev_exit(idx);
>>>> +
>>>>       return 0;
>>>>   }
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c 
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
>>>> index 148a3b481b12..62fcbd446c71 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
>>>> @@ -30,6 +30,7 @@
>>>>   #include <linux/slab.h>
>>>>   #include <drm/amdgpu_drm.h>
>>>> +#include <drm/drm_drv.h>
>>>>   #include "amdgpu.h"
>>>>   #include "atom.h"
>>>> @@ -137,7 +138,7 @@ int amdgpu_ib_schedule(struct amdgpu_ring 
>>>> *ring, unsigned num_ibs,
>>>>       bool secure;
>>>>       unsigned i;
>>>> -    int r = 0;
>>>> +    int idx, r = 0;
>>>>       bool need_pipe_sync = false;
>>>>       if (num_ibs == 0)
>>>> @@ -169,13 +170,16 @@ int amdgpu_ib_schedule(struct amdgpu_ring 
>>>> *ring, unsigned num_ibs,
>>>>           return -EINVAL;
>>>>       }
>>>> +    if (!drm_dev_enter(&adev->ddev, &idx))
>>>> +        return -ENODEV;
>>>> +
>>>>       alloc_size = ring->funcs->emit_frame_size + num_ibs *
>>>>           ring->funcs->emit_ib_size;
>>>>       r = amdgpu_ring_alloc(ring, alloc_size);
>>>>       if (r) {
>>>>           dev_err(adev->dev, "scheduling IB failed (%d).\n", r);
>>>> -        return r;
>>>> +        goto exit;
>>>>       }
>>>>       need_ctx_switch = ring->current_ctx != fence_ctx;
>>>> @@ -205,7 +209,7 @@ int amdgpu_ib_schedule(struct amdgpu_ring 
>>>> *ring, unsigned num_ibs,
>>>>           r = amdgpu_vm_flush(ring, job, need_pipe_sync);
>>>>           if (r) {
>>>>               amdgpu_ring_undo(ring);
>>>> -            return r;
>>>> +            goto exit;
>>>>           }
>>>>       }
>>>> @@ -286,7 +290,7 @@ int amdgpu_ib_schedule(struct amdgpu_ring 
>>>> *ring, unsigned num_ibs,
>>>>           if (job && job->vmid)
>>>>               amdgpu_vmid_reset(adev, ring->funcs->vmhub, job->vmid);
>>>>           amdgpu_ring_undo(ring);
>>>> -        return r;
>>>> +        goto exit;
>>>>       }
>>>>       if (ring->funcs->insert_end)
>>>> @@ -304,7 +308,10 @@ int amdgpu_ib_schedule(struct amdgpu_ring 
>>>> *ring, unsigned num_ibs,
>>>>           ring->funcs->emit_wave_limit(ring, false);
>>>>       amdgpu_ring_commit(ring);
>>>> -    return 0;
>>>> +
>>>> +exit:
>>>> +    drm_dev_exit(idx);
>>>> +    return r;
>>>>   }
>>>>   /**
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c 
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
>>>> index 9e769cf6095b..bb6afee61666 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
>>>> @@ -25,6 +25,7 @@
>>>>   #include <linux/firmware.h>
>>>>   #include <linux/dma-mapping.h>
>>>> +#include <drm/drm_drv.h>
>>>>   #include "amdgpu.h"
>>>>   #include "amdgpu_psp.h"
>>>> @@ -39,6 +40,8 @@
>>>>   #include "amdgpu_ras.h"
>>>>   #include "amdgpu_securedisplay.h"
>>>> +#include <drm/drm_drv.h>
>>>> +
>>>>   static int psp_sysfs_init(struct amdgpu_device *adev);
>>>>   static void psp_sysfs_fini(struct amdgpu_device *adev);
>>>> @@ -253,7 +256,7 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>>>              struct psp_gfx_cmd_resp *cmd, uint64_t fence_mc_addr)
>>>>   {
>>>>       int ret;
>>>> -    int index;
>>>> +    int index, idx;
>>>>       int timeout = 20000;
>>>>       bool ras_intr = false;
>>>>       bool skip_unsupport = false;
>>>> @@ -261,6 +264,9 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>>>       if (psp->adev->in_pci_err_recovery)
>>>>           return 0;
>>>> +    if (!drm_dev_enter(&psp->adev->ddev, &idx))
>>>> +        return 0;
>>>> +
>>>>       mutex_lock(&psp->mutex);
>>>>       memset(psp->cmd_buf_mem, 0, PSP_CMD_BUFFER_SIZE);
>>>> @@ -271,8 +277,7 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>>>       ret = psp_ring_cmd_submit(psp, psp->cmd_buf_mc_addr, 
>>>> fence_mc_addr, index);
>>>>       if (ret) {
>>>>           atomic_dec(&psp->fence_value);
>>>> -        mutex_unlock(&psp->mutex);
>>>> -        return ret;
>>>> +        goto exit;
>>>>       }
>>>>       amdgpu_asic_invalidate_hdp(psp->adev, NULL);
>>>> @@ -312,8 +317,8 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>>>                psp->cmd_buf_mem->cmd_id,
>>>>                psp->cmd_buf_mem->resp.status);
>>>>           if (!timeout) {
>>>> -            mutex_unlock(&psp->mutex);
>>>> -            return -EINVAL;
>>>> +            ret = -EINVAL;
>>>> +            goto exit;
>>>>           }
>>>>       }
>>>> @@ -321,8 +326,10 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>>>           ucode->tmr_mc_addr_lo = psp->cmd_buf_mem->resp.fw_addr_lo;
>>>>           ucode->tmr_mc_addr_hi = psp->cmd_buf_mem->resp.fw_addr_hi;
>>>>       }
>>>> -    mutex_unlock(&psp->mutex);
>>>> +exit:
>>>> +    mutex_unlock(&psp->mutex);
>>>> +    drm_dev_exit(idx);
>>>>       return ret;
>>>>   }
>>>> @@ -359,8 +366,7 @@ static int psp_load_toc(struct psp_context *psp,
>>>>       if (!cmd)
>>>>           return -ENOMEM;
>>>>       /* Copy toc to psp firmware private buffer */
>>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>> -    memcpy(psp->fw_pri_buf, psp->toc_start_addr, psp->toc_bin_size);
>>>> +    psp_copy_fw(psp, psp->toc_start_addr, psp->toc_bin_size);
>>>>       psp_prep_load_toc_cmd_buf(cmd, psp->fw_pri_mc_addr, 
>>>> psp->toc_bin_size);
>>>> @@ -625,8 +631,7 @@ static int psp_asd_load(struct psp_context *psp)
>>>>       if (!cmd)
>>>>           return -ENOMEM;
>>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>> -    memcpy(psp->fw_pri_buf, psp->asd_start_addr, 
>>>> psp->asd_ucode_size);
>>>> +    psp_copy_fw(psp, psp->asd_start_addr, psp->asd_ucode_size);
>>>>       psp_prep_asd_load_cmd_buf(cmd, psp->fw_pri_mc_addr,
>>>>                     psp->asd_ucode_size);
>>>> @@ -781,8 +786,7 @@ static int psp_xgmi_load(struct psp_context *psp)
>>>>       if (!cmd)
>>>>           return -ENOMEM;
>>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>> -    memcpy(psp->fw_pri_buf, psp->ta_xgmi_start_addr, 
>>>> psp->ta_xgmi_ucode_size);
>>>> +    psp_copy_fw(psp, psp->ta_xgmi_start_addr, 
>>>> psp->ta_xgmi_ucode_size);
>>>>       psp_prep_ta_load_cmd_buf(cmd,
>>>>                    psp->fw_pri_mc_addr,
>>>> @@ -1038,8 +1042,7 @@ static int psp_ras_load(struct psp_context *psp)
>>>>       if (!cmd)
>>>>           return -ENOMEM;
>>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>> -    memcpy(psp->fw_pri_buf, psp->ta_ras_start_addr, 
>>>> psp->ta_ras_ucode_size);
>>>> +    psp_copy_fw(psp, psp->ta_ras_start_addr, psp->ta_ras_ucode_size);
>>>>       psp_prep_ta_load_cmd_buf(cmd,
>>>>                    psp->fw_pri_mc_addr,
>>>> @@ -1275,8 +1278,7 @@ static int psp_hdcp_load(struct psp_context 
>>>> *psp)
>>>>       if (!cmd)
>>>>           return -ENOMEM;
>>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>> -    memcpy(psp->fw_pri_buf, psp->ta_hdcp_start_addr,
>>>> +    psp_copy_fw(psp, psp->ta_hdcp_start_addr,
>>>>              psp->ta_hdcp_ucode_size);
>>>>       psp_prep_ta_load_cmd_buf(cmd,
>>>> @@ -1427,8 +1429,7 @@ static int psp_dtm_load(struct psp_context *psp)
>>>>       if (!cmd)
>>>>           return -ENOMEM;
>>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>> -    memcpy(psp->fw_pri_buf, psp->ta_dtm_start_addr, 
>>>> psp->ta_dtm_ucode_size);
>>>> +    psp_copy_fw(psp, psp->ta_dtm_start_addr, psp->ta_dtm_ucode_size);
>>>>       psp_prep_ta_load_cmd_buf(cmd,
>>>>                    psp->fw_pri_mc_addr,
>>>> @@ -1573,8 +1574,7 @@ static int psp_rap_load(struct psp_context *psp)
>>>>       if (!cmd)
>>>>           return -ENOMEM;
>>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>> -    memcpy(psp->fw_pri_buf, psp->ta_rap_start_addr, 
>>>> psp->ta_rap_ucode_size);
>>>> +    psp_copy_fw(psp, psp->ta_rap_start_addr, psp->ta_rap_ucode_size);
>>>>       psp_prep_ta_load_cmd_buf(cmd,
>>>>                    psp->fw_pri_mc_addr,
>>>> @@ -3022,7 +3022,7 @@ static ssize_t 
>>>> psp_usbc_pd_fw_sysfs_write(struct device *dev,
>>>>       struct amdgpu_device *adev = drm_to_adev(ddev);
>>>>       void *cpu_addr;
>>>>       dma_addr_t dma_addr;
>>>> -    int ret;
>>>> +    int ret, idx;
>>>>       char fw_name[100];
>>>>       const struct firmware *usbc_pd_fw;
>>>> @@ -3031,6 +3031,9 @@ static ssize_t 
>>>> psp_usbc_pd_fw_sysfs_write(struct device *dev,
>>>>           return -EBUSY;
>>>>       }
>>>> +    if (!drm_dev_enter(ddev, &idx))
>>>> +        return -ENODEV;
>>>> +
>>>>       snprintf(fw_name, sizeof(fw_name), "amdgpu/%s", buf);
>>>>       ret = request_firmware(&usbc_pd_fw, fw_name, adev->dev);
>>>>       if (ret)
>>>> @@ -3062,16 +3065,30 @@ static ssize_t 
>>>> psp_usbc_pd_fw_sysfs_write(struct device *dev,
>>>>   rel_buf:
>>>>       dma_free_coherent(adev->dev, usbc_pd_fw->size, cpu_addr, 
>>>> dma_addr);
>>>>       release_firmware(usbc_pd_fw);
>>>> -
>>>>   fail:
>>>>       if (ret) {
>>>>           DRM_ERROR("Failed to load USBC PD FW, err = %d", ret);
>>>> -        return ret;
>>>> +        count = ret;
>>>>       }
>>>> +    drm_dev_exit(idx);
>>>>       return count;
>>>>   }
>>>> +void psp_copy_fw(struct psp_context *psp, uint8_t *start_addr, 
>>>> uint32_t bin_size)
>>>> +{
>>>> +    int idx;
>>>> +
>>>> +    if (!drm_dev_enter(&psp->adev->ddev, &idx))
>>>> +        return;
>>>> +
>>>> +    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>> +    memcpy(psp->fw_pri_buf, start_addr, bin_size);
>>>> +
>>>> +    drm_dev_exit(idx);
>>>> +}
>>>> +
>>>> +
>>>>   static DEVICE_ATTR(usbc_pd_fw, S_IRUGO | S_IWUSR,
>>>>              psp_usbc_pd_fw_sysfs_read,
>>>>              psp_usbc_pd_fw_sysfs_write);
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h 
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
>>>> index 46a5328e00e0..2bfdc278817f 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
>>>> @@ -423,4 +423,6 @@ int psp_get_fw_attestation_records_addr(struct 
>>>> psp_context *psp,
>>>>   int psp_load_fw_list(struct psp_context *psp,
>>>>                struct amdgpu_firmware_info **ucode_list, int 
>>>> ucode_count);
>>>> +void psp_copy_fw(struct psp_context *psp, uint8_t *start_addr, 
>>>> uint32_t bin_size);
>>>> +
>>>>   #endif
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c 
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>>>> index 688624ebe421..e1985bc34436 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>>>> @@ -35,6 +35,8 @@
>>>>   #include "amdgpu.h"
>>>>   #include "atom.h"
>>>> +#include <drm/drm_drv.h>
>>>> +
>>>>   /*
>>>>    * Rings
>>>>    * Most engines on the GPU are fed via ring buffers.  Ring
>>>> @@ -461,3 +463,71 @@ int amdgpu_ring_test_helper(struct amdgpu_ring 
>>>> *ring)
>>>>       ring->sched.ready = !r;
>>>>       return r;
>>>>   }
>>>> +
>>>> +void amdgpu_ring_clear_ring(struct amdgpu_ring *ring)
>>>> +{
>>>> +    int idx;
>>>> +    int i = 0;
>>>> +
>>>> +    if (!drm_dev_enter(&ring->adev->ddev, &idx))
>>>> +        return;
>>>> +
>>>> +    while (i <= ring->buf_mask)
>>>> +        ring->ring[i++] = ring->funcs->nop;
>>>> +
>>>> +    drm_dev_exit(idx);
>>>> +
>>>> +}
>>>> +
>>>> +void amdgpu_ring_write(struct amdgpu_ring *ring, uint32_t v)
>>>> +{
>>>> +    int idx;
>>>> +
>>>> +    if (!drm_dev_enter(&ring->adev->ddev, &idx))
>>>> +        return;
>>>> +
>>>> +    if (ring->count_dw <= 0)
>>>> +        DRM_ERROR("amdgpu: writing more dwords to the ring than 
>>>> expected!\n");
>>>> +    ring->ring[ring->wptr++ & ring->buf_mask] = v;
>>>> +    ring->wptr &= ring->ptr_mask;
>>>> +    ring->count_dw--;
>>>> +
>>>> +    drm_dev_exit(idx);
>>>> +}
>>>> +
>>>> +void amdgpu_ring_write_multiple(struct amdgpu_ring *ring,
>>>> +                          void *src, int count_dw)
>>>> +{
>>>> +    unsigned occupied, chunk1, chunk2;
>>>> +    void *dst;
>>>> +    int idx;
>>>> +
>>>> +    if (!drm_dev_enter(&ring->adev->ddev, &idx))
>>>> +        return;
>>>> +
>>>> +    if (unlikely(ring->count_dw < count_dw))
>>>> +        DRM_ERROR("amdgpu: writing more dwords to the ring than 
>>>> expected!\n");
>>>> +
>>>> +    occupied = ring->wptr & ring->buf_mask;
>>>> +    dst = (void *)&ring->ring[occupied];
>>>> +    chunk1 = ring->buf_mask + 1 - occupied;
>>>> +    chunk1 = (chunk1 >= count_dw) ? count_dw: chunk1;
>>>> +    chunk2 = count_dw - chunk1;
>>>> +    chunk1 <<= 2;
>>>> +    chunk2 <<= 2;
>>>> +
>>>> +    if (chunk1)
>>>> +        memcpy(dst, src, chunk1);
>>>> +
>>>> +    if (chunk2) {
>>>> +        src += chunk1;
>>>> +        dst = (void *)ring->ring;
>>>> +        memcpy(dst, src, chunk2);
>>>> +    }
>>>> +
>>>> +    ring->wptr += count_dw;
>>>> +    ring->wptr &= ring->ptr_mask;
>>>> +    ring->count_dw -= count_dw;
>>>> +
>>>> +    drm_dev_exit(idx);
>>>> +}
>>>
>>> The ring should never be in MMIO memory, so you can completely drop 
>>> that as far as I can see.
>>
>> Yeah, it's all in GART; I missed it for some reason...
>>>
>>> Maybe split that patch by use case so that we can more easily 
>>> review/ack it.
>>
>> In fact everything here is the same use case: once I added the unmap of
>> all MMIO ranges (both registers and VRAM), I got a lot of page faults
>> on device removal around any memcpy to/from IO. That is where I put the
>> drm_dev_enter/exit scope. I also searched the code and preemptively
>> added guards to any other such place. I did drop amdgpu_ib_schedule
>> from this patch because it had dma_fence_wait inside, so we
>> will take care of this once we decide on how to handle dma_fence waits.
>>
>> Andrey
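The guard described above (skip the IO copy once the device has been removed) can be sketched in plain userspace C. This is an illustration under stated assumptions, not the kernel implementation: `struct mock_dev`, `mock_dev_enter()`, `mock_dev_exit()` and `guarded_copy_to_device()` are hypothetical names, a simple flag and counter replace the SRCU-based tracking that the real drm_dev_enter()/drm_dev_exit() use, and a plain memcpy() stands in for memcpy_toio().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace stand-in for a device; not a kernel structure. */
struct mock_dev {
	bool unplugged;		/* set once the device has been removed */
	int active_sections;	/* guarded sections currently running */
	unsigned char vram[64];	/* pretend MMIO/VRAM window */
};

/* Same calling convention as drm_dev_enter(): true means access is allowed. */
static bool mock_dev_enter(struct mock_dev *dev, int *idx)
{
	if (dev->unplugged)
		return false;
	*idx = dev->active_sections++;
	return true;
}

/* Unlike the real drm_dev_exit(), this mock also needs the device pointer. */
static void mock_dev_exit(struct mock_dev *dev, int idx)
{
	(void)idx;
	assert(dev->active_sections > 0);
	dev->active_sections--;
}

/* Copy into the pretend IO window only while the device is still present. */
static int guarded_copy_to_device(struct mock_dev *dev, const void *src,
				  size_t len)
{
	int idx;

	if (len > sizeof(dev->vram))
		return -EINVAL;

	if (!mock_dev_enter(dev, &idx))
		return -ENODEV;	/* device gone: do not touch the window */

	memcpy(dev->vram, src, len);	/* stand-in for memcpy_toio() */

	mock_dev_exit(dev, idx);
	return 0;
}
```

Whether a skipped copy should report -ENODEV, return 0 or return nothing at all is a per-call-site decision; the quoted patch uses all three depending on the caller.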
>>
>>>
>>> Thanks,
>>> Christian.
>>>
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h 
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>>>> index e7d3d0dbdd96..c67bc6d3d039 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>>>> @@ -299,53 +299,12 @@ static inline void 
>>>> amdgpu_ring_set_preempt_cond_exec(struct amdgpu_ring *ring,
>>>>       *ring->cond_exe_cpu_addr = cond_exec;
>>>>   }
>>>> -static inline void amdgpu_ring_clear_ring(struct amdgpu_ring *ring)
>>>> -{
>>>> -    int i = 0;
>>>> -    while (i <= ring->buf_mask)
>>>> -        ring->ring[i++] = ring->funcs->nop;
>>>> -
>>>> -}
>>>> -
>>>> -static inline void amdgpu_ring_write(struct amdgpu_ring *ring, 
>>>> uint32_t v)
>>>> -{
>>>> -    if (ring->count_dw <= 0)
>>>> -        DRM_ERROR("amdgpu: writing more dwords to the ring than 
>>>> expected!\n");
>>>> -    ring->ring[ring->wptr++ & ring->buf_mask] = v;
>>>> -    ring->wptr &= ring->ptr_mask;
>>>> -    ring->count_dw--;
>>>> -}
>>>> +void amdgpu_ring_clear_ring(struct amdgpu_ring *ring);
>>>> -static inline void amdgpu_ring_write_multiple(struct amdgpu_ring 
>>>> *ring,
>>>> -                          void *src, int count_dw)
>>>> -{
>>>> -    unsigned occupied, chunk1, chunk2;
>>>> -    void *dst;
>>>> -
>>>> -    if (unlikely(ring->count_dw < count_dw))
>>>> -        DRM_ERROR("amdgpu: writing more dwords to the ring than 
>>>> expected!\n");
>>>> -
>>>> -    occupied = ring->wptr & ring->buf_mask;
>>>> -    dst = (void *)&ring->ring[occupied];
>>>> -    chunk1 = ring->buf_mask + 1 - occupied;
>>>> -    chunk1 = (chunk1 >= count_dw) ? count_dw: chunk1;
>>>> -    chunk2 = count_dw - chunk1;
>>>> -    chunk1 <<= 2;
>>>> -    chunk2 <<= 2;
>>>> +void amdgpu_ring_write(struct amdgpu_ring *ring, uint32_t v);
>>>> -    if (chunk1)
>>>> -        memcpy(dst, src, chunk1);
>>>> -
>>>> -    if (chunk2) {
>>>> -        src += chunk1;
>>>> -        dst = (void *)ring->ring;
>>>> -        memcpy(dst, src, chunk2);
>>>> -    }
>>>> -
>>>> -    ring->wptr += count_dw;
>>>> -    ring->wptr &= ring->ptr_mask;
>>>> -    ring->count_dw -= count_dw;
>>>> -}
>>>> +void amdgpu_ring_write_multiple(struct amdgpu_ring *ring,
>>>> +                          void *src, int count_dw);
>>>>   int amdgpu_ring_test_helper(struct amdgpu_ring *ring);
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c 
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
>>>> index c6dbc0801604..82f0542c7792 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
>>>> @@ -32,6 +32,7 @@
>>>>   #include <linux/module.h>
>>>>   #include <drm/drm.h>
>>>> +#include <drm/drm_drv.h>
>>>>   #include "amdgpu.h"
>>>>   #include "amdgpu_pm.h"
>>>> @@ -375,7 +376,7 @@ int amdgpu_uvd_suspend(struct amdgpu_device *adev)
>>>>   {
>>>>       unsigned size;
>>>>       void *ptr;
>>>> -    int i, j;
>>>> +    int i, j, idx;
>>>>       bool in_ras_intr = amdgpu_ras_intr_triggered();
>>>>       cancel_delayed_work_sync(&adev->uvd.idle_work);
>>>> @@ -403,11 +404,15 @@ int amdgpu_uvd_suspend(struct amdgpu_device 
>>>> *adev)
>>>>           if (!adev->uvd.inst[j].saved_bo)
>>>>               return -ENOMEM;
>>>> -        /* re-write 0 since err_event_athub will corrupt VCPU 
>>>> buffer */
>>>> -        if (in_ras_intr)
>>>> -            memset(adev->uvd.inst[j].saved_bo, 0, size);
>>>> -        else
>>>> -            memcpy_fromio(adev->uvd.inst[j].saved_bo, ptr, size);
>>>> +        if (drm_dev_enter(&adev->ddev, &idx)) {
>>>> +            /* re-write 0 since err_event_athub will corrupt VCPU 
>>>> buffer */
>>>> +            if (in_ras_intr)
>>>> +                memset(adev->uvd.inst[j].saved_bo, 0, size);
>>>> +            else
>>>> +                memcpy_fromio(adev->uvd.inst[j].saved_bo, ptr, size);
>>>> +
>>>> +            drm_dev_exit(idx);
>>>> +        }
>>>>       }
>>>>       if (in_ras_intr)
>>>> @@ -420,7 +425,7 @@ int amdgpu_uvd_resume(struct amdgpu_device *adev)
>>>>   {
>>>>       unsigned size;
>>>>       void *ptr;
>>>> -    int i;
>>>> +    int i, idx;
>>>>       for (i = 0; i < adev->uvd.num_uvd_inst; i++) {
>>>>           if (adev->uvd.harvest_config & (1 << i))
>>>> @@ -432,7 +437,10 @@ int amdgpu_uvd_resume(struct amdgpu_device *adev)
>>>>           ptr = adev->uvd.inst[i].cpu_addr;
>>>>           if (adev->uvd.inst[i].saved_bo != NULL) {
>>>> -            memcpy_toio(ptr, adev->uvd.inst[i].saved_bo, size);
>>>> +            if (drm_dev_enter(&adev->ddev, &idx)) {
>>>> +                memcpy_toio(ptr, adev->uvd.inst[i].saved_bo, size);
>>>> +                drm_dev_exit(idx);
>>>> +            }
>>>>               kvfree(adev->uvd.inst[i].saved_bo);
>>>>               adev->uvd.inst[i].saved_bo = NULL;
>>>>           } else {
>>>> @@ -442,8 +450,11 @@ int amdgpu_uvd_resume(struct amdgpu_device *adev)
>>>>               hdr = (const struct common_firmware_header 
>>>> *)adev->uvd.fw->data;
>>>>               if (adev->firmware.load_type != AMDGPU_FW_LOAD_PSP) {
>>>>                   offset = le32_to_cpu(hdr->ucode_array_offset_bytes);
>>>> -                memcpy_toio(adev->uvd.inst[i].cpu_addr, 
>>>> adev->uvd.fw->data + offset,
>>>> - le32_to_cpu(hdr->ucode_size_bytes));
>>>> +                if (drm_dev_enter(&adev->ddev, &idx)) {
>>>> +                    memcpy_toio(adev->uvd.inst[i].cpu_addr, adev->uvd.fw->data + offset,
>>>> +                            le32_to_cpu(hdr->ucode_size_bytes));
>>>> +                    drm_dev_exit(idx);
>>>> +                }
>>>>                   size -= le32_to_cpu(hdr->ucode_size_bytes);
>>>>                   ptr += le32_to_cpu(hdr->ucode_size_bytes);
>>>>               }
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c 
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
>>>> index ea6a62f67e38..833203401ef4 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
>>>> @@ -29,6 +29,7 @@
>>>>   #include <linux/module.h>
>>>>   #include <drm/drm.h>
>>>> +#include <drm/drm_drv.h>
>>>>   #include "amdgpu.h"
>>>>   #include "amdgpu_pm.h"
>>>> @@ -293,7 +294,7 @@ int amdgpu_vce_resume(struct amdgpu_device *adev)
>>>>       void *cpu_addr;
>>>>       const struct common_firmware_header *hdr;
>>>>       unsigned offset;
>>>> -    int r;
>>>> +    int r, idx;
>>>>       if (adev->vce.vcpu_bo == NULL)
>>>>           return -EINVAL;
>>>> @@ -313,8 +314,12 @@ int amdgpu_vce_resume(struct amdgpu_device *adev)
>>>>       hdr = (const struct common_firmware_header *)adev->vce.fw->data;
>>>>       offset = le32_to_cpu(hdr->ucode_array_offset_bytes);
>>>> -    memcpy_toio(cpu_addr, adev->vce.fw->data + offset,
>>>> -            adev->vce.fw->size - offset);
>>>> +
>>>> +    if (drm_dev_enter(&adev->ddev, &idx)) {
>>>> +        memcpy_toio(cpu_addr, adev->vce.fw->data + offset,
>>>> +                adev->vce.fw->size - offset);
>>>> +        drm_dev_exit(idx);
>>>> +    }
>>>>       amdgpu_bo_kunmap(adev->vce.vcpu_bo);
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c 
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
>>>> index 201645963ba5..21f7d3644d70 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
>>>> @@ -27,6 +27,7 @@
>>>>   #include <linux/firmware.h>
>>>>   #include <linux/module.h>
>>>>   #include <linux/pci.h>
>>>> +#include <drm/drm_drv.h>
>>>>   #include "amdgpu.h"
>>>>   #include "amdgpu_pm.h"
>>>> @@ -275,7 +276,7 @@ int amdgpu_vcn_suspend(struct amdgpu_device *adev)
>>>>   {
>>>>       unsigned size;
>>>>       void *ptr;
>>>> -    int i;
>>>> +    int i, idx;
>>>>       cancel_delayed_work_sync(&adev->vcn.idle_work);
>>>> @@ -292,7 +293,10 @@ int amdgpu_vcn_suspend(struct amdgpu_device 
>>>> *adev)
>>>>           if (!adev->vcn.inst[i].saved_bo)
>>>>               return -ENOMEM;
>>>> -        memcpy_fromio(adev->vcn.inst[i].saved_bo, ptr, size);
>>>> +        if (drm_dev_enter(&adev->ddev, &idx)) {
>>>> +            memcpy_fromio(adev->vcn.inst[i].saved_bo, ptr, size);
>>>> +            drm_dev_exit(idx);
>>>> +        }
>>>>       }
>>>>       return 0;
>>>>   }
>>>> @@ -301,7 +305,7 @@ int amdgpu_vcn_resume(struct amdgpu_device *adev)
>>>>   {
>>>>       unsigned size;
>>>>       void *ptr;
>>>> -    int i;
>>>> +    int i, idx;
>>>>       for (i = 0; i < adev->vcn.num_vcn_inst; ++i) {
>>>>           if (adev->vcn.harvest_config & (1 << i))
>>>> @@ -313,7 +317,10 @@ int amdgpu_vcn_resume(struct amdgpu_device *adev)
>>>>           ptr = adev->vcn.inst[i].cpu_addr;
>>>>           if (adev->vcn.inst[i].saved_bo != NULL) {
>>>> -            memcpy_toio(ptr, adev->vcn.inst[i].saved_bo, size);
>>>> +            if (drm_dev_enter(&adev->ddev, &idx)) {
>>>> +                memcpy_toio(ptr, adev->vcn.inst[i].saved_bo, size);
>>>> +                drm_dev_exit(idx);
>>>> +            }
>>>>               kvfree(adev->vcn.inst[i].saved_bo);
>>>>               adev->vcn.inst[i].saved_bo = NULL;
>>>>           } else {
>>>> @@ -323,8 +330,11 @@ int amdgpu_vcn_resume(struct amdgpu_device *adev)
>>>>               hdr = (const struct common_firmware_header 
>>>> *)adev->vcn.fw->data;
>>>>               if (adev->firmware.load_type != AMDGPU_FW_LOAD_PSP) {
>>>>                   offset = le32_to_cpu(hdr->ucode_array_offset_bytes);
>>>> -                memcpy_toio(adev->vcn.inst[i].cpu_addr, 
>>>> adev->vcn.fw->data + offset,
>>>> - le32_to_cpu(hdr->ucode_size_bytes));
>>>> +                if (drm_dev_enter(&adev->ddev, &idx)) {
>>>> +                    memcpy_toio(adev->vcn.inst[i].cpu_addr, adev->vcn.fw->data + offset,
>>>> +                            le32_to_cpu(hdr->ucode_size_bytes));
>>>> +                    drm_dev_exit(idx);
>>>> +                }
>>>>                   size -= le32_to_cpu(hdr->ucode_size_bytes);
>>>>                   ptr += le32_to_cpu(hdr->ucode_size_bytes);
>>>>               }
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>> index 9f868cf3b832..7dd5f10ab570 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>> @@ -32,6 +32,7 @@
>>>>   #include <linux/dma-buf.h>
>>>>   #include <drm/amdgpu_drm.h>
>>>> +#include <drm/drm_drv.h>
>>>>   #include "amdgpu.h"
>>>>   #include "amdgpu_trace.h"
>>>>   #include "amdgpu_amdkfd.h"
>>>> @@ -1606,7 +1607,10 @@ static int 
>>>> amdgpu_vm_bo_update_mapping(struct amdgpu_device *adev,
>>>>       struct amdgpu_vm_update_params params;
>>>>       enum amdgpu_sync_mode sync_mode;
>>>>       uint64_t pfn;
>>>> -    int r;
>>>> +    int r, idx;
>>>> +
>>>> +    if (!drm_dev_enter(&adev->ddev, &idx))
>>>> +        return -ENODEV;
>>>>       memset(&params, 0, sizeof(params));
>>>>       params.adev = adev;
>>>> @@ -1715,6 +1719,7 @@ static int amdgpu_vm_bo_update_mapping(struct 
>>>> amdgpu_device *adev,
>>>>   error_unlock:
>>>>       amdgpu_vm_eviction_unlock(vm);
>>>> +    drm_dev_exit(idx);
>>>>       return r;
>>>>   }
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c 
>>>> b/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
>>>> index 589410c32d09..2cec71e823f5 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
>>>> @@ -23,6 +23,7 @@
>>>>   #include <linux/firmware.h>
>>>>   #include <linux/module.h>
>>>>   #include <linux/vmalloc.h>
>>>> +#include <drm/drm_drv.h>
>>>>   #include "amdgpu.h"
>>>>   #include "amdgpu_psp.h"
>>>> @@ -269,10 +270,8 @@ static int 
>>>> psp_v11_0_bootloader_load_kdb(struct psp_context *psp)
>>>>       if (ret)
>>>>           return ret;
>>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>> -
>>>>       /* Copy PSP KDB binary to memory */
>>>> -    memcpy(psp->fw_pri_buf, psp->kdb_start_addr, psp->kdb_bin_size);
>>>> +    psp_copy_fw(psp, psp->kdb_start_addr, psp->kdb_bin_size);
>>>>       /* Provide the PSP KDB to bootloader */
>>>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>> @@ -302,10 +301,8 @@ static int 
>>>> psp_v11_0_bootloader_load_spl(struct psp_context *psp)
>>>>       if (ret)
>>>>           return ret;
>>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>> -
>>>>       /* Copy PSP SPL binary to memory */
>>>> -    memcpy(psp->fw_pri_buf, psp->spl_start_addr, psp->spl_bin_size);
>>>> +    psp_copy_fw(psp, psp->spl_start_addr, psp->spl_bin_size);
>>>>       /* Provide the PSP SPL to bootloader */
>>>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>> @@ -335,10 +332,8 @@ static int 
>>>> psp_v11_0_bootloader_load_sysdrv(struct psp_context *psp)
>>>>       if (ret)
>>>>           return ret;
>>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>> -
>>>>       /* Copy PSP System Driver binary to memory */
>>>> -    memcpy(psp->fw_pri_buf, psp->sys_start_addr, psp->sys_bin_size);
>>>> +    psp_copy_fw(psp, psp->sys_start_addr, psp->sys_bin_size);
>>>>       /* Provide the sys driver to bootloader */
>>>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>> @@ -371,10 +366,8 @@ static int 
>>>> psp_v11_0_bootloader_load_sos(struct psp_context *psp)
>>>>       if (ret)
>>>>           return ret;
>>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>> -
>>>>       /* Copy Secure OS binary to PSP memory */
>>>> -    memcpy(psp->fw_pri_buf, psp->sos_start_addr, psp->sos_bin_size);
>>>> +    psp_copy_fw(psp, psp->sos_start_addr, psp->sos_bin_size);
>>>>       /* Provide the PSP secure OS to bootloader */
>>>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>> @@ -608,7 +601,7 @@ static int psp_v11_0_memory_training(struct 
>>>> psp_context *psp, uint32_t ops)
>>>>       uint32_t p2c_header[4];
>>>>       uint32_t sz;
>>>>       void *buf;
>>>> -    int ret;
>>>> +    int ret, idx;
>>>>       if (ctx->init == PSP_MEM_TRAIN_NOT_SUPPORT) {
>>>>           DRM_DEBUG("Memory training is not supported.\n");
>>>> @@ -681,17 +674,24 @@ static int psp_v11_0_memory_training(struct 
>>>> psp_context *psp, uint32_t ops)
>>>>               return -ENOMEM;
>>>>           }
>>>> -        memcpy_fromio(buf, adev->mman.aper_base_kaddr, sz);
>>>> -        ret = psp_v11_0_memory_training_send_msg(psp, 
>>>> PSP_BL__DRAM_LONG_TRAIN);
>>>> -        if (ret) {
>>>> -            DRM_ERROR("Send long training msg failed.\n");
>>>> +        if (drm_dev_enter(&adev->ddev, &idx)) {
>>>> +            memcpy_fromio(buf, adev->mman.aper_base_kaddr, sz);
>>>> +            ret = psp_v11_0_memory_training_send_msg(psp, 
>>>> PSP_BL__DRAM_LONG_TRAIN);
>>>> +            if (ret) {
>>>> +                DRM_ERROR("Send long training msg failed.\n");
>>>> +                vfree(buf);
>>>> +                drm_dev_exit(idx);
>>>> +                return ret;
>>>> +            }
>>>> +
>>>> +            memcpy_toio(adev->mman.aper_base_kaddr, buf, sz);
>>>> +            adev->hdp.funcs->flush_hdp(adev, NULL);
>>>>               vfree(buf);
>>>> -            return ret;
>>>> +            drm_dev_exit(idx);
>>>> +        } else {
>>>> +            vfree(buf);
>>>> +            return -ENODEV;
>>>>           }
>>>> -
>>>> -        memcpy_toio(adev->mman.aper_base_kaddr, buf, sz);
>>>> -        adev->hdp.funcs->flush_hdp(adev, NULL);
>>>> -        vfree(buf);
>>>>       }
>>>>       if (ops & PSP_MEM_TRAIN_SAVE) {
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c 
>>>> b/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
>>>> index c4828bd3264b..618e5b6b85d9 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
>>>> @@ -138,10 +138,8 @@ static int 
>>>> psp_v12_0_bootloader_load_sysdrv(struct psp_context *psp)
>>>>       if (ret)
>>>>           return ret;
>>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>> -
>>>>       /* Copy PSP System Driver binary to memory */
>>>> -    memcpy(psp->fw_pri_buf, psp->sys_start_addr, psp->sys_bin_size);
>>>> +    psp_copy_fw(psp, psp->sys_start_addr, psp->sys_bin_size);
>>>>       /* Provide the sys driver to bootloader */
>>>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>> @@ -179,10 +177,8 @@ static int 
>>>> psp_v12_0_bootloader_load_sos(struct psp_context *psp)
>>>>       if (ret)
>>>>           return ret;
>>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>> -
>>>>       /* Copy Secure OS binary to PSP memory */
>>>> -    memcpy(psp->fw_pri_buf, psp->sos_start_addr, psp->sos_bin_size);
>>>> +    psp_copy_fw(psp, psp->sos_start_addr, psp->sos_bin_size);
>>>>       /* Provide the PSP secure OS to bootloader */
>>>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c 
>>>> b/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
>>>> index f2e725f72d2f..d0a6cccd0897 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
>>>> @@ -102,10 +102,8 @@ static int 
>>>> psp_v3_1_bootloader_load_sysdrv(struct psp_context *psp)
>>>>       if (ret)
>>>>           return ret;
>>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>> -
>>>>       /* Copy PSP System Driver binary to memory */
>>>> -    memcpy(psp->fw_pri_buf, psp->sys_start_addr, psp->sys_bin_size);
>>>> +    psp_copy_fw(psp, psp->sys_start_addr, psp->sys_bin_size);
>>>>       /* Provide the sys driver to bootloader */
>>>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>> @@ -143,10 +141,8 @@ static int psp_v3_1_bootloader_load_sos(struct 
>>>> psp_context *psp)
>>>>       if (ret)
>>>>           return ret;
>>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>> -
>>>>       /* Copy Secure OS binary to PSP memory */
>>>> -    memcpy(psp->fw_pri_buf, psp->sos_start_addr, psp->sos_bin_size);
>>>> +    psp_copy_fw(psp, psp->sos_start_addr, psp->sos_bin_size);
>>>>       /* Provide the PSP secure OS to bootloader */
>>>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c 
>>>> b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
>>>> index 8e238dea7bef..90910d19db12 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
>>>> @@ -25,6 +25,7 @@
>>>>    */
>>>>   #include <linux/firmware.h>
>>>> +#include <drm/drm_drv.h>
>>>>   #include "amdgpu.h"
>>>>   #include "amdgpu_vce.h"
>>>> @@ -555,16 +556,19 @@ static int vce_v4_0_hw_fini(void *handle)
>>>>   static int vce_v4_0_suspend(void *handle)
>>>>   {
>>>>       struct amdgpu_device *adev = (struct amdgpu_device *)handle;
>>>> -    int r;
>>>> +    int r, idx;
>>>>       if (adev->vce.vcpu_bo == NULL)
>>>>           return 0;
>>>> -    if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) {
>>>> -        unsigned size = amdgpu_bo_size(adev->vce.vcpu_bo);
>>>> -        void *ptr = adev->vce.cpu_addr;
>>>> +    if (drm_dev_enter(&adev->ddev, &idx)) {
>>>> +        if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) {
>>>> +            unsigned size = amdgpu_bo_size(adev->vce.vcpu_bo);
>>>> +            void *ptr = adev->vce.cpu_addr;
>>>> -        memcpy_fromio(adev->vce.saved_bo, ptr, size);
>>>> +            memcpy_fromio(adev->vce.saved_bo, ptr, size);
>>>> +        }
>>>> +        drm_dev_exit(idx);
>>>>       }
>>>>       r = vce_v4_0_hw_fini(adev);
>>>> @@ -577,16 +581,20 @@ static int vce_v4_0_suspend(void *handle)
>>>>   static int vce_v4_0_resume(void *handle)
>>>>   {
>>>>       struct amdgpu_device *adev = (struct amdgpu_device *)handle;
>>>> -    int r;
>>>> +    int r, idx;
>>>>       if (adev->vce.vcpu_bo == NULL)
>>>>           return -EINVAL;
>>>>       if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) {
>>>> -        unsigned size = amdgpu_bo_size(adev->vce.vcpu_bo);
>>>> -        void *ptr = adev->vce.cpu_addr;
>>>> -        memcpy_toio(ptr, adev->vce.saved_bo, size);
>>>> +        if (drm_dev_enter(&adev->ddev, &idx)) {
>>>> +            unsigned size = amdgpu_bo_size(adev->vce.vcpu_bo);
>>>> +            void *ptr = adev->vce.cpu_addr;
>>>> +
>>>> +            memcpy_toio(ptr, adev->vce.saved_bo, size);
>>>> +            drm_dev_exit(idx);
>>>> +        }
>>>>       } else {
>>>>           r = amdgpu_vce_resume(adev);
>>>>           if (r)
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c 
>>>> b/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
>>>> index 3f15bf34123a..df34be8ec82d 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
>>>> @@ -34,6 +34,8 @@
>>>>   #include "vcn/vcn_3_0_0_sh_mask.h"
>>>>   #include "ivsrcid/vcn/irqsrcs_vcn_2_0.h"
>>>> +#include <drm/drm_drv.h>
>>>> +
>>>>   #define mmUVD_CONTEXT_ID_INTERNAL_OFFSET            0x27
>>>>   #define mmUVD_GPCOM_VCPU_CMD_INTERNAL_OFFSET 0x0f
>>>>   #define mmUVD_GPCOM_VCPU_DATA0_INTERNAL_OFFSET 0x10
>>>> @@ -268,16 +270,20 @@ static int vcn_v3_0_sw_init(void *handle)
>>>>   static int vcn_v3_0_sw_fini(void *handle)
>>>>   {
>>>>       struct amdgpu_device *adev = (struct amdgpu_device *)handle;
>>>> -    int i, r;
>>>> +    int i, r, idx;
>>>> -    for (i = 0; i < adev->vcn.num_vcn_inst; i++) {
>>>> -        volatile struct amdgpu_fw_shared *fw_shared;
>>>> +    if (drm_dev_enter(&adev->ddev, &idx)) {
>>>> +        for (i = 0; i < adev->vcn.num_vcn_inst; i++) {
>>>> +            volatile struct amdgpu_fw_shared *fw_shared;
>>>> -        if (adev->vcn.harvest_config & (1 << i))
>>>> -            continue;
>>>> -        fw_shared = adev->vcn.inst[i].fw_shared_cpu_addr;
>>>> -        fw_shared->present_flag_0 = 0;
>>>> -        fw_shared->sw_ring.is_enabled = false;
>>>> +            if (adev->vcn.harvest_config & (1 << i))
>>>> +                continue;
>>>> +            fw_shared = adev->vcn.inst[i].fw_shared_cpu_addr;
>>>> +            fw_shared->present_flag_0 = 0;
>>>> +            fw_shared->sw_ring.is_enabled = false;
>>>> +        }
>>>> +
>>>> +        drm_dev_exit(idx);
>>>>       }
>>>>       if (amdgpu_sriov_vf(adev))
>>>> diff --git a/drivers/gpu/drm/amd/pm/powerplay/smumgr/smu7_smumgr.c 
>>>> b/drivers/gpu/drm/amd/pm/powerplay/smumgr/smu7_smumgr.c
>>>> index aae25243eb10..d628b91846c9 100644
>>>> --- a/drivers/gpu/drm/amd/pm/powerplay/smumgr/smu7_smumgr.c
>>>> +++ b/drivers/gpu/drm/amd/pm/powerplay/smumgr/smu7_smumgr.c
>>>> @@ -405,6 +405,8 @@ int smu7_request_smu_load_fw(struct pp_hwmgr 
>>>> *hwmgr)
>>>>                   UCODE_ID_MEC_STORAGE, 
>>>> &toc->entry[toc->num_entries++]),
>>>>                   "Failed to Get Firmware Entry.", r = -EINVAL; 
>>>> goto failed);
>>>>       }
>>>> +
>>>> +    /* AG TODO Can't call drm_dev_enter/exit because access 
>>>> adev->ddev here ... */
>>>>       memcpy_toio(smu_data->header_buffer.kaddr, smu_data->toc,
>>>>               sizeof(struct SMU_DRAMData_TOC));
>>>>       smum_send_msg_to_smc_with_parameter(hwmgr,
>>>
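
The guard discussed in this message, wrapping each copy to or from device
memory in a drm_dev_enter/drm_dev_exit section and bailing out when the device
is already gone, can be modelled in plain userspace C. This is only a sketch
under stated assumptions: mock_drm_device, mock_dev_enter, mock_dev_exit and
guarded_copy_to_device are hypothetical names invented for illustration, the
flag-and-counter bookkeeping stands in for the SRCU-based implementation in the
kernel, and the real drm_dev_exit() takes only the index.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace stand-in for struct drm_device. */
struct mock_drm_device {
    bool unplugged;       /* set once the device has been removed */
    int active_sections;  /* number of currently open guarded sections */
};

/* Models drm_dev_enter(): refuses entry once the device is unplugged. */
static bool mock_dev_enter(struct mock_drm_device *dev, int *idx)
{
    if (dev->unplugged)
        return false;
    *idx = dev->active_sections++;
    return true;
}

/* Models drm_dev_exit(); unlike the kernel call it also takes the device. */
static void mock_dev_exit(struct mock_drm_device *dev, int idx)
{
    (void)idx;
    dev->active_sections--;
}

/*
 * Early-return form of the guard (the "!drm_dev_enter" style from the v6
 * changelog): the copy only happens while the device is still present.
 */
static int guarded_copy_to_device(struct mock_drm_device *dev,
                                  void *dst, const void *src, size_t n)
{
    int idx;

    if (!mock_dev_enter(dev, &idx))
        return -ENODEV;

    memcpy(dst, src, n);  /* stands in for memcpy_toio() */

    mock_dev_exit(dev, idx);
    return 0;
}
```

With a present device the copy goes through and the section counter returns to
zero; after unplugged is set, the same call returns -ENODEV and leaves the
destination untouched.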



* Re: [PATCH v6 10/16] drm/amdgpu: Guard against write accesses after device removal
@ 2021-05-12 14:06           ` Christian König
  0 siblings, 0 replies; 126+ messages in thread
From: Christian König @ 2021-05-12 14:06 UTC (permalink / raw)
  To: Andrey Grodzovsky, dri-devel, amd-gfx, linux-pci, daniel.vetter,
	Harry.Wentland
  Cc: Alexander.Deucher, gregkh, ppaalanen, helgaas, Felix.Kuehling

Am 12.05.21 um 16:01 schrieb Andrey Grodzovsky:
> Ping - need a confirmation it's ok to keep this as a single patch given
> my explanation below.

It was just a suggestion. The key point is that the approach sounds sane to me, 
but I can't say much about the psp code for example.

So the most I can give you is an Acked-by for that.

Christian.

>
> Andrey
>
> On 2021-05-11 1:52 p.m., Andrey Grodzovsky wrote:
>>
>>
>> On 2021-05-11 2:50 a.m., Christian König wrote:
>>> Am 10.05.21 um 18:36 schrieb Andrey Grodzovsky:
>>>> This should prevent writing to memory or IO ranges possibly
>>>> already allocated for other uses after our device is removed.
>>>>
>>>> v5:
>>>> Protect more places where memcpy_to/from_io takes place
>>>> Protect IB submissions
>>>>
>>>> v6: Switch to !drm_dev_enter instead of scoping entire code
>>>> with brackets.
>>>>
>>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>>>> ---
>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c    | 11 ++-
>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c       |  9 +++
>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c        | 17 +++--
>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c       | 63 +++++++++++------
>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h       |  2 +
>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c      | 70 
>>>> +++++++++++++++++++
>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h      | 49 ++-----------
>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c       | 31 +++++---
>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c       | 11 ++-
>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c       | 22 ++++--
>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c        |  7 +-
>>>>   drivers/gpu/drm/amd/amdgpu/psp_v11_0.c        | 44 ++++++------
>>>>   drivers/gpu/drm/amd/amdgpu/psp_v12_0.c        |  8 +--
>>>>   drivers/gpu/drm/amd/amdgpu/psp_v3_1.c         |  8 +--
>>>>   drivers/gpu/drm/amd/amdgpu/vce_v4_0.c         | 26 ++++---
>>>>   drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c         | 22 +++---
>>>>   .../drm/amd/pm/powerplay/smumgr/smu7_smumgr.c |  2 +
>>>>   17 files changed, 257 insertions(+), 145 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>> index a0bff4713672..94c415176cdc 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>> @@ -71,6 +71,8 @@
>>>>   #include <drm/task_barrier.h>
>>>>   #include <linux/pm_runtime.h>
>>>> +#include <drm/drm_drv.h>
>>>> +
>>>>   MODULE_FIRMWARE("amdgpu/vega10_gpu_info.bin");
>>>>   MODULE_FIRMWARE("amdgpu/vega12_gpu_info.bin");
>>>>   MODULE_FIRMWARE("amdgpu/raven_gpu_info.bin");
>>>> @@ -281,7 +283,10 @@ void amdgpu_device_vram_access(struct 
>>>> amdgpu_device *adev, loff_t pos,
>>>>       unsigned long flags;
>>>>       uint32_t hi = ~0;
>>>>       uint64_t last;
>>>> +    int idx;
>>>> +     if (!drm_dev_enter(&adev->ddev, &idx))
>>>> +         return;
>>>>   #ifdef CONFIG_64BIT
>>>>       last = min(pos + size, adev->gmc.visible_vram_size);
>>>> @@ -299,8 +304,10 @@ void amdgpu_device_vram_access(struct 
>>>> amdgpu_device *adev, loff_t pos,
>>>>               memcpy_fromio(buf, addr, count);
>>>>           }
>>>> -        if (count == size)
>>>> +        if (count == size) {
>>>> +            drm_dev_exit(idx);
>>>>               return;
>>>> +        }
>>>
>>> Maybe use a goto instead, but really just a nit pick.
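
The goto suggested here could look roughly like the userspace sketch below:
every path that has entered the guarded section leaves through one exit label,
so the matching exit call is written only once. The names mock_device,
mock_dev_enter, mock_dev_exit and mock_vram_read are hypothetical and the
function only loosely mirrors the shape of amdgpu_device_vram_access (a fast
copy through the visible window, then a slower fallback); it is not the driver
code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace model of a device with a partly visible VRAM. */
struct mock_device {
    bool unplugged;
    int enter_depth;            /* +1 on enter, -1 on exit */
    const unsigned char *vram;  /* backing store standing in for VRAM */
    size_t visible_size;        /* bytes reachable through the fast path */
    size_t total_size;
};

static bool mock_dev_enter(struct mock_device *dev)
{
    if (dev->unplugged)
        return false;
    dev->enter_depth++;
    return true;
}

static void mock_dev_exit(struct mock_device *dev)
{
    dev->enter_depth--;
}

/* Returns the number of bytes copied into buf. */
static size_t mock_vram_read(struct mock_device *dev, size_t pos,
                             unsigned char *buf, size_t size)
{
    size_t copied = 0;

    if (!mock_dev_enter(dev))
        return 0;

    /* Reject out-of-range requests, still leaving through the exit label. */
    if (size > dev->total_size || pos > dev->total_size - size)
        goto exit;

    if (pos < dev->visible_size) {
        size_t count = dev->visible_size - pos;

        if (count > size)
            count = size;
        memcpy(buf, dev->vram + pos, count);  /* fast path */
        copied = count;
        if (count == size)
            goto exit;  /* request fully served by the fast path */
    }

    /* Slow path, standing in for the indexed register reads. */
    for (size_t i = copied; i < size; i++)
        buf[i] = dev->vram[pos + i];
    copied = size;

exit:
    mock_dev_exit(dev);
    return copied;
}
```

Compared with calling the exit function before each early return, the single
label keeps enter and exit visibly paired when more return paths are added
later.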
>>>
>>>
>>>
>>>>           pos += count;
>>>>           buf += count / 4;
>>>> @@ -323,6 +330,8 @@ void amdgpu_device_vram_access(struct 
>>>> amdgpu_device *adev, loff_t pos,
>>>>               *buf++ = RREG32_NO_KIQ(mmMM_DATA);
>>>>       }
>>>>       spin_unlock_irqrestore(&adev->mmio_idx_lock, flags);
>>>> +
>>>> +    drm_dev_exit(idx);
>>>>   }
>>>>   /*
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c 
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
>>>> index 4d32233cde92..04ba5eef1e88 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
>>>> @@ -31,6 +31,8 @@
>>>>   #include "amdgpu_ras.h"
>>>>   #include "amdgpu_xgmi.h"
>>>> +#include <drm/drm_drv.h>
>>>> +
>>>>   /**
>>>>    * amdgpu_gmc_pdb0_alloc - allocate vram for pdb0
>>>>    *
>>>> @@ -151,6 +153,10 @@ int amdgpu_gmc_set_pte_pde(struct 
>>>> amdgpu_device *adev, void *cpu_pt_addr,
>>>>   {
>>>>       void __iomem *ptr = (void *)cpu_pt_addr;
>>>>       uint64_t value;
>>>> +    int idx;
>>>> +
>>>> +    if (!drm_dev_enter(&adev->ddev, &idx))
>>>> +        return 0;
>>>>       /*
>>>>        * The following is for PTE only. GART does not have PDEs.
>>>> @@ -158,6 +164,9 @@ int amdgpu_gmc_set_pte_pde(struct amdgpu_device 
>>>> *adev, void *cpu_pt_addr,
>>>>       value = addr & 0x0000FFFFFFFFF000ULL;
>>>>       value |= flags;
>>>>       writeq(value, ptr + (gpu_page_idx * 8));
>>>> +
>>>> +    drm_dev_exit(idx);
>>>> +
>>>>       return 0;
>>>>   }
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c 
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
>>>> index 148a3b481b12..62fcbd446c71 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
>>>> @@ -30,6 +30,7 @@
>>>>   #include <linux/slab.h>
>>>>   #include <drm/amdgpu_drm.h>
>>>> +#include <drm/drm_drv.h>
>>>>   #include "amdgpu.h"
>>>>   #include "atom.h"
>>>> @@ -137,7 +138,7 @@ int amdgpu_ib_schedule(struct amdgpu_ring 
>>>> *ring, unsigned num_ibs,
>>>>       bool secure;
>>>>       unsigned i;
>>>> -    int r = 0;
>>>> +    int idx, r = 0;
>>>>       bool need_pipe_sync = false;
>>>>       if (num_ibs == 0)
>>>> @@ -169,13 +170,16 @@ int amdgpu_ib_schedule(struct amdgpu_ring 
>>>> *ring, unsigned num_ibs,
>>>>           return -EINVAL;
>>>>       }
>>>> +    if (!drm_dev_enter(&adev->ddev, &idx))
>>>> +        return -ENODEV;
>>>> +
>>>>       alloc_size = ring->funcs->emit_frame_size + num_ibs *
>>>>           ring->funcs->emit_ib_size;
>>>>       r = amdgpu_ring_alloc(ring, alloc_size);
>>>>       if (r) {
>>>>           dev_err(adev->dev, "scheduling IB failed (%d).\n", r);
>>>> -        return r;
>>>> +        goto exit;
>>>>       }
>>>>       need_ctx_switch = ring->current_ctx != fence_ctx;
>>>> @@ -205,7 +209,7 @@ int amdgpu_ib_schedule(struct amdgpu_ring 
>>>> *ring, unsigned num_ibs,
>>>>           r = amdgpu_vm_flush(ring, job, need_pipe_sync);
>>>>           if (r) {
>>>>               amdgpu_ring_undo(ring);
>>>> -            return r;
>>>> +            goto exit;
>>>>           }
>>>>       }
>>>> @@ -286,7 +290,7 @@ int amdgpu_ib_schedule(struct amdgpu_ring 
>>>> *ring, unsigned num_ibs,
>>>>           if (job && job->vmid)
>>>>               amdgpu_vmid_reset(adev, ring->funcs->vmhub, job->vmid);
>>>>           amdgpu_ring_undo(ring);
>>>> -        return r;
>>>> +        goto exit;
>>>>       }
>>>>       if (ring->funcs->insert_end)
>>>> @@ -304,7 +308,10 @@ int amdgpu_ib_schedule(struct amdgpu_ring 
>>>> *ring, unsigned num_ibs,
>>>>           ring->funcs->emit_wave_limit(ring, false);
>>>>       amdgpu_ring_commit(ring);
>>>> -    return 0;
>>>> +
>>>> +exit:
>>>> +    drm_dev_exit(idx);
>>>> +    return r;
>>>>   }
>>>>   /**
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c 
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
>>>> index 9e769cf6095b..bb6afee61666 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
>>>> @@ -25,6 +25,7 @@
>>>>   #include <linux/firmware.h>
>>>>   #include <linux/dma-mapping.h>
>>>> +#include <drm/drm_drv.h>
>>>>   #include "amdgpu.h"
>>>>   #include "amdgpu_psp.h"
>>>> @@ -39,6 +40,8 @@
>>>>   #include "amdgpu_ras.h"
>>>>   #include "amdgpu_securedisplay.h"
>>>> +#include <drm/drm_drv.h>
>>>> +
>>>>   static int psp_sysfs_init(struct amdgpu_device *adev);
>>>>   static void psp_sysfs_fini(struct amdgpu_device *adev);
>>>> @@ -253,7 +256,7 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>>>              struct psp_gfx_cmd_resp *cmd, uint64_t fence_mc_addr)
>>>>   {
>>>>       int ret;
>>>> -    int index;
>>>> +    int index, idx;
>>>>       int timeout = 20000;
>>>>       bool ras_intr = false;
>>>>       bool skip_unsupport = false;
>>>> @@ -261,6 +264,9 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>>>       if (psp->adev->in_pci_err_recovery)
>>>>           return 0;
>>>> +    if (!drm_dev_enter(&psp->adev->ddev, &idx))
>>>> +        return 0;
>>>> +
>>>>       mutex_lock(&psp->mutex);
>>>>       memset(psp->cmd_buf_mem, 0, PSP_CMD_BUFFER_SIZE);
>>>> @@ -271,8 +277,7 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>>>       ret = psp_ring_cmd_submit(psp, psp->cmd_buf_mc_addr, 
>>>> fence_mc_addr, index);
>>>>       if (ret) {
>>>>           atomic_dec(&psp->fence_value);
>>>> -        mutex_unlock(&psp->mutex);
>>>> -        return ret;
>>>> +        goto exit;
>>>>       }
>>>>       amdgpu_asic_invalidate_hdp(psp->adev, NULL);
>>>> @@ -312,8 +317,8 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>>>                psp->cmd_buf_mem->cmd_id,
>>>>                psp->cmd_buf_mem->resp.status);
>>>>           if (!timeout) {
>>>> -            mutex_unlock(&psp->mutex);
>>>> -            return -EINVAL;
>>>> +            ret = -EINVAL;
>>>> +            goto exit;
>>>>           }
>>>>       }
>>>> @@ -321,8 +326,10 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>>>           ucode->tmr_mc_addr_lo = psp->cmd_buf_mem->resp.fw_addr_lo;
>>>>           ucode->tmr_mc_addr_hi = psp->cmd_buf_mem->resp.fw_addr_hi;
>>>>       }
>>>> -    mutex_unlock(&psp->mutex);
>>>> +exit:
>>>> +    mutex_unlock(&psp->mutex);
>>>> +    drm_dev_exit(idx);
>>>>       return ret;
>>>>   }
>>>> @@ -359,8 +366,7 @@ static int psp_load_toc(struct psp_context *psp,
>>>>       if (!cmd)
>>>>           return -ENOMEM;
>>>>       /* Copy toc to psp firmware private buffer */
>>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>> -    memcpy(psp->fw_pri_buf, psp->toc_start_addr, psp->toc_bin_size);
>>>> +    psp_copy_fw(psp, psp->toc_start_addr, psp->toc_bin_size);
>>>>       psp_prep_load_toc_cmd_buf(cmd, psp->fw_pri_mc_addr, 
>>>> psp->toc_bin_size);
>>>> @@ -625,8 +631,7 @@ static int psp_asd_load(struct psp_context *psp)
>>>>       if (!cmd)
>>>>           return -ENOMEM;
>>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>> -    memcpy(psp->fw_pri_buf, psp->asd_start_addr, 
>>>> psp->asd_ucode_size);
>>>> +    psp_copy_fw(psp, psp->asd_start_addr, psp->asd_ucode_size);
>>>>       psp_prep_asd_load_cmd_buf(cmd, psp->fw_pri_mc_addr,
>>>>                     psp->asd_ucode_size);
>>>> @@ -781,8 +786,7 @@ static int psp_xgmi_load(struct psp_context *psp)
>>>>       if (!cmd)
>>>>           return -ENOMEM;
>>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>> -    memcpy(psp->fw_pri_buf, psp->ta_xgmi_start_addr, 
>>>> psp->ta_xgmi_ucode_size);
>>>> +    psp_copy_fw(psp, psp->ta_xgmi_start_addr, 
>>>> psp->ta_xgmi_ucode_size);
>>>>       psp_prep_ta_load_cmd_buf(cmd,
>>>>                    psp->fw_pri_mc_addr,
>>>> @@ -1038,8 +1042,7 @@ static int psp_ras_load(struct psp_context *psp)
>>>>       if (!cmd)
>>>>           return -ENOMEM;
>>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>> -    memcpy(psp->fw_pri_buf, psp->ta_ras_start_addr, 
>>>> psp->ta_ras_ucode_size);
>>>> +    psp_copy_fw(psp, psp->ta_ras_start_addr, psp->ta_ras_ucode_size);
>>>>       psp_prep_ta_load_cmd_buf(cmd,
>>>>                    psp->fw_pri_mc_addr,
>>>> @@ -1275,8 +1278,7 @@ static int psp_hdcp_load(struct psp_context 
>>>> *psp)
>>>>       if (!cmd)
>>>>           return -ENOMEM;
>>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>> -    memcpy(psp->fw_pri_buf, psp->ta_hdcp_start_addr,
>>>> +    psp_copy_fw(psp, psp->ta_hdcp_start_addr,
>>>>              psp->ta_hdcp_ucode_size);
>>>>       psp_prep_ta_load_cmd_buf(cmd,
>>>> @@ -1427,8 +1429,7 @@ static int psp_dtm_load(struct psp_context *psp)
>>>>       if (!cmd)
>>>>           return -ENOMEM;
>>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>> -    memcpy(psp->fw_pri_buf, psp->ta_dtm_start_addr, 
>>>> psp->ta_dtm_ucode_size);
>>>> +    psp_copy_fw(psp, psp->ta_dtm_start_addr, psp->ta_dtm_ucode_size);
>>>>       psp_prep_ta_load_cmd_buf(cmd,
>>>>                    psp->fw_pri_mc_addr,
>>>> @@ -1573,8 +1574,7 @@ static int psp_rap_load(struct psp_context *psp)
>>>>       if (!cmd)
>>>>           return -ENOMEM;
>>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>> -    memcpy(psp->fw_pri_buf, psp->ta_rap_start_addr, 
>>>> psp->ta_rap_ucode_size);
>>>> +    psp_copy_fw(psp, psp->ta_rap_start_addr, psp->ta_rap_ucode_size);
>>>>       psp_prep_ta_load_cmd_buf(cmd,
>>>>                    psp->fw_pri_mc_addr,
>>>> @@ -3022,7 +3022,7 @@ static ssize_t 
>>>> psp_usbc_pd_fw_sysfs_write(struct device *dev,
>>>>       struct amdgpu_device *adev = drm_to_adev(ddev);
>>>>       void *cpu_addr;
>>>>       dma_addr_t dma_addr;
>>>> -    int ret;
>>>> +    int ret, idx;
>>>>       char fw_name[100];
>>>>       const struct firmware *usbc_pd_fw;
>>>> @@ -3031,6 +3031,9 @@ static ssize_t 
>>>> psp_usbc_pd_fw_sysfs_write(struct device *dev,
>>>>           return -EBUSY;
>>>>       }
>>>> +    if (!drm_dev_enter(ddev, &idx))
>>>> +        return -ENODEV;
>>>> +
>>>>       snprintf(fw_name, sizeof(fw_name), "amdgpu/%s", buf);
>>>>       ret = request_firmware(&usbc_pd_fw, fw_name, adev->dev);
>>>>       if (ret)
>>>> @@ -3062,16 +3065,30 @@ static ssize_t 
>>>> psp_usbc_pd_fw_sysfs_write(struct device *dev,
>>>>   rel_buf:
>>>>       dma_free_coherent(adev->dev, usbc_pd_fw->size, cpu_addr, 
>>>> dma_addr);
>>>>       release_firmware(usbc_pd_fw);
>>>> -
>>>>   fail:
>>>>       if (ret) {
>>>>           DRM_ERROR("Failed to load USBC PD FW, err = %d", ret);
>>>> -        return ret;
>>>> +        count = ret;
>>>>       }
>>>> +    drm_dev_exit(idx);
>>>>       return count;
>>>>   }
>>>> +void psp_copy_fw(struct psp_context *psp, uint8_t *start_addr, 
>>>> uint32_t bin_size)
>>>> +{
>>>> +    int idx;
>>>> +
>>>> +    if (!drm_dev_enter(&psp->adev->ddev, &idx))
>>>> +        return;
>>>> +
>>>> +    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>> +    memcpy(psp->fw_pri_buf, start_addr, bin_size);
>>>> +
>>>> +    drm_dev_exit(idx);
>>>> +}
>>>> +
>>>> +
>>>>   static DEVICE_ATTR(usbc_pd_fw, S_IRUGO | S_IWUSR,
>>>>              psp_usbc_pd_fw_sysfs_read,
>>>>              psp_usbc_pd_fw_sysfs_write);
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h 
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
>>>> index 46a5328e00e0..2bfdc278817f 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
>>>> @@ -423,4 +423,6 @@ int psp_get_fw_attestation_records_addr(struct 
>>>> psp_context *psp,
>>>>   int psp_load_fw_list(struct psp_context *psp,
>>>>                struct amdgpu_firmware_info **ucode_list, int 
>>>> ucode_count);
>>>> +void psp_copy_fw(struct psp_context *psp, uint8_t *start_addr, 
>>>> uint32_t bin_size);
>>>> +
>>>>   #endif
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c 
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>>>> index 688624ebe421..e1985bc34436 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>>>> @@ -35,6 +35,8 @@
>>>>   #include "amdgpu.h"
>>>>   #include "atom.h"
>>>> +#include <drm/drm_drv.h>
>>>> +
>>>>   /*
>>>>    * Rings
>>>>    * Most engines on the GPU are fed via ring buffers.  Ring
>>>> @@ -461,3 +463,71 @@ int amdgpu_ring_test_helper(struct amdgpu_ring 
>>>> *ring)
>>>>       ring->sched.ready = !r;
>>>>       return r;
>>>>   }
>>>> +
>>>> +void amdgpu_ring_clear_ring(struct amdgpu_ring *ring)
>>>> +{
>>>> +    int idx;
>>>> +    int i = 0;
>>>> +
>>>> +    if (!drm_dev_enter(&ring->adev->ddev, &idx))
>>>> +        return;
>>>> +
>>>> +    while (i <= ring->buf_mask)
>>>> +        ring->ring[i++] = ring->funcs->nop;
>>>> +
>>>> +    drm_dev_exit(idx);
>>>> +
>>>> +}
>>>> +
>>>> +void amdgpu_ring_write(struct amdgpu_ring *ring, uint32_t v)
>>>> +{
>>>> +    int idx;
>>>> +
>>>> +    if (!drm_dev_enter(&ring->adev->ddev, &idx))
>>>> +        return;
>>>> +
>>>> +    if (ring->count_dw <= 0)
>>>> +        DRM_ERROR("amdgpu: writing more dwords to the ring than 
>>>> expected!\n");
>>>> +    ring->ring[ring->wptr++ & ring->buf_mask] = v;
>>>> +    ring->wptr &= ring->ptr_mask;
>>>> +    ring->count_dw--;
>>>> +
>>>> +    drm_dev_exit(idx);
>>>> +}
>>>> +
>>>> +void amdgpu_ring_write_multiple(struct amdgpu_ring *ring,
>>>> +                          void *src, int count_dw)
>>>> +{
>>>> +    unsigned occupied, chunk1, chunk2;
>>>> +    void *dst;
>>>> +    int idx;
>>>> +
>>>> +    if (!drm_dev_enter(&ring->adev->ddev, &idx))
>>>> +        return;
>>>> +
>>>> +    if (unlikely(ring->count_dw < count_dw))
>>>> +        DRM_ERROR("amdgpu: writing more dwords to the ring than 
>>>> expected!\n");
>>>> +
>>>> +    occupied = ring->wptr & ring->buf_mask;
>>>> +    dst = (void *)&ring->ring[occupied];
>>>> +    chunk1 = ring->buf_mask + 1 - occupied;
>>>> +    chunk1 = (chunk1 >= count_dw) ? count_dw: chunk1;
>>>> +    chunk2 = count_dw - chunk1;
>>>> +    chunk1 <<= 2;
>>>> +    chunk2 <<= 2;
>>>> +
>>>> +    if (chunk1)
>>>> +        memcpy(dst, src, chunk1);
>>>> +
>>>> +    if (chunk2) {
>>>> +        src += chunk1;
>>>> +        dst = (void *)ring->ring;
>>>> +        memcpy(dst, src, chunk2);
>>>> +    }
>>>> +
>>>> +    ring->wptr += count_dw;
>>>> +    ring->wptr &= ring->ptr_mask;
>>>> +    ring->count_dw -= count_dw;
>>>> +
>>>> +    drm_dev_exit(idx);
>>>> +}
>>>
>>> The ring should never be in MMIO memory, so you can completely drop 
>>> that as far as I can see.
>>
>> Yeah, it's all in GART, I missed it for some reason...
>>>
>>> Maybe split that patch by use case so that we can more easily 
>>> review/ack it.
>>
>> In fact everything here is the same use case. Once I added unmapping of
>> all MMIO ranges (both registers and VRAM) I got a lot of page faults
>> on device removal around any memcpy to/from IO. That is where I put the
>> drm_dev_enter/exit scope. I also searched the code and preemptively
>> added guards to any other such place. I did drop amdgpu_schedule_ib
>> from this patch because it had dma_fence_wait inside, so we
>> will take care of it once we decide how to handle dma_fence waits.
>>
>> Andrey
>>
>>>
>>> Thanks,
>>> Christian.
>>>
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h 
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>>>> index e7d3d0dbdd96..c67bc6d3d039 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>>>> @@ -299,53 +299,12 @@ static inline void 
>>>> amdgpu_ring_set_preempt_cond_exec(struct amdgpu_ring *ring,
>>>>       *ring->cond_exe_cpu_addr = cond_exec;
>>>>   }
>>>> -static inline void amdgpu_ring_clear_ring(struct amdgpu_ring *ring)
>>>> -{
>>>> -    int i = 0;
>>>> -    while (i <= ring->buf_mask)
>>>> -        ring->ring[i++] = ring->funcs->nop;
>>>> -
>>>> -}
>>>> -
>>>> -static inline void amdgpu_ring_write(struct amdgpu_ring *ring, 
>>>> uint32_t v)
>>>> -{
>>>> -    if (ring->count_dw <= 0)
>>>> -        DRM_ERROR("amdgpu: writing more dwords to the ring than 
>>>> expected!\n");
>>>> -    ring->ring[ring->wptr++ & ring->buf_mask] = v;
>>>> -    ring->wptr &= ring->ptr_mask;
>>>> -    ring->count_dw--;
>>>> -}
>>>> +void amdgpu_ring_clear_ring(struct amdgpu_ring *ring);
>>>> -static inline void amdgpu_ring_write_multiple(struct amdgpu_ring 
>>>> *ring,
>>>> -                          void *src, int count_dw)
>>>> -{
>>>> -    unsigned occupied, chunk1, chunk2;
>>>> -    void *dst;
>>>> -
>>>> -    if (unlikely(ring->count_dw < count_dw))
>>>> -        DRM_ERROR("amdgpu: writing more dwords to the ring than 
>>>> expected!\n");
>>>> -
>>>> -    occupied = ring->wptr & ring->buf_mask;
>>>> -    dst = (void *)&ring->ring[occupied];
>>>> -    chunk1 = ring->buf_mask + 1 - occupied;
>>>> -    chunk1 = (chunk1 >= count_dw) ? count_dw: chunk1;
>>>> -    chunk2 = count_dw - chunk1;
>>>> -    chunk1 <<= 2;
>>>> -    chunk2 <<= 2;
>>>> +void amdgpu_ring_write(struct amdgpu_ring *ring, uint32_t v);
>>>> -    if (chunk1)
>>>> -        memcpy(dst, src, chunk1);
>>>> -
>>>> -    if (chunk2) {
>>>> -        src += chunk1;
>>>> -        dst = (void *)ring->ring;
>>>> -        memcpy(dst, src, chunk2);
>>>> -    }
>>>> -
>>>> -    ring->wptr += count_dw;
>>>> -    ring->wptr &= ring->ptr_mask;
>>>> -    ring->count_dw -= count_dw;
>>>> -}
>>>> +void amdgpu_ring_write_multiple(struct amdgpu_ring *ring,
>>>> +                          void *src, int count_dw);
>>>>   int amdgpu_ring_test_helper(struct amdgpu_ring *ring);
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c 
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
>>>> index c6dbc0801604..82f0542c7792 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
>>>> @@ -32,6 +32,7 @@
>>>>   #include <linux/module.h>
>>>>   #include <drm/drm.h>
>>>> +#include <drm/drm_drv.h>
>>>>   #include "amdgpu.h"
>>>>   #include "amdgpu_pm.h"
>>>> @@ -375,7 +376,7 @@ int amdgpu_uvd_suspend(struct amdgpu_device *adev)
>>>>   {
>>>>       unsigned size;
>>>>       void *ptr;
>>>> -    int i, j;
>>>> +    int i, j, idx;
>>>>       bool in_ras_intr = amdgpu_ras_intr_triggered();
>>>>       cancel_delayed_work_sync(&adev->uvd.idle_work);
>>>> @@ -403,11 +404,15 @@ int amdgpu_uvd_suspend(struct amdgpu_device 
>>>> *adev)
>>>>           if (!adev->uvd.inst[j].saved_bo)
>>>>               return -ENOMEM;
>>>> -        /* re-write 0 since err_event_athub will corrupt VCPU 
>>>> buffer */
>>>> -        if (in_ras_intr)
>>>> -            memset(adev->uvd.inst[j].saved_bo, 0, size);
>>>> -        else
>>>> -            memcpy_fromio(adev->uvd.inst[j].saved_bo, ptr, size);
>>>> +        if (drm_dev_enter(&adev->ddev, &idx)) {
>>>> +            /* re-write 0 since err_event_athub will corrupt VCPU 
>>>> buffer */
>>>> +            if (in_ras_intr)
>>>> +                memset(adev->uvd.inst[j].saved_bo, 0, size);
>>>> +            else
>>>> +                memcpy_fromio(adev->uvd.inst[j].saved_bo, ptr, size);
>>>> +
>>>> +            drm_dev_exit(idx);
>>>> +        }
>>>>       }
>>>>       if (in_ras_intr)
>>>> @@ -420,7 +425,7 @@ int amdgpu_uvd_resume(struct amdgpu_device *adev)
>>>>   {
>>>>       unsigned size;
>>>>       void *ptr;
>>>> -    int i;
>>>> +    int i, idx;
>>>>       for (i = 0; i < adev->uvd.num_uvd_inst; i++) {
>>>>           if (adev->uvd.harvest_config & (1 << i))
>>>> @@ -432,7 +437,10 @@ int amdgpu_uvd_resume(struct amdgpu_device *adev)
>>>>           ptr = adev->uvd.inst[i].cpu_addr;
>>>>           if (adev->uvd.inst[i].saved_bo != NULL) {
>>>> -            memcpy_toio(ptr, adev->uvd.inst[i].saved_bo, size);
>>>> +            if (drm_dev_enter(&adev->ddev, &idx)) {
>>>> +                memcpy_toio(ptr, adev->uvd.inst[i].saved_bo, size);
>>>> +                drm_dev_exit(idx);
>>>> +            }
>>>>               kvfree(adev->uvd.inst[i].saved_bo);
>>>>               adev->uvd.inst[i].saved_bo = NULL;
>>>>           } else {
>>>> @@ -442,8 +450,11 @@ int amdgpu_uvd_resume(struct amdgpu_device *adev)
>>>>               hdr = (const struct common_firmware_header 
>>>> *)adev->uvd.fw->data;
>>>>               if (adev->firmware.load_type != AMDGPU_FW_LOAD_PSP) {
>>>>                   offset = le32_to_cpu(hdr->ucode_array_offset_bytes);
>>>> -                memcpy_toio(adev->uvd.inst[i].cpu_addr, 
>>>> adev->uvd.fw->data + offset,
>>>> - le32_to_cpu(hdr->ucode_size_bytes));
>>>> +                if (drm_dev_enter(&adev->ddev, &idx)) {
>>>> +                    memcpy_toio(adev->uvd.inst[i].cpu_addr, adev->uvd.fw->data + offset,
>>>> +                            le32_to_cpu(hdr->ucode_size_bytes));
>>>> +                    drm_dev_exit(idx);
>>>> +                }
>>>>                   size -= le32_to_cpu(hdr->ucode_size_bytes);
>>>>                   ptr += le32_to_cpu(hdr->ucode_size_bytes);
>>>>               }
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c 
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
>>>> index ea6a62f67e38..833203401ef4 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
>>>> @@ -29,6 +29,7 @@
>>>>   #include <linux/module.h>
>>>>   #include <drm/drm.h>
>>>> +#include <drm/drm_drv.h>
>>>>   #include "amdgpu.h"
>>>>   #include "amdgpu_pm.h"
>>>> @@ -293,7 +294,7 @@ int amdgpu_vce_resume(struct amdgpu_device *adev)
>>>>       void *cpu_addr;
>>>>       const struct common_firmware_header *hdr;
>>>>       unsigned offset;
>>>> -    int r;
>>>> +    int r, idx;
>>>>       if (adev->vce.vcpu_bo == NULL)
>>>>           return -EINVAL;
>>>> @@ -313,8 +314,12 @@ int amdgpu_vce_resume(struct amdgpu_device *adev)
>>>>       hdr = (const struct common_firmware_header *)adev->vce.fw->data;
>>>>       offset = le32_to_cpu(hdr->ucode_array_offset_bytes);
>>>> -    memcpy_toio(cpu_addr, adev->vce.fw->data + offset,
>>>> -            adev->vce.fw->size - offset);
>>>> +
>>>> +    if (drm_dev_enter(&adev->ddev, &idx)) {
>>>> +        memcpy_toio(cpu_addr, adev->vce.fw->data + offset,
>>>> +                adev->vce.fw->size - offset);
>>>> +        drm_dev_exit(idx);
>>>> +    }
>>>>       amdgpu_bo_kunmap(adev->vce.vcpu_bo);
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c 
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
>>>> index 201645963ba5..21f7d3644d70 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
>>>> @@ -27,6 +27,7 @@
>>>>   #include <linux/firmware.h>
>>>>   #include <linux/module.h>
>>>>   #include <linux/pci.h>
>>>> +#include <drm/drm_drv.h>
>>>>   #include "amdgpu.h"
>>>>   #include "amdgpu_pm.h"
>>>> @@ -275,7 +276,7 @@ int amdgpu_vcn_suspend(struct amdgpu_device *adev)
>>>>   {
>>>>       unsigned size;
>>>>       void *ptr;
>>>> -    int i;
>>>> +    int i, idx;
>>>>       cancel_delayed_work_sync(&adev->vcn.idle_work);
>>>> @@ -292,7 +293,10 @@ int amdgpu_vcn_suspend(struct amdgpu_device 
>>>> *adev)
>>>>           if (!adev->vcn.inst[i].saved_bo)
>>>>               return -ENOMEM;
>>>> -        memcpy_fromio(adev->vcn.inst[i].saved_bo, ptr, size);
>>>> +        if (drm_dev_enter(&adev->ddev, &idx)) {
>>>> +            memcpy_fromio(adev->vcn.inst[i].saved_bo, ptr, size);
>>>> +            drm_dev_exit(idx);
>>>> +        }
>>>>       }
>>>>       return 0;
>>>>   }
>>>> @@ -301,7 +305,7 @@ int amdgpu_vcn_resume(struct amdgpu_device *adev)
>>>>   {
>>>>       unsigned size;
>>>>       void *ptr;
>>>> -    int i;
>>>> +    int i, idx;
>>>>       for (i = 0; i < adev->vcn.num_vcn_inst; ++i) {
>>>>           if (adev->vcn.harvest_config & (1 << i))
>>>> @@ -313,7 +317,10 @@ int amdgpu_vcn_resume(struct amdgpu_device *adev)
>>>>           ptr = adev->vcn.inst[i].cpu_addr;
>>>>           if (adev->vcn.inst[i].saved_bo != NULL) {
>>>> -            memcpy_toio(ptr, adev->vcn.inst[i].saved_bo, size);
>>>> +            if (drm_dev_enter(&adev->ddev, &idx)) {
>>>> +                memcpy_toio(ptr, adev->vcn.inst[i].saved_bo, size);
>>>> +                drm_dev_exit(idx);
>>>> +            }
>>>>               kvfree(adev->vcn.inst[i].saved_bo);
>>>>               adev->vcn.inst[i].saved_bo = NULL;
>>>>           } else {
>>>> @@ -323,8 +330,11 @@ int amdgpu_vcn_resume(struct amdgpu_device *adev)
>>>>               hdr = (const struct common_firmware_header 
>>>> *)adev->vcn.fw->data;
>>>>               if (adev->firmware.load_type != AMDGPU_FW_LOAD_PSP) {
>>>>                   offset = le32_to_cpu(hdr->ucode_array_offset_bytes);
>>>> -                memcpy_toio(adev->vcn.inst[i].cpu_addr, 
>>>> adev->vcn.fw->data + offset,
>>>> - le32_to_cpu(hdr->ucode_size_bytes));
>>>> +                if (drm_dev_enter(&adev->ddev, &idx)) {
>>>> +                    memcpy_toio(adev->vcn.inst[i].cpu_addr, adev->vcn.fw->data + offset,
>>>> +                            le32_to_cpu(hdr->ucode_size_bytes));
>>>> +                    drm_dev_exit(idx);
>>>> +                }
>>>>                   size -= le32_to_cpu(hdr->ucode_size_bytes);
>>>>                   ptr += le32_to_cpu(hdr->ucode_size_bytes);
>>>>               }
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>> index 9f868cf3b832..7dd5f10ab570 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>> @@ -32,6 +32,7 @@
>>>>   #include <linux/dma-buf.h>
>>>>   #include <drm/amdgpu_drm.h>
>>>> +#include <drm/drm_drv.h>
>>>>   #include "amdgpu.h"
>>>>   #include "amdgpu_trace.h"
>>>>   #include "amdgpu_amdkfd.h"
>>>> @@ -1606,7 +1607,10 @@ static int 
>>>> amdgpu_vm_bo_update_mapping(struct amdgpu_device *adev,
>>>>       struct amdgpu_vm_update_params params;
>>>>       enum amdgpu_sync_mode sync_mode;
>>>>       uint64_t pfn;
>>>> -    int r;
>>>> +    int r, idx;
>>>> +
>>>> +    if (!drm_dev_enter(&adev->ddev, &idx))
>>>> +        return -ENODEV;
>>>>       memset(&params, 0, sizeof(params));
>>>>       params.adev = adev;
>>>> @@ -1715,6 +1719,7 @@ static int amdgpu_vm_bo_update_mapping(struct 
>>>> amdgpu_device *adev,
>>>>   error_unlock:
>>>>       amdgpu_vm_eviction_unlock(vm);
>>>> +    drm_dev_exit(idx);
>>>>       return r;
>>>>   }
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c 
>>>> b/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
>>>> index 589410c32d09..2cec71e823f5 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
>>>> @@ -23,6 +23,7 @@
>>>>   #include <linux/firmware.h>
>>>>   #include <linux/module.h>
>>>>   #include <linux/vmalloc.h>
>>>> +#include <drm/drm_drv.h>
>>>>   #include "amdgpu.h"
>>>>   #include "amdgpu_psp.h"
>>>> @@ -269,10 +270,8 @@ static int 
>>>> psp_v11_0_bootloader_load_kdb(struct psp_context *psp)
>>>>       if (ret)
>>>>           return ret;
>>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>> -
>>>>       /* Copy PSP KDB binary to memory */
>>>> -    memcpy(psp->fw_pri_buf, psp->kdb_start_addr, psp->kdb_bin_size);
>>>> +    psp_copy_fw(psp, psp->kdb_start_addr, psp->kdb_bin_size);
>>>>       /* Provide the PSP KDB to bootloader */
>>>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>> @@ -302,10 +301,8 @@ static int 
>>>> psp_v11_0_bootloader_load_spl(struct psp_context *psp)
>>>>       if (ret)
>>>>           return ret;
>>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>> -
>>>>       /* Copy PSP SPL binary to memory */
>>>> -    memcpy(psp->fw_pri_buf, psp->spl_start_addr, psp->spl_bin_size);
>>>> +    psp_copy_fw(psp, psp->spl_start_addr, psp->spl_bin_size);
>>>>       /* Provide the PSP SPL to bootloader */
>>>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>> @@ -335,10 +332,8 @@ static int 
>>>> psp_v11_0_bootloader_load_sysdrv(struct psp_context *psp)
>>>>       if (ret)
>>>>           return ret;
>>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>> -
>>>>       /* Copy PSP System Driver binary to memory */
>>>> -    memcpy(psp->fw_pri_buf, psp->sys_start_addr, psp->sys_bin_size);
>>>> +    psp_copy_fw(psp, psp->sys_start_addr, psp->sys_bin_size);
>>>>       /* Provide the sys driver to bootloader */
>>>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>> @@ -371,10 +366,8 @@ static int 
>>>> psp_v11_0_bootloader_load_sos(struct psp_context *psp)
>>>>       if (ret)
>>>>           return ret;
>>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>> -
>>>>       /* Copy Secure OS binary to PSP memory */
>>>> -    memcpy(psp->fw_pri_buf, psp->sos_start_addr, psp->sos_bin_size);
>>>> +    psp_copy_fw(psp, psp->sos_start_addr, psp->sos_bin_size);
>>>>       /* Provide the PSP secure OS to bootloader */
>>>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>> @@ -608,7 +601,7 @@ static int psp_v11_0_memory_training(struct 
>>>> psp_context *psp, uint32_t ops)
>>>>       uint32_t p2c_header[4];
>>>>       uint32_t sz;
>>>>       void *buf;
>>>> -    int ret;
>>>> +    int ret, idx;
>>>>       if (ctx->init == PSP_MEM_TRAIN_NOT_SUPPORT) {
>>>>           DRM_DEBUG("Memory training is not supported.\n");
>>>> @@ -681,17 +674,24 @@ static int psp_v11_0_memory_training(struct 
>>>> psp_context *psp, uint32_t ops)
>>>>               return -ENOMEM;
>>>>           }
>>>> -        memcpy_fromio(buf, adev->mman.aper_base_kaddr, sz);
>>>> -        ret = psp_v11_0_memory_training_send_msg(psp, 
>>>> PSP_BL__DRAM_LONG_TRAIN);
>>>> -        if (ret) {
>>>> -            DRM_ERROR("Send long training msg failed.\n");
>>>> +        if (drm_dev_enter(&adev->ddev, &idx)) {
>>>> +            memcpy_fromio(buf, adev->mman.aper_base_kaddr, sz);
>>>> +            ret = psp_v11_0_memory_training_send_msg(psp, 
>>>> PSP_BL__DRAM_LONG_TRAIN);
>>>> +            if (ret) {
>>>> +                DRM_ERROR("Send long training msg failed.\n");
>>>> +                vfree(buf);
>>>> +                drm_dev_exit(idx);
>>>> +                return ret;
>>>> +            }
>>>> +
>>>> +            memcpy_toio(adev->mman.aper_base_kaddr, buf, sz);
>>>> +            adev->hdp.funcs->flush_hdp(adev, NULL);
>>>>               vfree(buf);
>>>> -            return ret;
>>>> +            drm_dev_exit(idx);
>>>> +        } else {
>>>> +            vfree(buf);
>>>> +            return -ENODEV;
>>>>           }
>>>> -
>>>> -        memcpy_toio(adev->mman.aper_base_kaddr, buf, sz);
>>>> -        adev->hdp.funcs->flush_hdp(adev, NULL);
>>>> -        vfree(buf);
>>>>       }
>>>>       if (ops & PSP_MEM_TRAIN_SAVE) {
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c 
>>>> b/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
>>>> index c4828bd3264b..618e5b6b85d9 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
>>>> @@ -138,10 +138,8 @@ static int 
>>>> psp_v12_0_bootloader_load_sysdrv(struct psp_context *psp)
>>>>       if (ret)
>>>>           return ret;
>>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>> -
>>>>       /* Copy PSP System Driver binary to memory */
>>>> -    memcpy(psp->fw_pri_buf, psp->sys_start_addr, psp->sys_bin_size);
>>>> +    psp_copy_fw(psp, psp->sys_start_addr, psp->sys_bin_size);
>>>>       /* Provide the sys driver to bootloader */
>>>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>> @@ -179,10 +177,8 @@ static int 
>>>> psp_v12_0_bootloader_load_sos(struct psp_context *psp)
>>>>       if (ret)
>>>>           return ret;
>>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>> -
>>>>       /* Copy Secure OS binary to PSP memory */
>>>> -    memcpy(psp->fw_pri_buf, psp->sos_start_addr, psp->sos_bin_size);
>>>> +    psp_copy_fw(psp, psp->sos_start_addr, psp->sos_bin_size);
>>>>       /* Provide the PSP secure OS to bootloader */
>>>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c 
>>>> b/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
>>>> index f2e725f72d2f..d0a6cccd0897 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
>>>> @@ -102,10 +102,8 @@ static int 
>>>> psp_v3_1_bootloader_load_sysdrv(struct psp_context *psp)
>>>>       if (ret)
>>>>           return ret;
>>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>> -
>>>>       /* Copy PSP System Driver binary to memory */
>>>> -    memcpy(psp->fw_pri_buf, psp->sys_start_addr, psp->sys_bin_size);
>>>> +    psp_copy_fw(psp, psp->sys_start_addr, psp->sys_bin_size);
>>>>       /* Provide the sys driver to bootloader */
>>>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>> @@ -143,10 +141,8 @@ static int psp_v3_1_bootloader_load_sos(struct 
>>>> psp_context *psp)
>>>>       if (ret)
>>>>           return ret;
>>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>> -
>>>>       /* Copy Secure OS binary to PSP memory */
>>>> -    memcpy(psp->fw_pri_buf, psp->sos_start_addr, psp->sos_bin_size);
>>>> +    psp_copy_fw(psp, psp->sos_start_addr, psp->sos_bin_size);
>>>>       /* Provide the PSP secure OS to bootloader */
>>>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c 
>>>> b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
>>>> index 8e238dea7bef..90910d19db12 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
>>>> @@ -25,6 +25,7 @@
>>>>    */
>>>>   #include <linux/firmware.h>
>>>> +#include <drm/drm_drv.h>
>>>>   #include "amdgpu.h"
>>>>   #include "amdgpu_vce.h"
>>>> @@ -555,16 +556,19 @@ static int vce_v4_0_hw_fini(void *handle)
>>>>   static int vce_v4_0_suspend(void *handle)
>>>>   {
>>>>       struct amdgpu_device *adev = (struct amdgpu_device *)handle;
>>>> -    int r;
>>>> +    int r, idx;
>>>>       if (adev->vce.vcpu_bo == NULL)
>>>>           return 0;
>>>> -    if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) {
>>>> -        unsigned size = amdgpu_bo_size(adev->vce.vcpu_bo);
>>>> -        void *ptr = adev->vce.cpu_addr;
>>>> +    if (drm_dev_enter(&adev->ddev, &idx)) {
>>>> +        if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) {
>>>> +            unsigned size = amdgpu_bo_size(adev->vce.vcpu_bo);
>>>> +            void *ptr = adev->vce.cpu_addr;
>>>> -        memcpy_fromio(adev->vce.saved_bo, ptr, size);
>>>> +            memcpy_fromio(adev->vce.saved_bo, ptr, size);
>>>> +        }
>>>> +        drm_dev_exit(idx);
>>>>       }
>>>>       r = vce_v4_0_hw_fini(adev);
>>>> @@ -577,16 +581,20 @@ static int vce_v4_0_suspend(void *handle)
>>>>   static int vce_v4_0_resume(void *handle)
>>>>   {
>>>>       struct amdgpu_device *adev = (struct amdgpu_device *)handle;
>>>> -    int r;
>>>> +    int r, idx;
>>>>       if (adev->vce.vcpu_bo == NULL)
>>>>           return -EINVAL;
>>>>       if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) {
>>>> -        unsigned size = amdgpu_bo_size(adev->vce.vcpu_bo);
>>>> -        void *ptr = adev->vce.cpu_addr;
>>>> -        memcpy_toio(ptr, adev->vce.saved_bo, size);
>>>> +        if (drm_dev_enter(&adev->ddev, &idx)) {
>>>> +            unsigned size = amdgpu_bo_size(adev->vce.vcpu_bo);
>>>> +            void *ptr = adev->vce.cpu_addr;
>>>> +
>>>> +            memcpy_toio(ptr, adev->vce.saved_bo, size);
>>>> +            drm_dev_exit(idx);
>>>> +        }
>>>>       } else {
>>>>           r = amdgpu_vce_resume(adev);
>>>>           if (r)
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c 
>>>> b/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
>>>> index 3f15bf34123a..df34be8ec82d 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
>>>> @@ -34,6 +34,8 @@
>>>>   #include "vcn/vcn_3_0_0_sh_mask.h"
>>>>   #include "ivsrcid/vcn/irqsrcs_vcn_2_0.h"
>>>> +#include <drm/drm_drv.h>
>>>> +
>>>>   #define mmUVD_CONTEXT_ID_INTERNAL_OFFSET            0x27
>>>>   #define mmUVD_GPCOM_VCPU_CMD_INTERNAL_OFFSET 0x0f
>>>>   #define mmUVD_GPCOM_VCPU_DATA0_INTERNAL_OFFSET 0x10
>>>> @@ -268,16 +270,20 @@ static int vcn_v3_0_sw_init(void *handle)
>>>>   static int vcn_v3_0_sw_fini(void *handle)
>>>>   {
>>>>       struct amdgpu_device *adev = (struct amdgpu_device *)handle;
>>>> -    int i, r;
>>>> +    int i, r, idx;
>>>> -    for (i = 0; i < adev->vcn.num_vcn_inst; i++) {
>>>> -        volatile struct amdgpu_fw_shared *fw_shared;
>>>> +    if (drm_dev_enter(&adev->ddev, &idx)) {
>>>> +        for (i = 0; i < adev->vcn.num_vcn_inst; i++) {
>>>> +            volatile struct amdgpu_fw_shared *fw_shared;
>>>> -        if (adev->vcn.harvest_config & (1 << i))
>>>> -            continue;
>>>> -        fw_shared = adev->vcn.inst[i].fw_shared_cpu_addr;
>>>> -        fw_shared->present_flag_0 = 0;
>>>> -        fw_shared->sw_ring.is_enabled = false;
>>>> +            if (adev->vcn.harvest_config & (1 << i))
>>>> +                continue;
>>>> +            fw_shared = adev->vcn.inst[i].fw_shared_cpu_addr;
>>>> +            fw_shared->present_flag_0 = 0;
>>>> +            fw_shared->sw_ring.is_enabled = false;
>>>> +        }
>>>> +
>>>> +        drm_dev_exit(idx);
>>>>       }
>>>>       if (amdgpu_sriov_vf(adev))
>>>> diff --git a/drivers/gpu/drm/amd/pm/powerplay/smumgr/smu7_smumgr.c 
>>>> b/drivers/gpu/drm/amd/pm/powerplay/smumgr/smu7_smumgr.c
>>>> index aae25243eb10..d628b91846c9 100644
>>>> --- a/drivers/gpu/drm/amd/pm/powerplay/smumgr/smu7_smumgr.c
>>>> +++ b/drivers/gpu/drm/amd/pm/powerplay/smumgr/smu7_smumgr.c
>>>> @@ -405,6 +405,8 @@ int smu7_request_smu_load_fw(struct pp_hwmgr 
>>>> *hwmgr)
>>>>                   UCODE_ID_MEC_STORAGE, 
>>>> &toc->entry[toc->num_entries++]),
>>>>                   "Failed to Get Firmware Entry.", r = -EINVAL; 
>>>> goto failed);
>>>>       }
>>>> +
>>>> +    /* AG TODO Can't call drm_dev_enter/exit because access 
>>>> adev->ddev here ... */
>>>>       memcpy_toio(smu_data->header_buffer.kaddr, smu_data->toc,
>>>>               sizeof(struct SMU_DRAMData_TOC));
>>>>       smum_send_msg_to_smc_with_parameter(hwmgr,
>>>

^ permalink raw reply	[flat|nested] 126+ messages in thread

* Re: [PATCH v6 10/16] drm/amdgpu: Guard against write accesses after device removal
  2021-05-12 14:06           ` Christian König
  (?)
@ 2021-05-12 14:11             ` Andrey Grodzovsky
  -1 siblings, 0 replies; 126+ messages in thread
From: Andrey Grodzovsky @ 2021-05-12 14:11 UTC (permalink / raw)
  To: Christian König, dri-devel, amd-gfx, linux-pci,
	daniel.vetter, Harry.Wentland
  Cc: ppaalanen, Alexander.Deucher, gregkh, helgaas, Felix.Kuehling

Hopefully Alex can chime in on this.
I will respin V7 soon.

Andrey

On 2021-05-12 10:06 a.m., Christian König wrote:
> Am 12.05.21 um 16:01 schrieb Andrey Grodzovsky:
>> Ping - need a confirmation it's ok to keep this as a single patch given
>> my explanation below.
> 
> It was just a suggestion. The key point is that the approach sounds sane to me,
> but I can't say much about the psp code, for example.
> 
> So the most I can give you is an Acked-by for that.
> 
> Christian.
> 
>>
>> Andrey
>>
>> On 2021-05-11 1:52 p.m., Andrey Grodzovsky wrote:
>>>
>>>
>>> On 2021-05-11 2:50 a.m., Christian König wrote:
>>>> Am 10.05.21 um 18:36 schrieb Andrey Grodzovsky:
>>>>> This should prevent writing to memory or IO ranges possibly
>>>>> already allocated for other uses after our device is removed.
>>>>>
>>>>> v5:
>>>>> Protect more places where memcpy_toio/memcpy_fromio takes place
>>>>> Protect IB submissions
>>>>>
>>>>> v6: Switch to !drm_dev_enter instead of scoping entire code
>>>>> with brackets.
>>>>>
>>>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>>>>> ---
>>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c    | 11 ++-
>>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c       |  9 +++
>>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c        | 17 +++--
>>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c       | 63 +++++++++++------
>>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h       |  2 +
>>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c      | 70 
>>>>> +++++++++++++++++++
>>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h      | 49 ++-----------
>>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c       | 31 +++++---
>>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c       | 11 ++-
>>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c       | 22 ++++--
>>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c        |  7 +-
>>>>>   drivers/gpu/drm/amd/amdgpu/psp_v11_0.c        | 44 ++++++------
>>>>>   drivers/gpu/drm/amd/amdgpu/psp_v12_0.c        |  8 +--
>>>>>   drivers/gpu/drm/amd/amdgpu/psp_v3_1.c         |  8 +--
>>>>>   drivers/gpu/drm/amd/amdgpu/vce_v4_0.c         | 26 ++++---
>>>>>   drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c         | 22 +++---
>>>>>   .../drm/amd/pm/powerplay/smumgr/smu7_smumgr.c |  2 +
>>>>>   17 files changed, 257 insertions(+), 145 deletions(-)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>> index a0bff4713672..94c415176cdc 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>> @@ -71,6 +71,8 @@
>>>>>   #include <drm/task_barrier.h>
>>>>>   #include <linux/pm_runtime.h>
>>>>> +#include <drm/drm_drv.h>
>>>>> +
>>>>>   MODULE_FIRMWARE("amdgpu/vega10_gpu_info.bin");
>>>>>   MODULE_FIRMWARE("amdgpu/vega12_gpu_info.bin");
>>>>>   MODULE_FIRMWARE("amdgpu/raven_gpu_info.bin");
>>>>> @@ -281,7 +283,10 @@ void amdgpu_device_vram_access(struct 
>>>>> amdgpu_device *adev, loff_t pos,
>>>>>       unsigned long flags;
>>>>>       uint32_t hi = ~0;
>>>>>       uint64_t last;
>>>>> +    int idx;
>>>>> +     if (!drm_dev_enter(&adev->ddev, &idx))
>>>>> +         return;
>>>>>   #ifdef CONFIG_64BIT
>>>>>       last = min(pos + size, adev->gmc.visible_vram_size);
>>>>> @@ -299,8 +304,10 @@ void amdgpu_device_vram_access(struct 
>>>>> amdgpu_device *adev, loff_t pos,
>>>>>               memcpy_fromio(buf, addr, count);
>>>>>           }
>>>>> -        if (count == size)
>>>>> +        if (count == size) {
>>>>> +            drm_dev_exit(idx);
>>>>>               return;
>>>>> +        }
>>>>
>>>> Maybe use a goto instead, but that's really just a nitpick.
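
A minimal userspace sketch of the single-exit shape suggested here, assuming nothing beyond standard C. `fake_dev`, `fake_dev_enter()`, `fake_dev_exit()` and `guarded_read()` are hypothetical stand-ins invented for illustration; they are not the kernel's `drm_dev_enter()`/`drm_dev_exit()` nor the real `amdgpu_device_vram_access()`. The only point is that every path taken after a successful enter leaves through one label, so the exit call cannot be forgotten on an early return.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a device with an "unplugged" state. */
struct fake_dev {
    bool unplugged;   /* set once the device has been removed */
    int  enter_depth; /* number of guarded sections currently open */
};

/* Stand-in for drm_dev_enter(): fails once the device is gone. */
static bool fake_dev_enter(struct fake_dev *dev, int *idx)
{
    if (dev->unplugged)
        return false;
    *idx = dev->enter_depth++;
    return true;
}

/* Stand-in for drm_dev_exit(): closes one guarded section. */
static void fake_dev_exit(struct fake_dev *dev, int idx)
{
    (void)idx;
    assert(dev->enter_depth > 0);
    dev->enter_depth--;
}

/*
 * Copy 'size' bytes. A fast path handles at most 'fast_limit' bytes and a
 * slow path handles the rest. Returns the number of bytes copied, or 0 if
 * the device is gone.
 */
static size_t guarded_read(struct fake_dev *dev, char *dst, const char *src,
                           size_t size, size_t fast_limit)
{
    size_t count, done = 0;
    int idx;

    if (!fake_dev_enter(dev, &idx))
        return 0;

    count = size < fast_limit ? size : fast_limit;
    memcpy(dst, src, count);
    done = count;
    if (count == size)
        goto exit; /* fast path covered everything */

    /* Slow path: copy the remaining bytes one at a time. */
    while (done < size) {
        dst[done] = src[done];
        done++;
    }

exit:
    fake_dev_exit(dev, idx);
    return done;
}
```

Compared with calling the exit helper before each early `return`, the label keeps the pairing in one place, which is the same shape the patch already uses in `amdgpu_ib_schedule()`.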
>>>>
>>>>
>>>>
>>>>>           pos += count;
>>>>>           buf += count / 4;
>>>>> @@ -323,6 +330,8 @@ void amdgpu_device_vram_access(struct 
>>>>> amdgpu_device *adev, loff_t pos,
>>>>>               *buf++ = RREG32_NO_KIQ(mmMM_DATA);
>>>>>       }
>>>>>       spin_unlock_irqrestore(&adev->mmio_idx_lock, flags);
>>>>> +
>>>>> +    drm_dev_exit(idx);
>>>>>   }
>>>>>   /*
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c 
>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
>>>>> index 4d32233cde92..04ba5eef1e88 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
>>>>> @@ -31,6 +31,8 @@
>>>>>   #include "amdgpu_ras.h"
>>>>>   #include "amdgpu_xgmi.h"
>>>>> +#include <drm/drm_drv.h>
>>>>> +
>>>>>   /**
>>>>>    * amdgpu_gmc_pdb0_alloc - allocate vram for pdb0
>>>>>    *
>>>>> @@ -151,6 +153,10 @@ int amdgpu_gmc_set_pte_pde(struct 
>>>>> amdgpu_device *adev, void *cpu_pt_addr,
>>>>>   {
>>>>>       void __iomem *ptr = (void *)cpu_pt_addr;
>>>>>       uint64_t value;
>>>>> +    int idx;
>>>>> +
>>>>> +    if (!drm_dev_enter(&adev->ddev, &idx))
>>>>> +        return 0;
>>>>>       /*
>>>>>        * The following is for PTE only. GART does not have PDEs.
>>>>> @@ -158,6 +164,9 @@ int amdgpu_gmc_set_pte_pde(struct amdgpu_device 
>>>>> *adev, void *cpu_pt_addr,
>>>>>       value = addr & 0x0000FFFFFFFFF000ULL;
>>>>>       value |= flags;
>>>>>       writeq(value, ptr + (gpu_page_idx * 8));
>>>>> +
>>>>> +    drm_dev_exit(idx);
>>>>> +
>>>>>       return 0;
>>>>>   }
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c 
>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
>>>>> index 148a3b481b12..62fcbd446c71 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
>>>>> @@ -30,6 +30,7 @@
>>>>>   #include <linux/slab.h>
>>>>>   #include <drm/amdgpu_drm.h>
>>>>> +#include <drm/drm_drv.h>
>>>>>   #include "amdgpu.h"
>>>>>   #include "atom.h"
>>>>> @@ -137,7 +138,7 @@ int amdgpu_ib_schedule(struct amdgpu_ring 
>>>>> *ring, unsigned num_ibs,
>>>>>       bool secure;
>>>>>       unsigned i;
>>>>> -    int r = 0;
>>>>> +    int idx, r = 0;
>>>>>       bool need_pipe_sync = false;
>>>>>       if (num_ibs == 0)
>>>>> @@ -169,13 +170,16 @@ int amdgpu_ib_schedule(struct amdgpu_ring 
>>>>> *ring, unsigned num_ibs,
>>>>>           return -EINVAL;
>>>>>       }
>>>>> +    if (!drm_dev_enter(&adev->ddev, &idx))
>>>>> +        return -ENODEV;
>>>>> +
>>>>>       alloc_size = ring->funcs->emit_frame_size + num_ibs *
>>>>>           ring->funcs->emit_ib_size;
>>>>>       r = amdgpu_ring_alloc(ring, alloc_size);
>>>>>       if (r) {
>>>>>           dev_err(adev->dev, "scheduling IB failed (%d).\n", r);
>>>>> -        return r;
>>>>> +        goto exit;
>>>>>       }
>>>>>       need_ctx_switch = ring->current_ctx != fence_ctx;
>>>>> @@ -205,7 +209,7 @@ int amdgpu_ib_schedule(struct amdgpu_ring 
>>>>> *ring, unsigned num_ibs,
>>>>>           r = amdgpu_vm_flush(ring, job, need_pipe_sync);
>>>>>           if (r) {
>>>>>               amdgpu_ring_undo(ring);
>>>>> -            return r;
>>>>> +            goto exit;
>>>>>           }
>>>>>       }
>>>>> @@ -286,7 +290,7 @@ int amdgpu_ib_schedule(struct amdgpu_ring 
>>>>> *ring, unsigned num_ibs,
>>>>>           if (job && job->vmid)
>>>>>               amdgpu_vmid_reset(adev, ring->funcs->vmhub, job->vmid);
>>>>>           amdgpu_ring_undo(ring);
>>>>> -        return r;
>>>>> +        goto exit;
>>>>>       }
>>>>>       if (ring->funcs->insert_end)
>>>>> @@ -304,7 +308,10 @@ int amdgpu_ib_schedule(struct amdgpu_ring 
>>>>> *ring, unsigned num_ibs,
>>>>>           ring->funcs->emit_wave_limit(ring, false);
>>>>>       amdgpu_ring_commit(ring);
>>>>> -    return 0;
>>>>> +
>>>>> +exit:
>>>>> +    drm_dev_exit(idx);
>>>>> +    return r;
>>>>>   }
>>>>>   /**
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c 
>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
>>>>> index 9e769cf6095b..bb6afee61666 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
>>>>> @@ -25,6 +25,7 @@
>>>>>   #include <linux/firmware.h>
>>>>>   #include <linux/dma-mapping.h>
>>>>> +#include <drm/drm_drv.h>
>>>>>   #include "amdgpu.h"
>>>>>   #include "amdgpu_psp.h"
>>>>> @@ -39,6 +40,8 @@
>>>>>   #include "amdgpu_ras.h"
>>>>>   #include "amdgpu_securedisplay.h"
>>>>> +#include <drm/drm_drv.h>
>>>>> +
>>>>>   static int psp_sysfs_init(struct amdgpu_device *adev);
>>>>>   static void psp_sysfs_fini(struct amdgpu_device *adev);
>>>>> @@ -253,7 +256,7 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>>>>              struct psp_gfx_cmd_resp *cmd, uint64_t fence_mc_addr)
>>>>>   {
>>>>>       int ret;
>>>>> -    int index;
>>>>> +    int index, idx;
>>>>>       int timeout = 20000;
>>>>>       bool ras_intr = false;
>>>>>       bool skip_unsupport = false;
>>>>> @@ -261,6 +264,9 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>>>>       if (psp->adev->in_pci_err_recovery)
>>>>>           return 0;
>>>>> +    if (!drm_dev_enter(&psp->adev->ddev, &idx))
>>>>> +        return 0;
>>>>> +
>>>>>       mutex_lock(&psp->mutex);
>>>>>       memset(psp->cmd_buf_mem, 0, PSP_CMD_BUFFER_SIZE);
>>>>> @@ -271,8 +277,7 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>>>>       ret = psp_ring_cmd_submit(psp, psp->cmd_buf_mc_addr, 
>>>>> fence_mc_addr, index);
>>>>>       if (ret) {
>>>>>           atomic_dec(&psp->fence_value);
>>>>> -        mutex_unlock(&psp->mutex);
>>>>> -        return ret;
>>>>> +        goto exit;
>>>>>       }
>>>>>       amdgpu_asic_invalidate_hdp(psp->adev, NULL);
>>>>> @@ -312,8 +317,8 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>>>>                psp->cmd_buf_mem->cmd_id,
>>>>>                psp->cmd_buf_mem->resp.status);
>>>>>           if (!timeout) {
>>>>> -            mutex_unlock(&psp->mutex);
>>>>> -            return -EINVAL;
>>>>> +            ret = -EINVAL;
>>>>> +            goto exit;
>>>>>           }
>>>>>       }
>>>>> @@ -321,8 +326,10 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>>>>           ucode->tmr_mc_addr_lo = psp->cmd_buf_mem->resp.fw_addr_lo;
>>>>>           ucode->tmr_mc_addr_hi = psp->cmd_buf_mem->resp.fw_addr_hi;
>>>>>       }
>>>>> -    mutex_unlock(&psp->mutex);
>>>>> +exit:
>>>>> +    mutex_unlock(&psp->mutex);
>>>>> +    drm_dev_exit(idx);
>>>>>       return ret;
>>>>>   }
>>>>> @@ -359,8 +366,7 @@ static int psp_load_toc(struct psp_context *psp,
>>>>>       if (!cmd)
>>>>>           return -ENOMEM;
>>>>>       /* Copy toc to psp firmware private buffer */
>>>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>> -    memcpy(psp->fw_pri_buf, psp->toc_start_addr, psp->toc_bin_size);
>>>>> +    psp_copy_fw(psp, psp->toc_start_addr, psp->toc_bin_size);
>>>>>       psp_prep_load_toc_cmd_buf(cmd, psp->fw_pri_mc_addr, 
>>>>> psp->toc_bin_size);
>>>>> @@ -625,8 +631,7 @@ static int psp_asd_load(struct psp_context *psp)
>>>>>       if (!cmd)
>>>>>           return -ENOMEM;
>>>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>> -    memcpy(psp->fw_pri_buf, psp->asd_start_addr, 
>>>>> psp->asd_ucode_size);
>>>>> +    psp_copy_fw(psp, psp->asd_start_addr, psp->asd_ucode_size);
>>>>>       psp_prep_asd_load_cmd_buf(cmd, psp->fw_pri_mc_addr,
>>>>>                     psp->asd_ucode_size);
>>>>> @@ -781,8 +786,7 @@ static int psp_xgmi_load(struct psp_context *psp)
>>>>>       if (!cmd)
>>>>>           return -ENOMEM;
>>>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>> -    memcpy(psp->fw_pri_buf, psp->ta_xgmi_start_addr, 
>>>>> psp->ta_xgmi_ucode_size);
>>>>> +    psp_copy_fw(psp, psp->ta_xgmi_start_addr, 
>>>>> psp->ta_xgmi_ucode_size);
>>>>>       psp_prep_ta_load_cmd_buf(cmd,
>>>>>                    psp->fw_pri_mc_addr,
>>>>> @@ -1038,8 +1042,7 @@ static int psp_ras_load(struct psp_context *psp)
>>>>>       if (!cmd)
>>>>>           return -ENOMEM;
>>>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>> -    memcpy(psp->fw_pri_buf, psp->ta_ras_start_addr, 
>>>>> psp->ta_ras_ucode_size);
>>>>> +    psp_copy_fw(psp, psp->ta_ras_start_addr, psp->ta_ras_ucode_size);
>>>>>       psp_prep_ta_load_cmd_buf(cmd,
>>>>>                    psp->fw_pri_mc_addr,
>>>>> @@ -1275,8 +1278,7 @@ static int psp_hdcp_load(struct psp_context 
>>>>> *psp)
>>>>>       if (!cmd)
>>>>>           return -ENOMEM;
>>>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>> -    memcpy(psp->fw_pri_buf, psp->ta_hdcp_start_addr,
>>>>> +    psp_copy_fw(psp, psp->ta_hdcp_start_addr,
>>>>>              psp->ta_hdcp_ucode_size);
>>>>>       psp_prep_ta_load_cmd_buf(cmd,
>>>>> @@ -1427,8 +1429,7 @@ static int psp_dtm_load(struct psp_context *psp)
>>>>>       if (!cmd)
>>>>>           return -ENOMEM;
>>>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>> -    memcpy(psp->fw_pri_buf, psp->ta_dtm_start_addr, 
>>>>> psp->ta_dtm_ucode_size);
>>>>> +    psp_copy_fw(psp, psp->ta_dtm_start_addr, psp->ta_dtm_ucode_size);
>>>>>       psp_prep_ta_load_cmd_buf(cmd,
>>>>>                    psp->fw_pri_mc_addr,
>>>>> @@ -1573,8 +1574,7 @@ static int psp_rap_load(struct psp_context *psp)
>>>>>       if (!cmd)
>>>>>           return -ENOMEM;
>>>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>> -    memcpy(psp->fw_pri_buf, psp->ta_rap_start_addr, 
>>>>> psp->ta_rap_ucode_size);
>>>>> +    psp_copy_fw(psp, psp->ta_rap_start_addr, psp->ta_rap_ucode_size);
>>>>>       psp_prep_ta_load_cmd_buf(cmd,
>>>>>                    psp->fw_pri_mc_addr,
>>>>> @@ -3022,7 +3022,7 @@ static ssize_t 
>>>>> psp_usbc_pd_fw_sysfs_write(struct device *dev,
>>>>>       struct amdgpu_device *adev = drm_to_adev(ddev);
>>>>>       void *cpu_addr;
>>>>>       dma_addr_t dma_addr;
>>>>> -    int ret;
>>>>> +    int ret, idx;
>>>>>       char fw_name[100];
>>>>>       const struct firmware *usbc_pd_fw;
>>>>> @@ -3031,6 +3031,9 @@ static ssize_t 
>>>>> psp_usbc_pd_fw_sysfs_write(struct device *dev,
>>>>>           return -EBUSY;
>>>>>       }
>>>>> +    if (!drm_dev_enter(ddev, &idx))
>>>>> +        return -ENODEV;
>>>>> +
>>>>>       snprintf(fw_name, sizeof(fw_name), "amdgpu/%s", buf);
>>>>>       ret = request_firmware(&usbc_pd_fw, fw_name, adev->dev);
>>>>>       if (ret)
>>>>> @@ -3062,16 +3065,30 @@ static ssize_t 
>>>>> psp_usbc_pd_fw_sysfs_write(struct device *dev,
>>>>>   rel_buf:
>>>>>       dma_free_coherent(adev->dev, usbc_pd_fw->size, cpu_addr, 
>>>>> dma_addr);
>>>>>       release_firmware(usbc_pd_fw);
>>>>> -
>>>>>   fail:
>>>>>       if (ret) {
>>>>>           DRM_ERROR("Failed to load USBC PD FW, err = %d", ret);
>>>>> -        return ret;
>>>>> +        count = ret;
>>>>>       }
>>>>> +    drm_dev_exit(idx);
>>>>>       return count;
>>>>>   }
>>>>> +void psp_copy_fw(struct psp_context *psp, uint8_t *start_addr, 
>>>>> uint32_t bin_size)
>>>>> +{
>>>>> +    int idx;
>>>>> +
>>>>> +    if (!drm_dev_enter(&psp->adev->ddev, &idx))
>>>>> +        return;
>>>>> +
>>>>> +    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>> +    memcpy(psp->fw_pri_buf, start_addr, bin_size);
>>>>> +
>>>>> +    drm_dev_exit(idx);
>>>>> +}
>>>>> +
>>>>> +
>>>>>   static DEVICE_ATTR(usbc_pd_fw, S_IRUGO | S_IWUSR,
>>>>>              psp_usbc_pd_fw_sysfs_read,
>>>>>              psp_usbc_pd_fw_sysfs_write);
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h 
>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
>>>>> index 46a5328e00e0..2bfdc278817f 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
>>>>> @@ -423,4 +423,6 @@ int psp_get_fw_attestation_records_addr(struct 
>>>>> psp_context *psp,
>>>>>   int psp_load_fw_list(struct psp_context *psp,
>>>>>                struct amdgpu_firmware_info **ucode_list, int 
>>>>> ucode_count);
>>>>> +void psp_copy_fw(struct psp_context *psp, uint8_t *start_addr, 
>>>>> uint32_t bin_size);
>>>>> +
>>>>>   #endif
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c 
>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>>>>> index 688624ebe421..e1985bc34436 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>>>>> @@ -35,6 +35,8 @@
>>>>>   #include "amdgpu.h"
>>>>>   #include "atom.h"
>>>>> +#include <drm/drm_drv.h>
>>>>> +
>>>>>   /*
>>>>>    * Rings
>>>>>    * Most engines on the GPU are fed via ring buffers.  Ring
>>>>> @@ -461,3 +463,71 @@ int amdgpu_ring_test_helper(struct amdgpu_ring 
>>>>> *ring)
>>>>>       ring->sched.ready = !r;
>>>>>       return r;
>>>>>   }
>>>>> +
>>>>> +void amdgpu_ring_clear_ring(struct amdgpu_ring *ring)
>>>>> +{
>>>>> +    int idx;
>>>>> +    int i = 0;
>>>>> +
>>>>> +    if (!drm_dev_enter(&ring->adev->ddev, &idx))
>>>>> +        return;
>>>>> +
>>>>> +    while (i <= ring->buf_mask)
>>>>> +        ring->ring[i++] = ring->funcs->nop;
>>>>> +
>>>>> +    drm_dev_exit(idx);
>>>>> +
>>>>> +}
>>>>> +
>>>>> +void amdgpu_ring_write(struct amdgpu_ring *ring, uint32_t v)
>>>>> +{
>>>>> +    int idx;
>>>>> +
>>>>> +    if (!drm_dev_enter(&ring->adev->ddev, &idx))
>>>>> +        return;
>>>>> +
>>>>> +    if (ring->count_dw <= 0)
>>>>> +        DRM_ERROR("amdgpu: writing more dwords to the ring than 
>>>>> expected!\n");
>>>>> +    ring->ring[ring->wptr++ & ring->buf_mask] = v;
>>>>> +    ring->wptr &= ring->ptr_mask;
>>>>> +    ring->count_dw--;
>>>>> +
>>>>> +    drm_dev_exit(idx);
>>>>> +}
>>>>> +
>>>>> +void amdgpu_ring_write_multiple(struct amdgpu_ring *ring,
>>>>> +                          void *src, int count_dw)
>>>>> +{
>>>>> +    unsigned occupied, chunk1, chunk2;
>>>>> +    void *dst;
>>>>> +    int idx;
>>>>> +
>>>>> +    if (!drm_dev_enter(&ring->adev->ddev, &idx))
>>>>> +        return;
>>>>> +
>>>>> +    if (unlikely(ring->count_dw < count_dw))
>>>>> +        DRM_ERROR("amdgpu: writing more dwords to the ring than 
>>>>> expected!\n");
>>>>> +
>>>>> +    occupied = ring->wptr & ring->buf_mask;
>>>>> +    dst = (void *)&ring->ring[occupied];
>>>>> +    chunk1 = ring->buf_mask + 1 - occupied;
>>>>> +    chunk1 = (chunk1 >= count_dw) ? count_dw: chunk1;
>>>>> +    chunk2 = count_dw - chunk1;
>>>>> +    chunk1 <<= 2;
>>>>> +    chunk2 <<= 2;
>>>>> +
>>>>> +    if (chunk1)
>>>>> +        memcpy(dst, src, chunk1);
>>>>> +
>>>>> +    if (chunk2) {
>>>>> +        src += chunk1;
>>>>> +        dst = (void *)ring->ring;
>>>>> +        memcpy(dst, src, chunk2);
>>>>> +    }
>>>>> +
>>>>> +    ring->wptr += count_dw;
>>>>> +    ring->wptr &= ring->ptr_mask;
>>>>> +    ring->count_dw -= count_dw;
>>>>> +
>>>>> +    drm_dev_exit(idx);
>>>>> +}
>>>>
>>>> The ring should never be in MMIO memory, so you can completely drop 
>>>> that as far as I can see.
>>>
>>> Yeah, it's all in GART; I missed it for some reason...
>>>>
>>>> Maybe split that patch by use case so that we can more easily 
>>>> review/ack it.
>>>
>>> In fact everything here is the same use case: once I added unmapping of
>>> all MMIO ranges (both registers and VRAM), I got a lot of page faults
>>> on device removal around any memcpy to/from IO. That is where I put the
>>> drm_dev_enter/exit scope. I also searched the code and preemptively
>>> added guards to any other such place. I did drop amdgpu_ib_schedule
>>> from this patch because it has a dma_fence_wait inside, so we
>>> will take care of it once we decide how to handle dma_fence waits.
>>>
>>> Andrey
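
The guard described above, wrapped around each copy into device-backed memory, can be sketched in userspace as follows. This loosely mirrors the shape of `psp_copy_fw()` from the patch, but `fake_dev`, `fake_dev_enter()`, `fake_dev_exit()`, `copy_fw()` and `FW_BUF_SIZE` are hypothetical names made up for the illustration, and the size clamp is an addition that the patch does not have.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define FW_BUF_SIZE 64 /* hypothetical; the patch clears PSP_1_MEG bytes */

/* Hypothetical stand-in for a device with a staging buffer. */
struct fake_dev {
    bool unplugged;                    /* set once the device is removed */
    int  enter_depth;                  /* open guarded sections */
    unsigned char fw_buf[FW_BUF_SIZE]; /* stands in for psp->fw_pri_buf */
};

/* Stand-in for drm_dev_enter(): fails once the device is gone. */
static bool fake_dev_enter(struct fake_dev *dev, int *idx)
{
    if (dev->unplugged)
        return false;
    *idx = dev->enter_depth++;
    return true;
}

/* Stand-in for drm_dev_exit(): closes one guarded section. */
static void fake_dev_exit(struct fake_dev *dev, int idx)
{
    (void)idx;
    assert(dev->enter_depth > 0);
    dev->enter_depth--;
}

/*
 * Clear the staging buffer and copy the image into it, but only while the
 * device is still present. Callers no longer open-code memset + memcpy.
 */
static void copy_fw(struct fake_dev *dev, const void *src, size_t size)
{
    int idx;

    if (!fake_dev_enter(dev, &idx))
        return; /* device gone: leave the buffer untouched */

    memset(dev->fw_buf, 0, FW_BUF_SIZE);
    /* Clamp is an assumption of this sketch, not part of the patch. */
    memcpy(dev->fw_buf, src, size < FW_BUF_SIZE ? size : FW_BUF_SIZE);

    fake_dev_exit(dev, idx);
}
```

Putting the guard inside one helper means each call site stays a single line, which is how the patch replaces the repeated memset/memcpy pairs in the psp load functions.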
>>>
>>>>
>>>> Thanks,
>>>> Christian.
>>>>
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h 
>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>>>>> index e7d3d0dbdd96..c67bc6d3d039 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>>>>> @@ -299,53 +299,12 @@ static inline void 
>>>>> amdgpu_ring_set_preempt_cond_exec(struct amdgpu_ring *ring,
>>>>>       *ring->cond_exe_cpu_addr = cond_exec;
>>>>>   }
>>>>> -static inline void amdgpu_ring_clear_ring(struct amdgpu_ring *ring)
>>>>> -{
>>>>> -    int i = 0;
>>>>> -    while (i <= ring->buf_mask)
>>>>> -        ring->ring[i++] = ring->funcs->nop;
>>>>> -
>>>>> -}
>>>>> -
>>>>> -static inline void amdgpu_ring_write(struct amdgpu_ring *ring, 
>>>>> uint32_t v)
>>>>> -{
>>>>> -    if (ring->count_dw <= 0)
>>>>> -        DRM_ERROR("amdgpu: writing more dwords to the ring than 
>>>>> expected!\n");
>>>>> -    ring->ring[ring->wptr++ & ring->buf_mask] = v;
>>>>> -    ring->wptr &= ring->ptr_mask;
>>>>> -    ring->count_dw--;
>>>>> -}
>>>>> +void amdgpu_ring_clear_ring(struct amdgpu_ring *ring);
>>>>> -static inline void amdgpu_ring_write_multiple(struct amdgpu_ring 
>>>>> *ring,
>>>>> -                          void *src, int count_dw)
>>>>> -{
>>>>> -    unsigned occupied, chunk1, chunk2;
>>>>> -    void *dst;
>>>>> -
>>>>> -    if (unlikely(ring->count_dw < count_dw))
>>>>> -        DRM_ERROR("amdgpu: writing more dwords to the ring than 
>>>>> expected!\n");
>>>>> -
>>>>> -    occupied = ring->wptr & ring->buf_mask;
>>>>> -    dst = (void *)&ring->ring[occupied];
>>>>> -    chunk1 = ring->buf_mask + 1 - occupied;
>>>>> -    chunk1 = (chunk1 >= count_dw) ? count_dw: chunk1;
>>>>> -    chunk2 = count_dw - chunk1;
>>>>> -    chunk1 <<= 2;
>>>>> -    chunk2 <<= 2;
>>>>> +void amdgpu_ring_write(struct amdgpu_ring *ring, uint32_t v);
>>>>> -    if (chunk1)
>>>>> -        memcpy(dst, src, chunk1);
>>>>> -
>>>>> -    if (chunk2) {
>>>>> -        src += chunk1;
>>>>> -        dst = (void *)ring->ring;
>>>>> -        memcpy(dst, src, chunk2);
>>>>> -    }
>>>>> -
>>>>> -    ring->wptr += count_dw;
>>>>> -    ring->wptr &= ring->ptr_mask;
>>>>> -    ring->count_dw -= count_dw;
>>>>> -}
>>>>> +void amdgpu_ring_write_multiple(struct amdgpu_ring *ring,
>>>>> +                          void *src, int count_dw);
>>>>>   int amdgpu_ring_test_helper(struct amdgpu_ring *ring);
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c 
>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
>>>>> index c6dbc0801604..82f0542c7792 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
>>>>> @@ -32,6 +32,7 @@
>>>>>   #include <linux/module.h>
>>>>>   #include <drm/drm.h>
>>>>> +#include <drm/drm_drv.h>
>>>>>   #include "amdgpu.h"
>>>>>   #include "amdgpu_pm.h"
>>>>> @@ -375,7 +376,7 @@ int amdgpu_uvd_suspend(struct amdgpu_device *adev)
>>>>>   {
>>>>>       unsigned size;
>>>>>       void *ptr;
>>>>> -    int i, j;
>>>>> +    int i, j, idx;
>>>>>       bool in_ras_intr = amdgpu_ras_intr_triggered();
>>>>>       cancel_delayed_work_sync(&adev->uvd.idle_work);
>>>>> @@ -403,11 +404,15 @@ int amdgpu_uvd_suspend(struct amdgpu_device 
>>>>> *adev)
>>>>>           if (!adev->uvd.inst[j].saved_bo)
>>>>>               return -ENOMEM;
>>>>> -        /* re-write 0 since err_event_athub will corrupt VCPU 
>>>>> buffer */
>>>>> -        if (in_ras_intr)
>>>>> -            memset(adev->uvd.inst[j].saved_bo, 0, size);
>>>>> -        else
>>>>> -            memcpy_fromio(adev->uvd.inst[j].saved_bo, ptr, size);
>>>>> +        if (drm_dev_enter(&adev->ddev, &idx)) {
>>>>> +            /* re-write 0 since err_event_athub will corrupt VCPU 
>>>>> buffer */
>>>>> +            if (in_ras_intr)
>>>>> +                memset(adev->uvd.inst[j].saved_bo, 0, size);
>>>>> +            else
>>>>> +                memcpy_fromio(adev->uvd.inst[j].saved_bo, ptr, size);
>>>>> +
>>>>> +            drm_dev_exit(idx);
>>>>> +        }
>>>>>       }
>>>>>       if (in_ras_intr)
>>>>> @@ -420,7 +425,7 @@ int amdgpu_uvd_resume(struct amdgpu_device *adev)
>>>>>   {
>>>>>       unsigned size;
>>>>>       void *ptr;
>>>>> -    int i;
>>>>> +    int i, idx;
>>>>>       for (i = 0; i < adev->uvd.num_uvd_inst; i++) {
>>>>>           if (adev->uvd.harvest_config & (1 << i))
>>>>> @@ -432,7 +437,10 @@ int amdgpu_uvd_resume(struct amdgpu_device *adev)
>>>>>           ptr = adev->uvd.inst[i].cpu_addr;
>>>>>           if (adev->uvd.inst[i].saved_bo != NULL) {
>>>>> -            memcpy_toio(ptr, adev->uvd.inst[i].saved_bo, size);
>>>>> +            if (drm_dev_enter(&adev->ddev, &idx)) {
>>>>> +                memcpy_toio(ptr, adev->uvd.inst[i].saved_bo, size);
>>>>> +                drm_dev_exit(idx);
>>>>> +            }
>>>>>               kvfree(adev->uvd.inst[i].saved_bo);
>>>>>               adev->uvd.inst[i].saved_bo = NULL;
>>>>>           } else {
>>>>> @@ -442,8 +450,11 @@ int amdgpu_uvd_resume(struct amdgpu_device *adev)
>>>>>               hdr = (const struct common_firmware_header 
>>>>> *)adev->uvd.fw->data;
>>>>>               if (adev->firmware.load_type != AMDGPU_FW_LOAD_PSP) {
>>>>>                   offset = le32_to_cpu(hdr->ucode_array_offset_bytes);
>>>>> -                memcpy_toio(adev->uvd.inst[i].cpu_addr, 
>>>>> adev->uvd.fw->data + offset,
>>>>> -                        le32_to_cpu(hdr->ucode_size_bytes));
>>>>> +                if (drm_dev_enter(&adev->ddev, &idx)) {
>>>>> +                    memcpy_toio(adev->uvd.inst[i].cpu_addr, adev->uvd.fw->data + offset,
>>>>> +                            le32_to_cpu(hdr->ucode_size_bytes));
>>>>> +                    drm_dev_exit(idx);
>>>>> +                }
>>>>>                   size -= le32_to_cpu(hdr->ucode_size_bytes);
>>>>>                   ptr += le32_to_cpu(hdr->ucode_size_bytes);
>>>>>               }
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c 
>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
>>>>> index ea6a62f67e38..833203401ef4 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
>>>>> @@ -29,6 +29,7 @@
>>>>>   #include <linux/module.h>
>>>>>   #include <drm/drm.h>
>>>>> +#include <drm/drm_drv.h>
>>>>>   #include "amdgpu.h"
>>>>>   #include "amdgpu_pm.h"
>>>>> @@ -293,7 +294,7 @@ int amdgpu_vce_resume(struct amdgpu_device *adev)
>>>>>       void *cpu_addr;
>>>>>       const struct common_firmware_header *hdr;
>>>>>       unsigned offset;
>>>>> -    int r;
>>>>> +    int r, idx;
>>>>>       if (adev->vce.vcpu_bo == NULL)
>>>>>           return -EINVAL;
>>>>> @@ -313,8 +314,12 @@ int amdgpu_vce_resume(struct amdgpu_device *adev)
>>>>>       hdr = (const struct common_firmware_header *)adev->vce.fw->data;
>>>>>       offset = le32_to_cpu(hdr->ucode_array_offset_bytes);
>>>>> -    memcpy_toio(cpu_addr, adev->vce.fw->data + offset,
>>>>> -            adev->vce.fw->size - offset);
>>>>> +
>>>>> +    if (drm_dev_enter(&adev->ddev, &idx)) {
>>>>> +        memcpy_toio(cpu_addr, adev->vce.fw->data + offset,
>>>>> +                adev->vce.fw->size - offset);
>>>>> +        drm_dev_exit(idx);
>>>>> +    }
>>>>>       amdgpu_bo_kunmap(adev->vce.vcpu_bo);
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c 
>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
>>>>> index 201645963ba5..21f7d3644d70 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
>>>>> @@ -27,6 +27,7 @@
>>>>>   #include <linux/firmware.h>
>>>>>   #include <linux/module.h>
>>>>>   #include <linux/pci.h>
>>>>> +#include <drm/drm_drv.h>
>>>>>   #include "amdgpu.h"
>>>>>   #include "amdgpu_pm.h"
>>>>> @@ -275,7 +276,7 @@ int amdgpu_vcn_suspend(struct amdgpu_device *adev)
>>>>>   {
>>>>>       unsigned size;
>>>>>       void *ptr;
>>>>> -    int i;
>>>>> +    int i, idx;
>>>>>       cancel_delayed_work_sync(&adev->vcn.idle_work);
>>>>> @@ -292,7 +293,10 @@ int amdgpu_vcn_suspend(struct amdgpu_device 
>>>>> *adev)
>>>>>           if (!adev->vcn.inst[i].saved_bo)
>>>>>               return -ENOMEM;
>>>>> -        memcpy_fromio(adev->vcn.inst[i].saved_bo, ptr, size);
>>>>> +        if (drm_dev_enter(&adev->ddev, &idx)) {
>>>>> +            memcpy_fromio(adev->vcn.inst[i].saved_bo, ptr, size);
>>>>> +            drm_dev_exit(idx);
>>>>> +        }
>>>>>       }
>>>>>       return 0;
>>>>>   }
>>>>> @@ -301,7 +305,7 @@ int amdgpu_vcn_resume(struct amdgpu_device *adev)
>>>>>   {
>>>>>       unsigned size;
>>>>>       void *ptr;
>>>>> -    int i;
>>>>> +    int i, idx;
>>>>>       for (i = 0; i < adev->vcn.num_vcn_inst; ++i) {
>>>>>           if (adev->vcn.harvest_config & (1 << i))
>>>>> @@ -313,7 +317,10 @@ int amdgpu_vcn_resume(struct amdgpu_device *adev)
>>>>>           ptr = adev->vcn.inst[i].cpu_addr;
>>>>>           if (adev->vcn.inst[i].saved_bo != NULL) {
>>>>> -            memcpy_toio(ptr, adev->vcn.inst[i].saved_bo, size);
>>>>> +            if (drm_dev_enter(&adev->ddev, &idx)) {
>>>>> +                memcpy_toio(ptr, adev->vcn.inst[i].saved_bo, size);
>>>>> +                drm_dev_exit(idx);
>>>>> +            }
>>>>>               kvfree(adev->vcn.inst[i].saved_bo);
>>>>>               adev->vcn.inst[i].saved_bo = NULL;
>>>>>           } else {
>>>>> @@ -323,8 +330,11 @@ int amdgpu_vcn_resume(struct amdgpu_device *adev)
>>>>>               hdr = (const struct common_firmware_header 
>>>>> *)adev->vcn.fw->data;
>>>>>               if (adev->firmware.load_type != AMDGPU_FW_LOAD_PSP) {
>>>>>                   offset = le32_to_cpu(hdr->ucode_array_offset_bytes);
>>>>> -                memcpy_toio(adev->vcn.inst[i].cpu_addr, 
>>>>> adev->vcn.fw->data + offset,
>>>>> -                        le32_to_cpu(hdr->ucode_size_bytes));
>>>>> +                if (drm_dev_enter(&adev->ddev, &idx)) {
>>>>> +                    memcpy_toio(adev->vcn.inst[i].cpu_addr, adev->vcn.fw->data + offset,
>>>>> +                            le32_to_cpu(hdr->ucode_size_bytes));
>>>>> +                    drm_dev_exit(idx);
>>>>> +                }
>>>>>                   size -= le32_to_cpu(hdr->ucode_size_bytes);
>>>>>                   ptr += le32_to_cpu(hdr->ucode_size_bytes);
>>>>>               }
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>>> index 9f868cf3b832..7dd5f10ab570 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>>> @@ -32,6 +32,7 @@
>>>>>   #include <linux/dma-buf.h>
>>>>>   #include <drm/amdgpu_drm.h>
>>>>> +#include <drm/drm_drv.h>
>>>>>   #include "amdgpu.h"
>>>>>   #include "amdgpu_trace.h"
>>>>>   #include "amdgpu_amdkfd.h"
>>>>> @@ -1606,7 +1607,10 @@ static int 
>>>>> amdgpu_vm_bo_update_mapping(struct amdgpu_device *adev,
>>>>>       struct amdgpu_vm_update_params params;
>>>>>       enum amdgpu_sync_mode sync_mode;
>>>>>       uint64_t pfn;
>>>>> -    int r;
>>>>> +    int r, idx;
>>>>> +
>>>>> +    if (!drm_dev_enter(&adev->ddev, &idx))
>>>>> +        return -ENODEV;
>>>>>       memset(&params, 0, sizeof(params));
>>>>>       params.adev = adev;
>>>>> @@ -1715,6 +1719,7 @@ static int amdgpu_vm_bo_update_mapping(struct 
>>>>> amdgpu_device *adev,
>>>>>   error_unlock:
>>>>>       amdgpu_vm_eviction_unlock(vm);
>>>>> +    drm_dev_exit(idx);
>>>>>       return r;
>>>>>   }
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c 
>>>>> b/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
>>>>> index 589410c32d09..2cec71e823f5 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
>>>>> @@ -23,6 +23,7 @@
>>>>>   #include <linux/firmware.h>
>>>>>   #include <linux/module.h>
>>>>>   #include <linux/vmalloc.h>
>>>>> +#include <drm/drm_drv.h>
>>>>>   #include "amdgpu.h"
>>>>>   #include "amdgpu_psp.h"
>>>>> @@ -269,10 +270,8 @@ static int 
>>>>> psp_v11_0_bootloader_load_kdb(struct psp_context *psp)
>>>>>       if (ret)
>>>>>           return ret;
>>>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>> -
>>>>>       /* Copy PSP KDB binary to memory */
>>>>> -    memcpy(psp->fw_pri_buf, psp->kdb_start_addr, psp->kdb_bin_size);
>>>>> +    psp_copy_fw(psp, psp->kdb_start_addr, psp->kdb_bin_size);
>>>>>       /* Provide the PSP KDB to bootloader */
>>>>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>>> @@ -302,10 +301,8 @@ static int 
>>>>> psp_v11_0_bootloader_load_spl(struct psp_context *psp)
>>>>>       if (ret)
>>>>>           return ret;
>>>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>> -
>>>>>       /* Copy PSP SPL binary to memory */
>>>>> -    memcpy(psp->fw_pri_buf, psp->spl_start_addr, psp->spl_bin_size);
>>>>> +    psp_copy_fw(psp, psp->spl_start_addr, psp->spl_bin_size);
>>>>>       /* Provide the PSP SPL to bootloader */
>>>>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>>> @@ -335,10 +332,8 @@ static int 
>>>>> psp_v11_0_bootloader_load_sysdrv(struct psp_context *psp)
>>>>>       if (ret)
>>>>>           return ret;
>>>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>> -
>>>>>       /* Copy PSP System Driver binary to memory */
>>>>> -    memcpy(psp->fw_pri_buf, psp->sys_start_addr, psp->sys_bin_size);
>>>>> +    psp_copy_fw(psp, psp->sys_start_addr, psp->sys_bin_size);
>>>>>       /* Provide the sys driver to bootloader */
>>>>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>>> @@ -371,10 +366,8 @@ static int 
>>>>> psp_v11_0_bootloader_load_sos(struct psp_context *psp)
>>>>>       if (ret)
>>>>>           return ret;
>>>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>> -
>>>>>       /* Copy Secure OS binary to PSP memory */
>>>>> -    memcpy(psp->fw_pri_buf, psp->sos_start_addr, psp->sos_bin_size);
>>>>> +    psp_copy_fw(psp, psp->sos_start_addr, psp->sos_bin_size);
>>>>>       /* Provide the PSP secure OS to bootloader */
>>>>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>>> @@ -608,7 +601,7 @@ static int psp_v11_0_memory_training(struct 
>>>>> psp_context *psp, uint32_t ops)
>>>>>       uint32_t p2c_header[4];
>>>>>       uint32_t sz;
>>>>>       void *buf;
>>>>> -    int ret;
>>>>> +    int ret, idx;
>>>>>       if (ctx->init == PSP_MEM_TRAIN_NOT_SUPPORT) {
>>>>>           DRM_DEBUG("Memory training is not supported.\n");
>>>>> @@ -681,17 +674,24 @@ static int psp_v11_0_memory_training(struct 
>>>>> psp_context *psp, uint32_t ops)
>>>>>               return -ENOMEM;
>>>>>           }
>>>>> -        memcpy_fromio(buf, adev->mman.aper_base_kaddr, sz);
>>>>> -        ret = psp_v11_0_memory_training_send_msg(psp, 
>>>>> PSP_BL__DRAM_LONG_TRAIN);
>>>>> -        if (ret) {
>>>>> -            DRM_ERROR("Send long training msg failed.\n");
>>>>> +        if (drm_dev_enter(&adev->ddev, &idx)) {
>>>>> +            memcpy_fromio(buf, adev->mman.aper_base_kaddr, sz);
>>>>> +            ret = psp_v11_0_memory_training_send_msg(psp, 
>>>>> PSP_BL__DRAM_LONG_TRAIN);
>>>>> +            if (ret) {
>>>>> +                DRM_ERROR("Send long training msg failed.\n");
>>>>> +                vfree(buf);
>>>>> +                drm_dev_exit(idx);
>>>>> +                return ret;
>>>>> +            }
>>>>> +
>>>>> +            memcpy_toio(adev->mman.aper_base_kaddr, buf, sz);
>>>>> +            adev->hdp.funcs->flush_hdp(adev, NULL);
>>>>>               vfree(buf);
>>>>> -            return ret;
>>>>> +            drm_dev_exit(idx);
>>>>> +        } else {
>>>>> +            vfree(buf);
>>>>> +            return -ENODEV;
>>>>>           }
>>>>> -
>>>>> -        memcpy_toio(adev->mman.aper_base_kaddr, buf, sz);
>>>>> -        adev->hdp.funcs->flush_hdp(adev, NULL);
>>>>> -        vfree(buf);
>>>>>       }
>>>>>       if (ops & PSP_MEM_TRAIN_SAVE) {
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c 
>>>>> b/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
>>>>> index c4828bd3264b..618e5b6b85d9 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
>>>>> @@ -138,10 +138,8 @@ static int 
>>>>> psp_v12_0_bootloader_load_sysdrv(struct psp_context *psp)
>>>>>       if (ret)
>>>>>           return ret;
>>>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>> -
>>>>>       /* Copy PSP System Driver binary to memory */
>>>>> -    memcpy(psp->fw_pri_buf, psp->sys_start_addr, psp->sys_bin_size);
>>>>> +    psp_copy_fw(psp, psp->sys_start_addr, psp->sys_bin_size);
>>>>>       /* Provide the sys driver to bootloader */
>>>>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>>> @@ -179,10 +177,8 @@ static int 
>>>>> psp_v12_0_bootloader_load_sos(struct psp_context *psp)
>>>>>       if (ret)
>>>>>           return ret;
>>>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>> -
>>>>>       /* Copy Secure OS binary to PSP memory */
>>>>> -    memcpy(psp->fw_pri_buf, psp->sos_start_addr, psp->sos_bin_size);
>>>>> +    psp_copy_fw(psp, psp->sos_start_addr, psp->sos_bin_size);
>>>>>       /* Provide the PSP secure OS to bootloader */
>>>>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c 
>>>>> b/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
>>>>> index f2e725f72d2f..d0a6cccd0897 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
>>>>> @@ -102,10 +102,8 @@ static int 
>>>>> psp_v3_1_bootloader_load_sysdrv(struct psp_context *psp)
>>>>>       if (ret)
>>>>>           return ret;
>>>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>> -
>>>>>       /* Copy PSP System Driver binary to memory */
>>>>> -    memcpy(psp->fw_pri_buf, psp->sys_start_addr, psp->sys_bin_size);
>>>>> +    psp_copy_fw(psp, psp->sys_start_addr, psp->sys_bin_size);
>>>>>       /* Provide the sys driver to bootloader */
>>>>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>>> @@ -143,10 +141,8 @@ static int psp_v3_1_bootloader_load_sos(struct 
>>>>> psp_context *psp)
>>>>>       if (ret)
>>>>>           return ret;
>>>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>> -
>>>>>       /* Copy Secure OS binary to PSP memory */
>>>>> -    memcpy(psp->fw_pri_buf, psp->sos_start_addr, psp->sos_bin_size);
>>>>> +    psp_copy_fw(psp, psp->sos_start_addr, psp->sos_bin_size);
>>>>>       /* Provide the PSP secure OS to bootloader */
>>>>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c 
>>>>> b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
>>>>> index 8e238dea7bef..90910d19db12 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
>>>>> @@ -25,6 +25,7 @@
>>>>>    */
>>>>>   #include <linux/firmware.h>
>>>>> +#include <drm/drm_drv.h>
>>>>>   #include "amdgpu.h"
>>>>>   #include "amdgpu_vce.h"
>>>>> @@ -555,16 +556,19 @@ static int vce_v4_0_hw_fini(void *handle)
>>>>>   static int vce_v4_0_suspend(void *handle)
>>>>>   {
>>>>>       struct amdgpu_device *adev = (struct amdgpu_device *)handle;
>>>>> -    int r;
>>>>> +    int r, idx;
>>>>>       if (adev->vce.vcpu_bo == NULL)
>>>>>           return 0;
>>>>> -    if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) {
>>>>> -        unsigned size = amdgpu_bo_size(adev->vce.vcpu_bo);
>>>>> -        void *ptr = adev->vce.cpu_addr;
>>>>> +    if (drm_dev_enter(&adev->ddev, &idx)) {
>>>>> +        if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) {
>>>>> +            unsigned size = amdgpu_bo_size(adev->vce.vcpu_bo);
>>>>> +            void *ptr = adev->vce.cpu_addr;
>>>>> -        memcpy_fromio(adev->vce.saved_bo, ptr, size);
>>>>> +            memcpy_fromio(adev->vce.saved_bo, ptr, size);
>>>>> +        }
>>>>> +        drm_dev_exit(idx);
>>>>>       }
>>>>>       r = vce_v4_0_hw_fini(adev);
>>>>> @@ -577,16 +581,20 @@ static int vce_v4_0_suspend(void *handle)
>>>>>   static int vce_v4_0_resume(void *handle)
>>>>>   {
>>>>>       struct amdgpu_device *adev = (struct amdgpu_device *)handle;
>>>>> -    int r;
>>>>> +    int r, idx;
>>>>>       if (adev->vce.vcpu_bo == NULL)
>>>>>           return -EINVAL;
>>>>>       if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) {
>>>>> -        unsigned size = amdgpu_bo_size(adev->vce.vcpu_bo);
>>>>> -        void *ptr = adev->vce.cpu_addr;
>>>>> -        memcpy_toio(ptr, adev->vce.saved_bo, size);
>>>>> +        if (drm_dev_enter(&adev->ddev, &idx)) {
>>>>> +            unsigned size = amdgpu_bo_size(adev->vce.vcpu_bo);
>>>>> +            void *ptr = adev->vce.cpu_addr;
>>>>> +
>>>>> +            memcpy_toio(ptr, adev->vce.saved_bo, size);
>>>>> +            drm_dev_exit(idx);
>>>>> +        }
>>>>>       } else {
>>>>>           r = amdgpu_vce_resume(adev);
>>>>>           if (r)
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c 
>>>>> b/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
>>>>> index 3f15bf34123a..df34be8ec82d 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
>>>>> @@ -34,6 +34,8 @@
>>>>>   #include "vcn/vcn_3_0_0_sh_mask.h"
>>>>>   #include "ivsrcid/vcn/irqsrcs_vcn_2_0.h"
>>>>> +#include <drm/drm_drv.h>
>>>>> +
>>>>>   #define mmUVD_CONTEXT_ID_INTERNAL_OFFSET            0x27
>>>>>   #define mmUVD_GPCOM_VCPU_CMD_INTERNAL_OFFSET 0x0f
>>>>>   #define mmUVD_GPCOM_VCPU_DATA0_INTERNAL_OFFSET 0x10
>>>>> @@ -268,16 +270,20 @@ static int vcn_v3_0_sw_init(void *handle)
>>>>>   static int vcn_v3_0_sw_fini(void *handle)
>>>>>   {
>>>>>       struct amdgpu_device *adev = (struct amdgpu_device *)handle;
>>>>> -    int i, r;
>>>>> +    int i, r, idx;
>>>>> -    for (i = 0; i < adev->vcn.num_vcn_inst; i++) {
>>>>> -        volatile struct amdgpu_fw_shared *fw_shared;
>>>>> +    if (drm_dev_enter(&adev->ddev, &idx)) {
>>>>> +        for (i = 0; i < adev->vcn.num_vcn_inst; i++) {
>>>>> +            volatile struct amdgpu_fw_shared *fw_shared;
>>>>> -        if (adev->vcn.harvest_config & (1 << i))
>>>>> -            continue;
>>>>> -        fw_shared = adev->vcn.inst[i].fw_shared_cpu_addr;
>>>>> -        fw_shared->present_flag_0 = 0;
>>>>> -        fw_shared->sw_ring.is_enabled = false;
>>>>> +            if (adev->vcn.harvest_config & (1 << i))
>>>>> +                continue;
>>>>> +            fw_shared = adev->vcn.inst[i].fw_shared_cpu_addr;
>>>>> +            fw_shared->present_flag_0 = 0;
>>>>> +            fw_shared->sw_ring.is_enabled = false;
>>>>> +        }
>>>>> +
>>>>> +        drm_dev_exit(idx);
>>>>>       }
>>>>>       if (amdgpu_sriov_vf(adev))
>>>>> diff --git a/drivers/gpu/drm/amd/pm/powerplay/smumgr/smu7_smumgr.c 
>>>>> b/drivers/gpu/drm/amd/pm/powerplay/smumgr/smu7_smumgr.c
>>>>> index aae25243eb10..d628b91846c9 100644
>>>>> --- a/drivers/gpu/drm/amd/pm/powerplay/smumgr/smu7_smumgr.c
>>>>> +++ b/drivers/gpu/drm/amd/pm/powerplay/smumgr/smu7_smumgr.c
>>>>> @@ -405,6 +405,8 @@ int smu7_request_smu_load_fw(struct pp_hwmgr 
>>>>> *hwmgr)
>>>>>                   UCODE_ID_MEC_STORAGE, 
>>>>> &toc->entry[toc->num_entries++]),
>>>>>                   "Failed to Get Firmware Entry.", r = -EINVAL; 
>>>>> goto failed);
>>>>>       }
>>>>> +
>>>>> +    /* AG TODO Can't call drm_dev_enter/exit because access 
>>>>> adev->ddev here ... */
>>>>>       memcpy_toio(smu_data->header_buffer.kaddr, smu_data->toc,
>>>>>               sizeof(struct SMU_DRAMData_TOC));
>>>>>       smum_send_msg_to_smc_with_parameter(hwmgr,
>>>>
> 


* Re: [PATCH v6 10/16] drm/amdgpu: Guard against write accesses after device removal
@ 2021-05-12 14:11             ` Andrey Grodzovsky
From: Andrey Grodzovsky @ 2021-05-12 14:11 UTC (permalink / raw)
  To: Christian König, dri-devel, amd-gfx, linux-pci,
	daniel.vetter, Harry.Wentland
  Cc: Alexander.Deucher, gregkh, helgaas, Felix.Kuehling

Hopefully Alex can chime in on this.
I will respin V7 soon.

Andrey

On 2021-05-12 10:06 a.m., Christian König wrote:
> Am 12.05.21 um 16:01 schrieb Andrey Grodzovsky:
>> Ping - need a confirmation it's ok to keep this as a single patch given
>> my explanation below.
> 
> It was just a suggestion. The key point is that the approach sounds sane to me, 
> but I can't say much about the psp code, for example.
> 
> So the most I can give you is an Acked-by for that.
> 
> Christian.
> 
>>
>> Andrey
>>
>> On 2021-05-11 1:52 p.m., Andrey Grodzovsky wrote:
>>>
>>>
>>> On 2021-05-11 2:50 a.m., Christian König wrote:
>>>> Am 10.05.21 um 18:36 schrieb Andrey Grodzovsky:
>>>>> This should prevent writing to memory or IO ranges possibly
>>>>> already allocated for other uses after our device is removed.
>>>>>
>>>>> v5:
>>>>> Protect more places where memcpy_to/from_io takes place
>>>>> Protect IB submissions
>>>>>
>>>>> v6: Switch to !drm_dev_enter instead of scoping entire code
>>>>> with brackets.
>>>>>
>>>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>>>>> ---
>>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c    | 11 ++-
>>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c       |  9 +++
>>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c        | 17 +++--
>>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c       | 63 +++++++++++------
>>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h       |  2 +
>>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c      | 70 
>>>>> +++++++++++++++++++
>>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h      | 49 ++-----------
>>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c       | 31 +++++---
>>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c       | 11 ++-
>>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c       | 22 ++++--
>>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c        |  7 +-
>>>>>   drivers/gpu/drm/amd/amdgpu/psp_v11_0.c        | 44 ++++++------
>>>>>   drivers/gpu/drm/amd/amdgpu/psp_v12_0.c        |  8 +--
>>>>>   drivers/gpu/drm/amd/amdgpu/psp_v3_1.c         |  8 +--
>>>>>   drivers/gpu/drm/amd/amdgpu/vce_v4_0.c         | 26 ++++---
>>>>>   drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c         | 22 +++---
>>>>>   .../drm/amd/pm/powerplay/smumgr/smu7_smumgr.c |  2 +
>>>>>   17 files changed, 257 insertions(+), 145 deletions(-)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>> index a0bff4713672..94c415176cdc 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>> @@ -71,6 +71,8 @@
>>>>>   #include <drm/task_barrier.h>
>>>>>   #include <linux/pm_runtime.h>
>>>>> +#include <drm/drm_drv.h>
>>>>> +
>>>>>   MODULE_FIRMWARE("amdgpu/vega10_gpu_info.bin");
>>>>>   MODULE_FIRMWARE("amdgpu/vega12_gpu_info.bin");
>>>>>   MODULE_FIRMWARE("amdgpu/raven_gpu_info.bin");
>>>>> @@ -281,7 +283,10 @@ void amdgpu_device_vram_access(struct 
>>>>> amdgpu_device *adev, loff_t pos,
>>>>>       unsigned long flags;
>>>>>       uint32_t hi = ~0;
>>>>>       uint64_t last;
>>>>> +    int idx;
>>>>> +     if (!drm_dev_enter(&adev->ddev, &idx))
>>>>> +         return;
>>>>>   #ifdef CONFIG_64BIT
>>>>>       last = min(pos + size, adev->gmc.visible_vram_size);
>>>>> @@ -299,8 +304,10 @@ void amdgpu_device_vram_access(struct 
>>>>> amdgpu_device *adev, loff_t pos,
>>>>>               memcpy_fromio(buf, addr, count);
>>>>>           }
>>>>> -        if (count == size)
>>>>> +        if (count == size) {
>>>>> +            drm_dev_exit(idx);
>>>>>               return;
>>>>> +        }
>>>>
>>>> Maybe use a goto instead, but really just a nitpick.
>>>>
>>>>
>>>>
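The goto suggestion above can be sketched in plain userspace C. This is only an illustration, not the driver code: `mock_dev`, `mock_dev_enter`, `mock_dev_exit`, `mock_sections` and `mock_vram_read` are hypothetical stand-ins that merely mimic the shape of drm_dev_enter()/drm_dev_exit(), and the "slow path" is a placeholder. The point is that every return after a successful enter goes through one label, so the exit cannot be forgotten on an early-return path.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace stand-in for struct drm_device, not the kernel type. */
struct mock_dev {
    bool unplugged;            /* true once the device has been removed */
};

static int mock_sections;      /* number of currently open enter/exit sections */

/* Same shape as drm_dev_enter(): returns false once the device is gone. */
static bool mock_dev_enter(struct mock_dev *dev, int *idx)
{
    if (dev->unplugged)
        return false;
    *idx = mock_sections++;
    return true;
}

/* Same shape as drm_dev_exit(): closes the section opened by the matching enter. */
static void mock_dev_exit(int idx)
{
    (void)idx;
    assert(mock_sections > 0);
    mock_sections--;
}

/* Returns 0 on success, -ENODEV if the (mock) device is already gone. */
static int mock_vram_read(struct mock_dev *dev, const unsigned char *aperture,
                          size_t visible, unsigned char *buf, size_t size)
{
    size_t count;
    int idx;

    if (!mock_dev_enter(dev, &idx))
        return -ENODEV;

    /* Fast path: copy whatever lies inside the visible window. */
    count = size < visible ? size : visible;
    memcpy(buf, aperture, count);
    if (count == size)
        goto out;              /* done early, but still leave through the one exit */

    /* Slow-path placeholder: zero-fill the part outside the window. */
    memset(buf + count, 0, size - count);

out:
    mock_dev_exit(idx);
    return 0;
}
```

With a single `out:` label the enter/exit pairing stays balanced no matter how many early exits are added later, which is the property the nitpick is after.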
>>>>>           pos += count;
>>>>>           buf += count / 4;
>>>>> @@ -323,6 +330,8 @@ void amdgpu_device_vram_access(struct 
>>>>> amdgpu_device *adev, loff_t pos,
>>>>>               *buf++ = RREG32_NO_KIQ(mmMM_DATA);
>>>>>       }
>>>>>       spin_unlock_irqrestore(&adev->mmio_idx_lock, flags);
>>>>> +
>>>>> +    drm_dev_exit(idx);
>>>>>   }
>>>>>   /*
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c 
>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
>>>>> index 4d32233cde92..04ba5eef1e88 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
>>>>> @@ -31,6 +31,8 @@
>>>>>   #include "amdgpu_ras.h"
>>>>>   #include "amdgpu_xgmi.h"
>>>>> +#include <drm/drm_drv.h>
>>>>> +
>>>>>   /**
>>>>>    * amdgpu_gmc_pdb0_alloc - allocate vram for pdb0
>>>>>    *
>>>>> @@ -151,6 +153,10 @@ int amdgpu_gmc_set_pte_pde(struct 
>>>>> amdgpu_device *adev, void *cpu_pt_addr,
>>>>>   {
>>>>>       void __iomem *ptr = (void *)cpu_pt_addr;
>>>>>       uint64_t value;
>>>>> +    int idx;
>>>>> +
>>>>> +    if (!drm_dev_enter(&adev->ddev, &idx))
>>>>> +        return 0;
>>>>>       /*
>>>>>        * The following is for PTE only. GART does not have PDEs.
>>>>> @@ -158,6 +164,9 @@ int amdgpu_gmc_set_pte_pde(struct amdgpu_device 
>>>>> *adev, void *cpu_pt_addr,
>>>>>       value = addr & 0x0000FFFFFFFFF000ULL;
>>>>>       value |= flags;
>>>>>       writeq(value, ptr + (gpu_page_idx * 8));
>>>>> +
>>>>> +    drm_dev_exit(idx);
>>>>> +
>>>>>       return 0;
>>>>>   }
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c 
>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
>>>>> index 148a3b481b12..62fcbd446c71 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
>>>>> @@ -30,6 +30,7 @@
>>>>>   #include <linux/slab.h>
>>>>>   #include <drm/amdgpu_drm.h>
>>>>> +#include <drm/drm_drv.h>
>>>>>   #include "amdgpu.h"
>>>>>   #include "atom.h"
>>>>> @@ -137,7 +138,7 @@ int amdgpu_ib_schedule(struct amdgpu_ring 
>>>>> *ring, unsigned num_ibs,
>>>>>       bool secure;
>>>>>       unsigned i;
>>>>> -    int r = 0;
>>>>> +    int idx, r = 0;
>>>>>       bool need_pipe_sync = false;
>>>>>       if (num_ibs == 0)
>>>>> @@ -169,13 +170,16 @@ int amdgpu_ib_schedule(struct amdgpu_ring 
>>>>> *ring, unsigned num_ibs,
>>>>>           return -EINVAL;
>>>>>       }
>>>>> +    if (!drm_dev_enter(&adev->ddev, &idx))
>>>>> +        return -ENODEV;
>>>>> +
>>>>>       alloc_size = ring->funcs->emit_frame_size + num_ibs *
>>>>>           ring->funcs->emit_ib_size;
>>>>>       r = amdgpu_ring_alloc(ring, alloc_size);
>>>>>       if (r) {
>>>>>           dev_err(adev->dev, "scheduling IB failed (%d).\n", r);
>>>>> -        return r;
>>>>> +        goto exit;
>>>>>       }
>>>>>       need_ctx_switch = ring->current_ctx != fence_ctx;
>>>>> @@ -205,7 +209,7 @@ int amdgpu_ib_schedule(struct amdgpu_ring 
>>>>> *ring, unsigned num_ibs,
>>>>>           r = amdgpu_vm_flush(ring, job, need_pipe_sync);
>>>>>           if (r) {
>>>>>               amdgpu_ring_undo(ring);
>>>>> -            return r;
>>>>> +            goto exit;
>>>>>           }
>>>>>       }
>>>>> @@ -286,7 +290,7 @@ int amdgpu_ib_schedule(struct amdgpu_ring 
>>>>> *ring, unsigned num_ibs,
>>>>>           if (job && job->vmid)
>>>>>               amdgpu_vmid_reset(adev, ring->funcs->vmhub, job->vmid);
>>>>>           amdgpu_ring_undo(ring);
>>>>> -        return r;
>>>>> +        goto exit;
>>>>>       }
>>>>>       if (ring->funcs->insert_end)
>>>>> @@ -304,7 +308,10 @@ int amdgpu_ib_schedule(struct amdgpu_ring 
>>>>> *ring, unsigned num_ibs,
>>>>>           ring->funcs->emit_wave_limit(ring, false);
>>>>>       amdgpu_ring_commit(ring);
>>>>> -    return 0;
>>>>> +
>>>>> +exit:
>>>>> +    drm_dev_exit(idx);
>>>>> +    return r;
>>>>>   }
>>>>>   /**
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c 
>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
>>>>> index 9e769cf6095b..bb6afee61666 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
>>>>> @@ -25,6 +25,7 @@
>>>>>   #include <linux/firmware.h>
>>>>>   #include <linux/dma-mapping.h>
>>>>> +#include <drm/drm_drv.h>
>>>>>   #include "amdgpu.h"
>>>>>   #include "amdgpu_psp.h"
>>>>> @@ -39,6 +40,8 @@
>>>>>   #include "amdgpu_ras.h"
>>>>>   #include "amdgpu_securedisplay.h"
>>>>> +#include <drm/drm_drv.h>
>>>>> +
>>>>>   static int psp_sysfs_init(struct amdgpu_device *adev);
>>>>>   static void psp_sysfs_fini(struct amdgpu_device *adev);
>>>>> @@ -253,7 +256,7 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>>>>              struct psp_gfx_cmd_resp *cmd, uint64_t fence_mc_addr)
>>>>>   {
>>>>>       int ret;
>>>>> -    int index;
>>>>> +    int index, idx;
>>>>>       int timeout = 20000;
>>>>>       bool ras_intr = false;
>>>>>       bool skip_unsupport = false;
>>>>> @@ -261,6 +264,9 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>>>>       if (psp->adev->in_pci_err_recovery)
>>>>>           return 0;
>>>>> +    if (!drm_dev_enter(&psp->adev->ddev, &idx))
>>>>> +        return 0;
>>>>> +
>>>>>       mutex_lock(&psp->mutex);
>>>>>       memset(psp->cmd_buf_mem, 0, PSP_CMD_BUFFER_SIZE);
>>>>> @@ -271,8 +277,7 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>>>>       ret = psp_ring_cmd_submit(psp, psp->cmd_buf_mc_addr, 
>>>>> fence_mc_addr, index);
>>>>>       if (ret) {
>>>>>           atomic_dec(&psp->fence_value);
>>>>> -        mutex_unlock(&psp->mutex);
>>>>> -        return ret;
>>>>> +        goto exit;
>>>>>       }
>>>>>       amdgpu_asic_invalidate_hdp(psp->adev, NULL);
>>>>> @@ -312,8 +317,8 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>>>>                psp->cmd_buf_mem->cmd_id,
>>>>>                psp->cmd_buf_mem->resp.status);
>>>>>           if (!timeout) {
>>>>> -            mutex_unlock(&psp->mutex);
>>>>> -            return -EINVAL;
>>>>> +            ret = -EINVAL;
>>>>> +            goto exit;
>>>>>           }
>>>>>       }
>>>>> @@ -321,8 +326,10 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>>>>           ucode->tmr_mc_addr_lo = psp->cmd_buf_mem->resp.fw_addr_lo;
>>>>>           ucode->tmr_mc_addr_hi = psp->cmd_buf_mem->resp.fw_addr_hi;
>>>>>       }
>>>>> -    mutex_unlock(&psp->mutex);
>>>>> +exit:
>>>>> +    mutex_unlock(&psp->mutex);
>>>>> +    drm_dev_exit(idx);
>>>>>       return ret;
>>>>>   }
>>>>> @@ -359,8 +366,7 @@ static int psp_load_toc(struct psp_context *psp,
>>>>>       if (!cmd)
>>>>>           return -ENOMEM;
>>>>>       /* Copy toc to psp firmware private buffer */
>>>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>> -    memcpy(psp->fw_pri_buf, psp->toc_start_addr, psp->toc_bin_size);
>>>>> +    psp_copy_fw(psp, psp->toc_start_addr, psp->toc_bin_size);
>>>>>       psp_prep_load_toc_cmd_buf(cmd, psp->fw_pri_mc_addr, 
>>>>> psp->toc_bin_size);
>>>>> @@ -625,8 +631,7 @@ static int psp_asd_load(struct psp_context *psp)
>>>>>       if (!cmd)
>>>>>           return -ENOMEM;
>>>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>> -    memcpy(psp->fw_pri_buf, psp->asd_start_addr, 
>>>>> psp->asd_ucode_size);
>>>>> +    psp_copy_fw(psp, psp->asd_start_addr, psp->asd_ucode_size);
>>>>>       psp_prep_asd_load_cmd_buf(cmd, psp->fw_pri_mc_addr,
>>>>>                     psp->asd_ucode_size);
>>>>> @@ -781,8 +786,7 @@ static int psp_xgmi_load(struct psp_context *psp)
>>>>>       if (!cmd)
>>>>>           return -ENOMEM;
>>>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>> -    memcpy(psp->fw_pri_buf, psp->ta_xgmi_start_addr, 
>>>>> psp->ta_xgmi_ucode_size);
>>>>> +    psp_copy_fw(psp, psp->ta_xgmi_start_addr, 
>>>>> psp->ta_xgmi_ucode_size);
>>>>>       psp_prep_ta_load_cmd_buf(cmd,
>>>>>                    psp->fw_pri_mc_addr,
>>>>> @@ -1038,8 +1042,7 @@ static int psp_ras_load(struct psp_context *psp)
>>>>>       if (!cmd)
>>>>>           return -ENOMEM;
>>>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>> -    memcpy(psp->fw_pri_buf, psp->ta_ras_start_addr, 
>>>>> psp->ta_ras_ucode_size);
>>>>> +    psp_copy_fw(psp, psp->ta_ras_start_addr, psp->ta_ras_ucode_size);
>>>>>       psp_prep_ta_load_cmd_buf(cmd,
>>>>>                    psp->fw_pri_mc_addr,
>>>>> @@ -1275,8 +1278,7 @@ static int psp_hdcp_load(struct psp_context 
>>>>> *psp)
>>>>>       if (!cmd)
>>>>>           return -ENOMEM;
>>>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>> -    memcpy(psp->fw_pri_buf, psp->ta_hdcp_start_addr,
>>>>> +    psp_copy_fw(psp, psp->ta_hdcp_start_addr,
>>>>>              psp->ta_hdcp_ucode_size);
>>>>>       psp_prep_ta_load_cmd_buf(cmd,
>>>>> @@ -1427,8 +1429,7 @@ static int psp_dtm_load(struct psp_context *psp)
>>>>>       if (!cmd)
>>>>>           return -ENOMEM;
>>>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>> -    memcpy(psp->fw_pri_buf, psp->ta_dtm_start_addr, 
>>>>> psp->ta_dtm_ucode_size);
>>>>> +    psp_copy_fw(psp, psp->ta_dtm_start_addr, psp->ta_dtm_ucode_size);
>>>>>       psp_prep_ta_load_cmd_buf(cmd,
>>>>>                    psp->fw_pri_mc_addr,
>>>>> @@ -1573,8 +1574,7 @@ static int psp_rap_load(struct psp_context *psp)
>>>>>       if (!cmd)
>>>>>           return -ENOMEM;
>>>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>> -    memcpy(psp->fw_pri_buf, psp->ta_rap_start_addr, 
>>>>> psp->ta_rap_ucode_size);
>>>>> +    psp_copy_fw(psp, psp->ta_rap_start_addr, psp->ta_rap_ucode_size);
>>>>>       psp_prep_ta_load_cmd_buf(cmd,
>>>>>                    psp->fw_pri_mc_addr,
>>>>> @@ -3022,7 +3022,7 @@ static ssize_t 
>>>>> psp_usbc_pd_fw_sysfs_write(struct device *dev,
>>>>>       struct amdgpu_device *adev = drm_to_adev(ddev);
>>>>>       void *cpu_addr;
>>>>>       dma_addr_t dma_addr;
>>>>> -    int ret;
>>>>> +    int ret, idx;
>>>>>       char fw_name[100];
>>>>>       const struct firmware *usbc_pd_fw;
>>>>> @@ -3031,6 +3031,9 @@ static ssize_t 
>>>>> psp_usbc_pd_fw_sysfs_write(struct device *dev,
>>>>>           return -EBUSY;
>>>>>       }
>>>>> +    if (!drm_dev_enter(ddev, &idx))
>>>>> +        return -ENODEV;
>>>>> +
>>>>>       snprintf(fw_name, sizeof(fw_name), "amdgpu/%s", buf);
>>>>>       ret = request_firmware(&usbc_pd_fw, fw_name, adev->dev);
>>>>>       if (ret)
>>>>> @@ -3062,16 +3065,30 @@ static ssize_t 
>>>>> psp_usbc_pd_fw_sysfs_write(struct device *dev,
>>>>>   rel_buf:
>>>>>       dma_free_coherent(adev->dev, usbc_pd_fw->size, cpu_addr, 
>>>>> dma_addr);
>>>>>       release_firmware(usbc_pd_fw);
>>>>> -
>>>>>   fail:
>>>>>       if (ret) {
>>>>>           DRM_ERROR("Failed to load USBC PD FW, err = %d", ret);
>>>>> -        return ret;
>>>>> +        count = ret;
>>>>>       }
>>>>> +    drm_dev_exit(idx);
>>>>>       return count;
>>>>>   }
>>>>> +void psp_copy_fw(struct psp_context *psp, uint8_t *start_addr, 
>>>>> uint32_t bin_size)
>>>>> +{
>>>>> +    int idx;
>>>>> +
>>>>> +    if (!drm_dev_enter(&psp->adev->ddev, &idx))
>>>>> +        return;
>>>>> +
>>>>> +    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>> +    memcpy(psp->fw_pri_buf, start_addr, bin_size);
>>>>> +
>>>>> +    drm_dev_exit(idx);
>>>>> +}
>>>>> +
>>>>> +
>>>>>   static DEVICE_ATTR(usbc_pd_fw, S_IRUGO | S_IWUSR,
>>>>>              psp_usbc_pd_fw_sysfs_read,
>>>>>              psp_usbc_pd_fw_sysfs_write);
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h 
>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
>>>>> index 46a5328e00e0..2bfdc278817f 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
>>>>> @@ -423,4 +423,6 @@ int psp_get_fw_attestation_records_addr(struct 
>>>>> psp_context *psp,
>>>>>   int psp_load_fw_list(struct psp_context *psp,
>>>>>                struct amdgpu_firmware_info **ucode_list, int 
>>>>> ucode_count);
>>>>> +void psp_copy_fw(struct psp_context *psp, uint8_t *start_addr, 
>>>>> uint32_t bin_size);
>>>>> +
>>>>>   #endif
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c 
>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>>>>> index 688624ebe421..e1985bc34436 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>>>>> @@ -35,6 +35,8 @@
>>>>>   #include "amdgpu.h"
>>>>>   #include "atom.h"
>>>>> +#include <drm/drm_drv.h>
>>>>> +
>>>>>   /*
>>>>>    * Rings
>>>>>    * Most engines on the GPU are fed via ring buffers.  Ring
>>>>> @@ -461,3 +463,71 @@ int amdgpu_ring_test_helper(struct amdgpu_ring 
>>>>> *ring)
>>>>>       ring->sched.ready = !r;
>>>>>       return r;
>>>>>   }
>>>>> +
>>>>> +void amdgpu_ring_clear_ring(struct amdgpu_ring *ring)
>>>>> +{
>>>>> +    int idx;
>>>>> +    int i = 0;
>>>>> +
>>>>> +    if (!drm_dev_enter(&ring->adev->ddev, &idx))
>>>>> +        return;
>>>>> +
>>>>> +    while (i <= ring->buf_mask)
>>>>> +        ring->ring[i++] = ring->funcs->nop;
>>>>> +
>>>>> +    drm_dev_exit(idx);
>>>>> +
>>>>> +}
>>>>> +
>>>>> +void amdgpu_ring_write(struct amdgpu_ring *ring, uint32_t v)
>>>>> +{
>>>>> +    int idx;
>>>>> +
>>>>> +    if (!drm_dev_enter(&ring->adev->ddev, &idx))
>>>>> +        return;
>>>>> +
>>>>> +    if (ring->count_dw <= 0)
>>>>> +        DRM_ERROR("amdgpu: writing more dwords to the ring than 
>>>>> expected!\n");
>>>>> +    ring->ring[ring->wptr++ & ring->buf_mask] = v;
>>>>> +    ring->wptr &= ring->ptr_mask;
>>>>> +    ring->count_dw--;
>>>>> +
>>>>> +    drm_dev_exit(idx);
>>>>> +}
>>>>> +
>>>>> +void amdgpu_ring_write_multiple(struct amdgpu_ring *ring,
>>>>> +                          void *src, int count_dw)
>>>>> +{
>>>>> +    unsigned occupied, chunk1, chunk2;
>>>>> +    void *dst;
>>>>> +    int idx;
>>>>> +
>>>>> +    if (!drm_dev_enter(&ring->adev->ddev, &idx))
>>>>> +        return;
>>>>> +
>>>>> +    if (unlikely(ring->count_dw < count_dw))
>>>>> +        DRM_ERROR("amdgpu: writing more dwords to the ring than 
>>>>> expected!\n");
>>>>> +
>>>>> +    occupied = ring->wptr & ring->buf_mask;
>>>>> +    dst = (void *)&ring->ring[occupied];
>>>>> +    chunk1 = ring->buf_mask + 1 - occupied;
>>>>> +    chunk1 = (chunk1 >= count_dw) ? count_dw: chunk1;
>>>>> +    chunk2 = count_dw - chunk1;
>>>>> +    chunk1 <<= 2;
>>>>> +    chunk2 <<= 2;
>>>>> +
>>>>> +    if (chunk1)
>>>>> +        memcpy(dst, src, chunk1);
>>>>> +
>>>>> +    if (chunk2) {
>>>>> +        src += chunk1;
>>>>> +        dst = (void *)ring->ring;
>>>>> +        memcpy(dst, src, chunk2);
>>>>> +    }
>>>>> +
>>>>> +    ring->wptr += count_dw;
>>>>> +    ring->wptr &= ring->ptr_mask;
>>>>> +    ring->count_dw -= count_dw;
>>>>> +
>>>>> +    drm_dev_exit(idx);
>>>>> +}
>>>>
>>>> The ring should never be in MMIO memory, so you can completely drop 
>>>> that as far as I can see.
>>>
>>> Yeah, it's all in GART, missed it for some reason...
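For reference, since the ring lives in ordinary memory, the wrap-around write needs no device guard at all. Below is a minimal userspace sketch of that two-chunk copy. `mock_ring` and `mock_ring_write_multiple` are hypothetical names; the fields follow the ones visible in the quoted patch, and the assert replaces the DRM_ERROR of the original purely for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified ring; field names follow the quoted patch. */
struct mock_ring {
    uint32_t *ring;      /* backing store of buf_mask + 1 dwords */
    uint32_t  buf_mask;  /* ring size in dwords minus one (power of two) */
    uint64_t  wptr;      /* write pointer, in dwords */
    uint64_t  ptr_mask;  /* mask applied to wptr after each write */
    int       count_dw;  /* dwords still reserved for writing */
};

/* Copy count_dw dwords into the ring, splitting the copy at the wrap point. */
static void mock_ring_write_multiple(struct mock_ring *ring,
                                     const void *src, int count_dw)
{
    uint32_t occupied, chunk1, chunk2;

    /* Illustration only: the driver logs an error here instead of asserting. */
    assert(count_dw >= 0 && count_dw <= ring->count_dw);

    occupied = (uint32_t)(ring->wptr & ring->buf_mask);
    chunk1 = ring->buf_mask + 1 - occupied;      /* dwords left before the end */
    if (chunk1 > (uint32_t)count_dw)
        chunk1 = (uint32_t)count_dw;
    chunk2 = (uint32_t)count_dw - chunk1;        /* dwords that wrap to the start */

    memcpy(&ring->ring[occupied], src, (size_t)chunk1 * 4);
    if (chunk2)
        memcpy(ring->ring, (const uint8_t *)src + (size_t)chunk1 * 4,
               (size_t)chunk2 * 4);

    ring->wptr = (ring->wptr + (uint64_t)count_dw) & ring->ptr_mask;
    ring->count_dw -= count_dw;
}
```

Everything here is a plain memcpy into CPU-visible memory, which is why wrapping it in an enter/exit section buys nothing.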
>>>>
>>>> Maybe split that patch by use case so that we can more easily 
>>>> review/ack it.
>>>
>>> In fact everything here is the same use case: once I added unmapping of
>>> all MMIO ranges (both registers and VRAM), I got a lot of page faults
>>> on device removal around any memcpy to/from IO. That is where I put the
>>> drm_dev_enter/exit scope. I also searched the code and preemptively
>>> added guards to any other such place. I did drop amdgpu_ib_schedule
>>> from this patch because it has a dma_fence_wait inside, so we
>>> will take care of it once we decide how to handle dma_fence waits.
>>>
>>> Andrey
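The guard described above, as used in the suspend paths, can be sketched in userspace roughly as follows. All names (`mock_dev`, `mock_dev_enter`, `mock_dev_exit`, `mock_engine`, `mock_engine_suspend`) are hypothetical, and a plain memcpy stands in for memcpy_fromio(). The assumption being illustrated is that once the device is gone only the IO copy is skipped, while the rest of the teardown still runs.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical userspace stand-in for struct drm_device, not the kernel type. */
struct mock_dev {
    bool unplugged;            /* true once the device has been removed */
};

static int mock_sections;      /* number of currently open enter/exit sections */

/* Same shape as drm_dev_enter(): returns false once the device is gone. */
static bool mock_dev_enter(struct mock_dev *dev, int *idx)
{
    if (dev->unplugged)
        return false;
    *idx = mock_sections++;
    return true;
}

/* Same shape as drm_dev_exit(): closes the section opened by the matching enter. */
static void mock_dev_exit(int idx)
{
    (void)idx;
    assert(mock_sections > 0);
    mock_sections--;
}

struct mock_engine {
    struct mock_dev dev;
    unsigned char io[16];      /* stands in for an ioremapped firmware buffer */
    unsigned char saved[16];   /* system-memory copy kept across suspend */
    bool hw_stopped;
};

static int mock_engine_suspend(struct mock_engine *e)
{
    int idx;

    /* Touch the IO range only while the device is still present. */
    if (mock_dev_enter(&e->dev, &idx)) {
        memcpy(e->saved, e->io, sizeof(e->saved));   /* memcpy_fromio() stand-in */
        mock_dev_exit(idx);
    }

    /* Software teardown proceeds whether or not the copy happened. */
    e->hw_stopped = true;
    return 0;
}
```

Scoping only the copy, rather than returning early, keeps the software state consistent after a hot unplug; whether a given call site should skip or fail with -ENODEV is a per-function decision in the patch.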
>>>
>>>>
>>>> Thanks,
>>>> Christian.
>>>>
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h 
>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>>>>> index e7d3d0dbdd96..c67bc6d3d039 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>>>>> @@ -299,53 +299,12 @@ static inline void 
>>>>> amdgpu_ring_set_preempt_cond_exec(struct amdgpu_ring *ring,
>>>>>       *ring->cond_exe_cpu_addr = cond_exec;
>>>>>   }
>>>>> -static inline void amdgpu_ring_clear_ring(struct amdgpu_ring *ring)
>>>>> -{
>>>>> -    int i = 0;
>>>>> -    while (i <= ring->buf_mask)
>>>>> -        ring->ring[i++] = ring->funcs->nop;
>>>>> -
>>>>> -}
>>>>> -
>>>>> -static inline void amdgpu_ring_write(struct amdgpu_ring *ring, 
>>>>> uint32_t v)
>>>>> -{
>>>>> -    if (ring->count_dw <= 0)
>>>>> -        DRM_ERROR("amdgpu: writing more dwords to the ring than 
>>>>> expected!\n");
>>>>> -    ring->ring[ring->wptr++ & ring->buf_mask] = v;
>>>>> -    ring->wptr &= ring->ptr_mask;
>>>>> -    ring->count_dw--;
>>>>> -}
>>>>> +void amdgpu_ring_clear_ring(struct amdgpu_ring *ring);
>>>>> -static inline void amdgpu_ring_write_multiple(struct amdgpu_ring 
>>>>> *ring,
>>>>> -                          void *src, int count_dw)
>>>>> -{
>>>>> -    unsigned occupied, chunk1, chunk2;
>>>>> -    void *dst;
>>>>> -
>>>>> -    if (unlikely(ring->count_dw < count_dw))
>>>>> -        DRM_ERROR("amdgpu: writing more dwords to the ring than 
>>>>> expected!\n");
>>>>> -
>>>>> -    occupied = ring->wptr & ring->buf_mask;
>>>>> -    dst = (void *)&ring->ring[occupied];
>>>>> -    chunk1 = ring->buf_mask + 1 - occupied;
>>>>> -    chunk1 = (chunk1 >= count_dw) ? count_dw: chunk1;
>>>>> -    chunk2 = count_dw - chunk1;
>>>>> -    chunk1 <<= 2;
>>>>> -    chunk2 <<= 2;
>>>>> +void amdgpu_ring_write(struct amdgpu_ring *ring, uint32_t v);
>>>>> -    if (chunk1)
>>>>> -        memcpy(dst, src, chunk1);
>>>>> -
>>>>> -    if (chunk2) {
>>>>> -        src += chunk1;
>>>>> -        dst = (void *)ring->ring;
>>>>> -        memcpy(dst, src, chunk2);
>>>>> -    }
>>>>> -
>>>>> -    ring->wptr += count_dw;
>>>>> -    ring->wptr &= ring->ptr_mask;
>>>>> -    ring->count_dw -= count_dw;
>>>>> -}
>>>>> +void amdgpu_ring_write_multiple(struct amdgpu_ring *ring,
>>>>> +                          void *src, int count_dw);
>>>>>   int amdgpu_ring_test_helper(struct amdgpu_ring *ring);
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c 
>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
>>>>> index c6dbc0801604..82f0542c7792 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
>>>>> @@ -32,6 +32,7 @@
>>>>>   #include <linux/module.h>
>>>>>   #include <drm/drm.h>
>>>>> +#include <drm/drm_drv.h>
>>>>>   #include "amdgpu.h"
>>>>>   #include "amdgpu_pm.h"
>>>>> @@ -375,7 +376,7 @@ int amdgpu_uvd_suspend(struct amdgpu_device *adev)
>>>>>   {
>>>>>       unsigned size;
>>>>>       void *ptr;
>>>>> -    int i, j;
>>>>> +    int i, j, idx;
>>>>>       bool in_ras_intr = amdgpu_ras_intr_triggered();
>>>>>       cancel_delayed_work_sync(&adev->uvd.idle_work);
>>>>> @@ -403,11 +404,15 @@ int amdgpu_uvd_suspend(struct amdgpu_device 
>>>>> *adev)
>>>>>           if (!adev->uvd.inst[j].saved_bo)
>>>>>               return -ENOMEM;
>>>>> -        /* re-write 0 since err_event_athub will corrupt VCPU 
>>>>> buffer */
>>>>> -        if (in_ras_intr)
>>>>> -            memset(adev->uvd.inst[j].saved_bo, 0, size);
>>>>> -        else
>>>>> -            memcpy_fromio(adev->uvd.inst[j].saved_bo, ptr, size);
>>>>> +        if (drm_dev_enter(&adev->ddev, &idx)) {
>>>>> +            /* re-write 0 since err_event_athub will corrupt VCPU 
>>>>> buffer */
>>>>> +            if (in_ras_intr)
>>>>> +                memset(adev->uvd.inst[j].saved_bo, 0, size);
>>>>> +            else
>>>>> +                memcpy_fromio(adev->uvd.inst[j].saved_bo, ptr, size);
>>>>> +
>>>>> +            drm_dev_exit(idx);
>>>>> +        }
>>>>>       }
>>>>>       if (in_ras_intr)
>>>>> @@ -420,7 +425,7 @@ int amdgpu_uvd_resume(struct amdgpu_device *adev)
>>>>>   {
>>>>>       unsigned size;
>>>>>       void *ptr;
>>>>> -    int i;
>>>>> +    int i, idx;
>>>>>       for (i = 0; i < adev->uvd.num_uvd_inst; i++) {
>>>>>           if (adev->uvd.harvest_config & (1 << i))
>>>>> @@ -432,7 +437,10 @@ int amdgpu_uvd_resume(struct amdgpu_device *adev)
>>>>>           ptr = adev->uvd.inst[i].cpu_addr;
>>>>>           if (adev->uvd.inst[i].saved_bo != NULL) {
>>>>> -            memcpy_toio(ptr, adev->uvd.inst[i].saved_bo, size);
>>>>> +            if (drm_dev_enter(&adev->ddev, &idx)) {
>>>>> +                memcpy_toio(ptr, adev->uvd.inst[i].saved_bo, size);
>>>>> +                drm_dev_exit(idx);
>>>>> +            }
>>>>>               kvfree(adev->uvd.inst[i].saved_bo);
>>>>>               adev->uvd.inst[i].saved_bo = NULL;
>>>>>           } else {
>>>>> @@ -442,8 +450,11 @@ int amdgpu_uvd_resume(struct amdgpu_device *adev)
>>>>>               hdr = (const struct common_firmware_header 
>>>>> *)adev->uvd.fw->data;
>>>>>               if (adev->firmware.load_type != AMDGPU_FW_LOAD_PSP) {
>>>>>                   offset = le32_to_cpu(hdr->ucode_array_offset_bytes);
>>>>> -                memcpy_toio(adev->uvd.inst[i].cpu_addr, adev->uvd.fw->data + offset,
>>>>> -                        le32_to_cpu(hdr->ucode_size_bytes));
>>>>> +                if (drm_dev_enter(&adev->ddev, &idx)) {
>>>>> +                    memcpy_toio(adev->uvd.inst[i].cpu_addr, adev->uvd.fw->data + offset,
>>>>> +                            le32_to_cpu(hdr->ucode_size_bytes));
>>>>> +                    drm_dev_exit(idx);
>>>>> +                }
>>>>>                   size -= le32_to_cpu(hdr->ucode_size_bytes);
>>>>>                   ptr += le32_to_cpu(hdr->ucode_size_bytes);
>>>>>               }
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c 
>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
>>>>> index ea6a62f67e38..833203401ef4 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
>>>>> @@ -29,6 +29,7 @@
>>>>>   #include <linux/module.h>
>>>>>   #include <drm/drm.h>
>>>>> +#include <drm/drm_drv.h>
>>>>>   #include "amdgpu.h"
>>>>>   #include "amdgpu_pm.h"
>>>>> @@ -293,7 +294,7 @@ int amdgpu_vce_resume(struct amdgpu_device *adev)
>>>>>       void *cpu_addr;
>>>>>       const struct common_firmware_header *hdr;
>>>>>       unsigned offset;
>>>>> -    int r;
>>>>> +    int r, idx;
>>>>>       if (adev->vce.vcpu_bo == NULL)
>>>>>           return -EINVAL;
>>>>> @@ -313,8 +314,12 @@ int amdgpu_vce_resume(struct amdgpu_device *adev)
>>>>>       hdr = (const struct common_firmware_header *)adev->vce.fw->data;
>>>>>       offset = le32_to_cpu(hdr->ucode_array_offset_bytes);
>>>>> -    memcpy_toio(cpu_addr, adev->vce.fw->data + offset,
>>>>> -            adev->vce.fw->size - offset);
>>>>> +
>>>>> +    if (drm_dev_enter(&adev->ddev, &idx)) {
>>>>> +        memcpy_toio(cpu_addr, adev->vce.fw->data + offset,
>>>>> +                adev->vce.fw->size - offset);
>>>>> +        drm_dev_exit(idx);
>>>>> +    }
>>>>>       amdgpu_bo_kunmap(adev->vce.vcpu_bo);
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c 
>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
>>>>> index 201645963ba5..21f7d3644d70 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
>>>>> @@ -27,6 +27,7 @@
>>>>>   #include <linux/firmware.h>
>>>>>   #include <linux/module.h>
>>>>>   #include <linux/pci.h>
>>>>> +#include <drm/drm_drv.h>
>>>>>   #include "amdgpu.h"
>>>>>   #include "amdgpu_pm.h"
>>>>> @@ -275,7 +276,7 @@ int amdgpu_vcn_suspend(struct amdgpu_device *adev)
>>>>>   {
>>>>>       unsigned size;
>>>>>       void *ptr;
>>>>> -    int i;
>>>>> +    int i, idx;
>>>>>       cancel_delayed_work_sync(&adev->vcn.idle_work);
>>>>> @@ -292,7 +293,10 @@ int amdgpu_vcn_suspend(struct amdgpu_device 
>>>>> *adev)
>>>>>           if (!adev->vcn.inst[i].saved_bo)
>>>>>               return -ENOMEM;
>>>>> -        memcpy_fromio(adev->vcn.inst[i].saved_bo, ptr, size);
>>>>> +        if (drm_dev_enter(&adev->ddev, &idx)) {
>>>>> +            memcpy_fromio(adev->vcn.inst[i].saved_bo, ptr, size);
>>>>> +            drm_dev_exit(idx);
>>>>> +        }
>>>>>       }
>>>>>       return 0;
>>>>>   }
>>>>> @@ -301,7 +305,7 @@ int amdgpu_vcn_resume(struct amdgpu_device *adev)
>>>>>   {
>>>>>       unsigned size;
>>>>>       void *ptr;
>>>>> -    int i;
>>>>> +    int i, idx;
>>>>>       for (i = 0; i < adev->vcn.num_vcn_inst; ++i) {
>>>>>           if (adev->vcn.harvest_config & (1 << i))
>>>>> @@ -313,7 +317,10 @@ int amdgpu_vcn_resume(struct amdgpu_device *adev)
>>>>>           ptr = adev->vcn.inst[i].cpu_addr;
>>>>>           if (adev->vcn.inst[i].saved_bo != NULL) {
>>>>> -            memcpy_toio(ptr, adev->vcn.inst[i].saved_bo, size);
>>>>> +            if (drm_dev_enter(&adev->ddev, &idx)) {
>>>>> +                memcpy_toio(ptr, adev->vcn.inst[i].saved_bo, size);
>>>>> +                drm_dev_exit(idx);
>>>>> +            }
>>>>>               kvfree(adev->vcn.inst[i].saved_bo);
>>>>>               adev->vcn.inst[i].saved_bo = NULL;
>>>>>           } else {
>>>>> @@ -323,8 +330,11 @@ int amdgpu_vcn_resume(struct amdgpu_device *adev)
>>>>>               hdr = (const struct common_firmware_header 
>>>>> *)adev->vcn.fw->data;
>>>>>               if (adev->firmware.load_type != AMDGPU_FW_LOAD_PSP) {
>>>>>                   offset = le32_to_cpu(hdr->ucode_array_offset_bytes);
>>>>> -                memcpy_toio(adev->vcn.inst[i].cpu_addr, adev->vcn.fw->data + offset,
>>>>> -                        le32_to_cpu(hdr->ucode_size_bytes));
>>>>> +                if (drm_dev_enter(&adev->ddev, &idx)) {
>>>>> +                    memcpy_toio(adev->vcn.inst[i].cpu_addr, adev->vcn.fw->data + offset,
>>>>> +                            le32_to_cpu(hdr->ucode_size_bytes));
>>>>> +                    drm_dev_exit(idx);
>>>>> +                }
>>>>>                   size -= le32_to_cpu(hdr->ucode_size_bytes);
>>>>>                   ptr += le32_to_cpu(hdr->ucode_size_bytes);
>>>>>               }
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>>> index 9f868cf3b832..7dd5f10ab570 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>>> @@ -32,6 +32,7 @@
>>>>>   #include <linux/dma-buf.h>
>>>>>   #include <drm/amdgpu_drm.h>
>>>>> +#include <drm/drm_drv.h>
>>>>>   #include "amdgpu.h"
>>>>>   #include "amdgpu_trace.h"
>>>>>   #include "amdgpu_amdkfd.h"
>>>>> @@ -1606,7 +1607,10 @@ static int 
>>>>> amdgpu_vm_bo_update_mapping(struct amdgpu_device *adev,
>>>>>       struct amdgpu_vm_update_params params;
>>>>>       enum amdgpu_sync_mode sync_mode;
>>>>>       uint64_t pfn;
>>>>> -    int r;
>>>>> +    int r, idx;
>>>>> +
>>>>> +    if (!drm_dev_enter(&adev->ddev, &idx))
>>>>> +        return -ENODEV;
>>>>>       memset(&params, 0, sizeof(params));
>>>>>       params.adev = adev;
>>>>> @@ -1715,6 +1719,7 @@ static int amdgpu_vm_bo_update_mapping(struct 
>>>>> amdgpu_device *adev,
>>>>>   error_unlock:
>>>>>       amdgpu_vm_eviction_unlock(vm);
>>>>> +    drm_dev_exit(idx);
>>>>>       return r;
>>>>>   }
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c 
>>>>> b/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
>>>>> index 589410c32d09..2cec71e823f5 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
>>>>> @@ -23,6 +23,7 @@
>>>>>   #include <linux/firmware.h>
>>>>>   #include <linux/module.h>
>>>>>   #include <linux/vmalloc.h>
>>>>> +#include <drm/drm_drv.h>
>>>>>   #include "amdgpu.h"
>>>>>   #include "amdgpu_psp.h"
>>>>> @@ -269,10 +270,8 @@ static int 
>>>>> psp_v11_0_bootloader_load_kdb(struct psp_context *psp)
>>>>>       if (ret)
>>>>>           return ret;
>>>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>> -
>>>>>       /* Copy PSP KDB binary to memory */
>>>>> -    memcpy(psp->fw_pri_buf, psp->kdb_start_addr, psp->kdb_bin_size);
>>>>> +    psp_copy_fw(psp, psp->kdb_start_addr, psp->kdb_bin_size);
>>>>>       /* Provide the PSP KDB to bootloader */
>>>>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>>> @@ -302,10 +301,8 @@ static int 
>>>>> psp_v11_0_bootloader_load_spl(struct psp_context *psp)
>>>>>       if (ret)
>>>>>           return ret;
>>>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>> -
>>>>>       /* Copy PSP SPL binary to memory */
>>>>> -    memcpy(psp->fw_pri_buf, psp->spl_start_addr, psp->spl_bin_size);
>>>>> +    psp_copy_fw(psp, psp->spl_start_addr, psp->spl_bin_size);
>>>>>       /* Provide the PSP SPL to bootloader */
>>>>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>>> @@ -335,10 +332,8 @@ static int 
>>>>> psp_v11_0_bootloader_load_sysdrv(struct psp_context *psp)
>>>>>       if (ret)
>>>>>           return ret;
>>>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>> -
>>>>>       /* Copy PSP System Driver binary to memory */
>>>>> -    memcpy(psp->fw_pri_buf, psp->sys_start_addr, psp->sys_bin_size);
>>>>> +    psp_copy_fw(psp, psp->sys_start_addr, psp->sys_bin_size);
>>>>>       /* Provide the sys driver to bootloader */
>>>>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>>> @@ -371,10 +366,8 @@ static int 
>>>>> psp_v11_0_bootloader_load_sos(struct psp_context *psp)
>>>>>       if (ret)
>>>>>           return ret;
>>>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>> -
>>>>>       /* Copy Secure OS binary to PSP memory */
>>>>> -    memcpy(psp->fw_pri_buf, psp->sos_start_addr, psp->sos_bin_size);
>>>>> +    psp_copy_fw(psp, psp->sos_start_addr, psp->sos_bin_size);
>>>>>       /* Provide the PSP secure OS to bootloader */
>>>>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>>> @@ -608,7 +601,7 @@ static int psp_v11_0_memory_training(struct 
>>>>> psp_context *psp, uint32_t ops)
>>>>>       uint32_t p2c_header[4];
>>>>>       uint32_t sz;
>>>>>       void *buf;
>>>>> -    int ret;
>>>>> +    int ret, idx;
>>>>>       if (ctx->init == PSP_MEM_TRAIN_NOT_SUPPORT) {
>>>>>           DRM_DEBUG("Memory training is not supported.\n");
>>>>> @@ -681,17 +674,24 @@ static int psp_v11_0_memory_training(struct 
>>>>> psp_context *psp, uint32_t ops)
>>>>>               return -ENOMEM;
>>>>>           }
>>>>> -        memcpy_fromio(buf, adev->mman.aper_base_kaddr, sz);
>>>>> -        ret = psp_v11_0_memory_training_send_msg(psp, 
>>>>> PSP_BL__DRAM_LONG_TRAIN);
>>>>> -        if (ret) {
>>>>> -            DRM_ERROR("Send long training msg failed.\n");
>>>>> +        if (drm_dev_enter(&adev->ddev, &idx)) {
>>>>> +            memcpy_fromio(buf, adev->mman.aper_base_kaddr, sz);
>>>>> +            ret = psp_v11_0_memory_training_send_msg(psp, 
>>>>> PSP_BL__DRAM_LONG_TRAIN);
>>>>> +            if (ret) {
>>>>> +                DRM_ERROR("Send long training msg failed.\n");
>>>>> +                vfree(buf);
>>>>> +                drm_dev_exit(idx);
>>>>> +                return ret;
>>>>> +            }
>>>>> +
>>>>> +            memcpy_toio(adev->mman.aper_base_kaddr, buf, sz);
>>>>> +            adev->hdp.funcs->flush_hdp(adev, NULL);
>>>>>               vfree(buf);
>>>>> -            return ret;
>>>>> +            drm_dev_exit(idx);
>>>>> +        } else {
>>>>> +            vfree(buf);
>>>>> +            return -ENODEV;
>>>>>           }
>>>>> -
>>>>> -        memcpy_toio(adev->mman.aper_base_kaddr, buf, sz);
>>>>> -        adev->hdp.funcs->flush_hdp(adev, NULL);
>>>>> -        vfree(buf);
>>>>>       }
>>>>>       if (ops & PSP_MEM_TRAIN_SAVE) {
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c 
>>>>> b/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
>>>>> index c4828bd3264b..618e5b6b85d9 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
>>>>> @@ -138,10 +138,8 @@ static int 
>>>>> psp_v12_0_bootloader_load_sysdrv(struct psp_context *psp)
>>>>>       if (ret)
>>>>>           return ret;
>>>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>> -
>>>>>       /* Copy PSP System Driver binary to memory */
>>>>> -    memcpy(psp->fw_pri_buf, psp->sys_start_addr, psp->sys_bin_size);
>>>>> +    psp_copy_fw(psp, psp->sys_start_addr, psp->sys_bin_size);
>>>>>       /* Provide the sys driver to bootloader */
>>>>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>>> @@ -179,10 +177,8 @@ static int 
>>>>> psp_v12_0_bootloader_load_sos(struct psp_context *psp)
>>>>>       if (ret)
>>>>>           return ret;
>>>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>> -
>>>>>       /* Copy Secure OS binary to PSP memory */
>>>>> -    memcpy(psp->fw_pri_buf, psp->sos_start_addr, psp->sos_bin_size);
>>>>> +    psp_copy_fw(psp, psp->sos_start_addr, psp->sos_bin_size);
>>>>>       /* Provide the PSP secure OS to bootloader */
>>>>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c 
>>>>> b/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
>>>>> index f2e725f72d2f..d0a6cccd0897 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
>>>>> @@ -102,10 +102,8 @@ static int 
>>>>> psp_v3_1_bootloader_load_sysdrv(struct psp_context *psp)
>>>>>       if (ret)
>>>>>           return ret;
>>>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>> -
>>>>>       /* Copy PSP System Driver binary to memory */
>>>>> -    memcpy(psp->fw_pri_buf, psp->sys_start_addr, psp->sys_bin_size);
>>>>> +    psp_copy_fw(psp, psp->sys_start_addr, psp->sys_bin_size);
>>>>>       /* Provide the sys driver to bootloader */
>>>>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>>> @@ -143,10 +141,8 @@ static int psp_v3_1_bootloader_load_sos(struct 
>>>>> psp_context *psp)
>>>>>       if (ret)
>>>>>           return ret;
>>>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>> -
>>>>>       /* Copy Secure OS binary to PSP memory */
>>>>> -    memcpy(psp->fw_pri_buf, psp->sos_start_addr, psp->sos_bin_size);
>>>>> +    psp_copy_fw(psp, psp->sos_start_addr, psp->sos_bin_size);
>>>>>       /* Provide the PSP secure OS to bootloader */
>>>>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c 
>>>>> b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
>>>>> index 8e238dea7bef..90910d19db12 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
>>>>> @@ -25,6 +25,7 @@
>>>>>    */
>>>>>   #include <linux/firmware.h>
>>>>> +#include <drm/drm_drv.h>
>>>>>   #include "amdgpu.h"
>>>>>   #include "amdgpu_vce.h"
>>>>> @@ -555,16 +556,19 @@ static int vce_v4_0_hw_fini(void *handle)
>>>>>   static int vce_v4_0_suspend(void *handle)
>>>>>   {
>>>>>       struct amdgpu_device *adev = (struct amdgpu_device *)handle;
>>>>> -    int r;
>>>>> +    int r, idx;
>>>>>       if (adev->vce.vcpu_bo == NULL)
>>>>>           return 0;
>>>>> -    if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) {
>>>>> -        unsigned size = amdgpu_bo_size(adev->vce.vcpu_bo);
>>>>> -        void *ptr = adev->vce.cpu_addr;
>>>>> +    if (drm_dev_enter(&adev->ddev, &idx)) {
>>>>> +        if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) {
>>>>> +            unsigned size = amdgpu_bo_size(adev->vce.vcpu_bo);
>>>>> +            void *ptr = adev->vce.cpu_addr;
>>>>> -        memcpy_fromio(adev->vce.saved_bo, ptr, size);
>>>>> +            memcpy_fromio(adev->vce.saved_bo, ptr, size);
>>>>> +        }
>>>>> +        drm_dev_exit(idx);
>>>>>       }
>>>>>       r = vce_v4_0_hw_fini(adev);
>>>>> @@ -577,16 +581,20 @@ static int vce_v4_0_suspend(void *handle)
>>>>>   static int vce_v4_0_resume(void *handle)
>>>>>   {
>>>>>       struct amdgpu_device *adev = (struct amdgpu_device *)handle;
>>>>> -    int r;
>>>>> +    int r, idx;
>>>>>       if (adev->vce.vcpu_bo == NULL)
>>>>>           return -EINVAL;
>>>>>       if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) {
>>>>> -        unsigned size = amdgpu_bo_size(adev->vce.vcpu_bo);
>>>>> -        void *ptr = adev->vce.cpu_addr;
>>>>> -        memcpy_toio(ptr, adev->vce.saved_bo, size);
>>>>> +        if (drm_dev_enter(&adev->ddev, &idx)) {
>>>>> +            unsigned size = amdgpu_bo_size(adev->vce.vcpu_bo);
>>>>> +            void *ptr = adev->vce.cpu_addr;
>>>>> +
>>>>> +            memcpy_toio(ptr, adev->vce.saved_bo, size);
>>>>> +            drm_dev_exit(idx);
>>>>> +        }
>>>>>       } else {
>>>>>           r = amdgpu_vce_resume(adev);
>>>>>           if (r)
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c 
>>>>> b/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
>>>>> index 3f15bf34123a..df34be8ec82d 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
>>>>> @@ -34,6 +34,8 @@
>>>>>   #include "vcn/vcn_3_0_0_sh_mask.h"
>>>>>   #include "ivsrcid/vcn/irqsrcs_vcn_2_0.h"
>>>>> +#include <drm/drm_drv.h>
>>>>> +
>>>>>   #define mmUVD_CONTEXT_ID_INTERNAL_OFFSET            0x27
>>>>>   #define mmUVD_GPCOM_VCPU_CMD_INTERNAL_OFFSET 0x0f
>>>>>   #define mmUVD_GPCOM_VCPU_DATA0_INTERNAL_OFFSET 0x10
>>>>> @@ -268,16 +270,20 @@ static int vcn_v3_0_sw_init(void *handle)
>>>>>   static int vcn_v3_0_sw_fini(void *handle)
>>>>>   {
>>>>>       struct amdgpu_device *adev = (struct amdgpu_device *)handle;
>>>>> -    int i, r;
>>>>> +    int i, r, idx;
>>>>> -    for (i = 0; i < adev->vcn.num_vcn_inst; i++) {
>>>>> -        volatile struct amdgpu_fw_shared *fw_shared;
>>>>> +    if (drm_dev_enter(&adev->ddev, &idx)) {
>>>>> +        for (i = 0; i < adev->vcn.num_vcn_inst; i++) {
>>>>> +            volatile struct amdgpu_fw_shared *fw_shared;
>>>>> -        if (adev->vcn.harvest_config & (1 << i))
>>>>> -            continue;
>>>>> -        fw_shared = adev->vcn.inst[i].fw_shared_cpu_addr;
>>>>> -        fw_shared->present_flag_0 = 0;
>>>>> -        fw_shared->sw_ring.is_enabled = false;
>>>>> +            if (adev->vcn.harvest_config & (1 << i))
>>>>> +                continue;
>>>>> +            fw_shared = adev->vcn.inst[i].fw_shared_cpu_addr;
>>>>> +            fw_shared->present_flag_0 = 0;
>>>>> +            fw_shared->sw_ring.is_enabled = false;
>>>>> +        }
>>>>> +
>>>>> +        drm_dev_exit(idx);
>>>>>       }
>>>>>       if (amdgpu_sriov_vf(adev))
>>>>> diff --git a/drivers/gpu/drm/amd/pm/powerplay/smumgr/smu7_smumgr.c 
>>>>> b/drivers/gpu/drm/amd/pm/powerplay/smumgr/smu7_smumgr.c
>>>>> index aae25243eb10..d628b91846c9 100644
>>>>> --- a/drivers/gpu/drm/amd/pm/powerplay/smumgr/smu7_smumgr.c
>>>>> +++ b/drivers/gpu/drm/amd/pm/powerplay/smumgr/smu7_smumgr.c
>>>>> @@ -405,6 +405,8 @@ int smu7_request_smu_load_fw(struct pp_hwmgr 
>>>>> *hwmgr)
>>>>>                   UCODE_ID_MEC_STORAGE, 
>>>>> &toc->entry[toc->num_entries++]),
>>>>>                   "Failed to Get Firmware Entry.", r = -EINVAL; 
>>>>> goto failed);
>>>>>       }
>>>>> +
>>>>> +    /* AG TODO Can't call drm_dev_enter/exit because access 
>>>>> adev->ddev here ... */
>>>>>       memcpy_toio(smu_data->header_buffer.kaddr, smu_data->toc,
>>>>>               sizeof(struct SMU_DRAMData_TOC));
>>>>>       smum_send_msg_to_smc_with_parameter(hwmgr,
>>>>
> 


* Re: [PATCH v6 10/16] drm/amdgpu: Guard against write accesses after device removal
@ 2021-05-12 14:11             ` Andrey Grodzovsky
From: Andrey Grodzovsky @ 2021-05-12 14:11 UTC (permalink / raw)
  To: Christian König, dri-devel, amd-gfx, linux-pci,
	daniel.vetter, Harry.Wentland
  Cc: Alexander.Deucher, gregkh, ppaalanen, helgaas, Felix.Kuehling

Hopefully Alex can chime in on this.
I will respin V7 soon.

Andrey

On 2021-05-12 10:06 a.m., Christian König wrote:
> Am 12.05.21 um 16:01 schrieb Andrey Grodzovsky:
>> Ping - need a confirmation it's ok to keep this as a single patch given
>> my explanation below.
> 
> It was just a suggestion. The key point is that the approach sounds sane to me,
> but I can't say much about the psp code, for example.
> 
> So the most I can give you is an Acked-by for that.
> 
> Christian.
> 
>>
>> Andrey
>>
>> On 2021-05-11 1:52 p.m., Andrey Grodzovsky wrote:
>>>
>>>
>>> On 2021-05-11 2:50 a.m., Christian König wrote:
>>>> Am 10.05.21 um 18:36 schrieb Andrey Grodzovsky:
>>>>> This should prevent writing to memory or IO ranges possibly
>>>>> already allocated for other uses after our device is removed.
>>>>>
>>>>> v5:
>>>>> Protect more places where memcpy_to/from_io takes place
>>>>> Protect IB submissions
>>>>>
>>>>> v6: Switch to !drm_dev_enter instead of scoping entire code
>>>>> with brackets.
>>>>>
>>>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>>>>> ---
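The v6 change described in the commit message above (an early `!drm_dev_enter` return instead of wrapping the whole body in braces) may be easier to see in isolation. The following is only a user-space sketch, not kernel code: `struct mock_dev`, `mock_dev_enter()`, `mock_dev_exit()` and the two `write_reg_*()` helpers are hypothetical names invented for illustration, and the mock tracks a plain flag and counter rather than anything the DRM core actually does. Unlike the real `drm_dev_exit(idx)`, the mock exit also takes the device pointer.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical user-space stand-in for a DRM device (not a kernel type). */
struct mock_dev {
	bool unplugged;	/* set once the device has been removed */
	int active;	/* number of currently open enter/exit sections */
	uint64_t reg;	/* stands in for a device register */
};

/* Mock of drm_dev_enter(): fails once the device is marked unplugged. */
static bool mock_dev_enter(struct mock_dev *dev, int *idx)
{
	if (dev->unplugged)
		return false;
	*idx = dev->active++;
	return true;
}

/* Mock of drm_dev_exit(): closes the section opened by mock_dev_enter(). */
static void mock_dev_exit(struct mock_dev *dev, int idx)
{
	assert(dev->active > 0 && idx == dev->active - 1);
	dev->active--;
}

/* v5 style: the whole body is scoped inside the guard. */
static int write_reg_scoped(struct mock_dev *dev, uint64_t v)
{
	int idx;
	int r = -ENODEV;

	if (mock_dev_enter(dev, &idx)) {
		dev->reg = v;
		r = 0;
		mock_dev_exit(dev, idx);
	}
	return r;
}

/* v6 style: bail out early, the body keeps its original indentation. */
static int write_reg_early(struct mock_dev *dev, uint64_t v)
{
	int idx;

	if (!mock_dev_enter(dev, &idx))
		return -ENODEV;

	dev->reg = v;

	mock_dev_exit(dev, idx);
	return 0;
}
```

Both variants behave the same in this sketch; the early-return form mainly keeps the diff smaller because the guarded body is not re-indented.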
>>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c    | 11 ++-
>>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c       |  9 +++
>>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c        | 17 +++--
>>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c       | 63 +++++++++++------
>>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h       |  2 +
>>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c      | 70 
>>>>> +++++++++++++++++++
>>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h      | 49 ++-----------
>>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c       | 31 +++++---
>>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c       | 11 ++-
>>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c       | 22 ++++--
>>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c        |  7 +-
>>>>>   drivers/gpu/drm/amd/amdgpu/psp_v11_0.c        | 44 ++++++------
>>>>>   drivers/gpu/drm/amd/amdgpu/psp_v12_0.c        |  8 +--
>>>>>   drivers/gpu/drm/amd/amdgpu/psp_v3_1.c         |  8 +--
>>>>>   drivers/gpu/drm/amd/amdgpu/vce_v4_0.c         | 26 ++++---
>>>>>   drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c         | 22 +++---
>>>>>   .../drm/amd/pm/powerplay/smumgr/smu7_smumgr.c |  2 +
>>>>>   17 files changed, 257 insertions(+), 145 deletions(-)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>> index a0bff4713672..94c415176cdc 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>> @@ -71,6 +71,8 @@
>>>>>   #include <drm/task_barrier.h>
>>>>>   #include <linux/pm_runtime.h>
>>>>> +#include <drm/drm_drv.h>
>>>>> +
>>>>>   MODULE_FIRMWARE("amdgpu/vega10_gpu_info.bin");
>>>>>   MODULE_FIRMWARE("amdgpu/vega12_gpu_info.bin");
>>>>>   MODULE_FIRMWARE("amdgpu/raven_gpu_info.bin");
>>>>> @@ -281,7 +283,10 @@ void amdgpu_device_vram_access(struct 
>>>>> amdgpu_device *adev, loff_t pos,
>>>>>       unsigned long flags;
>>>>>       uint32_t hi = ~0;
>>>>>       uint64_t last;
>>>>> +    int idx;
>>>>> +     if (!drm_dev_enter(&adev->ddev, &idx))
>>>>> +         return;
>>>>>   #ifdef CONFIG_64BIT
>>>>>       last = min(pos + size, adev->gmc.visible_vram_size);
>>>>> @@ -299,8 +304,10 @@ void amdgpu_device_vram_access(struct 
>>>>> amdgpu_device *adev, loff_t pos,
>>>>>               memcpy_fromio(buf, addr, count);
>>>>>           }
>>>>> -        if (count == size)
>>>>> +        if (count == size) {
>>>>> +            drm_dev_exit(idx);
>>>>>               return;
>>>>> +        }
>>>>
>>>> Maybe use a goto instead, but really just a nitpick.
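The goto suggestion above could look roughly like the following. This is a hedged user-space sketch only: `struct mock_dev`, `mock_dev_enter()`, `mock_dev_exit()` and `mock_vram_read()` are hypothetical stand-ins, the "fast window" and byte loop merely imitate the two paths of the quoted function, and the mock exit takes the device pointer, which the real `drm_dev_exit(idx)` does not.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical user-space stand-in for a DRM device (not a kernel type). */
struct mock_dev {
	bool unplugged;	/* set once the device has been removed */
	int active;	/* number of currently open enter/exit sections */
};

/* Mock of drm_dev_enter(): fails once the device is marked unplugged. */
static bool mock_dev_enter(struct mock_dev *dev, int *idx)
{
	if (dev->unplugged)
		return false;
	*idx = dev->active++;
	return true;
}

/* Mock of drm_dev_exit(): closes the section opened by mock_dev_enter(). */
static void mock_dev_exit(struct mock_dev *dev, int idx)
{
	assert(dev->active > 0 && idx == dev->active - 1);
	dev->active--;
}

/*
 * Read `size` bytes: bulk-copy what fits in a directly mapped window of
 * `fast_len` bytes, then fall back to a slow path for the remainder.
 * Returns the number of bytes read, or 0 if the device is gone.
 */
static size_t mock_vram_read(struct mock_dev *dev, const unsigned char *vram,
			     size_t fast_len, unsigned char *buf, size_t size)
{
	size_t count, done;
	int idx;

	if (!mock_dev_enter(dev, &idx))
		return 0;

	/* Fast path: one bulk copy. */
	count = size < fast_len ? size : fast_len;
	memcpy(buf, vram, count);
	done = count;
	if (count == size)
		goto exit;	/* nothing left for the slow path */

	/* Slow path: byte by byte, standing in for indexed register reads. */
	for (; done < size; done++)
		buf[done] = vram[done];

exit:
	/* Single exit point: the section is always closed exactly once. */
	mock_dev_exit(dev, idx);
	return done;
}
```

With a single `exit:` label, any later early-out added to the function only needs a `goto exit` rather than its own copy of the exit call.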
>>>>
>>>>
>>>>
>>>>>           pos += count;
>>>>>           buf += count / 4;
>>>>> @@ -323,6 +330,8 @@ void amdgpu_device_vram_access(struct 
>>>>> amdgpu_device *adev, loff_t pos,
>>>>>               *buf++ = RREG32_NO_KIQ(mmMM_DATA);
>>>>>       }
>>>>>       spin_unlock_irqrestore(&adev->mmio_idx_lock, flags);
>>>>> +
>>>>> +    drm_dev_exit(idx);
>>>>>   }
>>>>>   /*
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c 
>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
>>>>> index 4d32233cde92..04ba5eef1e88 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
>>>>> @@ -31,6 +31,8 @@
>>>>>   #include "amdgpu_ras.h"
>>>>>   #include "amdgpu_xgmi.h"
>>>>> +#include <drm/drm_drv.h>
>>>>> +
>>>>>   /**
>>>>>    * amdgpu_gmc_pdb0_alloc - allocate vram for pdb0
>>>>>    *
>>>>> @@ -151,6 +153,10 @@ int amdgpu_gmc_set_pte_pde(struct 
>>>>> amdgpu_device *adev, void *cpu_pt_addr,
>>>>>   {
>>>>>       void __iomem *ptr = (void *)cpu_pt_addr;
>>>>>       uint64_t value;
>>>>> +    int idx;
>>>>> +
>>>>> +    if (!drm_dev_enter(&adev->ddev, &idx))
>>>>> +        return 0;
>>>>>       /*
>>>>>        * The following is for PTE only. GART does not have PDEs.
>>>>> @@ -158,6 +164,9 @@ int amdgpu_gmc_set_pte_pde(struct amdgpu_device 
>>>>> *adev, void *cpu_pt_addr,
>>>>>       value = addr & 0x0000FFFFFFFFF000ULL;
>>>>>       value |= flags;
>>>>>       writeq(value, ptr + (gpu_page_idx * 8));
>>>>> +
>>>>> +    drm_dev_exit(idx);
>>>>> +
>>>>>       return 0;
>>>>>   }
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c 
>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
>>>>> index 148a3b481b12..62fcbd446c71 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
>>>>> @@ -30,6 +30,7 @@
>>>>>   #include <linux/slab.h>
>>>>>   #include <drm/amdgpu_drm.h>
>>>>> +#include <drm/drm_drv.h>
>>>>>   #include "amdgpu.h"
>>>>>   #include "atom.h"
>>>>> @@ -137,7 +138,7 @@ int amdgpu_ib_schedule(struct amdgpu_ring 
>>>>> *ring, unsigned num_ibs,
>>>>>       bool secure;
>>>>>       unsigned i;
>>>>> -    int r = 0;
>>>>> +    int idx, r = 0;
>>>>>       bool need_pipe_sync = false;
>>>>>       if (num_ibs == 0)
>>>>> @@ -169,13 +170,16 @@ int amdgpu_ib_schedule(struct amdgpu_ring 
>>>>> *ring, unsigned num_ibs,
>>>>>           return -EINVAL;
>>>>>       }
>>>>> +    if (!drm_dev_enter(&adev->ddev, &idx))
>>>>> +        return -ENODEV;
>>>>> +
>>>>>       alloc_size = ring->funcs->emit_frame_size + num_ibs *
>>>>>           ring->funcs->emit_ib_size;
>>>>>       r = amdgpu_ring_alloc(ring, alloc_size);
>>>>>       if (r) {
>>>>>           dev_err(adev->dev, "scheduling IB failed (%d).\n", r);
>>>>> -        return r;
>>>>> +        goto exit;
>>>>>       }
>>>>>       need_ctx_switch = ring->current_ctx != fence_ctx;
>>>>> @@ -205,7 +209,7 @@ int amdgpu_ib_schedule(struct amdgpu_ring 
>>>>> *ring, unsigned num_ibs,
>>>>>           r = amdgpu_vm_flush(ring, job, need_pipe_sync);
>>>>>           if (r) {
>>>>>               amdgpu_ring_undo(ring);
>>>>> -            return r;
>>>>> +            goto exit;
>>>>>           }
>>>>>       }
>>>>> @@ -286,7 +290,7 @@ int amdgpu_ib_schedule(struct amdgpu_ring 
>>>>> *ring, unsigned num_ibs,
>>>>>           if (job && job->vmid)
>>>>>               amdgpu_vmid_reset(adev, ring->funcs->vmhub, job->vmid);
>>>>>           amdgpu_ring_undo(ring);
>>>>> -        return r;
>>>>> +        goto exit;
>>>>>       }
>>>>>       if (ring->funcs->insert_end)
>>>>> @@ -304,7 +308,10 @@ int amdgpu_ib_schedule(struct amdgpu_ring 
>>>>> *ring, unsigned num_ibs,
>>>>>           ring->funcs->emit_wave_limit(ring, false);
>>>>>       amdgpu_ring_commit(ring);
>>>>> -    return 0;
>>>>> +
>>>>> +exit:
>>>>> +    drm_dev_exit(idx);
>>>>> +    return r;
>>>>>   }
>>>>>   /**
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c 
>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
>>>>> index 9e769cf6095b..bb6afee61666 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
>>>>> @@ -25,6 +25,7 @@
>>>>>   #include <linux/firmware.h>
>>>>>   #include <linux/dma-mapping.h>
>>>>> +#include <drm/drm_drv.h>
>>>>>   #include "amdgpu.h"
>>>>>   #include "amdgpu_psp.h"
>>>>> @@ -39,6 +40,8 @@
>>>>>   #include "amdgpu_ras.h"
>>>>>   #include "amdgpu_securedisplay.h"
>>>>> +#include <drm/drm_drv.h>
>>>>> +
>>>>>   static int psp_sysfs_init(struct amdgpu_device *adev);
>>>>>   static void psp_sysfs_fini(struct amdgpu_device *adev);
>>>>> @@ -253,7 +256,7 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>>>>              struct psp_gfx_cmd_resp *cmd, uint64_t fence_mc_addr)
>>>>>   {
>>>>>       int ret;
>>>>> -    int index;
>>>>> +    int index, idx;
>>>>>       int timeout = 20000;
>>>>>       bool ras_intr = false;
>>>>>       bool skip_unsupport = false;
>>>>> @@ -261,6 +264,9 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>>>>       if (psp->adev->in_pci_err_recovery)
>>>>>           return 0;
>>>>> +    if (!drm_dev_enter(&psp->adev->ddev, &idx))
>>>>> +        return 0;
>>>>> +
>>>>>       mutex_lock(&psp->mutex);
>>>>>       memset(psp->cmd_buf_mem, 0, PSP_CMD_BUFFER_SIZE);
>>>>> @@ -271,8 +277,7 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>>>>       ret = psp_ring_cmd_submit(psp, psp->cmd_buf_mc_addr, 
>>>>> fence_mc_addr, index);
>>>>>       if (ret) {
>>>>>           atomic_dec(&psp->fence_value);
>>>>> -        mutex_unlock(&psp->mutex);
>>>>> -        return ret;
>>>>> +        goto exit;
>>>>>       }
>>>>>       amdgpu_asic_invalidate_hdp(psp->adev, NULL);
>>>>> @@ -312,8 +317,8 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>>>>                psp->cmd_buf_mem->cmd_id,
>>>>>                psp->cmd_buf_mem->resp.status);
>>>>>           if (!timeout) {
>>>>> -            mutex_unlock(&psp->mutex);
>>>>> -            return -EINVAL;
>>>>> +            ret = -EINVAL;
>>>>> +            goto exit;
>>>>>           }
>>>>>       }
>>>>> @@ -321,8 +326,10 @@ psp_cmd_submit_buf(struct psp_context *psp,
>>>>>           ucode->tmr_mc_addr_lo = psp->cmd_buf_mem->resp.fw_addr_lo;
>>>>>           ucode->tmr_mc_addr_hi = psp->cmd_buf_mem->resp.fw_addr_hi;
>>>>>       }
>>>>> -    mutex_unlock(&psp->mutex);
>>>>> +exit:
>>>>> +    mutex_unlock(&psp->mutex);
>>>>> +    drm_dev_exit(idx);
>>>>>       return ret;
>>>>>   }
>>>>> @@ -359,8 +366,7 @@ static int psp_load_toc(struct psp_context *psp,
>>>>>       if (!cmd)
>>>>>           return -ENOMEM;
>>>>>       /* Copy toc to psp firmware private buffer */
>>>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>> -    memcpy(psp->fw_pri_buf, psp->toc_start_addr, psp->toc_bin_size);
>>>>> +    psp_copy_fw(psp, psp->toc_start_addr, psp->toc_bin_size);
>>>>>       psp_prep_load_toc_cmd_buf(cmd, psp->fw_pri_mc_addr, 
>>>>> psp->toc_bin_size);
>>>>> @@ -625,8 +631,7 @@ static int psp_asd_load(struct psp_context *psp)
>>>>>       if (!cmd)
>>>>>           return -ENOMEM;
>>>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>> -    memcpy(psp->fw_pri_buf, psp->asd_start_addr, 
>>>>> psp->asd_ucode_size);
>>>>> +    psp_copy_fw(psp, psp->asd_start_addr, psp->asd_ucode_size);
>>>>>       psp_prep_asd_load_cmd_buf(cmd, psp->fw_pri_mc_addr,
>>>>>                     psp->asd_ucode_size);
>>>>> @@ -781,8 +786,7 @@ static int psp_xgmi_load(struct psp_context *psp)
>>>>>       if (!cmd)
>>>>>           return -ENOMEM;
>>>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>> -    memcpy(psp->fw_pri_buf, psp->ta_xgmi_start_addr, 
>>>>> psp->ta_xgmi_ucode_size);
>>>>> +    psp_copy_fw(psp, psp->ta_xgmi_start_addr, 
>>>>> psp->ta_xgmi_ucode_size);
>>>>>       psp_prep_ta_load_cmd_buf(cmd,
>>>>>                    psp->fw_pri_mc_addr,
>>>>> @@ -1038,8 +1042,7 @@ static int psp_ras_load(struct psp_context *psp)
>>>>>       if (!cmd)
>>>>>           return -ENOMEM;
>>>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>> -    memcpy(psp->fw_pri_buf, psp->ta_ras_start_addr, 
>>>>> psp->ta_ras_ucode_size);
>>>>> +    psp_copy_fw(psp, psp->ta_ras_start_addr, psp->ta_ras_ucode_size);
>>>>>       psp_prep_ta_load_cmd_buf(cmd,
>>>>>                    psp->fw_pri_mc_addr,
>>>>> @@ -1275,8 +1278,7 @@ static int psp_hdcp_load(struct psp_context 
>>>>> *psp)
>>>>>       if (!cmd)
>>>>>           return -ENOMEM;
>>>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>> -    memcpy(psp->fw_pri_buf, psp->ta_hdcp_start_addr,
>>>>> +    psp_copy_fw(psp, psp->ta_hdcp_start_addr,
>>>>>              psp->ta_hdcp_ucode_size);
>>>>>       psp_prep_ta_load_cmd_buf(cmd,
>>>>> @@ -1427,8 +1429,7 @@ static int psp_dtm_load(struct psp_context *psp)
>>>>>       if (!cmd)
>>>>>           return -ENOMEM;
>>>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>> -    memcpy(psp->fw_pri_buf, psp->ta_dtm_start_addr, 
>>>>> psp->ta_dtm_ucode_size);
>>>>> +    psp_copy_fw(psp, psp->ta_dtm_start_addr, psp->ta_dtm_ucode_size);
>>>>>       psp_prep_ta_load_cmd_buf(cmd,
>>>>>                    psp->fw_pri_mc_addr,
>>>>> @@ -1573,8 +1574,7 @@ static int psp_rap_load(struct psp_context *psp)
>>>>>       if (!cmd)
>>>>>           return -ENOMEM;
>>>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>> -    memcpy(psp->fw_pri_buf, psp->ta_rap_start_addr, 
>>>>> psp->ta_rap_ucode_size);
>>>>> +    psp_copy_fw(psp, psp->ta_rap_start_addr, psp->ta_rap_ucode_size);
>>>>>       psp_prep_ta_load_cmd_buf(cmd,
>>>>>                    psp->fw_pri_mc_addr,
>>>>> @@ -3022,7 +3022,7 @@ static ssize_t 
>>>>> psp_usbc_pd_fw_sysfs_write(struct device *dev,
>>>>>       struct amdgpu_device *adev = drm_to_adev(ddev);
>>>>>       void *cpu_addr;
>>>>>       dma_addr_t dma_addr;
>>>>> -    int ret;
>>>>> +    int ret, idx;
>>>>>       char fw_name[100];
>>>>>       const struct firmware *usbc_pd_fw;
>>>>> @@ -3031,6 +3031,9 @@ static ssize_t 
>>>>> psp_usbc_pd_fw_sysfs_write(struct device *dev,
>>>>>           return -EBUSY;
>>>>>       }
>>>>> +    if (!drm_dev_enter(ddev, &idx))
>>>>> +        return -ENODEV;
>>>>> +
>>>>>       snprintf(fw_name, sizeof(fw_name), "amdgpu/%s", buf);
>>>>>       ret = request_firmware(&usbc_pd_fw, fw_name, adev->dev);
>>>>>       if (ret)
>>>>> @@ -3062,16 +3065,30 @@ static ssize_t 
>>>>> psp_usbc_pd_fw_sysfs_write(struct device *dev,
>>>>>   rel_buf:
>>>>>       dma_free_coherent(adev->dev, usbc_pd_fw->size, cpu_addr, 
>>>>> dma_addr);
>>>>>       release_firmware(usbc_pd_fw);
>>>>> -
>>>>>   fail:
>>>>>       if (ret) {
>>>>>           DRM_ERROR("Failed to load USBC PD FW, err = %d", ret);
>>>>> -        return ret;
>>>>> +        count = ret;
>>>>>       }
>>>>> +    drm_dev_exit(idx);
>>>>>       return count;
>>>>>   }
>>>>> +void psp_copy_fw(struct psp_context *psp, uint8_t *start_addr, 
>>>>> uint32_t bin_size)
>>>>> +{
>>>>> +    int idx;
>>>>> +
>>>>> +    if (!drm_dev_enter(&psp->adev->ddev, &idx))
>>>>> +        return;
>>>>> +
>>>>> +    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>> +    memcpy(psp->fw_pri_buf, start_addr, bin_size);
>>>>> +
>>>>> +    drm_dev_exit(idx);
>>>>> +}
>>>>> +
>>>>> +
>>>>>   static DEVICE_ATTR(usbc_pd_fw, S_IRUGO | S_IWUSR,
>>>>>              psp_usbc_pd_fw_sysfs_read,
>>>>>              psp_usbc_pd_fw_sysfs_write);
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h 
>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
>>>>> index 46a5328e00e0..2bfdc278817f 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
>>>>> @@ -423,4 +423,6 @@ int psp_get_fw_attestation_records_addr(struct 
>>>>> psp_context *psp,
>>>>>   int psp_load_fw_list(struct psp_context *psp,
>>>>>                struct amdgpu_firmware_info **ucode_list, int 
>>>>> ucode_count);
>>>>> +void psp_copy_fw(struct psp_context *psp, uint8_t *start_addr, 
>>>>> uint32_t bin_size);
>>>>> +
>>>>>   #endif
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c 
>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>>>>> index 688624ebe421..e1985bc34436 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>>>>> @@ -35,6 +35,8 @@
>>>>>   #include "amdgpu.h"
>>>>>   #include "atom.h"
>>>>> +#include <drm/drm_drv.h>
>>>>> +
>>>>>   /*
>>>>>    * Rings
>>>>>    * Most engines on the GPU are fed via ring buffers.  Ring
>>>>> @@ -461,3 +463,71 @@ int amdgpu_ring_test_helper(struct amdgpu_ring 
>>>>> *ring)
>>>>>       ring->sched.ready = !r;
>>>>>       return r;
>>>>>   }
>>>>> +
>>>>> +void amdgpu_ring_clear_ring(struct amdgpu_ring *ring)
>>>>> +{
>>>>> +    int idx;
>>>>> +    int i = 0;
>>>>> +
>>>>> +    if (!drm_dev_enter(&ring->adev->ddev, &idx))
>>>>> +        return;
>>>>> +
>>>>> +    while (i <= ring->buf_mask)
>>>>> +        ring->ring[i++] = ring->funcs->nop;
>>>>> +
>>>>> +    drm_dev_exit(idx);
>>>>> +
>>>>> +}
>>>>> +
>>>>> +void amdgpu_ring_write(struct amdgpu_ring *ring, uint32_t v)
>>>>> +{
>>>>> +    int idx;
>>>>> +
>>>>> +    if (!drm_dev_enter(&ring->adev->ddev, &idx))
>>>>> +        return;
>>>>> +
>>>>> +    if (ring->count_dw <= 0)
>>>>> +        DRM_ERROR("amdgpu: writing more dwords to the ring than 
>>>>> expected!\n");
>>>>> +    ring->ring[ring->wptr++ & ring->buf_mask] = v;
>>>>> +    ring->wptr &= ring->ptr_mask;
>>>>> +    ring->count_dw--;
>>>>> +
>>>>> +    drm_dev_exit(idx);
>>>>> +}
>>>>> +
>>>>> +void amdgpu_ring_write_multiple(struct amdgpu_ring *ring,
>>>>> +                          void *src, int count_dw)
>>>>> +{
>>>>> +    unsigned occupied, chunk1, chunk2;
>>>>> +    void *dst;
>>>>> +    int idx;
>>>>> +
>>>>> +    if (!drm_dev_enter(&ring->adev->ddev, &idx))
>>>>> +        return;
>>>>> +
>>>>> +    if (unlikely(ring->count_dw < count_dw))
>>>>> +        DRM_ERROR("amdgpu: writing more dwords to the ring than 
>>>>> expected!\n");
>>>>> +
>>>>> +    occupied = ring->wptr & ring->buf_mask;
>>>>> +    dst = (void *)&ring->ring[occupied];
>>>>> +    chunk1 = ring->buf_mask + 1 - occupied;
>>>>> +    chunk1 = (chunk1 >= count_dw) ? count_dw: chunk1;
>>>>> +    chunk2 = count_dw - chunk1;
>>>>> +    chunk1 <<= 2;
>>>>> +    chunk2 <<= 2;
>>>>> +
>>>>> +    if (chunk1)
>>>>> +        memcpy(dst, src, chunk1);
>>>>> +
>>>>> +    if (chunk2) {
>>>>> +        src += chunk1;
>>>>> +        dst = (void *)ring->ring;
>>>>> +        memcpy(dst, src, chunk2);
>>>>> +    }
>>>>> +
>>>>> +    ring->wptr += count_dw;
>>>>> +    ring->wptr &= ring->ptr_mask;
>>>>> +    ring->count_dw -= count_dw;
>>>>> +
>>>>> +    drm_dev_exit(idx);
>>>>> +}
>>>>
>>>> The ring should never be in MMIO memory, so you can completely drop
>>>> that as far as I can see.
>>>
>>> Yeah, it's all in GART, I missed that for some reason...
>>>>
>>>> Maybe split that patch by use case so that we can more easily 
>>>> review/ack it.
>>>
>>> In fact everything here is the same use case. Once I added unmapping of
>>> all MMIO ranges (both registers and VRAM), I got a lot of page faults
>>> on device remove around any memcpy to/from IO. That's where I put the
>>> drm_dev_enter/exit scope. I also searched the code and preemptively
>>> added guards to any other such place. I did drop amdgpu_schedule_ib
>>> from this patch because it had a dma_fence_wait inside, so we
>>> will take care of it once we decide how to handle dma_fence waits.
>>>
>>> Andrey
>>>
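An illustrative aside on the guard pattern described above, written as a minimal userspace sketch rather than kernel code. The names struct mock_dev, mock_dev_enter(), mock_dev_exit() and guarded_copy_to_io() are hypothetical stand-ins invented for this example. The real drm_dev_enter()/drm_dev_exit() are SRCU-based, and drm_dev_exit() takes only the index. The sketch only shows the shape of the scope: enter, touch the IO window, exit, and skip the access entirely once the device is marked unplugged.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct drm_device (not the kernel type). */
struct mock_dev {
    bool unplugged;        /* set once the device has been removed */
    int active_sections;   /* enter/exit scopes currently open */
    unsigned char io[16];  /* pretend MMIO/VRAM window */
};

/* Mimics the contract of drm_dev_enter(): false means "device is gone". */
static bool mock_dev_enter(struct mock_dev *dev, int *idx)
{
    if (dev->unplugged)
        return false;
    *idx = dev->active_sections++;
    return true;
}

/* Mimics drm_dev_exit(); the extra dev argument is an artifact of the mock. */
static void mock_dev_exit(struct mock_dev *dev, int idx)
{
    (void)idx;
    dev->active_sections--;
}

/* Guarded copy: the IO window is only touched inside the enter/exit scope. */
static int guarded_copy_to_io(struct mock_dev *dev, const void *src, size_t n)
{
    int idx;

    if (n > sizeof(dev->io))
        return -EINVAL;

    if (!mock_dev_enter(dev, &idx))
        return -ENODEV;    /* unplugged: skip the access instead of faulting */

    memcpy(dev->io, src, n);

    mock_dev_exit(dev, idx);
    return 0;
}
```

A caller would see 0 while the mock device is present and -ENODEV once unplugged is set, with the IO window left untouched in the second case. Keeping the scope tight around the copy also keeps long waits out of the guarded section, which appears to be the concern with the dma_fence_wait mentioned above.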
>>>>
>>>> Thanks,
>>>> Christian.
>>>>
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h 
>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>>>>> index e7d3d0dbdd96..c67bc6d3d039 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>>>>> @@ -299,53 +299,12 @@ static inline void 
>>>>> amdgpu_ring_set_preempt_cond_exec(struct amdgpu_ring *ring,
>>>>>       *ring->cond_exe_cpu_addr = cond_exec;
>>>>>   }
>>>>> -static inline void amdgpu_ring_clear_ring(struct amdgpu_ring *ring)
>>>>> -{
>>>>> -    int i = 0;
>>>>> -    while (i <= ring->buf_mask)
>>>>> -        ring->ring[i++] = ring->funcs->nop;
>>>>> -
>>>>> -}
>>>>> -
>>>>> -static inline void amdgpu_ring_write(struct amdgpu_ring *ring, 
>>>>> uint32_t v)
>>>>> -{
>>>>> -    if (ring->count_dw <= 0)
>>>>> -        DRM_ERROR("amdgpu: writing more dwords to the ring than 
>>>>> expected!\n");
>>>>> -    ring->ring[ring->wptr++ & ring->buf_mask] = v;
>>>>> -    ring->wptr &= ring->ptr_mask;
>>>>> -    ring->count_dw--;
>>>>> -}
>>>>> +void amdgpu_ring_clear_ring(struct amdgpu_ring *ring);
>>>>> -static inline void amdgpu_ring_write_multiple(struct amdgpu_ring 
>>>>> *ring,
>>>>> -                          void *src, int count_dw)
>>>>> -{
>>>>> -    unsigned occupied, chunk1, chunk2;
>>>>> -    void *dst;
>>>>> -
>>>>> -    if (unlikely(ring->count_dw < count_dw))
>>>>> -        DRM_ERROR("amdgpu: writing more dwords to the ring than 
>>>>> expected!\n");
>>>>> -
>>>>> -    occupied = ring->wptr & ring->buf_mask;
>>>>> -    dst = (void *)&ring->ring[occupied];
>>>>> -    chunk1 = ring->buf_mask + 1 - occupied;
>>>>> -    chunk1 = (chunk1 >= count_dw) ? count_dw: chunk1;
>>>>> -    chunk2 = count_dw - chunk1;
>>>>> -    chunk1 <<= 2;
>>>>> -    chunk2 <<= 2;
>>>>> +void amdgpu_ring_write(struct amdgpu_ring *ring, uint32_t v);
>>>>> -    if (chunk1)
>>>>> -        memcpy(dst, src, chunk1);
>>>>> -
>>>>> -    if (chunk2) {
>>>>> -        src += chunk1;
>>>>> -        dst = (void *)ring->ring;
>>>>> -        memcpy(dst, src, chunk2);
>>>>> -    }
>>>>> -
>>>>> -    ring->wptr += count_dw;
>>>>> -    ring->wptr &= ring->ptr_mask;
>>>>> -    ring->count_dw -= count_dw;
>>>>> -}
>>>>> +void amdgpu_ring_write_multiple(struct amdgpu_ring *ring,
>>>>> +                          void *src, int count_dw);
>>>>>   int amdgpu_ring_test_helper(struct amdgpu_ring *ring);
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c 
>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
>>>>> index c6dbc0801604..82f0542c7792 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
>>>>> @@ -32,6 +32,7 @@
>>>>>   #include <linux/module.h>
>>>>>   #include <drm/drm.h>
>>>>> +#include <drm/drm_drv.h>
>>>>>   #include "amdgpu.h"
>>>>>   #include "amdgpu_pm.h"
>>>>> @@ -375,7 +376,7 @@ int amdgpu_uvd_suspend(struct amdgpu_device *adev)
>>>>>   {
>>>>>       unsigned size;
>>>>>       void *ptr;
>>>>> -    int i, j;
>>>>> +    int i, j, idx;
>>>>>       bool in_ras_intr = amdgpu_ras_intr_triggered();
>>>>>       cancel_delayed_work_sync(&adev->uvd.idle_work);
>>>>> @@ -403,11 +404,15 @@ int amdgpu_uvd_suspend(struct amdgpu_device 
>>>>> *adev)
>>>>>           if (!adev->uvd.inst[j].saved_bo)
>>>>>               return -ENOMEM;
>>>>> -        /* re-write 0 since err_event_athub will corrupt VCPU 
>>>>> buffer */
>>>>> -        if (in_ras_intr)
>>>>> -            memset(adev->uvd.inst[j].saved_bo, 0, size);
>>>>> -        else
>>>>> -            memcpy_fromio(adev->uvd.inst[j].saved_bo, ptr, size);
>>>>> +        if (drm_dev_enter(&adev->ddev, &idx)) {
>>>>> +            /* re-write 0 since err_event_athub will corrupt VCPU 
>>>>> buffer */
>>>>> +            if (in_ras_intr)
>>>>> +                memset(adev->uvd.inst[j].saved_bo, 0, size);
>>>>> +            else
>>>>> +                memcpy_fromio(adev->uvd.inst[j].saved_bo, ptr, size);
>>>>> +
>>>>> +            drm_dev_exit(idx);
>>>>> +        }
>>>>>       }
>>>>>       if (in_ras_intr)
>>>>> @@ -420,7 +425,7 @@ int amdgpu_uvd_resume(struct amdgpu_device *adev)
>>>>>   {
>>>>>       unsigned size;
>>>>>       void *ptr;
>>>>> -    int i;
>>>>> +    int i, idx;
>>>>>       for (i = 0; i < adev->uvd.num_uvd_inst; i++) {
>>>>>           if (adev->uvd.harvest_config & (1 << i))
>>>>> @@ -432,7 +437,10 @@ int amdgpu_uvd_resume(struct amdgpu_device *adev)
>>>>>           ptr = adev->uvd.inst[i].cpu_addr;
>>>>>           if (adev->uvd.inst[i].saved_bo != NULL) {
>>>>> -            memcpy_toio(ptr, adev->uvd.inst[i].saved_bo, size);
>>>>> +            if (drm_dev_enter(&adev->ddev, &idx)) {
>>>>> +                memcpy_toio(ptr, adev->uvd.inst[i].saved_bo, size);
>>>>> +                drm_dev_exit(idx);
>>>>> +            }
>>>>>               kvfree(adev->uvd.inst[i].saved_bo);
>>>>>               adev->uvd.inst[i].saved_bo = NULL;
>>>>>           } else {
>>>>> @@ -442,8 +450,11 @@ int amdgpu_uvd_resume(struct amdgpu_device *adev)
>>>>>               hdr = (const struct common_firmware_header 
>>>>> *)adev->uvd.fw->data;
>>>>>               if (adev->firmware.load_type != AMDGPU_FW_LOAD_PSP) {
>>>>>                   offset = le32_to_cpu(hdr->ucode_array_offset_bytes);
>>>>> -                memcpy_toio(adev->uvd.inst[i].cpu_addr, 
>>>>> adev->uvd.fw->data + offset,
>>>>> -                            le32_to_cpu(hdr->ucode_size_bytes));
>>>>> +                if (drm_dev_enter(&adev->ddev, &idx)) {
>>>>> +                    memcpy_toio(adev->uvd.inst[i].cpu_addr, adev->uvd.fw->data + offset,
>>>>> +                                le32_to_cpu(hdr->ucode_size_bytes));
>>>>> +                    drm_dev_exit(idx);
>>>>> +                }
>>>>>                   size -= le32_to_cpu(hdr->ucode_size_bytes);
>>>>>                   ptr += le32_to_cpu(hdr->ucode_size_bytes);
>>>>>               }
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c 
>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
>>>>> index ea6a62f67e38..833203401ef4 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
>>>>> @@ -29,6 +29,7 @@
>>>>>   #include <linux/module.h>
>>>>>   #include <drm/drm.h>
>>>>> +#include <drm/drm_drv.h>
>>>>>   #include "amdgpu.h"
>>>>>   #include "amdgpu_pm.h"
>>>>> @@ -293,7 +294,7 @@ int amdgpu_vce_resume(struct amdgpu_device *adev)
>>>>>       void *cpu_addr;
>>>>>       const struct common_firmware_header *hdr;
>>>>>       unsigned offset;
>>>>> -    int r;
>>>>> +    int r, idx;
>>>>>       if (adev->vce.vcpu_bo == NULL)
>>>>>           return -EINVAL;
>>>>> @@ -313,8 +314,12 @@ int amdgpu_vce_resume(struct amdgpu_device *adev)
>>>>>       hdr = (const struct common_firmware_header *)adev->vce.fw->data;
>>>>>       offset = le32_to_cpu(hdr->ucode_array_offset_bytes);
>>>>> -    memcpy_toio(cpu_addr, adev->vce.fw->data + offset,
>>>>> -            adev->vce.fw->size - offset);
>>>>> +
>>>>> +    if (drm_dev_enter(&adev->ddev, &idx)) {
>>>>> +        memcpy_toio(cpu_addr, adev->vce.fw->data + offset,
>>>>> +                adev->vce.fw->size - offset);
>>>>> +        drm_dev_exit(idx);
>>>>> +    }
>>>>>       amdgpu_bo_kunmap(adev->vce.vcpu_bo);
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c 
>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
>>>>> index 201645963ba5..21f7d3644d70 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
>>>>> @@ -27,6 +27,7 @@
>>>>>   #include <linux/firmware.h>
>>>>>   #include <linux/module.h>
>>>>>   #include <linux/pci.h>
>>>>> +#include <drm/drm_drv.h>
>>>>>   #include "amdgpu.h"
>>>>>   #include "amdgpu_pm.h"
>>>>> @@ -275,7 +276,7 @@ int amdgpu_vcn_suspend(struct amdgpu_device *adev)
>>>>>   {
>>>>>       unsigned size;
>>>>>       void *ptr;
>>>>> -    int i;
>>>>> +    int i, idx;
>>>>>       cancel_delayed_work_sync(&adev->vcn.idle_work);
>>>>> @@ -292,7 +293,10 @@ int amdgpu_vcn_suspend(struct amdgpu_device 
>>>>> *adev)
>>>>>           if (!adev->vcn.inst[i].saved_bo)
>>>>>               return -ENOMEM;
>>>>> -        memcpy_fromio(adev->vcn.inst[i].saved_bo, ptr, size);
>>>>> +        if (drm_dev_enter(&adev->ddev, &idx)) {
>>>>> +            memcpy_fromio(adev->vcn.inst[i].saved_bo, ptr, size);
>>>>> +            drm_dev_exit(idx);
>>>>> +        }
>>>>>       }
>>>>>       return 0;
>>>>>   }
>>>>> @@ -301,7 +305,7 @@ int amdgpu_vcn_resume(struct amdgpu_device *adev)
>>>>>   {
>>>>>       unsigned size;
>>>>>       void *ptr;
>>>>> -    int i;
>>>>> +    int i, idx;
>>>>>       for (i = 0; i < adev->vcn.num_vcn_inst; ++i) {
>>>>>           if (adev->vcn.harvest_config & (1 << i))
>>>>> @@ -313,7 +317,10 @@ int amdgpu_vcn_resume(struct amdgpu_device *adev)
>>>>>           ptr = adev->vcn.inst[i].cpu_addr;
>>>>>           if (adev->vcn.inst[i].saved_bo != NULL) {
>>>>> -            memcpy_toio(ptr, adev->vcn.inst[i].saved_bo, size);
>>>>> +            if (drm_dev_enter(&adev->ddev, &idx)) {
>>>>> +                memcpy_toio(ptr, adev->vcn.inst[i].saved_bo, size);
>>>>> +                drm_dev_exit(idx);
>>>>> +            }
>>>>>               kvfree(adev->vcn.inst[i].saved_bo);
>>>>>               adev->vcn.inst[i].saved_bo = NULL;
>>>>>           } else {
>>>>> @@ -323,8 +330,11 @@ int amdgpu_vcn_resume(struct amdgpu_device *adev)
>>>>>               hdr = (const struct common_firmware_header 
>>>>> *)adev->vcn.fw->data;
>>>>>               if (adev->firmware.load_type != AMDGPU_FW_LOAD_PSP) {
>>>>>                   offset = le32_to_cpu(hdr->ucode_array_offset_bytes);
>>>>> -                memcpy_toio(adev->vcn.inst[i].cpu_addr, 
>>>>> adev->vcn.fw->data + offset,
>>>>> -                            le32_to_cpu(hdr->ucode_size_bytes));
>>>>> +                if (drm_dev_enter(&adev->ddev, &idx)) {
>>>>> +                    memcpy_toio(adev->vcn.inst[i].cpu_addr, adev->vcn.fw->data + offset,
>>>>> +                                le32_to_cpu(hdr->ucode_size_bytes));
>>>>> +                    drm_dev_exit(idx);
>>>>> +                }
>>>>>                   size -= le32_to_cpu(hdr->ucode_size_bytes);
>>>>>                   ptr += le32_to_cpu(hdr->ucode_size_bytes);
>>>>>               }
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>>> index 9f868cf3b832..7dd5f10ab570 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>>> @@ -32,6 +32,7 @@
>>>>>   #include <linux/dma-buf.h>
>>>>>   #include <drm/amdgpu_drm.h>
>>>>> +#include <drm/drm_drv.h>
>>>>>   #include "amdgpu.h"
>>>>>   #include "amdgpu_trace.h"
>>>>>   #include "amdgpu_amdkfd.h"
>>>>> @@ -1606,7 +1607,10 @@ static int 
>>>>> amdgpu_vm_bo_update_mapping(struct amdgpu_device *adev,
>>>>>       struct amdgpu_vm_update_params params;
>>>>>       enum amdgpu_sync_mode sync_mode;
>>>>>       uint64_t pfn;
>>>>> -    int r;
>>>>> +    int r, idx;
>>>>> +
>>>>> +    if (!drm_dev_enter(&adev->ddev, &idx))
>>>>> +        return -ENODEV;
>>>>>       memset(&params, 0, sizeof(params));
>>>>>       params.adev = adev;
>>>>> @@ -1715,6 +1719,7 @@ static int amdgpu_vm_bo_update_mapping(struct 
>>>>> amdgpu_device *adev,
>>>>>   error_unlock:
>>>>>       amdgpu_vm_eviction_unlock(vm);
>>>>> +    drm_dev_exit(idx);
>>>>>       return r;
>>>>>   }
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c 
>>>>> b/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
>>>>> index 589410c32d09..2cec71e823f5 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
>>>>> @@ -23,6 +23,7 @@
>>>>>   #include <linux/firmware.h>
>>>>>   #include <linux/module.h>
>>>>>   #include <linux/vmalloc.h>
>>>>> +#include <drm/drm_drv.h>
>>>>>   #include "amdgpu.h"
>>>>>   #include "amdgpu_psp.h"
>>>>> @@ -269,10 +270,8 @@ static int 
>>>>> psp_v11_0_bootloader_load_kdb(struct psp_context *psp)
>>>>>       if (ret)
>>>>>           return ret;
>>>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>> -
>>>>>       /* Copy PSP KDB binary to memory */
>>>>> -    memcpy(psp->fw_pri_buf, psp->kdb_start_addr, psp->kdb_bin_size);
>>>>> +    psp_copy_fw(psp, psp->kdb_start_addr, psp->kdb_bin_size);
>>>>>       /* Provide the PSP KDB to bootloader */
>>>>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>>> @@ -302,10 +301,8 @@ static int 
>>>>> psp_v11_0_bootloader_load_spl(struct psp_context *psp)
>>>>>       if (ret)
>>>>>           return ret;
>>>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>> -
>>>>>       /* Copy PSP SPL binary to memory */
>>>>> -    memcpy(psp->fw_pri_buf, psp->spl_start_addr, psp->spl_bin_size);
>>>>> +    psp_copy_fw(psp, psp->spl_start_addr, psp->spl_bin_size);
>>>>>       /* Provide the PSP SPL to bootloader */
>>>>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>>> @@ -335,10 +332,8 @@ static int 
>>>>> psp_v11_0_bootloader_load_sysdrv(struct psp_context *psp)
>>>>>       if (ret)
>>>>>           return ret;
>>>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>> -
>>>>>       /* Copy PSP System Driver binary to memory */
>>>>> -    memcpy(psp->fw_pri_buf, psp->sys_start_addr, psp->sys_bin_size);
>>>>> +    psp_copy_fw(psp, psp->sys_start_addr, psp->sys_bin_size);
>>>>>       /* Provide the sys driver to bootloader */
>>>>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>>> @@ -371,10 +366,8 @@ static int 
>>>>> psp_v11_0_bootloader_load_sos(struct psp_context *psp)
>>>>>       if (ret)
>>>>>           return ret;
>>>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>> -
>>>>>       /* Copy Secure OS binary to PSP memory */
>>>>> -    memcpy(psp->fw_pri_buf, psp->sos_start_addr, psp->sos_bin_size);
>>>>> +    psp_copy_fw(psp, psp->sos_start_addr, psp->sos_bin_size);
>>>>>       /* Provide the PSP secure OS to bootloader */
>>>>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>>> @@ -608,7 +601,7 @@ static int psp_v11_0_memory_training(struct 
>>>>> psp_context *psp, uint32_t ops)
>>>>>       uint32_t p2c_header[4];
>>>>>       uint32_t sz;
>>>>>       void *buf;
>>>>> -    int ret;
>>>>> +    int ret, idx;
>>>>>       if (ctx->init == PSP_MEM_TRAIN_NOT_SUPPORT) {
>>>>>           DRM_DEBUG("Memory training is not supported.\n");
>>>>> @@ -681,17 +674,24 @@ static int psp_v11_0_memory_training(struct 
>>>>> psp_context *psp, uint32_t ops)
>>>>>               return -ENOMEM;
>>>>>           }
>>>>> -        memcpy_fromio(buf, adev->mman.aper_base_kaddr, sz);
>>>>> -        ret = psp_v11_0_memory_training_send_msg(psp, 
>>>>> PSP_BL__DRAM_LONG_TRAIN);
>>>>> -        if (ret) {
>>>>> -            DRM_ERROR("Send long training msg failed.\n");
>>>>> +        if (drm_dev_enter(&adev->ddev, &idx)) {
>>>>> +            memcpy_fromio(buf, adev->mman.aper_base_kaddr, sz);
>>>>> +            ret = psp_v11_0_memory_training_send_msg(psp, 
>>>>> PSP_BL__DRAM_LONG_TRAIN);
>>>>> +            if (ret) {
>>>>> +                DRM_ERROR("Send long training msg failed.\n");
>>>>> +                vfree(buf);
>>>>> +                drm_dev_exit(idx);
>>>>> +                return ret;
>>>>> +            }
>>>>> +
>>>>> +            memcpy_toio(adev->mman.aper_base_kaddr, buf, sz);
>>>>> +            adev->hdp.funcs->flush_hdp(adev, NULL);
>>>>>               vfree(buf);
>>>>> -            return ret;
>>>>> +            drm_dev_exit(idx);
>>>>> +        } else {
>>>>> +            vfree(buf);
>>>>> +            return -ENODEV;
>>>>>           }
>>>>> -
>>>>> -        memcpy_toio(adev->mman.aper_base_kaddr, buf, sz);
>>>>> -        adev->hdp.funcs->flush_hdp(adev, NULL);
>>>>> -        vfree(buf);
>>>>>       }
>>>>>       if (ops & PSP_MEM_TRAIN_SAVE) {
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c 
>>>>> b/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
>>>>> index c4828bd3264b..618e5b6b85d9 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
>>>>> @@ -138,10 +138,8 @@ static int 
>>>>> psp_v12_0_bootloader_load_sysdrv(struct psp_context *psp)
>>>>>       if (ret)
>>>>>           return ret;
>>>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>> -
>>>>>       /* Copy PSP System Driver binary to memory */
>>>>> -    memcpy(psp->fw_pri_buf, psp->sys_start_addr, psp->sys_bin_size);
>>>>> +    psp_copy_fw(psp, psp->sys_start_addr, psp->sys_bin_size);
>>>>>       /* Provide the sys driver to bootloader */
>>>>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>>> @@ -179,10 +177,8 @@ static int 
>>>>> psp_v12_0_bootloader_load_sos(struct psp_context *psp)
>>>>>       if (ret)
>>>>>           return ret;
>>>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>> -
>>>>>       /* Copy Secure OS binary to PSP memory */
>>>>> -    memcpy(psp->fw_pri_buf, psp->sos_start_addr, psp->sos_bin_size);
>>>>> +    psp_copy_fw(psp, psp->sos_start_addr, psp->sos_bin_size);
>>>>>       /* Provide the PSP secure OS to bootloader */
>>>>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c 
>>>>> b/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
>>>>> index f2e725f72d2f..d0a6cccd0897 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
>>>>> @@ -102,10 +102,8 @@ static int 
>>>>> psp_v3_1_bootloader_load_sysdrv(struct psp_context *psp)
>>>>>       if (ret)
>>>>>           return ret;
>>>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>> -
>>>>>       /* Copy PSP System Driver binary to memory */
>>>>> -    memcpy(psp->fw_pri_buf, psp->sys_start_addr, psp->sys_bin_size);
>>>>> +    psp_copy_fw(psp, psp->sys_start_addr, psp->sys_bin_size);
>>>>>       /* Provide the sys driver to bootloader */
>>>>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>>> @@ -143,10 +141,8 @@ static int psp_v3_1_bootloader_load_sos(struct 
>>>>> psp_context *psp)
>>>>>       if (ret)
>>>>>           return ret;
>>>>> -    memset(psp->fw_pri_buf, 0, PSP_1_MEG);
>>>>> -
>>>>>       /* Copy Secure OS binary to PSP memory */
>>>>> -    memcpy(psp->fw_pri_buf, psp->sos_start_addr, psp->sos_bin_size);
>>>>> +    psp_copy_fw(psp, psp->sos_start_addr, psp->sos_bin_size);
>>>>>       /* Provide the PSP secure OS to bootloader */
>>>>>       WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c 
>>>>> b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
>>>>> index 8e238dea7bef..90910d19db12 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
>>>>> @@ -25,6 +25,7 @@
>>>>>    */
>>>>>   #include <linux/firmware.h>
>>>>> +#include <drm/drm_drv.h>
>>>>>   #include "amdgpu.h"
>>>>>   #include "amdgpu_vce.h"
>>>>> @@ -555,16 +556,19 @@ static int vce_v4_0_hw_fini(void *handle)
>>>>>   static int vce_v4_0_suspend(void *handle)
>>>>>   {
>>>>>       struct amdgpu_device *adev = (struct amdgpu_device *)handle;
>>>>> -    int r;
>>>>> +    int r, idx;
>>>>>       if (adev->vce.vcpu_bo == NULL)
>>>>>           return 0;
>>>>> -    if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) {
>>>>> -        unsigned size = amdgpu_bo_size(adev->vce.vcpu_bo);
>>>>> -        void *ptr = adev->vce.cpu_addr;
>>>>> +    if (drm_dev_enter(&adev->ddev, &idx)) {
>>>>> +        if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) {
>>>>> +            unsigned size = amdgpu_bo_size(adev->vce.vcpu_bo);
>>>>> +            void *ptr = adev->vce.cpu_addr;
>>>>> -        memcpy_fromio(adev->vce.saved_bo, ptr, size);
>>>>> +            memcpy_fromio(adev->vce.saved_bo, ptr, size);
>>>>> +        }
>>>>> +        drm_dev_exit(idx);
>>>>>       }
>>>>>       r = vce_v4_0_hw_fini(adev);
>>>>> @@ -577,16 +581,20 @@ static int vce_v4_0_suspend(void *handle)
>>>>>   static int vce_v4_0_resume(void *handle)
>>>>>   {
>>>>>       struct amdgpu_device *adev = (struct amdgpu_device *)handle;
>>>>> -    int r;
>>>>> +    int r, idx;
>>>>>       if (adev->vce.vcpu_bo == NULL)
>>>>>           return -EINVAL;
>>>>>       if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) {
>>>>> -        unsigned size = amdgpu_bo_size(adev->vce.vcpu_bo);
>>>>> -        void *ptr = adev->vce.cpu_addr;
>>>>> -        memcpy_toio(ptr, adev->vce.saved_bo, size);
>>>>> +        if (drm_dev_enter(&adev->ddev, &idx)) {
>>>>> +            unsigned size = amdgpu_bo_size(adev->vce.vcpu_bo);
>>>>> +            void *ptr = adev->vce.cpu_addr;
>>>>> +
>>>>> +            memcpy_toio(ptr, adev->vce.saved_bo, size);
>>>>> +            drm_dev_exit(idx);
>>>>> +        }
>>>>>       } else {
>>>>>           r = amdgpu_vce_resume(adev);
>>>>>           if (r)
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c 
>>>>> b/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
>>>>> index 3f15bf34123a..df34be8ec82d 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
>>>>> @@ -34,6 +34,8 @@
>>>>>   #include "vcn/vcn_3_0_0_sh_mask.h"
>>>>>   #include "ivsrcid/vcn/irqsrcs_vcn_2_0.h"
>>>>> +#include <drm/drm_drv.h>
>>>>> +
>>>>>   #define mmUVD_CONTEXT_ID_INTERNAL_OFFSET            0x27
>>>>>   #define mmUVD_GPCOM_VCPU_CMD_INTERNAL_OFFSET 0x0f
>>>>>   #define mmUVD_GPCOM_VCPU_DATA0_INTERNAL_OFFSET 0x10
>>>>> @@ -268,16 +270,20 @@ static int vcn_v3_0_sw_init(void *handle)
>>>>>   static int vcn_v3_0_sw_fini(void *handle)
>>>>>   {
>>>>>       struct amdgpu_device *adev = (struct amdgpu_device *)handle;
>>>>> -    int i, r;
>>>>> +    int i, r, idx;
>>>>> -    for (i = 0; i < adev->vcn.num_vcn_inst; i++) {
>>>>> -        volatile struct amdgpu_fw_shared *fw_shared;
>>>>> +    if (drm_dev_enter(&adev->ddev, &idx)) {
>>>>> +        for (i = 0; i < adev->vcn.num_vcn_inst; i++) {
>>>>> +            volatile struct amdgpu_fw_shared *fw_shared;
>>>>> -        if (adev->vcn.harvest_config & (1 << i))
>>>>> -            continue;
>>>>> -        fw_shared = adev->vcn.inst[i].fw_shared_cpu_addr;
>>>>> -        fw_shared->present_flag_0 = 0;
>>>>> -        fw_shared->sw_ring.is_enabled = false;
>>>>> +            if (adev->vcn.harvest_config & (1 << i))
>>>>> +                continue;
>>>>> +            fw_shared = adev->vcn.inst[i].fw_shared_cpu_addr;
>>>>> +            fw_shared->present_flag_0 = 0;
>>>>> +            fw_shared->sw_ring.is_enabled = false;
>>>>> +        }
>>>>> +
>>>>> +        drm_dev_exit(idx);
>>>>>       }
>>>>>       if (amdgpu_sriov_vf(adev))
>>>>> diff --git a/drivers/gpu/drm/amd/pm/powerplay/smumgr/smu7_smumgr.c 
>>>>> b/drivers/gpu/drm/amd/pm/powerplay/smumgr/smu7_smumgr.c
>>>>> index aae25243eb10..d628b91846c9 100644
>>>>> --- a/drivers/gpu/drm/amd/pm/powerplay/smumgr/smu7_smumgr.c
>>>>> +++ b/drivers/gpu/drm/amd/pm/powerplay/smumgr/smu7_smumgr.c
>>>>> @@ -405,6 +405,8 @@ int smu7_request_smu_load_fw(struct pp_hwmgr 
>>>>> *hwmgr)
>>>>>                   UCODE_ID_MEC_STORAGE, 
>>>>> &toc->entry[toc->num_entries++]),
>>>>>                   "Failed to Get Firmware Entry.", r = -EINVAL; 
>>>>> goto failed);
>>>>>       }
>>>>> +
>>>>> +    /* AG TODO Can't call drm_dev_enter/exit because access 
>>>>> adev->ddev here ... */
>>>>>       memcpy_toio(smu_data->header_buffer.kaddr, smu_data->toc,
>>>>>               sizeof(struct SMU_DRAMData_TOC));
>>>>>       smum_send_msg_to_smc_with_parameter(hwmgr,
>>>>
> 

end of thread, other threads:[~2021-05-12 14:11 UTC | newest]

Thread overview: 126+ messages
2021-05-10 16:36 [PATCH v6 00/16] RFC Support hot device unplug in amdgpu Andrey Grodzovsky
2021-05-10 16:36 ` [PATCH v6 01/16] drm/ttm: Remap all page faults to per process dummy page Andrey Grodzovsky
2021-05-11  6:38   ` Christian König
2021-05-11 14:44     ` Andrey Grodzovsky
2021-05-11 15:12       ` Christian König
2021-05-10 16:36 ` [PATCH v6 02/16] drm/ttm: Expose ttm_tt_unpopulate for driver use Andrey Grodzovsky
2021-05-10 18:27   ` Felix Kuehling
2021-05-10 18:32     ` Andrey Grodzovsky
2021-05-10 16:36 ` [PATCH v6 03/16] drm/amdgpu: Split amdgpu_device_fini into early and late Andrey Grodzovsky
2021-05-10 23:49   ` kernel test robot
2021-05-10 16:36 ` [PATCH v6 04/16] drm/amdkfd: Split kfd suspend from devie exit Andrey Grodzovsky
2021-05-11  6:40   ` Christian König
2021-05-11 14:52     ` Andrey Grodzovsky
2021-05-11 13:24   ` Deucher, Alexander
2021-05-10 16:36 ` [PATCH v6 05/16] drm/amdgpu: Add early fini callback Andrey Grodzovsky
2021-05-11  6:41   ` Christian König
2021-05-11  6:41     ` Christian König
2021-05-11  6:41     ` Christian König
2021-05-10 16:36 ` [PATCH v6 06/16] drm/amdgpu: Handle IOMMU enabled case Andrey Grodzovsky
2021-05-10 16:36   ` Andrey Grodzovsky
2021-05-10 16:36   ` Andrey Grodzovsky
2021-05-11  6:44   ` Christian König
2021-05-11  6:44     ` Christian König
2021-05-11  6:44     ` Christian König
2021-05-11 15:46     ` Andrey Grodzovsky
2021-05-11 15:46       ` Andrey Grodzovsky
2021-05-11 15:46       ` Andrey Grodzovsky
2021-05-11 15:56   ` Alex Deucher
2021-05-11 15:56     ` Alex Deucher
2021-05-11 15:56     ` Alex Deucher
2021-05-11 15:59     ` Andrey Grodzovsky
2021-05-11 15:59       ` Andrey Grodzovsky
2021-05-11 15:59       ` Andrey Grodzovsky
2021-05-10 16:36 ` [PATCH v6 07/16] drm/amdgpu: Remap all page faults to per process dummy page Andrey Grodzovsky
2021-05-10 16:36   ` Andrey Grodzovsky
2021-05-10 16:36   ` Andrey Grodzovsky
2021-05-10 16:36 ` [PATCH v6 08/16] PCI: Add support for dev_groups to struct pci_device_driver Andrey Grodzovsky
2021-05-10 16:36   ` Andrey Grodzovsky
2021-05-10 16:36   ` Andrey Grodzovsky
2021-05-10 20:56   ` Bjorn Helgaas
2021-05-10 20:56     ` Bjorn Helgaas
2021-05-10 20:56     ` Bjorn Helgaas
2021-05-10 16:36 ` [PATCH v6 09/16] drm/amdgpu: Convert driver sysfs attributes to static attributes Andrey Grodzovsky
2021-05-10 16:36   ` Andrey Grodzovsky
2021-05-10 16:36   ` Andrey Grodzovsky
2021-05-10 16:36 ` [PATCH v6 10/16] drm/amdgpu: Guard against write accesses after device removal Andrey Grodzovsky
2021-05-10 16:36   ` Andrey Grodzovsky
2021-05-10 16:36   ` Andrey Grodzovsky
2021-05-11  6:50   ` Christian König
2021-05-11  6:50     ` Christian König
2021-05-11  6:50     ` Christian König
2021-05-11 17:52     ` Andrey Grodzovsky
2021-05-11 17:52       ` Andrey Grodzovsky
2021-05-11 17:52       ` Andrey Grodzovsky
2021-05-12 14:01       ` Andrey Grodzovsky
2021-05-12 14:01         ` Andrey Grodzovsky
2021-05-12 14:01         ` Andrey Grodzovsky
2021-05-12 14:06         ` Christian König
2021-05-12 14:06           ` Christian König
2021-05-12 14:06           ` Christian König
2021-05-12 14:11           ` Andrey Grodzovsky
2021-05-12 14:11             ` Andrey Grodzovsky
2021-05-12 14:11             ` Andrey Grodzovsky
2021-05-10 16:36 ` [PATCH v6 11/16] drm/sched: Make timeout timer rearm conditional Andrey Grodzovsky
2021-05-10 16:36   ` Andrey Grodzovsky
2021-05-10 16:36   ` Andrey Grodzovsky
2021-05-11  6:52   ` Christian König
2021-05-11  6:52     ` Christian König
2021-05-11  6:52     ` Christian König
2021-05-10 16:36 ` [PATCH v6 12/16] drm/amdgpu: Prevent any job recoveries after device is unplugged Andrey Grodzovsky
2021-05-10 16:36   ` Andrey Grodzovsky
2021-05-10 16:36   ` Andrey Grodzovsky
2021-05-11  6:53   ` Christian König
2021-05-11  6:53     ` Christian König
2021-05-11  6:53     ` Christian König
2021-05-10 16:36 ` [PATCH v6 13/16] drm/amdgpu: Fix hang on device removal Andrey Grodzovsky
2021-05-10 16:36   ` Andrey Grodzovsky
2021-05-10 16:36   ` Andrey Grodzovsky
2021-05-11  6:54   ` Christian König
2021-05-11  6:54     ` Christian König
2021-05-11  6:54     ` Christian König
2021-05-10 16:36 ` [PATCH v6 14/16] drm/scheduler: Fix hang when sched_entity released Andrey Grodzovsky
2021-05-10 16:36   ` Andrey Grodzovsky
2021-05-10 16:36   ` Andrey Grodzovsky
2021-05-10 16:36 ` [PATCH v6 15/16] drm/amd/display: Remove superflous drm_mode_config_cleanup Andrey Grodzovsky
2021-05-10 16:36   ` Andrey Grodzovsky
2021-05-10 16:36   ` Andrey Grodzovsky
2021-05-10 21:38   ` Rodrigo Siqueira
2021-05-10 21:38     ` Rodrigo Siqueira
2021-05-10 21:38     ` Rodrigo Siqueira
2021-05-10 16:36 ` [PATCH v6 16/16] drm/amdgpu: Verify DMA opearations from device are done Andrey Grodzovsky
2021-05-10 16:36   ` Andrey Grodzovsky
2021-05-10 16:36   ` Andrey Grodzovsky
2021-05-11  6:56   ` Christian König
2021-05-11  6:56     ` Christian König
2021-05-11  6:56     ` Christian König
