linux-pci.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Felix Kuehling <felix.kuehling@amd.com>
To: Andrey Grodzovsky <andrey.grodzovsky@amd.com>,
	dri-devel@lists.freedesktop.org, amd-gfx@lists.freedesktop.org,
	linux-pci@vger.kernel.org, ckoenig.leichtzumerken@gmail.com,
	daniel.vetter@ffwll.ch, Harry.Wentland@amd.com
Cc: ppaalanen@gmail.com, Alexander.Deucher@amd.com,
	gregkh@linuxfoundation.org, helgaas@kernel.org,
	"Christian König" <christian.koenig@amd.com>
Subject: Re: [PATCH] drm/amdgpu: Add early fini callback
Date: Wed, 19 May 2021 23:29:26 -0400	[thread overview]
Message-ID: <ab262224-fd1d-31a2-72cd-54174fba17ae@amd.com> (raw)
In-Reply-To: <20210520032057.497334-1-andrey.grodzovsky@amd.com>

Am 2021-05-19 um 11:20 p.m. schrieb Andrey Grodzovsky:
> Use it to call disply code dependent on device->drv_data
> before it's set to NULL on device unplug
>
> v5:
> Move HW finilization into this callback to prevent MMIO accesses
> post cpi remove.
>
> v7:
> Split kfd suspend from device exit to expdite HW related
> stuff to amdgpu_pci_remove
>
> v8:
> Squash previous KFD commit into this commit to avoid compile break.
>
> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> Acked-by: Christian König <christian.koenig@amd.com>

See one cosmetic comment inline. With that fixed the patch is

Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>


> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c    |  2 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h    |  2 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c    | 59 +++++++++++++------
>  drivers/gpu/drm/amd/amdkfd/kfd_device.c       |  3 +-
>  .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 12 +++-
>  drivers/gpu/drm/amd/include/amd_shared.h      |  2 +
>  6 files changed, 56 insertions(+), 24 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> index 5f6696a3c778..2b06dee9a0ce 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> @@ -170,7 +170,7 @@ void amdgpu_amdkfd_device_init(struct amdgpu_device *adev)
>  	}
>  }
>  
> -void amdgpu_amdkfd_device_fini(struct amdgpu_device *adev)
> +void amdgpu_amdkfd_device_fini_sw(struct amdgpu_device *adev)
>  {
>  	if (adev->kfd.dev) {
>  		kgd2kfd_device_exit(adev->kfd.dev);
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
> index 5ffb07b02810..d8a537e8aea5 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
> @@ -127,7 +127,7 @@ void amdgpu_amdkfd_interrupt(struct amdgpu_device *adev,
>  			const void *ih_ring_entry);
>  void amdgpu_amdkfd_device_probe(struct amdgpu_device *adev);
>  void amdgpu_amdkfd_device_init(struct amdgpu_device *adev);
> -void amdgpu_amdkfd_device_fini(struct amdgpu_device *adev);
> +void amdgpu_amdkfd_device_fini_sw(struct amdgpu_device *adev);
>  int amdgpu_amdkfd_submit_ib(struct kgd_dev *kgd, enum kgd_engine_type engine,
>  				uint32_t vmid, uint64_t gpu_addr,
>  				uint32_t *ib_cmd, uint32_t ib_len);
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 8bee95ad32d9..bc75e35dd8d8 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -2558,34 +2558,26 @@ static int amdgpu_device_ip_late_init(struct amdgpu_device *adev)
>  	return 0;
>  }
>  
> -/**
> - * amdgpu_device_ip_fini - run fini for hardware IPs
> - *
> - * @adev: amdgpu_device pointer
> - *
> - * Main teardown pass for hardware IPs.  The list of all the hardware
> - * IPs that make up the asic is walked and the hw_fini and sw_fini callbacks
> - * are run.  hw_fini tears down the hardware associated with each IP
> - * and sw_fini tears down any software state associated with each IP.
> - * Returns 0 on success, negative error code on failure.
> - */
> -static int amdgpu_device_ip_fini(struct amdgpu_device *adev)
> +static int amdgpu_device_ip_fini_early(struct amdgpu_device *adev)
>  {
>  	int i, r;
>  
> -	if (amdgpu_sriov_vf(adev) && adev->virt.ras_init_done)
> -		amdgpu_virt_release_ras_err_handler_data(adev);
> +	for (i = 0; i < adev->num_ip_blocks; i++) {
> +		if (!adev->ip_blocks[i].version->funcs->early_fini)
> +			continue;
>  
> -	amdgpu_ras_pre_fini(adev);
> +		r = adev->ip_blocks[i].version->funcs->early_fini((void *)adev);
> +		if (r) {
> +			DRM_DEBUG("early_fini of IP block <%s> failed %d\n",
> +				  adev->ip_blocks[i].version->funcs->name, r);
> +		}
> +	}
>  
> -	if (adev->gmc.xgmi.num_physical_nodes > 1)
> -		amdgpu_xgmi_remove_device(adev);
> +	amdgpu_amdkfd_suspend(adev, false);
>  
>  	amdgpu_device_set_pg_state(adev, AMD_PG_STATE_UNGATE);
>  	amdgpu_device_set_cg_state(adev, AMD_CG_STATE_UNGATE);
>  
> -	amdgpu_amdkfd_device_fini(adev);
> -
>  	/* need to disable SMC first */
>  	for (i = 0; i < adev->num_ip_blocks; i++) {
>  		if (!adev->ip_blocks[i].status.hw)
> @@ -2616,6 +2608,33 @@ static int amdgpu_device_ip_fini(struct amdgpu_device *adev)
>  		adev->ip_blocks[i].status.hw = false;
>  	}
>  
> +	return 0;
> +}
> +
> +/**
> + * amdgpu_device_ip_fini - run fini for hardware IPs
> + *
> + * @adev: amdgpu_device pointer
> + *
> + * Main teardown pass for hardware IPs.  The list of all the hardware
> + * IPs that make up the asic is walked and the hw_fini and sw_fini callbacks
> + * are run.  hw_fini tears down the hardware associated with each IP
> + * and sw_fini tears down any software state associated with each IP.
> + * Returns 0 on success, negative error code on failure.
> + */
> +static int amdgpu_device_ip_fini(struct amdgpu_device *adev)
> +{
> +	int i, r;
> +
> +	if (amdgpu_sriov_vf(adev) && adev->virt.ras_init_done)
> +		amdgpu_virt_release_ras_err_handler_data(adev);
> +
> +	amdgpu_ras_pre_fini(adev);
> +
> +	if (adev->gmc.xgmi.num_physical_nodes > 1)
> +		amdgpu_xgmi_remove_device(adev);
> +
> +	amdgpu_amdkfd_device_fini_sw(adev);
>  
>  	for (i = adev->num_ip_blocks - 1; i >= 0; i--) {
>  		if (!adev->ip_blocks[i].status.sw)
> @@ -3681,6 +3700,8 @@ void amdgpu_device_fini_hw(struct amdgpu_device *adev)
>  	amdgpu_fbdev_fini(adev);
>  
>  	amdgpu_irq_fini_hw(adev);
> +
> +	amdgpu_device_ip_fini_early(adev);
>  }
>  
>  void amdgpu_device_fini_sw(struct amdgpu_device *adev)
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> index 357b9bf62a1c..ab6d2a43c9a3 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> @@ -858,10 +858,11 @@ bool kgd2kfd_device_init(struct kfd_dev *kfd,
>  	return kfd->init_complete;
>  }
>  
> +
> +
>  void kgd2kfd_device_exit(struct kfd_dev *kfd)

Unnecessary whitespace change.

Regards,
  Felix


>  {
>  	if (kfd->init_complete) {
> -		kgd2kfd_suspend(kfd, false);
>  		device_queue_manager_uninit(kfd->dqm);
>  		kfd_interrupt_exit(kfd);
>  		kfd_topology_remove_device(kfd);
> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> index 9ca517b65854..f7112865269a 100644
> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> @@ -1251,6 +1251,15 @@ static int amdgpu_dm_init(struct amdgpu_device *adev)
>  	return -EINVAL;
>  }
>  
> +static int amdgpu_dm_early_fini(void *handle)
> +{
> +	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
> +
> +	amdgpu_dm_audio_fini(adev);
> +
> +	return 0;
> +}
> +
>  static void amdgpu_dm_fini(struct amdgpu_device *adev)
>  {
>  	int i;
> @@ -1259,8 +1268,6 @@ static void amdgpu_dm_fini(struct amdgpu_device *adev)
>  		drm_encoder_cleanup(&adev->dm.mst_encoders[i].base);
>  	}
>  
> -	amdgpu_dm_audio_fini(adev);
> -
>  	amdgpu_dm_destroy_drm_device(&adev->dm);
>  
>  #if defined(CONFIG_DRM_AMD_SECURE_DISPLAY)
> @@ -2298,6 +2305,7 @@ static const struct amd_ip_funcs amdgpu_dm_funcs = {
>  	.late_init = dm_late_init,
>  	.sw_init = dm_sw_init,
>  	.sw_fini = dm_sw_fini,
> +	.early_fini = amdgpu_dm_early_fini,
>  	.hw_init = dm_hw_init,
>  	.hw_fini = dm_hw_fini,
>  	.suspend = dm_suspend,
> diff --git a/drivers/gpu/drm/amd/include/amd_shared.h b/drivers/gpu/drm/amd/include/amd_shared.h
> index 43ed6291b2b8..1ad56da486e4 100644
> --- a/drivers/gpu/drm/amd/include/amd_shared.h
> +++ b/drivers/gpu/drm/amd/include/amd_shared.h
> @@ -240,6 +240,7 @@ enum amd_dpm_forced_level;
>   * @late_init: sets up late driver/hw state (post hw_init) - Optional
>   * @sw_init: sets up driver state, does not configure hw
>   * @sw_fini: tears down driver state, does not configure hw
> + * @early_fini: tears down stuff before dev detached from driver
>   * @hw_init: sets up the hw state
>   * @hw_fini: tears down the hw state
>   * @late_fini: final cleanup
> @@ -268,6 +269,7 @@ struct amd_ip_funcs {
>  	int (*late_init)(void *handle);
>  	int (*sw_init)(void *handle);
>  	int (*sw_fini)(void *handle);
> +	int (*early_fini)(void *handle);
>  	int (*hw_init)(void *handle);
>  	int (*hw_fini)(void *handle);
>  	void (*late_fini)(void *handle);

  reply	other threads:[~2021-05-20  3:29 UTC|newest]

Thread overview: 64+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-05-12 14:26 [PATCH v7 00/16] RFC Support hot device unplug in amdgpu Andrey Grodzovsky
2021-05-12 14:26 ` [PATCH v7 01/16] drm/ttm: Remap all page faults to per process dummy page Andrey Grodzovsky
2021-05-12 14:26 ` [PATCH v7 02/16] drm/amdgpu: Split amdgpu_device_fini into early and late Andrey Grodzovsky
2021-05-12 14:26 ` [PATCH v7 03/16] drm/amdkfd: Split kfd suspend from device exit Andrey Grodzovsky
2021-05-12 20:33   ` Felix Kuehling
2021-05-12 20:38     ` Andrey Grodzovsky
2021-05-20  3:20     ` [PATCH] drm/amdgpu: Add early fini callback Andrey Grodzovsky
2021-05-20  3:29       ` Felix Kuehling [this message]
2021-05-20  3:58         ` Andrey Grodzovsky
2021-05-12 14:26 ` [PATCH v7 04/16] " Andrey Grodzovsky
2021-05-12 14:26 ` [PATCH v7 05/16] drm/amdgpu: Handle IOMMU enabled case Andrey Grodzovsky
2021-05-14 14:41   ` Andrey Grodzovsky
2021-05-14 16:25     ` Felix Kuehling
2021-05-14 16:26       ` Andrey Grodzovsky
2021-05-17 14:38       ` [PATCH] " Andrey Grodzovsky
2021-05-17 14:48         ` Felix Kuehling
2021-05-12 14:26 ` [PATCH v7 06/16] drm/amdgpu: Remap all page faults to per process dummy page Andrey Grodzovsky
2021-05-12 14:26 ` [PATCH v7 07/16] PCI: Add support for dev_groups to struct pci_driver Andrey Grodzovsky
2021-05-12 14:26 ` [PATCH v7 08/16] drm/amdgpu: Convert driver sysfs attributes to static attributes Andrey Grodzovsky
2021-05-12 14:26 ` [PATCH v7 09/16] drm/amdgpu: Guard against write accesses after device removal Andrey Grodzovsky
2021-05-12 20:17   ` Alex Deucher
2021-05-12 20:30     ` Andrey Grodzovsky
2021-05-12 20:50       ` Alex Deucher
2021-05-13 14:47         ` Andrey Grodzovsky
2021-05-13 14:54           ` Alex Deucher
2021-05-12 14:26 ` [PATCH v7 10/16] drm/sched: Make timeout timer rearm conditional Andrey Grodzovsky
2021-05-12 14:26 ` [PATCH v7 11/16] drm/amdgpu: Prevent any job recoveries after device is unplugged Andrey Grodzovsky
2021-05-12 14:26 ` [PATCH v7 12/16] drm/amdgpu: Fix hang on device removal Andrey Grodzovsky
2021-05-14 14:42   ` Andrey Grodzovsky
2021-05-17 14:40     ` Andrey Grodzovsky
2021-05-17 17:39       ` Alex Deucher
2021-05-17 19:39       ` Christian König
2021-05-17 19:46         ` Andrey Grodzovsky
2021-05-17 19:54           ` Christian König
2021-05-12 14:26 ` [PATCH v7 13/16] drm/scheduler: Fix hang when sched_entity released Andrey Grodzovsky
2021-05-18 14:07   ` Christian König
2021-05-18 15:03     ` Andrey Grodzovsky
2021-05-18 15:15       ` Christian König
2021-05-18 16:17         ` Andrey Grodzovsky
2021-05-18 16:33           ` Christian König
2021-05-18 17:43             ` Andrey Grodzovsky
2021-05-18 18:02               ` Christian König
2021-05-18 18:09                 ` Andrey Grodzovsky
2021-05-18 18:13                   ` Christian König
2021-05-18 18:48                     ` Andrey Grodzovsky
2021-05-18 20:56                       ` Andrey Grodzovsky
2021-05-19 10:57                       ` Christian König
2021-05-19 11:03                         ` Andrey Grodzovsky
2021-05-19 11:46                           ` Christian König
2021-05-19 11:51                             ` Andrey Grodzovsky
2021-05-19 11:56                               ` Christian König
2021-05-19 14:14                                 ` [PATCH] drm/sched: Avoid data corruptions Andrey Grodzovsky
2021-05-19 14:15                                   ` Christian König
2021-05-12 14:26 ` [PATCH v7 14/16] drm/amd/display: Remove superfluous drm_mode_config_cleanup Andrey Grodzovsky
2021-05-12 14:26 ` [PATCH v7 15/16] drm/amdgpu: Verify DMA opearations from device are done Andrey Grodzovsky
2021-05-12 14:26 ` [PATCH v7 16/16] drm/amdgpu: Unmap all MMIO mappings Andrey Grodzovsky
2021-05-14 14:42   ` Andrey Grodzovsky
2021-05-17 14:41     ` Andrey Grodzovsky
2021-05-17 17:43   ` Alex Deucher
2021-05-17 18:46     ` Andrey Grodzovsky
2021-05-17 18:56       ` Alex Deucher
2021-05-17 19:22         ` Andrey Grodzovsky
2021-05-17 19:31     ` [PATCH] " Andrey Grodzovsky
2021-05-18 14:01       ` Andrey Grodzovsky

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ab262224-fd1d-31a2-72cd-54174fba17ae@amd.com \
    --to=felix.kuehling@amd.com \
    --cc=Alexander.Deucher@amd.com \
    --cc=Harry.Wentland@amd.com \
    --cc=amd-gfx@lists.freedesktop.org \
    --cc=andrey.grodzovsky@amd.com \
    --cc=christian.koenig@amd.com \
    --cc=ckoenig.leichtzumerken@gmail.com \
    --cc=daniel.vetter@ffwll.ch \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=gregkh@linuxfoundation.org \
    --cc=helgaas@kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=ppaalanen@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).