* [Patch v2] drm/amdkfd: Fix CRIU restore op due to doorbell offset
@ 2022-09-07 19:43 Rajneesh Bhardwaj
From: Rajneesh Bhardwaj @ 2022-09-07 19:43 UTC
  To: amd-gfx; +Cc: alexander.deucher, Felix.Kuehling, Rajneesh Bhardwaj

A recently introduced change that allocates doorbells only when the first
queue is created or mapped for CPU / GPU access did not fully account for
the Checkpoint/Restore scenario. This fix allows the CRIU restore
operation to succeed by extending the on-demand doorbell allocation to
the CRIU restore path.

Fixes: 15bcfbc55b57 ("drm/amdkfd: Allocate doorbells only when needed")

Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>
---

Changes in v2:

* Addressed review feedback from Felix

 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c               | 6 ++++++
 drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c              | 3 +++
 drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c | 7 +++++++
 3 files changed, 16 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index 84da1a9ce37c..56f7307c21d2 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -2153,6 +2153,12 @@ static int criu_restore_devices(struct kfd_process *p,
 			ret = PTR_ERR(pdd);
 			goto exit;
 		}
+
+		if (!pdd->doorbell_index &&
+		    kfd_alloc_process_doorbells(pdd->dev, &pdd->doorbell_index) < 0) {
+			ret = -ENOMEM;
+			goto exit;
+		}
 	}
 
 	/*
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c b/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c
index b33798f89ef0..cd4e61bf0493 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c
@@ -303,6 +303,9 @@ int kfd_alloc_process_doorbells(struct kfd_dev *kfd, unsigned int *doorbell_inde
 	if (r > 0)
 		*doorbell_index = r;
 
+	if (r < 0)
+		pr_err("Failed to allocate process doorbells\n");
+
 	return r;
 }
 
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
index 6e3e7f54381b..5137476ec18e 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
@@ -857,6 +857,13 @@ int kfd_criu_restore_queue(struct kfd_process *p,
 		ret = -EINVAL;
 		goto exit;
 	}
+
+	if (!pdd->doorbell_index &&
+	    kfd_alloc_process_doorbells(pdd->dev, &pdd->doorbell_index) < 0) {
+		ret = -ENOMEM;
+		goto exit;
+	}
+
 	/* data stored in this order: mqd, ctl_stack */
 	mqd = q_extra_data;
 	ctl_stack = mqd + q_data->mqd_size;
-- 
2.17.1
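
The guard added to both restore paths above can be tried outside the kernel
with a minimal user-space sketch. This is only an illustration under stated
assumptions: `fake_dev`, `fake_pdd`, `fake_alloc_process_doorbells()` and
`ensure_doorbells()` are hypothetical stand-ins that do not exist in amdkfd,
and the counter-based allocator merely imitates the return convention visible
in the kfd_doorbell.c hunk (positive index on success, negative errno on
failure).

```c
/*
 * User-space sketch only. The names mirror the patch, but every type and
 * the allocator below are simplified, hypothetical stand-ins rather than
 * the real amdkfd code.
 */
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

struct fake_dev {
	unsigned int next_index;	/* last index handed out */
	bool exhausted;			/* simulate allocation failure */
};

struct fake_pdd {
	struct fake_dev *dev;
	unsigned int doorbell_index;	/* 0 means "not allocated yet" */
};

/*
 * Stand-in for kfd_alloc_process_doorbells(): returns a positive index and
 * stores it through doorbell_index, or returns a negative errno and leaves
 * doorbell_index untouched.
 */
static int fake_alloc_process_doorbells(struct fake_dev *dev,
					unsigned int *doorbell_index)
{
	int r = dev->exhausted ? -ENOMEM : (int)++dev->next_index;

	if (r > 0)
		*doorbell_index = r;

	return r;
}

/*
 * Same shape as the check added to the restore paths: allocate only when
 * no doorbells have been assigned yet, and report -ENOMEM on failure.
 */
static int ensure_doorbells(struct fake_pdd *pdd)
{
	if (!pdd->doorbell_index &&
	    fake_alloc_process_doorbells(pdd->dev, &pdd->doorbell_index) < 0)
		return -ENOMEM;

	return 0;
}
```

Because the check tests `doorbell_index` first, calling it from both the
device-restore and queue-restore paths should allocate at most once per
process device; the second call is a no-op.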



* Re: [Patch v2] drm/amdkfd: Fix CRIU restore op due to doorbell offset
@ 2022-09-07 21:26 ` Felix Kuehling
From: Felix Kuehling @ 2022-09-07 21:26 UTC
  To: Rajneesh Bhardwaj, amd-gfx; +Cc: alexander.deucher

On 2022-09-07 15:43, Rajneesh Bhardwaj wrote:
> A recently introduced change that allocates doorbells only when the first
> queue is created or mapped for CPU / GPU access did not fully account for
> the Checkpoint/Restore scenario. This fix allows the CRIU restore
> operation to succeed by extending the on-demand doorbell allocation to
> the CRIU restore path.
>
> Fixes: 15bcfbc55b57 ("drm/amdkfd: Allocate doorbells only when needed")
>
> Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>

Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>

Thanks!


> ---
>
> Changes in v2:
>
> * Addressed review feedback from Felix
>
>   drivers/gpu/drm/amd/amdkfd/kfd_chardev.c               | 6 ++++++
>   drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c              | 3 +++
>   drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c | 7 +++++++
>   3 files changed, 16 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> index 84da1a9ce37c..56f7307c21d2 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> @@ -2153,6 +2153,12 @@ static int criu_restore_devices(struct kfd_process *p,
>   			ret = PTR_ERR(pdd);
>   			goto exit;
>   		}
> +
> +		if (!pdd->doorbell_index &&
> +		    kfd_alloc_process_doorbells(pdd->dev, &pdd->doorbell_index) < 0) {
> +			ret = -ENOMEM;
> +			goto exit;
> +		}
>   	}
>   
>   	/*
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c b/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c
> index b33798f89ef0..cd4e61bf0493 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c
> @@ -303,6 +303,9 @@ int kfd_alloc_process_doorbells(struct kfd_dev *kfd, unsigned int *doorbell_inde
>   	if (r > 0)
>   		*doorbell_index = r;
>   
> +	if (r < 0)
> +		pr_err("Failed to allocate process doorbells\n");
> +
>   	return r;
>   }
>   
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
> index 6e3e7f54381b..5137476ec18e 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
> @@ -857,6 +857,13 @@ int kfd_criu_restore_queue(struct kfd_process *p,
>   		ret = -EINVAL;
>   		goto exit;
>   	}
> +
> +	if (!pdd->doorbell_index &&
> +	    kfd_alloc_process_doorbells(pdd->dev, &pdd->doorbell_index) < 0) {
> +		ret = -ENOMEM;
> +		goto exit;
> +	}
> +
>   	/* data stored in this order: mqd, ctl_stack */
>   	mqd = q_extra_data;
>   	ctl_stack = mqd + q_data->mqd_size;
