From: Felix Kuehling <Felix.Kuehling@amd.com>
To: amd-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org
Cc: David Yat Sin <david.yatsin@amd.com>, Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>
Subject: [RFC PATCH 04/17] drm/amdkfd: CRIU Implement KFD helper ioctl
Date: Fri, 30 Apr 2021 21:57:39 -0400
Message-ID: <20210501015752.888-5-Felix.Kuehling@amd.com>
In-Reply-To: <20210501015752.888-1-Felix.Kuehling@amd.com>

From: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>

This IOCTL is expected to be called as a precursor to the actual
checkpoint operation. It performs basic discovery on the target process
seized by CRIU and relays the information to userspace, which uses it to
start the checkpoint operation via another dedicated IOCTL.

The helper IOCTL determines the number of GPUs and the number of buffer
objects associated with the target process, as well as the process ID of
the target in the caller's namespace. The PID is needed because the
/proc/pid/mem interface may be used to drain the contents of the
discovered buffer objects in userspace, and getpid returns the PID of the
CRIU dumper process. Also, the PID of a process inside a container might
differ from its global PID, so the namespace PID is returned.

Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>
(cherry picked from commit b2fa92d0a8f1de51013cd6742b4996b38c285ffc)
(cherry picked from commit 8b44c466ce53162603cd8ae49624462902541a47)
Signed-off-by: David Yat Sin <david.yatsin@amd.com>
Change-Id: I2c6b28fe4df7333c9faf7eb6ee86decabe475338
---
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 42 ++++++++++++++++++++++--
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h    |  2 ++
 drivers/gpu/drm/amd/amdkfd/kfd_process.c | 14 ++++++++
 3 files changed, 56 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index 1fa2ba34a429..6b347ce5992f 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -1822,9 +1822,47 @@ static int kfd_ioctl_criu_restorer(struct file *filep,
 static int kfd_ioctl_criu_helper(struct file *filep,
 				struct kfd_process *p, void *data)
 {
-	pr_info("Inside %s\n",__func__);
+	struct kfd_ioctl_criu_helper_args *args = data;
+	struct kgd_mem *kgd_mem;
+	u64 num_of_bos = 0;
+	int id, i = 0;
+	void *mem;
+	int ret = 0;
 
-	return 0;
+	pr_debug("Inside %s\n", __func__);
+	mutex_lock(&p->mutex);
+
+	if (!kfd_has_process_device_data(p)) {
+		pr_err("No pdd for given process\n");
+		ret = -ENODEV;
+		goto err_unlock;
+	}
+
+	/* Run over all PDDs of the process */
+	for (i = 0; i < p->n_pdds; i++) {
+		struct kfd_process_device *pdd = p->pdds[i];
+
+		idr_for_each_entry(&pdd->alloc_idr, mem, id) {
+			if (!mem) {
+				ret = -ENOMEM;
+				goto err_unlock;
+			}
+
+			kgd_mem = (struct kgd_mem *)mem;
+			if ((uint64_t)kgd_mem->va > pdd->gpuvm_base)
+				num_of_bos++;
+		}
+	}
+
+	args->task_pid = task_pid_nr_ns(p->lead_thread,
+					task_active_pid_ns(p->lead_thread));
+	args->num_of_devices = p->n_pdds;
+	args->num_of_bos = num_of_bos;
+	dev_dbg(kfd_device, "Num of bos = %llu\n", num_of_bos);
+
+err_unlock:
+	mutex_unlock(&p->mutex);
+	return ret;
 }
 
 static int kfd_ioctl_criu_resume(struct file *filep,
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
index a494d61543af..74d3eb383099 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
@@ -932,6 +932,8 @@ void *kfd_process_device_translate_handle(struct kfd_process_device *p,
 void kfd_process_device_remove_obj_handle(struct kfd_process_device *pdd,
 					int handle);
 
+bool kfd_has_process_device_data(struct kfd_process *p);
+
 /* PASIDs */
 int kfd_pasid_init(void);
 void kfd_pasid_exit(void);
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
index 9d4f527bda7c..bc133c3789d8 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
@@ -1359,6 +1359,20 @@ static int init_doorbell_bitmap(struct qcm_process_device *qpd,
 	return 0;
 }
 
+bool kfd_has_process_device_data(struct kfd_process *p)
+{
+	int i;
+
+	for (i = 0; i < p->n_pdds; i++) {
+		struct kfd_process_device *pdd = p->pdds[i];
+
+		if (pdd)
+			return true;
+	}
+
+	return false;
+}
+
 struct kfd_process_device *kfd_get_process_device_data(struct kfd_dev *dev,
 							struct kfd_process *p)
 {
-- 
2.17.1

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel
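
For readers who want to reason about the counting rule without a kernel tree, here is a minimal userspace sketch of the loop in kfd_ioctl_criu_helper. It is not the driver code: the types mock_bo, mock_pdd and mock_process and the function count_checkpoint_bos are hypothetical stand-ins invented for this illustration, plain arrays replace the per-device IDR, and no locking is modelled. The one thing it mirrors from the patch is the filter: a buffer object is counted only when its virtual address is strictly greater than the gpuvm_base of its device.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct kgd_mem: only the GPU VA matters here. */
struct mock_bo {
	uint64_t va;
};

/* Hypothetical stand-in for struct kfd_process_device. An array replaces
 * the alloc_idr that the real driver walks with idr_for_each_entry(). */
struct mock_pdd {
	uint64_t gpuvm_base;
	const struct mock_bo *bos;
	size_t n_bos;
};

/* Hypothetical stand-in for struct kfd_process. */
struct mock_process {
	const struct mock_pdd *pdds;
	size_t n_pdds;
};

/* Count the buffer objects across all devices whose VA lies strictly
 * above the device's gpuvm_base, the same comparison the patch uses. */
uint64_t count_checkpoint_bos(const struct mock_process *p)
{
	uint64_t num_of_bos = 0;

	assert(p != NULL);

	for (size_t i = 0; i < p->n_pdds; i++) {
		const struct mock_pdd *pdd = &p->pdds[i];

		for (size_t j = 0; j < pdd->n_bos; j++) {
			if (pdd->bos[j].va > pdd->gpuvm_base)
				num_of_bos++;
		}
	}

	return num_of_bos;
}
```

Note that the comparison is strict, so in this model a buffer object mapped exactly at gpuvm_base is not counted. Whether that boundary case can occur in the real driver is not stated in the patch.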