From: Felix Kuehling <felix.kuehling@amd.com>
To: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>,
amd-gfx@lists.freedesktop.org
Cc: alexander.deucher@amd.com
Subject: Re: [Patch v3 3/4] drm/amdkfd: refactor runtime pm for baco
Date: Fri, 7 Feb 2020 16:49:57 -0500
Message-ID: <e73350f5-c604-8f2d-97fb-5c3226dfcf74@amd.com>
In-Reply-To: <20200207000911.19166-4-rajneesh.bhardwaj@amd.com>

One more nit-pick and one error-handling problem inline.

On 2020-02-06 7:09 p.m., Rajneesh Bhardwaj wrote:
> So far the kfd driver implemented the same routines for runtime and
> system-wide suspend and resume (s2idle or mem). During system-wide
> suspend the kfd acquires an atomic lock that prevents any more user
> processes from creating queues and interacting with the kfd driver and
> amd gpu. This mechanism created a problem when the amdgpu device is
> runtime suspended with BACO enabled. Any application that relies on the
> kfd driver fails to load because the driver reports a locked kfd device
> while the gpu is runtime suspended.
>
> However, in an ideal case, when gpu is runtime suspended the kfd driver
> should be able to:
>
> - auto resume amdgpu driver whenever a client requests compute service
> - prevent runtime suspend for amdgpu while kfd is in use
>
> This change refactors the amdgpu and amdkfd drivers to support BACO and
> runtime power management.
>
> Reviewed-by: Oak Zeng <oak.zeng@amd.com>
> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
> Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 12 +++----
> drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h | 8 ++---
> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 4 +--
> drivers/gpu/drm/amd/amdkfd/kfd_device.c | 29 +++++++++-------
> drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 1 +
> drivers/gpu/drm/amd/amdkfd/kfd_process.c | 40 ++++++++++++++++++++--
> 6 files changed, 68 insertions(+), 26 deletions(-)
>
[snip]
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> index 98dcbb96b2e2..6d6c25fe2677 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> @@ -31,6 +31,7 @@
> #include <linux/compat.h>
> #include <linux/mman.h>
> #include <linux/file.h>
> +#include <linux/pm_runtime.h>
> #include "amdgpu_amdkfd.h"
> #include "amdgpu.h"
>
> @@ -527,6 +528,16 @@ static void kfd_process_destroy_pdds(struct kfd_process *p)
> kfree(pdd->qpd.doorbell_bitmap);
> idr_destroy(&pdd->alloc_idr);
>
> + /*
> + * before destroying pdd, make sure to report availability
> + * for auto suspend
> + */
> + if (pdd->runtime_inuse) {
> + pm_runtime_mark_last_busy(pdd->dev->ddev->dev);
> + pm_runtime_put_autosuspend(pdd->dev->ddev->dev);
> + pdd->runtime_inuse = false;
> + }
> +
> kfree(pdd);
> }
> }
> @@ -844,6 +855,7 @@ struct kfd_process_device *kfd_create_process_device_data(struct kfd_dev *dev,
> pdd->process = p;
> pdd->bound = PDD_UNBOUND;
> pdd->already_dequeued = false;
> + pdd->runtime_inuse = false;
> list_add(&pdd->per_device_list, &p->per_device_data);
>
> /* Init idr used for memory handle translation */
> @@ -933,15 +945,39 @@ struct kfd_process_device *kfd_bind_process_to_device(struct kfd_dev *dev,
> return ERR_PTR(-ENOMEM);
> }
>
> + /*
> + * signal runtime-pm system to auto resume and prevent
> + * further runtime suspend once device pdd is created until
> + * pdd is destroyed.
> + */
> + if (!pdd->runtime_inuse) {
> + err = pm_runtime_get_sync(dev->ddev->dev);
> + if (err < 0)
> + return ERR_PTR(err);
> + }
> +
> err = kfd_iommu_bind_process_to_device(pdd);
> if (err)
> - return ERR_PTR(err);
> + goto out;
>
> err = kfd_process_device_init_vm(pdd, NULL);
> if (err)
> - return ERR_PTR(err);
> + goto out;
> +
> + if (!err)

This "if" is also redundant. If there was an error, you already did goto
out. pdd->runtime_inuse should be set whenever we return successfully
from this function, so logically there should be no extra "if".

> + /*
> + * make sure that runtime_usage counter is incremented
> + * just once per pdd
> + */
> + pdd->runtime_inuse = true;
>
> return pdd;
> +
> +out:
> + /* balance runpm reference count and exit with error */

I think you need an "if (!pdd->runtime_inuse)" here. If this function
didn't call pm_runtime_get_sync above, you shouldn't do the cleanup
below. Otherwise you risk getting unbalanced usage counters. In other
words, you need to use the same condition for pm_runtime_get_sync and
the cleanup.

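To illustrate both points, here is a minimal userspace sketch of the
control flow I have in mind. It is not kernel code: struct mock_dev,
struct mock_pdd, mock_get_sync(), mock_put_autosuspend() and
mock_init_step() are hypothetical stand-ins for the device, the pdd, the
pm_runtime_* calls and the two init steps, and the usage counter is a
plain int. The only thing it is meant to show is that the get and the
put are guarded by the same condition, and that the success path sets
runtime_inuse without an extra "if".

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins, not the kernel runtime PM API. */
struct mock_dev { int usage_count; };
struct mock_pdd { bool runtime_inuse; };

/* Stand-in for pm_runtime_get_sync(): takes one usage reference. */
static int mock_get_sync(struct mock_dev *dev)
{
	dev->usage_count++;
	return 0;
}

/* Stand-in for pm_runtime_mark_last_busy() + pm_runtime_put_autosuspend(). */
static void mock_put_autosuspend(struct mock_dev *dev)
{
	dev->usage_count--;
}

/* Stand-in for the iommu bind / VM init steps; fails on request. */
static int mock_init_step(bool fail)
{
	return fail ? -ENOMEM : 0;
}

static int mock_bind(struct mock_dev *dev, struct mock_pdd *pdd, bool fail)
{
	int err;

	/* Take a reference only if this pdd does not hold one yet. */
	if (!pdd->runtime_inuse) {
		err = mock_get_sync(dev);
		if (err < 0)
			return err;
	}

	err = mock_init_step(fail);
	if (err)
		goto out;

	/* Success path: errors already jumped to out, so no "if (!err)". */
	pdd->runtime_inuse = true;
	return 0;

out:
	/* Same condition as for the get above, so the counter stays balanced. */
	if (!pdd->runtime_inuse)
		mock_put_autosuspend(dev);
	return err;
}
```

With this shape, a failure on a pdd that already holds its reference
leaves the counter untouched, and a failure on a fresh pdd drops exactly
the reference that this call took.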

Regards,
  Felix

> + pm_runtime_mark_last_busy(dev->ddev->dev);
> + pm_runtime_put_autosuspend(dev->ddev->dev);
> + return ERR_PTR(err);
> }
>
> struct kfd_process_device *kfd_get_first_process_device_data(