From: "Dixit, Ashutosh" <ashutosh.dixit@intel.com>
To: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>,
	intel-gfx@lists.freedesktop.org,
	John Harrison <john.c.harrison@intel.com>,
	dri-devel@lists.freedesktop.org
Subject: Re: [PATCH] drm/i915/guc/slpc: Use non-blocking H2G for waitboost
Date: Tue, 21 Jun 2022 17:26:29 -0700	[thread overview]
Message-ID: <87pmj11r2i.wl-ashutosh.dixit@intel.com> (raw)
In-Reply-To: <20220515060506.22084-1-vinay.belgaumkar@intel.com>

On Sat, 14 May 2022 23:05:06 -0700, Vinay Belgaumkar wrote:
>
> SLPC min/max frequency updates require H2G calls. We are seeing
> timeouts when the GuC channel is backed up and it is unable to respond
> in a timely fashion, causing warnings and affecting CI.
>
> This is seen when waitboosting happens during a stress test.
> This patch updates the waitboost path to use a non-blocking
> H2G call instead, which returns as soon as the message is
> successfully transmitted.

Overall I am ok with moving waitboost to use the non-blocking H2G.
Increasing the timeout in wait_for_ct_request_update() for the blocking
cases can be treated as a separate issue and handled separately.

Still, there are a couple of issues with this patch, mentioned below.

> v2: Use drm_notice to report any errors that might occur while
> sending the waitboost H2G request (Tvrtko)
>
> Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
> ---
>  drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c | 44 +++++++++++++++++----
>  1 file changed, 36 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c
> index 1db833da42df..e5e869c96262 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c
> @@ -98,6 +98,30 @@ static u32 slpc_get_state(struct intel_guc_slpc *slpc)
>	return data->header.global_state;
>  }
>
> +static int guc_action_slpc_set_param_nb(struct intel_guc *guc, u8 id, u32 value)
> +{
> +	u32 request[] = {
> +		GUC_ACTION_HOST2GUC_PC_SLPC_REQUEST,
> +		SLPC_EVENT(SLPC_EVENT_PARAMETER_SET, 2),
> +		id,
> +		value,
> +	};
> +	int ret;
> +
> +	ret = intel_guc_send_nb(guc, request, ARRAY_SIZE(request), 0);
> +
> +	return ret > 0 ? -EPROTO : ret;
> +}
> +
> +static int slpc_set_param_nb(struct intel_guc_slpc *slpc, u8 id, u32 value)
> +{
> +	struct intel_guc *guc = slpc_to_guc(slpc);
> +
> +	GEM_BUG_ON(id >= SLPC_MAX_PARAM);
> +
> +	return guc_action_slpc_set_param_nb(guc, id, value);
> +}
> +
>  static int guc_action_slpc_set_param(struct intel_guc *guc, u8 id, u32 value)
>  {
>	u32 request[] = {
> @@ -208,12 +232,10 @@ static int slpc_force_min_freq(struct intel_guc_slpc *slpc, u32 freq)
>	 */
>
>	with_intel_runtime_pm(&i915->runtime_pm, wakeref) {
> -		ret = slpc_set_param(slpc,
> -				     SLPC_PARAM_GLOBAL_MIN_GT_UNSLICE_FREQ_MHZ,
> -				     freq);
> -		if (ret)
> -			i915_probe_error(i915, "Unable to force min freq to %u: %d",
> -					 freq, ret);
> +		/* Non-blocking request will avoid stalls */
> +		ret = slpc_set_param_nb(slpc,
> +					SLPC_PARAM_GLOBAL_MIN_GT_UNSLICE_FREQ_MHZ,
> +					freq);
>	}
>
>	return ret;
> @@ -222,6 +244,8 @@ static int slpc_force_min_freq(struct intel_guc_slpc *slpc, u32 freq)
>  static void slpc_boost_work(struct work_struct *work)
>  {
>	struct intel_guc_slpc *slpc = container_of(work, typeof(*slpc), boost_work);
> +	struct drm_i915_private *i915 = slpc_to_i915(slpc);
> +	int err;
>
>	/*
>	 * Raise min freq to boost. It's possible that
> @@ -231,8 +255,12 @@ static void slpc_boost_work(struct work_struct *work)
>	 */
>	mutex_lock(&slpc->lock);
>	if (atomic_read(&slpc->num_waiters)) {
> -		slpc_force_min_freq(slpc, slpc->boost_freq);
> -		slpc->num_boosts++;
> +		err = slpc_force_min_freq(slpc, slpc->boost_freq);
> +		if (!err)
> +			slpc->num_boosts++;
> +		else
> +			drm_notice(&i915->drm, "Failed to send waitboost request (%d)\n",
> +				   err);

The issue I have is what happens when we de-boost (restore min freq to its
previous value in intel_guc_slpc_dec_waiters()). That call seems fairly
important for getting the min freq back down when there are no pending
requests. So what do we do if that non-blocking send fails?

This is the function:

void intel_guc_slpc_dec_waiters(struct intel_guc_slpc *slpc)
{
        mutex_lock(&slpc->lock);
        if (atomic_dec_and_test(&slpc->num_waiters))
                slpc_force_min_freq(slpc, slpc->min_freq_softlimit);
        mutex_unlock(&slpc->lock);
}


1. First, it would seem that at a minimum we need a similar drm_notice()
   in intel_guc_slpc_dec_waiters(). That would mean putting the
   drm_notice() back in slpc_force_min_freq() (replacing
   i915_probe_error()) rather than in slpc_boost_work() above? (See the
   sketch after this list.)

2. Further, if de-boosting is important then maybe as was being discussed
   in v1 of this patch (see the bottom of
   https://patchwork.freedesktop.org/patch/485004/?series=103598&rev=1) do
   we need to use intel_guc_send_busy_loop() in the
   intel_guc_slpc_dec_waiters() code path?
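
As a rough sketch of 1. (not from the patch, just to illustrate the idea;
any other checks at the top of the existing function are elided), moving
the error report into slpc_force_min_freq() itself would cover both
callers:

static int slpc_force_min_freq(struct intel_guc_slpc *slpc, u32 freq)
{
	struct drm_i915_private *i915 = slpc_to_i915(slpc);
	intel_wakeref_t wakeref;
	int ret = 0;

	lockdep_assert_held(&slpc->lock);	/* both callers hold slpc->lock */

	with_intel_runtime_pm(&i915->runtime_pm, wakeref) {
		/* Non-blocking request will avoid stalls */
		ret = slpc_set_param_nb(slpc,
					SLPC_PARAM_GLOBAL_MIN_GT_UNSLICE_FREQ_MHZ,
					freq);
		if (ret)
			drm_notice(&i915->drm,
				   "Failed to set min freq to %u: %d\n",
				   freq, ret);
	}

	return ret;
}

With that, slpc_boost_work() and intel_guc_slpc_dec_waiters() both report
failures without duplicating the drm_notice().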

At least we need to do 1. But for 2., we might as well just put
intel_guc_send_busy_loop() in guc_action_slpc_set_param_nb()? In both cases
(boost and de-boost) intel_guc_send_busy_loop() would be called from a work
item, so it looks doable (just as we were previously making the blocking
call from these two places). Thoughts?
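
Just to make 2. concrete, a minimal sketch (assuming the existing
intel_guc_send_busy_loop() helper with its (guc, action, len, g2h_len_dw,
loop) arguments; not tested):

static int guc_action_slpc_set_param_nb(struct intel_guc *guc, u8 id, u32 value)
{
	u32 request[] = {
		GUC_ACTION_HOST2GUC_PC_SLPC_REQUEST,
		SLPC_EVENT(SLPC_EVENT_PARAMETER_SET, 2),
		id,
		value,
	};
	int ret;

	/*
	 * No G2H response is expected (g2h_len_dw = 0). loop = true keeps
	 * retrying on -EBUSY until the CT buffer accepts the message, which
	 * should be fine since, as noted above, both the boost and de-boost
	 * paths can afford to wait here.
	 */
	ret = intel_guc_send_busy_loop(guc, request, ARRAY_SIZE(request), 0, true);

	return ret > 0 ? -EPROTO : ret;
}

That would keep the set_param request from being silently dropped when the
CT buffer is full, while still avoiding a blocking wait for the GuC
response.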

Thanks.
--
Ashutosh
