From: "Ceraolo Spurio, Daniele" <daniele.ceraolospurio@intel.com>
To: Alan Previn <alan.previn.teres.alexis@intel.com>,
	<intel-gfx@lists.freedesktop.org>
Cc: Anshuman <anshuman.gupta@intel.com>,
	dri-devel@lists.freedesktop.org, Rodrigo <rodrigo.vivi@intel.com>
Subject: Re: [PATCH] drm/i915/gsc: Fix the Driver-FLR completion
Date: Thu, 23 Feb 2023 15:49:13 -0800
Message-ID: <f15e26d3-fde2-acba-fb2f-2363e8c66d1c@intel.com>
In-Reply-To: <20230222210120.407780-1-alan.previn.teres.alexis@intel.com>



On 2/22/2023 1:01 PM, Alan Previn wrote:
> The Driver-FLR flow may inadvertently exit early before the full
> completion of the re-init of the internal HW state if we only poll
> GU_DEBUG Bit31 (polling for it to toggle from 0 -> 1). Instead
> we need a two-step completion wait-for-completion flow that also
> involves GU_CNTL. See the patch and new code comments for detail.
> This is new direction from HW architecture folks.
>
>     v2: - Add error message for the teardown timeout (Anshuman)
>         - Don't duplicate code in comments (Jani)
>
> Signed-off-by: Alan Previn <alan.previn.teres.alexis@intel.com>
> Fixes: 5a44fcd73498 ("drm/i915/gsc: Do a driver-FLR on unload if GSC was loaded")

I'm not sure we need a Fixes tag, given that this is MTL-specific
code and MTL is still under force probe.

> ---
>   drivers/gpu/drm/i915/intel_uncore.c | 13 ++++++++++++-
>   1 file changed, 12 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
> index f018da7ebaac..f3c46352db89 100644
> --- a/drivers/gpu/drm/i915/intel_uncore.c
> +++ b/drivers/gpu/drm/i915/intel_uncore.c
> @@ -2749,14 +2749,25 @@ static void driver_initiated_flr(struct intel_uncore *uncore)
>   	/* Trigger the actual Driver-FLR */
>   	intel_uncore_rmw_fw(uncore, GU_CNTL, 0, DRIVERFLR);
>   
> +	/* Wait for hardware teardown to complete */
> +	ret = intel_wait_for_register_fw(uncore, GU_CNTL,
> +					 DRIVERFLR_STATUS, 0,

Shouldn't this bit be DRIVERFLR instead of DRIVERFLR_STATUS? I know
they're both BIT(31), but DRIVERFLR_STATUS is the definition for the
GU_DEBUG bit, while this wait is on GU_CNTL.

> +					 flr_timeout_ms);
> +	if (ret) {
> +		drm_err(&i915->drm, "Driver-FLR-teardown wait completion failed! %d\n", ret);
> +		return;
> +	}
> +
> +	/* Wait for hardware/firmware re-init to complete */
>   	ret = intel_wait_for_register_fw(uncore, GU_DEBUG,
>   					 DRIVERFLR_STATUS, DRIVERFLR_STATUS,
>   					 flr_timeout_ms);

I was wondering if we could reduce the timeout here to avoid two waits
of 3 seconds each, as the 3 seconds should cover the full process.
However, the specs don't say how long each step can take, so I agree
that to be safe it is better to keep both timeouts at 3 seconds. If the
FLR fails the HW is toast anyway, so waiting a few more seconds to
detect that on driver unload is not going to have any consequences that
we wouldn't already have.
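
For illustration only (this is not part of the patch), the two-step
flow can be modelled in standalone C. Everything below is a sketch
under assumptions: struct fake_hw, fake_hw_tick(), wait_for_reg() and
fake_driver_flr() are hypothetical names, the "hardware" is a software
model that advances once per poll, and a poll count stands in for the
millisecond timeout. The return values -1 and -2 are arbitrary markers
for which step timed out; the real helper returns a kernel error code.

```c
#include <assert.h>
#include <stdint.h>

#define BIT31            (UINT32_C(1) << 31)
#define DRIVERFLR        BIT31 /* GU_CNTL: set to trigger, cleared once teardown is done */
#define DRIVERFLR_STATUS BIT31 /* GU_DEBUG: set once re-init is done (sticky) */

/* Hypothetical software model of the two registers. */
struct fake_hw {
	uint32_t gu_cntl;
	uint32_t gu_debug;
	int teardown_polls; /* polls needed before GU_CNTL bit 31 clears */
	int reinit_polls;   /* further polls before GU_DEBUG bit 31 sets */
};

/* Advance the model by one step; called once per failed poll. */
static void fake_hw_tick(struct fake_hw *hw)
{
	if (hw->gu_cntl & DRIVERFLR) {
		if (hw->teardown_polls == 0)
			hw->gu_cntl &= ~DRIVERFLR;
		else
			hw->teardown_polls--;
	} else if (!(hw->gu_debug & DRIVERFLR_STATUS)) {
		if (hw->reinit_polls == 0)
			hw->gu_debug |= DRIVERFLR_STATUS;
		else
			hw->reinit_polls--;
	}
}

/* Poll until (*reg & mask) == value; 0 on success, -1 on timeout. */
static int wait_for_reg(struct fake_hw *hw, const uint32_t *reg,
			uint32_t mask, uint32_t value, int max_polls)
{
	assert(max_polls > 0);

	for (int i = 0; i < max_polls; i++) {
		if ((*reg & mask) == value)
			return 0;
		fake_hw_tick(hw);
	}
	return ((*reg & mask) == value) ? 0 : -1;
}

/* Two-step completion: 0 on success, -1 teardown timeout, -2 re-init timeout. */
static int fake_driver_flr(struct fake_hw *hw, int max_polls)
{
	/* Trigger the FLR. */
	hw->gu_cntl |= DRIVERFLR;

	/* Step 1: wait on GU_CNTL, so use the GU_CNTL bit name. */
	if (wait_for_reg(hw, &hw->gu_cntl, DRIVERFLR, 0, max_polls))
		return -1;

	/* Step 2: wait on GU_DEBUG for the sticky completion bit. */
	if (wait_for_reg(hw, &hw->gu_debug, DRIVERFLR_STATUS,
			 DRIVERFLR_STATUS, max_polls))
		return -2;

	/* Clear the sticky status (modelled here as simply clearing the bit). */
	hw->gu_debug &= ~DRIVERFLR_STATUS;
	return 0;
}
```

Each step gets the full budget, so in this model a failing FLR costs at
most two budgets before it is reported, which matches the trade-off
described above.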

With the bit in the wait above fixed:
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>

Daniele

>   	if (ret) {
> -		drm_err(&i915->drm, "wait for Driver-FLR completion failed! %d\n", ret);
> +		drm_err(&i915->drm, "Driver-FLR-reinit wait completion failed! %d\n", ret);
>   		return;
>   	}
>   
> +	/* Clear sticky completion status */
>   	intel_uncore_write_fw(uncore, GU_DEBUG, DRIVERFLR_STATUS);
>   }
>   
