Re: [PATCH v5 10/25] drm/i915/guc: Copy whole golden context, set engine state size of subset

From: John Harrison <john.c.harrison@intel.com>
To: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>,
	<intel-gfx@lists.freedesktop.org>
Cc: <dri-devel@lists.freedesktop.org>, <matthew.brost@intel.com>
Subject: Re: [PATCH v5 10/25] drm/i915/guc: Copy whole golden context, set engine state size of subset
Date: Thu, 2 Sep 2021 12:25:14 -0700	[thread overview]
Message-ID: <ff1a37b2-0275-9f2c-763b-0ef818c29c48@intel.com> (raw)
In-Reply-To: <20210902005022.711767-11-daniele.ceraolospurio@intel.com>

On 9/1/2021 17:50, Daniele Ceraolo Spurio wrote:
> From: Matthew Brost <matthew.brost@intel.com>
>
> When the GuC does a media reset, it copies a golden context state back
> into the corrupted context's state. The address of the golden context
> and the size of the engine state restore are passed in via the GuC ADS.
> The i915 had a bug where it passed in the whole size of the golden
> context, not the size of the engine state to restore resulting in a
> memory corruption.
>
> Also copy the entire golden context on init rather than just the engine
> state that is restored.
>
> v2 (Daniele): use defines to avoid duplicated const variables (John).
>
> Fixes: 481d458caede ("drm/i915/guc: Add golden context to GuC ADS")
> Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> Cc: John Harrison <John.C.Harrison@Intel.com>
> ---
>   drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c | 26 ++++++++++++++--------
>   1 file changed, 17 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
> index 6926919bcac6..2c6ea64af7ec 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
> @@ -349,6 +349,8 @@ static void fill_engine_enable_masks(struct intel_gt *gt,
>   	info->engine_enabled_masks[GUC_VIDEOENHANCE_CLASS] = VEBOX_MASK(gt);
>   }
>   
> +#define LR_HW_CONTEXT_SIZE (80 * sizeof(u32))
> +#define LRC_SKIP_SIZE (LRC_PPHWSP_SZ * PAGE_SIZE + LR_HW_CONTEXT_SIZE)
Personally, I would have preferred to turn it into a function. 
Especially as it gets more complex with the later platforms that are now 
being pushed upstream. Not a blocker though.

Reviewed-by: John Harrison <John.C.Harrison@Intel.com>

>   static int guc_prep_golden_context(struct intel_guc *guc,
>   				   struct __guc_ads_blob *blob)
>   {
> @@ -396,7 +398,18 @@ static int guc_prep_golden_context(struct intel_guc *guc,
>   		if (!blob)
>   			continue;
>   
> -		blob->ads.eng_state_size[guc_class] = real_size;
> +		/*
> +		 * This interface is slightly confusing. We need to pass the
> +		 * base address of the full golden context and the size of just
> +		 * the engine state, which is the section of the context image
> +		 * that starts after the execlists context. This is required to
> +		 * allow the GuC to restore just the engine state when a
> +		 * watchdog reset occurs.
> +		 * We calculate the engine state size by removing the size of
> +		 * what comes before it in the context image (which is identical
> +		 * on all engines).
> +		 */
> +		blob->ads.eng_state_size[guc_class] = real_size - LRC_SKIP_SIZE;
>   		blob->ads.golden_context_lrca[guc_class] = addr_ggtt;
>   		addr_ggtt += alloc_size;
>   	}
> @@ -436,11 +449,6 @@ static void guc_init_golden_context(struct intel_guc *guc)
>   	u8 engine_class, guc_class;
>   	u8 *ptr;
>   
> -	/* Skip execlist and PPGTT registers + HWSP */
> -	const u32 lr_hw_context_size = 80 * sizeof(u32);
> -	const u32 skip_size = LRC_PPHWSP_SZ * PAGE_SIZE +
> -		lr_hw_context_size;
> -
>   	if (!intel_uc_uses_guc_submission(&gt->uc))
>   		return;
>   
> @@ -476,12 +484,12 @@ static void guc_init_golden_context(struct intel_guc *guc)
>   			continue;
>   		}
>   
> -		GEM_BUG_ON(blob->ads.eng_state_size[guc_class] != real_size);
> +		GEM_BUG_ON(blob->ads.eng_state_size[guc_class] !=
> +			   real_size - LRC_SKIP_SIZE);
>   		GEM_BUG_ON(blob->ads.golden_context_lrca[guc_class] != addr_ggtt);
>   		addr_ggtt += alloc_size;
>   
> -		shmem_read(engine->default_state, skip_size, ptr + skip_size,
> -			   real_size - skip_size);
> +		shmem_read(engine->default_state, 0, ptr, real_size);
>   		ptr += alloc_size;
>   	}
>