From: Daniel Vetter <daniel@ffwll.ch>
To: Matthew Brost <matthew.brost@intel.com>
Cc: daniel.vetter@intel.com, intel-gfx@lists.freedesktop.org,
	dri-devel@lists.freedesktop.org,
	Michal Wajdeczko <michal.wajdeczko@intel.com>
Subject: Re: [Intel-gfx] [PATCH 15/20] drm/i915/guc: Ensure H2G buffer updates visible before tail update
Date: Fri, 4 Jun 2021 10:39:47 +0200
Message-ID: <YLnm00d5gAO7/WmZ@phenom.ffwll.local>
In-Reply-To: <20210603161014.GA620@sdutt-i7>

On Thu, Jun 03, 2021 at 09:10:14AM -0700, Matthew Brost wrote:
> On Thu, Jun 03, 2021 at 11:44:57AM +0200, Michal Wajdeczko wrote:
> > 
> > 
> > On 03.06.2021 07:16, Matthew Brost wrote:
> > > Ensure H2G buffer updates are visible before descriptor tail updates by
> > > inserting a barrier between the H2G buffer update and the tail. The
> > > > barrier is a simple wmb() for SMEM and a register write for LMEM. This is
> > > > needed if more than one H2G can be in flight at once.
> > > 
> > > > If this barrier is not inserted, it is possible that the descriptor tail
> > > > update is seen by the GuC before the H2G buffer update, which results in the
> > > > GuC reading a corrupt H2G value. This can bring down the H2G channel,
> > > > among other bad things.
> > > 
> > > Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> > > Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
> > > Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
> > > ---
> > >  drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 28 +++++++++++++++++++++++
> > >  1 file changed, 28 insertions(+)
> > > 
> > > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
> > > index 80976fe40fbf..31f83956bfc3 100644
> > > --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
> > > +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
> > > @@ -328,6 +328,28 @@ static u32 ct_get_next_fence(struct intel_guc_ct *ct)
> > >  	return ++ct->requests.last_fence;
> > >  }
> > >  
> > > +static void write_barrier(struct intel_guc_ct *ct)
> > > +{
> > > +	struct intel_guc *guc = ct_to_guc(ct);
> > > +	struct intel_gt *gt = guc_to_gt(guc);
> > > +
> > > +	if (i915_gem_object_is_lmem(guc->ct.vma->obj)) {
> > > +		GEM_BUG_ON(guc->send_regs.fw_domains);
> > > +		/*
> > > +		 * This register is used by the i915 and GuC for MMIO based
> > > +		 * communication. Once we are in this code CTBs are the only
> > > +		 * method the i915 uses to communicate with the GuC so it is
> > > +		 * safe to write to this register (a value of 0 is NOP for MMIO
> > > +		 * communication). If we ever start mixing CTBs and MMIOs a new
> > > +		 * register will have to be chosen.
> > > +		 */
> > > +		intel_uncore_write_fw(gt->uncore, GEN11_SOFT_SCRATCH(0), 0);
> > 
> > > can't we at least start with a SOFT_SCRATCH register that is not used for
> > > GuC MMIO based communication on Gen12 LMEM platforms? see [1]
> > 
> 
> > We likely can use this but I really don't feel comfortable switching the
> > register without some more testing first (e.g. let's change this in
> > internal, let it soak for a bit, then make the change upstream).
> 
> > I really don't feel comfortable that we are touching a register that
> > elsewhere is protected with the mutex. And mixing CTBs and MMIO is not
> > far away.
> >
> 
> The only code that mixes CTBs and MMIOs is SRIOV which is a ways away
> from landing.

Maybe add a FIXME note as part of the SRIOV patch stack in internal to
track this?
-Daniel

> 
> Matt
>  
> > Michal
> > 
> > [1]
> > https://lore.kernel.org/intel-gfx/51b9bd05-7d6f-29f1-de0f-3a14bade6c9c@intel.com/
> > 
> > > +	} else {
> > > +		/* wmb() sufficient for a barrier if in smem */
> > > +		wmb();
> > > +	}
> > > +}
> > > +
> > >  /**
> > >   * DOC: CTB Host to GuC request
> > >   *
> > > @@ -411,6 +433,12 @@ static int ct_write(struct intel_guc_ct *ct,
> > >  	}
> > >  	GEM_BUG_ON(tail > size);
> > >  
> > > +	/*
> > > > +	 * make sure H2G buffer update and LRC tail update (if this is triggering a
> > > > +	 * submission) are visible before updating the descriptor tail
> > > +	 */
> > > +	write_barrier(ct);
> > > +
> > >  	/* now update desc tail (back in bytes) */
> > >  	desc->tail = tail * 4;
> > >  	return 0;
> > > 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


