All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Chang, Yu bruce" <yu.bruce.chang@intel.com>
To: "Brost, Matthew" <matthew.brost@intel.com>
Cc: "De Marchi, Lucas" <lucas.demarchi@intel.com>,
	"intel-xe@lists.freedesktop.org" <intel-xe@lists.freedesktop.org>
Subject: Re: [Intel-xe] [PATCH v2] drm/xe: Use fast virtual copy engine for migrate engine on PVC
Date: Fri, 24 Mar 2023 18:12:00 +0000	[thread overview]
Message-ID: <CY8PR11MB6940EB1DADE0C4E4583D62E8C3849@CY8PR11MB6940.namprd11.prod.outlook.com> (raw)
In-Reply-To: <ZB3K0PjzD1LWD7u0@DUT025-TGLU.fm.intel.com>



> -----Original Message-----
> From: Brost, Matthew <matthew.brost@intel.com>
> Sent: Friday, March 24, 2023 9:08 AM
> To: Chang, Yu bruce <yu.bruce.chang@intel.com>
> Cc: De Marchi, Lucas <lucas.demarchi@intel.com>; intel-
> xe@lists.freedesktop.org
> Subject: Re: [Intel-xe] [PATCH v2] drm/xe: Use fast virtual copy engine for
> migrate engine on PVC
> 
> On Fri, Mar 24, 2023 at 09:29:10AM -0600, Chang, Yu bruce wrote:
> >
> >
> > > -----Original Message-----
> > > From: Brost, Matthew <matthew.brost@intel.com>
> > > Sent: Thursday, March 23, 2023 11:59 PM
> > > To: De Marchi, Lucas <lucas.demarchi@intel.com>
> > > Cc: intel-xe@lists.freedesktop.org; Chang, Yu bruce
> > > <yu.bruce.chang@intel.com>
> > > Subject: Re: [Intel-xe] [PATCH v2] drm/xe: Use fast virtual copy
> > > engine for migrate engine on PVC
> > >
> > > On Thu, Mar 23, 2023 at 09:53:11PM -0700, Lucas De Marchi wrote:
> > > > On Thu, Mar 23, 2023 at 06:23:29PM -0700, Matthew Brost wrote:
> > > > > Some copy hardware engine instances are faster than others on
> > > > > PVC, use a virtual engine of these plus the reserved instance
> > > > > for the migrate engine on PVC. The idea being if a fast instance
> > > > > is available it will be used and the throughput of kernel
> > > > > copies, clears, and pagefault servicing will be higher.
> > > >
> > > > how faster and/or why?  If it was related to being link copy
> > > > engine vs main copy engine it was very understandable as the
> > > > commands available are different and optimized for certain usages.
> > > > However below you are setting to the odd link copy engines + the
> > > > main copy engine
> > > > + whatever was reserved for USM.
> > > >
> > > > Without a proper reason here or numbers or spec, it's hard to
> > > > judge where this is coming from and understand in future.
> > > >
> > >
> > > Your right, probably need to get a spec reference or something to justify
> this.
> > > I came up with this bit mask from IM conversation with Bruce, maybe
> > > he can point me to the spec. Also I looked at the i915 code for this
> > > and it is just BCS0
> > > | reserved BCS so definitely need to dig into what is the ideal mask.
> > >
> > Please find the detailed information from the i915 patch below:
> >
> > INTEL_DII: drm/i915/pvc: Force even num engines to use 64B
> >
> >     On PVC observed gt_fatal_7 as arbiter is out of credits while
> >     running Molten Concurrency stress+ 2 HPLs + ProcHot + Warm Idle
> >     + Solar DVFS + ASPM + Link Width Change.
> >
> >     Its root caused to HW bug and SW workaround proposed to use
> >     all even instance engines to do 64B transfer while using
> >     system memory.
> >
> >     So this change implements below scenario :
> >     ------------------------------------------------------------
> >     L7  |  L6  |  L5  |  L4  |  L3  |  L2  |  L1  |  L0  |  Main
> >
> >     8      7      6      5      4      3      2      1
> >
> >     64B    256B   64B    256B   64B    256B   64B    256B   64B
> >     -------------------------------------------------------------
> >
> > Bug-id: 16017236439
> >
> > The 64B will limit the transfer BW. The main copy engine has several
> > backend, So it may not be impacted much, but other link copy engine
> > such as the reserved
> > bcs8 will slow down to possible ~20% for host transfer.
> >
> 
> Hmm, we don't have this WA, indirect BB, WA pages in Xe? I have no idea
> what this is for but for the moment this patch isn't needed. If we pull in this
> WA, then I see why we need this.
> 
> Matt
> 

PVC will need this WA, as long as there is concurrent read and write from smem
for bcs.

-Bruce

> > -Bruce
> > > >
> > > > >
> > > > > v2: Include local change of correct mask for fast instances
> > > > >
> > > > > Cc: Bruce Chang <yu.bruce.chang@intel.com>
> > > > > Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> > > > > ---
> > > > > drivers/gpu/drm/xe/xe_engine.h    |  2 ++
> > > > > drivers/gpu/drm/xe/xe_hw_engine.c | 20 ++++++++++++++++++++
> > > > > drivers/gpu/drm/xe/xe_migrate.c   |  7 ++++---
> > > > > 3 files changed, 26 insertions(+), 3 deletions(-)
> > > > >
> > > > > diff --git a/drivers/gpu/drm/xe/xe_engine.h
> > > > > b/drivers/gpu/drm/xe/xe_engine.h index
> > > > > 1cf7f23c4afd..0a9c35ea3d34
> > > > > 100644
> > > > > --- a/drivers/gpu/drm/xe/xe_engine.h
> > > > > +++ b/drivers/gpu/drm/xe/xe_engine.h
> > > > > @@ -26,6 +26,8 @@ void xe_engine_destroy(struct kref *ref);
> > > > >
> > > > > struct xe_engine *xe_engine_lookup(struct xe_file *xef, u32 id);
> > > > >
> > > > > +u32 xe_hw_engine_fast_copy_logical_mask(struct xe_gt *gt);
> > > > > +
> > > > > static inline struct xe_engine *xe_engine_get(struct xe_engine
> > > > > *engine) {
> > > > > 	kref_get(&engine->refcount);
> > > > > diff --git a/drivers/gpu/drm/xe/xe_hw_engine.c
> > > > > b/drivers/gpu/drm/xe/xe_hw_engine.c
> > > > > index 63a4efd5edcc..d2b43b189b14 100644
> > > > > --- a/drivers/gpu/drm/xe/xe_hw_engine.c
> > > > > +++ b/drivers/gpu/drm/xe/xe_hw_engine.c
> > > > > @@ -600,3 +600,23 @@ bool xe_hw_engine_is_reserved(struct
> > > xe_hw_engine *hwe)
> > > > > 	return xe->info.supports_usm && hwe->class ==
> > > XE_ENGINE_CLASS_COPY &&
> > > > > 		hwe->instance == gt->usm.reserved_bcs_instance; }
> > > > > +
> > > > > +u32 xe_hw_engine_fast_copy_logical_mask(struct xe_gt *gt)
> > > >
> > > > this deserves its kernel-doc, probably with similar info asked for
> > > > in the commit message.
> > > >
> > >
> > > I thought I added kernel DoC but apartently forgot. Will fix in next rev.
> > >
> > > Matt
> > >
> > > > Lucas De Marchi
> > > >
> > > > > +{
> > > > > +	struct xe_device *xe = gt_to_xe(gt);
> > > > > +	struct xe_hw_engine *hwe;
> > > > > +	const u32 fast_physical_mask = 0xab;	/* 0, 1, 3, 5, 7 */
> > > > > +	u32 fast_logical_mask = 0;
> > > > > +	enum xe_hw_engine_id id;
> > > > > +
> > > > > +	/* XXX: We only support this function on PVC for now */
> > > > > +	XE_BUG_ON(!(xe->info.platform == XE_PVC));
> > > > > +
> > > > > +	for_each_hw_engine(hwe, gt, id) {
> > > > > +		if ((fast_physical_mask | gt-
> >usm.reserved_bcs_instance) &
> > > > > +		    BIT(hwe->instance))
> > > > > +			fast_logical_mask |= hwe->logical_instance;
> > > > > +	}
> > > > > +
> > > > > +	return fast_logical_mask;
> > > > > +}
> > > > > diff --git a/drivers/gpu/drm/xe/xe_migrate.c
> > > > > b/drivers/gpu/drm/xe/xe_migrate.c index
> > > > > 11c8af9c6c92..4a7fec5d619d
> > > > > 100644
> > > > > --- a/drivers/gpu/drm/xe/xe_migrate.c
> > > > > +++ b/drivers/gpu/drm/xe/xe_migrate.c
> > > > > @@ -345,11 +345,12 @@ struct xe_migrate *xe_migrate_init(struct
> > > xe_gt *gt)
> > > > >
> > > XE_ENGINE_CLASS_COPY,
> > > > > 							   gt-
> > > >usm.reserved_bcs_instance,
> > > > > 							   false);
> > > > > -		if (!hwe)
> > > > > +		u32 logical_mask =
> > > xe_hw_engine_fast_copy_logical_mask(gt);
> > > > > +
> > > > > +		if (!hwe || !logical_mask)
> > > > > 			return ERR_PTR(-EINVAL);
> > > > >
> > > > > -		m->eng = xe_engine_create(xe, vm,
> > > > > -					  BIT(hwe->logical_instance),
> 1,
> > > > > +		m->eng = xe_engine_create(xe, vm, logical_mask, 1,
> > > > > 					  hwe, ENGINE_FLAG_KERNEL);
> > > > > 	} else {
> > > > > 		m->eng = xe_engine_create_class(xe, gt, vm,
> > > > > --
> > > > > 2.34.1
> > > > >

  reply	other threads:[~2023-03-24 18:12 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-03-24  1:23 [Intel-xe] [PATCH v2] drm/xe: Use fast virtual copy engine for migrate engine on PVC Matthew Brost
2023-03-24  1:52 ` [Intel-xe] ✓ CI.Patch_applied: success for drm/xe: Use fast virtual copy engine for migrate engine on PVC (rev2) Patchwork
2023-03-24  1:53 ` [Intel-xe] ✓ CI.KUnit: " Patchwork
2023-03-24  1:57 ` [Intel-xe] ✓ CI.Build: " Patchwork
2023-03-24  2:19 ` [Intel-xe] ○ CI.BAT: info " Patchwork
2023-03-24  4:53 ` [Intel-xe] [PATCH v2] drm/xe: Use fast virtual copy engine for migrate engine on PVC Lucas De Marchi
2023-03-24  6:59   ` Matthew Brost
2023-03-24 15:29     ` Chang, Yu bruce
2023-03-24 16:07       ` Matthew Brost
2023-03-24 18:12         ` Chang, Yu bruce [this message]
2023-03-24  6:42 ` Mauro Carvalho Chehab
2023-03-24  7:02   ` Matthew Brost

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CY8PR11MB6940EB1DADE0C4E4583D62E8C3849@CY8PR11MB6940.namprd11.prod.outlook.com \
    --to=yu.bruce.chang@intel.com \
    --cc=intel-xe@lists.freedesktop.org \
    --cc=lucas.demarchi@intel.com \
    --cc=matthew.brost@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.