From: Mauro Carvalho Chehab <mauro.chehab@linux.intel.com>
To: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: stable@vger.kernel.org,
	"Thomas Hellström" <thomas.hellstrom@linux.intel.com>,
	linux-media@vger.kernel.org, "David Airlie" <airlied@linux.ie>,
	intel-gfx@lists.freedesktop.org,
	"Lucas De Marchi" <lucas.demarchi@intel.com>,
	linux-kernel@vger.kernel.org, dri-devel@lists.freedesktop.org,
	"Christian König" <christian.koenig@amd.com>,
	linaro-mm-sig@lists.linaro.org,
	"Chris Wilson" <chris.p.wilson@intel.com>,
	"Rodrigo Vivi" <rodrigo.vivi@intel.com>,
	"Dave Airlie" <airlied@redhat.com>,
	"Tomas Winkler" <tomas.winkler@intel.com>,
	"Mauro Carvalho Chehab" <mchehab@kernel.org>,
	"Sumit Semwal" <sumit.semwal@linaro.org>,
	"Matthew Auld" <matthew.auld@intel.com>
Subject: Re: [Intel-gfx] [PATCH v2 06/21] drm/i915/gt: Batch TLB invalidations
Date: Thu, 28 Jul 2022 09:26:18 +0200
Message-ID: <20220728092618.050969fc@maurocar-mobl2>
In-Reply-To: <20220728083232.352f80cf@maurocar-mobl2>

On Thu, 28 Jul 2022 08:32:32 +0200
Mauro Carvalho Chehab <mauro.chehab@linux.intel.com> wrote:

> On Wed, 27 Jul 2022 13:56:50 +0100
> Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> wrote:
> 
> > > Because vma_invalidate_tlb() basically stores a TLB seqno, but the
> > > actual invalidation is deferred to when the pages are unset, at
> > > __i915_gem_object_unset_pages().
> > > 
> > > So, what happens is:
> > > 
> > > - in VMA sync mode, the need to invalidate the TLB is marked at
> > >    __vma_put_pages(), before VMA unbind;
> > > - in async mode, this is deferred to ppgtt_unbind_vma(), which
> > >    marks the need to invalidate TLBs.
> > > 
> > > In both cases, __i915_gem_object_unset_pages() is called later,
> > > when the driver is ready to unmap the pages.
> > 
> > Sorry, it is still not clear to me why the patch is moving the marking of
> > the need to invalidate (regardless of whether it is a bit like today, or a
> > seqno like in this patch) from bind to unbind?
> > 
> > What if the seqno were stored in i915_vma_bind, where the bit is set
> > today, so that all the hunks which touch unbind and evict would
> > disappear from the patch? What wouldn't work in that case, if anything?
> 
> Ah, now I see your point.
> 
> I can't see any sense in having a sequence number at VMA bind time, as the
> unbind order can be different, and whether a full TLB invalidation is
> needed or not depends on the unbind order.
> 
> The way the current algorithm works is that drm_i915_gem_object instances
> can be created in any order and, at unbind/evict time, each receives a seqno.
> 
> The seqno is incremented at intel_gt_invalidate_tlb():
> 
>     void intel_gt_invalidate_tlb(struct intel_gt *gt, u32 seqno)
>     {
> 	with_intel_gt_pm_if_awake(gt, wakeref) {
> 		mutex_lock(&gt->tlb.invalidate_lock);
> 		if (tlb_seqno_passed(gt, seqno))
> 			goto unlock;
> 
> 		mmio_invalidate_full(gt);
> 
> 		write_seqcount_invalidate(&gt->tlb.seqno);	// increment seqno
> 		
> 
> So, let's say 3 objects were created, in this order:
> 
> 	obj1
> 	obj2
> 	obj3
> 
> They may be unbound/evicted in a different order. At that time,
> mm.tlb will be stamped with a seqno, using the number from the
> last TLB flush, plus 1.
> 
> As different threads can be used to handle TLB flushes, let's imagine
> two threads (just for the sake of having an example). In such a case,
> what we would have is:
> 
> seqno		Thread 0			Thread 1
> 
> seqno=2		unbind/evict event
> 		obj3.mm.tlb = seqno | 1
> seqno=2		unbind/evict event
> 		obj1.mm.tlb = seqno | 1
> 						__i915_gem_object_unset_pages() 
> 						called for obj3, TLB flush happened,
> 						invalidating both obj1 and obj3.
> 						seqno += 2					
> seqno=4		unbind/evict event
> 		obj1.mm.tlb = seqno | 1

Cut-and-paste typo; it should instead be:

 		obj2.mm.tlb = seqno | 1


> 						__i915_gem_object_unset_pages()
> 						called for obj1, don't flush.
> ...
> 						__i915_gem_object_unset_pages() called for obj2, TLB flush happened
> 						seqno += 2
> seqno=6
> 
> So, basically, the seqno is used to track when the object's data stopped
> being updated because of an unbind/evict event. It is later used by
> intel_gt_invalidate_tlb(), when called from __i915_gem_object_unset_pages(),
> to check whether a previous invalidation call was enough to cover the
> object, or whether a new call is needed.
> 
> Now, if the seqno were stored at bind time, data could still leak, as the
> assumption made by intel_gt_invalidate_tlb(), that the data stopped being
> used at that seqno, would no longer hold.
> 
> Still, I agree that this logic is complex and should be better 
> documented. So, if you're now OK with this patch, I'll add the above
> explanation inside a kernel-doc comment.
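
Just to make the flow above concrete, this is roughly what the two sides
boil down to. It is a stripped-down sketch, not the actual driver code:
the sketch_*() helpers and the bare u32 *tlb parameter are simplified
stand-ins for the real VMA/object paths, and only the intel_gt_*() helpers
match the code quoted in this thread:

	/*
	 * Unbind/evict side (__vma_put_pages() or ppgtt_unbind_vma()):
	 * just record that the pages stopped being used at this point.
	 */
	static void sketch_mark_tlb_stale(struct intel_gt *gt, u32 *tlb)
	{
		/* stamp with the next full-invalidation seqno (always odd) */
		WRITE_ONCE(*tlb, intel_gt_next_invalidate_tlb_full(gt));
	}

	/*
	 * Release side (__i915_gem_object_unset_pages()): flush only if
	 * no later full flush already covered the stamp.
	 */
	static void sketch_release_pages(struct intel_gt *gt, u32 *tlb)
	{
		u32 seqno = *tlb;

		*tlb = 0;
		if (!seqno)
			return;		/* never stamped, nothing to invalidate */

		/* no-op if a full invalidation already passed this seqno */
		intel_gt_invalidate_tlb_full(gt, seqno);
	}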

I'm enclosing the kernel-doc patch (to be applied after moving the code into
its own files: intel_tlb.c/intel_tlb.h):

[PATCH] drm/i915/gt: document TLB cache invalidation functions

Add a description for the TLB cache invalidation algorithm and for
the related kAPI functions.

Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>

diff --git a/drivers/gpu/drm/i915/gt/intel_tlb.c b/drivers/gpu/drm/i915/gt/intel_tlb.c
index af8cae979489..8eda0743da74 100644
--- a/drivers/gpu/drm/i915/gt/intel_tlb.c
+++ b/drivers/gpu/drm/i915/gt/intel_tlb.c
@@ -145,6 +145,18 @@ static void mmio_invalidate_full(struct intel_gt *gt)
 	intel_uncore_forcewake_put_delayed(uncore, FORCEWAKE_ALL);
 }
 
+/**
+ * intel_gt_invalidate_tlb_full - do full TLB cache invalidation
+ * @gt: GT structure
+ * @seqno: sequence number stamped on the object at unbind/evict time
+ *
+ * Do a full TLB cache invalidation, unless a previous full invalidation
+ * already covered @seqno, in which case this is a no-op.
+ *
+ * Note:
+ * The TLB cache invalidation logic depends on GEN-specific registers.
+ * It currently supports GEN8 to GEN12 and GuC-based TLB cache invalidation.
+ */
 void intel_gt_invalidate_tlb_full(struct intel_gt *gt, u32 seqno)
 {
 	intel_wakeref_t wakeref;
@@ -177,6 +189,12 @@ void intel_gt_init_tlb(struct intel_gt *gt)
 	seqcount_mutex_init(&gt->tlb.seqno, &gt->tlb.invalidate_lock);
 }
 
+/**
+ * intel_gt_fini_tlb - free TLB-specific vars
+ * @gt: GT structure
+ *
+ * Frees any resources used by the TLB cache invalidation logic.
+ */
 void intel_gt_fini_tlb(struct intel_gt *gt)
 {
 	mutex_destroy(&gt->tlb.invalidate_lock);
diff --git a/drivers/gpu/drm/i915/gt/intel_tlb.h b/drivers/gpu/drm/i915/gt/intel_tlb.h
index 46ce25bf5afe..d186f5d5901f 100644
--- a/drivers/gpu/drm/i915/gt/intel_tlb.h
+++ b/drivers/gpu/drm/i915/gt/intel_tlb.h
@@ -11,16 +11,99 @@
 
 #include "intel_gt_types.h"
 
+/**
+ * DOC: TLB cache invalidation logic
+ *
+ * The way the current algorithm works is that drm_i915_gem_object objects
+ * can be created in any order. At unbind/evict time, the object is
+ * guaranteed not to be used anymore, so it stores a sequence number
+ * provided by intel_gt_next_invalidate_tlb_full(). This can happen either
+ * at __vma_put_pages(), for sync VMA unbind, or at ppgtt_unbind_vma(),
+ * for async VMA unbind.
+ *
+ * At __i915_gem_object_unset_pages(), intel_gt_invalidate_tlb_full() is called.
+ * It checks whether the object's seqno was already covered by a previous full
+ * invalidation; if not, it flushes and increments the GT seqno::
+ *
+ *   void intel_gt_invalidate_tlb_full(struct intel_gt *gt, u32 seqno)
+ *   {
+ *   ...
+ * 	with_intel_gt_pm_if_awake(gt, wakeref) {
+ * 		mutex_lock(&gt->tlb.invalidate_lock);
+ * 		if (tlb_seqno_passed(gt, seqno))
+ * 			goto unlock;
+ *
+ * 		mmio_invalidate_full(gt);
+ *
+ * 		write_seqcount_invalidate(&gt->tlb.seqno); // increment seqno
+ *    ...
+ *
+ * So, let's say the current seqno is 2 and 3 new objects were created,
+ * in this order:
+ *
+ * 	obj1
+ * 	obj2
+ * 	obj3
+ *
+ * They can be unbound/evicted in a different order. At unbind/evict time,
+ * mm.tlb will be stamped with a sequence number, using the number from
+ * the last TLB flush, plus 1.
+ *
+ * Different threads may be used for unbind/evict and/or for unsetting pages.
+ *
+ * As the logic in intel_gt_invalidate_tlb_full() is protected by a mutex,
+ * for simplicity, let's consider just two threads::
+ *
+ *   sequence number	Thread 0		Thread 1
+ *
+ *   seqno=2
+ *			unbind/evict event
+ * 			obj3.mm.tlb = seqno | 1
+ *
+ *			unbind/evict event
+ * 			obj1.mm.tlb = seqno | 1
+ * 						__i915_gem_object_unset_pages()
+ * 						called for obj3 => TLB flush
+ * 						invalidating both obj1 and obj3.
+ * 						seqno += 2
+ *   seqno=4
+ *			unbind/evict event
+ * 			obj2.mm.tlb = seqno | 1
+ * 						__i915_gem_object_unset_pages()
+ * 						called for obj1: no flush, as a
+ * 						past flush already invalidated obj1
+ *
+ * 						__i915_gem_object_unset_pages()
+ *						called for obj2 => TLB flush
+ * 						seqno += 2
+ *   seqno=6
+ */
+
 void intel_gt_invalidate_tlb_full(struct intel_gt *gt, u32 seqno);
 
 void intel_gt_init_tlb(struct intel_gt *gt);
 void intel_gt_fini_tlb(struct intel_gt *gt);
 
+/**
+ * intel_gt_tlb_seqno - Returns the current TLB invalidation sequence number
+ * @gt: GT structure
+ *
+ * There's no need to lock while calling it, as seqprop_sequence is
+ * thread-safe.
+ */
 static inline u32 intel_gt_tlb_seqno(const struct intel_gt *gt)
 {
 	return seqprop_sequence(&gt->tlb.seqno);
 }
 
+/**
+ * intel_gt_next_invalidate_tlb_full - Returns the next full TLB invalidation
+ *	sequence number
+ * @gt: GT structure
+ *
+ * There's no need to lock while calling it, as seqprop_sequence is
+ * thread-safe.
+ */
 static inline u32 intel_gt_next_invalidate_tlb_full(const struct intel_gt *gt)
 {
 	return intel_gt_tlb_seqno(gt) | 1;

