[PATCH v3 0/6] drm/i915: reduce TLB performance regressions

From: Mauro Carvalho Chehab @ 2022-07-27 12:29 UTC
  Cc: Mauro Carvalho Chehab, Christian König, Daniel Vetter,
	David Airlie, Sumit Semwal, dri-devel, intel-gfx, linaro-mm-sig,
	linux-kernel, linux-media

TLB invalidation is an expensive operation, and doing it causes performance
regressions, like:
	[424.370996] i915 0000:00:02.0: [drm] *ERROR* rcs0 TLB invalidation did not complete in 4ms!

as reported at:
	https://gitlab.freedesktop.org/drm/intel/-/issues/6424

So, reduce the need for it by:
  - skipping it on engines that are not awake;
  - skipping it once the engine is wedged;
  - batching operations.
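
As an illustration only, the three checks above could combine roughly as in
the user-space model below. This is not the code from this series: every name
(model_gt, model_obj, model_invalidate_tlb and so on) is hypothetical, and the
model assumes that an idle engine needs no explicit invalidation and that a
single sequence number is enough to tell whether an earlier invalidation
already covers an object.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical user-space model; these are not i915 structures or functions. */
struct model_gt {
	bool awake;              /* engine is powered up and may hold TLB entries */
	bool wedged;             /* GPU was declared unusable */
	uint32_t seqno;          /* number of full invalidations completed so far */
	unsigned int hw_flushes; /* how many times the expensive path ran */
};

struct model_obj {
	bool pending;    /* pages were unbound and may still be cached in the TLB */
	uint32_t target; /* first value of gt->seqno that covers this object */
};

/* Wrap-safe "has gt->seqno reached target?" comparison. */
static bool model_seqno_passed(const struct model_gt *gt, uint32_t target)
{
	return (int32_t)(gt->seqno - target) >= 0;
}

/* On unbind, only record which future invalidation would cover the object. */
static void model_mark_stale(const struct model_gt *gt, struct model_obj *obj)
{
	obj->pending = true;
	obj->target = gt->seqno + 1;
}

/* On page release; returns true only if the expensive invalidation ran. */
static bool model_invalidate_tlb(struct model_gt *gt, struct model_obj *obj)
{
	if (!obj->pending)
		return false;
	obj->pending = false;

	if (gt->wedged)                          /* wedged: skip */
		return false;
	if (model_seqno_passed(gt, obj->target)) /* batched: already covered */
		return false;
	if (!gt->awake)                          /* assumption: idle needs no flush */
		return false;

	gt->hw_flushes++;                        /* stands in for the real flush */
	gt->seqno++;
	return true;
}

/* Example scenario; returns the number of expensive invalidations performed. */
static unsigned int model_demo(void)
{
	struct model_gt gt = { .awake = true };
	struct model_obj objs[3] = { 0 }, idle_obj = { 0 }, wedged_obj = { 0 };
	int i;

	/* Three objects unbound back to back, then released one by one. */
	for (i = 0; i < 3; i++)
		model_mark_stale(&gt, &objs[i]);
	for (i = 0; i < 3; i++)
		model_invalidate_tlb(&gt, &objs[i]);

	/* Released while the engine is idle: skipped. */
	model_mark_stale(&gt, &idle_obj);
	gt.awake = false;
	model_invalidate_tlb(&gt, &idle_obj);

	/* Released after the GPU is wedged: skipped. */
	gt.awake = true;
	model_mark_stale(&gt, &wedged_obj);
	gt.wedged = true;
	model_invalidate_tlb(&gt, &wedged_obj);

	return gt.hw_flushes;
}
```

In this model, model_demo() returns 1: the first of the three releases pays
for the invalidation, the other two find the sequence number already advanced,
and the idle and wedged cases never reach the expensive path.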

Additionally, add a workaround for a known hardware issue on some GPUs.

To double-check that this series does not introduce any regressions,
I used this new IGT test:

https://patchwork.freedesktop.org/patch/495684/?series=106757&rev=1

Checking the results for 3 different patchsets, on Broadwell:

1) On top of drm-tip (2022y-07m-14d-08h-35m-36), i.e. with the TLB
invalidation and serialization patches:

	$ sudo build/tests/gem_exec_tlb|grep Subtest
	Subtest close-clear: SUCCESS (10.490s)
	Subtest madv-clear: SUCCESS (10.484s)
	Subtest u-unmap-clear: SUCCESS (10.527s)
	Subtest u-shrink-clear: SUCCESS (10.506s)
	Subtest close-dumb: SUCCESS (10.165s)
	Subtest madv-dumb: SUCCESS (10.177s)
	Subtest u-unmap-dumb: SUCCESS (10.172s)
	Subtest u-shrink-dumb: SUCCESS (10.172s)

2) With the new version of the batch TLB invalidation patches from this series:

	$ sudo build/tests/gem_exec_tlb|grep Subtest
	Subtest close-clear: SUCCESS (10.483s)
	Subtest madv-clear: SUCCESS (10.495s)
	Subtest u-unmap-clear: SUCCESS (10.545s)
	Subtest u-shrink-clear: SUCCESS (10.508s)
	Subtest close-dumb: SUCCESS (10.172s)
	Subtest madv-dumb: SUCCESS (10.169s)
	Subtest u-unmap-dumb: SUCCESS (10.174s)
	Subtest u-shrink-dumb: SUCCESS (10.176s)

3) Changing the TLB invalidation routine to do nothing[1]:

	$ sudo ~/freedesktop-igt/build/tests/gem_exec_tlb|grep Subtest
	(gem_exec_tlb:1958) CRITICAL: Test assertion failure function check_bo, file ../tests/i915/gem_exec_tlb.c:384:
	(gem_exec_tlb:1958) CRITICAL: Failed assertion: !sq
	(gem_exec_tlb:1958) CRITICAL: Found deadbeef in a new (clear) buffer after 3 tries!
	(gem_exec_tlb:1956) CRITICAL: Test assertion failure function check_bo, file ../tests/i915/gem_exec_tlb.c:384:
	(gem_exec_tlb:1956) CRITICAL: Failed assertion: !sq
	(gem_exec_tlb:1956) CRITICAL: Found deadbeef in a new (clear) buffer after 89 tries!
	(gem_exec_tlb:1957) CRITICAL: Test assertion failure function check_bo, file ../tests/i915/gem_exec_tlb.c:384:
	(gem_exec_tlb:1957) CRITICAL: Failed assertion: !sq
	(gem_exec_tlb:1957) CRITICAL: Found deadbeef in a new (clear) buffer after 256 tries!
	(gem_exec_tlb:1960) CRITICAL: Test assertion failure function check_bo, file ../tests/i915/gem_exec_tlb.c:384:
	(gem_exec_tlb:1960) CRITICAL: Failed assertion: !sq
	(gem_exec_tlb:1960) CRITICAL: Found deadbeef in a new (clear) buffer after 845 tries!
	(gem_exec_tlb:1961) CRITICAL: Test assertion failure function check_bo, file ../tests/i915/gem_exec_tlb.c:384:
	(gem_exec_tlb:1961) CRITICAL: Failed assertion: !sq
	(gem_exec_tlb:1961) CRITICAL: Found deadbeef in a new (clear) buffer after 1138 tries!
	(gem_exec_tlb:1954) CRITICAL: Test assertion failure function check_bo, file ../tests/i915/gem_exec_tlb.c:384:
	(gem_exec_tlb:1954) CRITICAL: Failed assertion: !sq
	(gem_exec_tlb:1954) CRITICAL: Found deadbeef in a new (clear) buffer after 1359 tries!
	(gem_exec_tlb:1955) CRITICAL: Test assertion failure function check_bo, file ../tests/i915/gem_exec_tlb.c:384:
	(gem_exec_tlb:1955) CRITICAL: Failed assertion: !sq
	(gem_exec_tlb:1955) CRITICAL: Found deadbeef in a new (clear) buffer after 1794 tries!
	(gem_exec_tlb:1959) CRITICAL: Test assertion failure function check_bo, file ../tests/i915/gem_exec_tlb.c:384:
	(gem_exec_tlb:1959) CRITICAL: Failed assertion: !sq
	(gem_exec_tlb:1959) CRITICAL: Found deadbeef in a new (clear) buffer after 2139 tries!
	Dynamic subtest smem0 failed.
	**** DEBUG ****
	(gem_exec_tlb:1944) DEBUG: 2M hole:200000 contains poison:6b6b6b6b
	(gem_exec_tlb:1944) DEBUG: Running writer for 200000 at 300000 on bcs0
	(gem_exec_tlb:1944) DEBUG: Closing hole:200000 on rcs0, sample:deadbeef
	(gem_exec_tlb:1944) DEBUG: Rechecking hole:200000, sample:6b6b6b6b
	****  END  ****
	Subtest close-clear: FAIL (10.434s)
	Subtest madv-clear: SUCCESS (10.479s)
	Subtest u-unmap-clear: SUCCESS (10.512s)

In summary, the test properly detects failures when TLB cache invalidation doesn't happen,
as shown in result (3). It also shows that neither current drm-tip nor drm-tip with this
series applied has TLB cache invalidation issues.

[1] I applied this patch on top of drm-tip:

	diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/intel_gt.c
	index 68c2b0d8f187..0aefcd7be5e9 100644
	--- a/drivers/gpu/drm/i915/gt/intel_gt.c
	+++ b/drivers/gpu/drm/i915/gt/intel_gt.c
	@@ -930,0 +931,3 @@ void intel_gt_invalidate_tlbs(struct intel_gt *gt)
	+	// HACK: don't do TLB invalidations!!!
	+	return;
	+

Regards,
Mauro

Chris Wilson (4):
  drm/i915/gt: Ignore TLB invalidations on idle engines
  drm/i915/gt: Invalidate TLB of the OA unit at TLB invalidations
  drm/i915/gt: Skip TLB invalidations once wedged
  drm/i915/gt: Batch TLB invalidations

Mauro Carvalho Chehab (2):
  drm/i915/gt: document with_intel_gt_pm_if_awake()
  drm/i915/gt: describe the new tlb parameter at i915_vma_resource

 .../gpu/drm/i915/gem/i915_gem_object_types.h  |  3 +-
 drivers/gpu/drm/i915/gem/i915_gem_pages.c     | 25 +++---
 drivers/gpu/drm/i915/gt/intel_gt.c            | 77 +++++++++++++++----
 drivers/gpu/drm/i915/gt/intel_gt.h            | 12 ++-
 drivers/gpu/drm/i915/gt/intel_gt_pm.h         | 11 +++
 drivers/gpu/drm/i915/gt/intel_gt_types.h      | 18 ++++-
 drivers/gpu/drm/i915/gt/intel_ppgtt.c         |  8 +-
 drivers/gpu/drm/i915/i915_vma.c               | 33 ++++++--
 drivers/gpu/drm/i915/i915_vma.h               |  1 +
 drivers/gpu/drm/i915/i915_vma_resource.c      |  9 ++-
 drivers/gpu/drm/i915/i915_vma_resource.h      |  6 +-
 11 files changed, 163 insertions(+), 40 deletions(-)

-- 
2.36.1
