* [PATCH 00/19] More DG1 enabling
From: Matthew Auld @ 2021-04-12  9:05 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel

Next batch of DG1 patches. With this we should now get a booting DG1 system with
the kernel selftests passing.

Anshuman Gupta (1):
  drm/i915/oprom: Basic sanitization

Anusha Srivatsa (1):
  drm/i915/lmem: Bypass aperture when lmem is available

CQ Tang (3):
  drm/i915: Create stolen memory region from local memory
  drm/i915/stolen: enforce the min_page_size contract
  drm/i915/stolen: pass the allocation flags

Chris Wilson (2):
  drm/i915/gt: Skip aperture remapping selftest where there is no
    aperture
  drm/i915/selftests: Only query RAPL for integrated power measurements

Clint Taylor (3):
  drm/i915/dg1: Read OPROM via SPI controller
  drm/i915/dg1: Compute MEM Bandwidth using MCHBAR
  drm/i915/dg1: Double memory bandwidth available

José Roberto de Souza (1):
  drm/i915: WA for zero memory channel

Matt Roper (1):
  drm/i915/lmem: Fail driver init if LMEM training failed

Matthew Auld (3):
  drm/i915/stolen: treat stolen local as normal local memory
  drm/i915/gtt: map the PD up front
  drm/i915/gtt/dgfx: place the PD in LMEM

Mohammed Khajapasha (2):
  drm/i915/fbdev: Use lmem physical addresses for fb_mmap() on discrete
  drm/i915: Return error value when bo not in LMEM for discrete

Venkata Ramana Nayana (1):
  drm/i915/dg1: Fix mapping type for default state object

Venkata Sandeep Dhanalakota (1):
  drm/i915: Update the helper to set correct mapping

 drivers/gpu/drm/i915/display/intel_bios.c     |  75 +++++++-
 drivers/gpu/drm/i915/display/intel_bw.c       |  63 ++++++-
 drivers/gpu/drm/i915/display/intel_display.c  |  10 ++
 drivers/gpu/drm/i915/display/intel_fbdev.c    |  51 ++++--
 drivers/gpu/drm/i915/display/intel_opregion.c | 169 ++++++++++++++++++
 drivers/gpu/drm/i915/display/intel_opregion.h |  38 +++-
 drivers/gpu/drm/i915/gem/i915_gem_lmem.c      |  20 ++-
 drivers/gpu/drm/i915/gem/i915_gem_lmem.h      |   5 +
 drivers/gpu/drm/i915/gem/i915_gem_stolen.c    | 116 ++++++++++--
 drivers/gpu/drm/i915/gem/i915_gem_stolen.h    |   3 +
 .../drm/i915/gem/selftests/i915_gem_context.c |  11 +-
 drivers/gpu/drm/i915/gt/gen6_ppgtt.c          |  11 +-
 drivers/gpu/drm/i915/gt/gen8_ppgtt.c          |  31 ++--
 drivers/gpu/drm/i915/gt/intel_engine_cs.c     |   3 +-
 drivers/gpu/drm/i915/gt/intel_engine_pm.c     |   2 +-
 drivers/gpu/drm/i915/gt/intel_ggtt.c          |   2 +-
 drivers/gpu/drm/i915/gt/intel_gtt.c           |  71 +++++---
 drivers/gpu/drm/i915/gt/intel_gtt.h           |  12 +-
 drivers/gpu/drm/i915/gt/intel_lrc.c           |   4 +-
 drivers/gpu/drm/i915/gt/intel_ppgtt.c         |   7 +-
 drivers/gpu/drm/i915/gt/intel_ring.c          |   9 +-
 drivers/gpu/drm/i915/gt/selftest_context.c    |   3 +-
 drivers/gpu/drm/i915/gt/selftest_hangcheck.c  |   4 +-
 drivers/gpu/drm/i915/gt/selftest_lrc.c        |   4 +-
 drivers/gpu/drm/i915/gt/selftest_rc6.c        |  32 ++--
 drivers/gpu/drm/i915/gt/selftest_rps.c        |   2 +-
 drivers/gpu/drm/i915/gt/shmem_utils.c         |   4 +-
 drivers/gpu/drm/i915/gt/uc/intel_guc.c        |   4 +-
 drivers/gpu/drm/i915/gt/uc/intel_huc.c        |   4 +-
 drivers/gpu/drm/i915/i915_drv.h               |  11 +-
 drivers/gpu/drm/i915/i915_pci.c               |   2 +-
 drivers/gpu/drm/i915/i915_reg.h               |  12 ++
 drivers/gpu/drm/i915/i915_vma.c               |  22 ++-
 drivers/gpu/drm/i915/intel_memory_region.c    |   6 +
 drivers/gpu/drm/i915/intel_memory_region.h    |   5 +-
 drivers/gpu/drm/i915/intel_uncore.c           |  12 ++
 drivers/gpu/drm/i915/selftests/i915_gem_gtt.c |  10 +-
 drivers/gpu/drm/i915/selftests/i915_perf.c    |   3 +-
 drivers/gpu/drm/i915/selftests/i915_vma.c     |   3 +
 drivers/gpu/drm/i915/selftests/igt_spinner.c  |   4 +-
 drivers/gpu/drm/i915/selftests/librapl.c      |  10 ++
 drivers/gpu/drm/i915/selftests/librapl.h      |   4 +
 42 files changed, 716 insertions(+), 158 deletions(-)

-- 
2.26.3

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


* [PATCH 01/19] drm/i915/gt: Skip aperture remapping selftest where there is no aperture
From: Matthew Auld @ 2021-04-12  9:05 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel, Chris Wilson

From: Chris Wilson <chris@chris-wilson.co.uk>

If there is no mappable aperture, we cannot remap it for access, and the
selftest is void.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
---
 drivers/gpu/drm/i915/selftests/i915_vma.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/i915/selftests/i915_vma.c b/drivers/gpu/drm/i915/selftests/i915_vma.c
index 5fe7b80ca0bd..dd0607254a95 100644
--- a/drivers/gpu/drm/i915/selftests/i915_vma.c
+++ b/drivers/gpu/drm/i915/selftests/i915_vma.c
@@ -967,6 +967,9 @@ static int igt_vma_remapped_gtt(void *arg)
 	intel_wakeref_t wakeref;
 	int err = 0;
 
+	if (!i915_ggtt_has_aperture(&i915->ggtt))
+		return 0;
+
 	obj = i915_gem_object_create_internal(i915, 10 * 10 * PAGE_SIZE);
 	if (IS_ERR(obj))
 		return PTR_ERR(obj);
-- 
2.26.3


* [PATCH 02/19] drm/i915/selftests: Only query RAPL for integrated power measurements
From: Matthew Auld @ 2021-04-12  9:05 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel, Chris Wilson

From: Chris Wilson <chris@chris-wilson.co.uk>

RAPL provides on-package power measurements, which do not encompass
discrete graphics, so let's avoid using the igfx measurements when testing
dgfx. Later we will abstract the simple librapl interface over hwmon so
that we can verify basic power consumption scenarios.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
---
 drivers/gpu/drm/i915/gt/selftest_rc6.c   | 32 +++++++++++++++---------
 drivers/gpu/drm/i915/gt/selftest_rps.c   |  2 +-
 drivers/gpu/drm/i915/selftests/librapl.c | 10 ++++++++
 drivers/gpu/drm/i915/selftests/librapl.h |  4 +++
 4 files changed, 35 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/selftest_rc6.c b/drivers/gpu/drm/i915/gt/selftest_rc6.c
index f097e420ac45..710f825f6e5a 100644
--- a/drivers/gpu/drm/i915/gt/selftest_rc6.c
+++ b/drivers/gpu/drm/i915/gt/selftest_rc6.c
@@ -34,6 +34,7 @@ int live_rc6_manual(void *arg)
 	struct intel_rc6 *rc6 = &gt->rc6;
 	u64 rc0_power, rc6_power;
 	intel_wakeref_t wakeref;
+	bool has_power;
 	ktime_t dt;
 	u64 res[2];
 	int err = 0;
@@ -50,6 +51,7 @@ int live_rc6_manual(void *arg)
 	if (IS_VALLEYVIEW(gt->i915) || IS_CHERRYVIEW(gt->i915))
 		return 0;
 
+	has_power = librapl_supported(gt->i915);
 	wakeref = intel_runtime_pm_get(gt->uncore->rpm);
 
 	/* Force RC6 off for starters */
@@ -71,11 +73,14 @@ int live_rc6_manual(void *arg)
 		goto out_unlock;
 	}
 
-	rc0_power = div64_u64(NSEC_PER_SEC * rc0_power, ktime_to_ns(dt));
-	if (!rc0_power) {
-		pr_err("No power measured while in RC0\n");
-		err = -EINVAL;
-		goto out_unlock;
+	if (has_power) {
+		rc0_power = div64_u64(NSEC_PER_SEC * rc0_power,
+				      ktime_to_ns(dt));
+		if (!rc0_power) {
+			pr_err("No power measured while in RC0\n");
+			err = -EINVAL;
+			goto out_unlock;
+		}
 	}
 
 	/* Manually enter RC6 */
@@ -97,13 +102,16 @@ int live_rc6_manual(void *arg)
 		err = -EINVAL;
 	}
 
-	rc6_power = div64_u64(NSEC_PER_SEC * rc6_power, ktime_to_ns(dt));
-	pr_info("GPU consumed %llduW in RC0 and %llduW in RC6\n",
-		rc0_power, rc6_power);
-	if (2 * rc6_power > rc0_power) {
-		pr_err("GPU leaked energy while in RC6!\n");
-		err = -EINVAL;
-		goto out_unlock;
+	if (has_power) {
+		rc6_power = div64_u64(NSEC_PER_SEC * rc6_power,
+				      ktime_to_ns(dt));
+		pr_info("GPU consumed %llduW in RC0 and %llduW in RC6\n",
+			rc0_power, rc6_power);
+		if (2 * rc6_power > rc0_power) {
+			pr_err("GPU leaked energy while in RC6!\n");
+			err = -EINVAL;
+			goto out_unlock;
+		}
 	}
 
 	/* Restore what should have been the original state! */
diff --git a/drivers/gpu/drm/i915/gt/selftest_rps.c b/drivers/gpu/drm/i915/gt/selftest_rps.c
index 967641fee42a..adf7fdbc00f7 100644
--- a/drivers/gpu/drm/i915/gt/selftest_rps.c
+++ b/drivers/gpu/drm/i915/gt/selftest_rps.c
@@ -1139,7 +1139,7 @@ int live_rps_power(void *arg)
 	if (!intel_rps_is_enabled(rps) || INTEL_GEN(gt->i915) < 6)
 		return 0;
 
-	if (!librapl_energy_uJ())
+	if (!librapl_supported(gt->i915))
 		return 0;
 
 	if (igt_spinner_init(&spin, gt))
diff --git a/drivers/gpu/drm/i915/selftests/librapl.c b/drivers/gpu/drm/i915/selftests/librapl.c
index 58710ac3f979..eb03b5b28bad 100644
--- a/drivers/gpu/drm/i915/selftests/librapl.c
+++ b/drivers/gpu/drm/i915/selftests/librapl.c
@@ -5,8 +5,18 @@
 
 #include <asm/msr.h>
 
+#include "i915_drv.h"
 #include "librapl.h"
 
+bool librapl_supported(const struct drm_i915_private *i915)
+{
+	/* Discrete cards require hwmon integration */
+	if (IS_DGFX(i915))
+		return false;
+
+	return librapl_energy_uJ();
+}
+
 u64 librapl_energy_uJ(void)
 {
 	unsigned long long power;
diff --git a/drivers/gpu/drm/i915/selftests/librapl.h b/drivers/gpu/drm/i915/selftests/librapl.h
index 887f3e91dd05..e3b24fad0a7a 100644
--- a/drivers/gpu/drm/i915/selftests/librapl.h
+++ b/drivers/gpu/drm/i915/selftests/librapl.h
@@ -8,6 +8,10 @@
 
 #include <linux/types.h>
 
+struct drm_i915_private;
+
+bool librapl_supported(const struct drm_i915_private *i915);
+
 u64 librapl_energy_uJ(void);
 
 #endif /* SELFTEST_LIBRAPL_H */
-- 
2.26.3
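
The power arithmetic that this patch gates behind has_power can be sketched
in userspace C. This is a minimal model, not the selftest itself: it assumes
the value fed into div64_u64() is an energy delta in microjoules, and the
helper names average_power_uW(), rc6_leaks_energy() and
power_checks_enabled() are hypothetical stand-ins that do not exist in i915.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Average power in microwatts for an energy delta (uJ) spent over dt_ns. */
static uint64_t average_power_uW(uint64_t energy_uJ, uint64_t dt_ns)
{
	assert(dt_ns > 0); /* the selftest divides by ktime_to_ns(dt) */
	return (NSEC_PER_SEC * energy_uJ) / dt_ns;
}

/* Same rule as the selftest: RC6 may draw at most half of the RC0 power. */
static bool rc6_leaks_energy(uint64_t rc0_uW, uint64_t rc6_uW)
{
	return 2 * rc6_uW > rc0_uW;
}

/*
 * Hypothetical stand-in for librapl_supported(): discrete parts never use
 * RAPL, integrated parts use it only when a non-zero reading is available.
 */
static bool power_checks_enabled(bool is_dgfx, uint64_t rapl_energy_uJ)
{
	if (is_dgfx)
		return false;

	return rapl_energy_uJ != 0;
}
```

For example, 5,000,000 uJ consumed over one second works out to 5,000,000 uW
(5 W). On a discrete part power_checks_enabled() returns false, so neither
the RC0 nor the RC6 comparison would run.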


* [PATCH 03/19] drm/i915: Create stolen memory region from local memory
From: Matthew Auld @ 2021-04-12  9:05 UTC (permalink / raw)
  To: intel-gfx; +Cc: CQ Tang, dri-devel

From: CQ Tang <cq.tang@intel.com>

Add REGION_STOLEN_LMEM to the DG1 device info and create a stolen memory
region from the upper portion of local device memory, starting at
DSMBASE.

v2:
    - s/drm_info/drm_dbg; userspace likely doesn't care about stolen.
    - mem->type is only set up after the region probe, so setting the name
      as stolen-local or stolen-system based on this value won't work. Split
      system vs local stolen setup to fix this.
    - kill all the region->devmem/is_devmem stuff. We already differentiate
      the different types of stolen so such things shouldn't be needed
      anymore.

Signed-off-by: CQ Tang <cq.tang@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_stolen.c | 99 +++++++++++++++++++---
 drivers/gpu/drm/i915/gem/i915_gem_stolen.h |  3 +
 drivers/gpu/drm/i915/i915_pci.c            |  2 +-
 drivers/gpu/drm/i915/i915_reg.h            |  1 +
 drivers/gpu/drm/i915/intel_memory_region.c |  6 ++
 drivers/gpu/drm/i915/intel_memory_region.h |  5 +-
 6 files changed, 102 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_stolen.c b/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
index b0597de206de..56dd58bef5ee 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
@@ -10,6 +10,7 @@
 #include <drm/drm_mm.h>
 #include <drm/i915_drm.h>
 
+#include "gem/i915_gem_lmem.h"
 #include "gem/i915_gem_region.h"
 #include "i915_drv.h"
 #include "i915_gem_stolen.h"
@@ -121,6 +122,14 @@ static int i915_adjust_stolen(struct drm_i915_private *i915,
 		}
 	}
 
+	/*
+	 * With device local memory, we don't need to check the address range,
+	 * this is device memory physical address, could overlap with system
+	 * memory.
+	 */
+	if (HAS_LMEM(i915))
+		return 0;
+
 	/*
 	 * Verify that nothing else uses this physical address. Stolen
 	 * memory should be reserved by the BIOS and hidden from the
@@ -374,8 +383,9 @@ static void icl_get_stolen_reserved(struct drm_i915_private *i915,
 	}
 }
 
-static int i915_gem_init_stolen(struct drm_i915_private *i915)
+static int i915_gem_init_stolen(struct intel_memory_region *mem)
 {
+	struct drm_i915_private *i915 = mem->i915;
 	struct intel_uncore *uncore = &i915->uncore;
 	resource_size_t reserved_base, stolen_top;
 	resource_size_t reserved_total, reserved_size;
@@ -396,10 +406,10 @@ static int i915_gem_init_stolen(struct drm_i915_private *i915)
 		return 0;
 	}
 
-	if (resource_size(&intel_graphics_stolen_res) == 0)
+	if (resource_size(&mem->region) == 0)
 		return 0;
 
-	i915->dsm = intel_graphics_stolen_res;
+	i915->dsm = mem->region;
 
 	if (i915_adjust_stolen(i915, &i915->dsm))
 		return 0;
@@ -684,23 +694,36 @@ static int _i915_gem_object_stolen_init(struct intel_memory_region *mem,
 	return ret;
 }
 
+struct intel_memory_region *i915_stolen_region(struct drm_i915_private *i915)
+{
+	if (HAS_LMEM(i915))
+		return i915->mm.regions[INTEL_REGION_STOLEN_LMEM];
+
+	return i915->mm.regions[INTEL_REGION_STOLEN_SMEM];
+}
+
 struct drm_i915_gem_object *
 i915_gem_object_create_stolen(struct drm_i915_private *i915,
 			      resource_size_t size)
 {
-	return i915_gem_object_create_region(i915->mm.regions[INTEL_REGION_STOLEN_SMEM],
+	return i915_gem_object_create_region(i915_stolen_region(i915),
 					     size, I915_BO_ALLOC_CONTIGUOUS);
 }
 
 static int init_stolen(struct intel_memory_region *mem)
 {
-	intel_memory_region_set_name(mem, "stolen");
+	if (HAS_LMEM(mem->i915)) {
+		if (!io_mapping_init_wc(&mem->iomap,
+					mem->io_start,
+					resource_size(&mem->region)))
+			return -EIO;
+	}
 
 	/*
 	 * Initialise stolen early so that we may reserve preallocated
 	 * objects for the BIOS to KMS transition.
 	 */
-	return i915_gem_init_stolen(mem->i915);
+	return i915_gem_init_stolen(mem);
 }
 
 static void release_stolen(struct intel_memory_region *mem)
@@ -714,13 +737,65 @@ static const struct intel_memory_region_ops i915_region_stolen_ops = {
 	.init_object = _i915_gem_object_stolen_init,
 };
 
+static struct intel_memory_region *
+setup_lmem_stolen(struct drm_i915_private *i915)
+{
+	struct intel_uncore *uncore = &i915->uncore;
+	struct pci_dev *pdev = i915->drm.pdev;
+	struct intel_memory_region *mem;
+	resource_size_t io_start;
+	resource_size_t lmem_size;
+	u64 lmem_base;
+
+	if (!IS_DGFX(i915))
+		return ERR_PTR(-ENODEV);
+
+	lmem_base = intel_uncore_read64(uncore, GEN12_DSMBASE);
+	lmem_size = pci_resource_len(pdev, 2) - lmem_base;
+	io_start = pci_resource_start(pdev, 2) + lmem_base;
+
+	mem = intel_memory_region_create(i915, lmem_base, lmem_size,
+					 I915_GTT_PAGE_SIZE_4K, io_start,
+					 &i915_region_stolen_ops);
+	if (IS_ERR(mem))
+		return mem;
+
+	drm_dbg(&i915->drm, "Stolen Local memory: %pR\n", &mem->region);
+	drm_dbg(&i915->drm, "Stolen Local memory IO start: %pa\n",
+		&mem->io_start);
+
+	intel_memory_region_set_name(mem, "stolen-local");
+
+	return mem;
+}
+
+static struct intel_memory_region*
+setup_smem_stolen(struct drm_i915_private *i915)
+{
+	struct intel_memory_region *mem;
+
+	mem = intel_memory_region_create(i915,
+					 intel_graphics_stolen_res.start,
+					 resource_size(&intel_graphics_stolen_res),
+					 PAGE_SIZE, 0,
+					 &i915_region_stolen_ops);
+	if (IS_ERR(mem))
+		return mem;
+
+	intel_memory_region_set_name(mem, "stolen-system");
+
+	return mem;
+}
+
 struct intel_memory_region *i915_gem_stolen_setup(struct drm_i915_private *i915)
 {
-	return intel_memory_region_create(i915,
-					  intel_graphics_stolen_res.start,
-					  resource_size(&intel_graphics_stolen_res),
-					  PAGE_SIZE, 0,
-					  &i915_region_stolen_ops);
+	struct intel_memory_region *mem;
+
+	mem = setup_lmem_stolen(i915);
+	if (mem == ERR_PTR(-ENODEV))
+		mem = setup_smem_stolen(i915);
+
+	return mem;
 }
 
 struct drm_i915_gem_object *
@@ -728,7 +803,7 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_i915_private *i915,
 					       resource_size_t stolen_offset,
 					       resource_size_t size)
 {
-	struct intel_memory_region *mem = i915->mm.regions[INTEL_REGION_STOLEN_SMEM];
+	struct intel_memory_region *mem = i915_stolen_region(i915);
 	struct drm_i915_gem_object *obj;
 	struct drm_mm_node *stolen;
 	int ret;
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_stolen.h b/drivers/gpu/drm/i915/gem/i915_gem_stolen.h
index b03489706796..2d1ce7fec61c 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_stolen.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_stolen.h
@@ -22,6 +22,9 @@ int i915_gem_stolen_insert_node_in_range(struct drm_i915_private *dev_priv,
 void i915_gem_stolen_remove_node(struct drm_i915_private *dev_priv,
 				 struct drm_mm_node *node);
 struct intel_memory_region *i915_gem_stolen_setup(struct drm_i915_private *i915);
+
+struct intel_memory_region *i915_stolen_region(struct drm_i915_private *i915);
+
 struct drm_i915_gem_object *
 i915_gem_object_create_stolen(struct drm_i915_private *dev_priv,
 			      resource_size_t size);
diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
index 480553746794..53f5d1e6daef 100644
--- a/drivers/gpu/drm/i915/i915_pci.c
+++ b/drivers/gpu/drm/i915/i915_pci.c
@@ -906,7 +906,7 @@ static const struct intel_device_info rkl_info = {
 
 #define GEN12_DGFX_FEATURES \
 	GEN12_FEATURES, \
-	.memory_regions = REGION_SMEM | REGION_LMEM, \
+	.memory_regions = REGION_SMEM | REGION_LMEM | REGION_STOLEN_LMEM, \
 	.has_master_unit_irq = 1, \
 	.has_llc = 0, \
 	.has_snoop = 1, \
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index e087bcd21911..4108f2a7ebfa 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -12191,6 +12191,7 @@ enum skl_power_gate {
 #define GEN12_GLOBAL_MOCS(i)	_MMIO(0x4000 + (i) * 4) /* Global MOCS regs */
 
 #define GEN12_GSMBASE			_MMIO(0x108100)
+#define GEN12_DSMBASE			_MMIO(0x1080C0)
 
 /* gamt regs */
 #define GEN8_L3_LRA_1_GPGPU _MMIO(0x4dd4)
diff --git a/drivers/gpu/drm/i915/intel_memory_region.c b/drivers/gpu/drm/i915/intel_memory_region.c
index bf837b6bb185..ac90b76a3fa0 100644
--- a/drivers/gpu/drm/i915/intel_memory_region.c
+++ b/drivers/gpu/drm/i915/intel_memory_region.c
@@ -22,6 +22,10 @@ static const struct {
 		.class = INTEL_MEMORY_STOLEN_SYSTEM,
 		.instance = 0,
 	},
+	[INTEL_REGION_STOLEN_LMEM] = {
+		.class = INTEL_MEMORY_STOLEN_LOCAL,
+		.instance = 0,
+	},
 };
 
 struct intel_memory_region *
@@ -278,6 +282,8 @@ int intel_memory_regions_hw_probe(struct drm_i915_private *i915)
 		case INTEL_MEMORY_SYSTEM:
 			mem = i915_gem_shmem_setup(i915);
 			break;
+		case INTEL_MEMORY_STOLEN_LOCAL:
+			fallthrough;
 		case INTEL_MEMORY_STOLEN_SYSTEM:
 			mem = i915_gem_stolen_setup(i915);
 			break;
diff --git a/drivers/gpu/drm/i915/intel_memory_region.h b/drivers/gpu/drm/i915/intel_memory_region.h
index edd49067c8ca..4c8ec15af55f 100644
--- a/drivers/gpu/drm/i915/intel_memory_region.h
+++ b/drivers/gpu/drm/i915/intel_memory_region.h
@@ -26,18 +26,21 @@ enum intel_memory_type {
 	INTEL_MEMORY_SYSTEM = 0,
 	INTEL_MEMORY_LOCAL,
 	INTEL_MEMORY_STOLEN_SYSTEM,
+	INTEL_MEMORY_STOLEN_LOCAL,
 };
 
 enum intel_region_id {
 	INTEL_REGION_SMEM = 0,
 	INTEL_REGION_LMEM,
 	INTEL_REGION_STOLEN_SMEM,
+	INTEL_REGION_STOLEN_LMEM,
 	INTEL_REGION_UNKNOWN, /* Should be last */
 };
 
 #define REGION_SMEM     BIT(INTEL_REGION_SMEM)
 #define REGION_LMEM     BIT(INTEL_REGION_LMEM)
 #define REGION_STOLEN_SMEM   BIT(INTEL_REGION_STOLEN_SMEM)
+#define REGION_STOLEN_LMEM   BIT(INTEL_REGION_STOLEN_LMEM)
 
 #define I915_ALLOC_MIN_PAGE_SIZE  BIT(0)
 #define I915_ALLOC_CONTIGUOUS     BIT(1)
@@ -82,7 +85,7 @@ struct intel_memory_region {
 	u16 type;
 	u16 instance;
 	enum intel_region_id id;
-	char name[8];
+	char name[16];
 
 	struct list_head reserved;
 
-- 
2.26.3
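
The address arithmetic in setup_lmem_stolen() can be modelled in userspace C
as follows. This is a sketch under stated assumptions: struct stolen_layout
and stolen_lmem_layout() are hypothetical names, and the inputs stand in for
pci_resource_start(pdev, 2), pci_resource_len(pdev, 2) and the value read
from GEN12_DSMBASE. The numbers in the example are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the values handed to intel_memory_region_create(). */
struct stolen_layout {
	uint64_t base;     /* offset of stolen inside local memory (DSMBASE) */
	uint64_t size;     /* bytes from DSMBASE to the end of BAR 2 */
	uint64_t io_start; /* CPU-visible address where the stolen range begins */
};

static struct stolen_layout
stolen_lmem_layout(uint64_t bar2_start, uint64_t bar2_len, uint64_t dsm_base)
{
	struct stolen_layout l;

	/* Guard added for this sketch only: avoid unsigned underflow below. */
	assert(dsm_base <= bar2_len);

	l.base = dsm_base;
	l.size = bar2_len - dsm_base;
	l.io_start = bar2_start + dsm_base;

	return l;
}
```

With a 4 GiB BAR at 0x4000000000 and a DSMBASE of 0xFE000000, the sketch
yields a 32 MiB stolen region whose I/O start is 0x40FE000000.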


* [Intel-gfx] [PATCH 03/19] drm/i915: Create stolen memory region from local memory
From: Matthew Auld @ 2021-04-12  9:05 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel

From: CQ Tang <cq.tang@intel.com>

Add REGION_STOLEN_LMEM to the DG1 device info and create a stolen memory
region from the upper portion of local device memory, starting at
DSMBASE.

v2:
    - s/drm_info/drm_dbg; userspace likely doesn't care about stolen.
    - mem->type is only set up after the region probe, so setting the name
      as stolen-local or stolen-system based on this value won't work. Split
      system vs local stolen setup to fix this.
    - kill all the region->devmem/is_devmem stuff. We already differentiate
      the different types of stolen so such things shouldn't be needed
      anymore.

Signed-off-by: CQ Tang <cq.tang@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
---
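The region arithmetic in setup_lmem_stolen() can be illustrated outside the
driver. The following is a standalone sketch, not driver code: struct
region_sketch and stolen_from_lmem() are made-up names, the inputs stand in for
pci_resource_start()/pci_resource_len() of BAR 2 and the value read from
GEN12_DSMBASE, and the assert is an assumed sanity check that the patch itself
does not perform.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the region description; not a kernel type. */
struct region_sketch {
	uint64_t base;      /* offset of stolen within device local memory */
	uint64_t size;      /* bytes from DSMBASE to the end of the BAR */
	uint64_t io_start;  /* CPU-visible physical address of the region */
};

/*
 * Mirrors the arithmetic in setup_lmem_stolen(): everything from dsm_base
 * up to the end of the local-memory BAR is treated as stolen.
 */
static struct region_sketch
stolen_from_lmem(uint64_t bar_start, uint64_t bar_len, uint64_t dsm_base)
{
	struct region_sketch r;

	/* Assumed sanity check for this sketch only. */
	assert(dsm_base <= bar_len);

	r.base = dsm_base;
	r.size = bar_len - dsm_base;
	r.io_start = bar_start + dsm_base;
	return r;
}
```

For example, with an illustrative 4 GiB BAR and DSMBASE 32 MiB below its top,
the resulting region is 32 MiB long and its io_start is the BAR start plus
DSMBASE.
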
 drivers/gpu/drm/i915/gem/i915_gem_stolen.c | 99 +++++++++++++++++++---
 drivers/gpu/drm/i915/gem/i915_gem_stolen.h |  3 +
 drivers/gpu/drm/i915/i915_pci.c            |  2 +-
 drivers/gpu/drm/i915/i915_reg.h            |  1 +
 drivers/gpu/drm/i915/intel_memory_region.c |  6 ++
 drivers/gpu/drm/i915/intel_memory_region.h |  5 +-
 6 files changed, 102 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_stolen.c b/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
index b0597de206de..56dd58bef5ee 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
@@ -10,6 +10,7 @@
 #include <drm/drm_mm.h>
 #include <drm/i915_drm.h>
 
+#include "gem/i915_gem_lmem.h"
 #include "gem/i915_gem_region.h"
 #include "i915_drv.h"
 #include "i915_gem_stolen.h"
@@ -121,6 +122,14 @@ static int i915_adjust_stolen(struct drm_i915_private *i915,
 		}
 	}
 
+	/*
+	 * With device local memory there is no need to check the address
+	 * range: this is a device-memory physical address, which may overlap
+	 * with system memory.
+	 */
+	if (HAS_LMEM(i915))
+		return 0;
+
 	/*
 	 * Verify that nothing else uses this physical address. Stolen
 	 * memory should be reserved by the BIOS and hidden from the
@@ -374,8 +383,9 @@ static void icl_get_stolen_reserved(struct drm_i915_private *i915,
 	}
 }
 
-static int i915_gem_init_stolen(struct drm_i915_private *i915)
+static int i915_gem_init_stolen(struct intel_memory_region *mem)
 {
+	struct drm_i915_private *i915 = mem->i915;
 	struct intel_uncore *uncore = &i915->uncore;
 	resource_size_t reserved_base, stolen_top;
 	resource_size_t reserved_total, reserved_size;
@@ -396,10 +406,10 @@ static int i915_gem_init_stolen(struct drm_i915_private *i915)
 		return 0;
 	}
 
-	if (resource_size(&intel_graphics_stolen_res) == 0)
+	if (resource_size(&mem->region) == 0)
 		return 0;
 
-	i915->dsm = intel_graphics_stolen_res;
+	i915->dsm = mem->region;
 
 	if (i915_adjust_stolen(i915, &i915->dsm))
 		return 0;
@@ -684,23 +694,36 @@ static int _i915_gem_object_stolen_init(struct intel_memory_region *mem,
 	return ret;
 }
 
+struct intel_memory_region *i915_stolen_region(struct drm_i915_private *i915)
+{
+	if (HAS_LMEM(i915))
+		return i915->mm.regions[INTEL_REGION_STOLEN_LMEM];
+
+	return i915->mm.regions[INTEL_REGION_STOLEN_SMEM];
+}
+
 struct drm_i915_gem_object *
 i915_gem_object_create_stolen(struct drm_i915_private *i915,
 			      resource_size_t size)
 {
-	return i915_gem_object_create_region(i915->mm.regions[INTEL_REGION_STOLEN_SMEM],
+	return i915_gem_object_create_region(i915_stolen_region(i915),
 					     size, I915_BO_ALLOC_CONTIGUOUS);
 }
 
 static int init_stolen(struct intel_memory_region *mem)
 {
-	intel_memory_region_set_name(mem, "stolen");
+	if (HAS_LMEM(mem->i915)) {
+		if (!io_mapping_init_wc(&mem->iomap,
+					mem->io_start,
+					resource_size(&mem->region)))
+			return -EIO;
+	}
 
 	/*
 	 * Initialise stolen early so that we may reserve preallocated
 	 * objects for the BIOS to KMS transition.
 	 */
-	return i915_gem_init_stolen(mem->i915);
+	return i915_gem_init_stolen(mem);
 }
 
 static void release_stolen(struct intel_memory_region *mem)
@@ -714,13 +737,65 @@ static const struct intel_memory_region_ops i915_region_stolen_ops = {
 	.init_object = _i915_gem_object_stolen_init,
 };
 
+static struct intel_memory_region *
+setup_lmem_stolen(struct drm_i915_private *i915)
+{
+	struct intel_uncore *uncore = &i915->uncore;
+	struct pci_dev *pdev = i915->drm.pdev;
+	struct intel_memory_region *mem;
+	resource_size_t io_start;
+	resource_size_t lmem_size;
+	u64 lmem_base;
+
+	if (!IS_DGFX(i915))
+		return ERR_PTR(-ENODEV);
+
+	lmem_base = intel_uncore_read64(uncore, GEN12_DSMBASE);
+	lmem_size = pci_resource_len(pdev, 2) - lmem_base;
+	io_start = pci_resource_start(pdev, 2) + lmem_base;
+
+	mem = intel_memory_region_create(i915, lmem_base, lmem_size,
+					 I915_GTT_PAGE_SIZE_4K, io_start,
+					 &i915_region_stolen_ops);
+	if (IS_ERR(mem))
+		return mem;
+
+	drm_dbg(&i915->drm, "Stolen Local memory: %pR\n", &mem->region);
+	drm_dbg(&i915->drm, "Stolen Local memory IO start: %pa\n",
+		&mem->io_start);
+
+	intel_memory_region_set_name(mem, "stolen-local");
+
+	return mem;
+}
+
+static struct intel_memory_region *
+setup_smem_stolen(struct drm_i915_private *i915)
+{
+	struct intel_memory_region *mem;
+
+	mem = intel_memory_region_create(i915,
+					 intel_graphics_stolen_res.start,
+					 resource_size(&intel_graphics_stolen_res),
+					 PAGE_SIZE, 0,
+					 &i915_region_stolen_ops);
+	if (IS_ERR(mem))
+		return mem;
+
+	intel_memory_region_set_name(mem, "stolen-system");
+
+	return mem;
+}
+
 struct intel_memory_region *i915_gem_stolen_setup(struct drm_i915_private *i915)
 {
-	return intel_memory_region_create(i915,
-					  intel_graphics_stolen_res.start,
-					  resource_size(&intel_graphics_stolen_res),
-					  PAGE_SIZE, 0,
-					  &i915_region_stolen_ops);
+	struct intel_memory_region *mem;
+
+	mem = setup_lmem_stolen(i915);
+	if (mem == ERR_PTR(-ENODEV))
+		mem = setup_smem_stolen(i915);
+
+	return mem;
 }
 
 struct drm_i915_gem_object *
@@ -728,7 +803,7 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_i915_private *i915,
 					       resource_size_t stolen_offset,
 					       resource_size_t size)
 {
-	struct intel_memory_region *mem = i915->mm.regions[INTEL_REGION_STOLEN_SMEM];
+	struct intel_memory_region *mem = i915_stolen_region(i915);
 	struct drm_i915_gem_object *obj;
 	struct drm_mm_node *stolen;
 	int ret;
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_stolen.h b/drivers/gpu/drm/i915/gem/i915_gem_stolen.h
index b03489706796..2d1ce7fec61c 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_stolen.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_stolen.h
@@ -22,6 +22,9 @@ int i915_gem_stolen_insert_node_in_range(struct drm_i915_private *dev_priv,
 void i915_gem_stolen_remove_node(struct drm_i915_private *dev_priv,
 				 struct drm_mm_node *node);
 struct intel_memory_region *i915_gem_stolen_setup(struct drm_i915_private *i915);
+
+struct intel_memory_region *i915_stolen_region(struct drm_i915_private *i915);
+
 struct drm_i915_gem_object *
 i915_gem_object_create_stolen(struct drm_i915_private *dev_priv,
 			      resource_size_t size);
diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
index 480553746794..53f5d1e6daef 100644
--- a/drivers/gpu/drm/i915/i915_pci.c
+++ b/drivers/gpu/drm/i915/i915_pci.c
@@ -906,7 +906,7 @@ static const struct intel_device_info rkl_info = {
 
 #define GEN12_DGFX_FEATURES \
 	GEN12_FEATURES, \
-	.memory_regions = REGION_SMEM | REGION_LMEM, \
+	.memory_regions = REGION_SMEM | REGION_LMEM | REGION_STOLEN_LMEM, \
 	.has_master_unit_irq = 1, \
 	.has_llc = 0, \
 	.has_snoop = 1, \
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index e087bcd21911..4108f2a7ebfa 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -12191,6 +12191,7 @@ enum skl_power_gate {
 #define GEN12_GLOBAL_MOCS(i)	_MMIO(0x4000 + (i) * 4) /* Global MOCS regs */
 
 #define GEN12_GSMBASE			_MMIO(0x108100)
+#define GEN12_DSMBASE			_MMIO(0x1080C0)
 
 /* gamt regs */
 #define GEN8_L3_LRA_1_GPGPU _MMIO(0x4dd4)
diff --git a/drivers/gpu/drm/i915/intel_memory_region.c b/drivers/gpu/drm/i915/intel_memory_region.c
index bf837b6bb185..ac90b76a3fa0 100644
--- a/drivers/gpu/drm/i915/intel_memory_region.c
+++ b/drivers/gpu/drm/i915/intel_memory_region.c
@@ -22,6 +22,10 @@ static const struct {
 		.class = INTEL_MEMORY_STOLEN_SYSTEM,
 		.instance = 0,
 	},
+	[INTEL_REGION_STOLEN_LMEM] = {
+		.class = INTEL_MEMORY_STOLEN_LOCAL,
+		.instance = 0,
+	},
 };
 
 struct intel_memory_region *
@@ -278,6 +282,8 @@ int intel_memory_regions_hw_probe(struct drm_i915_private *i915)
 		case INTEL_MEMORY_SYSTEM:
 			mem = i915_gem_shmem_setup(i915);
 			break;
+		case INTEL_MEMORY_STOLEN_LOCAL:
+			fallthrough;
 		case INTEL_MEMORY_STOLEN_SYSTEM:
 			mem = i915_gem_stolen_setup(i915);
 			break;
diff --git a/drivers/gpu/drm/i915/intel_memory_region.h b/drivers/gpu/drm/i915/intel_memory_region.h
index edd49067c8ca..4c8ec15af55f 100644
--- a/drivers/gpu/drm/i915/intel_memory_region.h
+++ b/drivers/gpu/drm/i915/intel_memory_region.h
@@ -26,18 +26,21 @@ enum intel_memory_type {
 	INTEL_MEMORY_SYSTEM = 0,
 	INTEL_MEMORY_LOCAL,
 	INTEL_MEMORY_STOLEN_SYSTEM,
+	INTEL_MEMORY_STOLEN_LOCAL,
 };
 
 enum intel_region_id {
 	INTEL_REGION_SMEM = 0,
 	INTEL_REGION_LMEM,
 	INTEL_REGION_STOLEN_SMEM,
+	INTEL_REGION_STOLEN_LMEM,
 	INTEL_REGION_UNKNOWN, /* Should be last */
 };
 
 #define REGION_SMEM     BIT(INTEL_REGION_SMEM)
 #define REGION_LMEM     BIT(INTEL_REGION_LMEM)
 #define REGION_STOLEN_SMEM   BIT(INTEL_REGION_STOLEN_SMEM)
+#define REGION_STOLEN_LMEM   BIT(INTEL_REGION_STOLEN_LMEM)
 
 #define I915_ALLOC_MIN_PAGE_SIZE  BIT(0)
 #define I915_ALLOC_CONTIGUOUS     BIT(1)
@@ -82,7 +85,7 @@ struct intel_memory_region {
 	u16 type;
 	u16 instance;
 	enum intel_region_id id;
-	char name[8];
+	char name[16];
 
 	struct list_head reserved;
 
-- 
2.26.3

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 130+ messages in thread

* [PATCH 04/19] drm/i915/stolen: treat stolen local as normal local memory
  2021-04-12  9:05 ` [Intel-gfx] " Matthew Auld
@ 2021-04-12  9:05   ` Matthew Auld
  -1 siblings, 0 replies; 130+ messages in thread
From: Matthew Auld @ 2021-04-12  9:05 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel

Stolen local memory is the same device local memory underneath, so things
like the PTE_LM bits for the GTT should just keep working as-is.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
---
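The change from comparing obj->ops to checking the region type can be sketched
in isolation. This is a simplified model under stated assumptions: the enum and
struct names below are invented for illustration and only loosely mirror
intel_memory_type, intel_memory_region and drm_i915_gem_object.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical mirror of enum intel_memory_type after patch 03. */
enum mem_type_sketch {
	MEM_SYSTEM_SKETCH = 0,
	MEM_LOCAL_SKETCH,
	MEM_STOLEN_SYSTEM_SKETCH,
	MEM_STOLEN_LOCAL_SKETCH,
};

struct region_sketch {
	enum mem_type_sketch type;
};

struct object_sketch {
	const struct region_sketch *region; /* may be NULL */
};

/*
 * An object counts as local memory when its region is either normal local
 * memory or stolen local memory; an object without a region is not lmem.
 */
static bool object_is_lmem(const struct object_sketch *obj)
{
	const struct region_sketch *mr = obj->region;

	return mr && (mr->type == MEM_LOCAL_SKETCH ||
		      mr->type == MEM_STOLEN_LOCAL_SKETCH);
}
```

Keying on the region type means a stolen object, which uses the stolen object
ops rather than the lmem ops, is still reported as local memory when it lives
in the stolen-local region.
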
 drivers/gpu/drm/i915/gem/i915_gem_lmem.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_lmem.c b/drivers/gpu/drm/i915/gem/i915_gem_lmem.c
index ce1c83c13d05..017db8f71130 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_lmem.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_lmem.c
@@ -19,7 +19,10 @@ const struct drm_i915_gem_object_ops i915_gem_lmem_obj_ops = {
 
 bool i915_gem_object_is_lmem(struct drm_i915_gem_object *obj)
 {
-	return obj->ops == &i915_gem_lmem_obj_ops;
+	struct intel_memory_region *mr = obj->mm.region;
+
+	return mr && (mr->type == INTEL_MEMORY_LOCAL ||
+		      mr->type == INTEL_MEMORY_STOLEN_LOCAL);
 }
 
 struct drm_i915_gem_object *
-- 
2.26.3


* [PATCH 05/19] drm/i915/stolen: enforce the min_page_size contract
  2021-04-12  9:05 ` [Intel-gfx] " Matthew Auld
@ 2021-04-12  9:05   ` Matthew Auld
  -1 siblings, 0 replies; 130+ messages in thread
From: Matthew Auld @ 2021-04-12  9:05 UTC (permalink / raw)
  To: intel-gfx; +Cc: CQ Tang, dri-devel

From: CQ Tang <cq.tang@intel.com>

Since stolen memory can now be backed by device local memory, we should
enforce the region's min_page_size restrictions when allocating pages.

Signed-off-by: CQ Tang <cq.tang@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
---
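The alignment contract checked in the preallocated path can be sketched as
below. This is a standalone illustration, not the driver's code: is_aligned()
and prealloc_range_ok() are hypothetical helpers modelled on IS_ALIGNED() and
the GEM_WARN_ON() checks in the second hunk, and min_page_size is assumed to be
a power of two.

```c
#include <stdbool.h>
#include <stdint.h>

/* True when x is a multiple of a; a is assumed to be a power of two. */
static bool is_aligned(uint64_t x, uint64_t a)
{
	return (x & (a - 1)) == 0;
}

/*
 * Hypothetical check mirroring the preallocated-object validation: the size
 * must be non-zero, and both size and offset must honour the region's
 * minimum page size.
 */
static bool prealloc_range_ok(uint64_t offset, uint64_t size,
			      uint64_t min_page_size)
{
	return size != 0 &&
	       is_aligned(size, min_page_size) &&
	       is_aligned(offset, min_page_size);
}
```

With a 4 KiB minimum page size this behaves like the old fixed 4096 checks. A
larger value, say 64 KiB (purely illustrative, the regions created in this
series use 4 KiB), would reject a 4 KiB-aligned offset that the old code
accepted.
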
 drivers/gpu/drm/i915/gem/i915_gem_stolen.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_stolen.c b/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
index 56dd58bef5ee..f713eabb7671 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
@@ -677,7 +677,8 @@ static int _i915_gem_object_stolen_init(struct intel_memory_region *mem,
 	if (!stolen)
 		return -ENOMEM;
 
-	ret = i915_gem_stolen_insert_node(i915, stolen, size, 4096);
+	ret = i915_gem_stolen_insert_node(i915, stolen, size,
+					  mem->min_page_size);
 	if (ret)
 		goto err_free;
 
@@ -817,8 +818,8 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_i915_private *i915,
 
 	/* KISS and expect everything to be page-aligned */
 	if (GEM_WARN_ON(size == 0) ||
-	    GEM_WARN_ON(!IS_ALIGNED(size, I915_GTT_PAGE_SIZE)) ||
-	    GEM_WARN_ON(!IS_ALIGNED(stolen_offset, I915_GTT_MIN_ALIGNMENT)))
+	    GEM_WARN_ON(!IS_ALIGNED(size, mem->min_page_size)) ||
+	    GEM_WARN_ON(!IS_ALIGNED(stolen_offset, mem->min_page_size)))
 		return ERR_PTR(-EINVAL);
 
 	stolen = kzalloc(sizeof(*stolen), GFP_KERNEL);
-- 
2.26.3


* [PATCH 06/19] drm/i915/stolen: pass the allocation flags
  2021-04-12  9:05 ` [Intel-gfx] " Matthew Auld
@ 2021-04-12  9:05   ` Matthew Auld
  -1 siblings, 0 replies; 130+ messages in thread
From: Matthew Auld @ 2021-04-12  9:05 UTC (permalink / raw)
  To: intel-gfx; +Cc: CQ Tang, dri-devel

From: CQ Tang <cq.tang@intel.com>

Stolen memory is always allocated as physically contiguous pages, so mark
the object flags as such.

Signed-off-by: CQ Tang <cq.tang@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
---
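The effect of threading the flags through can be sketched as follows. This is
a toy model, assuming invented names: BO_ALLOC_CONTIGUOUS_SKETCH stands in for
I915_BO_ALLOC_CONTIGUOUS, and the two functions loosely correspond to
__i915_gem_object_create_stolen() and the preallocated caller.

```c
#include <stdbool.h>

/* Stand-in for I915_BO_ALLOC_CONTIGUOUS; the bit position is illustrative. */
#define BO_ALLOC_CONTIGUOUS_SKETCH (1u << 0)

struct object_sketch {
	unsigned int flags;
};

/* Common constructor: records whatever flags the caller passes in. */
static void create_stolen_object(struct object_sketch *obj, unsigned int flags)
{
	obj->flags = flags;
}

/* Preallocated path: stolen is always contiguous, so say so explicitly. */
static void create_stolen_prealloc(struct object_sketch *obj)
{
	create_stolen_object(obj, BO_ALLOC_CONTIGUOUS_SKETCH);
}

static bool object_is_contiguous(const struct object_sketch *obj)
{
	return (obj->flags & BO_ALLOC_CONTIGUOUS_SKETCH) != 0;
}
```

Before this patch the constructor always initialised the object with flags 0,
which in this model corresponds to create_stolen_object(obj, 0) and an object
that does not report itself as contiguous.
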
 drivers/gpu/drm/i915/gem/i915_gem_stolen.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_stolen.c b/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
index f713eabb7671..49a2dfcc8ba7 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
@@ -633,14 +633,15 @@ static const struct drm_i915_gem_object_ops i915_gem_object_stolen_ops = {
 
 static int __i915_gem_object_create_stolen(struct intel_memory_region *mem,
 					   struct drm_i915_gem_object *obj,
-					   struct drm_mm_node *stolen)
+					   struct drm_mm_node *stolen,
+					   unsigned int flags)
 {
 	static struct lock_class_key lock_class;
 	unsigned int cache_level;
 	int err;
 
 	drm_gem_private_object_init(&mem->i915->drm, &obj->base, stolen->size);
-	i915_gem_object_init(obj, &i915_gem_object_stolen_ops, &lock_class, 0);
+	i915_gem_object_init(obj, &i915_gem_object_stolen_ops, &lock_class, flags);
 
 	obj->stolen = stolen;
 	obj->read_domains = I915_GEM_DOMAIN_CPU | I915_GEM_DOMAIN_GTT;
@@ -682,7 +683,7 @@ static int _i915_gem_object_stolen_init(struct intel_memory_region *mem,
 	if (ret)
 		goto err_free;
 
-	ret = __i915_gem_object_create_stolen(mem, obj, stolen);
+	ret = __i915_gem_object_create_stolen(mem, obj, stolen, flags);
 	if (ret)
 		goto err_remove;
 
@@ -840,7 +841,8 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_i915_private *i915,
 		goto err_stolen;
 	}
 
-	ret = __i915_gem_object_create_stolen(mem, obj, stolen);
+	ret = __i915_gem_object_create_stolen(mem, obj, stolen,
+					      I915_BO_ALLOC_CONTIGUOUS);
 	if (ret)
 		goto err_object_free;
 
-- 
2.26.3


* [PATCH 07/19] drm/i915/fbdev: Use lmem physical addresses for fb_mmap() on discrete
  2021-04-12  9:05 ` [Intel-gfx] " Matthew Auld
@ 2021-04-12  9:05   ` Matthew Auld
  -1 siblings, 0 replies; 130+ messages in thread
From: Matthew Auld @ 2021-04-12  9:05 UTC (permalink / raw)
  To: intel-gfx; +Cc: Mohammed Khajapasha, dri-devel

From: Mohammed Khajapasha <mohammed.khajapasha@intel.com>

Use the local memory IO BAR address for fbdev's fb_mmap() operation on
discrete, since fbdev uses the physical address of our framebuffer for
its fb_mmap() function.

Signed-off-by: Mohammed Khajapasha <mohammed.khajapasha@intel.com>
---
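The two ways of filling in fix.smem_start and fix.smem_len can be compared in
a standalone sketch. The struct and function names here are hypothetical and
the fields are plain integers standing in for the values intelfb_create()
reads; the comments note which driver expression each one is assumed to model.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the values intelfb_create() reads. */
struct fb_inputs_sketch {
	bool is_lmem;            /* i915_gem_object_is_lmem(obj) */
	uint64_t lmem_io_start;  /* mem->io_start */
	uint64_t obj_dma_addr;   /* i915_gem_object_get_dma_address(obj, 0) */
	uint64_t obj_size;       /* obj->base.size */
	uint64_t gmadr_start;    /* ggtt->gmadr.start */
	uint64_t vma_start;      /* vma->node.start */
	uint64_t vma_size;       /* vma->node.size */
};

struct fb_fix_sketch {
	uint64_t smem_start;
	uint64_t smem_len;
};

/*
 * On discrete the framebuffer address is derived from the local-memory IO
 * BAR; otherwise it is derived from the mappable GGTT aperture.
 */
static struct fb_fix_sketch fb_fix_for(const struct fb_inputs_sketch *in)
{
	struct fb_fix_sketch fix;

	if (in->is_lmem) {
		fix.smem_start = in->lmem_io_start + in->obj_dma_addr;
		fix.smem_len = in->obj_size;
	} else {
		fix.smem_start = in->gmadr_start + in->vma_start;
		fix.smem_len = in->vma_size;
	}
	return fix;
}
```

Note that the length also changes source: the lmem path reports the object
size, while the aperture path reports the size of the GGTT node.
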
 drivers/gpu/drm/i915/display/intel_fbdev.c | 29 +++++++++++++++++-----
 1 file changed, 23 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_fbdev.c b/drivers/gpu/drm/i915/display/intel_fbdev.c
index ccd00e65a5fe..2b37959da747 100644
--- a/drivers/gpu/drm/i915/display/intel_fbdev.c
+++ b/drivers/gpu/drm/i915/display/intel_fbdev.c
@@ -41,6 +41,8 @@
 #include <drm/drm_fb_helper.h>
 #include <drm/drm_fourcc.h>
 
+#include "gem/i915_gem_lmem.h"
+
 #include "i915_drv.h"
 #include "intel_display_types.h"
 #include "intel_fbdev.h"
@@ -178,6 +180,7 @@ static int intelfb_create(struct drm_fb_helper *helper,
 	unsigned long flags = 0;
 	bool prealloc = false;
 	void __iomem *vaddr;
+	struct drm_i915_gem_object *obj;
 	int ret;
 
 	if (intel_fb &&
@@ -232,13 +235,27 @@ static int intelfb_create(struct drm_fb_helper *helper,
 	info->fbops = &intelfb_ops;
 
 	/* setup aperture base/size for vesafb takeover */
-	info->apertures->ranges[0].base = ggtt->gmadr.start;
-	info->apertures->ranges[0].size = ggtt->mappable_end;
+	obj = intel_fb_obj(&intel_fb->base);
+	if (i915_gem_object_is_lmem(obj)) {
+		struct intel_memory_region *mem = obj->mm.region;
+
+		info->apertures->ranges[0].base = mem->io_start;
+		info->apertures->ranges[0].size = mem->total;
+
+		/* Use fbdev's framebuffer from lmem for discrete */
+		info->fix.smem_start =
+			(unsigned long)(mem->io_start +
+					i915_gem_object_get_dma_address(obj, 0));
+		info->fix.smem_len = obj->base.size;
+	} else {
+		info->apertures->ranges[0].base = ggtt->gmadr.start;
+		info->apertures->ranges[0].size = ggtt->mappable_end;
 
-	/* Our framebuffer is the entirety of fbdev's system memory */
-	info->fix.smem_start =
-		(unsigned long)(ggtt->gmadr.start + vma->node.start);
-	info->fix.smem_len = vma->node.size;
+		/* Our framebuffer is the entirety of fbdev's system memory */
+		info->fix.smem_start =
+			(unsigned long)(ggtt->gmadr.start + vma->node.start);
+		info->fix.smem_len = vma->node.size;
+	}
 
 	vaddr = i915_vma_pin_iomap(vma);
 	if (IS_ERR(vaddr)) {
-- 
2.26.3


* [PATCH 08/19] drm/i915: Return error value when bo not in LMEM for discrete
  2021-04-12  9:05 ` [Intel-gfx] " Matthew Auld
@ 2021-04-12  9:05   ` Matthew Auld
  -1 siblings, 0 replies; 130+ messages in thread
From: Matthew Auld @ 2021-04-12  9:05 UTC (permalink / raw)
  To: intel-gfx; +Cc: Mohammed Khajapasha, dri-devel

From: Mohammed Khajapasha <mohammed.khajapasha@intel.com>

Return -EREMOTE when a framebuffer object is not backed by LMEM on
discrete. If the hardware supports local memory, the GEM objects backing
a framebuffer should come from local memory.

Signed-off-by: Mohammed Khajapasha <mohammed.khajapasha@intel.com>
---
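The placement rule added to intel_user_framebuffer_create() reduces to a
two-input decision. A minimal sketch, assuming a hypothetical helper
check_fb_placement() whose boolean arguments stand in for HAS_LMEM(i915) and
i915_gem_object_is_lmem(obj):

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical helper: returns 0 when the object may back a framebuffer,
 * or -EREMOTE when the device has local memory but the object is not in it.
 */
static int check_fb_placement(bool has_lmem, bool obj_is_lmem)
{
	if (has_lmem && !obj_is_lmem)
		return -EREMOTE;

	return 0;
}
```

Devices without local memory are unaffected: check_fb_placement(false, false)
returns 0, so integrated platforms keep accepting system-memory objects.
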
 drivers/gpu/drm/i915/display/intel_display.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c
index 411b46c012f8..57b06d8728af 100644
--- a/drivers/gpu/drm/i915/display/intel_display.c
+++ b/drivers/gpu/drm/i915/display/intel_display.c
@@ -63,6 +63,7 @@
 #include "display/intel_vdsc.h"
 #include "display/intel_vrr.h"
 
+#include "gem/i915_gem_lmem.h"
 #include "gem/i915_gem_object.h"
 
 #include "gt/intel_rps.h"
@@ -11279,11 +11280,20 @@ intel_user_framebuffer_create(struct drm_device *dev,
 	struct drm_framebuffer *fb;
 	struct drm_i915_gem_object *obj;
 	struct drm_mode_fb_cmd2 mode_cmd = *user_mode_cmd;
+	struct drm_i915_private *i915;
 
 	obj = i915_gem_object_lookup(filp, mode_cmd.handles[0]);
 	if (!obj)
 		return ERR_PTR(-ENOENT);
 
+	/* On discrete, the object must be backed by LMEM */
+	i915 = to_i915(obj->base.dev);
+	if (HAS_LMEM(i915) && !i915_gem_object_is_lmem(obj)) {
+		/* object is "remote", not in local memory */
+		i915_gem_object_put(obj);
+		return ERR_PTR(-EREMOTE);
+	}
+
 	fb = intel_framebuffer_create(obj, &mode_cmd);
 	i915_gem_object_put(obj);
 
-- 
2.26.3


* [PATCH 09/19] drm/i915/lmem: Fail driver init if LMEM training failed
  2021-04-12  9:05 ` [Intel-gfx] " Matthew Auld
@ 2021-04-12  9:05   ` Matthew Auld
  -1 siblings, 0 replies; 130+ messages in thread
From: Matthew Auld @ 2021-04-12  9:05 UTC (permalink / raw)
  To: intel-gfx; +Cc: Caz Yokoyama, dri-devel

From: Matt Roper <matthew.d.roper@intel.com>

Boot firmware performs memory training and health assessment during
startup.  If the memory training fails, the firmware will consider the
GPU unusable and will instruct the punit to keep the GT powered down.
If this happens, our driver will be unable to communicate with the GT
(all GT registers will read back as 0, forcewake requests will time out,
etc.), so we should abort driver initialization.  We can confirm that
LMEM was initialized successfully via the sgunit register GU_CNTL.

Bspec: 53111
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Cc: Caz Yokoyama <Caz.Yokoyama@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
---
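The new check in intel_uncore_init_mmio() is a single bit test on GU_CNTL. A
standalone sketch, assuming a hypothetical check_lmem_init() that takes the
raw register value as an argument instead of reading it from hardware;
LMEM_INIT_SKETCH mirrors the REG_BIT(7) definition added by this patch:

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Mirrors LMEM_INIT, i.e. bit 7 of GU_CNTL as defined in this patch. */
#define LMEM_INIT_SKETCH (UINT32_C(1) << 7)

/*
 * Hypothetical helper: on discrete parts, refuse to continue when firmware
 * has not flagged local memory as initialized. Integrated parts skip the
 * check entirely.
 */
static int check_lmem_init(bool is_dgfx, uint32_t gu_cntl)
{
	if (is_dgfx && !(gu_cntl & LMEM_INIT_SKETCH))
		return -ENODEV;

	return 0;
}
```

Only bit 7 matters here: a value such as 0x7f, with every lower bit set, still
fails the check on a discrete part.
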
 drivers/gpu/drm/i915/i915_reg.h     |  3 +++
 drivers/gpu/drm/i915/intel_uncore.c | 12 ++++++++++++
 2 files changed, 15 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 4108f2a7ebfa..da73dc939e58 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -487,6 +487,9 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg)
 #define GAB_CTL				_MMIO(0x24000)
 #define   GAB_CTL_CONT_AFTER_PAGEFAULT	(1 << 8)
 
+#define GU_CNTL				_MMIO(0x101010)
+#define   LMEM_INIT			REG_BIT(7)
+
 #define GEN6_STOLEN_RESERVED		_MMIO(0x1082C0)
 #define GEN6_STOLEN_RESERVED_ADDR_MASK	(0xFFF << 20)
 #define GEN7_STOLEN_RESERVED_ADDR_MASK	(0x3FFF << 18)
diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
index 661b50191f2b..4d0605757428 100644
--- a/drivers/gpu/drm/i915/intel_uncore.c
+++ b/drivers/gpu/drm/i915/intel_uncore.c
@@ -1917,6 +1917,18 @@ int intel_uncore_init_mmio(struct intel_uncore *uncore)
 	if (ret)
 		return ret;
 
+	/*
+	 * The boot firmware initializes local memory and assesses its health.
+	 * If memory training fails, the punit will have been instructed to
+	 * keep the GT powered down; we won't be able to communicate with it
+	 * and we should not continue with driver initialization.
+	 */
+	if (IS_DGFX(i915) &&
+	    !(__raw_uncore_read32(uncore, GU_CNTL) & LMEM_INIT)) {
+		drm_err(&i915->drm, "LMEM not initialized by firmware\n");
+		return -ENODEV;
+	}
+
 	if (INTEL_GEN(i915) > 5 && !intel_vgpu_active(i915))
 		uncore->flags |= UNCORE_HAS_FORCEWAKE;
 
-- 
2.26.3

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 130+ messages in thread

* [PATCH 10/19] drm/i915/dg1: Fix mapping type for default state object
  2021-04-12  9:05 ` [Intel-gfx] " Matthew Auld
@ 2021-04-12  9:05   ` Matthew Auld
  -1 siblings, 0 replies; 130+ messages in thread
From: Matthew Auld @ 2021-04-12  9:05 UTC (permalink / raw)
  To: intel-gfx; +Cc: Venkata Ramana Nayana, dri-devel

From: Venkata Ramana Nayana <venkata.ramana.nayana@intel.com>

Use I915_MAP_WC when the default state object is allocated in LMEM.

Signed-off-by: Venkata Ramana Nayana <venkata.ramana.nayana@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
---
 drivers/gpu/drm/i915/gt/shmem_utils.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/shmem_utils.c b/drivers/gpu/drm/i915/gt/shmem_utils.c
index f8f02aab842b..0683b27a3890 100644
--- a/drivers/gpu/drm/i915/gt/shmem_utils.c
+++ b/drivers/gpu/drm/i915/gt/shmem_utils.c
@@ -8,6 +8,7 @@
 #include <linux/shmem_fs.h>
 
 #include "gem/i915_gem_object.h"
+#include "gem/i915_gem_lmem.h"
 #include "shmem_utils.h"
 
 struct file *shmem_create_from_data(const char *name, void *data, size_t len)
@@ -39,7 +40,8 @@ struct file *shmem_create_from_object(struct drm_i915_gem_object *obj)
 		return file;
 	}
 
-	ptr = i915_gem_object_pin_map_unlocked(obj, I915_MAP_WB);
+	ptr = i915_gem_object_pin_map_unlocked(obj, i915_gem_object_is_lmem(obj) ?
+						I915_MAP_WC : I915_MAP_WB);
 	if (IS_ERR(ptr))
 		return ERR_CAST(ptr);
 
-- 
2.26.3

^ permalink raw reply related	[flat|nested] 130+ messages in thread

* [PATCH 11/19] drm/i915: Update the helper to set correct mapping
  2021-04-12  9:05 ` [Intel-gfx] " Matthew Auld
@ 2021-04-12  9:05   ` Matthew Auld
  -1 siblings, 0 replies; 130+ messages in thread
From: Matthew Auld @ 2021-04-12  9:05 UTC (permalink / raw)
  To: intel-gfx
  Cc: CQ Tang, Venkata Sandeep Dhanalakota, dri-devel, Michal Wajdeczko

From: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>

Determine the coherent map type based on the object's location, whether
the target has LLC, and whether the caller requires an always-coherent
mapping.

Cc: Matthew Auld <matthew.auld@intel.com>
Cc: CQ Tang <cq.tang@intel.com>
Suggested-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>
---
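Not part of the patch: a small stand-alone sketch of the decision the
updated helper makes, as I read the i915_drv.h hunk below.  The enum and
the three boolean parameters are hypothetical and replace the driver's
i915_gem_object_is_lmem(), HAS_LLC() and always_coherent inputs so that
the logic can be compiled and checked in user space.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for I915_MAP_WB and I915_MAP_WC. */
enum map_type { MAP_WB, MAP_WC };

/*
 * Model of the selection order in the updated helper:
 *   1. objects in local memory are always mapped write-combined;
 *   2. otherwise write-back if the platform has LLC or the caller
 *      asked for an always-coherent mapping;
 *   3. otherwise write-combined.
 */
static enum map_type coherent_map_type(bool is_lmem, bool has_llc,
				       bool always_coherent)
{
	if (is_lmem)
		return MAP_WC;
	if (has_llc || always_coherent)
		return MAP_WB;

	return MAP_WC;
}
```

Note that the LMEM test comes first, so always_coherent never yields a
write-back mapping for an object in local memory.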
 drivers/gpu/drm/i915/gt/intel_engine_cs.c    |  3 ++-
 drivers/gpu/drm/i915/gt/intel_engine_pm.c    |  2 +-
 drivers/gpu/drm/i915/gt/intel_lrc.c          |  4 +++-
 drivers/gpu/drm/i915/gt/intel_ring.c         |  9 ++++++---
 drivers/gpu/drm/i915/gt/selftest_context.c   |  3 ++-
 drivers/gpu/drm/i915/gt/selftest_hangcheck.c |  4 ++--
 drivers/gpu/drm/i915/gt/selftest_lrc.c       |  4 +++-
 drivers/gpu/drm/i915/gt/uc/intel_guc.c       |  4 +++-
 drivers/gpu/drm/i915/gt/uc/intel_huc.c       |  4 +++-
 drivers/gpu/drm/i915/i915_drv.h              | 11 +++++++++--
 drivers/gpu/drm/i915/selftests/igt_spinner.c |  4 ++--
 11 files changed, 36 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index efe935f80c1a..b79568d370f5 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -664,7 +664,8 @@ static int init_status_page(struct intel_engine_cs *engine)
 	if (ret)
 		goto err;
 
-	vaddr = i915_gem_object_pin_map(obj, I915_MAP_WB);
+	vaddr = i915_gem_object_pin_map(obj,
+					i915_coherent_map_type(engine->i915, obj, true));
 	if (IS_ERR(vaddr)) {
 		ret = PTR_ERR(vaddr);
 		goto err_unpin;
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
index 7c9af86fdb1e..47f4397095e5 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
@@ -23,7 +23,7 @@ static void dbg_poison_ce(struct intel_context *ce)
 
 	if (ce->state) {
 		struct drm_i915_gem_object *obj = ce->state->obj;
-		int type = i915_coherent_map_type(ce->engine->i915);
+		int type = i915_coherent_map_type(ce->engine->i915, obj, true);
 		void *map;
 
 		if (!i915_gem_object_trylock(obj))
diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
index e86897cde984..aafe2a4df496 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -903,7 +903,9 @@ lrc_pre_pin(struct intel_context *ce,
 	GEM_BUG_ON(!i915_vma_is_pinned(ce->state));
 
 	*vaddr = i915_gem_object_pin_map(ce->state->obj,
-					 i915_coherent_map_type(ce->engine->i915) |
+					 i915_coherent_map_type(ce->engine->i915,
+								ce->state->obj,
+								false) |
 					 I915_MAP_OVERRIDE);
 
 	return PTR_ERR_OR_ZERO(*vaddr);
diff --git a/drivers/gpu/drm/i915/gt/intel_ring.c b/drivers/gpu/drm/i915/gt/intel_ring.c
index aee0a77c77e0..3cf6c7e68108 100644
--- a/drivers/gpu/drm/i915/gt/intel_ring.c
+++ b/drivers/gpu/drm/i915/gt/intel_ring.c
@@ -53,9 +53,12 @@ int intel_ring_pin(struct intel_ring *ring, struct i915_gem_ww_ctx *ww)
 
 	if (i915_vma_is_map_and_fenceable(vma))
 		addr = (void __force *)i915_vma_pin_iomap(vma);
-	else
-		addr = i915_gem_object_pin_map(vma->obj,
-					       i915_coherent_map_type(vma->vm->i915));
+	else {
+		int type = i915_coherent_map_type(vma->vm->i915, vma->obj, false);
+
+		addr = i915_gem_object_pin_map(vma->obj, type);
+	}
+
 	if (IS_ERR(addr)) {
 		ret = PTR_ERR(addr);
 		goto err_ring;
diff --git a/drivers/gpu/drm/i915/gt/selftest_context.c b/drivers/gpu/drm/i915/gt/selftest_context.c
index b9bdd1d23243..26685b927169 100644
--- a/drivers/gpu/drm/i915/gt/selftest_context.c
+++ b/drivers/gpu/drm/i915/gt/selftest_context.c
@@ -88,7 +88,8 @@ static int __live_context_size(struct intel_engine_cs *engine)
 		goto err;
 
 	vaddr = i915_gem_object_pin_map_unlocked(ce->state->obj,
-						 i915_coherent_map_type(engine->i915));
+						 i915_coherent_map_type(engine->i915,
+									ce->state->obj, false));
 	if (IS_ERR(vaddr)) {
 		err = PTR_ERR(vaddr);
 		intel_context_unpin(ce);
diff --git a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
index 746985971c3a..5b63d4df8c93 100644
--- a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
+++ b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
@@ -69,7 +69,7 @@ static int hang_init(struct hang *h, struct intel_gt *gt)
 	h->seqno = memset(vaddr, 0xff, PAGE_SIZE);
 
 	vaddr = i915_gem_object_pin_map_unlocked(h->obj,
-						 i915_coherent_map_type(gt->i915));
+						 i915_coherent_map_type(gt->i915, h->obj, false));
 	if (IS_ERR(vaddr)) {
 		err = PTR_ERR(vaddr);
 		goto err_unpin_hws;
@@ -130,7 +130,7 @@ hang_create_request(struct hang *h, struct intel_engine_cs *engine)
 		return ERR_CAST(obj);
 	}
 
-	vaddr = i915_gem_object_pin_map_unlocked(obj, i915_coherent_map_type(gt->i915));
+	vaddr = i915_gem_object_pin_map_unlocked(obj, i915_coherent_map_type(gt->i915, obj, false));
 	if (IS_ERR(vaddr)) {
 		i915_gem_object_put(obj);
 		i915_vm_put(vm);
diff --git a/drivers/gpu/drm/i915/gt/selftest_lrc.c b/drivers/gpu/drm/i915/gt/selftest_lrc.c
index 85e7df6a5123..d8f6623524e8 100644
--- a/drivers/gpu/drm/i915/gt/selftest_lrc.c
+++ b/drivers/gpu/drm/i915/gt/selftest_lrc.c
@@ -1221,7 +1221,9 @@ static int compare_isolation(struct intel_engine_cs *engine,
 	}
 
 	lrc = i915_gem_object_pin_map_unlocked(ce->state->obj,
-				      i915_coherent_map_type(engine->i915));
+					       i915_coherent_map_type(engine->i915,
+								      ce->state->obj,
+								      false));
 	if (IS_ERR(lrc)) {
 		err = PTR_ERR(lrc);
 		goto err_B1;
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.c b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
index 78305b2ec89d..adae04c47aab 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
@@ -682,7 +682,9 @@ int intel_guc_allocate_and_map_vma(struct intel_guc *guc, u32 size,
 	if (IS_ERR(vma))
 		return PTR_ERR(vma);
 
-	vaddr = i915_gem_object_pin_map_unlocked(vma->obj, I915_MAP_WB);
+	vaddr = i915_gem_object_pin_map_unlocked(vma->obj,
+						 i915_coherent_map_type(guc_to_gt(guc)->i915,
+									vma->obj, true));
 	if (IS_ERR(vaddr)) {
 		i915_vma_unpin_and_release(&vma, 0);
 		return PTR_ERR(vaddr);
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_huc.c b/drivers/gpu/drm/i915/gt/uc/intel_huc.c
index 2126dd81ac38..56d2144dc6a0 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_huc.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_huc.c
@@ -82,7 +82,9 @@ static int intel_huc_rsa_data_create(struct intel_huc *huc)
 	if (IS_ERR(vma))
 		return PTR_ERR(vma);
 
-	vaddr = i915_gem_object_pin_map_unlocked(vma->obj, I915_MAP_WB);
+	vaddr = i915_gem_object_pin_map_unlocked(vma->obj,
+						 i915_coherent_map_type(gt->i915,
+									vma->obj, true));
 	if (IS_ERR(vaddr)) {
 		i915_vma_unpin_and_release(&vma, 0);
 		return PTR_ERR(vaddr);
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 69e43bf91a15..2abbc06712a4 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -78,6 +78,7 @@
 #include "gem/i915_gem_context_types.h"
 #include "gem/i915_gem_shrinker.h"
 #include "gem/i915_gem_stolen.h"
+#include "gem/i915_gem_lmem.h"
 
 #include "gt/intel_engine.h"
 #include "gt/intel_gt_types.h"
@@ -1921,9 +1922,15 @@ static inline int intel_hws_csb_write_index(struct drm_i915_private *i915)
 }
 
 static inline enum i915_map_type
-i915_coherent_map_type(struct drm_i915_private *i915)
+i915_coherent_map_type(struct drm_i915_private *i915,
+		       struct drm_i915_gem_object *obj, bool always_coherent)
 {
-	return HAS_LLC(i915) ? I915_MAP_WB : I915_MAP_WC;
+	if (i915_gem_object_is_lmem(obj))
+		return I915_MAP_WC;
+	if (HAS_LLC(i915) || always_coherent)
+		return I915_MAP_WB;
+	else
+		return I915_MAP_WC;
 }
 
 #endif
diff --git a/drivers/gpu/drm/i915/selftests/igt_spinner.c b/drivers/gpu/drm/i915/selftests/igt_spinner.c
index cfbbe415b57c..5fe397b7d1d9 100644
--- a/drivers/gpu/drm/i915/selftests/igt_spinner.c
+++ b/drivers/gpu/drm/i915/selftests/igt_spinner.c
@@ -94,9 +94,9 @@ int igt_spinner_pin(struct igt_spinner *spin,
 	}
 
 	if (!spin->batch) {
-		unsigned int mode =
-			i915_coherent_map_type(spin->gt->i915);
+		unsigned int mode;
 
+		mode = i915_coherent_map_type(spin->gt->i915, spin->obj, false);
 		vaddr = igt_spinner_pin_obj(ce, ww, spin->obj, mode, &spin->batch_vma);
 		if (IS_ERR(vaddr))
 			return PTR_ERR(vaddr);
-- 
2.26.3

^ permalink raw reply related	[flat|nested] 130+ messages in thread

* [PATCH 12/19] drm/i915/lmem: Bypass aperture when lmem is available
  2021-04-12  9:05 ` [Intel-gfx] " Matthew Auld
@ 2021-04-12  9:05   ` Matthew Auld
  -1 siblings, 0 replies; 130+ messages in thread
From: Matthew Auld @ 2021-04-12  9:05 UTC (permalink / raw)
  To: intel-gfx
  Cc: Anusha Srivatsa, Chris P Wilson, CQ Tang, Daniele Ceraolo Spurio,
	dri-devel, Daniel Vetter, Dhinakaran Pandiyan

From: Anusha Srivatsa <anusha.srivatsa@intel.com>

In the scenario where local memory is available, we have to
rely on CPU access via lmem directly instead of the aperture.

v2:
gmch is only relevant for much older hw, therefore we can drop the
has_aperture check since it should always be present on such platforms.
(Chris)

Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Chris P Wilson <chris.p.wilson@intel.com>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: CQ Tang <cq.tang@intel.com>
Signed-off-by: Anusha Srivatsa <anusha.srivatsa@intel.com>
---
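Not part of the patch: a minimal sketch of the address arithmetic that
i915_gem_object_lmem_io_map() performs in the hunk below, assuming the
object is contiguous as the GEM_BUG_ON() requires.  lmem_io_offset() and
its parameters are hypothetical names; dma_addr stands in for the value
returned by i915_gem_object_get_dma_address() and region_start for
obj->mm.region->region.start.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Translate a device (DMA) address inside the local-memory region into
 * an offset relative to the start of that region.  The driver passes
 * this offset to io_mapping_map_wc() on the region's iomap; mapping a
 * whole object from one offset only works if its pages are contiguous.
 */
static uint64_t lmem_io_offset(uint64_t dma_addr, uint64_t region_start)
{
	return dma_addr - region_start;
}
```

For example, with the region starting at 0x4000000000, an object at
device address 0x4000100000 would be mapped at offset 0x100000 into the
region's iomap.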
 drivers/gpu/drm/i915/display/intel_fbdev.c | 22 +++++++++++++++-------
 drivers/gpu/drm/i915/gem/i915_gem_lmem.c   | 15 +++++++++++++++
 drivers/gpu/drm/i915/gem/i915_gem_lmem.h   |  5 +++++
 drivers/gpu/drm/i915/i915_vma.c            | 19 +++++++++++++------
 4 files changed, 48 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_fbdev.c b/drivers/gpu/drm/i915/display/intel_fbdev.c
index 2b37959da747..4af40229f5ec 100644
--- a/drivers/gpu/drm/i915/display/intel_fbdev.c
+++ b/drivers/gpu/drm/i915/display/intel_fbdev.c
@@ -139,14 +139,22 @@ static int intelfb_alloc(struct drm_fb_helper *helper,
 	size = mode_cmd.pitches[0] * mode_cmd.height;
 	size = PAGE_ALIGN(size);
 
-	/* If the FB is too big, just don't use it since fbdev is not very
-	 * important and we should probably use that space with FBC or other
-	 * features. */
 	obj = ERR_PTR(-ENODEV);
-	if (size * 2 < dev_priv->stolen_usable_size)
-		obj = i915_gem_object_create_stolen(dev_priv, size);
-	if (IS_ERR(obj))
-		obj = i915_gem_object_create_shmem(dev_priv, size);
+	if (HAS_LMEM(dev_priv)) {
+		obj = i915_gem_object_create_lmem(dev_priv, size,
+						  I915_BO_ALLOC_CONTIGUOUS);
+	} else {
+		/*
+		 * If the FB is too big, just don't use it since fbdev is not very
+		 * important and we should probably use that space with FBC or other
+		 * features.
+		 */
+		if (size * 2 < dev_priv->stolen_usable_size)
+			obj = i915_gem_object_create_stolen(dev_priv, size);
+		if (IS_ERR(obj))
+			obj = i915_gem_object_create_shmem(dev_priv, size);
+	}
+
 	if (IS_ERR(obj)) {
 		drm_err(&dev_priv->drm, "failed to allocate framebuffer\n");
 		return PTR_ERR(obj);
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_lmem.c b/drivers/gpu/drm/i915/gem/i915_gem_lmem.c
index 017db8f71130..f44bdd08f7cb 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_lmem.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_lmem.c
@@ -17,6 +17,21 @@ const struct drm_i915_gem_object_ops i915_gem_lmem_obj_ops = {
 	.release = i915_gem_object_release_memory_region,
 };
 
+void __iomem *
+i915_gem_object_lmem_io_map(struct drm_i915_gem_object *obj,
+			    unsigned long n,
+			    unsigned long size)
+{
+	resource_size_t offset;
+
+	GEM_BUG_ON(!i915_gem_object_is_contiguous(obj));
+
+	offset = i915_gem_object_get_dma_address(obj, n);
+	offset -= obj->mm.region->region.start;
+
+	return io_mapping_map_wc(&obj->mm.region->iomap, offset, size);
+}
+
 bool i915_gem_object_is_lmem(struct drm_i915_gem_object *obj)
 {
 	struct intel_memory_region *mr = obj->mm.region;
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_lmem.h b/drivers/gpu/drm/i915/gem/i915_gem_lmem.h
index 036d53c01de9..fac6bc5a5ebb 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_lmem.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_lmem.h
@@ -14,6 +14,11 @@ struct intel_memory_region;
 
 extern const struct drm_i915_gem_object_ops i915_gem_lmem_obj_ops;
 
+void __iomem *
+i915_gem_object_lmem_io_map(struct drm_i915_gem_object *obj,
+			    unsigned long n,
+			    unsigned long size);
+
 bool i915_gem_object_is_lmem(struct drm_i915_gem_object *obj);
 
 struct drm_i915_gem_object *
diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index 07490db51cdc..e24d33aecac4 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -27,6 +27,7 @@
 
 #include "display/intel_frontbuffer.h"
 
+#include "gem/i915_gem_lmem.h"
 #include "gt/intel_engine.h"
 #include "gt/intel_engine_heartbeat.h"
 #include "gt/intel_gt.h"
@@ -448,9 +449,11 @@ void __iomem *i915_vma_pin_iomap(struct i915_vma *vma)
 	void __iomem *ptr;
 	int err;
 
-	if (GEM_WARN_ON(!i915_vma_is_map_and_fenceable(vma))) {
-		err = -ENODEV;
-		goto err;
+	if (!i915_gem_object_is_lmem(vma->obj)) {
+		if (GEM_WARN_ON(!i915_vma_is_map_and_fenceable(vma))) {
+			err = -ENODEV;
+			goto err;
+		}
 	}
 
 	GEM_BUG_ON(!i915_vma_is_ggtt(vma));
@@ -458,9 +461,13 @@ void __iomem *i915_vma_pin_iomap(struct i915_vma *vma)
 
 	ptr = READ_ONCE(vma->iomap);
 	if (ptr == NULL) {
-		ptr = io_mapping_map_wc(&i915_vm_to_ggtt(vma->vm)->iomap,
-					vma->node.start,
-					vma->node.size);
+		if (i915_gem_object_is_lmem(vma->obj))
+			ptr = i915_gem_object_lmem_io_map(vma->obj, 0,
+							  vma->obj->base.size);
+		else
+			ptr = io_mapping_map_wc(&i915_vm_to_ggtt(vma->vm)->iomap,
+						vma->node.start,
+						vma->node.size);
 		if (ptr == NULL) {
 			err = -ENOMEM;
 			goto err;
-- 
2.26.3

^ permalink raw reply related	[flat|nested] 130+ messages in thread

* [Intel-gfx] [PATCH 12/19] drm/i915/lmem: Bypass aperture when lmem is available
@ 2021-04-12  9:05   ` Matthew Auld
  0 siblings, 0 replies; 130+ messages in thread
From: Matthew Auld @ 2021-04-12  9:05 UTC (permalink / raw)
  To: intel-gfx; +Cc: Chris P Wilson, dri-devel, Daniel Vetter, Dhinakaran Pandiyan

From: Anusha Srivatsa <anusha.srivatsa@intel.com>

In the scenario where local memory is available, we have to
rely on CPU access via lmem directly instead of the aperture.

v2:
gmch is only relevant for much older hw, therefore we can drop the
has_aperture check since it should always be present on such platforms.
(Chris)

Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Chris P Wilson <chris.p.wilson@intel.com>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: CQ Tang <cq.tang@intel.com>
Signed-off-by: Anusha Srivatsa <anusha.srivatsa@intel.com>
---
 drivers/gpu/drm/i915/display/intel_fbdev.c | 22 +++++++++++++++-------
 drivers/gpu/drm/i915/gem/i915_gem_lmem.c   | 15 +++++++++++++++
 drivers/gpu/drm/i915/gem/i915_gem_lmem.h   |  5 +++++
 drivers/gpu/drm/i915/i915_vma.c            | 19 +++++++++++++------
 4 files changed, 48 insertions(+), 13 deletions(-)
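
The address arithmetic in i915_gem_object_lmem_io_map() can be sketched
in isolation as below. This is only an illustration under stated
assumptions: struct demo_region and demo_lmem_io_offset() are
hypothetical stand-ins, not i915 types, and the asserts play the role of
the driver's debug checks.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the part of intel_memory_region used here. */
struct demo_region {
	uint64_t start; /* device address where the region begins */
	uint64_t size;  /* length of the region in bytes */
};

/*
 * Convert the device (DMA) address of a contiguous object into a byte
 * offset inside the region's CPU-visible mapping. This mirrors the
 * "offset -= region.start" step before io_mapping_map_wc() in the patch.
 */
static uint64_t demo_lmem_io_offset(const struct demo_region *mr,
				    uint64_t dma_addr)
{
	/* The object must live inside the region it claims to belong to. */
	assert(dma_addr >= mr->start);
	assert(dma_addr - mr->start < mr->size);

	return dma_addr - mr->start;
}
```

The subtraction is only meaningful for a contiguous object, which is why
the patch allocates the fbdev buffer with I915_BO_ALLOC_CONTIGUOUS and
checks i915_gem_object_is_contiguous() before mapping.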

diff --git a/drivers/gpu/drm/i915/display/intel_fbdev.c b/drivers/gpu/drm/i915/display/intel_fbdev.c
index 2b37959da747..4af40229f5ec 100644
--- a/drivers/gpu/drm/i915/display/intel_fbdev.c
+++ b/drivers/gpu/drm/i915/display/intel_fbdev.c
@@ -139,14 +139,22 @@ static int intelfb_alloc(struct drm_fb_helper *helper,
 	size = mode_cmd.pitches[0] * mode_cmd.height;
 	size = PAGE_ALIGN(size);
 
-	/* If the FB is too big, just don't use it since fbdev is not very
-	 * important and we should probably use that space with FBC or other
-	 * features. */
 	obj = ERR_PTR(-ENODEV);
-	if (size * 2 < dev_priv->stolen_usable_size)
-		obj = i915_gem_object_create_stolen(dev_priv, size);
-	if (IS_ERR(obj))
-		obj = i915_gem_object_create_shmem(dev_priv, size);
+	if (HAS_LMEM(dev_priv)) {
+		obj = i915_gem_object_create_lmem(dev_priv, size,
+						  I915_BO_ALLOC_CONTIGUOUS);
+	} else {
+		/*
+		 * If the FB is too big, just don't use it since fbdev is not very
+		 * important and we should probably use that space with FBC or other
+		 * features.
+		 */
+		if (size * 2 < dev_priv->stolen_usable_size)
+			obj = i915_gem_object_create_stolen(dev_priv, size);
+		if (IS_ERR(obj))
+			obj = i915_gem_object_create_shmem(dev_priv, size);
+	}
+
 	if (IS_ERR(obj)) {
 		drm_err(&dev_priv->drm, "failed to allocate framebuffer\n");
 		return PTR_ERR(obj);
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_lmem.c b/drivers/gpu/drm/i915/gem/i915_gem_lmem.c
index 017db8f71130..f44bdd08f7cb 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_lmem.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_lmem.c
@@ -17,6 +17,21 @@ const struct drm_i915_gem_object_ops i915_gem_lmem_obj_ops = {
 	.release = i915_gem_object_release_memory_region,
 };
 
+void __iomem *
+i915_gem_object_lmem_io_map(struct drm_i915_gem_object *obj,
+			    unsigned long n,
+			    unsigned long size)
+{
+	resource_size_t offset;
+
+	GEM_BUG_ON(!i915_gem_object_is_contiguous(obj));
+
+	offset = i915_gem_object_get_dma_address(obj, n);
+	offset -= obj->mm.region->region.start;
+
+	return io_mapping_map_wc(&obj->mm.region->iomap, offset, size);
+}
+
 bool i915_gem_object_is_lmem(struct drm_i915_gem_object *obj)
 {
 	struct intel_memory_region *mr = obj->mm.region;
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_lmem.h b/drivers/gpu/drm/i915/gem/i915_gem_lmem.h
index 036d53c01de9..fac6bc5a5ebb 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_lmem.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_lmem.h
@@ -14,6 +14,11 @@ struct intel_memory_region;
 
 extern const struct drm_i915_gem_object_ops i915_gem_lmem_obj_ops;
 
+void __iomem *
+i915_gem_object_lmem_io_map(struct drm_i915_gem_object *obj,
+			    unsigned long n,
+			    unsigned long size);
+
 bool i915_gem_object_is_lmem(struct drm_i915_gem_object *obj);
 
 struct drm_i915_gem_object *
diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index 07490db51cdc..e24d33aecac4 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -27,6 +27,7 @@
 
 #include "display/intel_frontbuffer.h"
 
+#include "gem/i915_gem_lmem.h"
 #include "gt/intel_engine.h"
 #include "gt/intel_engine_heartbeat.h"
 #include "gt/intel_gt.h"
@@ -448,9 +449,11 @@ void __iomem *i915_vma_pin_iomap(struct i915_vma *vma)
 	void __iomem *ptr;
 	int err;
 
-	if (GEM_WARN_ON(!i915_vma_is_map_and_fenceable(vma))) {
-		err = -ENODEV;
-		goto err;
+	if (!i915_gem_object_is_lmem(vma->obj)) {
+		if (GEM_WARN_ON(!i915_vma_is_map_and_fenceable(vma))) {
+			err = -ENODEV;
+			goto err;
+		}
 	}
 
 	GEM_BUG_ON(!i915_vma_is_ggtt(vma));
@@ -458,9 +461,13 @@ void __iomem *i915_vma_pin_iomap(struct i915_vma *vma)
 
 	ptr = READ_ONCE(vma->iomap);
 	if (ptr == NULL) {
-		ptr = io_mapping_map_wc(&i915_vm_to_ggtt(vma->vm)->iomap,
-					vma->node.start,
-					vma->node.size);
+		if (i915_gem_object_is_lmem(vma->obj))
+			ptr = i915_gem_object_lmem_io_map(vma->obj, 0,
+							  vma->obj->base.size);
+		else
+			ptr = io_mapping_map_wc(&i915_vm_to_ggtt(vma->vm)->iomap,
+						vma->node.start,
+						vma->node.size);
 		if (ptr == NULL) {
 			err = -ENOMEM;
 			goto err;
-- 
2.26.3

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 130+ messages in thread

* [PATCH 13/19] drm/i915/dg1: Read OPROM via SPI controller
  2021-04-12  9:05 ` [Intel-gfx] " Matthew Auld
@ 2021-04-12  9:05   ` Matthew Auld
  -1 siblings, 0 replies; 130+ messages in thread
From: Matthew Auld @ 2021-04-12  9:05 UTC (permalink / raw)
  To: intel-gfx
  Cc: Jani Nikula, Lucas De Marchi, dri-devel, Jon Bloomfield, Tomas Winkler

From: Clint Taylor <clinton.a.taylor@intel.com>

Read the OPROM from SPI flash through MMIO and locate the VBT, since we
can't use OpRegion, and PCI ROM mapping may not work on systems where the
BIOS does not leave the Option ROM mapped.

v2 by Jani:
- switch to intel_uncore_read/intel_uncore_write

Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Tomas Winkler <tomas.winkler@intel.com>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Signed-off-by: Clint Taylor <clinton.a.taylor@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
---
 drivers/gpu/drm/i915/display/intel_bios.c | 80 +++++++++++++++++++++--
 drivers/gpu/drm/i915/i915_reg.h           |  8 +++
 2 files changed, 82 insertions(+), 6 deletions(-)
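
The scan in spi_oprom_get_vbt() follows an indirect-read pattern: write
an address, read the data register, compare against "$VBT". A minimal
sketch of that pattern, assuming a flash image held in memory, is shown
below. struct demo_spi, demo_spi_read32() and demo_find_vbt() are
hypothetical names used only for illustration; the sketch builds the
signature with memcpy() rather than a pointer cast and reports "not
found" through the return value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of the flash behind the address/trigger registers. */
struct demo_spi {
	const uint8_t *flash;
	size_t size;
	uint32_t address; /* models PRIMARY_SPI_ADDRESS */
};

/* Models a write to the address register followed by a trigger read. */
static uint32_t demo_spi_read32(struct demo_spi *spi, uint32_t addr)
{
	uint32_t data;

	assert((size_t)addr + sizeof(data) <= spi->size);
	spi->address = addr;
	memcpy(&data, spi->flash + spi->address, sizeof(data));
	return data;
}

/*
 * Scan [base, base + len) in dword steps for the "$VBT" signature.
 * Returns 0 and stores the matching address in *found, or -1 if the
 * signature is not present.
 */
static int demo_find_vbt(struct demo_spi *spi, uint32_t base, uint32_t len,
			 uint32_t *found)
{
	uint32_t sig, count;

	memcpy(&sig, "$VBT", sizeof(sig));

	for (count = 0; count + sizeof(sig) <= len; count += 4) {
		if (demo_spi_read32(spi, base + count) == sig) {
			*found = base + count;
			return 0;
		}
	}

	return -1;
}
```

Because *found is written only on a match and the caller checks the
return value, there is no path on which an unset offset is used.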

diff --git a/drivers/gpu/drm/i915/display/intel_bios.c b/drivers/gpu/drm/i915/display/intel_bios.c
index ea4837d485a1..f9dc651f1652 100644
--- a/drivers/gpu/drm/i915/display/intel_bios.c
+++ b/drivers/gpu/drm/i915/display/intel_bios.c
@@ -2238,6 +2238,66 @@ bool intel_bios_is_valid_vbt(const void *buf, size_t size)
 	return vbt;
 }
 
+static struct vbt_header *spi_oprom_get_vbt(struct drm_i915_private *i915)
+{
+	u32 count, data, found, store = 0;
+	u32 static_region, oprom_offset;
+	u32 oprom_size = 0x200000;
+	u16 vbt_size;
+	u32 *vbt;
+
+	static_region = intel_uncore_read(&i915->uncore, SPI_STATIC_REGIONS);
+	static_region &= OPTIONROM_SPI_REGIONID_MASK;
+	intel_uncore_write(&i915->uncore, PRIMARY_SPI_REGIONID, static_region);
+
+	oprom_offset = intel_uncore_read(&i915->uncore, OROM_OFFSET);
+	oprom_offset &= OROM_OFFSET_MASK;
+
+	for (count = 0; count < oprom_size; count += 4) {
+		intel_uncore_write(&i915->uncore, PRIMARY_SPI_ADDRESS, oprom_offset + count);
+		data = intel_uncore_read(&i915->uncore, PRIMARY_SPI_TRIGGER);
+
+		if (data == *((const u32 *)"$VBT")) {
+			found = oprom_offset + count;
+			break;
+		}
+	}
+
+	if (count >= oprom_size)
+		goto err_not_found;
+
+	/* Get VBT size and allocate space for the VBT */
+	intel_uncore_write(&i915->uncore, PRIMARY_SPI_ADDRESS, found +
+		   offsetof(struct vbt_header, vbt_size));
+	vbt_size = intel_uncore_read(&i915->uncore, PRIMARY_SPI_TRIGGER);
+	vbt_size &= 0xffff;
+
+	vbt = kzalloc(vbt_size, GFP_KERNEL);
+	if (!vbt) {
+		DRM_ERROR("Unable to allocate %u bytes for VBT storage\n",
+			  vbt_size);
+		goto err_not_found;
+	}
+
+	for (count = 0; count < vbt_size; count += 4) {
+		intel_uncore_write(&i915->uncore, PRIMARY_SPI_ADDRESS, found + count);
+		data = intel_uncore_read(&i915->uncore, PRIMARY_SPI_TRIGGER);
+		*(vbt + store++) = data;
+	}
+
+	if (!intel_bios_is_valid_vbt(vbt, vbt_size))
+		goto err_free_vbt;
+
+	DRM_DEBUG_KMS("Found valid VBT in SPI flash\n");
+
+	return (struct vbt_header *)vbt;
+
+err_free_vbt:
+	kfree(vbt);
+err_not_found:
+	return NULL;
+}
+
 static struct vbt_header *oprom_get_vbt(struct drm_i915_private *i915)
 {
 	struct pci_dev *pdev = to_pci_dev(i915->drm.dev);
@@ -2287,6 +2347,8 @@ static struct vbt_header *oprom_get_vbt(struct drm_i915_private *i915)
 
 	pci_unmap_rom(pdev, oprom);
 
+	DRM_DEBUG_KMS("Found valid VBT in PCI ROM\n");
+
 	return vbt;
 
 err_free_vbt:
@@ -2321,17 +2383,23 @@ void intel_bios_init(struct drm_i915_private *i915)
 
 	init_vbt_defaults(i915);
 
-	/* If the OpRegion does not have VBT, look in PCI ROM. */
+	/*
+	 * If the OpRegion does not have VBT, look in SPI flash through MMIO or
+	 * PCI mapping
+	 */
+	if (!vbt && IS_DGFX(i915)) {
+		oprom_vbt = spi_oprom_get_vbt(i915);
+		vbt = oprom_vbt;
+	}
+
 	if (!vbt) {
 		oprom_vbt = oprom_get_vbt(i915);
-		if (!oprom_vbt)
-			goto out;
-
 		vbt = oprom_vbt;
-
-		drm_dbg_kms(&i915->drm, "Found valid VBT in PCI ROM\n");
 	}
 
+	if (!vbt)
+		goto out;
+
 	bdb = get_bdb_header(vbt);
 	i915->vbt.version = bdb->version;
 
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index da73dc939e58..54ff63b86df6 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -12540,6 +12540,14 @@ enum skl_power_gate {
 #define   DP_PIN_ASSIGNMENT_MASK(idx)		(0xf << ((idx) * 4))
 #define   DP_PIN_ASSIGNMENT(idx, x)		((x) << ((idx) * 4))
 
+#define PRIMARY_SPI_TRIGGER			_MMIO(0x102040)
+#define PRIMARY_SPI_ADDRESS			_MMIO(0x102080)
+#define PRIMARY_SPI_REGIONID			_MMIO(0x102084)
+#define SPI_STATIC_REGIONS			_MMIO(0x102090)
+#define   OPTIONROM_SPI_REGIONID_MASK		REG_GENMASK(7, 0)
+#define OROM_OFFSET				_MMIO(0x1020c0)
+#define   OROM_OFFSET_MASK			REG_GENMASK(20, 16)
+
 /* This register controls the Display State Buffer (DSB) engines. */
 #define _DSBSL_INSTANCE_BASE		0x70B00
 #define DSBSL_INSTANCE(pipe, id)	(_DSBSL_INSTANCE_BASE + \
-- 
2.26.3

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 130+ messages in thread

* [Intel-gfx] [PATCH 13/19] drm/i915/dg1: Read OPROM via SPI controller
@ 2021-04-12  9:05   ` Matthew Auld
  0 siblings, 0 replies; 130+ messages in thread
From: Matthew Auld @ 2021-04-12  9:05 UTC (permalink / raw)
  To: intel-gfx; +Cc: Jani Nikula, Lucas De Marchi, dri-devel, Tomas Winkler

From: Clint Taylor <clinton.a.taylor@intel.com>

Read the OPROM from SPI flash through MMIO and locate the VBT, since we
can't use OpRegion, and PCI ROM mapping may not work on systems where the
BIOS does not leave the Option ROM mapped.

v2 by Jani:
- switch to intel_uncore_read/intel_uncore_write

Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Tomas Winkler <tomas.winkler@intel.com>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Signed-off-by: Clint Taylor <clinton.a.taylor@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
---
 drivers/gpu/drm/i915/display/intel_bios.c | 80 +++++++++++++++++++++--
 drivers/gpu/drm/i915/i915_reg.h           |  8 +++
 2 files changed, 82 insertions(+), 6 deletions(-)
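
After this patch intel_bios_init() tries the VBT sources in a fixed
order (OpRegion, then SPI flash on discrete parts, then the PCI ROM) and
bails out only when all of them fail. The ordering can be sketched as a
"first non-NULL wins" loop. Everything named demo_* below is
hypothetical and exists only to make the sketch compile on its own.

```c
#include <assert.h>
#include <stddef.h>

/* A source returns a pointer to a VBT, or NULL when it has none. */
typedef const void *(*demo_vbt_source)(void);

static const char demo_blob[] = "$VBT";

/* Example sources: one that finds nothing, one that finds a blob. */
static const void *demo_source_none(void) { return NULL; }
static const void *demo_source_rom(void) { return demo_blob; }

/* Return the first non-NULL VBT from an ordered list of sources. */
static const void *demo_pick_vbt(const demo_vbt_source *src, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		const void *vbt = src[i]();

		if (vbt)
			return vbt;
	}

	return NULL;
}
```

A caller would list the sources in priority order and leave out the SPI
entry when the device is not discrete, which corresponds to the
IS_DGFX() check in the patch.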

diff --git a/drivers/gpu/drm/i915/display/intel_bios.c b/drivers/gpu/drm/i915/display/intel_bios.c
index ea4837d485a1..f9dc651f1652 100644
--- a/drivers/gpu/drm/i915/display/intel_bios.c
+++ b/drivers/gpu/drm/i915/display/intel_bios.c
@@ -2238,6 +2238,66 @@ bool intel_bios_is_valid_vbt(const void *buf, size_t size)
 	return vbt;
 }
 
+static struct vbt_header *spi_oprom_get_vbt(struct drm_i915_private *i915)
+{
+	u32 count, data, found, store = 0;
+	u32 static_region, oprom_offset;
+	u32 oprom_size = 0x200000;
+	u16 vbt_size;
+	u32 *vbt;
+
+	static_region = intel_uncore_read(&i915->uncore, SPI_STATIC_REGIONS);
+	static_region &= OPTIONROM_SPI_REGIONID_MASK;
+	intel_uncore_write(&i915->uncore, PRIMARY_SPI_REGIONID, static_region);
+
+	oprom_offset = intel_uncore_read(&i915->uncore, OROM_OFFSET);
+	oprom_offset &= OROM_OFFSET_MASK;
+
+	for (count = 0; count < oprom_size; count += 4) {
+		intel_uncore_write(&i915->uncore, PRIMARY_SPI_ADDRESS, oprom_offset + count);
+		data = intel_uncore_read(&i915->uncore, PRIMARY_SPI_TRIGGER);
+
+		if (data == *((const u32 *)"$VBT")) {
+			found = oprom_offset + count;
+			break;
+		}
+	}
+
+	if (count >= oprom_size)
+		goto err_not_found;
+
+	/* Get VBT size and allocate space for the VBT */
+	intel_uncore_write(&i915->uncore, PRIMARY_SPI_ADDRESS, found +
+		   offsetof(struct vbt_header, vbt_size));
+	vbt_size = intel_uncore_read(&i915->uncore, PRIMARY_SPI_TRIGGER);
+	vbt_size &= 0xffff;
+
+	vbt = kzalloc(vbt_size, GFP_KERNEL);
+	if (!vbt) {
+		DRM_ERROR("Unable to allocate %u bytes for VBT storage\n",
+			  vbt_size);
+		goto err_not_found;
+	}
+
+	for (count = 0; count < vbt_size; count += 4) {
+		intel_uncore_write(&i915->uncore, PRIMARY_SPI_ADDRESS, found + count);
+		data = intel_uncore_read(&i915->uncore, PRIMARY_SPI_TRIGGER);
+		*(vbt + store++) = data;
+	}
+
+	if (!intel_bios_is_valid_vbt(vbt, vbt_size))
+		goto err_free_vbt;
+
+	DRM_DEBUG_KMS("Found valid VBT in SPI flash\n");
+
+	return (struct vbt_header *)vbt;
+
+err_free_vbt:
+	kfree(vbt);
+err_not_found:
+	return NULL;
+}
+
 static struct vbt_header *oprom_get_vbt(struct drm_i915_private *i915)
 {
 	struct pci_dev *pdev = to_pci_dev(i915->drm.dev);
@@ -2287,6 +2347,8 @@ static struct vbt_header *oprom_get_vbt(struct drm_i915_private *i915)
 
 	pci_unmap_rom(pdev, oprom);
 
+	DRM_DEBUG_KMS("Found valid VBT in PCI ROM\n");
+
 	return vbt;
 
 err_free_vbt:
@@ -2321,17 +2383,23 @@ void intel_bios_init(struct drm_i915_private *i915)
 
 	init_vbt_defaults(i915);
 
-	/* If the OpRegion does not have VBT, look in PCI ROM. */
+	/*
+	 * If the OpRegion does not have VBT, look in SPI flash through MMIO or
+	 * PCI mapping
+	 */
+	if (!vbt && IS_DGFX(i915)) {
+		oprom_vbt = spi_oprom_get_vbt(i915);
+		vbt = oprom_vbt;
+	}
+
 	if (!vbt) {
 		oprom_vbt = oprom_get_vbt(i915);
-		if (!oprom_vbt)
-			goto out;
-
 		vbt = oprom_vbt;
-
-		drm_dbg_kms(&i915->drm, "Found valid VBT in PCI ROM\n");
 	}
 
+	if (!vbt)
+		goto out;
+
 	bdb = get_bdb_header(vbt);
 	i915->vbt.version = bdb->version;
 
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index da73dc939e58..54ff63b86df6 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -12540,6 +12540,14 @@ enum skl_power_gate {
 #define   DP_PIN_ASSIGNMENT_MASK(idx)		(0xf << ((idx) * 4))
 #define   DP_PIN_ASSIGNMENT(idx, x)		((x) << ((idx) * 4))
 
+#define PRIMARY_SPI_TRIGGER			_MMIO(0x102040)
+#define PRIMARY_SPI_ADDRESS			_MMIO(0x102080)
+#define PRIMARY_SPI_REGIONID			_MMIO(0x102084)
+#define SPI_STATIC_REGIONS			_MMIO(0x102090)
+#define   OPTIONROM_SPI_REGIONID_MASK		REG_GENMASK(7, 0)
+#define OROM_OFFSET				_MMIO(0x1020c0)
+#define   OROM_OFFSET_MASK			REG_GENMASK(20, 16)
+
 /* This register controls the Display State Buffer (DSB) engines. */
 #define _DSBSL_INSTANCE_BASE		0x70B00
 #define DSBSL_INSTANCE(pipe, id)	(_DSBSL_INSTANCE_BASE + \
-- 
2.26.3

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 130+ messages in thread

* [PATCH 14/19] drm/i915/oprom: Basic sanitization
  2021-04-12  9:05 ` [Intel-gfx] " Matthew Auld
@ 2021-04-12  9:05   ` Matthew Auld
  -1 siblings, 0 replies; 130+ messages in thread
From: Matthew Auld @ 2021-04-12  9:05 UTC (permalink / raw)
  To: intel-gfx
  Cc: Jani Nikula, Anshuman Gupta, Uma Shankar, Mohammed Khajapasha, dri-devel

From: Anshuman Gupta <anshuman.gupta@intel.com>

Sanitize OPROM header, CPD signature and OPROM PCI version.
OPROM_HEADER, EXPANSION_ROM_HEADER and OPROM_MEU_BLOB structures
and PCI struct offsets are provided by GSC counterparts.
These are yet to be documented in B.Spec.
After successful sanitization, extract VBT from opregion
image.

v2:
- Used macro for OPROM header magic 0xaa55 [Rodrigo]
- Added an OPROM layout. [Uma]
- Extract opregion from OPROM package and then extract
  VBT from opregion to have backward compatibility with
  older IFWI.

v3:
- Moved opreg stuff to intel_opregion.{c,h}. [Uma]
- Memory leak and intel_oprom_verify_signature return
  value fixes. [Uma]

v4:
 - Fix return code storage for oprom_image_parse_helper (Matt)

v5 by Jani:
- switch to intel_uncore_read/intel_uncore_write

v6 by Khajapasha:
- Rename intel_oprom_verify_signature() to
  intel_spi_get_oprom_opreg() [Jani Nikula]
- Use u32 data type for opregion size [Jani Nikula]

Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Uma Shankar <uma.shankar@intel.com>
Signed-off-by: Anshuman Gupta <anshuman.gupta@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Mohammed Khajapasha <mohammed.khajapasha@intel.com>
---
 drivers/gpu/drm/i915/display/intel_bios.c     |  47 +++--
 drivers/gpu/drm/i915/display/intel_opregion.c | 169 ++++++++++++++++++
 drivers/gpu/drm/i915/display/intel_opregion.h |  38 +++-
 3 files changed, 227 insertions(+), 27 deletions(-)
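
The layout comment in intel_opregion.c describes a chain of expansion
ROM images, each with a 55h AAh signature, a pointer at 18h to its PCI
Data Structure, and an image length, code type and last-image indicator
inside that structure. A standalone sketch of walking such a chain in a
memory buffer follows. The demo_* names are hypothetical. Two choices
differ from the patch and are assumptions of this sketch: the image
length is read as a 16-bit little-endian field, as laid out in the PCI
Data Structure, and the last-image indicator is tested as bit 7 rather
than compared for equality with 0x80.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_ROM_MAGIC		0xAA55u	/* bytes 55h, AAh */
#define DEMO_PCIR_PTR		0x18	/* pointer to PCI Data Structure */
#define DEMO_IMAGE_LENGTH	0x10	/* in units of 512 bytes */
#define DEMO_LAST_IMAGE		0x15
#define DEMO_LAST_IMAGE_BIT	0x80u

static uint16_t demo_le16(const uint8_t *p)
{
	return (uint16_t)(p[0] | (p[1] << 8));
}

/*
 * Count the images in rom[0..size) by following the image-length fields.
 * Returns the number of images, or -1 on a bad signature or a field that
 * points outside the buffer.
 */
static int demo_count_images(const uint8_t *rom, size_t size)
{
	size_t off = 0;
	int n = 0;

	for (;;) {
		const uint8_t *img = rom + off;
		size_t pcir, len;

		/* Need at least the header up to the PCIR pointer. */
		if (size - off < DEMO_PCIR_PTR + 2 ||
		    demo_le16(img) != DEMO_ROM_MAGIC)
			return -1;

		pcir = demo_le16(img + DEMO_PCIR_PTR);
		if (pcir + DEMO_LAST_IMAGE >= size - off)
			return -1;

		len = (size_t)demo_le16(img + pcir + DEMO_IMAGE_LENGTH) * 512;
		if (len == 0 || len > size - off)
			return -1;

		n++;
		if (img[pcir + DEMO_LAST_IMAGE] & DEMO_LAST_IMAGE_BIT)
			return n;

		off += len;
	}
}
```

Every field is range-checked before it is used as an index, so a
corrupted length or pointer ends the walk with an error instead of
reading past the buffer.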

diff --git a/drivers/gpu/drm/i915/display/intel_bios.c b/drivers/gpu/drm/i915/display/intel_bios.c
index f9dc651f1652..59eec8333723 100644
--- a/drivers/gpu/drm/i915/display/intel_bios.c
+++ b/drivers/gpu/drm/i915/display/intel_bios.c
@@ -2240,37 +2240,36 @@ bool intel_bios_is_valid_vbt(const void *buf, size_t size)
 
 static struct vbt_header *spi_oprom_get_vbt(struct drm_i915_private *i915)
 {
-	u32 count, data, found, store = 0;
-	u32 static_region, oprom_offset;
-	u32 oprom_size = 0x200000;
+	u32 count, found, opreg_size;
+	u32 *vbt, *oprom_opreg = NULL;
 	u16 vbt_size;
-	u32 *vbt;
+	u8 *parse_ptr;
 
-	static_region = intel_uncore_read(&i915->uncore, SPI_STATIC_REGIONS);
-	static_region &= OPTIONROM_SPI_REGIONID_MASK;
-	intel_uncore_write(&i915->uncore, PRIMARY_SPI_REGIONID, static_region);
-
-	oprom_offset = intel_uncore_read(&i915->uncore, OROM_OFFSET);
-	oprom_offset &= OROM_OFFSET_MASK;
+	if (intel_spi_get_oprom_opreg(i915, &oprom_opreg, &opreg_size)) {
+		drm_err(&i915->drm, "oprom signature verification failed\n");
+		goto err_not_found;
+	}
 
-	for (count = 0; count < oprom_size; count += 4) {
-		intel_uncore_write(&i915->uncore, PRIMARY_SPI_ADDRESS, oprom_offset + count);
-		data = intel_uncore_read(&i915->uncore, PRIMARY_SPI_TRIGGER);
+	if (!oprom_opreg) {
+		drm_err(&i915->drm, "opregion not found\n");
+		goto err_not_found;
+	}
 
-		if (data == *((const u32 *)"$VBT")) {
-			found = oprom_offset + count;
+	for (count = 0; count < opreg_size; count += 4) {
+		if (oprom_opreg[count / 4] == *((const u32 *)"$VBT")) {
+			found = count;
 			break;
 		}
 	}
 
-	if (count >= oprom_size)
+	if (count >= opreg_size) {
+		drm_err(&i915->drm, "VBT not found in opregion\n");
 		goto err_not_found;
+	}
 
 	/* Get VBT size and allocate space for the VBT */
-	intel_uncore_write(&i915->uncore, PRIMARY_SPI_ADDRESS, found +
-		   offsetof(struct vbt_header, vbt_size));
-	vbt_size = intel_uncore_read(&i915->uncore, PRIMARY_SPI_TRIGGER);
-	vbt_size &= 0xffff;
+	parse_ptr = (u8 *)oprom_opreg + found;
+	vbt_size = ((struct vbt_header *)parse_ptr)->vbt_size;
 
 	vbt = kzalloc(vbt_size, GFP_KERNEL);
 	if (!vbt) {
@@ -2279,16 +2278,12 @@ static struct vbt_header *spi_oprom_get_vbt(struct drm_i915_private *i915)
 		goto err_not_found;
 	}
 
-	for (count = 0; count < vbt_size; count += 4) {
-		intel_uncore_write(&i915->uncore, PRIMARY_SPI_ADDRESS, found + count);
-		data = intel_uncore_read(&i915->uncore, PRIMARY_SPI_TRIGGER);
-		*(vbt + store++) = data;
-	}
-
+	memcpy(vbt, parse_ptr, vbt_size);
 	if (!intel_bios_is_valid_vbt(vbt, vbt_size))
 		goto err_free_vbt;
 
 	DRM_DEBUG_KMS("Found valid VBT in SPI flash\n");
+	kfree(oprom_opreg);
 
 	return (struct vbt_header *)vbt;
 
diff --git a/drivers/gpu/drm/i915/display/intel_opregion.c b/drivers/gpu/drm/i915/display/intel_opregion.c
index dfd724e506b5..e9ccd8265a1f 100644
--- a/drivers/gpu/drm/i915/display/intel_opregion.c
+++ b/drivers/gpu/drm/i915/display/intel_opregion.c
@@ -983,6 +983,175 @@ int intel_opregion_setup(struct drm_i915_private *dev_priv)
 	return err;
 }
 
+static int oprom_image_parse_helper(u8 *parse_ptr, u8 *last_img, u8 *code_type,
+				    struct drm_i915_private *i915)
+{
+	u8 size_512_bytes;
+
+	if (((union oprom_header *)parse_ptr)->signature != OPROM_IMAGE_MAGIC) {
+		drm_err(&i915->drm, "Wrong OPROM header signature.\n");
+		return -EINVAL;
+	}
+
+	size_512_bytes = parse_ptr[((struct expansion_rom_header *)parse_ptr)->pcistructoffset + PCI_IMAGE_LENGTH_OFFSET];
+	*code_type = parse_ptr[((struct expansion_rom_header *)parse_ptr)->pcistructoffset + PCI_CODE_TYPE_OFFSET];
+	*last_img = parse_ptr[((struct expansion_rom_header *)parse_ptr)->pcistructoffset + PCI_LAST_IMAGE_INDICATOR_OFFSET];
+
+	return size_512_bytes;
+}
+
+static void spi_read_oprom_helper(size_t len, u32 offset, u32 *buf,
+				  struct drm_i915_private *dev_priv)
+{
+	u32 count, data;
+
+	for (count = 0; count < len; count += 4) {
+		intel_uncore_write(&dev_priv->uncore, PRIMARY_SPI_ADDRESS, offset + count);
+		data = intel_uncore_read(&dev_priv->uncore, PRIMARY_SPI_TRIGGER);
+		buf[count / 4] = data;
+	}
+}
+
+/**
+ *	+        DASH+G OPROM IMAGE LAYOUT           +
+ *	+--------+-------+---------------------------+
+ *	| Offset | Value |   ROM Header Fields       +-----> Image 1 (CSS)
+ *	+--------------------------------------------+
+ *	|    0h  |  55h  |   ROM Signature Byte1     |
+ *	|    1h  |  AAh  |   ROM Signature Byte2     |
+ *	|    2h  |  xx   |        Reserved           |
+ *	|  18+19h|  xx   |  Ptr to PCI DataStructure |
+ *	+----------------+---------------------------+
+ *	|           PCI Data Structure               |
+ *	+--------------------------------------------+
+ *	|    .       .             .                 |
+ *	|    .       .             .                 |
+ *	|    10  +  xx   +     Image Length          |
+ *	|    14  +  xx   +     Code Type             |
+ *	|    15  +  xx   +  Last Image Indicator     |
+ *	|    .       .             .                 |
+ *	+--------------------------------------------+
+ *	|               MEU BLOB                     |
+ *	+--------------------------------------------+
+ *	|              CPD Header                    |
+ *	|              CPD Entry                     |
+ *	|              Reserved                      |
+ *	|           SignedDataPart1                  |
+ *	|              PublicKey                     |
+ *	|            RSA Signature                   |
+ *	|           SignedDataPart2                  |
+ *	|            IFWI Metadata                   |
+ *	+--------+-------+---------------------------+
+ *	|    .   |   .   |         .                 |
+ *	|    .   |   .   |         .                 |
+ *	+--------------------------------------------+
+ *	| Offset | Value |   ROM Header Fields       +-----> Image 2 (Config Data) (Offset: 0x800)
+ *	+--------------------------------------------+
+ *	|    0h  |  55h  |   ROM Signature Byte1     |
+ *	|    1h  |  AAh  |   ROM Signature Byte2     |
+ *	|    2h  |  xx   |        Reserved           |
+ *	|  18+19h|  xx   |  Ptr to PCI DataStructure |
+ *	+----------------+---------------------------+
+ *	|           PCI Data Structure               |
+ *	+--------------------------------------------+
+ *	|    .       .             .                 |
+ *	|    .       .             .                 |
+ *	|    10  +  xx   +     Image Length          |
+ *	|    14  +  xx   +      Code Type            |
+ *	|    15  +  xx   +   Last Image Indicator    |
+ *	|    .       .             .                 |
+ *	|    1A  +  3C   + Ptr to Opregion Signature |
+ *	|    .       .             .                 |
+ *	|    .       .             .                 |
+ *	|   83Ch + IntelGraphicsMem                  | <---+ Opregion Signature
+ *	+--------+-----------------------------------+
+ *
+ * intel_spi_get_oprom_opreg() get OPROM image.
+ * @i915: pointer to i915 device.
+ * @opreg: pointer to opregion buffer output.
+ * @opreg_size: pointer to opregion size output.
+ */
+int
+intel_spi_get_oprom_opreg(struct drm_i915_private *i915, u32 **opreg,
+			  u32 *opreg_size)
+{
+	u8 img_sig[sizeof(OPREGION_SIGNATURE)];
+	u8 code_type, last_img;
+	u32 static_region, offset, img_len;
+	u32 *oprom_img, *oprom_img_hdr;
+	u16 opreg_base;
+	u8 *parse_ptr;
+	int img_size;
+	int ret = -EINVAL;
+
+	/* initialize SPI to read the OPROM */
+	static_region = intel_uncore_read(&i915->uncore, SPI_STATIC_REGIONS);
+	static_region &= OPTIONROM_SPI_REGIONID_MASK;
+	intel_uncore_write(&i915->uncore, PRIMARY_SPI_REGIONID, static_region);
+	/* read OPROM offset in SPI flash */
+	offset = intel_uncore_read(&i915->uncore, OROM_OFFSET);
+	offset &= OROM_OFFSET_MASK;
+
+	oprom_img_hdr = kzalloc(OPROM_INITIAL_READ_SIZE, GFP_KERNEL);
+	if (!oprom_img_hdr)
+		return -ENOMEM;
+
+	do {
+		spi_read_oprom_helper(OPROM_INITIAL_READ_SIZE, offset,
+				      oprom_img_hdr, i915);
+		img_size = oprom_image_parse_helper((u8 *)oprom_img_hdr, &last_img,
+						    &code_type, i915);
+		if (img_size <= 0) {
+			ret = -EINVAL;
+			goto err_free_hdr;
+		}
+
+		img_len = img_size * OPROM_BYTE_BOUNDARY;
+		oprom_img = kzalloc(img_len, GFP_KERNEL);
+		if (!oprom_img) {
+			ret = -ENOMEM;
+			goto err_free_hdr;
+		}
+
+		spi_read_oprom_helper(img_len, offset, oprom_img, i915);
+		parse_ptr = (u8 *)oprom_img;
+		offset = offset + img_len;
+
+		/* opregion base offset */
+		opreg_base = ((struct expansion_rom_header *)parse_ptr)->opregion_base;
+		/* CPD or opreg signature is present at opregion_base offset */
+		memcpy(img_sig, parse_ptr + opreg_base, sizeof(OPREGION_SIGNATURE));
+
+		if (!memcmp(img_sig, OPREGION_SIGNATURE, sizeof(OPREGION_SIGNATURE) - 1)) {
+			*opreg = oprom_img;
+			*opreg_size = img_len;
+			drm_dbg_kms(&i915->drm, "Found opregion image\n");
+			ret = 0;
+			break;
+		} else if (!memcmp(img_sig, CPD_SIGNATURE, NUM_CPD_BYTES)) {
+			if (code_type != OPROM_CSS_CODE_TYPE) {
+				drm_err(&i915->drm, "Invalid OPROM\n");
+				ret = -EINVAL;
+				goto err_free_img;
+			}
+			drm_dbg_kms(&i915->drm, "Found CSS image\n");
+			/* proceed here onwards for signature authentication */
+			kfree(oprom_img);
+			continue;
+		}
+
+	} while (last_img != LAST_IMG_INDICATOR);
+
+	return ret;
+
+err_free_img:
+	kfree(oprom_img);
+err_free_hdr:
+	kfree(oprom_img_hdr);
+
+	return ret;
+}
+
 static int intel_use_opregion_panel_type_callback(const struct dmi_system_id *id)
 {
 	DRM_INFO("Using panel type from OpRegion on %s\n", id->ident);
diff --git a/drivers/gpu/drm/i915/display/intel_opregion.h b/drivers/gpu/drm/i915/display/intel_opregion.h
index 4aa68ffbd30e..de53dde10dd9 100644
--- a/drivers/gpu/drm/i915/display/intel_opregion.h
+++ b/drivers/gpu/drm/i915/display/intel_opregion.h
@@ -54,6 +54,34 @@ struct intel_opregion {
 
 #define OPREGION_SIZE            (8 * 1024)
 
+#define CPD_SIGNATURE "$CPD"                  /* CPD Signature */
+#define NUM_CPD_BYTES 4
+#define PCI_IMAGE_LENGTH_OFFSET 0x10
+#define PCI_CODE_TYPE_OFFSET 0x14
+#define PCI_LAST_IMAGE_INDICATOR_OFFSET 0x15
+#define LAST_IMG_INDICATOR 0x80
+#define OPROM_IMAGE_MAGIC 0xAA55       /* Little Endian */
+#define OPROM_CSS_CODE_TYPE 0xF0
+#define OPROM_BYTE_BOUNDARY 512        /* OPROM image sizes are indicated in 512 byte boundaries */
+#define OPROM_INITIAL_READ_SIZE 60     /* Read 60 bytes to compute the Img Len from PCI structure */
+
+union oprom_header {
+	u32 data;
+	struct {
+		u16 signature;  /* Offset[0x0]: Header 0x55 0xAA */
+		u8 sizein512bytes;
+		u8 reserved;
+	};
+};
+
+struct expansion_rom_header {
+	union oprom_header header;      /* Offset[0x0]: Oprom Header */
+	u16 vbiospostoffset;    /* Offset[0x4]: pointer to VBIOS entry point */
+	u8 resvd[0x12];
+	u16 pcistructoffset;    /* Offset[0x18]: Contains pointer PCI Data Structure */
+	u16 opregion_base;      /* Offset[0x1A]: Offset to Opregion Base start */
+};
+
 #ifdef CONFIG_ACPI
 
 int intel_opregion_setup(struct drm_i915_private *dev_priv);
@@ -72,6 +100,9 @@ int intel_opregion_notify_adapter(struct drm_i915_private *dev_priv,
 				  pci_power_t state);
 int intel_opregion_get_panel_type(struct drm_i915_private *dev_priv);
 
+int intel_spi_get_oprom_opreg(struct drm_i915_private *i915, u32 **opreg,
+			      u32 *opreg_size);
+
 #else /* CONFIG_ACPI*/
 
 static inline int intel_opregion_setup(struct drm_i915_private *dev_priv)
@@ -117,6 +148,11 @@ static inline int intel_opregion_get_panel_type(struct drm_i915_private *dev)
 	return -ENODEV;
 }
 
-#endif /* CONFIG_ACPI */
+static int intel_spi_get_oprom_opreg(struct drm_i915_private *i915, u32 **opreg,
+				     u32 *opreg_size)
+{
+	return 0;
+}
 
+#endif /* CONFIG_ACPI */
 #endif
-- 
2.26.3

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 130+ messages in thread

* [Intel-gfx] [PATCH 14/19] drm/i915/oprom: Basic sanitization
@ 2021-04-12  9:05   ` Matthew Auld
  0 siblings, 0 replies; 130+ messages in thread
From: Matthew Auld @ 2021-04-12  9:05 UTC (permalink / raw)
  To: intel-gfx; +Cc: Jani Nikula, Mohammed Khajapasha, dri-devel

From: Anshuman Gupta <anshuman.gupta@intel.com>

Sanitize OPROM header, CPD signature and OPROM PCI version.
OPROM_HEADER, EXPANSION_ROM_HEADER and OPROM_MEU_BLOB structures
and PCI struct offsets are provided by GSC counterparts.
These are yet to be documented in B.Spec.
After successful sanitization, extract VBT from opregion
image.

v2:
- Used macro for OPROM header magic 0xaa55 [Rodrigo]
- Added an OPROM layout. [Uma]
- Extract opregion from OPROM package and then extract
  VBT from opregion to have backward compatibility with
  older IFWI.

v3:
- Moved opreg stuff to intel_opregion.{c,h}. [Uma]
- Memory leak and intel_oprom_verify_signature return
  value fixes. [Uma]

v4:
 - Fix return code storage for oprom_image_parse_helper (Matt)

v5 by Jani:
- switch to intel_uncore_read/intel_uncore_write

v6 by Khajapasha:
- Rename intel_oprom_verify_signature() to
  intel_spi_get_oprom_opreg() [Jani Nikula]
- Use u32 data type for opregion size [Jani Nikula]

Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Uma Shankar <uma.shankar@intel.com>
Signed-off-by: Anshuman Gupta <anshuman.gupta@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Mohammed Khajapasha <mohammed.khajapasha@intel.com>
---
 drivers/gpu/drm/i915/display/intel_bios.c     |  47 +++--
 drivers/gpu/drm/i915/display/intel_opregion.c | 169 ++++++++++++++++++
 drivers/gpu/drm/i915/display/intel_opregion.h |  38 +++-
 3 files changed, 227 insertions(+), 27 deletions(-)
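
With the opregion image already in memory, spi_oprom_get_vbt() searches
it for "$VBT" and copies vbt_size bytes from the match. A sketch of the
same search with the advertised size checked against the bytes that
remain in the buffer is given below. demo_locate_vbt() is a hypothetical
helper, and DEMO_VBT_SIZE_OFFSET is an assumed value standing in for
offsetof(struct vbt_header, vbt_size).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed stand-in for offsetof(struct vbt_header, vbt_size). */
#define DEMO_VBT_SIZE_OFFSET 24

/*
 * Locate "$VBT" on a dword boundary inside buf[0..len) and report its
 * offset and size. Returns 0 on success, -1 if no signature is found or
 * the advertised size does not fit in the remaining buffer.
 */
static int demo_locate_vbt(const uint8_t *buf, size_t len,
			   size_t *off, uint16_t *vbt_size)
{
	size_t i;

	for (i = 0; i + DEMO_VBT_SIZE_OFFSET + 2 <= len; i += 4) {
		uint16_t sz;

		if (memcmp(buf + i, "$VBT", 4))
			continue;

		/* vbt_size is a little-endian 16-bit field. */
		sz = (uint16_t)(buf[i + DEMO_VBT_SIZE_OFFSET] |
				(buf[i + DEMO_VBT_SIZE_OFFSET + 1] << 8));
		if (sz < DEMO_VBT_SIZE_OFFSET + 2 || sz > len - i)
			return -1;

		*off = i;
		*vbt_size = sz;
		return 0;
	}

	return -1;
}
```

Only after this check succeeds is it safe to allocate vbt_size bytes and
copy them out of the opregion image.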

diff --git a/drivers/gpu/drm/i915/display/intel_bios.c b/drivers/gpu/drm/i915/display/intel_bios.c
index f9dc651f1652..59eec8333723 100644
--- a/drivers/gpu/drm/i915/display/intel_bios.c
+++ b/drivers/gpu/drm/i915/display/intel_bios.c
@@ -2240,37 +2240,36 @@ bool intel_bios_is_valid_vbt(const void *buf, size_t size)
 
 static struct vbt_header *spi_oprom_get_vbt(struct drm_i915_private *i915)
 {
-	u32 count, data, found, store = 0;
-	u32 static_region, oprom_offset;
-	u32 oprom_size = 0x200000;
+	u32 count, found, opreg_size;
+	u32 *vbt, *oprom_opreg = NULL;
 	u16 vbt_size;
-	u32 *vbt;
+	u8 *parse_ptr;
 
-	static_region = intel_uncore_read(&i915->uncore, SPI_STATIC_REGIONS);
-	static_region &= OPTIONROM_SPI_REGIONID_MASK;
-	intel_uncore_write(&i915->uncore, PRIMARY_SPI_REGIONID, static_region);
-
-	oprom_offset = intel_uncore_read(&i915->uncore, OROM_OFFSET);
-	oprom_offset &= OROM_OFFSET_MASK;
+	if (intel_spi_get_oprom_opreg(i915, &oprom_opreg, &opreg_size)) {
+		drm_err(&i915->drm, "oprom signature verification failed\n");
+		goto err_not_found;
+	}
 
-	for (count = 0; count < oprom_size; count += 4) {
-		intel_uncore_write(&i915->uncore, PRIMARY_SPI_ADDRESS, oprom_offset + count);
-		data = intel_uncore_read(&i915->uncore, PRIMARY_SPI_TRIGGER);
+	if (!oprom_opreg) {
+		drm_err(&i915->drm, "opregion not found\n");
+		goto err_not_found;
+	}
 
-		if (data == *((const u32 *)"$VBT")) {
-			found = oprom_offset + count;
+	for (count = 0; count < opreg_size; count += 4) {
+		if (oprom_opreg[count / 4] == *((const u32 *)"$VBT")) {
+			found = count;
 			break;
 		}
 	}
 
-	if (count >= oprom_size)
+	if (count >= opreg_size) {
+		drm_err(&i915->drm, "VBT not found in opregion\n");
 		goto err_not_found;
+	}
 
 	/* Get VBT size and allocate space for the VBT */
-	intel_uncore_write(&i915->uncore, PRIMARY_SPI_ADDRESS, found +
-		   offsetof(struct vbt_header, vbt_size));
-	vbt_size = intel_uncore_read(&i915->uncore, PRIMARY_SPI_TRIGGER);
-	vbt_size &= 0xffff;
+	parse_ptr = (u8 *)oprom_opreg + found;
+	vbt_size = ((struct vbt_header *)parse_ptr)->vbt_size;
 
 	vbt = kzalloc(vbt_size, GFP_KERNEL);
 	if (!vbt) {
@@ -2279,16 +2278,12 @@ static struct vbt_header *spi_oprom_get_vbt(struct drm_i915_private *i915)
 		goto err_not_found;
 	}
 
-	for (count = 0; count < vbt_size; count += 4) {
-		intel_uncore_write(&i915->uncore, PRIMARY_SPI_ADDRESS, found + count);
-		data = intel_uncore_read(&i915->uncore, PRIMARY_SPI_TRIGGER);
-		*(vbt + store++) = data;
-	}
-
+	memcpy(vbt, parse_ptr, vbt_size);
 	if (!intel_bios_is_valid_vbt(vbt, vbt_size))
 		goto err_free_vbt;
 
 	DRM_DEBUG_KMS("Found valid VBT in SPI flash\n");
+	kfree(oprom_opreg);
 
 	return (struct vbt_header *)vbt;
 
diff --git a/drivers/gpu/drm/i915/display/intel_opregion.c b/drivers/gpu/drm/i915/display/intel_opregion.c
index dfd724e506b5..e9ccd8265a1f 100644
--- a/drivers/gpu/drm/i915/display/intel_opregion.c
+++ b/drivers/gpu/drm/i915/display/intel_opregion.c
@@ -983,6 +983,175 @@ int intel_opregion_setup(struct drm_i915_private *dev_priv)
 	return err;
 }
 
+static int oprom_image_parse_helper(u8 *parse_ptr, u8 *last_img, u8 *code_type,
+				    struct drm_i915_private *i915)
+{
+	u8 size_512_bytes;
+
+	if (((union oprom_header *)parse_ptr)->signature != OPROM_IMAGE_MAGIC) {
+		drm_err(&i915->drm, "Wrong OPROM header signature.\n");
+		return -EINVAL;
+	}
+
+	size_512_bytes = parse_ptr[((struct expansion_rom_header *)parse_ptr)->pcistructoffset + PCI_IMAGE_LENGTH_OFFSET];
+	*code_type = parse_ptr[((struct expansion_rom_header *)parse_ptr)->pcistructoffset + PCI_CODE_TYPE_OFFSET];
+	*last_img = parse_ptr[((struct expansion_rom_header *)parse_ptr)->pcistructoffset + PCI_LAST_IMAGE_INDICATOR_OFFSET];
+
+	return size_512_bytes;
+}
+
+static void spi_read_oprom_helper(size_t len, u32 offset, u32 *buf,
+				  struct drm_i915_private *dev_priv)
+{
+	u32 count, data;
+
+	for (count = 0; count < len; count += 4) {
+		intel_uncore_write(&dev_priv->uncore, PRIMARY_SPI_ADDRESS, offset + count);
+		data = intel_uncore_read(&dev_priv->uncore, PRIMARY_SPI_TRIGGER);
+		buf[count / 4] = data;
+	}
+}
+
+/**
+ *	+        DASH+G OPROM IMAGE LAYOUT           +
+ *	+--------+-------+---------------------------+
+ *	| Offset | Value |   ROM Header Fields       +-----> Image 1 (CSS)
+ *	+--------------------------------------------+
+ *	|    0h  |  55h  |   ROM Signature Byte1     |
+ *	|    1h  |  AAh  |   ROM Signature Byte2     |
+ *	|    2h  |  xx   |        Reserved           |
+ *	|  18+19h|  xx   |  Ptr to PCI DataStructure |
+ *	+----------------+---------------------------+
+ *	|           PCI Data Structure               |
+ *	+--------------------------------------------+
+ *	|    .       .             .                 |
+ *	|    .       .             .                 |
+ *	|    10  +  xx   +     Image Length          |
+ *	|    14  +  xx   +     Code Type             |
+ *	|    15  +  xx   +  Last Image Indicator     |
+ *	|    .       .             .                 |
+ *	+--------------------------------------------+
+ *	|               MEU BLOB                     |
+ *	+--------------------------------------------+
+ *	|              CPD Header                    |
+ *	|              CPD Entry                     |
+ *	|              Reserved                      |
+ *	|           SignedDataPart1                  |
+ *	|              PublicKey                     |
+ *	|            RSA Signature                   |
+ *	|           SignedDataPart2                  |
+ *	|            IFWI Metadata                   |
+ *	+--------+-------+---------------------------+
+ *	|    .   |   .   |         .                 |
+ *	|    .   |   .   |         .                 |
+ *	+--------------------------------------------+
+ *	| Offset | Value |   ROM Header Fields       +-----> Image 2 (Config Data) (Offset: 0x800)
+ *	+--------------------------------------------+
+ *	|    0h  |  55h  |   ROM Signature Byte1     |
+ *	|    1h  |  AAh  |   ROM Signature Byte2     |
+ *	|    2h  |  xx   |        Reserved           |
+ *	|  18+19h|  xx   |  Ptr to PCI DataStructure |
+ *	+----------------+---------------------------+
+ *	|           PCI Data Structure               |
+ *	+--------------------------------------------+
+ *	|    .       .             .                 |
+ *	|    .       .             .                 |
+ *	|    10  +  xx   +     Image Length          |
+ *	|    14  +  xx   +      Code Type            |
+ *	|    15  +  xx   +   Last Image Indicator    |
+ *	|    .       .             .                 |
+ *	|    1A  +  3C   + Ptr to Opregion Signature |
+ *	|    .       .             .                 |
+ *	|    .       .             .                 |
+ *	|   83Ch + IntelGraphicsMem                  | <---+ Opregion Signature
+ *	+--------+-----------------------------------+
+ *
+ * intel_spi_get_oprom_opreg() - get the opregion image from the OPROM.
+ * @i915: pointer to i915 device.
+ * @opreg: pointer to opregion buffer output.
+ * @opreg_size: pointer to opregion size output.
+ */
+int
+intel_spi_get_oprom_opreg(struct drm_i915_private *i915, u32 **opreg,
+			  u32 *opreg_size)
+{
+	u8 img_sig[sizeof(OPREGION_SIGNATURE)];
+	u8 code_type, last_img;
+	u32 static_region, offset, img_len;
+	u32 *oprom_img, *oprom_img_hdr;
+	u16 opreg_base;
+	u8 *parse_ptr;
+	int img_size;
+	int ret = -EINVAL;
+
+	/* initialize SPI to read the OPROM */
+	static_region = intel_uncore_read(&i915->uncore, SPI_STATIC_REGIONS);
+	static_region &= OPTIONROM_SPI_REGIONID_MASK;
+	intel_uncore_write(&i915->uncore, PRIMARY_SPI_REGIONID, static_region);
+	/* read OPROM offset in SPI flash */
+	offset = intel_uncore_read(&i915->uncore, OROM_OFFSET);
+	offset &= OROM_OFFSET_MASK;
+
+	oprom_img_hdr = kzalloc(OPROM_INITIAL_READ_SIZE, GFP_KERNEL);
+	if (!oprom_img_hdr)
+		return -ENOMEM;
+
+	do {
+		spi_read_oprom_helper(OPROM_INITIAL_READ_SIZE, offset,
+				      oprom_img_hdr, i915);
+		img_size = oprom_image_parse_helper((u8 *)oprom_img_hdr, &last_img,
+						    &code_type, i915);
+		if (img_size <= 0) {
+			ret = -EINVAL;
+			goto err_free_hdr;
+		}
+
+		img_len = img_size * OPROM_BYTE_BOUNDARY;
+		oprom_img = kzalloc(img_len, GFP_KERNEL);
+		if (!oprom_img) {
+			ret = -ENOMEM;
+			goto err_free_hdr;
+		}
+
+		spi_read_oprom_helper(img_len, offset, oprom_img, i915);
+		parse_ptr = (u8 *)oprom_img;
+		offset = offset + img_len;
+
+		/* opregion base offset */
+		opreg_base = ((struct expansion_rom_header *)parse_ptr)->opregion_base;
+		/* CPD or opreg signature is present at opregion_base offset */
+		memcpy(img_sig, parse_ptr + opreg_base, sizeof(OPREGION_SIGNATURE));
+
+		if (!memcmp(img_sig, OPREGION_SIGNATURE, sizeof(OPREGION_SIGNATURE) - 1)) {
+			*opreg = oprom_img;
+			*opreg_size = img_len;
+			drm_dbg_kms(&i915->drm, "Found opregion image\n");
+			ret = 0;
+			break;
+		} else if (!memcmp(img_sig, CPD_SIGNATURE, NUM_CPD_BYTES)) {
+			if (code_type != OPROM_CSS_CODE_TYPE) {
+				drm_err(&i915->drm, "Invalid OPROM\n");
+				ret = -EINVAL;
+				goto err_free_img;
+			}
+			drm_dbg_kms(&i915->drm, "Found CSS image\n");
+			/* proceed here onwards for signature authentication */
+			kfree(oprom_img);
+			continue;
+		}
+
+	} while (last_img != LAST_IMG_INDICATOR);
+
+	return ret;
+
+err_free_img:
+	kfree(oprom_img);
+err_free_hdr:
+	kfree(oprom_img_hdr);
+
+	return ret;
+}
+
 static int intel_use_opregion_panel_type_callback(const struct dmi_system_id *id)
 {
 	DRM_INFO("Using panel type from OpRegion on %s\n", id->ident);
diff --git a/drivers/gpu/drm/i915/display/intel_opregion.h b/drivers/gpu/drm/i915/display/intel_opregion.h
index 4aa68ffbd30e..de53dde10dd9 100644
--- a/drivers/gpu/drm/i915/display/intel_opregion.h
+++ b/drivers/gpu/drm/i915/display/intel_opregion.h
@@ -54,6 +54,34 @@ struct intel_opregion {
 
 #define OPREGION_SIZE            (8 * 1024)
 
+#define CPD_SIGNATURE "$CPD"                  /* CPD Signature */
+#define NUM_CPD_BYTES 4
+#define PCI_IMAGE_LENGTH_OFFSET 0x10
+#define PCI_CODE_TYPE_OFFSET 0x14
+#define PCI_LAST_IMAGE_INDICATOR_OFFSET 0x15
+#define LAST_IMG_INDICATOR 0x80
+#define OPROM_IMAGE_MAGIC 0xAA55       /* Little Endian */
+#define OPROM_CSS_CODE_TYPE 0xF0
+#define OPROM_BYTE_BOUNDARY 512        /* OPROM image sizes are indicated in 512 byte boundaries */
+#define OPROM_INITIAL_READ_SIZE 60     /* Read 60 bytes to compute the Img Len from PCI structure */
+
+union oprom_header {
+	u32 data;
+	struct {
+		u16 signature;  /* Offset[0x0]: Header 0x55 0xAA */
+		u8 sizein512bytes;
+		u8 reserved;
+	};
+};
+
+struct expansion_rom_header {
+	union oprom_header header;      /* Offset[0x0]: Oprom Header */
+	u16 vbiospostoffset;    /* Offset[0x4]: pointer to VBIOS entry point */
+	u8 resvd[0x12];
+	u16 pcistructoffset;    /* Offset[0x18]: Contains pointer PCI Data Structure */
+	u16 opregion_base;      /* Offset[0x1A]: Offset to Opregion Base start */
+};
+
 #ifdef CONFIG_ACPI
 
 int intel_opregion_setup(struct drm_i915_private *dev_priv);
@@ -72,6 +100,9 @@ int intel_opregion_notify_adapter(struct drm_i915_private *dev_priv,
 				  pci_power_t state);
 int intel_opregion_get_panel_type(struct drm_i915_private *dev_priv);
 
+int intel_spi_get_oprom_opreg(struct drm_i915_private *i915, u32 **opreg,
+			      u32 *opreg_size);
+
 #else /* CONFIG_ACPI*/
 
 static inline int intel_opregion_setup(struct drm_i915_private *dev_priv)
@@ -117,6 +148,11 @@ static inline int intel_opregion_get_panel_type(struct drm_i915_private *dev)
 	return -ENODEV;
 }
 
-#endif /* CONFIG_ACPI */
+static inline int intel_spi_get_oprom_opreg(struct drm_i915_private *i915,
+					    u32 **opreg, u32 *opreg_size)
+{
+	return 0;
+}
 
+#endif /* CONFIG_ACPI */
 #endif
-- 
2.26.3

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


* [PATCH 15/19] drm/i915: WA for zero memory channel
  2021-04-12  9:05 ` [Intel-gfx] " Matthew Auld
@ 2021-04-12  9:05   ` Matthew Auld
  -1 siblings, 0 replies; 130+ messages in thread
From: Matthew Auld @ 2021-04-12  9:05 UTC (permalink / raw)
  To: intel-gfx
  Cc: Lucas De Marchi, José Roberto de Souza, Stanislav Lisovskiy,
	Daniele Ceraolo Spurio, dri-devel, Rodrigo Vivi

From: José Roberto de Souza <jose.souza@intel.com>

Commit c457d9cf256e ("drm/i915: Make sure we have enough memory
bandwidth on ICL") assumes that dram_info->channels is always
non-zero and uses it as a divisor. We need the number of memory
channels to be at least 1 for sane bandwidth limit checking, even
when PCode returns 0, so let's force it to 1 in that case.
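
A minimal sketch of the clamp, assuming a plain C re-implementation of
DIV_ROUND_UP outside the kernel; clamped_deinterleave() is a hypothetical
helper and not a function in intel_bw.c. Without the clamp, a channel count
of 0 yields a deinterleave value of 0, and any later division by the channel
count would fault.

```c
#include <stdint.h>

/* Userspace stand-in for the kernel macro of the same name. */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Hypothetical helper: force at least one channel before dividing. */
static unsigned int clamped_deinterleave(uint8_t num_channels, int is_y_tile)
{
	if (num_channels < 1)
		num_channels = 1;

	return DIV_ROUND_UP(num_channels, is_y_tile ? 4 : 2);
}
```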

Cc: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
---
 drivers/gpu/drm/i915/display/intel_bw.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/i915/display/intel_bw.c b/drivers/gpu/drm/i915/display/intel_bw.c
index 584ab5ce4106..c5f70f3e930e 100644
--- a/drivers/gpu/drm/i915/display/intel_bw.c
+++ b/drivers/gpu/drm/i915/display/intel_bw.c
@@ -175,6 +175,7 @@ static int icl_get_bw_info(struct drm_i915_private *dev_priv, const struct intel
 			    "Failed to get memory subsystem information, ignoring bandwidth limits");
 		return ret;
 	}
+	num_channels = max_t(u8, 1, num_channels);
 
 	deinterleave = DIV_ROUND_UP(num_channels, is_y_tile ? 4 : 2);
 	dclk_max = icl_sagv_max_dclk(&qi);
-- 
2.26.3

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


* [PATCH 16/19] drm/i915/dg1: Compute MEM Bandwidth using MCHBAR
  2021-04-12  9:05 ` [Intel-gfx] " Matthew Auld
@ 2021-04-12  9:05   ` Matthew Auld
  -1 siblings, 0 replies; 130+ messages in thread
From: Matthew Auld @ 2021-04-12  9:05 UTC (permalink / raw)
  To: intel-gfx; +Cc: Jani Nikula, dri-devel, Jani Saarinen

From: Clint Taylor <clinton.a.taylor@intel.com>

The PUNIT FW is currently returning 0 for all memory bandwidth
parameters. Read the values directly from MCHBAR offsets 0x5918,
0x4000 and 0x4004 instead. This is a temporary WA until the PUNIT
FW returns valid values.
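
For illustration, the dclk decode from the 0x5918 register value can be
sketched as below. The field layout (ratio in bits 9:2, reference select in
bit 10) is taken from the defines in this patch; decode_dclk() and the QCLK_*
names are hypothetical, and the result is in units of 16.666 MHz as in the
patch comments.

```c
#include <stdint.h>

#define QCLK_RATIO_MASK  (0xFFu << 2)
#define QCLK_RATIO_SHIFT 2
#define QCLK_REFERENCE   (1u << 10)

/* Hypothetical helper: returns dclk in multiples of 16.666 MHz. */
static uint32_t decode_dclk(uint32_t val)
{
	uint32_t ratio = (val & QCLK_RATIO_MASK) >> QCLK_RATIO_SHIFT;
	/* 6 * 16.666 MHz = 100 MHz reference, 8 * 16.666 MHz = 133 MHz */
	uint32_t reference = (val & QCLK_REFERENCE) ? 6 : 8;

	return ratio * reference;
}
```

A result of 0 means the ratio field was empty, which the patch treats as an
error (-EINVAL).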

v2 (Lucas): Add error to log since this is fixed in new pcode available
on IFWI WW14. Also fix checkpatch warnings.

v3 by Jani:
- switch to intel_uncore_read/intel_uncore_write

Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Jani Saarinen <jani.saarinen@intel.com>
Signed-off-by: Clint Taylor <clinton.a.taylor@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
---
 drivers/gpu/drm/i915/display/intel_bw.c | 54 ++++++++++++++++++++++++-
 1 file changed, 53 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/display/intel_bw.c b/drivers/gpu/drm/i915/display/intel_bw.c
index c5f70f3e930e..99cae0dc0ca2 100644
--- a/drivers/gpu/drm/i915/display/intel_bw.c
+++ b/drivers/gpu/drm/i915/display/intel_bw.c
@@ -23,6 +23,53 @@ struct intel_qgv_info {
 	u8 t_bl;
 };
 
+#define SA_PERF_STATUS_0_0_0_MCHBAR_PC _MMIO(MCHBAR_MIRROR_BASE_SNB + 0x5918)
+#define  DG1_QCLK_RATIO_MASK (0xFF << 2)
+#define  DG1_QCLK_RATIO_SHIFT 2
+#define  DG1_QCLK_REFERENCE (1 << 10)
+
+#define MCHBAR_CH0_CR_TC_PRE_0_0_0_MCHBAR _MMIO(MCHBAR_MIRROR_BASE_SNB + 0x4000)
+#define MCHBAR_CH0_CR_TC_PRE_0_0_0_MCHBAR_HIGH _MMIO(MCHBAR_MIRROR_BASE_SNB + 0x4004)
+#define MCHBAR_CH1_CR_TC_PRE_0_0_0_MCHBAR _MMIO(MCHBAR_MIRROR_BASE_SNB + 0x4400)
+#define MCHBAR_CH1_CR_TC_PRE_0_0_0_MCHBAR_HIGH _MMIO(MCHBAR_MIRROR_BASE_SNB + 0x4404)
+#define  DG1_DRAM_T_RCD_MASK (0x7F << 9)
+#define  DG1_DRAM_T_RCD_SHIFT 9
+#define  DG1_DRAM_T_RDPRE_MASK (0x3F << 11)
+#define  DG1_DRAM_T_RDPRE_SHIFT 11
+#define  DG1_DRAM_T_RAS_MASK (0xFF << 1)
+#define  DG1_DRAM_T_RAS_SHIFT 1
+#define  DG1_DRAM_T_RP_MASK (0x7F << 0)
+#define  DG1_DRAM_T_RP_SHIFT 0
+
+static int dg1_mchbar_read_qgv_point_info(struct drm_i915_private *dev_priv,
+					  struct intel_qgv_point *sp,
+					  int point)
+{
+	u32 val = 0;
+	u32 dclk_ratio = 0, dclk_reference = 0;
+
+	val = intel_uncore_read(&dev_priv->uncore, SA_PERF_STATUS_0_0_0_MCHBAR_PC);
+	dclk_ratio = (val & DG1_QCLK_RATIO_MASK) >> DG1_QCLK_RATIO_SHIFT;
+	if (val & DG1_QCLK_REFERENCE)
+		dclk_reference = 6; /* 6 * 16.666 MHz = 100 MHz */
+	else
+		dclk_reference = 8; /* 8 * 16.666 MHz = 133 MHz */
+	sp->dclk = dclk_ratio * dclk_reference;
+	if (sp->dclk == 0)
+		return -EINVAL;
+
+	val = intel_uncore_read(&dev_priv->uncore, MCHBAR_CH0_CR_TC_PRE_0_0_0_MCHBAR);
+	sp->t_rp = (val & DG1_DRAM_T_RP_MASK) >> DG1_DRAM_T_RP_SHIFT;
+	sp->t_rdpre = (val & DG1_DRAM_T_RDPRE_MASK) >> DG1_DRAM_T_RDPRE_SHIFT;
+
+	val = intel_uncore_read(&dev_priv->uncore, MCHBAR_CH0_CR_TC_PRE_0_0_0_MCHBAR_HIGH);
+	sp->t_rcd = (val & DG1_DRAM_T_RCD_MASK) >> DG1_DRAM_T_RCD_SHIFT;
+	sp->t_ras = (val & DG1_DRAM_T_RAS_MASK) >> DG1_DRAM_T_RAS_SHIFT;
+
+	sp->t_rc = sp->t_rp + sp->t_ras;
+	return 0;
+}
+
 static int icl_pcode_read_qgv_point_info(struct drm_i915_private *dev_priv,
 					 struct intel_qgv_point *sp,
 					 int point)
@@ -100,7 +147,12 @@ static int icl_get_qgv_points(struct drm_i915_private *dev_priv,
 		struct intel_qgv_point *sp = &qi->points[i];
 
 		ret = icl_pcode_read_qgv_point_info(dev_priv, sp, i);
-		if (ret)
+		if (IS_DG1(dev_priv) && (ret || sp->dclk == 0)) {
+			drm_dbg_kms(&dev_priv->drm, "Failed to get memory subsystem information via pcode. IFWI needs update. Trying with MCHBAR\n");
+			ret = dg1_mchbar_read_qgv_point_info(dev_priv, sp, i);
+			if (ret)
+				return ret;
+		} else if (ret)
 			return ret;
 
 		drm_dbg_kms(&dev_priv->drm,
-- 
2.26.3


* [PATCH 17/19] drm/i915/dg1: Double memory bandwidth available
  2021-04-12  9:05 ` [Intel-gfx] " Matthew Auld
@ 2021-04-12  9:05   ` Matthew Auld
  -1 siblings, 0 replies; 130+ messages in thread
From: Matthew Auld @ 2021-04-12  9:05 UTC (permalink / raw)
  To: intel-gfx; +Cc: Jani Nikula, Swati Sharma, dri-devel

From: Clint Taylor <clinton.a.taylor@intel.com>

Use the MCHBAR Gear_type information when computing the available
memory bandwidth from MCHBAR: when the gear type bit is set, the
dclk is doubled.
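
A minimal sketch of the doubling step, assuming the gear type is bit 16 of
the register value as in the defines added by this patch; apply_gear_type()
is a hypothetical helper, not a driver function.

```c
#include <stdint.h>

#define GEAR_TYPE_MASK (0x01u << 16)

/* Hypothetical helper: double dclk when the gear type bit is set. */
static uint32_t apply_gear_type(uint32_t dclk, uint32_t bios_data)
{
	if (bios_data & GEAR_TYPE_MASK)
		dclk *= 2;

	return dclk;
}
```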

v2 by Jani:
- switch to intel_uncore_read/intel_uncore_write

Tested-by: Swati Sharma <swati2.sharma@intel.com>
Cc: Swati Sharma <swati2.sharma@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Clint Taylor <clinton.a.taylor@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
---
 drivers/gpu/drm/i915/display/intel_bw.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/drivers/gpu/drm/i915/display/intel_bw.c b/drivers/gpu/drm/i915/display/intel_bw.c
index 99cae0dc0ca2..6c02bd52ce45 100644
--- a/drivers/gpu/drm/i915/display/intel_bw.c
+++ b/drivers/gpu/drm/i915/display/intel_bw.c
@@ -41,6 +41,9 @@ struct intel_qgv_info {
 #define  DG1_DRAM_T_RP_MASK (0x7F << 0)
 #define  DG1_DRAM_T_RP_SHIFT 0
 
+#define  ICL_GEAR_TYPE_MASK (0x01 << 16)
+#define  ICL_GEAR_TYPE_SHIFT 16
+
 static int dg1_mchbar_read_qgv_point_info(struct drm_i915_private *dev_priv,
 					  struct intel_qgv_point *sp,
 					  int point)
@@ -55,6 +58,11 @@ static int dg1_mchbar_read_qgv_point_info(struct drm_i915_private *dev_priv,
 	else
 		dclk_reference = 8; /* 8 * 16.666 MHz = 133 MHz */
 	sp->dclk = dclk_ratio * dclk_reference;
+
+	val = intel_uncore_read(&dev_priv->uncore, SKL_MC_BIOS_DATA_0_0_0_MCHBAR_PCU);
+	if ((val & ICL_GEAR_TYPE_MASK) >> ICL_GEAR_TYPE_SHIFT)
+		sp->dclk *= 2;
+
 	if (sp->dclk == 0)
 		return -EINVAL;
 
-- 
2.26.3


* [PATCH 18/19] drm/i915/gtt: map the PD up front
  2021-04-12  9:05 ` [Intel-gfx] " Matthew Auld
@ 2021-04-12  9:05   ` Matthew Auld
  -1 siblings, 0 replies; 130+ messages in thread
From: Matthew Auld @ 2021-04-12  9:05 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel, Chris Wilson

We need to generalize our accessor for the page directories and tables
away from the simple kmap_atomic in order to support local memory, and
this setup must be done on acquisition of the backing storage, prior to
entering fence execution contexts. Here we replace the kmap with the
object mapping code which, for a simple single-page shmemfs object, will
return a plain kmap that is then kept for the lifetime of the page
directory.

v2: (Thomas) Rebase on dma_resv and obj->mm.lock removal.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 .../drm/i915/gem/selftests/i915_gem_context.c | 11 +----
 drivers/gpu/drm/i915/gt/gen6_ppgtt.c          | 11 ++---
 drivers/gpu/drm/i915/gt/gen8_ppgtt.c          | 26 ++++------
 drivers/gpu/drm/i915/gt/intel_ggtt.c          |  2 +-
 drivers/gpu/drm/i915/gt/intel_gtt.c           | 48 +++++++++----------
 drivers/gpu/drm/i915/gt/intel_gtt.h           | 11 +++--
 drivers/gpu/drm/i915/gt/intel_ppgtt.c         |  7 ++-
 drivers/gpu/drm/i915/i915_vma.c               |  3 +-
 drivers/gpu/drm/i915/selftests/i915_gem_gtt.c | 10 ++--
 drivers/gpu/drm/i915/selftests/i915_perf.c    |  3 +-
 10 files changed, 54 insertions(+), 78 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
index 5fef592390cb..ce70d0a3afb2 100644
--- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
@@ -1740,7 +1740,6 @@ static int read_from_scratch(struct i915_gem_context *ctx,
 static int check_scratch_page(struct i915_gem_context *ctx, u32 *out)
 {
 	struct i915_address_space *vm;
-	struct page *page;
 	u32 *vaddr;
 	int err = 0;
 
@@ -1748,24 +1747,18 @@ static int check_scratch_page(struct i915_gem_context *ctx, u32 *out)
 	if (!vm)
 		return -ENODEV;
 
-	page = __px_page(vm->scratch[0]);
-	if (!page) {
+	if (!vm->scratch[0]) {
 		pr_err("No scratch page!\n");
 		return -EINVAL;
 	}
 
-	vaddr = kmap(page);
-	if (!vaddr) {
-		pr_err("No (mappable) scratch page!\n");
-		return -EINVAL;
-	}
+	vaddr = __px_vaddr(vm->scratch[0]);
 
 	memcpy(out, vaddr, sizeof(*out));
 	if (memchr_inv(vaddr, *out, PAGE_SIZE)) {
 		pr_err("Inconsistent initial state of scratch page!\n");
 		err = -EINVAL;
 	}
-	kunmap(page);
 
 	return err;
 }
diff --git a/drivers/gpu/drm/i915/gt/gen6_ppgtt.c b/drivers/gpu/drm/i915/gt/gen6_ppgtt.c
index e08dff376339..21b1085769be 100644
--- a/drivers/gpu/drm/i915/gt/gen6_ppgtt.c
+++ b/drivers/gpu/drm/i915/gt/gen6_ppgtt.c
@@ -96,9 +96,8 @@ static void gen6_ppgtt_clear_range(struct i915_address_space *vm,
 		 * entries back to scratch.
 		 */
 
-		vaddr = kmap_atomic_px(pt);
+		vaddr = px_vaddr(pt);
 		memset32(vaddr + pte, scratch_pte, count);
-		kunmap_atomic(vaddr);
 
 		pte = 0;
 	}
@@ -120,7 +119,7 @@ static void gen6_ppgtt_insert_entries(struct i915_address_space *vm,
 
 	GEM_BUG_ON(!pd->entry[act_pt]);
 
-	vaddr = kmap_atomic_px(i915_pt_entry(pd, act_pt));
+	vaddr = px_vaddr(i915_pt_entry(pd, act_pt));
 	do {
 		GEM_BUG_ON(sg_dma_len(iter.sg) < I915_GTT_PAGE_SIZE);
 		vaddr[act_pte] = pte_encode | GEN6_PTE_ADDR_ENCODE(iter.dma);
@@ -136,12 +135,10 @@ static void gen6_ppgtt_insert_entries(struct i915_address_space *vm,
 		}
 
 		if (++act_pte == GEN6_PTES) {
-			kunmap_atomic(vaddr);
-			vaddr = kmap_atomic_px(i915_pt_entry(pd, ++act_pt));
+			vaddr = px_vaddr(i915_pt_entry(pd, ++act_pt));
 			act_pte = 0;
 		}
 	} while (1);
-	kunmap_atomic(vaddr);
 
 	vma->page_sizes.gtt = I915_GTT_PAGE_SIZE;
 }
@@ -235,7 +232,7 @@ static int gen6_ppgtt_init_scratch(struct gen6_ppgtt *ppgtt)
 		goto err_scratch0;
 	}
 
-	ret = pin_pt_dma(vm, vm->scratch[1]);
+	ret = map_pt_dma(vm, vm->scratch[1]);
 	if (ret)
 		goto err_scratch1;
 
diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
index 176c19633412..f83496836f0f 100644
--- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
+++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
@@ -242,11 +242,10 @@ static u64 __gen8_ppgtt_clear(struct i915_address_space * const vm,
 			    atomic_read(&pt->used));
 			GEM_BUG_ON(!count || count >= atomic_read(&pt->used));
 
-			vaddr = kmap_atomic_px(pt);
+			vaddr = px_vaddr(pt);
 			memset64(vaddr + gen8_pd_index(start, 0),
 				 vm->scratch[0]->encode,
 				 count);
-			kunmap_atomic(vaddr);
 
 			atomic_sub(count, &pt->used);
 			start += count;
@@ -375,7 +374,7 @@ gen8_ppgtt_insert_pte(struct i915_ppgtt *ppgtt,
 	gen8_pte_t *vaddr;
 
 	pd = i915_pd_entry(pdp, gen8_pd_index(idx, 2));
-	vaddr = kmap_atomic_px(i915_pt_entry(pd, gen8_pd_index(idx, 1)));
+	vaddr = px_vaddr(i915_pt_entry(pd, gen8_pd_index(idx, 1)));
 	do {
 		GEM_BUG_ON(sg_dma_len(iter->sg) < I915_GTT_PAGE_SIZE);
 		vaddr[gen8_pd_index(idx, 0)] = pte_encode | iter->dma;
@@ -402,12 +401,10 @@ gen8_ppgtt_insert_pte(struct i915_ppgtt *ppgtt,
 			}
 
 			clflush_cache_range(vaddr, PAGE_SIZE);
-			kunmap_atomic(vaddr);
-			vaddr = kmap_atomic_px(i915_pt_entry(pd, gen8_pd_index(idx, 1)));
+			vaddr = px_vaddr(i915_pt_entry(pd, gen8_pd_index(idx, 1)));
 		}
 	} while (1);
 	clflush_cache_range(vaddr, PAGE_SIZE);
-	kunmap_atomic(vaddr);
 
 	return idx;
 }
@@ -442,7 +439,7 @@ static void gen8_ppgtt_insert_huge(struct i915_vma *vma,
 			encode |= GEN8_PDE_PS_2M;
 			page_size = I915_GTT_PAGE_SIZE_2M;
 
-			vaddr = kmap_atomic_px(pd);
+			vaddr = px_vaddr(pd);
 		} else {
 			struct i915_page_table *pt =
 				i915_pt_entry(pd, __gen8_pte_index(start, 1));
@@ -457,7 +454,7 @@ static void gen8_ppgtt_insert_huge(struct i915_vma *vma,
 			     rem >= (I915_PDES - index) * I915_GTT_PAGE_SIZE))
 				maybe_64K = __gen8_pte_index(start, 1);
 
-			vaddr = kmap_atomic_px(pt);
+			vaddr = px_vaddr(pt);
 		}
 
 		do {
@@ -491,7 +488,6 @@ static void gen8_ppgtt_insert_huge(struct i915_vma *vma,
 		} while (rem >= page_size && index < I915_PDES);
 
 		clflush_cache_range(vaddr, PAGE_SIZE);
-		kunmap_atomic(vaddr);
 
 		/*
 		 * Is it safe to mark the 2M block as 64K? -- Either we have
@@ -505,9 +501,8 @@ static void gen8_ppgtt_insert_huge(struct i915_vma *vma,
 		      !iter->sg && IS_ALIGNED(vma->node.start +
 					      vma->node.size,
 					      I915_GTT_PAGE_SIZE_2M)))) {
-			vaddr = kmap_atomic_px(pd);
+			vaddr = px_vaddr(pd);
 			vaddr[maybe_64K] |= GEN8_PDE_IPS_64K;
-			kunmap_atomic(vaddr);
 			page_size = I915_GTT_PAGE_SIZE_64K;
 
 			/*
@@ -523,12 +518,11 @@ static void gen8_ppgtt_insert_huge(struct i915_vma *vma,
 				u16 i;
 
 				encode = vma->vm->scratch[0]->encode;
-				vaddr = kmap_atomic_px(i915_pt_entry(pd, maybe_64K));
+				vaddr = px_vaddr(i915_pt_entry(pd, maybe_64K));
 
 				for (i = 1; i < index; i += 16)
 					memset64(vaddr + i, encode, 15);
 
-				kunmap_atomic(vaddr);
 			}
 		}
 
@@ -602,7 +596,7 @@ static int gen8_init_scratch(struct i915_address_space *vm)
 		if (IS_ERR(obj))
 			goto free_scratch;
 
-		ret = pin_pt_dma(vm, obj);
+		ret = map_pt_dma(vm, obj);
 		if (ret) {
 			i915_gem_object_put(obj);
 			goto free_scratch;
@@ -639,7 +633,7 @@ static int gen8_preallocate_top_level_pdp(struct i915_ppgtt *ppgtt)
 		if (IS_ERR(pde))
 			return PTR_ERR(pde);
 
-		err = pin_pt_dma(vm, pde->pt.base);
+		err = map_pt_dma(vm, pde->pt.base);
 		if (err) {
 			i915_gem_object_put(pde->pt.base);
 			free_pd(vm, pde);
@@ -675,7 +669,7 @@ gen8_alloc_top_pd(struct i915_address_space *vm)
 		goto err_pd;
 	}
 
-	err = pin_pt_dma(vm, pd->pt.base);
+	err = map_pt_dma(vm, pd->pt.base);
 	if (err)
 		goto err_pd;
 
diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt.c b/drivers/gpu/drm/i915/gt/intel_ggtt.c
index 670c1271e7d5..d94628b9d89e 100644
--- a/drivers/gpu/drm/i915/gt/intel_ggtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_ggtt.c
@@ -657,7 +657,7 @@ static int init_aliasing_ppgtt(struct i915_ggtt *ggtt)
 		goto err_ppgtt;
 
 	i915_gem_object_lock(ppgtt->vm.scratch[0], NULL);
-	err = i915_vm_pin_pt_stash(&ppgtt->vm, &stash);
+	err = i915_vm_map_pt_stash(&ppgtt->vm, &stash);
 	i915_gem_object_unlock(ppgtt->vm.scratch[0]);
 	if (err)
 		goto err_stash;
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.c b/drivers/gpu/drm/i915/gt/intel_gtt.c
index 941f8af016d6..d386b89e2758 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.c
@@ -25,27 +25,25 @@ struct drm_i915_gem_object *alloc_pt_dma(struct i915_address_space *vm, int sz)
 	return obj;
 }
 
-int pin_pt_dma(struct i915_address_space *vm, struct drm_i915_gem_object *obj)
+int map_pt_dma(struct i915_address_space *vm, struct drm_i915_gem_object *obj)
 {
-	int err;
+	void *vaddr;
 
-	i915_gem_object_lock(obj, NULL);
-	err = i915_gem_object_pin_pages(obj);
-	i915_gem_object_unlock(obj);
-	if (err)
-		return err;
+	vaddr = i915_gem_object_pin_map_unlocked(obj, I915_MAP_WB);
+	if (IS_ERR(vaddr))
+		return PTR_ERR(vaddr);
 
 	i915_gem_object_make_unshrinkable(obj);
 	return 0;
 }
 
-int pin_pt_dma_locked(struct i915_address_space *vm, struct drm_i915_gem_object *obj)
+int map_pt_dma_locked(struct i915_address_space *vm, struct drm_i915_gem_object *obj)
 {
-	int err;
+	void *vaddr;
 
-	err = i915_gem_object_pin_pages(obj);
-	if (err)
-		return err;
+	vaddr = i915_gem_object_pin_map(obj, I915_MAP_WB);
+	if (IS_ERR(vaddr))
+		return PTR_ERR(vaddr);
 
 	i915_gem_object_make_unshrinkable(obj);
 	return 0;
@@ -155,6 +153,14 @@ void clear_pages(struct i915_vma *vma)
 	memset(&vma->page_sizes, 0, sizeof(vma->page_sizes));
 }
 
+void *__px_vaddr(struct drm_i915_gem_object *p)
+{
+	enum i915_map_type type;
+
+	GEM_BUG_ON(!i915_gem_object_has_pages(p));
+	return page_unpack_bits(p->mm.mapping, &type);
+}
+
 dma_addr_t __px_dma(struct drm_i915_gem_object *p)
 {
 	GEM_BUG_ON(!i915_gem_object_has_pages(p));
@@ -170,32 +176,22 @@ struct page *__px_page(struct drm_i915_gem_object *p)
 void
 fill_page_dma(struct drm_i915_gem_object *p, const u64 val, unsigned int count)
 {
-	struct page *page = __px_page(p);
-	void *vaddr;
+	void *vaddr = __px_vaddr(p);
 
-	vaddr = kmap(page);
 	memset64(vaddr, val, count);
 	clflush_cache_range(vaddr, PAGE_SIZE);
-	kunmap(page);
 }
 
 static void poison_scratch_page(struct drm_i915_gem_object *scratch)
 {
-	struct sgt_iter sgt;
-	struct page *page;
+	void *vaddr = __px_vaddr(scratch);
 	u8 val;
 
 	val = 0;
 	if (IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM))
 		val = POISON_FREE;
 
-	for_each_sgt_page(page, sgt, scratch->mm.pages) {
-		void *vaddr;
-
-		vaddr = kmap(page);
-		memset(vaddr, val, PAGE_SIZE);
-		kunmap(page);
-	}
+	memset(vaddr, val, scratch->base.size);
 }
 
 int setup_scratch_page(struct i915_address_space *vm)
@@ -225,7 +221,7 @@ int setup_scratch_page(struct i915_address_space *vm)
 		if (IS_ERR(obj))
 			goto skip;
 
-		if (pin_pt_dma(vm, obj))
+		if (map_pt_dma(vm, obj))
 			goto skip_obj;
 
 		/* We need a single contiguous page for our scratch */
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.h b/drivers/gpu/drm/i915/gt/intel_gtt.h
index e67e34e17913..40e486704558 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.h
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.h
@@ -180,6 +180,9 @@ struct page *__px_page(struct drm_i915_gem_object *p);
 dma_addr_t __px_dma(struct drm_i915_gem_object *p);
 #define px_dma(px) (__px_dma(px_base(px)))
 
+void *__px_vaddr(struct drm_i915_gem_object *p);
+#define px_vaddr(px) (__px_vaddr(px_base(px)))
+
 #define px_pt(px) \
 	__px_choose_expr(px, struct i915_page_table *, __x, \
 	__px_choose_expr(px, struct i915_page_directory *, &__x->pt, \
@@ -511,8 +514,6 @@ struct i915_ppgtt *i915_ppgtt_create(struct intel_gt *gt);
 void i915_ggtt_suspend(struct i915_ggtt *gtt);
 void i915_ggtt_resume(struct i915_ggtt *ggtt);
 
-#define kmap_atomic_px(px) kmap_atomic(__px_page(px_base(px)))
-
 void
 fill_page_dma(struct drm_i915_gem_object *p, const u64 val, unsigned int count);
 
@@ -530,8 +531,8 @@ struct i915_page_table *alloc_pt(struct i915_address_space *vm);
 struct i915_page_directory *alloc_pd(struct i915_address_space *vm);
 struct i915_page_directory *__alloc_pd(int npde);
 
-int pin_pt_dma(struct i915_address_space *vm, struct drm_i915_gem_object *obj);
-int pin_pt_dma_locked(struct i915_address_space *vm, struct drm_i915_gem_object *obj);
+int map_pt_dma(struct i915_address_space *vm, struct drm_i915_gem_object *obj);
+int map_pt_dma_locked(struct i915_address_space *vm, struct drm_i915_gem_object *obj);
 
 void free_px(struct i915_address_space *vm,
 	     struct i915_page_table *pt, int lvl);
@@ -578,7 +579,7 @@ void setup_private_pat(struct intel_uncore *uncore);
 int i915_vm_alloc_pt_stash(struct i915_address_space *vm,
 			   struct i915_vm_pt_stash *stash,
 			   u64 size);
-int i915_vm_pin_pt_stash(struct i915_address_space *vm,
+int i915_vm_map_pt_stash(struct i915_address_space *vm,
 			 struct i915_vm_pt_stash *stash);
 void i915_vm_free_pt_stash(struct i915_address_space *vm,
 			   struct i915_vm_pt_stash *stash);
diff --git a/drivers/gpu/drm/i915/gt/intel_ppgtt.c b/drivers/gpu/drm/i915/gt/intel_ppgtt.c
index 014ae8ac4480..4e3d80c2295c 100644
--- a/drivers/gpu/drm/i915/gt/intel_ppgtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_ppgtt.c
@@ -87,11 +87,10 @@ write_dma_entry(struct drm_i915_gem_object * const pdma,
 		const unsigned short idx,
 		const u64 encoded_entry)
 {
-	u64 * const vaddr = kmap_atomic(__px_page(pdma));
+	u64 * const vaddr = __px_vaddr(pdma);
 
 	vaddr[idx] = encoded_entry;
 	clflush_cache_range(&vaddr[idx], sizeof(u64));
-	kunmap_atomic(vaddr);
 }
 
 void
@@ -258,7 +257,7 @@ int i915_vm_alloc_pt_stash(struct i915_address_space *vm,
 	return 0;
 }
 
-int i915_vm_pin_pt_stash(struct i915_address_space *vm,
+int i915_vm_map_pt_stash(struct i915_address_space *vm,
 			 struct i915_vm_pt_stash *stash)
 {
 	struct i915_page_table *pt;
@@ -266,7 +265,7 @@ int i915_vm_pin_pt_stash(struct i915_address_space *vm,
 
 	for (n = 0; n < ARRAY_SIZE(stash->pt); n++) {
 		for (pt = stash->pt[n]; pt; pt = pt->stash) {
-			err = pin_pt_dma_locked(vm, pt->base);
+			err = map_pt_dma_locked(vm, pt->base);
 			if (err)
 				return err;
 		}
diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index e24d33aecac4..c68a743fac2a 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -912,8 +912,7 @@ int i915_vma_pin_ww(struct i915_vma *vma, struct i915_gem_ww_ctx *ww,
 			if (err)
 				goto err_fence;
 
-			err = i915_vm_pin_pt_stash(vma->vm,
-						   &work->stash);
+			err = i915_vm_map_pt_stash(vma->vm, &work->stash);
 			if (err)
 				goto err_fence;
 		}
diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
index 2e4f06eaacc1..e060e455e9f6 100644
--- a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
@@ -186,7 +186,7 @@ static int igt_ppgtt_alloc(void *arg)
 		if (err)
 			goto err_ppgtt_cleanup;
 
-		err = i915_vm_pin_pt_stash(&ppgtt->vm, &stash);
+		err = i915_vm_map_pt_stash(&ppgtt->vm, &stash);
 		if (err) {
 			i915_vm_free_pt_stash(&ppgtt->vm, &stash);
 			goto err_ppgtt_cleanup;
@@ -208,7 +208,7 @@ static int igt_ppgtt_alloc(void *arg)
 		if (err)
 			goto err_ppgtt_cleanup;
 
-		err = i915_vm_pin_pt_stash(&ppgtt->vm, &stash);
+		err = i915_vm_map_pt_stash(&ppgtt->vm, &stash);
 		if (err) {
 			i915_vm_free_pt_stash(&ppgtt->vm, &stash);
 			goto err_ppgtt_cleanup;
@@ -325,11 +325,10 @@ static int lowlevel_hole(struct i915_address_space *vm,
 							   BIT_ULL(size)))
 					goto alloc_vm_end;
 
-				err = i915_vm_pin_pt_stash(vm, &stash);
+				err = i915_vm_map_pt_stash(vm, &stash);
 				if (!err)
 					vm->allocate_va_range(vm, &stash,
 							      addr, BIT_ULL(size));
-
 				i915_vm_free_pt_stash(vm, &stash);
 alloc_vm_end:
 				if (err == -EDEADLK) {
@@ -1967,10 +1966,9 @@ static int igt_cs_tlb(void *arg)
 			if (err)
 				goto end_ww;
 
-			err = i915_vm_pin_pt_stash(vm, &stash);
+			err = i915_vm_map_pt_stash(vm, &stash);
 			if (!err)
 				vm->allocate_va_range(vm, &stash, offset, chunk_size);
-
 			i915_vm_free_pt_stash(vm, &stash);
 end_ww:
 			if (err == -EDEADLK) {
diff --git a/drivers/gpu/drm/i915/selftests/i915_perf.c b/drivers/gpu/drm/i915/selftests/i915_perf.c
index e9d86dab8677..bfb0290967a1 100644
--- a/drivers/gpu/drm/i915/selftests/i915_perf.c
+++ b/drivers/gpu/drm/i915/selftests/i915_perf.c
@@ -307,7 +307,7 @@ static int live_noa_gpr(void *arg)
 	}
 
 	/* Poison the ce->vm so we detect writes not to the GGTT gt->scratch */
-	scratch = kmap(__px_page(ce->vm->scratch[0]));
+	scratch = __px_vaddr(ce->vm->scratch[0]);
 	memset(scratch, POISON_FREE, PAGE_SIZE);
 
 	rq = intel_context_create_request(ce);
@@ -405,7 +405,6 @@ static int live_noa_gpr(void *arg)
 out_rq:
 	i915_request_put(rq);
 out_ce:
-	kunmap(__px_page(ce->vm->scratch[0]));
 	intel_context_put(ce);
 out:
 	stream_destroy(stream);
-- 
2.26.3

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

* [Intel-gfx] [PATCH 18/19] drm/i915/gtt: map the PD up front
@ 2021-04-12  9:05   ` Matthew Auld
  0 siblings, 0 replies; 130+ messages in thread
From: Matthew Auld @ 2021-04-12  9:05 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel, Chris Wilson

We need to generalise our accessor for the page directories and tables
beyond the simple kmap_atomic in order to support local memory, and this
setup must be done on acquisition of the backing storage, prior to
entering fence execution contexts. Here we replace the kmap with the
object mapping code, which for a simple single-page shmemfs object will
return a plain kmap that is then kept for the lifetime of the page
directory.

v2: (Thomas) Rebase on dma_resv and obj->mm.lock removal.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
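The persistent mapping described above can be illustrated with a small
user-space model. This is only a sketch under stated assumptions, not the
driver code: every model_* name is hypothetical, aligned_alloc() stands in
for pinning and mapping the backing store, and the packing of the map type
into the low bits of a page-aligned pointer is assumed to mirror what
page_unpack_bits() undoes in __px_vaddr().

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical user-space model; none of these names exist in i915. */
#define MODEL_PAGE_SIZE 4096u

enum model_map_type { MODEL_MAP_WB = 0, MODEL_MAP_WC = 1 };

struct model_object {
	void *mapping;	/* page-aligned vaddr with the map type in the low bits */
};

/* Store the map type in the otherwise-zero low bits of an aligned pointer. */
static void *model_pack(void *vaddr, enum model_map_type type)
{
	assert(((uintptr_t)vaddr & (MODEL_PAGE_SIZE - 1)) == 0);
	return (void *)((uintptr_t)vaddr | (uintptr_t)type);
}

/* Split a packed pointer back into the address and the map type. */
static void *model_unpack(void *packed, enum model_map_type *type)
{
	uintptr_t v = (uintptr_t)packed;

	*type = (enum model_map_type)(v & (MODEL_PAGE_SIZE - 1));
	return (void *)(v & ~(uintptr_t)(MODEL_PAGE_SIZE - 1));
}

/* Map once, up front; the mapping then lives as long as the object. */
static int model_map_pt(struct model_object *obj, enum model_map_type type)
{
	void *vaddr = aligned_alloc(MODEL_PAGE_SIZE, MODEL_PAGE_SIZE);

	if (!vaddr)
		return -1;

	obj->mapping = model_pack(vaddr, type);
	return 0;
}

/* Accessor: no map/unmap pair per access, just unpack the stored pointer. */
static void *model_px_vaddr(const struct model_object *obj)
{
	enum model_map_type type;

	return model_unpack(obj->mapping, &type);
}

/* Tear the mapping down together with the object. */
static void model_unmap_pt(struct model_object *obj)
{
	free(model_px_vaddr(obj));
	obj->mapping = NULL;
}
```

In this model a caller maps the object once with model_map_pt(), writes
entries through model_px_vaddr() as often as needed, and calls
model_unmap_pt() only on teardown, which is the shape the kmap_atomic to
px_vaddr conversions in the diff below follow.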
 .../drm/i915/gem/selftests/i915_gem_context.c | 11 +----
 drivers/gpu/drm/i915/gt/gen6_ppgtt.c          | 11 ++---
 drivers/gpu/drm/i915/gt/gen8_ppgtt.c          | 26 ++++------
 drivers/gpu/drm/i915/gt/intel_ggtt.c          |  2 +-
 drivers/gpu/drm/i915/gt/intel_gtt.c           | 48 +++++++++----------
 drivers/gpu/drm/i915/gt/intel_gtt.h           | 11 +++--
 drivers/gpu/drm/i915/gt/intel_ppgtt.c         |  7 ++-
 drivers/gpu/drm/i915/i915_vma.c               |  3 +-
 drivers/gpu/drm/i915/selftests/i915_gem_gtt.c | 10 ++--
 drivers/gpu/drm/i915/selftests/i915_perf.c    |  3 +-
 10 files changed, 54 insertions(+), 78 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
index 5fef592390cb..ce70d0a3afb2 100644
--- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
@@ -1740,7 +1740,6 @@ static int read_from_scratch(struct i915_gem_context *ctx,
 static int check_scratch_page(struct i915_gem_context *ctx, u32 *out)
 {
 	struct i915_address_space *vm;
-	struct page *page;
 	u32 *vaddr;
 	int err = 0;
 
@@ -1748,24 +1747,18 @@ static int check_scratch_page(struct i915_gem_context *ctx, u32 *out)
 	if (!vm)
 		return -ENODEV;
 
-	page = __px_page(vm->scratch[0]);
-	if (!page) {
+	if (!vm->scratch[0]) {
 		pr_err("No scratch page!\n");
 		return -EINVAL;
 	}
 
-	vaddr = kmap(page);
-	if (!vaddr) {
-		pr_err("No (mappable) scratch page!\n");
-		return -EINVAL;
-	}
+	vaddr = __px_vaddr(vm->scratch[0]);
 
 	memcpy(out, vaddr, sizeof(*out));
 	if (memchr_inv(vaddr, *out, PAGE_SIZE)) {
 		pr_err("Inconsistent initial state of scratch page!\n");
 		err = -EINVAL;
 	}
-	kunmap(page);
 
 	return err;
 }
diff --git a/drivers/gpu/drm/i915/gt/gen6_ppgtt.c b/drivers/gpu/drm/i915/gt/gen6_ppgtt.c
index e08dff376339..21b1085769be 100644
--- a/drivers/gpu/drm/i915/gt/gen6_ppgtt.c
+++ b/drivers/gpu/drm/i915/gt/gen6_ppgtt.c
@@ -96,9 +96,8 @@ static void gen6_ppgtt_clear_range(struct i915_address_space *vm,
 		 * entries back to scratch.
 		 */
 
-		vaddr = kmap_atomic_px(pt);
+		vaddr = px_vaddr(pt);
 		memset32(vaddr + pte, scratch_pte, count);
-		kunmap_atomic(vaddr);
 
 		pte = 0;
 	}
@@ -120,7 +119,7 @@ static void gen6_ppgtt_insert_entries(struct i915_address_space *vm,
 
 	GEM_BUG_ON(!pd->entry[act_pt]);
 
-	vaddr = kmap_atomic_px(i915_pt_entry(pd, act_pt));
+	vaddr = px_vaddr(i915_pt_entry(pd, act_pt));
 	do {
 		GEM_BUG_ON(sg_dma_len(iter.sg) < I915_GTT_PAGE_SIZE);
 		vaddr[act_pte] = pte_encode | GEN6_PTE_ADDR_ENCODE(iter.dma);
@@ -136,12 +135,10 @@ static void gen6_ppgtt_insert_entries(struct i915_address_space *vm,
 		}
 
 		if (++act_pte == GEN6_PTES) {
-			kunmap_atomic(vaddr);
-			vaddr = kmap_atomic_px(i915_pt_entry(pd, ++act_pt));
+			vaddr = px_vaddr(i915_pt_entry(pd, ++act_pt));
 			act_pte = 0;
 		}
 	} while (1);
-	kunmap_atomic(vaddr);
 
 	vma->page_sizes.gtt = I915_GTT_PAGE_SIZE;
 }
@@ -235,7 +232,7 @@ static int gen6_ppgtt_init_scratch(struct gen6_ppgtt *ppgtt)
 		goto err_scratch0;
 	}
 
-	ret = pin_pt_dma(vm, vm->scratch[1]);
+	ret = map_pt_dma(vm, vm->scratch[1]);
 	if (ret)
 		goto err_scratch1;
 
diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
index 176c19633412..f83496836f0f 100644
--- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
+++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
@@ -242,11 +242,10 @@ static u64 __gen8_ppgtt_clear(struct i915_address_space * const vm,
 			    atomic_read(&pt->used));
 			GEM_BUG_ON(!count || count >= atomic_read(&pt->used));
 
-			vaddr = kmap_atomic_px(pt);
+			vaddr = px_vaddr(pt);
 			memset64(vaddr + gen8_pd_index(start, 0),
 				 vm->scratch[0]->encode,
 				 count);
-			kunmap_atomic(vaddr);
 
 			atomic_sub(count, &pt->used);
 			start += count;
@@ -375,7 +374,7 @@ gen8_ppgtt_insert_pte(struct i915_ppgtt *ppgtt,
 	gen8_pte_t *vaddr;
 
 	pd = i915_pd_entry(pdp, gen8_pd_index(idx, 2));
-	vaddr = kmap_atomic_px(i915_pt_entry(pd, gen8_pd_index(idx, 1)));
+	vaddr = px_vaddr(i915_pt_entry(pd, gen8_pd_index(idx, 1)));
 	do {
 		GEM_BUG_ON(sg_dma_len(iter->sg) < I915_GTT_PAGE_SIZE);
 		vaddr[gen8_pd_index(idx, 0)] = pte_encode | iter->dma;
@@ -402,12 +401,10 @@ gen8_ppgtt_insert_pte(struct i915_ppgtt *ppgtt,
 			}
 
 			clflush_cache_range(vaddr, PAGE_SIZE);
-			kunmap_atomic(vaddr);
-			vaddr = kmap_atomic_px(i915_pt_entry(pd, gen8_pd_index(idx, 1)));
+			vaddr = px_vaddr(i915_pt_entry(pd, gen8_pd_index(idx, 1)));
 		}
 	} while (1);
 	clflush_cache_range(vaddr, PAGE_SIZE);
-	kunmap_atomic(vaddr);
 
 	return idx;
 }
@@ -442,7 +439,7 @@ static void gen8_ppgtt_insert_huge(struct i915_vma *vma,
 			encode |= GEN8_PDE_PS_2M;
 			page_size = I915_GTT_PAGE_SIZE_2M;
 
-			vaddr = kmap_atomic_px(pd);
+			vaddr = px_vaddr(pd);
 		} else {
 			struct i915_page_table *pt =
 				i915_pt_entry(pd, __gen8_pte_index(start, 1));
@@ -457,7 +454,7 @@ static void gen8_ppgtt_insert_huge(struct i915_vma *vma,
 			     rem >= (I915_PDES - index) * I915_GTT_PAGE_SIZE))
 				maybe_64K = __gen8_pte_index(start, 1);
 
-			vaddr = kmap_atomic_px(pt);
+			vaddr = px_vaddr(pt);
 		}
 
 		do {
@@ -491,7 +488,6 @@ static void gen8_ppgtt_insert_huge(struct i915_vma *vma,
 		} while (rem >= page_size && index < I915_PDES);
 
 		clflush_cache_range(vaddr, PAGE_SIZE);
-		kunmap_atomic(vaddr);
 
 		/*
 		 * Is it safe to mark the 2M block as 64K? -- Either we have
@@ -505,9 +501,8 @@ static void gen8_ppgtt_insert_huge(struct i915_vma *vma,
 		      !iter->sg && IS_ALIGNED(vma->node.start +
 					      vma->node.size,
 					      I915_GTT_PAGE_SIZE_2M)))) {
-			vaddr = kmap_atomic_px(pd);
+			vaddr = px_vaddr(pd);
 			vaddr[maybe_64K] |= GEN8_PDE_IPS_64K;
-			kunmap_atomic(vaddr);
 			page_size = I915_GTT_PAGE_SIZE_64K;
 
 			/*
@@ -523,12 +518,11 @@ static void gen8_ppgtt_insert_huge(struct i915_vma *vma,
 				u16 i;
 
 				encode = vma->vm->scratch[0]->encode;
-				vaddr = kmap_atomic_px(i915_pt_entry(pd, maybe_64K));
+				vaddr = px_vaddr(i915_pt_entry(pd, maybe_64K));
 
 				for (i = 1; i < index; i += 16)
 					memset64(vaddr + i, encode, 15);
 
-				kunmap_atomic(vaddr);
 			}
 		}
 
@@ -602,7 +596,7 @@ static int gen8_init_scratch(struct i915_address_space *vm)
 		if (IS_ERR(obj))
 			goto free_scratch;
 
-		ret = pin_pt_dma(vm, obj);
+		ret = map_pt_dma(vm, obj);
 		if (ret) {
 			i915_gem_object_put(obj);
 			goto free_scratch;
@@ -639,7 +633,7 @@ static int gen8_preallocate_top_level_pdp(struct i915_ppgtt *ppgtt)
 		if (IS_ERR(pde))
 			return PTR_ERR(pde);
 
-		err = pin_pt_dma(vm, pde->pt.base);
+		err = map_pt_dma(vm, pde->pt.base);
 		if (err) {
 			i915_gem_object_put(pde->pt.base);
 			free_pd(vm, pde);
@@ -675,7 +669,7 @@ gen8_alloc_top_pd(struct i915_address_space *vm)
 		goto err_pd;
 	}
 
-	err = pin_pt_dma(vm, pd->pt.base);
+	err = map_pt_dma(vm, pd->pt.base);
 	if (err)
 		goto err_pd;
 
diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt.c b/drivers/gpu/drm/i915/gt/intel_ggtt.c
index 670c1271e7d5..d94628b9d89e 100644
--- a/drivers/gpu/drm/i915/gt/intel_ggtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_ggtt.c
@@ -657,7 +657,7 @@ static int init_aliasing_ppgtt(struct i915_ggtt *ggtt)
 		goto err_ppgtt;
 
 	i915_gem_object_lock(ppgtt->vm.scratch[0], NULL);
-	err = i915_vm_pin_pt_stash(&ppgtt->vm, &stash);
+	err = i915_vm_map_pt_stash(&ppgtt->vm, &stash);
 	i915_gem_object_unlock(ppgtt->vm.scratch[0]);
 	if (err)
 		goto err_stash;
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.c b/drivers/gpu/drm/i915/gt/intel_gtt.c
index 941f8af016d6..d386b89e2758 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.c
@@ -25,27 +25,25 @@ struct drm_i915_gem_object *alloc_pt_dma(struct i915_address_space *vm, int sz)
 	return obj;
 }
 
-int pin_pt_dma(struct i915_address_space *vm, struct drm_i915_gem_object *obj)
+int map_pt_dma(struct i915_address_space *vm, struct drm_i915_gem_object *obj)
 {
-	int err;
+	void *vaddr;
 
-	i915_gem_object_lock(obj, NULL);
-	err = i915_gem_object_pin_pages(obj);
-	i915_gem_object_unlock(obj);
-	if (err)
-		return err;
+	vaddr = i915_gem_object_pin_map_unlocked(obj, I915_MAP_WB);
+	if (IS_ERR(vaddr))
+		return PTR_ERR(vaddr);
 
 	i915_gem_object_make_unshrinkable(obj);
 	return 0;
 }
 
-int pin_pt_dma_locked(struct i915_address_space *vm, struct drm_i915_gem_object *obj)
+int map_pt_dma_locked(struct i915_address_space *vm, struct drm_i915_gem_object *obj)
 {
-	int err;
+	void *vaddr;
 
-	err = i915_gem_object_pin_pages(obj);
-	if (err)
-		return err;
+	vaddr = i915_gem_object_pin_map(obj, I915_MAP_WB);
+	if (IS_ERR(vaddr))
+		return PTR_ERR(vaddr);
 
 	i915_gem_object_make_unshrinkable(obj);
 	return 0;
@@ -155,6 +153,14 @@ void clear_pages(struct i915_vma *vma)
 	memset(&vma->page_sizes, 0, sizeof(vma->page_sizes));
 }
 
+void *__px_vaddr(struct drm_i915_gem_object *p)
+{
+	enum i915_map_type type;
+
+	GEM_BUG_ON(!i915_gem_object_has_pages(p));
+	return page_unpack_bits(p->mm.mapping, &type);
+}
+
 dma_addr_t __px_dma(struct drm_i915_gem_object *p)
 {
 	GEM_BUG_ON(!i915_gem_object_has_pages(p));
@@ -170,32 +176,22 @@ struct page *__px_page(struct drm_i915_gem_object *p)
 void
 fill_page_dma(struct drm_i915_gem_object *p, const u64 val, unsigned int count)
 {
-	struct page *page = __px_page(p);
-	void *vaddr;
+	void *vaddr = __px_vaddr(p);
 
-	vaddr = kmap(page);
 	memset64(vaddr, val, count);
 	clflush_cache_range(vaddr, PAGE_SIZE);
-	kunmap(page);
 }
 
 static void poison_scratch_page(struct drm_i915_gem_object *scratch)
 {
-	struct sgt_iter sgt;
-	struct page *page;
+	void *vaddr = __px_vaddr(scratch);
 	u8 val;
 
 	val = 0;
 	if (IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM))
 		val = POISON_FREE;
 
-	for_each_sgt_page(page, sgt, scratch->mm.pages) {
-		void *vaddr;
-
-		vaddr = kmap(page);
-		memset(vaddr, val, PAGE_SIZE);
-		kunmap(page);
-	}
+	memset(vaddr, val, scratch->base.size);
 }
 
 int setup_scratch_page(struct i915_address_space *vm)
@@ -225,7 +221,7 @@ int setup_scratch_page(struct i915_address_space *vm)
 		if (IS_ERR(obj))
 			goto skip;
 
-		if (pin_pt_dma(vm, obj))
+		if (map_pt_dma(vm, obj))
 			goto skip_obj;
 
 		/* We need a single contiguous page for our scratch */
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.h b/drivers/gpu/drm/i915/gt/intel_gtt.h
index e67e34e17913..40e486704558 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.h
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.h
@@ -180,6 +180,9 @@ struct page *__px_page(struct drm_i915_gem_object *p);
 dma_addr_t __px_dma(struct drm_i915_gem_object *p);
 #define px_dma(px) (__px_dma(px_base(px)))
 
+void *__px_vaddr(struct drm_i915_gem_object *p);
+#define px_vaddr(px) (__px_vaddr(px_base(px)))
+
 #define px_pt(px) \
 	__px_choose_expr(px, struct i915_page_table *, __x, \
 	__px_choose_expr(px, struct i915_page_directory *, &__x->pt, \
@@ -511,8 +514,6 @@ struct i915_ppgtt *i915_ppgtt_create(struct intel_gt *gt);
 void i915_ggtt_suspend(struct i915_ggtt *gtt);
 void i915_ggtt_resume(struct i915_ggtt *ggtt);
 
-#define kmap_atomic_px(px) kmap_atomic(__px_page(px_base(px)))
-
 void
 fill_page_dma(struct drm_i915_gem_object *p, const u64 val, unsigned int count);
 
@@ -530,8 +531,8 @@ struct i915_page_table *alloc_pt(struct i915_address_space *vm);
 struct i915_page_directory *alloc_pd(struct i915_address_space *vm);
 struct i915_page_directory *__alloc_pd(int npde);
 
-int pin_pt_dma(struct i915_address_space *vm, struct drm_i915_gem_object *obj);
-int pin_pt_dma_locked(struct i915_address_space *vm, struct drm_i915_gem_object *obj);
+int map_pt_dma(struct i915_address_space *vm, struct drm_i915_gem_object *obj);
+int map_pt_dma_locked(struct i915_address_space *vm, struct drm_i915_gem_object *obj);
 
 void free_px(struct i915_address_space *vm,
 	     struct i915_page_table *pt, int lvl);
@@ -578,7 +579,7 @@ void setup_private_pat(struct intel_uncore *uncore);
 int i915_vm_alloc_pt_stash(struct i915_address_space *vm,
 			   struct i915_vm_pt_stash *stash,
 			   u64 size);
-int i915_vm_pin_pt_stash(struct i915_address_space *vm,
+int i915_vm_map_pt_stash(struct i915_address_space *vm,
 			 struct i915_vm_pt_stash *stash);
 void i915_vm_free_pt_stash(struct i915_address_space *vm,
 			   struct i915_vm_pt_stash *stash);
diff --git a/drivers/gpu/drm/i915/gt/intel_ppgtt.c b/drivers/gpu/drm/i915/gt/intel_ppgtt.c
index 014ae8ac4480..4e3d80c2295c 100644
--- a/drivers/gpu/drm/i915/gt/intel_ppgtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_ppgtt.c
@@ -87,11 +87,10 @@ write_dma_entry(struct drm_i915_gem_object * const pdma,
 		const unsigned short idx,
 		const u64 encoded_entry)
 {
-	u64 * const vaddr = kmap_atomic(__px_page(pdma));
+	u64 * const vaddr = __px_vaddr(pdma);
 
 	vaddr[idx] = encoded_entry;
 	clflush_cache_range(&vaddr[idx], sizeof(u64));
-	kunmap_atomic(vaddr);
 }
 
 void
@@ -258,7 +257,7 @@ int i915_vm_alloc_pt_stash(struct i915_address_space *vm,
 	return 0;
 }
 
-int i915_vm_pin_pt_stash(struct i915_address_space *vm,
+int i915_vm_map_pt_stash(struct i915_address_space *vm,
 			 struct i915_vm_pt_stash *stash)
 {
 	struct i915_page_table *pt;
@@ -266,7 +265,7 @@ int i915_vm_pin_pt_stash(struct i915_address_space *vm,
 
 	for (n = 0; n < ARRAY_SIZE(stash->pt); n++) {
 		for (pt = stash->pt[n]; pt; pt = pt->stash) {
-			err = pin_pt_dma_locked(vm, pt->base);
+			err = map_pt_dma_locked(vm, pt->base);
 			if (err)
 				return err;
 		}
diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index e24d33aecac4..c68a743fac2a 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -912,8 +912,7 @@ int i915_vma_pin_ww(struct i915_vma *vma, struct i915_gem_ww_ctx *ww,
 			if (err)
 				goto err_fence;
 
-			err = i915_vm_pin_pt_stash(vma->vm,
-						   &work->stash);
+			err = i915_vm_map_pt_stash(vma->vm, &work->stash);
 			if (err)
 				goto err_fence;
 		}
diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
index 2e4f06eaacc1..e060e455e9f6 100644
--- a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
@@ -186,7 +186,7 @@ static int igt_ppgtt_alloc(void *arg)
 		if (err)
 			goto err_ppgtt_cleanup;
 
-		err = i915_vm_pin_pt_stash(&ppgtt->vm, &stash);
+		err = i915_vm_map_pt_stash(&ppgtt->vm, &stash);
 		if (err) {
 			i915_vm_free_pt_stash(&ppgtt->vm, &stash);
 			goto err_ppgtt_cleanup;
@@ -208,7 +208,7 @@ static int igt_ppgtt_alloc(void *arg)
 		if (err)
 			goto err_ppgtt_cleanup;
 
-		err = i915_vm_pin_pt_stash(&ppgtt->vm, &stash);
+		err = i915_vm_map_pt_stash(&ppgtt->vm, &stash);
 		if (err) {
 			i915_vm_free_pt_stash(&ppgtt->vm, &stash);
 			goto err_ppgtt_cleanup;
@@ -325,11 +325,10 @@ static int lowlevel_hole(struct i915_address_space *vm,
 							   BIT_ULL(size)))
 					goto alloc_vm_end;
 
-				err = i915_vm_pin_pt_stash(vm, &stash);
+				err = i915_vm_map_pt_stash(vm, &stash);
 				if (!err)
 					vm->allocate_va_range(vm, &stash,
 							      addr, BIT_ULL(size));
-
 				i915_vm_free_pt_stash(vm, &stash);
 alloc_vm_end:
 				if (err == -EDEADLK) {
@@ -1967,10 +1966,9 @@ static int igt_cs_tlb(void *arg)
 			if (err)
 				goto end_ww;
 
-			err = i915_vm_pin_pt_stash(vm, &stash);
+			err = i915_vm_map_pt_stash(vm, &stash);
 			if (!err)
 				vm->allocate_va_range(vm, &stash, offset, chunk_size);
-
 			i915_vm_free_pt_stash(vm, &stash);
 end_ww:
 			if (err == -EDEADLK) {
diff --git a/drivers/gpu/drm/i915/selftests/i915_perf.c b/drivers/gpu/drm/i915/selftests/i915_perf.c
index e9d86dab8677..bfb0290967a1 100644
--- a/drivers/gpu/drm/i915/selftests/i915_perf.c
+++ b/drivers/gpu/drm/i915/selftests/i915_perf.c
@@ -307,7 +307,7 @@ static int live_noa_gpr(void *arg)
 	}
 
 	/* Poison the ce->vm so we detect writes not to the GGTT gt->scratch */
-	scratch = kmap(__px_page(ce->vm->scratch[0]));
+	scratch = __px_vaddr(ce->vm->scratch[0]);
 	memset(scratch, POISON_FREE, PAGE_SIZE);
 
 	rq = intel_context_create_request(ce);
@@ -405,7 +405,6 @@ static int live_noa_gpr(void *arg)
 out_rq:
 	i915_request_put(rq);
 out_ce:
-	kunmap(__px_page(ce->vm->scratch[0]));
 	intel_context_put(ce);
 out:
 	stream_destroy(stream);
-- 
2.26.3

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

* [PATCH 19/19] drm/i915/gtt/dgfx: place the PD in LMEM
  2021-04-12  9:05 ` [Intel-gfx] " Matthew Auld
@ 2021-04-12  9:05   ` Matthew Auld
  -1 siblings, 0 replies; 130+ messages in thread
From: Matthew Auld @ 2021-04-12  9:05 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel

For dgfx, it is a requirement that we place all the paging structures in
device local memory.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
---
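The two decisions this patch makes, choosing the page-table allocator per
address space and choosing the CPU mapping type per object, can be sketched
in a self-contained user-space model. All demo_* names are hypothetical
stand-ins rather than driver API; the only behaviour taken from the diff
below is that LMEM-capable devices use the LMEM allocator and that LMEM
objects are mapped write-combined while everything else stays write-back.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model; none of these names exist in i915. */
enum demo_region { DEMO_REGION_SMEM, DEMO_REGION_LMEM };
enum demo_map_type { DEMO_MAP_WB, DEMO_MAP_WC };

struct demo_obj {
	enum demo_region region;	/* where the backing store lives */
	size_t size;
};

struct demo_vm {
	/* Allocator hook, chosen once when the address space is created. */
	struct demo_obj (*alloc_pt)(size_t sz);
};

static struct demo_obj demo_alloc_smem(size_t sz)
{
	return (struct demo_obj){ .region = DEMO_REGION_SMEM, .size = sz };
}

static struct demo_obj demo_alloc_lmem(size_t sz)
{
	return (struct demo_obj){ .region = DEMO_REGION_LMEM, .size = sz };
}

/* Stand-in for the HAS_LMEM() check in gen8_ppgtt_create(). */
static void demo_vm_init(struct demo_vm *vm, bool has_lmem)
{
	vm->alloc_pt = has_lmem ? demo_alloc_lmem : demo_alloc_smem;
}

/* Stand-in for the map type selection added to map_pt_dma(). */
static enum demo_map_type demo_pick_map_type(const struct demo_obj *obj)
{
	return obj->region == DEMO_REGION_LMEM ? DEMO_MAP_WC : DEMO_MAP_WB;
}
```

Keeping the choice behind a per-vm hook means the callers that allocate
page tables and directories need no knowledge of which memory region backs
them; only the mapping step has to look at the object.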
 drivers/gpu/drm/i915/gt/gen8_ppgtt.c |  5 ++++-
 drivers/gpu/drm/i915/gt/intel_gtt.c  | 27 +++++++++++++++++++++++++--
 drivers/gpu/drm/i915/gt/intel_gtt.h  |  1 +
 3 files changed, 30 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
index f83496836f0f..11fb5df45a0f 100644
--- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
+++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
@@ -712,7 +712,10 @@ struct i915_ppgtt *gen8_ppgtt_create(struct intel_gt *gt)
 	 */
 	ppgtt->vm.has_read_only = !IS_GEN_RANGE(gt->i915, 11, 12);
 
-	ppgtt->vm.alloc_pt_dma = alloc_pt_dma;
+	if (HAS_LMEM(gt->i915))
+		ppgtt->vm.alloc_pt_dma = alloc_pt_lmem;
+	else
+		ppgtt->vm.alloc_pt_dma = alloc_pt_dma;
 
 	err = gen8_init_scratch(&ppgtt->vm);
 	if (err)
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.c b/drivers/gpu/drm/i915/gt/intel_gtt.c
index d386b89e2758..1eeeab45445c 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.c
@@ -7,10 +7,23 @@
 
 #include <linux/fault-inject.h>
 
+#include "gem/i915_gem_lmem.h"
 #include "i915_trace.h"
 #include "intel_gt.h"
 #include "intel_gtt.h"
 
+struct drm_i915_gem_object *alloc_pt_lmem(struct i915_address_space *vm, int sz)
+{
+	struct drm_i915_gem_object *obj;
+
+	obj = i915_gem_object_create_lmem(vm->i915, sz, 0);
+
+	/* ensure all dma objects have the same reservation class */
+	if (!IS_ERR(obj))
+		obj->base.resv = &vm->resv;
+	return obj;
+}
+
 struct drm_i915_gem_object *alloc_pt_dma(struct i915_address_space *vm, int sz)
 {
 	struct drm_i915_gem_object *obj;
@@ -27,9 +40,14 @@ struct drm_i915_gem_object *alloc_pt_dma(struct i915_address_space *vm, int sz)
 
 int map_pt_dma(struct i915_address_space *vm, struct drm_i915_gem_object *obj)
 {
+	enum i915_map_type type;
 	void *vaddr;
 
-	vaddr = i915_gem_object_pin_map_unlocked(obj, I915_MAP_WB);
+	type = I915_MAP_WB;
+	if (i915_gem_object_is_lmem(obj))
+		type = I915_MAP_WC;
+
+	vaddr = i915_gem_object_pin_map_unlocked(obj, type);
 	if (IS_ERR(vaddr))
 		return PTR_ERR(vaddr);
 
@@ -39,9 +57,14 @@ int map_pt_dma(struct i915_address_space *vm, struct drm_i915_gem_object *obj)
 
 int map_pt_dma_locked(struct i915_address_space *vm, struct drm_i915_gem_object *obj)
 {
+	enum i915_map_type type;
 	void *vaddr;
 
-	vaddr = i915_gem_object_pin_map(obj, I915_MAP_WB);
+	type = I915_MAP_WB;
+	if (i915_gem_object_is_lmem(obj))
+		type = I915_MAP_WC;
+
+	vaddr = i915_gem_object_pin_map(obj, type);
 	if (IS_ERR(vaddr))
 		return PTR_ERR(vaddr);
 
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.h b/drivers/gpu/drm/i915/gt/intel_gtt.h
index 40e486704558..44ce27c51631 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.h
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.h
@@ -527,6 +527,7 @@ int setup_scratch_page(struct i915_address_space *vm);
 void free_scratch(struct i915_address_space *vm);
 
 struct drm_i915_gem_object *alloc_pt_dma(struct i915_address_space *vm, int sz);
+struct drm_i915_gem_object *alloc_pt_lmem(struct i915_address_space *vm, int sz);
 struct i915_page_table *alloc_pt(struct i915_address_space *vm);
 struct i915_page_directory *alloc_pd(struct i915_address_space *vm);
 struct i915_page_directory *__alloc_pd(int npde);
-- 
2.26.3
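
As an illustration of the mapping-type choice made in map_pt_dma() and
map_pt_dma_locked() in the diff above, here is a minimal standalone sketch.
The names mock_map_type, mock_obj and pick_pt_map_type are hypothetical
stand-ins and not i915 definitions; only the selection logic follows the
patch (write-back by default, write-combined when the object is in local
memory).

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the driver types; not the i915 definitions. */
enum mock_map_type { MOCK_MAP_WB, MOCK_MAP_WC };

struct mock_obj {
	bool is_lmem; /* true when the backing store is device local memory */
};

/* Same shape as the patch: start from WB, switch to WC for lmem objects. */
static enum mock_map_type pick_pt_map_type(const struct mock_obj *obj)
{
	enum mock_map_type type = MOCK_MAP_WB;

	if (obj->is_lmem)
		type = MOCK_MAP_WC;

	return type;
}
```

Keeping the choice in one helper like this would avoid repeating the same
three lines in both the locked and unlocked mapping paths, though the patch
itself open-codes it in each function.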

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 130+ messages in thread

* [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for More DG1 enabling
  2021-04-12  9:05 ` [Intel-gfx] " Matthew Auld
                   ` (19 preceding siblings ...)
  (?)
@ 2021-04-12 11:07 ` Patchwork
  -1 siblings, 0 replies; 130+ messages in thread
From: Patchwork @ 2021-04-12 11:07 UTC (permalink / raw)
  To: Matthew Auld; +Cc: intel-gfx

== Series Details ==

Series: More DG1 enabling
URL   : https://patchwork.freedesktop.org/series/88947/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
525b7ae56cfd drm/i915/gt: Skip aperture remapping selftest where there is no aperture
530e7443c201 drm/i915/selftests: Only query RAPL for integrated power measurements
c46573ddee8d drm/i915: Create stolen memory region from local memory
-:13: WARNING:COMMIT_LOG_LONG_LINE: Possible unwrapped commit description (prefer a maximum 75 chars per line)
#13: 
      as stolen-local or stolen-system based on this value won't work. Split

total: 0 errors, 1 warnings, 0 checks, 231 lines checked
9ab764a5a21b drm/i915/stolen: treat stolen local as normal local memory
154d7cbfda7f drm/i915/stolen: enforce the min_page_size contract
03c0cc0dae7f drm/i915/stolen: pass the allocation flags
693ca8d4d780 drm/i915/fbdev: Use lmem physical addresses for fb_mmap() on discrete
4ec6aedb7ac2 drm/i915: Return error value when bo not in LMEM for discrete
e861362b0f25 drm/i915/lmem: Fail driver init if LMEM training failed
4bd6b41215c3 drm/i915/dg1: Fix mapping type for default state object
a7c90db9a5c4 drm/i915: Update the helper to set correct mapping
-:68: CHECK:BRACES: Unbalanced braces around else statement
#68: FILE: drivers/gpu/drm/i915/gt/intel_ring.c:56:
+	else {

total: 0 errors, 0 warnings, 1 checks, 132 lines checked
e03946b5bf8c drm/i915/lmem: Bypass aperture when lmem is available
e793b2d75109 drm/i915/dg1: Read OPROM via SPI controller
dd926fd22135 drm/i915/oprom: Basic sanitization
-:140: WARNING:LONG_LINE: line length of 122 exceeds 100 columns
#140: FILE: drivers/gpu/drm/i915/display/intel_opregion.c:996:
+	size_512_bytes = parse_ptr[((struct expansion_rom_header *)parse_ptr)->pcistructoffset + PCI_IMAGE_LENGTH_OFFSET];

-:141: WARNING:LONG_LINE: line length of 115 exceeds 100 columns
#141: FILE: drivers/gpu/drm/i915/display/intel_opregion.c:997:
+	*code_type = parse_ptr[((struct expansion_rom_header *)parse_ptr)->pcistructoffset + PCI_CODE_TYPE_OFFSET];

-:142: WARNING:LONG_LINE: line length of 125 exceeds 100 columns
#142: FILE: drivers/gpu/drm/i915/display/intel_opregion.c:998:
+	*last_img = parse_ptr[((struct expansion_rom_header *)parse_ptr)->pcistructoffset + PCI_LAST_IMAGE_INDICATOR_OFFSET];

total: 0 errors, 3 warnings, 0 checks, 304 lines checked
2f21e3309e84 drm/i915: WA for zero memory channel
-:32: ERROR:NO_AUTHOR_SIGN_OFF: Missing Signed-off-by: line by nominal patch author '"José Roberto de Souza" <jose.souza@intel.com>'

total: 1 errors, 0 warnings, 0 checks, 7 lines checked
acf0a81fb877 drm/i915/dg1: Compute MEM Bandwidth using MCHBAR
4988227f5fee drm/i915/dg1: Double memory bandwidth available
a1d3eec6b5bc drm/i915/gtt: map the PD up front
-:10: WARNING:TYPO_SPELLING: 'maping' may be misspelled - perhaps 'mapping'?
#10: 
maping code that for simple single page shmemfs object will return a
^^^^^^

total: 0 errors, 1 warnings, 0 checks, 403 lines checked
817240b12321 drm/i915/gtt/dgfx: place the PD in LMEM
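
The LONG_LINE and COMMIT_LOG_LONG_LINE warnings above compare a line's width
against a limit (100 columns for code, 75 characters for commit descriptions,
as quoted in the report). A rough sketch of such a check follows. It is not
checkpatch's implementation: visual_width and exceeds_limit are hypothetical
helpers, a tab is assumed to advance to the next multiple of 8 columns, and
bytes are counted rather than characters, so non-ASCII text would be
over-counted.

```c
#include <stddef.h>

/*
 * Visual width of one line, assuming a tab advances to the next
 * multiple of 8 columns. Counts bytes, not characters.
 */
static size_t visual_width(const char *line)
{
	size_t col = 0;

	for (; *line && *line != '\n'; line++) {
		if (*line == '\t')
			col = (col / 8 + 1) * 8;
		else
			col++;
	}

	return col;
}

/* Nonzero when the line is wider than the given limit. */
static int exceeds_limit(const char *line, size_t limit)
{
	return visual_width(line) > limit;
}
```

With these helpers, a code line would be tested with
exceeds_limit(line, 100) and a commit description line with
exceeds_limit(line, 75).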


_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 130+ messages in thread

* [Intel-gfx] ✗ Fi.CI.DOCS: warning for More DG1 enabling
  2021-04-12  9:05 ` [Intel-gfx] " Matthew Auld
                   ` (20 preceding siblings ...)
  (?)
@ 2021-04-12 11:12 ` Patchwork
  -1 siblings, 0 replies; 130+ messages in thread
From: Patchwork @ 2021-04-12 11:12 UTC (permalink / raw)
  To: Matthew Auld; +Cc: intel-gfx

== Series Details ==

Series: More DG1 enabling
URL   : https://patchwork.freedesktop.org/series/88947/
State : warning

== Summary ==

$ make htmldocs 2>&1 > /dev/null | grep i915
./drivers/gpu/drm/i915/gem/i915_gem_shrinker.c:102: warning: Function parameter or member 'ww' not described in 'i915_gem_shrink'
./drivers/gpu/drm/i915/i915_cmd_parser.c:1420: warning: Excess function parameter 'trampoline' description in 'intel_engine_cmd_parser'
./drivers/gpu/drm/i915/i915_cmd_parser.c:1420: warning: Function parameter or member 'jump_whitelist' not described in 'intel_engine_cmd_parser'
./drivers/gpu/drm/i915/i915_cmd_parser.c:1420: warning: Function parameter or member 'shadow_map' not described in 'intel_engine_cmd_parser'
./drivers/gpu/drm/i915/i915_cmd_parser.c:1420: warning: Function parameter or member 'batch_map' not described in 'intel_engine_cmd_parser'
./drivers/gpu/drm/i915/i915_cmd_parser.c:1420: warning: Excess function parameter 'trampoline' description in 'intel_engine_cmd_parser'


^ permalink raw reply	[flat|nested] 130+ messages in thread

* [Intel-gfx] ✓ Fi.CI.BAT: success for More DG1 enabling
  2021-04-12  9:05 ` [Intel-gfx] " Matthew Auld
                   ` (21 preceding siblings ...)
  (?)
@ 2021-04-12 11:37 ` Patchwork
  -1 siblings, 0 replies; 130+ messages in thread
From: Patchwork @ 2021-04-12 11:37 UTC (permalink / raw)
  To: Matthew Auld; +Cc: intel-gfx


== Series Details ==

Series: More DG1 enabling
URL   : https://patchwork.freedesktop.org/series/88947/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_9957 -> Patchwork_19912
====================================================

Summary
-------

  **SUCCESS**

  No regressions found.

  External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19912/index.html

Possible new issues
-------------------

  Here are the unknown changes that may have been introduced in Patchwork_19912:

### IGT changes ###

#### Suppressed ####

  The following results come from untrusted machines, tests, or statuses.
  They do not affect the overall result.

  * igt@kms_flip@basic-flip-vs-modeset@a-edp1:
    - {fi-cml-drallion}:  NOTRUN -> [INCOMPLETE][1]
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19912/fi-cml-drallion/igt@kms_flip@basic-flip-vs-modeset@a-edp1.html

  
Known issues
------------

  Here are the changes found in Patchwork_19912 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@amdgpu/amd_basic@semaphore:
    - fi-bsw-nick:        NOTRUN -> [SKIP][2] ([fdo#109271]) +17 similar issues
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19912/fi-bsw-nick/igt@amdgpu/amd_basic@semaphore.html

  * igt@gem_exec_gttfill@basic:
    - fi-bsw-n3050:       NOTRUN -> [SKIP][3] ([fdo#109271])
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19912/fi-bsw-n3050/igt@gem_exec_gttfill@basic.html

  * igt@gem_exec_suspend@basic-s3:
    - fi-bsw-n3050:       NOTRUN -> [INCOMPLETE][4] ([i915#3159])
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19912/fi-bsw-n3050/igt@gem_exec_suspend@basic-s3.html

  * igt@prime_self_import@basic-with_one_bo:
    - fi-tgl-y:           [PASS][5] -> [DMESG-WARN][6] ([i915#402])
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9957/fi-tgl-y/igt@prime_self_import@basic-with_one_bo.html
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19912/fi-tgl-y/igt@prime_self_import@basic-with_one_bo.html

  
#### Possible fixes ####

  * igt@i915_selftest@live@hangcheck:
    - {fi-hsw-gt1}:       [DMESG-WARN][7] ([i915#3303]) -> [PASS][8]
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9957/fi-hsw-gt1/igt@i915_selftest@live@hangcheck.html
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19912/fi-hsw-gt1/igt@i915_selftest@live@hangcheck.html

  * igt@i915_selftest@live@late_gt_pm:
    - fi-bsw-nick:        [DMESG-FAIL][9] ([i915#2927]) -> [PASS][10]
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9957/fi-bsw-nick/igt@i915_selftest@live@late_gt_pm.html
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19912/fi-bsw-nick/igt@i915_selftest@live@late_gt_pm.html

  * igt@prime_self_import@basic-with_one_bo_two_files:
    - fi-tgl-y:           [DMESG-WARN][11] ([i915#402]) -> [PASS][12] +1 similar issue
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9957/fi-tgl-y/igt@prime_self_import@basic-with_one_bo_two_files.html
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19912/fi-tgl-y/igt@prime_self_import@basic-with_one_bo_two_files.html

  
  {name}: This element is suppressed. This means it is ignored when computing
          the status of the difference (SUCCESS, WARNING, or FAILURE).
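
  The note above can be read as: the overall status is derived only from
  changes that are not suppressed. A minimal sketch of that rule follows.
  The types and the severity ordering are assumptions made for illustration
  and are not taken from the CI tooling.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical severity ordering; the real CI tooling may rank differently. */
enum overall { OVERALL_SUCCESS, OVERALL_WARNING, OVERALL_FAILURE };

struct change {
	bool suppressed;     /* shown as {name} in the report */
	enum overall impact; /* what this change alone would imply */
};

/* Worst impact among the non-suppressed changes; SUCCESS if there are none. */
static enum overall overall_status(const struct change *c, size_t n)
{
	enum overall worst = OVERALL_SUCCESS;

	for (size_t i = 0; i < n; i++) {
		if (c[i].suppressed)
			continue;
		if (c[i].impact > worst)
			worst = c[i].impact;
	}

	return worst;
}
```

  Under this reading, the INCOMPLETE result on {fi-cml-drallion} listed under
  "Suppressed" does not prevent the run from reporting SUCCESS.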

  [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
  [fdo#111827]: https://bugs.freedesktop.org/show_bug.cgi?id=111827
  [i915#1208]: https://gitlab.freedesktop.org/drm/intel/issues/1208
  [i915#2190]: https://gitlab.freedesktop.org/drm/intel/issues/2190
  [i915#2927]: https://gitlab.freedesktop.org/drm/intel/issues/2927
  [i915#3159]: https://gitlab.freedesktop.org/drm/intel/issues/3159
  [i915#3303]: https://gitlab.freedesktop.org/drm/intel/issues/3303
  [i915#402]: https://gitlab.freedesktop.org/drm/intel/issues/402


Participating hosts (42 -> 39)
------------------------------

  Additional (2): fi-cml-drallion fi-bsw-n3050 
  Missing    (5): fi-kbl-soraka fi-ilk-m540 fi-hsw-4200u fi-bsw-cyan fi-bdw-samus 


Build changes
-------------

  * Linux: CI_DRM_9957 -> Patchwork_19912

  CI-20190529: 20190529
  CI_DRM_9957: 1c979586f3208fdd56573cec840f7d9000be51ab @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_6063: d3b7f74ce5df6fdea03e490b7c64f0c6bfe76f03 @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_19912: 817240b12321d5959560cf2306142d63644bbef5 @ git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

817240b12321 drm/i915/gtt/dgfx: place the PD in LMEM
a1d3eec6b5bc drm/i915/gtt: map the PD up front
4988227f5fee drm/i915/dg1: Double memory bandwidth available
acf0a81fb877 drm/i915/dg1: Compute MEM Bandwidth using MCHBAR
2f21e3309e84 drm/i915: WA for zero memory channel
dd926fd22135 drm/i915/oprom: Basic sanitization
e793b2d75109 drm/i915/dg1: Read OPROM via SPI controller
e03946b5bf8c drm/i915/lmem: Bypass aperture when lmem is available
a7c90db9a5c4 drm/i915: Update the helper to set correct mapping
4bd6b41215c3 drm/i915/dg1: Fix mapping type for default state object
e861362b0f25 drm/i915/lmem: Fail driver init if LMEM training failed
4ec6aedb7ac2 drm/i915: Return error value when bo not in LMEM for discrete
693ca8d4d780 drm/i915/fbdev: Use lmem physical addresses for fb_mmap() on discrete
03c0cc0dae7f drm/i915/stolen: pass the allocation flags
154d7cbfda7f drm/i915/stolen: enforce the min_page_size contract
9ab764a5a21b drm/i915/stolen: treat stolen local as normal local memory
c46573ddee8d drm/i915: Create stolen memory region from local memory
530e7443c201 drm/i915/selftests: Only query RAPL for integrated power measurements
525b7ae56cfd drm/i915/gt: Skip aperture remapping selftest where there is no aperture

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19912/index.html

^ permalink raw reply	[flat|nested] 130+ messages in thread

* [Intel-gfx] ✗ Fi.CI.IGT: failure for More DG1 enabling
  2021-04-12  9:05 ` [Intel-gfx] " Matthew Auld
                   ` (22 preceding siblings ...)
  (?)
@ 2021-04-12 13:37 ` Patchwork
  -1 siblings, 0 replies; 130+ messages in thread
From: Patchwork @ 2021-04-12 13:37 UTC (permalink / raw)
  To: Matthew Auld; +Cc: intel-gfx


== Series Details ==

Series: More DG1 enabling
URL   : https://patchwork.freedesktop.org/series/88947/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_9957_full -> Patchwork_19912_full
====================================================

Summary
-------

  **FAILURE**

  Serious unknown changes coming with Patchwork_19912_full absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_19912_full, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  

Possible new issues
-------------------

  Here are the unknown changes that may have been introduced in Patchwork_19912_full:

### Piglit changes ###

#### Possible regressions ####

  * spec@!opengl 1.0@gl-1.0-blend-func (NEW):
    - pig-glk-j5005:      NOTRUN -> [INCOMPLETE][1] +4 similar issues
   [1]: None

  
New tests
---------

  New tests have been introduced between CI_DRM_9957_full and Patchwork_19912_full:

### New Piglit tests (5) ###

  * security@initialized-texmemory:
    - Statuses : 1 incomplete(s)
    - Exec time: [0.0] s

  * spec@!opengl 1.0@gl-1.0-blend-func:
    - Statuses : 1 incomplete(s)
    - Exec time: [0.0] s

  * spec@!opengl 1.0@gl-1.0-read-cache-stress-test:
    - Statuses : 1 incomplete(s)
    - Exec time: [0.0] s

  * spec@!opengl 1.0@gl-1.0-readpixsanity:
    - Statuses : 1 incomplete(s)
    - Exec time: [0.0] s

  * spec@!opengl 1.0@gl-1.0-simple-readbuffer:
    - Statuses : 1 incomplete(s)
    - Exec time: [0.0] s

  

Known issues
------------

  Here are the changes found in Patchwork_19912_full that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@gem_ctx_persistence@smoketest:
    - shard-snb:          NOTRUN -> [SKIP][2] ([fdo#109271] / [i915#1099]) +4 similar issues
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19912/shard-snb7/igt@gem_ctx_persistence@smoketest.html

  * igt@gem_exec_fair@basic-none@vcs1:
    - shard-iclb:         NOTRUN -> [FAIL][3] ([i915#2842])
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19912/shard-iclb2/igt@gem_exec_fair@basic-none@vcs1.html

  * igt@gem_exec_fair@basic-pace-share@rcs0:
    - shard-glk:          [PASS][4] -> [FAIL][5] ([i915#2842])
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9957/shard-glk6/igt@gem_exec_fair@basic-pace-share@rcs0.html
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19912/shard-glk5/igt@gem_exec_fair@basic-pace-share@rcs0.html

  * igt@gem_exec_fair@basic-pace-solo@rcs0:
    - shard-kbl:          [PASS][6] -> [FAIL][7] ([i915#2842]) +1 similar issue
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9957/shard-kbl3/igt@gem_exec_fair@basic-pace-solo@rcs0.html
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19912/shard-kbl7/igt@gem_exec_fair@basic-pace-solo@rcs0.html

  * igt@gem_exec_fair@basic-pace@vcs1:
    - shard-tglb:         [PASS][8] -> [FAIL][9] ([i915#2842])
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9957/shard-tglb3/igt@gem_exec_fair@basic-pace@vcs1.html
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19912/shard-tglb5/igt@gem_exec_fair@basic-pace@vcs1.html

  * igt@gem_exec_reloc@basic-wide-active@bcs0:
    - shard-skl:          NOTRUN -> [FAIL][10] ([i915#2389]) +3 similar issues
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19912/shard-skl4/igt@gem_exec_reloc@basic-wide-active@bcs0.html

  * igt@gem_exec_reloc@basic-wide-active@rcs0:
    - shard-snb:          NOTRUN -> [FAIL][11] ([i915#2389]) +2 similar issues
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19912/shard-snb2/igt@gem_exec_reloc@basic-wide-active@rcs0.html

  * igt@gem_exec_suspend@basic-s3:
    - shard-apl:          NOTRUN -> [DMESG-WARN][12] ([i915#180])
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19912/shard-apl8/igt@gem_exec_suspend@basic-s3.html

  * igt@gem_exec_whisper@basic-fds-all:
    - shard-iclb:         [PASS][13] -> [INCOMPLETE][14] ([i915#1394])
   [13]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9957/shard-iclb4/igt@gem_exec_whisper@basic-fds-all.html
   [14]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19912/shard-iclb4/igt@gem_exec_whisper@basic-fds-all.html

  * igt@gem_mmap_gtt@big-copy-odd:
    - shard-skl:          [PASS][15] -> [FAIL][16] ([i915#307])
   [15]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9957/shard-skl4/igt@gem_mmap_gtt@big-copy-odd.html
   [16]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19912/shard-skl5/igt@gem_mmap_gtt@big-copy-odd.html
    - shard-glk:          [PASS][17] -> [FAIL][18] ([i915#307])
   [17]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9957/shard-glk2/igt@gem_mmap_gtt@big-copy-odd.html
   [18]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19912/shard-glk4/igt@gem_mmap_gtt@big-copy-odd.html

  * igt@gem_mmap_gtt@cpuset-big-copy:
    - shard-iclb:         [PASS][19] -> [FAIL][20] ([i915#2428])
   [19]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9957/shard-iclb7/igt@gem_mmap_gtt@cpuset-big-copy.html
   [20]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19912/shard-iclb7/igt@gem_mmap_gtt@cpuset-big-copy.html

  * igt@gem_ppgtt@flink-and-close-vma-leak:
    - shard-glk:          [PASS][21] -> [FAIL][22] ([i915#644])
   [21]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9957/shard-glk1/igt@gem_ppgtt@flink-and-close-vma-leak.html
   [22]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19912/shard-glk8/igt@gem_ppgtt@flink-and-close-vma-leak.html

  * igt@gem_userptr_blits@input-checking:
    - shard-snb:          NOTRUN -> [DMESG-WARN][23] ([i915#3002])
   [23]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19912/shard-snb5/igt@gem_userptr_blits@input-checking.html

  * igt@gem_userptr_blits@set-cache-level:
    - shard-snb:          NOTRUN -> [FAIL][24] ([i915#3324])
   [24]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19912/shard-snb2/igt@gem_userptr_blits@set-cache-level.html

  * igt@gem_userptr_blits@vma-merge:
    - shard-apl:          NOTRUN -> [FAIL][25] ([i915#3318])
   [25]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19912/shard-apl7/igt@gem_userptr_blits@vma-merge.html

  * igt@gen9_exec_parse@bb-large:
    - shard-apl:          NOTRUN -> [FAIL][26] ([i915#3296])
   [26]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19912/shard-apl1/igt@gen9_exec_parse@bb-large.html

  * igt@i915_pm_dc@dc6-dpms:
    - shard-kbl:          NOTRUN -> [FAIL][27] ([i915#454])
   [27]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19912/shard-kbl4/igt@i915_pm_dc@dc6-dpms.html

  * igt@i915_pm_dc@dc6-psr:
    - shard-skl:          NOTRUN -> [FAIL][28] ([i915#454])
   [28]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19912/shard-skl7/igt@i915_pm_dc@dc6-psr.html

  * igt@i915_pm_lpsp@kms-lpsp@kms-lpsp-dp:
    - shard-kbl:          NOTRUN -> [SKIP][29] ([fdo#109271] / [i915#1937])
   [29]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19912/shard-kbl3/igt@i915_pm_lpsp@kms-lpsp@kms-lpsp-dp.html

  * igt@i915_pm_rpm@modeset-lpsp-stress:
    - shard-apl:          NOTRUN -> [SKIP][30] ([fdo#109271]) +342 similar issues
   [30]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19912/shard-apl1/igt@i915_pm_rpm@modeset-lpsp-stress.html

  * igt@i915_selftest@live@hangcheck:
    - shard-snb:          [PASS][31] -> [INCOMPLETE][32] ([i915#2782])
   [31]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9957/shard-snb2/igt@i915_selftest@live@hangcheck.html
   [32]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19912/shard-snb7/igt@i915_selftest@live@hangcheck.html

  * igt@kms_chamelium@hdmi-hpd-for-each-pipe:
    - shard-kbl:          NOTRUN -> [SKIP][33] ([fdo#109271] / [fdo#111827]) +7 similar issues
   [33]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19912/shard-kbl6/igt@kms_chamelium@hdmi-hpd-for-each-pipe.html

  * igt@kms_chamelium@vga-hpd:
    - shard-apl:          NOTRUN -> [SKIP][34] ([fdo#109271] / [fdo#111827]) +31 similar issues
   [34]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19912/shard-apl2/igt@kms_chamelium@vga-hpd.html

  * igt@kms_color_chamelium@pipe-c-ctm-red-to-blue:
    - shard-snb:          NOTRUN -> [SKIP][35] ([fdo#109271] / [fdo#111827]) +20 similar issues
   [35]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19912/shard-snb5/igt@kms_color_chamelium@pipe-c-ctm-red-to-blue.html

  * igt@kms_color_chamelium@pipe-c-degamma:
    - shard-skl:          NOTRUN -> [SKIP][36] ([fdo#109271] / [fdo#111827]) +1 similar issue
   [36]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19912/shard-skl9/igt@kms_color_chamelium@pipe-c-degamma.html

  * igt@kms_content_protection@legacy:
    - shard-kbl:          NOTRUN -> [TIMEOUT][37] ([i915#1319])
   [37]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19912/shard-kbl6/igt@kms_content_protection@legacy.html

  * igt@kms_content_protection@lic:
    - shard-apl:          NOTRUN -> [TIMEOUT][38] ([i915#1319]) +1 similar issue
   [38]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19912/shard-apl8/igt@kms_content_protection@lic.html

  * igt@kms_cursor_crc@pipe-d-cursor-max-size-sliding:
    - shard-kbl:          NOTRUN -> [SKIP][39] ([fdo#109271]) +59 similar issues
   [39]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19912/shard-kbl3/igt@kms_cursor_crc@pipe-d-cursor-max-size-sliding.html

  * igt@kms_cursor_legacy@flip-vs-cursor-atomic-transitions:
    - shard-skl:          [PASS][40] -> [FAIL][41] ([i915#2346])
   [40]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9957/shard-skl9/igt@kms_cursor_legacy@flip-vs-cursor-atomic-transitions.html
   [41]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19912/shard-skl1/igt@kms_cursor_legacy@flip-vs-cursor-atomic-transitions.html

  * igt@kms_cursor_legacy@flip-vs-cursor-toggle:
    - shard-tglb:         [PASS][42] -> [FAIL][43] ([i915#2346])
   [42]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9957/shard-tglb8/igt@kms_cursor_legacy@flip-vs-cursor-toggle.html
   [43]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19912/shard-tglb1/igt@kms_cursor_legacy@flip-vs-cursor-toggle.html

  * igt@kms_flip@flip-vs-expired-vblank-interruptible@a-edp1:
    - shard-skl:          [PASS][44] -> [FAIL][45] ([i915#79])
   [44]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9957/shard-skl8/igt@kms_flip@flip-vs-expired-vblank-interruptible@a-edp1.html
   [45]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19912/shard-skl10/igt@kms_flip@flip-vs-expired-vblank-interruptible@a-edp1.html

  * igt@kms_flip@flip-vs-suspend-interruptible@a-dp1:
    - shard-kbl:          [PASS][46] -> [INCOMPLETE][47] ([i915#155])
   [46]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9957/shard-kbl2/igt@kms_flip@flip-vs-suspend-interruptible@a-dp1.html
   [47]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19912/shard-kbl3/igt@kms_flip@flip-vs-suspend-interruptible@a-dp1.html

  * igt@kms_flip@plain-flip-fb-recreate-interruptible@c-dp1:
    - shard-kbl:          [PASS][48] -> [FAIL][49] ([i915#2122])
   [48]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9957/shard-kbl6/igt@kms_flip@plain-flip-fb-recreate-interruptible@c-dp1.html
   [49]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19912/shard-kbl6/igt@kms_flip@plain-flip-fb-recreate-interruptible@c-dp1.html

  * igt@kms_flip_scaled_crc@flip-32bpp-ytile-to-32bpp-ytileccs:
    - shard-apl:          NOTRUN -> [FAIL][50] ([i915#2641]) +1 similar issue
   [50]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19912/shard-apl6/igt@kms_flip_scaled_crc@flip-32bpp-ytile-to-32bpp-ytileccs.html

  * igt@kms_flip_scaled_crc@flip-32bpp-ytileccs-to-64bpp-ytile:
    - shard-apl:          NOTRUN -> [SKIP][51] ([fdo#109271] / [i915#2642])
   [51]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19912/shard-apl1/igt@kms_flip_scaled_crc@flip-32bpp-ytileccs-to-64bpp-ytile.html

  * igt@kms_flip_scaled_crc@flip-64bpp-ytile-to-32bpp-ytilercccs:
    - shard-skl:          NOTRUN -> [SKIP][52] ([fdo#109271] / [i915#2672])
   [52]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19912/shard-skl4/igt@kms_flip_scaled_crc@flip-64bpp-ytile-to-32bpp-ytilercccs.html

  * igt@kms_frontbuffer_tracking@fbc-suspend:
    - shard-apl:          [PASS][53] -> [DMESG-WARN][54] ([i915#180])
   [53]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9957/shard-apl8/igt@kms_frontbuffer_tracking@fbc-suspend.html
   [54]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19912/shard-apl6/igt@kms_frontbuffer_tracking@fbc-suspend.html

  * igt@kms_frontbuffer_tracking@fbcpsr-1p-primscrn-spr-indfb-draw-pwrite:
    - shard-skl:          NOTRUN -> [SKIP][55] ([fdo#109271]) +56 similar issues
   [55]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19912/shard-skl7/igt@kms_frontbuffer_tracking@fbcpsr-1p-primscrn-spr-indfb-draw-pwrite.html

  * igt@kms_hdr@bpc-switch-dpms:
    - shard-skl:          [PASS][56] -> [FAIL][57] ([i915#1188])
   [56]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9957/shard-skl9/igt@kms_hdr@bpc-switch-dpms.html
   [57]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19912/shard-skl1/igt@kms_hdr@bpc-switch-dpms.html

  * igt@kms_pipe_crc_basic@read-crc-pipe-d-frame-sequence:
    - shard-kbl:          NOTRUN -> [SKIP][58] ([fdo#109271] / [i915#533])
   [58]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19912/shard-kbl6/igt@kms_pipe_crc_basic@read-crc-pipe-d-frame-sequence.html

  * igt@kms_pipe_crc_basic@suspend-read-crc-pipe-a:
    - shard-kbl:          [PASS][59] -> [DMESG-WARN][60] ([i915#180]) +2 similar issues
   [59]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9957/shard-kbl7/igt@kms_pipe_crc_basic@suspend-read-crc-pipe-a.html
   [60]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19912/shard-kbl4/igt@kms_pipe_crc_basic@suspend-read-crc-pipe-a.html

  * igt@kms_plane@plane-panning-bottom-right-suspend-pipe-b-planes:
    - shard-skl:          [PASS][61] -> [INCOMPLETE][62] ([i915#198])
   [61]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9957/shard-skl10/igt@kms_plane@plane-panning-bottom-right-suspend-pipe-b-planes.html
   [62]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19912/shard-skl9/igt@kms_plane@plane-panning-bottom-right-suspend-pipe-b-planes.html

  * igt@kms_plane_alpha_blend@pipe-a-alpha-opaque-fb:
    - shard-apl:          NOTRUN -> [FAIL][63] ([fdo#108145] / [i915#265]) +4 similar issues
   [63]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19912/shard-apl7/igt@kms_plane_alpha_blend@pipe-a-alpha-opaque-fb.html

  * igt@kms_plane_alpha_blend@pipe-b-alpha-7efc:
    - shard-skl:          NOTRUN -> [FAIL][64] ([fdo#108145] / [i915#265])
   [64]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19912/shard-skl9/igt@kms_plane_alpha_blend@pipe-b-alpha-7efc.html

  * igt@kms_plane_alpha_blend@pipe-b-alpha-transparent-fb:
    - shard-apl:          NOTRUN -> [FAIL][65] ([i915#265])
   [65]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19912/shard-apl6/igt@kms_plane_alpha_blend@pipe-b-alpha-transparent-fb.html

  * igt@kms_plane_alpha_blend@pipe-c-coverage-7efc:
    - shard-skl:          [PASS][66] -> [FAIL][67] ([fdo#108145] / [i915#265]) +4 similar issues
   [66]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9957/shard-skl4/igt@kms_plane_alpha_blend@pipe-c-coverage-7efc.html
   [67]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19912/shard-skl1/igt@kms_plane_alpha_blend@pipe-c-coverage-7efc.html

  * igt@kms_psr2_sf@overlay-primary-update-sf-dmg-area-3:
    - shard-kbl:          NOTRUN -> [SKIP][68] ([fdo#109271] / [i915#658]) +4 similar issues
   [68]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19912/shard-kbl6/igt@kms_psr2_sf@overlay-primary-update-sf-dmg-area-3.html

  * igt@kms_psr2_sf@plane-move-sf-dmg-area-2:
    - shard-apl:          NOTRUN -> [SKIP][69] ([fdo#109271] / [i915#658]) +6 similar issues
   [69]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19912/shard-apl6/igt@kms_psr2_sf@plane-move-sf-dmg-area-2.html

  * igt@kms_psr@psr2_primary_mmap_cpu:
    - shard-iclb:         [PASS][70] -> [SKIP][71] ([fdo#109441]) +3 similar issues
   [70]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9957/shard-iclb2/igt@kms_psr@psr2_primary_mmap_cpu.html
   [71]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19912/shard-iclb6/igt@kms_psr@psr2_primary_mmap_cpu.html

  * igt@kms_sysfs_edid_timing:
    - shard-apl:          NOTRUN -> [FAIL][72] ([IGT#2])
   [72]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19912/shard-apl1/igt@kms_sysfs_edid_timing.html

  * igt@kms_vblank@pipe-d-query-forked-hang:
    - shard-snb:          NOTRUN -> [SKIP][73] ([fdo#109271]) +365 similar issues
   [73]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19912/shard-snb5/igt@kms_vblank@pipe-d-query-forked-hang.html

  * igt@kms_vblank@pipe-d-wait-idle:
    - shard-apl:          NOTRUN -> [SKIP][74] ([fdo#109271] / [i915#533]) +1 similar issue
   [74]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19912/shard-apl7/igt@kms_vblank@pipe-d-wait-idle.html

  * igt@kms_writeback@writeback-check-output:
    - shard-skl:          NOTRUN -> [SKIP][75] ([fdo#109271] / [i915#2437])
   [75]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19912/shard-skl9/igt@kms_writeback@writeback-check-output.html

  * igt@kms_writeback@writeback-invalid-parameters:
    - shard-apl:          NOTRUN -> [SKIP][76] ([fdo#109271] / [i915#2437])
   [76]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19912/shard-apl7/igt@kms_writeback@writeback-invalid-parameters.html

  * igt@perf@blocking:
    - shard-skl:          [PASS][77] -> [FAIL][78] ([i915#1542]) +1 similar issue
   [77]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9957/shard-skl8/igt@perf@blocking.html
   [78]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19912/shard-skl7/igt@perf@blocking.html

  * igt@perf@polling-parameterized:
    - shard-glk:          [PASS][79] -> [FAIL][80] ([i915#1542])
   [79]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9957/shard-glk9/igt@perf@polling-parameterized.html
   [80]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19912/shard-glk8/igt@perf@polling-parameterized.html

  * igt@perf@polling-small-buf:
    - shard-skl:          [PASS][81] -> [FAIL][82] ([i915#1722])
   [81]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9957/shard-skl4/igt@perf@polling-small-buf.html
   [82]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19912/shard-skl1/igt@perf@polling-small-buf.html

  * igt@sysfs_clients@fair-7:
    - shard-apl:          NOTRUN -> [SKIP][83] ([fdo#109271] / [i915#2994]) +5 similar issues
   [83]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19912/shard-apl7/igt@sysfs_clients@fair-7.html

  * igt@sysfs_clients@recycle-many:
    - shard-kbl:          NOTRUN -> [SKIP][84] ([fdo#109271] / [i915#2994]) +1 similar issue
   [84]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19912/shard-kbl4/igt@sysfs_clients@recycle-many.html

  
#### Possible fixes ####

  * igt@gem_exec_fair@basic-none-share@rcs0:
    - shard-iclb:         [FAIL][85] ([i915#2842]) -> [PASS][86]
   [85]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9957/shard-iclb2/igt@gem_exec_fair@basic-none-share@rcs0.html
   [86]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19912/shard-iclb7/igt@gem_exec_fair@basic-none-share@rcs0.html
    - shard-tglb:         [FAIL][87] ([i915#2842]) -> [PASS][88] +2 similar issues
   [87]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9957/shard-tglb6/igt@gem_exec_fair@basic-none-share@rcs0.html
   [88]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19912/shard-tglb2/igt@gem_exec_fair@basic-none-share@rcs0.html

  * igt@gem_exec_fair@basic-none@rcs0:
    - shard-kbl:          [FAIL][89] ([i915#2842]) -> [PASS][90] +2 similar issues
   [89]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9957/shard-kbl7/igt@gem_exec_fair@basic-none@rcs0.html
   [90]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19912/shard-kbl2/igt@gem_exec_fair@basic-none@rcs0.html

  * igt@gem_workarounds@suspend-resume:
    - shard-skl:          [INCOMPLETE][91] ([i915#198]) -> [PASS][92] +1 similar issue
   [91]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9957/shard-skl10/igt@gem_workarounds@suspend-resume.html
   [92]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19912/shard-skl7/igt@gem_workarounds@suspend-resume.html

  * igt@i915_suspend@fence-restore-tiled2untiled:
    - shard-apl:          [DMESG-WARN][93] ([i915#180]) -> [PASS][94] +1 similar issue
   [93]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9957/shard-apl6/igt@i915_suspend@fence-restore-tiled2untiled.html
   [94]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19912/shard-apl8/igt@i915_suspend@fence-restore-tiled2untiled.html

  * igt@i915_suspend@forcewake:
    - shard-kbl:          [DMESG-WARN][95] ([i915#180]) -> [PASS][96] +2 similar issues
   [95]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9957/shard-kbl1/igt@i915_suspend@forcewake.html
   [96]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19912/shard-kbl3/igt@i915_suspend@forcewake.html

  * igt@kms_cursor_edge_walk@pipe-c-64x64-right-edge:
    - shard-skl:          [DMESG-WARN][97] ([i915#1982]) -> [PASS][98]
   [97]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9957/shard-skl5/igt@kms_cursor_edge_walk@pipe-c-64x64-right-edge.html
   [98]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19912/shard-skl2/igt@kms_cursor_edge_walk@pipe-c-64x64-right-edge.html

  * igt@kms_draw_crc@draw-method-rgb565-mmap-wc-ytiled:
    - shard-glk:          [FAIL][99] ([i915#52] / [i915#54]) -> [PASS][100] +1 similar issue
   [99]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9957/shard-glk3/igt@kms_draw_crc@draw-method-rgb565-mmap-wc-ytiled.html
   [100]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19912/shard-glk4/igt@kms_draw_crc@draw-method-rgb565-mmap-wc-ytiled.html

  * igt@kms_flip@2x-flip-vs-expired-vblank@bc-hdmi-a1-hdmi-a2:
    - shard-glk:          [FAIL][101] ([i915#2122]) -> [PASS][102]
   [101]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9957/shard-glk9/igt@kms_flip@2x-flip-vs-expired-vblank@bc-hdmi-a1-hdmi-a2.html
   [102]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19912/shard-glk6/igt@kms_flip@2x-flip-vs-expired-vblank@bc-hdmi-a1-hdmi-a2.html

  * igt@kms_flip@flip-vs-expired-vblank-interruptible@c-edp1:
    - shard-skl:          [FAIL][103] ([i915#79]) -> [PASS][104]
   [103]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9957/shard-skl8/igt@kms_flip@flip-vs-expired-vblank-interruptible@c-edp1.html
   [104]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19912/shard-skl10/igt@kms_flip@flip-vs-expired-vblank-interruptible@c-edp1.html

  * igt@kms_flip@plain-flip-ts-check@a-edp1:
    - shard-skl:          [FAIL][105] ([i915#2122]) -> [PASS][106]
   [105]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9957/shard-skl8/igt@kms_flip@plain-flip-ts-check@a-edp1.html
   [106]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19912/shard-skl7/igt@kms_flip@plain-flip-ts-check@a-edp1.html

  * igt@kms_hdr@bpc-switch:
    - shard-skl:          [FAIL][107] ([i915#1188]) -> [PASS][108]
   [107]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9957/shard-skl7/igt@kms_hdr@bpc-switch.html
   [108]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19912/shard-skl9/igt@kms_hdr@bpc-switch.html

  * igt@kms_psr@psr2_sprite_plane_move:
    - shard-iclb:         [SKIP][109] ([fdo#109441]) -> [PASS][110]
   [109]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9957/shard-iclb3/igt@kms_psr@psr2_sprite_plane_move.html
   [110]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19912/shard-iclb2/igt@kms_psr@psr2_sprite_plane_move.html

  * igt@perf@polling-parameterized:
    - shard-tglb:         [FAIL][111] ([i915#1542]) -> [PASS][112]
   [111]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9957/shard-tglb1/igt@perf@polling-parameterized.html
   [112]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19912/shard-tglb8/igt@perf@polling-parameterized.html

  * igt@sysfs_preempt_timeout@timeout@bcs0:
    - shard-skl:          [FAIL][113] -> [PASS][114]
   [113]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9957/shard-skl5/igt@sysfs_preempt_timeout@timeout@bcs0.html
   [114]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19912/shard-skl2/igt@sysfs_preempt_timeout@timeout@bcs0.html

  
#### Warnings ####

  * igt@i915_pm_dc@dc9-dpms:
    - shard-iclb:         [FAIL][115] ([i915#3343]) -> [SKIP][116] ([i915#3288])
   [115]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9957/shard-iclb5/igt@i915_pm_dc@dc9-dpms.html
   [116]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19912/shard-iclb2/igt@i915_pm_dc@dc9-dpms.html

  * igt@kms_psr2_sf@overlay-primary-update-sf-dmg-area-4:
    - shard-iclb:         [SKIP][117] ([i915#658]) -> [SKIP][118] ([i915#2920]) +3 similar issues
   [117]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9957/shard-iclb5/igt@kms_psr2_sf@overlay-primary-update-sf-dmg-area-4.html
   [118]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19912/shard-iclb2/igt@kms_psr2_sf@overlay-primary-update-sf-dmg-area-4.html

  * igt@kms_psr2_sf@primary-plane-update-sf-dmg-area-4:
    - shard-iclb:         [SKIP][119] ([i915#2920]) -> [SKIP][120] ([i915#658]) +1 similar issue
   [119]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9957/shard-iclb2/igt@kms_psr2_sf@primary-plane-update-sf-dmg-area-4.html
   [120]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19912/shard-iclb6/igt@kms_psr2_sf@primary-plane-update-sf-dmg-area-4.html

  * igt@runner@aborted:
    - shard-kbl:          ([FAIL][121], [FAIL][122], [FAIL][123], [FAIL][124], [FAIL][125], [FAIL][126]) ([i915#180] / [i915#1814] / [i915#3002]) -> ([FAIL][127], [FAIL][128], [FAIL][129], [FAIL][130]) ([i915#1436] / [i915#180] / [i915#1814])
   [121]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9957/shard-kbl1/igt@runner@aborted.html
   [122]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9957/shard-kbl7/igt@runner@aborted.html
   [123]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9957/shard-kbl2/igt@runner@aborted.html
   [124]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9957/shard-kbl1/igt@runner@aborted.html
   [125]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9957/shard-kbl1/igt@runner@aborted.html
   [126]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9957/shard-kbl1/igt@runner@aborted.html
   [127]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19912/shard-kbl1/igt@runner@aborted.html
   [128]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19912/shard-kbl4/igt@runner@aborted.html
   [129]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19912/shard-kbl1/igt@runner@aborted.html
   [130]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19912/shard-kbl1/igt@runner@aborted.html
    - shard-skl:          ([FAIL][131], [FAIL][132], [FAIL][133], [FAIL][134]) ([i915#1436] / [i915#1814] / [i915#2029] / [i915#3002]) -> ([FAIL][135], [FAIL][136]) ([i915#2029] / [i915#3002])
   [131]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9957/shard-skl9/igt@runner@aborted.html
   [132]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9957/shard-skl2/igt@runner@aborted.html
   [133]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9957/shard-skl5/igt@runner@aborted.html
   [134]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9957/shard-skl10/igt@runner@aborted.html
   [135]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19912/shard-skl2/igt@runner@aborted.html
   [136]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19912/shard-skl6/igt@runner@aborted.html

  
  [IGT#2]: https://gitlab.freedesktop.org/drm/igt-gpu-tools/issues/2
  [fdo#108145]: https://bugs.freedesktop.org/show_bug.cgi?id=108145
  [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
  [fdo#109441]: https://bugs.freedesktop.org/show_bug.cgi?id=109441
  [fdo#111827]: https://bugs.freedesktop.org/show_bug.cgi?id=111827
  [i915#1099]: https://gitlab.freedesktop.org/drm/intel/issues/1099
  [i915#1188]: https://gitlab.freedesktop.org/drm/intel/issues/1188
  [i915#1319]: https://gitlab.freedesktop.org/drm/intel/issues/1319
  [i915#1394]: https://gitlab.freedesktop.org/drm/intel/issues/1394
  [i915#1436]: https://gitlab.freedesktop.org/drm/intel/issues/1436
  [i915#1542]: https://gitlab.freedesktop.org/drm/intel/issues/1542
  [i915#155]: https://gitlab.freedesktop.org/drm/intel/issues/155
  [i915#1722]: https://gitlab.freedesktop.org/drm/intel/issues/1722
  [i915#180]: https://gitlab.freedesktop.org/drm/intel/issues/180
  [i915#1814]: https://gitlab.freedesktop.org/drm/intel/issues/1814
  [i915#1937]: https://gitlab.freedesktop.org/drm/intel/issues/1937
  [i915#198]: https://gitlab.freedesktop.org/drm/intel/issues/198
  [i915#1982]: https://gitlab.freedesktop.org/drm/intel/issues/1982
  [i915#2029]: https://gitlab.freedesktop.org/drm/intel/issues/2029
  [i915#2122]: https://gitlab.freedesktop.org/drm/intel/issues/2122
  [i915#2346]: https://gitlab.freedesktop.org/drm/intel/issues/2346
  [i915#2389]: https://gitlab.freedesktop.org/drm/intel/issues/2389
  [i915#2428]: https://gitlab.freedesktop.org/drm/intel/issues/2428
  [i915#2437]: https://gitlab.freedesktop.org/drm/intel/issues/2437
  [i915#2641]: https://gitlab.freedesktop.org/drm/intel/issues/2641
  [i915#2642]: https://gitlab.freedesktop.org/drm/intel/issues/2642
  [i915#265]: https://gitlab.freedesktop.org/drm/intel/issues/265
  [i915#2672]: https://gitlab.freedesktop.org/drm/intel/issues/2672
  [i915#2782]: https://gitlab.freedesktop.org/drm/intel/issues/2782
  [i915#2842]: https://gitlab.freedesktop.org/drm/intel/issues/2842
  [i915#2920]: https://gitlab.freedesktop.org/drm/intel/issues/2920
  [i915#2994]: https://gitlab.freedesktop.org/drm/intel/issues/2994
  [i915#3002]: https://gitlab.freedesktop.org/drm/intel/issues/3002
  [i915#307]: https://gitlab.freedesktop.org/drm/intel/issues/307
  [i915#3288]: https://gitlab.freedesktop.org/drm/intel/issues/3288
  [i915#3296]: https://gitlab.freedesktop.org/drm/intel/issues/3296
  [i915#331

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19912/index.html

[-- Attachment #1.2: Type: text/html, Size: 37518 bytes --]

[-- Attachment #2: Type: text/plain, Size: 160 bytes --]

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [Intel-gfx] [PATCH 01/19] drm/i915/gt: Skip aperture remapping selftest where there is no aperture
  2021-04-12  9:05   ` [Intel-gfx] " Matthew Auld
@ 2021-04-12 14:48     ` Daniel Vetter
  -1 siblings, 0 replies; 130+ messages in thread
From: Daniel Vetter @ 2021-04-12 14:48 UTC (permalink / raw)
  To: Matthew Auld; +Cc: intel-gfx, dri-devel, Chris Wilson

On Mon, Apr 12, 2021 at 10:05:08AM +0100, Matthew Auld wrote:
> From: Chris Wilson <chris@chris-wilson.co.uk>
> 
> If there is no mappable aperture, we cannot remap it for access, and the
> selftest is void.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Reviewed-by: Matthew Auld <matthew.auld@intel.com>
> Reviewed-by: Imre Deak <imre.deak@intel.com>

I guess the subject should have i915/selftest in it? Also, if you resubmit
other people's code, it needs your sob. Otherwise looks reasonable.
-Daniel
> ---
>  drivers/gpu/drm/i915/selftests/i915_vma.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/selftests/i915_vma.c b/drivers/gpu/drm/i915/selftests/i915_vma.c
> index 5fe7b80ca0bd..dd0607254a95 100644
> --- a/drivers/gpu/drm/i915/selftests/i915_vma.c
> +++ b/drivers/gpu/drm/i915/selftests/i915_vma.c
> @@ -967,6 +967,9 @@ static int igt_vma_remapped_gtt(void *arg)
>  	intel_wakeref_t wakeref;
>  	int err = 0;
>  
> +	if (!i915_ggtt_has_aperture(&i915->ggtt))
> +		return 0;
> +
>  	obj = i915_gem_object_create_internal(i915, 10 * 10 * PAGE_SIZE);
>  	if (IS_ERR(obj))
>  		return PTR_ERR(obj);
> -- 
> 2.26.3
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel
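
The guard added by the patch above can be sketched in userspace as follows.
This is only an illustration under stated assumptions: struct fake_ggtt,
fake_ggtt_has_aperture() and fake_remap_selftest() are hypothetical stand-ins,
not i915 APIs. The point is that a selftest which cannot run on the hardware
returns success early instead of reporting a failure.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the GGTT state; not the real struct i915_ggtt. */
struct fake_ggtt {
	size_t mappable_end; /* size of the CPU-mappable aperture, 0 if none */
};

/* Assumed analogue of i915_ggtt_has_aperture(): true if an aperture exists. */
static bool fake_ggtt_has_aperture(const struct fake_ggtt *ggtt)
{
	return ggtt->mappable_end > 0;
}

/*
 * Illustrative selftest body: returns 0 on success or skip, negative on error.
 * *ran reports whether the remapping checks were actually executed.
 */
static int fake_remap_selftest(const struct fake_ggtt *ggtt, bool *ran)
{
	*ran = false;

	/* Nothing to remap without an aperture: report success and skip. */
	if (!fake_ggtt_has_aperture(ggtt))
		return 0;

	/* ... a real test would create and remap an object here ... */
	*ran = true;
	return 0;
}
```

Returning 0 rather than an error keeps the selftest suite green on hardware
where the test is simply not applicable.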

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [PATCH 07/19] drm/i915/fbdev: Use lmem physical addresses for fb_mmap() on discrete
  2021-04-12  9:05   ` [Intel-gfx] " Matthew Auld
@ 2021-04-12 15:00     ` Daniel Vetter
  -1 siblings, 0 replies; 130+ messages in thread
From: Daniel Vetter @ 2021-04-12 15:00 UTC (permalink / raw)
  To: Matthew Auld; +Cc: Mohammed Khajapasha, intel-gfx, dri-devel

On Mon, Apr 12, 2021 at 10:05:14AM +0100, Matthew Auld wrote:
> From: Mohammed Khajapasha <mohammed.khajapasha@intel.com>
> 
> use local memory io BAR address for fbdev's fb_mmap() operation on
> discrete, fbdev uses the physical address of our framebuffer for its
> fb_mmap() fn.
> 
> Signed-off-by: Mohammed Khajapasha <mohammed.khajapasha@intel.com>

Sob missing (I didn't check all previous patches), but I also think we
should aim more to reuse the drm fbdev helpers and retire our own here.
Eventually, long-term, and all that.
-Daniel

> ---
>  drivers/gpu/drm/i915/display/intel_fbdev.c | 29 +++++++++++++++++-----
>  1 file changed, 23 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/display/intel_fbdev.c b/drivers/gpu/drm/i915/display/intel_fbdev.c
> index ccd00e65a5fe..2b37959da747 100644
> --- a/drivers/gpu/drm/i915/display/intel_fbdev.c
> +++ b/drivers/gpu/drm/i915/display/intel_fbdev.c
> @@ -41,6 +41,8 @@
>  #include <drm/drm_fb_helper.h>
>  #include <drm/drm_fourcc.h>
>  
> +#include "gem/i915_gem_lmem.h"
> +
>  #include "i915_drv.h"
>  #include "intel_display_types.h"
>  #include "intel_fbdev.h"
> @@ -178,6 +180,7 @@ static int intelfb_create(struct drm_fb_helper *helper,
>  	unsigned long flags = 0;
>  	bool prealloc = false;
>  	void __iomem *vaddr;
> +	struct drm_i915_gem_object *obj;
>  	int ret;
>  
>  	if (intel_fb &&
> @@ -232,13 +235,27 @@ static int intelfb_create(struct drm_fb_helper *helper,
>  	info->fbops = &intelfb_ops;
>  
>  	/* setup aperture base/size for vesafb takeover */
> -	info->apertures->ranges[0].base = ggtt->gmadr.start;
> -	info->apertures->ranges[0].size = ggtt->mappable_end;
> +	obj = intel_fb_obj(&intel_fb->base);
> +	if (i915_gem_object_is_lmem(obj)) {
> +		struct intel_memory_region *mem = obj->mm.region;
> +
> +		info->apertures->ranges[0].base = mem->io_start;
> +		info->apertures->ranges[0].size = mem->total;
> +
> +		/* Use fbdev's framebuffer from lmem for discrete */
> +		info->fix.smem_start =
> +			(unsigned long)(mem->io_start +
> +					i915_gem_object_get_dma_address(obj, 0));
> +		info->fix.smem_len = obj->base.size;
> +	} else {
> +		info->apertures->ranges[0].base = ggtt->gmadr.start;
> +		info->apertures->ranges[0].size = ggtt->mappable_end;
>  
> -	/* Our framebuffer is the entirety of fbdev's system memory */
> -	info->fix.smem_start =
> -		(unsigned long)(ggtt->gmadr.start + vma->node.start);
> -	info->fix.smem_len = vma->node.size;
> +		/* Our framebuffer is the entirety of fbdev's system memory */
> +		info->fix.smem_start =
> +			(unsigned long)(ggtt->gmadr.start + vma->node.start);
> +		info->fix.smem_len = vma->node.size;
> +	}
>  
>  	vaddr = i915_vma_pin_iomap(vma);
>  	if (IS_ERR(vaddr)) {
> -- 
> 2.26.3
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
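
The address selection in the patch above can be sketched as a small pure
function. All types and names here (struct fake_region, struct fake_aperture,
struct fake_fb, struct fake_fix, fake_fbdev_addresses()) are hypothetical
stand-ins for illustration; lmem_offset plays the role of the value returned
by i915_gem_object_get_dma_address(obj, 0) in the quoted diff.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative stand-ins only; these are not the i915 structures. */
struct fake_region {
	uint64_t io_start; /* CPU-visible start of the local-memory BAR */
	uint64_t total;    /* size of the region */
};

struct fake_aperture {
	uint64_t start;        /* ggtt->gmadr.start in the quoted diff */
	uint64_t mappable_end; /* ggtt->mappable_end in the quoted diff */
};

struct fake_fb {
	bool in_lmem;         /* true on discrete, framebuffer lives in lmem */
	uint64_t lmem_offset; /* device address of the object's first page */
	uint64_t obj_size;    /* obj->base.size */
	uint64_t ggtt_offset; /* vma->node.start */
	uint64_t ggtt_size;   /* vma->node.size */
};

struct fake_fix {
	uint64_t aperture_base, aperture_size;
	uint64_t smem_start, smem_len;
};

/* Pick the physical range fbdev should expose for fb_mmap(). */
static struct fake_fix fake_fbdev_addresses(const struct fake_fb *fb,
					    const struct fake_region *mem,
					    const struct fake_aperture *ap)
{
	struct fake_fix fix;

	if (fb->in_lmem) {
		/* Discrete: address the framebuffer through the lmem BAR. */
		fix.aperture_base = mem->io_start;
		fix.aperture_size = mem->total;
		fix.smem_start = mem->io_start + fb->lmem_offset;
		fix.smem_len = fb->obj_size;
	} else {
		/* Integrated: address it through the mappable GGTT aperture. */
		fix.aperture_base = ap->start;
		fix.aperture_size = ap->mappable_end;
		fix.smem_start = ap->start + fb->ggtt_offset;
		fix.smem_len = fb->ggtt_size;
	}

	return fix;
}
```

In both branches smem_start is a base plus an offset; only the base (BAR
versus aperture) and the source of the offset differ.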

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [Intel-gfx] [PATCH 18/19] drm/i915/gtt: map the PD up front
  2021-04-12  9:05   ` [Intel-gfx] " Matthew Auld
@ 2021-04-12 15:17     ` Daniel Vetter
  -1 siblings, 0 replies; 130+ messages in thread
From: Daniel Vetter @ 2021-04-12 15:17 UTC (permalink / raw)
  To: Matthew Auld; +Cc: intel-gfx, dri-devel, Chris Wilson

On Mon, Apr 12, 2021 at 10:05:25AM +0100, Matthew Auld wrote:
> We need to generalise our accessor for the page directories and tables from
> using the simple kmap_atomic to support local memory, and this setup
> must be done on acquisition of the backing storage prior to entering
> fence execution contexts. Here we replace the kmap with the object
> mapping code that for a simple single-page shmemfs object will return a
> plain kmap, which is then kept for the lifetime of the page directory.
> 
> v2: (Thomas) Rebase on dma_resv and obj->mm.lock removal.
> 
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
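
The idea in the commit message above (map once when the backing storage is
acquired, then reuse that mapping until teardown) can be sketched in userspace
as follows. This is not the i915 implementation: struct fake_pt,
fake_pt_alloc(), fake_pt_clear() and fake_pt_free() are hypothetical names,
and calloc() stands in for whatever actually maps the page.

```c
#include <stdint.h>
#include <stdlib.h>

#define FAKE_PTES 512 /* illustrative number of entries per table */

/* Hypothetical page table: the mapping is created once and kept. */
struct fake_pt {
	uint64_t *vaddr; /* valid from fake_pt_alloc() until fake_pt_free() */
};

/* Allocate the backing storage and establish the mapping up front. */
static struct fake_pt *fake_pt_alloc(void)
{
	struct fake_pt *pt = malloc(sizeof(*pt));

	if (!pt)
		return NULL;

	pt->vaddr = calloc(FAKE_PTES, sizeof(*pt->vaddr));
	if (!pt->vaddr) {
		free(pt);
		return NULL;
	}

	return pt;
}

/*
 * Point a range of entries back at scratch. No map/unmap pair is needed
 * here because the mapping already exists; the caller must ensure that
 * first + count <= FAKE_PTES.
 */
static void fake_pt_clear(struct fake_pt *pt, unsigned int first,
			  unsigned int count, uint64_t scratch)
{
	for (unsigned int i = 0; i < count; i++)
		pt->vaddr[first + i] = scratch;
}

/* Drop the mapping together with the table itself. */
static void fake_pt_free(struct fake_pt *pt)
{
	free(pt->vaddr);
	free(pt);
}
```

The accessors no longer allocate or map anything themselves, which is what
makes them usable from contexts where setup work is not allowed.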

So I wanted to understand what px stands for as an abbreviation, and dug
all the way down to this:

commit 567047be2a7ede082d29f45524c287b87bd75e53
Author: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Date:   Thu Jun 25 18:35:12 2015 +0300

    drm/i915/gtt: Use macros to access dma mapped pages

I still have no idea what it means, I guess px = page. But I also
committed this, so I guess I can blame myself :-)

But while digging I've stumbled over this here

commit 6eebfe8a10a62139d681e2f1af1386252742278b
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Fri Jul 12 08:58:18 2019 +0100

    drm/i915/gtt: Use shallow dma pages for scratch


And that's some serious wtf. Yes, we've done some compile-time type
casting automagic between i915_priv and dev in the past, and I think even
that was bad taste. But it was justified by the fact that we have these
everywhere (especially in the mmio macros), and changing it would be a
terrible flag day.

But I'm not seeing any need for auto-casting for these pages here, and I'm
not aware that we're doing this anywhere else in kernel code. There is
some macro-trickery in lockdep annotations, but that relies on the lockdep
map having the same struct member name in all lock types, and is not
exposed to drivers at all.

Am I missing something, or why do we have this compile-time type casting
stuff going on in i915 page accessors?
-Daniel
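
For readers unfamiliar with the pattern being questioned, here is a minimal
sketch of compile-time type dispatch in an accessor macro, next to the
explicit helpers it hides. It uses C11 _Generic purely as an illustration;
the referenced commit is not quoted here, so this is not a copy of the i915
px macros, and every name below (struct fake_base, fake_px_base(), and so
on) is hypothetical.

```c
/* Hypothetical layout: a directory embeds a table, a table embeds a base. */
struct fake_base {
	int id;
};

struct fake_pt {
	struct fake_base base;
};

struct fake_pd {
	struct fake_pt pt;
	int extra;
};

/* Explicit, type-checked accessors: one per pointer type. */
static inline struct fake_base *fake_base_of_pt(struct fake_pt *pt)
{
	return &pt->base;
}

static inline struct fake_base *fake_base_of_pd(struct fake_pd *pd)
{
	return &pd->pt.base;
}

/*
 * "Automagic" accessor: selects the helper from the static type of px at
 * compile time, so callers may pass either pointer type. Any other type
 * is a compile error.
 */
#define fake_px_base(px) _Generic((px), \
	struct fake_pt *: fake_base_of_pt, \
	struct fake_pd *: fake_base_of_pd)(px)
```

The macro saves typing at the call site, but the reader can no longer tell
from the call alone which type is being passed, which is roughly the concern
raised above.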

> ---
>  .../drm/i915/gem/selftests/i915_gem_context.c | 11 +----
>  drivers/gpu/drm/i915/gt/gen6_ppgtt.c          | 11 ++---
>  drivers/gpu/drm/i915/gt/gen8_ppgtt.c          | 26 ++++------
>  drivers/gpu/drm/i915/gt/intel_ggtt.c          |  2 +-
>  drivers/gpu/drm/i915/gt/intel_gtt.c           | 48 +++++++++----------
>  drivers/gpu/drm/i915/gt/intel_gtt.h           | 11 +++--
>  drivers/gpu/drm/i915/gt/intel_ppgtt.c         |  7 ++-
>  drivers/gpu/drm/i915/i915_vma.c               |  3 +-
>  drivers/gpu/drm/i915/selftests/i915_gem_gtt.c | 10 ++--
>  drivers/gpu/drm/i915/selftests/i915_perf.c    |  3 +-
>  10 files changed, 54 insertions(+), 78 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
> index 5fef592390cb..ce70d0a3afb2 100644
> --- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
> @@ -1740,7 +1740,6 @@ static int read_from_scratch(struct i915_gem_context *ctx,
>  static int check_scratch_page(struct i915_gem_context *ctx, u32 *out)
>  {
>  	struct i915_address_space *vm;
> -	struct page *page;
>  	u32 *vaddr;
>  	int err = 0;
>  
> @@ -1748,24 +1747,18 @@ static int check_scratch_page(struct i915_gem_context *ctx, u32 *out)
>  	if (!vm)
>  		return -ENODEV;
>  
> -	page = __px_page(vm->scratch[0]);
> -	if (!page) {
> +	if (!vm->scratch[0]) {
>  		pr_err("No scratch page!\n");
>  		return -EINVAL;
>  	}
>  
> -	vaddr = kmap(page);
> -	if (!vaddr) {
> -		pr_err("No (mappable) scratch page!\n");
> -		return -EINVAL;
> -	}
> +	vaddr = __px_vaddr(vm->scratch[0]);
>  
>  	memcpy(out, vaddr, sizeof(*out));
>  	if (memchr_inv(vaddr, *out, PAGE_SIZE)) {
>  		pr_err("Inconsistent initial state of scratch page!\n");
>  		err = -EINVAL;
>  	}
> -	kunmap(page);
>  
>  	return err;
>  }
> diff --git a/drivers/gpu/drm/i915/gt/gen6_ppgtt.c b/drivers/gpu/drm/i915/gt/gen6_ppgtt.c
> index e08dff376339..21b1085769be 100644
> --- a/drivers/gpu/drm/i915/gt/gen6_ppgtt.c
> +++ b/drivers/gpu/drm/i915/gt/gen6_ppgtt.c
> @@ -96,9 +96,8 @@ static void gen6_ppgtt_clear_range(struct i915_address_space *vm,
>  		 * entries back to scratch.
>  		 */
>  
> -		vaddr = kmap_atomic_px(pt);
> +		vaddr = px_vaddr(pt);
>  		memset32(vaddr + pte, scratch_pte, count);
> -		kunmap_atomic(vaddr);
>  
>  		pte = 0;
>  	}
> @@ -120,7 +119,7 @@ static void gen6_ppgtt_insert_entries(struct i915_address_space *vm,
>  
>  	GEM_BUG_ON(!pd->entry[act_pt]);
>  
> -	vaddr = kmap_atomic_px(i915_pt_entry(pd, act_pt));
> +	vaddr = px_vaddr(i915_pt_entry(pd, act_pt));
>  	do {
>  		GEM_BUG_ON(sg_dma_len(iter.sg) < I915_GTT_PAGE_SIZE);
>  		vaddr[act_pte] = pte_encode | GEN6_PTE_ADDR_ENCODE(iter.dma);
> @@ -136,12 +135,10 @@ static void gen6_ppgtt_insert_entries(struct i915_address_space *vm,
>  		}
>  
>  		if (++act_pte == GEN6_PTES) {
> -			kunmap_atomic(vaddr);
> -			vaddr = kmap_atomic_px(i915_pt_entry(pd, ++act_pt));
> +			vaddr = px_vaddr(i915_pt_entry(pd, ++act_pt));
>  			act_pte = 0;
>  		}
>  	} while (1);
> -	kunmap_atomic(vaddr);
>  
>  	vma->page_sizes.gtt = I915_GTT_PAGE_SIZE;
>  }
> @@ -235,7 +232,7 @@ static int gen6_ppgtt_init_scratch(struct gen6_ppgtt *ppgtt)
>  		goto err_scratch0;
>  	}
>  
> -	ret = pin_pt_dma(vm, vm->scratch[1]);
> +	ret = map_pt_dma(vm, vm->scratch[1]);
>  	if (ret)
>  		goto err_scratch1;
>  
> diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
> index 176c19633412..f83496836f0f 100644
> --- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
> +++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
> @@ -242,11 +242,10 @@ static u64 __gen8_ppgtt_clear(struct i915_address_space * const vm,
>  			    atomic_read(&pt->used));
>  			GEM_BUG_ON(!count || count >= atomic_read(&pt->used));
>  
> -			vaddr = kmap_atomic_px(pt);
> +			vaddr = px_vaddr(pt);
>  			memset64(vaddr + gen8_pd_index(start, 0),
>  				 vm->scratch[0]->encode,
>  				 count);
> -			kunmap_atomic(vaddr);
>  
>  			atomic_sub(count, &pt->used);
>  			start += count;
> @@ -375,7 +374,7 @@ gen8_ppgtt_insert_pte(struct i915_ppgtt *ppgtt,
>  	gen8_pte_t *vaddr;
>  
>  	pd = i915_pd_entry(pdp, gen8_pd_index(idx, 2));
> -	vaddr = kmap_atomic_px(i915_pt_entry(pd, gen8_pd_index(idx, 1)));
> +	vaddr = px_vaddr(i915_pt_entry(pd, gen8_pd_index(idx, 1)));
>  	do {
>  		GEM_BUG_ON(sg_dma_len(iter->sg) < I915_GTT_PAGE_SIZE);
>  		vaddr[gen8_pd_index(idx, 0)] = pte_encode | iter->dma;
> @@ -402,12 +401,10 @@ gen8_ppgtt_insert_pte(struct i915_ppgtt *ppgtt,
>  			}
>  
>  			clflush_cache_range(vaddr, PAGE_SIZE);
> -			kunmap_atomic(vaddr);
> -			vaddr = kmap_atomic_px(i915_pt_entry(pd, gen8_pd_index(idx, 1)));
> +			vaddr = px_vaddr(i915_pt_entry(pd, gen8_pd_index(idx, 1)));
>  		}
>  	} while (1);
>  	clflush_cache_range(vaddr, PAGE_SIZE);
> -	kunmap_atomic(vaddr);
>  
>  	return idx;
>  }
> @@ -442,7 +439,7 @@ static void gen8_ppgtt_insert_huge(struct i915_vma *vma,
>  			encode |= GEN8_PDE_PS_2M;
>  			page_size = I915_GTT_PAGE_SIZE_2M;
>  
> -			vaddr = kmap_atomic_px(pd);
> +			vaddr = px_vaddr(pd);
>  		} else {
>  			struct i915_page_table *pt =
>  				i915_pt_entry(pd, __gen8_pte_index(start, 1));
> @@ -457,7 +454,7 @@ static void gen8_ppgtt_insert_huge(struct i915_vma *vma,
>  			     rem >= (I915_PDES - index) * I915_GTT_PAGE_SIZE))
>  				maybe_64K = __gen8_pte_index(start, 1);
>  
> -			vaddr = kmap_atomic_px(pt);
> +			vaddr = px_vaddr(pt);
>  		}
>  
>  		do {
> @@ -491,7 +488,6 @@ static void gen8_ppgtt_insert_huge(struct i915_vma *vma,
>  		} while (rem >= page_size && index < I915_PDES);
>  
>  		clflush_cache_range(vaddr, PAGE_SIZE);
> -		kunmap_atomic(vaddr);
>  
>  		/*
>  		 * Is it safe to mark the 2M block as 64K? -- Either we have
> @@ -505,9 +501,8 @@ static void gen8_ppgtt_insert_huge(struct i915_vma *vma,
>  		      !iter->sg && IS_ALIGNED(vma->node.start +
>  					      vma->node.size,
>  					      I915_GTT_PAGE_SIZE_2M)))) {
> -			vaddr = kmap_atomic_px(pd);
> +			vaddr = px_vaddr(pd);
>  			vaddr[maybe_64K] |= GEN8_PDE_IPS_64K;
> -			kunmap_atomic(vaddr);
>  			page_size = I915_GTT_PAGE_SIZE_64K;
>  
>  			/*
> @@ -523,12 +518,11 @@ static void gen8_ppgtt_insert_huge(struct i915_vma *vma,
>  				u16 i;
>  
>  				encode = vma->vm->scratch[0]->encode;
> -				vaddr = kmap_atomic_px(i915_pt_entry(pd, maybe_64K));
> +				vaddr = px_vaddr(i915_pt_entry(pd, maybe_64K));
>  
>  				for (i = 1; i < index; i += 16)
>  					memset64(vaddr + i, encode, 15);
>  
> -				kunmap_atomic(vaddr);
>  			}
>  		}
>  
> @@ -602,7 +596,7 @@ static int gen8_init_scratch(struct i915_address_space *vm)
>  		if (IS_ERR(obj))
>  			goto free_scratch;
>  
> -		ret = pin_pt_dma(vm, obj);
> +		ret = map_pt_dma(vm, obj);
>  		if (ret) {
>  			i915_gem_object_put(obj);
>  			goto free_scratch;
> @@ -639,7 +633,7 @@ static int gen8_preallocate_top_level_pdp(struct i915_ppgtt *ppgtt)
>  		if (IS_ERR(pde))
>  			return PTR_ERR(pde);
>  
> -		err = pin_pt_dma(vm, pde->pt.base);
> +		err = map_pt_dma(vm, pde->pt.base);
>  		if (err) {
>  			i915_gem_object_put(pde->pt.base);
>  			free_pd(vm, pde);
> @@ -675,7 +669,7 @@ gen8_alloc_top_pd(struct i915_address_space *vm)
>  		goto err_pd;
>  	}
>  
> -	err = pin_pt_dma(vm, pd->pt.base);
> +	err = map_pt_dma(vm, pd->pt.base);
>  	if (err)
>  		goto err_pd;
>  
> diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt.c b/drivers/gpu/drm/i915/gt/intel_ggtt.c
> index 670c1271e7d5..d94628b9d89e 100644
> --- a/drivers/gpu/drm/i915/gt/intel_ggtt.c
> +++ b/drivers/gpu/drm/i915/gt/intel_ggtt.c
> @@ -657,7 +657,7 @@ static int init_aliasing_ppgtt(struct i915_ggtt *ggtt)
>  		goto err_ppgtt;
>  
>  	i915_gem_object_lock(ppgtt->vm.scratch[0], NULL);
> -	err = i915_vm_pin_pt_stash(&ppgtt->vm, &stash);
> +	err = i915_vm_map_pt_stash(&ppgtt->vm, &stash);
>  	i915_gem_object_unlock(ppgtt->vm.scratch[0]);
>  	if (err)
>  		goto err_stash;
> diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.c b/drivers/gpu/drm/i915/gt/intel_gtt.c
> index 941f8af016d6..d386b89e2758 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gtt.c
> +++ b/drivers/gpu/drm/i915/gt/intel_gtt.c
> @@ -25,27 +25,25 @@ struct drm_i915_gem_object *alloc_pt_dma(struct i915_address_space *vm, int sz)
>  	return obj;
>  }
>  
> -int pin_pt_dma(struct i915_address_space *vm, struct drm_i915_gem_object *obj)
> +int map_pt_dma(struct i915_address_space *vm, struct drm_i915_gem_object *obj)
>  {
> -	int err;
> +	void *vaddr;
>  
> -	i915_gem_object_lock(obj, NULL);
> -	err = i915_gem_object_pin_pages(obj);
> -	i915_gem_object_unlock(obj);
> -	if (err)
> -		return err;
> +	vaddr = i915_gem_object_pin_map_unlocked(obj, I915_MAP_WB);
> +	if (IS_ERR(vaddr))
> +		return PTR_ERR(vaddr);
>  
>  	i915_gem_object_make_unshrinkable(obj);
>  	return 0;
>  }
>  
> -int pin_pt_dma_locked(struct i915_address_space *vm, struct drm_i915_gem_object *obj)
> +int map_pt_dma_locked(struct i915_address_space *vm, struct drm_i915_gem_object *obj)
>  {
> -	int err;
> +	void *vaddr;
>  
> -	err = i915_gem_object_pin_pages(obj);
> -	if (err)
> -		return err;
> +	vaddr = i915_gem_object_pin_map(obj, I915_MAP_WB);
> +	if (IS_ERR(vaddr))
> +		return PTR_ERR(vaddr);
>  
>  	i915_gem_object_make_unshrinkable(obj);
>  	return 0;
> @@ -155,6 +153,14 @@ void clear_pages(struct i915_vma *vma)
>  	memset(&vma->page_sizes, 0, sizeof(vma->page_sizes));
>  }
>  
> +void *__px_vaddr(struct drm_i915_gem_object *p)
> +{
> +	enum i915_map_type type;
> +
> +	GEM_BUG_ON(!i915_gem_object_has_pages(p));
> +	return page_unpack_bits(p->mm.mapping, &type);
> +}
> +
>  dma_addr_t __px_dma(struct drm_i915_gem_object *p)
>  {
>  	GEM_BUG_ON(!i915_gem_object_has_pages(p));
> @@ -170,32 +176,22 @@ struct page *__px_page(struct drm_i915_gem_object *p)
>  void
>  fill_page_dma(struct drm_i915_gem_object *p, const u64 val, unsigned int count)
>  {
> -	struct page *page = __px_page(p);
> -	void *vaddr;
> +	void *vaddr = __px_vaddr(p);
>  
> -	vaddr = kmap(page);
>  	memset64(vaddr, val, count);
>  	clflush_cache_range(vaddr, PAGE_SIZE);
> -	kunmap(page);
>  }
>  
>  static void poison_scratch_page(struct drm_i915_gem_object *scratch)
>  {
> -	struct sgt_iter sgt;
> -	struct page *page;
> +	void *vaddr = __px_vaddr(scratch);
>  	u8 val;
>  
>  	val = 0;
>  	if (IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM))
>  		val = POISON_FREE;
>  
> -	for_each_sgt_page(page, sgt, scratch->mm.pages) {
> -		void *vaddr;
> -
> -		vaddr = kmap(page);
> -		memset(vaddr, val, PAGE_SIZE);
> -		kunmap(page);
> -	}
> +	memset(vaddr, val, scratch->base.size);
>  }
>  
>  int setup_scratch_page(struct i915_address_space *vm)
> @@ -225,7 +221,7 @@ int setup_scratch_page(struct i915_address_space *vm)
>  		if (IS_ERR(obj))
>  			goto skip;
>  
> -		if (pin_pt_dma(vm, obj))
> +		if (map_pt_dma(vm, obj))
>  			goto skip_obj;
>  
>  		/* We need a single contiguous page for our scratch */
> diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.h b/drivers/gpu/drm/i915/gt/intel_gtt.h
> index e67e34e17913..40e486704558 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gtt.h
> +++ b/drivers/gpu/drm/i915/gt/intel_gtt.h
> @@ -180,6 +180,9 @@ struct page *__px_page(struct drm_i915_gem_object *p);
>  dma_addr_t __px_dma(struct drm_i915_gem_object *p);
>  #define px_dma(px) (__px_dma(px_base(px)))
>  
> +void *__px_vaddr(struct drm_i915_gem_object *p);
> +#define px_vaddr(px) (__px_vaddr(px_base(px)))
> +
>  #define px_pt(px) \
>  	__px_choose_expr(px, struct i915_page_table *, __x, \
>  	__px_choose_expr(px, struct i915_page_directory *, &__x->pt, \
> @@ -511,8 +514,6 @@ struct i915_ppgtt *i915_ppgtt_create(struct intel_gt *gt);
>  void i915_ggtt_suspend(struct i915_ggtt *gtt);
>  void i915_ggtt_resume(struct i915_ggtt *ggtt);
>  
> -#define kmap_atomic_px(px) kmap_atomic(__px_page(px_base(px)))
> -
>  void
>  fill_page_dma(struct drm_i915_gem_object *p, const u64 val, unsigned int count);
>  
> @@ -530,8 +531,8 @@ struct i915_page_table *alloc_pt(struct i915_address_space *vm);
>  struct i915_page_directory *alloc_pd(struct i915_address_space *vm);
>  struct i915_page_directory *__alloc_pd(int npde);
>  
> -int pin_pt_dma(struct i915_address_space *vm, struct drm_i915_gem_object *obj);
> -int pin_pt_dma_locked(struct i915_address_space *vm, struct drm_i915_gem_object *obj);
> +int map_pt_dma(struct i915_address_space *vm, struct drm_i915_gem_object *obj);
> +int map_pt_dma_locked(struct i915_address_space *vm, struct drm_i915_gem_object *obj);
>  
>  void free_px(struct i915_address_space *vm,
>  	     struct i915_page_table *pt, int lvl);
> @@ -578,7 +579,7 @@ void setup_private_pat(struct intel_uncore *uncore);
>  int i915_vm_alloc_pt_stash(struct i915_address_space *vm,
>  			   struct i915_vm_pt_stash *stash,
>  			   u64 size);
> -int i915_vm_pin_pt_stash(struct i915_address_space *vm,
> +int i915_vm_map_pt_stash(struct i915_address_space *vm,
>  			 struct i915_vm_pt_stash *stash);
>  void i915_vm_free_pt_stash(struct i915_address_space *vm,
>  			   struct i915_vm_pt_stash *stash);
> diff --git a/drivers/gpu/drm/i915/gt/intel_ppgtt.c b/drivers/gpu/drm/i915/gt/intel_ppgtt.c
> index 014ae8ac4480..4e3d80c2295c 100644
> --- a/drivers/gpu/drm/i915/gt/intel_ppgtt.c
> +++ b/drivers/gpu/drm/i915/gt/intel_ppgtt.c
> @@ -87,11 +87,10 @@ write_dma_entry(struct drm_i915_gem_object * const pdma,
>  		const unsigned short idx,
>  		const u64 encoded_entry)
>  {
> -	u64 * const vaddr = kmap_atomic(__px_page(pdma));
> +	u64 * const vaddr = __px_vaddr(pdma);
>  
>  	vaddr[idx] = encoded_entry;
>  	clflush_cache_range(&vaddr[idx], sizeof(u64));
> -	kunmap_atomic(vaddr);
>  }
>  
>  void
> @@ -258,7 +257,7 @@ int i915_vm_alloc_pt_stash(struct i915_address_space *vm,
>  	return 0;
>  }
>  
> -int i915_vm_pin_pt_stash(struct i915_address_space *vm,
> +int i915_vm_map_pt_stash(struct i915_address_space *vm,
>  			 struct i915_vm_pt_stash *stash)
>  {
>  	struct i915_page_table *pt;
> @@ -266,7 +265,7 @@ int i915_vm_pin_pt_stash(struct i915_address_space *vm,
>  
>  	for (n = 0; n < ARRAY_SIZE(stash->pt); n++) {
>  		for (pt = stash->pt[n]; pt; pt = pt->stash) {
> -			err = pin_pt_dma_locked(vm, pt->base);
> +			err = map_pt_dma_locked(vm, pt->base);
>  			if (err)
>  				return err;
>  		}
> diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
> index e24d33aecac4..c68a743fac2a 100644
> --- a/drivers/gpu/drm/i915/i915_vma.c
> +++ b/drivers/gpu/drm/i915/i915_vma.c
> @@ -912,8 +912,7 @@ int i915_vma_pin_ww(struct i915_vma *vma, struct i915_gem_ww_ctx *ww,
>  			if (err)
>  				goto err_fence;
>  
> -			err = i915_vm_pin_pt_stash(vma->vm,
> -						   &work->stash);
> +			err = i915_vm_map_pt_stash(vma->vm, &work->stash);
>  			if (err)
>  				goto err_fence;
>  		}
> diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
> index 2e4f06eaacc1..e060e455e9f6 100644
> --- a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
> +++ b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
> @@ -186,7 +186,7 @@ static int igt_ppgtt_alloc(void *arg)
>  		if (err)
>  			goto err_ppgtt_cleanup;
>  
> -		err = i915_vm_pin_pt_stash(&ppgtt->vm, &stash);
> +		err = i915_vm_map_pt_stash(&ppgtt->vm, &stash);
>  		if (err) {
>  			i915_vm_free_pt_stash(&ppgtt->vm, &stash);
>  			goto err_ppgtt_cleanup;
> @@ -208,7 +208,7 @@ static int igt_ppgtt_alloc(void *arg)
>  		if (err)
>  			goto err_ppgtt_cleanup;
>  
> -		err = i915_vm_pin_pt_stash(&ppgtt->vm, &stash);
> +		err = i915_vm_map_pt_stash(&ppgtt->vm, &stash);
>  		if (err) {
>  			i915_vm_free_pt_stash(&ppgtt->vm, &stash);
>  			goto err_ppgtt_cleanup;
> @@ -325,11 +325,10 @@ static int lowlevel_hole(struct i915_address_space *vm,
>  							   BIT_ULL(size)))
>  					goto alloc_vm_end;
>  
> -				err = i915_vm_pin_pt_stash(vm, &stash);
> +				err = i915_vm_map_pt_stash(vm, &stash);
>  				if (!err)
>  					vm->allocate_va_range(vm, &stash,
>  							      addr, BIT_ULL(size));
> -
>  				i915_vm_free_pt_stash(vm, &stash);
>  alloc_vm_end:
>  				if (err == -EDEADLK) {
> @@ -1967,10 +1966,9 @@ static int igt_cs_tlb(void *arg)
>  			if (err)
>  				goto end_ww;
>  
> -			err = i915_vm_pin_pt_stash(vm, &stash);
> +			err = i915_vm_map_pt_stash(vm, &stash);
>  			if (!err)
>  				vm->allocate_va_range(vm, &stash, offset, chunk_size);
> -
>  			i915_vm_free_pt_stash(vm, &stash);
>  end_ww:
>  			if (err == -EDEADLK) {
> diff --git a/drivers/gpu/drm/i915/selftests/i915_perf.c b/drivers/gpu/drm/i915/selftests/i915_perf.c
> index e9d86dab8677..bfb0290967a1 100644
> --- a/drivers/gpu/drm/i915/selftests/i915_perf.c
> +++ b/drivers/gpu/drm/i915/selftests/i915_perf.c
> @@ -307,7 +307,7 @@ static int live_noa_gpr(void *arg)
>  	}
>  
>  	/* Poison the ce->vm so we detect writes not to the GGTT gt->scratch */
> -	scratch = kmap(__px_page(ce->vm->scratch[0]));
> +	scratch = __px_vaddr(ce->vm->scratch[0]);
>  	memset(scratch, POISON_FREE, PAGE_SIZE);
>  
>  	rq = intel_context_create_request(ce);
> @@ -405,7 +405,6 @@ static int live_noa_gpr(void *arg)
>  out_rq:
>  	i915_request_put(rq);
>  out_ce:
> -	kunmap(__px_page(ce->vm->scratch[0]));
>  	intel_context_put(ce);
>  out:
>  	stream_destroy(stream);
> -- 
> 2.26.3
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


* Re: [Intel-gfx] [PATCH 18/19] drm/i915/gtt: map the PD up front
@ 2021-04-12 15:17     ` Daniel Vetter
  0 siblings, 0 replies; 130+ messages in thread
From: Daniel Vetter @ 2021-04-12 15:17 UTC (permalink / raw)
  To: Matthew Auld; +Cc: intel-gfx, dri-devel, Chris Wilson

On Mon, Apr 12, 2021 at 10:05:25AM +0100, Matthew Auld wrote:
> We need to generalise our accessor for the page directories and tables from
> using the simple kmap_atomic to support local memory, and this setup
> must be done on acquisition of the backing storage prior to entering
> fence execution contexts. Here we replace the kmap with the object
> mapping code that for simple single page shmemfs objects will return a
> plain kmap, that is then kept for the lifetime of the page directory.
> 
> v2: (Thomas) Rebase on dma_resv and obj->mm.lock removal.
> 
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
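
For readers following along, a minimal userspace sketch of the idea in the
commit message: map the page-table backing store once up front, stash the
pointer with the map type in its low bits, and let later accessors just strip
the tag. This is not the i915 code. Every demo_* name is hypothetical, and
aligned_alloc() only stands in for the real pin/map step.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_PAGE_SIZE 4096u
#define DEMO_TYPE_MASK ((uintptr_t)DEMO_PAGE_SIZE - 1)

/* Hypothetical map types, loosely mirroring WB/WC mappings. */
enum demo_map_type { DEMO_MAP_WB = 0, DEMO_MAP_WC = 1 };

/* Hypothetical stand-in for a page-table backing object. */
struct demo_obj {
	void *mapping; /* page-aligned address, map type in the low bits */
};

/* Map once, up front, where allocating and failing are still allowed. */
static int demo_map_pt(struct demo_obj *obj, enum demo_map_type type)
{
	void *vaddr = aligned_alloc(DEMO_PAGE_SIZE, DEMO_PAGE_SIZE);

	if (!vaddr)
		return -1;
	memset(vaddr, 0, DEMO_PAGE_SIZE);
	/* The address is page aligned, so the low bits are free for the tag. */
	obj->mapping = (void *)((uintptr_t)vaddr | (uintptr_t)type);
	return 0;
}

/* Later accessors only strip the tag: no map/unmap pair, no failure path. */
static void *demo_vaddr(const struct demo_obj *obj, enum demo_map_type *type)
{
	uintptr_t packed = (uintptr_t)obj->mapping;

	*type = (enum demo_map_type)(packed & DEMO_TYPE_MASK);
	return (void *)(packed & ~DEMO_TYPE_MASK);
}

/* Tear the mapping down when the object itself goes away. */
static void demo_unmap_pt(struct demo_obj *obj)
{
	enum demo_map_type type;

	free(demo_vaddr(obj, &type));
	obj->mapping = NULL;
}

/* Write an entry through the long-lived mapping and read it back. */
static int demo_roundtrip(void)
{
	struct demo_obj obj;
	enum demo_map_type type;
	uint64_t *pte;

	if (demo_map_pt(&obj, DEMO_MAP_WC))
		return -1;

	pte = demo_vaddr(&obj, &type);
	pte[3] = 0xabcd;
	assert(type == DEMO_MAP_WC);
	assert(((uint64_t *)demo_vaddr(&obj, &type))[3] == 0xabcd);

	demo_unmap_pt(&obj);
	return 0;
}
```

The point of the shape is that only demo_map_pt() can fail, so the write
paths that run later need no error handling and no per-access unmap.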

So I wanted to understand what px stands for as an abbreviation, and dug
all the way down to this:

commit 567047be2a7ede082d29f45524c287b87bd75e53
Author: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Date:   Thu Jun 25 18:35:12 2015 +0300

    drm/i915/gtt: Use macros to access dma mapped pages

I still have no idea what it means, I guess px = page. But I also
committed this, so I guess I can blame myself :-)

But while digging I've stumbled over this here

commit 6eebfe8a10a62139d681e2f1af1386252742278b
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Fri Jul 12 08:58:18 2019 +0100

    drm/i915/gtt: Use shallow dma pages for scratch


And that's some serious wtf. Yes, we've done some compile-time type
casting automagic between i915_priv and dev in the past, and I think even
that was bad taste. But it was justified by the fact that we have these
everywhere (especially in the mmio macros), and converting them all at
once would have been a terrible flag day.

But I'm not seeing any need for auto-casting for these pages here, and I'm
not aware that we're doing this anywhere else in kernel code. There is
some macro-trickery in lockdep annotations, but that relies on the lockdep
map having the same struct member name in all lock types, and is not
exposed to drivers at all.

Am I missing something, or why do we have this compile-time type casting
stuff going on in i915 page accessors?
-Daniel
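
To make the question concrete, here is a minimal sketch of the kind of
compile-time type selection being discussed, written against hypothetical
structs rather than the real i915 ones. It assumes the GCC/Clang builtins
__builtin_choose_expr() and __builtin_types_compatible_p(); demo_base() and
the struct names are invented for illustration and are not the driver's
px_base().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver's types; not the real i915 structs. */
struct backing { int id; };
struct table { struct backing *base; };
struct directory { struct table pt; };

/*
 * Return the backing object for whichever pointer type is passed in.
 * The branch is chosen at compile time from the static type of px.
 * The casts through void * keep the branches that are not chosen
 * well-formed, since the compiler still parses them.
 */
#define demo_base(px) \
	__builtin_choose_expr( \
		__builtin_types_compatible_p(__typeof__(px), struct backing *), \
		(px), \
	__builtin_choose_expr( \
		__builtin_types_compatible_p(__typeof__(px), struct table *), \
		((struct table *)(void *)(px))->base, \
	__builtin_choose_expr( \
		__builtin_types_compatible_p(__typeof__(px), struct directory *), \
		((struct directory *)(void *)(px))->pt.base, \
		(void)0)))

/* All three pointer types resolve to the same backing object. */
static int demo_id(void)
{
	struct backing b = { .id = 7 };
	struct table t = { .base = &b };
	struct directory d = { .pt = { .base = &b } };

	assert(demo_base(&b) == &b);
	assert(demo_base(&t) == &b);
	assert(demo_base(&d) == &b);
	return demo_base(&d)->id;
}
```

The alternative would be one plainly named helper per type, which costs a
few more characters at each call site but keeps the argument type visible to
the reader.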

> ---
>  .../drm/i915/gem/selftests/i915_gem_context.c | 11 +----
>  drivers/gpu/drm/i915/gt/gen6_ppgtt.c          | 11 ++---
>  drivers/gpu/drm/i915/gt/gen8_ppgtt.c          | 26 ++++------
>  drivers/gpu/drm/i915/gt/intel_ggtt.c          |  2 +-
>  drivers/gpu/drm/i915/gt/intel_gtt.c           | 48 +++++++++----------
>  drivers/gpu/drm/i915/gt/intel_gtt.h           | 11 +++--
>  drivers/gpu/drm/i915/gt/intel_ppgtt.c         |  7 ++-
>  drivers/gpu/drm/i915/i915_vma.c               |  3 +-
>  drivers/gpu/drm/i915/selftests/i915_gem_gtt.c | 10 ++--
>  drivers/gpu/drm/i915/selftests/i915_perf.c    |  3 +-
>  10 files changed, 54 insertions(+), 78 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
> index 5fef592390cb..ce70d0a3afb2 100644
> --- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
> @@ -1740,7 +1740,6 @@ static int read_from_scratch(struct i915_gem_context *ctx,
>  static int check_scratch_page(struct i915_gem_context *ctx, u32 *out)
>  {
>  	struct i915_address_space *vm;
> -	struct page *page;
>  	u32 *vaddr;
>  	int err = 0;
>  
> @@ -1748,24 +1747,18 @@ static int check_scratch_page(struct i915_gem_context *ctx, u32 *out)
>  	if (!vm)
>  		return -ENODEV;
>  
> -	page = __px_page(vm->scratch[0]);
> -	if (!page) {
> +	if (!vm->scratch[0]) {
>  		pr_err("No scratch page!\n");
>  		return -EINVAL;
>  	}
>  
> -	vaddr = kmap(page);
> -	if (!vaddr) {
> -		pr_err("No (mappable) scratch page!\n");
> -		return -EINVAL;
> -	}
> +	vaddr = __px_vaddr(vm->scratch[0]);
>  
>  	memcpy(out, vaddr, sizeof(*out));
>  	if (memchr_inv(vaddr, *out, PAGE_SIZE)) {
>  		pr_err("Inconsistent initial state of scratch page!\n");
>  		err = -EINVAL;
>  	}
> -	kunmap(page);
>  
>  	return err;
>  }
> diff --git a/drivers/gpu/drm/i915/gt/gen6_ppgtt.c b/drivers/gpu/drm/i915/gt/gen6_ppgtt.c
> index e08dff376339..21b1085769be 100644
> --- a/drivers/gpu/drm/i915/gt/gen6_ppgtt.c
> +++ b/drivers/gpu/drm/i915/gt/gen6_ppgtt.c
> @@ -96,9 +96,8 @@ static void gen6_ppgtt_clear_range(struct i915_address_space *vm,
>  		 * entries back to scratch.
>  		 */
>  
> -		vaddr = kmap_atomic_px(pt);
> +		vaddr = px_vaddr(pt);
>  		memset32(vaddr + pte, scratch_pte, count);
> -		kunmap_atomic(vaddr);
>  
>  		pte = 0;
>  	}
> @@ -120,7 +119,7 @@ static void gen6_ppgtt_insert_entries(struct i915_address_space *vm,
>  
>  	GEM_BUG_ON(!pd->entry[act_pt]);
>  
> -	vaddr = kmap_atomic_px(i915_pt_entry(pd, act_pt));
> +	vaddr = px_vaddr(i915_pt_entry(pd, act_pt));
>  	do {
>  		GEM_BUG_ON(sg_dma_len(iter.sg) < I915_GTT_PAGE_SIZE);
>  		vaddr[act_pte] = pte_encode | GEN6_PTE_ADDR_ENCODE(iter.dma);
> @@ -136,12 +135,10 @@ static void gen6_ppgtt_insert_entries(struct i915_address_space *vm,
>  		}
>  
>  		if (++act_pte == GEN6_PTES) {
> -			kunmap_atomic(vaddr);
> -			vaddr = kmap_atomic_px(i915_pt_entry(pd, ++act_pt));
> +			vaddr = px_vaddr(i915_pt_entry(pd, ++act_pt));
>  			act_pte = 0;
>  		}
>  	} while (1);
> -	kunmap_atomic(vaddr);
>  
>  	vma->page_sizes.gtt = I915_GTT_PAGE_SIZE;
>  }
> @@ -235,7 +232,7 @@ static int gen6_ppgtt_init_scratch(struct gen6_ppgtt *ppgtt)
>  		goto err_scratch0;
>  	}
>  
> -	ret = pin_pt_dma(vm, vm->scratch[1]);
> +	ret = map_pt_dma(vm, vm->scratch[1]);
>  	if (ret)
>  		goto err_scratch1;
>  
> diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
> index 176c19633412..f83496836f0f 100644
> --- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
> +++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
> @@ -242,11 +242,10 @@ static u64 __gen8_ppgtt_clear(struct i915_address_space * const vm,
>  			    atomic_read(&pt->used));
>  			GEM_BUG_ON(!count || count >= atomic_read(&pt->used));
>  
> -			vaddr = kmap_atomic_px(pt);
> +			vaddr = px_vaddr(pt);
>  			memset64(vaddr + gen8_pd_index(start, 0),
>  				 vm->scratch[0]->encode,
>  				 count);
> -			kunmap_atomic(vaddr);
>  
>  			atomic_sub(count, &pt->used);
>  			start += count;
> @@ -375,7 +374,7 @@ gen8_ppgtt_insert_pte(struct i915_ppgtt *ppgtt,
>  	gen8_pte_t *vaddr;
>  
>  	pd = i915_pd_entry(pdp, gen8_pd_index(idx, 2));
> -	vaddr = kmap_atomic_px(i915_pt_entry(pd, gen8_pd_index(idx, 1)));
> +	vaddr = px_vaddr(i915_pt_entry(pd, gen8_pd_index(idx, 1)));
>  	do {
>  		GEM_BUG_ON(sg_dma_len(iter->sg) < I915_GTT_PAGE_SIZE);
>  		vaddr[gen8_pd_index(idx, 0)] = pte_encode | iter->dma;
> @@ -402,12 +401,10 @@ gen8_ppgtt_insert_pte(struct i915_ppgtt *ppgtt,
>  			}
>  
>  			clflush_cache_range(vaddr, PAGE_SIZE);
> -			kunmap_atomic(vaddr);
> -			vaddr = kmap_atomic_px(i915_pt_entry(pd, gen8_pd_index(idx, 1)));
> +			vaddr = px_vaddr(i915_pt_entry(pd, gen8_pd_index(idx, 1)));
>  		}
>  	} while (1);
>  	clflush_cache_range(vaddr, PAGE_SIZE);
> -	kunmap_atomic(vaddr);
>  
>  	return idx;
>  }
> @@ -442,7 +439,7 @@ static void gen8_ppgtt_insert_huge(struct i915_vma *vma,
>  			encode |= GEN8_PDE_PS_2M;
>  			page_size = I915_GTT_PAGE_SIZE_2M;
>  
> -			vaddr = kmap_atomic_px(pd);
> +			vaddr = px_vaddr(pd);
>  		} else {
>  			struct i915_page_table *pt =
>  				i915_pt_entry(pd, __gen8_pte_index(start, 1));
> @@ -457,7 +454,7 @@ static void gen8_ppgtt_insert_huge(struct i915_vma *vma,
>  			     rem >= (I915_PDES - index) * I915_GTT_PAGE_SIZE))
>  				maybe_64K = __gen8_pte_index(start, 1);
>  
> -			vaddr = kmap_atomic_px(pt);
> +			vaddr = px_vaddr(pt);
>  		}
>  
>  		do {
> @@ -491,7 +488,6 @@ static void gen8_ppgtt_insert_huge(struct i915_vma *vma,
>  		} while (rem >= page_size && index < I915_PDES);
>  
>  		clflush_cache_range(vaddr, PAGE_SIZE);
> -		kunmap_atomic(vaddr);
>  
>  		/*
>  		 * Is it safe to mark the 2M block as 64K? -- Either we have
> @@ -505,9 +501,8 @@ static void gen8_ppgtt_insert_huge(struct i915_vma *vma,
>  		      !iter->sg && IS_ALIGNED(vma->node.start +
>  					      vma->node.size,
>  					      I915_GTT_PAGE_SIZE_2M)))) {
> -			vaddr = kmap_atomic_px(pd);
> +			vaddr = px_vaddr(pd);
>  			vaddr[maybe_64K] |= GEN8_PDE_IPS_64K;
> -			kunmap_atomic(vaddr);
>  			page_size = I915_GTT_PAGE_SIZE_64K;
>  
>  			/*
> @@ -523,12 +518,11 @@ static void gen8_ppgtt_insert_huge(struct i915_vma *vma,
>  				u16 i;
>  
>  				encode = vma->vm->scratch[0]->encode;
> -				vaddr = kmap_atomic_px(i915_pt_entry(pd, maybe_64K));
> +				vaddr = px_vaddr(i915_pt_entry(pd, maybe_64K));
>  
>  				for (i = 1; i < index; i += 16)
>  					memset64(vaddr + i, encode, 15);
>  
> -				kunmap_atomic(vaddr);
>  			}
>  		}
>  
> @@ -602,7 +596,7 @@ static int gen8_init_scratch(struct i915_address_space *vm)
>  		if (IS_ERR(obj))
>  			goto free_scratch;
>  
> -		ret = pin_pt_dma(vm, obj);
> +		ret = map_pt_dma(vm, obj);
>  		if (ret) {
>  			i915_gem_object_put(obj);
>  			goto free_scratch;
> @@ -639,7 +633,7 @@ static int gen8_preallocate_top_level_pdp(struct i915_ppgtt *ppgtt)
>  		if (IS_ERR(pde))
>  			return PTR_ERR(pde);
>  
> -		err = pin_pt_dma(vm, pde->pt.base);
> +		err = map_pt_dma(vm, pde->pt.base);
>  		if (err) {
>  			i915_gem_object_put(pde->pt.base);
>  			free_pd(vm, pde);
> @@ -675,7 +669,7 @@ gen8_alloc_top_pd(struct i915_address_space *vm)
>  		goto err_pd;
>  	}
>  
> -	err = pin_pt_dma(vm, pd->pt.base);
> +	err = map_pt_dma(vm, pd->pt.base);
>  	if (err)
>  		goto err_pd;
>  
> diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt.c b/drivers/gpu/drm/i915/gt/intel_ggtt.c
> index 670c1271e7d5..d94628b9d89e 100644
> --- a/drivers/gpu/drm/i915/gt/intel_ggtt.c
> +++ b/drivers/gpu/drm/i915/gt/intel_ggtt.c
> @@ -657,7 +657,7 @@ static int init_aliasing_ppgtt(struct i915_ggtt *ggtt)
>  		goto err_ppgtt;
>  
>  	i915_gem_object_lock(ppgtt->vm.scratch[0], NULL);
> -	err = i915_vm_pin_pt_stash(&ppgtt->vm, &stash);
> +	err = i915_vm_map_pt_stash(&ppgtt->vm, &stash);
>  	i915_gem_object_unlock(ppgtt->vm.scratch[0]);
>  	if (err)
>  		goto err_stash;
> diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.c b/drivers/gpu/drm/i915/gt/intel_gtt.c
> index 941f8af016d6..d386b89e2758 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gtt.c
> +++ b/drivers/gpu/drm/i915/gt/intel_gtt.c
> @@ -25,27 +25,25 @@ struct drm_i915_gem_object *alloc_pt_dma(struct i915_address_space *vm, int sz)
>  	return obj;
>  }
>  
> -int pin_pt_dma(struct i915_address_space *vm, struct drm_i915_gem_object *obj)
> +int map_pt_dma(struct i915_address_space *vm, struct drm_i915_gem_object *obj)
>  {
> -	int err;
> +	void *vaddr;
>  
> -	i915_gem_object_lock(obj, NULL);
> -	err = i915_gem_object_pin_pages(obj);
> -	i915_gem_object_unlock(obj);
> -	if (err)
> -		return err;
> +	vaddr = i915_gem_object_pin_map_unlocked(obj, I915_MAP_WB);
> +	if (IS_ERR(vaddr))
> +		return PTR_ERR(vaddr);
>  
>  	i915_gem_object_make_unshrinkable(obj);
>  	return 0;
>  }
>  
> -int pin_pt_dma_locked(struct i915_address_space *vm, struct drm_i915_gem_object *obj)
> +int map_pt_dma_locked(struct i915_address_space *vm, struct drm_i915_gem_object *obj)
>  {
> -	int err;
> +	void *vaddr;
>  
> -	err = i915_gem_object_pin_pages(obj);
> -	if (err)
> -		return err;
> +	vaddr = i915_gem_object_pin_map(obj, I915_MAP_WB);
> +	if (IS_ERR(vaddr))
> +		return PTR_ERR(vaddr);
>  
>  	i915_gem_object_make_unshrinkable(obj);
>  	return 0;
> @@ -155,6 +153,14 @@ void clear_pages(struct i915_vma *vma)
>  	memset(&vma->page_sizes, 0, sizeof(vma->page_sizes));
>  }
>  
> +void *__px_vaddr(struct drm_i915_gem_object *p)
> +{
> +	enum i915_map_type type;
> +
> +	GEM_BUG_ON(!i915_gem_object_has_pages(p));
> +	return page_unpack_bits(p->mm.mapping, &type);
> +}
> +
>  dma_addr_t __px_dma(struct drm_i915_gem_object *p)
>  {
>  	GEM_BUG_ON(!i915_gem_object_has_pages(p));
> @@ -170,32 +176,22 @@ struct page *__px_page(struct drm_i915_gem_object *p)
>  void
>  fill_page_dma(struct drm_i915_gem_object *p, const u64 val, unsigned int count)
>  {
> -	struct page *page = __px_page(p);
> -	void *vaddr;
> +	void *vaddr = __px_vaddr(p);
>  
> -	vaddr = kmap(page);
>  	memset64(vaddr, val, count);
>  	clflush_cache_range(vaddr, PAGE_SIZE);
> -	kunmap(page);
>  }
>  
>  static void poison_scratch_page(struct drm_i915_gem_object *scratch)
>  {
> -	struct sgt_iter sgt;
> -	struct page *page;
> +	void *vaddr = __px_vaddr(scratch);
>  	u8 val;
>  
>  	val = 0;
>  	if (IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM))
>  		val = POISON_FREE;
>  
> -	for_each_sgt_page(page, sgt, scratch->mm.pages) {
> -		void *vaddr;
> -
> -		vaddr = kmap(page);
> -		memset(vaddr, val, PAGE_SIZE);
> -		kunmap(page);
> -	}
> +	memset(vaddr, val, scratch->base.size);
>  }
>  
>  int setup_scratch_page(struct i915_address_space *vm)
> @@ -225,7 +221,7 @@ int setup_scratch_page(struct i915_address_space *vm)
>  		if (IS_ERR(obj))
>  			goto skip;
>  
> -		if (pin_pt_dma(vm, obj))
> +		if (map_pt_dma(vm, obj))
>  			goto skip_obj;
>  
>  		/* We need a single contiguous page for our scratch */
> diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.h b/drivers/gpu/drm/i915/gt/intel_gtt.h
> index e67e34e17913..40e486704558 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gtt.h
> +++ b/drivers/gpu/drm/i915/gt/intel_gtt.h
> @@ -180,6 +180,9 @@ struct page *__px_page(struct drm_i915_gem_object *p);
>  dma_addr_t __px_dma(struct drm_i915_gem_object *p);
>  #define px_dma(px) (__px_dma(px_base(px)))
>  
> +void *__px_vaddr(struct drm_i915_gem_object *p);
> +#define px_vaddr(px) (__px_vaddr(px_base(px)))
> +
>  #define px_pt(px) \
>  	__px_choose_expr(px, struct i915_page_table *, __x, \
>  	__px_choose_expr(px, struct i915_page_directory *, &__x->pt, \
> @@ -511,8 +514,6 @@ struct i915_ppgtt *i915_ppgtt_create(struct intel_gt *gt);
>  void i915_ggtt_suspend(struct i915_ggtt *gtt);
>  void i915_ggtt_resume(struct i915_ggtt *ggtt);
>  
> -#define kmap_atomic_px(px) kmap_atomic(__px_page(px_base(px)))
> -
>  void
>  fill_page_dma(struct drm_i915_gem_object *p, const u64 val, unsigned int count);
>  
> @@ -530,8 +531,8 @@ struct i915_page_table *alloc_pt(struct i915_address_space *vm);
>  struct i915_page_directory *alloc_pd(struct i915_address_space *vm);
>  struct i915_page_directory *__alloc_pd(int npde);
>  
> -int pin_pt_dma(struct i915_address_space *vm, struct drm_i915_gem_object *obj);
> -int pin_pt_dma_locked(struct i915_address_space *vm, struct drm_i915_gem_object *obj);
> +int map_pt_dma(struct i915_address_space *vm, struct drm_i915_gem_object *obj);
> +int map_pt_dma_locked(struct i915_address_space *vm, struct drm_i915_gem_object *obj);
>  
>  void free_px(struct i915_address_space *vm,
>  	     struct i915_page_table *pt, int lvl);
> @@ -578,7 +579,7 @@ void setup_private_pat(struct intel_uncore *uncore);
>  int i915_vm_alloc_pt_stash(struct i915_address_space *vm,
>  			   struct i915_vm_pt_stash *stash,
>  			   u64 size);
> -int i915_vm_pin_pt_stash(struct i915_address_space *vm,
> +int i915_vm_map_pt_stash(struct i915_address_space *vm,
>  			 struct i915_vm_pt_stash *stash);
>  void i915_vm_free_pt_stash(struct i915_address_space *vm,
>  			   struct i915_vm_pt_stash *stash);
> diff --git a/drivers/gpu/drm/i915/gt/intel_ppgtt.c b/drivers/gpu/drm/i915/gt/intel_ppgtt.c
> index 014ae8ac4480..4e3d80c2295c 100644
> --- a/drivers/gpu/drm/i915/gt/intel_ppgtt.c
> +++ b/drivers/gpu/drm/i915/gt/intel_ppgtt.c
> @@ -87,11 +87,10 @@ write_dma_entry(struct drm_i915_gem_object * const pdma,
>  		const unsigned short idx,
>  		const u64 encoded_entry)
>  {
> -	u64 * const vaddr = kmap_atomic(__px_page(pdma));
> +	u64 * const vaddr = __px_vaddr(pdma);
>  
>  	vaddr[idx] = encoded_entry;
>  	clflush_cache_range(&vaddr[idx], sizeof(u64));
> -	kunmap_atomic(vaddr);
>  }
>  
>  void
> @@ -258,7 +257,7 @@ int i915_vm_alloc_pt_stash(struct i915_address_space *vm,
>  	return 0;
>  }
>  
> -int i915_vm_pin_pt_stash(struct i915_address_space *vm,
> +int i915_vm_map_pt_stash(struct i915_address_space *vm,
>  			 struct i915_vm_pt_stash *stash)
>  {
>  	struct i915_page_table *pt;
> @@ -266,7 +265,7 @@ int i915_vm_pin_pt_stash(struct i915_address_space *vm,
>  
>  	for (n = 0; n < ARRAY_SIZE(stash->pt); n++) {
>  		for (pt = stash->pt[n]; pt; pt = pt->stash) {
> -			err = pin_pt_dma_locked(vm, pt->base);
> +			err = map_pt_dma_locked(vm, pt->base);
>  			if (err)
>  				return err;
>  		}
> diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
> index e24d33aecac4..c68a743fac2a 100644
> --- a/drivers/gpu/drm/i915/i915_vma.c
> +++ b/drivers/gpu/drm/i915/i915_vma.c
> @@ -912,8 +912,7 @@ int i915_vma_pin_ww(struct i915_vma *vma, struct i915_gem_ww_ctx *ww,
>  			if (err)
>  				goto err_fence;
>  
> -			err = i915_vm_pin_pt_stash(vma->vm,
> -						   &work->stash);
> +			err = i915_vm_map_pt_stash(vma->vm, &work->stash);
>  			if (err)
>  				goto err_fence;
>  		}
> diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
> index 2e4f06eaacc1..e060e455e9f6 100644
> --- a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
> +++ b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
> @@ -186,7 +186,7 @@ static int igt_ppgtt_alloc(void *arg)
>  		if (err)
>  			goto err_ppgtt_cleanup;
>  
> -		err = i915_vm_pin_pt_stash(&ppgtt->vm, &stash);
> +		err = i915_vm_map_pt_stash(&ppgtt->vm, &stash);
>  		if (err) {
>  			i915_vm_free_pt_stash(&ppgtt->vm, &stash);
>  			goto err_ppgtt_cleanup;
> @@ -208,7 +208,7 @@ static int igt_ppgtt_alloc(void *arg)
>  		if (err)
>  			goto err_ppgtt_cleanup;
>  
> -		err = i915_vm_pin_pt_stash(&ppgtt->vm, &stash);
> +		err = i915_vm_map_pt_stash(&ppgtt->vm, &stash);
>  		if (err) {
>  			i915_vm_free_pt_stash(&ppgtt->vm, &stash);
>  			goto err_ppgtt_cleanup;
> @@ -325,11 +325,10 @@ static int lowlevel_hole(struct i915_address_space *vm,
>  							   BIT_ULL(size)))
>  					goto alloc_vm_end;
>  
> -				err = i915_vm_pin_pt_stash(vm, &stash);
> +				err = i915_vm_map_pt_stash(vm, &stash);
>  				if (!err)
>  					vm->allocate_va_range(vm, &stash,
>  							      addr, BIT_ULL(size));
> -
>  				i915_vm_free_pt_stash(vm, &stash);
>  alloc_vm_end:
>  				if (err == -EDEADLK) {
> @@ -1967,10 +1966,9 @@ static int igt_cs_tlb(void *arg)
>  			if (err)
>  				goto end_ww;
>  
> -			err = i915_vm_pin_pt_stash(vm, &stash);
> +			err = i915_vm_map_pt_stash(vm, &stash);
>  			if (!err)
>  				vm->allocate_va_range(vm, &stash, offset, chunk_size);
> -
>  			i915_vm_free_pt_stash(vm, &stash);
>  end_ww:
>  			if (err == -EDEADLK) {
> diff --git a/drivers/gpu/drm/i915/selftests/i915_perf.c b/drivers/gpu/drm/i915/selftests/i915_perf.c
> index e9d86dab8677..bfb0290967a1 100644
> --- a/drivers/gpu/drm/i915/selftests/i915_perf.c
> +++ b/drivers/gpu/drm/i915/selftests/i915_perf.c
> @@ -307,7 +307,7 @@ static int live_noa_gpr(void *arg)
>  	}
>  
>  	/* Poison the ce->vm so we detect writes not to the GGTT gt->scratch */
> -	scratch = kmap(__px_page(ce->vm->scratch[0]));
> +	scratch = __px_vaddr(ce->vm->scratch[0]);
>  	memset(scratch, POISON_FREE, PAGE_SIZE);
>  
>  	rq = intel_context_create_request(ce);
> @@ -405,7 +405,6 @@ static int live_noa_gpr(void *arg)
>  out_rq:
>  	i915_request_put(rq);
>  out_ce:
> -	kunmap(__px_page(ce->vm->scratch[0]));
>  	intel_context_put(ce);
>  out:
>  	stream_destroy(stream);
> -- 
> 2.26.3
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [Intel-gfx] [PATCH 18/19] drm/i915/gtt: map the PD up front
  2021-04-12 15:17     ` Daniel Vetter
@ 2021-04-12 16:01       ` Jani Nikula
  -1 siblings, 0 replies; 130+ messages in thread
From: Jani Nikula @ 2021-04-12 16:01 UTC (permalink / raw)
  To: Daniel Vetter, Matthew Auld; +Cc: intel-gfx, dri-devel, Chris Wilson

On Mon, 12 Apr 2021, Daniel Vetter <daniel@ffwll.ch> wrote:
> And that's some serious wtf. Yes we've done some compile-time type
> casting automagic between i915_priv and dev in the past, and I think even
> that was bad taste. But it was justified with that we have these
> everywhere (especially in the mmio macros), and it would be a terrible
> flag day.

FWIW, we had the dev_priv/dev macro trickery for a while to not have
that flag day conversion, until everything used i915 or &i915->drm. But
we got rid of it afterwards.

BR,
Jani.


-- 
Jani Nikula, Intel Open Source Graphics Center
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [Intel-gfx] [PATCH 18/19] drm/i915/gtt: map the PD up front
  2021-04-12 15:17     ` Daniel Vetter
@ 2021-04-12 16:08       ` Matthew Auld
  -1 siblings, 0 replies; 130+ messages in thread
From: Matthew Auld @ 2021-04-12 16:08 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Intel Graphics Development, Matthew Auld, ML dri-devel, Chris Wilson

On Mon, 12 Apr 2021 at 16:17, Daniel Vetter <daniel@ffwll.ch> wrote:
>
> On Mon, Apr 12, 2021 at 10:05:25AM +0100, Matthew Auld wrote:
> > We need to generalise our accessor for the page directories and tables
> > beyond the simple kmap_atomic to support local memory, and this setup
> > must be done on acquisition of the backing storage prior to entering
> > fence execution contexts. Here we replace the kmap with the object
> > mapping code, which for a simple single-page shmemfs object will return
> > a plain kmap that is then kept for the lifetime of the page directory.
> >
> > v2: (Thomas) Rebase on dma_resv and obj->mm.lock removal.
> >
> > Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>
> So I wanted to understand what px stands for as an abbreviation, and dug
> all the way down to this:
>
> commit 567047be2a7ede082d29f45524c287b87bd75e53
> Author: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> Date:   Thu Jun 25 18:35:12 2015 +0300
>
>     drm/i915/gtt: Use macros to access dma mapped pages
>
> I still have no idea what it means, I guess px = page. But I also
> committed this, so I guess I can blame myself :-)
>
> But while digging I've stumbled over this here
>
> commit 6eebfe8a10a62139d681e2f1af1386252742278b
> Author: Chris Wilson <chris@chris-wilson.co.uk>
> Date:   Fri Jul 12 08:58:18 2019 +0100
>
>     drm/i915/gtt: Use shallow dma pages for scratch
>
>
> And that's some serious wtf. Yes we've done some compile-time type
> casting automagic between i915_priv and dev in the past, and I think even
> that was bad taste. But it was justified with that we have these
> everywhere (especially in the mmio macros), and it would be a terrible
> flag day.
>
> But I'm not seeing any need for auto-casting for these pages here, and I'm
> not aware that we're doing this anywhere else in kernel code. There is
> some macro-trickery in lockdep annotations, but that relies on the lockdep
> map having the same struct member name in all lock types, and is not
> exposed to drivers at all.
>
> Am I missing something, or why do we have this compile-time type casting
> stuff going on in i915 page accessors?

I think 'x' in the px family of macros/functions is meant in the
variable/polymorphic sense, so it can potentially be a pt, pd, etc
underneath. If you look at px_base() for example all it does is fish
out the base GEM object from the structure, using the
known-at-compile-time-type, which then lets us get at the dma address,
vaddr etc.

It does seem pretty magical, but seems ok to me, if it means less typing?
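
To make the compile-time dispatch concrete, here is a minimal standalone
sketch of the idea. It is not the driver's actual macro: it assumes C11
_Generic as the selection mechanism, and the struct names, the three helper
functions and px_base_sketch() are all hypothetical stand-ins.

```c
/* Hypothetical stand-ins for the driver's object and page structures. */
struct gem_object { int id; };
struct page_table { struct gem_object *base; };
struct page_directory { struct page_table pt; };

/* One explicit accessor per type; each knows where the base object lives. */
static struct gem_object *obj_base(struct gem_object *obj)
{
	return obj;
}

static struct gem_object *pt_base(struct page_table *pt)
{
	return pt->base;
}

static struct gem_object *pd_base(struct page_directory *pd)
{
	return pd->pt.base;
}

/*
 * The "x" in px: pick the accessor from the static type of the argument.
 * Any pointer type not listed here is rejected at compile time.
 */
#define px_base_sketch(px) _Generic((px),		\
	struct gem_object *: obj_base,			\
	struct page_table *: pt_base,			\
	struct page_directory *: pd_base)(px)
```

With this, px_base_sketch(&pt) and px_base_sketch(&pd) both yield the same
base object, selected purely by the argument's type. The explicit
alternative is to call pt_base() or pd_base() directly at each call site.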

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [Intel-gfx] [PATCH 18/19] drm/i915/gtt: map the PD up front
  2021-04-12 16:01       ` Jani Nikula
@ 2021-04-12 16:36         ` Daniel Vetter
  -1 siblings, 0 replies; 130+ messages in thread
From: Daniel Vetter @ 2021-04-12 16:36 UTC (permalink / raw)
  To: Jani Nikula; +Cc: dri-devel, intel-gfx, Matthew Auld, Chris Wilson

On Mon, Apr 12, 2021 at 07:01:19PM +0300, Jani Nikula wrote:
> On Mon, 12 Apr 2021, Daniel Vetter <daniel@ffwll.ch> wrote:
> > And that's some serious wtf. Yes we've done some compile-time type
> > casting automagic between i915_priv and dev in the past, and I think even
> > that was bad taste. But it was justified with that we have these
> > everywhere (especially in the mmio macros), and it would be a terrible
> > flag day.
> 
> FWIW, we had the dev_priv/dev macro trickery for a while to not have
> that flag day conversion, until everything used i915 or &i915->drm. But
> we got rid of it afterwards.

Yay, and yes that was the plan to avoid the flag day. And not as a great
coding pattern that everyone should imitate ...
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [PATCH 15/19] drm/i915: WA for zero memory channel
  2021-04-12  9:05   ` [Intel-gfx] " Matthew Auld
@ 2021-04-12 16:57     ` Souza, Jose
  -1 siblings, 0 replies; 130+ messages in thread
From: Souza, Jose @ 2021-04-12 16:57 UTC (permalink / raw)
  To: intel-gfx, Auld, Matthew; +Cc: dri-devel

On Mon, 2021-04-12 at 10:05 +0100, Matthew Auld wrote:
> From: José Roberto de Souza <jose.souza@intel.com>
> 
> Commit c457d9cf256e ("drm/i915: Make sure we have enough memory
> bandwidth on ICL") assumes that we always have a non-zero
> dram_info->channels and uses it as a divisor. We need num memory
> channels to be at least 1 for sane bw limits checking, even when PCode
> returns 0, so let's force it to 1 in this case.

Missing my sob.

> 
> Cc: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
> ---
>  drivers/gpu/drm/i915/display/intel_bw.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/gpu/drm/i915/display/intel_bw.c b/drivers/gpu/drm/i915/display/intel_bw.c
> index 584ab5ce4106..c5f70f3e930e 100644
> --- a/drivers/gpu/drm/i915/display/intel_bw.c
> +++ b/drivers/gpu/drm/i915/display/intel_bw.c
> @@ -175,6 +175,7 @@ static int icl_get_bw_info(struct drm_i915_private *dev_priv, const struct intel
>  			    "Failed to get memory subsystem information, ignoring bandwidth limits");
>  		return ret;
>  	}
> +	num_channels = max_t(u8, 1, num_channels);
>  
>  	deinterleave = DIV_ROUND_UP(num_channels, is_y_tile ? 4 : 2);
>  	dclk_max = icl_sagv_max_dclk(&qi);
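
The clamp in the hunk above can be sketched in plain C as follows. This is
only an illustration under stated assumptions: div_round_up(),
sanitize_num_channels() and deinterleave_for() are hypothetical userspace
helpers, not driver functions, standing in for DIV_ROUND_UP() and the
max_t() line.

```c
#include <stdint.h>

/* Round-up integer division, in the spirit of the kernel's DIV_ROUND_UP(). */
static unsigned int div_round_up(unsigned int n, unsigned int d)
{
	return (n + d - 1) / d;
}

/* Treat a reported channel count of 0 as 1, like max_t(u8, 1, n). */
static uint8_t sanitize_num_channels(uint8_t reported)
{
	return reported ? reported : 1;
}

/*
 * Derive the deinterleave factor from the sanitized channel count.
 * Without the clamp, 0 channels would give a factor of 0, which is
 * unsafe for any later code that divides by it.
 */
static unsigned int deinterleave_for(uint8_t reported_channels, int is_y_tile)
{
	uint8_t num_channels = sanitize_num_channels(reported_channels);

	return div_round_up(num_channels, is_y_tile ? 4 : 2);
}
```

The point of clamping before the division is that every value derived from
the channel count stays non-zero, so later divisors are safe regardless of
what PCode reported.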


^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [Intel-gfx] [PATCH 18/19] drm/i915/gtt: map the PD up front
  2021-04-12 16:08       ` Matthew Auld
@ 2021-04-12 17:00         ` Daniel Vetter
  -1 siblings, 0 replies; 130+ messages in thread
From: Daniel Vetter @ 2021-04-12 17:00 UTC (permalink / raw)
  To: Matthew Auld
  Cc: Intel Graphics Development, Matthew Auld, ML dri-devel, Chris Wilson

On Mon, Apr 12, 2021 at 6:08 PM Matthew Auld
<matthew.william.auld@gmail.com> wrote:
>
> On Mon, 12 Apr 2021 at 16:17, Daniel Vetter <daniel@ffwll.ch> wrote:
> >
> > On Mon, Apr 12, 2021 at 10:05:25AM +0100, Matthew Auld wrote:
> > > > We need to generalise our accessor for the page directories and tables
> > > > beyond the simple kmap_atomic to support local memory, and this setup
> > > > must be done on acquisition of the backing storage prior to entering
> > > > fence execution contexts. Here we replace the kmap with the object
> > > > mapping code, which for a simple single-page shmemfs object will return
> > > > a plain kmap that is then kept for the lifetime of the page directory.
> > >
> > > v2: (Thomas) Rebase on dma_resv and obj->mm.lock removal.
> > >
> > > Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> >
> > So I wanted to understand what px stands for as an abbreviation, and dug
> > all the way down to this:
> >
> > commit 567047be2a7ede082d29f45524c287b87bd75e53
> > Author: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> > Date:   Thu Jun 25 18:35:12 2015 +0300
> >
> >     drm/i915/gtt: Use macros to access dma mapped pages
> >
> > I still have no idea what it means, I guess px = page. But I also
> > > committed this, so I guess I can blame myself :-)
> >
> > But while digging I've stumbled over this here
> >
> > commit 6eebfe8a10a62139d681e2f1af1386252742278b
> > Author: Chris Wilson <chris@chris-wilson.co.uk>
> > Date:   Fri Jul 12 08:58:18 2019 +0100
> >
> >     drm/i915/gtt: Use shallow dma pages for scratch
> >
> >
> > And that's some serious wtf. Yes we've done some compile-time type
> > casting automagic between i915_priv and dev in the past, and I think even
> > that was bad taste. But it was justified with that we have these
> > everywhere (especially in the mmio macros), and it would be a terrible
> > flag day.
> >
> > But I'm not seeing any need for auto-casting for these pages here, and I'm
> > not aware that we're doing this anywhere else in kernel code. There is
> > some macro-trickery in lockdep annotations, but that relies on the lockdep
> > map having the same struct member name in all lock types, and is not
> > exposed to drivers at all.
> >
> > Am I missing something, or why do we have this compile-time type casting
> > stuff going on in i915 page accessors?
>
> I think 'x' in the px family of macros/functions is meant in the
> variable/polymorphic sense, so it can potentially be a pt, pd, etc
> underneath. If you look at px_base() for example all it does is fish
> out the base GEM object from the structure, using the
> known-at-compile-time-type, which then lets us get at the dma address,
> vaddr etc.

Yeah, but that's not how things landed. px predates the magic
polymorphism. I think the px just stands for page, or at least
originally only stood for page. I'm not sure honestly. It seems to be
just used for page directory type of things, but I haven't found that
written down anywhere.

> It does seem pretty magical, but seems ok to me, if it means less typing?

That's the worst justification. Code is generally write once, read
many times. Optimizing for writing at the cost of magic indirection is
generally not the right tradeoff in the kernel, where any indirection
could hide a major gotcha. In huge userspace applications fancy
abstraction and polymorphism are often the right thing to do, but there
you also have a real compiler with a real typesystem (generally at
least) helping you out. Or it's yolo duct-taping with lots of tests,
where the speed at which you can hack up something matters more than
being able to read it quickly.

We're typing C here. It is generally rather verbose, with type casting
all done explicitly.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [Intel-gfx] [PATCH 14/19] drm/i915/oprom: Basic sanitization
  2021-04-12  9:05   ` [Intel-gfx] " Matthew Auld
@ 2021-04-12 22:36     ` kernel test robot
  -1 siblings, 0 replies; 130+ messages in thread
From: kernel test robot @ 2021-04-12 22:36 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx
  Cc: Jani Nikula, Mohammed Khajapasha, kbuild-all, dri-devel

[-- Attachment #1: Type: text/plain, Size: 1122 bytes --]

Hi Matthew,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on drm-intel/for-linux-next]
[also build test WARNING on drm-tip/drm-tip]
[cannot apply to drm-exynos/exynos-drm-next tegra-drm/drm/tegra/for-next drm/drm-next v5.12-rc7]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Matthew-Auld/More-DG1-enabling/20210412-171139
base:   git://anongit.freedesktop.org/drm-intel for-linux-next
config: x86_64-randconfig-c022-20210412 (attached as .config)
compiler: gcc-9 (Debian 9.3.0-22) 9.3.0

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>


cocci warnings: (new ones prefixed by >>)
>> drivers/gpu/drm/i915/display/intel_bios.c:2274:7-14: WARNING opportunity for kmemdup

Please review and possibly fold the followup patch.
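
For readers unfamiliar with this warning: the cocci script flags an
allocation that is immediately and fully overwritten by a memcpy(). A rough
userspace analogue is sketched below. The helper names are hypothetical, and
calloc()/malloc() merely stand in for kzalloc() and kmemdup().

```c
#include <stdlib.h>
#include <string.h>

/* Open-coded duplicate: zero the buffer, then overwrite all of it. */
static void *dup_open_coded(const void *src, size_t len)
{
	void *p = calloc(1, len);

	if (!p)
		return NULL;
	memcpy(p, src, len);
	return p;
}

/* Analogue of kmemdup(src, len, gfp): allocate and copy in one helper. */
static void *memdup_sketch(const void *src, size_t len)
{
	void *p = malloc(len);

	if (p)
		memcpy(p, src, len);
	return p;
}
```

Both helpers return an independent copy of the source buffer, or NULL on
allocation failure. The second form skips the redundant zeroing and keeps the
allocate-and-copy step in one place.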

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

^ permalink raw reply	[flat|nested] 130+ messages in thread

* [PATCH] drm/i915/oprom: fix memdup.cocci warnings
  2021-04-12  9:05   ` [Intel-gfx] " Matthew Auld
@ 2021-04-12 22:36     ` kernel test robot
  -1 siblings, 0 replies; 130+ messages in thread
From: kernel test robot @ 2021-04-12 22:36 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx
  Cc: Jani Nikula, Mohammed Khajapasha, kbuild-all, dri-devel

From: kernel test robot <lkp@intel.com>

drivers/gpu/drm/i915/display/intel_bios.c:2274:7-14: WARNING opportunity for kmemdup

 Use kmemdup rather than duplicating its implementation

Generated by: scripts/coccinelle/api/memdup.cocci

CC: Anshuman Gupta <anshuman.gupta@intel.com>
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: kernel test robot <lkp@intel.com>
---

url:    https://github.com/0day-ci/linux/commits/Matthew-Auld/More-DG1-enabling/20210412-171139
base:   git://anongit.freedesktop.org/drm-intel for-linux-next

 intel_bios.c |    3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

--- a/drivers/gpu/drm/i915/display/intel_bios.c
+++ b/drivers/gpu/drm/i915/display/intel_bios.c
@@ -2271,14 +2271,13 @@ static struct vbt_header *spi_oprom_get_
 	parse_ptr = (u8 *)oprom_opreg + found;
 	vbt_size = ((struct vbt_header *)parse_ptr)->vbt_size;
 
-	vbt = kzalloc(vbt_size, GFP_KERNEL);
+	vbt = kmemdup(parse_ptr, vbt_size, GFP_KERNEL);
 	if (!vbt) {
 		DRM_ERROR("Unable to allocate %u bytes for VBT storage\n",
 			  vbt_size);
 		goto err_not_found;
 	}
 
-	memcpy(vbt, parse_ptr, vbt_size);
 	if (!intel_bios_is_valid_vbt(vbt, vbt_size))
 		goto err_free_vbt;
 
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 130+ messages in thread
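For readers less familiar with the helper, here is a minimal userspace sketch of the allocate-then-copy pattern that the patch above folds into a single call. It assumes kmemdup() behaves as "allocate len bytes, then copy src into them". memdup_sketch() is a hypothetical stand-in built on malloc(), not the kernel implementation, and it has no equivalent of the GFP flags argument.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical userspace stand-in for kmemdup(): allocate len bytes and
 * copy src into them. Returns NULL if the allocation fails, so the caller
 * keeps a single error check, as in the patch above.
 */
static void *memdup_sketch(const void *src, size_t len)
{
	void *dst;

	assert(src != NULL);

	dst = malloc(len);
	if (dst)
		memcpy(dst, src, len);

	return dst;
}
```

Since the copy overwrites every byte of the new buffer, the zeroing done by the original kzalloc() was redundant, which is why the two forms are interchangeable here.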

* Re: [Intel-gfx] [PATCH 18/19] drm/i915/gtt: map the PD up front
  2021-04-12 17:00         ` Daniel Vetter
@ 2021-04-13  9:28           ` Matthew Auld
  -1 siblings, 0 replies; 130+ messages in thread
From: Matthew Auld @ 2021-04-13  9:28 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Intel Graphics Development, Matthew Auld, ML dri-devel, Chris Wilson

On Mon, 12 Apr 2021 at 18:01, Daniel Vetter <daniel@ffwll.ch> wrote:
>
> On Mon, Apr 12, 2021 at 6:08 PM Matthew Auld
> <matthew.william.auld@gmail.com> wrote:
> >
> > On Mon, 12 Apr 2021 at 16:17, Daniel Vetter <daniel@ffwll.ch> wrote:
> > >
> > > On Mon, Apr 12, 2021 at 10:05:25AM +0100, Matthew Auld wrote:
> > > > We need to generalise our accessor for the page directories and tables from
> > > > using the simple kmap_atomic to support local memory, and this setup
> > > > must be done on acquisition of the backing storage prior to entering
> > > > fence execution contexts. Here we replace the kmap with the object
> > > > mapping code that for simple single page shmemfs objects will return a
> > > > plain kmap, that is then kept for the lifetime of the page directory.
> > > >
> > > > v2: (Thomas) Rebase on dma_resv and obj->mm.lock removal.
> > > >
> > > > Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> > > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > >
> > > So I wanted to understand what px stands for as an abbreviation, and dug
> > > all the way down to this:
> > >
> > > commit 567047be2a7ede082d29f45524c287b87bd75e53
> > > Author: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> > > Date:   Thu Jun 25 18:35:12 2015 +0300
> > >
> > >     drm/i915/gtt: Use macros to access dma mapped pages
> > >
> > > I still have no idea what it means, I guess px = page. But I also
> > > committed this, so I guess I can blame myself :-)
> > >
> > > But while digging I've stumbled over this here
> > >
> > > commit 6eebfe8a10a62139d681e2f1af1386252742278b
> > > Author: Chris Wilson <chris@chris-wilson.co.uk>
> > > Date:   Fri Jul 12 08:58:18 2019 +0100
> > >
> > >     drm/i915/gtt: Use shallow dma pages for scratch
> > >
> > >
> > > And that's some serious wtf. Yes we've done some compile-time type
> > > casting automagic between i915_priv and dev in the past, and I think even
> > > that was bad taste. But it was justified by the fact that we have these
> > > everywhere (especially in the mmio macros), and it would be a terrible
> > > flag day.
> > >
> > > But I'm not seeing any need for auto-casting for these pages here, and I'm
> > > not aware that we're doing this anywhere else in kernel code. There is
> > > some macro-trickery in lockdep annotations, but that relies on the lockdep
> > > map having the same struct member name in all lock types, and is not
> > > exposed to drivers at all.
> > >
> > > Am I missing something, or why do we have this compile-time type casting
> > > stuff going on in i915 page accessors?
> >
> > I think 'x' in the px family of macros/functions is meant in the
> > variable/polymorphic sense, so it can potentially be a pt, pd, etc
> > underneath. If you look at px_base() for example all it does is fish
> > out the base GEM object from the structure, using the
> > known-at-compile-time-type, which then lets us get at the dma address,
> > vaddr etc.
>
> Yeah, but that's not how things landed. px predates the magic
> polymorphism. I think the px just stands for page, or at least
> originally only stood for page. I'm not sure honestly. It seems to be
> just used for page directory type of things, but I haven't found that
> written down anywhere.
>
> > It does seem pretty magical, but seems ok to me, if it means less typing?
>
> That's the worst justification. Code is generally write once, read
> many times. Optimizing for writing at the cost of magic indirection is
> generally not the right tradeoff in the kernel, where any indirection
> could hide a major gotcha. In huge userspace applications fancy
> abstraction and polymorphism is often the right thing to do, but there
> you also have a real compiler with a real typesystem (generally at
> least) helping you out. Or it's yolo duct-taping with lots of tests,
> where the speed at which you can hack up something matters more than
> being able to read it quickly.
>
> We're typing C here. It is generally rather verbose, with type casting
> all done explicitly.

Ok. So should we change this around for this patch? The px_ stuff is
already quite prevalent it seems, and the px_vaddr() is just one part
of it? Maybe just add pt_vaddr(), pd_vaddr() etc instead?
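A minimal sketch of what explicitly typed accessors along the lines of pt_vaddr()/pd_vaddr() might look like. The *_sketch structures are hypothetical, simplified stand-ins (a page directory embedding a page table, which points at a backing object with a kept CPU mapping); they are assumptions for illustration and not the driver's real layout.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the driver structures. */
struct gem_object_sketch {
	void *vaddr;	/* CPU mapping kept for the object's lifetime */
};

struct page_table_sketch {
	struct gem_object_sketch *base;
};

struct page_directory_sketch {
	struct page_table_sketch pt;
};

/* One explicitly typed accessor per structure; no macro dispatch. */
static inline void *pt_vaddr(const struct page_table_sketch *pt)
{
	assert(pt && pt->base);
	return pt->base->vaddr;
}

static inline void *pd_vaddr(const struct page_directory_sketch *pd)
{
	assert(pd);
	return pt_vaddr(&pd->pt);
}
```

The cost is one small function per type. The benefit is that the argument type is visible in the signature, so a reader never has to work out which branch of a macro applies at a given call site.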

> -Daniel
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [Intel-gfx] [PATCH 18/19] drm/i915/gtt: map the PD up front
  2021-04-13  9:28           ` Matthew Auld
@ 2021-04-13 10:18             ` Daniel Vetter
  -1 siblings, 0 replies; 130+ messages in thread
From: Daniel Vetter @ 2021-04-13 10:18 UTC (permalink / raw)
  To: Matthew Auld
  Cc: Intel Graphics Development, Matthew Auld, ML dri-devel, Chris Wilson

On Tue, Apr 13, 2021 at 11:29 AM Matthew Auld
<matthew.william.auld@gmail.com> wrote:
>
> On Mon, 12 Apr 2021 at 18:01, Daniel Vetter <daniel@ffwll.ch> wrote:
> >
> > On Mon, Apr 12, 2021 at 6:08 PM Matthew Auld
> > <matthew.william.auld@gmail.com> wrote:
> > >
> > > On Mon, 12 Apr 2021 at 16:17, Daniel Vetter <daniel@ffwll.ch> wrote:
> > > >
> > > > On Mon, Apr 12, 2021 at 10:05:25AM +0100, Matthew Auld wrote:
> > > > > We need to generalise our accessor for the page directories and tables from
> > > > > using the simple kmap_atomic to support local memory, and this setup
> > > > > must be done on acquisition of the backing storage prior to entering
> > > > > fence execution contexts. Here we replace the kmap with the object
> > > > > mapping code that for simple single page shmemfs objects will return a
> > > > > plain kmap, that is then kept for the lifetime of the page directory.
> > > > >
> > > > > v2: (Thomas) Rebase on dma_resv and obj->mm.lock removal.
> > > > >
> > > > > Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> > > > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > > >
> > > > So I wanted to understand what px stands for as an abbreviation, and dug
> > > > all the way down to this:
> > > >
> > > > commit 567047be2a7ede082d29f45524c287b87bd75e53
> > > > Author: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> > > > Date:   Thu Jun 25 18:35:12 2015 +0300
> > > >
> > > >     drm/i915/gtt: Use macros to access dma mapped pages
> > > >
> > > > I still have no idea what it means, I guess px = page. But I also
> > > > committed this, so I guess I can blame myself :-)
> > > >
> > > > But while digging I've stumbled over this here
> > > >
> > > > commit 6eebfe8a10a62139d681e2f1af1386252742278b
> > > > Author: Chris Wilson <chris@chris-wilson.co.uk>
> > > > Date:   Fri Jul 12 08:58:18 2019 +0100
> > > >
> > > >     drm/i915/gtt: Use shallow dma pages for scratch
> > > >
> > > >
> > > > And that's some serious wtf. Yes we've done some compile-time type
> > > > casting automagic between i915_priv and dev in the past, and I think even
> > > > that was bad taste. But it was justified by the fact that we have these
> > > > everywhere (especially in the mmio macros), and it would be a terrible
> > > > flag day.
> > > >
> > > > But I'm not seeing any need for auto-casting for these pages here, and I'm
> > > > not aware that we're doing this anywhere else in kernel code. There is
> > > > some macro-trickery in lockdep annotations, but that relies on the lockdep
> > > > map having the same struct member name in all lock types, and is not
> > > > exposed to drivers at all.
> > > >
> > > > Am I missing something, or why do we have this compile-time type casting
> > > > stuff going on in i915 page accessors?
> > >
> > > I think 'x' in the px family of macros/functions is meant in the
> > > variable/polymorphic sense, so it can potentially be a pt, pd, etc
> > > underneath. If you look at px_base() for example all it does is fish
> > > out the base GEM object from the structure, using the
> > > known-at-compile-time-type, which then lets us get at the dma address,
> > > vaddr etc.
> >
> > Yeah, but that's not how things landed. px predates the magic
> > polymorphism. I think the px just stands for page, or at least
> > originally only stood for page. I'm not sure honestly. It seems to be
> > just used for page directory type of things, but I haven't found that
> > written down anywhere.
> >
> > > It does seem pretty magical, but seems ok to me, if it means less typing?
> >
> > That's the worst justification. Code is generally write once, read
> > many times. Optimizing for writing at the cost of magic indirection is
> > generally not the right tradeoff in the kernel, where any indirection
> > could hide a major gotcha. In huge userspace applications fancy
> > abstraction and polymorphism is often the right thing to do, but there
> > you also have a real compiler with a real typesystem (generally at
> > least) helping you out. Or it's yolo duct-taping with lots of tests,
> > where the speed at which you can hack up something matters more than
> > being able to read it quickly.
> >
> > We're typing C here. It is generally rather verbose, with type casting
> > all done explicitly.
>
> Ok. So should we change this around for this patch? The px_ stuff is
> already quite prevalent it seems, and the px_vaddr() is just one part
> of it? Maybe just add pt_vaddr(), pd_vaddr() etc instead?

Nah, that was just an orthogonal observation. The confusion with magic
type-aware macros is preexisting and widespread, there's no point
holding up dg1 code with that. But it is maybe something we should put
on our cleanup list. Or at least have a better explanation for why
exactly it is needed. Also note I'm not worried about the px stuff
standing for pt/pd/whatever, it's the magic type casting property of
these macros added with the 2nd patch I've mentioned above that looks
rather questionable to me. Maybe as a transition thing like we've done
with i915_priv pointers, but not something that we should build on top
of for the long term.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
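To make the "magic type casting" concern concrete, here is a rough sketch of a type-dispatching accessor, where the static type of the argument picks the code path at compile time. It uses C11 _Generic purely for illustration; the driver's real px_ macros may well be built differently, and every *_sketch name below is a hypothetical stand-in rather than an i915 identifier.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the driver structures. */
struct gem_object_sketch { void *vaddr; };
struct page_table_sketch { struct gem_object_sketch *base; };
struct page_directory_sketch { struct page_table_sketch pt; };

static inline struct gem_object_sketch *
base_from_obj(struct gem_object_sketch *obj) { return obj; }

static inline struct gem_object_sketch *
base_from_pt(struct page_table_sketch *pt) { return pt->base; }

static inline struct gem_object_sketch *
base_from_pd(struct page_directory_sketch *pd) { return pd->pt.base; }

/*
 * Type-dispatching accessor: the static type of the argument selects the
 * helper at compile time. Nothing at the call site shows which one runs.
 */
#define px_base_sketch(px) _Generic((px),			\
	struct gem_object_sketch *: base_from_obj,		\
	struct page_table_sketch *: base_from_pt,		\
	struct page_directory_sketch *: base_from_pd)(px)

#define px_vaddr_sketch(px) (px_base_sketch(px)->vaddr)
```

A call such as px_vaddr_sketch(pd) reads the same whether pd is an object, a page table or a page directory, which is the hidden indirection being questioned. Passing any other pointer type fails to compile, so the dispatch is at least checked.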
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [Intel-gfx] [PATCH 03/19] drm/i915: Create stolen memory region from local memory
  2021-04-12  9:05   ` [Intel-gfx] " Matthew Auld
@ 2021-04-14 15:01     ` Tvrtko Ursulin
  -1 siblings, 0 replies; 130+ messages in thread
From: Tvrtko Ursulin @ 2021-04-14 15:01 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx; +Cc: dri-devel


On 12/04/2021 10:05, Matthew Auld wrote:
> From: CQ Tang <cq.tang@intel.com>
> 
> Add "REGION_STOLEN" device info to dg1, create stolen memory
> region from upper portion of local device memory, starting
> from DSMBASE.
> 
> v2:
>      - s/drm_info/drm_dbg; userspace likely doesn't care about stolen.
>      - mem->type is only setup after the region probe, so setting the name
>        as stolen-local or stolen-system based on this value won't work. Split
>        system vs local stolen setup to fix this.
>      - kill all the region->devmem/is_devmem stuff. We already differentiate
>        the different types of stolen so such things shouldn't be needed
>        anymore.
> 
> Signed-off-by: CQ Tang <cq.tang@intel.com>
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> ---
>   drivers/gpu/drm/i915/gem/i915_gem_stolen.c | 99 +++++++++++++++++++---
>   drivers/gpu/drm/i915/gem/i915_gem_stolen.h |  3 +
>   drivers/gpu/drm/i915/i915_pci.c            |  2 +-
>   drivers/gpu/drm/i915/i915_reg.h            |  1 +
>   drivers/gpu/drm/i915/intel_memory_region.c |  6 ++
>   drivers/gpu/drm/i915/intel_memory_region.h |  5 +-
>   6 files changed, 102 insertions(+), 14 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_stolen.c b/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
> index b0597de206de..56dd58bef5ee 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
> @@ -10,6 +10,7 @@
>   #include <drm/drm_mm.h>
>   #include <drm/i915_drm.h>
>   
> +#include "gem/i915_gem_lmem.h"
>   #include "gem/i915_gem_region.h"
>   #include "i915_drv.h"
>   #include "i915_gem_stolen.h"
> @@ -121,6 +122,14 @@ static int i915_adjust_stolen(struct drm_i915_private *i915,
>   		}
>   	}
>   
> +	/*
> +	 * With device local memory, we don't need to check the address range,
> +	 * this is device memory physical address, could overlap with system
> +	 * memory.
> +	 */
> +	if (HAS_LMEM(i915))
> +		return 0;
> +
>   	/*
>   	 * Verify that nothing else uses this physical address. Stolen
>   	 * memory should be reserved by the BIOS and hidden from the
> @@ -374,8 +383,9 @@ static void icl_get_stolen_reserved(struct drm_i915_private *i915,
>   	}
>   }
>   
> -static int i915_gem_init_stolen(struct drm_i915_private *i915)
> +static int i915_gem_init_stolen(struct intel_memory_region *mem)
>   {
> +	struct drm_i915_private *i915 = mem->i915;
>   	struct intel_uncore *uncore = &i915->uncore;
>   	resource_size_t reserved_base, stolen_top;
>   	resource_size_t reserved_total, reserved_size;
> @@ -396,10 +406,10 @@ static int i915_gem_init_stolen(struct drm_i915_private *i915)
>   		return 0;
>   	}
>   
> -	if (resource_size(&intel_graphics_stolen_res) == 0)
> +	if (resource_size(&mem->region) == 0)
>   		return 0;
>   
> -	i915->dsm = intel_graphics_stolen_res;
> +	i915->dsm = mem->region;
>   
>   	if (i915_adjust_stolen(i915, &i915->dsm))
>   		return 0;
> @@ -684,23 +694,36 @@ static int _i915_gem_object_stolen_init(struct intel_memory_region *mem,
>   	return ret;
>   }
>   
> +struct intel_memory_region *i915_stolen_region(struct drm_i915_private *i915)
> +{
> +	if (HAS_LMEM(i915))
> +		return i915->mm.regions[INTEL_REGION_STOLEN_LMEM];
> +
> +	return i915->mm.regions[INTEL_REGION_STOLEN_SMEM];
> +}

This could be just a bikeshedding comment, especially since I think this
path is used very little at runtime, so it is most likely pointless to
fiddle with it, but it strikes me as not fully elegant to do:

i915_gem_object_create_stolen
  -> i915_gem_object_create_region
     -> i915_stolen_region

And end up in here, when the alternative could be, at driver init:

i915->stolen_region_id = HAS_LMEM() ? ... : ...;

i915_gem_object_create_stolen
  -> i915_gem_object_create_region(i915->mm.regions[i915->stolen_region_id]);

Or a pointer to the region. That would also avoid having to export
i915_stolen_region.
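A rough sketch of the "decide once at driver init" alternative, under the assumption that the choice between local and system stolen never changes after probe. All types and names ending in _sketch or _SKETCH are hypothetical stand-ins, not the driver's real structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; not the driver's real types. */
enum region_id_sketch {
	REGION_STOLEN_SMEM_SKETCH,
	REGION_STOLEN_LMEM_SKETCH,
	REGION_COUNT_SKETCH,
};

struct region_sketch {
	const char *name;
};

struct device_sketch {
	bool has_lmem;
	struct region_sketch *regions[REGION_COUNT_SKETCH];
	enum region_id_sketch stolen_region_id;	/* chosen once at init */
};

/* Run once during (hypothetical) driver init. */
static void init_stolen_region_id(struct device_sketch *dev)
{
	dev->stolen_region_id = dev->has_lmem ?
		REGION_STOLEN_LMEM_SKETCH : REGION_STOLEN_SMEM_SKETCH;
}

/* Callers index the table with the cached id; no per-call branching. */
static struct region_sketch *stolen_region(const struct device_sketch *dev)
{
	assert(dev->stolen_region_id < REGION_COUNT_SKETCH);
	return dev->regions[dev->stolen_region_id];
}
```

Storing a pointer to the region instead of an id would work the same way and drop the table lookup as well.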

Or is i915->dsm already the right thing? Because..

> +
>   struct drm_i915_gem_object *
>   i915_gem_object_create_stolen(struct drm_i915_private *i915,
>   			      resource_size_t size)
>   {
> -	return i915_gem_object_create_region(i915->mm.regions[INTEL_REGION_STOLEN_SMEM],
> +	return i915_gem_object_create_region(i915_stolen_region(i915),
>   					     size, I915_BO_ALLOC_CONTIGUOUS);
>   }
>   
>   static int init_stolen(struct intel_memory_region *mem)
>   {
> -	intel_memory_region_set_name(mem, "stolen");
> +	if (HAS_LMEM(mem->i915)) {
> +		if (!io_mapping_init_wc(&mem->iomap,
> +					mem->io_start,
> +					resource_size(&mem->region)))
> +			return -EIO;
> +	}
>   
>   	/*
>   	 * Initialise stolen early so that we may reserve preallocated
>   	 * objects for the BIOS to KMS transition.
>   	 */
> -	return i915_gem_init_stolen(mem->i915);
> +	return i915_gem_init_stolen(mem);

... I find the mem region init paths a bit convoluted, stolen
especially, and struggle to figure them out every time.

For instance, we have i915_region_stolen_ops shared between system and
local stolen. But then the shared vfuncs branch depending on system vs local?

i915_gem_init_stolen is shared - but which parts of it are relevant for 
local stolen?

>   }
>   
>   static void release_stolen(struct intel_memory_region *mem)
> @@ -714,13 +737,65 @@ static const struct intel_memory_region_ops i915_region_stolen_ops = {
>   	.init_object = _i915_gem_object_stolen_init,
>   };
>   
> +static struct intel_memory_region *
> +setup_lmem_stolen(struct drm_i915_private *i915)
> +{
> +	struct intel_uncore *uncore = &i915->uncore;
> +	struct pci_dev *pdev = i915->drm.pdev;
> +	struct intel_memory_region *mem;
> +	resource_size_t io_start;
> +	resource_size_t lmem_size;
> +	u64 lmem_base;
> +
> +	if (!IS_DGFX(i915))
> +		return ERR_PTR(-ENODEV);
> +
> +	lmem_base = intel_uncore_read64(uncore, GEN12_DSMBASE);
> +	lmem_size = pci_resource_len(pdev, 2) - lmem_base;
> +	io_start = pci_resource_start(pdev, 2) + lmem_base;
> +
> +	mem = intel_memory_region_create(i915, lmem_base, lmem_size,
> +					 I915_GTT_PAGE_SIZE_4K, io_start,
> +					 &i915_region_stolen_ops);
> +	if (IS_ERR(mem))
> +		return mem;
> +
> +	drm_dbg(&i915->drm, "Stolen Local memory: %pR\n", &mem->region);
> +	drm_dbg(&i915->drm, "Stolen Local memory IO start: %pa\n",
> +		&mem->io_start);

Could these messages be consolidated with the system stolen ones
(i915_gem_setup_stolen?) and, based on the memory_region data, printed
from the common i915_gem_stolen_setup?

> +
> +	intel_memory_region_set_name(mem, "stolen-local");
> +
> +	return mem;
> +}
> +
> +static struct intel_memory_region*

Space before asterisk.

> +setup_smem_stolen(struct drm_i915_private *i915)
> +{
> +	struct intel_memory_region *mem;
> +
> +	mem = intel_memory_region_create(i915,
> +					 intel_graphics_stolen_res.start,
> +					 resource_size(&intel_graphics_stolen_res),
> +					 PAGE_SIZE, 0,
> +					 &i915_region_stolen_ops);
> +	if (IS_ERR(mem))
> +		return mem;
> +
> +	intel_memory_region_set_name(mem, "stolen-system");

I assume this name, although changed from the current one ("stolen"), is
not exported anywhere where it would matter?

> +
> +	return mem;
> +}
> +
>   struct intel_memory_region *i915_gem_stolen_setup(struct drm_i915_private *i915)
>   {
> -	return intel_memory_region_create(i915,
> -					  intel_graphics_stolen_res.start,
> -					  resource_size(&intel_graphics_stolen_res),
> -					  PAGE_SIZE, 0,
> -					  &i915_region_stolen_ops);
> +	struct intel_memory_region *mem;
> +
> +	mem = setup_lmem_stolen(i915);
> +	if (mem == ERR_PTR(-ENODEV))
> +		mem = setup_smem_stolen(i915);
> +
> +	return mem;
>   }
>   
>   struct drm_i915_gem_object *
> @@ -728,7 +803,7 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_i915_private *i915,
>   					       resource_size_t stolen_offset,
>   					       resource_size_t size)
>   {
> -	struct intel_memory_region *mem = i915->mm.regions[INTEL_REGION_STOLEN_SMEM];
> +	struct intel_memory_region *mem = i915_stolen_region(i915);
>   	struct drm_i915_gem_object *obj;
>   	struct drm_mm_node *stolen;
>   	int ret;
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_stolen.h b/drivers/gpu/drm/i915/gem/i915_gem_stolen.h
> index b03489706796..2d1ce7fec61c 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_stolen.h
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_stolen.h
> @@ -22,6 +22,9 @@ int i915_gem_stolen_insert_node_in_range(struct drm_i915_private *dev_priv,
>   void i915_gem_stolen_remove_node(struct drm_i915_private *dev_priv,
>   				 struct drm_mm_node *node);
>   struct intel_memory_region *i915_gem_stolen_setup(struct drm_i915_private *i915);
> +
> +struct intel_memory_region *i915_stolen_region(struct drm_i915_private *i915);
> +
>   struct drm_i915_gem_object *
>   i915_gem_object_create_stolen(struct drm_i915_private *dev_priv,
>   			      resource_size_t size);
> diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
> index 480553746794..53f5d1e6daef 100644
> --- a/drivers/gpu/drm/i915/i915_pci.c
> +++ b/drivers/gpu/drm/i915/i915_pci.c
> @@ -906,7 +906,7 @@ static const struct intel_device_info rkl_info = {
>   
>   #define GEN12_DGFX_FEATURES \
>   	GEN12_FEATURES, \
> -	.memory_regions = REGION_SMEM | REGION_LMEM, \
> +	.memory_regions = REGION_SMEM | REGION_LMEM | REGION_STOLEN_LMEM, \
>   	.has_master_unit_irq = 1, \
>   	.has_llc = 0, \
>   	.has_snoop = 1, \
> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> index e087bcd21911..4108f2a7ebfa 100644
> --- a/drivers/gpu/drm/i915/i915_reg.h
> +++ b/drivers/gpu/drm/i915/i915_reg.h
> @@ -12191,6 +12191,7 @@ enum skl_power_gate {
>   #define GEN12_GLOBAL_MOCS(i)	_MMIO(0x4000 + (i) * 4) /* Global MOCS regs */
>   
>   #define GEN12_GSMBASE			_MMIO(0x108100)
> +#define GEN12_DSMBASE			_MMIO(0x1080C0)
>   
>   /* gamt regs */
>   #define GEN8_L3_LRA_1_GPGPU _MMIO(0x4dd4)
> diff --git a/drivers/gpu/drm/i915/intel_memory_region.c b/drivers/gpu/drm/i915/intel_memory_region.c
> index bf837b6bb185..ac90b76a3fa0 100644
> --- a/drivers/gpu/drm/i915/intel_memory_region.c
> +++ b/drivers/gpu/drm/i915/intel_memory_region.c
> @@ -22,6 +22,10 @@ static const struct {
>   		.class = INTEL_MEMORY_STOLEN_SYSTEM,
>   		.instance = 0,
>   	},
> +	[INTEL_REGION_STOLEN_LMEM] = {
> +		.class = INTEL_MEMORY_STOLEN_LOCAL,
> +		.instance = 0,
> +	},
>   };
>   
>   struct intel_memory_region *
> @@ -278,6 +282,8 @@ int intel_memory_regions_hw_probe(struct drm_i915_private *i915)
>   		case INTEL_MEMORY_SYSTEM:
>   			mem = i915_gem_shmem_setup(i915);
>   			break;
> +		case INTEL_MEMORY_STOLEN_LOCAL:
> +			fallthrough;
>   		case INTEL_MEMORY_STOLEN_SYSTEM:
>   			mem = i915_gem_stolen_setup(i915);
>   			break;
> diff --git a/drivers/gpu/drm/i915/intel_memory_region.h b/drivers/gpu/drm/i915/intel_memory_region.h
> index edd49067c8ca..4c8ec15af55f 100644
> --- a/drivers/gpu/drm/i915/intel_memory_region.h
> +++ b/drivers/gpu/drm/i915/intel_memory_region.h
> @@ -26,18 +26,21 @@ enum intel_memory_type {
>   	INTEL_MEMORY_SYSTEM = 0,
>   	INTEL_MEMORY_LOCAL,
>   	INTEL_MEMORY_STOLEN_SYSTEM,
> +	INTEL_MEMORY_STOLEN_LOCAL,
>   };
>   
>   enum intel_region_id {
>   	INTEL_REGION_SMEM = 0,
>   	INTEL_REGION_LMEM,
>   	INTEL_REGION_STOLEN_SMEM,
> +	INTEL_REGION_STOLEN_LMEM,
>   	INTEL_REGION_UNKNOWN, /* Should be last */
>   };
>   
>   #define REGION_SMEM     BIT(INTEL_REGION_SMEM)
>   #define REGION_LMEM     BIT(INTEL_REGION_LMEM)
>   #define REGION_STOLEN_SMEM   BIT(INTEL_REGION_STOLEN_SMEM)
> +#define REGION_STOLEN_LMEM   BIT(INTEL_REGION_STOLEN_LMEM)
>   
>   #define I915_ALLOC_MIN_PAGE_SIZE  BIT(0)
>   #define I915_ALLOC_CONTIGUOUS     BIT(1)
> @@ -82,7 +85,7 @@ struct intel_memory_region {
>   	u16 type;
>   	u16 instance;
>   	enum intel_region_id id;
> -	char name[8];
> +	char name[16];
>   
>   	struct list_head reserved;
>   
> 

Regards,

Tvrtko
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [Intel-gfx] [PATCH 03/19] drm/i915: Create stolen memory region from local memory
@ 2021-04-14 15:01     ` Tvrtko Ursulin
  0 siblings, 0 replies; 130+ messages in thread
From: Tvrtko Ursulin @ 2021-04-14 15:01 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx; +Cc: dri-devel


On 12/04/2021 10:05, Matthew Auld wrote:
> From: CQ Tang <cq.tang@intel.com>
> 
> Add "REGION_STOLEN" device info to dg1, create stolen memory
> region from upper portion of local device memory, starting
> from DSMBASE.
> 
> v2:
>      - s/drm_info/drm_dbg; userspace likely doesn't care about stolen.
>      - mem->type is only setup after the region probe, so setting the name
>        as stolen-local or stolen-system based on this value won't work. Split
>        system vs local stolen setup to fix this.
>      - kill all the region->devmem/is_devmem stuff. We already differentiate
>        the different types of stolen so such things shouldn't be needed
>        anymore.
> 
> Signed-off-by: CQ Tang <cq.tang@intel.com>
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> ---
>   drivers/gpu/drm/i915/gem/i915_gem_stolen.c | 99 +++++++++++++++++++---
>   drivers/gpu/drm/i915/gem/i915_gem_stolen.h |  3 +
>   drivers/gpu/drm/i915/i915_pci.c            |  2 +-
>   drivers/gpu/drm/i915/i915_reg.h            |  1 +
>   drivers/gpu/drm/i915/intel_memory_region.c |  6 ++
>   drivers/gpu/drm/i915/intel_memory_region.h |  5 +-
>   6 files changed, 102 insertions(+), 14 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_stolen.c b/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
> index b0597de206de..56dd58bef5ee 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
> @@ -10,6 +10,7 @@
>   #include <drm/drm_mm.h>
>   #include <drm/i915_drm.h>
>   
> +#include "gem/i915_gem_lmem.h"
>   #include "gem/i915_gem_region.h"
>   #include "i915_drv.h"
>   #include "i915_gem_stolen.h"
> @@ -121,6 +122,14 @@ static int i915_adjust_stolen(struct drm_i915_private *i915,
>   		}
>   	}
>   
> +	/*
> +	 * With device local memory, we don't need to check the address range,
> +	 * this is device memory physical address, could overlap with system
> +	 * memory.
> +	 */
> +	if (HAS_LMEM(i915))
> +		return 0;
> +
>   	/*
>   	 * Verify that nothing else uses this physical address. Stolen
>   	 * memory should be reserved by the BIOS and hidden from the
> @@ -374,8 +383,9 @@ static void icl_get_stolen_reserved(struct drm_i915_private *i915,
>   	}
>   }
>   
> -static int i915_gem_init_stolen(struct drm_i915_private *i915)
> +static int i915_gem_init_stolen(struct intel_memory_region *mem)
>   {
> +	struct drm_i915_private *i915 = mem->i915;
>   	struct intel_uncore *uncore = &i915->uncore;
>   	resource_size_t reserved_base, stolen_top;
>   	resource_size_t reserved_total, reserved_size;
> @@ -396,10 +406,10 @@ static int i915_gem_init_stolen(struct drm_i915_private *i915)
>   		return 0;
>   	}
>   
> -	if (resource_size(&intel_graphics_stolen_res) == 0)
> +	if (resource_size(&mem->region) == 0)
>   		return 0;
>   
> -	i915->dsm = intel_graphics_stolen_res;
> +	i915->dsm = mem->region;
>   
>   	if (i915_adjust_stolen(i915, &i915->dsm))
>   		return 0;
> @@ -684,23 +694,36 @@ static int _i915_gem_object_stolen_init(struct intel_memory_region *mem,
>   	return ret;
>   }
>   
> +struct intel_memory_region *i915_stolen_region(struct drm_i915_private *i915)
> +{
> +	if (HAS_LMEM(i915))
> +		return i915->mm.regions[INTEL_REGION_STOLEN_LMEM];
> +
> +	return i915->mm.regions[INTEL_REGION_STOLEN_SMEM];
> +}

This may be pure bikeshedding, especially since I think this path is 
rarely used at runtime so it is most likely pointless to fiddle with it, 
but it strikes me as not fully elegant to do:

i915_gem_object_create_stolen
  -> i915_gem_object_create_region
     -> i915_stolen_region

And end up in here, when the alternative could be to do this at driver init:

i915->stolen_region_id = HAS_LMEM() ? ... : ...;

i915_gem_object_create_stolen
  -> 
i915_gem_object_create_region(i915->mm.regions[i915->stolen_region_id]);

Or store a pointer to the region. That would avoid having to export i915_stolen_region as 
well.
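
As a rough, untested sketch of the init-time idea (user-space C with made-up stand-in types; mock_i915, the stolen_region field and both helper names are assumptions for illustration, not existing driver symbols):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver types; not the real i915 structs. */
enum mock_region_id {
	MOCK_REGION_SMEM = 0,
	MOCK_REGION_LMEM,
	MOCK_REGION_STOLEN_SMEM,
	MOCK_REGION_STOLEN_LMEM,
	MOCK_REGION_COUNT,
};

struct mock_region {
	const char *name;
};

struct mock_i915 {
	bool has_lmem;
	struct mock_region *regions[MOCK_REGION_COUNT];
	/* Assumed new field: resolved once at init, read-only afterwards. */
	struct mock_region *stolen_region;
};

/* Called once during (mock) driver init, after the regions are probed. */
static void mock_init_stolen_region(struct mock_i915 *i915)
{
	enum mock_region_id id = i915->has_lmem ? MOCK_REGION_STOLEN_LMEM
						: MOCK_REGION_STOLEN_SMEM;

	i915->stolen_region = i915->regions[id];
}

/* Allocation path: no has_lmem branch and no exported lookup helper. */
static struct mock_region *mock_stolen_target(const struct mock_i915 *i915)
{
	assert(i915->stolen_region != NULL); /* init must have run first */
	return i915->stolen_region;
}
```

The allocation path then only reads a pointer that was resolved once, so the system vs local decision lives in a single place at init.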

Or is i915->dsm already the right thing? Because..

> +
>   struct drm_i915_gem_object *
>   i915_gem_object_create_stolen(struct drm_i915_private *i915,
>   			      resource_size_t size)
>   {
> -	return i915_gem_object_create_region(i915->mm.regions[INTEL_REGION_STOLEN_SMEM],
> +	return i915_gem_object_create_region(i915_stolen_region(i915),
>   					     size, I915_BO_ALLOC_CONTIGUOUS);
>   }
>   
>   static int init_stolen(struct intel_memory_region *mem)
>   {
> -	intel_memory_region_set_name(mem, "stolen");
> +	if (HAS_LMEM(mem->i915)) {
> +		if (!io_mapping_init_wc(&mem->iomap,
> +					mem->io_start,
> +					resource_size(&mem->region)))
> +			return -EIO;
> +	}
>   
>   	/*
>   	 * Initialise stolen early so that we may reserve preallocated
>   	 * objects for the BIOS to KMS transition.
>   	 */
> -	return i915_gem_init_stolen(mem->i915);
> +	return i915_gem_init_stolen(mem);

... I find the mem region init paths a bit convoluted, stolen 
especially, and struggle to figure it out every time.

For instance, we have i915_region_stolen_ops shared between system and 
local stolen. But then the shared vfuncs branch depending on system vs local?

i915_gem_init_stolen is shared - but which parts of it are relevant for 
local stolen?
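
One possible shape, purely as an illustration: give each flavour its own ops table so the vfuncs do not need to branch, and keep only the genuinely common steps in a shared helper. This is a user-space sketch with hypothetical names (none of these are the driver's real symbols, and which steps are really common is exactly the open question above):

```c
#include <assert.h>
#include <stddef.h>

struct mock_region;

struct mock_region_ops {
	int (*init)(struct mock_region *mem);
};

struct mock_region {
	const char *name;
	const struct mock_region_ops *ops;
	int iomap_ready;  /* stands in for the WC io mapping being set up */
	int stolen_ready; /* stands in for the shared stolen bookkeeping */
};

/* Steps that both flavours are assumed to share. */
static int mock_common_stolen_init(struct mock_region *mem)
{
	mem->stolen_ready = 1;
	return 0;
}

/* System stolen: nothing to map, only the common part. */
static int mock_init_stolen_smem(struct mock_region *mem)
{
	return mock_common_stolen_init(mem);
}

/* Local stolen: set up the io mapping first, then the common part. */
static int mock_init_stolen_lmem(struct mock_region *mem)
{
	mem->iomap_ready = 1;
	return mock_common_stolen_init(mem);
}

static const struct mock_region_ops mock_stolen_smem_ops = {
	.init = mock_init_stolen_smem,
};

static const struct mock_region_ops mock_stolen_lmem_ops = {
	.init = mock_init_stolen_lmem,
};

/* Generic probe code only ever calls through the ops table. */
static int mock_region_init(struct mock_region *mem)
{
	assert(mem->ops != NULL && mem->ops->init != NULL);
	return mem->ops->init(mem);
}
```

With two tables, each init function lists exactly what applies to its flavour, at the cost of a second ops struct.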

>   }
>   
>   static void release_stolen(struct intel_memory_region *mem)
> @@ -714,13 +737,65 @@ static const struct intel_memory_region_ops i915_region_stolen_ops = {
>   	.init_object = _i915_gem_object_stolen_init,
>   };
>   
> +static struct intel_memory_region *
> +setup_lmem_stolen(struct drm_i915_private *i915)
> +{
> +	struct intel_uncore *uncore = &i915->uncore;
> +	struct pci_dev *pdev = i915->drm.pdev;
> +	struct intel_memory_region *mem;
> +	resource_size_t io_start;
> +	resource_size_t lmem_size;
> +	u64 lmem_base;
> +
> +	if (!IS_DGFX(i915))
> +		return ERR_PTR(-ENODEV);
> +
> +	lmem_base = intel_uncore_read64(uncore, GEN12_DSMBASE);
> +	lmem_size = pci_resource_len(pdev, 2) - lmem_base;
> +	io_start = pci_resource_start(pdev, 2) + lmem_base;
> +
> +	mem = intel_memory_region_create(i915, lmem_base, lmem_size,
> +					 I915_GTT_PAGE_SIZE_4K, io_start,
> +					 &i915_region_stolen_ops);
> +	if (IS_ERR(mem))
> +		return mem;
> +
> +	drm_dbg(&i915->drm, "Stolen Local memory: %pR\n", &mem->region);
> +	drm_dbg(&i915->drm, "Stolen Local memory IO start: %pa\n",
> +		&mem->io_start);

Could these messages be consolidated with the system stolen ones 
(i915_gem_setup_stolen?), i.e. printed from the common 
i915_gem_stolen_setup based on the memory_region data?

> +
> +	intel_memory_region_set_name(mem, "stolen-local");
> +
> +	return mem;
> +}
> +
> +static struct intel_memory_region*

Space before asterisk.

> +setup_smem_stolen(struct drm_i915_private *i915)
> +{
> +	struct intel_memory_region *mem;
> +
> +	mem = intel_memory_region_create(i915,
> +					 intel_graphics_stolen_res.start,
> +					 resource_size(&intel_graphics_stolen_res),
> +					 PAGE_SIZE, 0,
> +					 &i915_region_stolen_ops);
> +	if (IS_ERR(mem))
> +		return mem;
> +
> +	intel_memory_region_set_name(mem, "stolen-system");

I assume this name, although changed from the current one ("stolen"), is 
not exported anywhere that matters?

> +
> +	return mem;
> +}
> +
>   struct intel_memory_region *i915_gem_stolen_setup(struct drm_i915_private *i915)
>   {
> -	return intel_memory_region_create(i915,
> -					  intel_graphics_stolen_res.start,
> -					  resource_size(&intel_graphics_stolen_res),
> -					  PAGE_SIZE, 0,
> -					  &i915_region_stolen_ops);
> +	struct intel_memory_region *mem;
> +
> +	mem = setup_lmem_stolen(i915);
> +	if (mem == ERR_PTR(-ENODEV))
> +		mem = setup_smem_stolen(i915);
> +
> +	return mem;
>   }
>   
>   struct drm_i915_gem_object *
> @@ -728,7 +803,7 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_i915_private *i915,
>   					       resource_size_t stolen_offset,
>   					       resource_size_t size)
>   {
> -	struct intel_memory_region *mem = i915->mm.regions[INTEL_REGION_STOLEN_SMEM];
> +	struct intel_memory_region *mem = i915_stolen_region(i915);
>   	struct drm_i915_gem_object *obj;
>   	struct drm_mm_node *stolen;
>   	int ret;
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_stolen.h b/drivers/gpu/drm/i915/gem/i915_gem_stolen.h
> index b03489706796..2d1ce7fec61c 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_stolen.h
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_stolen.h
> @@ -22,6 +22,9 @@ int i915_gem_stolen_insert_node_in_range(struct drm_i915_private *dev_priv,
>   void i915_gem_stolen_remove_node(struct drm_i915_private *dev_priv,
>   				 struct drm_mm_node *node);
>   struct intel_memory_region *i915_gem_stolen_setup(struct drm_i915_private *i915);
> +
> +struct intel_memory_region *i915_stolen_region(struct drm_i915_private *i915);
> +
>   struct drm_i915_gem_object *
>   i915_gem_object_create_stolen(struct drm_i915_private *dev_priv,
>   			      resource_size_t size);
> diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
> index 480553746794..53f5d1e6daef 100644
> --- a/drivers/gpu/drm/i915/i915_pci.c
> +++ b/drivers/gpu/drm/i915/i915_pci.c
> @@ -906,7 +906,7 @@ static const struct intel_device_info rkl_info = {
>   
>   #define GEN12_DGFX_FEATURES \
>   	GEN12_FEATURES, \
> -	.memory_regions = REGION_SMEM | REGION_LMEM, \
> +	.memory_regions = REGION_SMEM | REGION_LMEM | REGION_STOLEN_LMEM, \
>   	.has_master_unit_irq = 1, \
>   	.has_llc = 0, \
>   	.has_snoop = 1, \
> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> index e087bcd21911..4108f2a7ebfa 100644
> --- a/drivers/gpu/drm/i915/i915_reg.h
> +++ b/drivers/gpu/drm/i915/i915_reg.h
> @@ -12191,6 +12191,7 @@ enum skl_power_gate {
>   #define GEN12_GLOBAL_MOCS(i)	_MMIO(0x4000 + (i) * 4) /* Global MOCS regs */
>   
>   #define GEN12_GSMBASE			_MMIO(0x108100)
> +#define GEN12_DSMBASE			_MMIO(0x1080C0)
>   
>   /* gamt regs */
>   #define GEN8_L3_LRA_1_GPGPU _MMIO(0x4dd4)
> diff --git a/drivers/gpu/drm/i915/intel_memory_region.c b/drivers/gpu/drm/i915/intel_memory_region.c
> index bf837b6bb185..ac90b76a3fa0 100644
> --- a/drivers/gpu/drm/i915/intel_memory_region.c
> +++ b/drivers/gpu/drm/i915/intel_memory_region.c
> @@ -22,6 +22,10 @@ static const struct {
>   		.class = INTEL_MEMORY_STOLEN_SYSTEM,
>   		.instance = 0,
>   	},
> +	[INTEL_REGION_STOLEN_LMEM] = {
> +		.class = INTEL_MEMORY_STOLEN_LOCAL,
> +		.instance = 0,
> +	},
>   };
>   
>   struct intel_memory_region *
> @@ -278,6 +282,8 @@ int intel_memory_regions_hw_probe(struct drm_i915_private *i915)
>   		case INTEL_MEMORY_SYSTEM:
>   			mem = i915_gem_shmem_setup(i915);
>   			break;
> +		case INTEL_MEMORY_STOLEN_LOCAL:
> +			fallthrough;
>   		case INTEL_MEMORY_STOLEN_SYSTEM:
>   			mem = i915_gem_stolen_setup(i915);
>   			break;
> diff --git a/drivers/gpu/drm/i915/intel_memory_region.h b/drivers/gpu/drm/i915/intel_memory_region.h
> index edd49067c8ca..4c8ec15af55f 100644
> --- a/drivers/gpu/drm/i915/intel_memory_region.h
> +++ b/drivers/gpu/drm/i915/intel_memory_region.h
> @@ -26,18 +26,21 @@ enum intel_memory_type {
>   	INTEL_MEMORY_SYSTEM = 0,
>   	INTEL_MEMORY_LOCAL,
>   	INTEL_MEMORY_STOLEN_SYSTEM,
> +	INTEL_MEMORY_STOLEN_LOCAL,
>   };
>   
>   enum intel_region_id {
>   	INTEL_REGION_SMEM = 0,
>   	INTEL_REGION_LMEM,
>   	INTEL_REGION_STOLEN_SMEM,
> +	INTEL_REGION_STOLEN_LMEM,
>   	INTEL_REGION_UNKNOWN, /* Should be last */
>   };
>   
>   #define REGION_SMEM     BIT(INTEL_REGION_SMEM)
>   #define REGION_LMEM     BIT(INTEL_REGION_LMEM)
>   #define REGION_STOLEN_SMEM   BIT(INTEL_REGION_STOLEN_SMEM)
> +#define REGION_STOLEN_LMEM   BIT(INTEL_REGION_STOLEN_LMEM)
>   
>   #define I915_ALLOC_MIN_PAGE_SIZE  BIT(0)
>   #define I915_ALLOC_CONTIGUOUS     BIT(1)
> @@ -82,7 +85,7 @@ struct intel_memory_region {
>   	u16 type;
>   	u16 instance;
>   	enum intel_region_id id;
> -	char name[8];
> +	char name[16];
>   
>   	struct list_head reserved;
>   
> 

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [Intel-gfx] [PATCH 04/19] drm/i915/stolen: treat stolen local as normal local memory
  2021-04-12  9:05   ` [Intel-gfx] " Matthew Auld
@ 2021-04-14 15:06     ` Tvrtko Ursulin
  -1 siblings, 0 replies; 130+ messages in thread
From: Tvrtko Ursulin @ 2021-04-14 15:06 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx; +Cc: dri-devel


On 12/04/2021 10:05, Matthew Auld wrote:
> Underneath it's the same stuff, so things like the PTE_LM bits for the
> GTT should just keep working as-is.
> 
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> ---
>   drivers/gpu/drm/i915/gem/i915_gem_lmem.c | 5 ++++-
>   1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_lmem.c b/drivers/gpu/drm/i915/gem/i915_gem_lmem.c
> index ce1c83c13d05..017db8f71130 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_lmem.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_lmem.c
> @@ -19,7 +19,10 @@ const struct drm_i915_gem_object_ops i915_gem_lmem_obj_ops = {
>   
>   bool i915_gem_object_is_lmem(struct drm_i915_gem_object *obj)
>   {
> -	return obj->ops == &i915_gem_lmem_obj_ops;
> +	struct intel_memory_region *mr = obj->mm.region;
> +
> +	return mr && (mr->type == INTEL_MEMORY_LOCAL ||
> +		      mr->type == INTEL_MEMORY_STOLEN_LOCAL);
>   }
>   
>   struct drm_i915_gem_object *
> 

Passable I guess. Although there is also i915_gem_object_is_stolen, so it 
is not immediately clear what the semantics of i915_gem_object_is_lmem 
are versus that one. Almost like we need more 
"hierarchy" in region types, or flags of some sort, but I haven't looked 
at the callers to have a good idea what would work best.
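
To illustrate the flags idea, a minimal user-space sketch (the flag names and mock structs are invented for this example and do not exist in the driver):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag bits; the patch itself uses a flat enum instead. */
#define MOCK_MR_LOCAL  (1u << 0) /* backed by device-local memory */
#define MOCK_MR_STOLEN (1u << 1) /* carved out of a stolen range */

struct mock_region {
	unsigned int flags;
};

struct mock_object {
	const struct mock_region *region;
};

/* True for any object whose backing region is device-local. */
static bool mock_object_is_lmem(const struct mock_object *obj)
{
	assert(obj != NULL);
	return obj->region != NULL &&
	       (obj->region->flags & MOCK_MR_LOCAL) != 0;
}

/* True for any object whose backing region is stolen, local or system. */
static bool mock_object_is_stolen(const struct mock_object *obj)
{
	assert(obj != NULL);
	return obj->region != NULL &&
	       (obj->region->flags & MOCK_MR_STOLEN) != 0;
}
```

With this layout a stolen-local object answers true to both queries, which makes the overlap explicit rather than implied by a list of enum values.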

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [Intel-gfx] [PATCH 05/19] drm/i915/stolen: enforce the min_page_size contract
  2021-04-12  9:05   ` [Intel-gfx] " Matthew Auld
@ 2021-04-14 15:07     ` Tvrtko Ursulin
  -1 siblings, 0 replies; 130+ messages in thread
From: Tvrtko Ursulin @ 2021-04-14 15:07 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx; +Cc: dri-devel


On 12/04/2021 10:05, Matthew Auld wrote:
> From: CQ Tang <cq.tang@intel.com>
> 
> Since stolen can now be device local-memory underneath, we should try to
> enforce any min_page_size restrictions when allocating pages.
> 
> Signed-off-by: CQ Tang <cq.tang@intel.com>
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> ---
>   drivers/gpu/drm/i915/gem/i915_gem_stolen.c | 7 ++++---
>   1 file changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_stolen.c b/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
> index 56dd58bef5ee..f713eabb7671 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
> @@ -677,7 +677,8 @@ static int _i915_gem_object_stolen_init(struct intel_memory_region *mem,
>   	if (!stolen)
>   		return -ENOMEM;
>   
> -	ret = i915_gem_stolen_insert_node(i915, stolen, size, 4096);
> +	ret = i915_gem_stolen_insert_node(i915, stolen, size,
> +					  mem->min_page_size);
>   	if (ret)
>   		goto err_free;
>   
> @@ -817,8 +818,8 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_i915_private *i915,
>   
>   	/* KISS and expect everything to be page-aligned */
>   	if (GEM_WARN_ON(size == 0) ||
> -	    GEM_WARN_ON(!IS_ALIGNED(size, I915_GTT_PAGE_SIZE)) ||
> -	    GEM_WARN_ON(!IS_ALIGNED(stolen_offset, I915_GTT_MIN_ALIGNMENT)))
> +	    GEM_WARN_ON(!IS_ALIGNED(size, mem->min_page_size)) ||
> +	    GEM_WARN_ON(!IS_ALIGNED(stolen_offset, mem->min_page_size)))
>   		return ERR_PTR(-EINVAL);
>   
>   	stolen = kzalloc(sizeof(*stolen), GFP_KERNEL);
> 

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [Intel-gfx] [PATCH 06/19] drm/i915/stolen: pass the allocation flags
  2021-04-12  9:05   ` [Intel-gfx] " Matthew Auld
@ 2021-04-14 15:09     ` Tvrtko Ursulin
  -1 siblings, 0 replies; 130+ messages in thread
From: Tvrtko Ursulin @ 2021-04-14 15:09 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx; +Cc: dri-devel


On 12/04/2021 10:05, Matthew Auld wrote:
> From: CQ Tang <cq.tang@intel.com>
> 
> Stolen memory is always allocated as physically contiguous pages, mark
> the object flags as such.
> 
> Signed-off-by: CQ Tang <cq.tang@intel.com>
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> ---
>   drivers/gpu/drm/i915/gem/i915_gem_stolen.c | 10 ++++++----
>   1 file changed, 6 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_stolen.c b/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
> index f713eabb7671..49a2dfcc8ba7 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
> @@ -633,14 +633,15 @@ static const struct drm_i915_gem_object_ops i915_gem_object_stolen_ops = {
>   
>   static int __i915_gem_object_create_stolen(struct intel_memory_region *mem,
>   					   struct drm_i915_gem_object *obj,
> -					   struct drm_mm_node *stolen)
> +					   struct drm_mm_node *stolen,
> +					   unsigned int flags)
>   {
>   	static struct lock_class_key lock_class;
>   	unsigned int cache_level;
>   	int err;
>   
>   	drm_gem_private_object_init(&mem->i915->drm, &obj->base, stolen->size);
> -	i915_gem_object_init(obj, &i915_gem_object_stolen_ops, &lock_class, 0);
> +	i915_gem_object_init(obj, &i915_gem_object_stolen_ops, &lock_class, flags);
>   
>   	obj->stolen = stolen;
>   	obj->read_domains = I915_GEM_DOMAIN_CPU | I915_GEM_DOMAIN_GTT;
> @@ -682,7 +683,7 @@ static int _i915_gem_object_stolen_init(struct intel_memory_region *mem,
>   	if (ret)
>   		goto err_free;
>   
> -	ret = __i915_gem_object_create_stolen(mem, obj, stolen);
> +	ret = __i915_gem_object_create_stolen(mem, obj, stolen, flags);
>   	if (ret)
>   		goto err_remove;
>   
> @@ -840,7 +841,8 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_i915_private *i915,
>   		goto err_stolen;
>   	}
>   
> -	ret = __i915_gem_object_create_stolen(mem, obj, stolen);
> +	ret = __i915_gem_object_create_stolen(mem, obj, stolen,
> +					      I915_BO_ALLOC_CONTIGUOUS);
>   	if (ret)
>   		goto err_object_free;
>   
> 

Are all stolen objects always contiguous, or only the ones allocated by 
i915_gem_object_create_stolen_for_preallocated? If the former, should 
__i915_gem_object_create_stolen just set the flag without the need to 
pass it in?
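
If the answer is that every stolen object is contiguous, something along these lines might do. This is a user-space sketch with mock names, not the driver code, and it assumes no caller needs to pass any other flag:

```c
#include <assert.h>
#include <stddef.h>

/* Mock flag, standing in for I915_BO_ALLOC_CONTIGUOUS. */
#define MOCK_BO_ALLOC_CONTIGUOUS (1u << 0)

struct mock_object {
	unsigned int flags;
};

/*
 * Common creation helper: takes no flags argument and marks every
 * stolen object as contiguous itself.
 */
static void mock_create_stolen(struct mock_object *obj)
{
	assert(obj != NULL);
	obj->flags = MOCK_BO_ALLOC_CONTIGUOUS;
}

/* Region-based allocation path. */
static void mock_stolen_init(struct mock_object *obj)
{
	mock_create_stolen(obj);
}

/* Preallocated (BIOS handover) path. */
static void mock_create_stolen_for_preallocated(struct mock_object *obj)
{
	mock_create_stolen(obj);
}
```

Both callers then end up with the same flags without having to agree on them. If other flags ever need to reach the object, the helper could keep its parameter and OR the contiguous bit in instead.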

Regards,

Tvrtko

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [Intel-gfx] [PATCH 08/19] drm/i915: Return error value when bo not in LMEM for discrete
  2021-04-12  9:05   ` [Intel-gfx] " Matthew Auld
@ 2021-04-14 15:16     ` Tvrtko Ursulin
  -1 siblings, 0 replies; 130+ messages in thread
From: Tvrtko Ursulin @ 2021-04-14 15:16 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx; +Cc: Mohammed Khajapasha, dri-devel


On 12/04/2021 10:05, Matthew Auld wrote:
> From: Mohammed Khajapasha <mohammed.khajapasha@intel.com>
> 
> Return EREMOTE value when frame buffer object is not backed by LMEM
> for discrete. If Local memory is supported by hardware the framebuffer
> backing gem objects should be from local memory.
> 
> Signed-off-by: Mohammed Khajapasha <mohammed.khajapasha@intel.com>
> ---
>   drivers/gpu/drm/i915/display/intel_display.c | 10 ++++++++++
>   1 file changed, 10 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c
> index 411b46c012f8..57b06d8728af 100644
> --- a/drivers/gpu/drm/i915/display/intel_display.c
> +++ b/drivers/gpu/drm/i915/display/intel_display.c
> @@ -63,6 +63,7 @@
>   #include "display/intel_vdsc.h"
>   #include "display/intel_vrr.h"
>   
> +#include "gem/i915_gem_lmem.h"
>   #include "gem/i915_gem_object.h"
>   
>   #include "gt/intel_rps.h"
> @@ -11279,11 +11280,20 @@ intel_user_framebuffer_create(struct drm_device *dev,
>   	struct drm_framebuffer *fb;
>   	struct drm_i915_gem_object *obj;
>   	struct drm_mode_fb_cmd2 mode_cmd = *user_mode_cmd;
> +	struct drm_i915_private *i915;
>   
>   	obj = i915_gem_object_lookup(filp, mode_cmd.handles[0]);
>   	if (!obj)
>   		return ERR_PTR(-ENOENT);
>   
> +	/* object is backed with LMEM for discrete */
> +	i915 = to_i915(obj->base.dev);
> +	if (HAS_LMEM(i915) && !i915_gem_object_is_lmem(obj)) {
> +		/* object is "remote", not in local memory */
> +		i915_gem_object_put(obj);
> +		return ERR_PTR(-EREMOTE);

I am a fan of rich errnos and this one feels appropriately descriptive, 
but please get an ack from Daniel or so.

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko

> +	}
> +
>   	fb = intel_framebuffer_create(obj, &mode_cmd);
>   	i915_gem_object_put(obj);
>   
> 
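
The error path in the hunk above drops the lookup reference before
returning, since the caller only ever sees an error pointer and cannot
release the object itself. A minimal userspace sketch of that pattern
follows. The mock_* names and the simplified error-pointer helpers are
stand-ins invented for illustration; they are not the i915 or kernel
definitions.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for the kernel's ERR_PTR helpers (assumption). */
#define MOCK_MAX_ERRNO 4095
static inline void *mock_err_ptr(long err) { return (void *)(intptr_t)err; }
static inline long mock_ptr_err(const void *p) { return (long)(intptr_t)p; }
static inline bool mock_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MOCK_MAX_ERRNO;
}

/* Hypothetical object: a reference count plus where it is backed. */
struct mock_obj {
	int refcount;
	bool in_lmem;
};

static void mock_obj_put(struct mock_obj *obj)
{
	assert(obj->refcount > 0);
	obj->refcount--;
}

/*
 * Same shape as the check in the patch: on a device with local memory,
 * refuse objects that live elsewhere, and drop the reference taken by
 * the lookup before returning the error pointer.
 */
static void *mock_fb_create(bool has_lmem, struct mock_obj *obj)
{
	if (!obj)
		return mock_err_ptr(-ENOENT);

	obj->refcount++;	/* models the reference taken by the lookup */

	if (has_lmem && !obj->in_lmem) {
		mock_obj_put(obj);
		return mock_err_ptr(-EREMOTE);
	}

	/* In this model the framebuffer is the object and keeps the reference. */
	return obj;
}
```

With this model, a system-memory object passed in with has_lmem set
comes back as an error pointer carrying -EREMOTE, and its reference
count is unchanged from before the call.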
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [Intel-gfx] [PATCH 11/19] drm/i915: Update the helper to set correct mapping
  2021-04-12  9:05   ` [Intel-gfx] " Matthew Auld
@ 2021-04-14 15:22     ` Tvrtko Ursulin
  -1 siblings, 0 replies; 130+ messages in thread
From: Tvrtko Ursulin @ 2021-04-14 15:22 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx; +Cc: dri-devel


On 12/04/2021 10:05, Matthew Auld wrote:
> From: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>
> 
> Determine the coherent map type based on the object's location,
> whether the target has LLC, and whether the caller requires an
> always-coherent mapping.
> 
> Cc: Matthew Auld <matthew.auld@intel.com>
> Cc: CQ Tang <cq.tang@intel.com>
> Suggested-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
> Signed-off-by: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>
> ---
>   drivers/gpu/drm/i915/gt/intel_engine_cs.c    |  3 ++-
>   drivers/gpu/drm/i915/gt/intel_engine_pm.c    |  2 +-
>   drivers/gpu/drm/i915/gt/intel_lrc.c          |  4 +++-
>   drivers/gpu/drm/i915/gt/intel_ring.c         |  9 ++++++---
>   drivers/gpu/drm/i915/gt/selftest_context.c   |  3 ++-
>   drivers/gpu/drm/i915/gt/selftest_hangcheck.c |  4 ++--
>   drivers/gpu/drm/i915/gt/selftest_lrc.c       |  4 +++-
>   drivers/gpu/drm/i915/gt/uc/intel_guc.c       |  4 +++-
>   drivers/gpu/drm/i915/gt/uc/intel_huc.c       |  4 +++-
>   drivers/gpu/drm/i915/i915_drv.h              | 11 +++++++++--
>   drivers/gpu/drm/i915/selftests/igt_spinner.c |  4 ++--
>   11 files changed, 36 insertions(+), 16 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> index efe935f80c1a..b79568d370f5 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> @@ -664,7 +664,8 @@ static int init_status_page(struct intel_engine_cs *engine)
>   	if (ret)
>   		goto err;
>   
> -	vaddr = i915_gem_object_pin_map(obj, I915_MAP_WB);
> +	vaddr = i915_gem_object_pin_map(obj,
> +					i915_coherent_map_type(engine->i915, obj, true));
>   	if (IS_ERR(vaddr)) {
>   		ret = PTR_ERR(vaddr);
>   		goto err_unpin;
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
> index 7c9af86fdb1e..47f4397095e5 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c
> +++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
> @@ -23,7 +23,7 @@ static void dbg_poison_ce(struct intel_context *ce)
>   
>   	if (ce->state) {
>   		struct drm_i915_gem_object *obj = ce->state->obj;
> -		int type = i915_coherent_map_type(ce->engine->i915);
> +		int type = i915_coherent_map_type(ce->engine->i915, obj, true);
>   		void *map;
>   
>   		if (!i915_gem_object_trylock(obj))
> diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
> index e86897cde984..aafe2a4df496 100644
> --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
> @@ -903,7 +903,9 @@ lrc_pre_pin(struct intel_context *ce,
>   	GEM_BUG_ON(!i915_vma_is_pinned(ce->state));
>   
>   	*vaddr = i915_gem_object_pin_map(ce->state->obj,
> -					 i915_coherent_map_type(ce->engine->i915) |
> +					 i915_coherent_map_type(ce->engine->i915,
> +								ce->state->obj,
> +								false) |
>   					 I915_MAP_OVERRIDE);
>   
>   	return PTR_ERR_OR_ZERO(*vaddr);
> diff --git a/drivers/gpu/drm/i915/gt/intel_ring.c b/drivers/gpu/drm/i915/gt/intel_ring.c
> index aee0a77c77e0..3cf6c7e68108 100644
> --- a/drivers/gpu/drm/i915/gt/intel_ring.c
> +++ b/drivers/gpu/drm/i915/gt/intel_ring.c
> @@ -53,9 +53,12 @@ int intel_ring_pin(struct intel_ring *ring, struct i915_gem_ww_ctx *ww)
>   
>   	if (i915_vma_is_map_and_fenceable(vma))
>   		addr = (void __force *)i915_vma_pin_iomap(vma);
> -	else
> -		addr = i915_gem_object_pin_map(vma->obj,
> -					       i915_coherent_map_type(vma->vm->i915));
> +	else {
> +		int type = i915_coherent_map_type(vma->vm->i915, vma->obj, false);
> +
> +		addr = i915_gem_object_pin_map(vma->obj, type);
> +	}
> +
>   	if (IS_ERR(addr)) {
>   		ret = PTR_ERR(addr);
>   		goto err_ring;
> diff --git a/drivers/gpu/drm/i915/gt/selftest_context.c b/drivers/gpu/drm/i915/gt/selftest_context.c
> index b9bdd1d23243..26685b927169 100644
> --- a/drivers/gpu/drm/i915/gt/selftest_context.c
> +++ b/drivers/gpu/drm/i915/gt/selftest_context.c
> @@ -88,7 +88,8 @@ static int __live_context_size(struct intel_engine_cs *engine)
>   		goto err;
>   
>   	vaddr = i915_gem_object_pin_map_unlocked(ce->state->obj,
> -						 i915_coherent_map_type(engine->i915));
> +						 i915_coherent_map_type(engine->i915,
> +									ce->state->obj, false));
>   	if (IS_ERR(vaddr)) {
>   		err = PTR_ERR(vaddr);
>   		intel_context_unpin(ce);
> diff --git a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
> index 746985971c3a..5b63d4df8c93 100644
> --- a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
> +++ b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
> @@ -69,7 +69,7 @@ static int hang_init(struct hang *h, struct intel_gt *gt)
>   	h->seqno = memset(vaddr, 0xff, PAGE_SIZE);
>   
>   	vaddr = i915_gem_object_pin_map_unlocked(h->obj,
> -						 i915_coherent_map_type(gt->i915));
> +						 i915_coherent_map_type(gt->i915, h->obj, false));
>   	if (IS_ERR(vaddr)) {
>   		err = PTR_ERR(vaddr);
>   		goto err_unpin_hws;
> @@ -130,7 +130,7 @@ hang_create_request(struct hang *h, struct intel_engine_cs *engine)
>   		return ERR_CAST(obj);
>   	}
>   
> -	vaddr = i915_gem_object_pin_map_unlocked(obj, i915_coherent_map_type(gt->i915));
> +	vaddr = i915_gem_object_pin_map_unlocked(obj, i915_coherent_map_type(gt->i915, obj, false));
>   	if (IS_ERR(vaddr)) {
>   		i915_gem_object_put(obj);
>   		i915_vm_put(vm);
> diff --git a/drivers/gpu/drm/i915/gt/selftest_lrc.c b/drivers/gpu/drm/i915/gt/selftest_lrc.c
> index 85e7df6a5123..d8f6623524e8 100644
> --- a/drivers/gpu/drm/i915/gt/selftest_lrc.c
> +++ b/drivers/gpu/drm/i915/gt/selftest_lrc.c
> @@ -1221,7 +1221,9 @@ static int compare_isolation(struct intel_engine_cs *engine,
>   	}
>   
>   	lrc = i915_gem_object_pin_map_unlocked(ce->state->obj,
> -				      i915_coherent_map_type(engine->i915));
> +					       i915_coherent_map_type(engine->i915,
> +								      ce->state->obj,
> +								      false));
>   	if (IS_ERR(lrc)) {
>   		err = PTR_ERR(lrc);
>   		goto err_B1;
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.c b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
> index 78305b2ec89d..adae04c47aab 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
> @@ -682,7 +682,9 @@ int intel_guc_allocate_and_map_vma(struct intel_guc *guc, u32 size,
>   	if (IS_ERR(vma))
>   		return PTR_ERR(vma);
>   
> -	vaddr = i915_gem_object_pin_map_unlocked(vma->obj, I915_MAP_WB);
> +	vaddr = i915_gem_object_pin_map_unlocked(vma->obj,
> +						 i915_coherent_map_type(guc_to_gt(guc)->i915,
> +									vma->obj, true));
>   	if (IS_ERR(vaddr)) {
>   		i915_vma_unpin_and_release(&vma, 0);
>   		return PTR_ERR(vaddr);
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_huc.c b/drivers/gpu/drm/i915/gt/uc/intel_huc.c
> index 2126dd81ac38..56d2144dc6a0 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_huc.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_huc.c
> @@ -82,7 +82,9 @@ static int intel_huc_rsa_data_create(struct intel_huc *huc)
>   	if (IS_ERR(vma))
>   		return PTR_ERR(vma);
>   
> -	vaddr = i915_gem_object_pin_map_unlocked(vma->obj, I915_MAP_WB);
> +	vaddr = i915_gem_object_pin_map_unlocked(vma->obj,
> +						 i915_coherent_map_type(gt->i915,
> +									vma->obj, true));
>   	if (IS_ERR(vaddr)) {
>   		i915_vma_unpin_and_release(&vma, 0);
>   		return PTR_ERR(vaddr);
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 69e43bf91a15..2abbc06712a4 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -78,6 +78,7 @@
>   #include "gem/i915_gem_context_types.h"
>   #include "gem/i915_gem_shrinker.h"
>   #include "gem/i915_gem_stolen.h"
> +#include "gem/i915_gem_lmem.h"
>   
>   #include "gt/intel_engine.h"
>   #include "gt/intel_gt_types.h"
> @@ -1921,9 +1922,15 @@ static inline int intel_hws_csb_write_index(struct drm_i915_private *i915)
>   }
>   
>   static inline enum i915_map_type
> -i915_coherent_map_type(struct drm_i915_private *i915)
> +i915_coherent_map_type(struct drm_i915_private *i915,
> +		       struct drm_i915_gem_object *obj, bool always_coherent)
>   {
> -	return HAS_LLC(i915) ? I915_MAP_WB : I915_MAP_WC;
> +	if (i915_gem_object_is_lmem(obj))
> +		return I915_MAP_WC;
> +	if (HAS_LLC(i915) || always_coherent)
> +		return I915_MAP_WB;
> +	else
> +		return I915_MAP_WC;

This patch seems to be doing two things.

First, it adds lmem support to this helper by always returning WC for
lmem objects.

Second, it introduces the idea of "always coherent" in a helper called
i915_coherent_map_type. Could someone explain the difference between
coherent and always coherent?

Also, why is always coherent satisfied with WB? That sounds
counterintuitive to me.

Regards,

Tvrtko

>   }
>   
>   #endif
> diff --git a/drivers/gpu/drm/i915/selftests/igt_spinner.c b/drivers/gpu/drm/i915/selftests/igt_spinner.c
> index cfbbe415b57c..5fe397b7d1d9 100644
> --- a/drivers/gpu/drm/i915/selftests/igt_spinner.c
> +++ b/drivers/gpu/drm/i915/selftests/igt_spinner.c
> @@ -94,9 +94,9 @@ int igt_spinner_pin(struct igt_spinner *spin,
>   	}
>   
>   	if (!spin->batch) {
> -		unsigned int mode =
> -			i915_coherent_map_type(spin->gt->i915);
> +		unsigned int mode;
>   
> +		mode = i915_coherent_map_type(spin->gt->i915, spin->obj, false);
>   		vaddr = igt_spinner_pin_obj(ce, ww, spin->obj, mode, &spin->batch_vma);
>   		if (IS_ERR(vaddr))
>   			return PTR_ERR(vaddr);
> 
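
Read literally, the patched helper is a three-input decision: lmem
placement, LLC presence, and the new always_coherent argument. The
sketch below reduces those inputs to plain booleans so the whole truth
table can be checked. The mock_* names and enum values are invented for
illustration; only the branch order is taken from the hunk above.

```c
#include <assert.h>	/* only needed for the checks in the usage note */
#include <stdbool.h>

/* Local stand-in for the driver's map-type enum (values are arbitrary). */
enum mock_map_type { MOCK_MAP_WB, MOCK_MAP_WC };

/* Same branch order as the patched helper, with booleans as inputs. */
static enum mock_map_type
mock_coherent_map_type(bool is_lmem, bool has_llc, bool always_coherent)
{
	if (is_lmem)
		return MOCK_MAP_WC;	/* lmem overrides both other inputs */
	if (has_llc || always_coherent)
		return MOCK_MAP_WB;
	return MOCK_MAP_WC;
}

/* Count how many of the eight input combinations yield WB. */
static int mock_count_wb(void)
{
	int n = 0;

	for (int i = 0; i < 8; i++)
		if (mock_coherent_map_type(i & 4, i & 2, i & 1) == MOCK_MAP_WB)
			n++;
	return n;
}
```

In this model always_coherent changes the result in exactly one case:
a non-lmem object on a platform without LLC, where it selects WB
instead of WC. For example,
assert(mock_coherent_map_type(false, false, true) == MOCK_MAP_WB) holds,
while the same call with is_lmem set returns MOCK_MAP_WC.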

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [Intel-gfx] [PATCH 12/19] drm/i915/lmem: Bypass aperture when lmem is available
  2021-04-12  9:05   ` [Intel-gfx] " Matthew Auld
@ 2021-04-14 15:33     ` Tvrtko Ursulin
  -1 siblings, 0 replies; 130+ messages in thread
From: Tvrtko Ursulin @ 2021-04-14 15:33 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx
  Cc: Daniel Vetter, dri-devel, Chris P Wilson, Dhinakaran Pandiyan


On 12/04/2021 10:05, Matthew Auld wrote:
> From: Anusha Srivatsa <anusha.srivatsa@intel.com>
> 
> In the scenario where local memory is available, we have to
> rely on CPU access via lmem directly instead of the aperture.
> 
> v2:
> gmch is only relevant for much older hw, therefore we can drop the
> has_aperture check since it should always be present on such platforms.
> (Chris)
> 
> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
> Cc: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> Cc: Chris P Wilson <chris.p.wilson@intel.com>
> Cc: Daniel Vetter <daniel.vetter@intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> Cc: CQ Tang <cq.tang@intel.com>
> Signed-off-by: Anusha Srivatsa <anusha.srivatsa@intel.com>
> ---
>   drivers/gpu/drm/i915/display/intel_fbdev.c | 22 +++++++++++++++-------
>   drivers/gpu/drm/i915/gem/i915_gem_lmem.c   | 15 +++++++++++++++
>   drivers/gpu/drm/i915/gem/i915_gem_lmem.h   |  5 +++++
>   drivers/gpu/drm/i915/i915_vma.c            | 19 +++++++++++++------
>   4 files changed, 48 insertions(+), 13 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/display/intel_fbdev.c b/drivers/gpu/drm/i915/display/intel_fbdev.c
> index 2b37959da747..4af40229f5ec 100644
> --- a/drivers/gpu/drm/i915/display/intel_fbdev.c
> +++ b/drivers/gpu/drm/i915/display/intel_fbdev.c
> @@ -139,14 +139,22 @@ static int intelfb_alloc(struct drm_fb_helper *helper,
>   	size = mode_cmd.pitches[0] * mode_cmd.height;
>   	size = PAGE_ALIGN(size);
>   
> -	/* If the FB is too big, just don't use it since fbdev is not very
> -	 * important and we should probably use that space with FBC or other
> -	 * features. */
>   	obj = ERR_PTR(-ENODEV);
> -	if (size * 2 < dev_priv->stolen_usable_size)
> -		obj = i915_gem_object_create_stolen(dev_priv, size);
> -	if (IS_ERR(obj))
> -		obj = i915_gem_object_create_shmem(dev_priv, size);
> +	if (HAS_LMEM(dev_priv)) {
> +		obj = i915_gem_object_create_lmem(dev_priv, size,
> +						  I915_BO_ALLOC_CONTIGUOUS);

Has to be contiguous? Question for display experts I guess.

[Comes back later.] Ah for iomap? Put a comment to that effect perhaps?

> +	} else {
> +		/*
> +		 * If the FB is too big, just don't use it since fbdev is not very
> +		 * important and we should probably use that space with FBC or other
> +		 * features.
> +		 */
> +		if (size * 2 < dev_priv->stolen_usable_size)
> +			obj = i915_gem_object_create_stolen(dev_priv, size);
> +		if (IS_ERR(obj))
> +			obj = i915_gem_object_create_shmem(dev_priv, size);
> +	}

Could we keep the IS_ERR-ordered allocation chain to avoid having to
re-indent? Bikeshed, so optional.
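
One possible reading of a flat, IS_ERR-ordered chain is sketched below
as a userspace model. Everything named mock_* is hypothetical, and the
allocators are reduced to flags that say whether they succeed. The one
property carried over from the patch is that a failed lmem allocation
does not fall back to shmem.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for the kernel's ERR_PTR helpers (assumption). */
static inline void *mock_err_ptr(long err) { return (void *)(intptr_t)err; }
static inline bool mock_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-4095;
}

/* Hypothetical device: each *_ok flag says whether that allocator succeeds. */
struct mock_dev {
	bool has_lmem;
	size_t stolen_usable_size;
	bool lmem_ok, stolen_ok, shmem_ok;
};

/* Distinct addresses identify which backing store was chosen. */
static char lmem_tag, stolen_tag, shmem_tag;

static void *mock_create(bool ok, char *tag)
{
	return ok ? (void *)tag : mock_err_ptr(-ENOMEM);
}

static void *mock_fbdev_alloc(const struct mock_dev *dev, size_t size)
{
	void *obj = mock_err_ptr(-ENODEV);

	if (dev->has_lmem)
		obj = mock_create(dev->lmem_ok, &lmem_tag);
	else if (size * 2 < dev->stolen_usable_size)
		obj = mock_create(dev->stolen_ok, &stolen_tag);
	/* As in the patch, lmem failure is final; only non-lmem falls back. */
	if (mock_is_err(obj) && !dev->has_lmem)
		obj = mock_create(dev->shmem_ok, &shmem_tag);

	return obj;
}
```

The extra !has_lmem test on the fallback is the cost of flattening: the
nested form in the patch gets the same behaviour from its structure
alone.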

> +
>   	if (IS_ERR(obj)) {
>   		drm_err(&dev_priv->drm, "failed to allocate framebuffer\n");
>   		return PTR_ERR(obj);
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_lmem.c b/drivers/gpu/drm/i915/gem/i915_gem_lmem.c
> index 017db8f71130..f44bdd08f7cb 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_lmem.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_lmem.c
> @@ -17,6 +17,21 @@ const struct drm_i915_gem_object_ops i915_gem_lmem_obj_ops = {
>   	.release = i915_gem_object_release_memory_region,
>   };
>   
> +void __iomem *
> +i915_gem_object_lmem_io_map(struct drm_i915_gem_object *obj,
> +			    unsigned long n,
> +			    unsigned long size)
> +{
> +	resource_size_t offset;
> +
> +	GEM_BUG_ON(!i915_gem_object_is_contiguous(obj));
> +
> +	offset = i915_gem_object_get_dma_address(obj, n);
> +	offset -= obj->mm.region->region.start;
> +
> +	return io_mapping_map_wc(&obj->mm.region->iomap, offset, size);
> +}
> +
>   bool i915_gem_object_is_lmem(struct drm_i915_gem_object *obj)
>   {
>   	struct intel_memory_region *mr = obj->mm.region;
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_lmem.h b/drivers/gpu/drm/i915/gem/i915_gem_lmem.h
> index 036d53c01de9..fac6bc5a5ebb 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_lmem.h
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_lmem.h
> @@ -14,6 +14,11 @@ struct intel_memory_region;
>   
>   extern const struct drm_i915_gem_object_ops i915_gem_lmem_obj_ops;
>   
> +void __iomem *
> +i915_gem_object_lmem_io_map(struct drm_i915_gem_object *obj,
> +			    unsigned long n,
> +			    unsigned long size);
> +
>   bool i915_gem_object_is_lmem(struct drm_i915_gem_object *obj);
>   
>   struct drm_i915_gem_object *
> diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
> index 07490db51cdc..e24d33aecac4 100644
> --- a/drivers/gpu/drm/i915/i915_vma.c
> +++ b/drivers/gpu/drm/i915/i915_vma.c
> @@ -27,6 +27,7 @@
>   
>   #include "display/intel_frontbuffer.h"
>   
> +#include "gem/i915_gem_lmem.h"
>   #include "gt/intel_engine.h"
>   #include "gt/intel_engine_heartbeat.h"
>   #include "gt/intel_gt.h"
> @@ -448,9 +449,11 @@ void __iomem *i915_vma_pin_iomap(struct i915_vma *vma)
>   	void __iomem *ptr;
>   	int err;
>   
> -	if (GEM_WARN_ON(!i915_vma_is_map_and_fenceable(vma))) {
> -		err = -ENODEV;
> -		goto err;
> +	if (!i915_gem_object_is_lmem(vma->obj)) {
> +		if (GEM_WARN_ON(!i915_vma_is_map_and_fenceable(vma))) {
> +			err = -ENODEV;
> +			goto err;
> +		}
>   	}
>   
>   	GEM_BUG_ON(!i915_vma_is_ggtt(vma));
> @@ -458,9 +461,13 @@ void __iomem *i915_vma_pin_iomap(struct i915_vma *vma)
>   
>   	ptr = READ_ONCE(vma->iomap);
>   	if (ptr == NULL) {
> -		ptr = io_mapping_map_wc(&i915_vm_to_ggtt(vma->vm)->iomap,
> -					vma->node.start,
> -					vma->node.size);
> +		if (i915_gem_object_is_lmem(vma->obj))
> +			ptr = i915_gem_object_lmem_io_map(vma->obj, 0,
> +							  vma->obj->base.size);

Can the vma size be bigger than the object here? I ask given that the
branch below works off vma->node.size.

> +		else
> +			ptr = io_mapping_map_wc(&i915_vm_to_ggtt(vma->vm)->iomap,
> +						vma->node.start,
> +						vma->node.size);

Looks a bit odd that this calls the same io_mapping_map_wc as 
i915_gem_object_lmem_io_map ends up doing. Perhaps that suggests there 
should be a single helper here but I am not sure what would be elegant.

Regards,

Tvrtko

>   		if (ptr == NULL) {
>   			err = -ENOMEM;
>   			goto err;
> 
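
The new i915_gem_object_lmem_io_map helper quoted above turns a DMA
address into an offset from the start of the memory region and maps
size bytes from there, which only covers the whole object if its pages
are physically contiguous. The arithmetic can be sketched in userspace
as below. The struct and function names are hypothetical, and the range
check is an addition for illustration that the patch does not contain.

```c
#include <assert.h>	/* only needed for the checks in the usage note */
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical region: a window of device-local memory in DMA address space. */
struct mock_region {
	uint64_t start;
	uint64_t size;
};

/*
 * Translate a DMA address inside the region into an offset from the
 * start of the region, as a mapping over the region would expect.
 * Returns false if [dma_addr, dma_addr + len) does not fit in the region.
 */
static bool mock_region_offset(const struct mock_region *mr,
			       uint64_t dma_addr, uint64_t len,
			       uint64_t *offset)
{
	if (dma_addr < mr->start)
		return false;
	/* Written to avoid overflow in dma_addr + len. */
	if (len > mr->size || dma_addr - mr->start > mr->size - len)
		return false;

	*offset = dma_addr - mr->start;
	return true;
}
```

For a region starting at 0x100000000 with size 0x10000, a DMA address
of 0x100002000 gives offset 0x2000, and a 0x2000-byte request starting
at 0x10000f000 is rejected because it would run past the end of the
region.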

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [Intel-gfx] [PATCH 12/19] drm/i915/lmem: Bypass aperture when lmem is available
@ 2021-04-14 15:33     ` Tvrtko Ursulin
  0 siblings, 0 replies; 130+ messages in thread
From: Tvrtko Ursulin @ 2021-04-14 15:33 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx
  Cc: Daniel Vetter, dri-devel, Chris P Wilson, Dhinakaran Pandiyan


On 12/04/2021 10:05, Matthew Auld wrote:
> From: Anusha Srivatsa <anusha.srivatsa@intel.com>
> 
> In the scenario where local memory is available, we have to
> rely on CPU access via lmem directly instead of the aperture.
> 
> v2:
> gmch is only relevant for much older hw, therefore we can drop the
> has_aperture check since it should always be present on such platforms.
> (Chris)
> 
> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
> Cc: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> Cc: Chris P Wilson <chris.p.wilson@intel.com>
> Cc: Daniel Vetter <daniel.vetter@intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> Cc: CQ Tang <cq.tang@intel.com>
> Signed-off-by: Anusha Srivatsa <anusha.srivatsa@intel.com>
> ---
>   drivers/gpu/drm/i915/display/intel_fbdev.c | 22 +++++++++++++++-------
>   drivers/gpu/drm/i915/gem/i915_gem_lmem.c   | 15 +++++++++++++++
>   drivers/gpu/drm/i915/gem/i915_gem_lmem.h   |  5 +++++
>   drivers/gpu/drm/i915/i915_vma.c            | 19 +++++++++++++------
>   4 files changed, 48 insertions(+), 13 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/display/intel_fbdev.c b/drivers/gpu/drm/i915/display/intel_fbdev.c
> index 2b37959da747..4af40229f5ec 100644
> --- a/drivers/gpu/drm/i915/display/intel_fbdev.c
> +++ b/drivers/gpu/drm/i915/display/intel_fbdev.c
> @@ -139,14 +139,22 @@ static int intelfb_alloc(struct drm_fb_helper *helper,
>   	size = mode_cmd.pitches[0] * mode_cmd.height;
>   	size = PAGE_ALIGN(size);
>   
> -	/* If the FB is too big, just don't use it since fbdev is not very
> -	 * important and we should probably use that space with FBC or other
> -	 * features. */
>   	obj = ERR_PTR(-ENODEV);
> -	if (size * 2 < dev_priv->stolen_usable_size)
> -		obj = i915_gem_object_create_stolen(dev_priv, size);
> -	if (IS_ERR(obj))
> -		obj = i915_gem_object_create_shmem(dev_priv, size);
> +	if (HAS_LMEM(dev_priv)) {
> +		obj = i915_gem_object_create_lmem(dev_priv, size,
> +						  I915_BO_ALLOC_CONTIGUOUS);

Has to be contiguous? Question for display experts I guess.

[Comes back later.] Ah for iomap? Put a comment to that effect perhaps?
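
For reference, a rough userspace sketch of why the iomap path would want a contiguous object, assuming (as the quoted i915_gem_object_lmem_io_map() suggests) that one linear window is mapped starting at the DMA address of page n minus the region start. The `io_region` and `io_obj` structs and the `io_*` functions are hypothetical stand-ins, not i915 types.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define IO_PAGE_SIZE 4096u

/* Hypothetical stand-in for a memory region: only its start address matters here. */
struct io_region {
	uint64_t start;
};

/* Hypothetical stand-in for an object: one device address per backing page. */
struct io_obj {
	const struct io_region *region;
	const uint64_t *page_addr;
	size_t npages;
};

/* True when every page directly follows the previous one in device memory. */
static bool io_obj_is_contiguous(const struct io_obj *obj)
{
	for (size_t i = 1; i < obj->npages; i++)
		if (obj->page_addr[i] != obj->page_addr[i - 1] + IO_PAGE_SIZE)
			return false;
	return true;
}

/*
 * Offset of page n inside the region's I/O window. A single linear mapping
 * that starts here only covers the rest of the object if the pages are
 * contiguous, which is what the second assert models.
 */
static uint64_t io_lmem_offset(const struct io_obj *obj, size_t n)
{
	assert(n < obj->npages);
	assert(io_obj_is_contiguous(obj));
	return obj->page_addr[n] - obj->region->start;
}
```

A scattered object would need one mapping per page (or per contiguous run) instead, which this sketch does not attempt.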

> +	} else {
> +		/*
> +		 * If the FB is too big, just don't use it since fbdev is not very
> +		 * important and we should probably use that space with FBC or other
> +		 * features.
> +		 */
> +		if (size * 2 < dev_priv->stolen_usable_size)
> +			obj = i915_gem_object_create_stolen(dev_priv, size);
> +		if (IS_ERR(obj))
> +			obj = i915_gem_object_create_shmem(dev_priv, size);
> +	}

Could we keep the IS_ERR-ordered allocation chain to avoid having to
re-indent? Bikeshed, so optional.
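
One reading of that suggestion, sketched as a userspace model rather than a patch. The `fb_*` names, the error-pointer helpers and the boolean knobs are hypothetical stand-ins for the i915 allocators; only the flat control flow is the point. The sketch assumes, like the quoted hunk, that a device with lmem never falls back to shmem.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Minimal userspace equivalents of the kernel's error-pointer helpers. */
#define FB_MAX_ERRNO 4095
static inline void *fb_err_ptr(long err) { return (void *)(intptr_t)err; }
static inline bool fb_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-FB_MAX_ERRNO;
}

enum fb_backing { FB_LMEM, FB_STOLEN, FB_SHMEM };

struct fb_obj { enum fb_backing backing; };

/* Hypothetical device description driving the choice. */
struct fb_dev {
	bool has_lmem;
	size_t stolen_usable_size;
	bool stolen_alloc_fails; /* lets a caller force the shmem fallback */
};

static struct fb_obj fb_objs[3] = { { FB_LMEM }, { FB_STOLEN }, { FB_SHMEM } };

static struct fb_obj *fb_create_stolen(const struct fb_dev *dev)
{
	if (dev->stolen_alloc_fails)
		return fb_err_ptr(-ENOMEM);
	return &fb_objs[FB_STOLEN];
}

/* Flat, IS_ERR-ordered chain: each step only runs if nothing succeeded yet. */
static struct fb_obj *fb_alloc(const struct fb_dev *dev, size_t size)
{
	struct fb_obj *obj = fb_err_ptr(-ENODEV);

	assert(dev != NULL);

	if (dev->has_lmem)
		obj = &fb_objs[FB_LMEM];
	else if (size * 2 < dev->stolen_usable_size)
		obj = fb_create_stolen(dev);
	if (fb_is_err(obj) && !dev->has_lmem)
		obj = &fb_objs[FB_SHMEM];
	return obj;
}
```

The existing comment about oversized framebuffers would sit above the `else if` in this layout.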

> +
>   	if (IS_ERR(obj)) {
>   		drm_err(&dev_priv->drm, "failed to allocate framebuffer\n");
>   		return PTR_ERR(obj);
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_lmem.c b/drivers/gpu/drm/i915/gem/i915_gem_lmem.c
> index 017db8f71130..f44bdd08f7cb 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_lmem.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_lmem.c
> @@ -17,6 +17,21 @@ const struct drm_i915_gem_object_ops i915_gem_lmem_obj_ops = {
>   	.release = i915_gem_object_release_memory_region,
>   };
>   
> +void __iomem *
> +i915_gem_object_lmem_io_map(struct drm_i915_gem_object *obj,
> +			    unsigned long n,
> +			    unsigned long size)
> +{
> +	resource_size_t offset;
> +
> +	GEM_BUG_ON(!i915_gem_object_is_contiguous(obj));
> +
> +	offset = i915_gem_object_get_dma_address(obj, n);
> +	offset -= obj->mm.region->region.start;
> +
> +	return io_mapping_map_wc(&obj->mm.region->iomap, offset, size);
> +}
> +
>   bool i915_gem_object_is_lmem(struct drm_i915_gem_object *obj)
>   {
>   	struct intel_memory_region *mr = obj->mm.region;
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_lmem.h b/drivers/gpu/drm/i915/gem/i915_gem_lmem.h
> index 036d53c01de9..fac6bc5a5ebb 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_lmem.h
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_lmem.h
> @@ -14,6 +14,11 @@ struct intel_memory_region;
>   
>   extern const struct drm_i915_gem_object_ops i915_gem_lmem_obj_ops;
>   
> +void __iomem *
> +i915_gem_object_lmem_io_map(struct drm_i915_gem_object *obj,
> +			    unsigned long n,
> +			    unsigned long size);
> +
>   bool i915_gem_object_is_lmem(struct drm_i915_gem_object *obj);
>   
>   struct drm_i915_gem_object *
> diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
> index 07490db51cdc..e24d33aecac4 100644
> --- a/drivers/gpu/drm/i915/i915_vma.c
> +++ b/drivers/gpu/drm/i915/i915_vma.c
> @@ -27,6 +27,7 @@
>   
>   #include "display/intel_frontbuffer.h"
>   
> +#include "gem/i915_gem_lmem.h"
>   #include "gt/intel_engine.h"
>   #include "gt/intel_engine_heartbeat.h"
>   #include "gt/intel_gt.h"
> @@ -448,9 +449,11 @@ void __iomem *i915_vma_pin_iomap(struct i915_vma *vma)
>   	void __iomem *ptr;
>   	int err;
>   
> -	if (GEM_WARN_ON(!i915_vma_is_map_and_fenceable(vma))) {
> -		err = -ENODEV;
> -		goto err;
> +	if (!i915_gem_object_is_lmem(vma->obj)) {
> +		if (GEM_WARN_ON(!i915_vma_is_map_and_fenceable(vma))) {
> +			err = -ENODEV;
> +			goto err;
> +		}
>   	}
>   
>   	GEM_BUG_ON(!i915_vma_is_ggtt(vma));
> @@ -458,9 +461,13 @@ void __iomem *i915_vma_pin_iomap(struct i915_vma *vma)
>   
>   	ptr = READ_ONCE(vma->iomap);
>   	if (ptr == NULL) {
> -		ptr = io_mapping_map_wc(&i915_vm_to_ggtt(vma->vm)->iomap,
> -					vma->node.start,
> -					vma->node.size);
> +		if (i915_gem_object_is_lmem(vma->obj))
> +			ptr = i915_gem_object_lmem_io_map(vma->obj, 0,
> +							  vma->obj->base.size);

Can the vma size be bigger than the object here? Asking because the
branch below works off vma->node.size.

> +		else
> +			ptr = io_mapping_map_wc(&i915_vm_to_ggtt(vma->vm)->iomap,
> +						vma->node.start,
> +						vma->node.size);

Looks a bit odd that this calls the same io_mapping_map_wc as 
i915_gem_object_lmem_io_map ends up doing. Perhaps that suggests there 
should be a single helper here but I am not sure what would be elegant.
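
One possible shape for such a helper, sketched as a userspace model. It only computes which window, offset and size would be handed to io_mapping_map_wc(); `iomap_pick_range()` and the structs are hypothetical names, and the fields are a flattened view of what the quoted hunk reads. The sketch also makes the size question above visible: the lmem branch uses the object size, the GGTT branch the node size.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum iomap_window { IOMAP_WINDOW_GGTT, IOMAP_WINDOW_LMEM };

/* Hypothetical, flattened view of the fields the quoted hunk reads. */
struct iomap_vma {
	bool obj_is_lmem;
	uint64_t obj_size;      /* vma->obj->base.size */
	uint64_t obj_dma_addr;  /* DMA address of the object's first page */
	uint64_t region_start;  /* obj->mm.region->region.start */
	uint64_t node_start;    /* vma->node.start */
	uint64_t node_size;     /* vma->node.size */
};

struct iomap_range {
	enum iomap_window window;
	uint64_t offset;
	uint64_t size;
};

/* Pick the arguments for one write-combined mapping call, whatever the placement. */
static struct iomap_range iomap_pick_range(const struct iomap_vma *vma)
{
	struct iomap_range r;

	if (vma->obj_is_lmem) {
		assert(vma->obj_dma_addr >= vma->region_start);
		r.window = IOMAP_WINDOW_LMEM;
		r.offset = vma->obj_dma_addr - vma->region_start;
		r.size = vma->obj_size;
	} else {
		r.window = IOMAP_WINDOW_GGTT;
		r.offset = vma->node_start;
		r.size = vma->node_size;
	}
	return r;
}
```

With the range picked in one place the caller would need a single mapping call; whether that reads better than the open-coded branch is a judgment call.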

Regards,

Tvrtko

>   		if (ptr == NULL) {
>   			err = -ENOMEM;
>   			goto err;
> 
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [Intel-gfx] [PATCH 19/19] drm/i915/gtt/dgfx: place the PD in LMEM
  2021-04-12  9:05   ` [Intel-gfx] " Matthew Auld
@ 2021-04-14 15:37     ` Tvrtko Ursulin
  -1 siblings, 0 replies; 130+ messages in thread
From: Tvrtko Ursulin @ 2021-04-14 15:37 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx; +Cc: dri-devel


On 12/04/2021 10:05, Matthew Auld wrote:
> It's a requirement that for dgfx we place all the paging structures in
> device local-memory.
> 
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> ---
>   drivers/gpu/drm/i915/gt/gen8_ppgtt.c |  5 ++++-
>   drivers/gpu/drm/i915/gt/intel_gtt.c  | 27 +++++++++++++++++++++++++--
>   drivers/gpu/drm/i915/gt/intel_gtt.h  |  1 +
>   3 files changed, 30 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
> index f83496836f0f..11fb5df45a0f 100644
> --- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
> +++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
> @@ -712,7 +712,10 @@ struct i915_ppgtt *gen8_ppgtt_create(struct intel_gt *gt)
>   	 */
>   	ppgtt->vm.has_read_only = !IS_GEN_RANGE(gt->i915, 11, 12);
>   
> -	ppgtt->vm.alloc_pt_dma = alloc_pt_dma;
> +	if (HAS_LMEM(gt->i915))
> +		ppgtt->vm.alloc_pt_dma = alloc_pt_lmem;
> +	else
> +		ppgtt->vm.alloc_pt_dma = alloc_pt_dma;
>   
>   	err = gen8_init_scratch(&ppgtt->vm);
>   	if (err)
> diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.c b/drivers/gpu/drm/i915/gt/intel_gtt.c
> index d386b89e2758..1eeeab45445c 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gtt.c
> +++ b/drivers/gpu/drm/i915/gt/intel_gtt.c
> @@ -7,10 +7,23 @@
>   
>   #include <linux/fault-inject.h>
>   
> +#include "gem/i915_gem_lmem.h"
>   #include "i915_trace.h"
>   #include "intel_gt.h"
>   #include "intel_gtt.h"
>   
> +struct drm_i915_gem_object *alloc_pt_lmem(struct i915_address_space *vm, int sz)
> +{
> +	struct drm_i915_gem_object *obj;
> +
> +	obj = i915_gem_object_create_lmem(vm->i915, sz, 0);
> +
> +	/* ensure all dma objects have the same reservation class */
> +	if (!IS_ERR(obj))
> +		obj->base.resv = &vm->resv;
> +	return obj;
> +}
> +
>   struct drm_i915_gem_object *alloc_pt_dma(struct i915_address_space *vm, int sz)
>   {
>   	struct drm_i915_gem_object *obj;
> @@ -27,9 +40,14 @@ struct drm_i915_gem_object *alloc_pt_dma(struct i915_address_space *vm, int sz)
>   
>   int map_pt_dma(struct i915_address_space *vm, struct drm_i915_gem_object *obj)
>   {
> +	enum i915_map_type type;
>   	void *vaddr;
>   
> -	vaddr = i915_gem_object_pin_map_unlocked(obj, I915_MAP_WB);
> +	type = I915_MAP_WB;
> +	if (i915_gem_object_is_lmem(obj))
> +		type = I915_MAP_WC;

Not trusting the "always coherent" helper from earlier in the series?
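
To make the question concrete, a small userspace model comparing the open-coded choice in this hunk with the three-argument helper from patch 11/19 called with always_coherent=true. The `pt_*` names and boolean parameters are hypothetical stand-ins; the helper logic follows the i915_coherent_map_type() diff in that patch.

```c
#include <assert.h>
#include <stdbool.h>

enum pt_map_type { PT_MAP_WB, PT_MAP_WC };

/* Open-coded choice from the map_pt_dma() hunk: WB unless the object is in lmem. */
static enum pt_map_type pt_map_type_open_coded(bool obj_is_lmem)
{
	enum pt_map_type type = PT_MAP_WB;

	if (obj_is_lmem)
		type = PT_MAP_WC;
	return type;
}

/* Model of the three-argument helper from patch 11/19. */
static enum pt_map_type
pt_coherent_map_type(bool has_llc, bool obj_is_lmem, bool always_coherent)
{
	if (obj_is_lmem)
		return PT_MAP_WC;
	if (has_llc || always_coherent)
		return PT_MAP_WB;
	return PT_MAP_WC;
}

/* Check every combination of LLC and placement with always_coherent=true. */
static void pt_check_equivalence(void)
{
	for (int llc = 0; llc <= 1; llc++)
		for (int lmem = 0; lmem <= 1; lmem++)
			assert(pt_map_type_open_coded(lmem) ==
			       pt_coherent_map_type(llc, lmem, true));
}
```

In this model the two agree for every input, so the branch could in principle become a helper call; whether that is wanted for page tables is the author's call.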

Regards,

Tvrtko

> +
> +	vaddr = i915_gem_object_pin_map_unlocked(obj, type);
>   	if (IS_ERR(vaddr))
>   		return PTR_ERR(vaddr);
>   
> @@ -39,9 +57,14 @@ int map_pt_dma(struct i915_address_space *vm, struct drm_i915_gem_object *obj)
>   
>   int map_pt_dma_locked(struct i915_address_space *vm, struct drm_i915_gem_object *obj)
>   {
> +	enum i915_map_type type;
>   	void *vaddr;
>   
> -	vaddr = i915_gem_object_pin_map(obj, I915_MAP_WB);
> +	type = I915_MAP_WB;
> +	if (i915_gem_object_is_lmem(obj))
> +		type = I915_MAP_WC;
> +
> +	vaddr = i915_gem_object_pin_map(obj, type);
>   	if (IS_ERR(vaddr))
>   		return PTR_ERR(vaddr);
>   
> diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.h b/drivers/gpu/drm/i915/gt/intel_gtt.h
> index 40e486704558..44ce27c51631 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gtt.h
> +++ b/drivers/gpu/drm/i915/gt/intel_gtt.h
> @@ -527,6 +527,7 @@ int setup_scratch_page(struct i915_address_space *vm);
>   void free_scratch(struct i915_address_space *vm);
>   
>   struct drm_i915_gem_object *alloc_pt_dma(struct i915_address_space *vm, int sz);
> +struct drm_i915_gem_object *alloc_pt_lmem(struct i915_address_space *vm, int sz);
>   struct i915_page_table *alloc_pt(struct i915_address_space *vm);
>   struct i915_page_directory *alloc_pd(struct i915_address_space *vm);
>   struct i915_page_directory *__alloc_pd(int npde);
> 
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [Intel-gfx] [PATCH 11/19] drm/i915: Update the helper to set correct mapping
  2021-04-14 15:22     ` Tvrtko Ursulin
@ 2021-04-14 16:20       ` Matthew Auld
  -1 siblings, 0 replies; 130+ messages in thread
From: Matthew Auld @ 2021-04-14 16:20 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: Intel Graphics Development, Matthew Auld, ML dri-devel

On Wed, 14 Apr 2021 at 16:22, Tvrtko Ursulin
<tvrtko.ursulin@linux.intel.com> wrote:
>
>
> On 12/04/2021 10:05, Matthew Auld wrote:
> > From: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>
> >
> > Determine the possible coherent map type based on object location,
> > and if target has llc or if user requires an always coherent
> > mapping.
> >
> > Cc: Matthew Auld <matthew.auld@intel.com>
> > Cc: CQ Tang <cq.tang@intel.com>
> > Suggested-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
> > Signed-off-by: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>
> > ---
> >   drivers/gpu/drm/i915/gt/intel_engine_cs.c    |  3 ++-
> >   drivers/gpu/drm/i915/gt/intel_engine_pm.c    |  2 +-
> >   drivers/gpu/drm/i915/gt/intel_lrc.c          |  4 +++-
> >   drivers/gpu/drm/i915/gt/intel_ring.c         |  9 ++++++---
> >   drivers/gpu/drm/i915/gt/selftest_context.c   |  3 ++-
> >   drivers/gpu/drm/i915/gt/selftest_hangcheck.c |  4 ++--
> >   drivers/gpu/drm/i915/gt/selftest_lrc.c       |  4 +++-
> >   drivers/gpu/drm/i915/gt/uc/intel_guc.c       |  4 +++-
> >   drivers/gpu/drm/i915/gt/uc/intel_huc.c       |  4 +++-
> >   drivers/gpu/drm/i915/i915_drv.h              | 11 +++++++++--
> >   drivers/gpu/drm/i915/selftests/igt_spinner.c |  4 ++--
> >   11 files changed, 36 insertions(+), 16 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> > index efe935f80c1a..b79568d370f5 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> > @@ -664,7 +664,8 @@ static int init_status_page(struct intel_engine_cs *engine)
> >       if (ret)
> >               goto err;
> >
> > -     vaddr = i915_gem_object_pin_map(obj, I915_MAP_WB);
> > +     vaddr = i915_gem_object_pin_map(obj,
> > +                                     i915_coherent_map_type(engine->i915, obj, true));
> >       if (IS_ERR(vaddr)) {
> >               ret = PTR_ERR(vaddr);
> >               goto err_unpin;
> > diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
> > index 7c9af86fdb1e..47f4397095e5 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
> > @@ -23,7 +23,7 @@ static void dbg_poison_ce(struct intel_context *ce)
> >
> >       if (ce->state) {
> >               struct drm_i915_gem_object *obj = ce->state->obj;
> > -             int type = i915_coherent_map_type(ce->engine->i915);
> > +             int type = i915_coherent_map_type(ce->engine->i915, obj, true);
> >               void *map;
> >
> >               if (!i915_gem_object_trylock(obj))
> > diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
> > index e86897cde984..aafe2a4df496 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
> > @@ -903,7 +903,9 @@ lrc_pre_pin(struct intel_context *ce,
> >       GEM_BUG_ON(!i915_vma_is_pinned(ce->state));
> >
> >       *vaddr = i915_gem_object_pin_map(ce->state->obj,
> > -                                      i915_coherent_map_type(ce->engine->i915) |
> > +                                      i915_coherent_map_type(ce->engine->i915,
> > +                                                             ce->state->obj,
> > +                                                             false) |
> >                                        I915_MAP_OVERRIDE);
> >
> >       return PTR_ERR_OR_ZERO(*vaddr);
> > diff --git a/drivers/gpu/drm/i915/gt/intel_ring.c b/drivers/gpu/drm/i915/gt/intel_ring.c
> > index aee0a77c77e0..3cf6c7e68108 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_ring.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_ring.c
> > @@ -53,9 +53,12 @@ int intel_ring_pin(struct intel_ring *ring, struct i915_gem_ww_ctx *ww)
> >
> >       if (i915_vma_is_map_and_fenceable(vma))
> >               addr = (void __force *)i915_vma_pin_iomap(vma);
> > -     else
> > -             addr = i915_gem_object_pin_map(vma->obj,
> > -                                            i915_coherent_map_type(vma->vm->i915));
> > +     else {
> > +             int type = i915_coherent_map_type(vma->vm->i915, vma->obj, false);
> > +
> > +             addr = i915_gem_object_pin_map(vma->obj, type);
> > +     }
> > +
> >       if (IS_ERR(addr)) {
> >               ret = PTR_ERR(addr);
> >               goto err_ring;
> > diff --git a/drivers/gpu/drm/i915/gt/selftest_context.c b/drivers/gpu/drm/i915/gt/selftest_context.c
> > index b9bdd1d23243..26685b927169 100644
> > --- a/drivers/gpu/drm/i915/gt/selftest_context.c
> > +++ b/drivers/gpu/drm/i915/gt/selftest_context.c
> > @@ -88,7 +88,8 @@ static int __live_context_size(struct intel_engine_cs *engine)
> >               goto err;
> >
> >       vaddr = i915_gem_object_pin_map_unlocked(ce->state->obj,
> > -                                              i915_coherent_map_type(engine->i915));
> > +                                              i915_coherent_map_type(engine->i915,
> > +                                                                     ce->state->obj, false));
> >       if (IS_ERR(vaddr)) {
> >               err = PTR_ERR(vaddr);
> >               intel_context_unpin(ce);
> > diff --git a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
> > index 746985971c3a..5b63d4df8c93 100644
> > --- a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
> > +++ b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
> > @@ -69,7 +69,7 @@ static int hang_init(struct hang *h, struct intel_gt *gt)
> >       h->seqno = memset(vaddr, 0xff, PAGE_SIZE);
> >
> >       vaddr = i915_gem_object_pin_map_unlocked(h->obj,
> > -                                              i915_coherent_map_type(gt->i915));
> > +                                              i915_coherent_map_type(gt->i915, h->obj, false));
> >       if (IS_ERR(vaddr)) {
> >               err = PTR_ERR(vaddr);
> >               goto err_unpin_hws;
> > @@ -130,7 +130,7 @@ hang_create_request(struct hang *h, struct intel_engine_cs *engine)
> >               return ERR_CAST(obj);
> >       }
> >
> > -     vaddr = i915_gem_object_pin_map_unlocked(obj, i915_coherent_map_type(gt->i915));
> > +     vaddr = i915_gem_object_pin_map_unlocked(obj, i915_coherent_map_type(gt->i915, obj, false));
> >       if (IS_ERR(vaddr)) {
> >               i915_gem_object_put(obj);
> >               i915_vm_put(vm);
> > diff --git a/drivers/gpu/drm/i915/gt/selftest_lrc.c b/drivers/gpu/drm/i915/gt/selftest_lrc.c
> > index 85e7df6a5123..d8f6623524e8 100644
> > --- a/drivers/gpu/drm/i915/gt/selftest_lrc.c
> > +++ b/drivers/gpu/drm/i915/gt/selftest_lrc.c
> > @@ -1221,7 +1221,9 @@ static int compare_isolation(struct intel_engine_cs *engine,
> >       }
> >
> >       lrc = i915_gem_object_pin_map_unlocked(ce->state->obj,
> > -                                   i915_coherent_map_type(engine->i915));
> > +                                            i915_coherent_map_type(engine->i915,
> > +                                                                   ce->state->obj,
> > +                                                                   false));
> >       if (IS_ERR(lrc)) {
> >               err = PTR_ERR(lrc);
> >               goto err_B1;
> > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.c b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
> > index 78305b2ec89d..adae04c47aab 100644
> > --- a/drivers/gpu/drm/i915/gt/uc/intel_guc.c
> > +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
> > @@ -682,7 +682,9 @@ int intel_guc_allocate_and_map_vma(struct intel_guc *guc, u32 size,
> >       if (IS_ERR(vma))
> >               return PTR_ERR(vma);
> >
> > -     vaddr = i915_gem_object_pin_map_unlocked(vma->obj, I915_MAP_WB);
> > +     vaddr = i915_gem_object_pin_map_unlocked(vma->obj,
> > +                                              i915_coherent_map_type(guc_to_gt(guc)->i915,
> > +                                                                     vma->obj, true));
> >       if (IS_ERR(vaddr)) {
> >               i915_vma_unpin_and_release(&vma, 0);
> >               return PTR_ERR(vaddr);
> > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_huc.c b/drivers/gpu/drm/i915/gt/uc/intel_huc.c
> > index 2126dd81ac38..56d2144dc6a0 100644
> > --- a/drivers/gpu/drm/i915/gt/uc/intel_huc.c
> > +++ b/drivers/gpu/drm/i915/gt/uc/intel_huc.c
> > @@ -82,7 +82,9 @@ static int intel_huc_rsa_data_create(struct intel_huc *huc)
> >       if (IS_ERR(vma))
> >               return PTR_ERR(vma);
> >
> > -     vaddr = i915_gem_object_pin_map_unlocked(vma->obj, I915_MAP_WB);
> > +     vaddr = i915_gem_object_pin_map_unlocked(vma->obj,
> > +                                              i915_coherent_map_type(gt->i915,
> > +                                                                     vma->obj, true));
> >       if (IS_ERR(vaddr)) {
> >               i915_vma_unpin_and_release(&vma, 0);
> >               return PTR_ERR(vaddr);
> > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> > index 69e43bf91a15..2abbc06712a4 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.h
> > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > @@ -78,6 +78,7 @@
> >   #include "gem/i915_gem_context_types.h"
> >   #include "gem/i915_gem_shrinker.h"
> >   #include "gem/i915_gem_stolen.h"
> > +#include "gem/i915_gem_lmem.h"
> >
> >   #include "gt/intel_engine.h"
> >   #include "gt/intel_gt_types.h"
> > @@ -1921,9 +1922,15 @@ static inline int intel_hws_csb_write_index(struct drm_i915_private *i915)
> >   }
> >
> >   static inline enum i915_map_type
> > -i915_coherent_map_type(struct drm_i915_private *i915)
> > +i915_coherent_map_type(struct drm_i915_private *i915,
> > +                    struct drm_i915_gem_object *obj, bool always_coherent)
> >   {
> > -     return HAS_LLC(i915) ? I915_MAP_WB : I915_MAP_WC;
> > +     if (i915_gem_object_is_lmem(obj))
> > +             return I915_MAP_WC;
> > +     if (HAS_LLC(i915) || always_coherent)
> > +             return I915_MAP_WB;
> > +     else
> > +             return I915_MAP_WC;
>
> Seems this patch is doing two things.
>
> First it is adding lmem support to this helper by always returning WC
> for lmem objects.
>
> Secondly it is introducing an idea of "always coherent" in a helper
> called i915_coherent_map_type. Could someone explain what is coherent vs
> always coherent?
>
> And also, why is always coherent happy with WB? Sounds counterintuitive
> to me.

All this does is try to keep the existing behaviour intact, whilst
also ensuring that all lmem objects are mapped using only WC, no
matter what. The always_coherent=true thing is for the existing places
where we sometimes map the object using WB, without first considering
whether the device has the fast shared LLC vs snooping. Yes, it's
slightly ugly :)
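
The intent described above can be checked with a small userspace model. `cm_old_map_type()` mirrors the one-argument helper removed by the patch and `cm_new_map_type()` its replacement; the names and the boolean parameters are hypothetical stand-ins for the i915 types, and the check only covers call sites that fit the description above (old helper with always_coherent=false, hardcoded WB with always_coherent=true).

```c
#include <assert.h>
#include <stdbool.h>

enum cm_map_type { CM_MAP_WB, CM_MAP_WC };

/* Old helper: the answer depended only on whether the device has an LLC. */
static enum cm_map_type cm_old_map_type(bool has_llc)
{
	return has_llc ? CM_MAP_WB : CM_MAP_WC;
}

/* New helper: lmem always wins, then LLC or the caller's always_coherent request. */
static enum cm_map_type
cm_new_map_type(bool has_llc, bool obj_is_lmem, bool always_coherent)
{
	if (obj_is_lmem)
		return CM_MAP_WC;
	if (has_llc || always_coherent)
		return CM_MAP_WB;
	return CM_MAP_WC;
}

/*
 * For objects outside lmem:
 *  - always_coherent=false should match the old helper;
 *  - always_coherent=true should match a hardcoded WB mapping.
 * Both are checked with and without an LLC.
 */
static void cm_check_existing_behaviour(void)
{
	for (int llc = 0; llc <= 1; llc++) {
		assert(cm_new_map_type(llc, false, false) == cm_old_map_type(llc));
		assert(cm_new_map_type(llc, false, true) == CM_MAP_WB);
	}
}
```

For lmem objects the model returns WC for every combination of the other two inputs, which is the "no matter what" part.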

>
> Regards,
>
> Tvrtko
>
> >   }
> >
> >   #endif
> > diff --git a/drivers/gpu/drm/i915/selftests/igt_spinner.c b/drivers/gpu/drm/i915/selftests/igt_spinner.c
> > index cfbbe415b57c..5fe397b7d1d9 100644
> > --- a/drivers/gpu/drm/i915/selftests/igt_spinner.c
> > +++ b/drivers/gpu/drm/i915/selftests/igt_spinner.c
> > @@ -94,9 +94,9 @@ int igt_spinner_pin(struct igt_spinner *spin,
> >       }
> >
> >       if (!spin->batch) {
> > -             unsigned int mode =
> > -                     i915_coherent_map_type(spin->gt->i915);
> > +             unsigned int mode;
> >
> > +             mode = i915_coherent_map_type(spin->gt->i915, spin->obj, false);
> >               vaddr = igt_spinner_pin_obj(ce, ww, spin->obj, mode, &spin->batch_vma);
> >               if (IS_ERR(vaddr))
> >                       return PTR_ERR(vaddr);
> >

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [Intel-gfx] [PATCH 11/19] drm/i915: Update the helper to set correct mapping
@ 2021-04-14 16:20       ` Matthew Auld
  0 siblings, 0 replies; 130+ messages in thread
From: Matthew Auld @ 2021-04-14 16:20 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: Intel Graphics Development, Matthew Auld, ML dri-devel

On Wed, 14 Apr 2021 at 16:22, Tvrtko Ursulin
<tvrtko.ursulin@linux.intel.com> wrote:
>
>
> On 12/04/2021 10:05, Matthew Auld wrote:
> > From: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>
> >
> > Determine the possible coherent map type based on object location,
> > and if target has llc or if user requires an always coherent
> > mapping.
> >
> > Cc: Matthew Auld <matthew.auld@intel.com>
> > Cc: CQ Tang <cq.tang@intel.com>
> > Suggested-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
> > Signed-off-by: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>
> > ---
> >   drivers/gpu/drm/i915/gt/intel_engine_cs.c    |  3 ++-
> >   drivers/gpu/drm/i915/gt/intel_engine_pm.c    |  2 +-
> >   drivers/gpu/drm/i915/gt/intel_lrc.c          |  4 +++-
> >   drivers/gpu/drm/i915/gt/intel_ring.c         |  9 ++++++---
> >   drivers/gpu/drm/i915/gt/selftest_context.c   |  3 ++-
> >   drivers/gpu/drm/i915/gt/selftest_hangcheck.c |  4 ++--
> >   drivers/gpu/drm/i915/gt/selftest_lrc.c       |  4 +++-
> >   drivers/gpu/drm/i915/gt/uc/intel_guc.c       |  4 +++-
> >   drivers/gpu/drm/i915/gt/uc/intel_huc.c       |  4 +++-
> >   drivers/gpu/drm/i915/i915_drv.h              | 11 +++++++++--
> >   drivers/gpu/drm/i915/selftests/igt_spinner.c |  4 ++--
> >   11 files changed, 36 insertions(+), 16 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> > index efe935f80c1a..b79568d370f5 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> > @@ -664,7 +664,8 @@ static int init_status_page(struct intel_engine_cs *engine)
> >       if (ret)
> >               goto err;
> >
> > -     vaddr = i915_gem_object_pin_map(obj, I915_MAP_WB);
> > +     vaddr = i915_gem_object_pin_map(obj,
> > +                                     i915_coherent_map_type(engine->i915, obj, true));
> >       if (IS_ERR(vaddr)) {
> >               ret = PTR_ERR(vaddr);
> >               goto err_unpin;
> > diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
> > index 7c9af86fdb1e..47f4397095e5 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
> > @@ -23,7 +23,7 @@ static void dbg_poison_ce(struct intel_context *ce)
> >
> >       if (ce->state) {
> >               struct drm_i915_gem_object *obj = ce->state->obj;
> > -             int type = i915_coherent_map_type(ce->engine->i915);
> > +             int type = i915_coherent_map_type(ce->engine->i915, obj, true);
> >               void *map;
> >
> >               if (!i915_gem_object_trylock(obj))
> > diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
> > index e86897cde984..aafe2a4df496 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
> > @@ -903,7 +903,9 @@ lrc_pre_pin(struct intel_context *ce,
> >       GEM_BUG_ON(!i915_vma_is_pinned(ce->state));
> >
> >       *vaddr = i915_gem_object_pin_map(ce->state->obj,
> > -                                      i915_coherent_map_type(ce->engine->i915) |
> > +                                      i915_coherent_map_type(ce->engine->i915,
> > +                                                             ce->state->obj,
> > +                                                             false) |
> >                                        I915_MAP_OVERRIDE);
> >
> >       return PTR_ERR_OR_ZERO(*vaddr);
> > diff --git a/drivers/gpu/drm/i915/gt/intel_ring.c b/drivers/gpu/drm/i915/gt/intel_ring.c
> > index aee0a77c77e0..3cf6c7e68108 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_ring.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_ring.c
> > @@ -53,9 +53,12 @@ int intel_ring_pin(struct intel_ring *ring, struct i915_gem_ww_ctx *ww)
> >
> >       if (i915_vma_is_map_and_fenceable(vma))
> >               addr = (void __force *)i915_vma_pin_iomap(vma);
> > -     else
> > -             addr = i915_gem_object_pin_map(vma->obj,
> > -                                            i915_coherent_map_type(vma->vm->i915));
> > +     else {
> > +             int type = i915_coherent_map_type(vma->vm->i915, vma->obj, false);
> > +
> > +             addr = i915_gem_object_pin_map(vma->obj, type);
> > +     }
> > +
> >       if (IS_ERR(addr)) {
> >               ret = PTR_ERR(addr);
> >               goto err_ring;
> > diff --git a/drivers/gpu/drm/i915/gt/selftest_context.c b/drivers/gpu/drm/i915/gt/selftest_context.c
> > index b9bdd1d23243..26685b927169 100644
> > --- a/drivers/gpu/drm/i915/gt/selftest_context.c
> > +++ b/drivers/gpu/drm/i915/gt/selftest_context.c
> > @@ -88,7 +88,8 @@ static int __live_context_size(struct intel_engine_cs *engine)
> >               goto err;
> >
> >       vaddr = i915_gem_object_pin_map_unlocked(ce->state->obj,
> > -                                              i915_coherent_map_type(engine->i915));
> > +                                              i915_coherent_map_type(engine->i915,
> > +                                                                     ce->state->obj, false));
> >       if (IS_ERR(vaddr)) {
> >               err = PTR_ERR(vaddr);
> >               intel_context_unpin(ce);
> > diff --git a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
> > index 746985971c3a..5b63d4df8c93 100644
> > --- a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
> > +++ b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
> > @@ -69,7 +69,7 @@ static int hang_init(struct hang *h, struct intel_gt *gt)
> >       h->seqno = memset(vaddr, 0xff, PAGE_SIZE);
> >
> >       vaddr = i915_gem_object_pin_map_unlocked(h->obj,
> > -                                              i915_coherent_map_type(gt->i915));
> > +                                              i915_coherent_map_type(gt->i915, h->obj, false));
> >       if (IS_ERR(vaddr)) {
> >               err = PTR_ERR(vaddr);
> >               goto err_unpin_hws;
> > @@ -130,7 +130,7 @@ hang_create_request(struct hang *h, struct intel_engine_cs *engine)
> >               return ERR_CAST(obj);
> >       }
> >
> > -     vaddr = i915_gem_object_pin_map_unlocked(obj, i915_coherent_map_type(gt->i915));
> > +     vaddr = i915_gem_object_pin_map_unlocked(obj, i915_coherent_map_type(gt->i915, obj, false));
> >       if (IS_ERR(vaddr)) {
> >               i915_gem_object_put(obj);
> >               i915_vm_put(vm);
> > diff --git a/drivers/gpu/drm/i915/gt/selftest_lrc.c b/drivers/gpu/drm/i915/gt/selftest_lrc.c
> > index 85e7df6a5123..d8f6623524e8 100644
> > --- a/drivers/gpu/drm/i915/gt/selftest_lrc.c
> > +++ b/drivers/gpu/drm/i915/gt/selftest_lrc.c
> > @@ -1221,7 +1221,9 @@ static int compare_isolation(struct intel_engine_cs *engine,
> >       }
> >
> >       lrc = i915_gem_object_pin_map_unlocked(ce->state->obj,
> > -                                   i915_coherent_map_type(engine->i915));
> > +                                            i915_coherent_map_type(engine->i915,
> > +                                                                   ce->state->obj,
> > +                                                                   false));
> >       if (IS_ERR(lrc)) {
> >               err = PTR_ERR(lrc);
> >               goto err_B1;
> > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.c b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
> > index 78305b2ec89d..adae04c47aab 100644
> > --- a/drivers/gpu/drm/i915/gt/uc/intel_guc.c
> > +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
> > @@ -682,7 +682,9 @@ int intel_guc_allocate_and_map_vma(struct intel_guc *guc, u32 size,
> >       if (IS_ERR(vma))
> >               return PTR_ERR(vma);
> >
> > -     vaddr = i915_gem_object_pin_map_unlocked(vma->obj, I915_MAP_WB);
> > +     vaddr = i915_gem_object_pin_map_unlocked(vma->obj,
> > +                                              i915_coherent_map_type(guc_to_gt(guc)->i915,
> > +                                                                     vma->obj, true));
> >       if (IS_ERR(vaddr)) {
> >               i915_vma_unpin_and_release(&vma, 0);
> >               return PTR_ERR(vaddr);
> > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_huc.c b/drivers/gpu/drm/i915/gt/uc/intel_huc.c
> > index 2126dd81ac38..56d2144dc6a0 100644
> > --- a/drivers/gpu/drm/i915/gt/uc/intel_huc.c
> > +++ b/drivers/gpu/drm/i915/gt/uc/intel_huc.c
> > @@ -82,7 +82,9 @@ static int intel_huc_rsa_data_create(struct intel_huc *huc)
> >       if (IS_ERR(vma))
> >               return PTR_ERR(vma);
> >
> > -     vaddr = i915_gem_object_pin_map_unlocked(vma->obj, I915_MAP_WB);
> > +     vaddr = i915_gem_object_pin_map_unlocked(vma->obj,
> > +                                              i915_coherent_map_type(gt->i915,
> > +                                                                     vma->obj, true));
> >       if (IS_ERR(vaddr)) {
> >               i915_vma_unpin_and_release(&vma, 0);
> >               return PTR_ERR(vaddr);
> > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> > index 69e43bf91a15..2abbc06712a4 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.h
> > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > @@ -78,6 +78,7 @@
> >   #include "gem/i915_gem_context_types.h"
> >   #include "gem/i915_gem_shrinker.h"
> >   #include "gem/i915_gem_stolen.h"
> > +#include "gem/i915_gem_lmem.h"
> >
> >   #include "gt/intel_engine.h"
> >   #include "gt/intel_gt_types.h"
> > @@ -1921,9 +1922,15 @@ static inline int intel_hws_csb_write_index(struct drm_i915_private *i915)
> >   }
> >
> >   static inline enum i915_map_type
> > -i915_coherent_map_type(struct drm_i915_private *i915)
> > +i915_coherent_map_type(struct drm_i915_private *i915,
> > +                    struct drm_i915_gem_object *obj, bool always_coherent)
> >   {
> > -     return HAS_LLC(i915) ? I915_MAP_WB : I915_MAP_WC;
> > +     if (i915_gem_object_is_lmem(obj))
> > +             return I915_MAP_WC;
> > +     if (HAS_LLC(i915) || always_coherent)
> > +             return I915_MAP_WB;
> > +     else
> > +             return I915_MAP_WC;
>
> Seems this patch is doing two things.
>
> First it is adding lmem support to this helper by always returning WC
> for lmem objects.
>
> Secondly it is introducing an idea of "always coherent" in a helper
> called i915_coherent_map_type. Could someone explain what is coherent vs
> always coherent?
>
> And also, why is always coherent happy with WB? Sounds counter intuitive
> to me.

All this does is try to keep the existing behaviour intact, whilst
also ensuring that all lmem objects are mapped using only WC, no
matter what. The always_coherent=true thing is for the existing places
where we sometimes map the object using WB, without first considering
whether the device has the fast shared LLC vs snooping. Yes, it's
slightly ugly :)
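
To make the resulting behaviour easier to see, here is a rough stand-alone
model of the decision order in the patched helper. It is only a sketch, not
driver code: the model_* names, the two structs and the MAP_* enum are
made-up stand-ins for drm_i915_private, drm_i915_gem_object and
enum i915_map_type, and the two booleans stand in for HAS_LLC() and
i915_gem_object_is_lmem().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins for the driver types; only what the decision needs. */
enum model_map_type { MAP_WB, MAP_WC };

struct model_device { bool has_llc; };  /* models HAS_LLC(i915) */
struct model_object { bool is_lmem; };  /* models i915_gem_object_is_lmem(obj) */

/* Same order of checks as the patched i915_coherent_map_type(). */
static enum model_map_type
model_coherent_map_type(const struct model_device *dev,
                        const struct model_object *obj,
                        bool always_coherent)
{
	if (obj->is_lmem)
		return MAP_WC;          /* lmem is WC no matter what */
	if (dev->has_llc || always_coherent)
		return MAP_WB;          /* LLC, or a caller that used plain WB before */
	return MAP_WC;                  /* non-LLC system memory, as before the patch */
}

static const char *model_map_type_name(enum model_map_type type)
{
	return type == MAP_WB ? "WB" : "WC";
}

/* Print all eight input combinations as a small truth table. */
static void model_print_table(void)
{
	for (int lmem = 0; lmem <= 1; lmem++) {
		for (int llc = 0; llc <= 1; llc++) {
			for (int ac = 0; ac <= 1; ac++) {
				struct model_device dev = { .has_llc = llc };
				struct model_object obj = { .is_lmem = lmem };

				printf("lmem=%d llc=%d always_coherent=%d -> %s\n",
				       lmem, llc, ac,
				       model_map_type_name(
					       model_coherent_map_type(&dev, &obj, ac)));
			}
		}
	}
}

/* Spot-check the three cases discussed in this thread. */
static void model_self_check(void)
{
	struct model_device no_llc = { .has_llc = false };
	struct model_object smem = { .is_lmem = false };
	struct model_object lmem = { .is_lmem = true };

	assert(model_coherent_map_type(&no_llc, &lmem, true) == MAP_WC);
	assert(model_coherent_map_type(&no_llc, &smem, true) == MAP_WB);
	assert(model_coherent_map_type(&no_llc, &smem, false) == MAP_WC);
}
```

Calling model_print_table() and model_self_check() from a main() shows the
point: always_coherent only matters for system memory objects on non-LLC
devices, where it selects WB instead of WC; lmem objects always get WC.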

>
> Regards,
>
> Tvrtko
>
> >   }
> >
> >   #endif
> > diff --git a/drivers/gpu/drm/i915/selftests/igt_spinner.c b/drivers/gpu/drm/i915/selftests/igt_spinner.c
> > index cfbbe415b57c..5fe397b7d1d9 100644
> > --- a/drivers/gpu/drm/i915/selftests/igt_spinner.c
> > +++ b/drivers/gpu/drm/i915/selftests/igt_spinner.c
> > @@ -94,9 +94,9 @@ int igt_spinner_pin(struct igt_spinner *spin,
> >       }
> >
> >       if (!spin->batch) {
> > -             unsigned int mode =
> > -                     i915_coherent_map_type(spin->gt->i915);
> > +             unsigned int mode;
> >
> > +             mode = i915_coherent_map_type(spin->gt->i915, spin->obj, false);
> >               vaddr = igt_spinner_pin_obj(ce, ww, spin->obj, mode, &spin->batch_vma);
> >               if (IS_ERR(vaddr))
> >                       return PTR_ERR(vaddr);
> >
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [Intel-gfx] [PATCH 11/19] drm/i915: Update the helper to set correct mapping
  2021-04-14 16:20       ` Matthew Auld
@ 2021-04-15  8:20         ` Tvrtko Ursulin
  -1 siblings, 0 replies; 130+ messages in thread
From: Tvrtko Ursulin @ 2021-04-15  8:20 UTC (permalink / raw)
  To: Matthew Auld; +Cc: Intel Graphics Development, Matthew Auld, ML dri-devel


On 14/04/2021 17:20, Matthew Auld wrote:
> On Wed, 14 Apr 2021 at 16:22, Tvrtko Ursulin
> <tvrtko.ursulin@linux.intel.com> wrote:
>>
>>
>> On 12/04/2021 10:05, Matthew Auld wrote:
>>> From: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>
>>>
>>> Determine the possible coherent map type based on object location,
>>> and if target has llc or if user requires an always coherent
>>> mapping.
>>>
>>> Cc: Matthew Auld <matthew.auld@intel.com>
>>> Cc: CQ Tang <cq.tang@intel.com>
>>> Suggested-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
>>> Signed-off-by: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>
>>> ---
>>>    drivers/gpu/drm/i915/gt/intel_engine_cs.c    |  3 ++-
>>>    drivers/gpu/drm/i915/gt/intel_engine_pm.c    |  2 +-
>>>    drivers/gpu/drm/i915/gt/intel_lrc.c          |  4 +++-
>>>    drivers/gpu/drm/i915/gt/intel_ring.c         |  9 ++++++---
>>>    drivers/gpu/drm/i915/gt/selftest_context.c   |  3 ++-
>>>    drivers/gpu/drm/i915/gt/selftest_hangcheck.c |  4 ++--
>>>    drivers/gpu/drm/i915/gt/selftest_lrc.c       |  4 +++-
>>>    drivers/gpu/drm/i915/gt/uc/intel_guc.c       |  4 +++-
>>>    drivers/gpu/drm/i915/gt/uc/intel_huc.c       |  4 +++-
>>>    drivers/gpu/drm/i915/i915_drv.h              | 11 +++++++++--
>>>    drivers/gpu/drm/i915/selftests/igt_spinner.c |  4 ++--
>>>    11 files changed, 36 insertions(+), 16 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
>>> index efe935f80c1a..b79568d370f5 100644
>>> --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
>>> +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
>>> @@ -664,7 +664,8 @@ static int init_status_page(struct intel_engine_cs *engine)
>>>        if (ret)
>>>                goto err;
>>>
>>> -     vaddr = i915_gem_object_pin_map(obj, I915_MAP_WB);
>>> +     vaddr = i915_gem_object_pin_map(obj,
>>> +                                     i915_coherent_map_type(engine->i915, obj, true));
>>>        if (IS_ERR(vaddr)) {
>>>                ret = PTR_ERR(vaddr);
>>>                goto err_unpin;
>>> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
>>> index 7c9af86fdb1e..47f4397095e5 100644
>>> --- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c
>>> +++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
>>> @@ -23,7 +23,7 @@ static void dbg_poison_ce(struct intel_context *ce)
>>>
>>>        if (ce->state) {
>>>                struct drm_i915_gem_object *obj = ce->state->obj;
>>> -             int type = i915_coherent_map_type(ce->engine->i915);
>>> +             int type = i915_coherent_map_type(ce->engine->i915, obj, true);
>>>                void *map;
>>>
>>>                if (!i915_gem_object_trylock(obj))
>>> diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
>>> index e86897cde984..aafe2a4df496 100644
>>> --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
>>> +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
>>> @@ -903,7 +903,9 @@ lrc_pre_pin(struct intel_context *ce,
>>>        GEM_BUG_ON(!i915_vma_is_pinned(ce->state));
>>>
>>>        *vaddr = i915_gem_object_pin_map(ce->state->obj,
>>> -                                      i915_coherent_map_type(ce->engine->i915) |
>>> +                                      i915_coherent_map_type(ce->engine->i915,
>>> +                                                             ce->state->obj,
>>> +                                                             false) |
>>>                                         I915_MAP_OVERRIDE);
>>>
>>>        return PTR_ERR_OR_ZERO(*vaddr);
>>> diff --git a/drivers/gpu/drm/i915/gt/intel_ring.c b/drivers/gpu/drm/i915/gt/intel_ring.c
>>> index aee0a77c77e0..3cf6c7e68108 100644
>>> --- a/drivers/gpu/drm/i915/gt/intel_ring.c
>>> +++ b/drivers/gpu/drm/i915/gt/intel_ring.c
>>> @@ -53,9 +53,12 @@ int intel_ring_pin(struct intel_ring *ring, struct i915_gem_ww_ctx *ww)
>>>
>>>        if (i915_vma_is_map_and_fenceable(vma))
>>>                addr = (void __force *)i915_vma_pin_iomap(vma);
>>> -     else
>>> -             addr = i915_gem_object_pin_map(vma->obj,
>>> -                                            i915_coherent_map_type(vma->vm->i915));
>>> +     else {
>>> +             int type = i915_coherent_map_type(vma->vm->i915, vma->obj, false);
>>> +
>>> +             addr = i915_gem_object_pin_map(vma->obj, type);
>>> +     }
>>> +
>>>        if (IS_ERR(addr)) {
>>>                ret = PTR_ERR(addr);
>>>                goto err_ring;
>>> diff --git a/drivers/gpu/drm/i915/gt/selftest_context.c b/drivers/gpu/drm/i915/gt/selftest_context.c
>>> index b9bdd1d23243..26685b927169 100644
>>> --- a/drivers/gpu/drm/i915/gt/selftest_context.c
>>> +++ b/drivers/gpu/drm/i915/gt/selftest_context.c
>>> @@ -88,7 +88,8 @@ static int __live_context_size(struct intel_engine_cs *engine)
>>>                goto err;
>>>
>>>        vaddr = i915_gem_object_pin_map_unlocked(ce->state->obj,
>>> -                                              i915_coherent_map_type(engine->i915));
>>> +                                              i915_coherent_map_type(engine->i915,
>>> +                                                                     ce->state->obj, false));
>>>        if (IS_ERR(vaddr)) {
>>>                err = PTR_ERR(vaddr);
>>>                intel_context_unpin(ce);
>>> diff --git a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
>>> index 746985971c3a..5b63d4df8c93 100644
>>> --- a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
>>> +++ b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
>>> @@ -69,7 +69,7 @@ static int hang_init(struct hang *h, struct intel_gt *gt)
>>>        h->seqno = memset(vaddr, 0xff, PAGE_SIZE);
>>>
>>>        vaddr = i915_gem_object_pin_map_unlocked(h->obj,
>>> -                                              i915_coherent_map_type(gt->i915));
>>> +                                              i915_coherent_map_type(gt->i915, h->obj, false));
>>>        if (IS_ERR(vaddr)) {
>>>                err = PTR_ERR(vaddr);
>>>                goto err_unpin_hws;
>>> @@ -130,7 +130,7 @@ hang_create_request(struct hang *h, struct intel_engine_cs *engine)
>>>                return ERR_CAST(obj);
>>>        }
>>>
>>> -     vaddr = i915_gem_object_pin_map_unlocked(obj, i915_coherent_map_type(gt->i915));
>>> +     vaddr = i915_gem_object_pin_map_unlocked(obj, i915_coherent_map_type(gt->i915, obj, false));
>>>        if (IS_ERR(vaddr)) {
>>>                i915_gem_object_put(obj);
>>>                i915_vm_put(vm);
>>> diff --git a/drivers/gpu/drm/i915/gt/selftest_lrc.c b/drivers/gpu/drm/i915/gt/selftest_lrc.c
>>> index 85e7df6a5123..d8f6623524e8 100644
>>> --- a/drivers/gpu/drm/i915/gt/selftest_lrc.c
>>> +++ b/drivers/gpu/drm/i915/gt/selftest_lrc.c
>>> @@ -1221,7 +1221,9 @@ static int compare_isolation(struct intel_engine_cs *engine,
>>>        }
>>>
>>>        lrc = i915_gem_object_pin_map_unlocked(ce->state->obj,
>>> -                                   i915_coherent_map_type(engine->i915));
>>> +                                            i915_coherent_map_type(engine->i915,
>>> +                                                                   ce->state->obj,
>>> +                                                                   false));
>>>        if (IS_ERR(lrc)) {
>>>                err = PTR_ERR(lrc);
>>>                goto err_B1;
>>> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.c b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
>>> index 78305b2ec89d..adae04c47aab 100644
>>> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc.c
>>> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
>>> @@ -682,7 +682,9 @@ int intel_guc_allocate_and_map_vma(struct intel_guc *guc, u32 size,
>>>        if (IS_ERR(vma))
>>>                return PTR_ERR(vma);
>>>
>>> -     vaddr = i915_gem_object_pin_map_unlocked(vma->obj, I915_MAP_WB);
>>> +     vaddr = i915_gem_object_pin_map_unlocked(vma->obj,
>>> +                                              i915_coherent_map_type(guc_to_gt(guc)->i915,
>>> +                                                                     vma->obj, true));
>>>        if (IS_ERR(vaddr)) {
>>>                i915_vma_unpin_and_release(&vma, 0);
>>>                return PTR_ERR(vaddr);
>>> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_huc.c b/drivers/gpu/drm/i915/gt/uc/intel_huc.c
>>> index 2126dd81ac38..56d2144dc6a0 100644
>>> --- a/drivers/gpu/drm/i915/gt/uc/intel_huc.c
>>> +++ b/drivers/gpu/drm/i915/gt/uc/intel_huc.c
>>> @@ -82,7 +82,9 @@ static int intel_huc_rsa_data_create(struct intel_huc *huc)
>>>        if (IS_ERR(vma))
>>>                return PTR_ERR(vma);
>>>
>>> -     vaddr = i915_gem_object_pin_map_unlocked(vma->obj, I915_MAP_WB);
>>> +     vaddr = i915_gem_object_pin_map_unlocked(vma->obj,
>>> +                                              i915_coherent_map_type(gt->i915,
>>> +                                                                     vma->obj, true));
>>>        if (IS_ERR(vaddr)) {
>>>                i915_vma_unpin_and_release(&vma, 0);
>>>                return PTR_ERR(vaddr);
>>> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
>>> index 69e43bf91a15..2abbc06712a4 100644
>>> --- a/drivers/gpu/drm/i915/i915_drv.h
>>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>>> @@ -78,6 +78,7 @@
>>>    #include "gem/i915_gem_context_types.h"
>>>    #include "gem/i915_gem_shrinker.h"
>>>    #include "gem/i915_gem_stolen.h"
>>> +#include "gem/i915_gem_lmem.h"
>>>
>>>    #include "gt/intel_engine.h"
>>>    #include "gt/intel_gt_types.h"
>>> @@ -1921,9 +1922,15 @@ static inline int intel_hws_csb_write_index(struct drm_i915_private *i915)
>>>    }
>>>
>>>    static inline enum i915_map_type
>>> -i915_coherent_map_type(struct drm_i915_private *i915)
>>> +i915_coherent_map_type(struct drm_i915_private *i915,
>>> +                    struct drm_i915_gem_object *obj, bool always_coherent)
>>>    {
>>> -     return HAS_LLC(i915) ? I915_MAP_WB : I915_MAP_WC;
>>> +     if (i915_gem_object_is_lmem(obj))
>>> +             return I915_MAP_WC;
>>> +     if (HAS_LLC(i915) || always_coherent)
>>> +             return I915_MAP_WB;
>>> +     else
>>> +             return I915_MAP_WC;
>>
>> Seems this patch is doing two things.
>>
>> First it is adding lmem support to this helper by always returning WC
>> for lmem objects.
>>
>> Secondly it is introducing an idea of "always coherent" in a helper
>> called i915_coherent_map_type. Could someone explain what is coherent vs
>> always coherent?
>>
>> And also, why is always coherent happy with WB? Sounds counter intuitive
>> to me.
> 
> All this does is try to keep the existing behaviour intact, whilst
> also ensuring that all lmem objects are mapped using only WC, no
> matter what. The always_coherent=true thing is for the existing places
> where we sometimes map the object using WB, without first considering
> whether the device has the fast shared LLC vs snooping. Yes, it's
> slightly ugly :)

Not fully following. If we had to write kerneldoc for the always_coherent
input argument, what would it say?

Regards,

Tvrtko
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [Intel-gfx] [PATCH 11/19] drm/i915: Update the helper to set correct mapping
  2021-04-15  8:20         ` Tvrtko Ursulin
@ 2021-04-15  9:23           ` Matthew Auld
  -1 siblings, 0 replies; 130+ messages in thread
From: Matthew Auld @ 2021-04-15  9:23 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: Intel Graphics Development, Matthew Auld, ML dri-devel

On Thu, 15 Apr 2021 at 09:21, Tvrtko Ursulin
<tvrtko.ursulin@linux.intel.com> wrote:
>
>
> On 14/04/2021 17:20, Matthew Auld wrote:
> > On Wed, 14 Apr 2021 at 16:22, Tvrtko Ursulin
> > <tvrtko.ursulin@linux.intel.com> wrote:
> >>
> >>
> >> On 12/04/2021 10:05, Matthew Auld wrote:
> >>> From: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>
> >>>
> >>> Determine the possible coherent map type based on object location,
> >>> and if target has llc or if user requires an always coherent
> >>> mapping.
> >>>
> >>> Cc: Matthew Auld <matthew.auld@intel.com>
> >>> Cc: CQ Tang <cq.tang@intel.com>
> >>> Suggested-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
> >>> Signed-off-by: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>
> >>> ---
> >>>    drivers/gpu/drm/i915/gt/intel_engine_cs.c    |  3 ++-
> >>>    drivers/gpu/drm/i915/gt/intel_engine_pm.c    |  2 +-
> >>>    drivers/gpu/drm/i915/gt/intel_lrc.c          |  4 +++-
> >>>    drivers/gpu/drm/i915/gt/intel_ring.c         |  9 ++++++---
> >>>    drivers/gpu/drm/i915/gt/selftest_context.c   |  3 ++-
> >>>    drivers/gpu/drm/i915/gt/selftest_hangcheck.c |  4 ++--
> >>>    drivers/gpu/drm/i915/gt/selftest_lrc.c       |  4 +++-
> >>>    drivers/gpu/drm/i915/gt/uc/intel_guc.c       |  4 +++-
> >>>    drivers/gpu/drm/i915/gt/uc/intel_huc.c       |  4 +++-
> >>>    drivers/gpu/drm/i915/i915_drv.h              | 11 +++++++++--
> >>>    drivers/gpu/drm/i915/selftests/igt_spinner.c |  4 ++--
> >>>    11 files changed, 36 insertions(+), 16 deletions(-)
> >>>
> >>> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> >>> index efe935f80c1a..b79568d370f5 100644
> >>> --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> >>> +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> >>> @@ -664,7 +664,8 @@ static int init_status_page(struct intel_engine_cs *engine)
> >>>        if (ret)
> >>>                goto err;
> >>>
> >>> -     vaddr = i915_gem_object_pin_map(obj, I915_MAP_WB);
> >>> +     vaddr = i915_gem_object_pin_map(obj,
> >>> +                                     i915_coherent_map_type(engine->i915, obj, true));
> >>>        if (IS_ERR(vaddr)) {
> >>>                ret = PTR_ERR(vaddr);
> >>>                goto err_unpin;
> >>> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
> >>> index 7c9af86fdb1e..47f4397095e5 100644
> >>> --- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c
> >>> +++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
> >>> @@ -23,7 +23,7 @@ static void dbg_poison_ce(struct intel_context *ce)
> >>>
> >>>        if (ce->state) {
> >>>                struct drm_i915_gem_object *obj = ce->state->obj;
> >>> -             int type = i915_coherent_map_type(ce->engine->i915);
> >>> +             int type = i915_coherent_map_type(ce->engine->i915, obj, true);
> >>>                void *map;
> >>>
> >>>                if (!i915_gem_object_trylock(obj))
> >>> diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
> >>> index e86897cde984..aafe2a4df496 100644
> >>> --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
> >>> +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
> >>> @@ -903,7 +903,9 @@ lrc_pre_pin(struct intel_context *ce,
> >>>        GEM_BUG_ON(!i915_vma_is_pinned(ce->state));
> >>>
> >>>        *vaddr = i915_gem_object_pin_map(ce->state->obj,
> >>> -                                      i915_coherent_map_type(ce->engine->i915) |
> >>> +                                      i915_coherent_map_type(ce->engine->i915,
> >>> +                                                             ce->state->obj,
> >>> +                                                             false) |
> >>>                                         I915_MAP_OVERRIDE);
> >>>
> >>>        return PTR_ERR_OR_ZERO(*vaddr);
> >>> diff --git a/drivers/gpu/drm/i915/gt/intel_ring.c b/drivers/gpu/drm/i915/gt/intel_ring.c
> >>> index aee0a77c77e0..3cf6c7e68108 100644
> >>> --- a/drivers/gpu/drm/i915/gt/intel_ring.c
> >>> +++ b/drivers/gpu/drm/i915/gt/intel_ring.c
> >>> @@ -53,9 +53,12 @@ int intel_ring_pin(struct intel_ring *ring, struct i915_gem_ww_ctx *ww)
> >>>
> >>>        if (i915_vma_is_map_and_fenceable(vma))
> >>>                addr = (void __force *)i915_vma_pin_iomap(vma);
> >>> -     else
> >>> -             addr = i915_gem_object_pin_map(vma->obj,
> >>> -                                            i915_coherent_map_type(vma->vm->i915));
> >>> +     else {
> >>> +             int type = i915_coherent_map_type(vma->vm->i915, vma->obj, false);
> >>> +
> >>> +             addr = i915_gem_object_pin_map(vma->obj, type);
> >>> +     }
> >>> +
> >>>        if (IS_ERR(addr)) {
> >>>                ret = PTR_ERR(addr);
> >>>                goto err_ring;
> >>> diff --git a/drivers/gpu/drm/i915/gt/selftest_context.c b/drivers/gpu/drm/i915/gt/selftest_context.c
> >>> index b9bdd1d23243..26685b927169 100644
> >>> --- a/drivers/gpu/drm/i915/gt/selftest_context.c
> >>> +++ b/drivers/gpu/drm/i915/gt/selftest_context.c
> >>> @@ -88,7 +88,8 @@ static int __live_context_size(struct intel_engine_cs *engine)
> >>>                goto err;
> >>>
> >>>        vaddr = i915_gem_object_pin_map_unlocked(ce->state->obj,
> >>> -                                              i915_coherent_map_type(engine->i915));
> >>> +                                              i915_coherent_map_type(engine->i915,
> >>> +                                                                     ce->state->obj, false));
> >>>        if (IS_ERR(vaddr)) {
> >>>                err = PTR_ERR(vaddr);
> >>>                intel_context_unpin(ce);
> >>> diff --git a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
> >>> index 746985971c3a..5b63d4df8c93 100644
> >>> --- a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
> >>> +++ b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
> >>> @@ -69,7 +69,7 @@ static int hang_init(struct hang *h, struct intel_gt *gt)
> >>>        h->seqno = memset(vaddr, 0xff, PAGE_SIZE);
> >>>
> >>>        vaddr = i915_gem_object_pin_map_unlocked(h->obj,
> >>> -                                              i915_coherent_map_type(gt->i915));
> >>> +                                              i915_coherent_map_type(gt->i915, h->obj, false));
> >>>        if (IS_ERR(vaddr)) {
> >>>                err = PTR_ERR(vaddr);
> >>>                goto err_unpin_hws;
> >>> @@ -130,7 +130,7 @@ hang_create_request(struct hang *h, struct intel_engine_cs *engine)
> >>>                return ERR_CAST(obj);
> >>>        }
> >>>
> >>> -     vaddr = i915_gem_object_pin_map_unlocked(obj, i915_coherent_map_type(gt->i915));
> >>> +     vaddr = i915_gem_object_pin_map_unlocked(obj, i915_coherent_map_type(gt->i915, obj, false));
> >>>        if (IS_ERR(vaddr)) {
> >>>                i915_gem_object_put(obj);
> >>>                i915_vm_put(vm);
> >>> diff --git a/drivers/gpu/drm/i915/gt/selftest_lrc.c b/drivers/gpu/drm/i915/gt/selftest_lrc.c
> >>> index 85e7df6a5123..d8f6623524e8 100644
> >>> --- a/drivers/gpu/drm/i915/gt/selftest_lrc.c
> >>> +++ b/drivers/gpu/drm/i915/gt/selftest_lrc.c
> >>> @@ -1221,7 +1221,9 @@ static int compare_isolation(struct intel_engine_cs *engine,
> >>>        }
> >>>
> >>>        lrc = i915_gem_object_pin_map_unlocked(ce->state->obj,
> >>> -                                   i915_coherent_map_type(engine->i915));
> >>> +                                            i915_coherent_map_type(engine->i915,
> >>> +                                                                   ce->state->obj,
> >>> +                                                                   false));
> >>>        if (IS_ERR(lrc)) {
> >>>                err = PTR_ERR(lrc);
> >>>                goto err_B1;
> >>> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.c b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
> >>> index 78305b2ec89d..adae04c47aab 100644
> >>> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc.c
> >>> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
> >>> @@ -682,7 +682,9 @@ int intel_guc_allocate_and_map_vma(struct intel_guc *guc, u32 size,
> >>>        if (IS_ERR(vma))
> >>>                return PTR_ERR(vma);
> >>>
> >>> -     vaddr = i915_gem_object_pin_map_unlocked(vma->obj, I915_MAP_WB);
> >>> +     vaddr = i915_gem_object_pin_map_unlocked(vma->obj,
> >>> +                                              i915_coherent_map_type(guc_to_gt(guc)->i915,
> >>> +                                                                     vma->obj, true));
> >>>        if (IS_ERR(vaddr)) {
> >>>                i915_vma_unpin_and_release(&vma, 0);
> >>>                return PTR_ERR(vaddr);
> >>> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_huc.c b/drivers/gpu/drm/i915/gt/uc/intel_huc.c
> >>> index 2126dd81ac38..56d2144dc6a0 100644
> >>> --- a/drivers/gpu/drm/i915/gt/uc/intel_huc.c
> >>> +++ b/drivers/gpu/drm/i915/gt/uc/intel_huc.c
> >>> @@ -82,7 +82,9 @@ static int intel_huc_rsa_data_create(struct intel_huc *huc)
> >>>        if (IS_ERR(vma))
> >>>                return PTR_ERR(vma);
> >>>
> >>> -     vaddr = i915_gem_object_pin_map_unlocked(vma->obj, I915_MAP_WB);
> >>> +     vaddr = i915_gem_object_pin_map_unlocked(vma->obj,
> >>> +                                              i915_coherent_map_type(gt->i915,
> >>> +                                                                     vma->obj, true));
> >>>        if (IS_ERR(vaddr)) {
> >>>                i915_vma_unpin_and_release(&vma, 0);
> >>>                return PTR_ERR(vaddr);
> >>> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> >>> index 69e43bf91a15..2abbc06712a4 100644
> >>> --- a/drivers/gpu/drm/i915/i915_drv.h
> >>> +++ b/drivers/gpu/drm/i915/i915_drv.h
> >>> @@ -78,6 +78,7 @@
> >>>    #include "gem/i915_gem_context_types.h"
> >>>    #include "gem/i915_gem_shrinker.h"
> >>>    #include "gem/i915_gem_stolen.h"
> >>> +#include "gem/i915_gem_lmem.h"
> >>>
> >>>    #include "gt/intel_engine.h"
> >>>    #include "gt/intel_gt_types.h"
> >>> @@ -1921,9 +1922,15 @@ static inline int intel_hws_csb_write_index(struct drm_i915_private *i915)
> >>>    }
> >>>
> >>>    static inline enum i915_map_type
> >>> -i915_coherent_map_type(struct drm_i915_private *i915)
> >>> +i915_coherent_map_type(struct drm_i915_private *i915,
> >>> +                    struct drm_i915_gem_object *obj, bool always_coherent)
> >>>    {
> >>> -     return HAS_LLC(i915) ? I915_MAP_WB : I915_MAP_WC;
> >>> +     if (i915_gem_object_is_lmem(obj))
> >>> +             return I915_MAP_WC;
> >>> +     if (HAS_LLC(i915) || always_coherent)
> >>> +             return I915_MAP_WB;
> >>> +     else
> >>> +             return I915_MAP_WC;
> >>
> >> Seems this patch is doing two things.
> >>
> >> First it is adding lmem support to this helper by always returning WC
> >> for lmem objects.
> >>
> >> Secondly it is introducing an idea of "always coherent" in a helper
> >> called i915_coherent_map_type. Could someone explain what is coherent vs
> >> always coherent?
> >>
> >> And also, why is always coherent happy with WB? Sounds counter intuitive
> >> to me.
> >
> > All this does is try to keep the existing behaviour intact, whilst
> > also ensuring that all lmem objects are mapped using only WC, no
> > matter what. The always_coherent=true thing is for the existing places
> > where we sometimes map the object using WB, without first considering
> > whether the device has the fast shared LLC vs snooping. Yes, it's
> > slightly ugly :)
>
> Not fully following - if we had to write kerneldoc for always_coherent
> input argument - what it would say?

@always_coherent - If true, we should always try to map the object
using WB. If false, we should only map as WB if the device supports the
fast shared LLC; on snooped devices we will map using WC.
Note that if the resource is lmem then we will always map as WC,
regardless of the value of always_coherent, since that's all we
currently support.
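
As a rough illustration of that description, here is a small standalone
model of the decision. It is only a sketch, not the kernel helper: enum
map_type, coherent_map_type() and the three boolean inputs are made-up
stand-ins for I915_MAP_WB/I915_MAP_WC, i915_gem_object_is_lmem(obj),
HAS_LLC(i915) and the new always_coherent argument.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-ins for the driver's map types (not the kernel enum). */
enum map_type { MAP_WB, MAP_WC };

/*
 * Hypothetical model of the patched helper. The booleans replace
 * i915_gem_object_is_lmem(obj), HAS_LLC(i915) and always_coherent.
 */
static enum map_type coherent_map_type(bool is_lmem, bool has_llc,
				       bool always_coherent)
{
	if (is_lmem)
		return MAP_WC;	/* lmem is always mapped WC */
	if (has_llc || always_coherent)
		return MAP_WB;	/* shared LLC, or caller insists on WB */
	return MAP_WC;		/* snooped device, no WB request */
}

/* Walk all eight input combinations and check them against the rule. */
static void check_truth_table(void)
{
	for (int i = 0; i < 8; i++) {
		bool is_lmem = i & 4, has_llc = i & 2, always = i & 1;
		enum map_type want =
			(!is_lmem && (has_llc || always)) ? MAP_WB : MAP_WC;

		assert(coherent_map_type(is_lmem, has_llc, always) == want);
	}
}
```

In this model the only case where always_coherent changes the result is
a non-lmem object on a device without the shared LLC: true gives WB,
false gives WC.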

Maybe the naming is poor?

>
> Regards,
>
> Tvrtko

* Re: [Intel-gfx] [PATCH 11/19] drm/i915: Update the helper to set correct mapping
  2021-04-15  9:23           ` Matthew Auld
@ 2021-04-15 11:05             ` Tvrtko Ursulin
  -1 siblings, 0 replies; 130+ messages in thread
From: Tvrtko Ursulin @ 2021-04-15 11:05 UTC (permalink / raw)
  To: Matthew Auld; +Cc: Intel Graphics Development, Matthew Auld, ML dri-devel


On 15/04/2021 10:23, Matthew Auld wrote:
> On Thu, 15 Apr 2021 at 09:21, Tvrtko Ursulin
> <tvrtko.ursulin@linux.intel.com> wrote:
>>
>>
>> On 14/04/2021 17:20, Matthew Auld wrote:
>>> On Wed, 14 Apr 2021 at 16:22, Tvrtko Ursulin
>>> <tvrtko.ursulin@linux.intel.com> wrote:
>>>>
>>>>
>>>> On 12/04/2021 10:05, Matthew Auld wrote:
>>>>> From: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>
>>>>>
>>>>> Determine the possible coherent map type based on object location,
>>>>> and if target has llc or if user requires an always coherent
>>>>> mapping.
>>>>>
>>>>> Cc: Matthew Auld <matthew.auld@intel.com>
>>>>> Cc: CQ Tang <cq.tang@intel.com>
>>>>> Suggested-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
>>>>> Signed-off-by: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>
>>>>> ---
>>>>>     drivers/gpu/drm/i915/gt/intel_engine_cs.c    |  3 ++-
>>>>>     drivers/gpu/drm/i915/gt/intel_engine_pm.c    |  2 +-
>>>>>     drivers/gpu/drm/i915/gt/intel_lrc.c          |  4 +++-
>>>>>     drivers/gpu/drm/i915/gt/intel_ring.c         |  9 ++++++---
>>>>>     drivers/gpu/drm/i915/gt/selftest_context.c   |  3 ++-
>>>>>     drivers/gpu/drm/i915/gt/selftest_hangcheck.c |  4 ++--
>>>>>     drivers/gpu/drm/i915/gt/selftest_lrc.c       |  4 +++-
>>>>>     drivers/gpu/drm/i915/gt/uc/intel_guc.c       |  4 +++-
>>>>>     drivers/gpu/drm/i915/gt/uc/intel_huc.c       |  4 +++-
>>>>>     drivers/gpu/drm/i915/i915_drv.h              | 11 +++++++++--
>>>>>     drivers/gpu/drm/i915/selftests/igt_spinner.c |  4 ++--
>>>>>     11 files changed, 36 insertions(+), 16 deletions(-)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
>>>>> index efe935f80c1a..b79568d370f5 100644
>>>>> --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
>>>>> +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
>>>>> @@ -664,7 +664,8 @@ static int init_status_page(struct intel_engine_cs *engine)
>>>>>         if (ret)
>>>>>                 goto err;
>>>>>
>>>>> -     vaddr = i915_gem_object_pin_map(obj, I915_MAP_WB);
>>>>> +     vaddr = i915_gem_object_pin_map(obj,
>>>>> +                                     i915_coherent_map_type(engine->i915, obj, true));
>>>>>         if (IS_ERR(vaddr)) {
>>>>>                 ret = PTR_ERR(vaddr);
>>>>>                 goto err_unpin;
>>>>> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
>>>>> index 7c9af86fdb1e..47f4397095e5 100644
>>>>> --- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c
>>>>> +++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
>>>>> @@ -23,7 +23,7 @@ static void dbg_poison_ce(struct intel_context *ce)
>>>>>
>>>>>         if (ce->state) {
>>>>>                 struct drm_i915_gem_object *obj = ce->state->obj;
>>>>> -             int type = i915_coherent_map_type(ce->engine->i915);
>>>>> +             int type = i915_coherent_map_type(ce->engine->i915, obj, true);
>>>>>                 void *map;
>>>>>
>>>>>                 if (!i915_gem_object_trylock(obj))
>>>>> diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
>>>>> index e86897cde984..aafe2a4df496 100644
>>>>> --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
>>>>> +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
>>>>> @@ -903,7 +903,9 @@ lrc_pre_pin(struct intel_context *ce,
>>>>>         GEM_BUG_ON(!i915_vma_is_pinned(ce->state));
>>>>>
>>>>>         *vaddr = i915_gem_object_pin_map(ce->state->obj,
>>>>> -                                      i915_coherent_map_type(ce->engine->i915) |
>>>>> +                                      i915_coherent_map_type(ce->engine->i915,
>>>>> +                                                             ce->state->obj,
>>>>> +                                                             false) |
>>>>>                                          I915_MAP_OVERRIDE);
>>>>>
>>>>>         return PTR_ERR_OR_ZERO(*vaddr);
>>>>> diff --git a/drivers/gpu/drm/i915/gt/intel_ring.c b/drivers/gpu/drm/i915/gt/intel_ring.c
>>>>> index aee0a77c77e0..3cf6c7e68108 100644
>>>>> --- a/drivers/gpu/drm/i915/gt/intel_ring.c
>>>>> +++ b/drivers/gpu/drm/i915/gt/intel_ring.c
>>>>> @@ -53,9 +53,12 @@ int intel_ring_pin(struct intel_ring *ring, struct i915_gem_ww_ctx *ww)
>>>>>
>>>>>         if (i915_vma_is_map_and_fenceable(vma))
>>>>>                 addr = (void __force *)i915_vma_pin_iomap(vma);
>>>>> -     else
>>>>> -             addr = i915_gem_object_pin_map(vma->obj,
>>>>> -                                            i915_coherent_map_type(vma->vm->i915));
>>>>> +     else {
>>>>> +             int type = i915_coherent_map_type(vma->vm->i915, vma->obj, false);
>>>>> +
>>>>> +             addr = i915_gem_object_pin_map(vma->obj, type);
>>>>> +     }
>>>>> +
>>>>>         if (IS_ERR(addr)) {
>>>>>                 ret = PTR_ERR(addr);
>>>>>                 goto err_ring;
>>>>> diff --git a/drivers/gpu/drm/i915/gt/selftest_context.c b/drivers/gpu/drm/i915/gt/selftest_context.c
>>>>> index b9bdd1d23243..26685b927169 100644
>>>>> --- a/drivers/gpu/drm/i915/gt/selftest_context.c
>>>>> +++ b/drivers/gpu/drm/i915/gt/selftest_context.c
>>>>> @@ -88,7 +88,8 @@ static int __live_context_size(struct intel_engine_cs *engine)
>>>>>                 goto err;
>>>>>
>>>>>         vaddr = i915_gem_object_pin_map_unlocked(ce->state->obj,
>>>>> -                                              i915_coherent_map_type(engine->i915));
>>>>> +                                              i915_coherent_map_type(engine->i915,
>>>>> +                                                                     ce->state->obj, false));
>>>>>         if (IS_ERR(vaddr)) {
>>>>>                 err = PTR_ERR(vaddr);
>>>>>                 intel_context_unpin(ce);
>>>>> diff --git a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
>>>>> index 746985971c3a..5b63d4df8c93 100644
>>>>> --- a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
>>>>> +++ b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
>>>>> @@ -69,7 +69,7 @@ static int hang_init(struct hang *h, struct intel_gt *gt)
>>>>>         h->seqno = memset(vaddr, 0xff, PAGE_SIZE);
>>>>>
>>>>>         vaddr = i915_gem_object_pin_map_unlocked(h->obj,
>>>>> -                                              i915_coherent_map_type(gt->i915));
>>>>> +                                              i915_coherent_map_type(gt->i915, h->obj, false));
>>>>>         if (IS_ERR(vaddr)) {
>>>>>                 err = PTR_ERR(vaddr);
>>>>>                 goto err_unpin_hws;
>>>>> @@ -130,7 +130,7 @@ hang_create_request(struct hang *h, struct intel_engine_cs *engine)
>>>>>                 return ERR_CAST(obj);
>>>>>         }
>>>>>
>>>>> -     vaddr = i915_gem_object_pin_map_unlocked(obj, i915_coherent_map_type(gt->i915));
>>>>> +     vaddr = i915_gem_object_pin_map_unlocked(obj, i915_coherent_map_type(gt->i915, obj, false));
>>>>>         if (IS_ERR(vaddr)) {
>>>>>                 i915_gem_object_put(obj);
>>>>>                 i915_vm_put(vm);
>>>>> diff --git a/drivers/gpu/drm/i915/gt/selftest_lrc.c b/drivers/gpu/drm/i915/gt/selftest_lrc.c
>>>>> index 85e7df6a5123..d8f6623524e8 100644
>>>>> --- a/drivers/gpu/drm/i915/gt/selftest_lrc.c
>>>>> +++ b/drivers/gpu/drm/i915/gt/selftest_lrc.c
>>>>> @@ -1221,7 +1221,9 @@ static int compare_isolation(struct intel_engine_cs *engine,
>>>>>         }
>>>>>
>>>>>         lrc = i915_gem_object_pin_map_unlocked(ce->state->obj,
>>>>> -                                   i915_coherent_map_type(engine->i915));
>>>>> +                                            i915_coherent_map_type(engine->i915,
>>>>> +                                                                   ce->state->obj,
>>>>> +                                                                   false));
>>>>>         if (IS_ERR(lrc)) {
>>>>>                 err = PTR_ERR(lrc);
>>>>>                 goto err_B1;
>>>>> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.c b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
>>>>> index 78305b2ec89d..adae04c47aab 100644
>>>>> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc.c
>>>>> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
>>>>> @@ -682,7 +682,9 @@ int intel_guc_allocate_and_map_vma(struct intel_guc *guc, u32 size,
>>>>>         if (IS_ERR(vma))
>>>>>                 return PTR_ERR(vma);
>>>>>
>>>>> -     vaddr = i915_gem_object_pin_map_unlocked(vma->obj, I915_MAP_WB);
>>>>> +     vaddr = i915_gem_object_pin_map_unlocked(vma->obj,
>>>>> +                                              i915_coherent_map_type(guc_to_gt(guc)->i915,
>>>>> +                                                                     vma->obj, true));
>>>>>         if (IS_ERR(vaddr)) {
>>>>>                 i915_vma_unpin_and_release(&vma, 0);
>>>>>                 return PTR_ERR(vaddr);
>>>>> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_huc.c b/drivers/gpu/drm/i915/gt/uc/intel_huc.c
>>>>> index 2126dd81ac38..56d2144dc6a0 100644
>>>>> --- a/drivers/gpu/drm/i915/gt/uc/intel_huc.c
>>>>> +++ b/drivers/gpu/drm/i915/gt/uc/intel_huc.c
>>>>> @@ -82,7 +82,9 @@ static int intel_huc_rsa_data_create(struct intel_huc *huc)
>>>>>         if (IS_ERR(vma))
>>>>>                 return PTR_ERR(vma);
>>>>>
>>>>> -     vaddr = i915_gem_object_pin_map_unlocked(vma->obj, I915_MAP_WB);
>>>>> +     vaddr = i915_gem_object_pin_map_unlocked(vma->obj,
>>>>> +                                              i915_coherent_map_type(gt->i915,
>>>>> +                                                                     vma->obj, true));
>>>>>         if (IS_ERR(vaddr)) {
>>>>>                 i915_vma_unpin_and_release(&vma, 0);
>>>>>                 return PTR_ERR(vaddr);
>>>>> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
>>>>> index 69e43bf91a15..2abbc06712a4 100644
>>>>> --- a/drivers/gpu/drm/i915/i915_drv.h
>>>>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>>>>> @@ -78,6 +78,7 @@
>>>>>     #include "gem/i915_gem_context_types.h"
>>>>>     #include "gem/i915_gem_shrinker.h"
>>>>>     #include "gem/i915_gem_stolen.h"
>>>>> +#include "gem/i915_gem_lmem.h"
>>>>>
>>>>>     #include "gt/intel_engine.h"
>>>>>     #include "gt/intel_gt_types.h"
>>>>> @@ -1921,9 +1922,15 @@ static inline int intel_hws_csb_write_index(struct drm_i915_private *i915)
>>>>>     }
>>>>>
>>>>>     static inline enum i915_map_type
>>>>> -i915_coherent_map_type(struct drm_i915_private *i915)
>>>>> +i915_coherent_map_type(struct drm_i915_private *i915,
>>>>> +                    struct drm_i915_gem_object *obj, bool always_coherent)
>>>>>     {
>>>>> -     return HAS_LLC(i915) ? I915_MAP_WB : I915_MAP_WC;
>>>>> +     if (i915_gem_object_is_lmem(obj))
>>>>> +             return I915_MAP_WC;
>>>>> +     if (HAS_LLC(i915) || always_coherent)
>>>>> +             return I915_MAP_WB;
>>>>> +     else
>>>>> +             return I915_MAP_WC;
>>>>
>>>> Seems this patch is doing two things.
>>>>
>>>> First it is adding lmem support to this helper by always returning WC
>>>> for lmem objects.
>>>>
>>>> Secondly it is introducing an idea of "always coherent" in a helper
>>>> called i915_coherent_map_type. Could someone explain what is coherent vs
>>>> always coherent?
>>>>
>>>> And also, why is always coherent happy with WB? Sounds counter intuitive
>>>> to me.
>>>
>>> All this does is try to keep the existing behaviour intact, whilst
>>> also ensuring that all lmem objects are mapped using only WC, no
>>> matter what. The always_coherent=true thing is for the existing places
>>> where we sometimes map the object using WB, without first considering
>>> whether the device has the fast shared LLC vs snooping. Yes, it's
>>> slightly ugly :)
>>
>> Not fully following - if we had to write kerneldoc for always_coherent
>> input argument - what it would say?
> 
> @always_coherent - If true we should always try to map the object
> using WB. If false we should only map as WB if the device supports the
> fast shared LLC, in the case of snooped devices we will map use WC.
> Note that If the resource is lmem then we will always map as WC,
> regardless of the value of always_coherent, since that's all we
> currently support.
> 
> Maybe the naming is poor?

Maybe just confusing to me, not sure yet.

So always_coherent is not about how the caller wants to use the mapping, but
about platform knowledge? Or a performance concern for LLC vs snooping
cases? Does WB work (coherently) on snooping platforms?
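
As a reading aid for the question above, the branch order of the quoted helper can be modelled outside the kernel. This is only a sketch: the enum and the boolean parameters are stand-ins for I915_MAP_WB/I915_MAP_WC, i915_gem_object_is_lmem(obj) and HAS_LLC(i915), and it says nothing about whether WB is actually coherent on snooping platforms.

```c
#include <stdbool.h>

/* Stand-ins for the driver's I915_MAP_WB / I915_MAP_WC; illustrative only. */
enum map_type { MAP_WB, MAP_WC };

/*
 * Mirrors the branch order of the quoted i915_coherent_map_type():
 * an lmem object always gets WC, otherwise LLC or always_coherent
 * selects WB, and everything else falls back to WC.
 * The booleans replace i915_gem_object_is_lmem(obj) and HAS_LLC(i915)
 * so that the sketch builds without kernel headers.
 */
static enum map_type coherent_map_type(bool is_lmem, bool has_llc,
				       bool always_coherent)
{
	if (is_lmem)
		return MAP_WC;
	if (has_llc || always_coherent)
		return MAP_WB;
	return MAP_WC;
}
```

Read this way, always_coherent only changes the result in one case: a non-lmem object on a device without LLC, where true yields WB and false yields WC.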

Regards,

Tvrtko
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [Intel-gfx] [PATCH 06/19] drm/i915/stolen: pass the allocation flags
  2021-04-14 15:09     ` Tvrtko Ursulin
@ 2021-04-16 13:53       ` Matthew Auld
  -1 siblings, 0 replies; 130+ messages in thread
From: Matthew Auld @ 2021-04-16 13:53 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx; +Cc: dri-devel

On 14/04/2021 16:09, Tvrtko Ursulin wrote:
> 
> On 12/04/2021 10:05, Matthew Auld wrote:
>> From: CQ Tang <cq.tang@intel.com>
>>
>> Stolen memory is always allocated as physically contiguous pages, mark
>> the object flags as such.
>>
>> Signed-off-by: CQ Tang <cq.tang@intel.com>
>> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
>> ---
>>   drivers/gpu/drm/i915/gem/i915_gem_stolen.c | 10 ++++++----
>>   1 file changed, 6 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_stolen.c 
>> b/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
>> index f713eabb7671..49a2dfcc8ba7 100644
>> --- a/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
>> @@ -633,14 +633,15 @@ static const struct drm_i915_gem_object_ops 
>> i915_gem_object_stolen_ops = {
>>   static int __i915_gem_object_create_stolen(struct 
>> intel_memory_region *mem,
>>                          struct drm_i915_gem_object *obj,
>> -                       struct drm_mm_node *stolen)
>> +                       struct drm_mm_node *stolen,
>> +                       unsigned int flags)
>>   {
>>       static struct lock_class_key lock_class;
>>       unsigned int cache_level;
>>       int err;
>>       drm_gem_private_object_init(&mem->i915->drm, &obj->base, 
>> stolen->size);
>> -    i915_gem_object_init(obj, &i915_gem_object_stolen_ops, 
>> &lock_class, 0);
>> +    i915_gem_object_init(obj, &i915_gem_object_stolen_ops, 
>> &lock_class, flags);
>>       obj->stolen = stolen;
>>       obj->read_domains = I915_GEM_DOMAIN_CPU | I915_GEM_DOMAIN_GTT;
>> @@ -682,7 +683,7 @@ static int _i915_gem_object_stolen_init(struct 
>> intel_memory_region *mem,
>>       if (ret)
>>           goto err_free;
>> -    ret = __i915_gem_object_create_stolen(mem, obj, stolen);
>> +    ret = __i915_gem_object_create_stolen(mem, obj, stolen, flags);
>>       if (ret)
>>           goto err_remove;
>> @@ -840,7 +841,8 @@ 
>> i915_gem_object_create_stolen_for_preallocated(struct drm_i915_private 
>> *i915,
>>           goto err_stolen;
>>       }
>> -    ret = __i915_gem_object_create_stolen(mem, obj, stolen);
>> +    ret = __i915_gem_object_create_stolen(mem, obj, stolen,
>> +                          I915_BO_ALLOC_CONTIGUOUS);
>>       if (ret)
>>           goto err_object_free;
>>
> 
> Are all stolen objects always contiguous or only ones allocated by 
> i915_gem_object_create_stolen_for_preallocated? If former should 
> __i915_gem_object_create_stolen just set the flag without the need to 
> pass it in?

Yes, all stolen objects are physically contiguous. Agreed, moving the
I915_BO_ALLOC_CONTIGUOUS into __i915_gem_object_create_stolen() makes
more sense here.
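
A rough illustration of the shape agreed on here, assuming the callee sets the flag itself and the extra parameter goes away. The struct, the flag value and the function name are hypothetical stand-ins, not the driver's definitions, and this is not the follow-up patch.

```c
#include <stdint.h>

/* Illustrative stand-in; the real flag is defined in the i915 headers. */
#define BO_ALLOC_CONTIGUOUS (1u << 0)

/* Minimal mock of the object fields this sketch touches. */
struct mock_obj {
	unsigned int flags;
	uint64_t size;
};

/*
 * Hypothetical model of __i915_gem_object_create_stolen() after the
 * suggested change: callers no longer pass flags, because every stolen
 * object is physically contiguous and the callee can mark it as such.
 */
static int create_stolen(struct mock_obj *obj, uint64_t size)
{
	obj->size = size;
	obj->flags = BO_ALLOC_CONTIGUOUS;
	return 0;
}
```

With this shape, neither of the two call sites in the quoted diff would need to know about the flag.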

> 
> Regards,
> 
> Tvrtko

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [Intel-gfx] [PATCH 12/19] drm/i915/lmem: Bypass aperture when lmem is available
  2021-04-14 15:33     ` Tvrtko Ursulin
@ 2021-04-16 14:25       ` Matthew Auld
  -1 siblings, 0 replies; 130+ messages in thread
From: Matthew Auld @ 2021-04-16 14:25 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx
  Cc: Daniel Vetter, dri-devel, Chris P Wilson, Dhinakaran Pandiyan

On 14/04/2021 16:33, Tvrtko Ursulin wrote:
> 
> On 12/04/2021 10:05, Matthew Auld wrote:
>> From: Anusha Srivatsa <anusha.srivatsa@intel.com>
>>
>> In the scenario where local memory is available, we have
>> rely on CPU access via lmem directly instead of aperture.
>>
>> v2:
>> gmch is only relevant for much older hw, therefore we can drop the
>> has_aperture check since it should always be present on such platforms.
>> (Chris)
>>
>> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
>> Cc: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
>> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
>> Cc: Chris P Wilson <chris.p.wilson@intel.com>
>> Cc: Daniel Vetter <daniel.vetter@intel.com>
>> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
>> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
>> Cc: CQ Tang <cq.tang@intel.com>
>> Signed-off-by: Anusha Srivatsa <anusha.srivatsa@intel.com>
>> ---
>>   drivers/gpu/drm/i915/display/intel_fbdev.c | 22 +++++++++++++++-------
>>   drivers/gpu/drm/i915/gem/i915_gem_lmem.c   | 15 +++++++++++++++
>>   drivers/gpu/drm/i915/gem/i915_gem_lmem.h   |  5 +++++
>>   drivers/gpu/drm/i915/i915_vma.c            | 19 +++++++++++++------
>>   4 files changed, 48 insertions(+), 13 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/display/intel_fbdev.c 
>> b/drivers/gpu/drm/i915/display/intel_fbdev.c
>> index 2b37959da747..4af40229f5ec 100644
>> --- a/drivers/gpu/drm/i915/display/intel_fbdev.c
>> +++ b/drivers/gpu/drm/i915/display/intel_fbdev.c
>> @@ -139,14 +139,22 @@ static int intelfb_alloc(struct drm_fb_helper 
>> *helper,
>>       size = mode_cmd.pitches[0] * mode_cmd.height;
>>       size = PAGE_ALIGN(size);
>> -    /* If the FB is too big, just don't use it since fbdev is not very
>> -     * important and we should probably use that space with FBC or other
>> -     * features. */
>>       obj = ERR_PTR(-ENODEV);
>> -    if (size * 2 < dev_priv->stolen_usable_size)
>> -        obj = i915_gem_object_create_stolen(dev_priv, size);
>> -    if (IS_ERR(obj))
>> -        obj = i915_gem_object_create_shmem(dev_priv, size);
>> +    if (HAS_LMEM(dev_priv)) {
>> +        obj = i915_gem_object_create_lmem(dev_priv, size,
>> +                          I915_BO_ALLOC_CONTIGUOUS);
> 
> Has to be contiguous? Question for display experts I guess.
> 
> [Comes back later.] Ah for iomap? Put a comment to that effect perhaps?

I don't think it has to be, since we could in theory just use pin_map()
underneath, which can already deal with non-contiguous chunks of lmem,
although that might bring in ww locking. I think for now just add a
comment and mark this as XXX, and potentially revisit as a follow-up?

> 
>> +    } else {
>> +        /*
>> +         * If the FB is too big, just don't use it since fbdev is not 
>> very
>> +         * important and we should probably use that space with FBC 
>> or other
>> +         * features.
>> +         */
>> +        if (size * 2 < dev_priv->stolen_usable_size)
>> +            obj = i915_gem_object_create_stolen(dev_priv, size);
>> +        if (IS_ERR(obj))
>> +            obj = i915_gem_object_create_shmem(dev_priv, size);
>> +    }
> 
> Could we keep the IS_ERR ordered allocation order to save having to 
> re-indent? Bike shed so optional..
> 
>> +
>>       if (IS_ERR(obj)) {
>>           drm_err(&dev_priv->drm, "failed to allocate framebuffer\n");
>>           return PTR_ERR(obj);
>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_lmem.c 
>> b/drivers/gpu/drm/i915/gem/i915_gem_lmem.c
>> index 017db8f71130..f44bdd08f7cb 100644
>> --- a/drivers/gpu/drm/i915/gem/i915_gem_lmem.c
>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_lmem.c
>> @@ -17,6 +17,21 @@ const struct drm_i915_gem_object_ops 
>> i915_gem_lmem_obj_ops = {
>>       .release = i915_gem_object_release_memory_region,
>>   };
>> +void __iomem *
>> +i915_gem_object_lmem_io_map(struct drm_i915_gem_object *obj,
>> +                unsigned long n,
>> +                unsigned long size)
>> +{
>> +    resource_size_t offset;
>> +
>> +    GEM_BUG_ON(!i915_gem_object_is_contiguous(obj));
>> +
>> +    offset = i915_gem_object_get_dma_address(obj, n);
>> +    offset -= obj->mm.region->region.start;
>> +
>> +    return io_mapping_map_wc(&obj->mm.region->iomap, offset, size);
>> +}
>> +
>>   bool i915_gem_object_is_lmem(struct drm_i915_gem_object *obj)
>>   {
>>       struct intel_memory_region *mr = obj->mm.region;
>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_lmem.h 
>> b/drivers/gpu/drm/i915/gem/i915_gem_lmem.h
>> index 036d53c01de9..fac6bc5a5ebb 100644
>> --- a/drivers/gpu/drm/i915/gem/i915_gem_lmem.h
>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_lmem.h
>> @@ -14,6 +14,11 @@ struct intel_memory_region;
>>   extern const struct drm_i915_gem_object_ops i915_gem_lmem_obj_ops;
>> +void __iomem *
>> +i915_gem_object_lmem_io_map(struct drm_i915_gem_object *obj,
>> +                unsigned long n,
>> +                unsigned long size);
>> +
>>   bool i915_gem_object_is_lmem(struct drm_i915_gem_object *obj);
>>   struct drm_i915_gem_object *
>> diff --git a/drivers/gpu/drm/i915/i915_vma.c 
>> b/drivers/gpu/drm/i915/i915_vma.c
>> index 07490db51cdc..e24d33aecac4 100644
>> --- a/drivers/gpu/drm/i915/i915_vma.c
>> +++ b/drivers/gpu/drm/i915/i915_vma.c
>> @@ -27,6 +27,7 @@
>>   #include "display/intel_frontbuffer.h"
>> +#include "gem/i915_gem_lmem.h"
>>   #include "gt/intel_engine.h"
>>   #include "gt/intel_engine_heartbeat.h"
>>   #include "gt/intel_gt.h"
>> @@ -448,9 +449,11 @@ void __iomem *i915_vma_pin_iomap(struct i915_vma 
>> *vma)
>>       void __iomem *ptr;
>>       int err;
>> -    if (GEM_WARN_ON(!i915_vma_is_map_and_fenceable(vma))) {
>> -        err = -ENODEV;
>> -        goto err;
>> +    if (!i915_gem_object_is_lmem(vma->obj)) {
>> +        if (GEM_WARN_ON(!i915_vma_is_map_and_fenceable(vma))) {
>> +            err = -ENODEV;
>> +            goto err;
>> +        }
>>       }
>>       GEM_BUG_ON(!i915_vma_is_ggtt(vma));
>> @@ -458,9 +461,13 @@ void __iomem *i915_vma_pin_iomap(struct i915_vma 
>> *vma)
>>       ptr = READ_ONCE(vma->iomap);
>>       if (ptr == NULL) {
>> -        ptr = io_mapping_map_wc(&i915_vm_to_ggtt(vma->vm)->iomap,
>> -                    vma->node.start,
>> -                    vma->node.size);
>> +        if (i915_gem_object_is_lmem(vma->obj))
>> +            ptr = i915_gem_object_lmem_io_map(vma->obj, 0,
>> +                              vma->obj->base.size);
> 
> Can the vma size be bigger than the object here? Given how below works 
> of vma->node.size.

I don't know tbh. But in general node.size can definitely be larger than 
vma->size/obj->base.size.

For the iomap version below, it's using the mappable aperture, which
requires reserving a vma node in the mappable part of the GGTT first,
so using node.size here makes sense, since the node reflects the window
into the mappable aperture.

For the lmem case though that might be bogus, because the vma has no
relationship with LMEM_BAR; really it's the object that does, hence why
we use obj->base.size instead. Although it might make more sense to use
pin_map() instead for the lmem case, if that's possible.
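
To make the lmem side concrete: the quoted i915_gem_object_lmem_io_map() derives the mapping offset from the object's DMA address relative to the region start, and sizes the mapping by the object. A standalone sketch of that arithmetic follows. The struct and function names are made up for illustration, and the bounds checks are an added assumption that the quoted helper does not perform.

```c
#include <stdint.h>

/* Mock of the two region fields the calculation needs. */
struct mock_region {
	uint64_t start;	/* DMA address where the region begins */
	uint64_t size;	/* bytes covered by the region's I/O mapping */
};

/*
 * Computes the offset of a contiguous object inside the region's I/O
 * mapping, i.e. dma_addr - region.start as in the quoted helper.
 * Returns 0 and stores the offset on success, or -1 if the object does
 * not fit inside the region (a check added for this sketch only).
 */
static int lmem_io_window(const struct mock_region *mr, uint64_t dma_addr,
			  uint64_t obj_size, uint64_t *offset)
{
	uint64_t off;

	if (dma_addr < mr->start)
		return -1;

	off = dma_addr - mr->start;
	if (off > mr->size || obj_size > mr->size - off)
		return -1;

	*offset = off;
	return 0;
}
```

The window depends only on the object and its region; no GGTT node appears anywhere in the calculation, which is the point made above about node.size.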

> 
>> +        else
>> +            ptr = io_mapping_map_wc(&i915_vm_to_ggtt(vma->vm)->iomap,
>> +                        vma->node.start,
>> +                        vma->node.size);
> 
> Looks a bit odd that this calls the same io_mapping_map_wc as 
> i915_gem_object_lmem_io_map ends up doing. Perhaps that suggests there 
> should be a single helper here but I am not sure what would be elegant.
> 
> Regards,
> 
> Tvrtko
> 
>>           if (ptr == NULL) {
>>               err = -ENOMEM;
>>               goto err;
>>

^ permalink raw reply	[flat|nested] 130+ messages in thread


* Re: [Intel-gfx] [PATCH 03/19] drm/i915: Create stolen memory region from local memory
  2021-04-14 15:01     ` Tvrtko Ursulin
@ 2021-04-16 15:04       ` Matthew Auld
  -1 siblings, 0 replies; 130+ messages in thread
From: Matthew Auld @ 2021-04-16 15:04 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx; +Cc: dri-devel

On 14/04/2021 16:01, Tvrtko Ursulin wrote:
> 
> On 12/04/2021 10:05, Matthew Auld wrote:
>> From: CQ Tang <cq.tang@intel.com>
>>
>> Add "REGION_STOLEN" device info to dg1, create stolen memory
>> region from upper portion of local device memory, starting
>> from DSMBASE.
>>
>> v2:
>>      - s/drm_info/drm_dbg; userspace likely doesn't care about stolen.
>>      - mem->type is only set up after the region probe, so setting the name
>>        as stolen-local or stolen-system based on this value won't work.
>>        Split system vs local stolen setup to fix this.
>>      - kill all the region->devmem/is_devmem stuff. We already
>>        differentiate the different types of stolen so such things
>>        shouldn't be needed anymore.
>>
>> Signed-off-by: CQ Tang <cq.tang@intel.com>
>> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
>> ---
>>   drivers/gpu/drm/i915/gem/i915_gem_stolen.c | 99 +++++++++++++++++++---
>>   drivers/gpu/drm/i915/gem/i915_gem_stolen.h |  3 +
>>   drivers/gpu/drm/i915/i915_pci.c            |  2 +-
>>   drivers/gpu/drm/i915/i915_reg.h            |  1 +
>>   drivers/gpu/drm/i915/intel_memory_region.c |  6 ++
>>   drivers/gpu/drm/i915/intel_memory_region.h |  5 +-
>>   6 files changed, 102 insertions(+), 14 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_stolen.c 
>> b/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
>> index b0597de206de..56dd58bef5ee 100644
>> --- a/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
>> @@ -10,6 +10,7 @@
>>   #include <drm/drm_mm.h>
>>   #include <drm/i915_drm.h>
>> +#include "gem/i915_gem_lmem.h"
>>   #include "gem/i915_gem_region.h"
>>   #include "i915_drv.h"
>>   #include "i915_gem_stolen.h"
>> @@ -121,6 +122,14 @@ static int i915_adjust_stolen(struct 
>> drm_i915_private *i915,
>>           }
>>       }
>> +    /*
>> +     * With device local memory, we don't need to check the address range,
>> +     * this is device memory physical address, could overlap with system
>> +     * memory.
>> +     */
>> +    if (HAS_LMEM(i915))
>> +        return 0;
>> +
>>       /*
>>        * Verify that nothing else uses this physical address. Stolen
>>        * memory should be reserved by the BIOS and hidden from the
>> @@ -374,8 +383,9 @@ static void icl_get_stolen_reserved(struct 
>> drm_i915_private *i915,
>>       }
>>   }
>> -static int i915_gem_init_stolen(struct drm_i915_private *i915)
>> +static int i915_gem_init_stolen(struct intel_memory_region *mem)
>>   {
>> +    struct drm_i915_private *i915 = mem->i915;
>>       struct intel_uncore *uncore = &i915->uncore;
>>       resource_size_t reserved_base, stolen_top;
>>       resource_size_t reserved_total, reserved_size;
>> @@ -396,10 +406,10 @@ static int i915_gem_init_stolen(struct 
>> drm_i915_private *i915)
>>           return 0;
>>       }
>> -    if (resource_size(&intel_graphics_stolen_res) == 0)
>> +    if (resource_size(&mem->region) == 0)
>>           return 0;
>> -    i915->dsm = intel_graphics_stolen_res;
>> +    i915->dsm = mem->region;
>>       if (i915_adjust_stolen(i915, &i915->dsm))
>>           return 0;
>> @@ -684,23 +694,36 @@ static int _i915_gem_object_stolen_init(struct 
>> intel_memory_region *mem,
>>       return ret;
>>   }
>> +struct intel_memory_region *i915_stolen_region(struct drm_i915_private *i915)
>> +{
>> +    if (HAS_LMEM(i915))
>> +        return i915->mm.regions[INTEL_REGION_STOLEN_LMEM];
>> +
>> +    return i915->mm.regions[INTEL_REGION_STOLEN_SMEM];
>> +}
> 
> Could be a bikeshedding comment only - especially since I think this 
> path gets very little used at runtime so it is most likely pointless to 
> fiddle with it, but it just strikes me a bit not fully elegant to do:
> 
> i915_gem_object_create_stolen
>   -> i915_gem_object_create_region
>      -> i915_stolen_region
> 
> And end up in here, when alternative could be at driver init:
> 
> i915->stolen_region_id = HAS_LMEM() ? ... : ...;
> 
> i915_gem_object_create_stolen
>   -> i915_gem_object_create_region(i915->mm.regions[i915->stolen_region_id]);
> 
> Or pointer to region. Would avoid having to export i915_stolen_region as 
> well.
> 
> Or is i915->dsm already the right thing? Because..

I guess we could just have an i915->stolen_region short-cut or something?
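
A rough userspace sketch of what such a short-cut could look like. The 
stolen_region field is hypothetical, as are all model_* names; the only 
thing taken from the patch is the HAS_LMEM() choice between the two stolen 
region ids.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum model_region_id {
	MODEL_REGION_SMEM = 0,
	MODEL_REGION_LMEM,
	MODEL_REGION_STOLEN_SMEM,
	MODEL_REGION_STOLEN_LMEM,
	MODEL_REGION_COUNT,
};

struct model_region {
	const char *name;
};

struct model_i915 {
	bool has_lmem;                       /* models HAS_LMEM(i915) */
	struct model_region *regions[MODEL_REGION_COUNT];
	struct model_region *stolen_region;  /* hypothetical short-cut */
};

/* Resolve the stolen region once, after the regions have been probed. */
static void model_init_stolen_shortcut(struct model_i915 *i915)
{
	enum model_region_id id = i915->has_lmem ?
		MODEL_REGION_STOLEN_LMEM : MODEL_REGION_STOLEN_SMEM;

	/* The chosen region must already exist at this point. */
	assert(i915->regions[id] != NULL);
	i915->stolen_region = i915->regions[id];
}
```

Callers would then read the cached pointer directly, so nothing like 
i915_stolen_region() would need to be exported.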

> 
>> +
>>   struct drm_i915_gem_object *
>>   i915_gem_object_create_stolen(struct drm_i915_private *i915,
>>                     resource_size_t size)
>>   {
>> -    return i915_gem_object_create_region(i915->mm.regions[INTEL_REGION_STOLEN_SMEM],
>> +    return i915_gem_object_create_region(i915_stolen_region(i915),
>>                            size, I915_BO_ALLOC_CONTIGUOUS);
>>   }
>>   static int init_stolen(struct intel_memory_region *mem)
>>   {
>> -    intel_memory_region_set_name(mem, "stolen");
>> +    if (HAS_LMEM(mem->i915)) {
>> +        if (!io_mapping_init_wc(&mem->iomap,
>> +                    mem->io_start,
>> +                    resource_size(&mem->region)))
>> +            return -EIO;
>> +    }
>>       /*
>>        * Initialise stolen early so that we may reserve preallocated
>>        * objects for the BIOS to KMS transition.
>>        */
>> -    return i915_gem_init_stolen(mem->i915);
>> +    return i915_gem_init_stolen(mem);
> 
> ... I find the mem region init paths a bit convoluted, stolen 
> especially, and struggle to figure it out every time.
> 
> For instance we have i915_region_stolen_ops shared between system and 
> local stolen. But then shared vfuncs branch depending on system vs local?

We could split the intel_memory_region ops? Maybe that will make it 
slightly less muddled?

The probing is slightly different, but that's kind of expected since 
it's quite different from the HW pov.

But once we get an intel_memory_region, it should be the same whether 
it's stolen device memory or whatever.
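
As a sketch of the split, assuming one ops table per stolen type instead of 
a shared init that branches on HAS_LMEM(). Everything named model_* is 
invented for illustration, and the two flags only stand in for 
io_mapping_init_wc() and i915_gem_init_stolen() having run, in the order the 
posted patch calls them.

```c
#include <assert.h>
#include <stddef.h>

struct model_region;

struct model_region_ops {
	int (*init)(struct model_region *mem);
};

struct model_region {
	const struct model_region_ops *ops;
	int iomap_ready;  /* stands in for io_mapping_init_wc() having run */
	int dsm_ready;    /* stands in for i915_gem_init_stolen() having run */
};

/* System stolen: no BAR mapping, only the stolen bookkeeping. */
static int model_init_stolen_smem(struct model_region *mem)
{
	mem->dsm_ready = 1;
	return 0;
}

/* Local stolen: set up the mapping first, then the same bookkeeping. */
static int model_init_stolen_lmem(struct model_region *mem)
{
	mem->iomap_ready = 1;
	mem->dsm_ready = 1;
	return 0;
}

static const struct model_region_ops model_stolen_smem_ops = {
	.init = model_init_stolen_smem,
};

static const struct model_region_ops model_stolen_lmem_ops = {
	.init = model_init_stolen_lmem,
};

/* Once the region exists, callers no longer care which type it is. */
static int model_region_init(struct model_region *mem)
{
	assert(mem->ops != NULL && mem->ops->init != NULL);
	return mem->ops->init(mem);
}
```

The type-specific part is confined to the probe and init hooks, and 
everything after that sees a plain region.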

> 
> i915_gem_init_stolen is shared - but which parts of it are relevant for 
> local stolen?

Asking all the difficult questions :)

It's just to populate dsm I think. I can rip that out and then we don't 
call i915_gem_init_stolen() for the stolen device memory path? Maybe 
that will look slightly better?
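
For reference, the range that ends up describing local stolen follows from 
the arithmetic in setup_lmem_stolen(): everything from DSMBASE to the end of 
the LMEM BAR. A userspace sketch of just that calculation, where the 
function and struct names are invented and the numbers in the usage note are 
made up:

```c
#include <assert.h>
#include <stdint.h>

struct model_stolen_range {
	uint64_t base;      /* offset of stolen within local memory */
	uint64_t size;      /* bytes from DSMBASE to the end of the BAR */
	uint64_t io_start;  /* CPU-visible address of the first stolen byte */
};

/*
 * bar_start/bar_len model pci_resource_start()/pci_resource_len() of BAR 2,
 * dsm_base models the value read from GEN12_DSMBASE.
 */
static struct model_stolen_range
model_lmem_stolen_range(uint64_t bar_start, uint64_t bar_len, uint64_t dsm_base)
{
	struct model_stolen_range r;

	/* Stolen has to start inside the BAR for the subtraction to be valid. */
	assert(dsm_base <= bar_len);

	r.base = dsm_base;
	r.size = bar_len - dsm_base;
	r.io_start = bar_start + dsm_base;
	return r;
}
```

For example, a 4 GiB BAR at 0x4000000000 with DSMBASE at 0xFF800000 would 
give an 8 MiB stolen region with io_start 0x40FF800000.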

> 
>>   }
>>   static void release_stolen(struct intel_memory_region *mem)
>> @@ -714,13 +737,65 @@ static const struct intel_memory_region_ops 
>> i915_region_stolen_ops = {
>>       .init_object = _i915_gem_object_stolen_init,
>>   };
>> +static struct intel_memory_region *
>> +setup_lmem_stolen(struct drm_i915_private *i915)
>> +{
>> +    struct intel_uncore *uncore = &i915->uncore;
>> +    struct pci_dev *pdev = i915->drm.pdev;
>> +    struct intel_memory_region *mem;
>> +    resource_size_t io_start;
>> +    resource_size_t lmem_size;
>> +    u64 lmem_base;
>> +
>> +    if (!IS_DGFX(i915))
>> +        return ERR_PTR(-ENODEV);
>> +
>> +    lmem_base = intel_uncore_read64(uncore, GEN12_DSMBASE);
>> +    lmem_size = pci_resource_len(pdev, 2) - lmem_base;
>> +    io_start = pci_resource_start(pdev, 2) + lmem_base;
>> +
>> +    mem = intel_memory_region_create(i915, lmem_base, lmem_size,
>> +                     I915_GTT_PAGE_SIZE_4K, io_start,
>> +                     &i915_region_stolen_ops);
>> +    if (IS_ERR(mem))
>> +        return mem;
>> +
>> +    drm_dbg(&i915->drm, "Stolen Local memory: %pR\n", &mem->region);
>> +    drm_dbg(&i915->drm, "Stolen Local memory IO start: %pa\n",
>> +        &mem->io_start);
> 
> Could these messages be consolidated with the system stolen ones 
> (i915_gem_setup_stolen?) and based off the memory_region data printed 
> from common i915_gem_stolen_setup?
> 
>> +
>> +    intel_memory_region_set_name(mem, "stolen-local");
>> +
>> +    return mem;
>> +}
>> +
>> +static struct intel_memory_region*
> 
> Space before asterisk.
> 
>> +setup_smem_stolen(struct drm_i915_private *i915)
>> +{
>> +    struct intel_memory_region *mem;
>> +
>> +    mem = intel_memory_region_create(i915,
>> +                     intel_graphics_stolen_res.start,
>> +                     resource_size(&intel_graphics_stolen_res),
>> +                     PAGE_SIZE, 0,
>> +                     &i915_region_stolen_ops);
>> +    if (IS_ERR(mem))
>> +        return mem;
>> +
>> +    intel_memory_region_set_name(mem, "stolen-system");
> 
> I assume this name, although changed from the current ("stolen"), is not 
> exported anywhere to matter?

Yeah, it's just for internal use, and some debugfs.
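
The longer names are also why the patch bumps name[8] to name[16]. A quick 
userspace check, assuming the name is formatted into the fixed-size array 
with snprintf()-style truncation (model_set_name is a made-up helper, not 
the driver function):

```c
#include <assert.h>
#include <stdio.h>

/*
 * Format a region name into a fixed-size buffer. Returns 1 if the whole
 * name fit, 0 if snprintf() had to truncate it.
 */
static int model_set_name(char *buf, size_t len, const char *name)
{
	int n = snprintf(buf, len, "%s", name);

	assert(n >= 0);
	return (size_t)n < len;
}
```

"stolen" fits in 8 bytes, but "stolen-system" needs 14 including the 
terminator, so it would be cut short in the old array and fits in the new 
one.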

> 
>> +
>> +    return mem;
>> +}
>> +
>>   struct intel_memory_region *i915_gem_stolen_setup(struct drm_i915_private *i915)
>>   {
>> -    return intel_memory_region_create(i915,
>> -                      intel_graphics_stolen_res.start,
>> -                      resource_size(&intel_graphics_stolen_res),
>> -                      PAGE_SIZE, 0,
>> -                      &i915_region_stolen_ops);
>> +    struct intel_memory_region *mem;
>> +
>> +    mem = setup_lmem_stolen(i915);
>> +    if (mem == ERR_PTR(-ENODEV))
>> +        mem = setup_smem_stolen(i915);
>> +
>> +    return mem;
>>   }
>>   struct drm_i915_gem_object *
>> @@ -728,7 +803,7 @@ 
>> i915_gem_object_create_stolen_for_preallocated(struct drm_i915_private 
>> *i915,
>>                              resource_size_t stolen_offset,
>>                              resource_size_t size)
>>   {
>> -    struct intel_memory_region *mem = i915->mm.regions[INTEL_REGION_STOLEN_SMEM];
>> +    struct intel_memory_region *mem = i915_stolen_region(i915);
>>       struct drm_i915_gem_object *obj;
>>       struct drm_mm_node *stolen;
>>       int ret;
>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_stolen.h 
>> b/drivers/gpu/drm/i915/gem/i915_gem_stolen.h
>> index b03489706796..2d1ce7fec61c 100644
>> --- a/drivers/gpu/drm/i915/gem/i915_gem_stolen.h
>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_stolen.h
>> @@ -22,6 +22,9 @@ int i915_gem_stolen_insert_node_in_range(struct 
>> drm_i915_private *dev_priv,
>>   void i915_gem_stolen_remove_node(struct drm_i915_private *dev_priv,
>>                    struct drm_mm_node *node);
>>   struct intel_memory_region *i915_gem_stolen_setup(struct 
>> drm_i915_private *i915);
>> +
>> +struct intel_memory_region *i915_stolen_region(struct drm_i915_private *i915);
>> +
>>   struct drm_i915_gem_object *
>>   i915_gem_object_create_stolen(struct drm_i915_private *dev_priv,
>>                     resource_size_t size);
>> diff --git a/drivers/gpu/drm/i915/i915_pci.c 
>> b/drivers/gpu/drm/i915/i915_pci.c
>> index 480553746794..53f5d1e6daef 100644
>> --- a/drivers/gpu/drm/i915/i915_pci.c
>> +++ b/drivers/gpu/drm/i915/i915_pci.c
>> @@ -906,7 +906,7 @@ static const struct intel_device_info rkl_info = {
>>   #define GEN12_DGFX_FEATURES \
>>       GEN12_FEATURES, \
>> -    .memory_regions = REGION_SMEM | REGION_LMEM, \
>> +    .memory_regions = REGION_SMEM | REGION_LMEM | REGION_STOLEN_LMEM, \
>>       .has_master_unit_irq = 1, \
>>       .has_llc = 0, \
>>       .has_snoop = 1, \
>> diff --git a/drivers/gpu/drm/i915/i915_reg.h 
>> b/drivers/gpu/drm/i915/i915_reg.h
>> index e087bcd21911..4108f2a7ebfa 100644
>> --- a/drivers/gpu/drm/i915/i915_reg.h
>> +++ b/drivers/gpu/drm/i915/i915_reg.h
>> @@ -12191,6 +12191,7 @@ enum skl_power_gate {
>>   #define GEN12_GLOBAL_MOCS(i)    _MMIO(0x4000 + (i) * 4) /* Global 
>> MOCS regs */
>>   #define GEN12_GSMBASE            _MMIO(0x108100)
>> +#define GEN12_DSMBASE            _MMIO(0x1080C0)
>>   /* gamt regs */
>>   #define GEN8_L3_LRA_1_GPGPU _MMIO(0x4dd4)
>> diff --git a/drivers/gpu/drm/i915/intel_memory_region.c 
>> b/drivers/gpu/drm/i915/intel_memory_region.c
>> index bf837b6bb185..ac90b76a3fa0 100644
>> --- a/drivers/gpu/drm/i915/intel_memory_region.c
>> +++ b/drivers/gpu/drm/i915/intel_memory_region.c
>> @@ -22,6 +22,10 @@ static const struct {
>>           .class = INTEL_MEMORY_STOLEN_SYSTEM,
>>           .instance = 0,
>>       },
>> +    [INTEL_REGION_STOLEN_LMEM] = {
>> +        .class = INTEL_MEMORY_STOLEN_LOCAL,
>> +        .instance = 0,
>> +    },
>>   };
>>   struct intel_memory_region *
>> @@ -278,6 +282,8 @@ int intel_memory_regions_hw_probe(struct 
>> drm_i915_private *i915)
>>           case INTEL_MEMORY_SYSTEM:
>>               mem = i915_gem_shmem_setup(i915);
>>               break;
>> +        case INTEL_MEMORY_STOLEN_LOCAL:
>> +            fallthrough;
>>           case INTEL_MEMORY_STOLEN_SYSTEM:
>>               mem = i915_gem_stolen_setup(i915);
>>               break;
>> diff --git a/drivers/gpu/drm/i915/intel_memory_region.h 
>> b/drivers/gpu/drm/i915/intel_memory_region.h
>> index edd49067c8ca..4c8ec15af55f 100644
>> --- a/drivers/gpu/drm/i915/intel_memory_region.h
>> +++ b/drivers/gpu/drm/i915/intel_memory_region.h
>> @@ -26,18 +26,21 @@ enum intel_memory_type {
>>       INTEL_MEMORY_SYSTEM = 0,
>>       INTEL_MEMORY_LOCAL,
>>       INTEL_MEMORY_STOLEN_SYSTEM,
>> +    INTEL_MEMORY_STOLEN_LOCAL,
>>   };
>>   enum intel_region_id {
>>       INTEL_REGION_SMEM = 0,
>>       INTEL_REGION_LMEM,
>>       INTEL_REGION_STOLEN_SMEM,
>> +    INTEL_REGION_STOLEN_LMEM,
>>       INTEL_REGION_UNKNOWN, /* Should be last */
>>   };
>>   #define REGION_SMEM     BIT(INTEL_REGION_SMEM)
>>   #define REGION_LMEM     BIT(INTEL_REGION_LMEM)
>>   #define REGION_STOLEN_SMEM   BIT(INTEL_REGION_STOLEN_SMEM)
>> +#define REGION_STOLEN_LMEM   BIT(INTEL_REGION_STOLEN_LMEM)
>>   #define I915_ALLOC_MIN_PAGE_SIZE  BIT(0)
>>   #define I915_ALLOC_CONTIGUOUS     BIT(1)
>> @@ -82,7 +85,7 @@ struct intel_memory_region {
>>       u16 type;
>>       u16 instance;
>>       enum intel_region_id id;
>> -    char name[8];
>> +    char name[16];
>>       struct list_head reserved;
>>
> 
> Regards,
> 
> Tvrtko
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [Intel-gfx] [PATCH 03/19] drm/i915: Create stolen memory region from local memory
@ 2021-04-16 15:04       ` Matthew Auld
  0 siblings, 0 replies; 130+ messages in thread
From: Matthew Auld @ 2021-04-16 15:04 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx; +Cc: dri-devel

On 14/04/2021 16:01, Tvrtko Ursulin wrote:
> 
> On 12/04/2021 10:05, Matthew Auld wrote:
>> From: CQ Tang <cq.tang@intel.com>
>>
>> Add "REGION_STOLEN" device info to dg1, create stolen memory
>> region from upper portion of local device memory, starting
>> from DSMBASE.
>>
>> v2:
>>      - s/drm_info/drm_dbg; userspace likely doesn't care about stolen.
>>      - mem->type is only setup after the region probe, so setting the 
>> name
>>        as stolen-local or stolen-system based on this value won't 
>> work. Split
>>        system vs local stolen setup to fix this.
>>      - kill all the region->devmem/is_devmem stuff. We already 
>> differentiate
>>        the different types of stolen so such things shouldn't be needed
>>        anymore.
>>
>> Signed-off-by: CQ Tang <cq.tang@intel.com>
>> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
>> ---
>>   drivers/gpu/drm/i915/gem/i915_gem_stolen.c | 99 +++++++++++++++++++---
>>   drivers/gpu/drm/i915/gem/i915_gem_stolen.h |  3 +
>>   drivers/gpu/drm/i915/i915_pci.c            |  2 +-
>>   drivers/gpu/drm/i915/i915_reg.h            |  1 +
>>   drivers/gpu/drm/i915/intel_memory_region.c |  6 ++
>>   drivers/gpu/drm/i915/intel_memory_region.h |  5 +-
>>   6 files changed, 102 insertions(+), 14 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_stolen.c 
>> b/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
>> index b0597de206de..56dd58bef5ee 100644
>> --- a/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
>> @@ -10,6 +10,7 @@
>>   #include <drm/drm_mm.h>
>>   #include <drm/i915_drm.h>
>> +#include "gem/i915_gem_lmem.h"
>>   #include "gem/i915_gem_region.h"
>>   #include "i915_drv.h"
>>   #include "i915_gem_stolen.h"
>> @@ -121,6 +122,14 @@ static int i915_adjust_stolen(struct 
>> drm_i915_private *i915,
>>           }
>>       }
>> +    /*
>> +     * With device local memory, we don't need to check the address 
>> range,
>> +     * this is device memory physical address, could overlap with system
>> +     * memory.
>> +     */
>> +    if (HAS_LMEM(i915))
>> +        return 0;
>> +
>>       /*
>>        * Verify that nothing else uses this physical address. Stolen
>>        * memory should be reserved by the BIOS and hidden from the
>> @@ -374,8 +383,9 @@ static void icl_get_stolen_reserved(struct 
>> drm_i915_private *i915,
>>       }
>>   }
>> -static int i915_gem_init_stolen(struct drm_i915_private *i915)
>> +static int i915_gem_init_stolen(struct intel_memory_region *mem)
>>   {
>> +    struct drm_i915_private *i915 = mem->i915;
>>       struct intel_uncore *uncore = &i915->uncore;
>>       resource_size_t reserved_base, stolen_top;
>>       resource_size_t reserved_total, reserved_size;
>> @@ -396,10 +406,10 @@ static int i915_gem_init_stolen(struct 
>> drm_i915_private *i915)
>>           return 0;
>>       }
>> -    if (resource_size(&intel_graphics_stolen_res) == 0)
>> +    if (resource_size(&mem->region) == 0)
>>           return 0;
>> -    i915->dsm = intel_graphics_stolen_res;
>> +    i915->dsm = mem->region;
>>       if (i915_adjust_stolen(i915, &i915->dsm))
>>           return 0;
>> @@ -684,23 +694,36 @@ static int _i915_gem_object_stolen_init(struct 
>> intel_memory_region *mem,
>>       return ret;
>>   }
>> +struct intel_memory_region *i915_stolen_region(struct 
>> drm_i915_private *i915)
>> +{
>> +    if (HAS_LMEM(i915))
>> +        return i915->mm.regions[INTEL_REGION_STOLEN_LMEM];
>> +
>> +    return i915->mm.regions[INTEL_REGION_STOLEN_SMEM];
>> +}
> 
> Could be a bikeshedding comment only - especially since I think this 
> path gets very little used at runtime so it is most likely pointless to 
> fiddle with it, but it just strikes me a bit not fully elegant to do:
> 
> i915_gem_object_create_stolen
>   -> i915_gem_object_create_region
>      -> i915_stolen_region
> 
> And end up in here, when alternative could be at driver init:
> 
> i915->stolen_region_id = HAS_LMEM() ? ... : ...;
> 
> i915_gem_object_create_stolen
>   -> 
> i915_gem_object_create_region(i915->mm.regions[i915->stolen_region_id]);
> 
> Or pointer to region. Would avoid having to export i915_stolen_region as 
> well.
> 
> Or is i915->dsm already the right thing? Because..

I guess we could just have an i915->stolen_region short-cut or something?

> 
>> +
>>   struct drm_i915_gem_object *
>>   i915_gem_object_create_stolen(struct drm_i915_private *i915,
>>                     resource_size_t size)
>>   {
>> -    return 
>> i915_gem_object_create_region(i915->mm.regions[INTEL_REGION_STOLEN_SMEM],
>> +    return i915_gem_object_create_region(i915_stolen_region(i915),
>>                            size, I915_BO_ALLOC_CONTIGUOUS);
>>   }
>>   static int init_stolen(struct intel_memory_region *mem)
>>   {
>> -    intel_memory_region_set_name(mem, "stolen");
>> +    if (HAS_LMEM(mem->i915)) {
>> +        if (!io_mapping_init_wc(&mem->iomap,
>> +                    mem->io_start,
>> +                    resource_size(&mem->region)))
>> +            return -EIO;
>> +    }
>>       /*
>>        * Initialise stolen early so that we may reserve preallocated
>>        * objects for the BIOS to KMS transition.
>>        */
>> -    return i915_gem_init_stolen(mem->i915);
>> +    return i915_gem_init_stolen(mem);
> 
> ... I find the mem region init paths a bit convoluted, stolen 
> especially, and struggle to figure it out every time.
> 
> For instance we have i915_region_stolen_ops shared between system and 
> local stolen. But then shared vfuncs branch depending on system vs local?

We could split the intel_memory_region ops? Maybe that will make it 
slightly less muddled?

The probing is slightly different, but that's kind of expected since 
it's quite different from the HW pov.

But once we get an intel_memory_region, it should be the same whether 
it's stolen device memory or whatever.

> 
> i915_gem_init_stolen is shared - but which parts of it are relevant for 
> local stolen?

Asking all the difficult questions :)

It's just to populate dsm I think. I can rip that out and then we don't 
call i915_gem_init_stolen() for the stolen device memory path? Maybe 
that will look slightly better?

> 
>>   }
>>   static void release_stolen(struct intel_memory_region *mem)
>> @@ -714,13 +737,65 @@ static const struct intel_memory_region_ops 
>> i915_region_stolen_ops = {
>>       .init_object = _i915_gem_object_stolen_init,
>>   };
>> +static struct intel_memory_region *
>> +setup_lmem_stolen(struct drm_i915_private *i915)
>> +{
>> +    struct intel_uncore *uncore = &i915->uncore;
>> +    struct pci_dev *pdev = i915->drm.pdev;
>> +    struct intel_memory_region *mem;
>> +    resource_size_t io_start;
>> +    resource_size_t lmem_size;
>> +    u64 lmem_base;
>> +
>> +    if (!IS_DGFX(i915))
>> +        return ERR_PTR(-ENODEV);
>> +
>> +    lmem_base = intel_uncore_read64(uncore, GEN12_DSMBASE);
>> +    lmem_size = pci_resource_len(pdev, 2) - lmem_base;
>> +    io_start = pci_resource_start(pdev, 2) + lmem_base;
>> +
>> +    mem = intel_memory_region_create(i915, lmem_base, lmem_size,
>> +                     I915_GTT_PAGE_SIZE_4K, io_start,
>> +                     &i915_region_stolen_ops);
>> +    if (IS_ERR(mem))
>> +        return mem;
>> +
>> +    drm_dbg(&i915->drm, "Stolen Local memory: %pR\n", &mem->region);
>> +    drm_dbg(&i915->drm, "Stolen Local memory IO start: %pa\n",
>> +        &mem->io_start);
> 
> Could these messages be consolidated with the system stolen ones 
> (i915_gem_setup_stolen?) and based off the memory_region data printed 
> from common i915_gem_stolen_setup?
> 
>> +
>> +    intel_memory_region_set_name(mem, "stolen-local");
>> +
>> +    return mem;
>> +}
>> +
>> +static struct intel_memory_region*
> 
> Space before asterisk.
> 
>> +setup_smem_stolen(struct drm_i915_private *i915)
>> +{
>> +    struct intel_memory_region *mem;
>> +
>> +    mem = intel_memory_region_create(i915,
>> +                     intel_graphics_stolen_res.start,
>> +                     resource_size(&intel_graphics_stolen_res),
>> +                     PAGE_SIZE, 0,
>> +                     &i915_region_stolen_ops);
>> +    if (IS_ERR(mem))
>> +        return mem;
>> +
>> +    intel_memory_region_set_name(mem, "stolen-system");
> 
> I assume this name, although changed from the current ("stolen"), is not 
> exported anywhere to matter?

Yeah, it's just for internal use, and some debugfs.

> 
>> +
>> +    return mem;
>> +}
>> +
>>   struct intel_memory_region *i915_gem_stolen_setup(struct 
>> drm_i915_private *i915)
>>   {
>> -    return intel_memory_region_create(i915,
>> -                      intel_graphics_stolen_res.start,
>> -                      resource_size(&intel_graphics_stolen_res),
>> -                      PAGE_SIZE, 0,
>> -                      &i915_region_stolen_ops);
>> +    struct intel_memory_region *mem;
>> +
>> +    mem = setup_lmem_stolen(i915);
>> +    if (mem == ERR_PTR(-ENODEV))
>> +        mem = setup_smem_stolen(i915);
>> +
>> +    return mem;
>>   }
>>   struct drm_i915_gem_object *
>> @@ -728,7 +803,7 @@ 
>> i915_gem_object_create_stolen_for_preallocated(struct drm_i915_private 
>> *i915,
>>                              resource_size_t stolen_offset,
>>                              resource_size_t size)
>>   {
>> -    struct intel_memory_region *mem = 
>> i915->mm.regions[INTEL_REGION_STOLEN_SMEM];
>> +    struct intel_memory_region *mem = i915_stolen_region(i915);
>>       struct drm_i915_gem_object *obj;
>>       struct drm_mm_node *stolen;
>>       int ret;
>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_stolen.h 
>> b/drivers/gpu/drm/i915/gem/i915_gem_stolen.h
>> index b03489706796..2d1ce7fec61c 100644
>> --- a/drivers/gpu/drm/i915/gem/i915_gem_stolen.h
>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_stolen.h
>> @@ -22,6 +22,9 @@ int i915_gem_stolen_insert_node_in_range(struct 
>> drm_i915_private *dev_priv,
>>   void i915_gem_stolen_remove_node(struct drm_i915_private *dev_priv,
>>                    struct drm_mm_node *node);
>>   struct intel_memory_region *i915_gem_stolen_setup(struct 
>> drm_i915_private *i915);
>> +
>> +struct intel_memory_region *i915_stolen_region(struct 
>> drm_i915_private *i915);
>> +
>>   struct drm_i915_gem_object *
>>   i915_gem_object_create_stolen(struct drm_i915_private *dev_priv,
>>                     resource_size_t size);
>> diff --git a/drivers/gpu/drm/i915/i915_pci.c 
>> b/drivers/gpu/drm/i915/i915_pci.c
>> index 480553746794..53f5d1e6daef 100644
>> --- a/drivers/gpu/drm/i915/i915_pci.c
>> +++ b/drivers/gpu/drm/i915/i915_pci.c
>> @@ -906,7 +906,7 @@ static const struct intel_device_info rkl_info = {
>>   #define GEN12_DGFX_FEATURES \
>>       GEN12_FEATURES, \
>> -    .memory_regions = REGION_SMEM | REGION_LMEM, \
>> +    .memory_regions = REGION_SMEM | REGION_LMEM | REGION_STOLEN_LMEM, \
>>       .has_master_unit_irq = 1, \
>>       .has_llc = 0, \
>>       .has_snoop = 1, \
>> diff --git a/drivers/gpu/drm/i915/i915_reg.h 
>> b/drivers/gpu/drm/i915/i915_reg.h
>> index e087bcd21911..4108f2a7ebfa 100644
>> --- a/drivers/gpu/drm/i915/i915_reg.h
>> +++ b/drivers/gpu/drm/i915/i915_reg.h
>> @@ -12191,6 +12191,7 @@ enum skl_power_gate {
>>   #define GEN12_GLOBAL_MOCS(i)    _MMIO(0x4000 + (i) * 4) /* Global 
>> MOCS regs */
>>   #define GEN12_GSMBASE            _MMIO(0x108100)
>> +#define GEN12_DSMBASE            _MMIO(0x1080C0)
>>   /* gamt regs */
>>   #define GEN8_L3_LRA_1_GPGPU _MMIO(0x4dd4)
>> diff --git a/drivers/gpu/drm/i915/intel_memory_region.c 
>> b/drivers/gpu/drm/i915/intel_memory_region.c
>> index bf837b6bb185..ac90b76a3fa0 100644
>> --- a/drivers/gpu/drm/i915/intel_memory_region.c
>> +++ b/drivers/gpu/drm/i915/intel_memory_region.c
>> @@ -22,6 +22,10 @@ static const struct {
>>           .class = INTEL_MEMORY_STOLEN_SYSTEM,
>>           .instance = 0,
>>       },
>> +    [INTEL_REGION_STOLEN_LMEM] = {
>> +        .class = INTEL_MEMORY_STOLEN_LOCAL,
>> +        .instance = 0,
>> +    },
>>   };
>>   struct intel_memory_region *
>> @@ -278,6 +282,8 @@ int intel_memory_regions_hw_probe(struct 
>> drm_i915_private *i915)
>>           case INTEL_MEMORY_SYSTEM:
>>               mem = i915_gem_shmem_setup(i915);
>>               break;
>> +        case INTEL_MEMORY_STOLEN_LOCAL:
>> +            fallthrough;
>>           case INTEL_MEMORY_STOLEN_SYSTEM:
>>               mem = i915_gem_stolen_setup(i915);
>>               break;
>> diff --git a/drivers/gpu/drm/i915/intel_memory_region.h 
>> b/drivers/gpu/drm/i915/intel_memory_region.h
>> index edd49067c8ca..4c8ec15af55f 100644
>> --- a/drivers/gpu/drm/i915/intel_memory_region.h
>> +++ b/drivers/gpu/drm/i915/intel_memory_region.h
>> @@ -26,18 +26,21 @@ enum intel_memory_type {
>>       INTEL_MEMORY_SYSTEM = 0,
>>       INTEL_MEMORY_LOCAL,
>>       INTEL_MEMORY_STOLEN_SYSTEM,
>> +    INTEL_MEMORY_STOLEN_LOCAL,
>>   };
>>   enum intel_region_id {
>>       INTEL_REGION_SMEM = 0,
>>       INTEL_REGION_LMEM,
>>       INTEL_REGION_STOLEN_SMEM,
>> +    INTEL_REGION_STOLEN_LMEM,
>>       INTEL_REGION_UNKNOWN, /* Should be last */
>>   };
>>   #define REGION_SMEM     BIT(INTEL_REGION_SMEM)
>>   #define REGION_LMEM     BIT(INTEL_REGION_LMEM)
>>   #define REGION_STOLEN_SMEM   BIT(INTEL_REGION_STOLEN_SMEM)
>> +#define REGION_STOLEN_LMEM   BIT(INTEL_REGION_STOLEN_LMEM)
>>   #define I915_ALLOC_MIN_PAGE_SIZE  BIT(0)
>>   #define I915_ALLOC_CONTIGUOUS     BIT(1)
>> @@ -82,7 +85,7 @@ struct intel_memory_region {
>>       u16 type;
>>       u16 instance;
>>       enum intel_region_id id;
>> -    char name[8];
>> +    char name[16];
>>       struct list_head reserved;
>>
> 
> Regards,
> 
> Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [Intel-gfx] [PATCH 11/19] drm/i915: Update the helper to set correct mapping
  2021-04-15 11:05             ` Tvrtko Ursulin
@ 2021-04-19 11:30               ` Matthew Auld
  -1 siblings, 0 replies; 130+ messages in thread
From: Matthew Auld @ 2021-04-19 11:30 UTC (permalink / raw)
  To: Tvrtko Ursulin, Matthew Auld; +Cc: Intel Graphics Development, ML dri-devel

On 15/04/2021 12:05, Tvrtko Ursulin wrote:
> 
> On 15/04/2021 10:23, Matthew Auld wrote:
>> On Thu, 15 Apr 2021 at 09:21, Tvrtko Ursulin
>> <tvrtko.ursulin@linux.intel.com> wrote:
>>>
>>>
>>> On 14/04/2021 17:20, Matthew Auld wrote:
>>>> On Wed, 14 Apr 2021 at 16:22, Tvrtko Ursulin
>>>> <tvrtko.ursulin@linux.intel.com> wrote:
>>>>>
>>>>>
>>>>> On 12/04/2021 10:05, Matthew Auld wrote:
>>>>>> From: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>
>>>>>>
>>>>>> Determine the possible coherent map type based on object location,
>>>>>> and if target has llc or if user requires an always coherent
>>>>>> mapping.
>>>>>>
>>>>>> Cc: Matthew Auld <matthew.auld@intel.com>
>>>>>> Cc: CQ Tang <cq.tang@intel.com>
>>>>>> Suggested-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
>>>>>> Signed-off-by: Venkata Sandeep Dhanalakota 
>>>>>> <venkata.s.dhanalakota@intel.com>
>>>>>> ---
>>>>>>     drivers/gpu/drm/i915/gt/intel_engine_cs.c    |  3 ++-
>>>>>>     drivers/gpu/drm/i915/gt/intel_engine_pm.c    |  2 +-
>>>>>>     drivers/gpu/drm/i915/gt/intel_lrc.c          |  4 +++-
>>>>>>     drivers/gpu/drm/i915/gt/intel_ring.c         |  9 ++++++---
>>>>>>     drivers/gpu/drm/i915/gt/selftest_context.c   |  3 ++-
>>>>>>     drivers/gpu/drm/i915/gt/selftest_hangcheck.c |  4 ++--
>>>>>>     drivers/gpu/drm/i915/gt/selftest_lrc.c       |  4 +++-
>>>>>>     drivers/gpu/drm/i915/gt/uc/intel_guc.c       |  4 +++-
>>>>>>     drivers/gpu/drm/i915/gt/uc/intel_huc.c       |  4 +++-
>>>>>>     drivers/gpu/drm/i915/i915_drv.h              | 11 +++++++++--
>>>>>>     drivers/gpu/drm/i915/selftests/igt_spinner.c |  4 ++--
>>>>>>     11 files changed, 36 insertions(+), 16 deletions(-)
>>>>>>
>>>>>> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c 
>>>>>> b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
>>>>>> index efe935f80c1a..b79568d370f5 100644
>>>>>> --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
>>>>>> +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
>>>>>> @@ -664,7 +664,8 @@ static int init_status_page(struct 
>>>>>> intel_engine_cs *engine)
>>>>>>         if (ret)
>>>>>>                 goto err;
>>>>>>
>>>>>> -     vaddr = i915_gem_object_pin_map(obj, I915_MAP_WB);
>>>>>> +     vaddr = i915_gem_object_pin_map(obj,
>>>>>> +                                     
>>>>>> i915_coherent_map_type(engine->i915, obj, true));
>>>>>>         if (IS_ERR(vaddr)) {
>>>>>>                 ret = PTR_ERR(vaddr);
>>>>>>                 goto err_unpin;
>>>>>> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c 
>>>>>> b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
>>>>>> index 7c9af86fdb1e..47f4397095e5 100644
>>>>>> --- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c
>>>>>> +++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
>>>>>> @@ -23,7 +23,7 @@ static void dbg_poison_ce(struct intel_context *ce)
>>>>>>
>>>>>>         if (ce->state) {
>>>>>>                 struct drm_i915_gem_object *obj = ce->state->obj;
>>>>>> -             int type = i915_coherent_map_type(ce->engine->i915);
>>>>>> +             int type = i915_coherent_map_type(ce->engine->i915, 
>>>>>> obj, true);
>>>>>>                 void *map;
>>>>>>
>>>>>>                 if (!i915_gem_object_trylock(obj))
>>>>>> diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c 
>>>>>> b/drivers/gpu/drm/i915/gt/intel_lrc.c
>>>>>> index e86897cde984..aafe2a4df496 100644
>>>>>> --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
>>>>>> +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
>>>>>> @@ -903,7 +903,9 @@ lrc_pre_pin(struct intel_context *ce,
>>>>>>         GEM_BUG_ON(!i915_vma_is_pinned(ce->state));
>>>>>>
>>>>>>         *vaddr = i915_gem_object_pin_map(ce->state->obj,
>>>>>> -                                      
>>>>>> i915_coherent_map_type(ce->engine->i915) |
>>>>>> +                                      
>>>>>> i915_coherent_map_type(ce->engine->i915,
>>>>>> +                                                             
>>>>>> ce->state->obj,
>>>>>> +                                                             
>>>>>> false) |
>>>>>>                                          I915_MAP_OVERRIDE);
>>>>>>
>>>>>>         return PTR_ERR_OR_ZERO(*vaddr);
>>>>>> diff --git a/drivers/gpu/drm/i915/gt/intel_ring.c 
>>>>>> b/drivers/gpu/drm/i915/gt/intel_ring.c
>>>>>> index aee0a77c77e0..3cf6c7e68108 100644
>>>>>> --- a/drivers/gpu/drm/i915/gt/intel_ring.c
>>>>>> +++ b/drivers/gpu/drm/i915/gt/intel_ring.c
>>>>>> @@ -53,9 +53,12 @@ int intel_ring_pin(struct intel_ring *ring, 
>>>>>> struct i915_gem_ww_ctx *ww)
>>>>>>
>>>>>>         if (i915_vma_is_map_and_fenceable(vma))
>>>>>>                 addr = (void __force *)i915_vma_pin_iomap(vma);
>>>>>> -     else
>>>>>> -             addr = i915_gem_object_pin_map(vma->obj,
>>>>>> -                                            
>>>>>> i915_coherent_map_type(vma->vm->i915));
>>>>>> +     else {
>>>>>> +             int type = i915_coherent_map_type(vma->vm->i915, 
>>>>>> vma->obj, false);
>>>>>> +
>>>>>> +             addr = i915_gem_object_pin_map(vma->obj, type);
>>>>>> +     }
>>>>>> +
>>>>>>         if (IS_ERR(addr)) {
>>>>>>                 ret = PTR_ERR(addr);
>>>>>>                 goto err_ring;
>>>>>> diff --git a/drivers/gpu/drm/i915/gt/selftest_context.c 
>>>>>> b/drivers/gpu/drm/i915/gt/selftest_context.c
>>>>>> index b9bdd1d23243..26685b927169 100644
>>>>>> --- a/drivers/gpu/drm/i915/gt/selftest_context.c
>>>>>> +++ b/drivers/gpu/drm/i915/gt/selftest_context.c
>>>>>> @@ -88,7 +88,8 @@ static int __live_context_size(struct 
>>>>>> intel_engine_cs *engine)
>>>>>>                 goto err;
>>>>>>
>>>>>>         vaddr = i915_gem_object_pin_map_unlocked(ce->state->obj,
>>>>>> -                                              
>>>>>> i915_coherent_map_type(engine->i915));
>>>>>> +                                              
>>>>>> i915_coherent_map_type(engine->i915,
>>>>>> +                                                                     
>>>>>> ce->state->obj, false));
>>>>>>         if (IS_ERR(vaddr)) {
>>>>>>                 err = PTR_ERR(vaddr);
>>>>>>                 intel_context_unpin(ce);
>>>>>> diff --git a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c 
>>>>>> b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
>>>>>> index 746985971c3a..5b63d4df8c93 100644
>>>>>> --- a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
>>>>>> +++ b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
>>>>>> @@ -69,7 +69,7 @@ static int hang_init(struct hang *h, struct 
>>>>>> intel_gt *gt)
>>>>>>         h->seqno = memset(vaddr, 0xff, PAGE_SIZE);
>>>>>>
>>>>>>         vaddr = i915_gem_object_pin_map_unlocked(h->obj,
>>>>>> -                                              
>>>>>> i915_coherent_map_type(gt->i915));
>>>>>> +                                              
>>>>>> i915_coherent_map_type(gt->i915, h->obj, false));
>>>>>>         if (IS_ERR(vaddr)) {
>>>>>>                 err = PTR_ERR(vaddr);
>>>>>>                 goto err_unpin_hws;
>>>>>> @@ -130,7 +130,7 @@ hang_create_request(struct hang *h, struct 
>>>>>> intel_engine_cs *engine)
>>>>>>                 return ERR_CAST(obj);
>>>>>>         }
>>>>>>
>>>>>> -     vaddr = i915_gem_object_pin_map_unlocked(obj, 
>>>>>> i915_coherent_map_type(gt->i915));
>>>>>> +     vaddr = i915_gem_object_pin_map_unlocked(obj, 
>>>>>> i915_coherent_map_type(gt->i915, obj, false));
>>>>>>         if (IS_ERR(vaddr)) {
>>>>>>                 i915_gem_object_put(obj);
>>>>>>                 i915_vm_put(vm);
>>>>>> diff --git a/drivers/gpu/drm/i915/gt/selftest_lrc.c 
>>>>>> b/drivers/gpu/drm/i915/gt/selftest_lrc.c
>>>>>> index 85e7df6a5123..d8f6623524e8 100644
>>>>>> --- a/drivers/gpu/drm/i915/gt/selftest_lrc.c
>>>>>> +++ b/drivers/gpu/drm/i915/gt/selftest_lrc.c
>>>>>> @@ -1221,7 +1221,9 @@ static int compare_isolation(struct 
>>>>>> intel_engine_cs *engine,
>>>>>>         }
>>>>>>
>>>>>>         lrc = i915_gem_object_pin_map_unlocked(ce->state->obj,
>>>>>> -                                   
>>>>>> i915_coherent_map_type(engine->i915));
>>>>>> +                                            
>>>>>> i915_coherent_map_type(engine->i915,
>>>>>> +                                                                   ce->state->obj, 
>>>>>>
>>>>>> +                                                                   false)); 
>>>>>>
>>>>>>         if (IS_ERR(lrc)) {
>>>>>>                 err = PTR_ERR(lrc);
>>>>>>                 goto err_B1;
>>>>>> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.c 
>>>>>> b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
>>>>>> index 78305b2ec89d..adae04c47aab 100644
>>>>>> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc.c
>>>>>> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
>>>>>> @@ -682,7 +682,9 @@ int intel_guc_allocate_and_map_vma(struct 
>>>>>> intel_guc *guc, u32 size,
>>>>>>         if (IS_ERR(vma))
>>>>>>                 return PTR_ERR(vma);
>>>>>>
>>>>>> -     vaddr = i915_gem_object_pin_map_unlocked(vma->obj, 
>>>>>> I915_MAP_WB);
>>>>>> +     vaddr = i915_gem_object_pin_map_unlocked(vma->obj,
>>>>>> +                                              
>>>>>> i915_coherent_map_type(guc_to_gt(guc)->i915,
>>>>>> +                                                                     
>>>>>> vma->obj, true));
>>>>>>         if (IS_ERR(vaddr)) {
>>>>>>                 i915_vma_unpin_and_release(&vma, 0);
>>>>>>                 return PTR_ERR(vaddr);
>>>>>> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_huc.c 
>>>>>> b/drivers/gpu/drm/i915/gt/uc/intel_huc.c
>>>>>> index 2126dd81ac38..56d2144dc6a0 100644
>>>>>> --- a/drivers/gpu/drm/i915/gt/uc/intel_huc.c
>>>>>> +++ b/drivers/gpu/drm/i915/gt/uc/intel_huc.c
>>>>>> @@ -82,7 +82,9 @@ static int intel_huc_rsa_data_create(struct 
>>>>>> intel_huc *huc)
>>>>>>         if (IS_ERR(vma))
>>>>>>                 return PTR_ERR(vma);
>>>>>>
>>>>>> -     vaddr = i915_gem_object_pin_map_unlocked(vma->obj, 
>>>>>> I915_MAP_WB);
>>>>>> +     vaddr = i915_gem_object_pin_map_unlocked(vma->obj,
>>>>>> +                                              
>>>>>> i915_coherent_map_type(gt->i915,
>>>>>> +                                                                     
>>>>>> vma->obj, true));
>>>>>>         if (IS_ERR(vaddr)) {
>>>>>>                 i915_vma_unpin_and_release(&vma, 0);
>>>>>>                 return PTR_ERR(vaddr);
>>>>>> diff --git a/drivers/gpu/drm/i915/i915_drv.h 
>>>>>> b/drivers/gpu/drm/i915/i915_drv.h
>>>>>> index 69e43bf91a15..2abbc06712a4 100644
>>>>>> --- a/drivers/gpu/drm/i915/i915_drv.h
>>>>>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>>>>>> @@ -78,6 +78,7 @@
>>>>>>     #include "gem/i915_gem_context_types.h"
>>>>>>     #include "gem/i915_gem_shrinker.h"
>>>>>>     #include "gem/i915_gem_stolen.h"
>>>>>> +#include "gem/i915_gem_lmem.h"
>>>>>>
>>>>>>     #include "gt/intel_engine.h"
>>>>>>     #include "gt/intel_gt_types.h"
>>>>>> @@ -1921,9 +1922,15 @@ static inline int 
>>>>>> intel_hws_csb_write_index(struct drm_i915_private *i915)
>>>>>>     }
>>>>>>
>>>>>>     static inline enum i915_map_type
>>>>>> -i915_coherent_map_type(struct drm_i915_private *i915)
>>>>>> +i915_coherent_map_type(struct drm_i915_private *i915,
>>>>>> +                    struct drm_i915_gem_object *obj, bool 
>>>>>> always_coherent)
>>>>>>     {
>>>>>> -     return HAS_LLC(i915) ? I915_MAP_WB : I915_MAP_WC;
>>>>>> +     if (i915_gem_object_is_lmem(obj))
>>>>>> +             return I915_MAP_WC;
>>>>>> +     if (HAS_LLC(i915) || always_coherent)
>>>>>> +             return I915_MAP_WB;
>>>>>> +     else
>>>>>> +             return I915_MAP_WC;
>>>>>
>>>>> Seems this patch is doing two things.
>>>>>
>>>>> First it is adding lmem support to this helper by always returning WC
>>>>> for lmem objects.
>>>>>
>>>>> Secondly it is introducing an idea of "always coherent" in a helper
>>>>> called i915_coherent_map_type. Could someone explain what is 
>>>>> coherent vs
>>>>> always coherent?
>>>>>
>>>>> And also, why is always coherent happy with WB? Sounds counter 
>>>>> intuitive
>>>>> to me.
>>>>
>>>> All this does is try to keep the existing behaviour intact, whilst
>>>> also ensuring that all lmem objects are mapped using only WC, no
>>>> matter what. The always_coherent=true thing is for the existing places
>>>> where we sometimes map the object using WB, without first considering
>>>> whether the device has the fast shared LLC vs snooping. Yes, it's
>>>> slightly ugly :)
>>>
>>> Not fully following - if we had to write kerneldoc for the always_coherent
>>> input argument - what would it say?
>>
>> @always_coherent - If true we should always try to map the object
>> using WB. If false we should only map as WB if the device supports the
>> fast shared LLC; in the case of snooped devices we will map using WC.
>> Note that if the resource is lmem then we will always map as WC,
>> regardless of the value of always_coherent, since that's all we
>> currently support.
>>
>> Maybe the naming is poor?
> 
> Maybe just confusing to me, not sure yet.
> 
> So always_coherent is not about how the callers want to use it, but 
> about platform knowledge? Or a performance concern for LLC vs snooping 
> cases? Does WB work (coherently) on snooping platforms?

The always_coherent=true is for the existing callers that want WB, 
regardless of LLC vs snooping.

The other callers use the existing i915_coherent_map_type() which only 
gives out WB for LLC platforms.

AFAIK, LLC vs snooping should offer the same in terms of coherency, but 
in terms of performance the shared LLC is much faster, and so for 
snooping platforms we choose to not enable WB everywhere.

On top of that we now have lmem, but for that we only allow WC. This 
patch just rolls all of that into one helper, while keeping the existing 
behaviour unchanged.
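
To make the "existing behaviour unchanged" point concrete, here is a rough userspace sketch of the decision table. It is not the driver code: the enum and the model_* names are made up for illustration, and the three booleans simply stand in for i915_gem_object_is_lmem(obj), HAS_LLC(i915) and always_coherent.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for I915_MAP_WB / I915_MAP_WC. */
enum map_type { MAP_WB, MAP_WC };

/*
 * Userspace model of the updated helper. The booleans replace
 * i915_gem_object_is_lmem(obj), HAS_LLC(i915) and always_coherent.
 */
static enum map_type model_coherent_map_type(bool is_lmem, bool has_llc,
					     bool always_coherent)
{
	if (is_lmem)
		return MAP_WC;	/* lmem is only ever mapped WC */
	if (has_llc || always_coherent)
		return MAP_WB;
	return MAP_WC;		/* snooping platform, caller did not insist on WB */
}

/* Model of the helper before the patch: only the platform was considered. */
static enum map_type model_old_map_type(bool has_llc)
{
	return has_llc ? MAP_WB : MAP_WC;
}

/* Check that callers with non-lmem objects see no change in behaviour. */
static void model_selfcheck(void)
{
	for (int llc = 0; llc <= 1; llc++) {
		/* former i915_coherent_map_type(i915) callers now pass false */
		assert(model_coherent_map_type(false, llc, false) ==
		       model_old_map_type(llc));
		/* former hard-coded I915_MAP_WB callers now pass true */
		assert(model_coherent_map_type(false, llc, true) == MAP_WB);
		/* lmem always ends up WC, whatever the caller asked for */
		assert(model_coherent_map_type(true, llc, false) == MAP_WC);
		assert(model_coherent_map_type(true, llc, true) == MAP_WC);
	}
}
```

Calling model_selfcheck() walks both platform types: the always_coherent=false rows match the old helper, the always_coherent=true rows match the old hard-coded WB, and only the lmem rows are new.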

> 
> Regards,
> 
> Tvrtko
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [Intel-gfx] [PATCH 11/19] drm/i915: Update the helper to set correct mapping
  2021-04-19 11:30               ` Matthew Auld
@ 2021-04-19 14:07                 ` Tvrtko Ursulin
  -1 siblings, 0 replies; 130+ messages in thread
From: Tvrtko Ursulin @ 2021-04-19 14:07 UTC (permalink / raw)
  To: Matthew Auld, Matthew Auld; +Cc: Intel Graphics Development, ML dri-devel


On 19/04/2021 12:30, Matthew Auld wrote:
> On 15/04/2021 12:05, Tvrtko Ursulin wrote:
>>
>> On 15/04/2021 10:23, Matthew Auld wrote:
>>> On Thu, 15 Apr 2021 at 09:21, Tvrtko Ursulin
>>> <tvrtko.ursulin@linux.intel.com> wrote:
>>>>
>>>>
>>>> On 14/04/2021 17:20, Matthew Auld wrote:
>>>>> On Wed, 14 Apr 2021 at 16:22, Tvrtko Ursulin
>>>>> <tvrtko.ursulin@linux.intel.com> wrote:
>>>>>>
>>>>>>
>>>>>> On 12/04/2021 10:05, Matthew Auld wrote:
>>>>>>> From: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>
>>>>>>>
>>>>>>> Determine the possible coherent map type based on object location,
>>>>>>> and if target has llc or if user requires an always coherent
>>>>>>> mapping.
>>>>>>>
>>>>>>> Cc: Matthew Auld <matthew.auld@intel.com>
>>>>>>> Cc: CQ Tang <cq.tang@intel.com>
>>>>>>> Suggested-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
>>>>>>> Signed-off-by: Venkata Sandeep Dhanalakota 
>>>>>>> <venkata.s.dhanalakota@intel.com>
>>>>>>> ---
>>>>>>>     drivers/gpu/drm/i915/gt/intel_engine_cs.c    |  3 ++-
>>>>>>>     drivers/gpu/drm/i915/gt/intel_engine_pm.c    |  2 +-
>>>>>>>     drivers/gpu/drm/i915/gt/intel_lrc.c          |  4 +++-
>>>>>>>     drivers/gpu/drm/i915/gt/intel_ring.c         |  9 ++++++---
>>>>>>>     drivers/gpu/drm/i915/gt/selftest_context.c   |  3 ++-
>>>>>>>     drivers/gpu/drm/i915/gt/selftest_hangcheck.c |  4 ++--
>>>>>>>     drivers/gpu/drm/i915/gt/selftest_lrc.c       |  4 +++-
>>>>>>>     drivers/gpu/drm/i915/gt/uc/intel_guc.c       |  4 +++-
>>>>>>>     drivers/gpu/drm/i915/gt/uc/intel_huc.c       |  4 +++-
>>>>>>>     drivers/gpu/drm/i915/i915_drv.h              | 11 +++++++++--
>>>>>>>     drivers/gpu/drm/i915/selftests/igt_spinner.c |  4 ++--
>>>>>>>     11 files changed, 36 insertions(+), 16 deletions(-)
>>>>>>>
>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c 
>>>>>>> b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
>>>>>>> index efe935f80c1a..b79568d370f5 100644
>>>>>>> --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
>>>>>>> +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
>>>>>>> @@ -664,7 +664,8 @@ static int init_status_page(struct 
>>>>>>> intel_engine_cs *engine)
>>>>>>>         if (ret)
>>>>>>>                 goto err;
>>>>>>>
>>>>>>> -     vaddr = i915_gem_object_pin_map(obj, I915_MAP_WB);
>>>>>>> +     vaddr = i915_gem_object_pin_map(obj,
>>>>>>> + i915_coherent_map_type(engine->i915, obj, true));
>>>>>>>         if (IS_ERR(vaddr)) {
>>>>>>>                 ret = PTR_ERR(vaddr);
>>>>>>>                 goto err_unpin;
>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c 
>>>>>>> b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
>>>>>>> index 7c9af86fdb1e..47f4397095e5 100644
>>>>>>> --- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c
>>>>>>> +++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
>>>>>>> @@ -23,7 +23,7 @@ static void dbg_poison_ce(struct intel_context *ce)
>>>>>>>
>>>>>>>         if (ce->state) {
>>>>>>>                 struct drm_i915_gem_object *obj = ce->state->obj;
>>>>>>> -             int type = i915_coherent_map_type(ce->engine->i915);
>>>>>>> +             int type = i915_coherent_map_type(ce->engine->i915, obj, true);
>>>>>>>                 void *map;
>>>>>>>
>>>>>>>                 if (!i915_gem_object_trylock(obj))
>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c 
>>>>>>> b/drivers/gpu/drm/i915/gt/intel_lrc.c
>>>>>>> index e86897cde984..aafe2a4df496 100644
>>>>>>> --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
>>>>>>> +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
>>>>>>> @@ -903,7 +903,9 @@ lrc_pre_pin(struct intel_context *ce,
>>>>>>>         GEM_BUG_ON(!i915_vma_is_pinned(ce->state));
>>>>>>>
>>>>>>>         *vaddr = i915_gem_object_pin_map(ce->state->obj,
>>>>>>> - i915_coherent_map_type(ce->engine->i915) |
>>>>>>> + i915_coherent_map_type(ce->engine->i915,
>>>>>>> + ce->state->obj,
>>>>>>> + false) |
>>>>>>>                                          I915_MAP_OVERRIDE);
>>>>>>>
>>>>>>>         return PTR_ERR_OR_ZERO(*vaddr);
>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/intel_ring.c 
>>>>>>> b/drivers/gpu/drm/i915/gt/intel_ring.c
>>>>>>> index aee0a77c77e0..3cf6c7e68108 100644
>>>>>>> --- a/drivers/gpu/drm/i915/gt/intel_ring.c
>>>>>>> +++ b/drivers/gpu/drm/i915/gt/intel_ring.c
>>>>>>> @@ -53,9 +53,12 @@ int intel_ring_pin(struct intel_ring *ring, struct i915_gem_ww_ctx *ww)
>>>>>>>
>>>>>>>         if (i915_vma_is_map_and_fenceable(vma))
>>>>>>>                 addr = (void __force *)i915_vma_pin_iomap(vma);
>>>>>>> -     else
>>>>>>> -             addr = i915_gem_object_pin_map(vma->obj,
>>>>>>> - i915_coherent_map_type(vma->vm->i915));
>>>>>>> +     else {
>>>>>>> +             int type = i915_coherent_map_type(vma->vm->i915, vma->obj, false);
>>>>>>> +
>>>>>>> +             addr = i915_gem_object_pin_map(vma->obj, type);
>>>>>>> +     }
>>>>>>> +
>>>>>>>         if (IS_ERR(addr)) {
>>>>>>>                 ret = PTR_ERR(addr);
>>>>>>>                 goto err_ring;
>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/selftest_context.c 
>>>>>>> b/drivers/gpu/drm/i915/gt/selftest_context.c
>>>>>>> index b9bdd1d23243..26685b927169 100644
>>>>>>> --- a/drivers/gpu/drm/i915/gt/selftest_context.c
>>>>>>> +++ b/drivers/gpu/drm/i915/gt/selftest_context.c
>>>>>>> @@ -88,7 +88,8 @@ static int __live_context_size(struct 
>>>>>>> intel_engine_cs *engine)
>>>>>>>                 goto err;
>>>>>>>
>>>>>>>         vaddr = i915_gem_object_pin_map_unlocked(ce->state->obj,
>>>>>>> - i915_coherent_map_type(engine->i915));
>>>>>>> + i915_coherent_map_type(engine->i915,
>>>>>>> + ce->state->obj, false));
>>>>>>>         if (IS_ERR(vaddr)) {
>>>>>>>                 err = PTR_ERR(vaddr);
>>>>>>>                 intel_context_unpin(ce);
>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c 
>>>>>>> b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
>>>>>>> index 746985971c3a..5b63d4df8c93 100644
>>>>>>> --- a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
>>>>>>> +++ b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
>>>>>>> @@ -69,7 +69,7 @@ static int hang_init(struct hang *h, struct 
>>>>>>> intel_gt *gt)
>>>>>>>         h->seqno = memset(vaddr, 0xff, PAGE_SIZE);
>>>>>>>
>>>>>>>         vaddr = i915_gem_object_pin_map_unlocked(h->obj,
>>>>>>> - i915_coherent_map_type(gt->i915));
>>>>>>> + i915_coherent_map_type(gt->i915, h->obj, false));
>>>>>>>         if (IS_ERR(vaddr)) {
>>>>>>>                 err = PTR_ERR(vaddr);
>>>>>>>                 goto err_unpin_hws;
>>>>>>> @@ -130,7 +130,7 @@ hang_create_request(struct hang *h, struct intel_engine_cs *engine)
>>>>>>>                 return ERR_CAST(obj);
>>>>>>>         }
>>>>>>>
>>>>>>> -     vaddr = i915_gem_object_pin_map_unlocked(obj, i915_coherent_map_type(gt->i915));
>>>>>>> +     vaddr = i915_gem_object_pin_map_unlocked(obj, i915_coherent_map_type(gt->i915, obj, false));
>>>>>>>         if (IS_ERR(vaddr)) {
>>>>>>>                 i915_gem_object_put(obj);
>>>>>>>                 i915_vm_put(vm);
>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/selftest_lrc.c 
>>>>>>> b/drivers/gpu/drm/i915/gt/selftest_lrc.c
>>>>>>> index 85e7df6a5123..d8f6623524e8 100644
>>>>>>> --- a/drivers/gpu/drm/i915/gt/selftest_lrc.c
>>>>>>> +++ b/drivers/gpu/drm/i915/gt/selftest_lrc.c
>>>>>>> @@ -1221,7 +1221,9 @@ static int compare_isolation(struct intel_engine_cs *engine,
>>>>>>>         }
>>>>>>>
>>>>>>>         lrc = i915_gem_object_pin_map_unlocked(ce->state->obj,
>>>>>>> - i915_coherent_map_type(engine->i915));
>>>>>>> + i915_coherent_map_type(engine->i915,
>>>>>>> + ce->state->obj,
>>>>>>> + false));
>>>>>>>         if (IS_ERR(lrc)) {
>>>>>>>                 err = PTR_ERR(lrc);
>>>>>>>                 goto err_B1;
>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.c 
>>>>>>> b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
>>>>>>> index 78305b2ec89d..adae04c47aab 100644
>>>>>>> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc.c
>>>>>>> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
>>>>>>> @@ -682,7 +682,9 @@ int intel_guc_allocate_and_map_vma(struct intel_guc *guc, u32 size,
>>>>>>>         if (IS_ERR(vma))
>>>>>>>                 return PTR_ERR(vma);
>>>>>>>
>>>>>>> -     vaddr = i915_gem_object_pin_map_unlocked(vma->obj, I915_MAP_WB);
>>>>>>> +     vaddr = i915_gem_object_pin_map_unlocked(vma->obj,
>>>>>>> + i915_coherent_map_type(guc_to_gt(guc)->i915,
>>>>>>> + vma->obj, true));
>>>>>>>         if (IS_ERR(vaddr)) {
>>>>>>>                 i915_vma_unpin_and_release(&vma, 0);
>>>>>>>                 return PTR_ERR(vaddr);
>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_huc.c 
>>>>>>> b/drivers/gpu/drm/i915/gt/uc/intel_huc.c
>>>>>>> index 2126dd81ac38..56d2144dc6a0 100644
>>>>>>> --- a/drivers/gpu/drm/i915/gt/uc/intel_huc.c
>>>>>>> +++ b/drivers/gpu/drm/i915/gt/uc/intel_huc.c
>>>>>>> @@ -82,7 +82,9 @@ static int intel_huc_rsa_data_create(struct intel_huc *huc)
>>>>>>>         if (IS_ERR(vma))
>>>>>>>                 return PTR_ERR(vma);
>>>>>>>
>>>>>>> -     vaddr = i915_gem_object_pin_map_unlocked(vma->obj, I915_MAP_WB);
>>>>>>> +     vaddr = i915_gem_object_pin_map_unlocked(vma->obj,
>>>>>>> + i915_coherent_map_type(gt->i915,
>>>>>>> + vma->obj, true));
>>>>>>>         if (IS_ERR(vaddr)) {
>>>>>>>                 i915_vma_unpin_and_release(&vma, 0);
>>>>>>>                 return PTR_ERR(vaddr);
>>>>>>> diff --git a/drivers/gpu/drm/i915/i915_drv.h 
>>>>>>> b/drivers/gpu/drm/i915/i915_drv.h
>>>>>>> index 69e43bf91a15..2abbc06712a4 100644
>>>>>>> --- a/drivers/gpu/drm/i915/i915_drv.h
>>>>>>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>>>>>>> @@ -78,6 +78,7 @@
>>>>>>>     #include "gem/i915_gem_context_types.h"
>>>>>>>     #include "gem/i915_gem_shrinker.h"
>>>>>>>     #include "gem/i915_gem_stolen.h"
>>>>>>> +#include "gem/i915_gem_lmem.h"
>>>>>>>
>>>>>>>     #include "gt/intel_engine.h"
>>>>>>>     #include "gt/intel_gt_types.h"
>>>>>>> @@ -1921,9 +1922,15 @@ static inline int intel_hws_csb_write_index(struct drm_i915_private *i915)
>>>>>>>     }
>>>>>>>
>>>>>>>     static inline enum i915_map_type
>>>>>>> -i915_coherent_map_type(struct drm_i915_private *i915)
>>>>>>> +i915_coherent_map_type(struct drm_i915_private *i915,
>>>>>>> +                    struct drm_i915_gem_object *obj, bool always_coherent)
>>>>>>>     {
>>>>>>> -     return HAS_LLC(i915) ? I915_MAP_WB : I915_MAP_WC;
>>>>>>> +     if (i915_gem_object_is_lmem(obj))
>>>>>>> +             return I915_MAP_WC;
>>>>>>> +     if (HAS_LLC(i915) || always_coherent)
>>>>>>> +             return I915_MAP_WB;
>>>>>>> +     else
>>>>>>> +             return I915_MAP_WC;
>>>>>>
>>>>>> Seems this patch is doing two things.
>>>>>>
>>>>>> First it is adding lmem support to this helper by always returning WC
>>>>>> for lmem objects.
>>>>>>
>>>>>> Secondly it is introducing an idea of "always coherent" in a helper
>>>>>> called i915_coherent_map_type. Could someone explain what is coherent vs
>>>>>> always coherent?
>>>>>>
>>>>>> And also, why is always coherent happy with WB? Sounds counterintuitive
>>>>>> to me.
>>>>>
>>>>> All this does is try to keep the existing behaviour intact, whilst
>>>>> also ensuring that all lmem objects are mapped using only WC, no
>>>>> matter what. The always_coherent=true thing is for the existing places
>>>>> where we sometimes map the object using WB, without first considering
>>>>> whether the device has the fast shared LLC vs snooping. Yes, it's
>>>>> slightly ugly :)
>>>>
>>>> Not fully following - if we had to write kerneldoc for always_coherent
>>>> input argument - what it would say?
>>>
>>> @always_coherent - If true we should always try to map the object
>>> using WB. If false we should only map as WB if the device supports the
>>> fast shared LLC; in the case of snooped devices we will map using WC.
>>> Note that if the resource is lmem then we will always map as WC,
>>> regardless of the value of always_coherent, since that's all we
>>> currently support.
>>>
>>> Maybe the naming is poor?
>>
>> Maybe just confusing to me, not sure yet.
>>
>> So always_coherent is not about how the callers want to use it, but
>> about platform knowledge? Or a performance concern for LLC vs snooping
>> cases? Does WB work (coherently) on snooping platforms?
> 
> The always_coherent=true is for the existing callers that want WB, 
> regardless of LLC vs snooping.
> 
> The other callers use the existing i915_coherent_map_type() which only 
> gives out WB for LLC platforms.
> 
> AFAIK, LLC vs snooping should offer the same in terms of coherency, but 
> in terms of performance the shared LLC is much faster, and so for 
> snooping platforms we choose to not enable WB everywhere.
> 
> On top of that we now have lmem, but for that we only allow WC. This 
> patch just rolls all of that into one helper, while keeping the existing 
> behaviour unchanged.

Thanks. But I am still struggling with the API. :(

Is the introduction of the always_coherent flag even required in the
context of DG1? AFAICT the flag is ignored for lmem objects, so no?
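
To spell out the observation, a small standalone sketch that checks, per
platform and placement, whether the flag can change the result at all.
Everything here (fake_device, fake_object, MAP_*, flag_changes_result)
is a hypothetical userspace model of the posted helper, not driver code.

```c
#include <stdbool.h>

/* Hypothetical stand-ins, modelled on the helper in the quoted patch. */
enum map_type { MAP_WB, MAP_WC };
struct fake_device { bool has_llc; };
struct fake_object { bool is_lmem; };

static enum map_type
coherent_map_type(const struct fake_device *dev,
                  const struct fake_object *obj, bool always_coherent)
{
    if (obj->is_lmem)
        return MAP_WC;          /* flag never consulted for lmem */
    if (dev->has_llc || always_coherent)
        return MAP_WB;
    return MAP_WC;
}

/* True only if toggling always_coherent changes the chosen mapping. */
static bool
flag_changes_result(const struct fake_device *dev,
                    const struct fake_object *obj)
{
    return coherent_map_type(dev, obj, true) !=
           coherent_map_type(dev, obj, false);
}
```

In this model the flag only makes a difference for a system-memory
object on a device without LLC; for lmem objects it is a no-op.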

Regards,

Tvrtko
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [Intel-gfx] [PATCH 03/19] drm/i915: Create stolen memory region from local memory
  2021-04-16 15:04       ` Matthew Auld
@ 2021-04-19 14:15         ` Tvrtko Ursulin
  -1 siblings, 0 replies; 130+ messages in thread
From: Tvrtko Ursulin @ 2021-04-19 14:15 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx; +Cc: dri-devel


On 16/04/2021 16:04, Matthew Auld wrote:
> On 14/04/2021 16:01, Tvrtko Ursulin wrote:
>>
>> On 12/04/2021 10:05, Matthew Auld wrote:
>>> From: CQ Tang <cq.tang@intel.com>
>>>
>>> Add "REGION_STOLEN" device info to dg1, create stolen memory
>>> region from upper portion of local device memory, starting
>>> from DSMBASE.
>>>
>>> v2:
>>>      - s/drm_info/drm_dbg; userspace likely doesn't care about stolen.
>>>      - mem->type is only setup after the region probe, so setting the name
>>>        as stolen-local or stolen-system based on this value won't work. Split
>>>        system vs local stolen setup to fix this.
>>>      - kill all the region->devmem/is_devmem stuff. We already differentiate
>>>        the different types of stolen so such things shouldn't be needed
>>>        anymore.
>>>
>>> Signed-off-by: CQ Tang <cq.tang@intel.com>
>>> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
>>> ---
>>>   drivers/gpu/drm/i915/gem/i915_gem_stolen.c | 99 +++++++++++++++++++---
>>>   drivers/gpu/drm/i915/gem/i915_gem_stolen.h |  3 +
>>>   drivers/gpu/drm/i915/i915_pci.c            |  2 +-
>>>   drivers/gpu/drm/i915/i915_reg.h            |  1 +
>>>   drivers/gpu/drm/i915/intel_memory_region.c |  6 ++
>>>   drivers/gpu/drm/i915/intel_memory_region.h |  5 +-
>>>   6 files changed, 102 insertions(+), 14 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_stolen.c 
>>> b/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
>>> index b0597de206de..56dd58bef5ee 100644
>>> --- a/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
>>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
>>> @@ -10,6 +10,7 @@
>>>   #include <drm/drm_mm.h>
>>>   #include <drm/i915_drm.h>
>>> +#include "gem/i915_gem_lmem.h"
>>>   #include "gem/i915_gem_region.h"
>>>   #include "i915_drv.h"
>>>   #include "i915_gem_stolen.h"
>>> @@ -121,6 +122,14 @@ static int i915_adjust_stolen(struct 
>>> drm_i915_private *i915,
>>>           }
>>>       }
>>> +    /*
>>> +     * With device local memory, we don't need to check the address range,
>>> +     * this is device memory physical address, could overlap with system
>>> +     * memory.
>>> +     */
>>> +    if (HAS_LMEM(i915))
>>> +        return 0;
>>> +
>>>       /*
>>>        * Verify that nothing else uses this physical address. Stolen
>>>        * memory should be reserved by the BIOS and hidden from the
>>> @@ -374,8 +383,9 @@ static void icl_get_stolen_reserved(struct 
>>> drm_i915_private *i915,
>>>       }
>>>   }
>>> -static int i915_gem_init_stolen(struct drm_i915_private *i915)
>>> +static int i915_gem_init_stolen(struct intel_memory_region *mem)
>>>   {
>>> +    struct drm_i915_private *i915 = mem->i915;
>>>       struct intel_uncore *uncore = &i915->uncore;
>>>       resource_size_t reserved_base, stolen_top;
>>>       resource_size_t reserved_total, reserved_size;
>>> @@ -396,10 +406,10 @@ static int i915_gem_init_stolen(struct 
>>> drm_i915_private *i915)
>>>           return 0;
>>>       }
>>> -    if (resource_size(&intel_graphics_stolen_res) == 0)
>>> +    if (resource_size(&mem->region) == 0)
>>>           return 0;
>>> -    i915->dsm = intel_graphics_stolen_res;
>>> +    i915->dsm = mem->region;
>>>       if (i915_adjust_stolen(i915, &i915->dsm))
>>>           return 0;
>>> @@ -684,23 +694,36 @@ static int _i915_gem_object_stolen_init(struct 
>>> intel_memory_region *mem,
>>>       return ret;
>>>   }
>>> +struct intel_memory_region *i915_stolen_region(struct drm_i915_private *i915)
>>> +{
>>> +    if (HAS_LMEM(i915))
>>> +        return i915->mm.regions[INTEL_REGION_STOLEN_LMEM];
>>> +
>>> +    return i915->mm.regions[INTEL_REGION_STOLEN_SMEM];
>>> +}
>>
>> Could be a bikeshedding comment only - especially since I think this 
>> path gets very little used at runtime so it is most likely pointless 
>> to fiddle with it, but it just strikes me a bit not fully elegant to do:
>>
>> i915_gem_object_create_stolen
>>   -> i915_gem_object_create_region
>>      -> i915_stolen_region
>>
>> And end up in here, when alternative could be at driver init:
>>
>> i915->stolen_region_id = HAS_LMEM() ? ... : ...;
>>
>> i915_gem_object_create_stolen
>>   -> i915_gem_object_create_region(i915->mm.regions[i915->stolen_region_id]);
>>
>> Or pointer to region. Would avoid having to export i915_stolen_region 
>> as well.
>>
>> Or is i915->dsm already the right thing? Because..
> 
> I guess we could just have an i915->stolen_region short-cut or something?

Is i915->dsm not it? Where does i915_gem_init_stolen exit for
local-stolen then? At the "resource_size(&mem->region) == 0" check?
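
A minimal sketch of the shortcut discussed above: resolve the stolen region once at init and keep a pointer to it, so later lookups need no HAS_LMEM() branch. Every name here (device_model, region_model, device_cache_stolen_region, stolen_region) is a hypothetical stand-in rather than a real i915 type, and the cached field only mirrors the i915->stolen_region idea floated in the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are not the real i915 definitions. */
enum region_id { REGION_STOLEN_SMEM, REGION_STOLEN_LMEM, REGION_COUNT };

struct region_model {
	const char *name;
};

struct device_model {
	bool has_lmem;                       /* models HAS_LMEM(i915) */
	struct region_model *regions[REGION_COUNT];
	struct region_model *stolen_region;  /* assumed shortcut pointer */
};

/* Called once at init, after the regions array has been populated. */
static void device_cache_stolen_region(struct device_model *dev)
{
	assert(dev != NULL);

	dev->stolen_region = dev->regions[dev->has_lmem ?
					  REGION_STOLEN_LMEM :
					  REGION_STOLEN_SMEM];
}

/* Runtime lookup: no platform check, just the cached pointer. */
static struct region_model *stolen_region(const struct device_model *dev)
{
	assert(dev != NULL);

	return dev->stolen_region;
}
```

With this shape the object-creation path would pass stolen_region(dev) straight to the region allocator, and nothing outside the init code would need to know which stolen flavour is in use.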

> 
>>
>>> +
>>>   struct drm_i915_gem_object *
>>>   i915_gem_object_create_stolen(struct drm_i915_private *i915,
>>>                     resource_size_t size)
>>>   {
>>> -    return i915_gem_object_create_region(i915->mm.regions[INTEL_REGION_STOLEN_SMEM],
>>> +    return i915_gem_object_create_region(i915_stolen_region(i915),
>>>                            size, I915_BO_ALLOC_CONTIGUOUS);
>>>   }
>>>   static int init_stolen(struct intel_memory_region *mem)
>>>   {
>>> -    intel_memory_region_set_name(mem, "stolen");
>>> +    if (HAS_LMEM(mem->i915)) {
>>> +        if (!io_mapping_init_wc(&mem->iomap,
>>> +                    mem->io_start,
>>> +                    resource_size(&mem->region)))
>>> +            return -EIO;
>>> +    }
>>>       /*
>>>        * Initialise stolen early so that we may reserve preallocated
>>>        * objects for the BIOS to KMS transition.
>>>        */
>>> -    return i915_gem_init_stolen(mem->i915);
>>> +    return i915_gem_init_stolen(mem);
>>
>> ... I find the mem region init paths a bit convoluted, stolen 
>> especially, and struggle to figure it out every time.
>>
>> For instance we have i915_region_stolen_ops shared between system and 
>> local stolen. But then shared vfuncs branch depending on system vs 
>> stolen?
> 
> We could split the intel_memory_region ops? Maybe that will make it 
> slightly less muddled?

I think so. Each vfunc table with its own ->init() should make it 
easier to follow.

> The probing is slightly different, but that's kind of expected since 
> it's quite different from the HW pov.
> 
> But once we get an intel_memory_region, it should be the same whether 
> it's stolen device memory or whatever.
> 
>>
>> i915_gem_init_stolen is shared - but which parts of it are relevant 
>> for local stolen?
> 
> Asking all the difficult questions :)
> 
> It's just to populate dsm I think. I can rip that out and then we don't 
> call i915_gem_init_stolen() for the stolen device memory path? Maybe 
> that will look slightly better?

Yes, with the above approach of two struct intel_memory_region_ops? Even 
if some vfuncs are shared it should be better.

I am also confused by ->release, i.e. i915_gem_cleanup_stolen. How does 
that work for two stolen regions, given there is only one i915->mm.stolen?
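
The two-ops-table idea above could look roughly like the following sketch: one table per stolen flavour, each with its own ->init(), and a shared ->release(). All names are hypothetical models, not the real intel_memory_region_ops; the split of work between the two init functions (legacy stolen setup for system, iomap setup for local) is an assumption drawn from this thread, not from the final code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct region_model;

/* Hypothetical model of a per-region vfunc table. */
struct region_ops_model {
	int (*init)(struct region_model *mem);
	void (*release)(struct region_model *mem);
};

struct region_model {
	const struct region_ops_model *ops;
	bool legacy_setup_done; /* stands in for the i915_gem_init_stolen() work */
	bool iomap_ready;       /* stands in for the io_mapping_init_wc() work */
	bool released;
};

/* System stolen: only the legacy setup, no iomap. */
static int init_stolen_smem(struct region_model *mem)
{
	mem->legacy_setup_done = true;
	return 0;
}

/* Local stolen: only the iomap, skipping the legacy setup (assumed). */
static int init_stolen_lmem(struct region_model *mem)
{
	mem->iomap_ready = true;
	return 0;
}

/* Shared between both tables in this sketch. */
static void release_stolen(struct region_model *mem)
{
	mem->released = true;
}

static const struct region_ops_model stolen_smem_ops = {
	.init = init_stolen_smem,
	.release = release_stolen,
};

static const struct region_ops_model stolen_lmem_ops = {
	.init = init_stolen_lmem,
	.release = release_stolen,
};

/* Callers dispatch through the table and never branch on the flavour. */
static int region_init(struct region_model *mem)
{
	assert(mem != NULL && mem->ops != NULL && mem->ops->init != NULL);
	return mem->ops->init(mem);
}

static void region_release(struct region_model *mem)
{
	assert(mem != NULL && mem->ops != NULL && mem->ops->release != NULL);
	mem->ops->release(mem);
}
```

The point of the split is that each ->init() reads top to bottom without a HAS_LMEM() branch. Whether a single shared ->release() is actually safe for two stolen regions is exactly the open question raised above, so the sketch does not settle it.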

Regards,

Tvrtko

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [Intel-gfx] [PATCH 12/19] drm/i915/lmem: Bypass aperture when lmem is available
  2021-04-16 14:25       ` Matthew Auld
@ 2021-04-19 14:16         ` Tvrtko Ursulin
  -1 siblings, 0 replies; 130+ messages in thread
From: Tvrtko Ursulin @ 2021-04-19 14:16 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx
  Cc: Daniel Vetter, dri-devel, Chris P Wilson, Dhinakaran Pandiyan


On 16/04/2021 15:25, Matthew Auld wrote:
> On 14/04/2021 16:33, Tvrtko Ursulin wrote:
>>
>> On 12/04/2021 10:05, Matthew Auld wrote:
>>> From: Anusha Srivatsa <anusha.srivatsa@intel.com>
>>>
>>> In the scenario where local memory is available, we have
>>> to rely on CPU access via lmem directly instead of aperture.
>>>
>>> v2:
>>> gmch is only relevant for much older hw, therefore we can drop the
>>> has_aperture check since it should always be present on such platforms.
>>> (Chris)
>>>
>>> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
>>> Cc: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
>>> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
>>> Cc: Chris P Wilson <chris.p.wilson@intel.com>
>>> Cc: Daniel Vetter <daniel.vetter@intel.com>
>>> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
>>> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
>>> Cc: CQ Tang <cq.tang@intel.com>
>>> Signed-off-by: Anusha Srivatsa <anusha.srivatsa@intel.com>
>>> ---
>>>   drivers/gpu/drm/i915/display/intel_fbdev.c | 22 +++++++++++++++-------
>>>   drivers/gpu/drm/i915/gem/i915_gem_lmem.c   | 15 +++++++++++++++
>>>   drivers/gpu/drm/i915/gem/i915_gem_lmem.h   |  5 +++++
>>>   drivers/gpu/drm/i915/i915_vma.c            | 19 +++++++++++++------
>>>   4 files changed, 48 insertions(+), 13 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/display/intel_fbdev.c b/drivers/gpu/drm/i915/display/intel_fbdev.c
>>> index 2b37959da747..4af40229f5ec 100644
>>> --- a/drivers/gpu/drm/i915/display/intel_fbdev.c
>>> +++ b/drivers/gpu/drm/i915/display/intel_fbdev.c
>>> @@ -139,14 +139,22 @@ static int intelfb_alloc(struct drm_fb_helper *helper,
>>>       size = mode_cmd.pitches[0] * mode_cmd.height;
>>>       size = PAGE_ALIGN(size);
>>> -    /* If the FB is too big, just don't use it since fbdev is not very
>>> -     * important and we should probably use that space with FBC or other
>>> -     * features. */
>>>       obj = ERR_PTR(-ENODEV);
>>> -    if (size * 2 < dev_priv->stolen_usable_size)
>>> -        obj = i915_gem_object_create_stolen(dev_priv, size);
>>> -    if (IS_ERR(obj))
>>> -        obj = i915_gem_object_create_shmem(dev_priv, size);
>>> +    if (HAS_LMEM(dev_priv)) {
>>> +        obj = i915_gem_object_create_lmem(dev_priv, size,
>>> +                          I915_BO_ALLOC_CONTIGUOUS);
>>
>> Has to be contiguous? Question for display experts I guess.
>>
>> [Comes back later.] Ah for iomap? Put a comment to that effect perhaps?
> 
> I don't think it has to be, since we could in theory just use pin_map() 
> underneath, which can already deal with non-contiguous chunks of lmem, 
> although that might bring in ww locking. I think for now just add a 
> comment and mark this as XXX, and potentially revisit as follow up?

Sure.

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [Intel-gfx] [PATCH 11/19] drm/i915: Update the helper to set correct mapping
  2021-04-19 14:07                 ` Tvrtko Ursulin
@ 2021-04-19 14:37                   ` Matthew Auld
  -1 siblings, 0 replies; 130+ messages in thread
From: Matthew Auld @ 2021-04-19 14:37 UTC (permalink / raw)
  To: Tvrtko Ursulin, Matthew Auld; +Cc: Intel Graphics Development, ML dri-devel

On 19/04/2021 15:07, Tvrtko Ursulin wrote:
> 
> On 19/04/2021 12:30, Matthew Auld wrote:
>> On 15/04/2021 12:05, Tvrtko Ursulin wrote:
>>>
>>> On 15/04/2021 10:23, Matthew Auld wrote:
>>>> On Thu, 15 Apr 2021 at 09:21, Tvrtko Ursulin
>>>> <tvrtko.ursulin@linux.intel.com> wrote:
>>>>>
>>>>>
>>>>> On 14/04/2021 17:20, Matthew Auld wrote:
>>>>>> On Wed, 14 Apr 2021 at 16:22, Tvrtko Ursulin
>>>>>> <tvrtko.ursulin@linux.intel.com> wrote:
>>>>>>>
>>>>>>>
>>>>>>> On 12/04/2021 10:05, Matthew Auld wrote:
>>>>>>>> From: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>
>>>>>>>>
>>>>>>>> Determine the possible coherent map type based on object location,
>>>>>>>> and if target has llc or if user requires an always coherent
>>>>>>>> mapping.
>>>>>>>>
>>>>>>>> Cc: Matthew Auld <matthew.auld@intel.com>
>>>>>>>> Cc: CQ Tang <cq.tang@intel.com>
>>>>>>>> Suggested-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
>>>>>>>> Signed-off-by: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>
>>>>>>>> ---
>>>>>>>>     drivers/gpu/drm/i915/gt/intel_engine_cs.c    |  3 ++-
>>>>>>>>     drivers/gpu/drm/i915/gt/intel_engine_pm.c    |  2 +-
>>>>>>>>     drivers/gpu/drm/i915/gt/intel_lrc.c          |  4 +++-
>>>>>>>>     drivers/gpu/drm/i915/gt/intel_ring.c         |  9 ++++++---
>>>>>>>>     drivers/gpu/drm/i915/gt/selftest_context.c   |  3 ++-
>>>>>>>>     drivers/gpu/drm/i915/gt/selftest_hangcheck.c |  4 ++--
>>>>>>>>     drivers/gpu/drm/i915/gt/selftest_lrc.c       |  4 +++-
>>>>>>>>     drivers/gpu/drm/i915/gt/uc/intel_guc.c       |  4 +++-
>>>>>>>>     drivers/gpu/drm/i915/gt/uc/intel_huc.c       |  4 +++-
>>>>>>>>     drivers/gpu/drm/i915/i915_drv.h              | 11 +++++++++--
>>>>>>>>     drivers/gpu/drm/i915/selftests/igt_spinner.c |  4 ++--
>>>>>>>>     11 files changed, 36 insertions(+), 16 deletions(-)
>>>>>>>>
>>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
>>>>>>>> index efe935f80c1a..b79568d370f5 100644
>>>>>>>> --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
>>>>>>>> +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
>>>>>>>> @@ -664,7 +664,8 @@ static int init_status_page(struct intel_engine_cs *engine)
>>>>>>>>         if (ret)
>>>>>>>>                 goto err;
>>>>>>>>
>>>>>>>> -     vaddr = i915_gem_object_pin_map(obj, I915_MAP_WB);
>>>>>>>> +     vaddr = i915_gem_object_pin_map(obj,
>>>>>>>> + i915_coherent_map_type(engine->i915, obj, true));
>>>>>>>>         if (IS_ERR(vaddr)) {
>>>>>>>>                 ret = PTR_ERR(vaddr);
>>>>>>>>                 goto err_unpin;
>>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
>>>>>>>> index 7c9af86fdb1e..47f4397095e5 100644
>>>>>>>> --- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c
>>>>>>>> +++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
>>>>>>>> @@ -23,7 +23,7 @@ static void dbg_poison_ce(struct intel_context *ce)
>>>>>>>>
>>>>>>>>         if (ce->state) {
>>>>>>>>                 struct drm_i915_gem_object *obj = ce->state->obj;
>>>>>>>> -             int type = i915_coherent_map_type(ce->engine->i915);
>>>>>>>> +             int type = i915_coherent_map_type(ce->engine->i915, obj, true);
>>>>>>>>                 void *map;
>>>>>>>>
>>>>>>>>                 if (!i915_gem_object_trylock(obj))
>>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
>>>>>>>> index e86897cde984..aafe2a4df496 100644
>>>>>>>> --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
>>>>>>>> +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
>>>>>>>> @@ -903,7 +903,9 @@ lrc_pre_pin(struct intel_context *ce,
>>>>>>>>         GEM_BUG_ON(!i915_vma_is_pinned(ce->state));
>>>>>>>>
>>>>>>>>         *vaddr = i915_gem_object_pin_map(ce->state->obj,
>>>>>>>> - i915_coherent_map_type(ce->engine->i915) |
>>>>>>>> + i915_coherent_map_type(ce->engine->i915,
>>>>>>>> + ce->state->obj,
>>>>>>>> + false) |
>>>>>>>>                                          I915_MAP_OVERRIDE);
>>>>>>>>
>>>>>>>>         return PTR_ERR_OR_ZERO(*vaddr);
>>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/intel_ring.c b/drivers/gpu/drm/i915/gt/intel_ring.c
>>>>>>>> index aee0a77c77e0..3cf6c7e68108 100644
>>>>>>>> --- a/drivers/gpu/drm/i915/gt/intel_ring.c
>>>>>>>> +++ b/drivers/gpu/drm/i915/gt/intel_ring.c
>>>>>>>> @@ -53,9 +53,12 @@ int intel_ring_pin(struct intel_ring *ring, struct i915_gem_ww_ctx *ww)
>>>>>>>>
>>>>>>>>         if (i915_vma_is_map_and_fenceable(vma))
>>>>>>>>                 addr = (void __force *)i915_vma_pin_iomap(vma);
>>>>>>>> -     else
>>>>>>>> -             addr = i915_gem_object_pin_map(vma->obj,
>>>>>>>> - i915_coherent_map_type(vma->vm->i915));
>>>>>>>> +     else {
>>>>>>>> +             int type = i915_coherent_map_type(vma->vm->i915, vma->obj, false);
>>>>>>>> +
>>>>>>>> +             addr = i915_gem_object_pin_map(vma->obj, type);
>>>>>>>> +     }
>>>>>>>> +
>>>>>>>>         if (IS_ERR(addr)) {
>>>>>>>>                 ret = PTR_ERR(addr);
>>>>>>>>                 goto err_ring;
>>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/selftest_context.c b/drivers/gpu/drm/i915/gt/selftest_context.c
>>>>>>>> index b9bdd1d23243..26685b927169 100644
>>>>>>>> --- a/drivers/gpu/drm/i915/gt/selftest_context.c
>>>>>>>> +++ b/drivers/gpu/drm/i915/gt/selftest_context.c
>>>>>>>> @@ -88,7 +88,8 @@ static int __live_context_size(struct intel_engine_cs *engine)
>>>>>>>>                 goto err;
>>>>>>>>
>>>>>>>>         vaddr = i915_gem_object_pin_map_unlocked(ce->state->obj,
>>>>>>>> - i915_coherent_map_type(engine->i915));
>>>>>>>> + i915_coherent_map_type(engine->i915,
>>>>>>>> + ce->state->obj, false));
>>>>>>>>         if (IS_ERR(vaddr)) {
>>>>>>>>                 err = PTR_ERR(vaddr);
>>>>>>>>                 intel_context_unpin(ce);
>>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
>>>>>>>> index 746985971c3a..5b63d4df8c93 100644
>>>>>>>> --- a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
>>>>>>>> +++ b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
>>>>>>>> @@ -69,7 +69,7 @@ static int hang_init(struct hang *h, struct intel_gt *gt)
>>>>>>>>         h->seqno = memset(vaddr, 0xff, PAGE_SIZE);
>>>>>>>>
>>>>>>>>         vaddr = i915_gem_object_pin_map_unlocked(h->obj,
>>>>>>>> - i915_coherent_map_type(gt->i915));
>>>>>>>> + i915_coherent_map_type(gt->i915, h->obj, false));
>>>>>>>>         if (IS_ERR(vaddr)) {
>>>>>>>>                 err = PTR_ERR(vaddr);
>>>>>>>>                 goto err_unpin_hws;
>>>>>>>> @@ -130,7 +130,7 @@ hang_create_request(struct hang *h, struct intel_engine_cs *engine)
>>>>>>>>                 return ERR_CAST(obj);
>>>>>>>>         }
>>>>>>>>
>>>>>>>> -     vaddr = i915_gem_object_pin_map_unlocked(obj, i915_coherent_map_type(gt->i915));
>>>>>>>> +     vaddr = i915_gem_object_pin_map_unlocked(obj, i915_coherent_map_type(gt->i915, obj, false));
>>>>>>>>         if (IS_ERR(vaddr)) {
>>>>>>>>                 i915_gem_object_put(obj);
>>>>>>>>                 i915_vm_put(vm);
>>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/selftest_lrc.c b/drivers/gpu/drm/i915/gt/selftest_lrc.c
>>>>>>>> index 85e7df6a5123..d8f6623524e8 100644
>>>>>>>> --- a/drivers/gpu/drm/i915/gt/selftest_lrc.c
>>>>>>>> +++ b/drivers/gpu/drm/i915/gt/selftest_lrc.c
>>>>>>>> @@ -1221,7 +1221,9 @@ static int compare_isolation(struct intel_engine_cs *engine,
>>>>>>>>         }
>>>>>>>>
>>>>>>>>         lrc = i915_gem_object_pin_map_unlocked(ce->state->obj,
>>>>>>>> - i915_coherent_map_type(engine->i915));
>>>>>>>> + i915_coherent_map_type(engine->i915,
>>>>>>>> + ce->state->obj,
>>>>>>>> + false));
>>>>>>>>         if (IS_ERR(lrc)) {
>>>>>>>>                 err = PTR_ERR(lrc);
>>>>>>>>                 goto err_B1;
>>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.c b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
>>>>>>>> index 78305b2ec89d..adae04c47aab 100644
>>>>>>>> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc.c
>>>>>>>> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
>>>>>>>> @@ -682,7 +682,9 @@ int intel_guc_allocate_and_map_vma(struct intel_guc *guc, u32 size,
>>>>>>>>         if (IS_ERR(vma))
>>>>>>>>                 return PTR_ERR(vma);
>>>>>>>>
>>>>>>>> -     vaddr = i915_gem_object_pin_map_unlocked(vma->obj, I915_MAP_WB);
>>>>>>>> +     vaddr = i915_gem_object_pin_map_unlocked(vma->obj,
>>>>>>>> + i915_coherent_map_type(guc_to_gt(guc)->i915,
>>>>>>>> + vma->obj, true));
>>>>>>>>         if (IS_ERR(vaddr)) {
>>>>>>>>                 i915_vma_unpin_and_release(&vma, 0);
>>>>>>>>                 return PTR_ERR(vaddr);
>>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_huc.c b/drivers/gpu/drm/i915/gt/uc/intel_huc.c
>>>>>>>> index 2126dd81ac38..56d2144dc6a0 100644
>>>>>>>> --- a/drivers/gpu/drm/i915/gt/uc/intel_huc.c
>>>>>>>> +++ b/drivers/gpu/drm/i915/gt/uc/intel_huc.c
>>>>>>>> @@ -82,7 +82,9 @@ static int intel_huc_rsa_data_create(struct intel_huc *huc)
>>>>>>>>         if (IS_ERR(vma))
>>>>>>>>                 return PTR_ERR(vma);
>>>>>>>>
>>>>>>>> -     vaddr = i915_gem_object_pin_map_unlocked(vma->obj, I915_MAP_WB);
>>>>>>>> +     vaddr = i915_gem_object_pin_map_unlocked(vma->obj,
>>>>>>>> + i915_coherent_map_type(gt->i915,
>>>>>>>> + vma->obj, true));
>>>>>>>>         if (IS_ERR(vaddr)) {
>>>>>>>>                 i915_vma_unpin_and_release(&vma, 0);
>>>>>>>>                 return PTR_ERR(vaddr);
>>>>>>>> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
>>>>>>>> index 69e43bf91a15..2abbc06712a4 100644
>>>>>>>> --- a/drivers/gpu/drm/i915/i915_drv.h
>>>>>>>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>>>>>>>> @@ -78,6 +78,7 @@
>>>>>>>>     #include "gem/i915_gem_context_types.h"
>>>>>>>>     #include "gem/i915_gem_shrinker.h"
>>>>>>>>     #include "gem/i915_gem_stolen.h"
>>>>>>>> +#include "gem/i915_gem_lmem.h"
>>>>>>>>
>>>>>>>>     #include "gt/intel_engine.h"
>>>>>>>>     #include "gt/intel_gt_types.h"
>>>>>>>> @@ -1921,9 +1922,15 @@ static inline int intel_hws_csb_write_index(struct drm_i915_private *i915)
>>>>>>>>     }
>>>>>>>>
>>>>>>>>     static inline enum i915_map_type
>>>>>>>> -i915_coherent_map_type(struct drm_i915_private *i915)
>>>>>>>> +i915_coherent_map_type(struct drm_i915_private *i915,
>>>>>>>> +                    struct drm_i915_gem_object *obj, bool always_coherent)
>>>>>>>>     {
>>>>>>>> -     return HAS_LLC(i915) ? I915_MAP_WB : I915_MAP_WC;
>>>>>>>> +     if (i915_gem_object_is_lmem(obj))
>>>>>>>> +             return I915_MAP_WC;
>>>>>>>> +     if (HAS_LLC(i915) || always_coherent)
>>>>>>>> +             return I915_MAP_WB;
>>>>>>>> +     else
>>>>>>>> +             return I915_MAP_WC;
>>>>>>>
>>>>>>> Seems this patch is doing two things.
>>>>>>>
>>>>>>> First it is adding lmem support to this helper by always returning WC
>>>>>>> for lmem objects.
>>>>>>>
>>>>>>> Secondly it is introducing an idea of "always coherent" in a helper
>>>>>>> called i915_coherent_map_type. Could someone explain what is coherent
>>>>>>> vs always coherent?
>>>>>>>
>>>>>>> And also, why is always coherent happy with WB? Sounds counter
>>>>>>> intuitive to me.
>>>>>>
>>>>>> All this does is try to keep the existing behaviour intact, whilst
>>>>>> also ensuring that all lmem objects are mapped using only WC, no
>>>>>> matter what. The always_coherent=true thing is for the existing places
>>>>>> where we sometimes map the object using WB, without first considering
>>>>>> whether the device has the fast shared LLC vs snooping. Yes, it's
>>>>>> slightly ugly :)
>>>>>
>>>>> Not fully following - if we had to write kerneldoc for the always_coherent
>>>>> input argument, what would it say?
>>>>
>>>> @always_coherent - If true we should always try to map the object
>>>> using WB. If false we should only map as WB if the device supports the
>>>> fast shared LLC; in the case of snooped devices we will map using WC.
>>>> Note that if the resource is lmem then we will always map as WC,
>>>> regardless of the value of always_coherent, since that's all we
>>>> currently support.
>>>>
>>>> Maybe the naming is poor?
>>>
>>> Maybe just confusing to me, not sure yet.
>>>
>>> So always_coherent is not about how the caller wants to use it, but
>>> about platform knowledge? Or a performance concern for LLC vs
>>> snooping cases? Does WB work (coherently) on snooping platforms?
>>
>> The always_coherent=true is for the existing callers that want WB, 
>> regardless of LLC vs snooping.
>>
>> The other callers use the existing i915_coherent_map_type() which only 
>> gives out WB for LLC platforms.
>>
>> AFAIK, LLC vs snooping should offer the same in terms of coherency, 
>> but in terms of performance the shared LLC is much faster, and so for 
>> snooping platforms we choose to not enable WB everywhere.
>>
>> On top of that we now have lmem, but for that we only allow WC. This 
>> patch just rolls all of that into one helper, while keeping the 
>> existing behaviour unchanged.
> 
> Thanks. But I am still struggling with the API. :(
> 
> Is the introduction of always_coherent flag in the context of DG1 
> required even? AFAICT for lmem objects the flag is ignored so no?

If we drop the flag/helper thing, then we need something like:

type = WB;
if (i915_gem_object_is_lmem(obj))
     type = WC;

vaddr = i915_gem_object_pin_map(obj, type);

In all the places where we currently do:

vaddr = i915_gem_object_pin_map(obj, WB);

Where obj can be lmem, so ctx, ring, guc etc. Is that better or worse? 
The existing i915_coherent_map_type() callers should work as-is, since 
DG1 is snooped. And this patch just extends that to cover all cases.

Perhaps we need a new helper instead? Maybe you have a better idea?
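
For reference, the selection rule under discussion can be modelled as a small standalone function. This is only a sketch that mirrors the three-way decision in the quoted hunk; the types and names (device_model, object_model, coherent_map_type, MAP_WB, MAP_WC) are hypothetical stand-ins for the i915 ones, and it says nothing about whether the flag is the right API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are not the real i915 definitions. */
enum map_type { MAP_WB, MAP_WC };

struct device_model {
	bool has_llc;   /* models HAS_LLC(i915) */
};

struct object_model {
	bool in_lmem;   /* models i915_gem_object_is_lmem(obj) */
};

static enum map_type
coherent_map_type(const struct device_model *dev,
		  const struct object_model *obj,
		  bool always_coherent)
{
	assert(dev != NULL && obj != NULL);

	/* lmem is only ever mapped WC, whatever the caller asked for. */
	if (obj->in_lmem)
		return MAP_WC;

	/* Fast shared LLC, or a caller that always wanted WB: use WB. */
	if (dev->has_llc || always_coherent)
		return MAP_WB;

	/* Snooped device and no explicit WB request: fall back to WC. */
	return MAP_WC;
}
```

Read this way, always_coherent only matters for non-lmem objects on a device without LLC, which is the single case where the old open-coded WB callers and the old helper disagree.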

> 
> Regards,
> 
> Tvrtko

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [Intel-gfx] [PATCH 11/19] drm/i915: Update the helper to set correct mapping
@ 2021-04-19 14:37                   ` Matthew Auld
  0 siblings, 0 replies; 130+ messages in thread
From: Matthew Auld @ 2021-04-19 14:37 UTC (permalink / raw)
  To: Tvrtko Ursulin, Matthew Auld; +Cc: Intel Graphics Development, ML dri-devel

On 19/04/2021 15:07, Tvrtko Ursulin wrote:
> 
> On 19/04/2021 12:30, Matthew Auld wrote:
>> On 15/04/2021 12:05, Tvrtko Ursulin wrote:
>>>
>>> On 15/04/2021 10:23, Matthew Auld wrote:
>>>> On Thu, 15 Apr 2021 at 09:21, Tvrtko Ursulin
>>>> <tvrtko.ursulin@linux.intel.com> wrote:
>>>>>
>>>>>
>>>>> On 14/04/2021 17:20, Matthew Auld wrote:
>>>>>> On Wed, 14 Apr 2021 at 16:22, Tvrtko Ursulin
>>>>>> <tvrtko.ursulin@linux.intel.com> wrote:
>>>>>>>
>>>>>>>
>>>>>>> On 12/04/2021 10:05, Matthew Auld wrote:
>>>>>>>> From: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>
>>>>>>>>
>>>>>>>> Determine the possible coherent map type based on object location,
>>>>>>>> and if target has llc or if user requires an always coherent
>>>>>>>> mapping.
>>>>>>>>
>>>>>>>> Cc: Matthew Auld <matthew.auld@intel.com>
>>>>>>>> Cc: CQ Tang <cq.tang@intel.com>
>>>>>>>> Suggested-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
>>>>>>>> Signed-off-by: Venkata Sandeep Dhanalakota 
>>>>>>>> <venkata.s.dhanalakota@intel.com>
>>>>>>>> ---
>>>>>>>>     drivers/gpu/drm/i915/gt/intel_engine_cs.c    |  3 ++-
>>>>>>>>     drivers/gpu/drm/i915/gt/intel_engine_pm.c    |  2 +-
>>>>>>>>     drivers/gpu/drm/i915/gt/intel_lrc.c          |  4 +++-
>>>>>>>>     drivers/gpu/drm/i915/gt/intel_ring.c         |  9 ++++++---
>>>>>>>>     drivers/gpu/drm/i915/gt/selftest_context.c   |  3 ++-
>>>>>>>>     drivers/gpu/drm/i915/gt/selftest_hangcheck.c |  4 ++--
>>>>>>>>     drivers/gpu/drm/i915/gt/selftest_lrc.c       |  4 +++-
>>>>>>>>     drivers/gpu/drm/i915/gt/uc/intel_guc.c       |  4 +++-
>>>>>>>>     drivers/gpu/drm/i915/gt/uc/intel_huc.c       |  4 +++-
>>>>>>>>     drivers/gpu/drm/i915/i915_drv.h              | 11 +++++++++--
>>>>>>>>     drivers/gpu/drm/i915/selftests/igt_spinner.c |  4 ++--
>>>>>>>>     11 files changed, 36 insertions(+), 16 deletions(-)
>>>>>>>>
>>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c 
>>>>>>>> b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
>>>>>>>> index efe935f80c1a..b79568d370f5 100644
>>>>>>>> --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
>>>>>>>> +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
>>>>>>>> @@ -664,7 +664,8 @@ static int init_status_page(struct 
>>>>>>>> intel_engine_cs *engine)
>>>>>>>>         if (ret)
>>>>>>>>                 goto err;
>>>>>>>>
>>>>>>>> -     vaddr = i915_gem_object_pin_map(obj, I915_MAP_WB);
>>>>>>>> +     vaddr = i915_gem_object_pin_map(obj,
>>>>>>>> + i915_coherent_map_type(engine->i915, obj, true));
>>>>>>>>         if (IS_ERR(vaddr)) {
>>>>>>>>                 ret = PTR_ERR(vaddr);
>>>>>>>>                 goto err_unpin;
>>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c 
>>>>>>>> b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
>>>>>>>> index 7c9af86fdb1e..47f4397095e5 100644
>>>>>>>> --- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c
>>>>>>>> +++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
>>>>>>>> @@ -23,7 +23,7 @@ static void dbg_poison_ce(struct intel_context 
>>>>>>>> *ce)
>>>>>>>>
>>>>>>>>         if (ce->state) {
>>>>>>>>                 struct drm_i915_gem_object *obj = ce->state->obj;
>>>>>>>> -             int type = i915_coherent_map_type(ce->engine->i915);
>>>>>>>> +             int type = 
>>>>>>>> i915_coherent_map_type(ce->engine->i915, obj, true);
>>>>>>>>                 void *map;
>>>>>>>>
>>>>>>>>                 if (!i915_gem_object_trylock(obj))
>>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c 
>>>>>>>> b/drivers/gpu/drm/i915/gt/intel_lrc.c
>>>>>>>> index e86897cde984..aafe2a4df496 100644
>>>>>>>> --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
>>>>>>>> +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
>>>>>>>> @@ -903,7 +903,9 @@ lrc_pre_pin(struct intel_context *ce,
>>>>>>>>         GEM_BUG_ON(!i915_vma_is_pinned(ce->state));
>>>>>>>>
>>>>>>>>         *vaddr = i915_gem_object_pin_map(ce->state->obj,
>>>>>>>> - i915_coherent_map_type(ce->engine->i915) |
>>>>>>>> + i915_coherent_map_type(ce->engine->i915,
>>>>>>>> + ce->state->obj,
>>>>>>>> + false) |
>>>>>>>>                                          I915_MAP_OVERRIDE);
>>>>>>>>
>>>>>>>>         return PTR_ERR_OR_ZERO(*vaddr);
>>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/intel_ring.c 
>>>>>>>> b/drivers/gpu/drm/i915/gt/intel_ring.c
>>>>>>>> index aee0a77c77e0..3cf6c7e68108 100644
>>>>>>>> --- a/drivers/gpu/drm/i915/gt/intel_ring.c
>>>>>>>> +++ b/drivers/gpu/drm/i915/gt/intel_ring.c
>>>>>>>> @@ -53,9 +53,12 @@ int intel_ring_pin(struct intel_ring *ring, 
>>>>>>>> struct i915_gem_ww_ctx *ww)
>>>>>>>>
>>>>>>>>         if (i915_vma_is_map_and_fenceable(vma))
>>>>>>>>                 addr = (void __force *)i915_vma_pin_iomap(vma);
>>>>>>>> -     else
>>>>>>>> -             addr = i915_gem_object_pin_map(vma->obj,
>>>>>>>> - i915_coherent_map_type(vma->vm->i915));
>>>>>>>> +     else {
>>>>>>>> +             int type = i915_coherent_map_type(vma->vm->i915, 
>>>>>>>> vma->obj, false);
>>>>>>>> +
>>>>>>>> +             addr = i915_gem_object_pin_map(vma->obj, type);
>>>>>>>> +     }
>>>>>>>> +
>>>>>>>>         if (IS_ERR(addr)) {
>>>>>>>>                 ret = PTR_ERR(addr);
>>>>>>>>                 goto err_ring;
>>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/selftest_context.c 
>>>>>>>> b/drivers/gpu/drm/i915/gt/selftest_context.c
>>>>>>>> index b9bdd1d23243..26685b927169 100644
>>>>>>>> --- a/drivers/gpu/drm/i915/gt/selftest_context.c
>>>>>>>> +++ b/drivers/gpu/drm/i915/gt/selftest_context.c
>>>>>>>> @@ -88,7 +88,8 @@ static int __live_context_size(struct 
>>>>>>>> intel_engine_cs *engine)
>>>>>>>>                 goto err;
>>>>>>>>
>>>>>>>>         vaddr = i915_gem_object_pin_map_unlocked(ce->state->obj,
>>>>>>>> - i915_coherent_map_type(engine->i915));
>>>>>>>> + i915_coherent_map_type(engine->i915,
>>>>>>>> + ce->state->obj, false));
>>>>>>>>         if (IS_ERR(vaddr)) {
>>>>>>>>                 err = PTR_ERR(vaddr);
>>>>>>>>                 intel_context_unpin(ce);
>>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c 
>>>>>>>> b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
>>>>>>>> index 746985971c3a..5b63d4df8c93 100644
>>>>>>>> --- a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
>>>>>>>> +++ b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
>>>>>>>> @@ -69,7 +69,7 @@ static int hang_init(struct hang *h, struct 
>>>>>>>> intel_gt *gt)
>>>>>>>>         h->seqno = memset(vaddr, 0xff, PAGE_SIZE);
>>>>>>>>
>>>>>>>>         vaddr = i915_gem_object_pin_map_unlocked(h->obj,
>>>>>>>> - i915_coherent_map_type(gt->i915));
>>>>>>>> + i915_coherent_map_type(gt->i915, h->obj, false));
>>>>>>>>         if (IS_ERR(vaddr)) {
>>>>>>>>                 err = PTR_ERR(vaddr);
>>>>>>>>                 goto err_unpin_hws;
>>>>>>>> @@ -130,7 +130,7 @@ hang_create_request(struct hang *h, struct 
>>>>>>>> intel_engine_cs *engine)
>>>>>>>>                 return ERR_CAST(obj);
>>>>>>>>         }
>>>>>>>>
>>>>>>>> -     vaddr = i915_gem_object_pin_map_unlocked(obj, 
>>>>>>>> i915_coherent_map_type(gt->i915));
>>>>>>>> +     vaddr = i915_gem_object_pin_map_unlocked(obj, 
>>>>>>>> i915_coherent_map_type(gt->i915, obj, false));
>>>>>>>>         if (IS_ERR(vaddr)) {
>>>>>>>>                 i915_gem_object_put(obj);
>>>>>>>>                 i915_vm_put(vm);
>>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/selftest_lrc.c 
>>>>>>>> b/drivers/gpu/drm/i915/gt/selftest_lrc.c
>>>>>>>> index 85e7df6a5123..d8f6623524e8 100644
>>>>>>>> --- a/drivers/gpu/drm/i915/gt/selftest_lrc.c
>>>>>>>> +++ b/drivers/gpu/drm/i915/gt/selftest_lrc.c
>>>>>>>> @@ -1221,7 +1221,9 @@ static int compare_isolation(struct 
>>>>>>>> intel_engine_cs *engine,
>>>>>>>>         }
>>>>>>>>
>>>>>>>>         lrc = i915_gem_object_pin_map_unlocked(ce->state->obj,
>>>>>>>> - i915_coherent_map_type(engine->i915));
>>>>>>>> + i915_coherent_map_type(engine->i915,
>>>>>>>> + ce->state->obj,
>>>>>>>> + false));
>>>>>>>>         if (IS_ERR(lrc)) {
>>>>>>>>                 err = PTR_ERR(lrc);
>>>>>>>>                 goto err_B1;
>>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.c 
>>>>>>>> b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
>>>>>>>> index 78305b2ec89d..adae04c47aab 100644
>>>>>>>> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc.c
>>>>>>>> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
>>>>>>>> @@ -682,7 +682,9 @@ int intel_guc_allocate_and_map_vma(struct 
>>>>>>>> intel_guc *guc, u32 size,
>>>>>>>>         if (IS_ERR(vma))
>>>>>>>>                 return PTR_ERR(vma);
>>>>>>>>
>>>>>>>> -     vaddr = i915_gem_object_pin_map_unlocked(vma->obj, 
>>>>>>>> I915_MAP_WB);
>>>>>>>> +     vaddr = i915_gem_object_pin_map_unlocked(vma->obj,
>>>>>>>> + i915_coherent_map_type(guc_to_gt(guc)->i915,
>>>>>>>> + vma->obj, true));
>>>>>>>>         if (IS_ERR(vaddr)) {
>>>>>>>>                 i915_vma_unpin_and_release(&vma, 0);
>>>>>>>>                 return PTR_ERR(vaddr);
>>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_huc.c 
>>>>>>>> b/drivers/gpu/drm/i915/gt/uc/intel_huc.c
>>>>>>>> index 2126dd81ac38..56d2144dc6a0 100644
>>>>>>>> --- a/drivers/gpu/drm/i915/gt/uc/intel_huc.c
>>>>>>>> +++ b/drivers/gpu/drm/i915/gt/uc/intel_huc.c
>>>>>>>> @@ -82,7 +82,9 @@ static int intel_huc_rsa_data_create(struct 
>>>>>>>> intel_huc *huc)
>>>>>>>>         if (IS_ERR(vma))
>>>>>>>>                 return PTR_ERR(vma);
>>>>>>>>
>>>>>>>> -     vaddr = i915_gem_object_pin_map_unlocked(vma->obj, 
>>>>>>>> I915_MAP_WB);
>>>>>>>> +     vaddr = i915_gem_object_pin_map_unlocked(vma->obj,
>>>>>>>> + i915_coherent_map_type(gt->i915,
>>>>>>>> + vma->obj, true));
>>>>>>>>         if (IS_ERR(vaddr)) {
>>>>>>>>                 i915_vma_unpin_and_release(&vma, 0);
>>>>>>>>                 return PTR_ERR(vaddr);
>>>>>>>> diff --git a/drivers/gpu/drm/i915/i915_drv.h 
>>>>>>>> b/drivers/gpu/drm/i915/i915_drv.h
>>>>>>>> index 69e43bf91a15..2abbc06712a4 100644
>>>>>>>> --- a/drivers/gpu/drm/i915/i915_drv.h
>>>>>>>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>>>>>>>> @@ -78,6 +78,7 @@
>>>>>>>>     #include "gem/i915_gem_context_types.h"
>>>>>>>>     #include "gem/i915_gem_shrinker.h"
>>>>>>>>     #include "gem/i915_gem_stolen.h"
>>>>>>>> +#include "gem/i915_gem_lmem.h"
>>>>>>>>
>>>>>>>>     #include "gt/intel_engine.h"
>>>>>>>>     #include "gt/intel_gt_types.h"
>>>>>>>> @@ -1921,9 +1922,15 @@ static inline int 
>>>>>>>> intel_hws_csb_write_index(struct drm_i915_private *i915)
>>>>>>>>     }
>>>>>>>>
>>>>>>>>     static inline enum i915_map_type
>>>>>>>> -i915_coherent_map_type(struct drm_i915_private *i915)
>>>>>>>> +i915_coherent_map_type(struct drm_i915_private *i915,
>>>>>>>> +                    struct drm_i915_gem_object *obj, bool 
>>>>>>>> always_coherent)
>>>>>>>>     {
>>>>>>>> -     return HAS_LLC(i915) ? I915_MAP_WB : I915_MAP_WC;
>>>>>>>> +     if (i915_gem_object_is_lmem(obj))
>>>>>>>> +             return I915_MAP_WC;
>>>>>>>> +     if (HAS_LLC(i915) || always_coherent)
>>>>>>>> +             return I915_MAP_WB;
>>>>>>>> +     else
>>>>>>>> +             return I915_MAP_WC;
>>>>>>>
>>>>>>> Seems this patch is doing two things.
>>>>>>>
>>>>>>> First it is adding lmem support to this helper by always 
>>>>>>> returning WC
>>>>>>> for lmem objects.
>>>>>>>
>>>>>>> Secondly it is introducing an idea of "always coherent" in a helper
>>>>>>> called i915_coherent_map_type. Could someone explain what is 
>>>>>>> coherent vs
>>>>>>> always coherent?
>>>>>>>
>>>>>>> And also, why is always coherent happy with WB? Sounds counter 
>>>>>>> intuitive
>>>>>>> to me.
>>>>>>
>>>>>> All this does is try to keep the existing behaviour intact, whilst
>>>>>> also ensuring that all lmem objects are mapped using only WC, no
>>>>>> matter what. The always_coherent=true thing is for the existing 
>>>>>> places
>>>>>> where we sometimes map the object using WB, without first considering
>>>>>> whether the device has the fast shared LLC vs snooping. Yes, it's
>>>>>> slightly ugly :)
>>>>>
>>>>> Not fully following - if we had to write kerneldoc for always_coherent
>>>>> input argument - what it would say?
>>>>
>>>> @always_coherent - If true we should always try to map the object
>>>> using WB. If false we should only map as WB if the device supports the
>>>> fast shared LLC; in the case of snooped devices we will map using WC.
>>>> Note that if the resource is lmem then we will always map as WC,
>>>> regardless of the value of always_coherent, since that's all we
>>>> currently support.
>>>>
>>>> Maybe the naming is poor?
>>>
>>> Maybe just confusing to me, not sure yet.
>>>
>>> So always_coherent is not about how the caller wants to use it, but 
>>> about platform knowledge? Or a performance concern for LLC vs 
>>> snooping cases? Does WB work (coherently) on snooping platforms?
>>
>> The always_coherent=true is for the existing callers that want WB, 
>> regardless of LLC vs snooping.
>>
>> The other callers use the existing i915_coherent_map_type() which only 
>> gives out WB for LLC platforms.
>>
>> AFAIK, LLC vs snooping should offer the same in terms of coherency, 
>> but in terms of performance the shared LLC is much faster, and so for 
>> snooping platforms we choose to not enable WB everywhere.
>>
>> On top of that we now have lmem, but for that we only allow WC. This 
>> patch just rolls all of that into one helper, while keeping the 
>> existing behaviour unchanged.
> 
> Thanks. But I am still struggling with the API. :(
> 
> Is the introduction of the always_coherent flag even required in the 
> context of DG1? AFAICT for lmem objects the flag is ignored, so no?

If we drop the flag/helper thing, then we need something like:

type = I915_MAP_WB;
if (i915_gem_object_is_lmem(obj))
     type = I915_MAP_WC;

vaddr = i915_gem_object_pin_map(obj, type);

In all the places where we currently do:

vaddr = i915_gem_object_pin_map(obj, I915_MAP_WB);

Where obj can be lmem, so ctx, ring, guc etc. Is that better or worse? 
The existing i915_coherent_map_type() callers should work as-is, since 
DG1 is snooped. And this patch just extends that to cover all cases.

Perhaps we need a new helper instead? Maybe you have a better idea?
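
One possible shape for such a helper, purely as a sketch: the sketch_* names and types below are hypothetical and do not exist in the driver. The idea would be to leave the existing LLC-based helper untouched and give the former hard-coded WB call sites (ctx, ring, guc etc.) a separate helper that only looks at where the object lives.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for illustration; not the real i915 types. */
enum sketch_map_type { SKETCH_MAP_WB, SKETCH_MAP_WC };

struct sketch_device { bool has_llc; };	/* models HAS_LLC(i915) */
struct sketch_object { bool is_lmem; };	/* models i915_gem_object_is_lmem(obj) */

/*
 * For call sites that used to hard-code WB: keep WB for system
 * memory, fall back to WC when the object is in lmem.
 */
static enum sketch_map_type
sketch_map_type_prefer_wb(const struct sketch_object *obj)
{
	return obj->is_lmem ? SKETCH_MAP_WC : SKETCH_MAP_WB;
}

/*
 * Models the existing one-argument helper, left unchanged:
 * WB only when the device has the shared LLC.
 */
static enum sketch_map_type
sketch_map_type_llc_only(const struct sketch_device *dev)
{
	return dev->has_llc ? SKETCH_MAP_WB : SKETCH_MAP_WC;
}
```

With this split no boolean flag is needed, at the cost of two helpers. It also relies on the assumption stated above that the device with lmem is snooped, so the unchanged LLC-only helper already returns WC there.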

> 
> Regards,
> 
> Tvrtko

* Re: [Intel-gfx] [PATCH 11/19] drm/i915: Update the helper to set correct mapping
  2021-04-19 14:37                   ` Matthew Auld
@ 2021-04-19 15:01                     ` Tvrtko Ursulin
  -1 siblings, 0 replies; 130+ messages in thread
From: Tvrtko Ursulin @ 2021-04-19 15:01 UTC (permalink / raw)
  To: Matthew Auld, Matthew Auld; +Cc: Intel Graphics Development, ML dri-devel


On 19/04/2021 15:37, Matthew Auld wrote:
> On 19/04/2021 15:07, Tvrtko Ursulin wrote:
>>
>> On 19/04/2021 12:30, Matthew Auld wrote:
>>> On 15/04/2021 12:05, Tvrtko Ursulin wrote:
>>>>
>>>> On 15/04/2021 10:23, Matthew Auld wrote:
>>>>> On Thu, 15 Apr 2021 at 09:21, Tvrtko Ursulin
>>>>> <tvrtko.ursulin@linux.intel.com> wrote:
>>>>>>
>>>>>>
>>>>>> On 14/04/2021 17:20, Matthew Auld wrote:
>>>>>>> On Wed, 14 Apr 2021 at 16:22, Tvrtko Ursulin
>>>>>>> <tvrtko.ursulin@linux.intel.com> wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> On 12/04/2021 10:05, Matthew Auld wrote:
>>>>>>>>> From: Venkata Sandeep Dhanalakota 
>>>>>>>>> <venkata.s.dhanalakota@intel.com>
>>>>>>>>>
>>>>>>>>> Determine the possible coherent map type based on object location,
>>>>>>>>> and if target has llc or if user requires an always coherent
>>>>>>>>> mapping.
>>>>>>>>>
>>>>>>>>> Cc: Matthew Auld <matthew.auld@intel.com>
>>>>>>>>> Cc: CQ Tang <cq.tang@intel.com>
>>>>>>>>> Suggested-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
>>>>>>>>> Signed-off-by: Venkata Sandeep Dhanalakota 
>>>>>>>>> <venkata.s.dhanalakota@intel.com>
>>>>>>>>> ---
>>>>>>>>>     drivers/gpu/drm/i915/gt/intel_engine_cs.c    |  3 ++-
>>>>>>>>>     drivers/gpu/drm/i915/gt/intel_engine_pm.c    |  2 +-
>>>>>>>>>     drivers/gpu/drm/i915/gt/intel_lrc.c          |  4 +++-
>>>>>>>>>     drivers/gpu/drm/i915/gt/intel_ring.c         |  9 ++++++---
>>>>>>>>>     drivers/gpu/drm/i915/gt/selftest_context.c   |  3 ++-
>>>>>>>>>     drivers/gpu/drm/i915/gt/selftest_hangcheck.c |  4 ++--
>>>>>>>>>     drivers/gpu/drm/i915/gt/selftest_lrc.c       |  4 +++-
>>>>>>>>>     drivers/gpu/drm/i915/gt/uc/intel_guc.c       |  4 +++-
>>>>>>>>>     drivers/gpu/drm/i915/gt/uc/intel_huc.c       |  4 +++-
>>>>>>>>>     drivers/gpu/drm/i915/i915_drv.h              | 11 +++++++++--
>>>>>>>>>     drivers/gpu/drm/i915/selftests/igt_spinner.c |  4 ++--
>>>>>>>>>     11 files changed, 36 insertions(+), 16 deletions(-)
>>>>>>>>>
>>>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c 
>>>>>>>>> b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
>>>>>>>>> index efe935f80c1a..b79568d370f5 100644
>>>>>>>>> --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
>>>>>>>>> +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
>>>>>>>>> @@ -664,7 +664,8 @@ static int init_status_page(struct 
>>>>>>>>> intel_engine_cs *engine)
>>>>>>>>>         if (ret)
>>>>>>>>>                 goto err;
>>>>>>>>>
>>>>>>>>> -     vaddr = i915_gem_object_pin_map(obj, I915_MAP_WB);
>>>>>>>>> +     vaddr = i915_gem_object_pin_map(obj,
>>>>>>>>> + i915_coherent_map_type(engine->i915, obj, true));
>>>>>>>>>         if (IS_ERR(vaddr)) {
>>>>>>>>>                 ret = PTR_ERR(vaddr);
>>>>>>>>>                 goto err_unpin;
>>>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c 
>>>>>>>>> b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
>>>>>>>>> index 7c9af86fdb1e..47f4397095e5 100644
>>>>>>>>> --- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c
>>>>>>>>> +++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
>>>>>>>>> @@ -23,7 +23,7 @@ static void dbg_poison_ce(struct 
>>>>>>>>> intel_context *ce)
>>>>>>>>>
>>>>>>>>>         if (ce->state) {
>>>>>>>>>                 struct drm_i915_gem_object *obj = ce->state->obj;
>>>>>>>>> -             int type = i915_coherent_map_type(ce->engine->i915);
>>>>>>>>> +             int type = 
>>>>>>>>> i915_coherent_map_type(ce->engine->i915, obj, true);
>>>>>>>>>                 void *map;
>>>>>>>>>
>>>>>>>>>                 if (!i915_gem_object_trylock(obj))
>>>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c 
>>>>>>>>> b/drivers/gpu/drm/i915/gt/intel_lrc.c
>>>>>>>>> index e86897cde984..aafe2a4df496 100644
>>>>>>>>> --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
>>>>>>>>> +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
>>>>>>>>> @@ -903,7 +903,9 @@ lrc_pre_pin(struct intel_context *ce,
>>>>>>>>>         GEM_BUG_ON(!i915_vma_is_pinned(ce->state));
>>>>>>>>>
>>>>>>>>>         *vaddr = i915_gem_object_pin_map(ce->state->obj,
>>>>>>>>> - i915_coherent_map_type(ce->engine->i915) |
>>>>>>>>> + i915_coherent_map_type(ce->engine->i915,
>>>>>>>>> + ce->state->obj,
>>>>>>>>> + false) |
>>>>>>>>>                                          I915_MAP_OVERRIDE);
>>>>>>>>>
>>>>>>>>>         return PTR_ERR_OR_ZERO(*vaddr);
>>>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/intel_ring.c 
>>>>>>>>> b/drivers/gpu/drm/i915/gt/intel_ring.c
>>>>>>>>> index aee0a77c77e0..3cf6c7e68108 100644
>>>>>>>>> --- a/drivers/gpu/drm/i915/gt/intel_ring.c
>>>>>>>>> +++ b/drivers/gpu/drm/i915/gt/intel_ring.c
>>>>>>>>> @@ -53,9 +53,12 @@ int intel_ring_pin(struct intel_ring *ring, 
>>>>>>>>> struct i915_gem_ww_ctx *ww)
>>>>>>>>>
>>>>>>>>>         if (i915_vma_is_map_and_fenceable(vma))
>>>>>>>>>                 addr = (void __force *)i915_vma_pin_iomap(vma);
>>>>>>>>> -     else
>>>>>>>>> -             addr = i915_gem_object_pin_map(vma->obj,
>>>>>>>>> - i915_coherent_map_type(vma->vm->i915));
>>>>>>>>> +     else {
>>>>>>>>> +             int type = i915_coherent_map_type(vma->vm->i915, 
>>>>>>>>> vma->obj, false);
>>>>>>>>> +
>>>>>>>>> +             addr = i915_gem_object_pin_map(vma->obj, type);
>>>>>>>>> +     }
>>>>>>>>> +
>>>>>>>>>         if (IS_ERR(addr)) {
>>>>>>>>>                 ret = PTR_ERR(addr);
>>>>>>>>>                 goto err_ring;
>>>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/selftest_context.c 
>>>>>>>>> b/drivers/gpu/drm/i915/gt/selftest_context.c
>>>>>>>>> index b9bdd1d23243..26685b927169 100644
>>>>>>>>> --- a/drivers/gpu/drm/i915/gt/selftest_context.c
>>>>>>>>> +++ b/drivers/gpu/drm/i915/gt/selftest_context.c
>>>>>>>>> @@ -88,7 +88,8 @@ static int __live_context_size(struct 
>>>>>>>>> intel_engine_cs *engine)
>>>>>>>>>                 goto err;
>>>>>>>>>
>>>>>>>>>         vaddr = i915_gem_object_pin_map_unlocked(ce->state->obj,
>>>>>>>>> - i915_coherent_map_type(engine->i915));
>>>>>>>>> + i915_coherent_map_type(engine->i915,
>>>>>>>>> + ce->state->obj, false));
>>>>>>>>>         if (IS_ERR(vaddr)) {
>>>>>>>>>                 err = PTR_ERR(vaddr);
>>>>>>>>>                 intel_context_unpin(ce);
>>>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c 
>>>>>>>>> b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
>>>>>>>>> index 746985971c3a..5b63d4df8c93 100644
>>>>>>>>> --- a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
>>>>>>>>> +++ b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
>>>>>>>>> @@ -69,7 +69,7 @@ static int hang_init(struct hang *h, struct 
>>>>>>>>> int