dri-devel.lists.freedesktop.org archive mirror
* [PATCH 00/19] More DG1 enabling
@ 2021-04-12  9:05 Matthew Auld
  2021-04-12  9:05 ` [PATCH 01/19] drm/i915/gt: Skip aperture remapping selftest where there is no aperture Matthew Auld
                   ` (18 more replies)
  0 siblings, 19 replies; 65+ messages in thread
From: Matthew Auld @ 2021-04-12  9:05 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel

Next batch of DG1 patches. With this we should now get a booting DG1 system with
the kernel selftests passing.

Anshuman Gupta (1):
  drm/i915/oprom: Basic sanitization

Anusha Srivatsa (1):
  drm/i915/lmem: Bypass aperture when lmem is available

CQ Tang (3):
  drm/i915: Create stolen memory region from local memory
  drm/i915/stolen: enforce the min_page_size contract
  drm/i915/stolen: pass the allocation flags

Chris Wilson (2):
  drm/i915/gt: Skip aperture remapping selftest where there is no
    aperture
  drm/i915/selftests: Only query RAPL for integrated power measurements

Clint Taylor (3):
  drm/i915/dg1: Read OPROM via SPI controller
  drm/i915/dg1: Compute MEM Bandwidth using MCHBAR
  drm/i915/dg1: Double memory bandwidth available

José Roberto de Souza (1):
  drm/i915: WA for zero memory channel

Matt Roper (1):
  drm/i915/lmem: Fail driver init if LMEM training failed

Matthew Auld (3):
  drm/i915/stolen: treat stolen local as normal local memory
  drm/i915/gtt: map the PD up front
  drm/i915/gtt/dgfx: place the PD in LMEM

Mohammed Khajapasha (2):
  drm/i915/fbdev: Use lmem physical addresses for fb_mmap() on discrete
  drm/i915: Return error value when bo not in LMEM for discrete

Venkata Ramana Nayana (1):
  drm/i915/dg1: Fix mapping type for default state object

Venkata Sandeep Dhanalakota (1):
  drm/i915: Update the helper to set correct mapping

 drivers/gpu/drm/i915/display/intel_bios.c     |  75 +++++++-
 drivers/gpu/drm/i915/display/intel_bw.c       |  63 ++++++-
 drivers/gpu/drm/i915/display/intel_display.c  |  10 ++
 drivers/gpu/drm/i915/display/intel_fbdev.c    |  51 ++++--
 drivers/gpu/drm/i915/display/intel_opregion.c | 169 ++++++++++++++++++
 drivers/gpu/drm/i915/display/intel_opregion.h |  38 +++-
 drivers/gpu/drm/i915/gem/i915_gem_lmem.c      |  20 ++-
 drivers/gpu/drm/i915/gem/i915_gem_lmem.h      |   5 +
 drivers/gpu/drm/i915/gem/i915_gem_stolen.c    | 116 ++++++++++--
 drivers/gpu/drm/i915/gem/i915_gem_stolen.h    |   3 +
 .../drm/i915/gem/selftests/i915_gem_context.c |  11 +-
 drivers/gpu/drm/i915/gt/gen6_ppgtt.c          |  11 +-
 drivers/gpu/drm/i915/gt/gen8_ppgtt.c          |  31 ++--
 drivers/gpu/drm/i915/gt/intel_engine_cs.c     |   3 +-
 drivers/gpu/drm/i915/gt/intel_engine_pm.c     |   2 +-
 drivers/gpu/drm/i915/gt/intel_ggtt.c          |   2 +-
 drivers/gpu/drm/i915/gt/intel_gtt.c           |  71 +++++---
 drivers/gpu/drm/i915/gt/intel_gtt.h           |  12 +-
 drivers/gpu/drm/i915/gt/intel_lrc.c           |   4 +-
 drivers/gpu/drm/i915/gt/intel_ppgtt.c         |   7 +-
 drivers/gpu/drm/i915/gt/intel_ring.c          |   9 +-
 drivers/gpu/drm/i915/gt/selftest_context.c    |   3 +-
 drivers/gpu/drm/i915/gt/selftest_hangcheck.c  |   4 +-
 drivers/gpu/drm/i915/gt/selftest_lrc.c        |   4 +-
 drivers/gpu/drm/i915/gt/selftest_rc6.c        |  32 ++--
 drivers/gpu/drm/i915/gt/selftest_rps.c        |   2 +-
 drivers/gpu/drm/i915/gt/shmem_utils.c         |   4 +-
 drivers/gpu/drm/i915/gt/uc/intel_guc.c        |   4 +-
 drivers/gpu/drm/i915/gt/uc/intel_huc.c        |   4 +-
 drivers/gpu/drm/i915/i915_drv.h               |  11 +-
 drivers/gpu/drm/i915/i915_pci.c               |   2 +-
 drivers/gpu/drm/i915/i915_reg.h               |  12 ++
 drivers/gpu/drm/i915/i915_vma.c               |  22 ++-
 drivers/gpu/drm/i915/intel_memory_region.c    |   6 +
 drivers/gpu/drm/i915/intel_memory_region.h    |   5 +-
 drivers/gpu/drm/i915/intel_uncore.c           |  12 ++
 drivers/gpu/drm/i915/selftests/i915_gem_gtt.c |  10 +-
 drivers/gpu/drm/i915/selftests/i915_perf.c    |   3 +-
 drivers/gpu/drm/i915/selftests/i915_vma.c     |   3 +
 drivers/gpu/drm/i915/selftests/igt_spinner.c  |   4 +-
 drivers/gpu/drm/i915/selftests/librapl.c      |  10 ++
 drivers/gpu/drm/i915/selftests/librapl.h      |   4 +
 42 files changed, 716 insertions(+), 158 deletions(-)

-- 
2.26.3

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


* [PATCH 01/19] drm/i915/gt: Skip aperture remapping selftest where there is no aperture
  2021-04-12  9:05 [PATCH 00/19] More DG1 enabling Matthew Auld
@ 2021-04-12  9:05 ` Matthew Auld
  2021-04-12 14:48   ` [Intel-gfx] " Daniel Vetter
  2021-04-12  9:05 ` [PATCH 02/19] drm/i915/selftests: Only query RAPL for integrated power measurements Matthew Auld
                   ` (17 subsequent siblings)
  18 siblings, 1 reply; 65+ messages in thread
From: Matthew Auld @ 2021-04-12  9:05 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel, Chris Wilson

From: Chris Wilson <chris@chris-wilson.co.uk>

If there is no mappable aperture, we cannot remap it for access, and the
selftest is void.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
---
 drivers/gpu/drm/i915/selftests/i915_vma.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/i915/selftests/i915_vma.c b/drivers/gpu/drm/i915/selftests/i915_vma.c
index 5fe7b80ca0bd..dd0607254a95 100644
--- a/drivers/gpu/drm/i915/selftests/i915_vma.c
+++ b/drivers/gpu/drm/i915/selftests/i915_vma.c
@@ -967,6 +967,9 @@ static int igt_vma_remapped_gtt(void *arg)
 	intel_wakeref_t wakeref;
 	int err = 0;
 
+	if (!i915_ggtt_has_aperture(&i915->ggtt))
+		return 0;
+
 	obj = i915_gem_object_create_internal(i915, 10 * 10 * PAGE_SIZE);
 	if (IS_ERR(obj))
 		return PTR_ERR(obj);
-- 
2.26.3


* [PATCH 02/19] drm/i915/selftests: Only query RAPL for integrated power measurements
  2021-04-12  9:05 [PATCH 00/19] More DG1 enabling Matthew Auld
  2021-04-12  9:05 ` [PATCH 01/19] drm/i915/gt: Skip aperture remapping selftest where there is no aperture Matthew Auld
@ 2021-04-12  9:05 ` Matthew Auld
  2021-04-12  9:05 ` [PATCH 03/19] drm/i915: Create stolen memory region from local memory Matthew Auld
                   ` (16 subsequent siblings)
  18 siblings, 0 replies; 65+ messages in thread
From: Matthew Auld @ 2021-04-12  9:05 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel, Chris Wilson

From: Chris Wilson <chris@chris-wilson.co.uk>

RAPL provides on-package power measurements, which do not encompass
discrete graphics, so let's avoid using the igfx measurements when testing
dgfx. Later we will abstract the simple librapl interface over hwmon so
that we can verify basic power consumption scenarios.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
---
 drivers/gpu/drm/i915/gt/selftest_rc6.c   | 32 +++++++++++++++---------
 drivers/gpu/drm/i915/gt/selftest_rps.c   |  2 +-
 drivers/gpu/drm/i915/selftests/librapl.c | 10 ++++++++
 drivers/gpu/drm/i915/selftests/librapl.h |  4 +++
 4 files changed, 35 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/selftest_rc6.c b/drivers/gpu/drm/i915/gt/selftest_rc6.c
index f097e420ac45..710f825f6e5a 100644
--- a/drivers/gpu/drm/i915/gt/selftest_rc6.c
+++ b/drivers/gpu/drm/i915/gt/selftest_rc6.c
@@ -34,6 +34,7 @@ int live_rc6_manual(void *arg)
 	struct intel_rc6 *rc6 = &gt->rc6;
 	u64 rc0_power, rc6_power;
 	intel_wakeref_t wakeref;
+	bool has_power;
 	ktime_t dt;
 	u64 res[2];
 	int err = 0;
@@ -50,6 +51,7 @@ int live_rc6_manual(void *arg)
 	if (IS_VALLEYVIEW(gt->i915) || IS_CHERRYVIEW(gt->i915))
 		return 0;
 
+	has_power = librapl_supported(gt->i915);
 	wakeref = intel_runtime_pm_get(gt->uncore->rpm);
 
 	/* Force RC6 off for starters */
@@ -71,11 +73,14 @@ int live_rc6_manual(void *arg)
 		goto out_unlock;
 	}
 
-	rc0_power = div64_u64(NSEC_PER_SEC * rc0_power, ktime_to_ns(dt));
-	if (!rc0_power) {
-		pr_err("No power measured while in RC0\n");
-		err = -EINVAL;
-		goto out_unlock;
+	if (has_power) {
+		rc0_power = div64_u64(NSEC_PER_SEC * rc0_power,
+				      ktime_to_ns(dt));
+		if (!rc0_power) {
+			pr_err("No power measured while in RC0\n");
+			err = -EINVAL;
+			goto out_unlock;
+		}
 	}
 
 	/* Manually enter RC6 */
@@ -97,13 +102,16 @@ int live_rc6_manual(void *arg)
 		err = -EINVAL;
 	}
 
-	rc6_power = div64_u64(NSEC_PER_SEC * rc6_power, ktime_to_ns(dt));
-	pr_info("GPU consumed %llduW in RC0 and %llduW in RC6\n",
-		rc0_power, rc6_power);
-	if (2 * rc6_power > rc0_power) {
-		pr_err("GPU leaked energy while in RC6!\n");
-		err = -EINVAL;
-		goto out_unlock;
+	if (has_power) {
+		rc6_power = div64_u64(NSEC_PER_SEC * rc6_power,
+				      ktime_to_ns(dt));
+		pr_info("GPU consumed %llduW in RC0 and %llduW in RC6\n",
+			rc0_power, rc6_power);
+		if (2 * rc6_power > rc0_power) {
+			pr_err("GPU leaked energy while in RC6!\n");
+			err = -EINVAL;
+			goto out_unlock;
+		}
 	}
 
 	/* Restore what should have been the original state! */
diff --git a/drivers/gpu/drm/i915/gt/selftest_rps.c b/drivers/gpu/drm/i915/gt/selftest_rps.c
index 967641fee42a..adf7fdbc00f7 100644
--- a/drivers/gpu/drm/i915/gt/selftest_rps.c
+++ b/drivers/gpu/drm/i915/gt/selftest_rps.c
@@ -1139,7 +1139,7 @@ int live_rps_power(void *arg)
 	if (!intel_rps_is_enabled(rps) || INTEL_GEN(gt->i915) < 6)
 		return 0;
 
-	if (!librapl_energy_uJ())
+	if (!librapl_supported(gt->i915))
 		return 0;
 
 	if (igt_spinner_init(&spin, gt))
diff --git a/drivers/gpu/drm/i915/selftests/librapl.c b/drivers/gpu/drm/i915/selftests/librapl.c
index 58710ac3f979..eb03b5b28bad 100644
--- a/drivers/gpu/drm/i915/selftests/librapl.c
+++ b/drivers/gpu/drm/i915/selftests/librapl.c
@@ -5,8 +5,18 @@
 
 #include <asm/msr.h>
 
+#include "i915_drv.h"
 #include "librapl.h"
 
+bool librapl_supported(const struct drm_i915_private *i915)
+{
+	/* Discrete cards require hwmon integration */
+	if (IS_DGFX(i915))
+		return false;
+
+	return librapl_energy_uJ();
+}
+
 u64 librapl_energy_uJ(void)
 {
 	unsigned long long power;
diff --git a/drivers/gpu/drm/i915/selftests/librapl.h b/drivers/gpu/drm/i915/selftests/librapl.h
index 887f3e91dd05..e3b24fad0a7a 100644
--- a/drivers/gpu/drm/i915/selftests/librapl.h
+++ b/drivers/gpu/drm/i915/selftests/librapl.h
@@ -8,6 +8,10 @@
 
 #include <linux/types.h>
 
+struct drm_i915_private;
+
+bool librapl_supported(const struct drm_i915_private *i915);
+
 u64 librapl_energy_uJ(void);
 
 #endif /* SELFTEST_LIBRAPL_H */
-- 
2.26.3
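
Editor's sketch (not part of the patch): the selftest converts an energy
delta into an average power and only judges the result when a usable
meter exists. The standalone C below mirrors that arithmetic under
simplifying assumptions: the names average_power_uW() and
check_rc6_power() are hypothetical, one shared interval is used for both
samples, and plain -1 stands in for the kernel's -EINVAL.

```c
#include <stdbool.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Average power in uW from an energy delta in uJ over dt_ns nanoseconds. */
static uint64_t average_power_uW(uint64_t energy_uJ, uint64_t dt_ns)
{
	if (dt_ns == 0)
		return 0; /* guard against division by zero in this sketch */

	return NSEC_PER_SEC * energy_uJ / dt_ns;
}

/*
 * Hypothetical stand-in for the gated checks in live_rc6_manual().
 * Returns 0 when the check passes or is skipped, -1 when it fails.
 */
static int check_rc6_power(bool has_power, uint64_t rc0_uJ,
			   uint64_t rc6_uJ, uint64_t dt_ns)
{
	uint64_t rc0_power, rc6_power;

	if (!has_power)
		return 0; /* no usable meter, e.g. discrete: skip the check */

	rc0_power = average_power_uW(rc0_uJ, dt_ns);
	if (!rc0_power)
		return -1; /* no power measured while in RC0 */

	rc6_power = average_power_uW(rc6_uJ, dt_ns);
	if (2 * rc6_power > rc0_power)
		return -1; /* RC6 should draw at most half of RC0 */

	return 0;
}
```

With has_power false the function reports success without looking at the
samples, which is the behaviour the patch introduces for dgfx.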


* [PATCH 03/19] drm/i915: Create stolen memory region from local memory
  2021-04-12  9:05 [PATCH 00/19] More DG1 enabling Matthew Auld
  2021-04-12  9:05 ` [PATCH 01/19] drm/i915/gt: Skip aperture remapping selftest where there is no aperture Matthew Auld
  2021-04-12  9:05 ` [PATCH 02/19] drm/i915/selftests: Only query RAPL for integrated power measurements Matthew Auld
@ 2021-04-12  9:05 ` Matthew Auld
  2021-04-14 15:01   ` [Intel-gfx] " Tvrtko Ursulin
  2021-04-12  9:05 ` [PATCH 04/19] drm/i915/stolen: treat stolen local as normal " Matthew Auld
                   ` (15 subsequent siblings)
  18 siblings, 1 reply; 65+ messages in thread
From: Matthew Auld @ 2021-04-12  9:05 UTC (permalink / raw)
  To: intel-gfx; +Cc: CQ Tang, dri-devel

From: CQ Tang <cq.tang@intel.com>

Add the "REGION_STOLEN_LMEM" device info to DG1 and create the stolen
memory region from the upper portion of local device memory, starting
from DSMBASE.

v2:
    - s/drm_info/drm_dbg; userspace likely doesn't care about stolen.
    - mem->type is only set up after the region probe, so setting the name
      as stolen-local or stolen-system based on this value won't work. Split
      system vs local stolen setup to fix this.
    - kill all the region->devmem/is_devmem stuff. We already differentiate
      the different types of stolen so such things shouldn't be needed
      anymore.

Signed-off-by: CQ Tang <cq.tang@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_stolen.c | 99 +++++++++++++++++++---
 drivers/gpu/drm/i915/gem/i915_gem_stolen.h |  3 +
 drivers/gpu/drm/i915/i915_pci.c            |  2 +-
 drivers/gpu/drm/i915/i915_reg.h            |  1 +
 drivers/gpu/drm/i915/intel_memory_region.c |  6 ++
 drivers/gpu/drm/i915/intel_memory_region.h |  5 +-
 6 files changed, 102 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_stolen.c b/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
index b0597de206de..56dd58bef5ee 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
@@ -10,6 +10,7 @@
 #include <drm/drm_mm.h>
 #include <drm/i915_drm.h>
 
+#include "gem/i915_gem_lmem.h"
 #include "gem/i915_gem_region.h"
 #include "i915_drv.h"
 #include "i915_gem_stolen.h"
@@ -121,6 +122,14 @@ static int i915_adjust_stolen(struct drm_i915_private *i915,
 		}
 	}
 
+	/*
+	 * With device local memory, we don't need to check the address range,
+	 * this is device memory physical address, could overlap with system
+	 * memory.
+	 */
+	if (HAS_LMEM(i915))
+		return 0;
+
 	/*
 	 * Verify that nothing else uses this physical address. Stolen
 	 * memory should be reserved by the BIOS and hidden from the
@@ -374,8 +383,9 @@ static void icl_get_stolen_reserved(struct drm_i915_private *i915,
 	}
 }
 
-static int i915_gem_init_stolen(struct drm_i915_private *i915)
+static int i915_gem_init_stolen(struct intel_memory_region *mem)
 {
+	struct drm_i915_private *i915 = mem->i915;
 	struct intel_uncore *uncore = &i915->uncore;
 	resource_size_t reserved_base, stolen_top;
 	resource_size_t reserved_total, reserved_size;
@@ -396,10 +406,10 @@ static int i915_gem_init_stolen(struct drm_i915_private *i915)
 		return 0;
 	}
 
-	if (resource_size(&intel_graphics_stolen_res) == 0)
+	if (resource_size(&mem->region) == 0)
 		return 0;
 
-	i915->dsm = intel_graphics_stolen_res;
+	i915->dsm = mem->region;
 
 	if (i915_adjust_stolen(i915, &i915->dsm))
 		return 0;
@@ -684,23 +694,36 @@ static int _i915_gem_object_stolen_init(struct intel_memory_region *mem,
 	return ret;
 }
 
+struct intel_memory_region *i915_stolen_region(struct drm_i915_private *i915)
+{
+	if (HAS_LMEM(i915))
+		return i915->mm.regions[INTEL_REGION_STOLEN_LMEM];
+
+	return i915->mm.regions[INTEL_REGION_STOLEN_SMEM];
+}
+
 struct drm_i915_gem_object *
 i915_gem_object_create_stolen(struct drm_i915_private *i915,
 			      resource_size_t size)
 {
-	return i915_gem_object_create_region(i915->mm.regions[INTEL_REGION_STOLEN_SMEM],
+	return i915_gem_object_create_region(i915_stolen_region(i915),
 					     size, I915_BO_ALLOC_CONTIGUOUS);
 }
 
 static int init_stolen(struct intel_memory_region *mem)
 {
-	intel_memory_region_set_name(mem, "stolen");
+	if (HAS_LMEM(mem->i915)) {
+		if (!io_mapping_init_wc(&mem->iomap,
+					mem->io_start,
+					resource_size(&mem->region)))
+			return -EIO;
+	}
 
 	/*
 	 * Initialise stolen early so that we may reserve preallocated
 	 * objects for the BIOS to KMS transition.
 	 */
-	return i915_gem_init_stolen(mem->i915);
+	return i915_gem_init_stolen(mem);
 }
 
 static void release_stolen(struct intel_memory_region *mem)
@@ -714,13 +737,65 @@ static const struct intel_memory_region_ops i915_region_stolen_ops = {
 	.init_object = _i915_gem_object_stolen_init,
 };
 
+static struct intel_memory_region *
+setup_lmem_stolen(struct drm_i915_private *i915)
+{
+	struct intel_uncore *uncore = &i915->uncore;
+	struct pci_dev *pdev = i915->drm.pdev;
+	struct intel_memory_region *mem;
+	resource_size_t io_start;
+	resource_size_t lmem_size;
+	u64 lmem_base;
+
+	if (!IS_DGFX(i915))
+		return ERR_PTR(-ENODEV);
+
+	lmem_base = intel_uncore_read64(uncore, GEN12_DSMBASE);
+	lmem_size = pci_resource_len(pdev, 2) - lmem_base;
+	io_start = pci_resource_start(pdev, 2) + lmem_base;
+
+	mem = intel_memory_region_create(i915, lmem_base, lmem_size,
+					 I915_GTT_PAGE_SIZE_4K, io_start,
+					 &i915_region_stolen_ops);
+	if (IS_ERR(mem))
+		return mem;
+
+	drm_dbg(&i915->drm, "Stolen Local memory: %pR\n", &mem->region);
+	drm_dbg(&i915->drm, "Stolen Local memory IO start: %pa\n",
+		&mem->io_start);
+
+	intel_memory_region_set_name(mem, "stolen-local");
+
+	return mem;
+}
+
+static struct intel_memory_region*
+setup_smem_stolen(struct drm_i915_private *i915)
+{
+	struct intel_memory_region *mem;
+
+	mem = intel_memory_region_create(i915,
+					 intel_graphics_stolen_res.start,
+					 resource_size(&intel_graphics_stolen_res),
+					 PAGE_SIZE, 0,
+					 &i915_region_stolen_ops);
+	if (IS_ERR(mem))
+		return mem;
+
+	intel_memory_region_set_name(mem, "stolen-system");
+
+	return mem;
+}
+
 struct intel_memory_region *i915_gem_stolen_setup(struct drm_i915_private *i915)
 {
-	return intel_memory_region_create(i915,
-					  intel_graphics_stolen_res.start,
-					  resource_size(&intel_graphics_stolen_res),
-					  PAGE_SIZE, 0,
-					  &i915_region_stolen_ops);
+	struct intel_memory_region *mem;
+
+	mem = setup_lmem_stolen(i915);
+	if (mem == ERR_PTR(-ENODEV))
+		mem = setup_smem_stolen(i915);
+
+	return mem;
 }
 
 struct drm_i915_gem_object *
@@ -728,7 +803,7 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_i915_private *i915,
 					       resource_size_t stolen_offset,
 					       resource_size_t size)
 {
-	struct intel_memory_region *mem = i915->mm.regions[INTEL_REGION_STOLEN_SMEM];
+	struct intel_memory_region *mem = i915_stolen_region(i915);
 	struct drm_i915_gem_object *obj;
 	struct drm_mm_node *stolen;
 	int ret;
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_stolen.h b/drivers/gpu/drm/i915/gem/i915_gem_stolen.h
index b03489706796..2d1ce7fec61c 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_stolen.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_stolen.h
@@ -22,6 +22,9 @@ int i915_gem_stolen_insert_node_in_range(struct drm_i915_private *dev_priv,
 void i915_gem_stolen_remove_node(struct drm_i915_private *dev_priv,
 				 struct drm_mm_node *node);
 struct intel_memory_region *i915_gem_stolen_setup(struct drm_i915_private *i915);
+
+struct intel_memory_region *i915_stolen_region(struct drm_i915_private *i915);
+
 struct drm_i915_gem_object *
 i915_gem_object_create_stolen(struct drm_i915_private *dev_priv,
 			      resource_size_t size);
diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
index 480553746794..53f5d1e6daef 100644
--- a/drivers/gpu/drm/i915/i915_pci.c
+++ b/drivers/gpu/drm/i915/i915_pci.c
@@ -906,7 +906,7 @@ static const struct intel_device_info rkl_info = {
 
 #define GEN12_DGFX_FEATURES \
 	GEN12_FEATURES, \
-	.memory_regions = REGION_SMEM | REGION_LMEM, \
+	.memory_regions = REGION_SMEM | REGION_LMEM | REGION_STOLEN_LMEM, \
 	.has_master_unit_irq = 1, \
 	.has_llc = 0, \
 	.has_snoop = 1, \
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index e087bcd21911..4108f2a7ebfa 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -12191,6 +12191,7 @@ enum skl_power_gate {
 #define GEN12_GLOBAL_MOCS(i)	_MMIO(0x4000 + (i) * 4) /* Global MOCS regs */
 
 #define GEN12_GSMBASE			_MMIO(0x108100)
+#define GEN12_DSMBASE			_MMIO(0x1080C0)
 
 /* gamt regs */
 #define GEN8_L3_LRA_1_GPGPU _MMIO(0x4dd4)
diff --git a/drivers/gpu/drm/i915/intel_memory_region.c b/drivers/gpu/drm/i915/intel_memory_region.c
index bf837b6bb185..ac90b76a3fa0 100644
--- a/drivers/gpu/drm/i915/intel_memory_region.c
+++ b/drivers/gpu/drm/i915/intel_memory_region.c
@@ -22,6 +22,10 @@ static const struct {
 		.class = INTEL_MEMORY_STOLEN_SYSTEM,
 		.instance = 0,
 	},
+	[INTEL_REGION_STOLEN_LMEM] = {
+		.class = INTEL_MEMORY_STOLEN_LOCAL,
+		.instance = 0,
+	},
 };
 
 struct intel_memory_region *
@@ -278,6 +282,8 @@ int intel_memory_regions_hw_probe(struct drm_i915_private *i915)
 		case INTEL_MEMORY_SYSTEM:
 			mem = i915_gem_shmem_setup(i915);
 			break;
+		case INTEL_MEMORY_STOLEN_LOCAL:
+			fallthrough;
 		case INTEL_MEMORY_STOLEN_SYSTEM:
 			mem = i915_gem_stolen_setup(i915);
 			break;
diff --git a/drivers/gpu/drm/i915/intel_memory_region.h b/drivers/gpu/drm/i915/intel_memory_region.h
index edd49067c8ca..4c8ec15af55f 100644
--- a/drivers/gpu/drm/i915/intel_memory_region.h
+++ b/drivers/gpu/drm/i915/intel_memory_region.h
@@ -26,18 +26,21 @@ enum intel_memory_type {
 	INTEL_MEMORY_SYSTEM = 0,
 	INTEL_MEMORY_LOCAL,
 	INTEL_MEMORY_STOLEN_SYSTEM,
+	INTEL_MEMORY_STOLEN_LOCAL,
 };
 
 enum intel_region_id {
 	INTEL_REGION_SMEM = 0,
 	INTEL_REGION_LMEM,
 	INTEL_REGION_STOLEN_SMEM,
+	INTEL_REGION_STOLEN_LMEM,
 	INTEL_REGION_UNKNOWN, /* Should be last */
 };
 
 #define REGION_SMEM     BIT(INTEL_REGION_SMEM)
 #define REGION_LMEM     BIT(INTEL_REGION_LMEM)
 #define REGION_STOLEN_SMEM   BIT(INTEL_REGION_STOLEN_SMEM)
+#define REGION_STOLEN_LMEM   BIT(INTEL_REGION_STOLEN_LMEM)
 
 #define I915_ALLOC_MIN_PAGE_SIZE  BIT(0)
 #define I915_ALLOC_CONTIGUOUS     BIT(1)
@@ -82,7 +85,7 @@ struct intel_memory_region {
 	u16 type;
 	u16 instance;
 	enum intel_region_id id;
-	char name[8];
+	char name[16];
 
 	struct list_head reserved;
 
-- 
2.26.3
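
Editor's sketch (not part of the patch): setup_lmem_stolen() derives the
stolen-local region from the DSMBASE offset and the local-memory BAR.
The standalone C below reproduces only that address arithmetic. The
struct and function names are hypothetical, and the BAR address and
DSMBASE value used to exercise it are made up for illustration.

```c
#include <stdint.h>

/* Hypothetical container for the three values the patch computes. */
struct stolen_layout {
	uint64_t base;     /* offset of stolen within local memory (DSMBASE) */
	uint64_t size;     /* bytes from DSMBASE to the end of the BAR */
	uint64_t io_start; /* CPU-visible address where stolen begins */
};

/*
 * Mirrors the arithmetic in setup_lmem_stolen(): stolen occupies the top
 * of local memory, from dsm_base up to the end of the BAR.
 * Assumes dsm_base <= bar_len; the sketch does not validate this.
 */
static struct stolen_layout
stolen_lmem_layout(uint64_t bar_start, uint64_t bar_len, uint64_t dsm_base)
{
	struct stolen_layout l;

	l.base = dsm_base;
	l.size = bar_len - dsm_base;
	l.io_start = bar_start + dsm_base;

	return l;
}
```

For a made-up 4 GiB BAR at 0x4000000000 with DSMBASE at 0xFE000000, this
yields a 32 MiB region whose IO start is 0x40FE000000.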


* [PATCH 04/19] drm/i915/stolen: treat stolen local as normal local memory
  2021-04-12  9:05 [PATCH 00/19] More DG1 enabling Matthew Auld
                   ` (2 preceding siblings ...)
  2021-04-12  9:05 ` [PATCH 03/19] drm/i915: Create stolen memory region from local memory Matthew Auld
@ 2021-04-12  9:05 ` Matthew Auld
  2021-04-14 15:06   ` [Intel-gfx] " Tvrtko Ursulin
  2021-04-12  9:05 ` [PATCH 05/19] drm/i915/stolen: enforce the min_page_size contract Matthew Auld
                   ` (14 subsequent siblings)
  18 siblings, 1 reply; 65+ messages in thread
From: Matthew Auld @ 2021-04-12  9:05 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel

Underneath it's the same stuff, so things like the PTE_LM bits for the
GTT should just keep working as-is.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_lmem.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_lmem.c b/drivers/gpu/drm/i915/gem/i915_gem_lmem.c
index ce1c83c13d05..017db8f71130 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_lmem.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_lmem.c
@@ -19,7 +19,10 @@ const struct drm_i915_gem_object_ops i915_gem_lmem_obj_ops = {
 
 bool i915_gem_object_is_lmem(struct drm_i915_gem_object *obj)
 {
-	return obj->ops == &i915_gem_lmem_obj_ops;
+	struct intel_memory_region *mr = obj->mm.region;
+
+	return mr && (mr->type == INTEL_MEMORY_LOCAL ||
+		      mr->type == INTEL_MEMORY_STOLEN_LOCAL);
 }
 
 struct drm_i915_gem_object *
-- 
2.26.3
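
Editor's sketch (not part of the patch): the change replaces an ops
pointer comparison with a test on the backing region's type, so that
stolen-local objects also count as local memory. The standalone C below
models that predicate with simplified, hypothetical types; only the
shape of the check is taken from the diff.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the driver's enum and structs. */
enum memory_type {
	MEMORY_SYSTEM,
	MEMORY_LOCAL,
	MEMORY_STOLEN_SYSTEM,
	MEMORY_STOLEN_LOCAL,
};

struct memory_region {
	enum memory_type type;
};

struct gem_object {
	const struct memory_region *region; /* may be NULL */
};

/* An object is "lmem" if its region is local or stolen-local memory. */
static bool object_is_lmem(const struct gem_object *obj)
{
	const struct memory_region *mr = obj->region;

	return mr && (mr->type == MEMORY_LOCAL ||
		      mr->type == MEMORY_STOLEN_LOCAL);
}
```

Objects without a region, and objects in system or stolen-system memory,
are reported as not local.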


* [PATCH 05/19] drm/i915/stolen: enforce the min_page_size contract
  2021-04-12  9:05 [PATCH 00/19] More DG1 enabling Matthew Auld
                   ` (3 preceding siblings ...)
  2021-04-12  9:05 ` [PATCH 04/19] drm/i915/stolen: treat stolen local as normal " Matthew Auld
@ 2021-04-12  9:05 ` Matthew Auld
  2021-04-14 15:07   ` [Intel-gfx] " Tvrtko Ursulin
  2021-04-12  9:05 ` [PATCH 06/19] drm/i915/stolen: pass the allocation flags Matthew Auld
                   ` (13 subsequent siblings)
  18 siblings, 1 reply; 65+ messages in thread
From: Matthew Auld @ 2021-04-12  9:05 UTC (permalink / raw)
  To: intel-gfx; +Cc: CQ Tang, dri-devel

From: CQ Tang <cq.tang@intel.com>

Since stolen can now be device local-memory underneath, we should try to
enforce any min_page_size restrictions when allocating pages.

Signed-off-by: CQ Tang <cq.tang@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_stolen.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_stolen.c b/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
index 56dd58bef5ee..f713eabb7671 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
@@ -677,7 +677,8 @@ static int _i915_gem_object_stolen_init(struct intel_memory_region *mem,
 	if (!stolen)
 		return -ENOMEM;
 
-	ret = i915_gem_stolen_insert_node(i915, stolen, size, 4096);
+	ret = i915_gem_stolen_insert_node(i915, stolen, size,
+					  mem->min_page_size);
 	if (ret)
 		goto err_free;
 
@@ -817,8 +818,8 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_i915_private *i915,
 
 	/* KISS and expect everything to be page-aligned */
 	if (GEM_WARN_ON(size == 0) ||
-	    GEM_WARN_ON(!IS_ALIGNED(size, I915_GTT_PAGE_SIZE)) ||
-	    GEM_WARN_ON(!IS_ALIGNED(stolen_offset, I915_GTT_MIN_ALIGNMENT)))
+	    GEM_WARN_ON(!IS_ALIGNED(size, mem->min_page_size)) ||
+	    GEM_WARN_ON(!IS_ALIGNED(stolen_offset, mem->min_page_size)))
 		return ERR_PTR(-EINVAL);
 
 	stolen = kzalloc(sizeof(*stolen), GFP_KERNEL);
-- 
2.26.3
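
Editor's sketch (not part of the patch): the preallocated path now
checks both size and offset against the region's min_page_size instead
of fixed GTT constants. The standalone C below shows that validation in
isolation. The helper names are hypothetical, and the 64K page size in
the usage note is only an example of a region with a larger minimum.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if x is a multiple of align; align must be a power of two. */
static bool is_aligned(uint64_t x, uint64_t align)
{
	return (x & (align - 1)) == 0;
}

/*
 * Hypothetical equivalent of the three GEM_WARN_ON() conditions:
 * reject empty ranges and anything not aligned to min_page_size.
 */
static bool preallocated_range_ok(uint64_t offset, uint64_t size,
				  uint64_t min_page_size)
{
	return size != 0 &&
	       is_aligned(size, min_page_size) &&
	       is_aligned(offset, min_page_size);
}
```

A range at offset 0x1000 passes with a 4K minimum page size but is
rejected with a 64K minimum, which is the case fixed constants would
have missed.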


* [PATCH 06/19] drm/i915/stolen: pass the allocation flags
  2021-04-12  9:05 [PATCH 00/19] More DG1 enabling Matthew Auld
                   ` (4 preceding siblings ...)
  2021-04-12  9:05 ` [PATCH 05/19] drm/i915/stolen: enforce the min_page_size contract Matthew Auld
@ 2021-04-12  9:05 ` Matthew Auld
  2021-04-14 15:09   ` [Intel-gfx] " Tvrtko Ursulin
  2021-04-12  9:05 ` [PATCH 07/19] drm/i915/fbdev: Use lmem physical addresses for fb_mmap() on discrete Matthew Auld
                   ` (12 subsequent siblings)
  18 siblings, 1 reply; 65+ messages in thread
From: Matthew Auld @ 2021-04-12  9:05 UTC (permalink / raw)
  To: intel-gfx; +Cc: CQ Tang, dri-devel

From: CQ Tang <cq.tang@intel.com>

Stolen memory is always allocated as physically contiguous pages, so mark
the object flags as such.

Signed-off-by: CQ Tang <cq.tang@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_stolen.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_stolen.c b/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
index f713eabb7671..49a2dfcc8ba7 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
@@ -633,14 +633,15 @@ static const struct drm_i915_gem_object_ops i915_gem_object_stolen_ops = {
 
 static int __i915_gem_object_create_stolen(struct intel_memory_region *mem,
 					   struct drm_i915_gem_object *obj,
-					   struct drm_mm_node *stolen)
+					   struct drm_mm_node *stolen,
+					   unsigned int flags)
 {
 	static struct lock_class_key lock_class;
 	unsigned int cache_level;
 	int err;
 
 	drm_gem_private_object_init(&mem->i915->drm, &obj->base, stolen->size);
-	i915_gem_object_init(obj, &i915_gem_object_stolen_ops, &lock_class, 0);
+	i915_gem_object_init(obj, &i915_gem_object_stolen_ops, &lock_class, flags);
 
 	obj->stolen = stolen;
 	obj->read_domains = I915_GEM_DOMAIN_CPU | I915_GEM_DOMAIN_GTT;
@@ -682,7 +683,7 @@ static int _i915_gem_object_stolen_init(struct intel_memory_region *mem,
 	if (ret)
 		goto err_free;
 
-	ret = __i915_gem_object_create_stolen(mem, obj, stolen);
+	ret = __i915_gem_object_create_stolen(mem, obj, stolen, flags);
 	if (ret)
 		goto err_remove;
 
@@ -840,7 +841,8 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_i915_private *i915,
 		goto err_stolen;
 	}
 
-	ret = __i915_gem_object_create_stolen(mem, obj, stolen);
+	ret = __i915_gem_object_create_stolen(mem, obj, stolen,
+					      I915_BO_ALLOC_CONTIGUOUS);
 	if (ret)
 		goto err_object_free;
 
-- 
2.26.3


* [PATCH 07/19] drm/i915/fbdev: Use lmem physical addresses for fb_mmap() on discrete
  2021-04-12  9:05 [PATCH 00/19] More DG1 enabling Matthew Auld
                   ` (5 preceding siblings ...)
  2021-04-12  9:05 ` [PATCH 06/19] drm/i915/stolen: pass the allocation flags Matthew Auld
@ 2021-04-12  9:05 ` Matthew Auld
  2021-04-12 15:00   ` Daniel Vetter
  2021-04-12  9:05 ` [PATCH 08/19] drm/i915: Return error value when bo not in LMEM for discrete Matthew Auld
                   ` (11 subsequent siblings)
  18 siblings, 1 reply; 65+ messages in thread
From: Matthew Auld @ 2021-04-12  9:05 UTC (permalink / raw)
  To: intel-gfx; +Cc: Mohammed Khajapasha, dri-devel

From: Mohammed Khajapasha <mohammed.khajapasha@intel.com>

Use the local memory IO BAR address for fbdev's fb_mmap() operation on
discrete, since fbdev uses the physical address of our framebuffer for
its fb_mmap() function.

Signed-off-by: Mohammed Khajapasha <mohammed.khajapasha@intel.com>
---
 drivers/gpu/drm/i915/display/intel_fbdev.c | 29 +++++++++++++++++-----
 1 file changed, 23 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_fbdev.c b/drivers/gpu/drm/i915/display/intel_fbdev.c
index ccd00e65a5fe..2b37959da747 100644
--- a/drivers/gpu/drm/i915/display/intel_fbdev.c
+++ b/drivers/gpu/drm/i915/display/intel_fbdev.c
@@ -41,6 +41,8 @@
 #include <drm/drm_fb_helper.h>
 #include <drm/drm_fourcc.h>
 
+#include "gem/i915_gem_lmem.h"
+
 #include "i915_drv.h"
 #include "intel_display_types.h"
 #include "intel_fbdev.h"
@@ -178,6 +180,7 @@ static int intelfb_create(struct drm_fb_helper *helper,
 	unsigned long flags = 0;
 	bool prealloc = false;
 	void __iomem *vaddr;
+	struct drm_i915_gem_object *obj;
 	int ret;
 
 	if (intel_fb &&
@@ -232,13 +235,27 @@ static int intelfb_create(struct drm_fb_helper *helper,
 	info->fbops = &intelfb_ops;
 
 	/* setup aperture base/size for vesafb takeover */
-	info->apertures->ranges[0].base = ggtt->gmadr.start;
-	info->apertures->ranges[0].size = ggtt->mappable_end;
+	obj = intel_fb_obj(&intel_fb->base);
+	if (i915_gem_object_is_lmem(obj)) {
+		struct intel_memory_region *mem = obj->mm.region;
+
+		info->apertures->ranges[0].base = mem->io_start;
+		info->apertures->ranges[0].size = mem->total;
+
+		/* Use fbdev's framebuffer from lmem for discrete */
+		info->fix.smem_start =
+			(unsigned long)(mem->io_start +
+					i915_gem_object_get_dma_address(obj, 0));
+		info->fix.smem_len = obj->base.size;
+	} else {
+		info->apertures->ranges[0].base = ggtt->gmadr.start;
+		info->apertures->ranges[0].size = ggtt->mappable_end;
 
-	/* Our framebuffer is the entirety of fbdev's system memory */
-	info->fix.smem_start =
-		(unsigned long)(ggtt->gmadr.start + vma->node.start);
-	info->fix.smem_len = vma->node.size;
+		/* Our framebuffer is the entirety of fbdev's system memory */
+		info->fix.smem_start =
+			(unsigned long)(ggtt->gmadr.start + vma->node.start);
+		info->fix.smem_len = vma->node.size;
+	}
 
 	vaddr = i915_vma_pin_iomap(vma);
 	if (IS_ERR(vaddr)) {
-- 
2.26.3
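
Editor's sketch (not part of the patch): on discrete, the framebuffer's
CPU-visible address is the region's IO start plus the object's first DMA
address, rather than the GGTT aperture base plus the VMA offset. The
standalone C below shows only that sum. It assumes, as the patch's
arithmetic implies, that the DMA address is an offset into the region;
the struct and function names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical subset of the fbdev "fix" info touched by the patch. */
struct fb_fix {
	uint64_t smem_start; /* physical address fbdev will mmap */
	uint64_t smem_len;   /* length of the framebuffer in bytes */
};

/*
 * Compute the mmap range for a framebuffer living in local memory.
 * region_io_start: CPU-visible start of the memory region (BAR based)
 * obj_offset:      first DMA address of the object, assumed to be an
 *                  offset within the region
 * obj_size:        size of the backing object in bytes
 */
static struct fb_fix fb_fix_for_lmem(uint64_t region_io_start,
				     uint64_t obj_offset, uint64_t obj_size)
{
	struct fb_fix fix;

	fix.smem_start = region_io_start + obj_offset;
	fix.smem_len = obj_size;

	return fix;
}
```

The patch stores the result in an unsigned long; the sketch uses a
64-bit type throughout to stay independent of the target's word size.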


* [PATCH 08/19] drm/i915: Return error value when bo not in LMEM for discrete
  2021-04-12  9:05 [PATCH 00/19] More DG1 enabling Matthew Auld
                   ` (6 preceding siblings ...)
  2021-04-12  9:05 ` [PATCH 07/19] drm/i915/fbdev: Use lmem physical addresses for fb_mmap() on discrete Matthew Auld
@ 2021-04-12  9:05 ` Matthew Auld
  2021-04-14 15:16   ` [Intel-gfx] " Tvrtko Ursulin
  2021-04-12  9:05 ` [PATCH 09/19] drm/i915/lmem: Fail driver init if LMEM training failed Matthew Auld
                   ` (10 subsequent siblings)
  18 siblings, 1 reply; 65+ messages in thread
From: Matthew Auld @ 2021-04-12  9:05 UTC (permalink / raw)
  To: intel-gfx; +Cc: Mohammed Khajapasha, dri-devel

From: Mohammed Khajapasha <mohammed.khajapasha@intel.com>

Return -EREMOTE when a framebuffer object is not backed by LMEM on
discrete. If local memory is supported by the hardware, the GEM objects
backing a framebuffer should be allocated from local memory.
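
For illustration only, the rule can be modelled in a standalone userspace
sketch. This is not the driver code: `fake_device`, `fake_object` and
`fb_placement_check()` are hypothetical names standing in for the i915
structures, HAS_LMEM() and i915_gem_object_is_lmem().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver state; not the real i915 types. */
struct fake_device {
	bool has_lmem;		/* models HAS_LMEM(i915) */
};

struct fake_object {
	bool in_lmem;		/* models i915_gem_object_is_lmem(obj) */
};

/*
 * Returns 0 if the object may back a framebuffer, or -EREMOTE if the
 * device has local memory but the object was placed elsewhere.
 */
static int fb_placement_check(const struct fake_device *dev,
			      const struct fake_object *obj)
{
	assert(dev != NULL && obj != NULL);

	if (dev->has_lmem && !obj->in_lmem)
		return -EREMOTE;

	return 0;
}
```

On an integrated part (no LMEM) the check is a no-op, so existing
userspace that allocates framebuffers from system memory is unaffected.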

Signed-off-by: Mohammed Khajapasha <mohammed.khajapasha@intel.com>
---
 drivers/gpu/drm/i915/display/intel_display.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c
index 411b46c012f8..57b06d8728af 100644
--- a/drivers/gpu/drm/i915/display/intel_display.c
+++ b/drivers/gpu/drm/i915/display/intel_display.c
@@ -63,6 +63,7 @@
 #include "display/intel_vdsc.h"
 #include "display/intel_vrr.h"
 
+#include "gem/i915_gem_lmem.h"
 #include "gem/i915_gem_object.h"
 
 #include "gt/intel_rps.h"
@@ -11279,11 +11280,20 @@ intel_user_framebuffer_create(struct drm_device *dev,
 	struct drm_framebuffer *fb;
 	struct drm_i915_gem_object *obj;
 	struct drm_mode_fb_cmd2 mode_cmd = *user_mode_cmd;
+	struct drm_i915_private *i915;
 
 	obj = i915_gem_object_lookup(filp, mode_cmd.handles[0]);
 	if (!obj)
 		return ERR_PTR(-ENOENT);
 
+	/* object is backed with LMEM for discrete */
+	i915 = to_i915(obj->base.dev);
+	if (HAS_LMEM(i915) && !i915_gem_object_is_lmem(obj)) {
+		/* object is "remote", not in local memory */
+		i915_gem_object_put(obj);
+		return ERR_PTR(-EREMOTE);
+	}
+
 	fb = intel_framebuffer_create(obj, &mode_cmd);
 	i915_gem_object_put(obj);
 
-- 
2.26.3

^ permalink raw reply related	[flat|nested] 65+ messages in thread

* [PATCH 09/19] drm/i915/lmem: Fail driver init if LMEM training failed
  2021-04-12  9:05 [PATCH 00/19] More DG1 enabling Matthew Auld
                   ` (7 preceding siblings ...)
  2021-04-12  9:05 ` [PATCH 08/19] drm/i915: Return error value when bo not in LMEM for discrete Matthew Auld
@ 2021-04-12  9:05 ` Matthew Auld
  2021-04-12  9:05 ` [PATCH 10/19] drm/i915/dg1: Fix mapping type for default state object Matthew Auld
                   ` (9 subsequent siblings)
  18 siblings, 0 replies; 65+ messages in thread
From: Matthew Auld @ 2021-04-12  9:05 UTC (permalink / raw)
  To: intel-gfx; +Cc: Caz Yokoyama, dri-devel

From: Matt Roper <matthew.d.roper@intel.com>

Boot firmware performs memory training and health assessment during
startup.  If the memory training fails, the firmware will consider the
GPU unusable and will instruct the punit to keep the GT powered down.
If this happens, our driver will be unable to communicate with the GT
(all GT registers will read back as 0, forcewake requests will time out,
etc.), so we should abort driver initialization.  We can
confirm that LMEM was initialized successfully via sgunit register
GU_CNTL.
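
As a rough standalone model of the test (a sketch, not the driver code):
LMEM_INIT is bit 7 of GU_CNTL as defined in this patch, while
`fake_uncore` and `uncore_check_lmem()` are made-up names, and the
register value is passed in rather than read over MMIO.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bit 7 of GU_CNTL, mirroring REG_BIT(7) in the patch. */
#define LMEM_INIT	(UINT32_C(1) << 7)

/* Hypothetical snapshot of the state the real check looks at. */
struct fake_uncore {
	bool is_dgfx;		/* models IS_DGFX(i915) */
	uint32_t gu_cntl;	/* value read back from GU_CNTL */
};

/* Returns 0 to continue init, -ENODEV if firmware never brought up LMEM. */
static int uncore_check_lmem(const struct fake_uncore *uncore)
{
	assert(uncore != NULL);

	if (uncore->is_dgfx && !(uncore->gu_cntl & LMEM_INIT))
		return -ENODEV;

	return 0;
}
```

Note that a GT held in reset reads back as 0, which also leaves
LMEM_INIT clear, so the same test covers that case.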

Bspec: 53111
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Cc: Caz Yokoyama <Caz.Yokoyama@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
---
 drivers/gpu/drm/i915/i915_reg.h     |  3 +++
 drivers/gpu/drm/i915/intel_uncore.c | 12 ++++++++++++
 2 files changed, 15 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 4108f2a7ebfa..da73dc939e58 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -487,6 +487,9 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg)
 #define GAB_CTL				_MMIO(0x24000)
 #define   GAB_CTL_CONT_AFTER_PAGEFAULT	(1 << 8)
 
+#define GU_CNTL				_MMIO(0x101010)
+#define   LMEM_INIT			REG_BIT(7)
+
 #define GEN6_STOLEN_RESERVED		_MMIO(0x1082C0)
 #define GEN6_STOLEN_RESERVED_ADDR_MASK	(0xFFF << 20)
 #define GEN7_STOLEN_RESERVED_ADDR_MASK	(0x3FFF << 18)
diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
index 661b50191f2b..4d0605757428 100644
--- a/drivers/gpu/drm/i915/intel_uncore.c
+++ b/drivers/gpu/drm/i915/intel_uncore.c
@@ -1917,6 +1917,18 @@ int intel_uncore_init_mmio(struct intel_uncore *uncore)
 	if (ret)
 		return ret;
 
+	/*
+	 * The boot firmware initializes local memory and assesses its health.
+	 * If memory training fails, the punit will have been instructed to
+	 * keep the GT powered down; we won't be able to communicate with it
+	 * and we should not continue with driver initialization.
+	 */
+	if (IS_DGFX(i915) &&
+	    !(__raw_uncore_read32(uncore, GU_CNTL) & LMEM_INIT)) {
+		drm_err(&i915->drm, "LMEM not initialized by firmware\n");
+		return -ENODEV;
+	}
+
 	if (INTEL_GEN(i915) > 5 && !intel_vgpu_active(i915))
 		uncore->flags |= UNCORE_HAS_FORCEWAKE;
 
-- 
2.26.3

^ permalink raw reply related	[flat|nested] 65+ messages in thread

* [PATCH 10/19] drm/i915/dg1: Fix mapping type for default state object
  2021-04-12  9:05 [PATCH 00/19] More DG1 enabling Matthew Auld
                   ` (8 preceding siblings ...)
  2021-04-12  9:05 ` [PATCH 09/19] drm/i915/lmem: Fail driver init if LMEM training failed Matthew Auld
@ 2021-04-12  9:05 ` Matthew Auld
  2021-04-12  9:05 ` [PATCH 11/19] drm/i915: Update the helper to set correct mapping Matthew Auld
                   ` (8 subsequent siblings)
  18 siblings, 0 replies; 65+ messages in thread
From: Matthew Auld @ 2021-04-12  9:05 UTC (permalink / raw)
  To: intel-gfx; +Cc: Venkata Ramana Nayana, dri-devel

From: Venkata Ramana Nayana <venkata.ramana.nayana@intel.com>

Use I915_MAP_WC when default state object is allocated in LMEM.

Signed-off-by: Venkata Ramana Nayana <venkata.ramana.nayana@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
---
 drivers/gpu/drm/i915/gt/shmem_utils.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/shmem_utils.c b/drivers/gpu/drm/i915/gt/shmem_utils.c
index f8f02aab842b..0683b27a3890 100644
--- a/drivers/gpu/drm/i915/gt/shmem_utils.c
+++ b/drivers/gpu/drm/i915/gt/shmem_utils.c
@@ -8,6 +8,7 @@
 #include <linux/shmem_fs.h>
 
 #include "gem/i915_gem_object.h"
+#include "gem/i915_gem_lmem.h"
 #include "shmem_utils.h"
 
 struct file *shmem_create_from_data(const char *name, void *data, size_t len)
@@ -39,7 +40,8 @@ struct file *shmem_create_from_object(struct drm_i915_gem_object *obj)
 		return file;
 	}
 
-	ptr = i915_gem_object_pin_map_unlocked(obj, I915_MAP_WB);
+	ptr = i915_gem_object_pin_map_unlocked(obj, i915_gem_object_is_lmem(obj) ?
+						I915_MAP_WC : I915_MAP_WB);
 	if (IS_ERR(ptr))
 		return ERR_CAST(ptr);
 
-- 
2.26.3

^ permalink raw reply related	[flat|nested] 65+ messages in thread

* [PATCH 11/19] drm/i915: Update the helper to set correct mapping
  2021-04-12  9:05 [PATCH 00/19] More DG1 enabling Matthew Auld
                   ` (9 preceding siblings ...)
  2021-04-12  9:05 ` [PATCH 10/19] drm/i915/dg1: Fix mapping type for default state object Matthew Auld
@ 2021-04-12  9:05 ` Matthew Auld
  2021-04-14 15:22   ` [Intel-gfx] " Tvrtko Ursulin
  2021-04-12  9:05 ` [PATCH 12/19] drm/i915/lmem: Bypass aperture when lmem is available Matthew Auld
                   ` (7 subsequent siblings)
  18 siblings, 1 reply; 65+ messages in thread
From: Matthew Auld @ 2021-04-12  9:05 UTC (permalink / raw)
  To: intel-gfx
  Cc: CQ Tang, Venkata Sandeep Dhanalakota, dri-devel, Michal Wajdeczko

From: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>

Determine the coherent map type based on where the object is located,
whether the target has LLC, and whether the caller requires an
always-coherent mapping.
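
The resulting decision can be sketched as a small standalone function.
This mirrors the logic of the updated helper in the diff, but the types
and names (`map_type`, `fake_object`, `coherent_map_type()`) are
simplified stand-ins, not the i915 definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for I915_MAP_WB / I915_MAP_WC. */
enum map_type {
	MAP_WB,		/* write-back (cached) */
	MAP_WC,		/* write-combined */
};

struct fake_object {
	bool in_lmem;	/* models i915_gem_object_is_lmem(obj) */
};

static enum map_type coherent_map_type(bool has_llc,
				       const struct fake_object *obj,
				       bool always_coherent)
{
	assert(obj != NULL);

	/* Object location takes priority: local memory is always WC here. */
	if (obj->in_lmem)
		return MAP_WC;

	/* System memory: WB if the CPU shares the LLC or the caller insists. */
	if (has_llc || always_coherent)
		return MAP_WB;

	return MAP_WC;
}
```

The ordering matters: `always_coherent` only has an effect for objects
outside local memory, so a caller passing true can still get WC back.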

Cc: Matthew Auld <matthew.auld@intel.com>
Cc: CQ Tang <cq.tang@intel.com>
Suggested-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_engine_cs.c    |  3 ++-
 drivers/gpu/drm/i915/gt/intel_engine_pm.c    |  2 +-
 drivers/gpu/drm/i915/gt/intel_lrc.c          |  4 +++-
 drivers/gpu/drm/i915/gt/intel_ring.c         |  9 ++++++---
 drivers/gpu/drm/i915/gt/selftest_context.c   |  3 ++-
 drivers/gpu/drm/i915/gt/selftest_hangcheck.c |  4 ++--
 drivers/gpu/drm/i915/gt/selftest_lrc.c       |  4 +++-
 drivers/gpu/drm/i915/gt/uc/intel_guc.c       |  4 +++-
 drivers/gpu/drm/i915/gt/uc/intel_huc.c       |  4 +++-
 drivers/gpu/drm/i915/i915_drv.h              | 11 +++++++++--
 drivers/gpu/drm/i915/selftests/igt_spinner.c |  4 ++--
 11 files changed, 36 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index efe935f80c1a..b79568d370f5 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -664,7 +664,8 @@ static int init_status_page(struct intel_engine_cs *engine)
 	if (ret)
 		goto err;
 
-	vaddr = i915_gem_object_pin_map(obj, I915_MAP_WB);
+	vaddr = i915_gem_object_pin_map(obj,
+					i915_coherent_map_type(engine->i915, obj, true));
 	if (IS_ERR(vaddr)) {
 		ret = PTR_ERR(vaddr);
 		goto err_unpin;
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
index 7c9af86fdb1e..47f4397095e5 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
@@ -23,7 +23,7 @@ static void dbg_poison_ce(struct intel_context *ce)
 
 	if (ce->state) {
 		struct drm_i915_gem_object *obj = ce->state->obj;
-		int type = i915_coherent_map_type(ce->engine->i915);
+		int type = i915_coherent_map_type(ce->engine->i915, obj, true);
 		void *map;
 
 		if (!i915_gem_object_trylock(obj))
diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
index e86897cde984..aafe2a4df496 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -903,7 +903,9 @@ lrc_pre_pin(struct intel_context *ce,
 	GEM_BUG_ON(!i915_vma_is_pinned(ce->state));
 
 	*vaddr = i915_gem_object_pin_map(ce->state->obj,
-					 i915_coherent_map_type(ce->engine->i915) |
+					 i915_coherent_map_type(ce->engine->i915,
+								ce->state->obj,
+								false) |
 					 I915_MAP_OVERRIDE);
 
 	return PTR_ERR_OR_ZERO(*vaddr);
diff --git a/drivers/gpu/drm/i915/gt/intel_ring.c b/drivers/gpu/drm/i915/gt/intel_ring.c
index aee0a77c77e0..3cf6c7e68108 100644
--- a/drivers/gpu/drm/i915/gt/intel_ring.c
+++ b/drivers/gpu/drm/i915/gt/intel_ring.c
@@ -53,9 +53,12 @@ int intel_ring_pin(struct intel_ring *ring, struct i915_gem_ww_ctx *ww)
 
 	if (i915_vma_is_map_and_fenceable(vma))
 		addr = (void __force *)i915_vma_pin_iomap(vma);
-	else
-		addr = i915_gem_object_pin_map(vma->obj,
-					       i915_coherent_map_type(vma->vm->i915));
+	else {
+		int type = i915_coherent_map_type(vma->vm->i915, vma->obj, false);
+
+		addr = i915_gem_object_pin_map(vma->obj, type);
+	}
+
 	if (IS_ERR(addr)) {
 		ret = PTR_ERR(addr);
 		goto err_ring;
diff --git a/drivers/gpu/drm/i915/gt/selftest_context.c b/drivers/gpu/drm/i915/gt/selftest_context.c
index b9bdd1d23243..26685b927169 100644
--- a/drivers/gpu/drm/i915/gt/selftest_context.c
+++ b/drivers/gpu/drm/i915/gt/selftest_context.c
@@ -88,7 +88,8 @@ static int __live_context_size(struct intel_engine_cs *engine)
 		goto err;
 
 	vaddr = i915_gem_object_pin_map_unlocked(ce->state->obj,
-						 i915_coherent_map_type(engine->i915));
+						 i915_coherent_map_type(engine->i915,
+									ce->state->obj, false));
 	if (IS_ERR(vaddr)) {
 		err = PTR_ERR(vaddr);
 		intel_context_unpin(ce);
diff --git a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
index 746985971c3a..5b63d4df8c93 100644
--- a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
+++ b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
@@ -69,7 +69,7 @@ static int hang_init(struct hang *h, struct intel_gt *gt)
 	h->seqno = memset(vaddr, 0xff, PAGE_SIZE);
 
 	vaddr = i915_gem_object_pin_map_unlocked(h->obj,
-						 i915_coherent_map_type(gt->i915));
+						 i915_coherent_map_type(gt->i915, h->obj, false));
 	if (IS_ERR(vaddr)) {
 		err = PTR_ERR(vaddr);
 		goto err_unpin_hws;
@@ -130,7 +130,7 @@ hang_create_request(struct hang *h, struct intel_engine_cs *engine)
 		return ERR_CAST(obj);
 	}
 
-	vaddr = i915_gem_object_pin_map_unlocked(obj, i915_coherent_map_type(gt->i915));
+	vaddr = i915_gem_object_pin_map_unlocked(obj, i915_coherent_map_type(gt->i915, obj, false));
 	if (IS_ERR(vaddr)) {
 		i915_gem_object_put(obj);
 		i915_vm_put(vm);
diff --git a/drivers/gpu/drm/i915/gt/selftest_lrc.c b/drivers/gpu/drm/i915/gt/selftest_lrc.c
index 85e7df6a5123..d8f6623524e8 100644
--- a/drivers/gpu/drm/i915/gt/selftest_lrc.c
+++ b/drivers/gpu/drm/i915/gt/selftest_lrc.c
@@ -1221,7 +1221,9 @@ static int compare_isolation(struct intel_engine_cs *engine,
 	}
 
 	lrc = i915_gem_object_pin_map_unlocked(ce->state->obj,
-				      i915_coherent_map_type(engine->i915));
+					       i915_coherent_map_type(engine->i915,
+								      ce->state->obj,
+								      false));
 	if (IS_ERR(lrc)) {
 		err = PTR_ERR(lrc);
 		goto err_B1;
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.c b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
index 78305b2ec89d..adae04c47aab 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
@@ -682,7 +682,9 @@ int intel_guc_allocate_and_map_vma(struct intel_guc *guc, u32 size,
 	if (IS_ERR(vma))
 		return PTR_ERR(vma);
 
-	vaddr = i915_gem_object_pin_map_unlocked(vma->obj, I915_MAP_WB);
+	vaddr = i915_gem_object_pin_map_unlocked(vma->obj,
+						 i915_coherent_map_type(guc_to_gt(guc)->i915,
+									vma->obj, true));
 	if (IS_ERR(vaddr)) {
 		i915_vma_unpin_and_release(&vma, 0);
 		return PTR_ERR(vaddr);
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_huc.c b/drivers/gpu/drm/i915/gt/uc/intel_huc.c
index 2126dd81ac38..56d2144dc6a0 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_huc.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_huc.c
@@ -82,7 +82,9 @@ static int intel_huc_rsa_data_create(struct intel_huc *huc)
 	if (IS_ERR(vma))
 		return PTR_ERR(vma);
 
-	vaddr = i915_gem_object_pin_map_unlocked(vma->obj, I915_MAP_WB);
+	vaddr = i915_gem_object_pin_map_unlocked(vma->obj,
+						 i915_coherent_map_type(gt->i915,
+									vma->obj, true));
 	if (IS_ERR(vaddr)) {
 		i915_vma_unpin_and_release(&vma, 0);
 		return PTR_ERR(vaddr);
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 69e43bf91a15..2abbc06712a4 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -78,6 +78,7 @@
 #include "gem/i915_gem_context_types.h"
 #include "gem/i915_gem_shrinker.h"
 #include "gem/i915_gem_stolen.h"
+#include "gem/i915_gem_lmem.h"
 
 #include "gt/intel_engine.h"
 #include "gt/intel_gt_types.h"
@@ -1921,9 +1922,15 @@ static inline int intel_hws_csb_write_index(struct drm_i915_private *i915)
 }
 
 static inline enum i915_map_type
-i915_coherent_map_type(struct drm_i915_private *i915)
+i915_coherent_map_type(struct drm_i915_private *i915,
+		       struct drm_i915_gem_object *obj, bool always_coherent)
 {
-	return HAS_LLC(i915) ? I915_MAP_WB : I915_MAP_WC;
+	if (i915_gem_object_is_lmem(obj))
+		return I915_MAP_WC;
+	if (HAS_LLC(i915) || always_coherent)
+		return I915_MAP_WB;
+	else
+		return I915_MAP_WC;
 }
 
 #endif
diff --git a/drivers/gpu/drm/i915/selftests/igt_spinner.c b/drivers/gpu/drm/i915/selftests/igt_spinner.c
index cfbbe415b57c..5fe397b7d1d9 100644
--- a/drivers/gpu/drm/i915/selftests/igt_spinner.c
+++ b/drivers/gpu/drm/i915/selftests/igt_spinner.c
@@ -94,9 +94,9 @@ int igt_spinner_pin(struct igt_spinner *spin,
 	}
 
 	if (!spin->batch) {
-		unsigned int mode =
-			i915_coherent_map_type(spin->gt->i915);
+		unsigned int mode;
 
+		mode = i915_coherent_map_type(spin->gt->i915, spin->obj, false);
 		vaddr = igt_spinner_pin_obj(ce, ww, spin->obj, mode, &spin->batch_vma);
 		if (IS_ERR(vaddr))
 			return PTR_ERR(vaddr);
-- 
2.26.3

^ permalink raw reply related	[flat|nested] 65+ messages in thread

* [PATCH 12/19] drm/i915/lmem: Bypass aperture when lmem is available
  2021-04-12  9:05 [PATCH 00/19] More DG1 enabling Matthew Auld
                   ` (10 preceding siblings ...)
  2021-04-12  9:05 ` [PATCH 11/19] drm/i915: Update the helper to set correct mapping Matthew Auld
@ 2021-04-12  9:05 ` Matthew Auld
  2021-04-14 15:33   ` [Intel-gfx] " Tvrtko Ursulin
  2021-04-12  9:05 ` [PATCH 13/19] drm/i915/dg1: Read OPROM via SPI controller Matthew Auld
                   ` (6 subsequent siblings)
  18 siblings, 1 reply; 65+ messages in thread
From: Matthew Auld @ 2021-04-12  9:05 UTC (permalink / raw)
  To: intel-gfx
  Cc: Anusha Srivatsa, Chris P Wilson, CQ Tang, Daniele Ceraolo Spurio,
	dri-devel, Daniel Vetter, Dhinakaran Pandiyan

From: Anusha Srivatsa <anusha.srivatsa@intel.com>

In the scenario where local memory is available, we have to
rely on CPU access via lmem directly instead of the aperture.
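
To make the address arithmetic concrete, here is a standalone sketch of
the offset computation that i915_gem_object_lmem_io_map() performs in
the diff below: the object's device address is rebased against the start
of its memory region to get an offset into the region's CPU-visible
mapping. `fake_region` and `lmem_io_offset()` are hypothetical names,
and the bounds check is an addition for the example, not something the
patch does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a memory region as seen by the device. */
struct fake_region {
	uint64_t start;	/* device address of the first byte */
	uint64_t size;	/* region length in bytes */
};

/*
 * Convert a device (DMA) address inside the region into an offset
 * suitable for indexing the region's io mapping.
 */
static uint64_t lmem_io_offset(const struct fake_region *region,
			       uint64_t dma_addr)
{
	assert(region != NULL);
	/* Example-only sanity check: the address must lie in the region. */
	assert(dma_addr >= region->start &&
	       dma_addr - region->start < region->size);

	return dma_addr - region->start;
}
```

Because only the first page's address is translated, this approach
relies on the object being physically contiguous, which is why the
fbdev allocation in this patch passes I915_BO_ALLOC_CONTIGUOUS.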

v2:
gmch is only relevant for much older hw, therefore we can drop the
has_aperture check since it should always be present on such platforms.
(Chris)

Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Chris P Wilson <chris.p.wilson@intel.com>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: CQ Tang <cq.tang@intel.com>
Signed-off-by: Anusha Srivatsa <anusha.srivatsa@intel.com>
---
 drivers/gpu/drm/i915/display/intel_fbdev.c | 22 +++++++++++++++-------
 drivers/gpu/drm/i915/gem/i915_gem_lmem.c   | 15 +++++++++++++++
 drivers/gpu/drm/i915/gem/i915_gem_lmem.h   |  5 +++++
 drivers/gpu/drm/i915/i915_vma.c            | 19 +++++++++++++------
 4 files changed, 48 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_fbdev.c b/drivers/gpu/drm/i915/display/intel_fbdev.c
index 2b37959da747..4af40229f5ec 100644
--- a/drivers/gpu/drm/i915/display/intel_fbdev.c
+++ b/drivers/gpu/drm/i915/display/intel_fbdev.c
@@ -139,14 +139,22 @@ static int intelfb_alloc(struct drm_fb_helper *helper,
 	size = mode_cmd.pitches[0] * mode_cmd.height;
 	size = PAGE_ALIGN(size);
 
-	/* If the FB is too big, just don't use it since fbdev is not very
-	 * important and we should probably use that space with FBC or other
-	 * features. */
 	obj = ERR_PTR(-ENODEV);
-	if (size * 2 < dev_priv->stolen_usable_size)
-		obj = i915_gem_object_create_stolen(dev_priv, size);
-	if (IS_ERR(obj))
-		obj = i915_gem_object_create_shmem(dev_priv, size);
+	if (HAS_LMEM(dev_priv)) {
+		obj = i915_gem_object_create_lmem(dev_priv, size,
+						  I915_BO_ALLOC_CONTIGUOUS);
+	} else {
+		/*
+		 * If the FB is too big, just don't use it since fbdev is not very
+		 * important and we should probably use that space with FBC or other
+		 * features.
+		 */
+		if (size * 2 < dev_priv->stolen_usable_size)
+			obj = i915_gem_object_create_stolen(dev_priv, size);
+		if (IS_ERR(obj))
+			obj = i915_gem_object_create_shmem(dev_priv, size);
+	}
+
 	if (IS_ERR(obj)) {
 		drm_err(&dev_priv->drm, "failed to allocate framebuffer\n");
 		return PTR_ERR(obj);
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_lmem.c b/drivers/gpu/drm/i915/gem/i915_gem_lmem.c
index 017db8f71130..f44bdd08f7cb 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_lmem.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_lmem.c
@@ -17,6 +17,21 @@ const struct drm_i915_gem_object_ops i915_gem_lmem_obj_ops = {
 	.release = i915_gem_object_release_memory_region,
 };
 
+void __iomem *
+i915_gem_object_lmem_io_map(struct drm_i915_gem_object *obj,
+			    unsigned long n,
+			    unsigned long size)
+{
+	resource_size_t offset;
+
+	GEM_BUG_ON(!i915_gem_object_is_contiguous(obj));
+
+	offset = i915_gem_object_get_dma_address(obj, n);
+	offset -= obj->mm.region->region.start;
+
+	return io_mapping_map_wc(&obj->mm.region->iomap, offset, size);
+}
+
 bool i915_gem_object_is_lmem(struct drm_i915_gem_object *obj)
 {
 	struct intel_memory_region *mr = obj->mm.region;
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_lmem.h b/drivers/gpu/drm/i915/gem/i915_gem_lmem.h
index 036d53c01de9..fac6bc5a5ebb 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_lmem.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_lmem.h
@@ -14,6 +14,11 @@ struct intel_memory_region;
 
 extern const struct drm_i915_gem_object_ops i915_gem_lmem_obj_ops;
 
+void __iomem *
+i915_gem_object_lmem_io_map(struct drm_i915_gem_object *obj,
+			    unsigned long n,
+			    unsigned long size);
+
 bool i915_gem_object_is_lmem(struct drm_i915_gem_object *obj);
 
 struct drm_i915_gem_object *
diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index 07490db51cdc..e24d33aecac4 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -27,6 +27,7 @@
 
 #include "display/intel_frontbuffer.h"
 
+#include "gem/i915_gem_lmem.h"
 #include "gt/intel_engine.h"
 #include "gt/intel_engine_heartbeat.h"
 #include "gt/intel_gt.h"
@@ -448,9 +449,11 @@ void __iomem *i915_vma_pin_iomap(struct i915_vma *vma)
 	void __iomem *ptr;
 	int err;
 
-	if (GEM_WARN_ON(!i915_vma_is_map_and_fenceable(vma))) {
-		err = -ENODEV;
-		goto err;
+	if (!i915_gem_object_is_lmem(vma->obj)) {
+		if (GEM_WARN_ON(!i915_vma_is_map_and_fenceable(vma))) {
+			err = -ENODEV;
+			goto err;
+		}
 	}
 
 	GEM_BUG_ON(!i915_vma_is_ggtt(vma));
@@ -458,9 +461,13 @@ void __iomem *i915_vma_pin_iomap(struct i915_vma *vma)
 
 	ptr = READ_ONCE(vma->iomap);
 	if (ptr == NULL) {
-		ptr = io_mapping_map_wc(&i915_vm_to_ggtt(vma->vm)->iomap,
-					vma->node.start,
-					vma->node.size);
+		if (i915_gem_object_is_lmem(vma->obj))
+			ptr = i915_gem_object_lmem_io_map(vma->obj, 0,
+							  vma->obj->base.size);
+		else
+			ptr = io_mapping_map_wc(&i915_vm_to_ggtt(vma->vm)->iomap,
+						vma->node.start,
+						vma->node.size);
 		if (ptr == NULL) {
 			err = -ENOMEM;
 			goto err;
-- 
2.26.3

^ permalink raw reply related	[flat|nested] 65+ messages in thread

* [PATCH 13/19] drm/i915/dg1: Read OPROM via SPI controller
  2021-04-12  9:05 [PATCH 00/19] More DG1 enabling Matthew Auld
                   ` (11 preceding siblings ...)
  2021-04-12  9:05 ` [PATCH 12/19] drm/i915/lmem: Bypass aperture when lmem is available Matthew Auld
@ 2021-04-12  9:05 ` Matthew Auld
  2021-09-17 23:29   ` [Intel-gfx] " Lucas De Marchi
  2021-04-12  9:05 ` [PATCH 14/19] drm/i915/oprom: Basic sanitization Matthew Auld
                   ` (5 subsequent siblings)
  18 siblings, 1 reply; 65+ messages in thread
From: Matthew Auld @ 2021-04-12  9:05 UTC (permalink / raw)
  To: intel-gfx
  Cc: Jani Nikula, Lucas De Marchi, dri-devel, Jon Bloomfield, Tomas Winkler

From: Clint Taylor <clinton.a.taylor@intel.com>

Read the OPROM from SPI flash through MMIO and locate the VBT there,
since we can't use the OpRegion, and the PCI ROM mapping may not work on
some systems where the BIOS does not leave the Option ROM mapped.
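
The signature search can be sketched in isolation as follows. This is a
userspace model only: the ROM contents are taken from an in-memory
array instead of the PRIMARY_SPI_ADDRESS/PRIMARY_SPI_TRIGGER register
pair, and `find_vbt_signature()` is a made-up name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Scan a ROM image one dword at a time for the "$VBT" signature.
 * Returns the byte offset of the first match, or -1 if none is found.
 * Like the loop in the patch, only 4-byte-aligned positions are checked.
 */
static long find_vbt_signature(const uint32_t *rom, size_t rom_bytes)
{
	uint32_t sig;
	size_t off;

	assert(rom != NULL);

	/* Build the dword to compare against without aliasing casts. */
	memcpy(&sig, "$VBT", sizeof(sig));

	for (off = 0; off + sizeof(sig) <= rom_bytes; off += sizeof(sig)) {
		if (rom[off / sizeof(sig)] == sig)
			return (long)off;
	}

	return -1;
}
```

Once the offset is known, the VBT size can be read from the header at
that offset and the table copied out, which is what the second loop in
the patch does over SPI.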

v2 by Jani:
- switch to intel_uncore_read/intel_uncore_write

Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Tomas Winkler <tomas.winkler@intel.com>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Signed-off-by: Clint Taylor <clinton.a.taylor@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
---
 drivers/gpu/drm/i915/display/intel_bios.c | 80 +++++++++++++++++++++--
 drivers/gpu/drm/i915/i915_reg.h           |  8 +++
 2 files changed, 82 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_bios.c b/drivers/gpu/drm/i915/display/intel_bios.c
index ea4837d485a1..f9dc651f1652 100644
--- a/drivers/gpu/drm/i915/display/intel_bios.c
+++ b/drivers/gpu/drm/i915/display/intel_bios.c
@@ -2238,6 +2238,66 @@ bool intel_bios_is_valid_vbt(const void *buf, size_t size)
 	return vbt;
 }
 
+static struct vbt_header *spi_oprom_get_vbt(struct drm_i915_private *i915)
+{
+	u32 count, data, found, store = 0;
+	u32 static_region, oprom_offset;
+	u32 oprom_size = 0x200000;
+	u16 vbt_size;
+	u32 *vbt;
+
+	static_region = intel_uncore_read(&i915->uncore, SPI_STATIC_REGIONS);
+	static_region &= OPTIONROM_SPI_REGIONID_MASK;
+	intel_uncore_write(&i915->uncore, PRIMARY_SPI_REGIONID, static_region);
+
+	oprom_offset = intel_uncore_read(&i915->uncore, OROM_OFFSET);
+	oprom_offset &= OROM_OFFSET_MASK;
+
+	for (count = 0; count < oprom_size; count += 4) {
+		intel_uncore_write(&i915->uncore, PRIMARY_SPI_ADDRESS, oprom_offset + count);
+		data = intel_uncore_read(&i915->uncore, PRIMARY_SPI_TRIGGER);
+
+		if (data == *((const u32 *)"$VBT")) {
+			found = oprom_offset + count;
+			break;
+		}
+	}
+
+	if (count >= oprom_size)
+		goto err_not_found;
+
+	/* Get VBT size and allocate space for the VBT */
+	intel_uncore_write(&i915->uncore, PRIMARY_SPI_ADDRESS, found +
+		   offsetof(struct vbt_header, vbt_size));
+	vbt_size = intel_uncore_read(&i915->uncore, PRIMARY_SPI_TRIGGER);
+	vbt_size &= 0xffff;
+
+	vbt = kzalloc(vbt_size, GFP_KERNEL);
+	if (!vbt) {
+		DRM_ERROR("Unable to allocate %u bytes for VBT storage\n",
+			  vbt_size);
+		goto err_not_found;
+	}
+
+	for (count = 0; count < vbt_size; count += 4) {
+		intel_uncore_write(&i915->uncore, PRIMARY_SPI_ADDRESS, found + count);
+		data = intel_uncore_read(&i915->uncore, PRIMARY_SPI_TRIGGER);
+		*(vbt + store++) = data;
+	}
+
+	if (!intel_bios_is_valid_vbt(vbt, vbt_size))
+		goto err_free_vbt;
+
+	DRM_DEBUG_KMS("Found valid VBT in SPI flash\n");
+
+	return (struct vbt_header *)vbt;
+
+err_free_vbt:
+	kfree(vbt);
+err_not_found:
+	return NULL;
+}
+
 static struct vbt_header *oprom_get_vbt(struct drm_i915_private *i915)
 {
 	struct pci_dev *pdev = to_pci_dev(i915->drm.dev);
@@ -2287,6 +2347,8 @@ static struct vbt_header *oprom_get_vbt(struct drm_i915_private *i915)
 
 	pci_unmap_rom(pdev, oprom);
 
+	DRM_DEBUG_KMS("Found valid VBT in PCI ROM\n");
+
 	return vbt;
 
 err_free_vbt:
@@ -2321,17 +2383,23 @@ void intel_bios_init(struct drm_i915_private *i915)
 
 	init_vbt_defaults(i915);
 
-	/* If the OpRegion does not have VBT, look in PCI ROM. */
+	/*
+	 * If the OpRegion does not have VBT, look in SPI flash through MMIO or
+	 * PCI mapping
+	 */
+	if (!vbt && IS_DGFX(i915)) {
+		oprom_vbt = spi_oprom_get_vbt(i915);
+		vbt = oprom_vbt;
+	}
+
 	if (!vbt) {
 		oprom_vbt = oprom_get_vbt(i915);
-		if (!oprom_vbt)
-			goto out;
-
 		vbt = oprom_vbt;
-
-		drm_dbg_kms(&i915->drm, "Found valid VBT in PCI ROM\n");
 	}
 
+	if (!vbt)
+		goto out;
+
 	bdb = get_bdb_header(vbt);
 	i915->vbt.version = bdb->version;
 
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index da73dc939e58..54ff63b86df6 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -12540,6 +12540,14 @@ enum skl_power_gate {
 #define   DP_PIN_ASSIGNMENT_MASK(idx)		(0xf << ((idx) * 4))
 #define   DP_PIN_ASSIGNMENT(idx, x)		((x) << ((idx) * 4))
 
+#define PRIMARY_SPI_TRIGGER			_MMIO(0x102040)
+#define PRIMARY_SPI_ADDRESS			_MMIO(0x102080)
+#define PRIMARY_SPI_REGIONID			_MMIO(0x102084)
+#define SPI_STATIC_REGIONS			_MMIO(0x102090)
+#define   OPTIONROM_SPI_REGIONID_MASK		REG_GENMASK(7, 0)
+#define OROM_OFFSET				_MMIO(0x1020c0)
+#define   OROM_OFFSET_MASK			REG_GENMASK(20, 16)
+
 /* This register controls the Display State Buffer (DSB) engines. */
 #define _DSBSL_INSTANCE_BASE		0x70B00
 #define DSBSL_INSTANCE(pipe, id)	(_DSBSL_INSTANCE_BASE + \
-- 
2.26.3

^ permalink raw reply related	[flat|nested] 65+ messages in thread

* [PATCH 14/19] drm/i915/oprom: Basic sanitization
  2021-04-12  9:05 [PATCH 00/19] More DG1 enabling Matthew Auld
                   ` (12 preceding siblings ...)
  2021-04-12  9:05 ` [PATCH 13/19] drm/i915/dg1: Read OPROM via SPI controller Matthew Auld
@ 2021-04-12  9:05 ` Matthew Auld
  2021-04-12 22:36   ` [Intel-gfx] " kernel test robot
                     ` (2 more replies)
  2021-04-12  9:05 ` [PATCH 15/19] drm/i915: WA for zero memory channel Matthew Auld
                   ` (4 subsequent siblings)
  18 siblings, 3 replies; 65+ messages in thread
From: Matthew Auld @ 2021-04-12  9:05 UTC (permalink / raw)
  To: intel-gfx
  Cc: Jani Nikula, Anshuman Gupta, Uma Shankar, Mohammed Khajapasha, dri-devel

From: Anshuman Gupta <anshuman.gupta@intel.com>

Sanitize the OPROM header, CPD signature and OPROM PCI version.
The OPROM_HEADER, EXPANSION_ROM_HEADER and OPROM_MEU_BLOB structures
and the PCI struct offsets are provided by GSC counterparts.
These are yet to be documented in Bspec.
After successful sanitization, extract the VBT from the opregion
image.
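
As a minimal illustration of the first sanitization step, the sketch
below checks the 0xaa55 header magic mentioned in the v2 notes. It
assumes the magic is stored little-endian in the first two bytes of the
image, as in a standard PCI expansion ROM; the layout of the remaining
structures is not modelled, and `OPROM_MAGIC` and `oprom_magic_ok()` are
hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Header magic from the changelog; the macro name is made up here. */
#define OPROM_MAGIC	0xaa55u

/*
 * Returns true if the image starts with the expected magic.
 * Assumption: the 16-bit magic sits at offset 0, little-endian,
 * i.e. the bytes 0x55 0xaa in that order.
 */
static bool oprom_magic_ok(const uint8_t *image, size_t len)
{
	uint16_t magic;

	assert(image != NULL);

	if (len < 2)
		return false;

	magic = (uint16_t)(image[0] | (image[1] << 8));

	return magic == OPROM_MAGIC;
}
```

Assembling the value byte by byte keeps the check independent of host
endianness and of the alignment of the buffer.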

v2:
- Used macro for OPROM header magic 0xaa55 [Rodrigo]
- Added an OPROM layout. [Uma]
- Extract opregion from OPROM package and then extract
  VBT from opregion to have backward compatibility with
  older IFWI.

v3:
- Moved opreg stuff to intel_opregion.{c,h}. [Uma]
- Memory leak and intel_oprom_verify_signature return
  value fixes. [Uma]

v4:
 - Fix return code storage for oprom_image_parse_helper (Matt)

v5 by Jani:
- switch to intel_uncore_read/intel_uncore_write

v6 by Khajapasha:
- Rename intel_oprom_verify_signature() to
  intel_spi_get_oprom_opreg() [Jani Nikula]
- Use u32 data type for opregion size [Jani Nikula]

Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Uma Shankar <uma.shankar@intel.com>
Signed-off-by: Anshuman Gupta <anshuman.gupta@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Mohammed Khajapasha <mohammed.khajapasha@intel.com>
---
 drivers/gpu/drm/i915/display/intel_bios.c     |  47 +++--
 drivers/gpu/drm/i915/display/intel_opregion.c | 169 ++++++++++++++++++
 drivers/gpu/drm/i915/display/intel_opregion.h |  38 +++-
 3 files changed, 227 insertions(+), 27 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_bios.c b/drivers/gpu/drm/i915/display/intel_bios.c
index f9dc651f1652..59eec8333723 100644
--- a/drivers/gpu/drm/i915/display/intel_bios.c
+++ b/drivers/gpu/drm/i915/display/intel_bios.c
@@ -2240,37 +2240,36 @@ bool intel_bios_is_valid_vbt(const void *buf, size_t size)
 
 static struct vbt_header *spi_oprom_get_vbt(struct drm_i915_private *i915)
 {
-	u32 count, data, found, store = 0;
-	u32 static_region, oprom_offset;
-	u32 oprom_size = 0x200000;
+	u32 count, found, opreg_size;
+	u32 *vbt, *oprom_opreg = NULL;
 	u16 vbt_size;
-	u32 *vbt;
+	u8 *parse_ptr;
 
-	static_region = intel_uncore_read(&i915->uncore, SPI_STATIC_REGIONS);
-	static_region &= OPTIONROM_SPI_REGIONID_MASK;
-	intel_uncore_write(&i915->uncore, PRIMARY_SPI_REGIONID, static_region);
-
-	oprom_offset = intel_uncore_read(&i915->uncore, OROM_OFFSET);
-	oprom_offset &= OROM_OFFSET_MASK;
+	if (intel_spi_get_oprom_opreg(i915, &oprom_opreg, &opreg_size)) {
+		drm_err(&i915->drm, "oprom signature verification failed\n");
+		goto err_not_found;
+	}
 
-	for (count = 0; count < oprom_size; count += 4) {
-		intel_uncore_write(&i915->uncore, PRIMARY_SPI_ADDRESS, oprom_offset + count);
-		data = intel_uncore_read(&i915->uncore, PRIMARY_SPI_TRIGGER);
+	if (!oprom_opreg) {
+		drm_err(&i915->drm, "opregion not found\n");
+		goto err_not_found;
+	}
 
-		if (data == *((const u32 *)"$VBT")) {
-			found = oprom_offset + count;
+	for (count = 0; count < opreg_size; count += 4) {
+		if (oprom_opreg[count / 4] == *((const u32 *)"$VBT")) {
+			found = count;
 			break;
 		}
 	}
 
-	if (count >= oprom_size)
+	if (count >= opreg_size) {
+		drm_err(&i915->drm, "VBT not found in opregion\n");
 		goto err_not_found;
+	}
 
 	/* Get VBT size and allocate space for the VBT */
-	intel_uncore_write(&i915->uncore, PRIMARY_SPI_ADDRESS, found +
-		   offsetof(struct vbt_header, vbt_size));
-	vbt_size = intel_uncore_read(&i915->uncore, PRIMARY_SPI_TRIGGER);
-	vbt_size &= 0xffff;
+	parse_ptr = (u8 *)oprom_opreg + found;
+	vbt_size = ((struct vbt_header *)parse_ptr)->vbt_size;
 
 	vbt = kzalloc(vbt_size, GFP_KERNEL);
 	if (!vbt) {
@@ -2279,16 +2278,12 @@ static struct vbt_header *spi_oprom_get_vbt(struct drm_i915_private *i915)
 		goto err_not_found;
 	}
 
-	for (count = 0; count < vbt_size; count += 4) {
-		intel_uncore_write(&i915->uncore, PRIMARY_SPI_ADDRESS, found + count);
-		data = intel_uncore_read(&i915->uncore, PRIMARY_SPI_TRIGGER);
-		*(vbt + store++) = data;
-	}
-
+	memcpy(vbt, parse_ptr, vbt_size);
 	if (!intel_bios_is_valid_vbt(vbt, vbt_size))
 		goto err_free_vbt;
 
 	DRM_DEBUG_KMS("Found valid VBT in SPI flash\n");
+	kfree(oprom_opreg);
 
 	return (struct vbt_header *)vbt;
 
diff --git a/drivers/gpu/drm/i915/display/intel_opregion.c b/drivers/gpu/drm/i915/display/intel_opregion.c
index dfd724e506b5..e9ccd8265a1f 100644
--- a/drivers/gpu/drm/i915/display/intel_opregion.c
+++ b/drivers/gpu/drm/i915/display/intel_opregion.c
@@ -983,6 +983,175 @@ int intel_opregion_setup(struct drm_i915_private *dev_priv)
 	return err;
 }
 
+static int oprom_image_parse_helper(u8 *parse_ptr, u8 *last_img, u8 *code_type,
+				    struct drm_i915_private *i915)
+{
+	u8 size_512_bytes;
+
+	if (((union oprom_header *)parse_ptr)->signature != OPROM_IMAGE_MAGIC) {
+		drm_err(&i915->drm, "Wrong OPROM header signature.\n");
+		return -EINVAL;
+	}
+
+	size_512_bytes = parse_ptr[((struct expansion_rom_header *)parse_ptr)->pcistructoffset + PCI_IMAGE_LENGTH_OFFSET];
+	*code_type = parse_ptr[((struct expansion_rom_header *)parse_ptr)->pcistructoffset + PCI_CODE_TYPE_OFFSET];
+	*last_img = parse_ptr[((struct expansion_rom_header *)parse_ptr)->pcistructoffset + PCI_LAST_IMAGE_INDICATOR_OFFSET];
+
+	return size_512_bytes;
+}
+
+static void spi_read_oprom_helper(size_t len, u32 offset, u32 *buf,
+				  struct drm_i915_private *dev_priv)
+{
+	u32 count, data;
+
+	for (count = 0; count < len; count += 4) {
+		intel_uncore_write(&dev_priv->uncore, PRIMARY_SPI_ADDRESS, offset + count);
+		data = intel_uncore_read(&dev_priv->uncore, PRIMARY_SPI_TRIGGER);
+		buf[count / 4] = data;
+	}
+}
+
+/**
+ *	+        DASH+G OPROM IMAGE LAYOUT           +
+ *	+--------+-------+---------------------------+
+ *	| Offset | Value |   ROM Header Fields       +-----> Image 1 (CSS)
+ *	+--------------------------------------------+
+ *	|    0h  |  55h  |   ROM Signature Byte1     |
+ *	|    1h  |  AAh  |   ROM Signature Byte2     |
+ *	|    2h  |  xx   |        Reserved           |
+ *	|  18+19h|  xx   |  Ptr to PCI DataStructure |
+ *	+----------------+---------------------------+
+ *	|           PCI Data Structure               |
+ *	+--------------------------------------------+
+ *	|    .       .             .                 |
+ *	|    .       .             .                 |
+ *	|    10  +  xx   +     Image Length          |
+ *	|    14  +  xx   +     Code Type             |
+ *	|    15  +  xx   +  Last Image Indicator     |
+ *	|    .       .             .                 |
+ *	+--------------------------------------------+
+ *	|               MEU BLOB                     |
+ *	+--------------------------------------------+
+ *	|              CPD Header                    |
+ *	|              CPD Entry                     |
+ *	|              Reserved                      |
+ *	|           SignedDataPart1                  |
+ *	|              PublicKey                     |
+ *	|            RSA Signature                   |
+ *	|           SignedDataPart2                  |
+ *	|            IFWI Metadata                   |
+ *	+--------+-------+---------------------------+
+ *	|    .   |   .   |         .                 |
+ *	|    .   |   .   |         .                 |
+ *	+--------------------------------------------+
+ *	| Offset | Value |   ROM Header Fields       +-----> Image 2 (Config Data) (Offset: 0x800)
+ *	+--------------------------------------------+
+ *	|    0h  |  55h  |   ROM Signature Byte1     |
+ *	|    1h  |  AAh  |   ROM Signature Byte2     |
+ *	|    2h  |  xx   |        Reserved           |
+ *	|  18+19h|  xx   |  Ptr to PCI DataStructure |
+ *	+----------------+---------------------------+
+ *	|           PCI Data Structure               |
+ *	+--------------------------------------------+
+ *	|    .       .             .                 |
+ *	|    .       .             .                 |
+ *	|    10  +  xx   +     Image Length          |
+ *	|    14  +  xx   +      Code Type            |
+ *	|    15  +  xx   +   Last Image Indicator    |
+ *	|    .       .             .                 |
+ *	|    1A  +  3C   + Ptr to Opregion Signature |
+ *	|    .       .             .                 |
+ *	|    .       .             .                 |
+ *	|   83Ch + IntelGraphicsMem                  | <---+ Opregion Signature
+ *	+--------+-----------------------------------+
+ *
+ * intel_spi_get_oprom_opreg() get OPROM image.
+ * @i915: pointer to i915 device.
+ * @opreg: pointer to opregion buffer output.
+ * @opreg_size: pointer to opregion size output.
+ */
+int
+intel_spi_get_oprom_opreg(struct drm_i915_private *i915, u32 **opreg,
+			  u32 *opreg_size)
+{
+	u8 img_sig[sizeof(OPREGION_SIGNATURE)];
+	u8 code_type, last_img;
+	u32 static_region, offset, img_len;
+	u32 *oprom_img, *oprom_img_hdr;
+	u16 opreg_base;
+	u8 *parse_ptr;
+	int img_size;
+	int ret = -EINVAL;
+
+	/* initialize SPI to read the OPROM */
+	static_region = intel_uncore_read(&i915->uncore, SPI_STATIC_REGIONS);
+	static_region &= OPTIONROM_SPI_REGIONID_MASK;
+	intel_uncore_write(&i915->uncore, PRIMARY_SPI_REGIONID, static_region);
+	/* read OPROM offset in SPI flash */
+	offset = intel_uncore_read(&i915->uncore, OROM_OFFSET);
+	offset &= OROM_OFFSET_MASK;
+
+	oprom_img_hdr = kzalloc(OPROM_INITIAL_READ_SIZE, GFP_KERNEL);
+	if (!oprom_img_hdr)
+		return -ENOMEM;
+
+	do {
+		spi_read_oprom_helper(OPROM_INITIAL_READ_SIZE, offset,
+				      oprom_img_hdr, i915);
+		img_size = oprom_image_parse_helper((u8 *)oprom_img_hdr, &last_img,
+						    &code_type, i915);
+		if (img_size <= 0) {
+			ret = -EINVAL;
+			goto err_free_hdr;
+		}
+
+		img_len = img_size * OPROM_BYTE_BOUNDARY;
+		oprom_img = kzalloc(img_len, GFP_KERNEL);
+		if (!oprom_img) {
+			ret = -ENOMEM;
+			goto err_free_hdr;
+		}
+
+		spi_read_oprom_helper(img_len, offset, oprom_img, i915);
+		parse_ptr = (u8 *)oprom_img;
+		offset = offset + img_len;
+
+		/* opregion base offset */
+		opreg_base = ((struct expansion_rom_header *)parse_ptr)->opregion_base;
+		/* CPD or opreg signature is present at opregion_base offset */
+		memcpy(img_sig, parse_ptr + opreg_base, sizeof(OPREGION_SIGNATURE));
+
+		if (!memcmp(img_sig, OPREGION_SIGNATURE, sizeof(OPREGION_SIGNATURE) - 1)) {
+			*opreg = oprom_img;
+			*opreg_size = img_len;
+			drm_dbg_kms(&i915->drm, "Found opregion image\n");
+			ret = 0;
+			break;
+		} else if (!memcmp(img_sig, CPD_SIGNATURE, NUM_CPD_BYTES)) {
+			if (code_type != OPROM_CSS_CODE_TYPE) {
+				drm_err(&i915->drm, "Invalid OPROM\n");
+				ret = -EINVAL;
+				goto err_free_img;
+			}
+			drm_dbg_kms(&i915->drm, "Found CSS image\n");
+			/* proceed here onwards for signature authentication */
+			kfree(oprom_img);
+			continue;
+		}
+
+	} while (last_img != LAST_IMG_INDICATOR);
+
+	return ret;
+
+err_free_img:
+	kfree(oprom_img);
+err_free_hdr:
+	kfree(oprom_img_hdr);
+
+	return ret;
+}
+
 static int intel_use_opregion_panel_type_callback(const struct dmi_system_id *id)
 {
 	DRM_INFO("Using panel type from OpRegion on %s\n", id->ident);
diff --git a/drivers/gpu/drm/i915/display/intel_opregion.h b/drivers/gpu/drm/i915/display/intel_opregion.h
index 4aa68ffbd30e..de53dde10dd9 100644
--- a/drivers/gpu/drm/i915/display/intel_opregion.h
+++ b/drivers/gpu/drm/i915/display/intel_opregion.h
@@ -54,6 +54,34 @@ struct intel_opregion {
 
 #define OPREGION_SIZE            (8 * 1024)
 
+#define CPD_SIGNATURE "$CPD"                  /* CPD Signature */
+#define NUM_CPD_BYTES 4
+#define PCI_IMAGE_LENGTH_OFFSET 0x10
+#define PCI_CODE_TYPE_OFFSET 0x14
+#define PCI_LAST_IMAGE_INDICATOR_OFFSET 0x15
+#define LAST_IMG_INDICATOR 0x80
+#define OPROM_IMAGE_MAGIC 0xAA55       /* Little Endian */
+#define OPROM_CSS_CODE_TYPE 0xF0
+#define OPROM_BYTE_BOUNDARY 512        /* OPROM image sizes are indicated in 512 byte boundaries */
+#define OPROM_INITIAL_READ_SIZE 60     /* Read 60 bytes to compute the Img Len from PCI structure */
+
+union oprom_header {
+	u32 data;
+	struct {
+		u16 signature;  /* Offset[0x0]: Header 0x55 0xAA */
+		u8 sizein512bytes;
+		u8 reserved;
+	};
+};
+
+struct expansion_rom_header {
+	union oprom_header header;      /* Offset[0x0]: Oprom Header */
+	u16 vbiospostoffset;    /* Offset[0x4]: pointer to VBIOS entry point */
+	u8 resvd[0x12];
+	u16 pcistructoffset;    /* Offset[0x18]: Contains pointer PCI Data Structure */
+	u16 opregion_base;      /* Offset[0x1A]: Offset to Opregion Base start */
+};
+
 #ifdef CONFIG_ACPI
 
 int intel_opregion_setup(struct drm_i915_private *dev_priv);
@@ -72,6 +100,9 @@ int intel_opregion_notify_adapter(struct drm_i915_private *dev_priv,
 				  pci_power_t state);
 int intel_opregion_get_panel_type(struct drm_i915_private *dev_priv);
 
+int intel_spi_get_oprom_opreg(struct drm_i915_private *i915, u32 **opreg,
+			      u32 *opreg_size);
+
 #else /* CONFIG_ACPI*/
 
 static inline int intel_opregion_setup(struct drm_i915_private *dev_priv)
@@ -117,6 +148,11 @@ static inline int intel_opregion_get_panel_type(struct drm_i915_private *dev)
 	return -ENODEV;
 }
 
-#endif /* CONFIG_ACPI */
+static int intel_spi_get_oprom_opreg(struct drm_i915_private *i915, u32 **opreg,
+				     u32 *opreg_size)
+{
+	return 0;
+}
 
+#endif /* CONFIG_ACPI */
 #endif
-- 
2.26.3

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 65+ messages in thread
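
The image walk described in the OPROM layout comment of patch 14 can be sketched in plain C over an in-memory buffer. This is a minimal illustration, not the driver code: rom_parse_image(), struct rom_image_info and the macro names are hypothetical, and the SPI read path is left out. As in the patch, the image length is read as a single byte in 512-byte units and the image counts as the last one when the indicator byte equals 0x80.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Offsets taken from the layout comment in the patch; names are hypothetical. */
#define ROM_MAGIC           0xAA55u /* bytes 55h AAh, read little-endian */
#define ROM_PCI_PTR_OFFSET  0x18    /* 16-bit pointer to the PCI Data Structure */
#define PCI_IMAGE_LENGTH    0x10    /* image length, in 512-byte units */
#define PCI_CODE_TYPE       0x14
#define PCI_LAST_IMAGE      0x15
#define LAST_IMAGE_FLAG     0x80
#define ROM_UNIT            512u

struct rom_image_info {
	uint32_t length_bytes;
	uint8_t code_type;
	int is_last;
};

static uint16_t read_le16(const uint8_t *p)
{
	return (uint16_t)(p[0] | (p[1] << 8));
}

/* Returns 0 on success, -1 if the header is malformed or out of bounds. */
int rom_parse_image(const uint8_t *buf, size_t len, struct rom_image_info *out)
{
	size_t pci;

	assert(buf != NULL && out != NULL);

	/* The ROM header must hold the signature and the PCI pointer. */
	if (len < (size_t)ROM_PCI_PTR_OFFSET + 2 || read_le16(buf) != ROM_MAGIC)
		return -1;

	/* Every PCI Data Structure field we read must lie inside the buffer. */
	pci = read_le16(buf + ROM_PCI_PTR_OFFSET);
	if (pci + PCI_LAST_IMAGE >= len)
		return -1;

	out->length_bytes = (uint32_t)buf[pci + PCI_IMAGE_LENGTH] * ROM_UNIT;
	out->code_type = buf[pci + PCI_CODE_TYPE];
	out->is_last = buf[pci + PCI_LAST_IMAGE] == LAST_IMAGE_FLAG;
	return 0;
}
```

A caller would advance its read offset by length_bytes after each image and stop once is_last is set. The explicit bounds check on the PCI pointer is an addition of this sketch; it guards against a corrupt header pointing outside the bytes that were read.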

* [PATCH 15/19] drm/i915: WA for zero memory channel
  2021-04-12  9:05 [PATCH 00/19] More DG1 enabling Matthew Auld
                   ` (13 preceding siblings ...)
  2021-04-12  9:05 ` [PATCH 14/19] drm/i915/oprom: Basic sanitization Matthew Auld
@ 2021-04-12  9:05 ` Matthew Auld
  2021-04-12 16:57   ` Souza, Jose
  2021-04-12  9:05 ` [PATCH 16/19] drm/i915/dg1: Compute MEM Bandwidth using MCHBAR Matthew Auld
                   ` (3 subsequent siblings)
  18 siblings, 1 reply; 65+ messages in thread
From: Matthew Auld @ 2021-04-12  9:05 UTC (permalink / raw)
  To: intel-gfx
  Cc: Lucas De Marchi, José Roberto de Souza, Stanislav Lisovskiy,
	Daniele Ceraolo Spurio, dri-devel, Rodrigo Vivi

From: José Roberto de Souza <jose.souza@intel.com>

Commit c457d9cf256e ("drm/i915: Make sure we have enough memory
bandwidth on ICL") assumes that we always have a non-zero
dram_info->channels and uses it as a divisor. The number of memory
channels must be at least 1 for sane bandwidth limit checking, even when
PCode returns 0, so force it to 1 in that case.

Cc: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
---
 drivers/gpu/drm/i915/display/intel_bw.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/i915/display/intel_bw.c b/drivers/gpu/drm/i915/display/intel_bw.c
index 584ab5ce4106..c5f70f3e930e 100644
--- a/drivers/gpu/drm/i915/display/intel_bw.c
+++ b/drivers/gpu/drm/i915/display/intel_bw.c
@@ -175,6 +175,7 @@ static int icl_get_bw_info(struct drm_i915_private *dev_priv, const struct intel
 			    "Failed to get memory subsystem information, ignoring bandwidth limits");
 		return ret;
 	}
+	num_channels = max_t(u8, 1, num_channels);
 
 	deinterleave = DIV_ROUND_UP(num_channels, is_y_tile ? 4 : 2);
 	dclk_max = icl_sagv_max_dclk(&qi);
-- 
2.26.3

^ permalink raw reply related	[flat|nested] 65+ messages in thread
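
The clamp in patch 15 can be illustrated outside the driver with a few lines of C. This is only a sketch under stated assumptions: sanitize_num_channels(), per_channel_bw() and deinterleave_factor() are hypothetical helpers, and only the DIV_ROUND_UP expression mirrors the line that follows the clamp in icl_get_bw_info().

```c
#include <assert.h>
#include <stdint.h>

#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Clamp a firmware-reported channel count so it is safe to divide by. */
static uint8_t sanitize_num_channels(uint8_t reported)
{
	return reported ? reported : 1;
}

/* Hypothetical consumer that uses the channel count as a divisor. */
uint32_t per_channel_bw(uint32_t total_bw, uint8_t reported_channels)
{
	uint8_t channels = sanitize_num_channels(reported_channels);

	assert(channels >= 1);
	return total_bw / channels;
}

/* Same shape as the deinterleave computation that follows the clamp. */
unsigned int deinterleave_factor(uint8_t reported_channels, int is_y_tile)
{
	uint8_t channels = sanitize_num_channels(reported_channels);

	return DIV_ROUND_UP(channels, is_y_tile ? 4 : 2);
}
```

Without the clamp, a reported count of 0 would make the deinterleave factor 0 as well, and any later division by either value would fault.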

* [PATCH 16/19] drm/i915/dg1: Compute MEM Bandwidth using MCHBAR
  2021-04-12  9:05 [PATCH 00/19] More DG1 enabling Matthew Auld
                   ` (14 preceding siblings ...)
  2021-04-12  9:05 ` [PATCH 15/19] drm/i915: WA for zero memory channel Matthew Auld
@ 2021-04-12  9:05 ` Matthew Auld
  2021-04-12  9:05 ` [PATCH 17/19] drm/i915/dg1: Double memory bandwidth available Matthew Auld
                   ` (2 subsequent siblings)
  18 siblings, 0 replies; 65+ messages in thread
From: Matthew Auld @ 2021-04-12  9:05 UTC (permalink / raw)
  To: intel-gfx; +Cc: Jani Nikula, dri-devel, Jani Saarinen

From: Clint Taylor <clinton.a.taylor@intel.com>

The PUNIT FW is currently returning 0 for all memory bandwidth
parameters. Read the values directly from MCHBAR offsets 0x5918 and
0x4000/0x4004 instead. This is a temporary WA until the PUNIT FW returns
valid values.

v2 (Lucas): Add error to log since this is fixed in new pcode available
on IFWI WW14. Also fix checkpatch warnings.

v3 by Jani:
- switch to intel_uncore_read/intel_uncore_write

Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Jani Saarinen <jani.saarinen@intel.com>
Signed-off-by: Clint Taylor <clinton.a.taylor@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
---
 drivers/gpu/drm/i915/display/intel_bw.c | 54 ++++++++++++++++++++++++-
 1 file changed, 53 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/display/intel_bw.c b/drivers/gpu/drm/i915/display/intel_bw.c
index c5f70f3e930e..99cae0dc0ca2 100644
--- a/drivers/gpu/drm/i915/display/intel_bw.c
+++ b/drivers/gpu/drm/i915/display/intel_bw.c
@@ -23,6 +23,53 @@ struct intel_qgv_info {
 	u8 t_bl;
 };
 
+#define SA_PERF_STATUS_0_0_0_MCHBAR_PC _MMIO(MCHBAR_MIRROR_BASE_SNB + 0x5918)
+#define  DG1_QCLK_RATIO_MASK (0xFF << 2)
+#define  DG1_QCLK_RATIO_SHIFT 2
+#define  DG1_QCLK_REFERENCE (1 << 10)
+
+#define MCHBAR_CH0_CR_TC_PRE_0_0_0_MCHBAR _MMIO(MCHBAR_MIRROR_BASE_SNB + 0x4000)
+#define MCHBAR_CH0_CR_TC_PRE_0_0_0_MCHBAR_HIGH _MMIO(MCHBAR_MIRROR_BASE_SNB + 0x4004)
+#define MCHBAR_CH1_CR_TC_PRE_0_0_0_MCHBAR _MMIO(MCHBAR_MIRROR_BASE_SNB + 0x4400)
+#define MCHBAR_CH1_CR_TC_PRE_0_0_0_MCHBAR_HIGH _MMIO(MCHBAR_MIRROR_BASE_SNB + 0x4404)
+#define  DG1_DRAM_T_RCD_MASK (0x7F << 9)
+#define  DG1_DRAM_T_RCD_SHIFT 9
+#define  DG1_DRAM_T_RDPRE_MASK (0x3F << 11)
+#define  DG1_DRAM_T_RDPRE_SHIFT 11
+#define  DG1_DRAM_T_RAS_MASK (0xFF << 1)
+#define  DG1_DRAM_T_RAS_SHIFT 1
+#define  DG1_DRAM_T_RP_MASK (0x7F << 0)
+#define  DG1_DRAM_T_RP_SHIFT 0
+
+static int dg1_mchbar_read_qgv_point_info(struct drm_i915_private *dev_priv,
+					  struct intel_qgv_point *sp,
+					  int point)
+{
+	u32 val = 0;
+	u32 dclk_ratio = 0, dclk_reference = 0;
+
+	val = intel_uncore_read(&dev_priv->uncore, SA_PERF_STATUS_0_0_0_MCHBAR_PC);
+	dclk_ratio = (val & DG1_QCLK_RATIO_MASK) >> DG1_QCLK_RATIO_SHIFT;
+	if (val & DG1_QCLK_REFERENCE)
+		dclk_reference = 6; /* 6 * 16.666 MHz = 100 MHz */
+	else
+		dclk_reference = 8; /* 8 * 16.666 MHz = 133 MHz */
+	sp->dclk = dclk_ratio * dclk_reference;
+	if (sp->dclk == 0)
+		return -EINVAL;
+
+	val = intel_uncore_read(&dev_priv->uncore, MCHBAR_CH0_CR_TC_PRE_0_0_0_MCHBAR);
+	sp->t_rp = (val & DG1_DRAM_T_RP_MASK) >> DG1_DRAM_T_RP_SHIFT;
+	sp->t_rdpre = (val & DG1_DRAM_T_RDPRE_MASK) >> DG1_DRAM_T_RDPRE_SHIFT;
+
+	val = intel_uncore_read(&dev_priv->uncore, MCHBAR_CH0_CR_TC_PRE_0_0_0_MCHBAR_HIGH);
+	sp->t_rcd = (val & DG1_DRAM_T_RCD_MASK) >> DG1_DRAM_T_RCD_SHIFT;
+	sp->t_ras = (val & DG1_DRAM_T_RAS_MASK) >> DG1_DRAM_T_RAS_SHIFT;
+
+	sp->t_rc = sp->t_rp + sp->t_ras;
+	return 0;
+}
+
 static int icl_pcode_read_qgv_point_info(struct drm_i915_private *dev_priv,
 					 struct intel_qgv_point *sp,
 					 int point)
@@ -100,7 +147,12 @@ static int icl_get_qgv_points(struct drm_i915_private *dev_priv,
 		struct intel_qgv_point *sp = &qi->points[i];
 
 		ret = icl_pcode_read_qgv_point_info(dev_priv, sp, i);
-		if (ret)
+		if (IS_DG1(dev_priv) && (ret || sp->dclk == 0)) {
+			drm_dbg_kms(&dev_priv->drm, "Failed to get memory subsystem information via pcode. IFWI needs update. Trying with MCHBAR\n");
+			ret = dg1_mchbar_read_qgv_point_info(dev_priv, sp, i);
+			if (ret)
+				return ret;
+		} else if (ret)
 			return ret;
 
 		drm_dbg_kms(&dev_priv->drm,
-- 
2.26.3

^ permalink raw reply related	[flat|nested] 65+ messages in thread
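
The DCLK decode that patch 16 performs on SA_PERF_STATUS can be checked in isolation. The sketch below reuses the mask, shift and reference-bit values from the patch; decode_dclk() is a hypothetical stand-alone helper that takes the raw register value as an argument instead of reading MMIO.

```c
#include <assert.h>
#include <stdint.h>

/* Field layout copied from the patch: ratio in bits 9:2, reference in bit 10. */
#define QCLK_RATIO_MASK   (0xFFu << 2)
#define QCLK_RATIO_SHIFT  2
#define QCLK_REFERENCE    (1u << 10)

/*
 * Returns DCLK in multiples of 16.666 MHz, or 0 when the ratio field is
 * zero. The patch treats a zero result as an error (-EINVAL).
 */
uint32_t decode_dclk(uint32_t sa_perf_status)
{
	uint32_t ratio = (sa_perf_status & QCLK_RATIO_MASK) >> QCLK_RATIO_SHIFT;
	/* Reference bit set: 6 * 16.666 MHz = 100 MHz; clear: 8 * 16.666 MHz = 133 MHz. */
	uint32_t reference = (sa_perf_status & QCLK_REFERENCE) ? 6 : 8;

	assert(ratio <= 0xFF);
	return ratio * reference;
}
```

For example, a ratio field of 20 yields 120 with the 100 MHz reference and 160 with the 133 MHz reference.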

* [PATCH 17/19] drm/i915/dg1: Double memory bandwidth available
  2021-04-12  9:05 [PATCH 00/19] More DG1 enabling Matthew Auld
                   ` (15 preceding siblings ...)
  2021-04-12  9:05 ` [PATCH 16/19] drm/i915/dg1: Compute MEM Bandwidth using MCHBAR Matthew Auld
@ 2021-04-12  9:05 ` Matthew Auld
  2021-04-12  9:05 ` [PATCH 18/19] drm/i915/gtt: map the PD up front Matthew Auld
  2021-04-12  9:05 ` [PATCH 19/19] drm/i915/gtt/dgfx: place the PD in LMEM Matthew Auld
  18 siblings, 0 replies; 65+ messages in thread
From: Matthew Auld @ 2021-04-12  9:05 UTC (permalink / raw)
  To: intel-gfx; +Cc: Jani Nikula, Swati Sharma, dri-devel

From: Clint Taylor <clinton.a.taylor@intel.com>

Use MCHBAR Gear_type information to compute memory bandwidth available
during MCHBAR calculations.

v2 by Jani:
- switch to intel_uncore_read/intel_uncore_write

Tested-by: Swati Sharma <swati2.sharma@intel.com>
Cc: Swati Sharma <swati2.sharma@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Clint Taylor <clinton.a.taylor@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
---
 drivers/gpu/drm/i915/display/intel_bw.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/drivers/gpu/drm/i915/display/intel_bw.c b/drivers/gpu/drm/i915/display/intel_bw.c
index 99cae0dc0ca2..6c02bd52ce45 100644
--- a/drivers/gpu/drm/i915/display/intel_bw.c
+++ b/drivers/gpu/drm/i915/display/intel_bw.c
@@ -41,6 +41,9 @@ struct intel_qgv_info {
 #define  DG1_DRAM_T_RP_MASK (0x7F << 0)
 #define  DG1_DRAM_T_RP_SHIFT 0
 
+#define  ICL_GEAR_TYPE_MASK (0x01 << 16)
+#define  ICL_GEAR_TYPE_SHIFT 16
+
 static int dg1_mchbar_read_qgv_point_info(struct drm_i915_private *dev_priv,
 					  struct intel_qgv_point *sp,
 					  int point)
@@ -55,6 +58,11 @@ static int dg1_mchbar_read_qgv_point_info(struct drm_i915_private *dev_priv,
 	else
 		dclk_reference = 8; /* 8 * 16.666 MHz = 133 MHz */
 	sp->dclk = dclk_ratio * dclk_reference;
+
+	val = intel_uncore_read(&dev_priv->uncore, SKL_MC_BIOS_DATA_0_0_0_MCHBAR_PCU);
+	if ((val & ICL_GEAR_TYPE_MASK) >> ICL_GEAR_TYPE_SHIFT)
+		sp->dclk *= 2;
+
 	if (sp->dclk == 0)
 		return -EINVAL;
 
-- 
2.26.3

^ permalink raw reply related	[flat|nested] 65+ messages in thread
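
The gear-type adjustment added by patch 17 amounts to a single bit test on the MC_BIOS_DATA register value. A minimal sketch, assuming the bit position from the patch (bit 16); apply_gear_type() is a hypothetical helper and takes the raw register value rather than reading MMIO.

```c
#include <assert.h>
#include <stdint.h>

/* Bit position copied from the patch. */
#define GEAR_TYPE_MASK   (0x01u << 16)
#define GEAR_TYPE_SHIFT  16

/* Double DCLK when the gear type bit is set, otherwise return it unchanged. */
uint32_t apply_gear_type(uint32_t dclk, uint32_t mc_bios_data)
{
	/* Guard against overflow when doubling; an addition of this sketch. */
	assert(dclk <= UINT32_MAX / 2);

	if ((mc_bios_data & GEAR_TYPE_MASK) >> GEAR_TYPE_SHIFT)
		dclk *= 2;

	return dclk;
}
```

Because the doubling happens before the zero check in the patch, a DCLK of 0 stays 0 and is still rejected.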

* [PATCH 18/19] drm/i915/gtt: map the PD up front
  2021-04-12  9:05 [PATCH 00/19] More DG1 enabling Matthew Auld
                   ` (16 preceding siblings ...)
  2021-04-12  9:05 ` [PATCH 17/19] drm/i915/dg1: Double memory bandwidth available Matthew Auld
@ 2021-04-12  9:05 ` Matthew Auld
  2021-04-12 15:17   ` [Intel-gfx] " Daniel Vetter
  2021-04-12  9:05 ` [PATCH 19/19] drm/i915/gtt/dgfx: place the PD in LMEM Matthew Auld
  18 siblings, 1 reply; 65+ messages in thread
From: Matthew Auld @ 2021-04-12  9:05 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel, Chris Wilson

We need to generalize our accessor for the page directories and tables
from using the simple kmap_atomic to support local memory, and this setup
must be done on acquisition of the backing storage prior to entering
fence execution contexts. Here we replace the kmap with the object
mapping code, which for a simple single-page shmemfs object will return a
plain kmap that is then kept for the lifetime of the page directory.

v2: (Thomas) Rebase on dma_resv and obj->mm.lock removal.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 .../drm/i915/gem/selftests/i915_gem_context.c | 11 +----
 drivers/gpu/drm/i915/gt/gen6_ppgtt.c          | 11 ++---
 drivers/gpu/drm/i915/gt/gen8_ppgtt.c          | 26 ++++------
 drivers/gpu/drm/i915/gt/intel_ggtt.c          |  2 +-
 drivers/gpu/drm/i915/gt/intel_gtt.c           | 48 +++++++++----------
 drivers/gpu/drm/i915/gt/intel_gtt.h           | 11 +++--
 drivers/gpu/drm/i915/gt/intel_ppgtt.c         |  7 ++-
 drivers/gpu/drm/i915/i915_vma.c               |  3 +-
 drivers/gpu/drm/i915/selftests/i915_gem_gtt.c | 10 ++--
 drivers/gpu/drm/i915/selftests/i915_perf.c    |  3 +-
 10 files changed, 54 insertions(+), 78 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
index 5fef592390cb..ce70d0a3afb2 100644
--- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
@@ -1740,7 +1740,6 @@ static int read_from_scratch(struct i915_gem_context *ctx,
 static int check_scratch_page(struct i915_gem_context *ctx, u32 *out)
 {
 	struct i915_address_space *vm;
-	struct page *page;
 	u32 *vaddr;
 	int err = 0;
 
@@ -1748,24 +1747,18 @@ static int check_scratch_page(struct i915_gem_context *ctx, u32 *out)
 	if (!vm)
 		return -ENODEV;
 
-	page = __px_page(vm->scratch[0]);
-	if (!page) {
+	if (!vm->scratch[0]) {
 		pr_err("No scratch page!\n");
 		return -EINVAL;
 	}
 
-	vaddr = kmap(page);
-	if (!vaddr) {
-		pr_err("No (mappable) scratch page!\n");
-		return -EINVAL;
-	}
+	vaddr = __px_vaddr(vm->scratch[0]);
 
 	memcpy(out, vaddr, sizeof(*out));
 	if (memchr_inv(vaddr, *out, PAGE_SIZE)) {
 		pr_err("Inconsistent initial state of scratch page!\n");
 		err = -EINVAL;
 	}
-	kunmap(page);
 
 	return err;
 }
diff --git a/drivers/gpu/drm/i915/gt/gen6_ppgtt.c b/drivers/gpu/drm/i915/gt/gen6_ppgtt.c
index e08dff376339..21b1085769be 100644
--- a/drivers/gpu/drm/i915/gt/gen6_ppgtt.c
+++ b/drivers/gpu/drm/i915/gt/gen6_ppgtt.c
@@ -96,9 +96,8 @@ static void gen6_ppgtt_clear_range(struct i915_address_space *vm,
 		 * entries back to scratch.
 		 */
 
-		vaddr = kmap_atomic_px(pt);
+		vaddr = px_vaddr(pt);
 		memset32(vaddr + pte, scratch_pte, count);
-		kunmap_atomic(vaddr);
 
 		pte = 0;
 	}
@@ -120,7 +119,7 @@ static void gen6_ppgtt_insert_entries(struct i915_address_space *vm,
 
 	GEM_BUG_ON(!pd->entry[act_pt]);
 
-	vaddr = kmap_atomic_px(i915_pt_entry(pd, act_pt));
+	vaddr = px_vaddr(i915_pt_entry(pd, act_pt));
 	do {
 		GEM_BUG_ON(sg_dma_len(iter.sg) < I915_GTT_PAGE_SIZE);
 		vaddr[act_pte] = pte_encode | GEN6_PTE_ADDR_ENCODE(iter.dma);
@@ -136,12 +135,10 @@ static void gen6_ppgtt_insert_entries(struct i915_address_space *vm,
 		}
 
 		if (++act_pte == GEN6_PTES) {
-			kunmap_atomic(vaddr);
-			vaddr = kmap_atomic_px(i915_pt_entry(pd, ++act_pt));
+			vaddr = px_vaddr(i915_pt_entry(pd, ++act_pt));
 			act_pte = 0;
 		}
 	} while (1);
-	kunmap_atomic(vaddr);
 
 	vma->page_sizes.gtt = I915_GTT_PAGE_SIZE;
 }
@@ -235,7 +232,7 @@ static int gen6_ppgtt_init_scratch(struct gen6_ppgtt *ppgtt)
 		goto err_scratch0;
 	}
 
-	ret = pin_pt_dma(vm, vm->scratch[1]);
+	ret = map_pt_dma(vm, vm->scratch[1]);
 	if (ret)
 		goto err_scratch1;
 
diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
index 176c19633412..f83496836f0f 100644
--- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
+++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
@@ -242,11 +242,10 @@ static u64 __gen8_ppgtt_clear(struct i915_address_space * const vm,
 			    atomic_read(&pt->used));
 			GEM_BUG_ON(!count || count >= atomic_read(&pt->used));
 
-			vaddr = kmap_atomic_px(pt);
+			vaddr = px_vaddr(pt);
 			memset64(vaddr + gen8_pd_index(start, 0),
 				 vm->scratch[0]->encode,
 				 count);
-			kunmap_atomic(vaddr);
 
 			atomic_sub(count, &pt->used);
 			start += count;
@@ -375,7 +374,7 @@ gen8_ppgtt_insert_pte(struct i915_ppgtt *ppgtt,
 	gen8_pte_t *vaddr;
 
 	pd = i915_pd_entry(pdp, gen8_pd_index(idx, 2));
-	vaddr = kmap_atomic_px(i915_pt_entry(pd, gen8_pd_index(idx, 1)));
+	vaddr = px_vaddr(i915_pt_entry(pd, gen8_pd_index(idx, 1)));
 	do {
 		GEM_BUG_ON(sg_dma_len(iter->sg) < I915_GTT_PAGE_SIZE);
 		vaddr[gen8_pd_index(idx, 0)] = pte_encode | iter->dma;
@@ -402,12 +401,10 @@ gen8_ppgtt_insert_pte(struct i915_ppgtt *ppgtt,
 			}
 
 			clflush_cache_range(vaddr, PAGE_SIZE);
-			kunmap_atomic(vaddr);
-			vaddr = kmap_atomic_px(i915_pt_entry(pd, gen8_pd_index(idx, 1)));
+			vaddr = px_vaddr(i915_pt_entry(pd, gen8_pd_index(idx, 1)));
 		}
 	} while (1);
 	clflush_cache_range(vaddr, PAGE_SIZE);
-	kunmap_atomic(vaddr);
 
 	return idx;
 }
@@ -442,7 +439,7 @@ static void gen8_ppgtt_insert_huge(struct i915_vma *vma,
 			encode |= GEN8_PDE_PS_2M;
 			page_size = I915_GTT_PAGE_SIZE_2M;
 
-			vaddr = kmap_atomic_px(pd);
+			vaddr = px_vaddr(pd);
 		} else {
 			struct i915_page_table *pt =
 				i915_pt_entry(pd, __gen8_pte_index(start, 1));
@@ -457,7 +454,7 @@ static void gen8_ppgtt_insert_huge(struct i915_vma *vma,
 			     rem >= (I915_PDES - index) * I915_GTT_PAGE_SIZE))
 				maybe_64K = __gen8_pte_index(start, 1);
 
-			vaddr = kmap_atomic_px(pt);
+			vaddr = px_vaddr(pt);
 		}
 
 		do {
@@ -491,7 +488,6 @@ static void gen8_ppgtt_insert_huge(struct i915_vma *vma,
 		} while (rem >= page_size && index < I915_PDES);
 
 		clflush_cache_range(vaddr, PAGE_SIZE);
-		kunmap_atomic(vaddr);
 
 		/*
 		 * Is it safe to mark the 2M block as 64K? -- Either we have
@@ -505,9 +501,8 @@ static void gen8_ppgtt_insert_huge(struct i915_vma *vma,
 		      !iter->sg && IS_ALIGNED(vma->node.start +
 					      vma->node.size,
 					      I915_GTT_PAGE_SIZE_2M)))) {
-			vaddr = kmap_atomic_px(pd);
+			vaddr = px_vaddr(pd);
 			vaddr[maybe_64K] |= GEN8_PDE_IPS_64K;
-			kunmap_atomic(vaddr);
 			page_size = I915_GTT_PAGE_SIZE_64K;
 
 			/*
@@ -523,12 +518,11 @@ static void gen8_ppgtt_insert_huge(struct i915_vma *vma,
 				u16 i;
 
 				encode = vma->vm->scratch[0]->encode;
-				vaddr = kmap_atomic_px(i915_pt_entry(pd, maybe_64K));
+				vaddr = px_vaddr(i915_pt_entry(pd, maybe_64K));
 
 				for (i = 1; i < index; i += 16)
 					memset64(vaddr + i, encode, 15);
 
-				kunmap_atomic(vaddr);
 			}
 		}
 
@@ -602,7 +596,7 @@ static int gen8_init_scratch(struct i915_address_space *vm)
 		if (IS_ERR(obj))
 			goto free_scratch;
 
-		ret = pin_pt_dma(vm, obj);
+		ret = map_pt_dma(vm, obj);
 		if (ret) {
 			i915_gem_object_put(obj);
 			goto free_scratch;
@@ -639,7 +633,7 @@ static int gen8_preallocate_top_level_pdp(struct i915_ppgtt *ppgtt)
 		if (IS_ERR(pde))
 			return PTR_ERR(pde);
 
-		err = pin_pt_dma(vm, pde->pt.base);
+		err = map_pt_dma(vm, pde->pt.base);
 		if (err) {
 			i915_gem_object_put(pde->pt.base);
 			free_pd(vm, pde);
@@ -675,7 +669,7 @@ gen8_alloc_top_pd(struct i915_address_space *vm)
 		goto err_pd;
 	}
 
-	err = pin_pt_dma(vm, pd->pt.base);
+	err = map_pt_dma(vm, pd->pt.base);
 	if (err)
 		goto err_pd;
 
diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt.c b/drivers/gpu/drm/i915/gt/intel_ggtt.c
index 670c1271e7d5..d94628b9d89e 100644
--- a/drivers/gpu/drm/i915/gt/intel_ggtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_ggtt.c
@@ -657,7 +657,7 @@ static int init_aliasing_ppgtt(struct i915_ggtt *ggtt)
 		goto err_ppgtt;
 
 	i915_gem_object_lock(ppgtt->vm.scratch[0], NULL);
-	err = i915_vm_pin_pt_stash(&ppgtt->vm, &stash);
+	err = i915_vm_map_pt_stash(&ppgtt->vm, &stash);
 	i915_gem_object_unlock(ppgtt->vm.scratch[0]);
 	if (err)
 		goto err_stash;
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.c b/drivers/gpu/drm/i915/gt/intel_gtt.c
index 941f8af016d6..d386b89e2758 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.c
@@ -25,27 +25,25 @@ struct drm_i915_gem_object *alloc_pt_dma(struct i915_address_space *vm, int sz)
 	return obj;
 }
 
-int pin_pt_dma(struct i915_address_space *vm, struct drm_i915_gem_object *obj)
+int map_pt_dma(struct i915_address_space *vm, struct drm_i915_gem_object *obj)
 {
-	int err;
+	void *vaddr;
 
-	i915_gem_object_lock(obj, NULL);
-	err = i915_gem_object_pin_pages(obj);
-	i915_gem_object_unlock(obj);
-	if (err)
-		return err;
+	vaddr = i915_gem_object_pin_map_unlocked(obj, I915_MAP_WB);
+	if (IS_ERR(vaddr))
+		return PTR_ERR(vaddr);
 
 	i915_gem_object_make_unshrinkable(obj);
 	return 0;
 }
 
-int pin_pt_dma_locked(struct i915_address_space *vm, struct drm_i915_gem_object *obj)
+int map_pt_dma_locked(struct i915_address_space *vm, struct drm_i915_gem_object *obj)
 {
-	int err;
+	void *vaddr;
 
-	err = i915_gem_object_pin_pages(obj);
-	if (err)
-		return err;
+	vaddr = i915_gem_object_pin_map(obj, I915_MAP_WB);
+	if (IS_ERR(vaddr))
+		return PTR_ERR(vaddr);
 
 	i915_gem_object_make_unshrinkable(obj);
 	return 0;
@@ -155,6 +153,14 @@ void clear_pages(struct i915_vma *vma)
 	memset(&vma->page_sizes, 0, sizeof(vma->page_sizes));
 }
 
+void *__px_vaddr(struct drm_i915_gem_object *p)
+{
+	enum i915_map_type type;
+
+	GEM_BUG_ON(!i915_gem_object_has_pages(p));
+	return page_unpack_bits(p->mm.mapping, &type);
+}
+
 dma_addr_t __px_dma(struct drm_i915_gem_object *p)
 {
 	GEM_BUG_ON(!i915_gem_object_has_pages(p));
@@ -170,32 +176,22 @@ struct page *__px_page(struct drm_i915_gem_object *p)
 void
 fill_page_dma(struct drm_i915_gem_object *p, const u64 val, unsigned int count)
 {
-	struct page *page = __px_page(p);
-	void *vaddr;
+	void *vaddr = __px_vaddr(p);
 
-	vaddr = kmap(page);
 	memset64(vaddr, val, count);
 	clflush_cache_range(vaddr, PAGE_SIZE);
-	kunmap(page);
 }
 
 static void poison_scratch_page(struct drm_i915_gem_object *scratch)
 {
-	struct sgt_iter sgt;
-	struct page *page;
+	void *vaddr = __px_vaddr(scratch);
 	u8 val;
 
 	val = 0;
 	if (IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM))
 		val = POISON_FREE;
 
-	for_each_sgt_page(page, sgt, scratch->mm.pages) {
-		void *vaddr;
-
-		vaddr = kmap(page);
-		memset(vaddr, val, PAGE_SIZE);
-		kunmap(page);
-	}
+	memset(vaddr, val, scratch->base.size);
 }
 
 int setup_scratch_page(struct i915_address_space *vm)
@@ -225,7 +221,7 @@ int setup_scratch_page(struct i915_address_space *vm)
 		if (IS_ERR(obj))
 			goto skip;
 
-		if (pin_pt_dma(vm, obj))
+		if (map_pt_dma(vm, obj))
 			goto skip_obj;
 
 		/* We need a single contiguous page for our scratch */
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.h b/drivers/gpu/drm/i915/gt/intel_gtt.h
index e67e34e17913..40e486704558 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.h
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.h
@@ -180,6 +180,9 @@ struct page *__px_page(struct drm_i915_gem_object *p);
 dma_addr_t __px_dma(struct drm_i915_gem_object *p);
 #define px_dma(px) (__px_dma(px_base(px)))
 
+void *__px_vaddr(struct drm_i915_gem_object *p);
+#define px_vaddr(px) (__px_vaddr(px_base(px)))
+
 #define px_pt(px) \
 	__px_choose_expr(px, struct i915_page_table *, __x, \
 	__px_choose_expr(px, struct i915_page_directory *, &__x->pt, \
@@ -511,8 +514,6 @@ struct i915_ppgtt *i915_ppgtt_create(struct intel_gt *gt);
 void i915_ggtt_suspend(struct i915_ggtt *gtt);
 void i915_ggtt_resume(struct i915_ggtt *ggtt);
 
-#define kmap_atomic_px(px) kmap_atomic(__px_page(px_base(px)))
-
 void
 fill_page_dma(struct drm_i915_gem_object *p, const u64 val, unsigned int count);
 
@@ -530,8 +531,8 @@ struct i915_page_table *alloc_pt(struct i915_address_space *vm);
 struct i915_page_directory *alloc_pd(struct i915_address_space *vm);
 struct i915_page_directory *__alloc_pd(int npde);
 
-int pin_pt_dma(struct i915_address_space *vm, struct drm_i915_gem_object *obj);
-int pin_pt_dma_locked(struct i915_address_space *vm, struct drm_i915_gem_object *obj);
+int map_pt_dma(struct i915_address_space *vm, struct drm_i915_gem_object *obj);
+int map_pt_dma_locked(struct i915_address_space *vm, struct drm_i915_gem_object *obj);
 
 void free_px(struct i915_address_space *vm,
 	     struct i915_page_table *pt, int lvl);
@@ -578,7 +579,7 @@ void setup_private_pat(struct intel_uncore *uncore);
 int i915_vm_alloc_pt_stash(struct i915_address_space *vm,
 			   struct i915_vm_pt_stash *stash,
 			   u64 size);
-int i915_vm_pin_pt_stash(struct i915_address_space *vm,
+int i915_vm_map_pt_stash(struct i915_address_space *vm,
 			 struct i915_vm_pt_stash *stash);
 void i915_vm_free_pt_stash(struct i915_address_space *vm,
 			   struct i915_vm_pt_stash *stash);
diff --git a/drivers/gpu/drm/i915/gt/intel_ppgtt.c b/drivers/gpu/drm/i915/gt/intel_ppgtt.c
index 014ae8ac4480..4e3d80c2295c 100644
--- a/drivers/gpu/drm/i915/gt/intel_ppgtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_ppgtt.c
@@ -87,11 +87,10 @@ write_dma_entry(struct drm_i915_gem_object * const pdma,
 		const unsigned short idx,
 		const u64 encoded_entry)
 {
-	u64 * const vaddr = kmap_atomic(__px_page(pdma));
+	u64 * const vaddr = __px_vaddr(pdma);
 
 	vaddr[idx] = encoded_entry;
 	clflush_cache_range(&vaddr[idx], sizeof(u64));
-	kunmap_atomic(vaddr);
 }
 
 void
@@ -258,7 +257,7 @@ int i915_vm_alloc_pt_stash(struct i915_address_space *vm,
 	return 0;
 }
 
-int i915_vm_pin_pt_stash(struct i915_address_space *vm,
+int i915_vm_map_pt_stash(struct i915_address_space *vm,
 			 struct i915_vm_pt_stash *stash)
 {
 	struct i915_page_table *pt;
@@ -266,7 +265,7 @@ int i915_vm_pin_pt_stash(struct i915_address_space *vm,
 
 	for (n = 0; n < ARRAY_SIZE(stash->pt); n++) {
 		for (pt = stash->pt[n]; pt; pt = pt->stash) {
-			err = pin_pt_dma_locked(vm, pt->base);
+			err = map_pt_dma_locked(vm, pt->base);
 			if (err)
 				return err;
 		}
diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index e24d33aecac4..c68a743fac2a 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -912,8 +912,7 @@ int i915_vma_pin_ww(struct i915_vma *vma, struct i915_gem_ww_ctx *ww,
 			if (err)
 				goto err_fence;
 
-			err = i915_vm_pin_pt_stash(vma->vm,
-						   &work->stash);
+			err = i915_vm_map_pt_stash(vma->vm, &work->stash);
 			if (err)
 				goto err_fence;
 		}
diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
index 2e4f06eaacc1..e060e455e9f6 100644
--- a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
@@ -186,7 +186,7 @@ static int igt_ppgtt_alloc(void *arg)
 		if (err)
 			goto err_ppgtt_cleanup;
 
-		err = i915_vm_pin_pt_stash(&ppgtt->vm, &stash);
+		err = i915_vm_map_pt_stash(&ppgtt->vm, &stash);
 		if (err) {
 			i915_vm_free_pt_stash(&ppgtt->vm, &stash);
 			goto err_ppgtt_cleanup;
@@ -208,7 +208,7 @@ static int igt_ppgtt_alloc(void *arg)
 		if (err)
 			goto err_ppgtt_cleanup;
 
-		err = i915_vm_pin_pt_stash(&ppgtt->vm, &stash);
+		err = i915_vm_map_pt_stash(&ppgtt->vm, &stash);
 		if (err) {
 			i915_vm_free_pt_stash(&ppgtt->vm, &stash);
 			goto err_ppgtt_cleanup;
@@ -325,11 +325,10 @@ static int lowlevel_hole(struct i915_address_space *vm,
 							   BIT_ULL(size)))
 					goto alloc_vm_end;
 
-				err = i915_vm_pin_pt_stash(vm, &stash);
+				err = i915_vm_map_pt_stash(vm, &stash);
 				if (!err)
 					vm->allocate_va_range(vm, &stash,
 							      addr, BIT_ULL(size));
-
 				i915_vm_free_pt_stash(vm, &stash);
 alloc_vm_end:
 				if (err == -EDEADLK) {
@@ -1967,10 +1966,9 @@ static int igt_cs_tlb(void *arg)
 			if (err)
 				goto end_ww;
 
-			err = i915_vm_pin_pt_stash(vm, &stash);
+			err = i915_vm_map_pt_stash(vm, &stash);
 			if (!err)
 				vm->allocate_va_range(vm, &stash, offset, chunk_size);
-
 			i915_vm_free_pt_stash(vm, &stash);
 end_ww:
 			if (err == -EDEADLK) {
diff --git a/drivers/gpu/drm/i915/selftests/i915_perf.c b/drivers/gpu/drm/i915/selftests/i915_perf.c
index e9d86dab8677..bfb0290967a1 100644
--- a/drivers/gpu/drm/i915/selftests/i915_perf.c
+++ b/drivers/gpu/drm/i915/selftests/i915_perf.c
@@ -307,7 +307,7 @@ static int live_noa_gpr(void *arg)
 	}
 
 	/* Poison the ce->vm so we detect writes not to the GGTT gt->scratch */
-	scratch = kmap(__px_page(ce->vm->scratch[0]));
+	scratch = __px_vaddr(ce->vm->scratch[0]);
 	memset(scratch, POISON_FREE, PAGE_SIZE);
 
 	rq = intel_context_create_request(ce);
@@ -405,7 +405,6 @@ static int live_noa_gpr(void *arg)
 out_rq:
 	i915_request_put(rq);
 out_ce:
-	kunmap(__px_page(ce->vm->scratch[0]));
 	intel_context_put(ce);
 out:
 	stream_destroy(stream);
-- 
2.26.3

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 65+ messages in thread

* [PATCH 19/19] drm/i915/gtt/dgfx: place the PD in LMEM
  2021-04-12  9:05 [PATCH 00/19] More DG1 enabling Matthew Auld
                   ` (17 preceding siblings ...)
  2021-04-12  9:05 ` [PATCH 18/19] drm/i915/gtt: map the PD up front Matthew Auld
@ 2021-04-12  9:05 ` Matthew Auld
  2021-04-14 15:37   ` [Intel-gfx] " Tvrtko Ursulin
  18 siblings, 1 reply; 65+ messages in thread
From: Matthew Auld @ 2021-04-12  9:05 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel

It's a requirement that for dgfx we place all the paging structures in
device local-memory.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
---
 drivers/gpu/drm/i915/gt/gen8_ppgtt.c |  5 ++++-
 drivers/gpu/drm/i915/gt/intel_gtt.c  | 27 +++++++++++++++++++++++++--
 drivers/gpu/drm/i915/gt/intel_gtt.h  |  1 +
 3 files changed, 30 insertions(+), 3 deletions(-)
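
For readers skimming the diff below, the two decisions this patch makes can be
sketched in isolation: the page-table allocator is chosen once when the address
space is created, and the CPU mapping type is chosen per object when it is
mapped. This is only a minimal userspace sketch. Every mock_* name is
hypothetical and stands in for the i915 types; none of it is driver code.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the driver's types, for illustration only. */
enum mock_region { MOCK_SMEM, MOCK_LMEM };
enum mock_map_type { MOCK_MAP_WB, MOCK_MAP_WC };

struct mock_object {
	enum mock_region region;	/* where the backing store lives */
};

struct mock_vm {
	/* plays the role of vm->alloc_pt_dma: set once at creation time */
	enum mock_region (*alloc_pt)(void);
};

static enum mock_region mock_alloc_pt_smem(void) { return MOCK_SMEM; }
static enum mock_region mock_alloc_pt_lmem(void) { return MOCK_LMEM; }

/* Choose the page-table allocator from the device capability. */
static void mock_vm_init(struct mock_vm *vm, bool has_lmem)
{
	vm->alloc_pt = has_lmem ? mock_alloc_pt_lmem : mock_alloc_pt_smem;
}

/* Write-combined mapping for local memory, write-back otherwise. */
static enum mock_map_type mock_pt_map_type(const struct mock_object *obj)
{
	return obj->region == MOCK_LMEM ? MOCK_MAP_WC : MOCK_MAP_WB;
}
```

Keeping the allocator behind a per-vm function pointer means the callers that
allocate page tables do not need to know where the backing store ends up.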

diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
index f83496836f0f..11fb5df45a0f 100644
--- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
+++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
@@ -712,7 +712,10 @@ struct i915_ppgtt *gen8_ppgtt_create(struct intel_gt *gt)
 	 */
 	ppgtt->vm.has_read_only = !IS_GEN_RANGE(gt->i915, 11, 12);
 
-	ppgtt->vm.alloc_pt_dma = alloc_pt_dma;
+	if (HAS_LMEM(gt->i915))
+		ppgtt->vm.alloc_pt_dma = alloc_pt_lmem;
+	else
+		ppgtt->vm.alloc_pt_dma = alloc_pt_dma;
 
 	err = gen8_init_scratch(&ppgtt->vm);
 	if (err)
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.c b/drivers/gpu/drm/i915/gt/intel_gtt.c
index d386b89e2758..1eeeab45445c 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.c
@@ -7,10 +7,23 @@
 
 #include <linux/fault-inject.h>
 
+#include "gem/i915_gem_lmem.h"
 #include "i915_trace.h"
 #include "intel_gt.h"
 #include "intel_gtt.h"
 
+struct drm_i915_gem_object *alloc_pt_lmem(struct i915_address_space *vm, int sz)
+{
+	struct drm_i915_gem_object *obj;
+
+	obj = i915_gem_object_create_lmem(vm->i915, sz, 0);
+
+	/* ensure all dma objects have the same reservation class */
+	if (!IS_ERR(obj))
+		obj->base.resv = &vm->resv;
+	return obj;
+}
+
 struct drm_i915_gem_object *alloc_pt_dma(struct i915_address_space *vm, int sz)
 {
 	struct drm_i915_gem_object *obj;
@@ -27,9 +40,14 @@ struct drm_i915_gem_object *alloc_pt_dma(struct i915_address_space *vm, int sz)
 
 int map_pt_dma(struct i915_address_space *vm, struct drm_i915_gem_object *obj)
 {
+	enum i915_map_type type;
 	void *vaddr;
 
-	vaddr = i915_gem_object_pin_map_unlocked(obj, I915_MAP_WB);
+	type = I915_MAP_WB;
+	if (i915_gem_object_is_lmem(obj))
+		type = I915_MAP_WC;
+
+	vaddr = i915_gem_object_pin_map_unlocked(obj, type);
 	if (IS_ERR(vaddr))
 		return PTR_ERR(vaddr);
 
@@ -39,9 +57,14 @@ int map_pt_dma(struct i915_address_space *vm, struct drm_i915_gem_object *obj)
 
 int map_pt_dma_locked(struct i915_address_space *vm, struct drm_i915_gem_object *obj)
 {
+	enum i915_map_type type;
 	void *vaddr;
 
-	vaddr = i915_gem_object_pin_map(obj, I915_MAP_WB);
+	type = I915_MAP_WB;
+	if (i915_gem_object_is_lmem(obj))
+		type = I915_MAP_WC;
+
+	vaddr = i915_gem_object_pin_map(obj, type);
 	if (IS_ERR(vaddr))
 		return PTR_ERR(vaddr);
 
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.h b/drivers/gpu/drm/i915/gt/intel_gtt.h
index 40e486704558..44ce27c51631 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.h
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.h
@@ -527,6 +527,7 @@ int setup_scratch_page(struct i915_address_space *vm);
 void free_scratch(struct i915_address_space *vm);
 
 struct drm_i915_gem_object *alloc_pt_dma(struct i915_address_space *vm, int sz);
+struct drm_i915_gem_object *alloc_pt_lmem(struct i915_address_space *vm, int sz);
 struct i915_page_table *alloc_pt(struct i915_address_space *vm);
 struct i915_page_directory *alloc_pd(struct i915_address_space *vm);
 struct i915_page_directory *__alloc_pd(int npde);
-- 
2.26.3

^ permalink raw reply related	[flat|nested] 65+ messages in thread

* Re: [Intel-gfx] [PATCH 01/19] drm/i915/gt: Skip aperture remapping selftest where there is no aperture
  2021-04-12  9:05 ` [PATCH 01/19] drm/i915/gt: Skip aperture remapping selftest where there is no aperture Matthew Auld
@ 2021-04-12 14:48   ` Daniel Vetter
  0 siblings, 0 replies; 65+ messages in thread
From: Daniel Vetter @ 2021-04-12 14:48 UTC (permalink / raw)
  To: Matthew Auld; +Cc: intel-gfx, dri-devel, Chris Wilson

On Mon, Apr 12, 2021 at 10:05:08AM +0100, Matthew Auld wrote:
> From: Chris Wilson <chris@chris-wilson.co.uk>
> 
> If there is no mappable aperture, we cannot remap it for access, and the
> selftest is void.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Reviewed-by: Matthew Auld <matthew.auld@intel.com>
> Reviewed-by: Imre Deak <imre.deak@intel.com>

I guess the subject should have i915/selftest in it? Also, if you resubmit
other people's code, it needs your sob. Otherwise looks reasonable.
-Daniel
> ---
>  drivers/gpu/drm/i915/selftests/i915_vma.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/selftests/i915_vma.c b/drivers/gpu/drm/i915/selftests/i915_vma.c
> index 5fe7b80ca0bd..dd0607254a95 100644
> --- a/drivers/gpu/drm/i915/selftests/i915_vma.c
> +++ b/drivers/gpu/drm/i915/selftests/i915_vma.c
> @@ -967,6 +967,9 @@ static int igt_vma_remapped_gtt(void *arg)
>  	intel_wakeref_t wakeref;
>  	int err = 0;
>  
> +	if (!i915_ggtt_has_aperture(&i915->ggtt))
> +		return 0;
> +
>  	obj = i915_gem_object_create_internal(i915, 10 * 10 * PAGE_SIZE);
>  	if (IS_ERR(obj))
>  		return PTR_ERR(obj);
> -- 
> 2.26.3
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH 07/19] drm/i915/fbdev: Use lmem physical addresses for fb_mmap() on discrete
  2021-04-12  9:05 ` [PATCH 07/19] drm/i915/fbdev: Use lmem physical addresses for fb_mmap() on discrete Matthew Auld
@ 2021-04-12 15:00   ` Daniel Vetter
  0 siblings, 0 replies; 65+ messages in thread
From: Daniel Vetter @ 2021-04-12 15:00 UTC (permalink / raw)
  To: Matthew Auld; +Cc: Mohammed Khajapasha, intel-gfx, dri-devel

On Mon, Apr 12, 2021 at 10:05:14AM +0100, Matthew Auld wrote:
> From: Mohammed Khajapasha <mohammed.khajapasha@intel.com>
> 
> Use the local memory IO BAR address for fbdev's fb_mmap() operation on
> discrete; fbdev uses the physical address of our framebuffer for its
> fb_mmap() fn.
> 
> Signed-off-by: Mohammed Khajapasha <mohammed.khajapasha@intel.com>

Sob missing (I didn't check all previous patches), but also I think we
should aim more to reuse the drm fbdev helpers and retire our own here.
Eventually, long-term, and all that.
-Daniel

> ---
>  drivers/gpu/drm/i915/display/intel_fbdev.c | 29 +++++++++++++++++-----
>  1 file changed, 23 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/display/intel_fbdev.c b/drivers/gpu/drm/i915/display/intel_fbdev.c
> index ccd00e65a5fe..2b37959da747 100644
> --- a/drivers/gpu/drm/i915/display/intel_fbdev.c
> +++ b/drivers/gpu/drm/i915/display/intel_fbdev.c
> @@ -41,6 +41,8 @@
>  #include <drm/drm_fb_helper.h>
>  #include <drm/drm_fourcc.h>
>  
> +#include "gem/i915_gem_lmem.h"
> +
>  #include "i915_drv.h"
>  #include "intel_display_types.h"
>  #include "intel_fbdev.h"
> @@ -178,6 +180,7 @@ static int intelfb_create(struct drm_fb_helper *helper,
>  	unsigned long flags = 0;
>  	bool prealloc = false;
>  	void __iomem *vaddr;
> +	struct drm_i915_gem_object *obj;
>  	int ret;
>  
>  	if (intel_fb &&
> @@ -232,13 +235,27 @@ static int intelfb_create(struct drm_fb_helper *helper,
>  	info->fbops = &intelfb_ops;
>  
>  	/* setup aperture base/size for vesafb takeover */
> -	info->apertures->ranges[0].base = ggtt->gmadr.start;
> -	info->apertures->ranges[0].size = ggtt->mappable_end;
> +	obj = intel_fb_obj(&intel_fb->base);
> +	if (i915_gem_object_is_lmem(obj)) {
> +		struct intel_memory_region *mem = obj->mm.region;
> +
> +		info->apertures->ranges[0].base = mem->io_start;
> +		info->apertures->ranges[0].size = mem->total;
> +
> +		/* Use fbdev's framebuffer from lmem for discrete */
> +		info->fix.smem_start =
> +			(unsigned long)(mem->io_start +
> +					i915_gem_object_get_dma_address(obj, 0));
> +		info->fix.smem_len = obj->base.size;
> +	} else {
> +		info->apertures->ranges[0].base = ggtt->gmadr.start;
> +		info->apertures->ranges[0].size = ggtt->mappable_end;
>  
> -	/* Our framebuffer is the entirety of fbdev's system memory */
> -	info->fix.smem_start =
> -		(unsigned long)(ggtt->gmadr.start + vma->node.start);
> -	info->fix.smem_len = vma->node.size;
> +		/* Our framebuffer is the entirety of fbdev's system memory */
> +		info->fix.smem_start =
> +			(unsigned long)(ggtt->gmadr.start + vma->node.start);
> +		info->fix.smem_len = vma->node.size;
> +	}
>  
>  	vaddr = i915_vma_pin_iomap(vma);
>  	if (IS_ERR(vaddr)) {
> -- 
> 2.26.3
> 
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [Intel-gfx] [PATCH 18/19] drm/i915/gtt: map the PD up front
  2021-04-12  9:05 ` [PATCH 18/19] drm/i915/gtt: map the PD up front Matthew Auld
@ 2021-04-12 15:17   ` Daniel Vetter
  2021-04-12 16:01     ` Jani Nikula
  2021-04-12 16:08     ` Matthew Auld
  0 siblings, 2 replies; 65+ messages in thread
From: Daniel Vetter @ 2021-04-12 15:17 UTC (permalink / raw)
  To: Matthew Auld; +Cc: intel-gfx, dri-devel, Chris Wilson

On Mon, Apr 12, 2021 at 10:05:25AM +0100, Matthew Auld wrote:
> We need to generalize our accessor for the page directories and tables from
> using the simple kmap_atomic to support local memory, and this setup
> must be done on acquisition of the backing storage prior to entering
> fence execution contexts. Here we replace the kmap with the object
> mapping code, which for a simple single-page shmemfs object will return a
> plain kmap that is then kept for the lifetime of the page directory.
> 
> v2: (Thomas) Rebase on dma_resv and obj->mm.lock removal.
> 
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
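
The long-lived mapping described above is read back in the diff below through
__px_vaddr(), which calls page_unpack_bits() on p->mm.mapping. That suggests
the mapping type is kept in the low bits of the page-aligned pointer. A minimal
userspace sketch of that packing idea follows. The mock_* names and the 4096
byte page size are assumptions for illustration, not the i915 helpers
themselves.

```c
#include <stdint.h>

/* Assumed page size; a page-aligned pointer has its low 12 bits clear. */
#define MOCK_PAGE_SIZE 4096u
#define MOCK_LOW_MASK ((uintptr_t)MOCK_PAGE_SIZE - 1)

/* Store a small tag (e.g. a map type) in the unused low bits. */
static void *mock_pack_bits(void *ptr, unsigned int bits)
{
	return (void *)((uintptr_t)ptr | ((uintptr_t)bits & MOCK_LOW_MASK));
}

/* Recover the original pointer and hand the tag back through *bits. */
static void *mock_unpack_bits(void *packed, unsigned int *bits)
{
	uintptr_t v = (uintptr_t)packed;

	*bits = (unsigned int)(v & MOCK_LOW_MASK);
	return (void *)(v & ~MOCK_LOW_MASK);
}
```

With this layout a reader of the mapping gets both the address and the type
from one pointer-sized field, which is why __px_vaddr() has to pass a dummy
type variable even though it only wants the address.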

So I wanted to understand what px stands for as an abbreviation, and dug
all the way down to this:

commit 567047be2a7ede082d29f45524c287b87bd75e53
Author: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Date:   Thu Jun 25 18:35:12 2015 +0300

    drm/i915/gtt: Use macros to access dma mapped pages

I still have no idea what it means, I guess px = page. But I also
committed this, so I guess I can blame myself :-)

But while digging I've stumbled over this here

commit 6eebfe8a10a62139d681e2f1af1386252742278b
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Fri Jul 12 08:58:18 2019 +0100

    drm/i915/gtt: Use shallow dma pages for scratch


And that's some serious wtf. Yes we've done some compile-time type
casting automagic between i915_priv and dev in the past, and I think even
that was bad taste. But it was justified by the fact that we have these
everywhere (especially in the mmio macros), and it would be a terrible
flag day.

But I'm not seeing any need for auto-casting for these pages here, and I'm
not aware that we're doing this anywhere else in kernel code. There is
some macro-trickery in lockdep annotations, but that relies on the lockdep
map having the same struct member name in all lock types, and is not
exposed to drivers at all.

Am I missing something, or why do we have this compile-time type casting
stuff going on in i915 page accessors?
-Daniel
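
For anyone following along, the kind of compile-time type casting in question
can be sketched roughly as below. This assumes the px macros are built on the
GCC/Clang builtins __builtin_choose_expr() and __builtin_types_compatible_p().
The mock_* types and macros are hypothetical and only modeled on how
__px_choose_expr is used in the quoted intel_gtt.h hunk. The sketch needs GNU C
extensions (__typeof__ and statement expressions).

```c
/* Hypothetical three-level hierarchy, each level reaching the same base. */
struct mock_obj { int id; };
struct mock_pt { struct mock_obj *base; };
struct mock_pd { struct mock_pt pt; };

/*
 * If x has pointer type 'type', evaluate 'expr' with __x bound to x;
 * otherwise fall through to 'other'. The choice is made at compile time.
 * The cast keeps the unselected branch well-typed so it still parses.
 */
#define mock_choose(x, type, expr, other) \
	__builtin_choose_expr( \
		__builtin_types_compatible_p(__typeof__(x), type), \
		({ type __x = (type)(x); expr; }), \
		other)

/* Accept an object, a page table or a page directory; yield the object. */
#define mock_base(px) \
	mock_choose(px, struct mock_obj *, __x, \
	mock_choose(px, struct mock_pt *, __x->base, \
	mock_choose(px, struct mock_pd *, __x->pt.base, \
	(void)0)))
```

Passing any other pointer type makes mock_base() evaluate to (void)0, so using
the result fails to compile instead of misbehaving at run time. The cost is
that the call site no longer shows which of the three types it is handling.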

> ---
>  .../drm/i915/gem/selftests/i915_gem_context.c | 11 +----
>  drivers/gpu/drm/i915/gt/gen6_ppgtt.c          | 11 ++---
>  drivers/gpu/drm/i915/gt/gen8_ppgtt.c          | 26 ++++------
>  drivers/gpu/drm/i915/gt/intel_ggtt.c          |  2 +-
>  drivers/gpu/drm/i915/gt/intel_gtt.c           | 48 +++++++++----------
>  drivers/gpu/drm/i915/gt/intel_gtt.h           | 11 +++--
>  drivers/gpu/drm/i915/gt/intel_ppgtt.c         |  7 ++-
>  drivers/gpu/drm/i915/i915_vma.c               |  3 +-
>  drivers/gpu/drm/i915/selftests/i915_gem_gtt.c | 10 ++--
>  drivers/gpu/drm/i915/selftests/i915_perf.c    |  3 +-
>  10 files changed, 54 insertions(+), 78 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
> index 5fef592390cb..ce70d0a3afb2 100644
> --- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
> @@ -1740,7 +1740,6 @@ static int read_from_scratch(struct i915_gem_context *ctx,
>  static int check_scratch_page(struct i915_gem_context *ctx, u32 *out)
>  {
>  	struct i915_address_space *vm;
> -	struct page *page;
>  	u32 *vaddr;
>  	int err = 0;
>  
> @@ -1748,24 +1747,18 @@ static int check_scratch_page(struct i915_gem_context *ctx, u32 *out)
>  	if (!vm)
>  		return -ENODEV;
>  
> -	page = __px_page(vm->scratch[0]);
> -	if (!page) {
> +	if (!vm->scratch[0]) {
>  		pr_err("No scratch page!\n");
>  		return -EINVAL;
>  	}
>  
> -	vaddr = kmap(page);
> -	if (!vaddr) {
> -		pr_err("No (mappable) scratch page!\n");
> -		return -EINVAL;
> -	}
> +	vaddr = __px_vaddr(vm->scratch[0]);
>  
>  	memcpy(out, vaddr, sizeof(*out));
>  	if (memchr_inv(vaddr, *out, PAGE_SIZE)) {
>  		pr_err("Inconsistent initial state of scratch page!\n");
>  		err = -EINVAL;
>  	}
> -	kunmap(page);
>  
>  	return err;
>  }
> diff --git a/drivers/gpu/drm/i915/gt/gen6_ppgtt.c b/drivers/gpu/drm/i915/gt/gen6_ppgtt.c
> index e08dff376339..21b1085769be 100644
> --- a/drivers/gpu/drm/i915/gt/gen6_ppgtt.c
> +++ b/drivers/gpu/drm/i915/gt/gen6_ppgtt.c
> @@ -96,9 +96,8 @@ static void gen6_ppgtt_clear_range(struct i915_address_space *vm,
>  		 * entries back to scratch.
>  		 */
>  
> -		vaddr = kmap_atomic_px(pt);
> +		vaddr = px_vaddr(pt);
>  		memset32(vaddr + pte, scratch_pte, count);
> -		kunmap_atomic(vaddr);
>  
>  		pte = 0;
>  	}
> @@ -120,7 +119,7 @@ static void gen6_ppgtt_insert_entries(struct i915_address_space *vm,
>  
>  	GEM_BUG_ON(!pd->entry[act_pt]);
>  
> -	vaddr = kmap_atomic_px(i915_pt_entry(pd, act_pt));
> +	vaddr = px_vaddr(i915_pt_entry(pd, act_pt));
>  	do {
>  		GEM_BUG_ON(sg_dma_len(iter.sg) < I915_GTT_PAGE_SIZE);
>  		vaddr[act_pte] = pte_encode | GEN6_PTE_ADDR_ENCODE(iter.dma);
> @@ -136,12 +135,10 @@ static void gen6_ppgtt_insert_entries(struct i915_address_space *vm,
>  		}
>  
>  		if (++act_pte == GEN6_PTES) {
> -			kunmap_atomic(vaddr);
> -			vaddr = kmap_atomic_px(i915_pt_entry(pd, ++act_pt));
> +			vaddr = px_vaddr(i915_pt_entry(pd, ++act_pt));
>  			act_pte = 0;
>  		}
>  	} while (1);
> -	kunmap_atomic(vaddr);
>  
>  	vma->page_sizes.gtt = I915_GTT_PAGE_SIZE;
>  }
> @@ -235,7 +232,7 @@ static int gen6_ppgtt_init_scratch(struct gen6_ppgtt *ppgtt)
>  		goto err_scratch0;
>  	}
>  
> -	ret = pin_pt_dma(vm, vm->scratch[1]);
> +	ret = map_pt_dma(vm, vm->scratch[1]);
>  	if (ret)
>  		goto err_scratch1;
>  
> diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
> index 176c19633412..f83496836f0f 100644
> --- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
> +++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
> @@ -242,11 +242,10 @@ static u64 __gen8_ppgtt_clear(struct i915_address_space * const vm,
>  			    atomic_read(&pt->used));
>  			GEM_BUG_ON(!count || count >= atomic_read(&pt->used));
>  
> -			vaddr = kmap_atomic_px(pt);
> +			vaddr = px_vaddr(pt);
>  			memset64(vaddr + gen8_pd_index(start, 0),
>  				 vm->scratch[0]->encode,
>  				 count);
> -			kunmap_atomic(vaddr);
>  
>  			atomic_sub(count, &pt->used);
>  			start += count;
> @@ -375,7 +374,7 @@ gen8_ppgtt_insert_pte(struct i915_ppgtt *ppgtt,
>  	gen8_pte_t *vaddr;
>  
>  	pd = i915_pd_entry(pdp, gen8_pd_index(idx, 2));
> -	vaddr = kmap_atomic_px(i915_pt_entry(pd, gen8_pd_index(idx, 1)));
> +	vaddr = px_vaddr(i915_pt_entry(pd, gen8_pd_index(idx, 1)));
>  	do {
>  		GEM_BUG_ON(sg_dma_len(iter->sg) < I915_GTT_PAGE_SIZE);
>  		vaddr[gen8_pd_index(idx, 0)] = pte_encode | iter->dma;
> @@ -402,12 +401,10 @@ gen8_ppgtt_insert_pte(struct i915_ppgtt *ppgtt,
>  			}
>  
>  			clflush_cache_range(vaddr, PAGE_SIZE);
> -			kunmap_atomic(vaddr);
> -			vaddr = kmap_atomic_px(i915_pt_entry(pd, gen8_pd_index(idx, 1)));
> +			vaddr = px_vaddr(i915_pt_entry(pd, gen8_pd_index(idx, 1)));
>  		}
>  	} while (1);
>  	clflush_cache_range(vaddr, PAGE_SIZE);
> -	kunmap_atomic(vaddr);
>  
>  	return idx;
>  }
> @@ -442,7 +439,7 @@ static void gen8_ppgtt_insert_huge(struct i915_vma *vma,
>  			encode |= GEN8_PDE_PS_2M;
>  			page_size = I915_GTT_PAGE_SIZE_2M;
>  
> -			vaddr = kmap_atomic_px(pd);
> +			vaddr = px_vaddr(pd);
>  		} else {
>  			struct i915_page_table *pt =
>  				i915_pt_entry(pd, __gen8_pte_index(start, 1));
> @@ -457,7 +454,7 @@ static void gen8_ppgtt_insert_huge(struct i915_vma *vma,
>  			     rem >= (I915_PDES - index) * I915_GTT_PAGE_SIZE))
>  				maybe_64K = __gen8_pte_index(start, 1);
>  
> -			vaddr = kmap_atomic_px(pt);
> +			vaddr = px_vaddr(pt);
>  		}
>  
>  		do {
> @@ -491,7 +488,6 @@ static void gen8_ppgtt_insert_huge(struct i915_vma *vma,
>  		} while (rem >= page_size && index < I915_PDES);
>  
>  		clflush_cache_range(vaddr, PAGE_SIZE);
> -		kunmap_atomic(vaddr);
>  
>  		/*
>  		 * Is it safe to mark the 2M block as 64K? -- Either we have
> @@ -505,9 +501,8 @@ static void gen8_ppgtt_insert_huge(struct i915_vma *vma,
>  		      !iter->sg && IS_ALIGNED(vma->node.start +
>  					      vma->node.size,
>  					      I915_GTT_PAGE_SIZE_2M)))) {
> -			vaddr = kmap_atomic_px(pd);
> +			vaddr = px_vaddr(pd);
>  			vaddr[maybe_64K] |= GEN8_PDE_IPS_64K;
> -			kunmap_atomic(vaddr);
>  			page_size = I915_GTT_PAGE_SIZE_64K;
>  
>  			/*
> @@ -523,12 +518,11 @@ static void gen8_ppgtt_insert_huge(struct i915_vma *vma,
>  				u16 i;
>  
>  				encode = vma->vm->scratch[0]->encode;
> -				vaddr = kmap_atomic_px(i915_pt_entry(pd, maybe_64K));
> +				vaddr = px_vaddr(i915_pt_entry(pd, maybe_64K));
>  
>  				for (i = 1; i < index; i += 16)
>  					memset64(vaddr + i, encode, 15);
>  
> -				kunmap_atomic(vaddr);
>  			}
>  		}
>  
> @@ -602,7 +596,7 @@ static int gen8_init_scratch(struct i915_address_space *vm)
>  		if (IS_ERR(obj))
>  			goto free_scratch;
>  
> -		ret = pin_pt_dma(vm, obj);
> +		ret = map_pt_dma(vm, obj);
>  		if (ret) {
>  			i915_gem_object_put(obj);
>  			goto free_scratch;
> @@ -639,7 +633,7 @@ static int gen8_preallocate_top_level_pdp(struct i915_ppgtt *ppgtt)
>  		if (IS_ERR(pde))
>  			return PTR_ERR(pde);
>  
> -		err = pin_pt_dma(vm, pde->pt.base);
> +		err = map_pt_dma(vm, pde->pt.base);
>  		if (err) {
>  			i915_gem_object_put(pde->pt.base);
>  			free_pd(vm, pde);
> @@ -675,7 +669,7 @@ gen8_alloc_top_pd(struct i915_address_space *vm)
>  		goto err_pd;
>  	}
>  
> -	err = pin_pt_dma(vm, pd->pt.base);
> +	err = map_pt_dma(vm, pd->pt.base);
>  	if (err)
>  		goto err_pd;
>  
> diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt.c b/drivers/gpu/drm/i915/gt/intel_ggtt.c
> index 670c1271e7d5..d94628b9d89e 100644
> --- a/drivers/gpu/drm/i915/gt/intel_ggtt.c
> +++ b/drivers/gpu/drm/i915/gt/intel_ggtt.c
> @@ -657,7 +657,7 @@ static int init_aliasing_ppgtt(struct i915_ggtt *ggtt)
>  		goto err_ppgtt;
>  
>  	i915_gem_object_lock(ppgtt->vm.scratch[0], NULL);
> -	err = i915_vm_pin_pt_stash(&ppgtt->vm, &stash);
> +	err = i915_vm_map_pt_stash(&ppgtt->vm, &stash);
>  	i915_gem_object_unlock(ppgtt->vm.scratch[0]);
>  	if (err)
>  		goto err_stash;
> diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.c b/drivers/gpu/drm/i915/gt/intel_gtt.c
> index 941f8af016d6..d386b89e2758 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gtt.c
> +++ b/drivers/gpu/drm/i915/gt/intel_gtt.c
> @@ -25,27 +25,25 @@ struct drm_i915_gem_object *alloc_pt_dma(struct i915_address_space *vm, int sz)
>  	return obj;
>  }
>  
> -int pin_pt_dma(struct i915_address_space *vm, struct drm_i915_gem_object *obj)
> +int map_pt_dma(struct i915_address_space *vm, struct drm_i915_gem_object *obj)
>  {
> -	int err;
> +	void *vaddr;
>  
> -	i915_gem_object_lock(obj, NULL);
> -	err = i915_gem_object_pin_pages(obj);
> -	i915_gem_object_unlock(obj);
> -	if (err)
> -		return err;
> +	vaddr = i915_gem_object_pin_map_unlocked(obj, I915_MAP_WB);
> +	if (IS_ERR(vaddr))
> +		return PTR_ERR(vaddr);
>  
>  	i915_gem_object_make_unshrinkable(obj);
>  	return 0;
>  }
>  
> -int pin_pt_dma_locked(struct i915_address_space *vm, struct drm_i915_gem_object *obj)
> +int map_pt_dma_locked(struct i915_address_space *vm, struct drm_i915_gem_object *obj)
>  {
> -	int err;
> +	void *vaddr;
>  
> -	err = i915_gem_object_pin_pages(obj);
> -	if (err)
> -		return err;
> +	vaddr = i915_gem_object_pin_map(obj, I915_MAP_WB);
> +	if (IS_ERR(vaddr))
> +		return PTR_ERR(vaddr);
>  
>  	i915_gem_object_make_unshrinkable(obj);
>  	return 0;
> @@ -155,6 +153,14 @@ void clear_pages(struct i915_vma *vma)
>  	memset(&vma->page_sizes, 0, sizeof(vma->page_sizes));
>  }
>  
> +void *__px_vaddr(struct drm_i915_gem_object *p)
> +{
> +	enum i915_map_type type;
> +
> +	GEM_BUG_ON(!i915_gem_object_has_pages(p));
> +	return page_unpack_bits(p->mm.mapping, &type);
> +}
> +
>  dma_addr_t __px_dma(struct drm_i915_gem_object *p)
>  {
>  	GEM_BUG_ON(!i915_gem_object_has_pages(p));
> @@ -170,32 +176,22 @@ struct page *__px_page(struct drm_i915_gem_object *p)
>  void
>  fill_page_dma(struct drm_i915_gem_object *p, const u64 val, unsigned int count)
>  {
> -	struct page *page = __px_page(p);
> -	void *vaddr;
> +	void *vaddr = __px_vaddr(p);
>  
> -	vaddr = kmap(page);
>  	memset64(vaddr, val, count);
>  	clflush_cache_range(vaddr, PAGE_SIZE);
> -	kunmap(page);
>  }
>  
>  static void poison_scratch_page(struct drm_i915_gem_object *scratch)
>  {
> -	struct sgt_iter sgt;
> -	struct page *page;
> +	void *vaddr = __px_vaddr(scratch);
>  	u8 val;
>  
>  	val = 0;
>  	if (IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM))
>  		val = POISON_FREE;
>  
> -	for_each_sgt_page(page, sgt, scratch->mm.pages) {
> -		void *vaddr;
> -
> -		vaddr = kmap(page);
> -		memset(vaddr, val, PAGE_SIZE);
> -		kunmap(page);
> -	}
> +	memset(vaddr, val, scratch->base.size);
>  }
>  
>  int setup_scratch_page(struct i915_address_space *vm)
> @@ -225,7 +221,7 @@ int setup_scratch_page(struct i915_address_space *vm)
>  		if (IS_ERR(obj))
>  			goto skip;
>  
> -		if (pin_pt_dma(vm, obj))
> +		if (map_pt_dma(vm, obj))
>  			goto skip_obj;
>  
>  		/* We need a single contiguous page for our scratch */
> diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.h b/drivers/gpu/drm/i915/gt/intel_gtt.h
> index e67e34e17913..40e486704558 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gtt.h
> +++ b/drivers/gpu/drm/i915/gt/intel_gtt.h
> @@ -180,6 +180,9 @@ struct page *__px_page(struct drm_i915_gem_object *p);
>  dma_addr_t __px_dma(struct drm_i915_gem_object *p);
>  #define px_dma(px) (__px_dma(px_base(px)))
>  
> +void *__px_vaddr(struct drm_i915_gem_object *p);
> +#define px_vaddr(px) (__px_vaddr(px_base(px)))
> +
>  #define px_pt(px) \
>  	__px_choose_expr(px, struct i915_page_table *, __x, \
>  	__px_choose_expr(px, struct i915_page_directory *, &__x->pt, \
> @@ -511,8 +514,6 @@ struct i915_ppgtt *i915_ppgtt_create(struct intel_gt *gt);
>  void i915_ggtt_suspend(struct i915_ggtt *gtt);
>  void i915_ggtt_resume(struct i915_ggtt *ggtt);
>  
> -#define kmap_atomic_px(px) kmap_atomic(__px_page(px_base(px)))
> -
>  void
>  fill_page_dma(struct drm_i915_gem_object *p, const u64 val, unsigned int count);
>  
> @@ -530,8 +531,8 @@ struct i915_page_table *alloc_pt(struct i915_address_space *vm);
>  struct i915_page_directory *alloc_pd(struct i915_address_space *vm);
>  struct i915_page_directory *__alloc_pd(int npde);
>  
> -int pin_pt_dma(struct i915_address_space *vm, struct drm_i915_gem_object *obj);
> -int pin_pt_dma_locked(struct i915_address_space *vm, struct drm_i915_gem_object *obj);
> +int map_pt_dma(struct i915_address_space *vm, struct drm_i915_gem_object *obj);
> +int map_pt_dma_locked(struct i915_address_space *vm, struct drm_i915_gem_object *obj);
>  
>  void free_px(struct i915_address_space *vm,
>  	     struct i915_page_table *pt, int lvl);
> @@ -578,7 +579,7 @@ void setup_private_pat(struct intel_uncore *uncore);
>  int i915_vm_alloc_pt_stash(struct i915_address_space *vm,
>  			   struct i915_vm_pt_stash *stash,
>  			   u64 size);
> -int i915_vm_pin_pt_stash(struct i915_address_space *vm,
> +int i915_vm_map_pt_stash(struct i915_address_space *vm,
>  			 struct i915_vm_pt_stash *stash);
>  void i915_vm_free_pt_stash(struct i915_address_space *vm,
>  			   struct i915_vm_pt_stash *stash);
> diff --git a/drivers/gpu/drm/i915/gt/intel_ppgtt.c b/drivers/gpu/drm/i915/gt/intel_ppgtt.c
> index 014ae8ac4480..4e3d80c2295c 100644
> --- a/drivers/gpu/drm/i915/gt/intel_ppgtt.c
> +++ b/drivers/gpu/drm/i915/gt/intel_ppgtt.c
> @@ -87,11 +87,10 @@ write_dma_entry(struct drm_i915_gem_object * const pdma,
>  		const unsigned short idx,
>  		const u64 encoded_entry)
>  {
> -	u64 * const vaddr = kmap_atomic(__px_page(pdma));
> +	u64 * const vaddr = __px_vaddr(pdma);
>  
>  	vaddr[idx] = encoded_entry;
>  	clflush_cache_range(&vaddr[idx], sizeof(u64));
> -	kunmap_atomic(vaddr);
>  }
>  
>  void
> @@ -258,7 +257,7 @@ int i915_vm_alloc_pt_stash(struct i915_address_space *vm,
>  	return 0;
>  }
>  
> -int i915_vm_pin_pt_stash(struct i915_address_space *vm,
> +int i915_vm_map_pt_stash(struct i915_address_space *vm,
>  			 struct i915_vm_pt_stash *stash)
>  {
>  	struct i915_page_table *pt;
> @@ -266,7 +265,7 @@ int i915_vm_pin_pt_stash(struct i915_address_space *vm,
>  
>  	for (n = 0; n < ARRAY_SIZE(stash->pt); n++) {
>  		for (pt = stash->pt[n]; pt; pt = pt->stash) {
> -			err = pin_pt_dma_locked(vm, pt->base);
> +			err = map_pt_dma_locked(vm, pt->base);
>  			if (err)
>  				return err;
>  		}
> diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
> index e24d33aecac4..c68a743fac2a 100644
> --- a/drivers/gpu/drm/i915/i915_vma.c
> +++ b/drivers/gpu/drm/i915/i915_vma.c
> @@ -912,8 +912,7 @@ int i915_vma_pin_ww(struct i915_vma *vma, struct i915_gem_ww_ctx *ww,
>  			if (err)
>  				goto err_fence;
>  
> -			err = i915_vm_pin_pt_stash(vma->vm,
> -						   &work->stash);
> +			err = i915_vm_map_pt_stash(vma->vm, &work->stash);
>  			if (err)
>  				goto err_fence;
>  		}
> diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
> index 2e4f06eaacc1..e060e455e9f6 100644
> --- a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
> +++ b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
> @@ -186,7 +186,7 @@ static int igt_ppgtt_alloc(void *arg)
>  		if (err)
>  			goto err_ppgtt_cleanup;
>  
> -		err = i915_vm_pin_pt_stash(&ppgtt->vm, &stash);
> +		err = i915_vm_map_pt_stash(&ppgtt->vm, &stash);
>  		if (err) {
>  			i915_vm_free_pt_stash(&ppgtt->vm, &stash);
>  			goto err_ppgtt_cleanup;
> @@ -208,7 +208,7 @@ static int igt_ppgtt_alloc(void *arg)
>  		if (err)
>  			goto err_ppgtt_cleanup;
>  
> -		err = i915_vm_pin_pt_stash(&ppgtt->vm, &stash);
> +		err = i915_vm_map_pt_stash(&ppgtt->vm, &stash);
>  		if (err) {
>  			i915_vm_free_pt_stash(&ppgtt->vm, &stash);
>  			goto err_ppgtt_cleanup;
> @@ -325,11 +325,10 @@ static int lowlevel_hole(struct i915_address_space *vm,
>  							   BIT_ULL(size)))
>  					goto alloc_vm_end;
>  
> -				err = i915_vm_pin_pt_stash(vm, &stash);
> +				err = i915_vm_map_pt_stash(vm, &stash);
>  				if (!err)
>  					vm->allocate_va_range(vm, &stash,
>  							      addr, BIT_ULL(size));
> -
>  				i915_vm_free_pt_stash(vm, &stash);
>  alloc_vm_end:
>  				if (err == -EDEADLK) {
> @@ -1967,10 +1966,9 @@ static int igt_cs_tlb(void *arg)
>  			if (err)
>  				goto end_ww;
>  
> -			err = i915_vm_pin_pt_stash(vm, &stash);
> +			err = i915_vm_map_pt_stash(vm, &stash);
>  			if (!err)
>  				vm->allocate_va_range(vm, &stash, offset, chunk_size);
> -
>  			i915_vm_free_pt_stash(vm, &stash);
>  end_ww:
>  			if (err == -EDEADLK) {
> diff --git a/drivers/gpu/drm/i915/selftests/i915_perf.c b/drivers/gpu/drm/i915/selftests/i915_perf.c
> index e9d86dab8677..bfb0290967a1 100644
> --- a/drivers/gpu/drm/i915/selftests/i915_perf.c
> +++ b/drivers/gpu/drm/i915/selftests/i915_perf.c
> @@ -307,7 +307,7 @@ static int live_noa_gpr(void *arg)
>  	}
>  
>  	/* Poison the ce->vm so we detect writes not to the GGTT gt->scratch */
> -	scratch = kmap(__px_page(ce->vm->scratch[0]));
> +	scratch = __px_vaddr(ce->vm->scratch[0]);
>  	memset(scratch, POISON_FREE, PAGE_SIZE);
>  
>  	rq = intel_context_create_request(ce);
> @@ -405,7 +405,6 @@ static int live_noa_gpr(void *arg)
>  out_rq:
>  	i915_request_put(rq);
>  out_ce:
> -	kunmap(__px_page(ce->vm->scratch[0]));
>  	intel_context_put(ce);
>  out:
>  	stream_destroy(stream);
> -- 
> 2.26.3
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

* Re: [Intel-gfx] [PATCH 18/19] drm/i915/gtt: map the PD up front
  2021-04-12 15:17   ` [Intel-gfx] " Daniel Vetter
@ 2021-04-12 16:01     ` Jani Nikula
  2021-04-12 16:36       ` Daniel Vetter
  2021-04-12 16:08     ` Matthew Auld
  1 sibling, 1 reply; 65+ messages in thread
From: Jani Nikula @ 2021-04-12 16:01 UTC (permalink / raw)
  To: Daniel Vetter, Matthew Auld; +Cc: intel-gfx, dri-devel, Chris Wilson

On Mon, 12 Apr 2021, Daniel Vetter <daniel@ffwll.ch> wrote:
> And that's some serious wtf. Yes we've done some compile-time type
> casting automagic between i915_priv and dev in the past, and I think even
> that was bad taste. But it was justified with that we have these
> everywhere (especially in the mmio macros), and it would be a terrible
> flag day.

FWIW, we had the dev_priv/dev macro trickery for a while to not have
that flag day conversion, until everything used i915 or &i915->drm. But
we got rid of it afterwards.

BR,
Jani.


-- 
Jani Nikula, Intel Open Source Graphics Center

* Re: [Intel-gfx] [PATCH 18/19] drm/i915/gtt: map the PD up front
  2021-04-12 15:17   ` [Intel-gfx] " Daniel Vetter
  2021-04-12 16:01     ` Jani Nikula
@ 2021-04-12 16:08     ` Matthew Auld
  2021-04-12 17:00       ` Daniel Vetter
  1 sibling, 1 reply; 65+ messages in thread
From: Matthew Auld @ 2021-04-12 16:08 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Intel Graphics Development, Matthew Auld, ML dri-devel, Chris Wilson

On Mon, 12 Apr 2021 at 16:17, Daniel Vetter <daniel@ffwll.ch> wrote:
>
> On Mon, Apr 12, 2021 at 10:05:25AM +0100, Matthew Auld wrote:
> > We need to generalize our accessor for the page directories and tables
> > beyond the simple kmap_atomic to support local memory, and this setup
> > must be done on acquisition of the backing storage prior to entering
> > fence execution contexts. Here we replace the kmap with the object
> > mapping code, which for a simple single-page shmemfs object will return
> > a plain kmap that is then kept for the lifetime of the page directory.
> >
> > v2: (Thomas) Rebase on dma_resv and obj->mm.lock removal.
> >
> > Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>
> So I wanted to understand what px stands for as an abbreviation, and dug
> all the way down to this:
>
> commit 567047be2a7ede082d29f45524c287b87bd75e53
> Author: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> Date:   Thu Jun 25 18:35:12 2015 +0300
>
>     drm/i915/gtt: Use macros to access dma mapped pages
>
> I still have no idea what it means, I guess px = page. But I also
> committed this, so I guess can blame myself :-)
>
> But while digging I've stumbled over this here
>
> commit 6eebfe8a10a62139d681e2f1af1386252742278b
> Author: Chris Wilson <chris@chris-wilson.co.uk>
> Date:   Fri Jul 12 08:58:18 2019 +0100
>
>     drm/i915/gtt: Use shallow dma pages for scratch
>
>
> And that's some serious wtf. Yes we've done some compile-time type
> casting automagic between i915_priv and dev in the past, and I think even
> that was bad taste. But it was justified with that we have these
> everywhere (especially in the mmio macros), and it would be a terrible
> flag day.
>
> But I'm not seeing any need for auto-casting for these pages here, and I'm
> not aware that we're doing this anywhere else in kernel code. There is
> some macro-trickery in lockdep annotations, but that relies on the lockdep
> map having the same struct member name in all lock types, and is not
> exposed to drivers at all.
>
> Am I missing something, or why do we have this compile-time type casting
> stuff going on in i915 page accessors?

I think 'x' in the px family of macros/functions is meant in the
variable/polymorphic sense, so it can potentially be a pt, pd, etc
underneath. If you look at px_base() for example all it does is fish
out the base GEM object from the structure, using the
known-at-compile-time-type, which then lets us get at the dma address,
vaddr etc.

It does seem pretty magical, but seems ok to me, if it means less typing?
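
To make the compile-time dispatch concrete, here is a minimal standalone
sketch of the idea. It is not the i915 implementation: the struct names and
the obj_base()/pt_base()/pd_base() helpers are hypothetical stand-ins, and it
uses C11 _Generic purely for illustration.

```c
#include <assert.h>

/* Hypothetical stand-ins for the driver structures; not the i915 definitions. */
struct gem_object { unsigned long dma_addr; };
struct page_table { struct gem_object *base; };
struct page_directory { struct page_table pt; };

/* One small helper per concrete type. */
static inline struct gem_object *obj_base(struct gem_object *obj) { return obj; }
static inline struct gem_object *pt_base(struct page_table *pt) { return pt->base; }
static inline struct gem_object *pd_base(struct page_directory *pd) { return pd->pt.base; }

/* The 'x' in px: pick the helper from the static type of the argument. */
#define px_base(px) _Generic((px),			\
	struct gem_object *: obj_base,			\
	struct page_table *: pt_base,			\
	struct page_directory *: pd_base)(px)

/* Built on px_base(), so it accepts any of the three pointer types. */
#define px_dma(px) (px_base(px)->dma_addr)

/* All three spellings resolve to the same backing object. */
static inline int px_example(void)
{
	struct gem_object obj = { .dma_addr = 0x1000 };
	struct page_table pt = { .base = &obj };
	struct page_directory pd = { .pt = { .base = &obj } };

	assert(px_base(&obj) == px_base(&pt));
	assert(px_base(&pt) == px_base(&pd));
	return px_dma(&pd) == 0x1000;
}
```

Passing any other pointer type (including a const-qualified one in this
sketch) fails to compile, which is the only type checking such a macro gives
you.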

* Re: [Intel-gfx] [PATCH 18/19] drm/i915/gtt: map the PD up front
  2021-04-12 16:01     ` Jani Nikula
@ 2021-04-12 16:36       ` Daniel Vetter
  0 siblings, 0 replies; 65+ messages in thread
From: Daniel Vetter @ 2021-04-12 16:36 UTC (permalink / raw)
  To: Jani Nikula; +Cc: dri-devel, intel-gfx, Matthew Auld, Chris Wilson

On Mon, Apr 12, 2021 at 07:01:19PM +0300, Jani Nikula wrote:
> On Mon, 12 Apr 2021, Daniel Vetter <daniel@ffwll.ch> wrote:
> > And that's some serious wtf. Yes we've done some compile-time type
> > casting automagic between i915_priv and dev in the past, and I think even
> > that was bad taste. But it was justified with that we have these
> > everywhere (especially in the mmio macros), and it would be a terrible
> > flag day.
> 
> FWIW, we had the dev_priv/dev macro trickery for a while to not have
> that flag day conversion, until everything used i915 or &i915->drm. But
> we got rid of it afterwards.

Yay, and yes that was the plan to avoid the flag day. And not as a great
coding pattern that everyone should imitate ...
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

* Re: [PATCH 15/19] drm/i915: WA for zero memory channel
  2021-04-12  9:05 ` [PATCH 15/19] drm/i915: WA for zero memory channel Matthew Auld
@ 2021-04-12 16:57   ` Souza, Jose
  0 siblings, 0 replies; 65+ messages in thread
From: Souza, Jose @ 2021-04-12 16:57 UTC (permalink / raw)
  To: intel-gfx, Auld, Matthew; +Cc: dri-devel

On Mon, 2021-04-12 at 10:05 +0100, Matthew Auld wrote:
> From: José Roberto de Souza <jose.souza@intel.com>
> 
> Commit c457d9cf256e ("drm/i915: Make sure we have enough memory
> bandwidth on ICL") assumes that we always have a non-zero
> dram_info->channels and uses it as a divisor. We need num memory
> channels to be at least 1 for sane bw limits checking, even when PCode
> returns 0, so lets force it to 1 in this case.

Missing my Signed-off-by.

> 
> Cc: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
> ---
>  drivers/gpu/drm/i915/display/intel_bw.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/gpu/drm/i915/display/intel_bw.c b/drivers/gpu/drm/i915/display/intel_bw.c
> index 584ab5ce4106..c5f70f3e930e 100644
> --- a/drivers/gpu/drm/i915/display/intel_bw.c
> +++ b/drivers/gpu/drm/i915/display/intel_bw.c
> @@ -175,6 +175,7 @@ static int icl_get_bw_info(struct drm_i915_private *dev_priv, const struct intel
>  			    "Failed to get memory subsystem information, ignoring bandwidth limits");
>  		return ret;
>  	}
> +	num_channels = max_t(u8, 1, num_channels);
>  
>  	deinterleave = DIV_ROUND_UP(num_channels, is_y_tile ? 4 : 2);
>  	dclk_max = icl_sagv_max_dclk(&qi);
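
For readers outside the kernel, a minimal userspace sketch of the clamp
follows. It is not the icl_get_bw_info() logic: max_u8() stands in for the
kernel's max_t(u8, ...), and bw_per_channel() is a hypothetical helper that
only shows why a reported channel count of 0 has to be forced to 1 before it
is used as a divisor.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's max_t(u8, a, b). */
static inline uint8_t max_u8(uint8_t a, uint8_t b)
{
	return a > b ? a : b;
}

/*
 * Hypothetical helper: split a total bandwidth figure evenly across the
 * reported channels. A report of 0 channels is treated as 1 so the
 * division below is always defined.
 */
static inline unsigned int bw_per_channel(unsigned int total_bw,
					  uint8_t reported_channels)
{
	uint8_t num_channels = max_u8(1, reported_channels);

	assert(num_channels != 0);	/* guaranteed by the clamp above */
	return total_bw / num_channels;
}
```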


* Re: [Intel-gfx] [PATCH 18/19] drm/i915/gtt: map the PD up front
  2021-04-12 16:08     ` Matthew Auld
@ 2021-04-12 17:00       ` Daniel Vetter
  2021-04-13  9:28         ` Matthew Auld
  0 siblings, 1 reply; 65+ messages in thread
From: Daniel Vetter @ 2021-04-12 17:00 UTC (permalink / raw)
  To: Matthew Auld
  Cc: Intel Graphics Development, Matthew Auld, ML dri-devel, Chris Wilson

On Mon, Apr 12, 2021 at 6:08 PM Matthew Auld
<matthew.william.auld@gmail.com> wrote:
>
> On Mon, 12 Apr 2021 at 16:17, Daniel Vetter <daniel@ffwll.ch> wrote:
> >
> > On Mon, Apr 12, 2021 at 10:05:25AM +0100, Matthew Auld wrote:
> > > We need to generalize our accessor for the page directories and tables
> > > beyond the simple kmap_atomic to support local memory, and this setup
> > > must be done on acquisition of the backing storage prior to entering
> > > fence execution contexts. Here we replace the kmap with the object
> > > mapping code, which for a simple single-page shmemfs object will return
> > > a plain kmap that is then kept for the lifetime of the page directory.
> > >
> > > v2: (Thomas) Rebase on dma_resv and obj->mm.lock removal.
> > >
> > > Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> >
> > So I wanted to understand what px stands for as an abbreviation, and dug
> > all the way down to this:
> >
> > commit 567047be2a7ede082d29f45524c287b87bd75e53
> > Author: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> > Date:   Thu Jun 25 18:35:12 2015 +0300
> >
> >     drm/i915/gtt: Use macros to access dma mapped pages
> >
> > I still have no idea what it means, I guess px = page. But I also
> > committed this, so I guess can blame myself :-)
> >
> > But while digging I've stumbled over this here
> >
> > commit 6eebfe8a10a62139d681e2f1af1386252742278b
> > Author: Chris Wilson <chris@chris-wilson.co.uk>
> > Date:   Fri Jul 12 08:58:18 2019 +0100
> >
> >     drm/i915/gtt: Use shallow dma pages for scratch
> >
> >
> > And that's some serious wtf. Yes we've done some compile-time type
> > casting automagic between i915_priv and dev in the past, and I think even
> > that was bad taste. But it was justified with that we have these
> > everywhere (especially in the mmio macros), and it would be a terrible
> > flag day.
> >
> > But I'm not seeing any need for auto-casting for these pages here, and I'm
> > not aware that we're doing this anywhere else in kernel code. There is
> > some macro-trickery in lockdep annotations, but that relies on the lockdep
> > map having the same struct member name in all lock types, and is not
> > exposed to drivers at all.
> >
> > Am I missing something, or why do we have this compile-time type casting
> > stuff going on in i915 page accessors?
>
> I think 'x' in the px family of macros/functions is meant in the
> variable/polymorphic sense, so it can potentially be a pt, pd, etc
> underneath. If you look at px_base() for example all it does is fish
> out the base GEM object from the structure, using the
> known-at-compile-time-type, which then lets us get at the dma address,
> vaddr etc.

Yeah, but that's not how things landed. px predates the magic
polymorphism. I think the px just stands for page, or at least
originally only stood for page. I'm not sure honestly. It seems to be
just used for page directory type of things, but I haven't found that
written down anywhere.

> It does seem pretty magical, but seems ok to me, if it means less typing?

That's the worst justification. Code is generally write once, read
many times. Optimizing for writing at the cost of magic indirection is
generally not the right tradeoff in the kernel, where any indirection
could hide a major gotcha. In huge userspace applications fancy
abstraction and polymorphism is often the right thing to do, but there
you also have a real compiler with a real typesystem (generally at
least) helping you out. Or it's yolo duct-taping with lots of tests,
where the speed at which you can hack up something matters more than
being able to read it quickly.

We're typing C here. It is generally rather verbose, with type casting
all done explicitly.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [Intel-gfx] [PATCH 14/19] drm/i915/oprom: Basic sanitization
  2021-04-12  9:05 ` [PATCH 14/19] drm/i915/oprom: Basic sanitization Matthew Auld
@ 2021-04-12 22:36   ` kernel test robot
  2021-04-12 22:36   ` [PATCH] drm/i915/oprom: fix memdup.cocci warnings kernel test robot
  2021-05-17 11:57   ` [Intel-gfx] [PATCH 14/19] drm/i915/oprom: Basic sanitization Jani Nikula
  2 siblings, 0 replies; 65+ messages in thread
From: kernel test robot @ 2021-04-12 22:36 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx
  Cc: Jani Nikula, Mohammed Khajapasha, kbuild-all, dri-devel

[-- Attachment #1: Type: text/plain, Size: 1122 bytes --]

Hi Matthew,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on drm-intel/for-linux-next]
[also build test WARNING on drm-tip/drm-tip]
[cannot apply to drm-exynos/exynos-drm-next tegra-drm/drm/tegra/for-next drm/drm-next v5.12-rc7]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Matthew-Auld/More-DG1-enabling/20210412-171139
base:   git://anongit.freedesktop.org/drm-intel for-linux-next
config: x86_64-randconfig-c022-20210412 (attached as .config)
compiler: gcc-9 (Debian 9.3.0-22) 9.3.0

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>


cocci warnings: (new ones prefixed by >>)
>> drivers/gpu/drm/i915/display/intel_bios.c:2274:7-14: WARNING opportunity for kmemdup

Please review and possibly fold the followup patch.

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 31163 bytes --]


* [PATCH] drm/i915/oprom: fix memdup.cocci warnings
  2021-04-12  9:05 ` [PATCH 14/19] drm/i915/oprom: Basic sanitization Matthew Auld
  2021-04-12 22:36   ` [Intel-gfx] " kernel test robot
@ 2021-04-12 22:36   ` kernel test robot
  2021-05-17 11:57   ` [Intel-gfx] [PATCH 14/19] drm/i915/oprom: Basic sanitization Jani Nikula
  2 siblings, 0 replies; 65+ messages in thread
From: kernel test robot @ 2021-04-12 22:36 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx
  Cc: Jani Nikula, Mohammed Khajapasha, kbuild-all, dri-devel

From: kernel test robot <lkp@intel.com>

drivers/gpu/drm/i915/display/intel_bios.c:2274:7-14: WARNING opportunity for kmemdup

 Use kmemdup rather than duplicating its implementation

Generated by: scripts/coccinelle/api/memdup.cocci

CC: Anshuman Gupta <anshuman.gupta@intel.com>
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: kernel test robot <lkp@intel.com>
---

url:    https://github.com/0day-ci/linux/commits/Matthew-Auld/More-DG1-enabling/20210412-171139
base:   git://anongit.freedesktop.org/drm-intel for-linux-next

 intel_bios.c |    3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

--- a/drivers/gpu/drm/i915/display/intel_bios.c
+++ b/drivers/gpu/drm/i915/display/intel_bios.c
@@ -2271,14 +2271,13 @@ static struct vbt_header *spi_oprom_get_
 	parse_ptr = (u8 *)oprom_opreg + found;
 	vbt_size = ((struct vbt_header *)parse_ptr)->vbt_size;
 
-	vbt = kzalloc(vbt_size, GFP_KERNEL);
+	vbt = kmemdup(parse_ptr, vbt_size, GFP_KERNEL);
 	if (!vbt) {
 		DRM_ERROR("Unable to allocate %u bytes for VBT storage\n",
 			  vbt_size);
 		goto err_not_found;
 	}
 
-	memcpy(vbt, parse_ptr, vbt_size);
 	if (!intel_bios_is_valid_vbt(vbt, vbt_size))
 		goto err_free_vbt;
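
As a rough illustration of what the conversion buys, here is a userspace
analogue of kmemdup(). The name dup_bytes() is hypothetical and malloc()
stands in for the kernel allocator. The point is that an allocate-then-copy
pair collapses into one call, and that zeroing the buffer first (as kzalloc
did) is redundant when every byte is overwritten by the copy.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Userspace analogue of kmemdup(): allocate len bytes and copy src into
 * them. Returns NULL if the allocation fails; the caller frees the result.
 */
static inline void *dup_bytes(const void *src, size_t len)
{
	void *dst;

	assert(src != NULL);
	dst = malloc(len);
	if (dst)
		memcpy(dst, src, len);
	return dst;
}
```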
 

* Re: [Intel-gfx] [PATCH 18/19] drm/i915/gtt: map the PD up front
  2021-04-12 17:00       ` Daniel Vetter
@ 2021-04-13  9:28         ` Matthew Auld
  2021-04-13 10:18           ` Daniel Vetter
  0 siblings, 1 reply; 65+ messages in thread
From: Matthew Auld @ 2021-04-13  9:28 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Intel Graphics Development, Matthew Auld, ML dri-devel, Chris Wilson

On Mon, 12 Apr 2021 at 18:01, Daniel Vetter <daniel@ffwll.ch> wrote:
>
> On Mon, Apr 12, 2021 at 6:08 PM Matthew Auld
> <matthew.william.auld@gmail.com> wrote:
> >
> > On Mon, 12 Apr 2021 at 16:17, Daniel Vetter <daniel@ffwll.ch> wrote:
> > >
> > > On Mon, Apr 12, 2021 at 10:05:25AM +0100, Matthew Auld wrote:
> > > > We need to generalize our accessor for the page directories and tables
> > > > beyond the simple kmap_atomic to support local memory, and this setup
> > > > must be done on acquisition of the backing storage prior to entering
> > > > fence execution contexts. Here we replace the kmap with the object
> > > > mapping code, which for a simple single-page shmemfs object will return
> > > > a plain kmap that is then kept for the lifetime of the page directory.
> > > >
> > > > v2: (Thomas) Rebase on dma_resv and obj->mm.lock removal.
> > > >
> > > > Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> > > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > >
> > > So I wanted to understand what px stands for as an abbreviation, and dug
> > > all the way down to this:
> > >
> > > commit 567047be2a7ede082d29f45524c287b87bd75e53
> > > Author: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> > > Date:   Thu Jun 25 18:35:12 2015 +0300
> > >
> > >     drm/i915/gtt: Use macros to access dma mapped pages
> > >
> > > I still have no idea what it means, I guess px = page. But I also
> > > committed this, so I guess can blame myself :-)
> > >
> > > But while digging I've stumbled over this here
> > >
> > > commit 6eebfe8a10a62139d681e2f1af1386252742278b
> > > Author: Chris Wilson <chris@chris-wilson.co.uk>
> > > Date:   Fri Jul 12 08:58:18 2019 +0100
> > >
> > >     drm/i915/gtt: Use shallow dma pages for scratch
> > >
> > >
> > > And that's some serious wtf. Yes we've done some compile-time type
> > > casting automagic between i915_priv and dev in the past, and I think even
> > > that was bad taste. But it was justified with that we have these
> > > everywhere (especially in the mmio macros), and it would be a terrible
> > > flag day.
> > >
> > > But I'm not seeing any need for auto-casting for these pages here, and I'm
> > > not aware that we're doing this anywhere else in kernel code. There is
> > > some macro-trickery in lockdep annotations, but that relies on the lockdep
> > > map having the same struct member name in all lock types, and is not
> > > exposed to drivers at all.
> > >
> > > Am I missing something, or why do we have this compile-time type casting
> > > stuff going on in i915 page accessors?
> >
> > I think 'x' in the px family of macros/functions is meant in the
> > variable/polymorphic sense, so it can potentially be a pt, pd, etc
> > underneath. If you look at px_base() for example all it does is fish
> > out the base GEM object from the structure, using the
> > known-at-compile-time-type, which then lets us get at the dma address,
> > vaddr etc.
>
> Yeah, but that's not how things landed. px predates the magic
> polymorphism. I think the px just stands for page, or at least
> originally only stood for page. I'm not sure honestly. It seems to be
> just used for page directory type of things, but I haven't found that
> written down anywhere.
>
> > It does seem pretty magical, but seems ok to me, if it means less typing?
>
> That's the worst justification. Code is generally write once, read
> many times. Optimizing for writing at the cost of magic indirection is
> generally not the right tradeoff in the kernel, where any indirection
> could hide a major gotcha. In huge userspace applications fancy
> abstraction and polymorphism is often the right thing to do, but there
> you also have a real compiler with a real typesystem (generally at
> least) helping you out. Or it's yolo duct-taping with lots of tests,
> where the speed at which you can hack up something matters more than
> being able to read it quickly.
>
> We're typing C here. It is generally rather verbose, with type casting
> all done explicitly.

Ok. So should we change this around for this patch? The px_ stuff is
already quite prevalent it seems, and the px_vaddr() is just one part
of it? Maybe just add pt_vaddr(), pd_vaddr() etc instead?
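
Roughly what I have in mind, as a standalone sketch only: the struct layouts
below are hypothetical stand-ins rather than the real i915 definitions, and
the accessors are plain inline functions, so the argument type is spelled out
at every call site.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; the real i915 structures differ. */
struct gem_object { void *vaddr; };
struct page_table { struct gem_object *base; };
struct page_directory { struct page_table pt; };

/* One accessor per type, with no macro dispatch involved. */
static inline void *pt_vaddr(const struct page_table *pt)
{
	assert(pt && pt->base);
	return pt->base->vaddr;
}

/* A page directory embeds a page table, so reuse its accessor. */
static inline void *pd_vaddr(const struct page_directory *pd)
{
	return pt_vaddr(&pd->pt);
}
```

The cost is one extra function per structure; the gain is that passing the
wrong type produces an ordinary compiler diagnostic at the call site.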

> -Daniel
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch

* Re: [Intel-gfx] [PATCH 18/19] drm/i915/gtt: map the PD up front
  2021-04-13  9:28         ` Matthew Auld
@ 2021-04-13 10:18           ` Daniel Vetter
  0 siblings, 0 replies; 65+ messages in thread
From: Daniel Vetter @ 2021-04-13 10:18 UTC (permalink / raw)
  To: Matthew Auld
  Cc: Intel Graphics Development, Matthew Auld, ML dri-devel, Chris Wilson

On Tue, Apr 13, 2021 at 11:29 AM Matthew Auld
<matthew.william.auld@gmail.com> wrote:
>
> On Mon, 12 Apr 2021 at 18:01, Daniel Vetter <daniel@ffwll.ch> wrote:
> >
> > On Mon, Apr 12, 2021 at 6:08 PM Matthew Auld
> > <matthew.william.auld@gmail.com> wrote:
> > >
> > > On Mon, 12 Apr 2021 at 16:17, Daniel Vetter <daniel@ffwll.ch> wrote:
> > > >
> > > > On Mon, Apr 12, 2021 at 10:05:25AM +0100, Matthew Auld wrote:
> > > > > We need to generalize our accessor for the page directories and tables
> > > > > beyond the simple kmap_atomic to support local memory, and this setup
> > > > > must be done on acquisition of the backing storage prior to entering
> > > > > fence execution contexts. Here we replace the kmap with the object
> > > > > mapping code, which for a simple single-page shmemfs object will return
> > > > > a plain kmap that is then kept for the lifetime of the page directory.
> > > > >
> > > > > v2: (Thomas) Rebase on dma_resv and obj->mm.lock removal.
> > > > >
> > > > > Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> > > > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > > >
> > > > So I wanted to understand what px stands for as an abbreviation, and dug
> > > > all the way down to this:
> > > >
> > > > commit 567047be2a7ede082d29f45524c287b87bd75e53
> > > > Author: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> > > > Date:   Thu Jun 25 18:35:12 2015 +0300
> > > >
> > > >     drm/i915/gtt: Use macros to access dma mapped pages
> > > >
> > > > I still have no idea what it means, I guess px = page. But I also
> > > > committed this, so I guess can blame myself :-)
> > > >
> > > > But while digging I've stumbled over this here
> > > >
> > > > commit 6eebfe8a10a62139d681e2f1af1386252742278b
> > > > Author: Chris Wilson <chris@chris-wilson.co.uk>
> > > > Date:   Fri Jul 12 08:58:18 2019 +0100
> > > >
> > > >     drm/i915/gtt: Use shallow dma pages for scratch
> > > >
> > > >
> > > > And that's some serious wtf. Yes we've done some compile-time type
> > > > casting automagic between i915_priv and dev in the past, and I think even
> > > > that was bad taste. But it was justified with that we have these
> > > > everywhere (especially in the mmio macros), and it would be a terrible
> > > > flag day.
> > > >
> > > > But I'm not seeing any need for auto-casting for these pages here, and I'm
> > > > not aware that we're doing this anywhere else in kernel code. There is
> > > > some macro-trickery in lockdep annotations, but that relies on the lockdep
> > > > map having the same struct member name in all lock types, and is not
> > > > exposed to drivers at all.
> > > >
> > > > Am I missing something, or why do we have this compile-time type casting
> > > > stuff going on in i915 page accessors?
> > >
> > > I think 'x' in the px family of macros/functions is meant in the
> > > variable/polymorphic sense, so it can potentially be a pt, pd, etc
> > > underneath. If you look at px_base() for example all it does is fish
> > > out the base GEM object from the structure, using the
> > > known-at-compile-time-type, which then lets us get at the dma address,
> > > vaddr etc.
> >
> > Yeah, but that's not how things landed. px predates the magic
> > polymorphism. I think the px just stands for page, or at least
> > originally only stood for page. I'm not sure honestly. It seems to be
> > just used for page directory type of things, but I haven't found that
> > written down anywhere.
> >
> > > It does seem pretty magical, but seems ok to me, if it means less typing?
> >
> > That's the worst justification. Code is generally write once, read
> > many times. Optimizing for writing at the cost of magic indirection is
> > generally not the right tradeoff in the kernel, where any indirection
> > could hide a major gotcha. In huge userspace applications fancy
> > abstraction and polymorphism is often the right thing to do, but there
> > you also have a real compiler with a real typesystem (generally at
> > least) helping you out. Or it's yolo duct-taping with lots of tests,
> > where the speed at which you can hack up something matters more than
> > being able to read it quickly.
> >
> > We're typing C here. It is generally rather verbose, with type casting
> > all done explicitly.
>
> Ok. So should we change this around for this patch? The px_ stuff is
> already quite prevalent it seems, and the px_vaddr() is just one part
> of it? Maybe just add pt_vaddr(), pd_vaddr() etc instead?

Nah, that was just an orthogonal observation. The confusion with magic
type-aware macros is preexisting and widespread, there's no point
holding up dg1 code with that. But it is maybe something we should put
on our cleanup list. Or at least have a better explanation for why
exactly it is needed. Also note I'm not worried about the px stuff
standing for pt/pd/whatever, it's the magic type casting property of
these macros added with the 2nd patch I've mentioned above that looks
rather questionable to me. Maybe as a transition thing like we've done
with i915_priv pointers, but not something that we should build on top
of for the long term.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

* Re: [Intel-gfx] [PATCH 03/19] drm/i915: Create stolen memory region from local memory
  2021-04-12  9:05 ` [PATCH 03/19] drm/i915: Create stolen memory region from local memory Matthew Auld
@ 2021-04-14 15:01   ` Tvrtko Ursulin
  2021-04-16 15:04     ` Matthew Auld
  0 siblings, 1 reply; 65+ messages in thread
From: Tvrtko Ursulin @ 2021-04-14 15:01 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx; +Cc: dri-devel


On 12/04/2021 10:05, Matthew Auld wrote:
> From: CQ Tang <cq.tang@intel.com>
> 
> Add "REGION_STOLEN" device info to dg1, create stolen memory
> region from upper portion of local device memory, starting
> from DSMBASE.
> 
> v2:
>      - s/drm_info/drm_dbg; userspace likely doesn't care about stolen.
>      - mem->type is only setup after the region probe, so setting the name
>        as stolen-local or stolen-system based on this value won't work. Split
>        system vs local stolen setup to fix this.
>      - kill all the region->devmem/is_devmem stuff. We already differentiate
>        the different types of stolen so such things shouldn't be needed
>        anymore.
> 
> Signed-off-by: CQ Tang <cq.tang@intel.com>
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> ---
>   drivers/gpu/drm/i915/gem/i915_gem_stolen.c | 99 +++++++++++++++++++---
>   drivers/gpu/drm/i915/gem/i915_gem_stolen.h |  3 +
>   drivers/gpu/drm/i915/i915_pci.c            |  2 +-
>   drivers/gpu/drm/i915/i915_reg.h            |  1 +
>   drivers/gpu/drm/i915/intel_memory_region.c |  6 ++
>   drivers/gpu/drm/i915/intel_memory_region.h |  5 +-
>   6 files changed, 102 insertions(+), 14 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_stolen.c b/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
> index b0597de206de..56dd58bef5ee 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
> @@ -10,6 +10,7 @@
>   #include <drm/drm_mm.h>
>   #include <drm/i915_drm.h>
>   
> +#include "gem/i915_gem_lmem.h"
>   #include "gem/i915_gem_region.h"
>   #include "i915_drv.h"
>   #include "i915_gem_stolen.h"
> @@ -121,6 +122,14 @@ static int i915_adjust_stolen(struct drm_i915_private *i915,
>   		}
>   	}
>   
> +	/*
> +	 * With device local memory, we don't need to check the address range,
> +	 * this is device memory physical address, could overlap with system
> +	 * memory.
> +	 */
> +	if (HAS_LMEM(i915))
> +		return 0;
> +
>   	/*
>   	 * Verify that nothing else uses this physical address. Stolen
>   	 * memory should be reserved by the BIOS and hidden from the
> @@ -374,8 +383,9 @@ static void icl_get_stolen_reserved(struct drm_i915_private *i915,
>   	}
>   }
>   
> -static int i915_gem_init_stolen(struct drm_i915_private *i915)
> +static int i915_gem_init_stolen(struct intel_memory_region *mem)
>   {
> +	struct drm_i915_private *i915 = mem->i915;
>   	struct intel_uncore *uncore = &i915->uncore;
>   	resource_size_t reserved_base, stolen_top;
>   	resource_size_t reserved_total, reserved_size;
> @@ -396,10 +406,10 @@ static int i915_gem_init_stolen(struct drm_i915_private *i915)
>   		return 0;
>   	}
>   
> -	if (resource_size(&intel_graphics_stolen_res) == 0)
> +	if (resource_size(&mem->region) == 0)
>   		return 0;
>   
> -	i915->dsm = intel_graphics_stolen_res;
> +	i915->dsm = mem->region;
>   
>   	if (i915_adjust_stolen(i915, &i915->dsm))
>   		return 0;
> @@ -684,23 +694,36 @@ static int _i915_gem_object_stolen_init(struct intel_memory_region *mem,
>   	return ret;
>   }
>   
> +struct intel_memory_region *i915_stolen_region(struct drm_i915_private *i915)
> +{
> +	if (HAS_LMEM(i915))
> +		return i915->mm.regions[INTEL_REGION_STOLEN_LMEM];
> +
> +	return i915->mm.regions[INTEL_REGION_STOLEN_SMEM];
> +}

This may be only a bikeshedding comment, especially since I think this
path is very rarely used at runtime, so it is most likely pointless to
fiddle with it, but it strikes me as not fully elegant to do:

i915_gem_object_create_stolen
  -> i915_gem_object_create_region
     -> i915_stolen_region

And end up in here, when alternative could be at driver init:

i915->stolen_region_id = HAS_LMEM() ? ... : ...;

i915_gem_object_create_stolen
  -> 
i915_gem_object_create_region(i915->mm.regions[i915->stolen_region_id]);

Or pointer to region. Would avoid having to export i915_stolen_region as 
well.
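
A minimal standalone sketch of the alternative described above, resolving the stolen region once at init and reading a cached pointer afterwards. All type and function names here (device_model, model_init_stolen_region and so on) are hypothetical stand-ins, not the real i915 structures:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the i915 region ids; not the driver's enum. */
enum region_id_model {
	REGION_ID_SMEM,
	REGION_ID_LMEM,
	REGION_ID_STOLEN_SMEM,
	REGION_ID_STOLEN_LMEM,
	REGION_ID_COUNT,
};

struct mem_region_model {
	const char *name;
};

struct device_model {
	bool has_lmem;
	struct mem_region_model *regions[REGION_ID_COUNT];
	struct mem_region_model *stolen_region; /* resolved once at init */
};

/* Pick the stolen region once, after the regions have been probed. */
static void model_init_stolen_region(struct device_model *dev)
{
	enum region_id_model id;

	assert(dev != NULL);
	id = dev->has_lmem ? REGION_ID_STOLEN_LMEM : REGION_ID_STOLEN_SMEM;
	dev->stolen_region = dev->regions[id];
}

/* Allocation paths read the cached pointer; no exported lookup helper. */
static struct mem_region_model *model_stolen_region(const struct device_model *dev)
{
	return dev->stolen_region;
}
```

Whether a cached pointer or a cached region id is preferable would depend on how the regions are torn down, which this sketch does not model.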

Or is i915->dsm already the right thing? Because..

> +
>   struct drm_i915_gem_object *
>   i915_gem_object_create_stolen(struct drm_i915_private *i915,
>   			      resource_size_t size)
>   {
> -	return i915_gem_object_create_region(i915->mm.regions[INTEL_REGION_STOLEN_SMEM],
> +	return i915_gem_object_create_region(i915_stolen_region(i915),
>   					     size, I915_BO_ALLOC_CONTIGUOUS);
>   }
>   
>   static int init_stolen(struct intel_memory_region *mem)
>   {
> -	intel_memory_region_set_name(mem, "stolen");
> +	if (HAS_LMEM(mem->i915)) {
> +		if (!io_mapping_init_wc(&mem->iomap,
> +					mem->io_start,
> +					resource_size(&mem->region)))
> +			return -EIO;
> +	}
>   
>   	/*
>   	 * Initialise stolen early so that we may reserve preallocated
>   	 * objects for the BIOS to KMS transition.
>   	 */
> -	return i915_gem_init_stolen(mem->i915);
> +	return i915_gem_init_stolen(mem);

... I find the mem region init paths a bit convoluted, stolen 
especially, and struggle to figure it out every time.

For instance we have i915_region_stolen_ops shared between system and 
local stolen. But then shared vfuncs branch depending on system vs local?

i915_gem_init_stolen is shared - but which parts of it are relevant for 
local stolen?
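
One way the branching could be avoided, sketched as a standalone model: give each flavour of stolen its own ops table and keep only the genuinely common work in a shared helper. This is an illustration only, not something proposed in the patch; every name below is hypothetical, and which parts of i915_gem_init_stolen would actually be common is exactly the open question above.

```c
#include <assert.h>
#include <stddef.h>

struct stolen_region_model;

struct stolen_ops_model {
	int (*init)(struct stolen_region_model *mem);
};

struct stolen_region_model {
	const struct stolen_ops_model *ops;
	int iomap_ready;  /* set only by the local-memory path */
	int stolen_ready; /* set by the shared part */
};

/* Shared part: whatever both flavours of stolen need. */
static int common_stolen_init_model(struct stolen_region_model *mem)
{
	assert(mem != NULL);
	mem->stolen_ready = 1;
	return 0;
}

static int init_stolen_smem_model(struct stolen_region_model *mem)
{
	return common_stolen_init_model(mem);
}

static int init_stolen_lmem_model(struct stolen_region_model *mem)
{
	mem->iomap_ready = 1; /* stands in for the io_mapping_init_wc() step */
	return common_stolen_init_model(mem);
}

static const struct stolen_ops_model stolen_smem_ops_model = {
	.init = init_stolen_smem_model,
};

static const struct stolen_ops_model stolen_lmem_ops_model = {
	.init = init_stolen_lmem_model,
};
```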

>   }
>   
>   static void release_stolen(struct intel_memory_region *mem)
> @@ -714,13 +737,65 @@ static const struct intel_memory_region_ops i915_region_stolen_ops = {
>   	.init_object = _i915_gem_object_stolen_init,
>   };
>   
> +static struct intel_memory_region *
> +setup_lmem_stolen(struct drm_i915_private *i915)
> +{
> +	struct intel_uncore *uncore = &i915->uncore;
> +	struct pci_dev *pdev = i915->drm.pdev;
> +	struct intel_memory_region *mem;
> +	resource_size_t io_start;
> +	resource_size_t lmem_size;
> +	u64 lmem_base;
> +
> +	if (!IS_DGFX(i915))
> +		return ERR_PTR(-ENODEV);
> +
> +	lmem_base = intel_uncore_read64(uncore, GEN12_DSMBASE);
> +	lmem_size = pci_resource_len(pdev, 2) - lmem_base;
> +	io_start = pci_resource_start(pdev, 2) + lmem_base;
> +
> +	mem = intel_memory_region_create(i915, lmem_base, lmem_size,
> +					 I915_GTT_PAGE_SIZE_4K, io_start,
> +					 &i915_region_stolen_ops);
> +	if (IS_ERR(mem))
> +		return mem;
> +
> +	drm_dbg(&i915->drm, "Stolen Local memory: %pR\n", &mem->region);
> +	drm_dbg(&i915->drm, "Stolen Local memory IO start: %pa\n",
> +		&mem->io_start);

Could these messages be consolidated with the system stolen ones 
(i915_gem_setup_stolen?) and, based on the memory_region data, printed 
from the common i915_gem_stolen_setup?
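
For reference, the range those debug messages describe follows from the arithmetic in the quoted setup_lmem_stolen(). A standalone sketch of that arithmetic, with a hypothetical struct and function name; the BAR and DSMBASE values used with it would be made up, not read from hardware:

```c
#include <assert.h>
#include <stdint.h>

struct stolen_range_model {
	uint64_t base;     /* offset of stolen within local memory */
	uint64_t size;     /* everything from base to the end of the BAR */
	uint64_t io_start; /* CPU-visible address of the first stolen byte */
};

/*
 * Mirrors the quoted hunk: stolen is the upper portion of local memory,
 * starting at the DSM base and running to the end of the BAR.
 */
static struct stolen_range_model
lmem_stolen_range_model(uint64_t dsm_base, uint64_t bar_start, uint64_t bar_len)
{
	struct stolen_range_model r;

	assert(dsm_base <= bar_len);
	r.base = dsm_base;
	r.size = bar_len - dsm_base;
	r.io_start = bar_start + dsm_base;
	return r;
}
```

With an illustrative 4 GiB BAR and a DSM base 32 MiB below its end, the stolen size comes out as 32 MiB.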

> +
> +	intel_memory_region_set_name(mem, "stolen-local");
> +
> +	return mem;
> +}
> +
> +static struct intel_memory_region*

Space before asterisk.

> +setup_smem_stolen(struct drm_i915_private *i915)
> +{
> +	struct intel_memory_region *mem;
> +
> +	mem = intel_memory_region_create(i915,
> +					 intel_graphics_stolen_res.start,
> +					 resource_size(&intel_graphics_stolen_res),
> +					 PAGE_SIZE, 0,
> +					 &i915_region_stolen_ops);
> +	if (IS_ERR(mem))
> +		return mem;
> +
> +	intel_memory_region_set_name(mem, "stolen-system");

I assume this name, although changed from the current ("stolen"), is not 
exported anywhere to matter?

> +
> +	return mem;
> +}
> +
>   struct intel_memory_region *i915_gem_stolen_setup(struct drm_i915_private *i915)
>   {
> -	return intel_memory_region_create(i915,
> -					  intel_graphics_stolen_res.start,
> -					  resource_size(&intel_graphics_stolen_res),
> -					  PAGE_SIZE, 0,
> -					  &i915_region_stolen_ops);
> +	struct intel_memory_region *mem;
> +
> +	mem = setup_lmem_stolen(i915);
> +	if (mem == ERR_PTR(-ENODEV))
> +		mem = setup_smem_stolen(i915);
> +
> +	return mem;
>   }
>   
>   struct drm_i915_gem_object *
> @@ -728,7 +803,7 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_i915_private *i915,
>   					       resource_size_t stolen_offset,
>   					       resource_size_t size)
>   {
> -	struct intel_memory_region *mem = i915->mm.regions[INTEL_REGION_STOLEN_SMEM];
> +	struct intel_memory_region *mem = i915_stolen_region(i915);
>   	struct drm_i915_gem_object *obj;
>   	struct drm_mm_node *stolen;
>   	int ret;
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_stolen.h b/drivers/gpu/drm/i915/gem/i915_gem_stolen.h
> index b03489706796..2d1ce7fec61c 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_stolen.h
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_stolen.h
> @@ -22,6 +22,9 @@ int i915_gem_stolen_insert_node_in_range(struct drm_i915_private *dev_priv,
>   void i915_gem_stolen_remove_node(struct drm_i915_private *dev_priv,
>   				 struct drm_mm_node *node);
>   struct intel_memory_region *i915_gem_stolen_setup(struct drm_i915_private *i915);
> +
> +struct intel_memory_region *i915_stolen_region(struct drm_i915_private *i915);
> +
>   struct drm_i915_gem_object *
>   i915_gem_object_create_stolen(struct drm_i915_private *dev_priv,
>   			      resource_size_t size);
> diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
> index 480553746794..53f5d1e6daef 100644
> --- a/drivers/gpu/drm/i915/i915_pci.c
> +++ b/drivers/gpu/drm/i915/i915_pci.c
> @@ -906,7 +906,7 @@ static const struct intel_device_info rkl_info = {
>   
>   #define GEN12_DGFX_FEATURES \
>   	GEN12_FEATURES, \
> -	.memory_regions = REGION_SMEM | REGION_LMEM, \
> +	.memory_regions = REGION_SMEM | REGION_LMEM | REGION_STOLEN_LMEM, \
>   	.has_master_unit_irq = 1, \
>   	.has_llc = 0, \
>   	.has_snoop = 1, \
> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> index e087bcd21911..4108f2a7ebfa 100644
> --- a/drivers/gpu/drm/i915/i915_reg.h
> +++ b/drivers/gpu/drm/i915/i915_reg.h
> @@ -12191,6 +12191,7 @@ enum skl_power_gate {
>   #define GEN12_GLOBAL_MOCS(i)	_MMIO(0x4000 + (i) * 4) /* Global MOCS regs */
>   
>   #define GEN12_GSMBASE			_MMIO(0x108100)
> +#define GEN12_DSMBASE			_MMIO(0x1080C0)
>   
>   /* gamt regs */
>   #define GEN8_L3_LRA_1_GPGPU _MMIO(0x4dd4)
> diff --git a/drivers/gpu/drm/i915/intel_memory_region.c b/drivers/gpu/drm/i915/intel_memory_region.c
> index bf837b6bb185..ac90b76a3fa0 100644
> --- a/drivers/gpu/drm/i915/intel_memory_region.c
> +++ b/drivers/gpu/drm/i915/intel_memory_region.c
> @@ -22,6 +22,10 @@ static const struct {
>   		.class = INTEL_MEMORY_STOLEN_SYSTEM,
>   		.instance = 0,
>   	},
> +	[INTEL_REGION_STOLEN_LMEM] = {
> +		.class = INTEL_MEMORY_STOLEN_LOCAL,
> +		.instance = 0,
> +	},
>   };
>   
>   struct intel_memory_region *
> @@ -278,6 +282,8 @@ int intel_memory_regions_hw_probe(struct drm_i915_private *i915)
>   		case INTEL_MEMORY_SYSTEM:
>   			mem = i915_gem_shmem_setup(i915);
>   			break;
> +		case INTEL_MEMORY_STOLEN_LOCAL:
> +			fallthrough;
>   		case INTEL_MEMORY_STOLEN_SYSTEM:
>   			mem = i915_gem_stolen_setup(i915);
>   			break;
> diff --git a/drivers/gpu/drm/i915/intel_memory_region.h b/drivers/gpu/drm/i915/intel_memory_region.h
> index edd49067c8ca..4c8ec15af55f 100644
> --- a/drivers/gpu/drm/i915/intel_memory_region.h
> +++ b/drivers/gpu/drm/i915/intel_memory_region.h
> @@ -26,18 +26,21 @@ enum intel_memory_type {
>   	INTEL_MEMORY_SYSTEM = 0,
>   	INTEL_MEMORY_LOCAL,
>   	INTEL_MEMORY_STOLEN_SYSTEM,
> +	INTEL_MEMORY_STOLEN_LOCAL,
>   };
>   
>   enum intel_region_id {
>   	INTEL_REGION_SMEM = 0,
>   	INTEL_REGION_LMEM,
>   	INTEL_REGION_STOLEN_SMEM,
> +	INTEL_REGION_STOLEN_LMEM,
>   	INTEL_REGION_UNKNOWN, /* Should be last */
>   };
>   
>   #define REGION_SMEM     BIT(INTEL_REGION_SMEM)
>   #define REGION_LMEM     BIT(INTEL_REGION_LMEM)
>   #define REGION_STOLEN_SMEM   BIT(INTEL_REGION_STOLEN_SMEM)
> +#define REGION_STOLEN_LMEM   BIT(INTEL_REGION_STOLEN_LMEM)
>   
>   #define I915_ALLOC_MIN_PAGE_SIZE  BIT(0)
>   #define I915_ALLOC_CONTIGUOUS     BIT(1)
> @@ -82,7 +85,7 @@ struct intel_memory_region {
>   	u16 type;
>   	u16 instance;
>   	enum intel_region_id id;
> -	char name[8];
> +	char name[16];
>   
>   	struct list_head reserved;
>   
> 

Regards,

Tvrtko

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [Intel-gfx] [PATCH 04/19] drm/i915/stolen: treat stolen local as normal local memory
  2021-04-12  9:05 ` [PATCH 04/19] drm/i915/stolen: treat stolen local as normal " Matthew Auld
@ 2021-04-14 15:06   ` Tvrtko Ursulin
  0 siblings, 0 replies; 65+ messages in thread
From: Tvrtko Ursulin @ 2021-04-14 15:06 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx; +Cc: dri-devel


On 12/04/2021 10:05, Matthew Auld wrote:
> Underneath it's the same stuff, so things like the PTE_LM bits for the
> GTT should just keep working as-is.
> 
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> ---
>   drivers/gpu/drm/i915/gem/i915_gem_lmem.c | 5 ++++-
>   1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_lmem.c b/drivers/gpu/drm/i915/gem/i915_gem_lmem.c
> index ce1c83c13d05..017db8f71130 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_lmem.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_lmem.c
> @@ -19,7 +19,10 @@ const struct drm_i915_gem_object_ops i915_gem_lmem_obj_ops = {
>   
>   bool i915_gem_object_is_lmem(struct drm_i915_gem_object *obj)
>   {
> -	return obj->ops == &i915_gem_lmem_obj_ops;
> +	struct intel_memory_region *mr = obj->mm.region;
> +
> +	return mr && (mr->type == INTEL_MEMORY_LOCAL ||
> +		      mr->type == INTEL_MEMORY_STOLEN_LOCAL);
>   }
>   
>   struct drm_i915_gem_object *
> 

Passable I guess. Although there is also i915_gem_object_is_stolen, so it 
is not immediately clear how the semantics of i915_gem_object_is_lmem 
differ from that one. It is almost as if we need more "hierarchy" in 
region types, or flags of some sort, but I haven't looked at the callers 
to have a good idea of what would work best.
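
A rough sketch of what the "flags of some sort" idea could look like, as a standalone model. The flag names and struct are hypothetical; the driver as quoted uses a flat enum intel_memory_type instead:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical orthogonal properties of a region. */
#define MEM_F_LOCAL_MODEL  (1u << 0) /* backed by device local memory */
#define MEM_F_STOLEN_MODEL (1u << 1) /* carved out as stolen */

struct region_flags_model {
	unsigned int flags;
};

static bool region_is_lmem_model(const struct region_flags_model *mr)
{
	return mr != NULL && (mr->flags & MEM_F_LOCAL_MODEL) != 0;
}

static bool region_is_stolen_model(const struct region_flags_model *mr)
{
	return mr != NULL && (mr->flags & MEM_F_STOLEN_MODEL) != 0;
}
```

With this shape, "stolen local" answers true to both questions and neither helper needs to enumerate type combinations.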

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [Intel-gfx] [PATCH 05/19] drm/i915/stolen: enforce the min_page_size contract
  2021-04-12  9:05 ` [PATCH 05/19] drm/i915/stolen: enforce the min_page_size contract Matthew Auld
@ 2021-04-14 15:07   ` Tvrtko Ursulin
  0 siblings, 0 replies; 65+ messages in thread
From: Tvrtko Ursulin @ 2021-04-14 15:07 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx; +Cc: dri-devel


On 12/04/2021 10:05, Matthew Auld wrote:
> From: CQ Tang <cq.tang@intel.com>
> 
> Since stolen can now be device local-memory underneath, we should try to
> enforce any min_page_size restrictions when allocating pages.
> 
> Signed-off-by: CQ Tang <cq.tang@intel.com>
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> ---
>   drivers/gpu/drm/i915/gem/i915_gem_stolen.c | 7 ++++---
>   1 file changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_stolen.c b/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
> index 56dd58bef5ee..f713eabb7671 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
> @@ -677,7 +677,8 @@ static int _i915_gem_object_stolen_init(struct intel_memory_region *mem,
>   	if (!stolen)
>   		return -ENOMEM;
>   
> -	ret = i915_gem_stolen_insert_node(i915, stolen, size, 4096);
> +	ret = i915_gem_stolen_insert_node(i915, stolen, size,
> +					  mem->min_page_size);
>   	if (ret)
>   		goto err_free;
>   
> @@ -817,8 +818,8 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_i915_private *i915,
>   
>   	/* KISS and expect everything to be page-aligned */
>   	if (GEM_WARN_ON(size == 0) ||
> -	    GEM_WARN_ON(!IS_ALIGNED(size, I915_GTT_PAGE_SIZE)) ||
> -	    GEM_WARN_ON(!IS_ALIGNED(stolen_offset, I915_GTT_MIN_ALIGNMENT)))
> +	    GEM_WARN_ON(!IS_ALIGNED(size, mem->min_page_size)) ||
> +	    GEM_WARN_ON(!IS_ALIGNED(stolen_offset, mem->min_page_size)))
>   		return ERR_PTR(-EINVAL);
>   
>   	stolen = kzalloc(sizeof(*stolen), GFP_KERNEL);
> 

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
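
The alignment contract in the quoted hunk can be modelled standalone as below. The helper names are hypothetical; is_aligned_pow2_model follows the same idea as the kernel's IS_ALIGNED() for power-of-two alignments:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if x is a multiple of align; align must be a non-zero power of two. */
static bool is_aligned_pow2_model(uint64_t x, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (x & (align - 1)) == 0;
}

/*
 * A preallocated stolen range is accepted only if it is non-empty and
 * both its offset and its size honour the region's minimum page size.
 */
static bool prealloc_range_ok_model(uint64_t offset, uint64_t size,
				    uint64_t min_page_size)
{
	return size != 0 &&
	       is_aligned_pow2_model(size, min_page_size) &&
	       is_aligned_pow2_model(offset, min_page_size);
}
```

The practical effect is that a range which is only 4K-aligned passes on a region with a 4K minimum page size but is rejected on a region whose minimum is 64K.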

Regards,

Tvrtko

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [Intel-gfx] [PATCH 06/19] drm/i915/stolen: pass the allocation flags
  2021-04-12  9:05 ` [PATCH 06/19] drm/i915/stolen: pass the allocation flags Matthew Auld
@ 2021-04-14 15:09   ` Tvrtko Ursulin
  2021-04-16 13:53     ` Matthew Auld
  0 siblings, 1 reply; 65+ messages in thread
From: Tvrtko Ursulin @ 2021-04-14 15:09 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx; +Cc: dri-devel


On 12/04/2021 10:05, Matthew Auld wrote:
> From: CQ Tang <cq.tang@intel.com>
> 
> Stolen memory is always allocated as physically contiguous pages, mark
> the object flags as such.
> 
> Signed-off-by: CQ Tang <cq.tang@intel.com>
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> ---
>   drivers/gpu/drm/i915/gem/i915_gem_stolen.c | 10 ++++++----
>   1 file changed, 6 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_stolen.c b/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
> index f713eabb7671..49a2dfcc8ba7 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
> @@ -633,14 +633,15 @@ static const struct drm_i915_gem_object_ops i915_gem_object_stolen_ops = {
>   
>   static int __i915_gem_object_create_stolen(struct intel_memory_region *mem,
>   					   struct drm_i915_gem_object *obj,
> -					   struct drm_mm_node *stolen)
> +					   struct drm_mm_node *stolen,
> +					   unsigned int flags)
>   {
>   	static struct lock_class_key lock_class;
>   	unsigned int cache_level;
>   	int err;
>   
>   	drm_gem_private_object_init(&mem->i915->drm, &obj->base, stolen->size);
> -	i915_gem_object_init(obj, &i915_gem_object_stolen_ops, &lock_class, 0);
> +	i915_gem_object_init(obj, &i915_gem_object_stolen_ops, &lock_class, flags);
>   
>   	obj->stolen = stolen;
>   	obj->read_domains = I915_GEM_DOMAIN_CPU | I915_GEM_DOMAIN_GTT;
> @@ -682,7 +683,7 @@ static int _i915_gem_object_stolen_init(struct intel_memory_region *mem,
>   	if (ret)
>   		goto err_free;
>   
> -	ret = __i915_gem_object_create_stolen(mem, obj, stolen);
> +	ret = __i915_gem_object_create_stolen(mem, obj, stolen, flags);
>   	if (ret)
>   		goto err_remove;
>   
> @@ -840,7 +841,8 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_i915_private *i915,
>   		goto err_stolen;
>   	}
>   
> -	ret = __i915_gem_object_create_stolen(mem, obj, stolen);
> +	ret = __i915_gem_object_create_stolen(mem, obj, stolen,
> +					      I915_BO_ALLOC_CONTIGUOUS);
>   	if (ret)
>   		goto err_object_free;
>   
> 

Are all stolen objects always contiguous, or only the ones allocated by 
i915_gem_object_create_stolen_for_preallocated? If the former, should 
__i915_gem_object_create_stolen just set the flag without the need to 
pass it in?
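
If the answer is "always", the simplification asked about would look roughly like this standalone model, where the common init path sets the flag itself and callers pass nothing. The names and the flag value are hypothetical stand-ins for the driver's:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for I915_BO_ALLOC_CONTIGUOUS. */
#define MODEL_BO_ALLOC_CONTIGUOUS (1u << 0)

struct stolen_obj_model {
	unsigned int flags;
};

/*
 * Common init for every stolen object: the contiguous property is a
 * fact about stolen memory, so it is set here rather than by callers.
 */
static void stolen_obj_init_model(struct stolen_obj_model *obj)
{
	assert(obj != NULL);
	obj->flags = MODEL_BO_ALLOC_CONTIGUOUS;
}
```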

Regards,

Tvrtko
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [Intel-gfx] [PATCH 08/19] drm/i915: Return error value when bo not in LMEM for discrete
  2021-04-12  9:05 ` [PATCH 08/19] drm/i915: Return error value when bo not in LMEM for discrete Matthew Auld
@ 2021-04-14 15:16   ` Tvrtko Ursulin
  0 siblings, 0 replies; 65+ messages in thread
From: Tvrtko Ursulin @ 2021-04-14 15:16 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx; +Cc: Mohammed Khajapasha, dri-devel


On 12/04/2021 10:05, Matthew Auld wrote:
> From: Mohammed Khajapasha <mohammed.khajapasha@intel.com>
> 
> Return EREMOTE value when frame buffer object is not backed by LMEM
> for discrete. If Local memory is supported by hardware the framebuffer
> backing gem objects should be from local memory.
> 
> Signed-off-by: Mohammed Khajapasha <mohammed.khajapasha@intel.com>
> ---
>   drivers/gpu/drm/i915/display/intel_display.c | 10 ++++++++++
>   1 file changed, 10 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c
> index 411b46c012f8..57b06d8728af 100644
> --- a/drivers/gpu/drm/i915/display/intel_display.c
> +++ b/drivers/gpu/drm/i915/display/intel_display.c
> @@ -63,6 +63,7 @@
>   #include "display/intel_vdsc.h"
>   #include "display/intel_vrr.h"
>   
> +#include "gem/i915_gem_lmem.h"
>   #include "gem/i915_gem_object.h"
>   
>   #include "gt/intel_rps.h"
> @@ -11279,11 +11280,20 @@ intel_user_framebuffer_create(struct drm_device *dev,
>   	struct drm_framebuffer *fb;
>   	struct drm_i915_gem_object *obj;
>   	struct drm_mode_fb_cmd2 mode_cmd = *user_mode_cmd;
> +	struct drm_i915_private *i915;
>   
>   	obj = i915_gem_object_lookup(filp, mode_cmd.handles[0]);
>   	if (!obj)
>   		return ERR_PTR(-ENOENT);
>   
> +	/* object is backed with LMEM for discrete */
> +	i915 = to_i915(obj->base.dev);
> +	if (HAS_LMEM(i915) && !i915_gem_object_is_lmem(obj)) {
> +		/* object is "remote", not in local memory */
> +		i915_gem_object_put(obj);
> +		return ERR_PTR(-EREMOTE);

I am a fan of rich errnos and this one feels appropriately descriptive, 
but please get an ack from Daniel or so.
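
For readers less familiar with how such an errno travels back through a pointer return, here is a standalone model of the ERR_PTR convention and of the check added in the hunk. The model_* helpers imitate the kernel's ERR_PTR()/PTR_ERR()/IS_ERR() but are not the kernel's definitions, and the errno values are the Linux numbers hard-coded for portability:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_MAX_ERRNO 4095
#define MODEL_ENOENT    2  /* Linux value of ENOENT */
#define MODEL_EREMOTE   66 /* Linux value of EREMOTE */

/* Encode a negative errno in the top page of the address space. */
static void *model_err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

static long model_ptr_err(const void *p)
{
	return (long)(intptr_t)p;
}

static bool model_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)(-(intptr_t)MODEL_MAX_ERRNO);
}

struct fb_model {
	int dummy;
};

/* Same decision order as the quoted intel_user_framebuffer_create(). */
static void *fb_create_model(bool found, bool has_lmem, bool obj_is_lmem,
			     struct fb_model *fb)
{
	if (!found)
		return model_err_ptr(-MODEL_ENOENT);
	if (has_lmem && !obj_is_lmem)
		return model_err_ptr(-MODEL_EREMOTE);
	return fb;
}
```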

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko

> +	}
> +
>   	fb = intel_framebuffer_create(obj, &mode_cmd);
>   	i915_gem_object_put(obj);
>   
> 

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [Intel-gfx] [PATCH 11/19] drm/i915: Update the helper to set correct mapping
  2021-04-12  9:05 ` [PATCH 11/19] drm/i915: Update the helper to set correct mapping Matthew Auld
@ 2021-04-14 15:22   ` Tvrtko Ursulin
  2021-04-14 16:20     ` Matthew Auld
  0 siblings, 1 reply; 65+ messages in thread
From: Tvrtko Ursulin @ 2021-04-14 15:22 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx; +Cc: dri-devel


On 12/04/2021 10:05, Matthew Auld wrote:
> From: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>
> 
> Determine the possible coherent map type based on object location,
> and if target has llc or if user requires an always coherent
> mapping.
> 
> Cc: Matthew Auld <matthew.auld@intel.com>
> Cc: CQ Tang <cq.tang@intel.com>
> Suggested-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
> Signed-off-by: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>
> ---
>   drivers/gpu/drm/i915/gt/intel_engine_cs.c    |  3 ++-
>   drivers/gpu/drm/i915/gt/intel_engine_pm.c    |  2 +-
>   drivers/gpu/drm/i915/gt/intel_lrc.c          |  4 +++-
>   drivers/gpu/drm/i915/gt/intel_ring.c         |  9 ++++++---
>   drivers/gpu/drm/i915/gt/selftest_context.c   |  3 ++-
>   drivers/gpu/drm/i915/gt/selftest_hangcheck.c |  4 ++--
>   drivers/gpu/drm/i915/gt/selftest_lrc.c       |  4 +++-
>   drivers/gpu/drm/i915/gt/uc/intel_guc.c       |  4 +++-
>   drivers/gpu/drm/i915/gt/uc/intel_huc.c       |  4 +++-
>   drivers/gpu/drm/i915/i915_drv.h              | 11 +++++++++--
>   drivers/gpu/drm/i915/selftests/igt_spinner.c |  4 ++--
>   11 files changed, 36 insertions(+), 16 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> index efe935f80c1a..b79568d370f5 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> @@ -664,7 +664,8 @@ static int init_status_page(struct intel_engine_cs *engine)
>   	if (ret)
>   		goto err;
>   
> -	vaddr = i915_gem_object_pin_map(obj, I915_MAP_WB);
> +	vaddr = i915_gem_object_pin_map(obj,
> +					i915_coherent_map_type(engine->i915, obj, true));
>   	if (IS_ERR(vaddr)) {
>   		ret = PTR_ERR(vaddr);
>   		goto err_unpin;
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
> index 7c9af86fdb1e..47f4397095e5 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c
> +++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
> @@ -23,7 +23,7 @@ static void dbg_poison_ce(struct intel_context *ce)
>   
>   	if (ce->state) {
>   		struct drm_i915_gem_object *obj = ce->state->obj;
> -		int type = i915_coherent_map_type(ce->engine->i915);
> +		int type = i915_coherent_map_type(ce->engine->i915, obj, true);
>   		void *map;
>   
>   		if (!i915_gem_object_trylock(obj))
> diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
> index e86897cde984..aafe2a4df496 100644
> --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
> @@ -903,7 +903,9 @@ lrc_pre_pin(struct intel_context *ce,
>   	GEM_BUG_ON(!i915_vma_is_pinned(ce->state));
>   
>   	*vaddr = i915_gem_object_pin_map(ce->state->obj,
> -					 i915_coherent_map_type(ce->engine->i915) |
> +					 i915_coherent_map_type(ce->engine->i915,
> +								ce->state->obj,
> +								false) |
>   					 I915_MAP_OVERRIDE);
>   
>   	return PTR_ERR_OR_ZERO(*vaddr);
> diff --git a/drivers/gpu/drm/i915/gt/intel_ring.c b/drivers/gpu/drm/i915/gt/intel_ring.c
> index aee0a77c77e0..3cf6c7e68108 100644
> --- a/drivers/gpu/drm/i915/gt/intel_ring.c
> +++ b/drivers/gpu/drm/i915/gt/intel_ring.c
> @@ -53,9 +53,12 @@ int intel_ring_pin(struct intel_ring *ring, struct i915_gem_ww_ctx *ww)
>   
>   	if (i915_vma_is_map_and_fenceable(vma))
>   		addr = (void __force *)i915_vma_pin_iomap(vma);
> -	else
> -		addr = i915_gem_object_pin_map(vma->obj,
> -					       i915_coherent_map_type(vma->vm->i915));
> +	else {
> +		int type = i915_coherent_map_type(vma->vm->i915, vma->obj, false);
> +
> +		addr = i915_gem_object_pin_map(vma->obj, type);
> +	}
> +
>   	if (IS_ERR(addr)) {
>   		ret = PTR_ERR(addr);
>   		goto err_ring;
> diff --git a/drivers/gpu/drm/i915/gt/selftest_context.c b/drivers/gpu/drm/i915/gt/selftest_context.c
> index b9bdd1d23243..26685b927169 100644
> --- a/drivers/gpu/drm/i915/gt/selftest_context.c
> +++ b/drivers/gpu/drm/i915/gt/selftest_context.c
> @@ -88,7 +88,8 @@ static int __live_context_size(struct intel_engine_cs *engine)
>   		goto err;
>   
>   	vaddr = i915_gem_object_pin_map_unlocked(ce->state->obj,
> -						 i915_coherent_map_type(engine->i915));
> +						 i915_coherent_map_type(engine->i915,
> +									ce->state->obj, false));
>   	if (IS_ERR(vaddr)) {
>   		err = PTR_ERR(vaddr);
>   		intel_context_unpin(ce);
> diff --git a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
> index 746985971c3a..5b63d4df8c93 100644
> --- a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
> +++ b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
> @@ -69,7 +69,7 @@ static int hang_init(struct hang *h, struct intel_gt *gt)
>   	h->seqno = memset(vaddr, 0xff, PAGE_SIZE);
>   
>   	vaddr = i915_gem_object_pin_map_unlocked(h->obj,
> -						 i915_coherent_map_type(gt->i915));
> +						 i915_coherent_map_type(gt->i915, h->obj, false));
>   	if (IS_ERR(vaddr)) {
>   		err = PTR_ERR(vaddr);
>   		goto err_unpin_hws;
> @@ -130,7 +130,7 @@ hang_create_request(struct hang *h, struct intel_engine_cs *engine)
>   		return ERR_CAST(obj);
>   	}
>   
> -	vaddr = i915_gem_object_pin_map_unlocked(obj, i915_coherent_map_type(gt->i915));
> +	vaddr = i915_gem_object_pin_map_unlocked(obj, i915_coherent_map_type(gt->i915, obj, false));
>   	if (IS_ERR(vaddr)) {
>   		i915_gem_object_put(obj);
>   		i915_vm_put(vm);
> diff --git a/drivers/gpu/drm/i915/gt/selftest_lrc.c b/drivers/gpu/drm/i915/gt/selftest_lrc.c
> index 85e7df6a5123..d8f6623524e8 100644
> --- a/drivers/gpu/drm/i915/gt/selftest_lrc.c
> +++ b/drivers/gpu/drm/i915/gt/selftest_lrc.c
> @@ -1221,7 +1221,9 @@ static int compare_isolation(struct intel_engine_cs *engine,
>   	}
>   
>   	lrc = i915_gem_object_pin_map_unlocked(ce->state->obj,
> -				      i915_coherent_map_type(engine->i915));
> +					       i915_coherent_map_type(engine->i915,
> +								      ce->state->obj,
> +								      false));
>   	if (IS_ERR(lrc)) {
>   		err = PTR_ERR(lrc);
>   		goto err_B1;
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.c b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
> index 78305b2ec89d..adae04c47aab 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
> @@ -682,7 +682,9 @@ int intel_guc_allocate_and_map_vma(struct intel_guc *guc, u32 size,
>   	if (IS_ERR(vma))
>   		return PTR_ERR(vma);
>   
> -	vaddr = i915_gem_object_pin_map_unlocked(vma->obj, I915_MAP_WB);
> +	vaddr = i915_gem_object_pin_map_unlocked(vma->obj,
> +						 i915_coherent_map_type(guc_to_gt(guc)->i915,
> +									vma->obj, true));
>   	if (IS_ERR(vaddr)) {
>   		i915_vma_unpin_and_release(&vma, 0);
>   		return PTR_ERR(vaddr);
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_huc.c b/drivers/gpu/drm/i915/gt/uc/intel_huc.c
> index 2126dd81ac38..56d2144dc6a0 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_huc.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_huc.c
> @@ -82,7 +82,9 @@ static int intel_huc_rsa_data_create(struct intel_huc *huc)
>   	if (IS_ERR(vma))
>   		return PTR_ERR(vma);
>   
> -	vaddr = i915_gem_object_pin_map_unlocked(vma->obj, I915_MAP_WB);
> +	vaddr = i915_gem_object_pin_map_unlocked(vma->obj,
> +						 i915_coherent_map_type(gt->i915,
> +									vma->obj, true));
>   	if (IS_ERR(vaddr)) {
>   		i915_vma_unpin_and_release(&vma, 0);
>   		return PTR_ERR(vaddr);
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 69e43bf91a15..2abbc06712a4 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -78,6 +78,7 @@
>   #include "gem/i915_gem_context_types.h"
>   #include "gem/i915_gem_shrinker.h"
>   #include "gem/i915_gem_stolen.h"
> +#include "gem/i915_gem_lmem.h"
>   
>   #include "gt/intel_engine.h"
>   #include "gt/intel_gt_types.h"
> @@ -1921,9 +1922,15 @@ static inline int intel_hws_csb_write_index(struct drm_i915_private *i915)
>   }
>   
>   static inline enum i915_map_type
> -i915_coherent_map_type(struct drm_i915_private *i915)
> +i915_coherent_map_type(struct drm_i915_private *i915,
> +		       struct drm_i915_gem_object *obj, bool always_coherent)
>   {
> -	return HAS_LLC(i915) ? I915_MAP_WB : I915_MAP_WC;
> +	if (i915_gem_object_is_lmem(obj))
> +		return I915_MAP_WC;
> +	if (HAS_LLC(i915) || always_coherent)
> +		return I915_MAP_WB;
> +	else
> +		return I915_MAP_WC;

Seems this patch is doing two things.

First it is adding lmem support to this helper by always returning WC 
for lmem objects.

Secondly it is introducing an idea of "always coherent" in a helper 
called i915_coherent_map_type. Could someone explain what is coherent vs 
always coherent?

And also, why is always coherent happy with WB? Sounds counterintuitive 
to me.
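
To make the question concrete, the decision in the quoted helper can be written out as a standalone model over its three inputs. The enum and function names are hypothetical; the branch order is copied from the hunk:

```c
#include <assert.h>
#include <stdbool.h>

enum map_type_model {
	MAP_WB_MODEL, /* write-back */
	MAP_WC_MODEL, /* write-combined */
};

/* Same branch order as the patched i915_coherent_map_type(). */
static enum map_type_model
coherent_map_type_model(bool has_llc, bool obj_is_lmem, bool always_coherent)
{
	if (obj_is_lmem)
		return MAP_WC_MODEL;
	if (has_llc || always_coherent)
		return MAP_WB_MODEL;
	return MAP_WC_MODEL;
}
```

Enumerating the inputs shows that lmem objects get WC regardless of the flag, and that always_coherent only changes the result for non-lmem objects on platforms without LLC, where it selects WB instead of WC.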

Regards,

Tvrtko

>   }
>   
>   #endif
> diff --git a/drivers/gpu/drm/i915/selftests/igt_spinner.c b/drivers/gpu/drm/i915/selftests/igt_spinner.c
> index cfbbe415b57c..5fe397b7d1d9 100644
> --- a/drivers/gpu/drm/i915/selftests/igt_spinner.c
> +++ b/drivers/gpu/drm/i915/selftests/igt_spinner.c
> @@ -94,9 +94,9 @@ int igt_spinner_pin(struct igt_spinner *spin,
>   	}
>   
>   	if (!spin->batch) {
> -		unsigned int mode =
> -			i915_coherent_map_type(spin->gt->i915);
> +		unsigned int mode;
>   
> +		mode = i915_coherent_map_type(spin->gt->i915, spin->obj, false);
>   		vaddr = igt_spinner_pin_obj(ce, ww, spin->obj, mode, &spin->batch_vma);
>   		if (IS_ERR(vaddr))
>   			return PTR_ERR(vaddr);
> 

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [Intel-gfx] [PATCH 12/19] drm/i915/lmem: Bypass aperture when lmem is available
  2021-04-12  9:05 ` [PATCH 12/19] drm/i915/lmem: Bypass aperture when lmem is available Matthew Auld
@ 2021-04-14 15:33   ` Tvrtko Ursulin
  2021-04-16 14:25     ` Matthew Auld
  0 siblings, 1 reply; 65+ messages in thread
From: Tvrtko Ursulin @ 2021-04-14 15:33 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx
  Cc: Daniel Vetter, dri-devel, Chris P Wilson, Dhinakaran Pandiyan


On 12/04/2021 10:05, Matthew Auld wrote:
> From: Anusha Srivatsa <anusha.srivatsa@intel.com>
> 
> In the scenario where local memory is available, we have to
> rely on CPU access via lmem directly instead of aperture.
> 
> v2:
> gmch is only relevant for much older hw, therefore we can drop the
> has_aperture check since it should always be present on such platforms.
> (Chris)
> 
> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
> Cc: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> Cc: Chris P Wilson <chris.p.wilson@intel.com>
> Cc: Daniel Vetter <daniel.vetter@intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> Cc: CQ Tang <cq.tang@intel.com>
> Signed-off-by: Anusha Srivatsa <anusha.srivatsa@intel.com>
> ---
>   drivers/gpu/drm/i915/display/intel_fbdev.c | 22 +++++++++++++++-------
>   drivers/gpu/drm/i915/gem/i915_gem_lmem.c   | 15 +++++++++++++++
>   drivers/gpu/drm/i915/gem/i915_gem_lmem.h   |  5 +++++
>   drivers/gpu/drm/i915/i915_vma.c            | 19 +++++++++++++------
>   4 files changed, 48 insertions(+), 13 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/display/intel_fbdev.c b/drivers/gpu/drm/i915/display/intel_fbdev.c
> index 2b37959da747..4af40229f5ec 100644
> --- a/drivers/gpu/drm/i915/display/intel_fbdev.c
> +++ b/drivers/gpu/drm/i915/display/intel_fbdev.c
> @@ -139,14 +139,22 @@ static int intelfb_alloc(struct drm_fb_helper *helper,
>   	size = mode_cmd.pitches[0] * mode_cmd.height;
>   	size = PAGE_ALIGN(size);
>   
> -	/* If the FB is too big, just don't use it since fbdev is not very
> -	 * important and we should probably use that space with FBC or other
> -	 * features. */
>   	obj = ERR_PTR(-ENODEV);
> -	if (size * 2 < dev_priv->stolen_usable_size)
> -		obj = i915_gem_object_create_stolen(dev_priv, size);
> -	if (IS_ERR(obj))
> -		obj = i915_gem_object_create_shmem(dev_priv, size);
> +	if (HAS_LMEM(dev_priv)) {
> +		obj = i915_gem_object_create_lmem(dev_priv, size,
> +						  I915_BO_ALLOC_CONTIGUOUS);

Has to be contiguous? Question for display experts I guess.

[Comes back later.] Ah for iomap? Put a comment to that effect perhaps?
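
To illustrate the contiguity point: a single linear window of (offset, size) into the region only covers the whole object if its backing chunks sit back to back. Below is a minimal userspace sketch of that check and of the offset calculation the new i915_gem_object_lmem_io_map() does (DMA address minus region start). All names (model_chunk, model_is_contiguous, model_linear_offset) are hypothetical stand-ins and not driver API.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one backing chunk of an object. */
struct model_chunk {
	uint64_t dma_addr;	/* device address of the chunk */
	uint64_t len;		/* chunk length in bytes */
};

/* True if every chunk starts exactly where the previous one ends. */
static bool model_is_contiguous(const struct model_chunk *c, size_t n)
{
	size_t i;

	if (n == 0)
		return false;

	for (i = 1; i < n; i++)
		if (c[i].dma_addr != c[i - 1].dma_addr + c[i - 1].len)
			return false;

	return true;
}

/*
 * Compute the offset of the object inside its region, as a single
 * linear mapping would need it. Fails for non-contiguous objects,
 * where the patch instead has a GEM_BUG_ON().
 */
static bool model_linear_offset(const struct model_chunk *c, size_t n,
				uint64_t region_start, uint64_t *offset)
{
	if (!model_is_contiguous(c, n))
		return false;

	*offset = c[0].dma_addr - region_start;
	return true;
}
```

With a region starting at 0x1000, two chunks at 0x3000 and 0x4000 of 0x1000 bytes each give an offset of 0x2000, while moving the second chunk to 0x6000 makes the linear mapping impossible.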

> +	} else {
> +		/*
> +		 * If the FB is too big, just don't use it since fbdev is not very
> +		 * important and we should probably use that space with FBC or other
> +		 * features.
> +		 */
> +		if (size * 2 < dev_priv->stolen_usable_size)
> +			obj = i915_gem_object_create_stolen(dev_priv, size);
> +		if (IS_ERR(obj))
> +			obj = i915_gem_object_create_shmem(dev_priv, size);
> +	}

Could we keep the flat, IS_ERR-ordered allocation sequence to save having to 
re-indent? Bikeshed, so optional.

> +
>   	if (IS_ERR(obj)) {
>   		drm_err(&dev_priv->drm, "failed to allocate framebuffer\n");
>   		return PTR_ERR(obj);
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_lmem.c b/drivers/gpu/drm/i915/gem/i915_gem_lmem.c
> index 017db8f71130..f44bdd08f7cb 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_lmem.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_lmem.c
> @@ -17,6 +17,21 @@ const struct drm_i915_gem_object_ops i915_gem_lmem_obj_ops = {
>   	.release = i915_gem_object_release_memory_region,
>   };
>   
> +void __iomem *
> +i915_gem_object_lmem_io_map(struct drm_i915_gem_object *obj,
> +			    unsigned long n,
> +			    unsigned long size)
> +{
> +	resource_size_t offset;
> +
> +	GEM_BUG_ON(!i915_gem_object_is_contiguous(obj));
> +
> +	offset = i915_gem_object_get_dma_address(obj, n);
> +	offset -= obj->mm.region->region.start;
> +
> +	return io_mapping_map_wc(&obj->mm.region->iomap, offset, size);
> +}
> +
>   bool i915_gem_object_is_lmem(struct drm_i915_gem_object *obj)
>   {
>   	struct intel_memory_region *mr = obj->mm.region;
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_lmem.h b/drivers/gpu/drm/i915/gem/i915_gem_lmem.h
> index 036d53c01de9..fac6bc5a5ebb 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_lmem.h
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_lmem.h
> @@ -14,6 +14,11 @@ struct intel_memory_region;
>   
>   extern const struct drm_i915_gem_object_ops i915_gem_lmem_obj_ops;
>   
> +void __iomem *
> +i915_gem_object_lmem_io_map(struct drm_i915_gem_object *obj,
> +			    unsigned long n,
> +			    unsigned long size);
> +
>   bool i915_gem_object_is_lmem(struct drm_i915_gem_object *obj);
>   
>   struct drm_i915_gem_object *
> diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
> index 07490db51cdc..e24d33aecac4 100644
> --- a/drivers/gpu/drm/i915/i915_vma.c
> +++ b/drivers/gpu/drm/i915/i915_vma.c
> @@ -27,6 +27,7 @@
>   
>   #include "display/intel_frontbuffer.h"
>   
> +#include "gem/i915_gem_lmem.h"
>   #include "gt/intel_engine.h"
>   #include "gt/intel_engine_heartbeat.h"
>   #include "gt/intel_gt.h"
> @@ -448,9 +449,11 @@ void __iomem *i915_vma_pin_iomap(struct i915_vma *vma)
>   	void __iomem *ptr;
>   	int err;
>   
> -	if (GEM_WARN_ON(!i915_vma_is_map_and_fenceable(vma))) {
> -		err = -ENODEV;
> -		goto err;
> +	if (!i915_gem_object_is_lmem(vma->obj)) {
> +		if (GEM_WARN_ON(!i915_vma_is_map_and_fenceable(vma))) {
> +			err = -ENODEV;
> +			goto err;
> +		}
>   	}
>   
>   	GEM_BUG_ON(!i915_vma_is_ggtt(vma));
> @@ -458,9 +461,13 @@ void __iomem *i915_vma_pin_iomap(struct i915_vma *vma)
>   
>   	ptr = READ_ONCE(vma->iomap);
>   	if (ptr == NULL) {
> -		ptr = io_mapping_map_wc(&i915_vm_to_ggtt(vma->vm)->iomap,
> -					vma->node.start,
> -					vma->node.size);
> +		if (i915_gem_object_is_lmem(vma->obj))
> +			ptr = i915_gem_object_lmem_io_map(vma->obj, 0,
> +							  vma->obj->base.size);

Can the vma size be bigger than the object here? Asking given how the branch 
below works off vma->node.size.

> +		else
> +			ptr = io_mapping_map_wc(&i915_vm_to_ggtt(vma->vm)->iomap,
> +						vma->node.start,
> +						vma->node.size);

Looks a bit odd that this calls the same io_mapping_map_wc as 
i915_gem_object_lmem_io_map ends up doing. Perhaps that suggests there 
should be a single helper here but I am not sure what would be elegant.

Regards,

Tvrtko

>   		if (ptr == NULL) {
>   			err = -ENOMEM;
>   			goto err;
> 

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [Intel-gfx] [PATCH 19/19] drm/i915/gtt/dgfx: place the PD in LMEM
  2021-04-12  9:05 ` [PATCH 19/19] drm/i915/gtt/dgfx: place the PD in LMEM Matthew Auld
@ 2021-04-14 15:37   ` Tvrtko Ursulin
  0 siblings, 0 replies; 65+ messages in thread
From: Tvrtko Ursulin @ 2021-04-14 15:37 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx; +Cc: dri-devel


On 12/04/2021 10:05, Matthew Auld wrote:
> It's a requirement that for dgfx we place all the paging structures in
> device local-memory.
> 
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> ---
>   drivers/gpu/drm/i915/gt/gen8_ppgtt.c |  5 ++++-
>   drivers/gpu/drm/i915/gt/intel_gtt.c  | 27 +++++++++++++++++++++++++--
>   drivers/gpu/drm/i915/gt/intel_gtt.h  |  1 +
>   3 files changed, 30 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
> index f83496836f0f..11fb5df45a0f 100644
> --- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
> +++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
> @@ -712,7 +712,10 @@ struct i915_ppgtt *gen8_ppgtt_create(struct intel_gt *gt)
>   	 */
>   	ppgtt->vm.has_read_only = !IS_GEN_RANGE(gt->i915, 11, 12);
>   
> -	ppgtt->vm.alloc_pt_dma = alloc_pt_dma;
> +	if (HAS_LMEM(gt->i915))
> +		ppgtt->vm.alloc_pt_dma = alloc_pt_lmem;
> +	else
> +		ppgtt->vm.alloc_pt_dma = alloc_pt_dma;
>   
>   	err = gen8_init_scratch(&ppgtt->vm);
>   	if (err)
> diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.c b/drivers/gpu/drm/i915/gt/intel_gtt.c
> index d386b89e2758..1eeeab45445c 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gtt.c
> +++ b/drivers/gpu/drm/i915/gt/intel_gtt.c
> @@ -7,10 +7,23 @@
>   
>   #include <linux/fault-inject.h>
>   
> +#include "gem/i915_gem_lmem.h"
>   #include "i915_trace.h"
>   #include "intel_gt.h"
>   #include "intel_gtt.h"
>   
> +struct drm_i915_gem_object *alloc_pt_lmem(struct i915_address_space *vm, int sz)
> +{
> +	struct drm_i915_gem_object *obj;
> +
> +	obj = i915_gem_object_create_lmem(vm->i915, sz, 0);
> +
> +	/* ensure all dma objects have the same reservation class */
> +	if (!IS_ERR(obj))
> +		obj->base.resv = &vm->resv;
> +	return obj;
> +}
> +
>   struct drm_i915_gem_object *alloc_pt_dma(struct i915_address_space *vm, int sz)
>   {
>   	struct drm_i915_gem_object *obj;
> @@ -27,9 +40,14 @@ struct drm_i915_gem_object *alloc_pt_dma(struct i915_address_space *vm, int sz)
>   
>   int map_pt_dma(struct i915_address_space *vm, struct drm_i915_gem_object *obj)
>   {
> +	enum i915_map_type type;
>   	void *vaddr;
>   
> -	vaddr = i915_gem_object_pin_map_unlocked(obj, I915_MAP_WB);
> +	type = I915_MAP_WB;
> +	if (i915_gem_object_is_lmem(obj))
> +		type = I915_MAP_WC;

Not trusting the "always coherent" helper from earlier in the series?

Regards,

Tvrtko

> +
> +	vaddr = i915_gem_object_pin_map_unlocked(obj, type);
>   	if (IS_ERR(vaddr))
>   		return PTR_ERR(vaddr);
>   
> @@ -39,9 +57,14 @@ int map_pt_dma(struct i915_address_space *vm, struct drm_i915_gem_object *obj)
>   
>   int map_pt_dma_locked(struct i915_address_space *vm, struct drm_i915_gem_object *obj)
>   {
> +	enum i915_map_type type;
>   	void *vaddr;
>   
> -	vaddr = i915_gem_object_pin_map(obj, I915_MAP_WB);
> +	type = I915_MAP_WB;
> +	if (i915_gem_object_is_lmem(obj))
> +		type = I915_MAP_WC;
> +
> +	vaddr = i915_gem_object_pin_map(obj, type);
>   	if (IS_ERR(vaddr))
>   		return PTR_ERR(vaddr);
>   
> diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.h b/drivers/gpu/drm/i915/gt/intel_gtt.h
> index 40e486704558..44ce27c51631 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gtt.h
> +++ b/drivers/gpu/drm/i915/gt/intel_gtt.h
> @@ -527,6 +527,7 @@ int setup_scratch_page(struct i915_address_space *vm);
>   void free_scratch(struct i915_address_space *vm);
>   
>   struct drm_i915_gem_object *alloc_pt_dma(struct i915_address_space *vm, int sz);
> +struct drm_i915_gem_object *alloc_pt_lmem(struct i915_address_space *vm, int sz);
>   struct i915_page_table *alloc_pt(struct i915_address_space *vm);
>   struct i915_page_directory *alloc_pd(struct i915_address_space *vm);
>   struct i915_page_directory *__alloc_pd(int npde);
> 

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [Intel-gfx] [PATCH 11/19] drm/i915: Update the helper to set correct mapping
  2021-04-14 15:22   ` [Intel-gfx] " Tvrtko Ursulin
@ 2021-04-14 16:20     ` Matthew Auld
  2021-04-15  8:20       ` Tvrtko Ursulin
  0 siblings, 1 reply; 65+ messages in thread
From: Matthew Auld @ 2021-04-14 16:20 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: Intel Graphics Development, Matthew Auld, ML dri-devel

On Wed, 14 Apr 2021 at 16:22, Tvrtko Ursulin
<tvrtko.ursulin@linux.intel.com> wrote:
>
>
> On 12/04/2021 10:05, Matthew Auld wrote:
> > From: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>
> >
> > Determine the possible coherent map type based on object location,
> > and if target has llc or if user requires an always coherent
> > mapping.
> >
> > Cc: Matthew Auld <matthew.auld@intel.com>
> > Cc: CQ Tang <cq.tang@intel.com>
> > Suggested-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
> > Signed-off-by: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>
> > ---
> >   drivers/gpu/drm/i915/gt/intel_engine_cs.c    |  3 ++-
> >   drivers/gpu/drm/i915/gt/intel_engine_pm.c    |  2 +-
> >   drivers/gpu/drm/i915/gt/intel_lrc.c          |  4 +++-
> >   drivers/gpu/drm/i915/gt/intel_ring.c         |  9 ++++++---
> >   drivers/gpu/drm/i915/gt/selftest_context.c   |  3 ++-
> >   drivers/gpu/drm/i915/gt/selftest_hangcheck.c |  4 ++--
> >   drivers/gpu/drm/i915/gt/selftest_lrc.c       |  4 +++-
> >   drivers/gpu/drm/i915/gt/uc/intel_guc.c       |  4 +++-
> >   drivers/gpu/drm/i915/gt/uc/intel_huc.c       |  4 +++-
> >   drivers/gpu/drm/i915/i915_drv.h              | 11 +++++++++--
> >   drivers/gpu/drm/i915/selftests/igt_spinner.c |  4 ++--
> >   11 files changed, 36 insertions(+), 16 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> > index efe935f80c1a..b79568d370f5 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> > @@ -664,7 +664,8 @@ static int init_status_page(struct intel_engine_cs *engine)
> >       if (ret)
> >               goto err;
> >
> > -     vaddr = i915_gem_object_pin_map(obj, I915_MAP_WB);
> > +     vaddr = i915_gem_object_pin_map(obj,
> > +                                     i915_coherent_map_type(engine->i915, obj, true));
> >       if (IS_ERR(vaddr)) {
> >               ret = PTR_ERR(vaddr);
> >               goto err_unpin;
> > diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
> > index 7c9af86fdb1e..47f4397095e5 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
> > @@ -23,7 +23,7 @@ static void dbg_poison_ce(struct intel_context *ce)
> >
> >       if (ce->state) {
> >               struct drm_i915_gem_object *obj = ce->state->obj;
> > -             int type = i915_coherent_map_type(ce->engine->i915);
> > +             int type = i915_coherent_map_type(ce->engine->i915, obj, true);
> >               void *map;
> >
> >               if (!i915_gem_object_trylock(obj))
> > diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
> > index e86897cde984..aafe2a4df496 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
> > @@ -903,7 +903,9 @@ lrc_pre_pin(struct intel_context *ce,
> >       GEM_BUG_ON(!i915_vma_is_pinned(ce->state));
> >
> >       *vaddr = i915_gem_object_pin_map(ce->state->obj,
> > -                                      i915_coherent_map_type(ce->engine->i915) |
> > +                                      i915_coherent_map_type(ce->engine->i915,
> > +                                                             ce->state->obj,
> > +                                                             false) |
> >                                        I915_MAP_OVERRIDE);
> >
> >       return PTR_ERR_OR_ZERO(*vaddr);
> > diff --git a/drivers/gpu/drm/i915/gt/intel_ring.c b/drivers/gpu/drm/i915/gt/intel_ring.c
> > index aee0a77c77e0..3cf6c7e68108 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_ring.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_ring.c
> > @@ -53,9 +53,12 @@ int intel_ring_pin(struct intel_ring *ring, struct i915_gem_ww_ctx *ww)
> >
> >       if (i915_vma_is_map_and_fenceable(vma))
> >               addr = (void __force *)i915_vma_pin_iomap(vma);
> > -     else
> > -             addr = i915_gem_object_pin_map(vma->obj,
> > -                                            i915_coherent_map_type(vma->vm->i915));
> > +     else {
> > +             int type = i915_coherent_map_type(vma->vm->i915, vma->obj, false);
> > +
> > +             addr = i915_gem_object_pin_map(vma->obj, type);
> > +     }
> > +
> >       if (IS_ERR(addr)) {
> >               ret = PTR_ERR(addr);
> >               goto err_ring;
> > diff --git a/drivers/gpu/drm/i915/gt/selftest_context.c b/drivers/gpu/drm/i915/gt/selftest_context.c
> > index b9bdd1d23243..26685b927169 100644
> > --- a/drivers/gpu/drm/i915/gt/selftest_context.c
> > +++ b/drivers/gpu/drm/i915/gt/selftest_context.c
> > @@ -88,7 +88,8 @@ static int __live_context_size(struct intel_engine_cs *engine)
> >               goto err;
> >
> >       vaddr = i915_gem_object_pin_map_unlocked(ce->state->obj,
> > -                                              i915_coherent_map_type(engine->i915));
> > +                                              i915_coherent_map_type(engine->i915,
> > +                                                                     ce->state->obj, false));
> >       if (IS_ERR(vaddr)) {
> >               err = PTR_ERR(vaddr);
> >               intel_context_unpin(ce);
> > diff --git a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
> > index 746985971c3a..5b63d4df8c93 100644
> > --- a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
> > +++ b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
> > @@ -69,7 +69,7 @@ static int hang_init(struct hang *h, struct intel_gt *gt)
> >       h->seqno = memset(vaddr, 0xff, PAGE_SIZE);
> >
> >       vaddr = i915_gem_object_pin_map_unlocked(h->obj,
> > -                                              i915_coherent_map_type(gt->i915));
> > +                                              i915_coherent_map_type(gt->i915, h->obj, false));
> >       if (IS_ERR(vaddr)) {
> >               err = PTR_ERR(vaddr);
> >               goto err_unpin_hws;
> > @@ -130,7 +130,7 @@ hang_create_request(struct hang *h, struct intel_engine_cs *engine)
> >               return ERR_CAST(obj);
> >       }
> >
> > -     vaddr = i915_gem_object_pin_map_unlocked(obj, i915_coherent_map_type(gt->i915));
> > +     vaddr = i915_gem_object_pin_map_unlocked(obj, i915_coherent_map_type(gt->i915, obj, false));
> >       if (IS_ERR(vaddr)) {
> >               i915_gem_object_put(obj);
> >               i915_vm_put(vm);
> > diff --git a/drivers/gpu/drm/i915/gt/selftest_lrc.c b/drivers/gpu/drm/i915/gt/selftest_lrc.c
> > index 85e7df6a5123..d8f6623524e8 100644
> > --- a/drivers/gpu/drm/i915/gt/selftest_lrc.c
> > +++ b/drivers/gpu/drm/i915/gt/selftest_lrc.c
> > @@ -1221,7 +1221,9 @@ static int compare_isolation(struct intel_engine_cs *engine,
> >       }
> >
> >       lrc = i915_gem_object_pin_map_unlocked(ce->state->obj,
> > -                                   i915_coherent_map_type(engine->i915));
> > +                                            i915_coherent_map_type(engine->i915,
> > +                                                                   ce->state->obj,
> > +                                                                   false));
> >       if (IS_ERR(lrc)) {
> >               err = PTR_ERR(lrc);
> >               goto err_B1;
> > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.c b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
> > index 78305b2ec89d..adae04c47aab 100644
> > --- a/drivers/gpu/drm/i915/gt/uc/intel_guc.c
> > +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
> > @@ -682,7 +682,9 @@ int intel_guc_allocate_and_map_vma(struct intel_guc *guc, u32 size,
> >       if (IS_ERR(vma))
> >               return PTR_ERR(vma);
> >
> > -     vaddr = i915_gem_object_pin_map_unlocked(vma->obj, I915_MAP_WB);
> > +     vaddr = i915_gem_object_pin_map_unlocked(vma->obj,
> > +                                              i915_coherent_map_type(guc_to_gt(guc)->i915,
> > +                                                                     vma->obj, true));
> >       if (IS_ERR(vaddr)) {
> >               i915_vma_unpin_and_release(&vma, 0);
> >               return PTR_ERR(vaddr);
> > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_huc.c b/drivers/gpu/drm/i915/gt/uc/intel_huc.c
> > index 2126dd81ac38..56d2144dc6a0 100644
> > --- a/drivers/gpu/drm/i915/gt/uc/intel_huc.c
> > +++ b/drivers/gpu/drm/i915/gt/uc/intel_huc.c
> > @@ -82,7 +82,9 @@ static int intel_huc_rsa_data_create(struct intel_huc *huc)
> >       if (IS_ERR(vma))
> >               return PTR_ERR(vma);
> >
> > -     vaddr = i915_gem_object_pin_map_unlocked(vma->obj, I915_MAP_WB);
> > +     vaddr = i915_gem_object_pin_map_unlocked(vma->obj,
> > +                                              i915_coherent_map_type(gt->i915,
> > +                                                                     vma->obj, true));
> >       if (IS_ERR(vaddr)) {
> >               i915_vma_unpin_and_release(&vma, 0);
> >               return PTR_ERR(vaddr);
> > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> > index 69e43bf91a15..2abbc06712a4 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.h
> > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > @@ -78,6 +78,7 @@
> >   #include "gem/i915_gem_context_types.h"
> >   #include "gem/i915_gem_shrinker.h"
> >   #include "gem/i915_gem_stolen.h"
> > +#include "gem/i915_gem_lmem.h"
> >
> >   #include "gt/intel_engine.h"
> >   #include "gt/intel_gt_types.h"
> > @@ -1921,9 +1922,15 @@ static inline int intel_hws_csb_write_index(struct drm_i915_private *i915)
> >   }
> >
> >   static inline enum i915_map_type
> > -i915_coherent_map_type(struct drm_i915_private *i915)
> > +i915_coherent_map_type(struct drm_i915_private *i915,
> > +                    struct drm_i915_gem_object *obj, bool always_coherent)
> >   {
> > -     return HAS_LLC(i915) ? I915_MAP_WB : I915_MAP_WC;
> > +     if (i915_gem_object_is_lmem(obj))
> > +             return I915_MAP_WC;
> > +     if (HAS_LLC(i915) || always_coherent)
> > +             return I915_MAP_WB;
> > +     else
> > +             return I915_MAP_WC;
>
> Seems this patch is doing two things.
>
> First it is adding lmem support to this helper by always returning WC
> for lmem objects.
>
> Secondly it is introducing an idea of "always coherent" in a helper
> called i915_coherent_map_type. Could someone explain what is coherent vs
> always coherent?
>
> And also, why is always coherent happy with WB? Sounds counter intuitive
> to me.

All this does is try to keep the existing behaviour intact, whilst
also ensuring that all lmem objects are mapped using only WC, no
matter what. The always_coherent=true thing is for the existing places
where we sometimes map the object using WB, without first considering
whether the device has the fast shared LLC vs snooping. Yes, it's
slightly ugly :)
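
As a rough illustration of the behaviour described above, here is a standalone userspace model of the patched helper's decision order. The types and names (map_type, model_obj, model_coherent_map_type) are made-up stand-ins, not the real i915 definitions; only the branch order is taken from the diff.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-ins for I915_MAP_WB / I915_MAP_WC; not the driver's enum. */
enum map_type { MAP_WB, MAP_WC };

/* Hypothetical object: only records what i915_gem_object_is_lmem() would say. */
struct model_obj {
	bool in_lmem;
};

/* Same branch order as the patched i915_coherent_map_type(). */
static enum map_type
model_coherent_map_type(bool has_llc, const struct model_obj *obj,
			bool always_coherent)
{
	if (obj->in_lmem)
		return MAP_WC;		/* lmem is WC, no matter what */
	if (has_llc || always_coherent)
		return MAP_WB;
	return MAP_WC;
}

/* Walk all eight input combinations and check the resulting table. */
void model_check_decision_table(void)
{
	const struct model_obj smem = { .in_lmem = false };
	const struct model_obj lmem = { .in_lmem = true };
	int llc, coh;

	for (llc = 0; llc <= 1; llc++) {
		for (coh = 0; coh <= 1; coh++) {
			/* lmem ignores both has_llc and always_coherent */
			assert(model_coherent_map_type(llc, &lmem, coh) == MAP_WC);
			/* system memory is WB if either input is set */
			assert(model_coherent_map_type(llc, &smem, coh) ==
			       ((llc || coh) ? MAP_WB : MAP_WC));
		}
	}
}
```

In the patch, the call sites that previously hard-coded I915_MAP_WB (status page, GuC, HuC) are the ones passing always_coherent=true, so for non-lmem objects they keep getting WB regardless of LLC.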

>
> Regards,
>
> Tvrtko
>
> >   }
> >
> >   #endif
> > diff --git a/drivers/gpu/drm/i915/selftests/igt_spinner.c b/drivers/gpu/drm/i915/selftests/igt_spinner.c
> > index cfbbe415b57c..5fe397b7d1d9 100644
> > --- a/drivers/gpu/drm/i915/selftests/igt_spinner.c
> > +++ b/drivers/gpu/drm/i915/selftests/igt_spinner.c
> > @@ -94,9 +94,9 @@ int igt_spinner_pin(struct igt_spinner *spin,
> >       }
> >
> >       if (!spin->batch) {
> > -             unsigned int mode =
> > -                     i915_coherent_map_type(spin->gt->i915);
> > +             unsigned int mode;
> >
> > +             mode = i915_coherent_map_type(spin->gt->i915, spin->obj, false);
> >               vaddr = igt_spinner_pin_obj(ce, ww, spin->obj, mode, &spin->batch_vma);
> >               if (IS_ERR(vaddr))
> >                       return PTR_ERR(vaddr);
> >
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [Intel-gfx] [PATCH 11/19] drm/i915: Update the helper to set correct mapping
  2021-04-14 16:20     ` Matthew Auld
@ 2021-04-15  8:20       ` Tvrtko Ursulin
  2021-04-15  9:23         ` Matthew Auld
  0 siblings, 1 reply; 65+ messages in thread
From: Tvrtko Ursulin @ 2021-04-15  8:20 UTC (permalink / raw)
  To: Matthew Auld; +Cc: Intel Graphics Development, Matthew Auld, ML dri-devel


On 14/04/2021 17:20, Matthew Auld wrote:
> On Wed, 14 Apr 2021 at 16:22, Tvrtko Ursulin
> <tvrtko.ursulin@linux.intel.com> wrote:
>>
>>
>> On 12/04/2021 10:05, Matthew Auld wrote:
>>> From: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>
>>>
>>> Determine the possible coherent map type based on object location,
>>> and if target has llc or if user requires an always coherent
>>> mapping.
>>>
>>> Cc: Matthew Auld <matthew.auld@intel.com>
>>> Cc: CQ Tang <cq.tang@intel.com>
>>> Suggested-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
>>> Signed-off-by: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>
>>> ---
>>>    drivers/gpu/drm/i915/gt/intel_engine_cs.c    |  3 ++-
>>>    drivers/gpu/drm/i915/gt/intel_engine_pm.c    |  2 +-
>>>    drivers/gpu/drm/i915/gt/intel_lrc.c          |  4 +++-
>>>    drivers/gpu/drm/i915/gt/intel_ring.c         |  9 ++++++---
>>>    drivers/gpu/drm/i915/gt/selftest_context.c   |  3 ++-
>>>    drivers/gpu/drm/i915/gt/selftest_hangcheck.c |  4 ++--
>>>    drivers/gpu/drm/i915/gt/selftest_lrc.c       |  4 +++-
>>>    drivers/gpu/drm/i915/gt/uc/intel_guc.c       |  4 +++-
>>>    drivers/gpu/drm/i915/gt/uc/intel_huc.c       |  4 +++-
>>>    drivers/gpu/drm/i915/i915_drv.h              | 11 +++++++++--
>>>    drivers/gpu/drm/i915/selftests/igt_spinner.c |  4 ++--
>>>    11 files changed, 36 insertions(+), 16 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
>>> index efe935f80c1a..b79568d370f5 100644
>>> --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
>>> +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
>>> @@ -664,7 +664,8 @@ static int init_status_page(struct intel_engine_cs *engine)
>>>        if (ret)
>>>                goto err;
>>>
>>> -     vaddr = i915_gem_object_pin_map(obj, I915_MAP_WB);
>>> +     vaddr = i915_gem_object_pin_map(obj,
>>> +                                     i915_coherent_map_type(engine->i915, obj, true));
>>>        if (IS_ERR(vaddr)) {
>>>                ret = PTR_ERR(vaddr);
>>>                goto err_unpin;
>>> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
>>> index 7c9af86fdb1e..47f4397095e5 100644
>>> --- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c
>>> +++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
>>> @@ -23,7 +23,7 @@ static void dbg_poison_ce(struct intel_context *ce)
>>>
>>>        if (ce->state) {
>>>                struct drm_i915_gem_object *obj = ce->state->obj;
>>> -             int type = i915_coherent_map_type(ce->engine->i915);
>>> +             int type = i915_coherent_map_type(ce->engine->i915, obj, true);
>>>                void *map;
>>>
>>>                if (!i915_gem_object_trylock(obj))
>>> diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
>>> index e86897cde984..aafe2a4df496 100644
>>> --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
>>> +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
>>> @@ -903,7 +903,9 @@ lrc_pre_pin(struct intel_context *ce,
>>>        GEM_BUG_ON(!i915_vma_is_pinned(ce->state));
>>>
>>>        *vaddr = i915_gem_object_pin_map(ce->state->obj,
>>> -                                      i915_coherent_map_type(ce->engine->i915) |
>>> +                                      i915_coherent_map_type(ce->engine->i915,
>>> +                                                             ce->state->obj,
>>> +                                                             false) |
>>>                                         I915_MAP_OVERRIDE);
>>>
>>>        return PTR_ERR_OR_ZERO(*vaddr);
>>> diff --git a/drivers/gpu/drm/i915/gt/intel_ring.c b/drivers/gpu/drm/i915/gt/intel_ring.c
>>> index aee0a77c77e0..3cf6c7e68108 100644
>>> --- a/drivers/gpu/drm/i915/gt/intel_ring.c
>>> +++ b/drivers/gpu/drm/i915/gt/intel_ring.c
>>> @@ -53,9 +53,12 @@ int intel_ring_pin(struct intel_ring *ring, struct i915_gem_ww_ctx *ww)
>>>
>>>        if (i915_vma_is_map_and_fenceable(vma))
>>>                addr = (void __force *)i915_vma_pin_iomap(vma);
>>> -     else
>>> -             addr = i915_gem_object_pin_map(vma->obj,
>>> -                                            i915_coherent_map_type(vma->vm->i915));
>>> +     else {
>>> +             int type = i915_coherent_map_type(vma->vm->i915, vma->obj, false);
>>> +
>>> +             addr = i915_gem_object_pin_map(vma->obj, type);
>>> +     }
>>> +
>>>        if (IS_ERR(addr)) {
>>>                ret = PTR_ERR(addr);
>>>                goto err_ring;
>>> diff --git a/drivers/gpu/drm/i915/gt/selftest_context.c b/drivers/gpu/drm/i915/gt/selftest_context.c
>>> index b9bdd1d23243..26685b927169 100644
>>> --- a/drivers/gpu/drm/i915/gt/selftest_context.c
>>> +++ b/drivers/gpu/drm/i915/gt/selftest_context.c
>>> @@ -88,7 +88,8 @@ static int __live_context_size(struct intel_engine_cs *engine)
>>>                goto err;
>>>
>>>        vaddr = i915_gem_object_pin_map_unlocked(ce->state->obj,
>>> -                                              i915_coherent_map_type(engine->i915));
>>> +                                              i915_coherent_map_type(engine->i915,
>>> +                                                                     ce->state->obj, false));
>>>        if (IS_ERR(vaddr)) {
>>>                err = PTR_ERR(vaddr);
>>>                intel_context_unpin(ce);
>>> diff --git a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
>>> index 746985971c3a..5b63d4df8c93 100644
>>> --- a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
>>> +++ b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
>>> @@ -69,7 +69,7 @@ static int hang_init(struct hang *h, struct intel_gt *gt)
>>>        h->seqno = memset(vaddr, 0xff, PAGE_SIZE);
>>>
>>>        vaddr = i915_gem_object_pin_map_unlocked(h->obj,
>>> -                                              i915_coherent_map_type(gt->i915));
>>> +                                              i915_coherent_map_type(gt->i915, h->obj, false));
>>>        if (IS_ERR(vaddr)) {
>>>                err = PTR_ERR(vaddr);
>>>                goto err_unpin_hws;
>>> @@ -130,7 +130,7 @@ hang_create_request(struct hang *h, struct intel_engine_cs *engine)
>>>                return ERR_CAST(obj);
>>>        }
>>>
>>> -     vaddr = i915_gem_object_pin_map_unlocked(obj, i915_coherent_map_type(gt->i915));
>>> +     vaddr = i915_gem_object_pin_map_unlocked(obj, i915_coherent_map_type(gt->i915, obj, false));
>>>        if (IS_ERR(vaddr)) {
>>>                i915_gem_object_put(obj);
>>>                i915_vm_put(vm);
>>> diff --git a/drivers/gpu/drm/i915/gt/selftest_lrc.c b/drivers/gpu/drm/i915/gt/selftest_lrc.c
>>> index 85e7df6a5123..d8f6623524e8 100644
>>> --- a/drivers/gpu/drm/i915/gt/selftest_lrc.c
>>> +++ b/drivers/gpu/drm/i915/gt/selftest_lrc.c
>>> @@ -1221,7 +1221,9 @@ static int compare_isolation(struct intel_engine_cs *engine,
>>>        }
>>>
>>>        lrc = i915_gem_object_pin_map_unlocked(ce->state->obj,
>>> -                                   i915_coherent_map_type(engine->i915));
>>> +                                            i915_coherent_map_type(engine->i915,
>>> +                                                                   ce->state->obj,
>>> +                                                                   false));
>>>        if (IS_ERR(lrc)) {
>>>                err = PTR_ERR(lrc);
>>>                goto err_B1;
>>> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.c b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
>>> index 78305b2ec89d..adae04c47aab 100644
>>> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc.c
>>> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
>>> @@ -682,7 +682,9 @@ int intel_guc_allocate_and_map_vma(struct intel_guc *guc, u32 size,
>>>        if (IS_ERR(vma))
>>>                return PTR_ERR(vma);
>>>
>>> -     vaddr = i915_gem_object_pin_map_unlocked(vma->obj, I915_MAP_WB);
>>> +     vaddr = i915_gem_object_pin_map_unlocked(vma->obj,
>>> +                                              i915_coherent_map_type(guc_to_gt(guc)->i915,
>>> +                                                                     vma->obj, true));
>>>        if (IS_ERR(vaddr)) {
>>>                i915_vma_unpin_and_release(&vma, 0);
>>>                return PTR_ERR(vaddr);
>>> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_huc.c b/drivers/gpu/drm/i915/gt/uc/intel_huc.c
>>> index 2126dd81ac38..56d2144dc6a0 100644
>>> --- a/drivers/gpu/drm/i915/gt/uc/intel_huc.c
>>> +++ b/drivers/gpu/drm/i915/gt/uc/intel_huc.c
>>> @@ -82,7 +82,9 @@ static int intel_huc_rsa_data_create(struct intel_huc *huc)
>>>        if (IS_ERR(vma))
>>>                return PTR_ERR(vma);
>>>
>>> -     vaddr = i915_gem_object_pin_map_unlocked(vma->obj, I915_MAP_WB);
>>> +     vaddr = i915_gem_object_pin_map_unlocked(vma->obj,
>>> +                                              i915_coherent_map_type(gt->i915,
>>> +                                                                     vma->obj, true));
>>>        if (IS_ERR(vaddr)) {
>>>                i915_vma_unpin_and_release(&vma, 0);
>>>                return PTR_ERR(vaddr);
>>> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
>>> index 69e43bf91a15..2abbc06712a4 100644
>>> --- a/drivers/gpu/drm/i915/i915_drv.h
>>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>>> @@ -78,6 +78,7 @@
>>>    #include "gem/i915_gem_context_types.h"
>>>    #include "gem/i915_gem_shrinker.h"
>>>    #include "gem/i915_gem_stolen.h"
>>> +#include "gem/i915_gem_lmem.h"
>>>
>>>    #include "gt/intel_engine.h"
>>>    #include "gt/intel_gt_types.h"
>>> @@ -1921,9 +1922,15 @@ static inline int intel_hws_csb_write_index(struct drm_i915_private *i915)
>>>    }
>>>
>>>    static inline enum i915_map_type
>>> -i915_coherent_map_type(struct drm_i915_private *i915)
>>> +i915_coherent_map_type(struct drm_i915_private *i915,
>>> +                    struct drm_i915_gem_object *obj, bool always_coherent)
>>>    {
>>> -     return HAS_LLC(i915) ? I915_MAP_WB : I915_MAP_WC;
>>> +     if (i915_gem_object_is_lmem(obj))
>>> +             return I915_MAP_WC;
>>> +     if (HAS_LLC(i915) || always_coherent)
>>> +             return I915_MAP_WB;
>>> +     else
>>> +             return I915_MAP_WC;
>>
>> Seems this patch is doing two things.
>>
>> First it is adding lmem support to this helper by always returning WC
>> for lmem objects.
>>
>> Secondly it is introducing an idea of "always coherent" in a helper
>> called i915_coherent_map_type. Could someone explain what is coherent vs
>> always coherent?
>>
>> And also, why is always coherent happy with WB? Sounds counter intuitive
>> to me.
> 
> All this does is try to keep the existing behaviour intact, whilst
> also ensuring that all lmem objects are mapped using only WC, no
> matter what. The always_coherent=true thing is for the existing places
> where we sometimes map the object using WB, without first considering
> whether the device has the fast shared LLC vs snooping. Yes, it's
> slightly ugly :)

Not fully following - if we had to write kerneldoc for the always_coherent
input argument - what would it say?

Regards,

Tvrtko
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [Intel-gfx] [PATCH 11/19] drm/i915: Update the helper to set correct mapping
  2021-04-15  8:20       ` Tvrtko Ursulin
@ 2021-04-15  9:23         ` Matthew Auld
  2021-04-15 11:05           ` Tvrtko Ursulin
  0 siblings, 1 reply; 65+ messages in thread
From: Matthew Auld @ 2021-04-15  9:23 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: Intel Graphics Development, Matthew Auld, ML dri-devel

On Thu, 15 Apr 2021 at 09:21, Tvrtko Ursulin
<tvrtko.ursulin@linux.intel.com> wrote:
>
>
> On 14/04/2021 17:20, Matthew Auld wrote:
> > On Wed, 14 Apr 2021 at 16:22, Tvrtko Ursulin
> > <tvrtko.ursulin@linux.intel.com> wrote:
> >>
> >>
> >> On 12/04/2021 10:05, Matthew Auld wrote:
> >>> From: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>
> >>>
> >>> Determine the possible coherent map type based on object location,
> >>> and if target has llc or if user requires an always coherent
> >>> mapping.
> >>>
> >>> Cc: Matthew Auld <matthew.auld@intel.com>
> >>> Cc: CQ Tang <cq.tang@intel.com>
> >>> Suggested-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
> >>> Signed-off-by: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>
> >>> ---
> >>>    drivers/gpu/drm/i915/gt/intel_engine_cs.c    |  3 ++-
> >>>    drivers/gpu/drm/i915/gt/intel_engine_pm.c    |  2 +-
> >>>    drivers/gpu/drm/i915/gt/intel_lrc.c          |  4 +++-
> >>>    drivers/gpu/drm/i915/gt/intel_ring.c         |  9 ++++++---
> >>>    drivers/gpu/drm/i915/gt/selftest_context.c   |  3 ++-
> >>>    drivers/gpu/drm/i915/gt/selftest_hangcheck.c |  4 ++--
> >>>    drivers/gpu/drm/i915/gt/selftest_lrc.c       |  4 +++-
> >>>    drivers/gpu/drm/i915/gt/uc/intel_guc.c       |  4 +++-
> >>>    drivers/gpu/drm/i915/gt/uc/intel_huc.c       |  4 +++-
> >>>    drivers/gpu/drm/i915/i915_drv.h              | 11 +++++++++--
> >>>    drivers/gpu/drm/i915/selftests/igt_spinner.c |  4 ++--
> >>>    11 files changed, 36 insertions(+), 16 deletions(-)
> >>>
> >>> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> >>> index efe935f80c1a..b79568d370f5 100644
> >>> --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> >>> +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> >>> @@ -664,7 +664,8 @@ static int init_status_page(struct intel_engine_cs *engine)
> >>>        if (ret)
> >>>                goto err;
> >>>
> >>> -     vaddr = i915_gem_object_pin_map(obj, I915_MAP_WB);
> >>> +     vaddr = i915_gem_object_pin_map(obj,
> >>> +                                     i915_coherent_map_type(engine->i915, obj, true));
> >>>        if (IS_ERR(vaddr)) {
> >>>                ret = PTR_ERR(vaddr);
> >>>                goto err_unpin;
> >>> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
> >>> index 7c9af86fdb1e..47f4397095e5 100644
> >>> --- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c
> >>> +++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
> >>> @@ -23,7 +23,7 @@ static void dbg_poison_ce(struct intel_context *ce)
> >>>
> >>>        if (ce->state) {
> >>>                struct drm_i915_gem_object *obj = ce->state->obj;
> >>> -             int type = i915_coherent_map_type(ce->engine->i915);
> >>> +             int type = i915_coherent_map_type(ce->engine->i915, obj, true);
> >>>                void *map;
> >>>
> >>>                if (!i915_gem_object_trylock(obj))
> >>> diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
> >>> index e86897cde984..aafe2a4df496 100644
> >>> --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
> >>> +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
> >>> @@ -903,7 +903,9 @@ lrc_pre_pin(struct intel_context *ce,
> >>>        GEM_BUG_ON(!i915_vma_is_pinned(ce->state));
> >>>
> >>>        *vaddr = i915_gem_object_pin_map(ce->state->obj,
> >>> -                                      i915_coherent_map_type(ce->engine->i915) |
> >>> +                                      i915_coherent_map_type(ce->engine->i915,
> >>> +                                                             ce->state->obj,
> >>> +                                                             false) |
> >>>                                         I915_MAP_OVERRIDE);
> >>>
> >>>        return PTR_ERR_OR_ZERO(*vaddr);
> >>> diff --git a/drivers/gpu/drm/i915/gt/intel_ring.c b/drivers/gpu/drm/i915/gt/intel_ring.c
> >>> index aee0a77c77e0..3cf6c7e68108 100644
> >>> --- a/drivers/gpu/drm/i915/gt/intel_ring.c
> >>> +++ b/drivers/gpu/drm/i915/gt/intel_ring.c
> >>> @@ -53,9 +53,12 @@ int intel_ring_pin(struct intel_ring *ring, struct i915_gem_ww_ctx *ww)
> >>>
> >>>        if (i915_vma_is_map_and_fenceable(vma))
> >>>                addr = (void __force *)i915_vma_pin_iomap(vma);
> >>> -     else
> >>> -             addr = i915_gem_object_pin_map(vma->obj,
> >>> -                                            i915_coherent_map_type(vma->vm->i915));
> >>> +     else {
> >>> +             int type = i915_coherent_map_type(vma->vm->i915, vma->obj, false);
> >>> +
> >>> +             addr = i915_gem_object_pin_map(vma->obj, type);
> >>> +     }
> >>> +
> >>>        if (IS_ERR(addr)) {
> >>>                ret = PTR_ERR(addr);
> >>>                goto err_ring;
> >>> diff --git a/drivers/gpu/drm/i915/gt/selftest_context.c b/drivers/gpu/drm/i915/gt/selftest_context.c
> >>> index b9bdd1d23243..26685b927169 100644
> >>> --- a/drivers/gpu/drm/i915/gt/selftest_context.c
> >>> +++ b/drivers/gpu/drm/i915/gt/selftest_context.c
> >>> @@ -88,7 +88,8 @@ static int __live_context_size(struct intel_engine_cs *engine)
> >>>                goto err;
> >>>
> >>>        vaddr = i915_gem_object_pin_map_unlocked(ce->state->obj,
> >>> -                                              i915_coherent_map_type(engine->i915));
> >>> +                                              i915_coherent_map_type(engine->i915,
> >>> +                                                                     ce->state->obj, false));
> >>>        if (IS_ERR(vaddr)) {
> >>>                err = PTR_ERR(vaddr);
> >>>                intel_context_unpin(ce);
> >>> diff --git a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
> >>> index 746985971c3a..5b63d4df8c93 100644
> >>> --- a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
> >>> +++ b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
> >>> @@ -69,7 +69,7 @@ static int hang_init(struct hang *h, struct intel_gt *gt)
> >>>        h->seqno = memset(vaddr, 0xff, PAGE_SIZE);
> >>>
> >>>        vaddr = i915_gem_object_pin_map_unlocked(h->obj,
> >>> -                                              i915_coherent_map_type(gt->i915));
> >>> +                                              i915_coherent_map_type(gt->i915, h->obj, false));
> >>>        if (IS_ERR(vaddr)) {
> >>>                err = PTR_ERR(vaddr);
> >>>                goto err_unpin_hws;
> >>> @@ -130,7 +130,7 @@ hang_create_request(struct hang *h, struct intel_engine_cs *engine)
> >>>                return ERR_CAST(obj);
> >>>        }
> >>>
> >>> -     vaddr = i915_gem_object_pin_map_unlocked(obj, i915_coherent_map_type(gt->i915));
> >>> +     vaddr = i915_gem_object_pin_map_unlocked(obj, i915_coherent_map_type(gt->i915, obj, false));
> >>>        if (IS_ERR(vaddr)) {
> >>>                i915_gem_object_put(obj);
> >>>                i915_vm_put(vm);
> >>> diff --git a/drivers/gpu/drm/i915/gt/selftest_lrc.c b/drivers/gpu/drm/i915/gt/selftest_lrc.c
> >>> index 85e7df6a5123..d8f6623524e8 100644
> >>> --- a/drivers/gpu/drm/i915/gt/selftest_lrc.c
> >>> +++ b/drivers/gpu/drm/i915/gt/selftest_lrc.c
> >>> @@ -1221,7 +1221,9 @@ static int compare_isolation(struct intel_engine_cs *engine,
> >>>        }
> >>>
> >>>        lrc = i915_gem_object_pin_map_unlocked(ce->state->obj,
> >>> -                                   i915_coherent_map_type(engine->i915));
> >>> +                                            i915_coherent_map_type(engine->i915,
> >>> +                                                                   ce->state->obj,
> >>> +                                                                   false));
> >>>        if (IS_ERR(lrc)) {
> >>>                err = PTR_ERR(lrc);
> >>>                goto err_B1;
> >>> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.c b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
> >>> index 78305b2ec89d..adae04c47aab 100644
> >>> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc.c
> >>> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
> >>> @@ -682,7 +682,9 @@ int intel_guc_allocate_and_map_vma(struct intel_guc *guc, u32 size,
> >>>        if (IS_ERR(vma))
> >>>                return PTR_ERR(vma);
> >>>
> >>> -     vaddr = i915_gem_object_pin_map_unlocked(vma->obj, I915_MAP_WB);
> >>> +     vaddr = i915_gem_object_pin_map_unlocked(vma->obj,
> >>> +                                              i915_coherent_map_type(guc_to_gt(guc)->i915,
> >>> +                                                                     vma->obj, true));
> >>>        if (IS_ERR(vaddr)) {
> >>>                i915_vma_unpin_and_release(&vma, 0);
> >>>                return PTR_ERR(vaddr);
> >>> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_huc.c b/drivers/gpu/drm/i915/gt/uc/intel_huc.c
> >>> index 2126dd81ac38..56d2144dc6a0 100644
> >>> --- a/drivers/gpu/drm/i915/gt/uc/intel_huc.c
> >>> +++ b/drivers/gpu/drm/i915/gt/uc/intel_huc.c
> >>> @@ -82,7 +82,9 @@ static int intel_huc_rsa_data_create(struct intel_huc *huc)
> >>>        if (IS_ERR(vma))
> >>>                return PTR_ERR(vma);
> >>>
> >>> -     vaddr = i915_gem_object_pin_map_unlocked(vma->obj, I915_MAP_WB);
> >>> +     vaddr = i915_gem_object_pin_map_unlocked(vma->obj,
> >>> +                                              i915_coherent_map_type(gt->i915,
> >>> +                                                                     vma->obj, true));
> >>>        if (IS_ERR(vaddr)) {
> >>>                i915_vma_unpin_and_release(&vma, 0);
> >>>                return PTR_ERR(vaddr);
> >>> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> >>> index 69e43bf91a15..2abbc06712a4 100644
> >>> --- a/drivers/gpu/drm/i915/i915_drv.h
> >>> +++ b/drivers/gpu/drm/i915/i915_drv.h
> >>> @@ -78,6 +78,7 @@
> >>>    #include "gem/i915_gem_context_types.h"
> >>>    #include "gem/i915_gem_shrinker.h"
> >>>    #include "gem/i915_gem_stolen.h"
> >>> +#include "gem/i915_gem_lmem.h"
> >>>
> >>>    #include "gt/intel_engine.h"
> >>>    #include "gt/intel_gt_types.h"
> >>> @@ -1921,9 +1922,15 @@ static inline int intel_hws_csb_write_index(struct drm_i915_private *i915)
> >>>    }
> >>>
> >>>    static inline enum i915_map_type
> >>> -i915_coherent_map_type(struct drm_i915_private *i915)
> >>> +i915_coherent_map_type(struct drm_i915_private *i915,
> >>> +                    struct drm_i915_gem_object *obj, bool always_coherent)
> >>>    {
> >>> -     return HAS_LLC(i915) ? I915_MAP_WB : I915_MAP_WC;
> >>> +     if (i915_gem_object_is_lmem(obj))
> >>> +             return I915_MAP_WC;
> >>> +     if (HAS_LLC(i915) || always_coherent)
> >>> +             return I915_MAP_WB;
> >>> +     else
> >>> +             return I915_MAP_WC;
> >>
> >> Seems this patch is doing two things.
> >>
> >> First it is adding lmem support to this helper by always returning WC
> >> for lmem objects.
> >>
> >> Secondly it is introducing an idea of "always coherent" in a helper
> >> called i915_coherent_map_type. Could someone explain what is coherent vs
> >> always coherent?
> >>
> >> And also, why is always coherent happy with WB? Sounds counter intuitive
> >> to me.
> >
> > All this does is try to keep the existing behaviour intact, whilst
> > also ensuring that all lmem objects are mapped using only WC, no
> > matter what. The always_coherent=true thing is for the existing places
> > where we sometimes map the object using WB, without first considering
> > whether the device has the fast shared LLC vs snooping. Yes, it's
> > slightly ugly :)
>
> Not fully following - if we had to write kerneldoc for always_coherent
> input argument - what it would say?

@always_coherent - If true we should always try to map the object
using WB. If false we should only map as WB if the device supports the
fast shared LLC; in the case of snooped devices we will map using WC.
Note that if the resource is lmem then we will always map as WC,
regardless of the value of always_coherent, since that's all we
currently support.
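
As a minimal standalone sketch of that decision table (the enum and the
three boolean inputs are stand-ins invented for illustration, replacing
the real i915 types, HAS_LLC() and i915_gem_object_is_lmem(); it mirrors
the helper in the patch but is not the driver code itself):

```c
#include <stdbool.h>

/* Hypothetical stand-in for enum i915_map_type. */
enum sketch_map_type { SKETCH_MAP_WB, SKETCH_MAP_WC };

/*
 * Mirrors the logic of the patched i915_coherent_map_type():
 *  - lmem objects are always mapped WC, whatever the caller asks for;
 *  - otherwise WB is used if the device has LLC or the caller passes
 *    always_coherent=true;
 *  - otherwise (snooped device, always_coherent=false) fall back to WC.
 */
static enum sketch_map_type
sketch_coherent_map_type(bool has_llc, bool is_lmem, bool always_coherent)
{
	if (is_lmem)
		return SKETCH_MAP_WC;
	if (has_llc || always_coherent)
		return SKETCH_MAP_WB;
	return SKETCH_MAP_WC;
}
```

So the only case where always_coherent changes the result is a system
memory object on a non-LLC device: true gives WB, false gives WC.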

Maybe the naming is poor?

>
> Regards,
>
> Tvrtko

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [Intel-gfx] [PATCH 11/19] drm/i915: Update the helper to set correct mapping
  2021-04-15  9:23         ` Matthew Auld
@ 2021-04-15 11:05           ` Tvrtko Ursulin
  2021-04-19 11:30             ` Matthew Auld
  0 siblings, 1 reply; 65+ messages in thread
From: Tvrtko Ursulin @ 2021-04-15 11:05 UTC (permalink / raw)
  To: Matthew Auld; +Cc: Intel Graphics Development, Matthew Auld, ML dri-devel


On 15/04/2021 10:23, Matthew Auld wrote:
> On Thu, 15 Apr 2021 at 09:21, Tvrtko Ursulin
> <tvrtko.ursulin@linux.intel.com> wrote:
>>
>>
>> On 14/04/2021 17:20, Matthew Auld wrote:
>>> On Wed, 14 Apr 2021 at 16:22, Tvrtko Ursulin
>>> <tvrtko.ursulin@linux.intel.com> wrote:
>>>>
>>>>
>>>> On 12/04/2021 10:05, Matthew Auld wrote:
>>>>> From: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>
>>>>>
>>>>> Determine the possible coherent map type based on object location,
>>>>> and if target has llc or if user requires an always coherent
>>>>> mapping.
>>>>>
>>>>> Cc: Matthew Auld <matthew.auld@intel.com>
>>>>> Cc: CQ Tang <cq.tang@intel.com>
>>>>> Suggested-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
>>>>> Signed-off-by: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>
>>>>> ---
>>>>>     drivers/gpu/drm/i915/gt/intel_engine_cs.c    |  3 ++-
>>>>>     drivers/gpu/drm/i915/gt/intel_engine_pm.c    |  2 +-
>>>>>     drivers/gpu/drm/i915/gt/intel_lrc.c          |  4 +++-
>>>>>     drivers/gpu/drm/i915/gt/intel_ring.c         |  9 ++++++---
>>>>>     drivers/gpu/drm/i915/gt/selftest_context.c   |  3 ++-
>>>>>     drivers/gpu/drm/i915/gt/selftest_hangcheck.c |  4 ++--
>>>>>     drivers/gpu/drm/i915/gt/selftest_lrc.c       |  4 +++-
>>>>>     drivers/gpu/drm/i915/gt/uc/intel_guc.c       |  4 +++-
>>>>>     drivers/gpu/drm/i915/gt/uc/intel_huc.c       |  4 +++-
>>>>>     drivers/gpu/drm/i915/i915_drv.h              | 11 +++++++++--
>>>>>     drivers/gpu/drm/i915/selftests/igt_spinner.c |  4 ++--
>>>>>     11 files changed, 36 insertions(+), 16 deletions(-)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
>>>>> index efe935f80c1a..b79568d370f5 100644
>>>>> --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
>>>>> +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
>>>>> @@ -664,7 +664,8 @@ static int init_status_page(struct intel_engine_cs *engine)
>>>>>         if (ret)
>>>>>                 goto err;
>>>>>
>>>>> -     vaddr = i915_gem_object_pin_map(obj, I915_MAP_WB);
>>>>> +     vaddr = i915_gem_object_pin_map(obj,
>>>>> +                                     i915_coherent_map_type(engine->i915, obj, true));
>>>>>         if (IS_ERR(vaddr)) {
>>>>>                 ret = PTR_ERR(vaddr);
>>>>>                 goto err_unpin;
>>>>> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
>>>>> index 7c9af86fdb1e..47f4397095e5 100644
>>>>> --- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c
>>>>> +++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
>>>>> @@ -23,7 +23,7 @@ static void dbg_poison_ce(struct intel_context *ce)
>>>>>
>>>>>         if (ce->state) {
>>>>>                 struct drm_i915_gem_object *obj = ce->state->obj;
>>>>> -             int type = i915_coherent_map_type(ce->engine->i915);
>>>>> +             int type = i915_coherent_map_type(ce->engine->i915, obj, true);
>>>>>                 void *map;
>>>>>
>>>>>                 if (!i915_gem_object_trylock(obj))
>>>>> diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
>>>>> index e86897cde984..aafe2a4df496 100644
>>>>> --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
>>>>> +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
>>>>> @@ -903,7 +903,9 @@ lrc_pre_pin(struct intel_context *ce,
>>>>>         GEM_BUG_ON(!i915_vma_is_pinned(ce->state));
>>>>>
>>>>>         *vaddr = i915_gem_object_pin_map(ce->state->obj,
>>>>> -                                      i915_coherent_map_type(ce->engine->i915) |
>>>>> +                                      i915_coherent_map_type(ce->engine->i915,
>>>>> +                                                             ce->state->obj,
>>>>> +                                                             false) |
>>>>>                                          I915_MAP_OVERRIDE);
>>>>>
>>>>>         return PTR_ERR_OR_ZERO(*vaddr);
>>>>> diff --git a/drivers/gpu/drm/i915/gt/intel_ring.c b/drivers/gpu/drm/i915/gt/intel_ring.c
>>>>> index aee0a77c77e0..3cf6c7e68108 100644
>>>>> --- a/drivers/gpu/drm/i915/gt/intel_ring.c
>>>>> +++ b/drivers/gpu/drm/i915/gt/intel_ring.c
>>>>> @@ -53,9 +53,12 @@ int intel_ring_pin(struct intel_ring *ring, struct i915_gem_ww_ctx *ww)
>>>>>
>>>>>         if (i915_vma_is_map_and_fenceable(vma))
>>>>>                 addr = (void __force *)i915_vma_pin_iomap(vma);
>>>>> -     else
>>>>> -             addr = i915_gem_object_pin_map(vma->obj,
>>>>> -                                            i915_coherent_map_type(vma->vm->i915));
>>>>> +     else {
>>>>> +             int type = i915_coherent_map_type(vma->vm->i915, vma->obj, false);
>>>>> +
>>>>> +             addr = i915_gem_object_pin_map(vma->obj, type);
>>>>> +     }
>>>>> +
>>>>>         if (IS_ERR(addr)) {
>>>>>                 ret = PTR_ERR(addr);
>>>>>                 goto err_ring;
>>>>> diff --git a/drivers/gpu/drm/i915/gt/selftest_context.c b/drivers/gpu/drm/i915/gt/selftest_context.c
>>>>> index b9bdd1d23243..26685b927169 100644
>>>>> --- a/drivers/gpu/drm/i915/gt/selftest_context.c
>>>>> +++ b/drivers/gpu/drm/i915/gt/selftest_context.c
>>>>> @@ -88,7 +88,8 @@ static int __live_context_size(struct intel_engine_cs *engine)
>>>>>                 goto err;
>>>>>
>>>>>         vaddr = i915_gem_object_pin_map_unlocked(ce->state->obj,
>>>>> -                                              i915_coherent_map_type(engine->i915));
>>>>> +                                              i915_coherent_map_type(engine->i915,
>>>>> +                                                                     ce->state->obj, false));
>>>>>         if (IS_ERR(vaddr)) {
>>>>>                 err = PTR_ERR(vaddr);
>>>>>                 intel_context_unpin(ce);
>>>>> diff --git a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
>>>>> index 746985971c3a..5b63d4df8c93 100644
>>>>> --- a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
>>>>> +++ b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
>>>>> @@ -69,7 +69,7 @@ static int hang_init(struct hang *h, struct intel_gt *gt)
>>>>>         h->seqno = memset(vaddr, 0xff, PAGE_SIZE);
>>>>>
>>>>>         vaddr = i915_gem_object_pin_map_unlocked(h->obj,
>>>>> -                                              i915_coherent_map_type(gt->i915));
>>>>> +                                              i915_coherent_map_type(gt->i915, h->obj, false));
>>>>>         if (IS_ERR(vaddr)) {
>>>>>                 err = PTR_ERR(vaddr);
>>>>>                 goto err_unpin_hws;
>>>>> @@ -130,7 +130,7 @@ hang_create_request(struct hang *h, struct intel_engine_cs *engine)
>>>>>                 return ERR_CAST(obj);
>>>>>         }
>>>>>
>>>>> -     vaddr = i915_gem_object_pin_map_unlocked(obj, i915_coherent_map_type(gt->i915));
>>>>> +     vaddr = i915_gem_object_pin_map_unlocked(obj, i915_coherent_map_type(gt->i915, obj, false));
>>>>>         if (IS_ERR(vaddr)) {
>>>>>                 i915_gem_object_put(obj);
>>>>>                 i915_vm_put(vm);
>>>>> diff --git a/drivers/gpu/drm/i915/gt/selftest_lrc.c b/drivers/gpu/drm/i915/gt/selftest_lrc.c
>>>>> index 85e7df6a5123..d8f6623524e8 100644
>>>>> --- a/drivers/gpu/drm/i915/gt/selftest_lrc.c
>>>>> +++ b/drivers/gpu/drm/i915/gt/selftest_lrc.c
>>>>> @@ -1221,7 +1221,9 @@ static int compare_isolation(struct intel_engine_cs *engine,
>>>>>         }
>>>>>
>>>>>         lrc = i915_gem_object_pin_map_unlocked(ce->state->obj,
>>>>> -                                   i915_coherent_map_type(engine->i915));
>>>>> +                                            i915_coherent_map_type(engine->i915,
>>>>> +                                                                   ce->state->obj,
>>>>> +                                                                   false));
>>>>>         if (IS_ERR(lrc)) {
>>>>>                 err = PTR_ERR(lrc);
>>>>>                 goto err_B1;
>>>>> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.c b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
>>>>> index 78305b2ec89d..adae04c47aab 100644
>>>>> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc.c
>>>>> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
>>>>> @@ -682,7 +682,9 @@ int intel_guc_allocate_and_map_vma(struct intel_guc *guc, u32 size,
>>>>>         if (IS_ERR(vma))
>>>>>                 return PTR_ERR(vma);
>>>>>
>>>>> -     vaddr = i915_gem_object_pin_map_unlocked(vma->obj, I915_MAP_WB);
>>>>> +     vaddr = i915_gem_object_pin_map_unlocked(vma->obj,
>>>>> +                                              i915_coherent_map_type(guc_to_gt(guc)->i915,
>>>>> +                                                                     vma->obj, true));
>>>>>         if (IS_ERR(vaddr)) {
>>>>>                 i915_vma_unpin_and_release(&vma, 0);
>>>>>                 return PTR_ERR(vaddr);
>>>>> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_huc.c b/drivers/gpu/drm/i915/gt/uc/intel_huc.c
>>>>> index 2126dd81ac38..56d2144dc6a0 100644
>>>>> --- a/drivers/gpu/drm/i915/gt/uc/intel_huc.c
>>>>> +++ b/drivers/gpu/drm/i915/gt/uc/intel_huc.c
>>>>> @@ -82,7 +82,9 @@ static int intel_huc_rsa_data_create(struct intel_huc *huc)
>>>>>         if (IS_ERR(vma))
>>>>>                 return PTR_ERR(vma);
>>>>>
>>>>> -     vaddr = i915_gem_object_pin_map_unlocked(vma->obj, I915_MAP_WB);
>>>>> +     vaddr = i915_gem_object_pin_map_unlocked(vma->obj,
>>>>> +                                              i915_coherent_map_type(gt->i915,
>>>>> +                                                                     vma->obj, true));
>>>>>         if (IS_ERR(vaddr)) {
>>>>>                 i915_vma_unpin_and_release(&vma, 0);
>>>>>                 return PTR_ERR(vaddr);
>>>>> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
>>>>> index 69e43bf91a15..2abbc06712a4 100644
>>>>> --- a/drivers/gpu/drm/i915/i915_drv.h
>>>>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>>>>> @@ -78,6 +78,7 @@
>>>>>     #include "gem/i915_gem_context_types.h"
>>>>>     #include "gem/i915_gem_shrinker.h"
>>>>>     #include "gem/i915_gem_stolen.h"
>>>>> +#include "gem/i915_gem_lmem.h"
>>>>>
>>>>>     #include "gt/intel_engine.h"
>>>>>     #include "gt/intel_gt_types.h"
>>>>> @@ -1921,9 +1922,15 @@ static inline int intel_hws_csb_write_index(struct drm_i915_private *i915)
>>>>>     }
>>>>>
>>>>>     static inline enum i915_map_type
>>>>> -i915_coherent_map_type(struct drm_i915_private *i915)
>>>>> +i915_coherent_map_type(struct drm_i915_private *i915,
>>>>> +                    struct drm_i915_gem_object *obj, bool always_coherent)
>>>>>     {
>>>>> -     return HAS_LLC(i915) ? I915_MAP_WB : I915_MAP_WC;
>>>>> +     if (i915_gem_object_is_lmem(obj))
>>>>> +             return I915_MAP_WC;
>>>>> +     if (HAS_LLC(i915) || always_coherent)
>>>>> +             return I915_MAP_WB;
>>>>> +     else
>>>>> +             return I915_MAP_WC;
>>>>
>>>> Seems this patch is doing two things.
>>>>
>>>> First it is adding lmem support to this helper by always returning WC
>>>> for lmem objects.
>>>>
>>>> Secondly it is introducing an idea of "always coherent" in a helper
>>>> called i915_coherent_map_type. Could someone explain what is coherent vs
>>>> always coherent?
>>>>
>>>> And also, why is always coherent happy with WB? Sounds counter intuitive
>>>> to me.
>>>
>>> All this does is try to keep the existing behaviour intact, whilst
>>> also ensuring that all lmem objects are mapped using only WC, no
>>> matter what. The always_coherent=true thing is for the existing places
>>> where we sometimes map the object using WB, without first considering
>>> whether the device has the fast shared LLC vs snooping. Yes, it's
>>> slightly ugly :)
>>
>> Not fully following - if we had to write kerneldoc for always_coherent
>> input argument - what it would say?
> 
> @always_coherent - If true we should always try to map the object
> using WB. If false we should only map as WB if the device supports the
> fast shared LLC, in the case of snooped devices we will map use WC.
> Note that If the resource is lmem then we will always map as WC,
> regardless of the value of always_coherent, since that's all we
> currently support.
> 
> Maybe the naming is poor?

Maybe just confusing to me, not sure yet.

So always_coherent is not about how the caller wants to use it, but
about platform knowledge? Or a performance concern for LLC vs snooping
cases? Does WB work (coherently) on snooping platforms?

Regards,

Tvrtko

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [Intel-gfx] [PATCH 06/19] drm/i915/stolen: pass the allocation flags
  2021-04-14 15:09   ` [Intel-gfx] " Tvrtko Ursulin
@ 2021-04-16 13:53     ` Matthew Auld
  0 siblings, 0 replies; 65+ messages in thread
From: Matthew Auld @ 2021-04-16 13:53 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx; +Cc: dri-devel

On 14/04/2021 16:09, Tvrtko Ursulin wrote:
> 
> On 12/04/2021 10:05, Matthew Auld wrote:
>> From: CQ Tang <cq.tang@intel.com>
>>
>> Stolen memory is always allocated as physically contiguous pages, mark
>> the object flags as such.
>>
>> Signed-off-by: CQ Tang <cq.tang@intel.com>
>> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
>> ---
>>   drivers/gpu/drm/i915/gem/i915_gem_stolen.c | 10 ++++++----
>>   1 file changed, 6 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_stolen.c 
>> b/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
>> index f713eabb7671..49a2dfcc8ba7 100644
>> --- a/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
>> @@ -633,14 +633,15 @@ static const struct drm_i915_gem_object_ops 
>> i915_gem_object_stolen_ops = {
>>   static int __i915_gem_object_create_stolen(struct 
>> intel_memory_region *mem,
>>                          struct drm_i915_gem_object *obj,
>> -                       struct drm_mm_node *stolen)
>> +                       struct drm_mm_node *stolen,
>> +                       unsigned int flags)
>>   {
>>       static struct lock_class_key lock_class;
>>       unsigned int cache_level;
>>       int err;
>>       drm_gem_private_object_init(&mem->i915->drm, &obj->base, 
>> stolen->size);
>> -    i915_gem_object_init(obj, &i915_gem_object_stolen_ops, 
>> &lock_class, 0);
>> +    i915_gem_object_init(obj, &i915_gem_object_stolen_ops, 
>> &lock_class, flags);
>>       obj->stolen = stolen;
>>       obj->read_domains = I915_GEM_DOMAIN_CPU | I915_GEM_DOMAIN_GTT;
>> @@ -682,7 +683,7 @@ static int _i915_gem_object_stolen_init(struct 
>> intel_memory_region *mem,
>>       if (ret)
>>           goto err_free;
>> -    ret = __i915_gem_object_create_stolen(mem, obj, stolen);
>> +    ret = __i915_gem_object_create_stolen(mem, obj, stolen, flags);
>>       if (ret)
>>           goto err_remove;
>> @@ -840,7 +841,8 @@ 
>> i915_gem_object_create_stolen_for_preallocated(struct drm_i915_private 
>> *i915,
>>           goto err_stolen;
>>       }
>> -    ret = __i915_gem_object_create_stolen(mem, obj, stolen);
>> +    ret = __i915_gem_object_create_stolen(mem, obj, stolen,
>> +                          I915_BO_ALLOC_CONTIGUOUS);
>>       if (ret)
>>           goto err_object_free;
>>
> 
> Are all stolen objects always contiguous or only ones allocated by 
> i915_gem_object_create_stolen_for_preallocated? If former should 
> __i915_gem_object_create_stolen just set the flag without the need to 
> pass it in?

Yes, all stolen objects are physically contiguous. Agreed, moving the
I915_BO_ALLOC_CONTIGUOUS into __i915_gem_object_create_stolen() makes
more sense here.
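
Roughly the shape that would take, as a standalone sketch only: the
struct, the flag value and the function name below are hypothetical
stand-ins, not the real i915 definitions, and this is not the respun
patch.

```c
/* Hypothetical stand-in for I915_BO_ALLOC_CONTIGUOUS. */
#define SKETCH_BO_ALLOC_CONTIGUOUS (1u << 0)

/* Hypothetical stand-in for struct drm_i915_gem_object. */
struct sketch_obj {
	unsigned int flags;
};

/*
 * Models __i915_gem_object_create_stolen() with the flag set inside the
 * helper: callers no longer pass flags, since every stolen object is
 * physically contiguous.
 */
static int sketch_create_stolen(struct sketch_obj *obj)
{
	obj->flags = SKETCH_BO_ALLOC_CONTIGUOUS;
	return 0;
}
```

Both call sites would then go back to the three-argument form, with no
flags to thread through.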

> 
> Regards,
> 
> Tvrtko

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [Intel-gfx] [PATCH 12/19] drm/i915/lmem: Bypass aperture when lmem is available
  2021-04-14 15:33   ` [Intel-gfx] " Tvrtko Ursulin
@ 2021-04-16 14:25     ` Matthew Auld
  2021-04-19 14:16       ` Tvrtko Ursulin
  0 siblings, 1 reply; 65+ messages in thread
From: Matthew Auld @ 2021-04-16 14:25 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx
  Cc: Daniel Vetter, dri-devel, Chris P Wilson, Dhinakaran Pandiyan

On 14/04/2021 16:33, Tvrtko Ursulin wrote:
> 
> On 12/04/2021 10:05, Matthew Auld wrote:
>> From: Anusha Srivatsa <anusha.srivatsa@intel.com>
>>
>> In the scenario where local memory is available, we have to
>> rely on CPU access via lmem directly instead of aperture.
>>
>> v2:
>> gmch is only relevant for much older hw, therefore we can drop the
>> has_aperture check since it should always be present on such platforms.
>> (Chris)
>>
>> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
>> Cc: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
>> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
>> Cc: Chris P Wilson <chris.p.wilson@intel.com>
>> Cc: Daniel Vetter <daniel.vetter@intel.com>
>> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
>> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
>> Cc: CQ Tang <cq.tang@intel.com>
>> Signed-off-by: Anusha Srivatsa <anusha.srivatsa@intel.com>
>> ---
>>   drivers/gpu/drm/i915/display/intel_fbdev.c | 22 +++++++++++++++-------
>>   drivers/gpu/drm/i915/gem/i915_gem_lmem.c   | 15 +++++++++++++++
>>   drivers/gpu/drm/i915/gem/i915_gem_lmem.h   |  5 +++++
>>   drivers/gpu/drm/i915/i915_vma.c            | 19 +++++++++++++------
>>   4 files changed, 48 insertions(+), 13 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/display/intel_fbdev.c 
>> b/drivers/gpu/drm/i915/display/intel_fbdev.c
>> index 2b37959da747..4af40229f5ec 100644
>> --- a/drivers/gpu/drm/i915/display/intel_fbdev.c
>> +++ b/drivers/gpu/drm/i915/display/intel_fbdev.c
>> @@ -139,14 +139,22 @@ static int intelfb_alloc(struct drm_fb_helper 
>> *helper,
>>       size = mode_cmd.pitches[0] * mode_cmd.height;
>>       size = PAGE_ALIGN(size);
>> -    /* If the FB is too big, just don't use it since fbdev is not very
>> -     * important and we should probably use that space with FBC or other
>> -     * features. */
>>       obj = ERR_PTR(-ENODEV);
>> -    if (size * 2 < dev_priv->stolen_usable_size)
>> -        obj = i915_gem_object_create_stolen(dev_priv, size);
>> -    if (IS_ERR(obj))
>> -        obj = i915_gem_object_create_shmem(dev_priv, size);
>> +    if (HAS_LMEM(dev_priv)) {
>> +        obj = i915_gem_object_create_lmem(dev_priv, size,
>> +                          I915_BO_ALLOC_CONTIGUOUS);
> 
> Has to be contiguous? Question for display experts I guess.
> 
> [Comes back later.] Ah for iomap? Put a comment to that effect perhaps?

I don't think it has to be, since we could in theory just use pin_map() 
underneath, which can already deal with non-contiguous chunks of lmem, 
although that might bring in ww locking. I think for now we just add a 
comment, mark this as XXX, and potentially revisit it as a follow-up?

> 
>> +    } else {
>> +        /*
>> +         * If the FB is too big, just don't use it since fbdev is not very
>> +         * important and we should probably use that space with FBC or other
>> +         * features.
>> +         */
>> +        if (size * 2 < dev_priv->stolen_usable_size)
>> +            obj = i915_gem_object_create_stolen(dev_priv, size);
>> +        if (IS_ERR(obj))
>> +            obj = i915_gem_object_create_shmem(dev_priv, size);
>> +    }
> 
> Could we keep the IS_ERR ordered allocation order to save having to 
> re-indent? Bike shed so optional..
> 
>> +
>>       if (IS_ERR(obj)) {
>>           drm_err(&dev_priv->drm, "failed to allocate framebuffer\n");
>>           return PTR_ERR(obj);
>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_lmem.c 
>> b/drivers/gpu/drm/i915/gem/i915_gem_lmem.c
>> index 017db8f71130..f44bdd08f7cb 100644
>> --- a/drivers/gpu/drm/i915/gem/i915_gem_lmem.c
>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_lmem.c
>> @@ -17,6 +17,21 @@ const struct drm_i915_gem_object_ops 
>> i915_gem_lmem_obj_ops = {
>>       .release = i915_gem_object_release_memory_region,
>>   };
>> +void __iomem *
>> +i915_gem_object_lmem_io_map(struct drm_i915_gem_object *obj,
>> +                unsigned long n,
>> +                unsigned long size)
>> +{
>> +    resource_size_t offset;
>> +
>> +    GEM_BUG_ON(!i915_gem_object_is_contiguous(obj));
>> +
>> +    offset = i915_gem_object_get_dma_address(obj, n);
>> +    offset -= obj->mm.region->region.start;
>> +
>> +    return io_mapping_map_wc(&obj->mm.region->iomap, offset, size);
>> +}
>> +
>>   bool i915_gem_object_is_lmem(struct drm_i915_gem_object *obj)
>>   {
>>       struct intel_memory_region *mr = obj->mm.region;
>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_lmem.h 
>> b/drivers/gpu/drm/i915/gem/i915_gem_lmem.h
>> index 036d53c01de9..fac6bc5a5ebb 100644
>> --- a/drivers/gpu/drm/i915/gem/i915_gem_lmem.h
>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_lmem.h
>> @@ -14,6 +14,11 @@ struct intel_memory_region;
>>   extern const struct drm_i915_gem_object_ops i915_gem_lmem_obj_ops;
>> +void __iomem *
>> +i915_gem_object_lmem_io_map(struct drm_i915_gem_object *obj,
>> +                unsigned long n,
>> +                unsigned long size);
>> +
>>   bool i915_gem_object_is_lmem(struct drm_i915_gem_object *obj);
>>   struct drm_i915_gem_object *
>> diff --git a/drivers/gpu/drm/i915/i915_vma.c 
>> b/drivers/gpu/drm/i915/i915_vma.c
>> index 07490db51cdc..e24d33aecac4 100644
>> --- a/drivers/gpu/drm/i915/i915_vma.c
>> +++ b/drivers/gpu/drm/i915/i915_vma.c
>> @@ -27,6 +27,7 @@
>>   #include "display/intel_frontbuffer.h"
>> +#include "gem/i915_gem_lmem.h"
>>   #include "gt/intel_engine.h"
>>   #include "gt/intel_engine_heartbeat.h"
>>   #include "gt/intel_gt.h"
>> @@ -448,9 +449,11 @@ void __iomem *i915_vma_pin_iomap(struct i915_vma 
>> *vma)
>>       void __iomem *ptr;
>>       int err;
>> -    if (GEM_WARN_ON(!i915_vma_is_map_and_fenceable(vma))) {
>> -        err = -ENODEV;
>> -        goto err;
>> +    if (!i915_gem_object_is_lmem(vma->obj)) {
>> +        if (GEM_WARN_ON(!i915_vma_is_map_and_fenceable(vma))) {
>> +            err = -ENODEV;
>> +            goto err;
>> +        }
>>       }
>>       GEM_BUG_ON(!i915_vma_is_ggtt(vma));
>> @@ -458,9 +461,13 @@ void __iomem *i915_vma_pin_iomap(struct i915_vma 
>> *vma)
>>       ptr = READ_ONCE(vma->iomap);
>>       if (ptr == NULL) {
>> -        ptr = io_mapping_map_wc(&i915_vm_to_ggtt(vma->vm)->iomap,
>> -                    vma->node.start,
>> -                    vma->node.size);
>> +        if (i915_gem_object_is_lmem(vma->obj))
>> +            ptr = i915_gem_object_lmem_io_map(vma->obj, 0,
>> +                              vma->obj->base.size);
> 
> Can the vma size be bigger than the object here? Given how below works 
> of vma->node.size.

I don't know tbh. But in general node.size can definitely be larger than 
vma->size/obj->base.size.

For the iomap version below, it's using the mappable aperture, which 
requires reserving a vma node into the mappable part of the GGTT first, 
so using node.size here makes sense, since the node reflects the window 
into the mappable aperture.

For the lmem case though that might be bogus, since the vma has no 
relationship with LMEM_BAR; it's really the object that does, hence why 
we use obj->base.size instead. Although really it might make more sense 
to use pin_map() instead for the lmem case, if it's possible.
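
To make the lmem side concrete, here is a minimal sketch of the 
arithmetic only. The struct and function names are hypothetical 
stand-ins, not driver code; the point is that both the offset and the 
length come from the object and its region, with no GGTT node involved:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a memory region. */
struct sketch_region {
	uint64_t start; /* device address of the first byte of the region */
	uint64_t size;  /* region length in bytes */
};

/*
 * Translate a device address inside the region into an offset into the
 * region's CPU-visible io mapping. The length to map is the object
 * size, so check that the object fits inside the region.
 */
static uint64_t sketch_lmem_io_offset(const struct sketch_region *mr,
				      uint64_t dma_addr, uint64_t obj_size)
{
	assert(dma_addr >= mr->start);
	assert(dma_addr - mr->start + obj_size <= mr->size);
	return dma_addr - mr->start;
}
```

This only works as a single mapping if the object is contiguous, which 
is what the GEM_BUG_ON in the patch is guarding.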

> 
>> +        else
>> +            ptr = io_mapping_map_wc(&i915_vm_to_ggtt(vma->vm)->iomap,
>> +                        vma->node.start,
>> +                        vma->node.size);
> 
> Looks a bit odd that this calls the same io_mapping_map_wc as 
> i915_gem_object_lmem_io_map ends up doing. Perhaps that suggests there 
> should be a single helper here but I am not sure what would be elegant.
> 
> Regards,
> 
> Tvrtko
> 
>>           if (ptr == NULL) {
>>               err = -ENOMEM;
>>               goto err;
>>

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [Intel-gfx] [PATCH 03/19] drm/i915: Create stolen memory region from local memory
  2021-04-14 15:01   ` [Intel-gfx] " Tvrtko Ursulin
@ 2021-04-16 15:04     ` Matthew Auld
  2021-04-19 14:15       ` Tvrtko Ursulin
  0 siblings, 1 reply; 65+ messages in thread
From: Matthew Auld @ 2021-04-16 15:04 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx; +Cc: dri-devel

On 14/04/2021 16:01, Tvrtko Ursulin wrote:
> 
> On 12/04/2021 10:05, Matthew Auld wrote:
>> From: CQ Tang <cq.tang@intel.com>
>>
>> Add "REGION_STOLEN" device info to dg1, create stolen memory
>> region from upper portion of local device memory, starting
>> from DSMBASE.
>>
>> v2:
>>      - s/drm_info/drm_dbg; userspace likely doesn't care about stolen.
>>      - mem->type is only setup after the region probe, so setting the name
>>        as stolen-local or stolen-system based on this value won't work. Split
>>        system vs local stolen setup to fix this.
>>      - kill all the region->devmem/is_devmem stuff. We already differentiate
>>        the different types of stolen so such things shouldn't be needed
>>        anymore.
>>
>> Signed-off-by: CQ Tang <cq.tang@intel.com>
>> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
>> ---
>>   drivers/gpu/drm/i915/gem/i915_gem_stolen.c | 99 +++++++++++++++++++---
>>   drivers/gpu/drm/i915/gem/i915_gem_stolen.h |  3 +
>>   drivers/gpu/drm/i915/i915_pci.c            |  2 +-
>>   drivers/gpu/drm/i915/i915_reg.h            |  1 +
>>   drivers/gpu/drm/i915/intel_memory_region.c |  6 ++
>>   drivers/gpu/drm/i915/intel_memory_region.h |  5 +-
>>   6 files changed, 102 insertions(+), 14 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_stolen.c 
>> b/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
>> index b0597de206de..56dd58bef5ee 100644
>> --- a/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
>> @@ -10,6 +10,7 @@
>>   #include <drm/drm_mm.h>
>>   #include <drm/i915_drm.h>
>> +#include "gem/i915_gem_lmem.h"
>>   #include "gem/i915_gem_region.h"
>>   #include "i915_drv.h"
>>   #include "i915_gem_stolen.h"
>> @@ -121,6 +122,14 @@ static int i915_adjust_stolen(struct 
>> drm_i915_private *i915,
>>           }
>>       }
>> +    /*
>> +     * With device local memory, we don't need to check the address range,
>> +     * this is device memory physical address, could overlap with system
>> +     * memory.
>> +     */
>> +    if (HAS_LMEM(i915))
>> +        return 0;
>> +
>>       /*
>>        * Verify that nothing else uses this physical address. Stolen
>>        * memory should be reserved by the BIOS and hidden from the
>> @@ -374,8 +383,9 @@ static void icl_get_stolen_reserved(struct 
>> drm_i915_private *i915,
>>       }
>>   }
>> -static int i915_gem_init_stolen(struct drm_i915_private *i915)
>> +static int i915_gem_init_stolen(struct intel_memory_region *mem)
>>   {
>> +    struct drm_i915_private *i915 = mem->i915;
>>       struct intel_uncore *uncore = &i915->uncore;
>>       resource_size_t reserved_base, stolen_top;
>>       resource_size_t reserved_total, reserved_size;
>> @@ -396,10 +406,10 @@ static int i915_gem_init_stolen(struct 
>> drm_i915_private *i915)
>>           return 0;
>>       }
>> -    if (resource_size(&intel_graphics_stolen_res) == 0)
>> +    if (resource_size(&mem->region) == 0)
>>           return 0;
>> -    i915->dsm = intel_graphics_stolen_res;
>> +    i915->dsm = mem->region;
>>       if (i915_adjust_stolen(i915, &i915->dsm))
>>           return 0;
>> @@ -684,23 +694,36 @@ static int _i915_gem_object_stolen_init(struct 
>> intel_memory_region *mem,
>>       return ret;
>>   }
>> +struct intel_memory_region *i915_stolen_region(struct 
>> drm_i915_private *i915)
>> +{
>> +    if (HAS_LMEM(i915))
>> +        return i915->mm.regions[INTEL_REGION_STOLEN_LMEM];
>> +
>> +    return i915->mm.regions[INTEL_REGION_STOLEN_SMEM];
>> +}
> 
> Could be a bikeshedding comment only - especially since I think this 
> path gets very little used at runtime so it is most likely pointless to 
> fiddle with it, but it just strikes me a bit not fully elegant to do:
> 
> i915_gem_object_create_stolen
>   -> i915_gem_object_create_region
>      -> i915_stolen_region
> 
> And end up in here, when alternative could be at driver init:
> 
> i915->stolen_region_id = HAS_LMEM() ? ... : ...;
> 
> i915_gem_object_create_stolen
>   -> i915_gem_object_create_region(i915->mm.regions[i915->stolen_region_id]);
> 
> Or pointer to region. Would avoid having to export i915_stolen_region as 
> well.
> 
> Or is i915->dsm already the right thing? Because..

I guess we could just have an i915->stolen_region short-cut or something?
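
Something along these lines, perhaps. This is an untested sketch with 
made-up stand-in types; the stolen_region member is hypothetical and 
does not exist in the driver today:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum sketch_region_id {
	SKETCH_REGION_STOLEN_SMEM,
	SKETCH_REGION_STOLEN_LMEM,
	SKETCH_REGION_COUNT,
};

struct sketch_region {
	const char *name;
};

/* Hypothetical, simplified stand-in for the device private struct. */
struct sketch_device {
	bool has_lmem;
	struct sketch_region *regions[SKETCH_REGION_COUNT];
	struct sketch_region *stolen_region; /* the proposed short-cut */
};

/* Resolve the stolen region once at init, instead of on every allocation. */
static void sketch_init_stolen_region(struct sketch_device *dev)
{
	enum sketch_region_id id = dev->has_lmem ?
		SKETCH_REGION_STOLEN_LMEM : SKETCH_REGION_STOLEN_SMEM;

	dev->stolen_region = dev->regions[id];
	assert(dev->stolen_region);
}
```

The create paths would then use the cached pointer directly, and the 
lookup helper would not need exporting.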

> 
>> +
>>   struct drm_i915_gem_object *
>>   i915_gem_object_create_stolen(struct drm_i915_private *i915,
>>                     resource_size_t size)
>>   {
>> -    return i915_gem_object_create_region(i915->mm.regions[INTEL_REGION_STOLEN_SMEM],
>> +    return i915_gem_object_create_region(i915_stolen_region(i915),
>>                            size, I915_BO_ALLOC_CONTIGUOUS);
>>   }
>>   static int init_stolen(struct intel_memory_region *mem)
>>   {
>> -    intel_memory_region_set_name(mem, "stolen");
>> +    if (HAS_LMEM(mem->i915)) {
>> +        if (!io_mapping_init_wc(&mem->iomap,
>> +                    mem->io_start,
>> +                    resource_size(&mem->region)))
>> +            return -EIO;
>> +    }
>>       /*
>>        * Initialise stolen early so that we may reserve preallocated
>>        * objects for the BIOS to KMS transition.
>>        */
>> -    return i915_gem_init_stolen(mem->i915);
>> +    return i915_gem_init_stolen(mem);
> 
> ... I find the mem region init paths a bit convoluted, stolen 
> especially, and struggle to figure it out every time.
> 
> For instance we have i915_region_stolen_ops shared between system and 
> local stolen. But then shared vfuncs branch depending on system vs local?

We could split the intel_memory_region ops? Maybe that will make it 
slightly less muddled?

The probing is slightly different, but that's kind of expected since 
it's quite different from the HW pov.

But once we get an intel_memory_region, it should be the same whether 
it's stolen device memory or whatever.
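
As a rough illustration of the split, assuming one ops table per stolen 
flavour so that no init hook has to branch on the platform. All names 
here are hypothetical, and the two flags only mark which setup path 
ran:

```c
#include <assert.h>
#include <stddef.h>

struct sketch_region;

struct sketch_region_ops {
	int (*init)(struct sketch_region *mem);
};

/* Hypothetical, simplified region; the flags only record which path ran. */
struct sketch_region {
	const struct sketch_region_ops *ops;
	int did_system_setup; /* set only by the system-stolen init */
	int did_io_mapping;   /* set only by the local-stolen init */
};

static int sketch_init_stolen_smem(struct sketch_region *mem)
{
	mem->did_system_setup = 1;
	return 0;
}

static int sketch_init_stolen_lmem(struct sketch_region *mem)
{
	mem->did_io_mapping = 1;
	return 0;
}

static const struct sketch_region_ops sketch_stolen_smem_ops = {
	.init = sketch_init_stolen_smem,
};

static const struct sketch_region_ops sketch_stolen_lmem_ops = {
	.init = sketch_init_stolen_lmem,
};

/* Common code only ever goes through the ops table. */
static int sketch_region_init(struct sketch_region *mem)
{
	assert(mem && mem->ops && mem->ops->init);
	return mem->ops->init(mem);
}
```

Everything after probing would still treat both as a plain region.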

> 
> i915_gem_init_stolen is shared - but which parts of it are relevant for 
> local stolen?

Asking all the difficult questions :)

It's just to populate dsm I think. I can rip that out and then we don't 
call i915_gem_init_stolen() for the stolen device memory path? Maybe 
that will look slightly better?

> 
>>   }
>>   static void release_stolen(struct intel_memory_region *mem)
>> @@ -714,13 +737,65 @@ static const struct intel_memory_region_ops 
>> i915_region_stolen_ops = {
>>       .init_object = _i915_gem_object_stolen_init,
>>   };
>> +static struct intel_memory_region *
>> +setup_lmem_stolen(struct drm_i915_private *i915)
>> +{
>> +    struct intel_uncore *uncore = &i915->uncore;
>> +    struct pci_dev *pdev = i915->drm.pdev;
>> +    struct intel_memory_region *mem;
>> +    resource_size_t io_start;
>> +    resource_size_t lmem_size;
>> +    u64 lmem_base;
>> +
>> +    if (!IS_DGFX(i915))
>> +        return ERR_PTR(-ENODEV);
>> +
>> +    lmem_base = intel_uncore_read64(uncore, GEN12_DSMBASE);
>> +    lmem_size = pci_resource_len(pdev, 2) - lmem_base;
>> +    io_start = pci_resource_start(pdev, 2) + lmem_base;
>> +
>> +    mem = intel_memory_region_create(i915, lmem_base, lmem_size,
>> +                     I915_GTT_PAGE_SIZE_4K, io_start,
>> +                     &i915_region_stolen_ops);
>> +    if (IS_ERR(mem))
>> +        return mem;
>> +
>> +    drm_dbg(&i915->drm, "Stolen Local memory: %pR\n", &mem->region);
>> +    drm_dbg(&i915->drm, "Stolen Local memory IO start: %pa\n",
>> +        &mem->io_start);
> 
> Could these messages be consolidated with the system stolen ones 
> (i915_gem_setup_stolen?) and based off the memory_region data printed 
> from common i915_gem_stolen_setup?
> 
>> +
>> +    intel_memory_region_set_name(mem, "stolen-local");
>> +
>> +    return mem;
>> +}
>> +
>> +static struct intel_memory_region*
> 
> Space before asterisk.
> 
>> +setup_smem_stolen(struct drm_i915_private *i915)
>> +{
>> +    struct intel_memory_region *mem;
>> +
>> +    mem = intel_memory_region_create(i915,
>> +                     intel_graphics_stolen_res.start,
>> +                     resource_size(&intel_graphics_stolen_res),
>> +                     PAGE_SIZE, 0,
>> +                     &i915_region_stolen_ops);
>> +    if (IS_ERR(mem))
>> +        return mem;
>> +
>> +    intel_memory_region_set_name(mem, "stolen-system");
> 
> I assume this name, although changed from the current ("stolen"), is not 
> exported anywhere to matter?

Yeah, it's just for internal use, and some debugfs.
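
The longer names are also why the patch grows name[] from 8 to 16 
bytes. A small stand-alone check of that, assuming the name is written 
with an snprintf-style bounded copy (the helper below is hypothetical, 
not the driver's set_name function):

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Return 1 if name, including its terminating NUL, fits in a buffer of
 * buf_size bytes, and 0 if a bounded copy would truncate it.
 */
static int sketch_name_fits(const char *name, size_t buf_size)
{
	char buf[32];
	int n;

	assert(buf_size <= sizeof(buf));
	/* snprintf returns the length the full string would have had. */
	n = snprintf(buf, buf_size, "%s", name);
	return n >= 0 && (size_t)n < buf_size;
}
```

With this, "stolen" fits in 8 bytes, while "stolen-system" and 
"stolen-local" need the 16-byte buffer.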

> 
>> +
>> +    return mem;
>> +}
>> +
>>   struct intel_memory_region *i915_gem_stolen_setup(struct 
>> drm_i915_private *i915)
>>   {
>> -    return intel_memory_region_create(i915,
>> -                      intel_graphics_stolen_res.start,
>> -                      resource_size(&intel_graphics_stolen_res),
>> -                      PAGE_SIZE, 0,
>> -                      &i915_region_stolen_ops);
>> +    struct intel_memory_region *mem;
>> +
>> +    mem = setup_lmem_stolen(i915);
>> +    if (mem == ERR_PTR(-ENODEV))
>> +        mem = setup_smem_stolen(i915);
>> +
>> +    return mem;
>>   }
>>   struct drm_i915_gem_object *
>> @@ -728,7 +803,7 @@ 
>> i915_gem_object_create_stolen_for_preallocated(struct drm_i915_private 
>> *i915,
>>                              resource_size_t stolen_offset,
>>                              resource_size_t size)
>>   {
>> -    struct intel_memory_region *mem = 
>> i915->mm.regions[INTEL_REGION_STOLEN_SMEM];
>> +    struct intel_memory_region *mem = i915_stolen_region(i915);
>>       struct drm_i915_gem_object *obj;
>>       struct drm_mm_node *stolen;
>>       int ret;
>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_stolen.h 
>> b/drivers/gpu/drm/i915/gem/i915_gem_stolen.h
>> index b03489706796..2d1ce7fec61c 100644
>> --- a/drivers/gpu/drm/i915/gem/i915_gem_stolen.h
>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_stolen.h
>> @@ -22,6 +22,9 @@ int i915_gem_stolen_insert_node_in_range(struct 
>> drm_i915_private *dev_priv,
>>   void i915_gem_stolen_remove_node(struct drm_i915_private *dev_priv,
>>                    struct drm_mm_node *node);
>>   struct intel_memory_region *i915_gem_stolen_setup(struct 
>> drm_i915_private *i915);
>> +
>> +struct intel_memory_region *i915_stolen_region(struct 
>> drm_i915_private *i915);
>> +
>>   struct drm_i915_gem_object *
>>   i915_gem_object_create_stolen(struct drm_i915_private *dev_priv,
>>                     resource_size_t size);
>> diff --git a/drivers/gpu/drm/i915/i915_pci.c 
>> b/drivers/gpu/drm/i915/i915_pci.c
>> index 480553746794..53f5d1e6daef 100644
>> --- a/drivers/gpu/drm/i915/i915_pci.c
>> +++ b/drivers/gpu/drm/i915/i915_pci.c
>> @@ -906,7 +906,7 @@ static const struct intel_device_info rkl_info = {
>>   #define GEN12_DGFX_FEATURES \
>>       GEN12_FEATURES, \
>> -    .memory_regions = REGION_SMEM | REGION_LMEM, \
>> +    .memory_regions = REGION_SMEM | REGION_LMEM | REGION_STOLEN_LMEM, \
>>       .has_master_unit_irq = 1, \
>>       .has_llc = 0, \
>>       .has_snoop = 1, \
>> diff --git a/drivers/gpu/drm/i915/i915_reg.h 
>> b/drivers/gpu/drm/i915/i915_reg.h
>> index e087bcd21911..4108f2a7ebfa 100644
>> --- a/drivers/gpu/drm/i915/i915_reg.h
>> +++ b/drivers/gpu/drm/i915/i915_reg.h
>> @@ -12191,6 +12191,7 @@ enum skl_power_gate {
>>   #define GEN12_GLOBAL_MOCS(i)    _MMIO(0x4000 + (i) * 4) /* Global 
>> MOCS regs */
>>   #define GEN12_GSMBASE            _MMIO(0x108100)
>> +#define GEN12_DSMBASE            _MMIO(0x1080C0)
>>   /* gamt regs */
>>   #define GEN8_L3_LRA_1_GPGPU _MMIO(0x4dd4)
>> diff --git a/drivers/gpu/drm/i915/intel_memory_region.c 
>> b/drivers/gpu/drm/i915/intel_memory_region.c
>> index bf837b6bb185..ac90b76a3fa0 100644
>> --- a/drivers/gpu/drm/i915/intel_memory_region.c
>> +++ b/drivers/gpu/drm/i915/intel_memory_region.c
>> @@ -22,6 +22,10 @@ static const struct {
>>           .class = INTEL_MEMORY_STOLEN_SYSTEM,
>>           .instance = 0,
>>       },
>> +    [INTEL_REGION_STOLEN_LMEM] = {
>> +        .class = INTEL_MEMORY_STOLEN_LOCAL,
>> +        .instance = 0,
>> +    },
>>   };
>>   struct intel_memory_region *
>> @@ -278,6 +282,8 @@ int intel_memory_regions_hw_probe(struct 
>> drm_i915_private *i915)
>>           case INTEL_MEMORY_SYSTEM:
>>               mem = i915_gem_shmem_setup(i915);
>>               break;
>> +        case INTEL_MEMORY_STOLEN_LOCAL:
>> +            fallthrough;
>>           case INTEL_MEMORY_STOLEN_SYSTEM:
>>               mem = i915_gem_stolen_setup(i915);
>>               break;
>> diff --git a/drivers/gpu/drm/i915/intel_memory_region.h 
>> b/drivers/gpu/drm/i915/intel_memory_region.h
>> index edd49067c8ca..4c8ec15af55f 100644
>> --- a/drivers/gpu/drm/i915/intel_memory_region.h
>> +++ b/drivers/gpu/drm/i915/intel_memory_region.h
>> @@ -26,18 +26,21 @@ enum intel_memory_type {
>>       INTEL_MEMORY_SYSTEM = 0,
>>       INTEL_MEMORY_LOCAL,
>>       INTEL_MEMORY_STOLEN_SYSTEM,
>> +    INTEL_MEMORY_STOLEN_LOCAL,
>>   };
>>   enum intel_region_id {
>>       INTEL_REGION_SMEM = 0,
>>       INTEL_REGION_LMEM,
>>       INTEL_REGION_STOLEN_SMEM,
>> +    INTEL_REGION_STOLEN_LMEM,
>>       INTEL_REGION_UNKNOWN, /* Should be last */
>>   };
>>   #define REGION_SMEM     BIT(INTEL_REGION_SMEM)
>>   #define REGION_LMEM     BIT(INTEL_REGION_LMEM)
>>   #define REGION_STOLEN_SMEM   BIT(INTEL_REGION_STOLEN_SMEM)
>> +#define REGION_STOLEN_LMEM   BIT(INTEL_REGION_STOLEN_LMEM)
>>   #define I915_ALLOC_MIN_PAGE_SIZE  BIT(0)
>>   #define I915_ALLOC_CONTIGUOUS     BIT(1)
>> @@ -82,7 +85,7 @@ struct intel_memory_region {
>>       u16 type;
>>       u16 instance;
>>       enum intel_region_id id;
>> -    char name[8];
>> +    char name[16];
>>       struct list_head reserved;
>>
> 
> Regards,
> 
> Tvrtko

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [Intel-gfx] [PATCH 11/19] drm/i915: Update the helper to set correct mapping
  2021-04-15 11:05           ` Tvrtko Ursulin
@ 2021-04-19 11:30             ` Matthew Auld
  2021-04-19 14:07               ` Tvrtko Ursulin
  0 siblings, 1 reply; 65+ messages in thread
From: Matthew Auld @ 2021-04-19 11:30 UTC (permalink / raw)
  To: Tvrtko Ursulin, Matthew Auld; +Cc: Intel Graphics Development, ML dri-devel

On 15/04/2021 12:05, Tvrtko Ursulin wrote:
> 
> On 15/04/2021 10:23, Matthew Auld wrote:
>> On Thu, 15 Apr 2021 at 09:21, Tvrtko Ursulin
>> <tvrtko.ursulin@linux.intel.com> wrote:
>>>
>>>
>>> On 14/04/2021 17:20, Matthew Auld wrote:
>>>> On Wed, 14 Apr 2021 at 16:22, Tvrtko Ursulin
>>>> <tvrtko.ursulin@linux.intel.com> wrote:
>>>>>
>>>>>
>>>>> On 12/04/2021 10:05, Matthew Auld wrote:
>>>>>> From: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>
>>>>>>
>>>>>> Determine the possible coherent map type based on object location,
>>>>>> and if target has llc or if user requires an always coherent
>>>>>> mapping.
>>>>>>
>>>>>> Cc: Matthew Auld <matthew.auld@intel.com>
>>>>>> Cc: CQ Tang <cq.tang@intel.com>
>>>>>> Suggested-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
>>>>>> Signed-off-by: Venkata Sandeep Dhanalakota 
>>>>>> <venkata.s.dhanalakota@intel.com>
>>>>>> ---
>>>>>>     drivers/gpu/drm/i915/gt/intel_engine_cs.c    |  3 ++-
>>>>>>     drivers/gpu/drm/i915/gt/intel_engine_pm.c    |  2 +-
>>>>>>     drivers/gpu/drm/i915/gt/intel_lrc.c          |  4 +++-
>>>>>>     drivers/gpu/drm/i915/gt/intel_ring.c         |  9 ++++++---
>>>>>>     drivers/gpu/drm/i915/gt/selftest_context.c   |  3 ++-
>>>>>>     drivers/gpu/drm/i915/gt/selftest_hangcheck.c |  4 ++--
>>>>>>     drivers/gpu/drm/i915/gt/selftest_lrc.c       |  4 +++-
>>>>>>     drivers/gpu/drm/i915/gt/uc/intel_guc.c       |  4 +++-
>>>>>>     drivers/gpu/drm/i915/gt/uc/intel_huc.c       |  4 +++-
>>>>>>     drivers/gpu/drm/i915/i915_drv.h              | 11 +++++++++--
>>>>>>     drivers/gpu/drm/i915/selftests/igt_spinner.c |  4 ++--
>>>>>>     11 files changed, 36 insertions(+), 16 deletions(-)
>>>>>>
>>>>>> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c 
>>>>>> b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
>>>>>> index efe935f80c1a..b79568d370f5 100644
>>>>>> --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
>>>>>> +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
>>>>>> @@ -664,7 +664,8 @@ static int init_status_page(struct 
>>>>>> intel_engine_cs *engine)
>>>>>>         if (ret)
>>>>>>                 goto err;
>>>>>>
>>>>>> -     vaddr = i915_gem_object_pin_map(obj, I915_MAP_WB);
>>>>>> +     vaddr = i915_gem_object_pin_map(obj,
>>>>>> +                                     
>>>>>> i915_coherent_map_type(engine->i915, obj, true));
>>>>>>         if (IS_ERR(vaddr)) {
>>>>>>                 ret = PTR_ERR(vaddr);
>>>>>>                 goto err_unpin;
>>>>>> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c 
>>>>>> b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
>>>>>> index 7c9af86fdb1e..47f4397095e5 100644
>>>>>> --- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c
>>>>>> +++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
>>>>>> @@ -23,7 +23,7 @@ static void dbg_poison_ce(struct intel_context *ce)
>>>>>>
>>>>>>         if (ce->state) {
>>>>>>                 struct drm_i915_gem_object *obj = ce->state->obj;
>>>>>> -             int type = i915_coherent_map_type(ce->engine->i915);
>>>>>> +             int type = i915_coherent_map_type(ce->engine->i915, 
>>>>>> obj, true);
>>>>>>                 void *map;
>>>>>>
>>>>>>                 if (!i915_gem_object_trylock(obj))
>>>>>> diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c 
>>>>>> b/drivers/gpu/drm/i915/gt/intel_lrc.c
>>>>>> index e86897cde984..aafe2a4df496 100644
>>>>>> --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
>>>>>> +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
>>>>>> @@ -903,7 +903,9 @@ lrc_pre_pin(struct intel_context *ce,
>>>>>>         GEM_BUG_ON(!i915_vma_is_pinned(ce->state));
>>>>>>
>>>>>>         *vaddr = i915_gem_object_pin_map(ce->state->obj,
>>>>>> -                                      
>>>>>> i915_coherent_map_type(ce->engine->i915) |
>>>>>> +                                      
>>>>>> i915_coherent_map_type(ce->engine->i915,
>>>>>> +                                                             
>>>>>> ce->state->obj,
>>>>>> +                                                             
>>>>>> false) |
>>>>>>                                          I915_MAP_OVERRIDE);
>>>>>>
>>>>>>         return PTR_ERR_OR_ZERO(*vaddr);
>>>>>> diff --git a/drivers/gpu/drm/i915/gt/intel_ring.c 
>>>>>> b/drivers/gpu/drm/i915/gt/intel_ring.c
>>>>>> index aee0a77c77e0..3cf6c7e68108 100644
>>>>>> --- a/drivers/gpu/drm/i915/gt/intel_ring.c
>>>>>> +++ b/drivers/gpu/drm/i915/gt/intel_ring.c
>>>>>> @@ -53,9 +53,12 @@ int intel_ring_pin(struct intel_ring *ring, 
>>>>>> struct i915_gem_ww_ctx *ww)
>>>>>>
>>>>>>         if (i915_vma_is_map_and_fenceable(vma))
>>>>>>                 addr = (void __force *)i915_vma_pin_iomap(vma);
>>>>>> -     else
>>>>>> -             addr = i915_gem_object_pin_map(vma->obj,
>>>>>> -                                            
>>>>>> i915_coherent_map_type(vma->vm->i915));
>>>>>> +     else {
>>>>>> +             int type = i915_coherent_map_type(vma->vm->i915, 
>>>>>> vma->obj, false);
>>>>>> +
>>>>>> +             addr = i915_gem_object_pin_map(vma->obj, type);
>>>>>> +     }
>>>>>> +
>>>>>>         if (IS_ERR(addr)) {
>>>>>>                 ret = PTR_ERR(addr);
>>>>>>                 goto err_ring;
>>>>>> diff --git a/drivers/gpu/drm/i915/gt/selftest_context.c 
>>>>>> b/drivers/gpu/drm/i915/gt/selftest_context.c
>>>>>> index b9bdd1d23243..26685b927169 100644
>>>>>> --- a/drivers/gpu/drm/i915/gt/selftest_context.c
>>>>>> +++ b/drivers/gpu/drm/i915/gt/selftest_context.c
>>>>>> @@ -88,7 +88,8 @@ static int __live_context_size(struct 
>>>>>> intel_engine_cs *engine)
>>>>>>                 goto err;
>>>>>>
>>>>>>         vaddr = i915_gem_object_pin_map_unlocked(ce->state->obj,
>>>>>> -                                              
>>>>>> i915_coherent_map_type(engine->i915));
>>>>>> +                                              
>>>>>> i915_coherent_map_type(engine->i915,
>>>>>> +                                                                     
>>>>>> ce->state->obj, false));
>>>>>>         if (IS_ERR(vaddr)) {
>>>>>>                 err = PTR_ERR(vaddr);
>>>>>>                 intel_context_unpin(ce);
>>>>>> diff --git a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c 
>>>>>> b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
>>>>>> index 746985971c3a..5b63d4df8c93 100644
>>>>>> --- a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
>>>>>> +++ b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
>>>>>> @@ -69,7 +69,7 @@ static int hang_init(struct hang *h, struct 
>>>>>> intel_gt *gt)
>>>>>>         h->seqno = memset(vaddr, 0xff, PAGE_SIZE);
>>>>>>
>>>>>>         vaddr = i915_gem_object_pin_map_unlocked(h->obj,
>>>>>> -                                              
>>>>>> i915_coherent_map_type(gt->i915));
>>>>>> +                                              
>>>>>> i915_coherent_map_type(gt->i915, h->obj, false));
>>>>>>         if (IS_ERR(vaddr)) {
>>>>>>                 err = PTR_ERR(vaddr);
>>>>>>                 goto err_unpin_hws;
>>>>>> @@ -130,7 +130,7 @@ hang_create_request(struct hang *h, struct 
>>>>>> intel_engine_cs *engine)
>>>>>>                 return ERR_CAST(obj);
>>>>>>         }
>>>>>>
>>>>>> -     vaddr = i915_gem_object_pin_map_unlocked(obj, 
>>>>>> i915_coherent_map_type(gt->i915));
>>>>>> +     vaddr = i915_gem_object_pin_map_unlocked(obj, 
>>>>>> i915_coherent_map_type(gt->i915, obj, false));
>>>>>>         if (IS_ERR(vaddr)) {
>>>>>>                 i915_gem_object_put(obj);
>>>>>>                 i915_vm_put(vm);
>>>>>> diff --git a/drivers/gpu/drm/i915/gt/selftest_lrc.c 
>>>>>> b/drivers/gpu/drm/i915/gt/selftest_lrc.c
>>>>>> index 85e7df6a5123..d8f6623524e8 100644
>>>>>> --- a/drivers/gpu/drm/i915/gt/selftest_lrc.c
>>>>>> +++ b/drivers/gpu/drm/i915/gt/selftest_lrc.c
>>>>>> @@ -1221,7 +1221,9 @@ static int compare_isolation(struct 
>>>>>> intel_engine_cs *engine,
>>>>>>         }
>>>>>>
>>>>>>         lrc = i915_gem_object_pin_map_unlocked(ce->state->obj,
>>>>>> -                                   i915_coherent_map_type(engine->i915));
>>>>>> +                                            i915_coherent_map_type(engine->i915,
>>>>>> +                                                                   ce->state->obj,
>>>>>> +                                                                   false));
>>>>>>         if (IS_ERR(lrc)) {
>>>>>>                 err = PTR_ERR(lrc);
>>>>>>                 goto err_B1;
>>>>>> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.c 
>>>>>> b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
>>>>>> index 78305b2ec89d..adae04c47aab 100644
>>>>>> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc.c
>>>>>> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
>>>>>> @@ -682,7 +682,9 @@ int intel_guc_allocate_and_map_vma(struct 
>>>>>> intel_guc *guc, u32 size,
>>>>>>         if (IS_ERR(vma))
>>>>>>                 return PTR_ERR(vma);
>>>>>>
>>>>>> -     vaddr = i915_gem_object_pin_map_unlocked(vma->obj, 
>>>>>> I915_MAP_WB);
>>>>>> +     vaddr = i915_gem_object_pin_map_unlocked(vma->obj,
>>>>>> +                                              
>>>>>> i915_coherent_map_type(guc_to_gt(guc)->i915,
>>>>>> +                                                                     
>>>>>> vma->obj, true));
>>>>>>         if (IS_ERR(vaddr)) {
>>>>>>                 i915_vma_unpin_and_release(&vma, 0);
>>>>>>                 return PTR_ERR(vaddr);
>>>>>> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_huc.c 
>>>>>> b/drivers/gpu/drm/i915/gt/uc/intel_huc.c
>>>>>> index 2126dd81ac38..56d2144dc6a0 100644
>>>>>> --- a/drivers/gpu/drm/i915/gt/uc/intel_huc.c
>>>>>> +++ b/drivers/gpu/drm/i915/gt/uc/intel_huc.c
>>>>>> @@ -82,7 +82,9 @@ static int intel_huc_rsa_data_create(struct 
>>>>>> intel_huc *huc)
>>>>>>         if (IS_ERR(vma))
>>>>>>                 return PTR_ERR(vma);
>>>>>>
>>>>>> -     vaddr = i915_gem_object_pin_map_unlocked(vma->obj, 
>>>>>> I915_MAP_WB);
>>>>>> +     vaddr = i915_gem_object_pin_map_unlocked(vma->obj,
>>>>>> + i915_coherent_map_type(gt->i915,
>>>>>> + vma->obj, true));
>>>>>>         if (IS_ERR(vaddr)) {
>>>>>>                 i915_vma_unpin_and_release(&vma, 0);
>>>>>>                 return PTR_ERR(vaddr);
>>>>>> diff --git a/drivers/gpu/drm/i915/i915_drv.h 
>>>>>> b/drivers/gpu/drm/i915/i915_drv.h
>>>>>> index 69e43bf91a15..2abbc06712a4 100644
>>>>>> --- a/drivers/gpu/drm/i915/i915_drv.h
>>>>>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>>>>>> @@ -78,6 +78,7 @@
>>>>>>     #include "gem/i915_gem_context_types.h"
>>>>>>     #include "gem/i915_gem_shrinker.h"
>>>>>>     #include "gem/i915_gem_stolen.h"
>>>>>> +#include "gem/i915_gem_lmem.h"
>>>>>>
>>>>>>     #include "gt/intel_engine.h"
>>>>>>     #include "gt/intel_gt_types.h"
>>>>>> @@ -1921,9 +1922,15 @@ static inline int 
>>>>>> intel_hws_csb_write_index(struct drm_i915_private *i915)
>>>>>>     }
>>>>>>
>>>>>>     static inline enum i915_map_type
>>>>>> -i915_coherent_map_type(struct drm_i915_private *i915)
>>>>>> +i915_coherent_map_type(struct drm_i915_private *i915,
>>>>>> +                    struct drm_i915_gem_object *obj, bool 
>>>>>> always_coherent)
>>>>>>     {
>>>>>> -     return HAS_LLC(i915) ? I915_MAP_WB : I915_MAP_WC;
>>>>>> +     if (i915_gem_object_is_lmem(obj))
>>>>>> +             return I915_MAP_WC;
>>>>>> +     if (HAS_LLC(i915) || always_coherent)
>>>>>> +             return I915_MAP_WB;
>>>>>> +     else
>>>>>> +             return I915_MAP_WC;
>>>>>
>>>>> Seems this patch is doing two things.
>>>>>
>>>>> First it is adding lmem support to this helper by always returning WC
>>>>> for lmem objects.
>>>>>
>>>>> Secondly it is introducing an idea of "always coherent" in a helper
>>>>> called i915_coherent_map_type. Could someone explain what is
>>>>> coherent vs always coherent?
>>>>>
>>>>> And also, why is always coherent happy with WB? Sounds
>>>>> counterintuitive to me.
>>>>
>>>> All this does is try to keep the existing behaviour intact, whilst
>>>> also ensuring that all lmem objects are mapped using only WC, no
>>>> matter what. The always_coherent=true thing is for the existing places
>>>> where we sometimes map the object using WB, without first considering
>>>> whether the device has the fast shared LLC vs snooping. Yes, it's
>>>> slightly ugly :)
>>>
>>> Not fully following - if we had to write kerneldoc for always_coherent
>>> input argument - what it would say?
>>
>> @always_coherent - If true we should always try to map the object
>> using WB. If false we should only map as WB if the device supports the
>> fast shared LLC; in the case of snooped devices we will map using WC.
>> Note that if the resource is lmem then we will always map as WC,
>> regardless of the value of always_coherent, since that's all we
>> currently support.
>>
>> Maybe the naming is poor?
> 
> Maybe just confusing to me, not sure yet.
> 
> So always_coherent is not about how the caller wants to use it, but 
> about platform knowledge? Or a performance concern for LLC vs snooping 
> cases? Does WB work (coherently) on snooping platforms?

The always_coherent=true is for the existing callers that want WB, 
regardless of LLC vs snooping.

The other callers use the existing i915_coherent_map_type() which only 
gives out WB for LLC platforms.

AFAIK, LLC vs snooping should offer the same in terms of coherency, but 
in terms of performance the shared LLC is much faster, and so for 
snooping platforms we choose to not enable WB everywhere.

On top of that we now have lmem, but for that we only allow WC. This 
patch just rolls all of that into one helper, while keeping the existing 
behaviour unchanged.
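
For readers following along, the decision order described above can be modelled outside the kernel. This is only a minimal sketch, not the driver code: `device_model`, `object_model`, `map_type_model` and `coherent_map_type_model()` are hypothetical stand-ins for `drm_i915_private`, `drm_i915_gem_object`, `enum i915_map_type` and the patched helper, and the table rows are written out by hand from the explanation in this mail.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are not the real i915 types. */
enum map_type_model { MODEL_MAP_WB, MODEL_MAP_WC };

struct device_model { bool has_llc; };  /* stands in for HAS_LLC(i915) */
struct object_model { bool is_lmem; };  /* stands in for i915_gem_object_is_lmem(obj) */

/* Same decision order as the patched helper: lmem first, then LLC or the flag. */
static enum map_type_model
coherent_map_type_model(const struct device_model *dev,
			const struct object_model *obj,
			bool always_coherent)
{
	if (obj->is_lmem)
		return MODEL_MAP_WC;
	if (dev->has_llc || always_coherent)
		return MODEL_MAP_WB;
	return MODEL_MAP_WC;
}

struct map_case {
	bool has_llc, is_lmem, always_coherent;
	enum map_type_model expected;
};

/* All eight input combinations, listed explicitly rather than derived. */
static const struct map_case map_cases[] = {
	{ false, false, false, MODEL_MAP_WC }, /* snooping, smem, old one-arg helper users */
	{ false, false, true,  MODEL_MAP_WB }, /* snooping, smem, callers that hardcoded WB */
	{ true,  false, false, MODEL_MAP_WB }, /* LLC, smem */
	{ true,  false, true,  MODEL_MAP_WB },
	{ false, true,  false, MODEL_MAP_WC }, /* lmem is WC regardless of the rest */
	{ false, true,  true,  MODEL_MAP_WC },
	{ true,  true,  false, MODEL_MAP_WC },
	{ true,  true,  true,  MODEL_MAP_WC },
};

/* Asserts that the model agrees with every row of the table. */
static void check_map_cases(void)
{
	for (size_t i = 0; i < sizeof(map_cases) / sizeof(map_cases[0]); i++) {
		struct device_model dev = { .has_llc = map_cases[i].has_llc };
		struct object_model obj = { .is_lmem = map_cases[i].is_lmem };

		assert(coherent_map_type_model(&dev, &obj,
					       map_cases[i].always_coherent) ==
		       map_cases[i].expected);
	}
}
```

Calling `check_map_cases()` from a test harness walks the whole table; the first two rows are the only place where the former "hardcoded WB" callers and the former one-argument helper callers diverge.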

> 
> Regards,
> 
> Tvrtko
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


* Re: [Intel-gfx] [PATCH 11/19] drm/i915: Update the helper to set correct mapping
  2021-04-19 11:30             ` Matthew Auld
@ 2021-04-19 14:07               ` Tvrtko Ursulin
  2021-04-19 14:37                 ` Matthew Auld
  0 siblings, 1 reply; 65+ messages in thread
From: Tvrtko Ursulin @ 2021-04-19 14:07 UTC (permalink / raw)
  To: Matthew Auld, Matthew Auld; +Cc: Intel Graphics Development, ML dri-devel


On 19/04/2021 12:30, Matthew Auld wrote:
> On 15/04/2021 12:05, Tvrtko Ursulin wrote:
>>
>> On 15/04/2021 10:23, Matthew Auld wrote:
>>> On Thu, 15 Apr 2021 at 09:21, Tvrtko Ursulin
>>> <tvrtko.ursulin@linux.intel.com> wrote:
>>>>
>>>>
>>>> On 14/04/2021 17:20, Matthew Auld wrote:
>>>>> On Wed, 14 Apr 2021 at 16:22, Tvrtko Ursulin
>>>>> <tvrtko.ursulin@linux.intel.com> wrote:
>>>>>>
>>>>>>
>>>>>> On 12/04/2021 10:05, Matthew Auld wrote:
>>>>>>> From: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>
>>>>>>>
>>>>>>> Determine the possible coherent map type based on object location,
>>>>>>> and if target has llc or if user requires an always coherent
>>>>>>> mapping.
>>>>>>>
>>>>>>> Cc: Matthew Auld <matthew.auld@intel.com>
>>>>>>> Cc: CQ Tang <cq.tang@intel.com>
>>>>>>> Suggested-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
>>>>>>> Signed-off-by: Venkata Sandeep Dhanalakota 
>>>>>>> <venkata.s.dhanalakota@intel.com>
>>>>>>> ---
>>>>>>>     drivers/gpu/drm/i915/gt/intel_engine_cs.c    |  3 ++-
>>>>>>>     drivers/gpu/drm/i915/gt/intel_engine_pm.c    |  2 +-
>>>>>>>     drivers/gpu/drm/i915/gt/intel_lrc.c          |  4 +++-
>>>>>>>     drivers/gpu/drm/i915/gt/intel_ring.c         |  9 ++++++---
>>>>>>>     drivers/gpu/drm/i915/gt/selftest_context.c   |  3 ++-
>>>>>>>     drivers/gpu/drm/i915/gt/selftest_hangcheck.c |  4 ++--
>>>>>>>     drivers/gpu/drm/i915/gt/selftest_lrc.c       |  4 +++-
>>>>>>>     drivers/gpu/drm/i915/gt/uc/intel_guc.c       |  4 +++-
>>>>>>>     drivers/gpu/drm/i915/gt/uc/intel_huc.c       |  4 +++-
>>>>>>>     drivers/gpu/drm/i915/i915_drv.h              | 11 +++++++++--
>>>>>>>     drivers/gpu/drm/i915/selftests/igt_spinner.c |  4 ++--
>>>>>>>     11 files changed, 36 insertions(+), 16 deletions(-)
>>>>>>>
>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c 
>>>>>>> b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
>>>>>>> index efe935f80c1a..b79568d370f5 100644
>>>>>>> --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
>>>>>>> +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
>>>>>>> @@ -664,7 +664,8 @@ static int init_status_page(struct 
>>>>>>> intel_engine_cs *engine)
>>>>>>>         if (ret)
>>>>>>>                 goto err;
>>>>>>>
>>>>>>> -     vaddr = i915_gem_object_pin_map(obj, I915_MAP_WB);
>>>>>>> +     vaddr = i915_gem_object_pin_map(obj,
>>>>>>> + i915_coherent_map_type(engine->i915, obj, true));
>>>>>>>         if (IS_ERR(vaddr)) {
>>>>>>>                 ret = PTR_ERR(vaddr);
>>>>>>>                 goto err_unpin;
>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c 
>>>>>>> b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
>>>>>>> index 7c9af86fdb1e..47f4397095e5 100644
>>>>>>> --- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c
>>>>>>> +++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
>>>>>>> @@ -23,7 +23,7 @@ static void dbg_poison_ce(struct intel_context 
>>>>>>> *ce)
>>>>>>>
>>>>>>>         if (ce->state) {
>>>>>>>                 struct drm_i915_gem_object *obj = ce->state->obj;
>>>>>>> -             int type = i915_coherent_map_type(ce->engine->i915);
>>>>>>> +             int type = i915_coherent_map_type(ce->engine->i915, 
>>>>>>> obj, true);
>>>>>>>                 void *map;
>>>>>>>
>>>>>>>                 if (!i915_gem_object_trylock(obj))
>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c 
>>>>>>> b/drivers/gpu/drm/i915/gt/intel_lrc.c
>>>>>>> index e86897cde984..aafe2a4df496 100644
>>>>>>> --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
>>>>>>> +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
>>>>>>> @@ -903,7 +903,9 @@ lrc_pre_pin(struct intel_context *ce,
>>>>>>>         GEM_BUG_ON(!i915_vma_is_pinned(ce->state));
>>>>>>>
>>>>>>>         *vaddr = i915_gem_object_pin_map(ce->state->obj,
>>>>>>> - i915_coherent_map_type(ce->engine->i915) |
>>>>>>> + i915_coherent_map_type(ce->engine->i915,
>>>>>>> + ce->state->obj,
>>>>>>> + false) |
>>>>>>>                                          I915_MAP_OVERRIDE);
>>>>>>>
>>>>>>>         return PTR_ERR_OR_ZERO(*vaddr);
>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/intel_ring.c 
>>>>>>> b/drivers/gpu/drm/i915/gt/intel_ring.c
>>>>>>> index aee0a77c77e0..3cf6c7e68108 100644
>>>>>>> --- a/drivers/gpu/drm/i915/gt/intel_ring.c
>>>>>>> +++ b/drivers/gpu/drm/i915/gt/intel_ring.c
>>>>>>> @@ -53,9 +53,12 @@ int intel_ring_pin(struct intel_ring *ring, 
>>>>>>> struct i915_gem_ww_ctx *ww)
>>>>>>>
>>>>>>>         if (i915_vma_is_map_and_fenceable(vma))
>>>>>>>                 addr = (void __force *)i915_vma_pin_iomap(vma);
>>>>>>> -     else
>>>>>>> -             addr = i915_gem_object_pin_map(vma->obj,
>>>>>>> - i915_coherent_map_type(vma->vm->i915));
>>>>>>> +     else {
>>>>>>> +             int type = i915_coherent_map_type(vma->vm->i915, 
>>>>>>> vma->obj, false);
>>>>>>> +
>>>>>>> +             addr = i915_gem_object_pin_map(vma->obj, type);
>>>>>>> +     }
>>>>>>> +
>>>>>>>         if (IS_ERR(addr)) {
>>>>>>>                 ret = PTR_ERR(addr);
>>>>>>>                 goto err_ring;
>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/selftest_context.c 
>>>>>>> b/drivers/gpu/drm/i915/gt/selftest_context.c
>>>>>>> index b9bdd1d23243..26685b927169 100644
>>>>>>> --- a/drivers/gpu/drm/i915/gt/selftest_context.c
>>>>>>> +++ b/drivers/gpu/drm/i915/gt/selftest_context.c
>>>>>>> @@ -88,7 +88,8 @@ static int __live_context_size(struct 
>>>>>>> intel_engine_cs *engine)
>>>>>>>                 goto err;
>>>>>>>
>>>>>>>         vaddr = i915_gem_object_pin_map_unlocked(ce->state->obj,
>>>>>>> - i915_coherent_map_type(engine->i915));
>>>>>>> + i915_coherent_map_type(engine->i915,
>>>>>>> + ce->state->obj, false));
>>>>>>>         if (IS_ERR(vaddr)) {
>>>>>>>                 err = PTR_ERR(vaddr);
>>>>>>>                 intel_context_unpin(ce);
>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c 
>>>>>>> b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
>>>>>>> index 746985971c3a..5b63d4df8c93 100644
>>>>>>> --- a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
>>>>>>> +++ b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
>>>>>>> @@ -69,7 +69,7 @@ static int hang_init(struct hang *h, struct 
>>>>>>> intel_gt *gt)
>>>>>>>         h->seqno = memset(vaddr, 0xff, PAGE_SIZE);
>>>>>>>
>>>>>>>         vaddr = i915_gem_object_pin_map_unlocked(h->obj,
>>>>>>> - i915_coherent_map_type(gt->i915));
>>>>>>> + i915_coherent_map_type(gt->i915, h->obj, false));
>>>>>>>         if (IS_ERR(vaddr)) {
>>>>>>>                 err = PTR_ERR(vaddr);
>>>>>>>                 goto err_unpin_hws;
>>>>>>> @@ -130,7 +130,7 @@ hang_create_request(struct hang *h, struct 
>>>>>>> intel_engine_cs *engine)
>>>>>>>                 return ERR_CAST(obj);
>>>>>>>         }
>>>>>>>
>>>>>>> -     vaddr = i915_gem_object_pin_map_unlocked(obj, 
>>>>>>> i915_coherent_map_type(gt->i915));
>>>>>>> +     vaddr = i915_gem_object_pin_map_unlocked(obj, 
>>>>>>> i915_coherent_map_type(gt->i915, obj, false));
>>>>>>>         if (IS_ERR(vaddr)) {
>>>>>>>                 i915_gem_object_put(obj);
>>>>>>>                 i915_vm_put(vm);
>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/selftest_lrc.c 
>>>>>>> b/drivers/gpu/drm/i915/gt/selftest_lrc.c
>>>>>>> index 85e7df6a5123..d8f6623524e8 100644
>>>>>>> --- a/drivers/gpu/drm/i915/gt/selftest_lrc.c
>>>>>>> +++ b/drivers/gpu/drm/i915/gt/selftest_lrc.c
>>>>>>> @@ -1221,7 +1221,9 @@ static int compare_isolation(struct 
>>>>>>> intel_engine_cs *engine,
>>>>>>>         }
>>>>>>>
>>>>>>>         lrc = i915_gem_object_pin_map_unlocked(ce->state->obj,
>>>>>>> - i915_coherent_map_type(engine->i915));
>>>>>>> + i915_coherent_map_type(engine->i915,
>>>>>>> + ce->state->obj,
>>>>>>> + false));
>>>>>>>         if (IS_ERR(lrc)) {
>>>>>>>                 err = PTR_ERR(lrc);
>>>>>>>                 goto err_B1;
>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.c 
>>>>>>> b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
>>>>>>> index 78305b2ec89d..adae04c47aab 100644
>>>>>>> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc.c
>>>>>>> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
>>>>>>> @@ -682,7 +682,9 @@ int intel_guc_allocate_and_map_vma(struct 
>>>>>>> intel_guc *guc, u32 size,
>>>>>>>         if (IS_ERR(vma))
>>>>>>>                 return PTR_ERR(vma);
>>>>>>>
>>>>>>> -     vaddr = i915_gem_object_pin_map_unlocked(vma->obj, 
>>>>>>> I915_MAP_WB);
>>>>>>> +     vaddr = i915_gem_object_pin_map_unlocked(vma->obj,
>>>>>>> + i915_coherent_map_type(guc_to_gt(guc)->i915,
>>>>>>> + vma->obj, true));
>>>>>>>         if (IS_ERR(vaddr)) {
>>>>>>>                 i915_vma_unpin_and_release(&vma, 0);
>>>>>>>                 return PTR_ERR(vaddr);
>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_huc.c 
>>>>>>> b/drivers/gpu/drm/i915/gt/uc/intel_huc.c
>>>>>>> index 2126dd81ac38..56d2144dc6a0 100644
>>>>>>> --- a/drivers/gpu/drm/i915/gt/uc/intel_huc.c
>>>>>>> +++ b/drivers/gpu/drm/i915/gt/uc/intel_huc.c
>>>>>>> @@ -82,7 +82,9 @@ static int intel_huc_rsa_data_create(struct 
>>>>>>> intel_huc *huc)
>>>>>>>         if (IS_ERR(vma))
>>>>>>>                 return PTR_ERR(vma);
>>>>>>>
>>>>>>> -     vaddr = i915_gem_object_pin_map_unlocked(vma->obj, 
>>>>>>> I915_MAP_WB);
>>>>>>> +     vaddr = i915_gem_object_pin_map_unlocked(vma->obj,
>>>>>>> + i915_coherent_map_type(gt->i915,
>>>>>>> + vma->obj, true));
>>>>>>>         if (IS_ERR(vaddr)) {
>>>>>>>                 i915_vma_unpin_and_release(&vma, 0);
>>>>>>>                 return PTR_ERR(vaddr);
>>>>>>> diff --git a/drivers/gpu/drm/i915/i915_drv.h 
>>>>>>> b/drivers/gpu/drm/i915/i915_drv.h
>>>>>>> index 69e43bf91a15..2abbc06712a4 100644
>>>>>>> --- a/drivers/gpu/drm/i915/i915_drv.h
>>>>>>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>>>>>>> @@ -78,6 +78,7 @@
>>>>>>>     #include "gem/i915_gem_context_types.h"
>>>>>>>     #include "gem/i915_gem_shrinker.h"
>>>>>>>     #include "gem/i915_gem_stolen.h"
>>>>>>> +#include "gem/i915_gem_lmem.h"
>>>>>>>
>>>>>>>     #include "gt/intel_engine.h"
>>>>>>>     #include "gt/intel_gt_types.h"
>>>>>>> @@ -1921,9 +1922,15 @@ static inline int 
>>>>>>> intel_hws_csb_write_index(struct drm_i915_private *i915)
>>>>>>>     }
>>>>>>>
>>>>>>>     static inline enum i915_map_type
>>>>>>> -i915_coherent_map_type(struct drm_i915_private *i915)
>>>>>>> +i915_coherent_map_type(struct drm_i915_private *i915,
>>>>>>> +                    struct drm_i915_gem_object *obj, bool 
>>>>>>> always_coherent)
>>>>>>>     {
>>>>>>> -     return HAS_LLC(i915) ? I915_MAP_WB : I915_MAP_WC;
>>>>>>> +     if (i915_gem_object_is_lmem(obj))
>>>>>>> +             return I915_MAP_WC;
>>>>>>> +     if (HAS_LLC(i915) || always_coherent)
>>>>>>> +             return I915_MAP_WB;
>>>>>>> +     else
>>>>>>> +             return I915_MAP_WC;
>>>>>>
>>>>>> Seems this patch is doing two things.
>>>>>>
>>>>>> First it is adding lmem support to this helper by always returning WC
>>>>>> for lmem objects.
>>>>>>
>>>>>> Secondly it is introducing an idea of "always coherent" in a helper
>>>>>> called i915_coherent_map_type. Could someone explain what is
>>>>>> coherent vs always coherent?
>>>>>>
>>>>>> And also, why is always coherent happy with WB? Sounds
>>>>>> counterintuitive to me.
>>>>>
>>>>> All this does is try to keep the existing behaviour intact, whilst
>>>>> also ensuring that all lmem objects are mapped using only WC, no
>>>>> matter what. The always_coherent=true thing is for the existing places
>>>>> where we sometimes map the object using WB, without first considering
>>>>> whether the device has the fast shared LLC vs snooping. Yes, it's
>>>>> slightly ugly :)
>>>>
>>>> Not fully following - if we had to write kerneldoc for always_coherent
>>>> input argument - what it would say?
>>>
>>> @always_coherent - If true we should always try to map the object
>>> using WB. If false we should only map as WB if the device supports the
>>> fast shared LLC; in the case of snooped devices we will map using WC.
>>> Note that if the resource is lmem then we will always map as WC,
>>> regardless of the value of always_coherent, since that's all we
>>> currently support.
>>>
>>> Maybe the naming is poor?
>>
>> Maybe just confusing to me, not sure yet.
>>
>> So always_coherent is not about how the caller wants to use it, but 
>> about platform knowledge? Or a performance concern for LLC vs snooping 
>> cases? Does WB work (coherently) on snooping platforms?
> 
> The always_coherent=true is for the existing callers that want WB, 
> regardless of LLC vs snooping.
> 
> The other callers use the existing i915_coherent_map_type() which only 
> gives out WB for LLC platforms.
> 
> AFAIK, LLC vs snooping should offer the same in terms of coherency, but 
> in terms of performance the shared LLC is much faster, and so for 
> snooping platforms we choose to not enable WB everywhere.
> 
> On top of that we now have lmem, but for that we only allow WC. This 
> patch just rolls all of that into one helper, while keeping the existing 
> behaviour unchanged.

Thanks. But I am still struggling with the API. :(

Is the introduction of the always_coherent flag even required in the 
context of DG1? AFAICT the flag is ignored for lmem objects, so no?
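
To make the observation concrete, here is a small sketch that checks where the flag is observable at all. It is a model under stated assumptions, not driver code: `coherent_map_type_model()` takes plain booleans in place of `HAS_LLC(i915)` and `i915_gem_object_is_lmem(obj)`, and `always_coherent_matters()` is a hypothetical helper introduced only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for enum i915_map_type. */
enum map_type_model { MODEL_MAP_WB, MODEL_MAP_WC };

/* Boolean model of the patched helper's decision order. */
static enum map_type_model
coherent_map_type_model(bool has_llc, bool is_lmem, bool always_coherent)
{
	if (is_lmem)
		return MODEL_MAP_WC;
	if (has_llc || always_coherent)
		return MODEL_MAP_WB;
	return MODEL_MAP_WC;
}

/* True when flipping always_coherent changes the returned mapping type. */
static bool always_coherent_matters(bool has_llc, bool is_lmem)
{
	return coherent_map_type_model(has_llc, is_lmem, true) !=
	       coherent_map_type_model(has_llc, is_lmem, false);
}

/* The flag is only visible for non-lmem objects on a device without LLC. */
static void check_flag_visibility(void)
{
	assert(!always_coherent_matters(false, true));  /* no LLC, lmem */
	assert(!always_coherent_matters(true, true));   /* LLC, lmem */
	assert(!always_coherent_matters(true, false));  /* LLC, smem */
	assert(always_coherent_matters(false, false));  /* no LLC, smem */
}
```

In this model the flag never changes the result for lmem objects, which matches the reading above. Whether any of the converted call sites can still see a non-lmem object on a discrete part is a separate question that the sketch does not answer.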

Regards,

Tvrtko


* Re: [Intel-gfx] [PATCH 03/19] drm/i915: Create stolen memory region from local memory
  2021-04-16 15:04     ` Matthew Auld
@ 2021-04-19 14:15       ` Tvrtko Ursulin
  0 siblings, 0 replies; 65+ messages in thread
From: Tvrtko Ursulin @ 2021-04-19 14:15 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx; +Cc: dri-devel


On 16/04/2021 16:04, Matthew Auld wrote:
> On 14/04/2021 16:01, Tvrtko Ursulin wrote:
>>
>> On 12/04/2021 10:05, Matthew Auld wrote:
>>> From: CQ Tang <cq.tang@intel.com>
>>>
>>> Add "REGION_STOLEN" device info to dg1, create stolen memory
>>> region from upper portion of local device memory, starting
>>> from DSMBASE.
>>>
>>> v2:
>>>      - s/drm_info/drm_dbg; userspace likely doesn't care about stolen.
>>>      - mem->type is only setup after the region probe, so setting the 
>>> name
>>>        as stolen-local or stolen-system based on this value won't 
>>> work. Split
>>>        system vs local stolen setup to fix this.
>>>      - kill all the region->devmem/is_devmem stuff. We already 
>>> differentiate
>>>        the different types of stolen so such things shouldn't be needed
>>>        anymore.
>>>
>>> Signed-off-by: CQ Tang <cq.tang@intel.com>
>>> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
>>> ---
>>>   drivers/gpu/drm/i915/gem/i915_gem_stolen.c | 99 +++++++++++++++++++---
>>>   drivers/gpu/drm/i915/gem/i915_gem_stolen.h |  3 +
>>>   drivers/gpu/drm/i915/i915_pci.c            |  2 +-
>>>   drivers/gpu/drm/i915/i915_reg.h            |  1 +
>>>   drivers/gpu/drm/i915/intel_memory_region.c |  6 ++
>>>   drivers/gpu/drm/i915/intel_memory_region.h |  5 +-
>>>   6 files changed, 102 insertions(+), 14 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_stolen.c 
>>> b/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
>>> index b0597de206de..56dd58bef5ee 100644
>>> --- a/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
>>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
>>> @@ -10,6 +10,7 @@
>>>   #include <drm/drm_mm.h>
>>>   #include <drm/i915_drm.h>
>>> +#include "gem/i915_gem_lmem.h"
>>>   #include "gem/i915_gem_region.h"
>>>   #include "i915_drv.h"
>>>   #include "i915_gem_stolen.h"
>>> @@ -121,6 +122,14 @@ static int i915_adjust_stolen(struct 
>>> drm_i915_private *i915,
>>>           }
>>>       }
>>> +    /*
>>> +     * With device local memory, we don't need to check the address 
>>> range,
>>> +     * this is device memory physical address, could overlap with 
>>> system
>>> +     * memory.
>>> +     */
>>> +    if (HAS_LMEM(i915))
>>> +        return 0;
>>> +
>>>       /*
>>>        * Verify that nothing else uses this physical address. Stolen
>>>        * memory should be reserved by the BIOS and hidden from the
>>> @@ -374,8 +383,9 @@ static void icl_get_stolen_reserved(struct 
>>> drm_i915_private *i915,
>>>       }
>>>   }
>>> -static int i915_gem_init_stolen(struct drm_i915_private *i915)
>>> +static int i915_gem_init_stolen(struct intel_memory_region *mem)
>>>   {
>>> +    struct drm_i915_private *i915 = mem->i915;
>>>       struct intel_uncore *uncore = &i915->uncore;
>>>       resource_size_t reserved_base, stolen_top;
>>>       resource_size_t reserved_total, reserved_size;
>>> @@ -396,10 +406,10 @@ static int i915_gem_init_stolen(struct 
>>> drm_i915_private *i915)
>>>           return 0;
>>>       }
>>> -    if (resource_size(&intel_graphics_stolen_res) == 0)
>>> +    if (resource_size(&mem->region) == 0)
>>>           return 0;
>>> -    i915->dsm = intel_graphics_stolen_res;
>>> +    i915->dsm = mem->region;
>>>       if (i915_adjust_stolen(i915, &i915->dsm))
>>>           return 0;
>>> @@ -684,23 +694,36 @@ static int _i915_gem_object_stolen_init(struct 
>>> intel_memory_region *mem,
>>>       return ret;
>>>   }
>>> +struct intel_memory_region *i915_stolen_region(struct 
>>> drm_i915_private *i915)
>>> +{
>>> +    if (HAS_LMEM(i915))
>>> +        return i915->mm.regions[INTEL_REGION_STOLEN_LMEM];
>>> +
>>> +    return i915->mm.regions[INTEL_REGION_STOLEN_SMEM];
>>> +}
>>
>> Could be a bikeshedding comment only - especially since I think this 
>> path is used very little at runtime so it is most likely pointless 
>> to fiddle with it, but it just strikes me as not fully elegant to do:
>>
>> i915_gem_object_create_stolen
>>   -> i915_gem_object_create_region
>>      -> i915_stolen_region
>>
>> And end up in here, when alternative could be at driver init:
>>
>> i915->stolen_region_id = HAS_LMEM() ? ... : ...;
>>
>> i915_gem_object_create_stolen
>>   -> 
>> i915_gem_object_create_region(i915->mm.regions[i915->stolen_region_id]);
>>
>> Or pointer to region. Would avoid having to export i915_stolen_region 
>> as well.
>>
>> Or is i915->dsm already the right thing? Because..
> 
> I guess we could just have an i915->stolen_region short-cut or something?

Is i915->dsm not it? Where does i915_gem_init_stolen exit for 
local-stolen then? At the "resource_size(&mem->region) == 0" check?
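
The two options under discussion (look the region up on every call versus resolve it once at driver init) can be compared in a tiny model. This is a sketch under assumptions: `device_model`, `region_model`, the `stolen_region` field and `cache_stolen_region()` are hypothetical names for illustration, with only the lookup logic taken from the `i915_stolen_region()` hunk quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the INTEL_REGION_STOLEN_* ids. */
enum region_id_model { MODEL_STOLEN_SMEM, MODEL_STOLEN_LMEM, MODEL_REGION_COUNT };

struct region_model { const char *name; };

struct device_model {
	bool has_lmem;                                    /* stands in for HAS_LMEM(i915) */
	struct region_model *regions[MODEL_REGION_COUNT];
	struct region_model *stolen_region;               /* hypothetical cached shortcut */
};

/* Per-call lookup, following the logic of i915_stolen_region() in the patch. */
static struct region_model *lookup_stolen_region(const struct device_model *dev)
{
	if (dev->has_lmem)
		return dev->regions[MODEL_STOLEN_LMEM];
	return dev->regions[MODEL_STOLEN_SMEM];
}

/* Alternative: resolve once at init so callers just read dev->stolen_region. */
static void cache_stolen_region(struct device_model *dev)
{
	dev->stolen_region = lookup_stolen_region(dev);
}

/* Both approaches must agree for an integrated and a discrete device. */
static void check_stolen_shortcut(void)
{
	struct region_model smem = { "stolen-system" };
	struct region_model lmem = { "stolen-local" };
	struct device_model integrated = {
		.has_lmem = false, .regions = { &smem, NULL },
	};
	struct device_model discrete = {
		.has_lmem = true, .regions = { NULL, &lmem },
	};

	cache_stolen_region(&integrated);
	cache_stolen_region(&discrete);

	assert(integrated.stolen_region == &smem);
	assert(discrete.stolen_region == &lmem);
	assert(discrete.stolen_region == lookup_stolen_region(&discrete));
}
```

With the cached pointer the helper no longer needs to be exported, at the cost of one more field that has to be set after the regions are probed.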

> 
>>
>>> +
>>>   struct drm_i915_gem_object *
>>>   i915_gem_object_create_stolen(struct drm_i915_private *i915,
>>>                     resource_size_t size)
>>>   {
>>> -    return 
>>> i915_gem_object_create_region(i915->mm.regions[INTEL_REGION_STOLEN_SMEM], 
>>>
>>> +    return i915_gem_object_create_region(i915_stolen_region(i915),
>>>                            size, I915_BO_ALLOC_CONTIGUOUS);
>>>   }
>>>   static int init_stolen(struct intel_memory_region *mem)
>>>   {
>>> -    intel_memory_region_set_name(mem, "stolen");
>>> +    if (HAS_LMEM(mem->i915)) {
>>> +        if (!io_mapping_init_wc(&mem->iomap,
>>> +                    mem->io_start,
>>> +                    resource_size(&mem->region)))
>>> +            return -EIO;
>>> +    }
>>>       /*
>>>        * Initialise stolen early so that we may reserve preallocated
>>>        * objects for the BIOS to KMS transition.
>>>        */
>>> -    return i915_gem_init_stolen(mem->i915);
>>> +    return i915_gem_init_stolen(mem);
>>
>> ... I find the mem region init paths a bit convoluted, stolen 
>> especially, and struggle to figure it out every time.
>>
>> For instance we have i915_region_stolen_ops shared between system and 
>> local stolen. But then shared vfuncs branch depending on system vs 
>> stolen?
> 
> We could split the intel_memory_region ops? Maybe that will make it 
> slightly less muddled?

I think so. Each vfunc table with its own ->init() should make it 
easier to follow.
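
Roughly what I have in mind, as a standalone sketch rather than a patch: two ops tables that share ->release but each carry their own ->init(). All names here (`region_ops_model`, `stolen_smem_ops`, `stolen_lmem_ops` and the `*_ready` flags) are hypothetical and only model the shape of `struct intel_memory_region_ops`; the split of work between the two init functions is an assumption based on the discussion above.

```c
#include <assert.h>
#include <stddef.h>

struct region_model;

/* Hypothetical model of a per-region vfunc table. */
struct region_ops_model {
	int (*init)(struct region_model *mem);
	void (*release)(struct region_model *mem);
};

struct region_model {
	const struct region_ops_model *ops;
	int dsm_ready;    /* models the i915->dsm setup done for system stolen */
	int iomap_ready;  /* models io_mapping_init_wc() for local stolen */
	int released;
};

/* Shared between both tables. */
static void stolen_release(struct region_model *mem)
{
	mem->released = 1;
}

/* System stolen: only the dsm-style setup. */
static int stolen_smem_init(struct region_model *mem)
{
	mem->dsm_ready = 1;
	return 0;
}

/* Local stolen: only the iomap setup, no dsm involvement. */
static int stolen_lmem_init(struct region_model *mem)
{
	mem->iomap_ready = 1;
	return 0;
}

static const struct region_ops_model stolen_smem_ops = {
	.init = stolen_smem_init,
	.release = stolen_release,
};

static const struct region_ops_model stolen_lmem_ops = {
	.init = stolen_lmem_init,
	.release = stolen_release,
};

/* Each region runs only its own init path; release stays common. */
static void check_split_ops(void)
{
	struct region_model smem = { .ops = &stolen_smem_ops };
	struct region_model lmem = { .ops = &stolen_lmem_ops };

	assert(smem.ops->init(&smem) == 0);
	assert(lmem.ops->init(&lmem) == 0);
	assert(smem.dsm_ready && !smem.iomap_ready);
	assert(lmem.iomap_ready && !lmem.dsm_ready);

	smem.ops->release(&smem);
	lmem.ops->release(&lmem);
	assert(smem.released && lmem.released);
}
```

The point of the split is that neither init function needs a HAS_LMEM()-style branch any more, while anything that really is common can still be pointed at from both tables.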

> The probing is slightly different, but that's kind of expected since 
> it's quite different from the HW pov.
> 
> But once we get an intel_memory_region, it should be the same whether 
> it's stolen device memory or whatever.
> 
>>
>> i915_gem_init_stolen is shared - but which parts of it are relevant 
>> for local stolen?
> 
> Asking all the difficult questions :)
> 
> It's just to populate dsm I think. I can rip that out and then we don't 
> call i915_gem_init_stolen() for the stolen device memory path? Maybe 
> that will look slightly better?

Yes, with the above approach of two struct intel_memory_region_ops? Even 
if some vfuncs are shared it should be better.

I am also confused by ->release, i.e. i915_gem_cleanup_stolen. How does 
that work for two stolen regions when there is only one i915->mm.stolen?

Regards,

Tvrtko


* Re: [Intel-gfx] [PATCH 12/19] drm/i915/lmem: Bypass aperture when lmem is available
  2021-04-16 14:25     ` Matthew Auld
@ 2021-04-19 14:16       ` Tvrtko Ursulin
  0 siblings, 0 replies; 65+ messages in thread
From: Tvrtko Ursulin @ 2021-04-19 14:16 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx
  Cc: Daniel Vetter, dri-devel, Chris P Wilson, Dhinakaran Pandiyan


On 16/04/2021 15:25, Matthew Auld wrote:
> On 14/04/2021 16:33, Tvrtko Ursulin wrote:
>>
>> On 12/04/2021 10:05, Matthew Auld wrote:
>>> From: Anusha Srivatsa <anusha.srivatsa@intel.com>
>>>
>>> In the scenario where local memory is available, we have to
>>> rely on CPU access via lmem directly instead of aperture.
>>>
>>> v2:
>>> gmch is only relevant for much older hw, therefore we can drop the
>>> has_aperture check since it should always be present on such platforms.
>>> (Chris)
>>>
>>> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
>>> Cc: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
>>> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
>>> Cc: Chris P Wilson <chris.p.wilson@intel.com>
>>> Cc: Daniel Vetter <daniel.vetter@intel.com>
>>> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
>>> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
>>> Cc: CQ Tang <cq.tang@intel.com>
>>> Signed-off-by: Anusha Srivatsa <anusha.srivatsa@intel.com>
>>> ---
>>>   drivers/gpu/drm/i915/display/intel_fbdev.c | 22 +++++++++++++++-------
>>>   drivers/gpu/drm/i915/gem/i915_gem_lmem.c   | 15 +++++++++++++++
>>>   drivers/gpu/drm/i915/gem/i915_gem_lmem.h   |  5 +++++
>>>   drivers/gpu/drm/i915/i915_vma.c            | 19 +++++++++++++------
>>>   4 files changed, 48 insertions(+), 13 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/display/intel_fbdev.c 
>>> b/drivers/gpu/drm/i915/display/intel_fbdev.c
>>> index 2b37959da747..4af40229f5ec 100644
>>> --- a/drivers/gpu/drm/i915/display/intel_fbdev.c
>>> +++ b/drivers/gpu/drm/i915/display/intel_fbdev.c
>>> @@ -139,14 +139,22 @@ static int intelfb_alloc(struct drm_fb_helper 
>>> *helper,
>>>       size = mode_cmd.pitches[0] * mode_cmd.height;
>>>       size = PAGE_ALIGN(size);
>>> -    /* If the FB is too big, just don't use it since fbdev is not very
>>> -     * important and we should probably use that space with FBC or 
>>> other
>>> -     * features. */
>>>       obj = ERR_PTR(-ENODEV);
>>> -    if (size * 2 < dev_priv->stolen_usable_size)
>>> -        obj = i915_gem_object_create_stolen(dev_priv, size);
>>> -    if (IS_ERR(obj))
>>> -        obj = i915_gem_object_create_shmem(dev_priv, size);
>>> +    if (HAS_LMEM(dev_priv)) {
>>> +        obj = i915_gem_object_create_lmem(dev_priv, size,
>>> +                          I915_BO_ALLOC_CONTIGUOUS);
>>
>> Has to be contiguous? Question for display experts I guess.
>>
>> [Comes back later.] Ah for iomap? Put a comment to that effect perhaps?
> 
> I don't think it has to be, since we could in theory just use pin_map() 
> underneath, which can already deal with non-contiguous chunks of lmem, 
> although that might bring in ww locking. I think for now just add a 
> comment and mark this as XXX, and potentially revisit as follow up?

Sure.

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko


* Re: [Intel-gfx] [PATCH 11/19] drm/i915: Update the helper to set correct mapping
  2021-04-19 14:07               ` Tvrtko Ursulin
@ 2021-04-19 14:37                 ` Matthew Auld
  2021-04-19 15:01                   ` Tvrtko Ursulin
  0 siblings, 1 reply; 65+ messages in thread
From: Matthew Auld @ 2021-04-19 14:37 UTC (permalink / raw)
  To: Tvrtko Ursulin, Matthew Auld; +Cc: Intel Graphics Development, ML dri-devel

On 19/04/2021 15:07, Tvrtko Ursulin wrote:
> 
> On 19/04/2021 12:30, Matthew Auld wrote:
>> On 15/04/2021 12:05, Tvrtko Ursulin wrote:
>>>
>>> On 15/04/2021 10:23, Matthew Auld wrote:
>>>> On Thu, 15 Apr 2021 at 09:21, Tvrtko Ursulin
>>>> <tvrtko.ursulin@linux.intel.com> wrote:
>>>>>
>>>>>
>>>>> On 14/04/2021 17:20, Matthew Auld wrote:
>>>>>> On Wed, 14 Apr 2021 at 16:22, Tvrtko Ursulin
>>>>>> <tvrtko.ursulin@linux.intel.com> wrote:
>>>>>>>
>>>>>>>
>>>>>>> On 12/04/2021 10:05, Matthew Auld wrote:
>>>>>>>> From: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>
>>>>>>>>
>>>>>>>> Determine the possible coherent map type based on object location,
>>>>>>>> and if target has llc or if user requires an always coherent
>>>>>>>> mapping.
>>>>>>>>
>>>>>>>> Cc: Matthew Auld <matthew.auld@intel.com>
>>>>>>>> Cc: CQ Tang <cq.tang@intel.com>
>>>>>>>> Suggested-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
>>>>>>>> Signed-off-by: Venkata Sandeep Dhanalakota 
>>>>>>>> <venkata.s.dhanalakota@intel.com>
>>>>>>>> ---
>>>>>>>>     drivers/gpu/drm/i915/gt/intel_engine_cs.c    |  3 ++-
>>>>>>>>     drivers/gpu/drm/i915/gt/intel_engine_pm.c    |  2 +-
>>>>>>>>     drivers/gpu/drm/i915/gt/intel_lrc.c          |  4 +++-
>>>>>>>>     drivers/gpu/drm/i915/gt/intel_ring.c         |  9 ++++++---
>>>>>>>>     drivers/gpu/drm/i915/gt/selftest_context.c   |  3 ++-
>>>>>>>>     drivers/gpu/drm/i915/gt/selftest_hangcheck.c |  4 ++--
>>>>>>>>     drivers/gpu/drm/i915/gt/selftest_lrc.c       |  4 +++-
>>>>>>>>     drivers/gpu/drm/i915/gt/uc/intel_guc.c       |  4 +++-
>>>>>>>>     drivers/gpu/drm/i915/gt/uc/intel_huc.c       |  4 +++-
>>>>>>>>     drivers/gpu/drm/i915/i915_drv.h              | 11 +++++++++--
>>>>>>>>     drivers/gpu/drm/i915/selftests/igt_spinner.c |  4 ++--
>>>>>>>>     11 files changed, 36 insertions(+), 16 deletions(-)
>>>>>>>>
>>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c 
>>>>>>>> b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
>>>>>>>> index efe935f80c1a..b79568d370f5 100644
>>>>>>>> --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
>>>>>>>> +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
>>>>>>>> @@ -664,7 +664,8 @@ static int init_status_page(struct 
>>>>>>>> intel_engine_cs *engine)
>>>>>>>>         if (ret)
>>>>>>>>                 goto err;
>>>>>>>>
>>>>>>>> -     vaddr = i915_gem_object_pin_map(obj, I915_MAP_WB);
>>>>>>>> +     vaddr = i915_gem_object_pin_map(obj,
>>>>>>>> + i915_coherent_map_type(engine->i915, obj, true));
>>>>>>>>         if (IS_ERR(vaddr)) {
>>>>>>>>                 ret = PTR_ERR(vaddr);
>>>>>>>>                 goto err_unpin;
>>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c 
>>>>>>>> b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
>>>>>>>> index 7c9af86fdb1e..47f4397095e5 100644
>>>>>>>> --- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c
>>>>>>>> +++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
>>>>>>>> @@ -23,7 +23,7 @@ static void dbg_poison_ce(struct intel_context 
>>>>>>>> *ce)
>>>>>>>>
>>>>>>>>         if (ce->state) {
>>>>>>>>                 struct drm_i915_gem_object *obj = ce->state->obj;
>>>>>>>> -             int type = i915_coherent_map_type(ce->engine->i915);
>>>>>>>> +             int type = 
>>>>>>>> i915_coherent_map_type(ce->engine->i915, obj, true);
>>>>>>>>                 void *map;
>>>>>>>>
>>>>>>>>                 if (!i915_gem_object_trylock(obj))
>>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c 
>>>>>>>> b/drivers/gpu/drm/i915/gt/intel_lrc.c
>>>>>>>> index e86897cde984..aafe2a4df496 100644
>>>>>>>> --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
>>>>>>>> +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
>>>>>>>> @@ -903,7 +903,9 @@ lrc_pre_pin(struct intel_context *ce,
>>>>>>>>         GEM_BUG_ON(!i915_vma_is_pinned(ce->state));
>>>>>>>>
>>>>>>>>         *vaddr = i915_gem_object_pin_map(ce->state->obj,
>>>>>>>> - i915_coherent_map_type(ce->engine->i915) |
>>>>>>>> + i915_coherent_map_type(ce->engine->i915,
>>>>>>>> + ce->state->obj,
>>>>>>>> + false) |
>>>>>>>>                                          I915_MAP_OVERRIDE);
>>>>>>>>
>>>>>>>>         return PTR_ERR_OR_ZERO(*vaddr);
>>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/intel_ring.c 
>>>>>>>> b/drivers/gpu/drm/i915/gt/intel_ring.c
>>>>>>>> index aee0a77c77e0..3cf6c7e68108 100644
>>>>>>>> --- a/drivers/gpu/drm/i915/gt/intel_ring.c
>>>>>>>> +++ b/drivers/gpu/drm/i915/gt/intel_ring.c
>>>>>>>> @@ -53,9 +53,12 @@ int intel_ring_pin(struct intel_ring *ring, 
>>>>>>>> struct i915_gem_ww_ctx *ww)
>>>>>>>>
>>>>>>>>         if (i915_vma_is_map_and_fenceable(vma))
>>>>>>>>                 addr = (void __force *)i915_vma_pin_iomap(vma);
>>>>>>>> -     else
>>>>>>>> -             addr = i915_gem_object_pin_map(vma->obj,
>>>>>>>> - i915_coherent_map_type(vma->vm->i915));
>>>>>>>> +     else {
>>>>>>>> +             int type = i915_coherent_map_type(vma->vm->i915, 
>>>>>>>> vma->obj, false);
>>>>>>>> +
>>>>>>>> +             addr = i915_gem_object_pin_map(vma->obj, type);
>>>>>>>> +     }
>>>>>>>> +
>>>>>>>>         if (IS_ERR(addr)) {
>>>>>>>>                 ret = PTR_ERR(addr);
>>>>>>>>                 goto err_ring;
>>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/selftest_context.c 
>>>>>>>> b/drivers/gpu/drm/i915/gt/selftest_context.c
>>>>>>>> index b9bdd1d23243..26685b927169 100644
>>>>>>>> --- a/drivers/gpu/drm/i915/gt/selftest_context.c
>>>>>>>> +++ b/drivers/gpu/drm/i915/gt/selftest_context.c
>>>>>>>> @@ -88,7 +88,8 @@ static int __live_context_size(struct 
>>>>>>>> intel_engine_cs *engine)
>>>>>>>>                 goto err;
>>>>>>>>
>>>>>>>>         vaddr = i915_gem_object_pin_map_unlocked(ce->state->obj,
>>>>>>>> - i915_coherent_map_type(engine->i915));
>>>>>>>> + i915_coherent_map_type(engine->i915,
>>>>>>>> + ce->state->obj, false));
>>>>>>>>         if (IS_ERR(vaddr)) {
>>>>>>>>                 err = PTR_ERR(vaddr);
>>>>>>>>                 intel_context_unpin(ce);
>>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c 
>>>>>>>> b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
>>>>>>>> index 746985971c3a..5b63d4df8c93 100644
>>>>>>>> --- a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
>>>>>>>> +++ b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
>>>>>>>> @@ -69,7 +69,7 @@ static int hang_init(struct hang *h, struct 
>>>>>>>> intel_gt *gt)
>>>>>>>>         h->seqno = memset(vaddr, 0xff, PAGE_SIZE);
>>>>>>>>
>>>>>>>>         vaddr = i915_gem_object_pin_map_unlocked(h->obj,
>>>>>>>> - i915_coherent_map_type(gt->i915));
>>>>>>>> + i915_coherent_map_type(gt->i915, h->obj, false));
>>>>>>>>         if (IS_ERR(vaddr)) {
>>>>>>>>                 err = PTR_ERR(vaddr);
>>>>>>>>                 goto err_unpin_hws;
>>>>>>>> @@ -130,7 +130,7 @@ hang_create_request(struct hang *h, struct 
>>>>>>>> intel_engine_cs *engine)
>>>>>>>>                 return ERR_CAST(obj);
>>>>>>>>         }
>>>>>>>>
>>>>>>>> -     vaddr = i915_gem_object_pin_map_unlocked(obj, 
>>>>>>>> i915_coherent_map_type(gt->i915));
>>>>>>>> +     vaddr = i915_gem_object_pin_map_unlocked(obj, 
>>>>>>>> i915_coherent_map_type(gt->i915, obj, false));
>>>>>>>>         if (IS_ERR(vaddr)) {
>>>>>>>>                 i915_gem_object_put(obj);
>>>>>>>>                 i915_vm_put(vm);
>>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/selftest_lrc.c 
>>>>>>>> b/drivers/gpu/drm/i915/gt/selftest_lrc.c
>>>>>>>> index 85e7df6a5123..d8f6623524e8 100644
>>>>>>>> --- a/drivers/gpu/drm/i915/gt/selftest_lrc.c
>>>>>>>> +++ b/drivers/gpu/drm/i915/gt/selftest_lrc.c
>>>>>>>> @@ -1221,7 +1221,9 @@ static int compare_isolation(struct 
>>>>>>>> intel_engine_cs *engine,
>>>>>>>>         }
>>>>>>>>
>>>>>>>>         lrc = i915_gem_object_pin_map_unlocked(ce->state->obj,
>>>>>>>> - i915_coherent_map_type(engine->i915));
>>>>>>>> + i915_coherent_map_type(engine->i915,
>>>>>>>> + ce->state->obj,
>>>>>>>> + false));
>>>>>>>>         if (IS_ERR(lrc)) {
>>>>>>>>                 err = PTR_ERR(lrc);
>>>>>>>>                 goto err_B1;
>>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.c 
>>>>>>>> b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
>>>>>>>> index 78305b2ec89d..adae04c47aab 100644
>>>>>>>> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc.c
>>>>>>>> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
>>>>>>>> @@ -682,7 +682,9 @@ int intel_guc_allocate_and_map_vma(struct 
>>>>>>>> intel_guc *guc, u32 size,
>>>>>>>>         if (IS_ERR(vma))
>>>>>>>>                 return PTR_ERR(vma);
>>>>>>>>
>>>>>>>> -     vaddr = i915_gem_object_pin_map_unlocked(vma->obj, 
>>>>>>>> I915_MAP_WB);
>>>>>>>> +     vaddr = i915_gem_object_pin_map_unlocked(vma->obj,
>>>>>>>> + i915_coherent_map_type(guc_to_gt(guc)->i915,
>>>>>>>> + vma->obj, true));
>>>>>>>>         if (IS_ERR(vaddr)) {
>>>>>>>>                 i915_vma_unpin_and_release(&vma, 0);
>>>>>>>>                 return PTR_ERR(vaddr);
>>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_huc.c 
>>>>>>>> b/drivers/gpu/drm/i915/gt/uc/intel_huc.c
>>>>>>>> index 2126dd81ac38..56d2144dc6a0 100644
>>>>>>>> --- a/drivers/gpu/drm/i915/gt/uc/intel_huc.c
>>>>>>>> +++ b/drivers/gpu/drm/i915/gt/uc/intel_huc.c
>>>>>>>> @@ -82,7 +82,9 @@ static int intel_huc_rsa_data_create(struct 
>>>>>>>> intel_huc *huc)
>>>>>>>>         if (IS_ERR(vma))
>>>>>>>>                 return PTR_ERR(vma);
>>>>>>>>
>>>>>>>> -     vaddr = i915_gem_object_pin_map_unlocked(vma->obj, 
>>>>>>>> I915_MAP_WB);
>>>>>>>> +     vaddr = i915_gem_object_pin_map_unlocked(vma->obj,
>>>>>>>> + i915_coherent_map_type(gt->i915,
>>>>>>>> + vma->obj, true));
>>>>>>>>         if (IS_ERR(vaddr)) {
>>>>>>>>                 i915_vma_unpin_and_release(&vma, 0);
>>>>>>>>                 return PTR_ERR(vaddr);
>>>>>>>> diff --git a/drivers/gpu/drm/i915/i915_drv.h 
>>>>>>>> b/drivers/gpu/drm/i915/i915_drv.h
>>>>>>>> index 69e43bf91a15..2abbc06712a4 100644
>>>>>>>> --- a/drivers/gpu/drm/i915/i915_drv.h
>>>>>>>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>>>>>>>> @@ -78,6 +78,7 @@
>>>>>>>>     #include "gem/i915_gem_context_types.h"
>>>>>>>>     #include "gem/i915_gem_shrinker.h"
>>>>>>>>     #include "gem/i915_gem_stolen.h"
>>>>>>>> +#include "gem/i915_gem_lmem.h"
>>>>>>>>
>>>>>>>>     #include "gt/intel_engine.h"
>>>>>>>>     #include "gt/intel_gt_types.h"
>>>>>>>> @@ -1921,9 +1922,15 @@ static inline int 
>>>>>>>> intel_hws_csb_write_index(struct drm_i915_private *i915)
>>>>>>>>     }
>>>>>>>>
>>>>>>>>     static inline enum i915_map_type
>>>>>>>> -i915_coherent_map_type(struct drm_i915_private *i915)
>>>>>>>> +i915_coherent_map_type(struct drm_i915_private *i915,
>>>>>>>> +                    struct drm_i915_gem_object *obj, bool 
>>>>>>>> always_coherent)
>>>>>>>>     {
>>>>>>>> -     return HAS_LLC(i915) ? I915_MAP_WB : I915_MAP_WC;
>>>>>>>> +     if (i915_gem_object_is_lmem(obj))
>>>>>>>> +             return I915_MAP_WC;
>>>>>>>> +     if (HAS_LLC(i915) || always_coherent)
>>>>>>>> +             return I915_MAP_WB;
>>>>>>>> +     else
>>>>>>>> +             return I915_MAP_WC;
>>>>>>>
>>>>>>> Seems this patch is doing two things.
>>>>>>>
>>>>>>> First it is adding lmem support to this helper by always 
>>>>>>> returning WC
>>>>>>> for lmem objects.
>>>>>>>
>>>>>>> Secondly it is introducing an idea of "always coherent" in a helper
>>>>>>> called i915_coherent_map_type. Could someone explain what is 
>>>>>>> coherent vs
>>>>>>> always coherent?
>>>>>>>
>>>>>>> And also, why is always coherent happy with WB? Sounds counter 
>>>>>>> intuitive
>>>>>>> to me.
>>>>>>
>>>>>> All this does is try to keep the existing behaviour intact, whilst
>>>>>> also ensuring that all lmem objects are mapped using only WC, no
>>>>>> matter what. The always_coherent=true thing is for the existing 
>>>>>> places
>>>>>> where we sometimes map the object using WB, without first considering
>>>>>> whether the device has the fast shared LLC vs snooping. Yes, it's
>>>>>> slightly ugly :)
>>>>>
>>>>> Not fully following - if we had to write kerneldoc for always_coherent
>>>>> input argument - what it would say?
>>>>
>>>> @always_coherent - If true we should always try to map the object
>>>> using WB. If false we should only map as WB if the device supports the
>>>> fast shared LLC; in the case of snooped devices we will map using WC.
>>>> Note that if the resource is lmem then we will always map as WC,
>>>> regardless of the value of always_coherent, since that's all we
>>>> currently support.
>>>>
>>>> Maybe the naming is poor?
>>>
>>> Maybe just confusing to me, not sure yet.
>>>
>>> So always_coherent is not about how the caller wants to use it, but 
>>> about platform knowledge? Or a performance concern for LLC vs 
>>> snooping cases? Does WB work (coherently) on snooping platforms?
>>
>> The always_coherent=true is for the existing callers that want WB, 
>> regardless of LLC vs snooping.
>>
>> The other callers use the existing i915_coherent_map_type() which only 
>> gives out WB for LLC platforms.
>>
>> AFAIK, LLC vs snooping should offer the same in terms of coherency, 
>> but in terms of performance the shared LLC is much faster, and so for 
>> snooping platforms we choose to not enable WB everywhere.
>>
>> On top of that we now have lmem, but for that we only allow WC. This 
>> patch just rolls all of that into one helper, while keeping the 
>> existing behaviour unchanged.
> 
> Thanks. But I am still struggling with the API. :(
> 
> Is the introduction of always_coherent flag in the context of DG1 
> required even? AFAICT for lmem objects the flag is ignored so no?

If we drop the flag/helper thing, then we need something like:

type = I915_MAP_WB;
if (i915_gem_object_is_lmem(obj))
     type = I915_MAP_WC;

vaddr = i915_gem_object_pin_map(obj, type);

in all the places where we currently do:

vaddr = i915_gem_object_pin_map(obj, I915_MAP_WB);

and where obj can be lmem (ctx, ring, guc, etc.). Is that better or worse? 
The existing i915_coherent_map_type() callers should work as-is, since 
DG1 is snooped. This patch just extends that to cover all cases.

Perhaps we need a new helper instead? Maybe you have a better idea?
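
For illustration only, here is a minimal userspace sketch of what such a
separate helper might look like, assuming it only has to serve the call
sites that hard-code WB today. The names sketch_map_type, sketch_object
and sketch_wb_unless_lmem are hypothetical stand-ins, not part of i915;
the is_lmem field models the result of i915_gem_object_is_lmem().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for enum i915_map_type (WB and WC only). */
enum sketch_map_type { SKETCH_MAP_WB, SKETCH_MAP_WC };

/* Hypothetical stand-in for struct drm_i915_gem_object. */
struct sketch_object {
	bool is_lmem;	/* models i915_gem_object_is_lmem(obj) */
};

/*
 * For call sites that previously passed WB unconditionally:
 * keep WB for system memory, but use WC when the object is in lmem.
 */
static enum sketch_map_type
sketch_wb_unless_lmem(const struct sketch_object *obj)
{
	assert(obj != NULL);

	return obj->is_lmem ? SKETCH_MAP_WC : SKETCH_MAP_WB;
}
```

That would leave the existing LLC-vs-snooping helper untouched for its
current callers, at the cost of having two helpers to choose between.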

> 
> Regards,
> 
> Tvrtko

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [Intel-gfx] [PATCH 11/19] drm/i915: Update the helper to set correct mapping
  2021-04-19 14:37                 ` Matthew Auld
@ 2021-04-19 15:01                   ` Tvrtko Ursulin
  2021-04-21 11:42                     ` Matthew Auld
  0 siblings, 1 reply; 65+ messages in thread
From: Tvrtko Ursulin @ 2021-04-19 15:01 UTC (permalink / raw)
  To: Matthew Auld, Matthew Auld; +Cc: Intel Graphics Development, ML dri-devel


On 19/04/2021 15:37, Matthew Auld wrote:
> On 19/04/2021 15:07, Tvrtko Ursulin wrote:
>>
>> On 19/04/2021 12:30, Matthew Auld wrote:
>>> On 15/04/2021 12:05, Tvrtko Ursulin wrote:
>>>>
>>>> On 15/04/2021 10:23, Matthew Auld wrote:
>>>>> On Thu, 15 Apr 2021 at 09:21, Tvrtko Ursulin
>>>>> <tvrtko.ursulin@linux.intel.com> wrote:
>>>>>>
>>>>>>
>>>>>> On 14/04/2021 17:20, Matthew Auld wrote:
>>>>>>> On Wed, 14 Apr 2021 at 16:22, Tvrtko Ursulin
>>>>>>> <tvrtko.ursulin@linux.intel.com> wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> On 12/04/2021 10:05, Matthew Auld wrote:
>>>>>>>>> From: Venkata Sandeep Dhanalakota 
>>>>>>>>> <venkata.s.dhanalakota@intel.com>
>>>>>>>>>
>>>>>>>>> Determine the possible coherent map type based on object location,
>>>>>>>>> and if target has llc or if user requires an always coherent
>>>>>>>>> mapping.
>>>>>>>>>
>>>>>>>>> Cc: Matthew Auld <matthew.auld@intel.com>
>>>>>>>>> Cc: CQ Tang <cq.tang@intel.com>
>>>>>>>>> Suggested-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
>>>>>>>>> Signed-off-by: Venkata Sandeep Dhanalakota 
>>>>>>>>> <venkata.s.dhanalakota@intel.com>
>>>>>>>>> ---
>>>>>>>>>     drivers/gpu/drm/i915/gt/intel_engine_cs.c    |  3 ++-
>>>>>>>>>     drivers/gpu/drm/i915/gt/intel_engine_pm.c    |  2 +-
>>>>>>>>>     drivers/gpu/drm/i915/gt/intel_lrc.c          |  4 +++-
>>>>>>>>>     drivers/gpu/drm/i915/gt/intel_ring.c         |  9 ++++++---
>>>>>>>>>     drivers/gpu/drm/i915/gt/selftest_context.c   |  3 ++-
>>>>>>>>>     drivers/gpu/drm/i915/gt/selftest_hangcheck.c |  4 ++--
>>>>>>>>>     drivers/gpu/drm/i915/gt/selftest_lrc.c       |  4 +++-
>>>>>>>>>     drivers/gpu/drm/i915/gt/uc/intel_guc.c       |  4 +++-
>>>>>>>>>     drivers/gpu/drm/i915/gt/uc/intel_huc.c       |  4 +++-
>>>>>>>>>     drivers/gpu/drm/i915/i915_drv.h              | 11 +++++++++--
>>>>>>>>>     drivers/gpu/drm/i915/selftests/igt_spinner.c |  4 ++--
>>>>>>>>>     11 files changed, 36 insertions(+), 16 deletions(-)
>>>>>>>>>
>>>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c 
>>>>>>>>> b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
>>>>>>>>> index efe935f80c1a..b79568d370f5 100644
>>>>>>>>> --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
>>>>>>>>> +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
>>>>>>>>> @@ -664,7 +664,8 @@ static int init_status_page(struct 
>>>>>>>>> intel_engine_cs *engine)
>>>>>>>>>         if (ret)
>>>>>>>>>                 goto err;
>>>>>>>>>
>>>>>>>>> -     vaddr = i915_gem_object_pin_map(obj, I915_MAP_WB);
>>>>>>>>> +     vaddr = i915_gem_object_pin_map(obj,
>>>>>>>>> + i915_coherent_map_type(engine->i915, obj, true));
>>>>>>>>>         if (IS_ERR(vaddr)) {
>>>>>>>>>                 ret = PTR_ERR(vaddr);
>>>>>>>>>                 goto err_unpin;
>>>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c 
>>>>>>>>> b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
>>>>>>>>> index 7c9af86fdb1e..47f4397095e5 100644
>>>>>>>>> --- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c
>>>>>>>>> +++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
>>>>>>>>> @@ -23,7 +23,7 @@ static void dbg_poison_ce(struct 
>>>>>>>>> intel_context *ce)
>>>>>>>>>
>>>>>>>>>         if (ce->state) {
>>>>>>>>>                 struct drm_i915_gem_object *obj = ce->state->obj;
>>>>>>>>> -             int type = i915_coherent_map_type(ce->engine->i915);
>>>>>>>>> +             int type = 
>>>>>>>>> i915_coherent_map_type(ce->engine->i915, obj, true);
>>>>>>>>>                 void *map;
>>>>>>>>>
>>>>>>>>>                 if (!i915_gem_object_trylock(obj))
>>>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c 
>>>>>>>>> b/drivers/gpu/drm/i915/gt/intel_lrc.c
>>>>>>>>> index e86897cde984..aafe2a4df496 100644
>>>>>>>>> --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
>>>>>>>>> +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
>>>>>>>>> @@ -903,7 +903,9 @@ lrc_pre_pin(struct intel_context *ce,
>>>>>>>>>         GEM_BUG_ON(!i915_vma_is_pinned(ce->state));
>>>>>>>>>
>>>>>>>>>         *vaddr = i915_gem_object_pin_map(ce->state->obj,
>>>>>>>>> - i915_coherent_map_type(ce->engine->i915) |
>>>>>>>>> + i915_coherent_map_type(ce->engine->i915,
>>>>>>>>> + ce->state->obj,
>>>>>>>>> + false) |
>>>>>>>>>                                          I915_MAP_OVERRIDE);
>>>>>>>>>
>>>>>>>>>         return PTR_ERR_OR_ZERO(*vaddr);
>>>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/intel_ring.c 
>>>>>>>>> b/drivers/gpu/drm/i915/gt/intel_ring.c
>>>>>>>>> index aee0a77c77e0..3cf6c7e68108 100644
>>>>>>>>> --- a/drivers/gpu/drm/i915/gt/intel_ring.c
>>>>>>>>> +++ b/drivers/gpu/drm/i915/gt/intel_ring.c
>>>>>>>>> @@ -53,9 +53,12 @@ int intel_ring_pin(struct intel_ring *ring, 
>>>>>>>>> struct i915_gem_ww_ctx *ww)
>>>>>>>>>
>>>>>>>>>         if (i915_vma_is_map_and_fenceable(vma))
>>>>>>>>>                 addr = (void __force *)i915_vma_pin_iomap(vma);
>>>>>>>>> -     else
>>>>>>>>> -             addr = i915_gem_object_pin_map(vma->obj,
>>>>>>>>> - i915_coherent_map_type(vma->vm->i915));
>>>>>>>>> +     else {
>>>>>>>>> +             int type = i915_coherent_map_type(vma->vm->i915, 
>>>>>>>>> vma->obj, false);
>>>>>>>>> +
>>>>>>>>> +             addr = i915_gem_object_pin_map(vma->obj, type);
>>>>>>>>> +     }
>>>>>>>>> +
>>>>>>>>>         if (IS_ERR(addr)) {
>>>>>>>>>                 ret = PTR_ERR(addr);
>>>>>>>>>                 goto err_ring;
>>>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/selftest_context.c 
>>>>>>>>> b/drivers/gpu/drm/i915/gt/selftest_context.c
>>>>>>>>> index b9bdd1d23243..26685b927169 100644
>>>>>>>>> --- a/drivers/gpu/drm/i915/gt/selftest_context.c
>>>>>>>>> +++ b/drivers/gpu/drm/i915/gt/selftest_context.c
>>>>>>>>> @@ -88,7 +88,8 @@ static int __live_context_size(struct 
>>>>>>>>> intel_engine_cs *engine)
>>>>>>>>>                 goto err;
>>>>>>>>>
>>>>>>>>>         vaddr = i915_gem_object_pin_map_unlocked(ce->state->obj,
>>>>>>>>> - i915_coherent_map_type(engine->i915));
>>>>>>>>> + i915_coherent_map_type(engine->i915,
>>>>>>>>> + ce->state->obj, false));
>>>>>>>>>         if (IS_ERR(vaddr)) {
>>>>>>>>>                 err = PTR_ERR(vaddr);
>>>>>>>>>                 intel_context_unpin(ce);
>>>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c 
>>>>>>>>> b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
>>>>>>>>> index 746985971c3a..5b63d4df8c93 100644
>>>>>>>>> --- a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
>>>>>>>>> +++ b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
>>>>>>>>> @@ -69,7 +69,7 @@ static int hang_init(struct hang *h, struct 
>>>>>>>>> intel_gt *gt)
>>>>>>>>>         h->seqno = memset(vaddr, 0xff, PAGE_SIZE);
>>>>>>>>>
>>>>>>>>>         vaddr = i915_gem_object_pin_map_unlocked(h->obj,
>>>>>>>>> - i915_coherent_map_type(gt->i915));
>>>>>>>>> + i915_coherent_map_type(gt->i915, h->obj, false));
>>>>>>>>>         if (IS_ERR(vaddr)) {
>>>>>>>>>                 err = PTR_ERR(vaddr);
>>>>>>>>>                 goto err_unpin_hws;
>>>>>>>>> @@ -130,7 +130,7 @@ hang_create_request(struct hang *h, struct 
>>>>>>>>> intel_engine_cs *engine)
>>>>>>>>>                 return ERR_CAST(obj);
>>>>>>>>>         }
>>>>>>>>>
>>>>>>>>> -     vaddr = i915_gem_object_pin_map_unlocked(obj, 
>>>>>>>>> i915_coherent_map_type(gt->i915));
>>>>>>>>> +     vaddr = i915_gem_object_pin_map_unlocked(obj, 
>>>>>>>>> i915_coherent_map_type(gt->i915, obj, false));
>>>>>>>>>         if (IS_ERR(vaddr)) {
>>>>>>>>>                 i915_gem_object_put(obj);
>>>>>>>>>                 i915_vm_put(vm);
>>>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/selftest_lrc.c 
>>>>>>>>> b/drivers/gpu/drm/i915/gt/selftest_lrc.c
>>>>>>>>> index 85e7df6a5123..d8f6623524e8 100644
>>>>>>>>> --- a/drivers/gpu/drm/i915/gt/selftest_lrc.c
>>>>>>>>> +++ b/drivers/gpu/drm/i915/gt/selftest_lrc.c
>>>>>>>>> @@ -1221,7 +1221,9 @@ static int compare_isolation(struct 
>>>>>>>>> intel_engine_cs *engine,
>>>>>>>>>         }
>>>>>>>>>
>>>>>>>>>         lrc = i915_gem_object_pin_map_unlocked(ce->state->obj,
>>>>>>>>> - i915_coherent_map_type(engine->i915));
>>>>>>>>> + i915_coherent_map_type(engine->i915,
>>>>>>>>> + ce->state->obj,
>>>>>>>>> + false));
>>>>>>>>>         if (IS_ERR(lrc)) {
>>>>>>>>>                 err = PTR_ERR(lrc);
>>>>>>>>>                 goto err_B1;
>>>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.c 
>>>>>>>>> b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
>>>>>>>>> index 78305b2ec89d..adae04c47aab 100644
>>>>>>>>> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc.c
>>>>>>>>> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
>>>>>>>>> @@ -682,7 +682,9 @@ int intel_guc_allocate_and_map_vma(struct 
>>>>>>>>> intel_guc *guc, u32 size,
>>>>>>>>>         if (IS_ERR(vma))
>>>>>>>>>                 return PTR_ERR(vma);
>>>>>>>>>
>>>>>>>>> -     vaddr = i915_gem_object_pin_map_unlocked(vma->obj, 
>>>>>>>>> I915_MAP_WB);
>>>>>>>>> +     vaddr = i915_gem_object_pin_map_unlocked(vma->obj,
>>>>>>>>> + i915_coherent_map_type(guc_to_gt(guc)->i915,
>>>>>>>>> + vma->obj, true));
>>>>>>>>>         if (IS_ERR(vaddr)) {
>>>>>>>>>                 i915_vma_unpin_and_release(&vma, 0);
>>>>>>>>>                 return PTR_ERR(vaddr);
>>>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_huc.c 
>>>>>>>>> b/drivers/gpu/drm/i915/gt/uc/intel_huc.c
>>>>>>>>> index 2126dd81ac38..56d2144dc6a0 100644
>>>>>>>>> --- a/drivers/gpu/drm/i915/gt/uc/intel_huc.c
>>>>>>>>> +++ b/drivers/gpu/drm/i915/gt/uc/intel_huc.c
>>>>>>>>> @@ -82,7 +82,9 @@ static int intel_huc_rsa_data_create(struct 
>>>>>>>>> intel_huc *huc)
>>>>>>>>>         if (IS_ERR(vma))
>>>>>>>>>                 return PTR_ERR(vma);
>>>>>>>>>
>>>>>>>>> -     vaddr = i915_gem_object_pin_map_unlocked(vma->obj, 
>>>>>>>>> I915_MAP_WB);
>>>>>>>>> +     vaddr = i915_gem_object_pin_map_unlocked(vma->obj,
>>>>>>>>> + i915_coherent_map_type(gt->i915,
>>>>>>>>> + vma->obj, true));
>>>>>>>>>         if (IS_ERR(vaddr)) {
>>>>>>>>>                 i915_vma_unpin_and_release(&vma, 0);
>>>>>>>>>                 return PTR_ERR(vaddr);
>>>>>>>>> diff --git a/drivers/gpu/drm/i915/i915_drv.h 
>>>>>>>>> b/drivers/gpu/drm/i915/i915_drv.h
>>>>>>>>> index 69e43bf91a15..2abbc06712a4 100644
>>>>>>>>> --- a/drivers/gpu/drm/i915/i915_drv.h
>>>>>>>>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>>>>>>>>> @@ -78,6 +78,7 @@
>>>>>>>>>     #include "gem/i915_gem_context_types.h"
>>>>>>>>>     #include "gem/i915_gem_shrinker.h"
>>>>>>>>>     #include "gem/i915_gem_stolen.h"
>>>>>>>>> +#include "gem/i915_gem_lmem.h"
>>>>>>>>>
>>>>>>>>>     #include "gt/intel_engine.h"
>>>>>>>>>     #include "gt/intel_gt_types.h"
>>>>>>>>> @@ -1921,9 +1922,15 @@ static inline int 
>>>>>>>>> intel_hws_csb_write_index(struct drm_i915_private *i915)
>>>>>>>>>     }
>>>>>>>>>
>>>>>>>>>     static inline enum i915_map_type
>>>>>>>>> -i915_coherent_map_type(struct drm_i915_private *i915)
>>>>>>>>> +i915_coherent_map_type(struct drm_i915_private *i915,
>>>>>>>>> +                    struct drm_i915_gem_object *obj, bool 
>>>>>>>>> always_coherent)
>>>>>>>>>     {
>>>>>>>>> -     return HAS_LLC(i915) ? I915_MAP_WB : I915_MAP_WC;
>>>>>>>>> +     if (i915_gem_object_is_lmem(obj))
>>>>>>>>> +             return I915_MAP_WC;
>>>>>>>>> +     if (HAS_LLC(i915) || always_coherent)
>>>>>>>>> +             return I915_MAP_WB;
>>>>>>>>> +     else
>>>>>>>>> +             return I915_MAP_WC;
>>>>>>>>
>>>>>>>> Seems this patch is doing two things.
>>>>>>>>
>>>>>>>> First it is adding lmem support to this helper by always 
>>>>>>>> returning WC
>>>>>>>> for lmem objects.
>>>>>>>>
>>>>>>>> Secondly it is introducing an idea of "always coherent" in a helper
>>>>>>>> called i915_coherent_map_type. Could someone explain what is 
>>>>>>>> coherent vs
>>>>>>>> always coherent?
>>>>>>>>
>>>>>>>> And also, why is always coherent happy with WB? Sounds counter 
>>>>>>>> intuitive
>>>>>>>> to me.
>>>>>>>
>>>>>>> All this does is try to keep the existing behaviour intact, whilst
>>>>>>> also ensuring that all lmem objects are mapped using only WC, no
>>>>>>> matter what. The always_coherent=true thing is for the existing 
>>>>>>> places
>>>>>>> where we sometimes map the object using WB, without first 
>>>>>>> considering
>>>>>>> whether the device has the fast shared LLC vs snooping. Yes, it's
>>>>>>> slightly ugly :)
>>>>>>
>>>>>> Not fully following - if we had to write kerneldoc for 
>>>>>> always_coherent
>>>>>> input argument - what it would say?
>>>>>
>>>>> @always_coherent - If true we should always try to map the object
>>>>> using WB. If false we should only map as WB if the device supports the
>>>>> fast shared LLC; in the case of snooped devices we will map using WC.
>>>>> Note that if the resource is lmem then we will always map as WC,
>>>>> regardless of the value of always_coherent, since that's all we
>>>>> currently support.
>>>>>
>>>>> Maybe the naming is poor?
>>>>
>>>> Maybe just confusing to me, not sure yet.
>>>>
>>>> So always_coherent is not about how the caller wants to use it, but 
>>>> about platform knowledge? Or a performance concern for LLC vs 
>>>> snooping cases? Does WB work (coherently) on snooping platforms?
>>>
>>> The always_coherent=true is for the existing callers that want WB, 
>>> regardless of LLC vs snooping.
>>>
>>> The other callers use the existing i915_coherent_map_type() which 
>>> only gives out WB for LLC platforms.
>>>
>>> AFAIK, LLC vs snooping should offer the same in terms of coherency, 
>>> but in terms of performance the shared LLC is much faster, and so for 
>>> snooping platforms we choose to not enable WB everywhere.
>>>
>>> On top of that we now have lmem, but for that we only allow WC. This 
>>> patch just rolls all of that into one helper, while keeping the 
>>> existing behaviour unchanged.
>>
>> Thanks. But I am still struggling with the API. :(
>>
>> Is the introduction of always_coherent flag in the context of DG1 
>> required even? AFAICT for lmem objects the flag is ignored so no?
> 
> If we drop the flag/helper thing, then we need something like:
> 
> type = WB;
> if (i915_gem_object_is_lmem(obj))
>      type = WC;
> 
> vaddr = i915_gem_object_pin_map(obj, type);
> 
> In all the places where we currently do:
> 
> vaddr = i915_gem_object_pin_map(obj, WB);
> 
> Where obj can be lmem, so ctx, ring, guc etc. Is that better or worse? 
> The existing i915_coherent_map_type() callers should work as-is, since 
> DG1 is snooped. And this patch just extends that to cover all cases.
> 
> Perhaps we need a new helper instead? Maybe you have a better idea?

Not yet. Would it make sense to put something in the kerneldoc about when 
callers might choose always_coherent true vs false? Say, in terms of 
expected usage (frequency, simplicity?), plus any rules about when callers 
need to worry about flushing/ordering when reads and writes are mixed?
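
As a rough starting point, such a kerneldoc could be sketched as below.
This is a userspace model, not the driver code: doc_map_type, doc_device
and doc_object are hypothetical stand-ins for enum i915_map_type,
struct drm_i915_private and struct drm_i915_gem_object, and the comment
only restates what has been said in this thread. The flushing/ordering
rules are left out because they have not been spelled out here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types. */
enum doc_map_type { DOC_MAP_WB, DOC_MAP_WC };
struct doc_device { bool has_llc; };	/* models HAS_LLC(i915) */
struct doc_object { bool is_lmem; };	/* models i915_gem_object_is_lmem() */

/**
 * doc_coherent_map_type - select the CPU mapping type for an object
 * @dev: device the object belongs to
 * @obj: object to be mapped
 * @always_coherent: if true, always try to map the object using WB.
 *	If false, only map as WB if the device has the fast shared LLC;
 *	snooped devices get WC instead.
 *
 * If @obj is in lmem the result is always WC, regardless of
 * @always_coherent, since that is all that is currently supported.
 */
static enum doc_map_type
doc_coherent_map_type(const struct doc_device *dev,
		      const struct doc_object *obj, bool always_coherent)
{
	assert(dev != NULL && obj != NULL);

	if (obj->is_lmem)
		return DOC_MAP_WC;
	if (dev->has_llc || always_coherent)
		return DOC_MAP_WB;
	return DOC_MAP_WC;
}
```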

Regards,

Tvrtko

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [Intel-gfx] [PATCH 11/19] drm/i915: Update the helper to set correct mapping
  2021-04-19 15:01                   ` Tvrtko Ursulin
@ 2021-04-21 11:42                     ` Matthew Auld
  2021-04-21 15:41                       ` Tvrtko Ursulin
  0 siblings, 1 reply; 65+ messages in thread
From: Matthew Auld @ 2021-04-21 11:42 UTC (permalink / raw)
  To: Tvrtko Ursulin, Matthew Auld; +Cc: Intel Graphics Development, ML dri-devel

On 19/04/2021 16:01, Tvrtko Ursulin wrote:
> 
> On 19/04/2021 15:37, Matthew Auld wrote:
>> On 19/04/2021 15:07, Tvrtko Ursulin wrote:
>>>
>>> On 19/04/2021 12:30, Matthew Auld wrote:
>>>> On 15/04/2021 12:05, Tvrtko Ursulin wrote:
>>>>>
>>>>> On 15/04/2021 10:23, Matthew Auld wrote:
>>>>>> On Thu, 15 Apr 2021 at 09:21, Tvrtko Ursulin
>>>>>> <tvrtko.ursulin@linux.intel.com> wrote:
>>>>>>>
>>>>>>>
>>>>>>> On 14/04/2021 17:20, Matthew Auld wrote:
>>>>>>>> On Wed, 14 Apr 2021 at 16:22, Tvrtko Ursulin
>>>>>>>> <tvrtko.ursulin@linux.intel.com> wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 12/04/2021 10:05, Matthew Auld wrote:
>>>>>>>>>> From: Venkata Sandeep Dhanalakota 
>>>>>>>>>> <venkata.s.dhanalakota@intel.com>
>>>>>>>>>>
>>>>>>>>>> Determine the possible coherent map type based on object 
>>>>>>>>>> location,
>>>>>>>>>> and if target has llc or if user requires an always coherent
>>>>>>>>>> mapping.
>>>>>>>>>>
>>>>>>>>>> Cc: Matthew Auld <matthew.auld@intel.com>
>>>>>>>>>> Cc: CQ Tang <cq.tang@intel.com>
>>>>>>>>>> Suggested-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
>>>>>>>>>> Signed-off-by: Venkata Sandeep Dhanalakota 
>>>>>>>>>> <venkata.s.dhanalakota@intel.com>
>>>>>>>>>> ---
>>>>>>>>>>     drivers/gpu/drm/i915/gt/intel_engine_cs.c    |  3 ++-
>>>>>>>>>>     drivers/gpu/drm/i915/gt/intel_engine_pm.c    |  2 +-
>>>>>>>>>>     drivers/gpu/drm/i915/gt/intel_lrc.c          |  4 +++-
>>>>>>>>>>     drivers/gpu/drm/i915/gt/intel_ring.c         |  9 ++++++---
>>>>>>>>>>     drivers/gpu/drm/i915/gt/selftest_context.c   |  3 ++-
>>>>>>>>>>     drivers/gpu/drm/i915/gt/selftest_hangcheck.c |  4 ++--
>>>>>>>>>>     drivers/gpu/drm/i915/gt/selftest_lrc.c       |  4 +++-
>>>>>>>>>>     drivers/gpu/drm/i915/gt/uc/intel_guc.c       |  4 +++-
>>>>>>>>>>     drivers/gpu/drm/i915/gt/uc/intel_huc.c       |  4 +++-
>>>>>>>>>>     drivers/gpu/drm/i915/i915_drv.h              | 11 +++++++++--
>>>>>>>>>>     drivers/gpu/drm/i915/selftests/igt_spinner.c |  4 ++--
>>>>>>>>>>     11 files changed, 36 insertions(+), 16 deletions(-)
>>>>>>>>>>
>>>>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c 
>>>>>>>>>> b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
>>>>>>>>>> index efe935f80c1a..b79568d370f5 100644
>>>>>>>>>> --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
>>>>>>>>>> +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
>>>>>>>>>> @@ -664,7 +664,8 @@ static int init_status_page(struct 
>>>>>>>>>> intel_engine_cs *engine)
>>>>>>>>>>         if (ret)
>>>>>>>>>>                 goto err;
>>>>>>>>>>
>>>>>>>>>> -     vaddr = i915_gem_object_pin_map(obj, I915_MAP_WB);
>>>>>>>>>> +     vaddr = i915_gem_object_pin_map(obj,
>>>>>>>>>> + i915_coherent_map_type(engine->i915, obj, true));
>>>>>>>>>>         if (IS_ERR(vaddr)) {
>>>>>>>>>>                 ret = PTR_ERR(vaddr);
>>>>>>>>>>                 goto err_unpin;
>>>>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c 
>>>>>>>>>> b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
>>>>>>>>>> index 7c9af86fdb1e..47f4397095e5 100644
>>>>>>>>>> --- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c
>>>>>>>>>> +++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
>>>>>>>>>> @@ -23,7 +23,7 @@ static void dbg_poison_ce(struct 
>>>>>>>>>> intel_context *ce)
>>>>>>>>>>
>>>>>>>>>>         if (ce->state) {
>>>>>>>>>>                 struct drm_i915_gem_object *obj = ce->state->obj;
>>>>>>>>>> -             int type = 
>>>>>>>>>> i915_coherent_map_type(ce->engine->i915);
>>>>>>>>>> +             int type = 
>>>>>>>>>> i915_coherent_map_type(ce->engine->i915, obj, true);
>>>>>>>>>>                 void *map;
>>>>>>>>>>
>>>>>>>>>>                 if (!i915_gem_object_trylock(obj))
>>>>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c 
>>>>>>>>>> b/drivers/gpu/drm/i915/gt/intel_lrc.c
>>>>>>>>>> index e86897cde984..aafe2a4df496 100644
>>>>>>>>>> --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
>>>>>>>>>> +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
>>>>>>>>>> @@ -903,7 +903,9 @@ lrc_pre_pin(struct intel_context *ce,
>>>>>>>>>>         GEM_BUG_ON(!i915_vma_is_pinned(ce->state));
>>>>>>>>>>
>>>>>>>>>>         *vaddr = i915_gem_object_pin_map(ce->state->obj,
>>>>>>>>>> - i915_coherent_map_type(ce->engine->i915) |
>>>>>>>>>> + i915_coherent_map_type(ce->engine->i915,
>>>>>>>>>> + ce->state->obj,
>>>>>>>>>> + false) |
>>>>>>>>>>                                          I915_MAP_OVERRIDE);
>>>>>>>>>>
>>>>>>>>>>         return PTR_ERR_OR_ZERO(*vaddr);
>>>>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/intel_ring.c 
>>>>>>>>>> b/drivers/gpu/drm/i915/gt/intel_ring.c
>>>>>>>>>> index aee0a77c77e0..3cf6c7e68108 100644
>>>>>>>>>> --- a/drivers/gpu/drm/i915/gt/intel_ring.c
>>>>>>>>>> +++ b/drivers/gpu/drm/i915/gt/intel_ring.c
>>>>>>>>>> @@ -53,9 +53,12 @@ int intel_ring_pin(struct intel_ring *ring, 
>>>>>>>>>> struct i915_gem_ww_ctx *ww)
>>>>>>>>>>
>>>>>>>>>>         if (i915_vma_is_map_and_fenceable(vma))
>>>>>>>>>>                 addr = (void __force *)i915_vma_pin_iomap(vma);
>>>>>>>>>> -     else
>>>>>>>>>> -             addr = i915_gem_object_pin_map(vma->obj,
>>>>>>>>>> - i915_coherent_map_type(vma->vm->i915));
>>>>>>>>>> +     else {
>>>>>>>>>> +             int type = i915_coherent_map_type(vma->vm->i915, 
>>>>>>>>>> vma->obj, false);
>>>>>>>>>> +
>>>>>>>>>> +             addr = i915_gem_object_pin_map(vma->obj, type);
>>>>>>>>>> +     }
>>>>>>>>>> +
>>>>>>>>>>         if (IS_ERR(addr)) {
>>>>>>>>>>                 ret = PTR_ERR(addr);
>>>>>>>>>>                 goto err_ring;
>>>>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/selftest_context.c 
>>>>>>>>>> b/drivers/gpu/drm/i915/gt/selftest_context.c
>>>>>>>>>> index b9bdd1d23243..26685b927169 100644
>>>>>>>>>> --- a/drivers/gpu/drm/i915/gt/selftest_context.c
>>>>>>>>>> +++ b/drivers/gpu/drm/i915/gt/selftest_context.c
>>>>>>>>>> @@ -88,7 +88,8 @@ static int __live_context_size(struct 
>>>>>>>>>> intel_engine_cs *engine)
>>>>>>>>>>                 goto err;
>>>>>>>>>>
>>>>>>>>>>         vaddr = i915_gem_object_pin_map_unlocked(ce->state->obj,
>>>>>>>>>> - i915_coherent_map_type(engine->i915));
>>>>>>>>>> + i915_coherent_map_type(engine->i915,
>>>>>>>>>> + ce->state->obj, false));
>>>>>>>>>>         if (IS_ERR(vaddr)) {
>>>>>>>>>>                 err = PTR_ERR(vaddr);
>>>>>>>>>>                 intel_context_unpin(ce);
>>>>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c 
>>>>>>>>>> b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
>>>>>>>>>> index 746985971c3a..5b63d4df8c93 100644
>>>>>>>>>> --- a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
>>>>>>>>>> +++ b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
>>>>>>>>>> @@ -69,7 +69,7 @@ static int hang_init(struct hang *h, struct 
>>>>>>>>>> intel_gt *gt)
>>>>>>>>>>         h->seqno = memset(vaddr, 0xff, PAGE_SIZE);
>>>>>>>>>>
>>>>>>>>>>         vaddr = i915_gem_object_pin_map_unlocked(h->obj,
>>>>>>>>>> - i915_coherent_map_type(gt->i915));
>>>>>>>>>> + i915_coherent_map_type(gt->i915, h->obj, false));
>>>>>>>>>>         if (IS_ERR(vaddr)) {
>>>>>>>>>>                 err = PTR_ERR(vaddr);
>>>>>>>>>>                 goto err_unpin_hws;
>>>>>>>>>> @@ -130,7 +130,7 @@ hang_create_request(struct hang *h, struct 
>>>>>>>>>> intel_engine_cs *engine)
>>>>>>>>>>                 return ERR_CAST(obj);
>>>>>>>>>>         }
>>>>>>>>>>
>>>>>>>>>> -     vaddr = i915_gem_object_pin_map_unlocked(obj, 
>>>>>>>>>> i915_coherent_map_type(gt->i915));
>>>>>>>>>> +     vaddr = i915_gem_object_pin_map_unlocked(obj, 
>>>>>>>>>> i915_coherent_map_type(gt->i915, obj, false));
>>>>>>>>>>         if (IS_ERR(vaddr)) {
>>>>>>>>>>                 i915_gem_object_put(obj);
>>>>>>>>>>                 i915_vm_put(vm);
>>>>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/selftest_lrc.c 
>>>>>>>>>> b/drivers/gpu/drm/i915/gt/selftest_lrc.c
>>>>>>>>>> index 85e7df6a5123..d8f6623524e8 100644
>>>>>>>>>> --- a/drivers/gpu/drm/i915/gt/selftest_lrc.c
>>>>>>>>>> +++ b/drivers/gpu/drm/i915/gt/selftest_lrc.c
>>>>>>>>>> @@ -1221,7 +1221,9 @@ static int compare_isolation(struct 
>>>>>>>>>> intel_engine_cs *engine,
>>>>>>>>>>         }
>>>>>>>>>>
>>>>>>>>>>         lrc = i915_gem_object_pin_map_unlocked(ce->state->obj,
>>>>>>>>>> - i915_coherent_map_type(engine->i915));
>>>>>>>>>> + i915_coherent_map_type(engine->i915,
>>>>>>>>>> + ce->state->obj,
>>>>>>>>>> + false));
>>>>>>>>>>         if (IS_ERR(lrc)) {
>>>>>>>>>>                 err = PTR_ERR(lrc);
>>>>>>>>>>                 goto err_B1;
>>>>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.c 
>>>>>>>>>> b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
>>>>>>>>>> index 78305b2ec89d..adae04c47aab 100644
>>>>>>>>>> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc.c
>>>>>>>>>> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
>>>>>>>>>> @@ -682,7 +682,9 @@ int intel_guc_allocate_and_map_vma(struct 
>>>>>>>>>> intel_guc *guc, u32 size,
>>>>>>>>>>         if (IS_ERR(vma))
>>>>>>>>>>                 return PTR_ERR(vma);
>>>>>>>>>>
>>>>>>>>>> -     vaddr = i915_gem_object_pin_map_unlocked(vma->obj, 
>>>>>>>>>> I915_MAP_WB);
>>>>>>>>>> +     vaddr = i915_gem_object_pin_map_unlocked(vma->obj,
>>>>>>>>>> + i915_coherent_map_type(guc_to_gt(guc)->i915,
>>>>>>>>>> + vma->obj, true));
>>>>>>>>>>         if (IS_ERR(vaddr)) {
>>>>>>>>>>                 i915_vma_unpin_and_release(&vma, 0);
>>>>>>>>>>                 return PTR_ERR(vaddr);
>>>>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_huc.c 
>>>>>>>>>> b/drivers/gpu/drm/i915/gt/uc/intel_huc.c
>>>>>>>>>> index 2126dd81ac38..56d2144dc6a0 100644
>>>>>>>>>> --- a/drivers/gpu/drm/i915/gt/uc/intel_huc.c
>>>>>>>>>> +++ b/drivers/gpu/drm/i915/gt/uc/intel_huc.c
>>>>>>>>>> @@ -82,7 +82,9 @@ static int intel_huc_rsa_data_create(struct 
>>>>>>>>>> intel_huc *huc)
>>>>>>>>>>         if (IS_ERR(vma))
>>>>>>>>>>                 return PTR_ERR(vma);
>>>>>>>>>>
>>>>>>>>>> -     vaddr = i915_gem_object_pin_map_unlocked(vma->obj, 
>>>>>>>>>> I915_MAP_WB);
>>>>>>>>>> +     vaddr = i915_gem_object_pin_map_unlocked(vma->obj,
>>>>>>>>>> + i915_coherent_map_type(gt->i915,
>>>>>>>>>> + vma->obj, true));
>>>>>>>>>>         if (IS_ERR(vaddr)) {
>>>>>>>>>>                 i915_vma_unpin_and_release(&vma, 0);
>>>>>>>>>>                 return PTR_ERR(vaddr);
>>>>>>>>>> diff --git a/drivers/gpu/drm/i915/i915_drv.h 
>>>>>>>>>> b/drivers/gpu/drm/i915/i915_drv.h
>>>>>>>>>> index 69e43bf91a15..2abbc06712a4 100644
>>>>>>>>>> --- a/drivers/gpu/drm/i915/i915_drv.h
>>>>>>>>>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>>>>>>>>>> @@ -78,6 +78,7 @@
>>>>>>>>>>     #include "gem/i915_gem_context_types.h"
>>>>>>>>>>     #include "gem/i915_gem_shrinker.h"
>>>>>>>>>>     #include "gem/i915_gem_stolen.h"
>>>>>>>>>> +#include "gem/i915_gem_lmem.h"
>>>>>>>>>>
>>>>>>>>>>     #include "gt/intel_engine.h"
>>>>>>>>>>     #include "gt/intel_gt_types.h"
>>>>>>>>>> @@ -1921,9 +1922,15 @@ static inline int 
>>>>>>>>>> intel_hws_csb_write_index(struct drm_i915_private *i915)
>>>>>>>>>>     }
>>>>>>>>>>
>>>>>>>>>>     static inline enum i915_map_type
>>>>>>>>>> -i915_coherent_map_type(struct drm_i915_private *i915)
>>>>>>>>>> +i915_coherent_map_type(struct drm_i915_private *i915,
>>>>>>>>>> +                    struct drm_i915_gem_object *obj, bool 
>>>>>>>>>> always_coherent)
>>>>>>>>>>     {
>>>>>>>>>> -     return HAS_LLC(i915) ? I915_MAP_WB : I915_MAP_WC;
>>>>>>>>>> +     if (i915_gem_object_is_lmem(obj))
>>>>>>>>>> +             return I915_MAP_WC;
>>>>>>>>>> +     if (HAS_LLC(i915) || always_coherent)
>>>>>>>>>> +             return I915_MAP_WB;
>>>>>>>>>> +     else
>>>>>>>>>> +             return I915_MAP_WC;
>>>>>>>>>
>>>>>>>>> Seems this patch is doing two things.
>>>>>>>>>
>>>>>>>>> First it is adding lmem support to this helper by always 
>>>>>>>>> returning WC
>>>>>>>>> for lmem objects.
>>>>>>>>>
>>>>>>>>> Secondly it is introducing an idea of "always coherent" in a 
>>>>>>>>> helper
>>>>>>>>> called i915_coherent_map_type. Could someone explain what is 
>>>>>>>>> coherent vs
>>>>>>>>> always coherent?
>>>>>>>>>
>>>>>>>>> And also, why is always coherent happy with WB? Sounds counter 
>>>>>>>>> intuitive
>>>>>>>>> to me.
>>>>>>>>
>>>>>>>> All this does is try to keep the existing behaviour intact, whilst
>>>>>>>> also ensuring that all lmem objects are mapped using only WC, no
>>>>>>>> matter what. The always_coherent=true thing is for the existing 
>>>>>>>> places
>>>>>>>> where we sometimes map the object using WB, without first 
>>>>>>>> considering
>>>>>>>> whether the device has the fast shared LLC vs snooping. Yes, it's
>>>>>>>> slightly ugly :)
>>>>>>>
>>>>>>> Not fully following - if we had to write kerneldoc for 
>>>>>>> always_coherent
>>>>>>> input argument - what it would say?
>>>>>>
>>>>>> @always_coherent - If true we should always try to map the object
>>>>>> using WB. If false we should only map as WB if the device supports 
>>>>>> the
>>>>>> fast shared LLC, in the case of snooped devices we will map use WC.
>>>>>> Note that If the resource is lmem then we will always map as WC,
>>>>>> regardless of the value of always_coherent, since that's all we
>>>>>> currently support.
>>>>>>
>>>>>> Maybe the naming is poor?
>>>>>
>>>>> Maybe just confusing to me, not sure yet.
>>>>>
>>>>> So always_coherent is not about how the callers wants to use it, 
>>>>> but about platform knowledge? Or a performance concern for LLC vs 
>>>>> snooping cases? Does WB works (coherently) on snooping platforms?
>>>>
>>>> The always_coherent=true is for the existing callers that want WB, 
>>>> regardless of LLC vs snooping.
>>>>
>>>> The other callers use the existing i915_coherent_map_type() which 
>>>> only gives out WB for LLC platforms.
>>>>
>>>> AFAIK, LLC vs snooping should offer the same in terms of coherency, 
>>>> but in terms of performance the shared LLC is much faster, and so 
>>>> for snooping platforms we choose to not enable WB everywhere.
>>>>
>>>> On top of that we now have lmem, but for that we only allow WC. This 
>>>> patch just rolls all of that into one helper, while keeping the 
>>>> existing behaviour unchanged.
>>>
>>> Thanks. But I am still struggling with the API. :(
>>>
>>> Is the introduction of always_coherent flag in the context of DG1 
>>> required even? AFAICT for lmem objects the flag is ignored so no?
>>
>> If we drop the flag/helper thing, then we need something like:
>>
>> type = WB;
>> if (i915_gem_object_is_lmem(obj))
>>      type = WC;
>>
>> vaddr = i915_gem_object_pin_map(obj, type);
>>
>> In all the places where we currently do:
>>
>> vaddr = i915_gem_object_pin_map(obj, WB);
>>
>> Where obj can be lmem, so ctx, ring, guc etc. Is that better or worse? 
>> The existing i915_coherent_map_type() callers should work as-is, since 
>> DG1 is snooped. And this patch just extends that to cover all cases.
>>
>> Perhaps we need a new helper instead? Maybe you have a better idea?
> 
> Not yet. Would it make sense to put something in kerneldoc about when 
> callers might choose always_coherent true vs false? In terms of expected 
> usage (frequency, simplicity?) and any rules with regards when callers 
> need to worry about flushing/ordering when there are mixed read and writes?

Hmmm, looking at this again, maybe for now we should just go with:

type = WB;
if (i915_gem_object_is_lmem(obj))
       type = WC;

vaddr = i915_gem_object_pin_map(obj, type);

Which is way less confusing, plus there are only a handful of places 
where we need this, so doesn't seem too bad?

Alternatively, we could wrap that in something like:

/* Returns WB for system memory, or WC for local memory */
void *i915_gem_object_pin_map_default(obj);
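
To make the two options concrete, here is a minimal standalone sketch in
plain C, not real i915 code. The names mock_obj, map_type, mock_pin_map(),
default_map_type() and pin_map_default() are hypothetical stand-ins for
drm_i915_gem_object, enum i915_map_type, i915_gem_object_pin_map() and the
wrapper suggested above; the pin_map stub only records which type it was
asked for.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are not the real i915 types. */
enum map_type { MAP_WB, MAP_WC };

struct mock_obj {
	bool is_lmem;             /* models i915_gem_object_is_lmem(obj) */
	enum map_type mapped_as;  /* type recorded by the stub below */
	char backing[64];         /* fake backing storage */
};

/* Stub standing in for i915_gem_object_pin_map(obj, type). */
static void *mock_pin_map(struct mock_obj *obj, enum map_type type)
{
	obj->mapped_as = type;
	return obj->backing;
}

/* The open-coded rule: WB for system memory, WC for local memory. */
static enum map_type default_map_type(const struct mock_obj *obj)
{
	return obj->is_lmem ? MAP_WC : MAP_WB;
}

/* Models the proposed wrapper: callers no longer pick a type. */
static void *pin_map_default(struct mock_obj *obj)
{
	return mock_pin_map(obj, default_map_type(obj));
}
```

With the wrapper, a call site shrinks to vaddr = pin_map_default(obj), and
the lmem check lives in exactly one place.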

Thoughts?

> 
> Regards,
> 
> Tvrtko
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


* Re: [Intel-gfx] [PATCH 11/19] drm/i915: Update the helper to set correct mapping
  2021-04-21 11:42                     ` Matthew Auld
@ 2021-04-21 15:41                       ` Tvrtko Ursulin
  2021-04-21 19:13                         ` Matthew Auld
  0 siblings, 1 reply; 65+ messages in thread
From: Tvrtko Ursulin @ 2021-04-21 15:41 UTC (permalink / raw)
  To: Matthew Auld, Matthew Auld; +Cc: Intel Graphics Development, ML dri-devel


On 21/04/2021 12:42, Matthew Auld wrote:
> On 19/04/2021 16:01, Tvrtko Ursulin wrote:
>>
>> On 19/04/2021 15:37, Matthew Auld wrote:
>>> On 19/04/2021 15:07, Tvrtko Ursulin wrote:
>>>>
>>>> On 19/04/2021 12:30, Matthew Auld wrote:
>>>>> On 15/04/2021 12:05, Tvrtko Ursulin wrote:
>>>>>>
>>>>>> On 15/04/2021 10:23, Matthew Auld wrote:
>>>>>>> On Thu, 15 Apr 2021 at 09:21, Tvrtko Ursulin
>>>>>>> <tvrtko.ursulin@linux.intel.com> wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> On 14/04/2021 17:20, Matthew Auld wrote:
>>>>>>>>> On Wed, 14 Apr 2021 at 16:22, Tvrtko Ursulin
>>>>>>>>> <tvrtko.ursulin@linux.intel.com> wrote:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 12/04/2021 10:05, Matthew Auld wrote:
>>>>>>>>>>> From: Venkata Sandeep Dhanalakota 
>>>>>>>>>>> <venkata.s.dhanalakota@intel.com>
>>>>>>>>>>>
>>>>>>>>>>> Determine the possible coherent map type based on object 
>>>>>>>>>>> location,
>>>>>>>>>>> and if target has llc or if user requires an always coherent
>>>>>>>>>>> mapping.
>>>>>>>>>>>
>>>>>>>>>>> Cc: Matthew Auld <matthew.auld@intel.com>
>>>>>>>>>>> Cc: CQ Tang <cq.tang@intel.com>
>>>>>>>>>>> Suggested-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
>>>>>>>>>>> Signed-off-by: Venkata Sandeep Dhanalakota 
>>>>>>>>>>> <venkata.s.dhanalakota@intel.com>
>>>>>>>>>>> ---
>>>>>>>>>>>     drivers/gpu/drm/i915/gt/intel_engine_cs.c    |  3 ++-
>>>>>>>>>>>     drivers/gpu/drm/i915/gt/intel_engine_pm.c    |  2 +-
>>>>>>>>>>>     drivers/gpu/drm/i915/gt/intel_lrc.c          |  4 +++-
>>>>>>>>>>>     drivers/gpu/drm/i915/gt/intel_ring.c         |  9 ++++++---
>>>>>>>>>>>     drivers/gpu/drm/i915/gt/selftest_context.c   |  3 ++-
>>>>>>>>>>>     drivers/gpu/drm/i915/gt/selftest_hangcheck.c |  4 ++--
>>>>>>>>>>>     drivers/gpu/drm/i915/gt/selftest_lrc.c       |  4 +++-
>>>>>>>>>>>     drivers/gpu/drm/i915/gt/uc/intel_guc.c       |  4 +++-
>>>>>>>>>>>     drivers/gpu/drm/i915/gt/uc/intel_huc.c       |  4 +++-
>>>>>>>>>>>     drivers/gpu/drm/i915/i915_drv.h              | 11 
>>>>>>>>>>> +++++++++--
>>>>>>>>>>>     drivers/gpu/drm/i915/selftests/igt_spinner.c |  4 ++--
>>>>>>>>>>>     11 files changed, 36 insertions(+), 16 deletions(-)
>>>>>>>>>>>
>>>>>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c 
>>>>>>>>>>> b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
>>>>>>>>>>> index efe935f80c1a..b79568d370f5 100644
>>>>>>>>>>> --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
>>>>>>>>>>> +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
>>>>>>>>>>> @@ -664,7 +664,8 @@ static int init_status_page(struct 
>>>>>>>>>>> intel_engine_cs *engine)
>>>>>>>>>>>         if (ret)
>>>>>>>>>>>                 goto err;
>>>>>>>>>>>
>>>>>>>>>>> -     vaddr = i915_gem_object_pin_map(obj, I915_MAP_WB);
>>>>>>>>>>> +     vaddr = i915_gem_object_pin_map(obj,
>>>>>>>>>>> + i915_coherent_map_type(engine->i915, obj, true));
>>>>>>>>>>>         if (IS_ERR(vaddr)) {
>>>>>>>>>>>                 ret = PTR_ERR(vaddr);
>>>>>>>>>>>                 goto err_unpin;
>>>>>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c 
>>>>>>>>>>> b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
>>>>>>>>>>> index 7c9af86fdb1e..47f4397095e5 100644
>>>>>>>>>>> --- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c
>>>>>>>>>>> +++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
>>>>>>>>>>> @@ -23,7 +23,7 @@ static void dbg_poison_ce(struct 
>>>>>>>>>>> intel_context *ce)
>>>>>>>>>>>
>>>>>>>>>>>         if (ce->state) {
>>>>>>>>>>>                 struct drm_i915_gem_object *obj = 
>>>>>>>>>>> ce->state->obj;
>>>>>>>>>>> -             int type = 
>>>>>>>>>>> i915_coherent_map_type(ce->engine->i915);
>>>>>>>>>>> +             int type = 
>>>>>>>>>>> i915_coherent_map_type(ce->engine->i915, obj, true);
>>>>>>>>>>>                 void *map;
>>>>>>>>>>>
>>>>>>>>>>>                 if (!i915_gem_object_trylock(obj))
>>>>>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c 
>>>>>>>>>>> b/drivers/gpu/drm/i915/gt/intel_lrc.c
>>>>>>>>>>> index e86897cde984..aafe2a4df496 100644
>>>>>>>>>>> --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
>>>>>>>>>>> +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
>>>>>>>>>>> @@ -903,7 +903,9 @@ lrc_pre_pin(struct intel_context *ce,
>>>>>>>>>>>         GEM_BUG_ON(!i915_vma_is_pinned(ce->state));
>>>>>>>>>>>
>>>>>>>>>>>         *vaddr = i915_gem_object_pin_map(ce->state->obj,
>>>>>>>>>>> - i915_coherent_map_type(ce->engine->i915) |
>>>>>>>>>>> + i915_coherent_map_type(ce->engine->i915,
>>>>>>>>>>> + ce->state->obj,
>>>>>>>>>>> + false) |
>>>>>>>>>>>                                          I915_MAP_OVERRIDE);
>>>>>>>>>>>
>>>>>>>>>>>         return PTR_ERR_OR_ZERO(*vaddr);
>>>>>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/intel_ring.c 
>>>>>>>>>>> b/drivers/gpu/drm/i915/gt/intel_ring.c
>>>>>>>>>>> index aee0a77c77e0..3cf6c7e68108 100644
>>>>>>>>>>> --- a/drivers/gpu/drm/i915/gt/intel_ring.c
>>>>>>>>>>> +++ b/drivers/gpu/drm/i915/gt/intel_ring.c
>>>>>>>>>>> @@ -53,9 +53,12 @@ int intel_ring_pin(struct intel_ring 
>>>>>>>>>>> *ring, struct i915_gem_ww_ctx *ww)
>>>>>>>>>>>
>>>>>>>>>>>         if (i915_vma_is_map_and_fenceable(vma))
>>>>>>>>>>>                 addr = (void __force *)i915_vma_pin_iomap(vma);
>>>>>>>>>>> -     else
>>>>>>>>>>> -             addr = i915_gem_object_pin_map(vma->obj,
>>>>>>>>>>> - i915_coherent_map_type(vma->vm->i915));
>>>>>>>>>>> +     else {
>>>>>>>>>>> +             int type = 
>>>>>>>>>>> i915_coherent_map_type(vma->vm->i915, vma->obj, false);
>>>>>>>>>>> +
>>>>>>>>>>> +             addr = i915_gem_object_pin_map(vma->obj, type);
>>>>>>>>>>> +     }
>>>>>>>>>>> +
>>>>>>>>>>>         if (IS_ERR(addr)) {
>>>>>>>>>>>                 ret = PTR_ERR(addr);
>>>>>>>>>>>                 goto err_ring;
>>>>>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/selftest_context.c 
>>>>>>>>>>> b/drivers/gpu/drm/i915/gt/selftest_context.c
>>>>>>>>>>> index b9bdd1d23243..26685b927169 100644
>>>>>>>>>>> --- a/drivers/gpu/drm/i915/gt/selftest_context.c
>>>>>>>>>>> +++ b/drivers/gpu/drm/i915/gt/selftest_context.c
>>>>>>>>>>> @@ -88,7 +88,8 @@ static int __live_context_size(struct 
>>>>>>>>>>> intel_engine_cs *engine)
>>>>>>>>>>>                 goto err;
>>>>>>>>>>>
>>>>>>>>>>>         vaddr = i915_gem_object_pin_map_unlocked(ce->state->obj,
>>>>>>>>>>> - i915_coherent_map_type(engine->i915));
>>>>>>>>>>> + i915_coherent_map_type(engine->i915,
>>>>>>>>>>> + ce->state->obj, false));
>>>>>>>>>>>         if (IS_ERR(vaddr)) {
>>>>>>>>>>>                 err = PTR_ERR(vaddr);
>>>>>>>>>>>                 intel_context_unpin(ce);
>>>>>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c 
>>>>>>>>>>> b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
>>>>>>>>>>> index 746985971c3a..5b63d4df8c93 100644
>>>>>>>>>>> --- a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
>>>>>>>>>>> +++ b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
>>>>>>>>>>> @@ -69,7 +69,7 @@ static int hang_init(struct hang *h, struct 
>>>>>>>>>>> intel_gt *gt)
>>>>>>>>>>>         h->seqno = memset(vaddr, 0xff, PAGE_SIZE);
>>>>>>>>>>>
>>>>>>>>>>>         vaddr = i915_gem_object_pin_map_unlocked(h->obj,
>>>>>>>>>>> - i915_coherent_map_type(gt->i915));
>>>>>>>>>>> + i915_coherent_map_type(gt->i915, h->obj, false));
>>>>>>>>>>>         if (IS_ERR(vaddr)) {
>>>>>>>>>>>                 err = PTR_ERR(vaddr);
>>>>>>>>>>>                 goto err_unpin_hws;
>>>>>>>>>>> @@ -130,7 +130,7 @@ hang_create_request(struct hang *h, 
>>>>>>>>>>> struct intel_engine_cs *engine)
>>>>>>>>>>>                 return ERR_CAST(obj);
>>>>>>>>>>>         }
>>>>>>>>>>>
>>>>>>>>>>> -     vaddr = i915_gem_object_pin_map_unlocked(obj, 
>>>>>>>>>>> i915_coherent_map_type(gt->i915));
>>>>>>>>>>> +     vaddr = i915_gem_object_pin_map_unlocked(obj, 
>>>>>>>>>>> i915_coherent_map_type(gt->i915, obj, false));
>>>>>>>>>>>         if (IS_ERR(vaddr)) {
>>>>>>>>>>>                 i915_gem_object_put(obj);
>>>>>>>>>>>                 i915_vm_put(vm);
>>>>>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/selftest_lrc.c 
>>>>>>>>>>> b/drivers/gpu/drm/i915/gt/selftest_lrc.c
>>>>>>>>>>> index 85e7df6a5123..d8f6623524e8 100644
>>>>>>>>>>> --- a/drivers/gpu/drm/i915/gt/selftest_lrc.c
>>>>>>>>>>> +++ b/drivers/gpu/drm/i915/gt/selftest_lrc.c
>>>>>>>>>>> @@ -1221,7 +1221,9 @@ static int compare_isolation(struct 
>>>>>>>>>>> intel_engine_cs *engine,
>>>>>>>>>>>         }
>>>>>>>>>>>
>>>>>>>>>>>         lrc = i915_gem_object_pin_map_unlocked(ce->state->obj,
>>>>>>>>>>> - i915_coherent_map_type(engine->i915));
>>>>>>>>>>> + i915_coherent_map_type(engine->i915,
>>>>>>>>>>> + ce->state->obj,
>>>>>>>>>>> + false));
>>>>>>>>>>>         if (IS_ERR(lrc)) {
>>>>>>>>>>>                 err = PTR_ERR(lrc);
>>>>>>>>>>>                 goto err_B1;
>>>>>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.c 
>>>>>>>>>>> b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
>>>>>>>>>>> index 78305b2ec89d..adae04c47aab 100644
>>>>>>>>>>> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc.c
>>>>>>>>>>> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
>>>>>>>>>>> @@ -682,7 +682,9 @@ int intel_guc_allocate_and_map_vma(struct 
>>>>>>>>>>> intel_guc *guc, u32 size,
>>>>>>>>>>>         if (IS_ERR(vma))
>>>>>>>>>>>                 return PTR_ERR(vma);
>>>>>>>>>>>
>>>>>>>>>>> -     vaddr = i915_gem_object_pin_map_unlocked(vma->obj, 
>>>>>>>>>>> I915_MAP_WB);
>>>>>>>>>>> +     vaddr = i915_gem_object_pin_map_unlocked(vma->obj,
>>>>>>>>>>> + i915_coherent_map_type(guc_to_gt(guc)->i915,
>>>>>>>>>>> + vma->obj, true));
>>>>>>>>>>>         if (IS_ERR(vaddr)) {
>>>>>>>>>>>                 i915_vma_unpin_and_release(&vma, 0);
>>>>>>>>>>>                 return PTR_ERR(vaddr);
>>>>>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_huc.c 
>>>>>>>>>>> b/drivers/gpu/drm/i915/gt/uc/intel_huc.c
>>>>>>>>>>> index 2126dd81ac38..56d2144dc6a0 100644
>>>>>>>>>>> --- a/drivers/gpu/drm/i915/gt/uc/intel_huc.c
>>>>>>>>>>> +++ b/drivers/gpu/drm/i915/gt/uc/intel_huc.c
>>>>>>>>>>> @@ -82,7 +82,9 @@ static int intel_huc_rsa_data_create(struct 
>>>>>>>>>>> intel_huc *huc)
>>>>>>>>>>>         if (IS_ERR(vma))
>>>>>>>>>>>                 return PTR_ERR(vma);
>>>>>>>>>>>
>>>>>>>>>>> -     vaddr = i915_gem_object_pin_map_unlocked(vma->obj, 
>>>>>>>>>>> I915_MAP_WB);
>>>>>>>>>>> +     vaddr = i915_gem_object_pin_map_unlocked(vma->obj,
>>>>>>>>>>> + i915_coherent_map_type(gt->i915,
>>>>>>>>>>> + vma->obj, true));
>>>>>>>>>>>         if (IS_ERR(vaddr)) {
>>>>>>>>>>>                 i915_vma_unpin_and_release(&vma, 0);
>>>>>>>>>>>                 return PTR_ERR(vaddr);
>>>>>>>>>>> diff --git a/drivers/gpu/drm/i915/i915_drv.h 
>>>>>>>>>>> b/drivers/gpu/drm/i915/i915_drv.h
>>>>>>>>>>> index 69e43bf91a15..2abbc06712a4 100644
>>>>>>>>>>> --- a/drivers/gpu/drm/i915/i915_drv.h
>>>>>>>>>>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>>>>>>>>>>> @@ -78,6 +78,7 @@
>>>>>>>>>>>     #include "gem/i915_gem_context_types.h"
>>>>>>>>>>>     #include "gem/i915_gem_shrinker.h"
>>>>>>>>>>>     #include "gem/i915_gem_stolen.h"
>>>>>>>>>>> +#include "gem/i915_gem_lmem.h"
>>>>>>>>>>>
>>>>>>>>>>>     #include "gt/intel_engine.h"
>>>>>>>>>>>     #include "gt/intel_gt_types.h"
>>>>>>>>>>> @@ -1921,9 +1922,15 @@ static inline int 
>>>>>>>>>>> intel_hws_csb_write_index(struct drm_i915_private *i915)
>>>>>>>>>>>     }
>>>>>>>>>>>
>>>>>>>>>>>     static inline enum i915_map_type
>>>>>>>>>>> -i915_coherent_map_type(struct drm_i915_private *i915)
>>>>>>>>>>> +i915_coherent_map_type(struct drm_i915_private *i915,
>>>>>>>>>>> +                    struct drm_i915_gem_object *obj, bool 
>>>>>>>>>>> always_coherent)
>>>>>>>>>>>     {
>>>>>>>>>>> -     return HAS_LLC(i915) ? I915_MAP_WB : I915_MAP_WC;
>>>>>>>>>>> +     if (i915_gem_object_is_lmem(obj))
>>>>>>>>>>> +             return I915_MAP_WC;
>>>>>>>>>>> +     if (HAS_LLC(i915) || always_coherent)
>>>>>>>>>>> +             return I915_MAP_WB;
>>>>>>>>>>> +     else
>>>>>>>>>>> +             return I915_MAP_WC;
>>>>>>>>>>
>>>>>>>>>> Seems this patch is doing two things.
>>>>>>>>>>
>>>>>>>>>> First it is adding lmem support to this helper by always 
>>>>>>>>>> returning WC
>>>>>>>>>> for lmem objects.
>>>>>>>>>>
>>>>>>>>>> Secondly it is introducing an idea of "always coherent" in a 
>>>>>>>>>> helper
>>>>>>>>>> called i915_coherent_map_type. Could someone explain what is 
>>>>>>>>>> coherent vs
>>>>>>>>>> always coherent?
>>>>>>>>>>
>>>>>>>>>> And also, why is always coherent happy with WB? Sounds counter 
>>>>>>>>>> intuitive
>>>>>>>>>> to me.
>>>>>>>>>
>>>>>>>>> All this does is try to keep the existing behaviour intact, whilst
>>>>>>>>> also ensuring that all lmem objects are mapped using only WC, no
>>>>>>>>> matter what. The always_coherent=true thing is for the existing 
>>>>>>>>> places
>>>>>>>>> where we sometimes map the object using WB, without first 
>>>>>>>>> considering
>>>>>>>>> whether the device has the fast shared LLC vs snooping. Yes, it's
>>>>>>>>> slightly ugly :)
>>>>>>>>
>>>>>>>> Not fully following - if we had to write kerneldoc for 
>>>>>>>> always_coherent
>>>>>>>> input argument - what it would say?
>>>>>>>
>>>>>>> @always_coherent - If true we should always try to map the object
>>>>>>> using WB. If false we should only map as WB if the device 
>>>>>>> supports the
>>>>>>> fast shared LLC, in the case of snooped devices we will map use WC.
>>>>>>> Note that If the resource is lmem then we will always map as WC,
>>>>>>> regardless of the value of always_coherent, since that's all we
>>>>>>> currently support.
>>>>>>>
>>>>>>> Maybe the naming is poor?
>>>>>>
>>>>>> Maybe just confusing to me, not sure yet.
>>>>>>
>>>>>> So always_coherent is not about how the callers wants to use it, 
>>>>>> but about platform knowledge? Or a performance concern for LLC vs 
>>>>>> snooping cases? Does WB works (coherently) on snooping platforms?
>>>>>
>>>>> The always_coherent=true is for the existing callers that want WB, 
>>>>> regardless of LLC vs snooping.
>>>>>
>>>>> The other callers use the existing i915_coherent_map_type() which 
>>>>> only gives out WB for LLC platforms.
>>>>>
>>>>> AFAIK, LLC vs snooping should offer the same in terms of coherency, 
>>>>> but in terms of performance the shared LLC is much faster, and so 
>>>>> for snooping platforms we choose to not enable WB everywhere.
>>>>>
>>>>> On top of that we now have lmem, but for that we only allow WC. 
>>>>> This patch just rolls all of that into one helper, while keeping 
>>>>> the existing behaviour unchanged.
>>>>
>>>> Thanks. But I am still struggling with the API. :(
>>>>
>>>> Is the introduction of always_coherent flag in the context of DG1 
>>>> required even? AFAICT for lmem objects the flag is ignored so no?
>>>
>>> If we drop the flag/helper thing, then we need something like:
>>>
>>> type = WB;
>>> if (i915_gem_object_is_lmem(obj))
>>>      type = WC;
>>>
>>> vaddr = i915_gem_object_pin_map(obj, type);
>>>
>>> In all the places where we currently do:
>>>
>>> vaddr = i915_gem_object_pin_map(obj, WB);
>>>
>>> Where obj can be lmem, so ctx, ring, guc etc. Is that better or 
>>> worse? The existing i915_coherent_map_type() callers should work 
>>> as-is, since DG1 is snooped. And this patch just extends that to 
>>> cover all cases.
>>>
>>> Perhaps we need a new helper instead? Maybe you have a better idea?
>>
>> Not yet. Would it make sense to put something in kerneldoc about when 
>> callers might choose always_coherent true vs false? In terms of 
>> expected usage (frequency, simplicity?) and any rules with regards 
>> when callers need to worry about flushing/ordering when there are 
>> mixed read and writes?
> 
> Hmmm, looking at this again, maybe for now we should just go with:
> 
> type = WB;
> if (i915_gem_object_is_lmem(obj))
>        type = WC;
> 
> vaddr = i915_gem_object_pin_map(obj, type)
> 
> Which is way less confusing, plus there are only a handful of places 
> where we need this, so doesn't seem too bad?
> 
> Alternatively, we could wrap that in something like:
> 
> /* Returns WB for system memory, or WC for local memory */
> void *i915_gem_object_pin_map_default(obj);
> 
> Thoughts?

I went and looked at the use sites to try and figure it out.

First thing, the bool always_coherent story is only relevant when we 
decide to place an object in system memory. Otherwise the mapping is 
always WC, so I guess our code needs to handle WC anyway. That is, if 
the assumption is that we can change the location of the objects and it 
all just keeps working. Or is that not the goal?
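
As a sanity check, here is a minimal standalone model of the helper from
the patch, in plain C rather than the real i915 code. The names map_type,
coherent_map_type() and flag_matters() are hypothetical, and HAS_LLC(i915)
and i915_gem_object_is_lmem(obj) are reduced to plain bools.

```c
#include <stdbool.h>

/* Hypothetical stand-in for enum i915_map_type. */
enum map_type { MAP_WB, MAP_WC };

/* Same decision order as the patched i915_coherent_map_type(). */
static enum map_type
coherent_map_type(bool has_llc, bool is_lmem, bool always_coherent)
{
	if (is_lmem)
		return MAP_WC;  /* lmem is always WC; the flag is ignored */
	if (has_llc || always_coherent)
		return MAP_WB;
	return MAP_WC;
}

/* True when flipping always_coherent changes the returned type. */
static bool flag_matters(bool has_llc, bool is_lmem)
{
	return coherent_map_type(has_llc, is_lmem, true) !=
	       coherent_map_type(has_llc, is_lmem, false);
}
```

In this model the flag only changes the result for system memory on a
platform without LLC; in every other combination both values return the
same type.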

Let's look at the users (ignoring selftests):

1) lrc_reg_state and ring; always_coherent=false

Update frequency medium and mostly write from the CPU side.

They say always_coherent=false - which means they have to handle being 
given a WC mapping anyway.

What is the benefit of ever selecting WB here?

2) Engine status page; always_coherent=true

Frequently read and written from both the CPU and GPU, so the cost of 
snooping is presumably fine? Apart from that, the code has to be ready 
to deal with WC anyway.

3) dbg_poison_ce; always_coherent=true

Writes to lrc_reg_state once - meh. Could just as well always ask for WC.

4) intel_guc_allocate_and_map_vma; always_coherent=true

This one has three users:

a) guc_stage_desc_pool_create stage_desc_pool_vaddr

This one seems to be write-once at init.

b) intel_guc_ct_init

Used for CT communication, so similar in principle to the CSB on the 
engine status page. But the code also has to deal with WC when the 
object is in lmem.

c) intel_guc_ads_create

CPU appears to only write on init and GPU reset.

5) intel_huc_rsa_data_create; always_coherent=true

Called from intel_huc_init so it appears write once from CPU. Not sure 
why it would need a coherent mapping if that is correct.

I think this exercise left me equally confused, because flushing and 
read-write ordering rules are different between WB and WC. Code which 
accesses all these mappings either has to know which one is in use, or 
must not care. For the latter case we have to be sure of that for every 
path.

The write on init / reset ones are easy enough and it doesn't really 
matter for them to use the coherent helper.

lrc_reg_state as well, I think, can be WC with explicit flushing - it 
has to be on lmem, no?

This leaves the status page (CSB, etc) and GuC CT. Those are frequent 
R/W, but the code also has to be able to handle WC, so what is the 
benefit of WB? Does it end up faster than if it was WC, considering 
explicit flushes/barriers are still in there?

Regards,

Tvrtko


* Re: [Intel-gfx] [PATCH 11/19] drm/i915: Update the helper to set correct mapping
  2021-04-21 15:41                       ` Tvrtko Ursulin
@ 2021-04-21 19:13                         ` Matthew Auld
  2021-04-26  8:57                           ` Matthew Auld
  0 siblings, 1 reply; 65+ messages in thread
From: Matthew Auld @ 2021-04-21 19:13 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: Intel Graphics Development, Matthew Auld, ML dri-devel

On Wed, 21 Apr 2021 at 16:41, Tvrtko Ursulin
<tvrtko.ursulin@linux.intel.com> wrote:
>
>
> On 21/04/2021 12:42, Matthew Auld wrote:
> > On 19/04/2021 16:01, Tvrtko Ursulin wrote:
> >>
> >> On 19/04/2021 15:37, Matthew Auld wrote:
> >>> On 19/04/2021 15:07, Tvrtko Ursulin wrote:
> >>>>
> >>>> On 19/04/2021 12:30, Matthew Auld wrote:
> >>>>> On 15/04/2021 12:05, Tvrtko Ursulin wrote:
> >>>>>>
> >>>>>> On 15/04/2021 10:23, Matthew Auld wrote:
> >>>>>>> On Thu, 15 Apr 2021 at 09:21, Tvrtko Ursulin
> >>>>>>> <tvrtko.ursulin@linux.intel.com> wrote:
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On 14/04/2021 17:20, Matthew Auld wrote:
> >>>>>>>>> On Wed, 14 Apr 2021 at 16:22, Tvrtko Ursulin
> >>>>>>>>> <tvrtko.ursulin@linux.intel.com> wrote:
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> On 12/04/2021 10:05, Matthew Auld wrote:
> >>>>>>>>>>> From: Venkata Sandeep Dhanalakota
> >>>>>>>>>>> <venkata.s.dhanalakota@intel.com>
> >>>>>>>>>>>
> >>>>>>>>>>> Determine the possible coherent map type based on object
> >>>>>>>>>>> location,
> >>>>>>>>>>> and if target has llc or if user requires an always coherent
> >>>>>>>>>>> mapping.
> >>>>>>>>>>>
> >>>>>>>>>>> Cc: Matthew Auld <matthew.auld@intel.com>
> >>>>>>>>>>> Cc: CQ Tang <cq.tang@intel.com>
> >>>>>>>>>>> Suggested-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
> >>>>>>>>>>> Signed-off-by: Venkata Sandeep Dhanalakota
> >>>>>>>>>>> <venkata.s.dhanalakota@intel.com>
> >>>>>>>>>>> ---
> >>>>>>>>>>>     drivers/gpu/drm/i915/gt/intel_engine_cs.c    |  3 ++-
> >>>>>>>>>>>     drivers/gpu/drm/i915/gt/intel_engine_pm.c    |  2 +-
> >>>>>>>>>>>     drivers/gpu/drm/i915/gt/intel_lrc.c          |  4 +++-
> >>>>>>>>>>>     drivers/gpu/drm/i915/gt/intel_ring.c         |  9 ++++++---
> >>>>>>>>>>>     drivers/gpu/drm/i915/gt/selftest_context.c   |  3 ++-
> >>>>>>>>>>>     drivers/gpu/drm/i915/gt/selftest_hangcheck.c |  4 ++--
> >>>>>>>>>>>     drivers/gpu/drm/i915/gt/selftest_lrc.c       |  4 +++-
> >>>>>>>>>>>     drivers/gpu/drm/i915/gt/uc/intel_guc.c       |  4 +++-
> >>>>>>>>>>>     drivers/gpu/drm/i915/gt/uc/intel_huc.c       |  4 +++-
> >>>>>>>>>>>     drivers/gpu/drm/i915/i915_drv.h              | 11
> >>>>>>>>>>> +++++++++--
> >>>>>>>>>>>     drivers/gpu/drm/i915/selftests/igt_spinner.c |  4 ++--
> >>>>>>>>>>>     11 files changed, 36 insertions(+), 16 deletions(-)
> >>>>>>>>>>>
> >>>>>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> >>>>>>>>>>> b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> >>>>>>>>>>> index efe935f80c1a..b79568d370f5 100644
> >>>>>>>>>>> --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> >>>>>>>>>>> +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> >>>>>>>>>>> @@ -664,7 +664,8 @@ static int init_status_page(struct
> >>>>>>>>>>> intel_engine_cs *engine)
> >>>>>>>>>>>         if (ret)
> >>>>>>>>>>>                 goto err;
> >>>>>>>>>>>
> >>>>>>>>>>> -     vaddr = i915_gem_object_pin_map(obj, I915_MAP_WB);
> >>>>>>>>>>> +     vaddr = i915_gem_object_pin_map(obj,
> >>>>>>>>>>> + i915_coherent_map_type(engine->i915, obj, true));
> >>>>>>>>>>>         if (IS_ERR(vaddr)) {
> >>>>>>>>>>>                 ret = PTR_ERR(vaddr);
> >>>>>>>>>>>                 goto err_unpin;
> >>>>>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c
> >>>>>>>>>>> b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
> >>>>>>>>>>> index 7c9af86fdb1e..47f4397095e5 100644
> >>>>>>>>>>> --- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c
> >>>>>>>>>>> +++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
> >>>>>>>>>>> @@ -23,7 +23,7 @@ static void dbg_poison_ce(struct
> >>>>>>>>>>> intel_context *ce)
> >>>>>>>>>>>
> >>>>>>>>>>>         if (ce->state) {
> >>>>>>>>>>>                 struct drm_i915_gem_object *obj =
> >>>>>>>>>>> ce->state->obj;
> >>>>>>>>>>> -             int type =
> >>>>>>>>>>> i915_coherent_map_type(ce->engine->i915);
> >>>>>>>>>>> +             int type =
> >>>>>>>>>>> i915_coherent_map_type(ce->engine->i915, obj, true);
> >>>>>>>>>>>                 void *map;
> >>>>>>>>>>>
> >>>>>>>>>>>                 if (!i915_gem_object_trylock(obj))
> >>>>>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c
> >>>>>>>>>>> b/drivers/gpu/drm/i915/gt/intel_lrc.c
> >>>>>>>>>>> index e86897cde984..aafe2a4df496 100644
> >>>>>>>>>>> --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
> >>>>>>>>>>> +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
> >>>>>>>>>>> @@ -903,7 +903,9 @@ lrc_pre_pin(struct intel_context *ce,
> >>>>>>>>>>>         GEM_BUG_ON(!i915_vma_is_pinned(ce->state));
> >>>>>>>>>>>
> >>>>>>>>>>>         *vaddr = i915_gem_object_pin_map(ce->state->obj,
> >>>>>>>>>>> - i915_coherent_map_type(ce->engine->i915) |
> >>>>>>>>>>> + i915_coherent_map_type(ce->engine->i915,
> >>>>>>>>>>> + ce->state->obj,
> >>>>>>>>>>> + false) |
> >>>>>>>>>>>                                          I915_MAP_OVERRIDE);
> >>>>>>>>>>>
> >>>>>>>>>>>         return PTR_ERR_OR_ZERO(*vaddr);
> >>>>>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/intel_ring.c
> >>>>>>>>>>> b/drivers/gpu/drm/i915/gt/intel_ring.c
> >>>>>>>>>>> index aee0a77c77e0..3cf6c7e68108 100644
> >>>>>>>>>>> --- a/drivers/gpu/drm/i915/gt/intel_ring.c
> >>>>>>>>>>> +++ b/drivers/gpu/drm/i915/gt/intel_ring.c
> >>>>>>>>>>> @@ -53,9 +53,12 @@ int intel_ring_pin(struct intel_ring
> >>>>>>>>>>> *ring, struct i915_gem_ww_ctx *ww)
> >>>>>>>>>>>
> >>>>>>>>>>>         if (i915_vma_is_map_and_fenceable(vma))
> >>>>>>>>>>>                 addr = (void __force *)i915_vma_pin_iomap(vma);
> >>>>>>>>>>> -     else
> >>>>>>>>>>> -             addr = i915_gem_object_pin_map(vma->obj,
> >>>>>>>>>>> - i915_coherent_map_type(vma->vm->i915));
> >>>>>>>>>>> +     else {
> >>>>>>>>>>> +             int type =
> >>>>>>>>>>> i915_coherent_map_type(vma->vm->i915, vma->obj, false);
> >>>>>>>>>>> +
> >>>>>>>>>>> +             addr = i915_gem_object_pin_map(vma->obj, type);
> >>>>>>>>>>> +     }
> >>>>>>>>>>> +
> >>>>>>>>>>>         if (IS_ERR(addr)) {
> >>>>>>>>>>>                 ret = PTR_ERR(addr);
> >>>>>>>>>>>                 goto err_ring;
> >>>>>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/selftest_context.c
> >>>>>>>>>>> b/drivers/gpu/drm/i915/gt/selftest_context.c
> >>>>>>>>>>> index b9bdd1d23243..26685b927169 100644
> >>>>>>>>>>> --- a/drivers/gpu/drm/i915/gt/selftest_context.c
> >>>>>>>>>>> +++ b/drivers/gpu/drm/i915/gt/selftest_context.c
> >>>>>>>>>>> @@ -88,7 +88,8 @@ static int __live_context_size(struct
> >>>>>>>>>>> intel_engine_cs *engine)
> >>>>>>>>>>>                 goto err;
> >>>>>>>>>>>
> >>>>>>>>>>>         vaddr = i915_gem_object_pin_map_unlocked(ce->state->obj,
> >>>>>>>>>>> - i915_coherent_map_type(engine->i915));
> >>>>>>>>>>> + i915_coherent_map_type(engine->i915,
> >>>>>>>>>>> + ce->state->obj, false));
> >>>>>>>>>>>         if (IS_ERR(vaddr)) {
> >>>>>>>>>>>                 err = PTR_ERR(vaddr);
> >>>>>>>>>>>                 intel_context_unpin(ce);
> >>>>>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
> >>>>>>>>>>> b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
> >>>>>>>>>>> index 746985971c3a..5b63d4df8c93 100644
> >>>>>>>>>>> --- a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
> >>>>>>>>>>> +++ b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
> >>>>>>>>>>> @@ -69,7 +69,7 @@ static int hang_init(struct hang *h, struct
> >>>>>>>>>>> intel_gt *gt)
> >>>>>>>>>>>         h->seqno = memset(vaddr, 0xff, PAGE_SIZE);
> >>>>>>>>>>>
> >>>>>>>>>>>         vaddr = i915_gem_object_pin_map_unlocked(h->obj,
> >>>>>>>>>>> - i915_coherent_map_type(gt->i915));
> >>>>>>>>>>> + i915_coherent_map_type(gt->i915, h->obj, false));
> >>>>>>>>>>>         if (IS_ERR(vaddr)) {
> >>>>>>>>>>>                 err = PTR_ERR(vaddr);
> >>>>>>>>>>>                 goto err_unpin_hws;
> >>>>>>>>>>> @@ -130,7 +130,7 @@ hang_create_request(struct hang *h,
> >>>>>>>>>>> struct intel_engine_cs *engine)
> >>>>>>>>>>>                 return ERR_CAST(obj);
> >>>>>>>>>>>         }
> >>>>>>>>>>>
> >>>>>>>>>>> -     vaddr = i915_gem_object_pin_map_unlocked(obj,
> >>>>>>>>>>> i915_coherent_map_type(gt->i915));
> >>>>>>>>>>> +     vaddr = i915_gem_object_pin_map_unlocked(obj,
> >>>>>>>>>>> i915_coherent_map_type(gt->i915, obj, false));
> >>>>>>>>>>>         if (IS_ERR(vaddr)) {
> >>>>>>>>>>>                 i915_gem_object_put(obj);
> >>>>>>>>>>>                 i915_vm_put(vm);
> >>>>>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/selftest_lrc.c
> >>>>>>>>>>> b/drivers/gpu/drm/i915/gt/selftest_lrc.c
> >>>>>>>>>>> index 85e7df6a5123..d8f6623524e8 100644
> >>>>>>>>>>> --- a/drivers/gpu/drm/i915/gt/selftest_lrc.c
> >>>>>>>>>>> +++ b/drivers/gpu/drm/i915/gt/selftest_lrc.c
> >>>>>>>>>>> @@ -1221,7 +1221,9 @@ static int compare_isolation(struct
> >>>>>>>>>>> intel_engine_cs *engine,
> >>>>>>>>>>>         }
> >>>>>>>>>>>
> >>>>>>>>>>>         lrc = i915_gem_object_pin_map_unlocked(ce->state->obj,
> >>>>>>>>>>> - i915_coherent_map_type(engine->i915));
> >>>>>>>>>>> + i915_coherent_map_type(engine->i915,
> >>>>>>>>>>> + ce->state->obj,
> >>>>>>>>>>> + false));
> >>>>>>>>>>>         if (IS_ERR(lrc)) {
> >>>>>>>>>>>                 err = PTR_ERR(lrc);
> >>>>>>>>>>>                 goto err_B1;
> >>>>>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.c
> >>>>>>>>>>> b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
> >>>>>>>>>>> index 78305b2ec89d..adae04c47aab 100644
> >>>>>>>>>>> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc.c
> >>>>>>>>>>> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
> >>>>>>>>>>> @@ -682,7 +682,9 @@ int intel_guc_allocate_and_map_vma(struct
> >>>>>>>>>>> intel_guc *guc, u32 size,
> >>>>>>>>>>>         if (IS_ERR(vma))
> >>>>>>>>>>>                 return PTR_ERR(vma);
> >>>>>>>>>>>
> >>>>>>>>>>> -     vaddr = i915_gem_object_pin_map_unlocked(vma->obj,
> >>>>>>>>>>> I915_MAP_WB);
> >>>>>>>>>>> +     vaddr = i915_gem_object_pin_map_unlocked(vma->obj,
> >>>>>>>>>>> + i915_coherent_map_type(guc_to_gt(guc)->i915,
> >>>>>>>>>>> + vma->obj, true));
> >>>>>>>>>>>         if (IS_ERR(vaddr)) {
> >>>>>>>>>>>                 i915_vma_unpin_and_release(&vma, 0);
> >>>>>>>>>>>                 return PTR_ERR(vaddr);
> >>>>>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_huc.c
> >>>>>>>>>>> b/drivers/gpu/drm/i915/gt/uc/intel_huc.c
> >>>>>>>>>>> index 2126dd81ac38..56d2144dc6a0 100644
> >>>>>>>>>>> --- a/drivers/gpu/drm/i915/gt/uc/intel_huc.c
> >>>>>>>>>>> +++ b/drivers/gpu/drm/i915/gt/uc/intel_huc.c
> >>>>>>>>>>> @@ -82,7 +82,9 @@ static int intel_huc_rsa_data_create(struct
> >>>>>>>>>>> intel_huc *huc)
> >>>>>>>>>>>         if (IS_ERR(vma))
> >>>>>>>>>>>                 return PTR_ERR(vma);
> >>>>>>>>>>>
> >>>>>>>>>>> -     vaddr = i915_gem_object_pin_map_unlocked(vma->obj,
> >>>>>>>>>>> I915_MAP_WB);
> >>>>>>>>>>> +     vaddr = i915_gem_object_pin_map_unlocked(vma->obj,
> >>>>>>>>>>> + i915_coherent_map_type(gt->i915,
> >>>>>>>>>>> + vma->obj, true));
> >>>>>>>>>>>         if (IS_ERR(vaddr)) {
> >>>>>>>>>>>                 i915_vma_unpin_and_release(&vma, 0);
> >>>>>>>>>>>                 return PTR_ERR(vaddr);
> >>>>>>>>>>> diff --git a/drivers/gpu/drm/i915/i915_drv.h
> >>>>>>>>>>> b/drivers/gpu/drm/i915/i915_drv.h
> >>>>>>>>>>> index 69e43bf91a15..2abbc06712a4 100644
> >>>>>>>>>>> --- a/drivers/gpu/drm/i915/i915_drv.h
> >>>>>>>>>>> +++ b/drivers/gpu/drm/i915/i915_drv.h
> >>>>>>>>>>> @@ -78,6 +78,7 @@
> >>>>>>>>>>>     #include "gem/i915_gem_context_types.h"
> >>>>>>>>>>>     #include "gem/i915_gem_shrinker.h"
> >>>>>>>>>>>     #include "gem/i915_gem_stolen.h"
> >>>>>>>>>>> +#include "gem/i915_gem_lmem.h"
> >>>>>>>>>>>
> >>>>>>>>>>>     #include "gt/intel_engine.h"
> >>>>>>>>>>>     #include "gt/intel_gt_types.h"
> >>>>>>>>>>> @@ -1921,9 +1922,15 @@ static inline int
> >>>>>>>>>>> intel_hws_csb_write_index(struct drm_i915_private *i915)
> >>>>>>>>>>>     }
> >>>>>>>>>>>
> >>>>>>>>>>>     static inline enum i915_map_type
> >>>>>>>>>>> -i915_coherent_map_type(struct drm_i915_private *i915)
> >>>>>>>>>>> +i915_coherent_map_type(struct drm_i915_private *i915,
> >>>>>>>>>>> +                    struct drm_i915_gem_object *obj, bool
> >>>>>>>>>>> always_coherent)
> >>>>>>>>>>>     {
> >>>>>>>>>>> -     return HAS_LLC(i915) ? I915_MAP_WB : I915_MAP_WC;
> >>>>>>>>>>> +     if (i915_gem_object_is_lmem(obj))
> >>>>>>>>>>> +             return I915_MAP_WC;
> >>>>>>>>>>> +     if (HAS_LLC(i915) || always_coherent)
> >>>>>>>>>>> +             return I915_MAP_WB;
> >>>>>>>>>>> +     else
> >>>>>>>>>>> +             return I915_MAP_WC;
> >>>>>>>>>>
> >>>>>>>>>> Seems this patch is doing two things.
> >>>>>>>>>>
> >>>>>>>>>> First it is adding lmem support to this helper by always
> >>>>>>>>>> returning WC
> >>>>>>>>>> for lmem objects.
> >>>>>>>>>>
> >>>>>>>>>> Secondly it is introducing an idea of "always coherent" in a
> >>>>>>>>>> helper
> >>>>>>>>>> called i915_coherent_map_type. Could someone explain what is
> >>>>>>>>>> coherent vs
> >>>>>>>>>> always coherent?
> >>>>>>>>>>
> >>>>>>>>>> And also, why is always coherent happy with WB? Sounds counter
> >>>>>>>>>> intuitive
> >>>>>>>>>> to me.
> >>>>>>>>>
> >>>>>>>>> All this does is try to keep the existing behaviour intact, whilst
> >>>>>>>>> also ensuring that all lmem objects are mapped using only WC, no
> >>>>>>>>> matter what. The always_coherent=true thing is for the existing
> >>>>>>>>> places
> >>>>>>>>> where we sometimes map the object using WB, without first
> >>>>>>>>> considering
> >>>>>>>>> whether the device has the fast shared LLC vs snooping. Yes, it's
> >>>>>>>>> slightly ugly :)
> >>>>>>>>
> >>>>>>>> Not fully following - if we had to write kerneldoc for
> >>>>>>>> always_coherent
> >>>>>>>> input argument - what it would say?
> >>>>>>>
> >>>>>>> @always_coherent - If true we should always try to map the object
> >>>>>>> using WB. If false we should only map as WB if the device
> >>>>>>> supports the
> >>>>>>> fast shared LLC, in the case of snooped devices we will map use WC.
> >>>>>>> Note that If the resource is lmem then we will always map as WC,
> >>>>>>> regardless of the value of always_coherent, since that's all we
> >>>>>>> currently support.
> >>>>>>>
> >>>>>>> Maybe the naming is poor?
> >>>>>>
> >>>>>> Maybe just confusing to me, not sure yet.
> >>>>>>
> >>>>>> So always_coherent is not about how the callers wants to use it,
> >>>>>> but about platform knowledge? Or a performance concern for LLC vs
> >>>>>> snooping cases? Does WB works (coherently) on snooping platforms?
> >>>>>
> >>>>> The always_coherent=true is for the existing callers that want WB,
> >>>>> regardless of LLC vs snooping.
> >>>>>
> >>>>> The other callers use the existing i915_coherent_map_type() which
> >>>>> only gives out WB for LLC platforms.
> >>>>>
> >>>>> AFAIK, LLC vs snooping should offer the same in terms of coherency,
> >>>>> but in terms of performance the shared LLC is much faster, and so
> >>>>> for snooping platforms we choose to not enable WB everywhere.
> >>>>>
> >>>>> On top of that we now have lmem, but for that we only allow WC.
> >>>>> This patch just rolls all of that into one helper, while keeping
> >>>>> the existing behaviour unchanged.
> >>>>
> >>>> Thanks. But I am still struggling with the API. :(
> >>>>
> >>>> Is the introduction of always_coherent flag in the context of DG1
> >>>> required even? AFAICT for lmem objects the flag is ignored so no?
> >>>
> >>> If we drop the flag/helper thing, then we need something like:
> >>>
> >>> type = WB;
> >>> if (i915_gem_object_is_lmem(obj))
> >>>      type = WC;
> >>>
> >>> vaddr = i915_gem_object_pin_map(obj, type);
> >>>
> >>> In all the places where we currently do:
> >>>
> >>> vaddr = i915_gem_object_pin_map(obj, WB);
> >>>
> >>> Where obj can be lmem, so ctx, ring, guc etc. Is that better or
> >>> worse? The existing i915_coherent_map_type() callers should work
> >>> as-is, since DG1 is snooped. And this patch just extends that to
> >>> cover all cases.
> >>>
> >>> Perhaps we need a new helper instead? Maybe you have a better idea?
> >>
> >> Not yet. Would it make sense to put something in kerneldoc about when
> >> callers might choose always_coherent true vs false? In terms of
> >> expected usage (frequency, simplicity?) and any rules with regards
> >> when callers need to worry about flushing/ordering when there are
> >> mixed read and writes?
> >
> > Hmmm, looking at this again, maybe for now we should just go with:
> >
> > type = WB;
> > if (i915_gem_object_is_lmem(obj))
> >        type = WC;
> >
> > vaddr = i915_gem_object_pin_map(obj, type)
> >
> > Which is way less confusing, plus there are only a handful of places
> > where we need this, so doesn't seem too bad?
> >
> > Alternatively, we could wrap that in something like:
> >
> > /* Returns WB for system memory, or WC for local memory */
> > void *i915_gem_object_pin_map_default(obj);
> >
> > Thoughts?
>
> I went and looked at the use sites to try and figure it out.
>
> First thing, the bool always_coherent story is only relevant when we
> decide to place some object in system memory. Otherwise mapping is
> always WC so I guess our code needs to handle it anyway. Well, if the
> assumption is that we can change the location of the objects and it all
> just keeps working? Or that is not the goal?

I guess your concern is that mapping as WC has different semantics,
and that might somehow break the caller?

>
> Let see about the users (ignoring selftests):
>
> 1) lrc_reg_state and ring; always_coherent=false
>
> Update frequency medium and mostly write from the CPU side.
>
> They say always_coherent=false - which means they have to handle being
> given a WC mapping anyway.
>
> What is the benefit of ever selecting WB here?
>
> 2) Engine status page; always_coherent=true
>
> Frequently read and written from the CPU and GPU so cost of snooping is
> therefore fine? Apart from having to be ready to deal with WC anyway.
>
> 3) dbg_poison_ce; always_coherent=true
>
> Writes to lrc_reg_state once - meh. Could just as well always ask for WC.
>
> 4) intel_guc_allocate_and_map_vma; always_coherent=true
>
> This one has three users:
>
> a) guc_stage_desc_pool_create stage_desc_pool_vaddr
>
> This one seems write once at init.
>
> b) intel_guc_ct_init
>
> Use for CT communication so similar to CSB on engine status page in
> principle. But code also has to deal with WC when object is in lmem.
>
> c) intel_guc_ads_create
>
> CPU appears to only write on init and GPU reset.
>
> 5) intel_huc_rsa_data_create; always_coherent=true
>
> Called from intel_huc_init so it appears write once from CPU. Not sure
> why it would need a coherent mapping if that is correct.
>
> I think this exercise left me equally confused. Because flushing and
> read-write ordering rules are different between WB and WC. And code
> which accesses all these mappings either has to know which one is in
> use, or does not care. For the latter case we have to be sure about for
> every path.

Users of pin_map() are generally meant to call flush_map() where
appropriate, which should do the right thing for us. For WC it only
needs to flush the wcb. For WB it's more complicated, since that
depends on whether the object is considered coherent: if it is, then
we don't need to do anything, otherwise we need to clflush.
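
To make the rule above concrete, here is a rough standalone model of
that decision. It is only a sketch, not the driver code: the enum and
function names are hypothetical, and the real flush_map() path does
more than this (barriers, dirty tracking).

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the driver's map types; not the i915 enums. */
enum map_type { MAP_WB, MAP_WC };

/* What a flush after CPU writes has to do, per the rule described above. */
enum flush_action {
	FLUSH_WCB,     /* WC mapping: drain the write-combining buffer */
	FLUSH_NONE,    /* WB mapping, coherent object: nothing to do */
	FLUSH_CLFLUSH, /* WB mapping, non-coherent object: clflush the range */
};

static enum flush_action flush_action_for(enum map_type type, bool coherent)
{
	/* WC never leaves dirty CPU cachelines behind, coherent or not. */
	if (type == MAP_WC)
		return FLUSH_WCB;

	return coherent ? FLUSH_NONE : FLUSH_CLFLUSH;
}
```

The point being that the caller doesn't need to know which mapping it
was handed, as long as it goes through the one flush helper.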

Also note that if we just map the buffer as WB, that by itself
doesn't magically enable snooping for the pages AFAIK. We still have
to tell the GPU that these pages are meant to be coherent, which we
always do for LLC platforms I think, since the shared LLC is
considered fast, whereas on snooping platforms, we don't enable this
by default, and have this as CACHE_NONE instead (see shmem_object_init
for example), and incur the cost of additional clflushing. Doing an
explicit i915_gem_object_set_cache_coherency(obj, I915_CACHE_LLC) will,
I think, mark the object as coherent for us. I think there are also some
matching GTT bits for caching.
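
As a rough illustration of the default vs explicit opt-in, assuming a
system memory object and ignoring the GTT side entirely (everything
named model_* here is hypothetical, and the driver has more cache
levels than the two shown):

```c
#include <stdbool.h>

/* Hypothetical stand-ins; the driver has more cache levels than this. */
enum cache_level { CACHE_NONE, CACHE_LLC };

struct model_obj {
	enum cache_level level;
};

/* Default for a new system memory object: coherent only with a shared LLC. */
static void model_obj_init(struct model_obj *obj, bool has_llc)
{
	obj->level = has_llc ? CACHE_LLC : CACHE_NONE;
}

/* Explicit opt-in, e.g. for an object on a snooping platform. */
static void model_obj_set_coherency(struct model_obj *obj,
				    enum cache_level level)
{
	obj->level = level;
}

/* CPU writes through a WB mapping need clflush unless the object is coherent. */
static bool model_obj_wb_needs_clflush(const struct model_obj *obj)
{
	return obj->level == CACHE_NONE;
}
```

So on a snooping platform the WB mapping alone leaves us clflushing,
until something explicitly marks the object as coherent.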

Also for DG1 you apparently can't disable snooping, as per what Daniel
was saying in another thread.

>
> The write on init / reset ones are easy enough and it doesn't really
> matter for them to use the coherent helper.
>
> Lrc_reg_state as well I think can be WC with explicit flushing - it has
> to on lmem, no?

I doubt it has to be, since the GPU still just accesses it through the GTT.

>
> This leaves the status page (CSB, etc) and GuC CT. Those are frequent
> R/W but also code has to be able to handle WC so what is the benefit of
> WB? It ends up faster than if it was WC, considering explicit
> flushes/barriers are still in there?

No idea for GuC, but for the hwsp it's still in system memory, and is
WB, even for discrete. Chris measured this to be more performant with
our execlists submission path than say just sticking it in lmem, and
mapping it as WC.

>
> Regards,
>
> Tvrtko

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [Intel-gfx] [PATCH 11/19] drm/i915: Update the helper to set correct mapping
  2021-04-21 19:13                         ` Matthew Auld
@ 2021-04-26  8:57                           ` Matthew Auld
  2021-04-26  9:21                             ` Tvrtko Ursulin
  0 siblings, 1 reply; 65+ messages in thread
From: Matthew Auld @ 2021-04-26  8:57 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: Intel Graphics Development, Matthew Auld, ML dri-devel

On Wed, 21 Apr 2021 at 20:13, Matthew Auld
<matthew.william.auld@gmail.com> wrote:
>
> On Wed, 21 Apr 2021 at 16:41, Tvrtko Ursulin
> <tvrtko.ursulin@linux.intel.com> wrote:
> >
> >
> > On 21/04/2021 12:42, Matthew Auld wrote:
> > > On 19/04/2021 16:01, Tvrtko Ursulin wrote:
> > >>
> > >> On 19/04/2021 15:37, Matthew Auld wrote:
> > >>> On 19/04/2021 15:07, Tvrtko Ursulin wrote:
> > >>>>
> > >>>> On 19/04/2021 12:30, Matthew Auld wrote:
> > >>>>> On 15/04/2021 12:05, Tvrtko Ursulin wrote:
> > >>>>>>
> > >>>>>> On 15/04/2021 10:23, Matthew Auld wrote:
> > >>>>>>> On Thu, 15 Apr 2021 at 09:21, Tvrtko Ursulin
> > >>>>>>> <tvrtko.ursulin@linux.intel.com> wrote:
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> On 14/04/2021 17:20, Matthew Auld wrote:
> > >>>>>>>>> On Wed, 14 Apr 2021 at 16:22, Tvrtko Ursulin
> > >>>>>>>>> <tvrtko.ursulin@linux.intel.com> wrote:
> > >>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>> On 12/04/2021 10:05, Matthew Auld wrote:
> > >>>>>>>>>>> From: Venkata Sandeep Dhanalakota
> > >>>>>>>>>>> <venkata.s.dhanalakota@intel.com>
> > >>>>>>>>>>>
> > >>>>>>>>>>> Determine the possible coherent map type based on object
> > >>>>>>>>>>> location,
> > >>>>>>>>>>> and if target has llc or if user requires an always coherent
> > >>>>>>>>>>> mapping.
> > >>>>>>>>>>>
> > >>>>>>>>>>> Cc: Matthew Auld <matthew.auld@intel.com>
> > >>>>>>>>>>> Cc: CQ Tang <cq.tang@intel.com>
> > >>>>>>>>>>> Suggested-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
> > >>>>>>>>>>> Signed-off-by: Venkata Sandeep Dhanalakota
> > >>>>>>>>>>> <venkata.s.dhanalakota@intel.com>
> > >>>>>>>>>>> ---
> > >>>>>>>>>>>     drivers/gpu/drm/i915/gt/intel_engine_cs.c    |  3 ++-
> > >>>>>>>>>>>     drivers/gpu/drm/i915/gt/intel_engine_pm.c    |  2 +-
> > >>>>>>>>>>>     drivers/gpu/drm/i915/gt/intel_lrc.c          |  4 +++-
> > >>>>>>>>>>>     drivers/gpu/drm/i915/gt/intel_ring.c         |  9 ++++++---
> > >>>>>>>>>>>     drivers/gpu/drm/i915/gt/selftest_context.c   |  3 ++-
> > >>>>>>>>>>>     drivers/gpu/drm/i915/gt/selftest_hangcheck.c |  4 ++--
> > >>>>>>>>>>>     drivers/gpu/drm/i915/gt/selftest_lrc.c       |  4 +++-
> > >>>>>>>>>>>     drivers/gpu/drm/i915/gt/uc/intel_guc.c       |  4 +++-
> > >>>>>>>>>>>     drivers/gpu/drm/i915/gt/uc/intel_huc.c       |  4 +++-
> > >>>>>>>>>>>     drivers/gpu/drm/i915/i915_drv.h              | 11
> > >>>>>>>>>>> +++++++++--
> > >>>>>>>>>>>     drivers/gpu/drm/i915/selftests/igt_spinner.c |  4 ++--
> > >>>>>>>>>>>     11 files changed, 36 insertions(+), 16 deletions(-)
> > >>>>>>>>>>>
> > >>>>>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> > >>>>>>>>>>> b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> > >>>>>>>>>>> index efe935f80c1a..b79568d370f5 100644
> > >>>>>>>>>>> --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> > >>>>>>>>>>> +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> > >>>>>>>>>>> @@ -664,7 +664,8 @@ static int init_status_page(struct
> > >>>>>>>>>>> intel_engine_cs *engine)
> > >>>>>>>>>>>         if (ret)
> > >>>>>>>>>>>                 goto err;
> > >>>>>>>>>>>
> > >>>>>>>>>>> -     vaddr = i915_gem_object_pin_map(obj, I915_MAP_WB);
> > >>>>>>>>>>> +     vaddr = i915_gem_object_pin_map(obj,
> > >>>>>>>>>>> + i915_coherent_map_type(engine->i915, obj, true));
> > >>>>>>>>>>>         if (IS_ERR(vaddr)) {
> > >>>>>>>>>>>                 ret = PTR_ERR(vaddr);
> > >>>>>>>>>>>                 goto err_unpin;
> > >>>>>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c
> > >>>>>>>>>>> b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
> > >>>>>>>>>>> index 7c9af86fdb1e..47f4397095e5 100644
> > >>>>>>>>>>> --- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c
> > >>>>>>>>>>> +++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
> > >>>>>>>>>>> @@ -23,7 +23,7 @@ static void dbg_poison_ce(struct
> > >>>>>>>>>>> intel_context *ce)
> > >>>>>>>>>>>
> > >>>>>>>>>>>         if (ce->state) {
> > >>>>>>>>>>>                 struct drm_i915_gem_object *obj =
> > >>>>>>>>>>> ce->state->obj;
> > >>>>>>>>>>> -             int type =
> > >>>>>>>>>>> i915_coherent_map_type(ce->engine->i915);
> > >>>>>>>>>>> +             int type =
> > >>>>>>>>>>> i915_coherent_map_type(ce->engine->i915, obj, true);
> > >>>>>>>>>>>                 void *map;
> > >>>>>>>>>>>
> > >>>>>>>>>>>                 if (!i915_gem_object_trylock(obj))
> > >>>>>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c
> > >>>>>>>>>>> b/drivers/gpu/drm/i915/gt/intel_lrc.c
> > >>>>>>>>>>> index e86897cde984..aafe2a4df496 100644
> > >>>>>>>>>>> --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
> > >>>>>>>>>>> +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
> > >>>>>>>>>>> @@ -903,7 +903,9 @@ lrc_pre_pin(struct intel_context *ce,
> > >>>>>>>>>>>         GEM_BUG_ON(!i915_vma_is_pinned(ce->state));
> > >>>>>>>>>>>
> > >>>>>>>>>>>         *vaddr = i915_gem_object_pin_map(ce->state->obj,
> > >>>>>>>>>>> - i915_coherent_map_type(ce->engine->i915) |
> > >>>>>>>>>>> + i915_coherent_map_type(ce->engine->i915,
> > >>>>>>>>>>> + ce->state->obj,
> > >>>>>>>>>>> + false) |
> > >>>>>>>>>>>                                          I915_MAP_OVERRIDE);
> > >>>>>>>>>>>
> > >>>>>>>>>>>         return PTR_ERR_OR_ZERO(*vaddr);
> > >>>>>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/intel_ring.c
> > >>>>>>>>>>> b/drivers/gpu/drm/i915/gt/intel_ring.c
> > >>>>>>>>>>> index aee0a77c77e0..3cf6c7e68108 100644
> > >>>>>>>>>>> --- a/drivers/gpu/drm/i915/gt/intel_ring.c
> > >>>>>>>>>>> +++ b/drivers/gpu/drm/i915/gt/intel_ring.c
> > >>>>>>>>>>> @@ -53,9 +53,12 @@ int intel_ring_pin(struct intel_ring
> > >>>>>>>>>>> *ring, struct i915_gem_ww_ctx *ww)
> > >>>>>>>>>>>
> > >>>>>>>>>>>         if (i915_vma_is_map_and_fenceable(vma))
> > >>>>>>>>>>>                 addr = (void __force *)i915_vma_pin_iomap(vma);
> > >>>>>>>>>>> -     else
> > >>>>>>>>>>> -             addr = i915_gem_object_pin_map(vma->obj,
> > >>>>>>>>>>> - i915_coherent_map_type(vma->vm->i915));
> > >>>>>>>>>>> +     else {
> > >>>>>>>>>>> +             int type =
> > >>>>>>>>>>> i915_coherent_map_type(vma->vm->i915, vma->obj, false);
> > >>>>>>>>>>> +
> > >>>>>>>>>>> +             addr = i915_gem_object_pin_map(vma->obj, type);
> > >>>>>>>>>>> +     }
> > >>>>>>>>>>> +
> > >>>>>>>>>>>         if (IS_ERR(addr)) {
> > >>>>>>>>>>>                 ret = PTR_ERR(addr);
> > >>>>>>>>>>>                 goto err_ring;
> > >>>>>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/selftest_context.c
> > >>>>>>>>>>> b/drivers/gpu/drm/i915/gt/selftest_context.c
> > >>>>>>>>>>> index b9bdd1d23243..26685b927169 100644
> > >>>>>>>>>>> --- a/drivers/gpu/drm/i915/gt/selftest_context.c
> > >>>>>>>>>>> +++ b/drivers/gpu/drm/i915/gt/selftest_context.c
> > >>>>>>>>>>> @@ -88,7 +88,8 @@ static int __live_context_size(struct
> > >>>>>>>>>>> intel_engine_cs *engine)
> > >>>>>>>>>>>                 goto err;
> > >>>>>>>>>>>
> > >>>>>>>>>>>         vaddr = i915_gem_object_pin_map_unlocked(ce->state->obj,
> > >>>>>>>>>>> - i915_coherent_map_type(engine->i915));
> > >>>>>>>>>>> + i915_coherent_map_type(engine->i915,
> > >>>>>>>>>>> + ce->state->obj, false));
> > >>>>>>>>>>>         if (IS_ERR(vaddr)) {
> > >>>>>>>>>>>                 err = PTR_ERR(vaddr);
> > >>>>>>>>>>>                 intel_context_unpin(ce);
> > >>>>>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
> > >>>>>>>>>>> b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
> > >>>>>>>>>>> index 746985971c3a..5b63d4df8c93 100644
> > >>>>>>>>>>> --- a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
> > >>>>>>>>>>> +++ b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
> > >>>>>>>>>>> @@ -69,7 +69,7 @@ static int hang_init(struct hang *h, struct
> > >>>>>>>>>>> intel_gt *gt)
> > >>>>>>>>>>>         h->seqno = memset(vaddr, 0xff, PAGE_SIZE);
> > >>>>>>>>>>>
> > >>>>>>>>>>>         vaddr = i915_gem_object_pin_map_unlocked(h->obj,
> > >>>>>>>>>>> - i915_coherent_map_type(gt->i915));
> > >>>>>>>>>>> + i915_coherent_map_type(gt->i915, h->obj, false));
> > >>>>>>>>>>>         if (IS_ERR(vaddr)) {
> > >>>>>>>>>>>                 err = PTR_ERR(vaddr);
> > >>>>>>>>>>>                 goto err_unpin_hws;
> > >>>>>>>>>>> @@ -130,7 +130,7 @@ hang_create_request(struct hang *h,
> > >>>>>>>>>>> struct intel_engine_cs *engine)
> > >>>>>>>>>>>                 return ERR_CAST(obj);
> > >>>>>>>>>>>         }
> > >>>>>>>>>>>
> > >>>>>>>>>>> -     vaddr = i915_gem_object_pin_map_unlocked(obj,
> > >>>>>>>>>>> i915_coherent_map_type(gt->i915));
> > >>>>>>>>>>> +     vaddr = i915_gem_object_pin_map_unlocked(obj,
> > >>>>>>>>>>> i915_coherent_map_type(gt->i915, obj, false));
> > >>>>>>>>>>>         if (IS_ERR(vaddr)) {
> > >>>>>>>>>>>                 i915_gem_object_put(obj);
> > >>>>>>>>>>>                 i915_vm_put(vm);
> > >>>>>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/selftest_lrc.c
> > >>>>>>>>>>> b/drivers/gpu/drm/i915/gt/selftest_lrc.c
> > >>>>>>>>>>> index 85e7df6a5123..d8f6623524e8 100644
> > >>>>>>>>>>> --- a/drivers/gpu/drm/i915/gt/selftest_lrc.c
> > >>>>>>>>>>> +++ b/drivers/gpu/drm/i915/gt/selftest_lrc.c
> > >>>>>>>>>>> @@ -1221,7 +1221,9 @@ static int compare_isolation(struct
> > >>>>>>>>>>> intel_engine_cs *engine,
> > >>>>>>>>>>>         }
> > >>>>>>>>>>>
> > >>>>>>>>>>>         lrc = i915_gem_object_pin_map_unlocked(ce->state->obj,
> > >>>>>>>>>>> - i915_coherent_map_type(engine->i915));
> > >>>>>>>>>>> + i915_coherent_map_type(engine->i915,
> > >>>>>>>>>>> + ce->state->obj,
> > >>>>>>>>>>> + false));
> > >>>>>>>>>>>         if (IS_ERR(lrc)) {
> > >>>>>>>>>>>                 err = PTR_ERR(lrc);
> > >>>>>>>>>>>                 goto err_B1;
> > >>>>>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.c
> > >>>>>>>>>>> b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
> > >>>>>>>>>>> index 78305b2ec89d..adae04c47aab 100644
> > >>>>>>>>>>> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc.c
> > >>>>>>>>>>> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
> > >>>>>>>>>>> @@ -682,7 +682,9 @@ int intel_guc_allocate_and_map_vma(struct
> > >>>>>>>>>>> intel_guc *guc, u32 size,
> > >>>>>>>>>>>         if (IS_ERR(vma))
> > >>>>>>>>>>>                 return PTR_ERR(vma);
> > >>>>>>>>>>>
> > >>>>>>>>>>> -     vaddr = i915_gem_object_pin_map_unlocked(vma->obj,
> > >>>>>>>>>>> I915_MAP_WB);
> > >>>>>>>>>>> +     vaddr = i915_gem_object_pin_map_unlocked(vma->obj,
> > >>>>>>>>>>> + i915_coherent_map_type(guc_to_gt(guc)->i915,
> > >>>>>>>>>>> + vma->obj, true));
> > >>>>>>>>>>>         if (IS_ERR(vaddr)) {
> > >>>>>>>>>>>                 i915_vma_unpin_and_release(&vma, 0);
> > >>>>>>>>>>>                 return PTR_ERR(vaddr);
> > >>>>>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_huc.c
> > >>>>>>>>>>> b/drivers/gpu/drm/i915/gt/uc/intel_huc.c
> > >>>>>>>>>>> index 2126dd81ac38..56d2144dc6a0 100644
> > >>>>>>>>>>> --- a/drivers/gpu/drm/i915/gt/uc/intel_huc.c
> > >>>>>>>>>>> +++ b/drivers/gpu/drm/i915/gt/uc/intel_huc.c
> > >>>>>>>>>>> @@ -82,7 +82,9 @@ static int intel_huc_rsa_data_create(struct
> > >>>>>>>>>>> intel_huc *huc)
> > >>>>>>>>>>>         if (IS_ERR(vma))
> > >>>>>>>>>>>                 return PTR_ERR(vma);
> > >>>>>>>>>>>
> > >>>>>>>>>>> -     vaddr = i915_gem_object_pin_map_unlocked(vma->obj,
> > >>>>>>>>>>> I915_MAP_WB);
> > >>>>>>>>>>> +     vaddr = i915_gem_object_pin_map_unlocked(vma->obj,
> > >>>>>>>>>>> + i915_coherent_map_type(gt->i915,
> > >>>>>>>>>>> + vma->obj, true));
> > >>>>>>>>>>>         if (IS_ERR(vaddr)) {
> > >>>>>>>>>>>                 i915_vma_unpin_and_release(&vma, 0);
> > >>>>>>>>>>>                 return PTR_ERR(vaddr);
> > >>>>>>>>>>> diff --git a/drivers/gpu/drm/i915/i915_drv.h
> > >>>>>>>>>>> b/drivers/gpu/drm/i915/i915_drv.h
> > >>>>>>>>>>> index 69e43bf91a15..2abbc06712a4 100644
> > >>>>>>>>>>> --- a/drivers/gpu/drm/i915/i915_drv.h
> > >>>>>>>>>>> +++ b/drivers/gpu/drm/i915/i915_drv.h
> > >>>>>>>>>>> @@ -78,6 +78,7 @@
> > >>>>>>>>>>>     #include "gem/i915_gem_context_types.h"
> > >>>>>>>>>>>     #include "gem/i915_gem_shrinker.h"
> > >>>>>>>>>>>     #include "gem/i915_gem_stolen.h"
> > >>>>>>>>>>> +#include "gem/i915_gem_lmem.h"
> > >>>>>>>>>>>
> > >>>>>>>>>>>     #include "gt/intel_engine.h"
> > >>>>>>>>>>>     #include "gt/intel_gt_types.h"
> > >>>>>>>>>>> @@ -1921,9 +1922,15 @@ static inline int
> > >>>>>>>>>>> intel_hws_csb_write_index(struct drm_i915_private *i915)
> > >>>>>>>>>>>     }
> > >>>>>>>>>>>
> > >>>>>>>>>>>     static inline enum i915_map_type
> > >>>>>>>>>>> -i915_coherent_map_type(struct drm_i915_private *i915)
> > >>>>>>>>>>> +i915_coherent_map_type(struct drm_i915_private *i915,
> > >>>>>>>>>>> +                    struct drm_i915_gem_object *obj, bool
> > >>>>>>>>>>> always_coherent)
> > >>>>>>>>>>>     {
> > >>>>>>>>>>> -     return HAS_LLC(i915) ? I915_MAP_WB : I915_MAP_WC;
> > >>>>>>>>>>> +     if (i915_gem_object_is_lmem(obj))
> > >>>>>>>>>>> +             return I915_MAP_WC;
> > >>>>>>>>>>> +     if (HAS_LLC(i915) || always_coherent)
> > >>>>>>>>>>> +             return I915_MAP_WB;
> > >>>>>>>>>>> +     else
> > >>>>>>>>>>> +             return I915_MAP_WC;
> > >>>>>>>>>>
> > >>>>>>>>>> Seems this patch is doing two things.
> > >>>>>>>>>>
> > >>>>>>>>>> First it is adding lmem support to this helper by always
> > >>>>>>>>>> returning WC
> > >>>>>>>>>> for lmem objects.
> > >>>>>>>>>>
> > >>>>>>>>>> Secondly it is introducing an idea of "always coherent" in a
> > >>>>>>>>>> helper
> > >>>>>>>>>> called i915_coherent_map_type. Could someone explain what is
> > >>>>>>>>>> coherent vs
> > >>>>>>>>>> always coherent?
> > >>>>>>>>>>
> > >>>>>>>>>> And also, why is always coherent happy with WB? Sounds counter
> > >>>>>>>>>> intuitive
> > >>>>>>>>>> to me.
> > >>>>>>>>>
> > >>>>>>>>> All this does is try to keep the existing behaviour intact, whilst
> > >>>>>>>>> also ensuring that all lmem objects are mapped using only WC, no
> > >>>>>>>>> matter what. The always_coherent=true thing is for the existing
> > >>>>>>>>> places
> > >>>>>>>>> where we sometimes map the object using WB, without first
> > >>>>>>>>> considering
> > >>>>>>>>> whether the device has the fast shared LLC vs snooping. Yes, it's
> > >>>>>>>>> slightly ugly :)
> > >>>>>>>>
> > >>>>>>>> Not fully following - if we had to write kerneldoc for
> > >>>>>>>> always_coherent
> > >>>>>>>> input argument - what would it say?
> > >>>>>>>
> > >>>>>>> @always_coherent - If true we should always try to map the object
> > >>>>>>> using WB. If false we should only map as WB if the device
> > >>>>>>> supports the
> > >>>>>>> fast shared LLC; in the case of snooped devices we will map using WC.
> > >>>>>>> Note that if the resource is lmem then we will always map as WC,
> > >>>>>>> regardless of the value of always_coherent, since that's all we
> > >>>>>>> currently support.
> > >>>>>>>
> > >>>>>>> Maybe the naming is poor?
> > >>>>>>
> > >>>>>> Maybe just confusing to me, not sure yet.
> > >>>>>>
> > >>>>>> So always_coherent is not about how the caller wants to use it,
> > >>>>>> but about platform knowledge? Or a performance concern for LLC vs
> > >>>>>> snooping cases? Does WB work (coherently) on snooping platforms?
> > >>>>>
> > >>>>> The always_coherent=true is for the existing callers that want WB,
> > >>>>> regardless of LLC vs snooping.
> > >>>>>
> > >>>>> The other callers use the existing i915_coherent_map_type() which
> > >>>>> only gives out WB for LLC platforms.
> > >>>>>
> > >>>>> AFAIK, LLC vs snooping should offer the same in terms of coherency,
> > >>>>> but in terms of performance the shared LLC is much faster, and so
> > >>>>> for snooping platforms we choose to not enable WB everywhere.
> > >>>>>
> > >>>>> On top of that we now have lmem, but for that we only allow WC.
> > >>>>> This patch just rolls all of that into one helper, while keeping
> > >>>>> the existing behaviour unchanged.
> > >>>>
> > >>>> Thanks. But I am still struggling with the API. :(
> > >>>>
> > >>>> Is the introduction of always_coherent flag in the context of DG1
> > >>>> required even? AFAICT for lmem objects the flag is ignored so no?
> > >>>
> > >>> If we drop the flag/helper thing, then we need something like:
> > >>>
> > >>> type = WB;
> > >>> if (i915_gem_object_is_lmem(obj))
> > >>>      type = WC;
> > >>>
> > >>> vaddr = i915_gem_object_pin_map(obj, type);
> > >>>
> > >>> In all the places where we currently do:
> > >>>
> > >>> vaddr = i915_gem_object_pin_map(obj, WB);
> > >>>
> > >>> Where obj can be lmem, so ctx, ring, guc etc. Is that better or
> > >>> worse? The existing i915_coherent_map_type() callers should work
> > >>> as-is, since DG1 is snooped. And this patch just extends that to
> > >>> cover all cases.
> > >>>
> > >>> Perhaps we need a new helper instead? Maybe you have a better idea?
> > >>
> > >> Not yet. Would it make sense to put something in kerneldoc about when
> > >> callers might choose always_coherent true vs false? In terms of
> > >> expected usage (frequency, simplicity?) and any rules with regards
> > >> when callers need to worry about flushing/ordering when there are
> > >> mixed read and writes?
> > >
> > > Hmmm, looking at this again, maybe for now we should just go with:
> > >
> > > type = WB;
> > > if (i915_gem_object_is_lmem(obj))
> > >        type = WC;
> > >
> > > vaddr = i915_gem_object_pin_map(obj, type)
> > >
> > > Which is way less confusing, plus there are only a handful of places
> > > where we need this, so doesn't seem too bad?
> > >
> > > Alternatively, we could wrap that in something like:
> > >
> > > /* Returns WB for system memory, or WC for local memory */
> > > void *i915_gem_object_pin_map_default(obj);
> > >
> > > Thoughts?
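
A minimal sketch of the wrapper idea above, assuming hypothetical stand-in
types (`enum map_type`, `struct fake_obj`) rather than the real i915
definitions. It models only the choice of mapping type, not the pinning:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver types; not the real i915 definitions. */
enum map_type { MAP_WB, MAP_WC };

struct fake_obj {
	bool is_lmem; /* stands in for i915_gem_object_is_lmem(obj) */
};

/* Returns WB for system memory, or WC for local memory. */
static inline enum map_type default_map_type(const struct fake_obj *obj)
{
	assert(obj != NULL);

	return obj->is_lmem ? MAP_WC : MAP_WB;
}
```

With this shape the caller passes no flag; the placement of the object alone
decides the mapping type.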
> >
> > I went and looked at the use sites to try and figure it out.
> >
> > First thing, the bool always_coherent story is only relevant when we
> > decide to place some object in system memory. Otherwise mapping is
> > always WC so I guess our code needs to handle it anyway. Well, if the
> > assumption is that we can change the location of the objects and it all
> > just keeps working? Or that is not the goal?
>
> I guess your concern is that mapping as WC has different semantics,
> and that might somehow break the caller?
>
> >
> > Let's see about the users (ignoring selftests):
> >
> > 1) lrc_reg_state and ring; always_coherent=false
> >
> > Update frequency medium and mostly write from the CPU side.
> >
> > They say always_coherent=false - which means they have to handle being
> > given a WC mapping anyway.
> >
> > What is the benefit of ever selecting WB here?
> >
> > 2) Engine status page; always_coherent=true
> >
> > Frequently read and written from the CPU and GPU so cost of snooping is
> > therefore fine? Apart from having to be ready to deal with WC anyway.
> >
> > 3) dbg_poison_ce; always_coherent=true
> >
> > Writes to lrc_reg_state once - meh. Could just as well always ask for WC.
> >
> > 4) intel_guc_allocate_and_map_vma; always_coherent=true
> >
> > This one has three users:
> >
> > a) guc_stage_desc_pool_create stage_desc_pool_vaddr
> >
> > This one seems write once at init.
> >
> > b) intel_guc_ct_init
> >
> > Use for CT communication so similar to CSB on engine status page in
> > principle. But code also has to deal with WC when object is in lmem.
> >
> > c) intel_guc_ads_create
> >
> > CPU appears to only write on init and GPU reset.
> >
> > 5) intel_huc_rsa_data_create; always_coherent=true
> >
> > Called from intel_huc_init so it appears write once from CPU. Not sure
> > why it would need a coherent mapping if that is correct.
> >
> > I think this exercise left me equally confused. Because flushing and
> > read-write ordering rules are different between WB and WC. And code
> > which accesses all these mappings either has to know which one is in
> > use, or does not care. For the latter case we have to be sure of that for
> > every path.
>
> Users of pin_map() are generally meant to call flush_map() where
> appropriate, which should do the right thing for us. For WC it only
> needs to flush the wcb. For WB it's more complicated since that
> depends on if the object is considered coherent or not, if it is then
> we don't need to do anything, otherwise we need to clflush.
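
The rule described above can be sketched as a small decision function. This
is only a model under stated assumptions: the types, the `coherent` field and
the `flush_action_for()` name are hypothetical, and the real flush_map() path
in the driver may differ in detail:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; not the real i915 types or flush implementation. */
enum map_type { MAP_WB, MAP_WC };

enum flush_action {
	FLUSH_NONE,    /* coherent WB mapping: nothing to do */
	FLUSH_WCB,     /* WC mapping: drain the write-combining buffer */
	FLUSH_CLFLUSH, /* non-coherent WB mapping: clflush the written range */
};

struct fake_obj {
	enum map_type mapping; /* how the CPU mapped the pages */
	bool coherent;         /* assumed: GPU access is coherent with CPU caches */
};

/* Decide what a flush_map()-style call has to do after CPU writes. */
static inline enum flush_action flush_action_for(const struct fake_obj *obj)
{
	assert(obj != NULL);

	if (obj->mapping == MAP_WC)
		return FLUSH_WCB;

	return obj->coherent ? FLUSH_NONE : FLUSH_CLFLUSH;
}
```

A caller that always goes through such a function does not need to know which
mapping type it was handed, which is the point being made about flush_map().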
>
> Also note that if we just map the buffer as WB, that by itself
> doesn't magically enable snooping for the pages AFAIK. We still have
> to tell the GPU that these pages are meant to be coherent, which we
> always do for LLC platforms I think, since the shared LLC is
> considered fast, whereas on snooping platforms, we don't enable this
> by default, and have this as CACHE_NONE instead (see shmem_object_init
> for example), and incur the cost of additional clflushing.  Doing an
> explicit i915_gem_object_set_coherency(I915_CACHE_LLC) I think will
> mark the object as coherent for us. I think there are also some
> matching GTT bits for caching.
>
> Also for DG1 you apparently can't disable snooping, as per what Daniel
> was saying in another thread.
>
> >
> > The write on init / reset ones are easy enough and it doesn't really
> > matter for them to use the coherent helper.
> >
> > Lrc_reg_state as well I think can be WC with explicit flushing - it has
> > to on lmem, no?
>
> I doubt it has to be, since the GPU still just accesses it through the GTT.
>
> >
> > This leaves the status page (CSB, etc) and GuC CT. Those are frequent
> > R/W but also code has to be able to handle WC so what is the benefit of
> > WB? It ends up faster than if it was WC, considering explicit
> > flushes/barriers are still in there?
>
> No idea for GuC, but for the hwsp it's still in system memory, and is
> WB, even for discrete. Chris measured this to be more performant with
> our execlists submission path than say just sticking it in lmem, and
> mapping it as WC.

Ping? How should we proceed with this patch?

>
> >
> > Regards,
> >
> > Tvrtko

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [Intel-gfx] [PATCH 11/19] drm/i915: Update the helper to set correct mapping
  2021-04-26  8:57                           ` Matthew Auld
@ 2021-04-26  9:21                             ` Tvrtko Ursulin
  0 siblings, 0 replies; 65+ messages in thread
From: Tvrtko Ursulin @ 2021-04-26  9:21 UTC (permalink / raw)
  To: Matthew Auld; +Cc: Intel Graphics Development, Matthew Auld, ML dri-devel


On 26/04/2021 09:57, Matthew Auld wrote:
> On Wed, 21 Apr 2021 at 20:13, Matthew Auld
> <matthew.william.auld@gmail.com> wrote:
>>
>> On Wed, 21 Apr 2021 at 16:41, Tvrtko Ursulin
>> <tvrtko.ursulin@linux.intel.com> wrote:
>>>
>>>
>>> On 21/04/2021 12:42, Matthew Auld wrote:
>>>> On 19/04/2021 16:01, Tvrtko Ursulin wrote:
>>>>>
>>>>> On 19/04/2021 15:37, Matthew Auld wrote:
>>>>>> On 19/04/2021 15:07, Tvrtko Ursulin wrote:
>>>>>>>
>>>>>>> On 19/04/2021 12:30, Matthew Auld wrote:
>>>>>>>> On 15/04/2021 12:05, Tvrtko Ursulin wrote:
>>>>>>>>>
>>>>>>>>> On 15/04/2021 10:23, Matthew Auld wrote:
>>>>>>>>>> On Thu, 15 Apr 2021 at 09:21, Tvrtko Ursulin
>>>>>>>>>> <tvrtko.ursulin@linux.intel.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 14/04/2021 17:20, Matthew Auld wrote:
>>>>>>>>>>>> On Wed, 14 Apr 2021 at 16:22, Tvrtko Ursulin
>>>>>>>>>>>> <tvrtko.ursulin@linux.intel.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 12/04/2021 10:05, Matthew Auld wrote:
>>>>>>>>>>>>>> From: Venkata Sandeep Dhanalakota
>>>>>>>>>>>>>> <venkata.s.dhanalakota@intel.com>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Determine the possible coherent map type based on object
>>>>>>>>>>>>>> location,
>>>>>>>>>>>>>> and if target has llc or if user requires an always coherent
>>>>>>>>>>>>>> mapping.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Cc: Matthew Auld <matthew.auld@intel.com>
>>>>>>>>>>>>>> Cc: CQ Tang <cq.tang@intel.com>
>>>>>>>>>>>>>> Suggested-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
>>>>>>>>>>>>>> Signed-off-by: Venkata Sandeep Dhanalakota
>>>>>>>>>>>>>> <venkata.s.dhanalakota@intel.com>
>>>>>>>>>>>>>> ---
>>>>>>>>>>>>>>      drivers/gpu/drm/i915/gt/intel_engine_cs.c    |  3 ++-
>>>>>>>>>>>>>>      drivers/gpu/drm/i915/gt/intel_engine_pm.c    |  2 +-
>>>>>>>>>>>>>>      drivers/gpu/drm/i915/gt/intel_lrc.c          |  4 +++-
>>>>>>>>>>>>>>      drivers/gpu/drm/i915/gt/intel_ring.c         |  9 ++++++---
>>>>>>>>>>>>>>      drivers/gpu/drm/i915/gt/selftest_context.c   |  3 ++-
>>>>>>>>>>>>>>      drivers/gpu/drm/i915/gt/selftest_hangcheck.c |  4 ++--
>>>>>>>>>>>>>>      drivers/gpu/drm/i915/gt/selftest_lrc.c       |  4 +++-
>>>>>>>>>>>>>>      drivers/gpu/drm/i915/gt/uc/intel_guc.c       |  4 +++-
>>>>>>>>>>>>>>      drivers/gpu/drm/i915/gt/uc/intel_huc.c       |  4 +++-
>>>>>>>>>>>>>>      drivers/gpu/drm/i915/i915_drv.h              | 11
>>>>>>>>>>>>>> +++++++++--
>>>>>>>>>>>>>>      drivers/gpu/drm/i915/selftests/igt_spinner.c |  4 ++--
>>>>>>>>>>>>>>      11 files changed, 36 insertions(+), 16 deletions(-)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
>>>>>>>>>>>>>> b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
>>>>>>>>>>>>>> index efe935f80c1a..b79568d370f5 100644
>>>>>>>>>>>>>> --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
>>>>>>>>>>>>>> +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
>>>>>>>>>>>>>> @@ -664,7 +664,8 @@ static int init_status_page(struct
>>>>>>>>>>>>>> intel_engine_cs *engine)
>>>>>>>>>>>>>>          if (ret)
>>>>>>>>>>>>>>                  goto err;
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> -     vaddr = i915_gem_object_pin_map(obj, I915_MAP_WB);
>>>>>>>>>>>>>> +     vaddr = i915_gem_object_pin_map(obj,
>>>>>>>>>>>>>> + i915_coherent_map_type(engine->i915, obj, true));
>>>>>>>>>>>>>>          if (IS_ERR(vaddr)) {
>>>>>>>>>>>>>>                  ret = PTR_ERR(vaddr);
>>>>>>>>>>>>>>                  goto err_unpin;
>>>>>>>>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c
>>>>>>>>>>>>>> b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
>>>>>>>>>>>>>> index 7c9af86fdb1e..47f4397095e5 100644
>>>>>>>>>>>>>> --- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c
>>>>>>>>>>>>>> +++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
>>>>>>>>>>>>>> @@ -23,7 +23,7 @@ static void dbg_poison_ce(struct
>>>>>>>>>>>>>> intel_context *ce)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>          if (ce->state) {
>>>>>>>>>>>>>>                  struct drm_i915_gem_object *obj =
>>>>>>>>>>>>>> ce->state->obj;
>>>>>>>>>>>>>> -             int type =
>>>>>>>>>>>>>> i915_coherent_map_type(ce->engine->i915);
>>>>>>>>>>>>>> +             int type =
>>>>>>>>>>>>>> i915_coherent_map_type(ce->engine->i915, obj, true);
>>>>>>>>>>>>>>                  void *map;
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>                  if (!i915_gem_object_trylock(obj))
>>>>>>>>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c
>>>>>>>>>>>>>> b/drivers/gpu/drm/i915/gt/intel_lrc.c
>>>>>>>>>>>>>> index e86897cde984..aafe2a4df496 100644
>>>>>>>>>>>>>> --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
>>>>>>>>>>>>>> +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
>>>>>>>>>>>>>> @@ -903,7 +903,9 @@ lrc_pre_pin(struct intel_context *ce,
>>>>>>>>>>>>>>          GEM_BUG_ON(!i915_vma_is_pinned(ce->state));
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>          *vaddr = i915_gem_object_pin_map(ce->state->obj,
>>>>>>>>>>>>>> - i915_coherent_map_type(ce->engine->i915) |
>>>>>>>>>>>>>> + i915_coherent_map_type(ce->engine->i915,
>>>>>>>>>>>>>> + ce->state->obj,
>>>>>>>>>>>>>> + false) |
>>>>>>>>>>>>>>                                           I915_MAP_OVERRIDE);
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>          return PTR_ERR_OR_ZERO(*vaddr);
>>>>>>>>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/intel_ring.c
>>>>>>>>>>>>>> b/drivers/gpu/drm/i915/gt/intel_ring.c
>>>>>>>>>>>>>> index aee0a77c77e0..3cf6c7e68108 100644
>>>>>>>>>>>>>> --- a/drivers/gpu/drm/i915/gt/intel_ring.c
>>>>>>>>>>>>>> +++ b/drivers/gpu/drm/i915/gt/intel_ring.c
>>>>>>>>>>>>>> @@ -53,9 +53,12 @@ int intel_ring_pin(struct intel_ring
>>>>>>>>>>>>>> *ring, struct i915_gem_ww_ctx *ww)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>          if (i915_vma_is_map_and_fenceable(vma))
>>>>>>>>>>>>>>                  addr = (void __force *)i915_vma_pin_iomap(vma);
>>>>>>>>>>>>>> -     else
>>>>>>>>>>>>>> -             addr = i915_gem_object_pin_map(vma->obj,
>>>>>>>>>>>>>> - i915_coherent_map_type(vma->vm->i915));
>>>>>>>>>>>>>> +     else {
>>>>>>>>>>>>>> +             int type =
>>>>>>>>>>>>>> i915_coherent_map_type(vma->vm->i915, vma->obj, false);
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>> +             addr = i915_gem_object_pin_map(vma->obj, type);
>>>>>>>>>>>>>> +     }
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>          if (IS_ERR(addr)) {
>>>>>>>>>>>>>>                  ret = PTR_ERR(addr);
>>>>>>>>>>>>>>                  goto err_ring;
>>>>>>>>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/selftest_context.c
>>>>>>>>>>>>>> b/drivers/gpu/drm/i915/gt/selftest_context.c
>>>>>>>>>>>>>> index b9bdd1d23243..26685b927169 100644
>>>>>>>>>>>>>> --- a/drivers/gpu/drm/i915/gt/selftest_context.c
>>>>>>>>>>>>>> +++ b/drivers/gpu/drm/i915/gt/selftest_context.c
>>>>>>>>>>>>>> @@ -88,7 +88,8 @@ static int __live_context_size(struct
>>>>>>>>>>>>>> intel_engine_cs *engine)
>>>>>>>>>>>>>>                  goto err;
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>          vaddr = i915_gem_object_pin_map_unlocked(ce->state->obj,
>>>>>>>>>>>>>> - i915_coherent_map_type(engine->i915));
>>>>>>>>>>>>>> + i915_coherent_map_type(engine->i915,
>>>>>>>>>>>>>> + ce->state->obj, false));
>>>>>>>>>>>>>>          if (IS_ERR(vaddr)) {
>>>>>>>>>>>>>>                  err = PTR_ERR(vaddr);
>>>>>>>>>>>>>>                  intel_context_unpin(ce);
>>>>>>>>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
>>>>>>>>>>>>>> b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
>>>>>>>>>>>>>> index 746985971c3a..5b63d4df8c93 100644
>>>>>>>>>>>>>> --- a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
>>>>>>>>>>>>>> +++ b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
>>>>>>>>>>>>>> @@ -69,7 +69,7 @@ static int hang_init(struct hang *h, struct
>>>>>>>>>>>>>> intel_gt *gt)
>>>>>>>>>>>>>>          h->seqno = memset(vaddr, 0xff, PAGE_SIZE);
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>          vaddr = i915_gem_object_pin_map_unlocked(h->obj,
>>>>>>>>>>>>>> - i915_coherent_map_type(gt->i915));
>>>>>>>>>>>>>> + i915_coherent_map_type(gt->i915, h->obj, false));
>>>>>>>>>>>>>>          if (IS_ERR(vaddr)) {
>>>>>>>>>>>>>>                  err = PTR_ERR(vaddr);
>>>>>>>>>>>>>>                  goto err_unpin_hws;
>>>>>>>>>>>>>> @@ -130,7 +130,7 @@ hang_create_request(struct hang *h,
>>>>>>>>>>>>>> struct intel_engine_cs *engine)
>>>>>>>>>>>>>>                  return ERR_CAST(obj);
>>>>>>>>>>>>>>          }
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> -     vaddr = i915_gem_object_pin_map_unlocked(obj,
>>>>>>>>>>>>>> i915_coherent_map_type(gt->i915));
>>>>>>>>>>>>>> +     vaddr = i915_gem_object_pin_map_unlocked(obj,
>>>>>>>>>>>>>> i915_coherent_map_type(gt->i915, obj, false));
>>>>>>>>>>>>>>          if (IS_ERR(vaddr)) {
>>>>>>>>>>>>>>                  i915_gem_object_put(obj);
>>>>>>>>>>>>>>                  i915_vm_put(vm);
>>>>>>>>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/selftest_lrc.c
>>>>>>>>>>>>>> b/drivers/gpu/drm/i915/gt/selftest_lrc.c
>>>>>>>>>>>>>> index 85e7df6a5123..d8f6623524e8 100644
>>>>>>>>>>>>>> --- a/drivers/gpu/drm/i915/gt/selftest_lrc.c
>>>>>>>>>>>>>> +++ b/drivers/gpu/drm/i915/gt/selftest_lrc.c
>>>>>>>>>>>>>> @@ -1221,7 +1221,9 @@ static int compare_isolation(struct
>>>>>>>>>>>>>> intel_engine_cs *engine,
>>>>>>>>>>>>>>          }
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>          lrc = i915_gem_object_pin_map_unlocked(ce->state->obj,
>>>>>>>>>>>>>> - i915_coherent_map_type(engine->i915));
>>>>>>>>>>>>>> + i915_coherent_map_type(engine->i915,
>>>>>>>>>>>>>> + ce->state->obj,
>>>>>>>>>>>>>> + false));
>>>>>>>>>>>>>>          if (IS_ERR(lrc)) {
>>>>>>>>>>>>>>                  err = PTR_ERR(lrc);
>>>>>>>>>>>>>>                  goto err_B1;
>>>>>>>>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.c
>>>>>>>>>>>>>> b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
>>>>>>>>>>>>>> index 78305b2ec89d..adae04c47aab 100644
>>>>>>>>>>>>>> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc.c
>>>>>>>>>>>>>> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
>>>>>>>>>>>>>> @@ -682,7 +682,9 @@ int intel_guc_allocate_and_map_vma(struct
>>>>>>>>>>>>>> intel_guc *guc, u32 size,
>>>>>>>>>>>>>>          if (IS_ERR(vma))
>>>>>>>>>>>>>>                  return PTR_ERR(vma);
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> -     vaddr = i915_gem_object_pin_map_unlocked(vma->obj,
>>>>>>>>>>>>>> I915_MAP_WB);
>>>>>>>>>>>>>> +     vaddr = i915_gem_object_pin_map_unlocked(vma->obj,
>>>>>>>>>>>>>> + i915_coherent_map_type(guc_to_gt(guc)->i915,
>>>>>>>>>>>>>> + vma->obj, true));
>>>>>>>>>>>>>>          if (IS_ERR(vaddr)) {
>>>>>>>>>>>>>>                  i915_vma_unpin_and_release(&vma, 0);
>>>>>>>>>>>>>>                  return PTR_ERR(vaddr);
>>>>>>>>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_huc.c
>>>>>>>>>>>>>> b/drivers/gpu/drm/i915/gt/uc/intel_huc.c
>>>>>>>>>>>>>> index 2126dd81ac38..56d2144dc6a0 100644
>>>>>>>>>>>>>> --- a/drivers/gpu/drm/i915/gt/uc/intel_huc.c
>>>>>>>>>>>>>> +++ b/drivers/gpu/drm/i915/gt/uc/intel_huc.c
>>>>>>>>>>>>>> @@ -82,7 +82,9 @@ static int intel_huc_rsa_data_create(struct
>>>>>>>>>>>>>> intel_huc *huc)
>>>>>>>>>>>>>>          if (IS_ERR(vma))
>>>>>>>>>>>>>>                  return PTR_ERR(vma);
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> -     vaddr = i915_gem_object_pin_map_unlocked(vma->obj,
>>>>>>>>>>>>>> I915_MAP_WB);
>>>>>>>>>>>>>> +     vaddr = i915_gem_object_pin_map_unlocked(vma->obj,
>>>>>>>>>>>>>> + i915_coherent_map_type(gt->i915,
>>>>>>>>>>>>>> + vma->obj, true));
>>>>>>>>>>>>>>          if (IS_ERR(vaddr)) {
>>>>>>>>>>>>>>                  i915_vma_unpin_and_release(&vma, 0);
>>>>>>>>>>>>>>                  return PTR_ERR(vaddr);
>>>>>>>>>>>>>> diff --git a/drivers/gpu/drm/i915/i915_drv.h
>>>>>>>>>>>>>> b/drivers/gpu/drm/i915/i915_drv.h
>>>>>>>>>>>>>> index 69e43bf91a15..2abbc06712a4 100644
>>>>>>>>>>>>>> --- a/drivers/gpu/drm/i915/i915_drv.h
>>>>>>>>>>>>>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>>>>>>>>>>>>>> @@ -78,6 +78,7 @@
>>>>>>>>>>>>>>      #include "gem/i915_gem_context_types.h"
>>>>>>>>>>>>>>      #include "gem/i915_gem_shrinker.h"
>>>>>>>>>>>>>>      #include "gem/i915_gem_stolen.h"
>>>>>>>>>>>>>> +#include "gem/i915_gem_lmem.h"
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>      #include "gt/intel_engine.h"
>>>>>>>>>>>>>>      #include "gt/intel_gt_types.h"
>>>>>>>>>>>>>> @@ -1921,9 +1922,15 @@ static inline int
>>>>>>>>>>>>>> intel_hws_csb_write_index(struct drm_i915_private *i915)
>>>>>>>>>>>>>>      }
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>      static inline enum i915_map_type
>>>>>>>>>>>>>> -i915_coherent_map_type(struct drm_i915_private *i915)
>>>>>>>>>>>>>> +i915_coherent_map_type(struct drm_i915_private *i915,
>>>>>>>>>>>>>> +                    struct drm_i915_gem_object *obj, bool
>>>>>>>>>>>>>> always_coherent)
>>>>>>>>>>>>>>      {
>>>>>>>>>>>>>> -     return HAS_LLC(i915) ? I915_MAP_WB : I915_MAP_WC;
>>>>>>>>>>>>>> +     if (i915_gem_object_is_lmem(obj))
>>>>>>>>>>>>>> +             return I915_MAP_WC;
>>>>>>>>>>>>>> +     if (HAS_LLC(i915) || always_coherent)
>>>>>>>>>>>>>> +             return I915_MAP_WB;
>>>>>>>>>>>>>> +     else
>>>>>>>>>>>>>> +             return I915_MAP_WC;
>>>>>>>>>>>>>
>>>>>>>>>>>>> Seems this patch is doing two things.
>>>>>>>>>>>>>
>>>>>>>>>>>>> First it is adding lmem support to this helper by always
>>>>>>>>>>>>> returning WC
>>>>>>>>>>>>> for lmem objects.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Secondly it is introducing an idea of "always coherent" in a
>>>>>>>>>>>>> helper
>>>>>>>>>>>>> called i915_coherent_map_type. Could someone explain what is
>>>>>>>>>>>>> coherent vs
>>>>>>>>>>>>> always coherent?
>>>>>>>>>>>>>
>>>>>>>>>>>>> And also, why is always coherent happy with WB? Sounds counter
>>>>>>>>>>>>> intuitive
>>>>>>>>>>>>> to me.
>>>>>>>>>>>>
>>>>>>>>>>>> All this does is try to keep the existing behaviour intact, whilst
>>>>>>>>>>>> also ensuring that all lmem objects are mapped using only WC, no
>>>>>>>>>>>> matter what. The always_coherent=true thing is for the existing
>>>>>>>>>>>> places
>>>>>>>>>>>> where we sometimes map the object using WB, without first
>>>>>>>>>>>> considering
>>>>>>>>>>>> whether the device has the fast shared LLC vs snooping. Yes, it's
>>>>>>>>>>>> slightly ugly :)
>>>>>>>>>>>
>>>>>>>>>>> Not fully following - if we had to write kerneldoc for
>>>>>>>>>>> always_coherent
>>>>>>>>>>> input argument - what would it say?
>>>>>>>>>>
>>>>>>>>>> @always_coherent - If true we should always try to map the object
>>>>>>>>>> using WB. If false we should only map as WB if the device
>>>>>>>>>> supports the
>>>>>>>>>> fast shared LLC; in the case of snooped devices we will map using WC.
>>>>>>>>>> Note that if the resource is lmem then we will always map as WC,
>>>>>>>>>> regardless of the value of always_coherent, since that's all we
>>>>>>>>>> currently support.
>>>>>>>>>>
>>>>>>>>>> Maybe the naming is poor?
>>>>>>>>>
>>>>>>>>> Maybe just confusing to me, not sure yet.
>>>>>>>>>
>>>>>>>>> So always_coherent is not about how the caller wants to use it,
>>>>>>>>> but about platform knowledge? Or a performance concern for LLC vs
>>>>>>>>> snooping cases? Does WB work (coherently) on snooping platforms?
>>>>>>>>
>>>>>>>> The always_coherent=true is for the existing callers that want WB,
>>>>>>>> regardless of LLC vs snooping.
>>>>>>>>
>>>>>>>> The other callers use the existing i915_coherent_map_type() which
>>>>>>>> only gives out WB for LLC platforms.
>>>>>>>>
>>>>>>>> AFAIK, LLC vs snooping should offer the same in terms of coherency,
>>>>>>>> but in terms of performance the shared LLC is much faster, and so
>>>>>>>> for snooping platforms we choose to not enable WB everywhere.
>>>>>>>>
>>>>>>>> On top of that we now have lmem, but for that we only allow WC.
>>>>>>>> This patch just rolls all of that into one helper, while keeping
>>>>>>>> the existing behaviour unchanged.
>>>>>>>
>>>>>>> Thanks. But I am still struggling with the API. :(
>>>>>>>
>>>>>>> Is the introduction of always_coherent flag in the context of DG1
>>>>>>> required even? AFAICT for lmem objects the flag is ignored so no?
>>>>>>
>>>>>> If we drop the flag/helper thing, then we need something like:
>>>>>>
>>>>>> type = WB;
>>>>>> if (i915_gem_object_is_lmem(obj))
>>>>>>       type = WC;
>>>>>>
>>>>>> vaddr = i915_gem_object_pin_map(obj, type);
>>>>>>
>>>>>> In all the places where we currently do:
>>>>>>
>>>>>> vaddr = i915_gem_object_pin_map(obj, WB);
>>>>>>
>>>>>> Where obj can be lmem, so ctx, ring, guc etc. Is that better or
>>>>>> worse? The existing i915_coherent_map_type() callers should work
>>>>>> as-is, since DG1 is snooped. And this patch just extends that to
>>>>>> cover all cases.
>>>>>>
>>>>>> Perhaps we need a new helper instead? Maybe you have a better idea?
>>>>>
>>>>> Not yet. Would it make sense to put something in kerneldoc about when
>>>>> callers might choose always_coherent true vs false? In terms of
>>>>> expected usage (frequency, simplicity?) and any rules with regards
>>>>> when callers need to worry about flushing/ordering when there are
>>>>> mixed read and writes?
>>>>
>>>> Hmmm, looking at this again, maybe for now we should just go with:
>>>>
>>>> type = WB;
>>>> if (i915_gem_object_is_lmem(obj))
>>>>         type = WC;
>>>>
>>>> vaddr = i915_gem_object_pin_map(obj, type)
>>>>
>>>> Which is way less confusing, plus there are only a handful of places
>>>> where we need this, so doesn't seem too bad?
>>>>
>>>> Alternatively, we could wrap that in something like:
>>>>
>>>> /* Returns WB for system memory, or WC for local memory */
>>>> void *i915_gem_object_pin_map_default(obj);
>>>>
>>>> Thoughts?
>>>
>>> I went and looked at the use sites to try and figure it out.
>>>
>>> First thing, the bool always_coherent story is only relevant when we
>>> decide to place some object in system memory. Otherwise mapping is
>>> always WC so I guess our code needs to handle it anyway. Well, if the
>>> assumption is that we can change the location of the objects and it all
>>> just keeps working? Or that is not the goal?
>>
>> I guess your concern is that mapping as WC has different semantics,
>> and that might somehow break the caller?
>>
>>>
>>> Let's see about the users (ignoring selftests):
>>>
>>> 1) lrc_reg_state and ring; always_coherent=false
>>>
>>> Update frequency medium and mostly write from the CPU side.
>>>
>>> They say always_coherent=false - which means they have to handle being
>>> given a WC mapping anyway.
>>>
>>> What is the benefit of ever selecting WB here?
>>>
>>> 2) Engine status page; always_coherent=true
>>>
>>> Frequently read and written from the CPU and GPU so cost of snooping is
>>> therefore fine? Apart from having to be ready to deal with WC anyway.
>>>
>>> 3) dbg_poison_ce; always_coherent=true
>>>
>>> Writes to lrc_reg_state once - meh. Could just as well always ask for WC.
>>>
>>> 4) intel_guc_allocate_and_map_vma; always_coherent=true
>>>
>>> This one has three users:
>>>
>>> a) guc_stage_desc_pool_create stage_desc_pool_vaddr
>>>
>>> This one seems write once at init.
>>>
>>> b) intel_guc_ct_init
>>>
>>> Use for CT communication so similar to CSB on engine status page in
>>> principle. But code also has to deal with WC when object is in lmem.
>>>
>>> c) intel_guc_ads_create
>>>
>>> CPU appears to only write on init and GPU reset.
>>>
>>> 5) intel_huc_rsa_data_create; always_coherent=true
>>>
>>> Called from intel_huc_init so it appears write once from CPU. Not sure
>>> why it would need a coherent mapping if that is correct.
>>>
>>> I think this exercise left me equally confused. Because flushing and
>>> read-write ordering rules are different between WB and WC. And code
>>> which accesses all these mappings either has to know which one is in
>>> use, or does not care. For the latter case we have to be sure of that for
>>> every path.
>>
>> Users of pin_map() are generally meant to call flush_map() where
>> appropriate, which should do the right thing for us. For WC it only
>> needs to flush the wcb. For WB it's more complicated since that
>> depends on if the object is considered coherent or not, if it is then
>> we don't need to do anything, otherwise we need to clflush.
>>
>> Also note that if we just map the buffer as WB, that by itself
>> doesn't magically enable snooping for the pages AFAIK. We still have
>> to tell the GPU that these pages are meant to be coherent, which we
>> always do for LLC platforms I think, since the shared LLC is
>> considered fast, whereas on snooping platforms, we don't enable this
>> by default, and have this as CACHE_NONE instead (see shmem_object_init
>> for example), and incur the cost of additional clflushing.  Doing an
>> explicit i915_gem_object_set_coherency(I915_CACHE_LLC) I think will
>> mark the object as coherent for us. I think there are also some
>> matching GTT bits for caching.
>>
>> Also for DG1 you apparently can't disable snooping, as per what Daniel
>> was saying in another thread.
>>
>>>
>>> The write on init / reset ones are easy enough and it doesn't really
>>> matter for them to use the coherent helper.
>>>
>>> Lrc_reg_state as well I think can be WC with explicit flushing - it has
>>> to on lmem, no?
>>
>> I doubt it has to be, since the GPU still just accesses it through the GTT.
>>
>>>
>>> This leaves the status page (CSB, etc) and GuC CT. Those are frequent
>>> R/W but also code has to be able to handle WC so what is the benefit of
>>> WB? It ends up faster than if it was WC, considering explicit
>>> flushes/barriers are still in there?
>>
>> No idea for GuC, but for the hwsp it's still in system memory, and is
>> WB, even for discrete. Chris measured this to be more performant with
>> our execlists submission path than say just sticking it in lmem, and
>> mapping it as WC.
> 
> Ping? How should we proceed with this patch?

I just refreshed my memory on when the write-combining buffer gets
flushed and realized that uncached reads are also an implicit flush. So
the complications from my earlier reply were purely mine, and I think you
can proceed with the patch as is.
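
For reference, the selection logic of the reviewed helper can be modelled
outside the kernel. The sketch below mirrors the three branches from the
diff; `struct fake_device`, `struct fake_obj` and `coherent_map_type()` are
hypothetical stand-ins, not the driver's own types:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver types; not the real i915 definitions. */
enum map_type { MAP_WB, MAP_WC };

struct fake_device {
	bool has_llc; /* stands in for HAS_LLC(i915) */
};

struct fake_obj {
	bool is_lmem; /* stands in for i915_gem_object_is_lmem(obj) */
};

/*
 * Model of the helper from the patch:
 *  - lmem objects are always mapped WC, whatever the flag says;
 *  - otherwise WB if the device has a shared LLC or the caller insists;
 *  - otherwise WC.
 */
static inline enum map_type
coherent_map_type(const struct fake_device *dev,
		  const struct fake_obj *obj, bool always_coherent)
{
	assert(dev != NULL && obj != NULL);

	if (obj->is_lmem)
		return MAP_WC;
	if (dev->has_llc || always_coherent)
		return MAP_WB;
	return MAP_WC;
}
```

So an lmem object yields WC even with always_coherent set, while a
system-memory object on a non-LLC device yields WB only when the caller
passes always_coherent=true.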

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [Intel-gfx] [PATCH 14/19] drm/i915/oprom: Basic sanitization
  2021-04-12  9:05 ` [PATCH 14/19] drm/i915/oprom: Basic sanitization Matthew Auld
  2021-04-12 22:36   ` [Intel-gfx] " kernel test robot
  2021-04-12 22:36   ` [PATCH] drm/i915/oprom: fix memdup.cocci warnings kernel test robot
@ 2021-05-17 11:57   ` Jani Nikula
  2021-09-18  4:30     ` Lucas De Marchi
  2 siblings, 1 reply; 65+ messages in thread
From: Jani Nikula @ 2021-05-17 11:57 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx; +Cc: Anshuman Gupta, dri-devel

On Mon, 12 Apr 2021, Matthew Auld <matthew.auld@intel.com> wrote:
> From: Anshuman Gupta <anshuman.gupta@intel.com>
>
> Sanitize OPROM header, CPD signature and OPROM PCI version.
> OPROM_HEADER, EXPANSION_ROM_HEADER and OPROM_MEU_BLOB structures
> and PCI struct offsets are provided by GSC counterparts.
> These are yet to be Documented in B.Spec.
> After successful sanitization, extract VBT from opregion
> image.

So I don't understand what the point is with two consecutive patches
where the latter rewrites a lot of the former. 

BR,
Jani.


>
> v2:
> - Used macro for OPROM header magic 0xaa55 [Rodrigo]
> - Added a OPROM layout. [Uma]
> - Extract opregion from OPROM package and then extract
>   VBT from opregion to have backward compatibility with
>   older IFWI.
>
> v3:
> - Moved opreg stuff to intel_opregion.{c,h}. [Uma]
> - Memory leak and intel_oprom_verify_signature return
>   value fixes. [Uma]
>
> v4:
>  - Fix return code storage for oprom_image_parse_helper (Matt)
>
> v5 by Jani:
> - switch to intel_uncore_read/intel_uncore_write
>
> v6 by Khajapasha:
> - Rename intel_oprom_verify_signature() to
>   intel_spi_get_oprom_opreg() [Jani, Nikula]
> - Use u32 data type for opregion size [Jani, Nikula]
>
> Cc: Jani Nikula <jani.nikula@intel.com>
> Cc: Uma Shankar <uma.shankar@intel.com>
> Signed-off-by: Anshuman Gupta <anshuman.gupta@intel.com>
> Signed-off-by: Jani Nikula <jani.nikula@intel.com>
> Signed-off-by: Mohammed Khajapasha <mohammed.khajapasha@intel.com>
> ---
>  drivers/gpu/drm/i915/display/intel_bios.c     |  47 +++--
>  drivers/gpu/drm/i915/display/intel_opregion.c | 169 ++++++++++++++++++
>  drivers/gpu/drm/i915/display/intel_opregion.h |  38 +++-
>  3 files changed, 227 insertions(+), 27 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/display/intel_bios.c b/drivers/gpu/drm/i915/display/intel_bios.c
> index f9dc651f1652..59eec8333723 100644
> --- a/drivers/gpu/drm/i915/display/intel_bios.c
> +++ b/drivers/gpu/drm/i915/display/intel_bios.c
> @@ -2240,37 +2240,36 @@ bool intel_bios_is_valid_vbt(const void *buf, size_t size)
>  
>  static struct vbt_header *spi_oprom_get_vbt(struct drm_i915_private *i915)
>  {
> -	u32 count, data, found, store = 0;
> -	u32 static_region, oprom_offset;
> -	u32 oprom_size = 0x200000;
> +	u32 count, found, opreg_size;
> +	u32 *vbt, *oprom_opreg = NULL;
>  	u16 vbt_size;
> -	u32 *vbt;
> +	u8 *parse_ptr;
>  
> -	static_region = intel_uncore_read(&i915->uncore, SPI_STATIC_REGIONS);
> -	static_region &= OPTIONROM_SPI_REGIONID_MASK;
> -	intel_uncore_write(&i915->uncore, PRIMARY_SPI_REGIONID, static_region);
> -
> -	oprom_offset = intel_uncore_read(&i915->uncore, OROM_OFFSET);
> -	oprom_offset &= OROM_OFFSET_MASK;
> +	if (intel_spi_get_oprom_opreg(i915, &oprom_opreg, &opreg_size)) {
> +		drm_err(&i915->drm, "oprom signature verification failed\n");
> +		goto err_not_found;
> +	}
>  
> -	for (count = 0; count < oprom_size; count += 4) {
> -		intel_uncore_write(&i915->uncore, PRIMARY_SPI_ADDRESS, oprom_offset + count);
> -		data = intel_uncore_read(&i915->uncore, PRIMARY_SPI_TRIGGER);
> +	if (!oprom_opreg) {
> +		drm_err(&i915->drm, "opregion not found\n");
> +		goto err_not_found;
> +	}
>  
> -		if (data == *((const u32 *)"$VBT")) {
> -			found = oprom_offset + count;
> +	for (count = 0; count < opreg_size; count += 4) {
> +		if (oprom_opreg[count / 4] == *((const u32 *)"$VBT")) {
> +			found = count;
>  			break;
>  		}
>  	}
>  
> -	if (count >= oprom_size)
> +	if (count >= opreg_size) {
> +		drm_err(&i915->drm, "VBT not found in opregion\n");
>  		goto err_not_found;
> +	}
>  
>  	/* Get VBT size and allocate space for the VBT */
> -	intel_uncore_write(&i915->uncore, PRIMARY_SPI_ADDRESS, found +
> -		   offsetof(struct vbt_header, vbt_size));
> -	vbt_size = intel_uncore_read(&i915->uncore, PRIMARY_SPI_TRIGGER);
> -	vbt_size &= 0xffff;
> +	parse_ptr = (u8 *)oprom_opreg + found;
> +	vbt_size = ((struct vbt_header *)parse_ptr)->vbt_size;
>  
>  	vbt = kzalloc(vbt_size, GFP_KERNEL);
>  	if (!vbt) {
> @@ -2279,16 +2278,12 @@ static struct vbt_header *spi_oprom_get_vbt(struct drm_i915_private *i915)
>  		goto err_not_found;
>  	}
>  
> -	for (count = 0; count < vbt_size; count += 4) {
> -		intel_uncore_write(&i915->uncore, PRIMARY_SPI_ADDRESS, found + count);
> -		data = intel_uncore_read(&i915->uncore, PRIMARY_SPI_TRIGGER);
> -		*(vbt + store++) = data;
> -	}
> -
> +	memcpy(vbt, parse_ptr, vbt_size);
>  	if (!intel_bios_is_valid_vbt(vbt, vbt_size))
>  		goto err_free_vbt;
>  
>  	DRM_DEBUG_KMS("Found valid VBT in SPI flash\n");
> +	kfree(oprom_opreg);
>  
>  	return (struct vbt_header *)vbt;
>  
> diff --git a/drivers/gpu/drm/i915/display/intel_opregion.c b/drivers/gpu/drm/i915/display/intel_opregion.c
> index dfd724e506b5..e9ccd8265a1f 100644
> --- a/drivers/gpu/drm/i915/display/intel_opregion.c
> +++ b/drivers/gpu/drm/i915/display/intel_opregion.c
> @@ -983,6 +983,175 @@ int intel_opregion_setup(struct drm_i915_private *dev_priv)
>  	return err;
>  }
>  
> +static int oprom_image_parse_helper(u8 *parse_ptr, u8 *last_img, u8 *code_type,
> +				    struct drm_i915_private *i915)
> +{
> +	u8 size_512_bytes;
> +
> +	if (((union oprom_header *)parse_ptr)->signature != OPROM_IMAGE_MAGIC) {
> +		drm_err(&i915->drm, "Wrong OPROM header signature.\n");
> +		return -EINVAL;
> +	}
> +
> +	size_512_bytes = parse_ptr[((struct expansion_rom_header *)parse_ptr)->pcistructoffset + PCI_IMAGE_LENGTH_OFFSET];
> +	*code_type = parse_ptr[((struct expansion_rom_header *)parse_ptr)->pcistructoffset + PCI_CODE_TYPE_OFFSET];
> +	*last_img = parse_ptr[((struct expansion_rom_header *)parse_ptr)->pcistructoffset + PCI_LAST_IMAGE_INDICATOR_OFFSET];
> +
> +	return size_512_bytes;
> +}
> +
> +static void spi_read_oprom_helper(size_t len, u32 offset, u32 *buf,
> +				  struct drm_i915_private *dev_priv)
> +{
> +	u32 count, data;
> +
> +	for (count = 0; count < len; count += 4) {
> +		intel_uncore_write(&dev_priv->uncore, PRIMARY_SPI_ADDRESS, offset + count);
> +		data = intel_uncore_read(&dev_priv->uncore, PRIMARY_SPI_TRIGGER);
> +		buf[count / 4] = data;
> +	}
> +}
> +
> +/**
> + *	+        DASH+G OPROM IMAGE LAYOUT           +
> + *	+--------+-------+---------------------------+
> + *	| Offset | Value |   ROM Header Fields       +-----> Image 1 (CSS)
> + *	+--------------------------------------------+
> + *	|    0h  |  55h  |   ROM Signature Byte1     |
> + *	|    1h  |  AAh  |   ROM Signature Byte2     |
> + *	|    2h  |  xx   |        Reserved           |
> + *	|  18+19h|  xx   |  Ptr to PCI DataStructure |
> + *	+----------------+---------------------------+
> + *	|           PCI Data Structure               |
> + *	+--------------------------------------------+
> + *	|    .       .             .                 |
> + *	|    .       .             .                 |
> + *	|    10  +  xx   +     Image Length          |
> + *	|    14  +  xx   +     Code Type             |
> + *	|    15  +  xx   +  Last Image Indicator     |
> + *	|    .       .             .                 |
> + *	+--------------------------------------------+
> + *	|               MEU BLOB                     |
> + *	+--------------------------------------------+
> + *	|              CPD Header                    |
> + *	|              CPD Entry                     |
> + *	|              Reserved                      |
> + *	|           SignedDataPart1                  |
> + *	|              PublicKey                     |
> + *	|            RSA Signature                   |
> + *	|           SignedDataPart2                  |
> + *	|            IFWI Metadata                   |
> + *	+--------+-------+---------------------------+
> + *	|    .   |   .   |         .                 |
> + *	|    .   |   .   |         .                 |
> + *	+--------------------------------------------+
> + *	| Offset | Value |   ROM Header Fields       +-----> Image 2 (Config Data) (Offset: 0x800)
> + *	+--------------------------------------------+
> + *	|    0h  |  55h  |   ROM Signature Byte1     |
> + *	|    1h  |  AAh  |   ROM Signature Byte2     |
> + *	|    2h  |  xx   |        Reserved           |
> + *	|  18+19h|  xx   |  Ptr to PCI DataStructure |
> + *	+----------------+---------------------------+
> + *	|           PCI Data Structure               |
> + *	+--------------------------------------------+
> + *	|    .       .             .                 |
> + *	|    .       .             .                 |
> + *	|    10  +  xx   +     Image Length          |
> + *	|    14  +  xx   +      Code Type            |
> + *	|    15  +  xx   +   Last Image Indicator    |
> + *	|    .       .             .                 |
> + *	|    1A  +  3C   + Ptr to Opregion Signature |
> + *	|    .       .             .                 |
> + *	|    .       .             .                 |
> + *	|   83Ch + IntelGraphicsMem                  | <---+ Opregion Signature
> + *	+--------+-----------------------------------+
> + *
> + * intel_spi_get_oprom_opreg() get OPROM image.
> + * @i915: pointer to i915 device.
> + * @opreg: pointer to opregion buffer output.
> + * @opreg_size: pointer to opregion size output.
> + */
> +int
> +intel_spi_get_oprom_opreg(struct drm_i915_private *i915, u32 **opreg,
> +			  u32 *opreg_size)
> +{
> +	u8 img_sig[sizeof(OPREGION_SIGNATURE)];
> +	u8 code_type, last_img;
> +	u32 static_region, offset, img_len;
> +	u32 *oprom_img, *oprom_img_hdr;
> +	u16 opreg_base;
> +	u8 *parse_ptr;
> +	int img_size;
> +	int ret = -EINVAL;
> +
> +	/* initialize SPI to read the OPROM */
> +	static_region = intel_uncore_read(&i915->uncore, SPI_STATIC_REGIONS);
> +	static_region &= OPTIONROM_SPI_REGIONID_MASK;
> +	intel_uncore_write(&i915->uncore, PRIMARY_SPI_REGIONID, static_region);
> +	/* read OPROM offset in SPI flash */
> +	offset = intel_uncore_read(&i915->uncore, OROM_OFFSET);
> +	offset &= OROM_OFFSET_MASK;
> +
> +	oprom_img_hdr = kzalloc(OPROM_INITIAL_READ_SIZE, GFP_KERNEL);
> +	if (!oprom_img_hdr)
> +		return -ENOMEM;
> +
> +	do {
> +		spi_read_oprom_helper(OPROM_INITIAL_READ_SIZE, offset,
> +				      oprom_img_hdr, i915);
> +		img_size = oprom_image_parse_helper((u8 *)oprom_img_hdr, &last_img,
> +						    &code_type, i915);
> +		if (img_size <= 0) {
> +			ret = -EINVAL;
> +			goto err_free_hdr;
> +		}
> +
> +		img_len = img_size * OPROM_BYTE_BOUNDARY;
> +		oprom_img = kzalloc(img_len, GFP_KERNEL);
> +		if (!oprom_img) {
> +			ret = -ENOMEM;
> +			goto err_free_hdr;
> +		}
> +
> +		spi_read_oprom_helper(img_len, offset, oprom_img, i915);
> +		parse_ptr = (u8 *)oprom_img;
> +		offset = offset + img_len;
> +
> +		/* opregion base offset */
> +		opreg_base = ((struct expansion_rom_header *)parse_ptr)->opregion_base;
> +		/* CPD or opreg signature is present at opregion_base offset */
> +		memcpy(img_sig, parse_ptr + opreg_base, sizeof(OPREGION_SIGNATURE));
> +
> +		if (!memcmp(img_sig, OPREGION_SIGNATURE, sizeof(OPREGION_SIGNATURE) - 1)) {
> +			*opreg = oprom_img;
> +			*opreg_size = img_len;
> +			drm_dbg_kms(&i915->drm, "Found opregion image\n");
> +			ret = 0;
> +			break;
> +		} else if (!memcmp(img_sig, CPD_SIGNATURE, NUM_CPD_BYTES)) {
> +			if (code_type != OPROM_CSS_CODE_TYPE) {
> +				drm_err(&i915->drm, "Invalid OPROM\n");
> +				ret = -EINVAL;
> +				goto err_free_img;
> +			}
> +			drm_dbg_kms(&i915->drm, "Found CSS image\n");
> +			/* proceed here onwards for signature authentication */
> +			kfree(oprom_img);
> +			continue;
> +		}
> +
> +	} while (last_img != LAST_IMG_INDICATOR);
> +
> +	return ret;
> +
> +err_free_img:
> +	kfree(oprom_img);
> +err_free_hdr:
> +	kfree(oprom_img_hdr);
> +
> +	return ret;
> +}
> +
>  static int intel_use_opregion_panel_type_callback(const struct dmi_system_id *id)
>  {
>  	DRM_INFO("Using panel type from OpRegion on %s\n", id->ident);
> diff --git a/drivers/gpu/drm/i915/display/intel_opregion.h b/drivers/gpu/drm/i915/display/intel_opregion.h
> index 4aa68ffbd30e..de53dde10dd9 100644
> --- a/drivers/gpu/drm/i915/display/intel_opregion.h
> +++ b/drivers/gpu/drm/i915/display/intel_opregion.h
> @@ -54,6 +54,34 @@ struct intel_opregion {
>  
>  #define OPREGION_SIZE            (8 * 1024)
>  
> +#define CPD_SIGNATURE "$CPD"                  /* CPD Signature */
> +#define NUM_CPD_BYTES 4
> +#define PCI_IMAGE_LENGTH_OFFSET 0x10
> +#define PCI_CODE_TYPE_OFFSET 0x14
> +#define PCI_LAST_IMAGE_INDICATOR_OFFSET 0x15
> +#define LAST_IMG_INDICATOR 0x80
> +#define OPROM_IMAGE_MAGIC 0xAA55       /* Little Endian */
> +#define OPROM_CSS_CODE_TYPE 0xF0
> +#define OPROM_BYTE_BOUNDARY 512        /* OPROM image sizes are indicated in 512 byte boundaries */
> +#define OPROM_INITIAL_READ_SIZE 60     /* Read 60 bytes to compute the Img Len from PCI structure */
> +
> +union oprom_header {
> +	u32 data;
> +	struct {
> +		u16 signature;  /* Offset[0x0]: Header 0x55 0xAA */
> +		u8 sizein512bytes;
> +		u8 reserved;
> +	};
> +};
> +
> +struct expansion_rom_header {
> +	union oprom_header header;      /* Offset[0x0]: Oprom Header */
> +	u16 vbiospostoffset;    /* Offset[0x4]: pointer to VBIOS entry point */
> +	u8 resvd[0x12];
> +	u16 pcistructoffset;    /* Offset[0x18]: Contains pointer PCI Data Structure */
> +	u16 opregion_base;      /* Offset[0x1A]: Offset to Opregion Base start */
> +};
> +
>  #ifdef CONFIG_ACPI
>  
>  int intel_opregion_setup(struct drm_i915_private *dev_priv);
> @@ -72,6 +100,9 @@ int intel_opregion_notify_adapter(struct drm_i915_private *dev_priv,
>  				  pci_power_t state);
>  int intel_opregion_get_panel_type(struct drm_i915_private *dev_priv);
>  
> +int intel_spi_get_oprom_opreg(struct drm_i915_private *i915, u32 **opreg,
> +			      u32 *opreg_size);
> +
>  #else /* CONFIG_ACPI*/
>  
>  static inline int intel_opregion_setup(struct drm_i915_private *dev_priv)
> @@ -117,6 +148,11 @@ static inline int intel_opregion_get_panel_type(struct drm_i915_private *dev)
>  	return -ENODEV;
>  }
>  
> -#endif /* CONFIG_ACPI */
> +static int intel_spi_get_oprom_opreg(struct drm_i915_private *i915, u32 **opreg,
> +				     u32 *opreg_size)
> +{
> +	return 0;
> +}
>  
> +#endif /* CONFIG_ACPI */
>  #endif

-- 
Jani Nikula, Intel Open Source Graphics Center
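
For reference, the read helper in the quoted patch follows an indirect
access pattern: write a flash offset to an address register, then read
one dword back from a data register. Below is a minimal userspace sketch
of that pattern, assuming a fake device in place of the real MMIO
registers. The names fake_spi, spi_set_address, spi_read_data and
spi_read are hypothetical and are not part of the patch. Unlike the
quoted helper, this sketch also copes with a length that is not a
multiple of four.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the device: flash contents plus latched address. */
struct fake_spi {
	const uint8_t *flash;
	size_t flash_len;
	uint32_t addr;
};

/* Models the write to the address register. */
static void spi_set_address(struct fake_spi *spi, uint32_t addr)
{
	spi->addr = addr;
}

/*
 * Models the read of the data register: returns the little-endian dword
 * at the latched address, or all ones if fewer than 4 bytes remain.
 */
static uint32_t spi_read_data(const struct fake_spi *spi)
{
	const uint8_t *p;

	if (spi->addr > spi->flash_len || spi->flash_len - spi->addr < 4)
		return 0xffffffffu;

	p = spi->flash + spi->addr;
	return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
	       (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/*
 * Copy len bytes starting at flash offset into buf, one dword per
 * address/data round trip. A trailing partial dword is truncated so the
 * destination is never written past len bytes.
 */
static void spi_read(struct fake_spi *spi, uint32_t offset, void *buf, size_t len)
{
	uint8_t *dst = buf;
	size_t done;

	assert(spi && buf);

	for (done = 0; done < len; done += 4) {
		size_t chunk = len - done < 4 ? len - done : 4;
		uint8_t bytes[4];
		uint32_t data;

		spi_set_address(spi, offset + (uint32_t)done);
		data = spi_read_data(spi);

		bytes[0] = data & 0xff;
		bytes[1] = (data >> 8) & 0xff;
		bytes[2] = (data >> 16) & 0xff;
		bytes[3] = (data >> 24) & 0xff;
		memcpy(dst + done, bytes, chunk);
	}
}
```

Keeping the device behind two small accessors makes the copy loop easy
to exercise without hardware; in the driver those accessors would
correspond to the intel_uncore_write()/intel_uncore_read() pair shown in
the patch.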

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [Intel-gfx] [PATCH 13/19] drm/i915/dg1: Read OPROM via SPI controller
  2021-04-12  9:05 ` [PATCH 13/19] drm/i915/dg1: Read OPROM via SPI controller Matthew Auld
@ 2021-09-17 23:29   ` Lucas De Marchi
  0 siblings, 0 replies; 65+ messages in thread
From: Lucas De Marchi @ 2021-09-17 23:29 UTC (permalink / raw)
  To: Matthew Auld; +Cc: intel-gfx, Jani Nikula, dri-devel, Tomas Winkler

On Mon, Apr 12, 2021 at 10:05:20AM +0100, Matthew Auld wrote:
>From: Clint Taylor <clinton.a.taylor@intel.com>
>
>Read OPROM SPI through MMIO and find VBT entry since we can't use
>OpRegion and PCI mapping may not work on some systems due to the BIOS
>not leaving the Option ROM mapped.

I was surprised to see we still don't have this patch applied. There are
some coding style issues to fix, but without it we are basically
relying on the fallback of using a fake/hardcoded VBT. I will do some
fixups and re-submit.

Lucas De Marchi

>
>v2 by Jani:
>- switch to intel_uncore_read/intel_uncore_write
>
>Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
>Cc: Tomas Winkler <tomas.winkler@intel.com>
>Cc: Jon Bloomfield <jon.bloomfield@intel.com>
>Signed-off-by: Clint Taylor <clinton.a.taylor@intel.com>
>Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
>Signed-off-by: Jani Nikula <jani.nikula@intel.com>
>---
> drivers/gpu/drm/i915/display/intel_bios.c | 80 +++++++++++++++++++++--
> drivers/gpu/drm/i915/i915_reg.h           |  8 +++
> 2 files changed, 82 insertions(+), 6 deletions(-)
>
>diff --git a/drivers/gpu/drm/i915/display/intel_bios.c b/drivers/gpu/drm/i915/display/intel_bios.c
>index ea4837d485a1..f9dc651f1652 100644
>--- a/drivers/gpu/drm/i915/display/intel_bios.c
>+++ b/drivers/gpu/drm/i915/display/intel_bios.c
>@@ -2238,6 +2238,66 @@ bool intel_bios_is_valid_vbt(const void *buf, size_t size)
> 	return vbt;
> }
>
>+static struct vbt_header *spi_oprom_get_vbt(struct drm_i915_private *i915)
>+{
>+	u32 count, data, found, store = 0;
>+	u32 static_region, oprom_offset;
>+	u32 oprom_size = 0x200000;
>+	u16 vbt_size;
>+	u32 *vbt;
>+
>+	static_region = intel_uncore_read(&i915->uncore, SPI_STATIC_REGIONS);
>+	static_region &= OPTIONROM_SPI_REGIONID_MASK;
>+	intel_uncore_write(&i915->uncore, PRIMARY_SPI_REGIONID, static_region);
>+
>+	oprom_offset = intel_uncore_read(&i915->uncore, OROM_OFFSET);
>+	oprom_offset &= OROM_OFFSET_MASK;
>+
>+	for (count = 0; count < oprom_size; count += 4) {
>+		intel_uncore_write(&i915->uncore, PRIMARY_SPI_ADDRESS, oprom_offset + count);
>+		data = intel_uncore_read(&i915->uncore, PRIMARY_SPI_TRIGGER);
>+
>+		if (data == *((const u32 *)"$VBT")) {
>+			found = oprom_offset + count;
>+			break;
>+		}
>+	}
>+
>+	if (count >= oprom_size)
>+		goto err_not_found;
>+
>+	/* Get VBT size and allocate space for the VBT */
>+	intel_uncore_write(&i915->uncore, PRIMARY_SPI_ADDRESS, found +
>+		   offsetof(struct vbt_header, vbt_size));
>+	vbt_size = intel_uncore_read(&i915->uncore, PRIMARY_SPI_TRIGGER);
>+	vbt_size &= 0xffff;
>+
>+	vbt = kzalloc(vbt_size, GFP_KERNEL);
>+	if (!vbt) {
>+		DRM_ERROR("Unable to allocate %u bytes for VBT storage\n",
>+			  vbt_size);
>+		goto err_not_found;
>+	}
>+
>+	for (count = 0; count < vbt_size; count += 4) {
>+		intel_uncore_write(&i915->uncore, PRIMARY_SPI_ADDRESS, found + count);
>+		data = intel_uncore_read(&i915->uncore, PRIMARY_SPI_TRIGGER);
>+		*(vbt + store++) = data;
>+	}
>+
>+	if (!intel_bios_is_valid_vbt(vbt, vbt_size))
>+		goto err_free_vbt;
>+
>+	DRM_DEBUG_KMS("Found valid VBT in SPI flash\n");
>+
>+	return (struct vbt_header *)vbt;
>+
>+err_free_vbt:
>+	kfree(vbt);
>+err_not_found:
>+	return NULL;
>+}
>+
> static struct vbt_header *oprom_get_vbt(struct drm_i915_private *i915)
> {
> 	struct pci_dev *pdev = to_pci_dev(i915->drm.dev);
>@@ -2287,6 +2347,8 @@ static struct vbt_header *oprom_get_vbt(struct drm_i915_private *i915)
>
> 	pci_unmap_rom(pdev, oprom);
>
>+	DRM_DEBUG_KMS("Found valid VBT in PCI ROM\n");
>+
> 	return vbt;
>
> err_free_vbt:
>@@ -2321,17 +2383,23 @@ void intel_bios_init(struct drm_i915_private *i915)
>
> 	init_vbt_defaults(i915);
>
>-	/* If the OpRegion does not have VBT, look in PCI ROM. */
>+	/*
>+	 * If the OpRegion does not have VBT, look in SPI flash through MMIO or
>+	 * PCI mapping
>+	 */
>+	if (!vbt && IS_DGFX(i915)) {
>+		oprom_vbt = spi_oprom_get_vbt(i915);
>+		vbt = oprom_vbt;
>+	}
>+
> 	if (!vbt) {
> 		oprom_vbt = oprom_get_vbt(i915);
>-		if (!oprom_vbt)
>-			goto out;
>-
> 		vbt = oprom_vbt;
>-
>-		drm_dbg_kms(&i915->drm, "Found valid VBT in PCI ROM\n");
> 	}
>
>+	if (!vbt)
>+		goto out;
>+
> 	bdb = get_bdb_header(vbt);
> 	i915->vbt.version = bdb->version;
>
>diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
>index da73dc939e58..54ff63b86df6 100644
>--- a/drivers/gpu/drm/i915/i915_reg.h
>+++ b/drivers/gpu/drm/i915/i915_reg.h
>@@ -12540,6 +12540,14 @@ enum skl_power_gate {
> #define   DP_PIN_ASSIGNMENT_MASK(idx)		(0xf << ((idx) * 4))
> #define   DP_PIN_ASSIGNMENT(idx, x)		((x) << ((idx) * 4))
>
>+#define PRIMARY_SPI_TRIGGER			_MMIO(0x102040)
>+#define PRIMARY_SPI_ADDRESS			_MMIO(0x102080)
>+#define PRIMARY_SPI_REGIONID			_MMIO(0x102084)
>+#define SPI_STATIC_REGIONS			_MMIO(0x102090)
>+#define   OPTIONROM_SPI_REGIONID_MASK		REG_GENMASK(7, 0)
>+#define OROM_OFFSET				_MMIO(0x1020c0)
>+#define   OROM_OFFSET_MASK			REG_GENMASK(20, 16)
>+
> /* This register controls the Display State Buffer (DSB) engines. */
> #define _DSBSL_INSTANCE_BASE		0x70B00
> #define DSBSL_INSTANCE(pipe, id)	(_DSBSL_INSTANCE_BASE + \
>-- 
>2.26.3
>
>_______________________________________________
>Intel-gfx mailing list
>Intel-gfx@lists.freedesktop.org
>https://lists.freedesktop.org/mailman/listinfo/intel-gfx
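
For reference, the patch quoted above locates the VBT by scanning the
option ROM in 4-byte steps for the "$VBT" signature and then reading the
vbt_size field. A minimal userspace sketch of that search over an
in-memory buffer follows. The names find_vbt, read_le16 and
VBT_SIZE_OFFSET are hypothetical. The offset of 24 is an assumption
based on the i915 struct vbt_header layout (20-byte signature, then
16-bit version, header_size and vbt_size), which is not shown in this
thread. The sketch compares the signature with memcmp() rather than a
pointer cast and rejects a size that would run past the buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define VBT_SIG		"$VBT"
#define VBT_SIG_LEN	4
#define VBT_SIZE_OFFSET	24	/* assumed offset of vbt_size in the header */

/* Read a little-endian u16 without alignment or aliasing assumptions. */
static uint16_t read_le16(const uint8_t *p)
{
	return (uint16_t)(p[0] | (p[1] << 8));
}

/*
 * Scan buf in 4-byte steps for "$VBT". On success store the offset of
 * the signature and the claimed VBT size and return 0. Return -1 if the
 * signature is absent or the claimed size does not fit in the buffer.
 */
static int find_vbt(const uint8_t *buf, size_t len, size_t *off, uint16_t *size)
{
	size_t i;

	assert(buf && off && size);

	/* Only consider positions where the size field is fully readable. */
	for (i = 0; i + VBT_SIZE_OFFSET + 2 <= len; i += 4) {
		uint16_t vbt_size;

		if (memcmp(buf + i, VBT_SIG, VBT_SIG_LEN))
			continue;

		vbt_size = read_le16(buf + i + VBT_SIZE_OFFSET);
		if (vbt_size < VBT_SIZE_OFFSET + 2 || vbt_size > len - i)
			return -1;

		*off = i;
		*size = vbt_size;
		return 0;
	}

	return -1;
}
```

Because the function only writes its outputs on success, a caller cannot
end up using an offset that was never set, and the size check keeps a
later copy of the VBT inside the buffer that was actually read.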

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [Intel-gfx] [PATCH 14/19] drm/i915/oprom: Basic sanitization
  2021-05-17 11:57   ` [Intel-gfx] [PATCH 14/19] drm/i915/oprom: Basic sanitization Jani Nikula
@ 2021-09-18  4:30     ` Lucas De Marchi
  2021-09-20  7:41       ` Jani Nikula
  0 siblings, 1 reply; 65+ messages in thread
From: Lucas De Marchi @ 2021-09-18  4:30 UTC (permalink / raw)
  To: Jani Nikula; +Cc: Matthew Auld, intel-gfx, dri-devel

On Mon, May 17, 2021 at 02:57:33PM +0300, Jani Nikula wrote:
>On Mon, 12 Apr 2021, Matthew Auld <matthew.auld@intel.com> wrote:
>> From: Anshuman Gupta <anshuman.gupta@intel.com>
>>
>> Sanitize OPROM header, CPD signature and OPROM PCI version.
>> OPROM_HEADER, EXPANSION_ROM_HEADER and OPROM_MEU_BLOB structures
>> and PCI struct offsets are provided by GSC counterparts.
>> These are yet to be Documented in B.Spec.
>> After successful sanitization, extract VBT from opregion
>> image.
>
>So I don't understand what the point is with two consecutive patches
>where the latter rewrites a lot of the former.

I actually wonder what the point of this is. Getting it from SPI is
already the fallback, and this looks much more complex. Yes, it's pretty
detailed and documents the format pretty well, but it still looks more
complex than the initial code. Do you see any additional benefit in this
one?

Lucas De Marchi

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [Intel-gfx] [PATCH 14/19] drm/i915/oprom: Basic sanitization
  2021-09-18  4:30     ` Lucas De Marchi
@ 2021-09-20  7:41       ` Jani Nikula
  2021-09-20  8:04         ` Gupta, Anshuman
  0 siblings, 1 reply; 65+ messages in thread
From: Jani Nikula @ 2021-09-20  7:41 UTC (permalink / raw)
  To: Lucas De Marchi; +Cc: Matthew Auld, intel-gfx, dri-devel, Anshuman Gupta

On Fri, 17 Sep 2021, Lucas De Marchi <lucas.demarchi@intel.com> wrote:
> On Mon, May 17, 2021 at 02:57:33PM +0300, Jani Nikula wrote:
>>On Mon, 12 Apr 2021, Matthew Auld <matthew.auld@intel.com> wrote:
>>> From: Anshuman Gupta <anshuman.gupta@intel.com>
>>>
>>> Sanitize OPROM header, CPD signature and OPROM PCI version.
>>> OPROM_HEADER, EXPANSION_ROM_HEADER and OPROM_MEU_BLOB structures
>>> and PCI struct offsets are provided by GSC counterparts.
>>> These are yet to be Documented in B.Spec.
>>> After successful sanitization, extract VBT from opregion
>>> image.
>>
>>So I don't understand what the point is with two consecutive patches
>>where the latter rewrites a lot of the former.
>
> I actually wonder what's the point of this. Getting it from spi is
> already the fallback and looks much more complex. Yes, it's pretty
> detailed and document the format pretty well, but it still looks more
> complex than the initial code. Do you see additional benefit in this
> one?

The commit message doesn't really explain much. Anshuman?

BR,
Jani.


-- 
Jani Nikula, Intel Open Source Graphics Center

^ permalink raw reply	[flat|nested] 65+ messages in thread

* RE: [Intel-gfx] [PATCH 14/19] drm/i915/oprom: Basic sanitization
  2021-09-20  7:41       ` Jani Nikula
@ 2021-09-20  8:04         ` Gupta, Anshuman
  2021-09-20  8:43           ` Jani Nikula
  2021-09-22 21:53           ` Lucas De Marchi
  0 siblings, 2 replies; 65+ messages in thread
From: Gupta, Anshuman @ 2021-09-20  8:04 UTC (permalink / raw)
  To: Nikula, Jani, De Marchi, Lucas; +Cc: Auld, Matthew, intel-gfx, dri-devel



> -----Original Message-----
> From: Nikula, Jani <jani.nikula@intel.com>
> Sent: Monday, September 20, 2021 1:12 PM
> To: De Marchi, Lucas <lucas.demarchi@intel.com>
> Cc: Auld, Matthew <matthew.auld@intel.com>; intel-gfx@lists.freedesktop.org;
> dri-devel@lists.freedesktop.org; Gupta, Anshuman
> <anshuman.gupta@intel.com>
> Subject: Re: [Intel-gfx] [PATCH 14/19] drm/i915/oprom: Basic sanitization
> 
> On Fri, 17 Sep 2021, Lucas De Marchi <lucas.demarchi@intel.com> wrote:
> > On Mon, May 17, 2021 at 02:57:33PM +0300, Jani Nikula wrote:
> >>On Mon, 12 Apr 2021, Matthew Auld <matthew.auld@intel.com> wrote:
> >>> From: Anshuman Gupta <anshuman.gupta@intel.com>
> >>>
> >>> Sanitize OPROM header, CPD signature and OPROM PCI version.
> >>> OPROM_HEADER, EXPANSION_ROM_HEADER and OPROM_MEU_BLOB
> structures and
> >>> PCI struct offsets are provided by GSC counterparts.
> >>> These are yet to be Documented in B.Spec.
> >>> After successful sanitization, extract VBT from opregion image.
> >>
> >>So I don't understand what the point is with two consecutive patches
> >>where the latter rewrites a lot of the former.
> >
> > I actually wonder what's the point of this. Getting it from spi is
> > already the fallback and looks much more complex. Yes, it's pretty
> > detailed and document the format pretty well, but it still looks more
> > complex than the initial code. Do you see additional benefit in this
> > one?
Getting the opregion image from SPI is needed to get the intel_opregion and its mailboxes on a discrete card.
> 
> The commit message doesn't really explain much. Anshuman?
I will rework the patches and float them again.
Thanks,
Anshuman Gupta.
> 
> BR,
> Jani.
> 
> 
> --
> Jani Nikula, Intel Open Source Graphics Center

^ permalink raw reply	[flat|nested] 65+ messages in thread

* RE: [Intel-gfx] [PATCH 14/19] drm/i915/oprom: Basic sanitization
  2021-09-20  8:04         ` Gupta, Anshuman
@ 2021-09-20  8:43           ` Jani Nikula
  2021-09-22 21:53           ` Lucas De Marchi
  1 sibling, 0 replies; 65+ messages in thread
From: Jani Nikula @ 2021-09-20  8:43 UTC (permalink / raw)
  To: Gupta, Anshuman, De Marchi, Lucas; +Cc: Auld, Matthew, intel-gfx, dri-devel

On Mon, 20 Sep 2021, "Gupta, Anshuman" <anshuman.gupta@intel.com> wrote:
>> -----Original Message-----
>> From: Nikula, Jani <jani.nikula@intel.com>
>> Sent: Monday, September 20, 2021 1:12 PM
>> To: De Marchi, Lucas <lucas.demarchi@intel.com>
>> Cc: Auld, Matthew <matthew.auld@intel.com>; intel-gfx@lists.freedesktop.org;
>> dri-devel@lists.freedesktop.org; Gupta, Anshuman
>> <anshuman.gupta@intel.com>
>> Subject: Re: [Intel-gfx] [PATCH 14/19] drm/i915/oprom: Basic sanitization
>> 
>> On Fri, 17 Sep 2021, Lucas De Marchi <lucas.demarchi@intel.com> wrote:
>> > On Mon, May 17, 2021 at 02:57:33PM +0300, Jani Nikula wrote:
>> >>On Mon, 12 Apr 2021, Matthew Auld <matthew.auld@intel.com> wrote:
>> >>> From: Anshuman Gupta <anshuman.gupta@intel.com>
>> >>>
>> >>> Sanitize OPROM header, CPD signature and OPROM PCI version.
>> >>> OPROM_HEADER, EXPANSION_ROM_HEADER and OPROM_MEU_BLOB
>> structures and
>> >>> PCI struct offsets are provided by GSC counterparts.
>> >>> These are yet to be Documented in B.Spec.
>> >>> After successful sanitization, extract VBT from opregion image.
>> >>
>> >>So I don't understand what the point is with two consecutive patches
>> >>where the latter rewrites a lot of the former.
>> >
>> > I actually wonder what's the point of this. Getting it from spi is
>> > already the fallback and looks much more complex. Yes, it's pretty
>> > detailed and document the format pretty well, but it still looks more
>> > complex than the initial code. Do you see additional benefit in this
>> > one?
> Getting opregion image from spi is needed to get the intel_opregion and its mailboxes on discrete card.

I mean what's the point of the "drm/i915/oprom: Basic sanitization"
patch? And if that's needed, then why is it separate from "drm/i915/dg1:
Read OPROM via SPI controller"?

>> The commit message doesn't really explain much. Anshuman?
> I will get rework of the patches and float it again.

Lucas already sent something, please sync with him.

BR,
Jani.


> Thanks,
> Anshuman Gupta.
>> 
>> BR,
>> Jani.
>> 
>> 
>> --
>> Jani Nikula, Intel Open Source Graphics Center

-- 
Jani Nikula, Intel Open Source Graphics Center

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [Intel-gfx] [PATCH 14/19] drm/i915/oprom: Basic sanitization
  2021-09-20  8:04         ` Gupta, Anshuman
  2021-09-20  8:43           ` Jani Nikula
@ 2021-09-22 21:53           ` Lucas De Marchi
  1 sibling, 0 replies; 65+ messages in thread
From: Lucas De Marchi @ 2021-09-22 21:53 UTC (permalink / raw)
  To: Gupta, Anshuman; +Cc: Nikula, Jani, Auld, Matthew, intel-gfx, dri-devel

On Mon, Sep 20, 2021 at 08:04:32AM +0000, Gupta, Anshuman wrote:
>
>
>> -----Original Message-----
>> From: Nikula, Jani <jani.nikula@intel.com>
>> Sent: Monday, September 20, 2021 1:12 PM
>> To: De Marchi, Lucas <lucas.demarchi@intel.com>
>> Cc: Auld, Matthew <matthew.auld@intel.com>; intel-gfx@lists.freedesktop.org;
>> dri-devel@lists.freedesktop.org; Gupta, Anshuman
>> <anshuman.gupta@intel.com>
>> Subject: Re: [Intel-gfx] [PATCH 14/19] drm/i915/oprom: Basic sanitization
>>
>> On Fri, 17 Sep 2021, Lucas De Marchi <lucas.demarchi@intel.com> wrote:
>> > On Mon, May 17, 2021 at 02:57:33PM +0300, Jani Nikula wrote:
>> >>On Mon, 12 Apr 2021, Matthew Auld <matthew.auld@intel.com> wrote:
>> >>> From: Anshuman Gupta <anshuman.gupta@intel.com>
>> >>>
>> >>> Sanitize OPROM header, CPD signature and OPROM PCI version.
>> >>> OPROM_HEADER, EXPANSION_ROM_HEADER and OPROM_MEU_BLOB
>> structures and
>> >>> PCI struct offsets are provided by GSC counterparts.
>> >>> These are yet to be Documented in B.Spec.
>> >>> After successful sanitization, extract VBT from opregion image.
>> >>
>> >>So I don't understand what the point is with two consecutive patches
>> >>where the latter rewrites a lot of the former.
>> >
>> > I actually wonder what's the point of this. Getting it from spi is
>> > already the fallback and looks much more complex. Yes, it's pretty
>> > detailed and document the format pretty well, but it still looks more
>> > complex than the initial code. Do you see additional benefit in this
>> > one?
>Getting opregion image from spi is needed to get the intel_opregion and its mailboxes on discrete card.
>>
>> The commit message doesn't really explain much. Anshuman?
>I will get rework of the patches and float it again.


From this patch, the only thing I see it doing is getting the VBT from
inside the opregion... it moves the read part to helper functions and
apparently supports multiple images...?

The question here is not why we are reading from SPI, but rather what
this does that the previous commit didn't already do.

Lucas De Marchi
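
For reference, the "multiple images" part corresponds to walking a chain
of expansion ROM images, as drawn in the layout comment of the patch:
each image starts with the 55h AAh signature, holds a pointer to its PCI
Data Structure at offset 18h, and that structure carries the image
length (in 512-byte units), the code type and the last-image indicator.
The sketch below is a minimal userspace version of such a walk over an
in-memory copy of the ROM. The names rom_image, rom_parse_image and
rom_walk are hypothetical. It assumes the image length is a 16-bit field
and that "last image" is bit 7 of the indicator byte, which is how I
understand the PCI expansion ROM format; the quoted patch reads a single
length byte and compares the indicator against 0x80.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ROM_SIGNATURE		0xAA55	/* bytes 55h AAh, little endian */
#define ROM_PCIR_PTR_OFFSET	0x18	/* pointer to PCI Data Structure */
#define PCIR_IMAGE_LEN		0x10	/* image length, 512-byte units */
#define PCIR_CODE_TYPE		0x14
#define PCIR_INDICATOR		0x15
#define PCIR_LAST_IMAGE		0x80	/* bit 7 of the indicator byte */
#define ROM_BLOCK_SIZE		512

struct rom_image {
	size_t offset;		/* start of the image within the ROM */
	size_t length;		/* image length in bytes */
	uint8_t code_type;
	int last;		/* non-zero for the final image */
};

static uint16_t rd16(const uint8_t *p)
{
	return (uint16_t)(p[0] | (p[1] << 8));
}

/* Parse the image header at rom[off]; return 0 on success, -1 if invalid. */
static int rom_parse_image(const uint8_t *rom, size_t rom_len, size_t off,
			   struct rom_image *img)
{
	size_t pcir;
	uint16_t units;

	assert(rom && img);

	if (off > rom_len || rom_len - off < ROM_PCIR_PTR_OFFSET + 2)
		return -1;
	if (rd16(rom + off) != ROM_SIGNATURE)
		return -1;

	pcir = rd16(rom + off + ROM_PCIR_PTR_OFFSET);
	if (rom_len - off < pcir + PCIR_INDICATOR + 1)
		return -1;

	units = rd16(rom + off + pcir + PCIR_IMAGE_LEN);
	if (units == 0 || (size_t)units * ROM_BLOCK_SIZE > rom_len - off)
		return -1;

	img->offset = off;
	img->length = (size_t)units * ROM_BLOCK_SIZE;
	img->code_type = rom[off + pcir + PCIR_CODE_TYPE];
	img->last = !!(rom[off + pcir + PCIR_INDICATOR] & PCIR_LAST_IMAGE);
	return 0;
}

/*
 * Walk the image chain from offset 0. Return the number of images found,
 * or -1 on a malformed image or if no last image appears within max.
 */
static int rom_walk(const uint8_t *rom, size_t rom_len,
		    struct rom_image *out, int max)
{
	size_t off = 0;
	int n;

	for (n = 0; n < max; n++) {
		if (rom_parse_image(rom, rom_len, off, &out[n]))
			return -1;
		if (out[n].last)
			return n + 1;
		off += out[n].length;
	}

	return -1;
}
```

A caller would then pick the image it wants by code type or by checking
for a signature such as "$CPD" or the opregion signature at the offset
the header points to, which is the selection step the patch performs
inside its loop.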

^ permalink raw reply	[flat|nested] 65+ messages in thread

end of thread, other threads:[~2021-09-22 21:53 UTC | newest]

Thread overview: 65+ messages
2021-04-12  9:05 [PATCH 00/19] More DG1 enabling Matthew Auld
2021-04-12  9:05 ` [PATCH 01/19] drm/i915/gt: Skip aperture remapping selftest where there is no aperture Matthew Auld
2021-04-12 14:48   ` [Intel-gfx] " Daniel Vetter
2021-04-12  9:05 ` [PATCH 02/19] drm/i915/selftests: Only query RAPL for integrated power measurements Matthew Auld
2021-04-12  9:05 ` [PATCH 03/19] drm/i915: Create stolen memory region from local memory Matthew Auld
2021-04-14 15:01   ` [Intel-gfx] " Tvrtko Ursulin
2021-04-16 15:04     ` Matthew Auld
2021-04-19 14:15       ` Tvrtko Ursulin
2021-04-12  9:05 ` [PATCH 04/19] drm/i915/stolen: treat stolen local as normal " Matthew Auld
2021-04-14 15:06   ` [Intel-gfx] " Tvrtko Ursulin
2021-04-12  9:05 ` [PATCH 05/19] drm/i915/stolen: enforce the min_page_size contract Matthew Auld
2021-04-14 15:07   ` [Intel-gfx] " Tvrtko Ursulin
2021-04-12  9:05 ` [PATCH 06/19] drm/i915/stolen: pass the allocation flags Matthew Auld
2021-04-14 15:09   ` [Intel-gfx] " Tvrtko Ursulin
2021-04-16 13:53     ` Matthew Auld
2021-04-12  9:05 ` [PATCH 07/19] drm/i915/fbdev: Use lmem physical addresses for fb_mmap() on discrete Matthew Auld
2021-04-12 15:00   ` Daniel Vetter
2021-04-12  9:05 ` [PATCH 08/19] drm/i915: Return error value when bo not in LMEM for discrete Matthew Auld
2021-04-14 15:16   ` [Intel-gfx] " Tvrtko Ursulin
2021-04-12  9:05 ` [PATCH 09/19] drm/i915/lmem: Fail driver init if LMEM training failed Matthew Auld
2021-04-12  9:05 ` [PATCH 10/19] drm/i915/dg1: Fix mapping type for default state object Matthew Auld
2021-04-12  9:05 ` [PATCH 11/19] drm/i915: Update the helper to set correct mapping Matthew Auld
2021-04-14 15:22   ` [Intel-gfx] " Tvrtko Ursulin
2021-04-14 16:20     ` Matthew Auld
2021-04-15  8:20       ` Tvrtko Ursulin
2021-04-15  9:23         ` Matthew Auld
2021-04-15 11:05           ` Tvrtko Ursulin
2021-04-19 11:30             ` Matthew Auld
2021-04-19 14:07               ` Tvrtko Ursulin
2021-04-19 14:37                 ` Matthew Auld
2021-04-19 15:01                   ` Tvrtko Ursulin
2021-04-21 11:42                     ` Matthew Auld
2021-04-21 15:41                       ` Tvrtko Ursulin
2021-04-21 19:13                         ` Matthew Auld
2021-04-26  8:57                           ` Matthew Auld
2021-04-26  9:21                             ` Tvrtko Ursulin
2021-04-12  9:05 ` [PATCH 12/19] drm/i915/lmem: Bypass aperture when lmem is available Matthew Auld
2021-04-14 15:33   ` [Intel-gfx] " Tvrtko Ursulin
2021-04-16 14:25     ` Matthew Auld
2021-04-19 14:16       ` Tvrtko Ursulin
2021-04-12  9:05 ` [PATCH 13/19] drm/i915/dg1: Read OPROM via SPI controller Matthew Auld
2021-09-17 23:29   ` [Intel-gfx] " Lucas De Marchi
2021-04-12  9:05 ` [PATCH 14/19] drm/i915/oprom: Basic sanitization Matthew Auld
2021-04-12 22:36   ` [Intel-gfx] " kernel test robot
2021-04-12 22:36   ` [PATCH] drm/i915/oprom: fix memdup.cocci warnings kernel test robot
2021-05-17 11:57   ` [Intel-gfx] [PATCH 14/19] drm/i915/oprom: Basic sanitization Jani Nikula
2021-09-18  4:30     ` Lucas De Marchi
2021-09-20  7:41       ` Jani Nikula
2021-09-20  8:04         ` Gupta, Anshuman
2021-09-20  8:43           ` Jani Nikula
2021-09-22 21:53           ` Lucas De Marchi
2021-04-12  9:05 ` [PATCH 15/19] drm/i915: WA for zero memory channel Matthew Auld
2021-04-12 16:57   ` Souza, Jose
2021-04-12  9:05 ` [PATCH 16/19] drm/i915/dg1: Compute MEM Bandwidth using MCHBAR Matthew Auld
2021-04-12  9:05 ` [PATCH 17/19] drm/i915/dg1: Double memory bandwidth available Matthew Auld
2021-04-12  9:05 ` [PATCH 18/19] drm/i915/gtt: map the PD up front Matthew Auld
2021-04-12 15:17   ` [Intel-gfx] " Daniel Vetter
2021-04-12 16:01     ` Jani Nikula
2021-04-12 16:36       ` Daniel Vetter
2021-04-12 16:08     ` Matthew Auld
2021-04-12 17:00       ` Daniel Vetter
2021-04-13  9:28         ` Matthew Auld
2021-04-13 10:18           ` Daniel Vetter
2021-04-12  9:05 ` [PATCH 19/19] drm/i915/gtt/dgfx: place the PD in LMEM Matthew Auld
2021-04-14 15:37   ` [Intel-gfx] " Tvrtko Ursulin
