All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 00/14] Forcewake binary search & code shrink
@ 2016-09-30 17:48 Tvrtko Ursulin
  2016-09-30 17:48 ` [PATCH 01/14] drm/i915: Remove redundant hsw_write* mmio functions Tvrtko Ursulin
                   ` (16 more replies)
  0 siblings, 17 replies; 32+ messages in thread
From: Tvrtko Ursulin @ 2016-09-30 17:48 UTC (permalink / raw)
  To: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Motivation was to replace linear search comparison if-else ladders in MMIO
accessors with a data driven approach.

These lookups are needed to determine to correct forcewake domains to take
when reading and writing to MMIO registers.

Going data driven and keeping the tables sorted enabled binary search to be
used. Followed by gradual extraction of commonality between existing functions,
accompanying code reduction, and the end result is around 15KiB less code
generated and also some saving in the source code lines.

v2:
 * Updated cover letter given feedback.
 * Some rework for review comments.
 * New patch in the series (14) to inline the binary search.

Tvrtko Ursulin (14):
  drm/i915: Remove redundant hsw_write* mmio functions
  drm/i915: Keep track of active forcewake domains in a bitmask
  drm/i915: Do not inline forcewake taking in mmio accessors
  drm/i915: Data driven register to forcewake domains lookup
  drm/i915: Sort forcewake mapping tables
  drm/i915: Use binary search when looking up forcewake domains
  drm/i915: Eliminate Gen9 special case
  drm/i915: Store the active forcewake range table pointer
  drm/i915: Remove identical macros
  drm/i915: Remove identical mmio read functions
  drm/i915: Remove identical write mmmio functions
  drm/i915: Sort the shadow register table
  drm/i915: Use binary search when looking for shadowed registers
  drm/i915: Inline binary search

 drivers/gpu/drm/i915/i915_drv.h     |  12 +
 drivers/gpu/drm/i915/intel_uncore.c | 600 ++++++++++++++++--------------------
 2 files changed, 276 insertions(+), 336 deletions(-)

-- 
2.7.4

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH 01/14] drm/i915: Remove redundant hsw_write* mmio functions
  2016-09-30 17:48 [PATCH v2 00/14] Forcewake binary search & code shrink Tvrtko Ursulin
@ 2016-09-30 17:48 ` Tvrtko Ursulin
  2016-09-30 17:48 ` [PATCH 02/14] drm/i915: Keep track of active forcewake domains in a bitmask Tvrtko Ursulin
                   ` (15 subsequent siblings)
  16 siblings, 0 replies; 32+ messages in thread
From: Tvrtko Ursulin @ 2016-09-30 17:48 UTC (permalink / raw)
  To: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

They are completely identical to gen6_write* ones.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/intel_uncore.c | 25 +------------------------
 1 file changed, 1 insertion(+), 24 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
index a9b6c936aadd..478364057812 100644
--- a/drivers/gpu/drm/i915/intel_uncore.c
+++ b/drivers/gpu/drm/i915/intel_uncore.c
@@ -1055,21 +1055,6 @@ gen6_write##x(struct drm_i915_private *dev_priv, i915_reg_t reg, u##x val, bool
 	GEN6_WRITE_FOOTER; \
 }
 
-#define __hsw_write(x) \
-static void \
-hsw_write##x(struct drm_i915_private *dev_priv, i915_reg_t reg, u##x val, bool trace) { \
-	u32 __fifo_ret = 0; \
-	GEN6_WRITE_HEADER; \
-	if (NEEDS_FORCE_WAKE(offset)) { \
-		__fifo_ret = __gen6_gt_wait_for_fifo(dev_priv); \
-	} \
-	__raw_i915_write##x(dev_priv, reg, val); \
-	if (unlikely(__fifo_ret)) { \
-		gen6_gt_check_fifodbg(dev_priv); \
-	} \
-	GEN6_WRITE_FOOTER; \
-}
-
 #define __gen8_write(x) \
 static void \
 gen8_write##x(struct drm_i915_private *dev_priv, i915_reg_t reg, u##x val, bool trace) { \
@@ -1116,9 +1101,6 @@ __chv_write(32)
 __gen8_write(8)
 __gen8_write(16)
 __gen8_write(32)
-__hsw_write(8)
-__hsw_write(16)
-__hsw_write(32)
 __gen6_write(8)
 __gen6_write(16)
 __gen6_write(32)
@@ -1126,7 +1108,6 @@ __gen6_write(32)
 #undef __gen9_write
 #undef __chv_write
 #undef __gen8_write
-#undef __hsw_write
 #undef __gen6_write
 #undef GEN6_WRITE_FOOTER
 #undef GEN6_WRITE_HEADER
@@ -1343,11 +1324,7 @@ void intel_uncore_init(struct drm_i915_private *dev_priv)
 		break;
 	case 7:
 	case 6:
-		if (IS_HASWELL(dev_priv)) {
-			ASSIGN_WRITE_MMIO_VFUNCS(hsw);
-		} else {
-			ASSIGN_WRITE_MMIO_VFUNCS(gen6);
-		}
+		ASSIGN_WRITE_MMIO_VFUNCS(gen6);
 
 		if (IS_VALLEYVIEW(dev_priv)) {
 			ASSIGN_READ_MMIO_VFUNCS(vlv);
-- 
2.7.4

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH 02/14] drm/i915: Keep track of active forcewake domains in a bitmask
  2016-09-30 17:48 [PATCH v2 00/14] Forcewake binary search & code shrink Tvrtko Ursulin
  2016-09-30 17:48 ` [PATCH 01/14] drm/i915: Remove redundant hsw_write* mmio functions Tvrtko Ursulin
@ 2016-09-30 17:48 ` Tvrtko Ursulin
  2016-09-30 17:48 ` [PATCH 03/14] drm/i915: Do not inline forcewake taking in mmio accessors Tvrtko Ursulin
                   ` (14 subsequent siblings)
  16 siblings, 0 replies; 32+ messages in thread
From: Tvrtko Ursulin @ 2016-09-30 17:48 UTC (permalink / raw)
  To: Intel-gfx; +Cc: Paneri, Praveen

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

There are current places in the code, and there will be more in the
future, which iterate the forcewake domains to find out which ones
are currently active.

To save them from doing this iteration, we can cheaply keep a mask
of active domains in dev_priv->uncore.fw_domains_active.

This has no cost in terms of object size, even manages to shrink it
overall by 368 bytes on my config.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: "Paneri, Praveen" <praveen.paneri@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_drv.h     |  2 ++
 drivers/gpu/drm/i915/intel_uncore.c | 54 ++++++++++++++++---------------------
 2 files changed, 25 insertions(+), 31 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index e0cb71c9c52e..9b1997fd8c87 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -588,7 +588,9 @@ struct intel_uncore {
 	struct intel_uncore_funcs funcs;
 
 	unsigned fifo_count;
+
 	enum forcewake_domains fw_domains;
+	enum forcewake_domains fw_domains_active;
 
 	struct intel_uncore_forcewake_domain {
 		struct drm_i915_private *i915;
diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
index 478364057812..079102b46fd3 100644
--- a/drivers/gpu/drm/i915/intel_uncore.c
+++ b/drivers/gpu/drm/i915/intel_uncore.c
@@ -231,19 +231,21 @@ intel_uncore_fw_release_timer(struct hrtimer *timer)
 {
 	struct intel_uncore_forcewake_domain *domain =
 	       container_of(timer, struct intel_uncore_forcewake_domain, timer);
+	struct drm_i915_private *dev_priv = domain->i915;
 	unsigned long irqflags;
 
-	assert_rpm_device_not_suspended(domain->i915);
+	assert_rpm_device_not_suspended(dev_priv);
 
-	spin_lock_irqsave(&domain->i915->uncore.lock, irqflags);
+	spin_lock_irqsave(&dev_priv->uncore.lock, irqflags);
 	if (WARN_ON(domain->wake_count == 0))
 		domain->wake_count++;
 
-	if (--domain->wake_count == 0)
-		domain->i915->uncore.funcs.force_wake_put(domain->i915,
-							  1 << domain->id);
+	if (--domain->wake_count == 0) {
+		dev_priv->uncore.funcs.force_wake_put(dev_priv, domain->mask);
+		dev_priv->uncore.fw_domains_active &= ~domain->mask;
+	}
 
-	spin_unlock_irqrestore(&domain->i915->uncore.lock, irqflags);
+	spin_unlock_irqrestore(&dev_priv->uncore.lock, irqflags);
 
 	return HRTIMER_NORESTART;
 }
@@ -254,7 +256,7 @@ void intel_uncore_forcewake_reset(struct drm_i915_private *dev_priv,
 	unsigned long irqflags;
 	struct intel_uncore_forcewake_domain *domain;
 	int retry_count = 100;
-	enum forcewake_domains fw = 0, active_domains;
+	enum forcewake_domains fw, active_domains;
 
 	/* Hold uncore.lock across reset to prevent any register access
 	 * with forcewake not set correctly. Wait until all pending
@@ -291,10 +293,7 @@ void intel_uncore_forcewake_reset(struct drm_i915_private *dev_priv,
 
 	WARN_ON(active_domains);
 
-	for_each_fw_domain(domain, dev_priv)
-		if (domain->wake_count)
-			fw |= domain->mask;
-
+	fw = dev_priv->uncore.fw_domains_active;
 	if (fw)
 		dev_priv->uncore.funcs.force_wake_put(dev_priv, fw);
 
@@ -443,9 +442,6 @@ static void __intel_uncore_forcewake_get(struct drm_i915_private *dev_priv,
 {
 	struct intel_uncore_forcewake_domain *domain;
 
-	if (!dev_priv->uncore.funcs.force_wake_get)
-		return;
-
 	fw_domains &= dev_priv->uncore.fw_domains;
 
 	for_each_fw_domain_masked(domain, fw_domains, dev_priv) {
@@ -453,8 +449,10 @@ static void __intel_uncore_forcewake_get(struct drm_i915_private *dev_priv,
 			fw_domains &= ~domain->mask;
 	}
 
-	if (fw_domains)
+	if (fw_domains) {
 		dev_priv->uncore.funcs.force_wake_get(dev_priv, fw_domains);
+		dev_priv->uncore.fw_domains_active |= fw_domains;
+	}
 }
 
 /**
@@ -509,9 +507,6 @@ static void __intel_uncore_forcewake_put(struct drm_i915_private *dev_priv,
 {
 	struct intel_uncore_forcewake_domain *domain;
 
-	if (!dev_priv->uncore.funcs.force_wake_put)
-		return;
-
 	fw_domains &= dev_priv->uncore.fw_domains;
 
 	for_each_fw_domain_masked(domain, fw_domains, dev_priv) {
@@ -567,13 +562,10 @@ void intel_uncore_forcewake_put__locked(struct drm_i915_private *dev_priv,
 
 void assert_forcewakes_inactive(struct drm_i915_private *dev_priv)
 {
-	struct intel_uncore_forcewake_domain *domain;
-
 	if (!dev_priv->uncore.funcs.force_wake_get)
 		return;
 
-	for_each_fw_domain(domain, dev_priv)
-		WARN_ON(domain->wake_count);
+	WARN_ON(dev_priv->uncore.fw_domains_active);
 }
 
 /* We give fast paths for the really cool registers */
@@ -878,18 +870,18 @@ static inline void __force_wake_auto(struct drm_i915_private *dev_priv,
 	if (WARN_ON(!fw_domains))
 		return;
 
-	/* Ideally GCC would be constant-fold and eliminate this loop */
-	for_each_fw_domain_masked(domain, fw_domains, dev_priv) {
-		if (domain->wake_count) {
-			fw_domains &= ~domain->mask;
-			continue;
-		}
+	/* Turn on all requested but inactive supported forcewake domains. */
+	fw_domains &= dev_priv->uncore.fw_domains;
+	fw_domains &= ~dev_priv->uncore.fw_domains_active;
 
-		fw_domain_arm_timer(domain);
-	}
+	if (fw_domains) {
+		/* Ideally GCC would be constant-fold and eliminate this loop */
+		for_each_fw_domain_masked(domain, fw_domains, dev_priv)
+			fw_domain_arm_timer(domain);
 
-	if (fw_domains)
 		dev_priv->uncore.funcs.force_wake_get(dev_priv, fw_domains);
+		dev_priv->uncore.fw_domains_active |= fw_domains;
+	}
 }
 
 #define __gen6_read(x) \
-- 
2.7.4

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH 03/14] drm/i915: Do not inline forcewake taking in mmio accessors
  2016-09-30 17:48 [PATCH v2 00/14] Forcewake binary search & code shrink Tvrtko Ursulin
  2016-09-30 17:48 ` [PATCH 01/14] drm/i915: Remove redundant hsw_write* mmio functions Tvrtko Ursulin
  2016-09-30 17:48 ` [PATCH 02/14] drm/i915: Keep track of active forcewake domains in a bitmask Tvrtko Ursulin
@ 2016-09-30 17:48 ` Tvrtko Ursulin
  2016-09-30 17:48 ` [PATCH 04/14] drm/i915: Data driven register to forcewake domains lookup Tvrtko Ursulin
                   ` (13 subsequent siblings)
  16 siblings, 0 replies; 32+ messages in thread
From: Tvrtko Ursulin @ 2016-09-30 17:48 UTC (permalink / raw)
  To: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Once we know we need to take new forcewakes, that being
a slow operation, it does not make sense to inline that
code into every mmio accessor.

Move it to a separate function and save some code.

v2: Be explicit with noinline and remove stale comment.
    (Chris Wilson)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/intel_uncore.c | 24 ++++++++++++++----------
 1 file changed, 14 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
index 079102b46fd3..60137a164094 100644
--- a/drivers/gpu/drm/i915/intel_uncore.c
+++ b/drivers/gpu/drm/i915/intel_uncore.c
@@ -862,11 +862,21 @@ __gen2_read(64)
 	trace_i915_reg_rw(false, reg, val, sizeof(val), trace); \
 	return val
 
-static inline void __force_wake_auto(struct drm_i915_private *dev_priv,
-				     enum forcewake_domains fw_domains)
+static noinline void ___force_wake_auto(struct drm_i915_private *dev_priv,
+					enum forcewake_domains fw_domains)
 {
 	struct intel_uncore_forcewake_domain *domain;
 
+	for_each_fw_domain_masked(domain, fw_domains, dev_priv)
+		fw_domain_arm_timer(domain);
+
+	dev_priv->uncore.funcs.force_wake_get(dev_priv, fw_domains);
+	dev_priv->uncore.fw_domains_active |= fw_domains;
+}
+
+static inline void __force_wake_auto(struct drm_i915_private *dev_priv,
+				     enum forcewake_domains fw_domains)
+{
 	if (WARN_ON(!fw_domains))
 		return;
 
@@ -874,14 +884,8 @@ static inline void __force_wake_auto(struct drm_i915_private *dev_priv,
 	fw_domains &= dev_priv->uncore.fw_domains;
 	fw_domains &= ~dev_priv->uncore.fw_domains_active;
 
-	if (fw_domains) {
-		/* Ideally GCC would be constant-fold and eliminate this loop */
-		for_each_fw_domain_masked(domain, fw_domains, dev_priv)
-			fw_domain_arm_timer(domain);
-
-		dev_priv->uncore.funcs.force_wake_get(dev_priv, fw_domains);
-		dev_priv->uncore.fw_domains_active |= fw_domains;
-	}
+	if (fw_domains)
+		___force_wake_auto(dev_priv, fw_domains);
 }
 
 #define __gen6_read(x) \
-- 
2.7.4

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH 04/14] drm/i915: Data driven register to forcewake domains lookup
  2016-09-30 17:48 [PATCH v2 00/14] Forcewake binary search & code shrink Tvrtko Ursulin
                   ` (2 preceding siblings ...)
  2016-09-30 17:48 ` [PATCH 03/14] drm/i915: Do not inline forcewake taking in mmio accessors Tvrtko Ursulin
@ 2016-09-30 17:48 ` Tvrtko Ursulin
  2016-10-03  7:58   ` Joonas Lahtinen
  2016-09-30 17:48 ` [PATCH 05/14] drm/i915: Sort forcewake mapping tables Tvrtko Ursulin
                   ` (12 subsequent siblings)
  16 siblings, 1 reply; 32+ messages in thread
From: Tvrtko Ursulin @ 2016-09-30 17:48 UTC (permalink / raw)
  To: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Move finding the correct forcewake domains to take for
register access from code to a mapping table. This will
allow more interesting work in the following patches
and is easier to review if singled out early.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/intel_uncore.c | 212 ++++++++++++++++++------------------
 1 file changed, 103 insertions(+), 109 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
index 60137a164094..e21e65ab2a16 100644
--- a/drivers/gpu/drm/i915/intel_uncore.c
+++ b/drivers/gpu/drm/i915/intel_uncore.c
@@ -581,28 +581,52 @@ void assert_forcewakes_inactive(struct drm_i915_private *dev_priv)
 	__fwd; \
 })
 
-#define REG_RANGE(reg, start, end) ((reg) >= (start) && (reg) < (end))
+struct intel_forcewake_range
+{
+	u32 start;
+	u32 end;
+
+	enum forcewake_domains domains;
+};
 
-#define FORCEWAKE_VLV_RENDER_RANGE_OFFSET(reg) \
-	(REG_RANGE((reg), 0x2000, 0x4000) || \
-	 REG_RANGE((reg), 0x5000, 0x8000) || \
-	 REG_RANGE((reg), 0xB000, 0x12000) || \
-	 REG_RANGE((reg), 0x2E000, 0x30000))
+static enum forcewake_domains
+find_fw_domain(u32 offset, const struct intel_forcewake_range *ranges,
+	       unsigned int num_ranges)
+{
+	unsigned int i;
+	struct intel_forcewake_range *entry =
+		(struct intel_forcewake_range *)ranges;
+
+	for (i = 0; i < num_ranges; i++, entry++) {
+		if (offset >= entry->start && offset <= entry->end)
+			return entry->domains;
+	}
+
+	return -1;
+}
 
-#define FORCEWAKE_VLV_MEDIA_RANGE_OFFSET(reg) \
-	(REG_RANGE((reg), 0x12000, 0x14000) || \
-	 REG_RANGE((reg), 0x22000, 0x24000) || \
-	 REG_RANGE((reg), 0x30000, 0x40000))
+#define GEN_FW_RANGE(s, e, d) \
+	{ .start = (s), .end = (e), .domains = (d) }
+
+static const struct intel_forcewake_range __vlv_fw_ranges[] = {
+	GEN_FW_RANGE(0x2000, 0x3fff, FORCEWAKE_RENDER),
+	GEN_FW_RANGE(0x5000, 0x7fff, FORCEWAKE_RENDER),
+	GEN_FW_RANGE(0xb000, 0x11fff, FORCEWAKE_RENDER),
+	GEN_FW_RANGE(0x2e000, 0x2ffff, FORCEWAKE_RENDER),
+	GEN_FW_RANGE(0x12000, 0x13fff, FORCEWAKE_MEDIA),
+	GEN_FW_RANGE(0x22000, 0x23fff, FORCEWAKE_MEDIA),
+	GEN_FW_RANGE(0x30000, 0x3ffff, FORCEWAKE_MEDIA),
+};
 
 #define __vlv_reg_read_fw_domains(offset) \
 ({ \
 	enum forcewake_domains __fwd = 0; \
-	if (!NEEDS_FORCE_WAKE(offset)) \
-		__fwd = 0; \
-	else if (FORCEWAKE_VLV_RENDER_RANGE_OFFSET(offset)) \
-		__fwd = FORCEWAKE_RENDER; \
-	else if (FORCEWAKE_VLV_MEDIA_RANGE_OFFSET(offset)) \
-		__fwd = FORCEWAKE_MEDIA; \
+	if (NEEDS_FORCE_WAKE((offset))) { \
+		__fwd = find_fw_domain(offset, __vlv_fw_ranges, \
+				       ARRAY_SIZE(__vlv_fw_ranges)); \
+		if (__fwd == -1 ) \
+			__fwd = 0; \
+	} \
 	__fwd; \
 })
 
@@ -636,104 +660,78 @@ static bool is_gen8_shadowed(u32 offset)
 	__fwd; \
 })
 
-#define FORCEWAKE_CHV_RENDER_RANGE_OFFSET(reg) \
-	(REG_RANGE((reg), 0x2000, 0x4000) || \
-	 REG_RANGE((reg), 0x5200, 0x8000) || \
-	 REG_RANGE((reg), 0x8300, 0x8500) || \
-	 REG_RANGE((reg), 0xB000, 0xB480) || \
-	 REG_RANGE((reg), 0xE000, 0xE800))
-
-#define FORCEWAKE_CHV_MEDIA_RANGE_OFFSET(reg) \
-	(REG_RANGE((reg), 0x8800, 0x8900) || \
-	 REG_RANGE((reg), 0xD000, 0xD800) || \
-	 REG_RANGE((reg), 0x12000, 0x14000) || \
-	 REG_RANGE((reg), 0x1A000, 0x1C000) || \
-	 REG_RANGE((reg), 0x1E800, 0x1EA00) || \
-	 REG_RANGE((reg), 0x30000, 0x38000))
-
-#define FORCEWAKE_CHV_COMMON_RANGE_OFFSET(reg) \
-	(REG_RANGE((reg), 0x4000, 0x5000) || \
-	 REG_RANGE((reg), 0x8000, 0x8300) || \
-	 REG_RANGE((reg), 0x8500, 0x8600) || \
-	 REG_RANGE((reg), 0x9000, 0xB000) || \
-	 REG_RANGE((reg), 0xF000, 0x10000))
+static const struct intel_forcewake_range __chv_fw_ranges[] = {
+	GEN_FW_RANGE(0x2000, 0x3fff, FORCEWAKE_RENDER),
+	GEN_FW_RANGE(0x5200, 0x7fff, FORCEWAKE_RENDER),
+	GEN_FW_RANGE(0x8300, 0x84ff, FORCEWAKE_RENDER),
+	GEN_FW_RANGE(0xb000, 0xb47f, FORCEWAKE_RENDER),
+	GEN_FW_RANGE(0xe000, 0xe7ff, FORCEWAKE_RENDER),
+	GEN_FW_RANGE(0x8800, 0x88ff, FORCEWAKE_MEDIA),
+	GEN_FW_RANGE(0xd000, 0xd7ff, FORCEWAKE_MEDIA),
+	GEN_FW_RANGE(0x12000, 0x13fff, FORCEWAKE_MEDIA),
+	GEN_FW_RANGE(0x1a000, 0x1bfff, FORCEWAKE_MEDIA),
+	GEN_FW_RANGE(0x1e800, 0x1e9ff, FORCEWAKE_MEDIA),
+	GEN_FW_RANGE(0x30000, 0x37fff, FORCEWAKE_MEDIA),
+	GEN_FW_RANGE(0x4000, 0x4fff, FORCEWAKE_RENDER | FORCEWAKE_MEDIA),
+	GEN_FW_RANGE(0x8000, 0x82ff, FORCEWAKE_RENDER | FORCEWAKE_MEDIA),
+	GEN_FW_RANGE(0x8500, 0x85ff, FORCEWAKE_RENDER | FORCEWAKE_MEDIA),
+	GEN_FW_RANGE(0x9000, 0xafff, FORCEWAKE_RENDER | FORCEWAKE_MEDIA),
+	GEN_FW_RANGE(0xf000, 0xffff, FORCEWAKE_RENDER | FORCEWAKE_MEDIA),
+};
 
 #define __chv_reg_read_fw_domains(offset) \
 ({ \
 	enum forcewake_domains __fwd = 0; \
-	if (!NEEDS_FORCE_WAKE(offset)) \
-		__fwd = 0; \
-	else if (FORCEWAKE_CHV_RENDER_RANGE_OFFSET(offset)) \
-		__fwd = FORCEWAKE_RENDER; \
-	else if (FORCEWAKE_CHV_MEDIA_RANGE_OFFSET(offset)) \
-		__fwd = FORCEWAKE_MEDIA; \
-	else if (FORCEWAKE_CHV_COMMON_RANGE_OFFSET(offset)) \
-		__fwd = FORCEWAKE_RENDER | FORCEWAKE_MEDIA; \
+	if (NEEDS_FORCE_WAKE((offset))) { \
+		__fwd = find_fw_domain(offset, __chv_fw_ranges, \
+				       ARRAY_SIZE(__chv_fw_ranges)); \
+		if (__fwd == -1 ) \
+			__fwd = 0; \
+	} \
 	__fwd; \
 })
 
 #define __chv_reg_write_fw_domains(offset) \
 ({ \
 	enum forcewake_domains __fwd = 0; \
-	if (!NEEDS_FORCE_WAKE(offset) || is_gen8_shadowed(offset)) \
-		__fwd = 0; \
-	else if (FORCEWAKE_CHV_RENDER_RANGE_OFFSET(offset)) \
-		__fwd = FORCEWAKE_RENDER; \
-	else if (FORCEWAKE_CHV_MEDIA_RANGE_OFFSET(offset)) \
-		__fwd = FORCEWAKE_MEDIA; \
-	else if (FORCEWAKE_CHV_COMMON_RANGE_OFFSET(offset)) \
-		__fwd = FORCEWAKE_RENDER | FORCEWAKE_MEDIA; \
+	if (NEEDS_FORCE_WAKE((offset)) && !is_gen8_shadowed(offset)) { \
+		__fwd = find_fw_domain(offset, __chv_fw_ranges, \
+				       ARRAY_SIZE(__chv_fw_ranges)); \
+		if (__fwd == -1 ) \
+			__fwd = 0; \
+	} \
 	__fwd; \
 })
 
-#define FORCEWAKE_GEN9_UNCORE_RANGE_OFFSET(reg) \
-	REG_RANGE((reg), 0xB00,  0x2000)
-
-#define FORCEWAKE_GEN9_RENDER_RANGE_OFFSET(reg) \
-	(REG_RANGE((reg), 0x2000, 0x2700) || \
-	 REG_RANGE((reg), 0x3000, 0x4000) || \
-	 REG_RANGE((reg), 0x5200, 0x8000) || \
-	 REG_RANGE((reg), 0x8140, 0x8160) || \
-	 REG_RANGE((reg), 0x8300, 0x8500) || \
-	 REG_RANGE((reg), 0x8C00, 0x8D00) || \
-	 REG_RANGE((reg), 0xB000, 0xB480) || \
-	 REG_RANGE((reg), 0xE000, 0xE900) || \
-	 REG_RANGE((reg), 0x24400, 0x24800))
-
-#define FORCEWAKE_GEN9_MEDIA_RANGE_OFFSET(reg) \
-	(REG_RANGE((reg), 0x8130, 0x8140) || \
-	 REG_RANGE((reg), 0x8800, 0x8A00) || \
-	 REG_RANGE((reg), 0xD000, 0xD800) || \
-	 REG_RANGE((reg), 0x12000, 0x14000) || \
-	 REG_RANGE((reg), 0x1A000, 0x1EA00) || \
-	 REG_RANGE((reg), 0x30000, 0x40000))
-
-#define FORCEWAKE_GEN9_COMMON_RANGE_OFFSET(reg) \
-	REG_RANGE((reg), 0x9400, 0x9800)
-
-#define FORCEWAKE_GEN9_BLITTER_RANGE_OFFSET(reg) \
-	((reg) < 0x40000 && \
-	 !FORCEWAKE_GEN9_UNCORE_RANGE_OFFSET(reg) && \
-	 !FORCEWAKE_GEN9_RENDER_RANGE_OFFSET(reg) && \
-	 !FORCEWAKE_GEN9_MEDIA_RANGE_OFFSET(reg) && \
-	 !FORCEWAKE_GEN9_COMMON_RANGE_OFFSET(reg))
-
-#define SKL_NEEDS_FORCE_WAKE(reg) \
-	((reg) < 0x40000 && !FORCEWAKE_GEN9_UNCORE_RANGE_OFFSET(reg))
+static const struct intel_forcewake_range __gen9_fw_ranges[] = {
+	GEN_FW_RANGE(0xb00, 0x1fff, 0), /* uncore range */
+	GEN_FW_RANGE(0x2000, 0x26ff, FORCEWAKE_RENDER),
+	GEN_FW_RANGE(0x3000, 0x3fff, FORCEWAKE_RENDER),
+	GEN_FW_RANGE(0x5200, 0x7fff, FORCEWAKE_RENDER),
+	GEN_FW_RANGE(0x8140, 0x815f, FORCEWAKE_RENDER),
+	GEN_FW_RANGE(0x8300, 0x84ff, FORCEWAKE_RENDER),
+	GEN_FW_RANGE(0x8c00, 0x8cff, FORCEWAKE_RENDER),
+	GEN_FW_RANGE(0xb000, 0xb47f, FORCEWAKE_RENDER),
+	GEN_FW_RANGE(0xe000, 0xe8ff, FORCEWAKE_RENDER),
+	GEN_FW_RANGE(0x24400, 0x247ff, FORCEWAKE_RENDER),
+	GEN_FW_RANGE(0x9400, 0x97ff, FORCEWAKE_RENDER | FORCEWAKE_MEDIA),
+	GEN_FW_RANGE(0x8130, 0x813f, FORCEWAKE_MEDIA),
+	GEN_FW_RANGE(0x8800, 0x89ff, FORCEWAKE_MEDIA),
+	GEN_FW_RANGE(0xd000, 0xd7ff, FORCEWAKE_MEDIA),
+	GEN_FW_RANGE(0x12000, 0x13fff, FORCEWAKE_MEDIA),
+	GEN_FW_RANGE(0x1a000, 0x1e9ff, FORCEWAKE_MEDIA),
+	GEN_FW_RANGE(0x30000, 0x3ffff, FORCEWAKE_MEDIA),
+};
 
 #define __gen9_reg_read_fw_domains(offset) \
 ({ \
-	enum forcewake_domains __fwd; \
-	if (!SKL_NEEDS_FORCE_WAKE(offset)) \
-		__fwd = 0; \
-	else if (FORCEWAKE_GEN9_RENDER_RANGE_OFFSET(offset)) \
-		__fwd = FORCEWAKE_RENDER; \
-	else if (FORCEWAKE_GEN9_MEDIA_RANGE_OFFSET(offset)) \
-		__fwd = FORCEWAKE_MEDIA; \
-	else if (FORCEWAKE_GEN9_COMMON_RANGE_OFFSET(offset)) \
-		__fwd = FORCEWAKE_RENDER | FORCEWAKE_MEDIA; \
-	else \
-		__fwd = FORCEWAKE_BLITTER; \
+	enum forcewake_domains __fwd = 0; \
+	if (NEEDS_FORCE_WAKE((offset))) { \
+		__fwd = find_fw_domain(offset, __gen9_fw_ranges, \
+				       ARRAY_SIZE(__gen9_fw_ranges)); \
+		if (__fwd == -1 ) \
+			__fwd = FORCEWAKE_BLITTER; \
+	} \
 	__fwd; \
 })
 
@@ -759,17 +757,13 @@ static bool is_gen9_shadowed(u32 offset)
 
 #define __gen9_reg_write_fw_domains(offset) \
 ({ \
-	enum forcewake_domains __fwd; \
-	if (!SKL_NEEDS_FORCE_WAKE(offset) || is_gen9_shadowed(offset)) \
-		__fwd = 0; \
-	else if (FORCEWAKE_GEN9_RENDER_RANGE_OFFSET(offset)) \
-		__fwd = FORCEWAKE_RENDER; \
-	else if (FORCEWAKE_GEN9_MEDIA_RANGE_OFFSET(offset)) \
-		__fwd = FORCEWAKE_MEDIA; \
-	else if (FORCEWAKE_GEN9_COMMON_RANGE_OFFSET(offset)) \
-		__fwd = FORCEWAKE_RENDER | FORCEWAKE_MEDIA; \
-	else \
-		__fwd = FORCEWAKE_BLITTER; \
+	enum forcewake_domains __fwd = 0; \
+	if (NEEDS_FORCE_WAKE((offset)) && !is_gen9_shadowed(offset)) { \
+		__fwd = find_fw_domain(offset, __gen9_fw_ranges, \
+				       ARRAY_SIZE(__gen9_fw_ranges)); \
+		if (__fwd == -1 ) \
+			__fwd = FORCEWAKE_BLITTER; \
+	} \
 	__fwd; \
 })
 
-- 
2.7.4

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH 05/14] drm/i915: Sort forcewake mapping tables
  2016-09-30 17:48 [PATCH v2 00/14] Forcewake binary search & code shrink Tvrtko Ursulin
                   ` (3 preceding siblings ...)
  2016-09-30 17:48 ` [PATCH 04/14] drm/i915: Data driven register to forcewake domains lookup Tvrtko Ursulin
@ 2016-09-30 17:48 ` Tvrtko Ursulin
  2016-09-30 18:05   ` Chris Wilson
  2016-09-30 17:48 ` [PATCH 06/14] drm/i915: Use binary search when looking up forcewake domains Tvrtko Ursulin
                   ` (11 subsequent siblings)
  16 siblings, 1 reply; 32+ messages in thread
From: Tvrtko Ursulin @ 2016-09-30 17:48 UTC (permalink / raw)
  To: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Sorting the tables (verified at runtime to help during
development) is another prerequisite for interesting
work which will follow.

v2:
 * Remove const away cast and improve comments. (Chris Wilson)
 * Check tables only when debug option is enabled.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/intel_uncore.c | 56 ++++++++++++++++++++++++++++---------
 1 file changed, 43 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
index e21e65ab2a16..5ca65de340b7 100644
--- a/drivers/gpu/drm/i915/intel_uncore.c
+++ b/drivers/gpu/drm/i915/intel_uncore.c
@@ -605,16 +605,35 @@ find_fw_domain(u32 offset, const struct intel_forcewake_range *ranges,
 	return -1;
 }
 
+static void
+intel_fw_table_check(const struct intel_forcewake_range *ranges,
+		     unsigned int num_ranges)
+{
+#ifdef CONFIG_DRM_I915_DEBUG
+	unsigned int i;
+	const struct intel_forcewake_range *entry = ranges;
+	s32 prev = -1;
+
+	for (i = 0; i < num_ranges; i++, entry++) {
+		WARN_ON_ONCE(prev >= (s32)entry->start);
+		prev = entry->start;
+		WARN_ON_ONCE(prev >= (s32)entry->end);
+		prev = entry->end;
+	}
+#endif
+}
+
 #define GEN_FW_RANGE(s, e, d) \
 	{ .start = (s), .end = (e), .domains = (d) }
 
+/* *Must* be sorted by offset ranges! See intel_fw_table_check(). */
 static const struct intel_forcewake_range __vlv_fw_ranges[] = {
 	GEN_FW_RANGE(0x2000, 0x3fff, FORCEWAKE_RENDER),
 	GEN_FW_RANGE(0x5000, 0x7fff, FORCEWAKE_RENDER),
 	GEN_FW_RANGE(0xb000, 0x11fff, FORCEWAKE_RENDER),
-	GEN_FW_RANGE(0x2e000, 0x2ffff, FORCEWAKE_RENDER),
 	GEN_FW_RANGE(0x12000, 0x13fff, FORCEWAKE_MEDIA),
 	GEN_FW_RANGE(0x22000, 0x23fff, FORCEWAKE_MEDIA),
+	GEN_FW_RANGE(0x2e000, 0x2ffff, FORCEWAKE_RENDER),
 	GEN_FW_RANGE(0x30000, 0x3ffff, FORCEWAKE_MEDIA),
 };
 
@@ -660,23 +679,24 @@ static bool is_gen8_shadowed(u32 offset)
 	__fwd; \
 })
 
+/* *Must* be sorted by offset ranges! See intel_fw_table_check(). */
 static const struct intel_forcewake_range __chv_fw_ranges[] = {
 	GEN_FW_RANGE(0x2000, 0x3fff, FORCEWAKE_RENDER),
+	GEN_FW_RANGE(0x4000, 0x4fff, FORCEWAKE_RENDER | FORCEWAKE_MEDIA),
 	GEN_FW_RANGE(0x5200, 0x7fff, FORCEWAKE_RENDER),
+	GEN_FW_RANGE(0x8000, 0x82ff, FORCEWAKE_RENDER | FORCEWAKE_MEDIA),
 	GEN_FW_RANGE(0x8300, 0x84ff, FORCEWAKE_RENDER),
-	GEN_FW_RANGE(0xb000, 0xb47f, FORCEWAKE_RENDER),
-	GEN_FW_RANGE(0xe000, 0xe7ff, FORCEWAKE_RENDER),
+	GEN_FW_RANGE(0x8500, 0x85ff, FORCEWAKE_RENDER | FORCEWAKE_MEDIA),
 	GEN_FW_RANGE(0x8800, 0x88ff, FORCEWAKE_MEDIA),
+	GEN_FW_RANGE(0x9000, 0xafff, FORCEWAKE_RENDER | FORCEWAKE_MEDIA),
+	GEN_FW_RANGE(0xb000, 0xb47f, FORCEWAKE_RENDER),
 	GEN_FW_RANGE(0xd000, 0xd7ff, FORCEWAKE_MEDIA),
+	GEN_FW_RANGE(0xe000, 0xe7ff, FORCEWAKE_RENDER),
+	GEN_FW_RANGE(0xf000, 0xffff, FORCEWAKE_RENDER | FORCEWAKE_MEDIA),
 	GEN_FW_RANGE(0x12000, 0x13fff, FORCEWAKE_MEDIA),
 	GEN_FW_RANGE(0x1a000, 0x1bfff, FORCEWAKE_MEDIA),
 	GEN_FW_RANGE(0x1e800, 0x1e9ff, FORCEWAKE_MEDIA),
 	GEN_FW_RANGE(0x30000, 0x37fff, FORCEWAKE_MEDIA),
-	GEN_FW_RANGE(0x4000, 0x4fff, FORCEWAKE_RENDER | FORCEWAKE_MEDIA),
-	GEN_FW_RANGE(0x8000, 0x82ff, FORCEWAKE_RENDER | FORCEWAKE_MEDIA),
-	GEN_FW_RANGE(0x8500, 0x85ff, FORCEWAKE_RENDER | FORCEWAKE_MEDIA),
-	GEN_FW_RANGE(0x9000, 0xafff, FORCEWAKE_RENDER | FORCEWAKE_MEDIA),
-	GEN_FW_RANGE(0xf000, 0xffff, FORCEWAKE_RENDER | FORCEWAKE_MEDIA),
 };
 
 #define __chv_reg_read_fw_domains(offset) \
@@ -703,23 +723,24 @@ static const struct intel_forcewake_range __chv_fw_ranges[] = {
 	__fwd; \
 })
 
+/* *Must* be sorted by offset ranges! See intel_fw_table_check(). */
 static const struct intel_forcewake_range __gen9_fw_ranges[] = {
 	GEN_FW_RANGE(0xb00, 0x1fff, 0), /* uncore range */
 	GEN_FW_RANGE(0x2000, 0x26ff, FORCEWAKE_RENDER),
 	GEN_FW_RANGE(0x3000, 0x3fff, FORCEWAKE_RENDER),
 	GEN_FW_RANGE(0x5200, 0x7fff, FORCEWAKE_RENDER),
+	GEN_FW_RANGE(0x8130, 0x813f, FORCEWAKE_MEDIA),
 	GEN_FW_RANGE(0x8140, 0x815f, FORCEWAKE_RENDER),
 	GEN_FW_RANGE(0x8300, 0x84ff, FORCEWAKE_RENDER),
+	GEN_FW_RANGE(0x8800, 0x89ff, FORCEWAKE_MEDIA),
 	GEN_FW_RANGE(0x8c00, 0x8cff, FORCEWAKE_RENDER),
-	GEN_FW_RANGE(0xb000, 0xb47f, FORCEWAKE_RENDER),
-	GEN_FW_RANGE(0xe000, 0xe8ff, FORCEWAKE_RENDER),
-	GEN_FW_RANGE(0x24400, 0x247ff, FORCEWAKE_RENDER),
 	GEN_FW_RANGE(0x9400, 0x97ff, FORCEWAKE_RENDER | FORCEWAKE_MEDIA),
-	GEN_FW_RANGE(0x8130, 0x813f, FORCEWAKE_MEDIA),
-	GEN_FW_RANGE(0x8800, 0x89ff, FORCEWAKE_MEDIA),
+	GEN_FW_RANGE(0xb000, 0xb47f, FORCEWAKE_RENDER),
 	GEN_FW_RANGE(0xd000, 0xd7ff, FORCEWAKE_MEDIA),
+	GEN_FW_RANGE(0xe000, 0xe8ff, FORCEWAKE_RENDER),
 	GEN_FW_RANGE(0x12000, 0x13fff, FORCEWAKE_MEDIA),
 	GEN_FW_RANGE(0x1a000, 0x1e9ff, FORCEWAKE_MEDIA),
+	GEN_FW_RANGE(0x24400, 0x247ff, FORCEWAKE_RENDER),
 	GEN_FW_RANGE(0x30000, 0x3ffff, FORCEWAKE_MEDIA),
 };
 
@@ -1299,11 +1320,17 @@ void intel_uncore_init(struct drm_i915_private *dev_priv)
 	switch (INTEL_INFO(dev_priv)->gen) {
 	default:
 	case 9:
+		intel_fw_table_check(__gen9_fw_ranges,
+				     ARRAY_SIZE(__gen9_fw_ranges));
+
 		ASSIGN_WRITE_MMIO_VFUNCS(gen9);
 		ASSIGN_READ_MMIO_VFUNCS(gen9);
 		break;
 	case 8:
 		if (IS_CHERRYVIEW(dev_priv)) {
+			intel_fw_table_check(__chv_fw_ranges,
+					     ARRAY_SIZE(__chv_fw_ranges));
+
 			ASSIGN_WRITE_MMIO_VFUNCS(chv);
 			ASSIGN_READ_MMIO_VFUNCS(chv);
 
@@ -1317,6 +1344,9 @@ void intel_uncore_init(struct drm_i915_private *dev_priv)
 		ASSIGN_WRITE_MMIO_VFUNCS(gen6);
 
 		if (IS_VALLEYVIEW(dev_priv)) {
+			intel_fw_table_check(__vlv_fw_ranges,
+					     ARRAY_SIZE(__vlv_fw_ranges));
+
 			ASSIGN_READ_MMIO_VFUNCS(vlv);
 		} else {
 			ASSIGN_READ_MMIO_VFUNCS(gen6);
-- 
2.7.4

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH 06/14] drm/i915: Use binary search when looking up forcewake domains
  2016-09-30 17:48 [PATCH v2 00/14] Forcewake binary search & code shrink Tvrtko Ursulin
                   ` (4 preceding siblings ...)
  2016-09-30 17:48 ` [PATCH 05/14] drm/i915: Sort forcewake mapping tables Tvrtko Ursulin
@ 2016-09-30 17:48 ` Tvrtko Ursulin
  2016-09-30 17:48 ` [PATCH 07/14] drm/i915: Eliminate Gen9 special case Tvrtko Ursulin
                   ` (10 subsequent siblings)
  16 siblings, 0 replies; 32+ messages in thread
From: Tvrtko Ursulin @ 2016-09-30 17:48 UTC (permalink / raw)
  To: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Instead of the existing linear seach, now that we have sorted
range tables, we can do a binary search on them for some
potential miniscule performance gain, but more importantly
for elegance and code size. Hopefully the perfomance gain is
sufficient to offset the function calls which were not there
before.

v2: Removed const cast away.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/intel_uncore.c | 27 +++++++++++++++++++--------
 1 file changed, 19 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
index 5ca65de340b7..60fe796fdd0c 100644
--- a/drivers/gpu/drm/i915/intel_uncore.c
+++ b/drivers/gpu/drm/i915/intel_uncore.c
@@ -26,6 +26,7 @@
 #include "i915_vgpu.h"
 
 #include <linux/pm_runtime.h>
+#include <linux/bsearch.h>
 
 #define FORCEWAKE_ACK_TIMEOUT_MS 50
 
@@ -589,20 +590,30 @@ struct intel_forcewake_range
 	enum forcewake_domains domains;
 };
 
+static int fw_range_cmp(const void *key, const void *elt)
+{
+	const struct intel_forcewake_range *entry = elt;
+	u32 offset = (u32)((unsigned long)key);
+
+	if (offset < entry->start)
+		return -1;
+	else if (offset > entry->end)
+		return 1;
+	else
+		return 0;
+}
+
 static enum forcewake_domains
 find_fw_domain(u32 offset, const struct intel_forcewake_range *ranges,
 	       unsigned int num_ranges)
 {
-	unsigned int i;
-	struct intel_forcewake_range *entry =
-		(struct intel_forcewake_range *)ranges;
+	struct intel_forcewake_range *entry;
 
-	for (i = 0; i < num_ranges; i++, entry++) {
-		if (offset >= entry->start && offset <= entry->end)
-			return entry->domains;
-	}
+	entry = bsearch((void *)(unsigned long)offset, (const void *)ranges,
+			num_ranges, sizeof(struct intel_forcewake_range),
+			fw_range_cmp);
 
-	return -1;
+	return entry ? entry->domains : -1;
 }
 
 static void
-- 
2.7.4

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH 07/14] drm/i915: Eliminate Gen9 special case
  2016-09-30 17:48 [PATCH v2 00/14] Forcewake binary search & code shrink Tvrtko Ursulin
                   ` (5 preceding siblings ...)
  2016-09-30 17:48 ` [PATCH 06/14] drm/i915: Use binary search when looking up forcewake domains Tvrtko Ursulin
@ 2016-09-30 17:48 ` Tvrtko Ursulin
  2016-09-30 17:48 ` [PATCH 08/14] drm/i915: Store the active forcewake range table pointer Tvrtko Ursulin
                   ` (9 subsequent siblings)
  16 siblings, 0 replies; 32+ messages in thread
From: Tvrtko Ursulin @ 2016-09-30 17:48 UTC (permalink / raw)
  To: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

If we insert blitter forcewake domain entries in the range
table we can eliminate that special case and simplify the
code in a few macros. This will enable more unification later.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/intel_uncore.c | 42 ++++++++++++++++++-------------------
 1 file changed, 21 insertions(+), 21 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
index 60fe796fdd0c..04ed1721c69f 100644
--- a/drivers/gpu/drm/i915/intel_uncore.c
+++ b/drivers/gpu/drm/i915/intel_uncore.c
@@ -613,7 +613,7 @@ find_fw_domain(u32 offset, const struct intel_forcewake_range *ranges,
 			num_ranges, sizeof(struct intel_forcewake_range),
 			fw_range_cmp);
 
-	return entry ? entry->domains : -1;
+	return entry ? entry->domains : 0;
 }
 
 static void
@@ -651,12 +651,9 @@ static const struct intel_forcewake_range __vlv_fw_ranges[] = {
 #define __vlv_reg_read_fw_domains(offset) \
 ({ \
 	enum forcewake_domains __fwd = 0; \
-	if (NEEDS_FORCE_WAKE((offset))) { \
+	if (NEEDS_FORCE_WAKE((offset))) \
 		__fwd = find_fw_domain(offset, __vlv_fw_ranges, \
 				       ARRAY_SIZE(__vlv_fw_ranges)); \
-		if (__fwd == -1 ) \
-			__fwd = 0; \
-	} \
 	__fwd; \
 })
 
@@ -713,57 +710,63 @@ static const struct intel_forcewake_range __chv_fw_ranges[] = {
 #define __chv_reg_read_fw_domains(offset) \
 ({ \
 	enum forcewake_domains __fwd = 0; \
-	if (NEEDS_FORCE_WAKE((offset))) { \
+	if (NEEDS_FORCE_WAKE((offset))) \
 		__fwd = find_fw_domain(offset, __chv_fw_ranges, \
 				       ARRAY_SIZE(__chv_fw_ranges)); \
-		if (__fwd == -1 ) \
-			__fwd = 0; \
-	} \
 	__fwd; \
 })
 
 #define __chv_reg_write_fw_domains(offset) \
 ({ \
 	enum forcewake_domains __fwd = 0; \
-	if (NEEDS_FORCE_WAKE((offset)) && !is_gen8_shadowed(offset)) { \
+	if (NEEDS_FORCE_WAKE((offset)) && !is_gen8_shadowed(offset)) \
 		__fwd = find_fw_domain(offset, __chv_fw_ranges, \
 				       ARRAY_SIZE(__chv_fw_ranges)); \
-		if (__fwd == -1 ) \
-			__fwd = 0; \
-	} \
 	__fwd; \
 })
 
 /* *Must* be sorted by offset ranges! See intel_fw_table_check(). */
 static const struct intel_forcewake_range __gen9_fw_ranges[] = {
+	GEN_FW_RANGE(0x0, 0xaff, FORCEWAKE_BLITTER),
 	GEN_FW_RANGE(0xb00, 0x1fff, 0), /* uncore range */
 	GEN_FW_RANGE(0x2000, 0x26ff, FORCEWAKE_RENDER),
+	GEN_FW_RANGE(0x2700, 0x2fff, FORCEWAKE_BLITTER),
 	GEN_FW_RANGE(0x3000, 0x3fff, FORCEWAKE_RENDER),
+	GEN_FW_RANGE(0x4000, 0x51ff, FORCEWAKE_BLITTER),
 	GEN_FW_RANGE(0x5200, 0x7fff, FORCEWAKE_RENDER),
+	GEN_FW_RANGE(0x8000, 0x812f, FORCEWAKE_BLITTER),
 	GEN_FW_RANGE(0x8130, 0x813f, FORCEWAKE_MEDIA),
 	GEN_FW_RANGE(0x8140, 0x815f, FORCEWAKE_RENDER),
+	GEN_FW_RANGE(0x8160, 0x82ff, FORCEWAKE_BLITTER),
 	GEN_FW_RANGE(0x8300, 0x84ff, FORCEWAKE_RENDER),
+	GEN_FW_RANGE(0x8500, 0x87ff, FORCEWAKE_BLITTER),
 	GEN_FW_RANGE(0x8800, 0x89ff, FORCEWAKE_MEDIA),
+	GEN_FW_RANGE(0x8a00, 0x8bff, FORCEWAKE_BLITTER),
 	GEN_FW_RANGE(0x8c00, 0x8cff, FORCEWAKE_RENDER),
+	GEN_FW_RANGE(0x8d00, 0x93ff, FORCEWAKE_BLITTER),
 	GEN_FW_RANGE(0x9400, 0x97ff, FORCEWAKE_RENDER | FORCEWAKE_MEDIA),
+	GEN_FW_RANGE(0x9800, 0xafff, FORCEWAKE_BLITTER),
 	GEN_FW_RANGE(0xb000, 0xb47f, FORCEWAKE_RENDER),
+	GEN_FW_RANGE(0xb480, 0xbfff, FORCEWAKE_BLITTER),
 	GEN_FW_RANGE(0xd000, 0xd7ff, FORCEWAKE_MEDIA),
+	GEN_FW_RANGE(0xd800, 0xdfff, FORCEWAKE_BLITTER),
 	GEN_FW_RANGE(0xe000, 0xe8ff, FORCEWAKE_RENDER),
+	GEN_FW_RANGE(0xe900, 0x11fff, FORCEWAKE_BLITTER),
 	GEN_FW_RANGE(0x12000, 0x13fff, FORCEWAKE_MEDIA),
+	GEN_FW_RANGE(0x14000, 0x19fff, FORCEWAKE_BLITTER),
 	GEN_FW_RANGE(0x1a000, 0x1e9ff, FORCEWAKE_MEDIA),
+	GEN_FW_RANGE(0x1ea00, 0x243ff, FORCEWAKE_BLITTER),
 	GEN_FW_RANGE(0x24400, 0x247ff, FORCEWAKE_RENDER),
+	GEN_FW_RANGE(0x24800, 0x2ffff, FORCEWAKE_BLITTER),
 	GEN_FW_RANGE(0x30000, 0x3ffff, FORCEWAKE_MEDIA),
 };
 
 #define __gen9_reg_read_fw_domains(offset) \
 ({ \
 	enum forcewake_domains __fwd = 0; \
-	if (NEEDS_FORCE_WAKE((offset))) { \
+	if (NEEDS_FORCE_WAKE((offset))) \
 		__fwd = find_fw_domain(offset, __gen9_fw_ranges, \
 				       ARRAY_SIZE(__gen9_fw_ranges)); \
-		if (__fwd == -1 ) \
-			__fwd = FORCEWAKE_BLITTER; \
-	} \
 	__fwd; \
 })
 
@@ -790,12 +793,9 @@ static bool is_gen9_shadowed(u32 offset)
 #define __gen9_reg_write_fw_domains(offset) \
 ({ \
 	enum forcewake_domains __fwd = 0; \
-	if (NEEDS_FORCE_WAKE((offset)) && !is_gen9_shadowed(offset)) { \
+	if (NEEDS_FORCE_WAKE((offset)) && !is_gen9_shadowed(offset)) \
 		__fwd = find_fw_domain(offset, __gen9_fw_ranges, \
 				       ARRAY_SIZE(__gen9_fw_ranges)); \
-		if (__fwd == -1 ) \
-			__fwd = FORCEWAKE_BLITTER; \
-	} \
 	__fwd; \
 })
 
-- 
2.7.4

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH 08/14] drm/i915: Store the active forcewake range table pointer
  2016-09-30 17:48 [PATCH v2 00/14] Forcewake binary search & code shrink Tvrtko Ursulin
                   ` (6 preceding siblings ...)
  2016-09-30 17:48 ` [PATCH 07/14] drm/i915: Eliminate Gen9 special case Tvrtko Ursulin
@ 2016-09-30 17:48 ` Tvrtko Ursulin
  2016-10-03  8:39   ` Tvrtko Ursulin
  2016-09-30 17:48 ` [PATCH 09/14] drm/i915: Remove identical macros Tvrtko Ursulin
                   ` (8 subsequent siblings)
  16 siblings, 1 reply; 32+ messages in thread
From: Tvrtko Ursulin @ 2016-09-30 17:48 UTC (permalink / raw)
  To: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

If we store this in the uncore structure we are on a good way to
show more commonality between the per-platform implementations.

v2: Constify table pointer and correct coding style. (Chris Wilson)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h     | 10 +++++
 drivers/gpu/drm/i915/intel_uncore.c | 73 ++++++++++++++++++-------------------
 2 files changed, 46 insertions(+), 37 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 9b1997fd8c87..3b14938aa09e 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -582,9 +582,19 @@ struct intel_uncore_funcs {
 				uint32_t val, bool trace);
 };
 
+struct intel_forcewake_range {
+	u32 start;
+	u32 end;
+
+	enum forcewake_domains domains;
+};
+
 struct intel_uncore {
 	spinlock_t lock; /** lock is also taken in irq contexts. */
 
+	const struct intel_forcewake_range *fw_domains_table;
+	unsigned int fw_domains_table_entries;
+
 	struct intel_uncore_funcs funcs;
 
 	unsigned fifo_count;
diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
index 04ed1721c69f..050596fe90c5 100644
--- a/drivers/gpu/drm/i915/intel_uncore.c
+++ b/drivers/gpu/drm/i915/intel_uncore.c
@@ -582,14 +582,6 @@ void assert_forcewakes_inactive(struct drm_i915_private *dev_priv)
 	__fwd; \
 })
 
-struct intel_forcewake_range
-{
-	u32 start;
-	u32 end;
-
-	enum forcewake_domains domains;
-};
-
 static int fw_range_cmp(const void *key, const void *elt)
 {
 	const struct intel_forcewake_range *entry = elt;
@@ -604,28 +596,37 @@ static int fw_range_cmp(const void *key, const void *elt)
 }
 
 static enum forcewake_domains
-find_fw_domain(u32 offset, const struct intel_forcewake_range *ranges,
-	       unsigned int num_ranges)
+find_fw_domain(struct drm_i915_private *dev_priv, u32 offset)
 {
-	struct intel_forcewake_range *entry;
+	const struct intel_forcewake_range *table, *entry;
+	unsigned int num_entries;
 
-	entry = bsearch((void *)(unsigned long)offset, (const void *)ranges,
-			num_ranges, sizeof(struct intel_forcewake_range),
+	table = dev_priv->uncore.fw_domains_table;
+	num_entries = dev_priv->uncore.fw_domains_table_entries;
+
+	entry = bsearch((void *)(unsigned long)offset, (const void *)table,
+			num_entries, sizeof(struct intel_forcewake_range),
 			fw_range_cmp);
 
 	return entry ? entry->domains : 0;
 }
 
 static void
-intel_fw_table_check(const struct intel_forcewake_range *ranges,
-		     unsigned int num_ranges)
+intel_fw_table_check(struct drm_i915_private *dev_priv)
 {
 #ifdef CONFIG_DRM_I915_DEBUG
 	unsigned int i;
-	const struct intel_forcewake_range *entry = ranges;
-	s32 prev = -1;
+	const struct intel_forcewake_range *entry;
+	unsigned int num_ranges;
+	s32 prev;
+
+	entry = dev_priv->uncore.fw_domains_table;
+	if (!entry)
+		return;
 
-	for (i = 0; i < num_ranges; i++, entry++) {
+	num_ranges = dev_priv->uncore.fw_domains_table_entries;
+
+	for (i = 0, prev = -1; i < num_ranges; i++, entry++) {
 		WARN_ON_ONCE(prev >= (s32)entry->start);
 		prev = entry->start;
 		WARN_ON_ONCE(prev >= (s32)entry->end);
@@ -652,8 +653,7 @@ static const struct intel_forcewake_range __vlv_fw_ranges[] = {
 ({ \
 	enum forcewake_domains __fwd = 0; \
 	if (NEEDS_FORCE_WAKE((offset))) \
-		__fwd = find_fw_domain(offset, __vlv_fw_ranges, \
-				       ARRAY_SIZE(__vlv_fw_ranges)); \
+		__fwd = find_fw_domain(dev_priv, offset); \
 	__fwd; \
 })
 
@@ -711,8 +711,7 @@ static const struct intel_forcewake_range __chv_fw_ranges[] = {
 ({ \
 	enum forcewake_domains __fwd = 0; \
 	if (NEEDS_FORCE_WAKE((offset))) \
-		__fwd = find_fw_domain(offset, __chv_fw_ranges, \
-				       ARRAY_SIZE(__chv_fw_ranges)); \
+		__fwd = find_fw_domain(dev_priv, offset); \
 	__fwd; \
 })
 
@@ -720,8 +719,7 @@ static const struct intel_forcewake_range __chv_fw_ranges[] = {
 ({ \
 	enum forcewake_domains __fwd = 0; \
 	if (NEEDS_FORCE_WAKE((offset)) && !is_gen8_shadowed(offset)) \
-		__fwd = find_fw_domain(offset, __chv_fw_ranges, \
-				       ARRAY_SIZE(__chv_fw_ranges)); \
+		__fwd = find_fw_domain(dev_priv, offset); \
 	__fwd; \
 })
 
@@ -765,8 +763,7 @@ static const struct intel_forcewake_range __gen9_fw_ranges[] = {
 ({ \
 	enum forcewake_domains __fwd = 0; \
 	if (NEEDS_FORCE_WAKE((offset))) \
-		__fwd = find_fw_domain(offset, __gen9_fw_ranges, \
-				       ARRAY_SIZE(__gen9_fw_ranges)); \
+		__fwd = find_fw_domain(dev_priv, offset); \
 	__fwd; \
 })
 
@@ -794,8 +791,7 @@ static bool is_gen9_shadowed(u32 offset)
 ({ \
 	enum forcewake_domains __fwd = 0; \
 	if (NEEDS_FORCE_WAKE((offset)) && !is_gen9_shadowed(offset)) \
-		__fwd = find_fw_domain(offset, __gen9_fw_ranges, \
-				       ARRAY_SIZE(__gen9_fw_ranges)); \
+		__fwd = find_fw_domain(dev_priv, offset); \
 	__fwd; \
 })
 
@@ -1318,6 +1314,13 @@ static void intel_uncore_fw_domains_init(struct drm_i915_private *dev_priv)
 	WARN_ON(dev_priv->uncore.fw_domains == 0);
 }
 
+#define ASSIGN_FW_DOMAINS_TABLE(d) \
+{ \
+	dev_priv->uncore.fw_domains_table = \
+			(struct intel_forcewake_range *)(d); \
+	dev_priv->uncore.fw_domains_table_entries = ARRAY_SIZE((d)); \
+}
+
 void intel_uncore_init(struct drm_i915_private *dev_priv)
 {
 	i915_check_vgpu(dev_priv);
@@ -1331,17 +1334,13 @@ void intel_uncore_init(struct drm_i915_private *dev_priv)
 	switch (INTEL_INFO(dev_priv)->gen) {
 	default:
 	case 9:
-		intel_fw_table_check(__gen9_fw_ranges,
-				     ARRAY_SIZE(__gen9_fw_ranges));
-
+		ASSIGN_FW_DOMAINS_TABLE(__gen9_fw_ranges);
 		ASSIGN_WRITE_MMIO_VFUNCS(gen9);
 		ASSIGN_READ_MMIO_VFUNCS(gen9);
 		break;
 	case 8:
 		if (IS_CHERRYVIEW(dev_priv)) {
-			intel_fw_table_check(__chv_fw_ranges,
-					     ARRAY_SIZE(__chv_fw_ranges));
-
+			ASSIGN_FW_DOMAINS_TABLE(__chv_fw_ranges);
 			ASSIGN_WRITE_MMIO_VFUNCS(chv);
 			ASSIGN_READ_MMIO_VFUNCS(chv);
 
@@ -1355,9 +1354,7 @@ void intel_uncore_init(struct drm_i915_private *dev_priv)
 		ASSIGN_WRITE_MMIO_VFUNCS(gen6);
 
 		if (IS_VALLEYVIEW(dev_priv)) {
-			intel_fw_table_check(__vlv_fw_ranges,
-					     ARRAY_SIZE(__vlv_fw_ranges));
-
+			ASSIGN_FW_DOMAINS_TABLE(__vlv_fw_ranges);
 			ASSIGN_READ_MMIO_VFUNCS(vlv);
 		} else {
 			ASSIGN_READ_MMIO_VFUNCS(gen6);
@@ -1375,6 +1372,8 @@ void intel_uncore_init(struct drm_i915_private *dev_priv)
 		break;
 	}
 
+	intel_fw_table_check(dev_priv);
+
 	if (intel_vgpu_active(dev_priv)) {
 		ASSIGN_WRITE_MMIO_VFUNCS(vgpu);
 		ASSIGN_READ_MMIO_VFUNCS(vgpu);
-- 
2.7.4

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH 09/14] drm/i915: Remove identical macros
  2016-09-30 17:48 [PATCH v2 00/14] Forcewake binary search & code shrink Tvrtko Ursulin
                   ` (7 preceding siblings ...)
  2016-09-30 17:48 ` [PATCH 08/14] drm/i915: Store the active forcewake range table pointer Tvrtko Ursulin
@ 2016-09-30 17:48 ` Tvrtko Ursulin
  2016-09-30 17:48 ` [PATCH 10/14] drm/i915: Remove identical mmio read functions Tvrtko Ursulin
                   ` (7 subsequent siblings)
  16 siblings, 0 replies; 32+ messages in thread
From: Tvrtko Ursulin @ 2016-09-30 17:48 UTC (permalink / raw)
  To: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Remove some macros which are now obviously identical.

v2: Added HAS_FWTABLE macro and simplified intel_uncore_forcewake_for_read.
    (Joonas Lahtinen)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/intel_uncore.c | 70 +++++++++++--------------------------
 1 file changed, 20 insertions(+), 50 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
index 050596fe90c5..8e46e793ac1b 100644
--- a/drivers/gpu/drm/i915/intel_uncore.c
+++ b/drivers/gpu/drm/i915/intel_uncore.c
@@ -638,6 +638,11 @@ intel_fw_table_check(struct drm_i915_private *dev_priv)
 #define GEN_FW_RANGE(s, e, d) \
 	{ .start = (s), .end = (e), .domains = (d) }
 
+#define HAS_FWTABLE(dev_priv) \
+	(IS_GEN9(dev_priv) || \
+	 IS_CHERRYVIEW(dev_priv) || \
+	 IS_VALLEYVIEW(dev_priv))
+
 /* *Must* be sorted by offset ranges! See intel_fw_table_check(). */
 static const struct intel_forcewake_range __vlv_fw_ranges[] = {
 	GEN_FW_RANGE(0x2000, 0x3fff, FORCEWAKE_RENDER),
@@ -649,7 +654,7 @@ static const struct intel_forcewake_range __vlv_fw_ranges[] = {
 	GEN_FW_RANGE(0x30000, 0x3ffff, FORCEWAKE_MEDIA),
 };
 
-#define __vlv_reg_read_fw_domains(offset) \
+#define __fwtable_reg_read_fw_domains(offset) \
 ({ \
 	enum forcewake_domains __fwd = 0; \
 	if (NEEDS_FORCE_WAKE((offset))) \
@@ -707,14 +712,6 @@ static const struct intel_forcewake_range __chv_fw_ranges[] = {
 	GEN_FW_RANGE(0x30000, 0x37fff, FORCEWAKE_MEDIA),
 };
 
-#define __chv_reg_read_fw_domains(offset) \
-({ \
-	enum forcewake_domains __fwd = 0; \
-	if (NEEDS_FORCE_WAKE((offset))) \
-		__fwd = find_fw_domain(dev_priv, offset); \
-	__fwd; \
-})
-
 #define __chv_reg_write_fw_domains(offset) \
 ({ \
 	enum forcewake_domains __fwd = 0; \
@@ -759,14 +756,6 @@ static const struct intel_forcewake_range __gen9_fw_ranges[] = {
 	GEN_FW_RANGE(0x30000, 0x3ffff, FORCEWAKE_MEDIA),
 };
 
-#define __gen9_reg_read_fw_domains(offset) \
-({ \
-	enum forcewake_domains __fwd = 0; \
-	if (NEEDS_FORCE_WAKE((offset))) \
-		__fwd = find_fw_domain(dev_priv, offset); \
-	__fwd; \
-})
-
 static const i915_reg_t gen9_shadowed_regs[] = {
 	RING_TAIL(RENDER_RING_BASE),
 	RING_TAIL(GEN6_BSD_RING_BASE),
@@ -927,7 +916,7 @@ static u##x \
 vlv_read##x(struct drm_i915_private *dev_priv, i915_reg_t reg, bool trace) { \
 	enum forcewake_domains fw_engine; \
 	GEN6_READ_HEADER(x); \
-	fw_engine = __vlv_reg_read_fw_domains(offset); \
+	fw_engine = __fwtable_reg_read_fw_domains(offset); \
 	if (fw_engine) \
 		__force_wake_auto(dev_priv, fw_engine); \
 	val = __raw_i915_read##x(dev_priv, reg); \
@@ -939,7 +928,7 @@ static u##x \
 chv_read##x(struct drm_i915_private *dev_priv, i915_reg_t reg, bool trace) { \
 	enum forcewake_domains fw_engine; \
 	GEN6_READ_HEADER(x); \
-	fw_engine = __chv_reg_read_fw_domains(offset); \
+	fw_engine = __fwtable_reg_read_fw_domains(offset); \
 	if (fw_engine) \
 		__force_wake_auto(dev_priv, fw_engine); \
 	val = __raw_i915_read##x(dev_priv, reg); \
@@ -951,7 +940,7 @@ static u##x \
 gen9_read##x(struct drm_i915_private *dev_priv, i915_reg_t reg, bool trace) { \
 	enum forcewake_domains fw_engine; \
 	GEN6_READ_HEADER(x); \
-	fw_engine = __gen9_reg_read_fw_domains(offset); \
+	fw_engine = __fwtable_reg_read_fw_domains(offset); \
 	if (fw_engine) \
 		__force_wake_auto(dev_priv, fw_engine); \
 	val = __raw_i915_read##x(dev_priv, reg); \
@@ -1823,35 +1812,16 @@ static enum forcewake_domains
 intel_uncore_forcewake_for_read(struct drm_i915_private *dev_priv,
 				i915_reg_t reg)
 {
+	u32 offset = i915_mmio_reg_offset(reg);
 	enum forcewake_domains fw_domains;
 
-	if (intel_vgpu_active(dev_priv))
-		return 0;
-
-	switch (INTEL_GEN(dev_priv)) {
-	case 9:
-		fw_domains = __gen9_reg_read_fw_domains(i915_mmio_reg_offset(reg));
-		break;
-	case 8:
-		if (IS_CHERRYVIEW(dev_priv))
-			fw_domains = __chv_reg_read_fw_domains(i915_mmio_reg_offset(reg));
-		else
-			fw_domains = __gen6_reg_read_fw_domains(i915_mmio_reg_offset(reg));
-		break;
-	case 7:
-	case 6:
-		if (IS_VALLEYVIEW(dev_priv))
-			fw_domains = __vlv_reg_read_fw_domains(i915_mmio_reg_offset(reg));
-		else
-			fw_domains = __gen6_reg_read_fw_domains(i915_mmio_reg_offset(reg));
-		break;
-	default:
-		MISSING_CASE(INTEL_INFO(dev_priv)->gen);
-	case 5: /* forcewake was introduced with gen6 */
-	case 4:
-	case 3:
-	case 2:
-		return 0;
+	if (HAS_FWTABLE(dev_priv)) {
+		fw_domains = __fwtable_reg_read_fw_domains(offset);
+	} else if (INTEL_GEN(dev_priv) >= 6) {
+		fw_domains = __gen6_reg_read_fw_domains(offset);
+	} else {
+		WARN_ON(!IS_GEN(dev_priv, 2, 5));
+		fw_domains = 0;
 	}
 
 	WARN_ON(fw_domains & ~dev_priv->uncore.fw_domains);
@@ -1865,9 +1835,6 @@ intel_uncore_forcewake_for_write(struct drm_i915_private *dev_priv,
 {
 	enum forcewake_domains fw_domains;
 
-	if (intel_vgpu_active(dev_priv))
-		return 0;
-
 	switch (INTEL_GEN(dev_priv)) {
 	case 9:
 		fw_domains = __gen9_reg_write_fw_domains(i915_mmio_reg_offset(reg));
@@ -1918,6 +1885,9 @@ intel_uncore_forcewake_for_reg(struct drm_i915_private *dev_priv,
 
 	WARN_ON(!op);
 
+	if (intel_vgpu_active(dev_priv))
+		return 0;
+
 	if (op & FW_REG_READ)
 		fw_domains = intel_uncore_forcewake_for_read(dev_priv, reg);
 
-- 
2.7.4

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH 10/14] drm/i915: Remove identical mmio read functions
  2016-09-30 17:48 [PATCH v2 00/14] Forcewake binary search & code shrink Tvrtko Ursulin
                   ` (8 preceding siblings ...)
  2016-09-30 17:48 ` [PATCH 09/14] drm/i915: Remove identical macros Tvrtko Ursulin
@ 2016-09-30 17:48 ` Tvrtko Ursulin
  2016-09-30 17:48 ` [PATCH 11/14] drm/i915: Remove identical write mmmio functions Tvrtko Ursulin
                   ` (6 subsequent siblings)
  16 siblings, 0 replies; 32+ messages in thread
From: Tvrtko Ursulin @ 2016-09-30 17:48 UTC (permalink / raw)
  To: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

It is now obvious VLV, CHV and Gen9 mmio read fcuntions are
completely identical so we can remove the three copies and
just keep the newly named generic implementation.

v2: Use fwtable naming consistently. (Joonas Lahtinen)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
---
 drivers/gpu/drm/i915/intel_uncore.c | 54 +++++++------------------------------
 1 file changed, 10 insertions(+), 44 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
index 8e46e793ac1b..9d870b902947 100644
--- a/drivers/gpu/drm/i915/intel_uncore.c
+++ b/drivers/gpu/drm/i915/intel_uncore.c
@@ -911,9 +911,9 @@ gen6_read##x(struct drm_i915_private *dev_priv, i915_reg_t reg, bool trace) { \
 	GEN6_READ_FOOTER; \
 }
 
-#define __vlv_read(x) \
+#define __fwtable_read(x) \
 static u##x \
-vlv_read##x(struct drm_i915_private *dev_priv, i915_reg_t reg, bool trace) { \
+fwtable_read##x(struct drm_i915_private *dev_priv, i915_reg_t reg, bool trace) { \
 	enum forcewake_domains fw_engine; \
 	GEN6_READ_HEADER(x); \
 	fw_engine = __fwtable_reg_read_fw_domains(offset); \
@@ -923,50 +923,16 @@ vlv_read##x(struct drm_i915_private *dev_priv, i915_reg_t reg, bool trace) { \
 	GEN6_READ_FOOTER; \
 }
 
-#define __chv_read(x) \
-static u##x \
-chv_read##x(struct drm_i915_private *dev_priv, i915_reg_t reg, bool trace) { \
-	enum forcewake_domains fw_engine; \
-	GEN6_READ_HEADER(x); \
-	fw_engine = __fwtable_reg_read_fw_domains(offset); \
-	if (fw_engine) \
-		__force_wake_auto(dev_priv, fw_engine); \
-	val = __raw_i915_read##x(dev_priv, reg); \
-	GEN6_READ_FOOTER; \
-}
-
-#define __gen9_read(x) \
-static u##x \
-gen9_read##x(struct drm_i915_private *dev_priv, i915_reg_t reg, bool trace) { \
-	enum forcewake_domains fw_engine; \
-	GEN6_READ_HEADER(x); \
-	fw_engine = __fwtable_reg_read_fw_domains(offset); \
-	if (fw_engine) \
-		__force_wake_auto(dev_priv, fw_engine); \
-	val = __raw_i915_read##x(dev_priv, reg); \
-	GEN6_READ_FOOTER; \
-}
-
-__gen9_read(8)
-__gen9_read(16)
-__gen9_read(32)
-__gen9_read(64)
-__chv_read(8)
-__chv_read(16)
-__chv_read(32)
-__chv_read(64)
-__vlv_read(8)
-__vlv_read(16)
-__vlv_read(32)
-__vlv_read(64)
+__fwtable_read(8)
+__fwtable_read(16)
+__fwtable_read(32)
+__fwtable_read(64)
 __gen6_read(8)
 __gen6_read(16)
 __gen6_read(32)
 __gen6_read(64)
 
-#undef __gen9_read
-#undef __chv_read
-#undef __vlv_read
+#undef __fwtable_read
 #undef __gen6_read
 #undef GEN6_READ_FOOTER
 #undef GEN6_READ_HEADER
@@ -1325,13 +1291,13 @@ void intel_uncore_init(struct drm_i915_private *dev_priv)
 	case 9:
 		ASSIGN_FW_DOMAINS_TABLE(__gen9_fw_ranges);
 		ASSIGN_WRITE_MMIO_VFUNCS(gen9);
-		ASSIGN_READ_MMIO_VFUNCS(gen9);
+		ASSIGN_READ_MMIO_VFUNCS(fwtable);
 		break;
 	case 8:
 		if (IS_CHERRYVIEW(dev_priv)) {
 			ASSIGN_FW_DOMAINS_TABLE(__chv_fw_ranges);
 			ASSIGN_WRITE_MMIO_VFUNCS(chv);
-			ASSIGN_READ_MMIO_VFUNCS(chv);
+			ASSIGN_READ_MMIO_VFUNCS(fwtable);
 
 		} else {
 			ASSIGN_WRITE_MMIO_VFUNCS(gen8);
@@ -1344,7 +1310,7 @@ void intel_uncore_init(struct drm_i915_private *dev_priv)
 
 		if (IS_VALLEYVIEW(dev_priv)) {
 			ASSIGN_FW_DOMAINS_TABLE(__vlv_fw_ranges);
-			ASSIGN_READ_MMIO_VFUNCS(vlv);
+			ASSIGN_READ_MMIO_VFUNCS(fwtable);
 		} else {
 			ASSIGN_READ_MMIO_VFUNCS(gen6);
 		}
-- 
2.7.4

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH 11/14] drm/i915: Remove identical write mmmio functions
  2016-09-30 17:48 [PATCH v2 00/14] Forcewake binary search & code shrink Tvrtko Ursulin
                   ` (9 preceding siblings ...)
  2016-09-30 17:48 ` [PATCH 10/14] drm/i915: Remove identical mmio read functions Tvrtko Ursulin
@ 2016-09-30 17:48 ` Tvrtko Ursulin
  2016-09-30 17:48 ` [PATCH 12/14] drm/i915: Sort the shadow register table Tvrtko Ursulin
                   ` (5 subsequent siblings)
  16 siblings, 0 replies; 32+ messages in thread
From: Tvrtko Ursulin @ 2016-09-30 17:48 UTC (permalink / raw)
  To: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

We notice two identical copies of the shadow register table and
following from that removal can also unify CHV and Gen9 write
mmio functions and macros into a single implementation.

v2: Name fwtable consistently and use HAS_FWTABLE. (Joonas Lahtinen)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/intel_uncore.c | 94 ++++++++-----------------------------
 1 file changed, 19 insertions(+), 75 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
index 9d870b902947..dee4e93d482d 100644
--- a/drivers/gpu/drm/i915/intel_uncore.c
+++ b/drivers/gpu/drm/i915/intel_uncore.c
@@ -712,7 +712,7 @@ static const struct intel_forcewake_range __chv_fw_ranges[] = {
 	GEN_FW_RANGE(0x30000, 0x37fff, FORCEWAKE_MEDIA),
 };
 
-#define __chv_reg_write_fw_domains(offset) \
+#define __fwtable_reg_write_fw_domains(offset) \
 ({ \
 	enum forcewake_domains __fwd = 0; \
 	if (NEEDS_FORCE_WAKE((offset)) && !is_gen8_shadowed(offset)) \
@@ -756,34 +756,6 @@ static const struct intel_forcewake_range __gen9_fw_ranges[] = {
 	GEN_FW_RANGE(0x30000, 0x3ffff, FORCEWAKE_MEDIA),
 };
 
-static const i915_reg_t gen9_shadowed_regs[] = {
-	RING_TAIL(RENDER_RING_BASE),
-	RING_TAIL(GEN6_BSD_RING_BASE),
-	RING_TAIL(VEBOX_RING_BASE),
-	RING_TAIL(BLT_RING_BASE),
-	GEN6_RPNSWREQ,
-	GEN6_RC_VIDEO_FREQ,
-	/* TODO: Other registers are not yet used */
-};
-
-static bool is_gen9_shadowed(u32 offset)
-{
-	int i;
-	for (i = 0; i < ARRAY_SIZE(gen9_shadowed_regs); i++)
-		if (offset == gen9_shadowed_regs[i].reg)
-			return true;
-
-	return false;
-}
-
-#define __gen9_reg_write_fw_domains(offset) \
-({ \
-	enum forcewake_domains __fwd = 0; \
-	if (NEEDS_FORCE_WAKE((offset)) && !is_gen9_shadowed(offset)) \
-		__fwd = find_fw_domain(dev_priv, offset); \
-	__fwd; \
-})
-
 static void
 ilk_dummy_write(struct drm_i915_private *dev_priv)
 {
@@ -1040,37 +1012,21 @@ gen8_write##x(struct drm_i915_private *dev_priv, i915_reg_t reg, u##x val, bool
 	GEN6_WRITE_FOOTER; \
 }
 
-#define __chv_write(x) \
+#define __fwtable_write(x) \
 static void \
-chv_write##x(struct drm_i915_private *dev_priv, i915_reg_t reg, u##x val, bool trace) { \
+fwtable_write##x(struct drm_i915_private *dev_priv, i915_reg_t reg, u##x val, bool trace) { \
 	enum forcewake_domains fw_engine; \
 	GEN6_WRITE_HEADER; \
-	fw_engine = __chv_reg_write_fw_domains(offset); \
+	fw_engine = __fwtable_reg_write_fw_domains(offset); \
 	if (fw_engine) \
 		__force_wake_auto(dev_priv, fw_engine); \
 	__raw_i915_write##x(dev_priv, reg, val); \
 	GEN6_WRITE_FOOTER; \
 }
 
-#define __gen9_write(x) \
-static void \
-gen9_write##x(struct drm_i915_private *dev_priv, i915_reg_t reg, u##x val, \
-		bool trace) { \
-	enum forcewake_domains fw_engine; \
-	GEN6_WRITE_HEADER; \
-	fw_engine = __gen9_reg_write_fw_domains(offset); \
-	if (fw_engine) \
-		__force_wake_auto(dev_priv, fw_engine); \
-	__raw_i915_write##x(dev_priv, reg, val); \
-	GEN6_WRITE_FOOTER; \
-}
-
-__gen9_write(8)
-__gen9_write(16)
-__gen9_write(32)
-__chv_write(8)
-__chv_write(16)
-__chv_write(32)
+__fwtable_write(8)
+__fwtable_write(16)
+__fwtable_write(32)
 __gen8_write(8)
 __gen8_write(16)
 __gen8_write(32)
@@ -1078,8 +1034,7 @@ __gen6_write(8)
 __gen6_write(16)
 __gen6_write(32)
 
-#undef __gen9_write
-#undef __chv_write
+#undef __fwtable_write
 #undef __gen8_write
 #undef __gen6_write
 #undef GEN6_WRITE_FOOTER
@@ -1290,13 +1245,13 @@ void intel_uncore_init(struct drm_i915_private *dev_priv)
 	default:
 	case 9:
 		ASSIGN_FW_DOMAINS_TABLE(__gen9_fw_ranges);
-		ASSIGN_WRITE_MMIO_VFUNCS(gen9);
+		ASSIGN_WRITE_MMIO_VFUNCS(fwtable);
 		ASSIGN_READ_MMIO_VFUNCS(fwtable);
 		break;
 	case 8:
 		if (IS_CHERRYVIEW(dev_priv)) {
 			ASSIGN_FW_DOMAINS_TABLE(__chv_fw_ranges);
-			ASSIGN_WRITE_MMIO_VFUNCS(chv);
+			ASSIGN_WRITE_MMIO_VFUNCS(fwtable);
 			ASSIGN_READ_MMIO_VFUNCS(fwtable);
 
 		} else {
@@ -1799,29 +1754,18 @@ static enum forcewake_domains
 intel_uncore_forcewake_for_write(struct drm_i915_private *dev_priv,
 				 i915_reg_t reg)
 {
+	u32 offset = i915_mmio_reg_offset(reg);
 	enum forcewake_domains fw_domains;
 
-	switch (INTEL_GEN(dev_priv)) {
-	case 9:
-		fw_domains = __gen9_reg_write_fw_domains(i915_mmio_reg_offset(reg));
-		break;
-	case 8:
-		if (IS_CHERRYVIEW(dev_priv))
-			fw_domains = __chv_reg_write_fw_domains(i915_mmio_reg_offset(reg));
-		else
-			fw_domains = __gen8_reg_write_fw_domains(i915_mmio_reg_offset(reg));
-		break;
-	case 7:
-	case 6:
+	if (HAS_FWTABLE(dev_priv) && !IS_VALLEYVIEW(dev_priv)) {
+		fw_domains = __fwtable_reg_write_fw_domains(offset);
+	} else if (IS_GEN8(dev_priv)) {
+		fw_domains = __gen8_reg_write_fw_domains(offset);
+	} else if (IS_GEN(dev_priv, 6, 7)) {
 		fw_domains = FORCEWAKE_RENDER;
-		break;
-	default:
-		MISSING_CASE(INTEL_INFO(dev_priv)->gen);
-	case 5:
-	case 4:
-	case 3:
-	case 2:
-		return 0;
+	} else {
+		WARN_ON(!IS_GEN(dev_priv, 2, 5));
+		fw_domains = 0;
 	}
 
 	WARN_ON(fw_domains & ~dev_priv->uncore.fw_domains);
-- 
2.7.4

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH 12/14] drm/i915: Sort the shadow register table
  2016-09-30 17:48 [PATCH v2 00/14] Forcewake binary search & code shrink Tvrtko Ursulin
                   ` (10 preceding siblings ...)
  2016-09-30 17:48 ` [PATCH 11/14] drm/i915: Remove identical write mmmio functions Tvrtko Ursulin
@ 2016-09-30 17:48 ` Tvrtko Ursulin
  2016-10-03  8:40   ` Tvrtko Ursulin
  2016-09-30 17:48 ` [PATCH 13/14] drm/i915: Use binary search when looking for shadowed registers Tvrtko Ursulin
                   ` (4 subsequent siblings)
  16 siblings, 1 reply; 32+ messages in thread
From: Tvrtko Ursulin @ 2016-09-30 17:48 UTC (permalink / raw)
  To: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Also verify the order at runtime. This was we can start using
binary search on it in a following patch.

v2: Add comment on the sorted array and only check it when
    debug option is enabled.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> (v1)
---
 drivers/gpu/drm/i915/intel_uncore.c | 31 +++++++++++++++++++++++++------
 1 file changed, 25 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
index dee4e93d482d..10b21516a7de 100644
--- a/drivers/gpu/drm/i915/intel_uncore.c
+++ b/drivers/gpu/drm/i915/intel_uncore.c
@@ -662,16 +662,33 @@ static const struct intel_forcewake_range __vlv_fw_ranges[] = {
 	__fwd; \
 })
 
+/* *Must* be sorted by offset! See intel_shadow_table_check(). */
 static const i915_reg_t gen8_shadowed_regs[] = {
-	GEN6_RPNSWREQ,
-	GEN6_RC_VIDEO_FREQ,
-	RING_TAIL(RENDER_RING_BASE),
-	RING_TAIL(GEN6_BSD_RING_BASE),
-	RING_TAIL(VEBOX_RING_BASE),
-	RING_TAIL(BLT_RING_BASE),
+	RING_TAIL(RENDER_RING_BASE),	/* 0x2000 (base) */
+	GEN6_RPNSWREQ,			/* 0xA008 */
+	GEN6_RC_VIDEO_FREQ,		/* 0xA00C */
+	RING_TAIL(GEN6_BSD_RING_BASE),	/* 0x12000 (base) */
+	RING_TAIL(VEBOX_RING_BASE),	/* 0x1a000 (base) */
+	RING_TAIL(BLT_RING_BASE),	/* 0x22000 (base) */
 	/* TODO: Other registers are not yet used */
 };
 
+static void intel_shadow_table_check(void)
+{
+#ifdef CONFIG_DRM_I915_DEBUG
+	const i915_reg_t *reg = gen8_shadowed_regs;
+	s32 prev = -1;
+	u32 offset;
+	unsigned int i;
+
+	for (i = 0; i < ARRAY_SIZE(gen8_shadowed_regs); i++, reg++) {
+		offset = i915_mmio_reg_offset(*reg);
+		WARN_ON_ONCE(prev >= (s32)offset);
+		prev = offset;
+	}
+#endif
+}
+
 static bool is_gen8_shadowed(u32 offset)
 {
 	int i;
@@ -1283,6 +1300,8 @@ void intel_uncore_init(struct drm_i915_private *dev_priv)
 	}
 
 	intel_fw_table_check(dev_priv);
+	if (INTEL_GEN(dev_priv) >= 8)
+		intel_shadow_table_check();
 
 	if (intel_vgpu_active(dev_priv)) {
 		ASSIGN_WRITE_MMIO_VFUNCS(vgpu);
-- 
2.7.4

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH 13/14] drm/i915: Use binary search when looking for shadowed registers
  2016-09-30 17:48 [PATCH v2 00/14] Forcewake binary search & code shrink Tvrtko Ursulin
                   ` (11 preceding siblings ...)
  2016-09-30 17:48 ` [PATCH 12/14] drm/i915: Sort the shadow register table Tvrtko Ursulin
@ 2016-09-30 17:48 ` Tvrtko Ursulin
  2016-09-30 18:08   ` Chris Wilson
  2016-09-30 17:48 ` [PATCH 14/14] drm/i915: Inline binary search Tvrtko Ursulin
                   ` (3 subsequent siblings)
  16 siblings, 1 reply; 32+ messages in thread
From: Tvrtko Ursulin @ 2016-09-30 17:48 UTC (permalink / raw)
  To: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Simply replace the linear search with the kernel's binary
search implementation. There is only six registers currently
in that table so this may not be that interesting. It adds a
function call so hopefully remains performance neutral for now.

v2: No need for manual conversion to bool for return.
    (Joonas Lahtinen)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
---
 drivers/gpu/drm/i915/intel_uncore.c | 26 +++++++++++++++++++++-----
 1 file changed, 21 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
index 10b21516a7de..70a7fef79846 100644
--- a/drivers/gpu/drm/i915/intel_uncore.c
+++ b/drivers/gpu/drm/i915/intel_uncore.c
@@ -689,14 +689,30 @@ static void intel_shadow_table_check(void)
 #endif
 }
 
+static int mmio_reg_cmp(const void *key, const void *elt)
+{
+	u32 offset = (u32)(unsigned long)key;
+	i915_reg_t *reg = (i915_reg_t *)elt;
+
+	if (offset < i915_mmio_reg_offset(*reg))
+		return -1;
+	else if (offset > i915_mmio_reg_offset(*reg))
+		return 1;
+	else
+		return 0;
+}
+
 static bool is_gen8_shadowed(u32 offset)
 {
-	int i;
-	for (i = 0; i < ARRAY_SIZE(gen8_shadowed_regs); i++)
-		if (offset == gen8_shadowed_regs[i].reg)
-			return true;
+	i915_reg_t *reg;
 
-	return false;
+	reg = bsearch((void *)(unsigned long)offset,
+		      (const void *)gen8_shadowed_regs,
+		      ARRAY_SIZE(gen8_shadowed_regs),
+		      sizeof(i915_reg_t),
+		      mmio_reg_cmp);
+
+	return reg;
 }
 
 #define __gen8_reg_write_fw_domains(offset) \
-- 
2.7.4

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH 14/14] drm/i915: Inline binary search
  2016-09-30 17:48 [PATCH v2 00/14] Forcewake binary search & code shrink Tvrtko Ursulin
                   ` (12 preceding siblings ...)
  2016-09-30 17:48 ` [PATCH 13/14] drm/i915: Use binary search when looking for shadowed registers Tvrtko Ursulin
@ 2016-09-30 17:48 ` Tvrtko Ursulin
  2016-09-30 18:09 ` [PATCH v2 00/14] Forcewake binary search & code shrink Chris Wilson
                   ` (2 subsequent siblings)
  16 siblings, 0 replies; 32+ messages in thread
From: Tvrtko Ursulin @ 2016-09-30 17:48 UTC (permalink / raw)
  To: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Instead of using bsearch library function make a local generator
macro out of it so the comparison callback can be inlined.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/intel_uncore.c | 56 ++++++++++++++++++++-----------------
 1 file changed, 31 insertions(+), 25 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
index 70a7fef79846..8e5357bdd914 100644
--- a/drivers/gpu/drm/i915/intel_uncore.c
+++ b/drivers/gpu/drm/i915/intel_uncore.c
@@ -26,7 +26,6 @@
 #include "i915_vgpu.h"
 
 #include <linux/pm_runtime.h>
-#include <linux/bsearch.h>
 
 #define FORCEWAKE_ACK_TIMEOUT_MS 50
 
@@ -582,11 +581,8 @@ void assert_forcewakes_inactive(struct drm_i915_private *dev_priv)
 	__fwd; \
 })
 
-static int fw_range_cmp(const void *key, const void *elt)
+static int fw_range_cmp(u32 offset, const struct intel_forcewake_range *entry)
 {
-	const struct intel_forcewake_range *entry = elt;
-	u32 offset = (u32)((unsigned long)key);
-
 	if (offset < entry->start)
 		return -1;
 	else if (offset > entry->end)
@@ -595,17 +591,33 @@ static int fw_range_cmp(const void *key, const void *elt)
 		return 0;
 }
 
+/* Copied and "macroized" from lib/bsearch.c */
+#define BSEARCH(key, base, num, cmp) ({                                 \
+	unsigned int start__ = 0, end__ = (num);                        \
+	typeof(base) result__ = NULL;                                   \
+	while (start__ < end__) {                                       \
+		unsigned int mid__ = start__ + (end__ - start__) / 2;   \
+		int ret__ = (cmp)((key), (base) + mid__);               \
+		if (ret__ < 0) {                                        \
+			end__ = mid__;                                  \
+		} else if (ret__ > 0) {                                 \
+			start__ = mid__ + 1;                            \
+		} else {                                                \
+			result__ = (base) + mid__;                      \
+			break;                                          \
+		}                                                       \
+	}                                                               \
+	result__;                                                       \
+})
+
 static enum forcewake_domains
 find_fw_domain(struct drm_i915_private *dev_priv, u32 offset)
 {
-	const struct intel_forcewake_range *table, *entry;
-	unsigned int num_entries;
-
-	table = dev_priv->uncore.fw_domains_table;
-	num_entries = dev_priv->uncore.fw_domains_table_entries;
+	const struct intel_forcewake_range *entry;
 
-	entry = bsearch((void *)(unsigned long)offset, (const void *)table,
-			num_entries, sizeof(struct intel_forcewake_range),
+	entry = BSEARCH(offset,
+			dev_priv->uncore.fw_domains_table,
+			dev_priv->uncore.fw_domains_table_entries,
 			fw_range_cmp);
 
 	return entry ? entry->domains : 0;
@@ -689,14 +701,13 @@ static void intel_shadow_table_check(void)
 #endif
 }
 
-static int mmio_reg_cmp(const void *key, const void *elt)
+static int mmio_reg_cmp(u32 key, const i915_reg_t *reg)
 {
-	u32 offset = (u32)(unsigned long)key;
-	i915_reg_t *reg = (i915_reg_t *)elt;
+	u32 offset = i915_mmio_reg_offset(*reg);
 
-	if (offset < i915_mmio_reg_offset(*reg))
+	if (key < offset)
 		return -1;
-	else if (offset > i915_mmio_reg_offset(*reg))
+	else if (key > offset)
 		return 1;
 	else
 		return 0;
@@ -704,15 +715,10 @@ static int mmio_reg_cmp(const void *key, const void *elt)
 
 static bool is_gen8_shadowed(u32 offset)
 {
-	i915_reg_t *reg;
-
-	reg = bsearch((void *)(unsigned long)offset,
-		      (const void *)gen8_shadowed_regs,
-		      ARRAY_SIZE(gen8_shadowed_regs),
-		      sizeof(i915_reg_t),
-		      mmio_reg_cmp);
+	const i915_reg_t *regs = gen8_shadowed_regs;
 
-	return reg;
+	return BSEARCH(offset, regs, ARRAY_SIZE(gen8_shadowed_regs),
+		       mmio_reg_cmp);
 }
 
 #define __gen8_reg_write_fw_domains(offset) \
-- 
2.7.4

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* Re: [PATCH 05/14] drm/i915: Sort forcewake mapping tables
  2016-09-30 17:48 ` [PATCH 05/14] drm/i915: Sort forcewake mapping tables Tvrtko Ursulin
@ 2016-09-30 18:05   ` Chris Wilson
  2016-10-03  8:37     ` Tvrtko Ursulin
  0 siblings, 1 reply; 32+ messages in thread
From: Chris Wilson @ 2016-09-30 18:05 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: Intel-gfx

On Fri, Sep 30, 2016 at 06:48:40PM +0100, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> Sorting the tables (verified at runtime to help during
> development) is another prerequisite for interesting
> work which will follow.
> 
> v2:
>  * Remove const away cast and improve comments. (Chris Wilson)
>  * Check tables only when debug option is enabled.
> 
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
>  drivers/gpu/drm/i915/intel_uncore.c | 56 ++++++++++++++++++++++++++++---------
>  1 file changed, 43 insertions(+), 13 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
> index e21e65ab2a16..5ca65de340b7 100644
> --- a/drivers/gpu/drm/i915/intel_uncore.c
> +++ b/drivers/gpu/drm/i915/intel_uncore.c
> @@ -605,16 +605,35 @@ find_fw_domain(u32 offset, const struct intel_forcewake_range *ranges,
>  	return -1;
>  }
>  
> +static void
> +intel_fw_table_check(const struct intel_forcewake_range *ranges,
> +		     unsigned int num_ranges)
> +{
> +#ifdef CONFIG_DRM_I915_DEBUG
> +	unsigned int i;
> +	const struct intel_forcewake_range *entry = ranges;
> +	s32 prev = -1;
> +

	if (!IS_ENABLED(CONFIG_DRM_I915_DEBUG))
		return;

for compiler checking always.

or
#if IS_ENABLED(CONFIG_DRM_I915_DEBUG) 
...
#endif
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 13/14] drm/i915: Use binary search when looking for shadowed registers
  2016-09-30 17:48 ` [PATCH 13/14] drm/i915: Use binary search when looking for shadowed registers Tvrtko Ursulin
@ 2016-09-30 18:08   ` Chris Wilson
  2016-10-03  8:05     ` Joonas Lahtinen
  0 siblings, 1 reply; 32+ messages in thread
From: Chris Wilson @ 2016-09-30 18:08 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: Intel-gfx

On Fri, Sep 30, 2016 at 06:48:48PM +0100, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> Simply replace the linear search with the kernel's binary
> search implementation. There is only six registers currently
> in that table so this may not be that interesting. It adds a
> function call so hopefully remains performance neutral for now.
> 
> v2: No need for manual conversion to bool for return.
>     (Joonas Lahtinen)
> 
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> ---
>  drivers/gpu/drm/i915/intel_uncore.c | 26 +++++++++++++++++++++-----
>  1 file changed, 21 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
> index 10b21516a7de..70a7fef79846 100644
> --- a/drivers/gpu/drm/i915/intel_uncore.c
> +++ b/drivers/gpu/drm/i915/intel_uncore.c
> @@ -689,14 +689,30 @@ static void intel_shadow_table_check(void)
>  #endif
>  }
>  
> +static int mmio_reg_cmp(const void *key, const void *elt)
> +{
> +	u32 offset = (u32)(unsigned long)key;
> +	i915_reg_t *reg = (i915_reg_t *)elt;
> +
> +	if (offset < i915_mmio_reg_offset(*reg))
> +		return -1;
> +	else if (offset > i915_mmio_reg_offset(*reg))
> +		return 1;
> +	else
> +		return 0;

There's no issue with using

	return offset - i915_mmio_reg_offset(*reg)

here.

> +}
> +
>  static bool is_gen8_shadowed(u32 offset)
>  {
> -	int i;
> -	for (i = 0; i < ARRAY_SIZE(gen8_shadowed_regs); i++)
> -		if (offset == gen8_shadowed_regs[i].reg)
> -			return true;
> +	i915_reg_t *reg;
>  
> -	return false;
> +	reg = bsearch((void *)(unsigned long)offset,
> +		      (const void *)gen8_shadowed_regs,
> +		      ARRAY_SIZE(gen8_shadowed_regs),
> +		      sizeof(i915_reg_t),
> +		      mmio_reg_cmp);
> +
> +	return reg;

Or just return bseearch() ? (Probably like this for easing future
patches.)
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v2 00/14] Forcewake binary search & code shrink
  2016-09-30 17:48 [PATCH v2 00/14] Forcewake binary search & code shrink Tvrtko Ursulin
                   ` (13 preceding siblings ...)
  2016-09-30 17:48 ` [PATCH 14/14] drm/i915: Inline binary search Tvrtko Ursulin
@ 2016-09-30 18:09 ` Chris Wilson
  2016-09-30 18:49 ` ✗ Fi.CI.BAT: warning for Forcewake binary search & code shrink (rev3) Patchwork
  2016-10-03  9:27 ` ✗ Fi.CI.BAT: warning for Forcewake binary search & code shrink (rev6) Patchwork
  16 siblings, 0 replies; 32+ messages in thread
From: Chris Wilson @ 2016-09-30 18:09 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: Intel-gfx

On Fri, Sep 30, 2016 at 06:48:35PM +0100, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> Motivation was to replace linear search comparison if-else ladders in MMIO
> accessors with a data driven approach.
> 
> These lookups are needed to determine to correct forcewake domains to take
> when reading and writing to MMIO registers.
> 
> Going data driven and keeping the tables sorted enabled binary search to be
> used. Followed by gradual extraction of commonality between existing functions,
> accompanying code reduction, and the end result is around 15KiB less code
> generated and also some saving in the source code lines.
> 
> v2:
>  * Updated cover letter given feedback.
>  * Some rework for review comments.
>  * New patch in the series (14) to inline the binary search.
> 
> Tvrtko Ursulin (14):
>   drm/i915: Remove redundant hsw_write* mmio functions
>   drm/i915: Keep track of active forcewake domains in a bitmask
>   drm/i915: Do not inline forcewake taking in mmio accessors
>   drm/i915: Data driven register to forcewake domains lookup
>   drm/i915: Sort forcewake mapping tables
>   drm/i915: Use binary search when looking up forcewake domains
>   drm/i915: Eliminate Gen9 special case
>   drm/i915: Store the active forcewake range table pointer
>   drm/i915: Remove identical macros
>   drm/i915: Remove identical mmio read functions
>   drm/i915: Remove identical write mmmio functions
>   drm/i915: Sort the shadow register table
>   drm/i915: Use binary search when looking for shadowed registers
>   drm/i915: Inline binary search

Minor comments notwithstanding,
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 32+ messages in thread

* ✗ Fi.CI.BAT: warning for Forcewake binary search & code shrink (rev3)
  2016-09-30 17:48 [PATCH v2 00/14] Forcewake binary search & code shrink Tvrtko Ursulin
                   ` (14 preceding siblings ...)
  2016-09-30 18:09 ` [PATCH v2 00/14] Forcewake binary search & code shrink Chris Wilson
@ 2016-09-30 18:49 ` Patchwork
  2016-10-03  9:27 ` ✗ Fi.CI.BAT: warning for Forcewake binary search & code shrink (rev6) Patchwork
  16 siblings, 0 replies; 32+ messages in thread
From: Patchwork @ 2016-09-30 18:49 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: Forcewake binary search & code shrink (rev3)
URL   : https://patchwork.freedesktop.org/series/13080/
State : warning

== Summary ==

Series 13080v3 Forcewake binary search & code shrink
https://patchwork.freedesktop.org/api/1.0/series/13080/revisions/3/mbox/

Test drv_module_reload_basic:
                pass       -> SKIP       (fi-skl-6770hq)
Test kms_pipe_crc_basic:
        Subgroup suspend-read-crc-pipe-c:
                pass       -> DMESG-WARN (fi-skl-6770hq)
Test kms_psr_sink_crc:
        Subgroup psr_basic:
                dmesg-warn -> PASS       (fi-skl-6700hq)

fi-bdw-5557u     total:244  pass:229  dwarn:0   dfail:0   fail:0   skip:15 
fi-bsw-n3050     total:244  pass:202  dwarn:0   dfail:0   fail:0   skip:42 
fi-bxt-t5700     total:244  pass:214  dwarn:0   dfail:0   fail:0   skip:30 
fi-byt-j1900     total:244  pass:212  dwarn:0   dfail:0   fail:1   skip:31 
fi-byt-n2820     total:244  pass:208  dwarn:0   dfail:0   fail:1   skip:35 
fi-hsw-4770      total:244  pass:222  dwarn:0   dfail:0   fail:0   skip:22 
fi-hsw-4770r     total:244  pass:222  dwarn:0   dfail:0   fail:0   skip:22 
fi-ilk-650       total:244  pass:182  dwarn:0   dfail:0   fail:2   skip:60 
fi-ivb-3520m     total:244  pass:219  dwarn:0   dfail:0   fail:0   skip:25 
fi-ivb-3770      total:244  pass:207  dwarn:0   dfail:0   fail:0   skip:37 
fi-skl-6260u     total:244  pass:230  dwarn:0   dfail:0   fail:0   skip:14 
fi-skl-6700hq    total:244  pass:222  dwarn:0   dfail:0   fail:0   skip:22 
fi-skl-6700k     total:244  pass:219  dwarn:1   dfail:0   fail:0   skip:24 
fi-skl-6770hq    total:244  pass:226  dwarn:2   dfail:0   fail:1   skip:15 
fi-snb-2520m     total:244  pass:208  dwarn:0   dfail:0   fail:0   skip:36 
fi-snb-2600      total:244  pass:207  dwarn:0   dfail:0   fail:0   skip:37 

Results at /archive/results/CI_IGT_test/Patchwork_2608/

d0534bd0217c83c083ba146b9c987e19b219e0e4 drm-intel-nightly: 2016y-09m-30d-10h-31m-00s UTC integration manifest
a77551d drm/i915: Inline binary search
3275a3c drm/i915: Use binary search when looking for shadowed registers
933eef3 drm/i915: Sort the shadow register table
569af7c drm/i915: Remove identical write mmmio functions
c7bfcc2 drm/i915: Remove identical mmio read functions
8e1c3a2 drm/i915: Remove identical macros
2736ef1 drm/i915: Store the active forcewake range table pointer
42ba155 drm/i915: Eliminate Gen9 special case
b249ec5 drm/i915: Use binary search when looking up forcewake domains
904768b drm/i915: Sort forcewake mapping tables
9d1baf6 drm/i915: Data driven register to forcewake domains lookup
17975ae drm/i915: Do not inline forcewake taking in mmio accessors
5b92076 drm/i915: Keep track of active forcewake domains in a bitmask
8d9b34c drm/i915: Remove redundant hsw_write* mmio functions

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 04/14] drm/i915: Data driven register to forcewake domains lookup
  2016-09-30 17:48 ` [PATCH 04/14] drm/i915: Data driven register to forcewake domains lookup Tvrtko Ursulin
@ 2016-10-03  7:58   ` Joonas Lahtinen
  2016-10-26  9:24     ` Mika Kuoppala
  0 siblings, 1 reply; 32+ messages in thread
From: Joonas Lahtinen @ 2016-10-03  7:58 UTC (permalink / raw)
  To: Tvrtko Ursulin, Intel-gfx

On pe, 2016-09-30 at 18:48 +0100, Tvrtko Ursulin wrote:
> 
> +find_fw_domain(u32 offset, const struct intel_forcewake_range *ranges,
> +	       unsigned int num_ranges)

Gotta remind that generally "fw" is associated with firmware.
"forcewake" might be appropriate word in function names too.

> +{
> +	unsigned int i;
> +	struct intel_forcewake_range *entry =
> +		(struct intel_forcewake_range *)ranges;
> +
> +	for (i = 0; i < num_ranges; i++, entry++) {
> +		if (offset >= entry->start && offset <= entry->end)
> +			return entry->domains;
> +	}
> +
> +	return -1;
> +}
> 

This could be a binary search, too, with some sorting.

Already as is;

Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

Regards, Joonas
-- 
Joonas Lahtinen
Open Source Technology Center
Intel Corporation
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 13/14] drm/i915: Use binary search when looking for shadowed registers
  2016-09-30 18:08   ` Chris Wilson
@ 2016-10-03  8:05     ` Joonas Lahtinen
  2016-10-03  8:33       ` Tvrtko Ursulin
  0 siblings, 1 reply; 32+ messages in thread
From: Joonas Lahtinen @ 2016-10-03  8:05 UTC (permalink / raw)
  To: Chris Wilson, Tvrtko Ursulin; +Cc: Intel-gfx

On pe, 2016-09-30 at 19:08 +0100, Chris Wilson wrote:
> On Fri, Sep 30, 2016 at 06:48:48PM +0100, Tvrtko Ursulin wrote: 
> > +static int mmio_reg_cmp(const void *key, const void *elt)
> > +{
> > +	u32 offset = (u32)(unsigned long)key;
> > +	i915_reg_t *reg = (i915_reg_t *)elt;
> > +
> > +	if (offset < i915_mmio_reg_offset(*reg))
> > +		return -1;
> > +	else if (offset > i915_mmio_reg_offset(*reg))
> > +		return 1;
> > +	else
> > +		return 0;
> 
> There's no issue with using
> 
> 	return offset - i915_mmio_reg_offset(*reg)
> 
> here.
> 

Why not?

> > +	reg = bsearch((void *)(unsigned long)offset,
> > +		      (const void *)gen8_shadowed_regs,
> > +		      ARRAY_SIZE(gen8_shadowed_regs),
> > +		      sizeof(i915_reg_t),
> > +		      mmio_reg_cmp);
> > +
> > +	return reg;
> 
> Or just return bseearch() ? (Probably like this for easing future
> patches.)

Suggested that too, but obviously he has something in mind.

Regards, Joonas
-- 
Joonas Lahtinen
Open Source Technology Center
Intel Corporation
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 13/14] drm/i915: Use binary search when looking for shadowed registers
  2016-10-03  8:05     ` Joonas Lahtinen
@ 2016-10-03  8:33       ` Tvrtko Ursulin
  0 siblings, 0 replies; 32+ messages in thread
From: Tvrtko Ursulin @ 2016-10-03  8:33 UTC (permalink / raw)
  To: Joonas Lahtinen, Chris Wilson, Tvrtko Ursulin; +Cc: Intel-gfx


On 03/10/2016 09:05, Joonas Lahtinen wrote:
> On pe, 2016-09-30 at 19:08 +0100, Chris Wilson wrote:
>> On Fri, Sep 30, 2016 at 06:48:48PM +0100, Tvrtko Ursulin wrote:
>>> +static int mmio_reg_cmp(const void *key, const void *elt)
>>> +{
>>> +	u32 offset = (u32)(unsigned long)key;
>>> +	i915_reg_t *reg = (i915_reg_t *)elt;
>>> +
>>> +	if (offset < i915_mmio_reg_offset(*reg))
>>> +		return -1;
>>> +	else if (offset > i915_mmio_reg_offset(*reg))
>>> +		return 1;
>>> +	else
>>> +		return 0;
>> There's no issue with using
>>
>> 	return offset - i915_mmio_reg_offset(*reg)
>>
>> here.
>>
> Why not?
>
>>> +	reg = bsearch((void *)(unsigned long)offset,
>>> +		      (const void *)gen8_shadowed_regs,
>>> +		      ARRAY_SIZE(gen8_shadowed_regs),
>>> +		      sizeof(i915_reg_t),
>>> +		      mmio_reg_cmp);
>>> +
>>> +	return reg;
>> Or just return bseearch() ? (Probably like this for easing future
>> patches.)
> Suggested that too, but obviously he has something in mind.

It becomes a direct return in a following patch. It is just a virtue of 
me wanting to split out the series a lot for easier review, and then not 
picking up all relevant places to tidy when acting on review feedback. 
But this one as I said happens in a following patch.

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH 05/14] drm/i915: Sort forcewake mapping tables
  2016-09-30 18:05   ` Chris Wilson
@ 2016-10-03  8:37     ` Tvrtko Ursulin
  0 siblings, 0 replies; 32+ messages in thread
From: Tvrtko Ursulin @ 2016-10-03  8:37 UTC (permalink / raw)
  To: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Sorting the tables (verified at runtime to help during
development) is another prerequisite for interesting
work which will follow.

v2:
 * Remove const away cast and improve comments. (Chris Wilson)
 * Check tables only when debug option is enabled.

v3: Use IS_ENABLED. (Chris Wilson)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/intel_uncore.c | 56 ++++++++++++++++++++++++++++---------
 1 file changed, 43 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
index e21e65ab2a16..7110c8fdfb88 100644
--- a/drivers/gpu/drm/i915/intel_uncore.c
+++ b/drivers/gpu/drm/i915/intel_uncore.c
@@ -605,16 +605,35 @@ find_fw_domain(u32 offset, const struct intel_forcewake_range *ranges,
 	return -1;
 }
 
+static void
+intel_fw_table_check(const struct intel_forcewake_range *ranges,
+		     unsigned int num_ranges)
+{
+	s32 prev;
+	unsigned int i;
+
+	if (!IS_ENABLED(CONFIG_DRM_I915_DEBUG))
+		return;
+
+	for (i = 0, prev = -1; i < num_ranges; i++, ranges++) {
+		WARN_ON_ONCE(prev >= (s32)ranges->start);
+		prev = ranges->start;
+		WARN_ON_ONCE(prev >= (s32)ranges->end);
+		prev = ranges->end;
+	}
+}
+
 #define GEN_FW_RANGE(s, e, d) \
 	{ .start = (s), .end = (e), .domains = (d) }
 
+/* *Must* be sorted by offset ranges! See intel_fw_table_check(). */
 static const struct intel_forcewake_range __vlv_fw_ranges[] = {
 	GEN_FW_RANGE(0x2000, 0x3fff, FORCEWAKE_RENDER),
 	GEN_FW_RANGE(0x5000, 0x7fff, FORCEWAKE_RENDER),
 	GEN_FW_RANGE(0xb000, 0x11fff, FORCEWAKE_RENDER),
-	GEN_FW_RANGE(0x2e000, 0x2ffff, FORCEWAKE_RENDER),
 	GEN_FW_RANGE(0x12000, 0x13fff, FORCEWAKE_MEDIA),
 	GEN_FW_RANGE(0x22000, 0x23fff, FORCEWAKE_MEDIA),
+	GEN_FW_RANGE(0x2e000, 0x2ffff, FORCEWAKE_RENDER),
 	GEN_FW_RANGE(0x30000, 0x3ffff, FORCEWAKE_MEDIA),
 };
 
@@ -660,23 +679,24 @@ static bool is_gen8_shadowed(u32 offset)
 	__fwd; \
 })
 
+/* *Must* be sorted by offset ranges! See intel_fw_table_check(). */
 static const struct intel_forcewake_range __chv_fw_ranges[] = {
 	GEN_FW_RANGE(0x2000, 0x3fff, FORCEWAKE_RENDER),
+	GEN_FW_RANGE(0x4000, 0x4fff, FORCEWAKE_RENDER | FORCEWAKE_MEDIA),
 	GEN_FW_RANGE(0x5200, 0x7fff, FORCEWAKE_RENDER),
+	GEN_FW_RANGE(0x8000, 0x82ff, FORCEWAKE_RENDER | FORCEWAKE_MEDIA),
 	GEN_FW_RANGE(0x8300, 0x84ff, FORCEWAKE_RENDER),
-	GEN_FW_RANGE(0xb000, 0xb47f, FORCEWAKE_RENDER),
-	GEN_FW_RANGE(0xe000, 0xe7ff, FORCEWAKE_RENDER),
+	GEN_FW_RANGE(0x8500, 0x85ff, FORCEWAKE_RENDER | FORCEWAKE_MEDIA),
 	GEN_FW_RANGE(0x8800, 0x88ff, FORCEWAKE_MEDIA),
+	GEN_FW_RANGE(0x9000, 0xafff, FORCEWAKE_RENDER | FORCEWAKE_MEDIA),
+	GEN_FW_RANGE(0xb000, 0xb47f, FORCEWAKE_RENDER),
 	GEN_FW_RANGE(0xd000, 0xd7ff, FORCEWAKE_MEDIA),
+	GEN_FW_RANGE(0xe000, 0xe7ff, FORCEWAKE_RENDER),
+	GEN_FW_RANGE(0xf000, 0xffff, FORCEWAKE_RENDER | FORCEWAKE_MEDIA),
 	GEN_FW_RANGE(0x12000, 0x13fff, FORCEWAKE_MEDIA),
 	GEN_FW_RANGE(0x1a000, 0x1bfff, FORCEWAKE_MEDIA),
 	GEN_FW_RANGE(0x1e800, 0x1e9ff, FORCEWAKE_MEDIA),
 	GEN_FW_RANGE(0x30000, 0x37fff, FORCEWAKE_MEDIA),
-	GEN_FW_RANGE(0x4000, 0x4fff, FORCEWAKE_RENDER | FORCEWAKE_MEDIA),
-	GEN_FW_RANGE(0x8000, 0x82ff, FORCEWAKE_RENDER | FORCEWAKE_MEDIA),
-	GEN_FW_RANGE(0x8500, 0x85ff, FORCEWAKE_RENDER | FORCEWAKE_MEDIA),
-	GEN_FW_RANGE(0x9000, 0xafff, FORCEWAKE_RENDER | FORCEWAKE_MEDIA),
-	GEN_FW_RANGE(0xf000, 0xffff, FORCEWAKE_RENDER | FORCEWAKE_MEDIA),
 };
 
 #define __chv_reg_read_fw_domains(offset) \
@@ -703,23 +723,24 @@ static const struct intel_forcewake_range __chv_fw_ranges[] = {
 	__fwd; \
 })
 
+/* *Must* be sorted by offset ranges! See intel_fw_table_check(). */
 static const struct intel_forcewake_range __gen9_fw_ranges[] = {
 	GEN_FW_RANGE(0xb00, 0x1fff, 0), /* uncore range */
 	GEN_FW_RANGE(0x2000, 0x26ff, FORCEWAKE_RENDER),
 	GEN_FW_RANGE(0x3000, 0x3fff, FORCEWAKE_RENDER),
 	GEN_FW_RANGE(0x5200, 0x7fff, FORCEWAKE_RENDER),
+	GEN_FW_RANGE(0x8130, 0x813f, FORCEWAKE_MEDIA),
 	GEN_FW_RANGE(0x8140, 0x815f, FORCEWAKE_RENDER),
 	GEN_FW_RANGE(0x8300, 0x84ff, FORCEWAKE_RENDER),
+	GEN_FW_RANGE(0x8800, 0x89ff, FORCEWAKE_MEDIA),
 	GEN_FW_RANGE(0x8c00, 0x8cff, FORCEWAKE_RENDER),
-	GEN_FW_RANGE(0xb000, 0xb47f, FORCEWAKE_RENDER),
-	GEN_FW_RANGE(0xe000, 0xe8ff, FORCEWAKE_RENDER),
-	GEN_FW_RANGE(0x24400, 0x247ff, FORCEWAKE_RENDER),
 	GEN_FW_RANGE(0x9400, 0x97ff, FORCEWAKE_RENDER | FORCEWAKE_MEDIA),
-	GEN_FW_RANGE(0x8130, 0x813f, FORCEWAKE_MEDIA),
-	GEN_FW_RANGE(0x8800, 0x89ff, FORCEWAKE_MEDIA),
+	GEN_FW_RANGE(0xb000, 0xb47f, FORCEWAKE_RENDER),
 	GEN_FW_RANGE(0xd000, 0xd7ff, FORCEWAKE_MEDIA),
+	GEN_FW_RANGE(0xe000, 0xe8ff, FORCEWAKE_RENDER),
 	GEN_FW_RANGE(0x12000, 0x13fff, FORCEWAKE_MEDIA),
 	GEN_FW_RANGE(0x1a000, 0x1e9ff, FORCEWAKE_MEDIA),
+	GEN_FW_RANGE(0x24400, 0x247ff, FORCEWAKE_RENDER),
 	GEN_FW_RANGE(0x30000, 0x3ffff, FORCEWAKE_MEDIA),
 };
 
@@ -1299,11 +1320,17 @@ void intel_uncore_init(struct drm_i915_private *dev_priv)
 	switch (INTEL_INFO(dev_priv)->gen) {
 	default:
 	case 9:
+		intel_fw_table_check(__gen9_fw_ranges,
+				     ARRAY_SIZE(__gen9_fw_ranges));
+
 		ASSIGN_WRITE_MMIO_VFUNCS(gen9);
 		ASSIGN_READ_MMIO_VFUNCS(gen9);
 		break;
 	case 8:
 		if (IS_CHERRYVIEW(dev_priv)) {
+			intel_fw_table_check(__chv_fw_ranges,
+					     ARRAY_SIZE(__chv_fw_ranges));
+
 			ASSIGN_WRITE_MMIO_VFUNCS(chv);
 			ASSIGN_READ_MMIO_VFUNCS(chv);
 
@@ -1317,6 +1344,9 @@ void intel_uncore_init(struct drm_i915_private *dev_priv)
 		ASSIGN_WRITE_MMIO_VFUNCS(gen6);
 
 		if (IS_VALLEYVIEW(dev_priv)) {
+			intel_fw_table_check(__vlv_fw_ranges,
+					     ARRAY_SIZE(__vlv_fw_ranges));
+
 			ASSIGN_READ_MMIO_VFUNCS(vlv);
 		} else {
 			ASSIGN_READ_MMIO_VFUNCS(gen6);
-- 
2.7.4

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH 08/14] drm/i915: Store the active forcewake range table pointer
  2016-09-30 17:48 ` [PATCH 08/14] drm/i915: Store the active forcewake range table pointer Tvrtko Ursulin
@ 2016-10-03  8:39   ` Tvrtko Ursulin
  0 siblings, 0 replies; 32+ messages in thread
From: Tvrtko Ursulin @ 2016-10-03  8:39 UTC (permalink / raw)
  To: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

If we store this in the uncore structure we are on a good way to
show more commonality between the per-platform implementations.

v2: Constify table pointer and correct coding style. (Chris Wilson)
v3: Rebase.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_drv.h     | 10 ++++++
 drivers/gpu/drm/i915/intel_uncore.c | 68 ++++++++++++++++++-------------------
 2 files changed, 44 insertions(+), 34 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 9b1997fd8c87..3b14938aa09e 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -582,9 +582,19 @@ struct intel_uncore_funcs {
 				uint32_t val, bool trace);
 };
 
+struct intel_forcewake_range {
+	u32 start;
+	u32 end;
+
+	enum forcewake_domains domains;
+};
+
 struct intel_uncore {
 	spinlock_t lock; /** lock is also taken in irq contexts. */
 
+	const struct intel_forcewake_range *fw_domains_table;
+	unsigned int fw_domains_table_entries;
+
 	struct intel_uncore_funcs funcs;
 
 	unsigned fifo_count;
diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
index ba90a7e9d3b2..c813bdef8ded 100644
--- a/drivers/gpu/drm/i915/intel_uncore.c
+++ b/drivers/gpu/drm/i915/intel_uncore.c
@@ -582,14 +582,6 @@ void assert_forcewakes_inactive(struct drm_i915_private *dev_priv)
 	__fwd; \
 })
 
-struct intel_forcewake_range
-{
-	u32 start;
-	u32 end;
-
-	enum forcewake_domains domains;
-};
-
 static int fw_range_cmp(const void *key, const void *elt)
 {
 	const struct intel_forcewake_range *entry = elt;
@@ -604,28 +596,38 @@ static int fw_range_cmp(const void *key, const void *elt)
 }
 
 static enum forcewake_domains
-find_fw_domain(u32 offset, const struct intel_forcewake_range *ranges,
-	       unsigned int num_ranges)
+find_fw_domain(struct drm_i915_private *dev_priv, u32 offset)
 {
-	struct intel_forcewake_range *entry;
+	const struct intel_forcewake_range *table, *entry;
+	unsigned int num_entries;
 
-	entry = bsearch((void *)(unsigned long)offset, (const void *)ranges,
-			num_ranges, sizeof(struct intel_forcewake_range),
+	table = dev_priv->uncore.fw_domains_table;
+	num_entries = dev_priv->uncore.fw_domains_table_entries;
+
+	entry = bsearch((void *)(unsigned long)offset, (const void *)table,
+			num_entries, sizeof(struct intel_forcewake_range),
 			fw_range_cmp);
 
 	return entry ? entry->domains : 0;
 }
 
 static void
-intel_fw_table_check(const struct intel_forcewake_range *ranges,
-		     unsigned int num_ranges)
+intel_fw_table_check(struct drm_i915_private *dev_priv)
 {
+	const struct intel_forcewake_range *ranges;
+	unsigned int num_ranges;
 	s32 prev;
 	unsigned int i;
 
 	if (!IS_ENABLED(CONFIG_DRM_I915_DEBUG))
 		return;
 
+	ranges = dev_priv->uncore.fw_domains_table;
+	if (!ranges)
+		return;
+
+	num_ranges = dev_priv->uncore.fw_domains_table_entries;
+
 	for (i = 0, prev = -1; i < num_ranges; i++, ranges++) {
 		WARN_ON_ONCE(prev >= (s32)ranges->start);
 		prev = ranges->start;
@@ -652,8 +654,7 @@ static const struct intel_forcewake_range __vlv_fw_ranges[] = {
 ({ \
 	enum forcewake_domains __fwd = 0; \
 	if (NEEDS_FORCE_WAKE((offset))) \
-		__fwd = find_fw_domain(offset, __vlv_fw_ranges, \
-				       ARRAY_SIZE(__vlv_fw_ranges)); \
+		__fwd = find_fw_domain(dev_priv, offset); \
 	__fwd; \
 })
 
@@ -711,8 +712,7 @@ static const struct intel_forcewake_range __chv_fw_ranges[] = {
 ({ \
 	enum forcewake_domains __fwd = 0; \
 	if (NEEDS_FORCE_WAKE((offset))) \
-		__fwd = find_fw_domain(offset, __chv_fw_ranges, \
-				       ARRAY_SIZE(__chv_fw_ranges)); \
+		__fwd = find_fw_domain(dev_priv, offset); \
 	__fwd; \
 })
 
@@ -720,8 +720,7 @@ static const struct intel_forcewake_range __chv_fw_ranges[] = {
 ({ \
 	enum forcewake_domains __fwd = 0; \
 	if (NEEDS_FORCE_WAKE((offset)) && !is_gen8_shadowed(offset)) \
-		__fwd = find_fw_domain(offset, __chv_fw_ranges, \
-				       ARRAY_SIZE(__chv_fw_ranges)); \
+		__fwd = find_fw_domain(dev_priv, offset); \
 	__fwd; \
 })
 
@@ -765,8 +764,7 @@ static const struct intel_forcewake_range __gen9_fw_ranges[] = {
 ({ \
 	enum forcewake_domains __fwd = 0; \
 	if (NEEDS_FORCE_WAKE((offset))) \
-		__fwd = find_fw_domain(offset, __gen9_fw_ranges, \
-				       ARRAY_SIZE(__gen9_fw_ranges)); \
+		__fwd = find_fw_domain(dev_priv, offset); \
 	__fwd; \
 })
 
@@ -794,8 +792,7 @@ static bool is_gen9_shadowed(u32 offset)
 ({ \
 	enum forcewake_domains __fwd = 0; \
 	if (NEEDS_FORCE_WAKE((offset)) && !is_gen9_shadowed(offset)) \
-		__fwd = find_fw_domain(offset, __gen9_fw_ranges, \
-				       ARRAY_SIZE(__gen9_fw_ranges)); \
+		__fwd = find_fw_domain(dev_priv, offset); \
 	__fwd; \
 })
 
@@ -1318,6 +1315,13 @@ static void intel_uncore_fw_domains_init(struct drm_i915_private *dev_priv)
 	WARN_ON(dev_priv->uncore.fw_domains == 0);
 }
 
+#define ASSIGN_FW_DOMAINS_TABLE(d) \
+{ \
+	dev_priv->uncore.fw_domains_table = \
+			(struct intel_forcewake_range *)(d); \
+	dev_priv->uncore.fw_domains_table_entries = ARRAY_SIZE((d)); \
+}
+
 void intel_uncore_init(struct drm_i915_private *dev_priv)
 {
 	i915_check_vgpu(dev_priv);
@@ -1331,17 +1335,13 @@ void intel_uncore_init(struct drm_i915_private *dev_priv)
 	switch (INTEL_INFO(dev_priv)->gen) {
 	default:
 	case 9:
-		intel_fw_table_check(__gen9_fw_ranges,
-				     ARRAY_SIZE(__gen9_fw_ranges));
-
+		ASSIGN_FW_DOMAINS_TABLE(__gen9_fw_ranges);
 		ASSIGN_WRITE_MMIO_VFUNCS(gen9);
 		ASSIGN_READ_MMIO_VFUNCS(gen9);
 		break;
 	case 8:
 		if (IS_CHERRYVIEW(dev_priv)) {
-			intel_fw_table_check(__chv_fw_ranges,
-					     ARRAY_SIZE(__chv_fw_ranges));
-
+			ASSIGN_FW_DOMAINS_TABLE(__chv_fw_ranges);
 			ASSIGN_WRITE_MMIO_VFUNCS(chv);
 			ASSIGN_READ_MMIO_VFUNCS(chv);
 
@@ -1355,9 +1355,7 @@ void intel_uncore_init(struct drm_i915_private *dev_priv)
 		ASSIGN_WRITE_MMIO_VFUNCS(gen6);
 
 		if (IS_VALLEYVIEW(dev_priv)) {
-			intel_fw_table_check(__vlv_fw_ranges,
-					     ARRAY_SIZE(__vlv_fw_ranges));
-
+			ASSIGN_FW_DOMAINS_TABLE(__vlv_fw_ranges);
 			ASSIGN_READ_MMIO_VFUNCS(vlv);
 		} else {
 			ASSIGN_READ_MMIO_VFUNCS(gen6);
@@ -1375,6 +1373,8 @@ void intel_uncore_init(struct drm_i915_private *dev_priv)
 		break;
 	}
 
+	intel_fw_table_check(dev_priv);
+
 	if (intel_vgpu_active(dev_priv)) {
 		ASSIGN_WRITE_MMIO_VFUNCS(vgpu);
 		ASSIGN_READ_MMIO_VFUNCS(vgpu);
-- 
2.7.4

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH 12/14] drm/i915: Sort the shadow register table
  2016-09-30 17:48 ` [PATCH 12/14] drm/i915: Sort the shadow register table Tvrtko Ursulin
@ 2016-10-03  8:40   ` Tvrtko Ursulin
  0 siblings, 0 replies; 32+ messages in thread
From: Tvrtko Ursulin @ 2016-10-03  8:40 UTC (permalink / raw)
  To: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Also verify the order at runtime. This was we can start using
binary search on it in a following patch.

v2: Add comment on the sorted array and only check it when
    debug option is enabled.

v3: Use IS_ENABLED. (Chris Wilson)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> (v1)
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/intel_uncore.c | 32 ++++++++++++++++++++++++++------
 1 file changed, 26 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
index 1e9061b22e0f..7d6f87acf661 100644
--- a/drivers/gpu/drm/i915/intel_uncore.c
+++ b/drivers/gpu/drm/i915/intel_uncore.c
@@ -663,16 +663,34 @@ static const struct intel_forcewake_range __vlv_fw_ranges[] = {
 	__fwd; \
 })
 
+/* *Must* be sorted by offset! See intel_shadow_table_check(). */
 static const i915_reg_t gen8_shadowed_regs[] = {
-	GEN6_RPNSWREQ,
-	GEN6_RC_VIDEO_FREQ,
-	RING_TAIL(RENDER_RING_BASE),
-	RING_TAIL(GEN6_BSD_RING_BASE),
-	RING_TAIL(VEBOX_RING_BASE),
-	RING_TAIL(BLT_RING_BASE),
+	RING_TAIL(RENDER_RING_BASE),	/* 0x2000 (base) */
+	GEN6_RPNSWREQ,			/* 0xA008 */
+	GEN6_RC_VIDEO_FREQ,		/* 0xA00C */
+	RING_TAIL(GEN6_BSD_RING_BASE),	/* 0x12000 (base) */
+	RING_TAIL(VEBOX_RING_BASE),	/* 0x1a000 (base) */
+	RING_TAIL(BLT_RING_BASE),	/* 0x22000 (base) */
 	/* TODO: Other registers are not yet used */
 };
 
+static void intel_shadow_table_check(void)
+{
+	const i915_reg_t *reg = gen8_shadowed_regs;
+	s32 prev;
+	u32 offset;
+	unsigned int i;
+
+	if (!IS_ENABLED(CONFIG_DRM_I915_DEBUG))
+		return;
+
+	for (i = 0, prev = -1; i < ARRAY_SIZE(gen8_shadowed_regs); i++, reg++) {
+		offset = i915_mmio_reg_offset(*reg);
+		WARN_ON_ONCE(prev >= (s32)offset);
+		prev = offset;
+	}
+}
+
 static bool is_gen8_shadowed(u32 offset)
 {
 	int i;
@@ -1284,6 +1302,8 @@ void intel_uncore_init(struct drm_i915_private *dev_priv)
 	}
 
 	intel_fw_table_check(dev_priv);
+	if (INTEL_GEN(dev_priv) >= 8)
+		intel_shadow_table_check();
 
 	if (intel_vgpu_active(dev_priv)) {
 		ASSIGN_WRITE_MMIO_VFUNCS(vgpu);
-- 
2.7.4

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* ✗ Fi.CI.BAT: warning for Forcewake binary search & code shrink (rev6)
  2016-09-30 17:48 [PATCH v2 00/14] Forcewake binary search & code shrink Tvrtko Ursulin
                   ` (15 preceding siblings ...)
  2016-09-30 18:49 ` ✗ Fi.CI.BAT: warning for Forcewake binary search & code shrink (rev3) Patchwork
@ 2016-10-03  9:27 ` Patchwork
  2016-10-04  8:46   ` Tvrtko Ursulin
  16 siblings, 1 reply; 32+ messages in thread
From: Patchwork @ 2016-10-03  9:27 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: intel-gfx

== Series Details ==

Series: Forcewake binary search & code shrink (rev6)
URL   : https://patchwork.freedesktop.org/series/13080/
State : warning

== Summary ==

Series 13080v6 Forcewake binary search & code shrink
https://patchwork.freedesktop.org/api/1.0/series/13080/revisions/6/mbox/

Test kms_pipe_crc_basic:
        Subgroup nonblocking-crc-pipe-b:
                pass       -> DMESG-WARN (fi-skl-6700k)
        Subgroup suspend-read-crc-pipe-a:
                pass       -> DMESG-WARN (fi-byt-j1900)

fi-bdw-5557u     total:236  pass:220  dwarn:0   dfail:0   fail:0   skip:15 
fi-bxt-t5700     total:236  pass:205  dwarn:0   dfail:0   fail:0   skip:30 
fi-byt-j1900     total:236  pass:202  dwarn:1   dfail:0   fail:1   skip:31 
fi-byt-n2820     total:245  pass:208  dwarn:0   dfail:0   fail:2   skip:35 
fi-hsw-4770      total:236  pass:213  dwarn:0   dfail:0   fail:0   skip:22 
fi-hsw-4770r     total:245  pass:222  dwarn:0   dfail:0   fail:1   skip:22 
fi-ilk-650       total:236  pass:173  dwarn:0   dfail:0   fail:2   skip:60 
fi-skl-6700k     total:236  pass:209  dwarn:2   dfail:0   fail:0   skip:24 
fi-skl-6770hq    total:245  pass:228  dwarn:1   dfail:0   fail:2   skip:14 
fi-snb-2520m     total:245  pass:208  dwarn:0   dfail:0   fail:1   skip:36 
fi-snb-2600      total:236  pass:198  dwarn:0   dfail:0   fail:0   skip:37 
fi-ivb-3520m failed to connect after reboot
fi-ivb-3770 failed to collect. IGT log at Patchwork_2612/fi-ivb-3770/igt.log
fi-skl-6260u failed to collect. IGT log at Patchwork_2612/fi-skl-6260u/igt.log
fi-skl-6700hq failed to collect. IGT log at Patchwork_2612/fi-skl-6700hq/igt.log

Results at /archive/results/CI_IGT_test/Patchwork_2612/

d0534bd0217c83c083ba146b9c987e19b219e0e4 drm-intel-nightly: 2016y-09m-30d-10h-31m-00s UTC integration manifest
c5237e5 drm/i915: Inline binary search
9b4e3c9 drm/i915: Use binary search when looking for shadowed registers
0b12dbd drm/i915: Sort the shadow register table
7978e69 drm/i915: Remove identical write mmmio functions
b0960d1 drm/i915: Remove identical mmio read functions
354b09a drm/i915: Remove identical macros
e608845 drm/i915: Store the active forcewake range table pointer
79f3cb4 drm/i915: Eliminate Gen9 special case
7fe1499 drm/i915: Use binary search when looking up forcewake domains
1a71a14 drm/i915: Sort forcewake mapping tables
d34328e drm/i915: Data driven register to forcewake domains lookup
82e0e23 drm/i915: Do not inline forcewake taking in mmio accessors
920037b drm/i915: Keep track of active forcewake domains in a bitmask
e6741b5 drm/i915: Remove redundant hsw_write* mmio functions

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: ✗ Fi.CI.BAT: warning for Forcewake binary search & code shrink (rev6)
  2016-10-03  9:27 ` ✗ Fi.CI.BAT: warning for Forcewake binary search & code shrink (rev6) Patchwork
@ 2016-10-04  8:46   ` Tvrtko Ursulin
  2016-10-04  8:55     ` Chris Wilson
  0 siblings, 1 reply; 32+ messages in thread
From: Tvrtko Ursulin @ 2016-10-04  8:46 UTC (permalink / raw)
  To: intel-gfx, Tvrtko Ursulin


On 03/10/2016 10:27, Patchwork wrote:
> == Series Details ==
>
> Series: Forcewake binary search & code shrink (rev6)
> URL   : https://patchwork.freedesktop.org/series/13080/
> State : warning
>
> == Summary ==
>
> Series 13080v6 Forcewake binary search & code shrink
> https://patchwork.freedesktop.org/api/1.0/series/13080/revisions/6/mbox/
>
> Test kms_pipe_crc_basic:
>          Subgroup nonblocking-crc-pipe-b:
>                  pass       -> DMESG-WARN (fi-skl-6700k)

Test passed but kernel logged:

[  520.080543] [drm:intel_cpu_fifo_underrun_irq_handler [i915]] *ERROR* CPU pipe B FIFO underrun

Sporadically happening since CI_DRM_1645.

Raised new bug https://bugs.freedesktop.org/show_bug.cgi?id=98041

>          Subgroup suspend-read-crc-pipe-a:
>                  pass       -> DMESG-WARN (fi-byt-j1900)

Test passes but kernel error is logged:

[  626.726086] [drm:intel_dp_aux_ch] *ERROR* dp_aux_ch not done status 0x00050000


Happening since mid-September, build CI_DRM_1635.

Raised new https://bugs.freedesktop.org/show_bug.cgi?id=98040

> fi-bdw-5557u     total:236  pass:220  dwarn:0   dfail:0   fail:0   skip:15
> fi-bxt-t5700     total:236  pass:205  dwarn:0   dfail:0   fail:0   skip:30
> fi-byt-j1900     total:236  pass:202  dwarn:1   dfail:0   fail:1   skip:31
> fi-byt-n2820     total:245  pass:208  dwarn:0   dfail:0   fail:2   skip:35
> fi-hsw-4770      total:236  pass:213  dwarn:0   dfail:0   fail:0   skip:22
> fi-hsw-4770r     total:245  pass:222  dwarn:0   dfail:0   fail:1   skip:22
> fi-ilk-650       total:236  pass:173  dwarn:0   dfail:0   fail:2   skip:60
> fi-skl-6700k     total:236  pass:209  dwarn:2   dfail:0   fail:0   skip:24
> fi-skl-6770hq    total:245  pass:228  dwarn:1   dfail:0   fail:2   skip:14
> fi-snb-2520m     total:245  pass:208  dwarn:0   dfail:0   fail:1   skip:36
> fi-snb-2600      total:236  pass:198  dwarn:0   dfail:0   fail:0   skip:37
> fi-ivb-3520m failed to connect after reboot
> fi-ivb-3770 failed to collect. IGT log at Patchwork_2612/fi-ivb-3770/igt.log
> fi-skl-6260u failed to collect. IGT log at Patchwork_2612/fi-skl-6260u/igt.log
> fi-skl-6700hq failed to collect. IGT log at Patchwork_2612/fi-skl-6700hq/igt.log
>
> Results at /archive/results/CI_IGT_test/Patchwork_2612/
>
> d0534bd0217c83c083ba146b9c987e19b219e0e4 drm-intel-nightly: 2016y-09m-30d-10h-31m-00s UTC integration manifest
> c5237e5 drm/i915: Inline binary search
> 9b4e3c9 drm/i915: Use binary search when looking for shadowed registers
> 0b12dbd drm/i915: Sort the shadow register table
> 7978e69 drm/i915: Remove identical write mmmio functions
> b0960d1 drm/i915: Remove identical mmio read functions
> 354b09a drm/i915: Remove identical macros
> e608845 drm/i915: Store the active forcewake range table pointer
> 79f3cb4 drm/i915: Eliminate Gen9 special case
> 7fe1499 drm/i915: Use binary search when looking up forcewake domains
> 1a71a14 drm/i915: Sort forcewake mapping tables
> d34328e drm/i915: Data driven register to forcewake domains lookup
> 82e0e23 drm/i915: Do not inline forcewake taking in mmio accessors
> 920037b drm/i915: Keep track of active forcewake domains in a bitmask
> e6741b5 drm/i915: Remove redundant hsw_write* mmio functions

Results look good. If everyone is happy we could merge this.

Regards,

Tvrtko

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: ✗ Fi.CI.BAT:  warning for Forcewake binary search & code shrink (rev6)
  2016-10-04  8:46   ` Tvrtko Ursulin
@ 2016-10-04  8:55     ` Chris Wilson
  2016-10-04  9:18       ` Tvrtko Ursulin
  0 siblings, 1 reply; 32+ messages in thread
From: Chris Wilson @ 2016-10-04  8:55 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: intel-gfx

On Tue, Oct 04, 2016 at 09:46:50AM +0100, Tvrtko Ursulin wrote:
> On 03/10/2016 10:27, Patchwork wrote:
> >fi-bdw-5557u     total:236  pass:220  dwarn:0   dfail:0   fail:0   skip:15
> >fi-bxt-t5700     total:236  pass:205  dwarn:0   dfail:0   fail:0   skip:30
> >fi-byt-j1900     total:236  pass:202  dwarn:1   dfail:0   fail:1   skip:31
> >fi-byt-n2820     total:245  pass:208  dwarn:0   dfail:0   fail:2   skip:35
> >fi-hsw-4770      total:236  pass:213  dwarn:0   dfail:0   fail:0   skip:22
> >fi-hsw-4770r     total:245  pass:222  dwarn:0   dfail:0   fail:1   skip:22
> >fi-ilk-650       total:236  pass:173  dwarn:0   dfail:0   fail:2   skip:60
> >fi-skl-6700k     total:236  pass:209  dwarn:2   dfail:0   fail:0   skip:24
> >fi-skl-6770hq    total:245  pass:228  dwarn:1   dfail:0   fail:2   skip:14
> >fi-snb-2520m     total:245  pass:208  dwarn:0   dfail:0   fail:1   skip:36
> >fi-snb-2600      total:236  pass:198  dwarn:0   dfail:0   fail:0   skip:37
> >fi-ivb-3520m failed to connect after reboot
> >fi-ivb-3770 failed to collect. IGT log at Patchwork_2612/fi-ivb-3770/igt.log
> >fi-skl-6260u failed to collect. IGT log at Patchwork_2612/fi-skl-6260u/igt.log
> >fi-skl-6700hq failed to collect. IGT log at Patchwork_2612/fi-skl-6700hq/igt.log
> >
> >Results at /archive/results/CI_IGT_test/Patchwork_2612/
> >
> >d0534bd0217c83c083ba146b9c987e19b219e0e4 drm-intel-nightly: 2016y-09m-30d-10h-31m-00s UTC integration manifest
> >c5237e5 drm/i915: Inline binary search
> >9b4e3c9 drm/i915: Use binary search when looking for shadowed registers
> >0b12dbd drm/i915: Sort the shadow register table
> >7978e69 drm/i915: Remove identical write mmmio functions
> >b0960d1 drm/i915: Remove identical mmio read functions
> >354b09a drm/i915: Remove identical macros
> >e608845 drm/i915: Store the active forcewake range table pointer
> >79f3cb4 drm/i915: Eliminate Gen9 special case
> >7fe1499 drm/i915: Use binary search when looking up forcewake domains
> >1a71a14 drm/i915: Sort forcewake mapping tables
> >d34328e drm/i915: Data driven register to forcewake domains lookup
> >82e0e23 drm/i915: Do not inline forcewake taking in mmio accessors
> >920037b drm/i915: Keep track of active forcewake domains in a bitmask
> >e6741b5 drm/i915: Remove redundant hsw_write* mmio functions
> 
> Results look good. If everyone is happy we could merge this.

Hmm, ivb a sporadic fail? That both died is a bit concerning, but
https://patchwork.freedesktop.org/series/13264/ is happy, right?

Lgtm.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: ✗ Fi.CI.BAT: warning for Forcewake binary search & code shrink (rev6)
  2016-10-04  8:55     ` Chris Wilson
@ 2016-10-04  9:18       ` Tvrtko Ursulin
  2016-10-04  9:37         ` Joonas Lahtinen
  0 siblings, 1 reply; 32+ messages in thread
From: Tvrtko Ursulin @ 2016-10-04  9:18 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx, Tvrtko Ursulin


On 04/10/2016 09:55, Chris Wilson wrote:

> On Tue, Oct 04, 2016 at 09:46:50AM +0100, Tvrtko Ursulin wrote:
>> On 03/10/2016 10:27, Patchwork wrote:
>>> fi-bdw-5557u     total:236  pass:220  dwarn:0   dfail:0   fail:0   skip:15
>>> fi-bxt-t5700     total:236  pass:205  dwarn:0   dfail:0   fail:0   skip:30
>>> fi-byt-j1900     total:236  pass:202  dwarn:1   dfail:0   fail:1   skip:31
>>> fi-byt-n2820     total:245  pass:208  dwarn:0   dfail:0   fail:2   skip:35
>>> fi-hsw-4770      total:236  pass:213  dwarn:0   dfail:0   fail:0   skip:22
>>> fi-hsw-4770r     total:245  pass:222  dwarn:0   dfail:0   fail:1   skip:22
>>> fi-ilk-650       total:236  pass:173  dwarn:0   dfail:0   fail:2   skip:60
>>> fi-skl-6700k     total:236  pass:209  dwarn:2   dfail:0   fail:0   skip:24
>>> fi-skl-6770hq    total:245  pass:228  dwarn:1   dfail:0   fail:2   skip:14
>>> fi-snb-2520m     total:245  pass:208  dwarn:0   dfail:0   fail:1   skip:36
>>> fi-snb-2600      total:236  pass:198  dwarn:0   dfail:0   fail:0   skip:37
>>> fi-ivb-3520m failed to connect after reboot
>>> fi-ivb-3770 failed to collect. IGT log at Patchwork_2612/fi-ivb-3770/igt.log
>>> fi-skl-6260u failed to collect. IGT log at Patchwork_2612/fi-skl-6260u/igt.log
>>> fi-skl-6700hq failed to collect. IGT log at Patchwork_2612/fi-skl-6700hq/igt.log
>>>
>>> Results at /archive/results/CI_IGT_test/Patchwork_2612/
>>>
>>> d0534bd0217c83c083ba146b9c987e19b219e0e4 drm-intel-nightly: 2016y-09m-30d-10h-31m-00s UTC integration manifest
>>> c5237e5 drm/i915: Inline binary search
>>> 9b4e3c9 drm/i915: Use binary search when looking for shadowed registers
>>> 0b12dbd drm/i915: Sort the shadow register table
>>> 7978e69 drm/i915: Remove identical write mmmio functions
>>> b0960d1 drm/i915: Remove identical mmio read functions
>>> 354b09a drm/i915: Remove identical macros
>>> e608845 drm/i915: Store the active forcewake range table pointer
>>> 79f3cb4 drm/i915: Eliminate Gen9 special case
>>> 7fe1499 drm/i915: Use binary search when looking up forcewake domains
>>> 1a71a14 drm/i915: Sort forcewake mapping tables
>>> d34328e drm/i915: Data driven register to forcewake domains lookup
>>> 82e0e23 drm/i915: Do not inline forcewake taking in mmio accessors
>>> 920037b drm/i915: Keep track of active forcewake domains in a bitmask
>>> e6741b5 drm/i915: Remove redundant hsw_write* mmio functions
>> Results look good. If everyone is happy we could merge this.
> Hmm, ivb a sporadic fail? That both died is a bit concerning, but
> https://patchwork.freedesktop.org/series/13264/ is happy, right?

Yes, and also they did not both die, one was just over the allocated 
time for the run. Same with the two SKLs. So perhaps we are close to the 
cut off with the whole suite again.

> Lgtm.

Cool, thanks.

Joonas what do you think, worth merging on balance?

Regards,

Tvrtko

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: ✗ Fi.CI.BAT: warning for Forcewake binary search & code shrink (rev6)
  2016-10-04  9:18       ` Tvrtko Ursulin
@ 2016-10-04  9:37         ` Joonas Lahtinen
  2016-10-04 10:17           ` Tvrtko Ursulin
  0 siblings, 1 reply; 32+ messages in thread
From: Joonas Lahtinen @ 2016-10-04  9:37 UTC (permalink / raw)
  To: Tvrtko Ursulin, Chris Wilson, intel-gfx, Tvrtko Ursulin

On ti, 2016-10-04 at 10:18 +0100, Tvrtko Ursulin wrote:
> 
> Joonas what do you think, worth merging on balance?
> 

+1 on merging.

Regards, Joonas

-- 
Joonas Lahtinen
Open Source Technology Center
Intel Corporation
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: ✗ Fi.CI.BAT: warning for Forcewake binary search & code shrink (rev6)
  2016-10-04  9:37         ` Joonas Lahtinen
@ 2016-10-04 10:17           ` Tvrtko Ursulin
  0 siblings, 0 replies; 32+ messages in thread
From: Tvrtko Ursulin @ 2016-10-04 10:17 UTC (permalink / raw)
  To: Joonas Lahtinen, Chris Wilson, intel-gfx, Tvrtko Ursulin


On 04/10/2016 10:37, Joonas Lahtinen wrote:
> On ti, 2016-10-04 at 10:18 +0100, Tvrtko Ursulin wrote:
>>   
>> Joonas what do you think, worth merging on balance?
>>   
> +1 on merging.

Pushed to dinq, first one from the new office. :)

Thanks for the review guys!

Regards,

Tvrtko

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 04/14] drm/i915: Data driven register to forcewake domains lookup
  2016-10-03  7:58   ` Joonas Lahtinen
@ 2016-10-26  9:24     ` Mika Kuoppala
  0 siblings, 0 replies; 32+ messages in thread
From: Mika Kuoppala @ 2016-10-26  9:24 UTC (permalink / raw)
  To: Joonas Lahtinen, Tvrtko Ursulin, Intel-gfx

Joonas Lahtinen <joonas.lahtinen@linux.intel.com> writes:

> On pe, 2016-09-30 at 18:48 +0100, Tvrtko Ursulin wrote:
>> 
>> +find_fw_domain(u32 offset, const struct intel_forcewake_range *ranges,
>> +	       unsigned int num_ranges)
>
> Gotta remind that generally "fw" is associated with firmware.
> "forcewake" might be appropriate word in function names too.
>

My reasoning originally was that these are static inside
intel_uncore.c thus the context is clear that these are not
about firmware.

Exposed ones have 'forcewake' in their name to avoid the
confusion.

-Mika

>> +{
>> +	unsigned int i;
>> +	struct intel_forcewake_range *entry =
>> +		(struct intel_forcewake_range *)ranges;
>> +
>> +	for (i = 0; i < num_ranges; i++, entry++) {
>> +		if (offset >= entry->start && offset <= entry->end)
>> +			return entry->domains;
>> +	}
>> +
>> +	return -1;
>> +}
>> 
>
> This could be a binary search, too, with some sorting.
>
> Already as is;
>
> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
>
> Regards, Joonas
> -- 
> Joonas Lahtinen
> Open Source Technology Center
> Intel Corporation
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 32+ messages in thread

end of thread, other threads:[~2016-10-26  9:26 UTC | newest]

Thread overview: 32+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-09-30 17:48 [PATCH v2 00/14] Forcewake binary search & code shrink Tvrtko Ursulin
2016-09-30 17:48 ` [PATCH 01/14] drm/i915: Remove redundant hsw_write* mmio functions Tvrtko Ursulin
2016-09-30 17:48 ` [PATCH 02/14] drm/i915: Keep track of active forcewake domains in a bitmask Tvrtko Ursulin
2016-09-30 17:48 ` [PATCH 03/14] drm/i915: Do not inline forcewake taking in mmio accessors Tvrtko Ursulin
2016-09-30 17:48 ` [PATCH 04/14] drm/i915: Data driven register to forcewake domains lookup Tvrtko Ursulin
2016-10-03  7:58   ` Joonas Lahtinen
2016-10-26  9:24     ` Mika Kuoppala
2016-09-30 17:48 ` [PATCH 05/14] drm/i915: Sort forcewake mapping tables Tvrtko Ursulin
2016-09-30 18:05   ` Chris Wilson
2016-10-03  8:37     ` Tvrtko Ursulin
2016-09-30 17:48 ` [PATCH 06/14] drm/i915: Use binary search when looking up forcewake domains Tvrtko Ursulin
2016-09-30 17:48 ` [PATCH 07/14] drm/i915: Eliminate Gen9 special case Tvrtko Ursulin
2016-09-30 17:48 ` [PATCH 08/14] drm/i915: Store the active forcewake range table pointer Tvrtko Ursulin
2016-10-03  8:39   ` Tvrtko Ursulin
2016-09-30 17:48 ` [PATCH 09/14] drm/i915: Remove identical macros Tvrtko Ursulin
2016-09-30 17:48 ` [PATCH 10/14] drm/i915: Remove identical mmio read functions Tvrtko Ursulin
2016-09-30 17:48 ` [PATCH 11/14] drm/i915: Remove identical write mmmio functions Tvrtko Ursulin
2016-09-30 17:48 ` [PATCH 12/14] drm/i915: Sort the shadow register table Tvrtko Ursulin
2016-10-03  8:40   ` Tvrtko Ursulin
2016-09-30 17:48 ` [PATCH 13/14] drm/i915: Use binary search when looking for shadowed registers Tvrtko Ursulin
2016-09-30 18:08   ` Chris Wilson
2016-10-03  8:05     ` Joonas Lahtinen
2016-10-03  8:33       ` Tvrtko Ursulin
2016-09-30 17:48 ` [PATCH 14/14] drm/i915: Inline binary search Tvrtko Ursulin
2016-09-30 18:09 ` [PATCH v2 00/14] Forcewake binary search & code shrink Chris Wilson
2016-09-30 18:49 ` ✗ Fi.CI.BAT: warning for Forcewake binary search & code shrink (rev3) Patchwork
2016-10-03  9:27 ` ✗ Fi.CI.BAT: warning for Forcewake binary search & code shrink (rev6) Patchwork
2016-10-04  8:46   ` Tvrtko Ursulin
2016-10-04  8:55     ` Chris Wilson
2016-10-04  9:18       ` Tvrtko Ursulin
2016-10-04  9:37         ` Joonas Lahtinen
2016-10-04 10:17           ` Tvrtko Ursulin

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.