All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/6] Refactor to expand subslice mask
@ 2019-05-01 15:34 Stuart Summers
  2019-05-01 15:34 ` [PATCH 1/6] drm/i915: Use local variable for SSEU info in GETPARAM ioctl Stuart Summers
                   ` (9 more replies)
  0 siblings, 10 replies; 35+ messages in thread
From: Stuart Summers @ 2019-05-01 15:34 UTC (permalink / raw)
  To: intel-gfx

This patch series contains a few code clean-up patches, followed
by a patch which changes the storage of the subslice mask to better
match the userspace access through the I915_QUERY_TOPOLOGY_INFO
ioctl. The index into the subslice_mask array is then calculated:
  slice * subslice stride + subslice index / 8

v2: fix i915_pm_sseu test failure
v3: no changes to patches in the series, just resending to pick up
    in CI correctly
v4: rebase
v5: fix header test
v6: address review comments from Jari
    address minor checkpatch warning in existing code
    use eu_stride for EU div-by-8
v7: another rebase

Stuart Summers (6):
  drm/i915: Use local variable for SSEU info in GETPARAM ioctl
  drm/i915: Add macro for SSEU stride calculation
  drm/i915: Move calculation of subslices per slice to new function
  drm/i915: Move sseu helper functions to intel_sseu.h
  drm/i915: Remove inline from sseu helper functions
  drm/i915: Expand subslice mask

 drivers/gpu/drm/i915/gt/intel_engine_cs.c    |   6 +-
 drivers/gpu/drm/i915/gt/intel_engine_types.h |  32 +--
 drivers/gpu/drm/i915/gt/intel_hangcheck.c    |   3 +-
 drivers/gpu/drm/i915/gt/intel_sseu.c         |  85 ++++++++
 drivers/gpu/drm/i915/gt/intel_sseu.h         |  30 ++-
 drivers/gpu/drm/i915/gt/intel_workarounds.c  |   2 +-
 drivers/gpu/drm/i915/i915_debugfs.c          |  50 +++--
 drivers/gpu/drm/i915/i915_drv.c              |  15 +-
 drivers/gpu/drm/i915/i915_gpu_error.c        |   5 +-
 drivers/gpu/drm/i915/i915_query.c            |  15 +-
 drivers/gpu/drm/i915/intel_device_info.c     | 209 +++++++++++--------
 drivers/gpu/drm/i915/intel_device_info.h     |  47 -----
 12 files changed, 302 insertions(+), 197 deletions(-)

-- 
2.21.0.5.gaeb582a983

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH 1/6] drm/i915: Use local variable for SSEU info in GETPARAM ioctl
  2019-05-01 15:34 [PATCH 0/6] Refactor to expand subslice mask Stuart Summers
@ 2019-05-01 15:34 ` Stuart Summers
  2019-05-01 17:54   ` Daniele Ceraolo Spurio
  2019-05-01 15:34 ` [PATCH 2/6] drm/i915: Add macro for SSEU stride calculation Stuart Summers
                   ` (8 subsequent siblings)
  9 siblings, 1 reply; 35+ messages in thread
From: Stuart Summers @ 2019-05-01 15:34 UTC (permalink / raw)
  To: intel-gfx

In the GETPARAM ioctl handler, use a local variable to consolidate
usage of SSEU runtime info.

v2: add const to sseu_dev_info variable

Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Stuart Summers <stuart.summers@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.c | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 21dac5a09fbe..c376244c19c4 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -324,6 +324,7 @@ static int i915_getparam_ioctl(struct drm_device *dev, void *data,
 {
 	struct drm_i915_private *dev_priv = to_i915(dev);
 	struct pci_dev *pdev = dev_priv->drm.pdev;
+	const struct sseu_dev_info *sseu = &RUNTIME_INFO(dev_priv)->sseu;
 	drm_i915_getparam_t *param = data;
 	int value;
 
@@ -377,12 +378,12 @@ static int i915_getparam_ioctl(struct drm_device *dev, void *data,
 		value = i915_cmd_parser_get_version(dev_priv);
 		break;
 	case I915_PARAM_SUBSLICE_TOTAL:
-		value = sseu_subslice_total(&RUNTIME_INFO(dev_priv)->sseu);
+		value = sseu_subslice_total(sseu);
 		if (!value)
 			return -ENODEV;
 		break;
 	case I915_PARAM_EU_TOTAL:
-		value = RUNTIME_INFO(dev_priv)->sseu.eu_total;
+		value = sseu->eu_total;
 		if (!value)
 			return -ENODEV;
 		break;
@@ -399,7 +400,7 @@ static int i915_getparam_ioctl(struct drm_device *dev, void *data,
 		value = HAS_POOLED_EU(dev_priv);
 		break;
 	case I915_PARAM_MIN_EU_IN_POOL:
-		value = RUNTIME_INFO(dev_priv)->sseu.min_eu_in_pool;
+		value = sseu->min_eu_in_pool;
 		break;
 	case I915_PARAM_HUC_STATUS:
 		value = intel_huc_check_status(&dev_priv->huc);
@@ -449,12 +450,12 @@ static int i915_getparam_ioctl(struct drm_device *dev, void *data,
 		value = intel_engines_has_context_isolation(dev_priv);
 		break;
 	case I915_PARAM_SLICE_MASK:
-		value = RUNTIME_INFO(dev_priv)->sseu.slice_mask;
+		value = sseu->slice_mask;
 		if (!value)
 			return -ENODEV;
 		break;
 	case I915_PARAM_SUBSLICE_MASK:
-		value = RUNTIME_INFO(dev_priv)->sseu.subslice_mask[0];
+		value = sseu->subslice_mask[0];
 		if (!value)
 			return -ENODEV;
 		break;
-- 
2.21.0.5.gaeb582a983

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 2/6] drm/i915: Add macro for SSEU stride calculation
  2019-05-01 15:34 [PATCH 0/6] Refactor to expand subslice mask Stuart Summers
  2019-05-01 15:34 ` [PATCH 1/6] drm/i915: Use local variable for SSEU info in GETPARAM ioctl Stuart Summers
@ 2019-05-01 15:34 ` Stuart Summers
  2019-05-01 18:11   ` Daniele Ceraolo Spurio
  2019-05-01 15:34 ` [PATCH 3/6] drm/i915: Move calculation of subslices per slice to new function Stuart Summers
                   ` (7 subsequent siblings)
  9 siblings, 1 reply; 35+ messages in thread
From: Stuart Summers @ 2019-05-01 15:34 UTC (permalink / raw)
  To: intel-gfx

Subslice stride and EU stride are calculated multiple times in
i915_query. Move this calculation to a macro to reduce code duplication.

v2: update headers in intel_sseu.h

Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Stuart Summers <stuart.summers@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_sseu.h |  2 ++
 drivers/gpu/drm/i915/i915_query.c    | 17 ++++++++---------
 2 files changed, 10 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_sseu.h b/drivers/gpu/drm/i915/gt/intel_sseu.h
index 73bc824094e8..c0b16b248d4c 100644
--- a/drivers/gpu/drm/i915/gt/intel_sseu.h
+++ b/drivers/gpu/drm/i915/gt/intel_sseu.h
@@ -8,11 +8,13 @@
 #define __INTEL_SSEU_H__
 
 #include <linux/types.h>
+#include <linux/kernel.h>
 
 struct drm_i915_private;
 
 #define GEN_MAX_SLICES		(6) /* CNL upper bound */
 #define GEN_MAX_SUBSLICES	(8) /* ICL upper bound */
+#define GEN_SSEU_STRIDE(bits) DIV_ROUND_UP(bits, BITS_PER_BYTE)
 
 struct sseu_dev_info {
 	u8 slice_mask;
diff --git a/drivers/gpu/drm/i915/i915_query.c b/drivers/gpu/drm/i915/i915_query.c
index 782183b78f49..7c1708c22811 100644
--- a/drivers/gpu/drm/i915/i915_query.c
+++ b/drivers/gpu/drm/i915/i915_query.c
@@ -37,6 +37,8 @@ static int query_topology_info(struct drm_i915_private *dev_priv,
 	const struct sseu_dev_info *sseu = &RUNTIME_INFO(dev_priv)->sseu;
 	struct drm_i915_query_topology_info topo;
 	u32 slice_length, subslice_length, eu_length, total_length;
+	u8 subslice_stride = GEN_SSEU_STRIDE(sseu->max_subslices);
+	u8 eu_stride = GEN_SSEU_STRIDE(sseu->max_eus_per_subslice);
 	int ret;
 
 	if (query_item->flags != 0)
@@ -48,12 +50,10 @@ static int query_topology_info(struct drm_i915_private *dev_priv,
 	BUILD_BUG_ON(sizeof(u8) != sizeof(sseu->slice_mask));
 
 	slice_length = sizeof(sseu->slice_mask);
-	subslice_length = sseu->max_slices *
-		DIV_ROUND_UP(sseu->max_subslices, BITS_PER_BYTE);
-	eu_length = sseu->max_slices * sseu->max_subslices *
-		DIV_ROUND_UP(sseu->max_eus_per_subslice, BITS_PER_BYTE);
-
-	total_length = sizeof(topo) + slice_length + subslice_length + eu_length;
+	subslice_length = sseu->max_slices * subslice_stride;
+	eu_length = sseu->max_slices * sseu->max_subslices * eu_stride;
+	total_length = sizeof(topo) + slice_length + subslice_length +
+		       eu_length;
 
 	ret = copy_query_item(&topo, sizeof(topo), total_length,
 			      query_item);
@@ -69,10 +69,9 @@ static int query_topology_info(struct drm_i915_private *dev_priv,
 	topo.max_eus_per_subslice = sseu->max_eus_per_subslice;
 
 	topo.subslice_offset = slice_length;
-	topo.subslice_stride = DIV_ROUND_UP(sseu->max_subslices, BITS_PER_BYTE);
+	topo.subslice_stride = subslice_stride;
 	topo.eu_offset = slice_length + subslice_length;
-	topo.eu_stride =
-		DIV_ROUND_UP(sseu->max_eus_per_subslice, BITS_PER_BYTE);
+	topo.eu_stride = eu_stride;
 
 	if (__copy_to_user(u64_to_user_ptr(query_item->data_ptr),
 			   &topo, sizeof(topo)))
-- 
2.21.0.5.gaeb582a983

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 3/6] drm/i915: Move calculation of subslices per slice to new function
  2019-05-01 15:34 [PATCH 0/6] Refactor to expand subslice mask Stuart Summers
  2019-05-01 15:34 ` [PATCH 1/6] drm/i915: Use local variable for SSEU info in GETPARAM ioctl Stuart Summers
  2019-05-01 15:34 ` [PATCH 2/6] drm/i915: Add macro for SSEU stride calculation Stuart Summers
@ 2019-05-01 15:34 ` Stuart Summers
  2019-05-01 18:14   ` Daniele Ceraolo Spurio
  2019-05-01 15:34 ` [PATCH 4/6] drm/i915: Move sseu helper functions to intel_sseu.h Stuart Summers
                   ` (6 subsequent siblings)
  9 siblings, 1 reply; 35+ messages in thread
From: Stuart Summers @ 2019-05-01 15:34 UTC (permalink / raw)
  To: intel-gfx

Add a new function to return the number of subslices per slice to
consolidate code usage.

v2: rebase on changes to move sseu struct to intel_sseu.h

Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Stuart Summers <stuart.summers@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_sseu.h     | 6 ++++++
 drivers/gpu/drm/i915/i915_debugfs.c      | 2 +-
 drivers/gpu/drm/i915/intel_device_info.c | 4 ++--
 3 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_sseu.h b/drivers/gpu/drm/i915/gt/intel_sseu.h
index c0b16b248d4c..f5ff6b7a756a 100644
--- a/drivers/gpu/drm/i915/gt/intel_sseu.h
+++ b/drivers/gpu/drm/i915/gt/intel_sseu.h
@@ -63,6 +63,12 @@ intel_sseu_from_device_info(const struct sseu_dev_info *sseu)
 	return value;
 }
 
+static inline unsigned int
+sseu_subslices_per_slice(const struct sseu_dev_info *sseu, u8 slice)
+{
+	return hweight8(sseu->subslice_mask[slice]);
+}
+
 u32 intel_sseu_make_rpcs(struct drm_i915_private *i915,
 			 const struct intel_sseu *req_sseu);
 
diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 0e4dffcd4da4..fe854c629a32 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -4185,7 +4185,7 @@ static void i915_print_sseu_info(struct seq_file *m, bool is_available_info,
 		   sseu_subslice_total(sseu));
 	for (s = 0; s < fls(sseu->slice_mask); s++) {
 		seq_printf(m, "  %s Slice%i subslices: %u\n", type,
-			   s, hweight8(sseu->subslice_mask[s]));
+			   s, sseu_subslices_per_slice(sseu, s));
 	}
 	seq_printf(m, "  %s EU Total: %u\n", type,
 		   sseu->eu_total);
diff --git a/drivers/gpu/drm/i915/intel_device_info.c b/drivers/gpu/drm/i915/intel_device_info.c
index 6af480b95bc6..559cf0d0628e 100644
--- a/drivers/gpu/drm/i915/intel_device_info.c
+++ b/drivers/gpu/drm/i915/intel_device_info.c
@@ -93,7 +93,7 @@ static void sseu_dump(const struct sseu_dev_info *sseu, struct drm_printer *p)
 	drm_printf(p, "subslice total: %u\n", sseu_subslice_total(sseu));
 	for (s = 0; s < sseu->max_slices; s++) {
 		drm_printf(p, "slice%d: %u subslices, mask=%04x\n",
-			   s, hweight8(sseu->subslice_mask[s]),
+			   s, sseu_subslices_per_slice(sseu, s),
 			   sseu->subslice_mask[s]);
 	}
 	drm_printf(p, "EU total: %u\n", sseu->eu_total);
@@ -126,7 +126,7 @@ void intel_device_info_dump_topology(const struct sseu_dev_info *sseu,
 
 	for (s = 0; s < sseu->max_slices; s++) {
 		drm_printf(p, "slice%d: %u subslice(s) (0x%hhx):\n",
-			   s, hweight8(sseu->subslice_mask[s]),
+			   s, sseu_subslices_per_slice(sseu, s),
 			   sseu->subslice_mask[s]);
 
 		for (ss = 0; ss < sseu->max_subslices; ss++) {
-- 
2.21.0.5.gaeb582a983

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 4/6] drm/i915: Move sseu helper functions to intel_sseu.h
  2019-05-01 15:34 [PATCH 0/6] Refactor to expand subslice mask Stuart Summers
                   ` (2 preceding siblings ...)
  2019-05-01 15:34 ` [PATCH 3/6] drm/i915: Move calculation of subslices per slice to new function Stuart Summers
@ 2019-05-01 15:34 ` Stuart Summers
  2019-05-01 18:48   ` Daniele Ceraolo Spurio
  2019-05-01 15:34 ` [PATCH 5/6] drm/i915: Remove inline from sseu helper functions Stuart Summers
                   ` (5 subsequent siblings)
  9 siblings, 1 reply; 35+ messages in thread
From: Stuart Summers @ 2019-05-01 15:34 UTC (permalink / raw)
  To: intel-gfx

v2: fix spacing from checkpatch warning

Signed-off-by: Stuart Summers <stuart.summers@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_sseu.h     | 47 ++++++++++++++++++++++++
 drivers/gpu/drm/i915/intel_device_info.h | 47 ------------------------
 2 files changed, 47 insertions(+), 47 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_sseu.h b/drivers/gpu/drm/i915/gt/intel_sseu.h
index f5ff6b7a756a..029e71d8f140 100644
--- a/drivers/gpu/drm/i915/gt/intel_sseu.h
+++ b/drivers/gpu/drm/i915/gt/intel_sseu.h
@@ -63,12 +63,59 @@ intel_sseu_from_device_info(const struct sseu_dev_info *sseu)
 	return value;
 }
 
+static inline unsigned int sseu_subslice_total(const struct sseu_dev_info *sseu)
+{
+	unsigned int i, total = 0;
+
+	for (i = 0; i < ARRAY_SIZE(sseu->subslice_mask); i++)
+		total += hweight8(sseu->subslice_mask[i]);
+
+	return total;
+}
+
 static inline unsigned int
 sseu_subslices_per_slice(const struct sseu_dev_info *sseu, u8 slice)
 {
 	return hweight8(sseu->subslice_mask[slice]);
 }
 
+static inline int sseu_eu_idx(const struct sseu_dev_info *sseu,
+			      int slice, int subslice)
+{
+	int subslice_stride = DIV_ROUND_UP(sseu->max_eus_per_subslice,
+					   BITS_PER_BYTE);
+	int slice_stride = sseu->max_subslices * subslice_stride;
+
+	return slice * slice_stride + subslice * subslice_stride;
+}
+
+static inline u16 sseu_get_eus(const struct sseu_dev_info *sseu,
+			       int slice, int subslice)
+{
+	int i, offset = sseu_eu_idx(sseu, slice, subslice);
+	u16 eu_mask = 0;
+
+	for (i = 0;
+	     i < DIV_ROUND_UP(sseu->max_eus_per_subslice, BITS_PER_BYTE); i++) {
+		eu_mask |= ((u16)sseu->eu_mask[offset + i]) <<
+			(i * BITS_PER_BYTE);
+	}
+
+	return eu_mask;
+}
+
+static inline void sseu_set_eus(struct sseu_dev_info *sseu,
+				int slice, int subslice, u16 eu_mask)
+{
+	int i, offset = sseu_eu_idx(sseu, slice, subslice);
+
+	for (i = 0;
+	     i < DIV_ROUND_UP(sseu->max_eus_per_subslice, BITS_PER_BYTE); i++) {
+		sseu->eu_mask[offset + i] =
+			(eu_mask >> (BITS_PER_BYTE * i)) & 0xff;
+	}
+}
+
 u32 intel_sseu_make_rpcs(struct drm_i915_private *i915,
 			 const struct intel_sseu *req_sseu);
 
diff --git a/drivers/gpu/drm/i915/intel_device_info.h b/drivers/gpu/drm/i915/intel_device_info.h
index 5a2e17d6146b..6412a9c72898 100644
--- a/drivers/gpu/drm/i915/intel_device_info.h
+++ b/drivers/gpu/drm/i915/intel_device_info.h
@@ -218,53 +218,6 @@ struct intel_driver_caps {
 	bool has_logical_contexts:1;
 };
 
-static inline unsigned int sseu_subslice_total(const struct sseu_dev_info *sseu)
-{
-	unsigned int i, total = 0;
-
-	for (i = 0; i < ARRAY_SIZE(sseu->subslice_mask); i++)
-		total += hweight8(sseu->subslice_mask[i]);
-
-	return total;
-}
-
-static inline int sseu_eu_idx(const struct sseu_dev_info *sseu,
-			      int slice, int subslice)
-{
-	int subslice_stride = DIV_ROUND_UP(sseu->max_eus_per_subslice,
-					   BITS_PER_BYTE);
-	int slice_stride = sseu->max_subslices * subslice_stride;
-
-	return slice * slice_stride + subslice * subslice_stride;
-}
-
-static inline u16 sseu_get_eus(const struct sseu_dev_info *sseu,
-			       int slice, int subslice)
-{
-	int i, offset = sseu_eu_idx(sseu, slice, subslice);
-	u16 eu_mask = 0;
-
-	for (i = 0;
-	     i < DIV_ROUND_UP(sseu->max_eus_per_subslice, BITS_PER_BYTE); i++) {
-		eu_mask |= ((u16) sseu->eu_mask[offset + i]) <<
-			(i * BITS_PER_BYTE);
-	}
-
-	return eu_mask;
-}
-
-static inline void sseu_set_eus(struct sseu_dev_info *sseu,
-				int slice, int subslice, u16 eu_mask)
-{
-	int i, offset = sseu_eu_idx(sseu, slice, subslice);
-
-	for (i = 0;
-	     i < DIV_ROUND_UP(sseu->max_eus_per_subslice, BITS_PER_BYTE); i++) {
-		sseu->eu_mask[offset + i] =
-			(eu_mask >> (BITS_PER_BYTE * i)) & 0xff;
-	}
-}
-
 const char *intel_platform_name(enum intel_platform platform);
 
 void intel_device_info_subplatform_init(struct drm_i915_private *dev_priv);
-- 
2.21.0.5.gaeb582a983

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 5/6] drm/i915: Remove inline from sseu helper functions
  2019-05-01 15:34 [PATCH 0/6] Refactor to expand subslice mask Stuart Summers
                   ` (3 preceding siblings ...)
  2019-05-01 15:34 ` [PATCH 4/6] drm/i915: Move sseu helper functions to intel_sseu.h Stuart Summers
@ 2019-05-01 15:34 ` Stuart Summers
  2019-05-01 20:04   ` Daniele Ceraolo Spurio
  2019-05-01 15:34 ` [PATCH 6/6] drm/i915: Expand subslice mask Stuart Summers
                   ` (4 subsequent siblings)
  9 siblings, 1 reply; 35+ messages in thread
From: Stuart Summers @ 2019-05-01 15:34 UTC (permalink / raw)
  To: intel-gfx

Additionally, ensure these are all prefixed with intel_sseu_*
to match the convention of other functions in i915.

Signed-off-by: Stuart Summers <stuart.summers@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_sseu.c     | 54 +++++++++++++++++++
 drivers/gpu/drm/i915/gt/intel_sseu.h     | 57 +++-----------------
 drivers/gpu/drm/i915/i915_debugfs.c      |  6 +--
 drivers/gpu/drm/i915/i915_drv.c          |  2 +-
 drivers/gpu/drm/i915/intel_device_info.c | 69 ++++++++++++------------
 5 files changed, 102 insertions(+), 86 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_sseu.c b/drivers/gpu/drm/i915/gt/intel_sseu.c
index 7f448f3bea0b..4a0b82fc108c 100644
--- a/drivers/gpu/drm/i915/gt/intel_sseu.c
+++ b/drivers/gpu/drm/i915/gt/intel_sseu.c
@@ -8,6 +8,60 @@
 #include "intel_lrc_reg.h"
 #include "intel_sseu.h"
 
+unsigned int
+intel_sseu_subslice_total(const struct sseu_dev_info *sseu)
+{
+	unsigned int i, total = 0;
+
+	for (i = 0; i < ARRAY_SIZE(sseu->subslice_mask); i++)
+		total += hweight8(sseu->subslice_mask[i]);
+
+	return total;
+}
+
+unsigned int
+intel_sseu_subslices_per_slice(const struct sseu_dev_info *sseu, u8 slice)
+{
+	return hweight8(sseu->subslice_mask[slice]);
+}
+
+static int intel_sseu_eu_idx(const struct sseu_dev_info *sseu, int slice,
+			     int subslice)
+{
+	int subslice_stride = DIV_ROUND_UP(sseu->max_eus_per_subslice,
+					   BITS_PER_BYTE);
+	int slice_stride = sseu->max_subslices * subslice_stride;
+
+	return slice * slice_stride + subslice * subslice_stride;
+}
+
+u16 intel_sseu_get_eus(const struct sseu_dev_info *sseu, int slice,
+		       int subslice)
+{
+	int i, offset = intel_sseu_eu_idx(sseu, slice, subslice);
+	u16 eu_mask = 0;
+
+	for (i = 0;
+	     i < DIV_ROUND_UP(sseu->max_eus_per_subslice, BITS_PER_BYTE); i++) {
+		eu_mask |= ((u16)sseu->eu_mask[offset + i]) <<
+			(i * BITS_PER_BYTE);
+	}
+
+	return eu_mask;
+}
+
+void intel_sseu_set_eus(struct sseu_dev_info *sseu, int slice, int subslice,
+			u16 eu_mask)
+{
+	int i, offset = intel_sseu_eu_idx(sseu, slice, subslice);
+
+	for (i = 0;
+	     i < DIV_ROUND_UP(sseu->max_eus_per_subslice, BITS_PER_BYTE); i++) {
+		sseu->eu_mask[offset + i] =
+			(eu_mask >> (BITS_PER_BYTE * i)) & 0xff;
+	}
+}
+
 u32 intel_sseu_make_rpcs(struct drm_i915_private *i915,
 			 const struct intel_sseu *req_sseu)
 {
diff --git a/drivers/gpu/drm/i915/gt/intel_sseu.h b/drivers/gpu/drm/i915/gt/intel_sseu.h
index 029e71d8f140..56e3721ae83f 100644
--- a/drivers/gpu/drm/i915/gt/intel_sseu.h
+++ b/drivers/gpu/drm/i915/gt/intel_sseu.h
@@ -63,58 +63,17 @@ intel_sseu_from_device_info(const struct sseu_dev_info *sseu)
 	return value;
 }
 
-static inline unsigned int sseu_subslice_total(const struct sseu_dev_info *sseu)
-{
-	unsigned int i, total = 0;
-
-	for (i = 0; i < ARRAY_SIZE(sseu->subslice_mask); i++)
-		total += hweight8(sseu->subslice_mask[i]);
+unsigned int
+intel_sseu_subslice_total(const struct sseu_dev_info *sseu);
 
-	return total;
-}
+unsigned int
+intel_sseu_subslices_per_slice(const struct sseu_dev_info *sseu, u8 slice);
 
-static inline unsigned int
-sseu_subslices_per_slice(const struct sseu_dev_info *sseu, u8 slice)
-{
-	return hweight8(sseu->subslice_mask[slice]);
-}
-
-static inline int sseu_eu_idx(const struct sseu_dev_info *sseu,
-			      int slice, int subslice)
-{
-	int subslice_stride = DIV_ROUND_UP(sseu->max_eus_per_subslice,
-					   BITS_PER_BYTE);
-	int slice_stride = sseu->max_subslices * subslice_stride;
-
-	return slice * slice_stride + subslice * subslice_stride;
-}
+u16 intel_sseu_get_eus(const struct sseu_dev_info *sseu, int slice,
+		       int subslice);
 
-static inline u16 sseu_get_eus(const struct sseu_dev_info *sseu,
-			       int slice, int subslice)
-{
-	int i, offset = sseu_eu_idx(sseu, slice, subslice);
-	u16 eu_mask = 0;
-
-	for (i = 0;
-	     i < DIV_ROUND_UP(sseu->max_eus_per_subslice, BITS_PER_BYTE); i++) {
-		eu_mask |= ((u16)sseu->eu_mask[offset + i]) <<
-			(i * BITS_PER_BYTE);
-	}
-
-	return eu_mask;
-}
-
-static inline void sseu_set_eus(struct sseu_dev_info *sseu,
-				int slice, int subslice, u16 eu_mask)
-{
-	int i, offset = sseu_eu_idx(sseu, slice, subslice);
-
-	for (i = 0;
-	     i < DIV_ROUND_UP(sseu->max_eus_per_subslice, BITS_PER_BYTE); i++) {
-		sseu->eu_mask[offset + i] =
-			(eu_mask >> (BITS_PER_BYTE * i)) & 0xff;
-	}
-}
+void intel_sseu_set_eus(struct sseu_dev_info *sseu, int slice, int subslice,
+			u16 eu_mask);
 
 u32 intel_sseu_make_rpcs(struct drm_i915_private *i915,
 			 const struct intel_sseu *req_sseu);
diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index fe854c629a32..3f3ee83ac315 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -4158,7 +4158,7 @@ static void broadwell_sseu_device_status(struct drm_i915_private *dev_priv,
 				RUNTIME_INFO(dev_priv)->sseu.subslice_mask[s];
 		}
 		sseu->eu_total = sseu->eu_per_subslice *
-				 sseu_subslice_total(sseu);
+				 intel_sseu_subslice_total(sseu);
 
 		/* subtract fused off EU(s) from enabled slice(s) */
 		for (s = 0; s < fls(sseu->slice_mask); s++) {
@@ -4182,10 +4182,10 @@ static void i915_print_sseu_info(struct seq_file *m, bool is_available_info,
 	seq_printf(m, "  %s Slice Total: %u\n", type,
 		   hweight8(sseu->slice_mask));
 	seq_printf(m, "  %s Subslice Total: %u\n", type,
-		   sseu_subslice_total(sseu));
+		   intel_sseu_subslice_total(sseu));
 	for (s = 0; s < fls(sseu->slice_mask); s++) {
 		seq_printf(m, "  %s Slice%i subslices: %u\n", type,
-			   s, sseu_subslices_per_slice(sseu, s));
+			   s, intel_sseu_subslices_per_slice(sseu, s));
 	}
 	seq_printf(m, "  %s EU Total: %u\n", type,
 		   sseu->eu_total);
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index c376244c19c4..130c5140db0d 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -378,7 +378,7 @@ static int i915_getparam_ioctl(struct drm_device *dev, void *data,
 		value = i915_cmd_parser_get_version(dev_priv);
 		break;
 	case I915_PARAM_SUBSLICE_TOTAL:
-		value = sseu_subslice_total(sseu);
+		value = intel_sseu_subslice_total(sseu);
 		if (!value)
 			return -ENODEV;
 		break;
diff --git a/drivers/gpu/drm/i915/intel_device_info.c b/drivers/gpu/drm/i915/intel_device_info.c
index 559cf0d0628e..e1dbccf04cd9 100644
--- a/drivers/gpu/drm/i915/intel_device_info.c
+++ b/drivers/gpu/drm/i915/intel_device_info.c
@@ -90,10 +90,10 @@ static void sseu_dump(const struct sseu_dev_info *sseu, struct drm_printer *p)
 
 	drm_printf(p, "slice total: %u, mask=%04x\n",
 		   hweight8(sseu->slice_mask), sseu->slice_mask);
-	drm_printf(p, "subslice total: %u\n", sseu_subslice_total(sseu));
+	drm_printf(p, "subslice total: %u\n", intel_sseu_subslice_total(sseu));
 	for (s = 0; s < sseu->max_slices; s++) {
 		drm_printf(p, "slice%d: %u subslices, mask=%04x\n",
-			   s, sseu_subslices_per_slice(sseu, s),
+			   s, intel_sseu_subslices_per_slice(sseu, s),
 			   sseu->subslice_mask[s]);
 	}
 	drm_printf(p, "EU total: %u\n", sseu->eu_total);
@@ -126,11 +126,11 @@ void intel_device_info_dump_topology(const struct sseu_dev_info *sseu,
 
 	for (s = 0; s < sseu->max_slices; s++) {
 		drm_printf(p, "slice%d: %u subslice(s) (0x%hhx):\n",
-			   s, sseu_subslices_per_slice(sseu, s),
+			   s, intel_sseu_subslices_per_slice(sseu, s),
 			   sseu->subslice_mask[s]);
 
 		for (ss = 0; ss < sseu->max_subslices; ss++) {
-			u16 enabled_eus = sseu_get_eus(sseu, s, ss);
+			u16 enabled_eus = intel_sseu_get_eus(sseu, s, ss);
 
 			drm_printf(p, "\tsubslice%d: %u EUs (0x%hx)\n",
 				   ss, hweight16(enabled_eus), enabled_eus);
@@ -180,7 +180,7 @@ static void gen11_sseu_info_init(struct drm_i915_private *dev_priv)
 			sseu->subslice_mask[s] = (ss_en >> ss_idx) & ss_en_mask;
 			for (ss = 0; ss < sseu->max_subslices; ss++) {
 				if (sseu->subslice_mask[s] & BIT(ss))
-					sseu_set_eus(sseu, s, ss, eu_en);
+					intel_sseu_set_eus(sseu, s, ss, eu_en);
 			}
 		}
 	}
@@ -222,32 +222,32 @@ static void gen10_sseu_info_init(struct drm_i915_private *dev_priv)
 	/* Slice0 */
 	eu_en = ~I915_READ(GEN8_EU_DISABLE0);
 	for (ss = 0; ss < sseu->max_subslices; ss++)
-		sseu_set_eus(sseu, 0, ss, (eu_en >> (8 * ss)) & eu_mask);
+		intel_sseu_set_eus(sseu, 0, ss, (eu_en >> (8 * ss)) & eu_mask);
 	/* Slice1 */
-	sseu_set_eus(sseu, 1, 0, (eu_en >> 24) & eu_mask);
+	intel_sseu_set_eus(sseu, 1, 0, (eu_en >> 24) & eu_mask);
 	eu_en = ~I915_READ(GEN8_EU_DISABLE1);
-	sseu_set_eus(sseu, 1, 1, eu_en & eu_mask);
+	intel_sseu_set_eus(sseu, 1, 1, eu_en & eu_mask);
 	/* Slice2 */
-	sseu_set_eus(sseu, 2, 0, (eu_en >> 8) & eu_mask);
-	sseu_set_eus(sseu, 2, 1, (eu_en >> 16) & eu_mask);
+	intel_sseu_set_eus(sseu, 2, 0, (eu_en >> 8) & eu_mask);
+	intel_sseu_set_eus(sseu, 2, 1, (eu_en >> 16) & eu_mask);
 	/* Slice3 */
-	sseu_set_eus(sseu, 3, 0, (eu_en >> 24) & eu_mask);
+	intel_sseu_set_eus(sseu, 3, 0, (eu_en >> 24) & eu_mask);
 	eu_en = ~I915_READ(GEN8_EU_DISABLE2);
-	sseu_set_eus(sseu, 3, 1, eu_en & eu_mask);
+	intel_sseu_set_eus(sseu, 3, 1, eu_en & eu_mask);
 	/* Slice4 */
-	sseu_set_eus(sseu, 4, 0, (eu_en >> 8) & eu_mask);
-	sseu_set_eus(sseu, 4, 1, (eu_en >> 16) & eu_mask);
+	intel_sseu_set_eus(sseu, 4, 0, (eu_en >> 8) & eu_mask);
+	intel_sseu_set_eus(sseu, 4, 1, (eu_en >> 16) & eu_mask);
 	/* Slice5 */
-	sseu_set_eus(sseu, 5, 0, (eu_en >> 24) & eu_mask);
+	intel_sseu_set_eus(sseu, 5, 0, (eu_en >> 24) & eu_mask);
 	eu_en = ~I915_READ(GEN10_EU_DISABLE3);
-	sseu_set_eus(sseu, 5, 1, eu_en & eu_mask);
+	intel_sseu_set_eus(sseu, 5, 1, eu_en & eu_mask);
 
 	/* Do a second pass where we mark the subslices disabled if all their
 	 * eus are off.
 	 */
 	for (s = 0; s < sseu->max_slices; s++) {
 		for (ss = 0; ss < sseu->max_subslices; ss++) {
-			if (sseu_get_eus(sseu, s, ss) == 0)
+			if (intel_sseu_get_eus(sseu, s, ss) == 0)
 				sseu->subslice_mask[s] &= ~BIT(ss);
 		}
 	}
@@ -260,9 +260,10 @@ static void gen10_sseu_info_init(struct drm_i915_private *dev_priv)
 	 * EU in any one subslice may be fused off for die
 	 * recovery.
 	 */
-	sseu->eu_per_subslice = sseu_subslice_total(sseu) ?
+	sseu->eu_per_subslice = intel_sseu_subslice_total(sseu) ?
 				DIV_ROUND_UP(sseu->eu_total,
-					     sseu_subslice_total(sseu)) : 0;
+					     intel_sseu_subslice_total(sseu)) :
+				0;
 
 	/* No restrictions on Power Gating */
 	sseu->has_slice_pg = 1;
@@ -290,7 +291,7 @@ static void cherryview_sseu_info_init(struct drm_i915_private *dev_priv)
 			  CHV_FGT_EU_DIS_SS0_R1_SHIFT) << 4);
 
 		sseu->subslice_mask[0] |= BIT(0);
-		sseu_set_eus(sseu, 0, 0, ~disabled_mask);
+		intel_sseu_set_eus(sseu, 0, 0, ~disabled_mask);
 	}
 
 	if (!(fuse & CHV_FGT_DISABLE_SS1)) {
@@ -301,7 +302,7 @@ static void cherryview_sseu_info_init(struct drm_i915_private *dev_priv)
 			  CHV_FGT_EU_DIS_SS1_R1_SHIFT) << 4);
 
 		sseu->subslice_mask[0] |= BIT(1);
-		sseu_set_eus(sseu, 0, 1, ~disabled_mask);
+		intel_sseu_set_eus(sseu, 0, 1, ~disabled_mask);
 	}
 
 	sseu->eu_total = compute_eu_total(sseu);
@@ -310,8 +311,8 @@ static void cherryview_sseu_info_init(struct drm_i915_private *dev_priv)
 	 * CHV expected to always have a uniform distribution of EU
 	 * across subslices.
 	*/
-	sseu->eu_per_subslice = sseu_subslice_total(sseu) ?
-				sseu->eu_total / sseu_subslice_total(sseu) :
+	sseu->eu_per_subslice = intel_sseu_subslice_total(sseu) ?
+				sseu->eu_total / intel_sseu_subslice_total(sseu) :
 				0;
 	/*
 	 * CHV supports subslice power gating on devices with more than
@@ -319,7 +320,7 @@ static void cherryview_sseu_info_init(struct drm_i915_private *dev_priv)
 	 * more than one EU pair per subslice.
 	*/
 	sseu->has_slice_pg = 0;
-	sseu->has_subslice_pg = sseu_subslice_total(sseu) > 1;
+	sseu->has_subslice_pg = intel_sseu_subslice_total(sseu) > 1;
 	sseu->has_eu_pg = (sseu->eu_per_subslice > 2);
 }
 
@@ -369,7 +370,7 @@ static void gen9_sseu_info_init(struct drm_i915_private *dev_priv)
 
 			eu_disabled_mask = (eu_disable >> (ss * 8)) & eu_mask;
 
-			sseu_set_eus(sseu, s, ss, ~eu_disabled_mask);
+			intel_sseu_set_eus(sseu, s, ss, ~eu_disabled_mask);
 
 			eu_per_ss = sseu->max_eus_per_subslice -
 				hweight8(eu_disabled_mask);
@@ -393,9 +394,10 @@ static void gen9_sseu_info_init(struct drm_i915_private *dev_priv)
 	 * recovery. BXT is expected to be perfectly uniform in EU
 	 * distribution.
 	*/
-	sseu->eu_per_subslice = sseu_subslice_total(sseu) ?
+	sseu->eu_per_subslice = intel_sseu_subslice_total(sseu) ?
 				DIV_ROUND_UP(sseu->eu_total,
-					     sseu_subslice_total(sseu)) : 0;
+					     intel_sseu_subslice_total(sseu)) :
+				0;
 	/*
 	 * SKL+ supports slice power gating on devices with more than
 	 * one slice, and supports EU power gating on devices with
@@ -407,7 +409,7 @@ static void gen9_sseu_info_init(struct drm_i915_private *dev_priv)
 	sseu->has_slice_pg =
 		!IS_GEN9_LP(dev_priv) && hweight8(sseu->slice_mask) > 1;
 	sseu->has_subslice_pg =
-		IS_GEN9_LP(dev_priv) && sseu_subslice_total(sseu) > 1;
+		IS_GEN9_LP(dev_priv) && intel_sseu_subslice_total(sseu) > 1;
 	sseu->has_eu_pg = sseu->eu_per_subslice > 2;
 
 	if (IS_GEN9_LP(dev_priv)) {
@@ -477,7 +479,7 @@ static void broadwell_sseu_info_init(struct drm_i915_private *dev_priv)
 			eu_disabled_mask =
 				eu_disable[s] >> (ss * sseu->max_eus_per_subslice);
 
-			sseu_set_eus(sseu, s, ss, ~eu_disabled_mask);
+			intel_sseu_set_eus(sseu, s, ss, ~eu_disabled_mask);
 
 			n_disabled = hweight8(eu_disabled_mask);
 
@@ -496,9 +498,10 @@ static void broadwell_sseu_info_init(struct drm_i915_private *dev_priv)
 	 * subslices with the exception that any one EU in any one subslice may
 	 * be fused off for die recovery.
 	 */
-	sseu->eu_per_subslice = sseu_subslice_total(sseu) ?
+	sseu->eu_per_subslice = intel_sseu_subslice_total(sseu) ?
 				DIV_ROUND_UP(sseu->eu_total,
-					     sseu_subslice_total(sseu)) : 0;
+					     intel_sseu_subslice_total(sseu)) :
+				0;
 
 	/*
 	 * BDW supports slice power gating on devices with more than
@@ -561,8 +564,8 @@ static void haswell_sseu_info_init(struct drm_i915_private *dev_priv)
 
 	for (s = 0; s < sseu->max_slices; s++) {
 		for (ss = 0; ss < sseu->max_subslices; ss++) {
-			sseu_set_eus(sseu, s, ss,
-				     (1UL << sseu->eu_per_subslice) - 1);
+			intel_sseu_set_eus(sseu, s, ss,
+					   (1UL << sseu->eu_per_subslice) - 1);
 		}
 	}
 
-- 
2.21.0.5.gaeb582a983

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 6/6] drm/i915: Expand subslice mask
  2019-05-01 15:34 [PATCH 0/6] Refactor to expand subslice mask Stuart Summers
                   ` (4 preceding siblings ...)
  2019-05-01 15:34 ` [PATCH 5/6] drm/i915: Remove inline from sseu helper functions Stuart Summers
@ 2019-05-01 15:34 ` Stuart Summers
  2019-05-01 18:22   ` Tvrtko Ursulin
  2019-05-01 22:04   ` Daniele Ceraolo Spurio
  2019-05-01 15:58 ` ✗ Fi.CI.CHECKPATCH: warning for Refactor to expand subslice mask (rev7) Patchwork
                   ` (3 subsequent siblings)
  9 siblings, 2 replies; 35+ messages in thread
From: Stuart Summers @ 2019-05-01 15:34 UTC (permalink / raw)
  To: intel-gfx

Currently, the subslice_mask runtime parameter is stored as an
array of subslices per slice. Expand the subslice mask array to
better match what is presented to userspace through the
I915_QUERY_TOPOLOGY_INFO ioctl. The index into this array is
then calculated:
  slice * subslice stride + subslice index / 8

v2: fix spacing in set_sseu_info args
    use set_sseu_info to initialize sseu data when building
    device status in debugfs
    rename variables in intel_engine_types.h to avoid checkpatch
    warnings
v3: update headers in intel_sseu.h
v4: add const to some sseu_dev_info variables
    use sseu->eu_stride for EU stride calculations

Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Stuart Summers <stuart.summers@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_engine_cs.c    |   6 +-
 drivers/gpu/drm/i915/gt/intel_engine_types.h |  32 +++--
 drivers/gpu/drm/i915/gt/intel_hangcheck.c    |   3 +-
 drivers/gpu/drm/i915/gt/intel_sseu.c         |  49 +++++--
 drivers/gpu/drm/i915/gt/intel_sseu.h         |  16 ++-
 drivers/gpu/drm/i915/gt/intel_workarounds.c  |   2 +-
 drivers/gpu/drm/i915/i915_debugfs.c          |  44 +++---
 drivers/gpu/drm/i915/i915_drv.c              |   6 +-
 drivers/gpu/drm/i915/i915_gpu_error.c        |   5 +-
 drivers/gpu/drm/i915/i915_query.c            |  10 +-
 drivers/gpu/drm/i915/intel_device_info.c     | 142 +++++++++++--------
 11 files changed, 198 insertions(+), 117 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 6e40f8ea9a6a..8f7967cc9a50 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -914,7 +914,7 @@ u32 intel_calculate_mcr_s_ss_select(struct drm_i915_private *dev_priv)
 	const struct sseu_dev_info *sseu = &RUNTIME_INFO(dev_priv)->sseu;
 	u32 mcr_s_ss_select;
 	u32 slice = fls(sseu->slice_mask);
-	u32 subslice = fls(sseu->subslice_mask[slice]);
+	u32 subslice = fls(sseu->subslice_mask[slice * sseu->ss_stride]);
 
 	if (IS_GEN(dev_priv, 10))
 		mcr_s_ss_select = GEN8_MCR_SLICE(slice) |
@@ -990,6 +990,7 @@ void intel_engine_get_instdone(struct intel_engine_cs *engine,
 			       struct intel_instdone *instdone)
 {
 	struct drm_i915_private *dev_priv = engine->i915;
+	const struct sseu_dev_info *sseu = &RUNTIME_INFO(dev_priv)->sseu;
 	struct intel_uncore *uncore = engine->uncore;
 	u32 mmio_base = engine->mmio_base;
 	int slice;
@@ -1007,7 +1008,8 @@ void intel_engine_get_instdone(struct intel_engine_cs *engine,
 
 		instdone->slice_common =
 			intel_uncore_read(uncore, GEN7_SC_INSTDONE);
-		for_each_instdone_slice_subslice(dev_priv, slice, subslice) {
+		for_each_instdone_slice_subslice(dev_priv, sseu, slice,
+						 subslice) {
 			instdone->sampler[slice][subslice] =
 				read_subslice_reg(dev_priv, slice, subslice,
 						  GEN7_SAMPLER_INSTDONE);
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h
index 9d64e33f8427..1710546a2446 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
@@ -534,20 +534,22 @@ intel_engine_needs_breadcrumb_tasklet(const struct intel_engine_cs *engine)
 	return engine->flags & I915_ENGINE_NEEDS_BREADCRUMB_TASKLET;
 }
 
-#define instdone_slice_mask(dev_priv__) \
-	(IS_GEN(dev_priv__, 7) ? \
-	 1 : RUNTIME_INFO(dev_priv__)->sseu.slice_mask)
-
-#define instdone_subslice_mask(dev_priv__) \
-	(IS_GEN(dev_priv__, 7) ? \
-	 1 : RUNTIME_INFO(dev_priv__)->sseu.subslice_mask[0])
-
-#define for_each_instdone_slice_subslice(dev_priv__, slice__, subslice__) \
-	for ((slice__) = 0, (subslice__) = 0; \
-	     (slice__) < I915_MAX_SLICES; \
-	     (subslice__) = ((subslice__) + 1) < I915_MAX_SUBSLICES ? (subslice__) + 1 : 0, \
-	       (slice__) += ((subslice__) == 0)) \
-		for_each_if((BIT(slice__) & instdone_slice_mask(dev_priv__)) && \
-			    (BIT(subslice__) & instdone_subslice_mask(dev_priv__)))
+#define instdone_has_slice(dev_priv___, sseu___, slice___) \
+	((IS_GEN(dev_priv___, 7) ? \
+	  1 : (sseu___)->slice_mask) & \
+	BIT(slice___)) \
+
+#define instdone_has_subslice(dev_priv__, sseu__, slice__, subslice__) \
+	((IS_GEN(dev_priv__, 7) ? \
+	  1 : (sseu__)->subslice_mask[slice__ * (sseu__)->ss_stride + \
+				      subslice__ / BITS_PER_BYTE]) & \
+	 BIT(subslice__ % BITS_PER_BYTE)) \
+
+#define for_each_instdone_slice_subslice(dev_priv_, sseu_, slice_, subslice_) \
+	for ((slice_) = 0, (subslice_) = 0; (slice_) < I915_MAX_SLICES; \
+	     (subslice_) = ((subslice_) + 1) < I915_MAX_SUBSLICES ? (subslice_) + 1 : 0, \
+	       (slice_) += ((subslice_) == 0)) \
+		for_each_if(instdone_has_slice(dev_priv_, sseu_, slice) && \
+			    instdone_has_subslice(dev_priv_, sseu_, slice_, subslice_)) \
 
 #endif /* __INTEL_ENGINE_TYPES_H__ */
diff --git a/drivers/gpu/drm/i915/gt/intel_hangcheck.c b/drivers/gpu/drm/i915/gt/intel_hangcheck.c
index e5eaa06fe74d..53c1c98161e1 100644
--- a/drivers/gpu/drm/i915/gt/intel_hangcheck.c
+++ b/drivers/gpu/drm/i915/gt/intel_hangcheck.c
@@ -50,6 +50,7 @@ static bool instdone_unchanged(u32 current_instdone, u32 *old_instdone)
 static bool subunits_stuck(struct intel_engine_cs *engine)
 {
 	struct drm_i915_private *dev_priv = engine->i915;
+	const struct sseu_dev_info *sseu = &RUNTIME_INFO(dev_priv)->sseu;
 	struct intel_instdone instdone;
 	struct intel_instdone *accu_instdone = &engine->hangcheck.instdone;
 	bool stuck;
@@ -71,7 +72,7 @@ static bool subunits_stuck(struct intel_engine_cs *engine)
 	stuck &= instdone_unchanged(instdone.slice_common,
 				    &accu_instdone->slice_common);
 
-	for_each_instdone_slice_subslice(dev_priv, slice, subslice) {
+	for_each_instdone_slice_subslice(dev_priv, sseu, slice, subslice) {
 		stuck &= instdone_unchanged(instdone.sampler[slice][subslice],
 					    &accu_instdone->sampler[slice][subslice]);
 		stuck &= instdone_unchanged(instdone.row[slice][subslice],
diff --git a/drivers/gpu/drm/i915/gt/intel_sseu.c b/drivers/gpu/drm/i915/gt/intel_sseu.c
index 4a0b82fc108c..49316b7ef074 100644
--- a/drivers/gpu/drm/i915/gt/intel_sseu.c
+++ b/drivers/gpu/drm/i915/gt/intel_sseu.c
@@ -8,6 +8,17 @@
 #include "intel_lrc_reg.h"
 #include "intel_sseu.h"
 
+void intel_sseu_set_info(struct sseu_dev_info *sseu, u8 max_slices,
+			 u8 max_subslices, u8 max_eus_per_subslice)
+{
+	sseu->max_slices = max_slices;
+	sseu->max_subslices = max_subslices;
+	sseu->max_eus_per_subslice = max_eus_per_subslice;
+
+	sseu->ss_stride = GEN_SSEU_STRIDE(sseu->max_subslices);
+	sseu->eu_stride = GEN_SSEU_STRIDE(sseu->max_eus_per_subslice);
+}
+
 unsigned int
 intel_sseu_subslice_total(const struct sseu_dev_info *sseu)
 {
@@ -22,17 +33,39 @@ intel_sseu_subslice_total(const struct sseu_dev_info *sseu)
 unsigned int
 intel_sseu_subslices_per_slice(const struct sseu_dev_info *sseu, u8 slice)
 {
-	return hweight8(sseu->subslice_mask[slice]);
+	unsigned int i, total = 0;
+
+	for (i = 0; i < sseu->ss_stride; i++)
+		total += hweight8(sseu->subslice_mask[slice * sseu->ss_stride +
+						      i]);
+
+	return total;
+}
+
+void intel_sseu_copy_subslices(const struct sseu_dev_info *sseu, int slice,
+			       u8 *to_mask, const u8 *from_mask)
+{
+	int offset = slice * sseu->ss_stride;
+
+	memcpy(&to_mask[offset], &from_mask[offset], sseu->ss_stride);
+}
+
+void intel_sseu_set_subslices(struct sseu_dev_info *sseu, int slice,
+			      u32 ss_mask)
+{
+	int i, offset = slice * sseu->ss_stride;
+
+	for (i = 0; i < sseu->ss_stride; i++)
+		sseu->subslice_mask[offset + i] =
+			(ss_mask >> (BITS_PER_BYTE * i)) & 0xff;
 }
 
 static int intel_sseu_eu_idx(const struct sseu_dev_info *sseu, int slice,
 			     int subslice)
 {
-	int subslice_stride = DIV_ROUND_UP(sseu->max_eus_per_subslice,
-					   BITS_PER_BYTE);
-	int slice_stride = sseu->max_subslices * subslice_stride;
+	int slice_stride = sseu->max_subslices * sseu->eu_stride;
 
-	return slice * slice_stride + subslice * subslice_stride;
+	return slice * slice_stride + subslice * sseu->eu_stride;
 }
 
 u16 intel_sseu_get_eus(const struct sseu_dev_info *sseu, int slice,
@@ -41,8 +74,7 @@ u16 intel_sseu_get_eus(const struct sseu_dev_info *sseu, int slice,
 	int i, offset = intel_sseu_eu_idx(sseu, slice, subslice);
 	u16 eu_mask = 0;
 
-	for (i = 0;
-	     i < DIV_ROUND_UP(sseu->max_eus_per_subslice, BITS_PER_BYTE); i++) {
+	for (i = 0; i < sseu->eu_stride; i++) {
 		eu_mask |= ((u16)sseu->eu_mask[offset + i]) <<
 			(i * BITS_PER_BYTE);
 	}
@@ -55,8 +87,7 @@ void intel_sseu_set_eus(struct sseu_dev_info *sseu, int slice, int subslice,
 {
 	int i, offset = intel_sseu_eu_idx(sseu, slice, subslice);
 
-	for (i = 0;
-	     i < DIV_ROUND_UP(sseu->max_eus_per_subslice, BITS_PER_BYTE); i++) {
+	for (i = 0; i < sseu->eu_stride; i++) {
 		sseu->eu_mask[offset + i] =
 			(eu_mask >> (BITS_PER_BYTE * i)) & 0xff;
 	}
diff --git a/drivers/gpu/drm/i915/gt/intel_sseu.h b/drivers/gpu/drm/i915/gt/intel_sseu.h
index 56e3721ae83f..bf01f338a8cc 100644
--- a/drivers/gpu/drm/i915/gt/intel_sseu.h
+++ b/drivers/gpu/drm/i915/gt/intel_sseu.h
@@ -9,16 +9,18 @@
 
 #include <linux/types.h>
 #include <linux/kernel.h>
+#include <linux/string.h>
 
 struct drm_i915_private;
 
 #define GEN_MAX_SLICES		(6) /* CNL upper bound */
 #define GEN_MAX_SUBSLICES	(8) /* ICL upper bound */
 #define GEN_SSEU_STRIDE(bits) DIV_ROUND_UP(bits, BITS_PER_BYTE)
+#define GEN_MAX_SUBSLICE_STRIDE GEN_SSEU_STRIDE(GEN_MAX_SUBSLICES)
 
 struct sseu_dev_info {
 	u8 slice_mask;
-	u8 subslice_mask[GEN_MAX_SLICES];
+	u8 subslice_mask[GEN_MAX_SLICES * GEN_MAX_SUBSLICE_STRIDE];
 	u16 eu_total;
 	u8 eu_per_subslice;
 	u8 min_eu_in_pool;
@@ -33,6 +35,9 @@ struct sseu_dev_info {
 	u8 max_subslices;
 	u8 max_eus_per_subslice;
 
+	u8 ss_stride;
+	u8 eu_stride;
+
 	/* We don't have more than 8 eus per subslice at the moment and as we
 	 * store eus enabled using bits, no need to multiply by eus per
 	 * subslice.
@@ -63,12 +68,21 @@ intel_sseu_from_device_info(const struct sseu_dev_info *sseu)
 	return value;
 }
 
+void intel_sseu_set_info(struct sseu_dev_info *sseu, u8 max_slices,
+			 u8 max_subslices, u8 max_eus_per_subslice);
+
 unsigned int
 intel_sseu_subslice_total(const struct sseu_dev_info *sseu);
 
 unsigned int
 intel_sseu_subslices_per_slice(const struct sseu_dev_info *sseu, u8 slice);
 
+void intel_sseu_copy_subslices(const struct sseu_dev_info *sseu, int slice,
+			       u8 *to_mask, const u8 *from_mask);
+
+void intel_sseu_set_subslices(struct sseu_dev_info *sseu, int slice,
+			      u32 ss_mask);
+
 u16 intel_sseu_get_eus(const struct sseu_dev_info *sseu, int slice,
 		       int subslice);
 
diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c
index 43e290306551..7c7e9556c1c5 100644
--- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
@@ -767,7 +767,7 @@ wa_init_mcr(struct drm_i915_private *i915, struct i915_wa_list *wal)
 		u32 slice = fls(sseu->slice_mask);
 		u32 fuse3 =
 			intel_uncore_read(&i915->uncore, GEN10_MIRROR_FUSE3);
-		u8 ss_mask = sseu->subslice_mask[slice];
+		u8 ss_mask = sseu->subslice_mask[slice * sseu->ss_stride];
 
 		u8 enabled_mask = (ss_mask | ss_mask >>
 				   GEN10_L3BANK_PAIR_COUNT) & GEN10_L3BANK_MASK;
diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 3f3ee83ac315..08089c24db25 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -1257,6 +1257,7 @@ static void i915_instdone_info(struct drm_i915_private *dev_priv,
 			       struct seq_file *m,
 			       struct intel_instdone *instdone)
 {
+	struct sseu_dev_info *sseu = &RUNTIME_INFO(dev_priv)->sseu;
 	int slice;
 	int subslice;
 
@@ -1272,11 +1273,11 @@ static void i915_instdone_info(struct drm_i915_private *dev_priv,
 	if (INTEL_GEN(dev_priv) <= 6)
 		return;
 
-	for_each_instdone_slice_subslice(dev_priv, slice, subslice)
+	for_each_instdone_slice_subslice(dev_priv, sseu, slice, subslice)
 		seq_printf(m, "\t\tSAMPLER_INSTDONE[%d][%d]: 0x%08x\n",
 			   slice, subslice, instdone->sampler[slice][subslice]);
 
-	for_each_instdone_slice_subslice(dev_priv, slice, subslice)
+	for_each_instdone_slice_subslice(dev_priv, sseu, slice, subslice)
 		seq_printf(m, "\t\tROW_INSTDONE[%d][%d]: 0x%08x\n",
 			   slice, subslice, instdone->row[slice][subslice]);
 }
@@ -4066,7 +4067,9 @@ static void gen10_sseu_device_status(struct drm_i915_private *dev_priv,
 			continue;
 
 		sseu->slice_mask |= BIT(s);
-		sseu->subslice_mask[s] = info->sseu.subslice_mask[s];
+		intel_sseu_copy_subslices(&info->sseu, s,
+					  sseu->subslice_mask,
+					  info->sseu.subslice_mask);
 
 		for (ss = 0; ss < info->sseu.max_subslices; ss++) {
 			unsigned int eu_cnt;
@@ -4117,18 +4120,22 @@ static void gen9_sseu_device_status(struct drm_i915_private *dev_priv,
 		sseu->slice_mask |= BIT(s);
 
 		if (IS_GEN9_BC(dev_priv))
-			sseu->subslice_mask[s] =
-				RUNTIME_INFO(dev_priv)->sseu.subslice_mask[s];
+			intel_sseu_copy_subslices(&info->sseu, s,
+						  sseu->subslice_mask,
+						  info->sseu.subslice_mask);
 
 		for (ss = 0; ss < info->sseu.max_subslices; ss++) {
 			unsigned int eu_cnt;
+			u8 ss_idx = s * info->sseu.ss_stride +
+				    ss / BITS_PER_BYTE;
 
 			if (IS_GEN9_LP(dev_priv)) {
 				if (!(s_reg[s] & (GEN9_PGCTL_SS_ACK(ss))))
 					/* skip disabled subslice */
 					continue;
 
-				sseu->subslice_mask[s] |= BIT(ss);
+				sseu->subslice_mask[ss_idx] |=
+					BIT(ss % BITS_PER_BYTE);
 			}
 
 			eu_cnt = 2 * hweight32(eu_reg[2*s + ss/2] &
@@ -4145,25 +4152,24 @@ static void gen9_sseu_device_status(struct drm_i915_private *dev_priv,
 static void broadwell_sseu_device_status(struct drm_i915_private *dev_priv,
 					 struct sseu_dev_info *sseu)
 {
+	struct intel_runtime_info *info = RUNTIME_INFO(dev_priv);
 	u32 slice_info = I915_READ(GEN8_GT_SLICE_INFO);
 	int s;
 
 	sseu->slice_mask = slice_info & GEN8_LSLICESTAT_MASK;
 
 	if (sseu->slice_mask) {
-		sseu->eu_per_subslice =
-			RUNTIME_INFO(dev_priv)->sseu.eu_per_subslice;
-		for (s = 0; s < fls(sseu->slice_mask); s++) {
-			sseu->subslice_mask[s] =
-				RUNTIME_INFO(dev_priv)->sseu.subslice_mask[s];
-		}
+		sseu->eu_per_subslice = info->sseu.eu_per_subslice;
+		for (s = 0; s < fls(sseu->slice_mask); s++)
+			intel_sseu_copy_subslices(&info->sseu, s,
+						  sseu->subslice_mask,
+						  info->sseu.subslice_mask);
 		sseu->eu_total = sseu->eu_per_subslice *
 				 intel_sseu_subslice_total(sseu);
 
 		/* subtract fused off EU(s) from enabled slice(s) */
 		for (s = 0; s < fls(sseu->slice_mask); s++) {
-			u8 subslice_7eu =
-				RUNTIME_INFO(dev_priv)->sseu.subslice_7eu[s];
+			u8 subslice_7eu = info->sseu.subslice_7eu[s];
 
 			sseu->eu_total -= hweight8(subslice_7eu);
 		}
@@ -4210,6 +4216,7 @@ static void i915_print_sseu_info(struct seq_file *m, bool is_available_info,
 static int i915_sseu_status(struct seq_file *m, void *unused)
 {
 	struct drm_i915_private *dev_priv = node_to_i915(m->private);
+	const struct intel_runtime_info *info = RUNTIME_INFO(dev_priv);
 	struct sseu_dev_info sseu;
 	intel_wakeref_t wakeref;
 
@@ -4217,14 +4224,13 @@ static int i915_sseu_status(struct seq_file *m, void *unused)
 		return -ENODEV;
 
 	seq_puts(m, "SSEU Device Info\n");
-	i915_print_sseu_info(m, true, &RUNTIME_INFO(dev_priv)->sseu);
+	i915_print_sseu_info(m, true, &info->sseu);
 
 	seq_puts(m, "SSEU Device Status\n");
 	memset(&sseu, 0, sizeof(sseu));
-	sseu.max_slices = RUNTIME_INFO(dev_priv)->sseu.max_slices;
-	sseu.max_subslices = RUNTIME_INFO(dev_priv)->sseu.max_subslices;
-	sseu.max_eus_per_subslice =
-		RUNTIME_INFO(dev_priv)->sseu.max_eus_per_subslice;
+	intel_sseu_set_info(&sseu, info->sseu.max_slices,
+			    info->sseu.max_subslices,
+			    info->sseu.max_eus_per_subslice);
 
 	with_intel_runtime_pm(dev_priv, wakeref) {
 		if (IS_CHERRYVIEW(dev_priv))
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 130c5140db0d..6afe4e3afea4 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -326,7 +326,7 @@ static int i915_getparam_ioctl(struct drm_device *dev, void *data,
 	struct pci_dev *pdev = dev_priv->drm.pdev;
 	const struct sseu_dev_info *sseu = &RUNTIME_INFO(dev_priv)->sseu;
 	drm_i915_getparam_t *param = data;
-	int value;
+	int value = 0;
 
 	switch (param->param) {
 	case I915_PARAM_IRQ_ACTIVE:
@@ -455,7 +455,9 @@ static int i915_getparam_ioctl(struct drm_device *dev, void *data,
 			return -ENODEV;
 		break;
 	case I915_PARAM_SUBSLICE_MASK:
-		value = sseu->subslice_mask[0];
+		/* Only copy bits from the first subslice */
+		memcpy(&value, sseu->subslice_mask,
+		       min(sseu->ss_stride, (u8)sizeof(value)));
 		if (!value)
 			return -ENODEV;
 		break;
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index e1b858bd1d32..140918dd9b7d 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -407,6 +407,7 @@ static void print_error_buffers(struct drm_i915_error_state_buf *m,
 static void error_print_instdone(struct drm_i915_error_state_buf *m,
 				 const struct drm_i915_error_engine *ee)
 {
+	struct sseu_dev_info *sseu = &RUNTIME_INFO(m->i915)->sseu;
 	int slice;
 	int subslice;
 
@@ -422,12 +423,12 @@ static void error_print_instdone(struct drm_i915_error_state_buf *m,
 	if (INTEL_GEN(m->i915) <= 6)
 		return;
 
-	for_each_instdone_slice_subslice(m->i915, slice, subslice)
+	for_each_instdone_slice_subslice(m->i915, sseu, slice, subslice)
 		err_printf(m, "  SAMPLER_INSTDONE[%d][%d]: 0x%08x\n",
 			   slice, subslice,
 			   ee->instdone.sampler[slice][subslice]);
 
-	for_each_instdone_slice_subslice(m->i915, slice, subslice)
+	for_each_instdone_slice_subslice(m->i915, sseu, slice, subslice)
 		err_printf(m, "  ROW_INSTDONE[%d][%d]: 0x%08x\n",
 			   slice, subslice,
 			   ee->instdone.row[slice][subslice]);
diff --git a/drivers/gpu/drm/i915/i915_query.c b/drivers/gpu/drm/i915/i915_query.c
index 7c1708c22811..000dcb145ce0 100644
--- a/drivers/gpu/drm/i915/i915_query.c
+++ b/drivers/gpu/drm/i915/i915_query.c
@@ -37,8 +37,6 @@ static int query_topology_info(struct drm_i915_private *dev_priv,
 	const struct sseu_dev_info *sseu = &RUNTIME_INFO(dev_priv)->sseu;
 	struct drm_i915_query_topology_info topo;
 	u32 slice_length, subslice_length, eu_length, total_length;
-	u8 subslice_stride = GEN_SSEU_STRIDE(sseu->max_subslices);
-	u8 eu_stride = GEN_SSEU_STRIDE(sseu->max_eus_per_subslice);
 	int ret;
 
 	if (query_item->flags != 0)
@@ -50,8 +48,8 @@ static int query_topology_info(struct drm_i915_private *dev_priv,
 	BUILD_BUG_ON(sizeof(u8) != sizeof(sseu->slice_mask));
 
 	slice_length = sizeof(sseu->slice_mask);
-	subslice_length = sseu->max_slices * subslice_stride;
-	eu_length = sseu->max_slices * sseu->max_subslices * eu_stride;
+	subslice_length = sseu->max_slices * sseu->ss_stride;
+	eu_length = sseu->max_slices * sseu->max_subslices * sseu->eu_stride;
 	total_length = sizeof(topo) + slice_length + subslice_length +
 		       eu_length;
 
@@ -69,9 +67,9 @@ static int query_topology_info(struct drm_i915_private *dev_priv,
 	topo.max_eus_per_subslice = sseu->max_eus_per_subslice;
 
 	topo.subslice_offset = slice_length;
-	topo.subslice_stride = subslice_stride;
+	topo.subslice_stride = sseu->ss_stride;
 	topo.eu_offset = slice_length + subslice_length;
-	topo.eu_stride = eu_stride;
+	topo.eu_stride = sseu->eu_stride;
 
 	if (__copy_to_user(u64_to_user_ptr(query_item->data_ptr),
 			   &topo, sizeof(topo)))
diff --git a/drivers/gpu/drm/i915/intel_device_info.c b/drivers/gpu/drm/i915/intel_device_info.c
index e1dbccf04cd9..bbbc0a8c2183 100644
--- a/drivers/gpu/drm/i915/intel_device_info.c
+++ b/drivers/gpu/drm/i915/intel_device_info.c
@@ -84,17 +84,42 @@ void intel_device_info_dump_flags(const struct intel_device_info *info,
 #undef PRINT_FLAG
 }
 
+#define SS_STR_MAX_SIZE (GEN_MAX_SUBSLICE_STRIDE * 2)
+
+static u8 *
+subslice_per_slice_str(u8 *buf, const struct sseu_dev_info *sseu, u8 slice)
+{
+	int i;
+	u8 ss_offset = slice * sseu->ss_stride;
+
+	GEM_BUG_ON(slice >= sseu->max_slices);
+
+	memset(buf, 0, SS_STR_MAX_SIZE);
+
+	/*
+	 * Print subslice information in reverse order to match
+	 * userspace expectations.
+	 */
+	for (i = 0; i < sseu->ss_stride; i++)
+		sprintf(&buf[i * 2], "%02x",
+			sseu->subslice_mask[ss_offset + sseu->ss_stride -
+					    (i + 1)]);
+
+	return buf;
+}
+
 static void sseu_dump(const struct sseu_dev_info *sseu, struct drm_printer *p)
 {
 	int s;
+	u8 buf[SS_STR_MAX_SIZE];
 
 	drm_printf(p, "slice total: %u, mask=%04x\n",
 		   hweight8(sseu->slice_mask), sseu->slice_mask);
 	drm_printf(p, "subslice total: %u\n", intel_sseu_subslice_total(sseu));
 	for (s = 0; s < sseu->max_slices; s++) {
-		drm_printf(p, "slice%d: %u subslices, mask=%04x\n",
+		drm_printf(p, "slice%d: %u subslices, mask=%s\n",
 			   s, intel_sseu_subslices_per_slice(sseu, s),
-			   sseu->subslice_mask[s]);
+			   subslice_per_slice_str(buf, sseu, s));
 	}
 	drm_printf(p, "EU total: %u\n", sseu->eu_total);
 	drm_printf(p, "EU per subslice: %u\n", sseu->eu_per_subslice);
@@ -118,6 +143,7 @@ void intel_device_info_dump_topology(const struct sseu_dev_info *sseu,
 				     struct drm_printer *p)
 {
 	int s, ss;
+	u8 buf[SS_STR_MAX_SIZE];
 
 	if (sseu->max_slices == 0) {
 		drm_printf(p, "Unavailable\n");
@@ -125,9 +151,9 @@ void intel_device_info_dump_topology(const struct sseu_dev_info *sseu,
 	}
 
 	for (s = 0; s < sseu->max_slices; s++) {
-		drm_printf(p, "slice%d: %u subslice(s) (0x%hhx):\n",
+		drm_printf(p, "slice%d: %u subslice(s) (0x%s):\n",
 			   s, intel_sseu_subslices_per_slice(sseu, s),
-			   sseu->subslice_mask[s]);
+			   subslice_per_slice_str(buf, sseu, s));
 
 		for (ss = 0; ss < sseu->max_subslices; ss++) {
 			u16 enabled_eus = intel_sseu_get_eus(sseu, s, ss);
@@ -156,15 +182,10 @@ static void gen11_sseu_info_init(struct drm_i915_private *dev_priv)
 	u8 eu_en;
 	int s;
 
-	if (IS_ELKHARTLAKE(dev_priv)) {
-		sseu->max_slices = 1;
-		sseu->max_subslices = 4;
-		sseu->max_eus_per_subslice = 8;
-	} else {
-		sseu->max_slices = 1;
-		sseu->max_subslices = 8;
-		sseu->max_eus_per_subslice = 8;
-	}
+	if (IS_ELKHARTLAKE(dev_priv))
+		intel_sseu_set_info(sseu, 1, 4, 8);
+	else
+		intel_sseu_set_info(sseu, 1, 8, 8);
 
 	s_en = I915_READ(GEN11_GT_SLICE_ENABLE) & GEN11_GT_S_ENA_MASK;
 	ss_en = ~I915_READ(GEN11_GT_SUBSLICE_DISABLE);
@@ -177,9 +198,11 @@ static void gen11_sseu_info_init(struct drm_i915_private *dev_priv)
 			int ss;
 
 			sseu->slice_mask |= BIT(s);
-			sseu->subslice_mask[s] = (ss_en >> ss_idx) & ss_en_mask;
+			sseu->subslice_mask[s * sseu->ss_stride] =
+				(ss_en >> ss_idx) & ss_en_mask;
 			for (ss = 0; ss < sseu->max_subslices; ss++) {
-				if (sseu->subslice_mask[s] & BIT(ss))
+				if (sseu->subslice_mask[s * sseu->ss_stride] &
+				    BIT(ss))
 					intel_sseu_set_eus(sseu, s, ss, eu_en);
 			}
 		}
@@ -201,23 +224,10 @@ static void gen10_sseu_info_init(struct drm_i915_private *dev_priv)
 	const int eu_mask = 0xff;
 	u32 subslice_mask, eu_en;
 
+	intel_sseu_set_info(sseu, 6, 4, 8);
+
 	sseu->slice_mask = (fuse2 & GEN10_F2_S_ENA_MASK) >>
 			    GEN10_F2_S_ENA_SHIFT;
-	sseu->max_slices = 6;
-	sseu->max_subslices = 4;
-	sseu->max_eus_per_subslice = 8;
-
-	subslice_mask = (1 << 4) - 1;
-	subslice_mask &= ~((fuse2 & GEN10_F2_SS_DIS_MASK) >>
-			   GEN10_F2_SS_DIS_SHIFT);
-
-	/*
-	 * Slice0 can have up to 3 subslices, but there are only 2 in
-	 * slice1/2.
-	 */
-	sseu->subslice_mask[0] = subslice_mask;
-	for (s = 1; s < sseu->max_slices; s++)
-		sseu->subslice_mask[s] = subslice_mask & 0x3;
 
 	/* Slice0 */
 	eu_en = ~I915_READ(GEN8_EU_DISABLE0);
@@ -242,14 +252,22 @@ static void gen10_sseu_info_init(struct drm_i915_private *dev_priv)
 	eu_en = ~I915_READ(GEN10_EU_DISABLE3);
 	intel_sseu_set_eus(sseu, 5, 1, eu_en & eu_mask);
 
-	/* Do a second pass where we mark the subslices disabled if all their
-	 * eus are off.
-	 */
+	subslice_mask = (1 << 4) - 1;
+	subslice_mask &= ~((fuse2 & GEN10_F2_SS_DIS_MASK) >>
+			   GEN10_F2_SS_DIS_SHIFT);
+
 	for (s = 0; s < sseu->max_slices; s++) {
 		for (ss = 0; ss < sseu->max_subslices; ss++) {
 			if (intel_sseu_get_eus(sseu, s, ss) == 0)
-				sseu->subslice_mask[s] &= ~BIT(ss);
+				subslice_mask &= ~BIT(ss);
 		}
+
+		/*
+		 * Slice0 can have up to 3 subslices, but there are only 2 in
+		 * slice1/2.
+		 */
+		intel_sseu_set_subslices(sseu, s, s == 0 ? subslice_mask :
+							   subslice_mask & 0x3);
 	}
 
 	sseu->eu_total = compute_eu_total(sseu);
@@ -275,13 +293,12 @@ static void cherryview_sseu_info_init(struct drm_i915_private *dev_priv)
 {
 	struct sseu_dev_info *sseu = &RUNTIME_INFO(dev_priv)->sseu;
 	u32 fuse;
+	u8 subslice_mask;
 
 	fuse = I915_READ(CHV_FUSE_GT);
 
 	sseu->slice_mask = BIT(0);
-	sseu->max_slices = 1;
-	sseu->max_subslices = 2;
-	sseu->max_eus_per_subslice = 8;
+	intel_sseu_set_info(sseu, 1, 2, 8);
 
 	if (!(fuse & CHV_FGT_DISABLE_SS0)) {
 		u8 disabled_mask =
@@ -290,7 +307,7 @@ static void cherryview_sseu_info_init(struct drm_i915_private *dev_priv)
 			(((fuse & CHV_FGT_EU_DIS_SS0_R1_MASK) >>
 			  CHV_FGT_EU_DIS_SS0_R1_SHIFT) << 4);
 
-		sseu->subslice_mask[0] |= BIT(0);
+		subslice_mask |= BIT(0);
 		intel_sseu_set_eus(sseu, 0, 0, ~disabled_mask);
 	}
 
@@ -301,10 +318,12 @@ static void cherryview_sseu_info_init(struct drm_i915_private *dev_priv)
 			(((fuse & CHV_FGT_EU_DIS_SS1_R1_MASK) >>
 			  CHV_FGT_EU_DIS_SS1_R1_SHIFT) << 4);
 
-		sseu->subslice_mask[0] |= BIT(1);
+		subslice_mask |= BIT(1);
 		intel_sseu_set_eus(sseu, 0, 1, ~disabled_mask);
 	}
 
+	intel_sseu_set_subslices(sseu, 0, subslice_mask);
+
 	sseu->eu_total = compute_eu_total(sseu);
 
 	/*
@@ -312,7 +331,8 @@ static void cherryview_sseu_info_init(struct drm_i915_private *dev_priv)
 	 * across subslices.
 	*/
 	sseu->eu_per_subslice = intel_sseu_subslice_total(sseu) ?
-				sseu->eu_total / intel_sseu_subslice_total(sseu) :
+				sseu->eu_total /
+					intel_sseu_subslice_total(sseu) :
 				0;
 	/*
 	 * CHV supports subslice power gating on devices with more than
@@ -336,9 +356,8 @@ static void gen9_sseu_info_init(struct drm_i915_private *dev_priv)
 	sseu->slice_mask = (fuse2 & GEN8_F2_S_ENA_MASK) >> GEN8_F2_S_ENA_SHIFT;
 
 	/* BXT has a single slice and at most 3 subslices. */
-	sseu->max_slices = IS_GEN9_LP(dev_priv) ? 1 : 3;
-	sseu->max_subslices = IS_GEN9_LP(dev_priv) ? 3 : 4;
-	sseu->max_eus_per_subslice = 8;
+	intel_sseu_set_info(sseu, IS_GEN9_LP(dev_priv) ? 1 : 3,
+			    IS_GEN9_LP(dev_priv) ? 3 : 4, 8);
 
 	/*
 	 * The subslice disable field is global, i.e. it applies
@@ -357,14 +376,16 @@ static void gen9_sseu_info_init(struct drm_i915_private *dev_priv)
 			/* skip disabled slice */
 			continue;
 
-		sseu->subslice_mask[s] = subslice_mask;
+		intel_sseu_set_subslices(sseu, s, subslice_mask);
 
 		eu_disable = I915_READ(GEN9_EU_DISABLE(s));
 		for (ss = 0; ss < sseu->max_subslices; ss++) {
 			int eu_per_ss;
 			u8 eu_disabled_mask;
+			u8 ss_idx = s * sseu->ss_stride + ss / BITS_PER_BYTE;
 
-			if (!(sseu->subslice_mask[s] & BIT(ss)))
+			if (!(sseu->subslice_mask[ss_idx] &
+			      BIT(ss % BITS_PER_BYTE)))
 				/* skip disabled subslice */
 				continue;
 
@@ -437,9 +458,7 @@ static void broadwell_sseu_info_init(struct drm_i915_private *dev_priv)
 
 	fuse2 = I915_READ(GEN8_FUSE2);
 	sseu->slice_mask = (fuse2 & GEN8_F2_S_ENA_MASK) >> GEN8_F2_S_ENA_SHIFT;
-	sseu->max_slices = 3;
-	sseu->max_subslices = 3;
-	sseu->max_eus_per_subslice = 8;
+	intel_sseu_set_info(sseu, 3, 3, 8);
 
 	/*
 	 * The subslice disable field is global, i.e. it applies
@@ -466,18 +485,21 @@ static void broadwell_sseu_info_init(struct drm_i915_private *dev_priv)
 			/* skip disabled slice */
 			continue;
 
-		sseu->subslice_mask[s] = subslice_mask;
+		intel_sseu_set_subslices(sseu, s, subslice_mask);
 
 		for (ss = 0; ss < sseu->max_subslices; ss++) {
 			u8 eu_disabled_mask;
+			u8 ss_idx = s * sseu->ss_stride + ss / BITS_PER_BYTE;
 			u32 n_disabled;
 
-			if (!(sseu->subslice_mask[s] & BIT(ss)))
+			if (!(sseu->subslice_mask[ss_idx] &
+			      BIT(ss % BITS_PER_BYTE)))
 				/* skip disabled subslice */
 				continue;
 
 			eu_disabled_mask =
-				eu_disable[s] >> (ss * sseu->max_eus_per_subslice);
+				eu_disable[s] >>
+					(ss * sseu->max_eus_per_subslice);
 
 			intel_sseu_set_eus(sseu, s, ss, ~eu_disabled_mask);
 
@@ -517,6 +539,7 @@ static void haswell_sseu_info_init(struct drm_i915_private *dev_priv)
 	struct sseu_dev_info *sseu = &RUNTIME_INFO(dev_priv)->sseu;
 	u32 fuse1;
 	int s, ss;
+	u32 subslice_mask;
 
 	/*
 	 * There isn't a register to tell us how many slices/subslices. We
@@ -528,22 +551,18 @@ static void haswell_sseu_info_init(struct drm_i915_private *dev_priv)
 		/* fall through */
 	case 1:
 		sseu->slice_mask = BIT(0);
-		sseu->subslice_mask[0] = BIT(0);
+		subslice_mask = BIT(0);
 		break;
 	case 2:
 		sseu->slice_mask = BIT(0);
-		sseu->subslice_mask[0] = BIT(0) | BIT(1);
+		subslice_mask = BIT(0) | BIT(1);
 		break;
 	case 3:
 		sseu->slice_mask = BIT(0) | BIT(1);
-		sseu->subslice_mask[0] = BIT(0) | BIT(1);
-		sseu->subslice_mask[1] = BIT(0) | BIT(1);
+		subslice_mask = BIT(0) | BIT(1);
 		break;
 	}
 
-	sseu->max_slices = hweight8(sseu->slice_mask);
-	sseu->max_subslices = hweight8(sseu->subslice_mask[0]);
-
 	fuse1 = I915_READ(HSW_PAVP_FUSE1);
 	switch ((fuse1 & HSW_F1_EU_DIS_MASK) >> HSW_F1_EU_DIS_SHIFT) {
 	default:
@@ -560,9 +579,14 @@ static void haswell_sseu_info_init(struct drm_i915_private *dev_priv)
 		sseu->eu_per_subslice = 6;
 		break;
 	}
-	sseu->max_eus_per_subslice = sseu->eu_per_subslice;
+
+	intel_sseu_set_info(sseu, hweight8(sseu->slice_mask),
+			    hweight8(subslice_mask),
+			    sseu->eu_per_subslice);
 
 	for (s = 0; s < sseu->max_slices; s++) {
+		intel_sseu_set_subslices(sseu, s, subslice_mask);
+
 		for (ss = 0; ss < sseu->max_subslices; ss++) {
 			intel_sseu_set_eus(sseu, s, ss,
 					   (1UL << sseu->eu_per_subslice) - 1);
-- 
2.21.0.5.gaeb582a983

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* ✗ Fi.CI.CHECKPATCH: warning for Refactor to expand subslice mask (rev7)
  2019-05-01 15:34 [PATCH 0/6] Refactor to expand subslice mask Stuart Summers
                   ` (5 preceding siblings ...)
  2019-05-01 15:34 ` [PATCH 6/6] drm/i915: Expand subslice mask Stuart Summers
@ 2019-05-01 15:58 ` Patchwork
  2019-05-01 16:01 ` ✗ Fi.CI.SPARSE: " Patchwork
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 35+ messages in thread
From: Patchwork @ 2019-05-01 15:58 UTC (permalink / raw)
  To: Stuart Summers; +Cc: intel-gfx

== Series Details ==

Series: Refactor to expand subslice mask (rev7)
URL   : https://patchwork.freedesktop.org/series/59742/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
47812364a20c drm/i915: Use local variable for SSEU info in GETPARAM ioctl
b0f425609a7b drm/i915: Add macro for SSEU stride calculation
be21e44a771d drm/i915: Move calculation of subslices per slice to new function
0dbec2982376 drm/i915: Move sseu helper functions to intel_sseu.h
bf909e717bcf drm/i915: Remove inline from sseu helper functions
3416f0a72f5d drm/i915: Expand subslice mask
-:84: CHECK:MACRO_ARG_REUSE: Macro argument reuse 'sseu__' - possible side-effects?
#84: FILE: drivers/gpu/drm/i915/gt/intel_engine_types.h:542:
+#define instdone_has_subslice(dev_priv__, sseu__, slice__, subslice__) \
+	((IS_GEN(dev_priv__, 7) ? \
+	  1 : (sseu__)->subslice_mask[slice__ * (sseu__)->ss_stride + \
+				      subslice__ / BITS_PER_BYTE]) & \
+	 BIT(subslice__ % BITS_PER_BYTE)) \
+

-:84: CHECK:MACRO_ARG_PRECEDENCE: Macro argument 'slice__' may be better as '(slice__)' to avoid precedence issues
#84: FILE: drivers/gpu/drm/i915/gt/intel_engine_types.h:542:
+#define instdone_has_subslice(dev_priv__, sseu__, slice__, subslice__) \
+	((IS_GEN(dev_priv__, 7) ? \
+	  1 : (sseu__)->subslice_mask[slice__ * (sseu__)->ss_stride + \
+				      subslice__ / BITS_PER_BYTE]) & \
+	 BIT(subslice__ % BITS_PER_BYTE)) \
+

-:84: CHECK:MACRO_ARG_REUSE: Macro argument reuse 'subslice__' - possible side-effects?
#84: FILE: drivers/gpu/drm/i915/gt/intel_engine_types.h:542:
+#define instdone_has_subslice(dev_priv__, sseu__, slice__, subslice__) \
+	((IS_GEN(dev_priv__, 7) ? \
+	  1 : (sseu__)->subslice_mask[slice__ * (sseu__)->ss_stride + \
+				      subslice__ / BITS_PER_BYTE]) & \
+	 BIT(subslice__ % BITS_PER_BYTE)) \
+

-:84: CHECK:MACRO_ARG_PRECEDENCE: Macro argument 'subslice__' may be better as '(subslice__)' to avoid precedence issues
#84: FILE: drivers/gpu/drm/i915/gt/intel_engine_types.h:542:
+#define instdone_has_subslice(dev_priv__, sseu__, slice__, subslice__) \
+	((IS_GEN(dev_priv__, 7) ? \
+	  1 : (sseu__)->subslice_mask[slice__ * (sseu__)->ss_stride + \
+				      subslice__ / BITS_PER_BYTE]) & \
+	 BIT(subslice__ % BITS_PER_BYTE)) \
+

-:90: CHECK:MACRO_ARG_REUSE: Macro argument reuse 'dev_priv_' - possible side-effects?
#90: FILE: drivers/gpu/drm/i915/gt/intel_engine_types.h:548:
+#define for_each_instdone_slice_subslice(dev_priv_, sseu_, slice_, subslice_) \
+	for ((slice_) = 0, (subslice_) = 0; (slice_) < I915_MAX_SLICES; \
+	     (subslice_) = ((subslice_) + 1) < I915_MAX_SUBSLICES ? (subslice_) + 1 : 0, \
+	       (slice_) += ((subslice_) == 0)) \
+		for_each_if(instdone_has_slice(dev_priv_, sseu_, slice) && \
+			    instdone_has_subslice(dev_priv_, sseu_, slice_, subslice_)) \
 

-:90: CHECK:MACRO_ARG_REUSE: Macro argument reuse 'sseu_' - possible side-effects?
#90: FILE: drivers/gpu/drm/i915/gt/intel_engine_types.h:548:
+#define for_each_instdone_slice_subslice(dev_priv_, sseu_, slice_, subslice_) \
+	for ((slice_) = 0, (subslice_) = 0; (slice_) < I915_MAX_SLICES; \
+	     (subslice_) = ((subslice_) + 1) < I915_MAX_SUBSLICES ? (subslice_) + 1 : 0, \
+	       (slice_) += ((subslice_) == 0)) \
+		for_each_if(instdone_has_slice(dev_priv_, sseu_, slice) && \
+			    instdone_has_subslice(dev_priv_, sseu_, slice_, subslice_)) \
 

-:90: CHECK:MACRO_ARG_REUSE: Macro argument reuse 'slice_' - possible side-effects?
#90: FILE: drivers/gpu/drm/i915/gt/intel_engine_types.h:548:
+#define for_each_instdone_slice_subslice(dev_priv_, sseu_, slice_, subslice_) \
+	for ((slice_) = 0, (subslice_) = 0; (slice_) < I915_MAX_SLICES; \
+	     (subslice_) = ((subslice_) + 1) < I915_MAX_SUBSLICES ? (subslice_) + 1 : 0, \
+	       (slice_) += ((subslice_) == 0)) \
+		for_each_if(instdone_has_slice(dev_priv_, sseu_, slice) && \
+			    instdone_has_subslice(dev_priv_, sseu_, slice_, subslice_)) \
 

-:90: CHECK:MACRO_ARG_REUSE: Macro argument reuse 'subslice_' - possible side-effects?
#90: FILE: drivers/gpu/drm/i915/gt/intel_engine_types.h:548:
+#define for_each_instdone_slice_subslice(dev_priv_, sseu_, slice_, subslice_) \
+	for ((slice_) = 0, (subslice_) = 0; (slice_) < I915_MAX_SLICES; \
+	     (subslice_) = ((subslice_) + 1) < I915_MAX_SUBSLICES ? (subslice_) + 1 : 0, \
+	       (slice_) += ((subslice_) == 0)) \
+		for_each_if(instdone_has_slice(dev_priv_, sseu_, slice) && \
+			    instdone_has_subslice(dev_priv_, sseu_, slice_, subslice_)) \
 

total: 0 errors, 0 warnings, 8 checks, 692 lines checked

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 35+ messages in thread

* ✗ Fi.CI.SPARSE: warning for Refactor to expand subslice mask (rev7)
  2019-05-01 15:34 [PATCH 0/6] Refactor to expand subslice mask Stuart Summers
                   ` (6 preceding siblings ...)
  2019-05-01 15:58 ` ✗ Fi.CI.CHECKPATCH: warning for Refactor to expand subslice mask (rev7) Patchwork
@ 2019-05-01 16:01 ` Patchwork
  2019-05-01 16:14 ` ✓ Fi.CI.BAT: success " Patchwork
  2019-05-02  9:14 ` ✓ Fi.CI.IGT: " Patchwork
  9 siblings, 0 replies; 35+ messages in thread
From: Patchwork @ 2019-05-01 16:01 UTC (permalink / raw)
  To: Stuart Summers; +Cc: intel-gfx

== Series Details ==

Series: Refactor to expand subslice mask (rev7)
URL   : https://patchwork.freedesktop.org/series/59742/
State : warning

== Summary ==

$ dim sparse origin/drm-tip
Sparse version: v0.5.2
Commit: drm/i915: Use local variable for SSEU info in GETPARAM ioctl
Okay!

Commit: drm/i915: Add macro for SSEU stride calculation
Okay!

Commit: drm/i915: Move calculation of subslices per slice to new function
Okay!

Commit: drm/i915: Move sseu helper functions to intel_sseu.h
Okay!

Commit: drm/i915: Remove inline from sseu helper functions
Okay!

Commit: drm/i915: Expand subslice mask
+drivers/gpu/drm/i915/i915_drv.c:460:24: warning: expression using sizeof(void)

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 35+ messages in thread

* ✓ Fi.CI.BAT: success for Refactor to expand subslice mask (rev7)
  2019-05-01 15:34 [PATCH 0/6] Refactor to expand subslice mask Stuart Summers
                   ` (7 preceding siblings ...)
  2019-05-01 16:01 ` ✗ Fi.CI.SPARSE: " Patchwork
@ 2019-05-01 16:14 ` Patchwork
  2019-05-02  9:14 ` ✓ Fi.CI.IGT: " Patchwork
  9 siblings, 0 replies; 35+ messages in thread
From: Patchwork @ 2019-05-01 16:14 UTC (permalink / raw)
  To: Stuart Summers; +Cc: intel-gfx

== Series Details ==

Series: Refactor to expand subslice mask (rev7)
URL   : https://patchwork.freedesktop.org/series/59742/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_6021 -> Patchwork_12926
====================================================

Summary
-------

  **SUCCESS**

  No regressions found.

  External URL: https://patchwork.freedesktop.org/api/1.0/series/59742/revisions/7/mbox/

Known issues
------------

  Here are the changes found in Patchwork_12926 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@gem_ctx_create@basic-files:
    - fi-icl-y:           [PASS][1] -> [INCOMPLETE][2] ([fdo#107713] / [fdo#109100])
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6021/fi-icl-y/igt@gem_ctx_create@basic-files.html
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12926/fi-icl-y/igt@gem_ctx_create@basic-files.html

  * igt@i915_selftest@live_contexts:
    - fi-bdw-gvtdvm:      [PASS][3] -> [DMESG-FAIL][4] ([fdo#110235])
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6021/fi-bdw-gvtdvm/igt@i915_selftest@live_contexts.html
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12926/fi-bdw-gvtdvm/igt@i915_selftest@live_contexts.html

  * igt@kms_cursor_legacy@basic-flip-after-cursor-varying-size:
    - fi-glk-dsi:         [PASS][5] -> [INCOMPLETE][6] ([fdo#103359] / [k.org#198133])
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6021/fi-glk-dsi/igt@kms_cursor_legacy@basic-flip-after-cursor-varying-size.html
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12926/fi-glk-dsi/igt@kms_cursor_legacy@basic-flip-after-cursor-varying-size.html

  
#### Possible fixes ####

  * igt@kms_pipe_crc_basic@suspend-read-crc-pipe-a:
    - fi-blb-e6850:       [INCOMPLETE][7] ([fdo#107718]) -> [PASS][8]
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6021/fi-blb-e6850/igt@kms_pipe_crc_basic@suspend-read-crc-pipe-a.html
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12926/fi-blb-e6850/igt@kms_pipe_crc_basic@suspend-read-crc-pipe-a.html

  
#### Warnings ####

  * igt@i915_pm_rpm@basic-pci-d3-state:
    - fi-kbl-guc:         [SKIP][9] ([fdo#109271]) -> [INCOMPLETE][10] ([fdo#107807])
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6021/fi-kbl-guc/igt@i915_pm_rpm@basic-pci-d3-state.html
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12926/fi-kbl-guc/igt@i915_pm_rpm@basic-pci-d3-state.html

  
  [fdo#103359]: https://bugs.freedesktop.org/show_bug.cgi?id=103359
  [fdo#107713]: https://bugs.freedesktop.org/show_bug.cgi?id=107713
  [fdo#107718]: https://bugs.freedesktop.org/show_bug.cgi?id=107718
  [fdo#107807]: https://bugs.freedesktop.org/show_bug.cgi?id=107807
  [fdo#109100]: https://bugs.freedesktop.org/show_bug.cgi?id=109100
  [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
  [fdo#110235]: https://bugs.freedesktop.org/show_bug.cgi?id=110235
  [k.org#198133]: https://bugzilla.kernel.org/show_bug.cgi?id=198133


Participating hosts (52 -> 44)
------------------------------

  Additional (1): fi-pnv-d510 
  Missing    (9): fi-kbl-soraka fi-ilk-m540 fi-hsw-4200u fi-bsw-cyan fi-ctg-p8600 fi-cfl-8109u fi-icl-u3 fi-byt-clapper fi-bdw-samus 


Build changes
-------------

  * Linux: CI_DRM_6021 -> Patchwork_12926

  CI_DRM_6021: 850aa4220e8bf7609b03bf89bce146305704bec6 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_4971: fc5e0467eb6913d21ad932aa8a31c77fdb5a9c77 @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_12926: 3416f0a72f5df37142ab06ae57f55bff003ecf5d @ git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

3416f0a72f5d drm/i915: Expand subslice mask
bf909e717bcf drm/i915: Remove inline from sseu helper functions
0dbec2982376 drm/i915: Move sseu helper functions to intel_sseu.h
be21e44a771d drm/i915: Move calculation of subslices per slice to new function
b0f425609a7b drm/i915: Add macro for SSEU stride calculation
47812364a20c drm/i915: Use local variable for SSEU info in GETPARAM ioctl

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12926/
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 1/6] drm/i915: Use local variable for SSEU info in GETPARAM ioctl
  2019-05-01 15:34 ` [PATCH 1/6] drm/i915: Use local variable for SSEU info in GETPARAM ioctl Stuart Summers
@ 2019-05-01 17:54   ` Daniele Ceraolo Spurio
  2019-05-01 19:38     ` Summers, Stuart
  0 siblings, 1 reply; 35+ messages in thread
From: Daniele Ceraolo Spurio @ 2019-05-01 17:54 UTC (permalink / raw)
  To: Stuart Summers, intel-gfx



On 5/1/19 8:34 AM, Stuart Summers wrote:
> In the GETPARAM ioctl handler, use a local variable to consolidate
> usage of SSEU runtime info.
> 
> v2: add const to sseu_dev_info variable
> 
> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> Signed-off-by: Stuart Summers <stuart.summers@intel.com>

Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>

> ---
>   drivers/gpu/drm/i915/i915_drv.c | 11 ++++++-----
>   1 file changed, 6 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> index 21dac5a09fbe..c376244c19c4 100644
> --- a/drivers/gpu/drm/i915/i915_drv.c
> +++ b/drivers/gpu/drm/i915/i915_drv.c
> @@ -324,6 +324,7 @@ static int i915_getparam_ioctl(struct drm_device *dev, void *data,
>   {
>   	struct drm_i915_private *dev_priv = to_i915(dev);
>   	struct pci_dev *pdev = dev_priv->drm.pdev;
> +	const struct sseu_dev_info *sseu = &RUNTIME_INFO(dev_priv)->sseu;
>   	drm_i915_getparam_t *param = data;
>   	int value;
>   
> @@ -377,12 +378,12 @@ static int i915_getparam_ioctl(struct drm_device *dev, void *data,
>   		value = i915_cmd_parser_get_version(dev_priv);
>   		break;
>   	case I915_PARAM_SUBSLICE_TOTAL:
> -		value = sseu_subslice_total(&RUNTIME_INFO(dev_priv)->sseu);
> +		value = sseu_subslice_total(sseu);
>   		if (!value)
>   			return -ENODEV;
>   		break;
>   	case I915_PARAM_EU_TOTAL:
> -		value = RUNTIME_INFO(dev_priv)->sseu.eu_total;
> +		value = sseu->eu_total;
>   		if (!value)
>   			return -ENODEV;
>   		break;
> @@ -399,7 +400,7 @@ static int i915_getparam_ioctl(struct drm_device *dev, void *data,
>   		value = HAS_POOLED_EU(dev_priv);
>   		break;
>   	case I915_PARAM_MIN_EU_IN_POOL:
> -		value = RUNTIME_INFO(dev_priv)->sseu.min_eu_in_pool;
> +		value = sseu->min_eu_in_pool;
>   		break;
>   	case I915_PARAM_HUC_STATUS:
>   		value = intel_huc_check_status(&dev_priv->huc);
> @@ -449,12 +450,12 @@ static int i915_getparam_ioctl(struct drm_device *dev, void *data,
>   		value = intel_engines_has_context_isolation(dev_priv);
>   		break;
>   	case I915_PARAM_SLICE_MASK:
> -		value = RUNTIME_INFO(dev_priv)->sseu.slice_mask;
> +		value = sseu->slice_mask;
>   		if (!value)
>   			return -ENODEV;
>   		break;
>   	case I915_PARAM_SUBSLICE_MASK:
> -		value = RUNTIME_INFO(dev_priv)->sseu.subslice_mask[0];
> +		value = sseu->subslice_mask[0];
>   		if (!value)
>   			return -ENODEV;
>   		break;
> 
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 2/6] drm/i915: Add macro for SSEU stride calculation
  2019-05-01 15:34 ` [PATCH 2/6] drm/i915: Add macro for SSEU stride calculation Stuart Summers
@ 2019-05-01 18:11   ` Daniele Ceraolo Spurio
  2019-05-01 19:37     ` Summers, Stuart
  0 siblings, 1 reply; 35+ messages in thread
From: Daniele Ceraolo Spurio @ 2019-05-01 18:11 UTC (permalink / raw)
  To: Stuart Summers, intel-gfx



On 5/1/19 8:34 AM, Stuart Summers wrote:
> Subslice stride and EU stride are calculated multiple times in
> i915_query. Move this calculation to a macro to reduce code duplication.
> 
> v2: update headers in intel_sseu.h
> 
> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> Signed-off-by: Stuart Summers <stuart.summers@intel.com>
> ---
>   drivers/gpu/drm/i915/gt/intel_sseu.h |  2 ++
>   drivers/gpu/drm/i915/i915_query.c    | 17 ++++++++---------
>   2 files changed, 10 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_sseu.h b/drivers/gpu/drm/i915/gt/intel_sseu.h
> index 73bc824094e8..c0b16b248d4c 100644
> --- a/drivers/gpu/drm/i915/gt/intel_sseu.h
> +++ b/drivers/gpu/drm/i915/gt/intel_sseu.h
> @@ -8,11 +8,13 @@
>   #define __INTEL_SSEU_H__
>   
>   #include <linux/types.h>
> +#include <linux/kernel.h>
>   
>   struct drm_i915_private;
>   
>   #define GEN_MAX_SLICES		(6) /* CNL upper bound */
>   #define GEN_MAX_SUBSLICES	(8) /* ICL upper bound */
> +#define GEN_SSEU_STRIDE(bits) DIV_ROUND_UP(bits, BITS_PER_BYTE)

What we pass to this macro isn't really a bits count but the maximum 
amount of s/ss/eus. s/bits/max_entry/, or something like that? with that:

Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>

Daniele

>   
>   struct sseu_dev_info {
>   	u8 slice_mask;
> diff --git a/drivers/gpu/drm/i915/i915_query.c b/drivers/gpu/drm/i915/i915_query.c
> index 782183b78f49..7c1708c22811 100644
> --- a/drivers/gpu/drm/i915/i915_query.c
> +++ b/drivers/gpu/drm/i915/i915_query.c
> @@ -37,6 +37,8 @@ static int query_topology_info(struct drm_i915_private *dev_priv,
>   	const struct sseu_dev_info *sseu = &RUNTIME_INFO(dev_priv)->sseu;
>   	struct drm_i915_query_topology_info topo;
>   	u32 slice_length, subslice_length, eu_length, total_length;
> +	u8 subslice_stride = GEN_SSEU_STRIDE(sseu->max_subslices);
> +	u8 eu_stride = GEN_SSEU_STRIDE(sseu->max_eus_per_subslice);
>   	int ret;
>   
>   	if (query_item->flags != 0)
> @@ -48,12 +50,10 @@ static int query_topology_info(struct drm_i915_private *dev_priv,
>   	BUILD_BUG_ON(sizeof(u8) != sizeof(sseu->slice_mask));
>   
>   	slice_length = sizeof(sseu->slice_mask);
> -	subslice_length = sseu->max_slices *
> -		DIV_ROUND_UP(sseu->max_subslices, BITS_PER_BYTE);
> -	eu_length = sseu->max_slices * sseu->max_subslices *
> -		DIV_ROUND_UP(sseu->max_eus_per_subslice, BITS_PER_BYTE);
> -
> -	total_length = sizeof(topo) + slice_length + subslice_length + eu_length;
> +	subslice_length = sseu->max_slices * subslice_stride;
> +	eu_length = sseu->max_slices * sseu->max_subslices * eu_stride;
> +	total_length = sizeof(topo) + slice_length + subslice_length +
> +		       eu_length;
>   
>   	ret = copy_query_item(&topo, sizeof(topo), total_length,
>   			      query_item);
> @@ -69,10 +69,9 @@ static int query_topology_info(struct drm_i915_private *dev_priv,
>   	topo.max_eus_per_subslice = sseu->max_eus_per_subslice;
>   
>   	topo.subslice_offset = slice_length;
> -	topo.subslice_stride = DIV_ROUND_UP(sseu->max_subslices, BITS_PER_BYTE);
> +	topo.subslice_stride = subslice_stride;
>   	topo.eu_offset = slice_length + subslice_length;
> -	topo.eu_stride =
> -		DIV_ROUND_UP(sseu->max_eus_per_subslice, BITS_PER_BYTE);
> +	topo.eu_stride = eu_stride;
>   
>   	if (__copy_to_user(u64_to_user_ptr(query_item->data_ptr),
>   			   &topo, sizeof(topo)))
> 
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 3/6] drm/i915: Move calculation of subslices per slice to new function
  2019-05-01 15:34 ` [PATCH 3/6] drm/i915: Move calculation of subslices per slice to new function Stuart Summers
@ 2019-05-01 18:14   ` Daniele Ceraolo Spurio
  2019-05-01 19:37     ` Summers, Stuart
  0 siblings, 1 reply; 35+ messages in thread
From: Daniele Ceraolo Spurio @ 2019-05-01 18:14 UTC (permalink / raw)
  To: Stuart Summers, intel-gfx



On 5/1/19 8:34 AM, Stuart Summers wrote:
> Add a new function to return the number of subslices per slice to
> consolidate code usage.
> 
> v2: rebase on changes to move sseu struct to intel_sseu.h
> 
> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> Signed-off-by: Stuart Summers <stuart.summers@intel.com>
> ---
>   drivers/gpu/drm/i915/gt/intel_sseu.h     | 6 ++++++
>   drivers/gpu/drm/i915/i915_debugfs.c      | 2 +-
>   drivers/gpu/drm/i915/intel_device_info.c | 4 ++--
>   3 files changed, 9 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_sseu.h b/drivers/gpu/drm/i915/gt/intel_sseu.h
> index c0b16b248d4c..f5ff6b7a756a 100644
> --- a/drivers/gpu/drm/i915/gt/intel_sseu.h
> +++ b/drivers/gpu/drm/i915/gt/intel_sseu.h
> @@ -63,6 +63,12 @@ intel_sseu_from_device_info(const struct sseu_dev_info *sseu)
>   	return value;
>   }
>   
> +static inline unsigned int
> +sseu_subslices_per_slice(const struct sseu_dev_info *sseu, u8 slice)

This is exposed, so needs an intel_* prefix. with that:

Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>

Daniele

> +{
> +	return hweight8(sseu->subslice_mask[slice]);
> +}
> +
>   u32 intel_sseu_make_rpcs(struct drm_i915_private *i915,
>   			 const struct intel_sseu *req_sseu);
>   
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> index 0e4dffcd4da4..fe854c629a32 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -4185,7 +4185,7 @@ static void i915_print_sseu_info(struct seq_file *m, bool is_available_info,
>   		   sseu_subslice_total(sseu));
>   	for (s = 0; s < fls(sseu->slice_mask); s++) {
>   		seq_printf(m, "  %s Slice%i subslices: %u\n", type,
> -			   s, hweight8(sseu->subslice_mask[s]));
> +			   s, sseu_subslices_per_slice(sseu, s));
>   	}
>   	seq_printf(m, "  %s EU Total: %u\n", type,
>   		   sseu->eu_total);
> diff --git a/drivers/gpu/drm/i915/intel_device_info.c b/drivers/gpu/drm/i915/intel_device_info.c
> index 6af480b95bc6..559cf0d0628e 100644
> --- a/drivers/gpu/drm/i915/intel_device_info.c
> +++ b/drivers/gpu/drm/i915/intel_device_info.c
> @@ -93,7 +93,7 @@ static void sseu_dump(const struct sseu_dev_info *sseu, struct drm_printer *p)
>   	drm_printf(p, "subslice total: %u\n", sseu_subslice_total(sseu));
>   	for (s = 0; s < sseu->max_slices; s++) {
>   		drm_printf(p, "slice%d: %u subslices, mask=%04x\n",
> -			   s, hweight8(sseu->subslice_mask[s]),
> +			   s, sseu_subslices_per_slice(sseu, s),
>   			   sseu->subslice_mask[s]);
>   	}
>   	drm_printf(p, "EU total: %u\n", sseu->eu_total);
> @@ -126,7 +126,7 @@ void intel_device_info_dump_topology(const struct sseu_dev_info *sseu,
>   
>   	for (s = 0; s < sseu->max_slices; s++) {
>   		drm_printf(p, "slice%d: %u subslice(s) (0x%hhx):\n",
> -			   s, hweight8(sseu->subslice_mask[s]),
> +			   s, sseu_subslices_per_slice(sseu, s),
>   			   sseu->subslice_mask[s]);
>   
>   		for (ss = 0; ss < sseu->max_subslices; ss++) {
> 
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 6/6] drm/i915: Expand subslice mask
  2019-05-01 15:34 ` [PATCH 6/6] drm/i915: Expand subslice mask Stuart Summers
@ 2019-05-01 18:22   ` Tvrtko Ursulin
  2019-05-01 18:29     ` Tvrtko Ursulin
  2019-05-01 22:04   ` Daniele Ceraolo Spurio
  1 sibling, 1 reply; 35+ messages in thread
From: Tvrtko Ursulin @ 2019-05-01 18:22 UTC (permalink / raw)
  To: Stuart Summers, intel-gfx


Just one drive by below...

On 01/05/2019 16:34, Stuart Summers wrote:
> Currently, the subslice_mask runtime parameter is stored as an
> array of subslices per slice. Expand the subslice mask array to
> better match what is presented to userspace through the
> I915_QUERY_TOPOLOGY_INFO ioctl. The index into this array is
> then calculated:
>    slice * subslice stride + subslice index / 8
> 
> v2: fix spacing in set_sseu_info args
>      use set_sseu_info to initialize sseu data when building
>      device status in debugfs
>      rename variables in intel_engine_types.h to avoid checkpatch
>      warnings
> v3: update headers in intel_sseu.h
> v4: add const to some sseu_dev_info variables
>      use sseu->eu_stride for EU stride calculations
> 
> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> Signed-off-by: Stuart Summers <stuart.summers@intel.com>
> ---
>   drivers/gpu/drm/i915/gt/intel_engine_cs.c    |   6 +-
>   drivers/gpu/drm/i915/gt/intel_engine_types.h |  32 +++--
>   drivers/gpu/drm/i915/gt/intel_hangcheck.c    |   3 +-
>   drivers/gpu/drm/i915/gt/intel_sseu.c         |  49 +++++--
>   drivers/gpu/drm/i915/gt/intel_sseu.h         |  16 ++-
>   drivers/gpu/drm/i915/gt/intel_workarounds.c  |   2 +-
>   drivers/gpu/drm/i915/i915_debugfs.c          |  44 +++---
>   drivers/gpu/drm/i915/i915_drv.c              |   6 +-
>   drivers/gpu/drm/i915/i915_gpu_error.c        |   5 +-
>   drivers/gpu/drm/i915/i915_query.c            |  10 +-
>   drivers/gpu/drm/i915/intel_device_info.c     | 142 +++++++++++--------
>   11 files changed, 198 insertions(+), 117 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> index 6e40f8ea9a6a..8f7967cc9a50 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> @@ -914,7 +914,7 @@ u32 intel_calculate_mcr_s_ss_select(struct drm_i915_private *dev_priv)
>   	const struct sseu_dev_info *sseu = &RUNTIME_INFO(dev_priv)->sseu;
>   	u32 mcr_s_ss_select;
>   	u32 slice = fls(sseu->slice_mask);
> -	u32 subslice = fls(sseu->subslice_mask[slice]);
> +	u32 subslice = fls(sseu->subslice_mask[slice * sseu->ss_stride]);
>   
>   	if (IS_GEN(dev_priv, 10))
>   		mcr_s_ss_select = GEN8_MCR_SLICE(slice) |
> @@ -990,6 +990,7 @@ void intel_engine_get_instdone(struct intel_engine_cs *engine,
>   			       struct intel_instdone *instdone)
>   {
>   	struct drm_i915_private *dev_priv = engine->i915;
> +	const struct sseu_dev_info *sseu = &RUNTIME_INFO(dev_priv)->sseu;
>   	struct intel_uncore *uncore = engine->uncore;
>   	u32 mmio_base = engine->mmio_base;
>   	int slice;
> @@ -1007,7 +1008,8 @@ void intel_engine_get_instdone(struct intel_engine_cs *engine,
>   
>   		instdone->slice_common =
>   			intel_uncore_read(uncore, GEN7_SC_INSTDONE);
> -		for_each_instdone_slice_subslice(dev_priv, slice, subslice) {
> +		for_each_instdone_slice_subslice(dev_priv, sseu, slice,
> +						 subslice) {
>   			instdone->sampler[slice][subslice] =
>   				read_subslice_reg(dev_priv, slice, subslice,
>   						  GEN7_SAMPLER_INSTDONE);
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h
> index 9d64e33f8427..1710546a2446 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
> +++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
> @@ -534,20 +534,22 @@ intel_engine_needs_breadcrumb_tasklet(const struct intel_engine_cs *engine)
>   	return engine->flags & I915_ENGINE_NEEDS_BREADCRUMB_TASKLET;
>   }
>   
> -#define instdone_slice_mask(dev_priv__) \
> -	(IS_GEN(dev_priv__, 7) ? \
> -	 1 : RUNTIME_INFO(dev_priv__)->sseu.slice_mask)
> -
> -#define instdone_subslice_mask(dev_priv__) \
> -	(IS_GEN(dev_priv__, 7) ? \
> -	 1 : RUNTIME_INFO(dev_priv__)->sseu.subslice_mask[0])
> -
> -#define for_each_instdone_slice_subslice(dev_priv__, slice__, subslice__) \
> -	for ((slice__) = 0, (subslice__) = 0; \
> -	     (slice__) < I915_MAX_SLICES; \
> -	     (subslice__) = ((subslice__) + 1) < I915_MAX_SUBSLICES ? (subslice__) + 1 : 0, \
> -	       (slice__) += ((subslice__) == 0)) \
> -		for_each_if((BIT(slice__) & instdone_slice_mask(dev_priv__)) && \
> -			    (BIT(subslice__) & instdone_subslice_mask(dev_priv__)))
> +#define instdone_has_slice(dev_priv___, sseu___, slice___) \
> +	((IS_GEN(dev_priv___, 7) ? \
> +	  1 : (sseu___)->slice_mask) & \
> +	BIT(slice___)) \
> +
> +#define instdone_has_subslice(dev_priv__, sseu__, slice__, subslice__) \
> +	((IS_GEN(dev_priv__, 7) ? \
> +	  1 : (sseu__)->subslice_mask[slice__ * (sseu__)->ss_stride + \
> +				      subslice__ / BITS_PER_BYTE]) & \
> +	 BIT(subslice__ % BITS_PER_BYTE)) \
> +
> +#define for_each_instdone_slice_subslice(dev_priv_, sseu_, slice_, subslice_) \
> +	for ((slice_) = 0, (subslice_) = 0; (slice_) < I915_MAX_SLICES; \
> +	     (subslice_) = ((subslice_) + 1) < I915_MAX_SUBSLICES ? (subslice_) + 1 : 0, \
> +	       (slice_) += ((subslice_) == 0)) \
> +		for_each_if(instdone_has_slice(dev_priv_, sseu_, slice) && \
> +			    instdone_has_subslice(dev_priv_, sseu_, slice_, subslice_)) \
>   
>   #endif /* __INTEL_ENGINE_TYPES_H__ */
> diff --git a/drivers/gpu/drm/i915/gt/intel_hangcheck.c b/drivers/gpu/drm/i915/gt/intel_hangcheck.c
> index e5eaa06fe74d..53c1c98161e1 100644
> --- a/drivers/gpu/drm/i915/gt/intel_hangcheck.c
> +++ b/drivers/gpu/drm/i915/gt/intel_hangcheck.c
> @@ -50,6 +50,7 @@ static bool instdone_unchanged(u32 current_instdone, u32 *old_instdone)
>   static bool subunits_stuck(struct intel_engine_cs *engine)
>   {
>   	struct drm_i915_private *dev_priv = engine->i915;
> +	const struct sseu_dev_info *sseu = &RUNTIME_INFO(dev_priv)->sseu;
>   	struct intel_instdone instdone;
>   	struct intel_instdone *accu_instdone = &engine->hangcheck.instdone;
>   	bool stuck;
> @@ -71,7 +72,7 @@ static bool subunits_stuck(struct intel_engine_cs *engine)
>   	stuck &= instdone_unchanged(instdone.slice_common,
>   				    &accu_instdone->slice_common);
>   
> -	for_each_instdone_slice_subslice(dev_priv, slice, subslice) {
> +	for_each_instdone_slice_subslice(dev_priv, sseu, slice, subslice) {
>   		stuck &= instdone_unchanged(instdone.sampler[slice][subslice],
>   					    &accu_instdone->sampler[slice][subslice]);
>   		stuck &= instdone_unchanged(instdone.row[slice][subslice],
> diff --git a/drivers/gpu/drm/i915/gt/intel_sseu.c b/drivers/gpu/drm/i915/gt/intel_sseu.c
> index 4a0b82fc108c..49316b7ef074 100644
> --- a/drivers/gpu/drm/i915/gt/intel_sseu.c
> +++ b/drivers/gpu/drm/i915/gt/intel_sseu.c
> @@ -8,6 +8,17 @@
>   #include "intel_lrc_reg.h"
>   #include "intel_sseu.h"
>   
> +void intel_sseu_set_info(struct sseu_dev_info *sseu, u8 max_slices,
> +			 u8 max_subslices, u8 max_eus_per_subslice)
> +{
> +	sseu->max_slices = max_slices;
> +	sseu->max_subslices = max_subslices;
> +	sseu->max_eus_per_subslice = max_eus_per_subslice;
> +
> +	sseu->ss_stride = GEN_SSEU_STRIDE(sseu->max_subslices);
> +	sseu->eu_stride = GEN_SSEU_STRIDE(sseu->max_eus_per_subslice);
> +}
> +
>   unsigned int
>   intel_sseu_subslice_total(const struct sseu_dev_info *sseu)
>   {
> @@ -22,17 +33,39 @@ intel_sseu_subslice_total(const struct sseu_dev_info *sseu)
>   unsigned int
>   intel_sseu_subslices_per_slice(const struct sseu_dev_info *sseu, u8 slice)
>   {
> -	return hweight8(sseu->subslice_mask[slice]);
> +	unsigned int i, total = 0;
> +
> +	for (i = 0; i < sseu->ss_stride; i++)
> +		total += hweight8(sseu->subslice_mask[slice * sseu->ss_stride +
> +						      i]);
> +
> +	return total;
> +}
> +
> +void intel_sseu_copy_subslices(const struct sseu_dev_info *sseu, int slice,
> +			       u8 *to_mask, const u8 *from_mask)
> +{
> +	int offset = slice * sseu->ss_stride;
> +
> +	memcpy(&to_mask[offset], &from_mask[offset], sseu->ss_stride);
> +}
> +
> +void intel_sseu_set_subslices(struct sseu_dev_info *sseu, int slice,
> +			      u32 ss_mask)
> +{
> +	int i, offset = slice * sseu->ss_stride;
> +
> +	for (i = 0; i < sseu->ss_stride; i++)
> +		sseu->subslice_mask[offset + i] =
> +			(ss_mask >> (BITS_PER_BYTE * i)) & 0xff;
>   }
>   
>   static int intel_sseu_eu_idx(const struct sseu_dev_info *sseu, int slice,
>   			     int subslice)
>   {
> -	int subslice_stride = DIV_ROUND_UP(sseu->max_eus_per_subslice,
> -					   BITS_PER_BYTE);
> -	int slice_stride = sseu->max_subslices * subslice_stride;
> +	int slice_stride = sseu->max_subslices * sseu->eu_stride;
>   
> -	return slice * slice_stride + subslice * subslice_stride;
> +	return slice * slice_stride + subslice * sseu->eu_stride;
>   }
>   
>   u16 intel_sseu_get_eus(const struct sseu_dev_info *sseu, int slice,
> @@ -41,8 +74,7 @@ u16 intel_sseu_get_eus(const struct sseu_dev_info *sseu, int slice,
>   	int i, offset = intel_sseu_eu_idx(sseu, slice, subslice);
>   	u16 eu_mask = 0;
>   
> -	for (i = 0;
> -	     i < DIV_ROUND_UP(sseu->max_eus_per_subslice, BITS_PER_BYTE); i++) {
> +	for (i = 0; i < sseu->eu_stride; i++) {
>   		eu_mask |= ((u16)sseu->eu_mask[offset + i]) <<
>   			(i * BITS_PER_BYTE);
>   	}
> @@ -55,8 +87,7 @@ void intel_sseu_set_eus(struct sseu_dev_info *sseu, int slice, int subslice,
>   {
>   	int i, offset = intel_sseu_eu_idx(sseu, slice, subslice);
>   
> -	for (i = 0;
> -	     i < DIV_ROUND_UP(sseu->max_eus_per_subslice, BITS_PER_BYTE); i++) {
> +	for (i = 0; i < sseu->eu_stride; i++) {
>   		sseu->eu_mask[offset + i] =
>   			(eu_mask >> (BITS_PER_BYTE * i)) & 0xff;
>   	}
> diff --git a/drivers/gpu/drm/i915/gt/intel_sseu.h b/drivers/gpu/drm/i915/gt/intel_sseu.h
> index 56e3721ae83f..bf01f338a8cc 100644
> --- a/drivers/gpu/drm/i915/gt/intel_sseu.h
> +++ b/drivers/gpu/drm/i915/gt/intel_sseu.h
> @@ -9,16 +9,18 @@
>   
>   #include <linux/types.h>
>   #include <linux/kernel.h>
> +#include <linux/string.h>
>   
>   struct drm_i915_private;
>   
>   #define GEN_MAX_SLICES		(6) /* CNL upper bound */
>   #define GEN_MAX_SUBSLICES	(8) /* ICL upper bound */
>   #define GEN_SSEU_STRIDE(bits) DIV_ROUND_UP(bits, BITS_PER_BYTE)
> +#define GEN_MAX_SUBSLICE_STRIDE GEN_SSEU_STRIDE(GEN_MAX_SUBSLICES)
>   
>   struct sseu_dev_info {
>   	u8 slice_mask;
> -	u8 subslice_mask[GEN_MAX_SLICES];
> +	u8 subslice_mask[GEN_MAX_SLICES * GEN_MAX_SUBSLICE_STRIDE];
>   	u16 eu_total;
>   	u8 eu_per_subslice;
>   	u8 min_eu_in_pool;
> @@ -33,6 +35,9 @@ struct sseu_dev_info {
>   	u8 max_subslices;
>   	u8 max_eus_per_subslice;
>   
> +	u8 ss_stride;
> +	u8 eu_stride;
> +
>   	/* We don't have more than 8 eus per subslice at the moment and as we
>   	 * store eus enabled using bits, no need to multiply by eus per
>   	 * subslice.
> @@ -63,12 +68,21 @@ intel_sseu_from_device_info(const struct sseu_dev_info *sseu)
>   	return value;
>   }
>   
> +void intel_sseu_set_info(struct sseu_dev_info *sseu, u8 max_slices,
> +			 u8 max_subslices, u8 max_eus_per_subslice);
> +
>   unsigned int
>   intel_sseu_subslice_total(const struct sseu_dev_info *sseu);
>   
>   unsigned int
>   intel_sseu_subslices_per_slice(const struct sseu_dev_info *sseu, u8 slice);
>   
> +void intel_sseu_copy_subslices(const struct sseu_dev_info *sseu, int slice,
> +			       u8 *to_mask, const u8 *from_mask);
> +
> +void intel_sseu_set_subslices(struct sseu_dev_info *sseu, int slice,
> +			      u32 ss_mask);
> +
>   u16 intel_sseu_get_eus(const struct sseu_dev_info *sseu, int slice,
>   		       int subslice);
>   
> diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c
> index 43e290306551..7c7e9556c1c5 100644
> --- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
> +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
> @@ -767,7 +767,7 @@ wa_init_mcr(struct drm_i915_private *i915, struct i915_wa_list *wal)
>   		u32 slice = fls(sseu->slice_mask);
>   		u32 fuse3 =
>   			intel_uncore_read(&i915->uncore, GEN10_MIRROR_FUSE3);
> -		u8 ss_mask = sseu->subslice_mask[slice];
> +		u8 ss_mask = sseu->subslice_mask[slice * sseu->ss_stride];
>   
>   		u8 enabled_mask = (ss_mask | ss_mask >>
>   				   GEN10_L3BANK_PAIR_COUNT) & GEN10_L3BANK_MASK;
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> index 3f3ee83ac315..08089c24db25 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -1257,6 +1257,7 @@ static void i915_instdone_info(struct drm_i915_private *dev_priv,
>   			       struct seq_file *m,
>   			       struct intel_instdone *instdone)
>   {
> +	struct sseu_dev_info *sseu = &RUNTIME_INFO(dev_priv)->sseu;
>   	int slice;
>   	int subslice;
>   
> @@ -1272,11 +1273,11 @@ static void i915_instdone_info(struct drm_i915_private *dev_priv,
>   	if (INTEL_GEN(dev_priv) <= 6)
>   		return;
>   
> -	for_each_instdone_slice_subslice(dev_priv, slice, subslice)
> +	for_each_instdone_slice_subslice(dev_priv, sseu, slice, subslice)
>   		seq_printf(m, "\t\tSAMPLER_INSTDONE[%d][%d]: 0x%08x\n",
>   			   slice, subslice, instdone->sampler[slice][subslice]);
>   
> -	for_each_instdone_slice_subslice(dev_priv, slice, subslice)
> +	for_each_instdone_slice_subslice(dev_priv, sseu, slice, subslice)
>   		seq_printf(m, "\t\tROW_INSTDONE[%d][%d]: 0x%08x\n",
>   			   slice, subslice, instdone->row[slice][subslice]);
>   }
> @@ -4066,7 +4067,9 @@ static void gen10_sseu_device_status(struct drm_i915_private *dev_priv,
>   			continue;
>   
>   		sseu->slice_mask |= BIT(s);
> -		sseu->subslice_mask[s] = info->sseu.subslice_mask[s];
> +		intel_sseu_copy_subslices(&info->sseu, s,
> +					  sseu->subslice_mask,
> +					  info->sseu.subslice_mask);
>   
>   		for (ss = 0; ss < info->sseu.max_subslices; ss++) {
>   			unsigned int eu_cnt;
> @@ -4117,18 +4120,22 @@ static void gen9_sseu_device_status(struct drm_i915_private *dev_priv,
>   		sseu->slice_mask |= BIT(s);
>   
>   		if (IS_GEN9_BC(dev_priv))
> -			sseu->subslice_mask[s] =
> -				RUNTIME_INFO(dev_priv)->sseu.subslice_mask[s];
> +			intel_sseu_copy_subslices(&info->sseu, s,
> +						  sseu->subslice_mask,
> +						  info->sseu.subslice_mask);
>   
>   		for (ss = 0; ss < info->sseu.max_subslices; ss++) {
>   			unsigned int eu_cnt;
> +			u8 ss_idx = s * info->sseu.ss_stride +
> +				    ss / BITS_PER_BYTE;
>   
>   			if (IS_GEN9_LP(dev_priv)) {
>   				if (!(s_reg[s] & (GEN9_PGCTL_SS_ACK(ss))))
>   					/* skip disabled subslice */
>   					continue;
>   
> -				sseu->subslice_mask[s] |= BIT(ss);
> +				sseu->subslice_mask[ss_idx] |=
> +					BIT(ss % BITS_PER_BYTE);
>   			}
>   
>   			eu_cnt = 2 * hweight32(eu_reg[2*s + ss/2] &
> @@ -4145,25 +4152,24 @@ static void gen9_sseu_device_status(struct drm_i915_private *dev_priv,
>   static void broadwell_sseu_device_status(struct drm_i915_private *dev_priv,
>   					 struct sseu_dev_info *sseu)
>   {
> +	struct intel_runtime_info *info = RUNTIME_INFO(dev_priv);
>   	u32 slice_info = I915_READ(GEN8_GT_SLICE_INFO);
>   	int s;
>   
>   	sseu->slice_mask = slice_info & GEN8_LSLICESTAT_MASK;
>   
>   	if (sseu->slice_mask) {
> -		sseu->eu_per_subslice =
> -			RUNTIME_INFO(dev_priv)->sseu.eu_per_subslice;
> -		for (s = 0; s < fls(sseu->slice_mask); s++) {
> -			sseu->subslice_mask[s] =
> -				RUNTIME_INFO(dev_priv)->sseu.subslice_mask[s];
> -		}
> +		sseu->eu_per_subslice = info->sseu.eu_per_subslice;
> +		for (s = 0; s < fls(sseu->slice_mask); s++)
> +			intel_sseu_copy_subslices(&info->sseu, s,
> +						  sseu->subslice_mask,
> +						  info->sseu.subslice_mask);
>   		sseu->eu_total = sseu->eu_per_subslice *
>   				 intel_sseu_subslice_total(sseu);
>   
>   		/* subtract fused off EU(s) from enabled slice(s) */
>   		for (s = 0; s < fls(sseu->slice_mask); s++) {
> -			u8 subslice_7eu =
> -				RUNTIME_INFO(dev_priv)->sseu.subslice_7eu[s];
> +			u8 subslice_7eu = info->sseu.subslice_7eu[s];
>   
>   			sseu->eu_total -= hweight8(subslice_7eu);
>   		}
> @@ -4210,6 +4216,7 @@ static void i915_print_sseu_info(struct seq_file *m, bool is_available_info,
>   static int i915_sseu_status(struct seq_file *m, void *unused)
>   {
>   	struct drm_i915_private *dev_priv = node_to_i915(m->private);
> +	const struct intel_runtime_info *info = RUNTIME_INFO(dev_priv);
>   	struct sseu_dev_info sseu;
>   	intel_wakeref_t wakeref;
>   
> @@ -4217,14 +4224,13 @@ static int i915_sseu_status(struct seq_file *m, void *unused)
>   		return -ENODEV;
>   
>   	seq_puts(m, "SSEU Device Info\n");
> -	i915_print_sseu_info(m, true, &RUNTIME_INFO(dev_priv)->sseu);
> +	i915_print_sseu_info(m, true, &info->sseu);
>   
>   	seq_puts(m, "SSEU Device Status\n");
>   	memset(&sseu, 0, sizeof(sseu));
> -	sseu.max_slices = RUNTIME_INFO(dev_priv)->sseu.max_slices;
> -	sseu.max_subslices = RUNTIME_INFO(dev_priv)->sseu.max_subslices;
> -	sseu.max_eus_per_subslice =
> -		RUNTIME_INFO(dev_priv)->sseu.max_eus_per_subslice;
> +	intel_sseu_set_info(&sseu, info->sseu.max_slices,
> +			    info->sseu.max_subslices,
> +			    info->sseu.max_eus_per_subslice);
>   
>   	with_intel_runtime_pm(dev_priv, wakeref) {
>   		if (IS_CHERRYVIEW(dev_priv))
> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> index 130c5140db0d..6afe4e3afea4 100644
> --- a/drivers/gpu/drm/i915/i915_drv.c
> +++ b/drivers/gpu/drm/i915/i915_drv.c
> @@ -326,7 +326,7 @@ static int i915_getparam_ioctl(struct drm_device *dev, void *data,
>   	struct pci_dev *pdev = dev_priv->drm.pdev;
>   	const struct sseu_dev_info *sseu = &RUNTIME_INFO(dev_priv)->sseu;
>   	drm_i915_getparam_t *param = data;
> -	int value;
> +	int value = 0;
>   
>   	switch (param->param) {
>   	case I915_PARAM_IRQ_ACTIVE:
> @@ -455,7 +455,9 @@ static int i915_getparam_ioctl(struct drm_device *dev, void *data,
>   			return -ENODEV;
>   		break;
>   	case I915_PARAM_SUBSLICE_MASK:
> -		value = sseu->subslice_mask[0];
> +		/* Only copy bits from the first subslice */
> +		memcpy(&value, sseu->subslice_mask,
> +		       min(sseu->ss_stride, (u8)sizeof(value)));
>   		if (!value)
>   			return -ENODEV;
>   		break;
> diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
> index e1b858bd1d32..140918dd9b7d 100644
> --- a/drivers/gpu/drm/i915/i915_gpu_error.c
> +++ b/drivers/gpu/drm/i915/i915_gpu_error.c
> @@ -407,6 +407,7 @@ static void print_error_buffers(struct drm_i915_error_state_buf *m,
>   static void error_print_instdone(struct drm_i915_error_state_buf *m,
>   				 const struct drm_i915_error_engine *ee)
>   {
> +	struct sseu_dev_info *sseu = &RUNTIME_INFO(m->i915)->sseu;
>   	int slice;
>   	int subslice;
>   
> @@ -422,12 +423,12 @@ static void error_print_instdone(struct drm_i915_error_state_buf *m,
>   	if (INTEL_GEN(m->i915) <= 6)
>   		return;
>   
> -	for_each_instdone_slice_subslice(m->i915, slice, subslice)
> +	for_each_instdone_slice_subslice(m->i915, sseu, slice, subslice)
>   		err_printf(m, "  SAMPLER_INSTDONE[%d][%d]: 0x%08x\n",
>   			   slice, subslice,
>   			   ee->instdone.sampler[slice][subslice]);
>   
> -	for_each_instdone_slice_subslice(m->i915, slice, subslice)
> +	for_each_instdone_slice_subslice(m->i915, sseu, slice, subslice)
>   		err_printf(m, "  ROW_INSTDONE[%d][%d]: 0x%08x\n",
>   			   slice, subslice,
>   			   ee->instdone.row[slice][subslice]);
> diff --git a/drivers/gpu/drm/i915/i915_query.c b/drivers/gpu/drm/i915/i915_query.c
> index 7c1708c22811..000dcb145ce0 100644
> --- a/drivers/gpu/drm/i915/i915_query.c
> +++ b/drivers/gpu/drm/i915/i915_query.c
> @@ -37,8 +37,6 @@ static int query_topology_info(struct drm_i915_private *dev_priv,
>   	const struct sseu_dev_info *sseu = &RUNTIME_INFO(dev_priv)->sseu;
>   	struct drm_i915_query_topology_info topo;
>   	u32 slice_length, subslice_length, eu_length, total_length;
> -	u8 subslice_stride = GEN_SSEU_STRIDE(sseu->max_subslices);
> -	u8 eu_stride = GEN_SSEU_STRIDE(sseu->max_eus_per_subslice);
>   	int ret;
>   
>   	if (query_item->flags != 0)
> @@ -50,8 +48,8 @@ static int query_topology_info(struct drm_i915_private *dev_priv,
>   	BUILD_BUG_ON(sizeof(u8) != sizeof(sseu->slice_mask));
>   
>   	slice_length = sizeof(sseu->slice_mask);
> -	subslice_length = sseu->max_slices * subslice_stride;
> -	eu_length = sseu->max_slices * sseu->max_subslices * eu_stride;
> +	subslice_length = sseu->max_slices * sseu->ss_stride;
> +	eu_length = sseu->max_slices * sseu->max_subslices * sseu->eu_stride;
>   	total_length = sizeof(topo) + slice_length + subslice_length +
>   		       eu_length;
>   
> @@ -69,9 +67,9 @@ static int query_topology_info(struct drm_i915_private *dev_priv,
>   	topo.max_eus_per_subslice = sseu->max_eus_per_subslice;
>   
>   	topo.subslice_offset = slice_length;
> -	topo.subslice_stride = subslice_stride;
> +	topo.subslice_stride = sseu->ss_stride;
>   	topo.eu_offset = slice_length + subslice_length;
> -	topo.eu_stride = eu_stride;
> +	topo.eu_stride = sseu->eu_stride;
>   
>   	if (__copy_to_user(u64_to_user_ptr(query_item->data_ptr),
>   			   &topo, sizeof(topo)))
> diff --git a/drivers/gpu/drm/i915/intel_device_info.c b/drivers/gpu/drm/i915/intel_device_info.c
> index e1dbccf04cd9..bbbc0a8c2183 100644
> --- a/drivers/gpu/drm/i915/intel_device_info.c
> +++ b/drivers/gpu/drm/i915/intel_device_info.c
> @@ -84,17 +84,42 @@ void intel_device_info_dump_flags(const struct intel_device_info *info,
>   #undef PRINT_FLAG
>   }
>   
> +#define SS_STR_MAX_SIZE (GEN_MAX_SUBSLICE_STRIDE * 2)
> +
> +static u8 *
> +subslice_per_slice_str(u8 *buf, const struct sseu_dev_info *sseu, u8 slice)
> +{
> +	int i;
> +	u8 ss_offset = slice * sseu->ss_stride;
> +
> +	GEM_BUG_ON(slice >= sseu->max_slices);
> +
> +	memset(buf, 0, SS_STR_MAX_SIZE);

I suggest a more hardened approach of caller passing in the buffer size, 
since it is their buffer.

Regards,

Tvrtko

> +
> +	/*
> +	 * Print subslice information in reverse order to match
> +	 * userspace expectations.
> +	 */
> +	for (i = 0; i < sseu->ss_stride; i++)
> +		sprintf(&buf[i * 2], "%02x",
> +			sseu->subslice_mask[ss_offset + sseu->ss_stride -
> +					    (i + 1)]);
> +
> +	return buf;
> +}
> +
>   static void sseu_dump(const struct sseu_dev_info *sseu, struct drm_printer *p)
>   {
>   	int s;
> +	u8 buf[SS_STR_MAX_SIZE];
>   
>   	drm_printf(p, "slice total: %u, mask=%04x\n",
>   		   hweight8(sseu->slice_mask), sseu->slice_mask);
>   	drm_printf(p, "subslice total: %u\n", intel_sseu_subslice_total(sseu));
>   	for (s = 0; s < sseu->max_slices; s++) {
> -		drm_printf(p, "slice%d: %u subslices, mask=%04x\n",
> +		drm_printf(p, "slice%d: %u subslices, mask=%s\n",
>   			   s, intel_sseu_subslices_per_slice(sseu, s),
> -			   sseu->subslice_mask[s]);
> +			   subslice_per_slice_str(buf, sseu, s));
>   	}
>   	drm_printf(p, "EU total: %u\n", sseu->eu_total);
>   	drm_printf(p, "EU per subslice: %u\n", sseu->eu_per_subslice);
> @@ -118,6 +143,7 @@ void intel_device_info_dump_topology(const struct sseu_dev_info *sseu,
>   				     struct drm_printer *p)
>   {
>   	int s, ss;
> +	u8 buf[SS_STR_MAX_SIZE];
>   
>   	if (sseu->max_slices == 0) {
>   		drm_printf(p, "Unavailable\n");
> @@ -125,9 +151,9 @@ void intel_device_info_dump_topology(const struct sseu_dev_info *sseu,
>   	}
>   
>   	for (s = 0; s < sseu->max_slices; s++) {
> -		drm_printf(p, "slice%d: %u subslice(s) (0x%hhx):\n",
> +		drm_printf(p, "slice%d: %u subslice(s) (0x%s):\n",
>   			   s, intel_sseu_subslices_per_slice(sseu, s),
> -			   sseu->subslice_mask[s]);
> +			   subslice_per_slice_str(buf, sseu, s));
>   
>   		for (ss = 0; ss < sseu->max_subslices; ss++) {
>   			u16 enabled_eus = intel_sseu_get_eus(sseu, s, ss);
> @@ -156,15 +182,10 @@ static void gen11_sseu_info_init(struct drm_i915_private *dev_priv)
>   	u8 eu_en;
>   	int s;
>   
> -	if (IS_ELKHARTLAKE(dev_priv)) {
> -		sseu->max_slices = 1;
> -		sseu->max_subslices = 4;
> -		sseu->max_eus_per_subslice = 8;
> -	} else {
> -		sseu->max_slices = 1;
> -		sseu->max_subslices = 8;
> -		sseu->max_eus_per_subslice = 8;
> -	}
> +	if (IS_ELKHARTLAKE(dev_priv))
> +		intel_sseu_set_info(sseu, 1, 4, 8);
> +	else
> +		intel_sseu_set_info(sseu, 1, 8, 8);
>   
>   	s_en = I915_READ(GEN11_GT_SLICE_ENABLE) & GEN11_GT_S_ENA_MASK;
>   	ss_en = ~I915_READ(GEN11_GT_SUBSLICE_DISABLE);
> @@ -177,9 +198,11 @@ static void gen11_sseu_info_init(struct drm_i915_private *dev_priv)
>   			int ss;
>   
>   			sseu->slice_mask |= BIT(s);
> -			sseu->subslice_mask[s] = (ss_en >> ss_idx) & ss_en_mask;
> +			sseu->subslice_mask[s * sseu->ss_stride] =
> +				(ss_en >> ss_idx) & ss_en_mask;
>   			for (ss = 0; ss < sseu->max_subslices; ss++) {
> -				if (sseu->subslice_mask[s] & BIT(ss))
> +				if (sseu->subslice_mask[s * sseu->ss_stride] &
> +				    BIT(ss))
>   					intel_sseu_set_eus(sseu, s, ss, eu_en);
>   			}
>   		}
> @@ -201,23 +224,10 @@ static void gen10_sseu_info_init(struct drm_i915_private *dev_priv)
>   	const int eu_mask = 0xff;
>   	u32 subslice_mask, eu_en;
>   
> +	intel_sseu_set_info(sseu, 6, 4, 8);
> +
>   	sseu->slice_mask = (fuse2 & GEN10_F2_S_ENA_MASK) >>
>   			    GEN10_F2_S_ENA_SHIFT;
> -	sseu->max_slices = 6;
> -	sseu->max_subslices = 4;
> -	sseu->max_eus_per_subslice = 8;
> -
> -	subslice_mask = (1 << 4) - 1;
> -	subslice_mask &= ~((fuse2 & GEN10_F2_SS_DIS_MASK) >>
> -			   GEN10_F2_SS_DIS_SHIFT);
> -
> -	/*
> -	 * Slice0 can have up to 3 subslices, but there are only 2 in
> -	 * slice1/2.
> -	 */
> -	sseu->subslice_mask[0] = subslice_mask;
> -	for (s = 1; s < sseu->max_slices; s++)
> -		sseu->subslice_mask[s] = subslice_mask & 0x3;
>   
>   	/* Slice0 */
>   	eu_en = ~I915_READ(GEN8_EU_DISABLE0);
> @@ -242,14 +252,22 @@ static void gen10_sseu_info_init(struct drm_i915_private *dev_priv)
>   	eu_en = ~I915_READ(GEN10_EU_DISABLE3);
>   	intel_sseu_set_eus(sseu, 5, 1, eu_en & eu_mask);
>   
> -	/* Do a second pass where we mark the subslices disabled if all their
> -	 * eus are off.
> -	 */
> +	subslice_mask = (1 << 4) - 1;
> +	subslice_mask &= ~((fuse2 & GEN10_F2_SS_DIS_MASK) >>
> +			   GEN10_F2_SS_DIS_SHIFT);
> +
>   	for (s = 0; s < sseu->max_slices; s++) {
>   		for (ss = 0; ss < sseu->max_subslices; ss++) {
>   			if (intel_sseu_get_eus(sseu, s, ss) == 0)
> -				sseu->subslice_mask[s] &= ~BIT(ss);
> +				subslice_mask &= ~BIT(ss);
>   		}
> +
> +		/*
> +		 * Slice0 can have up to 3 subslices, but there are only 2 in
> +		 * slice1/2.
> +		 */
> +		intel_sseu_set_subslices(sseu, s, s == 0 ? subslice_mask :
> +							   subslice_mask & 0x3);
>   	}
>   
>   	sseu->eu_total = compute_eu_total(sseu);
> @@ -275,13 +293,12 @@ static void cherryview_sseu_info_init(struct drm_i915_private *dev_priv)
>   {
>   	struct sseu_dev_info *sseu = &RUNTIME_INFO(dev_priv)->sseu;
>   	u32 fuse;
> +	u8 subslice_mask;
>   
>   	fuse = I915_READ(CHV_FUSE_GT);
>   
>   	sseu->slice_mask = BIT(0);
> -	sseu->max_slices = 1;
> -	sseu->max_subslices = 2;
> -	sseu->max_eus_per_subslice = 8;
> +	intel_sseu_set_info(sseu, 1, 2, 8);
>   
>   	if (!(fuse & CHV_FGT_DISABLE_SS0)) {
>   		u8 disabled_mask =
> @@ -290,7 +307,7 @@ static void cherryview_sseu_info_init(struct drm_i915_private *dev_priv)
>   			(((fuse & CHV_FGT_EU_DIS_SS0_R1_MASK) >>
>   			  CHV_FGT_EU_DIS_SS0_R1_SHIFT) << 4);
>   
> -		sseu->subslice_mask[0] |= BIT(0);
> +		subslice_mask |= BIT(0);
>   		intel_sseu_set_eus(sseu, 0, 0, ~disabled_mask);
>   	}
>   
> @@ -301,10 +318,12 @@ static void cherryview_sseu_info_init(struct drm_i915_private *dev_priv)
>   			(((fuse & CHV_FGT_EU_DIS_SS1_R1_MASK) >>
>   			  CHV_FGT_EU_DIS_SS1_R1_SHIFT) << 4);
>   
> -		sseu->subslice_mask[0] |= BIT(1);
> +		subslice_mask |= BIT(1);
>   		intel_sseu_set_eus(sseu, 0, 1, ~disabled_mask);
>   	}
>   
> +	intel_sseu_set_subslices(sseu, 0, subslice_mask);
> +
>   	sseu->eu_total = compute_eu_total(sseu);
>   
>   	/*
> @@ -312,7 +331,8 @@ static void cherryview_sseu_info_init(struct drm_i915_private *dev_priv)
>   	 * across subslices.
>   	*/
>   	sseu->eu_per_subslice = intel_sseu_subslice_total(sseu) ?
> -				sseu->eu_total / intel_sseu_subslice_total(sseu) :
> +				sseu->eu_total /
> +					intel_sseu_subslice_total(sseu) :
>   				0;
>   	/*
>   	 * CHV supports subslice power gating on devices with more than
> @@ -336,9 +356,8 @@ static void gen9_sseu_info_init(struct drm_i915_private *dev_priv)
>   	sseu->slice_mask = (fuse2 & GEN8_F2_S_ENA_MASK) >> GEN8_F2_S_ENA_SHIFT;
>   
>   	/* BXT has a single slice and at most 3 subslices. */
> -	sseu->max_slices = IS_GEN9_LP(dev_priv) ? 1 : 3;
> -	sseu->max_subslices = IS_GEN9_LP(dev_priv) ? 3 : 4;
> -	sseu->max_eus_per_subslice = 8;
> +	intel_sseu_set_info(sseu, IS_GEN9_LP(dev_priv) ? 1 : 3,
> +			    IS_GEN9_LP(dev_priv) ? 3 : 4, 8);
>   
>   	/*
>   	 * The subslice disable field is global, i.e. it applies
> @@ -357,14 +376,16 @@ static void gen9_sseu_info_init(struct drm_i915_private *dev_priv)
>   			/* skip disabled slice */
>   			continue;
>   
> -		sseu->subslice_mask[s] = subslice_mask;
> +		intel_sseu_set_subslices(sseu, s, subslice_mask);
>   
>   		eu_disable = I915_READ(GEN9_EU_DISABLE(s));
>   		for (ss = 0; ss < sseu->max_subslices; ss++) {
>   			int eu_per_ss;
>   			u8 eu_disabled_mask;
> +			u8 ss_idx = s * sseu->ss_stride + ss / BITS_PER_BYTE;
>   
> -			if (!(sseu->subslice_mask[s] & BIT(ss)))
> +			if (!(sseu->subslice_mask[ss_idx] &
> +			      BIT(ss % BITS_PER_BYTE)))
>   				/* skip disabled subslice */
>   				continue;
>   
> @@ -437,9 +458,7 @@ static void broadwell_sseu_info_init(struct drm_i915_private *dev_priv)
>   
>   	fuse2 = I915_READ(GEN8_FUSE2);
>   	sseu->slice_mask = (fuse2 & GEN8_F2_S_ENA_MASK) >> GEN8_F2_S_ENA_SHIFT;
> -	sseu->max_slices = 3;
> -	sseu->max_subslices = 3;
> -	sseu->max_eus_per_subslice = 8;
> +	intel_sseu_set_info(sseu, 3, 3, 8);
>   
>   	/*
>   	 * The subslice disable field is global, i.e. it applies
> @@ -466,18 +485,21 @@ static void broadwell_sseu_info_init(struct drm_i915_private *dev_priv)
>   			/* skip disabled slice */
>   			continue;
>   
> -		sseu->subslice_mask[s] = subslice_mask;
> +		intel_sseu_set_subslices(sseu, s, subslice_mask);
>   
>   		for (ss = 0; ss < sseu->max_subslices; ss++) {
>   			u8 eu_disabled_mask;
> +			u8 ss_idx = s * sseu->ss_stride + ss / BITS_PER_BYTE;
>   			u32 n_disabled;
>   
> -			if (!(sseu->subslice_mask[s] & BIT(ss)))
> +			if (!(sseu->subslice_mask[ss_idx] &
> +			      BIT(ss % BITS_PER_BYTE)))
>   				/* skip disabled subslice */
>   				continue;
>   
>   			eu_disabled_mask =
> -				eu_disable[s] >> (ss * sseu->max_eus_per_subslice);
> +				eu_disable[s] >>
> +					(ss * sseu->max_eus_per_subslice);
>   
>   			intel_sseu_set_eus(sseu, s, ss, ~eu_disabled_mask);
>   
> @@ -517,6 +539,7 @@ static void haswell_sseu_info_init(struct drm_i915_private *dev_priv)
>   	struct sseu_dev_info *sseu = &RUNTIME_INFO(dev_priv)->sseu;
>   	u32 fuse1;
>   	int s, ss;
> +	u32 subslice_mask;
>   
>   	/*
>   	 * There isn't a register to tell us how many slices/subslices. We
> @@ -528,22 +551,18 @@ static void haswell_sseu_info_init(struct drm_i915_private *dev_priv)
>   		/* fall through */
>   	case 1:
>   		sseu->slice_mask = BIT(0);
> -		sseu->subslice_mask[0] = BIT(0);
> +		subslice_mask = BIT(0);
>   		break;
>   	case 2:
>   		sseu->slice_mask = BIT(0);
> -		sseu->subslice_mask[0] = BIT(0) | BIT(1);
> +		subslice_mask = BIT(0) | BIT(1);
>   		break;
>   	case 3:
>   		sseu->slice_mask = BIT(0) | BIT(1);
> -		sseu->subslice_mask[0] = BIT(0) | BIT(1);
> -		sseu->subslice_mask[1] = BIT(0) | BIT(1);
> +		subslice_mask = BIT(0) | BIT(1);
>   		break;
>   	}
>   
> -	sseu->max_slices = hweight8(sseu->slice_mask);
> -	sseu->max_subslices = hweight8(sseu->subslice_mask[0]);
> -
>   	fuse1 = I915_READ(HSW_PAVP_FUSE1);
>   	switch ((fuse1 & HSW_F1_EU_DIS_MASK) >> HSW_F1_EU_DIS_SHIFT) {
>   	default:
> @@ -560,9 +579,14 @@ static void haswell_sseu_info_init(struct drm_i915_private *dev_priv)
>   		sseu->eu_per_subslice = 6;
>   		break;
>   	}
> -	sseu->max_eus_per_subslice = sseu->eu_per_subslice;
> +
> +	intel_sseu_set_info(sseu, hweight8(sseu->slice_mask),
> +			    hweight8(subslice_mask),
> +			    sseu->eu_per_subslice);
>   
>   	for (s = 0; s < sseu->max_slices; s++) {
> +		intel_sseu_set_subslices(sseu, s, subslice_mask);
> +
>   		for (ss = 0; ss < sseu->max_subslices; ss++) {
>   			intel_sseu_set_eus(sseu, s, ss,
>   					   (1UL << sseu->eu_per_subslice) - 1);
> 
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 6/6] drm/i915: Expand subslice mask
  2019-05-01 18:22   ` Tvrtko Ursulin
@ 2019-05-01 18:29     ` Tvrtko Ursulin
  2019-05-01 19:40       ` Summers, Stuart
  0 siblings, 1 reply; 35+ messages in thread
From: Tvrtko Ursulin @ 2019-05-01 18:29 UTC (permalink / raw)
  To: Stuart Summers, intel-gfx


On 01/05/2019 19:22, Tvrtko Ursulin wrote:

[snip]

>> +#define SS_STR_MAX_SIZE (GEN_MAX_SUBSLICE_STRIDE * 2)
>> +
>> +static u8 *
>> +subslice_per_slice_str(u8 *buf, const struct sseu_dev_info *sseu, u8 
>> slice)
>> +{
>> +    int i;
>> +    u8 ss_offset = slice * sseu->ss_stride;
>> +
>> +    GEM_BUG_ON(slice >= sseu->max_slices);
>> +
>> +    memset(buf, 0, SS_STR_MAX_SIZE);
> 
> I suggest a more hardened approach of caller passing in the buffer size, 
> since it is their buffer.

Having said this..

>> +
>> +    /*
>> +     * Print subslice information in reverse order to match
>> +     * userspace expectations.
>> +     */
>> +    for (i = 0; i < sseu->ss_stride; i++)
>> +        sprintf(&buf[i * 2], "%02x",
>> +            sseu->subslice_mask[ss_offset + sseu->ss_stride -
>> +                        (i + 1)]);

...sprintf also needs to check against overflowing the buffer. 
(Relationship between loop boundary (ss_stride) and buffer size is a bit 
decoupled.)

And buffer should probably be char *.

Regards,

Tvrtko

>> +
>> +    return buf;
>> +}
>> +
>>   static void sseu_dump(const struct sseu_dev_info *sseu, struct 
>> drm_printer *p)
>>   {
>>       int s;
>> +    u8 buf[SS_STR_MAX_SIZE];
>>       drm_printf(p, "slice total: %u, mask=%04x\n",
>>              hweight8(sseu->slice_mask), sseu->slice_mask);
>>       drm_printf(p, "subslice total: %u\n", 
>> intel_sseu_subslice_total(sseu));
>>       for (s = 0; s < sseu->max_slices; s++) {
>> -        drm_printf(p, "slice%d: %u subslices, mask=%04x\n",
>> +        drm_printf(p, "slice%d: %u subslices, mask=%s\n",
>>                  s, intel_sseu_subslices_per_slice(sseu, s),
>> -               sseu->subslice_mask[s]);
>> +               subslice_per_slice_str(buf, sseu, s));
>>       }
>>       drm_printf(p, "EU total: %u\n", sseu->eu_total);
>>       drm_printf(p, "EU per subslice: %u\n", sseu->eu_per_subslice);
>> @@ -118,6 +143,7 @@ void intel_device_info_dump_topology(const struct 
>> sseu_dev_info *sseu,
>>                        struct drm_printer *p)
>>   {
>>       int s, ss;
>> +    u8 buf[SS_STR_MAX_SIZE];
>>       if (sseu->max_slices == 0) {
>>           drm_printf(p, "Unavailable\n");
>> @@ -125,9 +151,9 @@ void intel_device_info_dump_topology(const struct 
>> sseu_dev_info *sseu,
>>       }
>>       for (s = 0; s < sseu->max_slices; s++) {
>> -        drm_printf(p, "slice%d: %u subslice(s) (0x%hhx):\n",
>> +        drm_printf(p, "slice%d: %u subslice(s) (0x%s):\n",
>>                  s, intel_sseu_subslices_per_slice(sseu, s),
>> -               sseu->subslice_mask[s]);
>> +               subslice_per_slice_str(buf, sseu, s));
>>           for (ss = 0; ss < sseu->max_subslices; ss++) {
>>               u16 enabled_eus = intel_sseu_get_eus(sseu, s, ss);
>> @@ -156,15 +182,10 @@ static void gen11_sseu_info_init(struct 
>> drm_i915_private *dev_priv)
>>       u8 eu_en;
>>       int s;
>> -    if (IS_ELKHARTLAKE(dev_priv)) {
>> -        sseu->max_slices = 1;
>> -        sseu->max_subslices = 4;
>> -        sseu->max_eus_per_subslice = 8;
>> -    } else {
>> -        sseu->max_slices = 1;
>> -        sseu->max_subslices = 8;
>> -        sseu->max_eus_per_subslice = 8;
>> -    }
>> +    if (IS_ELKHARTLAKE(dev_priv))
>> +        intel_sseu_set_info(sseu, 1, 4, 8);
>> +    else
>> +        intel_sseu_set_info(sseu, 1, 8, 8);
>>       s_en = I915_READ(GEN11_GT_SLICE_ENABLE) & GEN11_GT_S_ENA_MASK;
>>       ss_en = ~I915_READ(GEN11_GT_SUBSLICE_DISABLE);
>> @@ -177,9 +198,11 @@ static void gen11_sseu_info_init(struct 
>> drm_i915_private *dev_priv)
>>               int ss;
>>               sseu->slice_mask |= BIT(s);
>> -            sseu->subslice_mask[s] = (ss_en >> ss_idx) & ss_en_mask;
>> +            sseu->subslice_mask[s * sseu->ss_stride] =
>> +                (ss_en >> ss_idx) & ss_en_mask;
>>               for (ss = 0; ss < sseu->max_subslices; ss++) {
>> -                if (sseu->subslice_mask[s] & BIT(ss))
>> +                if (sseu->subslice_mask[s * sseu->ss_stride] &
>> +                    BIT(ss))
>>                       intel_sseu_set_eus(sseu, s, ss, eu_en);
>>               }
>>           }
>> @@ -201,23 +224,10 @@ static void gen10_sseu_info_init(struct 
>> drm_i915_private *dev_priv)
>>       const int eu_mask = 0xff;
>>       u32 subslice_mask, eu_en;
>> +    intel_sseu_set_info(sseu, 6, 4, 8);
>> +
>>       sseu->slice_mask = (fuse2 & GEN10_F2_S_ENA_MASK) >>
>>                   GEN10_F2_S_ENA_SHIFT;
>> -    sseu->max_slices = 6;
>> -    sseu->max_subslices = 4;
>> -    sseu->max_eus_per_subslice = 8;
>> -
>> -    subslice_mask = (1 << 4) - 1;
>> -    subslice_mask &= ~((fuse2 & GEN10_F2_SS_DIS_MASK) >>
>> -               GEN10_F2_SS_DIS_SHIFT);
>> -
>> -    /*
>> -     * Slice0 can have up to 3 subslices, but there are only 2 in
>> -     * slice1/2.
>> -     */
>> -    sseu->subslice_mask[0] = subslice_mask;
>> -    for (s = 1; s < sseu->max_slices; s++)
>> -        sseu->subslice_mask[s] = subslice_mask & 0x3;
>>       /* Slice0 */
>>       eu_en = ~I915_READ(GEN8_EU_DISABLE0);
>> @@ -242,14 +252,22 @@ static void gen10_sseu_info_init(struct 
>> drm_i915_private *dev_priv)
>>       eu_en = ~I915_READ(GEN10_EU_DISABLE3);
>>       intel_sseu_set_eus(sseu, 5, 1, eu_en & eu_mask);
>> -    /* Do a second pass where we mark the subslices disabled if all 
>> their
>> -     * eus are off.
>> -     */
>> +    subslice_mask = (1 << 4) - 1;
>> +    subslice_mask &= ~((fuse2 & GEN10_F2_SS_DIS_MASK) >>
>> +               GEN10_F2_SS_DIS_SHIFT);
>> +
>>       for (s = 0; s < sseu->max_slices; s++) {
>>           for (ss = 0; ss < sseu->max_subslices; ss++) {
>>               if (intel_sseu_get_eus(sseu, s, ss) == 0)
>> -                sseu->subslice_mask[s] &= ~BIT(ss);
>> +                subslice_mask &= ~BIT(ss);
>>           }
>> +
>> +        /*
>> +         * Slice0 can have up to 3 subslices, but there are only 2 in
>> +         * slice1/2.
>> +         */
>> +        intel_sseu_set_subslices(sseu, s, s == 0 ? subslice_mask :
>> +                               subslice_mask & 0x3);
>>       }
>>       sseu->eu_total = compute_eu_total(sseu);
>> @@ -275,13 +293,12 @@ static void cherryview_sseu_info_init(struct 
>> drm_i915_private *dev_priv)
>>   {
>>       struct sseu_dev_info *sseu = &RUNTIME_INFO(dev_priv)->sseu;
>>       u32 fuse;
>> +    u8 subslice_mask;
>>       fuse = I915_READ(CHV_FUSE_GT);
>>       sseu->slice_mask = BIT(0);
>> -    sseu->max_slices = 1;
>> -    sseu->max_subslices = 2;
>> -    sseu->max_eus_per_subslice = 8;
>> +    intel_sseu_set_info(sseu, 1, 2, 8);
>>       if (!(fuse & CHV_FGT_DISABLE_SS0)) {
>>           u8 disabled_mask =
>> @@ -290,7 +307,7 @@ static void cherryview_sseu_info_init(struct 
>> drm_i915_private *dev_priv)
>>               (((fuse & CHV_FGT_EU_DIS_SS0_R1_MASK) >>
>>                 CHV_FGT_EU_DIS_SS0_R1_SHIFT) << 4);
>> -        sseu->subslice_mask[0] |= BIT(0);
>> +        subslice_mask |= BIT(0);
>>           intel_sseu_set_eus(sseu, 0, 0, ~disabled_mask);
>>       }
>> @@ -301,10 +318,12 @@ static void cherryview_sseu_info_init(struct 
>> drm_i915_private *dev_priv)
>>               (((fuse & CHV_FGT_EU_DIS_SS1_R1_MASK) >>
>>                 CHV_FGT_EU_DIS_SS1_R1_SHIFT) << 4);
>> -        sseu->subslice_mask[0] |= BIT(1);
>> +        subslice_mask |= BIT(1);
>>           intel_sseu_set_eus(sseu, 0, 1, ~disabled_mask);
>>       }
>> +    intel_sseu_set_subslices(sseu, 0, subslice_mask);
>> +
>>       sseu->eu_total = compute_eu_total(sseu);
>>       /*
>> @@ -312,7 +331,8 @@ static void cherryview_sseu_info_init(struct 
>> drm_i915_private *dev_priv)
>>        * across subslices.
>>       */
>>       sseu->eu_per_subslice = intel_sseu_subslice_total(sseu) ?
>> -                sseu->eu_total / intel_sseu_subslice_total(sseu) :
>> +                sseu->eu_total /
>> +                    intel_sseu_subslice_total(sseu) :
>>                   0;
>>       /*
>>        * CHV supports subslice power gating on devices with more than
>> @@ -336,9 +356,8 @@ static void gen9_sseu_info_init(struct 
>> drm_i915_private *dev_priv)
>>       sseu->slice_mask = (fuse2 & GEN8_F2_S_ENA_MASK) >> 
>> GEN8_F2_S_ENA_SHIFT;
>>       /* BXT has a single slice and at most 3 subslices. */
>> -    sseu->max_slices = IS_GEN9_LP(dev_priv) ? 1 : 3;
>> -    sseu->max_subslices = IS_GEN9_LP(dev_priv) ? 3 : 4;
>> -    sseu->max_eus_per_subslice = 8;
>> +    intel_sseu_set_info(sseu, IS_GEN9_LP(dev_priv) ? 1 : 3,
>> +                IS_GEN9_LP(dev_priv) ? 3 : 4, 8);
>>       /*
>>        * The subslice disable field is global, i.e. it applies
>> @@ -357,14 +376,16 @@ static void gen9_sseu_info_init(struct 
>> drm_i915_private *dev_priv)
>>               /* skip disabled slice */
>>               continue;
>> -        sseu->subslice_mask[s] = subslice_mask;
>> +        intel_sseu_set_subslices(sseu, s, subslice_mask);
>>           eu_disable = I915_READ(GEN9_EU_DISABLE(s));
>>           for (ss = 0; ss < sseu->max_subslices; ss++) {
>>               int eu_per_ss;
>>               u8 eu_disabled_mask;
>> +            u8 ss_idx = s * sseu->ss_stride + ss / BITS_PER_BYTE;
>> -            if (!(sseu->subslice_mask[s] & BIT(ss)))
>> +            if (!(sseu->subslice_mask[ss_idx] &
>> +                  BIT(ss % BITS_PER_BYTE)))
>>                   /* skip disabled subslice */
>>                   continue;
>> @@ -437,9 +458,7 @@ static void broadwell_sseu_info_init(struct 
>> drm_i915_private *dev_priv)
>>       fuse2 = I915_READ(GEN8_FUSE2);
>>       sseu->slice_mask = (fuse2 & GEN8_F2_S_ENA_MASK) >> 
>> GEN8_F2_S_ENA_SHIFT;
>> -    sseu->max_slices = 3;
>> -    sseu->max_subslices = 3;
>> -    sseu->max_eus_per_subslice = 8;
>> +    intel_sseu_set_info(sseu, 3, 3, 8);
>>       /*
>>        * The subslice disable field is global, i.e. it applies
>> @@ -466,18 +485,21 @@ static void broadwell_sseu_info_init(struct 
>> drm_i915_private *dev_priv)
>>               /* skip disabled slice */
>>               continue;
>> -        sseu->subslice_mask[s] = subslice_mask;
>> +        intel_sseu_set_subslices(sseu, s, subslice_mask);
>>           for (ss = 0; ss < sseu->max_subslices; ss++) {
>>               u8 eu_disabled_mask;
>> +            u8 ss_idx = s * sseu->ss_stride + ss / BITS_PER_BYTE;
>>               u32 n_disabled;
>> -            if (!(sseu->subslice_mask[s] & BIT(ss)))
>> +            if (!(sseu->subslice_mask[ss_idx] &
>> +                  BIT(ss % BITS_PER_BYTE)))
>>                   /* skip disabled subslice */
>>                   continue;
>>               eu_disabled_mask =
>> -                eu_disable[s] >> (ss * sseu->max_eus_per_subslice);
>> +                eu_disable[s] >>
>> +                    (ss * sseu->max_eus_per_subslice);
>>               intel_sseu_set_eus(sseu, s, ss, ~eu_disabled_mask);
>> @@ -517,6 +539,7 @@ static void haswell_sseu_info_init(struct 
>> drm_i915_private *dev_priv)
>>       struct sseu_dev_info *sseu = &RUNTIME_INFO(dev_priv)->sseu;
>>       u32 fuse1;
>>       int s, ss;
>> +    u32 subslice_mask;
>>       /*
>>        * There isn't a register to tell us how many slices/subslices. We
>> @@ -528,22 +551,18 @@ static void haswell_sseu_info_init(struct 
>> drm_i915_private *dev_priv)
>>           /* fall through */
>>       case 1:
>>           sseu->slice_mask = BIT(0);
>> -        sseu->subslice_mask[0] = BIT(0);
>> +        subslice_mask = BIT(0);
>>           break;
>>       case 2:
>>           sseu->slice_mask = BIT(0);
>> -        sseu->subslice_mask[0] = BIT(0) | BIT(1);
>> +        subslice_mask = BIT(0) | BIT(1);
>>           break;
>>       case 3:
>>           sseu->slice_mask = BIT(0) | BIT(1);
>> -        sseu->subslice_mask[0] = BIT(0) | BIT(1);
>> -        sseu->subslice_mask[1] = BIT(0) | BIT(1);
>> +        subslice_mask = BIT(0) | BIT(1);
>>           break;
>>       }
>> -    sseu->max_slices = hweight8(sseu->slice_mask);
>> -    sseu->max_subslices = hweight8(sseu->subslice_mask[0]);
>> -
>>       fuse1 = I915_READ(HSW_PAVP_FUSE1);
>>       switch ((fuse1 & HSW_F1_EU_DIS_MASK) >> HSW_F1_EU_DIS_SHIFT) {
>>       default:
>> @@ -560,9 +579,14 @@ static void haswell_sseu_info_init(struct 
>> drm_i915_private *dev_priv)
>>           sseu->eu_per_subslice = 6;
>>           break;
>>       }
>> -    sseu->max_eus_per_subslice = sseu->eu_per_subslice;
>> +
>> +    intel_sseu_set_info(sseu, hweight8(sseu->slice_mask),
>> +                hweight8(subslice_mask),
>> +                sseu->eu_per_subslice);
>>       for (s = 0; s < sseu->max_slices; s++) {
>> +        intel_sseu_set_subslices(sseu, s, subslice_mask);
>> +
>>           for (ss = 0; ss < sseu->max_subslices; ss++) {
>>               intel_sseu_set_eus(sseu, s, ss,
>>                          (1UL << sseu->eu_per_subslice) - 1);
>>
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 4/6] drm/i915: Move sseu helper functions to intel_sseu.h
  2019-05-01 15:34 ` [PATCH 4/6] drm/i915: Move sseu helper functions to intel_sseu.h Stuart Summers
@ 2019-05-01 18:48   ` Daniele Ceraolo Spurio
  2019-05-01 19:36     ` Summers, Stuart
  0 siblings, 1 reply; 35+ messages in thread
From: Daniele Ceraolo Spurio @ 2019-05-01 18:48 UTC (permalink / raw)
  To: Stuart Summers, intel-gfx



On 5/1/19 8:34 AM, Stuart Summers wrote:
> v2: fix spacing from checkpatch warning
> 
> Signed-off-by: Stuart Summers <stuart.summers@intel.com>
> ---
>   drivers/gpu/drm/i915/gt/intel_sseu.h     | 47 ++++++++++++++++++++++++
>   drivers/gpu/drm/i915/intel_device_info.h | 47 ------------------------
>   2 files changed, 47 insertions(+), 47 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_sseu.h b/drivers/gpu/drm/i915/gt/intel_sseu.h
> index f5ff6b7a756a..029e71d8f140 100644
> --- a/drivers/gpu/drm/i915/gt/intel_sseu.h
> +++ b/drivers/gpu/drm/i915/gt/intel_sseu.h
> @@ -63,12 +63,59 @@ intel_sseu_from_device_info(const struct sseu_dev_info *sseu)
>   	return value;
>   }
>   
> +static inline unsigned int sseu_subslice_total(const struct sseu_dev_info *sseu)
> +{
> +	unsigned int i, total = 0;
> +
> +	for (i = 0; i < ARRAY_SIZE(sseu->subslice_mask); i++)
> +		total += hweight8(sseu->subslice_mask[i]);
> +
> +	return total;
> +}
> +
>   static inline unsigned int
>   sseu_subslices_per_slice(const struct sseu_dev_info *sseu, u8 slice)
>   {
>   	return hweight8(sseu->subslice_mask[slice]);
>   }
>   
> +static inline int sseu_eu_idx(const struct sseu_dev_info *sseu,
> +			      int slice, int subslice)
> +{
> +	int subslice_stride = DIV_ROUND_UP(sseu->max_eus_per_subslice,
> +					   BITS_PER_BYTE);
> +	int slice_stride = sseu->max_subslices * subslice_stride;
> +
> +	return slice * slice_stride + subslice * subslice_stride;
> +}
> +
> +static inline u16 sseu_get_eus(const struct sseu_dev_info *sseu,
> +			       int slice, int subslice)
> +{
> +	int i, offset = sseu_eu_idx(sseu, slice, subslice);
> +	u16 eu_mask = 0;
> +
> +	for (i = 0;
> +	     i < DIV_ROUND_UP(sseu->max_eus_per_subslice, BITS_PER_BYTE); i++) {
> +		eu_mask |= ((u16)sseu->eu_mask[offset + i]) <<
> +			(i * BITS_PER_BYTE);
> +	}
> +
> +	return eu_mask;
> +}
> +
> +static inline void sseu_set_eus(struct sseu_dev_info *sseu,
> +				int slice, int subslice, u16 eu_mask)
> +{
> +	int i, offset = sseu_eu_idx(sseu, slice, subslice);
> +
> +	for (i = 0;
> +	     i < DIV_ROUND_UP(sseu->max_eus_per_subslice, BITS_PER_BYTE); i++) {
> +		sseu->eu_mask[offset + i] =
> +			(eu_mask >> (BITS_PER_BYTE * i)) & 0xff;
> +	}
> +}
> +

AFAICS sseu_get_eus() and sseu_set_eus() are only used by sseu-related 
functions in device_info.c and sseu_eu_idx() is only used by those 2, so 
we can make all 3 of them static in that file. We should also migrate 
all of the sseu code from device_info.c to intel_sseu.c for consistency; 
I'm ok with that being done as a follow up if you prefer to avoid it in 
this series.

I was also about to mention adding the intel_* prefix to the remaining 
functions but then I realized you add it in the next patch. My personal 
preference would be to do that in this patch, but I'm not going to block 
on it.

Daniele

>   u32 intel_sseu_make_rpcs(struct drm_i915_private *i915,
>   			 const struct intel_sseu *req_sseu);
>   
> diff --git a/drivers/gpu/drm/i915/intel_device_info.h b/drivers/gpu/drm/i915/intel_device_info.h
> index 5a2e17d6146b..6412a9c72898 100644
> --- a/drivers/gpu/drm/i915/intel_device_info.h
> +++ b/drivers/gpu/drm/i915/intel_device_info.h
> @@ -218,53 +218,6 @@ struct intel_driver_caps {
>   	bool has_logical_contexts:1;
>   };
>   
> -static inline unsigned int sseu_subslice_total(const struct sseu_dev_info *sseu)
> -{
> -	unsigned int i, total = 0;
> -
> -	for (i = 0; i < ARRAY_SIZE(sseu->subslice_mask); i++)
> -		total += hweight8(sseu->subslice_mask[i]);
> -
> -	return total;
> -}
> -
> -static inline int sseu_eu_idx(const struct sseu_dev_info *sseu,
> -			      int slice, int subslice)
> -{
> -	int subslice_stride = DIV_ROUND_UP(sseu->max_eus_per_subslice,
> -					   BITS_PER_BYTE);
> -	int slice_stride = sseu->max_subslices * subslice_stride;
> -
> -	return slice * slice_stride + subslice * subslice_stride;
> -}
> -
> -static inline u16 sseu_get_eus(const struct sseu_dev_info *sseu,
> -			       int slice, int subslice)
> -{
> -	int i, offset = sseu_eu_idx(sseu, slice, subslice);
> -	u16 eu_mask = 0;
> -
> -	for (i = 0;
> -	     i < DIV_ROUND_UP(sseu->max_eus_per_subslice, BITS_PER_BYTE); i++) {
> -		eu_mask |= ((u16) sseu->eu_mask[offset + i]) <<
> -			(i * BITS_PER_BYTE);
> -	}
> -
> -	return eu_mask;
> -}
> -
> -static inline void sseu_set_eus(struct sseu_dev_info *sseu,
> -				int slice, int subslice, u16 eu_mask)
> -{
> -	int i, offset = sseu_eu_idx(sseu, slice, subslice);
> -
> -	for (i = 0;
> -	     i < DIV_ROUND_UP(sseu->max_eus_per_subslice, BITS_PER_BYTE); i++) {
> -		sseu->eu_mask[offset + i] =
> -			(eu_mask >> (BITS_PER_BYTE * i)) & 0xff;
> -	}
> -}
> -
>   const char *intel_platform_name(enum intel_platform platform);
>   
>   void intel_device_info_subplatform_init(struct drm_i915_private *dev_priv);
> 
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 4/6] drm/i915: Move sseu helper functions to intel_sseu.h
  2019-05-01 18:48   ` Daniele Ceraolo Spurio
@ 2019-05-01 19:36     ` Summers, Stuart
  0 siblings, 0 replies; 35+ messages in thread
From: Summers, Stuart @ 2019-05-01 19:36 UTC (permalink / raw)
  To: Ceraolo Spurio, Daniele, intel-gfx


[-- Attachment #1.1: Type: text/plain, Size: 5522 bytes --]

On Wed, 2019-05-01 at 11:48 -0700, Daniele Ceraolo Spurio wrote:
> 
> On 5/1/19 8:34 AM, Stuart Summers wrote:
> > v2: fix spacing from checkpatch warning
> > 
> > Signed-off-by: Stuart Summers <stuart.summers@intel.com>
> > ---
> >   drivers/gpu/drm/i915/gt/intel_sseu.h     | 47
> > ++++++++++++++++++++++++
> >   drivers/gpu/drm/i915/intel_device_info.h | 47 -----------------
> > -------
> >   2 files changed, 47 insertions(+), 47 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/gt/intel_sseu.h
> > b/drivers/gpu/drm/i915/gt/intel_sseu.h
> > index f5ff6b7a756a..029e71d8f140 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_sseu.h
> > +++ b/drivers/gpu/drm/i915/gt/intel_sseu.h
> > @@ -63,12 +63,59 @@ intel_sseu_from_device_info(const struct
> > sseu_dev_info *sseu)
> >   	return value;
> >   }
> >   
> > +static inline unsigned int sseu_subslice_total(const struct
> > sseu_dev_info *sseu)
> > +{
> > +	unsigned int i, total = 0;
> > +
> > +	for (i = 0; i < ARRAY_SIZE(sseu->subslice_mask); i++)
> > +		total += hweight8(sseu->subslice_mask[i]);
> > +
> > +	return total;
> > +}
> > +
> >   static inline unsigned int
> >   sseu_subslices_per_slice(const struct sseu_dev_info *sseu, u8
> > slice)
> >   {
> >   	return hweight8(sseu->subslice_mask[slice]);
> >   }
> >   
> > +static inline int sseu_eu_idx(const struct sseu_dev_info *sseu,
> > +			      int slice, int subslice)
> > +{
> > +	int subslice_stride = DIV_ROUND_UP(sseu->max_eus_per_subslice,
> > +					   BITS_PER_BYTE);
> > +	int slice_stride = sseu->max_subslices * subslice_stride;
> > +
> > +	return slice * slice_stride + subslice * subslice_stride;
> > +}
> > +
> > +static inline u16 sseu_get_eus(const struct sseu_dev_info *sseu,
> > +			       int slice, int subslice)
> > +{
> > +	int i, offset = sseu_eu_idx(sseu, slice, subslice);
> > +	u16 eu_mask = 0;
> > +
> > +	for (i = 0;
> > +	     i < DIV_ROUND_UP(sseu->max_eus_per_subslice,
> > BITS_PER_BYTE); i++) {
> > +		eu_mask |= ((u16)sseu->eu_mask[offset + i]) <<
> > +			(i * BITS_PER_BYTE);
> > +	}
> > +
> > +	return eu_mask;
> > +}
> > +
> > +static inline void sseu_set_eus(struct sseu_dev_info *sseu,
> > +				int slice, int subslice, u16 eu_mask)
> > +{
> > +	int i, offset = sseu_eu_idx(sseu, slice, subslice);
> > +
> > +	for (i = 0;
> > +	     i < DIV_ROUND_UP(sseu->max_eus_per_subslice,
> > BITS_PER_BYTE); i++) {
> > +		sseu->eu_mask[offset + i] =
> > +			(eu_mask >> (BITS_PER_BYTE * i)) & 0xff;
> > +	}
> > +}
> > +
> 
> AFAICS sseu_get_eus() and sseu_set_eus() are only used by sseu-
> related 
> functions in device_info.c and sseu_eu_idx() is only used by those 2,
> so 
> we can make all 3 of them static in that file. We should also
> migrate 
> all of the sseu code from device_info.c to intel_sseu.c for
> consistency; 
> I'm ok with that being done as a follow up if you prefer to avoid it
> in 
> this series.
> 
> I was also about to mention adding the intel_* prefix to the
> remaining 
> functions but then I realized you add it in the next patch. My
> personal 
> preference would be to do that in this patch, but I'm not going to
> block 
> on it.

Not sure why I missed this one, but good catch! I'll post a fix in the
next series update.

-Stuart

> 
> Daniele
> 
> >   u32 intel_sseu_make_rpcs(struct drm_i915_private *i915,
> >   			 const struct intel_sseu *req_sseu);
> >   
> > diff --git a/drivers/gpu/drm/i915/intel_device_info.h
> > b/drivers/gpu/drm/i915/intel_device_info.h
> > index 5a2e17d6146b..6412a9c72898 100644
> > --- a/drivers/gpu/drm/i915/intel_device_info.h
> > +++ b/drivers/gpu/drm/i915/intel_device_info.h
> > @@ -218,53 +218,6 @@ struct intel_driver_caps {
> >   	bool has_logical_contexts:1;
> >   };
> >   
> > -static inline unsigned int sseu_subslice_total(const struct
> > sseu_dev_info *sseu)
> > -{
> > -	unsigned int i, total = 0;
> > -
> > -	for (i = 0; i < ARRAY_SIZE(sseu->subslice_mask); i++)
> > -		total += hweight8(sseu->subslice_mask[i]);
> > -
> > -	return total;
> > -}
> > -
> > -static inline int sseu_eu_idx(const struct sseu_dev_info *sseu,
> > -			      int slice, int subslice)
> > -{
> > -	int subslice_stride = DIV_ROUND_UP(sseu->max_eus_per_subslice,
> > -					   BITS_PER_BYTE);
> > -	int slice_stride = sseu->max_subslices * subslice_stride;
> > -
> > -	return slice * slice_stride + subslice * subslice_stride;
> > -}
> > -
> > -static inline u16 sseu_get_eus(const struct sseu_dev_info *sseu,
> > -			       int slice, int subslice)
> > -{
> > -	int i, offset = sseu_eu_idx(sseu, slice, subslice);
> > -	u16 eu_mask = 0;
> > -
> > -	for (i = 0;
> > -	     i < DIV_ROUND_UP(sseu->max_eus_per_subslice,
> > BITS_PER_BYTE); i++) {
> > -		eu_mask |= ((u16) sseu->eu_mask[offset + i]) <<
> > -			(i * BITS_PER_BYTE);
> > -	}
> > -
> > -	return eu_mask;
> > -}
> > -
> > -static inline void sseu_set_eus(struct sseu_dev_info *sseu,
> > -				int slice, int subslice, u16 eu_mask)
> > -{
> > -	int i, offset = sseu_eu_idx(sseu, slice, subslice);
> > -
> > -	for (i = 0;
> > -	     i < DIV_ROUND_UP(sseu->max_eus_per_subslice,
> > BITS_PER_BYTE); i++) {
> > -		sseu->eu_mask[offset + i] =
> > -			(eu_mask >> (BITS_PER_BYTE * i)) & 0xff;
> > -	}
> > -}
> > -
> >   const char *intel_platform_name(enum intel_platform platform);
> >   
> >   void intel_device_info_subplatform_init(struct drm_i915_private
> > *dev_priv);
> > 

[-- Attachment #1.2: smime.p7s --]
[-- Type: application/x-pkcs7-signature, Size: 3270 bytes --]

[-- Attachment #2: Type: text/plain, Size: 159 bytes --]

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 3/6] drm/i915: Move calculation of subslices per slice to new function
  2019-05-01 18:14   ` Daniele Ceraolo Spurio
@ 2019-05-01 19:37     ` Summers, Stuart
  0 siblings, 0 replies; 35+ messages in thread
From: Summers, Stuart @ 2019-05-01 19:37 UTC (permalink / raw)
  To: Ceraolo Spurio, Daniele, intel-gfx


[-- Attachment #1.1: Type: text/plain, Size: 3446 bytes --]

On Wed, 2019-05-01 at 11:14 -0700, Daniele Ceraolo Spurio wrote:
> 
> On 5/1/19 8:34 AM, Stuart Summers wrote:
> > Add a new function to return the number of subslices per slice to
> > consolidate code usage.
> > 
> > v2: rebase on changes to move sseu struct to intel_sseu.h
> > 
> > Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> > Signed-off-by: Stuart Summers <stuart.summers@intel.com>
> > ---
> >   drivers/gpu/drm/i915/gt/intel_sseu.h     | 6 ++++++
> >   drivers/gpu/drm/i915/i915_debugfs.c      | 2 +-
> >   drivers/gpu/drm/i915/intel_device_info.c | 4 ++--
> >   3 files changed, 9 insertions(+), 3 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/gt/intel_sseu.h
> > b/drivers/gpu/drm/i915/gt/intel_sseu.h
> > index c0b16b248d4c..f5ff6b7a756a 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_sseu.h
> > +++ b/drivers/gpu/drm/i915/gt/intel_sseu.h
> > @@ -63,6 +63,12 @@ intel_sseu_from_device_info(const struct
> > sseu_dev_info *sseu)
> >   	return value;
> >   }
> >   
> > +static inline unsigned int
> > +sseu_subslices_per_slice(const struct sseu_dev_info *sseu, u8
> > slice)
> 
> This is exposed, so needs an intel_* prefix. with that:

Will change in the next series update. Thanks for the review!

-Stuart

> 
> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> 
> Daniele
> 
> > +{
> > +	return hweight8(sseu->subslice_mask[slice]);
> > +}
> > +
> >   u32 intel_sseu_make_rpcs(struct drm_i915_private *i915,
> >   			 const struct intel_sseu *req_sseu);
> >   
> > diff --git a/drivers/gpu/drm/i915/i915_debugfs.c
> > b/drivers/gpu/drm/i915/i915_debugfs.c
> > index 0e4dffcd4da4..fe854c629a32 100644
> > --- a/drivers/gpu/drm/i915/i915_debugfs.c
> > +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> > @@ -4185,7 +4185,7 @@ static void i915_print_sseu_info(struct
> > seq_file *m, bool is_available_info,
> >   		   sseu_subslice_total(sseu));
> >   	for (s = 0; s < fls(sseu->slice_mask); s++) {
> >   		seq_printf(m, "  %s Slice%i subslices: %u\n", type,
> > -			   s, hweight8(sseu->subslice_mask[s]));
> > +			   s, sseu_subslices_per_slice(sseu, s));
> >   	}
> >   	seq_printf(m, "  %s EU Total: %u\n", type,
> >   		   sseu->eu_total);
> > diff --git a/drivers/gpu/drm/i915/intel_device_info.c
> > b/drivers/gpu/drm/i915/intel_device_info.c
> > index 6af480b95bc6..559cf0d0628e 100644
> > --- a/drivers/gpu/drm/i915/intel_device_info.c
> > +++ b/drivers/gpu/drm/i915/intel_device_info.c
> > @@ -93,7 +93,7 @@ static void sseu_dump(const struct sseu_dev_info
> > *sseu, struct drm_printer *p)
> >   	drm_printf(p, "subslice total: %u\n",
> > sseu_subslice_total(sseu));
> >   	for (s = 0; s < sseu->max_slices; s++) {
> >   		drm_printf(p, "slice%d: %u subslices, mask=%04x\n",
> > -			   s, hweight8(sseu->subslice_mask[s]),
> > +			   s, sseu_subslices_per_slice(sseu, s),
> >   			   sseu->subslice_mask[s]);
> >   	}
> >   	drm_printf(p, "EU total: %u\n", sseu->eu_total);
> > @@ -126,7 +126,7 @@ void intel_device_info_dump_topology(const
> > struct sseu_dev_info *sseu,
> >   
> >   	for (s = 0; s < sseu->max_slices; s++) {
> >   		drm_printf(p, "slice%d: %u subslice(s) (0x%hhx):\n",
> > -			   s, hweight8(sseu->subslice_mask[s]),
> > +			   s, sseu_subslices_per_slice(sseu, s),
> >   			   sseu->subslice_mask[s]);
> >   
> >   		for (ss = 0; ss < sseu->max_subslices; ss++) {
> > 

[-- Attachment #1.2: smime.p7s --]
[-- Type: application/x-pkcs7-signature, Size: 3270 bytes --]

[-- Attachment #2: Type: text/plain, Size: 159 bytes --]

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 2/6] drm/i915: Add macro for SSEU stride calculation
  2019-05-01 18:11   ` Daniele Ceraolo Spurio
@ 2019-05-01 19:37     ` Summers, Stuart
  0 siblings, 0 replies; 35+ messages in thread
From: Summers, Stuart @ 2019-05-01 19:37 UTC (permalink / raw)
  To: Ceraolo Spurio, Daniele, intel-gfx


[-- Attachment #1.1: Type: text/plain, Size: 3879 bytes --]

On Wed, 2019-05-01 at 11:11 -0700, Daniele Ceraolo Spurio wrote:
> 
> On 5/1/19 8:34 AM, Stuart Summers wrote:
> > Subslice stride and EU stride are calculated multiple times in
> > i915_query. Move this calculation to a macro to reduce code
> > duplication.
> > 
> > v2: update headers in intel_sseu.h
> > 
> > Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> > Signed-off-by: Stuart Summers <stuart.summers@intel.com>
> > ---
> >   drivers/gpu/drm/i915/gt/intel_sseu.h |  2 ++
> >   drivers/gpu/drm/i915/i915_query.c    | 17 ++++++++---------
> >   2 files changed, 10 insertions(+), 9 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/gt/intel_sseu.h
> > b/drivers/gpu/drm/i915/gt/intel_sseu.h
> > index 73bc824094e8..c0b16b248d4c 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_sseu.h
> > +++ b/drivers/gpu/drm/i915/gt/intel_sseu.h
> > @@ -8,11 +8,13 @@
> >   #define __INTEL_SSEU_H__
> >   
> >   #include <linux/types.h>
> > +#include <linux/kernel.h>
> >   
> >   struct drm_i915_private;
> >   
> >   #define GEN_MAX_SLICES		(6) /* CNL upper bound */
> >   #define GEN_MAX_SUBSLICES	(8) /* ICL upper bound */
> > +#define GEN_SSEU_STRIDE(bits) DIV_ROUND_UP(bits, BITS_PER_BYTE)
> 
> What we pass to this macro isn't really a bits count but the maximum 
> amount of s/ss/eus. s/bits/max_entry/, or something like that? with
> that:

Makes sense, I'll make the change in the next series post. Thanks for
the review!

-Stuart

> 
> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> 
> Daniele
> 
> >   
> >   struct sseu_dev_info {
> >   	u8 slice_mask;
> > diff --git a/drivers/gpu/drm/i915/i915_query.c
> > b/drivers/gpu/drm/i915/i915_query.c
> > index 782183b78f49..7c1708c22811 100644
> > --- a/drivers/gpu/drm/i915/i915_query.c
> > +++ b/drivers/gpu/drm/i915/i915_query.c
> > @@ -37,6 +37,8 @@ static int query_topology_info(struct
> > drm_i915_private *dev_priv,
> >   	const struct sseu_dev_info *sseu = &RUNTIME_INFO(dev_priv)-
> > >sseu;
> >   	struct drm_i915_query_topology_info topo;
> >   	u32 slice_length, subslice_length, eu_length, total_length;
> > +	u8 subslice_stride = GEN_SSEU_STRIDE(sseu->max_subslices);
> > +	u8 eu_stride = GEN_SSEU_STRIDE(sseu->max_eus_per_subslice);
> >   	int ret;
> >   
> >   	if (query_item->flags != 0)
> > @@ -48,12 +50,10 @@ static int query_topology_info(struct
> > drm_i915_private *dev_priv,
> >   	BUILD_BUG_ON(sizeof(u8) != sizeof(sseu->slice_mask));
> >   
> >   	slice_length = sizeof(sseu->slice_mask);
> > -	subslice_length = sseu->max_slices *
> > -		DIV_ROUND_UP(sseu->max_subslices, BITS_PER_BYTE);
> > -	eu_length = sseu->max_slices * sseu->max_subslices *
> > -		DIV_ROUND_UP(sseu->max_eus_per_subslice,
> > BITS_PER_BYTE);
> > -
> > -	total_length = sizeof(topo) + slice_length + subslice_length +
> > eu_length;
> > +	subslice_length = sseu->max_slices * subslice_stride;
> > +	eu_length = sseu->max_slices * sseu->max_subslices * eu_stride;
> > +	total_length = sizeof(topo) + slice_length + subslice_length +
> > +		       eu_length;
> >   
> >   	ret = copy_query_item(&topo, sizeof(topo), total_length,
> >   			      query_item);
> > @@ -69,10 +69,9 @@ static int query_topology_info(struct
> > drm_i915_private *dev_priv,
> >   	topo.max_eus_per_subslice = sseu->max_eus_per_subslice;
> >   
> >   	topo.subslice_offset = slice_length;
> > -	topo.subslice_stride = DIV_ROUND_UP(sseu->max_subslices,
> > BITS_PER_BYTE);
> > +	topo.subslice_stride = subslice_stride;
> >   	topo.eu_offset = slice_length + subslice_length;
> > -	topo.eu_stride =
> > -		DIV_ROUND_UP(sseu->max_eus_per_subslice,
> > BITS_PER_BYTE);
> > +	topo.eu_stride = eu_stride;
> >   
> >   	if (__copy_to_user(u64_to_user_ptr(query_item->data_ptr),
> >   			   &topo, sizeof(topo)))
> > 

[-- Attachment #1.2: smime.p7s --]
[-- Type: application/x-pkcs7-signature, Size: 3270 bytes --]

[-- Attachment #2: Type: text/plain, Size: 159 bytes --]

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 1/6] drm/i915: Use local variable for SSEU info in GETPARAM ioctl
  2019-05-01 17:54   ` Daniele Ceraolo Spurio
@ 2019-05-01 19:38     ` Summers, Stuart
  0 siblings, 0 replies; 35+ messages in thread
From: Summers, Stuart @ 2019-05-01 19:38 UTC (permalink / raw)
  To: Ceraolo Spurio, Daniele, intel-gfx


[-- Attachment #1.1: Type: text/plain, Size: 2792 bytes --]

On Wed, 2019-05-01 at 10:54 -0700, Daniele Ceraolo Spurio wrote:
> 
> On 5/1/19 8:34 AM, Stuart Summers wrote:
> > In the GETPARAM ioctl handler, use a local variable to consolidate
> > usage of SSEU runtime info.
> > 
> > v2: add const to sseu_dev_info variable
> > 
> > Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> > Signed-off-by: Stuart Summers <stuart.summers@intel.com>
> 

Thanks for the review!

-Stuart

> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> 
> > ---
> >   drivers/gpu/drm/i915/i915_drv.c | 11 ++++++-----
> >   1 file changed, 6 insertions(+), 5 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_drv.c
> > b/drivers/gpu/drm/i915/i915_drv.c
> > index 21dac5a09fbe..c376244c19c4 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.c
> > +++ b/drivers/gpu/drm/i915/i915_drv.c
> > @@ -324,6 +324,7 @@ static int i915_getparam_ioctl(struct
> > drm_device *dev, void *data,
> >   {
> >   	struct drm_i915_private *dev_priv = to_i915(dev);
> >   	struct pci_dev *pdev = dev_priv->drm.pdev;
> > +	const struct sseu_dev_info *sseu = &RUNTIME_INFO(dev_priv)-
> > >sseu;
> >   	drm_i915_getparam_t *param = data;
> >   	int value;
> >   
> > @@ -377,12 +378,12 @@ static int i915_getparam_ioctl(struct
> > drm_device *dev, void *data,
> >   		value = i915_cmd_parser_get_version(dev_priv);
> >   		break;
> >   	case I915_PARAM_SUBSLICE_TOTAL:
> > -		value = sseu_subslice_total(&RUNTIME_INFO(dev_priv)-
> > >sseu);
> > +		value = sseu_subslice_total(sseu);
> >   		if (!value)
> >   			return -ENODEV;
> >   		break;
> >   	case I915_PARAM_EU_TOTAL:
> > -		value = RUNTIME_INFO(dev_priv)->sseu.eu_total;
> > +		value = sseu->eu_total;
> >   		if (!value)
> >   			return -ENODEV;
> >   		break;
> > @@ -399,7 +400,7 @@ static int i915_getparam_ioctl(struct
> > drm_device *dev, void *data,
> >   		value = HAS_POOLED_EU(dev_priv);
> >   		break;
> >   	case I915_PARAM_MIN_EU_IN_POOL:
> > -		value = RUNTIME_INFO(dev_priv)->sseu.min_eu_in_pool;
> > +		value = sseu->min_eu_in_pool;
> >   		break;
> >   	case I915_PARAM_HUC_STATUS:
> >   		value = intel_huc_check_status(&dev_priv->huc);
> > @@ -449,12 +450,12 @@ static int i915_getparam_ioctl(struct
> > drm_device *dev, void *data,
> >   		value = intel_engines_has_context_isolation(dev_priv);
> >   		break;
> >   	case I915_PARAM_SLICE_MASK:
> > -		value = RUNTIME_INFO(dev_priv)->sseu.slice_mask;
> > +		value = sseu->slice_mask;
> >   		if (!value)
> >   			return -ENODEV;
> >   		break;
> >   	case I915_PARAM_SUBSLICE_MASK:
> > -		value = RUNTIME_INFO(dev_priv)->sseu.subslice_mask[0];
> > +		value = sseu->subslice_mask[0];
> >   		if (!value)
> >   			return -ENODEV;
> >   		break;
> > 

[-- Attachment #1.2: smime.p7s --]
[-- Type: application/x-pkcs7-signature, Size: 3270 bytes --]

[-- Attachment #2: Type: text/plain, Size: 159 bytes --]

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 6/6] drm/i915: Expand subslice mask
  2019-05-01 18:29     ` Tvrtko Ursulin
@ 2019-05-01 19:40       ` Summers, Stuart
  0 siblings, 0 replies; 35+ messages in thread
From: Summers, Stuart @ 2019-05-01 19:40 UTC (permalink / raw)
  To: intel-gfx, tvrtko.ursulin


[-- Attachment #1.1: Type: text/plain, Size: 14582 bytes --]

On Wed, 2019-05-01 at 19:29 +0100, Tvrtko Ursulin wrote:
> On 01/05/2019 19:22, Tvrtko Ursulin wrote:
> 
> [snip]
> 
> > > +#define SS_STR_MAX_SIZE (GEN_MAX_SUBSLICE_STRIDE * 2)
> > > +
> > > +static u8 *
> > > +subslice_per_slice_str(u8 *buf, const struct sseu_dev_info
> > > *sseu, u8 
> > > slice)
> > > +{
> > > +    int i;
> > > +    u8 ss_offset = slice * sseu->ss_stride;
> > > +
> > > +    GEM_BUG_ON(slice >= sseu->max_slices);
> > > +
> > > +    memset(buf, 0, SS_STR_MAX_SIZE);
> > 
> > I suggest a more hardened approach of caller passing in the buffer
> > size, 
> > since it is their buffer.

Not a bad idea. I had the define to make this explicit and handle the
future cases, but probably right it's better to isolate this. I'll make
the change in the next series update.

> 
> Having said this..
> 
> > > +
> > > +    /*
> > > +     * Print subslice information in reverse order to match
> > > +     * userspace expectations.
> > > +     */
> > > +    for (i = 0; i < sseu->ss_stride; i++)
> > > +        sprintf(&buf[i * 2], "%02x",
> > > +            sseu->subslice_mask[ss_offset + sseu->ss_stride -
> > > +                        (i + 1)]);
> 
> ...sprintf also needs to check against overflowing the buffer. 
> (Relationship between loop boundary (ss_stride) and buffer size is a
> bit 
> decoupled.)

I'll add the check, makes sense.

> 
> And buffer should probably be char *.

No problem. I'll make this change. Thanks for the feedback!

- Stuart

> 
> Regards,
> 
> Tvrtko
> 
> > > +
> > > +    return buf;
> > > +}
> > > +
> > >   static void sseu_dump(const struct sseu_dev_info *sseu, struct 
> > > drm_printer *p)
> > >   {
> > >       int s;
> > > +    u8 buf[SS_STR_MAX_SIZE];
> > >       drm_printf(p, "slice total: %u, mask=%04x\n",
> > >              hweight8(sseu->slice_mask), sseu->slice_mask);
> > >       drm_printf(p, "subslice total: %u\n", 
> > > intel_sseu_subslice_total(sseu));
> > >       for (s = 0; s < sseu->max_slices; s++) {
> > > -        drm_printf(p, "slice%d: %u subslices, mask=%04x\n",
> > > +        drm_printf(p, "slice%d: %u subslices, mask=%s\n",
> > >                  s, intel_sseu_subslices_per_slice(sseu, s),
> > > -               sseu->subslice_mask[s]);
> > > +               subslice_per_slice_str(buf, sseu, s));
> > >       }
> > >       drm_printf(p, "EU total: %u\n", sseu->eu_total);
> > >       drm_printf(p, "EU per subslice: %u\n", sseu-
> > > >eu_per_subslice);
> > > @@ -118,6 +143,7 @@ void intel_device_info_dump_topology(const
> > > struct 
> > > sseu_dev_info *sseu,
> > >                        struct drm_printer *p)
> > >   {
> > >       int s, ss;
> > > +    u8 buf[SS_STR_MAX_SIZE];
> > >       if (sseu->max_slices == 0) {
> > >           drm_printf(p, "Unavailable\n");
> > > @@ -125,9 +151,9 @@ void intel_device_info_dump_topology(const
> > > struct 
> > > sseu_dev_info *sseu,
> > >       }
> > >       for (s = 0; s < sseu->max_slices; s++) {
> > > -        drm_printf(p, "slice%d: %u subslice(s) (0x%hhx):\n",
> > > +        drm_printf(p, "slice%d: %u subslice(s) (0x%s):\n",
> > >                  s, intel_sseu_subslices_per_slice(sseu, s),
> > > -               sseu->subslice_mask[s]);
> > > +               subslice_per_slice_str(buf, sseu, s));
> > >           for (ss = 0; ss < sseu->max_subslices; ss++) {
> > >               u16 enabled_eus = intel_sseu_get_eus(sseu, s, ss);
> > > @@ -156,15 +182,10 @@ static void gen11_sseu_info_init(struct 
> > > drm_i915_private *dev_priv)
> > >       u8 eu_en;
> > >       int s;
> > > -    if (IS_ELKHARTLAKE(dev_priv)) {
> > > -        sseu->max_slices = 1;
> > > -        sseu->max_subslices = 4;
> > > -        sseu->max_eus_per_subslice = 8;
> > > -    } else {
> > > -        sseu->max_slices = 1;
> > > -        sseu->max_subslices = 8;
> > > -        sseu->max_eus_per_subslice = 8;
> > > -    }
> > > +    if (IS_ELKHARTLAKE(dev_priv))
> > > +        intel_sseu_set_info(sseu, 1, 4, 8);
> > > +    else
> > > +        intel_sseu_set_info(sseu, 1, 8, 8);
> > >       s_en = I915_READ(GEN11_GT_SLICE_ENABLE) &
> > > GEN11_GT_S_ENA_MASK;
> > >       ss_en = ~I915_READ(GEN11_GT_SUBSLICE_DISABLE);
> > > @@ -177,9 +198,11 @@ static void gen11_sseu_info_init(struct 
> > > drm_i915_private *dev_priv)
> > >               int ss;
> > >               sseu->slice_mask |= BIT(s);
> > > -            sseu->subslice_mask[s] = (ss_en >> ss_idx) &
> > > ss_en_mask;
> > > +            sseu->subslice_mask[s * sseu->ss_stride] =
> > > +                (ss_en >> ss_idx) & ss_en_mask;
> > >               for (ss = 0; ss < sseu->max_subslices; ss++) {
> > > -                if (sseu->subslice_mask[s] & BIT(ss))
> > > +                if (sseu->subslice_mask[s * sseu->ss_stride] &
> > > +                    BIT(ss))
> > >                       intel_sseu_set_eus(sseu, s, ss, eu_en);
> > >               }
> > >           }
> > > @@ -201,23 +224,10 @@ static void gen10_sseu_info_init(struct 
> > > drm_i915_private *dev_priv)
> > >       const int eu_mask = 0xff;
> > >       u32 subslice_mask, eu_en;
> > > +    intel_sseu_set_info(sseu, 6, 4, 8);
> > > +
> > >       sseu->slice_mask = (fuse2 & GEN10_F2_S_ENA_MASK) >>
> > >                   GEN10_F2_S_ENA_SHIFT;
> > > -    sseu->max_slices = 6;
> > > -    sseu->max_subslices = 4;
> > > -    sseu->max_eus_per_subslice = 8;
> > > -
> > > -    subslice_mask = (1 << 4) - 1;
> > > -    subslice_mask &= ~((fuse2 & GEN10_F2_SS_DIS_MASK) >>
> > > -               GEN10_F2_SS_DIS_SHIFT);
> > > -
> > > -    /*
> > > -     * Slice0 can have up to 3 subslices, but there are only 2
> > > in
> > > -     * slice1/2.
> > > -     */
> > > -    sseu->subslice_mask[0] = subslice_mask;
> > > -    for (s = 1; s < sseu->max_slices; s++)
> > > -        sseu->subslice_mask[s] = subslice_mask & 0x3;
> > >       /* Slice0 */
> > >       eu_en = ~I915_READ(GEN8_EU_DISABLE0);
> > > @@ -242,14 +252,22 @@ static void gen10_sseu_info_init(struct 
> > > drm_i915_private *dev_priv)
> > >       eu_en = ~I915_READ(GEN10_EU_DISABLE3);
> > >       intel_sseu_set_eus(sseu, 5, 1, eu_en & eu_mask);
> > > -    /* Do a second pass where we mark the subslices disabled if
> > > all 
> > > their
> > > -     * eus are off.
> > > -     */
> > > +    subslice_mask = (1 << 4) - 1;
> > > +    subslice_mask &= ~((fuse2 & GEN10_F2_SS_DIS_MASK) >>
> > > +               GEN10_F2_SS_DIS_SHIFT);
> > > +
> > >       for (s = 0; s < sseu->max_slices; s++) {
> > >           for (ss = 0; ss < sseu->max_subslices; ss++) {
> > >               if (intel_sseu_get_eus(sseu, s, ss) == 0)
> > > -                sseu->subslice_mask[s] &= ~BIT(ss);
> > > +                subslice_mask &= ~BIT(ss);
> > >           }
> > > +
> > > +        /*
> > > +         * Slice0 can have up to 3 subslices, but there are only
> > > 2 in
> > > +         * slice1/2.
> > > +         */
> > > +        intel_sseu_set_subslices(sseu, s, s == 0 ? subslice_mask
> > > :
> > > +                               subslice_mask & 0x3);
> > >       }
> > >       sseu->eu_total = compute_eu_total(sseu);
> > > @@ -275,13 +293,12 @@ static void
> > > cherryview_sseu_info_init(struct 
> > > drm_i915_private *dev_priv)
> > >   {
> > >       struct sseu_dev_info *sseu = &RUNTIME_INFO(dev_priv)->sseu;
> > >       u32 fuse;
> > > +    u8 subslice_mask;
> > >       fuse = I915_READ(CHV_FUSE_GT);
> > >       sseu->slice_mask = BIT(0);
> > > -    sseu->max_slices = 1;
> > > -    sseu->max_subslices = 2;
> > > -    sseu->max_eus_per_subslice = 8;
> > > +    intel_sseu_set_info(sseu, 1, 2, 8);
> > >       if (!(fuse & CHV_FGT_DISABLE_SS0)) {
> > >           u8 disabled_mask =
> > > @@ -290,7 +307,7 @@ static void cherryview_sseu_info_init(struct 
> > > drm_i915_private *dev_priv)
> > >               (((fuse & CHV_FGT_EU_DIS_SS0_R1_MASK) >>
> > >                 CHV_FGT_EU_DIS_SS0_R1_SHIFT) << 4);
> > > -        sseu->subslice_mask[0] |= BIT(0);
> > > +        subslice_mask |= BIT(0);
> > >           intel_sseu_set_eus(sseu, 0, 0, ~disabled_mask);
> > >       }
> > > @@ -301,10 +318,12 @@ static void
> > > cherryview_sseu_info_init(struct 
> > > drm_i915_private *dev_priv)
> > >               (((fuse & CHV_FGT_EU_DIS_SS1_R1_MASK) >>
> > >                 CHV_FGT_EU_DIS_SS1_R1_SHIFT) << 4);
> > > -        sseu->subslice_mask[0] |= BIT(1);
> > > +        subslice_mask |= BIT(1);
> > >           intel_sseu_set_eus(sseu, 0, 1, ~disabled_mask);
> > >       }
> > > +    intel_sseu_set_subslices(sseu, 0, subslice_mask);
> > > +
> > >       sseu->eu_total = compute_eu_total(sseu);
> > >       /*
> > > @@ -312,7 +331,8 @@ static void cherryview_sseu_info_init(struct 
> > > drm_i915_private *dev_priv)
> > >        * across subslices.
> > >       */
> > >       sseu->eu_per_subslice = intel_sseu_subslice_total(sseu) ?
> > > -                sseu->eu_total / intel_sseu_subslice_total(sseu)
> > > :
> > > +                sseu->eu_total /
> > > +                    intel_sseu_subslice_total(sseu) :
> > >                   0;
> > >       /*
> > >        * CHV supports subslice power gating on devices with more
> > > than
> > > @@ -336,9 +356,8 @@ static void gen9_sseu_info_init(struct 
> > > drm_i915_private *dev_priv)
> > >       sseu->slice_mask = (fuse2 & GEN8_F2_S_ENA_MASK) >> 
> > > GEN8_F2_S_ENA_SHIFT;
> > >       /* BXT has a single slice and at most 3 subslices. */
> > > -    sseu->max_slices = IS_GEN9_LP(dev_priv) ? 1 : 3;
> > > -    sseu->max_subslices = IS_GEN9_LP(dev_priv) ? 3 : 4;
> > > -    sseu->max_eus_per_subslice = 8;
> > > +    intel_sseu_set_info(sseu, IS_GEN9_LP(dev_priv) ? 1 : 3,
> > > +                IS_GEN9_LP(dev_priv) ? 3 : 4, 8);
> > >       /*
> > >        * The subslice disable field is global, i.e. it applies
> > > @@ -357,14 +376,16 @@ static void gen9_sseu_info_init(struct 
> > > drm_i915_private *dev_priv)
> > >               /* skip disabled slice */
> > >               continue;
> > > -        sseu->subslice_mask[s] = subslice_mask;
> > > +        intel_sseu_set_subslices(sseu, s, subslice_mask);
> > >           eu_disable = I915_READ(GEN9_EU_DISABLE(s));
> > >           for (ss = 0; ss < sseu->max_subslices; ss++) {
> > >               int eu_per_ss;
> > >               u8 eu_disabled_mask;
> > > +            u8 ss_idx = s * sseu->ss_stride + ss /
> > > BITS_PER_BYTE;
> > > -            if (!(sseu->subslice_mask[s] & BIT(ss)))
> > > +            if (!(sseu->subslice_mask[ss_idx] &
> > > +                  BIT(ss % BITS_PER_BYTE)))
> > >                   /* skip disabled subslice */
> > >                   continue;
> > > @@ -437,9 +458,7 @@ static void broadwell_sseu_info_init(struct 
> > > drm_i915_private *dev_priv)
> > >       fuse2 = I915_READ(GEN8_FUSE2);
> > >       sseu->slice_mask = (fuse2 & GEN8_F2_S_ENA_MASK) >> 
> > > GEN8_F2_S_ENA_SHIFT;
> > > -    sseu->max_slices = 3;
> > > -    sseu->max_subslices = 3;
> > > -    sseu->max_eus_per_subslice = 8;
> > > +    intel_sseu_set_info(sseu, 3, 3, 8);
> > >       /*
> > >        * The subslice disable field is global, i.e. it applies
> > > @@ -466,18 +485,21 @@ static void
> > > broadwell_sseu_info_init(struct 
> > > drm_i915_private *dev_priv)
> > >               /* skip disabled slice */
> > >               continue;
> > > -        sseu->subslice_mask[s] = subslice_mask;
> > > +        intel_sseu_set_subslices(sseu, s, subslice_mask);
> > >           for (ss = 0; ss < sseu->max_subslices; ss++) {
> > >               u8 eu_disabled_mask;
> > > +            u8 ss_idx = s * sseu->ss_stride + ss /
> > > BITS_PER_BYTE;
> > >               u32 n_disabled;
> > > -            if (!(sseu->subslice_mask[s] & BIT(ss)))
> > > +            if (!(sseu->subslice_mask[ss_idx] &
> > > +                  BIT(ss % BITS_PER_BYTE)))
> > >                   /* skip disabled subslice */
> > >                   continue;
> > >               eu_disabled_mask =
> > > -                eu_disable[s] >> (ss * sseu-
> > > >max_eus_per_subslice);
> > > +                eu_disable[s] >>
> > > +                    (ss * sseu->max_eus_per_subslice);
> > >               intel_sseu_set_eus(sseu, s, ss, ~eu_disabled_mask);
> > > @@ -517,6 +539,7 @@ static void haswell_sseu_info_init(struct 
> > > drm_i915_private *dev_priv)
> > >       struct sseu_dev_info *sseu = &RUNTIME_INFO(dev_priv)->sseu;
> > >       u32 fuse1;
> > >       int s, ss;
> > > +    u32 subslice_mask;
> > >       /*
> > >        * There isn't a register to tell us how many
> > > slices/subslices. We
> > > @@ -528,22 +551,18 @@ static void haswell_sseu_info_init(struct 
> > > drm_i915_private *dev_priv)
> > >           /* fall through */
> > >       case 1:
> > >           sseu->slice_mask = BIT(0);
> > > -        sseu->subslice_mask[0] = BIT(0);
> > > +        subslice_mask = BIT(0);
> > >           break;
> > >       case 2:
> > >           sseu->slice_mask = BIT(0);
> > > -        sseu->subslice_mask[0] = BIT(0) | BIT(1);
> > > +        subslice_mask = BIT(0) | BIT(1);
> > >           break;
> > >       case 3:
> > >           sseu->slice_mask = BIT(0) | BIT(1);
> > > -        sseu->subslice_mask[0] = BIT(0) | BIT(1);
> > > -        sseu->subslice_mask[1] = BIT(0) | BIT(1);
> > > +        subslice_mask = BIT(0) | BIT(1);
> > >           break;
> > >       }
> > > -    sseu->max_slices = hweight8(sseu->slice_mask);
> > > -    sseu->max_subslices = hweight8(sseu->subslice_mask[0]);
> > > -
> > >       fuse1 = I915_READ(HSW_PAVP_FUSE1);
> > >       switch ((fuse1 & HSW_F1_EU_DIS_MASK) >>
> > > HSW_F1_EU_DIS_SHIFT) {
> > >       default:
> > > @@ -560,9 +579,14 @@ static void haswell_sseu_info_init(struct 
> > > drm_i915_private *dev_priv)
> > >           sseu->eu_per_subslice = 6;
> > >           break;
> > >       }
> > > -    sseu->max_eus_per_subslice = sseu->eu_per_subslice;
> > > +
> > > +    intel_sseu_set_info(sseu, hweight8(sseu->slice_mask),
> > > +                hweight8(subslice_mask),
> > > +                sseu->eu_per_subslice);
> > >       for (s = 0; s < sseu->max_slices; s++) {
> > > +        intel_sseu_set_subslices(sseu, s, subslice_mask);
> > > +
> > >           for (ss = 0; ss < sseu->max_subslices; ss++) {
> > >               intel_sseu_set_eus(sseu, s, ss,
> > >                          (1UL << sseu->eu_per_subslice) - 1);
> > > 

[-- Attachment #1.2: smime.p7s --]
[-- Type: application/x-pkcs7-signature, Size: 3270 bytes --]

[-- Attachment #2: Type: text/plain, Size: 159 bytes --]

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 5/6] drm/i915: Remove inline from sseu helper functions
  2019-05-01 15:34 ` [PATCH 5/6] drm/i915: Remove inline from sseu helper functions Stuart Summers
@ 2019-05-01 20:04   ` Daniele Ceraolo Spurio
  2019-05-01 21:04     ` Summers, Stuart
  0 siblings, 1 reply; 35+ messages in thread
From: Daniele Ceraolo Spurio @ 2019-05-01 20:04 UTC (permalink / raw)
  To: Stuart Summers, intel-gfx

Can you elaborate a bit more on what's the rationale for this? do you 
just want to avoid having too many inlines since the paths they're used 
in are not critical, or do you have some more functional reason? This is 
not a critic to the patch, I just want to understand where you're coming 
from ;)

BTW, looking at this patch I realized there are a few more 
DIV_ROUND_UP(..., BITS_PER_BYTE) that could be converted to 
GEN_SSEU_STRIDE() in patch 2. I noticed you update them to a new 
variable in the next patch, but for consistency it might still be worth 
updating them all in patch 2 or at least mention in the commit message 
of patch 2 that the remaining cases are updated by a follow-up patch in 
the series. Patch 2 is quite small, so you could also just squash it 
into patch 6 to avoid the split.

Daniele

On 5/1/19 8:34 AM, Stuart Summers wrote:
> Additionally, ensure these are all prefixed with intel_sseu_*
> to match the convention of other functions in i915.
> 
> Signed-off-by: Stuart Summers <stuart.summers@intel.com>
> ---
>   drivers/gpu/drm/i915/gt/intel_sseu.c     | 54 +++++++++++++++++++
>   drivers/gpu/drm/i915/gt/intel_sseu.h     | 57 +++-----------------
>   drivers/gpu/drm/i915/i915_debugfs.c      |  6 +--
>   drivers/gpu/drm/i915/i915_drv.c          |  2 +-
>   drivers/gpu/drm/i915/intel_device_info.c | 69 ++++++++++++------------
>   5 files changed, 102 insertions(+), 86 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_sseu.c b/drivers/gpu/drm/i915/gt/intel_sseu.c
> index 7f448f3bea0b..4a0b82fc108c 100644
> --- a/drivers/gpu/drm/i915/gt/intel_sseu.c
> +++ b/drivers/gpu/drm/i915/gt/intel_sseu.c
> @@ -8,6 +8,60 @@
>   #include "intel_lrc_reg.h"
>   #include "intel_sseu.h"
>   
> +unsigned int
> +intel_sseu_subslice_total(const struct sseu_dev_info *sseu)
> +{
> +	unsigned int i, total = 0;
> +
> +	for (i = 0; i < ARRAY_SIZE(sseu->subslice_mask); i++)
> +		total += hweight8(sseu->subslice_mask[i]);
> +
> +	return total;
> +}
> +
> +unsigned int
> +intel_sseu_subslices_per_slice(const struct sseu_dev_info *sseu, u8 slice)
> +{
> +	return hweight8(sseu->subslice_mask[slice]);
> +}
> +
> +static int intel_sseu_eu_idx(const struct sseu_dev_info *sseu, int slice,
> +			     int subslice)
> +{
> +	int subslice_stride = DIV_ROUND_UP(sseu->max_eus_per_subslice,
> +					   BITS_PER_BYTE);
> +	int slice_stride = sseu->max_subslices * subslice_stride;
> +
> +	return slice * slice_stride + subslice * subslice_stride;
> +}
> +
> +u16 intel_sseu_get_eus(const struct sseu_dev_info *sseu, int slice,
> +		       int subslice)
> +{
> +	int i, offset = intel_sseu_eu_idx(sseu, slice, subslice);
> +	u16 eu_mask = 0;
> +
> +	for (i = 0;
> +	     i < DIV_ROUND_UP(sseu->max_eus_per_subslice, BITS_PER_BYTE); i++) {
> +		eu_mask |= ((u16)sseu->eu_mask[offset + i]) <<
> +			(i * BITS_PER_BYTE);
> +	}
> +
> +	return eu_mask;
> +}
> +
> +void intel_sseu_set_eus(struct sseu_dev_info *sseu, int slice, int subslice,
> +			u16 eu_mask)
> +{
> +	int i, offset = intel_sseu_eu_idx(sseu, slice, subslice);
> +
> +	for (i = 0;
> +	     i < DIV_ROUND_UP(sseu->max_eus_per_subslice, BITS_PER_BYTE); i++) {
> +		sseu->eu_mask[offset + i] =
> +			(eu_mask >> (BITS_PER_BYTE * i)) & 0xff;
> +	}
> +}
> +
>   u32 intel_sseu_make_rpcs(struct drm_i915_private *i915,
>   			 const struct intel_sseu *req_sseu)
>   {
> diff --git a/drivers/gpu/drm/i915/gt/intel_sseu.h b/drivers/gpu/drm/i915/gt/intel_sseu.h
> index 029e71d8f140..56e3721ae83f 100644
> --- a/drivers/gpu/drm/i915/gt/intel_sseu.h
> +++ b/drivers/gpu/drm/i915/gt/intel_sseu.h
> @@ -63,58 +63,17 @@ intel_sseu_from_device_info(const struct sseu_dev_info *sseu)
>   	return value;
>   }
>   
> -static inline unsigned int sseu_subslice_total(const struct sseu_dev_info *sseu)
> -{
> -	unsigned int i, total = 0;
> -
> -	for (i = 0; i < ARRAY_SIZE(sseu->subslice_mask); i++)
> -		total += hweight8(sseu->subslice_mask[i]);
> +unsigned int
> +intel_sseu_subslice_total(const struct sseu_dev_info *sseu);
>   
> -	return total;
> -}
> +unsigned int
> +intel_sseu_subslices_per_slice(const struct sseu_dev_info *sseu, u8 slice);
>   
> -static inline unsigned int
> -sseu_subslices_per_slice(const struct sseu_dev_info *sseu, u8 slice)
> -{
> -	return hweight8(sseu->subslice_mask[slice]);
> -}
> -
> -static inline int sseu_eu_idx(const struct sseu_dev_info *sseu,
> -			      int slice, int subslice)
> -{
> -	int subslice_stride = DIV_ROUND_UP(sseu->max_eus_per_subslice,
> -					   BITS_PER_BYTE);
> -	int slice_stride = sseu->max_subslices * subslice_stride;
> -
> -	return slice * slice_stride + subslice * subslice_stride;
> -}
> +u16 intel_sseu_get_eus(const struct sseu_dev_info *sseu, int slice,
> +		       int subslice);
>   
> -static inline u16 sseu_get_eus(const struct sseu_dev_info *sseu,
> -			       int slice, int subslice)
> -{
> -	int i, offset = sseu_eu_idx(sseu, slice, subslice);
> -	u16 eu_mask = 0;
> -
> -	for (i = 0;
> -	     i < DIV_ROUND_UP(sseu->max_eus_per_subslice, BITS_PER_BYTE); i++) {
> -		eu_mask |= ((u16)sseu->eu_mask[offset + i]) <<
> -			(i * BITS_PER_BYTE);
> -	}
> -
> -	return eu_mask;
> -}
> -
> -static inline void sseu_set_eus(struct sseu_dev_info *sseu,
> -				int slice, int subslice, u16 eu_mask)
> -{
> -	int i, offset = sseu_eu_idx(sseu, slice, subslice);
> -
> -	for (i = 0;
> -	     i < DIV_ROUND_UP(sseu->max_eus_per_subslice, BITS_PER_BYTE); i++) {
> -		sseu->eu_mask[offset + i] =
> -			(eu_mask >> (BITS_PER_BYTE * i)) & 0xff;
> -	}
> -}
> +void intel_sseu_set_eus(struct sseu_dev_info *sseu, int slice, int subslice,
> +			u16 eu_mask);
>   
>   u32 intel_sseu_make_rpcs(struct drm_i915_private *i915,
>   			 const struct intel_sseu *req_sseu);
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> index fe854c629a32..3f3ee83ac315 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -4158,7 +4158,7 @@ static void broadwell_sseu_device_status(struct drm_i915_private *dev_priv,
>   				RUNTIME_INFO(dev_priv)->sseu.subslice_mask[s];
>   		}
>   		sseu->eu_total = sseu->eu_per_subslice *
> -				 sseu_subslice_total(sseu);
> +				 intel_sseu_subslice_total(sseu);
>   
>   		/* subtract fused off EU(s) from enabled slice(s) */
>   		for (s = 0; s < fls(sseu->slice_mask); s++) {
> @@ -4182,10 +4182,10 @@ static void i915_print_sseu_info(struct seq_file *m, bool is_available_info,
>   	seq_printf(m, "  %s Slice Total: %u\n", type,
>   		   hweight8(sseu->slice_mask));
>   	seq_printf(m, "  %s Subslice Total: %u\n", type,
> -		   sseu_subslice_total(sseu));
> +		   intel_sseu_subslice_total(sseu));
>   	for (s = 0; s < fls(sseu->slice_mask); s++) {
>   		seq_printf(m, "  %s Slice%i subslices: %u\n", type,
> -			   s, sseu_subslices_per_slice(sseu, s));
> +			   s, intel_sseu_subslices_per_slice(sseu, s));
>   	}
>   	seq_printf(m, "  %s EU Total: %u\n", type,
>   		   sseu->eu_total);
> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> index c376244c19c4..130c5140db0d 100644
> --- a/drivers/gpu/drm/i915/i915_drv.c
> +++ b/drivers/gpu/drm/i915/i915_drv.c
> @@ -378,7 +378,7 @@ static int i915_getparam_ioctl(struct drm_device *dev, void *data,
>   		value = i915_cmd_parser_get_version(dev_priv);
>   		break;
>   	case I915_PARAM_SUBSLICE_TOTAL:
> -		value = sseu_subslice_total(sseu);
> +		value = intel_sseu_subslice_total(sseu);
>   		if (!value)
>   			return -ENODEV;
>   		break;
> diff --git a/drivers/gpu/drm/i915/intel_device_info.c b/drivers/gpu/drm/i915/intel_device_info.c
> index 559cf0d0628e..e1dbccf04cd9 100644
> --- a/drivers/gpu/drm/i915/intel_device_info.c
> +++ b/drivers/gpu/drm/i915/intel_device_info.c
> @@ -90,10 +90,10 @@ static void sseu_dump(const struct sseu_dev_info *sseu, struct drm_printer *p)
>   
>   	drm_printf(p, "slice total: %u, mask=%04x\n",
>   		   hweight8(sseu->slice_mask), sseu->slice_mask);
> -	drm_printf(p, "subslice total: %u\n", sseu_subslice_total(sseu));
> +	drm_printf(p, "subslice total: %u\n", intel_sseu_subslice_total(sseu));
>   	for (s = 0; s < sseu->max_slices; s++) {
>   		drm_printf(p, "slice%d: %u subslices, mask=%04x\n",
> -			   s, sseu_subslices_per_slice(sseu, s),
> +			   s, intel_sseu_subslices_per_slice(sseu, s),
>   			   sseu->subslice_mask[s]);
>   	}
>   	drm_printf(p, "EU total: %u\n", sseu->eu_total);
> @@ -126,11 +126,11 @@ void intel_device_info_dump_topology(const struct sseu_dev_info *sseu,
>   
>   	for (s = 0; s < sseu->max_slices; s++) {
>   		drm_printf(p, "slice%d: %u subslice(s) (0x%hhx):\n",
> -			   s, sseu_subslices_per_slice(sseu, s),
> +			   s, intel_sseu_subslices_per_slice(sseu, s),
>   			   sseu->subslice_mask[s]);
>   
>   		for (ss = 0; ss < sseu->max_subslices; ss++) {
> -			u16 enabled_eus = sseu_get_eus(sseu, s, ss);
> +			u16 enabled_eus = intel_sseu_get_eus(sseu, s, ss);
>   
>   			drm_printf(p, "\tsubslice%d: %u EUs (0x%hx)\n",
>   				   ss, hweight16(enabled_eus), enabled_eus);
> @@ -180,7 +180,7 @@ static void gen11_sseu_info_init(struct drm_i915_private *dev_priv)
>   			sseu->subslice_mask[s] = (ss_en >> ss_idx) & ss_en_mask;
>   			for (ss = 0; ss < sseu->max_subslices; ss++) {
>   				if (sseu->subslice_mask[s] & BIT(ss))
> -					sseu_set_eus(sseu, s, ss, eu_en);
> +					intel_sseu_set_eus(sseu, s, ss, eu_en);
>   			}
>   		}
>   	}
> @@ -222,32 +222,32 @@ static void gen10_sseu_info_init(struct drm_i915_private *dev_priv)
>   	/* Slice0 */
>   	eu_en = ~I915_READ(GEN8_EU_DISABLE0);
>   	for (ss = 0; ss < sseu->max_subslices; ss++)
> -		sseu_set_eus(sseu, 0, ss, (eu_en >> (8 * ss)) & eu_mask);
> +		intel_sseu_set_eus(sseu, 0, ss, (eu_en >> (8 * ss)) & eu_mask);
>   	/* Slice1 */
> -	sseu_set_eus(sseu, 1, 0, (eu_en >> 24) & eu_mask);
> +	intel_sseu_set_eus(sseu, 1, 0, (eu_en >> 24) & eu_mask);
>   	eu_en = ~I915_READ(GEN8_EU_DISABLE1);
> -	sseu_set_eus(sseu, 1, 1, eu_en & eu_mask);
> +	intel_sseu_set_eus(sseu, 1, 1, eu_en & eu_mask);
>   	/* Slice2 */
> -	sseu_set_eus(sseu, 2, 0, (eu_en >> 8) & eu_mask);
> -	sseu_set_eus(sseu, 2, 1, (eu_en >> 16) & eu_mask);
> +	intel_sseu_set_eus(sseu, 2, 0, (eu_en >> 8) & eu_mask);
> +	intel_sseu_set_eus(sseu, 2, 1, (eu_en >> 16) & eu_mask);
>   	/* Slice3 */
> -	sseu_set_eus(sseu, 3, 0, (eu_en >> 24) & eu_mask);
> +	intel_sseu_set_eus(sseu, 3, 0, (eu_en >> 24) & eu_mask);
>   	eu_en = ~I915_READ(GEN8_EU_DISABLE2);
> -	sseu_set_eus(sseu, 3, 1, eu_en & eu_mask);
> +	intel_sseu_set_eus(sseu, 3, 1, eu_en & eu_mask);
>   	/* Slice4 */
> -	sseu_set_eus(sseu, 4, 0, (eu_en >> 8) & eu_mask);
> -	sseu_set_eus(sseu, 4, 1, (eu_en >> 16) & eu_mask);
> +	intel_sseu_set_eus(sseu, 4, 0, (eu_en >> 8) & eu_mask);
> +	intel_sseu_set_eus(sseu, 4, 1, (eu_en >> 16) & eu_mask);
>   	/* Slice5 */
> -	sseu_set_eus(sseu, 5, 0, (eu_en >> 24) & eu_mask);
> +	intel_sseu_set_eus(sseu, 5, 0, (eu_en >> 24) & eu_mask);
>   	eu_en = ~I915_READ(GEN10_EU_DISABLE3);
> -	sseu_set_eus(sseu, 5, 1, eu_en & eu_mask);
> +	intel_sseu_set_eus(sseu, 5, 1, eu_en & eu_mask);
>   
>   	/* Do a second pass where we mark the subslices disabled if all their
>   	 * eus are off.
>   	 */
>   	for (s = 0; s < sseu->max_slices; s++) {
>   		for (ss = 0; ss < sseu->max_subslices; ss++) {
> -			if (sseu_get_eus(sseu, s, ss) == 0)
> +			if (intel_sseu_get_eus(sseu, s, ss) == 0)
>   				sseu->subslice_mask[s] &= ~BIT(ss);
>   		}
>   	}
> @@ -260,9 +260,10 @@ static void gen10_sseu_info_init(struct drm_i915_private *dev_priv)
>   	 * EU in any one subslice may be fused off for die
>   	 * recovery.
>   	 */
> -	sseu->eu_per_subslice = sseu_subslice_total(sseu) ?
> +	sseu->eu_per_subslice = intel_sseu_subslice_total(sseu) ?
>   				DIV_ROUND_UP(sseu->eu_total,
> -					     sseu_subslice_total(sseu)) : 0;
> +					     intel_sseu_subslice_total(sseu)) :
> +				0;
>   
>   	/* No restrictions on Power Gating */
>   	sseu->has_slice_pg = 1;
> @@ -290,7 +291,7 @@ static void cherryview_sseu_info_init(struct drm_i915_private *dev_priv)
>   			  CHV_FGT_EU_DIS_SS0_R1_SHIFT) << 4);
>   
>   		sseu->subslice_mask[0] |= BIT(0);
> -		sseu_set_eus(sseu, 0, 0, ~disabled_mask);
> +		intel_sseu_set_eus(sseu, 0, 0, ~disabled_mask);
>   	}
>   
>   	if (!(fuse & CHV_FGT_DISABLE_SS1)) {
> @@ -301,7 +302,7 @@ static void cherryview_sseu_info_init(struct drm_i915_private *dev_priv)
>   			  CHV_FGT_EU_DIS_SS1_R1_SHIFT) << 4);
>   
>   		sseu->subslice_mask[0] |= BIT(1);
> -		sseu_set_eus(sseu, 0, 1, ~disabled_mask);
> +		intel_sseu_set_eus(sseu, 0, 1, ~disabled_mask);
>   	}
>   
>   	sseu->eu_total = compute_eu_total(sseu);
> @@ -310,8 +311,8 @@ static void cherryview_sseu_info_init(struct drm_i915_private *dev_priv)
>   	 * CHV expected to always have a uniform distribution of EU
>   	 * across subslices.
>   	*/
> -	sseu->eu_per_subslice = sseu_subslice_total(sseu) ?
> -				sseu->eu_total / sseu_subslice_total(sseu) :
> +	sseu->eu_per_subslice = intel_sseu_subslice_total(sseu) ?
> +				sseu->eu_total / intel_sseu_subslice_total(sseu) :
>   				0;
>   	/*
>   	 * CHV supports subslice power gating on devices with more than
> @@ -319,7 +320,7 @@ static void cherryview_sseu_info_init(struct drm_i915_private *dev_priv)
>   	 * more than one EU pair per subslice.
>   	*/
>   	sseu->has_slice_pg = 0;
> -	sseu->has_subslice_pg = sseu_subslice_total(sseu) > 1;
> +	sseu->has_subslice_pg = intel_sseu_subslice_total(sseu) > 1;
>   	sseu->has_eu_pg = (sseu->eu_per_subslice > 2);
>   }
>   
> @@ -369,7 +370,7 @@ static void gen9_sseu_info_init(struct drm_i915_private *dev_priv)
>   
>   			eu_disabled_mask = (eu_disable >> (ss * 8)) & eu_mask;
>   
> -			sseu_set_eus(sseu, s, ss, ~eu_disabled_mask);
> +			intel_sseu_set_eus(sseu, s, ss, ~eu_disabled_mask);
>   
>   			eu_per_ss = sseu->max_eus_per_subslice -
>   				hweight8(eu_disabled_mask);
> @@ -393,9 +394,10 @@ static void gen9_sseu_info_init(struct drm_i915_private *dev_priv)
>   	 * recovery. BXT is expected to be perfectly uniform in EU
>   	 * distribution.
>   	*/
> -	sseu->eu_per_subslice = sseu_subslice_total(sseu) ?
> +	sseu->eu_per_subslice = intel_sseu_subslice_total(sseu) ?
>   				DIV_ROUND_UP(sseu->eu_total,
> -					     sseu_subslice_total(sseu)) : 0;
> +					     intel_sseu_subslice_total(sseu)) :
> +				0;
>   	/*
>   	 * SKL+ supports slice power gating on devices with more than
>   	 * one slice, and supports EU power gating on devices with
> @@ -407,7 +409,7 @@ static void gen9_sseu_info_init(struct drm_i915_private *dev_priv)
>   	sseu->has_slice_pg =
>   		!IS_GEN9_LP(dev_priv) && hweight8(sseu->slice_mask) > 1;
>   	sseu->has_subslice_pg =
> -		IS_GEN9_LP(dev_priv) && sseu_subslice_total(sseu) > 1;
> +		IS_GEN9_LP(dev_priv) && intel_sseu_subslice_total(sseu) > 1;
>   	sseu->has_eu_pg = sseu->eu_per_subslice > 2;
>   
>   	if (IS_GEN9_LP(dev_priv)) {
> @@ -477,7 +479,7 @@ static void broadwell_sseu_info_init(struct drm_i915_private *dev_priv)
>   			eu_disabled_mask =
>   				eu_disable[s] >> (ss * sseu->max_eus_per_subslice);
>   
> -			sseu_set_eus(sseu, s, ss, ~eu_disabled_mask);
> +			intel_sseu_set_eus(sseu, s, ss, ~eu_disabled_mask);
>   
>   			n_disabled = hweight8(eu_disabled_mask);
>   
> @@ -496,9 +498,10 @@ static void broadwell_sseu_info_init(struct drm_i915_private *dev_priv)
>   	 * subslices with the exception that any one EU in any one subslice may
>   	 * be fused off for die recovery.
>   	 */
> -	sseu->eu_per_subslice = sseu_subslice_total(sseu) ?
> +	sseu->eu_per_subslice = intel_sseu_subslice_total(sseu) ?
>   				DIV_ROUND_UP(sseu->eu_total,
> -					     sseu_subslice_total(sseu)) : 0;
> +					     intel_sseu_subslice_total(sseu)) :
> +				0;
>   
>   	/*
>   	 * BDW supports slice power gating on devices with more than
> @@ -561,8 +564,8 @@ static void haswell_sseu_info_init(struct drm_i915_private *dev_priv)
>   
>   	for (s = 0; s < sseu->max_slices; s++) {
>   		for (ss = 0; ss < sseu->max_subslices; ss++) {
> -			sseu_set_eus(sseu, s, ss,
> -				     (1UL << sseu->eu_per_subslice) - 1);
> +			intel_sseu_set_eus(sseu, s, ss,
> +					   (1UL << sseu->eu_per_subslice) - 1);
>   		}
>   	}
>   
> 
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 5/6] drm/i915: Remove inline from sseu helper functions
  2019-05-01 20:04   ` Daniele Ceraolo Spurio
@ 2019-05-01 21:04     ` Summers, Stuart
  2019-05-01 21:19       ` Daniele Ceraolo Spurio
  0 siblings, 1 reply; 35+ messages in thread
From: Summers, Stuart @ 2019-05-01 21:04 UTC (permalink / raw)
  To: Ceraolo Spurio, Daniele, intel-gfx


[-- Attachment #1.1: Type: text/plain, Size: 18410 bytes --]

On Wed, 2019-05-01 at 13:04 -0700, Daniele Ceraolo Spurio wrote:
> Can you elaborate a bit more on what's the rationale for this? do
> you 
> just want to avoid having too many inlines since the paths they're
> used 
> in are not critical, or do you have some more functional reason? This
> is 
> not a critic to the patch, I just want to understand where you're
> coming 
> from ;)

This was a request from Jani Nikula in a previous series update. I
don't have a strong preference either way personally. If you don't have
any major concerns, I'd prefer to keep the series as-is to prevent too
much thrash here, but let me know.

> 
> BTW, looking at this patch I realized there are a few more 
> DIV_ROUND_UP(..., BITS_PER_BYTE) that could be converted to 
> GEN_SSEU_STRIDE() in patch 2. I noticed you update them to a new 
> variable in the next patch, but for consistency it might still be
> worth 
> updating them all in patch 2 or at least mention in the commit
> message 
> of patch 2 that the remaining cases are updated by a follow-up patch
> in 
> the series. Patch 2 is quite small, so you could also just squash it 
> into patch 6 to avoid the split.

I'm happy to squash them. I did try to isolate this a bit, but you're
right that I ended up pushing some of these DIV_ROUND_UP... stride
calculations to the last patch in the series. If you don't have any
objection, to keep the finaly patch a bit simpler, I'd rather pull
those changes into the earlier patch. I realize you already have a RB
on that patch. Any issues doing this?

Thanks,
Stuart

> 
> Daniele
> 
> On 5/1/19 8:34 AM, Stuart Summers wrote:
> > Additionally, ensure these are all prefixed with intel_sseu_*
> > to match the convention of other functions in i915.
> > 
> > Signed-off-by: Stuart Summers <stuart.summers@intel.com>
> > ---
> >   drivers/gpu/drm/i915/gt/intel_sseu.c     | 54 +++++++++++++++++++
> >   drivers/gpu/drm/i915/gt/intel_sseu.h     | 57 +++--------------
> > ---
> >   drivers/gpu/drm/i915/i915_debugfs.c      |  6 +--
> >   drivers/gpu/drm/i915/i915_drv.c          |  2 +-
> >   drivers/gpu/drm/i915/intel_device_info.c | 69 ++++++++++++-------
> > -----
> >   5 files changed, 102 insertions(+), 86 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/gt/intel_sseu.c
> > b/drivers/gpu/drm/i915/gt/intel_sseu.c
> > index 7f448f3bea0b..4a0b82fc108c 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_sseu.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_sseu.c
> > @@ -8,6 +8,60 @@
> >   #include "intel_lrc_reg.h"
> >   #include "intel_sseu.h"
> >   
> > +unsigned int
> > +intel_sseu_subslice_total(const struct sseu_dev_info *sseu)
> > +{
> > +	unsigned int i, total = 0;
> > +
> > +	for (i = 0; i < ARRAY_SIZE(sseu->subslice_mask); i++)
> > +		total += hweight8(sseu->subslice_mask[i]);
> > +
> > +	return total;
> > +}
> > +
> > +unsigned int
> > +intel_sseu_subslices_per_slice(const struct sseu_dev_info *sseu,
> > u8 slice)
> > +{
> > +	return hweight8(sseu->subslice_mask[slice]);
> > +}
> > +
> > +static int intel_sseu_eu_idx(const struct sseu_dev_info *sseu, int
> > slice,
> > +			     int subslice)
> > +{
> > +	int subslice_stride = DIV_ROUND_UP(sseu->max_eus_per_subslice,
> > +					   BITS_PER_BYTE);
> > +	int slice_stride = sseu->max_subslices * subslice_stride;
> > +
> > +	return slice * slice_stride + subslice * subslice_stride;
> > +}
> > +
> > +u16 intel_sseu_get_eus(const struct sseu_dev_info *sseu, int
> > slice,
> > +		       int subslice)
> > +{
> > +	int i, offset = intel_sseu_eu_idx(sseu, slice, subslice);
> > +	u16 eu_mask = 0;
> > +
> > +	for (i = 0;
> > +	     i < DIV_ROUND_UP(sseu->max_eus_per_subslice,
> > BITS_PER_BYTE); i++) {
> > +		eu_mask |= ((u16)sseu->eu_mask[offset + i]) <<
> > +			(i * BITS_PER_BYTE);
> > +	}
> > +
> > +	return eu_mask;
> > +}
> > +
> > +void intel_sseu_set_eus(struct sseu_dev_info *sseu, int slice, int
> > subslice,
> > +			u16 eu_mask)
> > +{
> > +	int i, offset = intel_sseu_eu_idx(sseu, slice, subslice);
> > +
> > +	for (i = 0;
> > +	     i < DIV_ROUND_UP(sseu->max_eus_per_subslice,
> > BITS_PER_BYTE); i++) {
> > +		sseu->eu_mask[offset + i] =
> > +			(eu_mask >> (BITS_PER_BYTE * i)) & 0xff;
> > +	}
> > +}
> > +
> >   u32 intel_sseu_make_rpcs(struct drm_i915_private *i915,
> >   			 const struct intel_sseu *req_sseu)
> >   {
> > diff --git a/drivers/gpu/drm/i915/gt/intel_sseu.h
> > b/drivers/gpu/drm/i915/gt/intel_sseu.h
> > index 029e71d8f140..56e3721ae83f 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_sseu.h
> > +++ b/drivers/gpu/drm/i915/gt/intel_sseu.h
> > @@ -63,58 +63,17 @@ intel_sseu_from_device_info(const struct
> > sseu_dev_info *sseu)
> >   	return value;
> >   }
> >   
> > -static inline unsigned int sseu_subslice_total(const struct
> > sseu_dev_info *sseu)
> > -{
> > -	unsigned int i, total = 0;
> > -
> > -	for (i = 0; i < ARRAY_SIZE(sseu->subslice_mask); i++)
> > -		total += hweight8(sseu->subslice_mask[i]);
> > +unsigned int
> > +intel_sseu_subslice_total(const struct sseu_dev_info *sseu);
> >   
> > -	return total;
> > -}
> > +unsigned int
> > +intel_sseu_subslices_per_slice(const struct sseu_dev_info *sseu,
> > u8 slice);
> >   
> > -static inline unsigned int
> > -sseu_subslices_per_slice(const struct sseu_dev_info *sseu, u8
> > slice)
> > -{
> > -	return hweight8(sseu->subslice_mask[slice]);
> > -}
> > -
> > -static inline int sseu_eu_idx(const struct sseu_dev_info *sseu,
> > -			      int slice, int subslice)
> > -{
> > -	int subslice_stride = DIV_ROUND_UP(sseu->max_eus_per_subslice,
> > -					   BITS_PER_BYTE);
> > -	int slice_stride = sseu->max_subslices * subslice_stride;
> > -
> > -	return slice * slice_stride + subslice * subslice_stride;
> > -}
> > +u16 intel_sseu_get_eus(const struct sseu_dev_info *sseu, int
> > slice,
> > +		       int subslice);
> >   
> > -static inline u16 sseu_get_eus(const struct sseu_dev_info *sseu,
> > -			       int slice, int subslice)
> > -{
> > -	int i, offset = sseu_eu_idx(sseu, slice, subslice);
> > -	u16 eu_mask = 0;
> > -
> > -	for (i = 0;
> > -	     i < DIV_ROUND_UP(sseu->max_eus_per_subslice,
> > BITS_PER_BYTE); i++) {
> > -		eu_mask |= ((u16)sseu->eu_mask[offset + i]) <<
> > -			(i * BITS_PER_BYTE);
> > -	}
> > -
> > -	return eu_mask;
> > -}
> > -
> > -static inline void sseu_set_eus(struct sseu_dev_info *sseu,
> > -				int slice, int subslice, u16 eu_mask)
> > -{
> > -	int i, offset = sseu_eu_idx(sseu, slice, subslice);
> > -
> > -	for (i = 0;
> > -	     i < DIV_ROUND_UP(sseu->max_eus_per_subslice,
> > BITS_PER_BYTE); i++) {
> > -		sseu->eu_mask[offset + i] =
> > -			(eu_mask >> (BITS_PER_BYTE * i)) & 0xff;
> > -	}
> > -}
> > +void intel_sseu_set_eus(struct sseu_dev_info *sseu, int slice, int
> > subslice,
> > +			u16 eu_mask);
> >   
> >   u32 intel_sseu_make_rpcs(struct drm_i915_private *i915,
> >   			 const struct intel_sseu *req_sseu);
> > diff --git a/drivers/gpu/drm/i915/i915_debugfs.c
> > b/drivers/gpu/drm/i915/i915_debugfs.c
> > index fe854c629a32..3f3ee83ac315 100644
> > --- a/drivers/gpu/drm/i915/i915_debugfs.c
> > +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> > @@ -4158,7 +4158,7 @@ static void
> > broadwell_sseu_device_status(struct drm_i915_private *dev_priv,
> >   				RUNTIME_INFO(dev_priv)-
> > >sseu.subslice_mask[s];
> >   		}
> >   		sseu->eu_total = sseu->eu_per_subslice *
> > -				 sseu_subslice_total(sseu);
> > +				 intel_sseu_subslice_total(sseu);
> >   
> >   		/* subtract fused off EU(s) from enabled slice(s) */
> >   		for (s = 0; s < fls(sseu->slice_mask); s++) {
> > @@ -4182,10 +4182,10 @@ static void i915_print_sseu_info(struct
> > seq_file *m, bool is_available_info,
> >   	seq_printf(m, "  %s Slice Total: %u\n", type,
> >   		   hweight8(sseu->slice_mask));
> >   	seq_printf(m, "  %s Subslice Total: %u\n", type,
> > -		   sseu_subslice_total(sseu));
> > +		   intel_sseu_subslice_total(sseu));
> >   	for (s = 0; s < fls(sseu->slice_mask); s++) {
> >   		seq_printf(m, "  %s Slice%i subslices: %u\n", type,
> > -			   s, sseu_subslices_per_slice(sseu, s));
> > +			   s, intel_sseu_subslices_per_slice(sseu, s));
> >   	}
> >   	seq_printf(m, "  %s EU Total: %u\n", type,
> >   		   sseu->eu_total);
> > diff --git a/drivers/gpu/drm/i915/i915_drv.c
> > b/drivers/gpu/drm/i915/i915_drv.c
> > index c376244c19c4..130c5140db0d 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.c
> > +++ b/drivers/gpu/drm/i915/i915_drv.c
> > @@ -378,7 +378,7 @@ static int i915_getparam_ioctl(struct
> > drm_device *dev, void *data,
> >   		value = i915_cmd_parser_get_version(dev_priv);
> >   		break;
> >   	case I915_PARAM_SUBSLICE_TOTAL:
> > -		value = sseu_subslice_total(sseu);
> > +		value = intel_sseu_subslice_total(sseu);
> >   		if (!value)
> >   			return -ENODEV;
> >   		break;
> > diff --git a/drivers/gpu/drm/i915/intel_device_info.c
> > b/drivers/gpu/drm/i915/intel_device_info.c
> > index 559cf0d0628e..e1dbccf04cd9 100644
> > --- a/drivers/gpu/drm/i915/intel_device_info.c
> > +++ b/drivers/gpu/drm/i915/intel_device_info.c
> > @@ -90,10 +90,10 @@ static void sseu_dump(const struct
> > sseu_dev_info *sseu, struct drm_printer *p)
> >   
> >   	drm_printf(p, "slice total: %u, mask=%04x\n",
> >   		   hweight8(sseu->slice_mask), sseu->slice_mask);
> > -	drm_printf(p, "subslice total: %u\n",
> > sseu_subslice_total(sseu));
> > +	drm_printf(p, "subslice total: %u\n",
> > intel_sseu_subslice_total(sseu));
> >   	for (s = 0; s < sseu->max_slices; s++) {
> >   		drm_printf(p, "slice%d: %u subslices, mask=%04x\n",
> > -			   s, sseu_subslices_per_slice(sseu, s),
> > +			   s, intel_sseu_subslices_per_slice(sseu, s),
> >   			   sseu->subslice_mask[s]);
> >   	}
> >   	drm_printf(p, "EU total: %u\n", sseu->eu_total);
> > @@ -126,11 +126,11 @@ void intel_device_info_dump_topology(const
> > struct sseu_dev_info *sseu,
> >   
> >   	for (s = 0; s < sseu->max_slices; s++) {
> >   		drm_printf(p, "slice%d: %u subslice(s) (0x%hhx):\n",
> > -			   s, sseu_subslices_per_slice(sseu, s),
> > +			   s, intel_sseu_subslices_per_slice(sseu, s),
> >   			   sseu->subslice_mask[s]);
> >   
> >   		for (ss = 0; ss < sseu->max_subslices; ss++) {
> > -			u16 enabled_eus = sseu_get_eus(sseu, s, ss);
> > +			u16 enabled_eus = intel_sseu_get_eus(sseu, s,
> > ss);
> >   
> >   			drm_printf(p, "\tsubslice%d: %u EUs (0x%hx)\n",
> >   				   ss, hweight16(enabled_eus),
> > enabled_eus);
> > @@ -180,7 +180,7 @@ static void gen11_sseu_info_init(struct
> > drm_i915_private *dev_priv)
> >   			sseu->subslice_mask[s] = (ss_en >> ss_idx) &
> > ss_en_mask;
> >   			for (ss = 0; ss < sseu->max_subslices; ss++) {
> >   				if (sseu->subslice_mask[s] & BIT(ss))
> > -					sseu_set_eus(sseu, s, ss,
> > eu_en);
> > +					intel_sseu_set_eus(sseu, s, ss,
> > eu_en);
> >   			}
> >   		}
> >   	}
> > @@ -222,32 +222,32 @@ static void gen10_sseu_info_init(struct
> > drm_i915_private *dev_priv)
> >   	/* Slice0 */
> >   	eu_en = ~I915_READ(GEN8_EU_DISABLE0);
> >   	for (ss = 0; ss < sseu->max_subslices; ss++)
> > -		sseu_set_eus(sseu, 0, ss, (eu_en >> (8 * ss)) &
> > eu_mask);
> > +		intel_sseu_set_eus(sseu, 0, ss, (eu_en >> (8 * ss)) &
> > eu_mask);
> >   	/* Slice1 */
> > -	sseu_set_eus(sseu, 1, 0, (eu_en >> 24) & eu_mask);
> > +	intel_sseu_set_eus(sseu, 1, 0, (eu_en >> 24) & eu_mask);
> >   	eu_en = ~I915_READ(GEN8_EU_DISABLE1);
> > -	sseu_set_eus(sseu, 1, 1, eu_en & eu_mask);
> > +	intel_sseu_set_eus(sseu, 1, 1, eu_en & eu_mask);
> >   	/* Slice2 */
> > -	sseu_set_eus(sseu, 2, 0, (eu_en >> 8) & eu_mask);
> > -	sseu_set_eus(sseu, 2, 1, (eu_en >> 16) & eu_mask);
> > +	intel_sseu_set_eus(sseu, 2, 0, (eu_en >> 8) & eu_mask);
> > +	intel_sseu_set_eus(sseu, 2, 1, (eu_en >> 16) & eu_mask);
> >   	/* Slice3 */
> > -	sseu_set_eus(sseu, 3, 0, (eu_en >> 24) & eu_mask);
> > +	intel_sseu_set_eus(sseu, 3, 0, (eu_en >> 24) & eu_mask);
> >   	eu_en = ~I915_READ(GEN8_EU_DISABLE2);
> > -	sseu_set_eus(sseu, 3, 1, eu_en & eu_mask);
> > +	intel_sseu_set_eus(sseu, 3, 1, eu_en & eu_mask);
> >   	/* Slice4 */
> > -	sseu_set_eus(sseu, 4, 0, (eu_en >> 8) & eu_mask);
> > -	sseu_set_eus(sseu, 4, 1, (eu_en >> 16) & eu_mask);
> > +	intel_sseu_set_eus(sseu, 4, 0, (eu_en >> 8) & eu_mask);
> > +	intel_sseu_set_eus(sseu, 4, 1, (eu_en >> 16) & eu_mask);
> >   	/* Slice5 */
> > -	sseu_set_eus(sseu, 5, 0, (eu_en >> 24) & eu_mask);
> > +	intel_sseu_set_eus(sseu, 5, 0, (eu_en >> 24) & eu_mask);
> >   	eu_en = ~I915_READ(GEN10_EU_DISABLE3);
> > -	sseu_set_eus(sseu, 5, 1, eu_en & eu_mask);
> > +	intel_sseu_set_eus(sseu, 5, 1, eu_en & eu_mask);
> >   
> >   	/* Do a second pass where we mark the subslices disabled if all
> > their
> >   	 * eus are off.
> >   	 */
> >   	for (s = 0; s < sseu->max_slices; s++) {
> >   		for (ss = 0; ss < sseu->max_subslices; ss++) {
> > -			if (sseu_get_eus(sseu, s, ss) == 0)
> > +			if (intel_sseu_get_eus(sseu, s, ss) == 0)
> >   				sseu->subslice_mask[s] &= ~BIT(ss);
> >   		}
> >   	}
> > @@ -260,9 +260,10 @@ static void gen10_sseu_info_init(struct
> > drm_i915_private *dev_priv)
> >   	 * EU in any one subslice may be fused off for die
> >   	 * recovery.
> >   	 */
> > -	sseu->eu_per_subslice = sseu_subslice_total(sseu) ?
> > +	sseu->eu_per_subslice = intel_sseu_subslice_total(sseu) ?
> >   				DIV_ROUND_UP(sseu->eu_total,
> > -					     sseu_subslice_total(sseu))
> > : 0;
> > +					     intel_sseu_subslice_total(
> > sseu)) :
> > +				0;
> >   
> >   	/* No restrictions on Power Gating */
> >   	sseu->has_slice_pg = 1;
> > @@ -290,7 +291,7 @@ static void cherryview_sseu_info_init(struct
> > drm_i915_private *dev_priv)
> >   			  CHV_FGT_EU_DIS_SS0_R1_SHIFT) << 4);
> >   
> >   		sseu->subslice_mask[0] |= BIT(0);
> > -		sseu_set_eus(sseu, 0, 0, ~disabled_mask);
> > +		intel_sseu_set_eus(sseu, 0, 0, ~disabled_mask);
> >   	}
> >   
> >   	if (!(fuse & CHV_FGT_DISABLE_SS1)) {
> > @@ -301,7 +302,7 @@ static void cherryview_sseu_info_init(struct
> > drm_i915_private *dev_priv)
> >   			  CHV_FGT_EU_DIS_SS1_R1_SHIFT) << 4);
> >   
> >   		sseu->subslice_mask[0] |= BIT(1);
> > -		sseu_set_eus(sseu, 0, 1, ~disabled_mask);
> > +		intel_sseu_set_eus(sseu, 0, 1, ~disabled_mask);
> >   	}
> >   
> >   	sseu->eu_total = compute_eu_total(sseu);
> > @@ -310,8 +311,8 @@ static void cherryview_sseu_info_init(struct
> > drm_i915_private *dev_priv)
> >   	 * CHV expected to always have a uniform distribution of EU
> >   	 * across subslices.
> >   	*/
> > -	sseu->eu_per_subslice = sseu_subslice_total(sseu) ?
> > -				sseu->eu_total /
> > sseu_subslice_total(sseu) :
> > +	sseu->eu_per_subslice = intel_sseu_subslice_total(sseu) ?
> > +				sseu->eu_total /
> > intel_sseu_subslice_total(sseu) :
> >   				0;
> >   	/*
> >   	 * CHV supports subslice power gating on devices with more than
> > @@ -319,7 +320,7 @@ static void cherryview_sseu_info_init(struct
> > drm_i915_private *dev_priv)
> >   	 * more than one EU pair per subslice.
> >   	*/
> >   	sseu->has_slice_pg = 0;
> > -	sseu->has_subslice_pg = sseu_subslice_total(sseu) > 1;
> > +	sseu->has_subslice_pg = intel_sseu_subslice_total(sseu) > 1;
> >   	sseu->has_eu_pg = (sseu->eu_per_subslice > 2);
> >   }
> >   
> > @@ -369,7 +370,7 @@ static void gen9_sseu_info_init(struct
> > drm_i915_private *dev_priv)
> >   
> >   			eu_disabled_mask = (eu_disable >> (ss * 8)) &
> > eu_mask;
> >   
> > -			sseu_set_eus(sseu, s, ss, ~eu_disabled_mask);
> > +			intel_sseu_set_eus(sseu, s, ss,
> > ~eu_disabled_mask);
> >   
> >   			eu_per_ss = sseu->max_eus_per_subslice -
> >   				hweight8(eu_disabled_mask);
> > @@ -393,9 +394,10 @@ static void gen9_sseu_info_init(struct
> > drm_i915_private *dev_priv)
> >   	 * recovery. BXT is expected to be perfectly uniform in EU
> >   	 * distribution.
> >   	*/
> > -	sseu->eu_per_subslice = sseu_subslice_total(sseu) ?
> > +	sseu->eu_per_subslice = intel_sseu_subslice_total(sseu) ?
> >   				DIV_ROUND_UP(sseu->eu_total,
> > -					     sseu_subslice_total(sseu))
> > : 0;
> > +					     intel_sseu_subslice_total(
> > sseu)) :
> > +				0;
> >   	/*
> >   	 * SKL+ supports slice power gating on devices with more than
> >   	 * one slice, and supports EU power gating on devices with
> > @@ -407,7 +409,7 @@ static void gen9_sseu_info_init(struct
> > drm_i915_private *dev_priv)
> >   	sseu->has_slice_pg =
> >   		!IS_GEN9_LP(dev_priv) && hweight8(sseu->slice_mask) >
> > 1;
> >   	sseu->has_subslice_pg =
> > -		IS_GEN9_LP(dev_priv) && sseu_subslice_total(sseu) > 1;
> > +		IS_GEN9_LP(dev_priv) && intel_sseu_subslice_total(sseu)
> > > 1;
> >   	sseu->has_eu_pg = sseu->eu_per_subslice > 2;
> >   
> >   	if (IS_GEN9_LP(dev_priv)) {
> > @@ -477,7 +479,7 @@ static void broadwell_sseu_info_init(struct
> > drm_i915_private *dev_priv)
> >   			eu_disabled_mask =
> >   				eu_disable[s] >> (ss * sseu-
> > >max_eus_per_subslice);
> >   
> > -			sseu_set_eus(sseu, s, ss, ~eu_disabled_mask);
> > +			intel_sseu_set_eus(sseu, s, ss,
> > ~eu_disabled_mask);
> >   
> >   			n_disabled = hweight8(eu_disabled_mask);
> >   
> > @@ -496,9 +498,10 @@ static void broadwell_sseu_info_init(struct
> > drm_i915_private *dev_priv)
> >   	 * subslices with the exception that any one EU in any one
> > subslice may
> >   	 * be fused off for die recovery.
> >   	 */
> > -	sseu->eu_per_subslice = sseu_subslice_total(sseu) ?
> > +	sseu->eu_per_subslice = intel_sseu_subslice_total(sseu) ?
> >   				DIV_ROUND_UP(sseu->eu_total,
> > -					     sseu_subslice_total(sseu))
> > : 0;
> > +					     intel_sseu_subslice_total(
> > sseu)) :
> > +				0;
> >   
> >   	/*
> >   	 * BDW supports slice power gating on devices with more than
> > @@ -561,8 +564,8 @@ static void haswell_sseu_info_init(struct
> > drm_i915_private *dev_priv)
> >   
> >   	for (s = 0; s < sseu->max_slices; s++) {
> >   		for (ss = 0; ss < sseu->max_subslices; ss++) {
> > -			sseu_set_eus(sseu, s, ss,
> > -				     (1UL << sseu->eu_per_subslice) -
> > 1);
> > +			intel_sseu_set_eus(sseu, s, ss,
> > +					   (1UL << sseu-
> > >eu_per_subslice) - 1);
> >   		}
> >   	}
> >   
> > 

[-- Attachment #1.2: smime.p7s --]
[-- Type: application/x-pkcs7-signature, Size: 3270 bytes --]

[-- Attachment #2: Type: text/plain, Size: 159 bytes --]

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 5/6] drm/i915: Remove inline from sseu helper functions
  2019-05-01 21:04     ` Summers, Stuart
@ 2019-05-01 21:19       ` Daniele Ceraolo Spurio
  2019-05-01 21:28         ` Summers, Stuart
  0 siblings, 1 reply; 35+ messages in thread
From: Daniele Ceraolo Spurio @ 2019-05-01 21:19 UTC (permalink / raw)
  To: Summers, Stuart, intel-gfx



On 5/1/19 2:04 PM, Summers, Stuart wrote:
> On Wed, 2019-05-01 at 13:04 -0700, Daniele Ceraolo Spurio wrote:
>> Can you elaborate a bit more on what's the rationale for this? do
>> you
>> just want to avoid having too many inlines since the paths they're
>> used
>> in are not critical, or do you have some more functional reason? This
>> is
>> not a critic to the patch, I just want to understand where you're
>> coming
>> from ;)
> 
> This was a request from Jani Nikula in a previous series update. I
> don't have a strong preference either way personally. If you don't have
> any major concerns, I'd prefer to keep the series as-is to prevent too
> much thrash here, but let me know.
> 

No concerns, just please update the commit message to explain that we're 
moving them because there is no need for them to be inline since they're 
not on a critical path where we need preformance.

>>
>> BTW, looking at this patch I realized there are a few more
>> DIV_ROUND_UP(..., BITS_PER_BYTE) that could be converted to
>> GEN_SSEU_STRIDE() in patch 2. I noticed you update them to a new
>> variable in the next patch, but for consistency it might still be
>> worth
>> updating them all in patch 2 or at least mention in the commit
>> message
>> of patch 2 that the remaining cases are updated by a follow-up patch
>> in
>> the series. Patch 2 is quite small, so you could also just squash it
>> into patch 6 to avoid the split.
> 
> I'm happy to squash them. I did try to isolate this a bit, but you're
> right that I ended up pushing some of these DIV_ROUND_UP... stride
> calculations to the last patch in the series. If you don't have any
> objection, to keep the finaly patch a bit simpler, I'd rather pull
> those changes into the earlier patch. I realize you already have a RB
> on that patch. Any issues doing this?
> 

If you're changing all of them from DIV_ROUND_UP to GEN_SSEU_STRIDE in 
patch 2 I'm ok for you to keep the r-b. If you want to port the other 
logic for saving sseu->ss_stride to that patch then I'll have another 
quick look at it after you re-send as that is a more complex change.

Daniele

> Thanks,
> Stuart
> 
>>
>> Daniele
>>
>> On 5/1/19 8:34 AM, Stuart Summers wrote:
>>> Additionally, ensure these are all prefixed with intel_sseu_*
>>> to match the convention of other functions in i915.
>>>
>>> Signed-off-by: Stuart Summers <stuart.summers@intel.com>
>>> ---
>>>    drivers/gpu/drm/i915/gt/intel_sseu.c     | 54 +++++++++++++++++++
>>>    drivers/gpu/drm/i915/gt/intel_sseu.h     | 57 +++--------------
>>> ---
>>>    drivers/gpu/drm/i915/i915_debugfs.c      |  6 +--
>>>    drivers/gpu/drm/i915/i915_drv.c          |  2 +-
>>>    drivers/gpu/drm/i915/intel_device_info.c | 69 ++++++++++++-------
>>> -----
>>>    5 files changed, 102 insertions(+), 86 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/gt/intel_sseu.c
>>> b/drivers/gpu/drm/i915/gt/intel_sseu.c
>>> index 7f448f3bea0b..4a0b82fc108c 100644
>>> --- a/drivers/gpu/drm/i915/gt/intel_sseu.c
>>> +++ b/drivers/gpu/drm/i915/gt/intel_sseu.c
>>> @@ -8,6 +8,60 @@
>>>    #include "intel_lrc_reg.h"
>>>    #include "intel_sseu.h"
>>>    
>>> +unsigned int
>>> +intel_sseu_subslice_total(const struct sseu_dev_info *sseu)
>>> +{
>>> +	unsigned int i, total = 0;
>>> +
>>> +	for (i = 0; i < ARRAY_SIZE(sseu->subslice_mask); i++)
>>> +		total += hweight8(sseu->subslice_mask[i]);
>>> +
>>> +	return total;
>>> +}
>>> +
>>> +unsigned int
>>> +intel_sseu_subslices_per_slice(const struct sseu_dev_info *sseu,
>>> u8 slice)
>>> +{
>>> +	return hweight8(sseu->subslice_mask[slice]);
>>> +}
>>> +
>>> +static int intel_sseu_eu_idx(const struct sseu_dev_info *sseu, int
>>> slice,
>>> +			     int subslice)
>>> +{
>>> +	int subslice_stride = DIV_ROUND_UP(sseu->max_eus_per_subslice,
>>> +					   BITS_PER_BYTE);
>>> +	int slice_stride = sseu->max_subslices * subslice_stride;
>>> +
>>> +	return slice * slice_stride + subslice * subslice_stride;
>>> +}
>>> +
>>> +u16 intel_sseu_get_eus(const struct sseu_dev_info *sseu, int
>>> slice,
>>> +		       int subslice)
>>> +{
>>> +	int i, offset = intel_sseu_eu_idx(sseu, slice, subslice);
>>> +	u16 eu_mask = 0;
>>> +
>>> +	for (i = 0;
>>> +	     i < DIV_ROUND_UP(sseu->max_eus_per_subslice,
>>> BITS_PER_BYTE); i++) {
>>> +		eu_mask |= ((u16)sseu->eu_mask[offset + i]) <<
>>> +			(i * BITS_PER_BYTE);
>>> +	}
>>> +
>>> +	return eu_mask;
>>> +}
>>> +
>>> +void intel_sseu_set_eus(struct sseu_dev_info *sseu, int slice, int
>>> subslice,
>>> +			u16 eu_mask)
>>> +{
>>> +	int i, offset = intel_sseu_eu_idx(sseu, slice, subslice);
>>> +
>>> +	for (i = 0;
>>> +	     i < DIV_ROUND_UP(sseu->max_eus_per_subslice,
>>> BITS_PER_BYTE); i++) {
>>> +		sseu->eu_mask[offset + i] =
>>> +			(eu_mask >> (BITS_PER_BYTE * i)) & 0xff;
>>> +	}
>>> +}
>>> +
>>>    u32 intel_sseu_make_rpcs(struct drm_i915_private *i915,
>>>    			 const struct intel_sseu *req_sseu)
>>>    {
>>> diff --git a/drivers/gpu/drm/i915/gt/intel_sseu.h
>>> b/drivers/gpu/drm/i915/gt/intel_sseu.h
>>> index 029e71d8f140..56e3721ae83f 100644
>>> --- a/drivers/gpu/drm/i915/gt/intel_sseu.h
>>> +++ b/drivers/gpu/drm/i915/gt/intel_sseu.h
>>> @@ -63,58 +63,17 @@ intel_sseu_from_device_info(const struct
>>> sseu_dev_info *sseu)
>>>    	return value;
>>>    }
>>>    
>>> -static inline unsigned int sseu_subslice_total(const struct
>>> sseu_dev_info *sseu)
>>> -{
>>> -	unsigned int i, total = 0;
>>> -
>>> -	for (i = 0; i < ARRAY_SIZE(sseu->subslice_mask); i++)
>>> -		total += hweight8(sseu->subslice_mask[i]);
>>> +unsigned int
>>> +intel_sseu_subslice_total(const struct sseu_dev_info *sseu);
>>>    
>>> -	return total;
>>> -}
>>> +unsigned int
>>> +intel_sseu_subslices_per_slice(const struct sseu_dev_info *sseu,
>>> u8 slice);
>>>    
>>> -static inline unsigned int
>>> -sseu_subslices_per_slice(const struct sseu_dev_info *sseu, u8
>>> slice)
>>> -{
>>> -	return hweight8(sseu->subslice_mask[slice]);
>>> -}
>>> -
>>> -static inline int sseu_eu_idx(const struct sseu_dev_info *sseu,
>>> -			      int slice, int subslice)
>>> -{
>>> -	int subslice_stride = DIV_ROUND_UP(sseu->max_eus_per_subslice,
>>> -					   BITS_PER_BYTE);
>>> -	int slice_stride = sseu->max_subslices * subslice_stride;
>>> -
>>> -	return slice * slice_stride + subslice * subslice_stride;
>>> -}
>>> +u16 intel_sseu_get_eus(const struct sseu_dev_info *sseu, int
>>> slice,
>>> +		       int subslice);
>>>    
>>> -static inline u16 sseu_get_eus(const struct sseu_dev_info *sseu,
>>> -			       int slice, int subslice)
>>> -{
>>> -	int i, offset = sseu_eu_idx(sseu, slice, subslice);
>>> -	u16 eu_mask = 0;
>>> -
>>> -	for (i = 0;
>>> -	     i < DIV_ROUND_UP(sseu->max_eus_per_subslice,
>>> BITS_PER_BYTE); i++) {
>>> -		eu_mask |= ((u16)sseu->eu_mask[offset + i]) <<
>>> -			(i * BITS_PER_BYTE);
>>> -	}
>>> -
>>> -	return eu_mask;
>>> -}
>>> -
>>> -static inline void sseu_set_eus(struct sseu_dev_info *sseu,
>>> -				int slice, int subslice, u16 eu_mask)
>>> -{
>>> -	int i, offset = sseu_eu_idx(sseu, slice, subslice);
>>> -
>>> -	for (i = 0;
>>> -	     i < DIV_ROUND_UP(sseu->max_eus_per_subslice,
>>> BITS_PER_BYTE); i++) {
>>> -		sseu->eu_mask[offset + i] =
>>> -			(eu_mask >> (BITS_PER_BYTE * i)) & 0xff;
>>> -	}
>>> -}
>>> +void intel_sseu_set_eus(struct sseu_dev_info *sseu, int slice, int
>>> subslice,
>>> +			u16 eu_mask);
>>>    
>>>    u32 intel_sseu_make_rpcs(struct drm_i915_private *i915,
>>>    			 const struct intel_sseu *req_sseu);
>>> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c
>>> b/drivers/gpu/drm/i915/i915_debugfs.c
>>> index fe854c629a32..3f3ee83ac315 100644
>>> --- a/drivers/gpu/drm/i915/i915_debugfs.c
>>> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
>>> @@ -4158,7 +4158,7 @@ static void
>>> broadwell_sseu_device_status(struct drm_i915_private *dev_priv,
>>>    				RUNTIME_INFO(dev_priv)-
>>>> sseu.subslice_mask[s];
>>>    		}
>>>    		sseu->eu_total = sseu->eu_per_subslice *
>>> -				 sseu_subslice_total(sseu);
>>> +				 intel_sseu_subslice_total(sseu);
>>>    
>>>    		/* subtract fused off EU(s) from enabled slice(s) */
>>>    		for (s = 0; s < fls(sseu->slice_mask); s++) {
>>> @@ -4182,10 +4182,10 @@ static void i915_print_sseu_info(struct
>>> seq_file *m, bool is_available_info,
>>>    	seq_printf(m, "  %s Slice Total: %u\n", type,
>>>    		   hweight8(sseu->slice_mask));
>>>    	seq_printf(m, "  %s Subslice Total: %u\n", type,
>>> -		   sseu_subslice_total(sseu));
>>> +		   intel_sseu_subslice_total(sseu));
>>>    	for (s = 0; s < fls(sseu->slice_mask); s++) {
>>>    		seq_printf(m, "  %s Slice%i subslices: %u\n", type,
>>> -			   s, sseu_subslices_per_slice(sseu, s));
>>> +			   s, intel_sseu_subslices_per_slice(sseu, s));
>>>    	}
>>>    	seq_printf(m, "  %s EU Total: %u\n", type,
>>>    		   sseu->eu_total);
>>> diff --git a/drivers/gpu/drm/i915/i915_drv.c
>>> b/drivers/gpu/drm/i915/i915_drv.c
>>> index c376244c19c4..130c5140db0d 100644
>>> --- a/drivers/gpu/drm/i915/i915_drv.c
>>> +++ b/drivers/gpu/drm/i915/i915_drv.c
>>> @@ -378,7 +378,7 @@ static int i915_getparam_ioctl(struct
>>> drm_device *dev, void *data,
>>>    		value = i915_cmd_parser_get_version(dev_priv);
>>>    		break;
>>>    	case I915_PARAM_SUBSLICE_TOTAL:
>>> -		value = sseu_subslice_total(sseu);
>>> +		value = intel_sseu_subslice_total(sseu);
>>>    		if (!value)
>>>    			return -ENODEV;
>>>    		break;
>>> diff --git a/drivers/gpu/drm/i915/intel_device_info.c
>>> b/drivers/gpu/drm/i915/intel_device_info.c
>>> index 559cf0d0628e..e1dbccf04cd9 100644
>>> --- a/drivers/gpu/drm/i915/intel_device_info.c
>>> +++ b/drivers/gpu/drm/i915/intel_device_info.c
>>> @@ -90,10 +90,10 @@ static void sseu_dump(const struct
>>> sseu_dev_info *sseu, struct drm_printer *p)
>>>    
>>>    	drm_printf(p, "slice total: %u, mask=%04x\n",
>>>    		   hweight8(sseu->slice_mask), sseu->slice_mask);
>>> -	drm_printf(p, "subslice total: %u\n",
>>> sseu_subslice_total(sseu));
>>> +	drm_printf(p, "subslice total: %u\n",
>>> intel_sseu_subslice_total(sseu));
>>>    	for (s = 0; s < sseu->max_slices; s++) {
>>>    		drm_printf(p, "slice%d: %u subslices, mask=%04x\n",
>>> -			   s, sseu_subslices_per_slice(sseu, s),
>>> +			   s, intel_sseu_subslices_per_slice(sseu, s),
>>>    			   sseu->subslice_mask[s]);
>>>    	}
>>>    	drm_printf(p, "EU total: %u\n", sseu->eu_total);
>>> @@ -126,11 +126,11 @@ void intel_device_info_dump_topology(const
>>> struct sseu_dev_info *sseu,
>>>    
>>>    	for (s = 0; s < sseu->max_slices; s++) {
>>>    		drm_printf(p, "slice%d: %u subslice(s) (0x%hhx):\n",
>>> -			   s, sseu_subslices_per_slice(sseu, s),
>>> +			   s, intel_sseu_subslices_per_slice(sseu, s),
>>>    			   sseu->subslice_mask[s]);
>>>    
>>>    		for (ss = 0; ss < sseu->max_subslices; ss++) {
>>> -			u16 enabled_eus = sseu_get_eus(sseu, s, ss);
>>> +			u16 enabled_eus = intel_sseu_get_eus(sseu, s,
>>> ss);
>>>    
>>>    			drm_printf(p, "\tsubslice%d: %u EUs (0x%hx)\n",
>>>    				   ss, hweight16(enabled_eus),
>>> enabled_eus);
>>> @@ -180,7 +180,7 @@ static void gen11_sseu_info_init(struct
>>> drm_i915_private *dev_priv)
>>>    			sseu->subslice_mask[s] = (ss_en >> ss_idx) &
>>> ss_en_mask;
>>>    			for (ss = 0; ss < sseu->max_subslices; ss++) {
>>>    				if (sseu->subslice_mask[s] & BIT(ss))
>>> -					sseu_set_eus(sseu, s, ss,
>>> eu_en);
>>> +					intel_sseu_set_eus(sseu, s, ss,
>>> eu_en);
>>>    			}
>>>    		}
>>>    	}
>>> @@ -222,32 +222,32 @@ static void gen10_sseu_info_init(struct
>>> drm_i915_private *dev_priv)
>>>    	/* Slice0 */
>>>    	eu_en = ~I915_READ(GEN8_EU_DISABLE0);
>>>    	for (ss = 0; ss < sseu->max_subslices; ss++)
>>> -		sseu_set_eus(sseu, 0, ss, (eu_en >> (8 * ss)) &
>>> eu_mask);
>>> +		intel_sseu_set_eus(sseu, 0, ss, (eu_en >> (8 * ss)) &
>>> eu_mask);
>>>    	/* Slice1 */
>>> -	sseu_set_eus(sseu, 1, 0, (eu_en >> 24) & eu_mask);
>>> +	intel_sseu_set_eus(sseu, 1, 0, (eu_en >> 24) & eu_mask);
>>>    	eu_en = ~I915_READ(GEN8_EU_DISABLE1);
>>> -	sseu_set_eus(sseu, 1, 1, eu_en & eu_mask);
>>> +	intel_sseu_set_eus(sseu, 1, 1, eu_en & eu_mask);
>>>    	/* Slice2 */
>>> -	sseu_set_eus(sseu, 2, 0, (eu_en >> 8) & eu_mask);
>>> -	sseu_set_eus(sseu, 2, 1, (eu_en >> 16) & eu_mask);
>>> +	intel_sseu_set_eus(sseu, 2, 0, (eu_en >> 8) & eu_mask);
>>> +	intel_sseu_set_eus(sseu, 2, 1, (eu_en >> 16) & eu_mask);
>>>    	/* Slice3 */
>>> -	sseu_set_eus(sseu, 3, 0, (eu_en >> 24) & eu_mask);
>>> +	intel_sseu_set_eus(sseu, 3, 0, (eu_en >> 24) & eu_mask);
>>>    	eu_en = ~I915_READ(GEN8_EU_DISABLE2);
>>> -	sseu_set_eus(sseu, 3, 1, eu_en & eu_mask);
>>> +	intel_sseu_set_eus(sseu, 3, 1, eu_en & eu_mask);
>>>    	/* Slice4 */
>>> -	sseu_set_eus(sseu, 4, 0, (eu_en >> 8) & eu_mask);
>>> -	sseu_set_eus(sseu, 4, 1, (eu_en >> 16) & eu_mask);
>>> +	intel_sseu_set_eus(sseu, 4, 0, (eu_en >> 8) & eu_mask);
>>> +	intel_sseu_set_eus(sseu, 4, 1, (eu_en >> 16) & eu_mask);
>>>    	/* Slice5 */
>>> -	sseu_set_eus(sseu, 5, 0, (eu_en >> 24) & eu_mask);
>>> +	intel_sseu_set_eus(sseu, 5, 0, (eu_en >> 24) & eu_mask);
>>>    	eu_en = ~I915_READ(GEN10_EU_DISABLE3);
>>> -	sseu_set_eus(sseu, 5, 1, eu_en & eu_mask);
>>> +	intel_sseu_set_eus(sseu, 5, 1, eu_en & eu_mask);
>>>    
>>>    	/* Do a second pass where we mark the subslices disabled if all
>>> their
>>>    	 * eus are off.
>>>    	 */
>>>    	for (s = 0; s < sseu->max_slices; s++) {
>>>    		for (ss = 0; ss < sseu->max_subslices; ss++) {
>>> -			if (sseu_get_eus(sseu, s, ss) == 0)
>>> +			if (intel_sseu_get_eus(sseu, s, ss) == 0)
>>>    				sseu->subslice_mask[s] &= ~BIT(ss);
>>>    		}
>>>    	}
>>> @@ -260,9 +260,10 @@ static void gen10_sseu_info_init(struct
>>> drm_i915_private *dev_priv)
>>>    	 * EU in any one subslice may be fused off for die
>>>    	 * recovery.
>>>    	 */
>>> -	sseu->eu_per_subslice = sseu_subslice_total(sseu) ?
>>> +	sseu->eu_per_subslice = intel_sseu_subslice_total(sseu) ?
>>>    				DIV_ROUND_UP(sseu->eu_total,
>>> -					     sseu_subslice_total(sseu))
>>> : 0;
>>> +					     intel_sseu_subslice_total(
>>> sseu)) :
>>> +				0;
>>>    
>>>    	/* No restrictions on Power Gating */
>>>    	sseu->has_slice_pg = 1;
>>> @@ -290,7 +291,7 @@ static void cherryview_sseu_info_init(struct
>>> drm_i915_private *dev_priv)
>>>    			  CHV_FGT_EU_DIS_SS0_R1_SHIFT) << 4);
>>>    
>>>    		sseu->subslice_mask[0] |= BIT(0);
>>> -		sseu_set_eus(sseu, 0, 0, ~disabled_mask);
>>> +		intel_sseu_set_eus(sseu, 0, 0, ~disabled_mask);
>>>    	}
>>>    
>>>    	if (!(fuse & CHV_FGT_DISABLE_SS1)) {
>>> @@ -301,7 +302,7 @@ static void cherryview_sseu_info_init(struct
>>> drm_i915_private *dev_priv)
>>>    			  CHV_FGT_EU_DIS_SS1_R1_SHIFT) << 4);
>>>    
>>>    		sseu->subslice_mask[0] |= BIT(1);
>>> -		sseu_set_eus(sseu, 0, 1, ~disabled_mask);
>>> +		intel_sseu_set_eus(sseu, 0, 1, ~disabled_mask);
>>>    	}
>>>    
>>>    	sseu->eu_total = compute_eu_total(sseu);
>>> @@ -310,8 +311,8 @@ static void cherryview_sseu_info_init(struct
>>> drm_i915_private *dev_priv)
>>>    	 * CHV expected to always have a uniform distribution of EU
>>>    	 * across subslices.
>>>    	*/
>>> -	sseu->eu_per_subslice = sseu_subslice_total(sseu) ?
>>> -				sseu->eu_total /
>>> sseu_subslice_total(sseu) :
>>> +	sseu->eu_per_subslice = intel_sseu_subslice_total(sseu) ?
>>> +				sseu->eu_total /
>>> intel_sseu_subslice_total(sseu) :
>>>    				0;
>>>    	/*
>>>    	 * CHV supports subslice power gating on devices with more than
>>> @@ -319,7 +320,7 @@ static void cherryview_sseu_info_init(struct
>>> drm_i915_private *dev_priv)
>>>    	 * more than one EU pair per subslice.
>>>    	*/
>>>    	sseu->has_slice_pg = 0;
>>> -	sseu->has_subslice_pg = sseu_subslice_total(sseu) > 1;
>>> +	sseu->has_subslice_pg = intel_sseu_subslice_total(sseu) > 1;
>>>    	sseu->has_eu_pg = (sseu->eu_per_subslice > 2);
>>>    }
>>>    
>>> @@ -369,7 +370,7 @@ static void gen9_sseu_info_init(struct
>>> drm_i915_private *dev_priv)
>>>    
>>>    			eu_disabled_mask = (eu_disable >> (ss * 8)) &
>>> eu_mask;
>>>    
>>> -			sseu_set_eus(sseu, s, ss, ~eu_disabled_mask);
>>> +			intel_sseu_set_eus(sseu, s, ss,
>>> ~eu_disabled_mask);
>>>    
>>>    			eu_per_ss = sseu->max_eus_per_subslice -
>>>    				hweight8(eu_disabled_mask);
>>> @@ -393,9 +394,10 @@ static void gen9_sseu_info_init(struct
>>> drm_i915_private *dev_priv)
>>>    	 * recovery. BXT is expected to be perfectly uniform in EU
>>>    	 * distribution.
>>>    	*/
>>> -	sseu->eu_per_subslice = sseu_subslice_total(sseu) ?
>>> +	sseu->eu_per_subslice = intel_sseu_subslice_total(sseu) ?
>>>    				DIV_ROUND_UP(sseu->eu_total,
>>> -					     sseu_subslice_total(sseu))
>>> : 0;
>>> +					     intel_sseu_subslice_total(
>>> sseu)) :
>>> +				0;
>>>    	/*
>>>    	 * SKL+ supports slice power gating on devices with more than
>>>    	 * one slice, and supports EU power gating on devices with
>>> @@ -407,7 +409,7 @@ static void gen9_sseu_info_init(struct
>>> drm_i915_private *dev_priv)
>>>    	sseu->has_slice_pg =
>>>    		!IS_GEN9_LP(dev_priv) && hweight8(sseu->slice_mask) >
>>> 1;
>>>    	sseu->has_subslice_pg =
>>> -		IS_GEN9_LP(dev_priv) && sseu_subslice_total(sseu) > 1;
>>> +		IS_GEN9_LP(dev_priv) && intel_sseu_subslice_total(sseu)
>>>> 1;
>>>    	sseu->has_eu_pg = sseu->eu_per_subslice > 2;
>>>    
>>>    	if (IS_GEN9_LP(dev_priv)) {
>>> @@ -477,7 +479,7 @@ static void broadwell_sseu_info_init(struct
>>> drm_i915_private *dev_priv)
>>>    			eu_disabled_mask =
>>>    				eu_disable[s] >> (ss * sseu-
>>>> max_eus_per_subslice);
>>>    
>>> -			sseu_set_eus(sseu, s, ss, ~eu_disabled_mask);
>>> +			intel_sseu_set_eus(sseu, s, ss,
>>> ~eu_disabled_mask);
>>>    
>>>    			n_disabled = hweight8(eu_disabled_mask);
>>>    
>>> @@ -496,9 +498,10 @@ static void broadwell_sseu_info_init(struct
>>> drm_i915_private *dev_priv)
>>>    	 * subslices with the exception that any one EU in any one
>>> subslice may
>>>    	 * be fused off for die recovery.
>>>    	 */
>>> -	sseu->eu_per_subslice = sseu_subslice_total(sseu) ?
>>> +	sseu->eu_per_subslice = intel_sseu_subslice_total(sseu) ?
>>>    				DIV_ROUND_UP(sseu->eu_total,
>>> -					     sseu_subslice_total(sseu))
>>> : 0;
>>> +					     intel_sseu_subslice_total(
>>> sseu)) :
>>> +				0;
>>>    
>>>    	/*
>>>    	 * BDW supports slice power gating on devices with more than
>>> @@ -561,8 +564,8 @@ static void haswell_sseu_info_init(struct
>>> drm_i915_private *dev_priv)
>>>    
>>>    	for (s = 0; s < sseu->max_slices; s++) {
>>>    		for (ss = 0; ss < sseu->max_subslices; ss++) {
>>> -			sseu_set_eus(sseu, s, ss,
>>> -				     (1UL << sseu->eu_per_subslice) -
>>> 1);
>>> +			intel_sseu_set_eus(sseu, s, ss,
>>> +					   (1UL << sseu-
>>>> eu_per_subslice) - 1);
>>>    		}
>>>    	}
>>>    
>>>
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 5/6] drm/i915: Remove inline from sseu helper functions
  2019-05-01 21:19       ` Daniele Ceraolo Spurio
@ 2019-05-01 21:28         ` Summers, Stuart
  2019-05-02  7:15           ` Jani Nikula
  0 siblings, 1 reply; 35+ messages in thread
From: Summers, Stuart @ 2019-05-01 21:28 UTC (permalink / raw)
  To: Ceraolo Spurio, Daniele, intel-gfx


[-- Attachment #1.1: Type: text/plain, Size: 22198 bytes --]

On Wed, 2019-05-01 at 14:19 -0700, Daniele Ceraolo Spurio wrote:
> 
> On 5/1/19 2:04 PM, Summers, Stuart wrote:
> > On Wed, 2019-05-01 at 13:04 -0700, Daniele Ceraolo Spurio wrote:
> > > Can you elaborate a bit more on what's the rationale for this? do
> > > you
> > > just want to avoid having too many inlines since the paths
> > > they're
> > > used
> > > in are not critical, or do you have some more functional reason?
> > > This
> > > is
> > > not a critic to the patch, I just want to understand where you're
> > > coming
> > > from ;)
> > 
> > This was a request from Jani Nikula in a previous series update. I
> > don't have a strong preference either way personally. If you don't
> > have
> > any major concerns, I'd prefer to keep the series as-is to prevent
> > too
> > much thrash here, but let me know.
> > 
> 
> No concerns, just please update the commit message to explain that
> we're 
> moving them because there is no need for them to be inline since
> they're 
> not on a critical path where we need preformance.

Sounds great.

> 
> > > 
> > > BTW, looking at this patch I realized there are a few more
> > > DIV_ROUND_UP(..., BITS_PER_BYTE) that could be converted to
> > > GEN_SSEU_STRIDE() in patch 2. I noticed you update them to a new
> > > variable in the next patch, but for consistency it might still be
> > > worth
> > > updating them all in patch 2 or at least mention in the commit
> > > message
> > > of patch 2 that the remaining cases are updated by a follow-up
> > > patch
> > > in
> > > the series. Patch 2 is quite small, so you could also just squash
> > > it
> > > into patch 6 to avoid the split.
> > 
> > I'm happy to squash them. I did try to isolate this a bit, but
> > you're
> > right that I ended up pushing some of these DIV_ROUND_UP... stride
> > calculations to the last patch in the series. If you don't have any
> > objection, to keep the finaly patch a bit simpler, I'd rather pull
> > those changes into the earlier patch. I realize you already have a
> > RB
> > on that patch. Any issues doing this?
> > 
> 
> If you're changing all of them from DIV_ROUND_UP to GEN_SSEU_STRIDE
> in 
> patch 2 I'm ok for you to keep the r-b. If you want to port the
> other 
> logic for saving sseu->ss_stride to that patch then I'll have
> another 
> quick look at it after you re-send as that is a more complex change.

I'll do the former, then convert those to the new structure layout in
the subsequent patches.

Thanks,
Stuart

> 
> Daniele
> 
> > Thanks,
> > Stuart
> > 
> > > 
> > > Daniele
> > > 
> > > On 5/1/19 8:34 AM, Stuart Summers wrote:
> > > > Additionally, ensure these are all prefixed with intel_sseu_*
> > > > to match the convention of other functions in i915.
> > > > 
> > > > Signed-off-by: Stuart Summers <stuart.summers@intel.com>
> > > > ---
> > > >    drivers/gpu/drm/i915/gt/intel_sseu.c     | 54
> > > > +++++++++++++++++++
> > > >    drivers/gpu/drm/i915/gt/intel_sseu.h     | 57 +++-----------
> > > > ---
> > > > ---
> > > >    drivers/gpu/drm/i915/i915_debugfs.c      |  6 +--
> > > >    drivers/gpu/drm/i915/i915_drv.c          |  2 +-
> > > >    drivers/gpu/drm/i915/intel_device_info.c | 69 ++++++++++++
> > > > -------
> > > > -----
> > > >    5 files changed, 102 insertions(+), 86 deletions(-)
> > > > 
> > > > diff --git a/drivers/gpu/drm/i915/gt/intel_sseu.c
> > > > b/drivers/gpu/drm/i915/gt/intel_sseu.c
> > > > index 7f448f3bea0b..4a0b82fc108c 100644
> > > > --- a/drivers/gpu/drm/i915/gt/intel_sseu.c
> > > > +++ b/drivers/gpu/drm/i915/gt/intel_sseu.c
> > > > @@ -8,6 +8,60 @@
> > > >    #include "intel_lrc_reg.h"
> > > >    #include "intel_sseu.h"
> > > >    
> > > > +unsigned int
> > > > +intel_sseu_subslice_total(const struct sseu_dev_info *sseu)
> > > > +{
> > > > +	unsigned int i, total = 0;
> > > > +
> > > > +	for (i = 0; i < ARRAY_SIZE(sseu->subslice_mask); i++)
> > > > +		total += hweight8(sseu->subslice_mask[i]);
> > > > +
> > > > +	return total;
> > > > +}
> > > > +
> > > > +unsigned int
> > > > +intel_sseu_subslices_per_slice(const struct sseu_dev_info
> > > > *sseu,
> > > > u8 slice)
> > > > +{
> > > > +	return hweight8(sseu->subslice_mask[slice]);
> > > > +}
> > > > +
> > > > +static int intel_sseu_eu_idx(const struct sseu_dev_info *sseu,
> > > > int
> > > > slice,
> > > > +			     int subslice)
> > > > +{
> > > > +	int subslice_stride = DIV_ROUND_UP(sseu-
> > > > >max_eus_per_subslice,
> > > > +					   BITS_PER_BYTE);
> > > > +	int slice_stride = sseu->max_subslices *
> > > > subslice_stride;
> > > > +
> > > > +	return slice * slice_stride + subslice *
> > > > subslice_stride;
> > > > +}
> > > > +
> > > > +u16 intel_sseu_get_eus(const struct sseu_dev_info *sseu, int
> > > > slice,
> > > > +		       int subslice)
> > > > +{
> > > > +	int i, offset = intel_sseu_eu_idx(sseu, slice,
> > > > subslice);
> > > > +	u16 eu_mask = 0;
> > > > +
> > > > +	for (i = 0;
> > > > +	     i < DIV_ROUND_UP(sseu->max_eus_per_subslice,
> > > > BITS_PER_BYTE); i++) {
> > > > +		eu_mask |= ((u16)sseu->eu_mask[offset + i]) <<
> > > > +			(i * BITS_PER_BYTE);
> > > > +	}
> > > > +
> > > > +	return eu_mask;
> > > > +}
> > > > +
> > > > +void intel_sseu_set_eus(struct sseu_dev_info *sseu, int slice,
> > > > int
> > > > subslice,
> > > > +			u16 eu_mask)
> > > > +{
> > > > +	int i, offset = intel_sseu_eu_idx(sseu, slice,
> > > > subslice);
> > > > +
> > > > +	for (i = 0;
> > > > +	     i < DIV_ROUND_UP(sseu->max_eus_per_subslice,
> > > > BITS_PER_BYTE); i++) {
> > > > +		sseu->eu_mask[offset + i] =
> > > > +			(eu_mask >> (BITS_PER_BYTE * i)) &
> > > > 0xff;
> > > > +	}
> > > > +}
> > > > +
> > > >    u32 intel_sseu_make_rpcs(struct drm_i915_private *i915,
> > > >    			 const struct intel_sseu *req_sseu)
> > > >    {
> > > > diff --git a/drivers/gpu/drm/i915/gt/intel_sseu.h
> > > > b/drivers/gpu/drm/i915/gt/intel_sseu.h
> > > > index 029e71d8f140..56e3721ae83f 100644
> > > > --- a/drivers/gpu/drm/i915/gt/intel_sseu.h
> > > > +++ b/drivers/gpu/drm/i915/gt/intel_sseu.h
> > > > @@ -63,58 +63,17 @@ intel_sseu_from_device_info(const struct
> > > > sseu_dev_info *sseu)
> > > >    	return value;
> > > >    }
> > > >    
> > > > -static inline unsigned int sseu_subslice_total(const struct
> > > > sseu_dev_info *sseu)
> > > > -{
> > > > -	unsigned int i, total = 0;
> > > > -
> > > > -	for (i = 0; i < ARRAY_SIZE(sseu->subslice_mask); i++)
> > > > -		total += hweight8(sseu->subslice_mask[i]);
> > > > +unsigned int
> > > > +intel_sseu_subslice_total(const struct sseu_dev_info *sseu);
> > > >    
> > > > -	return total;
> > > > -}
> > > > +unsigned int
> > > > +intel_sseu_subslices_per_slice(const struct sseu_dev_info
> > > > *sseu,
> > > > u8 slice);
> > > >    
> > > > -static inline unsigned int
> > > > -sseu_subslices_per_slice(const struct sseu_dev_info *sseu, u8
> > > > slice)
> > > > -{
> > > > -	return hweight8(sseu->subslice_mask[slice]);
> > > > -}
> > > > -
> > > > -static inline int sseu_eu_idx(const struct sseu_dev_info
> > > > *sseu,
> > > > -			      int slice, int subslice)
> > > > -{
> > > > -	int subslice_stride = DIV_ROUND_UP(sseu-
> > > > >max_eus_per_subslice,
> > > > -					   BITS_PER_BYTE);
> > > > -	int slice_stride = sseu->max_subslices *
> > > > subslice_stride;
> > > > -
> > > > -	return slice * slice_stride + subslice *
> > > > subslice_stride;
> > > > -}
> > > > +u16 intel_sseu_get_eus(const struct sseu_dev_info *sseu, int
> > > > slice,
> > > > +		       int subslice);
> > > >    
> > > > -static inline u16 sseu_get_eus(const struct sseu_dev_info
> > > > *sseu,
> > > > -			       int slice, int subslice)
> > > > -{
> > > > -	int i, offset = sseu_eu_idx(sseu, slice, subslice);
> > > > -	u16 eu_mask = 0;
> > > > -
> > > > -	for (i = 0;
> > > > -	     i < DIV_ROUND_UP(sseu->max_eus_per_subslice,
> > > > BITS_PER_BYTE); i++) {
> > > > -		eu_mask |= ((u16)sseu->eu_mask[offset + i]) <<
> > > > -			(i * BITS_PER_BYTE);
> > > > -	}
> > > > -
> > > > -	return eu_mask;
> > > > -}
> > > > -
> > > > -static inline void sseu_set_eus(struct sseu_dev_info *sseu,
> > > > -				int slice, int subslice, u16
> > > > eu_mask)
> > > > -{
> > > > -	int i, offset = sseu_eu_idx(sseu, slice, subslice);
> > > > -
> > > > -	for (i = 0;
> > > > -	     i < DIV_ROUND_UP(sseu->max_eus_per_subslice,
> > > > BITS_PER_BYTE); i++) {
> > > > -		sseu->eu_mask[offset + i] =
> > > > -			(eu_mask >> (BITS_PER_BYTE * i)) &
> > > > 0xff;
> > > > -	}
> > > > -}
> > > > +void intel_sseu_set_eus(struct sseu_dev_info *sseu, int slice,
> > > > int
> > > > subslice,
> > > > +			u16 eu_mask);
> > > >    
> > > >    u32 intel_sseu_make_rpcs(struct drm_i915_private *i915,
> > > >    			 const struct intel_sseu *req_sseu);
> > > > diff --git a/drivers/gpu/drm/i915/i915_debugfs.c
> > > > b/drivers/gpu/drm/i915/i915_debugfs.c
> > > > index fe854c629a32..3f3ee83ac315 100644
> > > > --- a/drivers/gpu/drm/i915/i915_debugfs.c
> > > > +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> > > > @@ -4158,7 +4158,7 @@ static void
> > > > broadwell_sseu_device_status(struct drm_i915_private *dev_priv,
> > > >    				RUNTIME_INFO(dev_priv)-
> > > > > sseu.subslice_mask[s];
> > > > 
> > > >    		}
> > > >    		sseu->eu_total = sseu->eu_per_subslice *
> > > > -				 sseu_subslice_total(sseu);
> > > > +				 intel_sseu_subslice_total(sseu
> > > > );
> > > >    
> > > >    		/* subtract fused off EU(s) from enabled
> > > > slice(s) */
> > > >    		for (s = 0; s < fls(sseu->slice_mask); s++) {
> > > > @@ -4182,10 +4182,10 @@ static void i915_print_sseu_info(struct
> > > > seq_file *m, bool is_available_info,
> > > >    	seq_printf(m, "  %s Slice Total: %u\n", type,
> > > >    		   hweight8(sseu->slice_mask));
> > > >    	seq_printf(m, "  %s Subslice Total: %u\n", type,
> > > > -		   sseu_subslice_total(sseu));
> > > > +		   intel_sseu_subslice_total(sseu));
> > > >    	for (s = 0; s < fls(sseu->slice_mask); s++) {
> > > >    		seq_printf(m, "  %s Slice%i subslices: %u\n",
> > > > type,
> > > > -			   s, sseu_subslices_per_slice(sseu,
> > > > s));
> > > > +			   s,
> > > > intel_sseu_subslices_per_slice(sseu, s));
> > > >    	}
> > > >    	seq_printf(m, "  %s EU Total: %u\n", type,
> > > >    		   sseu->eu_total);
> > > > diff --git a/drivers/gpu/drm/i915/i915_drv.c
> > > > b/drivers/gpu/drm/i915/i915_drv.c
> > > > index c376244c19c4..130c5140db0d 100644
> > > > --- a/drivers/gpu/drm/i915/i915_drv.c
> > > > +++ b/drivers/gpu/drm/i915/i915_drv.c
> > > > @@ -378,7 +378,7 @@ static int i915_getparam_ioctl(struct
> > > > drm_device *dev, void *data,
> > > >    		value = i915_cmd_parser_get_version(dev_priv);
> > > >    		break;
> > > >    	case I915_PARAM_SUBSLICE_TOTAL:
> > > > -		value = sseu_subslice_total(sseu);
> > > > +		value = intel_sseu_subslice_total(sseu);
> > > >    		if (!value)
> > > >    			return -ENODEV;
> > > >    		break;
> > > > diff --git a/drivers/gpu/drm/i915/intel_device_info.c
> > > > b/drivers/gpu/drm/i915/intel_device_info.c
> > > > index 559cf0d0628e..e1dbccf04cd9 100644
> > > > --- a/drivers/gpu/drm/i915/intel_device_info.c
> > > > +++ b/drivers/gpu/drm/i915/intel_device_info.c
> > > > @@ -90,10 +90,10 @@ static void sseu_dump(const struct
> > > > sseu_dev_info *sseu, struct drm_printer *p)
> > > >    
> > > >    	drm_printf(p, "slice total: %u, mask=%04x\n",
> > > >    		   hweight8(sseu->slice_mask), sseu-
> > > > >slice_mask);
> > > > -	drm_printf(p, "subslice total: %u\n",
> > > > sseu_subslice_total(sseu));
> > > > +	drm_printf(p, "subslice total: %u\n",
> > > > intel_sseu_subslice_total(sseu));
> > > >    	for (s = 0; s < sseu->max_slices; s++) {
> > > >    		drm_printf(p, "slice%d: %u subslices,
> > > > mask=%04x\n",
> > > > -			   s, sseu_subslices_per_slice(sseu,
> > > > s),
> > > > +			   s,
> > > > intel_sseu_subslices_per_slice(sseu, s),
> > > >    			   sseu->subslice_mask[s]);
> > > >    	}
> > > >    	drm_printf(p, "EU total: %u\n", sseu->eu_total);
> > > > @@ -126,11 +126,11 @@ void
> > > > intel_device_info_dump_topology(const
> > > > struct sseu_dev_info *sseu,
> > > >    
> > > >    	for (s = 0; s < sseu->max_slices; s++) {
> > > >    		drm_printf(p, "slice%d: %u subslice(s)
> > > > (0x%hhx):\n",
> > > > -			   s, sseu_subslices_per_slice(sseu,
> > > > s),
> > > > +			   s,
> > > > intel_sseu_subslices_per_slice(sseu, s),
> > > >    			   sseu->subslice_mask[s]);
> > > >    
> > > >    		for (ss = 0; ss < sseu->max_subslices; ss++) {
> > > > -			u16 enabled_eus = sseu_get_eus(sseu, s,
> > > > ss);
> > > > +			u16 enabled_eus =
> > > > intel_sseu_get_eus(sseu, s,
> > > > ss);
> > > >    
> > > >    			drm_printf(p, "\tsubslice%d: %u EUs
> > > > (0x%hx)\n",
> > > >    				   ss, hweight16(enabled_eus),
> > > > enabled_eus);
> > > > @@ -180,7 +180,7 @@ static void gen11_sseu_info_init(struct
> > > > drm_i915_private *dev_priv)
> > > >    			sseu->subslice_mask[s] = (ss_en >>
> > > > ss_idx) &
> > > > ss_en_mask;
> > > >    			for (ss = 0; ss < sseu->max_subslices;
> > > > ss++) {
> > > >    				if (sseu->subslice_mask[s] &
> > > > BIT(ss))
> > > > -					sseu_set_eus(sseu, s,
> > > > ss,
> > > > eu_en);
> > > > +					intel_sseu_set_eus(sseu
> > > > , s, ss,
> > > > eu_en);
> > > >    			}
> > > >    		}
> > > >    	}
> > > > @@ -222,32 +222,32 @@ static void gen10_sseu_info_init(struct
> > > > drm_i915_private *dev_priv)
> > > >    	/* Slice0 */
> > > >    	eu_en = ~I915_READ(GEN8_EU_DISABLE0);
> > > >    	for (ss = 0; ss < sseu->max_subslices; ss++)
> > > > -		sseu_set_eus(sseu, 0, ss, (eu_en >> (8 * ss)) &
> > > > eu_mask);
> > > > +		intel_sseu_set_eus(sseu, 0, ss, (eu_en >> (8 *
> > > > ss)) &
> > > > eu_mask);
> > > >    	/* Slice1 */
> > > > -	sseu_set_eus(sseu, 1, 0, (eu_en >> 24) & eu_mask);
> > > > +	intel_sseu_set_eus(sseu, 1, 0, (eu_en >> 24) &
> > > > eu_mask);
> > > >    	eu_en = ~I915_READ(GEN8_EU_DISABLE1);
> > > > -	sseu_set_eus(sseu, 1, 1, eu_en & eu_mask);
> > > > +	intel_sseu_set_eus(sseu, 1, 1, eu_en & eu_mask);
> > > >    	/* Slice2 */
> > > > -	sseu_set_eus(sseu, 2, 0, (eu_en >> 8) & eu_mask);
> > > > -	sseu_set_eus(sseu, 2, 1, (eu_en >> 16) & eu_mask);
> > > > +	intel_sseu_set_eus(sseu, 2, 0, (eu_en >> 8) & eu_mask);
> > > > +	intel_sseu_set_eus(sseu, 2, 1, (eu_en >> 16) &
> > > > eu_mask);
> > > >    	/* Slice3 */
> > > > -	sseu_set_eus(sseu, 3, 0, (eu_en >> 24) & eu_mask);
> > > > +	intel_sseu_set_eus(sseu, 3, 0, (eu_en >> 24) &
> > > > eu_mask);
> > > >    	eu_en = ~I915_READ(GEN8_EU_DISABLE2);
> > > > -	sseu_set_eus(sseu, 3, 1, eu_en & eu_mask);
> > > > +	intel_sseu_set_eus(sseu, 3, 1, eu_en & eu_mask);
> > > >    	/* Slice4 */
> > > > -	sseu_set_eus(sseu, 4, 0, (eu_en >> 8) & eu_mask);
> > > > -	sseu_set_eus(sseu, 4, 1, (eu_en >> 16) & eu_mask);
> > > > +	intel_sseu_set_eus(sseu, 4, 0, (eu_en >> 8) & eu_mask);
> > > > +	intel_sseu_set_eus(sseu, 4, 1, (eu_en >> 16) &
> > > > eu_mask);
> > > >    	/* Slice5 */
> > > > -	sseu_set_eus(sseu, 5, 0, (eu_en >> 24) & eu_mask);
> > > > +	intel_sseu_set_eus(sseu, 5, 0, (eu_en >> 24) &
> > > > eu_mask);
> > > >    	eu_en = ~I915_READ(GEN10_EU_DISABLE3);
> > > > -	sseu_set_eus(sseu, 5, 1, eu_en & eu_mask);
> > > > +	intel_sseu_set_eus(sseu, 5, 1, eu_en & eu_mask);
> > > >    
> > > >    	/* Do a second pass where we mark the subslices
> > > > disabled if all
> > > > their
> > > >    	 * eus are off.
> > > >    	 */
> > > >    	for (s = 0; s < sseu->max_slices; s++) {
> > > >    		for (ss = 0; ss < sseu->max_subslices; ss++) {
> > > > -			if (sseu_get_eus(sseu, s, ss) == 0)
> > > > +			if (intel_sseu_get_eus(sseu, s, ss) ==
> > > > 0)
> > > >    				sseu->subslice_mask[s] &=
> > > > ~BIT(ss);
> > > >    		}
> > > >    	}
> > > > @@ -260,9 +260,10 @@ static void gen10_sseu_info_init(struct
> > > > drm_i915_private *dev_priv)
> > > >    	 * EU in any one subslice may be fused off for die
> > > >    	 * recovery.
> > > >    	 */
> > > > -	sseu->eu_per_subslice = sseu_subslice_total(sseu) ?
> > > > +	sseu->eu_per_subslice = intel_sseu_subslice_total(sseu)
> > > > ?
> > > >    				DIV_ROUND_UP(sseu->eu_total,
> > > > -					     sseu_subslice_tota
> > > > l(sseu))
> > > > : 0;
> > > > +					     intel_sseu_subslic
> > > > e_total(
> > > > sseu)) :
> > > > +				0;
> > > >    
> > > >    	/* No restrictions on Power Gating */
> > > >    	sseu->has_slice_pg = 1;
> > > > @@ -290,7 +291,7 @@ static void
> > > > cherryview_sseu_info_init(struct
> > > > drm_i915_private *dev_priv)
> > > >    			  CHV_FGT_EU_DIS_SS0_R1_SHIFT) << 4);
> > > >    
> > > >    		sseu->subslice_mask[0] |= BIT(0);
> > > > -		sseu_set_eus(sseu, 0, 0, ~disabled_mask);
> > > > +		intel_sseu_set_eus(sseu, 0, 0, ~disabled_mask);
> > > >    	}
> > > >    
> > > >    	if (!(fuse & CHV_FGT_DISABLE_SS1)) {
> > > > @@ -301,7 +302,7 @@ static void
> > > > cherryview_sseu_info_init(struct
> > > > drm_i915_private *dev_priv)
> > > >    			  CHV_FGT_EU_DIS_SS1_R1_SHIFT) << 4);
> > > >    
> > > >    		sseu->subslice_mask[0] |= BIT(1);
> > > > -		sseu_set_eus(sseu, 0, 1, ~disabled_mask);
> > > > +		intel_sseu_set_eus(sseu, 0, 1, ~disabled_mask);
> > > >    	}
> > > >    
> > > >    	sseu->eu_total = compute_eu_total(sseu);
> > > > @@ -310,8 +311,8 @@ static void
> > > > cherryview_sseu_info_init(struct
> > > > drm_i915_private *dev_priv)
> > > >    	 * CHV expected to always have a uniform distribution
> > > > of EU
> > > >    	 * across subslices.
> > > >    	*/
> > > > -	sseu->eu_per_subslice = sseu_subslice_total(sseu) ?
> > > > -				sseu->eu_total /
> > > > sseu_subslice_total(sseu) :
> > > > +	sseu->eu_per_subslice = intel_sseu_subslice_total(sseu)
> > > > ?
> > > > +				sseu->eu_total /
> > > > intel_sseu_subslice_total(sseu) :
> > > >    				0;
> > > >    	/*
> > > >    	 * CHV supports subslice power gating on devices with
> > > > more than
> > > > @@ -319,7 +320,7 @@ static void
> > > > cherryview_sseu_info_init(struct
> > > > drm_i915_private *dev_priv)
> > > >    	 * more than one EU pair per subslice.
> > > >    	*/
> > > >    	sseu->has_slice_pg = 0;
> > > > -	sseu->has_subslice_pg = sseu_subslice_total(sseu) > 1;
> > > > +	sseu->has_subslice_pg = intel_sseu_subslice_total(sseu)
> > > > > 1;
> > > >    	sseu->has_eu_pg = (sseu->eu_per_subslice > 2);
> > > >    }
> > > >    
> > > > @@ -369,7 +370,7 @@ static void gen9_sseu_info_init(struct
> > > > drm_i915_private *dev_priv)
> > > >    
> > > >    			eu_disabled_mask = (eu_disable >> (ss *
> > > > 8)) &
> > > > eu_mask;
> > > >    
> > > > -			sseu_set_eus(sseu, s, ss,
> > > > ~eu_disabled_mask);
> > > > +			intel_sseu_set_eus(sseu, s, ss,
> > > > ~eu_disabled_mask);
> > > >    
> > > >    			eu_per_ss = sseu->max_eus_per_subslice
> > > > -
> > > >    				hweight8(eu_disabled_mask);
> > > > @@ -393,9 +394,10 @@ static void gen9_sseu_info_init(struct
> > > > drm_i915_private *dev_priv)
> > > >    	 * recovery. BXT is expected to be perfectly uniform in
> > > > EU
> > > >    	 * distribution.
> > > >    	*/
> > > > -	sseu->eu_per_subslice = sseu_subslice_total(sseu) ?
> > > > +	sseu->eu_per_subslice = intel_sseu_subslice_total(sseu)
> > > > ?
> > > >    				DIV_ROUND_UP(sseu->eu_total,
> > > > -					     sseu_subslice_tota
> > > > l(sseu))
> > > > : 0;
> > > > +					     intel_sseu_subslic
> > > > e_total(
> > > > sseu)) :
> > > > +				0;
> > > >    	/*
> > > >    	 * SKL+ supports slice power gating on devices with
> > > > more than
> > > >    	 * one slice, and supports EU power gating on devices
> > > > with
> > > > @@ -407,7 +409,7 @@ static void gen9_sseu_info_init(struct
> > > > drm_i915_private *dev_priv)
> > > >    	sseu->has_slice_pg =
> > > >    		!IS_GEN9_LP(dev_priv) && hweight8(sseu-
> > > > >slice_mask) >
> > > > 1;
> > > >    	sseu->has_subslice_pg =
> > > > -		IS_GEN9_LP(dev_priv) &&
> > > > sseu_subslice_total(sseu) > 1;
> > > > +		IS_GEN9_LP(dev_priv) &&
> > > > intel_sseu_subslice_total(sseu)
> > > > > 1;
> > > > 
> > > >    	sseu->has_eu_pg = sseu->eu_per_subslice > 2;
> > > >    
> > > >    	if (IS_GEN9_LP(dev_priv)) {
> > > > @@ -477,7 +479,7 @@ static void broadwell_sseu_info_init(struct
> > > > drm_i915_private *dev_priv)
> > > >    			eu_disabled_mask =
> > > >    				eu_disable[s] >> (ss * sseu-
> > > > > max_eus_per_subslice);
> > > > 
> > > >    
> > > > -			sseu_set_eus(sseu, s, ss,
> > > > ~eu_disabled_mask);
> > > > +			intel_sseu_set_eus(sseu, s, ss,
> > > > ~eu_disabled_mask);
> > > >    
> > > >    			n_disabled =
> > > > hweight8(eu_disabled_mask);
> > > >    
> > > > @@ -496,9 +498,10 @@ static void
> > > > broadwell_sseu_info_init(struct
> > > > drm_i915_private *dev_priv)
> > > >    	 * subslices with the exception that any one EU in any
> > > > one
> > > > subslice may
> > > >    	 * be fused off for die recovery.
> > > >    	 */
> > > > -	sseu->eu_per_subslice = sseu_subslice_total(sseu) ?
> > > > +	sseu->eu_per_subslice = intel_sseu_subslice_total(sseu)
> > > > ?
> > > >    				DIV_ROUND_UP(sseu->eu_total,
> > > > -					     sseu_subslice_tota
> > > > l(sseu))
> > > > : 0;
> > > > +					     intel_sseu_subslic
> > > > e_total(
> > > > sseu)) :
> > > > +				0;
> > > >    
> > > >    	/*
> > > >    	 * BDW supports slice power gating on devices with more
> > > > than
> > > > @@ -561,8 +564,8 @@ static void haswell_sseu_info_init(struct
> > > > drm_i915_private *dev_priv)
> > > >    
> > > >    	for (s = 0; s < sseu->max_slices; s++) {
> > > >    		for (ss = 0; ss < sseu->max_subslices; ss++) {
> > > > -			sseu_set_eus(sseu, s, ss,
> > > > -				     (1UL << sseu-
> > > > >eu_per_subslice) -
> > > > 1);
> > > > +			intel_sseu_set_eus(sseu, s, ss,
> > > > +					   (1UL << sseu-
> > > > > eu_per_subslice) - 1);
> > > > 
> > > >    		}
> > > >    	}
> > > >    
> > > > 

[-- Attachment #1.2: smime.p7s --]
[-- Type: application/x-pkcs7-signature, Size: 3270 bytes --]

[-- Attachment #2: Type: text/plain, Size: 159 bytes --]

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 6/6] drm/i915: Expand subslice mask
  2019-05-01 15:34 ` [PATCH 6/6] drm/i915: Expand subslice mask Stuart Summers
  2019-05-01 18:22   ` Tvrtko Ursulin
@ 2019-05-01 22:04   ` Daniele Ceraolo Spurio
  2019-05-02 14:47     ` Summers, Stuart
  1 sibling, 1 reply; 35+ messages in thread
From: Daniele Ceraolo Spurio @ 2019-05-01 22:04 UTC (permalink / raw)
  To: Stuart Summers, intel-gfx



On 5/1/19 8:34 AM, Stuart Summers wrote:
> Currently, the subslice_mask runtime parameter is stored as an
> array of subslices per slice. Expand the subslice mask array to
> better match what is presented to userspace through the
> I915_QUERY_TOPOLOGY_INFO ioctl. The index into this array is
> then calculated:
>    slice * subslice stride + subslice index / 8
> 
> v2: fix spacing in set_sseu_info args
>      use set_sseu_info to initialize sseu data when building
>      device status in debugfs
>      rename variables in intel_engine_types.h to avoid checkpatch
>      warnings
> v3: update headers in intel_sseu.h
> v4: add const to some sseu_dev_info variables
>      use sseu->eu_stride for EU stride calculations
> 
> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> Signed-off-by: Stuart Summers <stuart.summers@intel.com>

Can you also get an ack from Lionel, to make sure this all fits with the 
expected reporting?

> ---
>   drivers/gpu/drm/i915/gt/intel_engine_cs.c    |   6 +-
>   drivers/gpu/drm/i915/gt/intel_engine_types.h |  32 +++--
>   drivers/gpu/drm/i915/gt/intel_hangcheck.c    |   3 +-
>   drivers/gpu/drm/i915/gt/intel_sseu.c         |  49 +++++--
>   drivers/gpu/drm/i915/gt/intel_sseu.h         |  16 ++-
>   drivers/gpu/drm/i915/gt/intel_workarounds.c  |   2 +-
>   drivers/gpu/drm/i915/i915_debugfs.c          |  44 +++---
>   drivers/gpu/drm/i915/i915_drv.c              |   6 +-
>   drivers/gpu/drm/i915/i915_gpu_error.c        |   5 +-
>   drivers/gpu/drm/i915/i915_query.c            |  10 +-
>   drivers/gpu/drm/i915/intel_device_info.c     | 142 +++++++++++--------
>   11 files changed, 198 insertions(+), 117 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> index 6e40f8ea9a6a..8f7967cc9a50 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> @@ -914,7 +914,7 @@ u32 intel_calculate_mcr_s_ss_select(struct drm_i915_private *dev_priv)
>   	const struct sseu_dev_info *sseu = &RUNTIME_INFO(dev_priv)->sseu;
>   	u32 mcr_s_ss_select;
>   	u32 slice = fls(sseu->slice_mask);
> -	u32 subslice = fls(sseu->subslice_mask[slice]);
> +	u32 subslice = fls(sseu->subslice_mask[slice * sseu->ss_stride]);

This (and the registers we use below) only works if ss_stride = 1. Can 
we add a:

	GEM_BUG_ON(sseu->ss_stride > 1);

to catch the fact that this function will need updating to handle that 
case if/when we get it?

>   
>   	if (IS_GEN(dev_priv, 10))
>   		mcr_s_ss_select = GEN8_MCR_SLICE(slice) |
> @@ -990,6 +990,7 @@ void intel_engine_get_instdone(struct intel_engine_cs *engine,
>   			       struct intel_instdone *instdone)
>   {
>   	struct drm_i915_private *dev_priv = engine->i915;
> +	const struct sseu_dev_info *sseu = &RUNTIME_INFO(dev_priv)->sseu;
>   	struct intel_uncore *uncore = engine->uncore;
>   	u32 mmio_base = engine->mmio_base;
>   	int slice;
> @@ -1007,7 +1008,8 @@ void intel_engine_get_instdone(struct intel_engine_cs *engine,
>   
>   		instdone->slice_common =
>   			intel_uncore_read(uncore, GEN7_SC_INSTDONE);
> -		for_each_instdone_slice_subslice(dev_priv, slice, subslice) {
> +		for_each_instdone_slice_subslice(dev_priv, sseu, slice,
> +						 subslice) {
>   			instdone->sampler[slice][subslice] =
>   				read_subslice_reg(dev_priv, slice, subslice,
>   						  GEN7_SAMPLER_INSTDONE);
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h
> index 9d64e33f8427..1710546a2446 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
> +++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
> @@ -534,20 +534,22 @@ intel_engine_needs_breadcrumb_tasklet(const struct intel_engine_cs *engine)
>   	return engine->flags & I915_ENGINE_NEEDS_BREADCRUMB_TASKLET;
>   }
>   
> -#define instdone_slice_mask(dev_priv__) \
> -	(IS_GEN(dev_priv__, 7) ? \
> -	 1 : RUNTIME_INFO(dev_priv__)->sseu.slice_mask)
> -
> -#define instdone_subslice_mask(dev_priv__) \
> -	(IS_GEN(dev_priv__, 7) ? \
> -	 1 : RUNTIME_INFO(dev_priv__)->sseu.subslice_mask[0])
> -
> -#define for_each_instdone_slice_subslice(dev_priv__, slice__, subslice__) \
> -	for ((slice__) = 0, (subslice__) = 0; \
> -	     (slice__) < I915_MAX_SLICES; \
> -	     (subslice__) = ((subslice__) + 1) < I915_MAX_SUBSLICES ? (subslice__) + 1 : 0, \
> -	       (slice__) += ((subslice__) == 0)) \
> -		for_each_if((BIT(slice__) & instdone_slice_mask(dev_priv__)) && \
> -			    (BIT(subslice__) & instdone_subslice_mask(dev_priv__)))
> +#define instdone_has_slice(dev_priv___, sseu___, slice___) \
> +	((IS_GEN(dev_priv___, 7) ? \
> +	  1 : (sseu___)->slice_mask) & \

I'd put the ternary op on the same line here for readability

> +	BIT(slice___)) \

no need for "\" here (and below).

> +
> +#define instdone_has_subslice(dev_priv__, sseu__, slice__, subslice__) \

need some more parenthesis in this macro to fix the MACRO_ARG_PRECEDENCE 
warning in checkpatch.

> +	((IS_GEN(dev_priv__, 7) ? \
> +	  1 : (sseu__)->subslice_mask[slice__ * (sseu__)->ss_stride + \
> +				      subslice__ / BITS_PER_BYTE]) & \

The calculation to get the correct subslice u8 entry:

	sseu->subslice_mask[s * sseu->ss_stride + ss / BITS_PER_BYTE]

seems to be repeated a few times in this patch, so it might be worth 
moving it to its own inline function. looks like you always ultimately 
want a bool, so we could also go a bit further and have something like:

static inline bool intel_sseu_has_subslice(sseu, s, ss)
{
	u8 mask = sseu->subslice_mask[s * sseu->ss_stride +
				      ss / BITS_PER_BYTE];

	return mask & BIT(ss % BITS_PER_BYTE);
}

and then do:

#define instdone_has_subslice(dev_priv__, sseu__, slice__, subslice__) \
	((IS_GEN(dev_priv__, 7) ? subslice__ == 0 : \
		intel_sseu_has_subslice(...))

> +	 BIT(subslice__ % BITS_PER_BYTE)) \
> +
> +#define for_each_instdone_slice_subslice(dev_priv_, sseu_, slice_, subslice_) \
> +	for ((slice_) = 0, (subslice_) = 0; (slice_) < I915_MAX_SLICES; \
> +	     (subslice_) = ((subslice_) + 1) < I915_MAX_SUBSLICES ? (subslice_) + 1 : 0, \

This ternary op should be simplifiable as:

	(subslice_) = ((subslice_) + 1) % I915_MAX_SUBSLICES,

> +	       (slice_) += ((subslice_) == 0)) \
> +		for_each_if(instdone_has_slice(dev_priv_, sseu_, slice) && \

missing the "_" after "slice"

> +			    instdone_has_subslice(dev_priv_, sseu_, slice_, subslice_)) \
>   
>   #endif /* __INTEL_ENGINE_TYPES_H__ */
> diff --git a/drivers/gpu/drm/i915/gt/intel_hangcheck.c b/drivers/gpu/drm/i915/gt/intel_hangcheck.c
> index e5eaa06fe74d..53c1c98161e1 100644
> --- a/drivers/gpu/drm/i915/gt/intel_hangcheck.c
> +++ b/drivers/gpu/drm/i915/gt/intel_hangcheck.c
> @@ -50,6 +50,7 @@ static bool instdone_unchanged(u32 current_instdone, u32 *old_instdone)
>   static bool subunits_stuck(struct intel_engine_cs *engine)
>   {
>   	struct drm_i915_private *dev_priv = engine->i915;
> +	const struct sseu_dev_info *sseu = &RUNTIME_INFO(dev_priv)->sseu;
>   	struct intel_instdone instdone;
>   	struct intel_instdone *accu_instdone = &engine->hangcheck.instdone;
>   	bool stuck;
> @@ -71,7 +72,7 @@ static bool subunits_stuck(struct intel_engine_cs *engine)
>   	stuck &= instdone_unchanged(instdone.slice_common,
>   				    &accu_instdone->slice_common);
>   
> -	for_each_instdone_slice_subslice(dev_priv, slice, subslice) {
> +	for_each_instdone_slice_subslice(dev_priv, sseu, slice, subslice) {
>   		stuck &= instdone_unchanged(instdone.sampler[slice][subslice],
>   					    &accu_instdone->sampler[slice][subslice]);
>   		stuck &= instdone_unchanged(instdone.row[slice][subslice],
> diff --git a/drivers/gpu/drm/i915/gt/intel_sseu.c b/drivers/gpu/drm/i915/gt/intel_sseu.c
> index 4a0b82fc108c..49316b7ef074 100644
> --- a/drivers/gpu/drm/i915/gt/intel_sseu.c
> +++ b/drivers/gpu/drm/i915/gt/intel_sseu.c
> @@ -8,6 +8,17 @@
>   #include "intel_lrc_reg.h"
>   #include "intel_sseu.h"
>   
> +void intel_sseu_set_info(struct sseu_dev_info *sseu, u8 max_slices,
> +			 u8 max_subslices, u8 max_eus_per_subslice)
> +{
> +	sseu->max_slices = max_slices;
> +	sseu->max_subslices = max_subslices;
> +	sseu->max_eus_per_subslice = max_eus_per_subslice;
> +
> +	sseu->ss_stride = GEN_SSEU_STRIDE(sseu->max_subslices);
> +	sseu->eu_stride = GEN_SSEU_STRIDE(sseu->max_eus_per_subslice);
> +}
> +
>   unsigned int
>   intel_sseu_subslice_total(const struct sseu_dev_info *sseu)
>   {
> @@ -22,17 +33,39 @@ intel_sseu_subslice_total(const struct sseu_dev_info *sseu)
>   unsigned int
>   intel_sseu_subslices_per_slice(const struct sseu_dev_info *sseu, u8 slice)

Here we pass slice as u8, but below we use int. Any reason for the 
difference?

>   {
> -	return hweight8(sseu->subslice_mask[slice]);
> +	unsigned int i, total = 0;
> +
> +	for (i = 0; i < sseu->ss_stride; i++)
> +		total += hweight8(sseu->subslice_mask[slice * sseu->ss_stride +
> +						      i]);
> +
> +	return total;
> +}
> +
> +void intel_sseu_copy_subslices(const struct sseu_dev_info *sseu, int slice,
> +			       u8 *to_mask, const u8 *from_mask)

You always use sseu->subslice_mask has a from_mask, can't we just get 
that from the sseu param and avoid the from_mask?

> +{
> +	int offset = slice * sseu->ss_stride;
> +
> +	memcpy(&to_mask[offset], &from_mask[offset], sseu->ss_stride);
> +}
> +
> +void intel_sseu_set_subslices(struct sseu_dev_info *sseu, int slice,
> +			      u32 ss_mask)
> +{
> +	int i, offset = slice * sseu->ss_stride;
> +
> +	for (i = 0; i < sseu->ss_stride; i++)
> +		sseu->subslice_mask[offset + i] =
> +			(ss_mask >> (BITS_PER_BYTE * i)) & 0xff;
>   }
>   
>   static int intel_sseu_eu_idx(const struct sseu_dev_info *sseu, int slice,
>   			     int subslice)
>   {
> -	int subslice_stride = DIV_ROUND_UP(sseu->max_eus_per_subslice,
> -					   BITS_PER_BYTE);
> -	int slice_stride = sseu->max_subslices * subslice_stride;
> +	int slice_stride = sseu->max_subslices * sseu->eu_stride;
>   
> -	return slice * slice_stride + subslice * subslice_stride;
> +	return slice * slice_stride + subslice * sseu->eu_stride;
>   }
>   
>   u16 intel_sseu_get_eus(const struct sseu_dev_info *sseu, int slice,
> @@ -41,8 +74,7 @@ u16 intel_sseu_get_eus(const struct sseu_dev_info *sseu, int slice,
>   	int i, offset = intel_sseu_eu_idx(sseu, slice, subslice);
>   	u16 eu_mask = 0;
>   
> -	for (i = 0;
> -	     i < DIV_ROUND_UP(sseu->max_eus_per_subslice, BITS_PER_BYTE); i++) {
> +	for (i = 0; i < sseu->eu_stride; i++) {
>   		eu_mask |= ((u16)sseu->eu_mask[offset + i]) <<
>   			(i * BITS_PER_BYTE);
>   	}
> @@ -55,8 +87,7 @@ void intel_sseu_set_eus(struct sseu_dev_info *sseu, int slice, int subslice,
>   {
>   	int i, offset = intel_sseu_eu_idx(sseu, slice, subslice);
>   
> -	for (i = 0;
> -	     i < DIV_ROUND_UP(sseu->max_eus_per_subslice, BITS_PER_BYTE); i++) {
> +	for (i = 0; i < sseu->eu_stride; i++) {
>   		sseu->eu_mask[offset + i] =
>   			(eu_mask >> (BITS_PER_BYTE * i)) & 0xff;
>   	}
> diff --git a/drivers/gpu/drm/i915/gt/intel_sseu.h b/drivers/gpu/drm/i915/gt/intel_sseu.h
> index 56e3721ae83f..bf01f338a8cc 100644
> --- a/drivers/gpu/drm/i915/gt/intel_sseu.h
> +++ b/drivers/gpu/drm/i915/gt/intel_sseu.h
> @@ -9,16 +9,18 @@
>   
>   #include <linux/types.h>
>   #include <linux/kernel.h>
> +#include <linux/string.h>
>   
>   struct drm_i915_private;
>   
>   #define GEN_MAX_SLICES		(6) /* CNL upper bound */
>   #define GEN_MAX_SUBSLICES	(8) /* ICL upper bound */
>   #define GEN_SSEU_STRIDE(bits) DIV_ROUND_UP(bits, BITS_PER_BYTE)
> +#define GEN_MAX_SUBSLICE_STRIDE GEN_SSEU_STRIDE(GEN_MAX_SUBSLICES)
>   
>   struct sseu_dev_info {
>   	u8 slice_mask;
> -	u8 subslice_mask[GEN_MAX_SLICES];
> +	u8 subslice_mask[GEN_MAX_SLICES * GEN_MAX_SUBSLICE_STRIDE];
>   	u16 eu_total;
>   	u8 eu_per_subslice;
>   	u8 min_eu_in_pool;
> @@ -33,6 +35,9 @@ struct sseu_dev_info {
>   	u8 max_subslices;
>   	u8 max_eus_per_subslice;
>   
> +	u8 ss_stride;
> +	u8 eu_stride;
> +
>   	/* We don't have more than 8 eus per subslice at the moment and as we
>   	 * store eus enabled using bits, no need to multiply by eus per
>   	 * subslice.
> @@ -63,12 +68,21 @@ intel_sseu_from_device_info(const struct sseu_dev_info *sseu)
>   	return value;
>   }
>   
> +void intel_sseu_set_info(struct sseu_dev_info *sseu, u8 max_slices,
> +			 u8 max_subslices, u8 max_eus_per_subslice);
> +
>   unsigned int
>   intel_sseu_subslice_total(const struct sseu_dev_info *sseu);
>   
>   unsigned int
>   intel_sseu_subslices_per_slice(const struct sseu_dev_info *sseu, u8 slice);
>   
> +void intel_sseu_copy_subslices(const struct sseu_dev_info *sseu, int slice,
> +			       u8 *to_mask, const u8 *from_mask);
> +
> +void intel_sseu_set_subslices(struct sseu_dev_info *sseu, int slice,
> +			      u32 ss_mask);
> +
>   u16 intel_sseu_get_eus(const struct sseu_dev_info *sseu, int slice,
>   		       int subslice);
>   
> diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c
> index 43e290306551..7c7e9556c1c5 100644
> --- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
> +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
> @@ -767,7 +767,7 @@ wa_init_mcr(struct drm_i915_private *i915, struct i915_wa_list *wal)
>   		u32 slice = fls(sseu->slice_mask);
>   		u32 fuse3 =
>   			intel_uncore_read(&i915->uncore, GEN10_MIRROR_FUSE3);
> -		u8 ss_mask = sseu->subslice_mask[slice];
> +		u8 ss_mask = sseu->subslice_mask[slice * sseu->ss_stride];

could use a

	GEM_BUG_ON(sseu->ss_stride > 1);

here as well to remind us this will need changes in that case

>   
>   		u8 enabled_mask = (ss_mask | ss_mask >>
>   				   GEN10_L3BANK_PAIR_COUNT) & GEN10_L3BANK_MASK;
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> index 3f3ee83ac315..08089c24db25 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -1257,6 +1257,7 @@ static void i915_instdone_info(struct drm_i915_private *dev_priv,
>   			       struct seq_file *m,
>   			       struct intel_instdone *instdone)
>   {
> +	struct sseu_dev_info *sseu = &RUNTIME_INFO(dev_priv)->sseu;
>   	int slice;
>   	int subslice;
>   
> @@ -1272,11 +1273,11 @@ static void i915_instdone_info(struct drm_i915_private *dev_priv,
>   	if (INTEL_GEN(dev_priv) <= 6)
>   		return;
>   
> -	for_each_instdone_slice_subslice(dev_priv, slice, subslice)
> +	for_each_instdone_slice_subslice(dev_priv, sseu, slice, subslice)
>   		seq_printf(m, "\t\tSAMPLER_INSTDONE[%d][%d]: 0x%08x\n",
>   			   slice, subslice, instdone->sampler[slice][subslice]);
>   
> -	for_each_instdone_slice_subslice(dev_priv, slice, subslice)
> +	for_each_instdone_slice_subslice(dev_priv, sseu, slice, subslice)
>   		seq_printf(m, "\t\tROW_INSTDONE[%d][%d]: 0x%08x\n",
>   			   slice, subslice, instdone->row[slice][subslice]);
>   }
> @@ -4066,7 +4067,9 @@ static void gen10_sseu_device_status(struct drm_i915_private *dev_priv,
>   			continue;
>   
>   		sseu->slice_mask |= BIT(s);
> -		sseu->subslice_mask[s] = info->sseu.subslice_mask[s];
> +		intel_sseu_copy_subslices(&info->sseu, s,
> +					  sseu->subslice_mask,
> +					  info->sseu.subslice_mask);
>   
>   		for (ss = 0; ss < info->sseu.max_subslices; ss++) {
>   			unsigned int eu_cnt;
> @@ -4117,18 +4120,22 @@ static void gen9_sseu_device_status(struct drm_i915_private *dev_priv,
>   		sseu->slice_mask |= BIT(s);
>   
>   		if (IS_GEN9_BC(dev_priv))
> -			sseu->subslice_mask[s] =
> -				RUNTIME_INFO(dev_priv)->sseu.subslice_mask[s];
> +			intel_sseu_copy_subslices(&info->sseu, s,
> +						  sseu->subslice_mask,
> +						  info->sseu.subslice_mask);
>   
>   		for (ss = 0; ss < info->sseu.max_subslices; ss++) {
>   			unsigned int eu_cnt;
> +			u8 ss_idx = s * info->sseu.ss_stride +
> +				    ss / BITS_PER_BYTE;
>   
>   			if (IS_GEN9_LP(dev_priv)) {
>   				if (!(s_reg[s] & (GEN9_PGCTL_SS_ACK(ss))))
>   					/* skip disabled subslice */
>   					continue;
>   
> -				sseu->subslice_mask[s] |= BIT(ss);
> +				sseu->subslice_mask[ss_idx] |=
> +					BIT(ss % BITS_PER_BYTE);
>   			}
>   
>   			eu_cnt = 2 * hweight32(eu_reg[2*s + ss/2] &
> @@ -4145,25 +4152,24 @@ static void gen9_sseu_device_status(struct drm_i915_private *dev_priv,
>   static void broadwell_sseu_device_status(struct drm_i915_private *dev_priv,
>   					 struct sseu_dev_info *sseu)
>   {
> +	struct intel_runtime_info *info = RUNTIME_INFO(dev_priv);
>   	u32 slice_info = I915_READ(GEN8_GT_SLICE_INFO);
>   	int s;
>   
>   	sseu->slice_mask = slice_info & GEN8_LSLICESTAT_MASK;
>   
>   	if (sseu->slice_mask) {
> -		sseu->eu_per_subslice =
> -			RUNTIME_INFO(dev_priv)->sseu.eu_per_subslice;
> -		for (s = 0; s < fls(sseu->slice_mask); s++) {
> -			sseu->subslice_mask[s] =
> -				RUNTIME_INFO(dev_priv)->sseu.subslice_mask[s];
> -		}
> +		sseu->eu_per_subslice = info->sseu.eu_per_subslice;
> +		for (s = 0; s < fls(sseu->slice_mask); s++)
> +			intel_sseu_copy_subslices(&info->sseu, s,
> +						  sseu->subslice_mask,
> +						  info->sseu.subslice_mask);
>   		sseu->eu_total = sseu->eu_per_subslice *
>   				 intel_sseu_subslice_total(sseu);
>   
>   		/* subtract fused off EU(s) from enabled slice(s) */
>   		for (s = 0; s < fls(sseu->slice_mask); s++) {
> -			u8 subslice_7eu =
> -				RUNTIME_INFO(dev_priv)->sseu.subslice_7eu[s];
> +			u8 subslice_7eu = info->sseu.subslice_7eu[s];
>   
>   			sseu->eu_total -= hweight8(subslice_7eu);
>   		}
> @@ -4210,6 +4216,7 @@ static void i915_print_sseu_info(struct seq_file *m, bool is_available_info,
>   static int i915_sseu_status(struct seq_file *m, void *unused)
>   {
>   	struct drm_i915_private *dev_priv = node_to_i915(m->private);
> +	const struct intel_runtime_info *info = RUNTIME_INFO(dev_priv);
>   	struct sseu_dev_info sseu;
>   	intel_wakeref_t wakeref;
>   
> @@ -4217,14 +4224,13 @@ static int i915_sseu_status(struct seq_file *m, void *unused)
>   		return -ENODEV;
>   
>   	seq_puts(m, "SSEU Device Info\n");
> -	i915_print_sseu_info(m, true, &RUNTIME_INFO(dev_priv)->sseu);
> +	i915_print_sseu_info(m, true, &info->sseu);
>   
>   	seq_puts(m, "SSEU Device Status\n");
>   	memset(&sseu, 0, sizeof(sseu));
> -	sseu.max_slices = RUNTIME_INFO(dev_priv)->sseu.max_slices;
> -	sseu.max_subslices = RUNTIME_INFO(dev_priv)->sseu.max_subslices;
> -	sseu.max_eus_per_subslice =
> -		RUNTIME_INFO(dev_priv)->sseu.max_eus_per_subslice;
> +	intel_sseu_set_info(&sseu, info->sseu.max_slices,
> +			    info->sseu.max_subslices,
> +			    info->sseu.max_eus_per_subslice);
>   
>   	with_intel_runtime_pm(dev_priv, wakeref) {
>   		if (IS_CHERRYVIEW(dev_priv))
> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> index 130c5140db0d..6afe4e3afea4 100644
> --- a/drivers/gpu/drm/i915/i915_drv.c
> +++ b/drivers/gpu/drm/i915/i915_drv.c
> @@ -326,7 +326,7 @@ static int i915_getparam_ioctl(struct drm_device *dev, void *data,
>   	struct pci_dev *pdev = dev_priv->drm.pdev;
>   	const struct sseu_dev_info *sseu = &RUNTIME_INFO(dev_priv)->sseu;
>   	drm_i915_getparam_t *param = data;
> -	int value;
> +	int value = 0;
>   
>   	switch (param->param) {
>   	case I915_PARAM_IRQ_ACTIVE:
> @@ -455,7 +455,9 @@ static int i915_getparam_ioctl(struct drm_device *dev, void *data,
>   			return -ENODEV;
>   		break;
>   	case I915_PARAM_SUBSLICE_MASK:
> -		value = sseu->subslice_mask[0];
> +		/* Only copy bits from the first subslice */

s/subslice/slice/ ?

> +		memcpy(&value, sseu->subslice_mask,
> +		       min(sseu->ss_stride, (u8)sizeof(value)));
>   		if (!value)
>   			return -ENODEV;
>   		break;
> diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
> index e1b858bd1d32..140918dd9b7d 100644
> --- a/drivers/gpu/drm/i915/i915_gpu_error.c
> +++ b/drivers/gpu/drm/i915/i915_gpu_error.c
> @@ -407,6 +407,7 @@ static void print_error_buffers(struct drm_i915_error_state_buf *m,
>   static void error_print_instdone(struct drm_i915_error_state_buf *m,
>   				 const struct drm_i915_error_engine *ee)
>   {
> +	struct sseu_dev_info *sseu = &RUNTIME_INFO(m->i915)->sseu;
>   	int slice;
>   	int subslice;
>   
> @@ -422,12 +423,12 @@ static void error_print_instdone(struct drm_i915_error_state_buf *m,
>   	if (INTEL_GEN(m->i915) <= 6)
>   		return;
>   
> -	for_each_instdone_slice_subslice(m->i915, slice, subslice)
> +	for_each_instdone_slice_subslice(m->i915, sseu, slice, subslice)
>   		err_printf(m, "  SAMPLER_INSTDONE[%d][%d]: 0x%08x\n",
>   			   slice, subslice,
>   			   ee->instdone.sampler[slice][subslice]);
>   
> -	for_each_instdone_slice_subslice(m->i915, slice, subslice)
> +	for_each_instdone_slice_subslice(m->i915, sseu, slice, subslice)
>   		err_printf(m, "  ROW_INSTDONE[%d][%d]: 0x%08x\n",
>   			   slice, subslice,
>   			   ee->instdone.row[slice][subslice]);
> diff --git a/drivers/gpu/drm/i915/i915_query.c b/drivers/gpu/drm/i915/i915_query.c
> index 7c1708c22811..000dcb145ce0 100644
> --- a/drivers/gpu/drm/i915/i915_query.c
> +++ b/drivers/gpu/drm/i915/i915_query.c
> @@ -37,8 +37,6 @@ static int query_topology_info(struct drm_i915_private *dev_priv,
>   	const struct sseu_dev_info *sseu = &RUNTIME_INFO(dev_priv)->sseu;
>   	struct drm_i915_query_topology_info topo;
>   	u32 slice_length, subslice_length, eu_length, total_length;
> -	u8 subslice_stride = GEN_SSEU_STRIDE(sseu->max_subslices);
> -	u8 eu_stride = GEN_SSEU_STRIDE(sseu->max_eus_per_subslice);
>   	int ret;
>   
>   	if (query_item->flags != 0)
> @@ -50,8 +48,8 @@ static int query_topology_info(struct drm_i915_private *dev_priv,
>   	BUILD_BUG_ON(sizeof(u8) != sizeof(sseu->slice_mask));
>   
>   	slice_length = sizeof(sseu->slice_mask);
> -	subslice_length = sseu->max_slices * subslice_stride;
> -	eu_length = sseu->max_slices * sseu->max_subslices * eu_stride;
> +	subslice_length = sseu->max_slices * sseu->ss_stride;
> +	eu_length = sseu->max_slices * sseu->max_subslices * sseu->eu_stride;
>   	total_length = sizeof(topo) + slice_length + subslice_length +
>   		       eu_length;
>   
> @@ -69,9 +67,9 @@ static int query_topology_info(struct drm_i915_private *dev_priv,
>   	topo.max_eus_per_subslice = sseu->max_eus_per_subslice;
>   
>   	topo.subslice_offset = slice_length;
> -	topo.subslice_stride = subslice_stride;
> +	topo.subslice_stride = sseu->ss_stride;
>   	topo.eu_offset = slice_length + subslice_length;
> -	topo.eu_stride = eu_stride;
> +	topo.eu_stride = sseu->eu_stride;
>   
>   	if (__copy_to_user(u64_to_user_ptr(query_item->data_ptr),
>   			   &topo, sizeof(topo)))
> diff --git a/drivers/gpu/drm/i915/intel_device_info.c b/drivers/gpu/drm/i915/intel_device_info.c
> index e1dbccf04cd9..bbbc0a8c2183 100644
> --- a/drivers/gpu/drm/i915/intel_device_info.c
> +++ b/drivers/gpu/drm/i915/intel_device_info.c
> @@ -84,17 +84,42 @@ void intel_device_info_dump_flags(const struct intel_device_info *info,
>   #undef PRINT_FLAG
>   }
>   
> +#define SS_STR_MAX_SIZE (GEN_MAX_SUBSLICE_STRIDE * 2)
> +
> +static u8 *
> +subslice_per_slice_str(u8 *buf, const struct sseu_dev_info *sseu, u8 slice)
> +{
> +	int i;
> +	u8 ss_offset = slice * sseu->ss_stride;
> +
> +	GEM_BUG_ON(slice >= sseu->max_slices);
> +
> +	memset(buf, 0, SS_STR_MAX_SIZE);
> +
> +	/*
> +	 * Print subslice information in reverse order to match
> +	 * userspace expectations.
> +	 */
> +	for (i = 0; i < sseu->ss_stride; i++)
> +		sprintf(&buf[i * 2], "%02x",
> +			sseu->subslice_mask[ss_offset + sseu->ss_stride -
> +					    (i + 1)]);
> +
> +	return buf;
> +}
> +
>   static void sseu_dump(const struct sseu_dev_info *sseu, struct drm_printer *p)
>   {
>   	int s;
> +	u8 buf[SS_STR_MAX_SIZE];
>   
>   	drm_printf(p, "slice total: %u, mask=%04x\n",
>   		   hweight8(sseu->slice_mask), sseu->slice_mask);
>   	drm_printf(p, "subslice total: %u\n", intel_sseu_subslice_total(sseu));
>   	for (s = 0; s < sseu->max_slices; s++) {
> -		drm_printf(p, "slice%d: %u subslices, mask=%04x\n",
> +		drm_printf(p, "slice%d: %u subslices, mask=%s\n",
>   			   s, intel_sseu_subslices_per_slice(sseu, s),
> -			   sseu->subslice_mask[s]);
> +			   subslice_per_slice_str(buf, sseu, s));
>   	}
>   	drm_printf(p, "EU total: %u\n", sseu->eu_total);
>   	drm_printf(p, "EU per subslice: %u\n", sseu->eu_per_subslice);
> @@ -118,6 +143,7 @@ void intel_device_info_dump_topology(const struct sseu_dev_info *sseu,
>   				     struct drm_printer *p)
>   {
>   	int s, ss;
> +	u8 buf[SS_STR_MAX_SIZE];
>   
>   	if (sseu->max_slices == 0) {
>   		drm_printf(p, "Unavailable\n");
> @@ -125,9 +151,9 @@ void intel_device_info_dump_topology(const struct sseu_dev_info *sseu,
>   	}
>   
>   	for (s = 0; s < sseu->max_slices; s++) {
> -		drm_printf(p, "slice%d: %u subslice(s) (0x%hhx):\n",
> +		drm_printf(p, "slice%d: %u subslice(s) (0x%s):\n",
>   			   s, intel_sseu_subslices_per_slice(sseu, s),
> -			   sseu->subslice_mask[s]);
> +			   subslice_per_slice_str(buf, sseu, s));
>   
>   		for (ss = 0; ss < sseu->max_subslices; ss++) {
>   			u16 enabled_eus = intel_sseu_get_eus(sseu, s, ss);
> @@ -156,15 +182,10 @@ static void gen11_sseu_info_init(struct drm_i915_private *dev_priv)
>   	u8 eu_en;
>   	int s;
>   
> -	if (IS_ELKHARTLAKE(dev_priv)) {
> -		sseu->max_slices = 1;
> -		sseu->max_subslices = 4;
> -		sseu->max_eus_per_subslice = 8;
> -	} else {
> -		sseu->max_slices = 1;
> -		sseu->max_subslices = 8;
> -		sseu->max_eus_per_subslice = 8;
> -	}
> +	if (IS_ELKHARTLAKE(dev_priv))
> +		intel_sseu_set_info(sseu, 1, 4, 8);
> +	else
> +		intel_sseu_set_info(sseu, 1, 8, 8);
>   
>   	s_en = I915_READ(GEN11_GT_SLICE_ENABLE) & GEN11_GT_S_ENA_MASK;
>   	ss_en = ~I915_READ(GEN11_GT_SUBSLICE_DISABLE);
> @@ -177,9 +198,11 @@ static void gen11_sseu_info_init(struct drm_i915_private *dev_priv)
>   			int ss;
>   
>   			sseu->slice_mask |= BIT(s);
> -			sseu->subslice_mask[s] = (ss_en >> ss_idx) & ss_en_mask;
> +			sseu->subslice_mask[s * sseu->ss_stride] =
> +				(ss_en >> ss_idx) & ss_en_mask;

Shouldn't this just call intel_sseu_set_subslices() instead of doing the 
setting locally?

>   			for (ss = 0; ss < sseu->max_subslices; ss++) {
> -				if (sseu->subslice_mask[s] & BIT(ss))
> +				if (sseu->subslice_mask[s * sseu->ss_stride] &
> +				    BIT(ss))

This culd use the intel_sseu_has_subslice() suggested earlier, otherwise 
it needs to consider ss_stride > 1

>   					intel_sseu_set_eus(sseu, s, ss, eu_en);
>   			}
>   		}
> @@ -201,23 +224,10 @@ static void gen10_sseu_info_init(struct drm_i915_private *dev_priv)
>   	const int eu_mask = 0xff;
>   	u32 subslice_mask, eu_en;
>   
> +	intel_sseu_set_info(sseu, 6, 4, 8);
> +
>   	sseu->slice_mask = (fuse2 & GEN10_F2_S_ENA_MASK) >>
>   			    GEN10_F2_S_ENA_SHIFT;
> -	sseu->max_slices = 6;
> -	sseu->max_subslices = 4;
> -	sseu->max_eus_per_subslice = 8;
> -
> -	subslice_mask = (1 << 4) - 1;
> -	subslice_mask &= ~((fuse2 & GEN10_F2_SS_DIS_MASK) >>
> -			   GEN10_F2_SS_DIS_SHIFT);
> -
> -	/*
> -	 * Slice0 can have up to 3 subslices, but there are only 2 in
> -	 * slice1/2.
> -	 */
> -	sseu->subslice_mask[0] = subslice_mask;
> -	for (s = 1; s < sseu->max_slices; s++)
> -		sseu->subslice_mask[s] = subslice_mask & 0x3;
>   
>   	/* Slice0 */
>   	eu_en = ~I915_READ(GEN8_EU_DISABLE0);
> @@ -242,14 +252,22 @@ static void gen10_sseu_info_init(struct drm_i915_private *dev_priv)
>   	eu_en = ~I915_READ(GEN10_EU_DISABLE3);
>   	intel_sseu_set_eus(sseu, 5, 1, eu_en & eu_mask);
>   
> -	/* Do a second pass where we mark the subslices disabled if all their
> -	 * eus are off.
> -	 */
> +	subslice_mask = (1 << 4) - 1;
> +	subslice_mask &= ~((fuse2 & GEN10_F2_SS_DIS_MASK) >>
> +			   GEN10_F2_SS_DIS_SHIFT);
> +
>   	for (s = 0; s < sseu->max_slices; s++) {
>   		for (ss = 0; ss < sseu->max_subslices; ss++) {
>   			if (intel_sseu_get_eus(sseu, s, ss) == 0)
> -				sseu->subslice_mask[s] &= ~BIT(ss);
> +				subslice_mask &= ~BIT(ss);
>   		}
> +
> +		/*
> +		 * Slice0 can have up to 3 subslices, but there are only 2 in
> +		 * slice1/2.
> +		 */
> +		intel_sseu_set_subslices(sseu, s, s == 0 ? subslice_mask :
> +							   subslice_mask & 0x3);
>   	}
>   
>   	sseu->eu_total = compute_eu_total(sseu);
> @@ -275,13 +293,12 @@ static void cherryview_sseu_info_init(struct drm_i915_private *dev_priv)
>   {
>   	struct sseu_dev_info *sseu = &RUNTIME_INFO(dev_priv)->sseu;
>   	u32 fuse;
> +	u8 subslice_mask;
>   
>   	fuse = I915_READ(CHV_FUSE_GT);
>   
>   	sseu->slice_mask = BIT(0);
> -	sseu->max_slices = 1;
> -	sseu->max_subslices = 2;
> -	sseu->max_eus_per_subslice = 8;
> +	intel_sseu_set_info(sseu, 1, 2, 8);
>   
>   	if (!(fuse & CHV_FGT_DISABLE_SS0)) {
>   		u8 disabled_mask =
> @@ -290,7 +307,7 @@ static void cherryview_sseu_info_init(struct drm_i915_private *dev_priv)
>   			(((fuse & CHV_FGT_EU_DIS_SS0_R1_MASK) >>
>   			  CHV_FGT_EU_DIS_SS0_R1_SHIFT) << 4);
>   
> -		sseu->subslice_mask[0] |= BIT(0);
> +		subslice_mask |= BIT(0);
>   		intel_sseu_set_eus(sseu, 0, 0, ~disabled_mask);
>   	}
>   
> @@ -301,10 +318,12 @@ static void cherryview_sseu_info_init(struct drm_i915_private *dev_priv)
>   			(((fuse & CHV_FGT_EU_DIS_SS1_R1_MASK) >>
>   			  CHV_FGT_EU_DIS_SS1_R1_SHIFT) << 4);
>   
> -		sseu->subslice_mask[0] |= BIT(1);
> +		subslice_mask |= BIT(1);
>   		intel_sseu_set_eus(sseu, 0, 1, ~disabled_mask);
>   	}
>   
> +	intel_sseu_set_subslices(sseu, 0, subslice_mask);
> +
>   	sseu->eu_total = compute_eu_total(sseu);
>   
>   	/*
> @@ -312,7 +331,8 @@ static void cherryview_sseu_info_init(struct drm_i915_private *dev_priv)
>   	 * across subslices.
>   	*/
>   	sseu->eu_per_subslice = intel_sseu_subslice_total(sseu) ?
> -				sseu->eu_total / intel_sseu_subslice_total(sseu) :
> +				sseu->eu_total /
> +					intel_sseu_subslice_total(sseu) :
>   				0;
>   	/*
>   	 * CHV supports subslice power gating on devices with more than
> @@ -336,9 +356,8 @@ static void gen9_sseu_info_init(struct drm_i915_private *dev_priv)
>   	sseu->slice_mask = (fuse2 & GEN8_F2_S_ENA_MASK) >> GEN8_F2_S_ENA_SHIFT;
>   
>   	/* BXT has a single slice and at most 3 subslices. */
> -	sseu->max_slices = IS_GEN9_LP(dev_priv) ? 1 : 3;
> -	sseu->max_subslices = IS_GEN9_LP(dev_priv) ? 3 : 4;
> -	sseu->max_eus_per_subslice = 8;
> +	intel_sseu_set_info(sseu, IS_GEN9_LP(dev_priv) ? 1 : 3,
> +			    IS_GEN9_LP(dev_priv) ? 3 : 4, 8);
>   
>   	/*
>   	 * The subslice disable field is global, i.e. it applies
> @@ -357,14 +376,16 @@ static void gen9_sseu_info_init(struct drm_i915_private *dev_priv)
>   			/* skip disabled slice */
>   			continue;
>   
> -		sseu->subslice_mask[s] = subslice_mask;
> +		intel_sseu_set_subslices(sseu, s, subslice_mask);
>   
>   		eu_disable = I915_READ(GEN9_EU_DISABLE(s));
>   		for (ss = 0; ss < sseu->max_subslices; ss++) {
>   			int eu_per_ss;
>   			u8 eu_disabled_mask;
> +			u8 ss_idx = s * sseu->ss_stride + ss / BITS_PER_BYTE;
>   
> -			if (!(sseu->subslice_mask[s] & BIT(ss)))
> +			if (!(sseu->subslice_mask[ss_idx] &
> +			      BIT(ss % BITS_PER_BYTE)))
>   				/* skip disabled subslice */
>   				continue;
>   
> @@ -437,9 +458,7 @@ static void broadwell_sseu_info_init(struct drm_i915_private *dev_priv)
>   
>   	fuse2 = I915_READ(GEN8_FUSE2);
>   	sseu->slice_mask = (fuse2 & GEN8_F2_S_ENA_MASK) >> GEN8_F2_S_ENA_SHIFT;
> -	sseu->max_slices = 3;
> -	sseu->max_subslices = 3;
> -	sseu->max_eus_per_subslice = 8;
> +	intel_sseu_set_info(sseu, 3, 3, 8);
>   
>   	/*
>   	 * The subslice disable field is global, i.e. it applies
> @@ -466,18 +485,21 @@ static void broadwell_sseu_info_init(struct drm_i915_private *dev_priv)
>   			/* skip disabled slice */
>   			continue;
>   
> -		sseu->subslice_mask[s] = subslice_mask;
> +		intel_sseu_set_subslices(sseu, s, subslice_mask);
>   
>   		for (ss = 0; ss < sseu->max_subslices; ss++) {
>   			u8 eu_disabled_mask;
> +			u8 ss_idx = s * sseu->ss_stride + ss / BITS_PER_BYTE;
>   			u32 n_disabled;
>   
> -			if (!(sseu->subslice_mask[s] & BIT(ss)))
> +			if (!(sseu->subslice_mask[ss_idx] &
> +			      BIT(ss % BITS_PER_BYTE)))
>   				/* skip disabled subslice */
>   				continue;
>   
>   			eu_disabled_mask =
> -				eu_disable[s] >> (ss * sseu->max_eus_per_subslice);
> +				eu_disable[s] >>
> +					(ss * sseu->max_eus_per_subslice);
>   
>   			intel_sseu_set_eus(sseu, s, ss, ~eu_disabled_mask);
>   
> @@ -517,6 +539,7 @@ static void haswell_sseu_info_init(struct drm_i915_private *dev_priv)
>   	struct sseu_dev_info *sseu = &RUNTIME_INFO(dev_priv)->sseu;
>   	u32 fuse1;
>   	int s, ss;
> +	u32 subslice_mask;
>   
>   	/*
>   	 * There isn't a register to tell us how many slices/subslices. We
> @@ -528,22 +551,18 @@ static void haswell_sseu_info_init(struct drm_i915_private *dev_priv)
>   		/* fall through */
>   	case 1:
>   		sseu->slice_mask = BIT(0);
> -		sseu->subslice_mask[0] = BIT(0);
> +		subslice_mask = BIT(0);
>   		break;
>   	case 2:
>   		sseu->slice_mask = BIT(0);
> -		sseu->subslice_mask[0] = BIT(0) | BIT(1);
> +		subslice_mask = BIT(0) | BIT(1);
>   		break;
>   	case 3:
>   		sseu->slice_mask = BIT(0) | BIT(1);
> -		sseu->subslice_mask[0] = BIT(0) | BIT(1);
> -		sseu->subslice_mask[1] = BIT(0) | BIT(1);
> +		subslice_mask = BIT(0) | BIT(1);
>   		break;
>   	}
>   
> -	sseu->max_slices = hweight8(sseu->slice_mask);
> -	sseu->max_subslices = hweight8(sseu->subslice_mask[0]);
> -
>   	fuse1 = I915_READ(HSW_PAVP_FUSE1);
>   	switch ((fuse1 & HSW_F1_EU_DIS_MASK) >> HSW_F1_EU_DIS_SHIFT) {
>   	default:
> @@ -560,9 +579,14 @@ static void haswell_sseu_info_init(struct drm_i915_private *dev_priv)
>   		sseu->eu_per_subslice = 6;
>   		break;
>   	}
> -	sseu->max_eus_per_subslice = sseu->eu_per_subslice;
> +
> +	intel_sseu_set_info(sseu, hweight8(sseu->slice_mask),
> +			    hweight8(subslice_mask),
> +			    sseu->eu_per_subslice);

Personal preference: could use a local variable for eu_per_subslice 
above to avoid setting it to itself here.

Daniele

>   
>   	for (s = 0; s < sseu->max_slices; s++) {
> +		intel_sseu_set_subslices(sseu, s, subslice_mask);
> +
>   		for (ss = 0; ss < sseu->max_subslices; ss++) {
>   			intel_sseu_set_eus(sseu, s, ss,
>   					   (1UL << sseu->eu_per_subslice) - 1);
> 
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 5/6] drm/i915: Remove inline from sseu helper functions
  2019-05-01 21:28         ` Summers, Stuart
@ 2019-05-02  7:15           ` Jani Nikula
  2019-05-02 14:50             ` Summers, Stuart
  0 siblings, 1 reply; 35+ messages in thread
From: Jani Nikula @ 2019-05-02  7:15 UTC (permalink / raw)
  To: Summers, Stuart, Ceraolo Spurio, Daniele, intel-gfx

On Wed, 01 May 2019, "Summers, Stuart" <stuart.summers@intel.com> wrote:
> On Wed, 2019-05-01 at 14:19 -0700, Daniele Ceraolo Spurio wrote:
>> 
>> On 5/1/19 2:04 PM, Summers, Stuart wrote:
>> > On Wed, 2019-05-01 at 13:04 -0700, Daniele Ceraolo Spurio wrote:
>> > > Can you elaborate a bit more on what's the rationale for this? do
>> > > you just want to avoid having too many inlines since the paths
>> > > they're used in are not critical, or do you have some more
>> > > functional reason?  This is not a critic to the patch, I just
>> > > want to understand where you're coming from ;)
>> > 
>> > This was a request from Jani Nikula in a previous series update. I
>> > don't have a strong preference either way personally. If you don't
>> > have any major concerns, I'd prefer to keep the series as-is to
>> > prevent too much thrash here, but let me know.
>> > 
>> 
>> No concerns, just please update the commit message to explain that
>> we're moving them because there is no need for them to be inline
>> since they're not on a critical path where we need preformance.
>
> Sounds great.

I've become critical of superfluous inlines. They break the abstraction
by exposing the internals in the header, and make the interdependencies
of headers harder to resolve.

As the driver keeps growing and more people contribute to it, I think we
need to pay more attention on how we structure the source. To this end
we've added new gt/ subdir, are about to add gem/ and likely display/
too before long, and we've significantly split off the monster
i915_drv.h and intel_drv.h headers.

Obviously inlines have their place and purpose, but I think we sprinkle
them a bit too eagerly without paying attention.

I like the patch.

Acked-by: Jani Nikula <jani.nikula@intel.com>


-- 
Jani Nikula, Intel Open Source Graphics Center
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 35+ messages in thread

* ✓ Fi.CI.IGT: success for Refactor to expand subslice mask (rev7)
  2019-05-01 15:34 [PATCH 0/6] Refactor to expand subslice mask Stuart Summers
                   ` (8 preceding siblings ...)
  2019-05-01 16:14 ` ✓ Fi.CI.BAT: success " Patchwork
@ 2019-05-02  9:14 ` Patchwork
  9 siblings, 0 replies; 35+ messages in thread
From: Patchwork @ 2019-05-02  9:14 UTC (permalink / raw)
  To: Summers, Stuart; +Cc: intel-gfx

== Series Details ==

Series: Refactor to expand subslice mask (rev7)
URL   : https://patchwork.freedesktop.org/series/59742/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_6021_full -> Patchwork_12926_full
====================================================

Summary
-------

  **SUCCESS**

  No regressions found.

  

Known issues
------------

  Here are the changes found in Patchwork_12926_full that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@i915_suspend@debugfs-reader:
    - shard-kbl:          [PASS][1] -> [DMESG-WARN][2] ([fdo#108566]) +1 similar issue
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6021/shard-kbl4/igt@i915_suspend@debugfs-reader.html
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12926/shard-kbl7/igt@i915_suspend@debugfs-reader.html

  * igt@i915_suspend@fence-restore-untiled:
    - shard-apl:          [PASS][3] -> [DMESG-WARN][4] ([fdo#108566]) +4 similar issues
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6021/shard-apl4/igt@i915_suspend@fence-restore-untiled.html
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12926/shard-apl2/igt@i915_suspend@fence-restore-untiled.html

  * igt@kms_cursor_crc@cursor-64x21-sliding:
    - shard-skl:          [PASS][5] -> [INCOMPLETE][6] ([fdo#110581]) +2 similar issues
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6021/shard-skl6/igt@kms_cursor_crc@cursor-64x21-sliding.html
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12926/shard-skl2/igt@kms_cursor_crc@cursor-64x21-sliding.html

  * igt@kms_draw_crc@draw-method-xrgb8888-render-xtiled:
    - shard-snb:          [PASS][7] -> [SKIP][8] ([fdo#109271]) +1 similar issue
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6021/shard-snb1/igt@kms_draw_crc@draw-method-xrgb8888-render-xtiled.html
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12926/shard-snb4/igt@kms_draw_crc@draw-method-xrgb8888-render-xtiled.html

  * igt@kms_flip@flip-vs-expired-vblank:
    - shard-skl:          [PASS][9] -> [FAIL][10] ([fdo#105363])
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6021/shard-skl4/igt@kms_flip@flip-vs-expired-vblank.html
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12926/shard-skl8/igt@kms_flip@flip-vs-expired-vblank.html

  * igt@kms_flip@flip-vs-expired-vblank-interruptible:
    - shard-iclb:         [PASS][11] -> [INCOMPLETE][12] ([fdo#107713] / [fdo#110581]) +1 similar issue
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6021/shard-iclb2/igt@kms_flip@flip-vs-expired-vblank-interruptible.html
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12926/shard-iclb6/igt@kms_flip@flip-vs-expired-vblank-interruptible.html

  * igt@kms_flip@flip-vs-suspend:
    - shard-hsw:          [PASS][13] -> [INCOMPLETE][14] ([fdo#103540] / [fdo#110581])
   [13]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6021/shard-hsw6/igt@kms_flip@flip-vs-suspend.html
   [14]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12926/shard-hsw1/igt@kms_flip@flip-vs-suspend.html
    - shard-skl:          [PASS][15] -> [INCOMPLETE][16] ([fdo#107773] / [fdo#109507] / [fdo#110581])
   [15]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6021/shard-skl9/igt@kms_flip@flip-vs-suspend.html
   [16]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12926/shard-skl9/igt@kms_flip@flip-vs-suspend.html

  * igt@kms_frontbuffer_tracking@fbcpsr-1p-primscrn-pri-indfb-draw-blt:
    - shard-iclb:         [PASS][17] -> [FAIL][18] ([fdo#103167]) +4 similar issues
   [17]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6021/shard-iclb3/igt@kms_frontbuffer_tracking@fbcpsr-1p-primscrn-pri-indfb-draw-blt.html
   [18]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12926/shard-iclb2/igt@kms_frontbuffer_tracking@fbcpsr-1p-primscrn-pri-indfb-draw-blt.html

  * igt@kms_frontbuffer_tracking@fbcpsr-1p-primscrn-spr-indfb-draw-render:
    - shard-iclb:         [PASS][19] -> [INCOMPLETE][20] ([fdo#106978] / [fdo#107713] / [fdo#110581])
   [19]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6021/shard-iclb1/igt@kms_frontbuffer_tracking@fbcpsr-1p-primscrn-spr-indfb-draw-render.html
   [20]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12926/shard-iclb5/igt@kms_frontbuffer_tracking@fbcpsr-1p-primscrn-spr-indfb-draw-render.html

  * igt@kms_plane@pixel-format-pipe-b-planes:
    - shard-glk:          [PASS][21] -> [SKIP][22] ([fdo#109271])
   [21]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6021/shard-glk9/igt@kms_plane@pixel-format-pipe-b-planes.html
   [22]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12926/shard-glk4/igt@kms_plane@pixel-format-pipe-b-planes.html

  * igt@kms_plane_alpha_blend@pipe-a-constant-alpha-min:
    - shard-skl:          [PASS][23] -> [FAIL][24] ([fdo#108145])
   [23]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6021/shard-skl5/igt@kms_plane_alpha_blend@pipe-a-constant-alpha-min.html
   [24]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12926/shard-skl5/igt@kms_plane_alpha_blend@pipe-a-constant-alpha-min.html

  * igt@kms_plane_scaling@pipe-c-scaler-with-pixel-format:
    - shard-glk:          [PASS][25] -> [SKIP][26] ([fdo#109271] / [fdo#109278])
   [25]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6021/shard-glk9/igt@kms_plane_scaling@pipe-c-scaler-with-pixel-format.html
   [26]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12926/shard-glk4/igt@kms_plane_scaling@pipe-c-scaler-with-pixel-format.html

  * igt@kms_psr2_su@frontbuffer:
    - shard-iclb:         [PASS][27] -> [SKIP][28] ([fdo#109642])
   [27]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6021/shard-iclb2/igt@kms_psr2_su@frontbuffer.html
   [28]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12926/shard-iclb1/igt@kms_psr2_su@frontbuffer.html

  * igt@kms_psr@psr2_sprite_plane_move:
    - shard-iclb:         [PASS][29] -> [SKIP][30] ([fdo#109441]) +2 similar issues
   [29]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6021/shard-iclb2/igt@kms_psr@psr2_sprite_plane_move.html
   [30]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12926/shard-iclb1/igt@kms_psr@psr2_sprite_plane_move.html

  * igt@perf_pmu@enable-race-rcs0:
    - shard-glk:          [PASS][31] -> [INCOMPLETE][32] ([fdo#103359] / [fdo#110581] / [k.org#198133])
   [31]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6021/shard-glk9/igt@perf_pmu@enable-race-rcs0.html
   [32]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12926/shard-glk4/igt@perf_pmu@enable-race-rcs0.html

  * igt@perf_pmu@rc6:
    - shard-kbl:          [PASS][33] -> [SKIP][34] ([fdo#109271])
   [33]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6021/shard-kbl7/igt@perf_pmu@rc6.html
   [34]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12926/shard-kbl5/igt@perf_pmu@rc6.html

  * igt@tools_test@tools_test:
    - shard-iclb:         [PASS][35] -> [SKIP][36] ([fdo#109352])
   [35]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6021/shard-iclb7/igt@tools_test@tools_test.html
   [36]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12926/shard-iclb8/igt@tools_test@tools_test.html

  
#### Possible fixes ####

  * igt@gem_ctx_switch@basic-all-light:
    - shard-iclb:         [INCOMPLETE][37] ([fdo#107713] / [fdo#109100]) -> [PASS][38]
   [37]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6021/shard-iclb1/igt@gem_ctx_switch@basic-all-light.html
   [38]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12926/shard-iclb7/igt@gem_ctx_switch@basic-all-light.html

  * igt@gem_exec_blt@dumb-buf-min:
    - shard-hsw:          [INCOMPLETE][39] ([fdo#103540]) -> [PASS][40]
   [39]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6021/shard-hsw6/igt@gem_exec_blt@dumb-buf-min.html
   [40]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12926/shard-hsw8/igt@gem_exec_blt@dumb-buf-min.html

  * igt@gem_exec_schedule@preempt-hang-vebox:
    - shard-skl:          [INCOMPLETE][41] ([fdo#110581]) -> [PASS][42] +1 similar issue
   [41]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6021/shard-skl8/igt@gem_exec_schedule@preempt-hang-vebox.html
   [42]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12926/shard-skl9/igt@gem_exec_schedule@preempt-hang-vebox.html

  * igt@gem_workarounds@suspend-resume:
    - shard-apl:          [DMESG-WARN][43] ([fdo#108566]) -> [PASS][44] +3 similar issues
   [43]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6021/shard-apl8/igt@gem_workarounds@suspend-resume.html
   [44]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12926/shard-apl7/igt@gem_workarounds@suspend-resume.html

  * igt@i915_pm_rpm@system-suspend-execbuf:
    - shard-skl:          [INCOMPLETE][45] ([fdo#104108] / [fdo#107773] / [fdo#107807]) -> [PASS][46]
   [45]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6021/shard-skl10/igt@i915_pm_rpm@system-suspend-execbuf.html
   [46]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12926/shard-skl3/igt@i915_pm_rpm@system-suspend-execbuf.html

  * igt@i915_selftest@mock_requests:
    - shard-skl:          [INCOMPLETE][47] ([fdo#110550]) -> [PASS][48]
   [47]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6021/shard-skl5/igt@i915_selftest@mock_requests.html
   [48]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12926/shard-skl6/igt@i915_selftest@mock_requests.html

  * igt@kms_cursor_crc@cursor-128x128-suspend:
    - shard-skl:          [INCOMPLETE][49] ([fdo#104108] / [fdo#107773]) -> [PASS][50]
   [49]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6021/shard-skl5/igt@kms_cursor_crc@cursor-128x128-suspend.html
   [50]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12926/shard-skl5/igt@kms_cursor_crc@cursor-128x128-suspend.html

  * igt@kms_cursor_crc@cursor-256x256-suspend:
    - shard-skl:          [INCOMPLETE][51] ([fdo#104108]) -> [PASS][52]
   [51]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6021/shard-skl2/igt@kms_cursor_crc@cursor-256x256-suspend.html
   [52]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12926/shard-skl2/igt@kms_cursor_crc@cursor-256x256-suspend.html

  * igt@kms_frontbuffer_tracking@fbc-1p-primscrn-pri-indfb-draw-mmap-gtt:
    - shard-iclb:         [FAIL][53] ([fdo#103167]) -> [PASS][54] +1 similar issue
   [53]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6021/shard-iclb2/igt@kms_frontbuffer_tracking@fbc-1p-primscrn-pri-indfb-draw-mmap-gtt.html
   [54]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12926/shard-iclb6/igt@kms_frontbuffer_tracking@fbc-1p-primscrn-pri-indfb-draw-mmap-gtt.html

  * igt@kms_plane@pixel-format-pipe-c-planes:
    - shard-glk:          [SKIP][55] ([fdo#109271]) -> [PASS][56]
   [55]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6021/shard-glk4/igt@kms_plane@pixel-format-pipe-c-planes.html
   [56]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12926/shard-glk9/igt@kms_plane@pixel-format-pipe-c-planes.html

  * igt@kms_plane_lowres@pipe-a-tiling-y:
    - shard-iclb:         [FAIL][57] ([fdo#103166]) -> [PASS][58]
   [57]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6021/shard-iclb6/igt@kms_plane_lowres@pipe-a-tiling-y.html
   [58]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12926/shard-iclb8/igt@kms_plane_lowres@pipe-a-tiling-y.html

  * igt@kms_plane_scaling@pipe-b-scaler-with-clipping-clamping:
    - shard-glk:          [SKIP][59] ([fdo#109271] / [fdo#109278]) -> [PASS][60] +1 similar issue
   [59]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6021/shard-glk4/igt@kms_plane_scaling@pipe-b-scaler-with-clipping-clamping.html
   [60]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12926/shard-glk9/igt@kms_plane_scaling@pipe-b-scaler-with-clipping-clamping.html

  * igt@kms_psr@no_drrs:
    - shard-iclb:         [FAIL][61] ([fdo#108341]) -> [PASS][62]
   [61]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6021/shard-iclb1/igt@kms_psr@no_drrs.html
   [62]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12926/shard-iclb5/igt@kms_psr@no_drrs.html

  * igt@kms_psr@psr2_primary_mmap_gtt:
    - shard-iclb:         [SKIP][63] ([fdo#109441]) -> [PASS][64]
   [63]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6021/shard-iclb4/igt@kms_psr@psr2_primary_mmap_gtt.html
   [64]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12926/shard-iclb2/igt@kms_psr@psr2_primary_mmap_gtt.html

  * igt@kms_setmode@basic:
    - shard-apl:          [FAIL][65] ([fdo#99912]) -> [PASS][66]
   [65]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6021/shard-apl3/igt@kms_setmode@basic.html
   [66]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12926/shard-apl1/igt@kms_setmode@basic.html

  * igt@kms_vblank@pipe-a-wait-busy:
    - shard-iclb:         [INCOMPLETE][67] ([fdo#107713]) -> [PASS][68]
   [67]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6021/shard-iclb3/igt@kms_vblank@pipe-a-wait-busy.html
   [68]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12926/shard-iclb1/igt@kms_vblank@pipe-a-wait-busy.html

  
#### Warnings ####

  * igt@kms_atomic_transition@5x-modeset-transitions-nonblocking-fencing:
    - shard-hsw:          [SKIP][69] ([fdo#109271] / [fdo#109278]) -> [INCOMPLETE][70] ([fdo#103540] / [fdo#110581])
   [69]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6021/shard-hsw4/igt@kms_atomic_transition@5x-modeset-transitions-nonblocking-fencing.html
   [70]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12926/shard-hsw2/igt@kms_atomic_transition@5x-modeset-transitions-nonblocking-fencing.html
    - shard-snb:          [SKIP][71] ([fdo#109271] / [fdo#109278]) -> [SKIP][72] ([fdo#109271])
   [71]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6021/shard-snb1/igt@kms_atomic_transition@5x-modeset-transitions-nonblocking-fencing.html
   [72]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12926/shard-snb4/igt@kms_atomic_transition@5x-modeset-transitions-nonblocking-fencing.html

  * igt@kms_flip@2x-flip-vs-suspend:
    - shard-glk:          [INCOMPLETE][73] ([fdo#103359] / [k.org#198133]) -> [INCOMPLETE][74] ([fdo#103359] / [fdo#110581] / [k.org#198133])
   [73]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6021/shard-glk9/igt@kms_flip@2x-flip-vs-suspend.html
   [74]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12926/shard-glk4/igt@kms_flip@2x-flip-vs-suspend.html

  * igt@kms_frontbuffer_tracking@fbcpsr-1p-primscrn-indfb-plflip-blt:
    - shard-skl:          [FAIL][75] ([fdo#108040]) -> [FAIL][76] ([fdo#103167])
   [75]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6021/shard-skl10/igt@kms_frontbuffer_tracking@fbcpsr-1p-primscrn-indfb-plflip-blt.html
   [76]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12926/shard-skl6/igt@kms_frontbuffer_tracking@fbcpsr-1p-primscrn-indfb-plflip-blt.html

  * igt@kms_plane_scaling@pipe-b-scaler-with-clipping-clamping:
    - shard-skl:          [INCOMPLETE][77] ([fdo#108972]) -> [INCOMPLETE][78] ([fdo#108972] / [fdo#110581]) +1 similar issue
   [77]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6021/shard-skl2/igt@kms_plane_scaling@pipe-b-scaler-with-clipping-clamping.html
   [78]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12926/shard-skl2/igt@kms_plane_scaling@pipe-b-scaler-with-clipping-clamping.html

  
  [fdo#103166]: https://bugs.freedesktop.org/show_bug.cgi?id=103166
  [fdo#103167]: https://bugs.freedesktop.org/show_bug.cgi?id=103167
  [fdo#103359]: https://bugs.freedesktop.org/show_bug.cgi?id=103359
  [fdo#103540]: https://bugs.freedesktop.org/show_bug.cgi?id=103540
  [fdo#104108]: https://bugs.freedesktop.org/show_bug.cgi?id=104108
  [fdo#105363]: https://bugs.freedesktop.org/show_bug.cgi?id=105363
  [fdo#106978]: https://bugs.freedesktop.org/show_bug.cgi?id=106978
  [fdo#107713]: https://bugs.freedesktop.org/show_bug.cgi?id=107713
  [fdo#107773]: https://bugs.freedesktop.org/show_bug.cgi?id=107773
  [fdo#107807]: https://bugs.freedesktop.org/show_bug.cgi?id=107807
  [fdo#108040]: https://bugs.freedesktop.org/show_bug.cgi?id=108040
  [fdo#108145]: https://bugs.freedesktop.org/show_bug.cgi?id=108145
  [fdo#108341]: https://bugs.freedesktop.org/show_bug.cgi?id=108341
  [fdo#108566]: https://bugs.freedesktop.org/show_bug.cgi?id=108566
  [fdo#108972]: https://bugs.freedesktop.org/show_bug.cgi?id=108972
  [fdo#109100]: https://bugs.freedesktop.org/show_bug.cgi?id=109100
  [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
  [fdo#109278]: https://bugs.freedesktop.org/show_bug.cgi?id=109278
  [fdo#109352]: https://bugs.freedesktop.org/show_bug.cgi?id=109352
  [fdo#109441]: https://bugs.freedesktop.org/show_bug.cgi?id=109441
  [fdo#109507]: https://bugs.freedesktop.org/show_bug.cgi?id=109507
  [fdo#109642]: https://bugs.freedesktop.org/show_bug.cgi?id=109642
  [fdo#110550]: https://bugs.freedesktop.org/show_bug.cgi?id=110550
  [fdo#110581]: https://bugs.freedesktop.org/show_bug.cgi?id=110581
  [fdo#99912]: https://bugs.freedesktop.org/show_bug.cgi?id=99912
  [k.org#198133]: https://bugzilla.kernel.org/show_bug.cgi?id=198133


Participating hosts (10 -> 10)
------------------------------

  No changes in participating hosts


Build changes
-------------

  * Linux: CI_DRM_6021 -> Patchwork_12926

  CI_DRM_6021: 850aa4220e8bf7609b03bf89bce146305704bec6 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_4971: fc5e0467eb6913d21ad932aa8a31c77fdb5a9c77 @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_12926: 3416f0a72f5df37142ab06ae57f55bff003ecf5d @ git://anongit.freedesktop.org/gfx-ci/linux
  piglit_4509: fdc5a4ca11124ab8413c7988896eec4c97336694 @ git://anongit.freedesktop.org/piglit

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12926/
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 6/6] drm/i915: Expand subslice mask
  2019-05-01 22:04   ` Daniele Ceraolo Spurio
@ 2019-05-02 14:47     ` Summers, Stuart
  2019-05-03  9:05       ` Lionel Landwerlin
  0 siblings, 1 reply; 35+ messages in thread
From: Summers, Stuart @ 2019-05-02 14:47 UTC (permalink / raw)
  To: Ceraolo Spurio, Daniele, intel-gfx


[-- Attachment #1.1: Type: text/plain, Size: 38195 bytes --]

On Wed, 2019-05-01 at 15:04 -0700, Daniele Ceraolo Spurio wrote:
> 
> On 5/1/19 8:34 AM, Stuart Summers wrote:
> > Currently, the subslice_mask runtime parameter is stored as an
> > array of subslices per slice. Expand the subslice mask array to
> > better match what is presented to userspace through the
> > I915_QUERY_TOPOLOGY_INFO ioctl. The index into this array is
> > then calculated:
> >    slice * subslice stride + subslice index / 8
> > 
> > v2: fix spacing in set_sseu_info args
> >      use set_sseu_info to initialize sseu data when building
> >      device status in debugfs
> >      rename variables in intel_engine_types.h to avoid checkpatch
> >      warnings
> > v3: update headers in intel_sseu.h
> > v4: add const to some sseu_dev_info variables
> >      use sseu->eu_stride for EU stride calculations
> > 
> > Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> > Signed-off-by: Stuart Summers <stuart.summers@intel.com>
> 
> Can you also get an ack from Lionel, to make sure this all fits with
> the 
> expected reporting?

Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>

> 
> > ---
> >   drivers/gpu/drm/i915/gt/intel_engine_cs.c    |   6 +-
> >   drivers/gpu/drm/i915/gt/intel_engine_types.h |  32 +++--
> >   drivers/gpu/drm/i915/gt/intel_hangcheck.c    |   3 +-
> >   drivers/gpu/drm/i915/gt/intel_sseu.c         |  49 +++++--
> >   drivers/gpu/drm/i915/gt/intel_sseu.h         |  16 ++-
> >   drivers/gpu/drm/i915/gt/intel_workarounds.c  |   2 +-
> >   drivers/gpu/drm/i915/i915_debugfs.c          |  44 +++---
> >   drivers/gpu/drm/i915/i915_drv.c              |   6 +-
> >   drivers/gpu/drm/i915/i915_gpu_error.c        |   5 +-
> >   drivers/gpu/drm/i915/i915_query.c            |  10 +-
> >   drivers/gpu/drm/i915/intel_device_info.c     | 142 +++++++++++---
> > -----
> >   11 files changed, 198 insertions(+), 117 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> > b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> > index 6e40f8ea9a6a..8f7967cc9a50 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> > @@ -914,7 +914,7 @@ u32 intel_calculate_mcr_s_ss_select(struct
> > drm_i915_private *dev_priv)
> >   	const struct sseu_dev_info *sseu = &RUNTIME_INFO(dev_priv)-
> > >sseu;
> >   	u32 mcr_s_ss_select;
> >   	u32 slice = fls(sseu->slice_mask);
> > -	u32 subslice = fls(sseu->subslice_mask[slice]);
> > +	u32 subslice = fls(sseu->subslice_mask[slice * sseu-
> > >ss_stride]);
> 
> This (and the registers we use below) only works if ss_stride = 1.
> Can 
> we add a:
> 
> 	GEM_BUG_ON(sseu->ss_stride > 1);
> 
> to catch the fact that this function will need updating to handle
> that 
> case if/when we get it?

I'll rework this and post an update.

> 
> >   
> >   	if (IS_GEN(dev_priv, 10))
> >   		mcr_s_ss_select = GEN8_MCR_SLICE(slice) |
> > @@ -990,6 +990,7 @@ void intel_engine_get_instdone(struct
> > intel_engine_cs *engine,
> >   			       struct intel_instdone *instdone)
> >   {
> >   	struct drm_i915_private *dev_priv = engine->i915;
> > +	const struct sseu_dev_info *sseu = &RUNTIME_INFO(dev_priv)-
> > >sseu;
> >   	struct intel_uncore *uncore = engine->uncore;
> >   	u32 mmio_base = engine->mmio_base;
> >   	int slice;
> > @@ -1007,7 +1008,8 @@ void intel_engine_get_instdone(struct
> > intel_engine_cs *engine,
> >   
> >   		instdone->slice_common =
> >   			intel_uncore_read(uncore, GEN7_SC_INSTDONE);
> > -		for_each_instdone_slice_subslice(dev_priv, slice,
> > subslice) {
> > +		for_each_instdone_slice_subslice(dev_priv, sseu, slice,
> > +						 subslice) {
> >   			instdone->sampler[slice][subslice] =
> >   				read_subslice_reg(dev_priv, slice,
> > subslice,
> >   						  GEN7_SAMPLER_INSTDONE
> > );
> > diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h
> > b/drivers/gpu/drm/i915/gt/intel_engine_types.h
> > index 9d64e33f8427..1710546a2446 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
> > +++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
> > @@ -534,20 +534,22 @@ intel_engine_needs_breadcrumb_tasklet(const
> > struct intel_engine_cs *engine)
> >   	return engine->flags & I915_ENGINE_NEEDS_BREADCRUMB_TASKLET;
> >   }
> >   
> > -#define instdone_slice_mask(dev_priv__) \
> > -	(IS_GEN(dev_priv__, 7) ? \
> > -	 1 : RUNTIME_INFO(dev_priv__)->sseu.slice_mask)
> > -
> > -#define instdone_subslice_mask(dev_priv__) \
> > -	(IS_GEN(dev_priv__, 7) ? \
> > -	 1 : RUNTIME_INFO(dev_priv__)->sseu.subslice_mask[0])
> > -
> > -#define for_each_instdone_slice_subslice(dev_priv__, slice__,
> > subslice__) \
> > -	for ((slice__) = 0, (subslice__) = 0; \
> > -	     (slice__) < I915_MAX_SLICES; \
> > -	     (subslice__) = ((subslice__) + 1) < I915_MAX_SUBSLICES ?
> > (subslice__) + 1 : 0, \
> > -	       (slice__) += ((subslice__) == 0)) \
> > -		for_each_if((BIT(slice__) &
> > instdone_slice_mask(dev_priv__)) && \
> > -			    (BIT(subslice__) &
> > instdone_subslice_mask(dev_priv__)))
> > +#define instdone_has_slice(dev_priv___, sseu___, slice___) \
> > +	((IS_GEN(dev_priv___, 7) ? \
> > +	  1 : (sseu___)->slice_mask) & \
> 
> I'd put the ternary op on the same line here for readability

Yeah good point.

> 
> > +	BIT(slice___)) \
> 
> no need for "\" here (and below).

Ok.

> 
> > +
> > +#define instdone_has_subslice(dev_priv__, sseu__, slice__,
> > subslice__) \
> 
> need some more parenthesis in this macro to fix the
> MACRO_ARG_PRECEDENCE 
> warning in checkpatch.

Thanks, I'll fix this.

> 
> > +	((IS_GEN(dev_priv__, 7) ? \
> > +	  1 : (sseu__)->subslice_mask[slice__ * (sseu__)->ss_stride + \
> > +				      subslice__ / BITS_PER_BYTE]) & \
> 
> The calculation to get the correct subslice u8 entry:
> 
> 	sseu->subslice_mask[s * sseu->ss_stride + ss / BITS_PER_BYTE]
> 
> seems to be repeated a few times in this patch, so it might be worth 
> moving it to its own inline function. looks like you always
> ultimately 
> want a bool, so we could also go a bit further and have something
> like:
> 
> static inline bool intel_sseu_has_subslice(sseu, s, ss)
> {
> 	u8 mask = sseu->subslice_mask[s * sseu->ss_stride +
> 				      ss / BITS_PER_BYTE];
> 
> 	return mask & BIT(ss % BITS_PER_BYTE);
> }
> 
> and then do:
> 
> #define instdone_has_subslice(dev_priv__, sseu__, slice__,
> subslice__) \
> 	((IS_GEN(dev_priv__, 7) ? subslice__ == 0 : \
> 		intel_sseu_has_subslice(...))
> 

Good point and good suggeMakes sense. stion. I'll clean this up.

> > +	 BIT(subslice__ % BITS_PER_BYTE)) \
> > +
> > +#define for_each_instdone_slice_subslice(dev_priv_, sseu_, slice_,
> > subslice_) \
> > +	for ((slice_) = 0, (subslice_) = 0; (slice_) < I915_MAX_SLICES;
> > \
> > +	     (subslice_) = ((subslice_) + 1) < I915_MAX_SUBSLICES ?
> > (subslice_) + 1 : 0, \
> 
> This ternary op should be simplifiable as:
> 
> 	(subslice_) = ((subslice_) + 1) % I915_MAX_SUBSLICES,

I had been carrying this forward as-is as much as possible. Your
suggestion makes sense though. I'll take a look.

> 
> > +	       (slice_) += ((subslice_) == 0)) \
> > +		for_each_if(instdone_has_slice(dev_priv_, sseu_, slice)
> > && \
> 
> missing the "_" after "slice"

Good catch, thanks!

> 
> > +			    instdone_has_subslice(dev_priv_, sseu_,
> > slice_, subslice_)) \
> >   
> >   #endif /* __INTEL_ENGINE_TYPES_H__ */
> > diff --git a/drivers/gpu/drm/i915/gt/intel_hangcheck.c
> > b/drivers/gpu/drm/i915/gt/intel_hangcheck.c
> > index e5eaa06fe74d..53c1c98161e1 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_hangcheck.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_hangcheck.c
> > @@ -50,6 +50,7 @@ static bool instdone_unchanged(u32
> > current_instdone, u32 *old_instdone)
> >   static bool subunits_stuck(struct intel_engine_cs *engine)
> >   {
> >   	struct drm_i915_private *dev_priv = engine->i915;
> > +	const struct sseu_dev_info *sseu = &RUNTIME_INFO(dev_priv)-
> > >sseu;
> >   	struct intel_instdone instdone;
> >   	struct intel_instdone *accu_instdone = &engine-
> > >hangcheck.instdone;
> >   	bool stuck;
> > @@ -71,7 +72,7 @@ static bool subunits_stuck(struct intel_engine_cs
> > *engine)
> >   	stuck &= instdone_unchanged(instdone.slice_common,
> >   				    &accu_instdone->slice_common);
> >   
> > -	for_each_instdone_slice_subslice(dev_priv, slice, subslice) {
> > +	for_each_instdone_slice_subslice(dev_priv, sseu, slice,
> > subslice) {
> >   		stuck &=
> > instdone_unchanged(instdone.sampler[slice][subslice],
> >   					    &accu_instdone-
> > >sampler[slice][subslice]);
> >   		stuck &=
> > instdone_unchanged(instdone.row[slice][subslice],
> > diff --git a/drivers/gpu/drm/i915/gt/intel_sseu.c
> > b/drivers/gpu/drm/i915/gt/intel_sseu.c
> > index 4a0b82fc108c..49316b7ef074 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_sseu.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_sseu.c
> > @@ -8,6 +8,17 @@
> >   #include "intel_lrc_reg.h"
> >   #include "intel_sseu.h"
> >   
> > +void intel_sseu_set_info(struct sseu_dev_info *sseu, u8
> > max_slices,
> > +			 u8 max_subslices, u8 max_eus_per_subslice)
> > +{
> > +	sseu->max_slices = max_slices;
> > +	sseu->max_subslices = max_subslices;
> > +	sseu->max_eus_per_subslice = max_eus_per_subslice;
> > +
> > +	sseu->ss_stride = GEN_SSEU_STRIDE(sseu->max_subslices);
> > +	sseu->eu_stride = GEN_SSEU_STRIDE(sseu->max_eus_per_subslice);
> > +}
> > +
> >   unsigned int
> >   intel_sseu_subslice_total(const struct sseu_dev_info *sseu)
> >   {
> > @@ -22,17 +33,39 @@ intel_sseu_subslice_total(const struct
> > sseu_dev_info *sseu)
> >   unsigned int
> >   intel_sseu_subslices_per_slice(const struct sseu_dev_info *sseu,
> > u8 slice)
> 
> Here we pass slice as u8, but below we use int. Any reason for the 
> difference?

No good reason. I'll fix it.

> 
> >   {
> > -	return hweight8(sseu->subslice_mask[slice]);
> > +	unsigned int i, total = 0;
> > +
> > +	for (i = 0; i < sseu->ss_stride; i++)
> > +		total += hweight8(sseu->subslice_mask[slice * sseu-
> > >ss_stride +
> > +						      i]);
> > +
> > +	return total;
> > +}
> > +
> > +void intel_sseu_copy_subslices(const struct sseu_dev_info *sseu,
> > int slice,
> > +			       u8 *to_mask, const u8 *from_mask)
> 
> You always use sseu->subslice_mask has a from_mask, can't we just
> get 
> that from the sseu param and avoid the from_mask?

I wanted to make this a little more generic, but I agree maybe that's
overkill. I'll rework this.

> 
> > +{
> > +	int offset = slice * sseu->ss_stride;
> > +
> > +	memcpy(&to_mask[offset], &from_mask[offset], sseu->ss_stride);
> > +}
> > +
> > +void intel_sseu_set_subslices(struct sseu_dev_info *sseu, int
> > slice,
> > +			      u32 ss_mask)
> > +{
> > +	int i, offset = slice * sseu->ss_stride;
> > +
> > +	for (i = 0; i < sseu->ss_stride; i++)
> > +		sseu->subslice_mask[offset + i] =
> > +			(ss_mask >> (BITS_PER_BYTE * i)) & 0xff;
> >   }
> >   
> >   static int intel_sseu_eu_idx(const struct sseu_dev_info *sseu,
> > int slice,
> >   			     int subslice)
> >   {
> > -	int subslice_stride = DIV_ROUND_UP(sseu->max_eus_per_subslice,
> > -					   BITS_PER_BYTE);
> > -	int slice_stride = sseu->max_subslices * subslice_stride;
> > +	int slice_stride = sseu->max_subslices * sseu->eu_stride;
> >   
> > -	return slice * slice_stride + subslice * subslice_stride;
> > +	return slice * slice_stride + subslice * sseu->eu_stride;
> >   }
> >   
> >   u16 intel_sseu_get_eus(const struct sseu_dev_info *sseu, int
> > slice,
> > @@ -41,8 +74,7 @@ u16 intel_sseu_get_eus(const struct sseu_dev_info
> > *sseu, int slice,
> >   	int i, offset = intel_sseu_eu_idx(sseu, slice, subslice);
> >   	u16 eu_mask = 0;
> >   
> > -	for (i = 0;
> > -	     i < DIV_ROUND_UP(sseu->max_eus_per_subslice,
> > BITS_PER_BYTE); i++) {
> > +	for (i = 0; i < sseu->eu_stride; i++) {
> >   		eu_mask |= ((u16)sseu->eu_mask[offset + i]) <<
> >   			(i * BITS_PER_BYTE);
> >   	}
> > @@ -55,8 +87,7 @@ void intel_sseu_set_eus(struct sseu_dev_info
> > *sseu, int slice, int subslice,
> >   {
> >   	int i, offset = intel_sseu_eu_idx(sseu, slice, subslice);
> >   
> > -	for (i = 0;
> > -	     i < DIV_ROUND_UP(sseu->max_eus_per_subslice,
> > BITS_PER_BYTE); i++) {
> > +	for (i = 0; i < sseu->eu_stride; i++) {
> >   		sseu->eu_mask[offset + i] =
> >   			(eu_mask >> (BITS_PER_BYTE * i)) & 0xff;
> >   	}
> > diff --git a/drivers/gpu/drm/i915/gt/intel_sseu.h
> > b/drivers/gpu/drm/i915/gt/intel_sseu.h
> > index 56e3721ae83f..bf01f338a8cc 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_sseu.h
> > +++ b/drivers/gpu/drm/i915/gt/intel_sseu.h
> > @@ -9,16 +9,18 @@
> >   
> >   #include <linux/types.h>
> >   #include <linux/kernel.h>
> > +#include <linux/string.h>
> >   
> >   struct drm_i915_private;
> >   
> >   #define GEN_MAX_SLICES		(6) /* CNL upper bound */
> >   #define GEN_MAX_SUBSLICES	(8) /* ICL upper bound */
> >   #define GEN_SSEU_STRIDE(bits) DIV_ROUND_UP(bits, BITS_PER_BYTE)
> > +#define GEN_MAX_SUBSLICE_STRIDE GEN_SSEU_STRIDE(GEN_MAX_SUBSLICES)
> >   
> >   struct sseu_dev_info {
> >   	u8 slice_mask;
> > -	u8 subslice_mask[GEN_MAX_SLICES];
> > +	u8 subslice_mask[GEN_MAX_SLICES * GEN_MAX_SUBSLICE_STRIDE];
> >   	u16 eu_total;
> >   	u8 eu_per_subslice;
> >   	u8 min_eu_in_pool;
> > @@ -33,6 +35,9 @@ struct sseu_dev_info {
> >   	u8 max_subslices;
> >   	u8 max_eus_per_subslice;
> >   
> > +	u8 ss_stride;
> > +	u8 eu_stride;
> > +
> >   	/* We don't have more than 8 eus per subslice at the moment and
> > as we
> >   	 * store eus enabled using bits, no need to multiply by eus per
> >   	 * subslice.
> > @@ -63,12 +68,21 @@ intel_sseu_from_device_info(const struct
> > sseu_dev_info *sseu)
> >   	return value;
> >   }
> >   
> > +void intel_sseu_set_info(struct sseu_dev_info *sseu, u8
> > max_slices,
> > +			 u8 max_subslices, u8 max_eus_per_subslice);
> > +
> >   unsigned int
> >   intel_sseu_subslice_total(const struct sseu_dev_info *sseu);
> >   
> >   unsigned int
> >   intel_sseu_subslices_per_slice(const struct sseu_dev_info *sseu,
> > u8 slice);
> >   
> > +void intel_sseu_copy_subslices(const struct sseu_dev_info *sseu,
> > int slice,
> > +			       u8 *to_mask, const u8 *from_mask);
> > +
> > +void intel_sseu_set_subslices(struct sseu_dev_info *sseu, int
> > slice,
> > +			      u32 ss_mask);
> > +
> >   u16 intel_sseu_get_eus(const struct sseu_dev_info *sseu, int
> > slice,
> >   		       int subslice);
> >   
> > diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c
> > b/drivers/gpu/drm/i915/gt/intel_workarounds.c
> > index 43e290306551..7c7e9556c1c5 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
> > @@ -767,7 +767,7 @@ wa_init_mcr(struct drm_i915_private *i915,
> > struct i915_wa_list *wal)
> >   		u32 slice = fls(sseu->slice_mask);
> >   		u32 fuse3 =
> >   			intel_uncore_read(&i915->uncore,
> > GEN10_MIRROR_FUSE3);
> > -		u8 ss_mask = sseu->subslice_mask[slice];
> > +		u8 ss_mask = sseu->subslice_mask[slice * sseu-
> > >ss_stride];
> 
> could use a
> 
> 	GEM_BUG_ON(sseu->ss_stride > 1);
> 
> here as well to remind us this will need changes in that case

Ok.

> 
> >   
> >   		u8 enabled_mask = (ss_mask | ss_mask >>
> >   				   GEN10_L3BANK_PAIR_COUNT) &
> > GEN10_L3BANK_MASK;
> > diff --git a/drivers/gpu/drm/i915/i915_debugfs.c
> > b/drivers/gpu/drm/i915/i915_debugfs.c
> > index 3f3ee83ac315..08089c24db25 100644
> > --- a/drivers/gpu/drm/i915/i915_debugfs.c
> > +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> > @@ -1257,6 +1257,7 @@ static void i915_instdone_info(struct
> > drm_i915_private *dev_priv,
> >   			       struct seq_file *m,
> >   			       struct intel_instdone *instdone)
> >   {
> > +	struct sseu_dev_info *sseu = &RUNTIME_INFO(dev_priv)->sseu;
> >   	int slice;
> >   	int subslice;
> >   
> > @@ -1272,11 +1273,11 @@ static void i915_instdone_info(struct
> > drm_i915_private *dev_priv,
> >   	if (INTEL_GEN(dev_priv) <= 6)
> >   		return;
> >   
> > -	for_each_instdone_slice_subslice(dev_priv, slice, subslice)
> > +	for_each_instdone_slice_subslice(dev_priv, sseu, slice,
> > subslice)
> >   		seq_printf(m, "\t\tSAMPLER_INSTDONE[%d][%d]: 0x%08x\n",
> >   			   slice, subslice, instdone-
> > >sampler[slice][subslice]);
> >   
> > -	for_each_instdone_slice_subslice(dev_priv, slice, subslice)
> > +	for_each_instdone_slice_subslice(dev_priv, sseu, slice,
> > subslice)
> >   		seq_printf(m, "\t\tROW_INSTDONE[%d][%d]: 0x%08x\n",
> >   			   slice, subslice, instdone-
> > >row[slice][subslice]);
> >   }
> > @@ -4066,7 +4067,9 @@ static void gen10_sseu_device_status(struct
> > drm_i915_private *dev_priv,
> >   			continue;
> >   
> >   		sseu->slice_mask |= BIT(s);
> > -		sseu->subslice_mask[s] = info->sseu.subslice_mask[s];
> > +		intel_sseu_copy_subslices(&info->sseu, s,
> > +					  sseu->subslice_mask,
> > +					  info->sseu.subslice_mask);
> >   
> >   		for (ss = 0; ss < info->sseu.max_subslices; ss++) {
> >   			unsigned int eu_cnt;
> > @@ -4117,18 +4120,22 @@ static void gen9_sseu_device_status(struct
> > drm_i915_private *dev_priv,
> >   		sseu->slice_mask |= BIT(s);
> >   
> >   		if (IS_GEN9_BC(dev_priv))
> > -			sseu->subslice_mask[s] =
> > -				RUNTIME_INFO(dev_priv)-
> > >sseu.subslice_mask[s];
> > +			intel_sseu_copy_subslices(&info->sseu, s,
> > +						  sseu->subslice_mask,
> > +						  info-
> > >sseu.subslice_mask);
> >   
> >   		for (ss = 0; ss < info->sseu.max_subslices; ss++) {
> >   			unsigned int eu_cnt;
> > +			u8 ss_idx = s * info->sseu.ss_stride +
> > +				    ss / BITS_PER_BYTE;
> >   
> >   			if (IS_GEN9_LP(dev_priv)) {
> >   				if (!(s_reg[s] &
> > (GEN9_PGCTL_SS_ACK(ss))))
> >   					/* skip disabled subslice */
> >   					continue;
> >   
> > -				sseu->subslice_mask[s] |= BIT(ss);
> > +				sseu->subslice_mask[ss_idx] |=
> > +					BIT(ss % BITS_PER_BYTE);
> >   			}
> >   
> >   			eu_cnt = 2 * hweight32(eu_reg[2*s + ss/2] &
> > @@ -4145,25 +4152,24 @@ static void gen9_sseu_device_status(struct
> > drm_i915_private *dev_priv,
> >   static void broadwell_sseu_device_status(struct drm_i915_private
> > *dev_priv,
> >   					 struct sseu_dev_info *sseu)
> >   {
> > +	struct intel_runtime_info *info = RUNTIME_INFO(dev_priv);
> >   	u32 slice_info = I915_READ(GEN8_GT_SLICE_INFO);
> >   	int s;
> >   
> >   	sseu->slice_mask = slice_info & GEN8_LSLICESTAT_MASK;
> >   
> >   	if (sseu->slice_mask) {
> > -		sseu->eu_per_subslice =
> > -			RUNTIME_INFO(dev_priv)->sseu.eu_per_subslice;
> > -		for (s = 0; s < fls(sseu->slice_mask); s++) {
> > -			sseu->subslice_mask[s] =
> > -				RUNTIME_INFO(dev_priv)-
> > >sseu.subslice_mask[s];
> > -		}
> > +		sseu->eu_per_subslice = info->sseu.eu_per_subslice;
> > +		for (s = 0; s < fls(sseu->slice_mask); s++)
> > +			intel_sseu_copy_subslices(&info->sseu, s,
> > +						  sseu->subslice_mask,
> > +						  info-
> > >sseu.subslice_mask);
> >   		sseu->eu_total = sseu->eu_per_subslice *
> >   				 intel_sseu_subslice_total(sseu);
> >   
> >   		/* subtract fused off EU(s) from enabled slice(s) */
> >   		for (s = 0; s < fls(sseu->slice_mask); s++) {
> > -			u8 subslice_7eu =
> > -				RUNTIME_INFO(dev_priv)-
> > >sseu.subslice_7eu[s];
> > +			u8 subslice_7eu = info->sseu.subslice_7eu[s];
> >   
> >   			sseu->eu_total -= hweight8(subslice_7eu);
> >   		}
> > @@ -4210,6 +4216,7 @@ static void i915_print_sseu_info(struct
> > seq_file *m, bool is_available_info,
> >   static int i915_sseu_status(struct seq_file *m, void *unused)
> >   {
> >   	struct drm_i915_private *dev_priv = node_to_i915(m->private);
> > +	const struct intel_runtime_info *info = RUNTIME_INFO(dev_priv);
> >   	struct sseu_dev_info sseu;
> >   	intel_wakeref_t wakeref;
> >   
> > @@ -4217,14 +4224,13 @@ static int i915_sseu_status(struct seq_file
> > *m, void *unused)
> >   		return -ENODEV;
> >   
> >   	seq_puts(m, "SSEU Device Info\n");
> > -	i915_print_sseu_info(m, true, &RUNTIME_INFO(dev_priv)->sseu);
> > +	i915_print_sseu_info(m, true, &info->sseu);
> >   
> >   	seq_puts(m, "SSEU Device Status\n");
> >   	memset(&sseu, 0, sizeof(sseu));
> > -	sseu.max_slices = RUNTIME_INFO(dev_priv)->sseu.max_slices;
> > -	sseu.max_subslices = RUNTIME_INFO(dev_priv)-
> > >sseu.max_subslices;
> > -	sseu.max_eus_per_subslice =
> > -		RUNTIME_INFO(dev_priv)->sseu.max_eus_per_subslice;
> > +	intel_sseu_set_info(&sseu, info->sseu.max_slices,
> > +			    info->sseu.max_subslices,
> > +			    info->sseu.max_eus_per_subslice);
> >   
> >   	with_intel_runtime_pm(dev_priv, wakeref) {
> >   		if (IS_CHERRYVIEW(dev_priv))
> > diff --git a/drivers/gpu/drm/i915/i915_drv.c
> > b/drivers/gpu/drm/i915/i915_drv.c
> > index 130c5140db0d..6afe4e3afea4 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.c
> > +++ b/drivers/gpu/drm/i915/i915_drv.c
> > @@ -326,7 +326,7 @@ static int i915_getparam_ioctl(struct
> > drm_device *dev, void *data,
> >   	struct pci_dev *pdev = dev_priv->drm.pdev;
> >   	const struct sseu_dev_info *sseu = &RUNTIME_INFO(dev_priv)-
> > >sseu;
> >   	drm_i915_getparam_t *param = data;
> > -	int value;
> > +	int value = 0;
> >   
> >   	switch (param->param) {
> >   	case I915_PARAM_IRQ_ACTIVE:
> > @@ -455,7 +455,9 @@ static int i915_getparam_ioctl(struct
> > drm_device *dev, void *data,
> >   			return -ENODEV;
> >   		break;
> >   	case I915_PARAM_SUBSLICE_MASK:
> > -		value = sseu->subslice_mask[0];
> > +		/* Only copy bits from the first subslice */
> 
> s/subslice/slice/ ?

True, thanks.

> 
> > +		memcpy(&value, sseu->subslice_mask,
> > +		       min(sseu->ss_stride, (u8)sizeof(value)));
> >   		if (!value)
> >   			return -ENODEV;
> >   		break;
> > diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c
> > b/drivers/gpu/drm/i915/i915_gpu_error.c
> > index e1b858bd1d32..140918dd9b7d 100644
> > --- a/drivers/gpu/drm/i915/i915_gpu_error.c
> > +++ b/drivers/gpu/drm/i915/i915_gpu_error.c
> > @@ -407,6 +407,7 @@ static void print_error_buffers(struct
> > drm_i915_error_state_buf *m,
> >   static void error_print_instdone(struct drm_i915_error_state_buf
> > *m,
> >   				 const struct drm_i915_error_engine
> > *ee)
> >   {
> > +	struct sseu_dev_info *sseu = &RUNTIME_INFO(m->i915)->sseu;
> >   	int slice;
> >   	int subslice;
> >   
> > @@ -422,12 +423,12 @@ static void error_print_instdone(struct
> > drm_i915_error_state_buf *m,
> >   	if (INTEL_GEN(m->i915) <= 6)
> >   		return;
> >   
> > -	for_each_instdone_slice_subslice(m->i915, slice, subslice)
> > +	for_each_instdone_slice_subslice(m->i915, sseu, slice,
> > subslice)
> >   		err_printf(m, "  SAMPLER_INSTDONE[%d][%d]: 0x%08x\n",
> >   			   slice, subslice,
> >   			   ee->instdone.sampler[slice][subslice]);
> >   
> > -	for_each_instdone_slice_subslice(m->i915, slice, subslice)
> > +	for_each_instdone_slice_subslice(m->i915, sseu, slice,
> > subslice)
> >   		err_printf(m, "  ROW_INSTDONE[%d][%d]: 0x%08x\n",
> >   			   slice, subslice,
> >   			   ee->instdone.row[slice][subslice]);
> > diff --git a/drivers/gpu/drm/i915/i915_query.c
> > b/drivers/gpu/drm/i915/i915_query.c
> > index 7c1708c22811..000dcb145ce0 100644
> > --- a/drivers/gpu/drm/i915/i915_query.c
> > +++ b/drivers/gpu/drm/i915/i915_query.c
> > @@ -37,8 +37,6 @@ static int query_topology_info(struct
> > drm_i915_private *dev_priv,
> >   	const struct sseu_dev_info *sseu = &RUNTIME_INFO(dev_priv)-
> > >sseu;
> >   	struct drm_i915_query_topology_info topo;
> >   	u32 slice_length, subslice_length, eu_length, total_length;
> > -	u8 subslice_stride = GEN_SSEU_STRIDE(sseu->max_subslices);
> > -	u8 eu_stride = GEN_SSEU_STRIDE(sseu->max_eus_per_subslice);
> >   	int ret;
> >   
> >   	if (query_item->flags != 0)
> > @@ -50,8 +48,8 @@ static int query_topology_info(struct
> > drm_i915_private *dev_priv,
> >   	BUILD_BUG_ON(sizeof(u8) != sizeof(sseu->slice_mask));
> >   
> >   	slice_length = sizeof(sseu->slice_mask);
> > -	subslice_length = sseu->max_slices * subslice_stride;
> > -	eu_length = sseu->max_slices * sseu->max_subslices * eu_stride;
> > +	subslice_length = sseu->max_slices * sseu->ss_stride;
> > +	eu_length = sseu->max_slices * sseu->max_subslices * sseu-
> > >eu_stride;
> >   	total_length = sizeof(topo) + slice_length + subslice_length +
> >   		       eu_length;
> >   
> > @@ -69,9 +67,9 @@ static int query_topology_info(struct
> > drm_i915_private *dev_priv,
> >   	topo.max_eus_per_subslice = sseu->max_eus_per_subslice;
> >   
> >   	topo.subslice_offset = slice_length;
> > -	topo.subslice_stride = subslice_stride;
> > +	topo.subslice_stride = sseu->ss_stride;
> >   	topo.eu_offset = slice_length + subslice_length;
> > -	topo.eu_stride = eu_stride;
> > +	topo.eu_stride = sseu->eu_stride;
> >   
> >   	if (__copy_to_user(u64_to_user_ptr(query_item->data_ptr),
> >   			   &topo, sizeof(topo)))
> > diff --git a/drivers/gpu/drm/i915/intel_device_info.c
> > b/drivers/gpu/drm/i915/intel_device_info.c
> > index e1dbccf04cd9..bbbc0a8c2183 100644
> > --- a/drivers/gpu/drm/i915/intel_device_info.c
> > +++ b/drivers/gpu/drm/i915/intel_device_info.c
> > @@ -84,17 +84,42 @@ void intel_device_info_dump_flags(const struct
> > intel_device_info *info,
> >   #undef PRINT_FLAG
> >   }
> >   
> > +#define SS_STR_MAX_SIZE (GEN_MAX_SUBSLICE_STRIDE * 2)
> > +
> > +static u8 *
> > +subslice_per_slice_str(u8 *buf, const struct sseu_dev_info *sseu,
> > u8 slice)
> > +{
> > +	int i;
> > +	u8 ss_offset = slice * sseu->ss_stride;
> > +
> > +	GEM_BUG_ON(slice >= sseu->max_slices);
> > +
> > +	memset(buf, 0, SS_STR_MAX_SIZE);
> > +
> > +	/*
> > +	 * Print subslice information in reverse order to match
> > +	 * userspace expectations.
> > +	 */
> > +	for (i = 0; i < sseu->ss_stride; i++)
> > +		sprintf(&buf[i * 2], "%02x",
> > +			sseu->subslice_mask[ss_offset + sseu->ss_stride 
> > -
> > +					    (i + 1)]);
> > +
> > +	return buf;
> > +}
> > +
> >   static void sseu_dump(const struct sseu_dev_info *sseu, struct
> > drm_printer *p)
> >   {
> >   	int s;
> > +	u8 buf[SS_STR_MAX_SIZE];
> >   
> >   	drm_printf(p, "slice total: %u, mask=%04x\n",
> >   		   hweight8(sseu->slice_mask), sseu->slice_mask);
> >   	drm_printf(p, "subslice total: %u\n",
> > intel_sseu_subslice_total(sseu));
> >   	for (s = 0; s < sseu->max_slices; s++) {
> > -		drm_printf(p, "slice%d: %u subslices, mask=%04x\n",
> > +		drm_printf(p, "slice%d: %u subslices, mask=%s\n",
> >   			   s, intel_sseu_subslices_per_slice(sseu, s),
> > -			   sseu->subslice_mask[s]);
> > +			   subslice_per_slice_str(buf, sseu, s));
> >   	}
> >   	drm_printf(p, "EU total: %u\n", sseu->eu_total);
> >   	drm_printf(p, "EU per subslice: %u\n", sseu->eu_per_subslice);
> > @@ -118,6 +143,7 @@ void intel_device_info_dump_topology(const
> > struct sseu_dev_info *sseu,
> >   				     struct drm_printer *p)
> >   {
> >   	int s, ss;
> > +	u8 buf[SS_STR_MAX_SIZE];
> >   
> >   	if (sseu->max_slices == 0) {
> >   		drm_printf(p, "Unavailable\n");
> > @@ -125,9 +151,9 @@ void intel_device_info_dump_topology(const
> > struct sseu_dev_info *sseu,
> >   	}
> >   
> >   	for (s = 0; s < sseu->max_slices; s++) {
> > -		drm_printf(p, "slice%d: %u subslice(s) (0x%hhx):\n",
> > +		drm_printf(p, "slice%d: %u subslice(s) (0x%s):\n",
> >   			   s, intel_sseu_subslices_per_slice(sseu, s),
> > -			   sseu->subslice_mask[s]);
> > +			   subslice_per_slice_str(buf, sseu, s));
> >   
> >   		for (ss = 0; ss < sseu->max_subslices; ss++) {
> >   			u16 enabled_eus = intel_sseu_get_eus(sseu, s,
> > ss);
> > @@ -156,15 +182,10 @@ static void gen11_sseu_info_init(struct
> > drm_i915_private *dev_priv)
> >   	u8 eu_en;
> >   	int s;
> >   
> > -	if (IS_ELKHARTLAKE(dev_priv)) {
> > -		sseu->max_slices = 1;
> > -		sseu->max_subslices = 4;
> > -		sseu->max_eus_per_subslice = 8;
> > -	} else {
> > -		sseu->max_slices = 1;
> > -		sseu->max_subslices = 8;
> > -		sseu->max_eus_per_subslice = 8;
> > -	}
> > +	if (IS_ELKHARTLAKE(dev_priv))
> > +		intel_sseu_set_info(sseu, 1, 4, 8);
> > +	else
> > +		intel_sseu_set_info(sseu, 1, 8, 8);
> >   
> >   	s_en = I915_READ(GEN11_GT_SLICE_ENABLE) & GEN11_GT_S_ENA_MASK;
> >   	ss_en = ~I915_READ(GEN11_GT_SUBSLICE_DISABLE);
> > @@ -177,9 +198,11 @@ static void gen11_sseu_info_init(struct
> > drm_i915_private *dev_priv)
> >   			int ss;
> >   
> >   			sseu->slice_mask |= BIT(s);
> > -			sseu->subslice_mask[s] = (ss_en >> ss_idx) &
> > ss_en_mask;
> > +			sseu->subslice_mask[s * sseu->ss_stride] =
> > +				(ss_en >> ss_idx) & ss_en_mask;
> 
> Shouldn't this just call intel_sseu_set_subslices() instead of doing
> the 
> setting locally?

Yes, let me fix this.

> 
> >   			for (ss = 0; ss < sseu->max_subslices; ss++) {
> > -				if (sseu->subslice_mask[s] & BIT(ss))
> > +				if (sseu->subslice_mask[s * sseu-
> > >ss_stride] &
> > +				    BIT(ss))
> 
> This culd use the intel_sseu_has_subslice() suggested earlier,
> otherwise 
> it needs to consider ss_stride > 1

Ok.

> 
> >   					intel_sseu_set_eus(sseu, s, ss,
> > eu_en);
> >   			}
> >   		}
> > @@ -201,23 +224,10 @@ static void gen10_sseu_info_init(struct
> > drm_i915_private *dev_priv)
> >   	const int eu_mask = 0xff;
> >   	u32 subslice_mask, eu_en;
> >   
> > +	intel_sseu_set_info(sseu, 6, 4, 8);
> > +
> >   	sseu->slice_mask = (fuse2 & GEN10_F2_S_ENA_MASK) >>
> >   			    GEN10_F2_S_ENA_SHIFT;
> > -	sseu->max_slices = 6;
> > -	sseu->max_subslices = 4;
> > -	sseu->max_eus_per_subslice = 8;
> > -
> > -	subslice_mask = (1 << 4) - 1;
> > -	subslice_mask &= ~((fuse2 & GEN10_F2_SS_DIS_MASK) >>
> > -			   GEN10_F2_SS_DIS_SHIFT);
> > -
> > -	/*
> > -	 * Slice0 can have up to 3 subslices, but there are only 2 in
> > -	 * slice1/2.
> > -	 */
> > -	sseu->subslice_mask[0] = subslice_mask;
> > -	for (s = 1; s < sseu->max_slices; s++)
> > -		sseu->subslice_mask[s] = subslice_mask & 0x3;
> >   
> >   	/* Slice0 */
> >   	eu_en = ~I915_READ(GEN8_EU_DISABLE0);
> > @@ -242,14 +252,22 @@ static void gen10_sseu_info_init(struct
> > drm_i915_private *dev_priv)
> >   	eu_en = ~I915_READ(GEN10_EU_DISABLE3);
> >   	intel_sseu_set_eus(sseu, 5, 1, eu_en & eu_mask);
> >   
> > -	/* Do a second pass where we mark the subslices disabled if all
> > their
> > -	 * eus are off.
> > -	 */
> > +	subslice_mask = (1 << 4) - 1;
> > +	subslice_mask &= ~((fuse2 & GEN10_F2_SS_DIS_MASK) >>
> > +			   GEN10_F2_SS_DIS_SHIFT);
> > +
> >   	for (s = 0; s < sseu->max_slices; s++) {
> >   		for (ss = 0; ss < sseu->max_subslices; ss++) {
> >   			if (intel_sseu_get_eus(sseu, s, ss) == 0)
> > -				sseu->subslice_mask[s] &= ~BIT(ss);
> > +				subslice_mask &= ~BIT(ss);
> >   		}
> > +
> > +		/*
> > +		 * Slice0 can have up to 3 subslices, but there are
> > only 2 in
> > +		 * slice1/2.
> > +		 */
> > +		intel_sseu_set_subslices(sseu, s, s == 0 ?
> > subslice_mask :
> > +							   subslice_mas
> > k & 0x3);
> >   	}
> >   
> >   	sseu->eu_total = compute_eu_total(sseu);
> > @@ -275,13 +293,12 @@ static void cherryview_sseu_info_init(struct
> > drm_i915_private *dev_priv)
> >   {
> >   	struct sseu_dev_info *sseu = &RUNTIME_INFO(dev_priv)->sseu;
> >   	u32 fuse;
> > +	u8 subslice_mask;
> >   
> >   	fuse = I915_READ(CHV_FUSE_GT);
> >   
> >   	sseu->slice_mask = BIT(0);
> > -	sseu->max_slices = 1;
> > -	sseu->max_subslices = 2;
> > -	sseu->max_eus_per_subslice = 8;
> > +	intel_sseu_set_info(sseu, 1, 2, 8);
> >   
> >   	if (!(fuse & CHV_FGT_DISABLE_SS0)) {
> >   		u8 disabled_mask =
> > @@ -290,7 +307,7 @@ static void cherryview_sseu_info_init(struct
> > drm_i915_private *dev_priv)
> >   			(((fuse & CHV_FGT_EU_DIS_SS0_R1_MASK) >>
> >   			  CHV_FGT_EU_DIS_SS0_R1_SHIFT) << 4);
> >   
> > -		sseu->subslice_mask[0] |= BIT(0);
> > +		subslice_mask |= BIT(0);
> >   		intel_sseu_set_eus(sseu, 0, 0, ~disabled_mask);
> >   	}
> >   
> > @@ -301,10 +318,12 @@ static void cherryview_sseu_info_init(struct
> > drm_i915_private *dev_priv)
> >   			(((fuse & CHV_FGT_EU_DIS_SS1_R1_MASK) >>
> >   			  CHV_FGT_EU_DIS_SS1_R1_SHIFT) << 4);
> >   
> > -		sseu->subslice_mask[0] |= BIT(1);
> > +		subslice_mask |= BIT(1);
> >   		intel_sseu_set_eus(sseu, 0, 1, ~disabled_mask);
> >   	}
> >   
> > +	intel_sseu_set_subslices(sseu, 0, subslice_mask);
> > +
> >   	sseu->eu_total = compute_eu_total(sseu);
> >   
> >   	/*
> > @@ -312,7 +331,8 @@ static void cherryview_sseu_info_init(struct
> > drm_i915_private *dev_priv)
> >   	 * across subslices.
> >   	*/
> >   	sseu->eu_per_subslice = intel_sseu_subslice_total(sseu) ?
> > -				sseu->eu_total /
> > intel_sseu_subslice_total(sseu) :
> > +				sseu->eu_total /
> > +					intel_sseu_subslice_total(sseu)
> > :
> >   				0;
> >   	/*
> >   	 * CHV supports subslice power gating on devices with more than
> > @@ -336,9 +356,8 @@ static void gen9_sseu_info_init(struct
> > drm_i915_private *dev_priv)
> >   	sseu->slice_mask = (fuse2 & GEN8_F2_S_ENA_MASK) >>
> > GEN8_F2_S_ENA_SHIFT;
> >   
> >   	/* BXT has a single slice and at most 3 subslices. */
> > -	sseu->max_slices = IS_GEN9_LP(dev_priv) ? 1 : 3;
> > -	sseu->max_subslices = IS_GEN9_LP(dev_priv) ? 3 : 4;
> > -	sseu->max_eus_per_subslice = 8;
> > +	intel_sseu_set_info(sseu, IS_GEN9_LP(dev_priv) ? 1 : 3,
> > +			    IS_GEN9_LP(dev_priv) ? 3 : 4, 8);
> >   
> >   	/*
> >   	 * The subslice disable field is global, i.e. it applies
> > @@ -357,14 +376,16 @@ static void gen9_sseu_info_init(struct
> > drm_i915_private *dev_priv)
> >   			/* skip disabled slice */
> >   			continue;
> >   
> > -		sseu->subslice_mask[s] = subslice_mask;
> > +		intel_sseu_set_subslices(sseu, s, subslice_mask);
> >   
> >   		eu_disable = I915_READ(GEN9_EU_DISABLE(s));
> >   		for (ss = 0; ss < sseu->max_subslices; ss++) {
> >   			int eu_per_ss;
> >   			u8 eu_disabled_mask;
> > +			u8 ss_idx = s * sseu->ss_stride + ss /
> > BITS_PER_BYTE;
> >   
> > -			if (!(sseu->subslice_mask[s] & BIT(ss)))
> > +			if (!(sseu->subslice_mask[ss_idx] &
> > +			      BIT(ss % BITS_PER_BYTE)))
> >   				/* skip disabled subslice */
> >   				continue;
> >   
> > @@ -437,9 +458,7 @@ static void broadwell_sseu_info_init(struct
> > drm_i915_private *dev_priv)
> >   
> >   	fuse2 = I915_READ(GEN8_FUSE2);
> >   	sseu->slice_mask = (fuse2 & GEN8_F2_S_ENA_MASK) >>
> > GEN8_F2_S_ENA_SHIFT;
> > -	sseu->max_slices = 3;
> > -	sseu->max_subslices = 3;
> > -	sseu->max_eus_per_subslice = 8;
> > +	intel_sseu_set_info(sseu, 3, 3, 8);
> >   
> >   	/*
> >   	 * The subslice disable field is global, i.e. it applies
> > @@ -466,18 +485,21 @@ static void broadwell_sseu_info_init(struct
> > drm_i915_private *dev_priv)
> >   			/* skip disabled slice */
> >   			continue;
> >   
> > -		sseu->subslice_mask[s] = subslice_mask;
> > +		intel_sseu_set_subslices(sseu, s, subslice_mask);
> >   
> >   		for (ss = 0; ss < sseu->max_subslices; ss++) {
> >   			u8 eu_disabled_mask;
> > +			u8 ss_idx = s * sseu->ss_stride + ss /
> > BITS_PER_BYTE;
> >   			u32 n_disabled;
> >   
> > -			if (!(sseu->subslice_mask[s] & BIT(ss)))
> > +			if (!(sseu->subslice_mask[ss_idx] &
> > +			      BIT(ss % BITS_PER_BYTE)))
> >   				/* skip disabled subslice */
> >   				continue;
> >   
> >   			eu_disabled_mask =
> > -				eu_disable[s] >> (ss * sseu-
> > >max_eus_per_subslice);
> > +				eu_disable[s] >>
> > +					(ss * sseu-
> > >max_eus_per_subslice);
> >   
> >   			intel_sseu_set_eus(sseu, s, ss,
> > ~eu_disabled_mask);
> >   
> > @@ -517,6 +539,7 @@ static void haswell_sseu_info_init(struct
> > drm_i915_private *dev_priv)
> >   	struct sseu_dev_info *sseu = &RUNTIME_INFO(dev_priv)->sseu;
> >   	u32 fuse1;
> >   	int s, ss;
> > +	u32 subslice_mask;
> >   
> >   	/*
> >   	 * There isn't a register to tell us how many slices/subslices.
> > We
> > @@ -528,22 +551,18 @@ static void haswell_sseu_info_init(struct
> > drm_i915_private *dev_priv)
> >   		/* fall through */
> >   	case 1:
> >   		sseu->slice_mask = BIT(0);
> > -		sseu->subslice_mask[0] = BIT(0);
> > +		subslice_mask = BIT(0);
> >   		break;
> >   	case 2:
> >   		sseu->slice_mask = BIT(0);
> > -		sseu->subslice_mask[0] = BIT(0) | BIT(1);
> > +		subslice_mask = BIT(0) | BIT(1);
> >   		break;
> >   	case 3:
> >   		sseu->slice_mask = BIT(0) | BIT(1);
> > -		sseu->subslice_mask[0] = BIT(0) | BIT(1);
> > -		sseu->subslice_mask[1] = BIT(0) | BIT(1);
> > +		subslice_mask = BIT(0) | BIT(1);
> >   		break;
> >   	}
> >   
> > -	sseu->max_slices = hweight8(sseu->slice_mask);
> > -	sseu->max_subslices = hweight8(sseu->subslice_mask[0]);
> > -
> >   	fuse1 = I915_READ(HSW_PAVP_FUSE1);
> >   	switch ((fuse1 & HSW_F1_EU_DIS_MASK) >> HSW_F1_EU_DIS_SHIFT) {
> >   	default:
> > @@ -560,9 +579,14 @@ static void haswell_sseu_info_init(struct
> > drm_i915_private *dev_priv)
> >   		sseu->eu_per_subslice = 6;
> >   		break;
> >   	}
> > -	sseu->max_eus_per_subslice = sseu->eu_per_subslice;
> > +
> > +	intel_sseu_set_info(sseu, hweight8(sseu->slice_mask),
> > +			    hweight8(subslice_mask),
> > +			    sseu->eu_per_subslice);
> 
> Personal preference: could use a local variable for eu_per_subslice 
> above to avoid setting it to itself here.

Yeah this is a bit ugly... I'll change it.

Thanks for the feedback!
Stuart

> 
> Daniele
> 
> >   
> >   	for (s = 0; s < sseu->max_slices; s++) {
> > +		intel_sseu_set_subslices(sseu, s, subslice_mask);
> > +
> >   		for (ss = 0; ss < sseu->max_subslices; ss++) {
> >   			intel_sseu_set_eus(sseu, s, ss,
> >   					   (1UL << sseu-
> > >eu_per_subslice) - 1);
> > 

[-- Attachment #1.2: smime.p7s --]
[-- Type: application/x-pkcs7-signature, Size: 3270 bytes --]

[-- Attachment #2: Type: text/plain, Size: 159 bytes --]

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 5/6] drm/i915: Remove inline from sseu helper functions
  2019-05-02  7:15           ` Jani Nikula
@ 2019-05-02 14:50             ` Summers, Stuart
  2019-05-02 14:58               ` Jani Nikula
  0 siblings, 1 reply; 35+ messages in thread
From: Summers, Stuart @ 2019-05-02 14:50 UTC (permalink / raw)
  To: Ceraolo Spurio, Daniele, intel-gfx, jani.nikula


[-- Attachment #1.1: Type: text/plain, Size: 2234 bytes --]

On Thu, 2019-05-02 at 10:15 +0300, Jani Nikula wrote:
> On Wed, 01 May 2019, "Summers, Stuart" <stuart.summers@intel.com>
> wrote:
> > On Wed, 2019-05-01 at 14:19 -0700, Daniele Ceraolo Spurio wrote:
> > > 
> > > On 5/1/19 2:04 PM, Summers, Stuart wrote:
> > > > On Wed, 2019-05-01 at 13:04 -0700, Daniele Ceraolo Spurio
> > > > wrote:
> > > > > Can you elaborate a bit more on what's the rationale for
> > > > > this? do
> > > > > you just want to avoid having too many inlines since the
> > > > > paths
> > > > > they're used in are not critical, or do you have some more
> > > > > functional reason?  This is not a critic to the patch, I just
> > > > > want to understand where you're coming from ;)
> > > > 
> > > > This was a request from Jani Nikula in a previous series
> > > > update. I
> > > > don't have a strong preference either way personally. If you
> > > > don't
> > > > have any major concerns, I'd prefer to keep the series as-is to
> > > > prevent too much thrash here, but let me know.
> > > > 
> > > 
> > > No concerns, just please update the commit message to explain
> > > that
> > > we're moving them because there is no need for them to be inline
> > > since they're not on a critical path where we need preformance.
> > 
> > Sounds great.
> 
> I've become critical of superfluous inlines. They break the
> abstraction
> by exposing the internals in the header, and make the
> interdependencies
> of headers harder to resolve.
> 
> As the driver keeps growing and more people contribute to it, I think
> we
> need to pay more attention on how we structure the source. To this
> end
> we've added new gt/ subdir, are about to add gem/ and likely display/
> too before long, and we've significantly split off the monster
> i915_drv.h and intel_drv.h headers.
> 
> Obviously inlines have their place and purpose, but I think we
> sprinkle
> them a bit too eagerly without paying attention.
> 
> I like the patch.
> 
> Acked-by: Jani Nikula <jani.nikula@intel.com>

Jani, based on Daniele's feedback, I'm planning on squashing this patch
with the patch that moves these helper functions to intel_sseu.h. Any
issue keeping your Ack here?

-Stuart

> 
> 

[-- Attachment #1.2: smime.p7s --]
[-- Type: application/x-pkcs7-signature, Size: 3270 bytes --]

[-- Attachment #2: Type: text/plain, Size: 159 bytes --]

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 5/6] drm/i915: Remove inline from sseu helper functions
  2019-05-02 14:58               ` Jani Nikula
@ 2019-05-02 14:58                 ` Summers, Stuart
  0 siblings, 0 replies; 35+ messages in thread
From: Summers, Stuart @ 2019-05-02 14:58 UTC (permalink / raw)
  To: Ceraolo Spurio, Daniele, intel-gfx, jani.nikula


[-- Attachment #1.1: Type: text/plain, Size: 521 bytes --]

On Thu, 2019-05-02 at 17:58 +0300, Jani Nikula wrote:
> On Thu, 02 May 2019, "Summers, Stuart" <stuart.summers@intel.com>
> wrote:
> > On Thu, 2019-05-02 at 10:15 +0300, Jani Nikula wrote:
> > > Acked-by: Jani Nikula <jani.nikula@intel.com>
> > 
> > Jani, based on Daniele's feedback, I'm planning on squashing this
> > patch
> > with the patch that moves these helper functions to intel_sseu.h.
> > Any
> > issue keeping your Ack here?
> 
> None.

Thanks for the Ack!

-Stuart

> 
> BR,
> Jani.
> 

[-- Attachment #1.2: smime.p7s --]
[-- Type: application/x-pkcs7-signature, Size: 3270 bytes --]

[-- Attachment #2: Type: text/plain, Size: 159 bytes --]

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 5/6] drm/i915: Remove inline from sseu helper functions
  2019-05-02 14:50             ` Summers, Stuart
@ 2019-05-02 14:58               ` Jani Nikula
  2019-05-02 14:58                 ` Summers, Stuart
  0 siblings, 1 reply; 35+ messages in thread
From: Jani Nikula @ 2019-05-02 14:58 UTC (permalink / raw)
  To: Summers, Stuart, Ceraolo Spurio, Daniele, intel-gfx

On Thu, 02 May 2019, "Summers, Stuart" <stuart.summers@intel.com> wrote:
> On Thu, 2019-05-02 at 10:15 +0300, Jani Nikula wrote:
>> Acked-by: Jani Nikula <jani.nikula@intel.com>
>
> Jani, based on Daniele's feedback, I'm planning on squashing this patch
> with the patch that moves these helper functions to intel_sseu.h. Any
> issue keeping your Ack here?

None.

BR,
Jani.

-- 
Jani Nikula, Intel Open Source Graphics Center
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 6/6] drm/i915: Expand subslice mask
  2019-05-02 14:47     ` Summers, Stuart
@ 2019-05-03  9:05       ` Lionel Landwerlin
  2019-05-03 14:28         ` Summers, Stuart
  0 siblings, 1 reply; 35+ messages in thread
From: Lionel Landwerlin @ 2019-05-03  9:05 UTC (permalink / raw)
  To: Summers, Stuart, Ceraolo Spurio, Daniele, intel-gfx

On 02/05/2019 15:47, Summers, Stuart wrote:
> On Wed, 2019-05-01 at 15:04 -0700, Daniele Ceraolo Spurio wrote:
>> On 5/1/19 8:34 AM, Stuart Summers wrote:
>>> Currently, the subslice_mask runtime parameter is stored as an
>>> array of subslices per slice. Expand the subslice mask array to
>>> better match what is presented to userspace through the
>>> I915_QUERY_TOPOLOGY_INFO ioctl. The index into this array is
>>> then calculated:
>>>     slice * subslice stride + subslice index / 8
>>>
>>> v2: fix spacing in set_sseu_info args
>>>       use set_sseu_info to initialize sseu data when building
>>>       device status in debugfs
>>>       rename variables in intel_engine_types.h to avoid checkpatch
>>>       warnings
>>> v3: update headers in intel_sseu.h
>>> v4: add const to some sseu_dev_info variables
>>>       use sseu->eu_stride for EU stride calculations
>>>
>>> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
>>> Signed-off-by: Stuart Summers <stuart.summers@intel.com>
>> Can you also get an ack from Lionel, to make sure this all fits with
>> the
>> expected reporting?
> Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>


The change makes sense to me but I haven't had time to verify every 
single bit in this patch :)

The i915_query IGT tests should be able to catch some issues on HSW 
(which can have 10EUs per subslice).


Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>


>
>>> ---
>>>    drivers/gpu/drm/i915/gt/intel_engine_cs.c    |   6 +-
>>>    drivers/gpu/drm/i915/gt/intel_engine_types.h |  32 +++--
>>>    drivers/gpu/drm/i915/gt/intel_hangcheck.c    |   3 +-
>>>    drivers/gpu/drm/i915/gt/intel_sseu.c         |  49 +++++--
>>>    drivers/gpu/drm/i915/gt/intel_sseu.h         |  16 ++-
>>>    drivers/gpu/drm/i915/gt/intel_workarounds.c  |   2 +-
>>>    drivers/gpu/drm/i915/i915_debugfs.c          |  44 +++---
>>>    drivers/gpu/drm/i915/i915_drv.c              |   6 +-
>>>    drivers/gpu/drm/i915/i915_gpu_error.c        |   5 +-
>>>    drivers/gpu/drm/i915/i915_query.c            |  10 +-
>>>    drivers/gpu/drm/i915/intel_device_info.c     | 142 +++++++++++---
>>> -----
>>>    11 files changed, 198 insertions(+), 117 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
>>> b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
>>> index 6e40f8ea9a6a..8f7967cc9a50 100644
>>> --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
>>> +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
>>> @@ -914,7 +914,7 @@ u32 intel_calculate_mcr_s_ss_select(struct
>>> drm_i915_private *dev_priv)
>>>    	const struct sseu_dev_info *sseu = &RUNTIME_INFO(dev_priv)-
>>>> sseu;
>>>    	u32 mcr_s_ss_select;
>>>    	u32 slice = fls(sseu->slice_mask);
>>> -	u32 subslice = fls(sseu->subslice_mask[slice]);
>>> +	u32 subslice = fls(sseu->subslice_mask[slice * sseu-
>>>> ss_stride]);
>> This (and the registers we use below) only works if ss_stride = 1.
>> Can
>> we add a:
>>
>> 	GEM_BUG_ON(sseu->ss_stride > 1);
>>
>> to catch the fact that this function will need updating to handle
>> that
>> case if/when we get it?
> I'll rework this and post an update.
>
>>>    
>>>    	if (IS_GEN(dev_priv, 10))
>>>    		mcr_s_ss_select = GEN8_MCR_SLICE(slice) |
>>> @@ -990,6 +990,7 @@ void intel_engine_get_instdone(struct
>>> intel_engine_cs *engine,
>>>    			       struct intel_instdone *instdone)
>>>    {
>>>    	struct drm_i915_private *dev_priv = engine->i915;
>>> +	const struct sseu_dev_info *sseu = &RUNTIME_INFO(dev_priv)-
>>>> sseu;
>>>    	struct intel_uncore *uncore = engine->uncore;
>>>    	u32 mmio_base = engine->mmio_base;
>>>    	int slice;
>>> @@ -1007,7 +1008,8 @@ void intel_engine_get_instdone(struct
>>> intel_engine_cs *engine,
>>>    
>>>    		instdone->slice_common =
>>>    			intel_uncore_read(uncore, GEN7_SC_INSTDONE);
>>> -		for_each_instdone_slice_subslice(dev_priv, slice,
>>> subslice) {
>>> +		for_each_instdone_slice_subslice(dev_priv, sseu, slice,
>>> +						 subslice) {
>>>    			instdone->sampler[slice][subslice] =
>>>    				read_subslice_reg(dev_priv, slice,
>>> subslice,
>>>    						  GEN7_SAMPLER_INSTDONE
>>> );
>>> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h
>>> b/drivers/gpu/drm/i915/gt/intel_engine_types.h
>>> index 9d64e33f8427..1710546a2446 100644
>>> --- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
>>> +++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
>>> @@ -534,20 +534,22 @@ intel_engine_needs_breadcrumb_tasklet(const
>>> struct intel_engine_cs *engine)
>>>    	return engine->flags & I915_ENGINE_NEEDS_BREADCRUMB_TASKLET;
>>>    }
>>>    
>>> -#define instdone_slice_mask(dev_priv__) \
>>> -	(IS_GEN(dev_priv__, 7) ? \
>>> -	 1 : RUNTIME_INFO(dev_priv__)->sseu.slice_mask)
>>> -
>>> -#define instdone_subslice_mask(dev_priv__) \
>>> -	(IS_GEN(dev_priv__, 7) ? \
>>> -	 1 : RUNTIME_INFO(dev_priv__)->sseu.subslice_mask[0])
>>> -
>>> -#define for_each_instdone_slice_subslice(dev_priv__, slice__,
>>> subslice__) \
>>> -	for ((slice__) = 0, (subslice__) = 0; \
>>> -	     (slice__) < I915_MAX_SLICES; \
>>> -	     (subslice__) = ((subslice__) + 1) < I915_MAX_SUBSLICES ?
>>> (subslice__) + 1 : 0, \
>>> -	       (slice__) += ((subslice__) == 0)) \
>>> -		for_each_if((BIT(slice__) &
>>> instdone_slice_mask(dev_priv__)) && \
>>> -			    (BIT(subslice__) &
>>> instdone_subslice_mask(dev_priv__)))
>>> +#define instdone_has_slice(dev_priv___, sseu___, slice___) \
>>> +	((IS_GEN(dev_priv___, 7) ? \
>>> +	  1 : (sseu___)->slice_mask) & \
>> I'd put the ternary op on the same line here for readability
> Yeah good point.
>
>>> +	BIT(slice___)) \
>> no need for "\" here (and below).
> Ok.
>
>>> +
>>> +#define instdone_has_subslice(dev_priv__, sseu__, slice__,
>>> subslice__) \
>> need some more parenthesis in this macro to fix the
>> MACRO_ARG_PRECEDENCE
>> warning in checkpatch.
> Thanks, I'll fix this.
>
>>> +	((IS_GEN(dev_priv__, 7) ? \
>>> +	  1 : (sseu__)->subslice_mask[slice__ * (sseu__)->ss_stride + \
>>> +				      subslice__ / BITS_PER_BYTE]) & \
>> The calculation to get the correct subslice u8 entry:
>>
>> 	sseu->subslice_mask[s * sseu->ss_stride + ss / BITS_PER_BYTE]
>>
>> seems to be repeated a few times in this patch, so it might be worth
>> moving it to its own inline function. looks like you always
>> ultimately
>> want a bool, so we could also go a bit further and have something
>> like:
>>
>> static inline bool intel_sseu_has_subslice(sseu, s, ss)
>> {
>> 	u8 mask = sseu->subslice_mask[s * sseu->ss_stride +
>> 				      ss / BITS_PER_BYTE];
>>
>> 	return mask & BIT(ss % BITS_PER_BYTE);
>> }
>>
>> and then do:
>>
>> #define instdone_has_subslice(dev_priv__, sseu__, slice__,
>> subslice__) \
>> 	((IS_GEN(dev_priv__, 7) ? subslice__ == 0 : \
>> 		intel_sseu_has_subslice(...))
>>
> Good point and good suggeMakes sense. stion. I'll clean this up.
>
>>> +	 BIT(subslice__ % BITS_PER_BYTE)) \
>>> +
>>> +#define for_each_instdone_slice_subslice(dev_priv_, sseu_, slice_,
>>> subslice_) \
>>> +	for ((slice_) = 0, (subslice_) = 0; (slice_) < I915_MAX_SLICES;
>>> \
>>> +	     (subslice_) = ((subslice_) + 1) < I915_MAX_SUBSLICES ?
>>> (subslice_) + 1 : 0, \
>> This ternary op should be simplifiable as:
>>
>> 	(subslice_) = ((subslice_) + 1) % I915_MAX_SUBSLICES,
> I had been carrying this forward as-is as much as possible. Your
> suggestion makes sense though. I'll take a look.
>
>>> +	       (slice_) += ((subslice_) == 0)) \
>>> +		for_each_if(instdone_has_slice(dev_priv_, sseu_, slice)
>>> && \
>> missing the "_" after "slice"
> Good catch, thanks!
>
>>> +			    instdone_has_subslice(dev_priv_, sseu_,
>>> slice_, subslice_)) \
>>>    
>>>    #endif /* __INTEL_ENGINE_TYPES_H__ */
>>> diff --git a/drivers/gpu/drm/i915/gt/intel_hangcheck.c
>>> b/drivers/gpu/drm/i915/gt/intel_hangcheck.c
>>> index e5eaa06fe74d..53c1c98161e1 100644
>>> --- a/drivers/gpu/drm/i915/gt/intel_hangcheck.c
>>> +++ b/drivers/gpu/drm/i915/gt/intel_hangcheck.c
>>> @@ -50,6 +50,7 @@ static bool instdone_unchanged(u32
>>> current_instdone, u32 *old_instdone)
>>>    static bool subunits_stuck(struct intel_engine_cs *engine)
>>>    {
>>>    	struct drm_i915_private *dev_priv = engine->i915;
>>> +	const struct sseu_dev_info *sseu = &RUNTIME_INFO(dev_priv)-
>>>> sseu;
>>>    	struct intel_instdone instdone;
>>>    	struct intel_instdone *accu_instdone = &engine-
>>>> hangcheck.instdone;
>>>    	bool stuck;
>>> @@ -71,7 +72,7 @@ static bool subunits_stuck(struct intel_engine_cs
>>> *engine)
>>>    	stuck &= instdone_unchanged(instdone.slice_common,
>>>    				    &accu_instdone->slice_common);
>>>    
>>> -	for_each_instdone_slice_subslice(dev_priv, slice, subslice) {
>>> +	for_each_instdone_slice_subslice(dev_priv, sseu, slice,
>>> subslice) {
>>>    		stuck &=
>>> instdone_unchanged(instdone.sampler[slice][subslice],
>>>    					    &accu_instdone-
>>>> sampler[slice][subslice]);
>>>    		stuck &=
>>> instdone_unchanged(instdone.row[slice][subslice],
>>> diff --git a/drivers/gpu/drm/i915/gt/intel_sseu.c
>>> b/drivers/gpu/drm/i915/gt/intel_sseu.c
>>> index 4a0b82fc108c..49316b7ef074 100644
>>> --- a/drivers/gpu/drm/i915/gt/intel_sseu.c
>>> +++ b/drivers/gpu/drm/i915/gt/intel_sseu.c
>>> @@ -8,6 +8,17 @@
>>>    #include "intel_lrc_reg.h"
>>>    #include "intel_sseu.h"
>>>    
>>> +void intel_sseu_set_info(struct sseu_dev_info *sseu, u8
>>> max_slices,
>>> +			 u8 max_subslices, u8 max_eus_per_subslice)
>>> +{
>>> +	sseu->max_slices = max_slices;
>>> +	sseu->max_subslices = max_subslices;
>>> +	sseu->max_eus_per_subslice = max_eus_per_subslice;
>>> +
>>> +	sseu->ss_stride = GEN_SSEU_STRIDE(sseu->max_subslices);
>>> +	sseu->eu_stride = GEN_SSEU_STRIDE(sseu->max_eus_per_subslice);
>>> +}
>>> +
>>>    unsigned int
>>>    intel_sseu_subslice_total(const struct sseu_dev_info *sseu)
>>>    {
>>> @@ -22,17 +33,39 @@ intel_sseu_subslice_total(const struct
>>> sseu_dev_info *sseu)
>>>    unsigned int
>>>    intel_sseu_subslices_per_slice(const struct sseu_dev_info *sseu,
>>> u8 slice)
>> Here we pass slice as u8, but below we use int. Any reason for the
>> difference?
> No good reason. I'll fix it.
>
>>>    {
>>> -	return hweight8(sseu->subslice_mask[slice]);
>>> +	unsigned int i, total = 0;
>>> +
>>> +	for (i = 0; i < sseu->ss_stride; i++)
>>> +		total += hweight8(sseu->subslice_mask[slice * sseu-
>>>> ss_stride +
>>> +						      i]);
>>> +
>>> +	return total;
>>> +}
>>> +
>>> +void intel_sseu_copy_subslices(const struct sseu_dev_info *sseu,
>>> int slice,
>>> +			       u8 *to_mask, const u8 *from_mask)
>> You always use sseu->subslice_mask has a from_mask, can't we just
>> get
>> that from the sseu param and avoid the from_mask?
> I wanted to make this a little more generic, but I agree maybe that's
> overkill. I'll rework this.
>
>>> +{
>>> +	int offset = slice * sseu->ss_stride;
>>> +
>>> +	memcpy(&to_mask[offset], &from_mask[offset], sseu->ss_stride);
>>> +}
>>> +
>>> +void intel_sseu_set_subslices(struct sseu_dev_info *sseu, int
>>> slice,
>>> +			      u32 ss_mask)
>>> +{
>>> +	int i, offset = slice * sseu->ss_stride;
>>> +
>>> +	for (i = 0; i < sseu->ss_stride; i++)
>>> +		sseu->subslice_mask[offset + i] =
>>> +			(ss_mask >> (BITS_PER_BYTE * i)) & 0xff;
>>>    }
>>>    
>>>    static int intel_sseu_eu_idx(const struct sseu_dev_info *sseu,
>>> int slice,
>>>    			     int subslice)
>>>    {
>>> -	int subslice_stride = DIV_ROUND_UP(sseu->max_eus_per_subslice,
>>> -					   BITS_PER_BYTE);
>>> -	int slice_stride = sseu->max_subslices * subslice_stride;
>>> +	int slice_stride = sseu->max_subslices * sseu->eu_stride;
>>>    
>>> -	return slice * slice_stride + subslice * subslice_stride;
>>> +	return slice * slice_stride + subslice * sseu->eu_stride;
>>>    }
>>>    
>>>    u16 intel_sseu_get_eus(const struct sseu_dev_info *sseu, int
>>> slice,
>>> @@ -41,8 +74,7 @@ u16 intel_sseu_get_eus(const struct sseu_dev_info
>>> *sseu, int slice,
>>>    	int i, offset = intel_sseu_eu_idx(sseu, slice, subslice);
>>>    	u16 eu_mask = 0;
>>>    
>>> -	for (i = 0;
>>> -	     i < DIV_ROUND_UP(sseu->max_eus_per_subslice,
>>> BITS_PER_BYTE); i++) {
>>> +	for (i = 0; i < sseu->eu_stride; i++) {
>>>    		eu_mask |= ((u16)sseu->eu_mask[offset + i]) <<
>>>    			(i * BITS_PER_BYTE);
>>>    	}
>>> @@ -55,8 +87,7 @@ void intel_sseu_set_eus(struct sseu_dev_info
>>> *sseu, int slice, int subslice,
>>>    {
>>>    	int i, offset = intel_sseu_eu_idx(sseu, slice, subslice);
>>>    
>>> -	for (i = 0;
>>> -	     i < DIV_ROUND_UP(sseu->max_eus_per_subslice,
>>> BITS_PER_BYTE); i++) {
>>> +	for (i = 0; i < sseu->eu_stride; i++) {
>>>    		sseu->eu_mask[offset + i] =
>>>    			(eu_mask >> (BITS_PER_BYTE * i)) & 0xff;
>>>    	}
>>> diff --git a/drivers/gpu/drm/i915/gt/intel_sseu.h
>>> b/drivers/gpu/drm/i915/gt/intel_sseu.h
>>> index 56e3721ae83f..bf01f338a8cc 100644
>>> --- a/drivers/gpu/drm/i915/gt/intel_sseu.h
>>> +++ b/drivers/gpu/drm/i915/gt/intel_sseu.h
>>> @@ -9,16 +9,18 @@
>>>    
>>>    #include <linux/types.h>
>>>    #include <linux/kernel.h>
>>> +#include <linux/string.h>
>>>    
>>>    struct drm_i915_private;
>>>    
>>>    #define GEN_MAX_SLICES		(6) /* CNL upper bound */
>>>    #define GEN_MAX_SUBSLICES	(8) /* ICL upper bound */
>>>    #define GEN_SSEU_STRIDE(bits) DIV_ROUND_UP(bits, BITS_PER_BYTE)
>>> +#define GEN_MAX_SUBSLICE_STRIDE GEN_SSEU_STRIDE(GEN_MAX_SUBSLICES)
>>>    
>>>    struct sseu_dev_info {
>>>    	u8 slice_mask;
>>> -	u8 subslice_mask[GEN_MAX_SLICES];
>>> +	u8 subslice_mask[GEN_MAX_SLICES * GEN_MAX_SUBSLICE_STRIDE];
>>>    	u16 eu_total;
>>>    	u8 eu_per_subslice;
>>>    	u8 min_eu_in_pool;
>>> @@ -33,6 +35,9 @@ struct sseu_dev_info {
>>>    	u8 max_subslices;
>>>    	u8 max_eus_per_subslice;
>>>    
>>> +	u8 ss_stride;
>>> +	u8 eu_stride;
>>> +
>>>    	/* We don't have more than 8 eus per subslice at the moment and
>>> as we
>>>    	 * store eus enabled using bits, no need to multiply by eus per
>>>    	 * subslice.
>>> @@ -63,12 +68,21 @@ intel_sseu_from_device_info(const struct
>>> sseu_dev_info *sseu)
>>>    	return value;
>>>    }
>>>    
>>> +void intel_sseu_set_info(struct sseu_dev_info *sseu, u8
>>> max_slices,
>>> +			 u8 max_subslices, u8 max_eus_per_subslice);
>>> +
>>>    unsigned int
>>>    intel_sseu_subslice_total(const struct sseu_dev_info *sseu);
>>>    
>>>    unsigned int
>>>    intel_sseu_subslices_per_slice(const struct sseu_dev_info *sseu,
>>> u8 slice);
>>>    
>>> +void intel_sseu_copy_subslices(const struct sseu_dev_info *sseu,
>>> int slice,
>>> +			       u8 *to_mask, const u8 *from_mask);
>>> +
>>> +void intel_sseu_set_subslices(struct sseu_dev_info *sseu, int
>>> slice,
>>> +			      u32 ss_mask);
>>> +
>>>    u16 intel_sseu_get_eus(const struct sseu_dev_info *sseu, int
>>> slice,
>>>    		       int subslice);
>>>    
>>> diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c
>>> b/drivers/gpu/drm/i915/gt/intel_workarounds.c
>>> index 43e290306551..7c7e9556c1c5 100644
>>> --- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
>>> +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
>>> @@ -767,7 +767,7 @@ wa_init_mcr(struct drm_i915_private *i915,
>>> struct i915_wa_list *wal)
>>>    		u32 slice = fls(sseu->slice_mask);
>>>    		u32 fuse3 =
>>>    			intel_uncore_read(&i915->uncore,
>>> GEN10_MIRROR_FUSE3);
>>> -		u8 ss_mask = sseu->subslice_mask[slice];
>>> +		u8 ss_mask = sseu->subslice_mask[slice * sseu-
>>>> ss_stride];
>> could use a
>>
>> 	GEM_BUG_ON(sseu->ss_stride > 1);
>>
>> here as well to remind us this will need changes in that case
> Ok.
>
>>>    
>>>    		u8 enabled_mask = (ss_mask | ss_mask >>
>>>    				   GEN10_L3BANK_PAIR_COUNT) &
>>> GEN10_L3BANK_MASK;
>>> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c
>>> b/drivers/gpu/drm/i915/i915_debugfs.c
>>> index 3f3ee83ac315..08089c24db25 100644
>>> --- a/drivers/gpu/drm/i915/i915_debugfs.c
>>> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
>>> @@ -1257,6 +1257,7 @@ static void i915_instdone_info(struct
>>> drm_i915_private *dev_priv,
>>>    			       struct seq_file *m,
>>>    			       struct intel_instdone *instdone)
>>>    {
>>> +	struct sseu_dev_info *sseu = &RUNTIME_INFO(dev_priv)->sseu;
>>>    	int slice;
>>>    	int subslice;
>>>    
>>> @@ -1272,11 +1273,11 @@ static void i915_instdone_info(struct
>>> drm_i915_private *dev_priv,
>>>    	if (INTEL_GEN(dev_priv) <= 6)
>>>    		return;
>>>    
>>> -	for_each_instdone_slice_subslice(dev_priv, slice, subslice)
>>> +	for_each_instdone_slice_subslice(dev_priv, sseu, slice,
>>> subslice)
>>>    		seq_printf(m, "\t\tSAMPLER_INSTDONE[%d][%d]: 0x%08x\n",
>>>    			   slice, subslice, instdone-
>>>> sampler[slice][subslice]);
>>>    
>>> -	for_each_instdone_slice_subslice(dev_priv, slice, subslice)
>>> +	for_each_instdone_slice_subslice(dev_priv, sseu, slice,
>>> subslice)
>>>    		seq_printf(m, "\t\tROW_INSTDONE[%d][%d]: 0x%08x\n",
>>>    			   slice, subslice, instdone-
>>>> row[slice][subslice]);
>>>    }
>>> @@ -4066,7 +4067,9 @@ static void gen10_sseu_device_status(struct
>>> drm_i915_private *dev_priv,
>>>    			continue;
>>>    
>>>    		sseu->slice_mask |= BIT(s);
>>> -		sseu->subslice_mask[s] = info->sseu.subslice_mask[s];
>>> +		intel_sseu_copy_subslices(&info->sseu, s,
>>> +					  sseu->subslice_mask,
>>> +					  info->sseu.subslice_mask);
>>>    
>>>    		for (ss = 0; ss < info->sseu.max_subslices; ss++) {
>>>    			unsigned int eu_cnt;
>>> @@ -4117,18 +4120,22 @@ static void gen9_sseu_device_status(struct
>>> drm_i915_private *dev_priv,
>>>    		sseu->slice_mask |= BIT(s);
>>>    
>>>    		if (IS_GEN9_BC(dev_priv))
>>> -			sseu->subslice_mask[s] =
>>> -				RUNTIME_INFO(dev_priv)-
>>>> sseu.subslice_mask[s];
>>> +			intel_sseu_copy_subslices(&info->sseu, s,
>>> +						  sseu->subslice_mask,
>>> +						  info-
>>>> sseu.subslice_mask);
>>>    
>>>    		for (ss = 0; ss < info->sseu.max_subslices; ss++) {
>>>    			unsigned int eu_cnt;
>>> +			u8 ss_idx = s * info->sseu.ss_stride +
>>> +				    ss / BITS_PER_BYTE;
>>>    
>>>    			if (IS_GEN9_LP(dev_priv)) {
>>>    				if (!(s_reg[s] &
>>> (GEN9_PGCTL_SS_ACK(ss))))
>>>    					/* skip disabled subslice */
>>>    					continue;
>>>    
>>> -				sseu->subslice_mask[s] |= BIT(ss);
>>> +				sseu->subslice_mask[ss_idx] |=
>>> +					BIT(ss % BITS_PER_BYTE);
>>>    			}
>>>    
>>>    			eu_cnt = 2 * hweight32(eu_reg[2*s + ss/2] &
>>> @@ -4145,25 +4152,24 @@ static void gen9_sseu_device_status(struct
>>> drm_i915_private *dev_priv,
>>>    static void broadwell_sseu_device_status(struct drm_i915_private
>>> *dev_priv,
>>>    					 struct sseu_dev_info *sseu)
>>>    {
>>> +	struct intel_runtime_info *info = RUNTIME_INFO(dev_priv);
>>>    	u32 slice_info = I915_READ(GEN8_GT_SLICE_INFO);
>>>    	int s;
>>>    
>>>    	sseu->slice_mask = slice_info & GEN8_LSLICESTAT_MASK;
>>>    
>>>    	if (sseu->slice_mask) {
>>> -		sseu->eu_per_subslice =
>>> -			RUNTIME_INFO(dev_priv)->sseu.eu_per_subslice;
>>> -		for (s = 0; s < fls(sseu->slice_mask); s++) {
>>> -			sseu->subslice_mask[s] =
>>> -				RUNTIME_INFO(dev_priv)-
>>>> sseu.subslice_mask[s];
>>> -		}
>>> +		sseu->eu_per_subslice = info->sseu.eu_per_subslice;
>>> +		for (s = 0; s < fls(sseu->slice_mask); s++)
>>> +			intel_sseu_copy_subslices(&info->sseu, s,
>>> +						  sseu->subslice_mask,
>>> +						  info-
>>>> sseu.subslice_mask);
>>>    		sseu->eu_total = sseu->eu_per_subslice *
>>>    				 intel_sseu_subslice_total(sseu);
>>>    
>>>    		/* subtract fused off EU(s) from enabled slice(s) */
>>>    		for (s = 0; s < fls(sseu->slice_mask); s++) {
>>> -			u8 subslice_7eu =
>>> -				RUNTIME_INFO(dev_priv)-
>>>> sseu.subslice_7eu[s];
>>> +			u8 subslice_7eu = info->sseu.subslice_7eu[s];
>>>    
>>>    			sseu->eu_total -= hweight8(subslice_7eu);
>>>    		}
>>> @@ -4210,6 +4216,7 @@ static void i915_print_sseu_info(struct
>>> seq_file *m, bool is_available_info,
>>>    static int i915_sseu_status(struct seq_file *m, void *unused)
>>>    {
>>>    	struct drm_i915_private *dev_priv = node_to_i915(m->private);
>>> +	const struct intel_runtime_info *info = RUNTIME_INFO(dev_priv);
>>>    	struct sseu_dev_info sseu;
>>>    	intel_wakeref_t wakeref;
>>>    
>>> @@ -4217,14 +4224,13 @@ static int i915_sseu_status(struct seq_file
>>> *m, void *unused)
>>>    		return -ENODEV;
>>>    
>>>    	seq_puts(m, "SSEU Device Info\n");
>>> -	i915_print_sseu_info(m, true, &RUNTIME_INFO(dev_priv)->sseu);
>>> +	i915_print_sseu_info(m, true, &info->sseu);
>>>    
>>>    	seq_puts(m, "SSEU Device Status\n");
>>>    	memset(&sseu, 0, sizeof(sseu));
>>> -	sseu.max_slices = RUNTIME_INFO(dev_priv)->sseu.max_slices;
>>> -	sseu.max_subslices = RUNTIME_INFO(dev_priv)-
>>>> sseu.max_subslices;
>>> -	sseu.max_eus_per_subslice =
>>> -		RUNTIME_INFO(dev_priv)->sseu.max_eus_per_subslice;
>>> +	intel_sseu_set_info(&sseu, info->sseu.max_slices,
>>> +			    info->sseu.max_subslices,
>>> +			    info->sseu.max_eus_per_subslice);
>>>    
>>>    	with_intel_runtime_pm(dev_priv, wakeref) {
>>>    		if (IS_CHERRYVIEW(dev_priv))
>>> diff --git a/drivers/gpu/drm/i915/i915_drv.c
>>> b/drivers/gpu/drm/i915/i915_drv.c
>>> index 130c5140db0d..6afe4e3afea4 100644
>>> --- a/drivers/gpu/drm/i915/i915_drv.c
>>> +++ b/drivers/gpu/drm/i915/i915_drv.c
>>> @@ -326,7 +326,7 @@ static int i915_getparam_ioctl(struct
>>> drm_device *dev, void *data,
>>>    	struct pci_dev *pdev = dev_priv->drm.pdev;
>>>    	const struct sseu_dev_info *sseu = &RUNTIME_INFO(dev_priv)-
>>>> sseu;
>>>    	drm_i915_getparam_t *param = data;
>>> -	int value;
>>> +	int value = 0;
>>>    
>>>    	switch (param->param) {
>>>    	case I915_PARAM_IRQ_ACTIVE:
>>> @@ -455,7 +455,9 @@ static int i915_getparam_ioctl(struct
>>> drm_device *dev, void *data,
>>>    			return -ENODEV;
>>>    		break;
>>>    	case I915_PARAM_SUBSLICE_MASK:
>>> -		value = sseu->subslice_mask[0];
>>> +		/* Only copy bits from the first subslice */
>> s/subslice/slice/ ?
> True, thanks.
>
>>> +		memcpy(&value, sseu->subslice_mask,
>>> +		       min(sseu->ss_stride, (u8)sizeof(value)));
>>>    		if (!value)
>>>    			return -ENODEV;
>>>    		break;
>>> diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c
>>> b/drivers/gpu/drm/i915/i915_gpu_error.c
>>> index e1b858bd1d32..140918dd9b7d 100644
>>> --- a/drivers/gpu/drm/i915/i915_gpu_error.c
>>> +++ b/drivers/gpu/drm/i915/i915_gpu_error.c
>>> @@ -407,6 +407,7 @@ static void print_error_buffers(struct
>>> drm_i915_error_state_buf *m,
>>>    static void error_print_instdone(struct drm_i915_error_state_buf
>>> *m,
>>>    				 const struct drm_i915_error_engine
>>> *ee)
>>>    {
>>> +	struct sseu_dev_info *sseu = &RUNTIME_INFO(m->i915)->sseu;
>>>    	int slice;
>>>    	int subslice;
>>>    
>>> @@ -422,12 +423,12 @@ static void error_print_instdone(struct
>>> drm_i915_error_state_buf *m,
>>>    	if (INTEL_GEN(m->i915) <= 6)
>>>    		return;
>>>    
>>> -	for_each_instdone_slice_subslice(m->i915, slice, subslice)
>>> +	for_each_instdone_slice_subslice(m->i915, sseu, slice,
>>> subslice)
>>>    		err_printf(m, "  SAMPLER_INSTDONE[%d][%d]: 0x%08x\n",
>>>    			   slice, subslice,
>>>    			   ee->instdone.sampler[slice][subslice]);
>>>    
>>> -	for_each_instdone_slice_subslice(m->i915, slice, subslice)
>>> +	for_each_instdone_slice_subslice(m->i915, sseu, slice,
>>> subslice)
>>>    		err_printf(m, "  ROW_INSTDONE[%d][%d]: 0x%08x\n",
>>>    			   slice, subslice,
>>>    			   ee->instdone.row[slice][subslice]);
>>> diff --git a/drivers/gpu/drm/i915/i915_query.c
>>> b/drivers/gpu/drm/i915/i915_query.c
>>> index 7c1708c22811..000dcb145ce0 100644
>>> --- a/drivers/gpu/drm/i915/i915_query.c
>>> +++ b/drivers/gpu/drm/i915/i915_query.c
>>> @@ -37,8 +37,6 @@ static int query_topology_info(struct
>>> drm_i915_private *dev_priv,
>>>    	const struct sseu_dev_info *sseu = &RUNTIME_INFO(dev_priv)-
>>>> sseu;
>>>    	struct drm_i915_query_topology_info topo;
>>>    	u32 slice_length, subslice_length, eu_length, total_length;
>>> -	u8 subslice_stride = GEN_SSEU_STRIDE(sseu->max_subslices);
>>> -	u8 eu_stride = GEN_SSEU_STRIDE(sseu->max_eus_per_subslice);
>>>    	int ret;
>>>    
>>>    	if (query_item->flags != 0)
>>> @@ -50,8 +48,8 @@ static int query_topology_info(struct
>>> drm_i915_private *dev_priv,
>>>    	BUILD_BUG_ON(sizeof(u8) != sizeof(sseu->slice_mask));
>>>    
>>>    	slice_length = sizeof(sseu->slice_mask);
>>> -	subslice_length = sseu->max_slices * subslice_stride;
>>> -	eu_length = sseu->max_slices * sseu->max_subslices * eu_stride;
>>> +	subslice_length = sseu->max_slices * sseu->ss_stride;
>>> +	eu_length = sseu->max_slices * sseu->max_subslices * sseu-
>>>> eu_stride;
>>>    	total_length = sizeof(topo) + slice_length + subslice_length +
>>>    		       eu_length;
>>>    
>>> @@ -69,9 +67,9 @@ static int query_topology_info(struct
>>> drm_i915_private *dev_priv,
>>>    	topo.max_eus_per_subslice = sseu->max_eus_per_subslice;
>>>    
>>>    	topo.subslice_offset = slice_length;
>>> -	topo.subslice_stride = subslice_stride;
>>> +	topo.subslice_stride = sseu->ss_stride;
>>>    	topo.eu_offset = slice_length + subslice_length;
>>> -	topo.eu_stride = eu_stride;
>>> +	topo.eu_stride = sseu->eu_stride;
>>>    
>>>    	if (__copy_to_user(u64_to_user_ptr(query_item->data_ptr),
>>>    			   &topo, sizeof(topo)))
>>> diff --git a/drivers/gpu/drm/i915/intel_device_info.c
>>> b/drivers/gpu/drm/i915/intel_device_info.c
>>> index e1dbccf04cd9..bbbc0a8c2183 100644
>>> --- a/drivers/gpu/drm/i915/intel_device_info.c
>>> +++ b/drivers/gpu/drm/i915/intel_device_info.c
>>> @@ -84,17 +84,42 @@ void intel_device_info_dump_flags(const struct
>>> intel_device_info *info,
>>>    #undef PRINT_FLAG
>>>    }
>>>    
>>> +#define SS_STR_MAX_SIZE (GEN_MAX_SUBSLICE_STRIDE * 2)
>>> +
>>> +static u8 *
>>> +subslice_per_slice_str(u8 *buf, const struct sseu_dev_info *sseu,
>>> u8 slice)
>>> +{
>>> +	int i;
>>> +	u8 ss_offset = slice * sseu->ss_stride;
>>> +
>>> +	GEM_BUG_ON(slice >= sseu->max_slices);
>>> +
>>> +	memset(buf, 0, SS_STR_MAX_SIZE);
>>> +
>>> +	/*
>>> +	 * Print subslice information in reverse order to match
>>> +	 * userspace expectations.
>>> +	 */
>>> +	for (i = 0; i < sseu->ss_stride; i++)
>>> +		sprintf(&buf[i * 2], "%02x",
>>> +			sseu->subslice_mask[ss_offset + sseu->ss_stride
>>> -
>>> +					    (i + 1)]);
>>> +
>>> +	return buf;
>>> +}
>>> +
>>>    static void sseu_dump(const struct sseu_dev_info *sseu, struct
>>> drm_printer *p)
>>>    {
>>>    	int s;
>>> +	u8 buf[SS_STR_MAX_SIZE];
>>>    
>>>    	drm_printf(p, "slice total: %u, mask=%04x\n",
>>>    		   hweight8(sseu->slice_mask), sseu->slice_mask);
>>>    	drm_printf(p, "subslice total: %u\n",
>>> intel_sseu_subslice_total(sseu));
>>>    	for (s = 0; s < sseu->max_slices; s++) {
>>> -		drm_printf(p, "slice%d: %u subslices, mask=%04x\n",
>>> +		drm_printf(p, "slice%d: %u subslices, mask=%s\n",
>>>    			   s, intel_sseu_subslices_per_slice(sseu, s),
>>> -			   sseu->subslice_mask[s]);
>>> +			   subslice_per_slice_str(buf, sseu, s));
>>>    	}
>>>    	drm_printf(p, "EU total: %u\n", sseu->eu_total);
>>>    	drm_printf(p, "EU per subslice: %u\n", sseu->eu_per_subslice);
>>> @@ -118,6 +143,7 @@ void intel_device_info_dump_topology(const
>>> struct sseu_dev_info *sseu,
>>>    				     struct drm_printer *p)
>>>    {
>>>    	int s, ss;
>>> +	u8 buf[SS_STR_MAX_SIZE];
>>>    
>>>    	if (sseu->max_slices == 0) {
>>>    		drm_printf(p, "Unavailable\n");
>>> @@ -125,9 +151,9 @@ void intel_device_info_dump_topology(const
>>> struct sseu_dev_info *sseu,
>>>    	}
>>>    
>>>    	for (s = 0; s < sseu->max_slices; s++) {
>>> -		drm_printf(p, "slice%d: %u subslice(s) (0x%hhx):\n",
>>> +		drm_printf(p, "slice%d: %u subslice(s) (0x%s):\n",
>>>    			   s, intel_sseu_subslices_per_slice(sseu, s),
>>> -			   sseu->subslice_mask[s]);
>>> +			   subslice_per_slice_str(buf, sseu, s));
>>>    
>>>    		for (ss = 0; ss < sseu->max_subslices; ss++) {
>>>    			u16 enabled_eus = intel_sseu_get_eus(sseu, s,
>>> ss);
>>> @@ -156,15 +182,10 @@ static void gen11_sseu_info_init(struct
>>> drm_i915_private *dev_priv)
>>>    	u8 eu_en;
>>>    	int s;
>>>    
>>> -	if (IS_ELKHARTLAKE(dev_priv)) {
>>> -		sseu->max_slices = 1;
>>> -		sseu->max_subslices = 4;
>>> -		sseu->max_eus_per_subslice = 8;
>>> -	} else {
>>> -		sseu->max_slices = 1;
>>> -		sseu->max_subslices = 8;
>>> -		sseu->max_eus_per_subslice = 8;
>>> -	}
>>> +	if (IS_ELKHARTLAKE(dev_priv))
>>> +		intel_sseu_set_info(sseu, 1, 4, 8);
>>> +	else
>>> +		intel_sseu_set_info(sseu, 1, 8, 8);
>>>    
>>>    	s_en = I915_READ(GEN11_GT_SLICE_ENABLE) & GEN11_GT_S_ENA_MASK;
>>>    	ss_en = ~I915_READ(GEN11_GT_SUBSLICE_DISABLE);
>>> @@ -177,9 +198,11 @@ static void gen11_sseu_info_init(struct
>>> drm_i915_private *dev_priv)
>>>    			int ss;
>>>    
>>>    			sseu->slice_mask |= BIT(s);
>>> -			sseu->subslice_mask[s] = (ss_en >> ss_idx) &
>>> ss_en_mask;
>>> +			sseu->subslice_mask[s * sseu->ss_stride] =
>>> +				(ss_en >> ss_idx) & ss_en_mask;
>> Shouldn't this just call intel_sseu_set_subslices() instead of doing
>> the
>> setting locally?
> Yes, let me fix this.
>
>>>    			for (ss = 0; ss < sseu->max_subslices; ss++) {
>>> -				if (sseu->subslice_mask[s] & BIT(ss))
>>> +				if (sseu->subslice_mask[s * sseu-
>>>> ss_stride] &
>>> +				    BIT(ss))
>> This culd use the intel_sseu_has_subslice() suggested earlier,
>> otherwise
>> it needs to consider ss_stride > 1
> Ok.
>
>>>    					intel_sseu_set_eus(sseu, s, ss,
>>> eu_en);
>>>    			}
>>>    		}
>>> @@ -201,23 +224,10 @@ static void gen10_sseu_info_init(struct
>>> drm_i915_private *dev_priv)
>>>    	const int eu_mask = 0xff;
>>>    	u32 subslice_mask, eu_en;
>>>    
>>> +	intel_sseu_set_info(sseu, 6, 4, 8);
>>> +
>>>    	sseu->slice_mask = (fuse2 & GEN10_F2_S_ENA_MASK) >>
>>>    			    GEN10_F2_S_ENA_SHIFT;
>>> -	sseu->max_slices = 6;
>>> -	sseu->max_subslices = 4;
>>> -	sseu->max_eus_per_subslice = 8;
>>> -
>>> -	subslice_mask = (1 << 4) - 1;
>>> -	subslice_mask &= ~((fuse2 & GEN10_F2_SS_DIS_MASK) >>
>>> -			   GEN10_F2_SS_DIS_SHIFT);
>>> -
>>> -	/*
>>> -	 * Slice0 can have up to 3 subslices, but there are only 2 in
>>> -	 * slice1/2.
>>> -	 */
>>> -	sseu->subslice_mask[0] = subslice_mask;
>>> -	for (s = 1; s < sseu->max_slices; s++)
>>> -		sseu->subslice_mask[s] = subslice_mask & 0x3;
>>>    
>>>    	/* Slice0 */
>>>    	eu_en = ~I915_READ(GEN8_EU_DISABLE0);
>>> @@ -242,14 +252,22 @@ static void gen10_sseu_info_init(struct
>>> drm_i915_private *dev_priv)
>>>    	eu_en = ~I915_READ(GEN10_EU_DISABLE3);
>>>    	intel_sseu_set_eus(sseu, 5, 1, eu_en & eu_mask);
>>>    
>>> -	/* Do a second pass where we mark the subslices disabled if all
>>> their
>>> -	 * eus are off.
>>> -	 */
>>> +	subslice_mask = (1 << 4) - 1;
>>> +	subslice_mask &= ~((fuse2 & GEN10_F2_SS_DIS_MASK) >>
>>> +			   GEN10_F2_SS_DIS_SHIFT);
>>> +
>>>    	for (s = 0; s < sseu->max_slices; s++) {
>>>    		for (ss = 0; ss < sseu->max_subslices; ss++) {
>>>    			if (intel_sseu_get_eus(sseu, s, ss) == 0)
>>> -				sseu->subslice_mask[s] &= ~BIT(ss);
>>> +				subslice_mask &= ~BIT(ss);
>>>    		}
>>> +
>>> +		/*
>>> +		 * Slice0 can have up to 3 subslices, but there are
>>> only 2 in
>>> +		 * slice1/2.
>>> +		 */
>>> +		intel_sseu_set_subslices(sseu, s, s == 0 ?
>>> subslice_mask :
>>> +							   subslice_mas
>>> k & 0x3);
>>>    	}
>>>    
>>>    	sseu->eu_total = compute_eu_total(sseu);
>>> @@ -275,13 +293,12 @@ static void cherryview_sseu_info_init(struct
>>> drm_i915_private *dev_priv)
>>>    {
>>>    	struct sseu_dev_info *sseu = &RUNTIME_INFO(dev_priv)->sseu;
>>>    	u32 fuse;
>>> +	u8 subslice_mask;
>>>    
>>>    	fuse = I915_READ(CHV_FUSE_GT);
>>>    
>>>    	sseu->slice_mask = BIT(0);
>>> -	sseu->max_slices = 1;
>>> -	sseu->max_subslices = 2;
>>> -	sseu->max_eus_per_subslice = 8;
>>> +	intel_sseu_set_info(sseu, 1, 2, 8);
>>>    
>>>    	if (!(fuse & CHV_FGT_DISABLE_SS0)) {
>>>    		u8 disabled_mask =
>>> @@ -290,7 +307,7 @@ static void cherryview_sseu_info_init(struct
>>> drm_i915_private *dev_priv)
>>>    			(((fuse & CHV_FGT_EU_DIS_SS0_R1_MASK) >>
>>>    			  CHV_FGT_EU_DIS_SS0_R1_SHIFT) << 4);
>>>    
>>> -		sseu->subslice_mask[0] |= BIT(0);
>>> +		subslice_mask |= BIT(0);
>>>    		intel_sseu_set_eus(sseu, 0, 0, ~disabled_mask);
>>>    	}
>>>    
>>> @@ -301,10 +318,12 @@ static void cherryview_sseu_info_init(struct
>>> drm_i915_private *dev_priv)
>>>    			(((fuse & CHV_FGT_EU_DIS_SS1_R1_MASK) >>
>>>    			  CHV_FGT_EU_DIS_SS1_R1_SHIFT) << 4);
>>>    
>>> -		sseu->subslice_mask[0] |= BIT(1);
>>> +		subslice_mask |= BIT(1);
>>>    		intel_sseu_set_eus(sseu, 0, 1, ~disabled_mask);
>>>    	}
>>>    
>>> +	intel_sseu_set_subslices(sseu, 0, subslice_mask);
>>> +
>>>    	sseu->eu_total = compute_eu_total(sseu);
>>>    
>>>    	/*
>>> @@ -312,7 +331,8 @@ static void cherryview_sseu_info_init(struct
>>> drm_i915_private *dev_priv)
>>>    	 * across subslices.
>>>    	*/
>>>    	sseu->eu_per_subslice = intel_sseu_subslice_total(sseu) ?
>>> -				sseu->eu_total /
>>> intel_sseu_subslice_total(sseu) :
>>> +				sseu->eu_total /
>>> +					intel_sseu_subslice_total(sseu)
>>> :
>>>    				0;
>>>    	/*
>>>    	 * CHV supports subslice power gating on devices with more than
>>> @@ -336,9 +356,8 @@ static void gen9_sseu_info_init(struct
>>> drm_i915_private *dev_priv)
>>>    	sseu->slice_mask = (fuse2 & GEN8_F2_S_ENA_MASK) >>
>>> GEN8_F2_S_ENA_SHIFT;
>>>    
>>>    	/* BXT has a single slice and at most 3 subslices. */
>>> -	sseu->max_slices = IS_GEN9_LP(dev_priv) ? 1 : 3;
>>> -	sseu->max_subslices = IS_GEN9_LP(dev_priv) ? 3 : 4;
>>> -	sseu->max_eus_per_subslice = 8;
>>> +	intel_sseu_set_info(sseu, IS_GEN9_LP(dev_priv) ? 1 : 3,
>>> +			    IS_GEN9_LP(dev_priv) ? 3 : 4, 8);
>>>    
>>>    	/*
>>>    	 * The subslice disable field is global, i.e. it applies
>>> @@ -357,14 +376,16 @@ static void gen9_sseu_info_init(struct
>>> drm_i915_private *dev_priv)
>>>    			/* skip disabled slice */
>>>    			continue;
>>>    
>>> -		sseu->subslice_mask[s] = subslice_mask;
>>> +		intel_sseu_set_subslices(sseu, s, subslice_mask);
>>>    
>>>    		eu_disable = I915_READ(GEN9_EU_DISABLE(s));
>>>    		for (ss = 0; ss < sseu->max_subslices; ss++) {
>>>    			int eu_per_ss;
>>>    			u8 eu_disabled_mask;
>>> +			u8 ss_idx = s * sseu->ss_stride + ss /
>>> BITS_PER_BYTE;
>>>    
>>> -			if (!(sseu->subslice_mask[s] & BIT(ss)))
>>> +			if (!(sseu->subslice_mask[ss_idx] &
>>> +			      BIT(ss % BITS_PER_BYTE)))
>>>    				/* skip disabled subslice */
>>>    				continue;
>>>    
>>> @@ -437,9 +458,7 @@ static void broadwell_sseu_info_init(struct
>>> drm_i915_private *dev_priv)
>>>    
>>>    	fuse2 = I915_READ(GEN8_FUSE2);
>>>    	sseu->slice_mask = (fuse2 & GEN8_F2_S_ENA_MASK) >>
>>> GEN8_F2_S_ENA_SHIFT;
>>> -	sseu->max_slices = 3;
>>> -	sseu->max_subslices = 3;
>>> -	sseu->max_eus_per_subslice = 8;
>>> +	intel_sseu_set_info(sseu, 3, 3, 8);
>>>    
>>>    	/*
>>>    	 * The subslice disable field is global, i.e. it applies
>>> @@ -466,18 +485,21 @@ static void broadwell_sseu_info_init(struct
>>> drm_i915_private *dev_priv)
>>>    			/* skip disabled slice */
>>>    			continue;
>>>    
>>> -		sseu->subslice_mask[s] = subslice_mask;
>>> +		intel_sseu_set_subslices(sseu, s, subslice_mask);
>>>    
>>>    		for (ss = 0; ss < sseu->max_subslices; ss++) {
>>>    			u8 eu_disabled_mask;
>>> +			u8 ss_idx = s * sseu->ss_stride + ss /
>>> BITS_PER_BYTE;
>>>    			u32 n_disabled;
>>>    
>>> -			if (!(sseu->subslice_mask[s] & BIT(ss)))
>>> +			if (!(sseu->subslice_mask[ss_idx] &
>>> +			      BIT(ss % BITS_PER_BYTE)))
>>>    				/* skip disabled subslice */
>>>    				continue;
>>>    
>>>    			eu_disabled_mask =
>>> -				eu_disable[s] >> (ss * sseu-
>>>> max_eus_per_subslice);
>>> +				eu_disable[s] >>
>>> +					(ss * sseu-
>>>> max_eus_per_subslice);
>>>    
>>>    			intel_sseu_set_eus(sseu, s, ss,
>>> ~eu_disabled_mask);
>>>    
>>> @@ -517,6 +539,7 @@ static void haswell_sseu_info_init(struct
>>> drm_i915_private *dev_priv)
>>>    	struct sseu_dev_info *sseu = &RUNTIME_INFO(dev_priv)->sseu;
>>>    	u32 fuse1;
>>>    	int s, ss;
>>> +	u32 subslice_mask;
>>>    
>>>    	/*
>>>    	 * There isn't a register to tell us how many slices/subslices.
>>> We
>>> @@ -528,22 +551,18 @@ static void haswell_sseu_info_init(struct
>>> drm_i915_private *dev_priv)
>>>    		/* fall through */
>>>    	case 1:
>>>    		sseu->slice_mask = BIT(0);
>>> -		sseu->subslice_mask[0] = BIT(0);
>>> +		subslice_mask = BIT(0);
>>>    		break;
>>>    	case 2:
>>>    		sseu->slice_mask = BIT(0);
>>> -		sseu->subslice_mask[0] = BIT(0) | BIT(1);
>>> +		subslice_mask = BIT(0) | BIT(1);
>>>    		break;
>>>    	case 3:
>>>    		sseu->slice_mask = BIT(0) | BIT(1);
>>> -		sseu->subslice_mask[0] = BIT(0) | BIT(1);
>>> -		sseu->subslice_mask[1] = BIT(0) | BIT(1);
>>> +		subslice_mask = BIT(0) | BIT(1);
>>>    		break;
>>>    	}
>>>    
>>> -	sseu->max_slices = hweight8(sseu->slice_mask);
>>> -	sseu->max_subslices = hweight8(sseu->subslice_mask[0]);
>>> -
>>>    	fuse1 = I915_READ(HSW_PAVP_FUSE1);
>>>    	switch ((fuse1 & HSW_F1_EU_DIS_MASK) >> HSW_F1_EU_DIS_SHIFT) {
>>>    	default:
>>> @@ -560,9 +579,14 @@ static void haswell_sseu_info_init(struct
>>> drm_i915_private *dev_priv)
>>>    		sseu->eu_per_subslice = 6;
>>>    		break;
>>>    	}
>>> -	sseu->max_eus_per_subslice = sseu->eu_per_subslice;
>>> +
>>> +	intel_sseu_set_info(sseu, hweight8(sseu->slice_mask),
>>> +			    hweight8(subslice_mask),
>>> +			    sseu->eu_per_subslice);
>> Personal preference: could use a local variable for eu_per_subslice
>> above to avoid setting it to itself here.
> Yeah this is a bit ugly... I'll change it.
>
> Thanks for the feedback!
> Stuart
>
>> Daniele
>>
>>>    
>>>    	for (s = 0; s < sseu->max_slices; s++) {
>>> +		intel_sseu_set_subslices(sseu, s, subslice_mask);
>>> +
>>>    		for (ss = 0; ss < sseu->max_subslices; ss++) {
>>>    			intel_sseu_set_eus(sseu, s, ss,
>>>    					   (1UL << sseu-
>>>> eu_per_subslice) - 1);


_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 6/6] drm/i915: Expand subslice mask
  2019-05-03  9:05       ` Lionel Landwerlin
@ 2019-05-03 14:28         ` Summers, Stuart
  0 siblings, 0 replies; 35+ messages in thread
From: Summers, Stuart @ 2019-05-03 14:28 UTC (permalink / raw)
  To: Ceraolo Spurio, Daniele, intel-gfx, Landwerlin, Lionel G


[-- Attachment #1.1: Type: text/plain, Size: 158 bytes --]

On Fri, 2019-05-03 at 10:05 +0100, Lionel Landwerlin wrote:
> Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>

Thanks for the Ack!

-Stuart

[-- Attachment #1.2: smime.p7s --]
[-- Type: application/x-pkcs7-signature, Size: 3270 bytes --]

[-- Attachment #2: Type: text/plain, Size: 159 bytes --]

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH 6/6] drm/i915: Expand subslice mask
  2019-04-30 23:06 [PATCH 0/6] Refactor to expand subslice mask Stuart Summers
@ 2019-04-30 23:06 ` Stuart Summers
  0 siblings, 0 replies; 35+ messages in thread
From: Stuart Summers @ 2019-04-30 23:06 UTC (permalink / raw)
  To: intel-gfx

Currently, the subslice_mask runtime parameter is stored as an
array of subslices per slice. Expand the subslice mask array to
better match what is presented to userspace through the
I915_QUERY_TOPOLOGY_INFO ioctl. The index into this array is
then calculated:
  slice * subslice stride + subslice index / 8

v2: fix spacing in set_sseu_info args
    use set_sseu_info to initialize sseu data when building
    device status in debugfs
    rename variables in intel_engine_types.h to avoid checkpatch
    warnings
v3: update headers in intel_sseu.h
v4: add const to some sseu_dev_info variables
    use sseu->eu_stride for EU stride calculations

Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Stuart Summers <stuart.summers@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_engine_cs.c    |   6 +-
 drivers/gpu/drm/i915/gt/intel_engine_types.h |  32 +++--
 drivers/gpu/drm/i915/gt/intel_hangcheck.c    |   3 +-
 drivers/gpu/drm/i915/gt/intel_sseu.c         |  49 +++++--
 drivers/gpu/drm/i915/gt/intel_sseu.h         |  16 ++-
 drivers/gpu/drm/i915/gt/intel_workarounds.c  |   2 +-
 drivers/gpu/drm/i915/i915_debugfs.c          |  44 +++---
 drivers/gpu/drm/i915/i915_drv.c              |   6 +-
 drivers/gpu/drm/i915/i915_gpu_error.c        |   5 +-
 drivers/gpu/drm/i915/i915_query.c            |  10 +-
 drivers/gpu/drm/i915/intel_device_info.c     | 142 +++++++++++--------
 11 files changed, 198 insertions(+), 117 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index f7308479d511..e438d366874f 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -908,7 +908,7 @@ u32 intel_calculate_mcr_s_ss_select(struct drm_i915_private *dev_priv)
 	const struct sseu_dev_info *sseu = &RUNTIME_INFO(dev_priv)->sseu;
 	u32 mcr_s_ss_select;
 	u32 slice = fls(sseu->slice_mask);
-	u32 subslice = fls(sseu->subslice_mask[slice]);
+	u32 subslice = fls(sseu->subslice_mask[slice * sseu->ss_stride]);
 
 	if (IS_GEN(dev_priv, 10))
 		mcr_s_ss_select = GEN8_MCR_SLICE(slice) |
@@ -984,6 +984,7 @@ void intel_engine_get_instdone(struct intel_engine_cs *engine,
 			       struct intel_instdone *instdone)
 {
 	struct drm_i915_private *dev_priv = engine->i915;
+	const struct sseu_dev_info *sseu = &RUNTIME_INFO(dev_priv)->sseu;
 	struct intel_uncore *uncore = engine->uncore;
 	u32 mmio_base = engine->mmio_base;
 	int slice;
@@ -1001,7 +1002,8 @@ void intel_engine_get_instdone(struct intel_engine_cs *engine,
 
 		instdone->slice_common =
 			intel_uncore_read(uncore, GEN7_SC_INSTDONE);
-		for_each_instdone_slice_subslice(dev_priv, slice, subslice) {
+		for_each_instdone_slice_subslice(dev_priv, sseu, slice,
+						 subslice) {
 			instdone->sampler[slice][subslice] =
 				read_subslice_reg(dev_priv, slice, subslice,
 						  GEN7_SAMPLER_INSTDONE);
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h
index d972c339309c..fa70528963a4 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
@@ -534,20 +534,22 @@ intel_engine_needs_breadcrumb_tasklet(const struct intel_engine_cs *engine)
 	return engine->flags & I915_ENGINE_NEEDS_BREADCRUMB_TASKLET;
 }
 
-#define instdone_slice_mask(dev_priv__) \
-	(IS_GEN(dev_priv__, 7) ? \
-	 1 : RUNTIME_INFO(dev_priv__)->sseu.slice_mask)
-
-#define instdone_subslice_mask(dev_priv__) \
-	(IS_GEN(dev_priv__, 7) ? \
-	 1 : RUNTIME_INFO(dev_priv__)->sseu.subslice_mask[0])
-
-#define for_each_instdone_slice_subslice(dev_priv__, slice__, subslice__) \
-	for ((slice__) = 0, (subslice__) = 0; \
-	     (slice__) < I915_MAX_SLICES; \
-	     (subslice__) = ((subslice__) + 1) < I915_MAX_SUBSLICES ? (subslice__) + 1 : 0, \
-	       (slice__) += ((subslice__) == 0)) \
-		for_each_if((BIT(slice__) & instdone_slice_mask(dev_priv__)) && \
-			    (BIT(subslice__) & instdone_subslice_mask(dev_priv__)))
+#define instdone_has_slice(dev_priv___, sseu___, slice___) \
+	((IS_GEN(dev_priv___, 7) ? \
+	  1 : (sseu___)->slice_mask) & \
+	BIT(slice___)) \
+
+#define instdone_has_subslice(dev_priv__, sseu__, slice__, subslice__) \
+	((IS_GEN(dev_priv__, 7) ? \
+	  1 : (sseu__)->subslice_mask[slice__ * (sseu__)->ss_stride + \
+				      subslice__ / BITS_PER_BYTE]) & \
+	 BIT(subslice__ % BITS_PER_BYTE)) \
+
+#define for_each_instdone_slice_subslice(dev_priv_, sseu_, slice_, subslice_) \
+	for ((slice_) = 0, (subslice_) = 0; (slice_) < I915_MAX_SLICES; \
+	     (subslice_) = ((subslice_) + 1) < I915_MAX_SUBSLICES ? (subslice_) + 1 : 0, \
+	       (slice_) += ((subslice_) == 0)) \
+		for_each_if(instdone_has_slice(dev_priv_, sseu_, slice) && \
+			    instdone_has_subslice(dev_priv_, sseu_, slice_, subslice_)) \
 
 #endif /* __INTEL_ENGINE_TYPES_H__ */
diff --git a/drivers/gpu/drm/i915/gt/intel_hangcheck.c b/drivers/gpu/drm/i915/gt/intel_hangcheck.c
index e5eaa06fe74d..53c1c98161e1 100644
--- a/drivers/gpu/drm/i915/gt/intel_hangcheck.c
+++ b/drivers/gpu/drm/i915/gt/intel_hangcheck.c
@@ -50,6 +50,7 @@ static bool instdone_unchanged(u32 current_instdone, u32 *old_instdone)
 static bool subunits_stuck(struct intel_engine_cs *engine)
 {
 	struct drm_i915_private *dev_priv = engine->i915;
+	const struct sseu_dev_info *sseu = &RUNTIME_INFO(dev_priv)->sseu;
 	struct intel_instdone instdone;
 	struct intel_instdone *accu_instdone = &engine->hangcheck.instdone;
 	bool stuck;
@@ -71,7 +72,7 @@ static bool subunits_stuck(struct intel_engine_cs *engine)
 	stuck &= instdone_unchanged(instdone.slice_common,
 				    &accu_instdone->slice_common);
 
-	for_each_instdone_slice_subslice(dev_priv, slice, subslice) {
+	for_each_instdone_slice_subslice(dev_priv, sseu, slice, subslice) {
 		stuck &= instdone_unchanged(instdone.sampler[slice][subslice],
 					    &accu_instdone->sampler[slice][subslice]);
 		stuck &= instdone_unchanged(instdone.row[slice][subslice],
diff --git a/drivers/gpu/drm/i915/gt/intel_sseu.c b/drivers/gpu/drm/i915/gt/intel_sseu.c
index 4a0b82fc108c..49316b7ef074 100644
--- a/drivers/gpu/drm/i915/gt/intel_sseu.c
+++ b/drivers/gpu/drm/i915/gt/intel_sseu.c
@@ -8,6 +8,17 @@
 #include "intel_lrc_reg.h"
 #include "intel_sseu.h"
 
+void intel_sseu_set_info(struct sseu_dev_info *sseu, u8 max_slices,
+			 u8 max_subslices, u8 max_eus_per_subslice)
+{
+	sseu->max_slices = max_slices;
+	sseu->max_subslices = max_subslices;
+	sseu->max_eus_per_subslice = max_eus_per_subslice;
+
+	sseu->ss_stride = GEN_SSEU_STRIDE(sseu->max_subslices);
+	sseu->eu_stride = GEN_SSEU_STRIDE(sseu->max_eus_per_subslice);
+}
+
 unsigned int
 intel_sseu_subslice_total(const struct sseu_dev_info *sseu)
 {
@@ -22,17 +33,39 @@ intel_sseu_subslice_total(const struct sseu_dev_info *sseu)
 unsigned int
 intel_sseu_subslices_per_slice(const struct sseu_dev_info *sseu, u8 slice)
 {
-	return hweight8(sseu->subslice_mask[slice]);
+	unsigned int i, total = 0;
+
+	for (i = 0; i < sseu->ss_stride; i++)
+		total += hweight8(sseu->subslice_mask[slice * sseu->ss_stride +
+						      i]);
+
+	return total;
+}
+
+void intel_sseu_copy_subslices(const struct sseu_dev_info *sseu, int slice,
+			       u8 *to_mask, const u8 *from_mask)
+{
+	int offset = slice * sseu->ss_stride;
+
+	memcpy(&to_mask[offset], &from_mask[offset], sseu->ss_stride);
+}
+
+void intel_sseu_set_subslices(struct sseu_dev_info *sseu, int slice,
+			      u32 ss_mask)
+{
+	int i, offset = slice * sseu->ss_stride;
+
+	for (i = 0; i < sseu->ss_stride; i++)
+		sseu->subslice_mask[offset + i] =
+			(ss_mask >> (BITS_PER_BYTE * i)) & 0xff;
 }
 
 static int intel_sseu_eu_idx(const struct sseu_dev_info *sseu, int slice,
 			     int subslice)
 {
-	int subslice_stride = DIV_ROUND_UP(sseu->max_eus_per_subslice,
-					   BITS_PER_BYTE);
-	int slice_stride = sseu->max_subslices * subslice_stride;
+	int slice_stride = sseu->max_subslices * sseu->eu_stride;
 
-	return slice * slice_stride + subslice * subslice_stride;
+	return slice * slice_stride + subslice * sseu->eu_stride;
 }
 
 u16 intel_sseu_get_eus(const struct sseu_dev_info *sseu, int slice,
@@ -41,8 +74,7 @@ u16 intel_sseu_get_eus(const struct sseu_dev_info *sseu, int slice,
 	int i, offset = intel_sseu_eu_idx(sseu, slice, subslice);
 	u16 eu_mask = 0;
 
-	for (i = 0;
-	     i < DIV_ROUND_UP(sseu->max_eus_per_subslice, BITS_PER_BYTE); i++) {
+	for (i = 0; i < sseu->eu_stride; i++) {
 		eu_mask |= ((u16)sseu->eu_mask[offset + i]) <<
 			(i * BITS_PER_BYTE);
 	}
@@ -55,8 +87,7 @@ void intel_sseu_set_eus(struct sseu_dev_info *sseu, int slice, int subslice,
 {
 	int i, offset = intel_sseu_eu_idx(sseu, slice, subslice);
 
-	for (i = 0;
-	     i < DIV_ROUND_UP(sseu->max_eus_per_subslice, BITS_PER_BYTE); i++) {
+	for (i = 0; i < sseu->eu_stride; i++) {
 		sseu->eu_mask[offset + i] =
 			(eu_mask >> (BITS_PER_BYTE * i)) & 0xff;
 	}
diff --git a/drivers/gpu/drm/i915/gt/intel_sseu.h b/drivers/gpu/drm/i915/gt/intel_sseu.h
index 56e3721ae83f..bf01f338a8cc 100644
--- a/drivers/gpu/drm/i915/gt/intel_sseu.h
+++ b/drivers/gpu/drm/i915/gt/intel_sseu.h
@@ -9,16 +9,18 @@
 
 #include <linux/types.h>
 #include <linux/kernel.h>
+#include <linux/string.h>
 
 struct drm_i915_private;
 
 #define GEN_MAX_SLICES		(6) /* CNL upper bound */
 #define GEN_MAX_SUBSLICES	(8) /* ICL upper bound */
 #define GEN_SSEU_STRIDE(bits) DIV_ROUND_UP(bits, BITS_PER_BYTE)
+#define GEN_MAX_SUBSLICE_STRIDE GEN_SSEU_STRIDE(GEN_MAX_SUBSLICES)
 
 struct sseu_dev_info {
 	u8 slice_mask;
-	u8 subslice_mask[GEN_MAX_SLICES];
+	u8 subslice_mask[GEN_MAX_SLICES * GEN_MAX_SUBSLICE_STRIDE];
 	u16 eu_total;
 	u8 eu_per_subslice;
 	u8 min_eu_in_pool;
@@ -33,6 +35,9 @@ struct sseu_dev_info {
 	u8 max_subslices;
 	u8 max_eus_per_subslice;
 
+	u8 ss_stride;
+	u8 eu_stride;
+
 	/* We don't have more than 8 eus per subslice at the moment and as we
 	 * store eus enabled using bits, no need to multiply by eus per
 	 * subslice.
@@ -63,12 +68,21 @@ intel_sseu_from_device_info(const struct sseu_dev_info *sseu)
 	return value;
 }
 
+void intel_sseu_set_info(struct sseu_dev_info *sseu, u8 max_slices,
+			 u8 max_subslices, u8 max_eus_per_subslice);
+
 unsigned int
 intel_sseu_subslice_total(const struct sseu_dev_info *sseu);
 
 unsigned int
 intel_sseu_subslices_per_slice(const struct sseu_dev_info *sseu, u8 slice);
 
+void intel_sseu_copy_subslices(const struct sseu_dev_info *sseu, int slice,
+			       u8 *to_mask, const u8 *from_mask);
+
+void intel_sseu_set_subslices(struct sseu_dev_info *sseu, int slice,
+			      u32 ss_mask);
+
 u16 intel_sseu_get_eus(const struct sseu_dev_info *sseu, int slice,
 		       int subslice);
 
diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c
index 5751446a4b0b..51df88873ff5 100644
--- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
@@ -771,7 +771,7 @@ wa_init_mcr(struct drm_i915_private *i915, struct i915_wa_list *wal)
 		u32 slice = fls(sseu->slice_mask);
 		u32 fuse3 =
 			intel_uncore_read(&i915->uncore, GEN10_MIRROR_FUSE3);
-		u8 ss_mask = sseu->subslice_mask[slice];
+		u8 ss_mask = sseu->subslice_mask[slice * sseu->ss_stride];
 
 		u8 enabled_mask = (ss_mask | ss_mask >>
 				   GEN10_L3BANK_PAIR_COUNT) & GEN10_L3BANK_MASK;
diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 20ac782c50c8..a45299cd6989 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -1256,6 +1256,7 @@ static void i915_instdone_info(struct drm_i915_private *dev_priv,
 			       struct seq_file *m,
 			       struct intel_instdone *instdone)
 {
+	struct sseu_dev_info *sseu = &RUNTIME_INFO(dev_priv)->sseu;
 	int slice;
 	int subslice;
 
@@ -1271,11 +1272,11 @@ static void i915_instdone_info(struct drm_i915_private *dev_priv,
 	if (INTEL_GEN(dev_priv) <= 6)
 		return;
 
-	for_each_instdone_slice_subslice(dev_priv, slice, subslice)
+	for_each_instdone_slice_subslice(dev_priv, sseu, slice, subslice)
 		seq_printf(m, "\t\tSAMPLER_INSTDONE[%d][%d]: 0x%08x\n",
 			   slice, subslice, instdone->sampler[slice][subslice]);
 
-	for_each_instdone_slice_subslice(dev_priv, slice, subslice)
+	for_each_instdone_slice_subslice(dev_priv, sseu, slice, subslice)
 		seq_printf(m, "\t\tROW_INSTDONE[%d][%d]: 0x%08x\n",
 			   slice, subslice, instdone->row[slice][subslice]);
 }
@@ -4065,7 +4066,9 @@ static void gen10_sseu_device_status(struct drm_i915_private *dev_priv,
 			continue;
 
 		sseu->slice_mask |= BIT(s);
-		sseu->subslice_mask[s] = info->sseu.subslice_mask[s];
+		intel_sseu_copy_subslices(&info->sseu, s,
+					  sseu->subslice_mask,
+					  info->sseu.subslice_mask);
 
 		for (ss = 0; ss < info->sseu.max_subslices; ss++) {
 			unsigned int eu_cnt;
@@ -4116,18 +4119,22 @@ static void gen9_sseu_device_status(struct drm_i915_private *dev_priv,
 		sseu->slice_mask |= BIT(s);
 
 		if (IS_GEN9_BC(dev_priv))
-			sseu->subslice_mask[s] =
-				RUNTIME_INFO(dev_priv)->sseu.subslice_mask[s];
+			intel_sseu_copy_subslices(&info->sseu, s,
+						  sseu->subslice_mask,
+						  info->sseu.subslice_mask);
 
 		for (ss = 0; ss < info->sseu.max_subslices; ss++) {
 			unsigned int eu_cnt;
+			u8 ss_idx = s * info->sseu.ss_stride +
+				    ss / BITS_PER_BYTE;
 
 			if (IS_GEN9_LP(dev_priv)) {
 				if (!(s_reg[s] & (GEN9_PGCTL_SS_ACK(ss))))
 					/* skip disabled subslice */
 					continue;
 
-				sseu->subslice_mask[s] |= BIT(ss);
+				sseu->subslice_mask[ss_idx] |=
+					BIT(ss % BITS_PER_BYTE);
 			}
 
 			eu_cnt = 2 * hweight32(eu_reg[2*s + ss/2] &
@@ -4144,25 +4151,24 @@ static void gen9_sseu_device_status(struct drm_i915_private *dev_priv,
 static void broadwell_sseu_device_status(struct drm_i915_private *dev_priv,
 					 struct sseu_dev_info *sseu)
 {
+	struct intel_runtime_info *info = RUNTIME_INFO(dev_priv);
 	u32 slice_info = I915_READ(GEN8_GT_SLICE_INFO);
 	int s;
 
 	sseu->slice_mask = slice_info & GEN8_LSLICESTAT_MASK;
 
 	if (sseu->slice_mask) {
-		sseu->eu_per_subslice =
-			RUNTIME_INFO(dev_priv)->sseu.eu_per_subslice;
-		for (s = 0; s < fls(sseu->slice_mask); s++) {
-			sseu->subslice_mask[s] =
-				RUNTIME_INFO(dev_priv)->sseu.subslice_mask[s];
-		}
+		sseu->eu_per_subslice = info->sseu.eu_per_subslice;
+		for (s = 0; s < fls(sseu->slice_mask); s++)
+			intel_sseu_copy_subslices(&info->sseu, s,
+						  sseu->subslice_mask,
+						  info->sseu.subslice_mask);
 		sseu->eu_total = sseu->eu_per_subslice *
 				 intel_sseu_subslice_total(sseu);
 
 		/* subtract fused off EU(s) from enabled slice(s) */
 		for (s = 0; s < fls(sseu->slice_mask); s++) {
-			u8 subslice_7eu =
-				RUNTIME_INFO(dev_priv)->sseu.subslice_7eu[s];
+			u8 subslice_7eu = info->sseu.subslice_7eu[s];
 
 			sseu->eu_total -= hweight8(subslice_7eu);
 		}
@@ -4209,6 +4215,7 @@ static void i915_print_sseu_info(struct seq_file *m, bool is_available_info,
 static int i915_sseu_status(struct seq_file *m, void *unused)
 {
 	struct drm_i915_private *dev_priv = node_to_i915(m->private);
+	const struct intel_runtime_info *info = RUNTIME_INFO(dev_priv);
 	struct sseu_dev_info sseu;
 	intel_wakeref_t wakeref;
 
@@ -4216,14 +4223,13 @@ static int i915_sseu_status(struct seq_file *m, void *unused)
 		return -ENODEV;
 
 	seq_puts(m, "SSEU Device Info\n");
-	i915_print_sseu_info(m, true, &RUNTIME_INFO(dev_priv)->sseu);
+	i915_print_sseu_info(m, true, &info->sseu);
 
 	seq_puts(m, "SSEU Device Status\n");
 	memset(&sseu, 0, sizeof(sseu));
-	sseu.max_slices = RUNTIME_INFO(dev_priv)->sseu.max_slices;
-	sseu.max_subslices = RUNTIME_INFO(dev_priv)->sseu.max_subslices;
-	sseu.max_eus_per_subslice =
-		RUNTIME_INFO(dev_priv)->sseu.max_eus_per_subslice;
+	intel_sseu_set_info(&sseu, info->sseu.max_slices,
+			    info->sseu.max_subslices,
+			    info->sseu.max_eus_per_subslice);
 
 	with_intel_runtime_pm(dev_priv, wakeref) {
 		if (IS_CHERRYVIEW(dev_priv))
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index c7ab0f724021..820b9216cb13 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -323,7 +323,7 @@ static int i915_getparam_ioctl(struct drm_device *dev, void *data,
 	struct pci_dev *pdev = dev_priv->drm.pdev;
 	struct sseu_dev_info *sseu = &RUNTIME_INFO(dev_priv)->sseu;
 	drm_i915_getparam_t *param = data;
-	int value;
+	int value = 0;
 
 	switch (param->param) {
 	case I915_PARAM_IRQ_ACTIVE:
@@ -452,7 +452,9 @@ static int i915_getparam_ioctl(struct drm_device *dev, void *data,
 			return -ENODEV;
 		break;
 	case I915_PARAM_SUBSLICE_MASK:
-		value = sseu->subslice_mask[0];
+		/* Only copy bits from the first subslice */
+		memcpy(&value, sseu->subslice_mask,
+		       min(sseu->ss_stride, (u8)sizeof(value)));
 		if (!value)
 			return -ENODEV;
 		break;
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index f51ff683dd2e..9da4118ad43a 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -405,6 +405,7 @@ static void print_error_buffers(struct drm_i915_error_state_buf *m,
 static void error_print_instdone(struct drm_i915_error_state_buf *m,
 				 const struct drm_i915_error_engine *ee)
 {
+	struct sseu_dev_info *sseu = &RUNTIME_INFO(m->i915)->sseu;
 	int slice;
 	int subslice;
 
@@ -420,12 +421,12 @@ static void error_print_instdone(struct drm_i915_error_state_buf *m,
 	if (INTEL_GEN(m->i915) <= 6)
 		return;
 
-	for_each_instdone_slice_subslice(m->i915, slice, subslice)
+	for_each_instdone_slice_subslice(m->i915, sseu, slice, subslice)
 		err_printf(m, "  SAMPLER_INSTDONE[%d][%d]: 0x%08x\n",
 			   slice, subslice,
 			   ee->instdone.sampler[slice][subslice]);
 
-	for_each_instdone_slice_subslice(m->i915, slice, subslice)
+	for_each_instdone_slice_subslice(m->i915, sseu, slice, subslice)
 		err_printf(m, "  ROW_INSTDONE[%d][%d]: 0x%08x\n",
 			   slice, subslice,
 			   ee->instdone.row[slice][subslice]);
diff --git a/drivers/gpu/drm/i915/i915_query.c b/drivers/gpu/drm/i915/i915_query.c
index 7c1708c22811..000dcb145ce0 100644
--- a/drivers/gpu/drm/i915/i915_query.c
+++ b/drivers/gpu/drm/i915/i915_query.c
@@ -37,8 +37,6 @@ static int query_topology_info(struct drm_i915_private *dev_priv,
 	const struct sseu_dev_info *sseu = &RUNTIME_INFO(dev_priv)->sseu;
 	struct drm_i915_query_topology_info topo;
 	u32 slice_length, subslice_length, eu_length, total_length;
-	u8 subslice_stride = GEN_SSEU_STRIDE(sseu->max_subslices);
-	u8 eu_stride = GEN_SSEU_STRIDE(sseu->max_eus_per_subslice);
 	int ret;
 
 	if (query_item->flags != 0)
@@ -50,8 +48,8 @@ static int query_topology_info(struct drm_i915_private *dev_priv,
 	BUILD_BUG_ON(sizeof(u8) != sizeof(sseu->slice_mask));
 
 	slice_length = sizeof(sseu->slice_mask);
-	subslice_length = sseu->max_slices * subslice_stride;
-	eu_length = sseu->max_slices * sseu->max_subslices * eu_stride;
+	subslice_length = sseu->max_slices * sseu->ss_stride;
+	eu_length = sseu->max_slices * sseu->max_subslices * sseu->eu_stride;
 	total_length = sizeof(topo) + slice_length + subslice_length +
 		       eu_length;
 
@@ -69,9 +67,9 @@ static int query_topology_info(struct drm_i915_private *dev_priv,
 	topo.max_eus_per_subslice = sseu->max_eus_per_subslice;
 
 	topo.subslice_offset = slice_length;
-	topo.subslice_stride = subslice_stride;
+	topo.subslice_stride = sseu->ss_stride;
 	topo.eu_offset = slice_length + subslice_length;
-	topo.eu_stride = eu_stride;
+	topo.eu_stride = sseu->eu_stride;
 
 	if (__copy_to_user(u64_to_user_ptr(query_item->data_ptr),
 			   &topo, sizeof(topo)))
diff --git a/drivers/gpu/drm/i915/intel_device_info.c b/drivers/gpu/drm/i915/intel_device_info.c
index e1dbccf04cd9..bbbc0a8c2183 100644
--- a/drivers/gpu/drm/i915/intel_device_info.c
+++ b/drivers/gpu/drm/i915/intel_device_info.c
@@ -84,17 +84,42 @@ void intel_device_info_dump_flags(const struct intel_device_info *info,
 #undef PRINT_FLAG
 }
 
+#define SS_STR_MAX_SIZE (GEN_MAX_SUBSLICE_STRIDE * 2)
+
+static u8 *
+subslice_per_slice_str(u8 *buf, const struct sseu_dev_info *sseu, u8 slice)
+{
+	int i;
+	u8 ss_offset = slice * sseu->ss_stride;
+
+	GEM_BUG_ON(slice >= sseu->max_slices);
+
+	memset(buf, 0, SS_STR_MAX_SIZE);
+
+	/*
+	 * Print subslice information in reverse order to match
+	 * userspace expectations.
+	 */
+	for (i = 0; i < sseu->ss_stride; i++)
+		sprintf(&buf[i * 2], "%02x",
+			sseu->subslice_mask[ss_offset + sseu->ss_stride -
+					    (i + 1)]);
+
+	return buf;
+}
+
 static void sseu_dump(const struct sseu_dev_info *sseu, struct drm_printer *p)
 {
 	int s;
+	u8 buf[SS_STR_MAX_SIZE];
 
 	drm_printf(p, "slice total: %u, mask=%04x\n",
 		   hweight8(sseu->slice_mask), sseu->slice_mask);
 	drm_printf(p, "subslice total: %u\n", intel_sseu_subslice_total(sseu));
 	for (s = 0; s < sseu->max_slices; s++) {
-		drm_printf(p, "slice%d: %u subslices, mask=%04x\n",
+		drm_printf(p, "slice%d: %u subslices, mask=%s\n",
 			   s, intel_sseu_subslices_per_slice(sseu, s),
-			   sseu->subslice_mask[s]);
+			   subslice_per_slice_str(buf, sseu, s));
 	}
 	drm_printf(p, "EU total: %u\n", sseu->eu_total);
 	drm_printf(p, "EU per subslice: %u\n", sseu->eu_per_subslice);
@@ -118,6 +143,7 @@ void intel_device_info_dump_topology(const struct sseu_dev_info *sseu,
 				     struct drm_printer *p)
 {
 	int s, ss;
+	u8 buf[SS_STR_MAX_SIZE];
 
 	if (sseu->max_slices == 0) {
 		drm_printf(p, "Unavailable\n");
@@ -125,9 +151,9 @@ void intel_device_info_dump_topology(const struct sseu_dev_info *sseu,
 	}
 
 	for (s = 0; s < sseu->max_slices; s++) {
-		drm_printf(p, "slice%d: %u subslice(s) (0x%hhx):\n",
+		drm_printf(p, "slice%d: %u subslice(s) (0x%s):\n",
 			   s, intel_sseu_subslices_per_slice(sseu, s),
-			   sseu->subslice_mask[s]);
+			   subslice_per_slice_str(buf, sseu, s));
 
 		for (ss = 0; ss < sseu->max_subslices; ss++) {
 			u16 enabled_eus = intel_sseu_get_eus(sseu, s, ss);
@@ -156,15 +182,10 @@ static void gen11_sseu_info_init(struct drm_i915_private *dev_priv)
 	u8 eu_en;
 	int s;
 
-	if (IS_ELKHARTLAKE(dev_priv)) {
-		sseu->max_slices = 1;
-		sseu->max_subslices = 4;
-		sseu->max_eus_per_subslice = 8;
-	} else {
-		sseu->max_slices = 1;
-		sseu->max_subslices = 8;
-		sseu->max_eus_per_subslice = 8;
-	}
+	if (IS_ELKHARTLAKE(dev_priv))
+		intel_sseu_set_info(sseu, 1, 4, 8);
+	else
+		intel_sseu_set_info(sseu, 1, 8, 8);
 
 	s_en = I915_READ(GEN11_GT_SLICE_ENABLE) & GEN11_GT_S_ENA_MASK;
 	ss_en = ~I915_READ(GEN11_GT_SUBSLICE_DISABLE);
@@ -177,9 +198,11 @@ static void gen11_sseu_info_init(struct drm_i915_private *dev_priv)
 			int ss;
 
 			sseu->slice_mask |= BIT(s);
-			sseu->subslice_mask[s] = (ss_en >> ss_idx) & ss_en_mask;
+			sseu->subslice_mask[s * sseu->ss_stride] =
+				(ss_en >> ss_idx) & ss_en_mask;
 			for (ss = 0; ss < sseu->max_subslices; ss++) {
-				if (sseu->subslice_mask[s] & BIT(ss))
+				if (sseu->subslice_mask[s * sseu->ss_stride] &
+				    BIT(ss))
 					intel_sseu_set_eus(sseu, s, ss, eu_en);
 			}
 		}
@@ -201,23 +224,10 @@ static void gen10_sseu_info_init(struct drm_i915_private *dev_priv)
 	const int eu_mask = 0xff;
 	u32 subslice_mask, eu_en;
 
+	intel_sseu_set_info(sseu, 6, 4, 8);
+
 	sseu->slice_mask = (fuse2 & GEN10_F2_S_ENA_MASK) >>
 			    GEN10_F2_S_ENA_SHIFT;
-	sseu->max_slices = 6;
-	sseu->max_subslices = 4;
-	sseu->max_eus_per_subslice = 8;
-
-	subslice_mask = (1 << 4) - 1;
-	subslice_mask &= ~((fuse2 & GEN10_F2_SS_DIS_MASK) >>
-			   GEN10_F2_SS_DIS_SHIFT);
-
-	/*
-	 * Slice0 can have up to 3 subslices, but there are only 2 in
-	 * slice1/2.
-	 */
-	sseu->subslice_mask[0] = subslice_mask;
-	for (s = 1; s < sseu->max_slices; s++)
-		sseu->subslice_mask[s] = subslice_mask & 0x3;
 
 	/* Slice0 */
 	eu_en = ~I915_READ(GEN8_EU_DISABLE0);
@@ -242,14 +252,22 @@ static void gen10_sseu_info_init(struct drm_i915_private *dev_priv)
 	eu_en = ~I915_READ(GEN10_EU_DISABLE3);
 	intel_sseu_set_eus(sseu, 5, 1, eu_en & eu_mask);
 
-	/* Do a second pass where we mark the subslices disabled if all their
-	 * eus are off.
-	 */
+	subslice_mask = (1 << 4) - 1;
+	subslice_mask &= ~((fuse2 & GEN10_F2_SS_DIS_MASK) >>
+			   GEN10_F2_SS_DIS_SHIFT);
+
 	for (s = 0; s < sseu->max_slices; s++) {
 		for (ss = 0; ss < sseu->max_subslices; ss++) {
 			if (intel_sseu_get_eus(sseu, s, ss) == 0)
-				sseu->subslice_mask[s] &= ~BIT(ss);
+				subslice_mask &= ~BIT(ss);
 		}
+
+		/*
+		 * Slice0 can have up to 3 subslices, but there are only 2 in
+		 * slice1/2.
+		 */
+		intel_sseu_set_subslices(sseu, s, s == 0 ? subslice_mask :
+							   subslice_mask & 0x3);
 	}
 
 	sseu->eu_total = compute_eu_total(sseu);
@@ -275,13 +293,12 @@ static void cherryview_sseu_info_init(struct drm_i915_private *dev_priv)
 {
 	struct sseu_dev_info *sseu = &RUNTIME_INFO(dev_priv)->sseu;
 	u32 fuse;
+	u8 subslice_mask;
 
 	fuse = I915_READ(CHV_FUSE_GT);
 
 	sseu->slice_mask = BIT(0);
-	sseu->max_slices = 1;
-	sseu->max_subslices = 2;
-	sseu->max_eus_per_subslice = 8;
+	intel_sseu_set_info(sseu, 1, 2, 8);
 
 	if (!(fuse & CHV_FGT_DISABLE_SS0)) {
 		u8 disabled_mask =
@@ -290,7 +307,7 @@ static void cherryview_sseu_info_init(struct drm_i915_private *dev_priv)
 			(((fuse & CHV_FGT_EU_DIS_SS0_R1_MASK) >>
 			  CHV_FGT_EU_DIS_SS0_R1_SHIFT) << 4);
 
-		sseu->subslice_mask[0] |= BIT(0);
+		subslice_mask |= BIT(0);
 		intel_sseu_set_eus(sseu, 0, 0, ~disabled_mask);
 	}
 
@@ -301,10 +318,12 @@ static void cherryview_sseu_info_init(struct drm_i915_private *dev_priv)
 			(((fuse & CHV_FGT_EU_DIS_SS1_R1_MASK) >>
 			  CHV_FGT_EU_DIS_SS1_R1_SHIFT) << 4);
 
-		sseu->subslice_mask[0] |= BIT(1);
+		subslice_mask |= BIT(1);
 		intel_sseu_set_eus(sseu, 0, 1, ~disabled_mask);
 	}
 
+	intel_sseu_set_subslices(sseu, 0, subslice_mask);
+
 	sseu->eu_total = compute_eu_total(sseu);
 
 	/*
@@ -312,7 +331,8 @@ static void cherryview_sseu_info_init(struct drm_i915_private *dev_priv)
 	 * across subslices.
 	*/
 	sseu->eu_per_subslice = intel_sseu_subslice_total(sseu) ?
-				sseu->eu_total / intel_sseu_subslice_total(sseu) :
+				sseu->eu_total /
+					intel_sseu_subslice_total(sseu) :
 				0;
 	/*
 	 * CHV supports subslice power gating on devices with more than
@@ -336,9 +356,8 @@ static void gen9_sseu_info_init(struct drm_i915_private *dev_priv)
 	sseu->slice_mask = (fuse2 & GEN8_F2_S_ENA_MASK) >> GEN8_F2_S_ENA_SHIFT;
 
 	/* BXT has a single slice and at most 3 subslices. */
-	sseu->max_slices = IS_GEN9_LP(dev_priv) ? 1 : 3;
-	sseu->max_subslices = IS_GEN9_LP(dev_priv) ? 3 : 4;
-	sseu->max_eus_per_subslice = 8;
+	intel_sseu_set_info(sseu, IS_GEN9_LP(dev_priv) ? 1 : 3,
+			    IS_GEN9_LP(dev_priv) ? 3 : 4, 8);
 
 	/*
 	 * The subslice disable field is global, i.e. it applies
@@ -357,14 +376,16 @@ static void gen9_sseu_info_init(struct drm_i915_private *dev_priv)
 			/* skip disabled slice */
 			continue;
 
-		sseu->subslice_mask[s] = subslice_mask;
+		intel_sseu_set_subslices(sseu, s, subslice_mask);
 
 		eu_disable = I915_READ(GEN9_EU_DISABLE(s));
 		for (ss = 0; ss < sseu->max_subslices; ss++) {
 			int eu_per_ss;
 			u8 eu_disabled_mask;
+			u8 ss_idx = s * sseu->ss_stride + ss / BITS_PER_BYTE;
 
-			if (!(sseu->subslice_mask[s] & BIT(ss)))
+			if (!(sseu->subslice_mask[ss_idx] &
+			      BIT(ss % BITS_PER_BYTE)))
 				/* skip disabled subslice */
 				continue;
 
@@ -437,9 +458,7 @@ static void broadwell_sseu_info_init(struct drm_i915_private *dev_priv)
 
 	fuse2 = I915_READ(GEN8_FUSE2);
 	sseu->slice_mask = (fuse2 & GEN8_F2_S_ENA_MASK) >> GEN8_F2_S_ENA_SHIFT;
-	sseu->max_slices = 3;
-	sseu->max_subslices = 3;
-	sseu->max_eus_per_subslice = 8;
+	intel_sseu_set_info(sseu, 3, 3, 8);
 
 	/*
 	 * The subslice disable field is global, i.e. it applies
@@ -466,18 +485,21 @@ static void broadwell_sseu_info_init(struct drm_i915_private *dev_priv)
 			/* skip disabled slice */
 			continue;
 
-		sseu->subslice_mask[s] = subslice_mask;
+		intel_sseu_set_subslices(sseu, s, subslice_mask);
 
 		for (ss = 0; ss < sseu->max_subslices; ss++) {
 			u8 eu_disabled_mask;
+			u8 ss_idx = s * sseu->ss_stride + ss / BITS_PER_BYTE;
 			u32 n_disabled;
 
-			if (!(sseu->subslice_mask[s] & BIT(ss)))
+			if (!(sseu->subslice_mask[ss_idx] &
+			      BIT(ss % BITS_PER_BYTE)))
 				/* skip disabled subslice */
 				continue;
 
 			eu_disabled_mask =
-				eu_disable[s] >> (ss * sseu->max_eus_per_subslice);
+				eu_disable[s] >>
+					(ss * sseu->max_eus_per_subslice);
 
 			intel_sseu_set_eus(sseu, s, ss, ~eu_disabled_mask);
 
@@ -517,6 +539,7 @@ static void haswell_sseu_info_init(struct drm_i915_private *dev_priv)
 	struct sseu_dev_info *sseu = &RUNTIME_INFO(dev_priv)->sseu;
 	u32 fuse1;
 	int s, ss;
+	u32 subslice_mask;
 
 	/*
 	 * There isn't a register to tell us how many slices/subslices. We
@@ -528,22 +551,18 @@ static void haswell_sseu_info_init(struct drm_i915_private *dev_priv)
 		/* fall through */
 	case 1:
 		sseu->slice_mask = BIT(0);
-		sseu->subslice_mask[0] = BIT(0);
+		subslice_mask = BIT(0);
 		break;
 	case 2:
 		sseu->slice_mask = BIT(0);
-		sseu->subslice_mask[0] = BIT(0) | BIT(1);
+		subslice_mask = BIT(0) | BIT(1);
 		break;
 	case 3:
 		sseu->slice_mask = BIT(0) | BIT(1);
-		sseu->subslice_mask[0] = BIT(0) | BIT(1);
-		sseu->subslice_mask[1] = BIT(0) | BIT(1);
+		subslice_mask = BIT(0) | BIT(1);
 		break;
 	}
 
-	sseu->max_slices = hweight8(sseu->slice_mask);
-	sseu->max_subslices = hweight8(sseu->subslice_mask[0]);
-
 	fuse1 = I915_READ(HSW_PAVP_FUSE1);
 	switch ((fuse1 & HSW_F1_EU_DIS_MASK) >> HSW_F1_EU_DIS_SHIFT) {
 	default:
@@ -560,9 +579,14 @@ static void haswell_sseu_info_init(struct drm_i915_private *dev_priv)
 		sseu->eu_per_subslice = 6;
 		break;
 	}
-	sseu->max_eus_per_subslice = sseu->eu_per_subslice;
+
+	intel_sseu_set_info(sseu, hweight8(sseu->slice_mask),
+			    hweight8(subslice_mask),
+			    sseu->eu_per_subslice);
 
 	for (s = 0; s < sseu->max_slices; s++) {
+		intel_sseu_set_subslices(sseu, s, subslice_mask);
+
 		for (ss = 0; ss < sseu->max_subslices; ss++) {
 			intel_sseu_set_eus(sseu, s, ss,
 					   (1UL << sseu->eu_per_subslice) - 1);
-- 
2.21.0.5.gaeb582a983

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 35+ messages in thread

end of thread, other threads:[~2019-05-03 14:28 UTC | newest]

Thread overview: 35+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-05-01 15:34 [PATCH 0/6] Refactor to expand subslice mask Stuart Summers
2019-05-01 15:34 ` [PATCH 1/6] drm/i915: Use local variable for SSEU info in GETPARAM ioctl Stuart Summers
2019-05-01 17:54   ` Daniele Ceraolo Spurio
2019-05-01 19:38     ` Summers, Stuart
2019-05-01 15:34 ` [PATCH 2/6] drm/i915: Add macro for SSEU stride calculation Stuart Summers
2019-05-01 18:11   ` Daniele Ceraolo Spurio
2019-05-01 19:37     ` Summers, Stuart
2019-05-01 15:34 ` [PATCH 3/6] drm/i915: Move calculation of subslices per slice to new function Stuart Summers
2019-05-01 18:14   ` Daniele Ceraolo Spurio
2019-05-01 19:37     ` Summers, Stuart
2019-05-01 15:34 ` [PATCH 4/6] drm/i915: Move sseu helper functions to intel_sseu.h Stuart Summers
2019-05-01 18:48   ` Daniele Ceraolo Spurio
2019-05-01 19:36     ` Summers, Stuart
2019-05-01 15:34 ` [PATCH 5/6] drm/i915: Remove inline from sseu helper functions Stuart Summers
2019-05-01 20:04   ` Daniele Ceraolo Spurio
2019-05-01 21:04     ` Summers, Stuart
2019-05-01 21:19       ` Daniele Ceraolo Spurio
2019-05-01 21:28         ` Summers, Stuart
2019-05-02  7:15           ` Jani Nikula
2019-05-02 14:50             ` Summers, Stuart
2019-05-02 14:58               ` Jani Nikula
2019-05-02 14:58                 ` Summers, Stuart
2019-05-01 15:34 ` [PATCH 6/6] drm/i915: Expand subslice mask Stuart Summers
2019-05-01 18:22   ` Tvrtko Ursulin
2019-05-01 18:29     ` Tvrtko Ursulin
2019-05-01 19:40       ` Summers, Stuart
2019-05-01 22:04   ` Daniele Ceraolo Spurio
2019-05-02 14:47     ` Summers, Stuart
2019-05-03  9:05       ` Lionel Landwerlin
2019-05-03 14:28         ` Summers, Stuart
2019-05-01 15:58 ` ✗ Fi.CI.CHECKPATCH: warning for Refactor to expand subslice mask (rev7) Patchwork
2019-05-01 16:01 ` ✗ Fi.CI.SPARSE: " Patchwork
2019-05-01 16:14 ` ✓ Fi.CI.BAT: success " Patchwork
2019-05-02  9:14 ` ✓ Fi.CI.IGT: " Patchwork
  -- strict thread matches above, loose matches on Subject: below --
2019-04-30 23:06 [PATCH 0/6] Refactor to expand subslice mask Stuart Summers
2019-04-30 23:06 ` [PATCH 6/6] drm/i915: Expand " Stuart Summers

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.