All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/4] drm/i915: introduce query information
@ 2017-11-08 16:22 Lionel Landwerlin
  2017-11-08 16:22 ` [PATCH 1/4] drm/i915: introduce query info uAPI Lionel Landwerlin
                   ` (6 more replies)
  0 siblings, 7 replies; 19+ messages in thread
From: Lionel Landwerlin @ 2017-11-08 16:22 UTC (permalink / raw)
  To: intel-gfx

Hi,

This series is based off work that Tvrtko started, initially for
exposing the engines available to userspace.

I've added the topology information I would like to expose for
normalizing the performance counters.

Cheers,

Lionel Landwerlin (3):
  drm/i915: store all subslice masks
  drm/i915/debugfs: reuse max slice/subslices already stored in sseu
  drm/i915: expose rcs topology through discovery uAPI

Tvrtko Ursulin (1):
  drm/i915: introduce query info uAPI

 drivers/gpu/drm/i915/Makefile            |   1 +
 drivers/gpu/drm/i915/i915_debugfs.c      |  50 ++++-----
 drivers/gpu/drm/i915/i915_drv.c          |   3 +-
 drivers/gpu/drm/i915/i915_drv.h          |  26 ++++-
 drivers/gpu/drm/i915/intel_device_info.c | 171 +++++++++++++++++++++--------
 drivers/gpu/drm/i915/intel_engine_cs.c   |   1 +
 drivers/gpu/drm/i915/intel_lrc.c         |   2 +-
 drivers/gpu/drm/i915/intel_query_info.c  | 182 +++++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/intel_ringbuffer.h  |   2 +-
 include/uapi/drm/i915_drm.h              | 119 ++++++++++++++++++++
 10 files changed, 479 insertions(+), 78 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/intel_query_info.c

--
2.15.0
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH 1/4] drm/i915: introduce query info uAPI
  2017-11-08 16:22 [PATCH 0/4] drm/i915: introduce query information Lionel Landwerlin
@ 2017-11-08 16:22 ` Lionel Landwerlin
  2017-11-09 15:57   ` Joonas Lahtinen
  2017-11-08 16:22 ` [PATCH 2/4] drm/i915: store all subslice masks Lionel Landwerlin
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 19+ messages in thread
From: Lionel Landwerlin @ 2017-11-08 16:22 UTC (permalink / raw)
  To: intel-gfx; +Cc: Ben Widawsky, Daniel Vetter

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Query info uAPI allows userspace to probe for a number of properties
of the GPU. This partially removes the need for userspace to maintain
the internal PCI id based database (as well as code duplication across
userspace components).

This first version enables engine configuration and features to be
queried.

Probing is done via the new DRM_IOCTL_I915_QUERY_INFO ioctl which
takes a query ID as well as query arguments.

Currently only general engine configuration and HEVC feature of the
VCS engine can be probed but the uAPI is designed to be generic and
extensible.

Code is based almost exactly on the earlier proposal on the topic by
Jon Bloomfield. Engine class and instance refactoring made recently by
Daniele Ceraolo Spurio enabled this to be implemented in an elegant
fashion.

To probe configuration userspace sets the engine class it wants to
query (struct drm_i915_gem_engine_info) and provides an array of
drm_i915_engine_info structs which will be filled in by the driver.
Userspace also has to tell i915 how many elements are in the array,
and the driver will report back the total number of engine instances
in any case.

Pseudo-code example of userspace using the uAPI:

  struct drm_i915_query_info info = {};
  struct drm_i915_engine_info *engines;
  int i, ret;

  info.version = 1;
  info.query = I915_QUERY_INFO_ENGINE;
  info.query_params[0] = I915_ENGINE_CLASS_VIDEO;
  info.info_ptr_len = 0;
  ret = ioctl(..., &info);
  assert(ret == 0);

  engines = malloc(info.info_ptr_len);
  info.info_ptr = to_user_pointer(engines);
  ret = ioctl(..., &info);
  assert(ret == 0);

  for (i = 0; i < info.info_ptr / sizeof(*engines); i++)
      printf("VCS%u: flags=%x\n",
             engines[i].instance,
             engines[i].info);

First time with info.info_ptr_len set to zero, so the kernel reports
back the size of the data to be written into info.info_ptr. Then a
second time with the info.info_ptr_len field set to the correct
length.

v2:
 * Add a version field and consolidate to one engine count.
   (Chris Wilson)
 * Rename uAPI flags for VCS engines to DRM_I915_ENGINE_CLASS_VIDEO.
   (Gong Zhipeng)

v3:
 * Drop the DRM_ prefix from uAPI enums. (Chris Wilson)
 * Only reserve 8-bits for the engine info for time being.

v4:
 * Made user_class_map static.

v5:
 * Example usage added to commit msg. (Chris Wilson)
 * Report engine count in case of version mismatch. (Chris Wilson)

v6:
 * Return API to query_info to make it more generic (Lionel)
 * Add query ID & query params (Lionel)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Ben Widawsky <ben@bwidawsk.net>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: "Rogozhkin, Dmitry V" <dmitry.v.rogozhkin@intel.com>
Cc: Oscar Mateo <oscar.mateo@intel.com>
Cc: "Gong, Zhipeng" <zhipeng.gong@intel.com>
---
 drivers/gpu/drm/i915/Makefile           |   1 +
 drivers/gpu/drm/i915/i915_drv.c         |   1 +
 drivers/gpu/drm/i915/i915_drv.h         |   3 +
 drivers/gpu/drm/i915/intel_engine_cs.c  |   1 +
 drivers/gpu/drm/i915/intel_query_info.c | 118 ++++++++++++++++++++++++++++++++
 include/uapi/drm/i915_drm.h             |  64 +++++++++++++++++
 6 files changed, 188 insertions(+)
 create mode 100644 drivers/gpu/drm/i915/intel_query_info.c

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 177404462386..18d087f3e501 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -75,6 +75,7 @@ i915-y += i915_cmd_parser.o \
 	  intel_hangcheck.o \
 	  intel_lrc.o \
 	  intel_mocs.o \
+	  intel_query_info.o \
 	  intel_ringbuffer.o \
 	  intel_uncore.o
 
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index e7e9e061073b..d1843a18ea2a 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -2776,6 +2776,7 @@ static const struct drm_ioctl_desc i915_ioctls[] = {
 	DRM_IOCTL_DEF_DRV(I915_PERF_OPEN, i915_perf_open_ioctl, DRM_RENDER_ALLOW),
 	DRM_IOCTL_DEF_DRV(I915_PERF_ADD_CONFIG, i915_perf_add_config_ioctl, DRM_UNLOCKED|DRM_RENDER_ALLOW),
 	DRM_IOCTL_DEF_DRV(I915_PERF_REMOVE_CONFIG, i915_perf_remove_config_ioctl, DRM_UNLOCKED|DRM_RENDER_ALLOW),
+	DRM_IOCTL_DEF_DRV(I915_QUERY_INFO, i915_query_info_ioctl, DRM_RENDER_ALLOW),
 };
 
 static struct drm_driver driver = {
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 5e6938d0a903..a4978bd6a436 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -3858,6 +3858,9 @@ void i915_oa_init_reg_state(struct intel_engine_cs *engine,
 			    struct i915_gem_context *ctx,
 			    uint32_t *reg_state);
 
+int i915_query_info_ioctl(struct drm_device *dev, void *data,
+			  struct drm_file *file);
+
 /* i915_gem_evict.c */
 int __must_check i915_gem_evict_something(struct i915_address_space *vm,
 					  u64 min_size, u64 alignment,
diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
index 6997306be0d2..ddfe9b722d92 100644
--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
@@ -28,6 +28,7 @@
 #include "i915_vgpu.h"
 #include "intel_ringbuffer.h"
 #include "intel_lrc.h"
+#include <uapi/drm/i915_drm.h>
 
 /* Haswell does have the CXT_SIZE register however it does not appear to be
  * valid. Now, docs explain in dwords what is in the context object. The full
diff --git a/drivers/gpu/drm/i915/intel_query_info.c b/drivers/gpu/drm/i915/intel_query_info.c
new file mode 100644
index 000000000000..21434c393582
--- /dev/null
+++ b/drivers/gpu/drm/i915/intel_query_info.c
@@ -0,0 +1,118 @@
+/*
+ * Copyright © 2017 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ *
+ */
+
+#include "i915_drv.h"
+#include <uapi/drm/i915_drm.h>
+
+static u8 user_class_map[I915_ENGINE_CLASS_MAX] = {
+	[I915_ENGINE_CLASS_OTHER] = OTHER_CLASS,
+	[I915_ENGINE_CLASS_RENDER] = RENDER_CLASS,
+	[I915_ENGINE_CLASS_COPY] = COPY_ENGINE_CLASS,
+	[I915_ENGINE_CLASS_VIDEO] = VIDEO_DECODE_CLASS,
+	[I915_ENGINE_CLASS_VIDEO_ENHANCE] = VIDEO_ENHANCEMENT_CLASS,
+};
+
+static int query_info_engine(struct drm_i915_private *dev_priv,
+			     struct drm_i915_query_info *args)
+{
+	struct drm_i915_engine_info __user *user_info =
+		u64_to_user_ptr(args->info_ptr);
+	struct intel_engine_cs *engine;
+	enum intel_engine_id id;
+	u8 num_engines, class;
+	u32 info_size;
+
+	switch (args->query_params[0]) {
+	case I915_ENGINE_CLASS_OTHER:
+	case I915_ENGINE_CLASS_RENDER:
+	case I915_ENGINE_CLASS_COPY:
+	case I915_ENGINE_CLASS_VIDEO:
+	case I915_ENGINE_CLASS_VIDEO_ENHANCE:
+		class = user_class_map[args->query_params[0]];
+		break;
+	case I915_ENGINE_CLASS_MAX:
+	default:
+		return -EINVAL;
+	};
+
+	num_engines = 0;
+	for_each_engine(engine, dev_priv, id) {
+		if (class != engine->class)
+			continue;
+
+		num_engines++;
+	}
+
+	info_size = sizeof(struct drm_i915_engine_info) * num_engines;
+	if (args->info_ptr_len == 0) {
+		args->info_ptr_len = info_size;
+		return 0;
+	}
+
+	if (args->info_ptr_len < info_size)
+		return -EINVAL;
+
+	for_each_engine(engine, dev_priv, id) {
+		struct drm_i915_engine_info info;
+		int ret;
+
+		if (class != engine->class)
+			continue;
+
+		memset(&info, 0, sizeof(info));
+		info.instance = engine->instance;
+		if (INTEL_GEN(dev_priv) >= 8 && id == VCS)
+			info.info = I915_VCS_HAS_HEVC;
+
+		ret = copy_to_user(user_info++, &info, sizeof(info));
+		if (ret)
+			return -EFAULT;
+	}
+
+	return 0;
+}
+
+
+int i915_query_info_ioctl(struct drm_device *dev, void *data,
+			  struct drm_file *file)
+{
+	struct drm_i915_private *dev_priv = to_i915(dev);
+	struct drm_i915_query_info *args = data;
+
+	/* Currently supported version of this API. */
+	if (args->version == 0) {
+		args->version = 1;
+		return 0;
+	}
+
+	if (args->version != 1)
+		return -EINVAL;
+
+	switch (args->query) {
+	case I915_QUERY_INFO_ENGINE:
+		return query_info_engine(dev_priv, args);
+	default:
+		return -EINVAL;
+	}
+}
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index ac3c6503ca27..c6026616300e 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -262,6 +262,7 @@ typedef struct _drm_i915_sarea {
 #define DRM_I915_PERF_OPEN		0x36
 #define DRM_I915_PERF_ADD_CONFIG	0x37
 #define DRM_I915_PERF_REMOVE_CONFIG	0x38
+#define DRM_I915_QUERY_INFO		0x39
 
 #define DRM_IOCTL_I915_INIT		DRM_IOW( DRM_COMMAND_BASE + DRM_I915_INIT, drm_i915_init_t)
 #define DRM_IOCTL_I915_FLUSH		DRM_IO ( DRM_COMMAND_BASE + DRM_I915_FLUSH)
@@ -319,6 +320,7 @@ typedef struct _drm_i915_sarea {
 #define DRM_IOCTL_I915_PERF_OPEN	DRM_IOW(DRM_COMMAND_BASE + DRM_I915_PERF_OPEN, struct drm_i915_perf_open_param)
 #define DRM_IOCTL_I915_PERF_ADD_CONFIG	DRM_IOW(DRM_COMMAND_BASE + DRM_I915_PERF_ADD_CONFIG, struct drm_i915_perf_oa_config)
 #define DRM_IOCTL_I915_PERF_REMOVE_CONFIG	DRM_IOW(DRM_COMMAND_BASE + DRM_I915_PERF_REMOVE_CONFIG, __u64)
+#define DRM_IOCTL_I915_QUERY_INFO	DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_QUERY_INFO, struct drm_i915_query_info)
 
 /* Allow drivers to submit batchbuffers directly to hardware, relying
  * on the security mechanisms provided by hardware.
@@ -1536,6 +1538,68 @@ struct drm_i915_perf_oa_config {
 	__u64 flex_regs_ptr;
 };
 
+
+/* Query engines information.
+ *
+ * drm_i915_query_info.query_params[0] should be set to one of the
+ * I915_ENGINE_CLASS_* enum.
+ *
+ * drm_i915_engine_info.info_ptr will be written to with an array of
+ * drm_i915_engine_info.
+ */
+#define I915_QUERY_INFO_ENGINE		0 /* version 1 */
+
+enum drm_i915_engine_class {
+	I915_ENGINE_CLASS_OTHER = 0,
+	I915_ENGINE_CLASS_RENDER = 1,
+	I915_ENGINE_CLASS_COPY = 2,
+	I915_ENGINE_CLASS_VIDEO = 3,
+	I915_ENGINE_CLASS_VIDEO_ENHANCE = 4,
+	I915_ENGINE_CLASS_MAX /* non-ABI */
+};
+
+struct drm_i915_engine_info {
+	/** Engine instance number. */
+	__u8 instance;
+
+	/** Engine specific info. */
+#define I915_VCS_HAS_HEVC	BIT(0)
+	__u8 info;
+
+	__u8 rsvd2[6];
+};
+
+
+struct drm_i915_query_info {
+	/* in/out: Protocol version requested/supported. When set to 0, the
+	 * kernel set this field to the current supported version. EINVAL will
+	 * be return if version is greater than what the kernel supports.
+	 */
+	__u32 version;
+
+	/* in: Query to perform on the engine (use one of
+	 * I915_QUERY_INFO_* define).
+	 */
+	__u32 query;
+
+	/** in: Parameters associated with the query (refer to each
+	 * I915_QUERY_INFO_* define)
+	 */
+	__u32 query_params[3];
+
+	/* in/out: Size of the data to be copied into info_ptr. When set to 0,
+	 * the kernel set this field to the size to be copied into info_ptr.
+	 * Call this one more time with the correct value set to make the
+	 * kernel copy the data into info_ptr.
+	 */
+	__u32 info_ptr_len;
+
+	/** in/out: Pointer to the data filled for the query (pointer set by
+	 * the caller, data written by the callee).
+	 */
+	__u64 info_ptr;
+};
+
 #if defined(__cplusplus)
 }
 #endif
-- 
2.15.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 2/4] drm/i915: store all subslice masks
  2017-11-08 16:22 [PATCH 0/4] drm/i915: introduce query information Lionel Landwerlin
  2017-11-08 16:22 ` [PATCH 1/4] drm/i915: introduce query info uAPI Lionel Landwerlin
@ 2017-11-08 16:22 ` Lionel Landwerlin
  2017-11-08 16:22 ` [PATCH 3/4] drm/i915/debugfs: reuse max slice/subslices already stored in sseu Lionel Landwerlin
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 19+ messages in thread
From: Lionel Landwerlin @ 2017-11-08 16:22 UTC (permalink / raw)
  To: intel-gfx

Up to now, subslice mask was assumed to be uniform across slices. But
starting with Cannonlake, slices can be asymetric (for example slice0
has different number of subslices as slice1+). This change stores all
subslices masks for all slices rather than having a single mask that
applies to all slices.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
---
 drivers/gpu/drm/i915/i915_debugfs.c      |  24 +++--
 drivers/gpu/drm/i915/i915_drv.c          |   2 +-
 drivers/gpu/drm/i915/i915_drv.h          |  23 ++++-
 drivers/gpu/drm/i915/intel_device_info.c | 171 ++++++++++++++++++++++---------
 drivers/gpu/drm/i915/intel_lrc.c         |   2 +-
 drivers/gpu/drm/i915/intel_ringbuffer.h  |   2 +-
 6 files changed, 162 insertions(+), 62 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 39883cd915db..4982e92f0197 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -4439,7 +4439,7 @@ static void cherryview_sseu_device_status(struct drm_i915_private *dev_priv,
 			continue;
 
 		sseu->slice_mask = BIT(0);
-		sseu->subslice_mask |= BIT(ss);
+		sseu->subslices_mask[0] |= BIT(ss);
 		eu_cnt = ((sig1[ss] & CHV_EU08_PG_ENABLE) ? 0 : 2) +
 			 ((sig1[ss] & CHV_EU19_PG_ENABLE) ? 0 : 2) +
 			 ((sig1[ss] & CHV_EU210_PG_ENABLE) ? 0 : 2) +
@@ -4486,7 +4486,7 @@ static void gen10_sseu_device_status(struct drm_i915_private *dev_priv,
 			continue;
 
 		sseu->slice_mask |= BIT(s);
-		sseu->subslice_mask = info->sseu.subslice_mask;
+		sseu->subslices_mask[s] = info->sseu.subslices_mask[s];
 
 		for (ss = 0; ss < ss_max; ss++) {
 			unsigned int eu_cnt;
@@ -4541,8 +4541,8 @@ static void gen9_sseu_device_status(struct drm_i915_private *dev_priv,
 		sseu->slice_mask |= BIT(s);
 
 		if (IS_GEN9_BC(dev_priv))
-			sseu->subslice_mask =
-				INTEL_INFO(dev_priv)->sseu.subslice_mask;
+			sseu->subslices_mask[s] =
+				INTEL_INFO(dev_priv)->sseu.subslices_mask[s];
 
 		for (ss = 0; ss < ss_max; ss++) {
 			unsigned int eu_cnt;
@@ -4552,7 +4552,7 @@ static void gen9_sseu_device_status(struct drm_i915_private *dev_priv,
 					/* skip disabled subslice */
 					continue;
 
-				sseu->subslice_mask |= BIT(ss);
+				sseu->subslices_mask[s] |= BIT(ss);
 			}
 
 			eu_cnt = 2 * hweight32(eu_reg[2*s + ss/2] &
@@ -4574,9 +4574,12 @@ static void broadwell_sseu_device_status(struct drm_i915_private *dev_priv,
 	sseu->slice_mask = slice_info & GEN8_LSLICESTAT_MASK;
 
 	if (sseu->slice_mask) {
-		sseu->subslice_mask = INTEL_INFO(dev_priv)->sseu.subslice_mask;
 		sseu->eu_per_subslice =
 				INTEL_INFO(dev_priv)->sseu.eu_per_subslice;
+		for (s = 0; s < fls(sseu->slice_mask); s++) {
+			sseu->subslices_mask[s] =
+				INTEL_INFO(dev_priv)->sseu.subslices_mask[s];
+		}
 		sseu->eu_total = sseu->eu_per_subslice *
 				 sseu_subslice_total(sseu);
 
@@ -4595,6 +4598,7 @@ static void i915_print_sseu_info(struct seq_file *m, bool is_available_info,
 {
 	struct drm_i915_private *dev_priv = node_to_i915(m->private);
 	const char *type = is_available_info ? "Available" : "Enabled";
+	int s;
 
 	seq_printf(m, "  %s Slice Mask: %04x\n", type,
 		   sseu->slice_mask);
@@ -4602,10 +4606,10 @@ static void i915_print_sseu_info(struct seq_file *m, bool is_available_info,
 		   hweight8(sseu->slice_mask));
 	seq_printf(m, "  %s Subslice Total: %u\n", type,
 		   sseu_subslice_total(sseu));
-	seq_printf(m, "  %s Subslice Mask: %04x\n", type,
-		   sseu->subslice_mask);
-	seq_printf(m, "  %s Subslice Per Slice: %u\n", type,
-		   hweight8(sseu->subslice_mask));
+	for (s = 0; s < fls(sseu->slice_mask); s++) {
+		seq_printf(m, "  %s Slice%i Subslice Mask: %04x\n", type,
+			   s, sseu->subslices_mask[s]);
+	}
 	seq_printf(m, "  %s EU Total: %u\n", type,
 		   sseu->eu_total);
 	seq_printf(m, "  %s EU Per Subslice: %u\n", type,
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index d1843a18ea2a..ef020d96cba1 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -412,7 +412,7 @@ static int i915_getparam(struct drm_device *dev, void *data,
 			return -ENODEV;
 		break;
 	case I915_PARAM_SUBSLICE_MASK:
-		value = INTEL_INFO(dev_priv)->sseu.subslice_mask;
+		value = INTEL_INFO(dev_priv)->sseu.subslices_mask[0];
 		if (!value)
 			return -ENODEV;
 		break;
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index a4978bd6a436..6391d497fc88 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -801,9 +801,12 @@ struct intel_csr {
 	func(supports_tv); \
 	func(has_ipc);
 
+#define GEN_MAX_SLICES		(6) /* CNL upper bound */
+#define GEN_MAX_SUBSLICES	(7)
+
 struct sseu_dev_info {
 	u8 slice_mask;
-	u8 subslice_mask;
+	u8 subslices_mask[GEN_MAX_SLICES];
 	u8 eu_total;
 	u8 eu_per_subslice;
 	u8 min_eu_in_pool;
@@ -812,11 +815,27 @@ struct sseu_dev_info {
 	u8 has_slice_pg:1;
 	u8 has_subslice_pg:1;
 	u8 has_eu_pg:1;
+
+	/* Topology fields */
+	u8 max_slices;
+	u8 max_subslices;
+	u8 max_eus_per_subslice;
+
+	/* We don't have more than 8 eus per subslice at the moment and as we
+	 * store eus enabled using bits, no need to multiply by eus per
+	 * subslice.
+	 */
+	u8 eu_mask[GEN_MAX_SLICES * GEN_MAX_SUBSLICES];
 };
 
 static inline unsigned int sseu_subslice_total(const struct sseu_dev_info *sseu)
 {
-	return hweight8(sseu->slice_mask) * hweight8(sseu->subslice_mask);
+	unsigned s, total = 0;
+
+	for (s = 0; s < ARRAY_SIZE(sseu->subslices_mask); s++)
+		total += hweight8(sseu->subslices_mask[s]);
+
+	return total;
 }
 
 /* Keep in gen based order, and chronological order within a gen */
diff --git a/drivers/gpu/drm/i915/intel_device_info.c b/drivers/gpu/drm/i915/intel_device_info.c
index db03d179fc85..1591342ffb9d 100644
--- a/drivers/gpu/drm/i915/intel_device_info.c
+++ b/drivers/gpu/drm/i915/intel_device_info.c
@@ -82,22 +82,74 @@ void intel_device_info_dump(struct drm_i915_private *dev_priv)
 #undef PRINT_FLAG
 }
 
+static u8 compute_eu_total(const struct sseu_dev_info *sseu)
+{
+	u8 i, total = 0;
+
+	for (i = 0; i < ARRAY_SIZE(sseu->eu_mask); i++)
+		total += hweight8(sseu->eu_mask[i]);
+
+	return total;
+}
+
 static void gen10_sseu_info_init(struct drm_i915_private *dev_priv)
 {
 	struct sseu_dev_info *sseu = &mkwrite_device_info(dev_priv)->sseu;
 	const u32 fuse2 = I915_READ(GEN8_FUSE2);
+	int s, ss, eu_mask = 0xff;
+	u32 subslice_mask, eu_en;
 
 	sseu->slice_mask = (fuse2 & GEN10_F2_S_ENA_MASK) >>
 			    GEN10_F2_S_ENA_SHIFT;
-	sseu->subslice_mask = (1 << 4) - 1;
-	sseu->subslice_mask &= ~((fuse2 & GEN10_F2_SS_DIS_MASK) >>
-				 GEN10_F2_SS_DIS_SHIFT);
+	sseu->max_slices = 6;
+	sseu->max_subslices = 4;
+	sseu->max_eus_per_subslice = 8;
+
+	subslice_mask = (1 << 4) - 1;
+	subslice_mask &= ~((fuse2 & GEN10_F2_SS_DIS_MASK) >>
+			   GEN10_F2_SS_DIS_SHIFT);
+
+	/* Slice0 can have up to 3 subslices, but there are only 2 in
+	 * slice1/2.
+	 */
+	sseu->subslices_mask[0] = subslice_mask;
+	for (s = 1; s < sseu->max_slices; s++)
+		sseu->subslices_mask[s] = subslice_mask & 0x3;
+
+	/* Slice0 */
+	eu_en = ~I915_READ(GEN8_EU_DISABLE0);
+	for (ss = 0; ss < sseu->max_subslices; ss++)
+		sseu->eu_mask[ss]     = (eu_en >> (8 * ss)) & eu_mask;
+	/* Slice1 */
+	sseu->eu_mask[sseu->max_subslices]         = (eu_en >> 24) & eu_mask;
+	eu_en = ~I915_READ(GEN8_EU_DISABLE1);
+	sseu->eu_mask[sseu->max_subslices + 1]     = eu_en & eu_mask;
+	/* Slice2 */
+	sseu->eu_mask[2 * sseu->max_subslices]     = (eu_en >> 8) & eu_mask;
+	sseu->eu_mask[2 * sseu->max_subslices + 1] = (eu_en >> 16) & eu_mask;
+	/* Slice3 */
+	sseu->eu_mask[3 * sseu->max_subslices]     = (eu_en >> 24) & eu_mask;
+	eu_en = ~I915_READ(GEN8_EU_DISABLE2);
+	sseu->eu_mask[3 * sseu->max_subslices + 1] = eu_en & eu_mask;
+	/* Slice4 */
+	sseu->eu_mask[4 * sseu->max_subslices]     = (eu_en >> 8) & eu_mask;
+	sseu->eu_mask[4 * sseu->max_subslices + 1] = (eu_en >> 16) & eu_mask;
+	/* Slice5 */
+	sseu->eu_mask[5 * sseu->max_subslices]     = (eu_en >> 24) & eu_mask;
+	eu_en = ~I915_READ(GEN10_EU_DISABLE3);
+	sseu->eu_mask[5 * sseu->max_subslices + 1] = eu_en & eu_mask;
+
+	/* Do a second pass where we marked the subslices disabled if all
+	 * their eus are off.
+	 */
+	for (s = 0; s < sseu->max_slices; s++) {
+		for (ss = 0; ss < sseu->max_subslices; ss++) {
+			if (sseu->eu_mask[s * sseu->max_subslices + ss] == 0)
+				sseu->subslices_mask[s] &= ~BIT(ss);
+		}
+	}
 
-	sseu->eu_total = hweight32(~I915_READ(GEN8_EU_DISABLE0));
-	sseu->eu_total += hweight32(~I915_READ(GEN8_EU_DISABLE1));
-	sseu->eu_total += hweight32(~I915_READ(GEN8_EU_DISABLE2));
-	sseu->eu_total += hweight8(~(I915_READ(GEN10_EU_DISABLE3) &
-				     GEN10_EU_DIS_SS_MASK));
+	sseu->eu_total = compute_eu_total(sseu);
 
 	/*
 	 * CNL is expected to always have a uniform distribution
@@ -118,26 +170,30 @@ static void gen10_sseu_info_init(struct drm_i915_private *dev_priv)
 static void cherryview_sseu_info_init(struct drm_i915_private *dev_priv)
 {
 	struct sseu_dev_info *sseu = &mkwrite_device_info(dev_priv)->sseu;
-	u32 fuse, eu_dis;
+	u32 fuse;
 
 	fuse = I915_READ(CHV_FUSE_GT);
 
 	sseu->slice_mask = BIT(0);
+	sseu->max_slices = 1;
+	sseu->max_subslices = 2;
+	sseu->max_eus_per_subslice = 8;
 
 	if (!(fuse & CHV_FGT_DISABLE_SS0)) {
-		sseu->subslice_mask |= BIT(0);
-		eu_dis = fuse & (CHV_FGT_EU_DIS_SS0_R0_MASK |
-				 CHV_FGT_EU_DIS_SS0_R1_MASK);
-		sseu->eu_total += 8 - hweight32(eu_dis);
+		sseu->subslices_mask[0] |= BIT(0);
+		sseu->eu_mask[0] = (fuse & CHV_FGT_EU_DIS_SS0_R0_MASK) >> CHV_FGT_EU_DIS_SS0_R0_SHIFT;
+		sseu->eu_mask[0] |= ((fuse & CHV_FGT_EU_DIS_SS0_R1_MASK) >> CHV_FGT_EU_DIS_SS0_R1_SHIFT) << 4;
+		sseu->subslices_mask[0] = 1;
 	}
 
 	if (!(fuse & CHV_FGT_DISABLE_SS1)) {
-		sseu->subslice_mask |= BIT(1);
-		eu_dis = fuse & (CHV_FGT_EU_DIS_SS1_R0_MASK |
-				 CHV_FGT_EU_DIS_SS1_R1_MASK);
-		sseu->eu_total += 8 - hweight32(eu_dis);
+		sseu->subslices_mask[0] |= BIT(1);
+		sseu->eu_mask[1] = (fuse & CHV_FGT_EU_DIS_SS1_R0_MASK) >> CHV_FGT_EU_DIS_SS0_R0_SHIFT;
+		sseu->eu_mask[2] |= ((fuse & CHV_FGT_EU_DIS_SS1_R1_MASK) >> CHV_FGT_EU_DIS_SS0_R1_SHIFT) << 4;
 	}
 
+	sseu->eu_total = compute_eu_total(sseu);
+
 	/*
 	 * CHV expected to always have a uniform distribution of EU
 	 * across subslices.
@@ -159,41 +215,50 @@ static void gen9_sseu_info_init(struct drm_i915_private *dev_priv)
 {
 	struct intel_device_info *info = mkwrite_device_info(dev_priv);
 	struct sseu_dev_info *sseu = &info->sseu;
-	int s_max = 3, ss_max = 4, eu_max = 8;
 	int s, ss;
-	u32 fuse2, eu_disable;
+	u32 fuse2, eu_disable, subslice_mask;
 	u8 eu_mask = 0xff;
 
 	fuse2 = I915_READ(GEN8_FUSE2);
 	sseu->slice_mask = (fuse2 & GEN8_F2_S_ENA_MASK) >> GEN8_F2_S_ENA_SHIFT;
 
+	/* BXT has a single slice and at most 3 subslices. */
+	sseu->max_slices = IS_GEN9_LP(dev_priv) ? 1 : 3;
+	sseu->max_subslices = IS_GEN9_LP(dev_priv) ? 3 : 4;
+	sseu->max_eus_per_subslice = 8;
+
 	/*
 	 * The subslice disable field is global, i.e. it applies
 	 * to each of the enabled slices.
 	*/
-	sseu->subslice_mask = (1 << ss_max) - 1;
-	sseu->subslice_mask &= ~((fuse2 & GEN9_F2_SS_DIS_MASK) >>
-				 GEN9_F2_SS_DIS_SHIFT);
+	subslice_mask = (1 << sseu->max_subslices) - 1;
+	subslice_mask &= ~((fuse2 & GEN9_F2_SS_DIS_MASK) >>
+			   GEN9_F2_SS_DIS_SHIFT);
 
 	/*
 	 * Iterate through enabled slices and subslices to
 	 * count the total enabled EU.
 	*/
-	for (s = 0; s < s_max; s++) {
+	for (s = 0; s < sseu->max_slices; s++) {
 		if (!(sseu->slice_mask & BIT(s)))
 			/* skip disabled slice */
 			continue;
 
+		sseu->subslices_mask[s] = subslice_mask;
+
 		eu_disable = I915_READ(GEN9_EU_DISABLE(s));
-		for (ss = 0; ss < ss_max; ss++) {
+		for (ss = 0; ss < sseu->max_subslices; ss++) {
 			int eu_per_ss;
 
-			if (!(sseu->subslice_mask & BIT(ss)))
+			if (!(sseu->subslices_mask[s] & BIT(ss)))
 				/* skip disabled subslice */
 				continue;
 
-			eu_per_ss = eu_max - hweight8((eu_disable >> (ss*8)) &
-						      eu_mask);
+			sseu->eu_mask[ss + s * sseu->max_subslices] =
+				~((eu_disable >> (ss*8)) & eu_mask);
+
+			eu_per_ss = sseu->max_eus_per_subslice -
+				hweight8((eu_disable >> (ss*8)) & eu_mask);
 
 			/*
 			 * Record which subslice(s) has(have) 7 EUs. we
@@ -202,11 +267,11 @@ static void gen9_sseu_info_init(struct drm_i915_private *dev_priv)
 			 */
 			if (eu_per_ss == 7)
 				sseu->subslice_7eu[s] |= BIT(ss);
-
-			sseu->eu_total += eu_per_ss;
 		}
 	}
 
+	sseu->eu_total = compute_eu_total(sseu);
+
 	/*
 	 * SKL is expected to always have a uniform distribution
 	 * of EU across subslices with the exception that any one
@@ -232,8 +297,8 @@ static void gen9_sseu_info_init(struct drm_i915_private *dev_priv)
 	sseu->has_eu_pg = sseu->eu_per_subslice > 2;
 
 	if (IS_GEN9_LP(dev_priv)) {
-#define IS_SS_DISABLED(ss)	(!(sseu->subslice_mask & BIT(ss)))
-		info->has_pooled_eu = hweight8(sseu->subslice_mask) == 3;
+#define IS_SS_DISABLED(ss)	(!(sseu->subslices_mask[0] & BIT(ss)))
+		info->has_pooled_eu = hweight8(sseu->subslices_mask[0]) == 3;
 
 		/*
 		 * There is a HW issue in 2x6 fused down parts that requires
@@ -242,7 +307,7 @@ static void gen9_sseu_info_init(struct drm_i915_private *dev_priv)
 		 * doesn't affect if the device has all 3 subslices enabled.
 		 */
 		/* WaEnablePooledEuFor2x6:bxt */
-		info->has_pooled_eu |= (hweight8(sseu->subslice_mask) == 2 &&
+		info->has_pooled_eu |= (hweight8(sseu->subslices_mask[0]) == 2 &&
 					IS_BXT_REVID(dev_priv, 0, BXT_REVID_B_LAST));
 
 		sseu->min_eu_in_pool = 0;
@@ -261,19 +326,22 @@ static void gen9_sseu_info_init(struct drm_i915_private *dev_priv)
 static void broadwell_sseu_info_init(struct drm_i915_private *dev_priv)
 {
 	struct sseu_dev_info *sseu = &mkwrite_device_info(dev_priv)->sseu;
-	const int s_max = 3, ss_max = 3, eu_max = 8;
 	int s, ss;
-	u32 fuse2, eu_disable[3]; /* s_max */
+	u32 fuse2, subslice_mask, eu_disable[3]; /* s_max */
 
 	fuse2 = I915_READ(GEN8_FUSE2);
 	sseu->slice_mask = (fuse2 & GEN8_F2_S_ENA_MASK) >> GEN8_F2_S_ENA_SHIFT;
+	sseu->max_slices = 3;
+	sseu->max_subslices = 3;
+	sseu->max_eus_per_subslice = 8;
+
 	/*
 	 * The subslice disable field is global, i.e. it applies
 	 * to each of the enabled slices.
 	 */
-	sseu->subslice_mask = GENMASK(ss_max - 1, 0);
-	sseu->subslice_mask &= ~((fuse2 & GEN8_F2_SS_DIS_MASK) >>
-				 GEN8_F2_SS_DIS_SHIFT);
+	subslice_mask = GENMASK(sseu->max_subslices - 1, 0);
+	subslice_mask &= ~((fuse2 & GEN8_F2_SS_DIS_MASK) >>
+			   GEN8_F2_SS_DIS_SHIFT);
 
 	eu_disable[0] = I915_READ(GEN8_EU_DISABLE0) & GEN8_EU_DIS0_S0_MASK;
 	eu_disable[1] = (I915_READ(GEN8_EU_DISABLE0) >> GEN8_EU_DIS0_S1_SHIFT) |
@@ -287,30 +355,36 @@ static void broadwell_sseu_info_init(struct drm_i915_private *dev_priv)
 	 * Iterate through enabled slices and subslices to
 	 * count the total enabled EU.
 	 */
-	for (s = 0; s < s_max; s++) {
+	for (s = 0; s < sseu->max_slices; s++) {
 		if (!(sseu->slice_mask & BIT(s)))
 			/* skip disabled slice */
 			continue;
 
-		for (ss = 0; ss < ss_max; ss++) {
+		sseu->subslices_mask[s] = subslice_mask;
+
+		for (ss = 0; ss < sseu->max_subslices; ss++) {
 			u32 n_disabled;
 
-			if (!(sseu->subslice_mask & BIT(ss)))
+			if (!(sseu->subslices_mask[ss] & BIT(ss)))
 				/* skip disabled subslice */
 				continue;
 
-			n_disabled = hweight8(eu_disable[s] >> (ss * eu_max));
+			sseu->eu_mask[ss + s * sseu->max_subslices] =
+				~(eu_disable[s] >>
+				  (ss * sseu->max_eus_per_subslice));
+			n_disabled = hweight8(eu_disable[s] >>
+					      (ss * sseu->max_eus_per_subslice));
 
 			/*
 			 * Record which subslices have 7 EUs.
 			 */
-			if (eu_max - n_disabled == 7)
+			if (sseu->max_eus_per_subslice - n_disabled == 7)
 				sseu->subslice_7eu[s] |= 1 << ss;
-
-			sseu->eu_total += eu_max - n_disabled;
 		}
 	}
 
+	sseu->eu_total = compute_eu_total(sseu);
+
 	/*
 	 * BDW is expected to always have a uniform distribution of EU across
 	 * subslices with the exception that any one EU in any one subslice may
@@ -346,6 +420,7 @@ void intel_device_info_runtime_init(struct drm_i915_private *dev_priv)
 {
 	struct intel_device_info *info = mkwrite_device_info(dev_priv);
 	enum pipe pipe;
+	int s;
 
 	if (INTEL_GEN(dev_priv) >= 10) {
 		for_each_pipe(dev_priv, pipe)
@@ -454,9 +529,11 @@ void intel_device_info_runtime_init(struct drm_i915_private *dev_priv)
 	DRM_DEBUG_DRIVER("slice total: %u\n", hweight8(info->sseu.slice_mask));
 	DRM_DEBUG_DRIVER("subslice total: %u\n",
 			 sseu_subslice_total(&info->sseu));
-	DRM_DEBUG_DRIVER("subslice mask %04x\n", info->sseu.subslice_mask);
-	DRM_DEBUG_DRIVER("subslice per slice: %u\n",
-			 hweight8(info->sseu.subslice_mask));
+	for (s = 0; s < ARRAY_SIZE(info->sseu.subslices_mask); s++) {
+		DRM_DEBUG_DRIVER("subslice mask %04x\n", info->sseu.subslices_mask[s]);
+		DRM_DEBUG_DRIVER("subslice per slice: %u\n",
+				 hweight8(info->sseu.subslices_mask[s]));
+	}
 	DRM_DEBUG_DRIVER("EU total: %u\n", info->sseu.eu_total);
 	DRM_DEBUG_DRIVER("EU per subslice: %u\n", info->sseu.eu_per_subslice);
 	DRM_DEBUG_DRIVER("has slice power gating: %s\n",
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 6840ec8db037..98b65cb6199a 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -2037,7 +2037,7 @@ make_rpcs(struct drm_i915_private *dev_priv)
 
 	if (INTEL_INFO(dev_priv)->sseu.has_subslice_pg) {
 		rpcs |= GEN8_RPCS_SS_CNT_ENABLE;
-		rpcs |= hweight8(INTEL_INFO(dev_priv)->sseu.subslice_mask) <<
+		rpcs |= hweight8(INTEL_INFO(dev_priv)->sseu.subslices_mask[0]) <<
 			GEN8_RPCS_SS_CNT_SHIFT;
 		rpcs |= GEN8_RPCS_ENABLE;
 	}
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 1106904f6e31..876704e7c17d 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -99,7 +99,7 @@ hangcheck_action_to_str(const enum intel_engine_hangcheck_action a)
 
 #define instdone_subslice_mask(dev_priv__) \
 	(INTEL_GEN(dev_priv__) == 7 ? \
-	 1 : INTEL_INFO(dev_priv__)->sseu.subslice_mask)
+	 1 : INTEL_INFO(dev_priv__)->sseu.subslices_mask[0])
 
 #define for_each_instdone_slice_subslice(dev_priv__, slice__, subslice__) \
 	for ((slice__) = 0, (subslice__) = 0; \
-- 
2.15.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 3/4] drm/i915/debugfs: reuse max slice/subslices already stored in sseu
  2017-11-08 16:22 [PATCH 0/4] drm/i915: introduce query information Lionel Landwerlin
  2017-11-08 16:22 ` [PATCH 1/4] drm/i915: introduce query info uAPI Lionel Landwerlin
  2017-11-08 16:22 ` [PATCH 2/4] drm/i915: store all subslice masks Lionel Landwerlin
@ 2017-11-08 16:22 ` Lionel Landwerlin
  2017-11-08 16:22 ` [PATCH 4/4] drm/i915: expose rcs topology through discovery uAPI Lionel Landwerlin
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 19+ messages in thread
From: Lionel Landwerlin @ 2017-11-08 16:22 UTC (permalink / raw)
  To: intel-gfx

Now that we have that information in topology fields, let's just reused it.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
---
 drivers/gpu/drm/i915/i915_debugfs.c | 26 ++++++++++----------------
 1 file changed, 10 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 4982e92f0197..c6784e5422ce 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -4454,11 +4454,11 @@ static void gen10_sseu_device_status(struct drm_i915_private *dev_priv,
 				     struct sseu_dev_info *sseu)
 {
 	const struct intel_device_info *info = INTEL_INFO(dev_priv);
-	int s_max = 6, ss_max = 4;
 	int s, ss;
-	u32 s_reg[s_max], eu_reg[2 * s_max], eu_mask[2];
+	u32 s_reg[info->sseu.max_slices],
+		eu_reg[2 * info->sseu.max_subslices], eu_mask[2];
 
-	for (s = 0; s < s_max; s++) {
+	for (s = 0; s < info->sseu.max_slices; s++) {
 		/*
 		 * FIXME: Valid SS Mask respects the spec and read
 		 * only valid bits for those registers, excluding reserverd
@@ -4480,7 +4480,7 @@ static void gen10_sseu_device_status(struct drm_i915_private *dev_priv,
 		     GEN9_PGCTL_SSB_EU210_ACK |
 		     GEN9_PGCTL_SSB_EU311_ACK;
 
-	for (s = 0; s < s_max; s++) {
+	for (s = 0; s < info->sseu.max_slices; s++) {
 		if ((s_reg[s] & GEN9_PGCTL_SLICE_ACK) == 0)
 			/* skip disabled slice */
 			continue;
@@ -4488,7 +4488,7 @@ static void gen10_sseu_device_status(struct drm_i915_private *dev_priv,
 		sseu->slice_mask |= BIT(s);
 		sseu->subslices_mask[s] = info->sseu.subslices_mask[s];
 
-		for (ss = 0; ss < ss_max; ss++) {
+		for (ss = 0; ss < info->sseu.max_subslices; ss++) {
 			unsigned int eu_cnt;
 
 			if (!(s_reg[s] & (GEN9_PGCTL_SS_ACK(ss))))
@@ -4508,17 +4508,11 @@ static void gen10_sseu_device_status(struct drm_i915_private *dev_priv,
 static void gen9_sseu_device_status(struct drm_i915_private *dev_priv,
 				    struct sseu_dev_info *sseu)
 {
-	int s_max = 3, ss_max = 4;
+	const struct intel_device_info *info = INTEL_INFO(dev_priv);
 	int s, ss;
-	u32 s_reg[s_max], eu_reg[2*s_max], eu_mask[2];
-
-	/* BXT has a single slice and at most 3 subslices. */
-	if (IS_GEN9_LP(dev_priv)) {
-		s_max = 1;
-		ss_max = 3;
-	}
+	u32 s_reg[info->sseu.max_slices], eu_reg[2*info->sseu.max_subslices], eu_mask[2];
 
-	for (s = 0; s < s_max; s++) {
+	for (s = 0; s < info->sseu.max_slices; s++) {
 		s_reg[s] = I915_READ(GEN9_SLICE_PGCTL_ACK(s));
 		eu_reg[2*s] = I915_READ(GEN9_SS01_EU_PGCTL_ACK(s));
 		eu_reg[2*s + 1] = I915_READ(GEN9_SS23_EU_PGCTL_ACK(s));
@@ -4533,7 +4527,7 @@ static void gen9_sseu_device_status(struct drm_i915_private *dev_priv,
 		     GEN9_PGCTL_SSB_EU210_ACK |
 		     GEN9_PGCTL_SSB_EU311_ACK;
 
-	for (s = 0; s < s_max; s++) {
+	for (s = 0; s < info->sseu.max_slices; s++) {
 		if ((s_reg[s] & GEN9_PGCTL_SLICE_ACK) == 0)
 			/* skip disabled slice */
 			continue;
@@ -4544,7 +4538,7 @@ static void gen9_sseu_device_status(struct drm_i915_private *dev_priv,
 			sseu->subslices_mask[s] =
 				INTEL_INFO(dev_priv)->sseu.subslices_mask[s];
 
-		for (ss = 0; ss < ss_max; ss++) {
+		for (ss = 0; ss < info->sseu.max_subslices; ss++) {
 			unsigned int eu_cnt;
 
 			if (IS_GEN9_LP(dev_priv)) {
-- 
2.15.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 4/4] drm/i915: expose rcs topology through discovery uAPI
  2017-11-08 16:22 [PATCH 0/4] drm/i915: introduce query information Lionel Landwerlin
                   ` (2 preceding siblings ...)
  2017-11-08 16:22 ` [PATCH 3/4] drm/i915/debugfs: reuse max slice/subslices already stored in sseu Lionel Landwerlin
@ 2017-11-08 16:22 ` Lionel Landwerlin
  2017-11-09 17:34   ` Tvrtko Ursulin
  2017-11-08 16:43 ` ✓ Fi.CI.BAT: success for drm/i915: introduce query information Patchwork
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 19+ messages in thread
From: Lionel Landwerlin @ 2017-11-08 16:22 UTC (permalink / raw)
  To: intel-gfx

With the introduction of asymetric slices in CNL, we cannot rely on
the previous SUBSLICE_MASK getparam. Here we introduce a more detailed
way of querying the Gen's GPU topology that doesn't aggregate numbers.

This is essential for monitoring parts of the GPU with the OA unit,
because counters need to be normalized to the number of
EUs/subslices/slices. The current aggregated numbers like EU_TOTAL do
not gives us sufficient information.

This change introduce a new way to query properties of the GPU, making
room for new queries (some media related topology could be exposed in
the future).

As a bonus we can draw representations of the GPU :

        https://imgur.com/a/vuqpa

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
---
 drivers/gpu/drm/i915/intel_query_info.c | 64 +++++++++++++++++++++++++++++++++
 include/uapi/drm/i915_drm.h             | 55 ++++++++++++++++++++++++++++
 2 files changed, 119 insertions(+)

diff --git a/drivers/gpu/drm/i915/intel_query_info.c b/drivers/gpu/drm/i915/intel_query_info.c
index 21434c393582..45c5b29e0988 100644
--- a/drivers/gpu/drm/i915/intel_query_info.c
+++ b/drivers/gpu/drm/i915/intel_query_info.c
@@ -25,6 +25,68 @@
 #include "i915_drv.h"
 #include <uapi/drm/i915_drm.h>
 
+static int query_info_rcs_topology(struct drm_i915_private *dev_priv,
+				   struct drm_i915_query_info *args)
+{
+	const struct sseu_dev_info *sseu = &INTEL_INFO(dev_priv)->sseu;
+	struct drm_i915_rcs_topology_info __user *user_topology =
+		u64_to_user_ptr(args->info_ptr);
+	struct drm_i915_rcs_topology_info topology;
+	u32 data_size, total_size;
+	const u8 *data = NULL;
+	int ret;
+
+	/* Not supported on gen < 8. */
+	if (sseu->max_slices == 0)
+		return -ENODEV;
+
+	switch (args->query_params[0]) {
+	case I915_RCS_TOPOLOGY_SLICE:
+		topology.params[0] = sseu->max_slices;
+		data_size = sizeof(sseu->slice_mask);
+		data = &sseu->slice_mask;
+		break;
+
+	case I915_RCS_TOPOLOGY_SUBSLICE:
+		topology.params[0] = sseu->max_slices;
+		topology.params[1] = ALIGN(sseu->max_subslices, 8) / 8;
+		data_size = sseu->max_slices * topology.params[1];
+		data = sseu->subslices_mask;
+		break;
+
+	case I915_RCS_TOPOLOGY_EU:
+		topology.params[2] = ALIGN(sseu->max_eus_per_subslice, 8) / 8;
+		topology.params[1] = sseu->max_subslices * topology.params[2];
+		topology.params[0] = sseu->max_slices;
+		data_size = sseu->max_slices * topology.params[1];
+		data = sseu->eu_mask;
+		break;
+
+	default:
+		return -EINVAL;
+	}
+
+	total_size = sizeof(topology) + data_size;
+
+	if (args->info_ptr_len == 0) {
+		args->info_ptr_len = total_size;
+		return 0;
+	}
+
+	if (args->info_ptr_len < total_size)
+		return -EINVAL;
+
+	ret = copy_to_user(user_topology, &topology, sizeof(topology));
+	if (ret)
+		return -EFAULT;
+
+	ret = copy_to_user(user_topology + 1, data, data_size);
+	if (ret)
+		return -EFAULT;
+
+	return 0;
+}
+
 static u8 user_class_map[I915_ENGINE_CLASS_MAX] = {
 	[I915_ENGINE_CLASS_OTHER] = OTHER_CLASS,
 	[I915_ENGINE_CLASS_RENDER] = RENDER_CLASS,
@@ -112,6 +174,8 @@ int i915_query_info_ioctl(struct drm_device *dev, void *data,
 	switch (args->query) {
 	case I915_QUERY_INFO_ENGINE:
 		return query_info_engine(dev_priv, args);
+	case I915_QUERY_INFO_RCS_TOPOLOGY:
+		return query_info_rcs_topology(dev_priv, args);
 	default:
 		return -EINVAL;
 	}
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index c6026616300e..5b6a8995a948 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -1569,6 +1569,61 @@ struct drm_i915_engine_info {
 	__u8 rsvd2[6];
 };
 
+/* Query RCS topology.
+ *
+ * drm_i915_query_info.query_params[0] should be set to one of the
+ * I915_RCS_TOPOLOGY_* define.
+ *
+ * drm_i915_gem_query_info.info_ptr will be written to with
+ * drm_i915_rcs_topology_info.
+ */
+#define I915_QUERY_INFO_RCS_TOPOLOGY	1 /* version 1 */
+
+/* Query RCS slice topology
+ *
+ * The meaning of the drm_i915_rcs_topology_info fields is :
+ *
+ * params[0]: number of slices
+ *
+ * data: Each bit indicates whether a slice is available (1) or fused off (0).
+ * Formula to tell if slice X is available :
+ *
+ *         (data[X / 8] >> (X % 8)) & 1
+ */
+#define I915_RCS_TOPOLOGY_SLICE		0 /* version 1 */
+/* Query RCS subslice topology
+ *
+ * The meaning of the drm_i915_rcs_topology_info fields is :
+ *
+ * params[0]: number of slices
+ * params[1]: slice stride
+ *
+ * data: each bit indicates whether a subslice is available (1) or fused off
+ * (0). Formula to tell if slice X subslice Y is available :
+ *
+ *         (data[(X * params[1]) + Y / 8] >> (Y % 8)) & 1
+ */
+#define I915_RCS_TOPOLOGY_SUBSLICE	1 /* version 1 */
+/* Query RCS EU topology
+ *
+ * The meaning of the drm_i915_rcs_topology_info fields is :
+ *
+ * params[0]: number of slices
+ * params[1]: slice stride
+ * params[2]: subslice stride
+ *
+ * data: Each bit indicates whether a subslice is available (1) or fused off
+ * (0). Formula to tell if slice X subslice Y eu Z is available :
+ *
+ *         (data[X * params[1] + Y * params[2] + Z / 8] >> (Z % 8)) & 1
+ */
+#define I915_RCS_TOPOLOGY_EU		2 /* version 1 */
+
+struct drm_i915_rcs_topology_info {
+	__u32 params[6];
+
+	__u8 data[];
+};
 
 struct drm_i915_query_info {
 	/* in/out: Protocol version requested/supported. When set to 0, the
-- 
2.15.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* ✓ Fi.CI.BAT: success for drm/i915: introduce query information
  2017-11-08 16:22 [PATCH 0/4] drm/i915: introduce query information Lionel Landwerlin
                   ` (3 preceding siblings ...)
  2017-11-08 16:22 ` [PATCH 4/4] drm/i915: expose rcs topology through discovery uAPI Lionel Landwerlin
@ 2017-11-08 16:43 ` Patchwork
  2017-11-08 20:04 ` ✓ Fi.CI.IGT: " Patchwork
  2017-11-09 11:49 ` [PATCH 0/4] " Lionel Landwerlin
  6 siblings, 0 replies; 19+ messages in thread
From: Patchwork @ 2017-11-08 16:43 UTC (permalink / raw)
  To: Lionel Landwerlin; +Cc: intel-gfx

== Series Details ==

Series: drm/i915: introduce query information
URL   : https://patchwork.freedesktop.org/series/33436/
State : success

== Summary ==

Series 33436v1 drm/i915: introduce query information
https://patchwork.freedesktop.org/api/1.0/series/33436/revisions/1/mbox/

Test gem_sync:
        Subgroup basic-store-all:
                pass       -> FAIL       (fi-ivb-3520m) fdo#100007

fdo#100007 https://bugs.freedesktop.org/show_bug.cgi?id=100007

fi-bdw-5557u     total:289  pass:268  dwarn:0   dfail:0   fail:0   skip:21  time:444s
fi-bdw-gvtdvm    total:289  pass:265  dwarn:0   dfail:0   fail:0   skip:24  time:454s
fi-blb-e6850     total:289  pass:223  dwarn:1   dfail:0   fail:0   skip:65  time:383s
fi-bsw-n3050     total:289  pass:243  dwarn:0   dfail:0   fail:0   skip:46  time:547s
fi-bwr-2160      total:289  pass:183  dwarn:0   dfail:0   fail:0   skip:106 time:275s
fi-bxt-dsi       total:289  pass:259  dwarn:0   dfail:0   fail:0   skip:30  time:501s
fi-bxt-j4205     total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:502s
fi-byt-j1900     total:289  pass:254  dwarn:0   dfail:0   fail:0   skip:35  time:492s
fi-byt-n2820     total:289  pass:250  dwarn:0   dfail:0   fail:0   skip:39  time:501s
fi-elk-e7500     total:289  pass:229  dwarn:0   dfail:0   fail:0   skip:60  time:424s
fi-gdg-551       total:289  pass:178  dwarn:1   dfail:0   fail:1   skip:109 time:263s
fi-glk-1         total:289  pass:261  dwarn:0   dfail:0   fail:0   skip:28  time:534s
fi-hsw-4770      total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:431s
fi-hsw-4770r     total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:440s
fi-ilk-650       total:289  pass:228  dwarn:0   dfail:0   fail:0   skip:61  time:423s
fi-ivb-3520m     total:289  pass:259  dwarn:0   dfail:0   fail:1   skip:29  time:482s
fi-ivb-3770      total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:462s
fi-kbl-7500u     total:289  pass:264  dwarn:1   dfail:0   fail:0   skip:24  time:484s
fi-kbl-7560u     total:289  pass:270  dwarn:0   dfail:0   fail:0   skip:19  time:519s
fi-kbl-7567u     total:289  pass:269  dwarn:0   dfail:0   fail:0   skip:20  time:470s
fi-kbl-r         total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:532s
fi-pnv-d510      total:289  pass:222  dwarn:1   dfail:0   fail:0   skip:66  time:569s
fi-skl-6260u     total:289  pass:269  dwarn:0   dfail:0   fail:0   skip:20  time:466s
fi-skl-6600u     total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:542s
fi-skl-6700hq    total:289  pass:263  dwarn:0   dfail:0   fail:0   skip:26  time:563s
fi-skl-6700k     total:289  pass:265  dwarn:0   dfail:0   fail:0   skip:24  time:529s
fi-skl-6770hq    total:289  pass:269  dwarn:0   dfail:0   fail:0   skip:20  time:499s
fi-skl-gvtdvm    total:289  pass:266  dwarn:0   dfail:0   fail:0   skip:23  time:456s
fi-snb-2520m     total:289  pass:250  dwarn:0   dfail:0   fail:0   skip:39  time:561s
fi-snb-2600      total:289  pass:249  dwarn:0   dfail:0   fail:0   skip:40  time:423s
Blacklisted hosts:
fi-cnl-y         total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:557s
fi-glk-dsi       total:289  pass:258  dwarn:0   dfail:0   fail:1   skip:30  time:491s

087c404bd6d56a52e0656ac7c79faa376c25b796 drm-tip: 2017y-11m-08d-15h-44m-06s UTC integration manifest
625f19cc2af8 drm/i915: expose rcs topology through discovery uAPI
b290818a8573 drm/i915/debugfs: reuse max slice/subslices already stored in sseu
444f02ffa59c drm/i915: store all subslice masks
88d12e0504a2 drm/i915: introduce query info uAPI

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_7016/
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 19+ messages in thread

* ✓ Fi.CI.IGT: success for drm/i915: introduce query information
  2017-11-08 16:22 [PATCH 0/4] drm/i915: introduce query information Lionel Landwerlin
                   ` (4 preceding siblings ...)
  2017-11-08 16:43 ` ✓ Fi.CI.BAT: success for drm/i915: introduce query information Patchwork
@ 2017-11-08 20:04 ` Patchwork
  2017-11-09 11:49 ` [PATCH 0/4] " Lionel Landwerlin
  6 siblings, 0 replies; 19+ messages in thread
From: Patchwork @ 2017-11-08 20:04 UTC (permalink / raw)
  To: Lionel Landwerlin; +Cc: intel-gfx

== Series Details ==

Series: drm/i915: introduce query information
URL   : https://patchwork.freedesktop.org/series/33436/
State : success

== Summary ==

Test perf:
        Subgroup blocking:
                fail       -> PASS       (shard-hsw) fdo#102252 +1

fdo#102252 https://bugs.freedesktop.org/show_bug.cgi?id=102252

shard-hsw        total:2540 pass:1432 dwarn:1   dfail:0   fail:10  skip:1097 time:9239s

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_7016/shards.html
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 0/4] drm/i915: introduce query information
  2017-11-08 16:22 [PATCH 0/4] drm/i915: introduce query information Lionel Landwerlin
                   ` (5 preceding siblings ...)
  2017-11-08 20:04 ` ✓ Fi.CI.IGT: " Patchwork
@ 2017-11-09 11:49 ` Lionel Landwerlin
  6 siblings, 0 replies; 19+ messages in thread
From: Lionel Landwerlin @ 2017-11-09 11:49 UTC (permalink / raw)
  To: intel-gfx

With the GPUTop bits wired, here are some mugshots of a set of NUCs I 
have running nearby (KBL/SKL/APL) :

https://imgur.com/a/uHzJc

On 08/11/17 16:22, Lionel Landwerlin wrote:
> Hi,
>
> This series is based off work that Tvrtko started, initially for
> exposing the engines available to userspace.
>
> I've added the topology information I would like to expose for
> normalizing the performance counters.
>
> Cheers,
>
> Lionel Landwerlin (3):
>    drm/i915: store all subslice masks
>    drm/i915/debugfs: reuse max slice/subslices already stored in sseu
>    drm/i915: expose rcs topology through discovery uAPI
>
> Tvrtko Ursulin (1):
>    drm/i915: introduce query info uAPI
>
>   drivers/gpu/drm/i915/Makefile            |   1 +
>   drivers/gpu/drm/i915/i915_debugfs.c      |  50 ++++-----
>   drivers/gpu/drm/i915/i915_drv.c          |   3 +-
>   drivers/gpu/drm/i915/i915_drv.h          |  26 ++++-
>   drivers/gpu/drm/i915/intel_device_info.c | 171 +++++++++++++++++++++--------
>   drivers/gpu/drm/i915/intel_engine_cs.c   |   1 +
>   drivers/gpu/drm/i915/intel_lrc.c         |   2 +-
>   drivers/gpu/drm/i915/intel_query_info.c  | 182 +++++++++++++++++++++++++++++++
>   drivers/gpu/drm/i915/intel_ringbuffer.h  |   2 +-
>   include/uapi/drm/i915_drm.h              | 119 ++++++++++++++++++++
>   10 files changed, 479 insertions(+), 78 deletions(-)
>   create mode 100644 drivers/gpu/drm/i915/intel_query_info.c
>
> --
> 2.15.0
>

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 1/4] drm/i915: introduce query info uAPI
  2017-11-08 16:22 ` [PATCH 1/4] drm/i915: introduce query info uAPI Lionel Landwerlin
@ 2017-11-09 15:57   ` Joonas Lahtinen
  2017-11-09 16:15     ` Lionel Landwerlin
  0 siblings, 1 reply; 19+ messages in thread
From: Joonas Lahtinen @ 2017-11-09 15:57 UTC (permalink / raw)
  To: Lionel Landwerlin, intel-gfx; +Cc: Ben Widawsky, Daniel Vetter

On Wed, 2017-11-08 at 16:22 +0000, Lionel Landwerlin wrote:
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> Query info uAPI allows userspace to probe for a number of properties
> of the GPU. This partially removes the need for userspace to maintain
> the internal PCI id based database (as well as code duplication across
> userspace components).
> 
> This first version enables engine configuration and features to be
> queried.
> 
> Probing is done via the new DRM_IOCTL_I915_QUERY_INFO ioctl which
> takes a query ID as well as query arguments.
> 
> Currently only general engine configuration and HEVC feature of the
> VCS engine can be probed but the uAPI is designed to be generic and
> extensible.
> 
> Code is based almost exactly on the earlier proposal on the topic by
> Jon Bloomfield. Engine class and instance refactoring made recently by
> Daniele Ceraolo Spurio enabled this to be implemented in an elegant
> fashion.
> 
> To probe configuration userspace sets the engine class it wants to
> query (struct drm_i915_gem_engine_info) and provides an array of
> drm_i915_engine_info structs which will be filled in by the driver.
> Userspace also has to tell i915 how many elements are in the array,
> and the driver will report back the total number of engine instances
> in any case.
> 
> Pseudo-code example of userspace using the uAPI:
> 
>   struct drm_i915_query_info info = {};
>   struct drm_i915_engine_info *engines;
>   int i, ret;
> 
>   info.version = 1;
>   info.query = I915_QUERY_INFO_ENGINE;
>   info.query_params[0] = I915_ENGINE_CLASS_VIDEO;
>   info.info_ptr_len = 0;
>   ret = ioctl(..., &info);
>   assert(ret == 0);
> 
>   engines = malloc(info.info_ptr_len);
>   info.info_ptr = to_user_pointer(engines);
>   ret = ioctl(..., &info);
>   assert(ret == 0);
> 
>   for (i = 0; i < info.info_ptr / sizeof(*engines); i++)
>       printf("VCS%u: flags=%x\n",
>              engines[i].instance,
>              engines[i].info);
> 
> First time with info.info_ptr_len set to zero, so the kernel reports
> back the size of the data to be written into info.info_ptr. Then a
> second time with the info.info_ptr_len field set to the correct
> length.
> 
> v2:
>  * Add a version field and consolidate to one engine count.
>    (Chris Wilson)
>  * Rename uAPI flags for VCS engines to DRM_I915_ENGINE_CLASS_VIDEO.
>    (Gong Zhipeng)
> 
> v3:
>  * Drop the DRM_ prefix from uAPI enums. (Chris Wilson)
>  * Only reserve 8-bits for the engine info for time being.
> 
> v4:
>  * Made user_class_map static.
> 
> v5:
>  * Example usage added to commit msg. (Chris Wilson)
>  * Report engine count in case of version mismatch. (Chris Wilson)
> 
> v6:
>  * Return API to query_info to make it more generic (Lionel)
>  * Add query ID & query params (Lionel)
> 
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
> Cc: Ben Widawsky <ben@bwidawsk.net>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Daniel Vetter <daniel.vetter@intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: Jon Bloomfield <jon.bloomfield@intel.com>
> Cc: "Rogozhkin, Dmitry V" <dmitry.v.rogozhkin@intel.com>
> Cc: Oscar Mateo <oscar.mateo@intel.com>
> Cc: "Gong, Zhipeng" <zhipeng.gong@intel.com>

<SNIP>

> +++ b/drivers/gpu/drm/i915/i915_drv.c
> @@ -2776,6 +2776,7 @@ static const struct drm_ioctl_desc i915_ioctls[] = {
>  	DRM_IOCTL_DEF_DRV(I915_PERF_OPEN, i915_perf_open_ioctl, DRM_RENDER_ALLOW),
>  	DRM_IOCTL_DEF_DRV(I915_PERF_ADD_CONFIG, i915_perf_add_config_ioctl, DRM_UNLOCKED|DRM_RENDER_ALLOW),
>  	DRM_IOCTL_DEF_DRV(I915_PERF_REMOVE_CONFIG, i915_perf_remove_config_ioctl, DRM_UNLOCKED|DRM_RENDER_ALLOW),
> +	DRM_IOCTL_DEF_DRV(I915_QUERY_INFO, i915_query_info_ioctl, DRM_RENDER_ALLOW),

I'm not a fan. This is bit much to the direction of I915_META_IOCTL.

So can we keep this as engine info query without versioning, and if you
need to query something else, lets have another ioctl for that.

(I heard a rumour of this being about RCS topology, which could be in
the engine info.).

Regards, Joonas
-- 
Joonas Lahtinen
Open Source Technology Center
Intel Corporation
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 1/4] drm/i915: introduce query info uAPI
  2017-11-09 15:57   ` Joonas Lahtinen
@ 2017-11-09 16:15     ` Lionel Landwerlin
  2017-11-09 17:10       ` Tvrtko Ursulin
  0 siblings, 1 reply; 19+ messages in thread
From: Lionel Landwerlin @ 2017-11-09 16:15 UTC (permalink / raw)
  To: Joonas Lahtinen, intel-gfx; +Cc: Ben Widawsky, Daniel Vetter

On 09/11/17 15:57, Joonas Lahtinen wrote:
> On Wed, 2017-11-08 at 16:22 +0000, Lionel Landwerlin wrote:
>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>
>> Query info uAPI allows userspace to probe for a number of properties
>> of the GPU. This partially removes the need for userspace to maintain
>> the internal PCI id based database (as well as code duplication across
>> userspace components).
>>
>> This first version enables engine configuration and features to be
>> queried.
>>
>> Probing is done via the new DRM_IOCTL_I915_QUERY_INFO ioctl which
>> takes a query ID as well as query arguments.
>>
>> Currently only general engine configuration and HEVC feature of the
>> VCS engine can be probed but the uAPI is designed to be generic and
>> extensible.
>>
>> Code is based almost exactly on the earlier proposal on the topic by
>> Jon Bloomfield. Engine class and instance refactoring made recently by
>> Daniele Ceraolo Spurio enabled this to be implemented in an elegant
>> fashion.
>>
>> To probe configuration userspace sets the engine class it wants to
>> query (struct drm_i915_gem_engine_info) and provides an array of
>> drm_i915_engine_info structs which will be filled in by the driver.
>> Userspace also has to tell i915 how many elements are in the array,
>> and the driver will report back the total number of engine instances
>> in any case.
>>
>> Pseudo-code example of userspace using the uAPI:
>>
>>    struct drm_i915_query_info info = {};
>>    struct drm_i915_engine_info *engines;
>>    int i, ret;
>>
>>    info.version = 1;
>>    info.query = I915_QUERY_INFO_ENGINE;
>>    info.query_params[0] = I915_ENGINE_CLASS_VIDEO;
>>    info.info_ptr_len = 0;
>>    ret = ioctl(..., &info);
>>    assert(ret == 0);
>>
>>    engines = malloc(info.info_ptr_len);
>>    info.info_ptr = to_user_pointer(engines);
>>    ret = ioctl(..., &info);
>>    assert(ret == 0);
>>
>>    for (i = 0; i < info.info_ptr / sizeof(*engines); i++)
>>        printf("VCS%u: flags=%x\n",
>>               engines[i].instance,
>>               engines[i].info);
>>
>> First time with info.info_ptr_len set to zero, so the kernel reports
>> back the size of the data to be written into info.info_ptr. Then a
>> second time with the info.info_ptr_len field set to the correct
>> length.
>>
>> v2:
>>   * Add a version field and consolidate to one engine count.
>>     (Chris Wilson)
>>   * Rename uAPI flags for VCS engines to DRM_I915_ENGINE_CLASS_VIDEO.
>>     (Gong Zhipeng)
>>
>> v3:
>>   * Drop the DRM_ prefix from uAPI enums. (Chris Wilson)
>>   * Only reserve 8-bits for the engine info for time being.
>>
>> v4:
>>   * Made user_class_map static.
>>
>> v5:
>>   * Example usage added to commit msg. (Chris Wilson)
>>   * Report engine count in case of version mismatch. (Chris Wilson)
>>
>> v6:
>>   * Return API to query_info to make it more generic (Lionel)
>>   * Add query ID & query params (Lionel)
>>
>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>> Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
>> Cc: Ben Widawsky <ben@bwidawsk.net>
>> Cc: Chris Wilson <chris@chris-wilson.co.uk>
>> Cc: Daniel Vetter <daniel.vetter@intel.com>
>> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
>> Cc: Jon Bloomfield <jon.bloomfield@intel.com>
>> Cc: "Rogozhkin, Dmitry V" <dmitry.v.rogozhkin@intel.com>
>> Cc: Oscar Mateo <oscar.mateo@intel.com>
>> Cc: "Gong, Zhipeng" <zhipeng.gong@intel.com>
> <SNIP>
>
>> +++ b/drivers/gpu/drm/i915/i915_drv.c
>> @@ -2776,6 +2776,7 @@ static const struct drm_ioctl_desc i915_ioctls[] = {
>>   	DRM_IOCTL_DEF_DRV(I915_PERF_OPEN, i915_perf_open_ioctl, DRM_RENDER_ALLOW),
>>   	DRM_IOCTL_DEF_DRV(I915_PERF_ADD_CONFIG, i915_perf_add_config_ioctl, DRM_UNLOCKED|DRM_RENDER_ALLOW),
>>   	DRM_IOCTL_DEF_DRV(I915_PERF_REMOVE_CONFIG, i915_perf_remove_config_ioctl, DRM_UNLOCKED|DRM_RENDER_ALLOW),
>> +	DRM_IOCTL_DEF_DRV(I915_QUERY_INFO, i915_query_info_ioctl, DRM_RENDER_ALLOW),
> I'm not a fan. This is bit much to the direction of I915_META_IOCTL.

So you would prefer a different ioctl for every bit of information we 
would want to query from the kernel?

>
> So can we keep this as engine info query without versioning, and if you
> need to query something else, lets have another ioctl for that.
>
> (I heard a rumour of this being about RCS topology, which could be in
> the engine info.).

I'm not sure to understand what you mean by "in the engine info". Would 
you mind to give an example?

>
> Regards, Joonas


_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 1/4] drm/i915: introduce query info uAPI
  2017-11-09 16:15     ` Lionel Landwerlin
@ 2017-11-09 17:10       ` Tvrtko Ursulin
  0 siblings, 0 replies; 19+ messages in thread
From: Tvrtko Ursulin @ 2017-11-09 17:10 UTC (permalink / raw)
  To: Lionel Landwerlin, Joonas Lahtinen, intel-gfx; +Cc: Daniel Vetter, Ben Widawsky


On 09/11/2017 16:15, Lionel Landwerlin wrote:
> On 09/11/17 15:57, Joonas Lahtinen wrote:
>> On Wed, 2017-11-08 at 16:22 +0000, Lionel Landwerlin wrote:
>>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>>
>>> Query info uAPI allows userspace to probe for a number of properties
>>> of the GPU. This partially removes the need for userspace to maintain
>>> the internal PCI id based database (as well as code duplication across
>>> userspace components).
>>>
>>> This first version enables engine configuration and features to be
>>> queried.
>>>
>>> Probing is done via the new DRM_IOCTL_I915_QUERY_INFO ioctl which
>>> takes a query ID as well as query arguments.
>>>
>>> Currently only general engine configuration and HEVC feature of the
>>> VCS engine can be probed but the uAPI is designed to be generic and
>>> extensible.
>>>
>>> Code is based almost exactly on the earlier proposal on the topic by
>>> Jon Bloomfield. Engine class and instance refactoring made recently by
>>> Daniele Ceraolo Spurio enabled this to be implemented in an elegant
>>> fashion.
>>>
>>> To probe configuration userspace sets the engine class it wants to
>>> query (struct drm_i915_gem_engine_info) and provides an array of
>>> drm_i915_engine_info structs which will be filled in by the driver.
>>> Userspace also has to tell i915 how many elements are in the array,
>>> and the driver will report back the total number of engine instances
>>> in any case.
>>>
>>> Pseudo-code example of userspace using the uAPI:
>>>
>>>    struct drm_i915_query_info info = {};
>>>    struct drm_i915_engine_info *engines;
>>>    int i, ret;
>>>
>>>    info.version = 1;
>>>    info.query = I915_QUERY_INFO_ENGINE;
>>>    info.query_params[0] = I915_ENGINE_CLASS_VIDEO;
>>>    info.info_ptr_len = 0;
>>>    ret = ioctl(..., &info);
>>>    assert(ret == 0);
>>>
>>>    engines = malloc(info.info_ptr_len);
>>>    info.info_ptr = to_user_pointer(engines);
>>>    ret = ioctl(..., &info);
>>>    assert(ret == 0);
>>>
>>>    for (i = 0; i < info.info_ptr / sizeof(*engines); i++)
>>>        printf("VCS%u: flags=%x\n",
>>>               engines[i].instance,
>>>               engines[i].info);
>>>
>>> First time with info.info_ptr_len set to zero, so the kernel reports
>>> back the size of the data to be written into info.info_ptr. Then a
>>> second time with the info.info_ptr_len field set to the correct
>>> length.
>>>
>>> v2:
>>>   * Add a version field and consolidate to one engine count.
>>>     (Chris Wilson)
>>>   * Rename uAPI flags for VCS engines to DRM_I915_ENGINE_CLASS_VIDEO.
>>>     (Gong Zhipeng)
>>>
>>> v3:
>>>   * Drop the DRM_ prefix from uAPI enums. (Chris Wilson)
>>>   * Only reserve 8-bits for the engine info for time being.
>>>
>>> v4:
>>>   * Made user_class_map static.
>>>
>>> v5:
>>>   * Example usage added to commit msg. (Chris Wilson)
>>>   * Report engine count in case of version mismatch. (Chris Wilson)
>>>
>>> v6:
>>>   * Return API to query_info to make it more generic (Lionel)
>>>   * Add query ID & query params (Lionel)
>>>
>>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>> Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
>>> Cc: Ben Widawsky <ben@bwidawsk.net>
>>> Cc: Chris Wilson <chris@chris-wilson.co.uk>
>>> Cc: Daniel Vetter <daniel.vetter@intel.com>
>>> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
>>> Cc: Jon Bloomfield <jon.bloomfield@intel.com>
>>> Cc: "Rogozhkin, Dmitry V" <dmitry.v.rogozhkin@intel.com>
>>> Cc: Oscar Mateo <oscar.mateo@intel.com>
>>> Cc: "Gong, Zhipeng" <zhipeng.gong@intel.com>
>> <SNIP>
>>
>>> +++ b/drivers/gpu/drm/i915/i915_drv.c
>>> @@ -2776,6 +2776,7 @@ static const struct drm_ioctl_desc 
>>> i915_ioctls[] = {
>>>       DRM_IOCTL_DEF_DRV(I915_PERF_OPEN, i915_perf_open_ioctl, 
>>> DRM_RENDER_ALLOW),
>>>       DRM_IOCTL_DEF_DRV(I915_PERF_ADD_CONFIG, 
>>> i915_perf_add_config_ioctl, DRM_UNLOCKED|DRM_RENDER_ALLOW),
>>>       DRM_IOCTL_DEF_DRV(I915_PERF_REMOVE_CONFIG, 
>>> i915_perf_remove_config_ioctl, DRM_UNLOCKED|DRM_RENDER_ALLOW),
>>> +    DRM_IOCTL_DEF_DRV(I915_QUERY_INFO, i915_query_info_ioctl, 
>>> DRM_RENDER_ALLOW),
>> I'm not a fan. This is bit much to the direction of I915_META_IOCTL.
> 
> So you would prefer a different ioctl for every bit of information we 
> would want to query from the kernel?
> 
>>
>> So can we keep this as engine info query without versioning, and if you
>> need to query something else, lets have another ioctl for that.
>>
>> (I heard a rumour of this being about RCS topology, which could be in
>> the engine info.).
> 
> I'm not sure to understand what you mean by "in the engine info". Would 
> you mind to give an example?

We chatted about this internally a bit and no one seems to like a 
multiplexer within a multiplexer approach.

So one option is a completely separate ioctl, but before that the 
question is if everything you need to export is called RCS topology, so 
logically belongs to the render engine, could it be exported under the 
engine info name after all?

Like maybe adding drm_i915_rcs_engine_info, containing all the extra 
fields you need on top of the existing info, to be returned when the 
render class is queried.

One problem I spotted is that your new data is only valid on gen8+. So 
that would mean having a flag saying what data is valid.

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 4/4] drm/i915: expose rcs topology through discovery uAPI
  2017-11-08 16:22 ` [PATCH 4/4] drm/i915: expose rcs topology through discovery uAPI Lionel Landwerlin
@ 2017-11-09 17:34   ` Tvrtko Ursulin
  2017-11-10 16:37     ` Lionel Landwerlin
  0 siblings, 1 reply; 19+ messages in thread
From: Tvrtko Ursulin @ 2017-11-09 17:34 UTC (permalink / raw)
  To: Lionel Landwerlin, intel-gfx


On 08/11/2017 16:22, Lionel Landwerlin wrote:
> With the introduction of asymetric slices in CNL, we cannot rely on
> the previous SUBSLICE_MASK getparam. Here we introduce a more detailed
> way of querying the Gen's GPU topology that doesn't aggregate numbers.
> 
> This is essential for monitoring parts of the GPU with the OA unit,
> because counters need to be normalized to the number of
> EUs/subslices/slices. The current aggregated numbers like EU_TOTAL do
> not gives us sufficient information.
> 
> This change introduce a new way to query properties of the GPU, making
> room for new queries (some media related topology could be exposed in
> the future).
> 
> As a bonus we can draw representations of the GPU :
> 
>          https://imgur.com/a/vuqpa
> 
> Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
> ---
>   drivers/gpu/drm/i915/intel_query_info.c | 64 +++++++++++++++++++++++++++++++++
>   include/uapi/drm/i915_drm.h             | 55 ++++++++++++++++++++++++++++
>   2 files changed, 119 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/intel_query_info.c b/drivers/gpu/drm/i915/intel_query_info.c
> index 21434c393582..45c5b29e0988 100644
> --- a/drivers/gpu/drm/i915/intel_query_info.c
> +++ b/drivers/gpu/drm/i915/intel_query_info.c
> @@ -25,6 +25,68 @@
>   #include "i915_drv.h"
>   #include <uapi/drm/i915_drm.h>
>   
> +static int query_info_rcs_topology(struct drm_i915_private *dev_priv,
> +				   struct drm_i915_query_info *args)
> +{
> +	const struct sseu_dev_info *sseu = &INTEL_INFO(dev_priv)->sseu;
> +	struct drm_i915_rcs_topology_info __user *user_topology =
> +		u64_to_user_ptr(args->info_ptr);
> +	struct drm_i915_rcs_topology_info topology;
> +	u32 data_size, total_size;
> +	const u8 *data = NULL;
> +	int ret;
> +
> +	/* Not supported on gen < 8. */
> +	if (sseu->max_slices == 0)
> +		return -ENODEV;
> +
> +	switch (args->query_params[0]) {
> +	case I915_RCS_TOPOLOGY_SLICE:
> +		topology.params[0] = sseu->max_slices;
> +		data_size = sizeof(sseu->slice_mask);
> +		data = &sseu->slice_mask;

Regardless of the solution, you will need to translate the internal 
kernel data into something strictly defined in the uAPI headers, 
otherwise we leak internal organization into ABI.

> +		break;
> +
> +	case I915_RCS_TOPOLOGY_SUBSLICE:
> +		topology.params[0] = sseu->max_slices;
> +		topology.params[1] = ALIGN(sseu->max_subslices, 8) / 8;
> +		data_size = sseu->max_slices * topology.params[1];
> +		data = sseu->subslices_mask;
> +		break;
> +
> +	case I915_RCS_TOPOLOGY_EU:
> +		topology.params[2] = ALIGN(sseu->max_eus_per_subslice, 8) / 8;
> +		topology.params[1] = sseu->max_subslices * topology.params[2];
> +		topology.params[0] = sseu->max_slices;
> +		data_size = sseu->max_slices * topology.params[1];
> +		data = sseu->eu_mask;
> +		break;
> +
> +	default:
> +		return -EINVAL;
> +	}
> +
> +	total_size = sizeof(topology) + data_size;
> +
> +	if (args->info_ptr_len == 0) {
> +		args->info_ptr_len = total_size;
> +		return 0;
> +	}
> +
> +	if (args->info_ptr_len < total_size)
> +		return -EINVAL;
> +
> +	ret = copy_to_user(user_topology, &topology, sizeof(topology));
> +	if (ret)
> +		return -EFAULT;
> +
> +	ret = copy_to_user(user_topology + 1, data, data_size);
> +	if (ret)
> +		return -EFAULT;
> +
> +	return 0;
> +}
> +
>   static u8 user_class_map[I915_ENGINE_CLASS_MAX] = {
>   	[I915_ENGINE_CLASS_OTHER] = OTHER_CLASS,
>   	[I915_ENGINE_CLASS_RENDER] = RENDER_CLASS,
> @@ -112,6 +174,8 @@ int i915_query_info_ioctl(struct drm_device *dev, void *data,
>   	switch (args->query) {
>   	case I915_QUERY_INFO_ENGINE:
>   		return query_info_engine(dev_priv, args);
> +	case I915_QUERY_INFO_RCS_TOPOLOGY:
> +		return query_info_rcs_topology(dev_priv, args);
>   	default:
>   		return -EINVAL;
>   	}
> diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
> index c6026616300e..5b6a8995a948 100644
> --- a/include/uapi/drm/i915_drm.h
> +++ b/include/uapi/drm/i915_drm.h
> @@ -1569,6 +1569,61 @@ struct drm_i915_engine_info {
>   	__u8 rsvd2[6];
>   };
>   
> +/* Query RCS topology.
> + *
> + * drm_i915_query_info.query_params[0] should be set to one of the
> + * I915_RCS_TOPOLOGY_* define.
> + *
> + * drm_i915_gem_query_info.info_ptr will be written to with
> + * drm_i915_rcs_topology_info.
> + */
> +#define I915_QUERY_INFO_RCS_TOPOLOGY	1 /* version 1 */
> +
> +/* Query RCS slice topology
> + *
> + * The meaning of the drm_i915_rcs_topology_info fields is :
> + *
> + * params[0]: number of slices
> + *
> + * data: Each bit indicates whether a slice is available (1) or fused off (0).
> + * Formula to tell if slice X is available :
> + *
> + *         (data[X / 8] >> (X % 8)) & 1
> + */
> +#define I915_RCS_TOPOLOGY_SLICE		0 /* version 1 */
> +/* Query RCS subslice topology
> + *
> + * The meaning of the drm_i915_rcs_topology_info fields is :
> + *
> + * params[0]: number of slices
> + * params[1]: slice stride
> + *
> + * data: each bit indicates whether a subslice is available (1) or fused off
> + * (0). Formula to tell if slice X subslice Y is available :
> + *
> + *         (data[(X * params[1]) + Y / 8] >> (Y % 8)) & 1
> + */
> +#define I915_RCS_TOPOLOGY_SUBSLICE	1 /* version 1 */
> +/* Query RCS EU topology
> + *
> + * The meaning of the drm_i915_rcs_topology_info fields is :
> + *
> + * params[0]: number of slices
> + * params[1]: slice stride
> + * params[2]: subslice stride
> + *
> + * data: Each bit indicates whether a subslice is available (1) or fused off
> + * (0). Formula to tell if slice X subslice Y eu Z is available :
> + *
> + *         (data[X * params[1] + Y * params[2] + Z / 8] >> (Z % 8)) & 1
> + */
> +#define I915_RCS_TOPOLOGY_EU		2 /* version 1 */
> +
> +struct drm_i915_rcs_topology_info {
> +	__u32 params[6];
> +
> +	__u8 data[];

You don't use this marker in the patch.

But in general would it be feasible to define and name the returned data 
more precisely? Like:

struct drm_engine_rcs_engine_info {
	...
	/existing_stuff/
	...

#define HAS_TOPOLOGY
	u32 flags;

	struct {
		u32 this;
		u32 that;
		...
		u8 mask[SOME_FUTURE_PROOF_NUMBER];
	} slice_topology;

	struct {
		u32 this;
		u32 that;
		...
		u8 mask[SOME_FUTURE_PROOF_NUMBER];
	} subslice_topology;

	struct {
		u32 this;
		u32 that;
		...
		u8 mask[SOME_FUTURE_PROOF_NUMBER];
	} eu_topology;
};

That said, perhaps RCS topology belongs more to the GPU than the render 
engine.

Regards,

Tvrtko

> +};
>   
>   struct drm_i915_query_info {
>   	/* in/out: Protocol version requested/supported. When set to 0, the
> 
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 4/4] drm/i915: expose rcs topology through discovery uAPI
  2017-11-09 17:34   ` Tvrtko Ursulin
@ 2017-11-10 16:37     ` Lionel Landwerlin
  2017-11-10 16:47       ` Chris Wilson
  0 siblings, 1 reply; 19+ messages in thread
From: Lionel Landwerlin @ 2017-11-10 16:37 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx

On 09/11/17 17:34, Tvrtko Ursulin wrote:
>
> On 08/11/2017 16:22, Lionel Landwerlin wrote:
>> With the introduction of asymetric slices in CNL, we cannot rely on
>> the previous SUBSLICE_MASK getparam. Here we introduce a more detailed
>> way of querying the Gen's GPU topology that doesn't aggregate numbers.
>>
>> This is essential for monitoring parts of the GPU with the OA unit,
>> because counters need to be normalized to the number of
>> EUs/subslices/slices. The current aggregated numbers like EU_TOTAL do
>> not gives us sufficient information.
>>
>> This change introduce a new way to query properties of the GPU, making
>> room for new queries (some media related topology could be exposed in
>> the future).
>>
>> As a bonus we can draw representations of the GPU :
>>
>>          https://imgur.com/a/vuqpa
>>
>> Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
>> ---
>>   drivers/gpu/drm/i915/intel_query_info.c | 64 
>> +++++++++++++++++++++++++++++++++
>>   include/uapi/drm/i915_drm.h             | 55 
>> ++++++++++++++++++++++++++++
>>   2 files changed, 119 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/i915/intel_query_info.c 
>> b/drivers/gpu/drm/i915/intel_query_info.c
>> index 21434c393582..45c5b29e0988 100644
>> --- a/drivers/gpu/drm/i915/intel_query_info.c
>> +++ b/drivers/gpu/drm/i915/intel_query_info.c
>> @@ -25,6 +25,68 @@
>>   #include "i915_drv.h"
>>   #include <uapi/drm/i915_drm.h>
>>   +static int query_info_rcs_topology(struct drm_i915_private *dev_priv,
>> +                   struct drm_i915_query_info *args)
>> +{
>> +    const struct sseu_dev_info *sseu = &INTEL_INFO(dev_priv)->sseu;
>> +    struct drm_i915_rcs_topology_info __user *user_topology =
>> +        u64_to_user_ptr(args->info_ptr);
>> +    struct drm_i915_rcs_topology_info topology;
>> +    u32 data_size, total_size;
>> +    const u8 *data = NULL;
>> +    int ret;
>> +
>> +    /* Not supported on gen < 8. */
>> +    if (sseu->max_slices == 0)
>> +        return -ENODEV;
>> +
>> +    switch (args->query_params[0]) {
>> +    case I915_RCS_TOPOLOGY_SLICE:
>> +        topology.params[0] = sseu->max_slices;
>> +        data_size = sizeof(sseu->slice_mask);
>> +        data = &sseu->slice_mask;
>
> Regardless of the solution, you will need to translate the internal 
> kernel data into something strictly defined in the uAPI headers, 
> otherwise we leak internal organization into ABI.

Sure, I'll add an assert to make sure that it is u8, so if it does 
switch to u16/u32 at some point, people will remember to maintain the API.

>
>> +        break;
>> +
>> +    case I915_RCS_TOPOLOGY_SUBSLICE:
>> +        topology.params[0] = sseu->max_slices;
>> +        topology.params[1] = ALIGN(sseu->max_subslices, 8) / 8;
>> +        data_size = sseu->max_slices * topology.params[1];
>> +        data = sseu->subslices_mask;
>> +        break;
>> +
>> +    case I915_RCS_TOPOLOGY_EU:
>> +        topology.params[2] = ALIGN(sseu->max_eus_per_subslice, 8) / 8;
>> +        topology.params[1] = sseu->max_subslices * topology.params[2];
>> +        topology.params[0] = sseu->max_slices;
>> +        data_size = sseu->max_slices * topology.params[1];
>> +        data = sseu->eu_mask;
>> +        break;
>> +
>> +    default:
>> +        return -EINVAL;
>> +    }
>> +
>> +    total_size = sizeof(topology) + data_size;
>> +
>> +    if (args->info_ptr_len == 0) {
>> +        args->info_ptr_len = total_size;
>> +        return 0;
>> +    }
>> +
>> +    if (args->info_ptr_len < total_size)
>> +        return -EINVAL;
>> +
>> +    ret = copy_to_user(user_topology, &topology, sizeof(topology));
>> +    if (ret)
>> +        return -EFAULT;
>> +
>> +    ret = copy_to_user(user_topology + 1, data, data_size);
>> +    if (ret)
>> +        return -EFAULT;
>> +
>> +    return 0;
>> +}
>> +
>>   static u8 user_class_map[I915_ENGINE_CLASS_MAX] = {
>>       [I915_ENGINE_CLASS_OTHER] = OTHER_CLASS,
>>       [I915_ENGINE_CLASS_RENDER] = RENDER_CLASS,
>> @@ -112,6 +174,8 @@ int i915_query_info_ioctl(struct drm_device *dev, 
>> void *data,
>>       switch (args->query) {
>>       case I915_QUERY_INFO_ENGINE:
>>           return query_info_engine(dev_priv, args);
>> +    case I915_QUERY_INFO_RCS_TOPOLOGY:
>> +        return query_info_rcs_topology(dev_priv, args);
>>       default:
>>           return -EINVAL;
>>       }
>> diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
>> index c6026616300e..5b6a8995a948 100644
>> --- a/include/uapi/drm/i915_drm.h
>> +++ b/include/uapi/drm/i915_drm.h
>> @@ -1569,6 +1569,61 @@ struct drm_i915_engine_info {
>>       __u8 rsvd2[6];
>>   };
>>   +/* Query RCS topology.
>> + *
>> + * drm_i915_query_info.query_params[0] should be set to one of the
>> + * I915_RCS_TOPOLOGY_* define.
>> + *
>> + * drm_i915_gem_query_info.info_ptr will be written to with
>> + * drm_i915_rcs_topology_info.
>> + */
>> +#define I915_QUERY_INFO_RCS_TOPOLOGY    1 /* version 1 */
>> +
>> +/* Query RCS slice topology
>> + *
>> + * The meaning of the drm_i915_rcs_topology_info fields is :
>> + *
>> + * params[0]: number of slices
>> + *
>> + * data: Each bit indicates whether a slice is available (1) or 
>> fused off (0).
>> + * Formula to tell if slice X is available :
>> + *
>> + *         (data[X / 8] >> (X % 8)) & 1
>> + */
>> +#define I915_RCS_TOPOLOGY_SLICE        0 /* version 1 */
>> +/* Query RCS subslice topology
>> + *
>> + * The meaning of the drm_i915_rcs_topology_info fields is :
>> + *
>> + * params[0]: number of slices
>> + * params[1]: slice stride
>> + *
>> + * data: each bit indicates whether a subslice is available (1) or 
>> fused off
>> + * (0). Formula to tell if slice X subslice Y is available :
>> + *
>> + *         (data[(X * params[1]) + Y / 8] >> (Y % 8)) & 1
>> + */
>> +#define I915_RCS_TOPOLOGY_SUBSLICE    1 /* version 1 */
>> +/* Query RCS EU topology
>> + *
>> + * The meaning of the drm_i915_rcs_topology_info fields is :
>> + *
>> + * params[0]: number of slices
>> + * params[1]: slice stride
>> + * params[2]: subslice stride
>> + *
>> + * data: Each bit indicates whether a subslice is available (1) or 
>> fused off
>> + * (0). Formula to tell if slice X subslice Y eu Z is available :
>> + *
>> + *         (data[X * params[1] + Y * params[2] + Z / 8] >> (Z % 8)) & 1
>> + */
>> +#define I915_RCS_TOPOLOGY_EU        2 /* version 1 */
>> +
>> +struct drm_i915_rcs_topology_info {
>> +    __u32 params[6];
>> +
>> +    __u8 data[];
>
> You don't use this marker in the patch.
>
> But in general would it be feasible to define and name the returned 
> data more precisely? Like:
>
> struct drm_engine_rcs_engine_info {
>     ...
>     /existing_stuff/
>     ...
>
> #define HAS_TOPOLOGY
>     u32 flags;
>
>     struct {
>         u32 this;
>         u32 that;
>         ...
>         u8 mask[SOME_FUTURE_PROOF_NUMBER];
>     } slice_topology;
>
>     struct {
>         u32 this;
>         u32 that;
>         ...
>         u8 mask[SOME_FUTURE_PROOF_NUMBER];
>     } subslice_topology;
>
>     struct {
>         u32 this;
>         u32 that;
>         ...
>         u8 mask[SOME_FUTURE_PROOF_NUMBER];
>     } eu_topology;
> };

I'm pretty sure we'll need to expose more than these 3 properties here 
(slice/subslice/eus) soon.
Some of the components residing in the subslice could be of interest.
So I'm reluctant to have all of this within a single struct which we 
can't change the size of.
Having an enum for each property and possibly an associated struct for 
each is fine though.

>
> That said, perhaps RCS topology belongs more to the GPU than the 
> render engine.
>
> Regards,
>
> Tvrtko
>
>> +};
>>     struct drm_i915_query_info {
>>       /* in/out: Protocol version requested/supported. When set to 0, 
>> the
>>
>

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 4/4] drm/i915: expose rcs topology through discovery uAPI
  2017-11-10 16:37     ` Lionel Landwerlin
@ 2017-11-10 16:47       ` Chris Wilson
  2017-11-10 18:29         ` Lionel Landwerlin
  0 siblings, 1 reply; 19+ messages in thread
From: Chris Wilson @ 2017-11-10 16:47 UTC (permalink / raw)
  To: Lionel Landwerlin, Tvrtko Ursulin, intel-gfx

Quoting Lionel Landwerlin (2017-11-10 16:37:33)
> On 09/11/17 17:34, Tvrtko Ursulin wrote:
> >
> > On 08/11/2017 16:22, Lionel Landwerlin wrote:
> > But in general would it be feasible to define and name the returned 
> > data more precisely? Like:
> >
> > struct drm_engine_rcs_engine_info {
> >     ...
> >     /existing_stuff/
> >     ...
> >
> > #define HAS_TOPOLOGY
> >     u32 flags;
> >
> >     struct {
> >         u32 this;
> >         u32 that;
> >         ...
> >         u8 mask[SOME_FUTURE_PROOF_NUMBER];
> >     } slice_topology;
> >
> >     struct {
> >         u32 this;
> >         u32 that;
> >         ...
> >         u8 mask[SOME_FUTURE_PROOF_NUMBER];
> >     } subslice_topology;
> >
> >     struct {
> >         u32 this;
> >         u32 that;
> >         ...
> >         u8 mask[SOME_FUTURE_PROOF_NUMBER];
> >     } eu_topology;
> > };
> 
> I'm pretty sure we'll need to expose more than these 3 properties here 
> (slice/subslice/eus) soon.
> Some of the components residing in the subslice could be of interest.
> So I'm reluctant to have all of this within a single struct which we 
> can't change the size of.

The struct size doesn't have to be fixed, just reported. The underlying
question is can we construct an interface that is flexible enough?

Something along the lines of perf (GL even) where the output format is
constructed by request from the user, then we only need to declare how
long each field will be, get to the user allocate the buffer, then fill
on the second pass.

Alternatively we output some ASN string?

I don't want to overengineer, but at the same time this looks to be the
perfect excuse to require enough flexibility to cater for future
complexity without going too meta. :)
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 4/4] drm/i915: expose rcs topology through discovery uAPI
  2017-11-10 16:47       ` Chris Wilson
@ 2017-11-10 18:29         ` Lionel Landwerlin
  2017-11-10 19:03           ` Chris Wilson
  2017-11-13  9:14           ` Tvrtko Ursulin
  0 siblings, 2 replies; 19+ messages in thread
From: Lionel Landwerlin @ 2017-11-10 18:29 UTC (permalink / raw)
  To: Chris Wilson, Tvrtko Ursulin, intel-gfx

On 10/11/17 16:47, Chris Wilson wrote:
> Quoting Lionel Landwerlin (2017-11-10 16:37:33)
>> On 09/11/17 17:34, Tvrtko Ursulin wrote:
>>> On 08/11/2017 16:22, Lionel Landwerlin wrote:
>>> But in general would it be feasible to define and name the returned
>>> data more precisely? Like:
>>>
>>> struct drm_engine_rcs_engine_info {
>>>      ...
>>>      /existing_stuff/
>>>      ...
>>>
>>> #define HAS_TOPOLOGY
>>>      u32 flags;
>>>
>>>      struct {
>>>          u32 this;
>>>          u32 that;
>>>          ...
>>>          u8 mask[SOME_FUTURE_PROOF_NUMBER];
>>>      } slice_topology;
>>>
>>>      struct {
>>>          u32 this;
>>>          u32 that;
>>>          ...
>>>          u8 mask[SOME_FUTURE_PROOF_NUMBER];
>>>      } subslice_topology;
>>>
>>>      struct {
>>>          u32 this;
>>>          u32 that;
>>>          ...
>>>          u8 mask[SOME_FUTURE_PROOF_NUMBER];
>>>      } eu_topology;
>>> };
>> I'm pretty sure we'll need to expose more than these 3 properties here
>> (slice/subslice/eus) soon.
>> Some of the components residing in the subslice could be of interest.
>> So I'm reluctant to have all of this within a single struct which we
>> can't change the size of.
> The struct size doesn't have to be fixed, just reported. The underlying
> question is can we construct an interface that is flexible enough?
>
> Something along the lines of perf (GL even) where the output format is
> constructed by request from the user, then we only need to declare how
> long each field will be, get to the user allocate the buffer, then fill
> on the second pass.
>
> Alternatively we output some ASN string?
>
> I don't want to overengineer, but at the same time this looks to be the
> perfect excuse to require enough flexibility to cater for future
> complexity without going too meta. :)
> -Chris
>
Heh, so one ioctl to get the string, another ioctl to get the data?
And a list of enum for all the properties you can list?

Unrelated question, have you considered ASN to describe the error state 
layout?

-
Lionel
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 4/4] drm/i915: expose rcs topology through discovery uAPI
  2017-11-10 18:29         ` Lionel Landwerlin
@ 2017-11-10 19:03           ` Chris Wilson
  2017-11-13  9:14           ` Tvrtko Ursulin
  1 sibling, 0 replies; 19+ messages in thread
From: Chris Wilson @ 2017-11-10 19:03 UTC (permalink / raw)
  To: Lionel Landwerlin, Tvrtko Ursulin, intel-gfx

Quoting Lionel Landwerlin (2017-11-10 18:29:45)
> On 10/11/17 16:47, Chris Wilson wrote:
> Unrelated question, have you considered ASN to describe the error state 
> layout?

I was thinking json (to make it semi-structured and extensible), but keep
reminding myself that if everything else speaks aub, we should make a
passable attempt too. Just keep it ascii though.

However, error-state needs large kick in the right direction for
userspace debugging; I am open for suggestions (in fact want suggestions
from people trying to do post-mortem debugging of GL/Vk, libva and CL).
(I quite liked the idea of the apitrace fdr recording the set of GL
commands that constructed the batch; possibly intermediate stages as
well.)
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 4/4] drm/i915: expose rcs topology through discovery uAPI
  2017-11-10 18:29         ` Lionel Landwerlin
  2017-11-10 19:03           ` Chris Wilson
@ 2017-11-13  9:14           ` Tvrtko Ursulin
  2017-11-13 10:00             ` Lionel Landwerlin
  2017-11-13 11:14             ` Chris Wilson
  1 sibling, 2 replies; 19+ messages in thread
From: Tvrtko Ursulin @ 2017-11-13  9:14 UTC (permalink / raw)
  To: Lionel Landwerlin, Chris Wilson, intel-gfx


On 10/11/2017 18:29, Lionel Landwerlin wrote:
> On 10/11/17 16:47, Chris Wilson wrote:
>> Quoting Lionel Landwerlin (2017-11-10 16:37:33)
>>> On 09/11/17 17:34, Tvrtko Ursulin wrote:
>>>> On 08/11/2017 16:22, Lionel Landwerlin wrote:
>>>> But in general would it be feasible to define and name the returned
>>>> data more precisely? Like:
>>>>
>>>> struct drm_engine_rcs_engine_info {
>>>>      ...
>>>>      /existing_stuff/
>>>>      ...
>>>>
>>>> #define HAS_TOPOLOGY
>>>>      u32 flags;
>>>>
>>>>      struct {
>>>>          u32 this;
>>>>          u32 that;
>>>>          ...
>>>>          u8 mask[SOME_FUTURE_PROOF_NUMBER];
>>>>      } slice_topology;
>>>>
>>>>      struct {
>>>>          u32 this;
>>>>          u32 that;
>>>>          ...
>>>>          u8 mask[SOME_FUTURE_PROOF_NUMBER];
>>>>      } subslice_topology;
>>>>
>>>>      struct {
>>>>          u32 this;
>>>>          u32 that;
>>>>          ...
>>>>          u8 mask[SOME_FUTURE_PROOF_NUMBER];
>>>>      } eu_topology;
>>>> };
>>> I'm pretty sure we'll need to expose more than these 3 properties here
>>> (slice/subslice/eus) soon.
>>> Some of the components residing in the subslice could be of interest.
>>> So I'm reluctant to have all of this within a single struct which we
>>> can't change the size of.
>> The struct size doesn't have to be fixed, just reported. The underlying
>> question is can we construct an interface that is flexible enough?
>>
>> Something along the lines of perf (GL even) where the output format is
>> constructed by request from the user, then we only need to declare how
>> long each field will be, get to the user allocate the buffer, then fill
>> on the second pass.
>>
>> Alternatively we output some ASN string?
>>
>> I don't want to overengineer, but at the same time this looks to be the
>> perfect excuse to require enough flexibility to cater for future
>> complexity without going too meta. :)
>> -Chris
>>
> Heh, so one ioctl to get the string, another ioctl to get the data?
> And a list of enum for all the properties you can list?
> 
> Unrelated question, have you considered ASN to describe the error state 
> layout?

Or we go with sysfs, plain and simple?

$ cat $i915root/engine/vcs0/info
hevc

$ cat $i915root/engine/vcs1/instance
1

$ cat $i915root/engine/rcs0/class
render

...

$i915root/gpu/topology/slice_mask

Should be able to design to avoid issues with extensibility and avoids 
the need to come up with complex binary structures or even adding new 
protocols like the ASN mentioned above.

?

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 4/4] drm/i915: expose rcs topology through discovery uAPI
  2017-11-13  9:14           ` Tvrtko Ursulin
@ 2017-11-13 10:00             ` Lionel Landwerlin
  2017-11-13 11:14             ` Chris Wilson
  1 sibling, 0 replies; 19+ messages in thread
From: Lionel Landwerlin @ 2017-11-13 10:00 UTC (permalink / raw)
  To: Tvrtko Ursulin, Chris Wilson, intel-gfx

On 13/11/17 09:14, Tvrtko Ursulin wrote:
>
> On 10/11/2017 18:29, Lionel Landwerlin wrote:
>> On 10/11/17 16:47, Chris Wilson wrote:
>>> Quoting Lionel Landwerlin (2017-11-10 16:37:33)
>>>> On 09/11/17 17:34, Tvrtko Ursulin wrote:
>>>>> On 08/11/2017 16:22, Lionel Landwerlin wrote:
>>>>> But in general would it be feasible to define and name the returned
>>>>> data more precisely? Like:
>>>>>
>>>>> struct drm_engine_rcs_engine_info {
>>>>>      ...
>>>>>      /existing_stuff/
>>>>>      ...
>>>>>
>>>>> #define HAS_TOPOLOGY
>>>>>      u32 flags;
>>>>>
>>>>>      struct {
>>>>>          u32 this;
>>>>>          u32 that;
>>>>>          ...
>>>>>          u8 mask[SOME_FUTURE_PROOF_NUMBER];
>>>>>      } slice_topology;
>>>>>
>>>>>      struct {
>>>>>          u32 this;
>>>>>          u32 that;
>>>>>          ...
>>>>>          u8 mask[SOME_FUTURE_PROOF_NUMBER];
>>>>>      } subslice_topology;
>>>>>
>>>>>      struct {
>>>>>          u32 this;
>>>>>          u32 that;
>>>>>          ...
>>>>>          u8 mask[SOME_FUTURE_PROOF_NUMBER];
>>>>>      } eu_topology;
>>>>> };
>>>> I'm pretty sure we'll need to expose more than these 3 properties here
>>>> (slice/subslice/eus) soon.
>>>> Some of the components residing in the subslice could be of interest.
>>>> So I'm reluctant to have all of this within a single struct which we
>>>> can't change the size of.
>>> The struct size doesn't have to be fixed, just reported. The underlying
>>> question is can we construct an interface that is flexible enough?
>>>
>>> Something along the lines of perf (GL even) where the output format is
>>> constructed by request from the user, then we only need to declare how
>>> long each field will be, get to the user allocate the buffer, then fill
>>> on the second pass.
>>>
>>> Alternatively we output some ASN string?
>>>
>>> I don't want to overengineer, but at the same time this looks to be the
>>> perfect excuse to require enough flexibility to cater for future
>>> complexity without going too meta. :)
>>> -Chris
>>>
>> Heh, so one ioctl to get the string, another ioctl to get the data?
>> And a list of enum for all the properties you can list?
>>
>> Unrelated question, have you considered ASN to describe the error 
>> state layout?
>
> Or we go with sysfs, plain and simple?
>
> $ cat $i915root/engine/vcs0/info
> hevc
>
> $ cat $i915root/engine/vcs1/instance
> 1
>
> $ cat $i915root/engine/rcs0/class
> render
>
> ...
>
> $i915root/gpu/topology/slice_mask
>
> Should be able to design to avoid issues with extensibility and avoids 
> the need to come up with complex binary structures or even adding new 
> protocols like the ASN mentioned above.
>
> ?
>
> Tvrtko
>
I guess that works too. What layout would fit for the other bits of 
topology though? :

topology/slice_mask
topology/slice0/subslice_mask
topology/slice0/subslice0/eu_mask
topology/slice0/subslice1/eu_mask
topology/slice0/subslice2/eu_mask
topology/slice1/subslice_mask
...
topology/slice2/subslice_mask
...

Or something flatter but then you could need lines in the files to split 
things by slice/subslice.

-
Lionel

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 4/4] drm/i915: expose rcs topology through discovery uAPI
  2017-11-13  9:14           ` Tvrtko Ursulin
  2017-11-13 10:00             ` Lionel Landwerlin
@ 2017-11-13 11:14             ` Chris Wilson
  1 sibling, 0 replies; 19+ messages in thread
From: Chris Wilson @ 2017-11-13 11:14 UTC (permalink / raw)
  To: Tvrtko Ursulin, Lionel Landwerlin, intel-gfx

Quoting Tvrtko Ursulin (2017-11-13 09:14:15)
> 
> On 10/11/2017 18:29, Lionel Landwerlin wrote:
> > On 10/11/17 16:47, Chris Wilson wrote:
> >> Quoting Lionel Landwerlin (2017-11-10 16:37:33)
> >>> On 09/11/17 17:34, Tvrtko Ursulin wrote:
> >>>> On 08/11/2017 16:22, Lionel Landwerlin wrote:
> >>>> But in general would it be feasible to define and name the returned
> >>>> data more precisely? Like:
> >>>>
> >>>> struct drm_engine_rcs_engine_info {
> >>>>      ...
> >>>>      /existing_stuff/
> >>>>      ...
> >>>>
> >>>> #define HAS_TOPOLOGY
> >>>>      u32 flags;
> >>>>
> >>>>      struct {
> >>>>          u32 this;
> >>>>          u32 that;
> >>>>          ...
> >>>>          u8 mask[SOME_FUTURE_PROOF_NUMBER];
> >>>>      } slice_topology;
> >>>>
> >>>>      struct {
> >>>>          u32 this;
> >>>>          u32 that;
> >>>>          ...
> >>>>          u8 mask[SOME_FUTURE_PROOF_NUMBER];
> >>>>      } subslice_topology;
> >>>>
> >>>>      struct {
> >>>>          u32 this;
> >>>>          u32 that;
> >>>>          ...
> >>>>          u8 mask[SOME_FUTURE_PROOF_NUMBER];
> >>>>      } eu_topology;
> >>>> };
> >>> I'm pretty sure we'll need to expose more than these 3 properties here
> >>> (slice/subslice/eus) soon.
> >>> Some of the components residing in the subslice could be of interest.
> >>> So I'm reluctant to have all of this within a single struct which we
> >>> can't change the size of.
> >> The struct size doesn't have to be fixed, just reported. The underlying
> >> question is can we construct an interface that is flexible enough?
> >>
> >> Something along the lines of perf (GL even) where the output format is
> >> constructed by request from the user, then we only need to declare how
> >> long each field will be, get to the user allocate the buffer, then fill
> >> on the second pass.
> >>
> >> Alternatively we output some ASN string?
> >>
> >> I don't want to overengineer, but at the same time this looks to be the
> >> perfect excuse to require enough flexibility to cater for future
> >> complexity without going too meta. :)
> >> -Chris
> >>
> > Heh, so one ioctl to get the string, another ioctl to get the data?
> > And a list of enum for all the properties you can list?
> > 
> > Unrelated question, have you considered ASN to describe the error state 
> > layout?
> 
> Or we go with sysfs, plain and simple?
> 
> $ cat $i915root/engine/vcs0/info
> hevc
> 
> $ cat $i915root/engine/vcs1/instance
> 1
> 
> $ cat $i915root/engine/rcs0/class
> render
> 
> ...
> 
> $i915root/gpu/topology/slice_mask
> 
> Should be able to design to avoid issues with extensibility and avoids 
> the need to come up with complex binary structures or even adding new 
> protocols like the ASN mentioned above.

That wins the flexibility argument hands down. (It is so flexible we have
to plan ahead to get the hierarchy right. We can even version it with a
$i915root/version. Except people will add it to their scripts and notice
if paths change. It has the benefit of being too convenient ;)

The only counter is, ugh that's a lot of work to gather the info on
userspace driver load. But everybody can cat their gpu details.
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2017-11-13 11:14 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-11-08 16:22 [PATCH 0/4] drm/i915: introduce query information Lionel Landwerlin
2017-11-08 16:22 ` [PATCH 1/4] drm/i915: introduce query info uAPI Lionel Landwerlin
2017-11-09 15:57   ` Joonas Lahtinen
2017-11-09 16:15     ` Lionel Landwerlin
2017-11-09 17:10       ` Tvrtko Ursulin
2017-11-08 16:22 ` [PATCH 2/4] drm/i915: store all subslice masks Lionel Landwerlin
2017-11-08 16:22 ` [PATCH 3/4] drm/i915/debugfs: reuse max slice/subslices already stored in sseu Lionel Landwerlin
2017-11-08 16:22 ` [PATCH 4/4] drm/i915: expose rcs topology through discovery uAPI Lionel Landwerlin
2017-11-09 17:34   ` Tvrtko Ursulin
2017-11-10 16:37     ` Lionel Landwerlin
2017-11-10 16:47       ` Chris Wilson
2017-11-10 18:29         ` Lionel Landwerlin
2017-11-10 19:03           ` Chris Wilson
2017-11-13  9:14           ` Tvrtko Ursulin
2017-11-13 10:00             ` Lionel Landwerlin
2017-11-13 11:14             ` Chris Wilson
2017-11-08 16:43 ` ✓ Fi.CI.BAT: success for drm/i915: introduce query information Patchwork
2017-11-08 20:04 ` ✓ Fi.CI.IGT: " Patchwork
2017-11-09 11:49 ` [PATCH 0/4] " Lionel Landwerlin

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.