* [PATCH v6 0/5] drm/i915: Expose more GPU properties through sysfs
@ 2017-12-04 15:02 Lionel Landwerlin
  2017-12-04 15:02 ` [PATCH v6 1/5] drm: add card symlink in render sysfs directory Lionel Landwerlin
                   ` (5 more replies)
  0 siblings, 6 replies; 19+ messages in thread
From: Lionel Landwerlin @ 2017-12-04 15:02 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel

Hi,

After discussion with Chris, Joonas & Tvrtko, this series adds an
additional commit to link the render node back to the card through a
symlink, making it obvious to an application using a render node where
to get the information it needs.

Cheers,

Lionel Landwerlin (5):
  drm: add card symlink in render sysfs directory
  drm/i915: store all subslice masks
  drm/i915/debugfs: reuse max slice/subslices already stored in sseu
  drm/i915: expose engine availability through sysfs
  drm/i915: expose EU topology through sysfs

 drivers/gpu/drm/drm_drv.c                |  11 +
 drivers/gpu/drm/i915/i915_debugfs.c      |  50 ++--
 drivers/gpu/drm/i915/i915_drv.c          |   2 +-
 drivers/gpu/drm/i915/i915_drv.h          |  56 ++++-
 drivers/gpu/drm/i915/i915_sysfs.c        | 386 +++++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/intel_device_info.c | 169 ++++++++++----
 drivers/gpu/drm/i915/intel_engine_cs.c   |  12 +
 drivers/gpu/drm/i915/intel_lrc.c         |   2 +-
 drivers/gpu/drm/i915/intel_ringbuffer.h  |   6 +-
 9 files changed, 617 insertions(+), 77 deletions(-)

--
2.15.1

* [PATCH v6 1/5] drm: add card symlink in render sysfs directory
  2017-12-04 15:02 [PATCH v6 0/5] drm/i915: Expose more GPU properties through sysfs Lionel Landwerlin
@ 2017-12-04 15:02 ` Lionel Landwerlin
  2017-12-04 15:02 ` [PATCH v6 2/5] drm/i915: store all subslice masks Lionel Landwerlin
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 19+ messages in thread
From: Lionel Landwerlin @ 2017-12-04 15:02 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel

In i915 we would like to expose information about the GPU topology,
which is useful mostly to applications making use of the device's
computational capabilities (not so much the display part). We already
store information like frequency, etc. in the card directory
(/sys/class/drm/card0), so it seems natural to add the additional data
there too.

Unfortunately it's not obvious how to go from the render node back to
the sysfs card directory.

From the major/minor number it's possible to figure out which device
is associated with the render node. For example, with this render node:

$ ls -l /dev/dri/renderD128
crw-rw----+ 1 root video 226, 128 Oct 24 16:17 /dev/dri/renderD128

You can rebuild a path to access the associated DRM device:

$ ls -l /sys/dev/char/226\:128/device/drm/
total 0
drwxr-xr-x 9 root root 0 Nov 27 16:52 card0
lrwxrwxrwx 1 root root 0 Nov 27 16:52 controlD64 -> card0
drwxr-xr-x 3 root root 0 Nov 27 16:50 renderD128

Right now, most devices usually have only one card. But in order to be
future-proof, we would like to associate the render node with its card
so that you can access the card through the following path:

/sys/dev/char/226\:128/card/
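
For illustration, userspace could resolve that card directory from an
already-open render node fd along these lines (just a sketch with most
error handling trimmed; /dev/dri/renderD128 is the example node from
above):

#define _GNU_SOURCE
#include <stdio.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/sysmacros.h>

static int open_card_sysfs_dir(int drm_fd)
{
	struct stat st;
	char path[128];

	if (fstat(drm_fd, &st) || !S_ISCHR(st.st_mode))
		return -1;

	/* The "card" symlink added by this patch points at the card dir. */
	snprintf(path, sizeof(path), "/sys/dev/char/%u:%u/card",
		 major(st.st_rdev), minor(st.st_rdev));

	return open(path, O_RDONLY);
}

int main(void)
{
	int fd = open("/dev/dri/renderD128", O_RDWR);
	int card = open_card_sysfs_dir(fd);

	printf("card sysfs dir fd: %d\n", card);
	return card < 0;
}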

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: dri-devel@lists.freedesktop.org
---
 drivers/gpu/drm/drm_drv.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
index 9acc1e157813..a26c0e86778e 100644
--- a/drivers/gpu/drm/drm_drv.c
+++ b/drivers/gpu/drm/drm_drv.c
@@ -779,6 +779,14 @@ int drm_dev_register(struct drm_device *dev, unsigned long flags)
 	if (drm_core_check_feature(dev, DRIVER_MODESET))
 		drm_modeset_register_all(dev);
 
+	if (drm_core_check_feature(dev, DRIVER_RENDER)) {
+		ret = sysfs_create_link(&dev->render->kdev->kobj,
+					&dev->primary->kdev->kobj,
+					"card");
+		if (ret)
+			goto err_minors;
+	}
+
 	ret = 0;
 
 	DRM_INFO("Initialized %s %d.%d.%d %s for %s on minor %d\n",
@@ -835,6 +843,9 @@ void drm_dev_unregister(struct drm_device *dev)
 	list_for_each_entry_safe(r_list, list_temp, &dev->maplist, head)
 		drm_legacy_rmmap(dev, r_list->map);
 
+	if (drm_core_check_feature(dev, DRIVER_RENDER))
+		sysfs_remove_link(&dev->render->kdev->kobj, "card");
+
 	remove_compat_control_link(dev);
 	drm_minor_unregister(dev, DRM_MINOR_PRIMARY);
 	drm_minor_unregister(dev, DRM_MINOR_RENDER);
-- 
2.15.1


* [PATCH v6 2/5] drm/i915: store all subslice masks
  2017-12-04 15:02 [PATCH v6 0/5] drm/i915: Expose more GPU properties through sysfs Lionel Landwerlin
  2017-12-04 15:02 ` [PATCH v6 1/5] drm: add card symlink in render sysfs directory Lionel Landwerlin
@ 2017-12-04 15:02 ` Lionel Landwerlin
  2017-12-04 15:02 ` [PATCH v6 3/5] drm/i915/debugfs: reuse max slice/subslices already stored in sseu Lionel Landwerlin
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 19+ messages in thread
From: Lionel Landwerlin @ 2017-12-04 15:02 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel

Up to now, the subslice mask was assumed to be uniform across slices.
But starting with Cannonlake, slices can be asymmetric (for example
slice0 has a different number of subslices than slice1+). This change
stores the subslice masks of all slices rather than having a single
mask that applies to all slices.
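
With one mask per slice, the subslice count has to be summed per slice,
which is what the new sseu_subslice_total() below does. A standalone
sketch of that computation, using hypothetical mask values for an
asymmetric part:

#include <stdio.h>

#define GEN_MAX_SLICES 6

static unsigned int popcount8(unsigned char v)
{
	unsigned int count = 0;

	for (; v; v &= v - 1)
		count++;
	return count;
}

int main(void)
{
	/* Hypothetical asymmetric fusing: 3 subslices in slice0, 2 in slice1/2. */
	unsigned char subslices_mask[GEN_MAX_SLICES] = { 0x7, 0x3, 0x3 };
	unsigned int s, total = 0;

	for (s = 0; s < GEN_MAX_SLICES; s++)
		total += popcount8(subslices_mask[s]);

	printf("subslice total: %u\n", total); /* prints 7 */
	return 0;
}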

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
---
 drivers/gpu/drm/i915/i915_debugfs.c      |  24 +++--
 drivers/gpu/drm/i915/i915_drv.c          |   2 +-
 drivers/gpu/drm/i915/i915_drv.h          |  23 ++++-
 drivers/gpu/drm/i915/intel_device_info.c | 169 ++++++++++++++++++++++---------
 drivers/gpu/drm/i915/intel_lrc.c         |   2 +-
 drivers/gpu/drm/i915/intel_ringbuffer.h  |   2 +-
 6 files changed, 161 insertions(+), 61 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 28294470ae31..ef091a2a6b12 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -4340,7 +4340,7 @@ static void cherryview_sseu_device_status(struct drm_i915_private *dev_priv,
 			continue;
 
 		sseu->slice_mask = BIT(0);
-		sseu->subslice_mask |= BIT(ss);
+		sseu->subslices_mask[0] |= BIT(ss);
 		eu_cnt = ((sig1[ss] & CHV_EU08_PG_ENABLE) ? 0 : 2) +
 			 ((sig1[ss] & CHV_EU19_PG_ENABLE) ? 0 : 2) +
 			 ((sig1[ss] & CHV_EU210_PG_ENABLE) ? 0 : 2) +
@@ -4387,7 +4387,7 @@ static void gen10_sseu_device_status(struct drm_i915_private *dev_priv,
 			continue;
 
 		sseu->slice_mask |= BIT(s);
-		sseu->subslice_mask = info->sseu.subslice_mask;
+		sseu->subslices_mask[s] = info->sseu.subslices_mask[s];
 
 		for (ss = 0; ss < ss_max; ss++) {
 			unsigned int eu_cnt;
@@ -4442,8 +4442,8 @@ static void gen9_sseu_device_status(struct drm_i915_private *dev_priv,
 		sseu->slice_mask |= BIT(s);
 
 		if (IS_GEN9_BC(dev_priv))
-			sseu->subslice_mask =
-				INTEL_INFO(dev_priv)->sseu.subslice_mask;
+			sseu->subslices_mask[s] =
+				INTEL_INFO(dev_priv)->sseu.subslices_mask[s];
 
 		for (ss = 0; ss < ss_max; ss++) {
 			unsigned int eu_cnt;
@@ -4453,7 +4453,7 @@ static void gen9_sseu_device_status(struct drm_i915_private *dev_priv,
 					/* skip disabled subslice */
 					continue;
 
-				sseu->subslice_mask |= BIT(ss);
+				sseu->subslices_mask[s] |= BIT(ss);
 			}
 
 			eu_cnt = 2 * hweight32(eu_reg[2*s + ss/2] &
@@ -4475,9 +4475,12 @@ static void broadwell_sseu_device_status(struct drm_i915_private *dev_priv,
 	sseu->slice_mask = slice_info & GEN8_LSLICESTAT_MASK;
 
 	if (sseu->slice_mask) {
-		sseu->subslice_mask = INTEL_INFO(dev_priv)->sseu.subslice_mask;
 		sseu->eu_per_subslice =
 				INTEL_INFO(dev_priv)->sseu.eu_per_subslice;
+		for (s = 0; s < fls(sseu->slice_mask); s++) {
+			sseu->subslices_mask[s] =
+				INTEL_INFO(dev_priv)->sseu.subslices_mask[s];
+		}
 		sseu->eu_total = sseu->eu_per_subslice *
 				 sseu_subslice_total(sseu);
 
@@ -4496,6 +4499,7 @@ static void i915_print_sseu_info(struct seq_file *m, bool is_available_info,
 {
 	struct drm_i915_private *dev_priv = node_to_i915(m->private);
 	const char *type = is_available_info ? "Available" : "Enabled";
+	int s;
 
 	seq_printf(m, "  %s Slice Mask: %04x\n", type,
 		   sseu->slice_mask);
@@ -4503,10 +4507,10 @@ static void i915_print_sseu_info(struct seq_file *m, bool is_available_info,
 		   hweight8(sseu->slice_mask));
 	seq_printf(m, "  %s Subslice Total: %u\n", type,
 		   sseu_subslice_total(sseu));
-	seq_printf(m, "  %s Subslice Mask: %04x\n", type,
-		   sseu->subslice_mask);
-	seq_printf(m, "  %s Subslice Per Slice: %u\n", type,
-		   hweight8(sseu->subslice_mask));
+	for (s = 0; s < fls(sseu->slice_mask); s++) {
+		seq_printf(m, "  %s Slice%i Subslice Mask: %04x\n", type,
+			   s, sseu->subslices_mask[s]);
+	}
 	seq_printf(m, "  %s EU Total: %u\n", type,
 		   sseu->eu_total);
 	seq_printf(m, "  %s EU Per Subslice: %u\n", type,
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 7faf20aff25a..b9bfc38e6188 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -414,7 +414,7 @@ static int i915_getparam(struct drm_device *dev, void *data,
 			return -ENODEV;
 		break;
 	case I915_PARAM_SUBSLICE_MASK:
-		value = INTEL_INFO(dev_priv)->sseu.subslice_mask;
+		value = INTEL_INFO(dev_priv)->sseu.subslices_mask[0];
 		if (!value)
 			return -ENODEV;
 		break;
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 594fd14e66c5..0a8e8a3772e5 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -801,9 +801,12 @@ struct intel_csr {
 	func(supports_tv); \
 	func(has_ipc);
 
+#define GEN_MAX_SLICES		(6) /* CNL upper bound */
+#define GEN_MAX_SUBSLICES	(7)
+
 struct sseu_dev_info {
 	u8 slice_mask;
-	u8 subslice_mask;
+	u8 subslices_mask[GEN_MAX_SLICES];
 	u8 eu_total;
 	u8 eu_per_subslice;
 	u8 min_eu_in_pool;
@@ -812,11 +815,27 @@ struct sseu_dev_info {
 	u8 has_slice_pg:1;
 	u8 has_subslice_pg:1;
 	u8 has_eu_pg:1;
+
+	/* Topology fields */
+	u8 max_slices;
+	u8 max_subslices;
+	u8 max_eus_per_subslice;
+
+	/* We don't have more than 8 eus per subslice at the moment and as we
+	 * store eus enabled using bits, no need to multiply by eus per
+	 * subslice.
+	 */
+	u8 eu_mask[GEN_MAX_SLICES * GEN_MAX_SUBSLICES];
 };
 
 static inline unsigned int sseu_subslice_total(const struct sseu_dev_info *sseu)
 {
-	return hweight8(sseu->slice_mask) * hweight8(sseu->subslice_mask);
+	unsigned s, total = 0;
+
+	for (s = 0; s < ARRAY_SIZE(sseu->subslices_mask); s++)
+		total += hweight8(sseu->subslices_mask[s]);
+
+	return total;
 }
 
 /* Keep in gen based order, and chronological order within a gen */
diff --git a/drivers/gpu/drm/i915/intel_device_info.c b/drivers/gpu/drm/i915/intel_device_info.c
index 405d70124a46..11ceb43ddcee 100644
--- a/drivers/gpu/drm/i915/intel_device_info.c
+++ b/drivers/gpu/drm/i915/intel_device_info.c
@@ -82,22 +82,74 @@ void intel_device_info_dump(struct drm_i915_private *dev_priv)
 #undef PRINT_FLAG
 }
 
+static u8 compute_eu_total(const struct sseu_dev_info *sseu)
+{
+	u8 i, total = 0;
+
+	for (i = 0; i < ARRAY_SIZE(sseu->eu_mask); i++)
+		total += hweight8(sseu->eu_mask[i]);
+
+	return total;
+}
+
 static void gen10_sseu_info_init(struct drm_i915_private *dev_priv)
 {
 	struct sseu_dev_info *sseu = &mkwrite_device_info(dev_priv)->sseu;
 	const u32 fuse2 = I915_READ(GEN8_FUSE2);
+	int s, ss, eu_mask = 0xff;
+	u32 subslice_mask, eu_en;
 
 	sseu->slice_mask = (fuse2 & GEN10_F2_S_ENA_MASK) >>
 			    GEN10_F2_S_ENA_SHIFT;
-	sseu->subslice_mask = (1 << 4) - 1;
-	sseu->subslice_mask &= ~((fuse2 & GEN10_F2_SS_DIS_MASK) >>
-				 GEN10_F2_SS_DIS_SHIFT);
+	sseu->max_slices = 6;
+	sseu->max_subslices = 4;
+	sseu->max_eus_per_subslice = 8;
+
+	subslice_mask = (1 << 4) - 1;
+	subslice_mask &= ~((fuse2 & GEN10_F2_SS_DIS_MASK) >>
+			   GEN10_F2_SS_DIS_SHIFT);
+
+	/* Slice0 can have up to 3 subslices, but there are only 2 in
+	 * slice1/2.
+	 */
+	sseu->subslices_mask[0] = subslice_mask;
+	for (s = 1; s < sseu->max_slices; s++)
+		sseu->subslices_mask[s] = subslice_mask & 0x3;
+
+	/* Slice0 */
+	eu_en = ~I915_READ(GEN8_EU_DISABLE0);
+	for (ss = 0; ss < sseu->max_subslices; ss++)
+		sseu->eu_mask[ss]     = (eu_en >> (8 * ss)) & eu_mask;
+	/* Slice1 */
+	sseu->eu_mask[sseu->max_subslices]         = (eu_en >> 24) & eu_mask;
+	eu_en = ~I915_READ(GEN8_EU_DISABLE1);
+	sseu->eu_mask[sseu->max_subslices + 1]     = eu_en & eu_mask;
+	/* Slice2 */
+	sseu->eu_mask[2 * sseu->max_subslices]     = (eu_en >> 8) & eu_mask;
+	sseu->eu_mask[2 * sseu->max_subslices + 1] = (eu_en >> 16) & eu_mask;
+	/* Slice3 */
+	sseu->eu_mask[3 * sseu->max_subslices]     = (eu_en >> 24) & eu_mask;
+	eu_en = ~I915_READ(GEN8_EU_DISABLE2);
+	sseu->eu_mask[3 * sseu->max_subslices + 1] = eu_en & eu_mask;
+	/* Slice4 */
+	sseu->eu_mask[4 * sseu->max_subslices]     = (eu_en >> 8) & eu_mask;
+	sseu->eu_mask[4 * sseu->max_subslices + 1] = (eu_en >> 16) & eu_mask;
+	/* Slice5 */
+	sseu->eu_mask[5 * sseu->max_subslices]     = (eu_en >> 24) & eu_mask;
+	eu_en = ~I915_READ(GEN10_EU_DISABLE3);
+	sseu->eu_mask[5 * sseu->max_subslices + 1] = eu_en & eu_mask;
+
+	/* Do a second pass where we marked the subslices disabled if all
+	 * their eus are off.
+	 */
+	for (s = 0; s < sseu->max_slices; s++) {
+		for (ss = 0; ss < sseu->max_subslices; ss++) {
+			if (sseu->eu_mask[s * sseu->max_subslices + ss] == 0)
+				sseu->subslices_mask[s] &= ~BIT(ss);
+		}
+	}
 
-	sseu->eu_total = hweight32(~I915_READ(GEN8_EU_DISABLE0));
-	sseu->eu_total += hweight32(~I915_READ(GEN8_EU_DISABLE1));
-	sseu->eu_total += hweight32(~I915_READ(GEN8_EU_DISABLE2));
-	sseu->eu_total += hweight8(~(I915_READ(GEN10_EU_DISABLE3) &
-				     GEN10_EU_DIS_SS_MASK));
+	sseu->eu_total = compute_eu_total(sseu);
 
 	/*
 	 * CNL is expected to always have a uniform distribution
@@ -118,26 +170,30 @@ static void gen10_sseu_info_init(struct drm_i915_private *dev_priv)
 static void cherryview_sseu_info_init(struct drm_i915_private *dev_priv)
 {
 	struct sseu_dev_info *sseu = &mkwrite_device_info(dev_priv)->sseu;
-	u32 fuse, eu_dis;
+	u32 fuse;
 
 	fuse = I915_READ(CHV_FUSE_GT);
 
 	sseu->slice_mask = BIT(0);
+	sseu->max_slices = 1;
+	sseu->max_subslices = 2;
+	sseu->max_eus_per_subslice = 8;
 
 	if (!(fuse & CHV_FGT_DISABLE_SS0)) {
-		sseu->subslice_mask |= BIT(0);
-		eu_dis = fuse & (CHV_FGT_EU_DIS_SS0_R0_MASK |
-				 CHV_FGT_EU_DIS_SS0_R1_MASK);
-		sseu->eu_total += 8 - hweight32(eu_dis);
+		sseu->subslices_mask[0] |= BIT(0);
+		sseu->eu_mask[0] = (fuse & CHV_FGT_EU_DIS_SS0_R0_MASK) >> CHV_FGT_EU_DIS_SS0_R0_SHIFT;
+		sseu->eu_mask[0] |= ((fuse & CHV_FGT_EU_DIS_SS0_R1_MASK) >> CHV_FGT_EU_DIS_SS0_R1_SHIFT) << 4;
+		sseu->subslices_mask[0] = 1;
 	}
 
 	if (!(fuse & CHV_FGT_DISABLE_SS1)) {
-		sseu->subslice_mask |= BIT(1);
-		eu_dis = fuse & (CHV_FGT_EU_DIS_SS1_R0_MASK |
-				 CHV_FGT_EU_DIS_SS1_R1_MASK);
-		sseu->eu_total += 8 - hweight32(eu_dis);
+		sseu->subslices_mask[0] |= BIT(1);
+		sseu->eu_mask[1] = (fuse & CHV_FGT_EU_DIS_SS1_R0_MASK) >> CHV_FGT_EU_DIS_SS0_R0_SHIFT;
+		sseu->eu_mask[2] |= ((fuse & CHV_FGT_EU_DIS_SS1_R1_MASK) >> CHV_FGT_EU_DIS_SS0_R1_SHIFT) << 4;
 	}
 
+	sseu->eu_total = compute_eu_total(sseu);
+
 	/*
 	 * CHV expected to always have a uniform distribution of EU
 	 * across subslices.
@@ -159,41 +215,50 @@ static void gen9_sseu_info_init(struct drm_i915_private *dev_priv)
 {
 	struct intel_device_info *info = mkwrite_device_info(dev_priv);
 	struct sseu_dev_info *sseu = &info->sseu;
-	int s_max = 3, ss_max = 4, eu_max = 8;
 	int s, ss;
-	u32 fuse2, eu_disable;
+	u32 fuse2, eu_disable, subslice_mask;
 	u8 eu_mask = 0xff;
 
 	fuse2 = I915_READ(GEN8_FUSE2);
 	sseu->slice_mask = (fuse2 & GEN8_F2_S_ENA_MASK) >> GEN8_F2_S_ENA_SHIFT;
 
+	/* BXT has a single slice and at most 3 subslices. */
+	sseu->max_slices = IS_GEN9_LP(dev_priv) ? 1 : 3;
+	sseu->max_subslices = IS_GEN9_LP(dev_priv) ? 3 : 4;
+	sseu->max_eus_per_subslice = 8;
+
 	/*
 	 * The subslice disable field is global, i.e. it applies
 	 * to each of the enabled slices.
 	*/
-	sseu->subslice_mask = (1 << ss_max) - 1;
-	sseu->subslice_mask &= ~((fuse2 & GEN9_F2_SS_DIS_MASK) >>
-				 GEN9_F2_SS_DIS_SHIFT);
+	subslice_mask = (1 << sseu->max_subslices) - 1;
+	subslice_mask &= ~((fuse2 & GEN9_F2_SS_DIS_MASK) >>
+			   GEN9_F2_SS_DIS_SHIFT);
 
 	/*
 	 * Iterate through enabled slices and subslices to
 	 * count the total enabled EU.
 	*/
-	for (s = 0; s < s_max; s++) {
+	for (s = 0; s < sseu->max_slices; s++) {
 		if (!(sseu->slice_mask & BIT(s)))
 			/* skip disabled slice */
 			continue;
 
+		sseu->subslices_mask[s] = subslice_mask;
+
 		eu_disable = I915_READ(GEN9_EU_DISABLE(s));
-		for (ss = 0; ss < ss_max; ss++) {
+		for (ss = 0; ss < sseu->max_subslices; ss++) {
 			int eu_per_ss;
 
-			if (!(sseu->subslice_mask & BIT(ss)))
+			if (!(sseu->subslices_mask[s] & BIT(ss)))
 				/* skip disabled subslice */
 				continue;
 
-			eu_per_ss = eu_max - hweight8((eu_disable >> (ss*8)) &
-						      eu_mask);
+			sseu->eu_mask[ss + s * sseu->max_subslices] =
+				~((eu_disable >> (ss*8)) & eu_mask);
+
+			eu_per_ss = sseu->max_eus_per_subslice -
+				hweight8((eu_disable >> (ss*8)) & eu_mask);
 
 			/*
 			 * Record which subslice(s) has(have) 7 EUs. we
@@ -202,11 +267,11 @@ static void gen9_sseu_info_init(struct drm_i915_private *dev_priv)
 			 */
 			if (eu_per_ss == 7)
 				sseu->subslice_7eu[s] |= BIT(ss);
-
-			sseu->eu_total += eu_per_ss;
 		}
 	}
 
+	sseu->eu_total = compute_eu_total(sseu);
+
 	/*
 	 * SKL is expected to always have a uniform distribution
 	 * of EU across subslices with the exception that any one
@@ -232,8 +297,8 @@ static void gen9_sseu_info_init(struct drm_i915_private *dev_priv)
 	sseu->has_eu_pg = sseu->eu_per_subslice > 2;
 
 	if (IS_GEN9_LP(dev_priv)) {
-#define IS_SS_DISABLED(ss)	(!(sseu->subslice_mask & BIT(ss)))
-		info->has_pooled_eu = hweight8(sseu->subslice_mask) == 3;
+#define IS_SS_DISABLED(ss)	(!(sseu->subslices_mask[0] & BIT(ss)))
+		info->has_pooled_eu = hweight8(sseu->subslices_mask[0]) == 3;
 
 		sseu->min_eu_in_pool = 0;
 		if (info->has_pooled_eu) {
@@ -251,19 +316,22 @@ static void gen9_sseu_info_init(struct drm_i915_private *dev_priv)
 static void broadwell_sseu_info_init(struct drm_i915_private *dev_priv)
 {
 	struct sseu_dev_info *sseu = &mkwrite_device_info(dev_priv)->sseu;
-	const int s_max = 3, ss_max = 3, eu_max = 8;
 	int s, ss;
-	u32 fuse2, eu_disable[3]; /* s_max */
+	u32 fuse2, subslice_mask, eu_disable[3]; /* s_max */
 
 	fuse2 = I915_READ(GEN8_FUSE2);
 	sseu->slice_mask = (fuse2 & GEN8_F2_S_ENA_MASK) >> GEN8_F2_S_ENA_SHIFT;
+	sseu->max_slices = 3;
+	sseu->max_subslices = 3;
+	sseu->max_eus_per_subslice = 8;
+
 	/*
 	 * The subslice disable field is global, i.e. it applies
 	 * to each of the enabled slices.
 	 */
-	sseu->subslice_mask = GENMASK(ss_max - 1, 0);
-	sseu->subslice_mask &= ~((fuse2 & GEN8_F2_SS_DIS_MASK) >>
-				 GEN8_F2_SS_DIS_SHIFT);
+	subslice_mask = GENMASK(sseu->max_subslices - 1, 0);
+	subslice_mask &= ~((fuse2 & GEN8_F2_SS_DIS_MASK) >>
+			   GEN8_F2_SS_DIS_SHIFT);
 
 	eu_disable[0] = I915_READ(GEN8_EU_DISABLE0) & GEN8_EU_DIS0_S0_MASK;
 	eu_disable[1] = (I915_READ(GEN8_EU_DISABLE0) >> GEN8_EU_DIS0_S1_SHIFT) |
@@ -277,30 +345,36 @@ static void broadwell_sseu_info_init(struct drm_i915_private *dev_priv)
 	 * Iterate through enabled slices and subslices to
 	 * count the total enabled EU.
 	 */
-	for (s = 0; s < s_max; s++) {
+	for (s = 0; s < sseu->max_slices; s++) {
 		if (!(sseu->slice_mask & BIT(s)))
 			/* skip disabled slice */
 			continue;
 
-		for (ss = 0; ss < ss_max; ss++) {
+		sseu->subslices_mask[s] = subslice_mask;
+
+		for (ss = 0; ss < sseu->max_subslices; ss++) {
 			u32 n_disabled;
 
-			if (!(sseu->subslice_mask & BIT(ss)))
+			if (!(sseu->subslices_mask[ss] & BIT(ss)))
 				/* skip disabled subslice */
 				continue;
 
-			n_disabled = hweight8(eu_disable[s] >> (ss * eu_max));
+			sseu->eu_mask[ss + s * sseu->max_subslices] =
+				~(eu_disable[s] >>
+				  (ss * sseu->max_eus_per_subslice));
+			n_disabled = hweight8(eu_disable[s] >>
+					      (ss * sseu->max_eus_per_subslice));
 
 			/*
 			 * Record which subslices have 7 EUs.
 			 */
-			if (eu_max - n_disabled == 7)
+			if (sseu->max_eus_per_subslice - n_disabled == 7)
 				sseu->subslice_7eu[s] |= 1 << ss;
-
-			sseu->eu_total += eu_max - n_disabled;
 		}
 	}
 
+	sseu->eu_total = compute_eu_total(sseu);
+
 	/*
 	 * BDW is expected to always have a uniform distribution of EU across
 	 * subslices with the exception that any one EU in any one subslice may
@@ -437,6 +511,7 @@ void intel_device_info_runtime_init(struct drm_i915_private *dev_priv)
 {
 	struct intel_device_info *info = mkwrite_device_info(dev_priv);
 	enum pipe pipe;
+	int s;
 
 	if (INTEL_GEN(dev_priv) >= 10) {
 		for_each_pipe(dev_priv, pipe)
@@ -548,9 +623,11 @@ void intel_device_info_runtime_init(struct drm_i915_private *dev_priv)
 	DRM_DEBUG_DRIVER("slice total: %u\n", hweight8(info->sseu.slice_mask));
 	DRM_DEBUG_DRIVER("subslice total: %u\n",
 			 sseu_subslice_total(&info->sseu));
-	DRM_DEBUG_DRIVER("subslice mask %04x\n", info->sseu.subslice_mask);
-	DRM_DEBUG_DRIVER("subslice per slice: %u\n",
-			 hweight8(info->sseu.subslice_mask));
+	for (s = 0; s < ARRAY_SIZE(info->sseu.subslices_mask); s++) {
+		DRM_DEBUG_DRIVER("subslice mask %04x\n", info->sseu.subslices_mask[s]);
+		DRM_DEBUG_DRIVER("subslice per slice: %u\n",
+				 hweight8(info->sseu.subslices_mask[s]));
+	}
 	DRM_DEBUG_DRIVER("EU total: %u\n", info->sseu.eu_total);
 	DRM_DEBUG_DRIVER("EU per subslice: %u\n", info->sseu.eu_per_subslice);
 	DRM_DEBUG_DRIVER("has slice power gating: %s\n",
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 2a8160f603ab..bbc724b1ab56 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -2073,7 +2073,7 @@ make_rpcs(struct drm_i915_private *dev_priv)
 
 	if (INTEL_INFO(dev_priv)->sseu.has_subslice_pg) {
 		rpcs |= GEN8_RPCS_SS_CNT_ENABLE;
-		rpcs |= hweight8(INTEL_INFO(dev_priv)->sseu.subslice_mask) <<
+		rpcs |= hweight8(INTEL_INFO(dev_priv)->sseu.subslices_mask[0]) <<
 			GEN8_RPCS_SS_CNT_SHIFT;
 		rpcs |= GEN8_RPCS_ENABLE;
 	}
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index c68ab3ead83c..9c4434438965 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -90,7 +90,7 @@ hangcheck_action_to_str(const enum intel_engine_hangcheck_action a)
 
 #define instdone_subslice_mask(dev_priv__) \
 	(INTEL_GEN(dev_priv__) == 7 ? \
-	 1 : INTEL_INFO(dev_priv__)->sseu.subslice_mask)
+	 1 : INTEL_INFO(dev_priv__)->sseu.subslices_mask[0])
 
 #define for_each_instdone_slice_subslice(dev_priv__, slice__, subslice__) \
 	for ((slice__) = 0, (subslice__) = 0; \
-- 
2.15.1


* [PATCH v6 3/5] drm/i915/debugfs: reuse max slice/subslices already stored in sseu
  2017-12-04 15:02 [PATCH v6 0/5] drm/i915: Expose more GPU properties through sysfs Lionel Landwerlin
  2017-12-04 15:02 ` [PATCH v6 1/5] drm: add card symlink in render sysfs directory Lionel Landwerlin
  2017-12-04 15:02 ` [PATCH v6 2/5] drm/i915: store all subslice masks Lionel Landwerlin
@ 2017-12-04 15:02 ` Lionel Landwerlin
  2017-12-04 15:02 ` [PATCH v6 4/5] drm/i915: expose engine availability through sysfs Lionel Landwerlin
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 19+ messages in thread
From: Lionel Landwerlin @ 2017-12-04 15:02 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel

Now that we have that information in the topology fields, let's just
reuse it.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
---
 drivers/gpu/drm/i915/i915_debugfs.c | 26 ++++++++++----------------
 1 file changed, 10 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index ef091a2a6b12..961c5969b223 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -4355,11 +4355,11 @@ static void gen10_sseu_device_status(struct drm_i915_private *dev_priv,
 				     struct sseu_dev_info *sseu)
 {
 	const struct intel_device_info *info = INTEL_INFO(dev_priv);
-	int s_max = 6, ss_max = 4;
 	int s, ss;
-	u32 s_reg[s_max], eu_reg[2 * s_max], eu_mask[2];
+	u32 s_reg[info->sseu.max_slices],
+		eu_reg[2 * info->sseu.max_subslices], eu_mask[2];
 
-	for (s = 0; s < s_max; s++) {
+	for (s = 0; s < info->sseu.max_slices; s++) {
 		/*
 		 * FIXME: Valid SS Mask respects the spec and read
 		 * only valid bits for those registers, excluding reserverd
@@ -4381,7 +4381,7 @@ static void gen10_sseu_device_status(struct drm_i915_private *dev_priv,
 		     GEN9_PGCTL_SSB_EU210_ACK |
 		     GEN9_PGCTL_SSB_EU311_ACK;
 
-	for (s = 0; s < s_max; s++) {
+	for (s = 0; s < info->sseu.max_slices; s++) {
 		if ((s_reg[s] & GEN9_PGCTL_SLICE_ACK) == 0)
 			/* skip disabled slice */
 			continue;
@@ -4389,7 +4389,7 @@ static void gen10_sseu_device_status(struct drm_i915_private *dev_priv,
 		sseu->slice_mask |= BIT(s);
 		sseu->subslices_mask[s] = info->sseu.subslices_mask[s];
 
-		for (ss = 0; ss < ss_max; ss++) {
+		for (ss = 0; ss < info->sseu.max_subslices; ss++) {
 			unsigned int eu_cnt;
 
 			if (!(s_reg[s] & (GEN9_PGCTL_SS_ACK(ss))))
@@ -4409,17 +4409,11 @@ static void gen10_sseu_device_status(struct drm_i915_private *dev_priv,
 static void gen9_sseu_device_status(struct drm_i915_private *dev_priv,
 				    struct sseu_dev_info *sseu)
 {
-	int s_max = 3, ss_max = 4;
+	const struct intel_device_info *info = INTEL_INFO(dev_priv);
 	int s, ss;
-	u32 s_reg[s_max], eu_reg[2*s_max], eu_mask[2];
-
-	/* BXT has a single slice and at most 3 subslices. */
-	if (IS_GEN9_LP(dev_priv)) {
-		s_max = 1;
-		ss_max = 3;
-	}
+	u32 s_reg[info->sseu.max_slices], eu_reg[2*info->sseu.max_subslices], eu_mask[2];
 
-	for (s = 0; s < s_max; s++) {
+	for (s = 0; s < info->sseu.max_slices; s++) {
 		s_reg[s] = I915_READ(GEN9_SLICE_PGCTL_ACK(s));
 		eu_reg[2*s] = I915_READ(GEN9_SS01_EU_PGCTL_ACK(s));
 		eu_reg[2*s + 1] = I915_READ(GEN9_SS23_EU_PGCTL_ACK(s));
@@ -4434,7 +4428,7 @@ static void gen9_sseu_device_status(struct drm_i915_private *dev_priv,
 		     GEN9_PGCTL_SSB_EU210_ACK |
 		     GEN9_PGCTL_SSB_EU311_ACK;
 
-	for (s = 0; s < s_max; s++) {
+	for (s = 0; s < info->sseu.max_slices; s++) {
 		if ((s_reg[s] & GEN9_PGCTL_SLICE_ACK) == 0)
 			/* skip disabled slice */
 			continue;
@@ -4445,7 +4439,7 @@ static void gen9_sseu_device_status(struct drm_i915_private *dev_priv,
 			sseu->subslices_mask[s] =
 				INTEL_INFO(dev_priv)->sseu.subslices_mask[s];
 
-		for (ss = 0; ss < ss_max; ss++) {
+		for (ss = 0; ss < info->sseu.max_subslices; ss++) {
 			unsigned int eu_cnt;
 
 			if (IS_GEN9_LP(dev_priv)) {
-- 
2.15.1


* [PATCH v6 4/5] drm/i915: expose engine availability through sysfs
  2017-12-04 15:02 [PATCH v6 0/5] drm/i915: Expose more GPU properties through sysfs Lionel Landwerlin
                   ` (2 preceding siblings ...)
  2017-12-04 15:02 ` [PATCH v6 3/5] drm/i915/debugfs: reuse max slice/subslices already stored in sseu Lionel Landwerlin
@ 2017-12-04 15:02 ` Lionel Landwerlin
  2017-12-04 15:02 ` [PATCH v6 5/5] drm/i915: expose EU topology " Lionel Landwerlin
  2017-12-07  9:21 ` [Intel-gfx] [PATCH v6 0/5] drm/i915: Expose more GPU properties " Tvrtko Ursulin
  5 siblings, 0 replies; 19+ messages in thread
From: Lionel Landwerlin @ 2017-12-04 15:02 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel

This enables userspace to discover the engines available on the GPU.
Here is the layout on a Skylake GT4:

/sys/devices/pci0000:00/0000:00:02.0/drm/card0/gt/
└── engines
    ├── bcs
    │   └── 0
    │       ├── capabilities
    │       ├── class
    │       ├── id
    │       └── instance
    ├── rcs
    │   └── 0
    │       ├── capabilities
    │       ├── class
    │       ├── id
    │       └── instance
    ├── vcs
    │   ├── 0
    │   │   ├── capabilities
    │   │   │   └── hevc
    │   │   ├── class
    │   │   ├── id
    │   │   └── instance
    │   └── 1
    │       ├── capabilities
    │       ├── class
    │       ├── id
    │       └── instance
    └── vecs
        └── 0
            ├── capabilities
            ├── class
            ├── id
            └── instance

Instance is the instance number of the engine for a particular class:
$ cat /sys/devices/pci0000\:00/0000\:00\:02.0/drm/card0/gt/engines/bcs/0/instance
0

Id is a global id used for submitting commands from userspace through execbuf:
$ cat /sys/devices/pci0000\:00/0000\:00\:02.0/drm/card0/gt/engines/bcs/0/id
3

Class maps to enum drm_i915_gem_engine_class in i915_drm.h:
$ cat /sys/devices/pci0000\:00/0000\:00\:02.0/drm/card0/gt/engines/bcs/0/class
1

Further capabilities can be added later as attributes of each engine.
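
A sketch of how userspace could enumerate engines with this layout,
walking the class directories and reading the class/instance/id
attributes (the PCI path is the Skylake GT4 example above; a real tool
would derive it from the drm fd via /sys/dev/char/<major>:<minor>):

#include <stdio.h>
#include <string.h>
#include <dirent.h>

static int read_attr(const char *dir, const char *name, char *out, int len)
{
	char path[1024];
	FILE *f;

	snprintf(path, sizeof(path), "%s/%s", dir, name);
	f = fopen(path, "r");
	if (!f)
		return -1;
	if (!fgets(out, len, f))
		out[0] = '\0';
	out[strcspn(out, "\n")] = '\0';
	fclose(f);
	return 0;
}

int main(void)
{
	const char *base =
		"/sys/devices/pci0000:00/0000:00:02.0/drm/card0/gt/engines";
	DIR *classes = opendir(base);
	struct dirent *cls, *inst;

	if (!classes)
		return 1;

	while ((cls = readdir(classes))) {
		char class_dir[512];
		DIR *instances;

		if (cls->d_name[0] == '.')
			continue;

		snprintf(class_dir, sizeof(class_dir), "%s/%s",
			 base, cls->d_name);
		instances = opendir(class_dir);
		if (!instances)
			continue;

		while ((inst = readdir(instances))) {
			char dir[768], cl[16], in[16], id[16];

			if (inst->d_name[0] == '.')
				continue;

			snprintf(dir, sizeof(dir), "%s/%s",
				 class_dir, inst->d_name);
			/* Skip entries without the attributes (e.g. "."). */
			if (read_attr(dir, "class", cl, sizeof(cl)) ||
			    read_attr(dir, "instance", in, sizeof(in)) ||
			    read_attr(dir, "id", id, sizeof(id)))
				continue;

			printf("%s/%s: class=%s instance=%s id=%s\n",
			       cls->d_name, inst->d_name, cl, in, id);
		}
		closedir(instances);
	}
	closedir(classes);
	return 0;
}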

v2: Add capabilities sub directory (Tvrtko)
    Move engines directory to drm/card/gt (Chris)

v3: Move engines to drm/card/gt/engines/ (Tvrtko)
    Add instance attribute to engines (Tvrtko)

v4: Fix dev_priv->gt_topology.engines_classes_kobj size (Lionel)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h         |   6 ++
 drivers/gpu/drm/i915/i915_sysfs.c       | 178 ++++++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/intel_engine_cs.c  |  12 +++
 drivers/gpu/drm/i915/intel_ringbuffer.h |   4 +
 4 files changed, 200 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 0a8e8a3772e5..893ecb0d4639 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2731,6 +2731,12 @@ struct drm_i915_private {
 		} oa;
 	} perf;
 
+	struct {
+		struct kobject gt_kobj;
+		struct kobject engines_kobj;
+		struct kobject engines_classes_kobjs[MAX_ENGINE_CLASS + 1];
+	} gt_topology;
+
 	/* Abstract the submission mechanism (legacy ringbuffer or execlists) away */
 	struct {
 		void (*resume)(struct drm_i915_private *);
diff --git a/drivers/gpu/drm/i915/i915_sysfs.c b/drivers/gpu/drm/i915/i915_sysfs.c
index c74a20b80182..d8ec8ec51cca 100644
--- a/drivers/gpu/drm/i915/i915_sysfs.c
+++ b/drivers/gpu/drm/i915/i915_sysfs.c
@@ -570,6 +570,178 @@ static void i915_setup_error_capture(struct device *kdev) {}
 static void i915_teardown_error_capture(struct device *kdev) {}
 #endif
 
+static struct attribute engine_class_attr = {
+	.name = "class",
+	.mode = 0444,
+};
+
+static struct attribute engine_id_attr = {
+	.name = "id",
+	.mode = 0444,
+};
+
+static struct attribute engine_instance_attr = {
+	.name = "instance",
+	.mode = 0444,
+};
+
+static ssize_t
+show_engine_attr(struct kobject *kobj, struct attribute *attr, char *buf)
+{
+	struct intel_engine_cs *engine =
+		container_of(kobj, struct intel_engine_cs, instance_kobj);
+
+	if (attr == &engine_class_attr)
+		return sprintf(buf, "%hhu\n", engine->uabi_class);
+	if (attr == &engine_id_attr)
+		return sprintf(buf, "%hhu\n", engine->uabi_id);
+	if (attr == &engine_instance_attr)
+		return sprintf(buf, "%hhu\n", engine->instance);
+	return sprintf(buf, "\n");
+}
+
+static const struct sysfs_ops engine_ops = {
+	.show = show_engine_attr,
+};
+
+static struct kobj_type engine_type = {
+	.sysfs_ops = &engine_ops,
+};
+
+static struct attribute engine_capability_hevc_attr = {
+	.name = "hevc",
+	.mode = 0444,
+};
+
+static ssize_t
+show_engine_capabilities_attr(struct kobject *kobj, struct attribute *attr, char *buf)
+{
+	struct intel_engine_cs *engine =
+		container_of(kobj, struct intel_engine_cs, capabilities_kobj);
+
+	if (attr == &engine_capability_hevc_attr)
+		return sprintf(buf, "%i\n", INTEL_GEN(engine->i915) >= 8);
+	return sprintf(buf, "\n");
+}
+
+static const struct sysfs_ops engine_capabilities_ops = {
+	.show = show_engine_capabilities_attr,
+};
+
+static struct kobj_type engine_capabilities_type = {
+	.sysfs_ops = &engine_capabilities_ops,
+};
+
+static int i915_setup_engines_sysfs(struct drm_i915_private *dev_priv,
+				    struct kobject *gt_kobj)
+{
+	struct intel_engine_cs *engine_for_class, *engine;
+	enum intel_engine_id id_for_class, id;
+	struct kobject *engines_kobj = &dev_priv->gt_topology.engines_kobj;
+	bool registred[MAX_ENGINE_CLASS + 1] = { false, };
+	int ret;
+
+	ret = kobject_init_and_add(engines_kobj, gt_kobj->ktype, gt_kobj,
+				   "engines");
+	if (ret)
+		return ret;
+
+	for_each_engine(engine_for_class, dev_priv, id_for_class) {
+		struct kobject *engine_class_kobj =
+			&dev_priv->gt_topology.engines_classes_kobjs[engine_for_class->class];
+
+		if (registred[engine_for_class->class])
+			continue;
+
+		registred[engine_for_class->class] = true;
+
+		ret = kobject_init_and_add(engine_class_kobj,
+					   engines_kobj->ktype,
+					   engines_kobj,
+					   intel_engine_get_class_name(engine_for_class));
+		if (ret)
+			return ret;
+
+		for_each_engine(engine, dev_priv, id) {
+			if (engine->class != engine_for_class->class)
+				continue;
+
+			ret = kobject_init_and_add(&engine->instance_kobj,
+						   &engine_type,
+						   engine_class_kobj,
+						   "%d", engine->instance);
+			if (ret)
+				return ret;
+
+			ret = sysfs_create_file(&engine->instance_kobj,
+						&engine_class_attr);
+			if (ret)
+				return ret;
+			ret = sysfs_create_file(&engine->instance_kobj,
+						&engine_id_attr);
+			if (ret)
+				return ret;
+			ret = sysfs_create_file(&engine->instance_kobj,
+						&engine_instance_attr);
+			if (ret)
+				return ret;
+
+			ret = kobject_init_and_add(&engine->capabilities_kobj,
+						   &engine_capabilities_type,
+						   &engine->instance_kobj,
+						   "capabilities");
+			if (ret)
+				return ret;
+
+			if (engine->id == VCS) {
+				ret = sysfs_create_file(&engine->capabilities_kobj,
+							&engine_capability_hevc_attr);
+				if (ret)
+					return ret;
+			}
+		}
+	}
+
+	return 0;
+}
+
+static void i915_teardown_engines_sysfs(struct drm_i915_private *dev_priv)
+{
+	struct intel_engine_cs *engine;
+	enum intel_engine_id id;
+
+	for_each_engine(engine, dev_priv, id) {
+		sysfs_remove_file(&engine->instance_kobj, &engine_class_attr);
+		sysfs_remove_file(&engine->instance_kobj, &engine_id_attr);
+		sysfs_remove_file(&engine->instance_kobj, &engine_instance_attr);
+		sysfs_remove_file(&engine->capabilities_kobj,
+				  &engine_capability_hevc_attr);
+	}
+}
+
+static int i915_setup_gt_sysfs(struct drm_i915_private *dev_priv,
+			       struct device *kdev)
+{
+	int ret;
+
+	ret = kobject_init_and_add(&dev_priv->gt_topology.gt_kobj,
+				   kdev->kobj.ktype,
+				   &kdev->kobj,
+				   "gt");
+	if (ret)
+		return ret;
+
+	return i915_setup_engines_sysfs(dev_priv, &dev_priv->gt_topology.gt_kobj);
+}
+
+static void i915_teardown_gt_sysfs(struct drm_i915_private *dev_priv)
+{
+	i915_teardown_engines_sysfs(dev_priv);
+
+	kobject_get(&dev_priv->gt_topology.gt_kobj);
+	kobject_del(&dev_priv->gt_topology.gt_kobj);
+}
+
 void i915_setup_sysfs(struct drm_i915_private *dev_priv)
 {
 	struct device *kdev = dev_priv->drm.primary->kdev;
@@ -616,6 +788,10 @@ void i915_setup_sysfs(struct drm_i915_private *dev_priv)
 	if (ret)
 		DRM_ERROR("RPS sysfs setup failed\n");
 
+	ret = i915_setup_gt_sysfs(dev_priv, kdev);
+	if (ret)
+		DRM_ERROR("GT sysfs setup failed\n");
+
 	i915_setup_error_capture(kdev);
 }
 
@@ -625,6 +801,8 @@ void i915_teardown_sysfs(struct drm_i915_private *dev_priv)
 
 	i915_teardown_error_capture(kdev);
 
+	i915_teardown_gt_sysfs(dev_priv);
+
 	if (IS_VALLEYVIEW(dev_priv) || IS_CHERRYVIEW(dev_priv))
 		sysfs_remove_files(&kdev->kobj, vlv_attrs);
 	else
diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
index 86d4c85c8725..7bda1c283f2c 100644
--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
@@ -131,6 +131,18 @@ static const struct engine_info intel_engines[] = {
 	},
 };
 
+/**
+ * intel_engine_get_class_name() - return the name of the class for an engine
+ * @engine: engine
+ *
+ * Return: a string naming the class of the engine
+ */
+const char *
+intel_engine_get_class_name(struct intel_engine_cs *engine)
+{
+	return intel_engine_classes[engine->class].name;
+}
+
 /**
  * ___intel_engine_context_size() - return the size of the context for an engine
  * @dev_priv: i915 device private
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 9c4434438965..dc4eeb897407 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -282,6 +282,9 @@ struct intel_engine_cs {
 	struct drm_i915_private *i915;
 	char name[INTEL_ENGINE_CS_MAX_NAME];
 
+	struct kobject instance_kobj;
+	struct kobject capabilities_kobj;
+
 	enum intel_engine_id id;
 	unsigned int hw_id;
 	unsigned int guc_id;
@@ -995,6 +998,7 @@ gen8_emit_ggtt_write(u32 *cs, u32 value, u32 gtt_offset)
 	return cs;
 }
 
+const char *intel_engine_get_class_name(struct intel_engine_cs *engine);
 bool intel_engine_is_idle(struct intel_engine_cs *engine);
 bool intel_engines_are_idle(struct drm_i915_private *dev_priv);
 
-- 
2.15.1


* [PATCH v6 5/5] drm/i915: expose EU topology through sysfs
  2017-12-04 15:02 [PATCH v6 0/5] drm/i915: Expose more GPU properties through sysfs Lionel Landwerlin
                   ` (3 preceding siblings ...)
  2017-12-04 15:02 ` [PATCH v6 4/5] drm/i915: expose engine availability through sysfs Lionel Landwerlin
@ 2017-12-04 15:02 ` Lionel Landwerlin
  2017-12-07  9:21 ` [Intel-gfx] [PATCH v6 0/5] drm/i915: Expose more GPU properties " Tvrtko Ursulin
  5 siblings, 0 replies; 19+ messages in thread
From: Lionel Landwerlin @ 2017-12-04 15:02 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel

With the introduction of asymmetric slices in CNL, we cannot rely on
the previous SUBSLICE_MASK getparam. Here we introduce a more detailed
way of querying the GPU topology that doesn't aggregate numbers.

This is essential for monitoring parts of the GPU with the OA unit,
because signals need to be accounted for properly based on whether part
of the GPU has been fused off. The current aggregated numbers like
EU_TOTAL do not give us sufficient information.

Here is the sysfs layout on a Skylake GT4:

/sys/devices/pci0000:00/0000:00:02.0/drm/card0/gt/engines/rcs/topology/
├── max_eus_per_subslice
├── max_slices
├── max_subslices_per_slice
├── slice0
│   ├── subslice0
│   │   └── eus_enabled_mask
│   ├── subslice1
│   │   └── eus_enabled_mask
│   ├── subslice2
│   │   └── eus_enabled_mask
│   ├── subslice3
│   │   └── eus_enabled_mask
│   └── subslices_enabled_mask
├── slice1
│   ├── subslice0
│   │   └── eus_enabled_mask
│   ├── subslice1
│   │   └── eus_enabled_mask
│   ├── subslice2
│   │   └── eus_enabled_mask
│   ├── subslice3
│   │   └── eus_enabled_mask
│   └── subslices_enabled_mask
├── slice2
│   ├── subslice0
│   │   └── eus_enabled_mask
│   ├── subslice1
│   │   └── eus_enabled_mask
│   ├── subslice2
│   │   └── eus_enabled_mask
│   ├── subslice3
│   │   └── eus_enabled_mask
│   └── subslices_enabled_mask
└── slices_enabled_mask

Each enabled_mask file gives us a mask of the enabled units:

$ cat /sys/devices/pci0000\:00/0000\:00\:02.0/drm/card0/gt/engines/rcs/topology/slices_enabled_mask
0x7

$ cat /sys/devices/pci0000\:00/0000\:00\:02.0/drm/card0/gt/engines/rcs/topology/slice2/subslice1/eus_enabled_mask
0xff

max_eus_per_subslice/max_slices/max_subslices_per_slice are useful
numbers for userspace to build up an n-dimensional representation of
the device, so that it can be allocated upfront and filled in while
reading through each of the slices/subslices.

v2: Move topology below rcs engine (Chris)
    Add max_eus_per_subslice/max_slices/max_subslices_per_slice (Lionel)

v3: Rename enabled_mask (Lionel)

v4: Move slices to drm/card/gt/rcs/topology (Tvrtko)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h   |  27 +++++
 drivers/gpu/drm/i915/i915_sysfs.c | 208 ++++++++++++++++++++++++++++++++++++++
 2 files changed, 235 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 893ecb0d4639..fbdfd59ed1bd 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2260,6 +2260,24 @@ struct intel_cdclk_state {
 	u8 voltage_level;
 };
 
+struct intel_topology_kobject {
+	struct kobject kobj;
+	struct drm_i915_private *dev_priv;
+};
+
+struct intel_slice_kobject {
+	struct kobject kobj;
+	struct drm_i915_private *dev_priv;
+	u8 slice_index;
+};
+
+struct intel_subslice_kobject {
+	struct kobject kobj;
+	struct drm_i915_private *dev_priv;
+	u8 slice_index;
+	u8 subslice_index;
+};
+
 struct drm_i915_private {
 	struct drm_device drm;
 
@@ -2735,6 +2753,15 @@ struct drm_i915_private {
 		struct kobject gt_kobj;
 		struct kobject engines_kobj;
 		struct kobject engines_classes_kobjs[MAX_ENGINE_CLASS + 1];
+		struct kobject topology_kobj;
+
+		struct sysfs_slice {
+			struct intel_slice_kobject kobj;
+
+			struct sysfs_subslice {
+				struct intel_subslice_kobject kobj;
+			} subslices[GEN_MAX_SUBSLICES];
+		} slices[GEN_MAX_SLICES];
 	} gt_topology;
 
 	/* Abstract the submission mechanism (legacy ringbuffer or execlists) away */
diff --git a/drivers/gpu/drm/i915/i915_sysfs.c b/drivers/gpu/drm/i915/i915_sysfs.c
index d8ec8ec51cca..b277338ab3fb 100644
--- a/drivers/gpu/drm/i915/i915_sysfs.c
+++ b/drivers/gpu/drm/i915/i915_sysfs.c
@@ -570,6 +570,205 @@ static void i915_setup_error_capture(struct device *kdev) {}
 static void i915_teardown_error_capture(struct device *kdev) {}
 #endif
 
+static struct attribute slices_enabled_mask_attr = {
+	.name = "slices_enabled_mask",
+	.mode = 0444,
+};
+
+static struct attribute subslices_enabled_mask_attr = {
+	.name = "subslices_enabled_mask",
+	.mode = 0444,
+};
+
+static struct attribute eus_enabled_mask_attr = {
+	.name = "eus_enabled_mask",
+	.mode = 0444,
+};
+
+static struct attribute max_slices_attr = {
+	.name = "max_slices",
+	.mode = 0444,
+};
+
+static struct attribute max_subslices_per_slice_attr = {
+	.name = "max_subslices_per_slice",
+	.mode = 0444,
+};
+
+static struct attribute max_eus_per_subslice_attr = {
+	.name = "max_eus_per_subslice",
+	.mode = 0444,
+};
+
+static ssize_t
+show_topology_attr(struct kobject *kobj, struct attribute *attr, char *buf)
+{
+	struct drm_i915_private *dev_priv =
+		container_of(kobj, struct drm_i915_private, gt_topology.topology_kobj);
+	const struct sseu_dev_info *sseu = &INTEL_INFO(dev_priv)->sseu;
+
+	if (attr == &slices_enabled_mask_attr)
+		return sprintf(buf, "0x%hhx\n", sseu->slice_mask);
+	if (attr == &max_eus_per_subslice_attr)
+		return sprintf(buf, "%hhd\n", sseu->max_eus_per_subslice);
+	if (attr == &max_subslices_per_slice_attr)
+		return sprintf(buf, "%hhd\n", sseu->max_subslices);
+	if (attr == &max_slices_attr)
+		return sprintf(buf, "%hhd\n", sseu->max_slices);
+	return sprintf(buf, "\n");
+}
+
+static const struct sysfs_ops topology_ops = {
+	.show = show_topology_attr,
+};
+
+static struct kobj_type topology_type = {
+	.sysfs_ops = &topology_ops,
+};
+
+static ssize_t
+show_slice_attr(struct kobject *kobj, struct attribute *attr, char *buf)
+{
+	struct intel_slice_kobject *kobj_wrapper =
+		container_of(kobj, struct intel_slice_kobject, kobj);
+	struct drm_i915_private *dev_priv = kobj_wrapper->dev_priv;
+	const struct sseu_dev_info *sseu = &INTEL_INFO(dev_priv)->sseu;
+
+	if (attr == &subslices_enabled_mask_attr) {
+		return sprintf(buf, "0x%hhx\n",
+			       sseu->subslices_mask[kobj_wrapper->slice_index]);
+	}
+	return sprintf(buf, "\n");
+}
+
+static const struct sysfs_ops slice_ops = {
+	.show = show_slice_attr,
+};
+
+static struct kobj_type slice_type = {
+	.sysfs_ops = &slice_ops,
+};
+
+static ssize_t
+show_subslice_attr(struct kobject *kobj, struct attribute *attr, char *buf)
+{
+	struct intel_subslice_kobject *kobj_wrapper =
+		container_of(kobj, struct intel_subslice_kobject, kobj);
+	struct drm_i915_private *dev_priv = kobj_wrapper->dev_priv;
+	const struct sseu_dev_info *sseu = &INTEL_INFO(dev_priv)->sseu;
+	int subslice_stride = ALIGN(sseu->max_eus_per_subslice, 8) / 8;
+	int slice_stride = sseu->max_subslices * subslice_stride;
+
+	if (attr == &eus_enabled_mask_attr)
+		return sprintf(buf, "0x%hhx\n",
+			       sseu->eu_mask[kobj_wrapper->slice_index * slice_stride +
+					     kobj_wrapper->subslice_index * subslice_stride]);
+	return sprintf(buf, "\n");
+}
+
+static const struct sysfs_ops subslice_ops = {
+	.show = show_subslice_attr,
+};
+
+static struct kobj_type subslice_type = {
+	.sysfs_ops = &subslice_ops,
+};
+
+static int i915_setup_rcs_topology_sysfs(struct drm_i915_private *dev_priv,
+					 struct kobject *engine_class_kobj)
+{
+	const struct sseu_dev_info *sseu = &INTEL_INFO(dev_priv)->sseu;
+	struct kobject *topology_kobj = &dev_priv->gt_topology.topology_kobj;
+	int ret, s, ss;
+
+	ret = kobject_init_and_add(topology_kobj, &topology_type,
+				   engine_class_kobj, "topology");
+	if (ret)
+		return ret;
+
+	ret = sysfs_create_file(topology_kobj, &slices_enabled_mask_attr);
+	if (ret)
+		return ret;
+
+	ret = sysfs_create_file(topology_kobj, &max_slices_attr);
+	if (ret)
+		return ret;
+
+	ret = sysfs_create_file(topology_kobj, &max_subslices_per_slice_attr);
+	if (ret)
+		return ret;
+
+	ret = sysfs_create_file(topology_kobj, &max_eus_per_subslice_attr);
+	if (ret)
+		return ret;
+
+	for (s = 0; s < sseu->max_slices; s++) {
+		struct intel_slice_kobject *slice_kobj =
+			&dev_priv->gt_topology.slices[s].kobj;
+
+		slice_kobj->dev_priv = dev_priv;
+		slice_kobj->slice_index = s;
+		ret = kobject_init_and_add(&slice_kobj->kobj, &slice_type,
+					   topology_kobj, "slice%i", s);
+		if (ret)
+			return ret;
+
+		ret = sysfs_create_file(&slice_kobj->kobj,
+					&subslices_enabled_mask_attr);
+		if (ret)
+			return ret;
+
+		for (ss = 0; ss < sseu->max_subslices; ss++) {
+			struct intel_subslice_kobject *subslice_kobj =
+				&dev_priv->gt_topology.slices[s].subslices[ss].kobj;
+
+			subslice_kobj->dev_priv = dev_priv;
+			subslice_kobj->slice_index = s;
+			subslice_kobj->subslice_index = ss;
+			ret = kobject_init_and_add(&subslice_kobj->kobj,
+						   &subslice_type,
+						   &slice_kobj->kobj,
+						   "subslice%i", ss);
+			if (ret)
+				return ret;
+
+			ret = sysfs_create_file(&subslice_kobj->kobj,
+						&eus_enabled_mask_attr);
+			if (ret)
+				return ret;
+		}
+	}
+
+	return 0;
+}
+
+static void i915_teardown_rcs_topology_sysfs(struct drm_i915_private *dev_priv)
+{
+	const struct sseu_dev_info *sseu = &INTEL_INFO(dev_priv)->sseu;
+	struct kobject *topology_kobj = &dev_priv->gt_topology.topology_kobj;
+	int s, ss;
+
+	for (s = 0; s < sseu->max_slices; s++) {
+		struct intel_slice_kobject *slice_kobj =
+			&dev_priv->gt_topology.slices[s].kobj;
+
+		for (ss = 0; ss < sseu->max_subslices; ss++) {
+			struct intel_subslice_kobject *subslice_kobj =
+				&dev_priv->gt_topology.slices[s].subslices[ss].kobj;
+
+			sysfs_remove_file(&subslice_kobj->kobj,
+					  &eus_enabled_mask_attr);
+		}
+
+		sysfs_remove_file(&slice_kobj->kobj,
+				  &subslices_enabled_mask_attr);
+	}
+	sysfs_remove_file(topology_kobj, &slices_enabled_mask_attr);
+	sysfs_remove_file(topology_kobj, &max_eus_per_subslice_attr);
+	sysfs_remove_file(topology_kobj, &max_subslices_per_slice_attr);
+	sysfs_remove_file(topology_kobj, &max_slices_attr);
+}
+
 static struct attribute engine_class_attr = {
 	.name = "class",
 	.mode = 0444,
@@ -662,6 +861,13 @@ static int i915_setup_engines_sysfs(struct drm_i915_private *dev_priv,
 		if (ret)
 			return ret;
 
+		if (engine_for_class->class == RENDER_CLASS) {
+			ret = i915_setup_rcs_topology_sysfs(dev_priv,
+							    engine_class_kobj);
+			if (ret)
+				return ret;
+		}
+
 		for_each_engine(engine, dev_priv, id) {
 			if (engine->class != engine_for_class->class)
 				continue;
@@ -710,6 +916,8 @@ static void i915_teardown_engines_sysfs(struct drm_i915_private *dev_priv)
 	struct intel_engine_cs *engine;
 	enum intel_engine_id id;
 
+	i915_teardown_rcs_topology_sysfs(dev_priv);
+
 	for_each_engine(engine, dev_priv, id) {
 		sysfs_remove_file(&engine->instance_kobj, &engine_class_attr);
 		sysfs_remove_file(&engine->instance_kobj, &engine_id_attr);
-- 
2.15.1


* Re: [Intel-gfx] [PATCH v6 0/5] drm/i915: Expose more GPU properties through sysfs
  2017-12-04 15:02 [PATCH v6 0/5] drm/i915: Expose more GPU properties through sysfs Lionel Landwerlin
                   ` (4 preceding siblings ...)
  2017-12-04 15:02 ` [PATCH v6 5/5] drm/i915: expose EU topology " Lionel Landwerlin
@ 2017-12-07  9:21 ` Tvrtko Ursulin
  2017-12-11 10:50   ` Joonas Lahtinen
  5 siblings, 1 reply; 19+ messages in thread
From: Tvrtko Ursulin @ 2017-12-07  9:21 UTC (permalink / raw)
  To: Lionel Landwerlin, intel-gfx; +Cc: dri-devel


On 04/12/2017 15:02, Lionel Landwerlin wrote:
> Hi,
> 
> After discussion with Chris, Joonas & Tvrtko, this series adds an
> additional commit to link the render node back to the card through a
> symlink, making it obvious to an application using a render node where
> to get the information it needs.

Important thing to mention as well is that it is trivial to get from the 
master drm fd to the sysfs root, via fstat and opendir 
/sys/dev/char/<major>:<minor>. With the addition of the card symlink to 
render nodes it is trivial for render node fd as well.

I am happy with this approach - it is extensible, flexible and avoids 
issues with ioctl versioning or whatnot. With one value per file it is 
trivial for userspace to access.

So as far as I'm concerned, given how gputop would use all of this and so
be the userspace, if everyone else is happy, I think we could do a
detailed review and perhaps also think about including gputop in some
distribution to make the case 100% straightforward.

Regards,

Tvrtko

> 
> Cheers,
> 
> Lionel Landwerlin (5):
>    drm: add card symlink in render sysfs directory
>    drm/i915: store all subslice masks
>    drm/i915/debugfs: reuse max slice/subslices already stored in sseu
>    drm/i915: expose engine availability through sysfs
>    drm/i915: expose EU topology through sysfs
> 
>   drivers/gpu/drm/drm_drv.c                |  11 +
>   drivers/gpu/drm/i915/i915_debugfs.c      |  50 ++--
>   drivers/gpu/drm/i915/i915_drv.c          |   2 +-
>   drivers/gpu/drm/i915/i915_drv.h          |  56 ++++-
>   drivers/gpu/drm/i915/i915_sysfs.c        | 386 +++++++++++++++++++++++++++++++
>   drivers/gpu/drm/i915/intel_device_info.c | 169 ++++++++++----
>   drivers/gpu/drm/i915/intel_engine_cs.c   |  12 +
>   drivers/gpu/drm/i915/intel_lrc.c         |   2 +-
>   drivers/gpu/drm/i915/intel_ringbuffer.h  |   6 +-
>   9 files changed, 617 insertions(+), 77 deletions(-)
> 
> --
> 2.15.1

* Re: [Intel-gfx] [PATCH v6 0/5] drm/i915: Expose more GPU properties through sysfs
  2017-12-07  9:21 ` [Intel-gfx] [PATCH v6 0/5] drm/i915: Expose more GPU properties " Tvrtko Ursulin
@ 2017-12-11 10:50   ` Joonas Lahtinen
  2017-12-11 13:29     ` Lionel Landwerlin
  2017-12-11 14:38     ` [Intel-gfx] " Tvrtko Ursulin
  0 siblings, 2 replies; 19+ messages in thread
From: Joonas Lahtinen @ 2017-12-11 10:50 UTC (permalink / raw)
  To: Tvrtko Ursulin, Lionel Landwerlin, intel-gfx, Chris Wilson,
	Daniel Vetter
  Cc: dri-devel

+ Daniel, Chris

On Thu, 2017-12-07 at 09:21 +0000, Tvrtko Ursulin wrote:
> On 04/12/2017 15:02, Lionel Landwerlin wrote:
> > Hi,
> > 
> > After discussion with Chris, Joonas & Tvrtko, this series adds an
> > additional commit to link the render node back to the card through a
> > symlink, making it obvious to an application using a render node where
> > to get the information it needs.
> 
> Important thing to mention as well is that it is trivial to get from the 
> master drm fd to the sysfs root, via fstat and opendir 
> /sys/dev/char/<major>:<minor>. With the addition of the card symlink to 
> render nodes it is trivial for render node fd as well.
> 
> I am happy with this approach - it is extensible, flexible and avoids 
> issues with ioctl versioning or whatnot. With one value per file it is 
> trivial for userspace to access.
> 
> So as far as I'm concerned, given how gputop would use all of this and so
> be the userspace, if everyone else is happy, I think we could do a
> detailed review and perhaps also think about including gputop in some
> distribution to make the case 100% straightforward.

For the GPU topology I agree this is the right choice, it's going to be
about topology after all, and a directory tree is the perfect candidate.
And if a new platform appears, then it's a new platform and may change
the topology, as the hardware topology has changed.

For the engine enumeration, I'm not equally sold on exposing it through
sysfs. It's a "linear list of engine instances with flags" as far as
userspace is going to be looking at them. And it's also information
about what to pass to an IOCTL as arguments after the decision has been
made, and at that point you already have the FD you know you'll be
dealing with at hand. So another IOCTL for that seems more convenient.

So I'd say for the GPU topology part, we go forward with the review and
make sure we don't expose driver internal bits that could break when
refactoring code. If the exposed N bits of information are strictly
tied to the underlying hardware, we should have no trouble maintaining
that for the foreseeable future.

Then we can continue on about the engine discovery in parallel, not
blocking GPU topology discovery.

Regards, Joonas

> 
> Regards,
> 
> Tvrtko
> 
> > 
> > Cheers,
> > 
> > Lionel Landwerlin (5):
> >    drm: add card symlink in render sysfs directory
> >    drm/i915: store all subslice masks
> >    drm/i915/debugfs: reuse max slice/subslices already stored in sseu
> >    drm/i915: expose engine availability through sysfs
> >    drm/i915: expose EU topology through sysfs
> > 
> >   drivers/gpu/drm/drm_drv.c                |  11 +
> >   drivers/gpu/drm/i915/i915_debugfs.c      |  50 ++--
> >   drivers/gpu/drm/i915/i915_drv.c          |   2 +-
> >   drivers/gpu/drm/i915/i915_drv.h          |  56 ++++-
> >   drivers/gpu/drm/i915/i915_sysfs.c        | 386 +++++++++++++++++++++++++++++++
> >   drivers/gpu/drm/i915/intel_device_info.c | 169 ++++++++++----
> >   drivers/gpu/drm/i915/intel_engine_cs.c   |  12 +
> >   drivers/gpu/drm/i915/intel_lrc.c         |   2 +-
> >   drivers/gpu/drm/i915/intel_ringbuffer.h  |   6 +-
> >   9 files changed, 617 insertions(+), 77 deletions(-)
> > 
> > --
> > 2.15.1
-- 
Joonas Lahtinen
Open Source Technology Center
Intel Corporation

* Re: [PATCH v6 0/5] drm/i915: Expose more GPU properties through sysfs
  2017-12-11 10:50   ` Joonas Lahtinen
@ 2017-12-11 13:29     ` Lionel Landwerlin
  2017-12-11 14:38     ` [Intel-gfx] " Tvrtko Ursulin
  1 sibling, 0 replies; 19+ messages in thread
From: Lionel Landwerlin @ 2017-12-11 13:29 UTC (permalink / raw)
  To: Joonas Lahtinen, Tvrtko Ursulin, intel-gfx, Chris Wilson, Daniel Vetter
  Cc: dri-devel

On 11/12/17 10:50, Joonas Lahtinen wrote:
> + Daniel, Chris
>
> On Thu, 2017-12-07 at 09:21 +0000, Tvrtko Ursulin wrote:
>> On 04/12/2017 15:02, Lionel Landwerlin wrote:
>>> Hi,
>>>
>>> After discussion with Chris, Joonas & Tvrtko, this series adds an
>>> additional commit to link the render node back to the card through a
>>> symlink. Making it obvious from an application using a render node to
>>> know where to get the information it needs.
>> Important thing to mention as well is that it is trivial to get from the
>> master drm fd to the sysfs root, via fstat and opendir
>> /sys/dev/char/<major>:<minor>. With the addition of the card symlink to
>> render nodes it is trivial for render node fd as well.
>>
>> I am happy with this approach - it is extensible, flexible and avoids
>> issues with ioctl versioning or whatnot. With one value per file it is
>> trivial for userspace to access.
>>
>> So for what I'm concerned, given how gputop would use all of this and so
>> be the userspace, if everyone else is happy, I think we could do a
>> detailed review and prehaps also think about including gputop in some
>> distribution to make the case 100% straightforward.
> For the GPU topology I agree this is the right choice, it's going to be
> about topology after all, and directory tree is the perfect candidate.
> And if a new platform appears, then it's a new platform and may change
> the topology well the hardware topology has changed.
>
> For the engine enumeration, I'm not equally sold for sysfs exposing it.
> It's a "linear list of engine instances with flags" how the userspace
> is going to be looking at them. And it's also information about what to
> pass to an IOCTL as arguments after decision has been made, and then
> you already have the FD you know you'll be dealing with, at hand. So
> another IOCTL for that seems more convenient.

We'll be working with topology from the render FD too.
This last series added the ability to find the card from the fd's 
major/minor just for that case.
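For illustration, a minimal userspace sketch of that lookup (the helper
name is made up; the "card" symlink it relies on is the one proposed in
patch 1/5 of this series):

  #include <stdio.h>
  #include <sys/stat.h>
  #include <sys/sysmacros.h>
  #include <unistd.h>

  /* Build the sysfs path for a DRM fd via /sys/dev/char/<major>:<minor>.
   * With the proposed symlink, <path>/card points at the card directory
   * even for render node fds. */
  static int drm_fd_to_sysfs_card(int fd, char *buf, size_t len)
  {
          struct stat st;

          if (fstat(fd, &st) || !S_ISCHR(st.st_mode))
                  return -1;

          snprintf(buf, len, "/sys/dev/char/%u:%u/card",
                   major(st.st_rdev), minor(st.st_rdev));
          return 0;
  }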

>
> So I'd say for the GPU topology part, we go forward with the review and
> make sure we don't expose driver internal bits that could break when
> refactoring code. If the exposed N bits of information are strictly
> tied to the underlying hardware, we should have no trouble maintaining
> that for the foreseeable future.

If we're going with just topology through sysfs, what should the layout 
look like?
card/gt/rcs/topology?

>
> Then we can continue on about the engine discovery in parallel, not
> blocking GPU topology discovery.
>
> Regards, Joonas
>
>> Regards,
>>
>> Tvrtko
>>
>>> Cheers,
>>>
>>> Lionel Landwerlin (5):
>>>     drm: add card symlink in render sysfs directory
>>>     drm/i915: store all subslice masks
>>>     drm/i915/debugfs: reuse max slice/subslices already stored in sseu
>>>     drm/i915: expose engine availability through sysfs
>>>     drm/i915: expose EU topology through sysfs
>>>
>>>    drivers/gpu/drm/drm_drv.c                |  11 +
>>>    drivers/gpu/drm/i915/i915_debugfs.c      |  50 ++--
>>>    drivers/gpu/drm/i915/i915_drv.c          |   2 +-
>>>    drivers/gpu/drm/i915/i915_drv.h          |  56 ++++-
>>>    drivers/gpu/drm/i915/i915_sysfs.c        | 386 +++++++++++++++++++++++++++++++
>>>    drivers/gpu/drm/i915/intel_device_info.c | 169 ++++++++++----
>>>    drivers/gpu/drm/i915/intel_engine_cs.c   |  12 +
>>>    drivers/gpu/drm/i915/intel_lrc.c         |   2 +-
>>>    drivers/gpu/drm/i915/intel_ringbuffer.h  |   6 +-
>>>    9 files changed, 617 insertions(+), 77 deletions(-)
>>>
>>> --
>>> 2.15.1
>>> _______________________________________________
>>> Intel-gfx mailing list
>>> Intel-gfx@lists.freedesktop.org
>>> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
>>>
>> _______________________________________________
>> Intel-gfx mailing list
>> Intel-gfx@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/intel-gfx


_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [Intel-gfx] [PATCH v6 0/5] drm/i915: Expose more GPU properties through sysfs
  2017-12-11 10:50   ` Joonas Lahtinen
  2017-12-11 13:29     ` Lionel Landwerlin
@ 2017-12-11 14:38     ` Tvrtko Ursulin
  2017-12-11 14:47       ` Lionel Landwerlin
  2017-12-11 21:05       ` [Intel-gfx] " Daniel Vetter
  1 sibling, 2 replies; 19+ messages in thread
From: Tvrtko Ursulin @ 2017-12-11 14:38 UTC (permalink / raw)
  To: Joonas Lahtinen, Lionel Landwerlin, intel-gfx, Chris Wilson,
	Daniel Vetter
  Cc: dri-devel


On 11/12/2017 10:50, Joonas Lahtinen wrote:
> + Daniel, Chris
> 
> On Thu, 2017-12-07 at 09:21 +0000, Tvrtko Ursulin wrote:
>> On 04/12/2017 15:02, Lionel Landwerlin wrote:
>>> Hi,
>>>
>>> After discussion with Chris, Joonas & Tvrtko, this series adds an
>>> additional commit to link the render node back to the card through a
>>> symlink. Making it obvious from an application using a render node to
>>> know where to get the information it needs.
>>
>> Important thing to mention as well is that it is trivial to get from the
>> master drm fd to the sysfs root, via fstat and opendir
>> /sys/dev/char/<major>:<minor>. With the addition of the card symlink to
>> render nodes it is trivial for render node fd as well.
>>
>> I am happy with this approach - it is extensible, flexible and avoids
>> issues with ioctl versioning or whatnot. With one value per file it is
>> trivial for userspace to access.
>>
>> So for what I'm concerned, given how gputop would use all of this and so
>> be the userspace, if everyone else is happy, I think we could do a
>> detailed review and prehaps also think about including gputop in some
>> distribution to make the case 100% straightforward.
> 
> For the GPU topology I agree this is the right choice, it's going to be
> about topology after all, and directory tree is the perfect candidate.
> And if a new platform appears, then it's a new platform and may change
> the topology well the hardware topology has changed.
> 
> For the engine enumeration, I'm not equally sold for sysfs exposing it.
> It's a "linear list of engine instances with flags" how the userspace
> is going to be looking at them. And it's also information about what to
> pass to an IOCTL as arguments after decision has been made, and then
> you already have the FD you know you'll be dealing with, at hand. So
> another IOCTL for that seems more convenient.

Apart from being more flexible and easier to extend, sysfs might be a
better fit for applications which do not otherwise need a drm fd. Say a
top-like tool which shows engine utilization, or those patches I RFC-ed
recently which do the same but per DRM client.

Okay, these stats are now also available via the PMU, so the argument is
not the strongest, I admit, but I still find it quite neat. It also might
allow us to define our own policy with regard to the privilege needed to
access these stats, and not be governed by the perf API rules.

> So I'd say for the GPU topology part, we go forward with the review and
> make sure we don't expose driver internal bits that could break when
> refactoring code. If the exposed N bits of information are strictly
> tied to the underlying hardware, we should have no trouble maintaining
> that for the foreseeable future.
> 
> Then we can continue on about the engine discovery in parallel, not
> blocking GPU topology discovery.

I can live with that, but would like to keep the gt/engines/ namespace
reserved for the eventuality that we go with engine info in sysfs at a
later stage.

Also, Lionel, did you have plans to use the engine info straight away in
gpu top, or did you only need the topology? I think you were drawing a
nice block diagram of a GPU, so do you need it for that?

Regards,

Tvrtko
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v6 0/5] drm/i915: Expose more GPU properties through sysfs
  2017-12-11 14:38     ` [Intel-gfx] " Tvrtko Ursulin
@ 2017-12-11 14:47       ` Lionel Landwerlin
  2017-12-11 21:05       ` [Intel-gfx] " Daniel Vetter
  1 sibling, 0 replies; 19+ messages in thread
From: Lionel Landwerlin @ 2017-12-11 14:47 UTC (permalink / raw)
  To: Tvrtko Ursulin, Joonas Lahtinen, intel-gfx, Chris Wilson, Daniel Vetter
  Cc: dri-devel

On 11/12/17 14:38, Tvrtko Ursulin wrote:
>
> On 11/12/2017 10:50, Joonas Lahtinen wrote:
>> + Daniel, Chris
>>
>> On Thu, 2017-12-07 at 09:21 +0000, Tvrtko Ursulin wrote:
>>> On 04/12/2017 15:02, Lionel Landwerlin wrote:
>>>> Hi,
>>>>
>>>> After discussion with Chris, Joonas & Tvrtko, this series adds an
>>>> additional commit to link the render node back to the card through a
>>>> symlink. Making it obvious from an application using a render node to
>>>> know where to get the information it needs.
>>>
>>> Important thing to mention as well is that it is trivial to get from 
>>> the
>>> master drm fd to the sysfs root, via fstat and opendir
>>> /sys/dev/char/<major>:<minor>. With the addition of the card symlink to
>>> render nodes it is trivial for render node fd as well.
>>>
>>> I am happy with this approach - it is extensible, flexible and avoids
>>> issues with ioctl versioning or whatnot. With one value per file it is
>>> trivial for userspace to access.
>>>
>>> So for what I'm concerned, given how gputop would use all of this 
>>> and so
>>> be the userspace, if everyone else is happy, I think we could do a
>>> detailed review and prehaps also think about including gputop in some
>>> distribution to make the case 100% straightforward.
>>
>> For the GPU topology I agree this is the right choice, it's going to be
>> about topology after all, and directory tree is the perfect candidate.
>> And if a new platform appears, then it's a new platform and may change
>> the topology well the hardware topology has changed.
>>
>> For the engine enumeration, I'm not equally sold for sysfs exposing it.
>> It's a "linear list of engine instances with flags" how the userspace
>> is going to be looking at them. And it's also information about what to
>> pass to an IOCTL as arguments after decision has been made, and then
>> you already have the FD you know you'll be dealing with, at hand. So
>> another IOCTL for that seems more convenient.
>
> Apart from more flexibility and easier to extend, sysfs might be a 
> better fit for applications which do not otherwise need a drm fd. Say 
> a top-like tool which shows engine utilization, or those patches I 
> RFC-ed recently which do the same but per DRM client.
>
> Okay, these stats are now available also via PMU so the argument is 
> not the strongest I admit, but I still find it quite neat. It also 
> might allow us to define our own policy with regards to needed 
> privilege to access these stats, and not be governed by the perf API 
> rules.
>
>> So I'd say for the GPU topology part, we go forward with the review and
>> make sure we don't expose driver internal bits that could break when
>> refactoring code. If the exposed N bits of information are strictly
>> tied to the underlying hardware, we should have no trouble maintaining
>> that for the foreseeable future.
>>
>> Then we can continue on about the engine discovery in parallel, not
>> blocking GPU topology discovery.
>
> I can live with that, but would like to keep the gt/engines/ namespace 
> reserved for the eventuality with go with engine info in sysfs at a 
> later stage then.
>
> Also, Lionel, did you have plans to use the engine info straight away 
> in gpu top, or you only needed topology? I think you were drawing a 
> nice block diagram of a GPU so do you need it for that?

Yes, I was planning to use it (in whatever form). I think it's quite a
good piece of information to show in GPUTop.
The topology is not so much needed for diagrams, but more for making
sense of the counters (which might need to be normalized by the number
of EUs per slice/subslice, for example).
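As a purely illustrative example (all names made up), a counter
aggregated over the whole GPU could be normalized with values read from
the exposed topology:

  /* Hypothetical normalization; the real divisor depends on the metric. */
  static double normalize_per_eu(double raw_counter, unsigned int slices,
                                 unsigned int subslices_per_slice,
                                 unsigned int eus_per_subslice)
  {
          return raw_counter /
                 (slices * subslices_per_slice * eus_per_subslice);
  }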

>
> Regards,
>
> Tvrtko
>

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [Intel-gfx] [PATCH v6 0/5] drm/i915: Expose more GPU properties through sysfs
  2017-12-11 14:38     ` [Intel-gfx] " Tvrtko Ursulin
  2017-12-11 14:47       ` Lionel Landwerlin
@ 2017-12-11 21:05       ` Daniel Vetter
  2017-12-12 11:19         ` Tvrtko Ursulin
  1 sibling, 1 reply; 19+ messages in thread
From: Daniel Vetter @ 2017-12-11 21:05 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: Daniel Vetter, intel-gfx, Joonas Lahtinen, dri-devel

On Mon, Dec 11, 2017 at 02:38:53PM +0000, Tvrtko Ursulin wrote:
> On 11/12/2017 10:50, Joonas Lahtinen wrote:
> > + Daniel, Chris
> > 
> > On Thu, 2017-12-07 at 09:21 +0000, Tvrtko Ursulin wrote:
> > > On 04/12/2017 15:02, Lionel Landwerlin wrote:
> > > > Hi,
> > > > 
> > > > After discussion with Chris, Joonas & Tvrtko, this series adds an
> > > > additional commit to link the render node back to the card through a
> > > > symlink. Making it obvious from an application using a render node to
> > > > know where to get the information it needs.
> > > 
> > > Important thing to mention as well is that it is trivial to get from the
> > > master drm fd to the sysfs root, via fstat and opendir
> > > /sys/dev/char/<major>:<minor>. With the addition of the card symlink to
> > > render nodes it is trivial for render node fd as well.
> > > 
> > > I am happy with this approach - it is extensible, flexible and avoids
> > > issues with ioctl versioning or whatnot. With one value per file it is
> > > trivial for userspace to access.
> > > 
> > > So for what I'm concerned, given how gputop would use all of this and so
> > > be the userspace, if everyone else is happy, I think we could do a
> > > detailed review and prehaps also think about including gputop in some
> > > distribution to make the case 100% straightforward.
> > 
> > For the GPU topology I agree this is the right choice, it's going to be
> > about topology after all, and directory tree is the perfect candidate.
> > And if a new platform appears, then it's a new platform and may change
> > the topology well the hardware topology has changed.
> > 
> > For the engine enumeration, I'm not equally sold for sysfs exposing it.
> > It's a "linear list of engine instances with flags" how the userspace
> > is going to be looking at them. And it's also information about what to
> > pass to an IOCTL as arguments after decision has been made, and then
> > you already have the FD you know you'll be dealing with, at hand. So
> > another IOCTL for that seems more convenient.
> 
> Apart from more flexibility and easier to extend, sysfs might be a better
> fit for applications which do not otherwise need a drm fd. Say a top-like
> tool which shows engine utilization, or those patches I RFC-ed recently
> which do the same but per DRM client.
> 
> Okay, these stats are now available also via PMU so the argument is not the
> strongest I admit, but I still find it quite neat. It also might allow us to
> define our own policy with regards to needed privilege to access these
> stats, and not be governed by the perf API rules.

How exactly is sysfs easier to extend than an ioctl? There's a lot of
serializing and deserializing going on, ime that's more boilerplate. Imo
the only reason for sysfs is when you _must_ access it without having an
fd to the gpu. The inverse is generally not true (i.e. using sysfs when
you already have the fd), and as soon as you add a writeable field here
you're screwed (because sysfs can't be passed to anyone else but root,
compared to a drm fd - viz the backlight fiasco).

But even without writeable fields: Think of highly contained
containers/clients which only get the drm fd to render. If sysfs is gone,
will they fall on their faces? Atm "drm fd is all you need" is very deeply
ingrained into our various OS stacks.
-Daniel

> > So I'd say for the GPU topology part, we go forward with the review and
> > make sure we don't expose driver internal bits that could break when
> > refactoring code. If the exposed N bits of information are strictly
> > tied to the underlying hardware, we should have no trouble maintaining
> > that for the foreseeable future.
> > 
> > Then we can continue on about the engine discovery in parallel, not
> > blocking GPU topology discovery.
> 
> I can live with that, but would like to keep the gt/engines/ namespace
> reserved for the eventuality with go with engine info in sysfs at a later
> stage then.
> 
> Also, Lionel, did you have plans to use the engine info straight away in gpu
> top, or you only needed topology? I think you were drawing a nice block
> diagram of a GPU so do you need it for that?
> 
> Regards,
> 
> Tvrtko

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [Intel-gfx] [PATCH v6 0/5] drm/i915: Expose more GPU properties through sysfs
  2017-12-11 21:05       ` [Intel-gfx] " Daniel Vetter
@ 2017-12-12 11:19         ` Tvrtko Ursulin
  2017-12-12 14:33           ` Lionel Landwerlin
  2017-12-12 15:18           ` Daniel Vetter
  0 siblings, 2 replies; 19+ messages in thread
From: Tvrtko Ursulin @ 2017-12-12 11:19 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Daniel Vetter, intel-gfx, Joonas Lahtinen, dri-devel


On 11/12/2017 21:05, Daniel Vetter wrote:
> On Mon, Dec 11, 2017 at 02:38:53PM +0000, Tvrtko Ursulin wrote:
>> On 11/12/2017 10:50, Joonas Lahtinen wrote:
>>> + Daniel, Chris
>>>
>>> On Thu, 2017-12-07 at 09:21 +0000, Tvrtko Ursulin wrote:
>>>> On 04/12/2017 15:02, Lionel Landwerlin wrote:
>>>>> Hi,
>>>>>
>>>>> After discussion with Chris, Joonas & Tvrtko, this series adds an
>>>>> additional commit to link the render node back to the card through a
>>>>> symlink. Making it obvious from an application using a render node to
>>>>> know where to get the information it needs.
>>>>
>>>> Important thing to mention as well is that it is trivial to get from the
>>>> master drm fd to the sysfs root, via fstat and opendir
>>>> /sys/dev/char/<major>:<minor>. With the addition of the card symlink to
>>>> render nodes it is trivial for render node fd as well.
>>>>
>>>> I am happy with this approach - it is extensible, flexible and avoids
>>>> issues with ioctl versioning or whatnot. With one value per file it is
>>>> trivial for userspace to access.
>>>>
>>>> So for what I'm concerned, given how gputop would use all of this and so
>>>> be the userspace, if everyone else is happy, I think we could do a
>>>> detailed review and prehaps also think about including gputop in some
>>>> distribution to make the case 100% straightforward.
>>>
>>> For the GPU topology I agree this is the right choice, it's going to be
>>> about topology after all, and directory tree is the perfect candidate.
>>> And if a new platform appears, then it's a new platform and may change
>>> the topology well the hardware topology has changed.
>>>
>>> For the engine enumeration, I'm not equally sold for sysfs exposing it.
>>> It's a "linear list of engine instances with flags" how the userspace
>>> is going to be looking at them. And it's also information about what to
>>> pass to an IOCTL as arguments after decision has been made, and then
>>> you already have the FD you know you'll be dealing with, at hand. So
>>> another IOCTL for that seems more convenient.
>>
>> Apart from more flexibility and easier to extend, sysfs might be a better
>> fit for applications which do not otherwise need a drm fd. Say a top-like
>> tool which shows engine utilization, or those patches I RFC-ed recently
>> which do the same but per DRM client.
>>
>> Okay, these stats are now available also via PMU so the argument is not the
>> strongest I admit, but I still find it quite neat. It also might allow us to
>> define our own policy with regards to needed privilege to access these
>> stats, and not be governed by the perf API rules.
> 
> How exactly is sysfs easier to extend than ioctl? There's lots of

Easier as in no need to version, add has_this/has_that markers, try to
guess today how big a field might need to be in the future, and similar.

> serializing and deserializing going on, ime that's more boilerplate. Imo
> the only reason for sysfs is when you _must_ access it without having an
> fd to the gpu. The inverse is generally not true (i.e. using sysfs when
> you have the fd already), and as soon as you add a writeable field here
> you're screwed (because sysfs can't be passed to anyone else but root,
> compared to drm fd - viz the backlight fiasco).

I would perhaps expand the "must access without having a drm fd" to
"more convenient to access without a drm fd". My use case here was the
per-client engine usage stats. The equivalent of /proc/<pid>/stat, or
even /proc/stat if you want. So I was interested in creating a foothold
in sysfs for that purpose.

No writable fields were imagined in any of these two or three use cases.

> But even without writeable fields: Think of highly contained
> containers/clients which only get the drm fd to render. If sysfs is gone,
> will they fall on their faces? Atm "drm fd is all you need" is very deeply
> ingrained into our various OS stacks.

Okay, this is something which was mentioned but I think the answer was
unclear. If current stacks do work without sysfs, then this is a good
argument to keep that ability.

As I said, I am OK to drop the engine info bits from this series. The
question for Lionel, gpu-top and Mesa then is whether sysfs works for
them for the remaining topology information. The attractiveness of sysfs
there was that it looked easier to make future-proof for more or less
hypothetical future topologies.

Regards,

Tvrtko


> -Daniel
> 
>>> So I'd say for the GPU topology part, we go forward with the review and
>>> make sure we don't expose driver internal bits that could break when
>>> refactoring code. If the exposed N bits of information are strictly
>>> tied to the underlying hardware, we should have no trouble maintaining
>>> that for the foreseeable future.
>>>
>>> Then we can continue on about the engine discovery in parallel, not
>>> blocking GPU topology discovery.
>>
>> I can live with that, but would like to keep the gt/engines/ namespace
>> reserved for the eventuality with go with engine info in sysfs at a later
>> stage then.
>>
>> Also, Lionel, did you have plans to use the engine info straight away in gpu
>> top, or you only needed topology? I think you were drawing a nice block
>> diagram of a GPU so do you need it for that?
>>
>> Regards,
>>
>> Tvrtko
> 
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [Intel-gfx] [PATCH v6 0/5] drm/i915: Expose more GPU properties through sysfs
  2017-12-12 11:19         ` Tvrtko Ursulin
@ 2017-12-12 14:33           ` Lionel Landwerlin
  2017-12-13  8:17             ` Daniel Vetter
  2017-12-12 15:18           ` Daniel Vetter
  1 sibling, 1 reply; 19+ messages in thread
From: Lionel Landwerlin @ 2017-12-12 14:33 UTC (permalink / raw)
  To: Tvrtko Ursulin, Daniel Vetter
  Cc: Daniel Vetter, intel-gfx, Joonas Lahtinen, dri-devel

On 12/12/17 11:19, Tvrtko Ursulin wrote:
>
> On 11/12/2017 21:05, Daniel Vetter wrote:
>> On Mon, Dec 11, 2017 at 02:38:53PM +0000, Tvrtko Ursulin wrote:
>>> On 11/12/2017 10:50, Joonas Lahtinen wrote:
>>>> + Daniel, Chris
>>>>
>>>> On Thu, 2017-12-07 at 09:21 +0000, Tvrtko Ursulin wrote:
>>>>> On 04/12/2017 15:02, Lionel Landwerlin wrote:
>>>>>> Hi,
>>>>>>
>>>>>> After discussion with Chris, Joonas & Tvrtko, this series adds an
>>>>>> additional commit to link the render node back to the card through a
>>>>>> symlink. Making it obvious from an application using a render 
>>>>>> node to
>>>>>> know where to get the information it needs.
>>>>>
>>>>> Important thing to mention as well is that it is trivial to get 
>>>>> from the
>>>>> master drm fd to the sysfs root, via fstat and opendir
>>>>> /sys/dev/char/<major>:<minor>. With the addition of the card 
>>>>> symlink to
>>>>> render nodes it is trivial for render node fd as well.
>>>>>
>>>>> I am happy with this approach - it is extensible, flexible and avoids
>>>>> issues with ioctl versioning or whatnot. With one value per file 
>>>>> it is
>>>>> trivial for userspace to access.
>>>>>
>>>>> So for what I'm concerned, given how gputop would use all of this 
>>>>> and so
>>>>> be the userspace, if everyone else is happy, I think we could do a
>>>>> detailed review and prehaps also think about including gputop in some
>>>>> distribution to make the case 100% straightforward.
>>>>
>>>> For the GPU topology I agree this is the right choice, it's going 
>>>> to be
>>>> about topology after all, and directory tree is the perfect candidate.
>>>> And if a new platform appears, then it's a new platform and may change
>>>> the topology well the hardware topology has changed.
>>>>
>>>> For the engine enumeration, I'm not equally sold for sysfs exposing 
>>>> it.
>>>> It's a "linear list of engine instances with flags" how the userspace
>>>> is going to be looking at them. And it's also information about 
>>>> what to
>>>> pass to an IOCTL as arguments after decision has been made, and then
>>>> you already have the FD you know you'll be dealing with, at hand. So
>>>> another IOCTL for that seems more convenient.
>>>
>>> Apart from more flexibility and easier to extend, sysfs might be a 
>>> better
>>> fit for applications which do not otherwise need a drm fd. Say a 
>>> top-like
>>> tool which shows engine utilization, or those patches I RFC-ed recently
>>> which do the same but per DRM client.
>>>
>>> Okay, these stats are now available also via PMU so the argument is 
>>> not the
>>> strongest I admit, but I still find it quite neat. It also might 
>>> allow us to
>>> define our own policy with regards to needed privilege to access these
>>> stats, and not be governed by the perf API rules.
>>
>> How exactly is sysfs easier to extend than ioctl? There's lots of
>
> Easier as in no need to version, add has_this/has_that markers, try to 
> guess today how big a field for something might be needed in the 
> future and similar.
>
>> serializing and deserializing going on, ime that's more boilerplate. Imo
>> the only reason for sysfs is when you _must_ access it without having an
>> fd to the gpu. The inverse is generally not true (i.e. using sysfs when
>> you have the fd already), and as soon as you add a writeable field here
>> you're screwed (because sysfs can't be passed to anyone else but root,
>> compared to drm fd - viz the backlight fiasco).
>
> I would perhaps expand the "must access without having a drm fd" to 
> "more convenient to access without a drm fd". My use case here was the 
> per-client engine usage stats. Equivalence with /proc/<pid>/stat, or 
> even /proc/stat if you want. So I was interested in creating a 
> foothold in sysfs for that purpose.
>
> No writable fields were imagined in all these two to three use cases.
>
>> But even without writeable fields: Think of highly contained
>> containers/clients which only get the drm fd to render. If sysfs is 
>> gone,
>> will they fall on their faces? Atm "drm fd is all you need" is very 
>> deeply
>> ingrained into our various OS stacks.
>
> Okay, this is something which was mentioned but I think the answer was 
> unclear. If current stacks do work without sysfs then this is a good 
> argument to keep that ability.
>
> As I said I am OK to drop the engine info bits from this series. 
> Question for Lionel, gpu-top and Mesa then is whether sysfs works for 
> them, for the remaining topology information. Attractiveness of sysfs 
> there was that it looked easier to be future proof for more or less 
> hypothetical future topologies.

We did a couple of versions with ioctls:

https://patchwork.freedesktop.org/patch/185959/ (through GET_PARAM)
https://patchwork.freedesktop.org/series/33436/ (through a new discovery
uAPI initially targeted at engine discovery)

Eventually Chris suggested sysfs (which I find kind of convenient), even
though you, Daniel, raised a valid point about sandboxed processes.

I personally don't care much which way this should be implemented.
Just give me some direction :)

Daniel: Would something in the style of
https://patchwork.freedesktop.org/series/33436/ work? If yes, what would
you recommend changing?

Thanks!

-
Lionel

>
> Regards,
>
> Tvrtko
>
>
>> -Daniel
>>
>>>> So I'd say for the GPU topology part, we go forward with the review 
>>>> and
>>>> make sure we don't expose driver internal bits that could break when
>>>> refactoring code. If the exposed N bits of information are strictly
>>>> tied to the underlying hardware, we should have no trouble maintaining
>>>> that for the foreseeable future.
>>>>
>>>> Then we can continue on about the engine discovery in parallel, not
>>>> blocking GPU topology discovery.
>>>
>>> I can live with that, but would like to keep the gt/engines/ namespace
>>> reserved for the eventuality with go with engine info in sysfs at a 
>>> later
>>> stage then.
>>>
>>> Also, Lionel, did you have plans to use the engine info straight 
>>> away in gpu
>>> top, or you only needed topology? I think you were drawing a nice block
>>> diagram of a GPU so do you need it for that?
>>>
>>> Regards,
>>>
>>> Tvrtko
>>
>

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v6 0/5] drm/i915: Expose more GPU properties through sysfs
  2017-12-12 11:19         ` Tvrtko Ursulin
  2017-12-12 14:33           ` Lionel Landwerlin
@ 2017-12-12 15:18           ` Daniel Vetter
  1 sibling, 0 replies; 19+ messages in thread
From: Daniel Vetter @ 2017-12-12 15:18 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: Daniel Vetter, intel-gfx, dri-devel

On Tue, Dec 12, 2017 at 11:19:47AM +0000, Tvrtko Ursulin wrote:
> 
> On 11/12/2017 21:05, Daniel Vetter wrote:
> > On Mon, Dec 11, 2017 at 02:38:53PM +0000, Tvrtko Ursulin wrote:
> > > On 11/12/2017 10:50, Joonas Lahtinen wrote:
> > > > + Daniel, Chris
> > > > 
> > > > On Thu, 2017-12-07 at 09:21 +0000, Tvrtko Ursulin wrote:
> > > > > On 04/12/2017 15:02, Lionel Landwerlin wrote:
> > > > > > Hi,
> > > > > > 
> > > > > > After discussion with Chris, Joonas & Tvrtko, this series adds an
> > > > > > additional commit to link the render node back to the card through a
> > > > > > symlink. Making it obvious from an application using a render node to
> > > > > > know where to get the information it needs.
> > > > > 
> > > > > Important thing to mention as well is that it is trivial to get from the
> > > > > master drm fd to the sysfs root, via fstat and opendir
> > > > > /sys/dev/char/<major>:<minor>. With the addition of the card symlink to
> > > > > render nodes it is trivial for render node fd as well.
> > > > > 
> > > > > I am happy with this approach - it is extensible, flexible and avoids
> > > > > issues with ioctl versioning or whatnot. With one value per file it is
> > > > > trivial for userspace to access.
> > > > > 
> > > > > So for what I'm concerned, given how gputop would use all of this and so
> > > > > be the userspace, if everyone else is happy, I think we could do a
> > > > > detailed review and prehaps also think about including gputop in some
> > > > > distribution to make the case 100% straightforward.
> > > > 
> > > > For the GPU topology I agree this is the right choice, it's going to be
> > > > about topology after all, and directory tree is the perfect candidate.
> > > > And if a new platform appears, then it's a new platform and may change
> > > > the topology well the hardware topology has changed.
> > > > 
> > > > For the engine enumeration, I'm not equally sold for sysfs exposing it.
> > > > It's a "linear list of engine instances with flags" how the userspace
> > > > is going to be looking at them. And it's also information about what to
> > > > pass to an IOCTL as arguments after decision has been made, and then
> > > > you already have the FD you know you'll be dealing with, at hand. So
> > > > another IOCTL for that seems more convenient.
> > > 
> > > Apart from more flexibility and easier to extend, sysfs might be a better
> > > fit for applications which do not otherwise need a drm fd. Say a top-like
> > > tool which shows engine utilization, or those patches I RFC-ed recently
> > > which do the same but per DRM client.
> > > 
> > > Okay, these stats are now available also via PMU so the argument is not the
> > > strongest I admit, but I still find it quite neat. It also might allow us to
> > > define our own policy with regards to needed privilege to access these
> > > stats, and not be governed by the perf API rules.
> > 
> > How exactly is sysfs easier to extend than ioctl? There's lots of
> 
> Easier as in no need to version, add has_this/has_that markers, try to guess
> today how big a field for something might be needed in the future and
> similar.

You can e.g. return a per-generation structure easily from an ioctl too,
if that's what you want. And sysfs also needs some kind of versioning,
it's just much more implicit: instead of has_foo you check for the
existence of the file. But fundamentally it's all the same, and all the
same fun with backwards compatibility, forever, applies.
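Purely for illustration, the sysfs flavour of a has_foo check would look
something like this (the path below is hypothetical):

  #include <stdbool.h>
  #include <stdio.h>
  #include <unistd.h>

  /* "Versioning" by probing for a file's existence instead of a flag. */
  static bool card_has_topology(const char *card_dir)
  {
          char path[256];

          snprintf(path, sizeof(path), "%s/gt/topology", card_dir);
          return access(path, F_OK) == 0;
  }

with exactly the same compatibility burden attached to the file's
existence and contents as an ioctl field would carry.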

> > serializing and deserializing going on, ime that's more boilerplate. Imo
> > the only reason for sysfs is when you _must_ access it without having an
> > fd to the gpu. The inverse is generally not true (i.e. using sysfs when
> > you have the fd already), and as soon as you add a writeable field here
> > you're screwed (because sysfs can't be passed to anyone else but root,
> > compared to drm fd - viz the backlight fiasco).
> 
> I would perhaps expand the "must access without having a drm fd" to "more
> convenient to access without a drm fd". My use case here was the per-client
> engine usage stats. Equivalence with /proc/<pid>/stat, or even /proc/stat if
> you want. So I was interested in creating a foothold in sysfs for that
> purpose.
> 
> No writable fields were imagined in all these two to three use cases.

Yeah that's definitely a requirement for sysfs.

> > But even without writeable fields: Think of highly contained
> > containers/clients which only get the drm fd to render. If sysfs is gone,
> > will they fall on their faces? Atm "drm fd is all you need" is very deeply
> > ingrained into our various OS stacks.
> 
> Okay, this is something which was mentioned but I think the answer was
> unclear. If current stacks do work without sysfs then this is a good
> argument to keep that ability.
> 
> As I said I am OK to drop the engine info bits from this series. Question
> for Lionel, gpu-top and Mesa then is whether sysfs works for them, for the
> remaining topology information. Attractiveness of sysfs there was that it
> looked easier to be future proof for more or less hypothetical future
> topologies.

Still not sure whether sysfs is actually easier, or just deceptively
looks easier than ioctls. But in the end this still looks reasonable
overall, so it's up to you all.

Cheers, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v6 0/5] drm/i915: Expose more GPU properties through sysfs
  2017-12-12 14:33           ` Lionel Landwerlin
@ 2017-12-13  8:17             ` Daniel Vetter
  2017-12-13 13:35               ` Chris Wilson
  2017-12-13 15:06               ` Lionel Landwerlin
  0 siblings, 2 replies; 19+ messages in thread
From: Daniel Vetter @ 2017-12-13  8:17 UTC (permalink / raw)
  To: Lionel Landwerlin; +Cc: Daniel Vetter, intel-gfx, dri-devel

On Tue, Dec 12, 2017 at 02:33:38PM +0000, Lionel Landwerlin wrote:
> On 12/12/17 11:19, Tvrtko Ursulin wrote:
> > 
> > On 11/12/2017 21:05, Daniel Vetter wrote:
> > > On Mon, Dec 11, 2017 at 02:38:53PM +0000, Tvrtko Ursulin wrote:
> > > > On 11/12/2017 10:50, Joonas Lahtinen wrote:
> > > > > + Daniel, Chris
> > > > > 
> > > > > On Thu, 2017-12-07 at 09:21 +0000, Tvrtko Ursulin wrote:
> > > > > > On 04/12/2017 15:02, Lionel Landwerlin wrote:
> > > > > > > Hi,
> > > > > > > 
> > > > > > > After discussion with Chris, Joonas & Tvrtko, this series adds an
> > > > > > > additional commit to link the render node back to the card through a
> > > > > > > symlink. Making it obvious from an application using
> > > > > > > a render node to
> > > > > > > know where to get the information it needs.
> > > > > > 
> > > > > > Important thing to mention as well is that it is trivial
> > > > > > to get from the
> > > > > > master drm fd to the sysfs root, via fstat and opendir
> > > > > > /sys/dev/char/<major>:<minor>. With the addition of the
> > > > > > card symlink to
> > > > > > render nodes it is trivial for render node fd as well.
> > > > > > 
> > > > > > I am happy with this approach - it is extensible, flexible and avoids
> > > > > > issues with ioctl versioning or whatnot. With one value
> > > > > > per file it is
> > > > > > trivial for userspace to access.
> > > > > > 
> > > > > > So for what I'm concerned, given how gputop would use
> > > > > > all of this and so
> > > > > > be the userspace, if everyone else is happy, I think we could do a
> > > > > > detailed review and prehaps also think about including gputop in some
> > > > > > distribution to make the case 100% straightforward.
> > > > > 
> > > > > For the GPU topology I agree this is the right choice, it's
> > > > > going to be
> > > > > about topology after all, and directory tree is the perfect candidate.
> > > > > And if a new platform appears, then it's a new platform and may change
> > > > > the topology well the hardware topology has changed.
> > > > > 
> > > > > For the engine enumeration, I'm not equally sold for sysfs
> > > > > exposing it.
> > > > > It's a "linear list of engine instances with flags" how the userspace
> > > > > is going to be looking at them. And it's also information
> > > > > about what to
> > > > > pass to an IOCTL as arguments after decision has been made, and then
> > > > > you already have the FD you know you'll be dealing with, at hand. So
> > > > > another IOCTL for that seems more convenient.
> > > > 
> > > > Apart from more flexibility and easier to extend, sysfs might be
> > > > a better
> > > > fit for applications which do not otherwise need a drm fd. Say a
> > > > top-like
> > > > tool which shows engine utilization, or those patches I RFC-ed recently
> > > > which do the same but per DRM client.
> > > > 
> > > > Okay, these stats are now available also via PMU so the argument
> > > > is not the
> > > > strongest I admit, but I still find it quite neat. It also might
> > > > allow us to
> > > > define our own policy with regards to needed privilege to access these
> > > > stats, and not be governed by the perf API rules.
> > > 
> > > How exactly is sysfs easier to extend than ioctl? There's lots of
> > 
> > Easier as in no need to version, add has_this/has_that markers, try to
> > guess today how big a field for something might be needed in the future
> > and similar.
> > 
> > > serializing and deserializing going on, ime that's more boilerplate. Imo
> > > the only reason for sysfs is when you _must_ access it without having an
> > > fd to the gpu. The inverse is generally not true (i.e. using sysfs when
> > > you have the fd already), and as soon as you add a writeable field here
> > > you're screwed (because sysfs can't be passed to anyone else but root,
> > > compared to drm fd - viz the backlight fiasco).
> > 
> > I would perhaps expand the "must access without having a drm fd" to
> > "more convenient to access without a drm fd". My use case here was the
> > per-client engine usage stats. Equivalence with /proc/<pid>/stat, or
> > even /proc/stat if you want. So I was interested in creating a foothold
> > in sysfs for that purpose.
> > 
> > No writable fields were imagined in all these two to three use cases.
> > 
> > > But even without writeable fields: Think of highly contained
> > > containers/clients which only get the drm fd to render. If sysfs is
> > > gone,
> > > will they fall on their faces? Atm "drm fd is all you need" is very
> > > deeply
> > > ingrained into our various OS stacks.
> > 
> > Okay, this is something which was mentioned but I think the answer was
> > unclear. If current stacks do work without sysfs then this is a good
> > argument to keep that ability.
> > 
> > As I said I am OK to drop the engine info bits from this series.
> > Question for Lionel, gpu-top and Mesa then is whether sysfs works for
> > them, for the remaining topology information. Attractiveness of sysfs
> > there was that it looked easier to be future proof for more or less
> > hypothetical future topologies.
> 
> We did a couple of versions with ioctls :
> 
> https://patchwork.freedesktop.org/patch/185959/ (through GET_PARAM)
> https://patchwork.freedesktop.org/series/33436/ (though a new discovery uAPI
> initially targeted at the engine discovery)
> 
> Eventually Chris suggested sysfs (which I find kind of convenient), even
> though you Daniel raised a valid point with sandboxed processes.
> 
> I personally don't care much in which way this should be implemented.
> Just give me some direction :)
> 
> Daniel: Would something in the style of
> https://patchwork.freedesktop.org/series/33436/ work? If yes, what would you
> recommend to change?

tbh I feel bad derailing this even further, but I think if you want a more
flexible query ioctl that's easier to extend, we could maybe look at the
chunk list approach amdgpu_cs_ioctl uses: all the massive pile of execbuf
metadata is supplied in a chunk list, and if you want to add new stuff,
or a new type, you add a new chunk_id. The chunks themselves are all
linearized into one allocation.

We could use the exact same trick for output, and require that userspace
simply jumps over chunks it doesn't understand. Pronto, you have your
insta-extensibility.

If you want to go fancy, add a huge flags field so you can select what
kinds of chunks you want (you need the flags field anyway).

Querying would be the usual two-step process:

1. An ioctl call with chunk_size == 0, to ask the kernel how much memory
userspace should allocate. The kernel would just walk all the available
chunks and add up their sizes (with a bit of seq_file style abstraction
this could look very neat, even when dealing with a huge number of
chunks).

2. Userspace allocates a sufficient amount of memory for all the chunks
it wants and calls the ioctl again; the kernel fills them all out into
this single array.

Aside: VBT almost works like that (minus the few places they screwed up
and you need to know the size of the next block).
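To make that concrete, a rough sketch of what such a chunk-based query
could look like (all names and fields here are invented for illustration,
this is not a proposal of actual uAPI structures):

  struct drm_query_chunk {
          __u32 chunk_id;   /* e.g. TOPOLOGY, ENGINES, ... */
          __u32 length;     /* size of data[] in bytes */
          __u8  data[];     /* payload, layout defined per chunk_id */
  };

  struct drm_query {
          __u64 flags;      /* bitmask selecting the chunk kinds wanted */
          __u32 data_size;  /* in: buffer size (0 = size query),
                             * out: bytes required/written */
          __u32 pad;
          __u64 data_ptr;   /* userspace buffer holding the chunk list */
  };

Userspace walks the returned chunk list and simply skips any chunk_id it
does not understand, which is where the extensibility comes from.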

Cheers, Daniel

> 
> Thanks!
> 
> -
> Lionel
> 
> > 
> > Regards,
> > 
> > Tvrtko
> > 
> > 
> > > -Daniel
> > > 
> > > > > So I'd say for the GPU topology part, we go forward with the
> > > > > review and
> > > > > make sure we don't expose driver internal bits that could break when
> > > > > refactoring code. If the exposed N bits of information are strictly
> > > > > tied to the underlying hardware, we should have no trouble maintaining
> > > > > that for the foreseeable future.
> > > > > 
> > > > > Then we can continue on about the engine discovery in parallel, not
> > > > > blocking GPU topology discovery.
> > > > 
> > > > I can live with that, but would like to keep the gt/engines/ namespace
> > > > reserved for the eventuality with go with engine info in sysfs
> > > > at a later
> > > > stage then.
> > > > 
> > > > Also, Lionel, did you have plans to use the engine info straight
> > > > away in gpu
> > > > top, or you only needed topology? I think you were drawing a nice block
> > > > diagram of a GPU so do you need it for that?
> > > > 
> > > > Regards,
> > > > 
> > > > Tvrtko
> > > 
> > 
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v6 0/5] drm/i915: Expose more GPU properties through sysfs
  2017-12-13  8:17             ` Daniel Vetter
@ 2017-12-13 13:35               ` Chris Wilson
  2017-12-13 15:09                 ` Lionel Landwerlin
  2017-12-13 15:06               ` Lionel Landwerlin
  1 sibling, 1 reply; 19+ messages in thread
From: Chris Wilson @ 2017-12-13 13:35 UTC (permalink / raw)
  To: Daniel Vetter, Lionel Landwerlin; +Cc: Daniel Vetter, intel-gfx, dri-devel

Quoting Daniel Vetter (2017-12-13 08:17:17)
> On Tue, Dec 12, 2017 at 02:33:38PM +0000, Lionel Landwerlin wrote:
> > On 12/12/17 11:19, Tvrtko Ursulin wrote:
> > > 
> > > On 11/12/2017 21:05, Daniel Vetter wrote:
> > > > On Mon, Dec 11, 2017 at 02:38:53PM +0000, Tvrtko Ursulin wrote:
> > > > > On 11/12/2017 10:50, Joonas Lahtinen wrote:
> > > > > > + Daniel, Chris
> > > > > > 
> > > > > > On Thu, 2017-12-07 at 09:21 +0000, Tvrtko Ursulin wrote:
> > > > > > > On 04/12/2017 15:02, Lionel Landwerlin wrote:
> > > > > > > > Hi,
> > > > > > > > 
> > > > > > > > After discussion with Chris, Joonas & Tvrtko, this series adds an
> > > > > > > > additional commit to link the render node back to the card through a
> > > > > > > > symlink. Making it obvious from an application using
> > > > > > > > a render node to
> > > > > > > > know where to get the information it needs.
> > > > > > > 
> > > > > > > Important thing to mention as well is that it is trivial
> > > > > > > to get from the
> > > > > > > master drm fd to the sysfs root, via fstat and opendir
> > > > > > > /sys/dev/char/<major>:<minor>. With the addition of the
> > > > > > > card symlink to
> > > > > > > render nodes it is trivial for render node fd as well.
> > > > > > > 
> > > > > > > I am happy with this approach - it is extensible, flexible and avoids
> > > > > > > issues with ioctl versioning or whatnot. With one value
> > > > > > > per file it is
> > > > > > > trivial for userspace to access.
> > > > > > > 
> > > > > > > So for what I'm concerned, given how gputop would use
> > > > > > > all of this and so
> > > > > > > be the userspace, if everyone else is happy, I think we could do a
> > > > > > > detailed review and prehaps also think about including gputop in some
> > > > > > > distribution to make the case 100% straightforward.
> > > > > > 
> > > > > > For the GPU topology I agree this is the right choice, it's
> > > > > > going to be
> > > > > > about topology after all, and directory tree is the perfect candidate.
> > > > > > And if a new platform appears, then it's a new platform and may change
> > > > > > the topology well the hardware topology has changed.
> > > > > > 
> > > > > > For the engine enumeration, I'm not equally sold for sysfs
> > > > > > exposing it.
> > > > > > It's a "linear list of engine instances with flags" how the userspace
> > > > > > is going to be looking at them. And it's also information
> > > > > > about what to
> > > > > > pass to an IOCTL as arguments after decision has been made, and then
> > > > > > you already have the FD you know you'll be dealing with, at hand. So
> > > > > > another IOCTL for that seems more convenient.
> > > > > 
> > > > > Apart from more flexibility and easier to extend, sysfs might be
> > > > > a better
> > > > > fit for applications which do not otherwise need a drm fd. Say a
> > > > > top-like
> > > > > tool which shows engine utilization, or those patches I RFC-ed recently
> > > > > which do the same but per DRM client.
> > > > > 
> > > > > Okay, these stats are now available also via PMU so the argument
> > > > > is not the
> > > > > strongest I admit, but I still find it quite neat. It also might
> > > > > allow us to
> > > > > define our own policy with regards to needed privilege to access these
> > > > > stats, and not be governed by the perf API rules.
> > > > 
> > > > How exactly is sysfs easier to extend than ioctl? There's lots of
> > > 
> > > Easier as in no need to version, add has_this/has_that markers, try to
> > > guess today how big a field for something might be needed in the future
> > > and similar.
> > > 
> > > > serializing and deserializing going on, ime that's more boilerplate. Imo
> > > > the only reason for sysfs is when you _must_ access it without having an
> > > > fd to the gpu. The inverse is generally not true (i.e. using sysfs when
> > > > you have the fd already), and as soon as you add a writeable field here
> > > > you're screwed (because sysfs can't be passed to anyone else but root,
> > > > compared to drm fd - viz the backlight fiasco).
> > > 
> > > I would perhaps expand the "must access without having a drm fd" to
> > > "more convenient to access without a drm fd". My use case here was the
> > > per-client engine usage stats. Equivalence with /proc/<pid>/stat, or
> > > even /proc/stat if you want. So I was interested in creating a foothold
> > > in sysfs for that purpose.
> > > 
> > > No writable fields were imagined in all these two to three use cases.
> > > 
> > > > But even without writeable fields: Think of highly contained
> > > > containers/clients which only get the drm fd to render. If sysfs is
> > > > gone,
> > > > will they fall on their faces? Atm "drm fd is all you need" is very
> > > > deeply
> > > > ingrained into our various OS stacks.
> > > 
> > > Okay, this is something which was mentioned but I think the answer was
> > > unclear. If current stacks do work without sysfs then this is a good
> > > argument to keep that ability.
> > > 
> > > As I said I am OK to drop the engine info bits from this series.
> > > Question for Lionel, gpu-top and Mesa then is whether sysfs works for
> > > them, for the remaining topology information. Attractiveness of sysfs
> > > there was that it looked easier to be future proof for more or less
> > > hypothetical future topologies.
> > 
> > We did a couple of versions with ioctls :
> > 
> > https://patchwork.freedesktop.org/patch/185959/ (through GET_PARAM)
> > https://patchwork.freedesktop.org/series/33436/ (though a new discovery uAPI
> > initially targeted at the engine discovery)
> > 
> > Eventually Chris suggested sysfs (which I find kind of convenient), even
> > though you Daniel raised a valid point with sandboxed processes.
> > 
> > I personally don't care much in which way this should be implemented.
> > Just give me some direction :)
> > 
> > Daniel: Would something in the style of
> > https://patchwork.freedesktop.org/series/33436/ work? If yes, what would you
> > recommend to change?
> 
> tbh I feel bad derailing this even further, but I think if you want a more
> flexible query ioctl that's easier to extend we could maybe look at the
> chunk list approach amdgpu_cs_ioctl uses: All the massive pile of execbuf
> metadata is supplied in a chunk list, and if you want to extend new stuff,
> or add a new type, you add a new chunk_id. The chunks themselves are all
> linearized into one allocation.
> 
> We could use the exact same trick for output, and require that userspace
> simply jumps over chunks it doesn't understand. Pronto, you have your
> insta-extensibility.

That maps to my suggestion to use json (or one of the other compact
readable formats) as the interchange. (Presumably with a (seq)file
approach, so you pass in the read offset to avoid having to allocate
everything up front.) A useful suggestion is that the same output be
provided via debugfs for dev convenience, and included in the error
state.
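A very rough sketch of that shape (function names invented here, not
taken from the series):

  #include <linux/fs.h>
  #include <linux/seq_file.h>

  /* The seq_file core handles the read offset, so nothing has to be
   * allocated up front; the show() just emits the description. */
  static int i915_topology_json_show(struct seq_file *m, void *unused)
  {
          seq_puts(m, "{ \"topology\": { \"slices\": [] } }\n");
          return 0;
  }

  static int i915_topology_json_open(struct inode *inode, struct file *file)
  {
          return single_open(file, i915_topology_json_show, inode->i_private);
  }

Registering it with debugfs_create_file() is then the usual boilerplate.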
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v6 0/5] drm/i915: Expose more GPU properties through sysfs
  2017-12-13  8:17             ` Daniel Vetter
  2017-12-13 13:35               ` Chris Wilson
@ 2017-12-13 15:06               ` Lionel Landwerlin
  1 sibling, 0 replies; 19+ messages in thread
From: Lionel Landwerlin @ 2017-12-13 15:06 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Daniel Vetter, intel-gfx, dri-devel

On 13/12/17 08:17, Daniel Vetter wrote:
> On Tue, Dec 12, 2017 at 02:33:38PM +0000, Lionel Landwerlin wrote:
>> On 12/12/17 11:19, Tvrtko Ursulin wrote:
>>> On 11/12/2017 21:05, Daniel Vetter wrote:
>>>> On Mon, Dec 11, 2017 at 02:38:53PM +0000, Tvrtko Ursulin wrote:
>>>>> On 11/12/2017 10:50, Joonas Lahtinen wrote:
>>>>>> + Daniel, Chris
>>>>>>
>>>>>> On Thu, 2017-12-07 at 09:21 +0000, Tvrtko Ursulin wrote:
>>>>>>> On 04/12/2017 15:02, Lionel Landwerlin wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> After discussion with Chris, Joonas & Tvrtko, this series adds an
>>>>>>>> additional commit to link the render node back to the card through a
>>>>>>>> symlink. Making it obvious from an application using
>>>>>>>> a render node to
>>>>>>>> know where to get the information it needs.
>>>>>>> Important thing to mention as well is that it is trivial
>>>>>>> to get from the
>>>>>>> master drm fd to the sysfs root, via fstat and opendir
>>>>>>> /sys/dev/char/<major>:<minor>. With the addition of the
>>>>>>> card symlink to
>>>>>>> render nodes it is trivial for render node fd as well.
>>>>>>>
>>>>>>> I am happy with this approach - it is extensible, flexible and avoids
>>>>>>> issues with ioctl versioning or whatnot. With one value
>>>>>>> per file it is
>>>>>>> trivial for userspace to access.
>>>>>>>
>>>>>>> So for what I'm concerned, given how gputop would use
>>>>>>> all of this and so
>>>>>>> be the userspace, if everyone else is happy, I think we could do a
>>>>>>> detailed review and prehaps also think about including gputop in some
>>>>>>> distribution to make the case 100% straightforward.
>>>>>> For the GPU topology I agree this is the right choice, it's
>>>>>> going to be
>>>>>> about topology after all, and directory tree is the perfect candidate.
>>>>>> And if a new platform appears, then it's a new platform and may change
>>>>>> the topology well the hardware topology has changed.
>>>>>>
>>>>>> For the engine enumeration, I'm not equally sold for sysfs
>>>>>> exposing it.
>>>>>> It's a "linear list of engine instances with flags" how the userspace
>>>>>> is going to be looking at them. And it's also information
>>>>>> about what to
>>>>>> pass to an IOCTL as arguments after decision has been made, and then
>>>>>> you already have the FD you know you'll be dealing with, at hand. So
>>>>>> another IOCTL for that seems more convenient.
>>>>> Apart from more flexibility and easier to extend, sysfs might be
>>>>> a better
>>>>> fit for applications which do not otherwise need a drm fd. Say a
>>>>> top-like
>>>>> tool which shows engine utilization, or those patches I RFC-ed recently
>>>>> which do the same but per DRM client.
>>>>>
>>>>> Okay, these stats are now available also via PMU so the argument
>>>>> is not the
>>>>> strongest I admit, but I still find it quite neat. It also might
>>>>> allow us to
>>>>> define our own policy with regards to needed privilege to access these
>>>>> stats, and not be governed by the perf API rules.
>>>> How exactly is sysfs easier to extend than ioctl? There's lots of
>>> Easier as in no need to version, add has_this/has_that markers, try to
>>> guess today how big a field for something might be needed in the future
>>> and similar.
>>>
>>>> serializing and deserializing going on, ime that's more boilerplate. Imo
>>>> the only reason for sysfs is when you _must_ access it without having an
>>>> fd to the gpu. The inverse is generally not true (i.e. using sysfs when
>>>> you have the fd already), and as soon as you add a writeable field here
>>>> you're screwed (because sysfs can't be passed to anyone else but root,
>>>> compared to drm fd - viz the backlight fiasco).
>>> I would perhaps expand the "must access without having a drm fd" to
>>> "more convenient to access without a drm fd". My use case here was the
>>> per-client engine usage stats. Equivalence with /proc/<pid>/stat, or
>>> even /proc/stat if you want. So I was interested in creating a foothold
>>> in sysfs for that purpose.
>>>
>>> No writable fields were imagined in all these two to three use cases.
>>>
>>>> But even without writeable fields: Think of highly contained
>>>> containers/clients which only get the drm fd to render. If sysfs is
>>>> gone,
>>>> will they fall on their faces? Atm "drm fd is all you need" is very
>>>> deeply
>>>> ingrained into our various OS stacks.
>>> Okay, this is something which was mentioned but I think the answer was
>>> unclear. If current stacks do work without sysfs then this is a good
>>> argument to keep that ability.
>>>
>>> As I said I am OK to drop the engine info bits from this series.
>>> Question for Lionel, gpu-top and Mesa then is whether sysfs works for
>>> them, for the remaining topology information. Attractiveness of sysfs
>>> there was that it looked easier to be future proof for more or less
>>> hypothetical future topologies.
>> We did a couple of versions with ioctls:
>>
>> https://patchwork.freedesktop.org/patch/185959/ (through GET_PARAM)
>> https://patchwork.freedesktop.org/series/33436/ (through a new discovery uAPI
>> initially targeted at the engine discovery)
>>
>> Eventually Chris suggested sysfs (which I find kind of convenient), even
>> though you, Daniel, raised a valid point about sandboxed processes.
>>
>> I personally don't care much in which way this should be implemented.
>> Just give me some direction :)
>>
>> Daniel: Would something in the style of
>> https://patchwork.freedesktop.org/series/33436/ work? If yes, what would you
>> recommend to change?
> tbh I feel bad derailing this even further, but I think if you want a more
> flexible query ioctl that's easier to extend we could maybe look at the
> chunk list approach amdgpu_cs_ioctl uses: All the massive pile of execbuf
> metadata is supplied in a chunk list, and if you want to extend new stuff,
> or add a new type, you add a new chunk_id. The chunks themselves are all
> linearized into one allocation.
>
> We could use the exact same trick for output, and require that userspace
> simply jumps over chunks it doesn't understand. Pronto, you have your
> insta-extensibility.
>
> If you want to go fancy, add a huge flags fields so you can select what
> kinds of chunks you want (you need the flags field anyways).
>
> Querying would be the usual two-step process:
>
> 1. ioctl call with chunk_size == 0, to query the kernel how much the
> kernel should allocate. Kernel would just walk all the available chunks
> and add up (with a bit of seq_file style abstraction this could look very
> neat, even when dealing with huge number of chunks).
>
> 2. Userspace allocates sufficient amounts of memory for all the chunks it
> wants, calls the ioctl again, kernel fills them all out into this single
> one array.
>
> Aside: VBT almost works like that (minus the few places they screwed up
> and you need to know the size of the next block).

Okay, cool. Looks a lot like how some Vulkan entrypoints work.
That's also not too dissimilar from what was done in the second ioctl 
attempt.
I guess I could rename a few things and drop the engine discovery (at 
least initially) and submit again.
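
For the record, a rough userspace sketch of that two-step chunked query
(struct layout, chunk ids and the ioctl number are all invented here for
illustration, none of this is an existing uAPI):

/* Hypothetical chunked query, loosely following the amdgpu chunk-list
 * idea: ask for the required size first, then let the kernel linearize
 * all selected chunks into one buffer. Unknown chunk ids get skipped. */
#include <stdint.h>
#include <stdlib.h>
#include <sys/ioctl.h>

struct fake_query_chunk {
	uint32_t chunk_id;	/* e.g. TOPOLOGY, ENGINES, ... */
	uint32_t length;	/* size of this chunk in bytes, header included */
	/* chunk payload follows */
};

struct fake_query {
	uint64_t flags;		/* which chunk kinds the caller wants */
	uint32_t size;		/* in: buffer size (0 = query required size)
				 * out: bytes required/written by the kernel */
	uint32_t pad;
	uint64_t data_ptr;	/* userspace pointer to the chunk buffer */
};

#define FAKE_IOCTL_QUERY	_IOWR('d', 0x40, struct fake_query)

static void *query_chunks(int fd, uint64_t flags, uint32_t *size)
{
	struct fake_query q = { .flags = flags };
	void *buf;

	/* step 1: size == 0 asks how much space the kernel needs */
	if (ioctl(fd, FAKE_IOCTL_QUERY, &q) || !q.size)
		return NULL;

	buf = calloc(1, q.size);
	if (!buf)
		return NULL;

	/* step 2: the kernel fills all selected chunks into the buffer */
	q.data_ptr = (uintptr_t)buf;
	if (ioctl(fd, FAKE_IOCTL_QUERY, &q)) {
		free(buf);
		return NULL;
	}

	*size = q.size;
	return buf;
}

Callers would then walk the returned buffer chunk by chunk, advancing by
each chunk's length and simply ignoring any chunk_id they don't know.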

Thanks!

>
> Cheers, Daniel
>
>> Thanks!
>>
>> -
>> Lionel
>>
>>> Regards,
>>>
>>> Tvrtko
>>>
>>>
>>>> -Daniel
>>>>
>>>>>> So I'd say for the GPU topology part, we go forward with the
>>>>>> review and
>>>>>> make sure we don't expose driver internal bits that could break when
>>>>>> refactoring code. If the exposed N bits of information are strictly
>>>>>> tied to the underlying hardware, we should have no trouble maintaining
>>>>>> that for the foreseeable future.
>>>>>>
>>>>>> Then we can continue on about the engine discovery in parallel, not
>>>>>> blocking GPU topology discovery.
>>>>> I can live with that, but would like to keep the gt/engines/ namespace
>>>>> reserved for the eventuality that we go with engine info in sysfs
>>>>> at a later
>>>>> stage then.
>>>>>
>>>>> Also, Lionel, did you have plans to use the engine info straight
>>>>> away in gpu
>>>>> top, or you only needed topology? I think you were drawing a nice block
>>>>> diagram of a GPU so do you need it for that?
>>>>>
>>>>> Regards,
>>>>>
>>>>> Tvrtko


_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v6 0/5] drm/i915: Expose more GPU properties through sysfs
  2017-12-13 13:35               ` Chris Wilson
@ 2017-12-13 15:09                 ` Lionel Landwerlin
  0 siblings, 0 replies; 19+ messages in thread
From: Lionel Landwerlin @ 2017-12-13 15:09 UTC (permalink / raw)
  To: Chris Wilson, Daniel Vetter; +Cc: intel-gfx, dri-devel, Daniel Vetter

On 13/12/17 13:35, Chris Wilson wrote:
> Quoting Daniel Vetter (2017-12-13 08:17:17)
>> On Tue, Dec 12, 2017 at 02:33:38PM +0000, Lionel Landwerlin wrote:
>>> On 12/12/17 11:19, Tvrtko Ursulin wrote:
>>>> On 11/12/2017 21:05, Daniel Vetter wrote:
>>>>> On Mon, Dec 11, 2017 at 02:38:53PM +0000, Tvrtko Ursulin wrote:
>>>>>> On 11/12/2017 10:50, Joonas Lahtinen wrote:
>>>>>>> + Daniel, Chris
>>>>>>>
>>>>>>> On Thu, 2017-12-07 at 09:21 +0000, Tvrtko Ursulin wrote:
>>>>>>>> On 04/12/2017 15:02, Lionel Landwerlin wrote:
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> After discussion with Chris, Joonas & Tvrtko, this series adds an
>>>>>>>>> additional commit to link the render node back to the card through a
>>>>>>>>> symlink. Making it obvious from an application using
>>>>>>>>> a render node to
>>>>>>>>> know where to get the information it needs.
>>>>>>>> Important thing to mention as well is that it is trivial
>>>>>>>> to get from the
>>>>>>>> master drm fd to the sysfs root, via fstat and opendir
>>>>>>>> /sys/dev/char/<major>:<minor>. With the addition of the
>>>>>>>> card symlink to
>>>>>>>> render nodes it is trivial for render node fd as well.
>>>>>>>>
>>>>>>>> I am happy with this approach - it is extensible, flexible and avoids
>>>>>>>> issues with ioctl versioning or whatnot. With one value
>>>>>>>> per file it is
>>>>>>>> trivial for userspace to access.
>>>>>>>>
>>>>>>>> So as far as I'm concerned, given how gputop would use
>>>>>>>> all of this and so
>>>>>>>> be the userspace, if everyone else is happy, I think we could do a
>>>>>>>> detailed review and perhaps also think about including gputop in some
>>>>>>>> distribution to make the case 100% straightforward.
>>>>>>> For the GPU topology I agree this is the right choice, it's
>>>>>>> going to be
>>>>>>> about topology after all, and directory tree is the perfect candidate.
>>>>>>> And if a new platform appears, then it's a new platform and may change
>>>>>>> the topology when the hardware topology has changed.
>>>>>>>
>>>>>>> For the engine enumeration, I'm not equally sold on sysfs
>>>>>>> exposing it.
>>>>>>> A "linear list of engine instances with flags" is how the userspace
>>>>>>> is going to be looking at them. And it's also information
>>>>>>> about what to
>>>>>>> pass to an IOCTL as arguments after a decision has been made, and then
>>>>>>> you already have the FD you know you'll be dealing with, at hand. So
>>>>>>> another IOCTL for that seems more convenient.
>>>>>> Apart from more flexibility and easier to extend, sysfs might be
>>>>>> a better
>>>>>> fit for applications which do not otherwise need a drm fd. Say a
>>>>>> top-like
>>>>>> tool which shows engine utilization, or those patches I RFC-ed recently
>>>>>> which do the same but per DRM client.
>>>>>>
>>>>>> Okay, these stats are now available also via PMU so the argument
>>>>>> is not the
>>>>>> strongest I admit, but I still find it quite neat. It also might
>>>>>> allow us to
>>>>>> define our own policy with regards to needed privilege to access these
>>>>>> stats, and not be governed by the perf API rules.
>>>>> How exactly is sysfs easier to extend than ioctl? There's lots of
>>>> Easier as in no need to version, add has_this/has_that markers, try to
>>>> guess today how big a field for something might be needed in the future
>>>> and similar.
>>>>
>>>>> serializing and deserializing going on, ime that's more boilerplate. Imo
>>>>> the only reason for sysfs is when you _must_ access it without having an
>>>>> fd to the gpu. The inverse is generally not true (i.e. using sysfs when
>>>>> you have the fd already), and as soon as you add a writeable field here
>>>>> you're screwed (because sysfs can't be passed to anyone else but root,
>>>>> compared to drm fd - viz the backlight fiasco).
>>>> I would perhaps expand the "must access without having a drm fd" to
>>>> "more convenient to access without a drm fd". My use case here was the
>>>> per-client engine usage stats. Equivalence with /proc/<pid>/stat, or
>>>> even /proc/stat if you want. So I was interested in creating a foothold
>>>> in sysfs for that purpose.
>>>>
>>>> No writable fields were imagined in all these two to three use cases.
>>>>
>>>>> But even without writeable fields: Think of highly contained
>>>>> containers/clients which only get the drm fd to render. If sysfs is
>>>>> gone,
>>>>> will they fall on their faces? Atm "drm fd is all you need" is very
>>>>> deeply
>>>>> ingrained into our various OS stacks.
>>>> Okay, this is something which was mentioned but I think the answer was
>>>> unclear. If current stacks do work without sysfs then this is a good
>>>> argument to keep that ability.
>>>>
>>>> As I said I am OK to drop the engine info bits from this series.
>>>> Question for Lionel, gpu-top and Mesa then is whether sysfs works for
>>>> them, for the remaining topology information. Attractiveness of sysfs
>>>> there was that it looked easier to be future proof for more or less
>>>> hypothetical future topologies.
>>> We did a couple of versions with ioctls:
>>>
>>> https://patchwork.freedesktop.org/patch/185959/ (through GET_PARAM)
>>> https://patchwork.freedesktop.org/series/33436/ (through a new discovery uAPI
>>> initially targeted at the engine discovery)
>>>
>>> Eventually Chris suggested sysfs (which I find kind of convenient), even
>>> though you, Daniel, raised a valid point about sandboxed processes.
>>>
>>> I personally don't care much in which way this should be implemented.
>>> Just give me some direction :)
>>>
>>> Daniel: Would something in the style of
>>> https://patchwork.freedesktop.org/series/33436/ work? If yes, what would you
>>> recommend to change?
>> tbh I feel bad derailing this even further, but I think if you want a more
>> flexible query ioctl that's easier to extend we could maybe look at the
>> chunk list approach amdgpu_cs_ioctl uses: All the massive pile of execbuf
>> metadata is supplied in a chunk list, and if you want to extend new stuff,
>> or add a new type, you add a new chunk_id. The chunks themselves are all
>> linearized into one allocation.
>>
>> We could use the exact same trick for output, and require that userspace
>> simply jumps over chunks it doesn't understand. Pronto, you have your
>> insta-extensibility.
> That maps to my suggestion to use JSON (or one of the other compact
> readable formats) as the interchange. (Presumably with a (seq)file
> approach, so you pass in the read offset to avoid having to allocate
> everything up front.) A useful suggestion is that the same output is
> provided via debugfs for dev convenience, and included in the error
> state.
> -Chris
>
No problem for adding the debugfs/error-state output as well.
I know you mentioned ASN.1 regarding the interchange, but I'm struggling
a bit to see what things should actually look like :/
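
As a very rough sketch of the debugfs side (struct and attribute names
below are made up, the real fields would come from the sseu info), a
seq_file-backed entry could be as simple as:

/* Sketch only: dump the topology as one key/value pair per line through
 * a single_open() seq_file. Names are illustrative, not the i915 ones. */
#include <linux/bitops.h>
#include <linux/debugfs.h>
#include <linux/module.h>
#include <linux/seq_file.h>

struct fake_sseu {			/* stand-in for the real sseu info */
	u8 slice_mask;
	u8 max_slices, max_subslices;
	u8 eu_count[3][8];
};

static int fake_topology_show(struct seq_file *m, void *unused)
{
	struct fake_sseu *sseu = m->private;
	int s, ss;

	seq_printf(m, "slices: %u\n", hweight8(sseu->slice_mask));
	for (s = 0; s < sseu->max_slices; s++)
		for (ss = 0; ss < sseu->max_subslices; ss++)
			seq_printf(m, "slice%d.subslice%d.eus: %d\n",
				   s, ss, sseu->eu_count[s][ss]);
	return 0;
}

static int fake_topology_open(struct inode *inode, struct file *file)
{
	return single_open(file, fake_topology_show, inode->i_private);
}

static const struct file_operations fake_topology_fops = {
	.owner		= THIS_MODULE,
	.open		= fake_topology_open,
	.read		= seq_read,
	.llseek		= seq_lseek,
	.release	= single_release,
};

/* hooked up somewhere with:
 * debugfs_create_file("topology", 0444, debugfs_root, sseu,
 *			&fake_topology_fops);
 */

The error state capture could emit the same key/value lines into its own
buffer so both outputs stay in sync.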

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2017-12-13 15:09 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-12-04 15:02 [PATCH v6 0/5] drm/i915: Expose more GPU properties through sysfs Lionel Landwerlin
2017-12-04 15:02 ` [PATCH v6 1/5] drm: add card symlink in render sysfs directory Lionel Landwerlin
2017-12-04 15:02 ` [PATCH v6 2/5] drm/i915: store all subslice masks Lionel Landwerlin
2017-12-04 15:02 ` [PATCH v6 3/5] drm/i915/debugfs: reuse max slice/subslices already stored in sseu Lionel Landwerlin
2017-12-04 15:02 ` [PATCH v6 4/5] drm/i915: expose engine availability through sysfs Lionel Landwerlin
2017-12-04 15:02 ` [PATCH v6 5/5] drm/i915: expose EU topology " Lionel Landwerlin
2017-12-07  9:21 ` [Intel-gfx] [PATCH v6 0/5] drm/i915: Expose more GPU properties " Tvrtko Ursulin
2017-12-11 10:50   ` Joonas Lahtinen
2017-12-11 13:29     ` Lionel Landwerlin
2017-12-11 14:38     ` [Intel-gfx] " Tvrtko Ursulin
2017-12-11 14:47       ` Lionel Landwerlin
2017-12-11 21:05       ` [Intel-gfx] " Daniel Vetter
2017-12-12 11:19         ` Tvrtko Ursulin
2017-12-12 14:33           ` Lionel Landwerlin
2017-12-13  8:17             ` Daniel Vetter
2017-12-13 13:35               ` Chris Wilson
2017-12-13 15:09                 ` Lionel Landwerlin
2017-12-13 15:06               ` Lionel Landwerlin
2017-12-12 15:18           ` Daniel Vetter
