* [PATCH v2 0/9] i915: Cannonlake perf support
@ 2017-11-02 16:29 Lionel Landwerlin
  2017-11-02 16:29 ` [PATCH v2 1/9] drm/i915/perf: complete whitelisting for OA programming on HSW Lionel Landwerlin
                   ` (10 more replies)
  0 siblings, 11 replies; 31+ messages in thread
From: Lionel Landwerlin @ 2017-11-02 16:29 UTC (permalink / raw)
  To: intel-gfx

Hi,

This version only changes patch 8 (topology): it makes the uapi a bit
simpler to use and leaves some room for future queries (one possible
case is a query on media capabilities, where different Gens/GTs have
different numbers of media rings).

Cheers,

Lionel Landwerlin (9):
  drm/i915/perf: complete whitelisting for OA programming on HSW
  drm/i915/perf: add support for Coffeelake GT3
  drm/i915/perf: refactor perf setup
  drm/i915: fix register naming
  drm/i915/perf: enable perf support on CNL
  drm/i915: expose command stream timestamp frequency to userspace
  drm/i915/perf: reuse timestamp frequency from device info
  drm/i915: expose eu topology to userspace
  drm/i915/debugfs: reuse max slice/subslices already stored in sseu

 drivers/gpu/drm/i915/Makefile            |   4 +-
 drivers/gpu/drm/i915/i915_debugfs.c      |  52 +++---
 drivers/gpu/drm/i915/i915_drv.c          |  72 ++++++++-
 drivers/gpu/drm/i915/i915_drv.h          |  28 +++-
 drivers/gpu/drm/i915/i915_oa_cflgt3.c    | 109 +++++++++++++
 drivers/gpu/drm/i915/i915_oa_cflgt3.h    |  34 ++++
 drivers/gpu/drm/i915/i915_oa_cnl.c       | 121 ++++++++++++++
 drivers/gpu/drm/i915/i915_oa_cnl.h       |  34 ++++
 drivers/gpu/drm/i915/i915_perf.c         | 105 +++++++-----
 drivers/gpu/drm/i915/i915_reg.h          |  42 ++++-
 drivers/gpu/drm/i915/intel_device_info.c | 270 +++++++++++++++++++++++++------
 drivers/gpu/drm/i915/intel_lrc.c         |   2 +-
 drivers/gpu/drm/i915/intel_ringbuffer.h  |   2 +-
 include/uapi/drm/i915_drm.h              |  64 ++++++++
 14 files changed, 813 insertions(+), 126 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/i915_oa_cflgt3.c
 create mode 100644 drivers/gpu/drm/i915/i915_oa_cflgt3.h
 create mode 100644 drivers/gpu/drm/i915/i915_oa_cnl.c
 create mode 100644 drivers/gpu/drm/i915/i915_oa_cnl.h

--
2.15.0
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


* [PATCH v2 1/9] drm/i915/perf: complete whitelisting for OA programming on HSW
  2017-11-02 16:29 [PATCH v2 0/9] i915: Cannonlake perf support Lionel Landwerlin
@ 2017-11-02 16:29 ` Lionel Landwerlin
  2017-11-10 13:05   ` Matthew Auld
  2017-11-02 16:29 ` [PATCH v2 2/9] drm/i915/perf: add support for Coffeelake GT3 Lionel Landwerlin
                   ` (9 subsequent siblings)
  10 siblings, 1 reply; 31+ messages in thread
From: Lionel Landwerlin @ 2017-11-02 16:29 UTC (permalink / raw)
  To: intel-gfx

We were missing some registers, and we can now name one for which we only
had the offset.
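
For readers without the tree at hand, the checks this patch adds can be
sketched as a standalone helper. The offsets are copied from the
i915_reg.h hunk of this patch; the helper name hsw_mbvid2_addr_ok() is
hypothetical and the plain integer macros stand in for the driver's
_MMIO() wrappers, so treat this as an illustration rather than the
driver code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Offsets copied from the i915_reg.h hunk of this patch. */
#define HSW_MBVID2_NOA0  0x9E80u
#define HSW_MBVID2_NOA9  0x9EA4u
#define HSW_MBVID2_MISR0 0x9EC0u

/* The range test below only makes sense if the bounds are ordered. */
static_assert(HSW_MBVID2_NOA0 <= HSW_MBVID2_NOA9,
	      "NOA range must be ordered");

/*
 * Hypothetical standalone helper: covers only the two conditions this
 * patch adds to hsw_is_valid_mux_addr(), not the gen7 checks or the
 * 0x25100..0x2FF90 range that precede them in the real function.
 */
static bool hsw_mbvid2_addr_ok(uint32_t addr)
{
	return (addr >= HSW_MBVID2_NOA0 && addr <= HSW_MBVID2_NOA9) ||
	       addr == HSW_MBVID2_MISR0;
}
```

Like the check in the patch, this sketch tests only the bounds and does
not verify that the offset is dword-aligned.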

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
---
 drivers/gpu/drm/i915/i915_perf.c |  3 ++-
 drivers/gpu/drm/i915/i915_reg.h  | 14 ++++++++++++++
 2 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index 59ee808f8fd9..45aef15b9e7c 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -3023,7 +3023,8 @@ static bool hsw_is_valid_mux_addr(struct drm_i915_private *dev_priv, u32 addr)
 {
 	return gen7_is_valid_mux_addr(dev_priv, addr) ||
 		(addr >= 0x25100 && addr <= 0x2FF90) ||
-		addr == 0x9ec0;
+		(addr >= HSW_MBVID2_NOA0.reg && addr <= HSW_MBVID2_NOA9.reg) ||
+		addr == HSW_MBVID2_MISR0.reg;
 }
 
 static bool chv_is_valid_mux_addr(struct drm_i915_private *dev_priv, u32 addr)
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index f0f8f6059652..ee4941a1df20 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -1120,6 +1120,20 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg)
 /* RPC unit config (Gen8+) */
 #define RPM_CONFIG	    _MMIO(0x0D08)
 
+/* NOA (HSW) */
+#define HSW_MBVID2_NOA0		_MMIO(0x9E80)
+#define HSW_MBVID2_NOA1		_MMIO(0x9E84)
+#define HSW_MBVID2_NOA2		_MMIO(0x9E88)
+#define HSW_MBVID2_NOA3		_MMIO(0x9E8C)
+#define HSW_MBVID2_NOA4		_MMIO(0x9E90)
+#define HSW_MBVID2_NOA5		_MMIO(0x9E94)
+#define HSW_MBVID2_NOA6		_MMIO(0x9E98)
+#define HSW_MBVID2_NOA7		_MMIO(0x9E9C)
+#define HSW_MBVID2_NOA8		_MMIO(0x9EA0)
+#define HSW_MBVID2_NOA9		_MMIO(0x9EA4)
+
+#define HSW_MBVID2_MISR0	_MMIO(0x9EC0)
+
 /* NOA (Gen8+) */
 #define NOA_CONFIG(i)	    _MMIO(0x0D0C + (i) * 4)
 
-- 
2.15.0


* [PATCH v2 2/9] drm/i915/perf: add support for Coffeelake GT3
  2017-11-02 16:29 [PATCH v2 0/9] i915: Cannonlake perf support Lionel Landwerlin
  2017-11-02 16:29 ` [PATCH v2 1/9] drm/i915/perf: complete whitelisting for OA programming on HSW Lionel Landwerlin
@ 2017-11-02 16:29 ` Lionel Landwerlin
  2017-11-07 11:34   ` Matthew Auld
  2017-11-02 16:29 ` [PATCH v2 3/9] drm/i915/perf: refactor perf setup Lionel Landwerlin
                   ` (8 subsequent siblings)
  10 siblings, 1 reply; 31+ messages in thread
From: Lionel Landwerlin @ 2017-11-02 16:29 UTC (permalink / raw)
  To: intel-gfx

We can enable GT3 as well as GT2.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
---
 drivers/gpu/drm/i915/Makefile         |   3 +-
 drivers/gpu/drm/i915/i915_drv.h       |   2 +
 drivers/gpu/drm/i915/i915_oa_cflgt3.c | 109 ++++++++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/i915_oa_cflgt3.h |  34 +++++++++++
 drivers/gpu/drm/i915/i915_perf.c      |   3 +
 5 files changed, 150 insertions(+), 1 deletion(-)
 create mode 100644 drivers/gpu/drm/i915/i915_oa_cflgt3.c
 create mode 100644 drivers/gpu/drm/i915/i915_oa_cflgt3.h

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 1bbc5440db40..3c419455b0af 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -162,7 +162,8 @@ i915-y += i915_perf.o \
 	  i915_oa_kblgt2.o \
 	  i915_oa_kblgt3.o \
 	  i915_oa_glk.o \
-	  i915_oa_cflgt2.o
+	  i915_oa_cflgt2.o \
+	  i915_oa_cflgt3.o
 
 ifeq ($(CONFIG_DRM_I915_GVT),y)
 i915-y += intel_gvt.o
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 72bb5b51035a..6cb7cd7f9420 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -3053,6 +3053,8 @@ intel_info(const struct drm_i915_private *dev_priv)
 				 (INTEL_DEVID(dev_priv) & 0x00F0) == 0x00A0)
 #define IS_CFL_GT2(dev_priv)	(IS_COFFEELAKE(dev_priv) && \
 				 (dev_priv)->info.gt == 2)
+#define IS_CFL_GT3(dev_priv)	(IS_COFFEELAKE(dev_priv) && \
+				 (dev_priv)->info.gt == 3)
 
 #define IS_ALPHA_SUPPORT(intel_info) ((intel_info)->is_alpha_support)
 
diff --git a/drivers/gpu/drm/i915/i915_oa_cflgt3.c b/drivers/gpu/drm/i915/i915_oa_cflgt3.c
new file mode 100644
index 000000000000..42ff06fe54a3
--- /dev/null
+++ b/drivers/gpu/drm/i915/i915_oa_cflgt3.c
@@ -0,0 +1,109 @@
+/*
+ * Autogenerated file by GPU Top : https://github.com/rib/gputop
+ * DO NOT EDIT manually!
+ *
+ *
+ * Copyright (c) 2015 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ *
+ */
+
+#include <linux/sysfs.h>
+
+#include "i915_drv.h"
+#include "i915_oa_cflgt3.h"
+
+static const struct i915_oa_reg b_counter_config_test_oa[] = {
+	{ _MMIO(0x2740), 0x00000000 },
+	{ _MMIO(0x2744), 0x00800000 },
+	{ _MMIO(0x2714), 0xf0800000 },
+	{ _MMIO(0x2710), 0x00000000 },
+	{ _MMIO(0x2724), 0xf0800000 },
+	{ _MMIO(0x2720), 0x00000000 },
+	{ _MMIO(0x2770), 0x00000004 },
+	{ _MMIO(0x2774), 0x00000000 },
+	{ _MMIO(0x2778), 0x00000003 },
+	{ _MMIO(0x277c), 0x00000000 },
+	{ _MMIO(0x2780), 0x00000007 },
+	{ _MMIO(0x2784), 0x00000000 },
+	{ _MMIO(0x2788), 0x00100002 },
+	{ _MMIO(0x278c), 0x0000fff7 },
+	{ _MMIO(0x2790), 0x00100002 },
+	{ _MMIO(0x2794), 0x0000ffcf },
+	{ _MMIO(0x2798), 0x00100082 },
+	{ _MMIO(0x279c), 0x0000ffef },
+	{ _MMIO(0x27a0), 0x001000c2 },
+	{ _MMIO(0x27a4), 0x0000ffe7 },
+	{ _MMIO(0x27a8), 0x00100001 },
+	{ _MMIO(0x27ac), 0x0000ffe7 },
+};
+
+static const struct i915_oa_reg flex_eu_config_test_oa[] = {
+};
+
+static const struct i915_oa_reg mux_config_test_oa[] = {
+	{ _MMIO(0x9840), 0x00000080 },
+	{ _MMIO(0x9888), 0x11810000 },
+	{ _MMIO(0x9888), 0x07810013 },
+	{ _MMIO(0x9888), 0x1f810000 },
+	{ _MMIO(0x9888), 0x1d810000 },
+	{ _MMIO(0x9888), 0x1b930040 },
+	{ _MMIO(0x9888), 0x07e54000 },
+	{ _MMIO(0x9888), 0x1f908000 },
+	{ _MMIO(0x9888), 0x11900000 },
+	{ _MMIO(0x9888), 0x37900000 },
+	{ _MMIO(0x9888), 0x53900000 },
+	{ _MMIO(0x9888), 0x45900000 },
+	{ _MMIO(0x9888), 0x33900000 },
+};
+
+static ssize_t
+show_test_oa_id(struct device *kdev, struct device_attribute *attr, char *buf)
+{
+	return sprintf(buf, "1\n");
+}
+
+void
+i915_perf_load_test_config_cflgt3(struct drm_i915_private *dev_priv)
+{
+	strncpy(dev_priv->perf.oa.test_config.uuid,
+		"577e8e2c-3fa0-4875-8743-3538d585e3b0",
+		UUID_STRING_LEN);
+	dev_priv->perf.oa.test_config.id = 1;
+
+	dev_priv->perf.oa.test_config.mux_regs = mux_config_test_oa;
+	dev_priv->perf.oa.test_config.mux_regs_len = ARRAY_SIZE(mux_config_test_oa);
+
+	dev_priv->perf.oa.test_config.b_counter_regs = b_counter_config_test_oa;
+	dev_priv->perf.oa.test_config.b_counter_regs_len = ARRAY_SIZE(b_counter_config_test_oa);
+
+	dev_priv->perf.oa.test_config.flex_regs = flex_eu_config_test_oa;
+	dev_priv->perf.oa.test_config.flex_regs_len = ARRAY_SIZE(flex_eu_config_test_oa);
+
+	dev_priv->perf.oa.test_config.sysfs_metric.name = "577e8e2c-3fa0-4875-8743-3538d585e3b0";
+	dev_priv->perf.oa.test_config.sysfs_metric.attrs = dev_priv->perf.oa.test_config.attrs;
+
+	dev_priv->perf.oa.test_config.attrs[0] = &dev_priv->perf.oa.test_config.sysfs_metric_id.attr;
+
+	dev_priv->perf.oa.test_config.sysfs_metric_id.attr.name = "id";
+	dev_priv->perf.oa.test_config.sysfs_metric_id.attr.mode = 0444;
+	dev_priv->perf.oa.test_config.sysfs_metric_id.show = show_test_oa_id;
+}
diff --git a/drivers/gpu/drm/i915/i915_oa_cflgt3.h b/drivers/gpu/drm/i915/i915_oa_cflgt3.h
new file mode 100644
index 000000000000..c13b5aac01b9
--- /dev/null
+++ b/drivers/gpu/drm/i915/i915_oa_cflgt3.h
@@ -0,0 +1,34 @@
+/*
+ * Autogenerated file by GPU Top : https://github.com/rib/gputop
+ * DO NOT EDIT manually!
+ *
+ *
+ * Copyright (c) 2015 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ *
+ */
+
+#ifndef __I915_OA_CFLGT3_H__
+#define __I915_OA_CFLGT3_H__
+
+extern void i915_perf_load_test_config_cflgt3(struct drm_i915_private *dev_priv);
+
+#endif
diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index 45aef15b9e7c..7271debe0417 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -207,6 +207,7 @@
 #include "i915_oa_kblgt3.h"
 #include "i915_oa_glk.h"
 #include "i915_oa_cflgt2.h"
+#include "i915_oa_cflgt3.h"
 
 /* HW requires this to be a power of two, between 128k and 16M, though driver
  * is currently generally designed assuming the largest 16M size is used such
@@ -2934,6 +2935,8 @@ void i915_perf_register(struct drm_i915_private *dev_priv)
 	} else if (IS_COFFEELAKE(dev_priv)) {
 		if (IS_CFL_GT2(dev_priv))
 			i915_perf_load_test_config_cflgt2(dev_priv);
+		if (IS_CFL_GT3(dev_priv))
+			i915_perf_load_test_config_cflgt3(dev_priv);
 	}
 
 	if (dev_priv->perf.oa.test_config.id == 0)
-- 
2.15.0


* [PATCH v2 3/9] drm/i915/perf: refactor perf setup
  2017-11-02 16:29 [PATCH v2 0/9] i915: Cannonlake perf support Lionel Landwerlin
  2017-11-02 16:29 ` [PATCH v2 1/9] drm/i915/perf: complete whitelisting for OA programming on HSW Lionel Landwerlin
  2017-11-02 16:29 ` [PATCH v2 2/9] drm/i915/perf: add support for Coffeelake GT3 Lionel Landwerlin
@ 2017-11-02 16:29 ` Lionel Landwerlin
  2017-11-10 11:04   ` Matthew Auld
  2017-11-02 16:29 ` [PATCH v2 4/9] drm/i915: fix register naming Lionel Landwerlin
                   ` (7 subsequent siblings)
  10 siblings, 1 reply; 31+ messages in thread
From: Lionel Landwerlin @ 2017-11-02 16:29 UTC (permalink / raw)
  To: intel-gfx

Gen8/9 aren't very different and we can merge some of this code.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
---
 drivers/gpu/drm/i915/i915_perf.c | 48 +++++++++++++++++++++-------------------
 1 file changed, 25 insertions(+), 23 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index 7271debe0417..802928c54f06 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -3423,41 +3423,46 @@ void i915_perf_init(struct drm_i915_private *dev_priv)
 		 * worth the complexity to maintain now that BDW+ enable
 		 * execlist mode by default.
 		 */
-		dev_priv->perf.oa.ops.is_valid_b_counter_reg =
-			gen7_is_valid_b_counter_addr;
-		dev_priv->perf.oa.ops.is_valid_mux_reg =
-			gen8_is_valid_mux_addr;
-		dev_priv->perf.oa.ops.is_valid_flex_reg =
-			gen8_is_valid_flex_addr;
+		dev_priv->perf.oa.oa_formats = gen8_plus_oa_formats;
 
 		dev_priv->perf.oa.ops.init_oa_buffer = gen8_init_oa_buffer;
-		dev_priv->perf.oa.ops.enable_metric_set = gen8_enable_metric_set;
-		dev_priv->perf.oa.ops.disable_metric_set = gen8_disable_metric_set;
 		dev_priv->perf.oa.ops.oa_enable = gen8_oa_enable;
 		dev_priv->perf.oa.ops.oa_disable = gen8_oa_disable;
 		dev_priv->perf.oa.ops.read = gen8_oa_read;
 		dev_priv->perf.oa.ops.oa_hw_tail_read = gen8_oa_hw_tail_read;
 
-		dev_priv->perf.oa.oa_formats = gen8_plus_oa_formats;
-
-		if (IS_GEN8(dev_priv)) {
-			dev_priv->perf.oa.ctx_oactxctrl_offset = 0x120;
-			dev_priv->perf.oa.ctx_flexeu0_offset = 0x2ce;
+		if (IS_GEN8(dev_priv) || IS_GEN9(dev_priv)) {
+			dev_priv->perf.oa.ops.is_valid_b_counter_reg =
+				gen7_is_valid_b_counter_addr;
+			dev_priv->perf.oa.ops.is_valid_mux_reg =
+				gen8_is_valid_mux_addr;
+			dev_priv->perf.oa.ops.is_valid_flex_reg =
+				gen8_is_valid_flex_addr;
 
-			dev_priv->perf.oa.timestamp_frequency = 12500000;
-
-			dev_priv->perf.oa.gen8_valid_ctx_bit = (1<<25);
 			if (IS_CHERRYVIEW(dev_priv)) {
 				dev_priv->perf.oa.ops.is_valid_mux_reg =
 					chv_is_valid_mux_addr;
 			}
-		} else if (IS_GEN9(dev_priv)) {
-			dev_priv->perf.oa.ctx_oactxctrl_offset = 0x128;
-			dev_priv->perf.oa.ctx_flexeu0_offset = 0x3de;
 
-			dev_priv->perf.oa.gen8_valid_ctx_bit = (1<<16);
+			dev_priv->perf.oa.ops.enable_metric_set = gen8_enable_metric_set;
+			dev_priv->perf.oa.ops.disable_metric_set = gen8_disable_metric_set;
+
+			if (IS_GEN8(dev_priv)) {
+				dev_priv->perf.oa.ctx_oactxctrl_offset = 0x120;
+				dev_priv->perf.oa.ctx_flexeu0_offset = 0x2ce;
+
+				dev_priv->perf.oa.gen8_valid_ctx_bit = (1<<25);
+			} else {
+				dev_priv->perf.oa.ctx_oactxctrl_offset = 0x128;
+				dev_priv->perf.oa.ctx_flexeu0_offset = 0x3de;
+
+				dev_priv->perf.oa.gen8_valid_ctx_bit = (1<<16);
+			}
 
 			switch (dev_priv->info.platform) {
+			case INTEL_BROADWELL:
+				dev_priv->perf.oa.timestamp_frequency = 12500000;
+				break;
 			case INTEL_BROXTON:
 			case INTEL_GEMINILAKE:
 				dev_priv->perf.oa.timestamp_frequency = 19200000;
@@ -3468,9 +3473,6 @@ void i915_perf_init(struct drm_i915_private *dev_priv)
 				dev_priv->perf.oa.timestamp_frequency = 12000000;
 				break;
 			default:
-				/* Leave timestamp_frequency to 0 so we can
-				 * detect unsupported platforms.
-				 */
 				break;
 			}
 		}
-- 
2.15.0


* [PATCH v2 4/9] drm/i915: fix register naming
  2017-11-02 16:29 [PATCH v2 0/9] i915: Cannonlake perf support Lionel Landwerlin
                   ` (2 preceding siblings ...)
  2017-11-02 16:29 ` [PATCH v2 3/9] drm/i915/perf: refactor perf setup Lionel Landwerlin
@ 2017-11-02 16:29 ` Lionel Landwerlin
  2017-11-10 11:11   ` Matthew Auld
  2017-11-02 16:29 ` [PATCH v2 5/9] drm/i915/perf: enable perf support on CNL Lionel Landwerlin
                   ` (6 subsequent siblings)
  10 siblings, 1 reply; 31+ messages in thread
From: Lionel Landwerlin @ 2017-11-02 16:29 UTC (permalink / raw)
  To: intel-gfx

This name was added with the whitelisting of registers for building up OA
configs. The register itself is covered by a range check in the gen8
whitelist:

   addr >= RPM_CONFIG0.reg && addr <= NOA_CONFIG(8).reg

which is why the name isn't used anywhere.
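
As a quick sanity check of that claim, the arithmetic can be spelled out
in a standalone sketch. The offsets come from i915_reg.h as quoted in
this series; the helper name in_gen8_rpm_noa_range() is hypothetical and
plain integers replace the _MMIO() wrappers:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Offsets as they appear in i915_reg.h in this series. */
#define RPM_CONFIG0   0x0D00u
#define RPC_CONFIG    0x0D08u
#define NOA_CONFIG(i) (0x0D0Cu + (i) * 4u)

/* 0x0D08 lies between 0x0D00 and NOA_CONFIG(8) == 0x0D2C. */
static_assert(RPC_CONFIG >= RPM_CONFIG0 && RPC_CONFIG <= NOA_CONFIG(8),
	      "RPC_CONFIG must fall inside the whitelisted range");

/* Hypothetical helper mirroring the range test quoted above. */
static bool in_gen8_rpm_noa_range(uint32_t addr)
{
	return addr >= RPM_CONFIG0 && addr <= NOA_CONFIG(8);
}
```

Since the range already admits offset 0x0D08, renaming the macro has no
effect on which addresses userspace configs may program.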

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
---
 drivers/gpu/drm/i915/i915_reg.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index ee4941a1df20..d27092ec4f74 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -1118,7 +1118,7 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg)
 #define RPM_CONFIG1	    _MMIO(0x0D04)
 
 /* RPC unit config (Gen8+) */
-#define RPM_CONFIG	    _MMIO(0x0D08)
+#define RPC_CONFIG	    _MMIO(0x0D08)
 
 /* NOA (HSW) */
 #define HSW_MBVID2_NOA0		_MMIO(0x9E80)
-- 
2.15.0


* [PATCH v2 5/9] drm/i915/perf: enable perf support on CNL
  2017-11-02 16:29 [PATCH v2 0/9] i915: Cannonlake perf support Lionel Landwerlin
                   ` (3 preceding siblings ...)
  2017-11-02 16:29 ` [PATCH v2 4/9] drm/i915: fix register naming Lionel Landwerlin
@ 2017-11-02 16:29 ` Lionel Landwerlin
  2017-11-10 12:42   ` Matthew Auld
  2017-11-02 16:29 ` [PATCH v2 6/9] drm/i915: expose command stream timestamp frequency to userspace Lionel Landwerlin
                   ` (5 subsequent siblings)
  10 siblings, 1 reply; 31+ messages in thread
From: Lionel Landwerlin @ 2017-11-02 16:29 UTC (permalink / raw)
  To: intel-gfx

This adds new registers to the whitelist used to validate configs emitted
from userspace.
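
The extra range accepted on Gen10 can be illustrated with a small
standalone sketch. The two offsets are taken from the i915_reg.h hunk
below; the helper name gen10_extra_mux_addr_ok() is hypothetical and
covers only the condition added on top of gen8_is_valid_mux_addr():

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Offsets copied from the i915_reg.h hunk of this patch. */
#define OA_PERFCNT3_LO 0x91C8u
#define OA_PERFCNT4_HI 0x91DCu

static_assert(OA_PERFCNT3_LO <= OA_PERFCNT4_HI,
	      "PERFCNT range must be ordered");

/*
 * Hypothetical helper: the additional range test only. Every offset
 * between the two bounds passes, including 0x91D0 and 0x91D4, which
 * have no named define in this patch.
 */
static bool gen10_extra_mux_addr_ok(uint32_t addr)
{
	return addr >= OA_PERFCNT3_LO && addr <= OA_PERFCNT4_HI;
}
```

Note that OA_PERFCNT3_LO/HI share their offsets (0x91C8/0x91CC) with the
existing OA_PERFMATRIX_LO/HI defines.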

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
---
 drivers/gpu/drm/i915/Makefile      |   3 +-
 drivers/gpu/drm/i915/i915_oa_cnl.c | 121 +++++++++++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/i915_oa_cnl.h |  34 +++++++++++
 drivers/gpu/drm/i915/i915_perf.c   |  41 ++++++++++++-
 drivers/gpu/drm/i915/i915_reg.h    |   5 ++
 5 files changed, 202 insertions(+), 2 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/i915_oa_cnl.c
 create mode 100644 drivers/gpu/drm/i915/i915_oa_cnl.h

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 3c419455b0af..f7afd44214b5 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -163,7 +163,8 @@ i915-y += i915_perf.o \
 	  i915_oa_kblgt3.o \
 	  i915_oa_glk.o \
 	  i915_oa_cflgt2.o \
-	  i915_oa_cflgt3.o
+	  i915_oa_cflgt3.o \
+	  i915_oa_cnl.o
 
 ifeq ($(CONFIG_DRM_I915_GVT),y)
 i915-y += intel_gvt.o
diff --git a/drivers/gpu/drm/i915/i915_oa_cnl.c b/drivers/gpu/drm/i915/i915_oa_cnl.c
new file mode 100644
index 000000000000..ff0ac3627cc4
--- /dev/null
+++ b/drivers/gpu/drm/i915/i915_oa_cnl.c
@@ -0,0 +1,121 @@
+/*
+ * Autogenerated file by GPU Top : https://github.com/rib/gputop
+ * DO NOT EDIT manually!
+ *
+ *
+ * Copyright (c) 2015 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ *
+ */
+
+#include <linux/sysfs.h>
+
+#include "i915_drv.h"
+#include "i915_oa_cnl.h"
+
+static const struct i915_oa_reg b_counter_config_test_oa[] = {
+	{ _MMIO(0x2740), 0x00000000 },
+	{ _MMIO(0x2710), 0x00000000 },
+	{ _MMIO(0x2714), 0xf0800000 },
+	{ _MMIO(0x2720), 0x00000000 },
+	{ _MMIO(0x2724), 0xf0800000 },
+	{ _MMIO(0x2770), 0x00000004 },
+	{ _MMIO(0x2774), 0x0000ffff },
+	{ _MMIO(0x2778), 0x00000003 },
+	{ _MMIO(0x277c), 0x0000ffff },
+	{ _MMIO(0x2780), 0x00000007 },
+	{ _MMIO(0x2784), 0x0000ffff },
+	{ _MMIO(0x2788), 0x00100002 },
+	{ _MMIO(0x278c), 0x0000fff7 },
+	{ _MMIO(0x2790), 0x00100002 },
+	{ _MMIO(0x2794), 0x0000ffcf },
+	{ _MMIO(0x2798), 0x00100082 },
+	{ _MMIO(0x279c), 0x0000ffef },
+	{ _MMIO(0x27a0), 0x001000c2 },
+	{ _MMIO(0x27a4), 0x0000ffe7 },
+	{ _MMIO(0x27a8), 0x00100001 },
+	{ _MMIO(0x27ac), 0x0000ffe7 },
+};
+
+static const struct i915_oa_reg flex_eu_config_test_oa[] = {
+};
+
+static const struct i915_oa_reg mux_config_test_oa[] = {
+	{ _MMIO(0xd04), 0x00000200 },
+	{ _MMIO(0x9884), 0x00000007 },
+	{ _MMIO(0x9888), 0x17060000 },
+	{ _MMIO(0x9840), 0x00000000 },
+	{ _MMIO(0x9884), 0x00000007 },
+	{ _MMIO(0x9888), 0x13034000 },
+	{ _MMIO(0x9884), 0x00000007 },
+	{ _MMIO(0x9888), 0x07060066 },
+	{ _MMIO(0x9884), 0x00000007 },
+	{ _MMIO(0x9888), 0x05060000 },
+	{ _MMIO(0x9884), 0x00000007 },
+	{ _MMIO(0x9888), 0x0f080040 },
+	{ _MMIO(0x9884), 0x00000007 },
+	{ _MMIO(0x9888), 0x07091000 },
+	{ _MMIO(0x9884), 0x00000007 },
+	{ _MMIO(0x9888), 0x0f041000 },
+	{ _MMIO(0x9884), 0x00000007 },
+	{ _MMIO(0x9888), 0x1d004000 },
+	{ _MMIO(0x9884), 0x00000007 },
+	{ _MMIO(0x9888), 0x35000000 },
+	{ _MMIO(0x9884), 0x00000007 },
+	{ _MMIO(0x9888), 0x49000000 },
+	{ _MMIO(0x9884), 0x00000007 },
+	{ _MMIO(0x9888), 0x3d000000 },
+	{ _MMIO(0x9884), 0x00000007 },
+	{ _MMIO(0x9888), 0x31000000 },
+};
+
+static ssize_t
+show_test_oa_id(struct device *kdev, struct device_attribute *attr, char *buf)
+{
+	return sprintf(buf, "1\n");
+}
+
+void
+i915_perf_load_test_config_cnl(struct drm_i915_private *dev_priv)
+{
+	strncpy(dev_priv->perf.oa.test_config.uuid,
+		"db41edd4-d8e7-4730-ad11-b9a2d6833503",
+		UUID_STRING_LEN);
+	dev_priv->perf.oa.test_config.id = 1;
+
+	dev_priv->perf.oa.test_config.mux_regs = mux_config_test_oa;
+	dev_priv->perf.oa.test_config.mux_regs_len = ARRAY_SIZE(mux_config_test_oa);
+
+	dev_priv->perf.oa.test_config.b_counter_regs = b_counter_config_test_oa;
+	dev_priv->perf.oa.test_config.b_counter_regs_len = ARRAY_SIZE(b_counter_config_test_oa);
+
+	dev_priv->perf.oa.test_config.flex_regs = flex_eu_config_test_oa;
+	dev_priv->perf.oa.test_config.flex_regs_len = ARRAY_SIZE(flex_eu_config_test_oa);
+
+	dev_priv->perf.oa.test_config.sysfs_metric.name = "db41edd4-d8e7-4730-ad11-b9a2d6833503";
+	dev_priv->perf.oa.test_config.sysfs_metric.attrs = dev_priv->perf.oa.test_config.attrs;
+
+	dev_priv->perf.oa.test_config.attrs[0] = &dev_priv->perf.oa.test_config.sysfs_metric_id.attr;
+
+	dev_priv->perf.oa.test_config.sysfs_metric_id.attr.name = "id";
+	dev_priv->perf.oa.test_config.sysfs_metric_id.attr.mode = 0444;
+	dev_priv->perf.oa.test_config.sysfs_metric_id.show = show_test_oa_id;
+}
diff --git a/drivers/gpu/drm/i915/i915_oa_cnl.h b/drivers/gpu/drm/i915/i915_oa_cnl.h
new file mode 100644
index 000000000000..fb918b131105
--- /dev/null
+++ b/drivers/gpu/drm/i915/i915_oa_cnl.h
@@ -0,0 +1,34 @@
+/*
+ * Autogenerated file by GPU Top : https://github.com/rib/gputop
+ * DO NOT EDIT manually!
+ *
+ *
+ * Copyright (c) 2015 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ *
+ */
+
+#ifndef __I915_OA_CNL_H__
+#define __I915_OA_CNL_H__
+
+extern void i915_perf_load_test_config_cnl(struct drm_i915_private *dev_priv);
+
+#endif
diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index 802928c54f06..00be015e01df 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -208,6 +208,7 @@
 #include "i915_oa_glk.h"
 #include "i915_oa_cflgt2.h"
 #include "i915_oa_cflgt3.h"
+#include "i915_oa_cnl.h"
 
 /* HW requires this to be a power of two, between 128k and 16M, though driver
  * is currently generally designed assuming the largest 16M size is used such
@@ -1852,7 +1853,7 @@ static int gen8_enable_metric_set(struct drm_i915_private *dev_priv,
 	 * be read back from automatically triggered reports, as part of the
 	 * RPT_ID field.
 	 */
-	if (IS_GEN9(dev_priv)) {
+	if (IS_GEN9(dev_priv) || IS_GEN10(dev_priv)) {
 		I915_WRITE(GEN8_OA_DEBUG,
 			   _MASKED_BIT_ENABLE(GEN9_OA_DEBUG_DISABLE_CLK_RATIO_REPORTS |
 					      GEN9_OA_DEBUG_INCLUDE_CLK_RATIO));
@@ -1885,6 +1886,16 @@ static void gen8_disable_metric_set(struct drm_i915_private *dev_priv)
 
 }
 
+static void gen10_disable_metric_set(struct drm_i915_private *dev_priv)
+{
+	/* Reset all contexts' slices/subslices configurations. */
+	gen8_configure_all_contexts(dev_priv, NULL, false);
+
+	/* Make sure we disable noa to save power. */
+	I915_WRITE(RPM_CONFIG1,
+		   I915_READ(RPM_CONFIG1) & ~GEN10_GT_NOA_ENABLE);
+}
+
 static void gen7_oa_enable(struct drm_i915_private *dev_priv)
 {
 	/*
@@ -2937,6 +2948,8 @@ void i915_perf_register(struct drm_i915_private *dev_priv)
 			i915_perf_load_test_config_cflgt2(dev_priv);
 		if (IS_CFL_GT3(dev_priv))
 			i915_perf_load_test_config_cflgt3(dev_priv);
+	} else if (IS_CANNONLAKE(dev_priv)) {
+		i915_perf_load_test_config_cnl(dev_priv);
 	}
 
 	if (dev_priv->perf.oa.test_config.id == 0)
@@ -3022,6 +3035,12 @@ static bool gen8_is_valid_mux_addr(struct drm_i915_private *dev_priv, u32 addr)
 		(addr >= RPM_CONFIG0.reg && addr <= NOA_CONFIG(8).reg);
 }
 
+static bool gen10_is_valid_mux_addr(struct drm_i915_private *dev_priv, u32 addr)
+{
+	return gen8_is_valid_mux_addr(dev_priv, addr) ||
+		(addr >= OA_PERFCNT3_LO.reg && addr <= OA_PERFCNT4_HI.reg);
+}
+
 static bool hsw_is_valid_mux_addr(struct drm_i915_private *dev_priv, u32 addr)
 {
 	return gen7_is_valid_mux_addr(dev_priv, addr) ||
@@ -3475,6 +3494,26 @@ void i915_perf_init(struct drm_i915_private *dev_priv)
 			default:
 				break;
 			}
+		} else if (IS_GEN10(dev_priv)) {
+			dev_priv->perf.oa.ops.is_valid_b_counter_reg =
+				gen7_is_valid_b_counter_addr;
+			dev_priv->perf.oa.ops.is_valid_mux_reg =
+				gen10_is_valid_mux_addr;
+			dev_priv->perf.oa.ops.is_valid_flex_reg =
+				gen8_is_valid_flex_addr;
+
+			dev_priv->perf.oa.ops.enable_metric_set = gen8_enable_metric_set;
+			dev_priv->perf.oa.ops.disable_metric_set = gen10_disable_metric_set;
+
+			dev_priv->perf.oa.ctx_oactxctrl_offset = 0x128;
+			dev_priv->perf.oa.ctx_flexeu0_offset = 0x3de;
+
+			dev_priv->perf.oa.gen8_valid_ctx_bit = (1<<16);
+
+			/* Default frequency, although we need to read it from
+			 * the register as it might vary between parts.
+			 */
+			dev_priv->perf.oa.timestamp_frequency = 12000000;
 		}
 	}
 
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index d27092ec4f74..a2223f01ee2a 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -1109,6 +1109,10 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg)
 #define OA_PERFCNT1_HI      _MMIO(0x91BC)
 #define OA_PERFCNT2_LO      _MMIO(0x91C0)
 #define OA_PERFCNT2_HI      _MMIO(0x91C4)
+#define OA_PERFCNT3_LO      _MMIO(0x91C8)
+#define OA_PERFCNT3_HI      _MMIO(0x91CC)
+#define OA_PERFCNT4_LO      _MMIO(0x91D8)
+#define OA_PERFCNT4_HI      _MMIO(0x91DC)
 
 #define OA_PERFMATRIX_LO    _MMIO(0x91C8)
 #define OA_PERFMATRIX_HI    _MMIO(0x91CC)
@@ -1116,6 +1120,7 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg)
 /* RPM unit config (Gen8+) */
 #define RPM_CONFIG0	    _MMIO(0x0D00)
 #define RPM_CONFIG1	    _MMIO(0x0D04)
+#define  GEN10_GT_NOA_ENABLE  (1 << 9)
 
 /* RPC unit config (Gen8+) */
 #define RPC_CONFIG	    _MMIO(0x0D08)
-- 
2.15.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

* [PATCH v2 6/9] drm/i915: expose command stream timestamp frequency to userspace
  2017-11-02 16:29 [PATCH v2 0/9] i915: Cannonlake perf support Lionel Landwerlin
                   ` (4 preceding siblings ...)
  2017-11-02 16:29 ` [PATCH v2 5/9] drm/i915/perf: enable perf support on CNL Lionel Landwerlin
@ 2017-11-02 16:29 ` Lionel Landwerlin
  2017-11-07  0:01   ` Rafael Antognolli
                     ` (3 more replies)
  2017-11-02 16:29 ` [PATCH v2 7/9] drm/i915/perf: reuse timestamp frequency from device info Lionel Landwerlin
                   ` (4 subsequent siblings)
  10 siblings, 4 replies; 31+ messages in thread
From: Lionel Landwerlin @ 2017-11-02 16:29 UTC (permalink / raw)
  To: intel-gfx

This used to be fixed per generation, but starting with CNL userspace
cannot tell the frequency just from the PCI ID. Let's make this information
available. This is particularly useful for performance monitoring, where
much of the normalization work is done using those timestamps (this includes
pipeline statistics in both GL & Vulkan as well as OA reports).
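
For illustration only (this is not part of the patch): a minimal sketch of
how userspace might normalize a timestamp delta once it knows the frequency.
The helper name cs_ticks_to_ns() is hypothetical, and freq_hz is assumed to
be the value returned for I915_PARAM_CS_TIMESTAMP_FREQUENCY.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical userspace helper: convert a delta of command streamer
 * timestamp ticks into nanoseconds. freq_hz is assumed to come from
 * the I915_PARAM_CS_TIMESTAMP_FREQUENCY getparam.
 */
static uint64_t cs_ticks_to_ns(uint64_t ticks, uint64_t freq_hz)
{
	assert(freq_hz != 0);

	/* Split the division so that ticks * 1e9 cannot overflow 64 bits. */
	return (ticks / freq_hz) * 1000000000ULL +
	       ((ticks % freq_hz) * 1000000000ULL) / freq_hz;
}
```

With a 12.5 MHz frequency, one tick maps to 80 ns, which matches the
granularity quoted from the PRMs for Gen5 to Gen7 in the patch below.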

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
---
 drivers/gpu/drm/i915/i915_debugfs.c      |  2 +
 drivers/gpu/drm/i915/i915_drv.c          |  3 +
 drivers/gpu/drm/i915/i915_drv.h          |  2 +
 drivers/gpu/drm/i915/i915_reg.h          | 21 +++++++
 drivers/gpu/drm/i915/intel_device_info.c | 99 ++++++++++++++++++++++++++++++++
 include/uapi/drm/i915_drm.h              |  6 ++
 6 files changed, 133 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 39883cd915db..0897fd616a1f 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -3246,6 +3246,8 @@ static int i915_engine_info(struct seq_file *m, void *unused)
 		   yesno(dev_priv->gt.awake));
 	seq_printf(m, "Global active requests: %d\n",
 		   dev_priv->gt.active_requests);
+	seq_printf(m, "CS timestamp frequency: %llu\n",
+		   dev_priv->info.cs_timestamp_frequency);
 
 	p = drm_seq_file_printer(m);
 	for_each_engine(engine, dev_priv, id)
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index e7e9e061073b..fdd23e79fb46 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -416,6 +416,9 @@ static int i915_getparam(struct drm_device *dev, void *data,
 		if (!value)
 			return -ENODEV;
 		break;
+	case I915_PARAM_CS_TIMESTAMP_FREQUENCY:
+		value = INTEL_INFO(dev_priv)->cs_timestamp_frequency;
+		break;
 	default:
 		DRM_DEBUG("Unknown parameter %d\n", param->param);
 		return -EINVAL;
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 6cb7cd7f9420..4e804aaeaae1 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -886,6 +886,8 @@ struct intel_device_info {
 	/* Slice/subslice/EU info */
 	struct sseu_dev_info sseu;
 
+	uint64_t cs_timestamp_frequency;
+
 	struct color_luts {
 		u16 degamma_lut_size;
 		u16 gamma_lut_size;
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index a2223f01ee2a..f392f28f2cfa 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -1119,9 +1119,24 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg)
 
 /* RPM unit config (Gen8+) */
 #define RPM_CONFIG0	    _MMIO(0x0D00)
+#define  GEN9_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_SHIFT	3
+#define  GEN9_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_MASK	(1 << GEN9_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_SHIFT)
+#define  GEN9_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_19_2_MHZ	0
+#define  GEN9_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_24_MHZ	1
+#define  GEN10_RPM_CONFIG0_CTC_SHIFT_PARAMETER_SHIFT	1
+#define  GEN10_RPM_CONFIG0_CTC_SHIFT_PARAMETER_MASK	(0x3 << GEN10_RPM_CONFIG0_CTC_SHIFT_PARAMETER_SHIFT)
+
 #define RPM_CONFIG1	    _MMIO(0x0D04)
 #define  GEN10_GT_NOA_ENABLE  (1 << 9)
 
+/* GPM unit config (assuming Gen8+, documentation is fuzzy...) */
+#define GEN8_CTC_MODE			_MMIO(0xA26C)
+#define  GEN8_CTC_SOURCE_PARAMETER_MASK 1
+#define  GEN8_CTC_SOURCE_CRYSTAL_CLOCK	0
+#define  GEN8_CTC_SOURCE_DIVIDE_LOGIC	1
+#define  GEN8_CTC_SHIFT_PARAMETER_SHIFT	1
+#define  GEN8_CTC_SHIFT_PARAMETER_MASK	(0x3 << GEN8_CTC_SHIFT_PARAMETER_SHIFT)
+
 /* RPC unit config (Gen8+) */
 #define RPC_CONFIG	    _MMIO(0x0D08)
 
@@ -8865,6 +8880,12 @@ enum skl_power_gate {
 #define ILK_TIMESTAMP_HI	_MMIO(0x70070)
 #define IVB_TIMESTAMP_CTR	_MMIO(0x44070)
 
+#define GEN8_TIMESTAMP_OVERRIDE				_MMIO(0x44074)
+#define  GEN8_TIMESTAMP_OVERRIDE_US_COUNTER_SHIFT		0
+#define  GEN8_TIMESTAMP_OVERRIDE_US_COUNTER_MASK		0x3ff
+#define  GEN8_TIMESTAMP_OVERRIDE_US_COUNTER_DENOMINATOR_SHIFT	12
+#define  GEN8_TIMESTAMP_OVERRIDE_US_COUNTER_DENOMINATOR_MASK	(0xf << 12)
+
 #define _PIPE_FRMTMSTMP_A		0x70048
 #define PIPE_FRMTMSTMP(pipe)		\
 			_MMIO_PIPE2(pipe, _PIPE_FRMTMSTMP_A)
diff --git a/drivers/gpu/drm/i915/intel_device_info.c b/drivers/gpu/drm/i915/intel_device_info.c
index db03d179fc85..9b71a9b6d80e 100644
--- a/drivers/gpu/drm/i915/intel_device_info.c
+++ b/drivers/gpu/drm/i915/intel_device_info.c
@@ -329,6 +329,100 @@ static void broadwell_sseu_info_init(struct drm_i915_private *dev_priv)
 	sseu->has_eu_pg = 0;
 }
 
+static u64 read_timestamp_frequency_from_divide(struct drm_i915_private *dev_priv)
+{
+	u32 ts_override = I915_READ(GEN8_TIMESTAMP_OVERRIDE);
+	u64 base_freq, frac_freq;
+
+	base_freq = ((ts_override & GEN8_TIMESTAMP_OVERRIDE_US_COUNTER_MASK) >>
+		     GEN8_TIMESTAMP_OVERRIDE_US_COUNTER_SHIFT) + 1;
+	base_freq *= 1000000;
+
+	frac_freq = ((ts_override &
+		      GEN8_TIMESTAMP_OVERRIDE_US_COUNTER_DENOMINATOR_MASK) >>
+		     GEN8_TIMESTAMP_OVERRIDE_US_COUNTER_DENOMINATOR_SHIFT);
+	if (frac_freq != 0)
+		frac_freq = 1000000 / (frac_freq + 1);
+
+	return base_freq + frac_freq;
+}
+
+static u64 read_timestamp_frequency(struct drm_i915_private *dev_priv)
+{
+	if (INTEL_GEN(dev_priv) <= 4) {
+		/* PRMs say:
+		 *
+		 *     "The value in this register increments once every 16
+		 *      hclks." ("CLKCFG" register)
+		 *
+		 * Since dev_priv->rawclk_freq stores the value in kHz divided
+		 * by 4, we just need to divide it again by 4.
+		 */
+		return (dev_priv->rawclk_freq * 1000) / 4;
+	} else if (INTEL_GEN(dev_priv) <= 7) {
+		/* PRMs say:
+		 *
+		 *     "The PCU TSC counts 10ns increments; this timestamp
+		 *      reflects bits 38:3 of the TSC (i.e. 80ns granularity,
+		 *      rolling over every 1.5 hours)."
+		 */
+		return 12500000;
+	} else if (INTEL_GEN(dev_priv) <= 9) {
+		u32 ctc_reg = I915_READ(GEN8_CTC_MODE);
+		u64 freq = 0;
+
+		if ((ctc_reg & GEN8_CTC_SOURCE_PARAMETER_MASK) == GEN8_CTC_SOURCE_DIVIDE_LOGIC)
+			freq = read_timestamp_frequency_from_divide(dev_priv);
+		else
+			freq = IS_GEN9_LP(dev_priv) ? 19200000 : 24000000;
+
+		/* Now figure out how the command stream's timestamp register
+		 * increments from this frequency (it might increment only
+		 * every few clock cycles).
+		 */
+		freq >>= 3 - ((ctc_reg & GEN8_CTC_SHIFT_PARAMETER_MASK) >>
+			      GEN8_CTC_SHIFT_PARAMETER_SHIFT);
+
+		return freq;
+	} else if (INTEL_GEN(dev_priv) <= 10) {
+		u32 ctc_reg = I915_READ(GEN8_CTC_MODE);
+		u64 freq = 0;
+		u32 rpm_config_reg = 0;
+
+		/* First figure out the reference frequency. There are 2 ways
+		 * we can compute the frequency, either through the
+		 * TIMESTAMP_OVERRIDE register or through the
+		 * RPM_CONFIG register. CTC_MODE tells us which
+		 * one we should use.
+		 */
+		if ((ctc_reg & GEN8_CTC_SOURCE_PARAMETER_MASK) == GEN8_CTC_SOURCE_DIVIDE_LOGIC) {
+			freq = read_timestamp_frequency_from_divide(dev_priv);
+		} else {
+			u32 crystal_clock;
+
+			rpm_config_reg = I915_READ(RPM_CONFIG0);
+			crystal_clock = (rpm_config_reg &
+					 GEN9_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_MASK) >>
+				GEN9_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_SHIFT;
+			freq = crystal_clock == GEN9_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_19_2_MHZ ?
+				19200000 : 24000000;
+		}
+
+		/* Now figure out how the command stream's timestamp register
+		 * increments from this frequency (it might increment only
+		 * every few clock cycles).
+		 */
+		freq >>= 3 - ((rpm_config_reg &
+			       GEN10_RPM_CONFIG0_CTC_SHIFT_PARAMETER_MASK) >>
+			      GEN10_RPM_CONFIG0_CTC_SHIFT_PARAMETER_SHIFT);
+
+		return freq;
+	}
+
+	DRM_ERROR("Unknown gen, unable to compute command stream timestamp frequency\n");
+	return 0;
+}
+
 /*
  * Determine various intel_device_info fields at runtime.
  *
@@ -450,6 +544,9 @@ void intel_device_info_runtime_init(struct drm_i915_private *dev_priv)
 	else if (INTEL_GEN(dev_priv) >= 10)
 		gen10_sseu_info_init(dev_priv);
 
+	/* Initialize command stream timestamp frequency */
+	info->cs_timestamp_frequency = read_timestamp_frequency(dev_priv);
+
 	DRM_DEBUG_DRIVER("slice mask: %04x\n", info->sseu.slice_mask);
 	DRM_DEBUG_DRIVER("slice total: %u\n", hweight8(info->sseu.slice_mask));
 	DRM_DEBUG_DRIVER("subslice total: %u\n",
@@ -465,4 +562,6 @@ void intel_device_info_runtime_init(struct drm_i915_private *dev_priv)
 			 info->sseu.has_subslice_pg ? "y" : "n");
 	DRM_DEBUG_DRIVER("has EU power gating: %s\n",
 			 info->sseu.has_eu_pg ? "y" : "n");
+	DRM_DEBUG_DRIVER("CS timestamp frequency: %llu\n",
+			 info->cs_timestamp_frequency);
 }
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 125bde7d9504..c3ff0d4947af 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -450,6 +450,12 @@ typedef struct drm_i915_irq_wait {
  */
 #define I915_PARAM_HAS_EXEC_FENCE_ARRAY  49
 
+/* Frequency of the command streamer timestamps given by the *_TIMESTAMP
+ * registers. This used to be fixed per platform but from CNL onwards, this
+ * might vary depending on the parts.
+ */
+#define I915_PARAM_CS_TIMESTAMP_FREQUENCY   50
+
 typedef struct drm_i915_getparam {
 	__s32 param;
 	/*
-- 
2.15.0

* [PATCH v2 7/9] drm/i915/perf: reuse timestamp frequency from device info
  2017-11-02 16:29 [PATCH v2 0/9] i915: Cannonlake perf support Lionel Landwerlin
                   ` (5 preceding siblings ...)
  2017-11-02 16:29 ` [PATCH v2 6/9] drm/i915: expose command stream timestamp frequency to userspace Lionel Landwerlin
@ 2017-11-02 16:29 ` Lionel Landwerlin
  2017-11-02 16:29 ` [PATCH v2 8/9] drm/i915: expose eu topology to userspace Lionel Landwerlin
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 31+ messages in thread
From: Lionel Landwerlin @ 2017-11-02 16:29 UTC (permalink / raw)
  To: intel-gfx

Now that the timestamp frequency is stored in the device info, we can drop
it from the perf part of the driver.

Note that this requires initializing perf after the frequency has been
computed, which is why i915_perf_init() moves from i915_driver_init_early()
to i915_driver_init_hw(), after intel_device_info_runtime_init().
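
For illustration only (not part of the patch): a standalone sketch of the
arithmetic that oa_exponent_to_ns() performs in the diff below, written with
plain C division instead of div_u64(). The name oa_period_ns() is
hypothetical, and the exponent is bounded to 31 here only so that the
multiplication cannot overflow 64 bits.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical standalone version of the OA period computation:
 * period_ns = 1e9 * (2 << exponent) / timestamp frequency.
 */
static uint64_t oa_period_ns(uint64_t freq_hz, unsigned int exponent)
{
	assert(freq_hz != 0);
	assert(exponent <= 31); /* keeps 1e9 * 2^(exponent + 1) below 2^64 */

	return (1000000000ULL * (2ULL << exponent)) / freq_hz;
}
```

This makes the ordering dependency visible: a frequency of zero, as would
be seen if perf were initialized before the device info, cannot yield a
meaningful period.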

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.c  |  4 ++--
 drivers/gpu/drm/i915/i915_drv.h  |  1 -
 drivers/gpu/drm/i915/i915_perf.c | 32 +++-----------------------------
 3 files changed, 5 insertions(+), 32 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index fdd23e79fb46..b8e9aca46692 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -929,8 +929,6 @@ static int i915_driver_init_early(struct drm_i915_private *dev_priv,
 
 	intel_detect_preproduction_hw(dev_priv);
 
-	i915_perf_init(dev_priv);
-
 	return 0;
 
 err_irq:
@@ -1094,6 +1092,8 @@ static int i915_driver_init_hw(struct drm_i915_private *dev_priv)
 
 	intel_sanitize_options(dev_priv);
 
+	i915_perf_init(dev_priv);
+
 	ret = i915_ggtt_probe_hw(dev_priv);
 	if (ret)
 		return ret;
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 4e804aaeaae1..5c4d09a98e88 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2615,7 +2615,6 @@ struct drm_i915_private {
 
 			bool periodic;
 			int period_exponent;
-			int timestamp_frequency;
 
 			struct i915_oa_config test_config;
 
diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index 00be015e01df..1f9d86b5cad4 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -2692,7 +2692,7 @@ i915_perf_open_ioctl_locked(struct drm_i915_private *dev_priv,
 static u64 oa_exponent_to_ns(struct drm_i915_private *dev_priv, int exponent)
 {
 	return div_u64(1000000000ULL * (2ULL << exponent),
-		       dev_priv->perf.oa.timestamp_frequency);
+		       INTEL_INFO(dev_priv)->cs_timestamp_frequency);
 }
 
 /**
@@ -3415,8 +3415,6 @@ static struct ctl_table dev_root[] = {
  */
 void i915_perf_init(struct drm_i915_private *dev_priv)
 {
-	dev_priv->perf.oa.timestamp_frequency = 0;
-
 	if (IS_HASWELL(dev_priv)) {
 		dev_priv->perf.oa.ops.is_valid_b_counter_reg =
 			gen7_is_valid_b_counter_addr;
@@ -3432,8 +3430,6 @@ void i915_perf_init(struct drm_i915_private *dev_priv)
 		dev_priv->perf.oa.ops.oa_hw_tail_read =
 			gen7_oa_hw_tail_read;
 
-		dev_priv->perf.oa.timestamp_frequency = 12500000;
-
 		dev_priv->perf.oa.oa_formats = hsw_oa_formats;
 	} else if (i915_modparams.enable_execlists) {
 		/* Note: that although we could theoretically also support the
@@ -3477,23 +3473,6 @@ void i915_perf_init(struct drm_i915_private *dev_priv)
 
 				dev_priv->perf.oa.gen8_valid_ctx_bit = (1<<16);
 			}
-
-			switch (dev_priv->info.platform) {
-			case INTEL_BROADWELL:
-				dev_priv->perf.oa.timestamp_frequency = 12500000;
-				break;
-			case INTEL_BROXTON:
-			case INTEL_GEMINILAKE:
-				dev_priv->perf.oa.timestamp_frequency = 19200000;
-				break;
-			case INTEL_SKYLAKE:
-			case INTEL_KABYLAKE:
-			case INTEL_COFFEELAKE:
-				dev_priv->perf.oa.timestamp_frequency = 12000000;
-				break;
-			default:
-				break;
-			}
 		} else if (IS_GEN10(dev_priv)) {
 			dev_priv->perf.oa.ops.is_valid_b_counter_reg =
 				gen7_is_valid_b_counter_addr;
@@ -3509,15 +3488,10 @@ void i915_perf_init(struct drm_i915_private *dev_priv)
 			dev_priv->perf.oa.ctx_flexeu0_offset = 0x3de;
 
 			dev_priv->perf.oa.gen8_valid_ctx_bit = (1<<16);
-
-			/* Default frequency, although we need to read it from
-			 * the register as it might vary between parts.
-			 */
-			dev_priv->perf.oa.timestamp_frequency = 12000000;
 		}
 	}
 
-	if (dev_priv->perf.oa.timestamp_frequency) {
+	if (dev_priv->perf.oa.ops.enable_metric_set) {
 		hrtimer_init(&dev_priv->perf.oa.poll_check_timer,
 				CLOCK_MONOTONIC, HRTIMER_MODE_REL);
 		dev_priv->perf.oa.poll_check_timer.function = oa_poll_check_timer_cb;
@@ -3528,7 +3502,7 @@ void i915_perf_init(struct drm_i915_private *dev_priv)
 		spin_lock_init(&dev_priv->perf.oa.oa_buffer.ptr_lock);
 
 		oa_sample_rate_hard_limit =
-			dev_priv->perf.oa.timestamp_frequency / 2;
+			INTEL_INFO(dev_priv)->cs_timestamp_frequency / 2;
 		dev_priv->perf.sysctl_header = register_sysctl_table(dev_root);
 
 		mutex_init(&dev_priv->perf.metrics_lock);
-- 
2.15.0

* [PATCH v2 8/9] drm/i915: expose eu topology to userspace
  2017-11-02 16:29 [PATCH v2 0/9] i915: Cannonlake perf support Lionel Landwerlin
                   ` (6 preceding siblings ...)
  2017-11-02 16:29 ` [PATCH v2 7/9] drm/i915/perf: reuse timestamp frequency from device info Lionel Landwerlin
@ 2017-11-02 16:29 ` Lionel Landwerlin
  2017-11-02 16:35   ` Chris Wilson
  2017-11-02 16:29 ` [PATCH v2 9/9] drm/i915/debugfs: reuse max slice/subslices already stored in sseu Lionel Landwerlin
                   ` (2 subsequent siblings)
  10 siblings, 1 reply; 31+ messages in thread
From: Lionel Landwerlin @ 2017-11-02 16:29 UTC (permalink / raw)
  To: intel-gfx

With the introduction of asymmetric slices in CNL, we can no longer rely on
the previous SUBSLICE_MASK getparam. Here we introduce a more detailed way of
querying the Gen's GPU topology that doesn't aggregate numbers.

This is essential for monitoring parts of the GPU with the OA unit, because
counters need to be normalized to the number of EUs/subslices/slices. The
current aggregated numbers like EU_TOTAL do not give us sufficient
information.

This change introduces a new way to query properties of the GPU, making
room for new queries (some media related topology could be exposed in
the future).
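
For illustration only (not part of the patch): a sketch of how userspace
might walk the returned mask data. It assumes the layout produced by
i915_getparam_topology() in this version of the patch, i.e. slice-major
byte arrays where params[0] is the stride in bytes per slice and, for the
EU query, params[1] is the stride in bytes per subslice. The helper names
are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: count the bits set in a mask buffer. */
static unsigned int topo_count_bits(const uint8_t *data, size_t size)
{
	unsigned int total = 0;
	size_t i;

	for (i = 0; i < size; i++) {
		uint8_t b;

		/* Clear the lowest set bit until none remain. */
		for (b = data[i]; b; b &= b - 1)
			total++;
	}

	return total;
}

/*
 * Hypothetical helper: test whether a given EU is enabled.
 * slice_stride is assumed to be params[0] and subslice_stride
 * params[1] of the EU mask query.
 */
static int topo_eu_enabled(const uint8_t *eu_data,
			   uint32_t slice_stride, uint32_t subslice_stride,
			   unsigned int s, unsigned int ss, unsigned int eu)
{
	const uint8_t *p;

	assert(eu_data != NULL);
	p = eu_data + s * slice_stride + ss * subslice_stride;

	return (p[eu / 8] >> (eu % 8)) & 1;
}
```

Counting bits over the whole subslice mask buffer gives the subslice total
without assuming that every slice has the same number of subslices.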

As a bonus, we can draw representations of the GPU:

    https://imgur.com/a/vuqpa

v2: Simplify uapi and make it extensible (Lionel)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_debugfs.c      |  24 +++--
 drivers/gpu/drm/i915/i915_drv.c          |  65 +++++++++++-
 drivers/gpu/drm/i915/i915_drv.h          |  23 ++++-
 drivers/gpu/drm/i915/intel_device_info.c | 171 ++++++++++++++++++++++---------
 drivers/gpu/drm/i915/intel_lrc.c         |   2 +-
 drivers/gpu/drm/i915/intel_ringbuffer.h  |   2 +-
 include/uapi/drm/i915_drm.h              |  58 +++++++++++
 7 files changed, 283 insertions(+), 62 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 0897fd616a1f..8521fc012fa4 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -4441,7 +4441,7 @@ static void cherryview_sseu_device_status(struct drm_i915_private *dev_priv,
 			continue;
 
 		sseu->slice_mask = BIT(0);
-		sseu->subslice_mask |= BIT(ss);
+		sseu->subslices_mask[0] |= BIT(ss);
 		eu_cnt = ((sig1[ss] & CHV_EU08_PG_ENABLE) ? 0 : 2) +
 			 ((sig1[ss] & CHV_EU19_PG_ENABLE) ? 0 : 2) +
 			 ((sig1[ss] & CHV_EU210_PG_ENABLE) ? 0 : 2) +
@@ -4488,7 +4488,7 @@ static void gen10_sseu_device_status(struct drm_i915_private *dev_priv,
 			continue;
 
 		sseu->slice_mask |= BIT(s);
-		sseu->subslice_mask = info->sseu.subslice_mask;
+		sseu->subslices_mask[s] = info->sseu.subslices_mask[s];
 
 		for (ss = 0; ss < ss_max; ss++) {
 			unsigned int eu_cnt;
@@ -4543,8 +4543,8 @@ static void gen9_sseu_device_status(struct drm_i915_private *dev_priv,
 		sseu->slice_mask |= BIT(s);
 
 		if (IS_GEN9_BC(dev_priv))
-			sseu->subslice_mask =
-				INTEL_INFO(dev_priv)->sseu.subslice_mask;
+			sseu->subslices_mask[s] =
+				INTEL_INFO(dev_priv)->sseu.subslices_mask[s];
 
 		for (ss = 0; ss < ss_max; ss++) {
 			unsigned int eu_cnt;
@@ -4554,7 +4554,7 @@ static void gen9_sseu_device_status(struct drm_i915_private *dev_priv,
 					/* skip disabled subslice */
 					continue;
 
-				sseu->subslice_mask |= BIT(ss);
+				sseu->subslices_mask[s] |= BIT(ss);
 			}
 
 			eu_cnt = 2 * hweight32(eu_reg[2*s + ss/2] &
@@ -4576,9 +4576,12 @@ static void broadwell_sseu_device_status(struct drm_i915_private *dev_priv,
 	sseu->slice_mask = slice_info & GEN8_LSLICESTAT_MASK;
 
 	if (sseu->slice_mask) {
-		sseu->subslice_mask = INTEL_INFO(dev_priv)->sseu.subslice_mask;
 		sseu->eu_per_subslice =
 				INTEL_INFO(dev_priv)->sseu.eu_per_subslice;
+		for (s = 0; s < fls(sseu->slice_mask); s++) {
+			sseu->subslices_mask[s] =
+				INTEL_INFO(dev_priv)->sseu.subslices_mask[s];
+		}
 		sseu->eu_total = sseu->eu_per_subslice *
 				 sseu_subslice_total(sseu);
 
@@ -4597,6 +4600,7 @@ static void i915_print_sseu_info(struct seq_file *m, bool is_available_info,
 {
 	struct drm_i915_private *dev_priv = node_to_i915(m->private);
 	const char *type = is_available_info ? "Available" : "Enabled";
+	int s;
 
 	seq_printf(m, "  %s Slice Mask: %04x\n", type,
 		   sseu->slice_mask);
@@ -4604,10 +4608,10 @@ static void i915_print_sseu_info(struct seq_file *m, bool is_available_info,
 		   hweight8(sseu->slice_mask));
 	seq_printf(m, "  %s Subslice Total: %u\n", type,
 		   sseu_subslice_total(sseu));
-	seq_printf(m, "  %s Subslice Mask: %04x\n", type,
-		   sseu->subslice_mask);
-	seq_printf(m, "  %s Subslice Per Slice: %u\n", type,
-		   hweight8(sseu->subslice_mask));
+	for (s = 0; s < fls(sseu->slice_mask); s++) {
+		seq_printf(m, "  %s Slice%i Subslice Mask: %04x\n", type,
+			   s, sseu->subslices_mask[s]);
+	}
 	seq_printf(m, "  %s EU Total: %u\n", type,
 		   sseu->eu_total);
 	seq_printf(m, "  %s EU Per Subslice: %u\n", type,
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index b8e9aca46692..f9ee1851cca6 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -272,6 +272,63 @@ static void intel_detect_pch(struct drm_i915_private *dev_priv)
 	pci_dev_put(pch);
 }
 
+static int i915_getparam_topology(struct drm_i915_private *dev_priv,
+				  drm_i915_topology_t __user *user_topology)
+{
+	const struct sseu_dev_info *sseu = &INTEL_INFO(dev_priv)->sseu;
+	drm_i915_topology_t req_topology, topology;
+	const u8 *data = NULL;
+	int ret;
+
+	/* Not supported on gen < 8. */
+	if (sseu->max_slices == 0)
+		return -ENODEV;
+
+	ret = copy_from_user(&req_topology, user_topology, sizeof(req_topology));
+	if (ret)
+		return ret;
+
+	topology = req_topology;
+	switch (topology.query) {
+	case I915_PARAM_TOPOLOGY_QUERY_SLICE_MASK:
+		topology.params[0] = sseu->max_slices;
+		topology.data_size = sizeof(sseu->slice_mask);
+		data = &sseu->slice_mask;
+		break;
+
+	case I915_PARAM_TOPOLOGY_QUERY_SUBSLICE_MASK:
+		topology.params[0] = ALIGN(sseu->max_subslices, 8) / 8;
+		topology.data_size = sseu->max_slices * topology.params[0];
+		data = sseu->subslices_mask;
+		break;
+
+	case I915_PARAM_TOPOLOGY_QUERY_EU_MASK:
+		topology.params[1] = (ALIGN(sseu->max_eus_per_subslice, 8) / 8);
+		topology.params[0] = sseu->max_subslices * topology.params[1];
+		topology.data_size = sseu->max_slices * topology.params[0];
+		data = sseu->eu_mask;
+		break;
+
+	default:
+		return -EINVAL;
+	}
+
+	if (req_topology.data_size != 0 &&
+	    req_topology.data_size != topology.data_size)
+		return -EINVAL;
+
+	if (req_topology.data_size == topology.data_size) {
+		ret = copy_to_user(user_topology->data, data,
+				   topology.data_size);
+		if (ret)
+			return ret;
+	}
+
+	return copy_to_user(user_topology, &topology, sizeof(topology));
+}
+
+
+
 static int i915_getparam(struct drm_device *dev, void *data,
 			 struct drm_file *file_priv)
 {
@@ -412,13 +469,19 @@ static int i915_getparam(struct drm_device *dev, void *data,
 			return -ENODEV;
 		break;
 	case I915_PARAM_SUBSLICE_MASK:
-		value = INTEL_INFO(dev_priv)->sseu.subslice_mask;
+		value = INTEL_INFO(dev_priv)->sseu.subslices_mask[0];
 		if (!value)
 			return -ENODEV;
 		break;
 	case I915_PARAM_CS_TIMESTAMP_FREQUENCY:
 		value = INTEL_INFO(dev_priv)->cs_timestamp_frequency;
 		break;
+	case I915_PARAM_TOPOLOGY: {
+		drm_i915_topology_t __user *user_topology =
+			(drm_i915_topology_t __user *) param->value;
+		/* Purposefully return without replacing the value pointer. */
+		return i915_getparam_topology(dev_priv, user_topology);
+	}
 	default:
 		DRM_DEBUG("Unknown parameter %d\n", param->param);
 		return -EINVAL;
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 5c4d09a98e88..86194c146fdc 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -801,9 +801,12 @@ struct intel_csr {
 	func(supports_tv); \
 	func(has_ipc);
 
+#define GEN_MAX_SLICES		(6) /* CNL upper bound */
+#define GEN_MAX_SUBSLICES	(7)
+
 struct sseu_dev_info {
 	u8 slice_mask;
-	u8 subslice_mask;
+	u8 subslices_mask[GEN_MAX_SLICES];
 	u8 eu_total;
 	u8 eu_per_subslice;
 	u8 min_eu_in_pool;
@@ -812,11 +815,27 @@ struct sseu_dev_info {
 	u8 has_slice_pg:1;
 	u8 has_subslice_pg:1;
 	u8 has_eu_pg:1;
+
+	/* Topology fields */
+	u8 max_slices;
+	u8 max_subslices;
+	u8 max_eus_per_subslice;
+
+	/* We don't have more than 8 eus per subslice at the moment and as we
+	 * store eus enabled using bits, no need to multiply by eus per
+	 * subslice.
+	 */
+	u8 eu_mask[GEN_MAX_SLICES * GEN_MAX_SUBSLICES];
 };
 
 static inline unsigned int sseu_subslice_total(const struct sseu_dev_info *sseu)
 {
-	return hweight8(sseu->slice_mask) * hweight8(sseu->subslice_mask);
+	unsigned s, total = 0;
+
+	for (s = 0; s < ARRAY_SIZE(sseu->subslices_mask); s++)
+		total += hweight8(sseu->subslices_mask[s]);
+
+	return total;
 }
 
 /* Keep in gen based order, and chronological order within a gen */
diff --git a/drivers/gpu/drm/i915/intel_device_info.c b/drivers/gpu/drm/i915/intel_device_info.c
index 9b71a9b6d80e..b5a7f3eb9537 100644
--- a/drivers/gpu/drm/i915/intel_device_info.c
+++ b/drivers/gpu/drm/i915/intel_device_info.c
@@ -82,22 +82,74 @@ void intel_device_info_dump(struct drm_i915_private *dev_priv)
 #undef PRINT_FLAG
 }
 
+static u8 compute_eu_total(const struct sseu_dev_info *sseu)
+{
+	u8 i, total = 0;
+
+	for (i = 0; i < ARRAY_SIZE(sseu->eu_mask); i++)
+		total += hweight8(sseu->eu_mask[i]);
+
+	return total;
+}
+
 static void gen10_sseu_info_init(struct drm_i915_private *dev_priv)
 {
 	struct sseu_dev_info *sseu = &mkwrite_device_info(dev_priv)->sseu;
 	const u32 fuse2 = I915_READ(GEN8_FUSE2);
+	int s, ss, eu_mask = 0xff;
+	u32 subslice_mask, eu_en;
 
 	sseu->slice_mask = (fuse2 & GEN10_F2_S_ENA_MASK) >>
 			    GEN10_F2_S_ENA_SHIFT;
-	sseu->subslice_mask = (1 << 4) - 1;
-	sseu->subslice_mask &= ~((fuse2 & GEN10_F2_SS_DIS_MASK) >>
-				 GEN10_F2_SS_DIS_SHIFT);
+	sseu->max_slices = 6;
+	sseu->max_subslices = 4;
+	sseu->max_eus_per_subslice = 8;
+
+	subslice_mask = (1 << 4) - 1;
+	subslice_mask &= ~((fuse2 & GEN10_F2_SS_DIS_MASK) >>
+			   GEN10_F2_SS_DIS_SHIFT);
+
+	/* Slice0 can have up to 3 subslices, but there are only 2 in
+	 * slice1/2.
+	 */
+	sseu->subslices_mask[0] = subslice_mask;
+	for (s = 1; s < sseu->max_slices; s++)
+		sseu->subslices_mask[s] = subslice_mask & 0x3;
+
+	/* Slice0 */
+	eu_en = ~I915_READ(GEN8_EU_DISABLE0);
+	for (ss = 0; ss < sseu->max_subslices; ss++)
+		sseu->eu_mask[ss]     = (eu_en >> (8 * ss)) & eu_mask;
+	/* Slice1 */
+	sseu->eu_mask[sseu->max_subslices]         = (eu_en >> 24) & eu_mask;
+	eu_en = ~I915_READ(GEN8_EU_DISABLE1);
+	sseu->eu_mask[sseu->max_subslices + 1]     = eu_en & eu_mask;
+	/* Slice2 */
+	sseu->eu_mask[2 * sseu->max_subslices]     = (eu_en >> 8) & eu_mask;
+	sseu->eu_mask[2 * sseu->max_subslices + 1] = (eu_en >> 16) & eu_mask;
+	/* Slice3 */
+	sseu->eu_mask[3 * sseu->max_subslices]     = (eu_en >> 24) & eu_mask;
+	eu_en = ~I915_READ(GEN8_EU_DISABLE2);
+	sseu->eu_mask[3 * sseu->max_subslices + 1] = eu_en & eu_mask;
+	/* Slice4 */
+	sseu->eu_mask[4 * sseu->max_subslices]     = (eu_en >> 8) & eu_mask;
+	sseu->eu_mask[4 * sseu->max_subslices + 1] = (eu_en >> 16) & eu_mask;
+	/* Slice5 */
+	sseu->eu_mask[5 * sseu->max_subslices]     = (eu_en >> 24) & eu_mask;
+	eu_en = ~I915_READ(GEN10_EU_DISABLE3);
+	sseu->eu_mask[5 * sseu->max_subslices + 1] = eu_en & eu_mask;
+
+	/* Do a second pass where we mark the subslices as disabled if all
+	 * their EUs are off.
+	 */
+	for (s = 0; s < sseu->max_slices; s++) {
+		for (ss = 0; ss < sseu->max_subslices; ss++) {
+			if (sseu->eu_mask[s * sseu->max_subslices + ss] == 0)
+				sseu->subslices_mask[s] &= ~BIT(ss);
+		}
+	}
 
-	sseu->eu_total = hweight32(~I915_READ(GEN8_EU_DISABLE0));
-	sseu->eu_total += hweight32(~I915_READ(GEN8_EU_DISABLE1));
-	sseu->eu_total += hweight32(~I915_READ(GEN8_EU_DISABLE2));
-	sseu->eu_total += hweight8(~(I915_READ(GEN10_EU_DISABLE3) &
-				     GEN10_EU_DIS_SS_MASK));
+	sseu->eu_total = compute_eu_total(sseu);
 
 	/*
 	 * CNL is expected to always have a uniform distribution
@@ -118,26 +170,30 @@ static void gen10_sseu_info_init(struct drm_i915_private *dev_priv)
 static void cherryview_sseu_info_init(struct drm_i915_private *dev_priv)
 {
 	struct sseu_dev_info *sseu = &mkwrite_device_info(dev_priv)->sseu;
-	u32 fuse, eu_dis;
+	u32 fuse;
 
 	fuse = I915_READ(CHV_FUSE_GT);
 
 	sseu->slice_mask = BIT(0);
+	sseu->max_slices = 1;
+	sseu->max_subslices = 2;
+	sseu->max_eus_per_subslice = 8;
 
 	if (!(fuse & CHV_FGT_DISABLE_SS0)) {
-		sseu->subslice_mask |= BIT(0);
-		eu_dis = fuse & (CHV_FGT_EU_DIS_SS0_R0_MASK |
-				 CHV_FGT_EU_DIS_SS0_R1_MASK);
-		sseu->eu_total += 8 - hweight32(eu_dis);
+		sseu->subslices_mask[0] |= BIT(0);
+		sseu->eu_mask[0] = (fuse & CHV_FGT_EU_DIS_SS0_R0_MASK) >> CHV_FGT_EU_DIS_SS0_R0_SHIFT;
+		sseu->eu_mask[0] |= ((fuse & CHV_FGT_EU_DIS_SS0_R1_MASK) >> CHV_FGT_EU_DIS_SS0_R1_SHIFT) << 4;
+		sseu->subslices_mask[0] = 1;
 	}
 
 	if (!(fuse & CHV_FGT_DISABLE_SS1)) {
-		sseu->subslice_mask |= BIT(1);
-		eu_dis = fuse & (CHV_FGT_EU_DIS_SS1_R0_MASK |
-				 CHV_FGT_EU_DIS_SS1_R1_MASK);
-		sseu->eu_total += 8 - hweight32(eu_dis);
+		sseu->subslices_mask[0] |= BIT(1);
+		sseu->eu_mask[1] = (fuse & CHV_FGT_EU_DIS_SS1_R0_MASK) >> CHV_FGT_EU_DIS_SS0_R0_SHIFT;
+		sseu->eu_mask[2] |= ((fuse & CHV_FGT_EU_DIS_SS1_R1_MASK) >> CHV_FGT_EU_DIS_SS0_R1_SHIFT) << 4;
 	}
 
+	sseu->eu_total = compute_eu_total(sseu);
+
 	/*
 	 * CHV expected to always have a uniform distribution of EU
 	 * across subslices.
@@ -159,41 +215,50 @@ static void gen9_sseu_info_init(struct drm_i915_private *dev_priv)
 {
 	struct intel_device_info *info = mkwrite_device_info(dev_priv);
 	struct sseu_dev_info *sseu = &info->sseu;
-	int s_max = 3, ss_max = 4, eu_max = 8;
 	int s, ss;
-	u32 fuse2, eu_disable;
+	u32 fuse2, eu_disable, subslice_mask;
 	u8 eu_mask = 0xff;
 
 	fuse2 = I915_READ(GEN8_FUSE2);
 	sseu->slice_mask = (fuse2 & GEN8_F2_S_ENA_MASK) >> GEN8_F2_S_ENA_SHIFT;
 
+	/* BXT has a single slice and at most 3 subslices. */
+	sseu->max_slices = IS_GEN9_LP(dev_priv) ? 1 : 3;
+	sseu->max_subslices = IS_GEN9_LP(dev_priv) ? 3 : 4;
+	sseu->max_eus_per_subslice = 8;
+
 	/*
 	 * The subslice disable field is global, i.e. it applies
 	 * to each of the enabled slices.
 	*/
-	sseu->subslice_mask = (1 << ss_max) - 1;
-	sseu->subslice_mask &= ~((fuse2 & GEN9_F2_SS_DIS_MASK) >>
-				 GEN9_F2_SS_DIS_SHIFT);
+	subslice_mask = (1 << sseu->max_subslices) - 1;
+	subslice_mask &= ~((fuse2 & GEN9_F2_SS_DIS_MASK) >>
+			   GEN9_F2_SS_DIS_SHIFT);
 
 	/*
 	 * Iterate through enabled slices and subslices to
 	 * count the total enabled EU.
 	*/
-	for (s = 0; s < s_max; s++) {
+	for (s = 0; s < sseu->max_slices; s++) {
 		if (!(sseu->slice_mask & BIT(s)))
 			/* skip disabled slice */
 			continue;
 
+		sseu->subslices_mask[s] = subslice_mask;
+
 		eu_disable = I915_READ(GEN9_EU_DISABLE(s));
-		for (ss = 0; ss < ss_max; ss++) {
+		for (ss = 0; ss < sseu->max_subslices; ss++) {
 			int eu_per_ss;
 
-			if (!(sseu->subslice_mask & BIT(ss)))
+			if (!(sseu->subslices_mask[s] & BIT(ss)))
 				/* skip disabled subslice */
 				continue;
 
-			eu_per_ss = eu_max - hweight8((eu_disable >> (ss*8)) &
-						      eu_mask);
+			sseu->eu_mask[ss + s * sseu->max_subslices] =
+				~((eu_disable >> (ss*8)) & eu_mask);
+
+			eu_per_ss = sseu->max_eus_per_subslice -
+				hweight8((eu_disable >> (ss*8)) & eu_mask);
 
 			/*
 			 * Record which subslice(s) has(have) 7 EUs. we
@@ -202,11 +267,11 @@ static void gen9_sseu_info_init(struct drm_i915_private *dev_priv)
 			 */
 			if (eu_per_ss == 7)
 				sseu->subslice_7eu[s] |= BIT(ss);
-
-			sseu->eu_total += eu_per_ss;
 		}
 	}
 
+	sseu->eu_total = compute_eu_total(sseu);
+
 	/*
 	 * SKL is expected to always have a uniform distribution
 	 * of EU across subslices with the exception that any one
@@ -232,8 +297,8 @@ static void gen9_sseu_info_init(struct drm_i915_private *dev_priv)
 	sseu->has_eu_pg = sseu->eu_per_subslice > 2;
 
 	if (IS_GEN9_LP(dev_priv)) {
-#define IS_SS_DISABLED(ss)	(!(sseu->subslice_mask & BIT(ss)))
-		info->has_pooled_eu = hweight8(sseu->subslice_mask) == 3;
+#define IS_SS_DISABLED(ss)	(!(sseu->subslices_mask[0] & BIT(ss)))
+		info->has_pooled_eu = hweight8(sseu->subslices_mask[0]) == 3;
 
 		/*
 		 * There is a HW issue in 2x6 fused down parts that requires
@@ -242,7 +307,7 @@ static void gen9_sseu_info_init(struct drm_i915_private *dev_priv)
 		 * doesn't affect if the device has all 3 subslices enabled.
 		 */
 		/* WaEnablePooledEuFor2x6:bxt */
-		info->has_pooled_eu |= (hweight8(sseu->subslice_mask) == 2 &&
+		info->has_pooled_eu |= (hweight8(sseu->subslices_mask[0]) == 2 &&
 					IS_BXT_REVID(dev_priv, 0, BXT_REVID_B_LAST));
 
 		sseu->min_eu_in_pool = 0;
@@ -261,19 +326,22 @@ static void gen9_sseu_info_init(struct drm_i915_private *dev_priv)
 static void broadwell_sseu_info_init(struct drm_i915_private *dev_priv)
 {
 	struct sseu_dev_info *sseu = &mkwrite_device_info(dev_priv)->sseu;
-	const int s_max = 3, ss_max = 3, eu_max = 8;
 	int s, ss;
-	u32 fuse2, eu_disable[3]; /* s_max */
+	u32 fuse2, subslice_mask, eu_disable[3]; /* s_max */
 
 	fuse2 = I915_READ(GEN8_FUSE2);
 	sseu->slice_mask = (fuse2 & GEN8_F2_S_ENA_MASK) >> GEN8_F2_S_ENA_SHIFT;
+	sseu->max_slices = 3;
+	sseu->max_subslices = 3;
+	sseu->max_eus_per_subslice = 8;
+
 	/*
 	 * The subslice disable field is global, i.e. it applies
 	 * to each of the enabled slices.
 	 */
-	sseu->subslice_mask = GENMASK(ss_max - 1, 0);
-	sseu->subslice_mask &= ~((fuse2 & GEN8_F2_SS_DIS_MASK) >>
-				 GEN8_F2_SS_DIS_SHIFT);
+	subslice_mask = GENMASK(sseu->max_subslices - 1, 0);
+	subslice_mask &= ~((fuse2 & GEN8_F2_SS_DIS_MASK) >>
+			   GEN8_F2_SS_DIS_SHIFT);
 
 	eu_disable[0] = I915_READ(GEN8_EU_DISABLE0) & GEN8_EU_DIS0_S0_MASK;
 	eu_disable[1] = (I915_READ(GEN8_EU_DISABLE0) >> GEN8_EU_DIS0_S1_SHIFT) |
@@ -287,30 +355,36 @@ static void broadwell_sseu_info_init(struct drm_i915_private *dev_priv)
 	 * Iterate through enabled slices and subslices to
 	 * count the total enabled EU.
 	 */
-	for (s = 0; s < s_max; s++) {
+	for (s = 0; s < sseu->max_slices; s++) {
 		if (!(sseu->slice_mask & BIT(s)))
 			/* skip disabled slice */
 			continue;
 
-		for (ss = 0; ss < ss_max; ss++) {
+		sseu->subslices_mask[s] = subslice_mask;
+
+		for (ss = 0; ss < sseu->max_subslices; ss++) {
 			u32 n_disabled;
 
-			if (!(sseu->subslice_mask & BIT(ss)))
+			if (!(sseu->subslices_mask[s] & BIT(ss)))
 				/* skip disabled subslice */
 				continue;
 
-			n_disabled = hweight8(eu_disable[s] >> (ss * eu_max));
+			sseu->eu_mask[ss + s * sseu->max_subslices] =
+				~(eu_disable[s] >>
+				  (ss * sseu->max_eus_per_subslice));
+			n_disabled = hweight8(eu_disable[s] >>
+					      (ss * sseu->max_eus_per_subslice));
 
 			/*
 			 * Record which subslices have 7 EUs.
 			 */
-			if (eu_max - n_disabled == 7)
+			if (sseu->max_eus_per_subslice - n_disabled == 7)
 				sseu->subslice_7eu[s] |= 1 << ss;
-
-			sseu->eu_total += eu_max - n_disabled;
 		}
 	}
 
+	sseu->eu_total = compute_eu_total(sseu);
+
 	/*
 	 * BDW is expected to always have a uniform distribution of EU across
 	 * subslices with the exception that any one EU in any one subslice may
@@ -440,6 +514,7 @@ void intel_device_info_runtime_init(struct drm_i915_private *dev_priv)
 {
 	struct intel_device_info *info = mkwrite_device_info(dev_priv);
 	enum pipe pipe;
+	int s;
 
 	if (INTEL_GEN(dev_priv) >= 10) {
 		for_each_pipe(dev_priv, pipe)
@@ -551,9 +626,11 @@ void intel_device_info_runtime_init(struct drm_i915_private *dev_priv)
 	DRM_DEBUG_DRIVER("slice total: %u\n", hweight8(info->sseu.slice_mask));
 	DRM_DEBUG_DRIVER("subslice total: %u\n",
 			 sseu_subslice_total(&info->sseu));
-	DRM_DEBUG_DRIVER("subslice mask %04x\n", info->sseu.subslice_mask);
-	DRM_DEBUG_DRIVER("subslice per slice: %u\n",
-			 hweight8(info->sseu.subslice_mask));
+	for (s = 0; s < ARRAY_SIZE(info->sseu.subslices_mask); s++) {
+		DRM_DEBUG_DRIVER("subslice mask %04x\n", info->sseu.subslices_mask[s]);
+		DRM_DEBUG_DRIVER("subslice per slice: %u\n",
+				 hweight8(info->sseu.subslices_mask[s]));
+	}
 	DRM_DEBUG_DRIVER("EU total: %u\n", info->sseu.eu_total);
 	DRM_DEBUG_DRIVER("EU per subslice: %u\n", info->sseu.eu_per_subslice);
 	DRM_DEBUG_DRIVER("has slice power gating: %s\n",
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 6840ec8db037..98b65cb6199a 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -2037,7 +2037,7 @@ make_rpcs(struct drm_i915_private *dev_priv)
 
 	if (INTEL_INFO(dev_priv)->sseu.has_subslice_pg) {
 		rpcs |= GEN8_RPCS_SS_CNT_ENABLE;
-		rpcs |= hweight8(INTEL_INFO(dev_priv)->sseu.subslice_mask) <<
+		rpcs |= hweight8(INTEL_INFO(dev_priv)->sseu.subslices_mask[0]) <<
 			GEN8_RPCS_SS_CNT_SHIFT;
 		rpcs |= GEN8_RPCS_ENABLE;
 	}
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 69ad875fd011..f9f73dfc8eae 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -98,7 +98,7 @@ hangcheck_action_to_str(const enum intel_engine_hangcheck_action a)
 
 #define instdone_subslice_mask(dev_priv__) \
 	(INTEL_GEN(dev_priv__) == 7 ? \
-	 1 : INTEL_INFO(dev_priv__)->sseu.subslice_mask)
+	 1 : INTEL_INFO(dev_priv__)->sseu.subslices_mask[0])
 
 #define for_each_instdone_slice_subslice(dev_priv__, slice__, subslice__) \
 	for ((slice__) = 0, (subslice__) = 0; \
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index c3ff0d4947af..c57390e06d51 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -456,6 +456,64 @@ typedef struct drm_i915_irq_wait {
  */
 #define I915_PARAM_CS_TIMESTAMP_FREQUENCY   50
 
+/* Query various aspects of the topology of the GPU. Below is a list of the
+ * currently supported queries. The motivation of this more detailed query
+ * mechanism is to expose asymmetric properties of the GPU. Starting with CNL,
+ * slices might have different sizes (for example 3 subslices in slice0 and 2
+ * subslices in slice1+). This means we cannot rely on PARAM_SUBSLICE_MASK
+ * anymore.
+ *
+ * When using this parameter, getparam value should point to a structure of
+ * type drm_i915_topology_t. Call this once with query set to the relevant
+ * information to be queried and data_size set to 0. The kernel will then fill
+ * in params[] and set data_size to the expected length of data[]. Userspace
+ * should allocate that much memory before making a second getparam call with
+ * data_size already set. The kernel will then populate data[]. The
+ * meaning of params[] elements is described for each query below.
+ */
+#define I915_PARAM_TOPOLOGY                 51
+typedef struct drm_i915_topology {
+	__u32 query;
+	__u32 data_size;
+
+	/* Query the availability of slices :
+	 *
+	 * params[0] : the maximum number of slices
+	 *
+	 * Each bit in data indicates whether a slice is available (1) or
+	 * fused off (0). Formula to tell if slice X is available :
+	 *
+	 *         (data[X / 8] >> (X % 8)) & 1
+	 */
+#define   I915_PARAM_TOPOLOGY_QUERY_SLICE_MASK     0
+	/* Query the availability of subslices :
+	 *
+	 * params[0] : slice stride
+	 *
+	 * Each bit in data indicates whether a subslice is available (1) or
+	 * fused off (0). Formula to tell if slice X subslice Y is available :
+	 *
+	 *         (data[(X * params[0]) + Y / 8] >> (Y % 8)) & 1
+	 */
+#define   I915_PARAM_TOPOLOGY_QUERY_SUBSLICE_MASK  1
+	/* Query the availability of EUs :
+	 *
+	 * params[0] : slice stride
+	 * params[1] : subslice stride
+	 *
+	 * Each bit in data indicates whether an EU is available (1) or
+	 * fused off (0). Formula to tell if slice X subslice Y eu Z is
+	 * available :
+	 *
+	 *         (data[X * params[0] + Y * params[1] + Z / 8] >> (Z % 8)) & 1
+	 */
+#define   I915_PARAM_TOPOLOGY_QUERY_EU_MASK        2
+
+	__u32 params[10];
+
+	__u8 data[];
+} drm_i915_topology_t;
+
 typedef struct drm_i915_getparam {
 	__s32 param;
 	/*
-- 
2.15.0
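
The bit-addressing formulas documented in the i915_drm.h hunk above can be
sketched from the userspace side as follows. This is only an illustration
under stated assumptions: the helper names are hypothetical and not part of
the proposed uapi, data[] is assumed to have already been filled in by the
second getparam call, and count_available() additionally assumes that padding
bits in data[] are zero, which the patch does not state.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers; none of these names exist in the proposed uapi. */

/* Slice X, per the SLICE_MASK query: (data[X / 8] >> (X % 8)) & 1 */
static inline int slice_available(const uint8_t *data, unsigned int x)
{
	return (data[x / 8] >> (x % 8)) & 1;
}

/* Slice X, subslice Y; slice_stride is params[0] of the SUBSLICE_MASK query. */
static inline int subslice_available(const uint8_t *data, uint32_t slice_stride,
				     unsigned int x, unsigned int y)
{
	return (data[x * slice_stride + y / 8] >> (y % 8)) & 1;
}

/*
 * Slice X, subslice Y, EU Z; slice_stride is params[0] and subslice_stride
 * is params[1] of the EU_MASK query.
 */
static inline int eu_available(const uint8_t *data, uint32_t slice_stride,
			       uint32_t subslice_stride, unsigned int x,
			       unsigned int y, unsigned int z)
{
	return (data[x * slice_stride + y * subslice_stride + z / 8] >>
		(z % 8)) & 1;
}

/*
 * Count every set bit in data[]. Assumption: unused padding bits are zero,
 * so the result is the number of available units.
 */
static inline unsigned int count_available(const uint8_t *data,
					   size_t data_size)
{
	unsigned int total = 0;
	size_t i;

	assert(data != NULL || data_size == 0);

	for (i = 0; i < data_size; i++) {
		uint8_t byte = data[i];

		while (byte) {
			total += byte & 1;
			byte >>= 1;
		}
	}

	return total;
}
```

With a slice stride of one byte and data[] = { 0x07, 0x03 }, for example,
subslice_available(data, 1, 1, 2) reports that subslice 2 of slice 1 is fused
off while slice 0 has all three subslices, which is the asymmetric case the
comment describes.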

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v2 9/9] drm/i915/debugfs: reuse max slice/subslices already stored in sseu
  2017-11-02 16:29 [PATCH v2 0/9] i915: Cannonlake perf support Lionel Landwerlin
                   ` (7 preceding siblings ...)
  2017-11-02 16:29 ` [PATCH v2 8/9] drm/i915: expose eu topology to userspace Lionel Landwerlin
@ 2017-11-02 16:29 ` Lionel Landwerlin
  2017-11-02 16:49 ` ✓ Fi.CI.BAT: success for i915: Cannonlake perf support (rev2) Patchwork
  2017-11-02 17:38 ` ✗ Fi.CI.IGT: failure " Patchwork
  10 siblings, 0 replies; 31+ messages in thread
From: Lionel Landwerlin @ 2017-11-02 16:29 UTC (permalink / raw)
  To: intel-gfx

Now that we have that information in topology fields, let's just reuse it.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
---
 drivers/gpu/drm/i915/i915_debugfs.c | 26 ++++++++++----------------
 1 file changed, 10 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 8521fc012fa4..90bbfbba317a 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -4456,11 +4456,11 @@ static void gen10_sseu_device_status(struct drm_i915_private *dev_priv,
 				     struct sseu_dev_info *sseu)
 {
 	const struct intel_device_info *info = INTEL_INFO(dev_priv);
-	int s_max = 6, ss_max = 4;
 	int s, ss;
-	u32 s_reg[s_max], eu_reg[2 * s_max], eu_mask[2];
+	u32 s_reg[info->sseu.max_slices],
+		eu_reg[2 * info->sseu.max_slices], eu_mask[2];
 
-	for (s = 0; s < s_max; s++) {
+	for (s = 0; s < info->sseu.max_slices; s++) {
 		/*
 		 * FIXME: Valid SS Mask respects the spec and read
 		 * only valid bits for those registers, excluding reserverd
@@ -4482,7 +4482,7 @@ static void gen10_sseu_device_status(struct drm_i915_private *dev_priv,
 		     GEN9_PGCTL_SSB_EU210_ACK |
 		     GEN9_PGCTL_SSB_EU311_ACK;
 
-	for (s = 0; s < s_max; s++) {
+	for (s = 0; s < info->sseu.max_slices; s++) {
 		if ((s_reg[s] & GEN9_PGCTL_SLICE_ACK) == 0)
 			/* skip disabled slice */
 			continue;
@@ -4490,7 +4490,7 @@ static void gen10_sseu_device_status(struct drm_i915_private *dev_priv,
 		sseu->slice_mask |= BIT(s);
 		sseu->subslices_mask[s] = info->sseu.subslices_mask[s];
 
-		for (ss = 0; ss < ss_max; ss++) {
+		for (ss = 0; ss < info->sseu.max_subslices; ss++) {
 			unsigned int eu_cnt;
 
 			if (!(s_reg[s] & (GEN9_PGCTL_SS_ACK(ss))))
@@ -4510,17 +4510,11 @@ static void gen10_sseu_device_status(struct drm_i915_private *dev_priv,
 static void gen9_sseu_device_status(struct drm_i915_private *dev_priv,
 				    struct sseu_dev_info *sseu)
 {
-	int s_max = 3, ss_max = 4;
+	const struct intel_device_info *info = INTEL_INFO(dev_priv);
 	int s, ss;
-	u32 s_reg[s_max], eu_reg[2*s_max], eu_mask[2];
-
-	/* BXT has a single slice and at most 3 subslices. */
-	if (IS_GEN9_LP(dev_priv)) {
-		s_max = 1;
-		ss_max = 3;
-	}
+	u32 s_reg[info->sseu.max_slices], eu_reg[2*info->sseu.max_slices], eu_mask[2];
 
-	for (s = 0; s < s_max; s++) {
+	for (s = 0; s < info->sseu.max_slices; s++) {
 		s_reg[s] = I915_READ(GEN9_SLICE_PGCTL_ACK(s));
 		eu_reg[2*s] = I915_READ(GEN9_SS01_EU_PGCTL_ACK(s));
 		eu_reg[2*s + 1] = I915_READ(GEN9_SS23_EU_PGCTL_ACK(s));
@@ -4535,7 +4529,7 @@ static void gen9_sseu_device_status(struct drm_i915_private *dev_priv,
 		     GEN9_PGCTL_SSB_EU210_ACK |
 		     GEN9_PGCTL_SSB_EU311_ACK;
 
-	for (s = 0; s < s_max; s++) {
+	for (s = 0; s < info->sseu.max_slices; s++) {
 		if ((s_reg[s] & GEN9_PGCTL_SLICE_ACK) == 0)
 			/* skip disabled slice */
 			continue;
@@ -4546,7 +4540,7 @@ static void gen9_sseu_device_status(struct drm_i915_private *dev_priv,
 			sseu->subslices_mask[s] =
 				INTEL_INFO(dev_priv)->sseu.subslices_mask[s];
 
-		for (ss = 0; ss < ss_max; ss++) {
+		for (ss = 0; ss < info->sseu.max_subslices; ss++) {
 			unsigned int eu_cnt;
 
 			if (IS_GEN9_LP(dev_priv)) {
-- 
2.15.0


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 8/9] drm/i915: expose eu topology to userspace
  2017-11-02 16:29 ` [PATCH v2 8/9] drm/i915: expose eu topology to userspace Lionel Landwerlin
@ 2017-11-02 16:35   ` Chris Wilson
  2017-11-02 16:37     ` Lionel Landwerlin
  2017-11-02 17:34     ` Lionel Landwerlin
  0 siblings, 2 replies; 31+ messages in thread
From: Chris Wilson @ 2017-11-02 16:35 UTC (permalink / raw)
  To: Lionel Landwerlin, intel-gfx

Quoting Lionel Landwerlin (2017-11-02 16:29:48)
> +/* Query various aspects of the topology of the GPU. Below is a list of the
> + * currently supported queries. The motivation of this more detailed query
> + * mechanism is to expose asymmetric properties of the GPU. Starting with CNL,
> + * slices might have different sizes (for example 3 subslices in slice0 and 2
> + * subslices in slice1+). This means we cannot rely on PARAM_SUBSLICE_MASK
> + * anymore.
> + *
> + * When using this parameter, getparam value should point to a structure of
> + * type drm_i915_topology_t. Call this once with query set to the relevant
> + * information to be queried and data_size set to 0. The kernel will then fill
> + * in params[] and set data_size to the expected length of data[]. Userspace
> + * should allocate that much memory before making a second getparam call with
> + * data_size already set. The kernel will then populate data[]. The
> + * meaning of params[] elements is described for each query below.
> + */
> +#define I915_PARAM_TOPOLOGY                 51
> +typedef struct drm_i915_topology {

Oh crumbs. Please join the intel_engine_info_ioctl discussion.

As it stands, let's not introduce multiplexing into an already multiplexed
getparam ioctl.
-Chris

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 8/9] drm/i915: expose eu topology to userspace
  2017-11-02 16:35   ` Chris Wilson
@ 2017-11-02 16:37     ` Lionel Landwerlin
  2017-11-02 17:34     ` Lionel Landwerlin
  1 sibling, 0 replies; 31+ messages in thread
From: Lionel Landwerlin @ 2017-11-02 16:37 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx

On 02/11/17 16:35, Chris Wilson wrote:
> Quoting Lionel Landwerlin (2017-11-02 16:29:48)
>> +/* Query various aspects of the topology of the GPU. Below is a list of the
>> + * currently supported queries. The motivation of this more detailed query
>> + * mechanism is to expose asymmetric properties of the GPU. Starting with CNL,
>> + * slices might have different sizes (for example 3 subslices in slice0 and 2
>> + * subslices in slice1+). This means we cannot rely on PARAM_SUBSLICE_MASK
>> + * anymore.
>> + *
>> + * When using this parameter, getparam value should point to a structure of
>> + * type drm_i915_topology_t. Call this once with query set to the relevant
>> + * information to be queried and data_size set to 0. The kernel will then fill
>> + * in params[] and set data_size to the expected length of data[]. Userspace
>> + * should allocate that much memory before making a second getparam call with
>> + * data_size already set. The kernel will then populate data[]. The
>> + * meaning of params[] elements is described for each query below.
>> + */
>> +#define I915_PARAM_TOPOLOGY                 51
>> +typedef struct drm_i915_topology {
> Oh crumbs. Please join the intel_engine_info_ioctl discussion.

Where is that?

>
> As it stands, let's not introduce multiplexing into an already multiplexed
> getparam ioctl.

But an int * is not enough!

> -Chris
>


^ permalink raw reply	[flat|nested] 31+ messages in thread

* ✓ Fi.CI.BAT: success for i915: Cannonlake perf support (rev2)
  2017-11-02 16:29 [PATCH v2 0/9] i915: Cannonlake perf support Lionel Landwerlin
                   ` (8 preceding siblings ...)
  2017-11-02 16:29 ` [PATCH v2 9/9] drm/i915/debugfs: reuse max slice/subslices already stored in sseu Lionel Landwerlin
@ 2017-11-02 16:49 ` Patchwork
  2017-11-02 17:38 ` ✗ Fi.CI.IGT: failure " Patchwork
  10 siblings, 0 replies; 31+ messages in thread
From: Patchwork @ 2017-11-02 16:49 UTC (permalink / raw)
  To: Lionel Landwerlin; +Cc: intel-gfx

== Series Details ==

Series: i915: Cannonlake perf support (rev2)
URL   : https://patchwork.freedesktop.org/series/32762/
State : success

== Summary ==

Series 32762v2 i915: Cannonlake perf support
https://patchwork.freedesktop.org/api/1.0/series/32762/revisions/2/mbox/

Test chamelium:
        Subgroup dp-crc-fast:
                fail       -> DMESG-FAIL (fi-kbl-7500u) fdo#102514
Test gem_ctx_switch:
        Subgroup basic-default-heavy:
                pass       -> INCOMPLETE (fi-glk-dsi) fdo#103359
Test kms_cursor_legacy:
        Subgroup basic-flip-before-cursor-varying-size:
                skip       -> PASS       (fi-hsw-4770r)
Test kms_pipe_crc_basic:
        Subgroup nonblocking-crc-pipe-a-frame-sequence:
                dmesg-warn -> PASS       (fi-skl-6700k) fdo#103546 +1

fdo#102514 https://bugs.freedesktop.org/show_bug.cgi?id=102514
fdo#103359 https://bugs.freedesktop.org/show_bug.cgi?id=103359
fdo#103546 https://bugs.freedesktop.org/show_bug.cgi?id=103546

fi-bdw-5557u     total:289  pass:268  dwarn:0   dfail:0   fail:0   skip:21  time:441s
fi-bdw-gvtdvm    total:289  pass:265  dwarn:0   dfail:0   fail:0   skip:24  time:455s
fi-blb-e6850     total:289  pass:223  dwarn:1   dfail:0   fail:0   skip:65  time:379s
fi-bsw-n3050     total:289  pass:243  dwarn:0   dfail:0   fail:0   skip:46  time:538s
fi-bwr-2160      total:289  pass:183  dwarn:0   dfail:0   fail:0   skip:106 time:276s
fi-bxt-dsi       total:289  pass:259  dwarn:0   dfail:0   fail:0   skip:30  time:504s
fi-bxt-j4205     total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:507s
fi-byt-j1900     total:289  pass:253  dwarn:1   dfail:0   fail:0   skip:35  time:504s
fi-byt-n2820     total:289  pass:249  dwarn:1   dfail:0   fail:0   skip:39  time:501s
fi-cnl-y         total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:609s
fi-elk-e7500     total:289  pass:229  dwarn:0   dfail:0   fail:0   skip:60  time:430s
fi-gdg-551       total:289  pass:178  dwarn:1   dfail:0   fail:1   skip:109 time:262s
fi-glk-1         total:289  pass:261  dwarn:0   dfail:0   fail:0   skip:28  time:582s
fi-glk-dsi       total:32   pass:22   dwarn:0   dfail:0   fail:0   skip:9  
fi-hsw-4770      total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:429s
fi-hsw-4770r     total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:429s
fi-ilk-650       total:289  pass:228  dwarn:0   dfail:0   fail:0   skip:61  time:432s
fi-ivb-3520m     total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:500s
fi-ivb-3770      total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:465s
fi-kbl-7500u     total:289  pass:263  dwarn:1   dfail:1   fail:0   skip:24  time:497s
fi-kbl-7560u     total:289  pass:270  dwarn:0   dfail:0   fail:0   skip:19  time:576s
fi-kbl-r         total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:579s
fi-pnv-d510      total:289  pass:222  dwarn:1   dfail:0   fail:0   skip:66  time:571s
fi-skl-6260u     total:289  pass:269  dwarn:0   dfail:0   fail:0   skip:20  time:452s
fi-skl-6600u     total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:593s
fi-skl-6700hq    total:289  pass:263  dwarn:0   dfail:0   fail:0   skip:26  time:651s
fi-skl-6700k     total:289  pass:265  dwarn:0   dfail:0   fail:0   skip:24  time:520s
fi-skl-6770hq    total:289  pass:269  dwarn:0   dfail:0   fail:0   skip:20  time:508s
fi-snb-2520m     total:289  pass:250  dwarn:0   dfail:0   fail:0   skip:39  time:641s
fi-cfl-s failed to connect after reboot
fi-kbl-7567u failed to connect after reboot

fca2506bc5d492609e3f1b6e59d667e376a1eb3f drm-tip: 2017y-11m-02d-13h-10m-58s UTC integration manifest
778e02fb0445 drm/i915/debugfs: reuse max slice/subslices already stored in sseu
19e052b34f25 drm/i915: expose eu topology to userspace
683edd45e040 drm/i915/perf: reuse timestamp frequency from device info
177dac23af0c drm/i915: expose command stream timestamp frequency to userspace
be5697a8b0e4 drm/i915/perf: enable perf support on CNL
2e226815b00f drm/i915: fix register naming
7a971330dc93 drm/i915/perf: refactor perf setup
89f3787ed275 drm/i915/perf: add support for Coffeelake GT3
b2d7402277b3 drm/i915/perf: complete whitelisting for OA programming on HSW

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_6931/

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 8/9] drm/i915: expose eu topology to userspace
  2017-11-02 16:35   ` Chris Wilson
  2017-11-02 16:37     ` Lionel Landwerlin
@ 2017-11-02 17:34     ` Lionel Landwerlin
  1 sibling, 0 replies; 31+ messages in thread
From: Lionel Landwerlin @ 2017-11-02 17:34 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx

On 02/11/17 16:35, Chris Wilson wrote:
> Quoting Lionel Landwerlin (2017-11-02 16:29:48)
>> +/* Query various aspects of the topology of the GPU. Below is a list of the
>> + * currently supported queries. The motivation of this more detailed query
>> + * mechanism is to expose asymmetric properties of the GPU. Starting with CNL,
>> + * slices might have different sizes (for example 3 subslices in slice0 and 2
>> + * subslices in slice1+). This means we cannot rely on PARAM_SUBSLICE_MASK
>> + * anymore.
>> + *
>> + * When using this parameter, getparam value should point to a structure of
>> + * type drm_i915_topology_t. Call this once with query set to the relevant
>> + * information to be queried and data_size set to 0. The kernel will then fill
>> + * in params[] and set data_size to the expected length of data[]. Userspace
>> + * should allocate that much memory before making a second getparam call with
>> + * data_size already set. The kernel will then populate data[]. The
>> + * meaning of params[] elements is described for each query below.
>> + */
>> +#define I915_PARAM_TOPOLOGY                 51
>> +typedef struct drm_i915_topology {
> Oh crumbs. Please join the intel_engine_info_ioctl discussion.
>
> As it stands, let's not introduce multiplexing into an already multiplexed
> getparam ioctl.
> -Chris
>
Hey Tvrtko,

Is your intel_engine_info ioctl implementation available anywhere?

Thanks,

-
Lionel

^ permalink raw reply	[flat|nested] 31+ messages in thread

* ✗ Fi.CI.IGT: failure for i915: Cannonlake perf support (rev2)
  2017-11-02 16:29 [PATCH v2 0/9] i915: Cannonlake perf support Lionel Landwerlin
                   ` (9 preceding siblings ...)
  2017-11-02 16:49 ` ✓ Fi.CI.BAT: success for i915: Cannonlake perf support (rev2) Patchwork
@ 2017-11-02 17:38 ` Patchwork
  10 siblings, 0 replies; 31+ messages in thread
From: Patchwork @ 2017-11-02 17:38 UTC (permalink / raw)
  To: Lionel Landwerlin; +Cc: intel-gfx

== Series Details ==

Series: i915: Cannonlake perf support (rev2)
URL   : https://patchwork.freedesktop.org/series/32762/
State : failure

== Summary ==

Test kms_flip:
        Subgroup blocking-wf_vblank:
                pass       -> FAIL       (shard-hsw)
Test kms_busy:
        Subgroup extended-modeset-hang-oldfb-with-reset-render-A:
                dmesg-warn -> PASS       (shard-hsw)
        Subgroup extended-modeset-hang-newfb-with-reset-render-B:
                pass       -> DMESG-WARN (shard-hsw)
Test kms_setmode:
        Subgroup basic:
                pass       -> FAIL       (shard-hsw) fdo#99912

fdo#99912 https://bugs.freedesktop.org/show_bug.cgi?id=99912

shard-hsw        total:2539 pass:1430 dwarn:2   dfail:0   fail:10  skip:1097 time:9297s

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_6931/shards.html

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 6/9] drm/i915: expose command stream timestamp frequency to userspace
  2017-11-02 16:29 ` [PATCH v2 6/9] drm/i915: expose command stream timestamp frequency to userspace Lionel Landwerlin
@ 2017-11-07  0:01   ` Rafael Antognolli
  2017-11-07 11:30   ` Ewelina Musial
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 31+ messages in thread
From: Rafael Antognolli @ 2017-11-07  0:01 UTC (permalink / raw)
  To: Lionel Landwerlin; +Cc: intel-gfx

This patch, along with the respective ones for Mesa, does fix the gl
timestamp query piglit failures on CNL. So it is

Tested-by: Rafael Antognolli <rafael.antognolli@intel.com>
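
For reference, the frequency arithmetic in the patch quoted below can be
worked through on plain integers. This is a minimal sketch, not the driver
code: the function names are hypothetical, the register values are passed in
as arguments instead of being read with I915_READ(), and the field masks are
copied from the quoted i915_reg.h hunk.

```c
#include <stdint.h>

/* Field layout as given in the quoted i915_reg.h hunk. */
#define US_COUNTER_SHIFT		0
#define US_COUNTER_MASK			0x3ffu
#define US_COUNTER_DENOMINATOR_SHIFT	12
#define US_COUNTER_DENOMINATOR_MASK	(0xfu << 12)
#define CTC_SHIFT_PARAMETER_SHIFT	1
#define CTC_SHIFT_PARAMETER_MASK	(0x3u << 1)

/*
 * Hypothetical stand-in for read_timestamp_frequency_from_divide():
 * (counter + 1) MHz, plus 1 MHz / (denominator + 1) when the denominator
 * field is non-zero.
 */
static inline uint64_t freq_from_divide(uint32_t ts_override)
{
	uint64_t base_freq, frac_freq;

	base_freq = ((ts_override & US_COUNTER_MASK) >> US_COUNTER_SHIFT) + 1;
	base_freq *= 1000000;

	frac_freq = (ts_override & US_COUNTER_DENOMINATOR_MASK) >>
		    US_COUNTER_DENOMINATOR_SHIFT;
	if (frac_freq != 0)
		frac_freq = 1000000 / (frac_freq + 1);

	return base_freq + frac_freq;
}

/*
 * Hypothetical helper mirroring the Gen8/9 branch: the reference frequency
 * is shifted right by 3 minus the CTC shift parameter.
 */
static inline uint64_t cs_timestamp_freq(uint64_t ref_freq, uint32_t ctc_reg)
{
	uint32_t shift = (ctc_reg & CTC_SHIFT_PARAMETER_MASK) >>
			 CTC_SHIFT_PARAMETER_SHIFT;

	return ref_freq >> (3 - shift);
}
```

For instance, a 24 MHz reference with a CTC shift parameter of 2 yields a
12 MHz command stream timestamp, and a 19.2 MHz reference with a shift
parameter of 0 yields 2.4 MHz.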

On Thu, Nov 02, 2017 at 04:29:46PM +0000, Lionel Landwerlin wrote:
> We used to have this fixed per generation, but starting with CNL userspace
> cannot tell just off the PCI ID. Let's make this information available. This
> is particularly useful for performance monitoring where much of the
> normalization work is done using those timestamps (this includes pipeline
> statistics in both GL & Vulkan as well as OA reports).
> 
> Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_debugfs.c      |  2 +
>  drivers/gpu/drm/i915/i915_drv.c          |  3 +
>  drivers/gpu/drm/i915/i915_drv.h          |  2 +
>  drivers/gpu/drm/i915/i915_reg.h          | 21 +++++++
>  drivers/gpu/drm/i915/intel_device_info.c | 99 ++++++++++++++++++++++++++++++++
>  include/uapi/drm/i915_drm.h              |  6 ++
>  6 files changed, 133 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> index 39883cd915db..0897fd616a1f 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -3246,6 +3246,8 @@ static int i915_engine_info(struct seq_file *m, void *unused)
>  		   yesno(dev_priv->gt.awake));
>  	seq_printf(m, "Global active requests: %d\n",
>  		   dev_priv->gt.active_requests);
> +	seq_printf(m, "CS timestamp frequency: %llu\n",
> +		   dev_priv->info.cs_timestamp_frequency);
>  
>  	p = drm_seq_file_printer(m);
>  	for_each_engine(engine, dev_priv, id)
> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> index e7e9e061073b..fdd23e79fb46 100644
> --- a/drivers/gpu/drm/i915/i915_drv.c
> +++ b/drivers/gpu/drm/i915/i915_drv.c
> @@ -416,6 +416,9 @@ static int i915_getparam(struct drm_device *dev, void *data,
>  		if (!value)
>  			return -ENODEV;
>  		break;
> +	case I915_PARAM_CS_TIMESTAMP_FREQUENCY:
> +		value = INTEL_INFO(dev_priv)->cs_timestamp_frequency;
> +		break;
>  	default:
>  		DRM_DEBUG("Unknown parameter %d\n", param->param);
>  		return -EINVAL;
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 6cb7cd7f9420..4e804aaeaae1 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -886,6 +886,8 @@ struct intel_device_info {
>  	/* Slice/subslice/EU info */
>  	struct sseu_dev_info sseu;
>  
> +	uint64_t cs_timestamp_frequency;
> +
>  	struct color_luts {
>  		u16 degamma_lut_size;
>  		u16 gamma_lut_size;
> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> index a2223f01ee2a..f392f28f2cfa 100644
> --- a/drivers/gpu/drm/i915/i915_reg.h
> +++ b/drivers/gpu/drm/i915/i915_reg.h
> @@ -1119,9 +1119,24 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg)
>  
>  /* RPM unit config (Gen8+) */
>  #define RPM_CONFIG0	    _MMIO(0x0D00)
> +#define  GEN9_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_SHIFT	3
> +#define  GEN9_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_MASK	(1 << GEN9_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_SHIFT)
> +#define  GEN9_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_19_2_MHZ	0
> +#define  GEN9_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_24_MHZ	1
> +#define  GEN10_RPM_CONFIG0_CTC_SHIFT_PARAMETER_SHIFT	1
> +#define  GEN10_RPM_CONFIG0_CTC_SHIFT_PARAMETER_MASK	(0x3 << GEN10_RPM_CONFIG0_CTC_SHIFT_PARAMETER_SHIFT)
> +
>  #define RPM_CONFIG1	    _MMIO(0x0D04)
>  #define  GEN10_GT_NOA_ENABLE  (1 << 9)
>  
> +/* GPM unit config (assuming Gen8+, documentation is fuzzy...) */
> +#define GEN8_CTC_MODE			_MMIO(0xA26C)
> +#define  GEN8_CTC_SOURCE_PARAMETER_MASK 1
> +#define  GEN8_CTC_SOURCE_CRYSTAL_CLOCK	0
> +#define  GEN8_CTC_SOURCE_DIVIDE_LOGIC	1
> +#define  GEN8_CTC_SHIFT_PARAMETER_SHIFT	1
> +#define  GEN8_CTC_SHIFT_PARAMETER_MASK	(0x3 << GEN8_CTC_SHIFT_PARAMETER_SHIFT)
> +
>  /* RPC unit config (Gen8+) */
>  #define RPC_CONFIG	    _MMIO(0x0D08)
>  
> @@ -8865,6 +8880,12 @@ enum skl_power_gate {
>  #define ILK_TIMESTAMP_HI	_MMIO(0x70070)
>  #define IVB_TIMESTAMP_CTR	_MMIO(0x44070)
>  
> +#define GEN8_TIMESTAMP_OVERRIDE				_MMIO(0x44074)
> +#define  GEN8_TIMESTAMP_OVERRIDE_US_COUNTER_SHIFT		0
> +#define  GEN8_TIMESTAMP_OVERRIDE_US_COUNTER_MASK		0x3ff
> +#define  GEN8_TIMESTAMP_OVERRIDE_US_COUNTER_DENOMINATOR_SHIFT	12
> +#define  GEN8_TIMESTAMP_OVERRIDE_US_COUNTER_DENOMINATOR_MASK	(0xf << 12)
> +
>  #define _PIPE_FRMTMSTMP_A		0x70048
>  #define PIPE_FRMTMSTMP(pipe)		\
>  			_MMIO_PIPE2(pipe, _PIPE_FRMTMSTMP_A)
> diff --git a/drivers/gpu/drm/i915/intel_device_info.c b/drivers/gpu/drm/i915/intel_device_info.c
> index db03d179fc85..9b71a9b6d80e 100644
> --- a/drivers/gpu/drm/i915/intel_device_info.c
> +++ b/drivers/gpu/drm/i915/intel_device_info.c
> @@ -329,6 +329,100 @@ static void broadwell_sseu_info_init(struct drm_i915_private *dev_priv)
>  	sseu->has_eu_pg = 0;
>  }
>  
> +static u64 read_timestamp_frequency_from_divide(struct drm_i915_private *dev_priv)
> +{
> +	u32 ts_override = I915_READ(GEN8_TIMESTAMP_OVERRIDE);
> +	u64 base_freq, frac_freq;
> +
> +	base_freq = ((ts_override & GEN8_TIMESTAMP_OVERRIDE_US_COUNTER_MASK) >>
> +		     GEN8_TIMESTAMP_OVERRIDE_US_COUNTER_SHIFT) + 1;
> +	base_freq *= 1000000;
> +
> +	frac_freq = ((ts_override &
> +		      GEN8_TIMESTAMP_OVERRIDE_US_COUNTER_DENOMINATOR_MASK) >>
> +		     GEN8_TIMESTAMP_OVERRIDE_US_COUNTER_DENOMINATOR_SHIFT);
> +	if (frac_freq != 0)
> +		frac_freq = 1000000 / (frac_freq + 1);
> +
> +	return base_freq + frac_freq;
> +}
> +
> +static u64 read_timestamp_frequency(struct drm_i915_private *dev_priv)
> +{
> +	if (INTEL_GEN(dev_priv) <= 4) {
> +		/* PRMs say:
> +		 *
> +		 *     "The value in this register increments once every 16
> +		 *      hclks." ("CLKCFG" register)
> +		 *
> +		 * Since dev_priv->rawclk_freq stores the value in kHz divided
> +		 * by 4, we just need to divide it again by 4.
> +		 */
> +		return (dev_priv->rawclk_freq * 1000) / 4;
> +	} else if (INTEL_GEN(dev_priv) <= 7) {
> +		/* PRMs say:
> +		 *
> +		 *     "The PCU TSC counts 10ns increments; this timestamp
> +		 *      reflects bits 38:3 of the TSC (i.e. 80ns granularity,
> +		 *      rolling over every 1.5 hours).
> +		 */
> +		return 12500000;
> +	} else if (INTEL_GEN(dev_priv) <= 9) {
> +		u32 ctc_reg = I915_READ(GEN8_CTC_MODE);
> +		u64 freq = 0;
> +
> +		if ((ctc_reg & GEN8_CTC_SOURCE_PARAMETER_MASK) == GEN8_CTC_SOURCE_DIVIDE_LOGIC)
> +			freq = read_timestamp_frequency_from_divide(dev_priv);
> +		else
> +			freq = IS_GEN9_LP(dev_priv) ? 19200000 : 24000000;
> +
> +		/* Now figure out how the command stream's timestamp register
> +		 * increments from this frequency (it might increment only
> +		 * every few clock cycle).
> +		 */
> +		freq >>= 3 - ((ctc_reg & GEN8_CTC_SHIFT_PARAMETER_MASK) >>
> +			      GEN8_CTC_SHIFT_PARAMETER_SHIFT);
> +
> +		return freq;
> +	} else if (INTEL_GEN(dev_priv) <= 10) {
> +		u32 ctc_reg = I915_READ(GEN8_CTC_MODE);
> +		u64 freq = 0;
> +		u32 rpm_config_reg = 0;
> +
> +		/* First figure out the reference frequency. There are 2 ways
> +		 * we can compute the frequency, either through the
> +		 * TIMESTAMP_OVERRIDE register or through the
> +		 * RPM_CONFIG0 register. CTC_MODE tells us which
> +		 * one we should use.
> +		 */
> +		if ((ctc_reg & GEN8_CTC_SOURCE_PARAMETER_MASK) == GEN8_CTC_SOURCE_DIVIDE_LOGIC) {
> +			freq = read_timestamp_frequency_from_divide(dev_priv);
> +		} else {
> +			u32 crystal_clock;
> +
> +			rpm_config_reg = I915_READ(RPM_CONFIG0);
> +			crystal_clock = (rpm_config_reg &
> +					 GEN9_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_MASK) >>
> +				GEN9_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_SHIFT;
> +			freq = crystal_clock == GEN9_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_19_2_MHZ ?
> +				19200000 : 24000000;
> +		}
> +
> +		/* Now figure out how the command stream's timestamp register
> +		 * increments from this frequency (it might increment only
> +		 * every few clock cycles).
> +		 */
> +		freq >>= 3 - ((rpm_config_reg &
> +			       GEN10_RPM_CONFIG0_CTC_SHIFT_PARAMETER_MASK) >>
> +			      GEN10_RPM_CONFIG0_CTC_SHIFT_PARAMETER_SHIFT);
> +
> +		return freq;
> +	}
> +
> +	DRM_ERROR("Unknown gen, unable to compute command stream timestamp frequency\n");
> +	return 0;
> +}
> +
>  /*
>   * Determine various intel_device_info fields at runtime.
>   *
> @@ -450,6 +544,9 @@ void intel_device_info_runtime_init(struct drm_i915_private *dev_priv)
>  	else if (INTEL_GEN(dev_priv) >= 10)
>  		gen10_sseu_info_init(dev_priv);
>  
> +	/* Initialize command stream timestamp frequency */
> +	info->cs_timestamp_frequency = read_timestamp_frequency(dev_priv);
> +
>  	DRM_DEBUG_DRIVER("slice mask: %04x\n", info->sseu.slice_mask);
>  	DRM_DEBUG_DRIVER("slice total: %u\n", hweight8(info->sseu.slice_mask));
>  	DRM_DEBUG_DRIVER("subslice total: %u\n",
> @@ -465,4 +562,6 @@ void intel_device_info_runtime_init(struct drm_i915_private *dev_priv)
>  			 info->sseu.has_subslice_pg ? "y" : "n");
>  	DRM_DEBUG_DRIVER("has EU power gating: %s\n",
>  			 info->sseu.has_eu_pg ? "y" : "n");
> +	DRM_DEBUG_DRIVER("CS timestamp frequency: %llu\n",
> +			 info->cs_timestamp_frequency);
>  }
> diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
> index 125bde7d9504..c3ff0d4947af 100644
> --- a/include/uapi/drm/i915_drm.h
> +++ b/include/uapi/drm/i915_drm.h
> @@ -450,6 +450,12 @@ typedef struct drm_i915_irq_wait {
>   */
>  #define I915_PARAM_HAS_EXEC_FENCE_ARRAY  49
>  
> +/* Frequency of the command streamer timestamps given by the *_TIMESTAMP
> + * registers. This used to be fixed per platform but from CNL onwards, this
> + * might vary depending on the parts.
> + */
> +#define I915_PARAM_CS_TIMESTAMP_FREQUENCY   50
> +
>  typedef struct drm_i915_getparam {
>  	__s32 param;
>  	/*
> -- 
> 2.15.0
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


* Re: [PATCH v2 6/9] drm/i915: expose command stream timestamp frequency to userspace
  2017-11-02 16:29 ` [PATCH v2 6/9] drm/i915: expose command stream timestamp frequency to userspace Lionel Landwerlin
  2017-11-07  0:01   ` Rafael Antognolli
@ 2017-11-07 11:30   ` Ewelina Musial
  2017-11-07 12:07     ` Lionel Landwerlin
  2017-11-08 17:36   ` Lionel Landwerlin
  2017-11-09 11:58   ` Sagar Arun Kamble
  3 siblings, 1 reply; 31+ messages in thread
From: Ewelina Musial @ 2017-11-07 11:30 UTC (permalink / raw)
  To: Lionel Landwerlin; +Cc: intel-gfx

On Thu, Nov 02, 2017 at 04:29:46PM +0000, Lionel Landwerlin wrote:
> We used to have this fixed per generation, but starting with CNL userspace
> cannot tell from the PCI ID alone. Let's make this information available. This
> is particularly useful for performance monitoring, where much of the
> normalization work is done using those timestamps (this includes pipeline
> statistics in both GL & Vulkan as well as OA reports).
> 
> Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_debugfs.c      |  2 +
>  drivers/gpu/drm/i915/i915_drv.c          |  3 +
>  drivers/gpu/drm/i915/i915_drv.h          |  2 +
>  drivers/gpu/drm/i915/i915_reg.h          | 21 +++++++
>  drivers/gpu/drm/i915/intel_device_info.c | 99 ++++++++++++++++++++++++++++++++
>  include/uapi/drm/i915_drm.h              |  6 ++
>  6 files changed, 133 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> index 39883cd915db..0897fd616a1f 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -3246,6 +3246,8 @@ static int i915_engine_info(struct seq_file *m, void *unused)
>  		   yesno(dev_priv->gt.awake));
>  	seq_printf(m, "Global active requests: %d\n",
>  		   dev_priv->gt.active_requests);
> +	seq_printf(m, "CS timestamp frequency: %llu\n",
> +		   dev_priv->info.cs_timestamp_frequency);
>  
>  	p = drm_seq_file_printer(m);
>  	for_each_engine(engine, dev_priv, id)
> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> index e7e9e061073b..fdd23e79fb46 100644
> --- a/drivers/gpu/drm/i915/i915_drv.c
> +++ b/drivers/gpu/drm/i915/i915_drv.c
> @@ -416,6 +416,9 @@ static int i915_getparam(struct drm_device *dev, void *data,
>  		if (!value)
>  			return -ENODEV;
>  		break;
> +	case I915_PARAM_CS_TIMESTAMP_FREQUENCY:
> +		value = INTEL_INFO(dev_priv)->cs_timestamp_frequency;
> +		break;
>  	default:
>  		DRM_DEBUG("Unknown parameter %d\n", param->param);
>  		return -EINVAL;
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 6cb7cd7f9420..4e804aaeaae1 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -886,6 +886,8 @@ struct intel_device_info {
>  	/* Slice/subslice/EU info */
>  	struct sseu_dev_info sseu;
>  
> +	uint64_t cs_timestamp_frequency;
> +
>  	struct color_luts {
>  		u16 degamma_lut_size;
>  		u16 gamma_lut_size;
> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> index a2223f01ee2a..f392f28f2cfa 100644
> --- a/drivers/gpu/drm/i915/i915_reg.h
> +++ b/drivers/gpu/drm/i915/i915_reg.h
> @@ -1119,9 +1119,24 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg)
>  
>  /* RPM unit config (Gen8+) */
>  #define RPM_CONFIG0	    _MMIO(0x0D00)
> +#define  GEN9_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_SHIFT	3
> +#define  GEN9_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_MASK	(1 << GEN9_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_SHIFT)
> +#define  GEN9_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_19_2_MHZ	0
> +#define  GEN9_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_24_MHZ	1
> +#define  GEN10_RPM_CONFIG0_CTC_SHIFT_PARAMETER_SHIFT	1
> +#define  GEN10_RPM_CONFIG0_CTC_SHIFT_PARAMETER_MASK	(0x3 << GEN10_RPM_CONFIG0_CTC_SHIFT_PARAMETER_SHIFT)
> +
>  #define RPM_CONFIG1	    _MMIO(0x0D04)
>  #define  GEN10_GT_NOA_ENABLE  (1 << 9)
>  
> +/* GPM unit config (assuming Gen8+, documentation is fuzzy...) */
> +#define GEN8_CTC_MODE			_MMIO(0xA26C)
> +#define  GEN8_CTC_SOURCE_PARAMETER_MASK 1
> +#define  GEN8_CTC_SOURCE_CRYSTAL_CLOCK	0
> +#define  GEN8_CTC_SOURCE_DIVIDE_LOGIC	1
> +#define  GEN8_CTC_SHIFT_PARAMETER_SHIFT	1
> +#define  GEN8_CTC_SHIFT_PARAMETER_MASK	(0x3 << GEN8_CTC_SHIFT_PARAMETER_SHIFT)
> +
>  /* RPC unit config (Gen8+) */
>  #define RPC_CONFIG	    _MMIO(0x0D08)
>  
> @@ -8865,6 +8880,12 @@ enum skl_power_gate {
>  #define ILK_TIMESTAMP_HI	_MMIO(0x70070)
>  #define IVB_TIMESTAMP_CTR	_MMIO(0x44070)
>  
> +#define GEN8_TIMESTAMP_OVERRIDE				_MMIO(0x44074)
> +#define  GEN8_TIMESTAMP_OVERRIDE_US_COUNTER_SHIFT		0
> +#define  GEN8_TIMESTAMP_OVERRIDE_US_COUNTER_MASK		0x3ff
> +#define  GEN8_TIMESTAMP_OVERRIDE_US_COUNTER_DENOMINATOR_SHIFT	12
> +#define  GEN8_TIMESTAMP_OVERRIDE_US_COUNTER_DENOMINATOR_MASK	(0xf << 12)
> +
>  #define _PIPE_FRMTMSTMP_A		0x70048
>  #define PIPE_FRMTMSTMP(pipe)		\
>  			_MMIO_PIPE2(pipe, _PIPE_FRMTMSTMP_A)
> diff --git a/drivers/gpu/drm/i915/intel_device_info.c b/drivers/gpu/drm/i915/intel_device_info.c
> index db03d179fc85..9b71a9b6d80e 100644
> --- a/drivers/gpu/drm/i915/intel_device_info.c
> +++ b/drivers/gpu/drm/i915/intel_device_info.c
> @@ -329,6 +329,100 @@ static void broadwell_sseu_info_init(struct drm_i915_private *dev_priv)
>  	sseu->has_eu_pg = 0;
>  }
>  
> +static u64 read_timestamp_frequency_from_divide(struct drm_i915_private *dev_priv)
> +{
> +	u32 ts_override = I915_READ(GEN8_TIMESTAMP_OVERRIDE);
> +	u64 base_freq, frac_freq;
> +
> +	base_freq = ((ts_override & GEN8_TIMESTAMP_OVERRIDE_US_COUNTER_MASK) >>
> +		     GEN8_TIMESTAMP_OVERRIDE_US_COUNTER_SHIFT) + 1;
> +	base_freq *= 1000000;
> +
> +	frac_freq = ((ts_override &
> +		      GEN8_TIMESTAMP_OVERRIDE_US_COUNTER_DENOMINATOR_MASK) >>
> +		     GEN8_TIMESTAMP_OVERRIDE_US_COUNTER_DENOMINATOR_SHIFT);
> +	if (frac_freq != 0)
> +		frac_freq = 1000000 / (frac_freq + 1);
> +
> +	return base_freq + frac_freq;
> +}
> +
> +static u64 read_timestamp_frequency(struct drm_i915_private *dev_priv)
> +{
> +	if (INTEL_GEN(dev_priv) <= 4) {
> +		/* PRMs say:
> +		 *
> +		 *     "The value in this register increments once every 16
> +		 *      hclks." ("CLKCFG" register)
> +		 *
> +		 * Since dev_priv->rawclk_freq stores the value in kHz divided
> +		 * by 4, we just need to divide it again by 4.
> +		 */
> +		return (dev_priv->rawclk_freq * 1000) / 4;
> +	} else if (INTEL_GEN(dev_priv) <= 7) {
> +		/* PRMs say:
> +		 *
> +		 *     "The PCU TSC counts 10ns increments; this timestamp
> +		 *      reflects bits 38:3 of the TSC (i.e. 80ns granularity,
> +		 *      rolling over every 1.5 hours)."
> +		 */
> +		return 12500000;
> +	} else if (INTEL_GEN(dev_priv) <= 9) {
> +		u32 ctc_reg = I915_READ(GEN8_CTC_MODE);
> +		u64 freq = 0;
> +
> +		if ((ctc_reg & GEN8_CTC_SOURCE_PARAMETER_MASK) == GEN8_CTC_SOURCE_DIVIDE_LOGIC)
> +			freq = read_timestamp_frequency_from_divide(dev_priv);
> +		else
> +			freq = IS_GEN9_LP(dev_priv) ? 19200000 : 24000000;
What do those values mean? They look like magic numbers.
A comment or a define would be helpful.
> +
> +		/* Now figure out how the command stream's timestamp register
> +		 * increments from this frequency (it might increment only
> +		 * every few clock cycles).
> +		 */
> +		freq >>= 3 - ((ctc_reg & GEN8_CTC_SHIFT_PARAMETER_MASK) >>
> +			      GEN8_CTC_SHIFT_PARAMETER_SHIFT);
> +
> +		return freq;
> +	} else if (INTEL_GEN(dev_priv) <= 10) {
> +		u32 ctc_reg = I915_READ(GEN8_CTC_MODE);
> +		u64 freq = 0;
> +		u32 rpm_config_reg = 0;
> +
> +		/* First figure out the reference frequency. There are 2 ways
> +		 * we can compute the frequency, either through the
> +		 * TIMESTAMP_OVERRIDE register or through the
> +		 * RPM_CONFIG0 register. CTC_MODE tells us which
> +		 * one we should use.
> +		 */
> +		if ((ctc_reg & GEN8_CTC_SOURCE_PARAMETER_MASK) == GEN8_CTC_SOURCE_DIVIDE_LOGIC) {
> +			freq = read_timestamp_frequency_from_divide(dev_priv);
> +		} else {
> +			u32 crystal_clock;
> +
> +			rpm_config_reg = I915_READ(RPM_CONFIG0);
> +			crystal_clock = (rpm_config_reg &
> +					 GEN9_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_MASK) >>
> +				GEN9_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_SHIFT;
> +			freq = crystal_clock == GEN9_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_19_2_MHZ ?
> +				19200000 : 24000000;
The same here.
- Ewelina
> +		}
> +
> +		/* Now figure out how the command stream's timestamp register
> +		 * increments from this frequency (it might increment only
> +		 * every few clock cycles).
> +		 */
> +		freq >>= 3 - ((rpm_config_reg &
> +			       GEN10_RPM_CONFIG0_CTC_SHIFT_PARAMETER_MASK) >>
> +			      GEN10_RPM_CONFIG0_CTC_SHIFT_PARAMETER_SHIFT);
> +
> +		return freq;
> +	}
> +
> +	DRM_ERROR("Unknown gen, unable to compute command stream timestamp frequency\n");
> +	return 0;
> +}
> +
>  /*
>   * Determine various intel_device_info fields at runtime.
>   *
> @@ -450,6 +544,9 @@ void intel_device_info_runtime_init(struct drm_i915_private *dev_priv)
>  	else if (INTEL_GEN(dev_priv) >= 10)
>  		gen10_sseu_info_init(dev_priv);
>  
> +	/* Initialize command stream timestamp frequency */
> +	info->cs_timestamp_frequency = read_timestamp_frequency(dev_priv);
> +
>  	DRM_DEBUG_DRIVER("slice mask: %04x\n", info->sseu.slice_mask);
>  	DRM_DEBUG_DRIVER("slice total: %u\n", hweight8(info->sseu.slice_mask));
>  	DRM_DEBUG_DRIVER("subslice total: %u\n",
> @@ -465,4 +562,6 @@ void intel_device_info_runtime_init(struct drm_i915_private *dev_priv)
>  			 info->sseu.has_subslice_pg ? "y" : "n");
>  	DRM_DEBUG_DRIVER("has EU power gating: %s\n",
>  			 info->sseu.has_eu_pg ? "y" : "n");
> +	DRM_DEBUG_DRIVER("CS timestamp frequency: %llu\n",
> +			 info->cs_timestamp_frequency);
>  }
> diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
> index 125bde7d9504..c3ff0d4947af 100644
> --- a/include/uapi/drm/i915_drm.h
> +++ b/include/uapi/drm/i915_drm.h
> @@ -450,6 +450,12 @@ typedef struct drm_i915_irq_wait {
>   */
>  #define I915_PARAM_HAS_EXEC_FENCE_ARRAY  49
>  
> +/* Frequency of the command streamer timestamps given by the *_TIMESTAMP
> + * registers. This used to be fixed per platform but from CNL onwards, this
> + * might vary depending on the parts.
> + */
> +#define I915_PARAM_CS_TIMESTAMP_FREQUENCY   50
> +
>  typedef struct drm_i915_getparam {
>  	__s32 param;
>  	/*
> -- 
> 2.15.0


* Re: [PATCH v2 2/9] drm/i915/perf: add support for Coffeelake GT3
  2017-11-02 16:29 ` [PATCH v2 2/9] drm/i915/perf: add support for Coffeelake GT3 Lionel Landwerlin
@ 2017-11-07 11:34   ` Matthew Auld
  0 siblings, 0 replies; 31+ messages in thread
From: Matthew Auld @ 2017-11-07 11:34 UTC (permalink / raw)
  To: Lionel Landwerlin; +Cc: Intel Graphics Development

On 2 November 2017 at 16:29, Lionel Landwerlin
<lionel.g.landwerlin@intel.com> wrote:
> We can enable GT3 as well as GT2.
>
> Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>


* Re: [PATCH v2 6/9] drm/i915: expose command stream timestamp frequency to userspace
  2017-11-07 11:30   ` Ewelina Musial
@ 2017-11-07 12:07     ` Lionel Landwerlin
  0 siblings, 0 replies; 31+ messages in thread
From: Lionel Landwerlin @ 2017-11-07 12:07 UTC (permalink / raw)
  To: Ewelina Musial; +Cc: intel-gfx

On 07/11/17 11:30, Ewelina Musial wrote:
> On Thu, Nov 02, 2017 at 04:29:46PM +0000, Lionel Landwerlin wrote:
>> We used to have this fixed per generation, but starting with CNL userspace
>> cannot tell from the PCI ID alone. Let's make this information available. This
>> is particularly useful for performance monitoring, where much of the
>> normalization work is done using those timestamps (this includes pipeline
>> statistics in both GL & Vulkan as well as OA reports).
>>
>> Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
>> ---
>>   drivers/gpu/drm/i915/i915_debugfs.c      |  2 +
>>   drivers/gpu/drm/i915/i915_drv.c          |  3 +
>>   drivers/gpu/drm/i915/i915_drv.h          |  2 +
>>   drivers/gpu/drm/i915/i915_reg.h          | 21 +++++++
>>   drivers/gpu/drm/i915/intel_device_info.c | 99 ++++++++++++++++++++++++++++++++
>>   include/uapi/drm/i915_drm.h              |  6 ++
>>   6 files changed, 133 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
>> index 39883cd915db..0897fd616a1f 100644
>> --- a/drivers/gpu/drm/i915/i915_debugfs.c
>> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
>> @@ -3246,6 +3246,8 @@ static int i915_engine_info(struct seq_file *m, void *unused)
>>   		   yesno(dev_priv->gt.awake));
>>   	seq_printf(m, "Global active requests: %d\n",
>>   		   dev_priv->gt.active_requests);
>> +	seq_printf(m, "CS timestamp frequency: %llu\n",
>> +		   dev_priv->info.cs_timestamp_frequency);
>>   
>>   	p = drm_seq_file_printer(m);
>>   	for_each_engine(engine, dev_priv, id)
>> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
>> index e7e9e061073b..fdd23e79fb46 100644
>> --- a/drivers/gpu/drm/i915/i915_drv.c
>> +++ b/drivers/gpu/drm/i915/i915_drv.c
>> @@ -416,6 +416,9 @@ static int i915_getparam(struct drm_device *dev, void *data,
>>   		if (!value)
>>   			return -ENODEV;
>>   		break;
>> +	case I915_PARAM_CS_TIMESTAMP_FREQUENCY:
>> +		value = INTEL_INFO(dev_priv)->cs_timestamp_frequency;
>> +		break;
>>   	default:
>>   		DRM_DEBUG("Unknown parameter %d\n", param->param);
>>   		return -EINVAL;
>> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
>> index 6cb7cd7f9420..4e804aaeaae1 100644
>> --- a/drivers/gpu/drm/i915/i915_drv.h
>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>> @@ -886,6 +886,8 @@ struct intel_device_info {
>>   	/* Slice/subslice/EU info */
>>   	struct sseu_dev_info sseu;
>>   
>> +	uint64_t cs_timestamp_frequency;
>> +
>>   	struct color_luts {
>>   		u16 degamma_lut_size;
>>   		u16 gamma_lut_size;
>> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
>> index a2223f01ee2a..f392f28f2cfa 100644
>> --- a/drivers/gpu/drm/i915/i915_reg.h
>> +++ b/drivers/gpu/drm/i915/i915_reg.h
>> @@ -1119,9 +1119,24 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg)
>>   
>>   /* RPM unit config (Gen8+) */
>>   #define RPM_CONFIG0	    _MMIO(0x0D00)
>> +#define  GEN9_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_SHIFT	3
>> +#define  GEN9_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_MASK	(1 << GEN9_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_SHIFT)
>> +#define  GEN9_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_19_2_MHZ	0
>> +#define  GEN9_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_24_MHZ	1
>> +#define  GEN10_RPM_CONFIG0_CTC_SHIFT_PARAMETER_SHIFT	1
>> +#define  GEN10_RPM_CONFIG0_CTC_SHIFT_PARAMETER_MASK	(0x3 << GEN10_RPM_CONFIG0_CTC_SHIFT_PARAMETER_SHIFT)
>> +
>>   #define RPM_CONFIG1	    _MMIO(0x0D04)
>>   #define  GEN10_GT_NOA_ENABLE  (1 << 9)
>>   
>> +/* GPM unit config (assuming Gen8+, documentation is fuzzy...) */
>> +#define GEN8_CTC_MODE			_MMIO(0xA26C)
>> +#define  GEN8_CTC_SOURCE_PARAMETER_MASK 1
>> +#define  GEN8_CTC_SOURCE_CRYSTAL_CLOCK	0
>> +#define  GEN8_CTC_SOURCE_DIVIDE_LOGIC	1
>> +#define  GEN8_CTC_SHIFT_PARAMETER_SHIFT	1
>> +#define  GEN8_CTC_SHIFT_PARAMETER_MASK	(0x3 << GEN8_CTC_SHIFT_PARAMETER_SHIFT)
>> +
>>   /* RPC unit config (Gen8+) */
>>   #define RPC_CONFIG	    _MMIO(0x0D08)
>>   
>> @@ -8865,6 +8880,12 @@ enum skl_power_gate {
>>   #define ILK_TIMESTAMP_HI	_MMIO(0x70070)
>>   #define IVB_TIMESTAMP_CTR	_MMIO(0x44070)
>>   
>> +#define GEN8_TIMESTAMP_OVERRIDE				_MMIO(0x44074)
>> +#define  GEN8_TIMESTAMP_OVERRIDE_US_COUNTER_SHIFT		0
>> +#define  GEN8_TIMESTAMP_OVERRIDE_US_COUNTER_MASK		0x3ff
>> +#define  GEN8_TIMESTAMP_OVERRIDE_US_COUNTER_DENOMINATOR_SHIFT	12
>> +#define  GEN8_TIMESTAMP_OVERRIDE_US_COUNTER_DENOMINATOR_MASK	(0xf << 12)
>> +
>>   #define _PIPE_FRMTMSTMP_A		0x70048
>>   #define PIPE_FRMTMSTMP(pipe)		\
>>   			_MMIO_PIPE2(pipe, _PIPE_FRMTMSTMP_A)
>> diff --git a/drivers/gpu/drm/i915/intel_device_info.c b/drivers/gpu/drm/i915/intel_device_info.c
>> index db03d179fc85..9b71a9b6d80e 100644
>> --- a/drivers/gpu/drm/i915/intel_device_info.c
>> +++ b/drivers/gpu/drm/i915/intel_device_info.c
>> @@ -329,6 +329,100 @@ static void broadwell_sseu_info_init(struct drm_i915_private *dev_priv)
>>   	sseu->has_eu_pg = 0;
>>   }
>>   
>> +static u64 read_timestamp_frequency_from_divide(struct drm_i915_private *dev_priv)
>> +{
>> +	u32 ts_override = I915_READ(GEN8_TIMESTAMP_OVERRIDE);
>> +	u64 base_freq, frac_freq;
>> +
>> +	base_freq = ((ts_override & GEN8_TIMESTAMP_OVERRIDE_US_COUNTER_MASK) >>
>> +		     GEN8_TIMESTAMP_OVERRIDE_US_COUNTER_SHIFT) + 1;
>> +	base_freq *= 1000000;
>> +
>> +	frac_freq = ((ts_override &
>> +		      GEN8_TIMESTAMP_OVERRIDE_US_COUNTER_DENOMINATOR_MASK) >>
>> +		     GEN8_TIMESTAMP_OVERRIDE_US_COUNTER_DENOMINATOR_SHIFT);
>> +	if (frac_freq != 0)
>> +		frac_freq = 1000000 / (frac_freq + 1);
>> +
>> +	return base_freq + frac_freq;
>> +}
>> +
>> +static u64 read_timestamp_frequency(struct drm_i915_private *dev_priv)
>> +{
>> +	if (INTEL_GEN(dev_priv) <= 4) {
>> +		/* PRMs say:
>> +		 *
>> +		 *     "The value in this register increments once every 16
>> +		 *      hclks." ("CLKCFG" register)
>> +		 *
>> +		 * Since dev_priv->rawclk_freq stores the value in kHz divided
>> +		 * by 4, we just need to divide it again by 4.
>> +		 */
>> +		return (dev_priv->rawclk_freq * 1000) / 4;
>> +	} else if (INTEL_GEN(dev_priv) <= 7) {
>> +		/* PRMs say:
>> +		 *
>> +		 *     "The PCU TSC counts 10ns increments; this timestamp
>> +		 *      reflects bits 38:3 of the TSC (i.e. 80ns granularity,
>> +		 *      rolling over every 1.5 hours)."
>> +		 */
>> +		return 12500000;
>> +	} else if (INTEL_GEN(dev_priv) <= 9) {
>> +		u32 ctc_reg = I915_READ(GEN8_CTC_MODE);
>> +		u64 freq = 0;
>> +
>> +		if ((ctc_reg & GEN8_CTC_SOURCE_PARAMETER_MASK) == GEN8_CTC_SOURCE_DIVIDE_LOGIC)
>> +			freq = read_timestamp_frequency_from_divide(dev_priv);
>> +		else
>> +			freq = IS_GEN9_LP(dev_priv) ? 19200000 : 24000000;
> What do those values mean? They look like magic numbers.
> A comment or a define would be helpful.

Those are 19.2 MHz and 24 MHz, the two possible crystal clock frequencies.
Thanks, I'll add a comment or a define.
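
For illustration only, a minimal sketch of what such defines could look like, together with the shift step the patch applies to the reference clock. The names CS_CRYSTAL_CLOCK_19_2_MHZ_IN_HZ, CS_CRYSTAL_CLOCK_24_MHZ_IN_HZ and cs_timestamp_freq_hz() are hypothetical and do not come from the patch; the helper assumes the shift parameter is the 2-bit field read from CTC_MODE or RPM_CONFIG0.

```c
#include <stdint.h>

/* Hypothetical names: the patch itself uses the bare literals. */
#define CS_CRYSTAL_CLOCK_19_2_MHZ_IN_HZ 19200000ULL
#define CS_CRYSTAL_CLOCK_24_MHZ_IN_HZ   24000000ULL

/*
 * Mirrors the patch's "freq >>= 3 - shift_parameter" step.
 * shift_param is a 2-bit field (0..3), so the reference clock is
 * divided by 8, 4, 2 or 1 respectively.
 */
static uint64_t cs_timestamp_freq_hz(uint64_t ref_freq_hz,
				     unsigned int shift_param)
{
	return ref_freq_hz >> (3 - (shift_param & 0x3));
}
```

With a shift parameter of 0, a 19.2 MHz crystal gives a 2.4 MHz timestamp frequency; with a shift parameter of 3 the timestamp runs at the full reference frequency.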

>> +
>> +		/* Now figure out how the command stream's timestamp register
>> +		 * increments from this frequency (it might increment only
>> +		 * every few clock cycles).
>> +		 */
>> +		freq >>= 3 - ((ctc_reg & GEN8_CTC_SHIFT_PARAMETER_MASK) >>
>> +			      GEN8_CTC_SHIFT_PARAMETER_SHIFT);
>> +
>> +		return freq;
>> +	} else if (INTEL_GEN(dev_priv) <= 10) {
>> +		u32 ctc_reg = I915_READ(GEN8_CTC_MODE);
>> +		u64 freq = 0;
>> +		u32 rpm_config_reg = 0;
>> +
>> +		/* First figure out the reference frequency. There are 2 ways
>> +		 * we can compute the frequency, either through the
>> +		 * TIMESTAMP_OVERRIDE register or through the
>> +		 * RPM_CONFIG0 register. CTC_MODE tells us which
>> +		 * one we should use.
>> +		 */
>> +		if ((ctc_reg & GEN8_CTC_SOURCE_PARAMETER_MASK) == GEN8_CTC_SOURCE_DIVIDE_LOGIC) {
>> +			freq = read_timestamp_frequency_from_divide(dev_priv);
>> +		} else {
>> +			u32 crystal_clock;
>> +
>> +			rpm_config_reg = I915_READ(RPM_CONFIG0);
>> +			crystal_clock = (rpm_config_reg &
>> +					 GEN9_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_MASK) >>
>> +				GEN9_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_SHIFT;
>> +			freq = crystal_clock == GEN9_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_19_2_MHZ ?
>> +				19200000 : 24000000;
> The same here.
> - Ewelina
>> +		}
>> +
>> +		/* Now figure out how the command stream's timestamp register
>> +		 * increments from this frequency (it might increment only
>> +		 * every few clock cycles).
>> +		 */
>> +		freq >>= 3 - ((rpm_config_reg &
>> +			       GEN10_RPM_CONFIG0_CTC_SHIFT_PARAMETER_MASK) >>
>> +			      GEN10_RPM_CONFIG0_CTC_SHIFT_PARAMETER_SHIFT);
>> +
>> +		return freq;
>> +	}
>> +
>> +	DRM_ERROR("Unknown gen, unable to compute command stream timestamp frequency\n");
>> +	return 0;
>> +}
>> +
>>   /*
>>    * Determine various intel_device_info fields at runtime.
>>    *
>> @@ -450,6 +544,9 @@ void intel_device_info_runtime_init(struct drm_i915_private *dev_priv)
>>   	else if (INTEL_GEN(dev_priv) >= 10)
>>   		gen10_sseu_info_init(dev_priv);
>>   
>> +	/* Initialize command stream timestamp frequency */
>> +	info->cs_timestamp_frequency = read_timestamp_frequency(dev_priv);
>> +
>>   	DRM_DEBUG_DRIVER("slice mask: %04x\n", info->sseu.slice_mask);
>>   	DRM_DEBUG_DRIVER("slice total: %u\n", hweight8(info->sseu.slice_mask));
>>   	DRM_DEBUG_DRIVER("subslice total: %u\n",
>> @@ -465,4 +562,6 @@ void intel_device_info_runtime_init(struct drm_i915_private *dev_priv)
>>   			 info->sseu.has_subslice_pg ? "y" : "n");
>>   	DRM_DEBUG_DRIVER("has EU power gating: %s\n",
>>   			 info->sseu.has_eu_pg ? "y" : "n");
>> +	DRM_DEBUG_DRIVER("CS timestamp frequency: %llu\n",
>> +			 info->cs_timestamp_frequency);
>>   }
>> diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
>> index 125bde7d9504..c3ff0d4947af 100644
>> --- a/include/uapi/drm/i915_drm.h
>> +++ b/include/uapi/drm/i915_drm.h
>> @@ -450,6 +450,12 @@ typedef struct drm_i915_irq_wait {
>>    */
>>   #define I915_PARAM_HAS_EXEC_FENCE_ARRAY  49
>>   
>> +/* Frequency of the command streamer timestamps given by the *_TIMESTAMP
>> + * registers. This used to be fixed per platform but from CNL onwards, this
>> + * might vary depending on the parts.
>> + */
>> +#define I915_PARAM_CS_TIMESTAMP_FREQUENCY   50
>> +
>>   typedef struct drm_i915_getparam {
>>   	__s32 param;
>>   	/*
>> -- 
>> 2.15.0




* Re: [PATCH v2 6/9] drm/i915: expose command stream timestamp frequency to userspace
  2017-11-02 16:29 ` [PATCH v2 6/9] drm/i915: expose command stream timestamp frequency to userspace Lionel Landwerlin
  2017-11-07  0:01   ` Rafael Antognolli
  2017-11-07 11:30   ` Ewelina Musial
@ 2017-11-08 17:36   ` Lionel Landwerlin
  2017-11-09  9:10     ` Sagar Arun Kamble
  2017-11-09 11:58   ` Sagar Arun Kamble
  3 siblings, 1 reply; 31+ messages in thread
From: Lionel Landwerlin @ 2017-11-08 17:36 UTC (permalink / raw)
  To: intel-gfx

Does anyone have spare time to review this patch?
Userspace needs it to make sense of timestamps on CNL.
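
As a rough sketch of the userspace side, assuming the frequency has already been obtained through I915_PARAM_CS_TIMESTAMP_FREQUENCY, a timestamp delta could be converted to nanoseconds as below. The helper name cs_ticks_to_ns() and the CS_NSEC_PER_SEC define are hypothetical; treating a frequency of 0 as "unknown" follows from the patch returning 0 for an unrecognised gen.

```c
#include <stdint.h>

#define CS_NSEC_PER_SEC 1000000000ULL

/*
 * Convert a command streamer timestamp delta (in ticks) to nanoseconds.
 * The division is split into whole seconds and a remainder so that
 * ticks * 1e9 cannot overflow 64 bits for large deltas.
 */
static uint64_t cs_ticks_to_ns(uint64_t ticks, uint64_t freq_hz)
{
	uint64_t secs, rem;

	if (freq_hz == 0)
		return 0; /* frequency unknown, nothing sensible to report */

	secs = ticks / freq_hz;
	rem = ticks % freq_hz;

	return secs * CS_NSEC_PER_SEC + (rem * CS_NSEC_PER_SEC) / freq_hz;
}
```

At the 12.5 MHz frequency the patch reports for Gen5 to Gen7, one tick comes out as 80 ns, which matches the granularity quoted from the PRMs.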

Thanks a lot,

-
Lionel

On 02/11/17 16:29, Lionel Landwerlin wrote:
> We used to have this fixed per generation, but starting with CNL userspace
> cannot tell from the PCI ID alone. Let's make this information available. This
> is particularly useful for performance monitoring, where much of the
> normalization work is done using those timestamps (this includes pipeline
> statistics in both GL & Vulkan as well as OA reports).
>
> Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
> ---
>   drivers/gpu/drm/i915/i915_debugfs.c      |  2 +
>   drivers/gpu/drm/i915/i915_drv.c          |  3 +
>   drivers/gpu/drm/i915/i915_drv.h          |  2 +
>   drivers/gpu/drm/i915/i915_reg.h          | 21 +++++++
>   drivers/gpu/drm/i915/intel_device_info.c | 99 ++++++++++++++++++++++++++++++++
>   include/uapi/drm/i915_drm.h              |  6 ++
>   6 files changed, 133 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> index 39883cd915db..0897fd616a1f 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -3246,6 +3246,8 @@ static int i915_engine_info(struct seq_file *m, void *unused)
>   		   yesno(dev_priv->gt.awake));
>   	seq_printf(m, "Global active requests: %d\n",
>   		   dev_priv->gt.active_requests);
> +	seq_printf(m, "CS timestamp frequency: %llu\n",
> +		   dev_priv->info.cs_timestamp_frequency);
>   
>   	p = drm_seq_file_printer(m);
>   	for_each_engine(engine, dev_priv, id)
> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> index e7e9e061073b..fdd23e79fb46 100644
> --- a/drivers/gpu/drm/i915/i915_drv.c
> +++ b/drivers/gpu/drm/i915/i915_drv.c
> @@ -416,6 +416,9 @@ static int i915_getparam(struct drm_device *dev, void *data,
>   		if (!value)
>   			return -ENODEV;
>   		break;
> +	case I915_PARAM_CS_TIMESTAMP_FREQUENCY:
> +		value = INTEL_INFO(dev_priv)->cs_timestamp_frequency;
> +		break;
>   	default:
>   		DRM_DEBUG("Unknown parameter %d\n", param->param);
>   		return -EINVAL;
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 6cb7cd7f9420..4e804aaeaae1 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -886,6 +886,8 @@ struct intel_device_info {
>   	/* Slice/subslice/EU info */
>   	struct sseu_dev_info sseu;
>   
> +	uint64_t cs_timestamp_frequency;
> +
>   	struct color_luts {
>   		u16 degamma_lut_size;
>   		u16 gamma_lut_size;
> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> index a2223f01ee2a..f392f28f2cfa 100644
> --- a/drivers/gpu/drm/i915/i915_reg.h
> +++ b/drivers/gpu/drm/i915/i915_reg.h
> @@ -1119,9 +1119,24 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg)
>   
>   /* RPM unit config (Gen8+) */
>   #define RPM_CONFIG0	    _MMIO(0x0D00)
> +#define  GEN9_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_SHIFT	3
> +#define  GEN9_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_MASK	(1 << GEN9_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_SHIFT)
> +#define  GEN9_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_19_2_MHZ	0
> +#define  GEN9_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_24_MHZ	1
> +#define  GEN10_RPM_CONFIG0_CTC_SHIFT_PARAMETER_SHIFT	1
> +#define  GEN10_RPM_CONFIG0_CTC_SHIFT_PARAMETER_MASK	(0x3 << GEN10_RPM_CONFIG0_CTC_SHIFT_PARAMETER_SHIFT)
> +
>   #define RPM_CONFIG1	    _MMIO(0x0D04)
>   #define  GEN10_GT_NOA_ENABLE  (1 << 9)
>   
> +/* GPM unit config (assuming Gen8+, documentation is fuzzy...) */
> +#define GEN8_CTC_MODE			_MMIO(0xA26C)
> +#define  GEN8_CTC_SOURCE_PARAMETER_MASK 1
> +#define  GEN8_CTC_SOURCE_CRYSTAL_CLOCK	0
> +#define  GEN8_CTC_SOURCE_DIVIDE_LOGIC	1
> +#define  GEN8_CTC_SHIFT_PARAMETER_SHIFT	1
> +#define  GEN8_CTC_SHIFT_PARAMETER_MASK	(0x3 << GEN8_CTC_SHIFT_PARAMETER_SHIFT)
> +
>   /* RPC unit config (Gen8+) */
>   #define RPC_CONFIG	    _MMIO(0x0D08)
>   
> @@ -8865,6 +8880,12 @@ enum skl_power_gate {
>   #define ILK_TIMESTAMP_HI	_MMIO(0x70070)
>   #define IVB_TIMESTAMP_CTR	_MMIO(0x44070)
>   
> +#define GEN8_TIMESTAMP_OVERRIDE				_MMIO(0x44074)
> +#define  GEN8_TIMESTAMP_OVERRIDE_US_COUNTER_SHIFT		0
> +#define  GEN8_TIMESTAMP_OVERRIDE_US_COUNTER_MASK		0x3ff
> +#define  GEN8_TIMESTAMP_OVERRIDE_US_COUNTER_DENOMINATOR_SHIFT	12
> +#define  GEN8_TIMESTAMP_OVERRIDE_US_COUNTER_DENOMINATOR_MASK	(0xf << 12)
> +
>   #define _PIPE_FRMTMSTMP_A		0x70048
>   #define PIPE_FRMTMSTMP(pipe)		\
>   			_MMIO_PIPE2(pipe, _PIPE_FRMTMSTMP_A)
> diff --git a/drivers/gpu/drm/i915/intel_device_info.c b/drivers/gpu/drm/i915/intel_device_info.c
> index db03d179fc85..9b71a9b6d80e 100644
> --- a/drivers/gpu/drm/i915/intel_device_info.c
> +++ b/drivers/gpu/drm/i915/intel_device_info.c
> @@ -329,6 +329,100 @@ static void broadwell_sseu_info_init(struct drm_i915_private *dev_priv)
>   	sseu->has_eu_pg = 0;
>   }
>   
> +static u64 read_timestamp_frequency_from_divide(struct drm_i915_private *dev_priv)
> +{
> +	u32 ts_override = I915_READ(GEN8_TIMESTAMP_OVERRIDE);
> +	u64 base_freq, frac_freq;
> +
> +	base_freq = ((ts_override & GEN8_TIMESTAMP_OVERRIDE_US_COUNTER_MASK) >>
> +		     GEN8_TIMESTAMP_OVERRIDE_US_COUNTER_SHIFT) + 1;
> +	base_freq *= 1000000;
> +
> +	frac_freq = ((ts_override &
> +		      GEN8_TIMESTAMP_OVERRIDE_US_COUNTER_DENOMINATOR_MASK) >>
> +		     GEN8_TIMESTAMP_OVERRIDE_US_COUNTER_DENOMINATOR_SHIFT);
> +	if (frac_freq != 0)
> +		frac_freq = 1000000 / (frac_freq + 1);
> +
> +	return base_freq + frac_freq;
> +}
> +
> +static u64 read_timestamp_frequency(struct drm_i915_private *dev_priv)
> +{
> +	if (INTEL_GEN(dev_priv) <= 4) {
> +		/* PRMs say:
> +		 *
> +		 *     "The value in this register increments once every 16
> +		 *      hclks." ("CLKCFG" register)
> +		 *
> +		 * Since dev_priv->rawclk_freq stores the value in kHz divided
> +		 * by 4, we just need to divide it again by 4.
> +		 */
> +		return (dev_priv->rawclk_freq * 1000) / 4;
> +	} else if (INTEL_GEN(dev_priv) <= 7) {
> +		/* PRMs say:
> +		 *
> +		 *     "The PCU TSC counts 10ns increments; this timestamp
> +		 *      reflects bits 38:3 of the TSC (i.e. 80ns granularity,
> +		 *      rolling over every 1.5 hours).
> +		 */
> +		return 12500000;
> +	} else if (INTEL_GEN(dev_priv) <= 9) {
> +		u32 ctc_reg = I915_READ(GEN8_CTC_MODE);
> +		u64 freq = 0;
> +
> +		if ((ctc_reg & GEN8_CTC_SOURCE_PARAMETER_MASK) == GEN8_CTC_SOURCE_DIVIDE_LOGIC)
> +			freq = read_timestamp_frequency_from_divide(dev_priv);
> +		else
> +			freq = IS_GEN9_LP(dev_priv) ? 19200000 : 24000000;
> +
> +		/* Now figure out how the command stream's timestamp register
> +		 * increments from this frequency (it might increment only
> +		 * every few clock cycle).
> +		 */
> +		freq >>= 3 - ((ctc_reg & GEN8_CTC_SHIFT_PARAMETER_MASK) >>
> +			      GEN8_CTC_SHIFT_PARAMETER_SHIFT);
> +
> +		return freq;
> +	} else if (INTEL_GEN(dev_priv) <= 10) {
> +		u32 ctc_reg = I915_READ(GEN8_CTC_MODE);
> +		u64 freq = 0;
> +		u32 rpm_config_reg = 0;
> +
> +		/* First figure out the reference frequency. There are 2 ways
> +		 * we can compute the frequency, either through the
> +		 * TIMESTAMP_OVERRIDE register or through CTC_MODE &
> +		 * RPM_CONFIG & CTC_MODE registers. CTC_MODE tells us which
> +		 * one we should use.
> +		 */
> +		if ((ctc_reg & GEN8_CTC_SOURCE_PARAMETER_MASK) == GEN8_CTC_SOURCE_DIVIDE_LOGIC) {
> +			freq = read_timestamp_frequency_from_divide(dev_priv);
> +		} else {
> +			u32 crystal_clock;
> +
> +			rpm_config_reg = I915_READ(RPM_CONFIG0);
> +			crystal_clock = (rpm_config_reg &
> +					 GEN9_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_MASK) >>
> +				GEN9_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_SHIFT;
> +			freq = crystal_clock == GEN9_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_19_2_MHZ ?
> +				19200000 : 24000000;
> +		}
> +
> +		/* Now figure out how the command stream's timestamp register
> +		 * increments from this frequency (it might increment only
> +		 * every few clock cycle).
> +		 */
> +		freq >>= 3 - ((rpm_config_reg &
> +			       GEN10_RPM_CONFIG0_CTC_SHIFT_PARAMETER_MASK) >>
> +			      GEN10_RPM_CONFIG0_CTC_SHIFT_PARAMETER_SHIFT);
> +
> +		return freq;
> +	}
> +
> +	DRM_ERROR("Unknown gen, unable to compute command stream timestamp frequency\n");
> +	return 0;
> +}
> +
>   /*
>    * Determine various intel_device_info fields at runtime.
>    *
> @@ -450,6 +544,9 @@ void intel_device_info_runtime_init(struct drm_i915_private *dev_priv)
>   	else if (INTEL_GEN(dev_priv) >= 10)
>   		gen10_sseu_info_init(dev_priv);
>   
> +	/* Initialize command stream timestamp frequency */
> +	info->cs_timestamp_frequency = read_timestamp_frequency(dev_priv);
> +
>   	DRM_DEBUG_DRIVER("slice mask: %04x\n", info->sseu.slice_mask);
>   	DRM_DEBUG_DRIVER("slice total: %u\n", hweight8(info->sseu.slice_mask));
>   	DRM_DEBUG_DRIVER("subslice total: %u\n",
> @@ -465,4 +562,6 @@ void intel_device_info_runtime_init(struct drm_i915_private *dev_priv)
>   			 info->sseu.has_subslice_pg ? "y" : "n");
>   	DRM_DEBUG_DRIVER("has EU power gating: %s\n",
>   			 info->sseu.has_eu_pg ? "y" : "n");
> +	DRM_DEBUG_DRIVER("CS timestamp frequency: %llu\n",
> +			 info->cs_timestamp_frequency);
>   }
> diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
> index 125bde7d9504..c3ff0d4947af 100644
> --- a/include/uapi/drm/i915_drm.h
> +++ b/include/uapi/drm/i915_drm.h
> @@ -450,6 +450,12 @@ typedef struct drm_i915_irq_wait {
>    */
>   #define I915_PARAM_HAS_EXEC_FENCE_ARRAY  49
>   
> +/* Frequency of the command streamer timestamps given by the *_TIMESTAMP
> + * registers. This used to be fixed per platform but from CNL onwards, this
> + * might vary depending on the parts.
> + */
> +#define I915_PARAM_CS_TIMESTAMP_FREQUENCY   50
> +
>   typedef struct drm_i915_getparam {
>   	__s32 param;
>   	/*


_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


* Re: [PATCH v2 6/9] drm/i915: expose command stream timestamp frequency to userspace
  2017-11-08 17:36   ` Lionel Landwerlin
@ 2017-11-09  9:10     ` Sagar Arun Kamble
  0 siblings, 0 replies; 31+ messages in thread
From: Sagar Arun Kamble @ 2017-11-09  9:10 UTC (permalink / raw)
  To: Lionel Landwerlin, intel-gfx



On 11/8/2017 11:06 PM, Lionel Landwerlin wrote:
> Is there anyone with spare time to review this patch?
I'm on it.
> It's kind of required for userspace to make sense of timestamps on CNL.
>
> Thanks a lot,
>
> -
> Lionel
>
> On 02/11/17 16:29, Lionel Landwerlin wrote:
>> We used to have this fixed per generation, but starting with CNL userspace
>> cannot tell just off the PCI ID. Let's make this information available. This
>> is particularly useful for performance monitoring where much of the
>> normalization work is done using those timestamps (this includes pipeline
>> statistics in both GL & Vulkan as well as OA reports).
>>
>> Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
>> ---
>>   drivers/gpu/drm/i915/i915_debugfs.c      |  2 +
>>   drivers/gpu/drm/i915/i915_drv.c          |  3 +
>>   drivers/gpu/drm/i915/i915_drv.h          |  2 +
>>   drivers/gpu/drm/i915/i915_reg.h          | 21 +++++++
>>   drivers/gpu/drm/i915/intel_device_info.c | 99 
>> ++++++++++++++++++++++++++++++++
>>   include/uapi/drm/i915_drm.h              |  6 ++
>>   6 files changed, 133 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c 
>> b/drivers/gpu/drm/i915/i915_debugfs.c
>> index 39883cd915db..0897fd616a1f 100644
>> --- a/drivers/gpu/drm/i915/i915_debugfs.c
>> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
>> @@ -3246,6 +3246,8 @@ static int i915_engine_info(struct seq_file *m, 
>> void *unused)
>>              yesno(dev_priv->gt.awake));
>>       seq_printf(m, "Global active requests: %d\n",
>>              dev_priv->gt.active_requests);
>> +    seq_printf(m, "CS timestamp frequency: %llu\n",
>> +           dev_priv->info.cs_timestamp_frequency);
>>         p = drm_seq_file_printer(m);
>>       for_each_engine(engine, dev_priv, id)
>> diff --git a/drivers/gpu/drm/i915/i915_drv.c 
>> b/drivers/gpu/drm/i915/i915_drv.c
>> index e7e9e061073b..fdd23e79fb46 100644
>> --- a/drivers/gpu/drm/i915/i915_drv.c
>> +++ b/drivers/gpu/drm/i915/i915_drv.c
>> @@ -416,6 +416,9 @@ static int i915_getparam(struct drm_device *dev, 
>> void *data,
>>           if (!value)
>>               return -ENODEV;
>>           break;
>> +    case I915_PARAM_CS_TIMESTAMP_FREQUENCY:
>> +        value = INTEL_INFO(dev_priv)->cs_timestamp_frequency;
>> +        break;
>>       default:
>>           DRM_DEBUG("Unknown parameter %d\n", param->param);
>>           return -EINVAL;
>> diff --git a/drivers/gpu/drm/i915/i915_drv.h 
>> b/drivers/gpu/drm/i915/i915_drv.h
>> index 6cb7cd7f9420..4e804aaeaae1 100644
>> --- a/drivers/gpu/drm/i915/i915_drv.h
>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>> @@ -886,6 +886,8 @@ struct intel_device_info {
>>       /* Slice/subslice/EU info */
>>       struct sseu_dev_info sseu;
>>   +    uint64_t cs_timestamp_frequency;
>> +
>>       struct color_luts {
>>           u16 degamma_lut_size;
>>           u16 gamma_lut_size;
>> diff --git a/drivers/gpu/drm/i915/i915_reg.h 
>> b/drivers/gpu/drm/i915/i915_reg.h
>> index a2223f01ee2a..f392f28f2cfa 100644
>> --- a/drivers/gpu/drm/i915/i915_reg.h
>> +++ b/drivers/gpu/drm/i915/i915_reg.h
>> @@ -1119,9 +1119,24 @@ static inline bool 
>> i915_mmio_reg_valid(i915_reg_t reg)
>>     /* RPM unit config (Gen8+) */
>>   #define RPM_CONFIG0        _MMIO(0x0D00)
>> +#define  GEN9_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_SHIFT    3
>> +#define  GEN9_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_MASK    (1 << 
>> GEN9_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_SHIFT)
>> +#define  GEN9_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_19_2_MHZ    0
>> +#define  GEN9_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_24_MHZ    1
>> +#define  GEN10_RPM_CONFIG0_CTC_SHIFT_PARAMETER_SHIFT    1
>> +#define  GEN10_RPM_CONFIG0_CTC_SHIFT_PARAMETER_MASK    (0x3 << 
>> GEN10_RPM_CONFIG0_CTC_SHIFT_PARAMETER_SHIFT)
>> +
>>   #define RPM_CONFIG1        _MMIO(0x0D04)
>>   #define  GEN10_GT_NOA_ENABLE  (1 << 9)
>>   +/* GPM unit config (assuming Gen8+, documentation is fuzzy...) */
>> +#define GEN8_CTC_MODE            _MMIO(0xA26C)
>> +#define  GEN8_CTC_SOURCE_PARAMETER_MASK 1
>> +#define  GEN8_CTC_SOURCE_CRYSTAL_CLOCK    0
>> +#define  GEN8_CTC_SOURCE_DIVIDE_LOGIC    1
>> +#define  GEN8_CTC_SHIFT_PARAMETER_SHIFT    1
>> +#define  GEN8_CTC_SHIFT_PARAMETER_MASK    (0x3 << 
>> GEN8_CTC_SHIFT_PARAMETER_SHIFT)
>> +
>>   /* RPC unit config (Gen8+) */
>>   #define RPC_CONFIG        _MMIO(0x0D08)
>>   @@ -8865,6 +8880,12 @@ enum skl_power_gate {
>>   #define ILK_TIMESTAMP_HI    _MMIO(0x70070)
>>   #define IVB_TIMESTAMP_CTR    _MMIO(0x44070)
>>   +#define GEN8_TIMESTAMP_OVERRIDE                _MMIO(0x44074)
>> +#define  GEN8_TIMESTAMP_OVERRIDE_US_COUNTER_SHIFT        0
>> +#define  GEN8_TIMESTAMP_OVERRIDE_US_COUNTER_MASK        0x3ff
>> +#define GEN8_TIMESTAMP_OVERRIDE_US_COUNTER_DENOMINATOR_SHIFT    12
>> +#define  GEN8_TIMESTAMP_OVERRIDE_US_COUNTER_DENOMINATOR_MASK (0xf << 
>> 12)
>> +
>>   #define _PIPE_FRMTMSTMP_A        0x70048
>>   #define PIPE_FRMTMSTMP(pipe)        \
>>               _MMIO_PIPE2(pipe, _PIPE_FRMTMSTMP_A)
>> diff --git a/drivers/gpu/drm/i915/intel_device_info.c 
>> b/drivers/gpu/drm/i915/intel_device_info.c
>> index db03d179fc85..9b71a9b6d80e 100644
>> --- a/drivers/gpu/drm/i915/intel_device_info.c
>> +++ b/drivers/gpu/drm/i915/intel_device_info.c
>> @@ -329,6 +329,100 @@ static void broadwell_sseu_info_init(struct 
>> drm_i915_private *dev_priv)
>>       sseu->has_eu_pg = 0;
>>   }
>>   +static u64 read_timestamp_frequency_from_divide(struct 
>> drm_i915_private *dev_priv)
>> +{
>> +    u32 ts_override = I915_READ(GEN8_TIMESTAMP_OVERRIDE);
>> +    u64 base_freq, frac_freq;
>> +
>> +    base_freq = ((ts_override & 
>> GEN8_TIMESTAMP_OVERRIDE_US_COUNTER_MASK) >>
>> +             GEN8_TIMESTAMP_OVERRIDE_US_COUNTER_SHIFT) + 1;
>> +    base_freq *= 1000000;
>> +
>> +    frac_freq = ((ts_override &
>> + GEN8_TIMESTAMP_OVERRIDE_US_COUNTER_DENOMINATOR_MASK) >>
>> + GEN8_TIMESTAMP_OVERRIDE_US_COUNTER_DENOMINATOR_SHIFT);
>> +    if (frac_freq != 0)
>> +        frac_freq = 1000000 / (frac_freq + 1);
>> +
>> +    return base_freq + frac_freq;
>> +}
>> +
>> +static u64 read_timestamp_frequency(struct drm_i915_private *dev_priv)
>> +{
>> +    if (INTEL_GEN(dev_priv) <= 4) {
>> +        /* PRMs say:
>> +         *
>> +         *     "The value in this register increments once every 16
>> +         *      hclks." ("CLKCFG" register)
>> +         *
>> +         * Since dev_priv->rawclk_freq stores the value in kHz divided
>> +         * by 4, we just need to divide it again by 4.
>> +         */
>> +        return (dev_priv->rawclk_freq * 1000) / 4;
>> +    } else if (INTEL_GEN(dev_priv) <= 7) {
>> +        /* PRMs say:
>> +         *
>> +         *     "The PCU TSC counts 10ns increments; this timestamp
>> +         *      reflects bits 38:3 of the TSC (i.e. 80ns granularity,
>> +         *      rolling over every 1.5 hours).
>> +         */
>> +        return 12500000;
>> +    } else if (INTEL_GEN(dev_priv) <= 9) {
>> +        u32 ctc_reg = I915_READ(GEN8_CTC_MODE);
>> +        u64 freq = 0;
>> +
>> +        if ((ctc_reg & GEN8_CTC_SOURCE_PARAMETER_MASK) == 
>> GEN8_CTC_SOURCE_DIVIDE_LOGIC)
>> +            freq = read_timestamp_frequency_from_divide(dev_priv);
>> +        else
>> +            freq = IS_GEN9_LP(dev_priv) ? 19200000 : 24000000;
>> +
>> +        /* Now figure out how the command stream's timestamp register
>> +         * increments from this frequency (it might increment only
>> +         * every few clock cycle).
>> +         */
>> +        freq >>= 3 - ((ctc_reg & GEN8_CTC_SHIFT_PARAMETER_MASK) >>
>> +                  GEN8_CTC_SHIFT_PARAMETER_SHIFT);
>> +
>> +        return freq;
>> +    } else if (INTEL_GEN(dev_priv) <= 10) {
>> +        u32 ctc_reg = I915_READ(GEN8_CTC_MODE);
>> +        u64 freq = 0;
>> +        u32 rpm_config_reg = 0;
>> +
>> +        /* First figure out the reference frequency. There are 2 ways
>> +         * we can compute the frequency, either through the
>> +         * TIMESTAMP_OVERRIDE register or through CTC_MODE &
>> +         * RPM_CONFIG & CTC_MODE registers. CTC_MODE tells us which
>> +         * one we should use.
>> +         */
>> +        if ((ctc_reg & GEN8_CTC_SOURCE_PARAMETER_MASK) == 
>> GEN8_CTC_SOURCE_DIVIDE_LOGIC) {
>> +            freq = read_timestamp_frequency_from_divide(dev_priv);
>> +        } else {
>> +            u32 crystal_clock;
>> +
>> +            rpm_config_reg = I915_READ(RPM_CONFIG0);
>> +            crystal_clock = (rpm_config_reg &
>> +                     GEN9_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_MASK) >>
>> +                GEN9_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_SHIFT;
>> +            freq = crystal_clock == 
>> GEN9_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_19_2_MHZ ?
>> +                19200000 : 24000000;
>> +        }
>> +
>> +        /* Now figure out how the command stream's timestamp register
>> +         * increments from this frequency (it might increment only
>> +         * every few clock cycle).
>> +         */
>> +        freq >>= 3 - ((rpm_config_reg &
>> +                   GEN10_RPM_CONFIG0_CTC_SHIFT_PARAMETER_MASK) >>
>> +                  GEN10_RPM_CONFIG0_CTC_SHIFT_PARAMETER_SHIFT);
>> +
>> +        return freq;
>> +    }
>> +
>> +    DRM_ERROR("Unknown gen, unable to compute command stream 
>> timestamp frequency\n");
>> +    return 0;
>> +}
>> +
>>   /*
>>    * Determine various intel_device_info fields at runtime.
>>    *
>> @@ -450,6 +544,9 @@ void intel_device_info_runtime_init(struct 
>> drm_i915_private *dev_priv)
>>       else if (INTEL_GEN(dev_priv) >= 10)
>>           gen10_sseu_info_init(dev_priv);
>>   +    /* Initialize command stream timestamp frequency */
>> +    info->cs_timestamp_frequency = read_timestamp_frequency(dev_priv);
>> +
>>       DRM_DEBUG_DRIVER("slice mask: %04x\n", info->sseu.slice_mask);
>>       DRM_DEBUG_DRIVER("slice total: %u\n", 
>> hweight8(info->sseu.slice_mask));
>>       DRM_DEBUG_DRIVER("subslice total: %u\n",
>> @@ -465,4 +562,6 @@ void intel_device_info_runtime_init(struct 
>> drm_i915_private *dev_priv)
>>                info->sseu.has_subslice_pg ? "y" : "n");
>>       DRM_DEBUG_DRIVER("has EU power gating: %s\n",
>>                info->sseu.has_eu_pg ? "y" : "n");
>> +    DRM_DEBUG_DRIVER("CS timestamp frequency: %llu\n",
>> +             info->cs_timestamp_frequency);
>>   }
>> diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
>> index 125bde7d9504..c3ff0d4947af 100644
>> --- a/include/uapi/drm/i915_drm.h
>> +++ b/include/uapi/drm/i915_drm.h
>> @@ -450,6 +450,12 @@ typedef struct drm_i915_irq_wait {
>>    */
>>   #define I915_PARAM_HAS_EXEC_FENCE_ARRAY  49
>>   +/* Frequency of the command streamer timestamps given by the 
>> *_TIMESTAMP
>> + * registers. This used to be fixed per platform but from CNL 
>> onwards, this
>> + * might vary depending on the parts.
>> + */
>> +#define I915_PARAM_CS_TIMESTAMP_FREQUENCY   50
>> +
>>   typedef struct drm_i915_getparam {
>>       __s32 param;
>>       /*

* Re: [PATCH v2 6/9] drm/i915: expose command stream timestamp frequency to userspace
  2017-11-02 16:29 ` [PATCH v2 6/9] drm/i915: expose command stream timestamp frequency to userspace Lionel Landwerlin
                     ` (2 preceding siblings ...)
  2017-11-08 17:36   ` Lionel Landwerlin
@ 2017-11-09 11:58   ` Sagar Arun Kamble
  2017-11-09 14:06     ` Lionel Landwerlin
  3 siblings, 1 reply; 31+ messages in thread
From: Sagar Arun Kamble @ 2017-11-09 11:58 UTC (permalink / raw)
  To: Lionel Landwerlin, intel-gfx



On 11/2/2017 9:59 PM, Lionel Landwerlin wrote:
> We used to have this fixed per generation, but starting with CNL userspace
> cannot tell just off the PCI ID. Let's make this information available. This
> is particularly useful for performance monitoring where much of the
> normalization work is done using those timestamps (this includes pipeline
> statistics in both GL & Vulkan as well as OA reports).
>
> Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
> ---
>   drivers/gpu/drm/i915/i915_debugfs.c      |  2 +
>   drivers/gpu/drm/i915/i915_drv.c          |  3 +
>   drivers/gpu/drm/i915/i915_drv.h          |  2 +
>   drivers/gpu/drm/i915/i915_reg.h          | 21 +++++++
>   drivers/gpu/drm/i915/intel_device_info.c | 99 ++++++++++++++++++++++++++++++++
>   include/uapi/drm/i915_drm.h              |  6 ++
>   6 files changed, 133 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> index 39883cd915db..0897fd616a1f 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -3246,6 +3246,8 @@ static int i915_engine_info(struct seq_file *m, void *unused)
>   		   yesno(dev_priv->gt.awake));
>   	seq_printf(m, "Global active requests: %d\n",
>   		   dev_priv->gt.active_requests);
> +	seq_printf(m, "CS timestamp frequency: %llu\n",
> +		   dev_priv->info.cs_timestamp_frequency);
This should be accessed through INTEL_INFO.
How about adding "Hz" to the message?
>   
>   	p = drm_seq_file_printer(m);
>   	for_each_engine(engine, dev_priv, id)
> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> index e7e9e061073b..fdd23e79fb46 100644
> --- a/drivers/gpu/drm/i915/i915_drv.c
> +++ b/drivers/gpu/drm/i915/i915_drv.c
> @@ -416,6 +416,9 @@ static int i915_getparam(struct drm_device *dev, void *data,
>   		if (!value)
>   			return -ENODEV;
>   		break;
> +	case I915_PARAM_CS_TIMESTAMP_FREQUENCY:
> +		value = INTEL_INFO(dev_priv)->cs_timestamp_frequency;
We lose precision here. Can we make cs_timestamp_frequency a u32?
> +		break;
>   	default:
>   		DRM_DEBUG("Unknown parameter %d\n", param->param);
>   		return -EINVAL;
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 6cb7cd7f9420..4e804aaeaae1 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -886,6 +886,8 @@ struct intel_device_info {
>   	/* Slice/subslice/EU info */
>   	struct sseu_dev_info sseu;
>   
> +	uint64_t cs_timestamp_frequency;
> +
s/uint64_t/u64 - (Chris had suggested earlier)
>   	struct color_luts {
>   		u16 degamma_lut_size;
>   		u16 gamma_lut_size;
> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> index a2223f01ee2a..f392f28f2cfa 100644
> --- a/drivers/gpu/drm/i915/i915_reg.h
> +++ b/drivers/gpu/drm/i915/i915_reg.h
> @@ -1119,9 +1119,24 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg)
>   
>   /* RPM unit config (Gen8+) */
>   #define RPM_CONFIG0	    _MMIO(0x0D00)
> +#define  GEN9_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_SHIFT	3
> +#define  GEN9_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_MASK	(1 << GEN9_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_SHIFT)
> +#define  GEN9_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_19_2_MHZ	0
> +#define  GEN9_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_24_MHZ	1
> +#define  GEN10_RPM_CONFIG0_CTC_SHIFT_PARAMETER_SHIFT	1
> +#define  GEN10_RPM_CONFIG0_CTC_SHIFT_PARAMETER_MASK	(0x3 << GEN10_RPM_CONFIG0_CTC_SHIFT_PARAMETER_SHIFT)
> +
>   #define RPM_CONFIG1	    _MMIO(0x0D04)
>   #define  GEN10_GT_NOA_ENABLE  (1 << 9)
>   
> +/* GPM unit config (assuming Gen8+, documentation is fuzzy...) */
> +#define GEN8_CTC_MODE			_MMIO(0xA26C)
> +#define  GEN8_CTC_SOURCE_PARAMETER_MASK 1
> +#define  GEN8_CTC_SOURCE_CRYSTAL_CLOCK	0
> +#define  GEN8_CTC_SOURCE_DIVIDE_LOGIC	1
> +#define  GEN8_CTC_SHIFT_PARAMETER_SHIFT	1
> +#define  GEN8_CTC_SHIFT_PARAMETER_MASK	(0x3 << GEN8_CTC_SHIFT_PARAMETER_SHIFT)
> +
>   /* RPC unit config (Gen8+) */
>   #define RPC_CONFIG	    _MMIO(0x0D08)
>   
> @@ -8865,6 +8880,12 @@ enum skl_power_gate {
>   #define ILK_TIMESTAMP_HI	_MMIO(0x70070)
>   #define IVB_TIMESTAMP_CTR	_MMIO(0x44070)
>   
> +#define GEN8_TIMESTAMP_OVERRIDE				_MMIO(0x44074)
> +#define  GEN8_TIMESTAMP_OVERRIDE_US_COUNTER_SHIFT		0
> +#define  GEN8_TIMESTAMP_OVERRIDE_US_COUNTER_MASK		0x3ff
US_COUNTER_DIVIDER_MASK?
> +#define  GEN8_TIMESTAMP_OVERRIDE_US_COUNTER_DENOMINATOR_SHIFT	12
> +#define  GEN8_TIMESTAMP_OVERRIDE_US_COUNTER_DENOMINATOR_MASK	(0xf << 12)
> +
>   #define _PIPE_FRMTMSTMP_A		0x70048
>   #define PIPE_FRMTMSTMP(pipe)		\
>   			_MMIO_PIPE2(pipe, _PIPE_FRMTMSTMP_A)
> diff --git a/drivers/gpu/drm/i915/intel_device_info.c b/drivers/gpu/drm/i915/intel_device_info.c
> index db03d179fc85..9b71a9b6d80e 100644
> --- a/drivers/gpu/drm/i915/intel_device_info.c
> +++ b/drivers/gpu/drm/i915/intel_device_info.c
> @@ -329,6 +329,100 @@ static void broadwell_sseu_info_init(struct drm_i915_private *dev_priv)
>   	sseu->has_eu_pg = 0;
>   }
>   
> +static u64 read_timestamp_frequency_from_divide(struct drm_i915_private *dev_priv)
Should this be named read_reference_ts_freq?
> +{
> +	u32 ts_override = I915_READ(GEN8_TIMESTAMP_OVERRIDE);
> +	u64 base_freq, frac_freq;
> +
> +	base_freq = ((ts_override & GEN8_TIMESTAMP_OVERRIDE_US_COUNTER_MASK) >>
> +		     GEN8_TIMESTAMP_OVERRIDE_US_COUNTER_SHIFT) + 1;
> +	base_freq *= 1000000;
> +
> +	frac_freq = ((ts_override &
> +		      GEN8_TIMESTAMP_OVERRIDE_US_COUNTER_DENOMINATOR_MASK) >>
> +		     GEN8_TIMESTAMP_OVERRIDE_US_COUNTER_DENOMINATOR_SHIFT);
> +	if (frac_freq != 0)
> +		frac_freq = 1000000 / (frac_freq + 1);
Not considering numerator?
> +
> +	return base_freq + frac_freq;
> +}
> +
> +static u64 read_timestamp_frequency(struct drm_i915_private *dev_priv)
> +{
> +	if (INTEL_GEN(dev_priv) <= 4) {
> +		/* PRMs say:
> +		 *
> +		 *     "The value in this register increments once every 16
> +		 *      hclks." ("CLKCFG" register)
> +		 *
> +		 * Since dev_priv->rawclk_freq stores the value in kHz divided
> +		 * by 4, we just need to divide it again by 4.
> +		 */
I read this as: hclk is 1/4 of the FSB clock and the timestamp is 1/16 of
hclk, so this divisor should be 16.
> +		return (dev_priv->rawclk_freq * 1000) / 4;
> +	} else if (INTEL_GEN(dev_priv) <= 7) {
> +		/* PRMs say:
> +		 *
> +		 *     "The PCU TSC counts 10ns increments; this timestamp
> +		 *      reflects bits 38:3 of the TSC (i.e. 80ns granularity,
> +		 *      rolling over every 1.5 hours).
> +		 */
> +		return 12500000;
> +	} else if (INTEL_GEN(dev_priv) <= 9) {
> +		u32 ctc_reg = I915_READ(GEN8_CTC_MODE);
> +		u64 freq = 0;
> +
> +		if ((ctc_reg & GEN8_CTC_SOURCE_PARAMETER_MASK) == GEN8_CTC_SOURCE_DIVIDE_LOGIC)
> +			freq = read_timestamp_frequency_from_divide(dev_priv);
> +		else
> +			freq = IS_GEN9_LP(dev_priv) ? 19200000 : 24000000;
> +
> +		/* Now figure out how the command stream's timestamp register
> +		 * increments from this frequency (it might increment only
> +		 * every few clock cycle).
> +		 */
> +		freq >>= 3 - ((ctc_reg & GEN8_CTC_SHIFT_PARAMETER_MASK) >>
> +			      GEN8_CTC_SHIFT_PARAMETER_SHIFT);
The Gen8 documentation is indeed fuzzy. Do we get 12.5 MHz after this
shift? The documentation says the timestamp has an 80 ns base.
> +
> +		return freq;
> +	} else if (INTEL_GEN(dev_priv) <= 10) {
> +		u32 ctc_reg = I915_READ(GEN8_CTC_MODE);
> +		u64 freq = 0;
> +		u32 rpm_config_reg = 0;
> +
> +		/* First figure out the reference frequency. There are 2 ways
> +		 * we can compute the frequency, either through the
> +		 * TIMESTAMP_OVERRIDE register or through CTC_MODE &
Remove CTC_MODE as it does not itself determine the frequency.
> +		 * RPM_CONFIG & CTC_MODE registers. CTC_MODE tells us which
> +		 * one we should use.
> +		 */
> +		if ((ctc_reg & GEN8_CTC_SOURCE_PARAMETER_MASK) == GEN8_CTC_SOURCE_DIVIDE_LOGIC) {
> +			freq = read_timestamp_frequency_from_divide(dev_priv);
> +		} else {
> +			u32 crystal_clock;
> +
> +			rpm_config_reg = I915_READ(RPM_CONFIG0);
> +			crystal_clock = (rpm_config_reg &
> +					 GEN9_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_MASK) >>
> +				GEN9_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_SHIFT;
> +			freq = crystal_clock == GEN9_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_19_2_MHZ ?
> +				19200000 : 24000000;
A switch statement would be better, I guess.
> +		}
> +
> +		/* Now figure out how the command stream's timestamp register
> +		 * increments from this frequency (it might increment only
> +		 * every few clock cycle).
> +		 */
> +		freq >>= 3 - ((rpm_config_reg &
> +			       GEN10_RPM_CONFIG0_CTC_SHIFT_PARAMETER_MASK) >>
> +			      GEN10_RPM_CONFIG0_CTC_SHIFT_PARAMETER_SHIFT);
> +
> +		return freq;
> +	}
> +
> +	DRM_ERROR("Unknown gen, unable to compute command stream timestamp frequency\n");
> +	return 0;
> +}
> +
>   /*
>    * Determine various intel_device_info fields at runtime.
>    *
> @@ -450,6 +544,9 @@ void intel_device_info_runtime_init(struct drm_i915_private *dev_priv)
>   	else if (INTEL_GEN(dev_priv) >= 10)
>   		gen10_sseu_info_init(dev_priv);
>   
> +	/* Initialize command stream timestamp frequency */
> +	info->cs_timestamp_frequency = read_timestamp_frequency(dev_priv);
> +
>   	DRM_DEBUG_DRIVER("slice mask: %04x\n", info->sseu.slice_mask);
>   	DRM_DEBUG_DRIVER("slice total: %u\n", hweight8(info->sseu.slice_mask));
>   	DRM_DEBUG_DRIVER("subslice total: %u\n",
> @@ -465,4 +562,6 @@ void intel_device_info_runtime_init(struct drm_i915_private *dev_priv)
>   			 info->sseu.has_subslice_pg ? "y" : "n");
>   	DRM_DEBUG_DRIVER("has EU power gating: %s\n",
>   			 info->sseu.has_eu_pg ? "y" : "n");
> +	DRM_DEBUG_DRIVER("CS timestamp frequency: %llu\n",
> +			 info->cs_timestamp_frequency);
>   }
> diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
> index 125bde7d9504..c3ff0d4947af 100644
> --- a/include/uapi/drm/i915_drm.h
> +++ b/include/uapi/drm/i915_drm.h
> @@ -450,6 +450,12 @@ typedef struct drm_i915_irq_wait {
>    */
>   #define I915_PARAM_HAS_EXEC_FENCE_ARRAY  49
>   
> +/* Frequency of the command streamer timestamps given by the *_TIMESTAMP
> + * registers. This used to be fixed per platform but from CNL onwards, this
> + * might vary depending on the parts.
> + */
> +#define I915_PARAM_CS_TIMESTAMP_FREQUENCY   50
> +
>   typedef struct drm_i915_getparam {
>   	__s32 param;
>   	/*
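
To make the shift question above concrete, here is a small standalone sketch
of the "freq >>= 3 - shift" step from the patch. The helper name
cs_timestamp_freq and the sample inputs are illustrative assumptions, not
driver code and not values read from hardware.

```c
#include <stdint.h>

/* Hypothetical helper mirroring the patch's "freq >>= 3 - shift" step.
 * ref_hz is the reference clock in Hz; shift_param is the 2-bit CTC
 * shift parameter, masked here so the shift count stays in 0..3. */
static uint64_t cs_timestamp_freq(uint64_t ref_hz, uint32_t shift_param)
{
	return ref_hz >> (3 - (shift_param & 0x3));
}
```

Under these assumptions a 24 MHz reference gives 3, 6, 12 or 24 MHz for shift
parameters 0 to 3, and a 19.2 MHz reference gives 2.4, 4.8, 9.6 or 19.2 MHz.
None of these is exactly 12.5 MHz; 12 MHz (about 83.3 ns per tick) is the
closest.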


* Re: [PATCH v2 6/9] drm/i915: expose command stream timestamp frequency to userspace
  2017-11-09 11:58   ` Sagar Arun Kamble
@ 2017-11-09 14:06     ` Lionel Landwerlin
  2017-11-09 14:13       ` Lionel Landwerlin
  2017-11-09 16:37       ` Sagar Arun Kamble
  0 siblings, 2 replies; 31+ messages in thread
From: Lionel Landwerlin @ 2017-11-09 14:06 UTC (permalink / raw)
  To: Sagar Arun Kamble, intel-gfx

On 09/11/17 11:58, Sagar Arun Kamble wrote:
>
>
> On 11/2/2017 9:59 PM, Lionel Landwerlin wrote:
>> We used to have this fixed per generation, but starting with CNL userspace
>> cannot tell just off the PCI ID. Let's make this information available. This
>> is particularly useful for performance monitoring where much of the
>> normalization work is done using those timestamps (this includes pipeline
>> statistics in both GL & Vulkan as well as OA reports).
>>
>> Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
>> ---
>>   drivers/gpu/drm/i915/i915_debugfs.c      |  2 +
>>   drivers/gpu/drm/i915/i915_drv.c          |  3 +
>>   drivers/gpu/drm/i915/i915_drv.h          |  2 +
>>   drivers/gpu/drm/i915/i915_reg.h          | 21 +++++++
>>   drivers/gpu/drm/i915/intel_device_info.c | 99 
>> ++++++++++++++++++++++++++++++++
>>   include/uapi/drm/i915_drm.h              |  6 ++
>>   6 files changed, 133 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c 
>> b/drivers/gpu/drm/i915/i915_debugfs.c
>> index 39883cd915db..0897fd616a1f 100644
>> --- a/drivers/gpu/drm/i915/i915_debugfs.c
>> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
>> @@ -3246,6 +3246,8 @@ static int i915_engine_info(struct seq_file *m, 
>> void *unused)
>>              yesno(dev_priv->gt.awake));
>>       seq_printf(m, "Global active requests: %d\n",
>>              dev_priv->gt.active_requests);
>> +    seq_printf(m, "CS timestamp frequency: %llu\n",
>> +           dev_priv->info.cs_timestamp_frequency);
> This should be accessed through INTEL_INFO.
> How about adding "Hz" to the message?

Done.

>>         p = drm_seq_file_printer(m);
>>       for_each_engine(engine, dev_priv, id)
>> diff --git a/drivers/gpu/drm/i915/i915_drv.c 
>> b/drivers/gpu/drm/i915/i915_drv.c
>> index e7e9e061073b..fdd23e79fb46 100644
>> --- a/drivers/gpu/drm/i915/i915_drv.c
>> +++ b/drivers/gpu/drm/i915/i915_drv.c
>> @@ -416,6 +416,9 @@ static int i915_getparam(struct drm_device *dev, 
>> void *data,
>>           if (!value)
>>               return -ENODEV;
>>           break;
>> +    case I915_PARAM_CS_TIMESTAMP_FREQUENCY:
>> +        value = INTEL_INFO(dev_priv)->cs_timestamp_frequency;
> We lose precision here. Can we make cs_timestamp_frequency a u32?

Yeah, I'm not super happy about the int * of getparam.
INT_MAX limits us to ~2 GHz, which I don't think we'll ever reach.
Do you agree? Do you think we need to handle bigger values?
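
As a rough illustration of that limit, the sketch below checks whether a
64-bit frequency still fits in the int that getparam returns. The helper name
fits_getparam_int is hypothetical and the check assumes a 32-bit int; it is
not part of the patch.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical check: does a u64 frequency in Hz survive being stored
 * in the int that getparam hands back to userspace? */
static bool fits_getparam_int(uint64_t freq_hz)
{
	return freq_hz <= (uint64_t)INT_MAX;
}
```

The frequencies mentioned in the patch (12.5 MHz, 19.2 MHz, 24 MHz) all pass
this check with a wide margin; a value such as 3 GHz would not.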


>> +        break;
>>       default:
>>           DRM_DEBUG("Unknown parameter %d\n", param->param);
>>           return -EINVAL;
>> diff --git a/drivers/gpu/drm/i915/i915_drv.h 
>> b/drivers/gpu/drm/i915/i915_drv.h
>> index 6cb7cd7f9420..4e804aaeaae1 100644
>> --- a/drivers/gpu/drm/i915/i915_drv.h
>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>> @@ -886,6 +886,8 @@ struct intel_device_info {
>>       /* Slice/subslice/EU info */
>>       struct sseu_dev_info sseu;
>>   +    uint64_t cs_timestamp_frequency;
>> +
> s/uint64_t/u64 - (Chris had suggested earlier)

Done.

>>       struct color_luts {
>>           u16 degamma_lut_size;
>>           u16 gamma_lut_size;
>> diff --git a/drivers/gpu/drm/i915/i915_reg.h 
>> b/drivers/gpu/drm/i915/i915_reg.h
>> index a2223f01ee2a..f392f28f2cfa 100644
>> --- a/drivers/gpu/drm/i915/i915_reg.h
>> +++ b/drivers/gpu/drm/i915/i915_reg.h
>> @@ -1119,9 +1119,24 @@ static inline bool 
>> i915_mmio_reg_valid(i915_reg_t reg)
>>     /* RPM unit config (Gen8+) */
>>   #define RPM_CONFIG0        _MMIO(0x0D00)
>> +#define  GEN9_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_SHIFT    3
>> +#define  GEN9_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_MASK    (1 << 
>> GEN9_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_SHIFT)
>> +#define  GEN9_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_19_2_MHZ    0
>> +#define  GEN9_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_24_MHZ    1
>> +#define  GEN10_RPM_CONFIG0_CTC_SHIFT_PARAMETER_SHIFT    1
>> +#define  GEN10_RPM_CONFIG0_CTC_SHIFT_PARAMETER_MASK    (0x3 << 
>> GEN10_RPM_CONFIG0_CTC_SHIFT_PARAMETER_SHIFT)
>> +
>>   #define RPM_CONFIG1        _MMIO(0x0D04)
>>   #define  GEN10_GT_NOA_ENABLE  (1 << 9)
>>   +/* GPM unit config (assuming Gen8+, documentation is fuzzy...) */
>> +#define GEN8_CTC_MODE            _MMIO(0xA26C)
>> +#define  GEN8_CTC_SOURCE_PARAMETER_MASK 1
>> +#define  GEN8_CTC_SOURCE_CRYSTAL_CLOCK    0
>> +#define  GEN8_CTC_SOURCE_DIVIDE_LOGIC    1
>> +#define  GEN8_CTC_SHIFT_PARAMETER_SHIFT    1
>> +#define  GEN8_CTC_SHIFT_PARAMETER_MASK    (0x3 << 
>> GEN8_CTC_SHIFT_PARAMETER_SHIFT)
>> +
>>   /* RPC unit config (Gen8+) */
>>   #define RPC_CONFIG        _MMIO(0x0D08)
>>   @@ -8865,6 +8880,12 @@ enum skl_power_gate {
>>   #define ILK_TIMESTAMP_HI    _MMIO(0x70070)
>>   #define IVB_TIMESTAMP_CTR    _MMIO(0x44070)
>>   +#define GEN8_TIMESTAMP_OVERRIDE                _MMIO(0x44074)
>> +#define  GEN8_TIMESTAMP_OVERRIDE_US_COUNTER_SHIFT        0
>> +#define  GEN8_TIMESTAMP_OVERRIDE_US_COUNTER_MASK        0x3ff
> US_COUNTER_DIVIDER_MASK?

Sure, I thought it was just a bit too long :)

>> +#define GEN8_TIMESTAMP_OVERRIDE_US_COUNTER_DENOMINATOR_SHIFT    12
>> +#define  GEN8_TIMESTAMP_OVERRIDE_US_COUNTER_DENOMINATOR_MASK (0xf << 
>> 12)
>> +
>>   #define _PIPE_FRMTMSTMP_A        0x70048
>>   #define PIPE_FRMTMSTMP(pipe)        \
>>               _MMIO_PIPE2(pipe, _PIPE_FRMTMSTMP_A)
>> diff --git a/drivers/gpu/drm/i915/intel_device_info.c 
>> b/drivers/gpu/drm/i915/intel_device_info.c
>> index db03d179fc85..9b71a9b6d80e 100644
>> --- a/drivers/gpu/drm/i915/intel_device_info.c
>> +++ b/drivers/gpu/drm/i915/intel_device_info.c
>> @@ -329,6 +329,100 @@ static void broadwell_sseu_info_init(struct 
>> drm_i915_private *dev_priv)
>>       sseu->has_eu_pg = 0;
>>   }
>>   +static u64 read_timestamp_frequency_from_divide(struct 
>> drm_i915_private *dev_priv)
> Should this be named read_reference_ts_freq?

Yes, thanks!

>> +{
>> +    u32 ts_override = I915_READ(GEN8_TIMESTAMP_OVERRIDE);
>> +    u64 base_freq, frac_freq;
>> +
>> +    base_freq = ((ts_override & 
>> GEN8_TIMESTAMP_OVERRIDE_US_COUNTER_MASK) >>
>> +             GEN8_TIMESTAMP_OVERRIDE_US_COUNTER_SHIFT) + 1;
>> +    base_freq *= 1000000;
>> +
>> +    frac_freq = ((ts_override &
>> + GEN8_TIMESTAMP_OVERRIDE_US_COUNTER_DENOMINATOR_MASK) >>
>> + GEN8_TIMESTAMP_OVERRIDE_US_COUNTER_DENOMINATOR_SHIFT);
>> +    if (frac_freq != 0)
>> +        frac_freq = 1000000 / (frac_freq + 1);
> Not considering numerator?

The documentation is quite terrible, but my reading is that the 
numerator doesn't apply to any current generation.
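
As a rough illustration only (not part of the patch), the decoding done by read_timestamp_frequency_from_divide() can be sketched as a standalone function. The field layout is copied from the i915_reg.h hunk above; the helper name ts_override_to_hz and the shortened macro names are made up for this sketch, and it takes a raw value instead of doing an MMIO read. Like the patch, it ignores the numerator field.

```c
#include <stdint.h>

/* Field layout copied from the patch's GEN8_TIMESTAMP_OVERRIDE defines. */
#define US_COUNTER_SHIFT              0
#define US_COUNTER_MASK               0x3ff
#define US_COUNTER_DENOMINATOR_SHIFT  12
#define US_COUNTER_DENOMINATOR_MASK   (0xf << 12)

/*
 * Hypothetical helper: same arithmetic as the patch, applied to a raw
 * register value so that it can run outside the kernel.
 */
static uint64_t ts_override_to_hz(uint32_t ts_override)
{
	uint64_t base_freq, frac_freq;

	/* Integer part: (divider field + 1) MHz. */
	base_freq = ((ts_override & US_COUNTER_MASK) >> US_COUNTER_SHIFT) + 1;
	base_freq *= 1000000;

	/* Fractional part: 1 MHz / (denominator field + 1), if non-zero. */
	frac_freq = (ts_override & US_COUNTER_DENOMINATOR_MASK) >>
		    US_COUNTER_DENOMINATOR_SHIFT;
	if (frac_freq != 0)
		frac_freq = 1000000 / (frac_freq + 1);

	return base_freq + frac_freq;
}
```

With made-up register values, ts_override_to_hz(23) gives 24000000 and ts_override_to_hz(18 | (4 << 12)) gives 19200000.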

>> +
>> +    return base_freq + frac_freq;
>> +}
>> +
>> +static u64 read_timestamp_frequency(struct drm_i915_private *dev_priv)
>> +{
>> +    if (INTEL_GEN(dev_priv) <= 4) {
>> +        /* PRMs say:
>> +         *
>> +         *     "The value in this register increments once every 16
>> +         *      hclks." ("CLKCFG" register)
>> +         *
>> +         * Since dev_priv->rawclk_freq stores the value in kHz divided
>> +         * by 4, we just need to divide it again by 4.
>> +         */
> I read this as hclk is 1/4th fsb clock and timestamp is 1/16 of hclk 
> so this should be 16.

You're right, but as the comment above explains, rawclk_freq is already 
hclk / 4.
Another / 4 gives us / 16.

>> +        return (dev_priv->rawclk_freq * 1000) / 4;
>> +    } else if (INTEL_GEN(dev_priv) <= 7) {
>> +        /* PRMs say:
>> +         *
>> +         *     "The PCU TSC counts 10ns increments; this timestamp
>> +         *      reflects bits 38:3 of the TSC (i.e. 80ns granularity,
>> +         *      rolling over every 1.5 hours).
>> +         */
>> +        return 12500000;
>> +    } else if (INTEL_GEN(dev_priv) <= 9) {
>> +        u32 ctc_reg = I915_READ(GEN8_CTC_MODE);
>> +        u64 freq = 0;
>> +
>> +        if ((ctc_reg & GEN8_CTC_SOURCE_PARAMETER_MASK) == 
>> GEN8_CTC_SOURCE_DIVIDE_LOGIC)
>> +            freq = read_timestamp_frequency_from_divide(dev_priv);
>> +        else
>> +            freq = IS_GEN9_LP(dev_priv) ? 19200000 : 24000000;
>> +
>> +        /* Now figure out how the command stream's timestamp register
>> +         * increments from this frequency (it might increment only
>> +         * every few clock cycle).
>> +         */
>> +        freq >>= 3 - ((ctc_reg & GEN8_CTC_SHIFT_PARAMETER_MASK) >>
>> +                  GEN8_CTC_SHIFT_PARAMETER_SHIFT);
> Gen8 documentation is indeed fuzzy. Are we getting 12.5mhz after this 
> shift as doc says it to have 80ns base.
>> +
>> +        return freq;
>> +    } else if (INTEL_GEN(dev_priv) <= 10) {
>> +        u32 ctc_reg = I915_READ(GEN8_CTC_MODE);
>> +        u64 freq = 0;
>> +        u32 rpm_config_reg = 0;
>> +
>> +        /* First figure out the reference frequency. There are 2 ways
>> +         * we can compute the frequency, either through the
>> +         * TIMESTAMP_OVERRIDE register or through CTC_MODE &
> Remove CTC_MODE as it does not itself determine the frequency.

Done, thanks.

>> +         * RPM_CONFIG & CTC_MODE registers. CTC_MODE tells us which
>> +         * one we should use.
>> +         */
>> +        if ((ctc_reg & GEN8_CTC_SOURCE_PARAMETER_MASK) == 
>> GEN8_CTC_SOURCE_DIVIDE_LOGIC) {
>> +            freq = read_timestamp_frequency_from_divide(dev_priv);
>> +        } else {
>> +            u32 crystal_clock;
>> +
>> +            rpm_config_reg = I915_READ(RPM_CONFIG0);
>> +            crystal_clock = (rpm_config_reg &
>> +                     GEN9_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_MASK) >>
>> +                GEN9_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_SHIFT;
>> +            freq = crystal_clock == 
>> GEN9_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_19_2_MHZ ?
>> +                19200000 : 24000000;
> switch case would be better i guess.

Done.

>> +        }
>> +
>> +        /* Now figure out how the command stream's timestamp register
>> +         * increments from this frequency (it might increment only
>> +         * every few clock cycle).
>> +         */
>> +        freq >>= 3 - ((rpm_config_reg &
>> +                   GEN10_RPM_CONFIG0_CTC_SHIFT_PARAMETER_MASK) >>
>> +                  GEN10_RPM_CONFIG0_CTC_SHIFT_PARAMETER_SHIFT);
>> +
>> +        return freq;
>> +    }
>> +
>> +    DRM_ERROR("Unknown gen, unable to compute command stream 
>> timestamp frequency\n");
>> +    return 0;
>> +}
>> +
>>   /*
>>    * Determine various intel_device_info fields at runtime.
>>    *
>> @@ -450,6 +544,9 @@ void intel_device_info_runtime_init(struct 
>> drm_i915_private *dev_priv)
>>       else if (INTEL_GEN(dev_priv) >= 10)
>>           gen10_sseu_info_init(dev_priv);
>>   +    /* Initialize command stream timestamp frequency */
>> +    info->cs_timestamp_frequency = read_timestamp_frequency(dev_priv);
>> +
>>       DRM_DEBUG_DRIVER("slice mask: %04x\n", info->sseu.slice_mask);
>>       DRM_DEBUG_DRIVER("slice total: %u\n", 
>> hweight8(info->sseu.slice_mask));
>>       DRM_DEBUG_DRIVER("subslice total: %u\n",
>> @@ -465,4 +562,6 @@ void intel_device_info_runtime_init(struct 
>> drm_i915_private *dev_priv)
>>                info->sseu.has_subslice_pg ? "y" : "n");
>>       DRM_DEBUG_DRIVER("has EU power gating: %s\n",
>>                info->sseu.has_eu_pg ? "y" : "n");
>> +    DRM_DEBUG_DRIVER("CS timestamp frequency: %llu\n",
>> +             info->cs_timestamp_frequency);
>>   }
>> diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
>> index 125bde7d9504..c3ff0d4947af 100644
>> --- a/include/uapi/drm/i915_drm.h
>> +++ b/include/uapi/drm/i915_drm.h
>> @@ -450,6 +450,12 @@ typedef struct drm_i915_irq_wait {
>>    */
>>   #define I915_PARAM_HAS_EXEC_FENCE_ARRAY  49
>>   +/* Frequency of the command streamer timestamps given by the 
>> *_TIMESTAMP
>> + * registers. This used to be fixed per platform but from CNL 
>> onwards, this
>> + * might vary depending on the parts.
>> + */
>> +#define I915_PARAM_CS_TIMESTAMP_FREQUENCY   50
>> +
>>   typedef struct drm_i915_getparam {
>>       __s32 param;
>>       /*
>
>

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


* Re: [PATCH v2 6/9] drm/i915: expose command stream timestamp frequency to userspace
  2017-11-09 14:06     ` Lionel Landwerlin
@ 2017-11-09 14:13       ` Lionel Landwerlin
  2017-11-09 17:44         ` Lionel Landwerlin
  2017-11-09 16:37       ` Sagar Arun Kamble
  1 sibling, 1 reply; 31+ messages in thread
From: Lionel Landwerlin @ 2017-11-09 14:13 UTC (permalink / raw)
  To: Sagar Arun Kamble, intel-gfx



On 09/11/17 14:06, Lionel Landwerlin wrote:
>>
>> +    } else if (INTEL_GEN(dev_priv) <= 9) {
>> +        u32 ctc_reg = I915_READ(GEN8_CTC_MODE);
>> +        u64 freq = 0;
>> +
>> +        if ((ctc_reg & GEN8_CTC_SOURCE_PARAMETER_MASK) == 
>> GEN8_CTC_SOURCE_DIVIDE_LOGIC)
>> +            freq = read_timestamp_frequency_from_divide(dev_priv);
>> +        else
>> +            freq = IS_GEN9_LP(dev_priv) ? 19200000 : 24000000;
>> +
>> +        /* Now figure out how the command stream's timestamp register
>> +         * increments from this frequency (it might increment only
>> +         * every few clock cycle).
>> +         */
>> +        freq >>= 3 - ((ctc_reg & GEN8_CTC_SHIFT_PARAMETER_MASK) >>
>> +                  GEN8_CTC_SHIFT_PARAMETER_SHIFT);
> Gen8 documentation is indeed fuzzy. Are we getting 12.5mhz after this 
> shift as doc says it to have 80ns base. 
Forgot to answer that point. Let me check this on BDW again.
But yes, the idea is that we should get 12.5MHz on BDW.


* Re: [PATCH v2 6/9] drm/i915: expose command stream timestamp frequency to userspace
  2017-11-09 14:06     ` Lionel Landwerlin
  2017-11-09 14:13       ` Lionel Landwerlin
@ 2017-11-09 16:37       ` Sagar Arun Kamble
  1 sibling, 0 replies; 31+ messages in thread
From: Sagar Arun Kamble @ 2017-11-09 16:37 UTC (permalink / raw)
  To: Lionel Landwerlin, intel-gfx



On 11/9/2017 7:36 PM, Lionel Landwerlin wrote:
> On 09/11/17 11:58, Sagar Arun Kamble wrote:
>>
>>
>> On 11/2/2017 9:59 PM, Lionel Landwerlin wrote:
>>> We use to have this fixed per generation, but starting with CNL 
>>> userspace
>>> cannot tell just off the PCI ID. Let's make this information 
>>> available. This
>>> is particularly useful for performance monitoring where much of the
>>> normalization work is done using those timestamps (this include 
>>> pipeline
>>> statistics in both GL & Vulkan as well as OA reports).
>>>
>>> Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
>>> ---
>>>   drivers/gpu/drm/i915/i915_debugfs.c      |  2 +
>>>   drivers/gpu/drm/i915/i915_drv.c          |  3 +
>>>   drivers/gpu/drm/i915/i915_drv.h          |  2 +
>>>   drivers/gpu/drm/i915/i915_reg.h          | 21 +++++++
>>>   drivers/gpu/drm/i915/intel_device_info.c | 99 
>>> ++++++++++++++++++++++++++++++++
>>>   include/uapi/drm/i915_drm.h              |  6 ++
>>>   6 files changed, 133 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c 
>>> b/drivers/gpu/drm/i915/i915_debugfs.c
>>> index 39883cd915db..0897fd616a1f 100644
>>> --- a/drivers/gpu/drm/i915/i915_debugfs.c
>>> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
>>> @@ -3246,6 +3246,8 @@ static int i915_engine_info(struct seq_file 
>>> *m, void *unused)
>>>              yesno(dev_priv->gt.awake));
>>>       seq_printf(m, "Global active requests: %d\n",
>>>              dev_priv->gt.active_requests);
>>> +    seq_printf(m, "CS timestamp frequency: %llu\n",
>>> +           dev_priv->info.cs_timestamp_frequency);
>> should be accessed through INTEL_INFO
>> How about adding "Hz" to message
>
> Done.
>
>>>         p = drm_seq_file_printer(m);
>>>       for_each_engine(engine, dev_priv, id)
>>> diff --git a/drivers/gpu/drm/i915/i915_drv.c 
>>> b/drivers/gpu/drm/i915/i915_drv.c
>>> index e7e9e061073b..fdd23e79fb46 100644
>>> --- a/drivers/gpu/drm/i915/i915_drv.c
>>> +++ b/drivers/gpu/drm/i915/i915_drv.c
>>> @@ -416,6 +416,9 @@ static int i915_getparam(struct drm_device *dev, 
>>> void *data,
>>>           if (!value)
>>>               return -ENODEV;
>>>           break;
>>> +    case I915_PARAM_CS_TIMESTAMP_FREQUENCY:
>>> +        value = INTEL_INFO(dev_priv)->cs_timestamp_frequency;
>> losing the precision here. can we make cs_timestamp_frequency u32?
>
> Yeah, I'm not super happy about the int* of getparam.
> MAX_INT limits us up to ~2GHz, which I don't think we'll ever reach.
> Do you agree? Do you think we need to handle bigger values?
>
Yes. Agree on making this int.
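
A small sketch of the range concern discussed here, assuming a platform where int is 32 bits: a frequency fits the int-sized getparam value as long as it does not exceed INT_MAX (about 2.147 GHz). The helper name freq_fits_getparam is hypothetical and does not exist in the driver.

```c
#include <limits.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns 1 if a frequency in Hz can be reported
 * through an int-sized getparam value without truncation, 0 otherwise.
 */
static int freq_fits_getparam(uint64_t freq_hz)
{
	return freq_hz <= (uint64_t)INT_MAX;
}
```

The 12.5 MHz, 19.2 MHz and 24 MHz values mentioned in this patch are all far below that limit.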
>
>>> +        break;
>>>       default:
>>>           DRM_DEBUG("Unknown parameter %d\n", param->param);
>>>           return -EINVAL;
>>> diff --git a/drivers/gpu/drm/i915/i915_drv.h 
>>> b/drivers/gpu/drm/i915/i915_drv.h
>>> index 6cb7cd7f9420..4e804aaeaae1 100644
>>> --- a/drivers/gpu/drm/i915/i915_drv.h
>>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>>> @@ -886,6 +886,8 @@ struct intel_device_info {
>>>       /* Slice/subslice/EU info */
>>>       struct sseu_dev_info sseu;
>>>   +    uint64_t cs_timestamp_frequency;
>>> +
>> s/uint64_t/u64 - (Chris had suggested earlier)
>
> Done.
>
>>>       struct color_luts {
>>>           u16 degamma_lut_size;
>>>           u16 gamma_lut_size;
>>> diff --git a/drivers/gpu/drm/i915/i915_reg.h 
>>> b/drivers/gpu/drm/i915/i915_reg.h
>>> index a2223f01ee2a..f392f28f2cfa 100644
>>> --- a/drivers/gpu/drm/i915/i915_reg.h
>>> +++ b/drivers/gpu/drm/i915/i915_reg.h
>>> @@ -1119,9 +1119,24 @@ static inline bool 
>>> i915_mmio_reg_valid(i915_reg_t reg)
>>>     /* RPM unit config (Gen8+) */
>>>   #define RPM_CONFIG0        _MMIO(0x0D00)
>>> +#define  GEN9_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_SHIFT    3
>>> +#define  GEN9_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_MASK    (1 << 
>>> GEN9_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_SHIFT)
>>> +#define  GEN9_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_19_2_MHZ    0
>>> +#define  GEN9_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_24_MHZ    1
>>> +#define  GEN10_RPM_CONFIG0_CTC_SHIFT_PARAMETER_SHIFT    1
>>> +#define  GEN10_RPM_CONFIG0_CTC_SHIFT_PARAMETER_MASK    (0x3 << 
>>> GEN10_RPM_CONFIG0_CTC_SHIFT_PARAMETER_SHIFT)
>>> +
>>>   #define RPM_CONFIG1        _MMIO(0x0D04)
>>>   #define  GEN10_GT_NOA_ENABLE  (1 << 9)
>>>   +/* GPM unit config (assuming Gen8+, documentation is fuzzy...) */
>>> +#define GEN8_CTC_MODE            _MMIO(0xA26C)
>>> +#define  GEN8_CTC_SOURCE_PARAMETER_MASK 1
>>> +#define  GEN8_CTC_SOURCE_CRYSTAL_CLOCK    0
>>> +#define  GEN8_CTC_SOURCE_DIVIDE_LOGIC    1
>>> +#define  GEN8_CTC_SHIFT_PARAMETER_SHIFT    1
>>> +#define  GEN8_CTC_SHIFT_PARAMETER_MASK    (0x3 << 
>>> GEN8_CTC_SHIFT_PARAMETER_SHIFT)
>>> +
>>>   /* RPC unit config (Gen8+) */
>>>   #define RPC_CONFIG        _MMIO(0x0D08)
>>>   @@ -8865,6 +8880,12 @@ enum skl_power_gate {
>>>   #define ILK_TIMESTAMP_HI    _MMIO(0x70070)
>>>   #define IVB_TIMESTAMP_CTR    _MMIO(0x44070)
>>>   +#define GEN8_TIMESTAMP_OVERRIDE _MMIO(0x44074)
>>> +#define  GEN8_TIMESTAMP_OVERRIDE_US_COUNTER_SHIFT        0
>>> +#define  GEN8_TIMESTAMP_OVERRIDE_US_COUNTER_MASK        0x3ff
>> US_COUNTER_DIVIDER_MASK?
>
> Sure, I thought it was just a bit too long :)
>
>>> +#define GEN8_TIMESTAMP_OVERRIDE_US_COUNTER_DENOMINATOR_SHIFT    12
>>> +#define  GEN8_TIMESTAMP_OVERRIDE_US_COUNTER_DENOMINATOR_MASK (0xf 
>>> << 12)
>>> +
>>>   #define _PIPE_FRMTMSTMP_A        0x70048
>>>   #define PIPE_FRMTMSTMP(pipe)        \
>>>               _MMIO_PIPE2(pipe, _PIPE_FRMTMSTMP_A)
>>> diff --git a/drivers/gpu/drm/i915/intel_device_info.c 
>>> b/drivers/gpu/drm/i915/intel_device_info.c
>>> index db03d179fc85..9b71a9b6d80e 100644
>>> --- a/drivers/gpu/drm/i915/intel_device_info.c
>>> +++ b/drivers/gpu/drm/i915/intel_device_info.c
>>> @@ -329,6 +329,100 @@ static void broadwell_sseu_info_init(struct 
>>> drm_i915_private *dev_priv)
>>>       sseu->has_eu_pg = 0;
>>>   }
>>>   +static u64 read_timestamp_frequency_from_divide(struct 
>>> drm_i915_private *dev_priv)
>> Should this be named read_reference_ts_freq?
>
> Yes, thanks!
>
>>> +{
>>> +    u32 ts_override = I915_READ(GEN8_TIMESTAMP_OVERRIDE);
>>> +    u64 base_freq, frac_freq;
>>> +
>>> +    base_freq = ((ts_override & 
>>> GEN8_TIMESTAMP_OVERRIDE_US_COUNTER_MASK) >>
>>> +             GEN8_TIMESTAMP_OVERRIDE_US_COUNTER_SHIFT) + 1;
>>> +    base_freq *= 1000000;
>>> +
>>> +    frac_freq = ((ts_override &
>>> + GEN8_TIMESTAMP_OVERRIDE_US_COUNTER_DENOMINATOR_MASK) >>
>>> + GEN8_TIMESTAMP_OVERRIDE_US_COUNTER_DENOMINATOR_SHIFT);
>>> +    if (frac_freq != 0)
>>> +        frac_freq = 1000000 / (frac_freq + 1);
>> Not considering numerator?
>
> The documentation is quite terrible, but my reading is that the 
> numerator doesn't apply to any current generations.
>
Understood now. I think we should also check whether the override is 
set before taking the denominator into account.
>>> +
>>> +    return base_freq + frac_freq;
>>> +}
>>> +
>>> +static u64 read_timestamp_frequency(struct drm_i915_private *dev_priv)
>>> +{
>>> +    if (INTEL_GEN(dev_priv) <= 4) {
>>> +        /* PRMs say:
>>> +         *
>>> +         *     "The value in this register increments once every 16
>>> +         *      hclks." ("CLKCFG" register)
>>> +         *
>>> +         * Since dev_priv->rawclk_freq stores the value in kHz divided
>>> +         * by 4, we just need to divide it again by 4.
>>> +         */
>> I read this as hclk is 1/4th fsb clock and timestamp is 1/16 of hclk 
>> so this should be 16.
>
> You're right, but as the comment above explains, rawclk_freq is 
> already hclk / 4.
> Another / 4 gives us / 16.
1. hclk = 1/4 * fsb_clk
2. ts_clk = 1/16 * hclk
=> ts_clk = 1/64 * fsb_clk
So this should be "(dev_priv->rawclk_freq * 1000) / 16", right?
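
To make the two readings concrete, here is a minimal sketch that does not try to settle which one is right. Both helper names are made up. The only difference is what rawclk_freq (in kHz) is assumed to hold: hclk / 4 as the patch comment says, or hclk itself as the calculation above assumes.

```c
#include <stdint.h>

/* Reading A (the patch comment): rawclk_freq_khz already holds hclk / 4. */
static uint64_t ts_hz_if_rawclk_is_hclk_div4(uint32_t rawclk_freq_khz)
{
	return ((uint64_t)rawclk_freq_khz * 1000) / 4;
}

/* Reading B (the question above): rawclk_freq_khz holds hclk itself. */
static uint64_t ts_hz_if_rawclk_is_hclk(uint32_t rawclk_freq_khz)
{
	return ((uint64_t)rawclk_freq_khz * 1000) / 16;
}
```

For a hypothetical 200 MHz hclk, reading A would store 50000 kHz and reading B would store 200000 kHz. Both helpers then return 12500000, i.e. hclk / 16, so the open question is only what rawclk_freq actually contains.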
>
>>> +        return (dev_priv->rawclk_freq * 1000) / 4;
>>> +    } else if (INTEL_GEN(dev_priv) <= 7) {
>>> +        /* PRMs say:
>>> +         *
>>> +         *     "The PCU TSC counts 10ns increments; this timestamp
>>> +         *      reflects bits 38:3 of the TSC (i.e. 80ns granularity,
>>> +         *      rolling over every 1.5 hours).
>>> +         */
>>> +        return 12500000;
>>> +    } else if (INTEL_GEN(dev_priv) <= 9) {
>>> +        u32 ctc_reg = I915_READ(GEN8_CTC_MODE);
>>> +        u64 freq = 0;
>>> +
>>> +        if ((ctc_reg & GEN8_CTC_SOURCE_PARAMETER_MASK) == 
>>> GEN8_CTC_SOURCE_DIVIDE_LOGIC)
>>> +            freq = read_timestamp_frequency_from_divide(dev_priv);
>>> +        else
>>> +            freq = IS_GEN9_LP(dev_priv) ? 19200000 : 24000000;
>>> +
>>> +        /* Now figure out how the command stream's timestamp register
>>> +         * increments from this frequency (it might increment only
>>> +         * every few clock cycle).
>>> +         */
>>> +        freq >>= 3 - ((ctc_reg & GEN8_CTC_SHIFT_PARAMETER_MASK) >>
>>> +                  GEN8_CTC_SHIFT_PARAMETER_SHIFT);
>> Gen8 documentation is indeed fuzzy. Are we getting 12.5mhz after this 
>> shift as doc says it to have 80ns base.
>>> +
>>> +        return freq;
>>> +    } else if (INTEL_GEN(dev_priv) <= 10) {
>>> +        u32 ctc_reg = I915_READ(GEN8_CTC_MODE);
>>> +        u64 freq = 0;
>>> +        u32 rpm_config_reg = 0;
>>> +
>>> +        /* First figure out the reference frequency. There are 2 ways
>>> +         * we can compute the frequency, either through the
>>> +         * TIMESTAMP_OVERRIDE register or through CTC_MODE &
>> Remove CTC_MODE as it does not itself determine the frequency.
>
> Done, thanks.
>
>>> +         * RPM_CONFIG & CTC_MODE registers. CTC_MODE tells us which
>>> +         * one we should use.
>>> +         */
>>> +        if ((ctc_reg & GEN8_CTC_SOURCE_PARAMETER_MASK) == 
>>> GEN8_CTC_SOURCE_DIVIDE_LOGIC) {
>>> +            freq = read_timestamp_frequency_from_divide(dev_priv);
>>> +        } else {
>>> +            u32 crystal_clock;
>>> +
>>> +            rpm_config_reg = I915_READ(RPM_CONFIG0);
>>> +            crystal_clock = (rpm_config_reg &
>>> + GEN9_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_MASK) >>
>>> +                GEN9_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_SHIFT;
>>> +            freq = crystal_clock == 
>>> GEN9_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_19_2_MHZ ?
>>> +                19200000 : 24000000;
>> switch case would be better i guess.
>
> Done.
>
>>> +        }
>>> +
>>> +        /* Now figure out how the command stream's timestamp register
>>> +         * increments from this frequency (it might increment only
>>> +         * every few clock cycle).
>>> +         */
>>> +        freq >>= 3 - ((rpm_config_reg &
>>> + GEN10_RPM_CONFIG0_CTC_SHIFT_PARAMETER_MASK) >>
>>> + GEN10_RPM_CONFIG0_CTC_SHIFT_PARAMETER_SHIFT);
>>> +
>>> +        return freq;
>>> +    }
>>> +
>>> +    DRM_ERROR("Unknown gen, unable to compute command stream 
>>> timestamp frequency\n");
>>> +    return 0;
>>> +}
>>> +
>>>   /*
>>>    * Determine various intel_device_info fields at runtime.
>>>    *
>>> @@ -450,6 +544,9 @@ void intel_device_info_runtime_init(struct 
>>> drm_i915_private *dev_priv)
>>>       else if (INTEL_GEN(dev_priv) >= 10)
>>>           gen10_sseu_info_init(dev_priv);
>>>   +    /* Initialize command stream timestamp frequency */
>>> +    info->cs_timestamp_frequency = read_timestamp_frequency(dev_priv);
>>> +
>>>       DRM_DEBUG_DRIVER("slice mask: %04x\n", info->sseu.slice_mask);
>>>       DRM_DEBUG_DRIVER("slice total: %u\n", 
>>> hweight8(info->sseu.slice_mask));
>>>       DRM_DEBUG_DRIVER("subslice total: %u\n",
>>> @@ -465,4 +562,6 @@ void intel_device_info_runtime_init(struct 
>>> drm_i915_private *dev_priv)
>>>                info->sseu.has_subslice_pg ? "y" : "n");
>>>       DRM_DEBUG_DRIVER("has EU power gating: %s\n",
>>>                info->sseu.has_eu_pg ? "y" : "n");
>>> +    DRM_DEBUG_DRIVER("CS timestamp frequency: %llu\n",
>>> +             info->cs_timestamp_frequency);
>>>   }
>>> diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
>>> index 125bde7d9504..c3ff0d4947af 100644
>>> --- a/include/uapi/drm/i915_drm.h
>>> +++ b/include/uapi/drm/i915_drm.h
>>> @@ -450,6 +450,12 @@ typedef struct drm_i915_irq_wait {
>>>    */
>>>   #define I915_PARAM_HAS_EXEC_FENCE_ARRAY  49
>>>   +/* Frequency of the command streamer timestamps given by the 
>>> *_TIMESTAMP
>>> + * registers. This used to be fixed per platform but from CNL 
>>> onwards, this
>>> + * might vary depending on the parts.
>>> + */
>>> +#define I915_PARAM_CS_TIMESTAMP_FREQUENCY   50
>>> +
>>>   typedef struct drm_i915_getparam {
>>>       __s32 param;
>>>       /*
>>
>>
>
Thanks
Sagar

* Re: [PATCH v2 6/9] drm/i915: expose command stream timestamp frequency to userspace
  2017-11-09 14:13       ` Lionel Landwerlin
@ 2017-11-09 17:44         ` Lionel Landwerlin
  0 siblings, 0 replies; 31+ messages in thread
From: Lionel Landwerlin @ 2017-11-09 17:44 UTC (permalink / raw)
  To: Sagar Arun Kamble, intel-gfx



On 09/11/17 14:13, Lionel Landwerlin wrote:
> On 09/11/17 14:06, Lionel Landwerlin wrote:
>>>
>>> +    } else if (INTEL_GEN(dev_priv) <= 9) {
>>> +        u32 ctc_reg = I915_READ(GEN8_CTC_MODE);
>>> +        u64 freq = 0;
>>> +
>>> +        if ((ctc_reg & GEN8_CTC_SOURCE_PARAMETER_MASK) == 
>>> GEN8_CTC_SOURCE_DIVIDE_LOGIC)
>>> +            freq = read_timestamp_frequency_from_divide(dev_priv);
>>> +        else
>>> +            freq = IS_GEN9_LP(dev_priv) ? 19200000 : 24000000;
>>> +
>>> +        /* Now figure out how the command stream's timestamp register
>>> +         * increments from this frequency (it might increment only
>>> +         * every few clock cycle).
>>> +         */
>>> +        freq >>= 3 - ((ctc_reg & GEN8_CTC_SHIFT_PARAMETER_MASK) >>
>>> +                  GEN8_CTC_SHIFT_PARAMETER_SHIFT);
>> Gen8 documentation is indeed fuzzy. Are we getting 12.5mhz after this 
>> shift as doc says it to have 80ns base. 
> Forgot to answer that point. Let me check this on BDW again.
> But yes, the idea is that we should get 12.5MHz on BDW.

Okay, it looks like that's wrong on my BDW system.
So this right shift should probably only be applied in the else 
case (i.e. Gen9).

-
Lionel
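
For illustration, a standalone sketch of the shift step under discussion, using the CTC_MODE field layout from the patch. The helper name apply_ctc_shift and the shortened macro names are made up here, and the register values in the example are not taken from real hardware.

```c
#include <stdint.h>

/* Field layout copied from the patch's GEN8_CTC_MODE defines. */
#define CTC_SHIFT_PARAMETER_SHIFT 1
#define CTC_SHIFT_PARAMETER_MASK  (0x3 << CTC_SHIFT_PARAMETER_SHIFT)

/*
 * Hypothetical helper mirroring "freq >>= 3 - shift_parameter": the
 * 2-bit field selects a divide by 8, 4, 2 or 1 of the reference clock.
 */
static uint64_t apply_ctc_shift(uint64_t ref_hz, uint32_t ctc_reg)
{
	uint32_t shift_param = (ctc_reg & CTC_SHIFT_PARAMETER_MASK) >>
			       CTC_SHIFT_PARAMETER_SHIFT;

	return ref_hz >> (3 - shift_param);
}
```

With a 24 MHz reference, a field value of 0 gives 3000000, 2 gives 12000000 and 3 leaves the reference unchanged at 24000000.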


* Re: [PATCH v2 3/9] drm/i915/perf: refactor perf setup
  2017-11-02 16:29 ` [PATCH v2 3/9] drm/i915/perf: refactor perf setup Lionel Landwerlin
@ 2017-11-10 11:04   ` Matthew Auld
  0 siblings, 0 replies; 31+ messages in thread
From: Matthew Auld @ 2017-11-10 11:04 UTC (permalink / raw)
  To: Lionel Landwerlin; +Cc: Intel Graphics Development

On 2 November 2017 at 16:29, Lionel Landwerlin
<lionel.g.landwerlin@intel.com> wrote:
> Gen8/9 aren't very different and we can merge some of this code.
>
> Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>

* Re: [PATCH v2 4/9] drm/i915: fix register naming
  2017-11-02 16:29 ` [PATCH v2 4/9] drm/i915: fix register naming Lionel Landwerlin
@ 2017-11-10 11:11   ` Matthew Auld
  2017-11-10 16:24     ` Lionel Landwerlin
  0 siblings, 1 reply; 31+ messages in thread
From: Matthew Auld @ 2017-11-10 11:11 UTC (permalink / raw)
  To: Lionel Landwerlin; +Cc: Intel Graphics Development

On 2 November 2017 at 16:29, Lionel Landwerlin
<lionel.g.landwerlin@intel.com> wrote:
> This name was added with the whitelisting of registers for building up OA
> configs. It is contained in a range gen8 whitelist :
>
>    addr >= RPM_CONFIG0.reg && addr <= NOA_CONFIG(8).reg
>
> Hence why the name isn't used anywhere.
>
> Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_reg.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> index ee4941a1df20..d27092ec4f74 100644
> --- a/drivers/gpu/drm/i915/i915_reg.h
> +++ b/drivers/gpu/drm/i915/i915_reg.h
> @@ -1118,7 +1118,7 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg)
>  #define RPM_CONFIG1        _MMIO(0x0D04)
>
>  /* RPC unit config (Gen8+) */
> -#define RPM_CONFIG         _MMIO(0x0D08)
> +#define RPC_CONFIG         _MMIO(0x0D08)
Wait, is it RPC or RCP? The spec calls it the RCPunit, with the
register being RCPCONFIG...

* Re: [PATCH v2 5/9] drm/i915/perf: enable perf support on CNL
  2017-11-02 16:29 ` [PATCH v2 5/9] drm/i915/perf: enable perf support on CNL Lionel Landwerlin
@ 2017-11-10 12:42   ` Matthew Auld
  0 siblings, 0 replies; 31+ messages in thread
From: Matthew Auld @ 2017-11-10 12:42 UTC (permalink / raw)
  To: Lionel Landwerlin; +Cc: Intel Graphics Development

On 2 November 2017 at 16:29, Lionel Landwerlin
<lionel.g.landwerlin@intel.com> wrote:
> This adds new registers to the whitelist to configs emitted from userspace.
>
> Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
> ---
>  drivers/gpu/drm/i915/Makefile      |   3 +-
>  drivers/gpu/drm/i915/i915_oa_cnl.c | 121 +++++++++++++++++++++++++++++++++++++
>  drivers/gpu/drm/i915/i915_oa_cnl.h |  34 +++++++++++
>  drivers/gpu/drm/i915/i915_perf.c   |  41 ++++++++++++-
>  drivers/gpu/drm/i915/i915_reg.h    |   5 ++
>  5 files changed, 202 insertions(+), 2 deletions(-)
>  create mode 100644 drivers/gpu/drm/i915/i915_oa_cnl.c
>  create mode 100644 drivers/gpu/drm/i915/i915_oa_cnl.h
>
> diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
> index 3c419455b0af..f7afd44214b5 100644
> --- a/drivers/gpu/drm/i915/Makefile
> +++ b/drivers/gpu/drm/i915/Makefile
> @@ -163,7 +163,8 @@ i915-y += i915_perf.o \
>           i915_oa_kblgt3.o \
>           i915_oa_glk.o \
>           i915_oa_cflgt2.o \
> -         i915_oa_cflgt3.o
> +         i915_oa_cflgt3.o \
> +         i915_oa_cnl.o
>
>  ifeq ($(CONFIG_DRM_I915_GVT),y)
>  i915-y += intel_gvt.o
> diff --git a/drivers/gpu/drm/i915/i915_oa_cnl.c b/drivers/gpu/drm/i915/i915_oa_cnl.c
> new file mode 100644
> index 000000000000..ff0ac3627cc4
> --- /dev/null
> +++ b/drivers/gpu/drm/i915/i915_oa_cnl.c
> @@ -0,0 +1,121 @@
> +/*
> + * Autogenerated file by GPU Top : https://github.com/rib/gputop
> + * DO NOT EDIT manually!
> + *
> + *
> + * Copyright (c) 2015 Intel Corporation
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the "Software"),
> + * to deal in the Software without restriction, including without limitation
> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice (including the next
> + * paragraph) shall be included in all copies or substantial portions of the
> + * Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
> + * IN THE SOFTWARE.
> + *
> + */
> +
> +#include <linux/sysfs.h>
> +
> +#include "i915_drv.h"
> +#include "i915_oa_cnl.h"
> +
> +static const struct i915_oa_reg b_counter_config_test_oa[] = {
> +       { _MMIO(0x2740), 0x00000000 },
> +       { _MMIO(0x2710), 0x00000000 },
> +       { _MMIO(0x2714), 0xf0800000 },
> +       { _MMIO(0x2720), 0x00000000 },
> +       { _MMIO(0x2724), 0xf0800000 },
> +       { _MMIO(0x2770), 0x00000004 },
> +       { _MMIO(0x2774), 0x0000ffff },
> +       { _MMIO(0x2778), 0x00000003 },
> +       { _MMIO(0x277c), 0x0000ffff },
> +       { _MMIO(0x2780), 0x00000007 },
> +       { _MMIO(0x2784), 0x0000ffff },
> +       { _MMIO(0x2788), 0x00100002 },
> +       { _MMIO(0x278c), 0x0000fff7 },
> +       { _MMIO(0x2790), 0x00100002 },
> +       { _MMIO(0x2794), 0x0000ffcf },
> +       { _MMIO(0x2798), 0x00100082 },
> +       { _MMIO(0x279c), 0x0000ffef },
> +       { _MMIO(0x27a0), 0x001000c2 },
> +       { _MMIO(0x27a4), 0x0000ffe7 },
> +       { _MMIO(0x27a8), 0x00100001 },
> +       { _MMIO(0x27ac), 0x0000ffe7 },
> +};
> +
> +static const struct i915_oa_reg flex_eu_config_test_oa[] = {
> +};
> +
> +static const struct i915_oa_reg mux_config_test_oa[] = {
> +       { _MMIO(0xd04), 0x00000200 },
> +       { _MMIO(0x9884), 0x00000007 },
> +       { _MMIO(0x9888), 0x17060000 },
> +       { _MMIO(0x9840), 0x00000000 },
> +       { _MMIO(0x9884), 0x00000007 },
> +       { _MMIO(0x9888), 0x13034000 },
> +       { _MMIO(0x9884), 0x00000007 },
> +       { _MMIO(0x9888), 0x07060066 },
> +       { _MMIO(0x9884), 0x00000007 },
> +       { _MMIO(0x9888), 0x05060000 },
> +       { _MMIO(0x9884), 0x00000007 },
> +       { _MMIO(0x9888), 0x0f080040 },
> +       { _MMIO(0x9884), 0x00000007 },
> +       { _MMIO(0x9888), 0x07091000 },
> +       { _MMIO(0x9884), 0x00000007 },
> +       { _MMIO(0x9888), 0x0f041000 },
> +       { _MMIO(0x9884), 0x00000007 },
> +       { _MMIO(0x9888), 0x1d004000 },
> +       { _MMIO(0x9884), 0x00000007 },
> +       { _MMIO(0x9888), 0x35000000 },
> +       { _MMIO(0x9884), 0x00000007 },
> +       { _MMIO(0x9888), 0x49000000 },
> +       { _MMIO(0x9884), 0x00000007 },
> +       { _MMIO(0x9888), 0x3d000000 },
> +       { _MMIO(0x9884), 0x00000007 },
> +       { _MMIO(0x9888), 0x31000000 },
> +};
> +
> +static ssize_t
> +show_test_oa_id(struct device *kdev, struct device_attribute *attr, char *buf)
> +{
> +       return sprintf(buf, "1\n");
> +}
> +
> +void
> +i915_perf_load_test_config_cnl(struct drm_i915_private *dev_priv)
> +{
> +       strncpy(dev_priv->perf.oa.test_config.uuid,
> +               "db41edd4-d8e7-4730-ad11-b9a2d6833503",
> +               UUID_STRING_LEN);
> +       dev_priv->perf.oa.test_config.id = 1;
> +
> +       dev_priv->perf.oa.test_config.mux_regs = mux_config_test_oa;
> +       dev_priv->perf.oa.test_config.mux_regs_len = ARRAY_SIZE(mux_config_test_oa);
> +
> +       dev_priv->perf.oa.test_config.b_counter_regs = b_counter_config_test_oa;
> +       dev_priv->perf.oa.test_config.b_counter_regs_len = ARRAY_SIZE(b_counter_config_test_oa);
> +
> +       dev_priv->perf.oa.test_config.flex_regs = flex_eu_config_test_oa;
> +       dev_priv->perf.oa.test_config.flex_regs_len = ARRAY_SIZE(flex_eu_config_test_oa);
> +
> +       dev_priv->perf.oa.test_config.sysfs_metric.name = "db41edd4-d8e7-4730-ad11-b9a2d6833503";
> +       dev_priv->perf.oa.test_config.sysfs_metric.attrs = dev_priv->perf.oa.test_config.attrs;
> +
> +       dev_priv->perf.oa.test_config.attrs[0] = &dev_priv->perf.oa.test_config.sysfs_metric_id.attr;
> +
> +       dev_priv->perf.oa.test_config.sysfs_metric_id.attr.name = "id";
> +       dev_priv->perf.oa.test_config.sysfs_metric_id.attr.mode = 0444;
> +       dev_priv->perf.oa.test_config.sysfs_metric_id.show = show_test_oa_id;
> +}
> diff --git a/drivers/gpu/drm/i915/i915_oa_cnl.h b/drivers/gpu/drm/i915/i915_oa_cnl.h
> new file mode 100644
> index 000000000000..fb918b131105
> --- /dev/null
> +++ b/drivers/gpu/drm/i915/i915_oa_cnl.h
> @@ -0,0 +1,34 @@
> +/*
> + * Autogenerated file by GPU Top : https://github.com/rib/gputop
> + * DO NOT EDIT manually!
> + *
> + *
> + * Copyright (c) 2015 Intel Corporation
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the "Software"),
> + * to deal in the Software without restriction, including without limitation
> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice (including the next
> + * paragraph) shall be included in all copies or substantial portions of the
> + * Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
> + * IN THE SOFTWARE.
> + *
> + */
> +
> +#ifndef __I915_OA_CNL_H__
> +#define __I915_OA_CNL_H__
> +
> +extern void i915_perf_load_test_config_cnl(struct drm_i915_private *dev_priv);
> +
> +#endif
> diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
> index 802928c54f06..00be015e01df 100644
> --- a/drivers/gpu/drm/i915/i915_perf.c
> +++ b/drivers/gpu/drm/i915/i915_perf.c
> @@ -208,6 +208,7 @@
>  #include "i915_oa_glk.h"
>  #include "i915_oa_cflgt2.h"
>  #include "i915_oa_cflgt3.h"
> +#include "i915_oa_cnl.h"
>
>  /* HW requires this to be a power of two, between 128k and 16M, though driver
>   * is currently generally designed assuming the largest 16M size is used such
> @@ -1852,7 +1853,7 @@ static int gen8_enable_metric_set(struct drm_i915_private *dev_priv,
>          * be read back from automatically triggered reports, as part of the
>          * RPT_ID field.
>          */
> -       if (IS_GEN9(dev_priv)) {
> +       if (IS_GEN9(dev_priv) || IS_GEN10(dev_priv)) {
>                 I915_WRITE(GEN8_OA_DEBUG,
>                            _MASKED_BIT_ENABLE(GEN9_OA_DEBUG_DISABLE_CLK_RATIO_REPORTS |
>                                               GEN9_OA_DEBUG_INCLUDE_CLK_RATIO));
> @@ -1885,6 +1886,16 @@ static void gen8_disable_metric_set(struct drm_i915_private *dev_priv)
>
>  }
>
> +static void gen10_disable_metric_set(struct drm_i915_private *dev_priv)
> +{
> +       /* Reset all contexts' slices/subslices configurations. */
> +       gen8_configure_all_contexts(dev_priv, NULL, false);
> +
> +       /* Make sure we disable noa to save power. */
> +       I915_WRITE(RPM_CONFIG1,
> +                  I915_READ(RPM_CONFIG1) & ~GEN10_GT_NOA_ENABLE);
> +}
> +
>  static void gen7_oa_enable(struct drm_i915_private *dev_priv)
>  {
>         /*
> @@ -2937,6 +2948,8 @@ void i915_perf_register(struct drm_i915_private *dev_priv)
>                         i915_perf_load_test_config_cflgt2(dev_priv);
>                 if (IS_CFL_GT3(dev_priv))
>                         i915_perf_load_test_config_cflgt3(dev_priv);
> +       } else if (IS_CANNONLAKE(dev_priv)) {
> +               i915_perf_load_test_config_cnl(dev_priv);
>         }
>
>         if (dev_priv->perf.oa.test_config.id == 0)
> @@ -3022,6 +3035,12 @@ static bool gen8_is_valid_mux_addr(struct drm_i915_private *dev_priv, u32 addr)
>                 (addr >= RPM_CONFIG0.reg && addr <= NOA_CONFIG(8).reg);
>  }
>
> +static bool gen10_is_valid_mux_addr(struct drm_i915_private *dev_priv, u32 addr)
> +{
> +       return gen8_is_valid_mux_addr(dev_priv, addr) ||
> +               (addr >= OA_PERFCNT3_LO.reg && addr <= OA_PERFCNT4_HI.reg);
> +}
> +
>  static bool hsw_is_valid_mux_addr(struct drm_i915_private *dev_priv, u32 addr)
>  {
>         return gen7_is_valid_mux_addr(dev_priv, addr) ||
> @@ -3475,6 +3494,26 @@ void i915_perf_init(struct drm_i915_private *dev_priv)
>                         default:
>                                 break;
>                         }
> +               } else if (IS_GEN10(dev_priv)) {
> +                       dev_priv->perf.oa.ops.is_valid_b_counter_reg =
> +                               gen7_is_valid_b_counter_addr;
> +                       dev_priv->perf.oa.ops.is_valid_mux_reg =
> +                               gen10_is_valid_mux_addr;
> +                       dev_priv->perf.oa.ops.is_valid_flex_reg =
> +                               gen8_is_valid_flex_addr;
> +
> +                       dev_priv->perf.oa.ops.enable_metric_set = gen8_enable_metric_set;
> +                       dev_priv->perf.oa.ops.disable_metric_set = gen10_disable_metric_set;
> +
> +                       dev_priv->perf.oa.ctx_oactxctrl_offset = 0x128;
> +                       dev_priv->perf.oa.ctx_flexeu0_offset = 0x3de;
> +
> +                       dev_priv->perf.oa.gen8_valid_ctx_bit = (1<<16);
> +
> +                       /* Default frequency, although we need to read it from
> +                        * the register as it might vary between parts.
> +                        */
I believe the preferred comment style is:

/*
 * Something, something...
 */

Reviewed-by: Matthew Auld <matthew.auld@intel.com>
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


* Re: [PATCH v2 1/9] drm/i915/perf: complete whitelisting for OA programming on HSW
  2017-11-02 16:29 ` [PATCH v2 1/9] drm/i915/perf: complete whitelisting for OA programming on HSW Lionel Landwerlin
@ 2017-11-10 13:05   ` Matthew Auld
  0 siblings, 0 replies; 31+ messages in thread
From: Matthew Auld @ 2017-11-10 13:05 UTC (permalink / raw)
  To: Lionel Landwerlin; +Cc: Intel Graphics Development

On 2 November 2017 at 16:29, Lionel Landwerlin
<lionel.g.landwerlin@intel.com> wrote:
> We were missing some registers, and we can now name one for which we only
> had the offset.
>
> Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>

* Re: [PATCH v2 4/9] drm/i915: fix register naming
  2017-11-10 11:11   ` Matthew Auld
@ 2017-11-10 16:24     ` Lionel Landwerlin
  0 siblings, 0 replies; 31+ messages in thread
From: Lionel Landwerlin @ 2017-11-10 16:24 UTC (permalink / raw)
  To: Matthew Auld; +Cc: Intel Graphics Development

On 10/11/17 11:11, Matthew Auld wrote:
> On 2 November 2017 at 16:29, Lionel Landwerlin
> <lionel.g.landwerlin@intel.com> wrote:
>> This name was added with the whitelisting of registers for building up OA
>> configs. The register falls inside the range covered by the gen8 whitelist:
>>
>>     addr >= RPM_CONFIG0.reg && addr <= NOA_CONFIG(8).reg
>>
>> which is why the name isn't used anywhere.
>>
>> Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
>> ---
>>   drivers/gpu/drm/i915/i915_reg.h | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
>> index ee4941a1df20..d27092ec4f74 100644
>> --- a/drivers/gpu/drm/i915/i915_reg.h
>> +++ b/drivers/gpu/drm/i915/i915_reg.h
>> @@ -1118,7 +1118,7 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg)
>>   #define RPM_CONFIG1        _MMIO(0x0D04)
>>
>>   /* RPC unit config (Gen8+) */
>> -#define RPM_CONFIG         _MMIO(0x0D08)
>> +#define RPC_CONFIG         _MMIO(0x0D08)
> Wait, is it RPC or RCP? The spec calls it the RCPunit, with the
> register being RCPCONFIG...
>
Looks like I can't read or write :(
Resending with a fix, thanks a lot.
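
For illustration only (this is not code from the patch): a minimal sketch of
why a register that sits inside a range whitelist never needs to be referenced
by name. The 0x0D04 and 0x0D08 offsets come from the hunk quoted above; the
RPM_CONFIG0 offset (0x0D00) and the NOA_CONFIG(i) layout (0x0D0C + i * 4) are
assumptions about i915_reg.h, and the *_OFFSET macro names and both function
names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Offsets visible in the quoted hunk. */
#define RPM_CONFIG1_OFFSET	0x0D04u
#define REG_0D08_OFFSET		0x0D08u	/* the register being renamed */

/* Assumed offsets, not shown anywhere in this thread. */
#define RPM_CONFIG0_OFFSET	0x0D00u
#define NOA_CONFIG_OFFSET(i)	(0x0D0Cu + (uint32_t)(i) * 4u)

/* Same shape as the range test quoted in the commit message. */
static bool in_gen8_mux_range(uint32_t addr)
{
	return addr >= RPM_CONFIG0_OFFSET && addr <= NOA_CONFIG_OFFSET(8);
}

/* The range covers 0x0D08 without the check ever naming that register. */
void check_range_covers_unnamed_register(void)
{
	assert(in_gen8_mux_range(RPM_CONFIG1_OFFSET));
	assert(in_gen8_mux_range(REG_0D08_OFFSET));
	assert(!in_gen8_mux_range(NOA_CONFIG_OFFSET(8) + 4u));
}
```

Under those assumptions, renaming the 0x0D08 define changes nothing in the
whitelist logic, since only the two endpoints of the range are ever named.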

Thread overview: 31+ messages
2017-11-02 16:29 [PATCH v2 0/9] i915: Cannonlake perf support Lionel Landwerlin
2017-11-02 16:29 ` [PATCH v2 1/9] drm/i915/perf: complete whitelisting for OA programming on HSW Lionel Landwerlin
2017-11-10 13:05   ` Matthew Auld
2017-11-02 16:29 ` [PATCH v2 2/9] drm/i915/perf: add support for Coffeelake GT3 Lionel Landwerlin
2017-11-07 11:34   ` Matthew Auld
2017-11-02 16:29 ` [PATCH v2 3/9] drm/i915/perf: refactor perf setup Lionel Landwerlin
2017-11-10 11:04   ` Matthew Auld
2017-11-02 16:29 ` [PATCH v2 4/9] drm/i915: fix register naming Lionel Landwerlin
2017-11-10 11:11   ` Matthew Auld
2017-11-10 16:24     ` Lionel Landwerlin
2017-11-02 16:29 ` [PATCH v2 5/9] drm/i915/perf: enable perf support on CNL Lionel Landwerlin
2017-11-10 12:42   ` Matthew Auld
2017-11-02 16:29 ` [PATCH v2 6/9] drm/i915: expose command stream timestamp frequency to userspace Lionel Landwerlin
2017-11-07  0:01   ` Rafael Antognolli
2017-11-07 11:30   ` Ewelina Musial
2017-11-07 12:07     ` Lionel Landwerlin
2017-11-08 17:36   ` Lionel Landwerlin
2017-11-09  9:10     ` Sagar Arun Kamble
2017-11-09 11:58   ` Sagar Arun Kamble
2017-11-09 14:06     ` Lionel Landwerlin
2017-11-09 14:13       ` Lionel Landwerlin
2017-11-09 17:44         ` Lionel Landwerlin
2017-11-09 16:37       ` Sagar Arun Kamble
2017-11-02 16:29 ` [PATCH v2 7/9] drm/i915/perf: reuse timestamp frequency from device info Lionel Landwerlin
2017-11-02 16:29 ` [PATCH v2 8/9] drm/i915: expose eu topology to userspace Lionel Landwerlin
2017-11-02 16:35   ` Chris Wilson
2017-11-02 16:37     ` Lionel Landwerlin
2017-11-02 17:34     ` Lionel Landwerlin
2017-11-02 16:29 ` [PATCH v2 9/9] drm/i915/debugfs: reuse max slice/subslices already stored in sseu Lionel Landwerlin
2017-11-02 16:49 ` ✓ Fi.CI.BAT: success for i915: Cannonlake perf support (rev2) Patchwork
2017-11-02 17:38 ` ✗ Fi.CI.IGT: failure " Patchwork
