All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 1/3] drm/i915: Allow GT Slices Shutdown on Boot.
@ 2013-10-15 14:41 Rodrigo Vivi
  2013-10-15 14:41 ` [PATCH] i915_drm: add exec flag for request forcing Intel GT full Rodrigo Vivi
                   ` (4 more replies)
  0 siblings, 5 replies; 31+ messages in thread
From: Rodrigo Vivi @ 2013-10-15 14:41 UTC (permalink / raw)
  To: intel-gfx

Slices shutdown is a power savings feature present on Haswell GT3 whereby
parts of HW i.e. slice is shut off on boot or dynamically to save power.

This patch only introduces a way to disable half of Haswell GT3 slices on boot.

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
---
 drivers/gpu/drm/i915/i915_drv.c  |  5 +++++
 drivers/gpu/drm/i915/i915_drv.h  |  1 +
 drivers/gpu/drm/i915/i915_gem.c  |  5 +----
 drivers/gpu/drm/i915/i915_reg.h  |  8 ++++++++
 drivers/gpu/drm/i915/intel_drv.h |  1 +
 drivers/gpu/drm/i915/intel_pm.c  | 16 +++++++++++++++-
 6 files changed, 31 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 59649c0..e3207a2 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -154,6 +154,11 @@ module_param_named(prefault_disable, i915_prefault_disable, bool, 0600);
 MODULE_PARM_DESC(prefault_disable,
 		"Disable page prefaulting for pread/pwrite/reloc (default:false). For developers only.");
 
+int i915_gt_slice_config __read_mostly = 1;
+module_param_named(gt_slice_config, i915_gt_slice_config, int, 0600);
+MODULE_PARM_DESC(gt_slice_config,
+		 "Haswell GT3 has multiple slices. Use Full (1) for better performance or Half (0) for better power savings. (default:1)");
+
 static struct drm_driver driver;
 extern int intel_agp_enabled;
 
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 6106d3d..02d82d8 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1794,6 +1794,7 @@ extern bool i915_fastboot __read_mostly;
 extern int i915_enable_pc8 __read_mostly;
 extern int i915_pc8_timeout __read_mostly;
 extern bool i915_prefault_disable __read_mostly;
+extern int i915_gt_slice_config __read_mostly;
 
 extern int i915_suspend(struct drm_device *dev, pm_message_t state);
 extern int i915_resume(struct drm_device *dev);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 71dd030..b52808e 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4421,10 +4421,7 @@ i915_gem_init_hw(struct drm_device *dev)
 	if (dev_priv->ellc_size)
 		I915_WRITE(HSW_IDICR, I915_READ(HSW_IDICR) | IDIHASHMSK(0xf));
 
-	if (IS_HSW_GT3(dev))
-		I915_WRITE(MI_PREDICATE_RESULT_2, LOWER_SLICE_ENABLED);
-	else
-		I915_WRITE(MI_PREDICATE_RESULT_2, LOWER_SLICE_DISABLED);
+	intel_init_gt_slices(dev);
 
 	if (HAS_PCH_NOP(dev)) {
 		u32 temp = I915_READ(GEN7_MSG_CTL);
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 13153c3..497c441 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -269,6 +269,14 @@
 #define  LOWER_SLICE_ENABLED	(1<<0)
 #define  LOWER_SLICE_DISABLED	(0<<0)
 
+#define HSW_GT_SLICE_INFO	0x138064
+#define   SLICE_SEL_BOTH	(1<<3)
+#define   SLICE_AUTOWAKE	(1<<2)
+#define   SLICE_STATUS_MASK	0x3
+#define   SLICE_STATUS_GT_OFF	(0<<0)
+#define   SLICE_STATUS_MAIN_ON	(2<<0)
+#define   SLICE_STATUS_BOTH_ON	(3<<0)
+
 /*
  * 3D instructions used by the kernel
  */
diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index e33f387..f21f3fa 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -834,6 +834,7 @@ void intel_set_power_well(struct drm_device *dev, bool enable);
 void intel_enable_gt_powersave(struct drm_device *dev);
 void intel_disable_gt_powersave(struct drm_device *dev);
 void ironlake_teardown_rc6(struct drm_device *dev);
+void intel_init_gt_slices(struct drm_device *dev);
 void gen6_update_ring_freq(struct drm_device *dev);
 void gen6_rps_idle(struct drm_i915_private *dev_priv);
 void gen6_rps_boost(struct drm_i915_private *dev_priv);
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 6cd5c8f..a1a2588 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -3866,6 +3866,21 @@ static void gen6_enable_rps(struct drm_device *dev)
 	gen6_gt_force_wake_put(dev_priv);
 }
 
+void intel_init_gt_slices(struct drm_device *dev)
+{
+	struct drm_i915_private *dev_priv = dev->dev_private;
+
+	if (!IS_HSW_GT3(dev))
+		return;
+
+	I915_WRITE(MI_PREDICATE_RESULT_2, LOWER_SLICE_ENABLED);
+
+	if (!i915_gt_slice_config) {
+		I915_WRITE(HSW_GT_SLICE_INFO, ~SLICE_SEL_BOTH);
+		I915_WRITE(MI_PREDICATE_RESULT_2, LOWER_SLICE_DISABLED);
+	}
+}
+
 void gen6_update_ring_freq(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
@@ -6022,4 +6037,3 @@ void intel_pm_init(struct drm_device *dev)
 	INIT_DELAYED_WORK(&dev_priv->rps.delayed_resume_work,
 			  intel_gen6_powersave_work);
 }
-
-- 
1.7.11.7

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH] i915_drm: add exec flag for request forcing Intel GT full.
  2013-10-15 14:41 [PATCH 1/3] drm/i915: Allow GT Slices Shutdown on Boot Rodrigo Vivi
@ 2013-10-15 14:41 ` Rodrigo Vivi
  2013-10-15 20:50   ` [PATCH] i915_drm: add exec flag to warn kernel that userspace is using predication Rodrigo Vivi
  2013-10-15 14:41 ` [PATCH] testcases: Slice Shutdown sysfs switch and gt force full exec buffer flag Rodrigo Vivi
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 31+ messages in thread
From: Rodrigo Vivi @ 2013-10-15 14:41 UTC (permalink / raw)
  To: intel-gfx

Rendering buffers that need full GT power can request full power through
this I915_EXEC_GT_FULL flag. If the default is the power saving with half
slices off it executes LRIs to immediately enable all slices.

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
---
 include/drm/i915_drm.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/include/drm/i915_drm.h b/include/drm/i915_drm.h
index aa983f3..7213ea8 100644
--- a/include/drm/i915_drm.h
+++ b/include/drm/i915_drm.h
@@ -670,6 +670,9 @@ struct drm_i915_gem_execbuffer2 {
 /** Resets the SO write offset registers for transform feedback on gen7. */
 #define I915_EXEC_GEN7_SOL_RESET	(1<<8)
 
+/* Use all available GT Slisces */
+#define I915_EXEC_GT_FULL		(1<<13)
+
 #define I915_EXEC_CONTEXT_ID_MASK	(0xffffffff)
 #define i915_execbuffer2_set_context_id(eb2, context) \
 	(eb2).rsvd1 = context & I915_EXEC_CONTEXT_ID_MASK
-- 
1.7.11.7

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH] testcases: Slice Shutdown sysfs switch and gt force full exec buffer flag.
  2013-10-15 14:41 [PATCH 1/3] drm/i915: Allow GT Slices Shutdown on Boot Rodrigo Vivi
  2013-10-15 14:41 ` [PATCH] i915_drm: add exec flag for request forcing Intel GT full Rodrigo Vivi
@ 2013-10-15 14:41 ` Rodrigo Vivi
  2013-10-15 20:51   ` [PATCH] testcases: Slice Shutdown testscases Rodrigo Vivi
  2013-10-15 14:41 ` [PATCH 2/3] drm/i915: Slice Shutdown: Allow setting slice config (full x half) through sysfs Rodrigo Vivi
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 31+ messages in thread
From: Rodrigo Vivi @ 2013-10-15 14:41 UTC (permalink / raw)
  To: intel-gfx

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
---
 tests/Makefile.am       |   1 +
 tests/gt_slice_config.c | 227 ++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 228 insertions(+)
 create mode 100644 tests/gt_slice_config.c

diff --git a/tests/Makefile.am b/tests/Makefile.am
index 0290ae0..cf105e4 100644
--- a/tests/Makefile.am
+++ b/tests/Makefile.am
@@ -50,6 +50,7 @@ TESTS_progs_M = \
 	gem_tiled_blits \
 	gem_tiled_partial_pwrite_pread \
 	gem_write_read_ring_switch \
+	gt_slice_config \
 	kms_flip \
 	kms_render \
 	kms_setmode \
diff --git a/tests/gt_slice_config.c b/tests/gt_slice_config.c
new file mode 100644
index 0000000..635535e
--- /dev/null
+++ b/tests/gt_slice_config.c
@@ -0,0 +1,227 @@
+/*
+ * Copyright © 2013 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ *
+ * Authors:
+ *    Rodrigo Vivi <rodrigo.vivi@intel.com>
+ *
+ */
+
+/*
+ * Testcase: Test GT slice shutdown feature
+ *
+ * Test both:
+ * 1. sysfs slice config half/full switching
+ * 2. exec buffer flag requesting gt full.
+ */
+
+#define _GNU_SOURCE
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <unistd.h>
+#include "drmtest.h"
+
+static void exec(int fd, uint32_t handle)
+{
+        struct drm_i915_gem_execbuffer2 execbuf;
+        struct drm_i915_gem_exec_object2 gem_exec[1];
+        uint32_t b[2] = {MI_BATCH_BUFFER_END};
+        int loops = 2000000;
+        int ret = 0;
+
+        gem_write(fd, handle, 0, b, sizeof(b));
+
+        gem_exec[0].handle = handle;
+        gem_exec[0].relocation_count = 0;
+        gem_exec[0].relocs_ptr = 0;
+        gem_exec[0].alignment = 0;
+        gem_exec[0].offset = 0;
+        gem_exec[0].flags = 0;
+        gem_exec[0].rsvd1 = 0;
+        gem_exec[0].rsvd2 = 0;
+
+        execbuf.buffers_ptr = (uintptr_t)gem_exec;
+        execbuf.buffer_count = 1;
+        execbuf.batch_start_offset = 0;
+        execbuf.batch_len = 8;
+        execbuf.cliprects_ptr = 0;
+        execbuf.num_cliprects = 0;
+        execbuf.DR1 = 0;
+        execbuf.DR4 = 0;
+        execbuf.flags =  I915_EXEC_RENDER | I915_EXEC_GT_FULL;
+        i915_execbuffer2_set_context_id(execbuf, 0);
+        execbuf.rsvd2 = 0;
+
+        while (loops-- && ret == 0) {
+                ret = drmIoctl(fd,
+                               DRM_IOCTL_I915_GEM_EXECBUFFER2,
+                               &execbuf);
+        }
+	gem_sync(fd, handle);
+}
+
+static bool is_full(const int device)
+{
+	char *path;
+	FILE *file;
+	int ret;
+	char str[5];
+
+	ret = asprintf(&path, "/sys/class/drm/card%d/power/gt_slice_config",
+		       device);
+	igt_assert(ret != -1);
+
+	file = fopen(path, "r");
+	igt_require(file);
+
+	ret = fscanf(file, "%s", str);
+	igt_assert(ret != 0);
+
+	fclose(file);
+	return strcmp(str, "half");
+}
+
+static int set_status(const int device, bool full)
+{
+	char *path;
+	FILE *file;
+	int ret;
+	char str[5];
+	int attempts = 10;
+
+	ret = asprintf(&path, "/sys/class/drm/card%d/power/gt_slice_config",
+		       device);
+	igt_assert(ret != -1);
+
+	file = fopen(path, "r");
+	igt_require(file);
+
+	ret = fscanf(file, "%s", str);
+	igt_assert(ret != 0);
+
+	fclose(file);
+
+	if (full == strcmp(str, "half"))
+		return 0;
+
+	while (attempts-- && ret != 0) {
+		file = fopen(path, "w");
+		igt_require(file);
+		ret = fprintf(file, "%s\n", full ? "full": "half");
+		igt_assert(ret != -1);
+		ret = fclose(file);
+		sleep(1);
+	}
+	return ret;
+}
+
+int main(int argc, char **argv)
+{
+	char *path;
+	FILE *file;
+	unsigned int rc6_enabled;
+	int ret;
+	uint32_t handle;
+	const int device = drm_get_card();
+	const int fd = drm_open_any();
+	const int devid = intel_get_drm_devid(fd);
+	bool initial = is_full(device);
+
+	igt_skip_on_simulation();
+
+	igt_subtest_init(argc, argv);
+
+	igt_fixture {
+		/* On Haswell Slices on/off switch depends on RC6 exit */
+		ret = asprintf(&path, "/sys/class/drm/card%d/power/rc6_enable",
+			       device);
+		igt_assert(ret != -1);
+
+		file = fopen(path, "r");
+		igt_require(file);
+
+		fscanf(file, "%u", &rc6_enabled);
+		fclose(file);
+	}
+
+	igt_subtest("sysfs") {
+		igt_require(IS_HASWELL(devid));
+		igt_require(rc6_enabled);
+
+		/* Switching states */
+		ret = set_status(device, !initial);
+		if (ret < 0)
+			igt_fail(ret);
+
+		/* Check if state has really changed */
+		if (initial == is_full(device)) {
+			fprintf(stderr,
+				"gt_slice_config status hasn't changed\n");
+			igt_fail(-1);
+		}
+		igt_success();
+	}
+
+	igt_subtest("execbuf") {
+		igt_require(IS_HASWELL(devid));
+		igt_require(rc6_enabled);
+
+		/* Disable half slices to test the forcefull via execbuf flag */
+		ret = set_status(device, false);
+		if (ret < 0)
+			igt_fail(ret);
+
+		igt_fork(child, 1) {
+			/* Start Render with I915_EXEC_GT_FULL*/
+			handle = gem_create(fd, 4096);
+			exec(fd, handle);
+			gem_close(fd, handle);
+		}
+
+		/* Check if forcing full is working */
+		sleep(1);
+		if (!is_full(device)) {
+			fprintf(stderr, "All slices should be enabled\n");
+			set_status(device, initial);
+			igt_fail(-1);
+		}
+
+		/* Wait for idleness */
+		igt_waitchildren();
+		sleep(3);
+
+		/* Check if it is back to half */
+		if (is_full(device)) {
+			fprintf(stderr, "Only half slices should be enabled\n");
+			set_status(device, initial);
+			igt_fail(-1);
+		}
+		set_status(device, initial);
+		igt_success();
+	}
+
+	igt_fixture {
+		close(fd);
+	}
+
+	igt_exit();
+}
-- 
1.7.11.7

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 2/3] drm/i915: Slice Shutdown: Allow setting slice config (full x half) through sysfs
  2013-10-15 14:41 [PATCH 1/3] drm/i915: Allow GT Slices Shutdown on Boot Rodrigo Vivi
  2013-10-15 14:41 ` [PATCH] i915_drm: add exec flag for request forcing Intel GT full Rodrigo Vivi
  2013-10-15 14:41 ` [PATCH] testcases: Slice Shutdown sysfs switch and gt force full exec buffer flag Rodrigo Vivi
@ 2013-10-15 14:41 ` Rodrigo Vivi
  2013-10-15 14:41 ` [PATCH 3/3] drm/i915: HSW GT3 Slices: Userspace can dynamically forcibly enable all slices Rodrigo Vivi
  2013-11-05 22:44 ` [PATCH 1/4] drm/i915: Allow GT Slices Shutdown on Boot Rodrigo Vivi
  4 siblings, 0 replies; 31+ messages in thread
From: Rodrigo Vivi @ 2013-10-15 14:41 UTC (permalink / raw)
  To: intel-gfx

This patch introduces a sysfs interface to easily allow dynamically switch
slice config default behaviour between full and half slices.

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
---
 drivers/gpu/drm/i915/i915_drv.h   |  1 +
 drivers/gpu/drm/i915/i915_sysfs.c | 54 ++++++++++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/intel_drv.h  |  2 ++
 drivers/gpu/drm/i915/intel_pm.c   | 58 +++++++++++++++++++++++++++++++++++++--
 4 files changed, 112 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 02d82d8..685fb1d 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1728,6 +1728,7 @@ struct drm_i915_file_private {
 #define HAS_POWER_WELL(dev)	(IS_HASWELL(dev))
 #define HAS_FPGA_DBG_UNCLAIMED(dev)	(INTEL_INFO(dev)->has_fpga_dbg)
 #define HAS_PSR(dev)		(IS_HASWELL(dev))
+#define HAS_SLICE_SHUTDOWN(dev)	(IS_HSW_GT3(dev) && i915_enable_rc6)
 
 #define INTEL_PCH_DEVICE_ID_MASK		0xff00
 #define INTEL_PCH_IBX_DEVICE_ID_TYPE		0x3b00
diff --git a/drivers/gpu/drm/i915/i915_sysfs.c b/drivers/gpu/drm/i915/i915_sysfs.c
index bb31ff3..86ccd52 100644
--- a/drivers/gpu/drm/i915/i915_sysfs.c
+++ b/drivers/gpu/drm/i915/i915_sysfs.c
@@ -117,6 +117,52 @@ static struct attribute_group rc6_attr_group = {
 	.name = power_group_name,
 	.attrs =  rc6_attrs
 };
+
+static ssize_t gt_slice_config_show(struct device *kdev,
+				    struct device_attribute *attr, char *buf)
+{
+	struct drm_minor *minor = container_of(kdev, struct drm_minor, kdev);
+	struct drm_device *dev = minor->dev;
+	struct drm_i915_private *dev_priv = dev->dev_private;
+
+	return sprintf(buf, "%s\n", I915_READ(MI_PREDICATE_RESULT_2) ==
+		       LOWER_SLICE_ENABLED ? "full" : "half");
+}
+
+static ssize_t gt_slice_config_store(struct device *kdev,
+				     struct device_attribute *attr,
+				     const char *buf, size_t count)
+{
+	struct drm_minor *minor = container_of(kdev, struct drm_minor, kdev);
+	struct drm_device *dev = minor->dev;
+	int ret;
+
+	if (!strncmp(buf, "full", sizeof("full") - 1)) {
+		ret = intel_set_gt_full(dev);
+		if (ret)
+			return ret;
+	} else if (!strncmp(buf, "half", sizeof("half") - 1)) {
+		ret = intel_set_gt_half(dev);
+		if (ret)
+			return ret;
+	} else
+		return -EINVAL;
+	return count;
+}
+
+static DEVICE_ATTR(gt_slice_config, S_IRUGO | S_IWUSR, gt_slice_config_show,
+		   gt_slice_config_store);
+
+static struct attribute *gt_slice_config_attrs[] = {
+	&dev_attr_gt_slice_config.attr,
+	NULL
+};
+
+static struct attribute_group gt_slice_config_attr_group = {
+	.name = power_group_name,
+	.attrs =  gt_slice_config_attrs
+};
+
 #endif
 
 static int l3_access_valid(struct drm_device *dev, loff_t offset)
@@ -558,6 +604,12 @@ void i915_setup_sysfs(struct drm_device *dev)
 		if (ret)
 			DRM_ERROR("RC6 residency sysfs setup failed\n");
 	}
+	if (HAS_SLICE_SHUTDOWN(dev)) {
+		ret = sysfs_merge_group(&dev->primary->kdev.kobj,
+					&gt_slice_config_attr_group);
+		if (ret)
+			DRM_ERROR("GT slice config sysfs setup failed\n");
+	}
 #endif
 	if (HAS_L3_DPF(dev)) {
 		ret = device_create_bin_file(&dev->primary->kdev, &dpf_attrs);
@@ -597,5 +649,7 @@ void i915_teardown_sysfs(struct drm_device *dev)
 	device_remove_bin_file(&dev->primary->kdev,  &dpf_attrs);
 #ifdef CONFIG_PM
 	sysfs_unmerge_group(&dev->primary->kdev.kobj, &rc6_attr_group);
+	sysfs_unmerge_group(&dev->primary->kdev.kobj,
+			    &gt_slice_config_attr_group);
 #endif
 }
diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index f21f3fa..a9abbb5 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -834,6 +834,8 @@ void intel_set_power_well(struct drm_device *dev, bool enable);
 void intel_enable_gt_powersave(struct drm_device *dev);
 void intel_disable_gt_powersave(struct drm_device *dev);
 void ironlake_teardown_rc6(struct drm_device *dev);
+int intel_set_gt_full(struct drm_device *dev);
+int intel_set_gt_half(struct drm_device *dev);
 void intel_init_gt_slices(struct drm_device *dev);
 void gen6_update_ring_freq(struct drm_device *dev);
 void gen6_rps_idle(struct drm_i915_private *dev_priv);
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index a1a2588..63af075 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -3866,14 +3866,66 @@ static void gen6_enable_rps(struct drm_device *dev)
 	gen6_gt_force_wake_put(dev_priv);
 }
 
-void intel_init_gt_slices(struct drm_device *dev)
+int intel_set_gt_full(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
 
-	if (!IS_HSW_GT3(dev))
-		return;
+	if (!HAS_SLICE_SHUTDOWN(dev))
+		return -ENODEV;
+
+	I915_WRITE(HSW_GT_SLICE_INFO, SLICE_SEL_BOTH);
+
+	/* Slices are enabled on RC6 exit */
+	gen6_gt_force_wake_get(dev_priv);
 
+	if (wait_for(((I915_READ(HSW_GT_SLICE_INFO) & SLICE_STATUS_MASK) ==
+		      SLICE_STATUS_BOTH_ON), 2000)) {
+		DRM_ERROR("Timeout enabling full gt slices\n");
+		I915_WRITE(HSW_GT_SLICE_INFO, ~SLICE_SEL_BOTH);
+		I915_WRITE(MI_PREDICATE_RESULT_2, LOWER_SLICE_DISABLED);
+		gen6_gt_force_wake_put(dev_priv);
+		return -ETIMEDOUT;
+	}
 	I915_WRITE(MI_PREDICATE_RESULT_2, LOWER_SLICE_ENABLED);
+	gen6_gt_force_wake_put(dev_priv);
+
+	return 0;
+}
+
+int intel_set_gt_half(struct drm_device *dev)
+{
+	struct drm_i915_private *dev_priv = dev->dev_private;
+
+	if (!HAS_SLICE_SHUTDOWN(dev))
+		return -ENODEV;
+
+	I915_WRITE(HSW_GT_SLICE_INFO, ~SLICE_SEL_BOTH);
+
+	/* Slices are disabled on RC6 exit */
+	gen6_gt_force_wake_get(dev_priv);
+
+	if (wait_for(((I915_READ(HSW_GT_SLICE_INFO) & SLICE_STATUS_MASK) ==
+		      SLICE_STATUS_MAIN_ON), 2000)) {
+		DRM_ERROR("Timed out disabling half gt slices\n");
+		I915_WRITE(HSW_GT_SLICE_INFO, SLICE_SEL_BOTH);
+		I915_WRITE(MI_PREDICATE_RESULT_2, LOWER_SLICE_ENABLED);
+		gen6_gt_force_wake_put(dev_priv);
+		return -ETIMEDOUT;
+	}
+	I915_WRITE(MI_PREDICATE_RESULT_2, LOWER_SLICE_DISABLED);
+	gen6_gt_force_wake_put(dev_priv);
+	return 0;
+}
+
+void intel_init_gt_slices(struct drm_device *dev)
+{
+	struct drm_i915_private *dev_priv = dev->dev_private;
+
+	if (IS_HSW_GT3(dev))
+		I915_WRITE(MI_PREDICATE_RESULT_2, LOWER_SLICE_ENABLED);
+
+	if (!HAS_SLICE_SHUTDOWN(dev))
+		return;
 
 	if (!i915_gt_slice_config) {
 		I915_WRITE(HSW_GT_SLICE_INFO, ~SLICE_SEL_BOTH);
-- 
1.7.11.7

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 3/3] drm/i915: HSW GT3 Slices: Userspace can dynamically forcibly enable all slices
  2013-10-15 14:41 [PATCH 1/3] drm/i915: Allow GT Slices Shutdown on Boot Rodrigo Vivi
                   ` (2 preceding siblings ...)
  2013-10-15 14:41 ` [PATCH 2/3] drm/i915: Slice Shutdown: Allow setting slice config (full x half) through sysfs Rodrigo Vivi
@ 2013-10-15 14:41 ` Rodrigo Vivi
  2013-10-15 16:14   ` Eric Anholt
  2013-11-05 22:44 ` [PATCH 1/4] drm/i915: Allow GT Slices Shutdown on Boot Rodrigo Vivi
  4 siblings, 1 reply; 31+ messages in thread
From: Rodrigo Vivi @ 2013-10-15 14:41 UTC (permalink / raw)
  To: intel-gfx

Rendering buffers that need full GT power can request full power through
this I915_EXEC_GT_FULL flag. If the default is the power saving with half
slices off it executes LRIs to immediately enable all slices.

CC: Eric Anholt <eric@anholt.net>
CC: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
---
 drivers/gpu/drm/i915/i915_drv.h            |  8 ++++
 drivers/gpu/drm/i915/i915_gem_execbuffer.c | 67 ++++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/i915_reg.h            | 11 +++++
 drivers/gpu/drm/i915/i915_sysfs.c          | 11 ++++-
 drivers/gpu/drm/i915/intel_display.c       |  6 +++
 drivers/gpu/drm/i915/intel_drv.h           |  1 +
 drivers/gpu/drm/i915/intel_pm.c            | 41 ++++++++++++++++--
 include/uapi/drm/i915_drm.h                |  5 ++-
 8 files changed, 144 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 685fb1d..bd4774e 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1219,6 +1219,12 @@ struct i915_package_c8 {
 	} regsave;
 };
 
+struct i915_gt_slices {
+	int state_default;
+	int forcing_full;
+	struct mutex m_lock;
+};
+
 typedef struct drm_i915_private {
 	struct drm_device *dev;
 	struct kmem_cache *slab;
@@ -1418,6 +1424,8 @@ typedef struct drm_i915_private {
 
 	struct i915_package_c8 pc8;
 
+	struct i915_gt_slices gt_slices;
+
 	/* Old dri1 support infrastructure, beware the dragons ya fools entering
 	 * here! */
 	struct i915_dri1_state dri1;
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 0ce0d47..16dc86e 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -923,6 +923,66 @@ i915_reset_gen7_sol_offsets(struct drm_device *dev,
 }
 
 static int
+i915_gt_full_immediate(struct drm_device *dev, struct intel_ring_buffer *ring)
+{
+	drm_i915_private_t *dev_priv = dev->dev_private;
+	int ret;
+
+	if (!HAS_SLICE_SHUTDOWN(dev) || ring == &dev_priv->ring[BCS])
+		return 0;
+
+	ret = intel_ring_begin(ring, 3);
+	if (ret)
+		return ret;
+	intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
+	intel_ring_emit(ring, HSW_GT_SLICE_INFO);
+	intel_ring_emit(ring, SLICE_SEL_BOTH);
+	intel_ring_advance(ring);
+
+	ret = intel_ring_begin(ring, 3);
+	if (ret)
+		return ret;
+	intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
+	intel_ring_emit(ring, MI_PREDICATE_RESULT_2);
+	intel_ring_emit(ring, LOWER_SLICE_ENABLED);
+	intel_ring_advance(ring);
+
+	ret = intel_ring_begin(ring, 3);
+	if (ret)
+		return ret;
+	intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
+	intel_ring_emit(ring, HSW_SLICESHUTDOWN);
+	intel_ring_emit(ring, ~SLICE_SHUTDOWN);
+	intel_ring_advance(ring);
+
+	ret = intel_ring_begin(ring, 3);
+	if (ret)
+		return ret;
+	intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
+	intel_ring_emit(ring, RC_IDLE_MAX_COUNT);
+	intel_ring_emit(ring, CS_IDLE_COUNT_1US);
+	intel_ring_advance(ring);
+
+	ret = intel_ring_begin(ring, 3);
+	if (ret)
+		return ret;
+	intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
+	intel_ring_emit(ring, WAIT_FOR_RC6_EXIT);
+	intel_ring_emit(ring, WAIT_RC6_EXIT | MASK_WAIT_RC6_EXIT);
+	intel_ring_advance(ring);
+
+	ret = intel_ring_begin(ring, 3);
+	if (ret)
+		return ret;
+	intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
+	intel_ring_emit(ring, RC_IDLE_MAX_COUNT);
+	intel_ring_emit(ring, CS_IDLE_COUNT_5US);
+	intel_ring_advance(ring);
+
+	return 0;
+}
+
+static int
 i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 		       struct drm_file *file,
 		       struct drm_i915_gem_execbuffer2 *args,
@@ -999,6 +1059,13 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 		return -EINVAL;
 	}
 
+	if (args->flags & I915_EXEC_GT_FULL &&
+	    dev_priv->gt_slices.state_default == 0 &&
+	    !dev_priv->gt_slices.forcing_full) {
+		dev_priv->gt_slices.forcing_full = true;
+		i915_gt_full_immediate(dev, ring);
+	}
+
 	mode = args->flags & I915_EXEC_CONSTANTS_MASK;
 	mask = I915_EXEC_CONSTANTS_MASK;
 	switch (mode) {
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 497c441..0146bef 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -277,6 +277,17 @@
 #define   SLICE_STATUS_MAIN_ON	(2<<0)
 #define   SLICE_STATUS_BOTH_ON	(3<<0)
 
+#define HSW_SLICESHUTDOWN	0xA190
+#define   SLICE_SHUTDOWN	(1<<0)
+
+#define RC_IDLE_MAX_COUNT	0x2054
+#define   CS_IDLE_COUNT_1US	(1<<1)
+#define   CS_IDLE_COUNT_5US	(1<<3)
+
+#define WAIT_FOR_RC6_EXIT	0x20CC
+#define   WAIT_RC6_EXIT		(1<<0)
+#define   MASK_WAIT_RC6_EXIT	(1<<16)
+
 /*
  * 3D instructions used by the kernel
  */
diff --git a/drivers/gpu/drm/i915/i915_sysfs.c b/drivers/gpu/drm/i915/i915_sysfs.c
index 86ccd52..25b0c5b 100644
--- a/drivers/gpu/drm/i915/i915_sysfs.c
+++ b/drivers/gpu/drm/i915/i915_sysfs.c
@@ -124,9 +124,13 @@ static ssize_t gt_slice_config_show(struct device *kdev,
 	struct drm_minor *minor = container_of(kdev, struct drm_minor, kdev);
 	struct drm_device *dev = minor->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
+	bool full;
 
-	return sprintf(buf, "%s\n", I915_READ(MI_PREDICATE_RESULT_2) ==
-		       LOWER_SLICE_ENABLED ? "full" : "half");
+	mutex_lock(&dev_priv->gt_slices.m_lock);
+	full = I915_READ(MI_PREDICATE_RESULT_2) == LOWER_SLICE_ENABLED;
+	mutex_unlock(&dev_priv->gt_slices.m_lock);
+
+	return sprintf(buf, "%s\n", full ? "full" : "half");
 }
 
 static ssize_t gt_slice_config_store(struct device *kdev,
@@ -135,16 +139,19 @@ static ssize_t gt_slice_config_store(struct device *kdev,
 {
 	struct drm_minor *minor = container_of(kdev, struct drm_minor, kdev);
 	struct drm_device *dev = minor->dev;
+	struct drm_i915_private *dev_priv = dev->dev_private;
 	int ret;
 
 	if (!strncmp(buf, "full", sizeof("full") - 1)) {
 		ret = intel_set_gt_full(dev);
 		if (ret)
 			return ret;
+		dev_priv->gt_slices.state_default = 1;
 	} else if (!strncmp(buf, "half", sizeof("half") - 1)) {
 		ret = intel_set_gt_half(dev);
 		if (ret)
 			return ret;
+		dev_priv->gt_slices.state_default = 0;
 	} else
 		return -EINVAL;
 	return count;
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 4f1b636..1a2f8bc 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -7778,6 +7778,12 @@ void intel_mark_idle(struct drm_device *dev)
 
 	if (dev_priv->info->gen >= 6)
 		gen6_rps_idle(dev->dev_private);
+
+	if (HAS_SLICE_SHUTDOWN(dev) && dev_priv->gt_slices.forcing_full &&
+	    dev_priv->gt_slices.state_default == 0) {
+		intel_set_gt_half_async(dev);
+		dev_priv->gt_slices.forcing_full = false;
+	}
 }
 
 void intel_mark_fb_busy(struct drm_i915_gem_object *obj,
diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index a9abbb5..98cd63e 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -836,6 +836,7 @@ void intel_disable_gt_powersave(struct drm_device *dev);
 void ironlake_teardown_rc6(struct drm_device *dev);
 int intel_set_gt_full(struct drm_device *dev);
 int intel_set_gt_half(struct drm_device *dev);
+void intel_set_gt_half_async(struct drm_device *dev);
 void intel_init_gt_slices(struct drm_device *dev);
 void gen6_update_ring_freq(struct drm_device *dev);
 void gen6_rps_idle(struct drm_i915_private *dev_priv);
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 63af075..24a0dc7 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -3873,6 +3873,7 @@ int intel_set_gt_full(struct drm_device *dev)
 	if (!HAS_SLICE_SHUTDOWN(dev))
 		return -ENODEV;
 
+	mutex_lock(&dev_priv->gt_slices.m_lock);
 	I915_WRITE(HSW_GT_SLICE_INFO, SLICE_SEL_BOTH);
 
 	/* Slices are enabled on RC6 exit */
@@ -3881,14 +3882,19 @@ int intel_set_gt_full(struct drm_device *dev)
 	if (wait_for(((I915_READ(HSW_GT_SLICE_INFO) & SLICE_STATUS_MASK) ==
 		      SLICE_STATUS_BOTH_ON), 2000)) {
 		DRM_ERROR("Timeout enabling full gt slices\n");
+
 		I915_WRITE(HSW_GT_SLICE_INFO, ~SLICE_SEL_BOTH);
 		I915_WRITE(MI_PREDICATE_RESULT_2, LOWER_SLICE_DISABLED);
+		mutex_unlock(&dev_priv->gt_slices.m_lock);
+
 		gen6_gt_force_wake_put(dev_priv);
 		return -ETIMEDOUT;
 	}
+
 	I915_WRITE(MI_PREDICATE_RESULT_2, LOWER_SLICE_ENABLED);
-	gen6_gt_force_wake_put(dev_priv);
+	mutex_unlock(&dev_priv->gt_slices.m_lock);
 
+	gen6_gt_force_wake_put(dev_priv);
 	return 0;
 }
 
@@ -3899,6 +3905,7 @@ int intel_set_gt_half(struct drm_device *dev)
 	if (!HAS_SLICE_SHUTDOWN(dev))
 		return -ENODEV;
 
+	mutex_lock(&dev_priv->gt_slices.m_lock);
 	I915_WRITE(HSW_GT_SLICE_INFO, ~SLICE_SEL_BOTH);
 
 	/* Slices are disabled on RC6 exit */
@@ -3907,16 +3914,40 @@ int intel_set_gt_half(struct drm_device *dev)
 	if (wait_for(((I915_READ(HSW_GT_SLICE_INFO) & SLICE_STATUS_MASK) ==
 		      SLICE_STATUS_MAIN_ON), 2000)) {
 		DRM_ERROR("Timed out disabling half gt slices\n");
+
 		I915_WRITE(HSW_GT_SLICE_INFO, SLICE_SEL_BOTH);
 		I915_WRITE(MI_PREDICATE_RESULT_2, LOWER_SLICE_ENABLED);
+		mutex_unlock(&dev_priv->gt_slices.m_lock);
+
 		gen6_gt_force_wake_put(dev_priv);
 		return -ETIMEDOUT;
 	}
+
 	I915_WRITE(MI_PREDICATE_RESULT_2, LOWER_SLICE_DISABLED);
+	mutex_unlock(&dev_priv->gt_slices.m_lock);
+
 	gen6_gt_force_wake_put(dev_priv);
 	return 0;
 }
 
+/**
+ * On Haswell, slices on/off transitions are done via RC6 sequence.
+ * This async function allows you to request slices shutdown without waiting.
+ * Slices will be disabled on next RC6 exit.
+ */
+void intel_set_gt_half_async(struct drm_device *dev)
+{
+	struct drm_i915_private *dev_priv = dev->dev_private;
+
+	if (!HAS_SLICE_SHUTDOWN(dev))
+		return;
+
+	mutex_lock(&dev_priv->gt_slices.m_lock);
+	I915_WRITE(HSW_GT_SLICE_INFO, ~SLICE_SEL_BOTH);
+	I915_WRITE(MI_PREDICATE_RESULT_2, LOWER_SLICE_DISABLED);
+	mutex_unlock(&dev_priv->gt_slices.m_lock);
+}
+
 void intel_init_gt_slices(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
@@ -3927,9 +3958,13 @@ void intel_init_gt_slices(struct drm_device *dev)
 	if (!HAS_SLICE_SHUTDOWN(dev))
 		return;
 
+	dev_priv->gt_slices.state_default = 1;
+	dev_priv->gt_slices.forcing_full = false;
+	mutex_init(&dev_priv->gt_slices.m_lock);
+
 	if (!i915_gt_slice_config) {
-		I915_WRITE(HSW_GT_SLICE_INFO, ~SLICE_SEL_BOTH);
-		I915_WRITE(MI_PREDICATE_RESULT_2, LOWER_SLICE_DISABLED);
+		intel_set_gt_half_async(dev);
+		dev_priv->gt_slices.state_default = 0;
 	}
 }
 
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 3a4e97b..144054f 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -731,7 +731,10 @@ struct drm_i915_gem_execbuffer2 {
  */
 #define I915_EXEC_HANDLE_LUT		(1<<12)
 
-#define __I915_EXEC_UNKNOWN_FLAGS -(I915_EXEC_HANDLE_LUT<<1)
+/* Use all available GT Slisces */
+#define I915_EXEC_GT_FULL		(1<<13)
+
+#define __I915_EXEC_UNKNOWN_FLAGS -(I915_EXEC_GT_FULL<<1)
 
 #define I915_EXEC_CONTEXT_ID_MASK	(0xffffffff)
 #define i915_execbuffer2_set_context_id(eb2, context) \
-- 
1.7.11.7

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [PATCH 3/3] drm/i915: HSW GT3 Slices: Userspace can dynamically forcibly enable all slices
  2013-10-15 14:41 ` [PATCH 3/3] drm/i915: HSW GT3 Slices: Userspace can dynamically forcibly enable all slices Rodrigo Vivi
@ 2013-10-15 16:14   ` Eric Anholt
  2013-10-15 17:19     ` Rodrigo Vivi
  0 siblings, 1 reply; 31+ messages in thread
From: Eric Anholt @ 2013-10-15 16:14 UTC (permalink / raw)
  To: Rodrigo Vivi, intel-gfx


[-- Attachment #1.1: Type: text/plain, Size: 667 bytes --]

Rodrigo Vivi <rodrigo.vivi@gmail.com> writes:

> Rendering buffers that need full GT power can request full power through
> this I915_EXEC_GT_FULL flag. If the default is the power saving with half
> slices off it executes LRIs to immediately enable all slices.

How is userspace supposed to decide that the GPU needs to be at full
power?  Power management is a system problem which wants awareness of
the total load on the GPU from all clients, not just 1.  The kernel can
get that information, but userspace can't.

Basically, if you give userspace this knob, userspace will just have to
set it on every batchbuffer.  At which point, why give them the knob at
all?

[-- Attachment #1.2: Type: application/pgp-signature, Size: 835 bytes --]

[-- Attachment #2: Type: text/plain, Size: 159 bytes --]

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 3/3] drm/i915: HSW GT3 Slices: Userspace can dynamically forcibly enable all slices
  2013-10-15 16:14   ` Eric Anholt
@ 2013-10-15 17:19     ` Rodrigo Vivi
  2013-10-15 20:50       ` [PATCH 3/4] drm/i915: HSW GT3 Slices: exec flag to warn kernel that userspace is using predication Rodrigo Vivi
  0 siblings, 1 reply; 31+ messages in thread
From: Rodrigo Vivi @ 2013-10-15 17:19 UTC (permalink / raw)
  To: Eric Anholt; +Cc: intel-gfx

please ignore this one and other two (libdrm and igt) with exec
flag... will do a v2 and resend soon..

On Tue, Oct 15, 2013 at 1:14 PM, Eric Anholt <eric@anholt.net> wrote:
> Rodrigo Vivi <rodrigo.vivi@gmail.com> writes:
>
>> Rendering buffers that need full GT power can request full power through
>> this I915_EXEC_GT_FULL flag. If the default is the power saving with half
>> slices off it executes LRIs to immediately enable all slices.
>
> How is userspace supposed to decide that the GPU needs to be at full
> power?  Power management is a system problem which wants awareness of
> the total load on the GPU from all clients, not just 1.  The kernel can
> get that information, but userspace can't.
>
> Basically, if you give userspace this knob, userspace will just have to
> set it on every batchbuffer.  At which point, why give them the knob at
> all?



-- 
Rodrigo Vivi
Blog: http://blog.vivi.eng.br

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH] i915_drm: add exec flag to warn kernel that userspace is using predication
  2013-10-15 14:41 ` [PATCH] i915_drm: add exec flag for request forcing Intel GT full Rodrigo Vivi
@ 2013-10-15 20:50   ` Rodrigo Vivi
  0 siblings, 0 replies; 31+ messages in thread
From: Rodrigo Vivi @ 2013-10-15 20:50 UTC (permalink / raw)
  To: intel-gfx

If Userspace isn't using MI_PREDICATE all slices must be enabled for
backward compatibility.

If I915_EXEC_USE_PREDICATE isn't set and defaul is set to half, kernel will force
all slices on.

v2: include more tests and s/GT_FULL/USE_PREDICATE
    on code and on commit message.

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
---
 include/drm/i915_drm.h | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/include/drm/i915_drm.h b/include/drm/i915_drm.h
index aa983f3..933656c 100644
--- a/include/drm/i915_drm.h
+++ b/include/drm/i915_drm.h
@@ -670,6 +670,12 @@ struct drm_i915_gem_execbuffer2 {
 /** Resets the SO write offset registers for transform feedback on gen7. */
 #define I915_EXEC_GEN7_SOL_RESET	(1<<8)
 
+/* If this flag is set userspace is using predicate and half slices can be
+ * let disabled for power saving. Otherwise use all slices even when disabled
+ * by boot parameter or via sysfs interface
+ */
+#define I915_EXEC_USE_PREDICATE		(1<<13)
+
 #define I915_EXEC_CONTEXT_ID_MASK	(0xffffffff)
 #define i915_execbuffer2_set_context_id(eb2, context) \
 	(eb2).rsvd1 = context & I915_EXEC_CONTEXT_ID_MASK
-- 
1.7.11.7

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 3/4] drm/i915: HSW GT3 Slices: exec flag to warn kernel that userspace is using predication
  2013-10-15 17:19     ` Rodrigo Vivi
@ 2013-10-15 20:50       ` Rodrigo Vivi
  2013-10-16  9:15         ` Chris Wilson
  0 siblings, 1 reply; 31+ messages in thread
From: Rodrigo Vivi @ 2013-10-15 20:50 UTC (permalink / raw)
  To: intel-gfx

If Userspace isn't using MI_PREDICATE all slices must be enabled for
backward compatibility.

If I915_EXEC_USE_PREDICATE isn't set and defaul is set to half, kernel will force
all slices on.

v2: fix the inverted logic for backwards compatibility
    USE_PREDICATE unset force gt_full when defaul is half
    instead of GT_FULL flag.

CC: Eric Anholt <eric@anholt.net>
CC: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
---
 drivers/gpu/drm/i915/i915_drv.h            |  8 ++++
 drivers/gpu/drm/i915/i915_gem_execbuffer.c | 67 ++++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/i915_reg.h            | 11 +++++
 drivers/gpu/drm/i915/i915_sysfs.c          | 11 ++++-
 drivers/gpu/drm/i915/intel_display.c       |  6 +++
 drivers/gpu/drm/i915/intel_drv.h           |  1 +
 drivers/gpu/drm/i915/intel_pm.c            | 39 ++++++++++++++++-
 include/uapi/drm/i915_drm.h                |  8 +++-
 8 files changed, 146 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 685fb1d..bd4774e 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1219,6 +1219,12 @@ struct i915_package_c8 {
 	} regsave;
 };
 
+struct i915_gt_slices {
+	int state_default;
+	int forcing_full;
+	struct mutex m_lock;
+};
+
 typedef struct drm_i915_private {
 	struct drm_device *dev;
 	struct kmem_cache *slab;
@@ -1418,6 +1424,8 @@ typedef struct drm_i915_private {
 
 	struct i915_package_c8 pc8;
 
+	struct i915_gt_slices gt_slices;
+
 	/* Old dri1 support infrastructure, beware the dragons ya fools entering
 	 * here! */
 	struct i915_dri1_state dri1;
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 0ce0d47..eb09a51 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -923,6 +923,66 @@ i915_reset_gen7_sol_offsets(struct drm_device *dev,
 }
 
 static int
+i915_gt_full_immediate(struct drm_device *dev, struct intel_ring_buffer *ring)
+{
+	drm_i915_private_t *dev_priv = dev->dev_private;
+	int ret;
+
+	if (!HAS_SLICE_SHUTDOWN(dev) || ring == &dev_priv->ring[BCS])
+		return 0;
+
+	ret = intel_ring_begin(ring, 3);
+	if (ret)
+		return ret;
+	intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
+	intel_ring_emit(ring, HSW_GT_SLICE_INFO);
+	intel_ring_emit(ring, SLICE_SEL_BOTH);
+	intel_ring_advance(ring);
+
+	ret = intel_ring_begin(ring, 3);
+	if (ret)
+		return ret;
+	intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
+	intel_ring_emit(ring, MI_PREDICATE_RESULT_2);
+	intel_ring_emit(ring, LOWER_SLICE_ENABLED);
+	intel_ring_advance(ring);
+
+	ret = intel_ring_begin(ring, 3);
+	if (ret)
+		return ret;
+	intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
+	intel_ring_emit(ring, HSW_SLICESHUTDOWN);
+	intel_ring_emit(ring, ~SLICE_SHUTDOWN);
+	intel_ring_advance(ring);
+
+	ret = intel_ring_begin(ring, 3);
+	if (ret)
+		return ret;
+	intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
+	intel_ring_emit(ring, RC_IDLE_MAX_COUNT);
+	intel_ring_emit(ring, CS_IDLE_COUNT_1US);
+	intel_ring_advance(ring);
+
+	ret = intel_ring_begin(ring, 3);
+	if (ret)
+		return ret;
+	intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
+	intel_ring_emit(ring, WAIT_FOR_RC6_EXIT);
+	intel_ring_emit(ring, WAIT_RC6_EXIT | MASK_WAIT_RC6_EXIT);
+	intel_ring_advance(ring);
+
+	ret = intel_ring_begin(ring, 3);
+	if (ret)
+		return ret;
+	intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
+	intel_ring_emit(ring, RC_IDLE_MAX_COUNT);
+	intel_ring_emit(ring, CS_IDLE_COUNT_5US);
+	intel_ring_advance(ring);
+
+	return 0;
+}
+
+static int
 i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 		       struct drm_file *file,
 		       struct drm_i915_gem_execbuffer2 *args,
@@ -999,6 +1059,13 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 		return -EINVAL;
 	}
 
+	if ((args->flags & I915_EXEC_USE_PREDICATE) == 0 &&
+	    dev_priv->gt_slices.state_default == 0 &&
+	    !dev_priv->gt_slices.forcing_full) {
+		dev_priv->gt_slices.forcing_full = true;
+		i915_gt_full_immediate(dev, ring);
+	}
+
 	mode = args->flags & I915_EXEC_CONSTANTS_MASK;
 	mask = I915_EXEC_CONSTANTS_MASK;
 	switch (mode) {
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 497c441..0146bef 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -277,6 +277,17 @@
 #define   SLICE_STATUS_MAIN_ON	(2<<0)
 #define   SLICE_STATUS_BOTH_ON	(3<<0)
 
+#define HSW_SLICESHUTDOWN	0xA190
+#define   SLICE_SHUTDOWN	(1<<0)
+
+#define RC_IDLE_MAX_COUNT	0x2054
+#define   CS_IDLE_COUNT_1US	(1<<1)
+#define   CS_IDLE_COUNT_5US	(1<<3)
+
+#define WAIT_FOR_RC6_EXIT	0x20CC
+#define   WAIT_RC6_EXIT		(1<<0)
+#define   MASK_WAIT_RC6_EXIT	(1<<16)
+
 /*
  * 3D instructions used by the kernel
  */
diff --git a/drivers/gpu/drm/i915/i915_sysfs.c b/drivers/gpu/drm/i915/i915_sysfs.c
index 86ccd52..25b0c5b 100644
--- a/drivers/gpu/drm/i915/i915_sysfs.c
+++ b/drivers/gpu/drm/i915/i915_sysfs.c
@@ -124,9 +124,13 @@ static ssize_t gt_slice_config_show(struct device *kdev,
 	struct drm_minor *minor = container_of(kdev, struct drm_minor, kdev);
 	struct drm_device *dev = minor->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
+	bool full;
 
-	return sprintf(buf, "%s\n", I915_READ(MI_PREDICATE_RESULT_2) ==
-		       LOWER_SLICE_ENABLED ? "full" : "half");
+	mutex_lock(&dev_priv->gt_slices.m_lock);
+	full = I915_READ(MI_PREDICATE_RESULT_2) == LOWER_SLICE_ENABLED;
+	mutex_unlock(&dev_priv->gt_slices.m_lock);
+
+	return sprintf(buf, "%s\n", full ? "full" : "half");
 }
 
 static ssize_t gt_slice_config_store(struct device *kdev,
@@ -135,16 +139,19 @@ static ssize_t gt_slice_config_store(struct device *kdev,
 {
 	struct drm_minor *minor = container_of(kdev, struct drm_minor, kdev);
 	struct drm_device *dev = minor->dev;
+	struct drm_i915_private *dev_priv = dev->dev_private;
 	int ret;
 
 	if (!strncmp(buf, "full", sizeof("full") - 1)) {
 		ret = intel_set_gt_full(dev);
 		if (ret)
 			return ret;
+		dev_priv->gt_slices.state_default = 1;
 	} else if (!strncmp(buf, "half", sizeof("half") - 1)) {
 		ret = intel_set_gt_half(dev);
 		if (ret)
 			return ret;
+		dev_priv->gt_slices.state_default = 0;
 	} else
 		return -EINVAL;
 	return count;
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 4f1b636..1a2f8bc 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -7778,6 +7778,12 @@ void intel_mark_idle(struct drm_device *dev)
 
 	if (dev_priv->info->gen >= 6)
 		gen6_rps_idle(dev->dev_private);
+
+	if (HAS_SLICE_SHUTDOWN(dev) && dev_priv->gt_slices.forcing_full &&
+	    dev_priv->gt_slices.state_default == 0) {
+		intel_set_gt_half_async(dev);
+		dev_priv->gt_slices.forcing_full = false;
+	}
 }
 
 void intel_mark_fb_busy(struct drm_i915_gem_object *obj,
diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index a9abbb5..98cd63e 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -836,6 +836,7 @@ void intel_disable_gt_powersave(struct drm_device *dev);
 void ironlake_teardown_rc6(struct drm_device *dev);
 int intel_set_gt_full(struct drm_device *dev);
 int intel_set_gt_half(struct drm_device *dev);
+void intel_set_gt_half_async(struct drm_device *dev);
 void intel_init_gt_slices(struct drm_device *dev);
 void gen6_update_ring_freq(struct drm_device *dev);
 void gen6_rps_idle(struct drm_i915_private *dev_priv);
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 63af075..2ded34c 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -3873,6 +3873,7 @@ int intel_set_gt_full(struct drm_device *dev)
 	if (!HAS_SLICE_SHUTDOWN(dev))
 		return -ENODEV;
 
+	mutex_lock(&dev_priv->gt_slices.m_lock);
 	I915_WRITE(HSW_GT_SLICE_INFO, SLICE_SEL_BOTH);
 
 	/* Slices are enabled on RC6 exit */
@@ -3881,13 +3882,18 @@ int intel_set_gt_full(struct drm_device *dev)
 	if (wait_for(((I915_READ(HSW_GT_SLICE_INFO) & SLICE_STATUS_MASK) ==
 		      SLICE_STATUS_BOTH_ON), 2000)) {
 		DRM_ERROR("Timeout enabling full gt slices\n");
+
 		I915_WRITE(HSW_GT_SLICE_INFO, ~SLICE_SEL_BOTH);
 		I915_WRITE(MI_PREDICATE_RESULT_2, LOWER_SLICE_DISABLED);
+
 		gen6_gt_force_wake_put(dev_priv);
+		mutex_unlock(&dev_priv->gt_slices.m_lock);
 		return -ETIMEDOUT;
 	}
+
 	I915_WRITE(MI_PREDICATE_RESULT_2, LOWER_SLICE_ENABLED);
 	gen6_gt_force_wake_put(dev_priv);
+	mutex_unlock(&dev_priv->gt_slices.m_lock);
 
 	return 0;
 }
@@ -3899,6 +3905,7 @@ int intel_set_gt_half(struct drm_device *dev)
 	if (!HAS_SLICE_SHUTDOWN(dev))
 		return -ENODEV;
 
+	mutex_lock(&dev_priv->gt_slices.m_lock);
 	I915_WRITE(HSW_GT_SLICE_INFO, ~SLICE_SEL_BOTH);
 
 	/* Slices are disabled on RC6 exit */
@@ -3907,16 +3914,40 @@ int intel_set_gt_half(struct drm_device *dev)
 	if (wait_for(((I915_READ(HSW_GT_SLICE_INFO) & SLICE_STATUS_MASK) ==
 		      SLICE_STATUS_MAIN_ON), 2000)) {
 		DRM_ERROR("Timed out disabling half gt slices\n");
+
 		I915_WRITE(HSW_GT_SLICE_INFO, SLICE_SEL_BOTH);
 		I915_WRITE(MI_PREDICATE_RESULT_2, LOWER_SLICE_ENABLED);
+
 		gen6_gt_force_wake_put(dev_priv);
+		mutex_unlock(&dev_priv->gt_slices.m_lock);
 		return -ETIMEDOUT;
 	}
+
 	I915_WRITE(MI_PREDICATE_RESULT_2, LOWER_SLICE_DISABLED);
 	gen6_gt_force_wake_put(dev_priv);
+	mutex_unlock(&dev_priv->gt_slices.m_lock);
+
 	return 0;
 }
 
+/**
+ * On Haswell, slices on/off transitions are done via RC6 sequence.
+ * This async function allows you to request slices shutdown without waiting.
+ * Slices will be disabled on next RC6 exit.
+ */
+void intel_set_gt_half_async(struct drm_device *dev)
+{
+	struct drm_i915_private *dev_priv = dev->dev_private;
+
+	if (!HAS_SLICE_SHUTDOWN(dev))
+		return;
+
+	mutex_lock(&dev_priv->gt_slices.m_lock);
+	I915_WRITE(HSW_GT_SLICE_INFO, ~SLICE_SEL_BOTH);
+	I915_WRITE(MI_PREDICATE_RESULT_2, LOWER_SLICE_DISABLED);
+	mutex_unlock(&dev_priv->gt_slices.m_lock);
+}
+
 void intel_init_gt_slices(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
@@ -3927,9 +3958,13 @@ void intel_init_gt_slices(struct drm_device *dev)
 	if (!HAS_SLICE_SHUTDOWN(dev))
 		return;
 
+	dev_priv->gt_slices.state_default = 1;
+	dev_priv->gt_slices.forcing_full = false;
+	mutex_init(&dev_priv->gt_slices.m_lock);
+
 	if (!i915_gt_slice_config) {
-		I915_WRITE(HSW_GT_SLICE_INFO, ~SLICE_SEL_BOTH);
-		I915_WRITE(MI_PREDICATE_RESULT_2, LOWER_SLICE_DISABLED);
+		intel_set_gt_half_async(dev);
+		dev_priv->gt_slices.state_default = 0;
 	}
 }
 
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 3a4e97b..3fa3e24 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -731,7 +731,13 @@ struct drm_i915_gem_execbuffer2 {
  */
 #define I915_EXEC_HANDLE_LUT		(1<<12)
 
-#define __I915_EXEC_UNKNOWN_FLAGS -(I915_EXEC_HANDLE_LUT<<1)
+/* If this flag is set userspace is using predicate and half slices can be
+ * let disabled for power saving. Otherwise use all slices even when disabled
+ * by boot parameter or via sysfs interface
+ */
+#define I915_EXEC_USE_PREDICATE		(1<<13)
+
+#define __I915_EXEC_UNKNOWN_FLAGS -(I915_EXEC_USE_PREDICATE<<1)
 
 #define I915_EXEC_CONTEXT_ID_MASK	(0xffffffff)
 #define i915_execbuffer2_set_context_id(eb2, context) \
-- 
1.7.11.7

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH] testcases: Slice Shutdown testscases.
  2013-10-15 14:41 ` [PATCH] testcases: Slice Shutdown sysfs switch and gt force full exec buffer flag Rodrigo Vivi
@ 2013-10-15 20:51   ` Rodrigo Vivi
  2013-10-21 21:01     ` Rodrigo Vivi
  0 siblings, 1 reply; 31+ messages in thread
From: Rodrigo Vivi @ 2013-10-15 20:51 UTC (permalink / raw)
  To: intel-gfx

1. sysfs half/full switch.
2. on full execbuf without I915_EXEC_USE_PREDICATE
3. on full execbuf with I915_EXEC_USE_PREDICATE
4. on half execbuf without I915_EXEC_USE_PREDICATE
5. on half execbuf with I915_EXEC_USE_PREDICATE

v2: include more tests and s/GT_FULL/USE_PREDICATE

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
---
 tests/Makefile.am       |   1 +
 tests/gt_slice_config.c | 325 ++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 326 insertions(+)
 create mode 100644 tests/gt_slice_config.c

diff --git a/tests/Makefile.am b/tests/Makefile.am
index 0290ae0..cf105e4 100644
--- a/tests/Makefile.am
+++ b/tests/Makefile.am
@@ -50,6 +50,7 @@ TESTS_progs_M = \
 	gem_tiled_blits \
 	gem_tiled_partial_pwrite_pread \
 	gem_write_read_ring_switch \
+	gt_slice_config \
 	kms_flip \
 	kms_render \
 	kms_setmode \
diff --git a/tests/gt_slice_config.c b/tests/gt_slice_config.c
new file mode 100644
index 0000000..7b9f554
--- /dev/null
+++ b/tests/gt_slice_config.c
@@ -0,0 +1,325 @@
+/*
+ * Copyright © 2013 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ *
+ * Authors:
+ *    Rodrigo Vivi <rodrigo.vivi@intel.com>
+ *
+ */
+
+/*
+ * Testcase: Test GT slice shutdown feature
+ *
+ * Sub tests:
+ * 1. sysfs half/full switch.
+ * 2. on full execbuf without I915_EXEC_USE_PREDICATE
+ * 3. on full execbuf with I915_EXEC_USE_PREDICATE
+ * 4. on half execbuf without I915_EXEC_USE_PREDICATE
+ * 5. on half execbuf with I915_EXEC_USE_PREDICATE
+ */
+
+#define _GNU_SOURCE
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <unistd.h>
+#include "drmtest.h"
+
+static void exec(int fd, uint32_t handle, bool use_predicate)
+{
+        struct drm_i915_gem_execbuffer2 execbuf;
+        struct drm_i915_gem_exec_object2 gem_exec[1];
+        uint32_t b[2] = {MI_BATCH_BUFFER_END};
+        int loops = 2000000;
+        int ret = 0;
+
+        gem_write(fd, handle, 0, b, sizeof(b));
+
+        gem_exec[0].handle = handle;
+        gem_exec[0].relocation_count = 0;
+        gem_exec[0].relocs_ptr = 0;
+        gem_exec[0].alignment = 0;
+        gem_exec[0].offset = 0;
+        gem_exec[0].flags = 0;
+        gem_exec[0].rsvd1 = 0;
+        gem_exec[0].rsvd2 = 0;
+
+        execbuf.buffers_ptr = (uintptr_t)gem_exec;
+        execbuf.buffer_count = 1;
+        execbuf.batch_start_offset = 0;
+        execbuf.batch_len = 8;
+        execbuf.cliprects_ptr = 0;
+        execbuf.num_cliprects = 0;
+        execbuf.DR1 = 0;
+        execbuf.DR4 = 0;
+        execbuf.flags =  I915_EXEC_RENDER;
+	if (use_predicate)
+		execbuf.flags |= I915_EXEC_USE_PREDICATE;
+        i915_execbuffer2_set_context_id(execbuf, 0);
+        execbuf.rsvd2 = 0;
+
+        while (loops-- && ret == 0) {
+                ret = drmIoctl(fd,
+                               DRM_IOCTL_I915_GEM_EXECBUFFER2,
+                               &execbuf);
+        }
+	gem_sync(fd, handle);
+}
+
+static bool is_full(const int device)
+{
+	char *path;
+	FILE *file;
+	int ret;
+	char str[5];
+
+	ret = asprintf(&path, "/sys/class/drm/card%d/power/gt_slice_config",
+		       device);
+	igt_assert(ret != -1);
+
+	file = fopen(path, "r");
+	igt_require(file);
+
+	ret = fscanf(file, "%s", str);
+	igt_assert(ret != 0);
+
+	fclose(file);
+	return strcmp(str, "half");
+}
+
+static int set_status(const int device, bool full)
+{
+	char *path;
+	FILE *file;
+	int ret;
+	char str[5];
+	int attempts = 10;
+
+	ret = asprintf(&path, "/sys/class/drm/card%d/power/gt_slice_config",
+		       device);
+	igt_assert(ret != -1);
+
+	file = fopen(path, "r");
+	igt_require(file);
+
+	ret = fscanf(file, "%s", str);
+	igt_assert(ret != 0);
+
+	fclose(file);
+
+	if (full == strcmp(str, "half"))
+		return 0;
+
+	while (attempts-- && ret != 0) {
+		file = fopen(path, "w");
+		igt_require(file);
+		ret = fprintf(file, "%s\n", full ? "full": "half");
+		igt_assert(ret != -1);
+		ret = fclose(file);
+		sleep(1);
+	}
+	return ret;
+}
+
+int main(int argc, char **argv)
+{
+	char *path;
+	FILE *file;
+	unsigned int rc6_enabled;
+	int ret;
+	uint32_t handle;
+	const int device = drm_get_card();
+	const int fd = drm_open_any();
+	const int devid = intel_get_drm_devid(fd);
+	bool initial = is_full(device);
+
+	igt_skip_on_simulation();
+
+	igt_subtest_init(argc, argv);
+
+	igt_fixture {
+		/* On Haswell Slices on/off switch depends on RC6 exit */
+		ret = asprintf(&path, "/sys/class/drm/card%d/power/rc6_enable",
+			       device);
+		igt_assert(ret != -1);
+
+		file = fopen(path, "r");
+		igt_require(file);
+
+		fscanf(file, "%u", &rc6_enabled);
+		fclose(file);
+
+		igt_require(IS_HASWELL(devid));
+		igt_require(rc6_enabled);
+	}
+
+	igt_subtest("sysfs") {
+		/* Switching states */
+		ret = set_status(device, !initial);
+		if (ret < 0)
+			igt_fail(ret);
+
+		/* Check if state has really changed */
+		if (initial == is_full(device)) {
+			fprintf(stderr,
+				"gt_slice_config status hasn't changed\n");
+			igt_fail(-1);
+		}
+		igt_success();
+	}
+
+	igt_subtest("execbuf-full") {
+		/* Enable all slices */
+		ret = set_status(device, true);
+		if (ret < 0)
+			igt_fail(ret);
+
+		igt_fork(child, 1) {
+			/* Start Render with I915_EXEC_USE_PREDICATE */
+			handle = gem_create(fd, 4096);
+			exec(fd, handle, false);
+			gem_close(fd, handle);
+		}
+
+		sleep(1);
+		if (!is_full(device)) {
+			fprintf(stderr, "All slices should be enabled\n");
+			igt_fail(-1);
+		}
+
+		/* Wait for idleness */
+		igt_waitchildren();
+		sleep(3);
+
+		/* Check if it is back to half */
+		if (!is_full(device)) {
+			fprintf(stderr, "All slices should be enabled\n");
+			igt_fail(-1);
+		}
+
+		igt_success();
+	}
+
+	igt_subtest("execbuf-full-predicate") {
+		/* Enable all slices */
+		ret = set_status(device, true);
+		if (ret < 0)
+			igt_fail(ret);
+
+		igt_fork(child, 1) {
+			/* Start Render with I915_EXEC_USE_PREDICATE */
+			handle = gem_create(fd, 4096);
+			exec(fd, handle, true);
+			gem_close(fd, handle);
+		}
+
+		sleep(1);
+		if (!is_full(device)) {
+			fprintf(stderr, "All slices should be enabled\n");
+			igt_fail(-1);
+		}
+
+		/* Wait for idleness */
+		igt_waitchildren();
+		sleep(3);
+
+		/* Check if it is back to half */
+		if (!is_full(device)) {
+			fprintf(stderr, "Only half slices should be enabled\n");
+			igt_fail(-1);
+		}
+
+		igt_success();
+	}
+
+	igt_subtest("execbuf-half") {
+		/* Disable half slices */
+		ret = set_status(device, false);
+		if (ret < 0)
+			igt_fail(ret);
+
+		igt_fork(child, 1) {
+			/* Start Render with I915_EXEC_USE_PREDICATE */
+			handle = gem_create(fd, 4096);
+			exec(fd, handle, true);
+			gem_close(fd, handle);
+		}
+
+		/* Check if it stays with only half slices enabled */
+		sleep(1);
+		if (is_full(device)) {
+			fprintf(stderr, "Only half slices should be enabled during execution with predicate\n");
+			igt_fail(-1);
+		}
+
+		/* Wait for idleness */
+		igt_waitchildren();
+		sleep(3);
+
+		/* Check if it is still half */
+		if (is_full(device)) {
+			fprintf(stderr, "Only half slices should be enabled\n");
+			igt_fail(-1);
+		}
+		igt_success();
+	}
+
+	igt_subtest("execbuf-half-predicate") {
+		/* Disable half slices */
+		ret = set_status(device, false);
+		if (ret < 0)
+			igt_fail(ret);
+
+		igt_fork(child, 1) {
+			/* Start Render with I915_EXEC_USE_PREDICATE*/
+			handle = gem_create(fd, 4096);
+			exec(fd, handle, true);
+			gem_close(fd, handle);
+		}
+
+		/* Check if it stays with only half slices enabled */
+		sleep(1);
+		if (is_full(device)) {
+			fprintf(stderr, "Only half slices should be enabled during execution with predicate\n");
+			set_status(device, initial);
+			igt_fail(-1);
+		}
+
+		/* Wait for idleness */
+		igt_waitchildren();
+		sleep(3);
+
+		/* Check if it is still half */
+		if (is_full(device)) {
+			fprintf(stderr, "Only half slices should be enabled\n");
+			set_status(device, initial);
+			igt_fail(-1);
+		}
+		set_status(device, initial);
+		igt_success();
+	}
+
+	igt_fixture {
+		close(fd);
+	}
+
+	igt_exit();
+}
-- 
1.7.11.7

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [PATCH 3/4] drm/i915: HSW GT3 Slices: exec flag to warn kernel that userspace is using predication
  2013-10-15 20:50       ` [PATCH 3/4] drm/i915: HSW GT3 Slices: exec flag to warn kernel that userspace is using predication Rodrigo Vivi
@ 2013-10-16  9:15         ` Chris Wilson
  2013-10-16 12:12           ` Rodrigo Vivi
  2013-10-31 23:07           ` Rodrigo Vivi
  0 siblings, 2 replies; 31+ messages in thread
From: Chris Wilson @ 2013-10-16  9:15 UTC (permalink / raw)
  To: Rodrigo Vivi; +Cc: intel-gfx

On Tue, Oct 15, 2013 at 05:50:49PM -0300, Rodrigo Vivi wrote:
> If Userspace isn't using MI_PREDICATE all slices must be enabled for
> backward compatibility.
> 
> If I915_EXEC_USE_PREDICATE isn't set and defaul is set to half, kernel will force
> all slices on.
> 
> v2: fix the inverted logic for backwards compatibility
>     USE_PREDICATE unset force gt_full when defaul is half
>     instead of GT_FULL flag.
> 
> CC: Eric Anholt <eric@anholt.net>
> CC: Kenneth Graunke <kenneth@whitecape.org>
> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
> ---
>  drivers/gpu/drm/i915/i915_drv.h            |  8 ++++
>  drivers/gpu/drm/i915/i915_gem_execbuffer.c | 67 ++++++++++++++++++++++++++++++
>  drivers/gpu/drm/i915/i915_reg.h            | 11 +++++
>  drivers/gpu/drm/i915/i915_sysfs.c          | 11 ++++-
>  drivers/gpu/drm/i915/intel_display.c       |  6 +++
>  drivers/gpu/drm/i915/intel_drv.h           |  1 +
>  drivers/gpu/drm/i915/intel_pm.c            | 39 ++++++++++++++++-
>  include/uapi/drm/i915_drm.h                |  8 +++-
>  8 files changed, 146 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 685fb1d..bd4774e 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -1219,6 +1219,12 @@ struct i915_package_c8 {
>  	} regsave;
>  };
>  
> +struct i915_gt_slices {
> +	int state_default;

Either use it as an integer (a count of slices enabled by default) or
make it an enum.

> +	int forcing_full;

bool legacy_userspace_busy; ?

> +	struct mutex m_lock;

Ugh. Just struct mutex lock; /* locks all access to this struct and
slice registers */ will suffice

> +};
> +
>  typedef struct drm_i915_private {
>  	struct drm_device *dev;
>  	struct kmem_cache *slab;
> @@ -1418,6 +1424,8 @@ typedef struct drm_i915_private {
>  
>  	struct i915_package_c8 pc8;
>  
> +	struct i915_gt_slices gt_slices;
> +
>  	/* Old dri1 support infrastructure, beware the dragons ya fools entering
>  	 * here! */
>  	struct i915_dri1_state dri1;
> diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> index 0ce0d47..eb09a51 100644
> --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> @@ -923,6 +923,66 @@ i915_reset_gen7_sol_offsets(struct drm_device *dev,
>  }
>  
>  static int
> +i915_gt_full_immediate(struct drm_device *dev, struct intel_ring_buffer *ring)

That's a misnomer.

> +{
> +	drm_i915_private_t *dev_priv = dev->dev_private;
> +	int ret;
> +
> +	if (!HAS_SLICE_SHUTDOWN(dev) || ring == &dev_priv->ring[BCS])
> +		return 0;
> +
> +	ret = intel_ring_begin(ring, 3);
> +	if (ret)
> +		return ret;
> +	intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
> +	intel_ring_emit(ring, HSW_GT_SLICE_INFO);
> +	intel_ring_emit(ring, SLICE_SEL_BOTH);
> +	intel_ring_advance(ring);
> +
> +	ret = intel_ring_begin(ring, 3);
> +	if (ret)
> +		return ret;
> +	intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
> +	intel_ring_emit(ring, MI_PREDICATE_RESULT_2);
> +	intel_ring_emit(ring, LOWER_SLICE_ENABLED);
> +	intel_ring_advance(ring);
> +
> +	ret = intel_ring_begin(ring, 3);
> +	if (ret)
> +		return ret;
> +	intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
> +	intel_ring_emit(ring, HSW_SLICESHUTDOWN);
> +	intel_ring_emit(ring, ~SLICE_SHUTDOWN);
> +	intel_ring_advance(ring);
> +
> +	ret = intel_ring_begin(ring, 3);
> +	if (ret)
> +		return ret;
> +	intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
> +	intel_ring_emit(ring, RC_IDLE_MAX_COUNT);
> +	intel_ring_emit(ring, CS_IDLE_COUNT_1US);
> +	intel_ring_advance(ring);
> +
> +	ret = intel_ring_begin(ring, 3);
> +	if (ret)
> +		return ret;
> +	intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
> +	intel_ring_emit(ring, WAIT_FOR_RC6_EXIT);
> +	intel_ring_emit(ring, WAIT_RC6_EXIT | MASK_WAIT_RC6_EXIT);
> +	intel_ring_advance(ring);
> +
> +	ret = intel_ring_begin(ring, 3);
> +	if (ret)
> +		return ret;
> +	intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
> +	intel_ring_emit(ring, RC_IDLE_MAX_COUNT);
> +	intel_ring_emit(ring, CS_IDLE_COUNT_5US);
> +	intel_ring_advance(ring);
> +
> +	return 0;
> +}
> +
> +static int
>  i915_gem_do_execbuffer(struct drm_device *dev, void *data,
>  		       struct drm_file *file,
>  		       struct drm_i915_gem_execbuffer2 *args,
> @@ -999,6 +1059,13 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
>  		return -EINVAL;
>  	}
>  
> +	if ((args->flags & I915_EXEC_USE_PREDICATE) == 0 &&
> +	    dev_priv->gt_slices.state_default == 0 &&
> +	    !dev_priv->gt_slices.forcing_full) {

Always set legacy_userspace_busy here so that a change in state_default
cannot possibly break currently active users i.e. they are independent
bookkeeping.

> +		dev_priv->gt_slices.forcing_full = true;
> +		i915_gt_full_immediate(dev, ring);

What happens when this fails? If it only partially emits the commands,
etc.

> +	}
> +
>  	mode = args->flags & I915_EXEC_CONSTANTS_MASK;
>  	mask = I915_EXEC_CONSTANTS_MASK;
>  	switch (mode) {
> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> index 497c441..0146bef 100644
> --- a/drivers/gpu/drm/i915/i915_reg.h
> +++ b/drivers/gpu/drm/i915/i915_reg.h
> @@ -277,6 +277,17 @@
>  #define   SLICE_STATUS_MAIN_ON	(2<<0)
>  #define   SLICE_STATUS_BOTH_ON	(3<<0)
>  
> +#define HSW_SLICESHUTDOWN	0xA190
> +#define   SLICE_SHUTDOWN	(1<<0)
> +
> +#define RC_IDLE_MAX_COUNT	0x2054
> +#define   CS_IDLE_COUNT_1US	(1<<1)
> +#define   CS_IDLE_COUNT_5US	(1<<3)
> +
> +#define WAIT_FOR_RC6_EXIT	0x20CC
> +#define   WAIT_RC6_EXIT		(1<<0)
> +#define   MASK_WAIT_RC6_EXIT	(1<<16)
> +
>  /*
>   * 3D instructions used by the kernel
>   */
> diff --git a/drivers/gpu/drm/i915/i915_sysfs.c b/drivers/gpu/drm/i915/i915_sysfs.c
> index 86ccd52..25b0c5b 100644
> --- a/drivers/gpu/drm/i915/i915_sysfs.c
> +++ b/drivers/gpu/drm/i915/i915_sysfs.c
> @@ -124,9 +124,13 @@ static ssize_t gt_slice_config_show(struct device *kdev,
>  	struct drm_minor *minor = container_of(kdev, struct drm_minor, kdev);
>  	struct drm_device *dev = minor->dev;
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> +	bool full;
>  
> -	return sprintf(buf, "%s\n", I915_READ(MI_PREDICATE_RESULT_2) ==
> -		       LOWER_SLICE_ENABLED ? "full" : "half");
> +	mutex_lock(&dev_priv->gt_slices.m_lock);
> +	full = I915_READ(MI_PREDICATE_RESULT_2) == LOWER_SLICE_ENABLED;
> +	mutex_unlock(&dev_priv->gt_slices.m_lock);

This locking is not stopping any races - it is superfluous.

> +	return sprintf(buf, "%s\n", full ? "full" : "half");
>  }
>  
>  static ssize_t gt_slice_config_store(struct device *kdev,
> @@ -135,16 +139,19 @@ static ssize_t gt_slice_config_store(struct device *kdev,
>  {
>  	struct drm_minor *minor = container_of(kdev, struct drm_minor, kdev);
>  	struct drm_device *dev = minor->dev;
> +	struct drm_i915_private *dev_priv = dev->dev_private;
>  	int ret;
>  
>  	if (!strncmp(buf, "full", sizeof("full") - 1)) {
>  		ret = intel_set_gt_full(dev);
>  		if (ret)
>  			return ret;
> +		dev_priv->gt_slices.state_default = 1;
>  	} else if (!strncmp(buf, "half", sizeof("half") - 1)) {
>  		ret = intel_set_gt_half(dev);
>  		if (ret)
>  			return ret;
> +		dev_priv->gt_slices.state_default = 0;
>  	} else
>  		return -EINVAL;
>  	return count;
> diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
> index 4f1b636..1a2f8bc 100644
> --- a/drivers/gpu/drm/i915/intel_display.c
> +++ b/drivers/gpu/drm/i915/intel_display.c
> @@ -7778,6 +7778,12 @@ void intel_mark_idle(struct drm_device *dev)
>  
>  	if (dev_priv->info->gen >= 6)
>  		gen6_rps_idle(dev->dev_private);
> +
> +	if (HAS_SLICE_SHUTDOWN(dev) && dev_priv->gt_slices.forcing_full &&
> +	    dev_priv->gt_slices.state_default == 0) {
> +		intel_set_gt_half_async(dev);
> +		dev_priv->gt_slices.forcing_full = false;

Again, you want to be ignoring state_default when changing
legacy_userspace_busy.

The lower level slice config handling code should be making the judgment
based on all parameters after one changes.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 3/4] drm/i915: HSW GT3 Slices: exec flag to warn kernel that userspace is using predication
  2013-10-16  9:15         ` Chris Wilson
@ 2013-10-16 12:12           ` Rodrigo Vivi
  2013-10-16 12:19             ` Chris Wilson
  2013-10-31 23:07           ` Rodrigo Vivi
  1 sibling, 1 reply; 31+ messages in thread
From: Rodrigo Vivi @ 2013-10-16 12:12 UTC (permalink / raw)
  To: Chris Wilson, Rodrigo Vivi, intel-gfx

Hi Chris,

Thank you very much for all suggestions. Will do the changes. But for
that question below I don't have the clear answer:

On Wed, Oct 16, 2013 at 6:15 AM, Chris Wilson <chris@chris-wilson.co.uk> wrote:
> On Tue, Oct 15, 2013 at 05:50:49PM -0300, Rodrigo Vivi wrote:
>> If Userspace isn't using MI_PREDICATE all slices must be enabled for
>> backward compatibility.
>>
>> If I915_EXEC_USE_PREDICATE isn't set and defaul is set to half, kernel will force
>> all slices on.
>>
>> v2: fix the inverted logic for backwards compatibility
>>     USE_PREDICATE unset force gt_full when defaul is half
>>     instead of GT_FULL flag.
>>
>> CC: Eric Anholt <eric@anholt.net>
>> CC: Kenneth Graunke <kenneth@whitecape.org>
>> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
>> ---
>>  drivers/gpu/drm/i915/i915_drv.h            |  8 ++++
>>  drivers/gpu/drm/i915/i915_gem_execbuffer.c | 67 ++++++++++++++++++++++++++++++
>>  drivers/gpu/drm/i915/i915_reg.h            | 11 +++++
>>  drivers/gpu/drm/i915/i915_sysfs.c          | 11 ++++-
>>  drivers/gpu/drm/i915/intel_display.c       |  6 +++
>>  drivers/gpu/drm/i915/intel_drv.h           |  1 +
>>  drivers/gpu/drm/i915/intel_pm.c            | 39 ++++++++++++++++-
>>  include/uapi/drm/i915_drm.h                |  8 +++-
>>  8 files changed, 146 insertions(+), 5 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
>> index 685fb1d..bd4774e 100644
>> --- a/drivers/gpu/drm/i915/i915_drv.h
>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>> @@ -1219,6 +1219,12 @@ struct i915_package_c8 {
>>       } regsave;
>>  };
>>
>> +struct i915_gt_slices {
>> +     int state_default;
>
> Either use it as an integer (a count of slices enabled by default) or
> make it an enum.
>
>> +     int forcing_full;
>
> bool legacy_userspace_busy; ?
>
>> +     struct mutex m_lock;
>
> Ugh. Just struct mutex lock; /* locks all access to this struct and
> slice registers */ will suffice
>
>> +};
>> +
>>  typedef struct drm_i915_private {
>>       struct drm_device *dev;
>>       struct kmem_cache *slab;
>> @@ -1418,6 +1424,8 @@ typedef struct drm_i915_private {
>>
>>       struct i915_package_c8 pc8;
>>
>> +     struct i915_gt_slices gt_slices;
>> +
>>       /* Old dri1 support infrastructure, beware the dragons ya fools entering
>>        * here! */
>>       struct i915_dri1_state dri1;
>> diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
>> index 0ce0d47..eb09a51 100644
>> --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
>> +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
>> @@ -923,6 +923,66 @@ i915_reset_gen7_sol_offsets(struct drm_device *dev,
>>  }
>>
>>  static int
>> +i915_gt_full_immediate(struct drm_device *dev, struct intel_ring_buffer *ring)
>
> That's a misnomer.
>
>> +{
>> +     drm_i915_private_t *dev_priv = dev->dev_private;
>> +     int ret;
>> +
>> +     if (!HAS_SLICE_SHUTDOWN(dev) || ring == &dev_priv->ring[BCS])
>> +             return 0;
>> +
>> +     ret = intel_ring_begin(ring, 3);
>> +     if (ret)
>> +             return ret;
>> +     intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
>> +     intel_ring_emit(ring, HSW_GT_SLICE_INFO);
>> +     intel_ring_emit(ring, SLICE_SEL_BOTH);
>> +     intel_ring_advance(ring);
>> +
>> +     ret = intel_ring_begin(ring, 3);
>> +     if (ret)
>> +             return ret;
>> +     intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
>> +     intel_ring_emit(ring, MI_PREDICATE_RESULT_2);
>> +     intel_ring_emit(ring, LOWER_SLICE_ENABLED);
>> +     intel_ring_advance(ring);
>> +
>> +     ret = intel_ring_begin(ring, 3);
>> +     if (ret)
>> +             return ret;
>> +     intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
>> +     intel_ring_emit(ring, HSW_SLICESHUTDOWN);
>> +     intel_ring_emit(ring, ~SLICE_SHUTDOWN);
>> +     intel_ring_advance(ring);
>> +
>> +     ret = intel_ring_begin(ring, 3);
>> +     if (ret)
>> +             return ret;
>> +     intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
>> +     intel_ring_emit(ring, RC_IDLE_MAX_COUNT);
>> +     intel_ring_emit(ring, CS_IDLE_COUNT_1US);
>> +     intel_ring_advance(ring);
>> +
>> +     ret = intel_ring_begin(ring, 3);
>> +     if (ret)
>> +             return ret;
>> +     intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
>> +     intel_ring_emit(ring, WAIT_FOR_RC6_EXIT);
>> +     intel_ring_emit(ring, WAIT_RC6_EXIT | MASK_WAIT_RC6_EXIT);
>> +     intel_ring_advance(ring);
>> +
>> +     ret = intel_ring_begin(ring, 3);
>> +     if (ret)
>> +             return ret;
>> +     intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
>> +     intel_ring_emit(ring, RC_IDLE_MAX_COUNT);
>> +     intel_ring_emit(ring, CS_IDLE_COUNT_5US);
>> +     intel_ring_advance(ring);
>> +
>> +     return 0;
>> +}
>> +
>> +static int
>>  i915_gem_do_execbuffer(struct drm_device *dev, void *data,
>>                      struct drm_file *file,
>>                      struct drm_i915_gem_execbuffer2 *args,
>> @@ -999,6 +1059,13 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
>>               return -EINVAL;
>>       }
>>
>> +     if ((args->flags & I915_EXEC_USE_PREDICATE) == 0 &&
>> +         dev_priv->gt_slices.state_default == 0 &&
>> +         !dev_priv->gt_slices.forcing_full) {
>
> Always set legacy_userspace_busy here so that a change in state_default
> cannot possibly break currently active users i.e. they are independent
> bookkeeping.
>
>> +             dev_priv->gt_slices.forcing_full = true;
>> +             i915_gt_full_immediate(dev, ring);
>
> What happens when this fails? If it only partially emits the commands,
> etc.

This is a very good question. If it fails, my only concern would be
the sync between HSW_GT_SLICE_INFO and MI_PREDICATE_RESULT_2. But I
don't know how to do this during the exec buf without delaying the
execution. Do you have any suggestion?

>
>> +     }
>> +
>>       mode = args->flags & I915_EXEC_CONSTANTS_MASK;
>>       mask = I915_EXEC_CONSTANTS_MASK;
>>       switch (mode) {
>> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
>> index 497c441..0146bef 100644
>> --- a/drivers/gpu/drm/i915/i915_reg.h
>> +++ b/drivers/gpu/drm/i915/i915_reg.h
>> @@ -277,6 +277,17 @@
>>  #define   SLICE_STATUS_MAIN_ON       (2<<0)
>>  #define   SLICE_STATUS_BOTH_ON       (3<<0)
>>
>> +#define HSW_SLICESHUTDOWN    0xA190
>> +#define   SLICE_SHUTDOWN     (1<<0)
>> +
>> +#define RC_IDLE_MAX_COUNT    0x2054
>> +#define   CS_IDLE_COUNT_1US  (1<<1)
>> +#define   CS_IDLE_COUNT_5US  (1<<3)
>> +
>> +#define WAIT_FOR_RC6_EXIT    0x20CC
>> +#define   WAIT_RC6_EXIT              (1<<0)
>> +#define   MASK_WAIT_RC6_EXIT (1<<16)
>> +
>>  /*
>>   * 3D instructions used by the kernel
>>   */
>> diff --git a/drivers/gpu/drm/i915/i915_sysfs.c b/drivers/gpu/drm/i915/i915_sysfs.c
>> index 86ccd52..25b0c5b 100644
>> --- a/drivers/gpu/drm/i915/i915_sysfs.c
>> +++ b/drivers/gpu/drm/i915/i915_sysfs.c
>> @@ -124,9 +124,13 @@ static ssize_t gt_slice_config_show(struct device *kdev,
>>       struct drm_minor *minor = container_of(kdev, struct drm_minor, kdev);
>>       struct drm_device *dev = minor->dev;
>>       struct drm_i915_private *dev_priv = dev->dev_private;
>> +     bool full;
>>
>> -     return sprintf(buf, "%s\n", I915_READ(MI_PREDICATE_RESULT_2) ==
>> -                    LOWER_SLICE_ENABLED ? "full" : "half");
>> +     mutex_lock(&dev_priv->gt_slices.m_lock);
>> +     full = I915_READ(MI_PREDICATE_RESULT_2) == LOWER_SLICE_ENABLED;
>> +     mutex_unlock(&dev_priv->gt_slices.m_lock);
>
> This locking is not stopping any races - it is superfluous.
>
>> +     return sprintf(buf, "%s\n", full ? "full" : "half");
>>  }
>>
>>  static ssize_t gt_slice_config_store(struct device *kdev,
>> @@ -135,16 +139,19 @@ static ssize_t gt_slice_config_store(struct device *kdev,
>>  {
>>       struct drm_minor *minor = container_of(kdev, struct drm_minor, kdev);
>>       struct drm_device *dev = minor->dev;
>> +     struct drm_i915_private *dev_priv = dev->dev_private;
>>       int ret;
>>
>>       if (!strncmp(buf, "full", sizeof("full") - 1)) {
>>               ret = intel_set_gt_full(dev);
>>               if (ret)
>>                       return ret;
>> +             dev_priv->gt_slices.state_default = 1;
>>       } else if (!strncmp(buf, "half", sizeof("half") - 1)) {
>>               ret = intel_set_gt_half(dev);
>>               if (ret)
>>                       return ret;
>> +             dev_priv->gt_slices.state_default = 0;
>>       } else
>>               return -EINVAL;
>>       return count;
>> diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
>> index 4f1b636..1a2f8bc 100644
>> --- a/drivers/gpu/drm/i915/intel_display.c
>> +++ b/drivers/gpu/drm/i915/intel_display.c
>> @@ -7778,6 +7778,12 @@ void intel_mark_idle(struct drm_device *dev)
>>
>>       if (dev_priv->info->gen >= 6)
>>               gen6_rps_idle(dev->dev_private);
>> +
>> +     if (HAS_SLICE_SHUTDOWN(dev) && dev_priv->gt_slices.forcing_full &&
>> +         dev_priv->gt_slices.state_default == 0) {
>> +             intel_set_gt_half_async(dev);
>> +             dev_priv->gt_slices.forcing_full = false;
>
> Again, you want to be ignoring state_default when changing
> legacy_userspace_busy.
>
> The lower level slice config handling code should be making the judgment
> based on all parameters after one changes.
> -Chris
>
> --
> Chris Wilson, Intel Open Source Technology Centre



-- 
Rodrigo Vivi
Blog: http://blog.vivi.eng.br

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 3/4] drm/i915: HSW GT3 Slices: exec flag to warn kernel that userspace is using predication
  2013-10-16 12:12           ` Rodrigo Vivi
@ 2013-10-16 12:19             ` Chris Wilson
  2013-10-21 21:00               ` [PATCH] " Rodrigo Vivi
  0 siblings, 1 reply; 31+ messages in thread
From: Chris Wilson @ 2013-10-16 12:19 UTC (permalink / raw)
  To: Rodrigo Vivi; +Cc: intel-gfx

On Wed, Oct 16, 2013 at 09:12:46AM -0300, Rodrigo Vivi wrote:
> >> +             dev_priv->gt_slices.forcing_full = true;
> >> +             i915_gt_full_immediate(dev, ring);
> >
> > What happens when this fails? If it only partially emits the commands,
> > etc.
> 
> This is a very good question. If it fails, my only concern would be
> the sync between HSW_GT_SLICE_INFO and MI_PREDICATE_RESULT_2. But I
> don't know how to do this during the exec buf without delaying the
> execution. Do you have any suggestion?

Reduce the series of if() to just one, act upon the return value, only
set legacy_userspace_busy to true after making the change. It will
eventually get cleaned up if the dispatch later fails, but we must make
sure that we accurately track hardware state.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH] drm/i915: HSW GT3 Slices: exec flag to warn kernel that userspace is using predication
  2013-10-16 12:19             ` Chris Wilson
@ 2013-10-21 21:00               ` Rodrigo Vivi
  2013-10-22  7:47                 ` Chris Wilson
  0 siblings, 1 reply; 31+ messages in thread
From: Rodrigo Vivi @ 2013-10-21 21:00 UTC (permalink / raw)
  To: intel-gfx

If Userspace isn't using MI_PREDICATE all slices must be enabled for
backward compatibility.

If I915_EXEC_USE_PREDICATE isn't set and defaul is set to half, kernel will force
all slices on.

v2: fix the inverted logic for backwards compatibility
    USE_PREDICATE unset force gt_full when defaul is half
    instead of GT_FULL flag.

v3: Accepting Chris's suggestions: better variable names;
    better logic around state_default x legacy_userspace_busy;
    remove unecessary mutex;

CC: Chris Wilson <chris@chris-wilson.co.uk>
CC: Eric Anholt <eric@anholt.net>
CC: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
---
 drivers/gpu/drm/i915/i915_drv.h            |  8 ++++
 drivers/gpu/drm/i915/i915_gem_execbuffer.c | 69 ++++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/i915_reg.h            | 11 +++++
 drivers/gpu/drm/i915/i915_sysfs.c          |  6 ++-
 drivers/gpu/drm/i915/intel_display.c       |  6 +++
 drivers/gpu/drm/i915/intel_drv.h           |  1 +
 drivers/gpu/drm/i915/intel_pm.c            | 45 ++++++++++++++++++-
 include/uapi/drm/i915_drm.h                |  8 +++-
 8 files changed, 149 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 685fb1d..67bbbce 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1219,6 +1219,12 @@ struct i915_package_c8 {
 	} regsave;
 };
 
+struct i915_gt_slices {
+	int state_default;
+	int legacy_userspace_busy;
+	struct mutex lock; /* locks access to this scruct and slice registers */
+};
+
 typedef struct drm_i915_private {
 	struct drm_device *dev;
 	struct kmem_cache *slab;
@@ -1418,6 +1424,8 @@ typedef struct drm_i915_private {
 
 	struct i915_package_c8 pc8;
 
+	struct i915_gt_slices gt_slices;
+
 	/* Old dri1 support infrastructure, beware the dragons ya fools entering
 	 * here! */
 	struct i915_dri1_state dri1;
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 0ce0d47..1d8187a 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -923,6 +923,70 @@ i915_reset_gen7_sol_offsets(struct drm_device *dev,
 }
 
 static int
+i915_legacy_userspace_busy(struct drm_device *dev,
+			   struct intel_ring_buffer *ring)
+{
+	drm_i915_private_t *dev_priv = dev->dev_private;
+	int ret;
+
+	if (!HAS_SLICE_SHUTDOWN(dev) || ring == &dev_priv->ring[BCS])
+		return -ENODEV;
+
+	if (dev_priv->gt_slices.state_default == 1)
+		return -EBADE;
+
+	ret = intel_ring_begin(ring, 3);
+	if (ret)
+		return ret;
+	intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
+	intel_ring_emit(ring, HSW_GT_SLICE_INFO);
+	intel_ring_emit(ring, SLICE_SEL_BOTH);
+	intel_ring_advance(ring);
+
+	ret = intel_ring_begin(ring, 3);
+	if (ret)
+		return ret;
+	intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
+	intel_ring_emit(ring, MI_PREDICATE_RESULT_2);
+	intel_ring_emit(ring, LOWER_SLICE_ENABLED);
+	intel_ring_advance(ring);
+
+	ret = intel_ring_begin(ring, 3);
+	if (ret)
+		return ret;
+	intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
+	intel_ring_emit(ring, HSW_SLICESHUTDOWN);
+	intel_ring_emit(ring, ~SLICE_SHUTDOWN);
+	intel_ring_advance(ring);
+
+	ret = intel_ring_begin(ring, 3);
+	if (ret)
+		return ret;
+	intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
+	intel_ring_emit(ring, RC_IDLE_MAX_COUNT);
+	intel_ring_emit(ring, CS_IDLE_COUNT_1US);
+	intel_ring_advance(ring);
+
+	ret = intel_ring_begin(ring, 3);
+	if (ret)
+		return ret;
+	intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
+	intel_ring_emit(ring, WAIT_FOR_RC6_EXIT);
+	intel_ring_emit(ring, WAIT_RC6_EXIT | MASK_WAIT_RC6_EXIT);
+	intel_ring_advance(ring);
+
+	ret = intel_ring_begin(ring, 3);
+	if (ret)
+		return ret;
+	intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
+	intel_ring_emit(ring, RC_IDLE_MAX_COUNT);
+	intel_ring_emit(ring, CS_IDLE_COUNT_5US);
+	intel_ring_advance(ring);
+
+	return 0;
+}
+
+static int
 i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 		       struct drm_file *file,
 		       struct drm_i915_gem_execbuffer2 *args,
@@ -999,6 +1063,11 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 		return -EINVAL;
 	}
 
+	if ((args->flags & I915_EXEC_USE_PREDICATE) == 0 &&
+	    !dev_priv->gt_slices.legacy_userspace_busy)
+		if (i915_legacy_userspace_busy(dev, ring) == 0)
+			dev_priv->gt_slices.legacy_userspace_busy = true;
+
 	mode = args->flags & I915_EXEC_CONSTANTS_MASK;
 	mask = I915_EXEC_CONSTANTS_MASK;
 	switch (mode) {
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 497c441..0146bef 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -277,6 +277,17 @@
 #define   SLICE_STATUS_MAIN_ON	(2<<0)
 #define   SLICE_STATUS_BOTH_ON	(3<<0)
 
+#define HSW_SLICESHUTDOWN	0xA190
+#define   SLICE_SHUTDOWN	(1<<0)
+
+#define RC_IDLE_MAX_COUNT	0x2054
+#define   CS_IDLE_COUNT_1US	(1<<1)
+#define   CS_IDLE_COUNT_5US	(1<<3)
+
+#define WAIT_FOR_RC6_EXIT	0x20CC
+#define   WAIT_RC6_EXIT		(1<<0)
+#define   MASK_WAIT_RC6_EXIT	(1<<16)
+
 /*
  * 3D instructions used by the kernel
  */
diff --git a/drivers/gpu/drm/i915/i915_sysfs.c b/drivers/gpu/drm/i915/i915_sysfs.c
index 86ccd52..1dd57fb 100644
--- a/drivers/gpu/drm/i915/i915_sysfs.c
+++ b/drivers/gpu/drm/i915/i915_sysfs.c
@@ -125,8 +125,7 @@ static ssize_t gt_slice_config_show(struct device *kdev,
 	struct drm_device *dev = minor->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
 
-	return sprintf(buf, "%s\n", I915_READ(MI_PREDICATE_RESULT_2) ==
-		       LOWER_SLICE_ENABLED ? "full" : "half");
+	return sprintf(buf, "%s\n", I915_READ(MI_PREDICATE_RESULT_2) == LOWER_SLICE_ENABLED ? "full" : "half");
 }
 
 static ssize_t gt_slice_config_store(struct device *kdev,
@@ -135,16 +134,19 @@ static ssize_t gt_slice_config_store(struct device *kdev,
 {
 	struct drm_minor *minor = container_of(kdev, struct drm_minor, kdev);
 	struct drm_device *dev = minor->dev;
+	struct drm_i915_private *dev_priv = dev->dev_private;
 	int ret;
 
 	if (!strncmp(buf, "full", sizeof("full") - 1)) {
 		ret = intel_set_gt_full(dev);
 		if (ret)
 			return ret;
+		dev_priv->gt_slices.state_default = 1;
 	} else if (!strncmp(buf, "half", sizeof("half") - 1)) {
 		ret = intel_set_gt_half(dev);
 		if (ret)
 			return ret;
+		dev_priv->gt_slices.state_default = 0;
 	} else
 		return -EINVAL;
 	return count;
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 4f1b636..3810ecf 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -7778,6 +7778,12 @@ void intel_mark_idle(struct drm_device *dev)
 
 	if (dev_priv->info->gen >= 6)
 		gen6_rps_idle(dev->dev_private);
+
+	if (HAS_SLICE_SHUTDOWN(dev) &&
+	    dev_priv->gt_slices.legacy_userspace_busy) {
+		intel_set_gt_half_async(dev);
+		dev_priv->gt_slices.legacy_userspace_busy = false;
+	}
 }
 
 void intel_mark_fb_busy(struct drm_i915_gem_object *obj,
diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index a9abbb5..98cd63e 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -836,6 +836,7 @@ void intel_disable_gt_powersave(struct drm_device *dev);
 void ironlake_teardown_rc6(struct drm_device *dev);
 int intel_set_gt_full(struct drm_device *dev);
 int intel_set_gt_half(struct drm_device *dev);
+void intel_set_gt_half_async(struct drm_device *dev);
 void intel_init_gt_slices(struct drm_device *dev);
 void gen6_update_ring_freq(struct drm_device *dev);
 void gen6_rps_idle(struct drm_i915_private *dev_priv);
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 63af075..c1bde36 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -3873,6 +3873,7 @@ int intel_set_gt_full(struct drm_device *dev)
 	if (!HAS_SLICE_SHUTDOWN(dev))
 		return -ENODEV;
 
+	mutex_lock(&dev_priv->gt_slices.lock);
 	I915_WRITE(HSW_GT_SLICE_INFO, SLICE_SEL_BOTH);
 
 	/* Slices are enabled on RC6 exit */
@@ -3881,13 +3882,18 @@ int intel_set_gt_full(struct drm_device *dev)
 	if (wait_for(((I915_READ(HSW_GT_SLICE_INFO) & SLICE_STATUS_MASK) ==
 		      SLICE_STATUS_BOTH_ON), 2000)) {
 		DRM_ERROR("Timeout enabling full gt slices\n");
+
 		I915_WRITE(HSW_GT_SLICE_INFO, ~SLICE_SEL_BOTH);
 		I915_WRITE(MI_PREDICATE_RESULT_2, LOWER_SLICE_DISABLED);
+
 		gen6_gt_force_wake_put(dev_priv);
+		mutex_unlock(&dev_priv->gt_slices.lock);
 		return -ETIMEDOUT;
 	}
+
 	I915_WRITE(MI_PREDICATE_RESULT_2, LOWER_SLICE_ENABLED);
 	gen6_gt_force_wake_put(dev_priv);
+	mutex_unlock(&dev_priv->gt_slices.lock);
 
 	return 0;
 }
@@ -3899,6 +3905,10 @@ int intel_set_gt_half(struct drm_device *dev)
 	if (!HAS_SLICE_SHUTDOWN(dev))
 		return -ENODEV;
 
+	if (dev_priv->gt_slices.legacy_userspace_busy)
+		return 0;
+
+	mutex_lock(&dev_priv->gt_slices.lock);
 	I915_WRITE(HSW_GT_SLICE_INFO, ~SLICE_SEL_BOTH);
 
 	/* Slices are disabled on RC6 exit */
@@ -3907,16 +3917,43 @@ int intel_set_gt_half(struct drm_device *dev)
 	if (wait_for(((I915_READ(HSW_GT_SLICE_INFO) & SLICE_STATUS_MASK) ==
 		      SLICE_STATUS_MAIN_ON), 2000)) {
 		DRM_ERROR("Timed out disabling half gt slices\n");
+
 		I915_WRITE(HSW_GT_SLICE_INFO, SLICE_SEL_BOTH);
 		I915_WRITE(MI_PREDICATE_RESULT_2, LOWER_SLICE_ENABLED);
+
 		gen6_gt_force_wake_put(dev_priv);
+		mutex_unlock(&dev_priv->gt_slices.lock);
 		return -ETIMEDOUT;
 	}
+
 	I915_WRITE(MI_PREDICATE_RESULT_2, LOWER_SLICE_DISABLED);
 	gen6_gt_force_wake_put(dev_priv);
+
+	mutex_unlock(&dev_priv->gt_slices.lock);
 	return 0;
 }
 
+/**
+ * On Haswell, slices on/off transitions are done via RC6 sequence.
+ * This async function allows you to request slices shutdown without waiting.
+ * Slices will be disabled on next RC6 exit.
+ */
+void intel_set_gt_half_async(struct drm_device *dev)
+{
+	struct drm_i915_private *dev_priv = dev->dev_private;
+
+	if (!HAS_SLICE_SHUTDOWN(dev))
+		return;
+
+	if (dev_priv->gt_slices.state_default == 1)
+		return;
+
+	mutex_lock(&dev_priv->gt_slices.lock);
+	I915_WRITE(HSW_GT_SLICE_INFO, ~SLICE_SEL_BOTH);
+	I915_WRITE(MI_PREDICATE_RESULT_2, LOWER_SLICE_DISABLED);
+	mutex_unlock(&dev_priv->gt_slices.lock);
+}
+
 void intel_init_gt_slices(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
@@ -3927,9 +3964,13 @@ void intel_init_gt_slices(struct drm_device *dev)
 	if (!HAS_SLICE_SHUTDOWN(dev))
 		return;
 
+	dev_priv->gt_slices.state_default = 1;
+	dev_priv->gt_slices.legacy_userspace_busy = false;
+	mutex_init(&dev_priv->gt_slices.lock);
+
 	if (!i915_gt_slice_config) {
-		I915_WRITE(HSW_GT_SLICE_INFO, ~SLICE_SEL_BOTH);
-		I915_WRITE(MI_PREDICATE_RESULT_2, LOWER_SLICE_DISABLED);
+		dev_priv->gt_slices.state_default = 0;
+		intel_set_gt_half_async(dev);
 	}
 }
 
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 3a4e97b..3fa3e24 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -731,7 +731,13 @@ struct drm_i915_gem_execbuffer2 {
  */
 #define I915_EXEC_HANDLE_LUT		(1<<12)
 
-#define __I915_EXEC_UNKNOWN_FLAGS -(I915_EXEC_HANDLE_LUT<<1)
+/* If this flag is set userspace is using predicate and half slices can be
+ * let disabled for power saving. Otherwise use all slices even when disabled
+ * by boot parameter or via sysfs interface
+ */
+#define I915_EXEC_USE_PREDICATE		(1<<13)
+
+#define __I915_EXEC_UNKNOWN_FLAGS -(I915_EXEC_USE_PREDICATE<<1)
 
 #define I915_EXEC_CONTEXT_ID_MASK	(0xffffffff)
 #define i915_execbuffer2_set_context_id(eb2, context) \
-- 
1.7.11.7

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH] testcases: Slice Shutdown testscases.
  2013-10-15 20:51   ` [PATCH] testcases: Slice Shutdown testscases Rodrigo Vivi
@ 2013-10-21 21:01     ` Rodrigo Vivi
  2013-11-05 22:45       ` Rodrigo Vivi
  0 siblings, 1 reply; 31+ messages in thread
From: Rodrigo Vivi @ 2013-10-21 21:01 UTC (permalink / raw)
  To: intel-gfx

1. sysfs half/full switch.
2. on full execbuf without I915_EXEC_USE_PREDICATE
3. on full execbuf with I915_EXEC_USE_PREDICATE
4. on half execbuf without I915_EXEC_USE_PREDICATE
5. on half execbuf with I915_EXEC_USE_PREDICATE

v2: include more tests and s/GT_FULL/USE_PREDICATE
v3: make it more reliable and fix few comments

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
---
 tests/Makefile.am       |   1 +
 tests/gt_slice_config.c | 343 ++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 344 insertions(+)
 create mode 100644 tests/gt_slice_config.c

diff --git a/tests/Makefile.am b/tests/Makefile.am
index 0290ae0..cf105e4 100644
--- a/tests/Makefile.am
+++ b/tests/Makefile.am
@@ -50,6 +50,7 @@ TESTS_progs_M = \
 	gem_tiled_blits \
 	gem_tiled_partial_pwrite_pread \
 	gem_write_read_ring_switch \
+	gt_slice_config \
 	kms_flip \
 	kms_render \
 	kms_setmode \
diff --git a/tests/gt_slice_config.c b/tests/gt_slice_config.c
new file mode 100644
index 0000000..58e53cf
--- /dev/null
+++ b/tests/gt_slice_config.c
@@ -0,0 +1,343 @@
+/*
+ * Copyright © 2013 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ *
+ * Authors:
+ *    Rodrigo Vivi <rodrigo.vivi@intel.com>
+ *
+ */
+
+/*
+ * Testcase: Test GT slice shutdown feature
+ *
+ * Sub tests:
+ * 1. sysfs half/full switch.
+ * 2. on full execbuf without I915_EXEC_USE_PREDICATE
+ * 3. on full execbuf with I915_EXEC_USE_PREDICATE
+ * 4. on half execbuf without I915_EXEC_USE_PREDICATE
+ * 5. on half execbuf with I915_EXEC_USE_PREDICATE
+ */
+
+#define _GNU_SOURCE
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <unistd.h>
+#include "drmtest.h"
+
+static void exec(int fd, uint32_t handle, bool use_predicate)
+{
+        struct drm_i915_gem_execbuffer2 execbuf;
+        struct drm_i915_gem_exec_object2 gem_exec[1];
+        uint32_t b[2] = {MI_BATCH_BUFFER_END};
+        int loops = 1000000;
+        int ret = 0;
+
+        gem_write(fd, handle, 0, b, sizeof(b));
+
+        gem_exec[0].handle = handle;
+        gem_exec[0].relocation_count = 0;
+        gem_exec[0].relocs_ptr = 0;
+        gem_exec[0].alignment = 0;
+        gem_exec[0].offset = 0;
+        gem_exec[0].flags = 0;
+        gem_exec[0].rsvd1 = 0;
+        gem_exec[0].rsvd2 = 0;
+
+        execbuf.buffers_ptr = (uintptr_t)gem_exec;
+        execbuf.buffer_count = 1;
+        execbuf.batch_start_offset = 0;
+        execbuf.batch_len = 8;
+        execbuf.cliprects_ptr = 0;
+        execbuf.num_cliprects = 0;
+        execbuf.DR1 = 0;
+        execbuf.DR4 = 0;
+        execbuf.flags =  I915_EXEC_RENDER;
+	if (use_predicate)
+		execbuf.flags |= I915_EXEC_USE_PREDICATE;
+        i915_execbuffer2_set_context_id(execbuf, 0);
+        execbuf.rsvd2 = 0;
+
+        while (loops-- && ret == 0) {
+                ret = drmIoctl(fd,
+                               DRM_IOCTL_I915_GEM_EXECBUFFER2,
+                               &execbuf);
+        }
+	gem_sync(fd, handle);
+}
+
+static bool is_full(const int device)
+{
+	char *path;
+	FILE *file;
+	int ret;
+	char str[5];
+
+	ret = asprintf(&path, "/sys/class/drm/card%d/power/gt_slice_config",
+		       device);
+	igt_assert(ret != -1);
+
+	file = fopen(path, "r");
+	igt_require(file);
+
+	ret = fscanf(file, "%s", str);
+	igt_assert(ret != 0);
+
+	fclose(file);
+	return strcmp(str, "half");
+}
+
+static int set_status(const int device, bool full)
+{
+	char *path;
+	FILE *file;
+	int ret;
+	char str[5];
+	int attempts = 10;
+
+	ret = asprintf(&path, "/sys/class/drm/card%d/power/gt_slice_config",
+		       device);
+	igt_assert(ret != -1);
+
+	file = fopen(path, "r");
+	igt_require(file);
+
+	ret = fscanf(file, "%s", str);
+	igt_assert(ret != 0);
+
+	fclose(file);
+
+	if (full == strcmp(str, "half"))
+		return 0;
+
+	while (attempts-- && ret != 0) {
+		file = fopen(path, "w");
+		igt_require(file);
+		ret = fprintf(file, "%s\n", full ? "full": "half");
+		igt_assert(ret != -1);
+		ret = fclose(file);
+		sleep(1);
+	}
+	return ret;
+}
+
+int main(int argc, char **argv)
+{
+	char *path;
+	FILE *file;
+	unsigned int rc6_enabled;
+	int i, ret;
+	uint32_t handle;
+	const int device = drm_get_card();
+	const int fd = drm_open_any();
+	const int devid = intel_get_drm_devid(fd);
+	bool initial = is_full(device);
+
+	igt_skip_on_simulation();
+
+	igt_subtest_init(argc, argv);
+
+	igt_fixture {
+		/* On Haswell Slices on/off switch depends on RC6 exit */
+		ret = asprintf(&path, "/sys/class/drm/card%d/power/rc6_enable",
+			       device);
+		igt_assert(ret != -1);
+
+		file = fopen(path, "r");
+		igt_require(file);
+
+		fscanf(file, "%u", &rc6_enabled);
+		fclose(file);
+
+		igt_require(IS_HASWELL(devid));
+		igt_require(rc6_enabled);
+	}
+
+	igt_subtest("sysfs") {
+		/* Switching states */
+		ret = set_status(device, !initial);
+		if (ret < 0) {
+			fprintf(stderr, "Switch states via sysfs failed\n");
+			igt_fail(ret);
+		}
+
+		igt_success();
+	}
+
+	igt_subtest("execbuf-full-legacy") {
+		/* Enable all slices */
+		ret = set_status(device, true);
+		if (ret < 0)
+			igt_fail(ret);
+
+		igt_fork(child, 1) {
+			/* Start Render without I915_EXEC_USE_PREDICATE */
+			handle = gem_create(fd, 4096);
+			exec(fd, handle, false);
+			gem_close(fd, handle);
+		}
+
+		sleep(1);
+		if (!is_full(device)) {
+			fprintf(stderr, "All slices should be enabled\n");
+			igt_fail(-1);
+		}
+
+		/* Wait for idleness */
+		igt_waitchildren();
+		sleep(1);
+
+		/* Check if it is back to half */
+		if (!is_full(device)) {
+			fprintf(stderr, "All slices should be enabled\n");
+			igt_fail(-1);
+		}
+
+		igt_success();
+	}
+
+	igt_subtest("execbuf-full-predicate") {
+		/* Enable all slices */
+		ret = set_status(device, true);
+		if (ret < 0)
+			igt_fail(ret);
+
+		igt_fork(child, 1) {
+			/* Start Render with I915_EXEC_USE_PREDICATE */
+			handle = gem_create(fd, 4096);
+			exec(fd, handle, true);
+			gem_close(fd, handle);
+		}
+
+		sleep(1);
+		if (!is_full(device)) {
+			fprintf(stderr, "All slices should be enabled\n");
+			igt_fail(-1);
+		}
+
+		/* Wait for idleness */
+		igt_waitchildren();
+		sleep(1);
+
+		/* Check if it is back to half */
+		if (!is_full(device)) {
+			fprintf(stderr, "Only half slices should be enabled\n");
+			igt_fail(-1);
+		}
+
+		igt_success();
+	}
+
+	igt_subtest("execbuf-half-legacy") {
+		/* Disable half slices */
+		ret = set_status(device, false);
+		if (ret < 0)
+			igt_fail(ret);
+
+		/* Wait until it is back to half */
+		for (i = 30; i >= 0 && is_full(device); i--) {
+			if (i == 0)
+				igt_skip("Took so long to go to half. Probably a legacy support is in use at GPU simutaneously\n");
+			sleep(1);
+		}
+
+		igt_fork(child, 1) {
+			/* Start Render without I915_EXEC_USE_PREDICATE */
+			handle = gem_create(fd, 4096);
+			exec(fd, handle, false);
+			gem_close(fd, handle);
+		}
+
+		/* Check if it forces full slices enable for legacy support */
+		for (i = 10; i >= 0 && !is_full(device); i--) {
+			if (i == 0) {
+				fprintf(stderr, "All slices should be enabled\n");
+				igt_fail(-1);
+			}
+			sleep(1);
+		}
+
+		/* Wait for idleness */
+		igt_waitchildren();
+
+		/* Check if it is back to half */
+		for (i = 20; i >= 0 && is_full(device); i--) {
+			if (i == 0) {
+				fprintf(stderr, "Only half slices should be enabled\n");
+				igt_fail(-1);
+			}
+			sleep(1);
+		}
+
+		igt_success();
+	}
+
+	igt_subtest("execbuf-half-predicate") {
+		/* Disable half slices */
+		ret = set_status(device, false);
+		if (ret < 0)
+			igt_fail(ret);
+
+		/* Wait until it is back to half */
+		for (i = 30; i >= 0 && is_full(device); i--) {
+			if (i == 0)
+				igt_skip("Took so long to go to half. Probably a legacy support is in use at GPU simutaneously\n");
+			sleep(1);
+		}
+
+		igt_fork(child, 1) {
+			/* Start Render with I915_EXEC_USE_PREDICATE */
+			handle = gem_create(fd, 4096);
+			exec(fd, handle, true);
+			gem_close(fd, handle);
+		}
+
+		/* Check if it stays with only half slices enabled */
+		sleep(1);
+		if (is_full(device)) {
+			fprintf(stderr, "Only half slices should be enabled during execution with predicate\n");
+			set_status(device, initial);
+			igt_fail(-1);
+		}
+
+		/* Wait for idleness */
+		igt_waitchildren();
+
+		/* Check if it is back to half */
+		for (i = 20; i >= 0 && is_full(device); i--) {
+			if (i == 0) {
+				fprintf(stderr, "Only half slices should be enabled\n");
+				set_status(device, initial);
+				igt_fail(-1);
+			}
+			sleep(1);
+		}
+
+		set_status(device, initial);
+		igt_success();
+	}
+
+	igt_fixture {
+		close(fd);
+	}
+
+	igt_exit();
+}
-- 
1.7.11.7

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [PATCH] drm/i915: HSW GT3 Slices: exec flag to warn kernel that userspace is using predication
  2013-10-21 21:00               ` [PATCH] " Rodrigo Vivi
@ 2013-10-22  7:47                 ` Chris Wilson
  2013-10-22 17:33                   ` Rodrigo Vivi
  0 siblings, 1 reply; 31+ messages in thread
From: Chris Wilson @ 2013-10-22  7:47 UTC (permalink / raw)
  To: Rodrigo Vivi; +Cc: intel-gfx

On Mon, Oct 21, 2013 at 07:00:18PM -0200, Rodrigo Vivi wrote:
>  static int
> +i915_legacy_userspace_busy(struct drm_device *dev,
> +			   struct intel_ring_buffer *ring)

s/i915_legacy_userspace_busy/gt_legacy_userspace_busy/

As that is a bit more distinctive.

> +{
> +	drm_i915_private_t *dev_priv = dev->dev_private;
> +	int ret;
> +
> +	if (!HAS_SLICE_SHUTDOWN(dev) || ring == &dev_priv->ring[BCS])
> +		return -ENODEV;
> +
> +	if (dev_priv->gt_slices.state_default == 1)
> +		return -EBADE;

This needs to be replaced.

if (dev_priv->gt_slices.config == 2)
  return 0;

> +
> +	ret = intel_ring_begin(ring, 3);

The dword count must be even, or else the hardware chokes.
However, since we cannot interrupt this sequence and leave the hw in an
inconsistent state, we need to emit this entire sequence in a single
block.

> +	if (ret)
> +		return ret;
> +	intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
> +	intel_ring_emit(ring, HSW_GT_SLICE_INFO);
> +	intel_ring_emit(ring, SLICE_SEL_BOTH);
> +	intel_ring_advance(ring);
> +
> +	ret = intel_ring_begin(ring, 3);
> +	if (ret)
> +		return ret;
> +	intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
> +	intel_ring_emit(ring, MI_PREDICATE_RESULT_2);
> +	intel_ring_emit(ring, LOWER_SLICE_ENABLED);
> +	intel_ring_advance(ring);
> +
> +	ret = intel_ring_begin(ring, 3);
> +	if (ret)
> +		return ret;
> +	intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
> +	intel_ring_emit(ring, HSW_SLICESHUTDOWN);
> +	intel_ring_emit(ring, ~SLICE_SHUTDOWN);
> +	intel_ring_advance(ring);
> +
> +	ret = intel_ring_begin(ring, 3);
> +	if (ret)
> +		return ret;
> +	intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
> +	intel_ring_emit(ring, RC_IDLE_MAX_COUNT);
> +	intel_ring_emit(ring, CS_IDLE_COUNT_1US);
> +	intel_ring_advance(ring);
> +	ret = intel_ring_begin(ring, 3);
> +	if (ret)
> +		return ret;
> +	intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
> +	intel_ring_emit(ring, WAIT_FOR_RC6_EXIT);
> +	intel_ring_emit(ring, WAIT_RC6_EXIT | MASK_WAIT_RC6_EXIT);

_MASKED_BIT_ENABLE(WAIT_RC6_EXIT)

> +	intel_ring_advance(ring);
> +
> +	ret = intel_ring_begin(ring, 3);
> +	if (ret)
> +		return ret;
> +	intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
> +	intel_ring_emit(ring, RC_IDLE_MAX_COUNT);
> +	intel_ring_emit(ring, CS_IDLE_COUNT_5US);
> +	intel_ring_advance(ring);
> +
    dev_priv->gt_slices.config = 2;
> +	return 0;
> +}
> +
> +static int
>  i915_gem_do_execbuffer(struct drm_device *dev, void *data,
>  		       struct drm_file *file,
>  		       struct drm_i915_gem_execbuffer2 *args,
> @@ -999,6 +1063,11 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
>  		return -EINVAL;
>  	}
>  
> +	if ((args->flags & I915_EXEC_USE_PREDICATE) == 0 &&
> +	    !dev_priv->gt_slices.legacy_userspace_busy)
> +		if (i915_legacy_userspace_busy(dev, ring) == 0)
> +			dev_priv->gt_slices.legacy_userspace_busy = true;

Now here we cannot just ignore a failure to switch slice configuration.

 static bool gt_legacy_userspace(struct intel_ring_buffer *ring,
  		                 struct drm_i915_gem_execbuffer2 *args)
 {
    if (ring->id == BCS)
       return false;

    if (!HAS_SLICE_SHUTDOWN(ring->dev))
       return false;

    return (args->flags & I915_EXEC_USE_PREDICATE) == 0;
 }

 if (gt_legacy_userspace(ring, args)) {
    ret = gt_legacy_userspace_busy(ring);
    if (ret)
       return ret;

    dev_priv->gt_slices.legacy_userspace_busy = true;
 }


> +
>  	mode = args->flags & I915_EXEC_CONSTANTS_MASK;
>  	mask = I915_EXEC_CONSTANTS_MASK;
>  	switch (mode) {
> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> index 497c441..0146bef 100644
> --- a/drivers/gpu/drm/i915/i915_reg.h
> +++ b/drivers/gpu/drm/i915/i915_reg.h
> @@ -277,6 +277,17 @@
>  #define   SLICE_STATUS_MAIN_ON	(2<<0)
>  #define   SLICE_STATUS_BOTH_ON	(3<<0)
>  
> +#define HSW_SLICESHUTDOWN	0xA190
> +#define   SLICE_SHUTDOWN	(1<<0)
> +
> +#define RC_IDLE_MAX_COUNT	0x2054
> +#define   CS_IDLE_COUNT_1US	(1<<1)
> +#define   CS_IDLE_COUNT_5US	(1<<3)
> +
> +#define WAIT_FOR_RC6_EXIT	0x20CC
> +#define   WAIT_RC6_EXIT		(1<<0)
> +#define   MASK_WAIT_RC6_EXIT	(1<<16)
> +
>  /*
>   * 3D instructions used by the kernel
>   */
> diff --git a/drivers/gpu/drm/i915/i915_sysfs.c b/drivers/gpu/drm/i915/i915_sysfs.c
> index 86ccd52..1dd57fb 100644
> --- a/drivers/gpu/drm/i915/i915_sysfs.c
> +++ b/drivers/gpu/drm/i915/i915_sysfs.c
> @@ -125,8 +125,7 @@ static ssize_t gt_slice_config_show(struct device *kdev,
>  	struct drm_device *dev = minor->dev;
>  	struct drm_i915_private *dev_priv = dev->dev_private;
>  
> -	return sprintf(buf, "%s\n", I915_READ(MI_PREDICATE_RESULT_2) ==
> -		       LOWER_SLICE_ENABLED ? "full" : "half");
> +	return sprintf(buf, "%s\n", I915_READ(MI_PREDICATE_RESULT_2) == LOWER_SLICE_ENABLED ? "full" : "half");
>  }
>  
>  static ssize_t gt_slice_config_store(struct device *kdev,
> @@ -135,16 +134,19 @@ static ssize_t gt_slice_config_store(struct device *kdev,
>  {
>  	struct drm_minor *minor = container_of(kdev, struct drm_minor, kdev);
>  	struct drm_device *dev = minor->dev;
> +	struct drm_i915_private *dev_priv = dev->dev_private;
>  	int ret;

This cannot change state whilst legacy_userspace_busy.
>  
>  	if (!strncmp(buf, "full", sizeof("full") - 1)) {
>  		ret = intel_set_gt_full(dev);
>  		if (ret)
>  			return ret;
> +		dev_priv->gt_slices.state_default = 1;
dev_priv->gt_slices.max_config = 2;
>  	} else if (!strncmp(buf, "half", sizeof("half") - 1)) {
>  		ret = intel_set_gt_half(dev);
>  		if (ret)
>  			return ret;
> +		dev_priv->gt_slices.state_default = 0;
dev_priv->gt_slices.max_config = 1;
>  	} else
>  		return -EINVAL;

(void)intel_gt_update_slice_config(dev);

where

  int intel_gt_update_slice_config(struct drm_device *dev)
  {
     int config;

     if (!HAS_SLICE_SHUTDOWN(dev))
       return -ENODEV;

     if (dev_priv-gt_slices.legacy_userspace_busy)
       return -EBUSY;
       
     config = min(dev_priv->gt_slices.config,
                  dev_priv->gt_slices.max_config);
     if (config == dev_priv->gt_slices.config)
       return 0;

     /* slice_set_config(dev, config); ? */
     switch (config) {
     case 1: gt_slice_set_half(dev); break;
     case 2: gt_slice_set_full(dev); break;
     default: return -EINVAL;
     }

     dev_priv->gt_slices.config = config;
     return 0;
  }

>  	return count;
> diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
> index 4f1b636..3810ecf 100644
> --- a/drivers/gpu/drm/i915/intel_display.c
> +++ b/drivers/gpu/drm/i915/intel_display.c
> @@ -7778,6 +7778,12 @@ void intel_mark_idle(struct drm_device *dev)
>  
>  	if (dev_priv->info->gen >= 6)
>  		gen6_rps_idle(dev->dev_private);
> +
> +	if (HAS_SLICE_SHUTDOWN(dev) &&
> +	    dev_priv->gt_slices.legacy_userspace_busy) {
> +		intel_set_gt_half_async(dev);
> +	}

Just if (dev_priv->gt_slices.legacy_userspace_busy) {
         dev_priv->gt_slices.legacy_userspace_busy = false;
         intel_gt_update_slice_config(dev);
     }

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH] drm/i915: HSW GT3 Slices: exec flag to warn kernel that userspace is using predication
  2013-10-22  7:47                 ` Chris Wilson
@ 2013-10-22 17:33                   ` Rodrigo Vivi
  2013-10-22 17:40                     ` Chris Wilson
  0 siblings, 1 reply; 31+ messages in thread
From: Rodrigo Vivi @ 2013-10-22 17:33 UTC (permalink / raw)
  To: Chris Wilson, Rodrigo Vivi, intel-gfx, Eric Anholt, Kenneth Graunke

On Tue, Oct 22, 2013 at 5:47 AM, Chris Wilson <chris@chris-wilson.co.uk> wrote:
> On Mon, Oct 21, 2013 at 07:00:18PM -0200, Rodrigo Vivi wrote:
>>  static int
>> +i915_legacy_userspace_busy(struct drm_device *dev,
>> +                        struct intel_ring_buffer *ring)
>
> s/i915_legacy_userspace_busy/gt_legacy_userspace_busy/
>
> As that is a bit more distinctive.

ok, makes sense.

>
>> +{
>> +     drm_i915_private_t *dev_priv = dev->dev_private;
>> +     int ret;
>> +
>> +     if (!HAS_SLICE_SHUTDOWN(dev) || ring == &dev_priv->ring[BCS])
>> +             return -ENODEV;
>> +
>> +     if (dev_priv->gt_slices.state_default == 1)
>> +             return -EBADE;
>
> This needs to be replaced.
>
> if (dev_priv->gt_slices.config == 2)
>   return 0;
>
>> +
>> +     ret = intel_ring_begin(ring, 3);
>
> The dword count must be even, or else the hardware chokes.
> However, since we cannot interrupt this sequence and leave the hw in an
> inconsistent state, we need to emit this entire sequence in a single
> block.

makes sense... changed and worked locally here.

>
>> +     if (ret)
>> +             return ret;
>> +     intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
>> +     intel_ring_emit(ring, HSW_GT_SLICE_INFO);
>> +     intel_ring_emit(ring, SLICE_SEL_BOTH);
>> +     intel_ring_advance(ring);
>> +
>> +     ret = intel_ring_begin(ring, 3);
>> +     if (ret)
>> +             return ret;
>> +     intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
>> +     intel_ring_emit(ring, MI_PREDICATE_RESULT_2);
>> +     intel_ring_emit(ring, LOWER_SLICE_ENABLED);
>> +     intel_ring_advance(ring);
>> +
>> +     ret = intel_ring_begin(ring, 3);
>> +     if (ret)
>> +             return ret;
>> +     intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
>> +     intel_ring_emit(ring, HSW_SLICESHUTDOWN);
>> +     intel_ring_emit(ring, ~SLICE_SHUTDOWN);
>> +     intel_ring_advance(ring);
>> +
>> +     ret = intel_ring_begin(ring, 3);
>> +     if (ret)
>> +             return ret;
>> +     intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
>> +     intel_ring_emit(ring, RC_IDLE_MAX_COUNT);
>> +     intel_ring_emit(ring, CS_IDLE_COUNT_1US);
>> +     intel_ring_advance(ring);
>> +     ret = intel_ring_begin(ring, 3);
>> +     if (ret)
>> +             return ret;
>> +     intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
>> +     intel_ring_emit(ring, WAIT_FOR_RC6_EXIT);
>> +     intel_ring_emit(ring, WAIT_RC6_EXIT | MASK_WAIT_RC6_EXIT);
>
> _MASKED_BIT_ENABLE(WAIT_RC6_EXIT)

thanks

>
>> +     intel_ring_advance(ring);
>> +
>> +     ret = intel_ring_begin(ring, 3);
>> +     if (ret)
>> +             return ret;
>> +     intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
>> +     intel_ring_emit(ring, RC_IDLE_MAX_COUNT);
>> +     intel_ring_emit(ring, CS_IDLE_COUNT_5US);
>> +     intel_ring_advance(ring);
>> +
>     dev_priv->gt_slices.config = 2;
>> +     return 0;
>> +}
>> +
>> +static int
>>  i915_gem_do_execbuffer(struct drm_device *dev, void *data,
>>                      struct drm_file *file,
>>                      struct drm_i915_gem_execbuffer2 *args,
>> @@ -999,6 +1063,11 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
>>               return -EINVAL;
>>       }
>>
>> +     if ((args->flags & I915_EXEC_USE_PREDICATE) == 0 &&
>> +         !dev_priv->gt_slices.legacy_userspace_busy)
>> +             if (i915_legacy_userspace_busy(dev, ring) == 0)
>> +                     dev_priv->gt_slices.legacy_userspace_busy = true;
>
> Now here we cannot just ignore a failure to switch slice configuration.
>
>  static bool gt_legacy_userspace(struct intel_ring_buffer *ring,
>                                  struct drm_i915_gem_execbuffer2 *args)
>  {
>     if (ring->id == BCS)
>        return false;
>
>     if (!HAS_SLICE_SHUTDOWN(ring->dev))
>        return false;
>
>     return (args->flags & I915_EXEC_USE_PREDICATE) == 0;
>  }
>
>  if (gt_legacy_userspace(ring, args)) {
>     ret = gt_legacy_userspace_busy(ring);
>     if (ret)
>        return ret;
>
>     dev_priv->gt_slices.legacy_userspace_busy = true;
>  }

I liked all this simplification. Thanks.
>
>
>> +
>>       mode = args->flags & I915_EXEC_CONSTANTS_MASK;
>>       mask = I915_EXEC_CONSTANTS_MASK;
>>       switch (mode) {
>> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
>> index 497c441..0146bef 100644
>> --- a/drivers/gpu/drm/i915/i915_reg.h
>> +++ b/drivers/gpu/drm/i915/i915_reg.h
>> @@ -277,6 +277,17 @@
>>  #define   SLICE_STATUS_MAIN_ON       (2<<0)
>>  #define   SLICE_STATUS_BOTH_ON       (3<<0)
>>
>> +#define HSW_SLICESHUTDOWN    0xA190
>> +#define   SLICE_SHUTDOWN     (1<<0)
>> +
>> +#define RC_IDLE_MAX_COUNT    0x2054
>> +#define   CS_IDLE_COUNT_1US  (1<<1)
>> +#define   CS_IDLE_COUNT_5US  (1<<3)
>> +
>> +#define WAIT_FOR_RC6_EXIT    0x20CC
>> +#define   WAIT_RC6_EXIT              (1<<0)
>> +#define   MASK_WAIT_RC6_EXIT (1<<16)
>> +
>>  /*
>>   * 3D instructions used by the kernel
>>   */
>> diff --git a/drivers/gpu/drm/i915/i915_sysfs.c b/drivers/gpu/drm/i915/i915_sysfs.c
>> index 86ccd52..1dd57fb 100644
>> --- a/drivers/gpu/drm/i915/i915_sysfs.c
>> +++ b/drivers/gpu/drm/i915/i915_sysfs.c
>> @@ -125,8 +125,7 @@ static ssize_t gt_slice_config_show(struct device *kdev,
>>       struct drm_device *dev = minor->dev;
>>       struct drm_i915_private *dev_priv = dev->dev_private;
>>
>> -     return sprintf(buf, "%s\n", I915_READ(MI_PREDICATE_RESULT_2) ==
>> -                    LOWER_SLICE_ENABLED ? "full" : "half");
>> +     return sprintf(buf, "%s\n", I915_READ(MI_PREDICATE_RESULT_2) == LOWER_SLICE_ENABLED ? "full" : "half");
>>  }
>>
>>  static ssize_t gt_slice_config_store(struct device *kdev,
>> @@ -135,16 +134,19 @@ static ssize_t gt_slice_config_store(struct device *kdev,
>>  {
>>       struct drm_minor *minor = container_of(kdev, struct drm_minor, kdev);
>>       struct drm_device *dev = minor->dev;
>> +     struct drm_i915_private *dev_priv = dev->dev_private;
>>       int ret;
>
> This cannot change state whilst legacy_userspace_busy.

We cannot change to half while in legacy_userspace_busy and this check
was already inside
gt_slices_set_half(dev). So just the state_default was changed so on
next idleness it would come to half anyway.

>>
>>       if (!strncmp(buf, "full", sizeof("full") - 1)) {
>>               ret = intel_set_gt_full(dev);
>>               if (ret)
>>                       return ret;
>> +             dev_priv->gt_slices.state_default = 1;
> dev_priv->gt_slices.max_config = 2;
>>       } else if (!strncmp(buf, "half", sizeof("half") - 1)) {
>>               ret = intel_set_gt_half(dev);
>>               if (ret)
>>                       return ret;
>> +             dev_priv->gt_slices.state_default = 0;
> dev_priv->gt_slices.max_config = 1;
>>       } else
>>               return -EINVAL;
>
> (void)intel_gt_update_slice_config(dev);

This remove the syncrony you had asked. Without it testcase for sysfs
change is impossible because we don't know if it is in _busy... so we
cannot simply check state after request a change.

>
> where
>
>   int intel_gt_update_slice_config(struct drm_device *dev)
>   {
>      int config;
>
>      if (!HAS_SLICE_SHUTDOWN(dev))
>        return -ENODEV;
>
>      if (dev_priv-gt_slices.legacy_userspace_busy)
>        return -EBUSY;
>
>      config = min(dev_priv->gt_slices.config,
>                   dev_priv->gt_slices.max_config);
>      if (config == dev_priv->gt_slices.config)

I still think state_default is working, enough and simpler...
or do you really want me to use config and max_config instead?

>        return 0;
>
>      /* slice_set_config(dev, config); ? */
>      switch (config) {
>      case 1: gt_slice_set_half(dev); break;
>      case 2: gt_slice_set_full(dev); break;
>      default: return -EINVAL;
>      }
>
>      dev_priv->gt_slices.config = config;
>      return 0;
>   }
>
>>       return count;
>> diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
>> index 4f1b636..3810ecf 100644
>> --- a/drivers/gpu/drm/i915/intel_display.c
>> +++ b/drivers/gpu/drm/i915/intel_display.c
>> @@ -7778,6 +7778,12 @@ void intel_mark_idle(struct drm_device *dev)
>>
>>       if (dev_priv->info->gen >= 6)
>>               gen6_rps_idle(dev->dev_private);
>> +
>> +     if (HAS_SLICE_SHUTDOWN(dev) &&
>> +         dev_priv->gt_slices.legacy_userspace_busy) {
>> +             intel_set_gt_half_async(dev);
>> +     }
>
> Just if (dev_priv->gt_slices.legacy_userspace_busy) {
>          dev_priv->gt_slices.legacy_userspace_busy = false;
>          intel_gt_update_slice_config(dev);
>      }
>
> --
> Chris Wilson, Intel Open Source Technology Centre

Thank you very much.

-- 
Rodrigo Vivi
Blog: http://blog.vivi.eng.br

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH] drm/i915: HSW GT3 Slices: exec flag to warn kernel that userspace is using predication
  2013-10-22 17:33                   ` Rodrigo Vivi
@ 2013-10-22 17:40                     ` Chris Wilson
  2013-10-24 20:24                       ` Rodrigo Vivi
  0 siblings, 1 reply; 31+ messages in thread
From: Chris Wilson @ 2013-10-22 17:40 UTC (permalink / raw)
  To: Rodrigo Vivi; +Cc: intel-gfx

On Tue, Oct 22, 2013 at 03:33:16PM -0200, Rodrigo Vivi wrote:
> I still think state_default is working, enough and simpler...
> or do you really want me to use config and max_config instead?

I think keeping the hw config separate from the requested config helps a
lot when reviewing the code in 6 months time. See pc8 for a recent
example where we also need to track hw vs sw state. Similarly in trying
to split the code that implements the policy and that implements the
mechanism.

The devil is in the details, of course, and so this suggestion may
indeed suck when implemented. You are free to refute any suggestion with
a good reason. :)
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH] drm/i915: HSW GT3 Slices: exec flag to warn kernel that userspace is using predication
  2013-10-22 17:40                     ` Chris Wilson
@ 2013-10-24 20:24                       ` Rodrigo Vivi
  2013-10-25  8:43                         ` Chris Wilson
  0 siblings, 1 reply; 31+ messages in thread
From: Rodrigo Vivi @ 2013-10-24 20:24 UTC (permalink / raw)
  To: intel-gfx

If Userspace isn't using MI_PREDICATE all slices must be enabled for
backward compatibility.

If I915_EXEC_USE_PREDICATE isn't set and defaul is set to half, kernel will force
all slices on.

v2: fix the inverted logic for backwards compatibility
    USE_PREDICATE unset force gt_full when defaul is half
    instead of GT_FULL flag.

v3: Accepting Chris's suggestions: better variable names;
    better logic around state_default x legacy_userspace_busy;
    remove unecessary mutex;

v4: Accepting more suggestions from Chris:
    * Send all LRIs in only one block and don't ignore if it fails.
    * function name and cleaner code on forcing_full.

CC: Chris Wilson <chris@chris-wilson.co.uk>
CC: Eric Anholt <eric@anholt.net>
CC: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
---
 drivers/gpu/drm/i915/i915_drv.h            |  8 ++++
 drivers/gpu/drm/i915/i915_gem_execbuffer.c | 63 ++++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/i915_reg.h            | 11 ++++++
 drivers/gpu/drm/i915/i915_sysfs.c          |  3 ++
 drivers/gpu/drm/i915/intel_display.c       |  5 +++
 drivers/gpu/drm/i915/intel_drv.h           |  1 +
 drivers/gpu/drm/i915/intel_pm.c            | 45 ++++++++++++++++++++-
 include/uapi/drm/i915_drm.h                |  8 +++-
 8 files changed, 141 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 685fb1d..67bbbce 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1219,6 +1219,12 @@ struct i915_package_c8 {
 	} regsave;
 };
 
+struct i915_gt_slices {
+	int state_default;
+	int legacy_userspace_busy;
+	struct mutex lock; /* locks access to this scruct and slice registers */
+};
+
 typedef struct drm_i915_private {
 	struct drm_device *dev;
 	struct kmem_cache *slab;
@@ -1418,6 +1424,8 @@ typedef struct drm_i915_private {
 
 	struct i915_package_c8 pc8;
 
+	struct i915_gt_slices gt_slices;
+
 	/* Old dri1 support infrastructure, beware the dragons ya fools entering
 	 * here! */
 	struct i915_dri1_state dri1;
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 0ce0d47..7a362a2 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -922,6 +922,62 @@ i915_reset_gen7_sol_offsets(struct drm_device *dev,
 	return 0;
 }
 
+static int gt_legacy_userspace_busy(struct intel_ring_buffer *ring)
+{
+	int ret;
+
+	ret = intel_ring_begin(ring, 18);
+	if (ret)
+		return ret;
+
+	intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
+	intel_ring_emit(ring, HSW_GT_SLICE_INFO);
+	intel_ring_emit(ring, SLICE_SEL_BOTH);
+
+	intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
+	intel_ring_emit(ring, MI_PREDICATE_RESULT_2);
+	intel_ring_emit(ring, LOWER_SLICE_ENABLED);
+
+	intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
+	intel_ring_emit(ring, HSW_SLICESHUTDOWN);
+	intel_ring_emit(ring, ~SLICE_SHUTDOWN);
+
+	intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
+	intel_ring_emit(ring, RC_IDLE_MAX_COUNT);
+	intel_ring_emit(ring, CS_IDLE_COUNT_1US);
+
+	intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
+	intel_ring_emit(ring, WAIT_FOR_RC6_EXIT);
+	intel_ring_emit(ring, _MASKED_BIT_ENABLE(WAIT_RC6_EXIT));
+
+	intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
+	intel_ring_emit(ring, RC_IDLE_MAX_COUNT);
+	intel_ring_emit(ring, CS_IDLE_COUNT_5US);
+
+	intel_ring_advance(ring);
+	return 0;
+}
+
+static bool gt_legacy_userspace(struct intel_ring_buffer *ring,
+				struct drm_i915_gem_execbuffer2 *args)
+{
+	drm_i915_private_t *dev_priv = ring->dev->dev_private;
+
+	if (ring->id == BCS)
+		return false;
+
+	if (!HAS_SLICE_SHUTDOWN(ring->dev))
+		return false;
+
+	if (dev_priv->gt_slices.state_default == 1)
+		return false;
+
+	if (dev_priv->gt_slices.legacy_userspace_busy)
+		return false;
+
+	return (args->flags & I915_EXEC_USE_PREDICATE) == 0;
+}
+
 static int
 i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 		       struct drm_file *file,
@@ -999,6 +1055,13 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 		return -EINVAL;
 	}
 
+	if (gt_legacy_userspace(ring, args)) {
+		ret = gt_legacy_userspace_busy(ring);
+		if (ret)
+			return ret;
+		dev_priv->gt_slices.legacy_userspace_busy = true;
+	}
+
 	mode = args->flags & I915_EXEC_CONSTANTS_MASK;
 	mask = I915_EXEC_CONSTANTS_MASK;
 	switch (mode) {
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 497c441..0146bef 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -277,6 +277,17 @@
 #define   SLICE_STATUS_MAIN_ON	(2<<0)
 #define   SLICE_STATUS_BOTH_ON	(3<<0)
 
+#define HSW_SLICESHUTDOWN	0xA190
+#define   SLICE_SHUTDOWN	(1<<0)
+
+#define RC_IDLE_MAX_COUNT	0x2054
+#define   CS_IDLE_COUNT_1US	(1<<1)
+#define   CS_IDLE_COUNT_5US	(1<<3)
+
+#define WAIT_FOR_RC6_EXIT	0x20CC
+#define   WAIT_RC6_EXIT		(1<<0)
+#define   MASK_WAIT_RC6_EXIT	(1<<16)
+
 /*
  * 3D instructions used by the kernel
  */
diff --git a/drivers/gpu/drm/i915/i915_sysfs.c b/drivers/gpu/drm/i915/i915_sysfs.c
index 86ccd52..df43876 100644
--- a/drivers/gpu/drm/i915/i915_sysfs.c
+++ b/drivers/gpu/drm/i915/i915_sysfs.c
@@ -135,16 +135,19 @@ static ssize_t gt_slice_config_store(struct device *kdev,
 {
 	struct drm_minor *minor = container_of(kdev, struct drm_minor, kdev);
 	struct drm_device *dev = minor->dev;
+	struct drm_i915_private *dev_priv = dev->dev_private;
 	int ret;
 
 	if (!strncmp(buf, "full", sizeof("full") - 1)) {
 		ret = intel_set_gt_full(dev);
 		if (ret)
 			return ret;
+		dev_priv->gt_slices.state_default = 1;
 	} else if (!strncmp(buf, "half", sizeof("half") - 1)) {
 		ret = intel_set_gt_half(dev);
 		if (ret)
 			return ret;
+		dev_priv->gt_slices.state_default = 0;
 	} else
 		return -EINVAL;
 	return count;
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 4f1b636..3a59c2c 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -7778,6 +7778,11 @@ void intel_mark_idle(struct drm_device *dev)
 
 	if (dev_priv->info->gen >= 6)
 		gen6_rps_idle(dev->dev_private);
+
+	if (dev_priv->gt_slices.legacy_userspace_busy) {
+		dev_priv->gt_slices.legacy_userspace_busy = false;
+		intel_set_gt_half_async(dev);
+	}
 }
 
 void intel_mark_fb_busy(struct drm_i915_gem_object *obj,
diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index a9abbb5..98cd63e 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -836,6 +836,7 @@ void intel_disable_gt_powersave(struct drm_device *dev);
 void ironlake_teardown_rc6(struct drm_device *dev);
 int intel_set_gt_full(struct drm_device *dev);
 int intel_set_gt_half(struct drm_device *dev);
+void intel_set_gt_half_async(struct drm_device *dev);
 void intel_init_gt_slices(struct drm_device *dev);
 void gen6_update_ring_freq(struct drm_device *dev);
 void gen6_rps_idle(struct drm_i915_private *dev_priv);
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 63af075..c1bde36 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -3873,6 +3873,7 @@ int intel_set_gt_full(struct drm_device *dev)
 	if (!HAS_SLICE_SHUTDOWN(dev))
 		return -ENODEV;
 
+	mutex_lock(&dev_priv->gt_slices.lock);
 	I915_WRITE(HSW_GT_SLICE_INFO, SLICE_SEL_BOTH);
 
 	/* Slices are enabled on RC6 exit */
@@ -3881,13 +3882,18 @@ int intel_set_gt_full(struct drm_device *dev)
 	if (wait_for(((I915_READ(HSW_GT_SLICE_INFO) & SLICE_STATUS_MASK) ==
 		      SLICE_STATUS_BOTH_ON), 2000)) {
 		DRM_ERROR("Timeout enabling full gt slices\n");
+
 		I915_WRITE(HSW_GT_SLICE_INFO, ~SLICE_SEL_BOTH);
 		I915_WRITE(MI_PREDICATE_RESULT_2, LOWER_SLICE_DISABLED);
+
 		gen6_gt_force_wake_put(dev_priv);
+		mutex_unlock(&dev_priv->gt_slices.lock);
 		return -ETIMEDOUT;
 	}
+
 	I915_WRITE(MI_PREDICATE_RESULT_2, LOWER_SLICE_ENABLED);
 	gen6_gt_force_wake_put(dev_priv);
+	mutex_unlock(&dev_priv->gt_slices.lock);
 
 	return 0;
 }
@@ -3899,6 +3905,10 @@ int intel_set_gt_half(struct drm_device *dev)
 	if (!HAS_SLICE_SHUTDOWN(dev))
 		return -ENODEV;
 
+	if (dev_priv->gt_slices.legacy_userspace_busy)
+		return 0;
+
+	mutex_lock(&dev_priv->gt_slices.lock);
 	I915_WRITE(HSW_GT_SLICE_INFO, ~SLICE_SEL_BOTH);
 
 	/* Slices are disabled on RC6 exit */
@@ -3907,16 +3917,43 @@ int intel_set_gt_half(struct drm_device *dev)
 	if (wait_for(((I915_READ(HSW_GT_SLICE_INFO) & SLICE_STATUS_MASK) ==
 		      SLICE_STATUS_MAIN_ON), 2000)) {
 		DRM_ERROR("Timed out disabling half gt slices\n");
+
 		I915_WRITE(HSW_GT_SLICE_INFO, SLICE_SEL_BOTH);
 		I915_WRITE(MI_PREDICATE_RESULT_2, LOWER_SLICE_ENABLED);
+
 		gen6_gt_force_wake_put(dev_priv);
+		mutex_unlock(&dev_priv->gt_slices.lock);
 		return -ETIMEDOUT;
 	}
+
 	I915_WRITE(MI_PREDICATE_RESULT_2, LOWER_SLICE_DISABLED);
 	gen6_gt_force_wake_put(dev_priv);
+
+	mutex_unlock(&dev_priv->gt_slices.lock);
 	return 0;
 }
 
+/**
+ * On Haswell, slices on/off transitions are done via RC6 sequence.
+ * This async function allows you to request slices shutdown without waiting.
+ * Slices will be disabled on next RC6 exit.
+ */
+void intel_set_gt_half_async(struct drm_device *dev)
+{
+	struct drm_i915_private *dev_priv = dev->dev_private;
+
+	if (!HAS_SLICE_SHUTDOWN(dev))
+		return;
+
+	if (dev_priv->gt_slices.state_default == 1)
+		return;
+
+	mutex_lock(&dev_priv->gt_slices.lock);
+	I915_WRITE(HSW_GT_SLICE_INFO, ~SLICE_SEL_BOTH);
+	I915_WRITE(MI_PREDICATE_RESULT_2, LOWER_SLICE_DISABLED);
+	mutex_unlock(&dev_priv->gt_slices.lock);
+}
+
 void intel_init_gt_slices(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
@@ -3927,9 +3964,13 @@ void intel_init_gt_slices(struct drm_device *dev)
 	if (!HAS_SLICE_SHUTDOWN(dev))
 		return;
 
+	dev_priv->gt_slices.state_default = 1;
+	dev_priv->gt_slices.legacy_userspace_busy = false;
+	mutex_init(&dev_priv->gt_slices.lock);
+
 	if (!i915_gt_slice_config) {
-		I915_WRITE(HSW_GT_SLICE_INFO, ~SLICE_SEL_BOTH);
-		I915_WRITE(MI_PREDICATE_RESULT_2, LOWER_SLICE_DISABLED);
+		dev_priv->gt_slices.state_default = 0;
+		intel_set_gt_half_async(dev);
 	}
 }
 
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 3a4e97b..3fa3e24 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -731,7 +731,13 @@ struct drm_i915_gem_execbuffer2 {
  */
 #define I915_EXEC_HANDLE_LUT		(1<<12)
 
-#define __I915_EXEC_UNKNOWN_FLAGS -(I915_EXEC_HANDLE_LUT<<1)
+/* If this flag is set userspace is using predicate and half slices can be
+ * let disabled for power saving. Otherwise use all slices even when disabled
+ * by boot parameter or via sysfs interface
+ */
+#define I915_EXEC_USE_PREDICATE		(1<<13)
+
+#define __I915_EXEC_UNKNOWN_FLAGS -(I915_EXEC_USE_PREDICATE<<1)
 
 #define I915_EXEC_CONTEXT_ID_MASK	(0xffffffff)
 #define i915_execbuffer2_set_context_id(eb2, context) \
-- 
1.7.11.7

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [PATCH] drm/i915: HSW GT3 Slices: exec flag to warn kernel that userspace is using predication
  2013-10-24 20:24                       ` Rodrigo Vivi
@ 2013-10-25  8:43                         ` Chris Wilson
  0 siblings, 0 replies; 31+ messages in thread
From: Chris Wilson @ 2013-10-25  8:43 UTC (permalink / raw)
  To: Rodrigo Vivi; +Cc: intel-gfx

On Thu, Oct 24, 2013 at 06:24:18PM -0200, Rodrigo Vivi wrote:
> If Userspace isn't using MI_PREDICATE all slices must be enabled for
> backward compatibility.
> 
> If I915_EXEC_USE_PREDICATE isn't set and defaul is set to half, kernel will force
> all slices on.
> 
> v2: fix the inverted logic for backwards compatibility
>     USE_PREDICATE unset force gt_full when defaul is half
>     instead of GT_FULL flag.
> 
> v3: Accepting Chris's suggestions: better variable names;
>     better logic around state_default x legacy_userspace_busy;
>     remove unecessary mutex;
> 
> v4: Accepting more suggestions from Chris:
>     * Send all LRIs in only one block and don't ignore if it fails.
>     * function name and cleaner code on forcing_full.
> 
> CC: Chris Wilson <chris@chris-wilson.co.uk>
> CC: Eric Anholt <eric@anholt.net>
> CC: Kenneth Graunke <kenneth@whitecape.org>
> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
> ---
>  drivers/gpu/drm/i915/i915_drv.h            |  8 ++++
>  drivers/gpu/drm/i915/i915_gem_execbuffer.c | 63 ++++++++++++++++++++++++++++++
>  drivers/gpu/drm/i915/i915_reg.h            | 11 ++++++
>  drivers/gpu/drm/i915/i915_sysfs.c          |  3 ++
>  drivers/gpu/drm/i915/intel_display.c       |  5 +++
>  drivers/gpu/drm/i915/intel_drv.h           |  1 +
>  drivers/gpu/drm/i915/intel_pm.c            | 45 ++++++++++++++++++++-
>  include/uapi/drm/i915_drm.h                |  8 +++-
>  8 files changed, 141 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 685fb1d..67bbbce 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -1219,6 +1219,12 @@ struct i915_package_c8 {
>  	} regsave;
>  };
>  
> +struct i915_gt_slices {
> +	int state_default;
> +	int legacy_userspace_busy;
> +	struct mutex lock; /* locks access to this scruct and slice registers */
> +};

Ok, I'm getting happier. This time I looked at what locks we were
holding for accessing state_default and legacy_userspace_busy. I'm
afraid we alternated between no locking and struct_mutex. This will add
some unfortunate complexity...

> +
>  typedef struct drm_i915_private {
>  	struct drm_device *dev;
>  	struct kmem_cache *slab;
> @@ -1418,6 +1424,8 @@ typedef struct drm_i915_private {
>  
>  	struct i915_package_c8 pc8;
>  
> +	struct i915_gt_slices gt_slices;
> +
>  	/* Old dri1 support infrastructure, beware the dragons ya fools entering
>  	 * here! */
>  	struct i915_dri1_state dri1;
> diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> index 0ce0d47..7a362a2 100644
> --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> @@ -922,6 +922,62 @@ i915_reset_gen7_sol_offsets(struct drm_device *dev,
>  	return 0;
>  }
>  
> +static int gt_legacy_userspace_busy(struct intel_ring_buffer *ring)
> +{
> +	int ret;
> +

if (dev_priv->gt_slices.state_default == 1)
	return 0;

> +	ret = intel_ring_begin(ring, 18);
> +	if (ret)
> +		return ret;
> +
> +	intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
> +	intel_ring_emit(ring, HSW_GT_SLICE_INFO);
> +	intel_ring_emit(ring, SLICE_SEL_BOTH);
> +
> +	intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
> +	intel_ring_emit(ring, MI_PREDICATE_RESULT_2);
> +	intel_ring_emit(ring, LOWER_SLICE_ENABLED);
> +
> +	intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
> +	intel_ring_emit(ring, HSW_SLICESHUTDOWN);
> +	intel_ring_emit(ring, ~SLICE_SHUTDOWN);
> +
> +	intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
> +	intel_ring_emit(ring, RC_IDLE_MAX_COUNT);
> +	intel_ring_emit(ring, CS_IDLE_COUNT_1US);
> +
> +	intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
> +	intel_ring_emit(ring, WAIT_FOR_RC6_EXIT);
> +	intel_ring_emit(ring, _MASKED_BIT_ENABLE(WAIT_RC6_EXIT));
> +
> +	intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
> +	intel_ring_emit(ring, RC_IDLE_MAX_COUNT);
> +	intel_ring_emit(ring, CS_IDLE_COUNT_5US);
> +
> +	intel_ring_advance(ring);
> +	return 0;
> +}
> +
> +static bool gt_legacy_userspace(struct intel_ring_buffer *ring,
> +				struct drm_i915_gem_execbuffer2 *args)
> +{
> +	drm_i915_private_t *dev_priv = ring->dev->dev_private;
> +
> +	if (ring->id == BCS)
> +		return false;
> +
> +	if (!HAS_SLICE_SHUTDOWN(ring->dev))
> +		return false;
> +


> +	if (dev_priv->gt_slices.state_default == 1)
> +		return false;
> +
> +	if (dev_priv->gt_slices.legacy_userspace_busy)
> +		return false;

Remove these two.

> +
> +	return (args->flags & I915_EXEC_USE_PREDICATE) == 0;
> +}
> +
>  static int
>  i915_gem_do_execbuffer(struct drm_device *dev, void *data,
>  		       struct drm_file *file,
> @@ -999,6 +1055,13 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
>  		return -EINVAL;
>  	}
>  
> +	if (gt_legacy_userspace(ring, args)) {
                mutex_lock(&dev_priv->gt_slices.lock);
		if (!dev_priv->gt_slices.legacy_userspace_busy) {
> +			ret = gt_legacy_userspace_busy(ring);
> +			if (ret == 0)
> +				dev_priv->gt_slices.legacy_userspace_busy = true;
		}
		mutex_unlock(&dev_priv->gt_slices.lock);
		if (ret)
			return ret;
> +	}

Or you can even push that locking and conditional down to
gt_legacy_userspace_busy().

Fixing the use of dev_priv->gt_slices.lock around the other instances
of gt_slices.legacy_userspace_busy and gt_slices.state_default should be
easier.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH] drm/i915: HSW GT3 Slices: exec flag to warn kernel that userspace is using predication
  2013-10-16  9:15         ` Chris Wilson
  2013-10-16 12:12           ` Rodrigo Vivi
@ 2013-10-31 23:07           ` Rodrigo Vivi
  2013-11-01 10:39             ` Chris Wilson
  1 sibling, 1 reply; 31+ messages in thread
From: Rodrigo Vivi @ 2013-10-31 23:07 UTC (permalink / raw)
  To: intel-gfx

If Userspace isn't using MI_PREDICATE all slices must be enabled for
backward compatibility.

If I915_EXEC_USE_PREDICATE isn't set and defaul is set to half, kernel will force
all slices on.

v2: fix the inverted logic for backwards compatibility
    USE_PREDICATE unset force gt_full when defaul is half
    instead of GT_FULL flag.

v3: Accepting Chris's suggestions: better variable names;
    better logic around state_default x legacy_userspace_busy;
    remove unecessary mutex;

v4: Accepting more suggestions from Chris:
    * Send all LRIs in only one block and don't ignore if it fails.
    * function name and cleaner code on forcing_full.

v5: fix mutex_lock use by Chris.

CC: Chris Wilson <chris@chris-wilson.co.uk>
CC: Eric Anholt <eric@anholt.net>
CC: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
---
 drivers/gpu/drm/i915/i915_drv.h            |  8 ++++
 drivers/gpu/drm/i915/i915_gem_execbuffer.c | 64 ++++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/i915_reg.h            | 11 +++++
 drivers/gpu/drm/i915/i915_sysfs.c          |  7 ++++
 drivers/gpu/drm/i915/intel_display.c       | 17 ++++++++
 drivers/gpu/drm/i915/intel_drv.h           |  1 +
 drivers/gpu/drm/i915/intel_pm.c            | 41 ++++++++++++++++++-
 include/uapi/drm/i915_drm.h                |  8 +++-
 8 files changed, 154 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 685fb1d..67bbbce 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1219,6 +1219,12 @@ struct i915_package_c8 {
 	} regsave;
 };
 
+struct i915_gt_slices {
+	int state_default;
+	int legacy_userspace_busy;
+	struct mutex lock; /* locks access to this scruct and slice registers */
+};
+
 typedef struct drm_i915_private {
 	struct drm_device *dev;
 	struct kmem_cache *slab;
@@ -1418,6 +1424,8 @@ typedef struct drm_i915_private {
 
 	struct i915_package_c8 pc8;
 
+	struct i915_gt_slices gt_slices;
+
 	/* Old dri1 support infrastructure, beware the dragons ya fools entering
 	 * here! */
 	struct i915_dri1_state dri1;
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 0ce0d47..3ada5b4 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -922,6 +922,56 @@ i915_reset_gen7_sol_offsets(struct drm_device *dev,
 	return 0;
 }
 
+static int gt_legacy_userspace_busy(struct intel_ring_buffer *ring)
+{
+	int ret;
+
+	ret = intel_ring_begin(ring, 18);
+	if (ret)
+		return ret;
+
+	intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
+	intel_ring_emit(ring, HSW_GT_SLICE_INFO);
+	intel_ring_emit(ring, SLICE_SEL_BOTH);
+
+	intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
+	intel_ring_emit(ring, MI_PREDICATE_RESULT_2);
+	intel_ring_emit(ring, LOWER_SLICE_ENABLED);
+
+	intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
+	intel_ring_emit(ring, HSW_SLICESHUTDOWN);
+	intel_ring_emit(ring, ~SLICE_SHUTDOWN);
+
+	intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
+	intel_ring_emit(ring, RC_IDLE_MAX_COUNT);
+	intel_ring_emit(ring, CS_IDLE_COUNT_1US);
+
+	intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
+	intel_ring_emit(ring, WAIT_FOR_RC6_EXIT);
+	intel_ring_emit(ring, _MASKED_BIT_ENABLE(WAIT_RC6_EXIT));
+
+	intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
+	intel_ring_emit(ring, RC_IDLE_MAX_COUNT);
+	intel_ring_emit(ring, CS_IDLE_COUNT_5US);
+
+	intel_ring_advance(ring);
+	return 0;
+}
+
+static bool gt_legacy_userspace(struct intel_ring_buffer *ring,
+				struct drm_i915_gem_execbuffer2 *args)
+{
+	drm_i915_private_t *dev_priv = ring->dev->dev_private;
+
+	if (ring->id == BCS)
+		return false;
+
+	if (!HAS_SLICE_SHUTDOWN(ring->dev))
+		return false;
+
+	return (args->flags & I915_EXEC_USE_PREDICATE) == 0;
+}
+
 static int
 i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 		       struct drm_file *file,
@@ -935,6 +985,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 	struct drm_clip_rect *cliprects = NULL;
 	struct intel_ring_buffer *ring;
 	struct i915_ctx_hang_stats *hs;
+	struct i915_gt_slices *gt_slices = &dev_priv->gt_slices;
 	u32 ctx_id = i915_execbuffer2_get_context_id(*args);
 	u32 exec_start, exec_len;
 	u32 mask, flags;
@@ -999,6 +1050,19 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 		return -EINVAL;
 	}
 
+	if (gt_legacy_userspace(ring, args)) {
+		mutex_lock(&gt_slices->lock);
+		if (gt_slices->state_default == 0 &&
+		    !gt_slices->legacy_userspace_busy) {
+			ret = gt_legacy_userspace_busy(ring);
+			if (ret == 0)
+				gt_slices->legacy_userspace_busy = true;
+		}
+		mutex_unlock(&gt_slices->lock);
+		if (ret)
+			return ret;
+	}
+
 	mode = args->flags & I915_EXEC_CONSTANTS_MASK;
 	mask = I915_EXEC_CONSTANTS_MASK;
 	switch (mode) {
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 497c441..0146bef 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -277,6 +277,17 @@
 #define   SLICE_STATUS_MAIN_ON	(2<<0)
 #define   SLICE_STATUS_BOTH_ON	(3<<0)
 
+#define HSW_SLICESHUTDOWN	0xA190
+#define   SLICE_SHUTDOWN	(1<<0)
+
+#define RC_IDLE_MAX_COUNT	0x2054
+#define   CS_IDLE_COUNT_1US	(1<<1)
+#define   CS_IDLE_COUNT_5US	(1<<3)
+
+#define WAIT_FOR_RC6_EXIT	0x20CC
+#define   WAIT_RC6_EXIT		(1<<0)
+#define   MASK_WAIT_RC6_EXIT	(1<<16)
+
 /*
  * 3D instructions used by the kernel
  */
diff --git a/drivers/gpu/drm/i915/i915_sysfs.c b/drivers/gpu/drm/i915/i915_sysfs.c
index 86ccd52..a821499 100644
--- a/drivers/gpu/drm/i915/i915_sysfs.c
+++ b/drivers/gpu/drm/i915/i915_sysfs.c
@@ -135,16 +135,23 @@ static ssize_t gt_slice_config_store(struct device *kdev,
 {
 	struct drm_minor *minor = container_of(kdev, struct drm_minor, kdev);
 	struct drm_device *dev = minor->dev;
+	struct drm_i915_private *dev_priv = dev->dev_private;
 	int ret;
 
 	if (!strncmp(buf, "full", sizeof("full") - 1)) {
 		ret = intel_set_gt_full(dev);
 		if (ret)
 			return ret;
+		mutex_lock(&dev_priv->gt_slices.lock);
+		dev_priv->gt_slices.state_default = 1;
+		mutex_unlock(&dev_priv->gt_slices.lock);
 	} else if (!strncmp(buf, "half", sizeof("half") - 1)) {
 		ret = intel_set_gt_half(dev);
 		if (ret)
 			return ret;
+		mutex_lock(&dev_priv->gt_slices.lock);
+		dev_priv->gt_slices.state_default = 0;
+		mutex_unlock(&dev_priv->gt_slices.lock);
 	} else
 		return -EINVAL;
 	return count;
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 4f1b636..eec4c0e 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -7759,6 +7759,20 @@ void intel_mark_busy(struct drm_device *dev)
 	i915_update_gfx_val(dev_priv);
 }
 
+static bool intel_need_shutdown_slices(struct drm_device *dev)
+{
+	struct drm_i915_private *dev_priv = dev->dev_private;
+
+	mutex_lock(&dev_priv->gt_slices.lock);
+	if (dev_priv->gt_slices.legacy_userspace_busy) {
+		dev_priv->gt_slices.legacy_userspace_busy = false;
+		mutex_unlock(&dev_priv->gt_slices.lock);
+		return true;
+	}
+	mutex_unlock(&dev_priv->gt_slices.lock);
+	return false;
+}
+
 void intel_mark_idle(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
@@ -7778,6 +7792,9 @@ void intel_mark_idle(struct drm_device *dev)
 
 	if (dev_priv->info->gen >= 6)
 		gen6_rps_idle(dev->dev_private);
+
+	if (intel_need_shutdown_slices(dev))
+		intel_set_gt_half_async(dev);
 }
 
 void intel_mark_fb_busy(struct drm_i915_gem_object *obj,
diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index a9abbb5..98cd63e 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -836,6 +836,7 @@ void intel_disable_gt_powersave(struct drm_device *dev);
 void ironlake_teardown_rc6(struct drm_device *dev);
 int intel_set_gt_full(struct drm_device *dev);
 int intel_set_gt_half(struct drm_device *dev);
+void intel_set_gt_half_async(struct drm_device *dev);
 void intel_init_gt_slices(struct drm_device *dev);
 void gen6_update_ring_freq(struct drm_device *dev);
 void gen6_rps_idle(struct drm_i915_private *dev_priv);
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 63af075..b3bd70f 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -3873,6 +3873,7 @@ int intel_set_gt_full(struct drm_device *dev)
 	if (!HAS_SLICE_SHUTDOWN(dev))
 		return -ENODEV;
 
+	mutex_lock(&dev_priv->gt_slices.lock);
 	I915_WRITE(HSW_GT_SLICE_INFO, SLICE_SEL_BOTH);
 
 	/* Slices are enabled on RC6 exit */
@@ -3881,13 +3882,18 @@ int intel_set_gt_full(struct drm_device *dev)
 	if (wait_for(((I915_READ(HSW_GT_SLICE_INFO) & SLICE_STATUS_MASK) ==
 		      SLICE_STATUS_BOTH_ON), 2000)) {
 		DRM_ERROR("Timeout enabling full gt slices\n");
+
 		I915_WRITE(HSW_GT_SLICE_INFO, ~SLICE_SEL_BOTH);
 		I915_WRITE(MI_PREDICATE_RESULT_2, LOWER_SLICE_DISABLED);
+
 		gen6_gt_force_wake_put(dev_priv);
+		mutex_unlock(&dev_priv->gt_slices.lock);
 		return -ETIMEDOUT;
 	}
+
 	I915_WRITE(MI_PREDICATE_RESULT_2, LOWER_SLICE_ENABLED);
 	gen6_gt_force_wake_put(dev_priv);
+	mutex_unlock(&dev_priv->gt_slices.lock);
 
 	return 0;
 }
@@ -3899,6 +3905,7 @@ int intel_set_gt_half(struct drm_device *dev)
 	if (!HAS_SLICE_SHUTDOWN(dev))
 		return -ENODEV;
 
+	mutex_lock(&dev_priv->gt_slices.lock);
 	I915_WRITE(HSW_GT_SLICE_INFO, ~SLICE_SEL_BOTH);
 
 	/* Slices are disabled on RC6 exit */
@@ -3907,16 +3914,42 @@ int intel_set_gt_half(struct drm_device *dev)
 	if (wait_for(((I915_READ(HSW_GT_SLICE_INFO) & SLICE_STATUS_MASK) ==
 		      SLICE_STATUS_MAIN_ON), 2000)) {
 		DRM_ERROR("Timed out disabling half gt slices\n");
+
 		I915_WRITE(HSW_GT_SLICE_INFO, SLICE_SEL_BOTH);
 		I915_WRITE(MI_PREDICATE_RESULT_2, LOWER_SLICE_ENABLED);
+
 		gen6_gt_force_wake_put(dev_priv);
+		mutex_unlock(&dev_priv->gt_slices.lock);
 		return -ETIMEDOUT;
 	}
+
 	I915_WRITE(MI_PREDICATE_RESULT_2, LOWER_SLICE_DISABLED);
 	gen6_gt_force_wake_put(dev_priv);
+
+	mutex_unlock(&dev_priv->gt_slices.lock);
 	return 0;
 }
 
+/**
+ * On Haswell, slices on/off transitions are done via RC6 sequence.
+ * This async function allows you to request slices shutdown without waiting.
+ * Slices will be disabled on next RC6 exit.
+ */
+void intel_set_gt_half_async(struct drm_device *dev)
+{
+	struct drm_i915_private *dev_priv = dev->dev_private;
+
+	if (!HAS_SLICE_SHUTDOWN(dev))
+		return;
+
+	mutex_lock(&dev_priv->gt_slices.lock);
+	if (dev_priv->gt_slices.state_default == 0) {
+		I915_WRITE(HSW_GT_SLICE_INFO, ~SLICE_SEL_BOTH);
+		I915_WRITE(MI_PREDICATE_RESULT_2, LOWER_SLICE_DISABLED);
+	}
+	mutex_unlock(&dev_priv->gt_slices.lock);
+}
+
 void intel_init_gt_slices(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
@@ -3927,9 +3960,13 @@ void intel_init_gt_slices(struct drm_device *dev)
 	if (!HAS_SLICE_SHUTDOWN(dev))
 		return;
 
+	dev_priv->gt_slices.state_default = 1;
+	dev_priv->gt_slices.legacy_userspace_busy = false;
+	mutex_init(&dev_priv->gt_slices.lock);
+
 	if (!i915_gt_slice_config) {
-		I915_WRITE(HSW_GT_SLICE_INFO, ~SLICE_SEL_BOTH);
-		I915_WRITE(MI_PREDICATE_RESULT_2, LOWER_SLICE_DISABLED);
+		dev_priv->gt_slices.state_default = 0;
+		intel_set_gt_half_async(dev);
 	}
 }
 
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 3a4e97b..3fa3e24 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -731,7 +731,13 @@ struct drm_i915_gem_execbuffer2 {
  */
 #define I915_EXEC_HANDLE_LUT		(1<<12)
 
-#define __I915_EXEC_UNKNOWN_FLAGS -(I915_EXEC_HANDLE_LUT<<1)
+/* If this flag is set userspace is using predicate and half slices can be
+ * let disabled for power saving. Otherwise use all slices even when disabled
+ * by boot parameter or via sysfs interface
+ */
+#define I915_EXEC_USE_PREDICATE		(1<<13)
+
+#define __I915_EXEC_UNKNOWN_FLAGS -(I915_EXEC_USE_PREDICATE<<1)
 
 #define I915_EXEC_CONTEXT_ID_MASK	(0xffffffff)
 #define i915_execbuffer2_set_context_id(eb2, context) \
-- 
1.7.11.7

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [PATCH] drm/i915: HSW GT3 Slices: exec flag to warn kernel that userspace is using predication
  2013-10-31 23:07           ` Rodrigo Vivi
@ 2013-11-01 10:39             ` Chris Wilson
  2013-11-02 12:49               ` Rodrigo Vivi
  0 siblings, 1 reply; 31+ messages in thread
From: Chris Wilson @ 2013-11-01 10:39 UTC (permalink / raw)
  To: Rodrigo Vivi; +Cc: intel-gfx

On Thu, Oct 31, 2013 at 09:07:09PM -0200, Rodrigo Vivi wrote:
> If Userspace isn't using MI_PREDICATE all slices must be enabled for
> backward compatibility.
> 
> If I915_EXEC_USE_PREDICATE isn't set and defaul is set to half, kernel will force
> all slices on.
> 
> v2: fix the inverted logic for backwards compatibility
>     USE_PREDICATE unset force gt_full when defaul is half
>     instead of GT_FULL flag.
> 
> v3: Accepting Chris's suggestions: better variable names;
>     better logic around state_default x legacy_userspace_busy;
>     remove unecessary mutex;
> 
> v4: Accepting more suggestions from Chris:
>     * Send all LRIs in only one block and don't ignore if it fails.
>     * function name and cleaner code on forcing_full.
> 
> v5: fix mutex_lock use by Chris.
> 
> CC: Chris Wilson <chris@chris-wilson.co.uk>
> CC: Eric Anholt <eric@anholt.net>
> CC: Kenneth Graunke <kenneth@whitecape.org>
> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>

Locking needs major work still.

> @@ -935,6 +985,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
>  	struct drm_clip_rect *cliprects = NULL;
>  	struct intel_ring_buffer *ring;
>  	struct i915_ctx_hang_stats *hs;
> +	struct i915_gt_slices *gt_slices = &dev_priv->gt_slices;
>  	u32 ctx_id = i915_execbuffer2_get_context_id(*args);
>  	u32 exec_start, exec_len;
>  	u32 mask, flags;
> @@ -999,6 +1050,19 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
>  		return -EINVAL;
>  	}
>  
> +	if (gt_legacy_userspace(ring, args)) {
> +		mutex_lock(&gt_slices->lock);
> +		if (gt_slices->state_default == 0 &&
> +		    !gt_slices->legacy_userspace_busy) {

You need to set legacy_userspace_busy if gt_slices->state_default is
already 0. Why 0? Why not 1? 1 - for one slice, 2 - for two slices, etc.

> +			ret = gt_legacy_userspace_busy(ring);
> +			if (ret == 0)
> +				gt_slices->legacy_userspace_busy = true;
> +		}
> +		mutex_unlock(&gt_slices->lock);
> +		if (ret)
> +			return ret;
> +	}
> +
>  	mode = args->flags & I915_EXEC_CONSTANTS_MASK;
>  	mask = I915_EXEC_CONSTANTS_MASK;
>  	switch (mode) {

> diff --git a/drivers/gpu/drm/i915/i915_sysfs.c b/drivers/gpu/drm/i915/i915_sysfs.c
> index 86ccd52..a821499 100644
> --- a/drivers/gpu/drm/i915/i915_sysfs.c
> +++ b/drivers/gpu/drm/i915/i915_sysfs.c
> @@ -135,16 +135,23 @@ static ssize_t gt_slice_config_store(struct device *kdev,
>  {
>  	struct drm_minor *minor = container_of(kdev, struct drm_minor, kdev);
>  	struct drm_device *dev = minor->dev;
> +	struct drm_i915_private *dev_priv = dev->dev_private;
>  	int ret;
>  
>  	if (!strncmp(buf, "full", sizeof("full") - 1)) {
>  		ret = intel_set_gt_full(dev);
>  		if (ret)
>  			return ret;
> +		mutex_lock(&dev_priv->gt_slices.lock);
> +		dev_priv->gt_slices.state_default = 1;
> +		mutex_unlock(&dev_priv->gt_slices.lock);
>  	} else if (!strncmp(buf, "half", sizeof("half") - 1)) {
>  		ret = intel_set_gt_half(dev);
>  		if (ret)
>  			return ret;
> +		mutex_lock(&dev_priv->gt_slices.lock);
> +		dev_priv->gt_slices.state_default = 0;
> +		mutex_unlock(&dev_priv->gt_slices.lock);
>  	} else
>  		return -EINVAL;

This is the clearest example that the locking is fubar. Consider a
second process that simultaneously tries to change slice config. What
state is recorded? What state is the hardware actually in?
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH] drm/i915: HSW GT3 Slices: exec flag to warn kernel that userspace is using predication
  2013-11-01 10:39             ` Chris Wilson
@ 2013-11-02 12:49               ` Rodrigo Vivi
  2013-11-02 12:54                 ` Chris Wilson
  0 siblings, 1 reply; 31+ messages in thread
From: Rodrigo Vivi @ 2013-11-02 12:49 UTC (permalink / raw)
  To: Chris Wilson, Rodrigo Vivi, intel-gfx, Eric Anholt, Kenneth Graunke

On Fri, Nov 1, 2013 at 8:39 AM, Chris Wilson <chris@chris-wilson.co.uk> wrote:
> On Thu, Oct 31, 2013 at 09:07:09PM -0200, Rodrigo Vivi wrote:
>> If Userspace isn't using MI_PREDICATE all slices must be enabled for
>> backward compatibility.
>>
>> If I915_EXEC_USE_PREDICATE isn't set and defaul is set to half, kernel will force
>> all slices on.
>>
>> v2: fix the inverted logic for backwards compatibility
>>     USE_PREDICATE unset force gt_full when defaul is half
>>     instead of GT_FULL flag.
>>
>> v3: Accepting Chris's suggestions: better variable names;
>>     better logic around state_default x legacy_userspace_busy;
>>     remove unecessary mutex;
>>
>> v4: Accepting more suggestions from Chris:
>>     * Send all LRIs in only one block and don't ignore if it fails.
>>     * function name and cleaner code on forcing_full.
>>
>> v5: fix mutex_lock use by Chris.
>>
>> CC: Chris Wilson <chris@chris-wilson.co.uk>
>> CC: Eric Anholt <eric@anholt.net>
>> CC: Kenneth Graunke <kenneth@whitecape.org>
>> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
>
> Locking needs major work still.

What else is wrong besides what you pointed on sysfs?

>
>> @@ -935,6 +985,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
>>       struct drm_clip_rect *cliprects = NULL;
>>       struct intel_ring_buffer *ring;
>>       struct i915_ctx_hang_stats *hs;
>> +     struct i915_gt_slices *gt_slices = &dev_priv->gt_slices;
>>       u32 ctx_id = i915_execbuffer2_get_context_id(*args);
>>       u32 exec_start, exec_len;
>>       u32 mask, flags;
>> @@ -999,6 +1050,19 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
>>               return -EINVAL;
>>       }
>>
>> +     if (gt_legacy_userspace(ring, args)) {
>> +             mutex_lock(&gt_slices->lock);
>> +             if (gt_slices->state_default == 0 &&
>> +                 !gt_slices->legacy_userspace_busy) {
>
> You need to set legacy_userspace_busy if gt_slices->state_default is
> already 0. Why 0? Why not 1? 1 - for one slice, 2 - for two slices, etc.

I used 0 for half and 1 for full like I used on the boot parameter.
I don't mind in changing that if you believe 1 and 2 is more clear...
but I would have to change all other patches and maybe the sysfs
interface?

>
>> +                     ret = gt_legacy_userspace_busy(ring);
>> +                     if (ret == 0)
>> +                             gt_slices->legacy_userspace_busy = true;
>> +             }
>> +             mutex_unlock(&gt_slices->lock);
>> +             if (ret)
>> +                     return ret;
>> +     }
>> +
>>       mode = args->flags & I915_EXEC_CONSTANTS_MASK;
>>       mask = I915_EXEC_CONSTANTS_MASK;
>>       switch (mode) {
>
>> diff --git a/drivers/gpu/drm/i915/i915_sysfs.c b/drivers/gpu/drm/i915/i915_sysfs.c
>> index 86ccd52..a821499 100644
>> --- a/drivers/gpu/drm/i915/i915_sysfs.c
>> +++ b/drivers/gpu/drm/i915/i915_sysfs.c
>> @@ -135,16 +135,23 @@ static ssize_t gt_slice_config_store(struct device *kdev,
>>  {
>>       struct drm_minor *minor = container_of(kdev, struct drm_minor, kdev);
>>       struct drm_device *dev = minor->dev;
>> +     struct drm_i915_private *dev_priv = dev->dev_private;
>>       int ret;
>>
>>       if (!strncmp(buf, "full", sizeof("full") - 1)) {
>>               ret = intel_set_gt_full(dev);
>>               if (ret)
>>                       return ret;
>> +             mutex_lock(&dev_priv->gt_slices.lock);
>> +             dev_priv->gt_slices.state_default = 1;
>> +             mutex_unlock(&dev_priv->gt_slices.lock);
>>       } else if (!strncmp(buf, "half", sizeof("half") - 1)) {
>>               ret = intel_set_gt_half(dev);
>>               if (ret)
>>                       return ret;
>> +             mutex_lock(&dev_priv->gt_slices.lock);
>> +             dev_priv->gt_slices.state_default = 0;
>> +             mutex_unlock(&dev_priv->gt_slices.lock);
>>       } else
>>               return -EINVAL;
>
> This is the clearest example that the locking is fubar. Consider a
> second process that simultaneously tries to change slice config. What
> state is recorded? What state is the hardware actually in?

agree... I will just remove it, but what else you see wrong with locking?

> -Chris
>
> --
> Chris Wilson, Intel Open Source Technology Centre


Thanks,
Rodrigo.

-- 
Rodrigo Vivi
Blog: http://blog.vivi.eng.br

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH] drm/i915: HSW GT3 Slices: exec flag to warn kernel that userspace is using predication
  2013-11-02 12:49               ` Rodrigo Vivi
@ 2013-11-02 12:54                 ` Chris Wilson
  0 siblings, 0 replies; 31+ messages in thread
From: Chris Wilson @ 2013-11-02 12:54 UTC (permalink / raw)
  To: Rodrigo Vivi; +Cc: intel-gfx

On Sat, Nov 02, 2013 at 10:49:32AM -0200, Rodrigo Vivi wrote:
> On Fri, Nov 1, 2013 at 8:39 AM, Chris Wilson <chris@chris-wilson.co.uk> wrote:
> > On Thu, Oct 31, 2013 at 09:07:09PM -0200, Rodrigo Vivi wrote:
> >> If Userspace isn't using MI_PREDICATE all slices must be enabled for
> >> backward compatibility.
> >>
> >> If I915_EXEC_USE_PREDICATE isn't set and defaul is set to half, kernel will force
> >> all slices on.
> >>
> >> v2: fix the inverted logic for backwards compatibility
> >>     USE_PREDICATE unset force gt_full when defaul is half
> >>     instead of GT_FULL flag.
> >>
> >> v3: Accepting Chris's suggestions: better variable names;
> >>     better logic around state_default x legacy_userspace_busy;
> >>     remove unecessary mutex;
> >>
> >> v4: Accepting more suggestions from Chris:
> >>     * Send all LRIs in only one block and don't ignore if it fails.
> >>     * function name and cleaner code on forcing_full.
> >>
> >> v5: fix mutex_lock use by Chris.
> >>
> >> CC: Chris Wilson <chris@chris-wilson.co.uk>
> >> CC: Eric Anholt <eric@anholt.net>
> >> CC: Kenneth Graunke <kenneth@whitecape.org>
> >> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
> >
> > Locking needs major work still.
> 
> What else is wrong besides what you pointed on sysfs?

It has a ripple effect.
 
> >
> >> @@ -935,6 +985,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
> >>       struct drm_clip_rect *cliprects = NULL;
> >>       struct intel_ring_buffer *ring;
> >>       struct i915_ctx_hang_stats *hs;
> >> +     struct i915_gt_slices *gt_slices = &dev_priv->gt_slices;
> >>       u32 ctx_id = i915_execbuffer2_get_context_id(*args);
> >>       u32 exec_start, exec_len;
> >>       u32 mask, flags;
> >> @@ -999,6 +1050,19 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
> >>               return -EINVAL;
> >>       }
> >>
> >> +     if (gt_legacy_userspace(ring, args)) {
> >> +             mutex_lock(&gt_slices->lock);
> >> +             if (gt_slices->state_default == 0 &&
> >> +                 !gt_slices->legacy_userspace_busy) {
> >
> > You need to set legacy_userspace_busy if gt_slices->state_default is
> > already 0. Why 0? Why not 1? 1 - for one slice, 2 - for two slices, etc.
> 
> I used 0 for half and 1 for full like I used on the boot parameter.
> I don't mind in changing that if you believe 1 and 2 is more clear...
> but I would have to change all other patches and maybe the sysfs
> interface?

half,full seems reasonable enough, and it will be easy to extend that
interface to include an integer slice count.  Having it as a slice count
it is a lot easier to understand the implications of different values. 
I would also argue that the module parameter should match the sysfs
interface.

> 
> >
> >> +                     ret = gt_legacy_userspace_busy(ring);
> >> +                     if (ret == 0)
> >> +                             gt_slices->legacy_userspace_busy = true;
> >> +             }
> >> +             mutex_unlock(&gt_slices->lock);
> >> +             if (ret)
> >> +                     return ret;
> >> +     }
> >> +
> >>       mode = args->flags & I915_EXEC_CONSTANTS_MASK;
> >>       mask = I915_EXEC_CONSTANTS_MASK;
> >>       switch (mode) {
> >
> >> diff --git a/drivers/gpu/drm/i915/i915_sysfs.c b/drivers/gpu/drm/i915/i915_sysfs.c
> >> index 86ccd52..a821499 100644
> >> --- a/drivers/gpu/drm/i915/i915_sysfs.c
> >> +++ b/drivers/gpu/drm/i915/i915_sysfs.c
> >> @@ -135,16 +135,23 @@ static ssize_t gt_slice_config_store(struct device *kdev,
> >>  {
> >>       struct drm_minor *minor = container_of(kdev, struct drm_minor, kdev);
> >>       struct drm_device *dev = minor->dev;
> >> +     struct drm_i915_private *dev_priv = dev->dev_private;
> >>       int ret;
> >>
> >>       if (!strncmp(buf, "full", sizeof("full") - 1)) {
> >>               ret = intel_set_gt_full(dev);
> >>               if (ret)
> >>                       return ret;
> >> +             mutex_lock(&dev_priv->gt_slices.lock);
> >> +             dev_priv->gt_slices.state_default = 1;
> >> +             mutex_unlock(&dev_priv->gt_slices.lock);
> >>       } else if (!strncmp(buf, "half", sizeof("half") - 1)) {
> >>               ret = intel_set_gt_half(dev);
> >>               if (ret)
> >>                       return ret;
> >> +             mutex_lock(&dev_priv->gt_slices.lock);
> >> +             dev_priv->gt_slices.state_default = 0;
> >> +             mutex_unlock(&dev_priv->gt_slices.lock);
> >>       } else
> >>               return -EINVAL;
> >
> > This is the clearest example that the locking is fubar. Consider a
> > second process that simultaneously tries to change slice config. What
> > state is recorded? What state is the hardware actually in?
> 
> agree... I will just remove it, but what else you see wrong with locking?

You can't just remove the locking here either. The locking has to
protect all register access and bookkeeping so that they are always in
sync.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH 1/4] drm/i915: Allow GT Slices Shutdown on Boot.
  2013-10-15 14:41 [PATCH 1/3] drm/i915: Allow GT Slices Shutdown on Boot Rodrigo Vivi
                   ` (3 preceding siblings ...)
  2013-10-15 14:41 ` [PATCH 3/3] drm/i915: HSW GT3 Slices: Userspace can dynamically forcibly enable all slices Rodrigo Vivi
@ 2013-11-05 22:44 ` Rodrigo Vivi
  2013-11-05 22:44   ` [PATCH 2/4] drm/i915: Slice Shutdown: Allow setting number of slices on through sysfs Rodrigo Vivi
                     ` (2 more replies)
  4 siblings, 3 replies; 31+ messages in thread
From: Rodrigo Vivi @ 2013-11-05 22:44 UTC (permalink / raw)
  To: intel-gfx

Slices shutdown is a power savings feature present on Haswell GT3 whereby
parts of HW i.e. slice is shut off on boot or dynamically to save power.

This patch only introduces a way to disable half of Haswell GT3 slices on boot.

v2: Use number of slices on (1 or 2) instead of 0 for half and 1 for full.

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
---
 drivers/gpu/drm/i915/i915_drv.c  |  5 +++++
 drivers/gpu/drm/i915/i915_drv.h  |  1 +
 drivers/gpu/drm/i915/i915_gem.c  |  5 +----
 drivers/gpu/drm/i915/i915_reg.h  |  8 ++++++++
 drivers/gpu/drm/i915/intel_drv.h |  1 +
 drivers/gpu/drm/i915/intel_pm.c  | 20 +++++++++++++++++++-
 6 files changed, 35 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index a0804fa..71bac6a 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -154,6 +154,11 @@ module_param_named(prefault_disable, i915_prefault_disable, bool, 0600);
 MODULE_PARM_DESC(prefault_disable,
 		"Disable page prefaulting for pread/pwrite/reloc (default:false). For developers only.");
 
+int i915_gt_slices __read_mostly = 2;
+module_param_named(gt_slices, i915_gt_slices, int, 0600);
+MODULE_PARM_DESC(gt_slices,
+		 "Haswell GT3 has two slices with many EUs in each of them. Use 2 for better performance or 1 for better power savings. (default:2)");
+
 static struct drm_driver driver;
 extern int intel_agp_enabled;
 
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index b12d942..27073e8 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1836,6 +1836,7 @@ extern bool i915_fastboot __read_mostly;
 extern int i915_enable_pc8 __read_mostly;
 extern int i915_pc8_timeout __read_mostly;
 extern bool i915_prefault_disable __read_mostly;
+extern int i915_gt_slices __read_mostly;
 
 extern int i915_suspend(struct drm_device *dev, pm_message_t state);
 extern int i915_resume(struct drm_device *dev);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index e7b39d7..f9c32d1 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4439,10 +4439,7 @@ i915_gem_init_hw(struct drm_device *dev)
 	if (dev_priv->ellc_size)
 		I915_WRITE(HSW_IDICR, I915_READ(HSW_IDICR) | IDIHASHMSK(0xf));
 
-	if (IS_HSW_GT3(dev))
-		I915_WRITE(MI_PREDICATE_RESULT_2, LOWER_SLICE_ENABLED);
-	else
-		I915_WRITE(MI_PREDICATE_RESULT_2, LOWER_SLICE_DISABLED);
+	intel_init_gt_slices(dev);
 
 	if (HAS_PCH_NOP(dev)) {
 		u32 temp = I915_READ(GEN7_MSG_CTL);
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 3f303ba..a2e7deb 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -270,6 +270,14 @@
 #define  LOWER_SLICE_ENABLED	(1<<0)
 #define  LOWER_SLICE_DISABLED	(0<<0)
 
+#define HSW_GT_SLICE_INFO	0x138064
+#define   SLICE_SEL_BOTH	(1<<3)
+#define   SLICE_AUTOWAKE	(1<<2)
+#define   SLICE_STATUS_MASK	0x3
+#define   SLICE_STATUS_GT_OFF	(0<<0)
+#define   SLICE_STATUS_MAIN_ON	(2<<0)
+#define   SLICE_STATUS_BOTH_ON	(3<<0)
+
 /*
  * 3D instructions used by the kernel
  */
diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index 9d2624f..42c3983 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -836,6 +836,7 @@ void intel_set_power_well(struct drm_device *dev, bool enable);
 void intel_enable_gt_powersave(struct drm_device *dev);
 void intel_disable_gt_powersave(struct drm_device *dev);
 void ironlake_teardown_rc6(struct drm_device *dev);
+void intel_init_gt_slices(struct drm_device *dev);
 void gen6_update_ring_freq(struct drm_device *dev);
 void gen6_rps_idle(struct drm_i915_private *dev_priv);
 void gen6_rps_boost(struct drm_i915_private *dev_priv);
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 09ac9e7..02d1b1f 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -3869,6 +3869,25 @@ static void gen6_enable_rps(struct drm_device *dev)
 	gen6_gt_force_wake_put(dev_priv);
 }
 
+void intel_init_gt_slices(struct drm_device *dev)
+{
+	struct drm_i915_private *dev_priv = dev->dev_private;
+
+	if (!IS_HSW_GT3(dev))
+		return;
+
+	I915_WRITE(MI_PREDICATE_RESULT_2, LOWER_SLICE_ENABLED);
+
+	if (i915_gt_slices == 1) {
+		I915_WRITE(HSW_GT_SLICE_INFO, ~SLICE_SEL_BOTH);
+		I915_WRITE(MI_PREDICATE_RESULT_2, LOWER_SLICE_DISABLED);
+		return;
+	}
+
+	if (i915_gt_slices != 2)
+		DRM_ERROR("Invalid number of slices. 2 GT slices enabled\n");
+}
+
 void gen6_update_ring_freq(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
@@ -5995,4 +6014,3 @@ void intel_pm_init(struct drm_device *dev)
 	INIT_DELAYED_WORK(&dev_priv->rps.delayed_resume_work,
 			  intel_gen6_powersave_work);
 }
-
-- 
1.7.11.7

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 2/4] drm/i915: Slice Shutdown: Allow setting number of slices on through sysfs
  2013-11-05 22:44 ` [PATCH 1/4] drm/i915: Allow GT Slices Shutdown on Boot Rodrigo Vivi
@ 2013-11-05 22:44   ` Rodrigo Vivi
  2013-11-05 22:44   ` [PATCH 3/4] drm/i915: HSW GT3 Slices: exec flag to warn kernel that userspace is using predication Rodrigo Vivi
  2013-11-05 22:44   ` [PATCH 4/4] drm/i915: Slice Shutdown - Only 1 slice enabled by default to save power Rodrigo Vivi
  2 siblings, 0 replies; 31+ messages in thread
From: Rodrigo Vivi @ 2013-11-05 22:44 UTC (permalink / raw)
  To: intel-gfx

This patch introduces a sysfs interface to easily allow dynamically switch
slice config default behaviour between 1 or 2 slices.

v2: use number of slices on (1,2) instead of half or full strings.

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
---
 drivers/gpu/drm/i915/i915_drv.h   |  1 +
 drivers/gpu/drm/i915/i915_sysfs.c | 53 ++++++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/intel_drv.h  |  1 +
 drivers/gpu/drm/i915/intel_pm.c   | 64 +++++++++++++++++++++++++++++++++++++--
 4 files changed, 116 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 27073e8..5bd8d6f 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1791,6 +1791,7 @@ struct drm_i915_file_private {
 #define HAS_POWER_WELL(dev)	(IS_HASWELL(dev))
 #define HAS_FPGA_DBG_UNCLAIMED(dev)	(INTEL_INFO(dev)->has_fpga_dbg)
 #define HAS_PSR(dev)		(IS_HASWELL(dev))
+#define HAS_SLICE_SHUTDOWN(dev)	(IS_HSW_GT3(dev) && i915_enable_rc6)
 
 #define INTEL_PCH_DEVICE_ID_MASK		0xff00
 #define INTEL_PCH_IBX_DEVICE_ID_TYPE		0x3b00
diff --git a/drivers/gpu/drm/i915/i915_sysfs.c b/drivers/gpu/drm/i915/i915_sysfs.c
index cef38fd..bf6dd95 100644
--- a/drivers/gpu/drm/i915/i915_sysfs.c
+++ b/drivers/gpu/drm/i915/i915_sysfs.c
@@ -117,6 +117,51 @@ static struct attribute_group rc6_attr_group = {
 	.name = power_group_name,
 	.attrs =  rc6_attrs
 };
+
+static ssize_t gt_slices_show(struct device *kdev,
+			      struct device_attribute *attr, char *buf)
+{
+	struct drm_minor *minor = dev_to_drm_minor(kdev);
+	struct drm_device *dev = minor->dev;
+	struct drm_i915_private *dev_priv = dev->dev_private;
+
+	return sprintf(buf, "%s\n", I915_READ(MI_PREDICATE_RESULT_2) ==
+		       LOWER_SLICE_ENABLED ? "2" : "1");
+}
+
+static ssize_t gt_slices_store(struct device *kdev,
+			       struct device_attribute *attr,
+			       const char *buf, size_t count)
+{
+	struct drm_minor *minor = dev_to_drm_minor(kdev);
+	struct drm_device *dev = minor->dev;
+	int ret;
+	int slices;
+
+	ret = kstrtoint(buf, 10, &slices);
+	if (ret)
+		return ret;
+
+	ret = intel_set_gt_slices(dev, slices);
+	if (ret)
+		return ret;
+
+	return count;
+}
+
+static DEVICE_ATTR(gt_slices, S_IRUGO | S_IWUSR, gt_slices_show,
+		   gt_slices_store);
+
+static struct attribute *gt_slices_attrs[] = {
+	&dev_attr_gt_slices.attr,
+	NULL
+};
+
+static struct attribute_group gt_slices_attr_group = {
+	.name = power_group_name,
+	.attrs =  gt_slices_attrs
+};
+
 #endif
 
 static int l3_access_valid(struct drm_device *dev, loff_t offset)
@@ -558,6 +603,12 @@ void i915_setup_sysfs(struct drm_device *dev)
 		if (ret)
 			DRM_ERROR("RC6 residency sysfs setup failed\n");
 	}
+	if (HAS_SLICE_SHUTDOWN(dev)) {
+		ret = sysfs_merge_group(&dev->primary->kdev->kobj,
+					&gt_slices_attr_group);
+		if (ret)
+			DRM_ERROR("GT slice config sysfs setup failed\n");
+	}
 #endif
 	if (HAS_L3_DPF(dev)) {
 		ret = device_create_bin_file(dev->primary->kdev, &dpf_attrs);
@@ -597,5 +648,7 @@ void i915_teardown_sysfs(struct drm_device *dev)
 	device_remove_bin_file(dev->primary->kdev,  &dpf_attrs);
 #ifdef CONFIG_PM
 	sysfs_unmerge_group(&dev->primary->kdev->kobj, &rc6_attr_group);
+	sysfs_unmerge_group(&dev->primary->kdev->kobj,
+			    &gt_slices_attr_group);
 #endif
 }
diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index 42c3983..cf37741 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -836,6 +836,7 @@ void intel_set_power_well(struct drm_device *dev, bool enable);
 void intel_enable_gt_powersave(struct drm_device *dev);
 void intel_disable_gt_powersave(struct drm_device *dev);
 void ironlake_teardown_rc6(struct drm_device *dev);
+int intel_set_gt_slices(struct drm_device *dev, int slices);
 void intel_init_gt_slices(struct drm_device *dev);
 void gen6_update_ring_freq(struct drm_device *dev);
 void gen6_rps_idle(struct drm_i915_private *dev_priv);
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 02d1b1f..40ab76a 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -3869,14 +3869,72 @@ static void gen6_enable_rps(struct drm_device *dev)
 	gen6_gt_force_wake_put(dev_priv);
 }
 
-void intel_init_gt_slices(struct drm_device *dev)
+static int intel_set_gt_full(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
 
-	if (!IS_HSW_GT3(dev))
-		return;
+	I915_WRITE(HSW_GT_SLICE_INFO, SLICE_SEL_BOTH);
 
+	/* Slices are enabled on RC6 exit */
+	gen6_gt_force_wake_get(dev_priv);
+
+	if (wait_for(((I915_READ(HSW_GT_SLICE_INFO) & SLICE_STATUS_MASK) ==
+		      SLICE_STATUS_BOTH_ON), 2000)) {
+		DRM_ERROR("Timeout enabling full gt slices\n");
+		I915_WRITE(HSW_GT_SLICE_INFO, ~SLICE_SEL_BOTH);
+		I915_WRITE(MI_PREDICATE_RESULT_2, LOWER_SLICE_DISABLED);
+		gen6_gt_force_wake_put(dev_priv);
+		return -ETIMEDOUT;
+	}
 	I915_WRITE(MI_PREDICATE_RESULT_2, LOWER_SLICE_ENABLED);
+	gen6_gt_force_wake_put(dev_priv);
+
+	return 0;
+}
+
+static int intel_set_gt_half(struct drm_device *dev)
+{
+	struct drm_i915_private *dev_priv = dev->dev_private;
+
+	I915_WRITE(HSW_GT_SLICE_INFO, ~SLICE_SEL_BOTH);
+
+	/* Slices are disabled on RC6 exit */
+	gen6_gt_force_wake_get(dev_priv);
+
+	if (wait_for(((I915_READ(HSW_GT_SLICE_INFO) & SLICE_STATUS_MASK) ==
+		      SLICE_STATUS_MAIN_ON), 2000)) {
+		DRM_ERROR("Timed out disabling half gt slices\n");
+		I915_WRITE(HSW_GT_SLICE_INFO, SLICE_SEL_BOTH);
+		I915_WRITE(MI_PREDICATE_RESULT_2, LOWER_SLICE_ENABLED);
+		gen6_gt_force_wake_put(dev_priv);
+		return -ETIMEDOUT;
+	}
+	I915_WRITE(MI_PREDICATE_RESULT_2, LOWER_SLICE_DISABLED);
+	gen6_gt_force_wake_put(dev_priv);
+	return 0;
+}
+
+int intel_set_gt_slices(struct drm_device *dev, int slices)
+{
+	if (!HAS_SLICE_SHUTDOWN(dev))
+		return -ENODEV;
+
+	switch (slices) {
+	case 1: return intel_set_gt_half(dev);
+	case 2: return intel_set_gt_full(dev);
+	default: return -EINVAL;
+	}
+}
+
+void intel_init_gt_slices(struct drm_device *dev)
+{
+	struct drm_i915_private *dev_priv = dev->dev_private;
+
+	if (IS_HSW_GT3(dev))
+		I915_WRITE(MI_PREDICATE_RESULT_2, LOWER_SLICE_ENABLED);
+
+	if (!HAS_SLICE_SHUTDOWN(dev))
+		return;
 
 	if (i915_gt_slices == 1) {
 		I915_WRITE(HSW_GT_SLICE_INFO, ~SLICE_SEL_BOTH);
-- 
1.7.11.7

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 3/4] drm/i915: HSW GT3 Slices: exec flag to warn kernel that userspace is using predication
  2013-11-05 22:44 ` [PATCH 1/4] drm/i915: Allow GT Slices Shutdown on Boot Rodrigo Vivi
  2013-11-05 22:44   ` [PATCH 2/4] drm/i915: Slice Shutdown: Allow setting number of slices on through sysfs Rodrigo Vivi
@ 2013-11-05 22:44   ` Rodrigo Vivi
  2013-11-05 22:44   ` [PATCH 4/4] drm/i915: Slice Shutdown - Only 1 slice enabled by default to save power Rodrigo Vivi
  2 siblings, 0 replies; 31+ messages in thread
From: Rodrigo Vivi @ 2013-11-05 22:44 UTC (permalink / raw)
  To: intel-gfx

If Userspace isn't using MI_PREDICATE all slices must be enabled for
backward compatibility.

If I915_EXEC_USE_PREDICATE isn't set and defaul is set to half, kernel will force
all slices on.

v2: fix the inverted logic for backwards compatibility
    USE_PREDICATE unset force gt_full when defaul is half
    instead of GT_FULL flag.

v3: Accepting Chris's suggestions: better variable names;
    better logic around state_default x legacy_userspace_busy;
    remove unecessary mutex;

v4: Accepting more suggestions from Chris:
    * Send all LRIs in only one block and don't ignore if it fails.
    * function name and cleaner code on forcing_full.

v5: fix mutex_lock use by Chris.

v6: change state machine logic to fix locks and use number of slices on (1,2) instead of 0
    for half and 1 for full.
    Init gt slices out of init_hw to avoid reseting values during execution.

CC: Chris Wilson <chris@chris-wilson.co.uk>
CC: Eric Anholt <eric@anholt.net>
CC: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
---
 drivers/gpu/drm/i915/i915_drv.h            | 14 ++++++
 drivers/gpu/drm/i915/i915_gem.c            |  4 +-
 drivers/gpu/drm/i915/i915_gem_execbuffer.c | 68 ++++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/i915_reg.h            | 11 +++++
 drivers/gpu/drm/i915/intel_display.c       |  7 +++
 drivers/gpu/drm/i915/intel_drv.h           |  1 +
 drivers/gpu/drm/i915/intel_pm.c            | 63 ++++++++++++++++++++++++---
 include/uapi/drm/i915_drm.h                |  8 +++-
 8 files changed, 167 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 5bd8d6f..74284e6 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1272,6 +1272,18 @@ struct intel_pipe_crc {
 	wait_queue_head_t wq;
 };
 
+enum gt_slices_state {
+	LEGACY_BUSY = 0,
+	ONE_SLICE,
+	TWO_SLICES,
+};
+
+struct i915_gt_slices {
+	u32 predicate_result_2; /* avoid reads and minimize mutex on execbuf */
+	enum gt_slices_state state;
+	struct mutex lock; /* locks access to slices state and registers */
+};
+
 typedef struct drm_i915_private {
 	struct drm_device *dev;
 	struct kmem_cache *slab;
@@ -1474,6 +1486,8 @@ typedef struct drm_i915_private {
 
 	struct i915_package_c8 pc8;
 
+	struct i915_gt_slices gt_slices;
+
 	/* Old dri1 support infrastructure, beware the dragons ya fools entering
 	 * here! */
 	struct i915_dri1_state dri1;
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index f9c32d1..cfb0687 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4439,8 +4439,6 @@ i915_gem_init_hw(struct drm_device *dev)
 	if (dev_priv->ellc_size)
 		I915_WRITE(HSW_IDICR, I915_READ(HSW_IDICR) | IDIHASHMSK(0xf));
 
-	intel_init_gt_slices(dev);
-
 	if (HAS_PCH_NOP(dev)) {
 		u32 temp = I915_READ(GEN7_MSG_CTL);
 		temp &= ~(WAIT_FOR_PCH_FLR_ACK | WAIT_FOR_PCH_RESET_ACK);
@@ -4495,6 +4493,8 @@ int i915_gem_init(struct drm_device *dev)
 		return ret;
 	}
 
+	intel_init_gt_slices(dev);
+
 	/* Allow hardware batchbuffers unless told otherwise, but not for KMS. */
 	if (!drm_core_check_feature(dev, DRIVER_MODESET))
 		dev_priv->dri1.allow_batchbuffer = 1;
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 0ce0d47..4911690 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -922,6 +922,62 @@ i915_reset_gen7_sol_offsets(struct drm_device *dev,
 	return 0;
 }
 
+static int gt_legacy_userspace_busy(struct intel_ring_buffer *ring)
+{
+	drm_i915_private_t *dev_priv = ring->dev->dev_private;
+	int ret;
+
+	ret = intel_ring_begin(ring, 18);
+	if (ret)
+		return ret;
+
+	intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
+	intel_ring_emit(ring, HSW_GT_SLICE_INFO);
+	intel_ring_emit(ring, SLICE_SEL_BOTH);
+
+	intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
+	intel_ring_emit(ring, MI_PREDICATE_RESULT_2);
+	intel_ring_emit(ring, LOWER_SLICE_ENABLED);
+	dev_priv->gt_slices.predicate_result_2 = LOWER_SLICE_ENABLED;
+
+	intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
+	intel_ring_emit(ring, HSW_SLICESHUTDOWN);
+	intel_ring_emit(ring, ~SLICE_SHUTDOWN);
+
+	intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
+	intel_ring_emit(ring, RC_IDLE_MAX_COUNT);
+	intel_ring_emit(ring, CS_IDLE_COUNT_1US);
+
+	intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
+	intel_ring_emit(ring, WAIT_FOR_RC6_EXIT);
+	intel_ring_emit(ring, _MASKED_BIT_ENABLE(WAIT_RC6_EXIT));
+
+	intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
+	intel_ring_emit(ring, RC_IDLE_MAX_COUNT);
+	intel_ring_emit(ring, CS_IDLE_COUNT_5US);
+
+	intel_ring_advance(ring);
+
+	return 0;
+}
+
+static bool gt_legacy_userspace(struct intel_ring_buffer *ring,
+				struct drm_i915_gem_execbuffer2 *args)
+{
+	drm_i915_private_t *dev_priv = ring->dev->dev_private;
+
+	if (ring->id == BCS)
+		return false;
+
+	if (!HAS_SLICE_SHUTDOWN(ring->dev))
+		return false;
+
+	if (dev_priv->gt_slices.predicate_result_2 == LOWER_SLICE_ENABLED)
+		return false;
+
+	return (args->flags & I915_EXEC_USE_PREDICATE) == 0;
+}
+
 static int
 i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 		       struct drm_file *file,
@@ -999,6 +1055,18 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 		return -EINVAL;
 	}
 
+	if (gt_legacy_userspace(ring, args)) {
+		mutex_lock(&dev_priv->gt_slices.lock);
+		if (dev_priv->gt_slices.state == ONE_SLICE) {
+			ret = gt_legacy_userspace_busy(ring);
+			if (ret == 0)
+				dev_priv->gt_slices.state = LEGACY_BUSY;
+		}
+		mutex_unlock(&dev_priv->gt_slices.lock);
+		if (ret)
+			return ret;
+	}
+
 	mode = args->flags & I915_EXEC_CONSTANTS_MASK;
 	mask = I915_EXEC_CONSTANTS_MASK;
 	switch (mode) {
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index a2e7deb..4617880 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -278,6 +278,17 @@
 #define   SLICE_STATUS_MAIN_ON	(2<<0)
 #define   SLICE_STATUS_BOTH_ON	(3<<0)
 
+#define HSW_SLICESHUTDOWN	0xA190
+#define   SLICE_SHUTDOWN	(1<<0)
+
+#define RC_IDLE_MAX_COUNT	0x2054
+#define   CS_IDLE_COUNT_1US	(1<<1)
+#define   CS_IDLE_COUNT_5US	(1<<3)
+
+#define WAIT_FOR_RC6_EXIT	0x20CC
+#define   WAIT_RC6_EXIT		(1<<0)
+#define   MASK_WAIT_RC6_EXIT	(1<<16)
+
 /*
  * 3D instructions used by the kernel
  */
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index f34252d..55f2a6e 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -7878,6 +7878,13 @@ void intel_mark_idle(struct drm_device *dev)
 
 	if (dev_priv->info->gen >= 6)
 		gen6_rps_idle(dev->dev_private);
+
+	mutex_lock(&dev_priv->gt_slices.lock);
+	if (dev_priv->gt_slices.state == LEGACY_BUSY) {
+		dev_priv->gt_slices.state = ONE_SLICE;
+		intel_set_gt_half_async(dev);
+	}
+	mutex_unlock(&dev_priv->gt_slices.lock);
 }
 
 void intel_mark_fb_busy(struct drm_i915_gem_object *obj,
diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index cf37741..deb9464 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -837,6 +837,7 @@ void intel_enable_gt_powersave(struct drm_device *dev);
 void intel_disable_gt_powersave(struct drm_device *dev);
 void ironlake_teardown_rc6(struct drm_device *dev);
 int intel_set_gt_slices(struct drm_device *dev, int slices);
+void intel_set_gt_half_async(struct drm_device *dev);
 void intel_init_gt_slices(struct drm_device *dev);
 void gen6_update_ring_freq(struct drm_device *dev);
 void gen6_rps_idle(struct drm_i915_private *dev_priv);
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 40ab76a..643ab05 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -3877,18 +3877,22 @@ static int intel_set_gt_full(struct drm_device *dev)
 
 	/* Slices are enabled on RC6 exit */
 	gen6_gt_force_wake_get(dev_priv);
-
 	if (wait_for(((I915_READ(HSW_GT_SLICE_INFO) & SLICE_STATUS_MASK) ==
 		      SLICE_STATUS_BOTH_ON), 2000)) {
 		DRM_ERROR("Timeout enabling full gt slices\n");
+
 		I915_WRITE(HSW_GT_SLICE_INFO, ~SLICE_SEL_BOTH);
 		I915_WRITE(MI_PREDICATE_RESULT_2, LOWER_SLICE_DISABLED);
+		dev_priv->gt_slices.predicate_result_2 = LOWER_SLICE_DISABLED;
+
 		gen6_gt_force_wake_put(dev_priv);
 		return -ETIMEDOUT;
 	}
+
 	I915_WRITE(MI_PREDICATE_RESULT_2, LOWER_SLICE_ENABLED);
-	gen6_gt_force_wake_put(dev_priv);
+	dev_priv->gt_slices.predicate_result_2 = LOWER_SLICE_ENABLED;
 
+	gen6_gt_force_wake_put(dev_priv);
 	return 0;
 }
 
@@ -3904,28 +3908,71 @@ static int intel_set_gt_half(struct drm_device *dev)
 	if (wait_for(((I915_READ(HSW_GT_SLICE_INFO) & SLICE_STATUS_MASK) ==
 		      SLICE_STATUS_MAIN_ON), 2000)) {
 		DRM_ERROR("Timed out disabling half gt slices\n");
+
 		I915_WRITE(HSW_GT_SLICE_INFO, SLICE_SEL_BOTH);
 		I915_WRITE(MI_PREDICATE_RESULT_2, LOWER_SLICE_ENABLED);
+		dev_priv->gt_slices.predicate_result_2 = LOWER_SLICE_ENABLED;
+
 		gen6_gt_force_wake_put(dev_priv);
 		return -ETIMEDOUT;
 	}
+
 	I915_WRITE(MI_PREDICATE_RESULT_2, LOWER_SLICE_DISABLED);
+	dev_priv->gt_slices.predicate_result_2 = LOWER_SLICE_DISABLED;
+
 	gen6_gt_force_wake_put(dev_priv);
 	return 0;
 }
 
 int intel_set_gt_slices(struct drm_device *dev, int slices)
 {
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	int ret = 0;
+
 	if (!HAS_SLICE_SHUTDOWN(dev))
 		return -ENODEV;
 
 	switch (slices) {
-	case 1: return intel_set_gt_half(dev);
-	case 2: return intel_set_gt_full(dev);
+	case 1:
+		mutex_lock(&dev_priv->gt_slices.lock);
+		if (dev_priv->gt_slices.state == TWO_SLICES) {
+			ret = intel_set_gt_half(dev);
+			if (ret == 0)
+				dev_priv->gt_slices.state = ONE_SLICE;
+		}
+		mutex_unlock(&dev_priv->gt_slices.lock);
+		return ret;
+	case 2:
+		mutex_lock(&dev_priv->gt_slices.lock);
+		if (dev_priv->gt_slices.state == ONE_SLICE) {
+			ret = intel_set_gt_full(dev);
+			if (ret == 0)
+				dev_priv->gt_slices.state = TWO_SLICES;
+		} else if (dev_priv->gt_slices.state == LEGACY_BUSY)
+			dev_priv->gt_slices.state = TWO_SLICES;
+		mutex_unlock(&dev_priv->gt_slices.lock);
+		return ret;
 	default: return -EINVAL;
 	}
 }
 
+/**
+ * On Haswell, slices on/off transitions are done via RC6 sequence.
+ * This async function allows you to request slices shutdown without waiting.
+ * Slices will be disabled on next RC6 exit.
+ */
+void intel_set_gt_half_async(struct drm_device *dev)
+{
+	struct drm_i915_private *dev_priv = dev->dev_private;
+
+	if (!HAS_SLICE_SHUTDOWN(dev))
+		return;
+
+	I915_WRITE(HSW_GT_SLICE_INFO, ~SLICE_SEL_BOTH);
+	I915_WRITE(MI_PREDICATE_RESULT_2, LOWER_SLICE_DISABLED);
+	dev_priv->gt_slices.predicate_result_2 = LOWER_SLICE_DISABLED;
+}
+
 void intel_init_gt_slices(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
@@ -3936,9 +3983,13 @@ void intel_init_gt_slices(struct drm_device *dev)
 	if (!HAS_SLICE_SHUTDOWN(dev))
 		return;
 
+	dev_priv->gt_slices.state = TWO_SLICES;
+	dev_priv->gt_slices.predicate_result_2 = LOWER_SLICE_ENABLED;
+	mutex_init(&dev_priv->gt_slices.lock);
+
 	if (i915_gt_slices == 1) {
-		I915_WRITE(HSW_GT_SLICE_INFO, ~SLICE_SEL_BOTH);
-		I915_WRITE(MI_PREDICATE_RESULT_2, LOWER_SLICE_DISABLED);
+		intel_set_gt_half_async(dev);
+		dev_priv->gt_slices.state = ONE_SLICE;
 		return;
 	}
 
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 3a4e97b..3fa3e24 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -731,7 +731,13 @@ struct drm_i915_gem_execbuffer2 {
  */
 #define I915_EXEC_HANDLE_LUT		(1<<12)
 
-#define __I915_EXEC_UNKNOWN_FLAGS -(I915_EXEC_HANDLE_LUT<<1)
+/* If this flag is set userspace is using predicate and half slices can be
+ * let disabled for power saving. Otherwise use all slices even when disabled
+ * by boot parameter or via sysfs interface
+ */
+#define I915_EXEC_USE_PREDICATE		(1<<13)
+
+#define __I915_EXEC_UNKNOWN_FLAGS -(I915_EXEC_USE_PREDICATE<<1)
 
 #define I915_EXEC_CONTEXT_ID_MASK	(0xffffffff)
 #define i915_execbuffer2_set_context_id(eb2, context) \
-- 
1.7.11.7

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 4/4] drm/i915: Slice Shutdown - Only 1 slice enabled by default to save power.
  2013-11-05 22:44 ` [PATCH 1/4] drm/i915: Allow GT Slices Shutdown on Boot Rodrigo Vivi
  2013-11-05 22:44   ` [PATCH 2/4] drm/i915: Slice Shutdown: Allow setting number of slices on through sysfs Rodrigo Vivi
  2013-11-05 22:44   ` [PATCH 3/4] drm/i915: HSW GT3 Slices: exec flag to warn kernel that userspace is using predication Rodrigo Vivi
@ 2013-11-05 22:44   ` Rodrigo Vivi
  2 siblings, 0 replies; 31+ messages in thread
From: Rodrigo Vivi @ 2013-11-05 22:44 UTC (permalink / raw)
  To: intel-gfx

for better power savings.

v2: Use number of slices (1,2) instead of 0 for half and 1 for full.

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
---
 drivers/gpu/drm/i915/i915_drv.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 71bac6a..d9ee939 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -154,10 +154,10 @@ module_param_named(prefault_disable, i915_prefault_disable, bool, 0600);
 MODULE_PARM_DESC(prefault_disable,
 		"Disable page prefaulting for pread/pwrite/reloc (default:false). For developers only.");
 
-int i915_gt_slices __read_mostly = 2;
+int i915_gt_slices __read_mostly = 1;
 module_param_named(gt_slices, i915_gt_slices, int, 0600);
 MODULE_PARM_DESC(gt_slices,
-		 "Haswell GT3 has two slices with many EUs in each of them. Use 2 for better performance or 1 for better power savings. (default:2)");
+		 "Haswell GT3 has two slices with many EUs in each of them. Use 2 for better performance or 1 for better power savings. (default:1)");
 
 static struct drm_driver driver;
 extern int intel_agp_enabled;
-- 
1.7.11.7

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH] testcases: Slice Shutdown testscases.
  2013-10-21 21:01     ` Rodrigo Vivi
@ 2013-11-05 22:45       ` Rodrigo Vivi
  2013-11-06  7:22         ` Daniel Vetter
  0 siblings, 1 reply; 31+ messages in thread
From: Rodrigo Vivi @ 2013-11-05 22:45 UTC (permalink / raw)
  To: intel-gfx

1. sysfs half/full switch.
4. execbuf without I915_EXEC_USE_PREDICATE
5. execbuf with I915_EXEC_USE_PREDICATE

v2: include more tests and s/GT_FULL/USE_PREDICATE
v3: make it more reliable and fix few comments
v4: use number of slices on (1,2) instead of half and full strings.

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
---
 tests/Makefile.am |   1 +
 tests/gt_slices.c | 268 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 269 insertions(+)
 create mode 100644 tests/gt_slices.c

diff --git a/tests/Makefile.am b/tests/Makefile.am
index 0290ae0..e21230a 100644
--- a/tests/Makefile.am
+++ b/tests/Makefile.am
@@ -50,6 +50,7 @@ TESTS_progs_M = \
 	gem_tiled_blits \
 	gem_tiled_partial_pwrite_pread \
 	gem_write_read_ring_switch \
+	gt_slices \
 	kms_flip \
 	kms_render \
 	kms_setmode \
diff --git a/tests/gt_slices.c b/tests/gt_slices.c
new file mode 100644
index 0000000..8388aca
--- /dev/null
+++ b/tests/gt_slices.c
@@ -0,0 +1,268 @@
+/*
+ * Copyright © 2013 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ *
+ * Authors:
+ *    Rodrigo Vivi <rodrigo.vivi@intel.com>
+ *
+ */
+
+/*
+ * Testcase: Test GT slice shutdown feature
+ *
+ * Sub tests:
+ * 1. sysfs half/full switch.
+ * 4. on execbuf without I915_EXEC_USE_PREDICATE
+ * 5. on execbuf with I915_EXEC_USE_PREDICATE
+ */
+
+#define _GNU_SOURCE
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <unistd.h>
+#include "drmtest.h"
+
+static void exec(int fd, uint32_t handle, bool use_predicate)
+{
+        struct drm_i915_gem_execbuffer2 execbuf;
+        struct drm_i915_gem_exec_object2 gem_exec[1];
+        uint32_t b[2] = {MI_BATCH_BUFFER_END};
+        int loops = 1000000;
+        int ret = 0;
+
+        gem_write(fd, handle, 0, b, sizeof(b));
+
+        gem_exec[0].handle = handle;
+        gem_exec[0].relocation_count = 0;
+        gem_exec[0].relocs_ptr = 0;
+        gem_exec[0].alignment = 0;
+        gem_exec[0].offset = 0;
+        gem_exec[0].flags = 0;
+        gem_exec[0].rsvd1 = 0;
+        gem_exec[0].rsvd2 = 0;
+
+        execbuf.buffers_ptr = (uintptr_t)gem_exec;
+        execbuf.buffer_count = 1;
+        execbuf.batch_start_offset = 0;
+        execbuf.batch_len = 8;
+        execbuf.cliprects_ptr = 0;
+        execbuf.num_cliprects = 0;
+        execbuf.DR1 = 0;
+        execbuf.DR4 = 0;
+        execbuf.flags =  I915_EXEC_RENDER;
+	if (use_predicate)
+		execbuf.flags |= I915_EXEC_USE_PREDICATE;
+        i915_execbuffer2_set_context_id(execbuf, 0);
+        execbuf.rsvd2 = 0;
+
+        while (loops-- && ret == 0) {
+                ret = drmIoctl(fd,
+                               DRM_IOCTL_I915_GEM_EXECBUFFER2,
+                               &execbuf);
+        }
+	gem_sync(fd, handle);
+}
+
+static int slices_on(const int device)
+{
+	char *path;
+	FILE *file;
+	int ret;
+	int slices;
+
+	ret = asprintf(&path, "/sys/class/drm/card%d/power/gt_slices",
+		       device);
+	igt_assert(ret != -1);
+
+	file = fopen(path, "r");
+	igt_require(file);
+
+	ret = fscanf(file, "%d", &slices);
+	igt_assert(ret != 0);
+
+	fclose(file);
+	return slices;
+}
+
+static int set_status(const int device, int slices)
+{
+	char *path;
+	FILE *file;
+	int ret;
+	int attempts = 10;
+
+	ret = asprintf(&path, "/sys/class/drm/card%d/power/gt_slices",
+		       device);
+	igt_assert(ret != -1);
+
+	while (attempts-- && ret != 0) {
+		file = fopen(path, "w");
+		igt_require(file);
+		ret = fprintf(file, "%d\n", slices);
+		igt_assert(ret != -1);
+		ret = fclose(file);
+		sleep(1);
+	}
+	return ret;
+}
+
+int main(int argc, char **argv)
+{
+	char *path;
+	FILE *file;
+	unsigned int rc6_enabled;
+	int i, ret;
+	uint32_t handle;
+	const int device = drm_get_card();
+	const int fd = drm_open_any();
+	const int devid = intel_get_drm_devid(fd);
+	int initial = slices_on(device);
+
+	igt_skip_on_simulation();
+
+	igt_subtest_init(argc, argv);
+
+	igt_fixture {
+		/* On Haswell Slices on/off switch depends on RC6 exit */
+		ret = asprintf(&path, "/sys/class/drm/card%d/power/rc6_enable",
+			       device);
+		igt_assert(ret != -1);
+
+		file = fopen(path, "r");
+		igt_require(file);
+
+		fscanf(file, "%u", &rc6_enabled);
+		fclose(file);
+
+		igt_require(IS_HASWELL(devid));
+		igt_require(rc6_enabled);
+	}
+
+	igt_subtest("sysfs") {
+		/* Switching states */
+		if (initial == 1)
+			ret = set_status(device, 2);
+		else
+			ret = set_status(device, 1);
+		if (ret < 0) {
+			fprintf(stderr, "Switch states via sysfs failed\n");
+			igt_fail(ret);
+		}
+
+		igt_success();
+	}
+
+	igt_subtest("execbuf-legacy") {
+		/* Disable half slices */
+		ret = set_status(device, 1);
+		if (ret < 0)
+			igt_fail(ret);
+
+		/* Wait until it is back to half */
+		for (i = 30; i >= 0 && slices_on(device) == 2; i--) {
+			if (i == 0)
+				igt_skip("Took so long to go to half. Probably a legacy support is in use at GPU simutaneously\n");
+			sleep(1);
+		}
+
+		igt_fork(child, 1) {
+			/* Start Render without I915_EXEC_USE_PREDICATE */
+			handle = gem_create(fd, 4096);
+			exec(fd, handle, false);
+			gem_close(fd, handle);
+		}
+
+		/* Check if it forces full slices enable for legacy support */
+		for (i = 10; i >= 0 && slices_on(device) == 1; i--) {
+			if (i == 0) {
+				fprintf(stderr, "2 slices should be enabled\n");
+				igt_fail(-1);
+			}
+			sleep(1);
+		}
+
+		/* Wait for idleness */
+		igt_waitchildren();
+
+		/* Check if it is back to half */
+		for (i = 20; i >= 0 && slices_on(device) == 2; i--) {
+			if (i == 0) {
+				fprintf(stderr, "Only 1 slice should be enabled\n");
+				igt_fail(-1);
+			}
+			sleep(1);
+		}
+
+		igt_success();
+	}
+
+	igt_subtest("execbuf-predicate") {
+		/* Disable half slices */
+		ret = set_status(device, 1);
+		if (ret < 0)
+			igt_fail(ret);
+
+		/* Wait until it is back to half */
+		for (i = 30; i >= 0 && slices_on(device) == 2; i--) {
+			if (i == 0)
+				igt_skip("Took so long to go to half. Probably a legacy support is in use at GPU simutaneously\n");
+			sleep(1);
+		}
+
+		igt_fork(child, 1) {
+			/* Start Render with I915_EXEC_USE_PREDICATE */
+			handle = gem_create(fd, 4096);
+			exec(fd, handle, true);
+			gem_close(fd, handle);
+		}
+
+		/* Check if it stays with only half slices enabled */
+		sleep(1);
+		if (slices_on(device) == 2) {
+			fprintf(stderr, "Only 1 slice should be enabled during execution with predicate\n");
+			set_status(device, initial);
+			igt_fail(-1);
+		}
+
+		/* Wait for idleness */
+		igt_waitchildren();
+
+		/* Check if it is back to half */
+		for (i = 20; i >= 0 && slices_on(device) == 2; i--) {
+			if (i == 0) {
+				fprintf(stderr, "Only 1 slice should be enabled\n");
+				set_status(device, initial);
+				igt_fail(-1);
+			}
+			sleep(1);
+		}
+
+		set_status(device, initial);
+		igt_success();
+	}
+
+	igt_fixture {
+		close(fd);
+	}
+
+	igt_exit();
+}
-- 
1.7.11.7

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [PATCH] testcases: Slice Shutdown testscases.
  2013-11-05 22:45       ` Rodrigo Vivi
@ 2013-11-06  7:22         ` Daniel Vetter
  2013-11-06 15:29           ` Rodrigo Vivi
  0 siblings, 1 reply; 31+ messages in thread
From: Daniel Vetter @ 2013-11-06  7:22 UTC (permalink / raw)
  To: Rodrigo Vivi; +Cc: intel-gfx

On Tue, Nov 05, 2013 at 08:45:09PM -0200, Rodrigo Vivi wrote:
> 1. sysfs half/full switch.
> 4. execbuf without I915_EXEC_USE_PREDICATE
> 5. execbuf with I915_EXEC_USE_PREDICATE
> 
> v2: include more tests and s/GT_FULL/USE_PREDICATE
> v3: make it more reliable and fix few comments
> v4: use number of slices on (1,2) instead of half and full strings.
> 
> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
> ---
>  tests/Makefile.am |   1 +
>  tests/gt_slices.c | 268 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 269 insertions(+)
>  create mode 100644 tests/gt_slices.c
> 
> diff --git a/tests/Makefile.am b/tests/Makefile.am
> index 0290ae0..e21230a 100644
> --- a/tests/Makefile.am
> +++ b/tests/Makefile.am
> @@ -50,6 +50,7 @@ TESTS_progs_M = \
>  	gem_tiled_blits \
>  	gem_tiled_partial_pwrite_pread \
>  	gem_write_read_ring_switch \
> +	gt_slices \

With the new namimg conventions I'd give this a pm_ prefix.
-Daniel

>  	kms_flip \
>  	kms_render \
>  	kms_setmode \
> diff --git a/tests/gt_slices.c b/tests/gt_slices.c
> new file mode 100644
> index 0000000..8388aca
> --- /dev/null
> +++ b/tests/gt_slices.c
> @@ -0,0 +1,268 @@
> +/*
> + * Copyright © 2013 Intel Corporation
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the "Software"),
> + * to deal in the Software without restriction, including without limitation
> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice (including the next
> + * paragraph) shall be included in all copies or substantial portions of the
> + * Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
> + * IN THE SOFTWARE.
> + *
> + * Authors:
> + *    Rodrigo Vivi <rodrigo.vivi@intel.com>
> + *
> + */
> +
> +/*
> + * Testcase: Test GT slice shutdown feature
> + *
> + * Sub tests:
> + * 1. sysfs half/full switch.
> + * 4. on execbuf without I915_EXEC_USE_PREDICATE
> + * 5. on execbuf with I915_EXEC_USE_PREDICATE
> + */
> +
> +#define _GNU_SOURCE
> +#include <stdio.h>
> +#include <stdlib.h>
> +#include <string.h>
> +#include <unistd.h>
> +#include "drmtest.h"
> +
> +static void exec(int fd, uint32_t handle, bool use_predicate)
> +{
> +        struct drm_i915_gem_execbuffer2 execbuf;
> +        struct drm_i915_gem_exec_object2 gem_exec[1];
> +        uint32_t b[2] = {MI_BATCH_BUFFER_END};
> +        int loops = 1000000;
> +        int ret = 0;
> +
> +        gem_write(fd, handle, 0, b, sizeof(b));
> +
> +        gem_exec[0].handle = handle;
> +        gem_exec[0].relocation_count = 0;
> +        gem_exec[0].relocs_ptr = 0;
> +        gem_exec[0].alignment = 0;
> +        gem_exec[0].offset = 0;
> +        gem_exec[0].flags = 0;
> +        gem_exec[0].rsvd1 = 0;
> +        gem_exec[0].rsvd2 = 0;
> +
> +        execbuf.buffers_ptr = (uintptr_t)gem_exec;
> +        execbuf.buffer_count = 1;
> +        execbuf.batch_start_offset = 0;
> +        execbuf.batch_len = 8;
> +        execbuf.cliprects_ptr = 0;
> +        execbuf.num_cliprects = 0;
> +        execbuf.DR1 = 0;
> +        execbuf.DR4 = 0;
> +        execbuf.flags =  I915_EXEC_RENDER;
> +	if (use_predicate)
> +		execbuf.flags |= I915_EXEC_USE_PREDICATE;
> +        i915_execbuffer2_set_context_id(execbuf, 0);
> +        execbuf.rsvd2 = 0;
> +
> +        while (loops-- && ret == 0) {
> +                ret = drmIoctl(fd,
> +                               DRM_IOCTL_I915_GEM_EXECBUFFER2,
> +                               &execbuf);
> +        }
> +	gem_sync(fd, handle);
> +}
> +
> +static int slices_on(const int device)
> +{
> +	char *path;
> +	FILE *file;
> +	int ret;
> +	int slices;
> +
> +	ret = asprintf(&path, "/sys/class/drm/card%d/power/gt_slices",
> +		       device);
> +	igt_assert(ret != -1);
> +
> +	file = fopen(path, "r");
> +	igt_require(file);
> +
> +	ret = fscanf(file, "%d", &slices);
> +	igt_assert(ret != 0);
> +
> +	fclose(file);
> +	return slices;
> +}
> +
> +static int set_status(const int device, int slices)
> +{
> +	char *path;
> +	FILE *file;
> +	int ret;
> +	int attempts = 10;
> +
> +	ret = asprintf(&path, "/sys/class/drm/card%d/power/gt_slices",
> +		       device);
> +	igt_assert(ret != -1);
> +
> +	while (attempts-- && ret != 0) {
> +		file = fopen(path, "w");
> +		igt_require(file);
> +		ret = fprintf(file, "%d\n", slices);
> +		igt_assert(ret != -1);
> +		ret = fclose(file);
> +		sleep(1);
> +	}
> +	return ret;
> +}
> +
> +int main(int argc, char **argv)
> +{
> +	char *path;
> +	FILE *file;
> +	unsigned int rc6_enabled;
> +	int i, ret;
> +	uint32_t handle;
> +	const int device = drm_get_card();
> +	const int fd = drm_open_any();
> +	const int devid = intel_get_drm_devid(fd);
> +	int initial = slices_on(device);
> +
> +	igt_skip_on_simulation();
> +
> +	igt_subtest_init(argc, argv);
> +
> +	igt_fixture {
> +		/* On Haswell Slices on/off switch depends on RC6 exit */
> +		ret = asprintf(&path, "/sys/class/drm/card%d/power/rc6_enable",
> +			       device);
> +		igt_assert(ret != -1);
> +
> +		file = fopen(path, "r");
> +		igt_require(file);
> +
> +		fscanf(file, "%u", &rc6_enabled);
> +		fclose(file);
> +
> +		igt_require(IS_HASWELL(devid));
> +		igt_require(rc6_enabled);
> +	}
> +
> +	igt_subtest("sysfs") {
> +		/* Switching states */
> +		if (initial == 1)
> +			ret = set_status(device, 2);
> +		else
> +			ret = set_status(device, 1);
> +		if (ret < 0) {
> +			fprintf(stderr, "Switch states via sysfs failed\n");
> +			igt_fail(ret);
> +		}
> +
> +		igt_success();
> +	}
> +
> +	igt_subtest("execbuf-legacy") {
> +		/* Disable half slices */
> +		ret = set_status(device, 1);
> +		if (ret < 0)
> +			igt_fail(ret);
> +
> +		/* Wait until it is back to half */
> +		for (i = 30; i >= 0 && slices_on(device) == 2; i--) {
> +			if (i == 0)
> +				igt_skip("Took so long to go to half. Probably a legacy support is in use at GPU simutaneously\n");
> +			sleep(1);
> +		}
> +
> +		igt_fork(child, 1) {
> +			/* Start Render without I915_EXEC_USE_PREDICATE */
> +			handle = gem_create(fd, 4096);
> +			exec(fd, handle, false);
> +			gem_close(fd, handle);
> +		}
> +
> +		/* Check if it forces full slices enable for legacy support */
> +		for (i = 10; i >= 0 && slices_on(device) == 1; i--) {
> +			if (i == 0) {
> +				fprintf(stderr, "2 slices should be enabled\n");
> +				igt_fail(-1);
> +			}
> +			sleep(1);
> +		}
> +
> +		/* Wait for idleness */
> +		igt_waitchildren();
> +
> +		/* Check if it is back to half */
> +		for (i = 20; i >= 0 && slices_on(device) == 2; i--) {
> +			if (i == 0) {
> +				fprintf(stderr, "Only 1 slice should be enabled\n");
> +				igt_fail(-1);
> +			}
> +			sleep(1);
> +		}
> +
> +		igt_success();
> +	}
> +
> +	igt_subtest("execbuf-predicate") {
> +		/* Disable half slices */
> +		ret = set_status(device, 1);
> +		if (ret < 0)
> +			igt_fail(ret);
> +
> +		/* Wait until it is back to half */
> +		for (i = 30; i >= 0 && slices_on(device) == 2; i--) {
> +			if (i == 0)
> +				igt_skip("Took so long to go to half. Probably a legacy support is in use at GPU simutaneously\n");
> +			sleep(1);
> +		}
> +
> +		igt_fork(child, 1) {
> +			/* Start Render with I915_EXEC_USE_PREDICATE */
> +			handle = gem_create(fd, 4096);
> +			exec(fd, handle, true);
> +			gem_close(fd, handle);
> +		}
> +
> +		/* Check if it stays with only half slices enabled */
> +		sleep(1);
> +		if (slices_on(device) == 2) {
> +			fprintf(stderr, "Only 1 slice should be enabled during execution with predicate\n");
> +			set_status(device, initial);
> +			igt_fail(-1);
> +		}
> +
> +		/* Wait for idleness */
> +		igt_waitchildren();
> +
> +		/* Check if it is back to half */
> +		for (i = 20; i >= 0 && slices_on(device) == 2; i--) {
> +			if (i == 0) {
> +				fprintf(stderr, "Only 1 slice should be enabled\n");
> +				set_status(device, initial);
> +				igt_fail(-1);
> +			}
> +			sleep(1);
> +		}
> +
> +		set_status(device, initial);
> +		igt_success();
> +	}
> +
> +	igt_fixture {
> +		close(fd);
> +	}
> +
> +	igt_exit();
> +}
> -- 
> 1.7.11.7
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH] testcases: Slice Shutdown testscases.
  2013-11-06  7:22         ` Daniel Vetter
@ 2013-11-06 15:29           ` Rodrigo Vivi
  0 siblings, 0 replies; 31+ messages in thread
From: Rodrigo Vivi @ 2013-11-06 15:29 UTC (permalink / raw)
  To: intel-gfx

1. sysfs half/full switch.
4. execbuf without I915_EXEC_USE_PREDICATE
5. execbuf with I915_EXEC_USE_PREDICATE

v2: include more tests and s/GT_FULL/USE_PREDICATE
v3: make it more reliable and fix few comments
v4: use number of slices on (1,2) instead of half and full strings.
v5: respect new naming convention

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
---
 tests/Makefile.am    |   1 +
 tests/pm_gt_slices.c | 268 +++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 269 insertions(+)
 create mode 100644 tests/pm_gt_slices.c

diff --git a/tests/Makefile.am b/tests/Makefile.am
index 0426ec0..e114bdf 100644
--- a/tests/Makefile.am
+++ b/tests/Makefile.am
@@ -57,6 +57,7 @@ TESTS_progs_M = \
 	kms_render \
 	kms_setmode \
 	$(NOUVEAU_TESTS_M) \
+	pm_gt_slices \
 	pm_pc8 \
 	prime_self_import \
 	template \
diff --git a/tests/pm_gt_slices.c b/tests/pm_gt_slices.c
new file mode 100644
index 0000000..8388aca
--- /dev/null
+++ b/tests/pm_gt_slices.c
@@ -0,0 +1,268 @@
+/*
+ * Copyright © 2013 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ *
+ * Authors:
+ *    Rodrigo Vivi <rodrigo.vivi@intel.com>
+ *
+ */
+
+/*
+ * Testcase: Test GT slice shutdown feature
+ *
+ * Sub tests:
+ * 1. sysfs half/full switch.
+ * 4. on execbuf without I915_EXEC_USE_PREDICATE
+ * 5. on execbuf with I915_EXEC_USE_PREDICATE
+ */
+
+#define _GNU_SOURCE
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <unistd.h>
+#include "drmtest.h"
+
+static void exec(int fd, uint32_t handle, bool use_predicate)
+{
+        struct drm_i915_gem_execbuffer2 execbuf;
+        struct drm_i915_gem_exec_object2 gem_exec[1];
+        uint32_t b[2] = {MI_BATCH_BUFFER_END};
+        int loops = 1000000;
+        int ret = 0;
+
+        gem_write(fd, handle, 0, b, sizeof(b));
+
+        gem_exec[0].handle = handle;
+        gem_exec[0].relocation_count = 0;
+        gem_exec[0].relocs_ptr = 0;
+        gem_exec[0].alignment = 0;
+        gem_exec[0].offset = 0;
+        gem_exec[0].flags = 0;
+        gem_exec[0].rsvd1 = 0;
+        gem_exec[0].rsvd2 = 0;
+
+        execbuf.buffers_ptr = (uintptr_t)gem_exec;
+        execbuf.buffer_count = 1;
+        execbuf.batch_start_offset = 0;
+        execbuf.batch_len = 8;
+        execbuf.cliprects_ptr = 0;
+        execbuf.num_cliprects = 0;
+        execbuf.DR1 = 0;
+        execbuf.DR4 = 0;
+        execbuf.flags =  I915_EXEC_RENDER;
+	if (use_predicate)
+		execbuf.flags |= I915_EXEC_USE_PREDICATE;
+        i915_execbuffer2_set_context_id(execbuf, 0);
+        execbuf.rsvd2 = 0;
+
+        while (loops-- && ret == 0) {
+                ret = drmIoctl(fd,
+                               DRM_IOCTL_I915_GEM_EXECBUFFER2,
+                               &execbuf);
+        }
+	gem_sync(fd, handle);
+}
+
+static int slices_on(const int device)
+{
+	char *path;
+	FILE *file;
+	int ret;
+	int slices;
+
+	ret = asprintf(&path, "/sys/class/drm/card%d/power/gt_slices",
+		       device);
+	igt_assert(ret != -1);
+
+	file = fopen(path, "r");
+	igt_require(file);
+
+	ret = fscanf(file, "%d", &slices);
+	igt_assert(ret != 0);
+
+	fclose(file);
+	return slices;
+}
+
+static int set_status(const int device, int slices)
+{
+	char *path;
+	FILE *file;
+	int ret;
+	int attempts = 10;
+
+	ret = asprintf(&path, "/sys/class/drm/card%d/power/gt_slices",
+		       device);
+	igt_assert(ret != -1);
+
+	while (attempts-- && ret != 0) {
+		file = fopen(path, "w");
+		igt_require(file);
+		ret = fprintf(file, "%d\n", slices);
+		igt_assert(ret != -1);
+		ret = fclose(file);
+		sleep(1);
+	}
+	return ret;
+}
+
+int main(int argc, char **argv)
+{
+	char *path;
+	FILE *file;
+	unsigned int rc6_enabled;
+	int i, ret;
+	uint32_t handle;
+	const int device = drm_get_card();
+	const int fd = drm_open_any();
+	const int devid = intel_get_drm_devid(fd);
+	int initial = slices_on(device);
+
+	igt_skip_on_simulation();
+
+	igt_subtest_init(argc, argv);
+
+	igt_fixture {
+		/* On Haswell Slices on/off switch depends on RC6 exit */
+		ret = asprintf(&path, "/sys/class/drm/card%d/power/rc6_enable",
+			       device);
+		igt_assert(ret != -1);
+
+		file = fopen(path, "r");
+		igt_require(file);
+
+		fscanf(file, "%u", &rc6_enabled);
+		fclose(file);
+
+		igt_require(IS_HASWELL(devid));
+		igt_require(rc6_enabled);
+	}
+
+	igt_subtest("sysfs") {
+		/* Switching states */
+		if (initial == 1)
+			ret = set_status(device, 2);
+		else
+			ret = set_status(device, 1);
+		if (ret < 0) {
+			fprintf(stderr, "Switch states via sysfs failed\n");
+			igt_fail(ret);
+		}
+
+		igt_success();
+	}
+
+	igt_subtest("execbuf-legacy") {
+		/* Disable half slices */
+		ret = set_status(device, 1);
+		if (ret < 0)
+			igt_fail(ret);
+
+		/* Wait until it is back to half */
+		for (i = 30; i >= 0 && slices_on(device) == 2; i--) {
+			if (i == 0)
+				igt_skip("Took so long to go to half. Probably a legacy support is in use at GPU simutaneously\n");
+			sleep(1);
+		}
+
+		igt_fork(child, 1) {
+			/* Start Render without I915_EXEC_USE_PREDICATE */
+			handle = gem_create(fd, 4096);
+			exec(fd, handle, false);
+			gem_close(fd, handle);
+		}
+
+		/* Check if it forces full slices enable for legacy support */
+		for (i = 10; i >= 0 && slices_on(device) == 1; i--) {
+			if (i == 0) {
+				fprintf(stderr, "2 slices should be enabled\n");
+				igt_fail(-1);
+			}
+			sleep(1);
+		}
+
+		/* Wait for idleness */
+		igt_waitchildren();
+
+		/* Check if it is back to half */
+		for (i = 20; i >= 0 && slices_on(device) == 2; i--) {
+			if (i == 0) {
+				fprintf(stderr, "Only 1 slice should be enabled\n");
+				igt_fail(-1);
+			}
+			sleep(1);
+		}
+
+		igt_success();
+	}
+
+	igt_subtest("execbuf-predicate") {
+		/* Disable half slices */
+		ret = set_status(device, 1);
+		if (ret < 0)
+			igt_fail(ret);
+
+		/* Wait until it is back to half */
+		for (i = 30; i >= 0 && slices_on(device) == 2; i--) {
+			if (i == 0)
+				igt_skip("Took so long to go to half. Probably a legacy support is in use at GPU simutaneously\n");
+			sleep(1);
+		}
+
+		igt_fork(child, 1) {
+			/* Start Render with I915_EXEC_USE_PREDICATE */
+			handle = gem_create(fd, 4096);
+			exec(fd, handle, true);
+			gem_close(fd, handle);
+		}
+
+		/* Check if it stays with only half slices enabled */
+		sleep(1);
+		if (slices_on(device) == 2) {
+			fprintf(stderr, "Only 1 slice should be enabled during execution with predicate\n");
+			set_status(device, initial);
+			igt_fail(-1);
+		}
+
+		/* Wait for idleness */
+		igt_waitchildren();
+
+		/* Check if it is back to half */
+		for (i = 20; i >= 0 && slices_on(device) == 2; i--) {
+			if (i == 0) {
+				fprintf(stderr, "Only 1 slice should be enabled\n");
+				set_status(device, initial);
+				igt_fail(-1);
+			}
+			sleep(1);
+		}
+
+		set_status(device, initial);
+		igt_success();
+	}
+
+	igt_fixture {
+		close(fd);
+	}
+
+	igt_exit();
+}
-- 
1.7.11.7

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 31+ messages in thread

end of thread, other threads:[~2013-11-06 15:30 UTC | newest]

Thread overview: 31+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-10-15 14:41 [PATCH 1/3] drm/i915: Allow GT Slices Shutdown on Boot Rodrigo Vivi
2013-10-15 14:41 ` [PATCH] i915_drm: add exec flag for request forcing Intel GT full Rodrigo Vivi
2013-10-15 20:50   ` [PATCH] i915_drm: add exec flag to warn kernel that userspace is using predication Rodrigo Vivi
2013-10-15 14:41 ` [PATCH] testcases: Slice Shutdown sysfs switch and gt force full exec buffer flag Rodrigo Vivi
2013-10-15 20:51   ` [PATCH] testcases: Slice Shutdown testscases Rodrigo Vivi
2013-10-21 21:01     ` Rodrigo Vivi
2013-11-05 22:45       ` Rodrigo Vivi
2013-11-06  7:22         ` Daniel Vetter
2013-11-06 15:29           ` Rodrigo Vivi
2013-10-15 14:41 ` [PATCH 2/3] drm/i915: Slice Shutdown: Allow setting slice config (full x half) through sysfs Rodrigo Vivi
2013-10-15 14:41 ` [PATCH 3/3] drm/i915: HSW GT3 Slices: Userspace can dynamically forcibly enable all slices Rodrigo Vivi
2013-10-15 16:14   ` Eric Anholt
2013-10-15 17:19     ` Rodrigo Vivi
2013-10-15 20:50       ` [PATCH 3/4] drm/i915: HSW GT3 Slices: exec flag to warn kernel that userspace is using predication Rodrigo Vivi
2013-10-16  9:15         ` Chris Wilson
2013-10-16 12:12           ` Rodrigo Vivi
2013-10-16 12:19             ` Chris Wilson
2013-10-21 21:00               ` [PATCH] " Rodrigo Vivi
2013-10-22  7:47                 ` Chris Wilson
2013-10-22 17:33                   ` Rodrigo Vivi
2013-10-22 17:40                     ` Chris Wilson
2013-10-24 20:24                       ` Rodrigo Vivi
2013-10-25  8:43                         ` Chris Wilson
2013-10-31 23:07           ` Rodrigo Vivi
2013-11-01 10:39             ` Chris Wilson
2013-11-02 12:49               ` Rodrigo Vivi
2013-11-02 12:54                 ` Chris Wilson
2013-11-05 22:44 ` [PATCH 1/4] drm/i915: Allow GT Slices Shutdown on Boot Rodrigo Vivi
2013-11-05 22:44   ` [PATCH 2/4] drm/i915: Slice Shutdown: Allow setting number of slices on through sysfs Rodrigo Vivi
2013-11-05 22:44   ` [PATCH 3/4] drm/i915: HSW GT3 Slices: exec flag to warn kernel that userspace is using predication Rodrigo Vivi
2013-11-05 22:44   ` [PATCH 4/4] drm/i915: Slice Shutdown - Only 1 slice enabled by default to save power Rodrigo Vivi

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.