intel-gfx.lists.freedesktop.org archive mirror
 help / color / mirror / Atom feed
* forcewake junk, RFC, RFT(test)
@ 2011-04-08 17:47 Ben Widawsky
  2011-04-08 17:47 ` [PATCH 1/4] drm/i915: proper use of forcewake Ben Widawsky
                   ` (5 more replies)
  0 siblings, 6 replies; 18+ messages in thread
From: Ben Widawsky @ 2011-04-08 17:47 UTC (permalink / raw)
  To: intel-gfx


I am requesting 3 things:
1. code review/request for comments
2. testing on SNB
3. performance regression on < SNB

review
------
The first two patches have nothing to do with the user's ability to
interact with registers. Those patches are about enforcing correctness
within our driver for newer generation products. In other words, if
patch 3 doesn't make sense, don't automatically drop 1, and 2.

review patch 1/2
----------------
The first change is straight forward. It attempts to fold the forcewake stuff
into our standard register read and write functions. Some overhead is added as
a result, but I'd guess it is nothing compared to the UC read about to happen,
and so will not be noticeable.

The existing method for doing the forcewake_get and put requires some
synchronization. For the most part it is protected by struct_mutex and life is
good. Adding a WARN to the get()/put() function is there more as documentation
to future developers, and reminders to the current ones.

To provide an interface to allow user space to use forcewake, I've decided to
change the mechanism that get()/put() operate by introducing a reference count.
The reference count itself must be protected by struct mutex (since we need
synchronization between the initial condition and the destructor). Imagine for
instance: Thread A does a get() and is in the middle of waking the GT (ref has
already been incremented), thread B comes in, thinks the GT is awake,
and incorrectly goes about its business. This does allow users of the
interface to only have to hold struct_mutex while doing the get() and
not for every read and write.

review patch 3
--------------
User space interface is mostly what you'd expect, except the in the case
of trying to get lockless access. This code is a bit meh, but to
remind everyone, it is root only debug code.

testing
-------
The assertion that forcewake is currently properly protected for the
most part, may not be true. People interested should run these patches
on SNB systems with their favorite graphics applications and report the
warnings that occur, they will be in the kernel log, ie. dmesg.

performance
-----------
Looking mostly for regressions on older systems. There is a slight
overhead added to all register reads and writes, which I think shouldn't
be noticeable, but who knows.

Thanks, and let the flames begin!
Ben

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH 1/4] drm/i915: proper use of forcewake
  2011-04-08 17:47 forcewake junk, RFC, RFT(test) Ben Widawsky
@ 2011-04-08 17:47 ` Ben Widawsky
  2011-04-08 17:47 ` [PATCH 2/4] drm/i915: refcounts for forcewake Ben Widawsky
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 18+ messages in thread
From: Ben Widawsky @ 2011-04-08 17:47 UTC (permalink / raw)
  To: intel-gfx

Moved the macros around to properly do reads and writes for the given
gpu. This is to address special requirements for gen6 (SNB) reads and
writes.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_drv.h         |   56 ++++++++++++++----------------
 drivers/gpu/drm/i915/intel_ringbuffer.h |    4 +-
 2 files changed, 28 insertions(+), 32 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 5004724..8981173 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1313,9 +1313,31 @@ extern void intel_display_print_error_state(struct seq_file *m,
 } while (0)
 
 
+
+/* On SNB platform, before reading ring registers forcewake bit
+ * must be set to prevent GT core from power down and stale values being
+ * returned.
+ */
+void __gen6_gt_force_wake_get(struct drm_i915_private *dev_priv);
+void __gen6_gt_force_wake_put(struct drm_i915_private *dev_priv);
+void __gen6_gt_wait_for_fifo(struct drm_i915_private *dev_priv);
+
+/* We give fast paths for the really cool registers */
+#define NEEDS_FORCE_WAKE(dev_priv, reg) \
+	(((dev_priv)->info->gen >= 6) && \
+	((reg) < 0x40000) && \
+	((reg) != FORCEWAKE) )
+
 #define __i915_read(x, y) \
 static inline u##x i915_read##x(struct drm_i915_private *dev_priv, u32 reg) { \
-	u##x val = read##y(dev_priv->regs + reg); \
+	u##x val = 0; \
+	if (NEEDS_FORCE_WAKE((dev_priv), (reg))) { \
+		__gen6_gt_force_wake_get(dev_priv); \
+		val = read##y(dev_priv->regs + reg); \
+		__gen6_gt_force_wake_put(dev_priv); \
+	} else { \
+		val = read##y(dev_priv->regs + reg); \
+	} \
 	trace_i915_reg_rw(false, reg, val, sizeof(val)); \
 	return val; \
 }
@@ -1328,6 +1350,9 @@ __i915_read(64, q)
 #define __i915_write(x, y) \
 static inline void i915_write##x(struct drm_i915_private *dev_priv, u32 reg, u##x val) { \
 	trace_i915_reg_rw(true, reg, val, sizeof(val)); \
+	if (NEEDS_FORCE_WAKE((dev_priv), (reg))) { \
+		__gen6_gt_wait_for_fifo(dev_priv); \
+	} \
 	write##y(val, dev_priv->regs + reg); \
 }
 __i915_write(8, b)
@@ -1356,33 +1381,4 @@ __i915_write(64, q)
 #define POSTING_READ16(reg)	(void)I915_READ16_NOTRACE(reg)
 
 
-/* On SNB platform, before reading ring registers forcewake bit
- * must be set to prevent GT core from power down and stale values being
- * returned.
- */
-void __gen6_gt_force_wake_get(struct drm_i915_private *dev_priv);
-void __gen6_gt_force_wake_put(struct drm_i915_private *dev_priv);
-void __gen6_gt_wait_for_fifo(struct drm_i915_private *dev_priv);
-
-static inline u32 i915_gt_read(struct drm_i915_private *dev_priv, u32 reg)
-{
-	u32 val;
-
-	if (dev_priv->info->gen >= 6) {
-		__gen6_gt_force_wake_get(dev_priv);
-		val = I915_READ(reg);
-		__gen6_gt_force_wake_put(dev_priv);
-	} else
-		val = I915_READ(reg);
-
-	return val;
-}
-
-static inline void i915_gt_write(struct drm_i915_private *dev_priv,
-				u32 reg, u32 val)
-{
-	if (dev_priv->info->gen >= 6)
-		__gen6_gt_wait_for_fifo(dev_priv);
-	I915_WRITE(reg, val);
-}
 #endif
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index f23cc5f..8e8821f 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -14,8 +14,8 @@ struct  intel_hw_status_page {
 	struct		drm_i915_gem_object *obj;
 };
 
-#define I915_RING_READ(reg) i915_gt_read(dev_priv, reg)
-#define I915_RING_WRITE(reg, val) i915_gt_write(dev_priv, reg, val)
+#define I915_RING_READ(reg) I915_READ(reg)
+#define I915_RING_WRITE(reg, val) I915_WRITE(reg, val)
 
 #define I915_READ_TAIL(ring) I915_RING_READ(RING_TAIL((ring)->mmio_base))
 #define I915_WRITE_TAIL(ring, val) I915_RING_WRITE(RING_TAIL((ring)->mmio_base), val)
-- 
1.7.3.4

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 2/4] drm/i915: refcounts for forcewake
  2011-04-08 17:47 forcewake junk, RFC, RFT(test) Ben Widawsky
  2011-04-08 17:47 ` [PATCH 1/4] drm/i915: proper use of forcewake Ben Widawsky
@ 2011-04-08 17:47 ` Ben Widawsky
  2011-04-08 17:47 ` [PATCH 3/4] drm/i915: userspace interface to the forcewake refcount Ben Widawsky
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 18+ messages in thread
From: Ben Widawsky @ 2011-04-08 17:47 UTC (permalink / raw)
  To: intel-gfx

New refcounting mechanism for forcewake. The idea is to take away the
need to hold struct_mutex for the duration of register reads, and only
need to hold it to get a reference count. A possible benefit would shave
off some time in actually waking the GPU if it's already awake (a couple
of extra register reads).

The refcounting also provides a clean way for userspace to gain access
to correct register reads.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_debugfs.c  |   11 +++++++++--
 drivers/gpu/drm/i915/i915_drv.c      |   20 ++++++++++++++++++--
 drivers/gpu/drm/i915/i915_drv.h      |   10 ++++++----
 drivers/gpu/drm/i915/intel_display.c |    8 ++++----
 4 files changed, 37 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 87c8e29..6eee101 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -852,6 +852,7 @@ static int i915_cur_delayinfo(struct seq_file *m, void *unused)
 	struct drm_info_node *node = (struct drm_info_node *) m->private;
 	struct drm_device *dev = node->minor->dev;
 	drm_i915_private_t *dev_priv = dev->dev_private;
+	int ret;
 
 	if (IS_GEN5(dev)) {
 		u16 rgvswctl = I915_READ16(MEMSWCTL);
@@ -873,7 +874,11 @@ static int i915_cur_delayinfo(struct seq_file *m, void *unused)
 		int max_freq;
 
 		/* RPSTAT1 is in the GT power well */
-		__gen6_gt_force_wake_get(dev_priv);
+		ret = mutex_lock_interruptible(&dev->struct_mutex);
+		if (ret)
+			return ret;
+		gen6_gt_force_wake_get(dev_priv);
+		mutex_unlock(&dev->struct_mutex);
 
 		rpstat = I915_READ(GEN6_RPSTAT1);
 		rpupei = I915_READ(GEN6_RP_CUR_UP_EI);
@@ -918,7 +923,9 @@ static int i915_cur_delayinfo(struct seq_file *m, void *unused)
 		seq_printf(m, "Max non-overclocked (RP0) frequency: %dMHz\n",
 			   max_freq * 50);
 
-		__gen6_gt_force_wake_put(dev_priv);
+		mutex_lock(&dev->struct_mutex);
+		gen6_gt_force_wake_put(dev_priv);
+		mutex_unlock(&dev->struct_mutex);
 	} else {
 		seq_printf(m, "no P-state info available\n");
 	}
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index c34a8dd..ce59cd4 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -263,7 +263,7 @@ void intel_detect_pch (struct drm_device *dev)
 	}
 }
 
-void __gen6_gt_force_wake_get(struct drm_i915_private *dev_priv)
+static void __gen6_gt_force_wake_get(struct drm_i915_private *dev_priv)
 {
 	int count;
 
@@ -279,12 +279,28 @@ void __gen6_gt_force_wake_get(struct drm_i915_private *dev_priv)
 		udelay(10);
 }
 
-void __gen6_gt_force_wake_put(struct drm_i915_private *dev_priv)
+void gen6_gt_force_wake_get(struct drm_i915_private *dev_priv)
+{
+	WARN_ON(!mutex_is_locked(&dev_priv->dev->struct_mutex));
+
+	if (dev_priv->forcewake_count++ == 0)
+		__gen6_gt_force_wake_get(dev_priv);
+}
+
+static void __gen6_gt_force_wake_put(struct drm_i915_private *dev_priv)
 {
 	I915_WRITE_NOTRACE(FORCEWAKE, 0);
 	POSTING_READ(FORCEWAKE);
 }
 
+void gen6_gt_force_wake_put(struct drm_i915_private *dev_priv)
+{
+	WARN_ON(!mutex_is_locked(&dev_priv->dev->struct_mutex));
+
+	if (--dev_priv->forcewake_count == 0)
+		__gen6_gt_force_wake_put(dev_priv);
+}
+
 void __gen6_gt_wait_for_fifo(struct drm_i915_private *dev_priv)
 {
 	int loop = 500;
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 8981173..f024927 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -703,6 +703,8 @@ typedef struct drm_i915_private {
 	struct intel_fbdev *fbdev;
 
 	struct drm_property *broadcast_rgb_property;
+
+	int forcewake_count;
 } drm_i915_private_t;
 
 struct drm_i915_gem_object {
@@ -1318,8 +1320,8 @@ extern void intel_display_print_error_state(struct seq_file *m,
  * must be set to prevent GT core from power down and stale values being
  * returned.
  */
-void __gen6_gt_force_wake_get(struct drm_i915_private *dev_priv);
-void __gen6_gt_force_wake_put(struct drm_i915_private *dev_priv);
+void gen6_gt_force_wake_get(struct drm_i915_private *dev_priv);
+void gen6_gt_force_wake_put(struct drm_i915_private *dev_priv);
 void __gen6_gt_wait_for_fifo(struct drm_i915_private *dev_priv);
 
 /* We give fast paths for the really cool registers */
@@ -1332,9 +1334,9 @@ void __gen6_gt_wait_for_fifo(struct drm_i915_private *dev_priv);
 static inline u##x i915_read##x(struct drm_i915_private *dev_priv, u32 reg) { \
 	u##x val = 0; \
 	if (NEEDS_FORCE_WAKE((dev_priv), (reg))) { \
-		__gen6_gt_force_wake_get(dev_priv); \
+		gen6_gt_force_wake_get(dev_priv); \
 		val = read##y(dev_priv->regs + reg); \
-		__gen6_gt_force_wake_put(dev_priv); \
+		gen6_gt_force_wake_put(dev_priv); \
 	} else { \
 		val = read##y(dev_priv->regs + reg); \
 	} \
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 432fc04..4585005 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -1828,7 +1828,7 @@ static void sandybridge_blit_fbc_update(struct drm_device *dev)
 	u32 blt_ecoskpd;
 
 	/* Make sure blitter notifies FBC of writes */
-	__gen6_gt_force_wake_get(dev_priv);
+	gen6_gt_force_wake_get(dev_priv);
 	blt_ecoskpd = I915_READ(GEN6_BLITTER_ECOSKPD);
 	blt_ecoskpd |= GEN6_BLITTER_FBC_NOTIFY <<
 		GEN6_BLITTER_LOCK_SHIFT;
@@ -1839,7 +1839,7 @@ static void sandybridge_blit_fbc_update(struct drm_device *dev)
 			 GEN6_BLITTER_LOCK_SHIFT);
 	I915_WRITE(GEN6_BLITTER_ECOSKPD, blt_ecoskpd);
 	POSTING_READ(GEN6_BLITTER_ECOSKPD);
-	__gen6_gt_force_wake_put(dev_priv);
+	gen6_gt_force_wake_put(dev_priv);
 }
 
 static void ironlake_enable_fbc(struct drm_crtc *crtc, unsigned long interval)
@@ -6853,7 +6853,7 @@ void gen6_enable_rps(struct drm_i915_private *dev_priv)
 	 * userspace...
 	 */
 	I915_WRITE(GEN6_RC_STATE, 0);
-	__gen6_gt_force_wake_get(dev_priv);
+	gen6_gt_force_wake_get(dev_priv);
 
 	/* disable the counters and set deterministic thresholds */
 	I915_WRITE(GEN6_RC_CONTROL, 0);
@@ -6951,7 +6951,7 @@ void gen6_enable_rps(struct drm_i915_private *dev_priv)
 	/* enable all PM interrupts */
 	I915_WRITE(GEN6_PMINTRMSK, 0);
 
-	__gen6_gt_force_wake_put(dev_priv);
+	gen6_gt_force_wake_put(dev_priv);
 }
 
 void intel_enable_clock_gating(struct drm_device *dev)
-- 
1.7.3.4

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 3/4] drm/i915: userspace interface to the forcewake refcount
  2011-04-08 17:47 forcewake junk, RFC, RFT(test) Ben Widawsky
  2011-04-08 17:47 ` [PATCH 1/4] drm/i915: proper use of forcewake Ben Widawsky
  2011-04-08 17:47 ` [PATCH 2/4] drm/i915: refcounts for forcewake Ben Widawsky
@ 2011-04-08 17:47 ` Ben Widawsky
  2011-04-08 17:47 ` [PATCH 4/4] drm/i915: optional fewer warning patch Ben Widawsky
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 18+ messages in thread
From: Ben Widawsky @ 2011-04-08 17:47 UTC (permalink / raw)
  To: intel-gfx

In the simple case, all that needs to happen is grab the lock, possibly
wake the GPU and bump the refcount.

In the not simple case, we close our eyes and hope for the best (but we
only do it one at a time).

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_debugfs.c |   99 +++++++++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/i915_drv.c     |   12 ++++-
 drivers/gpu/drm/i915/i915_drv.h     |    1 +
 3 files changed, 110 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 6eee101..286896e 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -1193,6 +1193,19 @@ static int i915_gem_framebuffer_info(struct seq_file *m, void *data)
 	return 0;
 }
 
+static int i915_gen6_forcewake_count_info(struct seq_file *m, void *data)
+{
+	struct drm_info_node *node = (struct drm_info_node *) m->private;
+	struct drm_device *dev = node->minor->dev;
+	struct drm_i915_private *dev_priv = dev->dev_private;
+
+	seq_printf(m, "forcewake count = %d\n", dev_priv->forcewake_count);
+	seq_printf(m, "forcewake forced = %d\n",
+		   atomic_read(&dev_priv->forcewake_force));
+
+	return 0;
+}
+
 static int
 i915_wedged_open(struct inode *inode,
 		 struct file *filp)
@@ -1295,6 +1308,85 @@ static int i915_wedged_create(struct dentry *root, struct drm_minor *minor)
 	return drm_add_fake_info_node(minor, ent, &i915_wedged_fops);
 }
 
+static int i915_forcewake_open(struct inode *inode, struct file *file)
+{
+	struct drm_device *dev = inode->i_private;
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	int ret;
+	bool locked;
+
+	if (!IS_GEN6(dev))
+		return 0;
+
+	ret = i915_mutex_lock_interruptible(dev);
+	locked = (ret == 0) ? true : false;
+
+	if (!locked) {
+		/*
+		 *  Only let one user violently forcewake, else it gets messy,
+		 *  that user must be responsible for cleanup as well. Store
+		 *  away current for later cleanup.
+		 */
+		if (atomic_cmpxchg(&dev_priv->forcewake_force, 0, 1) != 0)
+			return -EBUSY;
+		file->private_data = current;
+	}
+
+	/*  We either have the lock, or we've set forcewake */
+	gen6_gt_force_wake_get(dev_priv);
+
+	if (locked) {
+		atomic_set(&dev_priv->forcewake_force, 0);
+		mutex_unlock(&dev->struct_mutex);
+	}
+
+	return 0;
+}
+
+int i915_forcewake_release(struct inode *inode, struct file *file)
+{
+	struct drm_device *dev = inode->i_private;
+	struct drm_i915_private *dev_priv = dev->dev_private;
+
+	if (!IS_GEN6(dev))
+		return 0;
+
+	/* more ugliness as a result of force */
+	if (file->private_data == current) {
+		WARN_ON(!atomic_read(&dev_priv->forcewake_force));
+		WARN_ON(i915_mutex_lock_interruptible(dev));
+		gen6_gt_force_wake_put(dev_priv);
+		atomic_set(&dev_priv->forcewake_force, 0);
+	} else {
+		mutex_lock(&dev->struct_mutex);
+		gen6_gt_force_wake_put(dev_priv);
+		atomic_set(&dev_priv->forcewake_force, 0);
+		mutex_unlock(&dev->struct_mutex);
+	}
+	return 0;
+}
+
+static const struct file_operations i915_forcewake_fops = {
+	.owner = THIS_MODULE,
+	.open = i915_forcewake_open,
+	.release = i915_forcewake_release,
+};
+
+static int i915_forcewake_create(struct dentry *root, struct drm_minor *minor)
+{
+	struct drm_device *dev = minor->dev;
+	struct dentry *ent;
+
+	ent = debugfs_create_file("i915_forcewake_lock",
+				  S_IRWXU,
+				  root, dev,
+				  &i915_forcewake_fops);
+	if (IS_ERR(ent))
+		return PTR_ERR(ent);
+
+	return 0;
+}
+
 static struct drm_info_list i915_debugfs_list[] = {
 	{"i915_capabilities", i915_capabilities, 0},
 	{"i915_gem_objects", i915_gem_object_info, 0},
@@ -1331,6 +1423,7 @@ static struct drm_info_list i915_debugfs_list[] = {
 	{"i915_sr_status", i915_sr_status, 0},
 	{"i915_opregion", i915_opregion, 0},
 	{"i915_gem_framebuffer", i915_gem_framebuffer_info, 0},
+	{"i915_gen6_forcewake_count", i915_gen6_forcewake_count_info, 0},
 };
 #define I915_DEBUGFS_ENTRIES ARRAY_SIZE(i915_debugfs_list)
 
@@ -1342,6 +1435,10 @@ int i915_debugfs_init(struct drm_minor *minor)
 	if (ret)
 		return ret;
 
+	ret = i915_forcewake_create(minor->debugfs_root, minor);
+	if (ret)
+		return ret;
+
 	return drm_debugfs_create_files(i915_debugfs_list,
 					I915_DEBUGFS_ENTRIES,
 					minor->debugfs_root, minor);
@@ -1351,6 +1448,8 @@ void i915_debugfs_cleanup(struct drm_minor *minor)
 {
 	drm_debugfs_remove_files(i915_debugfs_list,
 				 I915_DEBUGFS_ENTRIES, minor);
+	drm_debugfs_remove_files((struct drm_info_list *) &i915_forcewake_fops,
+				 1, minor);
 	drm_debugfs_remove_files((struct drm_info_list *) &i915_wedged_fops,
 				 1, minor);
 }
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index ce59cd4..fcd5c9a 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -281,7 +281,14 @@ static void __gen6_gt_force_wake_get(struct drm_i915_private *dev_priv)
 
 void gen6_gt_force_wake_get(struct drm_i915_private *dev_priv)
 {
-	WARN_ON(!mutex_is_locked(&dev_priv->dev->struct_mutex));
+	/*
+	 * It takes a long time to wake, but the count is incremented
+	 * immediately. Not having the lock causes a race, but all bets are off
+	 * when using forced forcewake, which should only be touched through
+	 * root-only entry in debugfs.
+	 */
+	WARN_ON(!mutex_is_locked(&dev_priv->dev->struct_mutex) &&
+		!atomic_read(&dev_priv->forcewake_force));
 
 	if (dev_priv->forcewake_count++ == 0)
 		__gen6_gt_force_wake_get(dev_priv);
@@ -295,7 +302,8 @@ static void __gen6_gt_force_wake_put(struct drm_i915_private *dev_priv)
 
 void gen6_gt_force_wake_put(struct drm_i915_private *dev_priv)
 {
-	WARN_ON(!mutex_is_locked(&dev_priv->dev->struct_mutex));
+	WARN_ON(!mutex_is_locked(&dev_priv->dev->struct_mutex) &&
+		!atomic_read(&dev_priv->forcewake_force));
 
 	if (--dev_priv->forcewake_count == 0)
 		__gen6_gt_force_wake_put(dev_priv);
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index f024927..f1c8a70 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -705,6 +705,7 @@ typedef struct drm_i915_private {
 	struct drm_property *broadcast_rgb_property;
 
 	int forcewake_count;
+	atomic_t forcewake_force;
 } drm_i915_private_t;
 
 struct drm_i915_gem_object {
-- 
1.7.3.4

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 4/4] drm/i915: optional fewer warning patch
  2011-04-08 17:47 forcewake junk, RFC, RFT(test) Ben Widawsky
                   ` (2 preceding siblings ...)
  2011-04-08 17:47 ` [PATCH 3/4] drm/i915: userspace interface to the forcewake refcount Ben Widawsky
@ 2011-04-08 17:47 ` Ben Widawsky
  2011-04-09 20:26 ` forcewake junk, part2 Ben Widawsky
  2011-04-09 20:31 ` forcewake patches (ignore previous, please) Ben Widawsky
  5 siblings, 0 replies; 18+ messages in thread
From: Ben Widawsky @ 2011-04-08 17:47 UTC (permalink / raw)
  To: intel-gfx

This patch will likely produce much fewer warnings, but perhaps hide
some bugs in the driver. I believe it's a good starting point however to
find the really serious issues first.

Goal is to hide warnings if the refcount for the forcewake "lock" is not
zero

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_drv.c |   14 ++++++++++----
 1 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index fcd5c9a..33341d4 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -286,9 +286,14 @@ void gen6_gt_force_wake_get(struct drm_i915_private *dev_priv)
 	 * immediately. Not having the lock causes a race, but all bets are off
 	 * when using forced forcewake, which should only be touched through
 	 * root-only entry in debugfs.
+	 *
+	 * Intelligent users of the interface may do a force_wake_get() followed
+	 * by many register reads and writes, knowing that the reference count
+	 * is already incremented. So we do not want to warn on those.
 	 */
-	WARN_ON(!mutex_is_locked(&dev_priv->dev->struct_mutex) &&
-		!atomic_read(&dev_priv->forcewake_force));
+	WARN_ON((!mutex_is_locked(&dev_priv->dev->struct_mutex) &&
+		 !dev_priv->forcewake_count) &&
+		 !atomic_read(&dev_priv->forcewake_force));
 
 	if (dev_priv->forcewake_count++ == 0)
 		__gen6_gt_force_wake_get(dev_priv);
@@ -302,8 +307,9 @@ static void __gen6_gt_force_wake_put(struct drm_i915_private *dev_priv)
 
 void gen6_gt_force_wake_put(struct drm_i915_private *dev_priv)
 {
-	WARN_ON(!mutex_is_locked(&dev_priv->dev->struct_mutex) &&
-		!atomic_read(&dev_priv->forcewake_force));
+	WARN_ON((!mutex_is_locked(&dev_priv->dev->struct_mutex) &&
+		 !dev_priv->forcewake_count) &&
+		 !atomic_read(&dev_priv->forcewake_force));
 
 	if (--dev_priv->forcewake_count == 0)
 		__gen6_gt_force_wake_put(dev_priv);
-- 
1.7.3.4

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* forcewake junk, part2
  2011-04-08 17:47 forcewake junk, RFC, RFT(test) Ben Widawsky
                   ` (3 preceding siblings ...)
  2011-04-08 17:47 ` [PATCH 4/4] drm/i915: optional fewer warning patch Ben Widawsky
@ 2011-04-09 20:26 ` Ben Widawsky
  2011-04-09 20:26   ` (no subject) Ben Widawsky
  2011-04-09 20:31 ` forcewake patches (ignore previous, please) Ben Widawsky
  5 siblings, 1 reply; 18+ messages in thread
From: Ben Widawsky @ 2011-04-09 20:26 UTC (permalink / raw)
  To: intel-gfx


The request to test, and report the warnings is still desired.

These address all of Chris' comments from IRC (I think). I tried to put
the reasoning in the code comments, as well as commits; but I'll add it
here for completeness.

Simplify the acquisition of the forcewake lock by removing the forced
entry. The motivation behind it is that the kernel offers sufficient
tools to get relevant data before/during/after a hang, and adding a
complex mechanism for userspace is not such a great idea. If a need to
do this comes up, we can always add it back on a situational basis.

Similarly, on releasing of the debugfs file, no more forced entry mechanism.
This does leave a potential for release to hang, but AFAICS if release hangs,
the system is going to need a reboot before i915 becomes usable again because
module unload would also hang.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* (no subject)
  2011-04-09 20:26 ` forcewake junk, part2 Ben Widawsky
@ 2011-04-09 20:26   ` Ben Widawsky
       [not found]     ` <1302380787-2957-3-git-send-email-ben@bwidawsk.net>
  0 siblings, 1 reply; 18+ messages in thread
From: Ben Widawsky @ 2011-04-09 20:26 UTC (permalink / raw)
  To: intel-gfx

GIT: [Intel-gfx] [PATCH 1/4] drm/i915: proper use of forcewake
GIT: [Intel-gfx] [PATCH 2/4] drm/i915: refcounts for forcewake
GIT: [Intel-gfx] [PATCH 3/4] drm/i915: userspace interface to the forcewake refcount
GIT: [Intel-gfx] [PATCH 4/4] drm/i915: optional fewer warning patch

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH 1/4] drm/i915: proper use of forcewake
       [not found]     ` <1302380787-2957-3-git-send-email-ben@bwidawsk.net>
@ 2011-04-09 20:26       ` Ben Widawsky
  2011-04-09 20:26         ` [PATCH 2/4] drm/i915: refcounts for forcewake Ben Widawsky
  0 siblings, 1 reply; 18+ messages in thread
From: Ben Widawsky @ 2011-04-09 20:26 UTC (permalink / raw)
  To: intel-gfx

Moved the macros around to properly do reads and writes for the given
gpu. This is to address special requirements for gen6 (SNB) reads and
writes.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_drv.h         |   56 ++++++++++++++----------------
 drivers/gpu/drm/i915/intel_ringbuffer.h |    4 +-
 2 files changed, 28 insertions(+), 32 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 5004724..8981173 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1313,9 +1313,31 @@ extern void intel_display_print_error_state(struct seq_file *m,
 } while (0)
 
 
+
+/* On SNB platform, before reading ring registers forcewake bit
+ * must be set to prevent GT core from power down and stale values being
+ * returned.
+ */
+void __gen6_gt_force_wake_get(struct drm_i915_private *dev_priv);
+void __gen6_gt_force_wake_put(struct drm_i915_private *dev_priv);
+void __gen6_gt_wait_for_fifo(struct drm_i915_private *dev_priv);
+
+/* We give fast paths for the really cool registers */
+#define NEEDS_FORCE_WAKE(dev_priv, reg) \
+	(((dev_priv)->info->gen >= 6) && \
+	((reg) < 0x40000) && \
+	((reg) != FORCEWAKE) )
+
 #define __i915_read(x, y) \
 static inline u##x i915_read##x(struct drm_i915_private *dev_priv, u32 reg) { \
-	u##x val = read##y(dev_priv->regs + reg); \
+	u##x val = 0; \
+	if (NEEDS_FORCE_WAKE((dev_priv), (reg))) { \
+		__gen6_gt_force_wake_get(dev_priv); \
+		val = read##y(dev_priv->regs + reg); \
+		__gen6_gt_force_wake_put(dev_priv); \
+	} else { \
+		val = read##y(dev_priv->regs + reg); \
+	} \
 	trace_i915_reg_rw(false, reg, val, sizeof(val)); \
 	return val; \
 }
@@ -1328,6 +1350,9 @@ __i915_read(64, q)
 #define __i915_write(x, y) \
 static inline void i915_write##x(struct drm_i915_private *dev_priv, u32 reg, u##x val) { \
 	trace_i915_reg_rw(true, reg, val, sizeof(val)); \
+	if (NEEDS_FORCE_WAKE((dev_priv), (reg))) { \
+		__gen6_gt_wait_for_fifo(dev_priv); \
+	} \
 	write##y(val, dev_priv->regs + reg); \
 }
 __i915_write(8, b)
@@ -1356,33 +1381,4 @@ __i915_write(64, q)
 #define POSTING_READ16(reg)	(void)I915_READ16_NOTRACE(reg)
 
 
-/* On SNB platform, before reading ring registers forcewake bit
- * must be set to prevent GT core from power down and stale values being
- * returned.
- */
-void __gen6_gt_force_wake_get(struct drm_i915_private *dev_priv);
-void __gen6_gt_force_wake_put(struct drm_i915_private *dev_priv);
-void __gen6_gt_wait_for_fifo(struct drm_i915_private *dev_priv);
-
-static inline u32 i915_gt_read(struct drm_i915_private *dev_priv, u32 reg)
-{
-	u32 val;
-
-	if (dev_priv->info->gen >= 6) {
-		__gen6_gt_force_wake_get(dev_priv);
-		val = I915_READ(reg);
-		__gen6_gt_force_wake_put(dev_priv);
-	} else
-		val = I915_READ(reg);
-
-	return val;
-}
-
-static inline void i915_gt_write(struct drm_i915_private *dev_priv,
-				u32 reg, u32 val)
-{
-	if (dev_priv->info->gen >= 6)
-		__gen6_gt_wait_for_fifo(dev_priv);
-	I915_WRITE(reg, val);
-}
 #endif
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index f23cc5f..8e8821f 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -14,8 +14,8 @@ struct  intel_hw_status_page {
 	struct		drm_i915_gem_object *obj;
 };
 
-#define I915_RING_READ(reg) i915_gt_read(dev_priv, reg)
-#define I915_RING_WRITE(reg, val) i915_gt_write(dev_priv, reg, val)
+#define I915_RING_READ(reg) I915_READ(reg)
+#define I915_RING_WRITE(reg, val) I915_WRITE(reg, val)
 
 #define I915_READ_TAIL(ring) I915_RING_READ(RING_TAIL((ring)->mmio_base))
 #define I915_WRITE_TAIL(ring, val) I915_RING_WRITE(RING_TAIL((ring)->mmio_base), val)
-- 
1.7.3.4

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 2/4] drm/i915: refcounts for forcewake
  2011-04-09 20:26       ` [PATCH 1/4] drm/i915: proper use of forcewake Ben Widawsky
@ 2011-04-09 20:26         ` Ben Widawsky
  2011-04-09 20:26           ` [PATCH 3/4] drm/i915: userspace interface to the forcewake refcount Ben Widawsky
  0 siblings, 1 reply; 18+ messages in thread
From: Ben Widawsky @ 2011-04-09 20:26 UTC (permalink / raw)
  To: intel-gfx

New refcounting mechanism for forcewake. The idea is to take away the
need to hold struct_mutex for the duration of register reads, and only
need to hold it to get a reference count. A possible benefit would shave
off some time in actually waking the GPU if it's already awake (a couple
of extra register reads).

The refcounting also provides a clean way for userspace to gain access
to correct register reads.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_debugfs.c  |   10 ++++++++--
 drivers/gpu/drm/i915/i915_drv.c      |   20 ++++++++++++++++++--
 drivers/gpu/drm/i915/i915_drv.h      |   10 ++++++----
 drivers/gpu/drm/i915/intel_display.c |    8 ++++----
 4 files changed, 36 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 87c8e29..2edb8b2 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -852,6 +852,7 @@ static int i915_cur_delayinfo(struct seq_file *m, void *unused)
 	struct drm_info_node *node = (struct drm_info_node *) m->private;
 	struct drm_device *dev = node->minor->dev;
 	drm_i915_private_t *dev_priv = dev->dev_private;
+	int ret;
 
 	if (IS_GEN5(dev)) {
 		u16 rgvswctl = I915_READ16(MEMSWCTL);
@@ -873,7 +874,11 @@ static int i915_cur_delayinfo(struct seq_file *m, void *unused)
 		int max_freq;
 
 		/* RPSTAT1 is in the GT power well */
-		__gen6_gt_force_wake_get(dev_priv);
+		ret = mutex_lock_interruptible(&dev->struct_mutex);
+		if (ret)
+			return ret;
+
+		gen6_gt_force_wake_get(dev_priv);
 
 		rpstat = I915_READ(GEN6_RPSTAT1);
 		rpupei = I915_READ(GEN6_RP_CUR_UP_EI);
@@ -918,7 +923,8 @@ static int i915_cur_delayinfo(struct seq_file *m, void *unused)
 		seq_printf(m, "Max non-overclocked (RP0) frequency: %dMHz\n",
 			   max_freq * 50);
 
-		__gen6_gt_force_wake_put(dev_priv);
+		gen6_gt_force_wake_put(dev_priv);
+		mutex_unlock(&dev->struct_mutex);
 	} else {
 		seq_printf(m, "no P-state info available\n");
 	}
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index c34a8dd..ce59cd4 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -263,7 +263,7 @@ void intel_detect_pch (struct drm_device *dev)
 	}
 }
 
-void __gen6_gt_force_wake_get(struct drm_i915_private *dev_priv)
+static void __gen6_gt_force_wake_get(struct drm_i915_private *dev_priv)
 {
 	int count;
 
@@ -279,12 +279,28 @@ void __gen6_gt_force_wake_get(struct drm_i915_private *dev_priv)
 		udelay(10);
 }
 
-void __gen6_gt_force_wake_put(struct drm_i915_private *dev_priv)
+void gen6_gt_force_wake_get(struct drm_i915_private *dev_priv)
+{
+	WARN_ON(!mutex_is_locked(&dev_priv->dev->struct_mutex));
+
+	if (dev_priv->forcewake_count++ == 0)
+		__gen6_gt_force_wake_get(dev_priv);
+}
+
+static void __gen6_gt_force_wake_put(struct drm_i915_private *dev_priv)
 {
 	I915_WRITE_NOTRACE(FORCEWAKE, 0);
 	POSTING_READ(FORCEWAKE);
 }
 
+void gen6_gt_force_wake_put(struct drm_i915_private *dev_priv)
+{
+	WARN_ON(!mutex_is_locked(&dev_priv->dev->struct_mutex));
+
+	if (--dev_priv->forcewake_count == 0)
+		__gen6_gt_force_wake_put(dev_priv);
+}
+
 void __gen6_gt_wait_for_fifo(struct drm_i915_private *dev_priv)
 {
 	int loop = 500;
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 8981173..f024927 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -703,6 +703,8 @@ typedef struct drm_i915_private {
 	struct intel_fbdev *fbdev;
 
 	struct drm_property *broadcast_rgb_property;
+
+	int forcewake_count;
 } drm_i915_private_t;
 
 struct drm_i915_gem_object {
@@ -1318,8 +1320,8 @@ extern void intel_display_print_error_state(struct seq_file *m,
  * must be set to prevent GT core from power down and stale values being
  * returned.
  */
-void __gen6_gt_force_wake_get(struct drm_i915_private *dev_priv);
-void __gen6_gt_force_wake_put(struct drm_i915_private *dev_priv);
+void gen6_gt_force_wake_get(struct drm_i915_private *dev_priv);
+void gen6_gt_force_wake_put(struct drm_i915_private *dev_priv);
 void __gen6_gt_wait_for_fifo(struct drm_i915_private *dev_priv);
 
 /* We give fast paths for the really cool registers */
@@ -1332,9 +1334,9 @@ void __gen6_gt_wait_for_fifo(struct drm_i915_private *dev_priv);
 static inline u##x i915_read##x(struct drm_i915_private *dev_priv, u32 reg) { \
 	u##x val = 0; \
 	if (NEEDS_FORCE_WAKE((dev_priv), (reg))) { \
-		__gen6_gt_force_wake_get(dev_priv); \
+		gen6_gt_force_wake_get(dev_priv); \
 		val = read##y(dev_priv->regs + reg); \
-		__gen6_gt_force_wake_put(dev_priv); \
+		gen6_gt_force_wake_put(dev_priv); \
 	} else { \
 		val = read##y(dev_priv->regs + reg); \
 	} \
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 432fc04..4585005 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -1828,7 +1828,7 @@ static void sandybridge_blit_fbc_update(struct drm_device *dev)
 	u32 blt_ecoskpd;
 
 	/* Make sure blitter notifies FBC of writes */
-	__gen6_gt_force_wake_get(dev_priv);
+	gen6_gt_force_wake_get(dev_priv);
 	blt_ecoskpd = I915_READ(GEN6_BLITTER_ECOSKPD);
 	blt_ecoskpd |= GEN6_BLITTER_FBC_NOTIFY <<
 		GEN6_BLITTER_LOCK_SHIFT;
@@ -1839,7 +1839,7 @@ static void sandybridge_blit_fbc_update(struct drm_device *dev)
 			 GEN6_BLITTER_LOCK_SHIFT);
 	I915_WRITE(GEN6_BLITTER_ECOSKPD, blt_ecoskpd);
 	POSTING_READ(GEN6_BLITTER_ECOSKPD);
-	__gen6_gt_force_wake_put(dev_priv);
+	gen6_gt_force_wake_put(dev_priv);
 }
 
 static void ironlake_enable_fbc(struct drm_crtc *crtc, unsigned long interval)
@@ -6853,7 +6853,7 @@ void gen6_enable_rps(struct drm_i915_private *dev_priv)
 	 * userspace...
 	 */
 	I915_WRITE(GEN6_RC_STATE, 0);
-	__gen6_gt_force_wake_get(dev_priv);
+	gen6_gt_force_wake_get(dev_priv);
 
 	/* disable the counters and set deterministic thresholds */
 	I915_WRITE(GEN6_RC_CONTROL, 0);
@@ -6951,7 +6951,7 @@ void gen6_enable_rps(struct drm_i915_private *dev_priv)
 	/* enable all PM interrupts */
 	I915_WRITE(GEN6_PMINTRMSK, 0);
 
-	__gen6_gt_force_wake_put(dev_priv);
+	gen6_gt_force_wake_put(dev_priv);
 }
 
 void intel_enable_clock_gating(struct drm_device *dev)
-- 
1.7.3.4

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 3/4] drm/i915: userspace interface to the forcewake refcount
  2011-04-09 20:26         ` [PATCH 2/4] drm/i915: refcounts for forcewake Ben Widawsky
@ 2011-04-09 20:26           ` Ben Widawsky
  0 siblings, 0 replies; 18+ messages in thread
From: Ben Widawsky @ 2011-04-09 20:26 UTC (permalink / raw)
  To: intel-gfx

Provide debugfs access to the refcounted forcewake. This allows a
userspace utility to access some of those hard to reach registers on
newer GPUs. The interface is root-only.

The interface is referred to as a forcewake lock. The term lock refers
to the synchronization between software and hardware (powering down the
GPU).

A note for those not reading the comments in the patches:
* "acquiring" the forcewake lock from userspace can fail. The utilities
  in userspace to read and write registers are inferior to the utilities
  in the kernel, and so failure (and inability to force acquire the
  lock) is acceptable.
* "releasing" the forcewake lock can hang. The conditions under which
  this can happen are so severe it would almost certainly require a
  reboot anyway (i915 unload would also hang).

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_debugfs.c |   84 +++++++++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/i915_drv.c     |    6 +++
 2 files changed, 90 insertions(+), 0 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 2edb8b2..dda687a 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -1192,6 +1192,17 @@ static int i915_gem_framebuffer_info(struct seq_file *m, void *data)
 	return 0;
 }
 
+static int i915_gen6_forcewake_count_info(struct seq_file *m, void *data)
+{
+	struct drm_info_node *node = (struct drm_info_node *) m->private;
+	struct drm_device *dev = node->minor->dev;
+	struct drm_i915_private *dev_priv = dev->dev_private;
+
+	seq_printf(m, "forcewake count = %d\n", dev_priv->forcewake_count);
+
+	return 0;
+}
+
 static int
 i915_wedged_open(struct inode *inode,
 		 struct file *filp)
@@ -1294,6 +1305,72 @@ static int i915_wedged_create(struct dentry *root, struct drm_minor *minor)
 	return drm_add_fake_info_node(minor, ent, &i915_wedged_fops);
 }
 
+static int i915_forcewake_open(struct inode *inode, struct file *file)
+{
+	struct drm_device *dev = inode->i_private;
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	int ret;
+
+	if (!IS_GEN6(dev))
+		return 0;
+
+	/*
+	 * If the driver hangs while holding the lock, the user should turn on
+	 * register read/write tracing, or use the other available debug options
+	 * to to find the cause.
+	 */
+	ret = mutex_lock_interruptible(&dev->struct_mutex);
+	if (ret)
+		return ret;
+
+	gen6_gt_force_wake_get(dev_priv);
+
+	mutex_unlock(&dev->struct_mutex);
+
+	return 0;
+}
+
+int i915_forcewake_release(struct inode *inode, struct file *file)
+{
+	struct drm_device *dev = inode->i_private;
+	struct drm_i915_private *dev_priv = dev->dev_private;
+
+	if (!IS_GEN6(dev))
+		return 0;
+
+	/*
+	 * If we hang, it implies that we have a locking issue, or some other
+	 * oops in the driver. If we lock up userspace, that is an unfortunate
+	 * but acceptable consequence.
+	 */
+	mutex_lock(&dev->struct_mutex);
+	gen6_gt_force_wake_put(dev_priv);
+	mutex_unlock(&dev->struct_mutex);
+
+	return 0;
+}
+
+static const struct file_operations i915_forcewake_fops = {
+	.owner = THIS_MODULE,
+	.open = i915_forcewake_open,
+	.release = i915_forcewake_release,
+};
+
+static int i915_forcewake_create(struct dentry *root, struct drm_minor *minor)
+{
+	struct drm_device *dev = minor->dev;
+	struct dentry *ent;
+
+	ent = debugfs_create_file("i915_forcewake_lock",
+				  S_IRWXU,
+				  root, dev,
+				  &i915_forcewake_fops);
+	if (IS_ERR(ent))
+		return PTR_ERR(ent);
+
+	return 0;
+}
+
 static struct drm_info_list i915_debugfs_list[] = {
 	{"i915_capabilities", i915_capabilities, 0},
 	{"i915_gem_objects", i915_gem_object_info, 0},
@@ -1330,6 +1407,7 @@ static struct drm_info_list i915_debugfs_list[] = {
 	{"i915_sr_status", i915_sr_status, 0},
 	{"i915_opregion", i915_opregion, 0},
 	{"i915_gem_framebuffer", i915_gem_framebuffer_info, 0},
+	{"i915_gen6_forcewake_count", i915_gen6_forcewake_count_info, 0},
 };
 #define I915_DEBUGFS_ENTRIES ARRAY_SIZE(i915_debugfs_list)
 
@@ -1341,6 +1419,10 @@ int i915_debugfs_init(struct drm_minor *minor)
 	if (ret)
 		return ret;
 
+	ret = i915_forcewake_create(minor->debugfs_root, minor);
+	if (ret)
+		return ret;
+
 	return drm_debugfs_create_files(i915_debugfs_list,
 					I915_DEBUGFS_ENTRIES,
 					minor->debugfs_root, minor);
@@ -1350,6 +1432,8 @@ void i915_debugfs_cleanup(struct drm_minor *minor)
 {
 	drm_debugfs_remove_files(i915_debugfs_list,
 				 I915_DEBUGFS_ENTRIES, minor);
+	drm_debugfs_remove_files((struct drm_info_list *) &i915_forcewake_fops,
+				 1, minor);
 	drm_debugfs_remove_files((struct drm_info_list *) &i915_wedged_fops,
 				 1, minor);
 }
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index ce59cd4..96b3bfc 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -281,6 +281,12 @@ static void __gen6_gt_force_wake_get(struct drm_i915_private *dev_priv)
 
 void gen6_gt_force_wake_get(struct drm_i915_private *dev_priv)
 {
+	/*
+	 * It takes a long time to wake, but the count is incremented
+	 * immediately. Not having the lock causes a race, but all bets are off
+	 * when using forced forcewake, which should only be touched through
+	 * root-only entry in debugfs.
+	 */
 	WARN_ON(!mutex_is_locked(&dev_priv->dev->struct_mutex));
 
 	if (dev_priv->forcewake_count++ == 0)
-- 
1.7.3.4

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* forcewake patches (ignore previous, please)
  2011-04-08 17:47 forcewake junk, RFC, RFT(test) Ben Widawsky
                   ` (4 preceding siblings ...)
  2011-04-09 20:26 ` forcewake junk, part2 Ben Widawsky
@ 2011-04-09 20:31 ` Ben Widawsky
  2011-04-09 20:31   ` [PATCH 1/4] drm/i915: proper use of forcewake Ben Widawsky
  5 siblings, 1 reply; 18+ messages in thread
From: Ben Widawsky @ 2011-04-09 20:31 UTC (permalink / raw)
  To: intel-gfx


The request to test, and report the warnings is still desired.

These address all of Chris' comments from IRC (I think). I tried to put
the reasoning in the code comments, as well as commits; but I'll add it
here for completeness.

Simplify the acquisition of the forcewake lock by removing the forced
entry. The motivation behind it is that the kernel offers sufficient
tools to get relevant data before/during/after a hang, and adding a
complex mechanism for userspace is not such a great idea. If a need to
do this comes up, we can always add it back on a situational basis.

Similarly, on releasing of the debugfs file, no more forced entry mechanism.
This does leave a potential for release to hang, but AFAICS if release hangs,
the system is going to need a reboot before i915 becomes usable again because
module unload would also hang.

Ben

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH 1/4] drm/i915: proper use of forcewake
  2011-04-09 20:31 ` forcewake patches (ignore previous, please) Ben Widawsky
@ 2011-04-09 20:31   ` Ben Widawsky
  2011-04-09 20:31     ` [PATCH 2/4] drm/i915: refcounts for forcewake Ben Widawsky
  0 siblings, 1 reply; 18+ messages in thread
From: Ben Widawsky @ 2011-04-09 20:31 UTC (permalink / raw)
  To: intel-gfx

Moved the macros around to properly do reads and writes for the given
gpu. This is to address special requirements for gen6 (SNB) reads and
writes.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_drv.h         |   56 ++++++++++++++----------------
 drivers/gpu/drm/i915/intel_ringbuffer.h |    4 +-
 2 files changed, 28 insertions(+), 32 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 5004724..8981173 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1313,9 +1313,31 @@ extern void intel_display_print_error_state(struct seq_file *m,
 } while (0)
 
 
+
+/* On SNB platform, before reading ring registers forcewake bit
+ * must be set to prevent GT core from power down and stale values being
+ * returned.
+ */
+void __gen6_gt_force_wake_get(struct drm_i915_private *dev_priv);
+void __gen6_gt_force_wake_put(struct drm_i915_private *dev_priv);
+void __gen6_gt_wait_for_fifo(struct drm_i915_private *dev_priv);
+
+/* We give fast paths for the really cool registers */
+#define NEEDS_FORCE_WAKE(dev_priv, reg) \
+	(((dev_priv)->info->gen >= 6) && \
+	((reg) < 0x40000) && \
+	((reg) != FORCEWAKE) )
+
 #define __i915_read(x, y) \
 static inline u##x i915_read##x(struct drm_i915_private *dev_priv, u32 reg) { \
-	u##x val = read##y(dev_priv->regs + reg); \
+	u##x val = 0; \
+	if (NEEDS_FORCE_WAKE((dev_priv), (reg))) { \
+		__gen6_gt_force_wake_get(dev_priv); \
+		val = read##y(dev_priv->regs + reg); \
+		__gen6_gt_force_wake_put(dev_priv); \
+	} else { \
+		val = read##y(dev_priv->regs + reg); \
+	} \
 	trace_i915_reg_rw(false, reg, val, sizeof(val)); \
 	return val; \
 }
@@ -1328,6 +1350,9 @@ __i915_read(64, q)
 #define __i915_write(x, y) \
 static inline void i915_write##x(struct drm_i915_private *dev_priv, u32 reg, u##x val) { \
 	trace_i915_reg_rw(true, reg, val, sizeof(val)); \
+	if (NEEDS_FORCE_WAKE((dev_priv), (reg))) { \
+		__gen6_gt_wait_for_fifo(dev_priv); \
+	} \
 	write##y(val, dev_priv->regs + reg); \
 }
 __i915_write(8, b)
@@ -1356,33 +1381,4 @@ __i915_write(64, q)
 #define POSTING_READ16(reg)	(void)I915_READ16_NOTRACE(reg)
 
 
-/* On SNB platform, before reading ring registers forcewake bit
- * must be set to prevent GT core from power down and stale values being
- * returned.
- */
-void __gen6_gt_force_wake_get(struct drm_i915_private *dev_priv);
-void __gen6_gt_force_wake_put(struct drm_i915_private *dev_priv);
-void __gen6_gt_wait_for_fifo(struct drm_i915_private *dev_priv);
-
-static inline u32 i915_gt_read(struct drm_i915_private *dev_priv, u32 reg)
-{
-	u32 val;
-
-	if (dev_priv->info->gen >= 6) {
-		__gen6_gt_force_wake_get(dev_priv);
-		val = I915_READ(reg);
-		__gen6_gt_force_wake_put(dev_priv);
-	} else
-		val = I915_READ(reg);
-
-	return val;
-}
-
-static inline void i915_gt_write(struct drm_i915_private *dev_priv,
-				u32 reg, u32 val)
-{
-	if (dev_priv->info->gen >= 6)
-		__gen6_gt_wait_for_fifo(dev_priv);
-	I915_WRITE(reg, val);
-}
 #endif
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index f23cc5f..8e8821f 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -14,8 +14,8 @@ struct  intel_hw_status_page {
 	struct		drm_i915_gem_object *obj;
 };
 
-#define I915_RING_READ(reg) i915_gt_read(dev_priv, reg)
-#define I915_RING_WRITE(reg, val) i915_gt_write(dev_priv, reg, val)
+#define I915_RING_READ(reg) I915_READ(reg)
+#define I915_RING_WRITE(reg, val) I915_WRITE(reg, val)
 
 #define I915_READ_TAIL(ring) I915_RING_READ(RING_TAIL((ring)->mmio_base))
 #define I915_WRITE_TAIL(ring, val) I915_RING_WRITE(RING_TAIL((ring)->mmio_base), val)
-- 
1.7.3.4

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 2/4] drm/i915: refcounts for forcewake
  2011-04-09 20:31   ` [PATCH 1/4] drm/i915: proper use of forcewake Ben Widawsky
@ 2011-04-09 20:31     ` Ben Widawsky
  2011-04-09 20:31       ` [PATCH 3/4] drm/i915: userspace interface to the forcewake refcount Ben Widawsky
  0 siblings, 1 reply; 18+ messages in thread
From: Ben Widawsky @ 2011-04-09 20:31 UTC (permalink / raw)
  To: intel-gfx

New refcounting mechanism for forcewake. The idea is to take away the
need to hold struct_mutex for the duration of register reads, and only
need to hold it to get a reference count. A possible benefit would shave
off some time in actually waking the GPU if it's already awake (a couple
of extra register reads).

The refcounting also provides a clean way for userspace to gain access
to correct register reads.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_debugfs.c  |   10 ++++++++--
 drivers/gpu/drm/i915/i915_drv.c      |   20 ++++++++++++++++++--
 drivers/gpu/drm/i915/i915_drv.h      |   10 ++++++----
 drivers/gpu/drm/i915/intel_display.c |    8 ++++----
 4 files changed, 36 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 87c8e29..2edb8b2 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -852,6 +852,7 @@ static int i915_cur_delayinfo(struct seq_file *m, void *unused)
 	struct drm_info_node *node = (struct drm_info_node *) m->private;
 	struct drm_device *dev = node->minor->dev;
 	drm_i915_private_t *dev_priv = dev->dev_private;
+	int ret;
 
 	if (IS_GEN5(dev)) {
 		u16 rgvswctl = I915_READ16(MEMSWCTL);
@@ -873,7 +874,11 @@ static int i915_cur_delayinfo(struct seq_file *m, void *unused)
 		int max_freq;
 
 		/* RPSTAT1 is in the GT power well */
-		__gen6_gt_force_wake_get(dev_priv);
+		ret = mutex_lock_interruptible(&dev->struct_mutex);
+		if (ret)
+			return ret;
+
+		gen6_gt_force_wake_get(dev_priv);
 
 		rpstat = I915_READ(GEN6_RPSTAT1);
 		rpupei = I915_READ(GEN6_RP_CUR_UP_EI);
@@ -918,7 +923,8 @@ static int i915_cur_delayinfo(struct seq_file *m, void *unused)
 		seq_printf(m, "Max non-overclocked (RP0) frequency: %dMHz\n",
 			   max_freq * 50);
 
-		__gen6_gt_force_wake_put(dev_priv);
+		gen6_gt_force_wake_put(dev_priv);
+		mutex_unlock(&dev->struct_mutex);
 	} else {
 		seq_printf(m, "no P-state info available\n");
 	}
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index c34a8dd..ce59cd4 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -263,7 +263,7 @@ void intel_detect_pch (struct drm_device *dev)
 	}
 }
 
-void __gen6_gt_force_wake_get(struct drm_i915_private *dev_priv)
+static void __gen6_gt_force_wake_get(struct drm_i915_private *dev_priv)
 {
 	int count;
 
@@ -279,12 +279,28 @@ void __gen6_gt_force_wake_get(struct drm_i915_private *dev_priv)
 		udelay(10);
 }
 
-void __gen6_gt_force_wake_put(struct drm_i915_private *dev_priv)
+void gen6_gt_force_wake_get(struct drm_i915_private *dev_priv)
+{
+	WARN_ON(!mutex_is_locked(&dev_priv->dev->struct_mutex));
+
+	if (dev_priv->forcewake_count++ == 0)
+		__gen6_gt_force_wake_get(dev_priv);
+}
+
+static void __gen6_gt_force_wake_put(struct drm_i915_private *dev_priv)
 {
 	I915_WRITE_NOTRACE(FORCEWAKE, 0);
 	POSTING_READ(FORCEWAKE);
 }
 
+void gen6_gt_force_wake_put(struct drm_i915_private *dev_priv)
+{
+	WARN_ON(!mutex_is_locked(&dev_priv->dev->struct_mutex));
+
+	if (--dev_priv->forcewake_count == 0)
+		__gen6_gt_force_wake_put(dev_priv);
+}
+
 void __gen6_gt_wait_for_fifo(struct drm_i915_private *dev_priv)
 {
 	int loop = 500;
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 8981173..f024927 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -703,6 +703,8 @@ typedef struct drm_i915_private {
 	struct intel_fbdev *fbdev;
 
 	struct drm_property *broadcast_rgb_property;
+
+	int forcewake_count;
 } drm_i915_private_t;
 
 struct drm_i915_gem_object {
@@ -1318,8 +1320,8 @@ extern void intel_display_print_error_state(struct seq_file *m,
  * must be set to prevent GT core from power down and stale values being
  * returned.
  */
-void __gen6_gt_force_wake_get(struct drm_i915_private *dev_priv);
-void __gen6_gt_force_wake_put(struct drm_i915_private *dev_priv);
+void gen6_gt_force_wake_get(struct drm_i915_private *dev_priv);
+void gen6_gt_force_wake_put(struct drm_i915_private *dev_priv);
 void __gen6_gt_wait_for_fifo(struct drm_i915_private *dev_priv);
 
 /* We give fast paths for the really cool registers */
@@ -1332,9 +1334,9 @@ void __gen6_gt_wait_for_fifo(struct drm_i915_private *dev_priv);
 static inline u##x i915_read##x(struct drm_i915_private *dev_priv, u32 reg) { \
 	u##x val = 0; \
 	if (NEEDS_FORCE_WAKE((dev_priv), (reg))) { \
-		__gen6_gt_force_wake_get(dev_priv); \
+		gen6_gt_force_wake_get(dev_priv); \
 		val = read##y(dev_priv->regs + reg); \
-		__gen6_gt_force_wake_put(dev_priv); \
+		gen6_gt_force_wake_put(dev_priv); \
 	} else { \
 		val = read##y(dev_priv->regs + reg); \
 	} \
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 432fc04..4585005 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -1828,7 +1828,7 @@ static void sandybridge_blit_fbc_update(struct drm_device *dev)
 	u32 blt_ecoskpd;
 
 	/* Make sure blitter notifies FBC of writes */
-	__gen6_gt_force_wake_get(dev_priv);
+	gen6_gt_force_wake_get(dev_priv);
 	blt_ecoskpd = I915_READ(GEN6_BLITTER_ECOSKPD);
 	blt_ecoskpd |= GEN6_BLITTER_FBC_NOTIFY <<
 		GEN6_BLITTER_LOCK_SHIFT;
@@ -1839,7 +1839,7 @@ static void sandybridge_blit_fbc_update(struct drm_device *dev)
 			 GEN6_BLITTER_LOCK_SHIFT);
 	I915_WRITE(GEN6_BLITTER_ECOSKPD, blt_ecoskpd);
 	POSTING_READ(GEN6_BLITTER_ECOSKPD);
-	__gen6_gt_force_wake_put(dev_priv);
+	gen6_gt_force_wake_put(dev_priv);
 }
 
 static void ironlake_enable_fbc(struct drm_crtc *crtc, unsigned long interval)
@@ -6853,7 +6853,7 @@ void gen6_enable_rps(struct drm_i915_private *dev_priv)
 	 * userspace...
 	 */
 	I915_WRITE(GEN6_RC_STATE, 0);
-	__gen6_gt_force_wake_get(dev_priv);
+	gen6_gt_force_wake_get(dev_priv);
 
 	/* disable the counters and set deterministic thresholds */
 	I915_WRITE(GEN6_RC_CONTROL, 0);
@@ -6951,7 +6951,7 @@ void gen6_enable_rps(struct drm_i915_private *dev_priv)
 	/* enable all PM interrupts */
 	I915_WRITE(GEN6_PMINTRMSK, 0);
 
-	__gen6_gt_force_wake_put(dev_priv);
+	gen6_gt_force_wake_put(dev_priv);
 }
 
 void intel_enable_clock_gating(struct drm_device *dev)
-- 
1.7.3.4

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 3/4] drm/i915: userspace interface to the forcewake refcount
  2011-04-09 20:31     ` [PATCH 2/4] drm/i915: refcounts for forcewake Ben Widawsky
@ 2011-04-09 20:31       ` Ben Widawsky
  2011-04-09 20:31         ` [PATCH 4/4] drm/i915: optional fewer warning patch Ben Widawsky
  0 siblings, 1 reply; 18+ messages in thread
From: Ben Widawsky @ 2011-04-09 20:31 UTC (permalink / raw)
  To: intel-gfx

Provide debugfs access to the refcounted forcewake. This allows a
userspace utility to access some of those hard to reach registers on
newer GPUs. The interface is root-only.

The interface is referred to as a forcewake lock. The term lock refers
to the synchronization between software and hardware (powering down the
GPU).

A note for those not reading the comments in the patches:
* "acquiring" the forcewake lock from userspace can fail. The utilities
  in userspace to read and write registers are inferior to the utilities
  in the kernel, and so failure (and inability to force acquire the
  lock) is acceptable.
* "releasing" the forcewake lock can hang. The conditions under which
  this can happen are so severe it would almost certainly require a
  reboot anyway (i915 unload would also hang).

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_debugfs.c |   84 +++++++++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/i915_drv.c     |    6 +++
 2 files changed, 90 insertions(+), 0 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 2edb8b2..dda687a 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -1192,6 +1192,17 @@ static int i915_gem_framebuffer_info(struct seq_file *m, void *data)
 	return 0;
 }
 
+static int i915_gen6_forcewake_count_info(struct seq_file *m, void *data)
+{
+	struct drm_info_node *node = (struct drm_info_node *) m->private;
+	struct drm_device *dev = node->minor->dev;
+	struct drm_i915_private *dev_priv = dev->dev_private;
+
+	seq_printf(m, "forcewake count = %d\n", dev_priv->forcewake_count);
+
+	return 0;
+}
+
 static int
 i915_wedged_open(struct inode *inode,
 		 struct file *filp)
@@ -1294,6 +1305,72 @@ static int i915_wedged_create(struct dentry *root, struct drm_minor *minor)
 	return drm_add_fake_info_node(minor, ent, &i915_wedged_fops);
 }
 
+static int i915_forcewake_open(struct inode *inode, struct file *file)
+{
+	struct drm_device *dev = inode->i_private;
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	int ret;
+
+	if (!IS_GEN6(dev))
+		return 0;
+
+	/*
+	 * If the driver hangs while holding the lock, the user should turn on
+	 * register read/write tracing, or use the other available debug options
+	 * to to find the cause.
+	 */
+	ret = mutex_lock_interruptible(&dev->struct_mutex);
+	if (ret)
+		return ret;
+
+	gen6_gt_force_wake_get(dev_priv);
+
+	mutex_unlock(&dev->struct_mutex);
+
+	return 0;
+}
+
+int i915_forcewake_release(struct inode *inode, struct file *file)
+{
+	struct drm_device *dev = inode->i_private;
+	struct drm_i915_private *dev_priv = dev->dev_private;
+
+	if (!IS_GEN6(dev))
+		return 0;
+
+	/*
+	 * If we hang, it implies that we have a locking issue, or some other
+	 * oops in the driver. If we lock up userspace, that is an unfortunate
+	 * but acceptable consequence.
+	 */
+	mutex_lock(&dev->struct_mutex);
+	gen6_gt_force_wake_put(dev_priv);
+	mutex_unlock(&dev->struct_mutex);
+
+	return 0;
+}
+
+static const struct file_operations i915_forcewake_fops = {
+	.owner = THIS_MODULE,
+	.open = i915_forcewake_open,
+	.release = i915_forcewake_release,
+};
+
+static int i915_forcewake_create(struct dentry *root, struct drm_minor *minor)
+{
+	struct drm_device *dev = minor->dev;
+	struct dentry *ent;
+
+	ent = debugfs_create_file("i915_forcewake_lock",
+				  S_IRWXU,
+				  root, dev,
+				  &i915_forcewake_fops);
+	if (IS_ERR(ent))
+		return PTR_ERR(ent);
+
+	return 0;
+}
+
 static struct drm_info_list i915_debugfs_list[] = {
 	{"i915_capabilities", i915_capabilities, 0},
 	{"i915_gem_objects", i915_gem_object_info, 0},
@@ -1330,6 +1407,7 @@ static struct drm_info_list i915_debugfs_list[] = {
 	{"i915_sr_status", i915_sr_status, 0},
 	{"i915_opregion", i915_opregion, 0},
 	{"i915_gem_framebuffer", i915_gem_framebuffer_info, 0},
+	{"i915_gen6_forcewake_count", i915_gen6_forcewake_count_info, 0},
 };
 #define I915_DEBUGFS_ENTRIES ARRAY_SIZE(i915_debugfs_list)
 
@@ -1341,6 +1419,10 @@ int i915_debugfs_init(struct drm_minor *minor)
 	if (ret)
 		return ret;
 
+	ret = i915_forcewake_create(minor->debugfs_root, minor);
+	if (ret)
+		return ret;
+
 	return drm_debugfs_create_files(i915_debugfs_list,
 					I915_DEBUGFS_ENTRIES,
 					minor->debugfs_root, minor);
@@ -1350,6 +1432,8 @@ void i915_debugfs_cleanup(struct drm_minor *minor)
 {
 	drm_debugfs_remove_files(i915_debugfs_list,
 				 I915_DEBUGFS_ENTRIES, minor);
+	drm_debugfs_remove_files((struct drm_info_list *) &i915_forcewake_fops,
+				 1, minor);
 	drm_debugfs_remove_files((struct drm_info_list *) &i915_wedged_fops,
 				 1, minor);
 }
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index ce59cd4..96b3bfc 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -281,6 +281,12 @@ static void __gen6_gt_force_wake_get(struct drm_i915_private *dev_priv)
 
 void gen6_gt_force_wake_get(struct drm_i915_private *dev_priv)
 {
+	/*
+	 * It takes a long time to wake, but the count is incremented
+	 * immediately. Not having the lock causes a race, but all bets are off
+	 * when using forced forcewake, which should only be touched through
+	 * root-only entry in debugfs.
+	 */
 	WARN_ON(!mutex_is_locked(&dev_priv->dev->struct_mutex));
 
 	if (dev_priv->forcewake_count++ == 0)
-- 
1.7.3.4

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 4/4] drm/i915: optional fewer warning patch
  2011-04-09 20:31       ` [PATCH 3/4] drm/i915: userspace interface to the forcewake refcount Ben Widawsky
@ 2011-04-09 20:31         ` Ben Widawsky
  2011-04-09 22:31           ` Chris Wilson
  0 siblings, 1 reply; 18+ messages in thread
From: Ben Widawsky @ 2011-04-09 20:31 UTC (permalink / raw)
  To: intel-gfx

This patch will likely produce much fewer warnings, but perhaps hide
some bugs in the driver. Any warnings while using this patch are
extremely likely to cause problems, while warnings without this patch
are also driver bugs, but much less likely to be causing issues. Without
this patch we may also get false warnings for intelligent users of the
new refcount mechanism.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_drv.c |   10 ++++++++--
 1 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 96b3bfc..6da7079 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -286,8 +286,13 @@ void gen6_gt_force_wake_get(struct drm_i915_private *dev_priv)
 	 * immediately. Not having the lock causes a race, but all bets are off
 	 * when using forced forcewake, which should only be touched through
 	 * root-only entry in debugfs.
+	 *
+	 * Intelligent users of the interface may do a force_wake_get() followed
+	 * by many register reads and writes, knowing that the reference count
+	 * is already incremented. So we do not want to warn on those.
 	 */
-	WARN_ON(!mutex_is_locked(&dev_priv->dev->struct_mutex));
+	WARN_ON(!mutex_is_locked(&dev_priv->dev->struct_mutex) &&
+		!dev_priv->forcewake_count);
 
 	if (dev_priv->forcewake_count++ == 0)
 		__gen6_gt_force_wake_get(dev_priv);
@@ -301,7 +306,8 @@ static void __gen6_gt_force_wake_put(struct drm_i915_private *dev_priv)
 
 void gen6_gt_force_wake_put(struct drm_i915_private *dev_priv)
 {
-	WARN_ON(!mutex_is_locked(&dev_priv->dev->struct_mutex));
+	WARN_ON(!mutex_is_locked(&dev_priv->dev->struct_mutex) &&
+		!dev_priv->forcewake_count);
 
 	if (--dev_priv->forcewake_count == 0)
 		__gen6_gt_force_wake_put(dev_priv);
-- 
1.7.3.4

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH 4/4] drm/i915: optional fewer warning patch
  2011-04-09 20:31         ` [PATCH 4/4] drm/i915: optional fewer warning patch Ben Widawsky
@ 2011-04-09 22:31           ` Chris Wilson
  2011-04-09 23:32             ` Ben Widawsky
  0 siblings, 1 reply; 18+ messages in thread
From: Chris Wilson @ 2011-04-09 22:31 UTC (permalink / raw)
  To: Ben Widawsky, intel-gfx

On Sat,  9 Apr 2011 13:31:22 -0700, Ben Widawsky <ben@bwidawsk.net> wrote:
> +	 *
> +	 * Intelligent users of the interface may do a force_wake_get() followed
> +	 * by many register reads and writes, knowing that the reference count
> +	 * is already incremented. So we do not want to warn on those.
>  	 */

Hmm, any place where we touch registers without holding a lock, unless
under exceptional conditions such as !SYSTEM_RUNNING, is a bug waiting to
happen. And probably already happened. Nasty, hard to reproduce race
conditions.

The complication is that some registers will be protected by
dev->config.mode_lock rather than dev->struct_mutex (and a very rare set
are protected by their own irq spinlocks). This sounds like a job for
lockdep... And a lot of documentation. Fortunately, we only have those two
conditions (KMS (detection and mode setting) vs GEM (execution and memory
managment)) and their intersection to worry about.

One of the first steps would be to review all the comments Jesse added to
document the KMS locking and back those up with assertions and updates.
Plenty of work to do before we can feel sure that the locking is sane.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 4/4] drm/i915: optional fewer warning patch
  2011-04-09 22:31           ` Chris Wilson
@ 2011-04-09 23:32             ` Ben Widawsky
  0 siblings, 0 replies; 18+ messages in thread
From: Ben Widawsky @ 2011-04-09 23:32 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

On Sat, Apr 09, 2011 at 11:31:14PM +0100, Chris Wilson wrote:
> On Sat,  9 Apr 2011 13:31:22 -0700, Ben Widawsky <ben@bwidawsk.net> wrote:
> > +	 *
> > +	 * Intelligent users of the interface may do a force_wake_get() followed
> > +	 * by many register reads and writes, knowing that the reference count
> > +	 * is already incremented. So we do not want to warn on those.
> >  	 */
> 
> Hmm, any place where we touch registers without holding a lock, unless
> under exceptional conditions such as !SYSTEM_RUNNING, is a bug waiting to
> happen. And probably already happened. Nasty, hard to reproduce race
> conditions.

The only use I had in mind was doing a bunch of reads, or perhaps
polling on a register. I didn't have a specific case but it seemed
feasible in certain cases.

> -Chris

Ben

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH 1/4] drm/i915: proper use of forcewake
  2011-04-12  1:01 force wake reference counting (another try) Ben Widawsky
@ 2011-04-12  1:01 ` Ben Widawsky
  0 siblings, 0 replies; 18+ messages in thread
From: Ben Widawsky @ 2011-04-12  1:01 UTC (permalink / raw)
  To: intel-gfx

Moved the macros around to properly do reads and writes for the given
gpu. This is to address special requirements for gen6 (SNB) reads and
writes.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_drv.h         |   56 ++++++++++++++----------------
 drivers/gpu/drm/i915/intel_ringbuffer.h |    4 +-
 2 files changed, 28 insertions(+), 32 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 5004724..8981173 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1313,9 +1313,31 @@ extern void intel_display_print_error_state(struct seq_file *m,
 } while (0)
 
 
+
+/* On SNB platform, before reading ring registers forcewake bit
+ * must be set to prevent GT core from power down and stale values being
+ * returned.
+ */
+void __gen6_gt_force_wake_get(struct drm_i915_private *dev_priv);
+void __gen6_gt_force_wake_put(struct drm_i915_private *dev_priv);
+void __gen6_gt_wait_for_fifo(struct drm_i915_private *dev_priv);
+
+/* We give fast paths for the really cool registers */
+#define NEEDS_FORCE_WAKE(dev_priv, reg) \
+	(((dev_priv)->info->gen >= 6) && \
+	((reg) < 0x40000) && \
+	((reg) != FORCEWAKE) )
+
 #define __i915_read(x, y) \
 static inline u##x i915_read##x(struct drm_i915_private *dev_priv, u32 reg) { \
-	u##x val = read##y(dev_priv->regs + reg); \
+	u##x val = 0; \
+	if (NEEDS_FORCE_WAKE((dev_priv), (reg))) { \
+		__gen6_gt_force_wake_get(dev_priv); \
+		val = read##y(dev_priv->regs + reg); \
+		__gen6_gt_force_wake_put(dev_priv); \
+	} else { \
+		val = read##y(dev_priv->regs + reg); \
+	} \
 	trace_i915_reg_rw(false, reg, val, sizeof(val)); \
 	return val; \
 }
@@ -1328,6 +1350,9 @@ __i915_read(64, q)
 #define __i915_write(x, y) \
 static inline void i915_write##x(struct drm_i915_private *dev_priv, u32 reg, u##x val) { \
 	trace_i915_reg_rw(true, reg, val, sizeof(val)); \
+	if (NEEDS_FORCE_WAKE((dev_priv), (reg))) { \
+		__gen6_gt_wait_for_fifo(dev_priv); \
+	} \
 	write##y(val, dev_priv->regs + reg); \
 }
 __i915_write(8, b)
@@ -1356,33 +1381,4 @@ __i915_write(64, q)
 #define POSTING_READ16(reg)	(void)I915_READ16_NOTRACE(reg)
 
 
-/* On SNB platform, before reading ring registers forcewake bit
- * must be set to prevent GT core from power down and stale values being
- * returned.
- */
-void __gen6_gt_force_wake_get(struct drm_i915_private *dev_priv);
-void __gen6_gt_force_wake_put(struct drm_i915_private *dev_priv);
-void __gen6_gt_wait_for_fifo(struct drm_i915_private *dev_priv);
-
-static inline u32 i915_gt_read(struct drm_i915_private *dev_priv, u32 reg)
-{
-	u32 val;
-
-	if (dev_priv->info->gen >= 6) {
-		__gen6_gt_force_wake_get(dev_priv);
-		val = I915_READ(reg);
-		__gen6_gt_force_wake_put(dev_priv);
-	} else
-		val = I915_READ(reg);
-
-	return val;
-}
-
-static inline void i915_gt_write(struct drm_i915_private *dev_priv,
-				u32 reg, u32 val)
-{
-	if (dev_priv->info->gen >= 6)
-		__gen6_gt_wait_for_fifo(dev_priv);
-	I915_WRITE(reg, val);
-}
 #endif
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index f23cc5f..8e8821f 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -14,8 +14,8 @@ struct  intel_hw_status_page {
 	struct		drm_i915_gem_object *obj;
 };
 
-#define I915_RING_READ(reg) i915_gt_read(dev_priv, reg)
-#define I915_RING_WRITE(reg, val) i915_gt_write(dev_priv, reg, val)
+#define I915_RING_READ(reg) I915_READ(reg)
+#define I915_RING_WRITE(reg, val) I915_WRITE(reg, val)
 
 #define I915_READ_TAIL(ring) I915_RING_READ(RING_TAIL((ring)->mmio_base))
 #define I915_WRITE_TAIL(ring, val) I915_RING_WRITE(RING_TAIL((ring)->mmio_base), val)
-- 
1.7.3.4

^ permalink raw reply related	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2011-04-12  1:01 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-04-08 17:47 forcewake junk, RFC, RFT(test) Ben Widawsky
2011-04-08 17:47 ` [PATCH 1/4] drm/i915: proper use of forcewake Ben Widawsky
2011-04-08 17:47 ` [PATCH 2/4] drm/i915: refcounts for forcewake Ben Widawsky
2011-04-08 17:47 ` [PATCH 3/4] drm/i915: userspace interface to the forcewake refcount Ben Widawsky
2011-04-08 17:47 ` [PATCH 4/4] drm/i915: optional fewer warning patch Ben Widawsky
2011-04-09 20:26 ` forcewake junk, part2 Ben Widawsky
2011-04-09 20:26   ` (no subject) Ben Widawsky
     [not found]     ` <1302380787-2957-3-git-send-email-ben@bwidawsk.net>
2011-04-09 20:26       ` [PATCH 1/4] drm/i915: proper use of forcewake Ben Widawsky
2011-04-09 20:26         ` [PATCH 2/4] drm/i915: refcounts for forcewake Ben Widawsky
2011-04-09 20:26           ` [PATCH 3/4] drm/i915: userspace interface to the forcewake refcount Ben Widawsky
2011-04-09 20:31 ` forcewake patches (ignore previous, please) Ben Widawsky
2011-04-09 20:31   ` [PATCH 1/4] drm/i915: proper use of forcewake Ben Widawsky
2011-04-09 20:31     ` [PATCH 2/4] drm/i915: refcounts for forcewake Ben Widawsky
2011-04-09 20:31       ` [PATCH 3/4] drm/i915: userspace interface to the forcewake refcount Ben Widawsky
2011-04-09 20:31         ` [PATCH 4/4] drm/i915: optional fewer warning patch Ben Widawsky
2011-04-09 22:31           ` Chris Wilson
2011-04-09 23:32             ` Ben Widawsky
2011-04-12  1:01 force wake reference counting (another try) Ben Widawsky
2011-04-12  1:01 ` [PATCH 1/4] drm/i915: proper use of forcewake Ben Widawsky

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).